ANT Intro

This document provides context for a book on algebraic number theory written for high school students interested in more advanced mathematics. The book is intended to serve as a transition from mathematical olympiads to higher-level theory. It was inspired by classes the author attended and draws from several reference books, but aims to limit direct overlap with those texts. The document outlines the book's chapter dependencies and includes remarks on exercises and solutions included at the end to aid the learning process.

An Introduction to Algebraic Number Theory through

Olympiad Problems

Elias Caeiro
Foreword

Here is a quick summary of how this book came to be. In July 2019, I attended a class by Gabriel
Dospinescu in which he presented a bit of algebraic number theory (roughly chapter 1 of this book).
Amazed by what I had seen, I started reading a bit of Ireland-Rosen [11] and, in late 2019, the thought
of writing a handout inspired by the content of Gabriel’s class for the website https://mathraining.be
crossed my mind. I submitted a first version in July 2020. Then, in March 2021, I had to make a
few corrections. By that time, I knew significantly more than when I first wrote it, so I realised while
making these corrections that there was so much more that I wanted to add but couldn’t (due to lack of
space). This became the present book, which I wrote during the summer of 2021, in my last year of high
school.

This book is intended to serve as a transition from olympiads to higher mathematics, for high
school students who are interested in learning more advanced theory but find regular textbooks too
different from olympiads. I stress that this book is not an efficient way to prepare for
mathematical olympiads.¹ As such, there is hardly any prerequisite², apart from some amount
of (olympiad) mathematical maturity. Accordingly, there is an appendix providing background on
polynomials at the end of the book. Most of the content of the first section should be familiar to the
reader, but I still recommend skimming through it quickly to have a firm footing on the technicalities
(e.g. a polynomial is not a polynomial function).³ The second section of this appendix is dedicated
to introducing notions of abstract algebra: no theory will be introduced there, it serves both as a way
to explain what morphisms are and as a reference for the definitions of the algebraic structures which
will be used throughout this book (you should not try to remember the actual algebraic structures,
only the intuitive concept of a morphism).

I was aware of some excellent books on algebraic number theory (and related subjects) which
helped me visualise how I wanted this book to be: Andreescu and Dospinescu’s Problems from The
Book [1] (PFTB) and Straight from The Book (SFTB) [2], Ireland and Rosen’s A Classical Introduction
to Modern Number Theory [11], and Murty’s Problems in Algebraic Number Theory [19]. Here is a
small summary of these books: PFTB presents miscellaneous mathematical gems in an (advanced)
olympiad style, and SFTB has solutions to the first 12 chapters and amazing expositions of advanced
topics in addenda. Ireland-Rosen discusses a wide variety of number theoretic topics with an algebraic
flavor, and Murty is a classical first semester course in algebraic number theory but written from a
problem-solving oriented approach. The problems in Ireland-Rosen are generally easier than the ones
in PFTB, SFTB, and Murty.

As a consequence, I have tried to limit the intersection of the present book with these ones, since
the exposition there was already extraordinary. Therefore, I strongly encourage the reader to have a
look at them too. I particularly recommend the addenda 3B, 7A and 9B⁴ of SFTB, chapters 9 and
13 of PFTB on algebraic number theory and the geometry of numbers, as well as chapters 8 and
9 of Ireland-Rosen on Gauss and Jacobi sums and on cubic and biquadratic reciprocity, as they are
particularly similar to the topics of this book, but of course one should read all of the chapters if
possible.
¹ Except maybe chapters 3 and 5 on cyclotomic polynomials and polynomial number theory.
² If I had to state them, maybe the Chinese remainder theorem, Fermat’s little theorem, modular arithmetic, complex
numbers, and the binomial expansion?
³ It doesn’t hurt to read it quickly even if you think you know everything: at best you learn something new, at worst
you lose a few minutes (and there are cool exercises!).
⁴ I have personally found 9A to be too dense for me (before reading Murty).

One could, for instance, read this book along with the addenda from SFTB, and then start
with Murty and Ireland-Rosen as they are a bit more abstract (although Ireland-Rosen starts very
gently). This choice is also motivated by the fact that Murty (and some chapters of PFTB/SFTB)
makes fair use of linear algebra. I have included an appendix on it, both for this reason and
because it is useful in a fair number of exercises. In a sense, this book can be thought of as a prequel to
Murty, and one should have (almost⁵) all the necessary prerequisites after finishing it. In particular,
at no point⁶ do I mention ideals, even though they are fundamental in algebraic number theory. As a
consequence, some problems which are solved here by tricky uses of the fundamental theorem of symmetric
polynomials can be solved more easily with ideal theory. I hope this will not affect the reader once
they learn ideal theory.

I will now talk more about the book itself. Chapter 1, which roughly corresponds to the Mathraining
version of the book, starts with general definitions and properties of algebraic numbers. It also roughly
corresponds to the chapter of PFTB⁷. I don’t actually have much more to say briefly about the
chapters than what the table of contents does, so I will focus on the last two appendices, on symmetric
polynomials and linear algebra. Symmetric polynomials, and above all the fundamental theorem of
symmetric polynomials, are used everywhere in the book. The knowledge of the proofs of these results
however is not strictly required to progress through the book. I have thus arranged them in an appendix
which the reader can read when they want (personally I recommend after reading the first chapter).

Regarding the appendix on linear algebra, I would recommend reading it after chapter 1 too, but
since it is rather long, one can, say, read one section after each chapter. The first section on vector
spaces and bases is fundamental and is necessary for chapter 6 on field theory. I would also recommend
reading it before chapter 4 on finite fields since it is used to give a quick proof of the fact that the
cardinality of a finite field is a power of a prime. Section 2 on linear maps is less important, but
it cannot be skipped because this is where matrices are defined and where some properties of these
matrices are established. Section 3 on determinants is extremely important and rather long; I suggest
first looking at applications of the determinant and then having a more careful look at its construction.
Section 4 uses the results of sections 2 and 3 to derive the formula for linear recurrences. Since many
exercises are about linear recurrences, this has to be seen in the beginning, even if one does not read
the proof.

Here is a diagram of chapter dependencies. Dashed lines indicate non-strict dependencies
(some facts from the previous chapter might be used, or the previous chapter might provide some
additional motivation, but it is still understandable without it). There is a weak dependency between
6 and 7 because some notations and results of chapter 6 will be used, but only in the last section,
Section 7.4. Also, note that there is one forward dependency: in Section 7.4, at the end of the proof
of the main result, one result from Chapter 8 is used. (This is intentional. Readers should not read
Chapter 7 before Chapter 6. More generally, I think the best course of action is to read this book in
order.)

⁵ There might still be a few things which need a bit of googling, but they should not take too long to grasp. There
is also some real analysis and geometry involved, mainly for the geometry of numbers part, but I trust the reader will
manage (PFTB has a great chapter on the geometry of numbers which can be used to introduce the subject).
⁶ Except in some remarks or footnotes.
⁷ Along with section 1 of chapter 4 on the Frobenius morphism.

[Figure: diagram of chapter dependencies, with nodes A, B, 1, 2, 3, 4, 5, 6, 7, 8 and C.1.]

Finally, here are some miscellaneous remarks about the layout of the book. The layout of the
theorems etc. comes from Mathraining. Murty has also inspired me a lot: I have followed its style of
leaving parts of the theory as exercises for the reader. These exercises will be written in purple and
in smaller font. This serves two purposes at once: it keeps the exposition neater (for instance it is
easier to see the main ideas of a proof) and keeps the reader active in the learning process. Some of
these purple exercises will have a star near them; this means that they are part of the theory. In that
case, they cannot be skipped. Otherwise, it is usually an additional remark about an object that
will not be important for the rest of the book but is still good to do. Purple exercises are generally easy.
They are all corrected at the end of the book to avoid the reader getting stuck at an early stage due
to a misunderstanding⁸. The solutions are deliberately not linked to the exercises to encourage the
reader to try them and not read the solution directly. However, the reader is encouraged to read the
solutions to the exercises they had trouble with, and particularly so for the vague ones such as the
ones about motivation.

Similarly, a star after a proposition or corollary means that it’s an important result.

Now, here are a few remarks about the supplementary exercises at the end of the chapters, i.e. black
exercises. Some of these are pretty hard, so it is fine to move on to another chapter without having
solved them all (or even almost none of them, it is not a problem⁹) and come back later. A dagger
at the right of an exercise indicates that it is particularly instructive, beautiful, or interesting. These
are all corrected at the end of the book. The exercises can also be seen as a companion to the theory:
many exercises are theorems or classical results. If an exercise doesn’t seem nice enough to attempt,
but nice enough to want to know the solution, it’s fine¹⁰ to read it without trying the problem. Also,
I strongly encourage the reader to read the solutions at the end after solving an exercise: multiple
solutions are often given so you may still learn something new.

I have decided to include all the objects which are defined in the book in the index at the end
of it. This means that I cannot put all the occurrences of words like "algebraic number" which are
used almost everywhere. For such words, I chose to include the first occurrence of the word where it’s
defined, as well as some more exotic occurrences (e.g. embeddings arise everywhere in chapter 6 on
field theory but are pretty rare in the other chapters so I have included all the later occurrences). If
words can appear in two indices, I chose to put them in both. For instance, "quadratic unit" appears
both in the index for "quadratic" and the one for "unit". Another initiative I have taken is that I do
not include words appearing in the solutions in the index unless they do not appear near the original
exercise.
⁸ Being implemented.
⁹ Of course, I still recommend trying them.
¹⁰ But don’t do that too much! This is for exercises like the fact that a Galois extension L/K is solvable if and only if
its Galois group is: it’s a very nice result, but it has very little to do with number theory so it’s understandable if you
feel lazy.

There are a few notations or abbreviations which are not completely standard that I haven’t defined
in the book. The reader shall find a table with the notations of this book in the next section, but here
they are for the sake of convenience. I use LHS and RHS to denote "left-hand side" and "right-hand
side", [n] to denote the set of integers from 1 to n and := to define an object. When S is a set
and a an element of some ring, I use aS to denote {as | s ∈ S}. Similarly, U + V and UV mean
{u + v | u ∈ U, v ∈ V } and {uv | u ∈ U, v ∈ V }.

Now comes the time of the acknowledgements. As I said in the beginning of the foreword, this
book could not have existed without the classes of Gabriel Dospinescu and Bodo Lass at the Club de
Mathématiques Discrètes in Lyon. I want to thank Nicolas Radu as well, the creator of Mathraining,
for his very valuable comments on the Mathraining version. I also thank everyone involved in the
French Olympiad Mathematics Preparation as well as in the website https://mathraining.be, for
making me discover (olympiad) mathematics. Lastly, many thanks to Lucas Nistor for patching up
solutions that should work but don’t as well as to Vladimir Ivanov and Alexis Miller for their very
careful proofreading¹¹.

Finally, I am still very inexperienced so I apologise in advance for all the poor expositions and
mistakes in this book! In particular, I would be very grateful if you could email all the mistakes and
typos you find as well as any suggestion you have (for instance, a very nice alternative solution to
an exercise, or a better way to present the motivation of a solution) to [email protected]
(or pm me on AoPS or discord depending on where you found this book). The dropbox link should
always be (mostly) up to date. The advantage is that you will always have the last version, but the
drawback is that the numbering of theorems, etc. and results may change with time, because I add
content thematically.
Paris Elias Caeiro
October 2

¹¹ If you still see many mistakes, it means that they haven’t finished proofreading the whole book yet, or that there
were so many mistakes that they couldn’t catch them all. The latter is probably true in all cases.

Notations

Sets
• $[[a, b]]$: the set of integers $[a, b] \cap \mathbb{Z}$ between $a$ and $b$.
• $[n]$: the set of integers $[[1, n]]$ between 1 and $n$.
• $\mathbb{N}$: the set of natural integers $\{0, 1, 2, \ldots\}$.
• $\mathbb{N}^*$: the set of positive integers $\{1, 2, 3, \ldots\}$.
• $\mathbb{Z}$: the ring of rational integers.
• $\mathbb{Q}$: the field of rational numbers.
• $\overline{\mathbb{Z}}$: the ring of algebraic integers.
• $\overline{\mathbb{Q}}$: the field of algebraic numbers.
• $\mathbb{H}$: the skew field of quaternions.
• H: the ring of Hurwitz integers.
• $\mathbb{F}_q$: the field with $q$ elements.
• $\mathbb{Z}_p$: the ring of $p$-adic integers.
• $\mathbb{Q}_p$: the field of $p$-adic numbers.
• $\mathbb{Z}/n\mathbb{Z}$: $\mathbb{Z}$ modulo $n$.
• $R[\alpha_1, \ldots, \alpha_n]$: the ring of polynomial expressions in $\alpha_1, \ldots, \alpha_n$ with coefficients in $R$.
• $K(\alpha_1, \ldots, \alpha_n)$: the field of rational expressions in $\alpha_1, \ldots, \alpha_n$ (with non-zero denominator) with
coefficients in $K$.
• $\mathcal{O}_K$: the ring of integers $K \cap \overline{\mathbb{Z}}$ of a number field $K$.
• $S_n$: the symmetric group of permutations of $[n]$.
• $R^{m \times n}$: the set of $m \times n$ matrices with coefficients in a commutative ring $R$. When $n = 1$ we
simply write $R^m$.

Polynomials
• $\Phi_n$: the $n$th cyclotomic polynomial.
• $\Psi_n$: the minimal polynomial of $2\cos\left(\frac{2\pi}{n}\right)$.
• $e_k$: the $k$th elementary symmetric polynomial.
• $p_k$: the $k$th power sum polynomial.
• $h_k$: the $k$th complete homogeneous polynomial.
• $\pi_\alpha$: the minimal polynomial of an algebraic number $\alpha$.
• $f'$: the (formal) derivative of a rational function $f$.
• $f^*$: the primitive part $f/c(f)$ of $f \in \mathbb{Q}[X]$.

Sequences and Functions


• $F_n$: the $n$th Fibonacci number.
• $L_n$: the $n$th Lucas number.
• $T_n$: the $n$th Tribonacci number.
• $P(n)$: the greatest prime factor of a non-zero rational integer $n \in \mathbb{Z}$.
• $\mu(\cdot)$: the Möbius function.
• $\mathrm{rad}(\cdot)$: the squarefree part of the prime factorisation of an element in a UFD. For $\mathbb{Z}$ you take
it to be positive, for $\mathbb{Q}[X]$ monic and for $\mathbb{Z}[X]$ primitive with positive leading coefficient.
• $c(f)$: the content of a polynomial $f \in \mathbb{Z}[X]$.
• $N(\alpha)$: the absolute norm of an algebraic number $\alpha \in \overline{\mathbb{Q}}$, i.e. the product of its conjugates.
• $N_{L/K}(\alpha)$: the norm of $\alpha$ in the extension $L/K$.
• $\overline{\alpha}$: the quadratic conjugate of an element $\alpha$ in a quadratic extension $L/K$. Without context it is
the complex conjugate ($L = \mathbb{C}$ and $K = \mathbb{R}$).
• $\lfloor x \rfloor$: the floor of a real number $x$, i.e. the greatest integer $n \le x$.
• $\lceil x \rceil$: the ceiling of a real number $x$, i.e. the smallest integer $n \ge x$.
• $\Re(z)$: the real part of a complex number $z \in \mathbb{C}$.
• $\Im(z)$: the imaginary part of a complex number $z \in \mathbb{C}$.
• $v_p$: $p$-adic valuation.
• $\left(\frac{\cdot}{p}\right)$: the Legendre symbol (or Jacobi symbol when $p$ isn’t prime).
• $\binom{n}{k}$: $n$ choose $k$, the number of ways to select a subset of $k$ elements from a set of $n$ elements,
i.e. $\frac{n!}{k!(n-k)!}$.
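
A few of these functions are easy to experiment with on rational integers. The following is a minimal Python sketch, assuming the usual definitions (the content taken as the gcd of the coefficients, rad as the product of the distinct prime factors); the function names `vp`, `rad` and `content` are chosen here to mirror the notation and are not taken from the book.

```python
from functools import reduce
from math import gcd


def vp(n, p):
    """p-adic valuation of a non-zero integer n: the largest k with p^k dividing n."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def rad(n):
    """Product of the distinct primes dividing a non-zero integer n (taken positive)."""
    n, result, p = abs(n), 1, 2
    while p * p <= n:
        if n % p == 0:
            result *= p
            while n % p == 0:
                n //= p
        p += 1
    # Whatever is left (if > 1) is a single prime factor.
    return result * n if n > 1 else result


def content(coeffs):
    """Content of a non-zero integer polynomial, assumed here to be the gcd of its coefficients."""
    return reduce(gcd, coeffs)


print(vp(48, 2))             # 4, since 48 = 2^4 * 3
print(rad(360))              # 30, since 360 = 2^3 * 3^2 * 5
print(content([6, -9, 15]))  # 3
```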

Algebra
• |R : divides in R.
• d: left-divisibility.
• e: right-divisibility.
• $R^\times$: the multiplicative group of units of $R$.
• $\mathrm{Frob}_R$: the Frobenius morphism of $R$.
• $\mathrm{Emb}_K(L)$: the set of $K$-embeddings of $L$.
• $\mathrm{Gal}(L/K)$: the Galois group of $L/K$.
• $\mathrm{Aut}_K(L) = \mathrm{Aut}(L/K)$: the group of automorphisms of $L/K$.
• $L^H$: the fixed field of $H$.


• Res(f, g): the resultant of two polynomials f and g.
• ker: the kernel of a morphism.
• im: the image of a morphism.

• det: the determinant of a matrix or a linear map.


• Tr: the trace of a matrix or a linear map.
• χM : the characteristic polynomial of a matrix M .

Miscellaneous
• ":=": a definition.
• LHS: left-hand side.

• RHS: right-hand side.


• $f^n$: the $n$th iterate of a function $f$ unless otherwise specified.
• $U \star V$: the set $\{u \star v \mid (u, v) \in U \times V\}$ for some sets $U, V$ and an operation $\star$ on $(U \cup V)^2$ (e.g.
addition or multiplication on the complex numbers). When $U = \{a\}$ we also write $a \star V$ for $\{a\} \star V$
(and $U \star a$ for $U \star \{a\}$ when $\star$ is not commutative).
Contents

Foreword

Notations

Theory
1 Algebraic Numbers and Integers
1.1 Definition
1.2 Minimal Polynomial
1.3 Symmetric Polynomials
1.4 Worked Examples
1.5 Exercises

2 Quadratic Integers
2.1 General Definitions
2.2 Unique Factorisation
2.3 Gaussian Integers
2.4 Eisenstein Integers
2.5 Hurwitz Integers
2.6 Exercises

3 Cyclotomic Polynomials
3.1 Definition
3.2 Irreducibility
3.3 Orders
3.4 Zsigmondy’s Theorem
3.5 Exercises

4 Finite Fields
4.1 Frobenius Morphism
4.2 Existence and Uniqueness
4.3 Properties
4.4 Cyclotomic Polynomials
4.5 Quadratic Reciprocity
4.6 Exercises

5 Polynomial Number Theory
5.1 Factorisation of Polynomials
5.2 Prime Divisors of Polynomials
5.3 Hensel’s Lemma
5.4 Bézout’s Lemma
5.5 Exercises

6 The Primitive Element Theorem and Galois Theory
6.1 General Definitions
6.2 The Primitive Element Theorem and Field Theory
6.3 Galois Theory
6.4 Splitting of Polynomials
6.5 Exercises

7 Units in Quadratic Fields and Pell’s Equation
7.1 Fundamental Unit
7.2 Pell-Type Equations
7.3 Størmer’s Theorem
7.4 Units in Complex Cubic Fields, Thue’s Equation and Kobayashi’s Theorem
7.5 Exercises

8 p-adic Analysis
8.1 p-adic Integers and Numbers
8.2 p-adic Absolute Value
8.3 Binomial Series
8.4 Analytic Functions
8.5 The Skolem-Mahler-Lech Theorem
8.6 Strassmann’s Theorem
8.7 Exercises

A Polynomials
A.1 Fields and Polynomials
A.2 Algebraic Structures and Morphisms
A.3 Exercises

B Symmetric Polynomials
B.1 The Fundamental Theorem of Symmetric Polynomials
B.2 Newton’s Formulas
B.3 The Fundamental Theorem of Algebra
B.4 Exercises

C Linear Algebra
C.1 Vector Spaces
C.2 Linear Maps and Matrices
C.3 Determinants
C.4 Linear Recurrences
C.5 Exercises

Solutions
1 Algebraic Numbers and Integers
1.1 Definition
1.2 Minimal Polynomial
1.3 Symmetric Polynomials
1.4 Worked Examples
1.5 Exercises

2 Quadratic Integers
2.1 General Definitions
2.2 Unique Factorisation
2.3 Gaussian Integers
2.4 Eisenstein Integers
2.5 Hurwitz Integers
2.6 Exercises

3 Cyclotomic Polynomials
3.1 Definition
3.2 Irreducibility
3.3 Orders
3.4 Zsigmondy’s Theorem
3.5 Exercises

4 Finite Fields
4.1 Frobenius Morphism
4.2 Existence and Uniqueness
4.3 Properties
4.4 Cyclotomic Polynomials
4.5 Quadratic Reciprocity
4.6 Exercises

5 Polynomial Number Theory
5.1 Factorisation of Polynomials
5.2 Prime Divisors of Polynomials
5.3 Hensel’s Lemma
5.4 Bézout’s Lemma
5.5 Exercises

6 The Primitive Element Theorem and Galois Theory
6.1 General Definitions
6.2 The Primitive Element Theorem and Field Theory
6.3 Galois Theory
6.4 Splitting of Polynomials
6.5 Exercises

7 Units in Quadratic Fields and Pell’s Equation
7.1 Fundamental Unit
7.2 Pell-Type Equations
7.3 Størmer’s Theorem
7.4 Units in Complex Cubic Fields and Kobayashi’s Theorem
7.5 Exercises

8 p-adic Analysis
8.1 p-adic Integers and Numbers
8.2 p-adic Absolute Value
8.3 Binomial Series
8.4 Analytic Functions
8.5 The Skolem-Mahler-Lech Theorem
8.6 Strassmann’s Theorem
8.7 Exercises

A Polynomials
A.1 Fields and Polynomials
A.2 Algebraic Structures and Morphisms
A.3 Exercises

B Symmetric Polynomials
B.1 The Fundamental Theorem of Symmetric Polynomials
B.2 Newton’s Formulas
B.3 The Fundamental Theorem of Algebra
B.4 Exercises

C Linear Algebra
C.1 Vector Spaces
C.2 Linear Maps and Matrices
C.3 Determinants
C.4 Linear Recurrences
C.5 Exercises

Further Reading

Bibliography

Index

Theory

Chapter 1

Algebraic Numbers and Integers

Prerequisites for this chapter: Section A.1.

1.1 Definition
First of all, what is an algebraic number?

Definition 1.1.1 (Algebraic Numbers and Algebraic Integers)

Let $\alpha \in \mathbb{C}$ be a complex number. We say $\alpha$ is an algebraic number if it is a root of a monic
polynomial with rational coefficients. Further, if this polynomial has integer coefficients, we say
$\alpha$ is an algebraic integer.

The set of algebraic numbers will be denoted by $\overline{\mathbb{Q}}$, and the set of algebraic integers by $\overline{\mathbb{Z}}$.

Note that the "monic" part is very important: otherwise there would be no difference between
algebraic numbers and algebraic integers, for a number is a root of a non-zero polynomial with integer
coefficients if and only if it is a root of a non-zero polynomial with rational coefficients (multiply by
a common denominator of the coefficients).
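
To see concretely why "monic" matters, here is a minimal Python sketch of clearing denominators. The helper name `clear_denominators` and the convention of listing coefficients from the constant term upwards are assumptions made for this illustration only.

```python
from fractions import Fraction
from math import lcm


def clear_denominators(coeffs):
    """Scale a rational polynomial to an integer polynomial with the same roots.

    coeffs lists the coefficients from the constant term upwards.
    """
    coeffs = [Fraction(c) for c in coeffs]
    scale = lcm(*(c.denominator for c in coeffs))
    return [int(c * scale) for c in coeffs]


# X - 1/2 is monic with rational coefficients, so its root 1/2 is an algebraic number.
monic_rational = [Fraction(-1, 2), Fraction(1)]
print(clear_denominators(monic_rational))  # [-1, 2], i.e. 2X - 1
```

The scaled polynomial 2X − 1 has integer coefficients and the same root, but it is no longer monic, which is exactly the property the definition of an algebraic integer insists on.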

Note also that every integer n is an algebraic integer since it’s a root of X − n, and every rational
number q is an algebraic number since it’s a root of X − q. This partly explains the notations we chose.

We also say a complex number is transcendental if it isn’t algebraic, but this won’t be relevant in
this book as we will only discuss properties of algebraic numbers.

Here are some examples of algebraic numbers:

• 1 is an algebraic integer (root of X − 1).

• i is an algebraic integer (root of $X^2 + 1$).

• $2 + \sqrt[4]{3}$ is an algebraic integer (root of $(X - 2)^4 - 3$).

• $\frac{1}{2}$ is an algebraic number (root of $X - \frac{1}{2}$). However, it is not an algebraic integer. This is a
consequence of the following proposition.

Proposition 1.1.1 (Rational Algebraic Integers)*

The only rational algebraic integers are regular integers. In other words, $\overline{\mathbb{Z}} \cap \mathbb{Q} = \mathbb{Z}$.


Proof

Firstly, it is clear that regular integers are algebraic integers since n ∈ Z is a root of X − n ∈ Z[X].

Let $f = \sum_{i=0}^n a_i X^i$ be a monic degree n polynomial with integer coefficients, and assume $\frac{u}{v}$ is a
rational root of f, where u, v are coprime integers.

Then,
$$\sum_{i=0}^n a_i \left(\frac{u}{v}\right)^i = 0$$
is equivalent, after multiplication by $v^n$, to
$$\sum_{i=0}^n a_i u^i v^{n-i} = 0.$$

Modulo v, we get $a_n u^n \equiv 0$, i.e. $u^n \equiv 0$ since f is monic. Since u and v are coprime by
assumption, this must mean that v = ±1. Finally, this means that the root $\frac{u}{v}$ we started with
was in fact an integer.
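To see the proposition at work, here is a minimal sketch that lists the rational roots of an integer polynomial by brute force. It assumes the candidate list given by the rational root theorem (Exercise 1.1.2) and a non-zero constant coefficient; the helper names `divisors`, `evaluate` and `rational_roots` are made up for this illustration.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Evaluate the polynomial with coeffs[i] = coefficient of X^i (Horner's rule)."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def rational_roots(coeffs):
    """All rational roots of an integer polynomial with non-zero constant term.

    Candidates are +-u/v with u dividing the constant coefficient and
    v dividing the leading coefficient (rational root theorem).
    """
    a0, an = coeffs[0], coeffs[-1]
    candidates = {Fraction(s * u, v)
                  for u in divisors(a0) for v in divisors(an) for s in (1, -1)}
    return sorted(x for x in candidates if evaluate(coeffs, x) == 0)

monic = [-6, 11, -6, 1]   # X^3 - 6X^2 + 11X - 6 = (X - 1)(X - 2)(X - 3)
non_monic = [-1, 0, 4]    # 4X^2 - 1 = (2X - 1)(2X + 1)
print(rational_roots(monic))
print(rational_roots(non_monic))
```

For the monic example every rational root has denominator 1, as the proposition predicts, while the non-monic example has the non-integer roots ±1/2.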


Exercise 1.1.1. Is $\frac{i}{2}$ an algebraic integer?

Exercise 1.1.2 (Rational Root Theorem). Let f ∈ Z[X] be a polynomial. Suppose that u/v is a rational root
of f , written in irreducible form. Prove that u divides the constant coefficient of f and v divides its leading
coefficient. (This is a generalisation of Proposition 1.1.1.)

To distinguish algebraic integers from regular integers, we will call the latter rational integers since
they are precisely the algebraic integers which are rational.

A deep fact about algebraic numbers and algebraic integers is that they’re closed under addition
and multiplication. This will be proven in Section 1.3, but we will already give an application of these
results to a seemingly unrelated problem in order to showcase their power.

Problem 1.1.1

Let q be a rational number. Which rational values can cos(qπ) take?

Solution

The key point is that the numbers of the form cos(qπ) are precisely the real parts of roots of
unity. Indeed, any root of unity has its real part of this form, and if $q = \frac{a}{b}$ then cos(qπ) is the
real part of the 2bth root of unity $\exp\left(\frac{2a\pi i}{2b}\right)$.

Thus, let ω = exp(qiπ) be a root of unity. Twice its real part is $\omega + \overline{\omega}$, where $\overline{\omega} = \frac{1}{\omega}$ is the complex
conjugate of ω. Thus, $2\Re(\omega)$ is a sum of two roots of unity and hence of algebraic integers, which
means it's an algebraic integer itself.

Finally, we conclude that if $2\cos(q\pi) = 2\Re(\omega)$ is rational it must be a rational integer. Since
2 cos(qπ) ∈ [−2, 2] we must have
$$\cos(q\pi) \in \left\{0, \pm\frac{1}{2}, \pm 1\right\}$$
which, conversely, are all easily seen to work.
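The argument can also be checked by hand on small cases. The sketch below is not the method of the solution: it assumes the classical recurrence $P_0 = 2$, $P_1 = X$, $P_{k+1} = X P_k - P_{k-1}$ for the monic integer polynomials satisfying $P_n(2\cos t) = 2\cos(nt)$ (the names `two_cos_poly` and `rational_values_of_two_cos` are hypothetical). Since $2\cos\left(\frac{a\pi}{b}\right)$ is a root of the monic polynomial $P_{2b} - 2$, its only possible rational values are the integers in [−2, 2], and the code tests which of them occur.

```python
import math

def two_cos_poly(n):
    """Coefficients (lowest degree first) of P_n, where P_n(2cos t) = 2cos(nt).

    Uses P_0 = 2, P_1 = X and P_{k+1} = X * P_k - P_{k-1}.
    """
    prev, cur = [2], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        shifted = [0] + cur                                  # X * P_k
        padded = prev + [0] * (len(shifted) - len(prev))     # P_{k-1}, padded
        prev, cur = cur, [s - p for s, p in zip(shifted, padded)]
    return cur

def evaluate(coeffs, x):
    """Horner evaluation, coefficients given lowest degree first."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def rational_values_of_two_cos(b):
    """Integers v in [-2, 2] that are roots of P_{2b}(X) - 2."""
    poly = two_cos_poly(2 * b)
    poly[0] -= 2
    return [v for v in range(-2, 3) if evaluate(poly, v) == 0]

# Numerical sanity check of the defining property for n = 7, t = 0.3.
print(abs(evaluate(two_cos_poly(7), 2 * math.cos(0.3)) - 2 * math.cos(2.1)) < 1e-9)
print(rational_values_of_two_cos(6))
```

With b = 6 all five integers appear, which matches the list of values of cos(qπ) found above after dividing by 2.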



We may now define divisibility and congruences in algebraic integers, exactly like it is done in Z.

Definition 1.1.2 (Divisibility in $\overline{\mathbb{Z}}$)

Let α and β be algebraic integers. We say α divides β and write α | β if there exists an algebraic
integer γ such that β = αγ.

Definition 1.1.3 (Congruences in $\overline{\mathbb{Z}}$)

Let α, β, γ be algebraic integers. We say α is congruent to β modulo γ, and write α ≡ β
(mod γ), if γ | α − β.

Like in Z, α ≡ β (mod 0) is just equivalent to α = β since 0 only divides 0.

There is one thing that makes it very nice to work with congruences in algebraic integers: it is the
fact that if a, b, n are rational integers, then a ≡ b (mod n) in rational integers is the same thing as
a ≡ b (mod n) in algebraic integers (which is why we use the same notation). Assume n is non-zero,
otherwise it is obvious by the previous remark. What does the former mean? It means that $\frac{a-b}{n}$ is a
rational integer. What does the latter mean? It means that $\frac{a-b}{n}$ is an algebraic integer. Since we are
given that $\frac{a-b}{n}$ is rational, it being a rational integer is equivalent to it being an algebraic integer by
Proposition 1.1.1.

As stated before, in Section 1.3, we will prove that the algebraic integers are closed under addition
and multiplication (thus forming a ring) which means that we can manipulate these congruences like
we would in Z.

1.2 Minimal Polynomial


The goal of this section is to provide an abstract framework to manipulate algebraic numbers better.
Most of the results will not have any direct application but will help simplify proofs and provide a
more conceptual way of thinking about algebraic numbers.

In the previous section we saw that $2 + \sqrt[4]{3}$ was a root of $(X - 2)^4 - 3$, but is it the polynomial of smallest degree
having this property? It is natural to ask oneself, given an algebraic number α, what is the least degree
non-zero polynomial (with integer coefficients) vanishing at α.

Definition 1.2.1 (Minimal Polynomial)

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. We say a least degree monic polynomial with rational coefficients vanishing at α is
a minimal polynomial of α. We also say α is an algebraic number of degree n, where n is the
degree of any of its minimal polynomials.

The following proposition shows that the minimal polynomial is unique.

Proposition 1.2.1*

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number and $\pi_\alpha$ be one of its minimal polynomials. Then, for any
polynomial f ∈ Q[X], f(α) = 0 if and only if $\pi_\alpha \mid f$. In particular, $\pi_\alpha$ is unique.

Proof

Clearly, if $\pi_\alpha \mid f \in \mathbb{Q}[X]$, then f vanishes at α. For the converse, assume that f ∈ Q[X] vanishes
at α. Then, perform the Euclidean division of f by $\pi_\alpha$: $f = g\pi_\alpha + h$ with $\deg h < \deg \pi_\alpha$. If h
is non-zero, then, after dividing it by its leading coefficient, we are left with a monic polynomial
with rational coefficients vanishing at α of degree less than $\deg \pi_\alpha$, a contradiction.

Therefore, $\pi_\alpha \mid f$. Now, if $\pi'_\alpha$ is another minimal polynomial of α, we get $\pi_\alpha \mid \pi'_\alpha$ and $\pi'_\alpha \mid \pi_\alpha$ so
$\pi_\alpha = \pi'_\alpha$ since they are both monic.
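The Euclidean division used in this proof is easy to carry out by machine. Here is a minimal sketch over Q using exact fractions; the function name `poly_divmod` and the convention of listing coefficients from lowest to highest degree are choices made for this illustration. It is applied to α = √2, assuming (as is standard) that its minimal polynomial is $X^2 - 2$.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Euclidean division f = q*g + r in Q[X], with deg r < deg g.

    Polynomials are lists of coefficients, lowest degree first;
    the zero polynomial is returned as the empty list.
    """
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = f[:]
    while len(r) >= len(g) and any(r):
        shift = len(r) - len(g)
        factor = r[-1] / g[-1]
        q[shift] = factor
        for i, c in enumerate(g):
            r[shift + i] -= factor * c
        while r and r[-1] == 0:          # drop the cancelled leading terms
            r.pop()
    return q, r

pi_alpha = [-2, 0, 1]                     # X^2 - 2, vanishing at sqrt(2)
print(poly_divmod([-4, 0, 0, 0, 1], pi_alpha))   # X^4 - 4 vanishes at sqrt(2)
print(poly_divmod([-2, 0, 0, 1], pi_alpha))      # X^3 - 2 does not
```

The first division has zero remainder (the quotient is $X^2 + 2$), while the second leaves the remainder $2X - 2$, which does not vanish at √2, in line with the proposition.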


We will thus use πα to denote the minimal polynomial of an algebraic number α. Notice that a
minimal polynomial is always irreducible in Q[X], and, conversely, an irreducible polynomial is always
the minimal polynomial of its roots.

Exercise 1.2.1∗ . Prove that the minimal polynomial of an algebraic number is irreducible and that an
irreducible polynomial is always the minimal polynomial of its roots.

We can now answer our original question. The minimal polynomial of $2 + \sqrt[4]{3}$ is in fact $(X - 2)^4 - 3$
as $Y^4 - 3$ is easily seen to be irreducible in Q[X].

Exercise 1.2.2. Prove that $Y^4 - 3$ is irreducible in Q[X].

Given an algebraic number α, it is often particularly useful to look at the other roots of its minimal
polynomial; these are called the conjugates of α. This is because α and its conjugates play symmetric roles,
by Proposition 1.2.1: if α satisfies a certain polynomial equation with rational coefficients, then
so do its conjugates.

Definition 1.2.2 (Conjugates)

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Its conjugates are defined as the roots of its minimal
polynomial: $\alpha_1, \ldots, \alpha_n$ with $n = \deg \pi_\alpha$ (we include α).

For instance, the conjugates of $\sqrt{d}$, where d is a rational number which is not a perfect square, are $\sqrt{d}$ and $-\sqrt{d}$.
A more elaborate example is the one of a primitive pth root of unity, i.e. a pth root of unity ω ≠ 1
(where p is some prime number). By Theorem 3.2.1 or Corollary 5.1.5, $\frac{X^p - 1}{X - 1}$ is irreducible so its
conjugates are all the primitive pth roots.¹

Note that an algebraic number of degree n always has n distinct conjugates, because an irreducible
polynomial always has distinct roots.

Exercise 1.2.3∗ . Prove that any algebraic number of degree n has n distinct conjugates.

Notice also that $\overline{\alpha}$, the complex conjugate of α, is always a conjugate of α (see Appendix A). It
is of interest to discuss a bit more the link between the conjugates we just defined and the complex
conjugate of a number.

Imagine that, instead of being interested in the field of rational numbers, we were interested
in the field of real numbers and we wanted to do algebraic number theory with it. Thus, we define
algebraic numbers as roots of polynomials with real coefficients, etc. Now, every minimal polynomial
has degree 1 or 2, because any irreducible polynomial in R[X] has degree 1 or 2 (see Appendix A).
Thus, the conjugates of α are either $\{\alpha\} = \{\alpha, \overline{\alpha}\}$ in the first case, or $\{\alpha, \overline{\alpha}\}$ in the second. In fact,
this can be generalised a lot, see Chapter 6.
1 I know forward references are annoying, but I need this fact for one of the worked examples. It is also useful to

know as roots of unity are absolutely fundamental in algebraic number theory. For now, you can just take my word on
it until you reach Chapter 3.

Finally, we focus a bit on the algebraic integers. Can we say anything about the minimal polynomial
of an algebraic integer? We know that the minimal polynomial of an algebraic number which isn't an
algebraic integer can't have only integer coefficients, but what about the converse: does the minimal
polynomial of an algebraic integer always have integer coefficients? The answer is yes,
as proven by the following proposition.

Proposition 1.2.2

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Then, $\pi_\alpha \in \mathbb{Z}[X]$ if and only if $\alpha \in \overline{\mathbb{Z}}$.

Proof

It is clear that if $\pi_\alpha \in \mathbb{Z}[X]$, α is an algebraic integer. Thus, assume α is an algebraic integer
for the reverse implication. We will make the assumption that $\overline{\mathbb{Z}}$ is closed under addition and
multiplication, see Section 1.3 for a proof.

Let $\alpha_1, \ldots, \alpha_n$ be the conjugates of α. By Vieta's formulas A.1.4, the coefficient of $X^k$ of $\pi_\alpha$ is
$$(-1)^{n-k} \sum_{i_1 < \ldots < i_{n-k}} \alpha_{i_1} \cdot \ldots \cdot \alpha_{i_{n-k}}.$$

This is an algebraic integer by Theorem 1.3.2 and Exercise 1.2.4∗ but by assumption it is also
rational. Therefore, it is a rational integer and $\pi_\alpha \in \mathbb{Z}[X]$ as wanted.


Exercise 1.2.4∗ . Prove that the conjugates of an algebraic integer are also algebraic integers.

Exercise 1.2.5. We call an algebraic number of degree 2 a quadratic number . Characterise quadratic integers.

1.3 Symmetric Polynomials


Given a commutative ring R (in our case we will consider Z and Q) and an integer n ≥ 0, we can
consider the symmetric polynomials in n variables with coefficients in R. These are defined as the
polynomials in n variables invariant under all permutations of these variables.

Definition 1.3.1 (Symmetric Polynomials)

We say a polynomial $f \in R[X_1, \ldots, X_n]$ is symmetric if $f(X_1, \ldots, X_n) = f(X_{\sigma(1)}, \ldots, X_{\sigma(n)})$ for
any permutation σ of [n].

As an example, $f = X^2 Y + XY^2 + X^2 + Y^2$ is a symmetric polynomial in two variables, and
$$g = X^2 YZ + XY^2 Z + XYZ^2 + XY^2 + X^2 Y + XZ^2 + X^2 Z + YZ^2 + Y^2 Z$$
is a symmetric polynomial in three variables.

Definition 1.3.2 (Elementary Symmetric Polynomials)

The kth elementary symmetric polynomial for k ≥ 0, $e_k \in R[X_1, \ldots, X_n]$, is defined by
$$e_k = \sum_{1 \le i_1 < \ldots < i_k \le n} X_{i_1} \cdot \ldots \cdot X_{i_k}.$$

Further, if k > n then $e_k = 0$ (the empty sum) and if k = 0 then $e_0 = 1$ (the sum of the empty
product).

The two-variable elementary symmetric polynomials are thus simply $e_1 = X + Y$ and $e_2 = XY$. The
three-variable ones are $e_1 = X + Y + Z$, $e_2 = XY + YZ + ZX$ and $e_3 = XYZ$.

We now state the fundamental theorem of symmetric polynomials. See Appendix B for a proof.

Theorem 1.3.1 (Fundamental Theorem of Symmetric Polynomials)

Suppose f ∈ R[X1 , . . . , Xn ] is a symmetric polynomial. Then f ∈ R[e1 , . . . , en ]. In other words,


there is a polynomial g ∈ R[X1 , . . . , Xn ] such that

f (X1 , . . . , Xn ) = g(e1 , . . . , en ).

This theorem explains why we called ek "elementary symmetric polynomials": because they gen-
erate all symmetric polynomials.
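As a quick sanity check of the theorem on the two-variable example $f = X^2 Y + XY^2 + X^2 + Y^2$ given earlier, one candidate expression is $f = e_1 e_2 + e_1^2 - 2e_2$. The sketch below compares both sides on random integers; the names `elementary` and `g_of_elementary` are hypothetical, and a numerical test of this kind only supports the identity, it does not prove it.

```python
from itertools import combinations
from math import prod
import random

def elementary(k, values):
    """k-th elementary symmetric polynomial evaluated at the given values."""
    return sum(prod(c) for c in combinations(values, k))

def f(x, y):
    """The symmetric example f = X^2 Y + X Y^2 + X^2 + Y^2."""
    return x * x * y + x * y * y + x * x + y * y

def g_of_elementary(e1, e2):
    """Candidate g with f(X, Y) = g(e1, e2), namely e1*e2 + e1^2 - 2*e2."""
    return e1 * e2 + e1 * e1 - 2 * e2

agree = True
for _ in range(1000):
    x, y = random.randint(-50, 50), random.randint(-50, 50)
    e1, e2 = elementary(1, [x, y]), elementary(2, [x, y])
    agree = agree and f(x, y) == g_of_elementary(e1, e2)
print(agree)
```

Note that `elementary` also follows the conventions of Definition 1.3.2 at the boundary: it returns 1 for k = 0 and 0 for k larger than the number of variables.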
Exercise 1.3.1. Let α ∈ Q be an algebraic number with conjugates α1 , . . . αn and f ∈ Q[X1 , . . . , Xn ] be a
symmetric monic polynomial. Show that f (α1 , . . . , αn ) is rational. Further, prove that if α is an algebraic
integer and f has integer coefficients, f (α1 , . . . , αn ) is in fact a rational integer.

We can now prove that algebraic integers are closed under addition and multiplication.

Theorem 1.3.2

Let α and β be two algebraic integers. Then, αβ and α + β are also algebraic integers.

Proof

The idea is to construct a monic polynomial whose coefficients are symmetric in both the conjugates
of α and the conjugates of β. By Exercise 1.3.1, they will thus be rational integers which
will imply that α + β is an algebraic integer.

We thus consider the conjugates $\alpha_1, \ldots, \alpha_m$ of α and $\beta_1, \ldots, \beta_n$ of β and let
$$f(X) = \prod_{i,j} (X - (\alpha_i + \beta_j)) = \prod_i \prod_j ((X - \alpha_i) - \beta_j) = \prod_i \pi_\beta(X - \alpha_i).$$

If we define
$$g(X, X_1, \ldots, X_m) = \prod_i \pi_\beta(X - X_i),$$
it is symmetric as a polynomial in $X_1, \ldots, X_m$ (over the ring R = Z[X]). We can thus write
$$g = h(X, e_1, \ldots, e_m)$$
for some $h \in \mathbb{Z}[X, X_1, \ldots, X_m]$ by the fundamental theorem of symmetric polynomials. Finally,
our original polynomial f is
$$f = h(X, e_1(\alpha_1, \ldots, \alpha_m), \ldots, e_m(\alpha_1, \ldots, \alpha_m)).$$

But, by Vieta's formulas A.1.4, $e_k(\alpha_1, \ldots, \alpha_m)$ is an integer as it is ± the coefficient of $X^{m-k}$
of $\pi_\alpha$! We thus conclude that f has integer coefficients which means that α + β is an algebraic
integer.

The $\alpha\beta \in \overline{\mathbb{Z}}$ part is handled similarly and we thus omit it.
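To make the construction concrete, here is a small numerical sketch for α = √2 and β = √3. It expands the products over all pairs of conjugates in floating point and rounds the coefficients, which is only an illustration (the proof itself works exactly, via symmetric polynomials); the helper name `poly_from_roots` is made up for the occasion.

```python
from itertools import product
from math import sqrt

def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with the given roots."""
    coeffs = [1.0]
    for r in roots:
        # multiply the current polynomial by (X - r)
        coeffs = [a - r * b for a, b in zip(coeffs + [0.0], [0.0] + coeffs)]
    return coeffs

alphas = [sqrt(2), -sqrt(2)]   # conjugates of sqrt(2), roots of X^2 - 2
betas = [sqrt(3), -sqrt(3)]    # conjugates of sqrt(3), roots of X^2 - 3

# Product over all pairs (i, j) of X - (alpha_i + beta_j), then of X - alpha_i * beta_j.
sum_poly = [round(c) for c in poly_from_roots([a + b for a, b in product(alphas, betas)])]
prod_poly = [round(c) for c in poly_from_roots([a * b for a, b in product(alphas, betas)])]
print(sum_poly)
print(prod_poly)
```

The rounded coefficients are those of $X^4 - 10X^2 + 1$ and $X^4 - 12X^2 + 36 = (X^2 - 6)^2$ respectively: both are monic with integer coefficients, and the second one shows that the polynomial built this way need not be the minimal polynomial.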




Remark 1.3.1
Our proof also shows that the conjugates of α+β and αβ are among αi +βj and αi βj respectively.

Exercise 1.3.2∗. Prove that $\overline{\mathbb{Z}}$ is closed under multiplication.

The following straightforward consequence of the fundamental theorem of symmetric polynomials 1.3.1
is sometimes useful.

Proposition 1.3.1
Let $f = a \cdot \prod_{k=1}^n (X - \alpha_k)$ and $g = b \cdot \prod_{k=1}^n (X - \beta_k)$ be two polynomials with integer coefficients,
and let m ∈ Z be a rational integer which is coprime with a and b. Suppose that f ≡ g (mod m).
Then,
$$S(\alpha_1, \ldots, \alpha_n) \equiv S(\beta_1, \ldots, \beta_n) \pmod{m}$$
for any symmetric $S \in \mathbb{Z}[X_1, \ldots, X_n]$.

Exercise 1.3.3∗ . Prove Proposition 1.3.1.

Here is why this proposition is interesting. It lets us use a local-global principle. Suppose we have
a monic polynomial f ∈ Z[X] and we know a bunch of information about its roots modulo prime
numbers p. Then, Proposition 1.3.1 lets us deduce information about symmetric sums of the complex
roots of f , modulo p. If we let p vary, we can thus get information about symmetric sums of the roots
of f , and hence information about the roots of f themselves.

We illustrate this by an example. Problem 1.4.1 and Exercise 3.5.33† provide more elaborate
applications.

Question

Let f ∈ Z[X] be a polynomial. Suppose that f has a double root in Fp for infinitely many primes
p. Must it follow that f has a complex double root?

Answer

We prove that it does. Clearly, f needs to have degree at least 2 for it to have a double root
modulo some prime so we may assume that it does. Suppose that f has a double root β ∈ Z
modulo a rational prime p. Consider the polynomial

$$g(X) = f(X) - (X - \beta)f'(\beta) - f(\beta).$$

This may seem unmotivated, but this is just a polynomial congruent to f modulo p (by assump-
tion) which now has β as a complex double root.

We may now consider the complex roots α1 , . . . , αn of f and β = β1 , . . . , βn = β of g. By


Proposition 1.3.1, we thus have
$$\prod_{i \neq j} (\alpha_i - \alpha_j) \equiv \prod_{i \neq j} (\beta_i - \beta_j) \pmod{p}.$$

Since g has a double root, the RHS is zero. Thus, p divides the LHS. Since this is true for
infinitely many primes p, we deduce that the LHS is also zero: f has a complex double root.


Remark 1.3.2
The number
$$\Delta = (-1)^{\frac{n(n-1)}{2}} a^{2n-2} \cdot \prod_{i \neq j} (\alpha_i - \alpha_j) = \left(a^{n-1} \prod_{i<j} (\alpha_i - \alpha_j)\right)^2$$

is called the discriminant of f . We can easily check that it agrees with the usual definition when
n = 2. We have thus shown that if f has a double root mod m then m | ∆.
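Here is a small experiment illustrating this on the cubic $X^3 + X + 1$. The sketch assumes the classical closed form $-4a^3 - 27b^2$ for the discriminant of $X^3 + aX + b$ (a formula not derived in this chapter), and it treats r as a double root modulo p when both f(r) and f′(r) vanish modulo p. The function names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, enough for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def double_root_primes(coeffs, primes):
    """Primes p for which some residue r satisfies f(r) = f'(r) = 0 mod p.

    coeffs lists the coefficients of f from lowest to highest degree.
    """
    deriv = [i * c for i, c in enumerate(coeffs)][1:]

    def value(poly, x, p):
        return sum(c * pow(x, i, p) for i, c in enumerate(poly)) % p

    return [p for p in primes
            if any(value(coeffs, r, p) == 0 and value(deriv, r, p) == 0
                   for r in range(p))]

primes = [p for p in range(2, 200) if is_prime(p)]
cubic = [1, 1, 0, 1]                        # X^3 + X + 1
discriminant = -4 * 1 ** 3 - 27 * 1 ** 2    # -4a^3 - 27b^2 with a = b = 1
print(discriminant)
print(double_root_primes(cubic, primes))
```

Among the primes below 200, the only one for which a double root appears is 31, which is exactly the prime dividing the discriminant −31 (the double root is r = 14).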

1.4 Worked Examples


In this section, we present two complicated problems where our previous results come to the spotlight.
However, exercises in Section 1.5 will show that the theory we have developed applies to a wide variety
of situations. Although the full power of our results is often not needed, they provide a more conceptual
framework to solve these problems.

Problem 1.4.1 (AMM 10748)

Let q be a prime number and $q \nmid r$ be a positive integer. Suppose that $p > r^{q-1}$ is a prime
number congruent to 1 mod q and $a_1, \ldots, a_r$ are rational integers such that
$$p \mid \sum_{i=1}^r a_i^{\frac{p-1}{q}}.$$

Prove that p divides one of the $a_i$.

Solution
Suppose for the sake of a contradiction that none of the $a_i$ are zero modulo p. Notice that $a_i^{\frac{p-1}{q}}$ is
a qth root of unity modulo p. Let z be an element of order q modulo p; there must exist one
otherwise
$$r \equiv \sum_{i=1}^r a_i^{\frac{p-1}{q}} \equiv 0 \pmod{p}$$
which is impossible as $p > r^{q-1}$. (In fact there must always exist one if p ≡ 1 (mod q) but this
is proven in Chapter 3. In fact, as you will see in Chapter 4, even if there did not exist one the
argument would still work.)
Then, as $\mathbb{F}_p$ is a field, the roots of $X^q - 1$ are $1, z, \ldots, z^{q-1}$ as these are all roots and this
polynomial has at most q roots. Thus, consider $k_i$ such that $z^{k_i} \equiv a_i^{\frac{p-1}{q}}$.

Let f be the polynomial $\sum_{i=1}^r X^{k_i}$. By assumption, p | f(z) so
$$\prod_{k=1}^{q-1} f(z^k) \equiv 0.$$

Also, by Proposition 1.3.1 we know this is congruent to
$$\prod_{k=1}^{q-1} f(\omega^k)$$
modulo p where ω ≠ 1 is a complex qth root of unity. However, by the triangle inequality,
$$\left|\prod_{k=1}^{q-1} f(\omega^k)\right| \le \prod_{k=1}^{q-1} (|1| + \ldots + |1|) = r^{q-1}.$$

Since $p > r^{q-1}$, this means that this product must be zero.

To conclude, as mentioned after Definition 1.2.2, we know that the minimal polynomial of ω is
$\frac{X^q - 1}{X - 1}$ by Theorem 3.2.1 or Corollary 5.1.5. Thus, by Proposition 1.2.1, this means that
$$X^{q-1} + \ldots + 1 \mid f.$$

Hence, we have
$$f = (X^{q-1} + \ldots + 1)g$$
for some g ∈ Z[X] as $X^{q-1} + \ldots + 1$ is monic. Finally, this means that q | f(1) = r which is a
contradiction.
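The statement can be tested by brute force on small parameters, which also shows that the size condition $p > r^{q-1}$ cannot simply be dropped. The sketch below is an exhaustive search and not the method of the solution; the function name `nontrivial_zero_sum` is hypothetical.

```python
from itertools import combinations_with_replacement

def nontrivial_zero_sum(p, q, r):
    """Search for a_1, ..., a_r, none divisible by p, whose (p-1)/q-th powers sum to 0 mod p.

    Returns such a tuple if one exists and None otherwise.
    Requires p = 1 mod q.
    """
    assert (p - 1) % q == 0
    e = (p - 1) // q
    for a in combinations_with_replacement(range(1, p), r):
        if sum(pow(x, e, p) for x in a) % p == 0:
            return a
    return None

# Hypotheses satisfied (q does not divide r and p > r^(q-1)): nothing should be found.
print(nontrivial_zero_sum(7, 3, 2))
print(nontrivial_zero_sum(13, 3, 2))
print(nontrivial_zero_sum(31, 5, 2))
# Here p = 7 is not greater than r^(q-1) = 16, and a counterexample appears.
print(nontrivial_zero_sum(7, 3, 4))
```

In the last case the squares 1, 1, 1 and 4 sum to 7, so the conclusion fails once p is too small compared to $r^{q-1}$.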

Problem 1.4.2 (Problems from the Book)


Let $a_1, \ldots, a_m \in \mathbb{R}$ be positive real numbers such that $\sqrt[n]{a_1} + \ldots + \sqrt[n]{a_m}$ is rational for any
integer n ≥ 1. Prove that $a_1 = \ldots = a_m = 1$.

Solution

We will proceed in two steps. First, we show that $a_1, \ldots, a_m$ are all algebraic numbers. Let
$b_i = \sqrt[m!]{a_i}$. Then, by assumption, for k ∈ [m],
$$p_k(b_1, \ldots, b_m) := \sum_{i=1}^m b_i^k$$
is a rational number. Thus, by Corollary B.2.1 of Newton's formulas (with K = Q), the elementary
symmetric polynomials evaluated at $b_1, \ldots, b_m$ are all rational: $b_1, \ldots, b_m$ are algebraic. Our
claim follows: $a_i = b_i^{m!}$ is also algebraic.

Finally, let N be a positive rational integer such that $N a_1, \ldots, N a_m$ are all algebraic integers.
There exists one by Exercise 1.4.1∗. Notice that
$$N\left(\sqrt[n]{a_1} + \ldots + \sqrt[n]{a_m}\right) = \sqrt[n]{N^{n-1}}\left(\sqrt[n]{N a_1} + \ldots + \sqrt[n]{N a_m}\right)$$
is an algebraic integer. Since by assumption it is rational, by Proposition 1.1.1 it is a rational
integer. Call it $u_n$. Since $\sqrt[n]{a_i} \to 1$, $(u_n)$ converges to Nm. As it is a sequence of integers, it
must be eventually constant. Take a sufficiently large n so that $u_n = Nm = u_{2n}$.

By the Cauchy-Schwarz inequality we have
$$m = \sqrt{1^2 + \ldots + 1^2} \sqrt{\sum_{i=1}^m \sqrt[n]{a_i}} \ge \sum_{i=1}^m \sqrt[2n]{a_i} = m$$
with equality if and only if all $a_i$ are the same. This is the conclusion that we wanted: since
$\sum_{i=1}^m \sqrt[n]{a_i} = m$, they must all be equal to 1.

Exercise 1.4.1∗. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Prove that there exists a rational integer N ≠ 0 such
that Nα is an algebraic integer.

1.5 Exercises
Elementary-Looking Problems
Exercise 1.5.1†. Find all non-zero rational integers a, b, c ∈ Z such that $\frac{a}{b} + \frac{b}{c} + \frac{c}{a}$ and $\frac{b}{a} + \frac{c}{b} + \frac{a}{c}$ are
also integers.
Exercise 1.5.2. Find all rational integers a, b ≠ 1 such that $\frac{a^4 - 1}{b^2 + 1} + \frac{b^4 - 1}{a^2 + 1}$ is also an integer.

Exercise 1.5.3† (USAMO 2009). Let $(a_n)_{n \ge 0}$ and $(b_n)_{n \ge 0}$ be two non-constant sequences of rational
numbers such that $(a_i - a_j)(b_i - b_j) \in \mathbb{Z}$ for any i, j. Prove that there exists a non-zero rational number
r such that $r(a_i - a_j)$ and $\frac{b_i - b_j}{r}$ are integers for any i, j.

Exercise 1.5.4 (AMM E 2998). Let x ≠ y ∈ C be complex numbers such that $\frac{x^n - y^n}{x - y}$ is a rational
integer for 4 consecutive values of n. Prove that it is always an integer for n ≥ 0.

Exercise 1.5.5† (Adapted from Irish Mathematical Olympiad 1998). Let x ∈ R be a real number
such that both $x^2 - x$ and $x^n - x$ for some n ≥ 3 are rational. Prove that x is rational.

Exercise 1.5.6. Suppose that $a_1, \ldots, a_m \in \mathbb{Z}$ are positive rational integers such that
$$\sum_{i=1}^m \sqrt[n_i]{a_i}$$
also is a rational integer. Prove that $\sqrt[n_i]{a_i}$ is a rational integer for any i = 1, . . . , m.

Exercise 1.5.7. Find the least n such that $\cos\frac{\pi}{n}$ can not be written in the form $a + \sqrt{b} + \sqrt[3]{c}$ for
some rational numbers a, b, c. (More generally, all such n will be determined in Chapter 3.)

Exercise 1.5.8 (Miklós Schweitzer Competition 2015). Let f, g ∈ C[X] be such that

$$f \circ g = X^n + X^{n-1} + \ldots + X + 2016$$

for some integer n ≥ 4. Prove that one of them must have degree 1.

Exercise 1.5.9†. Let |x| < 1 be a complex number. Define
$$S_n = \sum_{k=0}^{\infty} k^n x^k.$$

Suppose that there is an integer N ≥ 0 such that $S_N, S_{N+1}, \ldots$ are all rational integers. Prove that
$S_n$ is a rational integer for any integer n ≥ 0.

Exercise 1.5.10†. Let n ≥ 3 be an integer. Suppose that there exists a regular n-gon with integer
coordinates. Prove that n = 4.

Exercise 1.5.11† . Let P be a polygon with rational sidelengths for which there exists a real number
α ∈ R such that all its angles are rational multiples of α, except possibly one. Prove that cos α is
algebraic.

Exercise 1.5.12 (Adapted from USA TST 2007). Let $0 < \theta < \frac{\pi}{2}$ be a real number and m, n two
coprime rational integers. Suppose that cos θ is irrational but cos(mθ) and cos(nθ) are both rational.
Prove that $\theta = \frac{\pi}{6}$.

Exercise 1.5.13 (IMC 2001). Let k and n be positive integers and let f be a polynomial of degree n
with coefficients in {−1, 0, 1}. Suppose that $(X - 1)^k \mid f$ and that
$$\frac{p}{\log p} < \frac{k}{\log(n + 1)}$$
for some rational prime p. Prove that all complex roots of unity of order p are roots of f.

Exercise 1.5.14 (IZHO 2021). Let f ∈ Q[X] be an irreducible polynomial of degree n. Prove that
there are at most n polynomials g ∈ Q[X] of degree less than n such that f | f ◦ g.
Exercise 1.5.15†. Let $\omega_1, \ldots, \omega_m$ be nth roots of unity. Prove that $|\omega_1 + \ldots + \omega_m|$ is either zero or
greater than $m^{-n}$.
Exercise 1.5.16†. Let n ≥ 1 and $n_1, \ldots, n_k$ be integers. Prove that
$$\left|\cos\left(\frac{2\pi n_1}{n}\right) + \ldots + \cos\left(\frac{2\pi n_k}{n}\right)\right|$$
is either zero or greater than $\frac{1}{2(2k)^{n/2}}$.

Exercise 1.5.17† (USA TST 2014). Let N be an integer. Prove that there exists a rational prime p
and an element $\alpha \in \mathbb{F}_p^\times$ such that the orbit $\{1, \alpha, \alpha^2, \ldots\}$ has cardinality at least N and is sum-free,
meaning that $\alpha^i + \alpha^j \neq \alpha^k$ for any i, j, k. (You may assume that, for any n, there exist infinitely many
primes for which there is an element of order n in $\mathbb{F}_p$. This will be proven in Chapter 3.)

Properties of Algebraic Numbers


Exercise 1.5.18. Which of the following are algebraic integers?
p5
√ p
17

• 1+ 33− 4 − 7 2.

5+1
• 2 .

3+1
• 2 .
7
• 12 .
√3 √
7−i 4 5
• 6 .
√ i+2
• 2· 2 .

Exercise 1.5.19. Prove that $\overline{\mathbb{Q}}$ is a field.


Exercise 1.5.20†. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number with conjugates $\alpha_1, \ldots, \alpha_n$ and f ∈ Q[X]
be a polynomial. Prove that the m conjugates of f(α) are each represented exactly $\frac{n}{m}$ times among
$f(\alpha_1), \ldots, f(\alpha_n)$.
Exercise 1.5.21. Let $\alpha_1, \ldots, \alpha_m \in \overline{\mathbb{Q}}$ be algebraic numbers and $f \in \mathbb{Q}[X_1, \ldots, X_m]$ a polynomial.
Denote the conjugates of $\alpha_k$ by $\alpha_k^{(1)}, \ldots, \alpha_k^{(n_k)}$. Prove that the conjugates of $f(\alpha_1, \ldots, \alpha_m)$ are among
$$\{f(\alpha_1^{(i_1)}, \ldots, \alpha_m^{(i_m)}) \mid i_k = 1, \ldots, n_k\}.$$

Exercise 1.5.22†. Let $f \in \overline{\mathbb{Z}}[X]$ be a monic polynomial and α be one of its roots. Prove that α is an
algebraic integer.
Exercise 1.5.23†. We say an algebraic integer $\alpha \in \overline{\mathbb{Z}}$ is a unit if there exists an algebraic integer
$\alpha' \in \overline{\mathbb{Z}}$ such that $\alpha\alpha' = 1$. Characterise all units.
Exercise 1.5.24†. Let m be a rational integer. We say an algebraic integer $\alpha \in \overline{\mathbb{Z}}$ is a unit mod m if
there exists an algebraic integer $\alpha' \in \overline{\mathbb{Z}}$ such that $\alpha\alpha' \equiv 1 \pmod{m}$. Characterise all units mod m.
Exercise 1.5.25. Let $\alpha \in \overline{\mathbb{Z}}$ be an algebraic integer which is not a unit. Prove that the set of residues
of algebraic integers modulo α, denoted by $\overline{\mathbb{Z}}/\alpha\overline{\mathbb{Z}}$, is infinite.
Exercise 1.5.26†. Let $\alpha \in \overline{\mathbb{Z}}$ be a non-rational algebraic integer. Prove that there are a finite
number of rational integers m such that α is congruent to a rational integer mod m.
Exercise 1.5.27† (Kronecker's Theorem). Let $\alpha \in \overline{\mathbb{Z}}$ be a non-zero algebraic integer such that all its
conjugates have modulus at most 1. Prove that it is a root of unity.

Exercise 1.5.28†. Determine all non-zero algebraic integers $\alpha \in \overline{\mathbb{Z}}$ such that all their conjugates are
real and have modulus at most 2.
Exercise 1.5.29†. Suppose that ω is a root of unity whose real part is an algebraic integer. Prove
that $\omega^4 = 1$.
Exercise 1.5.30†. Let $\omega_1, \ldots, \omega_n$ be roots of unity. Suppose that $\frac{1}{n}(\omega_1 + \ldots + \omega_n)$ is a non-zero
algebraic integer. Prove that $\omega_1 = \ldots = \omega_n$.²
Exercise 1.5.31†. Let $\alpha \in \overline{\mathbb{Z}}$ be an algebraic integer and let p be a rational prime. Must it follow
that $\alpha^n \equiv 0 \pmod{p}$ or $\alpha^n \equiv 1 \pmod{p}$ for some n ∈ N?³

2 In fact, any algebraic integer that can be written as a linear combination of roots of unity with rational coefficients

can also be written as a linear combination of roots of unity with integer coefficients. However, this is a difficult result
to prove (see Exercise 3.5.26† for a special case).
3 In Chapter 4, we prove that the answer is positive for sufficiently large p.
Chapter 2

Quadratic Integers

Prerequisites for this chapter: Chapter 1 and Section A.2.

It is best to start with an example. Suppose we want to solve the equation $x^2 + 1 = y^3$. Write this
equation as
$$(x + i)(x - i) = y^3.$$
Imagine that we could conclude $x + i = (a + bi)^3$ (this is analogous to the rational integers case: if a
product of two coprime integers is a cube, then each factor is a cube¹). Thus, after expanding this, we
find $x = a(a^2 - 3b^2)$ and $1 = -b(b^2 - 3a^2)$. This is now very easy to solve: b = ±1 since it divides 1,
so $3a^2 = 1 \pm 1$ since $b^2 - 3a^2$ also divides 1. Thus, this implies a = 0 which finally means x = 0. We
conclude that the only solution is (x, y) = (0, 1).
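A quick brute-force search is consistent with this conclusion. The sketch below only examines a finite range, so it is a sanity check and in no way a proof; the function name `solutions` and the bound are arbitrary choices.

```python
def solutions(bound):
    """All integer pairs (x, y) with |x| <= bound and x^2 + 1 = y^3."""
    cubes = {}
    y = 0
    while y ** 3 <= bound ** 2 + 1:
        cubes[y ** 3] = y          # remember which y gives each cube
        y += 1
    return [(x, cubes[x * x + 1])
            for x in range(-bound, bound + 1) if x * x + 1 in cubes]

print(solutions(10 ** 4))
```

Only non-negative y need to be considered since $x^2 + 1 \ge 1$, and within this range the search finds the single pair (0, 1).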
It is remarkable to see that we have solved a problem about rational integers by introducing a
certain class of non-rational algebraic integers. The following sections aim to formalize this approach.
Exercise 2.0.1. Why is the "naive" approach of factorising the equation as $x^2 = (y - 1)(y^2 + y + 1)$ difficult
to conclude with? Why does our solution not work as well for the equation $x^2 - 1 = y^3$?

2.1 General Definitions


Given a quadratic integer α (meaning an algebraic integer of degree 2), we define the set
Z[α] := Z + αZ = {a + bα | a, b ∈ Z}.
Given a quadratic number α (meaning an algebraic number of degree 2), we define the set
Q(α) := Q + αQ = {a + bα | a, b ∈ Q}.
The former is in fact a ring, while the latter is a field (called a quadratic field ).

Remark 2.1.1
Normally, Z[α] is defined as the smallest ring containing Z and α, i.e. the ring of all polynomials
in α with integer coefficients. Similarly, Q(α) is defined as the smallest field containing Q and α,
i.e. the field of all rational functions in α with rational coefficients. We have chosen the previous
definition for the sake of clarity. See Chapter 6 for the general definition.

It might be confusing to see square brackets used for Z[α] while round brackets are used for Q(α).
In fact, Q[α] also exists: this is the smallest ring containing α as well as Q (so polynomials in α with
rational coefficients). It turns out that for algebraic numbers α, Q[α] = Q(α) (Exercise 6.1.2∗ ).
Thus, while it is technically correct to use square brackets, we have used round brackets to

1 However it remains to prove that x + i and x − i are indeed coprime, for a suitable definition of coprime. Or, if it’s

not the case, incorporate the gcd in the argument, again, for a suitable definition of gcd.


emphasise the fact that it is a field.

Exercise 2.1.1∗ . Prove that Z + αZ is a ring for any quadratic integer α. This amounts to checking that it
is closed under addition, subtraction, and multiplication. What happens if α is a quadratic number which is
not an integer?

Exercise 2.1.2∗. Prove that Q + αQ is a field for any quadratic number α. This amounts to checking that it
is closed under addition, subtraction, multiplication, and division.

Exercise 2.1.3∗ . Let α be a quadratic number and β ∈ Q(α). Show that β has degree 1 or 2.


Exercise 2.1.4∗. Prove that a quadratic field K is equal to $\mathbb{Q}(\sqrt{d})$ for some squarefree rational integer
d ≠ 1. Moreover, prove that such fields are pairwise non-isomorphic (and in particular distinct), meaning
that, for distinct squarefree a, b ≠ 1, there does not exist a bijective function $f : \mathbb{Q}(\sqrt{a}) \to \mathbb{Q}(\sqrt{b})$ such that
f(x + y) = f(x) + f(y) and f(xy) = f(x)f(y) for any $x, y \in \mathbb{Q}(\sqrt{a})$.

We have seen that algebraic integers have very important properties in algebraic number theory.

Definition 2.1.1 (Ring of Integers of Quadratic Fields)

Let α be a quadratic number. We define the ring of integers of K = Q(α) to be the ring

$$\mathcal{O}_{\mathbb{Q}(\alpha)} := \mathbb{Q}(\alpha) \cap \overline{\mathbb{Z}}$$

consisting of the elements of Q(α) which are also algebraic integers.

The following proposition characterises the ring of integers of a quadratic field since by
Exercise 2.1.4∗ any quadratic field is of the form $\mathbb{Q}(\sqrt{d})$ for some d.

Proposition 2.1.1*

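Let d ∈ Z be a squarefree rational integer. We have
$$\mathcal{O}_{\mathbb{Q}(\sqrt{d})} = \mathbb{Z}[\sqrt{d}]$$
if $d \not\equiv 1 \pmod{4}$, and
$$\mathcal{O}_{\mathbb{Q}(\sqrt{d})} = \mathbb{Z}\left[\frac{1 + \sqrt{d}}{2}\right]$$
if $d \equiv 1 \pmod{4}$.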

Remark 2.1.2

There is no ambiguity in writing $\mathbb{Q}(\sqrt{d})$ for negative d as the square root of d we choose doesn't
change that field. For instance, Z[i] = Z[−i] so we can write $\mathbb{Z}[\sqrt{-1}]$.

Proof

This follows from Exercise 1.2.5, which we reproduce here for the sake of completeness. Let
$x = a + b\sqrt{d}$ be an element of $\mathbb{Q}(\sqrt{d})$. If x is rational then it is an algebraic integer if and only if
it is a rational integer, which is indeed in $\mathbb{Z}[\sqrt{d}]$ and $\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]$. Otherwise, x is an integer if and only if
its minimal polynomial
$$X^2 - 2aX + (a^2 - db^2)$$
has integral coefficients, by Proposition 1.2.2. This means that 2a ∈ Z and $a^2 - db^2 \in \mathbb{Z}$. Thus,
$4b^2 d \in \mathbb{Z}$ so 2b ∈ Z since d is a squarefree rational integer. Let $a = \frac{a'}{2}$ and $b = \frac{b'}{2}$ for some
a′, b′ ∈ Z. We see that x is an integer if and only if
$$a^2 - db^2 = \frac{(a')^2 - d(b')^2}{4} \in \mathbb{Z}.$$
This is now an easy exercise in congruences: if one of a′, b′ is odd then the other one must be
too since $4 \nmid d$. However, an odd perfect square is always congruent to 1 modulo 4, thus if d ≡ 1
(mod 4), (a′, b′) works if and only if they have the same parity, otherwise they must both be even.
This is exactly what we wanted to prove.
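The criterion used in this proof (2a and $a^2 - db^2$ must both be rational integers) is easy to turn into a small test. The following sketch does so with exact fractions; the function name `is_quadratic_integer` is hypothetical, and d is assumed to be squarefree and different from 1.

```python
from fractions import Fraction

def is_quadratic_integer(a, b, d):
    """Whether a + b*sqrt(d) is an algebraic integer (a, b rational, d squarefree, d != 1)."""
    a, b = Fraction(a), Fraction(b)
    if b == 0:
        return a.denominator == 1          # rational case: Proposition 1.1.1
    trace = 2 * a                          # minus the coefficient of X
    norm = a * a - d * b * b               # the constant coefficient
    return trace.denominator == 1 and norm.denominator == 1

half = Fraction(1, 2)
print(is_quadratic_integer(half, half, 5))    # (1 + sqrt(5)) / 2, with 5 = 1 mod 4
print(is_quadratic_integer(half, half, 3))    # (1 + sqrt(3)) / 2, with 3 = 3 mod 4
print(is_quadratic_integer(half, half, -3))   # (1 + sqrt(-3)) / 2, with -3 = 1 mod 4
```

The first and third numbers are algebraic integers while the second is not, in agreement with the dichotomy of Proposition 2.1.1.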


Since a quadratic number has exactly one conjugate distinct from itself, we will call it the conjugate.

Definition 2.1.2 (Conjugation in a Quadratic Field)


Let d ≠ 1 be a squarefree rational integer, and let $\alpha = a + b\sqrt{d} \in \mathbb{Q}(\sqrt{d})$. The conjugate of α,
denoted $\overline{\alpha}$, is $a - b\sqrt{d}$.

In particular, this conjugate is also defined for rational numbers, in which case we have $\overline{\alpha} = \alpha$.
It is true that this is the same notation as the complex conjugate, but the context will make it clear
what is meant. (Note that, when d = −1, this conjugate is exactly the complex conjugate, but this is
the only time this happens.)
Exercise 2.1.5∗ . Prove that the conjugate is well defined.

Exercise 2.1.6∗. Let d ≠ 1 be a rational squarefree number. Prove that the conjugation satisfies $\overline{\alpha + \beta} = \overline{\alpha} + \overline{\beta}$
and $\overline{\alpha\beta} = \overline{\alpha}\,\overline{\beta}$ for all $\alpha, \beta \in \mathbb{Q}(\sqrt{d})$. Such a function is called an automorphism of $\mathbb{Q}(\sqrt{d})$ if it is also bijective.

Exercise 2.1.7. Let d ≠ 1 be a rational squarefree number. Prove that the only automorphisms of $\mathbb{Q}(\sqrt{d})$
are the identity and conjugation.
We now define a very important map. See Chapter 6 for more.

Definition 2.1.3 (Absolute Norm)

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Its absolute norm N(α) is defined as the product of its
conjugates.

In other words, the norm of α is (−1)ⁿ times the constant coefficient of its minimal polynomial, where n is the degree of α, by Vieta’s formulas A.1.4. This norm however isn’t convenient to work with in specific fields because it is not homogeneous: N(2) = 2¹N(1) but N(2√2) = 2²N(√2). This is because √2 has two conjugates while 1 has only one conjugate. We thus define

Definition 2.1.4 (Norm in Quadratic Fields)



Let d ≠ 1 be a squarefree rational integer. We define the norm N_{Q(√d)} : Q(√d) → Q by

N_{Q(√d)}(α) = αᾱ.

When the context makes it clear what the base field is, we will drop the Q(√d) and simply write N.

This norm is now homogeneous, and even multiplicative! It corresponds to the absolute norm for
quadratic integers, and to the square of the absolute norm for rational integers.
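To make the norm concrete, here is a small numerical sketch. Storing a + b√d as the pair (a, b), and the names `mul`, `conj` and `norm`, are our own conventions and not the book's. The sketch checks on one example that αᾱ = a² − db², that the norm is multiplicative, and that N(2α) = 2²N(α).

```python
from fractions import Fraction

def mul(x, y, d):
    """Product of x = a + b*sqrt(d) and y = c + e*sqrt(d), stored as pairs."""
    (a, b), (c, e) = x, y
    return (a * c + d * b * e, a * e + b * c)

def conj(x):
    """Conjugate a - b*sqrt(d) of a + b*sqrt(d)."""
    a, b = x
    return (a, -b)

def norm(x, d):
    """N(x) = x * conj(x) = a^2 - d*b^2, a rational number."""
    a, b = x
    return a * a - d * b * b

d = 2
alpha = (Fraction(1), Fraction(1))       # 1 + sqrt(2)
beta = (Fraction(3, 2), Fraction(-5))    # 3/2 - 5*sqrt(2)
two = (Fraction(2), Fraction(0))

# alpha * conj(alpha) has no sqrt(d) part and equals a^2 - d*b^2.
assert mul(alpha, conj(alpha), d) == (norm(alpha, d), 0)
# Multiplicativity and homogeneity on this example.
assert norm(mul(alpha, beta, d), d) == norm(alpha, d) * norm(beta, d)
assert norm(mul(two, alpha, d), d) == 4 * norm(alpha, d)
```

A check on one pair of numbers is of course no substitute for Exercise 2.1.8∗.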


Exercise 2.1.8∗. Let d ≠ 1 be a squarefree rational integer, and α, β ∈ Q(√d). Prove that N(αβ) = N(α)N(β).

Exercise 2.1.9. Prove Exercise 2.1.8∗ without any computations using Exercise 2.1.6∗ .

Exercise 2.1.10. Let d < 0 be a squarefree integer. Prove that the conjugate of an element of Q(√d) is the same as its complex conjugate. In particular, the norm over Q(√d) is the square of the complex modulus.

2.2 Unique Factorisation


This section will be a bit of abstract nonsense. I hope the reader doesn’t get too confused.
Our goal is to have an analogue of the fundamental theorem of arithmetic in quadratic rings of integers. This, however, will not hold over every such ring (in fact it hasn’t even been proven that it holds for infinitely many of them!) but it will still yield substantial applications such as the
diophantine equation we "solved" in the beginning of the chapter. First we have to define what
"unique factorisation" means. It’s not just "each element can be written in a unique way as a product
of primes", because in Z we need to add a sign for negatives. This is because Z has two units (1 and
−1), while N only has one (1).

Definition 2.2.1 (Unit)

We say an element α of a ring R is a unit if it’s invertible, i.e., there exists some β ∈ R such that αβ = βα = 1.

Exercise 2.2.1∗. Let d ≠ 1 be a squarefree rational integer. Prove that the product of two units of O_{Q(√d)} is still a unit, and that the conjugate of a unit is also a unit.

Exercise 2.2.2∗. Let d ≠ 1 be a squarefree rational integer. Prove that α ∈ O_{Q(√d)} is a unit if and only if |N(α)| = 1.

Exercise 2.2.3∗ . Determine the units of the ring Z[i].

Definition 2.2.2 (Unique Factorisation Domain)

We say an integral domain R has unique factorisation and is a unique factorisation domain
(UFD) if there exists a set of elements of R called primes such that each non-zero element α ∈ R
can be written in a unique way as a product of primes

α = p_1 · … · p_n

up to permutation and multiplication by units.

Indeed, the factorisation 6 = (−2)(−3) doesn’t bring anything new to the factorisation 6 = 2 · 3.
In Z, there is a canonical way to say which of −2 and 2 is prime, but in general there isn’t (and it
is actually more useful to say they are both prime). Thus, we say two primes p and q are associates
if there is a unit u such that q = up. Having unique factorisation then means that it is unique up to
permutations and associates.
We now discuss some ways to prove an integral domain is a UFD. Recall how unique factorisation is proven in Z. We define prime numbers as usual, then prove Bézout’s lemma (if a and b are coprime there are x and y such that ax + by = 1) and from this deduce the fundamental Euclid lemma: if some prime divides a product, it divides one of the factors. Finally, we induct on the natural numbers to show that a prime factorisation always exists and, using Euclid’s lemma, that it’s unique up to permutation.

We wish to imitate this process. It suggests that the fundamental fact about a prime number is Euclid’s lemma, and not that it can’t be written as a non-trivial product. It also suggests that Bézout’s lemma is the fundamental step. We thus make the following definitions.

We first take care of our "objection" about primes: they should be defined as having the Euclid
property instead of not being writable as a non-trivial product.

Definition 2.2.3 (Prime Element)

We say a non-unit p ∈ R is a prime element if it is non-zero and, for all a, b ∈ R, p | ab implies p | a or p | b. (Divisibility is defined as usual: α | β if there exists a γ ∈ R such that β = αγ.)

Definition 2.2.4 (Irreducible Element)

We say a non-unit x ∈ R is an irreducible element if it is non-zero and x = αβ implies that α is a unit or β is one.

The usual definition of a prime in Z is thus as an irreducible element, instead of as a prime one.
To further distinguish prime elements from prime numbers, we will thus call the latter rational primes
(because they are rational integers). This is somewhat contradictory as the primes of Z are ±p, but
by "rational prime" we will always mean a prime of N, i.e. a positive prime number.
Exercise 2.2.4∗ . Prove that an associate of a prime is also prime.

Exercise 2.2.5∗ . Prove that the conjugate of a prime is also a prime.

Exercise 2.2.6∗ . Prove that primes are irreducible.

Exercise 2.2.7∗. Let d ≠ 1 be a squarefree rational integer and let x ∈ O_{Q(√d)} be a quadratic integer. Suppose that |N(x)| is a rational prime. Prove that x is irreducible.

Exercise 2.2.8∗ . Suppose a prime p divides another prime q. Prove that p and q are associates.

Exercise 2.2.9∗ . Prove that p is a prime element of R if and only if it is non-zero and R (mod p) is an
integral domain (this means that the product of two non-zero elements is still non-zero). In particular, if R
(mod p) is a field (this means that elements which are not divisible by p have an inverse mod p), p is prime.

Exercise 2.2.10. Let d ≠ 1 be a squarefree rational integer and let p ∈ O_{Q(√d)} be a prime. Prove that p divides exactly one rational prime q ∈ Z.

Exercise 2.2.11. Prove that 2 is irreducible in Z[√−5] = O_{Q(√−5)} but not prime.

Exercise 2.2.12. Show that the primes of Definition 2.2.2 must all be prime elements, and that there is at least one associate of each prime element in that set. (Conversely, if we have unique factorisation, any such set of primes works. This explains why we consider all primes defined in Definition 2.2.3.)

We now define formally the "Bézout property".

Definition 2.2.5 (Bézout Domain)

We say an integral domain R is a Bézout domain if, for any α, β ∈ R, there exists γ ∈ R such that

αR + βR = γR.

We say such a γ is a greatest common divisor (gcd) of α and β.



Exercise 2.2.13∗. Prove that a greatest common divisor γ of α and β really is a greatest common divisor of α and β, in the sense that γ | α, β and, if δ | α, β, then δ | γ.

Exercise 2.2.14∗. Prove that an associate of a greatest common divisor is also a greatest common divisor, and that the greatest common divisor of two elements is unique up to association.
Now that we have defined rings where Bézout’s lemma holds, let’s see how we can prove that the
rings we are interested in have this property. Here is the usual proof that Z is a Bézout Domain.

Proof that Z is a Bézout Domain

Let a, b ∈ Z be rational integers, without loss of generality positive. Let c be the minimal positive element of aZ + bZ. We wish to show that aZ + bZ = cZ. Suppose that it is not the case: there exists some d ∈ aZ + bZ with c ∤ d. Perform the Euclidean division of d by c: d = cq + r where 0 < r < c. Thus, we have a positive element r = d − cq ∈ aZ + bZ smaller than c, a contradiction since we assumed c was the smallest.
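The proof above is not constructive, but repeated Euclidean division produces the generator c together with explicit coefficients. The sketch below uses the standard extended Euclidean algorithm, which is a different route from the minimality argument of the proof; the name `bezout` is ours.

```python
def bezout(a: int, b: int):
    """Return (g, x, y) with a*x + b*y == g, where g generates aZ + bZ."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r                      # Euclidean division old_r = q*r + remainder
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x         # invariant: a*old_x + b*old_y == old_r
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

g, x, y = bezout(240, 46)
assert g == 2 and 240 * x + 46 * y == 2
```

Since g divides both a and b and is itself of the form ax + by, we indeed have aZ + bZ = gZ.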


Aha! What we need is a Euclidean division! More specifically, we need to be able to write α = ρβ+τ
for some τ which is "smaller" in some sense than β. This yields the following definition.

Definition 2.2.6 (Euclidean Domain)

We say an integral domain R is Euclidean if there exists a function f : R → N such that for any α, β ∈ R with β ≠ 0 there exist ρ, τ ∈ R such that α = ρβ + τ and f(τ) < f(β). Such a function
f will be called a Euclidean function.

Remark 2.2.1
The remainder doesn’t have to be unique.

Exercise 2.2.15. Let R be a Euclidean domain with Euclidean function f. Show that, if f(α) = 0, then α = 0, and if f(α) = 1, then α is a unit or zero.
The reason why we introduced a function f : R → N is to get a measure of the size of an element of R. This is the role of f(n) = n over N, and f(n) = |n| over Z (if one wishes to prove directly that unique factorisation holds there). Over quadratic rings of integers, this function will usually be the absolute value of the norm. If O_{Q(√d)} is Euclidean for the absolute value of the norm, we say it is norm-Euclidean. By abuse of terminology we will also sometimes say Q(√d) is norm-Euclidean.
The same proof as the proof that Z is a Bézout domain shows the following very important propo-
sition.

Proposition 2.2.1*

Any Euclidean domain is a Bézout domain.

Exercise 2.2.16∗ . Prove that a Euclidean domain is a Bézout domain.

Exercise 2.2.17∗ . Prove that irreducible elements are prime in a Bézout domain.
We can now state our main theorem. It might seem a bit restrictive but it works over any ring of integers, and we invite the reader to prove it after reading Chapter 6.

Theorem 2.2.1

Any quadratic ring of integers which is a Bézout domain is a UFD.



Proof

Let O be that ring of integers. We proceed exactly like in Z. First, we prove the existence of
a prime factorisation. Suppose that a non-zero element α ∈ O has no prime factorisation and
choose its norm N (α) to be the smallest in absolute value. Clearly, α isn’t a unit since units are
their own prime factorisation (the empty factorisation) and isn’t prime either. Since irreducible
elements are primes by Exercise 2.2.17∗ , α is not irreducible so α = βγ for some non-units
β, γ. Since N(α) = N(β)N(γ), we have |N(β)|, |N(γ)| < |N(α)| because |N(β)|, |N(γ)| ≠ 1 by
Exercise 2.2.2∗ . Thus, since α was the smallest element with no prime factorisation, β and γ
have one: this means that βγ = α also has one, a contradiction.

It remains to prove the uniqueness of this factorisation. Suppose an element α has two different
prime factorisations
p_1 · … · p_n = q_1 · … · q_m
and take m + n to be minimal. Since p_n is prime, by definition, it must divide one of the q_i, say, q_m. By Exercise 2.2.8∗, up_n = q_m for some unit u. Finally, this means that

p_1 · … · p_{n−1} = q_1 · … · (uq_{m−1})

so we have two different factorisations of smaller length for the same element. This is a contra-
diction since we assumed m + n was minimal.


Combining this with Proposition 2.2.1, we have proven that it suffices to find a Euclidean function to show that a quadratic ring of integers has unique factorisation. By abuse of notation, we will also say that Q(√d) has unique factorisation if O_{Q(√d)} does.

Remark 2.2.2
Most quadratic rings of integers are not Euclidean domains, Bézout domains, or even UFDs. In fact, it has only been conjectured that there are infinitely many squarefree d ≠ 1 in Z such that O_{Q(√d)} is a UFD! For negative d there is a complete list

{−1, −2, −3, −7, −11, −19, −43, −67, −163}

but the problem is still open for positive d.

Similarly, OQ(√d) is norm-Euclidean only for

d ∈ {−11, −7, −3, −2, −1, 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, 73}.

On the other hand, it has been conjectured that, for positive d, if O_{Q(√d)} is a UFD then it is Euclidean for some exotic Euclidean function! This has been proven recently for d = 14 and d = 69. Note that these are not part of the previous list. For negative d, O_{Q(√d)} is Euclidean if and only if d ∈ {−1, −2, −3, −7, −11}.

2.3 Gaussian Integers


Time for applications! We go back to the Gaussian integers Z[i] which we used at the beginning. The norm in Q(i) is N(a + bi) = a² + b².

Proposition 2.3.1*

Z[i] is norm-Euclidean.

Proof

Let α = a + bi ∈ Z[i] and β = c + di ∈ Z[i] with β ≠ 0. Consider the number x + yi = α/β. Choose rational integers m and n such that |x − m| ≤ 1/2 and |y − n| ≤ 1/2. Thus,

$$|N(x + yi - (m + ni))| \leq \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 = \frac{1}{2}.$$

Hence,

$$|N(\alpha - \beta(m + ni))| = |N(\beta)| \cdot |N(x + yi - (m + ni))| \leq \frac{|N(\beta)|}{2}$$

which means that the remainder τ = α − β(m + ni) works since it has norm less than |N(β)|.
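The rounding argument of this proof translates directly into an algorithm. In the sketch below, a Gaussian integer a + bi is stored as the pair (a, b); this representation and the names `g_mul`, `g_norm`, `round_div` and `g_divmod` are our own choices. The quotient α/β is computed as αβ̄/N(β) so that only integer arithmetic is needed.

```python
def g_mul(x, y):
    """Product of Gaussian integers stored as pairs (real part, imaginary part)."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def g_norm(x):
    return x[0] ** 2 + x[1] ** 2

def round_div(n, m):
    """An integer at distance at most 1/2 from n/m, for m > 0."""
    return (2 * n + m) // (2 * m)

def g_divmod(alpha, beta):
    """Return (q, r) with alpha = q*beta + r and N(r) <= N(beta)/2."""
    n = g_norm(beta)
    num = g_mul(alpha, (beta[0], -beta[1]))   # alpha/beta = alpha*conj(beta)/N(beta)
    q = (round_div(num[0], n), round_div(num[1], n))
    qb = g_mul(q, beta)
    return q, (alpha[0] - qb[0], alpha[1] - qb[1])

alpha, beta = (27, 23), (8, 1)
q, r = g_divmod(alpha, beta)
assert (q, r) == ((4, 2), (-3, 3))            # 27 + 23i = (4 + 2i)(8 + i) + (-3 + 3i)
assert 2 * g_norm(r) <= g_norm(beta)          # N(r) = 18, N(beta) = 65
```

As Remark 2.2.1 points out, the remainder is not unique: a different rounding convention for ties would give another valid pair.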


Corollary 2.3.1*

Z[i] has unique factorisation.

We shall now analyse the prime elements of Z[i]. Suppose α ∈ Z[i] is prime. Then N(α) = αᾱ must have at most two rational prime factors since it has exactly two prime factors in Z[i] (by Exercise 2.2.5∗). Moreover, if it has exactly two rational prime factors, then α is an associate of one of them and we may assume without loss of generality that it is a rational prime.
The problem of finding the primes of Z[i] is therefore reduced to finding when a rational prime p ∈ Z stays prime in Z[i], and when it splits as a product of two Gaussian primes αᾱ. Here we may assume p is positive, since N(α) = a² + b² is never negative.

Theorem 2.3.1 (Gaussian Primes)

The primes of Z[i] are, up to multiplication by a unit,

• 1 − i.
• a + bi and a − bi where a² + b² = p for some (positive) rational prime p ≡ 1 (mod 4).
• p where p ≡ −1 (mod 4) is some (positive) rational prime.

In algebraic number theory terminology, we say

• 2 ramifies because it becomes non-squarefree (2 = i(1 − i)²),
• p ≡ 1 (mod 4) splits, and
• p ≡ −1 (mod 4) stays inert because it stays prime.

Proof

First, we see that 2 = (1 + i)(1 − i) = i(1 − i)² and that N(1 − i) = 2 so these are primes by Exercise 2.2.7∗ (recall that irreducible elements of Z[i] are prime by Exercise 2.2.17∗).
Suppose an odd rational prime p ∈ Z does not stay inert in Z[i]. Then, by the previous discussion, there is an α = a + bi ∈ Z[i] such that a² + b² = N(α) = p. The numbers a and b are clearly not divisible by p, so (a · b⁻¹)² + 1 ≡ 0 (mod p). By Exercise 2.3.1∗, we must have p ≡ 1 (mod 4). Thus, p ≡ 3 (mod 4) stays inert. It remains to prove that p ≡ 1 (mod 4) splits. This follows from Exercise 2.3.2∗: let n be an integer such that p | n² + 1. Then,
p | (n + i)(n − i)

but p ∤ n + i, n − i so p isn’t prime in Z[i] as wanted. To show that it doesn’t ramify, write p = ππ̄ for some Gaussian prime π and notice that the gcd of a + bi = π and a − bi = π̄ divides 2a and 2b so divides 2. However, for p ≠ 2, π doesn’t divide 2 so they have gcd 1, i.e. π and π̄ are not associates.


Exercise 2.3.1∗. Let n ∈ Z be a rational integer and p an odd rational prime. If n² ≡ −1 (mod p), prove that p ≡ 1 (mod 4).

Exercise 2.3.2∗. Let p ≡ 1 (mod 4) be a rational prime. Prove that there exists a rational integer n such that n² ≡ −1 (mod p). (Hint: Consider (p − 1)!.)

As a corollary, we get

Corollary 2.3.2 (Fermat’s Two Square Theorem)

Any rational prime congruent to 1 modulo 4 is a sum of two squares of rational integers.
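The proof of Theorem 2.3.1 can be turned into a procedure for actually finding the two squares: take n with p | n² + 1 and compute a gcd of p and n + i in Z[i] with the Euclidean algorithm; this gcd is a Gaussian prime of norm p. The sketch below assumes that p is a prime congruent to 1 modulo 4 (this is not checked beyond the congruence), takes n = ((p − 1)/2)! mod p following the hint of Exercise 2.3.2∗, and uses our own names `g_mod`, `g_gcd`, `sqrt_of_minus_one` and `two_squares`.

```python
def g_mul(x, y):
    """Product of Gaussian integers stored as pairs (real part, imaginary part)."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def g_norm(x):
    return x[0] ** 2 + x[1] ** 2

def g_mod(alpha, beta):
    """A remainder of alpha modulo beta with norm at most N(beta)/2."""
    n = g_norm(beta)
    num = g_mul(alpha, (beta[0], -beta[1]))
    q = ((2 * num[0] + n) // (2 * n), (2 * num[1] + n) // (2 * n))
    qb = g_mul(q, beta)
    return (alpha[0] - qb[0], alpha[1] - qb[1])

def g_gcd(alpha, beta):
    """Euclidean algorithm in Z[i]; the result is defined up to a unit."""
    while beta != (0, 0):
        alpha, beta = beta, g_mod(alpha, beta)
    return alpha

def sqrt_of_minus_one(p):
    """((p-1)/2)! mod p, a square root of -1 when p is a prime with p = 1 (mod 4)."""
    n = 1
    for t in range(2, (p - 1) // 2 + 1):
        n = n * t % p
    return n

def two_squares(p):
    """Assumes p is a rational prime with p % 4 == 1; returns (a, b) with a^2 + b^2 == p."""
    assert p % 4 == 1
    n = sqrt_of_minus_one(p)
    a, b = g_gcd((p, 0), (n, 1))
    return abs(a), abs(b)

for p in (5, 13, 17, 29, 97, 1009):
    a, b = two_squares(p)
    assert a * a + b * b == p
```

For p = 13 this computes n = 5 and gcd(13, 5 + i) = −2 − 3i, recovering 13 = 2² + 3².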

Exercise 2.3.3. Which rational integers can be written as a sum of two squares of rational integers?

Exercise 2.3.4∗. Find all rational integer solutions to the equation x² + 1 = y³. (This is the example we considered in the beginning of the chapter.)

2.4 Eisenstein Integers



In this section we look at the field of Eisenstein numbers Q(j) where j = exp(2iπ/3) = (−1 + i√3)/2 satisfies

$$0 = \frac{j^3 - 1}{j - 1} = j^2 + j + 1.$$

By Proposition 2.1.1, we have O_{Q(j)} = Z[j] since −3 ≡ 1 (mod 4).

Remark 2.4.1
A small word of warning: the notations we use for a primitive third root of unity j, along with
the notation for a primitive fourth root of unity i are the same as the ones we usually use for
indexing sums, sets, etc. Which notation is being used should be clear from the context, and,
generally (but not always), we shall also redefine j before using it, as it is less standard than i.

Exercise 2.4.1∗. Prove that the norm of a + bj is a² − ab + b². (Bonus: do it without any computations using cyclotomic polynomials from Chapter 3.)

Exercise 2.4.2∗ . Determine the units of Z[j].

Exercise 2.4.3∗ . Prove that Z[j] is norm-Euclidean.

Exercise 2.4.4. Characterise the primes of Z[j]. Conclude that when p ≡ 1 (mod 3) there exist rational integers a and b such that p = a² − ab + b². (You may assume that there is an x ∈ Z such that x² + x + 1 ≡ 0 (mod p) if p ≡ 1 (mod 3). This will be proven in Chapter 3, as a corollary of Theorem 3.3.1.)

We now look at a very interesting application of Eisenstein integers: Fermat’s last theorem for
n = 3. In fact, we will even show the following stronger result.

Theorem 2.4.1

There do not exist non-zero Eisenstein integers α, β, γ ∈ Z[j] such that α³ + β³ + γ³ = 0.

Let λ = 1 − j. Since 3 = N(1 − j) = (1 − j)(1 − j²) = λ²(1 + j) we see that λ is prime and that λ² is the prime factorisation of 3 (up to a unit) because 1 + j = −j² is a unit.

Exercise 2.4.5∗. Let θ ∈ Z[j] be an Eisenstein integer. Prove that, if λ ∤ θ, then θ ≡ ±1 (mod λ). In that case, prove that we also have θ³ ≡ ±1 (mod λ⁴).

Proof

We will in fact prove that the equation

α³ + β³ + εγ³ = 0

where ε is a unit does not have non-zero solutions in Z[j] where λ ∤ α, β. This will imply that α³ + β³ + γ³ = 0 does not have non-zero solutions either. Indeed, suppose (α, β, γ) is a solution of the latter. Without loss of generality, suppose they are pairwise coprime. Then, either λ ∤ α, β, γ in which case it is also a solution to the former, or we can suppose λ | γ by symmetry which again makes it a solution of the former as λ can’t divide α or β.
Thus, suppose for the sake of a contradiction that (α, β, γ) is a solution of α³ + β³ + εγ³ = 0 for some unit ε and where λ ∤ α, β. Without loss of generality, assume they are pairwise coprime. Suppose also v_λ(γ) is minimal among the solutions. If it is zero, by Exercise 2.4.5∗,

α³ + β³ + εγ³ ∈ {±ε, ±2 ± ε} (mod λ⁴)

and we can check that this is never divisible by λ⁴: the norm of the former is in {1, 3, 9, 7} while the norm of λ⁴ is 3⁴ = 81. Thus, we already reach a contradiction: α³ + β³ + εγ³ can’t be zero if λ ∤ α, β, γ.
Now, suppose γ = λⁿδ for some λ ∤ δ and n ≥ 1. Write

α³ + β³ = (α + β)(α + βj)(α + βj²) = −ελ^{3n}δ³.

By Exercise 2.4.6∗, the gcd of each pair of factors is λ. By replacing β by βj^k for a suitable k, we may assume that v_λ(α + j^ℓ β) = 1 for ℓ ∈ {1, 2}, so that v_λ(α + β) = 3n − 2. Then, by unique factorisation, there exist units u, v, w and Eisenstein integers x, y, z ∈ Z[j] with λ ∤ x, y, z such that

$$\begin{cases} \alpha + \beta j^2 = u\lambda x^3 \\ \alpha + \beta j = v\lambda y^3 \\ \alpha + \beta = w\lambda^{3n-2} z^3 \end{cases} \iff \begin{cases} \alpha j^2 + \beta j = uj^2\lambda x^3 =: u'\lambda x^3 \\ \alpha j + \beta j^2 = vj\lambda y^3 =: v'\lambda y^3 \\ \alpha + \beta = w\lambda^{3n-2} z^3 =: w'\lambda^{3n-2} z^3. \end{cases}$$

To conclude, we build a smaller solution from x, y, z: by summing the three lines we get

u′λx³ + v′λy³ + w′λ^{3n−2}z³ = 0

for some units u′, v′, w′ since j² + j + 1 = 0.
Now, divide everything by u′λ to get

x³ + µy³ + ηλ^{3(n−1)}z³ = 0

for units µ, η. If n = 1, we get, modulo λ⁴, ±1 ± µ ± η ≡ 0 which is easily seen to be impossible. Thus n − 1 ≥ 1. Modulo λ³, we get ±1 ± µ ≡ 0 so µ must be ±1. Finally, we get x³ + (±y)³ + ηλ^{3m}z³ = 0 with m = n − 1, that is, 1 ≤ m < n, which contradicts the minimality of n. In other words, there are no solutions.
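The claim used in the first step of the proof, namely that the elements ±ε and ±2 ± ε have norm in {1, 3, 7, 9}, is a finite check. The sketch below carries it out; storing a + bj as the pair (a, b) and the names `e_norm`, `UNITS`, `residues` are our own. It uses the norm formula a² − ab + b² of Exercise 2.4.1∗ and takes for granted that the units of Z[j] are ±1, ±j, ±j² (Exercise 2.4.2∗), with j² = −1 − j.

```python
def e_norm(x):
    """Norm of the Eisenstein integer a + b*j stored as (a, b)."""
    a, b = x
    return a * a - a * b + b * b

# The six units +-1, +-j, +-j^2 written in the basis (1, j); note 1 + j = -j^2.
UNITS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1)]
assert all(e_norm(u) == 1 for u in UNITS)

# All elements of the form s + eps with s in {0, 2, -2} and eps a unit.
residues = [(s + a, b) for s in (0, 2, -2) for (a, b) in UNITS]
norms = {e_norm(x) for x in residues}

assert norms == {1, 3, 7, 9}
# If lambda^4 divided x, then N(lambda^4) = 81 would divide N(x).
assert all(n % 81 != 0 for n in norms)
```

Since none of these norms is divisible by 81, none of these elements is divisible by λ⁴.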


Exercise 2.4.6∗. Let α, β ∈ Z[j] be coprime Eisenstein integers non-divisible by λ. Prove that, if

λ | α³ + β³ = (α + β)(α + βj)(α + βj²),

each pair of factors has gcd λ.

Exercise 2.4.7. Check the computational details: ±1 ± µ ± η is never zero mod λ⁴ for units µ, η and ±1 ± µ ≡ 0 (mod λ³) implies µ = ±1.

Remark 2.4.2
The reason why Eisenstein integers turned out to be so useful to solve Fermat’s last theorem for n = 3 is that a³ + b³ factorises completely there. See Exercise 3.5.30† for more cases.

Remark 2.4.3
The part where we looked at the equation modulo λ⁴ is completely analogous to the proof that a³ + b³ + c³ = 0 does not have rational integer solutions where 3 ∤ a, b, c by looking at the equation modulo 9. In fact it is exactly the same, as λ⁴ is a unit times 9.

2.5 Hurwitz Integers


In this section we discuss the Hurwitz integers. These are not algebraic numbers as they are not even
complex numbers, but they fit perfectly in this chapter as the reader will quickly see. They will allow
us to prove the four square theorem, stating that any positive integer is a sum of four squares, in
a similar manner as our proof of the two square theorem. First, we define the quaternion numbers,
which were introduced by Hamilton. Recall that a skew field is like a field but where multiplication is not necessarily commutative (see Definition A.2.6).

Definition 2.5.1 (Quaternions)

The skew field of the quaternion numbers H is defined as the algebra R[i, j, k] := R + iR + jR + kR where i, j, k satisfy the following multiplication rules:

$$\begin{cases} i^2 = j^2 = k^2 = -1 \\ ij = k = -ji \\ jk = i = -kj \\ ki = j = -ik. \end{cases}$$

Remark 2.5.1
One usually sees the quaternions defined by the equations i² = j² = k² = ijk = −1.

Exercise 2.5.1∗. Prove that ij = k = −ji, jk = i = −kj and ki = j = −ik follow from i² = j² = k² = ijk = −1 and associativity of the multiplication.

Remark 2.5.2
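One may also represent quaternions by the algebra of two by two complex matrices of the form

$$\begin{pmatrix} a + di & b + ci \\ -b + ci & a - di \end{pmatrix} = a\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} + c\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} + d\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} := a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$$

(inside the matrices, i denotes the usual complex number). It is then an easy exercise to check that i² = j² = k² = ijk = −1.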
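The matrix model makes the multiplication rules easy to verify mechanically. The sketch below encodes the three matrices of Remark 2.5.2 as nested tuples of Python complex numbers (the names `matmul`, `I`, `J`, `K`, `MINUS_ONE` and `neg` are ours) and checks the rules of Definition 2.5.1 together with ijk = −1.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[r][t] * B[t][c] for t in range(2)) for c in range(2))
        for r in range(2)
    )

def neg(A):
    return tuple(tuple(-entry for entry in row) for row in A)

MINUS_ONE = ((-1, 0), (0, -1))
I = ((0, 1), (-1, 0))        # matrix of the quaternion i
J = ((0, 1j), (1j, 0))       # matrix of the quaternion j
K = ((1j, 0), (0, -1j))      # matrix of the quaternion k

# i^2 = j^2 = k^2 = ijk = -1
assert matmul(I, I) == MINUS_ONE
assert matmul(J, J) == MINUS_ONE
assert matmul(K, K) == MINUS_ONE
assert matmul(matmul(I, J), K) == MINUS_ONE

# ij = k = -ji, jk = i = -kj, ki = j = -ik
assert matmul(I, J) == K and matmul(J, I) == neg(K)
assert matmul(J, K) == I and matmul(K, J) == neg(I)
assert matmul(K, I) == J and matmul(I, K) == neg(J)
```

Associativity of quaternion multiplication (Exercise 2.5.3∗) is immediate in this model, since matrix multiplication is associative.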



Exercise 2.5.2. Prove that i, j, k are distinct.

In particular, we deduce from Exercise 2.5.1∗ that multiplication is not commutative in H! This is also why X² + 1 has at least 3 distinct roots when its degree is only 2: almost all the theory developed in Appendix A, and in particular Corollary A.1.1, fails when multiplication is not commutative.
Exercise 2.5.3∗ . Let α, β, γ ∈ H be quaternions. Prove that (αβ)γ = α(βγ). (We say multiplication is
associative. This is why we can write αβγ without ambiguity.)

Exercise 2.5.4. Prove that there are infinitely many square roots of −1 in H.

Definition 2.5.2 (Quaternion Conjugate)

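Let α = a + bi + cj + dk ∈ H. The conjugate of α, denoted ᾱ, is a − bi − cj − dk.

Exercise 2.5.5∗. Prove that, for any α, β ∈ H,

$$\overline{\alpha + \beta} = \overline{\alpha} + \overline{\beta} \quad\text{and}\quad \overline{\alpha\beta} = \overline{\beta}\,\overline{\alpha}$$

(the order of the factors is reversed because multiplication is not commutative anymore).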

Definition 2.5.3 (Quaternion Norm)

Let α = a + bi + cj + dk ∈ H. The norm of α, N(α), is

αᾱ = a² + b² + c² + d².

Exercise 2.5.6∗. Check that (a + bi + cj + dk)(a − bi − cj − dk) is indeed a² + b² + c² + d².

Exercise 2.5.7∗ . Prove that H is a skew field. This amounts to checking that elements have multiplicative
inverses (i.e. for any α there is a β such that αβ = βα = 1).

Exercise 2.5.8∗ . Prove that the norm is multiplicative: for any α, β ∈ H, N (αβ) = N (α)N (β).

Our object of study will be the ring of Hurwitz integers²

$$H = \mathbb{Z}\left[i, j, k, \frac{1+i+j+k}{2}\right] := \mathbb{Z} + i\mathbb{Z} + j\mathbb{Z} + k\mathbb{Z} + \frac{1+i+j+k}{2}\mathbb{Z}$$

as a subring of the skew field


Q(i, j, k) := Q + iQ + jQ + kQ.

Exercise 2.5.9∗. Prove that

$$H = \left\{ \frac{a+bi+cj+dk}{2} \;\middle|\; a, b, c, d \in \mathbb{Z},\ a \equiv b \equiv c \equiv d \pmod 2 \right\}.$$

Deduce that the elements of H have integral norms.

Exercise 2.5.10∗ . Determine the units of H.

Although multiplication is not commutative anymore, this does not mean we lose all the theory
built previously. We can still define divisibility, associates, Euclidean domains and Bézout domains.
We just need to incorporate "left" or "right" in the definition to indicate from which side we multiply.
The definitions for irreducible elements and units do not change as the first one did not use the
commutativity of multiplication while for the second one left and right units are the same since left
and right inverses are the same.
² One might wonder why we defined H as Z[i, j, k, (1+i+j+k)/2] instead of simply Z[i, j, k]. This is because the former is a maximal order while Z[i, j, k] isn’t. More concretely, we will see in Exercise 2.5.14 that Z[i, j, k, (1+i+j+k)/2] has a Euclidean division while Z[i, j, k] does not.

Definition 2.5.4 (Left and Right Divisibility)

Let R be a ring and α, β ∈ R. We say α left-divides β and write α ⌈ β if there exists a γ ∈ R such that β = αγ. Similarly, if there exists a γ ∈ R such that β = γα, we say α right-divides β and write α ⌉ β.

Remark 2.5.3
The notations ⌈ and ⌉ for divisibility are non-standard.

Exercise 2.5.11∗. Let α, β, γ ∈ H. Prove that α ⌈ β implies α ⌈ βγ but does not always imply α ⌈ γβ.

Definition 2.5.5 (Left and Right Associates)

Let R be a ring and α, β ∈ R. We say α is a left-associate (resp. right-associate) of β if there


exists a unit ε such that α = βε (resp. α = εβ).

Exercise 2.5.12∗ . Prove that being left-associate is an equivalence relation, i.e., for any α, β, γ, α is a left-
associate of itself, α is a left-associate of β if and only if β is a left-associate of α, and if α is a left-associate
of β and β is a left associate of γ then α is a left-associate of γ.

Definition 2.5.6 (Left and Right Euclidean Domains)

We say a domain R is left-Euclidean (resp. right-Euclidean) if there exists a function f : R → N such that for any α, β ∈ R with β ≠ 0 there exist ρ, τ ∈ R such that α = βρ + τ (resp. α = ρβ + τ) and f(τ) < f(β). Such a function f will be called a left-Euclidean (resp. right-Euclidean) function.

Definition 2.5.7 (Left and Right Bézout Domains)

We say a domain R is left-Bézout (resp. right-Bézout) if, for any α, β ∈ R, there exists a γ ∈ R
such that αR + βR = γR (resp. Rα + Rβ = Rγ). Such a γ will be called a left-gcd (resp.
right-gcd ) of α and β.

Left and right definitions are completely symmetric so we will focus primarily on left ones.
Exercise 2.5.13∗. Prove that a left-gcd γ of α and β satisfies the following property: γ ⌈ α, β and if δ ⌈ α, β then δ ⌈ γ.

Exercise 2.5.14. Prove that 1 + i and 1 − j do not have a left-gcd in Z[i, j, k]. In particular, it is not left-Bézout and thus not left-Euclidean either (and the same holds for being right-Bézout and right-Euclidean by symmetry).

As before, we have the following proposition.

Proposition 2.5.1*

A left-Euclidean (resp. right-Euclidean) domain R is a left-Bézout (resp. right-Bézout) domain.

Exercise 2.5.15∗ . Prove Proposition 2.5.1.

However, being left or right Euclidean does not guarantee unique factorisation anymore. That said, some of our results will still hold for rational primes which stay prime in H because rational numbers commute with every quaternion (we will paradoxically use this to show that they do not exist).

We first prove that H is norm-Euclidean.

Proposition 2.5.2*

H is both left and right norm-Euclidean.

Proof

Let α, β ∈ H, with β ≠ 0. Consider the quotient γ = β⁻¹α = a + bi + cj + dk. Choose x, y, z, t ∈ Z such that

|a − x|, |b − y|, |c − z|, |d − t| ≤ 1/2

and let δ = x + yi + zj + tk. Then,

$$N(\gamma - \delta) \leq \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 = 1$$

with equality if and only if |a − x| = |b − y| = |c − z| = |d − t| = 1/2. If the inequality is strict, then N(α − βδ) = N(β)N(γ − δ) < N(β), otherwise γ ∈ H so N(α − βγ) = 0 < N(β) as wanted.

The right-Euclidean proof is exactly the same with the order of factors reversed by symmetry.
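Here is a sketch of the left division described in this proof. Quaternions are stored as 4-tuples of coordinates, and the names `q_mul`, `q_conj`, `q_norm` and `left_divmod` are our own. The quotient is taken to be γ = β⁻¹α = β̄α/N(β), which is the quotient for which α − βδ = β(γ − δ), and the tie case of the proof is detected by testing N(γ − δ) = 1.

```python
from fractions import Fraction

def q_mul(x, y):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def q_conj(x):
    return (x[0], -x[1], -x[2], -x[3])

def q_norm(x):
    return sum(t * t for t in x)

def left_divmod(alpha, beta):
    """Return (delta, tau) with alpha = beta*delta + tau and N(tau) < N(beta)."""
    n = q_norm(beta)
    gamma = tuple(Fraction(t) / n for t in q_mul(q_conj(beta), alpha))
    delta = tuple(Fraction(round(t)) for t in gamma)   # nearest integer coordinates
    if q_norm(tuple(g - e for g, e in zip(gamma, delta))) == 1:
        # Every coordinate is a half-odd integer, so gamma is itself a Hurwitz integer.
        delta = gamma
    tau = tuple(a - b for a, b in zip(alpha, q_mul(beta, delta)))
    return delta, tau

alpha, beta = (7, 3, -2, 5), (2, 1, 0, 1)
delta, tau = left_divmod(alpha, beta)
assert q_norm(tau) < q_norm(beta)
```

In the tie case, for example α = 1 + i + j + k and β = 2, the quotient is the Hurwitz integer (1 + i + j + k)/2 and the remainder is zero.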


As a corollary, we get that H is left and right Bézout. Now, we show that any Hurwitz integer has
a factorisation in irreducible Hurwitz integers.

Proposition 2.5.3

Any Hurwitz integer has a factorisation in irreducible Hurwitz integers. (When it is a unit it is
the empty factorisation.)

Exercise 2.5.16∗ . Prove Proposition 2.5.3.

Exercise 2.5.17. Prove that there is an irreducible Hurwitz integer x ∈ H for which there exist α and β such that x ⌈ αβ but x left-divides neither α nor β.

We can now prove our main result: the Lagrange four square theorem.

Theorem 2.5.1 (Lagrange’s Four Square Theorem)

Any non-negative rational integer is a sum of four squares of rational integers.

Proof

This is equivalent to showing that any non-negative rational integer arises as the norm of an element of Z[i, j, k]. Consider the prime case first. We wish to find a non-trivial factorisation p = αβ in Hurwitz integers. Suppose for the sake of a contradiction that p is irreducible, which implies that it is odd as 2 = (1 + i)(1 − i).

Then, p is in fact also prime because p commutes with any quaternion since it’s real, which means left and right divisibility by p are the same. Indeed, suppose that p | αβ and p ∤ β. Since H is left-Bézout, there are some γ, δ ∈ H such that

pγ + βδ = 1.

Thus, β is right-invertible modulo p which means that

p | αβδ = α − pαγ

so p | α. The p ∤ α case is handled similarly. (This could be phrased more efficiently using modular arithmetic, but we did it that way to emphasise how the commutativity of p and H made this possible.)

However, by Exercise 2.5.18∗, there exist rational integers a and b such that

p | 1 + a² + b² = (1 + ai + bj)(1 − ai − bj)

but p ∤ 1 + ai + bj, 1 − ai − bj as it is odd. Thus it can’t be prime, and therefore it can’t be irreducible either.

This means that there exist non-unit Hurwitz integers α and β such that p = αβ. By taking the norm we get p² = N(α)N(β). Since neither of α, β is a unit, N(α), N(β) must both be equal to p as they are different from 1.

We are almost done: we have represented p as the norm of a Hurwitz integer α and we just want to have α ∈ Z[i, j, k]. Suppose that it is not the case. Consider the unit ε = (±1 ± i ± j ± k)/2 where the ± signs are chosen so that ρ := α − ε ∈ 2Z[i, j, k]. Then,

$$p = \alpha\overline{\alpha} = (\varepsilon + \rho)\,\overline{\varepsilon}\varepsilon\,(\overline{\varepsilon} + \overline{\rho}) = (1 + \rho\overline{\varepsilon})(1 + \varepsilon\overline{\rho}) =: \alpha'\overline{\alpha'}$$

where α′ now has rational integer coordinates since the coordinates of ρ are even so ρε̄ ∈ Z[i, j, k].
The general case follows from the multiplicativity of the norm: if p_i = N(α_i) and n = ∏_{i=1}^k p_i^{m_i}, then

$$n = N\left(\prod_{i=1}^{k} \alpha_i^{m_i}\right).$$
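The theorem is easy to test numerically. The sketch below does not follow the quaternion argument of the proof: it is a plain exhaustive search, with a function name `four_squares` of our own choosing, meant only as a sanity check on small values of n.

```python
from math import isqrt

def four_squares(n: int):
    """Return (a, b, c, d) with a^2 + b^2 + c^2 + d^2 == n and a <= b <= c, or None."""
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            for c in range(b, isqrt(n - a * a - b * b) + 1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                if d * d == rest:
                    return (a, b, c, d)
    return None

# Every non-negative integer below 500 is a sum of four squares.
for n in range(500):
    rep = four_squares(n)
    assert rep is not None and sum(t * t for t in rep) == n
```

Any representation can be reordered so that a ≤ b ≤ c ≤ d, which is why restricting the first three loops in this way loses nothing.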

Exercise 2.5.18∗. Let p be a rational prime. Prove that there exist rational integers a and b such that p | 1 + a² + b².

2.6 Exercises
Diophantine Equations
Exercise 2.6.1. Solve the equation x² + 4 = y³ over Z.
Exercise 2.6.2†. Prove that O_{Q(√2)} and O_{Q(√−2)} are Euclidean.

Exercise 2.6.3. Solve the equations x² + 2 = y³ and x² + 8 = y³ over Z.

Exercise 2.6.4†. Prove that O_{Q(√−7)} is Euclidean.

Exercise 2.6.5. Solve the equation x² + x + 2 = y³ over Z.

Exercise 2.6.6†. Solve the equation x² + 11 = y³ over Z.
Exercise 2.6.7. Let a, b, c ∈ Z be rational integers. Prove that a² + b² = c³ if and only if there exist rational integers m and n such that a = m³ − 3mn², b = −n³ + 3m²n and c = m² + n². More generally, if k ≥ 1 is an integer, find all the solutions a, b, c ∈ Z to the equation a² + b² = c^k.
Exercise 2.6.8†. Let n be a non-negative rational integer. In how many ways can n be written as a sum of two squares of rational integers? (Two ways are considered different if the ordering is different, for instance 2 = 1² + (−1)² and 2 = (−1)² + 1² are different.)

Exercise 2.6.9. Which rational integers can be written in the form a² + 2b² for some rational integers a and b? What about a² − 2b²? In how many ways? (You may assume that, for an odd rational prime p, there exists a rational integer x such that x² ≡ 2 (mod p) if and only if p ≡ ±1 (mod 8), and there exists a rational integer x such that x² ≡ −2 (mod p) if and only if p ≡ 1 (mod 8) or p ≡ 3 (mod 8). This will be proven in Chapter 4, as a corollary of the quadratic reciprocity law 4.5.2.)

Exercise 2.6.10 (Saint Petersburg Mathematical Olympiad 2013). Find all rational primes p and q such that 2p − 1, 2q − 1, and 2pq − 1 are all perfect squares.

Exercise 2.6.11† (Euler). Let n ≥ 3 be an integer. Prove that there exist unique positive odd rational integers x and y such that 2ⁿ = x² + 7y².

Exercise 2.6.12† (Fermat’s Last Theorem for n = 4). Show that the equations α⁴ + β⁴ = γ² and α⁴ − β⁴ = γ² have no non-zero solution α, β, γ ∈ Z[i].

Exercise 2.6.13 (Chinese Mathematical Olympiad 2006). Positive integers k, m, n satisfy mn = k² + k + 3. Prove that at least one of the equations

x² + 11y² = 4m

and

x² + 11y² = 4n

has a solution in odd rational integers.

Exercise 2.6.14†. Prove that O_{Q(√5)} is Euclidean.

Hurwitz Integers and Jacobi’s Four Square Theorem³


Exercise 2.6.15† . Let α ∈ H be a primitive Hurwitz integer, meaning that there does not exist a
α
non-zero m ∈ Z such that m ∈ H and let N (α) = p1 · . . . · pn be its prime factorisation. Then, the
factorisation of α = π1 · . . . · πn for irreducible elements πi of norm pi is unique up to unit-migration,
meaning that if if τ1 · . . . · τk is another such factorisation, then k = n and



 τ1 = π1 u1
τ2 = u−1
1 π2 u2



...
τn−1 = u−1

n−1 πn un



−1

τ
n = un πn .

for some units u1 , . . . , un . Deduce that α is irreducible if and only if its norm is a rational prime.

Exercise 2.6.16†. Prove that $(1 + i)H = H(1 + i)$.⁴ Set $\omega = \frac{1+i+j+k}{2}$. We say a Hurwitz integer
$\alpha \in H$ is primary if it is congruent to $1$ or $1 + 2\omega$ modulo $2 + 2i$.⁵ Prove that, for any Hurwitz integer
$\alpha$ of odd norm, exactly one of its right-associates is primary.

Exercise 2.6.17†. Let $m \in \mathbb{Z}$ be an odd integer. Prove that the Hurwitz integers modulo $m$, $H/mH$,
are isomorphic to the algebra of two by two matrices modulo $m$, $(\mathbb{Z}/m\mathbb{Z})^{2 \times 2}$. In addition, prove that
the determinant of the image is the norm of the quaternion.

Exercise 2.6.18†. Let $m$ be an odd integer. We say a Hurwitz integer $\alpha = a + bi + cj + dk$ is primitive
modulo $m$ if $\gcd(2a, 2b, 2c, 2d, m) = 1$. Compute the number $\psi(m)$ of primitive Hurwitz integers modulo
$m$ with norm zero (modulo $m$).
³ The following series of exercises comes from the work of Hurwitz, but our presentation follows the PhD thesis of
Nikolaos Tsopanidis, see [30].
⁴ This means that we can manipulate congruences modulo $1 + i$ normally. Note that the choice of $i$ is not arbitrary
at all, since $1 - i = -i(1 + i)$ and $1 - j = (1 - \omega)(1 + i)$ are associates. By $\alpha \equiv \beta \pmod \gamma$, we mean that $\gamma$ divides $\alpha - \beta$
from the left and from the right.
⁵ Note that a primary Hurwitz integer is always in $\mathbb{Z}[i, j, k]$.

Exercise 2.6.19†. Let $p$ be an odd prime. Prove that any non-zero $\alpha \in H/pH$ of zero norm modulo $p$
has a representative of the form $\rho\pi$, where $\pi$ is a primary element of norm $p$ and $\rho \in H$, and that this
$\pi$ is unique. Conversely, let $\pi \in H$ have norm $p$. Prove that the equation $\rho\pi \equiv 0 \pmod p$ has exactly
$p^2$ solutions $\rho \in H/pH$. Deduce that there are exactly $p + 1$ primary irreducible Hurwitz integers with
norm $p$.

Exercise 2.6.20† (Jacobi’s Four Square Theorem). Let $n$ be a positive rational integer. In how many
ways can $n$ be written as a sum of four squares of rational integers? (Two ways are considered different
if the ordering is different, for instance $2 = 1^2 + 0^2 + 0^2 + (-1)^2$ and $2 = (-1)^2 + 0^2 + 0^2 + 1^2$ are
different.)

Domains
Exercise 2.6.21. Prove that there are finitely many rational integers $d \equiv 2 \pmod 4$ or $d \equiv 3 \pmod 4$
such that $\mathbb{Q}(\sqrt{d})$ is norm-Euclidean.
Exercise 2.6.22. Let $R$ be an integral domain such that for any set $S \subseteq R$ there exists a $\beta \in R$ such
that
$$\sum_{\alpha \in S} \alpha R = \beta R.$$
Such a ring is called a principal ideal domain (PID). Prove that it is a UFD. (The sum $\sum_{\alpha \in S} \alpha R$ is
defined as the union of $\sum_{\alpha \in S'} \alpha R$ over all finite subsets $S' \subseteq S$.)

Exercise 2.6.23. Let R be a Euclidean domain. Prove that it is a PID (and thus a UFD as well).

Exercise 2.6.24. Let R = Z+XQ[X] be the ring of polynomials with rational coefficients and integral
constant coefficient. Prove that R is a Bézout domain but not a UFD, and hence not a PID either.

Miscellaneous
Exercise 2.6.25†. Let $(F_n)_{n \in \mathbb{Z}}$ be the Fibonacci sequence defined by $F_0 = 0$, $F_1 = 1$, and
$F_{n+2} = F_{n+1} + F_n$ for any integer $n$. Prove that, for any integers $m$ and $n$, $\gcd(F_m, F_n) = F_{\gcd(m,n)}$.
Exercise 2.6.26. Let $(L_n)_{n \in \mathbb{Z}}$ be the Lucas sequence defined by $L_0 = 2$, $L_1 = 1$, and
$L_{n+2} = L_{n+1} + L_n$ for any integer $n$. Given two integers $m$ and $n$, find a formula for $\gcd(L_m, L_n)$ analogous
to Exercise 2.6.25†.
Exercise 2.6.27†. Let $n$ be a rational integer. Prove that $(1 + \sqrt{2})^n$ is a unit of $\mathbb{Z}[\sqrt{2}]$. Moreover,
prove that any unit of $\mathbb{Z}[\sqrt{2}]$ has that form, up to sign.
Exercise 2.6.28† (IMO 2001). Let a > b > c > d be positive rational integers. Suppose that

ac + bd = (b + d + a − c)(b + d − a + c).

Prove that ab + cd is not prime.


Exercise 2.6.29†. Let $x \in \mathbb{R}$ be a non-zero real number and $m, n \ge 1$ coprime integers. Suppose that
$x^m + \frac{1}{x^m}$ and $x^n + \frac{1}{x^n}$ are both rational integers. Prove that $x + \frac{1}{x}$ is also one.

Exercise 2.6.30. Find all automorphisms of the quaternions H, i.e. additive and multiplicative
bijections ϕ : H → H.
Chapter 3

Cyclotomic Polynomials

Prerequisites for this chapter: Chapter 1.

Quadratic numbers and roots of unity are very important in algebraic number theory; they were
among the first objects studied in detail. We studied the former a little in Chapter 2; here we will
look at the minimal polynomials of the latter and their properties.

3.1 Definition
We say an $n$th root of unity $\omega$ is a primitive $n$th root if its order is $n$, i.e. $\omega^k \ne 1$ for $k = 1, 2, \ldots, n - 1$.
Note that, if $\omega = \exp\left(\frac{2ki\pi}{n}\right)$, $\omega$ is a primitive $n$th root if and only if $\gcd(k, n) = 1$.

We may now define cyclotomic polynomials. These are the polynomials whose roots are exactly the
primitive $n$th roots of unity for some $n$.

Definition 3.1.1 (Cyclotomic Polynomials)

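Let $n \ge 1$ be an integer. The $n$th cyclotomic polynomial, $\Phi_n$, is the polynomial of degree $\varphi(n)$
$$\prod_{\omega \text{ primitive } n\text{th root}} (X - \omega) = \prod_{\substack{1 \le k \le n \\ \gcd(k,n)=1}} \left(X - \exp\left(\frac{2ki\pi}{n}\right)\right).$$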

For instance, $\Phi_1 = X - 1$, $\Phi_2 = X - (-1)$ and $\Phi_4 = (X - i)(X + i) = X^2 + 1$. Below are the first
few cyclotomic polynomials.

• $\Phi_1 = X - 1$.

• $\Phi_2 = X + 1$.

• $\Phi_3 = X^2 + X + 1$.

• $\Phi_4 = X^2 + 1$.

• $\Phi_5 = X^4 + X^3 + X^2 + X + 1$.

• $\Phi_6 = X^2 - X + 1$.

There is one striking thing about these polynomials: they all have integer coefficients!¹ In fact, this
is true for any $n$, despite the fact that our definition involved complex numbers. This is a consequence
of the following fundamental proposition.
¹ Which makes sense, since we said they were the minimal polynomials of roots of unity.

Proposition 3.1.1*

Let $n \ge 1$ be an integer. Then,
$$X^n - 1 = \prod_{d \mid n} \Phi_d.$$

Remark 3.1.1
In general, unless otherwise specified, when we index something (e.g. a sum or a product)
by $d \mid n$ we mean that the indexing is done over the positive divisors of $n$.

Proof

This is a simple root counting exercise. We have to show that any nth root of unity is a primitive
dth root for exactly one d | n, which is clearly true as this d is the order of the root. Conversely
it is clear that any primitive dth root for some d | n is an nth root of unity.


Exercise 3.1.1∗. Let $\omega$ be an $n$th root of unity. Prove that its order divides $n$.

Exercise 3.1.2∗. Let $p$ be a rational prime. Prove that $\Phi_p = X^{p-1} + \ldots + 1$.

Exercise 3.1.3∗. Let $n \ge 1$ be an integer. Prove that $\Phi_n(0) = -1$ if $n = 1$ and $1$ otherwise.

Exercise 3.1.4. Let $n > 1$ be an integer. Prove that $\Phi_n(1) = p$ if $n$ is a power of a prime $p$, and $\Phi_n(1) = 1$
otherwise.

From this, we get the following

Corollary 3.1.1*

Cyclotomic polynomials have integer coefficients.

Exercise 3.1.5∗. Prove Corollary 3.1.1 by induction.
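
As a computational companion to Proposition 3.1.1 and Corollary 3.1.1, here is a minimal sketch that computes $\Phi_n$ by dividing $X^n - 1$ by $\Phi_d$ for every proper divisor $d$ of $n$. The names `divide_exact` and `cyclotomic`, and the representation of a polynomial as a sequence of integer coefficients (lowest degree first), are choices made for this illustration and are not notation from the text.

```python
from functools import lru_cache


def divide_exact(num, den):
    """Divide num by the monic polynomial den (coefficients lowest degree first).

    Raises AssertionError if the division leaves a non-zero remainder.
    """
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in reversed(range(len(quot))):
        quot[i] = num[i + len(den) - 1]  # den is monic, so no division is needed
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    assert not any(num), "division is not exact"
    return quot


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of Phi_n as a tuple, using X^n - 1 = prod_{d | n} Phi_d."""
    poly = [-1] + [0] * (n - 1) + [1]  # X^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = divide_exact(poly, cyclotomic(d))
    return tuple(poly)


print(cyclotomic(6))  # coefficients of X^2 - X + 1
```

Only integer additions and multiplications occur, because every divisor is monic: this is the mechanism behind the induction of Exercise 3.1.5∗.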

By looking at the degrees of both sides of Proposition 3.1.1, we also get the

Corollary 3.1.2

For any integer $n \ge 1$, we have
$$\sum_{d \mid n} \varphi(d) = n.$$
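
Corollary 3.1.2 is easy to test numerically. The sketch below uses a deliberately naive totient (counting the residues coprime to $n$); the function names `phi` and `divisor_phi_sum` are illustrative only.

```python
from math import gcd


def phi(n):
    """Euler's totient: the number of 1 <= k <= n with gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def divisor_phi_sum(n):
    """Sum of phi(d) over the positive divisors d of n."""
    return sum(phi(d) for d in range(1, n + 1) if n % d == 0)


# The corollary predicts that the sum equals n itself.
print([divisor_phi_sum(n) for n in range(1, 11)])
```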

Let us examine more closely why Proposition 3.1.1 is amazing. It gives us a very good factorisation
of $X^n - 1$, so much better than $(X - 1)(X^{n-1} + \ldots + 1)$. We can also get a factorisation for $a^n - b^n$
by rewriting it as $b^n((a/b)^n - 1)$. Indeed, define the two-variable homogeneous polynomial
$\Phi_n(a, b) := b^{\varphi(n)}\Phi_n(a/b)$. Then,
$$a^n - b^n = \prod_{d \mid n} \Phi_d(a, b).$$

Exercise 3.1.6∗. Prove that $\Phi_n(1/X) = \Phi_n(X)/X^{\varphi(n)}$ for $n > 1$.



Exercise 3.1.7∗. Prove that, for $n > 1$, $\Phi_n(X, Y)$ is a two-variable symmetric and homogeneous, i.e. where
all monomials have the same degree, polynomial with integer coefficients.

Exercise 3.1.8∗. Prove that
$$\Phi_n(X, Y) = \prod_{\omega \text{ primitive } n\text{th root}} (X - \omega Y).$$

We can already use this on a problem.

Problem 3.1.1
Let $n \ge 0$ be an integer. Prove that the number $2^{2^{n+1}} + 2^{2^n} + 1$ has at least $n + 1$ prime factors
counted with multiplicity.

Solution
Let $x = 2^{2^n}$. The number $2^{2^{n+1}} + 2^{2^n} + 1$ then becomes
$$x^2 + x + 1 = \frac{x^3 - 1}{x - 1} = \frac{2^{3 \cdot 2^n} - 1}{2^{2^n} - 1}.$$
We factorise the numerator and the denominator using Proposition 3.1.1:
$$x^2 + x + 1 = \frac{\prod_{d \mid 3 \cdot 2^n} \Phi_d(2)}{\prod_{d \mid 2^n} \Phi_d(2)} = \prod_{d \mid 3 \cdot 2^n,\ d \nmid 2^n} \Phi_d(2).$$

Notice that the divisors of $3 \cdot 2^n$ that do not divide $2^n$ are precisely the divisors of the form $3d$
where $d \mid 2^n$, i.e. of the form $3 \cdot 2^k$ for some $0 \le k \le n$. Thus,
$$2^{2^{n+1}} + 2^{2^n} + 1 = \prod_{k=0}^{n} \Phi_{3 \cdot 2^k}(2).$$

We have found our $n + 1$ divisors! It remains, however, to check that they are non-trivial, i.e.
greater than $1$. For this, we return to the definition of cyclotomic polynomials:
$$|\Phi_n(2)| = \prod_{\omega \text{ primitive } n\text{th root}} |2 - \omega| \ge \prod_{\omega \text{ primitive } n\text{th root}} 1 = 1$$
since $|2 - \omega| \ge 2 - |\omega| = 1$ for any $|\omega| = 1$ by the triangle inequality. In addition, this inequality
is strict if $n \ne 1$. ∎
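
The factorisation found in this solution can be checked for small $n$. The sketch below evaluates $\Phi_m(a)$ at an integer $a \ge 2$ directly from Proposition 3.1.1, dividing $a^m - 1$ by the values $\Phi_d(a)$ for the proper divisors $d$ of $m$; the helper names `phi_value` and `check_problem` are assumptions made for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def phi_value(m, a):
    """Phi_m(a) for an integer a >= 2, via a^m - 1 = prod_{d | m} Phi_d(a)."""
    value = a**m - 1
    for d in range(1, m):
        if m % d == 0:
            value //= phi_value(d, a)  # exact: Phi_d(a) divides what is left
    return value


def check_problem(n):
    """Compare 2^(2^(n+1)) + 2^(2^n) + 1 with the product of Phi_{3*2^k}(2)."""
    factors = [phi_value(3 * 2**k, 2) for k in range(n + 1)]
    product = 1
    for f in factors:
        product *= f
    number = 2 ** (2 ** (n + 1)) + 2 ** (2**n) + 1
    return product == number and all(f > 1 for f in factors)


print([phi_value(3 * 2**k, 2) for k in range(4)])  # the first few factors
```

The restriction to $a \ge 2$ keeps every $\Phi_d(a)$ a positive integer, so the successive integer divisions are exact.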

Finally, we give a formula to compute cyclotomic polynomials, a lot more efficient than just using
Proposition 3.1.1.

Proposition 3.1.2*

Let $p$ be a prime number and $n \ge 1$ an integer. If $p \mid n$ then $\Phi_{pn}(X) = \Phi_n(X^p)$, otherwise
$$\Phi_{pn}(X) = \frac{\Phi_n(X^p)}{\Phi_n(X)}.$$

Proof

This is again a simple root counting exercise. Note first that both sides have the same degree: if
$p \mid n$ the RHS has degree $p\varphi(n) = \varphi(pn)$ and if $p \nmid n$ the RHS has degree
$$p\varphi(n) - \varphi(n) = (p - 1)\varphi(n) = \varphi(pn).$$

Note also that the quotient makes sense. Indeed, for any primitive $n$th root $\omega$, $\omega^p$ is also a
primitive $n$th root of unity iff $\gcd(n, p) = 1$, i.e. $p \nmid n$. Thus, it suffices to show that each root of
the LHS is a root of the RHS.

This is very easy: let $\omega$ be a primitive $pn$th root of unity. Then, $\omega^p$ is a primitive $n$th root as
wanted (and $\omega$ isn’t so the denominator is non-zero).
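
Proposition 3.1.2 translates into a short procedure: start from $\Phi_1 = X - 1$ and multiply the index by one prime at a time. The following is a sketch under the same illustrative conventions as before (coefficient lists, lowest degree first); `substitute_power`, `divide_exact` and `cyclotomic_fast` are hypothetical helper names.

```python
def substitute_power(poly, p):
    """Return the coefficients of poly(X^p)."""
    out = [0] * ((len(poly) - 1) * p + 1)
    for i, c in enumerate(poly):
        out[i * p] = c
    return out


def divide_exact(num, den):
    """Exact division by a monic polynomial (coefficients lowest degree first)."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in reversed(range(len(quot))):
        quot[i] = num[i + len(den) - 1]
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    assert not any(num), "division is not exact"
    return quot


def cyclotomic_fast(n):
    """Phi_n built one prime factor at a time, following Proposition 3.1.2."""
    poly, m = [-1, 1], 1  # Phi_1 = X - 1
    p = 2
    while n > 1:
        while n % p == 0:
            lifted = substitute_power(poly, p)  # Phi_m(X^p)
            # If p | m we are done; otherwise divide by Phi_m(X).
            poly = lifted if m % p == 0 else divide_exact(lifted, poly)
            m *= p
            n //= p
        p += 1
    return poly


print(cyclotomic_fast(12))  # coefficients of X^4 - X^2 + 1
```

Each prime factor costs at most one polynomial division, instead of one division per proper divisor of $n$.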


As a corollary, we get

Corollary 3.1.3

Let $n > 1$ be an odd integer. Then, $\Phi_{2n}(X) = \Phi_n(-X)$.

Exercise 3.1.9∗. Prove that, for odd $n > 1$, $\Phi_n(X)\Phi_n(-X) = \Phi_n(X^2)$ and deduce Corollary 3.1.3.

Exercise 3.1.10. Prove that, for any polynomial $f$, $f(X)f(-X)$ is a polynomial in $X^2$.

Exercise 3.1.11∗. Let $p$ be a prime number and $n \ge 1$ an integer. Prove that if $p \mid n$ then
$\Phi_{pn}(X, Y) = \Phi_n(X^p, Y^p)$, and that $\Phi_{pn}(X, Y) = \frac{\Phi_n(X^p, Y^p)}{\Phi_n(X, Y)}$ otherwise.

Exercise 3.1.12∗. Let $k \ge 1$ be an integer. Prove that $\Phi_{2^k} = X^{2^{k-1}} + 1$.

3.2 Irreducibility
In fact, the factorisation we got for $X^n - 1$ is not only very good, it is the best possible: cyclotomic
polynomials are irreducible! In algebraic-number-theoretic terminology, the conjugates of a primitive
$n$th root of unity are all primitive $n$th roots of unity. It is a notoriously hard problem to prove certain
polynomials are irreducible, so such a result is remarkable.

Theorem 3.2.1

For any integer n ≥ 1, Φn is irreducible in Q[X].

We present a proof using algebraic number theory, and leave another one as an exercise.

Proof

Let $\omega$ be a primitive $n$th root of unity with minimal polynomial $\pi$. We will show that, for any
rational prime $p \nmid n$, $\omega^p$ is also a root of $\pi$. Thus, $\omega^k$ will also be a root of $\pi$ for any $\gcd(n, k) = 1$.
Since all primitive $n$th roots have this form by Exercise 3.2.1∗, we have $\pi = \Phi_n$ as wanted. The
key point for this is the congruence $\pi(\omega^p) \equiv \pi(\omega)^p \equiv 0 \pmod p$, given by Exercise 3.2.3∗.

Let $p \nmid n$ be a rational prime. Suppose for the sake of a contradiction that $\pi(\omega^p) \ne 0$. Then, $\pi$
divides (in $\overline{\mathbb{Z}}[X]$, as $\pi$ is monic)
$$\frac{X^n - 1}{X - \omega^p} = \prod_{k \ne p} (X - \omega^k).$$
Thus, $\pi(\omega^p)$ divides
$$\prod_{i \ne j} (\omega^i - \omega^j) = \prod_{i=0}^{n-1} \prod_{j \ne i} (\omega^i - \omega^j).$$

By Exercise 3.2.2∗, for a fixed $i$, $\prod_{j \ne i} (\omega^i - \omega^j)$ is the derivative of $\prod_{j=0}^{n-1} (X - \omega^j) = X^n - 1$ evaluated
at $\omega^i$, i.e. $n(\omega^i)^{n-1}$. Thus, our double product is
$$\prod_{i=0}^{n-1} n(\omega^i)^{n-1} = \pm n^n$$
since $\prod_{i=0}^{n-1} \omega^i = (-1)^{n-1}$ by Vieta’s formulas A.1.4.

Finally, since $p \mid \pi(\omega^p)$, we also have $p \mid n^n$: this is a contradiction since we assumed $p \nmid n$.




Exercise 3.2.1∗. Let $n \ge 1$ be an integer and $\omega$ be a primitive $n$th root of unity. Prove that any primitive
$n$th root can be written in the form $\omega^k$ for some $\gcd(k, n) = 1$.

Exercise 3.2.2∗. Let $f = \prod_{i=1}^{n} (X - \alpha_i)$ be a polynomial. Prove that, for any $k = 1, \ldots, n$,
$f'(\alpha_k) = \prod_{i \ne k} (\alpha_k - \alpha_i)$.

Exercise 3.2.3∗ (Frobenius Morphism). Prove the following special case of Proposition 4.1.1: for any rational
prime $p$ and any polynomial $f \in \mathbb{Z}[X]$, $f(X^p) \equiv f(X)^p \pmod p$.
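
The congruence of Exercise 3.2.3∗ can be observed on examples before proving it. The sketch below compares the coefficients of $f(X^p)$ and $f(X)^p$ modulo $p$; the functions `poly_mul_mod` and `frobenius_holds` are illustrative names, and polynomials are again coefficient lists with the lowest degree first.

```python
def poly_mul_mod(f, g, p):
    """Product of two coefficient lists, reduced modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def frobenius_holds(f, p):
    """Check whether f(X^p) and f(X)^p have the same coefficients modulo p."""
    power = [1]
    for _ in range(p):
        power = poly_mul_mod(power, f, p)  # after the loop: f^p mod p
    lifted = [0] * ((len(f) - 1) * p + 1)
    for i, c in enumerate(f):
        lifted[i * p] = c % p  # coefficients of f(X^p) mod p
    return lifted == power


print(frobenius_holds([3, -1, 0, 2], 5))  # f = 2X^3 - X + 3, p = 5
```

Running the same check with a composite modulus such as $4$ and $f = X + 1$ shows that primality matters: the middle binomial coefficient $\binom{4}{2} = 6$ does not vanish modulo $4$.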

Exercise 3.2.4 (Alternative Proof of Theorem 3.2.1). Let $\omega$ be a primitive $n$th root of unity with minimal
polynomial $\pi$ and let $p \nmid n$ be a rational prime. Suppose $\tau$ is the minimal polynomial of $\pi(\omega^p)$. Prove that
$p \mid \tau(0)$ and that $\tau(0)$ is bounded when $p$ varies. Deduce that $\omega^p$ is a root of $\pi$ for sufficiently large $p$, and thus
that $\omega^k$ is a root of $\pi$ for any $\gcd(n, k) = 1$.

An interesting corollary of this theorem is that we can know the conjugates of $\cos\left(\frac{2k\pi}{n}\right)$ for
any $\gcd(k, n) = 1$: they are precisely the numbers $\cos\left(\frac{2k'\pi}{n}\right)$ for $\gcd(k', n) = 1$.

However, unlike the primitive $n$th roots of unity which have degree $\varphi(n)$, they have degree $1$ for
$n = 1, 2$ and degree $\frac{\varphi(n)}{2}$ for $n \ge 3$ as $\cos\left(\frac{2k\pi}{n}\right) = \cos\left(\frac{-2k\pi}{n}\right)$.

In particular, this gives an alternative proof of Problem 1.1.1: $\cos\left(\frac{2k\pi}{n}\right)$ is rational iff $\frac{\varphi(n)}{2} = 1$ or
$n = 1, 2$, i.e. $n = 1, 2, 3, 4, 6$ and we can easily check that $\cos\left(\frac{2k\pi}{n}\right) = 0, \pm 1, \pm\frac{1}{2}$ for these $n$.

Exercise 3.2.5. Let $k$ and $n \ge 1$ be coprime integers. Prove that the conjugates of $\cos\left(\frac{2k\pi}{n}\right)$
are the numbers $\cos\left(\frac{2k'\pi}{n}\right)$ for $\gcd(k', n) = 1$. What is its degree? What about $\sin\left(\frac{2k\pi}{n}\right)$, what are its
conjugates and what is its degree?

Exercise 3.2.6. Find all quadratic cosines.

3.3 Orders
We will now see very important arithmetic properties of cyclotomic polynomials. This is the
fundamental result.

Theorem 3.3.1

Let $p$ be a rational prime and $a$ a rational integer. Then, $p$ divides $\Phi_n(a)$ if and only if the order
of $a$ modulo $p$ is $\frac{n}{p^{v_p(n)}}$.

For instance, it is easy to see that $p \mid a - 1$ means $a$ has order $1$ mod $p$, $p \mid a + 1$ means $a$ has order
$2$ mod $p$ unless $p = 2$, and $p \mid a^2 + 1$ means $a$ has order $4$ mod $p$ unless $p = 2$.

In fact, this theorem is perhaps not so surprising if one recalls the local-global principle from
Proposition 1.3.1. Over the complex numbers C, Φn (a) is zero if and only if a has order n (by
definition). Thus, one can expect that the same holds over the integers mod p, which is exactly what
this theorem says.

Proof

We do the case where $p \nmid n$ first. The general case will follow from Exercise 3.3.1∗ by induction
on $v_p(n)$.

Note that the statement makes sense as $p \mid \Phi_n(a)$ implies $p \mid a^n - 1$ so $p \nmid a$. Thus, suppose $p \nmid a$.
Let $k$ be the order of $a$ modulo $p$. Since
$$0 \equiv a^k - 1 = \prod_{d \mid k} \Phi_d(a),$$
there must exist an $n$ with $p \nmid n$ such that $\Phi_n(a) \equiv 0$.

We show that this $n$ is unique. Suppose that some $m \ne n$ with $p \nmid m$ satisfies $\Phi_m(a) \equiv 0$ too. Then,
$$X^{mn} - 1 = \prod_{d \mid mn} \Phi_d$$
has a double root at $a$ modulo $p$. Thus, by Proposition A.1.3, the derivative $mnX^{mn-1}$ is zero at $a$: this
is impossible as $p \nmid a, m, n$.

Finally, notice that such an $n$ must be the order of $a$ modulo $p$. By construction, $n$ divides the
order $k$ of $a$. If it was distinct from it, then
$$\frac{a^k - 1}{a^n - 1} = \prod_{d \mid k,\ d \nmid n} \Phi_d(a)$$
would be zero modulo $p$, thus there would be some $m \ne n$ with $p \nmid m$ such that $\Phi_m(a) \equiv 0$, which is impossible.
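
Theorem 3.3.1 lends itself to a brute-force check on small values. In the sketch below, `phi_value` evaluates $\Phi_n(a)$ for an integer $a \ge 2$ through Proposition 3.1.1, `order_mod` computes a multiplicative order by repeated multiplication, and `theorem_holds` compares the two sides of the equivalence; all three names are assumptions made for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def phi_value(n, a):
    """Phi_n(a) for an integer a >= 2, via a^n - 1 = prod_{d | n} Phi_d(a)."""
    value = a**n - 1
    for d in range(1, n):
        if n % d == 0:
            value //= phi_value(d, a)
    return value


def order_mod(a, p):
    """Multiplicative order of a modulo the prime p (requires p not dividing a)."""
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k


def p_free_part(n, p):
    """n divided by the largest power of p dividing it."""
    while n % p == 0:
        n //= p
    return n


def theorem_holds(n, a, p):
    """Does 'p | Phi_n(a)' agree with 'the order of a mod p is n / p^{v_p(n)}'?"""
    divides = phi_value(n, a) % p == 0
    predicted = a % p != 0 and order_mod(a, p) == p_free_part(n, p)
    return divides == predicted


print(theorem_holds(6, 2, 3))  # Phi_6(2) = 3, and 2 has order 2 = 6/3 modulo 3
```

When $p \mid a$ the order is undefined, and indeed $\Phi_n(a) \equiv \Phi_n(0) = \pm 1 \pmod p$, so both sides of the equivalence are false.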


Exercise 3.3.1∗. Let $p$ be a rational prime and $a$ a rational integer. Prove that, for any $n \ge 1$, $p \mid \Phi_n(a)$ if
and only if $p \mid \Phi_{pn}(a)$.

Exercise 3.3.2∗. Let $p$ be a rational prime. Prove that there always exists a primitive root or generator
modulo $p$, i.e. an integer $g$ whose powers $g^k$ represent all integers $m$ with $p \nmid m$ modulo $p$.

Exercise 3.3.3∗. Let $p$ be a rational prime and $a, b$ two rational integers. Prove that $p \mid \Phi_n(a, b)$ if and only
if $p \mid a, b$ or $\frac{n}{p^{v_p(n)}}$ is the order of $ab^{-1}$ modulo $p$.

From this we get the following very important corollary.



Corollary 3.3.1*

Let $p$ be a rational prime and $a$ a rational integer. Suppose that $p \mid \Phi_n(a)$. Then, $p \equiv 1 \pmod n$
or $p$ is the greatest prime factor of $n$.

Proof

If $p \nmid n$, then $n$ is the order of $a$ modulo $p$ by Theorem 3.3.1 so $n \mid p - 1$. Otherwise, $\frac{n}{p^{v_p(n)}} \mid p - 1$
so all prime factors of the former are smaller than $p$. But the prime factors of $\frac{n}{p^{v_p(n)}}$ are exactly
the prime factors of $n$ distinct from $p$!
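
Corollary 3.3.1 can be observed by factoring a few values $\Phi_n(a)$. This is only a sketch: `prime_factors` uses plain trial division, so it is meant for small inputs, and all function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def phi_value(n, a):
    """Phi_n(a) for an integer a >= 2, via a^n - 1 = prod_{d | n} Phi_d(a)."""
    value = a**n - 1
    for d in range(1, n):
        if n % d == 0:
            value //= phi_value(d, a)
    return value


def prime_factors(m):
    """Set of prime factors of a positive integer m, by trial division."""
    factors, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    return factors


def corollary_holds(n, a):
    """Every prime factor of Phi_n(a) is 1 mod n or the greatest prime factor of n."""
    largest = max(prime_factors(n), default=None)
    return all((p - 1) % n == 0 or p == largest
               for p in prime_factors(phi_value(n, a)))


print(sorted(prime_factors(phi_value(10, 4))))  # Phi_10(4) = 205 = 5 * 41
```

In the printed example, $5$ is the greatest prime factor of $10$ and $41 \equiv 1 \pmod{10}$, as the corollary predicts.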


Exercise 3.3.4∗. Let $p$ be a rational prime and $a$ an integer of order $n$ modulo $p$. Prove that $a^k \equiv 1 \pmod p$
if and only if $n \mid k$. Deduce that $n$ divides $p - 1$.²

Exercise 3.3.5∗. Let $p$ be a rational prime and $a, b$ two rational integers. Suppose that $p \mid \Phi_n(a, b)$. Prove
that $p \mid a, b$, $p \equiv 1 \pmod n$ or $p$ is the greatest prime factor of $n$.

Exercise 3.3.6∗. Let $p$ be a rational prime and $a$ an integer. Suppose $p \mid \Phi_n(a), \Phi_m(a)$ and $n \ne m$. Prove
that $\frac{m}{n}$ is a power of $p$.

Exercise 3.3.7. Prove the following strengthening of Problem 3.1.1: for any integer $n \ge 0$, the number
$2^{2^{n+1}} + 2^{2^n} + 1$ has at least $n + 1$ distinct prime factors.

We also get the following result. It is a special case of the celebrated theorem of Dirichlet on
arithmetic progressions which asserts that, for any $\gcd(m, n) = 1$, there are infinitely many rational
primes $p \equiv m \pmod n$. The proof of Dirichlet's theorem is significantly more involved.

Corollary 3.3.2*

For any integer n ≥ 1, there are infinitely many rational primes p ≡ 1 (mod n).

Exercise 3.3.8∗ . Let n ≥ 1 be an integer. Prove that there exist infinitely many rational primes p ≡ 1
(mod n).

Here is an example of a problem that follows from the first corollary.

Problem 3.3.1 (ISL 2006 N5)

Prove that there do not exist integers $x \ne 1$ and $y$ such that
$$\frac{x^7 - 1}{x - 1} = y^5 - 1.$$

² This is the mod $p$ version of Exercise 3.1.1∗. In fact the proof should be the same as it works in any group (see
Section A.2 and Theorem 6.3.2).

Solution

Suppose for the sake of a contradiction that $(x, y)$ is a solution. We rewrite the equation using
cyclotomic polynomials:
$$\Phi_7(x) = \Phi_1(y)\Phi_5(y).$$
By Corollary 3.3.1, a prime factor $p$ of the LHS is either $7$ or $1$ mod $7$. Suppose that $7 \nmid \Phi_7(x)$.
Then, we must have $\Phi_1(y), \Phi_5(y) \equiv 1 \pmod 7$ by the previous remark. Thus, from $\Phi_1(y) \equiv 1 \pmod 7$
we get $y \equiv 2 \pmod 7$. This means that
$$\Phi_5(y) \equiv \frac{2^5 - 1}{2 - 1} \equiv 3 \pmod 7,$$
a contradiction.

Hence, we must have $7 \mid \Phi_7(x)$. Since $7$ is distinct from $5$ and not congruent to $1$ mod $5$, it can’t
divide $\Phi_5(y)$ which means it must divide $\Phi_1(y)$. Thus, $y \equiv 1 \pmod 7$. This implies that
$$\Phi_5(y) \equiv 1 + 1 + 1 + 1 + 1 \equiv 5 \pmod 7$$
which is again a contradiction, since $\Phi_5(y)$ is a positive divisor of $\Phi_7(x)$ not divisible by $7$ and should
therefore be $1$ mod $7$. ∎
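
A finite search proves nothing, but it is a reassuring sanity check for Problem 3.3.1. The sketch below looks for integers $x \ne 1$ in a bounded range such that $\frac{x^7 - 1}{x - 1} + 1 = 1 + x + \ldots + x^6 + 1$ is a perfect fifth power; the helper names and the search bound are arbitrary choices for this illustration.

```python
def is_fifth_power(v):
    """True if the positive integer v equals y^5 for some integer y."""
    y = round(v ** 0.2)
    # Guard against floating-point rounding by testing the neighbours too.
    return any((y + d) ** 5 == v for d in (-1, 0, 1))


def solutions(bound):
    """All x with |x| <= bound and x != 1 such that Phi_7(x) + 1 is a fifth power."""
    found = []
    for x in range(-bound, bound + 1):
        if x == 1:
            continue
        value = sum(x**i for i in range(7)) + 1  # Phi_7(x) + 1, always positive
        if is_fifth_power(value):
            found.append(x)
    return found


print(solutions(300))  # the problem predicts an empty list
```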

3.4 Zsigmondy’s Theorem


In this section, we formulate and prove the powerful Zsigmondy theorem.

Definition 3.4.1

Let $(u_n)_{n \ge 1}$ be a sequence of rational integers. We say a prime $p$ is a primitive prime factor of
$u_n$ if $p \mid u_n$ but $p \nmid u_1, \ldots, u_{n-1}$.

In other words, a primitive prime factor is a new prime factor.

Theorem 3.4.1 (Zsigmondy)

Let $a > b$ be coprime positive integers. The term $a^n - b^n$ of the sequence $(a^n - b^n)_{n \ge 1}$ always has a rational
primitive prime factor for $n \ge 2$ except in the following cases: $n = 2$ and $a + b$ is $\pm$ a power of
$2$, and $n = 6$ and $(a, b) = (2, 1)$.

Exercise 3.4.1∗. Check that the exceptions stated in Theorem 3.4.1 are indeed exceptions.

Exercise 3.4.2∗. Prove that $a^2 - b^2$ has no primitive prime factor if and only if $a + b$ is $\pm$ a power of $2$.
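
Before the proof, it can help to see Theorem 3.4.1 at work. The following sketch lists, for small coprime $a > b \ge 1$ and small $n \ge 2$, the triples $(a, b, n)$ for which $a^n - b^n$ has no primitive prime factor. The function names and the search bounds are illustrative assumptions, and the factorisation is naive trial division.

```python
from math import gcd


def prime_factors(m):
    """Set of prime factors of a positive integer m, by trial division."""
    factors, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    return factors


def has_primitive_prime_factor(a, b, n):
    """Does a^n - b^n have a prime factor dividing no a^k - b^k with k < n?"""
    earlier = 1
    for k in range(1, n):
        earlier *= a**k - b**k
    return any(earlier % p != 0 for p in prime_factors(a**n - b**n))


def exceptions(max_a, max_n):
    """Triples (a, b, n) in the search range without a primitive prime factor."""
    return [(a, b, n)
            for a in range(2, max_a + 1)
            for b in range(1, a) if gcd(a, b) == 1
            for n in range(2, max_n + 1)
            if not has_primitive_prime_factor(a, b, n)]


print(exceptions(7, 8))
```

In this range the output consists of $(2, 1, 6)$ together with the triples $(a, b, 2)$ for which $a + b$ is a power of $2$, in agreement with the statement of the theorem.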
Here is how we will prove this theorem. The numbers $a^n - b^n$ have many prime factors in common.
However, we have seen that the numbers $\Phi_n(a, b)$ have strong restrictions on their prime factors, and
thus don’t have many common prime factors (see e.g. Exercise 3.3.6∗). Notice now that, since
$$a^n - b^n = \prod_{d \mid n} \Phi_d(a, b),$$
finding a primitive prime factor of $a^n - b^n$ reduces to finding a primitive prime factor of $\Phi_n(a, b)$!
Before delving into the proof, we need a lemma called the "lifting the exponent lemma" or "LTE".

Theorem 3.4.2 (Lifting the Exponent Lemma)

Let $p \mid n$ be an odd rational prime, where $n \ge 1$ is an integer. Then, for any rational integers $a$ and $b$
not both divisible by $p$, $v_p(\Phi_n(a, b)) \le 1$. Moreover, for $p = 2$, if $4 \mid n$ then $v_p(\Phi_n(a, b)) \le 1$.

Proof

Notice that
$$\Phi_n(a, b) \mid \frac{a^n - b^n}{a^{n/p} - b^{n/p}}.$$
If $a^{n/p} \not\equiv b^{n/p} \pmod p$ then $a^n \not\equiv b^n \pmod p$ so $v_p(\Phi_n(a, b)) = 0$.

Thus, it suffices to show that for any distinct rational integers $u \equiv v \not\equiv 0 \pmod p$, $p^2 \nmid \frac{u^p - v^p}{u - v}$. Write
$v = u + mp$. Then,
$$\frac{u^p - v^p}{u - v} = \sum_{k=0}^{p-1} u^{p-1-k} v^k = \sum_{k=0}^{p-1} u^{p-1-k} (u + mp)^k \equiv \sum_{k=0}^{p-1} u^{p-1-k} (u^k + kmpu^{k-1}) \pmod{p^2}$$
by the binomial expansion. However, this sum is just
$$\sum_{k=0}^{p-1} \left(u^{p-1} + pmu^{p-2}k\right) = pu^{p-1} + pmu^{p-2} \cdot \frac{p(p-1)}{2} \equiv pu^{p-1} \pmod{p^2}$$
which is indeed non-zero modulo $p^2$ for odd $p$, as $p \nmid u$.

For $p = 2$, we have $v_2\left(\frac{a^n - b^n}{a^{n/2} - b^{n/2}}\right) = v_2(a^{n/2} + b^{n/2}) \le 1$ when $2 \mid n/2$, i.e. $4 \mid n$.


Some might be more familiar with this version of the lemma:

Theorem 3.4.3 (Lifting the Exponent Lemma)

Let $p$ be an odd rational prime and $u \equiv v \not\equiv 0 \pmod p$. Then,
$$v_p(u^n - v^n) = v_p(u - v) + v_p(n).$$
Moreover, for $p = 2$, we have $v_2(u^n - v^n) = v_2(u - v)$ for odd $n$ and
$v_2(u^n - v^n) = v_2(u^2 - v^2) + v_2(n) - 1$ for even $n$.

Proof

Rewrite this as
$$v_p\left(\frac{u^n - v^n}{u - v}\right) = v_p(n).$$
Then, this follows from the following equality:
$$\frac{u^n - v^n}{u - v} = \prod_{d \mid n,\ d > 1} \Phi_d(u, v).$$

By Exercise 3.3.3∗, $p \mid \Phi_d(u, v)$ if and only if $\frac{d}{p^{v_p(d)}}$ is the order of $u \cdot v^{-1}$. Since $u \equiv v$, the order
of $u \cdot v^{-1}$ is just $1$. Thus, $p \mid \Phi_d(u, v)$ if and only if $d$ is a power of $p$. By our version of the
lemma 3.4.2, each such factor adds $1$ to the $p$-adic valuation since a power of $p$ distinct from $1$ is
divisible by $p$. Finally, the $p$-adic valuation of $\frac{u^n - v^n}{u - v}$ is just the number of powers of $p$ distinct
from $1$ dividing $n$: i.e. $v_p(n)$. For $p = 2$, we also need to take into account the contribution of $\Phi_2$
so we get $v_2(u^n - v^n) = v_2(n) - 1 + v_2(u^2 - v^2)$ for $2 \mid n$ (and the case $2 \nmid n$ is the same as for $p$
odd).
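
Both formulas of Theorem 3.4.3 are easy to test on small numbers. In this sketch, `vp` is a straightforward $p$-adic valuation and `lte_odd`, `lte_two` compare the two sides of the lemma for odd $p$ and for $p = 2$ respectively; the names are illustrative, and the inputs are assumed to satisfy $u > v > 0$ with $u \equiv v \not\equiv 0 \pmod p$.

```python
def vp(m, p):
    """p-adic valuation of a non-zero integer m."""
    k = 0
    while m % p == 0:
        m //= p
        k += 1
    return k


def lte_odd(u, v, n, p):
    """Check v_p(u^n - v^n) = v_p(u - v) + v_p(n) for an odd prime p."""
    return vp(u**n - v**n, p) == vp(u - v, p) + vp(n, p)


def lte_two(u, v, n):
    """Check the p = 2 version of the lemma for odd u and v."""
    if n % 2 == 1:
        return vp(u**n - v**n, 2) == vp(u - v, 2)
    return vp(u**n - v**n, 2) == vp(u * u - v * v, 2) + vp(n, 2) - 1


print(vp(3**4 - 1, 2), vp(3 * 3 - 1, 2) + vp(4, 2) - 1)  # both sides for u=3, v=1, n=4
```

The example with $u = 3$, $v = 1$, $n = 2$ shows why the case $p = 2$ is stated separately: $v_2(3^2 - 1) = 3$, whereas $v_2(3 - 1) + v_2(2) = 2$.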


We can now start proving Zsigmondy’s theorem.


Beginning of the Proof of Zsigmondy’s Theorem 3.4.1

Suppose that $\Phi_n(a, b)$ does not have any primitive prime factor for some $n \ge 3$; the case $n = 2$
was done in Exercise 3.4.2∗. Then, let $p \mid \Phi_n(a, b)$ be a non-primitive prime factor, say that it also
divides $\Phi_m(a, b)$ for some $m < n$. Since $a$ and $b$ are coprime integers, $p$ cannot divide both of
them so it divides neither and the order of $ab^{-1}$ mod $p$ is both $\frac{n}{p^{v_p(n)}}$ and $\frac{m}{p^{v_p(m)}}$ by Exercise 3.3.3∗.

Thus, $n$ and $m$ differ multiplicatively by a power of $p$. In particular, $p \mid n$ so $p$ is the greatest
prime factor of $n$ by Exercise 3.3.5∗ and hence is unique!

Moreover, $p$ can’t be equal to $2$, otherwise $n$ would be a power of $2$ but $\Phi_{2^k}(a, b) = a^{2^{k-1}} + b^{2^{k-1}}$
is a sum of two coprime squares hence not divisible by $4$ but clearly at least $4$ which means it
can’t be a power of $2$. Thus, by the LTE lemma 3.4.2, $v_p(\Phi_n(a, b)) \le 1$. We have reduced the
problem to showing $|\Phi_n(a, b)|$ is not equal to the greatest prime factor of $n$!

If |Φn (a, b)| were equal to the greatest prime factor of n, it would in particular be at most n.
Intuitively, this should not be the case as cyclotomic polynomials are exponential in n. We are thus
led to find bounds on them. This is achieved in the following proposition.

Proposition 3.4.1*

Let $a$ and $b$ be two real numbers with $|a| \ge |b|$ and $n \ge 1$ an integer. Then,
$$(|a| - |b|)^{\varphi(n)} \le |\Phi_n(a, b)| \le (|a| + |b|)^{\varphi(n)}$$
with equality in either side only if $n = 1$ or $n = 2$. In addition, if $n > 2$,
$$|b|^{\varphi(n)} \le \Phi_n(a, b)$$
with equality only if $|a| = |b|$.

Proof

The first part of this follows from the triangle inequality exactly like we did for Problem 3.1.1:
$$|\Phi_n(a, b)| = \prod_{\omega \text{ primitive } n\text{th root of unity}} |a - b\omega|$$
by Exercise 3.1.8∗ and each factor is between $|a| - |b|$ and $|a| + |b|$. The equality cases are easy to
work out: $|a - b\omega| = |a| \pm |b|$ implies $\omega$ is real so $n = 1$ or $n = 2$.

For the $|b|^{\varphi(n)}$ part, after dividing by it, it reduces to $|\Phi_n(a/b)| \ge 1$, thus to showing $|\Phi_n(x)| > 1$
for all $|x| > 1$. Notice that, for any $|\omega| = 1$, $|x - \omega|$ is a strictly decreasing function in $x$ if $x \le -1$
and is a strictly increasing function in $x$ if $x \ge 1$. Hence,
$$|\Phi_n(x)| = \prod_{\omega \text{ primitive } n\text{th root of unity}} |x - \omega|$$
is at least $|\Phi_n(1)|$ if $x \ge 1$ and at least $|\Phi_n(-1)|$ if $x \le -1$. But these are both non-zero integers so in
both cases it is at least $1$, and by strict monotonicity equality can only hold if $|a| = |b|$.


Exercise 3.4.3. Let n ≥ 3 be an integer. Prove that Φn is positive on R.



Back to the Proof of Zsigmondy’s Theorem 3.4.1

Suppose that $\Phi_n(a, b) = p$ where $p$ is a prime factor of $n$. Then, by Proposition 3.4.1,
$$b^{\varphi(n)}, (a - b)^{\varphi(n)} \le \Phi_n(a, b) = p.$$
In particular, since $p \mid n$,
$$b^{p-1}, (a - b)^{p-1} \le p.$$
Exercise 3.4.4 therefore implies that $b = 1$ and $a - b = 1$ since $p \ne 2$, i.e. $b = 1$ and $a = 2$.

We now use Proposition 3.1.2:
$$\Phi_n(a) \ge \frac{\Phi_{n/p}(a^p)}{\Phi_{n/p}(a)} \ge \left(\frac{2^p - 1}{3}\right)^{\varphi(n/p)}.$$
By Exercise 3.4.4, since $\frac{2^p - 1}{3} \le p$, we must have $p = 3$. Since we also had $\left(\frac{2^p - 1}{3}\right)^{\varphi(n/p)} \le p$ this
means that $\varphi(n/p) = 1$ so $n = p$ or $2p$.

Since $\Phi_1(2, 1) = 1$ and $\Phi_2(2, 1) = 3$, $\Phi_3(2, 1) = 7$ has a primitive prime factor, which means that $n = 6$ and
we have finally found our exception!


Remark 3.4.1
As Exercise 3.5.40† shows, we can still get an exponential bound for Φn (2) and make that case
similar to the others, but this is technical so we preferred this approach.

Exercise 3.4.4. Prove that $2^{m-1} > m$ for any integer $m \ge 3$ and $2^m - 1 > 3m$ for any integer $m \ge 4$.

3.5 Exercises
Diophantine Equations
Exercise 3.5.1. Find all rational integers $x$ and $y$ such that $x^2 + 9 = y^3$.

Exercise 3.5.2† (USA TST 2008). Let $n$ be a rational integer. Prove that $n^7 + 7$ is not a perfect
square.

Exercise 3.5.3. Solve the equation
$$x^3 = y^{16} + y^{15} + \ldots + y + 9$$
over $\mathbb{Z}$.

Exercise 3.5.4 (Japanese Mathematical Olympiad 2011). Find all positive integers $a, p, q, r, s$ such
that
$$a^s - 1 = (a^p - 1)(a^q - 1)(a^r - 1).$$

Exercise 3.5.5† (French TST 1 2017). Determine all positive integers $a$ for which there exist positive
integers $m$ and $n$ as well as positive integers $k_1, \ldots, k_m, \ell_1, \ldots, \ell_n$ such that
$$(a^{k_1} - 1) \cdots (a^{k_m} - 1) = (a^{\ell_1} + 1) \cdots (a^{\ell_n} + 1).$$


Divisibility Relations
Exercise 3.5.6 (IMO 2000). Does there exist a rational integer $n$ such that $n$ has exactly $2000$ distinct
prime factors and $n$ divides $2^n + 1$?
Exercise 3.5.7†. Find all coprime positive integers $a$ and $b$ for which there exist infinitely many
integers $n \ge 1$ such that
$$n^2 \mid a^n + b^n.$$
Exercise 3.5.8. Prove that there exist infinitely many positive integers $n$ such that
$$n^3 \mid 2^{n^2} + 1.$$
Exercise 3.5.9 (Iran TST 2013). Prove that there do not exist positive rational integers $a, b, c$ such
that $3(ab + bc + ca) \mid a^2 + b^2 + c^2$.
Exercise 3.5.10 (ISL 1998). Determine all positive integers $n$ for which there is an $m \in \mathbb{Z}$ such that
$2^n - 1 \mid m^2 + 9$.

Prime Factors
Exercise 3.5.11† (ISL 2002). Let $p_1, \ldots, p_n > 3$ be distinct rational primes. Prove that the number
$$2^{p_1 \cdots p_n} + 1$$
has at least $2^{2^n}$ divisors.
Exercise 3.5.12† (Problems from the Book). Let $a \ge 2$ be a rational integer. Prove that there exist
infinitely many integers $n \ge 1$ such that the greatest prime factor of $a^n - 1$ is greater than $n \log_a n$.
Exercise 3.5.13† (Inspired by IMO 2003). Let $m \ge 1$ be an integer. Prove that there is some rational
prime $p$ such that $p \nmid n^m - m$ for any rational integer $n$.
Exercise 3.5.14†. Prove that $\varphi(n)/n$ can get arbitrarily small. Deduce that $\pi(n)/n \to 0$, where $\pi(n)$
denotes the number of primes at most $n$.
Exercise 3.5.15†. Let $P(n)$ denote the greatest prime factor of any rational integer $n \ge 1$ ($P(1) = 0$).
Let $\varepsilon > 0$ be a real number. Prove that there exist infinitely many rational integers $n \ge 2$ such that
$$P(n - 1), P(n), P(n + 1) < n^{\varepsilon}.$$
Exercise 3.5.16† (Brazilian Mathematical Olympiad 1995). Let $P(n)$ denote the greatest prime factor
of any rational integer $n \ge 1$. Prove that there exist infinitely many rational integers $n \ge 2$ such that
$$P(n - 1) < P(n) < P(n + 1).$$

Exercise 3.5.17. Let $a, b \in \mathbb{Z}[\sqrt{5}]$ be quadratic integers such that $a \equiv b \pmod{\sqrt{5}}$ and $n \ge 1$ an
integer. Prove that
$$v_{\sqrt{5}}(a^n - b^n) = v_{\sqrt{5}}(a - b) + v_{\sqrt{5}}(n)$$
where $v_{\sqrt{5}}(x)$ denotes the greatest integer $v$ such that $(\sqrt{5})^v \mid x$ but $(\sqrt{5})^{v+1} \nmid x$. Deduce that
$$v_5(F_n) = v_5(n)$$
for any $n \ge 1$, where $(F_n)_{n \ge 0}$ is the Fibonacci sequence defined by $F_0 = 0$, $F_1 = 1$ and
$F_{n+2} = F_{n+1} + F_n$ for $n \ge 0$.
Exercise 3.5.18† (Structure of units of $\mathbb{Z}/n\mathbb{Z}$). Let $p$ be an odd rational prime and $n \ge 1$ an integer.
Prove that there is a primitive root modulo $p^n$, i.e. a number $g$ which generates all the numbers coprime
with $p$ modulo $p^n$. Moreover, show that there doesn’t exist a primitive root mod $2^n$ for $n \ge 3$, but
that, in that case, there exist a rational integer $g$ and a rational integer $a$ such that each odd rational
integer is congruent to either $g^k$ for some $k$ or $ag^k$ modulo $2^n$.³
³ In group-theoretic terms, this says that $(\mathbb{Z}/p^n\mathbb{Z})^\times \simeq \mathbb{Z}/\varphi(p^n)\mathbb{Z}$ and that $(\mathbb{Z}/2^n\mathbb{Z})^\times \simeq (\mathbb{Z}/2\mathbb{Z}) \times (\mathbb{Z}/2^{n-2}\mathbb{Z})$ for $n \ge 2$.
The Chinese remainder theorem then yields
$$(\mathbb{Z}/2^n p_1^{n_1} \cdots p_m^{n_m}\mathbb{Z})^\times \simeq (\mathbb{Z}/2\mathbb{Z}) \times (\mathbb{Z}/2^{n-2}\mathbb{Z}) \times (\mathbb{Z}/\varphi(p_1^{n_1})\mathbb{Z}) \times \ldots \times (\mathbb{Z}/\varphi(p_m^{n_m})\mathbb{Z}).$$

Coefficients of Cyclotomic Polynomials


Exercise 3.5.19. Define the Möbius function $\mu : \mathbb{N}^* \to \{-1, 0, 1\}$ by $\mu(n) = (-1)^k$ where $k$ is the
number of prime factors of $n$ if $n$ is squarefree, and $\mu(n) = 0$ otherwise. Prove that
$$\Phi_n = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)}.$$
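
The formula of Exercise 3.5.19 can at least be compared numerically with Proposition 3.1.1 before attempting a proof. The sketch below evaluates both expressions at an integer $a \ge 2$, using exact rational arithmetic for the Möbius product; the function names are hypothetical and nothing here replaces the proof asked for in the exercise.

```python
from fractions import Fraction
from functools import lru_cache


def mobius(n):
    """Moebius function of a positive integer n, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    return -result if n > 1 else result


@lru_cache(maxsize=None)
def phi_value(n, a):
    """Phi_n(a) for an integer a >= 2, via a^n - 1 = prod_{d | n} Phi_d(a)."""
    value = a**n - 1
    for d in range(1, n):
        if n % d == 0:
            value //= phi_value(d, a)
    return value


def phi_value_mobius(n, a):
    """Product of (a^d - 1)^mu(n/d) over the divisors d of n, as a Fraction."""
    result = Fraction(1)
    for d in range(1, n + 1):
        if n % d == 0:
            result *= Fraction(a**d - 1) ** mobius(n // d)
    return result


print(phi_value_mobius(12, 2), phi_value(12, 2))
```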

Exercise 3.5.20†. Let $m \ge 0$ be an integer. Prove that the coefficient of $X^m$ of $\Phi_n$ is bounded when
$n$ varies.
Exercise 3.5.21†. Let $\psi(x) = \sum_{p^\alpha \le x} \log p$. By noticing that
$$\exp(\psi(2n + 1)) \int_0^1 x^n (1 - x)^n \, dx \le \frac{\exp(\psi(2n + 1))}{4^n},$$
prove that $\pi(n)$, the number of primes at most $n$, is greater than $Cn/\log n$ for some constant $C > 0$.
Exercise 3.5.22†. Let $m \ge 3$ be an odd integer and suppose that $p_1 < \ldots < p_m = p$ are rational
primes such that $p_1 + p_2 > p_m$ and let $n = p_1 \cdots p_m$. What are the coefficients of $X^p$ and $X^{p-2}$ of
$\Phi_n$? Deduce that any rational integer arises as a coefficient of a cyclotomic polynomial.⁴
Exercise 3.5.23†. Let $p$ and $q$ be two rational primes. Prove that the coefficients of $\Phi_{pq}$ are in
$\{-1, 0, 1\}$.

Cyclotomic Fields and Fermat’s Last Theorem


Exercise 3.5.24† (Sophie-Germain’s Theorem). Let $p$ be a Sophie-Germain prime, i.e. a rational
prime such that $2p + 1$ is also prime. Prove that the equation $a^p + b^p = c^p$ does not have rational
integer solutions with $p \nmid abc$.
Exercise 3.5.25†. Let $\omega$ be an $n$th root of unity. Define $\mathbb{Q}(\omega)$ as $\mathbb{Q} + \omega\mathbb{Q} + \ldots + \omega^{n-1}\mathbb{Q}$. Prove that
$$\mathbb{Q}(\omega) \cap \mathbb{R} = \mathbb{Q}(\omega + \omega^{-1})$$
where $\mathbb{Q}(\omega + \omega^{-1}) = \mathbb{Q} + (\omega + \omega^{-1})\mathbb{Q} + \ldots + (\omega + \omega^{-1})^{n-1}\mathbb{Q}$.
Exercise 3.5.26†. Let $\omega$ be a primitive $p$th root of unity, where $p$ is prime. Prove that the ring of
integers of $\mathbb{Q}(\omega)$, $\mathcal{O}_{\mathbb{Q}(\omega)} := \mathbb{Q}(\omega) \cap \overline{\mathbb{Z}}$, is
$$\mathbb{Z}[\omega] := \mathbb{Z} + \omega\mathbb{Z} + \ldots + \omega^{p-1}\mathbb{Z}.$$
(In fact this holds for any $n$th root of unity but it is harder to prove.)
Exercise 3.5.27†. Let $\omega$ be a primitive $p$th root of unity, where $p$ is prime. Prove that $p = u(1 - \omega)^{p-1}$,
where $u \in \overline{\mathbb{Z}}$ is a unit of $\overline{\mathbb{Z}}$, i.e. $1/u$ is also an algebraic integer. Deduce that $1 - \omega$ is prime in $\mathbb{Q}(\omega)$.
Exercise 3.5.28† (Kummer). Let $\omega$ be a root of unity of odd prime order $p$ and suppose $\varepsilon$ is a unit
of $\mathbb{Q}(\omega)$. Prove that $\varepsilon = \eta\omega^n$ for some $n \in \mathbb{Z}$ and $\eta \in \mathbb{R}$.
Exercise 3.5.29†. Let $\alpha \in \mathbb{Z}[\omega]$, where $\omega$ is a primitive $p$th root of unity. Prove that $\alpha^p$ is congruent
to a rational integer modulo $p$.
Exercise 3.5.30† (Kummer). Let $p$ be an odd prime and $\omega$ a primitive $p$th root of unity. Suppose
that $\mathbb{Z}[\omega]$ is a UFD.⁵ Prove that there do not exist non-zero rational integers $a, b, c \in \mathbb{Z}$ such that
$$a^p + b^p + c^p = 0.$$
(You may assume that, if a unit of $\mathbb{Z}[\omega]$ is congruent to a rational integer modulo $p$, it is a $p$th power
of a unit. This is known as "Kummer’s lemma". See Borevich-Shafarevich [6] or Conrad [10] for a
$(1 - \omega)$-adic proof of this.)
⁴ This may come off as a bit surprising considering that all the cyclotomic polynomials we saw had only $\pm 1$ and $0$
coefficients.
⁵ Sadly, it has been proven that $\mathbb{Z}[\omega]$ is only a UFD when $p \in \{3, 5, 7, 11, 13, 17, 19\}$. This approach works however
almost verbatim when the class number $h$ of $\mathbb{Q}(\omega)$ is not divisible by $p$. The case $h = 1$ corresponds to $\mathbb{Z}[\omega]$ being a
UFD. That said, it has not been proven that there exist infinitely many $p$ such that $p \nmid h$ (but it has been conjectured
to be the case), while it has been proven that there exist infinitely many $p$ such that $p \mid h$.
Exercise 3.5.31† (Fleck’s Congruences). Let $n \ge 1$ be an integer, $p$ a prime number and $q = \left\lfloor \frac{n-1}{p-1} \right\rfloor$.
Prove that, for any rational integer $m$,
$$p^q \mid \sum_{k \equiv m \pmod p} (-1)^k \binom{n}{k}.$$

Miscellaneous
Exercise 3.5.32. Prove a version of Zsigmondy where a and b are coprime rational integers (not
necessarily positive).
Exercise 3.5.33† (Korea Winter Program Practice Test 1 2019). Find all non-zero polynomials $f \in \mathbb{Z}[X]$ such that, for any prime number $p$ and any integer $n$, if $p \nmid n, f(n)$, the order of $f(n)$ modulo $p$ is at most the order of $n$ modulo $p$.
Exercise 3.5.34† (Korea Mathematical Olympiad Final Round 2019). Show that there exist infinitely
many positive integers k such that the sequence (an )n≥0 defined by a0 = 1, a1 = k + 1 and

an+2 = kan+1 − an

for n ≥ 0 contains no prime number.


Exercise 3.5.35† (Iran Mathematical Olympiad 3rd round 2018). Let $a$ and $b$ be positive rational integers distinct from ±1, 0. Prove that there are infinitely many rational primes $p$ such that $a$ and $b$ have the same order modulo $p$. (You may assume Dirichlet’s theorem.)

Exercise 3.5.36 (All-Russian Mathematical Olympiad 2008). Let $S$ be a finite set of rational primes. Prove that there exists a positive rational integer $n$ which can be written in the form $a^p + b^p$ for some $a, b \in \mathbb{Z}$ if and only if $p \in S$.
Exercise 3.5.37† (IMC 2010). Let $f : \mathbb{R} \to \mathbb{R}$ be a function and $a < b$ two real numbers. Suppose that $f$ is zero on $[a, b]$, and
$$\sum_{k=0}^{p-1} f\left(x + \frac{k}{p}\right) = 0$$
for any $x \in \mathbb{R}$ and any rational prime $p$. Prove that $f$ is zero everywhere.

Exercise 3.5.38. What is the discriminant of Φn ?


Exercise 3.5.39. Let $k$ and $n \geq 1$ be coprime integers. Find the minimal polynomial of $\tan\frac{2k\pi}{n}$. Deduce that $\tan(q\pi)$ takes only the rational values 0 and ±1 for rational $q$.
Exercise 3.5.40†. Let $n \geq 1$ be an integer. Prove that $\Phi_n(x) \geq (x-1)x^{\varphi(n)-1}$ with equality if and only if $n = 1$.⁶

6 In particular, $\Phi_n(2) \geq 2^{\varphi(n)-1}$.


Chapter 4

Finite Fields

Prerequisites for this chapter: Chapters 1 and 3 and Section A.2.

The start of this chapter will be a bit technical; I hope the reader will bear with it.

Recall what a field is. We say $(K, +, \cdot)$ is a field (we usually just write "$K$ is a field" when the addition and multiplication are clear from the context) if $+$ and $\cdot$ have nice properties: commutativity, associativity, existence of an additive identity (0), existence of a multiplicative identity (1), multiplication distributes over addition, existence of additive inverses, and, most importantly, existence of multiplicative inverses (except for 0). There is no need to remember all of these: a field is an integral domain where each non-zero element has a multiplicative inverse. This might seem a bit complicated, but just think of $\mathbb{Q}$ when you have to use fields.
Exercise 4.0.1. Suppose $K$ is a field of characteristic zero, i.e.
$$\underbrace{1 + \ldots + 1}_{n \text{ times}}$$
(where 1 is the multiplicative identity) is never zero for any $n \geq 1$. Prove that $K$ contains (up to relabelling of the elements) $\mathbb{Q}$.¹

First, we discuss the simplest case of finite fields: the integers modulo $p$, $\mathbb{Z}/p\mathbb{Z}$. We will call this field $\mathbb{F}_p$ for "field with $p$ elements". It is very important to understand that the elements of $\mathbb{F}_p$ are not rational integers! $p = 0$ in $\mathbb{F}_p$ (note that this is an equality and not a congruence: congruences are for rational integers while equality is just equality but in another field) while $p \neq 0$ in $\mathbb{Z}$. Thus, while we use the same notations for the elements of $\mathbb{F}_p$ and elements of $\mathbb{Z}$, they are not the same.
Exercise 4.0.2∗ . Let p be a rational prime. Prove that there exists a unique field with p elements (it’s Z/pZ).

Now we can discuss what finite fields are. Their name is quite explicit: they are the fields which are also finite. However there is a much nicer characterisation of them: they are the finite extensions of some $\mathbb{F}_p$, i.e. $\mathbb{F}_p$ with some elements algebraic over $\mathbb{F}_p$ added. This is analogous to the construction of the complex numbers: you add an imaginary number $i$ such that $i^2 = -1$ to the real numbers $\mathbb{R}$. You can do exactly the same thing for $\mathbb{F}_3$: the polynomial $X^2 + 1$ doesn’t have a root there so you can add an imaginary (formal) number $i_3$ such that $i_3^2 = -1$ (in $\mathbb{F}_3$), thus getting a field with 9 elements.
Exercise 4.0.3∗ . Prove that F3 (i) := F3 + iF3 is a field (with 9 elements). (The hard part is to prove that
each element has an inverse.)
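
Before proving Exercise 4.0.3, it can be reassuring to check it by brute force. The sketch below is only an illustration, not a proof: it represents $a + b\,i_3$ as the pair `(a, b)` with entries modulo 3 (the names `P`, `mul`, `elements` and `inverses` are my own choices) and lists, for each non-zero element, every element that multiplies it to 1.

```python
from itertools import product

P = 3  # we work over F_3, where X^2 + 1 has no root

def mul(x, y):
    """Multiply a + b*i and c + d*i in F_3(i), using i^2 = -1."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

# The 9 elements of F_3(i): the pair (a, b) stands for a + b*i.
elements = list(product(range(P), repeat=2))
nonzero = [x for x in elements if x != (0, 0)]

# For every non-zero x, collect all y with x * y = 1.
inverses = {x: [y for y in nonzero if mul(x, y) == (1, 0)] for x in nonzero}

for x, ys in inverses.items():
    print(x, "has inverse(s)", ys)
```

Each non-zero element should come with exactly one inverse. Replacing `P = 3` by `P = 5` models $\mathbb{Z}[i]/5\mathbb{Z}[i]$ instead, and there some non-zero elements have no inverse at all.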

Why are we interested in finite fields other than Fp ? Well, for the same reason we are interested
in algebraic numbers. It is nice to have polynomials factorise completely (we say they split), thus we
create new fields by adding roots of polynomials to Fp .
1 Technically, it will usually not contain Q because Q is a very specific object. Indeed, the definition of a field is

extremely sensitive: if you change the set K (relabel its elements) but keep everything else the same you get a different
field. In that case we say the new field is isomorphic to the old one. So you must prove that K contains a field isomorphic
to Q, i.e. Q up to relabeling of its elements.


Let’s explain a bit what we mean by that. Given an irreducible polynomial $f \in \mathbb{F}_p[X]$ of degree $n$, we let $\alpha$ be a formal object satisfying $f(\alpha) = 0$ and then consider the field generated by $\alpha$ (and $\mathbb{F}_p$). It contains $\alpha$ so it must also contain $\alpha^2, \ldots, \alpha^{n-1}$ and hence all the linear combinations of these elements. Conversely, Exercise 4.2.1∗ shows that $\mathbb{F}_p + \alpha\mathbb{F}_p + \ldots + \alpha^{n-1}\mathbb{F}_p$ is a field (has multiplicative inverses) and this is therefore what we mean by "adding a root of $f$ to $\mathbb{F}_p$". We denote this field by $\mathbb{F}_p(\alpha)$. Iterating this process shows that, given any polynomial (not necessarily irreducible) $f$, we can construct a field containing $\mathbb{F}_p$ where $f$ splits: these are exactly the finite fields we are interested in.
Finally, we come back to our earlier remark about elements of $\mathbb{F}_p$ not being integers. Here, the same is true for $\mathbb{F}_3(i_3)$: its elements are not Gaussian integers. You may protest and claim that $\mathbb{F}_3(i_3)$ is just $\mathbb{Z}[i]/3\mathbb{Z}[i]$, i.e. the Gaussian integers modulo 3. And that is true (up to relabelling of the elements) (see ??). However imagine that we were working with $\mathbb{F}_5$ instead. Then, any $i_5$ satisfying $i_5^2 = -1$ must already be in $\mathbb{F}_5$ as $X^2 + 1 = (X + 2)(X - 2)$. Thus, $\mathbb{Z}[i]/5\mathbb{Z}[i]$ is very different from $\mathbb{F}_5(i_5)$ as the former has 25 elements while the latter only 5. (The former is not a field because the polynomial $X^2 + 1$ has 4 distinct roots. Equivalently, $i + 2$ has no inverse.)
You might argue that we need to distinguish the cases where the polynomial $f$ already has a root in $\mathbb{F}_p$ and when it does not. This is not only ugly and artificial (having to distinguish these cases), but also false as what we want is for $p$ to stay prime in $\mathbb{Q}(\alpha)$ where $\alpha$ is a root of $f$. Up to a finite number of exceptions, this is equivalent to $f$ staying irreducible in $\mathbb{F}_p[X]$ (see Exercise 6.5.35).²
As a last remark, I hope you are now convinced that a field with, say, $p^2$ elements is very different from $\mathbb{Z}/p^2\mathbb{Z}$! The latter is not even a field since $p$ does not have an inverse!

4.1 Frobenius Morphism


Before constructing finite fields, we need to discuss some things about Fp itself.

Definition 4.1.1 (Frobenius)

Let $p$ be a rational prime and $R$ a commutative ring of characteristic $p$, meaning that $p = 0$ in $R$. The Frobenius morphism of $R$ is $\mathrm{Frob}_R : x \mapsto x^p$.

$p = 0$ means that
$$\underbrace{1 + \ldots + 1}_{p \text{ times}} = 0;$$
this is for instance the case in $\mathbb{F}_p$ or $\mathbb{Z}$ modulo $p$. The word "morphism" in this context means that it’s an additive map. Indeed,
$$(x + y)^p = \sum_{k=0}^{p} \binom{p}{k} x^k y^{p-k} = x^p + y^p$$
as
$$\binom{p}{k} = \frac{p!}{k!(p-k)!} = 0$$
for $k = 1, \ldots, p - 1$ ($p$ divides the top but not the bottom).
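
The divisibility of the middle binomial coefficients is easy to test numerically. The following sketch is only a sanity check (the helper name `middle_binomials_vanish` is my own): it tests whether every $\binom{m}{k}$ with $0 < k < m$ is divisible by $m$, which is what happens for primes and what fails for the small composite moduli tried in the usage note.

```python
from math import comb

def middle_binomials_vanish(m):
    """True if m divides C(m, k) for every 0 < k < m."""
    return all(comb(m, k) % m == 0 for k in range(1, m))

for m in range(2, 14):
    print(m, middle_binomials_vanish(m))
```

For instance $\binom{4}{2} = 6$ is not divisible by 4, so $x \mapsto x^4$ has no reason to be additive in a ring where $4 = 0$.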

Proposition 4.1.1*

The Frobenius morphism is indeed a morphism.

Exercise 4.1.1. Why is commutativity (of R) needed?


A direct corollary is the following.
2 However, the intuition that finite fields are represented by algebraic integers is not completely wrong, but instead

of rational primes, we need to look at OQ(α) modulo a prime ideal p (the definition of a prime ideal being an ideal such
that OQ(α) (mod p) is a field). The ideal point of view is very rich and is actually the best point of view (compared to
tricky uses of the fundamental theorem of symmetric polynomials), but we do not expand on this in this book. See [19].

Corollary 4.1.1*
The $n$th iterate of the Frobenius, $\mathrm{Frob}_R^n$, is also a morphism, i.e. $(x + y)^{p^n} = x^{p^n} + y^{p^n}$ for any $x, y \in R$.

Let’s see a quick application of this result.

Problem 4.1.1

Let $(a_n)_{n \geq 0}$ be the sequence defined by $a_0 = 3$, $a_1 = 0$, $a_2 = 2$ and $a_{n+3} = a_{n+1} + a_n$ for $n \geq 0$. Prove that $p \mid a_p$ for any prime number $p$.

Solution

A quick computation shows that $a_n = \alpha^n + \beta^n + \gamma^n$ where $\alpha, \beta, \gamma \in \overline{\mathbb{Z}}$ are the roots of the characteristic polynomial $X^3 - X - 1$ (see Theorem C.4.1).

Thus, by Proposition 4.1.1,
$$a_p = \alpha^p + \beta^p + \gamma^p \equiv (\alpha + \beta + \gamma)^p = a_1^p = 0 \pmod{p}$$
as wanted. □
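
Here is a quick numerical check of Problem 4.1.1, as a sketch rather than a proof. The function name `perrin` is my own choice; it simply iterates the recurrence $a_{n+3} = a_{n+1} + a_n$ from $a_0 = 3$, $a_1 = 0$, $a_2 = 2$.

```python
def perrin(n):
    """Return a_n for a_0 = 3, a_1 = 0, a_2 = 2, a_{n+3} = a_{n+1} + a_n."""
    a, b, c = 3, 0, 2  # (a_k, a_{k+1}, a_{k+2}) with k = 0
    for _ in range(n):
        a, b, c = b, c, a + b  # shift the window: a_{k+3} = a_{k+1} + a_k
    return a

for p in [2, 3, 5, 7, 11, 13, 17, 19, 23]:
    print(p, perrin(p), perrin(p) % p == 0)
```

Note that the statement only goes one way: for a composite index such as $n = 4$ we get $a_4 = 2$, which is not divisible by 4, but nothing in the problem claims that this always happens for composite $n$.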

Exercise 4.1.2∗ . Prove that an = αn + β n + γ n .

4.2 Existence and Uniqueness


Here, we show how to construct all finite fields and prove that there is a unique one of cardinality $q$ for each prime power $q \neq 1$ ($\mathbb{F}_1$ doesn’t exist because the definition of a field specifies that the multiplicative and additive identities are different). Although we can construct a field with $p^n$ elements by adding a root of an irreducible polynomial $f \in \mathbb{F}_p[X]$ of degree $n$, we do not do it that way because it is not obvious that such a polynomial exists (?? provides a proof but itself uses finite fields), but we use this to show the field is unique (surprisingly). In fact, it is actually very surprising that it’s the same for any irreducible $f$ of degree $n$. Over $\mathbb{Q}$ for instance, this is completely false: $\mathbb{Q}(\sqrt{2}) \neq \mathbb{Q}(\sqrt{3})$ (Exercise 2.1.2∗ shows that each $\mathbb{Q}(\sqrt{d})$ for integral squarefree $d$ is distinct from the others).

It is useful to think of elements algebraic over Fp similarly to Q, as part of one big field and hence
compatible with each other even if that is not trivial (for algebraic numbers it is as Q is already part
of C, but how can we define roots of polynomials in Fp [X] which don’t exist in Fp ?). It will be proven
in Definition 4.3.1.

Proposition 4.2.1

For any rational prime p and integer n ≥ 1, there exists a field with pn elements.

Proof
Consider the polynomial $X^{p^n} - X$ in $\mathbb{F}_p[X]$. Factorise it as $f_1 \cdot \ldots \cdot f_k$ where $f_1, \ldots, f_k$ are irreducible in $\mathbb{F}_p[X]$ (not necessarily distinct).

We can construct a field $F$ where $f_1, \ldots, f_k$ split (have all their roots in $F$) inductively using Exercise 4.2.1∗. Indeed, one can add a root of $X^{p^n} - X$ (hence a root of one of the $f_i$) which is not already in $\mathbb{F}_p$ to get a new field $F'$ and repeat this process inductively until all $f_i$ have all their roots in $F$. This must terminate as $X^{p^n} - X$ has at most $p^n$ roots in any field.

We claim that this field has exactly $p^n$ elements. Notice that the derivative of $X^{p^n} - X$ is $-1$ (since $p^n = 0$ in $F$), which is coprime with $X^{p^n} - X$, so all its roots are distinct. Denote them by $\alpha_1, \ldots, \alpha_{p^n}$. Since they all lie in $F$, it has at least $p^n$ elements.

To conclude, we prove that any element of $F$ is a root of $X^{p^n} - X$. The roots of $X^{p^n} - X$ are clearly closed under multiplication, multiplicative inverse and additive inverse, and Corollary 4.1.1 shows that they are also stable under addition. Since any element of $F$ is obtained from these roots using these operations, we conclude that they are all roots of $X^{p^n} - X$ and thus there are at most $p^n$ of them.


Exercise 4.2.1∗ . Let K be a field and f ∈ K[X] an irreducible polynomial of degree n. Prove that
K(α) := K + αK + . . . + αn−1 K
is a field, where α is defined as a formal root of f , i.e. an object satisfying f (α) = 0.
Before proving the uniqueness of this field up to isomorphism, we need an analogue of Fermat’s
little theorem. Here Fq denotes a field with q elements.

Definition 4.2.1

For any ring R, we write R× for the multiplicative group of R, i.e. the units of R.

When R = K is a field, we thus have R× = K \ {0}.

Theorem 4.2.1 (Fermat’s Little Theorem in Finite Fields)


For any $\alpha \in \mathbb{F}_q^\times$, we have $\alpha^{q-1} = 1$. Equivalently, $\alpha^q = \alpha$ for any $\alpha \in \mathbb{F}_q$.

Proof

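Let $\alpha \in \mathbb{F}_q^\times$ be a non-zero element. The function $\beta \mapsto \alpha\beta$ is a bijection from $\mathbb{F}_q^\times$ to $\mathbb{F}_q^\times$ since $\alpha$ is invertible. Thus,
$$\prod_{\beta \in \mathbb{F}_q^\times} \beta = \prod_{\beta \in \mathbb{F}_q^\times} \alpha\beta = \alpha^{q-1} \prod_{\beta \in \mathbb{F}_q^\times} \beta.$$
Since $\prod_{\beta \in \mathbb{F}_q^\times} \beta \neq 0$, we conclude that $\alpha^{q-1} = 1$.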


Remark 4.2.1
For the reader knowing a bit of group theory, this can also be seen to be Lagrange’s theorem applied to the multiplicative group $\mathbb{F}_q^\times$.

Before proving that finite fields are unique (up to isomorphism), we will present an application of
Theorem 4.2.1 to the period of linear recurrences modulo p to further motivate the interest of finite
fields.
We shall prove that the sequence $(a_n)_{n \geq 0}$ in Problem 4.1.1 has period (dividing) $p^6 - 1$ modulo $p$ for any prime $p$. Try to find a proof for this seemingly elementary fact without appealing to finite fields!

Proof

$X^3 - X - 1$ factorises (in $\mathbb{F}_p[X]$!) either as three linear factors, one linear and one quadratic, or one cubic. In these cases the roots $\alpha_p, \beta_p, \gamma_p$ are respectively all in $\mathbb{F}_p$, one in $\mathbb{F}_p$ and two in $\mathbb{F}_{p^2}$, or all in $\mathbb{F}_{p^3}$. ($\alpha_p, \beta_p, \gamma_p$ are not algebraic numbers, they are (formal) algebraic elements over $\mathbb{F}_p$.)

We hence always have
$$\alpha_p^{p^6-1} = \beta_p^{p^6-1} = \gamma_p^{p^6-1} = 1$$
(as $p^2 - 1$ and $p^3 - 1$ divide $p^6 - 1$).

Finally, we conclude that
$$a_{n+p^6-1} \equiv \alpha_p^n \cdot \alpha_p^{p^6-1} + \beta_p^n \cdot \beta_p^{p^6-1} + \gamma_p^n \cdot \gamma_p^{p^6-1} = \alpha_p^n + \beta_p^n + \gamma_p^n \equiv a_n \pmod{p}$$
by Theorem 4.2.1 since $\alpha_p, \beta_p, \gamma_p \neq 0$.

In fact, we even get that the period divides $(p^2 - 1)(p^3 - 1)$.


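By a similar argument, the Fibonacci sequence $(F_n)$ has period dividing $p^2 - 1$ modulo any rational prime $p \neq 5$. (For $p = 5$ the characteristic polynomial $X^2 - X - 1$ has a double root and the period is 20.)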
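
These period statements can be checked experimentally. The sketch below (the names `period_mod`, `perrin_step` and `fibonacci_step` are my own) follows the tuple of consecutive terms modulo $p$ until it returns to its starting value; this terminates because both recurrences can be run backwards, so the sequence of states is purely periodic. I treat $p = 5$ separately for Fibonacci, since $X^2 - X - 1$ has a double root there.

```python
def period_mod(initial, step, p):
    """Number of steps before the state (taken modulo p) returns to `initial`."""
    start = tuple(x % p for x in initial)
    state, length = step(start, p), 1
    while state != start:
        state, length = step(state, p), length + 1
    return length

def perrin_step(state, p):
    a, b, c = state  # (a_n, a_{n+1}, a_{n+2})
    return (b, c, (a + b) % p)

def fibonacci_step(state, p):
    a, b = state  # (F_n, F_{n+1})
    return (b, (a + b) % p)

for p in [2, 3, 5, 7, 11, 13]:
    t = period_mod((3, 0, 2), perrin_step, p)
    print("Perrin", p, t, (p**6 - 1) % t == 0)

for p in [2, 3, 7, 11, 13]:
    t = period_mod((0, 1), fibonacci_step, p)
    print("Fibonacci", p, t, (p**2 - 1) % t == 0)
```

Computing the exact period and testing divisibility is cheaper than comparing $a_{n + p^6 - 1}$ with $a_n$ directly, since the state space has at most $p^3$ elements.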

Remark 4.2.2
This can actually be proven elementarily for $\mathbb{F}_{p^2}$. Indeed, any element of $\mathbb{F}_{p^2}$ can be written as $a + b\sqrt{d}$ where $d \in \mathbb{F}_p$ is not a square modulo $p$. Since $\mathrm{Frob}_p^2$ is a morphism, it suffices to prove that $\mathrm{Frob}_p^2(\sqrt{d}) = \sqrt{d}$ to conclude that it fixes all of $\mathbb{F}_{p^2}$. But this is easy for odd $p$:
$$(\sqrt{d})^{p^2-1} = \left(d^{\frac{p+1}{2}}\right)^{p-1} = 1.$$

Note that this argument can be written without appealing to finite fields to prove that $(F_n)$ has period dividing $p^2 - 1$, but things already become more messy as we need to treat separately the cases where $p \in \{2, 5\}$ (because of the denominator or the square root). Also, this does not generalise to recurrences of order $\geq 3$ as algebraic numbers of degree 3 are not simply of the form $a + b\sqrt[3]{d}$ (but over $\mathbb{F}_p$ they are by Theorem 4.2.2).

We now prove that finite fields are unique. For this, we shall need the fact that finite fields have prime characteristic and thus contain (a copy of) $\mathbb{F}_p$ where $p$ is the characteristic. Indeed, if the characteristic of the finite field $F$ is $c = \operatorname{char} F = ab$, then we must have $a = 0$³ or $b = 0$ in $F$ since $ab = 0$ in $F$ and a field is an integral domain. By minimality of the characteristic, this means that $a = c$ or $b = c$, i.e. $c$ is prime. $F$ then naturally contains the copy of $\mathbb{F}_p$ where we send $a \in \mathbb{F}_p$ to⁴
$$\underbrace{1 + \ldots + 1}_{a \text{ times}}.$$
That way, we can consider finite fields as field extensions of some $\mathbb{F}_p$, as stated in the beginning of the chapter.

Theorem 4.2.2

Let $q$ be an integer. If $q \neq 1$ is a power of a prime then there exists a unique (up to isomorphism) field with $q$ elements, otherwise there is none.

3 Here we abusively mean $\underbrace{1 + \ldots + 1}_{a \text{ times}}$.
4 By this we mean that we take a representative $A \in \mathbb{N}$ of $a$ and then add 1 $A$ times. This is well defined since $\mathbb{F}_p$ and $F$ have the same characteristic.

This proof is not super instructive and slightly technical so it can be skipped upon a first reading.
(However once understood, one sees that it only consists of more or less trivial technicalities. The key
point is Fermat’s little theorem.)

Proof

The fact that if $F$ is a finite field of cardinality $q$ then $q \neq 1$ is a prime power follows from Proposition C.1.4.

We now show that finite fields of cardinality $p^n$ are unique; existence was proven in Proposition 4.2.1. We proceed by induction on $n$. It is clearly true for $n = 1$.

Suppose $\mathbb{F}_{p^m}$ is unique for $m < n$ and let $F, F'$ be two fields with $p^n$ elements. Since $p + p^2 + \ldots + p^{n-1} < p^n$, there is some element $\alpha$ of $F$ which is not in any of the previous $\mathbb{F}_{p^m}$. (By this we mean that $\alpha^{p^m} \neq \alpha$ for any $m < n$. Indeed, this is the property that defines elements of $\mathbb{F}_{p^m}$. We do not actually need the induction hypothesis: the polynomial $(X^p - X) \cdot \ldots \cdot (X^{p^{n-1}} - X)$ has less than $p^n$ roots so it doesn’t vanish on all of $F$.)

Let $k$ be the degree of $\alpha$ so that $\mathbb{F}_p(\alpha)$ is a field with $p^k$ elements. If $k < n$, by the induction hypothesis this must be $\mathbb{F}_{p^k}$ so $\alpha \in \mathbb{F}_{p^k}$ which is not the case. (Again, this means that $\alpha^{p^k} = \alpha$.) On the other hand, if $k > n$, then $\mathbb{F}_p(\alpha)$ has $p^k > p^n$ elements which is impossible since it is contained in $F$, a field of cardinality $p^n$. Thus $\alpha$ has degree $n$ (incidentally this shows that there always exists an irreducible polynomial in $\mathbb{F}_p[X]$ of degree $n$).

Accordingly, we conclude that $F = \mathbb{F}_p(\alpha)$. Let $f$ be the minimal polynomial of $\alpha$. By Theorem 4.2.1, we know that $f \mid X^{p^n} - X$. Again, by Theorem 4.2.1, we know that $X^{p^n} - X$ splits in $F'$, so $f$ must also split in $F'$.

To conclude, let $\beta$ be a root of $f$ in $F'$. Then, we again have $F' = \mathbb{F}_p(\beta)$. Since $\alpha$ and $\beta$ have the same minimal polynomial, $F$ and $F'$ are the same except that $\alpha$ has been relabelled as $\beta$. Indeed, just relabel $g(\alpha) \in F$ as $g(\beta) \in F'$.

This gives us an isomorphism (a relabelling conserving the structure) between $F$ and $F'$: it is clear that it is additive and multiplicative (hence same field structure) so we just need to check that it is well-defined. This follows from the fact that $\alpha$ and $\beta$ have the same minimal polynomial: if $g(\alpha) = h(\alpha)$ then $g \equiv h \pmod{f}$ so $g(\beta) = h(\beta)$.


Remark 4.2.3
Some readers might recognise that the proofs of uniqueness and existence are just saying that $\mathbb{F}_{p^n}$ is the splitting field of $X^{p^n} - X$.

Note that our proof also yields the following corollary.

Corollary 4.2.1

Any finite field Fpn has the form Fp (α) for some α, i.e. is generated by one element.

4.3 Properties
From the uniqueness of finite fields and Theorem 4.2.1 we can deduce a few fundamental corollaries.

Corollary 4.3.1*

The $n$th iterate of the Frobenius, $\mathrm{Frob}_p^n$, fixes exactly $\mathbb{F}_{p^n}$.



Proof

Theorem 4.2.1 says that $\mathbb{F}_{p^n}$ is fixed by $\mathrm{Frob}_p^n$. Conversely, the polynomial $X^{p^n} - X$ can have at most $p^n$ roots, so any element satisfying $\alpha^{p^n} = \alpha$ must lie in $\mathbb{F}_{p^n}$.


This might seem trivial but is in fact very useful as it allows us to compare elements of different finite fields. For instance, $\alpha$ is in $\mathbb{F}_p$ if and only if $\alpha^p = \alpha$ (we will use this in Proposition 4.4.2).

Corollary 4.3.2*

Let $m$ and $n$ be positive integers. We have the inclusion $\mathbb{F}_{p^m} \subseteq \mathbb{F}_{p^n}$ if and only if $m \mid n$.

By this, we mean that $\mathbb{F}_{p^n}$ has a subfield isomorphic to $\mathbb{F}_{p^m}$ if and only if $m \mid n$.

Proof
This amounts to saying that the roots of $X^{p^m} - X$ are all roots of $X^{p^n} - X$, i.e. that $X^{p^m-1} - 1 \mid X^{p^n-1} - 1$ since they are distinct. By Exercise 4.3.1∗ this means that $p^m - 1 \mid p^n - 1$. By the same exercise, this is equivalent to $m \mid n$.




Exercise 4.3.1∗. Let $a$ and $b$ be positive integers and $K$ a field. Prove that $X^a - 1$ divides $X^b - 1$ in $K[X]$ if and only if $a \mid b$. Similarly, if $x \geq 2$ is a rational integer, prove that $x^a - 1$ divides $x^b - 1$ in $\mathbb{Z}$ if and only if $a \mid b$.
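
The integer half of Exercise 4.3.1 is easy to test by brute force. This is only a sanity check on a finite range, not a solution; the helper name `divides_pattern_holds` and the parameter `bound` are my own.

```python
def divides_pattern_holds(x, bound):
    """Check that x^a - 1 | x^b - 1 exactly when a | b, for 1 <= a, b <= bound."""
    return all(
        ((x**b - 1) % (x**a - 1) == 0) == (b % a == 0)
        for a in range(1, bound + 1)
        for b in range(1, bound + 1)
    )

for x in [2, 3, 5, 10]:
    print(x, divides_pattern_holds(x, 12))
```

Taking $x = p$ is exactly the step used in the proof of Corollary 4.3.2: $p^m - 1 \mid p^n - 1$ if and only if $m \mid n$.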

Corollary 4.3.3*

Let α be a root of an irreducible polynomial f ∈ Fp [X] of degree m. Then, α ∈ Fpn if and only
if m | n.

Proof

$\mathbb{F}_p(\alpha)$ is a field with $p^m$ elements so it is $\mathbb{F}_{p^m}$. Thus, $\alpha \in \mathbb{F}_{p^n}$ if and only if $\mathbb{F}_{p^m} \subseteq \mathbb{F}_{p^n}$ which is equivalent to $m \mid n$ by Corollary 4.3.2.


In particular, from the uniqueness of finite fields, we deduce that any polynomial in $\mathbb{F}_p[X]$ of degree 2 splits in $\mathbb{F}_{p^2}$ which was not obvious at first. Over $\mathbb{Q}$ this is again completely false: $\sqrt{2} + \sqrt{3}$ has degree $4 \neq 2$.

Remark 4.3.1
The uniqueness of $\mathbb{F}_{p^2}$ can actually be seen quite easily: if $a$ and $b$ are quadratic non-residues in $\mathbb{F}_p$ then there is some $c \in \mathbb{F}_p$ such that $a = c^2 b$ (so $\sqrt{a} = c\sqrt{b}$) (see Section 4.5). The degree 2 case in general is a bit pathological because a polynomial is either irreducible or splits. If this example didn’t convince you, you can think about the fact that each polynomial of degree 3 has all its roots in $\mathbb{F}_{p^6}$ which is not obvious at all. Why would approximately $3p^3$ elements generate a field of cardinality only $p^6$ when one element (of degree 7) is sufficient to generate $p^7 - 1$ others?

Exercise 4.3.2∗. Let $f \in \mathbb{F}_p[X]$ be a polynomial of degree $n$. Prove that $f$ splits over $\mathbb{F}_{p^{n!}}$.

With this we can define the algebraic closure of $\mathbb{F}_p$, consisting of the elements algebraic over $\mathbb{F}_p$ (roots of polynomials with coefficients in $\mathbb{F}_p$). Here is how this is done: we pick a field with $p$ elements $\mathbb{F}_p$, a field with $p^2$ elements $\mathbb{F}_{p^2}$ which contains $\mathbb{F}_p$ (we can do this by relabelling the elements), a field with $p^6$ elements $\mathbb{F}_{p^6}$ which contains $\mathbb{F}_{p^2}$, etc. We thus get a chain of finite fields
$$\mathbb{F}_p \subseteq \mathbb{F}_{p^2} \subseteq \mathbb{F}_{p^6} \subseteq \ldots \subseteq \mathbb{F}_{p^{n!}} \subseteq \ldots$$
the union of which contains all finite fields since any $n$ divides $n!$ so $\mathbb{F}_{p^n} \subseteq \mathbb{F}_{p^{n!}}$. Thus, any non-constant polynomial $f \in \mathbb{F}_p[X]$ has a root in this union, and one can show that in fact all the roots must lie there, for instance using Exercise 4.3.2∗. This is the algebraic closure of $\mathbb{F}_p$.

Definition 4.3.1

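The algebraic closure of $\mathbb{F}_p$, $\overline{\mathbb{F}_p}$, is defined as the elements algebraic over $\mathbb{F}_p$, i.e. the union $\bigcup_{n=1}^{\infty} \mathbb{F}_{p^n}$.

Note that this union makes sense as if $\alpha \in \mathbb{F}_{p^n}$ and $\beta \in \mathbb{F}_{p^m}$ then $\alpha$ and $\beta$ are both in $\mathbb{F}_{p^{mn}}$ so their sum and product are well defined.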

Remark 4.3.2
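As said at the beginning of the chapter, $\overline{\mathbb{F}_p}$ is not $\overline{\mathbb{Z}}/p\overline{\mathbb{Z}}$, the ring of algebraic integers modulo $p$! Indeed, the latter is not a field since it’s not an integral domain: $\sqrt{p} \cdot \sqrt{p} \equiv 0$ but $\sqrt{p} \not\equiv 0$ (as $\frac{\sqrt{p}}{p} = \frac{1}{\sqrt{p}} \notin \overline{\mathbb{Z}}$)! It can even be shown that any polynomial $f \in \overline{\mathbb{Z}}/p\overline{\mathbb{Z}}[X]$ has infinitely many roots in $\overline{\mathbb{Z}}/p\overline{\mathbb{Z}}$ (see Exercise 4.6.17).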

Sometimes, when we want to evaluate a symmetric expression of algebraic numbers modulo $p$, it can be useful to replace these algebraic numbers by the corresponding elements of $\overline{\mathbb{F}_p}$ with the fundamental theorem of symmetric polynomials (analogous to Proposition 1.3.1) to use finite field theory. (Section 6.2 of Chapter 6 will show that any expression of algebraic numbers which is rational can be written as a symmetric expression of some algebraic numbers and their conjugates, implying that this replacement from $\overline{\mathbb{Q}}$ to $\overline{\mathbb{F}_p}$ can always be made (when $p$ doesn’t divide the denominator of the expression).)

Finally, we have one last result that again highlights how much better the situation is over Fp
compared to Q.

Theorem 4.3.1

Let $f \in \mathbb{F}_p[X]$ be an irreducible polynomial (in $\mathbb{F}_p[X]$) of degree $n$. Suppose $\alpha \in \mathbb{F}_{p^n}$ is one of its roots (it lies there by Corollary 4.3.3). Then, all its roots are $\alpha, \alpha^p, \ldots, \alpha^{p^{n-1}}$.

In other words, if we know a root of f , we know all of its roots and they are generated by the
Frobenius morphism! This is completely false over Q!

Proof
By Proposition 4.1.1, $f(X^{p^k}) = f(X)^{p^k}$ so $\alpha^{p^k}$ is always a root of $f$. In addition, these are all distinct as $\alpha^{p^i} = \alpha^{p^j}$ for some $i > j$ implies that
$$\alpha^{p^{i-j}} = \alpha$$
so $\alpha$ is fixed by $\mathrm{Frob}_p^k$ where $k = i - j < n$. Thus, $\alpha$ would be in $\mathbb{F}_{p^k}$ but this is impossible as $\mathbb{F}_p(\alpha) := \mathbb{F}_p + \alpha\mathbb{F}_p + \ldots + \alpha^{n-1}\mathbb{F}_p$ has $p^n$ elements while $\mathbb{F}_{p^k}$ has $p^k < p^n$ elements.
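
To see Theorem 4.3.1 in action, here is a small sketch in the field with 9 elements, modelled as $\mathbb{F}_3(i)$ with pairs `(a, b)` standing for $a + bi$. The polynomial $f = X^2 + X + 2$ is my own choice of example (it has no root among 0, 1, 2, so it is irreducible over $\mathbb{F}_3$), and all the function names are hypothetical helpers.

```python
from itertools import product

P = 3

def add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def mul(x, y):
    """Multiply a + b*i and c + d*i using i^2 = -1, coefficients modulo 3."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(x, n):
    result = (1, 0)
    for _ in range(n):
        result = mul(result, x)
    return result

def evaluate(coeffs, x):
    """Horner evaluation of sum(coeffs[j] * X^j) at x, with coeffs in F_3."""
    result = (0, 0)
    for c in reversed(coeffs):
        result = add(mul(result, x), (c % P, 0))
    return result

f = [2, 1, 1]  # X^2 + X + 2, listed from the constant coefficient upwards
elements = list(product(range(P), repeat=2))
roots = [x for x in elements if evaluate(f, x) == (0, 0)]

alpha = roots[0]
print("roots:", roots)
print("alpha, alpha^3:", alpha, power(alpha, 3))
```

The two roots found by exhaustive search should be exactly $\alpha$ and its image $\alpha^3$ under the Frobenius, and applying the Frobenius twice should bring us back to $\alpha$.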


This theorem will in particular allow us to determine how cyclotomic polynomials factorise in $\mathbb{F}_p$.

4.4 Cyclotomic Polynomials


Recall Theorem 3.3.1: if $a \in \mathbb{F}_p$ and $\Phi_n(a) = 0$, the order of $a$ is $\frac{n}{p^{v_p(n)}}$. This holds true over arbitrary finite fields too. We define the order of a non-zero element $\alpha \in \overline{\mathbb{F}_p}$ to be the smallest $k > 0$ such that $\alpha^k = 1$. Primitive $m$th roots over $\mathbb{F}_p$ are defined as elements of order $m$. This time however, there are no primitive $m$th roots when $p \mid m$: writing $m = m'p$, $\alpha^{m'p} = 1$ implies $(\alpha^{m'} - 1)^p = 0$ so $\alpha^{m'} = 1$. However, when $p \nmid m$, primitive $m$th roots always exist because $X^m - 1$ has distinct roots (its derivative $mX^{m-1}$ is non-zero and thus coprime with $X^m - 1$). Theorem 3.3.1 thus takes the following form.

Proposition 4.4.1

Let $n = p^k m \geq 1$ be an integer where $k = v_p(n)$. Then, over $\mathbb{F}_p$,
$$\Phi_n = \Phi_m^{\varphi(p^k)} = \prod_{\omega \in \overline{\mathbb{F}_p} \text{ primitive } m\text{th root}} (X - \omega)^{\varphi(p^k)}.$$

Exercise 4.4.1∗ . Prove Proposition 4.4.1.
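
The first equality of Proposition 4.4.1 can be tested on small cases by computing cyclotomic polynomials over $\mathbb{Z}$ and reducing the coefficients modulo $p$. The sketch below is an experiment, not a proof; polynomials are stored as coefficient sequences starting with the constant term, and every function name is my own. It computes $\Phi_n$ from $X^n - 1 = \prod_{d \mid n} \Phi_d$.

```python
from functools import lru_cache
from math import gcd

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def poly_divexact(f, g):
    """Quotient of f by a monic g over Z, assuming the division is exact."""
    f = list(f)
    q = [0] * (len(f) - len(g) + 1)
    for i in range(len(q) - 1, -1, -1):
        q[i] = f[i + len(g) - 1]
        for j, b in enumerate(g):
            f[i + j] -= q[i] * b
    return q

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n, obtained by dividing X^n - 1 by Phi_d for each proper divisor d."""
    num = [-1] + [0] * (n - 1) + [1]
    for d in range(1, n):
        if n % d == 0:
            num = poly_divexact(num, cyclotomic(d))
    return tuple(num)

def euler_phi(n):
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def check_proposition(n, p):
    """Compare Phi_n and Phi_m^phi(p^k) modulo p, where n = p^k * m and p does not divide m."""
    k, m = 0, n
    while m % p == 0:
        k, m = k + 1, m // p
    power = [1]
    for _ in range(euler_phi(p**k)):
        power = poly_mul(power, cyclotomic(m))
    return [c % p for c in cyclotomic(n)] == [c % p for c in power]

print(cyclotomic(12), check_proposition(12, 2), check_proposition(9, 3))
```

For example, with $n = 12$ and $p = 2$ this compares $X^4 - X^2 + 1$ with $(X^2 + X + 1)^2$ modulo 2.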


In particular, there always exists a primitive root of $\mathbb{F}_{p^n}$, i.e. an element $g$ of order $p^n - 1$ (which thus generates all the other ones): they are the roots of $\Phi_{p^n-1}$ (and these are all in $\mathbb{F}_{p^n}$ as $\Phi_{p^n-1} \mid X^{p^n} - X = \prod_{\alpha \in \mathbb{F}_{p^n}} (X - \alpha)$ by Theorem 4.2.1).
Exercise 4.4.2∗. Let $m$ be a positive integer such that $p \nmid m$. Prove that $\Phi_m$ has a root in $\mathbb{F}_{p^n}$ if and only if $m \mid p^n - 1$.
Here is an application of Proposition 4.4.1.

Problem 4.4.1 (Brazilian Mathematical Olympiad 2017 Problem 6)

Let $p \neq 3$ be a rational prime such that $p \mid a^3 - 3a + 1$, where $a$ is some rational integer. Prove that $p \equiv \pm 1 \pmod{9}$.

Solution

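Perform the substitution $a = \alpha + \frac{1}{\alpha}$ where $\alpha \in \mathbb{F}_{p^2}$. This is possible as the polynomial $X^2 - aX + 1$ has degree two and thus has its roots in $\mathbb{F}_{p^2}$. Then,
$$a^3 - 3a + 1 = \left(\alpha + \frac{1}{\alpha}\right)^3 - 3\left(\alpha + \frac{1}{\alpha}\right) + 1 = \alpha^3 + \frac{1}{\alpha^3} + 1.$$
Thus,
$$\Phi_9(\alpha) = \alpha^6 + \alpha^3 + 1 = 0.$$
We conclude that $\Phi_9$ has a root in $\mathbb{F}_{p^2}$ which means that $9 \mid p^2 - 1$ by Exercise 4.4.2∗. This is exactly equivalent to $p \equiv \pm 1 \pmod{9}$! □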

Exercise 4.4.3. Prove that p2 ≡ 1 (mod 9) if and only if p ≡ ±1 (mod 9).
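
The conclusion of Problem 4.4.1 is pleasant to observe on actual numbers. The following sketch (function names and the search range `bound` are my own choices) factors $a^3 - 3a + 1$ by trial division for $|a| \leq$ `bound` and collects any prime factor $p \neq 3$ that is not congruent to $\pm 1$ modulo 9; according to the problem this set should be empty.

```python
def prime_factors(n):
    """Set of prime divisors of |n|, by trial division (fine for small n)."""
    n = abs(n)
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def exceptional_primes(bound):
    """Primes p != 3 dividing some a^3 - 3a + 1 with |a| <= bound and p not ±1 mod 9."""
    return {
        p
        for a in range(-bound, bound + 1)
        for p in prime_factors(a**3 - 3 * a + 1)
        if p != 3 and p % 9 not in (1, 8)
    }

print(sorted(prime_factors(5**3 - 3 * 5 + 1)))  # factors of 111
print(exceptional_primes(200))
```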



Remark 4.4.1
This seems a bit miraculous and I don’t really have a good explanation for the motivation apart from "$p \equiv \pm 1 \pmod{9}$ is the same as $9 \mid p^2 - 1$ which makes us think of $\Phi_9$ in $\mathbb{F}_{p^2}$" or "$a = \alpha + \frac{1}{\alpha}$ makes things cancel well".

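We can in fact prove that the converse also holds: if $p \equiv \pm 1 \pmod{9}$, $X^3 - 3X + 1$ has a root in $\mathbb{F}_p$. We have seen that the roots of this polynomial have the form $\omega + \frac{1}{\omega}$ where $\omega \in \overline{\mathbb{F}_p}$ is a primitive 9th root of unity. It remains to check that this is indeed an element of $\mathbb{F}_p$: by Corollary 4.3.1 this amounts to checking that
$$\left(\omega + \frac{1}{\omega}\right)^p = \omega + \frac{1}{\omega}.$$
Since $p \equiv \pm 1 \pmod{9}$ and $\omega^9 = 1$, the LHS is
$$\omega^p + \frac{1}{\omega^p} = \omega^{\pm 1} + \omega^{\mp 1}$$
which is indeed equal to $\omega + \frac{1}{\omega}$.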

In fact, we can generalise this problem to find, for any $n$, a polynomial $\Psi_n$ which has a root in $\mathbb{F}_p$ if and only if $p \equiv \pm 1 \pmod{n}$ (with some possible exceptions if $p \mid n$). By adapting the previous solution, we wish to have
$$\Psi_n\left(X + \frac{1}{X}\right) = \frac{\Phi_n(X)}{X^{\varphi(n)/2}}$$
(for $n \geq 3$, the other cases are trivial). Such a polynomial indeed exists: the key point is that $\Phi_n(X)/X^{\varphi(n)/2}$ is symmetric in $X$ and $1/X$ (Exercise 3.1.6∗). Hence, by the fundamental theorem of symmetric polynomials, it is a polynomial in $X + \frac{1}{X}$ and $X \cdot \frac{1}{X}$, i.e. a polynomial in $X + \frac{1}{X}$.

Remark 4.4.2
One can also prove that there exists a polynomial Tn such that Tn (X + 1/X) = X n + 1/X n by
induction on n. This polynomial is called the nth Chebyshev polynomial .

Remark 4.4.3
The polynomial $\Psi_n$ we have constructed is in fact the minimal polynomial of $2\cos\frac{2\pi}{n}$, see Exercise 3.2.5. For the sake of consistency we can thus also artificially define $\Psi_1 = X - 2$ and $\Psi_2 = X + 2$ (they also satisfy Proposition 4.4.2).

Exercise 4.4.4. Compute Ψ1 , . . . , Ψ8 .

Proposition 4.4.2

Let $p \nmid n$ be a prime number. Then, $\Psi_n$ has a root in $\mathbb{F}_p$ if and only if $p \equiv \pm 1 \pmod{n}$.

Proof

By definition, the roots of $\Psi_n$ are $\omega + \frac{1}{\omega}$ where $\omega$ is a root of $\Phi_n$, i.e. an element of order $n$. Let’s see when this is in $\mathbb{F}_p$. We have
$$\left(\omega + \frac{1}{\omega}\right)^p = \omega^p + \frac{1}{\omega^p}.$$
Note that
$$X + 1/X - (\omega + 1/\omega) = (X - \omega)(X - 1/\omega)/X.$$
Thus, $\omega^p + \frac{1}{\omega^p} = \omega + \frac{1}{\omega}$ if and only if $\omega^p = \omega^{\pm 1}$. This is exactly equivalent to $p \equiv \pm 1 \pmod{n}$.
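
Proposition 4.4.2 can be tested on two explicit cases. In the sketch below, $\Psi_9 = X^3 - 3X + 1$ comes from Problem 4.4.1, while $\Psi_7 = X^3 + X^2 - 2X - 1$ is my own computation from the defining relation $\Psi_7(X + 1/X) = \Phi_7(X)/X^3$; the dictionary `PSI` and the helper names are assumptions made for this illustration. Coefficients are listed from the constant term upwards.

```python
PSI = {
    7: [-1, -2, 1, 1],  # X^3 + X^2 - 2X - 1
    9: [1, -3, 0, 1],   # X^3 - 3X + 1
}

def primes_below(limit):
    return [n for n in range(2, limit)
            if all(n % d for d in range(2, int(n**0.5) + 1))]

def has_root_mod(coeffs, p):
    """True if the polynomial has a root in F_p (exhaustive search)."""
    return any(
        sum(c * pow(x, j, p) for j, c in enumerate(coeffs)) % p == 0
        for x in range(p)
    )

for n, coeffs in PSI.items():
    for p in primes_below(60):
        if n % p != 0:  # the proposition assumes p does not divide n
            print(n, p, has_root_mod(coeffs, p), p % n in (1, n - 1))
```

In every printed line the two booleans should agree: a root exists modulo $p$ exactly when $p \equiv \pm 1 \pmod{n}$.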


From this we get the following corollary, similar to Exercise 3.3.8∗ .

Theorem 4.4.1

For any positive rational integer n, there exist infinitely many rational primes congruent to −1
modulo n.

Proof

We do a Euclid-type proof. Suppose that there are only finitely many such primes p1 , . . . , pk .
Let a be the constant coefficient of Ψn . We shall consider the polynomial f = Ψn (aX)/a, which
now has constant coefficient 1. Consider f (mnp1 · . . . · pk ) for some rational integer m. Since it
is congruent to 1 modulo np1 · . . . · pk , its prime factors are distinct from p1 , . . . , pk and the ones
dividing n. Thus, they must all be 1 modulo n by assumption. (If one doesn’t want to compute
Ψn (0), they can also see it as a corollary of Theorem 5.2.1.)

How do we reach a contradiction from this? The key point is to go into negatives: if we manage
to have Ψn (mnp1 · . . . · pk ) < 0, it will be congruent to −1 modulo n. Then, we can add a large
multiple of np1 · . . . · pk to get back into positives (since the leading coefficient of Ψn is 1) while
still being congruent to −1 modulo n. Thus, it must have a prime factor which isn’t congruent
to 1 modulo n and that will be the new prime factor we were looking for (and our contradiction).

For this, note that the complex roots of $\Psi_n$ are all real and distinct as they are $2\cos\frac{2k\pi}{n}$ for $\gcd(k, n) = 1$. In particular, there is some interval $[a, b]$ such that $\Psi_n$ is negative there. The key point now is to consider $m \in \mathbb{Q}$ instead of $m \in \mathbb{Z}$, but with some restrictions. We ask that the prime factors of its denominator are congruent to 1 modulo $n$ (and in particular distinct from $p_1, \ldots, p_k$ and the prime factors of $n$).

Consider such an $m$ satisfying this and also $r := mnp_1 \cdot \ldots \cdot p_k \in [a, b]$ (this is possible by Exercise 4.4.5∗). The prime factors of the numerator of $\Psi_n(r)$ are all congruent to 1 modulo $n$ by assumption. Indeed, they’re distinct from $p_1, \ldots, p_k$ and do not divide $n$ as
$$\Psi_n(r) \equiv \Psi_n(0) \equiv \pm 1 \pmod{np_1 \cdot \ldots \cdot p_k}.$$
Thus, either they divide the numerator of $m$ in which case they are congruent to 1 modulo $n$ by assumption, or $\Psi_n$ has a root in $\mathbb{F}_p$ which again means that $p \equiv 1 \pmod{n}$ (as it’s not congruent to $-1$).

This means that $\Psi_n(r) \equiv -1 \pmod{n}$ since it’s negative and all its prime factors are congruent to 1 modulo $n$. We have reached the conclusion we wanted: $\Psi_n(r + Nnp_1 \cdot \ldots \cdot p_k)$ will be positive but still congruent to $-1$ modulo $n$ for some large $N$ and will thus have a new prime factor congruent to $-1$ (distinct from the prime factors of the denominator of $r$ by assumption).


Exercise 4.4.5∗. Let $p \notin \{0, \pm 1\}$ be an integer. Prove that the numbers $m/p^k$ with $m \in \mathbb{Z}$ and $k \in \mathbb{N}$ are dense in $\mathbb{R}$.

Exercise 4.4.6∗ . Prove that the leading coefficient of Ψn is 1.

Finally, we discuss the factorisation of cyclotomic polynomials in Fp [X]. While they are irreducible
in Q[X], over Fp the situation is quite different.

Proposition 4.4.3

Let $n \geq 1$ be an integer and $p \nmid n$ a prime. Over $\mathbb{F}_p$, the $n$th cyclotomic polynomial $\Phi_n$ factorises as a product of $\frac{\varphi(n)}{k}$ irreducible polynomials of degree $k$, where $k$ is the order of $p$ modulo $n$. In particular, it stays irreducible if and only if $p$ is a primitive root modulo $n$.

Proof

It suffices to show that each irreducible factor has degree $k$. By Theorem 4.3.1, this is equivalent to $k$ being the smallest positive integer $\ell$ such that $\omega^{p^\ell} = \omega$ for any element $\omega$ of order $n$. This is very easy to show: $\omega^{p^\ell} = \omega$ if and only if $p^\ell \equiv 1 \pmod{n}$ since $\omega$ has order $n$ by definition. Thus, $\ell$ is the smallest positive integer such that $p^\ell \equiv 1 \pmod{n}$ which is, by definition, the order of $p$ modulo $n$.


As a perhaps surprising corollary, $\Phi_8 = X^4 + 1$ is irreducible in $\mathbb{Q}[X]$ but reducible in any $\mathbb{F}_p[X]$ as there is no primitive root modulo 8.
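
The shape of the factorisation predicted by Proposition 4.4.3 only depends on the order of $p$ modulo $n$, so it is easy to tabulate. The sketch below (all names are my own) computes the pair (number of factors, degree of each factor), and then confirms by brute force that $X^4 + 1$ has a monic quadratic factor modulo small primes. The brute-force test uses the remainder of $X^4 + 1$ modulo $X^2 + bX + c$, which a short computation gives as $(2bc - b^3)X + (c^2 - b^2c + 1)$.

```python
from math import gcd

def multiplicative_order(p, n):
    """Smallest k >= 1 with p^k ≡ 1 (mod n); requires gcd(p, n) = 1."""
    if gcd(p, n) != 1:
        raise ValueError("p must be coprime to n")
    if n == 1:
        return 1
    k, x = 1, p % n
    while x != 1:
        k, x = k + 1, (x * p) % n
    return k

def euler_phi(n):
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def factor_shape(n, p):
    """(number of irreducible factors of Phi_n over F_p, degree of each)."""
    k = multiplicative_order(p, n)
    return euler_phi(n) // k, k

def x4_plus_1_has_quadratic_factor(p):
    """Search for b, c in F_p such that X^2 + bX + c divides X^4 + 1."""
    return any(
        (2 * b * c - b**3) % p == 0 and (c * c - b * b * c + 1) % p == 0
        for b in range(p)
        for c in range(p)
    )

for p in [3, 5, 7, 11, 13, 17]:
    print(p, factor_shape(8, p), x4_plus_1_has_quadratic_factor(p))
```

Since every odd $p$ satisfies $p^2 \equiv 1 \pmod{8}$, the degree in `factor_shape(8, p)` is always 1 or 2, never 4.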

4.5 Quadratic Reciprocity


We are interested in knowing when an element a ∈ Fp is a perfect square, i.e. when there exists a
b ∈ Fp such that a = b2 . Equivalently, we want to know how the polynomial X 2 − a factorises in
Fp [X]. We thus make the following definitions.

Definition 4.5.1 (Quadratic Residues and Non-Residues)

Given a non-zero a ∈ Fp , we say a is a quadratic residue if it is a square in Fp . Otherwise, we


say it is a quadratic non-residue.

Note that 0 is not a quadratic residue nor a non-residue (it’s "zero"). The reason for this definition
will become clear shortly. Using primitive roots, one can easily prove the following criterion.

Proposition 4.5.1 (Euler’s Criterion)


An element $a \in \mathbb{F}_p$ is a quadratic residue if and only if $a^{\frac{p-1}{2}} = 1$. Similarly, $a$ is a quadratic non-residue if and only if $a^{\frac{p-1}{2}} = -1$.

Exercise 4.5.1∗ . Prove Proposition 4.5.1.
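
Euler's criterion is easy to confirm experimentally for small odd primes. The sketch below (the function name is my own) lists the non-zero squares modulo $p$ directly and compares with the value of $a^{(p-1)/2}$ computed by Python's built-in three-argument `pow`; the residue $p - 1$ plays the role of $-1$.

```python
def euler_criterion_holds(p):
    """Check a^((p-1)/2) = 1 for squares and = -1 for non-squares, over all a in F_p^x."""
    squares = {(b * b) % p for b in range(1, p)}
    for a in range(1, p):
        value = pow(a, (p - 1) // 2, p)
        expected = 1 if a in squares else p - 1
        if value != expected:
            return False
    return True

for p in [3, 5, 7, 11, 13, 17, 19, 23]:
    print(p, euler_criterion_holds(p))
```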

Based on this result, we make the following definition.

Definition 4.5.2 (Legendre Symbol)


 
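Let $p$ be an odd rational prime. Given an $a \in \mathbb{F}_p$, we define the Legendre symbol of $a$, $\left(\frac{a}{p}\right)$, to be the integer among $\{-1, 0, 1\}$ which is congruent to $a^{\frac{p-1}{2}}$. We also define $\left(\frac{1}{2}\right) = 1$ and $\left(\frac{0}{2}\right) = 0$.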

We could have also defined the Legendre symbol before stating Euler’s criterion (0 if a = 0, 1 if a
is a quadratic residue, −1 otherwise) but one very nice property of this object is that it’s multiplicative
(by Euler’s criterion).
Another way of thinking about the Legendre symbol is that $1 + \left(\frac{a}{p}\right)$ counts (without multiplicity)
the number of roots of $X^2 - a$ in Fp: it’s 1 + 0 = 1 when a = 0, 1 + 1 = 2 when a is a quadratic residue,
and 1 − 1 = 0 when a is a quadratic non-residue.

Let’s first analyze $\left(\frac{-1}{p}\right)$.

Theorem 4.5.1 (First Supplement of the Quadratic Reciprocity Law)

Let p be an odd prime. Then, $\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}$.

Proof

The polynomial X 2 − (−1) = Φ4 has a root in Fp if and only if 4 | p − 1 which is exactly what
we wanted to show.


We now state the quadratic reciprocity law. So far, we have only studied relations between finite
fields of fixed characteristic p. This result provides a very beautiful link between the structure of Fp
and Fq for distinct primes p and q.

Theorem 4.5.2 (Quadratic Reciprocity Law)

Let p and q be distinct odd rational primes. Then,
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$

Remark 4.5.1
Technically, this statement doesn’t make sense because $\left(\frac{p}{q}\right)$ is defined for p ∈ Fq and not p ∈ Z,
and $\left(\frac{q}{p}\right)$ is defined for q ∈ Fp and not q ∈ Z. This is of course very easy to fix: we define $\left(\frac{a}{p}\right)$
for a ∈ Z as $\left(\frac{a \pmod p}{p}\right)$. We will make many such identifications throughout this book.

Theorem 4.5.3 (Second Supplement of the Quadratic Reciprocity Law)

Let p be an odd prime. Then, $\left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}$.

Combined with its second supplement, the quadratic reciprocity law allows us to compute (more or less
efficiently) arbitrary Legendre symbols, since the Legendre symbol is multiplicative. Indeed, to compute
$\left(\frac{a}{p}\right)$ we can suppose a ∈ [p], then consider its prime factorisation a = q1 · . . . · qn and use the quadratic
reciprocity law and its second supplement to reduce the computation of $\left(\frac{a}{p}\right)$ to that of the $\left(\frac{p}{q_k}\right)$, where qk < p,
and repeat the process sufficiently many times.

Exercise 4.5.2. Compute $\left(\frac{77}{101}\right)$.
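
The reduction procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the names `prime_factors`, `legendre` and `legendre_euler` are hypothetical, the factorisation is done by naive trial division, and the result is compared with Euler's criterion as an independent check.

```python
def prime_factors(a):
    """Prime factors of a >= 1 (with repetition), by trial division."""
    factors, d = [], 2
    while d * d <= a:
        while a % d == 0:
            factors.append(d)
            a //= d
        d += 1
    if a > 1:
        factors.append(a)
    return factors


def legendre(a, p):
    """Legendre symbol (a/p), p an odd prime, via multiplicativity and reciprocity."""
    a %= p
    if a == 0:
        return 0
    result = 1
    for q in prime_factors(a):
        if q == 2:
            # Second supplement: (2/p) = 1 iff p = +-1 (mod 8).
            result *= 1 if p % 8 in (1, 7) else -1
        else:
            # Reciprocity: (q/p) = (p/q) * (-1)^((p-1)/2 * (q-1)/2), with q < p.
            sign = -1 if (p % 4 == 3 and q % 4 == 3) else 1
            result *= sign * legendre(p, q)
    return result


def legendre_euler(a, p):
    """Legendre symbol computed directly from Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return r - p if r == p - 1 else r


agree = all(
    legendre(a, p) == legendre_euler(a, p)
    for p in [3, 5, 7, 11, 13, 101]
    for a in range(p)
)
print(agree)
```

The recursion terminates because every recursive call replaces the modulus p by a strictly smaller odd prime q.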

In fact, we have already proven the second supplement with our Proposition 4.4.2.

Proof of the Second Supplement

Notice that $\Psi_8 = X^2 - 2$. But, by Proposition 4.4.2, this polynomial has a root in Fp if and only
if p ≡ ±1 (mod 8) which is exactly equivalent to $(-1)^{\frac{p^2-1}{8}} = 1$.


Exercise 4.5.3. Prove that $\Psi_8 = X^2 - 2$ and that $(-1)^{\frac{p^2-1}{8}} = 1$ if and only if p ≡ ±1 (mod 8).

Proof of the Quadratic Reciprocity Law

We shall make an ingenious use of the Frobenius morphism of $\overline{\mathbb{Z}}$ (mod p). Let $\omega \in \overline{\mathbb{Z}}$ be a
primitive qth root of unity.

Define the ℓth quadratic Gauss sum
$$g_\ell = \sum_{k \in \mathbb{F}_q} \left(\frac{k}{q}\right)\omega^{k\ell} = \left(\frac{\ell}{q}\right) g$$
for ℓ ∈ Fq where g = g1. We will prove that $g^2 = \left(\frac{-1}{q}\right) q$. Note that we can already know, prior
to the computation, that $g^2$ is a rational integer by Exercise 4.5.5∗ .

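Since we wish to compute $g^2$, we expand $g^2$:
$$g^2 = \left(\sum_{i \in \mathbb{F}_q} \left(\frac{i}{q}\right)\omega^{i}\right)\left(\sum_{j \in \mathbb{F}_q} \left(\frac{j}{q}\right)\omega^{j}\right) = \sum_{i,j \in \mathbb{F}_q} \left(\frac{ij}{q}\right)\omega^{i+j}.$$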

Now we use a well-known trick: the unity root filter we encountered in Exercise A.3.9† . The
idea is that, when we sum $\omega^n$ for some fixed n over the other qth roots of unity raised to the
nth power, i.e. consider $\sum_{k \in \mathbb{F}_q} \omega^{kn}$, we get massive simplification. Hence, consider the sum
$\delta(n) := \sum_{k \in \mathbb{F}_q} \omega^{kn}$ for n ∈ Fq. When n = 0 (in Fq), this sum is q, otherwise it’s
$$\frac{\omega^{qn} - 1}{\omega^{n} - 1} = 0.$$
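We can now finish our computation of $g^2$:
$$\begin{aligned}
(q-1)g^2 &= \sum_{\ell \in \mathbb{F}_q} g_\ell^2 \\
&= \sum_{\ell \in \mathbb{F}_q} \sum_{i,j \in \mathbb{F}_q} \left(\frac{ij}{q}\right)\omega^{(i+j)\ell} \\
&= \sum_{i,j \in \mathbb{F}_q} \left(\frac{ij}{q}\right) \sum_{\ell \in \mathbb{F}_q} \omega^{(i+j)\ell} \\
&= \sum_{i,j \in \mathbb{F}_q} \left(\frac{ij}{q}\right)\delta(i+j) \\
&= \sum_{i \in \mathbb{F}_q} \left(\frac{-i^2}{q}\right) q \\
&= \left(\frac{-1}{q}\right) q\,(q-1).
\end{aligned}$$
Hence, $g^2 = \left(\frac{-1}{q}\right) q$ as wanted.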

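On the one hand,
$$g^p = g\,(g^2)^{\frac{p-1}{2}} = g\,q^{\frac{p-1}{2}}(-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}} \equiv g\left(\frac{q}{p}\right)(-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}} \pmod p$$
by the previous computation. On the other hand, $g^p \equiv g_p = g\left(\frac{p}{q}\right) \pmod p$ by Frobenius.
Therefore,
$$g\left(\frac{q}{p}\right)(-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}} \equiv g\left(\frac{p}{q}\right) \pmod p.$$
We want to divide both sides by g but we don’t know if g is invertible (actually we do from
Exercise 1.5.24† ). Instead, we multiply both sides by g to transform g into $g^2 = \left(\frac{-1}{q}\right) q$ which is
indeed invertible modulo p as p ≠ q. Finally, we get
$$(-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}\left(\frac{q}{p}\right) \equiv \left(\frac{p}{q}\right) \pmod p.$$

Since both sides are ±1, they must be equal which is exactly what we wanted to prove.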
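
The key identity $g^2 = \left(\frac{-1}{q}\right) q$ of this proof can be checked numerically. The sketch below is an illustration only: it assumes the complex root of unity $\omega = e^{2\pi i/q}$ as the primitive qth root (the proof allows any choice), uses floating-point arithmetic, and the names `legendre` and `gauss_sum` are hypothetical.

```python
import cmath


def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q, via Euler's criterion."""
    r = pow(a % q, (q - 1) // 2, q)
    return r - q if r == q - 1 else r


def gauss_sum(q, ell=1):
    """Quadratic Gauss sum g_ell built from omega = exp(2*pi*i/q)."""
    omega = cmath.exp(2j * cmath.pi / q)
    return sum(legendre(k, q) * omega ** (k * ell) for k in range(q))


errors = {}
for q in [3, 5, 7, 11, 13]:
    g = gauss_sum(q)
    expected = legendre(-1, q) * q
    errors[q] = abs(g * g - expected)

print(errors)  # each entry should be at the level of rounding error
```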


Exercise 4.5.4∗ . Prove that, for any ℓ ∈ Fq, $g_\ell = \left(\frac{\ell}{q}\right) g$.

Exercise 4.5.5∗ . Prove without computing g 2 that g has exactly 2 conjugates, i.e. is a quadratic number.

4.6 Exercises
Dirichlet Convolutions
Exercise 4.6.1† (Dirichlet Convolution). A function f from N∗ to C is said to be an arithmetic
function. Define the Dirichlet convolution⁵ f ∗ g of two arithmetic functions f and g as
$$n \mapsto \sum_{d \mid n} f(d)\,g(n/d) = \sum_{ab = n} f(a)\,g(b).$$

Prove that the Dirichlet convolution is associative. In addition, prove that if f and g are multiplicative⁶,
meaning that f(mn) = f(m)f(n) and g(mn) = g(m)g(n) for all coprime m, n ∈ N, then so is f ∗ g.

Exercise 4.6.2† (Möbius Inversion). Define the Möbius function µ : Z≥1 → {−1, 0, 1} by µ(n) =
$(-1)^k$ where k is the number of prime factors of n if n is squarefree, and µ(n) = 0 otherwise. Define
also δ as the function mapping 1 to 1 and everything else to 0. Prove that δ is the identity element for
the Dirichlet convolution: f ∗ δ = δ ∗ f = f for all arithmetic functions f. In addition, prove that µ is
the inverse of 1 for the Dirichlet convolution, meaning that µ ∗ 1 = 1 ∗ µ = δ where 1 is the function
n ↦ 1.⁷
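
The definitions of the two exercises above are easy to experiment with. The following Python sketch implements the Dirichlet convolution naively and checks µ ∗ 1 = δ on the first few integers; the names `divisors`, `convolve`, `mobius`, `one` and `delta` are illustrative, and a finite check is of course not a proof.

```python
def divisors(n):
    """All positive divisors of n (naive)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def convolve(f, g):
    """Dirichlet convolution f * g as a new arithmetic function."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))


def mobius(n):
    """Moebius function, computed by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # n has a square factor
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result


def one(n):
    return 1


def delta(n):
    return 1 if n == 1 else 0


mu_star_one = convolve(mobius, one)
values = [mu_star_one(n) for n in range(1, 31)]
print(values)
```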

Exercise 4.6.3† (Prime Number Theorem in Function Fields). Prove that the number of monic irreducible
polynomials in Fp[X] of degree n is
$$N_n = \frac{1}{n} \sum_{d \mid n} \mu\left(\frac{n}{d}\right) p^d$$
and show that this is asymptotically equivalent to $\frac{p^n}{\log_p(p^n)}$.
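
The formula can be compared with a brute-force count for p = 2. In the sketch below, polynomials over F2 are encoded as bitmasks (bit i is the coefficient of X^i), which is an implementation choice of this example; the names `mobius`, `formula_count`, `clmul` and `count_irreducibles_f2` are hypothetical.

```python
def mobius(n):
    """Moebius function, computed by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result


def formula_count(p, n):
    """(1/n) * sum over d | n of mu(n/d) * p^d."""
    total = sum(mobius(n // d) * p ** d for d in range(1, n + 1) if n % d == 0)
    return total // n


def clmul(a, b):
    """Product in F_2[X] of two polynomials encoded as bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def count_irreducibles_f2(max_deg):
    """Number of irreducible polynomials over F_2 of each degree 1..max_deg."""
    limit = 1 << (max_deg + 1)
    reducible = set()
    for a in range(2, limit):          # deg a >= 1
        for b in range(a, limit):      # deg b >= deg a
            if a.bit_length() + b.bit_length() - 2 > max_deg:
                break
            reducible.add(clmul(a, b))
    counts = {}
    for f in range(2, limit):
        if f not in reducible:
            deg = f.bit_length() - 1
            counts[deg] = counts.get(deg, 0) + 1
    return counts


brute = count_irreducibles_f2(6)
formula = {n: formula_count(2, n) for n in range(1, 7)}
print(brute, formula)
```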
5 The Dirichlet convolution appears naturally in the study of Dirichlet series: the product of two Dirichlet series
$\sum_{n=1}^{\infty} \frac{f(n)}{n^s}$ and $\sum_{n=1}^{\infty} \frac{g(n)}{n^s}$ is the Dirichlet series $\sum_{n=1}^{\infty} \frac{(f*g)(n)}{n^s}$ corresponding to the convolution of the coefficients.
6 This terminology has conflicting meanings: in algebra, it means that f (xy) = f (x)f (y) for all x, y, while for arithmetic

functions, it only means that f (xy) = f (x)f (y) for coprime x, y.


7 This also explains how we found the formula for Φn from Exercise 3.5.19.

Linear Recurrences
Exercise 4.6.4† (China TST 2008). Define the sequence (xn)n≥1 by x1 = 2, x2 = 12 and $x_{n+2} =
6x_{n+1} - x_n$ for n ≥ 1. Suppose p and q are rational primes such that q | xp. Prove that, if q ≠ 2, 3,
then q ≥ 2p − 1.

Exercise 4.6.5 (Korean Mathematical Olympiad 2013 Final Round). Let a and b be two coprime
positive rational integers. Define the sequences (an)n≥0 and (bn)n≥0 by
$$(a + b\sqrt{2})^{2n} = a_n + b_n\sqrt{2}$$

for n ≥ 0. Find all rational primes p for which there is some positive rational integer n ≤ p such that
p | bn.
 
Exercise 4.6.6† . Let p ≠ 2, 5 be a prime number. Prove that $p \mid F_{p-\varepsilon}$ where $\varepsilon = \left(\frac{p}{5}\right)$.

Exercise 4.6.7† . Let p ≠ 2, 5 be a rational prime. Prove that $p \mid F_p - \left(\frac{5}{p}\right)$.

Exercise 4.6.8† . Let m ≥ 1 be an integer and p a rational prime. Find the maximal possible period
modulo p ≥ m of a sequence satisfying a linear recurrence of order m.

Exercise 4.6.9† . Let f ∈ Z[X] be a polynomial and (an)n≥0 be a linear recurrence of rational integers.
Suppose that f(n) | an for any rational integer n ≥ 0. Prove that $\left(\frac{a_n}{f(n)}\right)_{n \ge 0}$ is also a linear recurrence.⁸

Polynomials and Elements of Fp


Exercise 4.6.10. Suppose f ∈ Fp [X] is such that f | X n − 1 implies n > pdeg f . Prove that f is
irreducible in Fp [X].
Exercise 4.6.11† . Let a ∈ Fp be non-zero. Prove that $X^{p^n} - X - a$ is irreducible over Fp if and only
if n = 1, or n = p = 2.

Exercise 4.6.12† (ISL 2003). Let (an)n≥0 be a sequence of rational integers such that $a_{n+1} = a_n^2 - 2$.
Suppose an odd rational prime p divides an. Prove that $p \equiv \pm 1 \pmod{2^{n+2}}$.

Exercise 4.6.13. Let f ∈ Fp[X] be a polynomial. Prove that f has a double root in $\overline{\mathbb{F}}_p$ if and only if
its discriminant is zero.

Exercise 4.6.14† . Let f ∈ Fp [X] be an irreducible polynomial of odd degree. Prove that its discrim-
inant is a square in Fp .

Exercise 4.6.15† (Chevalley-Warning Theorem). Let f1 , . . . , fm ∈ Fpk [X1 , . . . , Xn ] be polynomials


such that d1 + . . . + dm < n, where di is the degree of fi . Prove that, if f1 , . . . , fm have a common
root in Fpk , then they have another one.

Exercise 4.6.16. Prove that
$$\overline{\mathbb{F}}_p = \mathbb{F}_p(\{\omega \in \overline{\mathbb{F}}_p \mid \omega \text{ has prime order}\}).$$

Exercise 4.6.17. Prove that any polynomial f ∈ Z/pZ[X] has infinitely many roots in Z/pZ.

Exercise 4.6.18 (Miklós Schweitzer 2018). Suppose X 4 + X 3 + 2X 2 − 4X + 3 has a root in Fp . Prove


that p is a fourth power modulo 13.
8 In fact, the Hadamard quotient theorem states that if a linear recurrence bn always divides another linear recurrence
an then $\left(\frac{a_n}{b_n}\right)_n$ is also a linear recurrence.

Squares and the Law of Quadratic Reciprocity


Exercise 4.6.19† . Let q be a prime power, $a \in \mathbb{F}_q^{\times}$ and m ≥ 1 an integer. Prove that a is an mth
power in Fq if and only if $a^{\frac{q-1}{\gcd(q-1,m)}} = 1$.
Exercise 4.6.20† . Let a be a rational integer. Suppose a is a quadratic residue modulo every rational
prime p ∤ a. Prove that a is a perfect square.
Exercise 4.6.21† . Prove that 16 is an eighth power modulo every prime but not an eighth power in
Q.
Exercise 4.6.22† . Prove that, if a polynomial f ∈ Z[X] of degree 2 has a root in Fp for any rational
prime p, then it has a rational root. However, show that there exist polynomials of degree 5 and 6
that have a root in Fp for every prime p but no rational root.⁹
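Exercise 4.6.23† (Jacobi Reciprocity). Define the Jacobi symbol $\left(\frac{\cdot}{n}\right)$ of an odd positive integer n as
the product
$$\left(\frac{\cdot}{p_1}\right)^{n_1} \cdot \ldots \cdot \left(\frac{\cdot}{p_k}\right)^{n_k}$$
where $n = p_1^{n_1} \cdot \ldots \cdot p_k^{n_k}$ is the prime factorisation of n. Prove the following statements: for any odd
m, n
• $\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2}\cdot\frac{n-1}{2}}$.
• $\left(\frac{-1}{m}\right) = (-1)^{\frac{m-1}{2}}$.
• $\left(\frac{2}{m}\right) = (-1)^{\frac{m^2-1}{8}}$.

(The Jacobi symbol $\left(\frac{m}{n}\right)$ is 1 if m is a quadratic residue modulo n but may also be 1 if m isn’t.)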
Exercise 4.6.24† . Suppose a1, . . . , an are distinct squarefree rational integers such that
$$\sum_{i=1}^{n} b_i \sqrt{a_i} = 0$$

for some rational numbers b1, . . . , bn. Prove that b1 = . . . = bn = 0.


Exercise 4.6.25† . Let n ≥ 2 be an integer and p a prime factor of $2^{2^n} + 1$. Prove that
$p \equiv 1 \pmod{2^{n+2}}$.
Exercise 4.6.26† (USA TST 2014). Find all functions f : Z → Z such that (m − n)(f (m) − f (n)) is
a perfect square for all m, n ∈ Z.

Sums and Products


Exercise 4.6.27† (Tuymaada 2012). Let p be an odd prime. Prove that
$$\frac{1}{0^2+1} + \frac{1}{1^2+1} + \ldots + \frac{1}{(p-1)^2+1} \equiv \frac{(-1)^{\frac{p+1}{2}}}{2} \pmod p$$
where the sum is taken over the k for which $k^2 + 1 \not\equiv 0$.
Exercise 4.6.28. How many pairs (x, y) of elements of Fp are there such that x2 + y 2 = 1?
Exercise 4.6.29 (USAMO 2020). What is the product of the elements a of Fp such that both a and
4 − a are quadratic non-residues?
Exercise 4.6.30† . Let n ≥ 1 be an integer. Prove that, for any rational prime p,
$$\prod_{k=1}^{p-1} \Phi_n(k) \equiv \Phi_{n/\gcd(n,p-1)}(1)^{\frac{\varphi(n)}{\varphi(n/\gcd(n,p-1))}} \pmod p.$$

9 The Chebotarev density theorem implies that such a polynomial must be reducible. In fact it even characterises

polynomials which have a root in Fp for every rational prime p based on the Galois groups of their splitting field (see
Chapter 6). In particular, it shows that 5 and 6 are minimal.

Miscellaneous
Exercise 4.6.31. Compute Ψn (0) for n ≥ 1.

Exercise 4.6.32† (Lucas’s Theorem). Let p be a prime number and
$$n = p^m n_m + \ldots + p\,n_1 + n_0$$
and
$$k = p^m k_m + \ldots + p\,k_1 + k_0$$
be the base p expansion of rational integers k, n ≥ 0 (ni and ki can be zero). Prove that
$$\binom{n}{k} \equiv \prod_{i=0}^{m} \binom{n_i}{k_i} \pmod p.$$

Exercise 4.6.33† (Carmichael’s Theorem). Let a, b be two coprime integers such that $a^2 - 4b > 0$,
and let (un)n≥0 denote the linear recurrence defined by u0 = 0, u1 = 1, and
$$u_{n+2} = a u_{n+1} - b u_n.$$

Prove that for n ≠ 1, 2, 6, un always has a primitive prime factor, except when n = 12 and a = ±1, b = −1
(corresponding to the Fibonacci sequence).

Exercise 4.6.34† . Suppose p ≡ 2 or p ≡ 5 (mod 9) is a rational prime. Prove that the equation
$$\alpha^3 + \beta^3 + \varepsilon a \gamma^3 = 0$$

where ε ∈ Z[j] is a unit and $2 \ne a \in \{p, p^2\}$ does not have solutions in Z[j].

Exercise 4.6.35† (Class Equation of a Group Action and Wedderburn’s Theorem). Let G be a finite
group, S a finite set, and · a group action of G on S.¹⁰ Given an element s ∈ S, let Stab(s) and Fix(G)
denote the set of elements of G fixing s and the set of elements of S fixed by all of G respectively. Finally,
let Oi = Gsi be the (disjoint) orbits of size greater than 1. Prove the class equation:
$$|S| = |\mathrm{Fix}(G)| + \sum_{|O_i| > 1} \frac{|G|}{|\mathrm{Stab}(s_i)|}.$$

Deduce Wedderburn’s theorem: any finite skew field is a field.


Exercise 4.6.36 (USA TSTST 2016). Does there exist a non-constant polynomial f ∈ Z[X] such
that, for any rational integer n > 2,

f (Z/nZ) := {f (0), . . . , f (n − 1)} (mod n)

has cardinality at most 0.499n?

10 In other words, a map · : G × S → S such that e · s = s and (gh) · s = g · (h · s) for any g, h ∈ G and s ∈ S. See also

Exercise A.3.20† .
Chapter 5

Polynomial Number Theory

Prerequisites for this chapter: Section A.1.

Algebraic number theory is deeply linked with polynomials (already by definition!). Here we study
some arithmetic properties of polynomials with rational coefficients.

5.1 Factorisation of Polynomials


We have already mentioned factorisation of polynomials as a unique product of irreducible polynomials
in Chapter 2 (in an abstract context) and Chapter 4 (for Fp [X]) but we restate the main results here
since they are fundamental.

Theorem 5.1.1 (Factorisation in Irreducible Polynomials in Q [X])

Any non-zero polynomial f ∈ Q[X] has a unique factorisation as a constant times a product of
monic irreducible polynomials.

Proof

Q is a field so Q[X] is Euclidean (for the degree map) (see Proposition A.1.1) which means it’s a
UFD by Proposition 2.2.1 and Theorem 2.2.1. To finish, any irreducible polynomial has a unique
monic associate so just use them in the factorisation and collect the leading coefficient in the
beginning.


Since we deal with arithmetic properties of polynomials, we are interested in factorising polynomials
over Z[X]. However Z is not a field anymore, so is Z[X] really a UFD? Gauss’s lemma shows that as
long as R is a UFD, R[X] also is one. Before proving this however, we expand a bit on irreducible
polynomials in Z[X]. The polynomial 2X is irreducible in Q[X] (if 2X = fg then either f or g is
constant and non-zero, i.e. a unit of Q[X]) but not anymore in Z[X]. Indeed, it factorises as 2 · X and
2 is not a unit anymore (1/2 ∉ Z[X]).

We are thus led to make the following definition.

Definition 5.1.1 (Primitive Polynomial)

We say a polynomial f ∈ Z[X] is primitive if the gcd of its coefficients is 1.


For instance, the only constant primitive polynomials are 1 and −1.

Theorem 5.1.2 (Gauss’s Lemma)

The product of two primitive polynomials is primitive.

Proof

Suppose f and g are primitive but f g isn’t. Let p be a prime dividing all coefficients of f g, i.e.
f g ≡ 0 (mod p). Since Fp [X] is an integral domain, this means f ≡ 0 (mod p) or g ≡ 0 (mod p)
which is impossible as they are primitive.


We can also state Gauss’s lemma with the notion of content. The primitive polynomials are
polynomials of content 1.

Definition 5.1.2 (Content)

The content of a polynomial f ∈ Z[X], c(f), is defined as the gcd of the coefficients of f. For
f ∈ Q[X], it is c(N f)/|N| where 0 ≠ N ∈ Z is such that N f ∈ Z[X].

Exercise 5.1.1∗ . Prove that the content is well-defined: c(N f)/|N| = c(M f)/|M| for any non-zero M, N ∈ Z
such that N f, M f ∈ Z[X].

Proposition 5.1.1 (Equivalent Form of Gauss’s Lemma)*

The content is completely multiplicative, i.e. c(fg) = c(f)c(g) for any f, g ∈ Q[X].

Proof

Without loss of generality, we may assume f, g ∈ Z[X] as c(N f) = |N|c(f) for any non-zero N ∈ Z
(Exercise 5.1.1∗ ). Then, f/c(f) and g/c(g) are primitive so $\frac{fg}{c(f)c(g)}$ is too by Theorem 5.1.2.
Accordingly,
$$c(fg) = c(f)c(g)\,c\!\left(\frac{fg}{c(f)c(g)}\right) = c(f)c(g).$$
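
To see the multiplicativity of the content on a concrete case, here is a small Python sketch for integer polynomials. Coefficient lists are stored lowest degree first, which is a convention of this example, and the names `content` and `multiply` are illustrative.

```python
from functools import reduce
from math import gcd


def content(coeffs):
    """Content of a non-zero integer polynomial: gcd of its coefficients."""
    return reduce(gcd, (abs(c) for c in coeffs))


def multiply(f, g):
    """Product of two polynomials given as coefficient lists."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product


f = [4, 10, 6]   # 6X^2 + 10X + 4, content 2
g = [15, 9]      # 9X + 15, content 3
fg = multiply(f, g)
print(fg, content(f), content(g), content(fg))
```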


Corollary 5.1.1 (Irreducible Polynomials in Z [X])*

A polynomial f ∈ Z[X] is irreducible in Z[X] if and only if it is primitive and irreducible in Q[X].

Proof

Clearly, if f is primitive but reducible in Z[X], it is reducible in Q[X]. Thus, it suffices to show
that a primitive polynomial which is reducible in Q[X] also is reducible in Z[X]. Suppose f = gh
with g, h ∈ Q[X] non-constant. By multiplicativity of the content, c(g)c(h) = c(f) = 1, so we also have

f = (g/c(g))(h/c(h))

which is a factorisation in Z[X] as wanted (by Exercise 5.1.2∗ ).




Exercise 5.1.2∗ . Suppose f ∈ Q[X] has integral content. Prove that f has integer coefficients.

In fact, we even have the following more general result.

Proposition 5.1.2

Suppose f, g ∈ Z[X] are polynomials such that f divides g in Q[X]. Then, f ∗ divides g in Z[X],
where f ∗ = f /c(f ) is the primitive part of f .

Exercise 5.1.3∗ . Prove Proposition 5.1.2.

We finally get our factorisation in Z[X].

Corollary 5.1.2 (Factorisation in Irreducible Polynomials in Z [X])*

Any non-zero polynomial f ∈ Z[X] has a unique factorisation as a constant times a product
of non-constant primitive irreducible polynomials with positive leading coefficient. Equivalently,
Z[X] is a UFD.

Exercise 5.1.4∗ . Prove Corollary 5.1.2.

As another corollary of Proposition 5.1.2, we get a new proof of Proposition 1.2.2, asserting that
the minimal polynomial of an algebraic integer has integer coefficients, which uses neither the fact that
rational integers are the only rational algebraic integers nor the fact that Z is closed under addition
and multiplication.

Corollary 5.1.3

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Then, πα ∈ Z[X] if and only if $\alpha \in \overline{\mathbb{Z}}$.

Proof

It is clear that if πα ∈ Z[X] then $\alpha \in \overline{\mathbb{Z}}$, thus suppose that $\alpha \in \overline{\mathbb{Z}}$. Let f ∈ Z[X] be a monic
polynomial vanishing at α. Then, $\pi_\alpha^*$ divides f in Z[X] by Proposition 5.1.2 so the leading
coefficient of $\pi_\alpha^*$ is ±1 since it divides the leading coefficient of f which is 1. Finally, we have
$\pi_\alpha^* = \pi_\alpha$ which means that it has integer coefficients as wanted.


Because of these results, from now on we will say "irreducible" to mean "irreducible in Q[X]" and
"primitive and irreducible" to mean "irreducible in Z[X]", unless otherwise specified. By default, f | g
means that f divides g in Q[X] and we will specify if it’s true in Z[X] too when needed. If necessary,
we will use |Q[X] for divisibility in Q[X] and |Z[X] for divisibility in Z[X].

Before discussing another result, we will say one last thing on Gauss’s lemma. One can see that its
proof only uses the fact that Z is a UFD (see Chapter 2). Thus, we could restate it in the following
form.

Proposition 5.1.3 (Gauss’s Lemma)

Suppose that a ring R is a UFD. Then, R[X] is also one.

It can also be seen from our proof of Corollary 5.1.2 that the primes of R[X] are either primes of R
or primitive irreducible polynomials in Frac R[X]. A very important consequence is that, by induction,
R[X1 , . . . , Xn ] is a UFD when R is, and in particular K[X1 , . . . , Xn ] is a UFD for every field K. In
other words, we still have factorisation in irreducible polynomials with more variables.

Corollary 5.1.4

For any field K and any integer n ≥ 1, K[X1 , . . . , Xn ] is a UFD.

We end this section with a classical criterion for proving certain polynomials are irreducible.

Proposition 5.1.4 (Eisenstein’s Criterion)

Suppose p is a rational prime and $f = a_n X^n + \ldots + a_0$ ∈ Z[X] is a polynomial such that p ∤ an,
p | an−1, . . . , a0 and $p^2 \nmid a_0$. Then f is irreducible (in Q[X]).

Proof

Suppose f = gh where g, h ∈ Z[X] are non-constant. Then, modulo p, $gh \equiv a_n X^n$ so, by unique
factorisation in Fp[X], $g \equiv bX^i$ and $h \equiv cX^j$ for some i, j ≥ 0 and some non-zero b, c ∈ Fp. Moreover,
we must have deg(g (mod p)) = deg g as otherwise the leading coefficient of g is divisible by p which is
impossible as p ∤ an, and similarly for h.

Thus, i, j ≥ 1. This must mean that p | g(0), h(0) so $p^2 \mid g(0)h(0) = a_0$, a contradiction. Hence
by Gauss’s lemma f is irreducible in Q[X].


Corollary 5.1.5

The pth cyclotomic polynomial Φp = X p−1 + . . . + X + 1 is irreducible for any rational prime p.

Proof

Apply the Eisenstein criterion to
$$\Phi_p(X+1) = \frac{(X+1)^p - 1}{(X+1) - 1} = \binom{p}{p} X^{p-1} + \binom{p}{p-1} X^{p-2} + \ldots + \binom{p}{1}.$$
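
The hypotheses of Eisenstein's criterion for this shifted polynomial can be verified mechanically. The Python sketch below stores coefficients lowest degree first (a convention of this example); the names `eisenstein_applies` and `shifted_cyclotomic` are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def eisenstein_applies(coeffs, p):
    """Check the hypotheses of Eisenstein's criterion at the prime p.

    coeffs[i] is the coefficient of X^i.
    """
    *lower, leading = coeffs
    return (
        leading % p != 0
        and all(c % p == 0 for c in lower)
        and lower[0] % (p * p) != 0
    )


def shifted_cyclotomic(p):
    """Coefficients of Phi_p(X + 1) = ((X + 1)^p - 1) / X."""
    return [comb(p, k) for k in range(1, p + 1)]


checks = {p: eisenstein_applies(shifted_cyclotomic(p), p) for p in [2, 3, 5, 7, 11, 13]}
print(checks)
```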

Remark 5.1.1
If one finds this transformation a bit unnatural, one can also reprove Eisenstein for this polynomial:
$\Phi_p = \frac{X^p - 1}{X - 1} \equiv (X - 1)^{p-1} \pmod p$ by Proposition 4.1.1 so if it were reducible $p^2$ would divide
Φp(1) = p.

More generally, there are two basic principles to prove a polynomial is irreducible in Q[X]: find
some (impossible) information about a hypothetical factorisation modulo some prime p, or find
some bounds on its roots to get a contradiction if it were reducible (for instance if a monic
polynomial’s constant coefficient is prime and it were reducible, one of the factors must have
constant coefficient ±1 and hence a root of absolute value at most 1). We do not explore the
second idea in this book, see chapter 17 of PFTB [1] for an account of this. Note that, even if f
is irreducible in Q[X], it is not always possible to find a rational prime p for which f (mod p) is
irreducible in Fp[X], as Φ8 shows (Proposition 4.4.3).

Exercise 5.1.5. Prove that $\Phi_{p^n}$ is irreducible with Eisenstein’s criterion.

5.2 Prime Divisors of Polynomials


Given a polynomial f ∈ Z[X], we are interested in knowing which rational primes p are such that f
has a root in Fp. In fact we can already see that it is deeply linked to algebraic number theory: if
f = α(X − α1) · . . . · (X − αn), we want to know whether the prime p divides the product
α(a − α1) · . . . · (a − αn)
for some a ∈ Z. We will not answer this question however, as it goes beyond the scope of this book
(see the chapter on density theorems of [19]). Instead, we will only prove that there are infinitely many
such primes (we call them "prime divisors of the polynomial" by abuse of terminology, as they divide
one of the values taken by it).

Theorem 5.2.1

For any non-constant polynomial f ∈ Z[X], there exist infinitely many rational primes p such
that p | f(a) for some a ∈ Z.

Proof

Suppose there were only finitely many such primes, p1, . . . , pm. Clearly, f(0) ≠ 0 as otherwise
all primes divide f(0) (or f(p) if you prefer large numbers). Thus, let
$$N = p_1^{v_{p_1}(f(0))+1} \cdot \ldots \cdot p_m^{v_{p_m}(f(0))+1}.$$
Consider the numbers f(kN) for k ∈ Z: they are congruent to f(0) modulo $p_i^{v_{p_i}(f(0))+1}$ so their
$v_{p_i}$ is $v_{p_i}(f(0))$. By assumption, their only prime factors are the pi: we conclude that
$$f(kN) = \pm p_1^{v_{p_1}(f(0))} \cdot \ldots \cdot p_m^{v_{p_m}(f(0))} = \pm f(0).$$

Finally, the polynomial $f(X)^2 - f(0)^2$ has infinitely many roots so is zero which means that f is
constant, a contradiction.
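
For a concrete feeling of which primes divide some value of a polynomial, one can simply search for roots modulo each prime. The following Python sketch does so for f = X^2 + 1 and the primes below 60; the choice of f, the bound, and the names `evaluate_mod`, `has_root_mod` and `small_primes` are assumptions of this example.

```python
def evaluate_mod(coeffs, x, p):
    """Evaluate a polynomial (coefficients lowest degree first) at x modulo p."""
    value = 0
    for c in reversed(coeffs):
        value = (value * x + c) % p
    return value


def has_root_mod(coeffs, p):
    """True if the polynomial has a root in F_p, i.e. p divides some value f(a)."""
    return any(evaluate_mod(coeffs, a, p) == 0 for a in range(p))


def small_primes(bound):
    """Primes below bound, by trial division."""
    return [n for n in range(2, bound) if all(n % d for d in range(2, int(n ** 0.5) + 1))]


f = [1, 0, 1]  # X^2 + 1
prime_divisors = [p for p in small_primes(60) if has_root_mod(f, p)]
print(prime_divisors)
```

For this particular f, the primes found are 2 and the primes congruent to 1 modulo 4, in agreement with the first supplement of the quadratic reciprocity law.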


Remark 5.2.1
Perhaps a simpler proof is to consider the polynomial f (aX)/a, where a = f (0) is the constant
coefficient of f , to avoid problems with its constant coefficient (which is now 1). We have presented
the other proof first because we find it to be more instructive (but the alternative one is instructive
too). This is what we did in the proof of Theorem 4.4.1.

Remark 5.2.2
We can also prove a much stronger result analytically: if (an)n≥0 is an increasing sequence of
positive integers bounded by a polynomial, an ≤ f(n) for some f ∈ R[X], then there are infinitely
many primes which divide at least one term of the sequence. Indeed, if there were only p1, . . . , pm,
then, on the one hand,
$$\sum_{n=1}^{\infty} \frac{1}{a_n^{1/\deg f}}$$
would diverge since it grows faster than a constant times the harmonic series. On the other
hand, by assumption,
$$\sum_{n=1}^{\infty} \frac{1}{a_n^{1/\deg f}} \le \prod_{k=1}^{m} \sum_{n=0}^{\infty} \frac{1}{p_k^{n/\deg f}} = \prod_{k=1}^{m} \frac{1}{1 - \frac{1}{p_k^{1/\deg f}}} < \infty.$$

An interesting corollary is that, if f ∈ Z[X] is a polynomial and S ⊆ N is a set of non-zero density
(that is, $\frac{|S \cap [n]|}{n} \not\to 0$), then there are infinitely many primes p such that p | f(s) for some s ∈ S.

If we define P(f ) to be the set of primes p such that f has a root modulo p, this result becomes
the fact that P(f ) is infinite when f is non-constant.
Here is an application of this result.

Problem 5.2.1 (APMO 2021 Problem 2)

Find all polynomials f ∈ Z[X] such that, for any n, there are at most 2021 pairs of rational
integers 0 < a < b ≤ n for which |f (a)| ≡ |f (b)| (mod n).

Solution

We shall show that if f has degree at least 2, one value of f will be reached arbitrarily
many times. Since f(m) is always positive or always negative for large m, we may assume by
translating f that its sign is constant on positive numbers.
Thus, we want to estimate the number of (a, b) ∈ (Z/nZ)² such that f(a) ≡ f(b) (mod n). By
Theorem 5.2.1, there are infinitely many prime divisors of f(X + 1) − f; let p1, . . . , pm be such
primes.
Thus, for n = pi, there is one value f(a) which is reached twice modulo n. Hence, for n =
p1 · . . . · pm, by the Chinese remainder theorem (CRT), there is a value which is reached $2^m$ times
modulo n. This indeed grows unbounded.
We conclude that f must have degree 1 (constant f clearly doesn’t work), i.e. f = uX + v for
some u, v ∈ Z and now we need to take into account the absolute values. We show that u = ±1.
Suppose for the sake of a contradiction that |u| ≥ 2. Notice that the sign of f(n) is constant for
n ≥ |v|.
Modulo $u^n$, f(a) ≡ f(b) if and only if a ≡ b (mod $u^{n-1}$). For each residue modulo $u^{n-1}$, there
are |u| residues modulo $u^n$. Thus, there are $|u|^{n-1} \cdot \binom{|u|}{2}$ pairs of $0 < a < b \le |u|^n$ such that
f(a) ≡ f(b). Now subtract the contribution of the residues where the sign is potentially not the
same to get at least
$$\left(|u|^{n-1} - |v|\right)\binom{|u|}{2}$$
pairs which indeed grows unbounded.
Finally, f = ±X + v. It is easy to see that when v and the leading coefficient have the same
sign there is no pair working since the sign of f(a) is constant, so f works. When v has the opposite
sign, f works if and only if |v| ≤ 2022. □

Exercise 5.2.1∗ . Why does CRT imply that there is a value reached $2^m$ times modulo p1 · . . . · pm?

Exercise 5.2.2. Prove that X + v works iff v ≥ −2022, and −X + v works iff v ≤ 2022.

5.3 Hensel’s Lemma


We have found (some results about) when a polynomial f has a root modulo p. Now suppose we want
to know when f has a root modulo $n = p_1^{n_1} \cdot \ldots \cdot p_m^{n_m}$. By the Chinese remainder theorem, this is
equivalent to knowing when f has a root modulo $p_i^{n_i}$ for each i. Indeed, f(a) ≡ 0 (mod n) if and only
if f(a) ≡ 0 (mod $p_i^{n_i}$) for each i, i.e. a is congruent modulo $p_i^{n_i}$ to a root ai of f (mod $p_i^{n_i}$). A partial
result is provided by Hensel’s lemma.

Theorem 5.3.1 (Hensel’s Lemma)

Let f ∈ Z[X] be a polynomial and p a rational prime. If p | f(a) for some a ∈ Z and p ∤ f′(a),
then, for any positive integer k, there is a unique $b \in \mathbb{Z}/p^k\mathbb{Z}$ such that $p^k \mid f(b)$ and b ≡ a (mod p).

Before proving this result, we need a lemma which is in fact even more important than Hensel.
Rather than only remembering the statement of Hensel’s lemma, the reader should also learn the
method of proving it which can be useful in a larger variety of situations.

Proposition 5.3.1 (Taylor’s Formula)*

Let f ∈ Q[X] be a polynomial of degree n. For any h ∈ Q, we have
$$f(X + h) = f + hf' + h^2 \cdot \frac{f''}{2} + \ldots + h^n \cdot \frac{f^{(n)}}{n!}.$$

Remark 5.3.1
One can also write
$$f(X + h) = \sum_{k=0}^{\infty} h^k \cdot \frac{f^{(k)}}{k!}$$
as all terms after k = n vanish (since f has degree n).

Proof

It suffices to prove this when $f = X^n$, as it will then be true for any linear combination of these
polynomials, i.e. for any polynomial.
Notice that
$$(X^n)^{(k)} = n(n-1) \cdot \ldots \cdot (n-(k-1))X^{n-k}$$
so that
$$\frac{f^{(k)}}{k!} = \binom{n}{k} X^{n-k}.$$
Finally,
$$\sum_{k} h^k \cdot \frac{f^{(k)}}{k!} = \sum_{k} \binom{n}{k} h^k X^{n-k} = (X + h)^n$$
as wanted.


Corollary 5.3.1*

Let p be a rational prime, k a positive integer and f ∈ Z[X] a polynomial. For any rational
integer h divisible by $p^k$, we have
$$f(X + h) \equiv f + hf' \pmod{p^{k+1}}.$$

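Proof

It suffices to prove that the polynomials $\frac{f^{(j)}}{j!}$ have integer coefficients by Proposition 5.3.1 (we evaluate
both sides modulo $p^{k+1}$: every term with j ≥ 2 is divisible by $h^2$, hence by $p^{2k}$, and 2k ≥ k + 1). But we
have already shown that, if $f = \sum_i a_i X^i$, we have
$$\frac{f^{(j)}}{j!} = \sum_{i} a_i \binom{i}{j} X^{i-j}.$$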
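
The congruence of Corollary 5.3.1 can be tested on an example. The Python sketch below uses an arbitrarily chosen polynomial, prime and exponent (all assumptions of this illustration), with coefficients stored lowest degree first and hypothetical helper names `evaluate` and `derivative`.

```python
def evaluate(coeffs, x):
    """Evaluate an integer polynomial (coefficients lowest degree first) at x."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value


def derivative(coeffs):
    """Coefficients of the formal derivative."""
    return [i * c for i, c in enumerate(coeffs)][1:]


f = [3, -2, 0, 5, 1]   # X^4 + 5X^3 - 2X + 3
p, k = 7, 2
df = derivative(f)

all_congruent = True
for x in range(-20, 21):
    for t in range(-5, 6):
        h = t * p ** k  # any h divisible by p^k
        lhs = evaluate(f, x + h)
        rhs = evaluate(f, x) + h * evaluate(df, x)
        if (lhs - rhs) % p ** (k + 1) != 0:
            all_congruent = False

print(all_congruent)
```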

Here’s an application of this result, before proving Hensel’s lemma.

Problem 5.3.1 (USA TST 2010 Problem 1)

Let f ∈ Z[X] be a non-constant polynomial such that gcd(f(0), f(1), f(2), . . .) = 1 and f(0) = 0.
Prove that there exist infinitely many rational integers n ∈ Z such that

gcd(f(n) − f(0), f(n + 1) − f(1), . . .) = n.

Solution

We take n = p a rational prime. Suppose that a rational prime q ≠ p divides f(p + k) − f(k)
for all k. Then, f(mp) ≡ f(0) (mod q) for any m by induction. But, since q ≠ p, mp (mod q)
goes through every element of Fq which means that f is constant modulo q. Since f(0) = 0, this
means q | gcd(f(0), f(1), f(2), . . .) which is impossible.

Hence, we know that gcd(f(p) − f(0), f(p + 1) − f(1), . . .) is a power of p. It is clearly divisible by p;
thus it remains to prove that it’s not divisible by $p^2$. Since $f(p + k) - f(k) \equiv pf'(k) \pmod{p^2}$ by
Corollary 5.3.1, this is equivalent to p not dividing at least one number of the form f′(k).

This is very easy to arrange: f has degree at least 1, so f′ is non-zero. Now, just pick a k such that
f′(k) ≠ 0 and any rational prime p ∤ f′(k) (there are clearly infinitely many such primes). □

Proof of Hensel’s Lemma

We proceed by induction on k. For k = 1, the result is clear. Now, suppose there is a unique
$b_k \in \mathbb{Z}/p^k\mathbb{Z}$ such that $f(b_k) \equiv 0 \pmod{p^k}$ and $b_k \equiv a \pmod p$. This means that any $b_{k+1}$
satisfying $f(b_{k+1}) \equiv 0 \pmod{p^{k+1}}$ and $b_{k+1} \equiv a \pmod p$ must be congruent to $b_k$ modulo $p^k$.
We show that a unique $b_{k+1} \equiv b_k \pmod{p^k}$ which is a root of f modulo $p^{k+1}$ exists (modulo
$p^{k+1}$).

Write $b_{k+1} = b_k + up^k$. By Corollary 5.3.1, we have
$$f(b_{k+1}) \equiv f(b_k) + up^k f'(b_k) \equiv f(b_k) + up^k f'(a) \pmod{p^{k+1}},$$
where the second congruence holds since $b_k \equiv a \pmod p$.
Accordingly, $b_{k+1}$ is a root of f modulo $p^{k+1}$ if and only if $u \equiv -(f'(a))^{-1}(f(b_k)/p^k) \pmod p$
as f′(a) is invertible modulo p by assumption. This exists and is unique modulo p, hence
$b_{k+1} = b_k + up^k$ indeed exists and is unique modulo $p^{k+1}$.
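
The induction in this proof is effectively an algorithm. Here is a minimal Python sketch of the lifting step, under the assumptions of Hensel's lemma (p | f(a) and p ∤ f′(a)); the names `evaluate`, `derivative` and `hensel_lift` are hypothetical, coefficients are stored lowest degree first, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
def evaluate(coeffs, x):
    """Evaluate an integer polynomial (coefficients lowest degree first) at x."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value


def derivative(coeffs):
    """Coefficients of the formal derivative."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def hensel_lift(coeffs, a, p, k):
    """Lift a root a of f modulo p (with p not dividing f'(a)) to a root modulo p^k."""
    inverse = pow(evaluate(derivative(coeffs), a) % p, -1, p)  # f'(a)^(-1) mod p
    b, modulus = a % p, p
    for _ in range(k - 1):
        # f(b) is divisible by `modulus`; choose u with
        # f(b) + u * modulus * f'(a) = 0 modulo modulus * p.
        u = (-(evaluate(coeffs, b) // modulus) * inverse) % p
        b += u * modulus
        modulus *= p
    return b


f = [-2, 0, 1]                 # X^2 - 2, which has the root 3 modulo 7
root = hensel_lift(f, 3, 7, 4)
print(root, (root * root - 2) % 7 ** 4)
```

Each pass through the loop is exactly one application of Corollary 5.3.1, so the cost is linear in k.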


Here’s an application of Hensel itself.

Problem 5.3.2 (ISL 1995 N1)

Prove that, for any positive integer n, there exists a rational integer k such that k · 2n − 7 is a
perfect square.

Solution

This is equivalent to −7 being a perfect square modulo $2^n$. This makes us think of applying
Hensel’s lemma to the polynomial $X^2 + 7$. Unfortunately, its derivative 2X is zero modulo
2, thus the hypotheses can never be satisfied.

Nevertheless, we already know a root a of $X^2 + 7$ modulo $2^n$ must be odd. Thus, we can make
the substitution X = 2Y + 1 to get
$$X^2 + 7 = (2Y + 1)^2 + 7 = 4(Y^2 + Y + 2).$$

We can now use Hensel’s lemma on $Y^2 + Y + 2$: its derivative is 2Y + 1 ≡ 1 (mod 2) which is never zero,
hence (for n ≥ 3) we can lift the root 1 of $Y^2 + Y + 2$ modulo 2 to a (unique) root a modulo $2^{n-2}$. Such an
a will satisfy
$$(2a + 1)^2 \equiv -7 \pmod{2^n}$$
by what we have shown. □
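
As a quick numerical confirmation of the statement (not of the method), the following Python sketch searches by brute force for a square root of −7 modulo $2^n$ for small n. The function name `sqrt_of_minus_seven` and the range of n are choices of this example.

```python
def sqrt_of_minus_seven(n):
    """Smallest x >= 0 with x^2 = -7 modulo 2^n, or None if there is none."""
    modulus = 2 ** n
    for x in range(modulus):
        if (x * x + 7) % modulus == 0:
            return x
    return None


roots = {n: sqrt_of_minus_seven(n) for n in range(1, 13)}
print(roots)
```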

Exercise 5.3.1∗ . Let p be an odd prime and a a quadratic residue modulo p. Prove that a is a quadratic
residue modulo $p^n$, i.e. a square modulo $p^n$ (coprime with $p^n$), for any positive integer n.

Exercise 5.3.2∗ . Prove that an odd rational integer a ∈ Z is a quadratic residue modulo $2^n$ for n ≥ 3 if and
only if a ≡ 1 (mod 8).

5.4 Bézout’s Lemma


We shall now see how irreducible polynomials really shine in polynomial number theory, with a few
worked examples. Recall Bézout’s lemma for Q[X] (Q[X] is Euclidean and hence a Bézout domain
too).

Proposition 5.4.1 (Bézout’s lemma for Q [X])

For any coprime polynomials f, g ∈ Q[X], there exist polynomials u, v ∈ Q[X] such that

f u + gv = 1.

Of course, this also holds for multiple polynomials: if f1 , . . . , fn are coprime then some linear
combination (with coefficients in Q[X]) of them is 1 (just induct on n using Proposition 5.4.1).

Corollary 5.4.1*

For any coprime polynomials f, g ∈ Z[X], there exist polynomials u, v ∈ Z[X] and a non-zero
constant N ∈ Z such that
f u + gv = N.
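To make the role of the constant $N$ concrete, here is a small Python check on one hand-picked example. The polynomials $f = X^2 + 1$, $g = X + 1$ and the coefficients $u = 1$, $v = 1 - X$ are our own illustrative choice (they give $N = 2$); the text does not single them out.

```python
from math import gcd


# Illustrative choice (not from the text): f = X^2 + 1 and g = X + 1.
# They are coprime in Q[X], and f * 1 + g * (1 - X) = 2 in Z[X].
def f(n):
    return n * n + 1


def g(n):
    return n + 1


def u(n):
    return 1


def v(n):
    return 1 - n


N = 2
common = set()
for n in range(-50, 51):
    assert f(n) * u(n) + g(n) * v(n) == N  # the Bezout identity, evaluated at n
    d = gcd(f(n), g(n))
    assert N % d == 0                      # any common divisor divides N
    common.add(d)

print(sorted(common))
```

The point of the check is the second assertion: whatever $n$ is, a common divisor of $f(n)$ and $g(n)$ has to divide $N$, so only finitely many primes can divide both values.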

Exercise 5.4.1∗ . Prove Corollary 5.4.1.

Corollary 5.4.2

Suppose f, g ∈ Z[X] are such that, for any sufficiently large rational prime p, p | f (n) implies
p | g(n) for any n ∈ Z. Then, rad f | rad g, i.e. every irreducible factor of f in Q[X] divides g.

Proof

Suppose that π is a non-constant primitive irreducible factor of f which doesn’t divide g.

Since π is irreducible, it is then coprime with g, so by Bézout’s lemma there exist polynomials
u, v ∈ Z[X] and a non-zero integer N ∈ Z such that

πu + gv = N.

In particular, any common prime factor of π(n) and g(n) must also divide N. Thus, if p > |N| is
a sufficiently large prime factor of π(n) | f(n) (there exists one by Theorem 5.2.1), then p ∤ N so
p ∤ g(n), which is a contradiction.


In other words, the prime divisors of a polynomial are controlled by the prime divisors of its
irreducible factors, thanks to Bézout’s lemma. Here is a more elaborate example, not involving
irreducible polynomials in the statement.

Problem 5.4.1

Suppose f ∈ Q[X] is a polynomial which takes only perfect square values (in Q). Prove that it
is the square of a polynomial with rational coefficients.

Solution

By multiplying $f$ by an appropriate integral perfect square, we may assume $f$ has integer coefficients.
Without loss of generality, we may assume $f$ is squarefree. We shall show that $f$ must be
constant, since this clearly implies that it is the square of a polynomial with integer coefficients.
Consider its factorisation into non-constant primitive irreducible polynomials $f = a\pi_1 \cdots \pi_m$.
Suppose for the sake of a contradiction that $m \geq 1$. First, we wish to distinguish the prime
divisors of $\pi_1$ from the prime divisors of the other factors, so that, when $p \mid \pi_1(n)$, $v_p(\pi_1(n)) = v_p(f(n))$ (which must
be even by assumption). By Bézout’s lemma, since $\pi_1$ and $\pi_2\pi_3 \cdots \pi_m$ are coprime, there exist
polynomials $u, v \in \mathbb{Z}[X]$ and a non-zero integer $N$ such that
$$u\pi_1 + v\pi_2 \cdots \pi_m = N.$$
Now, consider a rational prime $p > |a|, |N|$ and a rational integer $n \in \mathbb{Z}$ such that $p \mid \pi_1(n)$; there
exists one by Theorem 5.2.1. By assumption, $p \nmid aN$, thus $p \nmid a\pi_2(n) \cdots \pi_m(n)$, which implies
$$v_p(f(n)) = v_p(\pi_1(n)).$$

In particular, since $v_p(\pi_1(n))$ and $v_p(\pi_1(n + p))$ are even and positive, we must have

$$p^2 \mid \pi_1(n), \pi_1(n + p).$$

But, by Corollary 5.3.1,

$$\pi_1(n + p) \equiv \pi_1(n) + p\pi_1'(n) \equiv p\pi_1'(n) \pmod{p^2}$$

which means $p$ must divide $\pi_1'(n)$.

To conclude, $\pi_1$ and $\pi_1'$ are coprime (since $\pi_1$ is irreducible and $\deg \pi_1 > \deg \pi_1'$) so, by Bézout’s
lemma, there are some $r, s \in \mathbb{Z}[X]$ and a non-zero $M \in \mathbb{Z}$ such that $r\pi_1 + s\pi_1' = M$. Then, for
$p > |a|, |M|, |N|$, the previous remark is impossible and we must have $v_p(\pi_1(n)) = 1$ or $v_p(\pi_1(n + p)) = 1$,
which is a contradiction. ∎

Remark 5.4.1
We could have directly used Bézout on $\pi_1$ and $\pi_1'\pi_2 \cdots \pi_m$ but we presented it that way to
highlight the motivation. In fact, what we have proven with this is that, if $\pi$ and $\pi_1, \ldots, \pi_k$ are
distinct primitive irreducible polynomials, there exist infinitely many primes $p$ for which there
is an $n$ such that $v_p(\pi(n)) = 1$ and $v_p(\pi_i(n)) = 0$ for $i = 1, \ldots, k$.

Remark 5.4.2
In fact, the problem of determining which polynomials reach infinitely many square values has
been completely settled with deep arithmetic geometry results. We can also approach this
analytically (and elementarily) to get results stronger than what we proved, but worse than the
complete characterisation. The idea is that if the leading coefficient of $f \in \mathbb{Z}[X]$ is a square, we
can find a polynomial $g \in \mathbb{Z}[X]$ such that

$$g(x)^2 \leq f(x) < (g(x) + 1)^2$$

for any sufficiently large $|x|$, which forces $f = g^2$ if $f$ takes infinitely many square values. When
the leading coefficient is not a square, we can still transform it into a square in some cases. For
instance, if $f(2^n)$ is a square for all $n$, then the leading coefficient of $f(X)f(2^m X)$ is a square for
even $m$, and this polynomial takes infinitely many square values, so must be a square. It is not
hard to see (e.g. by looking at the roots: for sufficiently large $m$, the only possible common root
of $f$ and $f(2^m X)$ is $0$) that this implies that $f$ is a square or $X$ times a square, and the latter
doesn’t work.

We conclude this chapter with two additional remarks. When dealing with problems about polynomials
modulo some prime $p$, it is very important to keep in mind that a polynomial of degree $n$ has at most
$n$ roots modulo $p$ (since $\mathbb{F}_p$ is a field). Also, when dealing with exponential functions and polynomials
at the same time, say $f(n)$ and $a^n$, modulo $p$ one can choose the values of $a^n$ and $f(n)$ independently,
as the former has period $p - 1$ while the latter has period $p$. We illustrate this by an example.

Problem 5.4.2 (Polish Mathematical Olympiad 2003 Problem 3)

Find all polynomials $f \in \mathbb{Z}[X]$ such that $f(n) \mid 2^n - 1$ for any positive rational
integer $n$.

Solution

Suppose some prime $p$ divides $f(n)$ for some rational integer $n$. Choose a positive rational integer $n'$
satisfying $n' \equiv n \pmod p$ and $n' \equiv 1 \pmod{p - 1}$ by CRT. Then, $2^{n'} \equiv 2 \pmod p$ by Fermat’s
little theorem, so

$$p \mid f(n') \mid 2^{n'} - 1 \equiv 1 \pmod p$$

which is impossible. Thus, we conclude $f(n) = \pm 1$ for all $n$, which means $f = \pm 1$. These are
indeed solutions. ∎
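The independence of the two periods can be seen on a concrete case. The following Python sketch picks $n'$ with $n' \equiv n \pmod p$ and $n' \equiv 1 \pmod{p - 1}$ by brute force, so that $2^{n'} \equiv 2 \pmod p$ by Fermat’s little theorem; the candidate polynomial $X^2 + 1$ and the name `crt_choice` are hypothetical and serve only as an illustration.

```python
def crt_choice(n, p):
    """Smallest positive m with m = n (mod p) and m = 1 (mod p - 1), p prime."""
    for m in range(1, p * (p - 1) + 1):
        if (m - n) % p == 0 and (m - 1) % (p - 1) == 0:
            return m
    raise ValueError("unreachable for prime p: p and p - 1 are coprime")


def f(x):
    return x * x + 1  # hypothetical candidate, chosen only for illustration


n, p = 2, 5           # 5 divides f(2) = 5
m = crt_choice(n, p)
assert f(m) % p == 0             # f(m) = f(n) = 0 (mod p): period p
assert (2 ** m - 1) % p == 1     # 2^m = 2 (mod p): period p - 1
assert (2 ** m - 1) % f(m) != 0  # hence f(m) does not divide 2^m - 1
print(m, f(m), 2 ** m - 1)
```

A brute-force search is enough here because a solution always exists among the first $p(p - 1)$ positive integers.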

5.5 Exercises
Algebraic Results
Exercise 5.5.1†. Suppose $f, g \in \mathbb{Z}[X]$ are polynomials such that $f(n) \mid g(n)$ for infinitely many
rational integers $n \in \mathbb{Z}$. Prove that $f \mid g$. In addition, generalise the previous statement to
$f, g \in \mathbb{Z}[X_1, \ldots, X_m]$ such that $f(x) \mid g(x)$ for $x \in S_1 \times \ldots \times S_m$, where $S_1, \ldots, S_m \subseteq \mathbb{Z}$ are infinite sets.

Exercise 5.5.2† . Let f ∈ Q[X] be a polynomial. Suppose that f always takes values which are mth
powers in Q. Prove that f is the mth power of a polynomial with rational coefficients. More generally,
find all polynomials f ∈ Q[X1 , . . . , Xm ] such that f (x1 , . . . , xm ) is a (non-trivial) perfect power for
any (x1 , . . . , xm ) ∈ Zm .

Exercise 5.5.3† . Suppose f, g ∈ Z[X] are polynomials such that f (a) − f (b) | g(a) − g(b) for any
rational integers a, b ∈ Z. Prove that there exists a polynomial h ∈ Z[X] such that g = h ◦ f .

Exercise 5.5.4 (RMM SL 2016). Let $p$ be a prime number. Prove that there are only finitely many
primes $q$ such that
$$q \mid \sum_{k=1}^{\lfloor q/p \rfloor} k^{p-1}.$$

Exercise 5.5.5. Let $x$ and $y$ be positive rational integers. Suppose $f, g \in \mathbb{Z}[X]$ are polynomials such
that $f(ab) \mid g(a^x + b^y)$ for any $a, b \in \mathbb{Z}$. Prove that $f$ is constant.

Exercise 5.5.6 (ISL 2019). Suppose $a$ and $b$ are positive rational integers such that

$$an + 1 \mid \binom{an}{b} + 1$$

for any rational integer $n \geq b$. Prove that $b + 1$ is prime.

Exercise 5.5.7. Suppose $f \in \mathbb{Z}[X_1, \ldots, X_n]$ is a polynomial such that, for each tuple of rational
primes $(p_1, \ldots, p_n)$, there is some $i$ for which $p_i \mid f(p_1, \ldots, p_n)$. Prove that $X_i \mid f$ for some $i$. (You
may assume Dirichlet’s theorem.)

Exercise 5.5.8 (Inspired by Iran TST 2019). Suppose f1 , . . . , fk ∈ Q[X] are polynomials such that,
whenever n is a perfect power, one of f1 , . . . , fk is too. Prove that one of f1 , . . . , fk is a non-trivial
power of a polynomial or X.

Polynomials over Fp
Exercise 5.5.9† (Generalised Eisenstein’s Criterion). Let $f = a_n X^n + \ldots + a_0 \in \mathbb{Z}[X]$ be a polynomial
and let $p$ be a rational prime. Suppose that $p \nmid a_n$, $p \mid a_0, \ldots, a_{n-1}$, and $p^2 \nmid a_k$ for some $k < n$. Then
any factorisation $f = gh$ in $\mathbb{Q}[X]$ satisfies $\min(\deg g, \deg h) \leq k$.

Exercise 5.5.10† (China TST 2008). Let $f \in \mathbb{Z}[X]$ be a (non-zero) polynomial with coefficients in
$\{-1, 1\}$. Suppose that $(X - 1)^{2^k}$ divides $f$ and $\deg f \geq 2^k$. Prove that $\deg f \geq 2^{k+1} - 1$.

Exercise 5.5.11† (Romania TST 2002). Let $f, g \in \mathbb{Z}[X]$ be polynomials with coefficients in $\{1, 2002\}$
such that $f \mid g$. Prove that $\deg f + 1 \mid \deg g + 1$.

Exercise 5.5.12† (USAMO 2006). Find all polynomials $f \in \mathbb{Z}[X]$ such that the sequence
$(P(f(n^2)) - 2n)_{n \geq 0,\ f(n^2) \neq 0}$ is bounded above, where $P$ is the greatest prime factor function. (In particular, since
$P(0) = +\infty$, we have $f(n^2) \neq 0$ for any $n \in \mathbb{Z}$.)

Exercise 5.5.13 (Iran TST 2011). Suppose a polynomial $f \in \mathbb{Z}[X]$ is such that $p^k \mid f(n)$ for all
$n \in \mathbb{Z}$, for some $k \leq p$. Prove that there exist polynomials $g_0, \ldots, g_k \in \mathbb{Z}[X]$ such that

$$f = \sum_{i=0}^{k} (X^p - X)^i p^{k-i} g_i.$$

In addition, prove that this becomes false when $k > p$.

Exercise 5.5.14† (China TST 2021). Suppose the polynomials $f, g \in \mathbb{Z}[X]$ are such that, for any
sufficiently large rational prime $p$, there is an element $r_p \in \mathbb{F}_p$ such that $f \equiv g(X + r_p) \pmod p$. Prove
that there exists a rational number $r \in \mathbb{Q}$ such that $f = g(X + r)$.

Exercise 5.5.15 (IMO 1993). Let $n \geq 2$ be an integer. Prove that the polynomial $X^n + 5X^{n-1} + 3$
is irreducible.

Iterates
Exercise 5.5.16†. Let $f \in \mathbb{Z}[X]$ be a polynomial. Show that the sequence $(f^n(0))_{n \geq 0}$ is a Mersenne
sequence, i.e.
$$\gcd(f^i(0), f^j(0)) = f^{\gcd(i,j)}(0)$$
for any $i, j \geq 0$.

Exercise 5.5.17†. Suppose the non-constant polynomial

$$f = a_d X^d + \ldots + a_2 X^2 + a_0 \in \mathbb{Z}[X]$$

has positive coefficients and satisfies $f'(0) = 0$. Prove that the sequence $(f^n(0))_{n \geq 1}$ always has a
primitive prime factor.

Exercise 5.5.18† (Tuymaada 2003). Let $f \in \mathbb{Z}[X]$ be a polynomial and $a \in \mathbb{Z}$ a rational integer.
Suppose $|f^n(a)| \to \infty$. Prove that there are infinitely many primes $p$ such that $p \mid f^n(a)$ for some
$n \geq 0$ unless $f = AX^d$ for some $A, d$.

Exercise 5.5.19† (USA TST 2020). Find all integers $n \geq 2$ for which there exist a rational integer
$m > 1$ and a polynomial $f \in \mathbb{Z}[X]$ such that $\gcd(m, n) = 1$ and $n \mid f^k(0) \iff m \mid k$ for any positive
rational integer $k$.

Exercise 5.5.20†. Let $f \in \mathbb{Q}[X]$ be a polynomial of degree $k$. Prove that there is a constant $h > 0$
such that the denominator of $f(x)$ is greater than $h$ times the denominator of $x^k$.

Exercise 5.5.21†. Let $f \in \mathbb{Q}[X]$ be a polynomial of degree at least 2. Prove that

$$\bigcap_{k=0}^{\infty} f^k(\mathbb{Q})$$

is finite.

Exercise 5.5.22† (Iran TST 2004). Let $f \in \mathbb{Z}[X]$ be a polynomial such that $f(n) > n$ for any positive
rational integer $n$. Suppose that, for any $N \in \mathbb{Z}$, there is some positive rational integer $n$ such that

$$N \mid f^n(1).$$

Prove that $f = X + 1$.

Divisibility Relations
Exercise 5.5.23†. Find all polynomials $f \in \mathbb{Z}[X]$ such that $f(n) \mid n^{n-1} - 1$ for sufficiently large $n$.

Exercise 5.5.24. Find all polynomials f ∈ Z[X] such that gcd(f (a), f (b)) = 1 whenever gcd(a, b) = 1.
Exercise 5.5.25† (ISL 2012 Generalised). Find all polynomials $f \in \mathbb{Z}[X]$ such that $\operatorname{rad} f(n) \mid
\operatorname{rad} f(n^{\operatorname{rad} n})$. (You may assume Dirichlet’s theorem.)

Exercise 5.5.26 (ISL 2011). Suppose $f, g \in \mathbb{Z}[X]$ are coprime polynomials such that $f(n)$ and $g(n)$
are positive for any positive rational integer $n$. Suppose that

$$2^{f(n)} - 1 \mid 3^{g(n)} - 1$$

for any positive rational integer $n$. Prove that $f$ is constant.


Exercise 5.5.27†. Find all polynomials $f \in \mathbb{Z}[X]$ such that $f(p) \mid 2^p - 2$ for any prime $p$. (You may
assume Dirichlet’s theorem.)

Exercise 5.5.28 (Iran Mathematical Olympiad 3rd Round 2016). We say a function $g : \mathbb{N} \to \mathbb{N}$ is
special if it has the form $g(n) = a^{f(n)}$ where $f \in \mathbb{Z}[X]$ is a polynomial such that $f(n)$ is positive when
$n$ is a positive rational integer and $a$ is a rational integer. We also say the sum, difference, and product
of two special functions is special. Prove that there does not exist a non-zero special function $g$ and a
non-constant polynomial $f \in \mathbb{Z}[X]$ such that

$$f(n) \mid g(n)$$

for any positive rational integer $n$.

Miscellaneous
Exercise 5.5.29† (Generalised Hensel’s Lemma). Let $f \in \mathbb{Z}[X]$ be a polynomial and $a \in \mathbb{Z}$ an integer.
Let $m = v_p(f'(a))$. If $p^{2m+1} \mid f(a)$, prove that $f$ has exactly one root $b$ modulo $p^k$ which is congruent
to $a$ modulo $p^{m+1}$ for all $k \geq 2m + 1$.

Exercise 5.5.30† . Let f ∈ Z[X] be a non-constant polynomial. Is it possible that f (n) is prime for
any n ∈ Z?
Exercise 5.5.31† . Find all polynomials f ∈ Q[X] which are surjective onto Q.
Exercise 5.5.32 (Inspired by USA TST 2008). Let n be a positive rational integer. How many
sequences of n elements of Z/nZ have the form

(f (0), . . . , f (n − 1))

for some f ∈ Z/nZ[X]?


Exercise 5.5.33. We say a subset S of Z/nZ is d-coverable if there exists a polynomial f ∈ Z/nZ[X]
of degree at most d such that
S = {f (0), . . . , f (n − 1)}.
Find all rational integers n such that all subsets of Z/nZ are d-coverable for some d, and find the
minimum possible d for these n.
Exercise 5.5.34 (Iran TST 2015). Let $(a_n)_{n \geq 0}$ denote the sequence of rational integers which are
sums of two squares: $0, 1, 2, 4, 5, 8, \ldots$. Let $m \in \mathbb{Z}$ be a positive rational integer. Prove that there are
infinitely many integers $n$ such that $a_{n+1} - a_n = m$.

Exercise 5.5.35† (ISL 2005). Let f ∈ Z[X] be a non-constant polynomial with positive leading
coefficient. Prove that there are infinitely many positive rational integers n such that f (n!) is composite.
Chapter 6

The Primitive Element Theorem and Galois Theory

Prerequisites for this chapter: Chapters 1, 3 and 4 and Sections A.2 and C.1 for the whole chapter
and Chapter 5 for Section 6.4. Chapter 2 is recommended.

6.1 General Definitions


Let’s start with some general definitions with which you should be somewhat familiar by now
(from Chapters 2 and 4 and the exercises).

Definition 6.1.1

The smallest ring containing a commutative ring R and elements α1 , . . . , αn is denoted

R[α1 , . . . , αn ].

It consists of polynomial expressions in α1 , . . . , αn :

R[α1 , . . . , αn ] = {f (α1 , . . . , αn ) | f ∈ R[X1 , . . . , Xn ]}.

Definition 6.1.2

The smallest field containing a commutative field K and elements α1 , . . . , αn is denoted

K(α1 , . . . , αn ).

It consists of rational expressions in α1 , . . . , αn :

K(α1 , . . . , αn ) = {f (α1 , . . . , αn ) | f ∈ K(X1 , . . . , Xn )}.

Exercise 6.1.1∗. Prove that $R[\alpha_1, \ldots, \alpha_n]$ is indeed the smallest ring containing $R$ and $\alpha_1, \ldots, \alpha_n$, in the
sense that any other such ring must contain $R[\alpha_1, \ldots, \alpha_n]$. Similarly, prove that any field containing $K$ and
$\alpha_1, \ldots, \alpha_n$ contains $K(\alpha_1, \ldots, \alpha_n)$.

Exercise 6.1.2∗. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Prove that $\mathbb{Q}(\alpha) = \mathbb{Q}[\alpha]$.

Remark 6.1.1
Of course, we assumed the multiplication and addition of R and K were compatible with the αi ,
and in the second definition, that K[α1 , . . . , αn ] is an integral domain, otherwise the definitions


do not make sense. Indeed, if α1 α2 = 0 but α1 , α2 6= 0 then no field can contain α1 and α2 .

We now generalise the quadratic fields from Chapter 2 to arbitrary fields of the form $\mathbb{Q}(\alpha_1, \ldots, \alpha_n)$
for some algebraic numbers $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$. These are called number fields. However, to get more
number-theoretic information on number fields, we must do algebraic number theory over number
fields too, instead of only over $\mathbb{Q}$.

Here is what this means: we defined algebraic numbers as roots of polynomials with rational
coefficients. We can then define algebraic numbers over a number field $K$ as the roots of polynomials
with coefficients in $K$. These turn out to be the same as the regular algebraic numbers by the
fundamental theorem of symmetric polynomials, but what’s different is their minimal polynomial.
Indeed, the minimal polynomial of $\frac{(i+1)\sqrt{2}}{2}$ over $\mathbb{Q}$ is $X^4 + 1$ but over $\mathbb{Q}(i)$ it is just $X^2 - i$.
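As a numerical sanity check (not a proof of minimality), the following Python sketch verifies with floating-point complex arithmetic that $z = \frac{(i+1)\sqrt{2}}{2}$ is a root of both $X^2 - i$ and $X^4 + 1 = (X^2 - i)(X^2 + i)$, and that only two of the four roots of $X^4 + 1$ are roots of $X^2 - i$. The tolerance `1e-12` is an arbitrary choice of ours.

```python
import cmath

z = (1 + 1j) * cmath.sqrt(2) / 2   # the number (i + 1) * sqrt(2) / 2

# z^2 = i, so z is a root of X^2 - i, a polynomial with coefficients in Q(i).
assert abs(z ** 2 - 1j) < 1e-12
# Over Q, both factors of X^4 + 1 = (X^2 - i)(X^2 + i) are needed.
assert abs(z ** 4 + 1) < 1e-12

# The four roots of X^4 + 1 are z, -z and their complex conjugates.
roots = [z, -z, z.conjugate(), -z.conjugate()]
assert all(abs(r ** 4 + 1) < 1e-12 for r in roots)

# Only z and -z are roots of X^2 - i: these are the conjugates of z over Q(i).
over_Qi = [r for r in roots if abs(r ** 2 - 1j) < 1e-12]
print(len(roots), len(over_Qi))
```

So passing from $\mathbb{Q}$ to $\mathbb{Q}(i)$ halves the number of conjugates of $z$, in line with the two minimal polynomials above.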

Exercise 6.1.3∗. Prove that the minimal polynomial of $\frac{(i+1)\sqrt{2}}{2}$ over $\mathbb{Q}(i)$ is $X^2 - i$.

We thus make the following definitions. One of the reasons we do it in so much generality is to do
number theory with other base fields than Q (but which are still number fields), but another one is
also to revisit the theory of finite fields a bit, since these are also about algebraic elements.

Definition 6.1.3 (Field Extensions)

We say two fields K ⊆ L form a field extension denoted L/K (big field over small field).

We will usually only say "extension" for "field extension". We also make analogous definitions for
elements algebraic over $K$, their minimal polynomial, their degree, their conjugates, etc. The field of
elements algebraic over $L$ is denoted $\overline{L}$ and called the algebraic closure of $L$ (this won’t be needed in
this book as $\overline{L} = \overline{\mathbb{Q}}$ for any number field $L$).

Definition 6.1.4 (Degree of a Field Extension)

The degree [L : K] of an extension L/K is the dimension of L as a K-vector space.

Exercise 6.1.4∗ . Check that L is a K-vector space.

The degree measures the "size" of the extension. This definition might seem
somewhat complicated at first, but it is in fact very simple (in the cases we’re interested in): when
$L = K(\alpha)$ for some $\alpha$ algebraic over $K$ of degree $n$, the degree of $L/K$ is also just $n$. Indeed, by
definition the elements $1, \alpha, \ldots, \alpha^{n-1}$ are $K$-linearly independent (otherwise the minimal polynomial
of $\alpha$ would have degree less than $n$) while $1, \alpha, \ldots, \alpha^n$ aren’t (since some polynomial of degree $n$ vanishes at
$\alpha$), so $1, \alpha, \ldots, \alpha^{n-1}$ form a $K$-basis of $K[\alpha] = K(\alpha)$. For our purposes, all extensions of number
fields have the form $L = K(\alpha)$: this is the primitive
element theorem 6.2.1. You might thus wonder why we state things with linear algebra terminology:
it’s simply because linear algebra and bases are convenient to work with, as the following example as
well as the tower law 6.1.1 show.

Proof that algebraic numbers are closed under addition and multiplication

Let $\alpha, \beta \in \overline{\mathbb{Q}}$ be algebraic numbers of respective degrees $m$ and $n$. Then, $\mathbb{Q}(\alpha, \beta)$ is a $\mathbb{Q}$-vector
space with dimension at most $mn$, since it’s generated by $\alpha^i \beta^j$, $i = 0, \ldots, m - 1$, $j = 0, \ldots, n - 1$.

As a consequence, $1, \alpha + \beta, (\alpha + \beta)^2, \ldots, (\alpha + \beta)^{mn}$ are linearly dependent, which means that there
is a non-zero polynomial with rational coefficients of degree at most $mn$ vanishing at $\alpha + \beta$; in particular it’s
algebraic.

Similarly, $1, \alpha\beta, (\alpha\beta)^2, \ldots, (\alpha\beta)^{mn}$ are linearly dependent so $\alpha\beta$ is algebraic.
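The dimension argument can be made explicit on an example. The Python sketch below takes $\alpha = \sqrt{2}$ and $\beta = \sqrt{3}$ (our own choice), stores elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ as integer coordinate tuples in the spanning family $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$, and checks two explicit linear relations between powers of $\alpha + \beta$ and of $\alpha\beta$. The relations themselves were found by hand; the code only verifies them exactly.

```python
def mul(x, y):
    """Multiply a + b*sqrt2 + c*sqrt3 + d*sqrt6, stored as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,
    )


def powers(x, count):
    """Return [1, x, x^2, ..., x^(count - 1)]."""
    result = [(1, 0, 0, 0)]
    for _ in range(count - 1):
        result.append(mul(result[-1], x))
    return result


def combine(coeffs, vectors):
    """Return the linear combination sum(coeffs[i] * vectors[i])."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(4))


sqrt2, sqrt3 = (0, 1, 0, 0), (0, 0, 1, 0)
s = tuple(p + q for p, q in zip(sqrt2, sqrt3))   # alpha + beta
t = mul(sqrt2, sqrt3)                            # alpha * beta = sqrt6

# Five vectors in a space of dimension at most 4 are dependent; explicitly:
relation_sum = combine((1, 0, -10, 0, 1), powers(s, 5))   # s^4 - 10 s^2 + 1
relation_product = combine((-6, 0, 1), powers(t, 3))      # t^2 - 6
print(relation_sum, relation_product)
```

Both combinations are the zero vector, so $\sqrt{2} + \sqrt{3}$ is a root of $X^4 - 10X^2 + 1$ and $\sqrt{6}$ is a root of $X^2 - 6$.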




Note that this proof does not work to show that $\overline{\mathbb{Z}}$ is closed under addition and multiplication, as
$\overline{\mathbb{Z}}$ is not a $\mathbb{Q}$-vector space anymore so bases don’t work nicely. However, a proof using linear algebra
still exists, see Section C.3.

Proposition 6.1.1 (Tower Law)

Suppose M/L/K is a tower of extensions (meaning K ⊆ L ⊆ M ). Then,

[M : K] = [M : L][L : K].

In other words, the degree is multiplicative in towers of extensions.

Proof

Let $m = [M : L]$ and $n = [L : K]$. Let $u_1, \ldots, u_m$ be an $L$-basis of $M$, and $v_1, \ldots, v_n$ be a
$K$-basis of $L$. Then, $(u_i v_j)_{i \in [m], j \in [n]}$ is a $K$-basis of $M$. Since this basis has cardinality $mn$,
$[M : K] = mn$.


Exercise 6.1.5∗. Prove that $(u_i v_j)_{i \in [m], j \in [n]}$ is a $K$-basis of $M$.

Exercise 6.1.6∗ . Let M/L/K be a tower of extensions and α ∈ M . Prove that the minimal polynomial
of α over L divides the minimal polynomial of α over K. In other words, its L-conjugates are among its
K-conjugates.

Before we make more definitions, here is an application of why we care about extensions of number
fields (where K 6= Q), and why the tower law is useful.

Problem 6.1.1 (IMC 2012 Problem 5)

Let $a \in \mathbb{Q}$ be a rational number, and $n \geq 1$ be an integer. Prove that the polynomial

$$(X^2 + aX)^{2^n} + 1$$

is irreducible in $\mathbb{Q}[X]$.

Before solving this problem, we need a lemma which follows from the tower law.

Lemma 6.1.1

Let $f, g \in \mathbb{Q}[X]$ be polynomials. Then, $f \circ g$ is irreducible in $\mathbb{Q}[X]$ if and only if $f$ is irreducible
in $\mathbb{Q}[X]$ and $g - \alpha$ is irreducible in $\mathbb{Q}(\alpha)[X]$, where $\alpha$ is a root of $f$.

Proof

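Let $m = \deg f$ and $n = \deg g$. Consider a root $\alpha$ of $f$ and a root $\beta$ of $g - \alpha$. Then, $\beta$ is a root of
$f \circ g$ and we have
$$[\mathbb{Q}(\beta) : \mathbb{Q}] = [\mathbb{Q}(\beta) : \mathbb{Q}(\alpha)][\mathbb{Q}(\alpha) : \mathbb{Q}].$$
The polynomial $f \circ g$ is irreducible if and only if $[\mathbb{Q}(\beta) : \mathbb{Q}] = \deg f \circ g = mn$. Also, $[\mathbb{Q}(\alpha) : \mathbb{Q}] \leq m$ since $\alpha$ is a
root of $f$, with equality iff $f$ is irreducible in $\mathbb{Q}[X]$. Similarly, $[\mathbb{Q}(\beta) : \mathbb{Q}(\alpha)] \leq n$ since $\beta$ is a root
of $g - \alpha$, with equality iff $g - \alpha$ is irreducible in $\mathbb{Q}(\alpha)[X]$. To conclude,

$$[\mathbb{Q}(\beta) : \mathbb{Q}] = [\mathbb{Q}(\beta) : \mathbb{Q}(\alpha)][\mathbb{Q}(\alpha) : \mathbb{Q}] \leq mn$$

with equality if and only if $f$ is irreducible in $\mathbb{Q}[X]$ and $g - \alpha$ is irreducible in $\mathbb{Q}(\alpha)[X]$, as wanted.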


Solution
Using the lemma, we wish to show that $f = X^{2^n} + 1 = \Phi_{2^{n+1}}$ is irreducible in $\mathbb{Q}[X]$, which is
true by Theorem 3.2.1 (or Eisenstein’s criterion), and that $g - \omega = X^2 + aX - \omega$ is irreducible
in $\mathbb{Q}(\omega)[X]$, where $\omega$ is a primitive $2^{n+1}$th root of unity. Here is why this is easier to manipulate: a
polynomial of degree two is reducible if and only if it has a root. Thus, suppose for the sake of
a contradiction that $X^2 + aX - \omega$ has a root in $\mathbb{Q}(\omega)$, i.e. $h(\omega)^2 + ah(\omega) = \omega$ for some $h \in \mathbb{Q}[X]$.

We complete the square and take the norm: $(h(\omega) + b)^2 = \omega + b^2$, where $b = a/2$, and
$$\prod_{k \text{ odd}} (h(\omega^k) + b)^2 = \prod_{k \text{ odd}} (\omega^k + b^2).$$

The LHS is a perfect square by the fundamental theorem of symmetric polynomials, while the
RHS is
$$\prod_{k \text{ odd}} (\omega^k + b^2) = \prod_{k \text{ odd}} (b^2 - \omega^k) = \Phi_{2^{n+1}}(b^2) = b^{2^{n+1}} + 1$$
since $-\omega^k$ also runs over the primitive $2^{n+1}$th roots of unity ($\varphi(2^{n+1}) = 2^n$ is even, so they come in
pairs $\pm\omega^k$). Now, recall that the diophantine equation $x^4 + y^4 = z^2$ has
no solution in non-zero rational integers. This is a classical result which was proven by Fermat.
See Exercise 2.6.12† for a proof.

Hence, the diophantine equation $b^{2^{n+1}} + 1 = c^2$ has only the rational solution $b = 0$, which means
$a = 0$. But then $(X^2 + 0X)^{2^n} + 1 = \Phi_{2^{n+2}}$ is clearly irreducible so we reach a contradiction in
all cases. ∎
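The norm computation on the right-hand side can be checked numerically. The following Python sketch evaluates $\prod_{k \text{ odd}} (b^2 + \omega^k)$ with floating-point complex numbers for a few hand-picked pairs $(n, b)$ and compares it with $b^{2^{n+1}} + 1$; the function name `norm_product`, the sample values and the relative tolerance are assumptions of ours.

```python
import cmath


def norm_product(n, b):
    """Product of (b^2 + w^k) over odd k, w a primitive 2^(n+1)-th root of unity."""
    order = 2 ** (n + 1)
    product = 1
    for k in range(1, order, 2):
        product *= b * b + cmath.exp(2j * cmath.pi * k / order)
    return product


for n, b in [(1, 0.5), (2, 1.5), (3, 2.0)]:
    expected = b ** (2 ** (n + 1)) + 1
    # Floating-point check only: compare up to a small relative error.
    assert abs(norm_product(n, b) - expected) < 1e-9 * expected
    print(n, b, expected)
```

This is only a numerical illustration of the identity; the exact statement follows from $\Phi_{2^{n+1}} = X^{2^n} + 1$.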

Finally, we make three more definitions, the last two having been encountered a few times already
in special cases.

Definition 6.1.5 (Finite Extension)

We say an extension L/K is finite if its degree [L : K] is finite.

Definition 6.1.6 (Number Field)

A finite extension of Q is called a number field .

Exercise 6.1.7∗ . Prove that finite extensions of K are exactly the fields of the form K(α1 , . . . , αn ) for
α1 , . . . , αn algebraic elements over K, using Proposition 6.1.1.

Definition 6.1.7 (Ring of Integers)

Let $K$ be a number field. Its ring of integers, $\mathcal{O}_K$, is the ring of algebraic integers of $K$: $K \cap \overline{\mathbb{Z}}$.

6.2 The Primitive Element Theorem and Field Theory


Our main result is the following: every number field is generated by one element. This is extremely
nice, as you will see with all the applications, as all one has to do is to take into account the minimal
polynomial of the generator to deduce the structure of the field. For instance, as mentioned before,
one can easily compute the degree of $K = \mathbb{Q}(\alpha)$ (if one knows $\alpha$). Compare this to, say, $\mathbb{Q}(\alpha, \beta)$: you
not only have to take into account the contribution of $\alpha$ (its degree over $\mathbb{Q}$), but also what $\beta$ adds to
the contribution of $\alpha$ (its degree over $\mathbb{Q}(\alpha)$)!

Theorem 6.2.1 (Primitive Element Theorem)

Let $K \subseteq \mathbb{C}$ be a field, and $\alpha, \beta \in \overline{K}$ algebraic elements over $K$. Then, there exists a $\gamma \in K(\alpha, \beta)$
such that
$$K(\alpha, \beta) = K(\gamma).$$

Since, by Exercise 6.1.7∗ , every number field is finitely generated, repeated applications of the
primitive element theorem lead to number fields being generated by one element (by induction).

Proof

We take $\gamma = \alpha + t\beta$, for some non-zero $t \in K$ that will be chosen later. We will find two polynomials in
$K(\gamma)[X]$ whose gcd is $X - \alpha$: since the gcd can be obtained by the Euclidean algorithm, this
means that $\alpha \in K(\gamma)$ and thus also $\beta = (\gamma - \alpha)/t \in K(\gamma)$.

For these polynomials, we choose
$$f = \pi_\alpha = \prod_i (X - \alpha_i)$$
and
$$g = \prod_j (X - (\gamma - t\beta_j))$$
where $\alpha_i$ and $\beta_j$ are the conjugates of $\alpha$ and $\beta$ respectively (over $K$). By the fundamental
theorem of symmetric polynomials, they both have coefficients in $K(\gamma)$. It remains to see that
their gcd is $X - \alpha$: since they have distinct roots this is equivalent to $\alpha$ being the only common
root. Suppose $\alpha_i = \gamma - t\beta_j = \alpha + t(\beta - \beta_j)$ is another common root: this yields
$$t = \frac{\alpha_i - \alpha}{\beta - \beta_j}$$
($\beta \neq \beta_j$ as $\alpha \neq \alpha_i$). There are clearly a finite number of such values, thus any sufficiently large
$t$ works.
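The gcd argument can be followed on a small numerical example. In the sketch below we take $K = \mathbb{Q}$, $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$ and $t = 1$ (all our own choices), so that $f = X^2 - 2$ and $g = (\gamma - X)^2 - 3$. A single Euclidean step $f - g$ already gives a polynomial of degree $1$ with coefficients in $\mathbb{Q}(\gamma)$, whose root should be $\alpha$; the check is done in floating point.

```python
from math import isclose, sqrt

alpha, beta, t = sqrt(2), sqrt(3), 1   # illustrative choice, with K = Q
gamma = alpha + t * beta

# f = X^2 - 2 and g = (gamma - X)^2 - 3 = X^2 - 2*gamma*X + gamma^2 - 3.
# One Euclidean step: f - g = 2*gamma*X - (gamma^2 - 1), a polynomial of
# degree 1 with coefficients in Q(gamma). Its root is the common root alpha.
common_root = (gamma ** 2 - 1) / (2 * gamma)

assert isclose(common_root, alpha)
assert isclose((gamma - common_root) / t, beta)   # beta = (gamma - alpha) / t
print(common_root, alpha)
```

In other words, $\sqrt{2} = \frac{\gamma^2 - 1}{2\gamma}$ is a rational expression in $\gamma = \sqrt{2} + \sqrt{3}$, which is exactly what the proof predicts.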


If you read this proof carefully, you might notice that it almost works for any infinite field $K$. In
fact, it does show that if $L/K$ is finite then $L$ is generated by one element, under the assumption
of separability. This means that the conjugates of an element are always distinct. Indeed, if this
assumption is not satisfied, then the gcd of the two polynomials we constructed could be divisible
by $(X - \alpha)^2$ and thus not equal to $X - \alpha$. It seems obvious that irreducible polynomials have no
repeated roots, and that’s because it is, but only in characteristic zero. In characteristic $p$, things
become weird, but one can check that it still holds for finite fields (it can only fail if the derivative
of the polynomial is zero). Since these are the cases we are interested in, we will assume that all our
extensions are separable, so that we can use the primitive element theorem. See Exercise 6.5.39 for an
example of a non-separable extension.

Note that we also proved the primitive element theorem for finite fields, in Corollary 4.2.1.

Definition 6.2.1 (Separable Extension)

An algebraic (i.e. $L \subseteq \overline{K}$) extension $L/K$ is said to be separable if the minimal polynomial of
any element of $L$ has distinct roots.

With this, we can establish a number of field-theoretic results for number fields. In particular, we
shall generalise the norm $N_{\mathbb{Q}(\sqrt{d})}$ of quadratic fields (Definition 2.1.4).

Definition 6.2.2 (Embeddings)

Let $L/K$ be a finite extension. The $K$-embeddings of $L$ are the functions $f : L \to \overline{L}$ which are additive,
multiplicative, and the identity on $K$. In other words, functions satisfying $f(k) = k$ for $k \in K$,
and $f(x + y) = f(x) + f(y)$ as well as $f(xy) = f(x)f(y)$ for $x, y \in L$. The set of $K$-embeddings
of $L$ is denoted $\mathrm{Emb}_K(L)$.

Since we are usually concerned with the case K = Q, we will just say "embedding" for "Q-
embedding" as well as write Emb for EmbQ .

Remark 6.2.1
We say "embedding" because a normal embedding of $S$ into $U$ is an injective morphism $\varphi : S \to
U$. Notice that, for $S = L$ and $U = \overline{L}$, this corresponds to the $\mathbb{Q}$-embeddings of $L/\mathbb{Q}$, which we just call
embeddings. Why do we call such a morphism an "embedding"? Because you "embed" $S$ into
$U$ by associating it with its isomorphic image $\varphi(S)$: you get a copy of $S$ in $U$.

Exercise 6.2.1. Let K be a number field. Prove that the embeddings of K are the non-zero functions
f : K → C which are both multiplicative and additive.

Exercise 6.2.2∗ . Let L/K be a finite extension and ϕ ∈ EmbK (L) an embedding of L. Prove that ϕ(f (α)) =
f (ϕ(α)) for any f ∈ K[X] and α ∈ L.

Exercise 6.2.3∗ . Let α ∈ L be an element and σ ∈ EmbK (L) be an embedding. Prove that σ(α) is a
conjugate of α.

Exercise 6.2.4∗ . Prove that an embedding is injective.


Why do we care about embeddings? Well, because they are precisely the morphisms obtained by
conjugation (in the case where L is generated by one element, which can be achieved thanks to the
primitive element theorem)!

Proposition 6.2.1 (Embeddings)*

Let $K(\alpha) = L/K$ be a finite separable extension. The $K$-embeddings of $K(\alpha)$ are precisely the
functions $\varphi : f(\alpha) \mapsto f(\beta)$ where $\beta$ is some conjugate of $\alpha$ and $f \in K[X]$. In particular, there
are exactly $[L : K]$ of them.

Let’s check that this statement makes sense: $\varphi$ is clearly additive and multiplicative, and it indeed
fixes $K$ since if $f = k$ is a constant polynomial then $k(\alpha) = k = k(\beta)$. Why doesn’t it work for any
$\beta$ then? That’s because we need to check it’s well defined: an element of $K(\alpha)$ has multiple ways of
being written (e.g. $0 = \pi_\alpha(\alpha)$) (here $\pi_\alpha$ means the minimal polynomial with coefficients in $K$). This
is very easy to show: if $f(\alpha) = g(\alpha)$ then
$$f \equiv g \pmod{\pi_\alpha} \iff f \equiv g \pmod{\pi_\beta}$$
so $f(\beta) = g(\beta)$ which shows that it’s well defined. In fact, we have already proven the proposition like
this, since by Exercise 6.2.3∗ $\beta = \varphi(\alpha)$ is a conjugate of $\alpha$ and by Exercise 6.2.2∗ $\varphi : f(\alpha) \mapsto f(\beta)$.
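For a concrete instance, consider $K = \mathbb{Q}$ and $\alpha = \sqrt{2}$, whose only other conjugate is $-\sqrt{2}$. The Python sketch below (our own illustration, with hypothetical helper names `mul` and `conj`) stores $a + b\sqrt{2}$ as a pair of rationals and checks on sample elements that $f(\sqrt{2}) \mapsto f(-\sqrt{2})$ is multiplicative and fixes $\mathbb{Q}$. It also shows why a non-conjugate such as $\sqrt{3}$ cannot be used: two polynomials agreeing at $\sqrt{2}$ need not agree at $\sqrt{3}$.

```python
from fractions import Fraction
from math import isclose, sqrt


def mul(x, y):
    """Multiply a + b*sqrt2, stored as pairs (a, b)."""
    (a1, b1), (a2, b2) = x, y
    return (a1 * a2 + 2 * b1 * b2, a1 * b2 + b1 * a2)


def conj(x):
    """The embedding f(sqrt2) -> f(-sqrt2)."""
    a, b = x
    return (a, -b)


x = (Fraction(1, 2), Fraction(3))
y = (Fraction(-2), Fraction(5, 7))
assert conj(mul(x, y)) == mul(conj(x), conj(y))     # multiplicative
assert conj((Fraction(4), Fraction(0))) == (4, 0)   # identity on Q

# Sending sqrt2 to a non-conjugate such as sqrt3 is not well defined:
# X^2 and the constant 2 agree at sqrt2 but not at sqrt3.
assert isclose(sqrt(2) ** 2, 2)
assert not isclose(sqrt(3) ** 2, 2)
```

The last two assertions are the numerical shadow of the congruence argument: $X^2 \equiv 2 \pmod{\pi_{\sqrt{2}}}$, but not modulo $X^2 - 3$.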

Remark 6.2.2
If $L/K$ is algebraic separable but not finite, there are many embeddings and the fundamental
theorem of Galois theory (Theorem 6.3.1) does not hold anymore (the way it’s currently stated).
For $L/K$ not algebraic, there are even more embeddings! For instance, if $L = K(\alpha)$ with $\alpha$
transcendental over $K$, then the embeddings of $L/K$ are $\sigma_\beta : f(\alpha) \mapsto f(\beta)$ for any transcendental
$\beta$. (In other words, transcendental elements are all conjugates in some sense.)

Embeddings give us a systematic way to manipulate conjugation. For instance, the embeddings
of C/R are the identity and complex conjugation, and using the conjugation embedding we proved
Proposition A.1.2. We will illustrate this by an application soon, but we need to build up a few results
on embeddings first, namely an equivalent version of Exercise 1.5.20† . Here is how to prove it with
the formalism of embeddings (note that it’s the same proof, but with more comfortable objects). Note
that this result is completely obvious: everything is symmetric between conjugates, so of course they
are reached the same number of times. We are just expressing this symmetry more formally.

Proposition 6.2.2*

Let f ∈ K[X]. The m conjugates of f (α) are f (ϕ(α)), and each is represented n/m times where
n = [K(α) : K] is the degree of α.

Proof

Note that conjugates are reached at least once: this follows from the fundamental theorem of
symmetric polynomials, since $\prod_i (X - f(\alpha_i))$ has coefficients in $K$, where $\alpha_i$ are the conjugates of $\alpha$.
Moreover, by Exercise 6.2.3∗, $f(\alpha_i)$ is always a conjugate of $f(\alpha)$.

It remains to see that they all appear the same number of times. Suppose $f(\varphi_1(\alpha)) = f(\varphi_2(\alpha)) =
\ldots = f(\varphi_k(\alpha))$ is reached exactly $k$ times, where $\varphi_1 = \mathrm{id}$.

Consider one of its conjugates, $f(\psi(\alpha)) = \psi(f(\varphi_1(\alpha)))$. Since $\psi$ is injective by Exercise 6.2.4∗,
we conclude that $f(\psi(\alpha))$ is also reached exactly $k$ times, for

$$f(\psi(\alpha)) = \psi(f(\varphi_i(\alpha))) = f(\psi\varphi_i(\alpha))$$

and if $f(\psi(\alpha)) = f(\psi'(\alpha))$ then $f(\alpha) = f(\psi^{-1}\psi'(\alpha))$ so $\psi^{-1}\psi' \in \{\varphi_1, \ldots, \varphi_k\}$ as wanted.




Remark 6.2.3
Here is how this could be proven without field theory. In fact, in this case it is even a lot
quicker. However, we have chosen this approach because in general it is nicer to think in terms
of embeddings, and this will be particularly important for Section 6.3. Since all roots of
$\pi = \prod_i (X - f(\alpha_i))$ are roots of $\pi_{f(\alpha)}$, the factorisation of $\pi$ in irreducible polynomials must be $\pi_{f(\alpha)}^k$ for some $k$.
This means exactly that all conjugates are reached the same number of times.

This result can be reformulated to show that, if $M/L/K$ is a tower of extensions such that $M/K$
is separable (since we only defined embeddings for separable extensions), every $K$-embedding of $L$
extends to exactly $[M : K]/[L : K] = [M : L]$ $K$-embeddings of $M$.

Proposition 6.2.3*

Any $K$-embedding of $L$ extends to exactly $[M : L]$ $K$-embeddings of $M$. In particular, every
embedding is reached.

Proof

By Exercise 6.2.3∗ , embeddings of M/K restrict to embeddings of L/K, and by Proposition 6.2.2
each embedding is reached the same number of times, i.e. [M : K]/[L : K] = [M : L].


With these results, we can now define the norm for any finite extension! But first, we present an
example that shows the conceptual power of embeddings (which, again, only provide a more comfortable
way to solve the problem, the essence stays the same).

Problem 6.2.1

Suppose $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$ are positive real algebraic numbers such that $\alpha_i$ is maximal out of the
absolute values of its conjugates for each $i$. Prove that, if $\sum_{i=1}^n \alpha_i$ is rational, then $\alpha_i \in \mathbb{Q}$ for
each $i$.

Proof
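Consider the embeddings of $K = \mathbb{Q}(\alpha_1, \ldots, \alpha_n)$. If $\sum_{i=1}^n \alpha_i \in \mathbb{Q}$ is rational, it is fixed by every
embedding $\sigma$ of $K$. But, the absolute value of
$$\sum_{i=1}^n \alpha_i = \sigma\left(\sum_{i=1}^n \alpha_i\right) = \sum_{i=1}^n \sigma(\alpha_i)$$
is at most $\sum_{i=1}^n |\sigma(\alpha_i)| \leq \sum_{i=1}^n \alpha_i$. Hence, since we have equality in the triangle inequality,
we must have $\sigma(\alpha_i) = u\alpha_i$ for some $|u| = 1$ (independent of $i$). But then,
$$\sum_{i=1}^n \alpha_i = \sum_{i=1}^n \sigma(\alpha_i) = u \sum_{i=1}^n \alpha_i$$

so $u = 1$. Finally, this means that $\alpha_i$ is fixed by any $\sigma \in \mathrm{Emb}(K)$, which means that it’s fixed by
any $\sigma \in \mathrm{Emb}(\mathbb{Q}(\alpha_i))$ by Proposition 6.2.3. We conclude that its only conjugate is itself: $\alpha_i \in \mathbb{Q}$.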


Exercise 6.2.5. Solve Problem 6.2.1 without field theory, i.e. using only the content of Chapter 1.
As a notable corollary, we get that $\sum_{i=1}^n \sqrt[k_i]{a_i}$ for positive $a_i$ is rational iff $a_i$ is a perfect $k_i$th power
for each $i$, which was Exercise 1.5.6. To conclude this section, we define the norm in arbitrary finite
extensions.

Definition 6.2.3 (Norm)

Let L/K be a finite separable extension. The norm $N_{L/K}$ is defined as
\[ N_{L/K}(\alpha) = \prod_{\sigma \in \mathrm{Emb}_K(L)} \sigma(\alpha). \]

Notice in particular that the norm is obviously multiplicative because the embeddings are! As an
example, the norm in C/R is the square of the modulus: $N(a + bi) = a^2 + b^2 = |a + bi|^2$. Here is a bad
application of the multiplicativity of the norm, which nonetheless appeared in a USA TST.
6.3. GALOIS THEORY 97

Problem 6.2.2 (USA TST 2012)

Do there exist arbitrarily large rational integers $a, b, c$ such that $a^3 + 2b^3 + 4c^3 = 6abc + 1$?

Solution

One can check that $N_{\mathbb{Q}(\sqrt[3]{2})}(a + b\sqrt[3]{2} + c\sqrt[3]{4}) = a^3 + 2b^3 + 4c^3 - 6abc$. Thus we want to find
elements of norm 1 in this field. Notice that $N_{\mathbb{Q}(\sqrt[3]{2})}(1 + \sqrt[3]{2} + \sqrt[3]{4}) = 1 + 2 + 4 - 6 = 1$, so
\[ a + b\sqrt[3]{2} + c\sqrt[3]{4} = (1 + \sqrt[3]{2} + \sqrt[3]{4})^n \]
will also have norm 1 for any $n$. Pick an $n$ sufficiently large and we are done.

Exercise 6.2.6∗. Check that $N_{\mathbb{Q}(\sqrt[3]{2})}(a + b\sqrt[3]{2} + c\sqrt[3]{4}) = a^3 + 2b^3 + 4c^3 - 6abc$.
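The solution of Problem 6.2.2 can be checked on a computer. Below is a minimal Python sketch, assuming we store $a + b\sqrt[3]{2} + c\sqrt[3]{4}$ as the integer triple (a, b, c); the helper names multiply, norm_form and power are ours, chosen for illustration. It evaluates the cubic form on the first few powers of $1 + \sqrt[3]{2} + \sqrt[3]{4}$; it is a sanity check, not a proof of Exercise 6.2.6.

```python
# Elements a + b*t + c*t^2 with t = cbrt(2) are stored as integer triples (a, b, c).

def multiply(x, y):
    """Multiply two such elements, reducing with t^3 = 2."""
    a, b, c = x
    d, e, f = y
    return (
        a * d + 2 * (b * f + c * e),   # constant term: the t^3 terms contribute 2*(bf + ce)
        a * e + b * d + 2 * c * f,     # coefficient of t: the t^4 term contributes 2*cf
        a * f + b * e + c * d,         # coefficient of t^2
    )

def norm_form(x):
    """The cubic form a^3 + 2b^3 + 4c^3 - 6abc from the text."""
    a, b, c = x
    return a**3 + 2 * b**3 + 4 * c**3 - 6 * a * b * c

def power(x, n):
    """n-th power of x by repeated multiplication."""
    result = (1, 0, 0)
    for _ in range(n):
        result = multiply(result, x)
    return result

unit = (1, 1, 1)                       # 1 + cbrt(2) + cbrt(4)
solutions = [power(unit, n) for n in range(1, 8)]
for triple in solutions:
    print(triple, norm_form(triple))   # the second entry should always be 1
```

Since all three coordinates grow with n, each power gives a larger solution of $a^3 + 2b^3 + 4c^3 = 6abc + 1$.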

Norms in different extensions are linked by the following proposition. This is left as an exercise in
the next section, as we need to define Galois groups to prove it.

Proposition 6.2.4

Let M/L/K be a tower of finite separable field extensions. Then $N_{M/K} = N_{L/K} \circ N_{M/L}$.

6.3 Galois Theory


Galois theory studies Galois extensions, i.e. extensions closed under embeddings. An algebraic exten-
sion L/K means that every element of L is algebraic over K.

Definition 6.3.1 (Galois Extension)

A Galois extension L/K is a separable algebraic extension closed under embeddings, meaning
that for any α ∈ L, all the conjugates of α also lie in L.

The simplest examples of Galois extensions are the quadratic fields we saw in Chapter 2 and the
cyclotomic fields Q(ω) where ω is a root of unity. Indeed, all the conjugates of ω have the form $\omega^k \in \mathbb{Q}(\omega)$.
It is easy to construct Galois extensions: one starts with any extension K(α) and then gets the Galois
closure $L = K(\alpha_1, \ldots, \alpha_n)$ by adding the conjugates $\alpha_1, \ldots, \alpha_n$ of α.
Exercise 6.3.1∗ . Check that K(α1 , . . . , αn )/K is Galois and prove that any Galois extension has this form.

Galois extensions are particularly interesting because EmbK (L) becomes a group, meaning that
the composition of two embeddings is still an embedding since embeddings are now L → L. We thus
denote EmbK (L) as AutK (L) (also written Aut(L/K)) when L is Galois, because its embeddings are
now automorphisms.

Definition 6.3.2 (Galois Group)

The Galois group Gal(L/K) of a Galois extension L/K is its group of K-embeddings: $\mathrm{Aut}_K(L)$.

Exercise 6.3.2∗ . Can you write the Galois group of a quadratic extension L/K in a way that doesn’t depend
on L or K? (More specifically, show that the Galois groups of quadratic extensions are all isomorphic.)

Exercise 6.3.3∗. Check that the Galois group is a group under composition. (You may assume that each
element has an inverse; this will be proven later as a corollary of Theorem 6.3.2.)

Exercise 6.3.4. Let L/K be Galois and K ⊆ M ⊆ L be an intermediate field. Prove that $\mathrm{Emb}_K(M)$ is a
system of representatives of Gal(L/K)/Gal(L/M), where the quotient means Gal(L/K) modulo Gal(L/M),
i.e. we say $\sigma' \equiv \sigma$ if $\sigma^{-1} \circ \sigma' \in \mathrm{Gal}(L/M)$. (Our quotient A/B is more commonly thought of as the set
of right-cosets of B in A, i.e. the sets Ba for a ∈ A (which we just wrote as a in our case).) (See also
Exercise A.3.14†.)

Exercise 6.3.5. Prove Proposition 6.2.4 using Exercise 6.3.4. (This is a bit technical.)
Again, when K = Q, we may drop the K in the notation. We will also sometimes write G(L/K).
Here is the most important Galois group: $\mathrm{Gal}(\mathbb{Q}(\omega_n)/\mathbb{Q})$ where $\omega_n$ is a primitive nth root of unity. By
Theorem 3.2.1, its elements are the $\sigma_k : \omega_n \mapsto \omega_n^k$ for gcd(n, k) = 1. Note that $\sigma_i \circ \sigma_j$ maps $\omega_n$ to $(\omega_n^j)^i = \omega_n^{ij}$. This
means that $\sigma_i \circ \sigma_j = \sigma_{ij}$. It is thus isomorphic to $(\mathbb{Z}/n\mathbb{Z})^\times$: just label $\sigma_k$ as k (mod n) and this makes
sense by the previous consideration on composition (which becomes multiplication in $(\mathbb{Z}/n\mathbb{Z})^\times$). In
particular, it is abelian, which means that ab = ba (composition commutes).
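Another particularly simple Galois group is $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p)$: by Theorem 4.3.1, its elements are
\[ \mathrm{id}, \varphi_p, \varphi_p^2, \ldots, \varphi_p^{n-1}. \]
In particular, it is generated by only one element: if we relabel $\varphi_p^k$ as k (mod n) we get $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p) \simeq \mathbb{Z}/n\mathbb{Z}$!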
Before we state and prove the fundamental theorem of Galois theory, here is a quick application of
the Galois group which is a special case of ??.

Problem 6.3.1

Is $\sqrt[3]{2}$ a sum of roots of unity?

Solution

Suppose that $\sqrt[3]{2} \in \mathbb{Q}(\omega)$ for some root of unity ω. $X^3 - 2$ is irreducible in Q[X], so the
conjugates of $\sqrt[3]{2}$ are $\zeta^i \sqrt[3]{2}$ where ζ is a primitive third root of unity. Since Q(ω) is Galois,
$K = \mathbb{Q}(\sqrt[3]{2}, \zeta) \subseteq \mathbb{Q}(\omega)$. By Proposition 6.2.3, Gal(Q(ω)/Q) restricts to (multiple copies of)
$G = \mathrm{Gal}(\mathbb{Q}(\sqrt[3]{2}, \zeta))$.
The key point is that Gal(Q(ω)/Q) is abelian (commutes), while G isn't, so this is impossible.
We have already shown that the former is abelian, so it remains to show that G is not. We claim
that the embeddings of K are
\[ \sigma_{(a,b)} : \begin{cases} \sqrt[3]{2} \mapsto \zeta^a \sqrt[3]{2} \\ \zeta \mapsto \zeta^b \end{cases} \]
for (independent) $a \in \mathbb{Z}/3\mathbb{Z}$ and $b \in (\mathbb{Z}/3\mathbb{Z})^\times$.
Clearly, these are all the possible embeddings of K, so it remains to check that they are indeed
embeddings, i.e. that [K : Q] = 3 · 2 = 6. Since
\[ [K : \mathbb{Q}] = [K : \mathbb{Q}(\zeta)][\mathbb{Q}(\zeta) : \mathbb{Q}] = 2[K : \mathbb{Q}(\zeta)], \]
it remains to prove that [K : Q(ζ)] = 3, i.e. that $X^3 - 2$ is irreducible over Q(ζ). This is very
easy: if it weren't the case it would have a root in Q(ζ), so Q(ζ) would contain an element of
degree 3, which is impossible as Q(ζ) has degree 2.
Finally, one can see that G is not abelian as $\sigma_{(0,-1)} \circ \sigma_{(1,1)} = \sigma_{(-1,-1)}$, but $\sigma_{(1,1)} \circ \sigma_{(0,-1)} = \sigma_{(1,-1)}$.

Therefore, $\sqrt[3]{2}$ is not a sum of roots of unity.

Remark 6.3.1
In fact, the Kronecker-Weber theorem asserts the converse: if Gal(K/Q) is abelian, K is contained
in a cyclotomic field Q(ω) for some root of unity ω.

Exercise 6.3.6∗ . Compute σ(0,−1) ◦ σ(1,1) and σ(1,1) ◦ σ(0,−1) .
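The two compositions used in the solution of Problem 6.3.1 can be verified mechanically. Here is a minimal Python sketch, assuming we encode the embedding sending $\sqrt[3]{2} \mapsto \zeta^a \sqrt[3]{2}$ and $\zeta \mapsto \zeta^b$ as the pair (a, b) with a modulo 3 and b ∈ {1, 2} (so that b = 2 plays the role of −1); the function name compose is ours.

```python
from itertools import product

def compose(s, t):
    """Return s o t (apply t first), where the pair (a, b) stands for
    cbrt(2) -> zeta^a * cbrt(2) and zeta -> zeta^b."""
    a, b = s
    c, d = t
    # t sends cbrt(2) to zeta^c * cbrt(2); applying s gives zeta^(b*c) * zeta^a * cbrt(2).
    # t sends zeta to zeta^d; applying s gives zeta^(b*d).
    return ((a + b * c) % 3, (b * d) % 3)

group = [(a, b) for a in range(3) for b in (1, 2)]   # the six embeddings of K

left = compose((0, 2), (1, 1))     # sigma_(0,-1) o sigma_(1,1)
right = compose((1, 1), (0, 2))    # sigma_(1,1) o sigma_(0,-1)
print(left, right)

# All pairs of embeddings that fail to commute.
non_commuting = [(s, t) for s, t in product(group, repeat=2)
                 if compose(s, t) != compose(t, s)]
print(len(non_commuting) > 0)
```

The rule (a, b)(c, d) = (a + bc, bd) is read off from the action on the two generators, which is all that is needed since an embedding of K is determined by the images of $\sqrt[3]{2}$ and ζ.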


Here is the main reason why Galois groups are interesting.

Theorem 6.3.1 (Fundamental Theorem of Galois Theory)

Let L/K be a finite Galois extension. There is a one-to-one correspondence – called the Galois
correspondence – between subgroups H of Gal(L/K) (subsets closed under composition and
inversion) and the intermediate fields L/M/K. This correspondence is given by

\[ H \mapsto L^H, \]

where $L^H$ is the fixed field of H, i.e. the elements of L which are fixed by all of H. The reverse
direction is given by
\[ M \mapsto \mathrm{Gal}(L/M). \]

Proof

Note that $\mathrm{Gal}(L/M) = \mathrm{Emb}_M(L)$. In particular, the elements fixed by Gal(L/M) are those
which have only one M-conjugate: they are thus in M. This shows that $L^{\mathrm{Gal}(L/M)} = M$.

It remains to prove that $\mathrm{Gal}(L/L^H) = H$. Clearly, $H \subseteq \mathrm{Gal}(L/L^H)$ since H fixes $L^H$ by
definition. Write $L = L^H(\alpha)$: the cardinality of $\mathrm{Gal}(L/L^H)$ is the degree of α (over $L^H$).
However, note that
\[ e_i(\sigma_1(\alpha), \ldots, \sigma_k(\alpha)) \]
is fixed by H for any i, where $\sigma_1, \ldots, \sigma_k$ are the elements of H (this is because σH = H for
any σ ∈ H). Thus, α is a root of $\prod_{j=1}^k (X - \sigma_j(\alpha)) \in L^H[X]$, so it has at most k conjugates. To conclude, we have $H \subseteq \mathrm{Gal}(L/L^H)$ and
$|\mathrm{Gal}(L/L^H)| \le |H|$, which implies $H = \mathrm{Gal}(L/L^H)$.


Remark 6.3.2
There is an explicit way of choosing a primitive element of $L^H$ from a primitive element of L. If
L = K(α), set $f_H(X) = \prod_{\sigma \in H} (X - \sigma(\alpha))$. For sufficiently large n ∈ Z, $f_H(n)$ is fixed only by H:
since $\sigma f_H$ and $f_H$ are distinct polynomials for σ ∉ H, they take the same value at only finitely many points.
Finally, if β is fixed only by H, then Gal(L/K(β)) = H by definition, which means $K(\beta) = L^H$.

Exercise 6.3.7∗ . Prove that ei (σ1 (α), . . . , σk (α)) is fixed by H for any i.
For $L/K = \mathbb{F}_{p^n}/\mathbb{F}_p$ for instance, this is Corollary 4.3.1. Indeed, the (additive, so closed under
addition) subgroups of $\mathbb{Z}/n\mathbb{Z}$ are simply $\mathbb{Z}/d\mathbb{Z}$ for $d \mid n$ and the fixed field $\mathbb{F}_{p^n}^{\mathbb{Z}/d\mathbb{Z}}$ is $\mathbb{F}_{p^{n/d}}$, the fixed
field of $\varphi_p^{n/d}$ (as $\mathbb{Z}/d\mathbb{Z}$ is generated by $\varphi_p^{n/d}$). We now present a quick application of the fundamental
theorem of Galois theory in the case of cyclotomic fields.

Problem 6.3.2

Let ωm be a primitive mth root of unity, and ωn a primitive nth root of unity. What are
Q(ωm , ωn ) and Q(ωm ) ∩ Q(ωn )?

Solution

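Let ω be a primitive mnth root of unity and $\sigma_k$ be the embedding $\omega \mapsto \omega^k$.

Let $\omega_d$ be a primitive dth root of unity where d | mn. Notice that $\mathbb{Q}(\omega_d)$ is the fixed field of
\[ H_d = \{\sigma_k \mid k \cdot mn/d \equiv mn/d \pmod{mn} \iff k \equiv 1 \pmod{d}\} \]
since these are exactly the automorphisms such that $\sigma_k(\omega_d) = \omega_d$, as $\omega_d = \omega^{mn/d}$. In particular,
\[ H_m \cap H_n = \{\sigma_k \mid k \equiv 1 \pmod{m},\ k \equiv 1 \pmod{n}\} = \{\sigma_k \mid k \equiv 1 \pmod{\operatorname{lcm}(m,n)}\} = H_{\operatorname{lcm}(m,n)}. \]
This means that $\mathbb{Q}(\omega_m, \omega_n)$ is $\mathbb{Q}(\omega_{\operatorname{lcm}(m,n)})$ by Exercise 6.3.8∗. Similarly, the group generated
by $H_m$ and $H_n$ is
\[ \langle H_m, H_n \rangle = \{\sigma_{ab} \mid a \equiv 1 \pmod{m},\ b \equiv 1 \pmod{n}\} = H_{\gcd(m,n)} \]
since $(1 + am)(1 + bn) \equiv 1 + (am + bn) \pmod{mn}$ goes through every residue which is 1 modulo gcd(m, n)
by Bézout's lemma. Thus, $\mathbb{Q}(\omega_m) \cap \mathbb{Q}(\omega_n)$ is $\mathbb{Q}(\omega_{\gcd(m,n)})$.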
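The two subgroup identities in this solution are easy to test for small m and n. Below is a minimal Python sketch, assuming we identify the embedding ω ↦ ω^k with the residue k modulo mn; the helper names H and generated are ours. Because the group is abelian, the subgroup generated by two subgroups is just the set of pairwise products, which is what generated computes.

```python
from math import gcd

def H(d, N):
    """Residues k in (Z/NZ)^x with k = 1 (mod d); stands for the subgroup H_d."""
    return {k for k in range(N) if gcd(k, N) == 1 and k % d == 1 % d}

def generated(A, B, N):
    """Subgroup generated by the subgroups A and B of the abelian group (Z/NZ)^x."""
    return {(a * b) % N for a in A for b in B}

def lcm(a, b):
    return a * b // gcd(a, b)

m, n = 4, 6
N = m * n
print(sorted(H(m, N) & H(n, N)), sorted(H(lcm(m, n), N)))          # intersection vs H_lcm
print(sorted(generated(H(m, N), H(n, N), N)), sorted(H(gcd(m, n), N)))  # generated vs H_gcd
```

For m = 4 and n = 6 this compares subgroups of $(\mathbb{Z}/24\mathbb{Z})^\times$; changing m and n lets one experiment with other cases.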

Remark 6.3.3
In fact, a very direct proof could be given for $\mathbb{Q}(\omega_m, \omega_n) = \mathbb{Q}(\omega_{\operatorname{lcm}(m,n)})$: one inclusion is trivial,
and for the other we have
\[ \exp\left(\frac{2i\pi}{m}\right)^b \exp\left(\frac{2i\pi}{n}\right)^a = \exp\left(\frac{2i\pi}{\operatorname{lcm}(m, n)}\right) \]
where am + bn = gcd(m, n). However, such a proof does not work anymore for $\mathbb{Q}(\omega_m) \cap \mathbb{Q}(\omega_n)$
because we do not have direct access to this field. For instance, $K(\omega_m, \omega_n) = K(\omega_{\operatorname{lcm}(m,n)})$ is
always true, but $K(\omega_m) \cap K(\omega_n) = K(\omega_{\gcd(m,n)})$ isn't always! As an example, if $K = \mathbb{Q}(\sqrt{3})$,
then $K(i) = \mathbb{Q}(i, \sqrt{3}) = K(j)$ where j is a primitive cube root of unity. Thus, $K(i) \cap K(j) \neq K$.

Exercise 6.3.8∗ . Given two subfields A and B of a field L, define their compositum or composite field AB as
the smallest subfield of L containing both A and B (in other words, the field generated by A and B). Let L/K
be a finite Galois extension and A, B be intermediate fields. Prove that Gal(L/AB) = Gal(L/A) ∩ Gal(L/B).

Exercise 6.3.9∗. Given two subgroups $H_1, H_2$ of a group H, define the subgroup they generate, $\langle H_1, H_2 \rangle$,
as the smallest subgroup containing both $H_1$ and $H_2$. Let L/K be a finite Galois extension and A, B be
intermediate fields. Prove that $\mathrm{Gal}(L/A \cap B) = \langle \mathrm{Gal}(L/A), \mathrm{Gal}(L/B) \rangle$.

Here are a few additional properties of the Galois correspondence.

Proposition 6.3.1

We have $[L^H : K] = |G|/|H|$ where G = Gal(L/K).

Proof

$|H| = |\mathrm{Aut}(L/L^H)| = [L : L^H]$ and $|G| = [L : K]$ so
\[ [L^H : K] = \frac{[L : K]}{[L : L^H]} = \frac{|G|}{|H|}. \]



Proposition 6.3.2

The Galois correspondence is inclusion reversing: $H_1 \subseteq H_2 \iff L^{H_1} \supseteq L^{H_2}$.

Exercise 6.3.10∗ . Prove Proposition 6.3.2.

Here is another application of the fundamental theorem of Galois theory, generalising Problem 6.3.1.

Problem 6.3.3

When is $\sqrt[n]{2}$ a sum of roots of unity?

Solution

Suppose that $\mathbb{Q}(\sqrt[n]{2}) \subseteq \mathbb{Q}(\omega)$ for some root of unity ω. By Exercise 6.3.11∗, any subfield of Q(ω)
is Galois over Q, so $\mathbb{Q}(\sqrt[n]{2})$ also is. Note that, since $X^n - 2$ is irreducible by Eisenstein's criterion,
the conjugates of $\sqrt[n]{2}$ are $\zeta^k \sqrt[n]{2}$ where ζ is a primitive nth root of unity. In particular, we must
have $\mathbb{Q}(\zeta) \subseteq \mathbb{Q}(\sqrt[n]{2}) \subseteq \mathbb{R}$, which implies n = 1 or n = 2. Conversely, these work: 2 = 1 + 1 and
$\sqrt{2} = \pm(\omega + 1/\omega)$ for any primitive eighth root of unity ω. Indeed,
\[ \left(\omega + \frac{1}{\omega}\right)^2 = \omega^2 + \frac{1}{\omega^2} + 2 = 2 \]
as $\omega^4 = -1$ (this is a Gauss sum).
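The identity $\sqrt{2} = \pm(\omega + 1/\omega)$ can be checked numerically. Here is a minimal Python sketch using floating-point complex numbers from the standard cmath module, so equalities only hold up to rounding error; the variable names are ours.

```python
import cmath
import math

# The four primitive 8th roots of unity are exp(2*pi*i*k/8) for k = 1, 3, 5, 7.
roots = [cmath.exp(2j * cmath.pi * k / 8) for k in (1, 3, 5, 7)]

# omega + 1/omega should be +sqrt(2) or -sqrt(2), so its square should be 2.
values = [w + 1 / w for w in roots]
for v in values:
    print(v, v * v, math.sqrt(2))
```

Which sign appears depends on the choice of ω, in line with the ± of the solution.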

Exercise 6.3.11∗. Let L/K be a finite Galois extension and let M be an intermediate field. Prove that, for
any σ ∈ Gal(L/K), $\mathrm{Gal}(L/\sigma M) = \sigma\, \mathrm{Gal}(L/M)\, \sigma^{-1}$. Deduce that the intermediate fields which are also Galois
(over K) are $M = L^H$ where H is a normal subgroup of G = Gal(L/K), meaning that $\sigma H \sigma^{-1} = H$ for any
σ ∈ G. In particular, if L/K is abelian, meaning that its Galois group is, any intermediate field is Galois over
K.

This fundamental theorem of Galois theory lets us get deeper insight into the Gauss sums of Section 4.5.
Indeed, the Galois group of Q(ω) where ω is a primitive qth root of unity is (isomorphic to)
$(\mathbb{Z}/q\mathbb{Z})^\times$. By Galois theory (and a bit of group theory), we already know that this field contains a
unique field of degree 2: indeed, by Proposition 6.3.1, it's $L^H$ where |G|/|H| = 2. It can be seen easily
that the unique such subgroup is the subgroup of quadratic residues. Let $\sigma_k$ denote the embedding
$\omega \mapsto \omega^k$. Hence, we get
\[ \sum_{\left(\frac{k}{q}\right) = 1} \omega^k \in L^H \]
and
\[ \sum_{\left(\frac{k}{q}\right) = -1} \omega^k \in L^H \]
(they are fixed by the embeddings of H) and then it's just a matter of computing the value of these
sums to deduce what the quadratic field is. To simplify things a bit we can consider our Gauss sum
since when we square it it's fixed by all automorphisms, which means it's rational.
Once we know that this quadratic field is $\mathbb{Q}(\sqrt{q^*})$ where $q^* = (-1)^{\frac{q-1}{2}} q$, we directly$^1$ get the law
of quadratic reciprocity (without using the Gauss sum): if $\sqrt{q^*} = f(\omega) \in L^H$ for some f ∈ Z[X], then
\[ (\sqrt{q^*})^p \equiv \sigma_p(\sqrt{q^*}) \pmod{p} \]
by Frobenius and this is equal to $\sqrt{q^*}$ iff $\sigma_p \in H$, i.e. p is a quadratic residue modulo q. Otherwise, it's
its other conjugate $-\sqrt{q^*}$. The rest of the proof is the same as before.
1 Actually, we need to know that the denominators of the coefficients of f are not divisible by p, so that f makes sense
over $\mathbb{F}_p$ and we can use the Frobenius morphism. This follows for instance from Exercise 3.5.26†.

To finish with the quadratic reciprocity law, we said that we didn't need to use Gauss sums, but
then how do we show that $\sqrt{q^*} \in \mathbb{Q}(\omega)$ without computing them? One way of doing this is to notice
that, on the one hand,
\[ \prod_{k=1}^{q-1} (1 - \omega^k) = \Phi_q(1) = q. \]
On the other hand,
\[ \prod_{k=1}^{q-1} (1 - \omega^k) = \prod_{k=1}^{\frac{q-1}{2}} (1 - \omega^k)(1 - \omega^{-k}) = \prod_{k=1}^{\frac{q-1}{2}} -\omega^{-k}(1 - \omega^k)^2 = (-1)^{\frac{q-1}{2}} \omega^{\ell} \left(\prod_{k=1}^{\frac{q-1}{2}} (1 - \omega^k)\right)^2 \]
so $(-1)^{\frac{q-1}{2}} q$ is a square ($\omega^\ell$ is a square: just choose $\ell' \equiv \ell/2 \pmod{q}$ to get $\omega^\ell = (\omega^{\ell'})^2$), thus
concluding our new proof of the quadratic reciprocity law.
Exercise 6.3.12∗ . Fill in the details of this proof of the quadratic reciprocity law.
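Both facts used above, namely that $\prod_{k=1}^{q-1}(1 - \omega^k) = q$ and that $q^* = (-1)^{\frac{q-1}{2}} q$ is a square in Q(ω), can be observed numerically. The following minimal Python sketch uses floating-point complex numbers, so it only checks the identities up to rounding error; the function names legendre and check are ours, and the square root of $q^*$ that it exhibits is the Gauss sum of Section 4.5.

```python
import cmath

def legendre(k, q):
    """Legendre symbol (k/q) for an odd prime q and k not divisible by q (Euler's criterion)."""
    r = pow(k, (q - 1) // 2, q)
    return -1 if r == q - 1 else r

def check(q):
    """Return (product of 1 - omega^k, square of the Gauss sum, q*) for an odd prime q."""
    omega = cmath.exp(2j * cmath.pi / q)
    prod = 1
    for k in range(1, q):
        prod *= 1 - omega**k
    gauss = sum(legendre(k, q) * omega**k for k in range(1, q))
    q_star = (-1) ** ((q - 1) // 2) * q
    return prod, gauss**2, q_star

for q in (3, 5, 7, 11, 13):
    prod, square, q_star = check(q)
    print(q, prod, square, q_star)   # prod should be close to q, square close to q*
```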

We worked hard to get all of this, so here is a concrete application of the fundamental theorem of
Galois theory, which generalises Proposition 4.4.2.

Problem 6.3.4 (Schur)

Let H be a subgroup of $(\mathbb{Z}/n\mathbb{Z})^\times$ (i.e. a subset closed under multiplication and inversion,
or equivalently just closed under multiplication by little Fermat). Prove that there exists a
polynomial f ∈ Z[X] such that, for any rational prime $p \nmid n$, f (mod p) has all its roots in $\mathbb{F}_p$
when p (mod n) ∈ H, and no roots in $\mathbb{F}_p$ otherwise, up to a finite number of exceptions.

Proof

If one tries to copy the construction of Ψ in Proposition 4.4.2, one might choose f to be the
minimal polynomial of $\sum_{h \in H} \omega^h$ where ω is a complex primitive nth root of unity.
However, when we need to show that
\[ \left(\sum_{h \in H} \zeta^h\right)^p = \sum_{h \in H} \zeta^{ph} \]
is equal to $\sum_{h \in H} \zeta^h$ (where ζ is now a primitive nth root of unity in $\overline{\mathbb{F}_p}$) iff p (mod n) ∈ H we
run into trouble. First, notice that this is equivalent to
\[ \sum_{h \in H} \omega^{ph} = \sum_{h \in H} \omega^h \]
for sufficiently large p by the fundamental theorem of symmetric polynomials. Indeed, suppose
there are infinitely many primes p ≡ k (mod n) such that f has a root in $\mathbb{F}_p$. Consider the
(absolute) norm of $\sum_{h \in H} \omega^{hk} - \sum_{h \in H} \omega^h$: if $\sum_{h \in H} \zeta^{hk} = \sum_{h \in H} \zeta^h$ this norm is divisible by p
so if it's true infinitely many times it must be zero (which we want to show is false if k ∉ H).
Conversely, clearly if p (mod n) ∈ H, f has all its roots in $\mathbb{F}_p$: they are the sums $\sum_{h \in H} \zeta^{kh}$, each of which is fixed by the Frobenius since pH = H.

Unfortunately, it is not always true that $\sum_{h \in H} \omega^h$ is distinct from its conjugates $\sum_{h \in H} \omega^{kh}$ for
k ∉ H. Indeed, let n = 12 and H = {1, 7}: we get $\omega + \omega^7 = \omega(1 + \omega^6) = 0$ which is absolutely
not what we want.

However, if we think a bit about what we want, we realise that we wish that $\sigma_k(r) = r$ if and
only if k ∈ H where r ∈ Q(ω) is a root of f (since $\varphi_p(r) = \sigma_p(r)$). This is exactly what it means
for r to be a generator of $\mathbb{Q}(\omega)^H$! Thus, we are done: just choose r to be a generator of $\mathbb{Q}(\omega)^H$.


Exercise 6.3.13∗ . Convince yourself of this solution.
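The failure of the naive choice $\sum_{h \in H} \omega^h$ for n = 12 and H = {1, 7} can be seen numerically, and contrasted with a case where the naive choice does work. Here is a minimal Python sketch with floating-point complex numbers; the function name period and the second example (n = 7, H = {1, 2, 4}) are ours, chosen for illustration.

```python
import cmath
from math import gcd

def period(n, H, k=1):
    """sigma_k applied to the sum of omega^h over h in H, with omega = exp(2*pi*i/n)."""
    omega = cmath.exp(2j * cmath.pi / n)
    return sum(omega ** ((k * h) % n) for h in H)

# Degenerate case from the text: every conjugate of omega + omega^7 vanishes.
n, H = 12, {1, 7}
units = [k for k in range(n) if gcd(k, n) == 1]
values = {k: period(n, H, k) for k in units}
print(values)

# A case where the sum does separate H from its complement: n = 7, H = {1, 2, 4}.
inside = period(7, {1, 2, 4}, 1)
outside = period(7, {1, 2, 4}, 3)   # 3 is not in H
print(inside, outside)
```

In the second example the two values are distinct conjugates, so the sum generates the fixed field of H; in the first, all conjugates coincide and another generator of $\mathbb{Q}(\omega)^H$ must be chosen.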

Since Galois theory is very much related to group theory (via the Galois group), we finish the section
with two fundamental group theory results: the Lagrange and Cauchy theorems. The former generalises
Fermat’s little theorem. It shows that a subgroup of a finite group is just a subset closed under
multiplication (or addition depending on what your operation is), without needing the assumption
that it’s closed under inversion too. Keep in mind that a group is not necessarily abelian, so our proof
of Theorem 4.2.1 does not work for this. Also, remember that the identity e of a group G is an element
such that ge = eg = g for any g ∈ G. The latter constitutes a converse of Lagrange’s theorem when
the order is prime: it shows that, as long as p divides |G|, there is an element of order p.
Exercise 6.3.14∗ . Prove that the identity of a group is unique.

Theorem 6.3.2 (Lagrange’s Theorem)

Let G be a group of cardinality n (we also say G has order n) with the operation ·. Then, for
any g ∈ G, the order of g, meaning the smallest k > 0 such that $g^k = e$, divides n.

Proof

The proof will be combinatorial. Let m be the order of g. Partition G into orbits of the form
$O_h = \{h, hg, hg^2, \ldots\}$. We claim that this is indeed a partition: if $O_h \cap O_{h'} \neq \emptyset$ then $O_h = O_{h'}$.

Indeed, if $hg^i = h'g^j$ for some i, j then $hg^k = h'g^{j-i+k}$ for any k so $O_h = O_{h'}$. Since each orbit
has cardinality m, we conclude that m | n.


Exercise 6.3.15∗. Prove the following refinement of Theorem 6.3.2: if G is a finite group and H a subgroup
of G, |H| divides |G|. Why does it imply Theorem 6.3.2?
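Lagrange's theorem can be observed on a small non-abelian group. Below is a minimal Python sketch, assuming we take G to be the symmetric group $S_4$ with permutations stored as tuples; the helper names compose and order are ours. It lists the orders of the elements, all of which should divide |G| = 24.

```python
from itertools import permutations

def compose(s, t):
    """Composition s o t of permutations stored as tuples (i is sent to s[t[i]])."""
    return tuple(s[t[i]] for i in range(len(t)))

def order(g):
    """Smallest k > 0 such that g^k is the identity."""
    identity = tuple(range(len(g)))
    k, current = 1, g
    while current != identity:
        current = compose(current, g)
        k += 1
    return k

G = list(permutations(range(4)))      # the symmetric group S_4
orders = sorted({order(g) for g in G})
print(len(G), orders)
```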

Theorem 6.3.3 (Cauchy’s Theorem)

Let G be a finite group. If p | |G| is a rational prime, then G has an element of order p.

Proof

The proof is again combinatorial (group theory is very combinatorial). Consider the set S of
$(g_1, \ldots, g_p) \in G^p$ such that $g_1 \cdot \ldots \cdot g_p = e$, the identity of G. There are $|G|^{p-1}$ such tuples, a number divisible by p.
Now group them by circular permutations: consider the orbits
\[ \{(g_1, \ldots, g_p), (g_2, \ldots, g_1), \ldots, (g_p, \ldots, g_{p-1})\}. \]
Each orbit has size 1 or p: indeed, if σ denotes the circular permutation $(x_1, \ldots, x_p) \mapsto
(x_2, \ldots, x_p, x_1)$, we have $\sigma^p = \mathrm{id}$ so the order of σ divides p by Lagrange's theorem (in the group
of permutations of $(x_1, \ldots, x_p)$, which might not be $S_p$ for fixed $x_i$). (This can also be seen
directly: if $\sigma^k(g_1, \ldots, g_p) = (g_1, \ldots, g_p)$, then $g_{i+kn} = g_i$ for all n, so if k is invertible modulo p, $g_i = g_j$
for any i ≠ j which means the orbit has size 1, and otherwise the orbit has size p.)
Thus, modulo p, the cardinality of S is congruent to the number of orbits of size 1. However,
(g, . . . , g) is in S iff $g^p = e$, i.e. g has order 1 or p. Thus, if we let $n_p$ denote the number of
elements of order p, we get
\[ 1 + n_p \equiv |S| \equiv 0 \pmod{p} \]
since $p \mid |S| = |G|^{p-1}$, which implies that $n_p$ is non-zero as wanted.
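The counting argument of this proof can be replayed on a small example. Here is a minimal Python sketch, assuming $G = S_3$ and p = 3; the helper names compose and product_of are ours. It builds the set S of triples with product e, counts the constant triples (the orbits of size 1), and checks the congruence $1 + n_p \equiv 0 \pmod{p}$.

```python
from itertools import permutations, product

def compose(s, t):
    """Composition s o t of permutations stored as tuples."""
    return tuple(s[t[i]] for i in range(len(t)))

p = 3
G = list(permutations(range(3)))       # the symmetric group S_3, of order 6
e = tuple(range(3))                    # identity permutation

def product_of(tup):
    """Product g_1 * ... * g_p of a tuple of group elements."""
    result = e
    for g in tup:
        result = compose(result, g)
    return result

S = [tup for tup in product(G, repeat=p) if product_of(tup) == e]
constant = [tup for tup in S if len(set(tup)) == 1]   # tuples (g, ..., g) with g^p = e
n_p = len(constant) - 1                                # elements of order exactly p
print(len(S), len(constant), n_p)
```

Here |S| = $|G|^{p-1}$ because the first p − 1 entries can be chosen freely and then determine the last one.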


6.4 Splitting of Polynomials


We shall now discuss an application of the primitive element theorem, other than that it lets us build
Galois theory quickly. We have already seen in Section 5.2 that, when f ∈ Z[X] is non-constant, the
set P(f ) of rational primes p such that f (mod p) has a root in Fp is infinite. Here, we show a stronger
result, namely that Psplit (f ), the set of rational primes which don’t divide the leading coefficient of f
such that f is split modulo p, meaning that f has as many roots in Fp as its degree, is infinite.

Theorem 6.4.1

For any non-constant polynomial f ∈ Z[X] of leading coefficient a, there are infinitely many
rational primes $p \nmid a$ such that f (mod p) is split in $\mathbb{F}_p$, meaning that its roots in $\overline{\mathbb{F}_p}$ are all in $\mathbb{F}_p$.

Proof

Without loss of generality, we can assume f ∈ Q[X] is monic; the condition $p \nmid a$ becomes
that p doesn't divide the denominators of the coefficients. Let $\alpha_1, \ldots, \alpha_n$ be the roots of f,
and $K = \mathbb{Q}(\alpha_1, \ldots, \alpha_n)$. By the primitive element theorem, there is a β such that K = Q(β).
Consider the minimal polynomial π of β: we show that whenever π (mod p) has a root in $\mathbb{F}_p$
and p is sufficiently large, f (mod p) is split in $\mathbb{F}_p$. By Theorem 5.2.1, there are infinitely many
such primes.

Indeed, we know that β generates the roots $\alpha_i$ of f in $\overline{\mathbb{Q}}$, so we might expect the same to hold in
$\overline{\mathbb{F}_p}$. It is in fact quite easy to show that this intuition holds true. Let $g_1, \ldots, g_n \in \mathbb{Q}[X]$ be such
that $g_i(\beta) = \alpha_i$.

Let p be greater than the denominators of the $g_i$ so that $g_i$ (mod p) makes sense and suppose
$\beta_p \in \mathbb{F}_p$ is a root of π (mod p). We shall show that
\[ f \equiv \prod_{k=1}^n (X - g_k(\beta_p)). \]
Consider the coefficient in front of $X^{n-i}$: $\pm e_i(g_1, \ldots, g_n)$ evaluated at $\beta_p$. By assumption,
\[ e_i(g_1, \ldots, g_n)(\beta) = a_i \in \mathbb{Q} \]
so that $\pi \mid e_i(g_1, \ldots, g_n) - a_i$. Using Gauss's lemma, this divisibility becomes a divisibility in
$\mathbb{F}_p[X]$ (as p doesn't divide the denominators of the $g_i$) which means that we also have
\[ e_i(g_1, \ldots, g_n)(\beta_p) \equiv a_i. \]
This concludes the proof.
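To get a feeling for Theorem 6.4.1, one can list small primes modulo which a given polynomial splits. Below is a minimal Python sketch by brute force, assuming the example $f = X^3 - 2$ (our choice); the helper names is_prime and roots_mod_p are ours. Since the discriminant of $X^3 - 2$ is only divisible by 2 and 3, for p ≥ 5 the polynomial is split modulo p exactly when it has three distinct roots in $\mathbb{F}_p$.

```python
def is_prime(p):
    """Trial-division primality test, adequate for small p."""
    return p >= 2 and all(p % q for q in range(2, int(p**0.5) + 1))

def roots_mod_p(coeffs, p):
    """Roots in F_p of the polynomial with integer coefficients coeffs (highest degree first)."""
    def value(x):
        acc = 0
        for c in coeffs:              # Horner evaluation modulo p
            acc = (acc * x + c) % p
        return acc
    return [x for x in range(p) if value(x) == 0]

f = [1, 0, 0, -2]                     # X^3 - 2
split_primes = [p for p in range(5, 200)
                if is_prime(p) and len(roots_mod_p(f, p)) == 3]
print(split_primes)
print(roots_mod_p(f, 31))             # the three cube roots of 2 modulo 31
```

Every prime in the list is congruent to 1 modulo 3, as it must be: the quotient of two distinct cube roots of 2 is a non-trivial cube root of unity in $\mathbb{F}_p$.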




As an important corollary, we get that any finite set of non-constant polynomials has infinitely many
common prime divisors.

Corollary 6.4.1

For any non-constant polynomials f1 , . . . , fn ∈ Z[X], Psplit (f1 ) ∩ . . . ∩ Psplit (fn ) is infinite.
6.5. EXERCISES 105

Proof

Apply Theorem 6.4.1 to f = f1 · . . . · fn .




From this, we can deduce the following very non-trivial result.

Corollary 6.4.2

Let n ≥ 1 be a rational integer. Any non-constant polynomial f ∈ Z[X] has infinitely many
prime factors p ≡ 1 (mod n).

Proof

Apply Corollary 6.4.1 to f and Φn .




Exercise 6.4.1. Does there exist an $a \not\equiv 1 \pmod{n}$ such that any non-constant f ∈ Z[X] has infinitely many
prime factors congruent to a modulo n?

6.5 Exercises
Field and Galois Theory
Exercise 6.5.1† . Let L/K be a finite separable extension of prime degree p. If f ∈ K[X] has prime
degree q and is irreducible over K but reducible over L, then p = q.

Exercise 6.5.2†. Let L/K be a finite Galois extension and M/K be a finite extension. Prove that
$\mathrm{Gal}(LM/M) \simeq \mathrm{Gal}(L/L \cap M)$. In particular, [LM : M] = [L : L ∩ M]. Conclude that, if L/K and
M/K are Galois, we have
\[ [LM : K][L \cap M : K] = [L : K][M : K]. \]

Exercise 6.5.3†. Prove that, for any n, there is a finite Galois extension K/Q such that $\mathrm{Gal}(K/\mathbb{Q}) \simeq
\mathbb{Z}/n\mathbb{Z}$.

Exercise 6.5.4† (Cayley's Theorem). Let G be a finite group. Prove that it is a subgroup of $S_n$ for
some n. Conclude that there is a finite Galois extension L/K of number fields such that $G \simeq \mathrm{Gal}(L/K)$.
(This is part of the inverse Galois problem. So far, it has only been conjectured that we can choose
K = Q.)

Exercise 6.5.5† (Dedekind’s Lemma). Let L/K be a finite separable extension in characteristic 0.
Prove that the K-embeddings of L are linearly independent.

Exercise 6.5.6† (Hilbert's Theorem 90). Suppose L/K is a cyclic extension in characteristic 0, meaning
its Galois group $\mathrm{Gal}(L/K) \simeq (\mathbb{Z}/n\mathbb{Z}, +)$ for some n (like $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p)$ or $\mathrm{Gal}(\mathbb{Q}(\exp(2i\pi/p))/\mathbb{Q})$).
Prove that α ∈ L has norm 1 if and only if it can be written as β/σ(β) for some β ∈ L, where σ is a
generator of the Galois group (element of order n).

Exercise 6.5.7. When are two number fields isomorphic?

Exercise 6.5.8† (Lüroth's Theorem). Let K be a field and L a field between K and K(T). Prove
that there exists a rational function f ∈ K(T) such that L = K(f).

nth Roots
Exercise 6.5.9†. Let K be a field, p a prime number, and α an element of K. Prove that $X^p - \alpha$ is
irreducible over K if and only if it has no root in K.

Exercise 6.5.10†. Let f ∈ K[X] be a monic irreducible polynomial and p a rational prime. Suppose
that $(-1)^{\deg f} f(0)$ is not a pth power in K. Prove that $f(X^p)$ is also irreducible.

Exercise 6.5.11† (Vahlen, Capelli, Redei). Let K be a field and α ∈ K. When is $X^n - \alpha$ irreducible
over K?

Exercise 6.5.12. Let n ≥ 1 be an integer and ζ a primitive nth root of unity. What is the Galois
group of $\mathbb{Q}(\sqrt[n]{2}, \zeta)$ over Q?

Exercise 6.5.13†. Let n ≥ 1 be an integer and $p_1, \ldots, p_m$ rational primes. Prove that
\[ [\mathbb{Q}(\sqrt[n]{p_1}, \ldots, \sqrt[n]{p_m}) : \mathbb{Q}] = n^m. \]

(This is a generalisation of Exercise 4.6.24† .)

Exercise 6.5.14† (Kummer Theory). Let L/K be a finite Galois extension in characteristic 0. Suppose
that $\mathrm{Gal}(L/K) \simeq \mathbb{Z}/n\mathbb{Z}$. If K contains a primitive nth root of unity, prove that L = K(α) for some
α with $\alpha^n \in K$.

Exercise 6.5.15† (Artin-Schreier Theorem). Let L/K be a finite extension such that L is algebraically
closed. Prove that [L : K] ≤ 2.

Constructibility and Solvability


Exercise 6.5.16†. Given two points, you are allowed to draw the line between them, as well as the
circle centred at one of the points and going through the other. Initially, you may start with the points (0, 0)
and (0, 1) and define additional points that way. We say a real number r is constructible if the point
(0, r) is constructible. Prove that, if x and y are constructible, so are x + y, xy, −x, $\sqrt{|x|}$, and $\frac{1}{x}$ if
x ≠ 0.

Exercise 6.5.17† . Prove that a real number is constructible if and only if it is algebraic and the
degree of its splitting field, meaning the field generated by its conjugates, is a power of 2. Deduce that,
using only a (non-graded) ruler and a compass,

1. A regular n-gon is constructible if and only if ϕ(n) is a power of 2.

2. It is not always possible to trisect an angle.

3. Given a cube with volume V, it is not possible to construct a cube with volume 2V.

Exercise 6.5.18† . We say a finite Galois extension L/K in characteristic 0 is solvable by radicals if
there is a tower of extensions
K = K0 ⊂ K1 ⊂ . . . ⊂ Km ⊇ L
such that Ki+1 is obtained from Ki by adjoining an nth root of some element of Ki to Ki , for some
n. We also say a group G is solvable if there is a chain 0 = G0 ⊂ G1 ⊂ . . . ⊂ Gm = G such that Gi is
normal in Gi+1 (see Exercise 6.3.11∗ ) and Gi+1 /Gi is cyclic. Prove that L/K is solvable by radicals if
and only if its Galois group is. (When L is the field generated by the roots of a polynomial f ∈ K[X],
L/K being solvable by radicals means that the roots of f can be written with radicals, which explains
the name.)

Exercise 6.5.19† . Let n ≥ 1 be an integer. Prove that Sn is not solvable for n ≥ 5. Conclude from
Exercise 6.5.21† that some polynomial equations are not solvable by radicals.2 (This is quite technical.)
2 If one only wants to show that there is no general formula, one doesn't need to do the first part since the general
polynomial $\prod_{i=1}^n (X - A_i) \in \mathbb{Q}(A_1, \ldots, A_n)[X]$ already has Galois group $S_n$ (where $A_1, \ldots, A_n$ are formal variables).

Exercise 6.5.20† . We say a finite Galois extension L/K of real fields, i.e. L ⊆ R, is solvable by real
radicals if there is a tower of extensions

K = K0 ⊂ K1 ⊂ . . . ⊂ Km ⊇ L

such that Ki+1 is obtained from Ki by adjoining the nth root of some positive element of Ki to Ki .
Prove that L/K is solvable by real radicals if and only if [L : K] is a power of 2.

Exercise 6.5.21† . Let p be a prime number and G ⊆ Sp a subgroup containing a transposition τ


(see the paragraph after Definition C.3.2) and an element γ of order p. Prove that G = Sp . Deduce
that, if f ∈ Q[X] is an irreducible polynomial of degree p with precisely two non-real complex roots,
then the Galois group of the field generated by its roots (called its splitting field , because it is a field
where it splits) over Q is Sp .

Exercise 6.5.22†. Let n be a positive integer. Prove that there is a number field K, Galois over
Q, such that $\mathrm{Gal}(K/\mathbb{Q}) \simeq S_n$. (You may assume the following result of Dedekind: if f ∈ Z[X] is a
polynomial, for any prime number p not dividing the discriminant ∆ of f, the Galois group of f over
$\mathbb{F}_p$ is a subgroup of the Galois group of f over Q.$^3$)

Cyclotomic Fields
Exercise 6.5.23† . Let ω be a primitive nth root of unity. When is Φm irreducible over Q(ω)?

Exercise 6.5.24. Prove that Q(ωm , ωn ) = Q(ωm + ωn ), where ωm and ωn are primitive mth and nth
roots of unity respectively.

Exercise 6.5.25†. Let n be an integer and m ∈ Z/nZ be such that $m^2 \equiv 1 \pmod{n}$. Prove that
there exist infinitely many primes congruent to m modulo n, provided that there exists at least one
which is greater than $n^2$. (It is also true that our Euclidean approach to special cases of Dirichlet's
theorem only works for $m^2 \equiv 1 \pmod{n}$, see ??.)
Exercise 6.5.26† (Mann). Suppose that $\omega_1, \ldots, \omega_n$ are roots of unity such that $\sum_{i=1}^n a_i \omega_i = 0$ for
some $a_i \in \mathbb{Q}$ and $\sum_{i \in I} a_i \omega_i \neq 0$ for any non-empty strict subset I ⊆ [n]. Prove that $\omega_i^m = \omega_j^m$ for any
i, j ∈ [n] where m is the product of primes at most n.

Exercise 6.5.27. Which quadratic subfields does a cyclotomic field contain?

Exercise 6.5.28†. Prove the Gauss and Lucas formulas: given an odd squarefree integer n > 1, there
exist polynomials $A_n, B_n, C_n, D_n \in \mathbb{Z}[X]$ such that
\[ 4\Phi_n = A_n^2 - (-1)^{\frac{n-1}{2}} n B_n^2 = C_n^2 - (-1)^{\frac{n-1}{2}} n X D_n^2. \]
Deduce that, given any non-zero rational number r, there are infinitely many pairs of distinct rational
primes (p, q) such that r has the same order modulo p and modulo q.

Exercise 6.5.29 (Inspired by USAMO 2007). Let p be an odd prime and n ≥ 1 an integer. Prove
that the number
\[ p^{2p^n} - 1 \]
has at least 3n prime factors (counted with multiplicity).

Miscellaneous
Exercise 6.5.30† . Let f ∈ Q[X] be an irreducible polynomial with exactly one real root of degree at
least 2. Prove that the real parts of its non-real roots are all irrational.
3 The Galois group of a polynomial f over a field F is defined as the Galois group of its splitting field over F , i.e. as

Gal(F (α1 , . . . , αk )/F ), where α1 , . . . , αk are the roots of f .



Exercise 6.5.31† . Let K be a number field of degree n. Prove that there are elements α1 , . . . , αn of
K such that
OK ⊆ α1 Z + . . . + αn Z.
By showing that any submodule of a Z-module generated by n elements is also generated by n elements,
deduce that OK has an integral basis, i.e. elements β1 , . . . , βn such that

OK = β1 Z + . . . + βn Z.

Exercise 6.5.32† . Let f ∈ Q[X] be an irreducible polynomial of prime degree p and denote its roots
by α0 , . . . , αp−1 . Suppose that
λ0 α0 + . . . + λp−1 αp−1 ∈ Q
for some rational λi . Prove that λ0 = . . . = λp−1 .
Exercise 6.5.33† (TFJM 2019). Let N be an odd integer. Prove that there exist infinitely many
rational primes p ≡ 1 (mod N) such that $x \mapsto x^{n+1} + x$ is a bijection of $\mathbb{F}_p$, where $n = \frac{p-1}{N}$.

Exercise 6.5.34† . Let f ∈ C(X) be a rational function, and suppose f sends rational integers algebraic
integers to algebraic integers. Prove that f is a polynomial.

Exercise 6.5.35. Let $\alpha \in \overline{\mathbb{Z}}$ be an algebraic integer with minimal polynomial f. Prove that a rational
prime p not dividing the discriminant ∆ of f stays prime in $\mathcal{O}_{\mathbb{Q}(\alpha)}$ if and only if f stays irreducible in
$\mathbb{F}_p[X]$.
Exercise 6.5.36. Suppose f ∈ R[X] is positive on R. Prove that there exist polynomials g, h ∈ R[X]
such that $f = g(X)^2 + h(X)^2$.
Exercise 6.5.37. Suppose f ∈ R[X] is positive on $\mathbb{R}_{>0}$. Prove that there exist polynomials g, h ∈ R[X]
such that $f = g(X)^2 + Xh(X)^2$.
Exercise 6.5.38 (Inspired by ISL 2020). Let p be a prime and f ∈ Z[X] a polynomial. Alice and
Bob play a game: Bob chooses two initial elements $\alpha, \beta \in \mathbb{F}_p$ and Alice iteratively replaces α by f(α)
or by some α′ such that α = f(α′). She wins if she can reach β, otherwise Bob wins. Prove that there exist
infinitely many primes p such that Bob is able to win.
Exercise 6.5.39. Prove that $\mathbb{F}_p(U, T)/\mathbb{F}_p(U^p, T^p)$ has no primitive element.
Chapter 7

Units in Quadratic Fields and Pell’s Equation

Prerequisites for this chapter: Chapter 2 for the whole chapter and Chapter 6 for Section 7.4.

7.1 Fundamental Unit


Recall that a unit α ∈ OK is an invertible element (in OK ), i.e. an element of norm ±1. By abuse of
terminology, we shall also call α a unit of K, even though all non-zero elements are units in K since
it’s a field.
Exercise 7.1.1∗ . Prove that α is invertible if and only if its norm is ±1.

We are interested in characterising units in quadratic fields. Notice that $a + b\sqrt{d} \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is a
unit if and only if $\pm 1 = N(a + b\sqrt{d}) = a^2 - db^2$ so units in quadratic fields are deeply linked with the
so-called Pell equation.

Note also that units are closed under multiplications since the norm is multiplicative. In particular,
if K has a unit which is not a root of unity, it has infinitely many units.

We shall prove that there always exists such a unit, but first we turn to the complex
case, i.e. $\mathbb{Q}(\sqrt{d})$ for d < 0 (such a field is called a complex quadratic field ), for which the situation is a
lot simpler. Indeed, the norm of $a + b\sqrt{d}$ is $a^2 - db^2 \ge a^2 + b^2$ since d < 0. We thus get the following
characterisation of units in complex quadratic fields.

Proposition 7.1.1

Let d < 0 be a squarefree rational integer. The units of $\mathbb{Q}(\sqrt{d})$ are {1, −1, i, −i} for d = −1,
$\{1, -1, j, -j, j^2, -j^2\}$ for d = −3, and {1, −1} for other d.

Exercise 7.1.2∗ . Prove Proposition 7.1.1.



In the real case, however, the situation is completely different ($\mathbb{Q}(\sqrt{d})$ for d > 0 is called a real
quadratic field ). Indeed, there always exist infinitely many units. Before we prove this, let us talk
about fundamental units. Notice that since $\mathbb{Q}(\sqrt{d}) \subseteq \mathbb{R}$, the only roots of unity it has are ±1.

Definition 7.1.1 (Fundamental Unit)

Let K be a real quadratic field. A unit θ of K is said to be a fundamental unit if it generates all
other units of K, i.e. any unit has the form $\pm\theta^n$ for some n ∈ Z.


We now show that, if there is a non-trivial unit, there always exists a fundamental unit greater
than 1. We will refer to this unit when we say "the fundamental unit".

Proposition 7.1.2

Any real quadratic field has a fundamental unit θ (unique with the additional condition θ > 1).

Proof of Proposition 7.1.2 assuming there is a non-trivial unit

The uniqueness is obvious: if $\alpha = \pm\beta^u$ and $\beta = \pm\alpha^v$, then $\alpha = \pm(\pm\alpha^v)^u$ so either $u = v = \pm 1$
and $\alpha = \pm\beta^{\pm 1}$ as wanted, or α is a root of unity (which means the only units are ±1 but we
assumed there was a non-trivial unit).
Notice that, if K has a unit $\alpha = a + b\sqrt{d} \neq \pm 1$, then it has a unit $\beta = |a| + |b|\sqrt{d} > 1$. Let θ
be the smallest unit which is greater than 1; there exists one since there are only finitely many
units in any interval [a, b] for positive a, b.

Indeed, if a unit θ lies in [a, b] then its conjugate $\overline{\theta} = \pm 1/\theta$ satisfies $|\overline{\theta}| \in [1/b, 1/a]$, so the
minimal polynomial of θ has bounded coefficients, which shows that there are a finite number of such θ.

Now, we prove that all units are generated by θ. Suppose for the sake of a contradiction that
ε > 1 is the smallest unit which is not a power of θ (we can do that for the same reasons as
before). Since θ is the minimal unit greater than 1, we must have ε > θ; but then 1 < ε/θ < ε is
a smaller unit which is not a power of θ and that is a contradiction.


It remains to prove that the units of a real quadratic field are not all trivial. We follow the proof
of Lagrange. First, we need a lemma.

Lemma 7.1.1 (Dirichlet’s Approximation Theorem)

Let α ∈ R be a real number. For any rational integer N > 0, there are rational integers p, q such
that 0 < q ≤ N and
$$|q\alpha - p| < \frac{1}{N}.$$


In particular, there are infinitely many pairs of rational integers (p, q) such that $\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}$. This
will prove to be very useful for finding units.

Proof

Consider the fractional parts of the numbers 0, α, 2α, . . . , N α. They all lie in the intervals
$$\left[0, \tfrac{1}{N}\right), \left[\tfrac{1}{N}, \tfrac{2}{N}\right), \ldots, \left[\tfrac{N-1}{N}, 1\right)$$
so, by the pigeonhole principle, two of them must lie in the same interval. Thus, their difference
has absolute value less than $\frac{1}{N}$, which is exactly what we were looking for:
$$|\alpha q - p| = |(\alpha q_1 - p_1) - (\alpha q_2 - p_2)| < \frac{1}{N}$$
where $q = |q_1 - q_2|$, $p = \pm(p_1 - p_2)$ and $p_1 = \lfloor \alpha q_1 \rfloor$, $p_2 = \lfloor \alpha q_2 \rfloor$.
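To get a feeling for the lemma, here is a small numerical sketch. It does not follow the pigeonhole proof: it simply tries every q from 1 to N, which is enough since the lemma guarantees that some q works. The function name `dirichlet_approximation` is ours, and we assume floating-point arithmetic is accurate enough for the chosen α and N.

```python
from math import sqrt


def dirichlet_approximation(alpha, N):
    """Return integers (p, q) with 0 < q <= N and |q*alpha - p| < 1/N.

    Brute force over q (not the pigeonhole argument of the proof).
    Floating-point precision is assumed to be sufficient.
    """
    for q in range(1, N + 1):
        p = round(q * alpha)  # nearest integer to q*alpha
        if abs(q * alpha - p) < 1 / N:
            return p, q
    raise ValueError("no pair found (floating-point precision issue?)")


# For alpha = sqrt(2) and N = 10 the first q that works is q = 5,
# since |5*sqrt(2) - 7| is about 0.071 < 1/10.
print(dirichlet_approximation(sqrt(2), 10))  # (7, 5)
```

Taking larger and larger N produces the infinitely many pairs (p, q) mentioned after the lemma.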


Finally, we prove the existence of a non-trivial unit.

Proof of the existence of a non-trivial unit


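Take $\alpha = \sqrt{d}$ in the Dirichlet approximation theorem. Suppose $|a - b\sqrt{d}| < \frac{1}{b}$. Then,
$$|a^2 - db^2| < \frac{a + b\sqrt{d}}{b} \le \frac{2b\sqrt{d} + 1}{b}$$
as $a \le b\sqrt{d} + \frac{1}{b}$, so $|a^2 - db^2| \le 2\sqrt{d} + 1$ is bounded independently of (a, b).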

In particular, some value M must be reached infinitely many times by $a^2 - db^2$. Moreover, again
by the pigeonhole principle, some pair (a, b) (mod M ) must be repeated infinitely many times.
If $(a, b) \equiv (a', b') \pmod{M}$ and $a^2 - db^2 = M = a'^2 - db'^2$ then
$$\frac{a + b\sqrt{d}}{a' + b'\sqrt{d}} = \frac{(a + b\sqrt{d})(a' - b'\sqrt{d})}{M}$$
is an algebraic integer as $(a + b\sqrt{d})(a' - b'\sqrt{d}) \equiv (a + b\sqrt{d})(a - b\sqrt{d}) \equiv 0 \pmod{M}$ and has norm
1. We have found a non-trivial unit.


Here is a table of the fundamental units for small d.



• $\theta_2 = 1 + \sqrt{2}$ (norm −1).

• $\theta_3 = 2 + \sqrt{3}$ (norm 1).

• $\theta_5 = \frac{1 + \sqrt{5}}{2}$ (norm −1).

• $\theta_6 = 5 + 2\sqrt{6}$ (norm 1).

• $\theta_7 = 8 + 3\sqrt{7}$ (norm 1).

• $\theta_{10} = 3 + \sqrt{10}$ (norm −1).
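The table above can be reproduced by a naive search. The sketch below is only an illustration under our own conventions: the function name `fundamental_unit` and the output format (a, b, den, norm), standing for the unit (a + b√d)/den, are hypothetical. It looks for the smallest b ≥ 1, and then the smallest a, such that (a + b√d)/den has norm ±1, with den = 2 when d ≡ 1 (mod 4) and den = 1 otherwise. This is a brute-force search, not the continued fraction method one would use for large d.

```python
from math import isqrt


def fundamental_unit(d):
    """Return (a, b, den, norm) with (a + b*sqrt(d))/den the fundamental unit > 1.

    Assumes d > 1 is squarefree. Brute force: increase b until
    a^2 - d*b^2 = -den^2 or +den^2 has a solution, trying the smaller a first.
    """
    den = 2 if d % 4 == 1 else 1
    b = 1
    while True:
        for sign in (-1, 1):  # norm -1 corresponds to the smaller a
            a_squared = d * b * b + sign * den * den
            if a_squared > 0:
                a = isqrt(a_squared)
                if a * a == a_squared:
                    if den == 2 and a % 2 == 0 and b % 2 == 0:
                        return a // 2, b // 2, 1, sign  # unit already in Z[sqrt(d)]
                    return a, b, den, sign
        b += 1


for d in (2, 3, 5, 6, 7, 10):
    print(d, fundamental_unit(d))
```

For instance, `fundamental_unit(5)` returns `(1, 1, 2, -1)`, i.e. the unit (1 + √5)/2 of norm −1, in agreement with the table.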

7.2 Pell-Type Equations


Notice that the fundamental unit may have norm 1 or norm −1. To have norm −1, a necessary
condition is that −1 is a quadratic residue modulo d. However, as Exercise 7.5.13† shows, it is not
sufficient and one cannot really predict which sign the norm of the fundamental unit will have. That
said, this condition is sufficient when d is a prime number.

Proposition 7.2.1

Let p be a rational prime. The fundamental unit of $\mathbb{Q}(\sqrt{p})$ has norm −1 if and only if $\left(\frac{-1}{p}\right) = 1$,
i.e. p ≡ 1 (mod 4) or p = 2.

Proof
 
If the fundamental unit has norm −1, then $\left(\frac{-1}{p}\right) = 1$ so it suffices to prove that the converse also
holds. When p = 2, the fundamental unit indeed has norm −1. Now suppose p ≡ 1 (mod 4), and
let $a + b\sqrt{p} > 1$ be the minimal unit of norm 1 of $\mathbb{Q}(\sqrt{p})$ with a, b ∈ Z (so not necessarily the fundamental
unit). We have $a^2 - pb^2 = 1$ so, modulo 4, we get that a is odd (otherwise $a^2 - pb^2 \equiv 0$ or −1). Since

$$(a - 1)(a + 1) = pb^2,$$

we must have $a \pm 1 = 2x^2$ and $a \mp 1 = 2py^2$ for some x, y ∈ Z. If $a + 1 = 2x^2$, we get

$$2x^2 - 2py^2 = (a + 1) - (a - 1) = 2$$

which is impossible as $a + b\sqrt{p}$ was already the smallest unit with norm 1. Thus,

$$2x^2 - 2py^2 = (a - 1) - (a + 1) = -2$$

as wanted.
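The proposition can be checked numerically for small primes. A unit of norm −1 exists if and only if $x^2 - py^2 = -1$ has an integral solution (an odd power of a half-integral unit of norm −1 lands in $\mathbb{Z}[\sqrt{p}]$), so the sketch below simply searches for such a solution. The name `negative_pell_solution` and the search bound `y_max` are our own choices; finding nothing below the bound is of course not a proof that no solution exists.

```python
from math import isqrt


def negative_pell_solution(p, y_max=10**4):
    """Search for (x, y) with x^2 - p*y^2 = -1 and 1 <= y <= y_max.

    Returns the solution with the smallest y, or None if none is found
    within the (arbitrary) bound y_max.
    """
    for y in range(1, y_max + 1):
        x_squared = p * y * y - 1
        x = isqrt(x_squared)
        if x * x == x_squared:
            return x, y
    return None


for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
    print(p, p % 4, negative_pell_solution(p))
```

For the primes listed, a solution is found exactly when p = 2 or p ≡ 1 (mod 4), for example (18, 5) for p = 13.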


Note that when we proved the existence of a unit, we did not use the fact that d was squarefree
anywhere. Thus, we in fact get that the Pell equation $x^2 - dy^2 = 1$ has non-trivial integral solutions for any
positive d which is not a perfect square, and that all solutions are generated by the minimal one. However, we will also
prove this from the existence of a fundamental unit.

Write $d = uv^2$ where u is the squarefree part of d, and let α be the fundamental unit of $\mathbb{Q}(\sqrt{u})$.
Then, the (positive) solutions of $x^2 - uy^2 = 1$ have the form
$$x + y\sqrt{u} = \alpha^n.$$
Thus, we get
$$y = \frac{\alpha^n - \overline{\alpha}^n}{2\sqrt{u}}.$$
We want to know when y is divisible by v, i.e. when
$$2v\sqrt{u} \mid \alpha^n - \overline{\alpha}^n = \alpha^n - \alpha^{-n}$$
which is equivalent to $2v\sqrt{u} \mid \alpha^{2n} - 1$ as α is a unit.

In fact, we can find when any non-zero $\beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{u})}$ divides $\alpha^{2n} - 1$ exactly like we would in Z.
Indeed, $\mathcal{O}_{\mathbb{Q}(\sqrt{u})}$ modulo β has a finite number of elements (in fact |N (β)| by ??) so $\alpha^{2n}$ cycles modulo
β:
$$\alpha^{2i} \equiv \alpha^{2j} \iff \beta \mid \alpha^{2(i-j)} - 1$$
(α is a unit so we can divide by it). Then, we can define the order of $\alpha^2$ modulo β to be the smallest
m > 0 such that $\alpha^{2m} \equiv 1$ to get that $\alpha^{2n} \equiv 1 \iff m \mid n$, which means that the solutions to $\beta \mid \alpha^{2n} - 1$ are
generated by $\alpha^m$, the minimal solution, as wanted.
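As an illustration of this reduction, the following sketch computes the powers of a unit $x_1 + y_1\sqrt{u}$ of norm 1 and stops at the first one whose coefficient of $\sqrt{u}$ is divisible by v. The function name `first_power_with_y_divisible` and the cutoff `max_n` are hypothetical, and we assume the caller provides a unit of norm 1 (for instance the square of the fundamental unit when the latter has norm −1).

```python
def first_power_with_y_divisible(x1, y1, u, v, max_n=10**4):
    """Return (n, x_n, y_n) for the smallest n >= 1 with v | y_n.

    Here x_n + y_n*sqrt(u) = (x1 + y1*sqrt(u))^n and x1^2 - u*y1^2 = 1 is
    assumed. Returns None if no such n <= max_n exists.
    """
    x, y = x1, y1
    for n in range(1, max_n + 1):
        if y % v == 0:
            return n, x, y
        # multiply (x + y*sqrt(u)) by (x1 + y1*sqrt(u))
        x, y = x1 * x + u * y1 * y, x1 * y + y1 * x
    return None


# d = 12 = 3 * 2^2: powers of 2 + sqrt(3) until y is even.
n, x, y = first_power_with_y_divisible(2, 1, 3, 2)
print(n, x, y // 2)  # 2 7 2, and indeed 7^2 - 12 * 2^2 = 1
```

So the minimal positive solution of $x^2 - 12y^2 = 1$ is (7, 2), coming from the second power of $2 + \sqrt{3}$.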

Exercise 7.2.1∗ . Prove that $\mathcal{O}_{\mathbb{Q}(\sqrt{u})}/\beta\mathcal{O}_{\mathbb{Q}(\sqrt{u})}$ is finite if β ≠ 0.

Here is how our previous discussion translates. Actually, our statement is a bit more general because
we also allow rings of the form $\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]$ when d ≡ 1 (mod 4), but these still have a non-trivial unit
since we have shown that $\mathbb{Z}[\sqrt{d}]$ does (and the same proof as Proposition 7.1.2 shows that the minimal
unit greater than 1 is fundamental).

Definition 7.2.1 (Fundamental Unit)

Let δ be a quadratic integer. A unit θ of Z[δ] is said to be a fundamental unit if it generates all
other units of Z[δ], i.e. any unit has the form $\pm\theta^n$ for some n ∈ Z.

Proposition 7.2.2

For any quadratic integer δ, Z[δ] always has a unique fundamental unit greater than 1.

Note that this differs from our previous definition (although the proof is the same as before) because
Z[δ] may not be $\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$. We will also call (x, y) the fundamental solution of $x^2 - dy^2 = 1$ if $x + y\sqrt{d}$
is the fundamental unit of Z[δ], where $\mathbb{Z}[\delta] = \mathbb{Z}[\sqrt{d}]$ or $\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]$ (d is not necessarily squarefree
anymore).

We now discuss equations of the form $x^2 - dy^2 = k$ for some fixed k. As we have seen earlier with
k = −1, it is very hard to determine when this equation has a solution, so we will instead show that
all solutions are generated by the fundamental solution of $x^2 - dy^2 = 1$ and a finite number of pairs
$(x_i, y_i)$ such that $x_i^2 - dy_i^2 = k$.

Proposition 7.2.3

Let k ∈ Z be a non-zero rational integer which is not a perfect square and θ the fundamental
unit of Z[δ]. There exist elements $\alpha_1, \ldots, \alpha_n \in \mathbb{Z}[\delta]$ of norm k such that the elements of Z[δ] of
norm k are exactly those of the form $\pm\theta^i\alpha_j$.

Proof

The proof is the same as before. For each $(a, b) \in (\mathbb{Z}/2k\mathbb{Z})^2$, pick an element $\alpha_{(a,b)} = \frac{a' + b'\sqrt{d}}{2} \in \mathbb{Z}[\delta]$
with $(a', b') \equiv (a, b) \pmod{k}$ of norm k if there exists one. Then, if $\alpha = \frac{a + b\sqrt{d}}{2}$ has norm k,
$$\frac{\alpha}{\alpha_{(a,b) \bmod k}}$$
is a unit of Z[δ] so has the form $\pm\theta^i$ for some i.




Remark 7.2.1
Note that we can solve this equation in finite time, since it suffices to find elements of norm k
between 1 and the fundamental unit θ, as any solution greater than θ reduces to one smaller after
a division by a suitable power of θ.
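The finite search described in this remark can be sketched as follows, under a few assumptions of ours: we work in $\mathbb{Z}[\sqrt{d}]$, we take for θ the fundamental solution $(x_1, y_1)$ of $x^2 - dy^2 = 1$ (supplied by the caller), and we compare real numbers in floating point. If $1 \le x + y\sqrt{d} < \theta$ has norm k, its conjugate has absolute value at most |k|, so both |x| and $|y|\sqrt{d}$ are at most (θ + |k|)/2, which bounds the search. The function name `norm_k_representatives` is hypothetical.

```python
from math import sqrt


def norm_k_representatives(d, k, x1, y1):
    """List the (x, y) with x^2 - d*y^2 = k and 1 <= x + y*sqrt(d) < theta.

    theta = x1 + y1*sqrt(d) is assumed to be the fundamental solution of
    x^2 - d*y^2 = 1. Comparisons are done in floating point (an assumption
    that is harmless for small inputs).
    """
    root = sqrt(d)
    theta = x1 + y1 * root
    bound = (theta + abs(k)) / 2  # |x| <= bound and |y|*sqrt(d) <= bound
    x_max = int(bound) + 1
    y_max = int(bound / root) + 1
    reps = []
    for x in range(-x_max, x_max + 1):
        for y in range(-y_max, y_max + 1):
            if x * x - d * y * y == k and 1 <= x + y * root < theta:
                reps.append((x, y))
    return sorted(reps)


# x^2 - 2y^2 = 7 with theta = 3 + 2*sqrt(2)
print(norm_k_representatives(2, 7, 3, 2))  # [(3, -1), (3, 1)]
```

Every solution of $x^2 - 2y^2 = 7$ is then obtained, up to sign, by multiplying $3 \pm \sqrt{2}$ by a power of $3 + 2\sqrt{2}$; for instance $(3 + \sqrt{2})(3 + 2\sqrt{2}) = 13 + 9\sqrt{2}$ and $13^2 - 2 \cdot 9^2 = 7$.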

To conclude this section, we consider the equation $ax^2 - by^2 = k$. Again, we will not determine
when this has non-trivial solutions since already the case b = 1, k = 1 reduces to $y^2 - ax^2 = -1$. We shall get
a characterisation of the solutions to these equations, albeit non-explicit and slightly cumbersome.
Nevertheless, for any given values of a, b, k one can compute all solutions explicitly with this.

Proposition 7.2.4

Let a and b be non-zero rational integers of the same sign such that ab is not a square and let θ
be the fundamental unit of $\mathbb{Z}[\sqrt{ab}]$. Further, let k ≠ 0 be a rational integer. Then, there exist
elements $\alpha_1, \ldots, \alpha_n \in \mathbb{Z}[\sqrt{ab}]$ of norm ak and rational integers $u_1, \ldots, u_n, m_1, \ldots, m_n$ such that
the integral solutions of $ax^2 - by^2 = k$ are exactly the x, y for which

$$ax + y\sqrt{ab} \in \{\pm\theta^{u_i + jm_i}\alpha_i \mid i \in [n], j \in \mathbb{Z}\}.$$

Proof

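Solving $ax^2 - by^2 = k$ is equivalent to solving $(ax)^2 - aby^2 = ak$. We already know that the
solutions of $x^2 - aby^2 = ak$ have the form $x + y\sqrt{ab} = \pm\alpha_i\theta^n$. We wish to know when a divides
$$x = \frac{\alpha_i\theta^n + \overline{\alpha_i}\,\overline{\theta}^n}{2},$$
i.e. when 2a divides $\alpha_i\theta^{2n} + \overline{\alpha_i}$. Let $m_i > 0$ be the smallest integer such that $\alpha_i(\theta^{2m_i} - 1) \equiv 0 \pmod{2a}$.

Either there is no solution to $\alpha_i\theta^{2n} \equiv -\overline{\alpha_i}$ or $u_i$ is a solution and all solutions are given by $n \equiv u_i$
(mod $m_i$). Indeed, $\alpha_i\theta^{2n} \equiv -\overline{\alpha_i}$ is then equivalent to

$$\alpha_i\theta^{2n} \equiv \alpha_i\theta^{2u_i} \iff \alpha_i(\theta^{2(n-u_i)} - 1) \equiv 0$$

since θ is a unit. Thus, it remains to prove that $\alpha_i\theta^{2m} \equiv \alpha_i$ if and only if $m_i \mid m$.

We proceed like we would when $\alpha_i = 1$. Suppose, for the sake of a contradiction, that $\alpha_i\theta^{2m} \equiv \alpha_i$
and $m_i \nmid m$. Write the Euclidean division $m = qm_i + r$ with $0 < r < m_i$. Then,

$$\alpha_i \equiv \alpha_i\theta^{2m} = \alpha_i\theta^{2qm_i}\theta^{2r} \equiv \alpha_i\theta^{2r}$$

which is a contradiction since $m_i$ was assumed to be minimal.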




Remark 7.2.2
It is no coincidence that proving that

$$\alpha_i\theta^{2m} \equiv \alpha_i \pmod{2a}$$

iff $m_i \mid m$ was so similar to proving that $\theta^{2m} \equiv 1$ iff the order of $\theta^2$ divides m. This is because
it is in fact the same result, but modulo $2a/\gcd(2a, \alpha_i)$. Since we have not defined the gcd in
non-Bézout domains, we could not use this approach (the gcd is usually not a number but an
ideal!).

7.3 Størmer’s Theorem


In this section, we focus on a very nice application of Pell equations, regarding consecutive S-units.

Definition 7.3.1 (S-Units)

Let S be a finite set of rational primes. A rational number r ∈ Q is said to be an S-unit if the
prime factors of its numerator and denominator are in S. Given a non-zero rational integer s ∈ Z,
we also say r is an s-unit if all prime factors of the numerator and denominator of r are prime
factors of s.

Theorem 7.3.1 (Størmer’s Theorem)

For any set of rational primes S of cardinality n, the equation u − v = 1 has at most $3^n$ solutions
in positive integral S-units.

Here is how we will approach this theorem: if v and u = v + 1 are S-units, then one of them is
even so 2 ∈ S. Thus, $4v(v + 1) = (2v + 1)^2 - 1$ is also an S-unit. Let x = 2v + 1. Write
$$x^2 - 1 = \prod_{p \in S} p^{k_p},$$
and for each $k_p \neq 0$, choose $d_p \in \{1, 2\}$ such that $d_p \equiv k_p \pmod{2}$ (otherwise set $d_p = 0$). Then,
$\prod_{p \in S} p^{k_p - d_p}$ is a perfect square, say $y^2$. Letting $d = \prod_{p \in S} p^{d_p}$ we get the Pell equation

$$x^2 - dy^2 = 1.$$

Also, and this is the key point, note that y is a d-unit by construction. We shall prove that the only
possible solution to the Pell equation $x^2 - dy^2 = 1$ where y is a d-unit is the fundamental solution.
Thus, for each such Pell equation there is at most one corresponding pair of consecutive S-units. Since
$d_p \in \{0, 1, 2\}$ for each p, there are $3^n$ equations to consider which yields the result.
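The construction above is easy to carry out by hand, but a short sketch may help to see it at work. Given v such that v and v + 1 are S-units, the hypothetical function `stormer_pell_data` below returns the triple (x, d, y) built in the text; it raises an error when $x^2 - 1$ has a prime factor outside S.

```python
from math import isqrt


def stormer_pell_data(v, S):
    """Return (x, d, y) with x = 2v + 1 and x^2 - d*y^2 = 1 as in the text.

    S is a list of primes; v and v + 1 are assumed to be S-units.
    """
    x = 2 * v + 1
    rest = x * x - 1
    d = 1
    for p in S:
        k_p = 0
        while rest % p == 0:  # extract the exponent k_p of p in x^2 - 1
            rest //= p
            k_p += 1
        if k_p > 0:
            d_p = 1 if k_p % 2 == 1 else 2  # d_p in {1, 2}, same parity as k_p
            d *= p ** d_p
    if rest != 1:
        raise ValueError("v and v + 1 are not both S-units")
    y = isqrt((x * x - 1) // d)
    return x, d, y


for v in (1, 2, 3, 8):
    print(v, stormer_pell_data(v, [2, 3]))
```

For S = {2, 3} the pair (8, 9) gives x = 17, d = 18 and y = 4, and (17, 4) is indeed the fundamental solution of $x^2 - 18y^2 = 1$. The four pairs (1, 2), (2, 3), (3, 4), (8, 9) lead to four different values of d, well within the bound $3^2 = 9$.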

Hence, we just need to prove the following proposition.

Proposition 7.3.1

For any positive rational integer d which is not a perfect square, the only possible positive solution
to the Pell equation x2 − dy 2 = 1 where y is a d-unit is the fundamental solution.

Proof
Let $x + y\sqrt{d}$ be the fundamental unit of $\mathbb{Z}[\sqrt{d}]$. Since we are interested in positive solutions of
the Pell equation, by Proposition 7.2.2, we want to show that if
$$y_n = \frac{(x + y\sqrt{d})^n - (x - y\sqrt{d})^n}{2\sqrt{d}}$$
is a d-unit, then n = 1. Suppose there is a solution where n ≠ 1. Since $y_m \mid y_n$ when m | n, we
may assume n = p is prime.

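Dividing by y (which divides $y_p$, so that the quotient is again a d-unit) and still writing $y_p$ for the
quotient, we have
$$y_p = \binom{p}{1}x^{p-1} + \binom{p}{3}x^{p-3}y^2 d + \binom{p}{5}x^{p-5}y^4 d^2 + \ldots \quad (*)$$
Let q be a prime factor of $y_p$; by assumption q | d. Every term of this sum except the first one
is divisible by d, thus
$$q \mid \binom{p}{1}x^{p-1} = px^{p-1}.$$
Since $x^2 - dy^2 = 1$, x is coprime with d so q | p which means q = p. Thus, $y_p$ is a power of p and
in particular divisible by $p^2$ unless p = 2.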

However, as $p \mid \binom{p}{k}$ for 0 < k < p, if p > 3, every term of (∗) is divisible by $p^2$ except the first one,
which is $px^{p-1}$. This is a contradiction.

It remains to settle the cases p = 2 and p = 3. The first one is trivial: we have $y_2 = 2x$ so x = 1
as it’s coprime with d, which is impossible since $x + y\sqrt{d}$ was a non-trivial unit.

Finally, in the case p = 3 we get $y_3 = 3x^2 + dy^2$, and since $x^2 - dy^2 = 1$ this means

$$y_3 = 3x^2 + (x^2 - 1) = (2x - 1)(2x + 1).$$

This is a product of two numbers which differ by 2, thus it can only be a power of 3 if x = 1
since the only powers of 3 which differ by 2 are 1 and 3. This is again impossible as $x + y\sqrt{d}$ is
a non-trivial unit.


Exercise 7.3.1∗ . Prove that $y_m \mid y_n$ iff m | n.



7.4 Units in Complex Cubic Fields, Thue’s Equation and Kobayashi’s Theorem

In this section, we will prove that, for any finite set of rational primes S and any fixed integer k ≠ 0,
the equation u − v = k has finitely many solutions in integral S-units. We will, however, not find an
explicit bound like we did in the last section for k = 1. In fact, our method cannot give bounds, and
it does not let us effectively compute all solutions for a fixed k and S.
Exercise 7.4.1. Why does looking at the $(2^k)^2$ Pell-type equations $ax^2 - by^2 = k$ for squarefree integral
S-units a, b not prove that u − v = k has finitely many solutions in integral S-units?

Thus, instead of considering Pell-type equations $ax^2 - by^2 = k$ that usually have infinitely many
solutions, we shall consider equations of the form $ax^3 - by^3 = k$. Indeed, if $u = \prod_{p \in S} p^{r_p}$ and
$v = \prod_{p \in S} p^{s_p}$ then, by choosing $a_p, b_p \in \{1, 2, 3\}$ such that $a_p \equiv r_p \pmod{3}$ and $b_p \equiv s_p \pmod{3}$ and
defining $a = \prod_{p \in S} p^{a_p}$ and $b = \prod_{p \in S} p^{b_p}$, we get that $(\sqrt[3]{u/a}, \sqrt[3]{v/b})$ is a solution of one of the $3^k$
Thue equations
$$ax^3 - by^3 = k.$$
Indeed, it is a theorem of Thue that such an equation has only finitely many solutions for k ≠ 0. This
is what we’ll prove in this section.
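To see the contrast with Pell-type equations experimentally, one can list the solutions of a Thue equation in a box. The sketch below is a plain brute-force enumeration (the name `thue_solutions_in_box` and the bound are ours); it obviously proves nothing about solutions outside the box.

```python
def thue_solutions_in_box(a, b, k, bound):
    """List all (x, y) with |x|, |y| <= bound and a*x^3 - b*y^3 = k."""
    return [
        (x, y)
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if a * x**3 - b * y**3 == k
    ]


# x^3 - 2y^3 = 1 in the box |x|, |y| <= 50
print(thue_solutions_in_box(1, 2, 1, 50))  # [(-1, -1), (1, 0)]
```

By comparison, the Pell equation $x^2 - 2y^2 = 1$ already has the solutions (±1, 0), (±3, ±2) and (±17, ±12) in the same box, and infinitely many overall.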

The theorem that u − v = k has finitely many solutions in integral S-units is also known as Kobayashi’s
theorem. It is usually stated as follows:

Theorem 7.4.1 (Kobayashi’s Theorem)

Let M be a set of rational integers with finitely many prime divisors, meaning that there are
finitely many rational primes which divide at least one element of M . Then, the translate k + M
has infinitely many prime divisors for any rational integer k 6= 0.

Note that this is indeed equivalent to our result on the finiteness of integral S-units equations: M
has finitely many prime divisors if and only if all its elements are S-units for some finite S, and the
same holds for k + M . So if they’re both sets of S-units for some finite S, we can assume it’s the same
S for both of them but then the equation u − v = k has infinitely many solutions in integral S-units.

Thus, we need to prove the following special case of Thue’s theorem.

Theorem 7.4.2 (Thue)

For any non-zero rational integers a, b and k, the equation $ax^3 + by^3 = k$ has finitely many
solutions in rational integers.

The equation $ax^2 - by^2 = k$ was linked to units in $\mathbb{Q}(\sqrt{b/a})$, thus, for $ax^3 + by^3$ we will consider
units in $\mathbb{Q}(\sqrt[3]{b/a})$. When a/b is a perfect cube this problem is more or less trivial so we can assume that
it isn’t the case, i.e. that $\mathbb{Q}(\sqrt[3]{b/a})$ is a field of degree 3.
Exercise 7.4.2∗ . Prove Theorem 7.4.2 in the case where a/b is a rational cube.

As before, we first consider the case where a = 1 and k = 1 since it corresponds to units of $\mathbb{Q}(\sqrt[3]{b})$.
Indeed, if $x^3 + by^3 = 1$, then $N(x + y\sqrt[3]{b}) = 1$. Why should we expect $\mathbb{Q}(\sqrt[3]{b})$ to have finitely many
such units when there are infinitely many of them for $K = \mathbb{Q}(\sqrt{d})$? It’s because this unit does not
have a term in $\sqrt[3]{b^2}$, so for instance units of that form are absolutely not closed under multiplication,
contrary to the quadratic case.

We shall find a characterisation of the units of $\mathbb{Q}(\sqrt[3]{b})$. This relies on the fact that $\mathbb{Q}(\sqrt[3]{b})$ is a
complex cubic field , not because it’s not real but because some of its conjugate fields $\mathbb{Q}(j\sqrt[3]{b})$ for some
primitive third root of unity j aren’t.

We also say a number field K is totally real if all its conjugate fields are real. Also, we say an
embedding σ is real if σK is real, and complex otherwise. Complex embeddings come in pairs $\sigma, \overline{\sigma}$:
this will be quite useful for us as we only need to deduce information on an element of $\mathbb{Q}(\sqrt[3]{b})$ and one
of its conjugates to have information on all its conjugates.

We define fundamental units almost as before, but this time we don’t require θ to be non-trivial if
there are no non-trivial units (i.e. we allow θ = 1). There always exists a non-trivial unit, but since
it’s non-trivial to show and we do not need it (it’s better for us if there are fewer units) we do not do it.
Again, we say "unit of K" to mean "unit of OK ".

Definition 7.4.1 (Fundamental Unit For Complex Cubic Fields)

Let K ⊆ R be a complex cubic field. A unit θ ≥ 1 of K is said to be a fundamental unit if it
generates all others: any unit can be written as $\pm\theta^n$.

Proposition 7.4.1

Let K ⊆ R be a complex cubic field. If K has a non-trivial unit, then K has a (unique)
fundamental unit.

Proof

Again, uniqueness is obvious from existence. Suppose K has a unit greater than 1, otherwise all
its units are ±1 so we can take θ = 1 (this is in fact impossible and even if it were possible we
wouldn’t call it a fundamental unit because the units are generated by no element).

We imitate the proof of Proposition 7.1.2. The key step is the existence of a minimal unit θ > 1.
A unit greater than 1 exists, because if ε ≠ ±1 is a unit then $|\varepsilon|^{\pm 1}$ is a unit greater than 1 for some choice
of ±1.

Now, let’s prove that a minimal one exists. As before, we prove that there are finitely many
units in any interval [a, b] for positive a, b. Suppose ε ∈ [a, b] is a unit. Let $\sigma, \overline{\sigma}$ be the complex
embeddings of K. Then,
$$1 = |\varepsilon\,\sigma(\varepsilon)\,\overline{\sigma}(\varepsilon)| = \varepsilon|\sigma(\varepsilon)|^2.$$
Thus, the absolute values of the conjugates of ε are all bounded, which means that the minimal
polynomial of ε has bounded coefficients: there exist finitely many such ε.

Finally, we again proceed as in the quadratic case. Let θ be the minimal unit greater than 1.
Suppose, for the sake of a contradiction, that ε > 1 is the minimal unit which is not a power of θ. Then, ε > θ by minimality of θ,
which means 1 < ε/θ < ε, contradicting the minimality of ε.


Now that we have characterised units of $\mathbb{Q}(\sqrt[3]{b/a})$, we characterise elements of $\mathcal{O}_{\mathbb{Q}(\sqrt[3]{b/a})}$ of norm
k for a fixed k. This is related to our equation $ax^3 + by^3 = k$: indeed, if x, y is an integral solution of
this equation, then $N(ax + y\sqrt[3]{a^2 b}) = a^2 k$.

Proposition 7.4.2

Let k ∈ Z be a non-zero rational integer and θ the fundamental unit of $K = \mathbb{Q}(\sqrt[3]{d})$. There exist
elements $\alpha_1, \ldots, \alpha_n \in \mathbb{Z}[\sqrt[3]{d}]$ of norm k such that the elements of $\mathbb{Z}[\sqrt[3]{d}]$ of norm k all have the form
$\pm\theta^i\alpha_j$.

Proof

The proof is again the same as before. For each $(a, b, c) \in (\mathbb{Z}/k\mathbb{Z})^3$, pick an element of norm k
$$\alpha_{(a,b,c)} = a' + b'\sqrt[3]{d} + c'\sqrt[3]{d^2}$$
with $(a', b', c') \equiv (a, b, c) \pmod{k}$ if there exists one. Then, if $\alpha = a + b\sqrt[3]{d} + c\sqrt[3]{d^2}$ has norm k,
$$\frac{\alpha}{\alpha_{(a,b,c) \bmod k}}$$
is a unit of OK so has the form $\pm\theta^i$.




Remark 7.4.1
We did not consider the equation N (α) = k in OK because that would require finding the structure
of OK which is slightly cumbersome. This is left as Exercise 7.5.27 and we let the reader adapt
the proof for OK (the conclusion is that the elements of norm k are exactly those of the form
$\pm\theta^i\alpha_j$).

Finally, we prove Theorem 7.4.2.

Proof of Theorem 7.4.2


Let $K = \mathbb{Q}(\sqrt[3]{b/a}) = \mathbb{Q}(\sqrt[3]{d})$ where $d = a^2 b$, and denote its fundamental unit by θ.
From the previous consideration and Proposition 7.4.2, it suffices to show that, for any non-zero
element α ∈ OK , there are finitely many rational integers n such that $\alpha\theta^n$ has the form $x + y\sqrt[3]{d}$.
If θ = 1 is trivial, the claim is obvious, thus suppose θ > 1 is non-trivial.

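Let j be a primitive third root of unity and σ be the complex embedding of K sending $\sqrt[3]{d}$ to
$j\sqrt[3]{d}$. Since $j^2 + j + 1 = 0$, $\alpha\theta^n$ has the form $x + y\sqrt[3]{d}$ if and only if
$$\alpha\theta^n + j\sigma(\alpha)\sigma(\theta)^n + j^2\overline{\sigma}(\alpha)\overline{\sigma}(\theta)^n = 0.$$
Indeed, if $\beta = r + s\sqrt[3]{d} + t\sqrt[3]{d^2}$, we have
$$3t\sqrt[3]{d^2} = (r + s\sqrt[3]{d} + t\sqrt[3]{d^2}) + (jr + j^2 s\sqrt[3]{d} + t\sqrt[3]{d^2}) + (j^2 r + js\sqrt[3]{d} + t\sqrt[3]{d^2}) = \beta + j\sigma(\beta) + j^2\overline{\sigma}(\beta).$$
Thus, we wish to show that the linear recurrence of algebraic numbers
$$\alpha\theta^n + j\sigma(\alpha)\sigma(\theta)^n + j^2\overline{\sigma}(\alpha)\overline{\sigma}(\theta)^n$$
has finitely many zeros. Suppose otherwise. By Corollary 8.5.2 of the Skolem-Mahler-Lech theorem 8.5.1 which will
be proven in Chapter 8, there exist two embeddings $\sigma_1$ and $\sigma_2$ such that
$$\frac{\sigma_1(\theta)}{\sigma_2(\theta)}$$
is a root of unity. By composing with another embedding, we may assume $\sigma_1 = \mathrm{id}$, and by
symmetry between j and $j^2$ we may assume $\sigma_2 = \sigma$.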

By an argument similar to Problem 6.3.1, we can show that the only roots of unity in $\mathbb{Q}(\sqrt[3]{d}, j)$
are ±1, ±j and $\pm j^2$. Hence, we must have $\theta/\sigma(\theta) \in \{\pm 1, \pm j, \pm j^2\}$. Write $\theta = x + y\sqrt[3]{d} + z\sqrt[3]{d^2}$.
Suppose θ/σ(θ) = ±1. We get
$$x + y\sqrt[3]{d} + z\sqrt[3]{d^2} = \pm(x + yj\sqrt[3]{d} + zj^2\sqrt[3]{d^2})$$
which means y = z = 0 as j has degree two over $\mathbb{Q}(\sqrt[3]{d})$ (since it’s not in $\mathbb{Q}(\sqrt[3]{d}) \subseteq \mathbb{R}$) so
$1, \sqrt[3]{d}, \sqrt[3]{d^2}, j, j\sqrt[3]{d}, j^2\sqrt[3]{d^2}$ are Q-linearly independent. This is impossible since θ is non-trivial by
assumption. The other cases yield similar contradictions, which finishes the proof.



Exercise 7.4.3∗ . Prove that the only roots of unity of $\mathbb{Q}(\sqrt[3]{d}, j)$ are ±1, ±j and $\pm j^2$.

Exercise 7.4.4∗ . Prove that $\theta/\sigma(\theta) \in \{\pm j, \pm j^2\}$ is also impossible.

Remark 7.4.2
In fact, if K is a number field of degree n with real embeddings $\tau_1, \ldots, \tau_r$ and complex embeddings
$\sigma_1, \overline{\sigma_1}, \ldots, \sigma_s, \overline{\sigma_s}$, the Dirichlet unit theorem states that the units of K have the form
$$\zeta\varepsilon_1^{n_1} \cdot \ldots \cdot \varepsilon_{r+s-1}^{n_{r+s-1}}$$
where ζ is a root of unity and $\varepsilon_1, \ldots, \varepsilon_{r+s-1} \in K$ are multiplicatively independent units. The
case we treated corresponded to (r, s) = (2, 0) and (r, s) = (1, 1) (although we didn’t prove they
were multiplicatively independent for complex cubic fields, i.e. that the units are not all roots of
unity).

Remark 7.4.3
Thue in fact proved more generally that if f ∈ Z[X, Y ] is an irreducible homogeneous polynomial
of degree n ≥ 3, i.e. f is homogeneous and f (X, 1) is irreducible in Z[X], the equation

f (x, y) = k

has a finite number of integral solutions x, y ∈ Z for any fixed k ∈ Z. In fact, this is deeply linked
with the irrationality measure of algebraic numbers (it can also be proven with p-adic methods
like the Skolem-Mahler-Lech theorem (see [6]) but Thue proved it that way).

The equality f (x, y) = k yields $f(x/y, 1) = k/y^n$ which means that x/y is very close to a root α of
f (X, 1). In fact, it is equivalent to the finiteness of pairs of rational integers (p, q) with q ≠ 0 such that
$$\left|\alpha - \frac{p}{q}\right| < \frac{C}{q^n}$$
for any C > 0. Thue proved that there were finitely many pairs (p, q) such that
$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{n/2+1+\varepsilon}}$$
for any ε > 0, thus establishing his theorem. See Silverman-Tate, chapter 5, section 3 [26].

We say a real number α ∈ R has irrationality measure µ if µ is the smallest real number such that,
for any ε > 0, there are finitely many pairs of rational integers (p, q) with q ≠ 0 such that
$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{\mu+\varepsilon}}.$$

Dirichlet’s approximation theorem (Lemma 7.1.1) shows that any irrational real number has irrationality
measure at least 2. Conversely, the very deep Thue-Siegel-Roth theorem states that any irrational real
algebraic number has irrationality measure exactly 2 (Siegel proved you could take $\mu = 2\sqrt{n}$ and
Roth µ = 2).

Remark 7.4.4
In fact, the S-unit equation u − v = 1 also has a finite number of rational solutions. This is
considerably harder than Kobayashi’s theorem.

7.5 Exercises
Diophantine Equations
Exercise 7.5.1† (ISL 1990). Find all positive rational integers n such that $\frac{1^2 + \ldots + n^2}{n}$ is a perfect
square.

Exercise 7.5.2† (BMO 1 2006). Let n be a rational integer. Prove that, if $2 + 2\sqrt{1 + 12n^2}$ is a rational
integer, then it is a perfect square.
Exercise 7.5.3. Find all rational integers n such that 2n + 1 and 3n + 1 are both perfect squares.
Exercise 7.5.4† (RMM 2011). Let Ω(·) denote the number of prime factors counted with multiplicity
of a rational integer, and define $\lambda(\cdot) = (-1)^{\Omega(\cdot)}$. Prove that there are infinitely many rational integers n
such that λ(n) = λ(n + 1) = 1 and infinitely many rational integers n such that λ(n) = λ(n + 1) = −1.
Exercise 7.5.5† . Let k be a rational integer. Prove that there are infinitely many positive integers n such
that $n^2 + k \mid n!$.
Exercise 7.5.6 (BAMO 2011). Does there exist a row of the Pascal triangle with four distinct numbers
a, b, c, d satisfying a = 2b and c = 2d?
Exercise 7.5.7 (Bulgaria National Olympiad 1999). Prove that there are infinitely many rational
integers x, y, z, t such that $x^3 + y^3 + z^3 + t^3 = 1999$.
Exercise 7.5.8. Let n be a positive rational integer which is not a perfect square. Prove that there
are infinitely many rational integers a, b, c, d such that

$$(a^2 + nd^2)(b^2 + nd^2)(c^2 + nd^2)$$

is a perfect square.
Exercise 7.5.9 (ISL 1999). Find two infinite increasing sequences of rational integers $(a_n)_{n \ge 0}$ and
$(b_n)_{n \ge 0}$ such that $a_n(a_n + 1) \mid b_n^2 + 1$ for any n.
Exercise 7.5.10 (EGMO 2016). Let S be the set of all positive integers n such that
$n^4$ has a divisor in the range $[n^2 + 1, n^2 + 2n]$. Prove that there are infinitely many elements of S
congruent to 0, 1, 2, 5, 6 modulo 7 and no element congruent to 3 or 4.

Pell-Type Equations
Exercise 7.5.11† . Let d be a rational integer. Solve the equation $x^2 - dy^2 = 1$ over Q.
Exercise 7.5.12. Let p ≡ −1 (mod 4) be a rational prime. Prove that the equation $x^2 - py^2 = 2\left(\frac{2}{p}\right)$
has a non-trivial solution over Z.
Exercise 7.5.13† . Prove that the equation $x^2 - 34y^2 = -1$ has no non-trivial solution in Z despite
−1 being a square modulo 34.
Exercise 7.5.14. Solve the equation $3x^2 - 2y^2 = 10$ over Z.

Fundamental Units
Exercise 7.5.15† . Let d ≡ 1 (mod 4) be a squarefree integer, and suppose $\eta = \frac{a + b\sqrt{d}}{2} \notin \mathbb{Z}[\sqrt{d}]$ is the
fundamental unit of $\mathbb{Q}(\sqrt{d})$. Prove that $\eta^n \in \mathbb{Z}[\sqrt{d}]$ if and only if 3 | n.
Exercise 7.5.16† . Let d ≠ 1 be a squarefree rational integer, and suppose that $2^{2n} + 1 = dm^2$ for
some integers n, m ≥ 0. Show that $2^n + m\sqrt{d}$ is the fundamental unit of $\mathbb{Q}(\sqrt{d})$, provided that d ≠ 5.
Exercise 7.5.17† . Suppose that $d = a^2 \pm 1$ is squarefree, where a ≥ 1 is some rational integer and
let k ≥ 0 be a rational integer. Suppose that the equation $x^2 - dy^2 = m$ has a solution in Z for some
|m| < ka. For sufficiently large d, prove that |m|, d + m or d − m is a square.

Exercise 7.5.18† . Solve completely the equation $x^3 + 2y^3 + 4z^3 = 6xyz + 1$ which was seen in
Problem 6.2.2.
Exercise 7.5.19† (Weak Dirichlet’s Unit Theorem). Let K be a number field with r real embeddings
and s pairs of complex embeddings. Prove that there exist units $\varepsilon_1, \ldots, \varepsilon_k$ with k ≤ r + s − 1 such
that any unit of K can be written uniquely in the form

$$\zeta\varepsilon_1^{n_1} \cdot \ldots \cdot \varepsilon_k^{n_k}$$

for some integers $n_i$ and a root of unity ζ.


Exercise 7.5.20† (Gabriel Dospinescu). Find all monic polynomials f ∈ Q[X] such that $f(X^n)$ is
reducible in Q[X] for all n ≥ 2 but f is irreducible.

Miscellaneous
Exercise 7.5.21† (Liouville’s Theorem). Let α be an algebraic number of degree n. Prove that there
exists a constant C > 0 such that
$$\left|\alpha - \frac{p}{q}\right| > \frac{C}{q^n}$$
for any p, q ∈ Z (with q > 0).
Exercise 7.5.22† . Prove that $5n^2 \pm 4$ is a perfect square for some choice of ± if and only if n is a
Fibonacci number.

Exercise 7.5.23† (ELMO 2020). Suppose n is a Fibonacci number modulo every rational prime.
Must it follow that n is a Fibonacci number?
Exercise 7.5.24† (Nagell, Ko-Chao, Chein). Let p be an odd rational prime. Suppose that x, y ∈ Z
are rational integers such that $x^2 - y^p = 1$. Prove that 2 | y and p | x. Deduce that this equation has
no solution for p ≥ 5. (The case p = 3 is Exercise 8.7.19† .)

Exercise 7.5.25† . Prove that there are at most $3^{|S|}$ pairs of S-units distant by 2.
Exercise 7.5.26† . Assuming the finiteness of rational solutions to the S-unit equation u + v = 1 for
any finite S, determine all functions f : Z → Z such that m − n | f (m) − f (n) for any m, n and f is a
bijection modulo sufficiently large primes.

Exercise 7.5.27. Let m be a rational integer. What are the integers of $\mathbb{Q}(\sqrt[3]{m})$?
Chapter 8

p-adic Analysis

Prerequisites for this chapter: Section A.1 for the whole chapter, Sections 6.2 and 6.4 for Section 8.5
and Chapter 2 for Section 8.6. Chapter 6 is recommended.

p-adic numbers have many applications and are absolutely fundamental in number theory nowadays.
That said, this chapter will be almost exclusively dedicated to proving the Skolem-Mahler-Lech theorem
8.5.1 and related results. We refer the reader to the Addendum 3A of [2] for more applications of p-adic
numbers.

8.1 p-adic Integers and Numbers


Again, this section will be a bit abstract. If you have trouble following, skip to Problem 8.3.1 for
motivation. In elementary number theory, when working with diophantine equations, it is often useful
to reduce the equation modulo a rational prime p. If that is not sufficient, one might look modulo $p^2$,
then modulo $p^3$, etc. p-adic numbers are what you get when you consider something modulo $p^n$ for all
n. More precisely, a p-adic integer is the data of an element of $\mathbb{Z}/p\mathbb{Z}$, of an element of $\mathbb{Z}/p^2\mathbb{Z}$, of an
element of $\mathbb{Z}/p^3\mathbb{Z}$, ..., such that these elements are compatible with each other (for instance, the element of $\mathbb{Z}/p^2\mathbb{Z}$ is
congruent to the element of $\mathbb{Z}/p\mathbb{Z}$ modulo p).

Definition 8.1.1 (p-adic Integers)

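A p-adic integer a is a tuple

$$(a_1, a_2, a_3, \ldots) \in \mathbb{Z}/p\mathbb{Z} \times \mathbb{Z}/p^2\mathbb{Z} \times \mathbb{Z}/p^3\mathbb{Z} \times \ldots$$

such that $a_i \equiv a_j \pmod{p^{\min(i,j)}}$ for any i, j. The set of p-adic integers is denoted $\mathbb{Z}_p$.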

The p-adic integers $\mathbb{Z}_p$¹ form an integral domain under component-wise addition and multiplication,
meaning that
$$(a_1, a_2, a_3, \ldots)(b_1, b_2, b_3, \ldots) := (a_1 b_1, a_2 b_2, a_3 b_3, \ldots)$$
and
$$(a_1, a_2, a_3, \ldots) + (b_1, b_2, b_3, \ldots) := (a_1 + b_1, a_2 + b_2, a_3 + b_3, \ldots).$$

Exercise 8.1.1∗ . Check that Zp is an integral domain. What is its characteristic?

¹ Now you know why you shouldn’t use $\mathbb{Z}_n$ for $\mathbb{Z}/n\mathbb{Z}$! If you want a shorter notation you can use $\mathbb{Z}/n$.

Since p-adic integers are supposed to represent a tuple of local data modulo powers of p, it
makes sense to associate the rational integer a ∈ Z with the p-adic integer $(a \pmod{p}, a \pmod{p^2}, a \pmod{p^3}, \ldots)$.
Thus, by abuse of notation, we say $\mathbb{Z} \subseteq \mathbb{Z}_p$ because of this embedding.² In fact, since
$a \pmod{p^n}$ makes sense when a is a rational number with denominator coprime with p, we even get

Z(p) ⊆ Zp

where Z(p) denotes the rational numbers with denominator coprime with p.

Exercise 8.1.2∗. Check that $a \mapsto (a \pmod p, a \pmod{p^2}, a \pmod{p^3}, \ldots)$ is indeed an embedding of $\mathbb{Z}_{(p)}$
into $\mathbb{Z}_p$, i.e. that it's injective.
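
To make the embedding of $\mathbb{Z}_{(p)}$ concrete, here is a minimal Python sketch which computes the first few coordinates of the image of a rational number and checks that they are compatible in the sense of Definition 8.1.1. The function names `embed` and `is_compatible` are ours and purely illustrative; the sketch assumes Python 3.8 or later for the modular inverse `pow(x, -1, m)`.

```python
from fractions import Fraction

def embed(r, p, depth):
    """First `depth` coordinates of the image of the rational r in Z_p.

    The denominator of r must be coprime with p. The n-th coordinate
    (counting from 1) is r reduced modulo p**n.
    """
    r = Fraction(r)
    if r.denominator % p == 0:
        raise ValueError("denominator must be coprime with p")
    coords = []
    for n in range(1, depth + 1):
        modulus = p ** n
        inv_den = pow(r.denominator, -1, modulus)  # inverse of the denominator mod p**n
        coords.append((r.numerator * inv_den) % modulus)
    return coords

def is_compatible(coords, p):
    """Check a_j = a_i (mod p**i) for all i <= j (coordinates counted from 1)."""
    return all(coords[j] % p ** (i + 1) == coords[i]
               for i in range(len(coords))
               for j in range(i, len(coords)))

third = embed(Fraction(1, 3), 5, 3)   # coordinates of 1/3 in Z_5
minus_one = embed(-1, 5, 3)           # coordinates of -1 in Z_5
```

For instance, the coordinates of 1/3 in $\mathbb{Z}_5$ start with 2, 17, 42, since 3 · 2 = 6, 3 · 17 = 51 and 3 · 42 = 126 are congruent to 1 modulo 5, 25 and 125 respectively.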

Remark 8.1.1
We use the notation $\mathbb{Z}_{(p)}$ because it is the localisation of $\mathbb{Z}$ at the prime ideal $(p)$.

Now, suppose we want to make sense of 1/p p-adically. This can’t be a p-adic integer because
1/p makes no sense modulo p. Thus, we define p-adic numbers by allowing a formal (subject to some
relations) division of p-adic integers by powers of p.

Definition 8.1.2 (p-adic numbers)

A p-adic number is an element of the form $p^k a$ for some $k \in \mathbb{Z}$ and $a \in \mathbb{Z}_p$. The set of p-adic
numbers is denoted $\mathbb{Q}_p$.

With this, we can now say (somewhat abusively) that $\mathbb{Q} \subseteq \mathbb{Q}_p$ by associating the rational number
$r = p^k a$ with $a \in \mathbb{Z}_{(p)}$ to $p^k (a \pmod p, a \pmod{p^2}, a \pmod{p^3}, \ldots)$. For instance,
$1/p = p^{-1}(1, 1, 1, \ldots)$.

p-adic numbers now form a field, and as we have seen numerous times, working in a field is always
great. Here is how multiplication and addition are defined: let $x = p^k a$ and $y = p^m b$ be p-adic numbers.
Suppose without loss of generality that $m, k < 0$, otherwise they are p-adic integers. Multiplication is
defined as $(p^k a)(p^m b) = p^{k+m} ab$, and addition by

\[ p^k a + p^m b = p^{k+m}(p^{-m} a + p^{-k} b) = p^{k+m}(p^{-m} a_1 + p^{-k} b_1, p^{-m} a_2 + p^{-k} b_2, \ldots) \]

as $p^{-m} a$ and $p^{-k} b$ are p-adic integers by assumption.³

It remains to prove that every non-zero element of $\mathbb{Q}_p$ has a multiplicative inverse; so far we have only
shown that it is a ring. This is easy, but before we do it let us define the p-adic valuation of p-adic numbers.

Proposition 8.1.1 (Units of Zp )

The units in $\mathbb{Z}_p$ (we will also call them "units of $\mathbb{Q}_p$" abusively), $\mathbb{Z}_p^\times$, are the p-adic integers with
non-zero first coordinate: $a = (a_1, \ldots)$ with $a_1 \not\equiv 0 \pmod p$.

Proof

This is obvious: if $a_1 \equiv 0$ then $a_1 b_1 \equiv 0$ for any $b_1$, so $ab$ can never be $1 = (1, 1, 1, \ldots)$. Conversely,
if $a_1 \not\equiv 0$, the components of $a$ are all invertible since $a_n \equiv a_1 \pmod p$ is coprime with $p^n$ for any $n$, so

\[ a^{-1} = (a_1^{-1}, a_2^{-1}, a_3^{-1}, \ldots). \]

² $\mathbb{Z}$ is isomorphic to the subset of p-adic integers of the previous form; in general, when $f : S \to U$ is an injective
morphism, we call $f$ an embedding of $S$ into $U$. Notice that the regular embeddings of a number field are embeddings
into $\mathbb{C}$. See also Remark 6.2.1.
³ Technically, as we have defined p-adic numbers, $p^k(a_1, a_2, a_3, \ldots)$ and $(p^k a_1, p^k a_2, p^k a_3, \ldots)$ are distinct for positive
$k$. Indeed, we said our division by $p$ was formal, which means a p-adic number is a tuple $(k, a) \in \mathbb{Z} \times \mathbb{Z}_p$ which we write
as $p^k a$. This is however very easy to fix: just identify these two p-adic numbers to be the same.

Definition 8.1.3 (p-adic valuation)

Let $z \in \mathbb{Q}_p$ be a non-zero p-adic number. Write $z = p^k a$ where $a \in \mathbb{Z}_p^\times$ is a unit. The p-adic
valuation of $z$, $v_p(z)$, is the integer $k$. We also define $v_p(0) = +\infty$.

Of course, the p-adic valuation of rational integers is the same as the regular p-adic valuation. Now
it follows directly that $\mathbb{Q}_p$ is a field: if $z = p^{v_p(z)} a$ is non-zero, then $z^{-1} = p^{-v_p(z)} a^{-1}$.
To finish this section, we mention one nice property of p-adic numbers. Even if this result
does not convince you of the use of $\mathbb{Q}_p$, it should at least convince you that it is a very nice object.

Theorem 8.1.1 (Hensel’s Lemma)

Let $f \in \mathbb{Z}_p[X]$ be a polynomial. If, for some $a \in \mathbb{Z}_p$, $|f(a)|_p < 1$ and $|f'(a)|_p = 1$, then $f$ has a
unique root $\alpha \equiv a \pmod p$ in $\mathbb{Z}_p$.

Proof

This is almost exactly the regular Hensel lemma 5.3.1: if $f$ has a root $a$ modulo $p$, i.e. $|f(a)|_p < 1$,
such that $p \nmid f'(a)$, i.e. $|f'(a)|_p = 1$, then $f$ has a unique root $\alpha_k$ in $\mathbb{Z}/p^k\mathbb{Z}$ congruent to $a$ modulo
$p$. The number $\alpha = (\alpha_1, \alpha_2, \alpha_3, \ldots)$ is then the unique root of $f$ in $\mathbb{Z}_p$ congruent to $a$ modulo
$p$. The only difference is that, in our previous version of Hensel's lemma, $f$ had coefficients in $\mathbb{Z}$
and not in $\mathbb{Z}_p$. However, it is easy to check that this does not change anything in the proof.


This usually reduces the problem of finding roots of polynomials in Qp to finding roots in Fp . For
instance, there is a square root of −1 in Q5 .
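
As an illustration of the square root of −1 in $\mathbb{Q}_5$, here is a short Python sketch which lifts the root 2 of $X^2 + 1$ modulo 5 to a root modulo $5^n$, one coordinate at a time, with a Newton step. This is only one possible way to carry out the lifting and is not taken from the proof above; the name `hensel_lift` is hypothetical, and the code assumes Python 3.8 or later.

```python
def hensel_lift(f, df, a, p, depth):
    """Lift a simple root a of f modulo p to a root modulo p**depth.

    f and df are Python functions evaluating the polynomial and its
    derivative on integers.
    """
    if f(a) % p != 0 or df(a) % p == 0:
        raise ValueError("need f(a) = 0 and f'(a) != 0 modulo p")
    root = a % p
    for n in range(2, depth + 1):
        modulus = p ** n
        # Newton step root - f(root)/f'(root), computed modulo p**n;
        # f'(root) is invertible because it is non-zero modulo p.
        root = (root - f(root) * pow(df(root), -1, modulus)) % modulus
    return root

# Square root of -1 in Z_5, modulo 5**4 = 625, starting from 2**2 + 1 = 5.
i5 = hensel_lift(lambda x: x * x + 1, lambda x: 2 * x, 2, 5, 4)
```

The successive coordinates obtained this way are 2, 7, 57, 182: indeed $7^2 + 1 = 50$, $57^2 + 1 = 3250 = 26 \cdot 125$ and $182^2 + 1 = 33125 = 53 \cdot 625$.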

8.2 p-adic Absolute Value


This p-adic valuation lets us define an absolute value on $\mathbb{Q}_p$: $|z|_p = p^{-v_p(z)}$ (and $|0|_p = 0$).

Definition 8.2.1 (p-adic Absolute Value)

The p-adic absolute value on $\mathbb{Q}_p$ is defined as $|\cdot|_p = p^{-v_p(\cdot)}$ (in particular $|p|_p = 1/p$). The regular
absolute value on $\mathbb{R}$ (or $\mathbb{C}$) will be denoted $|\cdot|_\infty$.

By an absolute value, we mean a function $|\cdot|_p : \mathbb{Q}_p \to \mathbb{R}_{\geq 0}$ which is multiplicative, zero only at
zero, and which satisfies the triangle inequality. The first two properties are obvious, and the last
one follows from the following stronger inequality.

Proposition 8.2.1 (Strong Triangle Inequality)*

For any p-adic numbers $x, y \in \mathbb{Q}_p$, we have $|x + y|_p \leq \max(|x|_p, |y|_p)$ with equality if $|x|_p \neq |y|_p$.

Proof

This is equivalent to $v_p(x + y) \geq \min(v_p(x), v_p(y))$ with equality if $v_p(x) \neq v_p(y)$, which is obvious.


Notice that with this absolute value, the p-adic integers are now a ball: $\mathbb{Z}_p = \{z \in \mathbb{Q}_p : |z|_p \leq 1\}$
since p-adic integers are the p-adic numbers with non-negative p-adic valuation.

With this norm we can now define a distance on $\mathbb{Q}_p$: $d(x, y) = |x - y|_p$. This is completely analogous
to $\mathbb{R}$ and $\mathbb{C}$, but now two numbers are very close if their difference is divisible by a large power of $p$. With
this distance, we can define convergence: a sequence $(a_n)_{n \geq 0}$ of p-adic numbers converges to $a \in \mathbb{Q}_p$
if $d(a, a_n) \to 0$, i.e. $|a - a_n|_p \to 0$. This is also equivalent to $v_p(a - a_n) \to +\infty$. For instance, the
sequence $(p^n)_{n \geq 0}$ converges to 0 p-adically.

Remark 8.2.1
We will usually use xn → 0 to mean that xn goes to 0 p-adically, but sometimes it will also mean
that xn → 0 over R. We hope that the distinction will be made clear from context; the latter will
normally be used when xn is a sequence of norms of p-adic numbers.
Similarly, we can define convergence of series $\sum_i a_i$: we say the series converges if its partial sums
$b_n = \sum_{i=0}^n a_i$ converge. Here is a fundamental proposition, which shows that the situation is very different
in the p-adic case compared to the real or complex case.

Proposition 8.2.2 (p-adic Convergence of Series)*

The series $\sum_i a_i$ converges if and only if $a_n \to 0$.

Proof

It is clear that if it converges, $a_n = \sum_{i=0}^{n} a_i - \sum_{i=0}^{n-1} a_i$ converges to 0. The surprising part is that
the converse also holds. If $a_n \to 0$, we can assume they are all p-adic integers since there will
only be a finite number of non-integral $a_i$ ($a_n \in \mathbb{Z}_p$ iff $|a_n|_p \leq 1$).

Consider the $k$th component of the series $\sum_i a_i$: it is the sum of the $k$th components of the $a_i$. But
since $a_m \to 0$, the $k$th component of $a_i$ is zero for sufficiently large $i$. Thus, the $k$th component
of $\sum_i a_i$ is a sum of a finite number of terms for each $k$, which means that they are all well-defined
and thus $\sum_i a_i$ is too. (Looking at the $k$th component is equivalent to reducing modulo $p^k$: there
are a finite number of $a_i$ not divisible by $p^k$ so the partial sums eventually stabilise modulo $p^k$,
which means that the series converges p-adically since this is true for all $k$.)


Exercise 8.2.1∗ . Convince yourself of this proof.
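
To see the stabilisation modulo $p^k$ at work, here is a small Python sketch for the series $\sum_{i \geq 0} p^i$, whose terms tend to 0 p-adically. Its partial sums become constant modulo $p^k$, and the value they stabilise at is the reduction of $1/(1 - p)$, as the geometric series formula suggests. The helper name `partial_sums_mod` is ours, and the code assumes Python 3.8 or later.

```python
def partial_sums_mod(p, k, terms):
    """Partial sums of the series sum_{i >= 0} p**i, reduced modulo p**k."""
    modulus = p ** k
    total, sums = 0, []
    for i in range(terms):
        total = (total + p ** i) % modulus
        sums.append(total)
    return sums

p, k = 5, 4
sums = partial_sums_mod(p, k, 10)
# 1/(1 - p) makes sense modulo p**k because 1 - p is coprime with p.
limit = pow((1 - p) % p ** k, -1, p ** k)
```

Here the partial sums modulo 625 are 1, 6, 31, 156, 156, 156, ..., and 156 is indeed the inverse of −4 modulo 625 since $-4 \cdot 156 = -624 \equiv 1$.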

Over $\mathbb{R}$ this is very wrong: the harmonic series $\sum_{i \geq 1} \frac{1}{i}$ diverges but $\frac{1}{i} \to 0$. As a corollary, we get
a very simple criterion for the convergence of a sequence $(a_n)_{n \geq 0}$.

Corollary 8.2.1*

A sequence (an )n≥0 of p-adic numbers converges if and only if an+1 − an → 0.


Proof

Apply Proposition 8.2.2 to the series $\sum_i (a_{i+1} - a_i)$ (the $n$th partial sum is $a_{n+1} - a_0$).


Exercise 8.2.2∗. Prove that the strong triangle inequality also holds for series: if $a_i \to 0$ then
$\left|\sum_i a_i\right|_p \leq \max_i |a_i|_p$ with equality if the maximum is achieved only once.


Let us talk a bit more about the p-adic absolute value. Recall that real numbers are constructed
from rational numbers by adding the limits of sequences which should converge but do not in $\mathbb{Q}$. Here
is an example. If you write down the decimal digits of $\sqrt{2}$ one after the other, you get a sequence of
rational numbers converging (in $\mathbb{R}$) to $\sqrt{2}$. But in $\mathbb{Q}$, this sequence does not have a limit (so does not
converge) as $\sqrt{2} \notin \mathbb{Q}$. You might ask "how do we determine which sequences should converge without having
defined $\mathbb{R}$ first?". This is achieved by the notion of a Cauchy sequence: a sequence $(a_n)_{n \geq 0}$ such that,
for any $\varepsilon > 0$, $|a_m - a_n| \leq \varepsilon$ for sufficiently large $m$ and $n$ ($m, n \geq N$ for some $N$).
This process is called completing $\mathbb{Q}$ with respect to $|\cdot|_\infty$, and $\mathbb{R}$ is said to be the completion of $\mathbb{Q}$
with respect to $|\cdot|_\infty$. For this reason we shall also denote $\mathbb{R}$ by $\mathbb{Q}_\infty$.⁴ We do not discuss the technical
details here, but it turns out the p-adic fields we constructed are the completions of $\mathbb{Q}$ with respect
to the p-adic absolute value $|\cdot|_p$ (the fact that Cauchy sequences converge follows from the stronger
Corollary 8.2.1).⁵ In fact, the only fields you can get by completing $\mathbb{Q}$ with respect to some absolute
value are $\mathbb{R}$ and the p-adic fields $\mathbb{Q}_p$ (thus called the completions of $\mathbb{Q}$) by Exercise 8.7.9†.⁶
The completions of $\mathbb{Q}$ (and their finite extensions) are called local fields (because they have local
data) while $\mathbb{Q}$ and its finite extensions are called global fields.⁷ In ?? we will see one instance of how
you can piece local data together to get global data (the Hasse-Minkowski principle ??). More simply,
though, we have the following proposition.

Proposition 8.2.3 (Product Formula)

For any non-zero $x \in \mathbb{Q}$, we have

\[ |x|_\infty \cdot \prod_p |x|_p = 1. \]

Exercise 8.2.3∗ . Prove the product formula.
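
Before proving the product formula, one may want to test it numerically. The following Python sketch computes p-adic valuations and absolute values of rational numbers exactly, and multiplies $|x|_\infty$ by the finitely many $|x|_p$ which differ from 1. All function names are ours and only serve the illustration.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("v_p(0) = +infinity is not handled here")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:      # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:      # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v

def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-v_p(x)), as an exact fraction."""
    return Fraction(p) ** (-vp(x, p))

def primes_dividing(n):
    """Primes dividing the positive integer n, by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def product_of_absolute_values(x):
    """|x|_inf times the product of |x|_p over the primes where |x|_p != 1."""
    x = Fraction(x)
    result = abs(x)
    for p in primes_dividing(abs(x.numerator) * x.denominator):
        result *= abs_p(x, p)
    return result
```

For $x = -360/49$ we have $|x|_\infty = 360/49$, $|x|_2 = 1/8$, $|x|_3 = 1/9$, $|x|_5 = 1/5$ and $|x|_7 = 49$, whose product is 1.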

8.3 Binomial Series


This section will be a bit more concrete. We wish to make sense of $a^b$ for p-adic numbers $a$ and $b$.
Actually, already over $\mathbb{Q}$, $a^b$ doesn't always make sense in $\mathbb{Q}$: for instance $2^{1/2} \notin \mathbb{Q}$ (and if we consider
$\mathbb{Q}(\sqrt{2})$, then $\sqrt{2}^{\sqrt{2}}$ is not even algebraic by a deep result of Gelfond-Schneider). Thus we will only try
to make sense of $a^b$ for $b \in \mathbb{Z}_p$, although even there it won't be defined canonically for all $a \in \mathbb{Q}_p$: we
will define $a^b$ only when $a \equiv 1 \pmod p$ and $b \in \mathbb{Z}_p$.
Write $a = 1 + u$, with $p \mid u$. Over $\mathbb{Z}$, we can define $(1 + u)^b$ for positive $b \in \mathbb{Z}$ as

\[ \sum_k \binom{b}{k} u^k. \]

In fact, the same formula works for any $b \in \mathbb{Z}_p$ because $u^k \to 0$ so the series will converge. Let us
explain a bit more. We need the fundamental fact that $\mathbb{Z}$ and even $\mathbb{N}$ are dense in $\mathbb{Z}_p$.

⁴ $\mathbb{Z}_\infty$ is sometimes thought of as $[-1, 1]$ since for $p \neq \infty$ we have $\mathbb{Z}_p = \{x \in \mathbb{Q}_p : |x|_p \leq 1\}$, but it doesn't have
properties as nice as the other $\mathbb{Z}_p$.
⁵ The fact that its elements have such an explicit form in terms of $\mathbb{Q}$ is because $|\cdot|_p$ is non-Archimedean, i.e. satisfies
the strong triangle inequality (Proposition 8.2.1).
⁶ There are absolute values different from $|\cdot|_\infty$ and $|\cdot|_p$ like $|\cdot|_\infty^2$, but completing $\mathbb{Q}$ with respect to $|\cdot|_\infty^2$ gives (a
field isomorphic to) $\mathbb{R}$.
⁷ Technically, there are other local or global fields as well, but these are the only ones in characteristic 0.

Proposition 8.3.1

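$\mathbb{N}$ is dense in $\mathbb{Z}_p$, meaning that for any $a = (a_1, \ldots) \in \mathbb{Z}_p$ and any $\varepsilon > 0$, there is a $b \in \mathbb{N}$ such
that $|a - b|_p < \varepsilon$. Similarly, $\mathbb{Q}$ is dense in $\mathbb{Q}_p$.

Proof

Simply pick $b \equiv a_n \pmod{p^n}$ for some large $n$: we get $|a - b|_p \leq p^{-n}$. For $\mathbb{Q}_p$ it's Exercise 8.3.1∗.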


Exercise 8.3.1∗ . Prove that Q is dense in Qp .


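Here is what this implies. For a fixed $k$, denote by $\binom{\cdot}{k} : \mathbb{Z}_p \to \mathbb{Q}_p$ the function

\[ n \mapsto \binom{n}{k} = \frac{n(n-1) \cdots (n-(k-1))}{k!}. \]

This is a continuous function (Exercise 8.3.2∗) which satisfies $\left|\binom{n}{k}\right|_p \leq 1$ on $\mathbb{N}$. Since $\mathbb{N}$ is dense in
$\mathbb{Z}_p$, we in fact have $\left|\binom{n}{k}\right|_p \leq 1$ on $\mathbb{Z}_p$ so $\binom{\cdot}{k} : \mathbb{Z}_p \to \mathbb{Z}_p$ (Exercise 8.3.3∗). Finally, this means that, for
$|u|_p < 1$, the series

\[ (1 + u)^n = \sum_{k=0}^{\infty} \binom{n}{k} u^k \]

converges for any $n$ in $\mathbb{Z}_p$ by Proposition 8.2.2 as $\left|\binom{n}{k} u^k\right|_p \leq |u|_p^k \to 0$. In fact, this is the unique
continuous extension of $n \mapsto (1 + u)^n$ from $\mathbb{N}$ to $\mathbb{Z}_p$ as $\mathbb{N}$ is dense in $\mathbb{Z}_p$.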

Exercise 8.3.2∗ . Let f ∈ Qp [X] be a polynomial. Prove that f is continuous on Qp .

Exercise 8.3.3∗. Let $f : \mathbb{Z}_p \to \mathbb{Q}_p$ be a continuous function. If $|f(x)|_p \leq 1$ for any $x$ in a dense subset (of
$\mathbb{Z}_p$), prove that $|f(x)|_p \leq 1$ for any $x \in \mathbb{Z}_p$.

Finally, we get the following proposition.

Proposition 8.3.2*

Let $u$ be a p-adic number with $|u|_p < 1$. The function

\[ z \mapsto (1 + u)^z := \sum_{k=0}^{\infty} \binom{z}{k} u^k \]

is a continuous multiplicative function $\mathbb{Z}_p \to \mathbb{Z}_p$, i.e. for any $x, y \in \mathbb{Z}_p$ we have

\[ (1 + u)^x (1 + u)^y = (1 + u)^{x+y}. \]

Proof

It suffices to note that $(1 + u)^x (1 + u)^y = (1 + u)^{x+y}$ for any $x, y \in \mathbb{N}$, thus for any $x, y \in \mathbb{Z}_p$ by
density and continuity.
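
The binomial series can be evaluated numerically modulo $p^N$: when $p \mid u$ and the exponent $x$ is a rational number in $\mathbb{Z}_{(p)}$, the terms with $k \geq N$ are divisible by $p^N$, so it is enough to sum the first $N$ terms. Here is a minimal Python sketch under these assumptions (the names `binom`, `reduce_mod` and `power_series` are ours; Python 3.8 or later is assumed). It computes $(1 + 7)^{1/2}$ in $\mathbb{Z}_7$ modulo $7^5$.

```python
from fractions import Fraction

def binom(x, k):
    """Generalised binomial coefficient x(x-1)...(x-k+1)/k! for rational x."""
    result = Fraction(1)
    for j in range(k):
        result *= (x - j)
        result /= (j + 1)
    return result

def reduce_mod(r, modulus):
    """Reduce a rational r with denominator coprime with the modulus."""
    return r.numerator * pow(r.denominator, -1, modulus) % modulus

def power_series(u, x, p, precision):
    """(1 + u)^x modulo p**precision, assuming p | u and x in Z_(p).

    Under these assumptions every term with k >= precision vanishes
    modulo p**precision, so the truncated sum is enough.
    """
    total = sum(binom(x, k) * u ** k for k in range(precision))
    return reduce_mod(total, p ** precision)

p, precision = 7, 5
root = power_series(7, Fraction(1, 2), p, precision)  # a square root of 8 in Z_7
```

By multiplicativity, `root` squared must be congruent to 8 modulo $7^5$, and more generally the values at $x = 1/2$ and $x = 1/3$ multiply to the value at $x = 5/6$.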

Before presenting an application, let us present a philosophical remark about p-adic numbers taken
from Evan Chen [9]. Imagine you are given the following problem: estimate $\frac{1}{1^2} + \ldots + \frac{1}{10000^2}$ to within
0.001. This is a statement solely about rational numbers, but it is considerably easier to solve if one
knows about real numbers:

\[ \frac{1}{1^2} + \ldots + \frac{1}{10000^2} = \frac{\pi^2}{6} - \sum_{k=10001}^{\infty} \frac{1}{k^2} \]

and it is now very easy to estimate $\frac{\pi^2}{6}$ and $\sum_{k=10001}^{\infty} \frac{1}{k^2}$. Similarly, suppose you are given the following
problem.

Problem 8.3.1 (USA TST 2002 Problem 2)

Let $p > 5$ be a rational prime. Prove that the sum

\[ f_p(x) = \sum_{k=1}^{p-1} \frac{1}{(px + k)^2} \]

does not depend on $x \in \mathbb{Z}$ modulo $p^3$.

We wish to compute this sum modulo p3 , that is, estimate this p-adic sum S to a value s ∈ Q such
that |S − s|p ≤ p−3 . This is a statement about rational numbers, but it really helps to use p-adic
numbers to estimate it p-adically.

Solution

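We work in $\mathbb{Q}_p$. We have

\begin{align*}
\sum_{k=1}^{p-1} \frac{1}{(px + k)^2} &= \sum_{k=1}^{p-1} \frac{1}{k^2} \left(1 + \frac{px}{k}\right)^{-2} \\
&= \sum_{k=1}^{p-1} \frac{1}{k^2} \sum_{i=0}^{\infty} \binom{-2}{i} \left(\frac{px}{k}\right)^i \\
&\equiv \sum_{k=1}^{p-1} \frac{1}{k^2} \left( \binom{-2}{0} + \binom{-2}{1} \frac{px}{k} + \binom{-2}{2} \frac{p^2 x^2}{k^2} \right) \pmod{p^3} \\
&= \sum_{k=1}^{p-1} \frac{1}{k^2} - 2xp \sum_{k=1}^{p-1} \frac{1}{k^3} + 3x^2 p^2 \sum_{k=1}^{p-1} \frac{1}{k^4}.
\end{align*}

By Exercise 8.3.4∗, this is congruent to $\sum_{k=1}^{p-1} \frac{1}{k^2}$ modulo $p^3$, which proves the result. ∎

Exercise 8.3.4∗. Prove that, if $p > 5$ is a rational prime, $p^2 \mid \sum_{k=1}^{p-1} \frac{1}{k^3}$ and $p \mid \sum_{k=1}^{p-1} \frac{1}{k^4}$.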
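
The statement of Problem 8.3.1 is easy to test by brute force, which is a good sanity check on the computation above. The following Python sketch computes $f_p(x)$ exactly as a rational number, reduces it modulo $p^3$ (its denominator is coprime with $p$), and collects the residues obtained for several values of $x$. The function names are ours, and Python 3.8 or later is assumed.

```python
from fractions import Fraction

def f_p(p, x):
    """The sum of 1/(p*x + k)**2 for k = 1, ..., p - 1, as an exact fraction."""
    return sum(Fraction(1, (p * x + k) ** 2) for k in range(1, p))

def residue(r, modulus):
    """Reduce a rational r with denominator coprime with the modulus."""
    return r.numerator * pow(r.denominator, -1, modulus) % modulus

def residues_mod_p_cubed(p, xs):
    """Set of residues of f_p(x) modulo p**3 as x ranges over xs."""
    return {residue(f_p(p, x), p ** 3) for x in xs}

# According to the problem, each of these sets has exactly one element.
r7 = residues_mod_p_cubed(7, range(-3, 4))
r11 = residues_mod_p_cubed(11, range(-3, 4))
```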

Finally, we prove that x 7→ (1 + u)x can be expanded as a power series in x. This will be useful for
proving the Skolem-Mahler-Lech theorem in Section 8.5. In fact we prove the following more general
result.

Proposition 8.3.3

Let $(a_n)_{n \geq 0}$ be a sequence of p-adic numbers such that $a_n \to 0$. If $a_k/k! \to 0$, the function

\[ f(x) = \sum_{k=0}^{\infty} a_k \binom{x}{k} \]

defines a convergent power series on $\mathbb{Z}_p$.



We shall simply expand the binomial coefficients in terms of x and switch the double sums to get
a power series. For this, we need a lemma to switch double sums (of infinitely many terms), similar to
Proposition 8.2.2. Over R and C it’s usually tricky and not always true, but over Qp it’s very simple
like for Proposition 8.2.2.

Proposition 8.3.4 (Switching Double Sums)

Let $(a_{i,j})_{(i,j) \in \mathbb{N}^2}$ be a family of p-adic numbers. Suppose $a_{i,j} \to 0$ when $i + j \to \infty$ (meaning
that, for any $\varepsilon > 0$, there are finitely many pairs $(i, j)$ such that $|a_{i,j}|_p > \varepsilon$). Then,

\[ \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} a_{i,j} = \sum_{j=0}^{\infty} \sum_{i=0}^{\infty} a_{i,j} \]

(in particular, both series converge).

Exercise 8.3.5∗ . Prove Proposition 8.3.4.

Proof of Proposition 8.3.3

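Expand $k! \binom{x}{k} = x(x-1) \cdots (x-(k-1))$ as $\sum_i c_{i,k} x^i$, where $|c_{i,k}|_p \leq 1$ as $c_{i,k} \in \mathbb{Z}$, and $c_{i,k} = 0$ for $i > k$. By
Proposition 8.3.4, we get

\[ \sum_{k=0}^{\infty} a_k \binom{x}{k} = \sum_{i=0}^{\infty} x^i \sum_{k=0}^{\infty} c_{i,k} \frac{a_k}{k!} \]

as $|c_{i,k} a_k/k!|_p \leq |a_k/k!|_p \to 0$ when $i + k \to \infty$ (the non-zero terms have $i \leq k$, so $k \to \infty$ as well).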


Finally, to conclude that x 7→ (1 + u)x is a power series, by Proposition 8.3.3, we need to estimate
|k!|p to prove that we indeed have uk /k! → 0. This follows from the following proposition.

Proposition 8.3.5 (Legendre’s Formula)*

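Let $n \in \mathbb{N}$ and let $s_p(n)$ denote the sum of the digits of $n$ in base $p$. We have

\[ v_p(n!) = \sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor = \frac{n - s_p(n)}{p - 1}. \]

In particular, $v_p(n!) = \frac{n}{p-1} + o(n)$ and $|n!|_p = p^{-n/(p-1) + o(n)}$, where $o(n)$ denotes a quantity
such that $o(n)/n \to 0$ (here $-\frac{s_p(n)}{p-1}$ in the first formula and $+\frac{s_p(n)}{p-1}$ in the second).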

Remark 8.3.1
One might notice that for $u \in \mathbb{Q}_p$ and odd $p$, $|u|_p < p^{-1/(p-1)}$ is equivalent to $|u|_p < 1$ because the only
non-zero values $|u|_p \leq 1$ can take are $1, 1/p, 1/p^2, \ldots$ (for $p = 2$ the bound is $1/2$, so the condition
reads $|u|_2 \leq 1/4$). There is however a reason why we stated it that way:
it's because we can do algebraic number theory over $\mathbb{Q}_p$, and over extensions of $\mathbb{Q}_p$ we might have
$p^{-1/(p-1)} < |u|_p < 1$. (See ??.)

Proof

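The first equality is left as Exercise 8.3.6∗. For the second one, write $n = n_m p^m + \ldots + n_1 p + n_0$
the base $p$ expansion of $n$. Then,

\[ \left\lfloor \frac{n}{p^k} \right\rfloor = n_m p^{m-k} + \ldots + n_{k+1} p + n_k. \]

Thus,

\begin{align*}
v_p(n!) &= \sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor \\
&= \sum_{k=1}^{m} \sum_{i=k}^{m} n_i p^{i-k} \\
&= \sum_{i=0}^{m} \sum_{k=1}^{i} n_i p^{i-k} \\
&= \sum_{i=0}^{m} n_i \cdot \frac{p^i - 1}{p - 1} \\
&= \frac{n - s_p(n)}{p - 1}.
\end{align*}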

Exercise 8.3.6∗. Let $n \in \mathbb{N}$ be a positive rational integer and $p$ be a prime number. Prove that

\[ v_p(n!) = \sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor. \]
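
Both forms of Legendre's formula are easy to compare with a direct computation of $v_p(n!)$. Here is a small Python sketch doing so; the function names are ours and the code is only meant as a numerical check, not as a proof.

```python
from math import factorial

def vp_int(n, p):
    """p-adic valuation of a positive integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def digit_sum(n, p):
    """Sum s_p(n) of the digits of n in base p."""
    s = 0
    while n > 0:
        s += n % p
        n //= p
    return s

def legendre(n, p):
    """The sum of floor(n / p**k) over k >= 1."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total

def digit_formula(n, p):
    """The closed form (n - s_p(n)) / (p - 1), which is always an integer."""
    return (n - digit_sum(n, p)) // (p - 1)
```

For example, with $n = 10$ and $p = 2$ we get $5 + 2 + 1 = 8$ from the sum, and $(10 - 2)/1 = 8$ from the digits $1010_2$ of 10; indeed $10! = 2^8 \cdot 14175$.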

Corollary 8.3.1*

For any $|u|_p < p^{-1/(p-1)}$, the function $x \mapsto (1 + u)^x$ is a convergent power series on $\mathbb{Z}_p$.

Exercise 8.3.7∗ . Prove Corollary 8.3.1.

8.4 Analytic Functions


In this section, we discuss (p-adic) (locally and globally) analytic functions, i.e. functions given (locally
or globally) by power series.

Definition 8.4.1 (Local Analyticity)

Let $f : \mathbb{Z}_p \to \mathbb{Q}_p$ be a function. We say $f$ is locally analytic at $\alpha$ if $f(x)$ is given by a power
series in $x - \alpha$ around $\alpha$, i.e. there is an $\varepsilon > 0$ and numbers $a_n \to 0$ such that, for any $|x - \alpha|_p \leq \varepsilon$,

\[ f(x) = \sum_{i=0}^{\infty} a_i (x - \alpha)^i. \]

If $f$ is locally analytic everywhere (on $\mathbb{Z}_p$), we simply say it's locally analytic (on $\mathbb{Z}_p$).

Remark 8.4.1
Locally analytic functions are commonly referred to as just analytic functions.

Exercise 8.4.1∗ . Prove that locally analytic functions are continuous.

Exercise 8.4.2∗ . Prove that the sum and product of two locally analytic functions is again a locally analytic
function.

Exercise 8.4.3∗ . Prove that polynomials are locally analytic (everywhere).

It turns out that there is an extremely simple class of p-adic analytic functions: globally analytic
functions, i.e. functions given by a convergent power series $f(x) = \sum_{i=0}^{\infty} a_i x^i$ for some $a_i \to 0$. This
lets us prove that a function is locally analytic in a very simple way. Why do we care about analytic
functions? Let us explain a bit what we are trying to do.

Our goal is to prove the Skolem-Mahler-Lech theorem 8.5.1, which says that the zeros of a linear
recurrence $(a_n)_{n \in \mathbb{Z}}$ of algebraic numbers⁸ are a union of a finite set and some arithmetic progressions;
this was used in Section 7.4 for instance. How are we going to approach this theorem? There are two
main steps. For the sake of simplicity, we suppose $a_n = \sum_i f_i(n) \alpha_i^n$ where $f_i \in \mathbb{Z}[X]$ and $\alpha_i \in \mathbb{Z}$.

1. Transform $(a_n)_{n \in \mathbb{Z}}$ into (the restriction of) multiple p-adic analytic functions. $n \mapsto \alpha_i^n$ might
not directly define a p-adic analytic function with Corollary 8.3.1, but $n \mapsto \alpha_i^{(p-1)n}$ does since,
by Fermat's little theorem, $\alpha_i^{p-1} \equiv 1 \pmod p$ (when $p \nmid \alpha_i$). Hence $s_k = (a_{(p-1)m+k})_{m \in \mathbb{Z}}$ define $p - 1$ analytic
functions on $\mathbb{Z}_p$.

2. Show that a locally analytic function is either identically zero on $\mathbb{Z}_p$, or has finitely many zeros
in $\mathbb{Z}_p$ (and thus in $\mathbb{Z}$ too). This means that each $s_k$ is either always zero or has finitely many
zeros, which was what we wanted to show (the zeros of $(a_n)_{n \in \mathbb{Z}}$ are a union of a finite set and
arithmetic progressions of the form $((p - 1)m + k)_m$).

Thus, our goal in this section is to prove that $x \mapsto (1 + u)^x$ defines a locally analytic function, as
well as the second step: that a locally analytic function has finitely many zeros in $\mathbb{Z}_p$. This is in fact
not so surprising in hindsight: $\mathbb{Z}_p$ consists of those elements of $\mathbb{Q}_p$ which are "small" (have absolute
value at most 1). In fact, for small elements, the same results hold over $\mathbb{R}$ and $\mathbb{C}$: a locally analytic
function in $\mathbb{R}$ is either identically zero in $[-1, 1]$ or has finitely many zeros there, and the same goes
for $\mathbb{C}$ and the unit disk. What changes is that rational integers are small in $\mathbb{Q}_p$ but big in $\mathbb{R}$ and $\mathbb{C}$.

We now come back to what we first stated in this section: the fact that globally analytic functions
are locally analytic. We shall prove that any power series is locally analytic: this will imply that
$x \mapsto (1 + u)^x$ is locally analytic for $|u|_p < p^{-1/(p-1)}$, by Proposition 8.3.3. This means that, although
analyticity was defined as being a local property, it also follows from a global one (but the converse is
not true⁹).

Definition 8.4.2 (Globally Analytic Function)

A globally analytic function $f : \mathbb{Z}_p \to \mathbb{Q}_p$ is a function given everywhere (on $\mathbb{Z}_p$) by a convergent
power series around some $\alpha \in \mathbb{Z}_p$, i.e., there are numbers $a_n \to 0$ such that, for any $x \in \mathbb{Z}_p$,

\[ f(x) = \sum_{i=0}^{\infty} a_i (x - \alpha)^i. \]

⁸ Actually, it is also true for sequences in any field of characteristic zero, but we only prove it for sequences of algebraic
numbers. The general case is Exercise 8.7.29†.
⁹ For instance, if $p^{-1/(p-1)} < |u|_p < 1$, $(1 + u)^x$ is locally analytic since we can show that $(1 + u)^{p^n} \equiv 1 \pmod p$ for
some $n$, but not globally analytic. As we mentioned in Remark 8.3.1, finding such a $u$ requires knowledge of algebraic
extensions of $\mathbb{Q}_p$, see Exercise 8.7.13†.

Proposition 8.4.1

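Let $f$ be a globally analytic function on $\mathbb{Z}_p$. Then, $f$ is also locally analytic.

We will in fact prove that, for any $\alpha \in \mathbb{Z}_p$, $f$ is equal around $\alpha$ to its Taylor series at $\alpha$ ($f^{(n)}$ is the
$n$th (formal) derivative of $f$, where $\left(\sum_{i=0}^{\infty} a_i x^i\right)' = \sum_{i=0}^{\infty} i a_i x^{i-1}$):

Proposition 8.4.2 (Taylor Series)

Let $f(x) = \sum_{i=0}^{\infty} a_i x^i$ be a power series which converges everywhere on $\mathbb{Z}_p$. Then, for any $\alpha \in \mathbb{Z}_p$,

\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(\alpha)}{n!} (x - \alpha)^n \]

for any $x \in \mathbb{Z}_p$ (in particular this series converges for such $x$).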

As a corollary, the globally analytic functions are exactly the convergent power series. Actually,
the proof of this proposition is almost identical to the one of Proposition 5.3.1, using Proposition 8.3.4.

Exercise 8.4.4∗ . Prove Proposition 8.4.2.

To conclude this section, we prove that a locally analytic function has finitely many zeros on $\mathbb{Z}_p$
or is identically zero. For this, we need to prove that $\mathbb{Z}_p$ is sequentially compact, meaning that any
sequence $(a_n)_{n \geq 0}$ of p-adic integers has a convergent subsequence $(a_{\varphi(n)})_{n \geq 0}$ (over $\mathbb{R}$ or $\mathbb{C}$ this is known
as the Bolzano-Weierstrass theorem, see Exercise 8.7.10†).

Definition 8.4.3 (Sequential Compactness)

We say a set S of p-adic numbers is sequentially compact if any sequence (sn )n≥0 of elements of
S has a subsequence sϕ(n) → s ∈ S converging to an element of S.

Proposition 8.4.3 (Zp is Sequentially Compact)

Zp is sequentially compact.

Proof

Let $(a_n)_{n \geq 0}$ be a sequence of p-adic integers. By the pigeonhole principle, $(a_n)_n =: (a_{n,0})_n$ has an
infinite subsequence $(a_{n,1})_n$ which is constant modulo $p$. Now $(a_{n,1})_n$ has an infinite subsequence
constant modulo $p^2$, $(a_{n,2})_n$. Repeating this process, we get a chain of sequences

\[ (a_{n,0})_n \supseteq (a_{n,1})_n \supseteq \ldots \]

such that $(a_{n,k})_n$ is constant modulo $p^k$. Thus, the diagonal subsequence

\[ (a_{0,0}, a_{1,1}, a_{2,2}, \ldots) \]

(taking the diagonal ensures that the indices in the original sequence are strictly increasing)
converges as $a_{k+1,k+1} \equiv a_{k,k} \pmod{p^k}$, so $|a_{k+1,k+1} - a_{k,k}|_p \leq p^{-k}$.

Proposition 8.4.4 (Principle of Isolated Zeros)

Let $f$ be a non-zero locally analytic function. The zeros of $f$ are isolated, meaning that there is
no sequence of distinct zeros of $f$ which converges to another zero.

Proof

Let $\alpha$ be a zero of $f$, if there exists one. Write

\[ f(x) = \sum_{k=0}^{\infty} a_k (x - \alpha)^k \]

(using Proposition 8.4.2 we get that this converges for all $|x - \alpha|_p < r$, for some $r > 0$). Let $m$ be the smallest integer
such that $a_m \neq 0$; there exists one otherwise $f$ is identically zero. Define

\[ g(x) = \sum_{k=m}^{\infty} a_k (x - \alpha)^{k-m} \]

so that $f(x) = (x - \alpha)^m g(x)$. Since $g(\alpha) = a_m$ is non-zero, by continuity there is some $\varepsilon > 0$ such
that $g(x)$ is non-zero when $|x - \alpha|_p < \varepsilon$. Thus, $f(x) = g(x)(x - \alpha)^m$ is also non-zero there for $x \neq \alpha$, which
shows that $\alpha$ is isolated.


Corollary 8.4.1*

A non-zero locally analytic function on a sequentially compact set has finitely many zeros.

Proof

Suppose a locally analytic function $f$ had infinitely many zeros. Then there would be a convergent sequence
of distinct zeros by sequential compactness (Proposition 8.4.3 in the case of $\mathbb{Z}_p$). By continuity, it converges
to another zero $z$. This contradicts the fact that $z$ is isolated.


Note that the only property we have used here is the sequential compactness, so in particular this
result also holds over R and C.

8.5 The Skolem-Mahler-Lech Theorem


With all the hard work we did in the previous section, it is now straightforward to deduce the Skolem-
Mahler-Lech theorem.

Theorem 8.5.1 (Skolem-Mahler-Lech Theorem)

Let $(u_n)_{n \in \mathbb{Z}}$ be a linear recurrence of algebraic numbers. The zero set $Z((u_n)_n)$ of $(u_n)_n$, i.e. the
set of $n$ such that $u_n = 0$, is a union of a finite set and a finite number of arithmetic progressions:

\[ Z((u_n)_n) = S \cup \bigcup_{i=0}^{k} (a_i + b_i \mathbb{Z}) \]

where $S$ is a finite set and $a_i, b_i \in \mathbb{Z}$.
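
Here is a toy illustration of the shape of the zero set, as a Python sketch. We take the sequence $u_n = (n - 4)(1 + (-1)^n)$, which satisfies the linear recurrence $u_{n+4} = 2u_{n+2} - u_n$ (its characteristic polynomial is $(X^2 - 1)^2$), and we only look at indices $n \geq 0$. This example and the helper names are ours; the code merely lists the zeros among the first terms.

```python
def recurrence_terms(initial, coeffs, count):
    """First `count` terms of u_{n+d} = coeffs[0]*u_{n+d-1} + ... + coeffs[d-1]*u_n."""
    terms = list(initial)
    while len(terms) < count:
        # terms[-1 - j] is the term j steps before the most recent one
        terms.append(sum(c * terms[-1 - j] for j, c in enumerate(coeffs)))
    return terms[:count]

def zero_indices(terms):
    """Indices n at which the sequence vanishes."""
    return [n for n, u in enumerate(terms) if u == 0]

# u_0, ..., u_3 for u_n = (n - 4)(1 + (-1)^n), then u_{n+4} = 2 u_{n+2} - u_n.
terms = recurrence_terms([-8, 0, -4, 0], [0, 2, 0, -1], 40)
zeros = zero_indices(terms)
```

Among the indices $0 \leq n < 40$, the zeros are exactly the odd numbers together with $n = 4$: the finite set is $S = \{4\}$ and there is one arithmetic progression, $1 + 2\mathbb{Z}$.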



Remark 8.5.1
The Skolem-Mahler-Lech theorem is also valid for sequences in arbitrary fields of characteristic
zero. Skolem proved it for sequences of rational numbers, Mahler for algebraic numbers and Lech
for sequences in any field of characteristic zero. Thus, the above theorem could perhaps be called
the "Skolem-Mahler theorem".
Notice that this theorem is optimal: the sequence

\[ u_n = (n - s_1) \cdots (n - s_m) \cdot (\omega_1^{n - a_1} - 1) \cdots (\omega_k^{n - a_k} - 1), \]

where $\omega_i$ is a primitive $b_i$th root of unity, vanishes exactly on $\{s_1, \ldots, s_m\} \cup \bigcup_{i=1}^{k} (a_i + b_i \mathbb{Z})$.

Proof

Write $u_n = \sum_i f_i(n) \alpha_i^n$ where $f_i \in \overline{\mathbb{Q}}[X]$ and $\alpha_i \in \overline{\mathbb{Q}}$ by Theorem C.4.1. Note that we can
suppose without loss of generality that $(u_n)_{n \geq 0}$ takes rational values. Indeed, if $K$ is the field
generated by the $\alpha_i$ and the coefficients of the $f_i$ as well as all their conjugates, then, for any
$\sigma \in \operatorname{Gal}(K/\mathbb{Q})$, $u_n$ is zero iff

\[ \sigma(u_n) = \sum_i \sigma f_i(n) \sigma(\alpha_i)^n \]

is, so we can consider the norm $\prod_\sigma \sum_i \sigma f_i(n) \sigma(\alpha_i)^n$.

Choose a rational prime $p$ such that $\alpha_i$ and the coefficients of $f_i$ make sense in $\mathbb{F}_p$ for all $i$. This can
be done as follows: pick a non-zero $N \in \mathbb{Z}$ such that $N f_i \in \mathbb{Z}[X]$ and $N \alpha_i \in \mathbb{Z}$ and then define
$h$ as the lcm of the minimal polynomials of $N \alpha_i$ and the minimal polynomials of the (non-zero)
coefficients of $N f_i$. Then, choose a rational prime $p \nmid N$ such that $h$ splits in $\mathbb{F}_p$; there exists
such a prime by Theorem 6.4.1.

In addition, choose $p$ sufficiently large so that if $p \mid h(a)$ then $p \nmid h'(a)$; this can be done using
Bézout's lemma 5.4.1 as $h$ is squarefree so coprime with its derivative. Finally, we also want
the roots of $h$ in $\mathbb{F}_p$ to be non-zero; this is again true for sufficiently large $p$ as $h(0) \neq 0$ (since
$\alpha_i \neq 0$).

Thus, write $u_n = \sum_i g_i(n) \beta_i^n$ where $g_i \in \mathbb{Q}_p[X]$ and $\beta_i \in \mathbb{Q}_p$ by Theorem 8.1.1. Since $|\beta_i|_p = 1$ by
construction, we have $|\beta_i^{p-1} - 1|_p \leq 1/p$ by Fermat's little theorem.

Thus, the function $n \mapsto \beta_i^{(p-1)n} = (1 + (\beta_i^{p-1} - 1))^n$ is analytic by Corollary 8.3.1 and Proposition 8.4.1.
To conclude, for a fixed $r \in \mathbb{Z}/(p-1)\mathbb{Z}$, the function

\[ n \mapsto \sum_i g_i((p-1)n + r) \beta_i^{(p-1)n + r} = u_{(p-1)n + r} \]

is analytic on $\mathbb{Z}_p$, so is either identically zero or has finitely many zeros in $\mathbb{Z}_p$ and thus in $\mathbb{Z}$ by
Corollary 8.4.1.

Finally, put the zeros in the finite set when there are a finite number of them, and as an arithmetic
progression when the function is identically zero, and we are done.


Remark 8.5.2
One could wonder why the fact that $s \mapsto (\alpha^N)^s$ is analytic doesn't imply that $s \mapsto \alpha^s$ is as well, by
replacing $s$ by $s/N$. The problem is that this only gives us an analytic function $f$ which is equal
to $\alpha^n$ when $n$ is a rational integer divisible by $N$. More precisely, since $f(x + y) = f(x)f(y)$
for all $x, y \in \mathbb{Z}_p$, we know $f(1)$ is an $N$th root of $f(N) = \alpha^N$, but we don't know which one. As
we shall see very shortly, roots of unity are exactly the reason why some linear recurrences
can be zero infinitely many times without being identically zero.

Exercise 8.5.1∗ . Convince yourself of this proof.

Exercise 8.5.2∗ . Do you think this proof could be formulated without appealing to p-adic analysis?

Corollary 8.5.1

For any linear recurrence $(u_n)_{n \in \mathbb{Z}}$ of algebraic numbers, there are finitely many $\alpha \in \overline{\mathbb{Q}}$ such that
$(u_n)_n$ reaches $\alpha$ infinitely many times.

Proof

Write $u_n = \sum_i f_i(n) \alpha_i^n$. Our proof of Theorem 8.5.1 shows that the common difference depends
only on the number field $K$ generated by the coefficients of $f_i$ as well as the $\alpha_i$. Clearly, if $(u_n)_n$
reaches $\alpha$ then $\alpha \in K$.

This means that the common difference $d$ is the same for $(u_n)_n$ as well as the linear recurrence
$(u_n - \alpha)_n$, so if the latter vanishes infinitely many times then it vanishes on $d\mathbb{Z} + c$ for some $c$.
Thus, $(u_n)_n$ can take a value $\alpha$ infinitely many times only for at most $d$ values of $\alpha$: otherwise,
$(u_n - \alpha)_n$ and $(u_n - \beta)_n$ would vanish on the same $d\mathbb{Z} + c$ for some $\alpha \neq \beta$, which is impossible.


Here is a very nice corollary of the Skolem-Mahler-Lech theorem.

Corollary 8.5.2
Suppose $u_n = \sum_i f_i(n) \alpha_i^n$ is a linear recurrence of algebraic numbers which is zero infinitely
many times but not identically zero. Then, $\alpha_i/\alpha_j$ is a root of unity for some $i \neq j$.

Note that this is not a weak result at all: if the field $K$ generated by the $\alpha_i$ has exactly $N$ roots of
unity, then, for any fixed $m$, the sequence $(u_{Nn+m})_{n \in \mathbb{Z}}$ is a linear recurrence such that the quotient
\[ \alpha_i^N/\alpha_j^N = (\alpha_i/\alpha_j)^N \]
of two distinct roots of its characteristic polynomial is never a root of unity since $\omega^N = 1$ for any root
of unity $\omega \in K$. Thus, we can partition $(u_n)_{n \in \mathbb{Z}}$ into subsequences of the form $(u_{Nn+m})_{n \in \mathbb{Z}}$, and each
of these subsequences must either be always zero or finitely many times zero. If we are dealing with
sequences of integers, we can even combine this with Corollary 8.5.1 to get that each subsequence must
be constant or tend to infinity in absolute value.
Exercise 8.5.3∗ . Prove that any number field has a finite number N of roots of unity, and that ω N = 1 for
any root of unity ω of K. (In other words, the roots of unity of K are exactly the N th roots of unity.)

We need a lemma to prove this corollary, which was already used in the proof of Theorem C.4.1.

Lemma 8.5.1

Let $K$ be a field of characteristic zero. If

\[ u_n = \sum_{i=1}^{k} g_i(n) \beta_i^n = 0 \]

for any $n \in \mathbb{Z}$, where $g_i \in K[X]$ and $\beta_i \in K$ are both non-zero for all $i$, then $\beta_i = \beta_j$ for some
$i \neq j$.

Proof of Corollary 8.5.2 using the Lemma

If $u_n = \sum_i f_i(n) \alpha_i^n$ is infinitely many times zero for some non-zero $f_i \in \overline{\mathbb{Q}}[X]$ and non-zero
$\alpha_i \in \overline{\mathbb{Q}}$, then

\[ u_{r+sn} = \sum_i \alpha_i^r f_i(r + sn) \alpha_i^{sn} \]

is identically zero for some $s \neq 0$ and some $r$ by Theorem 8.5.1. Thus, by the lemma, we must have
$\alpha_i^s = \alpha_j^s$ for some $i \neq j$, which implies that $\alpha_i/\alpha_j$ is a root of unity as wanted.


Proof of the Lemma

We prove the contrapositive: if $\beta_1, \ldots, \beta_k$ are all distinct then $g_i = 0$ for all $i$. We proceed by
induction on $\sum_i \deg g_i$; the base case follows from the Vandermonde determinant C.3.2. For the
induction step, suppose $\deg g_1 \geq 1$ without loss of generality. Consider the sequence

\[ v_n = u_{n+1} - \beta_1 u_n = \sum_i (\beta_i g_i(n + 1) - \beta_1 g_i(n)) \beta_i^n. \]

Since $\deg(\beta_i g_i(X + 1) - \beta_1 g_i) \leq \deg g_i$ for $i \geq 1$ and $\deg(\beta_1 (g_1(X + 1) - g_1)) \leq \deg g_1 - 1$, by the
induction hypothesis we have $\beta_i g_i(X + 1) - \beta_1 g_i = 0$ for all $i$. This means that the $g_i$ are constant
(comparing leading coefficients gives $g_i = 0$ for $i \neq 1$ as $\beta_i \neq \beta_1$, and $g_1(X + 1) = g_1$),
but we have already treated this case so we are done.


Alternative Proof of the Lemma for K = Q, using Algebraic Number Theory

Here is an alternative proof, which in this case is less efficient than the first one but that we still
present because it is neat. Using an argument similar to Exercise 8.7.29†, one can also adapt
it to work over any characteristic zero field. Consider an $N$ such that $g_1(N), \ldots, g_k(N) \neq 0$.
Pick a large prime $p$ such that $g_i$ and $\beta_i$ make sense modulo $p$, using Theorem 6.4.1, and write
$u_n \equiv \sum_i g_i'(n) {\beta_i'}^n$ where $g_i' \in \mathbb{F}_p[X]$ and $\beta_i' \in \mathbb{F}_p$ (here the primes denote reductions modulo $p$, not
derivatives). By picking $p$ sufficiently large, suppose also that $p \nmid g_i'(N), \beta_i'$ for each $i$.

Since the order of $\beta_i'$ is coprime with $p$ (it divides $p^m - 1$ for some $m$), using CRT we can choose
an $M$ such that ${\beta_i'}^M = 1$ for each $i$ and $g_i'(M) = g_i'(N)$. Thus we have $\sum_i g_i'(N) {\beta_i'}^n = 0$ for all
$n$. By Vandermonde C.3.2 this implies

\[ \prod_{i \neq j} (\beta_i - \beta_j) \equiv \prod_{i \neq j} (\beta_i' - \beta_j') = 0. \]

Since this is true for infinitely many primes, the LHS is zero too, i.e. $\beta_i = \beta_j$ for some $i \neq j$.


Remark 8.5.3
Note that we again proved a global statement in a local way, although we only used finite fields
instead of p-adic numbers here. Results like Exercise 8.7.29† prove that we can even prove results
about C that seem analytic or algebraic in nature with number theory. In fact, the only known
proofs of Skolem-Mahler-Lech are p-adic in essence.

Remark 8.5.4
It is interesting to note that our proof of the Skolem-Mahler-Lech theorem is non-effective. We can find the common difference of the arithmetic progressions if the sequence has infinitely many zeros, although our proof is not excellent for this because we chose to work in Qp instead of finite extensions of Qp, and we can also bound the size of the additional finite set with Theorem 8.6.1, but we cannot decide if a linear recurrence has a zero or not. This defect is shared by all known proofs.

Remark 8.5.5
Our bound on the common difference of the arithmetic progressions is very weak because we do
not know how big the least prime p such that every αi exists in Fp is. Using finite extensions of Qp and
the theory of finite fields, one can get a way better bound, as we can now choose the smallest p
such that the algebraic numbers are non-zero in Fp (see ?? for how to extend the p-adic absolute
value to finite extensions).

8.6 Strassmann’s Theorem


Our previous method of showing an analytic function had finitely many zeros on Zp did not provide
any actual bound, so we will fix that here. With the bounds we get, in some situations we will be able
to determine all zeros of certain linear recurrences, and solve certain diophantine equations thanks to
them.

Theorem 8.6.1 (Strassmann’s Theorem)


Let $f(x) = \sum_{k=0}^{\infty} a_k x^k$ be a non-zero power series convergent on $\mathbb{Z}_p$, i.e. $a_k \to 0$. Suppose N is maximal such that $|a_N|_p = \max_n(|a_n|_p)$, i.e. $|a_N|_p > |a_n|_p$ for $n > N$, and $|a_N|_p \geq |a_n|_p$ for $n \leq N$. Then f has at most N zeros in $\mathbb{Z}_p$.

Notice that such an N always exists, for otherwise $a_k \not\to 0$.
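To get a feel for the bound, here is a minimal Python sketch that reads off N for a power series with finitely many non-zero integer coefficients. The helper names `vp` and `strassmann_bound` are ours (hypothetical, not from any library), and truncating to a finite list is an assumption made only to keep the example runnable.

```python
from math import inf

def vp(n: int, p: int) -> float:
    """p-adic valuation of an integer n (infinity for n = 0)."""
    if n == 0:
        return inf
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def strassmann_bound(coeffs: list[int], p: int) -> int:
    """Largest index N whose coefficient has maximal p-adic absolute value,
    i.e. minimal valuation. coeffs[k] plays the role of a_k."""
    vals = [vp(a, p) for a in coeffs]
    m = min(vals)
    if m == inf:
        raise ValueError("the zero series has no Strassmann bound")
    return max(k for k, v in enumerate(vals) if v == m)

# f(x) = 3 + x + 9x^2 over Z_3: the valuations are (1, 0, 2), so N = 1.
print(strassmann_bound([3, 1, 9], 3))
```

The largest absolute value corresponds to the smallest valuation, which is why the code takes a minimum and then the last index attaining it.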

Proof

We proceed by induction on N . When N = 0,

$$|f(x)|_p \leq \max_i(|a_i x^i|_p) = |a_0|_p$$

by the strong triangle inequality 8.2.1 since $|a_0|_p > |a_i|_p \geq |a_i x^i|_p$ for any $i > 0$. Moreover, the maximum is achieved only once so we have $|f(x)|_p = |a_0|_p$ for all $x \in \mathbb{Z}_p$ so f never vanishes (if $a_0 = 0$ then f = 0 since it's the maximum coefficient, which is impossible).

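Now, suppose $N \geq 1$ is maximal such that $|a_N|_p = \max_n(|a_n|_p)$. Suppose $\alpha \in \mathbb{Z}_p$ is a zero of f; if there is none we are already done. Write

$$\begin{aligned}
f(x) &= f(x) - f(\alpha) \\
&= \sum_{i=0}^{\infty} a_i x^i - \sum_{i=0}^{\infty} a_i \alpha^i \\
&= \sum_{i=0}^{\infty} a_i (x^i - \alpha^i) \\
&= (x - \alpha) \sum_{i=0}^{\infty} a_i \sum_{j=0}^{i-1} x^j \alpha^{i-j-1} \\
&= (x - \alpha) \sum_{j=0}^{\infty} x^j \sum_{i=j+1}^{\infty} a_i \alpha^{i-j-1}
\end{aligned}$$

using Proposition 8.3.4. Let

$$b_j = \sum_{i=j+1}^{\infty} a_i \alpha^{i-j-1}$$

and define

$$g(x) = \sum_{k=0}^{\infty} b_k x^k$$

so that $f(x) = (x - \alpha)g(x)$ for $x \in \mathbb{Z}_p$.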

We shall prove that $|b_{N-1}|_p > |b_n|_p$ for $n > N-1$, and $|b_{N-1}|_p \geq |b_n|_p$ for $n \leq N-1$; that way g will have at most N − 1 zeros by the induction hypothesis, so f at most N. Note that

$$b_{N-1} = a_N + \alpha a_{N+1} + \alpha^2 a_{N+2} + \ldots$$

so that $|b_{N-1}|_p = |a_N|_p$ by the strong triangle inequality. For $n \neq N-1$, we have

$$|b_n|_p = |a_{n+1} + \alpha a_{n+2} + \alpha^2 a_{n+3} + \ldots|_p \leq \max_{i>n}(|a_i \alpha^{i-n-1}|_p) \leq \max_{i>n}(|a_i|_p) \leq |a_N|_p = |b_{N-1}|_p$$

and this inequality is strict for $n > N-1$ since we then have $|a_i|_p < |a_N|_p$ for $i \geq n+1 > N$.


Here is an application from [7], very hard to solve by elementary means (which would be in essence
p-adic anyway).

Proposition 8.6.1 (Ramanujan, Nagell)

The positive integers n such that

$$x^2 + 7 = 2^n$$

has a solution in Z are n ∈ {3, 4, 5, 7, 15}.

Proof

First, let's analyse this equation in $\mathbb{Q}(\sqrt{-7})$. By Exercise 8.6.1, it is Euclidean so a UFD. Suppose (x, n) is a solution; clearly x is odd. We get

$$\frac{x + \sqrt{-7}}{2} \cdot \frac{x - \sqrt{-7}}{2} = 2^{n-2}$$

and the prime factorisation of 2 is $\frac{1+\sqrt{-7}}{2} \cdot \frac{1-\sqrt{-7}}{2}$. Let $\alpha = \frac{1+\sqrt{-7}}{2}$ and $\beta = \frac{1-\sqrt{-7}}{2}$. Since $\frac{1 \pm \sqrt{-7}}{2}$ isn't divisible by 2, we must have (Exercise 8.6.2)

$$\frac{x \pm \sqrt{-7}}{2} = \alpha^{n-2}.$$

This has a solution if and only if

$$\frac{\alpha^{n-2} - \beta^{n-2}}{\alpha - \beta} = \pm 1.$$

The LHS is a linear recurrence which we will denote by $(u_{n-2})_{n \geq 0}$.

Now, let's try to find a p-adic field where $\sqrt{-7}$ exists. Since $-7 \equiv 2^2 \pmod{11}$ we can work in $\mathbb{Q}_{11}$. By Hensel's lemma, there are two roots of $X^2 - X + 2$ (this is the characteristic polynomial of the sequence) which we will abusively call α and β again. One of the roots is congruent to 16 modulo $11^2$, say α, and the other one is $\beta = 1 - \alpha \equiv 106 \pmod{11^2}$.

Let r ∈ {0, 1, . . . , 9} be an integer. Since $a = \alpha^{10} - 1 \equiv 99 \pmod{11^2}$ and $b = \beta^{10} - 1 \equiv 77 \pmod{11^2}$ are divisible by 11, the functions $s \mapsto u_{r+10s}$ are analytic. Let's find out how many times they can be ±1. Expand $(\alpha - \beta)(u_{r+10s} \pm 1)$ as a power series in s:

$$\begin{aligned}
(\alpha - \beta)(u_{r+10s} \pm 1) &= \alpha^r (1+a)^s - \beta^r (1+b)^s \pm (\alpha - \beta) \\
&= \alpha^r \sum_k \binom{s}{k} a^k - \beta^r \sum_k \binom{s}{k} b^k \pm (\alpha - \beta) \\
&\equiv \alpha^r (1 + as) - \beta^r (1 + bs) \pm (\alpha - \beta) \pmod{11^2}.
\end{aligned}$$

An easy computation shows that the Strassmann bounds are N = 1 for r ∈ {1, 2, 5} and N = 0
for r ∈ {0, 4, 6, 7, 8, 9}. For r = 3, by expanding one more term, we find that the Strassmann
bound is N = 2. Since we have exactly this many solutions (r + 10s ∈ {1, 2, 3, 5, 13}), we are
done as they correspond to n ∈ {3, 4, 5, 7, 15}. (Technically we have to consider 20 functions
because of the ±1 sign, but it is easy to see that only the + sign works for r ∈ {1, 2}, only the
− sign works for r ∈ {3, 5}, and none of them do for other r.)
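The numerical claims of this proof are easy to sanity-check by computer. The following Python sketch (the function name `nagell_exponents` and the search bound 200 are our own choices) searches for exponents n with $2^n - 7$ a perfect square and re-checks the 11-adic congruences used above. A finite search is of course not a proof; it only confirms that nothing was miscomputed.

```python
from math import isqrt

def nagell_exponents(max_n: int) -> list[int]:
    """Exponents n <= max_n for which 2^n - 7 is a perfect square."""
    found = []
    for n in range(1, max_n + 1):
        m = 2**n - 7
        if m >= 0 and isqrt(m) ** 2 == m:
            found.append(n)
    return found

print(nagell_exponents(200))  # brute-force search only, not a proof

# 11-adic data used in the proof, checked modulo 11^2 = 121:
# 16 and 106 are roots of X^2 - X + 2, and alpha^10 - 1, beta^10 - 1 are 99 and 77.
alpha, beta = 16, 106
assert (alpha * alpha - alpha + 2) % 121 == 0
assert (beta * beta - beta + 2) % 121 == 0
print(pow(alpha, 10, 121) - 1, pow(beta, 10, 121) - 1)
```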



Exercise 8.6.1. Prove that $\mathbb{Q}(\sqrt{-7})$ is norm-Euclidean. (This is also Exercise 2.6.4†.)

Exercise 8.6.2. Prove that, if $x^2 + 7 = 2^n$, then $\frac{x \pm \sqrt{-7}}{2} = \left(\frac{1 + \sqrt{-7}}{2}\right)^{n-2}$ for some choice of ±.

Exercise 8.6.3∗. Compute the Strassmann bounds for the function $s \mapsto (\alpha - \beta)(u_{r+10s} \pm 1)$, for each r ∈ {0, 1, . . . , 9}. (If you do not want to do it all by hand, you may use a computer. In any case, it is better to do it to have a feel for why it works because it's very cool.)

Exercise 8.6.4. Prove that 3, 4, 5, 7, 15 are indeed solutions to the given equation. (You may use a computer
for n = 15.)

8.7 Exercises
Analysis
Exercise 8.7.1† (Vandermonde's Identity). Let x and y be p-adic integers. Prove that

$$\binom{x+y}{k} = \sum_{i+j=k,\ i,j \geq 0} \binom{x}{i}\binom{y}{j}$$

for any k.

Exercise 8.7.2† (Mahler's Theorem). Prove that a function f : Zp → Qp is continuous if and only if there exist $a_i \to 0$ such that

$$f(x) = \sum_{i=0}^{\infty} a_i \binom{x}{i}$$

for all x ∈ Zp. These $a_i$ are called the Mahler coefficients of f. Moreover, show that $\max_x(|f(x)|_p) = \max_i(|a_i|_p)$.

Exercise 8.7.3 (USA TST 2011). We say a sequence $(z_n)_{n \geq 0}$ is a p-pod if

$$v_p\left(\sum_{k=0}^{m} (-1)^k \binom{m}{k} z_k\right) \to \infty.$$

Prove that if $(a_n)_{n \geq 0}$ and $(b_n)_{n \geq 0}$ are p-pods then $(a_n b_n)_{n \geq 0}$ is too.

Exercise 8.7.4†. Prove that the following power series converge if and only if $|x|_p < 1$ and $|x|_p < p^{-1/(p-1)}$ respectively:

$$\log_p(1 + x) = \sum_{k=1}^{\infty} \frac{(-1)^{k-1} x^k}{k}, \qquad \exp_p(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$

In addition, prove that

1. $\exp_p(x + y) = \exp_p(x)\exp_p(y)$ for $|x|_p, |y|_p < p^{-1/(p-1)}$.

2. $\log_p(xy) = \log_p(x) + \log_p(y)$ for $|x - 1|_p, |y - 1|_p < 1$.

3. $\exp_p(\log_p(1 + x)) = 1 + x$ for $|x|_p < p^{-1/(p-1)}$.

4. $\log_p(\exp_p(x)) = x$ for $|x|_p < p^{-1/(p-1)}$.

Exercise 8.7.5†. Prove that

$$v_2\left(\sum_{k=1}^{n} \frac{2^k}{k}\right) \to \infty.$$
Exercise 8.7.6† (Mean Value Theorem). Let $f(x) = \sum_{i=0}^{\infty} a_i x^i$ be a p-adic power series converging for $|x|_p \leq 1$, i.e. $a_i \to 0$. Prove that

$$|f(t + h) - f(t)|_p \leq |h|_p \max_i(|a_i|_p)$$

for any $|t|_p \leq 1$ and $|h|_p \leq p^{-1/(p-1)}$.

Absolute Values
Exercise 8.7.7†. We say an absolute value | · | over a field K, i.e. a function | · | : K → R≥0 such that

• |x| = 0 ⇐⇒ x = 0

• |x + y| ≤ |x| + |y|

• |xy| = |x| · |y|

is non-Archimedean if |m| ≤ 1 for all m ∈ Z and Archimedean otherwise. Prove that | · |
is non-Archimedean if and only if it satisfies the strong triangular inequality |x + y| ≤ max(|x|, |y|)
for all x, y ∈ K. In addition, prove that, if | · | is non-Archimedean, we have |x + y| = max(|x|, |y|)
whenever |x| ≠ |y|.

Exercise 8.7.8† . Let K be a field and let | · | : K → R≥0 be a multiplicative function which is an
absolute value on Q. Suppose that | · | satisfies the modified triangular inequality |x + y| ≤ c(|x| + |y|)
for all x, y ∈ K, where c > 0 is some constant. Prove that it satisfies the triangular inequality.

Exercise 8.7.9† (Ostrowski's Theorem). Let | · | be an absolute value of Q. Prove that | · | is equal to
$|\cdot|_p^r$ for some prime p and some r > 0, or to $|\cdot|_\infty^r$ for some 0 < r ≤ 1, or is the trivial absolute value
$|\cdot|_0$ which is 0 at 0 and 1 everywhere else.

Exercise 8.7.10† (Bolzano-Weierstrass Theorem). Prove that a set $S \subseteq \mathbb{R}^n$ is sequentially compact if
and only if it is closed, meaning that any sequence of elements of S converging in $\mathbb{R}^n$ (for the Euclidean
distance) converges in S, and bounded.

Exercise 8.7.11† (Extremal Value Theorem). Let M be a metric space, i.e. a set with a distance
d : M × M → R≥0 such that d(x, y) = 0 iff x = y, d(x, y) = d(y, x) (commutativity) and d(x, y) ≤
d(x, z) + d(z, y) (triangle inequality) for any x, y, z ∈ M and let S be a sequentially compact subset of
M . Suppose f : S → R is a continuous function. Prove that f has a maximum and a minimum.

Exercise 8.7.12† (Equivalence of Norms). Let (K, | · |) be a complete valued field in characteristic
0, i.e. a field with an absolute value | · | which is complete¹⁰ for the distance induced by this absolute
value. A norm on a vector space V over K is a function ‖ · ‖ : V → R≥0 such that

• ‖x‖ = 0 ⇐⇒ x = 0

• ‖x + y‖ ≤ ‖x‖ + ‖y‖

• ‖ax‖ = |a|‖x‖

for all x, y ∈ V and a ∈ K. We say two norms ‖ · ‖₁ and ‖ · ‖₂ are equivalent if there are two
positive real numbers c1 and c2 such that ‖x‖₁ ≤ c1 ‖x‖₂ and ‖x‖₂ ≤ c2 ‖x‖₁ for all x ∈ V.¹¹ Prove
that any two norms are equivalent over a finite-dimensional K-vector space V . In addition, prove that
V is complete under the induced distance of any norm ‖ · ‖.

Exercise 8.7.13†. Let K = Qp be a local field¹², where p is a prime number or ∞, and let L be a
finite extension of K. Prove that there is only one absolute value of L extending | · |p on K, and that
it's given by $|\cdot|_p = |N_{L/K}(\cdot)|_p^{1/[L:K]}$.¹³ ¹⁴ ¹⁵

Exercise 8.7.14†. Let (K, | · |) be a complete valued field in characteristic 0 and let f ∈ K[X] be a
polynomial. Prove that f either has a root in K, or there is a real number c > 0 such that |f (x)| ≥ c
for all x ∈ K.

Exercise 8.7.15† (Ostrowski). Let (K, | · |) be a complete valued Archimedean field in characteristic
0¹⁶. Prove that it is isomorphic to (R, | · |∞ ) or (C, | · |∞ ).

Diophantine Equations
Exercise 8.7.16† (Brazilian Mathematical Olympiad 2010). Find all positive rational integers n and
x such that $3^n = 2x^2 + 1$.

Exercise 8.7.17 (Taiwan TST 2021). Find all triples of positive rational integers (x, y, z) such that

$$x^2 + 4^y = 5^z.$$

Exercise 8.7.18. Prove that the equation $x^3 + 11y^3 = 1$ has no non-trivial rational integer solutions.

Exercise 8.7.19†. Solve the diophantine equation $x^2 - y^3 = 1$ over Z.

Exercise 8.7.20† (Lebesgue). Solve the equation $x^2 + 1 = y^n$ over Z, where n ≥ 3 is an odd integer.

Exercise 8.7.21†. Solve the equation $x^2 + 1 = 2y^n$ over Z, where n ≥ 3 is an odd integer.


10 Recall that completeness means that all Cauchy sequences converge. A Cauchy sequence (un )n≥0 is a sequence such
that, for any ε > 0, there is an N such that |um − un | ≤ ε for all m, n ≥ N .
11 This means that they induce the same topology on V .
12 This result is true for any complete valued field (K, | · |), but it is harder to prove.
13 In particular, this absolute value is still non-Archimedean if it initially was. For instance, by Exercise 8.7.7† , if p is

prime, the extension of | · |p still satisfies the strong triangle inequality. In fact, this is the only interesting case since it’s
too hard to treat the case K = R separately.
14 Here is why this absolute value is intuitive: by symmetry between the conjugates, we should have $|\alpha|_p = |\beta|_p$ if α and β are conjugates. Taking the norm yields $|N_{K/\mathbb{Q}_p}(\alpha)|_p = |\alpha|_p^{[K:\mathbb{Q}_p]}$ as indicated.
15 One might be tempted to also define a p-adic valuation for elements of K as $v_p(\cdot) = -\log(|\cdot|_p)/\log(p)$, and this is also what we will do in some of the exercises. However, we warn the reader that, if $\alpha \in \overline{\mathbb{Z}}$ is an algebraic integer and $\alpha_p$ is a root of its minimal polynomial in $\overline{\mathbb{Q}}_p$, $v_p(\alpha_p) \geq 1$ does not mean anymore that p divides α in $\overline{\mathbb{Z}}$, it only means that p divides $\alpha_p$ in $\overline{\mathbb{Z}}_p := \{x \in \overline{\mathbb{Q}}_p \mid |x|_p \leq 1\}$.
16 In fact it is quite easy to show that char K = 0 follows from the assumption that | · | is Archimedean, but we add

this assumption for the convenience of the reader.



Linear Recurrences
Exercise 8.7.22†. Let $(u_n)_{n \geq 0}$ be a linear recurrence of rational integers given by $u_n = \sum_i f_i(n)\alpha_i^n$ such
that $\alpha_i/\alpha_j$ is not a root of unity for i ≠ j. If $u_n$ is not of the form $a\alpha^n$ for some a, α ∈ Z, prove that
there are infinitely many prime numbers p such that p | $u_n$ for some integer n ≥ 0.
Exercise 8.7.23†. Does there exist an unbounded linear recurrence $(u_n)_{n \geq 0}$ such that $u_n$ is prime
for all n?

Miscellaneous
Exercise 8.7.24† . Which roots of unity are in Qp ?
Exercise 8.7.25. Any p-adic number can be written uniquely in the following way: $a = \sum_{k > N} a_k p^k$
for some N ∈ Z and $a_k$ ∈ [p] (this amounts to choosing a system of representatives of Z/pZ). Prove that
a ∈ Q if and only if the sequence $(a_k)_k$ is eventually periodic.
Exercise 8.7.26 (ISL 2020). Find all functions f : Z>0 → Z≥0 such that f (xy) = f (x) + f (y) for
all integers x, y > 0 and for which there are infinitely many n ∈ N satisfying f (k) = f (n − k) for
every integer 0 < k < n.

Exercise 8.7.27† (China TST 2010). Let k ≥ 1 be a rational integer. Prove that, for sufficiently
large n, $\binom{n}{k}$ has at least k distinct prime factors.
Exercise 8.7.28†. Find all additive functions $f : \mathbb{Z}^{\mathbb{N}} \to \mathbb{Z}$, where addition is defined componentwise.
(To those who have read Section C.2, the fact that there is a nice characterisation of those functions
should come off as a surprise.)

Exercise 8.7.29† . Prove that the Skolem-Mahler-Lech theorem holds over any field of characteristic
zero.
Appendix A

Polynomials

Prerequisites for this chapter: none.

A.1 Fields and Polynomials

Definition A.1.1 (Field)

A field (K, +, ·) is a set K with at least two elements and with two operations + and ·, called
addition and multiplication. These operations have the following properties: they are associative
and commutative, they have an identity, every element has an inverse (except for 0, which doesn't
have a multiplicative inverse), and multiplication distributes over addition. We usually just say that K
is a field by abuse of terminology.

Here is what these terms mean: an operation $\dagger : K^2 \to K$ is associative if (a † b) † c = a † (b † c) for
any a, b, c ∈ K (that way we can write a † b † c without ambiguity).
It is commutative if a † b = b † a for any a, b ∈ K.
It has an identity e if a † e = e † a = a for any a ∈ K (this is denoted $0_K$ for addition and $1_K$ for
multiplication, but we usually drop the K when the context is clear).
a′ is an inverse of a for † if a † a′ = a′ † a = e (denoted −a for addition and $a^{-1}$ for multiplication).
· distributes over + if a(b + c) = ab + ac and (b + c)a = ba + ca, where ab denotes a · b.
Exercise A.1.1∗ . Let K be a field. Prove that 0K a = 0K for any a ∈ K.

Exercise A.1.2∗ . Let † be a binary (taking two arguments) associative operation on a set M . Suppose that
M has an identity. Prove that it is unique. Similarly, prove that, if an element g ∈ M has an inverse, then it
is unique.1

Remark A.1.1
Note that a field must have at least two elements (the additive and multiplicative identities must
be distinct), i.e. the trivial ring R = {0} is not a field. There are various reasons for this axiom,
akin to the convention that 1 isn’t prime, but perhaps the simplest one is that if it were a field we
would not have the uniqueness of dimension anymore since {0} and the empty set are both bases
of {0}. (This is unimportant for this appendix, see Appendix C for the definition of dimension.)

Here are some examples of fields: the familiar sets of rational numbers Q, of real numbers R and
of complex numbers C. We will define a variety of other fields throughout this book, but here is one
very important field: the field Fp of integers modulo p, where p is a prime. You can think of it
1 Such a structure is called a monoid.


as {0, 1, . . . , p − 1} with addition and multiplication modulo p. It differs greatly from the previous
fields for two reasons: because it is finite (we will study these fields in Chapter 4) and because it has
non-zero characteristic² (see Section A.2 for a definition if you're curious, but this is unimportant for
now). Why is it a field? Well, all axioms are obvious because they are true in Z so also in Z modulo
p, except the one about multiplicative inverses. But you already know that integers which are not
divisible by p have inverses modulo p since it’s prime.
We now define polynomials with coefficients in a field K.

Definition A.1.2 (Polynomials)

A polynomial f with coefficients in K is an object $f = a_n X^n + \ldots + a_1 X + a_0$ where $a_0, \ldots, a_n \in K$.
The greatest k such that $a_k \neq 0$ is called the degree deg f of f ; the degree of the zero polynomial
deg 0 is −∞. The set of polynomials with coefficients in K is denoted K[X].

The coefficient $a_{\deg f}$ is called the leading coefficient of f , and $a_0$ is the constant coefficient of
f . When the leading coefficient is 1, we say the polynomial is monic (the zero polynomial isn’t
monic).

Remark A.1.2
We can also consider similar objects but without the restriction that ak = 0 for sufficiently
large k. They are called formal power series. They are also very useful objects, but are not
considerably used in algebraic number theory so we do not consider them here (two exceptions:
see Theorem B.2.1 and Remark C.4.1). Another point to note is that, although one can obtain
many very interesting results by purely formal and algebraic considerations, we lose one advantage
of polynomials: we can not always evaluate them (since the resulting series might not converge,
or worse, we might not even have a topology to consider convergence). Thus, they demand a bit
more care if we want to do that. See Andreescu-Dospinescu [1] chapter 8 for an introduction to
the wonders of formal power series.
The sum and product of two polynomials are defined intuitively, I don’t think I have to explain
that. The formal object X will be called a "variable", even if that makes it seem like it’s not a formal
object.³ Polynomials in multiple variables are defined analogously as

$$\sum_{i_1, \ldots, i_m \geq 0} a_{i_1, \ldots, i_m} X_1^{i_1} \cdot \ldots \cdot X_m^{i_m}$$

where all but finitely many $a_{i_1, \ldots, i_m}$ are zero. The degree is now defined as the greatest value of
$i_1 + \ldots + i_m$ for non-zero $a_{i_1, \ldots, i_m}$.
Exercise A.1.3∗ . Prove that multiplication of polynomials is associative and commutative.
A polynomial is not a polynomial function! A polynomial is a purely formal object: for instance
the polynomial functions $x \mapsto x^p$ and $x \mapsto x$ are the same over the integers modulo p by Fermat's
little theorem, but the polynomials $X^p$ and $X$ are distinct. That said, we can still consider them as
polynomial functions when we want to (to evaluate polynomials at a point for instance), but it is also
important to be able to consider them only as polynomials (e.g. for Corollary A.1.1).
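The distinction can be made concrete with a small Python sketch. We represent a polynomial by its list of coefficients, lowest degree first (a representation chosen here for illustration), and take p = 5 as an arbitrary example.

```python
p = 5
# Coefficient lists, lowest degree first: X^p and X as formal polynomials.
x_to_p = [0] * p + [1]
x = [0, 1]

def evaluate(coeffs, t, p):
    """Evaluate a polynomial with coefficients mod p at t using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * t + c) % p
    return acc

# Same values at every point of F_p, yet different coefficient lists.
same_function = all(evaluate(x_to_p, t, p) == evaluate(x, t, p) for t in range(p))
same_polynomial = x_to_p == x
print(same_function, same_polynomial)  # True False
```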
Here is why fields are nice: they are precisely the structure that lets us define polynomials (and be
able to add them and multiply them nicely) as well as have a Euclidean division.

Proposition A.1.1 (Euclidean Division of Polynomials)

Let f, g ∈ K[X] be polynomials, with g ≠ 0. There exist unique polynomials q, r ∈ K[X] with
deg r < deg g such that f = gq + r.

2 This is a consequence of its finiteness, but it has important consequences too which explains why it is mentioned.
3 The technical term is "indeterminate" but I prefer using "variable".

Proof

Start with the uniqueness part. Suppose gq + r = f = gq′ + r′ with q ≠ q′. Then (q − q′)g = r′ − r, so
deg((q − q′)g) ≥ deg g > deg(r′ − r), which is impossible. Hence q = q′, and then r = r′.

We now proceed by induction on deg f to prove the existence, for a fixed g. If deg f < deg g,
we are already done: f = g · 0 + f . Otherwise, let a and b be the leading coefficients of f and g
respectively, which are non-zero since deg f ≥ deg g ≥ 0. The polynomial $f - ab^{-1}X^{\deg f - \deg g} g$
has degree less than deg f , so by the induction hypothesis there exist polynomials q and r such
that deg r < deg g and
$$f - ab^{-1}X^{\deg f - \deg g} g = gq + r.$$
Finally, this gives us
$$f = (q + ab^{-1}X^{\deg f - \deg g})g + r.$$
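The existence proof is really an algorithm: repeatedly cancel the leading term of f with a multiple of g. Here is a minimal Python sketch of it over Q, with polynomials stored as coefficient lists (lowest degree first). The names `degree` and `poly_divmod` are ours, and using −1 in place of −∞ for the degree of 0 is a convenience of this sketch.

```python
from fractions import Fraction

def degree(f):
    """Degree of a coefficient list (lowest degree first); -1 stands in for -infinity."""
    d = len(f) - 1
    while d >= 0 and f[d] == 0:
        d -= 1
    return d

def poly_divmod(f, g):
    """Return (q, r) with f = g*q + r and deg r < deg g, over Q."""
    dg = degree(g)
    if dg < 0:
        raise ZeroDivisionError("division by the zero polynomial")
    r = [Fraction(c) for c in f]
    q = [Fraction(0)] * max(degree(f) - dg + 1, 1)
    while degree(r) >= dg:
        dr = degree(r)
        c = r[dr] / Fraction(g[dg])      # this is a * b^{-1} in the proof
        q[dr - dg] += c
        for i in range(dg + 1):          # subtract c * X^(dr - dg) * g
            r[i + dr - dg] -= c * g[i]
    return q, r

# X^3 + X divided by X^2: quotient X, remainder X
q, r = poly_divmod([0, 1, 0, 1], [0, 0, 1])
print(q, r)
```

The division by the leading coefficient b is the only place where we need K to be a field rather than, say, Z.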


We now define divisibility of polynomials like we do in Z:

Definition A.1.3 (Divisibility of Polynomials)

We say a polynomial f ∈ K[X] divides a polynomial g ∈ K[X], and write f | g, if there exists a
polynomial h ∈ K[X] such that g = f h.

Note that repeated application of Euclidean division yields the Euclidean algorithm: given
two polynomials f, g ∈ K[X] with deg f > deg g and g ≠ 0, we iteratively replace the pair {f, g} by {g, r},
where r is the remainder of the division of f by g, until one of the two polynomials is 0. For instance,
$f = X^3 + X$ and $g = X^2$ yields

$$\{X^3 + X, X^2\} \to \{X^2, X\} \to \{X, 0\}.$$

This will, like in Z,⁴ eventually produce the pair {0, h} where h is the greatest common divisor (gcd)
of f and g, i.e. a polynomial which divides both f and g, and such that, if h′ | f, g then h′ | h (in
particular it is the common divisor with greatest degree, except when f = g = 0). Note that the gcd
is only defined up to multiplication by a non-zero constant, although we will usually assume it to be
monic.
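As an illustration, here is a short Python sketch of the Euclidean algorithm over Q. Polynomials are coefficient lists with the lowest degree first, and the empty list stands for the zero polynomial; these conventions and the names `trim`, `poly_rem` and `poly_gcd` are choices made for this sketch only.

```python
from fractions import Fraction

def trim(f):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def poly_rem(f, g):
    """Remainder of f divided by a non-zero g over Q."""
    r, g = trim(Fraction(c) for c in f), trim(Fraction(c) for c in g)
    while len(r) >= len(g):
        c, shift = r[-1] / g[-1], len(r) - len(g)
        for i, b in enumerate(g):
            r[i + shift] -= c * b
        r = trim(r)
    return r

def poly_gcd(f, g):
    """Monic gcd of f and g via {f, g} -> {g, f mod g}; [] is the zero polynomial."""
    f, g = trim(Fraction(c) for c in f), trim(Fraction(c) for c in g)
    while g:
        f, g = g, poly_rem(f, g)
    return [c / f[-1] for c in f] if f else []

# gcd(X^3 + X, X^2) = X, and gcd(X^2 - 1, X^2 - 2X + 1) = X - 1
print(poly_gcd([0, 1, 0, 1], [0, 0, 1]))
print(poly_gcd([-1, 0, 1], [1, -2, 1]))
```

Dividing by the leading coefficient at the end is what makes the result monic, in line with the convention above.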

Exercise A.1.4∗ . Prove that the gcd of 0 and 0 is 0.

Exercise A.1.5∗. Prove that the Euclidean algorithm produces the gcd. Deduce that the gcd of two polynomials in K[X] is also in K[X]. (As a consequence, the fundamental theorem of algebra Theorem A.1.1 implies
that two polynomials with rational coefficients are coprime in Q[X] if and only if they have no common complex
root.)

Exercise A.1.6∗ (Bézout’s Lemma). Consider two polynomials f, g ∈ K[X]. Prove that there exist polyno-
mials u, v ∈ K[X] such that uf + vg = gcd(f, g).

As another corollary of Proposition A.1.1, we get the following extremely fundamental fact.

Proposition A.1.2*

Let f ∈ K[X] be a polynomial. If f (α) = 0, then X − α | f .

4 The deep reason behind all these analogies with Z lies in Chapter 2: both Z and K[X] are Euclidean domains.

Proof

Let f = (X − α)q + r be the Euclidean division of f by X − α. Since deg X − α = 1, we have


deg r < 1 so r is constant. Notice that r = r(α) = f (α) = 0, which means f = (X − α)q, i.e.
X − α divides f .


Corollary A.1.1*

A polynomial f ∈ K[X] of degree n ≥ 0 has at most n roots in K.

Proof

Suppose for the sake of a contradiction that f had n + 1 roots α1 , . . . , αn+1 . Using Proposi-
tion A.1.2 repeatedly, we get f = (X − α1 )f1 , f1 = (X − α2 )f2 , ..., fn = (X − αn+1 )fn+1 so
that
f = (X − α1 ) · . . . · (X − αn+1 )fn+1 .
Since f is non-zero, the degree of fn+1 is non-negative so n = deg f = n + 1 + deg fn+1 ≥ n + 1
which is a contradiction.


Exercise A.1.7∗ . Let f ∈ K[X1 , . . . , Xn ] be a polynomial in n variables and suppose S1 , . . . , Sn ⊆ K are


subsets of K such that |Si | > degXi f . If f vanishes on S1 × . . . × Sn , prove that f = 0. (This is the
generalisation of Corollary A.1.1 to multivariate polynomials.)

Here is a non-trivial application of this.

Problem A.1.1

Let n ≥ 2 be a positive integer. What is the gcd of the numbers $1^n - 1, 2^n - 1, \ldots, n^n - 1$?

Solution

Let d be this gcd. Suppose p is a prime factor of d. If p ≤ n, then $p \mid p^n - 1$ which is impossible.
Thus p > n. Consider the polynomial

$$X^n - 1 - (X - 1) \cdot \ldots \cdot (X - n)$$

in $\mathbb{F}_p[X]$. It has degree at most n − 1 and n roots (in $\mathbb{F}_p$) by assumption, thus it is the zero
polynomial. Hence we have

$$X^n - 1 \equiv (X - 1) \cdot \ldots \cdot (X - n) \pmod{p}.$$

Expand the RHS and consider the coefficient of $X^{n-1}$: it is $-(1 + \ldots + n) = -\frac{n(n+1)}{2}$. On the
other hand, since n ≥ 2, the coefficient of $X^{n-1}$ of the LHS is 0. Thus

$$p \mid \frac{n(n+1)}{2}.$$

Since p > n, this means p = n + 1. Thus, if n + 1 is composite we are already done: the gcd is 1.
If n + 1 = p is prime, the gcd d is a power of p and we must find out what it is. Clearly, p is odd.

By Fermat's little theorem, $p \mid k^n - 1$ for k = 1, . . . , n so p | d. It remains to prove that $p^2 \nmid d$.
For this, suppose for the sake of a contradiction that $p^2 \mid (p-1)^{p-1} - 1$. Then,

$$p \mid \frac{(p-1)^{p-1} - 1}{(p-1)^2 - 1} = \sum_{k=0}^{\frac{p-1}{2} - 1} (p-1)^{2k} \equiv \frac{p-1}{2} \pmod{p}$$

which is a contradiction so we are done in this case too: d = n + 1 if it is prime and 1 otherwise.
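If you want to double-check the answer numerically, the following Python sketch compares the gcd with the formula for small n. The helper names `is_prime` and `power_gcd` and the range 2 to 39 are arbitrary choices of ours.

```python
from math import gcd, isqrt
from functools import reduce

def is_prime(m: int) -> bool:
    """Trial division, enough for the small values used here."""
    return m >= 2 and all(m % d for d in range(2, isqrt(m) + 1))

def power_gcd(n: int) -> int:
    """gcd of 1^n - 1, 2^n - 1, ..., n^n - 1."""
    return reduce(gcd, (k**n - 1 for k in range(1, n + 1)))

for n in range(2, 40):
    expected = n + 1 if is_prime(n + 1) else 1
    assert power_gcd(n) == expected
print("checked n = 2, ..., 39")
```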


In particular, notice that $X^{p-1} - 1 = (X - 1) \cdot \ldots \cdot (X - (p-1))$ in $\mathbb{F}_p$ which will be important for
Chapter 4.

Proposition A.1.2 motivates us to make the following definition.

Definition A.1.4 (Multiple Root)

We say α is a root of multiplicity m if $(X - \alpha)^m \mid f$ but $(X - \alpha)^{m+1} \nmid f$. The multiplicity of α
is denoted $v_\alpha(f)$.

Definition A.1.5 (Derivative)

The (formal) derivative of a polynomial $f = \sum_{i \geq 0} a_i X^i \in K[X]$ is $f' = \sum_{i \geq 1} i a_i X^{i-1}$. The nth
derivative of f is denoted $f^{(n)}$ ($f^{(0)} = f$).

Exercise A.1.8∗. Prove that $(fg)' = f'g + fg'$ and $(f + g)' = f' + g'$ for any f, g ∈ K[X]. Show also that
$(f^n)' = n f' f^{n-1}$ for any positive integer n, where $f^k$ denotes the kth power and not the kth iterate. More
generally, show that

$$\left(\prod_{i=1}^{n} f_i\right)' = \sum_{i=1}^{n} f_i' \prod_{j \neq i} f_j.$$

We can now give a criterion to compute the multiplicity of a root, using our notion of derivative.
This will however only work as long as the characteristic char K of K is greater than the multiplicity
of the root. Roughly speaking, the characteristic c ∈ N is the smallest positive number such that c = 0
in K if there exists one, and 0 otherwise. See Definition A.2.3 for a more rigorous definition. For
instance, the characteristic of Fp is p while the characteristic of Q is 0.

Proposition A.1.3 (Multiple Roots)*

Let f ∈ K[X] be a polynomial. If char K = 0, for any positive integer m and any α ∈ K, we
have $(X - \alpha)^m \mid f$ if and only if

$$f(\alpha) = f'(\alpha) = \ldots = f^{(m-1)}(\alpha) = 0.$$

Otherwise, this only holds for m < char K.

Here is how this theorem can fail when $v_\alpha(f) \geq c$: the derivative of $f = X^p$ over $\mathbb{F}_p$ is $pX^{p-1} = 0$,
so $f^{(k)}(0) = 0$ for all k ∈ N yet $X^p$ is clearly not divisible by $X^k$ for all k ∈ N.

Proof

We proceed by induction on m. The base case is Proposition A.1.2. We shall prove that, if
f (α) = 0, $v_\alpha(f') = v_\alpha(f) - 1$.

Let $m = v_\alpha(f) \geq 1$. Write $f = (X - \alpha)^m g$ where $X - \alpha \nmid g$. Then, by Exercise A.1.8∗,
$f' = m(X - \alpha)^{m-1} g + (X - \alpha)^m g'$ which is indeed divisible by $(X - \alpha)^{m-1}$ but not $(X - \alpha)^m$
as $X - \alpha \nmid g$. Here, we used the fact that m is non-zero because it is positive but less than the
characteristic.
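In characteristic 0, this criterion gives a direct way to compute a multiplicity: differentiate until the value at α stops vanishing. Below is a minimal Python sketch over Q (coefficient lists, lowest degree first); the function names are ours and the input polynomial is assumed to be non-zero.

```python
from fractions import Fraction

def derivative(f):
    """Formal derivative of a coefficient list (lowest degree first)."""
    return [i * c for i, c in enumerate(f)][1:]

def evaluate(f, a):
    """Value of f at a, by Horner's rule."""
    acc = Fraction(0)
    for c in reversed(f):
        acc = acc * a + c
    return acc

def multiplicity(f, a):
    """Number of consecutive derivatives f, f', f'', ... vanishing at a
    (characteristic 0 assumed; f must be non-zero)."""
    m = 0
    while f and evaluate(f, a) == 0:
        f = derivative(f)
        m += 1
    return m

# f = (X - 1)^2 (X + 2) = X^3 - 3X + 2
f = [2, -3, 0, 1]
print(multiplicity(f, 1), multiplicity(f, -2), multiplicity(f, 0))  # 2 1 0
```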


Using our notion of multiple roots, we get that if f has degree n, leading coefficient a, and roots
α1 , . . . , αn (not necessarily distinct, we count them with multiplicity) then (X − α1 ) · . . . · (X − αn ) | f
so that
f = (X − α1 ) · . . . · (X − αn )g

for some g which must be constant equal to a by looking at the degrees. We have factorised the
polynomial with its roots. The following proposition shows that we can recover the coefficients from
the roots using this factorisation.

Proposition A.1.4 (Vieta’s Formulas)*

Suppose $f = a_0 + \ldots + a_{n-1}X^{n-1} + X^n$ is a monic polynomial of degree n with roots $\alpha_1, \ldots, \alpha_n$
(counted with multiplicity). Then,

$$a_{n-k} = (-1)^k \sum_{i_1 < \ldots < i_k} \alpha_{i_1} \cdot \ldots \cdot \alpha_{i_k}$$

for any k = 1, . . . , n.

Proof

It is simply the expansion of f = (X − α1 ) · . . . · (X − αn ).
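To see the expansion at work, here is a small Python sketch that multiplies out the product of the factors (X − r) and compares each coefficient with the corresponding signed elementary symmetric sum. The roots chosen and the helper names are arbitrary and only serve as an illustration.

```python
from itertools import combinations
from math import prod

def expand_from_roots(roots):
    """Coefficients (lowest degree first) of the monic polynomial prod (X - r)."""
    coeffs = [1]
    for r in roots:
        # multiply the current polynomial by (X - r): new[i] = old[i-1] - r*old[i]
        coeffs = [(coeffs[i - 1] if i > 0 else 0)
                  - r * (coeffs[i] if i < len(coeffs) else 0)
                  for i in range(len(coeffs) + 1)]
    return coeffs

def elementary_symmetric(roots, k):
    """Sum of all products of k of the roots (positions i_1 < ... < i_k)."""
    return sum(prod(c) for c in combinations(roots, k))

roots = [2, -1, 3, 3]          # a repeated root is counted with multiplicity
a = expand_from_roots(roots)
n = len(roots)
for k in range(1, n + 1):
    assert a[n - k] == (-1) ** k * elementary_symmetric(roots, k)
print(a)
```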




In particular, $a_0 = (-1)^n \alpha_1 \cdot \ldots \cdot \alpha_n$ and $a_{n-1} = -(\alpha_1 + \ldots + \alpha_n)$. We have in fact already used a
special case of these formulas in Problem A.1.1. Here are two more applications of this, to show how
useful it is.

Corollary A.1.2 (Wilson’s Theorem)

For any prime p, (p − 1)! ≡ −1 (mod p).

Proof

We have already seen that 1, 2, . . . , p − 1 are exactly the roots of $X^{p-1} - 1$ in $\mathbb{F}_p$ by Fermat's
little theorem. Thus, their product is $(-1)^{p-1} \cdot (-1)$ by Vieta's formulas as −1 is the constant
coefficient of $X^{p-1} - 1$. This means that $(p-1)! \equiv (-1)^p \equiv -1 \pmod{p}$ as wanted (when p is
odd it's clear and when p = 2 we have 1 = −1).


Problem A.1.2 (APMO 2014 Problem 3)

Find all positive integers n such that for any integer k there exists an integer a for which $a^3 + a - k$
is divisible by n.

(Partial) Solution

This is equivalent to $x \mapsto x^3 + x$ being bijective modulo n. In particular, if it is bijective modulo
n it is bijective modulo any prime factor p | n. We will show that p = 3. This will imply that
n is a power of 3, and conversely all powers of 3 work but we have not established the tools to
prove this yet. It will be proven in Chapter 5 as a consequence of Hensel's lemma 5.3.1. Clearly
p = 2 doesn't work as $2 \mid 0^3 + 0, 1^3 + 1$ so p must be odd.

Thus, we restrict ourselves to the prime case. Suppose $x \mapsto x^3 + x$ is a permutation of $\mathbb{F}_p$. Then,

$$1 \cdot 2 \cdot \ldots \cdot (p-1) \equiv (1^3 + 1)(2^3 + 2) \cdot \ldots \cdot ((p-1)^3 + (p-1))$$

as $0^3 + 0 = 0$ so $x \mapsto x^3 + x$ must also be a permutation of $\mathbb{F}_p \setminus \{0\}$. After simplifying by $1 \cdot \ldots \cdot (p-1)$,
this is equivalent to
$$(1^2 + 1)(2^2 + 1) \cdot \ldots \cdot ((p-1)^2 + 1) \equiv 1.$$
But notice that, by Fermat's little theorem, the numbers of the form $a^2 + 1$ are the roots of the
polynomial $(X - 1)^{\frac{p-1}{2}} - 1$ whose constant term is $(-1)^{\frac{p-1}{2}} - 1$. Moreover, in our product, every
root is present exactly twice ($a^2 + 1$ and $(-a)^2 + 1$) so we get
$$(1^2 + 1)(2^2 + 1) \cdot \ldots \cdot ((p-1)^2 + 1) \equiv \left(\pm\left((-1)^{\frac{p-1}{2}} - 1\right)\right)^2 \pmod{p}$$
by Vieta's formulas. But $\left(\pm\left((-1)^{\frac{p-1}{2}} - 1\right)\right)^2 \in \{0, 4\}$ so for this to be congruent to 1 modulo p
we must have p = 3. It is also easy to check that $x \mapsto x^3 + x$ is a bijection of $\mathbb{F}_3$, hence we are
done (with the prime case).
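Since the converse is only proven in Chapter 5, it may be reassuring to test the full statement by brute force for small n. The Python sketch below (the function name and the bound 100 are our choices) lists the n for which $x \mapsto x^3 + x$ is bijective modulo n.

```python
def is_bijective_mod(n: int) -> bool:
    """True when x -> x^3 + x hits every residue class modulo n."""
    return len({(x**3 + x) % n for x in range(n)}) == n

print([n for n in range(1, 101) if is_bijective_mod(n)])  # [1, 3, 9, 27, 81]
```

Only the powers of 3 (including $3^0 = 1$) appear, as expected.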

Conversely, the following result, which is proven in Appendix B, shows that we can always achieve
such a factorisation over the complex numbers. Fields where this result holds are said to be algebraically
closed .

Theorem A.1.1 (Fundamental Theorem of Algebra)

Any polynomial f ∈ C[X] of degree n ≥ 0 has exactly n roots, i.e.,

f = a(X − α1 ) · . . . · (X − αn )

where α1 , . . . , αn are the roots of f counted with multiplicity and a is its leading coefficient.

Over the real numbers, we have the following result.

Proposition A.1.5

Any non-zero polynomial f ∈ R[X] factorises into a product

a(X − α1 ) · . . . · (X − αm )f1 · . . . · fk

where a ∈ R is its leading coefficient, αi ∈ R are its real roots, and the fi ∈ R[X] are monic
polynomials of degree 2 with no real roots.

In fact, we shall prove that non-real roots α come in pairs of conjugates $\alpha, \overline{\alpha}$; since $(X - \alpha)(X - \overline{\alpha})$
has real coefficients this will yield the result.

Proposition A.1.6*

Suppose f ∈ R[X]. Then, for any α ∈ C, $v_\alpha(f) = v_{\overline{\alpha}}(f)$.

Proof

We have $f = (X - \alpha)^m g$ for some g ∈ C[X] if and only if $f = (X - \overline{\alpha})^m \overline{g}$, where $\overline{g}$ denotes the
polynomial whose coefficients are the complex conjugates of those of g.


Proof of Proposition A.1.5

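Write $f = a(X - \alpha_1) \cdot \ldots \cdot (X - \alpha_m)(X - \beta_1)(X - \overline{\beta_1}) \cdot \ldots \cdot (X - \beta_k)(X - \overline{\beta_k})$ where $\alpha_i \in \mathbb{R}$
and $\beta_i \in \mathbb{C} \setminus \mathbb{R}$ using Proposition A.1.6. Then,

$$f = a(X - \alpha_1) \cdot \ldots \cdot (X - \alpha_m) f_1 \cdot \ldots \cdot f_k$$

where $f_i = (X - \beta_i)(X - \overline{\beta_i}) = X^2 - 2\Re(\beta_i)X + |\beta_i|^2 \in \mathbb{R}[X]$.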




The fact that some polynomials over R[X] cannot be decomposed into a product of linear polyno-
mials motivates us to make the following definition.

Definition A.1.6 (Irreducible Polynomial)

A non-zero polynomial f ∈ K[X] is said to be irreducible in K[X] if it can not be written as a


product of two polynomials of smaller degrees. We shall also say it is irreducible over K.

Notice that degree 2 and 3 polynomials are irreducible if and only if they don't have a root in K,
since if they can be written as a product of two polynomials of smaller degrees one of them must have
degree 1. For instance, $X^3 + 2$ is irreducible in Q[X] since it does not have a root there, but not in
R[X]. $X^2 + 1$ is irreducible in R[X], but not in C[X]. Irreducible polynomials over R and C are a bit
degenerate because they have degree 1 or 2, but over Q there are irreducible polynomials of arbitrarily
large degree.

Here is one last result, which is the only one of this section which will not be used extensively
throughout this book.

Theorem A.1.2 (Lagrange Interpolation)

For any distinct a1 , . . . , an+1 ∈ K and b1 , . . . , bn+1 ∈ K there is a unique polynomial f ∈ K[X]
of degree at most n such that f (ai ) = bi for i = 1, . . . , n + 1. It is given by
\[ f = \sum_{i=1}^{n+1} b_i f_i \quad \text{where} \quad f_i := \prod_{j \neq i} \frac{X - a_j}{a_i - a_j}. \]

Proof

First, let’s prove uniqueness. If f, g ∈ K[X] have degree at most n and f (ai ) = bi = g(ai ) for any
i = 1, . . . , n + 1 then f − g has n + 1 roots and has degree at most n so it is the zero polynomial,
i.e. f = g.

For existence, notice that fi (aj ) = 0 for any j ≠ i and fi (ai ) = 1, so that
\[ f(a_i) = \sum_{j} b_j f_j(a_i) = b_i \cdot 1 + \sum_{j \neq i} b_j \cdot 0 = b_i. \]
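The interpolation formula can be tried out numerically. The following is a minimal sketch over K = Q, assuming polynomials are stored as coefficient lists with the lowest degree first; the helper names `poly_mul`, `lagrange_interpolate` and `evaluate` are illustrative choices and not notation from this book.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def lagrange_interpolate(points):
    """Coefficients of the polynomial of degree at most n through the
    n + 1 points (a_i, b_i), where the a_i are assumed to be distinct."""
    result = [Fraction(0)] * len(points)
    for i, (a_i, b_i) in enumerate(points):
        # Build f_i = prod_{j != i} (X - a_j) / (a_i - a_j).
        f_i = [Fraction(1)]
        for j, (a_j, _) in enumerate(points):
            if j != i:
                denom = Fraction(a_i - a_j)
                f_i = poly_mul(f_i, [Fraction(-a_j) / denom, Fraction(1) / denom])
        # Add b_i * f_i to the running sum.
        for k, c in enumerate(f_i):
            result[k] += Fraction(b_i) * c
    return result

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * Fraction(x) ** k for k, c in enumerate(coeffs))

points = [(0, 1), (1, 3), (2, 11)]
coeffs = lagrange_interpolate(points)
print(coeffs)  # coefficients of 1 - X + 3X^2, as Fractions
```

Since every step only uses field operations on the ai and bi, the coefficients stay in the field containing the data, which is the content of the corollary that follows.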

Corollary A.1.3

Let K ⊆ L be two fields. If a polynomial f ∈ L[X] of degree n takes values in K at n + 1
distinct points of K, then it has coefficients in K.

Proof

If f (ai ) = bi with ai , bi ∈ K for i = 1, . . . , n + 1 and distinct ai , the Lagrange interpolation


formula shows that f ∈ K[X].


Exercise A.1.9∗ . Prove that every function f : Fp → Fp is polynomial.

To conclude this section, we make one final definition. Unlike what their name would suggest,
rational functions, like polynomials, are formal objects and not functions.

Definition A.1.7 (Rational Function)

A rational function with coefficients in K is a quotient of two polynomials f /g with coefficients
in K such that g ≠ 0 (with the additional rule that f /g = (hf )/(hg) for any non-zero h ∈ K[X]).
The set of rational functions with coefficients in K is denoted K(X).

The derivative of a rational function f /g is (f′g − g′f )/g², where g² = g · g.

Exercise A.1.10∗ . Prove that the derivative of a rational function does not depend on its form: i.e. (f /g)′ =
((hf )/(hg))′ for any f, g, h ∈ K[X] with g, h ≠ 0.

A.2 Algebraic Structures and Morphisms


We introduced the notion of a field in the last section; here, we shall define a few additional algebraic
structures. There are two things to understand from this section: what an integral domain is5 and
what morphisms and isomorphisms are. This doesn’t mean that the other definitions are useless, but
you can ignore them for now. They will be used in some chapters: when this happens the reader
should come to this appendix to refresh their memory.
5 Although usually in this book, when something is obviously an integral domain and we don’t want to emphasise this,
we will just call it a ring.



Definition A.2.1 (Ring)

We say a set R with two operations + and · from R² to R is a ring if the following axioms are
satisfied. We write ab for a · b.
1. + is associative: (a + b) + c = a + (b + c) for any a, b, c ∈ R.
2. + is commutative: a + b = b + a for any a, b ∈ R.

3. additive identity: there is an element 0R such that 0R + a = a for any a ∈ R.


4. additive inverse: for any a ∈ R there is an element −a ∈ R such that a + (−a) = 0R .
5. · is associative: (a · b) · c = a · (b · c) for any a, b, c ∈ R.

6. multiplicative identity: there is an element 1R such that 1R · a = a · 1R = a for any a ∈ R.


7. · distributes over +: for any a, b, c ∈ R, a(b + c) = ab + ac and (b + c)a = ba + ca.

Exercise A.2.1∗ . Prove that 1R and 0R are unique, and that any element has a unique additive inverse and
at most one multiplicative inverse.

Exercise A.2.2∗ . Let R be a ring. Prove that 0R a = a0R = 0R for any a ∈ R.

A ring is like a field, but possibly without multiplicative inverses, and with a
possibly non-commutative multiplication. Non-commutative rings will only be relevant in Chapter 2.
Again, R is technically not a ring, it is (R, +, ·) that is one, but by abuse of terminology we will say
that R is a ring when the addition and multiplication are obvious. We shall usually write 0 for 0R and
1 for 1R , even if they might not technically be our usual 0, 1 ∈ Z.

Definition A.2.2 (Z/nZ)

By Z/nZ, we denote the ring of integers modulo n, which has n elements when n ≥ 1. In particular, Z/0Z = Z.

Remark A.2.1
There is a deeper story behind this notation, see the footnote in Exercise A.3.14† .

Rings have a certain number associated to them which is what distinguishes rings like Z from rings
like Z/nZ. We have already encountered this notion in the case of fields in Proposition A.1.3.

Definition A.2.3 (Characteristic)

Let R be a ring. We say R has characteristic m if m is the smallest positive integer such that
\[ \underbrace{1 + \ldots + 1}_{m \text{ times}} = 0. \]

If no such m exists we say R has characteristic 0. The characteristic of R is denoted char R.

In other words, the characteristic of R is the smallest m ≥ 0 such that R contains a copy of Z/mZ.

Exercise A.2.3∗ . Prove that char R is the smallest m ≥ 0 such that R contains a copy of Z/mZ

Definition A.2.4 (Commutative Ring)

A commutative ring is a ring where multiplication is commutative.

We now define a field in terms of these objects, exactly like before, but hopefully this makes it
clearer how these objects are connected.

Definition A.2.5 (Field)

A field K is a commutative ring where non-zero elements have multiplicative inverses, i.e. for
any non-zero a ∈ K there is an element a⁻¹ ∈ K such that aa⁻¹ = 1K .

For fields, you should think about Q. We also have the analogous definition for non-commutative
fields.
Exercise A.2.4∗ . Prove that the characteristic of a field is either 0 or a prime number p.

Definition A.2.6 (Skew Field)

A skew field K is a field but where multiplication is not necessarily commutative, i.e. a ring with
multiplicative inverses: for any a ≠ 0 in K there is an element a⁻¹ ∈ K such that aa⁻¹ = a⁻¹a =
1K (we specify both equalities because multiplication is not necessarily commutative anymore).

Finally, we define the fundamental integral domains.

Definition A.2.7 (Domain)

A domain is a ring where the product of two non-zero elements is non-zero. A commutative
domain is called an integral domain.

For integral domains, you can again think about Z. An example of a commutative ring which isn’t
an integral domain is Z/4Z: 2 · 2 ≡ 0 but 2 ≢ 0. Z is really the typical example of an integral domain,
more than of a ring or commutative ring.
Exercise A.2.5. Let R be a finite integral domain (i.e. with finite cardinality). Prove that it is a field.

An important fact about integral domains is that they are precisely the subrings of fields, i.e. they
are the rings which can be embedded in a larger field. Why is this true? It’s obvious that a subring
of a field is an integral domain. For the converse, given an integral domain R you can construct its
field of fractions Frac R, exactly like how you construct Q from Z. You define formal objects a/b for
a, b ∈ R with b ≠ 0, then you say a/b = c/d if ad = bc, and you define addition and multiplication in the
obvious way; it is then easy to check that this yields a field. For instance, for R = K[X], this gives
Frac R = K(X).
Exercise A.2.6∗ . Prove that a subring of a field is an integral domain.

Exercise A.2.7. What goes wrong if you try to construct the field of fractions of a commutative ring which
isn’t a domain?

Since integral domains can be embedded in fields, polynomials with coefficients there retain most
of their properties, so we can also define polynomials with coefficients in an integral domain R. The
ring of such polynomials is denoted R[X], and the ring of rational functions with coefficients in R,
Frac R[X] is denoted R(X).
Exercise A.2.8∗ . Let R be an integral domain. Prove that R[X] is also one.

In fact, perhaps the most important property we lose when restricting ourselves to an integral domain
is that we cannot do the Euclidean division of any f by any g ≠ 0. Indeed, our proof of Proposition A.1.1
involved dividing by the leading coefficient of g, and it is true that we can’t have X = 2Xq + r for
some q ∈ Z[X] and r ∈ Z[X] of degree less than 1. However, this also means that there is one case
when we can make this Euclidean division: when g is monic.
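To illustrate why the monic case works over an integral domain such as Z, here is a small sketch of the division algorithm in which no division of coefficients ever takes place. The function name `divide_by_monic` and the list representation (lowest degree first) are assumptions made for this example only.

```python
def divide_by_monic(f, g):
    """Euclidean division of f by a monic g in Z[X].

    Polynomials are lists of integers, lowest degree first.
    Returns (q, r) with f = g*q + r and deg r < deg g.
    """
    if g[-1] != 1:
        raise ValueError("g must be monic")
    r = list(f)
    dg = len(g) - 1
    q = [0] * max(len(f) - dg, 1)
    # Kill the top coefficient of the remainder, from the highest degree down.
    for shift in range(len(f) - 1 - dg, -1, -1):
        c = r[shift + dg]            # current top coefficient of the remainder
        q[shift] = c
        for i, b in enumerate(g):
            r[shift + i] -= c * b    # subtract c * X^shift * g; only ring operations
    return q, r[:dg]

# (X^3 + 2X + 5) = (X^2 + 1) * X + (X + 5)
q, r = divide_by_monic([5, 2, 0, 1], [1, 0, 1])
print(q, r)  # [0, 1] [5, 1]
```

Calling it with g = 2X raises the error instead of returning a quotient, in line with the impossibility of X = 2Xq + r noted above.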

Finally, we explain what morphisms are. Imagine you have the two fields {0, 1} and {a, b} where
a, b are formal symbols. Multiplication and addition are defined as follows. For the former 0 + 0 = 0,
0 + 1 = 1 and 1 + 1 = 0 for addition, and 0 · 0 = 0, 0 · 1 = 0 and 1 · 1 = 1 for multiplication. For the latter,
it’s a + a = a, a + b = b and b + b = a for addition, and a · a = a, a · b = a and b · b = b.

These are defined exactly in the same way! Any reasonable person would want to conclude that
they are the same, that they are both equal to F2 . But they are not! Our definition of a field was very
clear: it is a triple of a set and two operations satisfying some axioms. Here the sets are different so
the triples are too, which means the fields are not the same.

Thus, we want to define formally what a "relabelling" of the elements is. This is exactly what an
isomorphism is (iso = same, morphism = shape). We will in fact not define them formally in general
because isomorphisms depend on the structure considered (a ring isomorphism and a group (to be
defined later) isomorphism are different), but here is what a field isomorphism is.

Definition A.2.8 (Field Isomorphism)

Let K and L be two fields. We say a function ϕ : K → L is an isomorphism if ϕ is additive,
multiplicative, sends 1 to 1, and is bijective, i.e.

ϕ(x + y) = ϕ(x) + ϕ(y)
ϕ(xy) = ϕ(x)ϕ(y)
ϕ(1) = 1

for any x, y ∈ K. If there exists such a ϕ, we say K and L are isomorphic and write K ≃ L.

A few words on this definition. The function ϕ is our "relabelling" of the elements: relabel x as
ϕ(x). The conditions on ϕ ensure that ϕ preserves the field structure, i.e. addition gets mapped
to addition, multiplication to multiplication, inverse to inverse, identity to identity. The reason why
we ask that ϕ(1) = 1 but not that ϕ(0) = 0 is that the latter follows from ϕ(0) + ϕ(0) = ϕ(0). However,
ϕ(1) = 1 does not follow from ϕ(1)ϕ(1) = ϕ(1): ϕ could be identically zero.

Similarly, ϕ(−x) = −ϕ(x) and ϕ(x⁻¹) = ϕ(x)⁻¹ follow from the additivity and multiplicativity
respectively. What we really want is for ϕ to respect every single aspect of the field structure, but
we have not written down all of these conditions in the definition of an isomorphism since they are
redundant (of course we also want ϕ to be bijective: this is what it means for the fields to be "the
same up to relabelling").
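For the two-element fields {0, 1} and {a, b} described earlier, the conditions of the definition can be checked by brute force. This is only a sketch: the operation tables are transcribed from the text, and the names `ADD_01`, `MUL_01`, `ADD_AB`, `MUL_AB` and `is_isomorphism` are made up for the illustration.

```python
from itertools import product

# Addition and multiplication tables of the two fields from the text.
ADD_01 = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
MUL_01 = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
ADD_AB = {("a", "a"): "a", ("a", "b"): "b", ("b", "a"): "b", ("b", "b"): "a"}
MUL_AB = {("a", "a"): "a", ("a", "b"): "a", ("b", "a"): "a", ("b", "b"): "b"}

def is_isomorphism(phi):
    """Check that phi: {0, 1} -> {a, b} is additive, multiplicative,
    sends 1 to the multiplicative identity b, and is bijective."""
    pairs = list(product([0, 1], repeat=2))
    additive = all(phi[ADD_01[x, y]] == ADD_AB[phi[x], phi[y]] for x, y in pairs)
    multiplicative = all(phi[MUL_01[x, y]] == MUL_AB[phi[x], phi[y]] for x, y in pairs)
    unital = phi[1] == "b"
    bijective = sorted(phi.values()) == ["a", "b"]
    return additive and multiplicative and unital and bijective

relabelling = {0: "a", 1: "b"}   # the relabelling suggested by the tables
swapped = {0: "b", 1: "a"}       # a bijection which does not respect the structure

print(is_isomorphism(relabelling))  # True
print(is_isomorphism(swapped))      # False
```

The second map is a bijection of the underlying sets, but it fails additivity, which shows that being an isomorphism is a condition on the structure and not only on the sets.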

In fact, all this talk about "conserving the structure" suggests that this might actually be an
important notion, and this is why we define morphisms as functions which preserve the structure (which
is implicit: technically we should say our previous ϕ is a field isomorphism, not just an isomorphism).

Definition A.2.9 (Field Morphism)

Let K and L be two fields. We say a function ϕ : K → L is a morphism if ϕ is additive,
multiplicative and sends 1 to 1, i.e.

ϕ(x + y) = ϕ(x) + ϕ(y)
ϕ(xy) = ϕ(x)ϕ(y)
ϕ(1) = 1

for any x, y ∈ K.

Morphisms and isomorphisms of commutative rings are defined in exactly the same way, because the
additional structure that comes from fields is the existence of multiplicative inverses, but we have seen
that the fact that ϕ sends inverses to inverses already follows from its multiplicativity (and the fact
that ϕ(1) = 1).

You might think that these notions of isomorphisms and morphisms are just pedantic details about
how to define objects formally. They are not. They suggest that objects which were initially defined
very differently might in fact be similar. For instance, morphisms of sets are just functions since sets
have no structure, and isomorphisms are bijective functions. Are bijections useless to study?

An example of two non-trivially isomorphic fields is given by Q(π), the field of rational functions in π,
and Q(e), the field of rational functions in e. This is not obvious, and in fact very hard to prove (we do
not do it in this book).6 A better example will be seen soon, but first we need to define groups for
this (they will be used in Chapter 6).

Definition A.2.10 (Group)

We say a set G with an operation † : G² → G is a group if † is associative, has an identity,
and each element has an inverse for †, i.e.
1. (a † b) † c = a † (b † c) for any a, b, c ∈ G.
2. there is an e ∈ G such that a † e = e † a = a for any a ∈ G.
3. for any a ∈ G there is an a⁻¹ ∈ G such that a † a⁻¹ = a⁻¹ † a = e.

Exercise A.2.9∗ . Prove that the identity e of a group G is unique, and that any a ∈ G has a unique inverse.
Moreover, prove that (xy)⁻¹ = y⁻¹x⁻¹.

The simplest example is the cyclic group with n elements (Z/nZ, +) where Z/nZ represents integers
modulo n. We say it’s cyclic because it’s generated by only one element: the elements of (Z/nZ, +)
have the form 1 + . . . + 1 for some number of ones. It is commutative or abelian, which means that
the operation is commutative.7

Note that if you consider rings as groups you must ignore their multiplicative structure, since groups
have only one operation (you can also consider the multiplicative group of units of a ring, i.e. the elements
which are invertible).

A slightly more elaborate example, yet still very important, is the symmetric group on n elements
Sn . This is the group of permutations of {1, . . . , n}, and the operation is composition.
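As a sanity check (not a proof), the group axioms can be verified exhaustively for a small symmetric group. The sketch below takes n = 3 and, for convenience, permutes {0, . . . , n − 1} instead of {1, . . . , n}; the names `compose` and `inverse` are illustrative.

```python
from itertools import permutations, product

n = 3
identity = tuple(range(n))
# A permutation s is stored as the tuple (s(0), ..., s(n-1)).
S = list(permutations(range(n)))

def compose(s, t):
    """Return s ∘ t, i.e. the permutation i -> s(t(i))."""
    return tuple(s[t[i]] for i in range(n))

def inverse(s):
    """Return the permutation undoing s."""
    inv = [0] * n
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)

closed = all(compose(s, t) in S for s, t in product(S, repeat=2))
associative = all(
    compose(compose(s, t), u) == compose(s, compose(t, u))
    for s, t, u in product(S, repeat=3)
)
has_identity = all(compose(s, identity) == s == compose(identity, s) for s in S)
has_inverses = all(compose(s, inverse(s)) == identity == compose(inverse(s), s) for s in S)
commutative = all(compose(s, t) == compose(t, s) for s, t in product(S, repeat=2))

print(len(S), closed, associative, has_identity, has_inverses, commutative)
```

The last flag comes out false for n = 3, so this group is an example of a non-abelian group.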
Exercise A.2.10∗ . Check that (Sn , ◦) is a group.
6 Well, actually, I’m exaggerating a bit here because I did not find a good example with fields: showing that Q(π) ≃
Q(e) amounts to showing that π and e are both transcendental (see Section 1.1 for a definition), and this is what’s really hard.
7 Personally, I was only convinced by this terminology when I realised saying "let L/K be a commutative extension"
seemed extremely awkward. (A field extension L/K is said to be abelian if its Galois group is, see Chapter 6.)

Morphisms of groups are extremely easy to define since groups have so little (yet so much!) struc-
ture: it’s simply a function which commutes with the operation: ϕ : G → H is a morphism from (G, †) to (H, ⋆) if
ϕ(a † b) = ϕ(a) ⋆ ϕ(b) for any a, b ∈ G.
Exercise A.2.11∗ . Prove that a morphism of groups from (G, †) to (H, ⋆) maps the identity of G to the
identity of H.

We now give an example of a non-trivial isomorphism (of groups). If p is a prime, the groups
((Z/pZ)× , ·) and (Z/(p − 1)Z, +) are isomorphic, where (Z/pZ)× denotes the integers mod p which are
coprime with p (so that inverses exist). Since they clearly have the same number of elements, this
is equivalent to (Z/pZ)× being cyclic, i.e. generated by one element, which we will call g. This g is
such that, for any a ∈ (Z/pZ)× , there is a k such that a = g^k . This is exactly the definition of a
primitive root! Thus, this isomorphism translates the fact that there is a primitive root modulo p,
which is certainly non-trivial! Here is another very interesting example: (Z/nZ, +) is isomorphic to
(Un , ·), where Un denotes the set of complex nth roots of unity. Indeed, the function k ↦ exp(2kiπ/n)
is clearly an isomorphism between the two.
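The first isomorphism can be made concrete for a small prime. The sketch below searches for a primitive root g modulo p = 7 by brute force and checks that k ↦ g^k is a bijective morphism from (Z/(p − 1)Z, +) to ((Z/pZ)× , ·). The helper names `multiplicative_order` and `primitive_root` are assumptions of this example, and the search is only practical for small p.

```python
def multiplicative_order(a, p):
    """Smallest k >= 1 with a^k ≡ 1 (mod p), for a coprime to p."""
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k

def primitive_root(p):
    """Smallest g in {1, ..., p-1} whose order modulo the prime p is p - 1."""
    for g in range(1, p):
        if multiplicative_order(g, p) == p - 1:
            return g

p = 7
g = primitive_root(p)
phi = {k: pow(g, k, p) for k in range(p - 1)}  # k -> g^k mod p

# Morphism: addition modulo p - 1 is sent to multiplication modulo p.
is_morphism = all(
    phi[(j + k) % (p - 1)] == phi[j] * phi[k] % p
    for j in range(p - 1) for k in range(p - 1)
)
# Bijection: every non-zero residue is hit exactly once.
is_bijective = sorted(phi.values()) == list(range(1, p))

print(g, phi, is_morphism, is_bijective)
```

That the search always succeeds for a prime p is exactly the non-trivial statement mentioned above; the code only observes it in one case.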

We will usually write our group operations multiplicatively, that is, we will write xy or x · y for x † y.
In this case, one should always write the inverse of y as y⁻¹ instead of 1/y, unless the group is already
known to be abelian. Indeed, if we were to write 1/y, would x/y mean xy⁻¹ or y⁻¹x? Sometimes the
additive notation will also be used; we shall then write x + y for xy and nx for x^n . We may also write
the identity e of G as 1 or 0, depending on whether we are using the multiplicative or additive notation.
In addition, when the operation is obvious, we will omit it. For instance, when we consider a ring as a
group (such as Z/nZ), the operation will always be addition. Indeed, it cannot be multiplication since
0 does not have an inverse. Conversely, if we write R× , the set of units (i.e. invertible elements) of R, this shall be
considered as a multiplicative group (it is not additive since 0 ∉ R× ).

Lastly, we define two important sets associated with a morphism.

Definition A.2.11 (Kernel)

The kernel ker ϕ of a morphism ϕ is the set of x such that ϕ(x) = 0. It measures how far ϕ is
from being injective.

Definition A.2.12 (Image)

The image im ϕ of a morphism ϕ is the set of y such that y = ϕ(x) for some x. It measures how
far ϕ is from being surjective.

Exercise A.2.12∗ . Prove that the kernel of a morphism (of rings or groups) is closed under addition.

Exercise A.2.13∗ . Prove that a morphism of groups is injective iff its kernel is trivial, i.e. consists of only
the identity.

As a final remark, in Appendix C, we will introduce another algebraic structure called vector spaces,
and we will define morphisms for vector spaces.

A.3 Exercises
Derivatives
Exercise A.3.1† . Let f, g ∈ K[X] be two polynomials. Prove that the derivative of f ◦ g is g′ · (f′ ◦ g).

Exercise A.3.2† . Let f ∈ K[X] be a non-constant polynomial. Prove that there are a finite number
of g, h ∈ K[X] such that g ◦ h = f , up to affine translation, meaning (g, h) ≡ (g(aX + b), (h − b)/a).

Exercise A.3.3. Let f ∈ R[X] be a polynomial. Suppose that f ◦ f is the square of a polynomial.
Prove that f also is the square of a polynomial.
Exercise A.3.4† (USA TST 2017). Let f, g ∈ R[X] be non-constant coprime polynomials. Prove
that there are at most three real numbers λ such that f + λg is the square of a polynomial.
Exercise A.3.5 (All-Russian Olympiad 2014). On a blackboard, we write (only) the polynomials
X³ − 3X² and X² − 4X and all real numbers c ∈ R. If the polynomials f and g are written on the
board, we can also write f ± g, f · g and f ◦ g. Is it possible to write a polynomial of the form X^n − 1?
Exercise A.3.6† (Discrete Derivative). Let f ∈ K[X] be a polynomial of degree n and leading
coefficient a. Define its discrete derivative as ∆f := f (X + 1) − f (X). Prove that, for any g ∈ K[X],
∆f = ∆g if and only if f − g is constant, and that ∆f is a polynomial of degree n − 1 with leading
coefficient an. Deduce the minimal degree of a monic polynomial
f ∈ Z[X] identically zero modulo m, for a given integer m ≥ 1.
Exercise A.3.7† . Let f : R → R be a function. Define its discrete derivative ∆f as x ↦ f (x+1)−f (x).
Prove that, for any integer n ≥ 0,
\[ \Delta^n f(x) = \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} f(x+k). \]

Exercise A.3.8† . Let m ≥ 0 be an integer. Prove that there is a polynomial fm ∈ Q[X] of degree
m + 1 such that
\[ \sum_{k=0}^{n} k^m = f_m(n) \]
for any n ∈ N.

Roots of Unity
Exercise A.3.9† (Root of Unity Filter). Let f = Σ_i a_i X^i ∈ K[X] be a polynomial, and suppose that
ω1 , . . . , ωn ∈ K are distinct nth roots of unity. Prove that
\[ \frac{f(\omega_1) + \ldots + f(\omega_n)}{n} = \sum_{n \mid k} a_k. \]

Deduce that, if K = C,
\[ \max_{|z|=1} |f(z)| \geq |f(0)|. \]

(You may assume the existence of a primitive nth root of unity ω, meaning that ω^k ≠ 1 for all 0 < k < n,
or, equivalently, that every nth root of unity is a power of ω. This will be proven in Chapter 3.)
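Before proving the filter identity, it may help to observe it numerically. The following sketch works over C with floating-point arithmetic, so equality is only tested up to a small tolerance; the polynomial, the value n = 3 and the function names are arbitrary choices for this illustration.

```python
import cmath

def evaluate(coeffs, z):
    """Evaluate the polynomial with coefficients a_0, a_1, ... at z."""
    return sum(a * z ** i for i, a in enumerate(coeffs))

def root_of_unity_average(coeffs, n):
    """Average of f over the n complex nth roots of unity."""
    roots = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]
    return sum(evaluate(coeffs, w) for w in roots) / n

coeffs = [1, 2, 3, 4, 5, 6, 7]   # f = 1 + 2X + ... + 7X^6
n = 3
# Sum of the coefficients a_k with n | k, here a_0 + a_3 + a_6.
filtered = sum(a for k, a in enumerate(coeffs) if k % n == 0)
average = root_of_unity_average(coeffs, n)

print(filtered, abs(average - filtered) < 1e-9)
```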
Exercise A.3.10† . Let f = Σ_i a_i X^i ∈ R[X] be a polynomial and ω1 , . . . , ωn ∈ C be distinct nth
roots of unity with n > deg f . Prove that
\[ \frac{|f(\omega_1)|^2 + \ldots + |f(\omega_n)|^2}{n} = \sum_{i} a_i^2. \]

Denote by S(f ) the sum of the squares of the coefficients of f . Deduce that S(f g) = S(f · X^{deg g} g(1/X))
for all f, g ∈ R[X]. (X^{deg g} g(1/X) is the polynomial obtained by reversing the coefficients of g.)
Exercise A.3.11† . Let k be an integer. Prove that Σ_{a∈Fp} a^k is 0 if p − 1 ∤ k and −1 otherwise.

Deduce that any non-constant polynomial f ∈ Fp [X] satisfying f (a) ∈ {0, 1} for all a ∈ Fp must have
degree at least p − 1.
Exercise A.3.12† . Let p ≠ 3 be a prime number. Suppose that a and b are integers such that
p | a² + ab + b². Prove that (a + b)^p ≡ a^p + b^p (mod p³).
Exercise A.3.13 (China TST 2018). Let k be an integer, p a prime number, and S the set of kth
powers of elements of Fp . Prove that, if 2 < |S| < p − 1, the elements of S are not an arithmetic
progression.

Group Theory
Exercise A.3.14† . Given a group G and a normal subgroup H ⊆ G, i.e. a subgroup such that

x+H −x=H

for any x ∈ G,8 we define the quotient G/H of G by H as G modulo H 9 , i.e. we say x ≡ y (mod H)
if x − y ∈ H.10 Prove that this is indeed a group, and that |G/H| = |G|/|H| for any such G, H with G finite.

Exercise A.3.15† (Isomorphism Theorems). Prove the following first, second, and third isomorphism
theorems.

1. Let ϕ : A → B be a morphism of groups. Then, A/ ker ϕ ≃ im ϕ. (In particular, ker ϕ is normal
in A and | im ϕ| · | ker ϕ| = |A|.)

2. Let H be a subgroup of a group G, and N a normal subgroup of G. Then, H/(H ∩ N ) ≃ HN/N .
(In particular, you need to show that this makes sense: HN is a group and H ∩ N is normal in
H.)

3. Let N ⊆ H be normal subgroups of a group G. Then, (G/N )/(H/N ) ≃ G/H.

Exercise A.3.16† . Let G be a finite group, ϕ : G → C× be a non-trivial group morphism (i.e. not
the constant function 1), where (C× , ·) is the group of non-zero complex numbers under multiplication.
Prove that Σ_{g∈G} ϕ(g) = 0.

Exercise A.3.17† (Lagrange’s Theorem). Let G be a group of cardinality n (also called the order of
G). Prove that g^n = e for all g ∈ G. In other words, the order of an element divides the order of the
group. More generally, prove that the order of a subgroup divides the order of the group.

Exercise A.3.18† (5/8 Theorem). Let G be a non-commutative finite group. Prove that the probability
\[ p(G) = \frac{|\{(x, y) \in G^2 \mid xy = yx\}|}{|G|^2} \]
that two elements commute is at most 5/8.

Exercise A.3.19† (Fundamental Theorem of Finitely Generated Abelian Groups). Let G be an
abelian group which is finitely generated, i.e., if we write its operation as +, there are g1 , . . . , gk ∈ G
such that any g ∈ G can be represented as n1 g1 + . . . + nk gk for integers ni ∈ Z. Prove that there
is a unique integer n ≥ 0 (called the rank of the group) and a unique sequence of integers
d1 | . . . | dm greater than 1 such that
(G, +) ≃ (Z^n × Z/d1 Z × . . . × Z/dm Z, +).

Exercise A.3.20† (Burnside’s Lemma). Let G be a finite group, S a finite set, and · a group action
of G on S, meaning a map · : G × S → S such that e · s = s and (gh) · s = g · (h · s) for any g, h ∈ G
and s ∈ S. Given a g ∈ G, denote by Fix(g) the set of elements of S fixed by g. Prove that
\[ |S/G| = \frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}(g)|, \]

where |S/G| denotes the number of (disjoint) orbits Oi = Gsi . Deduce the number of necklaces that
have p beads which can be of a colours, where p is a prime number and two necklaces are considered
to be the same up to rotation.
8 In particular, when G is abelian, any subgroup is normal.
9 This is where the notation Z/nZ comes from! In fact this shows that, in reality, we should say "modulo nZ" instead
of "modulo n".
10 A better formalism is to say that G/H is the set of cosets g + H for g ∈ G. In fact, we will almost always use this

definition in the solutions of exercises (since this is the only place where this will appear), but we introduced it that way
to make the analogy with Z/nZ clearer.

Miscellaneous
Exercise A.3.21† (China TST 2009). Prove that there exists a real number c > 0 such that, for any
prime number p, there are at most c · p^{2/3} positive integers n satisfying n! ≡ −1 (mod p).
Exercise A.3.22† (Mason-Stothers Theorem, ABC conjecture for polynomials). Suppose that A, B, C ∈
C[X] are coprime polynomials such that A + B = C. Prove that
1 + max(deg A, deg B, deg C) ≤ deg(rad ABC)
where rad ABC is the greatest squarefree divisor of ABC (in other words, deg(rad ABC) is the number
of distinct complex roots of ABC). Deduce that the Fermat equation f^n + g^n = h^n for f, g, h ∈ C[X]
does not have non-trivial solutions for n ≥ 3.
Exercise A.3.23† . Find all polynomials f ∈ C[X] which send the unit circle to itself.
Exercise A.3.24. Suppose that f, g ∈ C[X] are polynomials such that, for all x ∈ C, f (x) ∈ R implies
g(x) ∈ R. Prove that there exists a polynomial h ∈ R[X] such that g = h ◦ f .
Exercise A.3.25. Let (K, +, ·) be a set satisfying the axioms of a field except possibly that · takes
values in K. Prove that it is in fact a field.
Exercise A.3.26† (Gauss-Lucas Theorem). Let f ∈ C[X] be a polynomial with roots α1 , . . . , αk .
Prove that
\[ \frac{f'}{f} = \sum_{i=1}^{k} \frac{1}{X - \alpha_i}. \]
Deduce the Gauss-Lucas theorem: if f ∈ C[X] is non-constant, the roots of f′ are in the convex hull of
the roots of f , that is, any root β of f′ is a linear combination Σ_i λ_i α_i with Σ_i λ_i = 1 and non-negative
λi ∈ R.
Exercise A.3.27† (Sturm’s Theorem). Given a squarefree polynomial f ∈ R[X], define the sequence
f0 = f , f1 = f′ and fn+2 is minus the remainder of the Euclidean division of fn by fn+1 . Define also
V (ξ) as the number of sign changes in the sequence f0 (ξ), f1 (ξ), . . ., ignoring zeros. Prove that the
number of distinct real roots of f in the interval ]a, b] is V (a) − V (b).11
Exercise A.3.28† (Ehrenfeucht’s Criterion). Let K be a characteristic zero field, let f1 , . . . , fk ∈ K[X]
be polynomials and define
f = f1 (X1 ) + . . . + fk (Xk ) ∈ K[X1 , . . . , Xk ].
If k ≥ 3, prove that f is irreducible. In addition, prove that this result still holds if k = 2 and f1 and
f2 have coprime degrees.
Exercise A.3.29† (IMC 2007). Let a1 , . . . , an be integers. Suppose f : Z → Z is a function such that
\[ \sum_{i=1}^{n} f(k a_i + \ell) = 0 \]
for any k, ℓ ∈ Z. Prove that f is identically zero.


Exercise A.3.30. Find all polynomials f ∈ C[X] satisfying
1. f (X^n ) = f (X)^n for some integer n ≥ 2.
2. f (X² + 1) = f (X²) + 1.
3. f (X)f (X + 1) = f (X² + X + 1).
4. f (X²) = f (X)f (X + 1).
5. f (X²) = f (X)f (X − 1).
Exercise A.3.31. Let f ∈ K[X] be a polynomial of degree n. Find f (n + 1) if
1. (USAMO 1975) f (k) = k/(k + 1) for k = 0, . . . , n.

2. f (k) = 2^k for k = 0, . . . , n.
11 If we choose a = −∞, b = +∞, this gives an algorithm to compute the number of real roots of f , by looking at the
signs of the leading coefficients of f0 , f1 , . . ..


Appendix B

Symmetric Polynomials

Prerequisites for this chapter: Section A.1.

B.1 The Fundamental Theorem of Symmetric Polynomials


Given a commutative ring R (in our case we will consider Z and Q) and an integer n ≥ 0, we can
consider the symmetric polynomials in n variables with coefficients in R. These are defined as the
polynomials in n variables invariant under all permutations of these variables.

Definition B.1.1 (Symmetric Polynomials)

We say a polynomial f ∈ R[X1 , . . . , Xn ] is symmetric if f (X1 , . . . , Xn ) = f (Xσ(1) , . . . , Xσ(n) ) for


any permutation σ of [n].

Exercise B.1.1. Let f ∈ K(X1 , . . . , Xn ) be a rational function, where K is a field. Suppose f is symmetric,
i.e. invariant under permutations of X1 , . . . , Xn . Prove that f = g/h for some symmetric polynomials g, h ∈
K[X1 , . . . , Xn ].

As an example, f = X 2 Y + XY 2 + X 2 + Y 2 is a symmetric polynomial in two variables, and

g = X 2 Y Z + XY 2 Z + XY Z 2 + XY 2 + X 2 Y + XZ 2 + X 2 Z + Y Z 2 + Y 2 Z

is a symmetric polynomial in three variables.

Definition B.1.2 (Elementary Symmetric Polynomials)

The kth elementary symmetric polynomial for k ≥ 0, ek ∈ R[X1 , . . . , Xn ], is defined by
\[ e_k = \sum_{1 \leq i_1 < \ldots < i_k \leq n} X_{i_1} \cdot \ldots \cdot X_{i_k}. \]

Further, if k > n then ek = 0 (the empty sum) and if k = 0 then e0 = 1 (the sum of the empty
product).

The two-variable elementary symmetric polynomials are thus simply e1 = X + Y and e2 = XY . The
three-variable ones are e1 = X + Y + Z, e2 = XY + Y Z + ZX and e3 = XY Z.

We now state the fundamental theorem of symmetric polynomials.


Theorem B.1.1 (Fundamental Theorem of Symmetric Polynomials)

Suppose f ∈ R[X1 , . . . , Xn ] is a symmetric polynomial. Then f ∈ R[e1 , . . . , en ]. In other words,


there is a polynomial g ∈ R[X1 , . . . , Xn ] such that

f (X1 , . . . , Xn ) = g(e1 , . . . , en ).

This theorem explains why we called ek "elementary symmetric polynomials": because they gen-
erate all symmetric polynomials.

Proof

We proceed by induction on the number n of variables. For n = 1 this is trivial as e1 = X1 . Thus


suppose it is true for n − 1 variables.

Now, we proceed again by induction on the degree d of f . When f = 0 this is obvious. Otherwise,
by the first induction hypothesis,

f (X1 , . . . , Xn−1 , 0) = g(e′1 , . . . , e′n−1 )

for some g ∈ R[X1 , . . . , Xn−1 ], where e′i is the ith elementary symmetric polynomial in n − 1
variables, i.e. e′i (X1 , . . . , Xn−1 ) = ei (X1 , . . . , Xn−1 , 0). Notice that deg g ≤ deg f .

Write
f (X1 , . . . , Xn ) = g(e1 , . . . , en−1 ) + h(X1 , . . . , Xn ).
We have h(X1 , . . . , Xn−1 , 0) = 0, i.e.

Xn | h(X1 , . . . , Xn−1 , Xn )

(h has a root at Xn = 0 when viewed as a polynomial in Xn over the ring R[X1 , . . . , Xn−1 ]). Since h is
symmetric, being a difference of symmetric polynomials, we also have Xi | h for any i, so
h = X1 · . . . · Xn s = en s for some polynomial s.

To conclude, s is symmetric and has degree at most deg f − n < deg f , so s = r(e1 , . . . , en ) for some
polynomial r by the second induction hypothesis, and we get

f = g(e1 , . . . , en−1 ) + en r(e1 , . . . , en )

as wanted.


Exercise B.1.2. Prove that the decomposition of a symmetric polynomial f as g(e1 , . . . , en ) is unique.

B.2 Newton’s Formulas


Apart from elementary symmetric polynomials, there is another type of polynomial which is of
particular interest.

Definition B.2.1 (Power Sum Polynomials)

The kth power sum polynomial for k ≥ 0, pk ∈ R[X1 , . . . , Xn ], is defined by

pk = X1^k + . . . + Xn^k .

Here is how they relate to the elementary symmetric polynomials.



Theorem B.2.1 (Newton’s Formulas)

For any integer k ≥ 0, we have
\[ k e_k = \sum_{i=1}^{k} (-1)^{i-1} e_{k-i} p_i = e_{k-1} p_1 - e_{k-2} p_2 + \ldots + (-1)^{k-1} p_k. \]

The main importance of these formulas is that they let us recover the ei from the pi by induction,
as the RHS only has ei for i < k. This is expressed in the following corollary.

Corollary B.2.1*

Let K be a field of characteristic zero. Then, K(e1 , . . . , en ) = K(p1 , . . . , pn ). In particular, if

p1 (x1 , . . . , xn ), . . . , pn (x1 , . . . , xn )

all lie in K, then so do


e1 (x1 , . . . , xn ), . . . , en (x1 , . . . , xn ).

Remark B.2.1
This still holds in characteristic p as long as n < p, so that the k in kek is non-zero.
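
The recursion contained in Newton’s formulas can be sketched as follows: starting from the values of p1 , . . . , pn at some point of Q^n, it recovers e1 , . . . , en one at a time by dividing by k, which is where characteristic zero is used. The function names and the test point (1, 2, 3) are assumptions of this example, not notation from the text.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def power_sums(xs, n):
    """Return [p_1, ..., p_n] evaluated at the point xs."""
    return [sum(Fraction(x) ** k for x in xs) for k in range(1, n + 1)]

def elementary_from_power_sums(p):
    """Given p = [p_1, ..., p_n], return [e_0, e_1, ..., e_n] using
    k*e_k = sum_{i=1}^{k} (-1)^(i-1) * e_(k-i) * p_i."""
    e = [Fraction(1)]  # e_0 = 1
    for k in range(1, len(p) + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1))
        e.append(s / k)  # division by k: fine in characteristic zero
    return e

def elementary_direct(xs, k):
    """e_k computed from its definition as a sum over k-element subsets."""
    return sum(prod(c) for c in combinations(xs, k))

xs = [1, 2, 3]
recovered = elementary_from_power_sums(power_sums(xs, len(xs)))
direct = [elementary_direct(xs, k) for k in range(len(xs) + 1)]
print(recovered, direct)
```

Both lists describe the same values, namely e0 = 1, e1 = 6, e2 = 11 and e3 = 6 at the point (1, 2, 3).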

Exercise B.2.1∗ . Prove Corollary B.2.1.

Elementary Proof of Newton’s Formulas

Let’s compute the product ek−i pi . ek−i is the sum of products of k − i distinct variables, while
pi is the sum of the ith powers of all the variables. Thus, ek−i pi is the sum of the products of the
ith power of some variable times k − i distinct variables.

In the product of the k − i distinct variables, either one of these variables will be the same as the
one raised to the ith power, or not. Hence, we get ek−i pi = r(i) + r(i + 1) where r(j) is the sum
of products of k − j distinct variables times one other variable raised to the jth power:
\[ r(j) = \sum_{i_1 < \ldots < i_{k-j}} \; \sum_{\ell \neq i_1, \ldots, i_{k-j}} X_{i_1} \cdot \ldots \cdot X_{i_{k-j}} \cdot X_\ell^{j}. \]

Thus,
\[ \sum_{i=1}^{k} (-1)^{i-1} e_{k-i} p_i = (r(1)+r(2)) - (r(2)+r(3)) + \ldots + (-1)^{k-1} (r(k)+r(k+1)) = r(1) + (-1)^{k-1} r(k+1). \]

To conclude, r(1) = kek and r(k + 1) = 0 (there is no sum over −1 variables, or at least none which
is a homogeneous polynomial of degree k).


Proof of Newton’s Formulas using Generating Functions

We now present a proof by generating functions. We work in R[X1 , . . . , Xk ][[T ]], i.e. the ring of
B.3. THE FUNDAMENTAL THEOREM OF ALGEBRA 163

formal power series in T with coefficients in R[X1 , . . . , Xn ]. We have


n
X
(1 − X1 T ) · . . . · (1 − Xn T ) = (−1)k ek T k
k=0

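so, by differentiating and multiplying by $T$, we get

\[
\begin{aligned}
\sum_{k=0}^{n} (-1)^k k e_k T^k &= T \sum_{i=1}^{n} (-X_i) \prod_{j \ne i} (1 - X_j T) \\
&= -\left( \sum_{i=1}^{n} \frac{X_i T}{1 - X_i T} \right) \prod_{j=1}^{n} (1 - X_j T) \\
&= -\left( \sum_{i=1}^{n} \sum_{j=1}^{\infty} (X_i T)^j \right) \left( \sum_{i=0}^{n} (-1)^i e_i T^i \right) \\
&= \left( \sum_{i=1}^{\infty} p_i T^i \right) \left( \sum_{i=0}^{n} (-1)^{i-1} e_i T^i \right).
\end{aligned}
\]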

By comparing the $T^k$ coefficients, we get

\[ (-1)^k k e_k = \sum_{i=1}^{k} (-1)^{k-i-1} p_i e_{k-i} \]

as wanted.


B.3 The Fundamental Theorem of Algebra


Recall the statement of the fundamental theorem of algebra, whose name was coined at a time when
algebra was about solving polynomial equations.1 In reality, it is a theorem about analysis and is
usually proven that way. However, in this section we will present a mostly algebraic proof (a completely
algebraic proof is impossible because R is defined analytically).

Theorem B.3.1 (Fundamental Theorem of Algebra)

Any polynomial f ∈ C[X] of degree n ≥ 0 has exactly n roots, i.e.,

f = a(X − α1 ) · . . . (X − αn )

where α1 , . . . , αn are the roots of f counted with multiplicity and a is its leading coefficient.

By Proposition A.1.2, it suffices to show that any non-constant polynomial f ∈ C[X] has a root in
C. This is also called the d’Alembert-Gauss theorem because d’Alembert was the first one to recognise
the importance of proving this result but gave a flawed proof, while Gauss was (almost) the first one
to give a rigorous proof (in fact he even gave multiple proofs).

1 Now it means abstract algebra and linear algebra, see Section A.2 and Appendix C (the section on abstract algebra does not do justice to the subject, since it was only about setting up useful definitions for this book).
164 APPENDIX B. SYMMETRIC POLYNOMIALS

Theorem B.3.2 (d’Alembert-Gauss Theorem)

Any non-constant polynomial with complex coefficients has a complex root.

Notice that we can assume the polynomial $f$ has real coefficients, since if $f$ has complex coefficients, $g = f\overline{f}$ has real coefficients, where $\overline{f}$ denotes the polynomial obtained by applying complex conjugation to the coefficients of $f$. Thus, if we find a root $\alpha$ of $g$, either $\alpha \in \mathbb{R}$, in which case we can use induction on $\frac{g}{X - \alpha}$, or we know by Proposition A.1.6 that $\overline{\alpha}$ is also a root and we can use induction on $\frac{g}{(X - \alpha)(X - \overline{\alpha})}$.
This would prove that $g$ has as many complex roots as its degree, and thus $f$ too.
We shall only use two results. The first one is a corollary of the intermediate value theorem, while
the second one is left as an exercise.

Proposition B.3.1

Any polynomial f ∈ R[X] of odd degree has a real root.

Proof

Since f has odd degree, f (−∞) and f (+∞) have opposite signs so there is a root by the inter-
mediate value theorem. (Here we work in R ∪ {+∞, −∞}, which is just a nice shortcut for not
writing limits.)


Proposition B.3.2

Any polynomial f ∈ C[X] of degree 2 has a root in C.

Exercise B.3.1∗ . Prove Proposition B.3.2.

Proof that any non-constant polynomial with real coefficients has a complex root

Let $f \in \mathbb{R}[X]$ be a polynomial of degree $n$. We proceed by induction on $k = v_2(n)$; the case $k = 0$ is Proposition B.3.1. For the induction step, suppose $k \ge 1$ so that $n$ is even.
Let $\alpha_1, \ldots, \alpha_n$ be the roots of $f$. You might wonder what that means since we don’t know if
they exist (in $\mathbb{C}$) or not. It’s true that we don’t know whether they lie in $\mathbb{C}$ or not yet, but
we can construct them formally, just like $i$ was constructed formally to be an object such that
$i^2 = -1$. Thus we can construct $\alpha_1, \ldots, \alpha_n$ inductively such that $f$ has $n$ roots in some field $K$.
(However, if you add a formal object $\alpha$ such that $g(\alpha) = 0$ to a field, this will make a field only if $g$
is irreducible. But you can factorise $f$ into a product of irreducible polynomials, then add a root
of one of the factors and repeat. See Exercise 4.2.1∗ .)
Given a real number $t$, we consider the polynomial
\[ g_t = \prod_{i<j} \bigl( X - (\alpha_i + \alpha_j + t\alpha_i\alpha_j) \bigr) \]
which has real coefficients by the fundamental theorem of symmetric polynomials: if $\prod_{i<j} \bigl( X - (X_i + X_j + tX_iX_j) \bigr) = h_t(X, e_1, \ldots, e_n)$ for some $h_t \in \mathbb{R}[X][X_1, \ldots, X_n]$, then
\[ g_t = h_t(X, e_1(\alpha_1, \ldots, \alpha_n), \ldots, e_n(\alpha_1, \ldots, \alpha_n)) \]
since $e_i(\alpha_1, \ldots, \alpha_n) \in \mathbb{R}$ by Vieta’s formulas.

Notice that $g_t$ has degree $\frac{n(n-1)}{2}$ and, since $n$ is even, $v_2(\deg g_t) = v_2(n) - 1 = k - 1$. Hence, $g_t$ always has a
complex root $\alpha = \alpha_i + \alpha_j + t\alpha_i\alpha_j$ for some $i, j$ by the induction hypothesis.

Now pick $\frac{n(n-1)}{2} + 1$ values of $0 \ne t \in \mathbb{R}$: by the pigeonhole principle two of them, say $r \ne s$, must have the
same indices, i.e. $\alpha_i + \alpha_j + r\alpha_i\alpha_j$ is a complex root of $g_r$ and $\alpha_i + \alpha_j + s\alpha_i\alpha_j$ is a complex root of
$g_s$ for the same $i, j$.

By subtracting these two numbers, we get $(r - s)\alpha_i\alpha_j \in \mathbb{C}$ and hence $\alpha_i\alpha_j \in \mathbb{C}$. Similarly, we have $\alpha_i + \alpha_j \in \mathbb{C}$. Thus $\alpha_i$
and $\alpha_j$ are roots of a quadratic equation $X^2 - (\alpha_i + \alpha_j)X + \alpha_i\alpha_j$ with complex coefficients and
hence also lie in $\mathbb{C}$ by Proposition B.3.2. In particular, we have found a complex root of $f$ as
wanted.


B.4 Exercises
Newton’s Formulas
Exercise B.4.1. Denote by $h_k \in R[X_1, \ldots, X_n]$ the $k$th complete homogeneous polynomial, i.e. the
sum of all monomials of degree $k$. Prove that
\[ k h_k = \sum_{i=1}^{k} h_{k-i} p_i = h_{k-1} p_1 + h_{k-2} p_2 + \ldots + p_k \]
for any $k \ge 0$.

Exercise B.4.2† (Hermite’s Theorem). Prove that a function $f : \mathbb{F}_p \to \mathbb{F}_p$ is a bijection if and only
if $\sum_{a \in \mathbb{F}_p} f(a)^k$ is $0$ for $k = 1, \ldots, p - 2$ and $-1$ for $k = p - 1$.

Exercise B.4.3† . Suppose that $\alpha_1, \ldots, \alpha_n$ are such that $\alpha_1^k + \ldots + \alpha_n^k$ is an algebraic integer for all
$k$. Prove that $\alpha_1, \ldots, \alpha_n$ are algebraic integers.

Algebraic Geometry2
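Exercise B.4.4† (Resultant). Let $R$ be a commutative ring, and $f, g \in R[X]$ be two polynomials of
respective degrees $m$ and $n$. For any integer $k \ge 0$, denote by $R_k[X]$ the subset of $R[X]$ consisting of
polynomials of degree less than $k$. The resultant $\operatorname{Res}(f, g)$ is defined as the determinant of the linear
map
\[ (u, v) \mapsto uf + vg \]
from $R_n[X] \times R_m[X]$ to $R_{m+n}[X]$. Prove that, if $f = \sum_i a_i X^i$ and $g = \sum_i b_i X^i$, we have3
\[
\operatorname{Res}(f, g) =
\begin{vmatrix}
a_0 & 0 & \cdots & 0 & b_0 & 0 & \cdots & 0 \\
a_1 & a_0 & \cdots & 0 & b_1 & b_0 & \cdots & 0 \\
a_2 & a_1 & \ddots & 0 & b_2 & b_1 & \ddots & 0 \\
\vdots & \vdots & \ddots & a_0 & \vdots & \vdots & \ddots & b_0 \\
a_m & a_{m-1} & \cdots & \vdots & b_n & b_{n-1} & \cdots & \vdots \\
0 & a_m & \ddots & \vdots & 0 & b_n & \ddots & \vdots \\
\vdots & \vdots & \ddots & a_{m-1} & \vdots & \vdots & \ddots & b_{n-1} \\
0 & 0 & \cdots & a_m & 0 & 0 & \cdots & b_n
\end{vmatrix},
\]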

and, if $f = a \prod_i (X - \alpha_i)$ and $g = b \prod_j (X - \beta_j)$, then4
\[ \operatorname{Res}(f, g) = a^n b^m \prod_{i,j} (\alpha_i - \beta_j). \]
In addition, prove that $\operatorname{Res}(f, g) \in (fR[X] + gR[X])$.5 Finally, prove that if $f, g \in \mathbb{Z}[X]$ are monic and
$uf + vg = 1$ for some $u, v \in \mathbb{Z}[X]$, then $\operatorname{Res}(f, g) = \pm 1$. (It is not necessarily true that $(fR[X] + gR[X]) \cap R = \operatorname{Res}(f, g)R$ for specific polynomials $f, g$, but we always have $\operatorname{Res}(f, g) \in fR[X] + gR[X]$ by the previous
point.)

2 The link with symmetric polynomials is quite feeble, I admit. I included this section because of the resultant, which is a polynomial symmetric in the roots of its arguments.

3 This is an $(m + n) \times (m + n)$ matrix, with $n$ times the element $a_0$ and $m$ times the element $b_0$.

Exercise B.4.5. Prove that

• Res(f g, h) = Res(f, h) Res(g, h) for any f, g, h ∈ R[X].

• Res(f − gh, g) = bk−n Res(f, g) for any f, g, h ∈ R[X] where k is the degree of f − gh, n is the
degree of g and b its leading coefficient.

• Res(F (f, g), G(f, g)) = Res(F, G)k Res(f, g)mn where F, G, f, g ∈ R[X, Y ] are homogeneous poly-
nomials of respective degrees m, n, k, k. Here, by Res(A, B) for homogeneous A, B ∈ R[X, Y ],
we mean Res(A(X, 1), B(X, 1)).

Exercise B.4.6† (Hilbert’s Nullstellensatz). Let K be an algebraically closed field. Suppose that
f1 , . . . , fm ∈ K[X1 , . . . , Xn ] have no common zeros in K. Prove that there exist polynomials g1 , . . . , gm
such that
f1 g1 + . . . + fm gm = 1.

Deduce that, more generally, if f is a polynomial which is zero at common roots of polynomials
f1 , . . . , fm (we do not assume anymore that they have no common roots), then there is an integer k
and polynomials g1 , . . . , gm such that

f k = f1 g1 + . . . + fm gm .

Exercise B.4.7† (Weak Bézout’s Theorem). Prove that two coprime polynomials f, g ∈ K[X, Y ] of
respective degrees m and n have at most mn common roots in K. (Bézout’s theorem states that they
have exactly mn common roots counted with multiplicity, possibly at infinity.6 )

Exercise B.4.8† . Prove that $n + 1$ polynomials $f_1, \ldots, f_{n+1} \in K[X_1, \ldots, X_n]$ in $n$ variables are
algebraically dependent, meaning that there is some non-zero polynomial $f \in K[X_1, \ldots, X_{n+1}]$ such
that
\[ f(f_1, \ldots, f_{n+1}) = 0. \]

Exercise B.4.9† (Transcendence Bases). Let L/K be a field extension. Call a maximal set of K-
algebraically independent elements of L a transcendence basis. Prove that, if L/K has a transcendence
basis of cardinality n, then all transcendence bases have cardinality n. This n is called the transcendence
degree trdegK L. Finally, show that, if L = K(α1 , . . . , αn ) any maximal algebraically independent
subset of α1 , . . . , αn is a transcendence basis. (In particular trdegK L ≤ n.)

Exercise B.4.10† . Let K be an algebraically closed field which is contained in another field L.
Suppose that f1 , . . . , fm ∈ K[X1 , . . . , Xn ] are polynomials with a common root in L. Prove that they
also have a common root in K.
4 In particular, the discriminant of $f$ is $\frac{(-1)^{n(n-1)/2}}{a} \cdot \operatorname{Res}(f, f')$.
5 In other words, the resultant provides an explicit value of a possible constant in Bézout’s lemma for arbitrary rings (such as Z).
6 This requires some care: we need to define the multiplicity of common roots as well as what infinity means. See any introductory text to algebraic geometry, e.g. Shafarevich [shafarevich]. See also the appendix on projective geometry of Silverman-Tate [26].

Miscellaneous
Exercise B.4.11† (ISL 2020 Generalised). Let $n \ge 1$ be an integer. Find the maximal $N$ for which
there exists a monomial $f$ of degree $N$ which cannot be written as a sum
\[ \sum_{i=1}^{n} e_i f_i \]
with $f_i \in \mathbb{Z}[X_1, \ldots, X_n]$.
Exercise B.4.12† (Lagrange). Given a rational function f ∈ K(X1 , . . . , Xn ), we denote by Gf the
set of permutations σ ∈ Sn such that

f (X1 , . . . , Xn ) = f (Xσ(1) , . . . , Xσ(n) ).

Let f, g ∈ K(X1 , . . . , Xn ) be two rational functions. If Gf ⊆ Gg , prove that there exists a rational
function r ∈ K[e1 , . . . , en ](X) such that
g = r ◦ f.
Exercise B.4.13† (Iran Mathematical Olympiad 2012). Prove that there exists a polynomial f ∈
R[X0 , . . . , Xn−1 ] such that, for all a0 , . . . , an−1 ∈ R,

f (a0 , . . . , an−1 ) ≥ 0

is equivalent to the polynomial X n + an−1 X n−1 + . . . + a0 having only real roots, if and only if
n ∈ {1, 2, 3}.

Exercise B.4.14. Let $f \in K[X]$ be a monic polynomial with roots $\alpha_1, \ldots, \alpha_n$. Prove that its discriminant, as defined in Remark 1.3.2 or Exercise B.4.4†, is equal to
\[ (-1)^{\frac{n(n-1)}{2}} \cdot \prod_{i=1}^{n} f'(\alpha_i). \]

Deduce a formula for the discriminant of X n + aX + b and show that it is valid over any ring (not
necessarily one where X n + aX + b has at most n roots7 ).

7 Remember that Corollary A.1.1 is valid only over fields and integral domains.
Appendix C

Linear Algebra

Prerequisites for this chapter: Section A.1. Section A.2 is recommended.

C.1 Vector Spaces

Definition C.1.1 (Vector Space)

A vector space V over a field K, also called a K-vector space, is a set where you can add elements
of V and also multiply them by elements of K. More specifically, we have the following axioms.

1. associativity of addition: (u + v) + w = u + (v + w) for any u, v, w ∈ V .
2. commutativity of addition: u + v = v + u for any u, v ∈ V .
3. identity of addition: there is a 0V ∈ V such that u + 0V = 0V + u = u for any u ∈ V .
4. compatibility of multiplication: (ab)v = a(bv) for any a, b ∈ K and v ∈ V .
5. identity of multiplication: 1K v = v for any v ∈ V (1K is the identity of K).
6. distributivity of multiplication: a(u + v) = au + av and (a + b)u = au + bu for any a, b ∈ K
and u, v ∈ V .

The elements of the base field K are called scalars (and the elements of V vectors, although we
won’t use this terminology much).

These axioms are again all very obvious and you don’t need to try to remember them, they are
exactly the properties which let us establish the next propositions (that’s why we defined it like that)
so you just need to focus on what’s next. As an example of vector spaces one can take the K-vector
space K, the K-vector space K n with componentwise addition and multiplication (these are regular
vectors), Z/pn Z as a Fp -vector space, and as a final more elaborate example the R-vector space of
functions f : R → R such that f (0) = 0 (it’s closed under addition).
When the base field is obvious from context or does not matter, we will drop the K.

Definition C.1.2 (Linear Independence)

We say elements $u_1, \ldots, u_n$ of a $K$-vector space $V$ are linearly independent if no non-trivial linear
combination of them is zero, i.e. $\sum_i a_i u_i = 0$ for $a_i \in K$ implies $a_i = 0$ for all $i$. Otherwise, we
say they are linearly dependent.

For instance, the vectors (1, 0) and (0, 1) are linearly independent but the vectors (1, 0), (0, 1) and
(1, 1) aren’t because (1, 0) + (0, 1) − (1, 1) = 0.
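
For vectors with rational coordinates, linear independence can be tested mechanically by Gaussian elimination: the vectors are independent exactly when the rank equals the number of vectors. The following Python sketch is only an illustration; the helper names `rank` and `linearly_independent` are assumptions made for this example and are not taken from the text.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over Q of a list of vectors (given as rows), by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """True when no non-trivial linear combination of the vectors is zero."""
    return rank(vectors) == len(vectors)

print(linearly_independent([(1, 0), (0, 1)]))          # True
print(linearly_independent([(1, 0), (0, 1), (1, 1)]))  # False
```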


Definition C.1.3 (Bases)

A $K$-basis of a $K$-vector space $V$ is a family of linearly independent elements $(e_i)_{i \in I}$ such that
they span all of $V$: any element of $V$ is a unique linear combination $\sum_i a_i e_i$ for some $a_i \in K$ (all
but finitely many zero so that the sum makes sense).

The most common basis of $\mathbb{R}^n$ for instance is the family of unit vectors

(1, 0, . . . , 0), (0, 1, . . . , 0), . . . , (0, . . . , 0, 1).

The next proposition says that the cardinality of a basis (if there is one) does not depend on the
basis, but only on the vector space. But first, here is an application (even though we have only made
definitions!). When the base field K is obvious, we will drop the K and simply say "basis" and "vector
space".

Problem C.1.1

There are 2n + 1 cows such that, whenever one excludes one of them, the rest can be divided into
two groups of size n such that the sum of the weights of the cows in each group is the same.
Prove that all cows have the same weight.

Solution

Let $w_i$ be the weight of the $i$th cow. First, we solve the problem by induction when the weights
are in $\mathbb{Z}$: if $\sum_{i \in I_k} w_i = \sum_{i \notin I_k, i \ne k} w_i$ then

\[ \sum_{i=1}^{2n+1} w_i = w_k + \sum_{i \in I_k} w_i + \sum_{i \notin I_k, i \ne k} w_i \equiv w_k \pmod{2} \]

so all weights have the same parity. If they are all even divide them by 2 and get a smaller
solution (unless they are all equal to 0 in which case they are indeed equal), otherwise add 1 to
all of them and divide them by 2; this yields another solution as the groups have the same size
by assumption.

Now solve the problem over Q: just multiply the weights by the lcm of the denominators to get
weights in Z and we have already solved that case.

Finally, let’s do the general case: wi ∈ R. Consider the Q-vector space generated by the weights:
V = w1 Q + . . . + w2n+1 Q. Find a basis of V like this: pick any maximal subset of weights which
are linearly independent (wi )i∈I . Indeed, any other weight wk can be represented as a linear
combination of $(w_i)_{i \in I}$ since there is a linear combination
\[ a w_k + \sum_{i \in I} a_i w_i = 0 \]
as $(w_k) \cup (w_i)_{i \in I}$ is linearly dependent by assumption, so
\[ w_k = \sum_{i \in I} \frac{-a_i}{a} w_i \]
as $a \ne 0$ (otherwise $(w_i)_{i \in I}$ are linearly dependent).


Thus, we have found a basis $e_1, \ldots, e_m$ of $V$. To finish, write each $w_i$ as $\sum_j a_{j,i} e_j$. We shall
prove that $a_{j,1}, a_{j,2}, \ldots, a_{j,2n+1}$ satisfy the same conditions as the $w_i$ (you can partition them
into two groups of same sum and same size when you remove any one of them) for any $j$. Since
they are in $\mathbb{Q}$, by our previous step this implies that they are all equal and thus the weights $w_i$
are also all equal. This is however very easy: if $\sum_{i \in I} w_i = \sum_{i \in I'} w_i$ then
\[ \sum_j e_j \sum_{i \in I} a_{j,i} = \sum_j e_j \sum_{i \in I'} a_{j,i} \]
so $\sum_{i \in I} a_{j,i} = \sum_{i \in I'} a_{j,i}$ by definition of a basis. We are done. □

Now, we prove that all bases have the same cardinality if there exists a finite one. In that case
we say the vector space is finite-dimensional. Unless otherwise stated, we will always work in the
finite-dimensional case.

Proposition C.1.1 (Dimension)

Suppose e1 , . . . , en is a basis of a vector space V . Then, any basis of V has cardinality n. This
n is called the dimension of V and written dimK V .

In fact we prove more.

Proposition C.1.2

Suppose u1 , . . . , un are linearly independent elements of V and v1 , . . . , vm span all of V . Then


m ≥ n.

Since bases satisfy both of these conditions, we get n ≥ m and m ≥ n so m = n for any bases of
cardinality m and n.

Proof
We prove the contrapositive: if $n > m$ then $u_1, \ldots, u_n$ are linearly dependent. Write $u_j = \sum_i a_{i,j} v_i$
with $a_{i,j} \in K$. We proceed by induction on $m$ (when $m = 1$ it is obvious).

Pick an $a_{i,j} \ne 0$; this is possible, as otherwise all $u_i$ are zero and so in particular linearly dependent.
Without loss of generality assume $i = j = 1$. We will get rid of $u_1$ and $v_1$ for our induction. To
do so, consider the family of vectors
\[ u_1, \quad u_2 - \frac{a_{1,2}}{a_{1,1}} u_1, \quad u_3 - \frac{a_{1,3}}{a_{1,1}} u_1, \quad \ldots, \quad u_n - \frac{a_{1,n}}{a_{1,1}} u_1. \]

Clearly, these are linearly independent if and only if $u_1, \ldots, u_n$ are, since you can recover $u_1, \ldots, u_n$ from them. Also, for $j \ge 2$,
\[ u_j - \frac{a_{1,j}}{a_{1,1}} u_1 = \sum_{i \ge 2} \left( a_{i,j} - \frac{a_{1,j} a_{i,1}}{a_{1,1}} \right) v_i \]
has no coordinate in $v_1$. Thus by our induction hypothesis (e.g. on $V'$, the space generated
by $v_2, \ldots, v_m$) the $n - 1 > m - 1$ vectors $u_j - \frac{a_{1,j}}{a_{1,1}} u_1$ with $j \ge 2$ are linearly dependent, hence so are $u_1, \ldots, u_n$, and we are done. (This idea of getting rid of
coordinates will be used again in the proof of Theorem C.3.1, which states that the determinant
of linearly independent vectors is non-zero.)


Here is a small application.



Problem C.1.2

Let f, g ∈ R[X] be two non-constant polynomials. Prove that there is an h ∈ R[X] such that f h
is a polynomial in g.

Solution

This amounts to saying that some multiple of $f$ is a polynomial in $g$. Let $n = \deg f$. We work
in $\mathbb{R}[X]/f\mathbb{R}[X]$, i.e. $\mathbb{R}[X]$ modulo $f$. This is an $\mathbb{R}$-vector space of dimension $n$ as $(1, X, \ldots, X^{n-1})$
is a basis.

Now consider the $n + 1$ elements $1, g, g^2, \ldots, g^n$, where $g^k$ denotes the $k$th iterate of $g$. Since this
family has more elements than the dimension of the vector space, they are linearly dependent:
we get
\[ \sum_i a_i g^i \equiv 0 \pmod{f} \]
for some not all zero $a_i \in \mathbb{R}$. This constitutes the wanted multiple of $f$. □

Remark C.1.1
As this solution shows, the problem is extremely flexible. For instance, for any infinite set of
non-negative integers S (e.g. the set of primes), f has a multiple whose non-zero monomials have
the form X s for some s ∈ S.
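
The dimension argument in the solution of Problem C.1.2 is constructive, and the sketch below carries it out over $\mathbb{Q}$ for one possible choice of family, namely the powers $1, g, g^2, \ldots, g^n$ reduced modulo $f$. It is a hedged illustration only: polynomials are represented as coefficient lists from degree 0 upwards, and the helper names (`poly_mul`, `poly_mod`, `dependency`, `relation_between_powers`) are hypothetical, not from the text.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (degree 0 first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_mod(a, f):
    """Remainder of a modulo f, padded to exactly deg(f) coefficients."""
    a = [Fraction(x) for x in a]
    n = len(f) - 1  # degree of f
    while len(a) > n:
        c = a[-1] / f[-1]
        shift = len(a) - 1 - n
        for i in range(n + 1):
            a[shift + i] -= c * f[i]
        a.pop()  # the leading coefficient is now zero
    return a + [Fraction(0)] * (n - len(a))

def dependency(vectors):
    """Coefficients c, not all zero, with sum c[i] * vectors[i] = 0 (needs more vectors than coordinates)."""
    m, n = len(vectors), len(vectors[0])
    # Matrix whose columns are the given vectors.
    rows = [[Fraction(vectors[j][i]) for j in range(m)] for i in range(n)]
    pivots, r = [], 0
    for col in range(m):
        p = next((i for i in range(r, n) if rows[i][col] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][col]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    free = next(c for c in range(m) if c not in pivots)  # exists because m > n
    coeffs = [Fraction(0)] * m
    coeffs[free] = Fraction(1)
    for row, pc in zip(rows, pivots):
        coeffs[pc] = -row[free]
    return coeffs

def relation_between_powers(f, g):
    """Coefficients a_0..a_n, not all zero, with sum a_i g^i divisible by f."""
    f = [Fraction(x) for x in f]
    n = len(f) - 1
    powers = [poly_mod([1], f)]
    for _ in range(n):
        powers.append(poly_mod(poly_mul(powers[-1], g), f))
    return dependency(powers)

# f = X^2 + 1 and g = X^3: the relation found is 1 + g^2, i.e. X^6 + 1 is a multiple of f.
coeffs = relation_between_powers([1, 0, 1], [0, 0, 0, 1])
print([str(c) for c in coeffs])
```

In this example the resulting polynomial in $g$ is $1 + g^2 = X^6 + 1 = (X^2 + 1)(X^4 - X^2 + 1)$.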
A final important result on bases, which we already saw in the proof of Problem C.1.1, is that we
can always complete a family of linearly independent vectors to get a basis.

Proposition C.1.3

Any family of linearly independent vectors of a finite-dimensional vector space V can be com-
pleted into a basis by adding elements to it, and from any generating family we can extract a
basis.

Exercise C.1.1∗ . Prove Proposition C.1.3.

As always, a quick application to finite fields. In Chapter 4 we prove that there exist fields of
cardinality $q$ for every prime power $q = p^n \ne 1$. Here we show the converse: if $F$ is a field with $q$
elements, then $q$ is a prime power.

Proposition C.1.4

Suppose F is a field with q elements. Then there is some prime p and an integer n such that
q = pn .

Proof

The key is to consider F as a vector space over Fp , where p is the characteristic of F . Indeed, the
characteristic c is a prime, since if 0 = c = ab then either a or b must be zero too which means
a = c or b = c by minimality of c.

Now $F$ is an $\mathbb{F}_p$-vector space in the obvious way: just define
\[ nx = \underbrace{x + \ldots + x}_{n \text{ times}} \]
(technically that’s what we did before with the characteristic too) and this is compatible with $\mathbb{F}_p$ as $p = 0$ in $F$.

Since $F$ is finite, it is also finite-dimensional as a vector space: let $e_1, \ldots, e_n$ be a basis (there exists
one by Proposition C.1.3). Then, every element of $F$ can be written in a unique way as
\[ \sum_{i=1}^{n} a_i e_i \]
for some $a_i \in \mathbb{F}_p$. There are exactly $p^n$ tuples $(a_1, \ldots, a_n) \in \mathbb{F}_p^n$, so $q = p^n$ as wanted.




C.2 Linear Maps and Matrices


In this section, we consider morphisms of vector spaces which are called linear maps, or linear trans-
formations.1

Definition C.2.1 (Linear Maps)

Let U and V be two K-vector spaces. A linear map ϕ : U → V is a function which is additive
and homogeneous, i.e. ϕ(x + y) = ϕ(x) + ϕ(y) and ϕ(λx) = λϕ(x) for any x, y ∈ U and λ ∈ K.

For instance, the derivative map f 7→ f 0 is a linear map from K[X] to itself (as K-vector spaces).

There is a very simple characterisation of linear maps $U \to V$. Let $u_1, \ldots, u_m$ be a basis of $U$ and
$v_1, \ldots, v_n$ of $V$. Write
\[ \varphi(u_j) = \sum_i a_{i,j} v_i \]
for $j = 1, \ldots, m$. Then $\varphi$ is uniquely defined from these $a_{i,j}$, and any system of $a_{i,j}$ gives rise to a
linear map $U \to V$: if $x = \sum_j b_j u_j$ then
\[ \varphi(x) = \sum_{i,j} b_j a_{i,j} v_i. \]
i,j

Note in particular that this shows that the structure of finite-dimensional vector spaces is more or
less trivial: a vector space of dimension n is isomorphic to K n . However, the profoundness of linear
algebra lies precisely in what these isomorphisms are.

Remark C.2.1
When K = Q, the Q-linear maps are precisely the additive maps. Indeed, it follows from additivity
that $\varphi(nx) = n\varphi(x)$ for $n \in \mathbb{Z}$, which implies that
\[ \varphi\left( \frac{m}{n} x \right) = \frac{\varphi(mx)}{n} = \frac{m}{n} \varphi(x). \]
Additive functions are also called functions satisfying the "Cauchy equation" (ϕ(x + y) = ϕ(x) +
ϕ(y)). This explains why this equation is unsolvable over R: R is an infinite-dimensional Q-
vector space, so there are a lot of solutions: "just" pick a basis (ui )i∈I of R and send ui wherever
you want. (It is however impossible, in the general case, to prove the existence of a basis of an
infinite-dimensional vector space without the axiom of choice.)

Let us now comment a bit on our proof of Lagrange’s interpolation theorem A.1.2. What we did was
consider the canonical basis $e_1, \ldots, e_{n+1}$, where $e_i$ has a 1 in the $i$th position and zeros everywhere else,
of the space $K^{n+1}$ consisting of vectors $(b_1, \ldots, b_{n+1})$. Then, for each element of this basis, we found
polynomials $f_i$ such that
\[ (f_i(a_1), \ldots, f_i(a_{n+1})) = e_i. \]
Finally, we get the wanted result by taking linear combinations of these $f_i$ since $e_1, \ldots, e_{n+1}$ is a basis.

1 This shows that one should not call polynomial functions of degree 1 "linear", because they are not linear maps (unless the constant coefficient is 0)! One should call them "affine", because they correspond to affine transformations, not linear ones.

We come back to more abstract considerations. Given the bases $B = (u_1, \ldots, u_m)$ and $C = (v_1, \ldots, v_n)$, we denote the linear map $\varphi$ in matrix form relative to the bases $B$ and $C$ by
\[
M^B_C(\varphi) =
\begin{pmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,m} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,m} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n,1} & a_{n,2} & \cdots & a_{n,m}
\end{pmatrix}.
\]

Note that we have used the index $j$ for elements of the domain, and the index $i$ for the codomain. This
means that, to get the matrix of $\varphi$, we represent $\varphi(u_1), \ldots, \varphi(u_m)$ by column vectors:
\[
\begin{pmatrix} a_{1,1} \\ a_{2,1} \\ \vdots \\ a_{n,1} \end{pmatrix}, \ldots, \begin{pmatrix} a_{1,m} \\ a_{2,m} \\ \vdots \\ a_{n,m} \end{pmatrix}
\]

and then piece them together.

Definition C.2.2 (Matrices)

An $m \times n$ matrix is a family $(a_{i,j})_{(i,j) \in [m] \times [n]}$ which is denoted by
\[
\begin{pmatrix}
a_{1,1} & \cdots & a_{1,n} \\
\vdots & \ddots & \vdots \\
a_{m,1} & \cdots & a_{m,n}
\end{pmatrix}.
\]
The set of $m \times n$ matrices with coefficients in $K$ is denoted by $K^{m \times n}$; when $n = 1$ we just write $K^m$.

Note that this last notation clashes with the Cartesian product, and that an element of $K^m$ is
a column vector, not a row vector! To make things worse, we will even denote elements of $K^m$ by
$(a_1, \ldots, a_m)$ as column vectors take up too much space.

Here is how we define the product of two matrices $A$ and $B$: if $A = M^C_D(\psi)$ and $B = M^B_C(\varphi)$ where
$U \xrightarrow{\varphi} V \xrightarrow{\psi} W$, we want $AB$ to correspond to $M^B_D(\psi \circ \varphi)$ where $D = (w_1, \ldots, w_\ell)$ is a basis of $W$. Thus
we compute
\[ \psi(\varphi(u_j)) = \psi\left( \sum_k b_{k,j} v_k \right) = \sum_k b_{k,j} \psi(v_k) = \sum_k b_{k,j} \sum_i a_{i,k} w_i = \sum_i w_i \sum_k a_{i,k} b_{k,j}. \]
Hence we define $(a_{i,j})(b_{i,j}) = (c_{i,j})$ where $c_{i,j} = \sum_k a_{i,k} b_{k,j}$ (scalar product of the $i$th row of $A$ with the
$j$th column of $B$). (In particular, the product of two matrices is only defined when the dimensions
agree: $m \times n$ and $n \times \ell$.)

Definition C.2.3 (Matrix Multiplication)

The product of two matrices $(a_{i,j})$ and $(b_{i,j})$ of dimensions $m \times n$ and $n \times \ell$ is the matrix $(c_{i,j})$
of dimension $m \times \ell$ given by $c_{i,j} = \sum_k a_{i,k} b_{k,j}$.

Matrix multiplication is clearly associative, since composition is. It is however not commutative
in general. Similarly, addition of matrices is defined componentwise because we want $M^B_C(\varphi) + M^B_C(\psi)$
to correspond to $\varphi + \psi$. (We do not define multiplication of matrices to correspond to multiplication
of linear maps because this does not make sense: $x \mapsto x \cdot x$ is not linear.)

Exercise C.2.1∗ . Prove that $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$ but $\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$.

Exercise C.2.2∗ . Prove that matrix multiplication is distributive over matrix addition, i.e. A(B + C) =
AB + AC and (A + B)C = AC + BC for any A, B, C of compatible dimensions.
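
The definition of the matrix product translates directly into code. The following Python sketch, with the hypothetical helper names `matmul` and `matadd` and matrices stored as lists of rows, is only meant as a way to experiment with the two exercises above on concrete matrices; it proves nothing in general.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x l matrix B: c[i][j] = sum_k a[i][k] * b[k][j]."""
    if len(A[0]) != len(B):
        raise ValueError("incompatible dimensions")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def matadd(A, B):
    """Componentwise sum of two matrices of the same dimensions."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

P = [[1, 0], [0, 0]]
Q = [[1, 1], [0, 0]]
print(matmul(P, Q))  # [[1, 1], [0, 0]]
print(matmul(Q, P))  # [[1, 0], [0, 0]]
```

The two products differ, which gives a concrete instance of non-commutativity.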

Suppose we want to invert a linear map, i.e. find another linear map $\varphi^{-1}$ such that $\varphi \circ \varphi^{-1} = \mathrm{id}$.
The matrix of the identity is very simple to describe (with one basis): it’s the matrix with ones on the
diagonal and zeros everywhere else
\[
M^B_B(\mathrm{id}) =
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}
:= I_n
\]

since
\[ \mathrm{id}(e_j) = e_j = 0e_1 + \ldots + 0e_{j-1} + e_j + 0e_{j+1} + \ldots + 0e_n. \]
This matrix is called the identity matrix. Thus we would like to invert matrices. Why would this
be useful? Well, this lets us, for instance, perform changes of bases: imagine that we first expressed
$\varphi$ with respect to the bases $B = (u_1, \ldots, u_m)$ and $C = (v_1, \ldots, v_n)$, but then we decided we actually
preferred to express it with respect to $B' = (u'_1, \ldots, u'_m)$ and $C' = (v'_1, \ldots, v'_n)$. Consider the following
two linear maps $\varphi_U(u_j) = u'_j$ and $\varphi_V(v_j) = v'_j$. It is clear that
\[ M^{B'}_{C'}(\varphi) = M^B_C(\varphi_V \circ \varphi \circ \varphi_U^{-1}) \]

since if $\varphi(u_j) = \sum_i a_{i,j} v_i$, then composing with $\varphi_U^{-1}$ on the right transforms $u_j$ into $u'_j$ and composing
with $\varphi_V$ on the left transforms $v_i$ into $v'_i$.

Thus,
\[ M^{B'}_{C'}(\varphi) = M^B_C(\varphi_V) M^B_C(\varphi) M^B_C(\varphi_U)^{-1} \]
(the matrices $M^B_C(\varphi_V)$ and $M^B_C(\varphi_U)^{-1}$ are called change of bases matrices). Of particular interest is
the case where $U = V$, $B = C$ and $B' = C'$. In that case, we get an equality of the form $M' = N M N^{-1}$.

Finally, we prove one last result linking the surjectivity and injectivity of a linear map: if a linear
map from a vector space to itself is injective, then it is surjective too and conversely. This is false in
the infinite-dimensional case, but this does not affect us as we only care about the finite-dimensional
one. Recall what the kernel and image of a morphism are. Note that if ϕ is linear, then ker ϕ and im ϕ
are vector spaces (as they are closed under addition, as well as multiplication by scalars).

Definition C.2.4 (Kernel and Image)

Let ϕ : U → V be a linear map. Its kernel ker ϕ is the set of u ∈ U such that ϕ(u) = 0, and its
image is the set of v ∈ V such that ϕ(u) = v for some u ∈ U .

Theorem C.2.1 (Rank-Nullity Theorem)

Suppose ϕ : U → V is a linear map (of finite-dimensional vector spaces). Then, dim ker ϕ +
dim im ϕ = dim U .

Remark C.2.2
This is called the "rank-nullity theorem" because dim im is also called the rank, and nullity means
dim ker.

Proof

Let $u_1, \ldots, u_k$ be a basis of $\ker \varphi$ and $\varphi(u'_1), \ldots, \varphi(u'_m)$ a basis of $\operatorname{im} \varphi$. We prove that
$u_1, \ldots, u_k, u'_1, \ldots, u'_m$ is a basis of $U$.

First we prove that these elements are linearly independent. Suppose that
\[ \sum_i a_i u_i + \sum_i b_i u'_i = 0. \]

Then, $\sum_i b_i \varphi(u'_i) = 0$ by composing with $\varphi$. Since $\varphi(u'_1), \ldots, \varphi(u'_m)$ are linearly independent,
this means $b_1 = \ldots = b_m = 0$. Then, from $\sum_i a_i u_i = 0$, we deduce $a_i = 0$ since $u_1, \ldots, u_k$ are
also linearly independent.

It remains to prove that they span all of $U$. Let $u \in U$ be an element. Write $\varphi(u) = \sum_i b_i \varphi(u'_i)$.
Then, $\varphi(u - \sum_i b_i u'_i) = 0$. This means $u - \sum_i b_i u'_i \in \ker \varphi$, so
\[ u - \sum_i b_i u'_i = \sum_i a_i u_i \iff u = \sum_i a_i u_i + \sum_i b_i u'_i \]

as wanted.
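
For a concrete matrix over $\mathbb{Q}$, the two dimensions in the rank-nullity theorem can be computed by row reduction: pivot columns count the rank, and each non-pivot column yields one kernel vector. The Python sketch below is a hedged illustration with hypothetical helper names (`rref`, `kernel_basis`); it checks the statement on one example rather than proving it.

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form of M over Q, together with the list of pivot columns."""
    rows = [[Fraction(x) for x in row] for row in M]
    pivots, r = [], 0
    for col in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][col]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots

def kernel_basis(M):
    """A basis of {v : Mv = 0}, with one vector per non-pivot column."""
    rows, pivots = rref(M)
    n = len(M[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row, pc in zip(rows, pivots):
            v[pc] = -row[free]
        basis.append(v)
    return basis

M = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]  # a linear map Q^3 -> Q^3
rank = len(rref(M)[1])                 # dim im
nullity = len(kernel_basis(M))         # dim ker
print(rank, nullity)                   # 2 1
```

Here the kernel is spanned by $(-1, -1, 1)$, and $2 + 1 = 3$ is the dimension of the domain.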


Corollary C.2.1*

A linear map U → U is injective if and only if it is surjective. In other words, a square matrix
has a right-inverse if and only if it has a left-inverse.

Proof

A linear map is injective if and only if its kernel is trivial, i.e. has dimension 0. By the rank-nullity
theorem, this is equivalent to dim im ϕ = dim U , i.e. ϕ being surjective.

An $n \times n$ matrix $A$ has a right-inverse if and only if the linear map $K^{n \times n} \to K^{n \times n}$ defined by
$B \mapsto AB$ is surjective, and if $A^{-1}$ is this inverse then $A(A^{-1}A) = A$ so $A^{-1}A = I_n$ by injectivity
(from the rank-nullity theorem) which means $A^{-1}$ is a left inverse too. But if $A$ has a left-inverse,
then $B \mapsto AB$ is injective which means it’s surjective too.


Another corollary is that this lets us deduce the existence part of the Lagrange interpolation
theorem (Theorem A.1.2) from the uniqueness part: for any a1 , . . . , an ∈ K, the map from the vector
space Kn [X] of polynomials of degree less than n to K n given by
f 7→ (f (a1 ), . . . , f (an ))
is injective so must be surjective too since Kn [X] and K n both have dimension n.
Here is a combinatorial application of the fact that the right-inverse of a matrix is also its left-
inverse.

Problem C.2.1

There are 2n boys and 2n girls at a party. For each pair of girls, there are exactly n boys that
danced with exactly one of them. Prove that the same is true if we exchange the words "boys"
and "girls" in the last sentence.

Solution

Consider the adjacency matrix M = (ai,j ) defined by ai,j = 1 if the ith girl and the jth boy have
danced together and −1 otherwise (so the rows correspond to the girls and the columns to the
boys). We claim that the condition of the problem is exactly equivalent to

M M T = 2nI2n ,

where M T designates the transpose matrix of M , i.e. the matrix M T = (bi,j ) where bi,j = aj,i
(exchange the rows with the columns). Let’s compute this product: the (i, j) coordinate is
\[ c_{i,j} = \sum_k a_{i,k} b_{k,j} = \sum_k a_{i,k} a_{j,k}. \]

Now what is $a_{i,k} a_{j,k}$? $a_{i,k}$ corresponds to whether the $i$th girl has danced with the $k$th boy, and
$a_{j,k}$ to whether the $j$th girl has danced with the $k$th boy. Thus $a_{i,k} a_{j,k}$ is $-1 = 1 \cdot (-1) = (-1) \cdot 1$
if exactly one of them has danced with him, and $1 = 1 \cdot 1 = (-1)(-1)$ otherwise.

We conclude that the sum $\sum_k a_{i,k} a_{j,k}$ is zero if and only if there are exactly $n$ boys which danced
with exactly one of the girls $i, j$: for $i \ne j$ this is exactly what the problem says! For $i = j$ the
sum is trivial: $a_{i,k}^2 = 1$ so $c_{i,i} = \sum_k 1 = 2n$. We have thus proven our claim: the condition is
equivalent to $M M^T = 2n I_{2n}$, i.e. that $M^T / 2n$ is the right-inverse of $M$. But then $M^T / 2n$ is also
the left-inverse of $M$ by the rank-nullity theorem, so
\[ \frac{1}{2n} M^T M = I_{2n} \iff M^T (M^T)^T = 2n I_{2n} \]
which means that the statement is true with the word "boys" and "girls" exchanged (since this
is what the transpose does: it exchanges the rows with the columns). 
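
The conclusion of Problem C.2.1 can be checked by direct counting on a small example. The sketch below uses a made-up $4 \times 4$ matrix of $\pm 1$ entries (so $2n = 4$) satisfying the hypothesis; the matrix and the helper names `exactly_one_counts` and `transpose` are assumptions chosen for this illustration.

```python
from itertools import combinations

def exactly_one_counts(M):
    """For each pair of rows, count the columns in which the two rows differ."""
    return [sum(1 for a, b in zip(r, s) if a != b) for r, s in combinations(M, 2)]

def transpose(M):
    """Exchange the rows with the columns."""
    return [list(col) for col in zip(*M)]

# Rows are girls, columns are boys; 1 means "danced together", -1 means "did not".
M = [[1, 1, 1, 1],
     [1, -1, -1, 1],
     [1, 1, -1, -1],
     [1, -1, 1, -1]]
n = len(M) // 2

print(exactly_one_counts(M))             # [2, 2, 2, 2, 2, 2]: pairs of girls
print(exactly_one_counts(transpose(M)))  # [2, 2, 2, 2, 2, 2]: pairs of boys
```

Every pair of girls is separated by exactly $n = 2$ boys, and the same count holds for every pair of boys, as the solution predicts.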

C.3 Determinants
The set of matrices almost forms a non-commutative ring under addition and multiplication, except
that multiplication is not always defined (only when the dimensions are compatible). However, for
square matrices it is always possible. Thus, square matrices are usually nicer to study; for instance, we
have seen that they have a right-inverse if and only if they have a left-inverse, which is not true for
other matrices (by the rank-nullity theorem). In this section, we find a criterion to determine which
square matrices are invertible. Note that finding when an $n \times n$ matrix $A$ is invertible is equivalent
to finding when its rows or its columns are linearly independent. Indeed, the columns are linearly
independent if and only if the images of the canonical basis

e1 = (1, 0, . . . , 0), . . . , en = (0, . . . , 0, 1)

by the linear map $v \mapsto Av$ from $K^n$ to $K^n$ are linearly independent, i.e. if and only if this map is
injective (which is equivalent to $A$ being invertible). For the rows, one can consider the map $v \mapsto A^T v$,
as $(AB)^T = B^T A^T$ so $A$ is invertible if and only if its transpose is.

Exercise C.3.1. Prove that an $m \times n$ matrix can only have a right-inverse if $m \le n$, and only a left-inverse
if $m \ge n$. When does such an inverse exist?

Exercise C.3.2∗ . Prove that $(AB)^T = B^T A^T$ for any $n \times n$ matrices $A, B$.


 
Let’s start with the $2 \times 2$ case. Let $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be such a matrix. The vectors $(a, b)$ and $(c, d)$ are
linearly dependent if and only if there are some $x, y$, not both zero, such that $(ax + cy, bx + dy) = (0, 0)$. By rescaling
$x$ and $y$ if necessary, we may assume $x = -c$ and $y = a$ from $ax + cy = 0$ (unless $a = c = 0$ but in that
case they’re clearly linearly dependent). Then, $bx + dy = 0$ becomes $ad - bc = 0$.
 
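Thus, $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible if and only if $ad - bc \neq 0$. In fact, we can even check that
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = (ad - bc) I_2. \]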

Exercise C.3.3. Prove this identity.


 
This number $ad - bc$ is called the determinant $\det M$ of the matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Our goal will
be to define such a determinant for $n \times n$ matrices satisfying the following properties:
• $\det M$ is a homogeneous polynomial of degree $n$ in the coordinates of $M$. Moreover, $\det M$ has
degree 1 in each coordinate of $M$.
• $M$ is invertible if and only if $\det M \neq 0$.
In fact, these properties define the determinant uniquely up to a constant factor (this will be
proven in Theorem C.3.4)! For the sake of convenience, since we are going to be talking more about
columns and rows, we denote the $i$th row of $M$ by $M_i$ and the $j$th column of $M$ by $M^j$ so that
\[ M = [M^1, \cdots, M^n] = \begin{pmatrix} M_1 \\ \vdots \\ M_n \end{pmatrix}. \]

Here is how one can define the determinant of $n \times n$ matrices inductively. In Theorem C.3.4
and Theorem C.3.3, we will give two other characterisations of the determinant, the latter being
completely explicit. The reader may wish to take it for granted that the determinant exists for now,
skip to Problem C.3.1 to see an application and come back later to read the proofs.

Definition C.3.1 (Determinant)

Let $A = (a_{i,j})$ be an $n \times n$ matrix. Denote by $A_{i,j}$ the matrix obtained by removing the $i$th row and
the $j$th column. We define the determinant inductively by $\det[a] = a$ and

\[ \det A = a_{1,1} \det A_{1,1} - a_{2,1} \det A_{2,1} + \ldots + (-1)^{n+1} a_{n,1} \det A_{n,1}. \]

Remark C.3.1
 
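We shall also sometimes denote the determinant of
\[ \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{pmatrix} \quad \text{as} \quad \begin{vmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{vmatrix}. \]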

Here is an example for the determinant of 3 × 3 matrices.


 
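\begin{align*}
\det \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{pmatrix}
&= a_{1,1} \det \begin{pmatrix} a_{2,2} & a_{2,3} \\ a_{3,2} & a_{3,3} \end{pmatrix} - a_{2,1} \det \begin{pmatrix} a_{1,2} & a_{1,3} \\ a_{3,2} & a_{3,3} \end{pmatrix} + a_{3,1} \det \begin{pmatrix} a_{1,2} & a_{1,3} \\ a_{2,2} & a_{2,3} \end{pmatrix} \\
&= a_{1,1}(a_{2,2} a_{3,3} - a_{2,3} a_{3,2}) - a_{2,1}(a_{1,2} a_{3,3} - a_{1,3} a_{3,2}) + a_{3,1}(a_{1,2} a_{2,3} - a_{2,2} a_{1,3}).
\end{align*}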
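
The inductive definition translates almost word for word into a recursive function. The following Python sketch is only an illustration: the names `minor`, `det` and `det3_explicit` are ours, and rows are indexed from 0, so that the sign attached to the $i$th row becomes $(-1)^i$. It expands along the first column and compares the result with the explicit $3 \times 3$ formula above. The recursion takes a number of steps of the order of $n!$, so it is meant for small matrices only.

```python
def minor(a, i, j):
    """Return the matrix a with row i and column j removed (0-indexed)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Determinant of a square matrix (list of rows), expanding along the first column."""
    if len(a) == 1:
        return a[0][0]
    # With 0-indexed rows the signs alternate +, -, +, ... starting from row 0.
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(len(a)))


def det3_explicit(a):
    """The explicit 3 x 3 expansion displayed above."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[1][0] * (a[0][1] * a[2][2] - a[0][2] * a[2][1])
            + a[2][0] * (a[0][1] * a[1][2] - a[1][1] * a[0][2]))


example = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
assert det(example) == det3_explicit(example) == -3
```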

We shall get a more explicit formula for the determinant in Theorem C.3.3, but for now we shall
use this one. Let’s prove that it is linear in each column (we say the determinant is multilinear in the
columns).
178 APPENDIX C. LINEAR ALGEBRA

Proposition C.3.1

The determinant is linear in each column, i.e., for any $k \in [n]$, $t \in K$ and $C, C' \in K^n$, we have
\begin{align*}
\det[A^1, \ldots, A^{k-1}, tC + C', A^{k+1}, \ldots, A^n] &= t \det[A^1, \ldots, A^{k-1}, C, A^{k+1}, \ldots, A^n] \\
&\quad + \det[A^1, \ldots, A^{k-1}, C', A^{k+1}, \ldots, A^n].
\end{align*}

Proof

This follows easily from our inductive definition. For the sake of convenience we shall write
$\det^k_D(M) = \det[M^1, \ldots, M^{k-1}, D, M^{k+1}, \ldots, M^n]$ for any $D \in K^n$ and any $M$. Let $C = (c_i)$
and $C' = (c'_i)$. First suppose $k = 1$. Then,
\[ \det^k_{tC+C'}(A) = \sum_i (-1)^{i+1} (tc_i + c'_i) \det A_{i,1} = t \sum_i (-1)^{i+1} c_i \det A_{i,1} + \sum_i (-1)^{i+1} c'_i \det A_{i,1} = t \det^k_C(A) + \det^k_{C'}(A). \]

Otherwise,
\[ \det^k_{tC+C'}(A) = \sum_i (-1)^{i+1} a_{i,1} \det^{k-1}_{tC+C'} A_{i,1} \]
(where, in the $i$th term, the column $tC + C'$ is taken with its $i$th coordinate removed),
which is linear by the induction hypothesis.




Thus, the determinant is invariant under adding one column of the matrix to another if and only if the
determinant of a matrix with two identical columns is 0. Before showing this however, we prove that
the determinant changes sign when we exchange two columns. This should satisfy the reader who
was disappointed by the lack of symmetry in our definition.

Proposition C.3.2

The determinant of a matrix is multiplied by −1 when exchanging two of its columns (which are
distinct).

Proof

Without loss of generality suppose $j > i$. We prove the claim when exchanging two consecutive
columns. Iterating this process yields the desired conclusion: indeed if we do the switches

\[ i \mapsto i + 1 \mapsto \ldots \mapsto i + (j - i), \]

the $k$th column goes to the $(k-1)$th column for $i + 1 \le k \le j$, and then if we do the switches

\[ \underbrace{j - 1}_{\text{originally } j} \mapsto j - 2 \mapsto \ldots \mapsto j - 1 - (j - 1 - i) \]

we have, at the end, exchanged only the original $i$th column with the original $j$th
column. In total, we made $(j - i) + (j - 1 - i)$ switches of consecutive columns, which is an odd number, so the
determinant is multiplied by $(-1)^{(j-i)+(j-1-i)} = -1$.

To prove this for consecutive columns, we introduce a notation similar to the one we used
before:
\[ \det^k_{C,C'}(A) = \det[A^1, \ldots, A^{k-1}, C, C', A^{k+2}, \ldots, A^n]. \]

We have
\begin{align*}
0 &= \det^k_{A^k + A^{k+1},\, A^k + A^{k+1}}(A) \\
&= \det^k_{A^k, A^k}(A) + \det^k_{A^{k+1}, A^{k+1}}(A) + \det^k_{A^k, A^{k+1}}(A) + \det^k_{A^{k+1}, A^k}(A) \\
&= \det A + \det^k_{A^{k+1}, A^k}(A)
\end{align*}
as wanted since the first two determinants are zero as the matrices have two equal columns.


Proposition C.3.3

The determinant of a matrix with two identical columns is zero.

Proof

By induction again (what else can we do when we defined the determinant inductively?). When
$n = 1, 2$ this is obvious. Otherwise, by switching some columns, we may assume the second and
third ones are equal by Proposition C.3.2. Then
\[ \det A = \sum_i (-1)^{i+1} a_{i,1} \det A_{i,1} \]
and all $A_{i,1}$ have two identical columns so have zero determinant by the induction hypothesis.


Exercise C.3.4∗ . Prove that det In = 1.

Exercise C.3.5∗ . Prove that the determinant of a matrix with a zero column is zero.

Exercise C.3.6. Prove that the determinant of a non-invertible matrix is 0.

With this we can almost prove that the determinant is non-zero if and only if the matrix is invertible.
But first, we need to know how to compute a certain kind of determinant: determinants of upper
triangular matrices, i.e. matrices $M = (a_{i,j})$ such that $a_{i,j} = 0$ for $j > i$:
\[ \begin{pmatrix} a_{1,1} & 0 & \cdots & 0 \\ a_{2,1} & a_{2,2} & \cdots & 0 \\ \vdots & \vdots & \ddots & 0 \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{pmatrix}. \]

Here is a sketch of how we are going to proceed to prove that the determinant of an invertible matrix
is non-zero, using upper triangular matrices.

The determinant is invariant under column operations, i.e. adding a scalar times a column to
another column. Indeed, for $i \neq k$,

\[ \det^k_{A^k + tA^i}(A) = \det^k_{A^k}(A) + t \det^k_{A^i}(A) = \det A \]

since $\det^k_{A^i}(A) = 0$ as it has two equal columns. Thus, we shall transform $A$ into an upper triangular
matrix using column operations and exchanging columns. The determinant of this matrix will then be
$\pm \det A$, so we compute it with the following proposition and, as a result, we conclude that $\det A \neq 0$.
This idea of transforming A into a triangular matrix is also what we did to prove that bases all had
the same cardinality in Proposition C.1.2.

Proposition C.3.4

The determinant of an upper triangular matrix $A = (a_{i,j})$ is the product of the elements on the
diagonal $a_{1,1} \cdot a_{2,2} \cdot \ldots \cdot a_{n,n}$.

Proof

By induction! It’s clearly true for $n = 1$ and for $n \ge 2$ we have

\[ \det A = a_{1,1} \det A_{1,1} - a_{2,1} \det A_{2,1} + \ldots + (-1)^{n+1} a_{n,1} \det A_{n,1}. \]

Now notice that, for $i \ge 2$, $A_{i,1}$ has one row full of zeros (the one corresponding to the first row of $A$) which
means $\det A_{i,1} = 0$ by Exercise C.3.5∗ , i.e. $\det A = a_{1,1} \det A_{1,1} = a_{1,1} a_{2,2} \cdot \ldots \cdot a_{n,n}$ by the
induction hypothesis.


Theorem C.3.1

Any n × n matrix A is invertible if and only if its determinant det A is non-zero.

Proof

As said before, we will transform A into an upper triangular matrix by making column operations
and exchanging columns. Since this leaves the space generated by the columns unchanged and
changes the determinant by a factor of ±1, it will suffice to prove that an upper triangular matrix
is invertible if and only if its diagonal has no zero element. This is Exercise C.3.7∗ .

Here is how we do it. We proceed by induction on $n$ ($n = 1$ is trivial as always). If the first row
of $A$ is all zero then we can directly apply the induction hypothesis on $A_{1,1}$. Otherwise, suppose
that $a_{1,i} \neq 0$. By exchanging the $i$th column with the first one, we can assume $i = 1$.

Now, consider the matrix
\[ \left[ A^1,\ A^2 - \frac{a_{1,2}}{a_{1,1}} A^1,\ A^3 - \frac{a_{1,3}}{a_{1,1}} A^1,\ \ldots,\ A^n - \frac{a_{1,n}}{a_{1,1}} A^1 \right]. \]

It is column equivalent to $A$ and its first row is zero, except for $a_{1,1}$. Now apply the induction
hypothesis to this matrix.
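
The reduction used in this proof is also a practical way of computing determinants. Here is a minimal Python sketch of it, under our own naming (`det_by_column_ops` is not from the book): it works over the rationals with `fractions.Fraction`, exchanges columns to bring a non-zero entry onto the diagonal (each exchange flips the sign), clears the rest of the row with column operations (which leave the determinant unchanged), and multiplies the diagonal entries of the resulting triangular matrix.

```python
from fractions import Fraction


def det_by_column_ops(rows):
    """Determinant via column exchanges and column operations, as in the proof above."""
    n = len(rows)
    a = [[Fraction(x) for x in row] for row in rows]
    sign = 1
    diagonal_product = Fraction(1)
    for r in range(n):
        # Look for a column c >= r whose entry in row r is non-zero.
        pivot = next((c for c in range(r, n) if a[r][c] != 0), None)
        if pivot is None:
            # The triangular form will have a zero on its diagonal.
            return Fraction(0)
        if pivot != r:
            for row in a:
                row[r], row[pivot] = row[pivot], row[r]
            sign = -sign  # exchanging two columns negates the determinant
        for c in range(r + 1, n):
            factor = a[r][c] / a[r][r]
            for row in a:
                row[c] -= factor * row[r]  # column c minus factor times column r
        diagonal_product *= a[r][r]
    return sign * diagonal_product


assert det_by_column_ops([[1, 2, 3], [4, 5, 6], [7, 8, 10]]) == -3
```

Unlike the inductive definition, this takes a number of arithmetic operations of the order of $n^3$.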


Exercise C.3.7∗ . Prove that an upper triangular matrix is invertible if and only if its determinant is non-zero,
i.e. if the elements on its diagonal are non-zero.
We are finally ready for some applications. We shall first give a proof that algebraic integers are
closed under addition and multiplication using our machinery. In fact we even have the following more
general criterion.

Proposition C.3.5

An algebraic number α is an algebraic integer if and only if there is a non-zero finitely generated $\mathbb{Z}$-module
$M$ such that $\alpha M \subseteq M$. (A module is like a vector space except it can be over any ring, not
necessarily a field. In this case, a $\mathbb{Z}$-module is simply a space where you can add and subtract elements,
since multiplication by integers is then determined by addition. Finitely generated means that $M = u_1 \mathbb{Z} + \ldots + u_m \mathbb{Z}$
for some $u_1, \ldots, u_m$.)

Proof

If α is an algebraic integer, then we can take $M = \mathbb{Z} + \alpha\mathbb{Z} + \ldots + \alpha^{n-1}\mathbb{Z}$ where $n$ is the degree
of α. For the converse, suppose $M$ is a finitely generated $\mathbb{Z}$-module such that $\alpha M \subseteq M$.

Let $u_1, \ldots, u_m$ be a system of generators of $M$. Write
\[ \alpha u_j = \sum_i a_{i,j} u_i \]
with $a_{i,j} \in \mathbb{Z}$. Subtracting $\alpha u_j$ from both sides, we get that the vectors

\[ (a_{1,j}, \ldots, a_{j-1,j}, a_{j,j} - \alpha, a_{j+1,j}, \ldots, a_{m,j}) \]

all satisfy the same linear relation, with coefficients $u_1, \ldots, u_m$ which are not all zero, so they are
linearly dependent over $\mathbb{C}$. Let $A = (a_{i,j})$. The previous remark means that the rows
of $A - \alpha I_m$ are linearly dependent, i.e. $A - \alpha I_m$ has determinant zero. But the determinant
$\det(A - \alpha I_m)$ is a polynomial in α with integer coefficients since $A$ has integer coordinates.
Moreover, from Lemma C.3.1, we see that its leading coefficient is $(-1)^m$ so $\alpha \in \overline{\mathbb{Z}}$ as wanted.


Corollary C.3.1

The set of algebraic integers $\overline{\mathbb{Z}}$ is closed under addition and multiplication.

Proof

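If α and β are algebraic integers of respective degrees $m$ and $n$, then
\[ M = \mathbb{Z}[\alpha, \beta] := \sum_{0 \le i \le m-1,\ 0 \le j \le n-1} \alpha^i \beta^j \mathbb{Z} \]
is a finitely generated $\mathbb{Z}$-module such that $(\alpha + \beta)M \subseteq M$ and $\alpha\beta M \subseteq M$. Thus $\alpha + \beta$ and $\alpha\beta$
are also algebraic integers.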
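
To see the criterion at work on a concrete case, take $\alpha = \sqrt{2} + \sqrt{3}$ and the module generated by $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$. The Python sketch below is an illustration under our own choices: the matrix `A` is written by hand from $\alpha u_j = \sum_i a_{i,j} u_i$, and the helper `char_poly` uses the Faddeev-LeVerrier recursion (a technique which is not from the book) to compute $\det(X I - A)$, which differs from $\det(A - X I)$ only by the factor $(-1)^m$. The result is a monic polynomial with integer coefficients vanishing at α.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def char_poly(a):
    """Coefficients [c_0, ..., c_n] of det(X*I - a), by the Faddeev-LeVerrier recursion."""
    n = len(a)
    coeffs = [0] * n + [1]            # the polynomial is monic: c_n = 1
    m = [[0] * n for _ in range(n)]   # M_0 = 0
    for k in range(1, n + 1):
        am = mat_mul(a, m)
        # M_k = a * M_{k-1} + c_{n-k+1} * I
        m = [[am[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        trace = sum(row[i] for i, row in enumerate(mat_mul(a, m)))
        assert trace % k == 0         # the division is exact for integer matrices
        coeffs[n - k] = -(trace // k)
    return coeffs


# Column j holds the coordinates of alpha * u_j in (1, sqrt 2, sqrt 3, sqrt 6):
# alpha * 1 = sqrt 2 + sqrt 3, alpha * sqrt 2 = 2 + sqrt 6,
# alpha * sqrt 3 = 3 + sqrt 6, alpha * sqrt 6 = 3 sqrt 2 + 2 sqrt 3.
A = [
    [0, 2, 3, 0],
    [1, 0, 0, 3],
    [1, 0, 0, 2],
    [0, 1, 1, 0],
]
coeffs = char_poly(A)                 # X^4 - 10 X^2 + 1
alpha = 2 ** 0.5 + 3 ** 0.5
residual = sum(c * alpha ** k for k, c in enumerate(coeffs))
assert coeffs == [1, 0, -10, 0, 1] and abs(residual) < 1e-9
```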


Exercise C.3.8. Prove that $\overline{\mathbb{Z}}$ is integrally closed, meaning that, if $f$ is a monic polynomial with algebraic
integer coefficients, then any of its roots is also an algebraic integer. (This is also Exercise 1.5.22† .)

Before presenting more applications, we need two last results. Here is an explicit formula for the
determinant which will make our life a lot easier. It can easily be proven by induction. Theorem C.3.3
will determine exactly when those ε(σ) are 1 and when they are −1.

Lemma C.3.1

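The determinant of an $n \times n$ matrix $A = (a_{i,j})$ is equal to
\[ \det \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{pmatrix} = \sum_{\sigma \in S_n} \varepsilon(\sigma)\, a_{\sigma(1),1} \cdot \ldots \cdot a_{\sigma(n),n} \]
where the sum is taken over all permutations σ of $[n]$ and $\varepsilon(\sigma) \in \{-1, 1\}$. Moreover, $\varepsilon(\mathrm{id}) = 1$
(i.e. the coefficient of $a_{1,1} \cdot \ldots \cdot a_{n,n}$ is 1).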

Exercise C.3.9∗ . Prove Lemma C.3.1.



Now, we compute a very important determinant, and then we can move on to applications.

Theorem C.3.2 (Vandermonde Determinant)

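Let $x_1, \cdots, x_n \in K$ be elements. We have
\[ \det \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ 1 & x_3 & x_3^2 & \cdots & x_3^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{pmatrix} = \prod_{i<j} (x_j - x_i). \]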

Proof

Note that this determinant is zero when $x_i = x_j$ for some $i \neq j$ since the matrix then has two equal
rows, so is not invertible. Now replace $x_i$ by a variable $X_i$ and consider this determinant as a polynomial in
$X_1, \ldots, X_n$. The previous observation implies that it is divisible by $X_i - X_j$ for any $i \neq j$, i.e. by
$\prod_{i<j} (X_j - X_i)$. In addition, Lemma C.3.1 shows that the degree of the determinant is at most $\frac{n(n-1)}{2}$,
which is the degree of $\prod_{i<j} (X_j - X_i)$, so they are equal up to a multiplicative constant. The same
lemma shows that the coefficient of

\[ X_2 X_3^2 X_4^3 \cdot \ldots \cdot X_n^{n-1} \]

in the determinant is 1 (this is the product of the diagonal entries), so it is equal to $\prod_{i<j} (X_j - X_i)$ since it has the same coefficient.
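
As a sanity check, the formula can be tested on small integer examples. The sketch below is ours (the helper names are hypothetical); it takes the product over $i < j$ of the differences $x_j - x_i$, the sign convention for which the $2 \times 2$ case reads $\det \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \end{pmatrix} = x_2 - x_1$, and computes the determinant by first-column expansion.

```python
from itertools import combinations
from math import prod


def det(a):
    """Determinant by expansion along the first column (small matrices only)."""
    if len(a) == 1:
        return a[0][0]
    return sum(
        (-1) ** i * a[i][0] * det([row[1:] for r, row in enumerate(a) if r != i])
        for i in range(len(a))
    )


def vandermonde(xs):
    """Matrix whose i-th row is (1, x_i, x_i^2, ..., x_i^(n-1))."""
    return [[x ** j for j in range(len(xs))] for x in xs]


def difference_product(xs):
    """Product of x_j - x_i over all pairs of indices i < j."""
    return prod(xs[j] - xs[i] for i, j in combinations(range(len(xs)), 2))


xs = [2, 3, 5, 7]
assert det(vandermonde(xs)) == difference_product(xs) == 240
```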


From this we deduce a very important corollary.

Corollary C.3.2*

If $x_1, \ldots, x_n$ are distinct numbers, then the vectors

\[ (1, x_1, \ldots, x_1^n), \ldots, (1, x_n, \ldots, x_n^n) \]

are linearly independent.

Remark C.3.2
In fact, we didn’t need to do all this to prove this corollary. Indeed, the invertibility of the matrix
\[ M = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ 1 & x_3 & x_3^2 & \cdots & x_3^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{pmatrix} \]
when $x_1, \ldots, x_n$ are distinct is exactly what Lagrange interpolation A.1.2 gives us. Indeed, the
non-explicit form of the theorem says precisely that the linear map $x \mapsto Mx$ from $K^n$ to itself
is surjective. I hope the reader doesn’t feel too disheartened by this: there are times where the
explicit value of the determinant will be useful to us (in the exercises).

Here is an arithmetic application, which is exactly how we will use the Vandermonde determinant
in this book (in the exercises).

Problem C.3.1

Let $p$ be a prime number and $a_1, \ldots, a_m$ integers which are not divisible by $p$. Suppose
$p \mid a_1^k + \ldots + a_m^k$ for $k = 1, \ldots, m$. Prove that $p \mid m$.

Solution

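Collect the $a_i$ which are equal modulo $p$ together to get
\[ p \mid \sum_{i=1}^n c_i b_i^k \quad \text{for } k = 1, \ldots, m, \]
for some positive integers $c_i$ with $\sum_i c_i = m$ and distinct non-zero $b_i \in \mathbb{F}_p$. Since $n \le m$, we have in particular the system of equations
\[ \begin{cases} c_1 b_1 + \ldots + c_n b_n \equiv 0 \\ c_1 b_1^2 + \ldots + c_n b_n^2 \equiv 0 \\ \qquad \cdots \\ c_1 b_1^n + \ldots + c_n b_n^n \equiv 0. \end{cases} \]

Since the $b_i$ are distinct modulo $p$, the vectors $(1, b_1, \ldots, b_1^{n-1}), \ldots, (1, b_n, \ldots, b_n^{n-1})$ are linearly independent
in $\mathbb{F}_p$ by Vandermonde. The system says that the linear combination of these vectors with coefficients $c_i b_i$
is zero, so $c_i b_i \equiv 0$ for all $i$ and thus, as $b_i \not\equiv 0$, we must have $c_1 \equiv \ldots \equiv c_n \equiv 0$. Since $m = \sum_i c_i$, we have $p \mid m$
as wanted. □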
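
For small primes the statement can also be confirmed by brute force. The following Python sketch (function names are ours) enumerates all multisets of $m$ non-zero residues modulo $p$, for the lengths $m$ not divisible by $p$, and collects those whose power sums of exponents $1, \ldots, m$ all vanish modulo $p$. By the problem, this list should be empty.

```python
from itertools import combinations_with_replacement


def power_sums_vanish(values, p):
    """True if a_1^k + ... + a_m^k is divisible by p for every k = 1, ..., m."""
    m = len(values)
    return all(sum(pow(a, k, p) for a in values) % p == 0 for k in range(1, m + 1))


def counterexamples(p, max_m):
    """Multisets of non-zero residues satisfying the hypothesis although p does not divide m."""
    found = []
    for m in range(1, max_m + 1):
        if m % p == 0:
            continue
        for values in combinations_with_replacement(range(1, p), m):
            if power_sums_vanish(values, p):
                found.append(values)
    return found


assert counterexamples(5, 7) == []
# When p divides m the hypothesis can hold, e.g. for five equal residues modulo 5.
assert power_sums_vanish((2, 2, 2, 2, 2), 5)
```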

As we saw from the first example, matrices are deeply linked with systems of linear equations. In
hindsight, this is obvious: the system
\[ \begin{cases} a_{1,1} x_1 + a_{1,2} x_2 + \ldots + a_{1,n} x_n = b_1 \\ a_{2,1} x_1 + a_{2,2} x_2 + \ldots + a_{2,n} x_n = b_2 \\ \qquad \cdots \\ a_{n,1} x_1 + a_{n,2} x_2 + \ldots + a_{n,n} x_n = b_n \end{cases} \]
is equivalent to
\[ \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = x_1 \begin{pmatrix} a_{1,1} \\ a_{2,1} \\ \vdots \\ a_{n,1} \end{pmatrix} + x_2 \begin{pmatrix} a_{1,2} \\ a_{2,2} \\ \vdots \\ a_{n,2} \end{pmatrix} + \ldots + x_n \begin{pmatrix} a_{1,n} \\ a_{2,n} \\ \vdots \\ a_{n,n} \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} \]
i.e. to $AX = B$ where $X = (x_i)$, $A = (a_{i,j})$ and $B = (b_i)$. In particular, when $A$ is invertible there is
a unique solution, so if at some point in a problem we reach such a system of equations and
have a trivial solution, then we know what the $x_i$ are since it’s the only solution.

As a last application of determinants, we give a different solution to the cows problem C.1.1.

Alternative Solution to the Cows Problem C.1.1

Again, let $w_i$ be the weight of the $i$th cow. Write
\[ \sum_{i \in I_k} w_i = \sum_{i \in J_k} w_i \]
where $|I_k| = |J_k| = n$ and $I_k \cup J_k = [2n + 1] \setminus \{k\}$, and suppose $2n + 1 \in I_k$ for $k \neq 2n + 1$.

Consider the system of $2n$ linear equations in $2n$ unknowns
\[ \sum_{i \in J_k} w_i - \sum_{i \in I_k,\, i \neq 2n+1} w_i = w_{2n+1} \]
for $k = 1, \ldots, 2n$. The associated matrix has the form
\[ \begin{pmatrix} 0 & \pm 1 & \pm 1 & \cdots & \pm 1 \\ \pm 1 & 0 & \pm 1 & \cdots & \pm 1 \\ \pm 1 & \pm 1 & 0 & \cdots & \pm 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \pm 1 & \pm 1 & \pm 1 & \cdots & 0 \end{pmatrix} \]
where there are 0s on the diagonal and a 1 in the $(i, j)$ coordinate if $i \in J_j$ and a $-1$ otherwise.
We wish to show that its determinant is non-zero. Thus, there will be a unique solution to the
system, and since $w_1 = \ldots = w_{2n} = w_{2n+1}$ is such a solution it will imply that they are indeed
all equal. Modulo 2, the determinant is simply
\[ \begin{vmatrix} 0 & 1 & \cdots & 1 \\ 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 0 \end{vmatrix} = \sum_{\sigma \in S_{2n}} \varepsilon(\sigma)\, a_{1,\sigma(1)} \cdot \ldots \cdot a_{2n,\sigma(2n)} \equiv \sum_{\sigma \in S_{2n}} a_{1,\sigma(1)} \cdot \ldots \cdot a_{2n,\sigma(2n)} \]
where the matrix above is $A = (a_{i,j})$. Now, $a_{1,\sigma(1)} \cdot \ldots \cdot a_{2n,\sigma(2n)}$ is 1 if σ has no fixed
point and 0 otherwise. Thus, this determinant is congruent to the number of derangements of $[2n]$, i.e. permutations
without fixed points. Exercise C.3.10∗ implies that this number is odd, so the original
determinant was also odd and in particular non-zero and we are done. □
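
The parity claim used at the end of this solution is easy to test numerically. The sketch below (our own helper names) counts derangements of $[m]$ both by brute force and with the alternating sum $\sum_{i=0}^m (-1)^i m!/i!$ of Exercise C.3.10∗, and checks that the count is odd exactly when $m$ is even.

```python
from itertools import permutations
from math import factorial


def derangements_brute(m):
    """Number of permutations of {0, ..., m-1} without fixed point, by enumeration."""
    return sum(all(s[i] != i for i in range(m)) for s in permutations(range(m)))


def derangements_formula(m):
    """The alternating sum of (-1)^i * m! / i! for i = 0, ..., m."""
    return sum((-1) ** i * (factorial(m) // factorial(i)) for i in range(m + 1))


for m in range(8):
    count = derangements_formula(m)
    assert count == derangements_brute(m)
    assert count % 2 == (1 if m % 2 == 0 else 0)
```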

Remark C.3.3
One can also compute directly
\[ \begin{vmatrix} 0 & 1 & \cdots & 1 \\ 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 0 \end{vmatrix}, \]
as a consequence of, e.g., Exercise C.5.6.

Exercise C.3.10∗ . Prove that the number of derangements of $[m]$ is
\[ \sum_{i=0}^{m} \frac{(-1)^i m!}{i!} \]
and that this number is odd if $m$ is even and even if $m$ is odd.

All right, now let’s finish with the determinant.

Definition C.3.2 (Signature)

The signature $\varepsilon(\sigma)$ of a permutation σ of $[n]$ is
\[ \prod_{1 \le i < j \le n} \frac{\sigma(i) - \sigma(j)}{i - j}. \]

Note that since σ is a permutation, its signature is in $\{-1, 1\}$. Thus, the signature of a permutation
is $-1$ raised to its number of inversions, i.e. the number of pairs $j > i$ such that $\sigma(j) < \sigma(i)$ (an inversion has a
contribution of $-1$ in the signature).
This definition of the signature is in fact not always convenient to work with, so we shall also
mention another one. One can see that when we apply a transposition to σ, i.e. switch two of its
values $\sigma(i)$ and $\sigma(j)$, the signature is multiplied by $-1$. Since any permutation is a composition of
transpositions, the signature is 1 if there is an even number of transpositions and $-1$ otherwise. In
the first case we say the permutation is even, and in the second one that it is odd.
In particular it does not depend on which transpositions we choose. For example, if one starts with
the sequence $(1, \ldots, 2m)$ and switches a pair of elements at each step, one will never be able to go back
to the original tuple after an odd number of steps since the signature will be $-1$ while the signature
of the identity is 1.
As another consequence of this characterisation, we see that the signature is a morphism of groups
$S_n \to \{-1, 1\}$: indeed $\varepsilon(\sigma \circ \sigma') = \varepsilon(\sigma) \cdot \varepsilon(\sigma')$ (this is obvious if you consider σ and σ′ as compositions
of transpositions).
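
These descriptions of the signature can be compared directly on a small symmetric group. In the Python sketch below (an illustration with our own names; permutations of $\{0, \ldots, n-1\}$ are stored as tuples, which does not affect the differences in the definition), the product formula and the inversion count are checked to agree, and the signature is checked to be multiplicative on $S_4$.

```python
from fractions import Fraction
from itertools import combinations, permutations


def signature_product(sigma):
    """Signature via the product of (sigma(i) - sigma(j)) / (i - j) over i < j."""
    result = Fraction(1)
    for i, j in combinations(range(len(sigma)), 2):
        result *= Fraction(sigma[i] - sigma[j], i - j)
    return int(result)


def signature_inversions(sigma):
    """Signature as -1 raised to the number of inversions."""
    inversions = sum(1 for i, j in combinations(range(len(sigma)), 2) if sigma[i] > sigma[j])
    return (-1) ** inversions


def compose(s, t):
    """The permutation i -> s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))


s4 = list(permutations(range(4)))
assert all(signature_product(s) == signature_inversions(s) for s in s4)
assert all(
    signature_inversions(compose(s, t)) == signature_inversions(s) * signature_inversions(t)
    for s in s4 for t in s4
)
```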
Exercise C.3.11∗ . Prove that the signature is negated when one exchanges two values of σ (i.e. compose a
transposition with σ).

Exercise C.3.12∗ . Prove that the transpositions $\tau_{i,j} : i \leftrightarrow j$ and $k \mapsto k$ for $k \neq i, j$ generate all permutations
(through composition).

Remark C.3.4
Proposition C.3.2 now reads as follows: when we apply a transposition to the columns, the
determinant gets multiplied by $-1$. Thus, when we apply a permutation σ to the columns, the
determinant gets multiplied by $\varepsilon(\sigma)$. (This is also a direct corollary of the next theorem.)

We get the following refinement of Lemma C.3.1 by induction.

Theorem C.3.3

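The determinant of an $n \times n$ matrix $A = (a_{i,j})$ is equal to
\[ \det \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{pmatrix} = \sum_{\sigma \in S_n} \varepsilon(\sigma)\, a_{\sigma(1),1} \cdot \ldots \cdot a_{\sigma(n),n} \]
where the sum is taken over all permutations σ of $[n]$.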
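
This explicit formula can be evaluated as it stands, at the cost of $n!$ terms. Here is a short Python sketch of it (our own function names, 0-indexed permutations), with the signature computed through the number of inversions; the example also illustrates that a matrix and its transpose give the same value.

```python
from itertools import combinations, permutations


def signature(sigma):
    """-1 raised to the number of inversions of sigma."""
    inversions = sum(1 for i, j in combinations(range(len(sigma)), 2) if sigma[i] > sigma[j])
    return (-1) ** inversions


def det_by_permutations(a):
    """Sum over all permutations sigma of signature(sigma) * prod_j a[sigma(j)][j]."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        term = signature(sigma)
        for j in range(n):
            term *= a[sigma[j]][j]
        total += term
    return total


example = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
transpose = [list(column) for column in zip(*example)]
assert det_by_permutations(example) == det_by_permutations(transpose) == -3
```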

Exercise C.3.13∗ . Prove Theorem C.3.3.


Since permutations are symmetric with respect to the rows and the columns, we get that the
determinant is also symmetric with respect to the rows or the columns. In particular, our expansion
with respect to one column that we defined the determinant with also holds for rows (and using
Proposition C.3.2 it holds for any row and any column).

Corollary C.3.3

For any square matrix A, det A = det AT .

Exercise C.3.14∗ . Prove that det A = det AT for any square matrix A.
As promised in the beginning of the section, the determinant is the unique solution of a certain
functional equation. In fact, this equation is more or less equivalent to our inductive definition as
we shall see. As a consequence, we will obtain the multiplicativity of the determinant.

Theorem C.3.4

The determinant of n × n matrices is the only function D which is multilinear (linear in all
columns), zero when two columns are the same, and such that D(In ) = 1.

Proof

We proceed by induction on $n$. When $n = 1$ it’s obvious. For the inductive step, consider the
canonical basis of $K^n$, i.e. the column vectors $E^i$ with a 1 in $i$th position and zeros everywhere
else.

The same proof as Proposition C.3.2 shows that when we exchange two columns, $D$ gets multiplied
by $-1$ (since the only things we used there were the multilinearity and the vanishing on matrices with
two equal columns). Note that if $A = (a_{i,j})$ is an
$(n - 1) \times (n - 1)$ matrix, then
\[ D_k : A \mapsto D \begin{pmatrix} 0 & a_{1,1} & \cdots & a_{1,n-1} \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & a_{n-1,1} & \cdots & a_{n-1,n-1} \end{pmatrix}, \]
where the first column is $E^k$, the $k$th row has only zeros after its first coordinate, and the other rows contain the matrix
$A$ (a bit distorted if $k \neq 1, n$), also satisfies the conditions of the theorem, except possibly
the unitary condition. One can check that $D_k(I_{n-1}) = (-1)^{k-1}$ by exchanging some columns
(Exercise C.3.15∗ ), thus $D_k(A) = (-1)^{k-1} \det A$ by the induction hypothesis. Notice also that, for
any $b_1, \ldots, b_{n-1}$,
\[ D \begin{pmatrix} 0 & a_{1,1} & \cdots & a_{1,n-1} \\ \vdots & \vdots & & \vdots \\ 1 & b_1 & \cdots & b_{n-1} \\ \vdots & \vdots & & \vdots \\ 0 & a_{n-1,1} & \cdots & a_{n-1,n-1} \end{pmatrix} = D_k(A) \]
since by subtracting multiples of the first column $E^k$ from the other ones we can get rid of the $b_i$, and this doesn’t
change the value of $D$.

Finally, using the multilinearity we have
\[ D(A) = D\left( \sum_{i=1}^n a_{i,1} E^i, A^2, \ldots, A^n \right) = \sum_{i=1}^n a_{i,1} D_i(A_{i,1}) \]
by the previous remark. Since $D_i(A_{i,1}) = (-1)^{i-1} \det A_{i,1}$, this is the recurrence we originally
defined the determinant with so we are done.


Exercise C.3.15∗ . Prove that Dk (In−1 ) = (−1)k−1 .

From this we deduce the following.

Proposition C.3.6 (Multiplicativity of the Determinant)*

The determinant is multiplicative, i.e. for any $n \times n$ matrices $A$ and $B$ we have $\det(AB) =
\det(A) \det(B)$.

Proof

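Fix $B$ and suppose first that $\det B \neq 0$. The function $A \mapsto \det(AB)/\det(B)$ satisfies all conditions of Theorem C.3.4 so is equal
to $\det(A)$. If $\det B = 0$, then $B$ is not invertible, so neither is $AB$, and both sides are zero.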


Exercise C.3.16. Prove that the determinant is multiplicative by using the explicit formula of Theorem C.3.3.

As an important corollary, we get $\det(A^{-1}) = \det(A)^{-1}$ since

\[ \det(A) \det(A^{-1}) = \det(I_n) = 1. \]

This result also lets us define the determinant of a linear map, although we won’t be using this here.

Corollary C.3.4

We can define the determinant of a linear map $\varphi : U \to U$ as the determinant of any matrix
$M_B^B(\varphi)$ representing φ.

Proof

We need to show that this is well-defined. If $M$ is a matrix representing φ, then the other
matrices representing φ have the form $N M N^{-1}$ for an invertible $N$ (see Section C.2). Since the
determinant is multiplicative, we have

\[ \det(N M N^{-1}) = \det(N) \det(M) \det(N)^{-1} = \det(M) \]

as wanted.


Remark C.3.5
With this notion of determinant of a linear map, the norm $N_{L/K}(\alpha)$ is sometimes defined as the
determinant of the linear map from $L$ to $L$ defined by $x \mapsto x\alpha$ (see Chapter 6).

Exercise C.3.17. Let $L/K$ be a finite extension. Prove that the determinant of the $K$-linear map $L \to L$
defined by $x \mapsto x\alpha$ is the norm of α defined in Definition 6.2.3.

Finally, one might be interested in knowing when, say, a matrix with integer coordinates has an
inverse with integer coordinates as well. This is achieved by Proposition C.3.6 and the following result,
which gives an explicit formula for the inverse of a given matrix.

Proposition C.3.7

The adjugate of $A$, $\operatorname{adj} A := ((-1)^{i+j} \det(A_{j,i}))$, satisfies $A \operatorname{adj} A = (\det A) I_n = (\operatorname{adj} A) A$.

Remark C.3.6
The transpose of the adjugate, $\operatorname{com} A := ((-1)^{i+j} \det(A_{i,j}))$, is called the comatrix of $A$.

Proof
Let’s compute $(b_{i,j}) = A \operatorname{adj} A$. We have $b_{i,j} = \sum_{k=1}^n a_{i,k} (-1)^{j+k} \det(A_{j,k})$. When $i = j$ this
is the $i$th row expansion of the determinant so is indeed equal to $\det A$. When $i \neq j$, it is still
the expansion of a determinant, but not of $A$: it is the $j$th row expansion of the determinant of the matrix
obtained by replacing the $j$th row of $A$ by its $i$th row. This matrix has two identical rows so its
determinant is zero.

Thus we have $b_{i,j} = 0$ if $i \neq j$ and $b_{i,i} = \det A$, i.e. $A \operatorname{adj} A = (\det A) I_n$. The coordinates of
$(\operatorname{adj} A) A$ are treated in a similar fashion, by noting that they are column expansions of certain
determinants this time.


Remark C.3.7
There is another way to argue that $(\operatorname{adj} A) A$ is also equal to $(\det A) I_n$ once we have proven $A \operatorname{adj} A$
is. Note that this is a polynomial equation in the coordinates of $A$, so if it holds sufficiently many
times in a fixed infinite field $K$ it must always hold (e.g. by Exercise A.1.7∗ ). Suppose that $\det A$
is non-zero. Then, $\operatorname{adj} A / \det A$ is the inverse of $A$ so it commutes with $A$ as wanted. Finally,
$((\operatorname{adj} A) A - (\det A) I_n) \det A$ is zero for all $A \in K^{n \times n}$ so must be identically zero, which implies
$(\operatorname{adj} A) A = (\det A) I_n$ since the determinant is not the zero polynomial.

Exercise C.3.18∗ . Prove that adj AA = (det A)In .

In fact, this also gives another proof of Theorem C.3.1. This follows from the more general corollary
below, which answers our question about invertible matrices with integer coefficients.

Corollary C.3.5*

Let $A$ be a square matrix with coefficients in a commutative ring $R$. $A$ is invertible in $R$ (i.e. $A$ has an
inverse with coordinates in $R$) if and only if $\det A$ is a unit of $R$.

For instance, a matrix with integer coordinates has an inverse with integer coordinates if and only
if its determinant is ±1.

Proof

If $A^{-1}$ has coordinates in $R$, then $\det(A) \det(A^{-1}) = 1$ so $\det A$ is a unit. Conversely, if $u \det A = 1$,
then $A(u \operatorname{adj}(A)) = I_n$.
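
The adjugate is straightforward to compute from its definition, which makes both Proposition C.3.7 and this corollary easy to try out over $\mathbb{Z}$. The following Python sketch (helper names are ours, indices start at 0) builds $\operatorname{adj} A$ from the minors $A_{j,i}$, checks $A \operatorname{adj} A = (\det A) I_n = (\operatorname{adj} A) A$ on an example, and exhibits an integer matrix of determinant 1 whose adjugate is its integer inverse.

```python
def minor(a, i, j):
    """Return the matrix a with row i and column j removed (0-indexed)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Determinant by first-column expansion; the empty matrix has determinant 1."""
    if not a:
        return 1
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(len(a)))


def adjugate(a):
    """Matrix whose (i, j) entry is (-1)^(i+j) times the determinant of a without row j and column i."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def scalar_matrix(c, n):
    """The matrix c * I_n."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
assert mat_mul(A, adjugate(A)) == mat_mul(adjugate(A), A) == scalar_matrix(det(A), 3)

U = [[2, 1], [5, 3]]                      # determinant 1, a unit of Z
assert mat_mul(U, adjugate(U)) == scalar_matrix(1, 2)
```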


C.4 Linear Recurrences


In this short section, we use the Vandermonde determinant to derive the formula for linear recurrences,
which will be used a lot in this book (mainly in the exercises, though).



Definition C.4.1 (Linear Recurrences)

We say a sequence $(u_n)_{n \in \mathbb{Z}}$ is a linear recurrence if there is some $k \ge 1$ and numbers $a_0, \ldots, a_{k-1} \in K$
such that $a_0 \neq 0$ and
\[ u_{n+k} = \sum_{i=0}^{k-1} a_i u_{n+i} \]
for all $n$. The smallest such $k$ is called the order of the linear recurrence, and the polynomial
$f = X^k - a_{k-1} X^{k-1} - \ldots - a_1 X - a_0$ is its characteristic polynomial. This polynomial is also called
the characteristic polynomial of the above equation (and the equation is called the equation
associated with $f$).

Theorem C.4.1 (Linear Recurrences)

Let $K$ be a field of characteristic zero and $(u_n)_{n \in \mathbb{Z}}$ be a linear recurrence of elements of $K$ with
characteristic polynomial $f$. Suppose the distinct roots of $f$ are $\alpha_1, \ldots, \alpha_r$ with multiplicities
$m_1, \ldots, m_r$. Then, there exist polynomials $f_1, \ldots, f_r$ of degrees less than $m_1, \ldots, m_r$ respectively
such that
\[ u_n = f_1(n) \alpha_1^n + \ldots + f_r(n) \alpha_r^n \]
for all $n \in \mathbb{Z}$.

Proof

Let $d$ be the degree of $f$. Consider the $K$-vector space of sequences satisfying the equation
associated with $f$ (which is indeed a vector space as it is closed under addition and multiplication by scalars). This space
has dimension $d$: indeed for any $(x_0, \ldots, x_{d-1}) \in K^d$ there is a unique sequence $(u_n)_n$ solution to the
recurrence such that $u_0 = x_0, \ldots, u_{d-1} = x_{d-1}$. Thus, the dimension of this space is the
same as $\dim K^d = d$.

Now we prove that all sequences of the form $u_n = f_1(n) \alpha_1^n + \ldots + f_r(n) \alpha_r^n$ are solutions.
These sequences also form a $K$-vector space, with generating family given by

\[ u_n = n(n - 1) \cdot \ldots \cdot (n - (k - 1)) \alpha_i^n \]

for $k = 0, \ldots, m_i - 1$ and $i = 1, \ldots, r$, so it suffices to treat these. Thus, writing $f = X^d - \sum_{j=0}^{d-1} a_j X^j$, we want to have
\[ (n + d)(n + d - 1) \cdot \ldots \cdot (n + d - (k - 1)) \alpha_i^{n+d} = \sum_{j=0}^{d-1} a_j (n + j)(n + j - 1) \cdot \ldots \cdot (n + j - (k - 1)) \alpha_i^{n+j} \]
which is equivalent to
\[ (X^n f)^{(k)}(\alpha_i) = 0 \]
and that’s true because $\alpha_i$ is a root of multiplicity $m_i > k$ of $X^n f$.

Finally, to show that all solutions have this form we want to prove that the dimension of the
space of solution of this form is the same as the dimension of the space of all solutions, i.e. d.
Since our generating family had exactly m1 + . . . + mr = d elements, this is equivalent to it being
a basis.

Thus, we want to show it is linearly independent. Suppose that a linear combination was zero,
i.e.
\[ u_n = \sum_i f_i(n) \alpha_i^n = 0 \]
for all $n$. We shall prove that $f_i(n) = 0$ for all $n$ and each $i$, thus implying that $f_i = 0$ since
$K$ has characteristic zero. We proceed by induction on $\sum_i \deg f_i$; the base case follows from
the Vandermonde determinant C.3.2. For the induction step, suppose $\deg f_1 \ge 1$ without loss of
generality. Consider the sequence
\[ v_n = u_{n+1} - \alpha_1 u_n = \sum_i (\alpha_i f_i(n + 1) - \alpha_1 f_i(n)) \alpha_i^n. \]

Since $\deg(\alpha_i f_i(X + 1) - \alpha_1 f_i) \le \deg f_i$ for $i \ge 2$ and $\deg(\alpha_1 (f_1(X + 1) - f_1)) \le \deg f_1 - 1$, by
the induction hypothesis we have $\alpha_i f_i(X + 1) - \alpha_1 f_i = 0$ for all $i$. This means that the $f_i$ are
constant, but we have already treated this case so we are done.
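
Here is how the theorem is used in practice, on a recurrence of our own choosing: $u_{n+3} = 7u_{n+2} - 16u_{n+1} + 12u_n$, whose characteristic polynomial is $X^3 - 7X^2 + 16X - 12 = (X - 2)^2 (X - 3)$. The theorem predicts $u_n = (a + bn) 2^n + c \cdot 3^n$, and the constants are found by solving the linear system given by the first three terms. The Python sketch below (all names are ours; the solver is a plain Gauss-Jordan elimination over the rationals) does this for $u_0 = 0$, $u_1 = 1$, $u_2 = 2$ and compares the closed form with the recurrence.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b for an invertible square matrix a by Gauss-Jordan elimination."""
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]


def recurrence_terms(initial, coeffs, count):
    """First terms of u_{n+k} = coeffs[0] u_n + ... + coeffs[k-1] u_{n+k-1}."""
    u = list(initial)
    k = len(coeffs)
    while len(u) < count:
        u.append(sum(c * x for c, x in zip(coeffs, u[-k:])))
    return u[:count]


# The basis sequences predicted by the theorem for the roots 2 (double) and 3 (simple).
basis = [lambda n: 2 ** n, lambda n: n * 2 ** n, lambda n: 3 ** n]
initial = [0, 1, 2]
system = [[g(n) for g in basis] for n in range(3)]
a, b, c = solve(system, initial)

closed_form = [a * 2 ** n + b * n * 2 ** n + c * 3 ** n for n in range(20)]
assert closed_form == recurrence_terms(initial, [12, -16, 7], 20)
```

The system solved here has the matrix $(g_j(n))$ for $n = 0, 1, 2$; its invertibility is an instance of the linear independence proven above.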


Remark C.4.1
This theorem is equivalent to the existence of a (unique) partial fraction decomposition of any
rational function, meaning that, given a rational function $h = f/g$ with $g = \prod_{i=1}^r (X - \alpha_i)^{m_i}$ and
$\deg g > \deg f$, i.e. $\deg h < 0$, there are polynomials $f_i$ of degree at most $m_i - 1$ such that
\[ h = \sum_{i=1}^r \frac{f_i}{(X - \alpha_i)^{m_i}}. \]
In fact, if we fix $g = \sum_{i=0}^d a_i X^i$, the rational functions with denominator $g$ and negative degree
correspond exactly to the generating functions of linear recurrences with characteristic polynomial
$\hat{g} = X^d g(1/X) = \sum_{i=0}^d a_{d-i} X^i$. Indeed, suppose that $(u_n)_{n \ge 0}$ is such a sequence. Then, since
$u_{n+d} = -\frac{1}{a_0} \sum_{k=0}^{d-1} a_{d-k} u_{n+k}$, we have
\begin{align*}
h &:= \sum_{n=0}^{\infty} u_n X^n \\
&= \left( \sum_{n=0}^{d-1} u_n X^n \right) + \sum_{n=0}^{\infty} u_{n+d} X^{n+d} \\
&= \left( \sum_{n=0}^{d-1} u_n X^n \right) - \sum_{n=0}^{\infty} \frac{X^{n+d}}{a_0} \sum_{k=0}^{d-1} a_{d-k} u_{n+k} \\
&= \left( \sum_{n=0}^{d-1} u_n X^n \right) - \frac{1}{a_0} \sum_{k=0}^{d-1} a_{d-k} X^{d-k} \sum_{n=0}^{\infty} u_{n+k} X^{n+k} \\
&= \left( \sum_{n=0}^{d-1} u_n X^n + \sum_{k=0}^{d-1} \frac{a_{d-k}}{a_0} \sum_{n=0}^{k-1} u_n X^{d-k+n} \right) - \frac{1}{a_0} \sum_{k=0}^{d-1} a_{d-k} X^{d-k} \sum_{n=0}^{\infty} u_n X^n \\
&=: f - \frac{(g - a_0) h}{a_0}
\end{align*}
so $h = f - \frac{(g - a_0) h}{a_0}$, i.e. $h = \frac{f}{1 - \left(1 - \frac{g}{a_0}\right)} = \frac{a_0 f}{g}$ as wanted. Note that $f$ can be any polynomial of
degree less than $d$ since

\[ f = u_0 + (u_1 + u_0 a_1/a_0) X + (u_2 + u_1 a_1/a_0 + u_0 a_2/a_0) X^2 + \ldots \]

so we can pick the $u_i$ inductively to get any $f$ (equivalently, the matrix of the coefficients of $f$
represented as linear combinations of the $u_i$ is upper-triangular with no zero coordinate on the
diagonal so is invertible).

On the other hand, since the roots of $\hat{g}$ are the $1/\alpha_i$, by our characterisation of linear recurrences,
there are polynomials $f_i$ of degree at most $m_i - 1$ such that $u_n = \sum_{i=1}^r f_i(n) (1/\alpha_i)^n$. Hence, we
also have
\begin{align*}
h &= \sum_{n=0}^{\infty} X^n \sum_{i=1}^r f_i(n) (1/\alpha_i)^n \\
&= \sum_{i=1}^r \sum_{n=0}^{\infty} f_i(n) (X/\alpha_i)^n.
\end{align*}

Now, note that the $f_i$ are linear combinations of the polynomials $(X + k)(X + k - 1) \cdot \ldots \cdot (X + 1)$ so it suffices
to prove that the result holds for these polynomials. For this, simply note that differentiating $k$
times
\[ \sum_{n=0}^{\infty} (X/\alpha)^n = \frac{1}{1 - X/\alpha} = \frac{-\alpha}{X - \alpha} \]
and multiplying by $\alpha^k$ gives
\[ \sum_{n=0}^{\infty} (n + k)(n + k - 1) \cdot \ldots \cdot (n + 1) (X/\alpha)^n = \frac{k! \, (-\alpha)^{k+1}}{(X - \alpha)^{k+1}} \]
as wanted.

Remark C.4.2
This also works for fields of characteristic $p \neq 0$, but one needs the condition that no root has
multiplicity $\ge p + 1$, otherwise $n \mapsto f_i(n)$ could be identically zero without $f_i$ being zero (e.g.
$f_i = X^p - X$). For instance, in characteristic 2, the equation $u_{n+4} = u_n$ has characteristic polynomial $X^4 - 1 = (X - 1)^4$ but
the space of solutions of the form $\sum_i f_i(n) \alpha_i^n$ has dimension 2 while the space of solutions of
$u_{n+4} = u_n$ has dimension 4 so not all solutions have the wanted form.

Exercise C.4.1. Prove that Theorem C.4.1 holds in a field K of characteristic p 6= 0 as long as the multiplicities
of the roots of the characteristic polynomial are at most p. In particular, for a fixed characteristic equation, it
holds for sufficiently large p.

As a corollary, we get the following result, which is not obvious at first sight.

Corollary C.4.1

The product and sum of two linear recurrences are also linear recurrences.
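
A quick numerical illustration, with sequences of our own choosing: take the Fibonacci numbers $F_n$ (characteristic roots $\varphi, \psi$) and the powers $2^n$. By Theorem C.4.1, $F_n + 2^n$ is a combination of $\varphi^n, \psi^n, 2^n$, so it satisfies the recurrence with characteristic polynomial $(X^2 - X - 1)(X - 2) = X^3 - 3X^2 + X + 2$. Similarly $2^n F_n$ involves $2\varphi, 2\psi$ and has characteristic polynomial $X^2 - 2X - 4$, while $F_n^2$ involves $\varphi^2, \varphi\psi = -1, \psi^2$ and has characteristic polynomial $(X^2 - 3X + 1)(X + 1) = X^3 - 2X^2 - 2X + 1$. The Python sketch below (helper name `satisfies` is ours) checks these recurrences on the first 30 terms.

```python
def satisfies(seq, coeffs):
    """Check u_{n+k} = coeffs[0] u_n + ... + coeffs[k-1] u_{n+k-1} on the finite list seq."""
    k = len(coeffs)
    return all(
        seq[n + k] == sum(c * seq[n + i] for i, c in enumerate(coeffs))
        for n in range(len(seq) - k)
    )


fib = [0, 1]
while len(fib) < 30:
    fib.append(fib[-1] + fib[-2])
powers = [2 ** n for n in range(30)]

sums = [x + y for x, y in zip(fib, powers)]
products = [x * y for x, y in zip(fib, powers)]
squares = [x * x for x in fib]

assert satisfies(sums, [-2, -1, 3])      # u_{n+3} = 3 u_{n+2} - u_{n+1} - 2 u_n
assert satisfies(products, [4, 2])       # u_{n+2} = 2 u_{n+1} + 4 u_n
assert satisfies(squares, [-1, 2, 2])    # u_{n+3} = 2 u_{n+2} + 2 u_{n+1} - u_n
```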

C.5 Exercises
Vector Spaces and Bases
Exercise C.5.1 (Grassmann’s Formula). Let U be a vector space and V, W be two finite-dimensional
subspaces of U . Prove that

dim(V + W ) = dim V + dim W − dim(V ∩ W ).

Exercise C.5.2 (Noether’s Lemma). Let U, V, W be finite-dimensional vector spaces, and let ϕ : U →
V, ψ : V → W be linear maps. Suppose that im ϕ ⊆ ker ψ. Prove that there exists a linear map τ such
that ψ = τ ◦ ϕ. Similarly, if im ϕ ⊆ im ψ, prove that there is a linear map τ such that ϕ = ψ ◦ τ .

Exercise C.5.3† . Given a vector space $V$ of dimension $n$, we say a subspace $H$ of $V$ is a hyperplane
of $V$ if it has dimension $n - 1$. Prove that $H$ is a hyperplane of $K^n$ if and only if there are elements
$a_1, \ldots, a_n \in K$ not all zero such that

\[ H = \{(x_1, \ldots, x_n) \in K^n \mid a_1 x_1 + \ldots + a_n x_n = 0\}. \]
192 APPENDIX C. LINEAR ALGEBRA

Exercise C.5.4. Let $E$ be a vector space with subspaces $E_1, \ldots, E_n$. Suppose that $E_1 \cup \ldots \cup E_n$ is a vector space. Prove that one $E_i$ contains all others.

Exercise C.5.5. Let $z_1, \ldots, z_{n+1} \in \mathbb{C}$ be distinct complex numbers. Prove that $(X - z_1)^n, \ldots, (X - z_{n+1})^n$ is a $\mathbb{C}$-basis of the space of complex polynomials with degree at most $n$.

Determinants
Exercise C.5.6. Let $a_0, \ldots, a_{n-1}$ be elements of $K$ and $\omega$ a primitive $n$th root of unity. Prove that the circulant determinant

$$\begin{vmatrix} a_0 & a_1 & \cdots & a_{n-1} \\ a_{n-1} & a_0 & \cdots & a_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ a_1 & a_2 & \cdots & a_0 \end{vmatrix}$$

is equal to

$$f(1)f(\omega)f(\omega^2) \cdots f(\omega^{n-1})$$

where $f = a_0 + \ldots + a_{n-1}X^{n-1}$. Deduce that this determinant is congruent to $a_0 + \ldots + a_{p-1}$ modulo $p$ when $n = p$ is prime and $a_0, \ldots, a_{p-1}$ are integers.
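The congruence at the end of this exercise can be checked on small examples before attempting a proof. The sketch below is only an illustration with made-up helper names: it builds the circulant matrix with first row $a_0, \ldots, a_{p-1}$ and computes its determinant exactly with rational arithmetic.

```python
from fractions import Fraction

def determinant(matrix):
    """Exact determinant of an integer matrix by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # swapping two rows changes the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return int(det)  # the determinant of an integer matrix is an integer

def circulant(a):
    """Row i is the first row shifted i places to the right."""
    n = len(a)
    return [[a[(j - i) % n] for j in range(n)] for i in range(n)]

for p, a in [(3, [1, 2, 3]), (5, [2, 0, 1, 4, 3]), (7, [1, 1, 0, 5, 2, 3, 6])]:
    d = determinant(circulant(a))
    print(p, d, (d - sum(a)) % p == 0)
```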

Exercise C.5.7 (Cramer's Rule). Consider the system of equations $MV = X$ where $M$ is an $n \times n$ matrix with $\det M \neq 0$ and $V = (v_i)_{i \in [[1,n]]}$ and $X = (x_i)_{i \in [[1,n]]}$ are column vectors. Prove that, for any $k \in [[1, n]]$, $v_k$ is equal to $\det M_{k,X} / \det M$, where $M_{k,X}$ denotes the matrix $[M^1, \ldots, M^{k-1}, X, M^{k+1}, \ldots, M^n]$ obtained from $M$ by replacing the $k$th column by $X$.
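To see the formula $v_k = \det M_{k,X} / \det M$ in action, here is a minimal sketch for small integer systems. The helper names are hypothetical, and the Laplace expansion used for the determinant is chosen for brevity, not efficiency.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def cramer_solve(M, X):
    """Solve M V = X for integer M, X, assuming det M != 0."""
    d = det(M)
    if d == 0:
        raise ValueError("Cramer's rule needs an invertible matrix")
    solution = []
    for k in range(len(M)):
        # Replace the k-th column of M by X.
        Mk = [row[:k] + [X[i]] + row[k + 1:] for i, row in enumerate(M)]
        solution.append(Fraction(det(Mk), d))
    return solution

print(cramer_solve([[2, 1], [1, 3]], [3, 5]))
```

For this system the two determinants $\det M_{1,X} = 4$ and $\det M_{2,X} = 7$ are divided by $\det M = 5$.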

Exercise C.5.8† . Let (un )n≥0 be a sequence of elements of a field K. Suppose that the (m+1)×(m+1)
determinant det(un+i+j )i,j∈[[0,m]] is 0 for all sufficiently large n. Prove that there is some N such that
(un )n≥N is a linear recurrence of order at most m.

Exercise C.5.9†. Let $f_1, \ldots, f_n : \mathbb{N} \to \mathbb{C}$ be functions which grow at different rates, i.e.

$$\frac{f_1(m)}{f_2(m)}, \frac{f_2(m)}{f_3(m)}, \ldots, \frac{f_{n-1}(m)}{f_n(m)} \xrightarrow[m \to \infty]{} 0.$$

Prove that there exist $n$ integers $m_1, \ldots, m_n$ such that the tuples

$$(f_1(m_1), \ldots, f_n(m_1)), \ldots, (f_1(m_n), \ldots, f_n(m_n))$$

are linearly independent over $\mathbb{C}$.

Exercise C.5.10. Suppose that $K$ is an infinite field and that $A \in K^{n \times n}$ is such that $\det(A + M) = \det M$ for any $M \in K^{n \times n}$. Prove that $A = 0$.

Exercise C.5.11. Let $a_1, \ldots, a_n$ be integers. Prove that

$$\prod_{i<j} \frac{a_i - a_j}{i - j}$$

is an integer by expressing it as the determinant of a matrix with integer coordinates.
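Before looking for the determinant, one can convince oneself numerically that the product is an integer. The following sketch (illustrative only, with a made-up function name) evaluates it exactly with rational arithmetic; indices start at 0 here, which does not change the differences $i - j$.

```python
from fractions import Fraction
from itertools import combinations

def normalised_difference_product(a):
    """Product over i < j of (a_i - a_j) / (i - j), computed exactly."""
    result = Fraction(1)
    for i, j in combinations(range(len(a)), 2):
        result *= Fraction(a[i] - a[j], i - j)
    return result

for sample in ([1, 4, 9], [7, -2, 5, 11], [0, 1, 2, 3, 4]):
    value = normalised_difference_product(sample)
    print(sample, value, value.denominator == 1)
```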

Algebraic Combinatorics
Exercise C.5.12†. Let $A_1, \ldots, A_{n+1}$ be non-empty subsets of $[n]$. Prove that there exist disjoint subsets $I$ and $J$ of $[n + 1]$ such that

$$\bigcup_{i \in I} A_i = \bigcup_{j \in J} A_j.$$

Exercise C.5.13. Let p be a prime number and a1 , . . . , ap+1 real numbers. Suppose that, whenever
we remove one of the ai , we can divide the remaining ones into a certain amount of groups, depending
on i, each with the same arithmetic mean (and at least two groups). Prove that a1 = . . . = ap+1 .

Exercise C.5.14. We have n coins of unknown masses and a balance. We are allowed to place some
of the coins on one side of the balance and an equal number of coins on the other side. After thus
distributing the coins, the balance gives a comparison of the total mass of each side, either by indicating
that the two masses are equal or by indicating that a particular side is the more massive of the two.
Show that at least n − 1 such comparisons are required to determine whether all of the coins are of
equal mass.

The Characteristic Polynomial and Eigenvalues


Exercise C.5.15 (Characteristic Polynomial). Let $K$ be an algebraically closed field. Let $M \in K^{n \times n}$ be an $n \times n$ matrix. Define its characteristic polynomial as $\chi_M = \det(M - XI_n)$. Its roots (counted with multiplicity) are called the eigenvalues $\lambda_1, \ldots, \lambda_n \in K$ of $M$. Prove that $\det M$ is the product of the eigenvalues of $M$, and that $\operatorname{Tr} M$ is the sum of the eigenvalues. In addition, prove that $\lambda$ is an eigenvalue of $M$ if and only if there is a non-zero column vector $V$ such that $MV = \lambda V$ (in other words, $M$ acts like a homothety on $V$). Conclude that, if $f \in K[X]$ is a polynomial, the eigenvalues of $f(M)$ are $f(\lambda_i)$ (with multiplicity). (We are interpreting $1 \in K$ as $I_n$ for $f(M)$ here, i.e., if $f = X + 1$, $f(M)$ is $M + I_n$.) In particular, the eigenvalues of $M + \alpha I_n$ are $\lambda_1 + \alpha, \ldots, \lambda_n + \alpha$, and the eigenvalues of $M^k$ are $\lambda_1^k, \ldots, \lambda_n^k$.³

Exercise C.5.16 (Cayley-Hamilton Theorem). Prove that, for any $n \times n$ matrix $M$, $\chi_M(M) = 0$ where $\chi_M$ is the characteristic polynomial of $M$ and $0 = 0I_n$. Conclude that, if every eigenvalue of $M$ is zero, $M$ is nilpotent, i.e. $M^k = 0$ for some $k$.⁴

Exercise C.5.17. Let $A \in \mathbb{C}^{n \times n}$ be a Hermitian matrix, i.e. $A = \overline{A}^T$. Prove that all its eigenvalues are real.

Exercise C.5.18. Let M be a square matrix with integer coordinates and p a prime number. Prove
that Tr M p ≡ Tr M (mod p).
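The congruence $\operatorname{Tr} M^p \equiv \operatorname{Tr} M \pmod p$ is easy to test on small matrices. The sketch below is an illustration only, with hypothetical helper names and plain integer arithmetic.

```python
def mat_mul(A, B):
    """Product of two square integer matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(M, exponent):
    """M raised to a non-negative integer power by repeated squaring."""
    n = len(M)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    base = M
    while exponent:
        if exponent & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        exponent >>= 1
    return result

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

M = [[1, 2], [3, 4]]
for p in (2, 3, 5, 7):
    print(p, (trace(mat_pow(M, p)) - trace(M)) % p == 0)
```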

Exercise C.5.19. Let p be a prime number, and G be a finite (multiplicative) group of n × n matrices
with integer coordinates. Prove that two distinct elements of G stay distinct modulo p. What if the
elements of G only have algebraic integer coordinates and p is an algebraic integer with all conjugates
greater than 2 in absolute value?

Miscellaneous
Exercise C.5.20 (USA TST 2019). For which integers n does there exist a function f : Z/nZ → Z/nZ
such that
f, f + id, f + 2id, . . . , f + mid
are all bijections?

Exercise C.5.21 (Finite Fields Kakeya Conjecture, Zeev Dvir). Let $n \geq 1$ be an integer and $\mathbb{F}$ a finite field. We say a set $S \subseteq \mathbb{F}^n$ is a Kakeya set if it contains a line in every direction, i.e., for every $y \in \mathbb{F}^n$, there exists an $x \in \mathbb{F}^n$ such that $S$ contains the line $x + y\mathbb{F}$. Prove that any polynomial of degree less than $|\mathbb{F}|$ vanishing on a Kakeya set must be zero. Deduce that there is a constant $c_n > 0$ such that, for any finite field $\mathbb{F}$, any Kakeya set of $\mathbb{F}^n$ has cardinality at least $c_n|\mathbb{F}|^n$.

Exercise C.5.22 (Siegel's Lemma). Let $a = (a_{i,j})$ be an $m \times n$ matrix with integer coordinates. Prove that, if $n > m$, the system

$$\sum_{j=1}^{n} a_{i,j}x_j = 0$$

for $i = 1, \ldots, m$ always has a non-zero solution in integers with

$$\max_i |x_i| \leq \left(n \max_{i,j} |a_{i,j}|\right)^{\frac{m}{n-m}}.$$

3 One of the advantages of the characteristic polynomial is that we are able to use algebraic number theory, or more generally polynomial theory, to deduce linear algebra results, since the eigenvalues say a lot about a matrix (if we combine this with the Cayley-Hamilton theorem). See for instance Exercise C.5.18 and the third solution of Exercise C.5.19.

4 Note that if, in the definition of $\chi_M$, we replace det by an arbitrary multilinear form in the coordinates of $M$ (such as the permanent $\operatorname{perm}(A) = \sum_{\sigma \in S_n} a_{1,\sigma(1)} \cdots a_{n,\sigma(n)}$), the result becomes false, so we cannot just say that "$\chi_M(M) = \det(M - MI_n) = \det 0 = 0$" (this "proof" is nonsense because the scalar 0 is not the matrix 0, but the point is that this intuition is fundamentally incorrect).
Exercise C.5.23. Define the trace of a matrix as the sum of its diagonal coefficients, and the trace of a linear map as the trace of any matrix representing it. Prove that this doesn't depend on the basis chosen and is thus well-defined. In addition, let $L/K$ be a finite separable extension with embeddings $\sigma_1, \ldots, \sigma_n$ and let $\alpha \in L$. Prove that the trace of the linear map $x \mapsto x\alpha$ is

$$\sum_{i=1}^{n} \sigma_i(\alpha).$$

This function is called the trace $\operatorname{Tr}_{L/K}$ of $L/K$.


Exercise C.5.24. How many invertible n × n matrices are there in Fp ? Deduce the number of
(additive) subgroups of cardinality pm that (Z/pZ)n has.

Exercise C.5.25†. Let $K$ be a field, and let $S \subseteq K^2$ be a set of $n$ points. Prove that there exists a non-zero polynomial $f \in K[X, Y]$ of degree at most $\sqrt{2n}$ such that $f(x, y) = 0$ for every $(x, y) \in S$.
Exercise C.5.26† . Given an m×n matrix M , we define its row rank as the maximal number of linearly
independent rows of M . Similarly, its column rank is the maximal number of linearly independent
columns of M . Prove that these two numbers are the same, called the rank of M and denoted rank M .
Exercise C.5.27. Let A, B ∈ Rn×n . Prove that com(AB) = com(A) com(B).
Exercise C.5.28 (Nakayama's Lemma). Let $R$ be a commutative ring, $I$ an ideal of $R$, i.e. an $R$-module inside $R$⁵, and $M$ a finitely-generated $R$-module. Suppose that $IM = M$, where $IM$ does not mean the set of products of elements of $I$ and $M$, but instead the $R$-module it generates (i.e. the set of linear combinations of products). Prove that there exists an element $r \equiv 1 \pmod I$ of $R$ such that $rM = 0$.

5 See also Proposition C.3.5. A module is like a vector space but the underlying structure is not necessarily a field (in this case it's $R$).


Solutions

Chapter 1

Algebraic Numbers and Integers

1.1 Definition
Exercise 1.1.1. Is $\frac{i}{2}$ an algebraic integer?

Solution

Suppose that $f(i/2) = 0$ for some monic $f \in \mathbb{Z}[X]$. Note that the real part and imaginary part are both polynomials in $1/2$ with integer coefficients, and that one of them has leading coefficient $\pm 1$ (which one it is depends on the parity of $\deg f$). Thus $1/2$ would be an algebraic integer, contradicting Proposition 1.1.1. ∎

Exercise 1.1.2 (Rational Root Theorem). Let f ∈ Z[X] be a polynomial. Suppose that u/v is a
rational root of f , written in irreducible form. Prove that u divides the constant coefficient of f and
v divides its leading coefficient. (This is a generalisation of Proposition 1.1.1.)

Solution
Let $f = \sum_{i=0}^{n} a_iX^i$. We have

$$\sum_{i=0}^{n} a_iu^iv^{n-i} = 0.$$

Modulo $v$, we get $a_nu^n \equiv 0$, i.e. $v \mid a_n$ since $u$ and $v$ are coprime by assumption. Similarly, modulo $u$ we get $a_0v^n \equiv 0$, i.e. $u \mid a_0$. ∎
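The theorem turns the search for rational roots into a finite check: only fractions $\pm u/v$ with $u \mid a_0$ and $v \mid a_n$ need to be tried. A minimal sketch, assuming $a_0 \neq 0$ and $a_n \neq 0$ (the function names are made up for this illustration):

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a non-zero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of a_0 + a_1 X + ... + a_n X^n, with a_0 and a_n non-zero."""
    a0, an = coeffs[0], coeffs[-1]
    candidates = {
        Fraction(sign * u, v)
        for u in divisors(a0)
        for v in divisors(an)
        for sign in (1, -1)
    }
    return sorted(
        r for r in candidates
        if sum(c * r ** i for i, c in enumerate(coeffs)) == 0
    )

print(rational_roots([1, -3, 2]))  # 2X^2 - 3X + 1 = (2X - 1)(X - 1)
print(rational_roots([-2, 0, 1]))  # X^2 - 2 has no rational root
```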

1.2 Minimal Polynomial


Exercise 1.2.1∗ . Prove that the minimal polynomial of an algebraic number is irreducible and that
an irreducible polynomial is always the minimal polynomial of its roots.

Solution

Let α be an algebraic number. Assume, for the sake of a contradiction, that πα = f g with
0 < deg f, deg g < deg πα . Then, one of f or g must vanish at α, a contradiction since they have
smaller degree.


Conversely, let $\pi \in \mathbb{Q}[X]$ be a monic irreducible polynomial and let $\alpha$ be one of its roots. By Proposition 1.2.1, $\pi_\alpha \mid \pi$. Since $\pi$ is irreducible and both $\pi_\alpha$ and $\pi$ are monic, this must mean that they are equal. ∎

Exercise 1.2.2. Prove that $Y^4 - 3$ is irreducible in $\mathbb{Q}[Y]$.

Solution

The roots of $Y^4 - 3$ are $i^k\sqrt[4]{3}$ where $k = 0, \ldots, 3$. Note that none of these are rational so the only potential way to factorise $Y^4 - 3$ would be as a product of two degree 2 polynomials, but the constant term of such a degree 2 divisor would have the form $\pm i^k\sqrt{3}$ by Vieta's formulas A.1.4, which is not rational. This can also be seen as a special case of the Eisenstein criterion 5.1.4. ∎
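The last remark can be made concrete. The usual statement of the Eisenstein criterion for a prime $p$ is assumed here: $p$ does not divide the leading coefficient, $p$ divides every other coefficient, and $p^2$ does not divide the constant term. The sketch below only checks these hypotheses (the function name is hypothetical); it does not factor anything.

```python
def eisenstein_applies(coeffs, p):
    """Check the Eisenstein hypotheses for a_0 + a_1 Y + ... + a_n Y^n at the prime p."""
    *lower, leading = coeffs
    return (
        leading % p != 0
        and all(c % p == 0 for c in lower)
        and lower[0] % (p * p) != 0
    )

print(eisenstein_applies([-3, 0, 0, 0, 1], 3))  # Y^4 - 3 at p = 3
print(eisenstein_applies([-4, 0, 1], 2))        # Y^2 - 4 at p = 2
```

The second call fails the test, as it must, since $Y^2 - 4 = (Y - 2)(Y + 2)$ is reducible.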

Exercise 1.2.3∗ . Prove that any algebraic number of degree n has n distinct conjugates.

Solution

Suppose that $\alpha$ is an algebraic number of degree $n$ with fewer than $n$ distinct conjugates, i.e. its minimal polynomial $\pi$ has a double root. Then $\gcd(\pi, \pi')$ has degree at least 1 by Proposition A.1.3 and at most $n - 1$, and divides $\pi$. Thus $\pi$ is not irreducible, contradicting Exercise 1.2.1∗. ∎

Exercise 1.2.4∗ . Prove that the conjugates of an algebraic integer are also algebraic integers.

Solution

Let α be an algebraic integer, i.e. a root of a monic polynomial f ∈ Z[X]. Then, f (β) = 0 for
any conjugate β of α by Proposition 1.2.1 so any conjugate β is also an algebraic integer. 

Exercise 1.2.5. We call an algebraic number of degree 2 a quadratic number . Characterise quadratic
integers.

Solution

By Proposition 1.2.2, a quadratic integer $\alpha$ is a non-rational root of a monic polynomial $f \in \mathbb{Z}[X]$ of degree 2. Write $f = X^2 + uX + v$. Then,

$$\alpha = \frac{-u \pm \sqrt{u^2 - 4v}}{2}.$$

In particular, this has the form $\frac{a \pm \sqrt{b}}{2}$ for some $b \equiv 1 \pmod 4$ and odd $a$ if $u$ is odd, since the square of an odd rational integer is 1 mod 4, and the form $a \pm \sqrt{b}$ for $a, b \in \mathbb{Z}$ if $u$ is even. This is our wanted characterisation: the former is a root of $X^2 + uX + v$ where $u = -a$ and $u^2 - 4v = b$, which has a solution as $u^2 \equiv b \pmod 4$, while the latter is a root of $X^2 + uX + v$ where $u = -2a$ and $u^2 - 4v = 4b$ (again possible since $u^2 \equiv 4b \pmod 4$). ∎
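The first case of the characterisation is quick to test: $\frac{a + \sqrt{b}}{2}$ is a root of $X^2 - aX + \frac{a^2 - b}{4}$, so everything hinges on whether $\frac{a^2 - b}{4}$ is an integer. A minimal sketch (the function name is made up, and $b$ is assumed not to be a perfect square so that the number is genuinely quadratic):

```python
from fractions import Fraction

def half_form_polynomial(a, b):
    """Coefficients [c_0, c_1, 1] of the monic quadratic with roots (a ± sqrt(b)) / 2."""
    return [Fraction(a * a - b, 4), Fraction(-a), Fraction(1)]

def has_integer_coefficients(coeffs):
    return all(c.denominator == 1 for c in coeffs)

print(has_integer_coefficients(half_form_polynomial(1, 5)))  # golden ratio: X^2 - X - 1
print(has_integer_coefficients(half_form_polynomial(1, 3)))  # (1 + sqrt(3)) / 2
```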

1.3 Symmetric Polynomials


Exercise 1.3.1. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number with conjugates $\alpha_1, \ldots, \alpha_n$ and $f \in \mathbb{Q}[X_1, \ldots, X_n]$ be a symmetric polynomial. Show that $f(\alpha_1, \ldots, \alpha_n)$ is rational. Further, prove that if $\alpha$ is an algebraic integer and $f$ has integer coefficients, $f(\alpha_1, \ldots, \alpha_n)$ is in fact a rational integer.

Solution

Write f (X1 , . . . , Xn ) = g(e1 , . . . , en ) with g ∈ Z[X]. Then,

f (α1 , . . . , αn ) = g(e1 (α1 , . . . , αn ), . . . , en (α1 , . . . , αn ))

is a rational integer as ek (α1 , . . . , αn ) is ± the coefficient of X n−k of the minimal polynomial πα


of α by Vieta’s formulas A.1.4, and hence a rational integer. 

Exercise 1.3.2∗. Prove that $\overline{\mathbb{Z}}$ is closed under multiplication.

Solution

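Let $m$ and $n$ be the degrees of two algebraic integers $\alpha$ and $\beta$, with conjugates $\alpha_1, \ldots, \alpha_m$ and $\beta_1, \ldots, \beta_n$ respectively. The polynomial

$$f = \prod_{i,j} (X - \alpha_i\beta_j) = \prod_i \alpha_i^n \prod_j (X/\alpha_i - \beta_j) = \prod_i \alpha_i^n \pi_\beta(X/\alpha_i)$$

is symmetric as a polynomial (over the ring $\mathbb{Z}[X]$) in $\alpha_1, \ldots, \alpha_m$ (note that $Y^n\pi_\beta(X/Y)$ is indeed a polynomial in $Y$) and hence takes value in

$$\mathbb{Z}[X][e_1(\alpha_1, \ldots, \alpha_m), \ldots, e_m(\alpha_1, \ldots, \alpha_m)] = \mathbb{Z}[X].$$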

Exercise 1.3.3∗ . Prove Proposition 1.3.1.

Solution

Note that the assumption f ≡ g (mod m) implies that a ≡ b (mod m). By multiplying f and g
by the inverse of their leading coefficient modulo m, we may thus assume that they are monic.
Let s = h(e1 , . . . , en ) with h ∈ Z[X] be a symmetric polynomial in Z[X1 , . . . , Xn ]. Then,

$$e_k(\alpha_1, \ldots, \alpha_n) \equiv e_k(\beta_1, \ldots, \beta_n) \pmod m$$

by Vieta's formulas since $f \equiv g \pmod m$. Finally, this implies that

$$s(\alpha_1, \ldots, \alpha_n) = h(e_1(\alpha_1, \ldots, \alpha_n), \ldots, e_n(\alpha_1, \ldots, \alpha_n)) \equiv h(e_1(\beta_1, \ldots, \beta_n), \ldots, e_n(\beta_1, \ldots, \beta_n)) = s(\beta_1, \ldots, \beta_n)$$

as wanted. 

1.4 Worked Examples


Exercise 1.4.1∗. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Prove that there exists a rational integer $N \neq 0$ such that $N\alpha$ is an algebraic integer.

Solution

Let α be an algebraic number with minimal polynomial f = X n + . . . + a1 X + a0 ∈ Q[X]. Let


N be the lcm of the denominators of an−1 , . . . , a0 . Then,

N n f (X/N ) = X n + N an−1 X n−1 + N 2 an−2 X n−2 + . . . + N n−1 a1 X + N n a0

has integer coefficients and is zero at N α, as wanted. 
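The scaling $N^nf(X/N)$ is mechanical enough to compute directly. In the sketch below (an illustration with a made-up function name), the monic polynomial is given by its coefficients $a_0, \ldots, a_{n-1}, 1$ as fractions, and the coefficient of $X^i$ in the result is $N^{n-i}a_i$.

```python
from fractions import Fraction
from math import gcd

def clear_denominators(coeffs):
    """Given monic f = a_0 + ... + a_{n-1} X^{n-1} + X^n over Q, return (N, N^n f(X/N))."""
    coeffs = [Fraction(c) for c in coeffs]
    n = len(coeffs) - 1
    N = 1
    for c in coeffs:
        N = N * c.denominator // gcd(N, c.denominator)  # running lcm
    scaled = [c * N ** (n - i) for i, c in enumerate(coeffs)]
    return N, [int(c) for c in scaled]

# alpha = 1/sqrt(2) is a root of X^2 - 1/2, and 2*alpha = sqrt(2) is a root of X^2 - 2.
print(clear_denominators([Fraction(-1, 2), 0, 1]))
```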

1.5 Exercises
Elementary-Looking Problems
Exercise 1.5.1†. Find all non-zero rational integers $a, b, c \in \mathbb{Z}$ such that $\frac{a}{b} + \frac{b}{c} + \frac{c}{a}$ and $\frac{b}{a} + \frac{c}{b} + \frac{a}{c}$ are also integers.

Solution

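Notice that the polynomial

$$\left(X - \frac{a}{b}\right)\left(X - \frac{b}{c}\right)\left(X - \frac{c}{a}\right) = X^3 - \left(\frac{a}{b} + \frac{b}{c} + \frac{c}{a}\right)X^2 + \left(\frac{b}{a} + \frac{c}{b} + \frac{a}{c}\right)X - 1$$

has integer coefficients by assumption. Thus, $\frac{a}{b}, \frac{b}{c}, \frac{c}{a}$ are rational algebraic integers, i.e. rational integers. Since their product is 1, they must all be $\pm 1$, i.e. $|a| = |b| = |c|$. Conversely, these clearly work. ∎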
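The conclusion $|a| = |b| = |c|$ can be confirmed by brute force on a small box. The sketch below is an illustration only; the bound `R` and the function name are arbitrary choices made for this example.

```python
from fractions import Fraction
from itertools import product

def integer_sum_triples(R):
    """All (a, b, c) with non-zero entries in [-R, R] making both sums integers."""
    values = [v for v in range(-R, R + 1) if v != 0]
    found = []
    for a, b, c in product(values, repeat=3):
        first = Fraction(a, b) + Fraction(b, c) + Fraction(c, a)
        second = Fraction(b, a) + Fraction(c, b) + Fraction(a, c)
        if first.denominator == 1 and second.denominator == 1:
            found.append((a, b, c))
    return found

triples = integer_sum_triples(6)
print(len(triples), all(abs(a) == abs(b) == abs(c) for a, b, c in triples))
```

With `R = 6` there are 6 possible common absolute values and 8 sign patterns, which accounts for every triple found.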

Exercise 1.5.3† (USAMO 2009). Let $(a_n)_{n \geq 0}$ and $(b_n)_{n \geq 0}$ be two non-constant sequences of rational numbers such that $(a_i - a_j)(b_i - b_j) \in \mathbb{Z}$ for any $i, j$. Prove that there exists a non-zero rational number $r$ such that $r(a_i - a_j)$ and $\frac{b_i - b_j}{r}$ are integers for any $i, j$.

Solution

Without loss of generality, by translating the sequences, we may assume that a0 = b0 = 0. Thus,
setting j = 0, we have ai bi ∈ Z for all i. The condition (ai − aj )(bi − bj ) ∈ Z then reads
ai bj + aj bi ∈ Z. We deduce that ai bj and aj bi are algebraic integers, since they are roots of

(X − ai bj )(X − aj bi ) = X 2 − X(ai bj + aj bi ) + ai bi aj bj ∈ Z[X].

Since they are rational, they must be rational integers, i.e. ai bj ∈ Z for every i, j. Choose k such
that ak 6= 0, there exists one since (an )n≥0 is non-constant. Let d be the gcd of the numbers
$a_kb_j$. Then, $r = a_k/d$ works (up to replacing $r$ by $1/r$ to match the statement). Indeed, we already have that $rb_j \in \mathbb{Z}$ for all $j$ so it remains to show that $a_i/r \in \mathbb{Z}$ for all $i$. By Bézout's lemma, $d$ is a linear combination of some $a_kb_j$, so that $a_i/r = da_i/a_k$ is a linear combination of some $a_ib_j$ and thus an integer as wanted. ∎

Exercise 1.5.5† (Adapted from Irish Mathematical Olympiad 1998). Let x ∈ R be a real number
such that both x2 − x and xn − x for some n ≥ 3 are rational. Prove that x is rational.

Solution

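Let $a = x^2 - x$. Suppose for the sake of a contradiction that $x$ is irrational, so that its minimal polynomial is $X^2 - X - a$. Let $y$ be its other conjugate. Then, $x^n - x = y^n - y$. Since the roots of $X^2 - X - a$ are $\frac{1 \pm \sqrt{4a+1}}{2}$, we get

$$\left(\frac{1}{2} + \delta\right)^n - \left(\frac{1}{2} + \delta\right) = \left(\frac{1}{2} - \delta\right)^n - \left(\frac{1}{2} - \delta\right),$$

where $\delta = \frac{\sqrt{4a+1}}{2}$. We shall prove that this is only possible for $\delta \in \{0, \pm 1/2\}$, which is a contradiction since $\delta$ is irrational by assumption. Since this equation is symmetric between $\delta$ and $-\delta$, it suffices to prove that it has no positive solution $\delta \neq \frac{1}{2}$. By dividing it by $\delta - 1/2$, we wish to show that

$$\frac{\left(\frac{1}{2} + \delta\right)^n - 1}{\delta - \frac{1}{2}} + \left(\frac{1}{2} - \delta\right)^{n-1} - 2 = \sum_{i=0}^{n-1} \left(\frac{1}{2} + \delta\right)^i + \left(\frac{1}{2} - \delta\right)^{n-1} - 2$$

is positive for positive $\delta$.

Suppose first that $\delta \leq \frac{1}{2}$. Then, we have

$$\sum_{i=0}^{n-2} \left(\frac{1}{2} + \delta\right)^i > \sum_{i=0}^{n-2} \left(\frac{1}{2}\right)^i = 2 - \left(\frac{1}{2}\right)^{n-2}$$

and

$$\left(\frac{1}{2} + \delta\right)^{n-1} + \left(\frac{1}{2} - \delta\right)^{n-1} \geq \left(\frac{1}{2}\right)^{n-1} + \left(\frac{1}{2}\right)^{n-1} = \left(\frac{1}{2}\right)^{n-2}$$

by the power mean inequality.

Now, if $\delta \geq \frac{1}{2}$ the inequality is trivial: since $n \geq 3$ we have

$$\sum_{i=0}^{n-2} \left(\frac{1}{2} + \delta\right)^i \geq 1 + 1 = 2$$

and

$$\left(\frac{1}{2} + \delta\right)^{n-1} + \left(\frac{1}{2} - \delta\right)^{n-1} > 0.$$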

Exercise 1.5.9†. Let $|x| < 1$ be a complex number. Define

$$S_n = \sum_{k=0}^{\infty} k^nx^k.$$

Suppose that there is an integer $N \geq 0$ such that $S_N, S_{N+1}, \ldots$ are all rational integers. Prove that $S_n$ is a rational integer for any integer $n \geq 0$.

Solution
We shall prove that $S_0 = \frac{1}{1-x}$ is a rational integer, and that this implies that $S_n$ is a rational integer for all $n$. By differentiating the equality

$$\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}$$

$n$ times, we get

$$R_n := \sum_{k=0}^{\infty} (k+1)(k+2) \cdots (k+n)x^k = \frac{n!}{(1-x)^{n+1}}.$$

Define $f_n(X) := (X+1) \cdots (X+n)$. Since each $f_n$ is monic and has degree $n$, they form a $\mathbb{Z}$-basis of $\mathbb{Z}[X]$, meaning that any element of $\mathbb{Z}[X]$ can be represented as a linear combination $\sum_k a_kf_k$ for some $a_k \in \mathbb{Z}$. This is in particular the case for $X^n$, say $X^n = \sum_{k=0}^{n} a_{k,n}f_k$ so that

$$S_n = \sum_{k=0}^{n} a_{k,n}R_k.$$

This shows that, if $\frac{1}{1-x}$ is a rational integer, then so is $R_n$ for all $n$ and hence $S_n$ for all $n$.

Now, note that $S_N$ is a polynomial with integer coefficients in $\frac{1}{1-x}$, so $\frac{1}{1-x}$ and thus $x$ is algebraic. Now, write $f_n$ as a linear combination of the $X^k$, i.e. $f_n = \sum_{k=0}^{n} b_{k,n}X^k$ (this is just regular expansion). Let $p$ be a large rational prime which divides neither the numerator nor the denominator of the norm of $1 - x$, so that $1 - x$ is invertible modulo $p$ by Exercise 1.5.24†. Then,

$$R_n = \sum_{k=0}^{n} b_{k,n}S_k = \frac{n!}{(1-x)^{n+1}}$$

is divisible by $p$ for $n \geq p$. Thus, we deduce that $\sum_{k=0}^{N} b_{k,n}S_k$ is congruent to a rational integer modulo $p$. The idea now is to take many $n$ so that the vectors $(b_{0,n}, \ldots, b_{N,n})$ are linearly independent using Exercise C.5.9†, i.e. so that they have a non-zero determinant. Then, for $p$ sufficiently large, the determinant is also non-zero modulo $p$. Since the inverse of a matrix with coordinates in $\mathbb{F}_p$ also has coordinates in $\mathbb{F}_p$, this implies that $S_0, \ldots, S_N$ are congruent to rational integers modulo $p$. By taking $p$ sufficiently large and using Exercise 1.5.26†, we deduce that $S_0, \ldots, S_N$ are rational numbers. Finally, we shall look at the $p$-adic valuation of $S_0 = \frac{1}{1-x}$ to prove that it's an integer.

To use Exercise C.5.9†, we need to prove that $b_{0,n}, \ldots, b_{N,n}$ all grow at different rates. We have

$$b_{k,n} = n! \sum_{1 \leq i_1 < \ldots < i_k \leq n} \frac{1}{i_1 \cdots i_k} \sim \frac{n!}{k!}\left(\sum_{i=1}^{n} \frac{1}{i}\right)^k \sim \frac{n! \log(n)^k}{k!}$$

which shows that the assumptions are satisfied. As said before, we get that $S_0, \ldots, S_N$ are congruent to rational integers modulo $p$. Thus, for $p$ sufficiently large, they must all be rational.

Finally, suppose some prime $p$ divides the denominator of $\frac{1}{1-x}$. Since $b_{k,n}$ for a fixed $k$ eventually becomes divisible by anything, $\sum_{k=0}^{N} b_{k,n}S_k$ is a rational integer for sufficiently large $n$. However, it is congruent modulo 1 to $\frac{n!}{(1-x)^{n+1}}$ which is not an integer since $v_p(n!) < n$ by Legendre's formula 8.3.5; this is a contradiction. ∎
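The first half of the argument is effectively an algorithm for computing $S_n$: expand $X^n$ in the basis $f_k = (X+1) \cdots (X+k)$ and use the closed form $R_k = \frac{k!}{(1-x)^{k+1}}$, which follows from differentiating the geometric series $k$ times. The sketch below, with made-up function names, carries this out exactly for a rational $x$ with $|x| < 1$.

```python
from fractions import Fraction
from math import factorial

def rising_poly(k):
    """Coefficients (lowest degree first) of f_k = (X + 1)(X + 2)...(X + k)."""
    poly = [1]
    for j in range(1, k + 1):
        shifted = [0] * (len(poly) + 1)
        for d, c in enumerate(poly):
            shifted[d] += j * c   # contribution of the constant j
            shifted[d + 1] += c   # contribution of X
        poly = shifted
    return poly

def monomial_in_basis(n):
    """Integers a_0, ..., a_n with X^n = sum of a_k f_k."""
    remainder = [0] * n + [1]
    coeffs = [0] * (n + 1)
    for k in range(n, -1, -1):
        coeffs[k] = remainder[k]  # f_k is monic of degree k
        for d, c in enumerate(rising_poly(k)):
            remainder[d] -= coeffs[k] * c
    return coeffs

def S(n, x):
    """Exact value of the sum over k >= 0 of k^n x^k."""
    x = Fraction(x)
    return sum(a * factorial(k) / (1 - x) ** (k + 1)
               for k, a in enumerate(monomial_in_basis(n)))

print([S(n, Fraction(1, 2)) for n in range(4)])
```

For $x = \frac{1}{2}$ we have $\frac{1}{1-x} = 2$, so every $S_n$ is an integer, in line with the first claim of the solution.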

Exercise 1.5.10† . Let n ≥ 3 be an integer. Suppose that there exist a regular n-gon with integer
coordinates. Prove that n = 4.

Solution

Let $A, B, C$ be three consecutive vertices. Since the sum of the angles of the $n$-gon is $(n-2)\pi$, the angle $\angle ABC$ is $\pi\frac{n-2}{n} = \pi - \frac{2\pi}{n}$. By the cosine law, we have

$$\frac{1 + \cos\frac{4\pi}{n}}{2} = \cos^2\frac{2\pi}{n} = \cos^2\left(\pi - \frac{2\pi}{n}\right) = \frac{(AB^2 + BC^2 - AC^2)^2}{4AB^2 \cdot BC^2}.$$

Since $A, B, C$ have integer coordinates, this is rational so $\cos\frac{4\pi}{n}$ is rational. Finally, using Problem 1.1.1, we must have $n \in \{1, 2, 4, 8\}$. It remains to prove that $n = 8$ is impossible.

We have $\angle BAC = \angle BCA = \frac{\pi}{n}$. By the sine law,

$$\frac{AB^2}{\sin^2\frac{\pi}{n}} = \frac{AC^2}{\sin^2\frac{2\pi}{n}}.$$

This is impossible since the RHS is rational while the LHS isn't for $n = 8$ (we are again using a double-angle identity, here $\sin(x)^2 = \frac{1 - \cos(2x)}{2}$). ∎

Exercise 1.5.11† . Let P be a polygon with rational sidelengths for which there exists a real number
α ∈ R such that all its angles are rational multiples of α, except possibly one. Prove that cos α is
algebraic.

Solution

Note that $\cos\alpha$ being algebraic is equivalent to $\exp(i\alpha)$ being algebraic, since if $z$ is algebraic then so is $z + \overline{z} = 2\Re(z)$. Without loss of generality, we may assume that the angles which are rational multiples of $\alpha$ are in fact positive integer multiples of $\alpha$, by rescaling $\alpha$.

Let $X_k$ denote the $k$th vertex and let $n$ denote the number of vertices. Represent the polygon by complex numbers: $X_k$ is represented by $x_k \in \mathbb{C}$. We have

$$\frac{x_k - x_{k+1}}{x_k - x_{k-1}} = \exp(i\angle X_{k-1}X_kX_{k+1})\frac{|x_k - x_{k+1}|}{|x_k - x_{k-1}|}.$$

Without loss of generality, we may assume that the angle which is potentially not an integer multiple of $\alpha$ is $\angle X_{n-1}X_nX_1$, so that $\angle X_{k-1}X_kX_{k+1}$ is $a_k\alpha$ for some $a_k \in \mathbb{N}$ for $k = 1, \ldots, n-1$. Denote also $\frac{|x_k - x_{k+1}|}{|x_k - x_{k-1}|}$ by $r_k \in \mathbb{Q}$. Thus, the condition reads

$$\frac{x_k - x_{k+1}}{x_k - x_{k-1}} = r_k\exp(ia_k\alpha).$$

By rescaling the $x_i$, we may assume that $x_1 - x_n = 1$. Thus,

$$x_1 - x_2 = r_1\exp(ia_1\alpha) := s_1\exp(ib_1\alpha).$$

This implies that

$$x_2 - x_3 = (x_2 - x_1)r_2\exp(ia_2\alpha) = -r_1r_2\exp(i(a_1 + a_2)\alpha) := s_2\exp(ib_2\alpha).$$

Continuing like that, we get

$$x_k - x_{k+1} = -r_ks_{k-1}\exp(i(b_{k-1} + a_k)\alpha) := s_k\exp(ib_k\alpha)$$

for some $s_k \in \mathbb{Q}$ and $b_k \in \mathbb{N}$. Finally, since

$$\sum_{k=1}^{n-1} s_k\exp(ib_k\alpha) = \sum_{k=1}^{n-1} (x_k - x_{k+1}) = x_1 - x_n = 1,$$

$\exp(i\alpha)$ is algebraic as wanted since it's a root of

$$s_{n-1}X^{b_{n-1}} + s_{n-2}X^{b_{n-2}} + \ldots + s_1X^{b_1} - 1.$$

Exercise 1.5.15†. Let $\omega_1, \ldots, \omega_m$ be $n$th roots of unity. Prove that $|\omega_1 + \ldots + \omega_m|$ is either zero or greater than $m^{-n}$.

Solution

Let $\omega = \exp\left(\frac{2i\pi}{n}\right)$ be a primitive $n$th root of unity so that each $\omega_i$ is a power of $\omega$, say $\omega^{k_i}$. We shall multiply

$$\omega_1 + \ldots + \omega_m = \omega^{k_1} + \ldots + \omega^{k_m}$$

by its conjugates, which are among $\omega^{\ell k_1} + \ldots + \omega^{\ell k_m}$ by the fundamental theorem of symmetric polynomials. Suppose that it is non-zero. Taking the product over its conjugates, we get that

$$\prod \left(\omega^{\ell k_1} + \ldots + \omega^{\ell k_m}\right)$$

is a non-zero rational integer. Thus, it is at least one in absolute value. Since $|\omega^{\ell k_1} + \ldots + \omega^{\ell k_m}| \leq m$ by the triangular inequality, we finally get

$$m^{n-1}|\omega_1 + \ldots + \omega_m| \geq 1,$$

i.e. $|\omega_1 + \ldots + \omega_m| \geq m^{1-n} \geq m^{-n}$ as wanted. ∎
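For small $n$ and $m$ the bound $m^{1-n}$ can be compared with the true minimum by exhaustive search. The sketch below is a floating-point illustration only: the tolerance `tol` used to decide whether a sum is zero is an ad hoc choice that is safe for the tiny cases shown, and the function name is made up.

```python
import cmath
from itertools import combinations_with_replacement

def smallest_nonzero_sum(n, m, tol=1e-9):
    """Smallest non-zero |w_1 + ... + w_m| over all choices of n-th roots of unity."""
    roots = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]
    best = None
    for combo in combinations_with_replacement(range(n), m):
        value = abs(sum(roots[k] for k in combo))
        if value > tol and (best is None or value < best):
            best = value
    return best

for n in range(1, 7):
    for m in range(1, 4):
        print(n, m, smallest_nonzero_sum(n, m), m ** (1 - n))
```

The output suggests that the proven bound is far from sharp, which matches the improvement to $m^{1-\varphi(n)}$ noted in the remark that follows.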

Remark 1.5.1
In fact, since $\omega$ has only $\varphi(n)$ conjugates and not $n$ (see Chapter 3), we get the stronger bound $|\omega_1 + \ldots + \omega_m| \geq m^{1-\varphi(n)}$.

Exercise 1.5.16†. Let $n \geq 1$ and $n_1, \ldots, n_k$ be integers. Prove that

$$\left|\cos\left(\frac{2\pi n_1}{n}\right) + \ldots + \cos\left(\frac{2\pi n_k}{n}\right)\right|$$

is either zero or greater than $\frac{1}{2(2k)^{n/2}}$.

Solution

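We imitate our proof of Exercise 1.5.15†: since $2\cos\left(\frac{2\ell\pi}{n}\right) = \exp\left(\frac{2\ell i\pi}{n}\right) + \exp\left(-\frac{2\ell i\pi}{n}\right)$, the fundamental theorem of symmetric polynomials shows that

$$\prod_{\ell} \left(2\cos\left(\frac{2\ell n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2\ell n_k\pi}{n}\right)\right)$$

is an integer, where the product is taken over the conjugates of $2\cos\left(\frac{2n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2n_k\pi}{n}\right)$. Suppose that it is non-zero, so that this product is non-zero too. Then, as

$$\left|2\cos\left(\frac{2\ell n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2\ell n_k\pi}{n}\right)\right| \leq 2k$$

by the triangular inequality, we have

$$(2k)^{n/2}\left|2\cos\left(\frac{2n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2n_k\pi}{n}\right)\right| \geq \left|\prod_{\ell} \left(2\cos\left(\frac{2\ell n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2\ell n_k\pi}{n}\right)\right)\right| \geq 1$$

since $2\cos\left(\frac{2n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2n_k\pi}{n}\right)$ has at most $\left\lceil\frac{n+1}{2}\right\rceil \leq n/2 + 1$ conjugates (since $\cos x = \cos(2\pi - x)$ so half the potential conjugates get discarded). Thus, we get

$$\left|\cos\left(\frac{2\pi n_1}{n}\right) + \ldots + \cos\left(\frac{2\pi n_k}{n}\right)\right| \geq \frac{1}{2(2k)^{n/2}}$$

as wanted. ∎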

Remark 1.5.2
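In fact, since $\cos\frac{2\ell\pi}{n}$ has only $\varphi(n)/2$ conjugates for $n > 2$ and not $n$ (see Chapter 3), we get the stronger bound

$$\left|\cos\left(\frac{2\pi n_1}{n}\right) + \ldots + \cos\left(\frac{2\pi n_k}{n}\right)\right| \geq \frac{k}{2(2k)^{\varphi(n)/2}}$$

for $n > 2$ (for $n \leq 2$ we get the bound 1).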

Exercise 1.5.17† (USA TST 2014). Let $N$ be an integer. Prove that there exists a rational prime $p$ and an element $\alpha \in \mathbb{F}_p^\times$ such that the orbit $\{1, \alpha, \alpha^2, \ldots\}$ has cardinality at least $N$ and is sum-free, meaning that $\alpha^i + \alpha^j \neq \alpha^k$ for any $i, j, k$. (You may assume that, for any $n$, there exist infinitely many primes for which there is an element of order $n$ in $\mathbb{F}_p$. This will be proven in Chapter 3.)

Solution

Let us fix the order $n$ of $\alpha$ and suppose that any prime $p$ for which there is an element of order $n$ fails. Let $\omega = \exp\left(\frac{2i\pi}{n}\right)$ be a primitive $n$th root of unity. By Proposition 1.3.1, we have

$$\prod_{i,j,k} \left(\alpha^i + \alpha^j - \alpha^k\right) \equiv \prod_{i,j,k} \left(\omega^i + \omega^j - \omega^k\right).$$

Thus, if the LHS is divisible by infinitely many primes, the RHS must be zero. However, it is easy to see that, when $\zeta$ is a root of unity, $\zeta + 1$ is also one if and only if $\zeta = \exp\left(\pm\frac{2i\pi}{3}\right)$. Indeed, if $\zeta = \exp(i\theta)$, then, by looking at the imaginary part of $\zeta + 1$, we get $\zeta + 1 = \exp(i(\pi - \theta))$. Then, by looking at the real parts we get $\cos(\theta) + 1 = \cos(\pi - \theta) = -\cos(\theta)$ which gives $\cos(\theta) = -\frac{1}{2}$ as wanted. Thus, if $6 \nmid n$, $\omega^i + \omega^j = \omega^k$ is impossible since this implies $\omega^{i-j} + 1 = \omega^{k-j}$ so $\omega^{k-j}$ would be $\exp\left(\pm\frac{2i\pi}{6}\right)$. We are done: we just need to choose an $n \geq N$ which is not divisible by 3. (As a bonus, if $6 \mid n$ then any $p$ fails since $\alpha^{n/3} + 1 = -\alpha^{2n/3} = \alpha^{n/6}$, while for $6 \nmid n$ we have proven that all sufficiently large $p$ work.) ∎
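The definition of a sum-free orbit is easy to test for a given pair $(p, \alpha)$. The sketch below (helper names are made up) checks two pairs with $n = 4$, an order not divisible by 3: the small prime $p = 5$ fails while $p = 13$ works. It also checks one pair with $n = 6$, where the solution predicts failure for every $p$.

```python
def orbit(alpha, p):
    """The powers 1, alpha, alpha^2, ... of a non-zero alpha modulo the prime p."""
    seen, x = [], 1
    while x not in seen:
        seen.append(x)
        x = x * alpha % p
    return seen

def is_sum_free(alpha, p):
    """True when no two powers of alpha (possibly equal) sum to a power of alpha mod p."""
    powers = set(orbit(alpha, p))
    return all((a + b) % p not in powers for a in powers for b in powers)

print(orbit(2, 5), is_sum_free(2, 5))    # order 4, but 1 + 1 = 2 lies in the orbit
print(orbit(5, 13), is_sum_free(5, 13))  # order 4 and sum-free
print(orbit(3, 7), is_sum_free(3, 7))    # order 6
```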

Properties of Algebraic Numbers


Exercise 1.5.20†. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number with conjugates $\alpha_1, \ldots, \alpha_n$ and $f \in \mathbb{Q}[X]$ be a polynomial. Prove that the $m$ conjugates of $f(\alpha)$ are each represented exactly $\frac{n}{m}$ times among $f(\alpha_1), \ldots, f(\alpha_n)$.

Solution

First, note that these are all conjugates since if $f(\alpha)$ is a root of $g$, then $\alpha$ is a root of $g \circ f$ so the same goes for its conjugates. Second, note that there are no other conjugates since $\prod_{i=1}^{n} (X - f(\alpha_i))$ has rational coefficients by the fundamental theorem of symmetric polynomials. Finally, since the roots of this polynomial are exactly the roots of $\pi_{f(\alpha)}$ (which is irreducible), it must be a power of $\pi_{f(\alpha)}$, say $\pi_{f(\alpha)}^k$. Then, each $f(\alpha_i)$ is repeated $k$ times and $f(\alpha)$ has degree $\frac{n}{k}$ as wanted. ∎

Exercise 1.5.21. Let $\alpha_1, \ldots, \alpha_m \in \overline{\mathbb{Q}}$ be algebraic numbers and $f \in \mathbb{Q}[X_1, \ldots, X_m]$ a polynomial. Denote the conjugates of $\alpha_k$ by $\alpha_k^{(1)}, \ldots, \alpha_k^{(n_k)}$. Prove that the conjugates of $f(\alpha_1, \ldots, \alpha_m)$ are among

$$\{f(\alpha_1^{(i_1)}, \ldots, \alpha_m^{(i_m)}) \mid i_k = 1, \ldots, n_k\}.$$

Solution

This is a consequence of the fundamental theorem of symmetric polynomials. (For a rigorous


proof, one may induct on m.) 

Exercise 1.5.22†. Let $f \in \overline{\mathbb{Z}}[X]$ be a monic polynomial and $\alpha$ be one of its roots. Prove that $\alpha$ is an algebraic integer.

Solution

Let $f = X^n + a_{n-1}X^{n-1} + \ldots + a_0 \in \overline{\mathbb{Z}}[X]$ be a polynomial and let $a_k^{(1)}, \ldots, a_k^{(n_k)}$ be the conjugates of $a_k$. The fundamental theorem of symmetric polynomials then shows that the polynomial

$$\prod_{i_0, \ldots, i_{n-1}} \left(X^n + a_{n-1}^{(i_{n-1})}X^{n-1} + \ldots + a_0^{(i_0)}\right)$$

has integer coefficients. Since it is monic, its roots are algebraic integers, and since it is divisible by $f$, the same goes for the roots of $f$.

Alternatively, one can use Proposition C.3.5: M = Z[an−1 , . . . , a0 ] is a finitely generated Z-


module such that αM ⊆ M for any root α of f . 

Exercise 1.5.23†. We say an algebraic integer $\alpha \in \overline{\mathbb{Z}}$ is a unit if there exists an algebraic integer $\alpha' \in \overline{\mathbb{Z}}$ such that $\alpha\alpha' = 1$. Characterise all units.

Solution

Let $\alpha \in \overline{\mathbb{Q}}$ be a non-zero algebraic number and let $f = X^n + a_{n-1}X^{n-1} + \ldots + a_0$ be its minimal polynomial. Then, $1/\alpha$ is a root of $a_0X^n + a_1X^{n-1} + \ldots + 1$ which shows that its degree is at most $n$. By reiterating this process, we get that the degree of $\alpha$ is also at most the degree of $1/\alpha$, which implies that we have equality. Hence

$$X^n + \frac{a_1}{a_0}X^{n-1} + \ldots + \frac{1}{a_0}$$

is the minimal polynomial of $1/\alpha$. In particular, for $\alpha \in \overline{\mathbb{Z}}$, $1/\alpha$ is an algebraic integer if and only if $a_0 \mid 1$, i.e. $a_0 = \pm 1$. This is also equivalent to $|N(\alpha)| = 1$. An alternative solution is given in Exercise 7.1.1∗: if $\alpha$ is invertible then so are its conjugates so the same goes for the product of its conjugates, i.e. its norm. Hence, $|N(\alpha)| = 1$ if $\alpha$ is invertible. Conversely, if $N(\alpha) = \pm 1$, then $\pm\alpha_2 \cdots \alpha_n$ is the inverse of $\alpha$, where $\alpha_2, \ldots, \alpha_n$ are its conjugates distinct from itself. ∎

Exercise 1.5.24†. Let $m$ be a rational integer. We say an algebraic integer $\alpha \in \overline{\mathbb{Z}}$ is a unit mod $m$ if there exists an algebraic integer $\alpha' \in \overline{\mathbb{Z}}$ such that $\alpha\alpha' \equiv 1 \pmod m$. Characterise all units mod $m$.

Solution

Here we imitate the second solution of Exercise 1.5.23†. If $\alpha$ is invertible modulo $m$, then so are its conjugates so the same goes for its norm. Conversely, if its norm is invertible modulo $m$, then $(N(\alpha))^{-1}\alpha_2 \cdots \alpha_n$ is the inverse of $\alpha$, where $\alpha_2, \ldots, \alpha_n$ are its conjugates distinct from itself. ∎

Exercise 1.5.26†. Let $\alpha \in \overline{\mathbb{Z}}$ be a non-rational algebraic integer. Prove that there are only finitely many rational integers $m$ such that $\alpha$ is congruent to a rational integer mod $m$.

Solution

Suppose that α ≡ k (mod m) for some k ∈ Z. Then, all its conjugates are also congruent to k
modulo m (by conjugating both sides) which implies

πα ≡ (X − k)n (mod m)

where n ≥ 2 is the degree of α. In particular, πα has a double root modulo m, so m divides


the discriminant of πα by Remark 1.3.2. Since πα has distinct roots by Exercise 1.2.3∗ , the
discriminant is non-zero so there are indeed a finite number of such m. 

Exercise 1.5.27† (Kronecker's Theorem). Let $\alpha \in \overline{\mathbb{Z}}$ be a non-zero algebraic integer such that all its conjugates have modulus at most 1. Prove that it is a root of unity.

Solution

Let $\alpha_1, \ldots, \alpha_n$ be the conjugates of $\alpha$. By the fundamental theorem of symmetric polynomials,

$$f_m := \prod_{i=1}^{n} (X - \alpha_i^m)$$

has integer coefficients for all positive integers $m$. Moreover, its coefficients are bounded: by the triangular inequality,

$$|e_k(\alpha_1^m, \ldots, \alpha_n^m)| = \left|\sum_{i_1 < \ldots < i_k} \alpha_{i_1}^m \cdots \alpha_{i_k}^m\right| \leq \sum_{i_1 < \ldots < i_k} 1 = \binom{n}{k}.$$

Thus, there exist $r$ and $s > 0$ such that $f_{2^r} = f_{2^{r+s}}$. This means that raising the roots $\alpha_i^{2^r}$ of $f_{2^r}$ to the $2^s$th power permutes them. If we iterate this permutation starting from $\alpha^{2^r}$, we will eventually cycle back to $\alpha^{2^r}$, i.e.

$$\alpha^{2^r} = \alpha^{2^{r+ks}}$$

for some $k > 0$. Since $\alpha \neq 0$, it must be a root of unity. ∎

Exercise 1.5.28†. Determine all non-zero algebraic integers $\alpha \in \overline{\mathbb{Z}}$ such that all its conjugates are real and have modulus at most 2.

Solution

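Let $\alpha_1, \ldots, \alpha_n$ be the conjugates of $\alpha$. Notice that $\frac{\alpha_k + i\sqrt{4 - \alpha_k^2}}{2}$ has modulus 1. Now,

$$\left(X - \frac{\alpha_k + i\sqrt{4 - \alpha_k^2}}{2}\right)\left(X - \frac{\alpha_k - i\sqrt{4 - \alpha_k^2}}{2}\right) = X^2 - \alpha_kX + 1$$

so that

$$\prod_{k=1}^{n} \left(X - \frac{\alpha_k + i\sqrt{4 - \alpha_k^2}}{2}\right)\left(X - \frac{\alpha_k - i\sqrt{4 - \alpha_k^2}}{2}\right)$$

has integer coefficients. Since all its roots have modulus 1, by Kronecker's theorem 1.5.27†, we get that $\frac{\alpha + i\sqrt{4 - \alpha^2}}{2}$ is a root of unity. Since $\alpha/2$ is its real part, we get $\alpha = 2\cos\left(\frac{2k\pi}{m}\right)$ for some $k, m$. Conversely, such $\alpha$ work since the conjugates of

$$2\cos\left(\frac{2k\pi}{m}\right) = \exp\left(\frac{2ki\pi}{m}\right) + \exp\left(-\frac{2ki\pi}{m}\right)$$

are among $2\cos\left(\frac{2\ell\pi}{m}\right)$ for $\ell \in \mathbb{Z}$ by the fundamental theorem of symmetric polynomials. ∎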
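As a concrete instance, take $m = 7$. The numbers $2\cos\left(\frac{2k\pi}{7}\right)$ for $k = 1, 2, 3$ are real, have modulus at most 2, and are the three roots of the monic integer polynomial $X^3 + X^2 - 2X - 1$. This polynomial is a standard fact that is not stated in the text above. The sketch below checks it numerically, so it is an illustration in floating point rather than a proof.

```python
import math

def cubic(x):
    """X^3 + X^2 - 2X - 1, the minimal polynomial of 2cos(2*pi/7) (assumed, see lead-in)."""
    return x ** 3 + x ** 2 - 2 * x - 1

values = [2 * math.cos(2 * math.pi * k / 7) for k in (1, 2, 3)]
for v in values:
    print(round(v, 6), abs(cubic(v)) < 1e-9, abs(v) <= 2)
```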

Exercise 1.5.29† . Suppose that ω is a root of unity whose real part is an algebraic integer. Prove
that ω 4 = 1.

Solution

Suppose that $\omega^2 \neq 1$. Then, by the triangular inequality,

$$|\Re(\omega)| = \frac{|\omega + \omega^{-1}|}{2} < \frac{1 + 1}{2} = 1$$

since the equality case would imply $\omega = \omega^{-1}$, i.e. $\omega^2 = 1$. Notice that, since the conjugates of roots of unity are roots of unity, the conjugates of $\Re(\omega) = \frac{\omega + \omega^{-1}}{2}$ also all have absolute value at most 1 by the triangular inequality, and are algebraic integers by assumption. Thus, the product of $\Re(\omega)$ and its conjugates has absolute value strictly less than 1. Since it's an integer it must hence be zero, which implies $\Re(\omega) = 0$, i.e. $\omega = \pm i$ which satisfies $\omega^4 = 1$ as wanted. ∎

Exercise 1.5.30†. Let $\omega_1, \ldots, \omega_n$ be roots of unity. Suppose that $\frac{1}{n}(\omega_1 + \ldots + \omega_n)$ is a non-zero algebraic integer. Prove that $\omega_1 = \ldots = \omega_n$.¹

Solution

Notice that, by the triangular inequality, if $\omega_1, \ldots, \omega_n$ are not all equal, then $\frac{1}{n}(\omega_1 + \ldots + \omega_n)$ has absolute value strictly less than 1. Since its conjugates all have absolute value at most 1 by the triangular inequality, the product of $\frac{1}{n}(\omega_1 + \ldots + \omega_n)$ with its conjugates has absolute value strictly less than 1. Since it’s a rational integer, it must be zero. This implies $\frac{1}{n}(\omega_1 + \ldots + \omega_n) = 0$,

1 In fact, any algebraic integer that can be written as a linear combination of roots of unity with rational coefficients

can also be written as a linear combination of roots of unity with integer coefficients. However, this is a difficult result
to prove (see Exercise 3.5.26† for a special case).
208 CHAPTER 1. ALGEBRAIC NUMBERS AND INTEGERS

contradicting our initial assumption. 

Exercise 1.5.31†. Let $\alpha \in \overline{\mathbb{Z}}$ be an algebraic integer and let $p$ be a rational prime. Must it follow that $\alpha^n \equiv 0 \pmod{p}$ or $\alpha^n \equiv 1 \pmod{p}$ for some $n \in \mathbb{N}$?²

Solution

The answer is no. As a counterexample, we can try to find an $\alpha$ such that the sequence $(\alpha^n)_{n \ge 1}$ is constant modulo $p$ and congruent to neither 0 nor 1 modulo $p$. The simplest possible $p$ is $p = 2$, so let’s pick $p = 2$. Then, one of the simplest ways to achieve $\alpha^2 \equiv \alpha \pmod{2}$ and $\alpha \not\equiv 0, 1 \pmod{2}$ is to choose $\alpha$ such that $\alpha^2 = \alpha - 2$. Such an $\alpha$ must clearly be irrational. By Exercise 1.5.24†, $\alpha$ is not invertible modulo 2, but it is also non-zero since $2 = \alpha\beta$ is not divisible by $2 \cdot 2$, where $\beta$ is the other conjugate of $\alpha$. Indeed, if 2 divided $\alpha$, then it would also divide $\beta$: $\alpha/2$ is an algebraic integer so its conjugate $\beta/2$ is as well by Exercise 1.2.4∗. We can also show directly that $\alpha \not\equiv 1 \pmod{2}$ without invoking Exercise 1.5.24†: if $\frac{\alpha - 1}{2}$ were an algebraic integer, then so would $\frac{\beta - 1}{2}$, and hence so would their product

$$\frac{\alpha - 1}{2} \cdot \frac{\beta - 1}{2} = \frac{\alpha\beta - (\alpha + \beta) + 1}{4} = \frac{2 - 1 + 1}{4} = \frac{1}{2}$$

which is not the case.

A more elaborate example, which also sheds a lot of light on the situation, is the following. Consider a prime $p \equiv 1 \pmod{4}$ and factorise it as $\pi\overline{\pi}$ in the Gaussian integers $\mathbb{Z}[i]$. By ??, $\pi$ and $\overline{\pi}$ aren’t associates so, with the help of the Chinese remainder theorem (in $\mathbb{Z}[i]$), we can pick an $\alpha \in \mathbb{Z}[i]$ congruent to 0 modulo $\pi$ and to 1 modulo $\overline{\pi}$. Then, the powers of $\alpha$ are clearly congruent to $\alpha$ modulo $\pi$ and $\overline{\pi}$, so modulo $p = \pi\overline{\pi}$. However, $\alpha$ is congruent to neither 0 nor 1 modulo $p$. In fact, this is the only way such a counterexample happens, but we need to replace the factorisation in prime elements (which doesn’t always hold) by the factorisation in prime ideals (which always holds). □

2 In Chapter 4, we prove that the answer is positive for sufficiently large p.


Chapter 2

Quadratic Integers

Exercise 2.0.1. Why is the "naive" approach of factorising the equation as x2 = (y − 1)(y 2 + y + 1)
difficult to conclude with? Why does our solution not work as well for the equation x2 − 1 = y 3 ?

Solution

One reason is that both of these approaches transform one diophantine equation into two simulta-
neous diophantine equations. On the other hand, since i and −i are conjugate, (x + i)(x − i) = y 3
gives us only one equation, since, from (x + i)3 = (a + bi)3 , we also get (x − i)3 = (a − bi)3 by
conjugating. 

2.1 General Definitions


Exercise 2.1.1∗ . Prove that Z + αZ is a ring for any quadratic integer α. This amounts to checking
that it is closed under addition, subtraction, and multiplication. What happens if α is a quadratic
number which is not an integer?

Solution

Suppose that $\alpha^2 - u\alpha - v = 0$, i.e. $\alpha^2 = u\alpha + v$ where $u, v \in \mathbb{Z}$. We have

$$(a + b\alpha) \pm (c + d\alpha) = (a \pm c) + (b \pm d)\alpha$$

and

$$\begin{aligned}
(a + b\alpha)(c + d\alpha) &= ac + (ad + bc)\alpha + bd\alpha^2 \\
&= ac + (ad + bc)\alpha + bd(u\alpha + v) \\
&= (ac + bdv) + (ad + bc + bdu)\alpha.
\end{aligned}$$

If $\alpha$ is not a quadratic integer, then $\alpha^2$ is not a linear combination of 1 and $\alpha$ with rational integer coefficients, so $\mathbb{Z}[\alpha]$ would not be closed under multiplication with this definition (and thus not a ring). □
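The closure computation above can be turned into a small sketch. The function name `mul` and the pair representation `(a, b)` for a + bα are illustrative choices, not notation from the book; the only input from the text is the relation α² = uα + v.

```python
def mul(x, y, u, v):
    """Multiply x = a + b*alpha by y = c + d*alpha, assuming alpha^2 = u*alpha + v.

    Elements are represented as integer pairs (constant part, alpha part).
    """
    a, b = x
    c, d = y
    # (a + b*alpha)(c + d*alpha) = (ac + bdv) + (ad + bc + bdu)*alpha
    return (a * c + b * d * v, a * d + b * c + b * d * u)


# Example: the golden ratio satisfies alpha^2 = alpha + 1, i.e. u = v = 1.
print(mul((2, 3), (1, -1), 1, 1))  # (2 + 3*alpha)(1 - alpha)
```

The output pair again has integer entries, which is exactly the closure under multiplication used in the solution.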

Exercise 2.1.2∗. Prove that Q + αQ is a field for any quadratic integer α. This amounts to checking
that it is closed under addition, subtraction, multiplication, and division.


Solution

The same proof as the one of Exercise 2.1.1∗ shows that it is a ring, so it remains to prove that every non-zero element has an inverse. If we multiply $a + b\alpha$ by $a + b\overline{\alpha}$, where $\overline{\alpha}$ is the other conjugate of $\alpha$, we get a rational number $c$ by the fundamental theorem of symmetric polynomials. Moreover, this number is zero only if one of $a + b\alpha$ and $a + b\overline{\alpha}$ is zero, which implies that the other is too since they are conjugate. Indeed, if $f(a + bX)$ has a root at $\alpha$ it has one at $\overline{\alpha}$ too.

Hence, when $a + b\alpha$ is non-zero, it has an inverse

$$\frac{a + b\overline{\alpha}}{c}.$$

To conclude, note that, if $\alpha^2 + u\alpha + v = 0$, then $\overline{\alpha} = -u - \alpha$ by Vieta’s formulas so $\overline{\alpha} \in \mathbb{Q}(\alpha)$. □
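As a rough illustration of this inverse formula, here is a sketch using exact rational arithmetic. The names `mul` and `inverse` and the pair representation are assumptions made for the example; the relation α² + uα + v = 0 and the conjugate −u − α are taken from the solution.

```python
from fractions import Fraction


def mul(x, y, u, v):
    """Product in Q + alpha*Q, assuming alpha^2 + u*alpha + v = 0."""
    a, b = x
    c, d = y
    # alpha^2 = -u*alpha - v
    return (a * c - b * d * v, a * d + b * c - b * d * u)


def inverse(x, u, v):
    """Inverse of a + b*alpha as (a + b*conj(alpha)) / c, with c the product of conjugates."""
    a, b = Fraction(x[0]), Fraction(x[1])
    # conj(alpha) = -u - alpha, so a + b*conj(alpha) = (a - b*u) - b*alpha
    conj = (a - b * u, -b)
    # c = (a + b*alpha)(a + b*conj(alpha)) = a^2 - a*b*u + b^2*v
    c = a * a - a * b * u + b * b * v
    if c == 0:
        raise ZeroDivisionError("element is zero (for irrational alpha)")
    return (conj[0] / c, conj[1] / c)


# Example: alpha = sqrt(2) satisfies alpha^2 - 2 = 0, i.e. u = 0, v = -2.
x = (1, 1)                 # 1 + sqrt(2)
x_inv = inverse(x, 0, -2)  # expected to represent -1 + sqrt(2)
print(x_inv, mul(x, x_inv, 0, -2))
```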

Exercise 2.1.3∗ . Let α be a quadratic number and β ∈ Q(α). Show that β has degree 1 or 2.

Solution

As in Exercise 2.1.2∗, let $\overline{\alpha} \in \mathbb{Q}(\alpha)$ be the conjugate of $\alpha$. Let $\beta = a + b\alpha$ with $a, b \in \mathbb{Q}$. Then

$$(X - (a + b\alpha))(X - (a + b\overline{\alpha}))$$

has rational coefficients by the fundamental theorem of symmetric polynomials so $\beta$ has degree at most 2 as wanted. □


Exercise 2.1.4∗. Prove that a quadratic field $K$ is equal to $\mathbb{Q}(\sqrt{d})$ for some squarefree rational integer $d \ne 1$. Moreover, prove that such fields are pairwise non-isomorphic (and in particular distinct), meaning that, for distinct squarefree $a, b \ne 1$, there does not exist a bijective function $f : \mathbb{Q}(\sqrt{a}) \to \mathbb{Q}(\sqrt{b})$ such that $f(x + y) = f(x) + f(y)$ and $f(xy) = f(x)f(y)$ for any $x, y \in \mathbb{Q}(\sqrt{a})$.

Solution

Let $\alpha$ be a quadratic number, i.e. a root of $aX^2 + bX + c$ for some $a, b, c \in \mathbb{Z}$. Then, $\alpha = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ so $\mathbb{Q}(\alpha) = \mathbb{Q}(\sqrt{b^2 - 4ac})$. By letting $d$ be the squarefree part of $b^2 - 4ac$ (but conserving its sign), we thus get $\mathbb{Q}(\alpha) = \mathbb{Q}(\sqrt{d})$ as wanted (and $d \ne 1$ since the field does not only have rational elements).

Let $u, v \ne 1$ be squarefree and suppose $f : \mathbb{Q}(\sqrt{u}) \to \mathbb{Q}(\sqrt{v})$ is an isomorphism. We will show that $u = v$. Note that $f(1)^2 = f(1)$ so $f(1) = 1$ or $0$ but the latter is impossible since then $f(x) = f(x)f(1) = 0$ for all $x$. We have

$$f(u) = f(1) + f(1) + \ldots + f(1) = uf(1) = u$$

and $f(u) = f(\sqrt{u})^2$, so $f(\sqrt{u}) = \pm\sqrt{u}$ and thus $\sqrt{u} \in \mathbb{Q}(\sqrt{v})$. Thus, the problem of showing that $\mathbb{Q}(\sqrt{u})$ and $\mathbb{Q}(\sqrt{v})$ are non-isomorphic when $u \ne v$ reduces to showing that they are distinct. This is easy: if $\sqrt{u} = a + b\sqrt{v}$ then $u = a^2 + vb^2 + 2ab\sqrt{v}$ so $a = 0$ or $b = 0$ since $\sqrt{v}$ is irrational, but the former means that $u/v$ is a perfect square and the latter that $u$ is one, i.e. that $u = v$ or $u = 1$. □

Exercise 2.1.5∗ . Prove that the conjugate is well defined.



Solution

This amounts to the fact that every element of $\mathbb{Q}(\sqrt{d})$ can be written in a unique way as $a + b\sqrt{d}$ with $a, b \in \mathbb{Q}$, which is true since $\sqrt{d}$ is irrational. □

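Exercise 2.1.6∗. Let $d \ne 1$ be a rational squarefree number. Prove that the conjugation satisfies $\overline{\alpha + \beta} = \overline{\alpha} + \overline{\beta}$ and $\overline{\alpha\beta} = \overline{\alpha}\,\overline{\beta}$ for all $\alpha, \beta \in \mathbb{Q}(\sqrt{d})$. Such a function is called an automorphism of $\mathbb{Q}(\sqrt{d})$ if it is also bijective.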

Solution

We have
$$(a - b\sqrt{d}) + (a' - b'\sqrt{d}) = (a + a') - (b + b')\sqrt{d}$$
and
$$(a - b\sqrt{d})(a' - b'\sqrt{d}) = (aa' + bb'd) - (ab' + ba')\sqrt{d}.$$


Exercise 2.1.7. Let $d \ne 1$ be a rational squarefree number. Prove that the only automorphisms of $\mathbb{Q}(\sqrt{d})$ are the identity and conjugation.

Solution

Let $f$ be an automorphism of $\mathbb{Q}(\sqrt{d})$. Since $f$ is bijective and $f(1)^2 = f(1)$, we must have $f(1) = 1$ as otherwise $f(x) = f(x)f(1) = 0$ for all $x$. By induction we have $f(nx) = f(x) + \ldots + f(x) = nf(x)$ for any $n \in \mathbb{Z}_{\ge 1}$, and it is clearly true for $n = 0$ since $f(0) + f(0) = f(0)$, and for $n < 0$ since $f(-y) = f(0) - f(y) = -f(y)$; thus $f(nx) = nf(x)$ for any $n \in \mathbb{Z}$. Since

$$nf(xm/n) = f(xm) = mf(x),$$

we also have $f(ax) = af(x)$ for any $a \in \mathbb{Q}$; in particular $f$ fixes $\mathbb{Q}$. To finish, $d = f(d) = f(\sqrt{d})^2$ so $f(\sqrt{d}) = \pm\sqrt{d}$. Since $f(a + b\sqrt{d}) = a + bf(\sqrt{d})$, the plus sign gives the identity and the minus sign the conjugation. □


Exercise 2.1.8∗. Let $d \ne 1$ be a squarefree rational integer, and $\alpha, \beta \in \mathbb{Q}(\sqrt{d})$. Prove that $N(\alpha\beta) = N(\alpha)N(\beta)$.

Solution

Let $\alpha = a + b\sqrt{d}$ and $\beta = a' + b'\sqrt{d}$. Then,
$$N(\alpha)N(\beta) = (a^2 - db^2)(a'^2 - db'^2)$$
and
$$N(\alpha\beta) = N((aa' + dbb') + (ab' + ba')\sqrt{d}) = (aa' + dbb')^2 - d(ab' + ba')^2.$$
To conclude, both sides are equal to $(aa')^2 + (dbb')^2 - d((ab')^2 + (ba')^2)$. □
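The identity can also be spot-checked numerically. The sketch below assumes elements a + b√d are stored as integer pairs `(a, b)`; the helper names `mul` and `norm` are chosen here for illustration.

```python
import random


def mul(x, y, d):
    """(a + b*sqrt(d)) * (a' + b'*sqrt(d)) as a pair (rational part, sqrt(d) part)."""
    a, b = x
    ap, bp = y
    return (a * ap + d * b * bp, a * bp + b * ap)


def norm(x, d):
    """N(a + b*sqrt(d)) = a^2 - d*b^2."""
    a, b = x
    return a * a - d * b * b


random.seed(0)
for _ in range(1000):
    d = random.choice([-5, -1, 2, 3, 7])
    x = (random.randint(-50, 50), random.randint(-50, 50))
    y = (random.randint(-50, 50), random.randint(-50, 50))
    # Multiplicativity of the norm, checked on random integer inputs.
    assert norm(mul(x, y, d), d) == norm(x, d) * norm(y, d)
```

A random check like this is of course not a proof; it only guards against sign slips in the expansion.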

Exercise 2.1.9. Prove Exercise 2.1.8∗ without any computations using Exercise 2.1.6∗ .

Solution

We have
$$N(\alpha)N(\beta) = \alpha\overline{\alpha}\beta\overline{\beta} = \alpha\beta\,\overline{\alpha\beta} = N(\alpha\beta).$$



Exercise 2.1.10. Let $d < 0$ be a squarefree integer. Prove that the conjugate of an element of $\mathbb{Q}(\sqrt{d})$ is the same as its complex conjugate. In particular, the norm over $\mathbb{Q}(\sqrt{d})$ is the module squared.

Solution

When $d < 0$, the complex conjugate of $\sqrt{d}$ is $-\sqrt{d}$. Since rational numbers are real, this means that the complex conjugate of $a + b\sqrt{d}$ for $a, b \in \mathbb{Q}$ is $a - b\sqrt{d}$. □

2.2 Unique Factorisation


Exercise 2.2.1∗. Let $d \ne 1$ be a squarefree rational integer. Prove that the product of two units of $\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is still a unit, and that the conjugate of a unit is also a unit.

Solution

If $uu' = 1$ and $vv' = 1$ then $(uv)(u'v') = 1$ and $\overline{u}\,\overline{u'} = 1$. □

Exercise 2.2.2∗. Let $d \ne 1$ be a squarefree rational integer. Prove that $\alpha \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is a unit if and only if $|N(\alpha)| = 1$.

Solution

If $N(\alpha) = \alpha\overline{\alpha} = \pm 1$, then it is clear that $\alpha$ is a unit. Now suppose $\alpha$ is a unit. By Exercise 2.2.1∗, $\overline{\alpha}$ is also a unit, which means $N(\alpha) = \alpha\overline{\alpha}$ is one too. However the only rational integer units are $\pm 1$ by Proposition 1.1.1. □

Exercise 2.2.3∗ . Determine the units of the ring Z[i].

Solution

a + bi is a unit of Z[i] if and only if its norm a² + b² is 1 by Exercise 2.2.2∗, since it is positive.
This means a = ±1 and b = 0 or a = 0 and b = ±1, i.e. a + bi ∈ {1, −1, i, −i}. □
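A brute-force enumeration confirms this list. The sketch relies on the observation that a² + b² = 1 forces |a|, |b| ≤ 1, so a tiny search range suffices; the variable name `units` is illustrative.

```python
# Gaussian integers a + bi of norm 1; pairs (a, b) with |a|, |b| <= 1 are enough to scan.
units = [(a, b) for a in range(-1, 2) for b in range(-1, 2) if a * a + b * b == 1]
print(units)  # pairs (a, b) standing for a + bi
```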

Exercise 2.2.4∗ . Prove that an associate of a prime is also prime.

Solution

If p and q are associates, p | α if and only if q | α. 

Exercise 2.2.5∗ . Prove that the conjugate of a prime is also a prime.



Solution

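$p \mid \eta$ iff $\overline{p} \mid \overline{\eta}$, so if $\overline{p} \mid \alpha\beta$ then $p \mid \overline{\alpha}\,\overline{\beta}$, which implies $p \mid \overline{\alpha}$ or $p \mid \overline{\beta}$, i.e. $\overline{p} \mid \alpha$ or $\overline{p} \mid \beta$ as wanted. □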

Exercise 2.2.6∗ . Prove that primes are irreducible.

Solution

If p = αβ is prime, then p must divide α or β, we may assume it divides α, i.e. α = pγ. Then,

p = αβ = pγβ =⇒ βγ = 1

so β is a unit as wanted. 

Exercise 2.2.7∗. Let $d \ne 1$ be a squarefree rational integer and let $x \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ be a quadratic integer. Suppose that $|N(x)|$ is a rational prime. Prove that $x$ is irreducible.

Solution

If $x = \alpha\beta$ then $N(x) = N(\alpha)N(\beta)$, so if $|N(x)|$ is a rational prime then $N(\alpha)$ or $N(\beta)$ must be $\pm 1$ by the uniqueness of the prime factorisation in $\mathbb{Z}$, i.e. one of $\alpha$ and $\beta$ is a unit. □

Exercise 2.2.8∗ . Suppose a prime p divides another prime q. Prove that p and q are associates.

Solution

Write q = αp. Since q is irreducible by Exercise 2.2.6∗ and p is not a unit, α must be a unit. 

Exercise 2.2.9∗ . Prove that p is a prime element of R if and only if it is non-zero and R (mod p) is an
integral domain (this means that the product of two non-zero elements is still non-zero). In particular,
if R (mod p) is a field (this means that elements which are not divisible by p have an inverse mod p),
p is prime.

Solution

αβ is divisible by p if and only if it is zero modulo p, so p is prime iff when αβ is zero modulo
p, α or β must already be zero modulo p. This is exactly what it means for R (mod p) to be an
integral domain. 

Exercise 2.2.10. Let $d \ne 1$ be a squarefree rational integer and let $p \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ be a prime. Prove that $p$ divides exactly one rational prime $q \in \mathbb{Z}$.

Solution

First we prove uniqueness. Suppose $p \mid q$ and $p \mid r$ for distinct rational primes $q$ and $r$. By Bézout, let $aq + br = 1$ for some $a, b \in \mathbb{Z}$. Then $p \mid aq + br = 1$, which means that $p$ is a unit, a contradiction.

For existence, consider the prime factorisation $\pm q_1^{n_1} \cdot \ldots \cdot q_k^{n_k}$ of the norm $N(p)$ of $p$. Since $p \mid N(p)$, $p$ must divide one of the $q_i$ since it’s prime. □


Exercise 2.2.11. Prove that 2 is irreducible in $\mathbb{Z}[\sqrt{-5}] = \mathcal{O}_{\mathbb{Q}(\sqrt{-5})}$ but not prime.

Solution

First, note that 2 is not prime since $2 \mid (1 + \sqrt{-5})(1 - \sqrt{-5})$ but 2 divides neither of these factors. Now, suppose $2 = \alpha\beta$ and that neither $\alpha$ nor $\beta$ is a unit. Then,

$$4 = N(2) = N(\alpha)N(\beta)$$

so $N(\alpha) = N(\beta) = \pm 2$ as they are different from $\pm 1$. If we write $\alpha = a + b\sqrt{-5}$, we have $N(\alpha) = a^2 + 5b^2$ which cannot be equal to 2 and is thus a contradiction. □
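Both halves of this argument are finite checks, so they can be sketched in a few lines. Elements a + b√−5 are stored as pairs `(a, b)`; the helper names are assumptions made for the example.

```python
def norm(x):
    """N(a + b*sqrt(-5)) = a^2 + 5*b^2."""
    a, b = x
    return a * a + 5 * b * b


def mul(x, y):
    """Product in Z[sqrt(-5)], using sqrt(-5)^2 = -5."""
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)


def divisible_by_2(x):
    """2 divides a + b*sqrt(-5) in Z[sqrt(-5)] exactly when a and b are both even."""
    return x[0] % 2 == 0 and x[1] % 2 == 0


# a^2 + 5*b^2 = 2 would force |a| <= 1 and b = 0, so a small window is enough.
norm_two = [(a, b) for a in range(-2, 3) for b in range(-1, 2) if norm((a, b)) == 2]
print(norm_two)  # no element of norm 2

p, q = (1, 1), (1, -1)  # 1 + sqrt(-5) and 1 - sqrt(-5)
print(mul(p, q), divisible_by_2(mul(p, q)), divisible_by_2(p), divisible_by_2(q))
```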

Exercise 2.2.12. Show that the primes of Definition 2.2.2 must all be prime elements, and that there
is at least one associate of each prime element in that set. (Conversely, if we have unique factorisation,
any such set of primes work. This explains why we consider all primes defined in Definition 2.2.3.)

Solution

Suppose $p$ is a prime as in Definition 2.2.2 and $p \mid \alpha\beta$. Write $\alpha = up_1^{a_1} \cdot \ldots \cdot p_n^{a_n}$ and $\beta = vq_1^{b_1} \cdot \ldots \cdot q_m^{b_m}$ the prime factorisations of $\alpha$ and $\beta$. Then, by the uniqueness of the prime factorisation, $p$ must be equal to some $p_i$ or $q_i$ times a unit $w$. In the first case, $p \mid p_i \mid \alpha$ and in the second case $p \mid q_i \mid \beta$.

Now suppose $p$ is a prime as in Definition 2.2.3 and consider its prime factorisation $p_1^{a_1} \cdot \ldots \cdot p_n^{a_n}$. Then $p$ must divide one of the $p_i$ since it’s prime, which means it is associate to it by Exercise 2.2.8∗. □

Exercise 2.2.13∗ . Prove that a greatest common divisor γ of α and β really is a greatest common
divisor of α and β, in the sense that if γ | α, β and δ | α, β then δ | γ.

Solution

Since α, β ∈ αR + βR, we have α, β ∈ γR, i.e. γ | α, β. Now suppose δ | α, β. Let x and y be


such that xα + yβ = γ. Then,
δ | xα + yβ = γ.


Exercise 2.2.14∗ . Prove that an associate greatest common divisor is also a greatest common divisor,
and that the greatest common divisor of two elements is unique up to association.

Solution

If γ and δ are two gcds of α and β then, by Exercise 2.2.13∗ , γ | δ | γ so they are associates. 

Exercise 2.2.15. Let R be a Euclidean domain with Euclidean function f. Show that, if f(α) = 0,
then α = 0, and if f(α) = 1, then α is a unit or zero.

Solution

If α 6= 0, consider the Euclidean division of 0 by α: 0 = ρα + τ . Then f (τ ) < f (α) = 0 which is


impossible.

For the second part, suppose that α is non-zero and f (α) = 1. Then, if we perform the Euclidean
division of any β by α: β = αρ + τ , we have f (τ ) < f (α) = 1, i.e. f (τ ) = 0. By the first part,
this means that τ = 0. Hence, α divides everything, and in particular it divides 1, i.e. it is a
unit. 

Exercise 2.2.16∗ . Prove that a Euclidean domain is a Bézout domain.

Solution

Let R be a Euclidean domain with Euclidean function f . Let α, β be two elements of R. Consider
a non-zero element γ ∈ αR + βR such that f (γ) is minimal. We will show that αR + βR = γR.
Suppose otherwise, that there is a δ ∈ αR + βR such that the Euclidean division δ = ργ + τ has
τ non-zero (otherwise γ | δ, i.e. δ ∈ γR). Then, f (τ ) < f (γ) but τ ∈ αR + βR and is non-zero
too so that contradicts the minimality of γ. 

Exercise 2.2.17∗ . Prove that irreducible elements are prime in a Bézout domain.

Solution

Let x be an irreducible element in a Bézout domain R. By Exercise 2.2.9∗, it suffices to show that every α with x ∤ α has an inverse modulo x. Since R is a Bézout domain, there is some β such that

xR + αR = βR.

In particular β | x so β is a unit or a unit times x. The latter is not possible since β also divides α, thus β is a unit. Without loss of generality, β = 1 by Exercise 2.2.14∗. Since β = ax + bα for some a, b ∈ R by definition, modulo x we have bα ≡ 1 as wanted. □

2.3 Gaussian Integers


Exercise 2.3.1∗. Let n ∈ Z be a rational integer and p an odd rational prime. If n² ≡ −1 (mod p), prove that p ≡ 1 (mod 4).

Solution

Suppose that p | n² + 1. Then the order of n modulo p is 4: indeed, n⁴ ≡ (−1)² = 1 so the order divides 4, and n² ≡ −1 ≢ 1 so the order does not divide 2. Since the order divides p − 1 by Fermat’s little theorem (see Exercise 3.3.4∗), we have 4 | p − 1, i.e. p ≡ 1 (mod 4) as wanted. □

Exercise 2.3.2∗. Let p ≡ 1 (mod 4) be a rational prime. Prove that there exists a rational integer n
such that n² ≡ −1 (mod p). (Hint: Consider (p − 1)!.)

Solution

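By Wilson’s theorem, $(p - 1)! \equiv -1 \pmod{p}$. Hence,

$$\begin{aligned}
-1 &\equiv (p - 1)! \\
&= \prod_{k=1}^{\frac{p-1}{2}} k \cdot \prod_{k=1}^{\frac{p-1}{2}} (p - k) \\
&\equiv \prod_{k=1}^{\frac{p-1}{2}} k \cdot \prod_{k=1}^{\frac{p-1}{2}} (-k) \\
&= (-1)^{\frac{p-1}{2}} \left(\prod_{k=1}^{\frac{p-1}{2}} k\right)^2 \\
&\equiv \left(\prod_{k=1}^{\frac{p-1}{2}} k\right)^2
\end{aligned}$$

since $p \equiv 1 \pmod{4}$. □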
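The derivation gives an explicit square root of −1 modulo p, namely ((p − 1)/2)!. A minimal sketch, with the function name `sqrt_minus_one` chosen here for illustration and with p assumed to be a prime congruent to 1 modulo 4:

```python
from math import factorial


def sqrt_minus_one(p):
    """Return n = ((p - 1)/2)! mod p, which satisfies n^2 = -1 (mod p) when p = 1 (mod 4) is prime."""
    return factorial((p - 1) // 2) % p


for p in [5, 13, 17, 29, 37, 41]:
    n = sqrt_minus_one(p)
    # Check the congruence n^2 + 1 = 0 (mod p) for each sample prime.
    assert (n * n + 1) % p == 0
```

Computing a factorial is far from the fastest way to find such an n, but it mirrors the proof directly.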

Exercise 2.3.3. Which rational integers can be written as a sum of two squares of rational integers?

Solution

Since the norm of Z[i] is multiplicative, if m and n are sums of two squares, then so is mn. In particular, all positive integers with only prime factors equal to 2 or congruent to 1 modulo 4 are sums of two squares. Also, perfect squares times these numbers are sums of two squares.

Now suppose that n = a² + b² is a sum of two squares but not of this form, i.e. there is some prime p ≡ −1 (mod 4) such that v_p(n) is odd. If p ∤ b then p | (ab⁻¹)² + 1 which is impossible by Exercise 2.3.1∗. Thus p | a, b. We can now proceed by infinite descent on n/p² = (a/p)² + (b/p)² (or equivalently suppose n was the minimal counterexample and reach a contradiction). □
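The resulting criterion (every prime congruent to −1 modulo 4 appears with an even exponent) can be compared against a direct search. This is only a sanity-check sketch; the function names are illustrative and the trial-division factorisation is deliberately naive.

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """Direct search: is n = a^2 + b^2 for some integers a, b >= 0?"""
    return any(isqrt(n - a * a) ** 2 == n - a * a for a in range(isqrt(n) + 1))


def criterion(n):
    """True when every prime p = 3 (mod 4) divides n >= 1 to an even power."""
    p = 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if p % 4 == 3 and e % 2 == 1:
            return False
        p += 1
    # What remains is 1 or a single prime factor with exponent 1.
    return n % 4 != 3


mismatches = [n for n in range(1, 2000) if is_sum_of_two_squares(n) != criterion(n)]
print(mismatches)
```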

Exercise 2.3.4∗ . Find all rational integer solutions to the equation x2 + 1 = y 3 . (This is the example
we considered in the beginning of the chapter.)

Solution

Note that any solution of x² + 1 = y³ must have y odd since x² + 1 is congruent to 1 or 2 modulo 4. Thus, y is not divisible by 1 + i. Write the equation as (x + i)(x − i) = y³. The gcd of the two factors divides (x + i) − (x − i) = 2i but is not divisible by 1 + i, so it is 1. This means that

x + i = uα³

for some unit u and some α ∈ Z[i]. Since the units of Z[i] are 1, −1, i, −i and they are all cubes (1³, (−1)³, (−i)³, i³), we can assume u = 1. The rest of the solution is the same as in the introduction of the chapter.

2.4 Eisenstein Integers


Exercise 2.4.1∗ . Prove that the norm of a+bj is a2 −ab+b2 . (Bonus: do it without any computations
using cyclotomic polynomials from Chapter 3.)

Solution

$$(a + bj)(a + b\overline{j}) = a^2 + ab(j + \overline{j}) + b^2 j\overline{j} = a^2 - ab + b^2$$

since $j + \overline{j} = -1$ and $j\overline{j} = 1$ by Vieta ($j$ is a root of $X^2 + X + 1 = 0$). For the bonus, one can note that $N(a + bj) = \Phi_6(a, b)$ by Exercise 3.1.8∗. □

Exercise 2.4.2∗ . Determine the units of Z[j].

Solution

a + bj is a unit if and only if a² − ab + b² = ±1. If ab ≤ 0 we get a = 0 and b = ±1 as well as a = ±1 and b = 0, and if ab > 0 then a² − ab + b² = (a − b)² + ab ≥ 0² + 1 which means a = b = ±1. In conclusion, the units of Z[j] are ±1, ±j and ±(1 + j) = ∓j². (Note that these are all roots of unity.) □

Exercise 2.4.3∗ . Prove that Z[j] is norm-Euclidean.

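Solution

Let $\alpha, \beta \in \mathbb{Z}[j]$ be two elements with $\beta \ne 0$. Write $\frac{\alpha}{\beta} = x + yj$ with $x, y \in \mathbb{Q}$ and let $m$ and $n$ be rational integers such that $|x - m| \le \frac{1}{2}$ and $|y - n| \le \frac{1}{2}$. Thus,

$$|N(x + yj - (m + nj))| \le \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^2 < 1.$$

Hence,
$$|N(\alpha - \beta(m + nj))| = |N(\beta)| \cdot |N(x + yj - (m + nj))| < |N(\beta)|$$
which means that the remainder $\tau = \alpha - \beta(m + nj)$ works since it has norm less than $N(\beta)$. □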
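The rounding argument translates into a short division routine. In this sketch an Eisenstein integer a + bj is stored as the pair `(a, b)`, and the names `mul`, `norm`, `conj` and `divmod_eisenstein` are assumptions made for the example; the arithmetic uses j² = −1 − j and the conjugate of j being −1 − j.

```python
from fractions import Fraction


def mul(x, y):
    """(a + bj)(c + dj) with j^2 = -1 - j."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c - b * d)


def norm(x):
    """N(a + bj) = a^2 - ab + b^2."""
    a, b = x
    return a * a - a * b + b * b


def conj(x):
    """Conjugate of a + bj, namely (a - b) - bj."""
    a, b = x
    return (a - b, -b)


def divmod_eisenstein(alpha, beta):
    """Return (q, tau) with alpha = beta*q + tau and N(tau) < N(beta), for beta != 0."""
    n = norm(beta)
    num = mul(alpha, conj(beta))  # alpha / beta = num / n
    x, y = Fraction(num[0], n), Fraction(num[1], n)
    q = (round(x), round(y))      # nearest integers, so |x - m|, |y - n| <= 1/2
    bq = mul(beta, q)
    tau = (alpha[0] - bq[0], alpha[1] - bq[1])
    return q, tau


q, tau = divmod_eisenstein((7, 3), (2, -1))
print(q, tau, norm(tau), norm((2, -1)))
```

Iterating this division gives the Euclidean algorithm in Z[j], in the same way as for rational integers.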

Exercise 2.4.4. Characterise the primes of Z[j]. Conclude that when p ≡ 1 (mod 3) there exist
rational integers a and b such that p = a2 − ab + b2 . (You may assume that there is an x ∈ Z such
that x2 + x + 1 ≡ 0 (mod p) if p ≡ 1 (mod 3). This will be proven in Chapter 3, as a corollary of
Theorem 3.3.1.)

Solution

As in the $\mathbb{Z}[i]$ case, it suffices to find the prime factorisation of rational primes (this can also be seen as a corollary of the fact that a prime divides exactly one rational prime). Indeed, if $\alpha$ is a prime of $\mathbb{Z}[j]$, then $N(\alpha) = \alpha\overline{\alpha}$ has at most two rational prime factors: if it has one then $N(\alpha)$ is prime in $\mathbb{Z}$. Otherwise, $N(\alpha) = \pm p^2$ and $\alpha$ is an associate of a rational prime $p$.

Thus, suppose $N(\alpha) = p$ for some rational prime $p$ ($N(\alpha)$ is positive). Write $\alpha = a + bj$. Then, $N(\alpha) = a^2 - ab + b^2$. Clearly, $p \nmid b$ as otherwise $p^2 \mid N(\alpha) = p$. Thus, $p \mid (-a \cdot b^{-1})^2 + (-a \cdot b^{-1}) + 1$.

We wish to find which rational primes divide a number of the form $x^2 + x + 1$. Note that $x^3 - 1 = (x - 1)(x^2 + x + 1)$ so $x^3 \equiv 1 \pmod{p}$. The order of $x$ modulo $p$ divides 3. If it is 1 then $x^2 + x + 1 \equiv 3$ so $p = 3$ which factorises as $(1 - j)(1 - \overline{j}) = -j^2(1 - j)^2$ (it ramifies).

Otherwise, the order must be 3. Since it divides $p - 1$, we have $p \equiv 1 \pmod{3}$. In particular, primes congruent to $-1$ modulo 3 stay inert in $\mathbb{Z}[j]$. Finally, if $p \equiv 1 \pmod{3}$, by Theorem 3.3.1, there exists an $x$ such that $p \mid x^2 + x + 1 = (x - j)(x - j^2)$. Since $p \nmid x - j, x - j^2$, this means these primes split as a product of two Eisenstein primes $a + bj$ and $a + b\overline{j}$. (In particular $p = a^2 - ab + b^2$.) □

Exercise 2.4.5∗ . Let θ ∈ Z[j] be an Eisenstein integer. Prove that, if λ - θ, then θ ≡ ±1 (mod λ).
In that case, prove that we also have θ3 ≡ ±1 (mod λ4 ).

Solution

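Modulo $\lambda$, every element of $\mathbb{Z}[j]$ is congruent to a rational integer. Indeed, $a + bj \equiv a + b \pmod{1 - j}$. Moreover, $\lambda \mid 3$ so every rational integer is congruent modulo $\lambda$ to 0 or $\pm 1$.

Let $\theta = \eta\lambda \pm 1$. We consider $\frac{\theta^3 \mp 1}{\theta \mp 1} = \theta^2 \pm \theta + 1$. Using $3 = -j^2\lambda^2$, we have

$$\begin{aligned}
\theta^2 \pm \theta + 1 &= (\eta\lambda \pm 1)^2 \pm (\eta\lambda \pm 1) + 1 \\
&= \eta^2\lambda^2 \pm 2\eta\lambda + 1 \pm \eta\lambda + 1 + 1 \\
&= \eta^2\lambda^2 \pm 3\lambda\eta + 3 \\
&\equiv \eta^2\lambda^2 + 3 \pmod{\lambda^3} \\
&= \lambda^2(\eta^2 - j^2) \\
&= \lambda^2(\eta - j)(\eta + j).
\end{aligned}$$

If $\lambda \nmid \eta$, then $\eta$ is congruent to $j \equiv 1$ or $-j \equiv -1$ modulo $\lambda$, so $\lambda^3 \mid \theta^2 \pm \theta + 1$ and, since $\lambda \mid \theta \mp 1$, we get $\lambda^4 \mid \theta^3 \mp 1$. If $\lambda \mid \eta$, then $\lambda^2 \mid \theta \mp 1$ and the computation above still gives $\lambda^2 \mid \theta^2 \pm \theta + 1$, so $\lambda^4 \mid \theta^3 \mp 1$ again. □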

Exercise 2.4.6∗. Let α, β ∈ Z[j] be coprime Eisenstein integers non-divisible by λ. Prove that, if
λ | α³ + β³ = (α + β)(α + βj)(α + βj²),
each pair of factors has gcd λ.

Solution

Modulo λ, j ≡ 1 so all the factors are the same modulo λ and are thus all divisible by λ. If γ | α + β, α + βj then γ | β(1 − j) which implies γ | 1 − j = λ since it is coprime with β (it divides α + β). By symmetry between j and j² (they are conjugate), we reach the same conclusion if γ | α + β, α + βj².

Finally, if γ | α + βj, α + βj² then γ | βj(1 − j) but γ is coprime with β since it divides α + βj, which implies γ | 1 − j = λ again. □

Exercise 2.4.7. Check the computational details: ±1 ± µ ± η is never zero mod λ4 for units µ, η and
±1 ± µ ≡ 0 (mod λ3 ) implies µ = ±1.

Solution

We have seen in Exercise 2.4.2∗ that the units of Z[j] are roots of unity. Thus, the norms of
±ε, ±2 ± ε, ±1 ± µ have absolute value less than 3 · 3 by the triangular inequality (because the
conjugates of roots of unity are still roots of unity and thus have absolute value 1). In particular,
for it to be divisible by N (λ4 ) = 81, it must be zero, i.e. ±ε = 0, ±2 ± ε = 0 and ±1 ± µ = 0.
The first two cases are impossible, and the last one implies µ = ±1 as wanted. 

2.5 Hurwitz Integers


Exercise 2.5.1∗. Prove that ij = k = −ji, jk = i = −kj and ki = j = −ik follows from i² = j² =
k² = ijk = −1 and associativity of the multiplication.

Solution

We have
1. ij = −ijkk = k.

2. kj = ijj = −i.
3. jk = −iijk = i.
4. ik = jkk = −j.
5. ki = −kkj = j.

6. ji = kii = −k.


Exercise 2.5.2. Prove that i, j, k are distinct.

Solution

Exercise 2.5.1∗ tells us that everything is cyclic between i, j, k so assume i = j. Then we get
k = ij = i2 = −1 and
−i = ki = j = i
so i = 0 which is an obvious contradiction. 

Exercise 2.5.3∗ . Let α, β, γ ∈ H be quaternions. Prove that (αβ)γ = α(βγ). (We say multiplication
is associative. This is why we can write αβγ without ambiguity.)

Solution

The simplest (and neatest) way to see this is to use the representation by matrices given by
Remark 2.5.2, since multiplication of matrices is associative. Without matrices, but still with a
bit of (implicit) linear algebra, we can do the following. We shall prove that (xy)z = x(yz) for
any x, y, z ∈ {1, i, j, k}. Since real numbers commute with everything and multiplication is clearly
associative when one of the factors is real, we thus get (xy)z = x(yz) for any x, y, z which are
linear combinations with real coefficients of 1, i, j, k, i.e. all quaternions (since if (xy)z = x(yz)
and (wy)z = w(yz) then ((x + w)y)z = (x + w)(yz) too).

When one of x, y, z is 1 we trivially have (xy)z = x(yz) by the previous remark (since 1 is real).
Thus, suppose without loss of generality by cyclicity (Exercise 2.5.1∗ ) that x = i. We distinguish
all possible cases.

1. y = z = j. We have
(ij)j = kj = −i = i(jj).

2. y = z = k. We have
(ik)k = −jk = −i = i(kk).

3. y = j, z = k. We have
(ij)k = kk = −1 = ii = i(jk).

4. y = k, z = j. We have
(ik)j = −jj = 1 = −ii = i(kj).

5. y = i, z = j. We have
(ii)j = −j = ik = i(ij).

6. y = i, z = k. We have
(ii)k = −k = −ij = i(ik).

7. y = j, z = i. We have
(ij)i = ki = j = −ik = i(ji).

8. y = k, z = i. We have
(ik)i = −ji = k = ij = i(ki).

9. y = z = i. We trivially have
(ii)i = i(ii).
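The case analysis above can be double-checked mechanically. The sketch below represents a quaternion a + bi + cj + dk as a 4-tuple and uses the product formula from the multiplication rules of Exercise 2.5.1∗; the names `qmul`, `one`, `i`, `j`, `k` are illustrative.

```python
from itertools import product


def qmul(x, y):
    """Quaternion product of x = (a, b, c, d) and y = (e, f, g, h), as 4-tuples."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e - b * f - c * g - d * h,
            a * f + b * e + c * h - d * g,
            a * g + c * e + d * f - b * h,
            a * h + d * e + b * g - c * f)


one, i, j, k = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
basis = [one, i, j, k]

# Associativity on all 64 triples of basis elements.
assert all(qmul(qmul(x, y), z) == qmul(x, qmul(y, z))
           for x, y, z in product(basis, repeat=3))

print(qmul(i, j), qmul(j, i), qmul(qmul(i, j), i))
```

By bilinearity, associativity on the basis extends to all quaternions, as explained in the solution.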

Exercise 2.5.4. Prove that there are infinitely many square roots of −1 in H.

Solution

We have
$$(a + bi + cj + dk)^2 = (a^2 - b^2 - c^2 - d^2) + 2abi + 2acj + 2adk$$
because the other terms cancel out because of Exercise 2.5.1∗, for instance $bi \cdot cj + cj \cdot bi = 0$. In particular, if $a = 0$, $(bi + cj + dk)^2 = -(b^2 + c^2 + d^2)$ and there are thus clearly infinitely many square roots of any negative number. □

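Exercise 2.5.5∗. Prove that, for any $\alpha, \beta \in \mathbb{H}$, $\overline{\alpha + \beta} = \overline{\alpha} + \overline{\beta}$ and $\overline{\alpha\beta} = \overline{\beta}\,\overline{\alpha}$ (this is because multiplication is not commutative anymore).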

Solution

It is clear that conjugation is additive:

$$(a - bi - cj - dk) + (a' - b'i - c'j - d'k) = (a + a') - (b + b')i - (c + c')j - (d + d')k.$$

For multiplicativity, one can see that

$$\begin{aligned}
(a + bi + cj + dk)(a' + b'i + c'j + d'k) ={}& (aa' - bb' - cc' - dd') + (ab' + ba' + cd' - dc')i \\
&+ (ac' + ca' + db' - bd')j + (ad' + da' + bc' - cb')k.
\end{aligned}$$

If we exchange $(a, b, c, d)$ with $(a', b', c', d')$ and switch the signs of $b, c, d, b', c', d'$ one can see that it is the same as taking the conjugate: both give

$$(aa' - bb' - cc' - dd') - (ab' + ba' + cd' - dc')i - (ac' + ca' + db' - bd')j - (ad' + da' + bc' - cb')k$$

($a$ times something switches sign because $a$ stays the same but the other factor switches sign, and $cd' - dc'$ switches sign too because $(c, d) \leftrightarrow (c', d')$). □

Exercise 2.5.6∗ . Check that (a + bi + cj + dk)(a − bi − cj − dk) is indeed a2 + b2 + c2 + d2 .



Solution

$(a + bi + cj + dk)(a - bi - cj - dk) = a^2 + b^2 + c^2 + d^2$ because the non-real terms cancel out because of Exercise 2.5.1∗, for instance $bi(-cj) + cj(-bi) = 0$. □

Exercise 2.5.7∗ . Prove that H is a skew field. This amounts to checking that elements have multi-
plicative inverses (i.e. for any α there is a β such that αβ = βα = 1).

Solution

This follows from Exercise 2.5.6∗: the inverse of $a + bi + cj + dk$ is $\frac{a - bi - cj - dk}{a^2 + b^2 + c^2 + d^2}$. □

Exercise 2.5.8∗ . Prove that the norm is multiplicative: for any α, β ∈ H, N (αβ) = N (α)N (β).

Solution

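This follows from Exercise 2.5.5∗:

$$N(\alpha\beta) = \alpha\beta\,\overline{\beta}\,\overline{\alpha} = \alpha\overline{\alpha}\beta\overline{\beta} = N(\alpha)N(\beta)$$

because real numbers commute with every quaternion and $\beta\overline{\beta}$ is a real number. □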
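Written out in coordinates, this multiplicativity is Euler's four-square identity. A quick numerical sketch, with quaternions as 4-tuples and the helper names `qmul` and `qnorm` chosen for the example:

```python
import random


def qmul(x, y):
    """Quaternion product of x = (a, b, c, d) and y = (e, f, g, h), as 4-tuples."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e - b * f - c * g - d * h,
            a * f + b * e + c * h - d * g,
            a * g + c * e + d * f - b * h,
            a * h + d * e + b * g - c * f)


def qnorm(x):
    """N(a + bi + cj + dk) = a^2 + b^2 + c^2 + d^2."""
    return sum(t * t for t in x)


random.seed(1)
for _ in range(1000):
    x = tuple(random.randint(-20, 20) for _ in range(4))
    y = tuple(random.randint(-20, 20) for _ in range(4))
    # Norm of the product equals the product of the norms.
    assert qnorm(qmul(x, y)) == qnorm(x) * qnorm(y)

print(qmul((1, 2, 3, 4), (5, 6, 7, 8)))
```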

Exercise 2.5.9∗. Prove that $H = \left\{\frac{a + bi + cj + dk}{2} \mid a \equiv b \equiv c \equiv d \pmod{2}\right\}$. Deduce that the elements of $H$ have integral norms.

Solution

When you multiply two such elements you get back an element of the same form. Indeed, it is clear when one of the factors is in Z[i, j, k]. For the same reason, we can consider a, b, c, d modulo 2. Thus, we just need to show that the product of two such elements with odd a, b, c, d still has this form, which is true for (1 + i + j + k)/2 · (1 − i − j − k)/2 = 1.

From this we conclude that H is included in this set. However it is also clear that H contains this set so they are equal. □

Exercise 2.5.10∗ . Determine the units of H.

Solution

$\alpha = a + bi + cj + dk$ is a unit if and only if $a^2 + b^2 + c^2 + d^2 = N(\alpha) = \pm 1$. This means

$$\alpha \in \left\{\pm 1, \pm i, \pm j, \pm k, \frac{\pm 1 \pm i \pm j \pm k}{2}\right\}.$$

Indeed, either $a, b, c, d$ are all integers in which case one of them is $\pm 1$ and the rest 0, or they are all half integers in which case they must all be $\pm\frac{1}{2}$ as otherwise the sum is too big. □

Exercise 2.5.11∗ . Let α, β, γ ∈ H. Prove that α d β implies α d βγ but does not always imply
α d γβ.

Solution

Suppose that α d β, i.e. β = αδ. Then βγ = α(δγ) so α d βγ too.

For the second part, a very simple counter-example is α = β = 1 + i + j and γ = i. Suppose that
α d γβ. This means that there exists a δ ∈ H such that αδ = γβ, i.e.

(1 + i + j)δ = i(1 + i + j).

In other words, 3 divides


(1 − i − j)i(1 + i + j).
However, this is equal to
(i + 1 + k)(1 + i + j) = i + 2j + 2k
which is clearly not divisible by 3. □
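The product in this counter-example is easy to get wrong by hand, so here is a sketch that recomputes it. Quaternions are 4-tuples, `qmul` is an illustrative helper, and `is_hurwitz` encodes the description of the Hurwitz integers from Exercise 2.5.9∗ (all coordinates integers, or all coordinates halves of odd integers).

```python
from fractions import Fraction


def qmul(x, y):
    """Quaternion product of x = (a, b, c, d) and y = (e, f, g, h), as 4-tuples."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e - b * f - c * g - d * h,
            a * f + b * e + c * h - d * g,
            a * g + c * e + d * f - b * h,
            a * h + d * e + b * g - c * f)


def is_hurwitz(x):
    """All coordinates are integers, or all are halves of odd integers."""
    dens = {Fraction(t).denominator for t in x}
    return dens == {1} or dens == {2}


# (1 - i - j) * i * (1 + i + j)
prod = qmul(qmul((1, -1, -1, 0), (0, 1, 0, 0)), (1, 1, 1, 0))
quotient = tuple(Fraction(t, 3) for t in prod)
print(prod, is_hurwitz(quotient))
```

Since the quotient by 3 is not a Hurwitz integer, no δ with (1 + i + j)δ = i(1 + i + j) exists.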

Exercise 2.5.12∗ . Prove that being left-associate is an equivalence relation, i.e., for any α, β, γ, α is
a left-associate of itself, α is a left-associate of β if and only if β is a left-associate of α, and if α is a
left-associate of β and β is a left associate of γ then α is a left-associate of γ.

Solution

We have α = α · 1,
α = βε ⇐⇒ β = αε−1 ,
and
α = βε, β = γη =⇒ α = γηε.


Exercise 2.5.13∗ . Prove that a left-gcd γ of α and β satisfies the following property: γ d α, β and
if δ d α, β then δ d γ.

Solution

Since α, β ∈ γR, we have γ d α, β. Write γ = αx + βy. If δ d α, β, then δ d αx + βy = γ by


Exercise 2.5.11∗ . 

Exercise 2.5.14. Prove that 1 + i and 1 − j do not have a left-gcd in Z[i, j, k]. In particular, it
is not left-Bézout and thus not left-Euclidean too (and the same holds for being right-Bézout and
right-Euclidean by symmetry).

Solution

Let L = Z[i, j, k]. Suppose otherwise and let γ be a left-gcd. Note that 1 + i has norm 2 so γ
must have norm 1 or 2. If it has norm 1, it is a unit so (1 + i)L + (1 − j)L = γL. Let α and β
be such that (1 + i)α + (1 − j)β = 1. Then,
1 = (1 + i)α + (1 − j)β = (1 + i)α + (1 + i)(1/2 − i/2 − j/2 + k/2)β = (1 + i)(α + (1/2 − i/2 − j/2 + k/2)β)
which is impossible since the norm of the LHS is divisible by 2 since the second factor is a Hurwitz
integer so has integral norm by Exercise 2.5.9∗ .
If γ has norm 2 (this is what happens in H), then, since γ left-divides 1 + i and 1 − j and has

the same norm as them it is a left-associate of them and thus 1 + i and 1 − j are left-associates
too by Exercise 2.5.12∗ . This is a contradiction since the only units of L are ±1, ±i, ±j, ±k by
Exercise 2.5.10∗ . 

Exercise 2.5.15∗ . Prove Proposition 2.5.1.

Solution

By symmetry, suppose R is a left-Euclidean domain with Euclidean function f. Let γ ∈ αR + βR be a non-zero element such that f(γ) is minimal. Suppose for the sake of a contradiction that α ∉ γR. Write α = γρ + τ for some non-zero τ with f(τ) < f(γ). This contradicts the minimality of f(γ) since τ ∈ αR + βR too. Thus α ∈ γR and by symmetry β too. □

Exercise 2.5.16∗ . Prove Proposition 2.5.3.

Solution

We proceed by induction on N (α), the base case is the units. If α is irreducible it is its own
factorisation, otherwise write α = βγ with N (β), N (γ) > 1. Then N (β), N (γ) < N (α) so they
can be written as a product of irreducible elements and thus α too. 

Exercise 2.5.17. Prove that there is an irreducible Hurwitz integer x ∈ H for which there exist α
and β such that x d αβ but x left-divides neither α nor β.

Solution

1 + i has norm 2 which is a rational prime so is irreducible. In addition, 1 + i d 2 = (1 + j)(1 − j)


but 1 + i left-divides neither of 1 + j and 1 − j. Indeed, if it did, then there would be a unit η
such that 1 + j = (1 + i)η since they have the same norm. However, one can see that none of the
 
1±i±j±k
η ∈ ±1, ±i, ±j, ±k,
2

work since there are always two non-zero coefficients of i, j, k except when the unit is ±1 or ±j
(which do not work). 

Exercise 2.5.18∗ . Let p be a rational prime. Prove that there exist rational integers a and b such
that p | 1 + a2 + b2 .

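Solution

For odd $p$, $1 + a^2$ and $-b^2$ both reach $\frac{p+1}{2}$ values modulo $p$, so these two sets of values cannot be disjoint since $\mathbb{F}_p$ only has $p < \frac{p+1}{2} + \frac{p+1}{2}$ elements. For $p = 2$, one can take $a = 1$ and $b = 0$. □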
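The pigeonhole argument is non-constructive, but for a given p a pair (a, b) is easy to find by tabulating the values of 1 + a² modulo p. The function name `find_ab` is illustrative, and p is assumed to be prime.

```python
def find_ab(p):
    """Return (a, b) with p | 1 + a^2 + b^2, by matching values of 1 + a^2 and -b^2 mod p."""
    table = {(1 + a * a) % p: a for a in range(p)}
    for b in range(p):
        r = (-b * b) % p
        if r in table:
            return table[r], b
    return None  # not reached for prime p, by the counting argument


for p in [2, 3, 5, 7, 11, 13, 101]:
    a, b = find_ab(p)
    # Each returned pair satisfies the required divisibility.
    assert (1 + a * a + b * b) % p == 0
```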

2.6 Exercises
Diophantine Equations
Exercise 2.6.2† . Prove that OQ(√2) and OQ(√−2) are Euclidean.
224 CHAPTER 2. QUADRATIC INTEGERS

Solution

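We proceed exactly as in Proposition 2.3.1. Let $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{\pm 2})} = \mathbb{Z}[\sqrt{\pm 2}]$ be algebraic integers,
with $\beta \neq 0$. Write $\alpha/\beta = x + y\sqrt{\pm 2}$ with $x, y \in \mathbb{Q}$. Choose rational integers $m, n$ such that $|x - m|, |y - n| \le \frac{1}{2}$.
Then,

$$\left|N\left((x + y\sqrt{\pm 2}) - (m + n\sqrt{\pm 2})\right)\right| \le \left(\frac{1}{2}\right)^2 + 2\left(\frac{1}{2}\right)^2 = \frac{3}{4} < 1$$

so that $\tau = \alpha - \beta(m + n\sqrt{\pm 2})$ works as the remainder of the Euclidean division of $\alpha$ by $\beta$ since it
has norm less than $|N(\beta)|$.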
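
The rounding procedure of this solution can be sketched in Python for $\mathbb{Z}[\sqrt{-2}]$. This is only an illustration under our own conventions: elements are stored as pairs `(a, b)` standing for $a + b\sqrt{-2}$, and the helper names `norm`, `mul` and `divmod_euclid` are assumptions of this sketch.

```python
from fractions import Fraction

D = 2  # we work in Z[sqrt(-D)] with D = 2; (a, b) stands for a + b*sqrt(-2)


def norm(z):
    a, b = z
    return a * a + D * b * b


def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - D * b * d, a * d + b * c)


def divmod_euclid(alpha, beta):
    """Return (q, tau) with alpha = beta*q + tau and N(tau) < N(beta)."""
    a, b = alpha
    c, d = beta
    n = norm(beta)
    # alpha / beta = alpha * conj(beta) / N(beta) = x + y*sqrt(-2)
    x = Fraction(a * c + D * b * d, n)
    y = Fraction(b * c - a * d, n)
    q = (round(x), round(y))      # nearest integers: |x - m|, |y - n| <= 1/2
    bq = mul(beta, q)
    tau = (a - bq[0], b - bq[1])
    return q, tau
```

For instance, dividing $7 + 3\sqrt{-2}$ by $2 + \sqrt{-2}$ gives the quotient $3$ and the remainder $1$.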

Exercise 2.6.4† . Prove that OQ(√−7) is Euclidean.

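Solution

Let $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{-7})} = \mathbb{Z}\left[\frac{1+\sqrt{-7}}{2}\right]$ be quadratic integers with $\beta \neq 0$. Write $\frac{\alpha}{\beta} = x + y\sqrt{-7}$ with
$x, y \in \mathbb{Q}$. Pick a half-integer $n$ (i.e. an element of $\frac{1}{2}\mathbb{Z}$) such that $|y - n| \le \frac{1}{4}$, and a half-integer $m \equiv n \pmod 1$ such
that $|x - m| \le \frac{1}{2}$. Then,

$$\left|N\left((x - m) + (y - n)\sqrt{-7}\right)\right| \le \left(\frac{1}{2}\right)^2 + 7\left(\frac{1}{4}\right)^2 = \frac{11}{16} < 1.$$

Thus, the remainder $\tau = \beta\left((x - m) + (y - n)\sqrt{-7}\right)$ works since it has norm less than $|N(\beta)|$ by
the previous computation and $\alpha = \beta(m + n\sqrt{-7}) + \tau$.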

Exercise 2.6.6† . Solve the equation $x^2 + 11 = y^3$ over $\mathbb{Z}$.

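Solution

First, we prove that $\mathbb{Q}(\sqrt{-11})$ is norm-Euclidean. Let $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{-11})} = \mathbb{Z}\left[\frac{1+\sqrt{-11}}{2}\right]$ be
quadratic integers with $\beta \neq 0$. Write $\frac{\alpha}{\beta} = x + y\sqrt{-11}$ with $x, y \in \mathbb{Q}$. Pick a half-integer $n$ such
that $|y - n| \le \frac{1}{4}$, and a half-integer $m \equiv n \pmod 1$ such that $|x - m| \le \frac{1}{2}$. Then,

$$\left|N\left((x - m) + (y - n)\sqrt{-11}\right)\right| \le \left(\frac{1}{2}\right)^2 + 11\left(\frac{1}{4}\right)^2 = \frac{15}{16} < 1.$$

Thus, the remainder $\tau = \beta\left((x - m) + (y - n)\sqrt{-11}\right)$ works since it has norm less than $|N(\beta)|$
by the previous computation and $\alpha = \beta(m + n\sqrt{-11}) + \tau$.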

We conclude that $\mathbb{Q}(\sqrt{-11})$ is Euclidean and thus a UFD. It is easy to check that 2 stays prime
in $\mathbb{Q}(\sqrt{-11})$. Rewrite the equation $x^2 + 11 = y^3$ as $(x + \sqrt{-11})(x - \sqrt{-11}) = y^3$. Note that, if
$y$ is even, then

$$2 \mid \frac{x + \sqrt{-11}}{2} \cdot \frac{x - \sqrt{-11}}{2}$$

which is impossible since 2 is prime and divides neither of these factors. Thus, $y$ is odd, which
means that $x$ is even. Let $\delta$ be the gcd of $x + \sqrt{-11}$ and $x - \sqrt{-11}$. Since $\delta \mid 2\sqrt{-11}$ and $2x$, we
get $\delta = 1$ as $2 \nmid x + \sqrt{-11}$ and $\sqrt{-11} \nmid x + \sqrt{-11}$ (otherwise $11 \mid x, y$ which gives a contradiction
modulo $11^2$) and $\sqrt{-11}$ is prime (it has prime norm).

We conclude that, since $\mathbb{Q}(\sqrt{-11})$ is a UFD, we have

$$x + \sqrt{-11} = \varepsilon\left(\frac{a + b\sqrt{-11}}{2}\right)^3$$

for some rational integers $a \equiv b \pmod 2$ and $\varepsilon$ a unit. Since the units of $\mathbb{Q}(\sqrt{-11})$ are just $\pm 1$
(see Section 7.1), $\varepsilon$ is also a cube so we may assume $\varepsilon = 1$.

To conclude, by looking at the real and imaginary parts, we get $8x = a(a^2 - 33b^2)$ and $8 =
b(3a^2 - 11b^2)$. Hence, $b \mid 8$.
• If $b = \pm 1$, we get $3a^2 - 11 = \pm 8$, i.e. $b = -1$ and $a = \pm 1$. This yields $(x, y) = (\pm 4, 3)$.

• If $b = \pm 2$, we have $3a^2 - 44 = \pm 4$, i.e. $b = 2$ and $a = \pm 4$. This yields $(x, y) = (\pm 58, 15)$.
• If $4 \mid b$, then, since $a \equiv b \pmod 2$, we get $16 \mid b(3a^2 - 11b^2) = 8$ which is impossible.
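
As a sanity check of the case analysis, here is a short brute-force search in Python. It is only a numerical confirmation on a finite range (the bound `y_max` and the function name `solutions` are choices of this sketch), not a substitute for the proof.

```python
from math import isqrt


def solutions(y_max):
    """Return all (x, y) with x >= 0, 1 <= y <= y_max and x^2 + 11 = y^3."""
    sols = []
    for y in range(1, y_max + 1):
        x2 = y ** 3 - 11
        if x2 >= 0:
            x = isqrt(x2)
            if x * x == x2:      # y^3 - 11 is a perfect square
                sols.append((x, y))
    return sols


print(solutions(1000))  # [(4, 3), (58, 15)]
```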


Exercise 2.6.8† . Let $n$ be a non-negative rational integer. In how many ways can $n$ be written as a
sum of two squares of rational integers? (Two ways are considered different if the ordering is different,
for instance $2 = 1^2 + (-1)^2$ and $2 = (-1)^2 + 1^2$ are different.)

Solution

Let $n \ge 1$ be an integer. We wish to count in how many ways it is the sum of two squares. First
we do the case where $n = \prod_{i=1}^{k} p_i$ is squarefree and all its prime factors are 1 modulo 4. Then,
$n = \prod_{i=1}^{k} \pi_i\overline{\pi_i}$ is a product of $k$ Gaussian primes and their conjugates. Expressing it as a sum of
two squares is equivalent to writing it as a product of two conjugate Gaussian integers $n = \alpha\overline{\alpha}$,
which is equivalent to picking one out of $\pi_i$ and $\overline{\pi_i}$ for each $i$ and putting it in $\alpha$. Thus, $n$ is a
sum of two squares in $2^k$ different ways, which turns out to be its number of divisors.

Now, suppose $n = \prod_{i=1}^{k} p_i^{n_i}$ where the $p_i$ are distinct and 1 modulo 4 again. The same approach
as before works, except we need to be more careful with where we put repeated prime factors.
For instance, for $n = p^2$, $\alpha = \pi\overline{\pi}$ and $\alpha = \overline{\pi}\pi$ are in fact the same. Here is how we do it: we
distribute some amount of $\pi_i$ to $\alpha$, say $\pi_i^{j}$, and fill the rest with $\overline{\pi_i}^{\,n_i - j}$. There are $n_i + 1$ ways to
do this, so in total $n$ is a sum of two squares in

$$(n_1 + 1) \cdot \ldots \cdot (n_k + 1)$$

ways, which turns out to be again the number of divisors of $n$.


Finally, we treat the general case, i.e. $n = 2^r \prod_{i=1}^{k} p_i^{n_i} \prod_{i=1}^{\ell} q_i^{2m_i}$, where the $p_i$ are distinct primes
congruent to 1 modulo 4 and the $q_i$ distinct primes congruent to $-1$ modulo 4 (any sum of two
squares has this form by Exercise 2.3.3). Notice that the dyadic valuation of $n$ doesn't change
the number of ways to represent it as a sum of two squares, since $2 = -i(1 + i)^2$ so we need to
distribute the same prime to $\alpha$ and $\overline{\alpha}$. Thus we may assume $r = 0$. Then, notice that since all
$q_i$ are $-1$ modulo 4, if $n = a^2 + b^2$ then

$$\frac{n}{\prod_{i=1}^{\ell} q_i^{2m_i}} = \left(\frac{a}{\prod_{i=1}^{\ell} q_i^{m_i}}\right)^2 + \left(\frac{b}{\prod_{i=1}^{\ell} q_i^{m_i}}\right)^2$$

so that the number of ways to represent $n$ as a sum of two squares is the same as the number of
ways to represent $\prod_{i=1}^{k} p_i^{n_i}$ as a sum of two squares, i.e. $(n_1 + 1) \cdot \ldots \cdot (n_k + 1)$. Finally, we may
in fact summarise all of the above discussion as follows: the number of ways to represent $n$ as a
sum of two squares is

$$\sum_{d \mid n} \chi_4(d) = d_1(n) - d_{-1}(n),$$

where $\chi_4(d)$ is 1 if $d \equiv 1 \pmod 4$, $-1$ if $d \equiv -1 \pmod 4$ and 0 otherwise, $d_1$ is the number of
(positive) divisors congruent to 1 modulo 4, and $d_{-1}$ the number of (positive) divisors congruent
to $-1$ modulo 4.
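
The formula can be tested numerically. One caveat, which is our reading rather than something stated in the solution: the count above treats a Gaussian integer and its associates as the same factor $\alpha$, so if every pair $(a, b) \in \mathbb{Z}^2$ is counted separately (all signs and both orders), the total is four times $\sum_{d \mid n} \chi_4(d)$, one for each unit of $\mathbb{Z}[i]$. The Python sketch below checks this version; the function names are ours.

```python
from math import isqrt


def chi4(d):
    """The character chi_4: 1 if d = 1 (mod 4), -1 if d = 3 (mod 4), else 0."""
    return (0, 1, 0, -1)[d % 4]


def r2_all(n):
    """Count all (a, b) in Z^2 with a^2 + b^2 = n (signs and order matter)."""
    r = isqrt(n)
    return sum(1 for a in range(-r, r + 1) for b in range(-r, r + 1)
               if a * a + b * b == n)


def chi_sum(n):
    """Sum of chi_4(d) over the positive divisors d of n."""
    return sum(chi4(d) for d in range(1, n + 1) if n % d == 0)


for n in range(1, 200):
    assert r2_all(n) == 4 * chi_sum(n)
```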

Remark 2.6.1
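In fact, the fact that the number of representations of $n$ as a sum of two squares is $\sum_{d \mid n} \chi_4(d)$ is
not at all a coincidence. Let $r_2(n)$ denote the number of ways of representing $n$ as a sum of two squares
$a^2 + b^2$ with $a > 0$ and $b \ge 0$ (so that each Gaussian integer $a + bi$ is counted once up to
multiplication by a unit), and $s$ a formal object we will specify later. Then,

$$\begin{aligned}
\sum_{n=1}^{\infty} \frac{r_2(n)}{n^s} &= \sum_{a > 0,\ b \ge 0} \frac{1}{(a^2 + b^2)^s} \\
&= \sum_{\Re(x) > 0,\ \Im(x) \ge 0} \frac{1}{N(x)^s} \\
&= \left(\sum_{k=0}^{\infty} \frac{1}{N(1+i)^{ks}}\right) \prod_{p \equiv -1 \ (\mathrm{mod}\ 4)} \left(\sum_{k=0}^{\infty} \frac{1}{N(p)^{ks}}\right) \prod_{p \equiv 1 \ (\mathrm{mod}\ 4)} \left(\sum_{k=0}^{\infty} \frac{1}{N(\pi)^{ks}}\right)\left(\sum_{k=0}^{\infty} \frac{1}{N(\overline{\pi})^{ks}}\right) \\
&= \left(\sum_{k=0}^{\infty} \frac{1}{2^{ks}}\right) \prod_{p \equiv 1 \ (\mathrm{mod}\ 4)} \left(\sum_{k=0}^{\infty} \frac{1}{p^{ks}}\right)^2 \prod_{p \equiv -1 \ (\mathrm{mod}\ 4)} \left(\sum_{k=0}^{\infty} \frac{1}{p^{2ks}}\right) \\
&= \frac{1}{1 - 2^{-s}} \prod_{p \equiv 1 \ (\mathrm{mod}\ 4)} \frac{1}{(1 - p^{-s})^2} \prod_{p \equiv -1 \ (\mathrm{mod}\ 4)} \frac{1}{1 - p^{-2s}}
\end{aligned}$$

by uniqueness of the prime factorisation in $\mathbb{Z}[i]$ (we take the sum of $\frac{1}{N(x)^s}$ over all Gaussian
integers except we only choose one associate for each integer, and the product is also taken over
all Gaussian primes but with only one associate for each prime; here $p = \pi\overline{\pi}$ for $p \equiv 1 \pmod 4$). This function is called the Dedekind
zeta function $\zeta_{\mathbb{Q}(i)}$ of $\mathbb{Q}(i)$. Now consider the Dirichlet $L$-function of $\chi = \chi_4$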

$$L(s, \chi) := \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}.$$

We claim that $\zeta_{\mathbb{Q}(i)}(s) = \zeta(s)L(s, \chi)$, where $\zeta$ is the usual Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$.
This is easy: by similar manipulations as before, we have $\zeta(s) = \prod_p \frac{1}{1 - p^{-s}}$ and

$$L(s, \chi) = \prod_{p} \frac{1}{1 - \chi(p)p^{-s}} = \prod_{p \equiv 1 \ (\mathrm{mod}\ 4)} \frac{1}{1 - p^{-s}} \prod_{p \equiv -1 \ (\mathrm{mod}\ 4)} \frac{1}{1 + p^{-s}}$$

and thus $\zeta(s)L(s, \chi) = \zeta_{\mathbb{Q}(i)}(s)$ (since $1 - p^{-2s} = (1 - p^{-s})(1 + p^{-s})$). Finally, by regular expansion,
we have

$$\zeta(s)L(s, \chi) = \sum_{n \ge 1} \frac{\sum_{d \mid n} \chi(d)}{n^s}.$$

Since $\zeta_{\mathbb{Q}(i)}(s) = \sum_{n \ge 1} \frac{r_2(n)}{n^s}$, we get the wanted expression for $r_2(n)$ (we do not need complex
analysis as the above manipulations were purely formal). This solution might seem a lot more
complicated than the previous one, but it is in fact a lot deeper as this product formula can be
generalised to any Galois extension of number fields.

Exercise 2.6.11† (Euler). Let $n \ge 3$ be an integer. Prove that there exist unique positive odd
rational integers $x$ and $y$ such that $2^n = x^2 + 7y^2$.

Solution

Rewrite this as

$$\frac{x + y\sqrt{-7}}{2} \cdot \frac{x - y\sqrt{-7}}{2} = 2^{n-2}.$$

Thus, solving $x^2 + 7y^2 = 2^n$ in odd integers amounts to writing $2^{n-2}$ as a product of conjugate
odd factors in $\mathbb{Q}(\sqrt{-7})$. We have seen in Exercise 2.6.4† that $\mathbb{Q}(\sqrt{-7})$ is Euclidean, thus we
seek the prime factorisation of 2. Since $\frac{1 \pm \sqrt{-7}}{2}$ have norm 2 which is a rational prime, the prime
factorisation of 2 reads

$$2 = \frac{1 + \sqrt{-7}}{2} \cdot \frac{1 - \sqrt{-7}}{2} = \alpha\beta.$$

Now, write $\frac{x + y\sqrt{-7}}{2} = \alpha^i\beta^j$ with $i + j = n - 2$. Since it is not divisible by $2 = \alpha\beta$, we have
$i = 0$ or $j = 0$, and which one it is depends on the sign of its $\sqrt{-7}$ part. This shows that there
is exactly one pair which works. Indeed, it is also clear that they work since they will have the
same parity and thus be both odd as $\alpha^2 = \frac{-3 + \sqrt{-7}}{2} \equiv \alpha \pmod 2$.
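
The existence and uniqueness statement is easy to test for small $n$. The following Python sketch (with a function name, `odd_solutions`, of our choosing) lists all pairs of positive odd integers $(x, y)$ with $x^2 + 7y^2 = 2^n$ by trying every odd $y$.

```python
from math import isqrt


def odd_solutions(n):
    """All (x, y) with x, y positive and odd such that x^2 + 7*y^2 = 2^n."""
    target = 2 ** n
    sols = []
    y = 1
    while 7 * y * y < target:
        x2 = target - 7 * y * y
        x = isqrt(x2)
        if x * x == x2 and x % 2 == 1:
            sols.append((x, y))
        y += 2                      # only odd values of y
    return sols


for n in range(3, 25):
    assert len(odd_solutions(n)) == 1   # exactly one pair, as claimed
```

For example, `odd_solutions(3)` is `[(1, 1)]`, `odd_solutions(5)` is `[(5, 1)]` and `odd_solutions(6)` is `[(1, 3)]`.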

Exercise 2.6.12† (Fermat’s Last Theorem for n = 4). Show that the equations $\alpha^4 + \beta^4 = \gamma^2$ and
$\alpha^4 - \beta^4 = \gamma^2$ have no non-zero solution $\alpha, \beta, \gamma \in \mathbb{Z}[i]$.

Solution

Surprisingly, our solution is very similar to the proof of Theorem 2.4.1. Set $\lambda = 1 - i$. Any
fourth power $\theta^4$ which isn't divisible by $\lambda$ is congruent to 1 modulo $\lambda^6$: $\theta$ is congruent to 1 or
$i$ modulo 2, so $\theta^2$ is congruent to $\pm 1$ modulo 4, which implies that $\theta^4 \equiv 1 \pmod 8$ as wanted
(note that $\lambda^6 = 8i$ is an associate of 8).
Another way to see this is to notice that two factors of $\theta^4 - 1 = (\theta - 1)(\theta + 1)(\theta - i)(\theta + i)$
are divisible by $\lambda^2$ and the other two by $\lambda$.
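
The congruence $\theta^4 \equiv 1 \pmod 8$ can be checked on a box of Gaussian integers. In this Python sketch (our own illustration), $\theta = a + bi$ is stored as the pair `(a, b)`, and we use that $\lambda = 1 - i$ does not divide $a + bi$ exactly when $a + b$ is odd, i.e. when the norm $a^2 + b^2$ is odd.

```python
def fourth_power(a, b):
    """Return (re, im) with (a + b*i)^4 = re + im*i."""
    x, y = a * a - b * b, 2 * a * b        # (a + bi)^2 = x + yi
    return x * x - y * y, 2 * x * y        # (x + yi)^2


for a in range(-20, 21):
    for b in range(-20, 21):
        if (a + b) % 2 == 1:               # lambda does not divide a + bi
            re, im = fourth_power(a, b)
            # theta^4 - 1 is divisible by 8 (an associate of lambda^6)
            assert (re - 1) % 8 == 0 and im % 8 == 0
```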

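Now suppose for the sake of a contradiction that $\alpha, \beta, \gamma \in \mathbb{Z}[i]$ are pairwise coprime and non-zero
such that $\alpha^4 + \beta^4 = \gamma^2$. If $\lambda$ divides neither $\alpha$ nor $\beta$, then the equation becomes $1 + 1 \equiv \gamma^2$ modulo
$\lambda^6$ by our first remark. In particular, $v_\lambda(\gamma) = 1$; set $\gamma = \lambda\delta$. Then, $\lambda^2\delta^2 \equiv 2 = i\lambda^2 \pmod{\lambda^6}$ so
$\delta^2 \equiv i \pmod{\lambda^4}$ which contradicts our previous observation: $\delta^4 \equiv -1 \not\equiv 1 \pmod{\lambda^6}$.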

Hence, $\lambda$ must divide $\alpha\beta$, say it divides $\alpha$. We will now prove, as we did in Theorem 2.4.1, that the
more general equation
$$\varepsilon\lambda^{4n}\alpha^4 + \beta^4 = \gamma^2$$
does not have solutions with $\lambda \nmid \alpha, \beta, \gamma \in \mathbb{Z}[i]$ and $\varepsilon \in \mathbb{Z}[i]^\times$ a unit for $n \ge 1$. Hence, suppose that
$\alpha, \beta, \gamma, \varepsilon$ is a solution with minimal $n$. We first prove that $n$ must be at least 2. If there were a
solution with $n = 1$, modulo $\lambda^4$ we get $\beta^4 \equiv 1 \equiv \gamma^2$. We will prove that $\gamma^2$ is in fact congruent
to 1 modulo $\lambda^5$, whence $\lambda^5 \mid \lambda^{4n}$ which implies $n \ge 2$. To prove this notice that $\gamma$ cannot be
congruent to $i$ modulo $\lambda^2$ so must be congruent to 1. Set $\gamma = \lambda^2\rho + 1$ and notice that

$$\gamma^2 - 1 = (\gamma - 1)(\gamma + 1) = \lambda^2\rho(\lambda^2\rho + 2) = \lambda^4\rho(\rho + i)$$

is divisible by $\lambda^5$ since one of $\rho$ and $\rho + i$ must be divisible by $\lambda$.

Hence, $n \ge 2$. We have
$$\varepsilon\lambda^{4n}\alpha^4 = (\gamma - \beta^2)(\gamma + \beta^2).$$
We can see that both factors are congruent modulo 2 and that their gcd divides 2, which means
that
$$\begin{cases} \gamma \pm \beta^2 = u\lambda^2x^4 \\ \gamma \mp \beta^2 = v\lambda^{4n-2}y^4 \end{cases}$$
for some units $u, v$ and Gaussian integers $x, y$ with $\lambda \nmid x, y$. Subtracting the two lines yields the equation

$$2\beta^2 = u\lambda^2x^4 + v\lambda^{4n-2}y^4,$$

i.e.
$$\mu x^4 + \eta\lambda^{4(n-1)}y^4 = \beta^2$$
where $\mu = -iu$ and $\eta = -iv$ are units. It only remains to prove that $\mu = \pm 1$ to conclude that
we have found a solution to the equation corresponding to $n - 1 \ge 1$, thus contradicting its
minimality. Indeed, if $\mu = -1$ we get the equation

$$x^4 - \eta\lambda^{4(n-1)}y^4 = (\beta i)^2.$$

For this, consider our equation modulo $\lambda^4$ to get $\mu \equiv \pm 1$ as wanted.

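We now consider the equation $\alpha^4 - \beta^4 = \gamma^2$. Suppose that $(\alpha, \beta, \gamma)$ is a non-zero solution of
coprime Gaussian integers. Without loss of generality, suppose as well that $\lambda \nmid \alpha$: if $\lambda \mid \alpha$ then
$\lambda \nmid \beta$ and $(\beta, \alpha, i\gamma)$ is a solution. Note that we are already done when
$\lambda \mid \beta$ because we solved the more general equation

$$\alpha^4 + \varepsilon\lambda^{4n}\beta^4 = \gamma^2$$

where $\varepsilon$ is a unit and $n \ge 1$. Hence, it remains to settle the case where $\lambda \nmid \alpha, \beta$. In that case,
rewrite the equation as $\beta^4 = (\alpha^2 - \gamma)(\alpha^2 + \gamma)$. The two factors are coprime since $\lambda \nmid \beta$ so

$$\alpha^2 \pm \gamma = \varepsilon_\pm\delta_\pm^4$$

for some units $\varepsilon_\pm \in \mathbb{Z}[i]$ and Gaussian integers $\delta_\pm$ with $\lambda \nmid \delta_\pm$. This then yields

$$\varepsilon_-\delta_-^4 + \varepsilon_+\delta_+^4 = 2\alpha^2.$$

Modulo $\lambda^2$, we get $\varepsilon_- + \varepsilon_+ \equiv 0$. It is easy to see that this implies $\varepsilon_- + \varepsilon_+ = 0$. But then, since
we know fourth powers are congruent to 1 modulo $\lambda^6$, we get

$$\lambda^6 \mid \varepsilon_-\delta_-^4 + \varepsilon_+\delta_+^4 = 2\alpha^2$$

which implies $\lambda \mid \alpha$ and is a contradiction.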

Exercise 2.6.14† . Prove that OQ(√5) is Euclidean.

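Solution

Let $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{5})} = \mathbb{Z}\left[\frac{1+\sqrt{5}}{2}\right]$ be quadratic integers with $\beta \neq 0$. Write $\frac{\alpha}{\beta} = x + y\sqrt{5}$ with
$x, y \in \mathbb{Q}$. Pick a half-integer $n$ such that $|y - n| \le \frac{1}{4}$, and a half-integer $m \equiv n \pmod 1$ such
that $|x - m| \le \frac{1}{2}$. Then,

$$\left|N\left((x - m) + (y - n)\sqrt{5}\right)\right| \le \left(\frac{1}{2}\right)^2 + 5\left(\frac{1}{4}\right)^2 = \frac{9}{16} < 1.$$

Thus, the remainder $\tau = \beta\left((x - m) + (y - n)\sqrt{5}\right)$ works since it has norm less than $|N(\beta)|$ by
the previous computation and $\alpha = \beta(m + n\sqrt{5}) + \tau$.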

Hurwitz Integers and Jacobi’s Four Square Theorem


Exercise 2.6.15† . Let $\alpha \in H$ be a primitive Hurwitz integer, meaning that there does not exist a
rational integer $m \ge 2$ such that $\frac{\alpha}{m} \in H$, and let $N(\alpha) = p_1 \cdot \ldots \cdot p_n$ be its prime factorisation. Then, the
factorisation $\alpha = \pi_1 \cdot \ldots \cdot \pi_n$ into irreducible elements $\pi_i$ of norm $p_i$ is unique up to unit-migration,
meaning that if $\tau_1 \cdot \ldots \cdot \tau_k$ is another such factorisation, then $k = n$ and

$$\begin{cases} \tau_1 = \pi_1u_1 \\ \tau_2 = u_1^{-1}\pi_2u_2 \\ \quad\vdots \\ \tau_{n-1} = u_{n-2}^{-1}\pi_{n-1}u_{n-1} \\ \tau_n = u_{n-1}^{-1}\pi_n \end{cases}$$

for some units $u_1, \ldots, u_{n-1}$. Deduce that $\alpha$ is irreducible if and only if its norm is a rational prime.

Solution

Let $\alpha$ be a primitive Hurwitz integer. We shall construct its factorisation into irreducibles step by
step, proving at the same time its uniqueness. We proceed inductively on $n$. Consider the right-gcd
$\pi_1$ of $p_1$ and $\alpha$, i.e. $\pi_1$ such that $p_1H + \alpha H = \pi_1H$. This is unique up to multiplication on the
right by a unit. Note that this must necessarily be the $\pi_1$ we're looking for: if $\alpha = \pi_1' \cdot \ldots \cdot \pi_n'$ and
$N(\pi_1') = p_1$, $\pi_1'$ divides both $\alpha$ and $p_1$, and, by looking at the norm, must be the right-gcd of $p_1$
and $\alpha$. Indeed, if the right-gcd were $\delta = \pi_1'\rho'$ with $\rho'$ not a unit, it would have norm $p_1^2$ since $N(\rho') \mid N(p_1/\pi_1') = p_1$.
But then, $\delta \sim p_1$ so the right-gcd is $p_1$ which is impossible since $p_1 \nmid \alpha$.

Now, let's prove that this $\pi_1$ indeed has norm $p_1$. By construction, $\pi_1$ left-divides both $p_1$ and $\alpha$ so there exist $\rho$ and
$\beta$ such that $p_1 = \pi_1\rho$ and $\alpha = \pi_1\beta$. In particular, $N(\pi_1) \mid N(p_1) = p_1^2$. We have already proved
that there was a problem if $N(\pi_1) = p_1^2$: in that case $p_1 \sim \pi_1$ divides $\alpha$. It remains to settle the
case where $N(\pi_1) = 1$. This would mean that $p_1H + \alpha H = H$, which is impossible since any
element in the LHS has norm divisible by $p_1$:

$$\begin{aligned} N(p_1u + \alpha v) &= (p_1u + \alpha v)(p_1\overline{u} + \overline{v}\,\overline{\alpha}) \\ &\equiv \alpha v\overline{v}\,\overline{\alpha} \pmod{p_1} \\ &= N(\alpha)N(v) \\ &\equiv 0. \end{aligned}$$

Finally, suppose now that $\alpha$ is irreducible. It is then clearly primitive, since we have shown that
rational integers were reducible. Factorise its norm as a product of rational primes $p_1 \cdot \ldots \cdot p_k$
and factorise $\alpha$ as $\pi_1 \cdot \ldots \cdot \pi_k$ for irreducible elements $\pi_i$ of norm $p_i$. Since $\alpha$ is irreducible, we
must have $k = 1$, i.e. its norm $N(\alpha) = p_1$ is prime.

Exercise 2.6.16† . Prove that $(1 + i)H = H(1 + i)$.¹ Set $\omega = \frac{1+i+j+k}{2}$. We say a Hurwitz integer
$\alpha \in H$ is primary if it is congruent to 1 or $1 + 2\omega$ modulo $2 + 2i$.² Prove that, for any Hurwitz integer
$\alpha$ of odd norm, exactly one of its right-associates is primary.

Solution

As said in the footnote, 1 + i and its conjugate 1 − i = −i(1 + i) are associates. As a consequence,
1 + i right-divides α if and only if it left-divides α. In particular, the left and right multiples of
1 + i are the same.

For the second part, note that any Hurwitz integer $\alpha$ can be written in the form $a\omega + bi + cj + dk$.
Modulo 2, $\alpha$ is congruent to $1, i, j, k$, or $\omega$. In particular, there is a unit $\varepsilon$ such that $\varepsilon\alpha \equiv 1 \pmod 2$,
and this unit is unique up to sign. Then, any Hurwitz integer congruent to 1 modulo
2 is congruent to $\pm 1$ or $\pm(1 + 2\omega)$ modulo $2 + 2i$, so this determines the sign of $\varepsilon$.

¹ This means that we can manipulate congruences modulo $1 + i$ normally. Note that the choice of $i$ is not arbitrary
at all, since $1 - i = -i(1 + i)$ and $1 - j = (1 - \omega)(1 + i)$ are associates. By $\alpha \equiv \beta \pmod{\gamma}$, we mean that $\gamma$ divides $\alpha - \beta$
from the left and from the right.
² Note that a primary Hurwitz integer is always in $\mathbb{Z}[i, j, k]$.

Exercise 2.6.17† . Let $m \in \mathbb{Z}$ be an odd integer. Prove that the Hurwitz integers modulo $m$, $H/mH$,
are isomorphic to the algebra of two by two matrices modulo $m$, $(\mathbb{Z}/m\mathbb{Z})^{2\times 2}$. In addition, prove that
the determinant of the image is the norm of the quaternion.

Solution

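We shall copy Remark 2.5.2. Our goal is to find matrices with coefficients in $\mathbb{Z}/m\mathbb{Z}$ which square
to $-I_2$. We are now going to do something very abusive: we shall write $1$ for $I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $i$
for $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $j$ for $\begin{pmatrix} 0 & \mathrm{i} \\ \mathrm{i} & 0 \end{pmatrix}$ and $k$ for $\begin{pmatrix} \mathrm{i} & 0 \\ 0 & -\mathrm{i} \end{pmatrix}$, where the upright $\mathrm{i}$ is the complex number. Then, we will consider the matrix with integral
coefficients $a + bi + c\mathrm{i}j + d\mathrm{i}k$ for $a, b, c, d \in \mathbb{Z}$. When we square
this, we get
$$a^2 - b^2 + c^2 + d^2 + 2abi + 2\mathrm{i}acj + 2\mathrm{i}adk,$$
as seen in Exercise 2.5.4. Since we want something linearly independent with 1 and $i$, we shall
assume that $a = b = 0$. Then, if $c = u$ and $d = v$ are such that $u^2 + v^2 \equiv -1 \pmod m$, the
matrix
$$j' := u\mathrm{i}j + v\mathrm{i}k = \begin{pmatrix} -v & -u \\ -u & v \end{pmatrix}$$
squares to $-1$ modulo $m$. We still need to find another matrix $k'$ which squares to $-1$, and
satisfies $ij'k' = -1$, i.e.
$$k' = -ij' = -i(u\mathrm{i}j + v\mathrm{i}k) = -u\mathrm{i}k + v\mathrm{i}j.$$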
Clearly, this also squares to $-1$. It remains to prove the existence of such $u, v$. When $m = p$ is
prime, this is Exercise 2.5.18∗ . When $m = p^k$ is a prime power, we lift a solution modulo $p$ step by
step. Since $p$ is odd and $u^2 + v^2 \equiv -1 \pmod p$, one of $u$ and $v$, say $u$, is non-zero modulo $p$. If
$u^2 + v^2 + 1 = p^\ell t$ for some $\ell \ge 1$, then

$$(u + p^\ell s)^2 + v^2 + 1 \equiv p^\ell(t + 2us) \pmod{p^{\ell+1}},$$

and we can choose $s$ such that $t + 2us \equiv 0 \pmod p$ since $2u$ is invertible modulo $p$. Iterating this
yields a solution modulo $p^k$. When $m$ is
composite, the existence of such $u, v$ follows from the Chinese remainder theorem.

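To conclude, we have proven that there exist $u, v$ such that $u^2 + v^2 \equiv -1 \pmod m$ and used them
to construct an isomorphism from $H/mH$ to $(\mathbb{Z}/m\mathbb{Z})^{2\times 2}$. Explicitly (for the reader who wasn't
convinced by our perfectly valid manipulations with very abusive notation), this isomorphism is
given by

$$\varphi : a + bi + cj + dk \mapsto \begin{pmatrix} a + d\mathrm{i} & b + c\mathrm{i} \\ -b + c\mathrm{i} & a - d\mathrm{i} \end{pmatrix} = a\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} + c\begin{pmatrix} -v & -u \\ -u & v \end{pmatrix} + d\begin{pmatrix} u & -v \\ -v & -u \end{pmatrix}$$

(where the upright $\mathrm{i}$ is the complex number, used in the same abusive way as before). Actually, so far we have only shown that it is a morphism, but we will prove at the end that it
is injective and thus an isomorphism since $|H/mH| = \left|(\mathbb{Z}/m\mathbb{Z})^{2\times 2}\right|$.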

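Note that $\varphi(\overline{\alpha})$ is the adjugate of $\varphi(\alpha)$, so their product is

$$N(\alpha)I_2 = \varphi(\alpha\overline{\alpha}) = \varphi(\alpha)\operatorname{adj}\varphi(\alpha) = \det(\varphi(\alpha))I_2$$

by Proposition C.3.7. Indeed, as we saw in the beginning of Section C.3, the adjugate of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$
is $\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. Since this is additive, it suffices to check that

$$\begin{aligned} I_2 &= \varphi(1) = \operatorname{adj}\varphi(1) \\ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} &= \varphi(-i) = \operatorname{adj}\varphi(i) \\ \begin{pmatrix} v & u \\ u & -v \end{pmatrix} &= \varphi(-j) = \operatorname{adj}\varphi(j) \\ \begin{pmatrix} -u & v \\ v & u \end{pmatrix} &= \varphi(-k) = \operatorname{adj}\varphi(k) \end{aligned}$$

which is clearly true.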

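Finally, $\varphi$ is injective since its kernel is trivial (see Exercise A.2.13∗ ). Indeed, $\varphi(\alpha) = 0$ implies
$\varphi(\overline{\alpha}) = \operatorname{adj}\varphi(\alpha) = 0$. As a consequence, if $\alpha = a + bi + cj + dk$,

$$2\varphi(a) = \varphi(\alpha + \overline{\alpha}) = 0$$

so $a = 0$ since $m$ is odd. The same reasoning used on $\alpha i$, $\alpha j$ and $\alpha k$ shows that $a = b = c = d = 0$.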
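
The two facts used above, namely that the matrix attached to $j$ squares to $-I_2$ and that the determinant of the image is the norm, can be verified numerically. The Python sketch below is only an illustration: `find_uv`, `phi` and `det` are names we introduce, and `phi` simply expands $a\,I_2 + b\,\varphi(i) + c\,\varphi(j) + d\,\varphi(k)$ with the four matrices displayed in the solution.

```python
from itertools import product


def find_uv(m):
    """Search for (u, v) with u^2 + v^2 = -1 (mod m); m is assumed odd."""
    for u, v in product(range(m), repeat=2):
        if (u * u + v * v + 1) % m == 0:
            return u, v
    return None


def phi(q, m, u, v):
    """Matrix a*I + b*phi(i) + c*phi(j) + d*phi(k) modulo m, for q = (a, b, c, d)."""
    a, b, c, d = q
    return (((a - c * v + d * u) % m, (b - c * u - d * v) % m),
            ((-b - c * u - d * v) % m, (a + c * v - d * u) % m))


def det(M, m):
    return (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % m


m = 15
u, v = find_uv(m)
for q in product(range(-3, 4), repeat=4):
    # determinant of the image equals the norm a^2 + b^2 + c^2 + d^2 modulo m
    assert det(phi(q, m, u, v), m) == sum(t * t for t in q) % m
```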

Remark 2.6.2
The step where we assume $a = b = 0$ is completely legitimate: since $a^2 - b^2 + c^2 + d^2 + 2abi +
2\mathrm{i}acj + 2\mathrm{i}adk$ (with $\mathrm{i}$ the complex number) must be $-1$, we need $ab = ac = ad = 0$, i.e. $a = 0$ or $b = c = d = 0$ but the latter
clearly doesn't work when there's no square root of $-1$ in $\mathbb{Z}/m\mathbb{Z}$. Thus, $a = 0$. Similarly, we want

$$k' := -ij' = -i(wi + u\mathrm{i}j + v\mathrm{i}k) = w - u\mathrm{i}k + v\mathrm{i}j$$

to square to $-1$, which implies $w = 0$ for the same reason.

Exercise 2.6.18† . Let $m$ be an odd integer. We say a Hurwitz integer $\alpha = a + bi + cj + dk$ is primitive
modulo $m$ if $\gcd(2a, 2b, 2c, 2d, m) = 1$. Compute the number $\psi(m)$ of primitive Hurwitz integers modulo
$m$ with norm zero (modulo $m$).

Solution

By Exercise 2.6.17† , we need to count the number of two by two primitive modulo $m$ matrices
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with zero determinant, i.e. the number of $a, b, c, d$ such that $ad - bc \equiv 0 \pmod m$
and $\gcd(a, b, c, d, m) = 1$. It is immediate from the Chinese remainder theorem that this is
multiplicative, i.e. $\psi(mn) = \psi(m)\psi(n)$ when $m$ and $n$ are coprime. Hence, it remains to
compute $\psi(p^k)$ for an odd prime $p$. We shall first prove that $\psi(p^{k+1}) = p^3\psi(p^k)$, thus reducing
this computation to the (easy) computation of $\psi(p)$. More precisely, we show that any primitive
modulo $p^k$ quadruple $(a, b, c, d)$ such that
$$ad - bc \equiv 0 \pmod{p^k}$$
can be lifted to exactly $p^3$ primitive modulo $p^{k+1}$ quadruples $(a', b', c', d') \equiv (a, b, c, d) \pmod{p^k}$
such that $a'd' - b'c' \equiv 0 \pmod{p^{k+1}}$. Hence, suppose $(a, b, c, d)$ is such a quadruple and suppose
without loss of generality that $a$ is non-zero modulo $p$. Consider a quadruple $(a', b', c', d') \equiv
(a, b, c, d) \pmod{p^k}$. The congruence $a'd' \equiv b'c' \pmod{p^{k+1}}$ is equivalent to
$$d' \equiv b'c'(a')^{-1} \pmod{p^{k+1}}.$$
Thus, for each choice of $a', b', c'$, there is exactly one $d'$ satisfying this equality. Since there are $p^3$
triplets $(a', b', c')$ modulo $p^{k+1}$ which are congruent to $(a, b, c)$ modulo $p^k$, this proves the result.

It remains to compute $\psi(p)$. Choose a $t \in \mathbb{Z}/p\mathbb{Z}$ and consider the equation $ad \equiv t \equiv bc \pmod p$.
If $t \not\equiv 0$, there are $(p - 1)^2$ solutions: pick any non-zero $a, b$, and set $d \equiv ta^{-1}$ and $c \equiv tb^{-1}$. If
$t \equiv 0$, there are $2(p - 1) + 1 = 2p - 1$ solutions to $ad \equiv 0$: if $a \equiv 0$, there are $p - 1$ non-zero
possibilities for $d$ and inversely, and then we count the solution $a \equiv d \equiv 0$. Hence, there are
$(2p - 1)^2$ solutions to $ad \equiv 0 \equiv bc$, but since we are interested in primitive quadruples, we must
remove the solution $(0, 0, 0, 0)$. In total, we have

$$\psi(p) = (p - 1) \cdot (p - 1)^2 + (2p - 1)^2 - 1 = (p^2 - 1)(p + 1).$$


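To conclude, if $m = p_1^{m_1} \cdot \ldots \cdot p_k^{m_k}$, we have

$$\begin{aligned} \psi(m) &= \prod_{i=1}^{k} \psi(p_i^{m_i}) \\ &= \prod_{i=1}^{k} p_i^{3(m_i - 1)}\psi(p_i) \\ &= \prod_{i=1}^{k} p_i^{3(m_i - 1)}(p_i^2 - 1)(p_i + 1) \\ &= m^3 \prod_{p \mid m} \left(1 - \frac{1}{p^2}\right)\left(1 + \frac{1}{p}\right). \end{aligned}$$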
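
The closed formula can be compared with a direct count for small odd $m$. In this Python sketch (function names are ours), `psi_bruteforce` enumerates all quadruples modulo $m$, while `psi_formula` evaluates $\prod_i p_i^{3(m_i - 1)}(p_i^2 - 1)(p_i + 1)$, which is the integer form of the last line above.

```python
from itertools import product
from math import gcd


def psi_bruteforce(m):
    """Count primitive (a, b, c, d) modulo m with ad - bc = 0 (mod m)."""
    count = 0
    for a, b, c, d in product(range(m), repeat=4):
        if (a * d - b * c) % m == 0 and gcd(gcd(gcd(gcd(a, b), c), d), m) == 1:
            count += 1
    return count


def psi_formula(m):
    """Evaluate prod p^(3(e-1)) (p^2 - 1)(p + 1) over prime powers p^e || m (m odd)."""
    result, p, n = 1, 3, m
    while n > 1:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e:
            result *= p ** (3 * (e - 1)) * (p * p - 1) * (p + 1)
        p += 2
    return result


for m in (3, 5, 9, 15):
    assert psi_bruteforce(m) == psi_formula(m)
```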

Exercise 2.6.19† . Let $p$ be an odd prime. Prove that any non-zero $\alpha \in H/pH$ of zero norm modulo $p$
has a representative of the form $\rho\pi$, where $\pi$ is a primary element of norm $p$ and $\rho \in H$, and that this
$\pi$ is unique. Conversely, let $\pi \in H$ have norm $p$. Prove that the equation $\rho\pi \equiv 0 \pmod p$ has exactly
$p^2$ solutions $\rho \in H/pH$. Deduce that there are exactly $p + 1$ primary irreducible Hurwitz integers with
norm $p$.

Solution

Lift α to a Hurwitz integer β of norm divisible by p. Consider its primitive part γ. Since β isn’t
divisible by p, the norm of γ is still divisible by p. Hence, by Exercise 2.6.15† , γ = ρπ for some
ρ ∈ H and some π of norm p. In addition, as we saw in the solution to this exercise, this π is
unique up to multiplication by a right-unit since it’s the left-gcd of γ and p. However, we still
need to justify the step where we went from β to γ, i.e. that β = δπ for some δ ∈ H and π of
norm $p$ implies $\gamma = \rho\pi$ for some $\rho \in H$. Let $m$ be the non-primitive part of $\beta$, i.e. $\beta = m\gamma$.
Since $p = \pi\overline{\pi}$ is invertible modulo $m$, $\pi$ is too so $m$ must divide $\delta$ by Bézout. We are done:
$\gamma = (\delta/m)\pi$.

For the second part, by Exercise 2.6.17† , this amounts to counting the solutions $(x, y, z, t)$ to

$$\begin{pmatrix} x & y \\ z & t \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \pmod p$$

where $\begin{pmatrix} x & y \\ z & t \end{pmatrix}$ is the matrix corresponding to $\rho$ and $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the matrix corresponding to $\pi$. In
other words, we wish to count the solutions to

$$\begin{aligned} xa + yc &\equiv 0 && (2.1) \\ xb + yd &\equiv 0 && (2.2) \\ za + tc &\equiv 0 && (2.3) \\ zb + td &\equiv 0. && (2.4) \end{aligned}$$

Note that the condition that $\pi$ has norm divisible by $p$ translates to $ad - bc \equiv 0$. Since $\pi$ is non-zero
modulo $p$, at least one of its coordinates is non-zero, say $a$. Then, (2.1) becomes $x \equiv -yca^{-1}$
and (2.3) becomes $z \equiv -tca^{-1}$. Note that the other two equations are then automatically fulfilled
since
$$a(xb + yd) = b(xa + yc) + (ad - bc)y$$
and the same goes for $z$ and $t$. Hence, there are $p^2$ solutions as claimed: we choose $y$ and $t$
arbitrarily and $x$ and $z$ are then uniquely determined.

We have shown that each of the $\psi(p) = (p^2 - 1)(p + 1)$ non-zero classes of $H/pH$ of zero norm
can be written in the form $\rho\pi$ for some unique $\pi$ of norm $p$. However, each $\pi$ has exactly $p^2$ left-multiples
modulo $p$: since $\rho\pi$ takes each value exactly $p^2$ times and there are $p^4$ elements $\rho \in H/pH$, $\pi$
has $p^4/p^2 = p^2$ left-multiples. (This can also be seen more efficiently with the language of group
theory: the morphism from $H/pH$ to itself sending $\rho$ to $\rho\pi$ has a kernel of cardinality $p^2$ so its
image has cardinality $|H/pH|/p^2 = p^2$ by the first isomorphism theorem from Exercise A.3.15† .)
Thus, each $\pi$ occurs for exactly $p^2 - 1$ classes, so there are

$$\frac{\psi(p)}{p^2 - 1} = p + 1$$

primary elements of norm $p$.
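
A direct check of the primary condition is somewhat fiddly, so the Python sketch below tests an equivalent count instead. By Exercise 2.6.16† each Hurwitz integer of odd norm has exactly one primary right-associate, and there are 24 units, so having $p + 1$ primary elements of norm $p$ amounts to having $24(p + 1)$ Hurwitz integers of norm $p$ in total. The representation by doubled coordinates and the function name are assumptions of this sketch.

```python
from itertools import product
from math import isqrt


def hurwitz_of_norm(n):
    """Count Hurwitz integers (A + B*i + C*j + D*k)/2 of norm n.

    A, B, C, D are rational integers, all even or all odd, and the norm is
    (A^2 + B^2 + C^2 + D^2) / 4.
    """
    r = isqrt(4 * n)
    count = 0
    for q in product(range(-r, r + 1), repeat=4):
        same_parity = len({t % 2 for t in q}) == 1
        if same_parity and sum(t * t for t in q) == 4 * n:
            count += 1
    return count


for p in (3, 5, 7, 11):
    assert hurwitz_of_norm(p) == 24 * (p + 1)
```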

Exercise 2.6.20† (Jacobi’s Four Square Theorem). Let $n$ be a positive rational integer. In how many
ways can $n$ be written as a sum of four squares of rational integers? (Two ways are considered different
if the ordering is different, for instance $2 = 1^2 + 0^2 + 0^2 + (-1)^2$ and $2 = (-1)^2 + 0^2 + 0^2 + 1^2$ are
different.)

Solution

Note that counting the number of ways to write n as a sum of four squares is the same as counting
the number of quaternions in Z[i, j, k] with norm n. We start by counting the number of primary
primitive Hurwitz integers of odd norm m, then we will consider the contribution of primary
non-primitive integers of norm m, and finally the contribution of their associates too (and treat
the even case).
Let $m$ be an odd positive integer and let $p_1^{m_1} \cdot \ldots \cdot p_n^{m_n}$ be its prime factorisation. By Exercise 2.6.15† ,
each primitive integer of norm $m$ has an expression of the form

$$\prod_{i=1}^{n} \prod_{j=1}^{m_i} \pi_j^{(i)}$$

where $\pi_j^{(i)}$ is an element of norm $p_i$. Since we are interested in primary integers, we may assume
that each $\pi_j^{(i)}$ is primary as well, by migrating the units. Then, this expression becomes unique,
and any such expression gives rise to a unique integer of norm $m$ again by Exercise 2.6.15† ,
provided that it is primitive. We will prove that it is primitive as long as two consecutive factors
are not conjugates, which is obviously a necessary condition. (Note that the conjugate of a
primary integer is also primary.) Suppose that some rational prime $p$ divides this product. Since
the product has norm $m$, $p = p_k$ for some $k$. Since every element of norm coprime with $p$ is
invertible modulo $p$, $p$ must divide

$$\pi_1^{(k)} \cdot \ldots \cdot \pi_{m_k}^{(k)} := \pi_1 \cdot \ldots \cdot \pi_\ell.$$

Consider the smallest $i$ such that $p$ divides $\pi_1 \cdot \ldots \cdot \pi_i$. Then, $\pi_1 \cdot \ldots \cdot \pi_{i-1}$ is not divisible by $p$ so
is primitive since its norm is a power of $p$. We will prove that $\pi_i$ and $\pi_{i-1}$ are conjugate. Write
$p\rho = \pi_1 \cdot \ldots \cdot \pi_i$ for some $\rho \in H$. This is equivalent to

$$\rho\overline{\pi_i} = \pi_1 \cdot \ldots \cdot \pi_{i-1}.$$

Since these elements are now primitive, Exercise 2.6.15† tells us that $\overline{\pi_i}$ and $\pi_{i-1}$ are the same
up to association, i.e. the same since they are primary.
Hence, the number $f(m)$ of primary and primitive Hurwitz integers of norm $m$ is equal to the
number of products of the form
$$\prod_{i=1}^{n} \prod_{j=1}^{m_i} \pi_j^{(i)}$$
where $\pi_j^{(i)}$ is primary of norm $p_i$ and no two consecutive factors are conjugate. In other words,
we have $p_1 + 1$ possibilities for $\pi_1^{(1)}$ and then only $p_1$ for every other $\pi_j^{(1)}$ since we need to avoid
the conjugate of $\pi_{j-1}^{(1)}$. The same goes for $p_2, p_3, \ldots, p_n$. Thus,

$$f(m) = \prod_{i=1}^{n} (p_i + 1)p_i^{m_i - 1} = m \prod_{i=1}^{n} \left(1 + \frac{1}{p_i}\right).$$

Now that we have computed the number of primary primitive integers of norm $m$, we shall
compute the number of primary integers of norm $m$. This is simply

$$g(m) = \sum_{d^2 \mid m} f(m/d^2)$$

because a primary integer $\alpha$ of norm $m$ corresponds to a primary primitive integer of norm $m/d^2$, where $d$ is
the non-primitive part of $\alpha$, i.e. the unique positive integer $d \mid \alpha$ such that $\alpha/d$ is primitive. By
expanding the following expression, we see that

$$g(m) = \prod_{i=1}^{n} \sum_{k=0}^{\lfloor m_i/2 \rfloor} f(p_i^{m_i - 2k})$$

because $f$ is multiplicative, i.e. $f(ab) = f(a)f(b)$ when $a$ and $b$ are coprime, and each $d$ with $d^2 \mid m$ can
be written as $p_1^{d_1} \cdot \ldots \cdot p_n^{d_n}$ with $2d_i \le m_i$ for every $i$. Now, note that the sum $\sum_{k=0}^{\lfloor \ell/2 \rfloor} f(p^{\ell - 2k})$ is

$$p^{\ell-1}(p + 1) + p^{\ell-3}(p + 1) + \ldots + (p + 1) = p^\ell + \ldots + 1$$

when $\ell$ is odd, and

$$p^{\ell-1}(p + 1) + p^{\ell-3}(p + 1) + \ldots + p(p + 1) + 1 = p^\ell + \ldots + 1$$

when $\ell$ is even. Thus, in all cases,

$$g(m) = \prod_{i=1}^{n} \sum_{k=0}^{m_i} p_i^k = \sum_{d \mid m} d.$$

Now, only two things remain to be done: take into account the contribution of units, and treat the
case where $n$ is even. Let $\alpha$ be a primary integer and let $\varepsilon$ be a unit of $H$. Then, $\alpha\varepsilon$ is in $\mathbb{Z}[i, j, k]$
if and only if $\varepsilon \in \{\pm 1, \pm i, \pm j, \pm k\}$. Thus, there are

$$r_4(n) = 8g(n) = 8 \cdot \sum_{d \mid n} d$$

ways to write $n$ as a sum of four squares when $n$ is odd. Now suppose that $n$ is even and write
$n = 2^rm$ with $r = v_2(n)$. We will prove that any element of norm $n$ has the form $(1 + i)^r$ times
an element of norm $m$. As a consequence, the number of primary quaternions of norm $n$ will be

$$g(n) = g(m) = \sum_{d \mid m} d.$$

This time however all units will yield elements in $\mathbb{Z}[i, j, k]$ since $(1 + i)\varepsilon \in \mathbb{Z}[i, j, k]$ for any unit
$\varepsilon$. Since there are 24 units, we conclude that the number of ways to express $n$ as a sum of four
squares is

$$r_4(n) = 24g(m) = 24 \cdot \sum_{d \mid n,\ d \text{ odd}} d.$$

Hence, it only remains to prove that an element of even norm is divisible by $1 + i$. Indeed, since
$1 + i$ has norm 2, iterating this result yields that an element of norm divisible by $2^r$ is divisible
by $(1 + i)^r$ as wanted. Suppose that $\alpha = a + bi + cj + dk$ has even norm. In particular,
$(2a)^2 + (2b)^2 + (2c)^2 + (2d)^2$ is divisible by 8. Since odd squares are 1 modulo 8, this implies
$2 \mid 2a, 2b, 2c, 2d$, i.e. $\alpha \in \mathbb{Z}[i, j, k]$. Modulo $1 + i$, $\alpha$ is simply $a + b + c + d$, which is clearly divisible
by $1 + i$ since it is divisible by 2. We are done.

To summarise, there are $8 \cdot \sum_{d \mid n} d$ ways to write $n$ as a sum of four squares when $n$ is odd, and
$24 \cdot \sum_{d \mid n,\ d \text{ odd}} d$ when $n$ is even.
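
Jacobi's formula lends itself to a quick numerical check. The Python sketch below (function names are ours) counts all quadruples of rational integers by brute force and compares the result with the two cases of the summary.

```python
from itertools import product
from math import isqrt


def r4_bruteforce(n):
    """Count (a, b, c, d) in Z^4 with a^2 + b^2 + c^2 + d^2 = n."""
    r = isqrt(n)
    return sum(1 for q in product(range(-r, r + 1), repeat=4)
               if sum(t * t for t in q) == n)


def r4_formula(n):
    """8 * (sum of divisors) for odd n, 24 * (sum of odd divisors) for even n."""
    if n % 2 == 1:
        return 8 * sum(d for d in range(1, n + 1) if n % d == 0)
    return 24 * sum(d for d in range(1, n + 1, 2) if n % d == 0)


for n in range(1, 25):
    assert r4_bruteforce(n) == r4_formula(n)
```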

Domains
Miscellaneous
Exercise 2.6.25† . Let $(F_n)_{n \in \mathbb{Z}}$ be the Fibonacci sequence defined by $F_0 = 0$, $F_1 = 1$, and $F_{n+2} =
F_{n+1} + F_n$ for any integer $n$. Prove that, for any integers $m$ and $n$, $\gcd(F_m, F_n) = F_{\gcd(m,n)}$.

Solution
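Note that $F_n = \frac{\alpha^n - \beta^n}{\alpha - \beta}$, where $\alpha, \beta$ are the roots of $X^2 - X - 1$ (see Section C.4). Thus, $d \mid F_m, F_n$
if and only if
$$\delta := (\alpha - \beta)d \mid \alpha^m - \beta^m, \alpha^n - \beta^n.$$
Now, note that $\delta = \alpha^{\gcd(m,n)} - \beta^{\gcd(m,n)} = (\alpha - \beta)F_{\gcd(m,n)}$ works since

$$\alpha^{k\gcd(m,n)} \equiv \beta^{k\gcd(m,n)} \pmod{\delta}$$

for any $k \in \mathbb{Z}$. For the converse, let $k$ be the smallest positive integer such that

$$\delta \mid \alpha^k - \beta^k \iff \delta \mid (\alpha/\beta)^k - 1$$

(note that $\beta$ is a unit since $\alpha\beta = -1$ so we can divide by $\beta$ like we did). Then, we shall prove that
$k \mid m, n$. Write $m = qk + r$ the Euclidean division of $m$ by $k$. Then,

$$1 \equiv (\alpha/\beta)^m = \left((\alpha/\beta)^k\right)^q \cdot (\alpha/\beta)^r \equiv (\alpha/\beta)^r$$

which contradicts the minimality of $k$, unless $r = 0$. Thus, $k \mid m$, and by symmetry $k \mid n$, which
implies that $k \mid \gcd(m, n)$ and $\delta \mid \alpha^{\gcd(m,n)} - \beta^{\gcd(m,n)}$ as wanted.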
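
The identity is easy to confirm for small non-negative indices. This Python sketch (the helper `fib` is ours) only covers $m, n \ge 0$, whereas the exercise allows all integers.

```python
from math import gcd


def fib(n):
    """Return F_n for n >= 0, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


for m in range(0, 40):
    for n in range(0, 40):
        assert gcd(fib(m), fib(n)) == fib(gcd(m, n))
```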

Remark 2.6.3
This is identical to the proof that $\gcd(a^n - b^n, a^m - b^m) = a^{\gcd(m,n)} - b^{\gcd(m,n)}$ for $a, b \in \mathbb{Z}$ using
orders, but in $\mathcal{O}_{\mathbb{Q}(\sqrt{5})}$. See Section 7.2 for more.


Exercise 2.6.27† . Let $n$ be a rational integer. Prove that $(1 + \sqrt{2})^n$ is a unit of $\mathbb{Z}[\sqrt{2}]$. Moreover,
prove that any unit of $\mathbb{Z}[\sqrt{2}]$ has that form, up to sign.

Solution

Note that $N((1 + \sqrt{2})^k) = (-1)^k$ so $(1 + \sqrt{2})^k$ is a unit. Now, suppose $a + b\sqrt{2}$ is the smallest
unit with $a, b > 0$ which is not a power of $1 + \sqrt{2}$. Then,

$$(2b - a) + (a - b)\sqrt{2} = -(a + b\sqrt{2})(1 - \sqrt{2}) = \frac{a + b\sqrt{2}}{1 + \sqrt{2}} < a + b\sqrt{2}.$$

Thus, if we show that $2b - a > 0$ and $a - b > 0$, we will reach a contradiction. This is easy: since
$a^2 - 2b^2 = \pm 1$, if $2b \le a$ we have
$$a^2 - 2b^2 \ge 2b^2 > 1,$$
and if $a \le b$ we have
$$a^2 - 2b^2 \le -b^2 < -1$$
unless $b = 1$ but that gives $a + b\sqrt{2} = 1 + \sqrt{2}$ which we have ruled out. (See also Section 7.1.)
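
The descent can be illustrated numerically: the units $a + b\sqrt{2}$ with $a, b > 0$ found by searching for solutions of $a^2 - 2b^2 = \pm 1$ should be exactly the positive powers of $1 + \sqrt{2}$. The Python sketch below checks this up to a bound; the function names and the bound are choices of this illustration.

```python
def units_by_powers(count):
    """(a, b) with a + b*sqrt(2) = (1 + sqrt(2))^n for n = 1, ..., count."""
    a, b = 1, 1
    out = []
    for _ in range(count):
        out.append((a, b))
        a, b = a + 2 * b, a + b     # multiply by 1 + sqrt(2)
    return out


def units_by_search(bound):
    """All (a, b) with 1 <= b <= a <= bound and a^2 - 2*b^2 = +1 or -1."""
    return [(a, b) for a in range(1, bound + 1) for b in range(1, a + 1)
            if abs(a * a - 2 * b * b) == 1]


print(units_by_search(100))
# [(1, 1), (3, 2), (7, 5), (17, 12), (41, 29), (99, 70)]
```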

Exercise 2.6.28† (IMO 2001). Let a > b > c > d be positive rational integers. Suppose that

ac + bd = (b + d + a − c)(b + d − a + c).

Prove that ab + cd is not prime.

Solution

We shall first simplify the condition on a, b, c, d:

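$$ac + bd = (b + d)^2 - (a - c)^2 = b^2 + 2bd + d^2 - a^2 + 2ac - c^2,$$

i.e. $a^2 - ac + c^2 = b^2 + bd + d^2$, or in other words, $(a + jc)(a + j^2c) = (b - jd)(b - j^2d)$. This of
course suggests working in $\mathbb{Z}[j]$. Set $\alpha = a + jc$ and $\beta = b - jd$. We have $\alpha\overline{\alpha} = \beta\overline{\beta}$. Let $\rho$ be the
gcd of $\alpha$ and $\beta$ and write $\alpha = \gamma\rho$ and $\beta = \delta\rho$. Then,

$$\gamma\overline{\gamma} = \delta\overline{\delta}$$

and $\gcd(\gamma, \delta) = 1$ so $\gamma \mid \overline{\delta}$ and $\overline{\delta} \mid \gamma$, i.e. $\gamma = \varepsilon\overline{\delta}$ for some unit $\varepsilon \in \mathbb{Z}[j]$. Now, notice that

$$\alpha\beta = (a + jc)(b - jd) = ab + cd + j(bc + cd - ad).$$

However, $\alpha\beta$ is also equal to $\gamma\delta\rho^2$ and $\gamma\delta = \varepsilon N(\gamma)$. Hence, if $ab + cd$ is prime, we have $N(\gamma) \in
\{1, ab + cd\}$.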

Suppose first that $N(\gamma) = 1$, i.e. $\gamma$ is a unit. Then, $\delta$ is a unit too, so $\beta = \delta\rho$ is a unit times $\gamma\rho = \alpha$, say $\alpha = \eta\beta$.
Since the only units of $\mathbb{Z}[j]$ are $\pm j^k$ by Exercise 2.4.2∗ , we get

$$\pm(a + jc) \in \{b - jd,\ d + j(b + d),\ b + d + jb\}.$$

All of these clearly contradict the assumption that $a > b > c > d > 0$. It remains to treat the
case where $N(\gamma) = ab + cd$. In that case, we have $ab + cd \mid bc + cd - ad$ since

$$N(\gamma) \mid \alpha\beta = ab + cd + j(bc + cd - ad).$$

Note that $bc + cd - ad$ must be positive or zero for otherwise its absolute value is less than
$ad < ab + cd$. However, if it is positive, then $bc + cd - ad \ge ab + cd$ which is impossible since
$ab > bc$. Hence, $bc + cd - ad$ must be 0. This implies that

$$\varepsilon\rho^2(ab + cd) = \gamma\delta\rho^2 = ab + cd,$$

i.e. $\rho$ is a unit. Then, $\overline{\beta} = \overline{\delta}\,\overline{\rho}$ is a unit times $\gamma\rho = \alpha$ (recall that $\gamma = \varepsilon\overline{\delta}$), say $\alpha = \mu\overline{\beta}$. As before, the only units of
$\mathbb{Z}[j]$ are $\pm j^k$ so we get

$$\pm(a + jc) \in \{b + d + jd,\ -d + jb,\ b + j(b + d)\}.$$

Each of these cases still contradicts $a > b > c > d > 0$.

Exercise 2.6.29†. Let $x \in \mathbb{R}$ be a non-zero real number and $m, n \ge 1$ coprime integers. Suppose that $x^m + \frac{1}{x^m}$ and $x^n + \frac{1}{x^n}$ are both rational integers. Prove that $x + \frac{1}{x}$ is also one.

Solution
Let $a = x^m + \frac{1}{x^m}$ and $b = x^n + \frac{1}{x^n}$. Then, $x^m = \frac{a \pm \sqrt{a^2 - 4}}{2}$ and $x^n = \frac{b \pm \sqrt{b^2 - 4}}{2}$, i.e. $x^m$ and $x^n$ are units in a quadratic field $\mathbb{Q}(\sqrt{d})$ (it is the same field since $(x^m)^n = (x^n)^m$). Finally, let $u, v \in \mathbb{Z}$ be such that $um + vn = 1$, by Bézout's lemma. Then, $x = x^{um}x^{vn}$ is a unit of $\mathbb{Q}(\sqrt{d})$ too, i.e. $x + \frac{1}{x} \in \mathbb{Z}$ (since $\frac{1}{x}$ is the conjugate of $x$). ∎

Remark 2.6.4

One might be tempted to look at $x^{mn}$: it is both an $m$th power and an $n$th power in $\mathbb{Q}(\sqrt{d})$ so we may want to conclude that it is an $mn$th power. For UFDs, by looking at the $p$-adic valuation, we see that it is an $mn$th power times a unit. Since the only units in real quadratic fields ($\mathbb{Q}(\sqrt{d})$ for $d > 0$) are $\pm 1$ (see Section 7.1), we conclude that it is $\pm$ an $mn$th power, and by looking at the parity of $mn$, we can see that in fact it must be an $mn$th power as wanted. In general, $\mathbb{Q}(\sqrt{d})$ might not be a UFD, but since it has ideal factorisation, there is still a concept of $p$-adic valuation so the previous solution works too.
Chapter 3

Cyclotomic Polynomials

3.1 Definition
Exercise 3.1.1∗ . Let ω be an nth root of unity. Prove that its order divides n.

Solution

Let $k$ be the order of $\omega$ and let $n = qk + r$ be the Euclidean division of $n$ by $k$. We have

$$\omega^n = (\omega^k)^q\omega^r = \omega^r$$

so $\omega^r = 1$ but $r < k$, which means that $r = 0$ by minimality of the order. ∎

Exercise 3.1.2∗. Let $p$ be a rational prime. Prove that $\Phi_p = X^{p-1} + \ldots + 1$.

Solution

We have $\Phi_1\Phi_p = X^p - 1$ by Proposition 3.1.1 so
$$\Phi_p = \frac{X^p - 1}{X - 1} = X^{p-1} + \ldots + 1.$$


Exercise 3.1.3∗ . Let n ≥ 1 be an integer. Prove that Φn (0) = −1 if n = 1 and 1 otherwise.

Solution

By induction on $n$: true for $n = 1$ and for $n > 1$ we have

$$\Phi_n(0) = \frac{0^n - 1}{\prod_{d \mid n,\, d < n} \Phi_d(0)} = \frac{-1}{-1 \cdot 1 \cdot \ldots \cdot 1} = 1.$$

Exercise 3.1.4. Let n > 1 be an integer. Prove that Φn (1) = p if n is a power of a prime p, and
Φn (1) = 1 otherwise.


Solution

By induction on $n$:
$$\prod_{1 \ne d \mid n} \Phi_d = \frac{X^n - 1}{X - 1} = X^{n-1} + \ldots + 1$$
so $\prod_{1 \ne d \mid n} \Phi_d(1) = n$. Note that the function given in the statement satisfies this equation:
$$\prod_{1 \ne p^i \mid n} p = \prod_{p \mid n} p^{v_p(n)} = n$$
since each factor $p$ appears exactly $v_p(n)$ times. Thus, by induction, $\Phi_n(1)$ is $p$ if $n$ is a power of $p$ and 1 otherwise. ∎
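
The values of $\Phi_n(0)$ and $\Phi_n(1)$ from Exercises 3.1.3 and 3.1.4 can be checked numerically. The following Python sketch is not part of the original solutions: it builds $\Phi_n$ from the recursion $\Phi_n = (X^n - 1)/\prod_{d \mid n,\, d < n}\Phi_d$, storing polynomials as lists of integer coefficients (lowest degree first). The helper names `poly_divide`, `cyclotomic` and `evaluate` are illustrative choices.

```python
def poly_divide(num, den):
    """Exact quotient num / den for integer polynomials, den monic.

    Polynomials are lists of coefficients, lowest degree first.
    """
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        quot[i] = num[i + len(den) - 1]  # current leading coefficient (den is monic)
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    return quot


_cache = {}


def cyclotomic(n):
    """Coefficients of Phi_n via Phi_n = (X^n - 1) / prod_{d | n, d < n} Phi_d."""
    if n not in _cache:
        poly = [-1] + [0] * (n - 1) + [1]  # X^n - 1
        for d in range(1, n):
            if n % d == 0:
                poly = poly_divide(poly, cyclotomic(d))
        _cache[n] = poly
    return _cache[n]


def evaluate(poly, x):
    """Evaluate a coefficient list at x using Horner's scheme."""
    result = 0
    for c in reversed(poly):
        result = result * x + c
    return result


# Print n, Phi_n(0), Phi_n(1) for small n.
for n in range(1, 13):
    print(n, evaluate(cyclotomic(n), 0), evaluate(cyclotomic(n), 1))
```

For instance, the row for $n = 8$ should show $\Phi_8(1) = 2$ and the row for $n = 12$ should show $\Phi_{12}(1) = 1$, in line with the statement of Exercise 3.1.4.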

Exercise 3.1.5∗ . Prove the Corollary 3.1.1 by induction.

Solution

By induction on $n$:
$$\Phi_n = \frac{X^n - 1}{\prod_{d \mid n,\, d < n} \Phi_d}$$

and this polynomial division has integer coefficients since the divisor is monic. ∎

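Exercise 3.1.6∗. Prove that $\Phi_n(1/X) = \Phi_n(X)/X^{\varphi(n)}$ for $n > 1$.

Solution

When $n > 1$, the primitive $n$th roots of unity come in pairs $\omega, 1/\omega$, so $\omega \mapsto 1/\omega$ permutes them. Thus,
$$\Phi_n(1/X) = \prod_{\omega} \left(\frac{1}{X} - \frac{1}{\omega}\right) = \prod_{\omega} \frac{1}{\omega X}(\omega - X) = \frac{\Phi_n(X)}{X^{\varphi(n)}}\,(-1)^{\varphi(n)} \prod_{\omega} \frac{1}{\omega}$$

and $(-1)^{\varphi(n)} \prod_{\omega} 1/\omega = \Phi_n(0)$ by Vieta's formulas, which is 1 by Exercise 3.1.3∗. ∎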

Remark 3.1.1
If $f = \sum_i a_iX^i$ is a polynomial, the fact that $f(X) = X^{\deg f}f(1/X)$ can be seen more visually using its coefficients: this is equivalent to
$$\sum_i a_iX^i = \sum_i a_iX^{\deg f - i},$$

i.e. $a_i = a_{\deg f - i}$.

Exercise 3.1.7∗ . Prove that, for n > 1, Φn (X, Y ) is a two-variable symmetric and homogeneous, i.e.
where all monomials have the same degree, polynomial with integer coefficients.

Solution

It is homogeneous because $\Phi_n(X/Y)$ is a homogeneous rational fraction (of degree 0) and $Y^{\varphi(n)}$ is too. It is a polynomial because, if $\Phi_n = \sum_i a_iX^i$ then
$$Y^{\varphi(n)}\Phi_n(X/Y) = \sum_i a_iX^iY^{\varphi(n) - i}$$

(we can also see it is homogeneous that way). It is symmetric by Exercise 3.1.6∗:

$$\Phi_n(Y/X) = (Y/X)^{\varphi(n)}\Phi_n(X/Y) \iff X^{\varphi(n)}\Phi_n(Y/X) = Y^{\varphi(n)}\Phi_n(X/Y).$$

Exercise 3.1.8∗. Prove that

$$\Phi_n(X, Y) = \prod_{\omega \text{ primitive } n\text{th root}} (X - \omega Y).$$

Solution

We have
$$\Phi_n(X, Y) = Y^{\varphi(n)} \prod_{\omega} (X/Y - \omega) = \prod_{\omega} (X - Y\omega).$$

Exercise 3.1.9∗ . Prove that, for odd n > 1, Φn (X)Φn (−X) = Φn (X 2 ) and deduce Corollary 3.1.3.

Solution

We have
$$\Phi_n(X)\Phi_n(-X) = \prod_{\omega} (X - \omega)(-X - \omega) = \prod_{\omega} -(X^2 - \omega^2) = (-1)^{\varphi(n)} \prod_{\omega} (X^2 - \omega^2).$$

Since $n > 2$, $\varphi(n)$ is even, and since $n$ is odd, $\omega \mapsto \omega^2$ is a permutation of the primitive $n$th roots of unity so this is just $\Phi_n(X^2)$. Since $\Phi_{2n}(X) = \frac{\Phi_n(X^2)}{\Phi_n(X)}$ by Proposition 3.1.2, we get $\Phi_{2n}(X) = \Phi_n(-X)$. ∎

Exercise 3.1.10. Prove that, for any polynomial f , f (X)f (−X) is a polynomial in X 2 .

Solution

We present three proofs: the first two work over any ring (one uses the fundamental theorem of symmetric polynomials, the other a direct expansion), and the third works over $\mathbb{C}$ (and any algebraically closed field) using the fundamental theorem of algebra A.1.1.

For the first one, note that $f(X)f(-X)$ is symmetric in $X$ and $-X$ so is a polynomial in their elementary symmetric polynomials, namely $X \cdot (-X) = -X^2$ and $X + (-X) = 0$.

For the second one, write $f(X)$ as $g(X^2) + Xh(X^2)$ to get

$$f(X)f(-X) = (g(X^2) + Xh(X^2))(g(X^2) - Xh(X^2)) = g(X^2)^2 - X^2h(X^2)^2.$$

For the last one, note that the result is true for polynomials of degree 1 as

$$(X - \alpha)(-X - \alpha) = -(X - \alpha)(X + \alpha) = -(X^2 - \alpha^2)$$

so it is true for any polynomial since any polynomial factorises as a product of a constant polynomial and degree 1 polynomials. ∎

Exercise 3.1.11∗. Let $p$ be a prime number and $n \ge 1$ an integer. Prove that if $p \mid n$ then $\Phi_{pn}(X, Y) = \Phi_n(X^p, Y^p)$, and that $\Phi_{pn}(X, Y) = \frac{\Phi_n(X^p, Y^p)}{\Phi_n(X, Y)}$ otherwise.

Solution

We have
$$\Phi_{pn}(X, Y) = Y^{\varphi(pn)}\Phi_{pn}(X/Y) = \begin{cases} Y^{p\varphi(n)}\Phi_n(X^p/Y^p) = \Phi_n(X^p, Y^p) & \text{if } p \mid n \\[4pt] \dfrac{Y^{p\varphi(n)}}{Y^{\varphi(n)}} \cdot \dfrac{\Phi_n(X^p/Y^p)}{\Phi_n(X/Y)} = \dfrac{\Phi_n(X^p, Y^p)}{\Phi_n(X, Y)} & \text{if } p \nmid n \end{cases}$$

by Proposition 3.1.2. ∎

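Exercise 3.1.12∗. Let $k \ge 1$ be an integer. Prove that $\Phi_{2^k} = X^{2^{k-1}} + 1$.

Solution

By induction on $k$: we have $\Phi_2 = X^{2^0} + 1$ and
$$\Phi_{2 \cdot 2^k}(X, Y) = \Phi_{2^k}(X^2, Y^2) = X^{2^k} + Y^{2^k}$$

for $k \ge 1$ by Proposition 3.1.2 (take $Y = 1$ for the one-variable statement). ∎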

3.2 Irreducibility
Exercise 3.2.1∗ . Let n ≥ 1 be an integer and ω be a primitive nth root of unity. Prove that any
primitive nth root can be written in the form ω k for some gcd(k, n) = 1.

Solution

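Write $\omega = \exp\left(\frac{2mi\pi}{n}\right)$ for some $\gcd(m, n) = 1$. The other primitive roots of unity are $\exp\left(\frac{2ki\pi}{n}\right)$ for $\gcd(k, n) = 1$ and the powers of $\omega$ are $\exp\left(\frac{2kmi\pi}{n}\right)$. Since $m$ is coprime with $n$, it is invertible mod $n$ so $k \mapsto km$ is a bijection of $(\mathbb{Z}/n\mathbb{Z})^\times$, which is equivalent to $k \mapsto \omega^k$ being a bijection from $(\mathbb{Z}/n\mathbb{Z})^\times$ onto the primitive $n$th roots, as wanted. ∎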

Exercise 3.2.2∗. Let $f = \prod_{i=1}^n (X - \alpha_i)$ be a polynomial. Prove that, for any $k = 1, \ldots, n$, $f'(\alpha_k) = \prod_{i \ne k} (\alpha_k - \alpha_i)$.

Solution

By Exercise A.1.8∗, we have
$$f' = \sum_i \prod_{j \ne i} (X - \alpha_j)$$

from which the result follows by evaluating this at $\alpha_k$. ∎

Exercise 3.2.3∗ (Frobenius Morphism). Prove the following special case of Proposition 4.1.1: for any
rational prime p and any polynomial f ∈ Z[X], f (X p ) ≡ f (X)p (mod p).

Solution

Note that
$$(f + g)^p = \sum_k \binom{p}{k} f^kg^{p-k} \equiv f^p + g^p \pmod{p}$$
for any $f, g \in \mathbb{Z}[X]$ since $p \mid \binom{p}{k} = \frac{p!}{k!(p-k)!}$ for $0 < k < p$. Thus, by induction, $\left(\sum_i f_i\right)^p \equiv \sum_i f_i^p$. Taking $f_i = a_iX^i$ and letting $f = \sum_i a_iX^i$, we get
$$f(X)^p = \left(\sum_i a_iX^i\right)^p \equiv \sum_i a_i^pX^{ip} \equiv \sum_i a_iX^{ip} = f(X^p)$$

by Fermat's little theorem. ∎
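
The congruence $f(X)^p \equiv f(X^p) \pmod{p}$ is easy to test on examples. The Python sketch below is an illustration only (the function names are hypothetical and the polynomials are arbitrary test data): it raises a coefficient list to the $p$th power modulo $p$ and compares it with the substitution $X \mapsto X^p$. The second call uses the non-prime modulus 4, which suggests why primality matters.

```python
def poly_mul_mod(f, g, p):
    """Product of two coefficient lists (lowest degree first), reduced mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def frobenius_holds(f, p):
    """Check whether f(X)^p and f(X^p) agree coefficientwise modulo p."""
    power = [1]
    for _ in range(p):
        power = poly_mul_mod(power, f, p)
    # f(X^p): the coefficient a_i moves from degree i to degree i * p.
    substituted = [0] * (p * (len(f) - 1) + 1)
    for i, a in enumerate(f):
        substituted[i * p] = a % p
    return power == substituted


print(frobenius_holds([3, -1, 4, 1], 5))  # f = X^3 + 4X^2 - X + 3, p = 5
print(frobenius_holds([1, 1], 4))         # (1 + X)^4 against 1 + X^4 modulo 4
```

The first call should print `True`, while the second prints `False` because of the middle coefficient $\binom{4}{2} = 6 \not\equiv 0 \pmod 4$.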

Exercise 3.2.4 (Alternative Proof of Theorem 3.2.1). Let $\omega$ be a primitive $n$th root of unity with minimal polynomial $\pi$ and let $p \nmid n$ be a rational prime. Suppose $\tau$ is the minimal polynomial of $\pi(\omega^p)$. Prove that $p \mid \tau(0)$ and that $\tau(0)$ is bounded when $p$ varies. Deduce that $\omega^p$ is a root of $\pi$ for sufficiently large $p$, and thus that $\omega^k$ is a root of $\pi$ for any $\gcd(n, k) = 1$.

Solution

$\tau(0)$ is $\pm$ the product of its roots by Vieta's formulas, and since $\pi(\omega^p)$ is a root it is divisible by it. Thus, $p \mid \pi(\omega^p) \mid \tau(0)$. Note that, by the fundamental theorem of symmetric polynomials, each root of $\tau$ has the form $\pi(\omega^{kp})$ for some $k$. In particular, if $\omega_1, \ldots, \omega_m$ are the roots of $\pi$ (i.e. the conjugates of $\omega$), we have
$$|\pi(\omega^{kp})| = \prod_i |\omega^{kp} - \omega_i| \le 2^m \le 2^n$$

by the triangle inequality. Thus, the roots of $\tau$ all have absolute value at most $2^n$, and since $\tau$ has at most $n$ roots, its constant coefficient $\tau(0)$ is bounded by $2^{n^2}$. This shows that for sufficiently large $p > 2^{n^2}$, since $p \mid \tau(0)$, we have $\tau(0) = 0$, i.e. $\tau = X$ and $\pi(\omega^p) = 0$.

To finish, say $p_1, \ldots, p_\ell \nmid n$ are the primes at most $2^{n^2}$ which do not divide $n$. Since $\omega^p$ is also a primitive $n$th root of unity for $p \nmid n$, we can repeat our reasoning with this root of unity to show that $\pi(\omega^m) = 0$ for any $m$ whose prime factors are all greater than $2^{n^2}$. Pick any $k$ coprime with $n$. Using the Chinese remainder theorem, pick an $m \equiv k \pmod{n}$ which is congruent to 1 modulo $p_1, \ldots, p_\ell$. Then all prime factors of $m$ are greater than $2^{n^2}$ so

$$\pi(\omega^k) = \pi(\omega^m) = 0$$

as wanted. ∎

Exercise 3.2.5. Let $k$ and $n \ge 1$ be coprime integers. Prove that the conjugates of $\cos\left(\frac{2k\pi}{n}\right)$ are the numbers $\cos\left(\frac{2k'\pi}{n}\right)$ for $\gcd(k', n) = 1$. What is its degree? What about $\sin\left(\frac{2k\pi}{n}\right)$, what are its conjugates and what is its degree?

Solution

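Note that
$$2\cos\left(\frac{2k\pi}{n}\right) = \omega + 1/\omega = \omega + \omega^{n-1}$$
where $\omega = \exp\left(\frac{2ki\pi}{n}\right)$ is a primitive $n$th root of unity. In particular, by the fundamental theorem of symmetric polynomials, the conjugates of $2\cos\left(\frac{2k\pi}{n}\right)$ are among $2\cos\left(\frac{2k'\pi}{n}\right)$ for $\gcd(k', n) = 1$ as wanted. For the converse, note that if $f\left(2\cos\left(\frac{2k\pi}{n}\right)\right) = 0$ then $f(X + 1/X)$ has a root at $\omega$ so at all other primitive $n$th roots of unity as wanted. In particular, for $n \ge 3$, $\cos\left(\frac{2k\pi}{n}\right)$ has $\varphi(n)/2$ conjugates since the numbers $\cos\left(\frac{2k'\pi}{n}\right)$ go by pairs $\cos\left(\frac{2k'\pi}{n}\right), \cos\left(\frac{-2k'\pi}{n}\right)$. (For $n \le 2$, it has degree $\varphi(n) = 1$.)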

The situation is more complicated for sines. The same argument shows that its conjugates are among $\sin\left(\frac{2k'\pi}{n}\right)$ for $\gcd(k', n) = 1$ (or using $\sin(x) = \cos(\pi/2 - x)$), but it is now harder to count its conjugates. Perhaps the simplest way is to transform it into a cosine:
$$\sin\left(\frac{2k\pi}{n}\right) = \cos\left(\frac{\pi}{2} - \frac{2k\pi}{n}\right) = \cos\left(\frac{2\pi(n - 4k)}{4n}\right).$$

We need to evaluate $\gcd(n - 4k, 4n)$. Since $k$ and $n$ are coprime, it is clear that the only potential prime factor of this gcd is 2. In particular, if $8 \mid n$, the gcd is 4 so $\sin\left(\frac{2k\pi}{n}\right)$ has degree $\varphi(4n/4)/2 = \varphi(n)/2$.

When $n \equiv 4 \pmod{8}$, the numerator is this time divisible by 8 because $n \equiv 4k \pmod{8}$. If $n \equiv 4k \pmod{16}$, then the gcd is 16 so $\sin\left(\frac{2k\pi}{n}\right)$ has degree

$$\varphi(4n/16)/2 = \varphi(n)/4.$$

If $n \not\equiv 4k \pmod{16}$, then the gcd is 8 and it has degree $\varphi(4n/8)/2 = \varphi(n)/4$ as well ($\varphi(m) = \varphi(2m)$ when $m$ is odd). Of course, we are assuming that $n \ne 4$ here, so that $n - 4k$ is non-zero. If $n = 4$, it has degree 1.

If $n \equiv 2 \pmod{4}$, the gcd of $n - 4k$ and $4n$ is just 2, so $\sin\left(\frac{2k\pi}{n}\right)$ has degree $\varphi(4n/2)/2 = \varphi(n)$. Similarly, if $n$ is odd, the gcd is 1 so it has degree $\varphi(4n)/2 = \varphi(n)$ as well. This time we assumed that $n$ was greater than 2, otherwise it has degree 1. We can summarise our results in the following table.

    n                    degree of sin(2kπ/n)
    n ∈ {1, 2, 4}        1
    n ≡ 0 (mod 8)        ϕ(n)/2
    n ≡ 4 (mod 8)        ϕ(n)/4
    n ≡ 2 (mod 4)        ϕ(n)
    n ≡ 1 (mod 2)        ϕ(n)
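
The case analysis behind this table can be cross-checked mechanically. The sketch below is an illustration and not an independent proof: it assumes the cosine degree formula of this exercise (degree 1 for denominators at most 2, and $\varphi(q)/2$ otherwise), reduces the fraction $(n - 4k)/(4n)$, and compares the resulting degree with the table entry. All function names are hypothetical.

```python
from math import gcd


def phi(n):
    """Euler's totient, by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def cos_degree(q):
    """Degree of cos(2*pi*r/q) for gcd(r, q) = 1 (assumed from the exercise)."""
    return 1 if q <= 2 else phi(q) // 2


def sin_degree(k, n):
    """Degree of sin(2k*pi/n), via sin(2k*pi/n) = cos(2*pi*(n - 4k)/(4n))."""
    g = gcd(n - 4 * k, 4 * n)
    return cos_degree(4 * n // g)


def table_degree(n):
    """The value predicted by the table."""
    if n in (1, 2, 4):
        return 1
    if n % 8 == 0:
        return phi(n) // 2
    if n % 8 == 4:
        return phi(n) // 4
    return phi(n)


# Compare both computations for every admissible pair (k, n) with n <= 60.
mismatches = [(k, n) for n in range(1, 61) for k in range(1, n + 1)
              if gcd(k, n) == 1 and sin_degree(k, n) != table_degree(n)]
print(mismatches)
```

An empty list of mismatches means the table agrees with the gcd computation for every tested pair.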


Remark 3.2.1
The reason why sines turn out to be so unstructured is because of the $i$ in the denominator of $\sin\left(\frac{2k\pi}{n}\right) = \frac{\omega^k - \omega^{-k}}{2i}$. Suppose $k = 1$ without loss of generality, by symmetry between primitive $n$th roots of unity. The best way to see this is with Galois theory (see Chapter 6). Because of this $i$, to consider conjugates of $\sin\left(\frac{2\pi}{n}\right)$ we need to work in $\mathbb{Q}(\omega, i)$. Then, we count the number of automorphisms, i.e. elements of the Galois group over $\mathbb{Q}$, fixing $\sin\left(\frac{2\pi}{n}\right) = \frac{\omega - \omega^{-1}}{2i}$. If there are $N$ such elements, then there are exactly

$$[\mathbb{Q}(\omega, i) : \mathbb{Q}]/N = \varphi(\operatorname{lcm}(4, n))/N$$

conjugates, by Proposition 6.3.1. Normally $N = 2$, i.e. there are only two embeddings fixing $\frac{\omega - \omega^{-1}}{2i}$: the identity and the complex conjugation. However, sometimes there are more. Let's see more closely what's happening: if $\sigma \in \operatorname{Gal}(\mathbb{Q}(\omega, i)/\mathbb{Q})$ fixes $\frac{\omega - \omega^{-1}}{2i}$, since it sends $i$ to $\pm i$, it must send $\omega - \omega^{-1}$ to $\pm(\omega - \omega^{-1})$. In other words, we consider the potential embeddings
$$\mathrm{id} : \begin{cases} \omega \mapsto \omega \\ i \mapsto i \end{cases}, \qquad \tau : \begin{cases} \omega \mapsto \omega^{-1} \\ i \mapsto -i \end{cases}, \qquad \phi : \begin{cases} \omega \mapsto -\omega \\ i \mapsto -i \end{cases}$$
and
$$\psi : \begin{cases} \omega \mapsto -\omega^{-1} \\ i \mapsto i. \end{cases}$$
The first two always exist: they are the identity and the complex conjugation. The other two are more delicate. First of all, if $n$ is odd or congruent to 2 modulo 4, then $-\omega^{\pm 1}$ is not a conjugate of $\omega$ so they do not exist. If $4 \mid n$, since $-\omega^{\pm 1} = \omega^{n/2 \pm 1}$ and $\omega^{n/4}$ is $i$ or $-i$, we must have

$$(\omega^{n/4})^{n/2 \pm 1} = -\omega^{\pm n/4}$$

for these embeddings to exist. This means that $\frac{n}{2} \cdot \frac{n}{4} \equiv \frac{n}{2} \pmod{n}$, i.e. $\frac{n}{4}$ is odd, or in other words $n \equiv 4 \pmod{8}$.

To conclude, when $8 \mid n$ we have $\varphi(\operatorname{lcm}(4, n))/N = \varphi(n)/2$, when $n \equiv 4 \pmod{8}$ we have $\varphi(\operatorname{lcm}(4, n))/N = \varphi(n)/4$, and otherwise we have $\varphi(\operatorname{lcm}(4, n))/N = 2\varphi(n)/2$ as desired. (Obviously, we exclude the exceptions 1, 2, 4.)
Exercise 3.2.6. Find all quadratic cosines.

Solution

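The degree of $\cos\left(\frac{2k\pi}{n}\right)$ is 1 for $n = 1, 2$ and $\varphi(n)/2$ for $n > 2$. Indeed, when $n > 2$, the cosines $\cos\left(\frac{2k\pi}{n}\right)$ for $\gcd(k, n) = 1$ come in pairs
$$\cos\left(\frac{2k\pi}{n}\right) = \cos\left(\frac{2(n - k)\pi}{n}\right)$$

so $\cos\left(\frac{2k\pi}{n}\right)$ has half as many conjugates as the number of $k$ with $\gcd(k, n) = 1$, i.e. $\varphi(n)/2$. Thus, the cosines of degree at most 2 are $\cos\left(\frac{2k\pi}{n}\right)$ for $\gcd(k, n) = 1$ and $\varphi(n)/2 \le 2$, i.e. $n \in \{1, 2, 3, 4, 5, 6, 8, 10, 12\}$ (the degree being exactly 2 for $n \in \{5, 8, 10, 12\}$). ∎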
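
The condition $\varphi(n)/2 \le 2$, i.e. $\varphi(n) \le 4$, can be enumerated directly. The short Python sketch below is only a sanity check; it relies on the standard bound $\varphi(n) \ge \sqrt{n/2}$ (an outside fact, not taken from this text), which shows that no $n > 32$ can qualify.

```python
from math import gcd


def phi(n):
    """Euler's totient, by direct count."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


# phi(n) >= sqrt(n / 2) for every n >= 1, so phi(n) <= 4 forces n <= 32.
candidates = [n for n in range(1, 33) if phi(n) <= 4]
print(candidates)  # [1, 2, 3, 4, 5, 6, 8, 10, 12]

# Denominators for which the cosine has degree exactly 2.
print([n for n in candidates if n > 2 and phi(n) == 4])  # [5, 8, 10, 12]
```

For example, $\cos(2\pi/10) = \frac{1 + \sqrt{5}}{4}$ and $\cos(2\pi/12) = \frac{\sqrt{3}}{2}$ are both quadratic.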

3.3 Orders
Exercise 3.3.1∗ . Let p be a rational prime and a a rational integer. Prove that, for any n ≥ 1,
p | Φn (a) if and only if p | Φpn (a).

Solution

It suffices to note that $\Phi_{pn}(X) \equiv \Phi_n(X)^p$ or $\Phi_n(X)^{p-1}$ modulo $p$ by Proposition 3.1.2. ∎

Exercise 3.3.2∗ . Let p be a rational prime. Prove that there always exists a primitive root or
generator modulo p, i.e. an integer g such that g k generates all integers p - m modulo p.

Solution

Primitive roots are the elements of order $p - 1$, i.e. the roots of $\Phi_{p-1}$ modulo $p$. Since

$$\Phi_{p-1} \mid X^{p-1} - 1 \equiv (X - 1) \cdot \ldots \cdot (X - (p - 1)) \pmod{p}$$

it splits in $\mathbb{F}_p$ and in particular has a root. ∎
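
For small primes, the existence of a primitive root can also be observed by brute force. The Python sketch below (with hypothetical helper names) simply computes multiplicative orders and returns the smallest element of order $p - 1$; it illustrates the statement and is not an efficient algorithm.

```python
def multiplicative_order(a, p):
    """Order of a modulo the prime p (a not divisible by p)."""
    order, power = 1, a % p
    while power != 1:
        power = power * a % p
        order += 1
    return order


def primitive_root(p):
    """Smallest g in [1, p - 1] whose order modulo p is p - 1, or None."""
    for g in range(1, p):
        if multiplicative_order(g, p) == p - 1:
            return g
    return None


for p in (2, 3, 5, 7, 11, 13, 23, 41):
    print(p, primitive_root(p))
```

For instance, the search returns 3 for $p = 7$, since 2 only has order 3 modulo 7.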

Exercise 3.3.3∗. Let $p$ be a rational prime and $a, b$ two rational integers. Prove that $p \mid \Phi_n(a, b)$ if and only if $p \mid a, b$ or $\frac{n}{p^{v_p(n)}}$ is the order of $ab^{-1}$ modulo $p$.

Solution

If $p \nmid a, b$, $\Phi_n(a, b)$ is zero modulo $p$ if and only if $\Phi_n(ab^{-1})$ is. ∎

Exercise 3.3.4∗. Let $p$ be a rational prime and $a$ an integer of order $n$ modulo $p$. Prove that $a^k \equiv 1 \pmod{p}$ if and only if $n \mid k$. Deduce that $n$ divides $p - 1$.¹

Solution

Let $k = qn + r$ be the Euclidean division of $k$ by $n$. We have

$$a^k = (a^n)^qa^r \equiv a^r$$

so $a^r \equiv 1$ but $r < n$, which means that $r = 0$ by minimality of the order. Since $a^{p-1} \equiv 1$ by Fermat's little theorem, we get $n \mid p - 1$. ∎

Exercise 3.3.5∗ . Let p be a rational prime and a, b two rational integers. Suppose that p | Φn (a, b).
Prove that p | a, b, p ≡ 1 (mod n) or p is the greatest prime factor of n.

Solution

If $p \nmid a, b$ then $\Phi_n(ab^{-1}) \equiv 0 \pmod{p}$. ∎

Exercise 3.3.6∗. Let $p$ be a rational prime and $a$ an integer. Suppose $p \mid \Phi_n(a), \Phi_m(a)$ and $n \ne m$. Prove that $\frac{m}{n}$ is a power of $p$.

1 This is the mod p version of Exercise 3.1.1∗ . In fact the proof should be the same as it works in any group (see

Section A.2 and Theorem 6.3.2).



Solution
$a$ has both order $\frac{n}{p^{v_p(n)}}$ and $\frac{m}{p^{v_p(m)}}$ modulo $p$ so $\frac{m}{n} = p^{v_p(m) - v_p(n)}$ is a power of $p$ as wanted. ∎

Exercise 3.3.7. Prove the following strengthening of Problem 3.1.1: for any integer $n \ge 0$, the number $2^{2^{n+1}} + 2^{2^n} + 1$ has at least $n + 1$ distinct prime factors.

Solution

We have
$$2^{2^{n+1}} + 2^{2^n} + 1 = \prod_{k=0}^{n} \Phi_{3 \cdot 2^k}(2)$$

and 2 is not the greatest prime factor of $3 \cdot 2^k$, neither is it congruent to 1 modulo $3 \cdot 2^k$, so it can't divide $\Phi_{3 \cdot 2^k}(2)$ by Corollary 3.3.1. Since $\frac{3 \cdot 2^k}{3 \cdot 2^{k'}}$ is a power of 2, the only possible common prime factor of $\Phi_{3 \cdot 2^k}(2)$ and $\Phi_{3 \cdot 2^{k'}}(2)$ is 2 but we have already shown that they were odd. Thus, each factor contributes at least one prime factor and we have in total at least $n + 1$ prime factors as wanted (we have shown in Problem 3.1.1 that they were non-trivial). ∎
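
The factorisation used in this solution can be checked for the first few values of $n$. The Python sketch below is illustrative: it uses the closed forms $\Phi_3(2) = 7$ and $\Phi_{3 \cdot 2^k}(X) = X^{2^k} - X^{2^{k-1}} + 1$ for $k \ge 1$ (which follow from Proposition 3.1.2 applied to $\Phi_6 = X^2 - X + 1$), and factors the numbers by naive trial division.

```python
def distinct_prime_factors(m):
    """Set of prime factors of m, by trial division."""
    primes, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            primes.add(d)
            m //= d
        d += 1
    if m > 1:
        primes.add(m)
    return primes


def cyclotomic_factors(n):
    """The values Phi_{3 * 2^k}(2) for k = 0, ..., n."""
    return [7] + [2 ** (2 ** k) - 2 ** (2 ** (k - 1)) + 1 for k in range(1, n + 1)]


for n in range(5):
    value = 2 ** (2 ** (n + 1)) + 2 ** (2 ** n) + 1
    product = 1
    for factor in cyclotomic_factors(n):
        product *= factor
    # Second column: does the product formula hold? Third: the prime factors.
    print(n, value == product, sorted(distinct_prime_factors(value)))
```

For $n = 2$ this gives $273 = 7 \cdot 3 \cdot 13$, with three distinct prime factors as predicted.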

Remark 3.3.1
This can also be seen as a corollary of the Zsigmondy theorem: each $\Phi_{3 \cdot 2^k}(2)$ brings a primitive prime factor except when $k = 1$ but $3 = \Phi_6(2)$ is still primitive compared to $7 = \Phi_3(2)$.

Exercise 3.3.8∗ . Let n ≥ 1 be an integer. Prove that there exist infinitely many rational primes
p ≡ 1 (mod n).

Solution

Suppose that there were only finitely many primes $p_1, \ldots, p_k$ congruent to 1 modulo $n$. Consider the number $\Phi_n(np_1 \cdot \ldots \cdot p_k)$. It is congruent to $\pm 1$ modulo $np_1 \cdot \ldots \cdot p_k$ by Exercise 3.1.3∗ so any prime factor of it must be congruent to 1 modulo $n$ by Theorem 3.3.1 and distinct from $p_1, \ldots, p_k$. Since it is greater than 1 by the triangle inequality, it has a prime factor, which is a contradiction. ∎

3.4 Zsigmondy’s Theorem


Exercise 3.4.1∗ . Check that the exceptions stated in Theorem 3.4.1 are indeed exceptions.

Solution

When $n = 2$ and $a + b$ is a power of 2, all prime factors of $a^2 - b^2 = (a - b)(a + b)$ either divide $a - b$ or are equal to 2, which also divides $a - b$. For $a = 2$, $b = 1$, and $n = 6$, we see that all prime factors of $2^6 - 1 = 9 \cdot 7$ divide $2^3 - 1 = 7$ or $2^2 - 1 = 3$. ∎

Exercise 3.4.2∗ . Prove that a2 − b2 has no primitive prime factor if and only if a + b is ± a power
of 2.

Solution

We have already shown that a2 − b2 has no primitive prime factor if a + b is a power of 2. For
the converse, note that any common prime factor of a + b and a − b must divide 2a and 2b so
must be 2 since a and b are coprime. 

Exercise 3.4.3. Let n ≥ 3 be an integer. Prove that Φn is positive on R.

Solution

Since Φn (0) = 1 > 0 by Exercise 3.1.3∗ , if Φn (x) were nonpositive for some real x, Φn would
have a real root by the intermediate value theorem which would imply n = 1 or n = 2 since the
only real roots of unity are 1 and −1. 

Exercise 3.4.4. Prove that $2^{m-1} > m$ for any integer $m \ge 3$ and $2^m - 1 > 3m$ for any integer $m \ge 4$.

Solution

It suffices to prove the second inequality since, if $2^m > 3m + 1$ then $2^{m-1} > m + \frac{m+1}{2} \ge m + 1$ and we already have $2^2 > 3$. We use the binomial expansion:
$$2^m = (1 + 1)^m > \binom{m}{m-1} + \binom{m}{2} + \binom{m}{1} = 2m + \frac{m(m - 1)}{2} > 3m + 1$$

since $m \ge 4$. ∎

3.5 Exercises
Diophantine Equations
Exercise 3.5.2† (USA TST 2008). Let $n$ be a rational integer. Prove that $n^7 + 7$ is not a perfect square.

Solution

Suppose that $n^7 + 7 = m^2$. Then $n$ is odd, since otherwise $n^7 + 7 \equiv 3 \pmod{4}$ is not a square; thus $m$ is even. By adding $121 = 11^2$ to both sides, we get $n^7 + 2^7 = m^2 + 11^2$, i.e.
$$\Phi_1(n, -2)\Phi_7(n, -2) = \Phi_4(m, 11).$$
In particular, any prime factor of the RHS (which is odd) must be equal to 11 or congruent to 1 modulo 4. First suppose that $11 \nmid m$. Then, we must have $n + 2 = \Phi_1(n, -2) \equiv 1 \pmod{4}$, i.e. $n \equiv -1 \pmod{4}$. However, we then have $n^7 + 2^7 \equiv -1 \pmod{4}$, which is impossible.

Thus, 11 must divide $m$. Since 11 is not equal to 2 or 7 nor congruent to 1 modulo 7, it can't divide $\Phi_7(n, -2)$. Hence, it must divide $n + 2 = \Phi_1(n, -2)$. Since $v_{11}(\Phi_4(m, 11)) = 2$, we also have $v_{11}(n + 2) = 2$. But then, $n + 2$ is still congruent to 1 modulo 4, since all its prime factors are congruent to 1 modulo 4 except 11, and its $v_{11}$ is even. Hence, we get the same contradiction as before, which shows that our equation does not have any solution. ∎
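
A brute-force search is of course no substitute for the proof, but it is a quick way to gain confidence in the statement. The Python sketch below tests every $n$ in an arbitrarily chosen range (values $n < -1$ are skipped since $n^7 + 7$ is then negative).

```python
from math import isqrt


def is_square(m):
    """True if m is a perfect square (negative numbers are not)."""
    return m >= 0 and isqrt(m) ** 2 == m


# Search range chosen for illustration only.
solutions = [n for n in range(-1, 5001) if is_square(n ** 7 + 7)]
print(solutions)  # []
```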

Exercise 3.5.5† (French TST 1 2017). Determine all positive integers $a$ for which there exist positive integers $m$ and $n$ as well as positive integers $k_1, \ldots, k_m, \ell_1, \ldots, \ell_n$ such that

$$(a^{k_1} - 1) \cdot \ldots \cdot (a^{k_m} - 1) = (a^{\ell_1} + 1) \cdot \ldots \cdot (a^{\ell_n} + 1).$$

Solution

If we multiply both sides by $(a^{\ell_1} - 1) \cdot \ldots \cdot (a^{\ell_n} - 1)$, we get

$$(a^{k_1} - 1) \cdot \ldots \cdot (a^{k_m} - 1)(a^{\ell_1} - 1) \cdot \ldots \cdot (a^{\ell_n} - 1) = (a^{2\ell_1} - 1) \cdot \ldots \cdot (a^{2\ell_n} - 1).$$

If we eliminate common factors, we get an equality of the form $(a^{u_1} - 1) \cdot \ldots \cdot (a^{u_r} - 1) = (a^{v_1} - 1) \cdot \ldots \cdot (a^{v_s} - 1)$ with even $v_i$ and disjoint $\{u_1, \ldots, u_r\}$ and $\{v_1, \ldots, v_s\}$. Now, consider $a^{\max_{i,j}(u_i, v_j)} - 1$. By the Zsigmondy theorem, unless $a = 2$ or $\max_{i,j}(u_i, v_j) \le 2$, this has a primitive prime factor, which is a contradiction since this implies that some prime divides one side of the equality but not the other. Conversely, it is easy to see that $a = 2$ works: $(2^2 - 1)^2 = 2^3 + 1$.

Now suppose that $\max_{i,j}(u_i, v_j) \le 2$. It cannot be 1 since the $u_i$ and $v_j$ are disjoint. Hence, it must be 2. Since the $v_j$ are even, this implies $u_1 = \ldots = u_r = 1$ and $v_1 = \ldots = v_s = 2$. We conclude that $(a - 1)^r = (a^2 - 1)^s$, i.e. $(a - 1)^{r-s} = (a + 1)^s$. The gcd of $a + 1$ and $a - 1$ divides 2, so $a - 1$ and $a + 1$ must both be powers of 2. This gives us $a = 3$. Conversely, we have $(3 - 1)^2 = 3 + 1$.

We conclude that the only solutions are $a = 2$ and $a = 3$. ∎

Divisibility Relations
Exercise 3.5.7†. Find all coprime positive integers $a$ and $b$ for which there exist infinitely many integers $n \ge 1$ such that
$$n^2 \mid a^n + b^n.$$

Solution

We shall prove that $a$ and $b$ work if and only if $a + b$ is not a power of 2 and $\{a, b\} \ne \{1, 2\}$. Suppose that $n^2 \mid a^n + b^n$. Let $p$ be the smallest prime factor of $n$. Then, the order of $ab^{-1}$ modulo $p$ divides $2n$ and $p - 1$ so must be 2 by assumption, i.e. $p \mid a + b$. If $a + b$ was a power of 2, then 4 would not divide $a^n + b^n$, which would be a contradiction. Thus, $a + b$ is not a power of 2.

Now suppose $a = 2$ and $b = 1$. The previous reasoning shows that the smallest prime factor of $n$ is 3. Let $q$ be the second smallest prime factor (distinct from 3). Then, the order of 2 modulo $q$ divides $2n$ and $q - 1$ so must divide 6, i.e. $q = 7$. This is impossible since the order of 2 modulo 7 is odd so 7 never divides $2^k + 1$. Thus, $n$ has only one prime factor, i.e. it is a power of 3. Clearly, $n$ is odd, as otherwise $3 \nmid 2^n + 1$. The lifting the exponent lemma gives $v_3(2^n + 1) = v_3(n) + 1$ so that $v_3(n) \le 1$, i.e. $n \in \{1, 3\}$. There are finitely many such integers.

Finally, suppose $a + b$ is not a power of 2 and $\{a, b\} \ne \{1, 2\}$. We shall proceed by induction on $k$ to find an odd $n$ that works with exactly $k$ prime factors. We start with the solution $n = 1$ corresponding to $k = 0$. Then, Zsigmondy tells us that $a^n + b^n = \frac{a^{2n} - b^{2n}}{a^n - b^n}$ has an odd prime factor $p$ which doesn't divide $n$, since a prime factor $q \mid n$ divides $a^{q-1} - b^{q-1}$ (the exception was with $\{a, b\} = \{2, 1\}$, which we have ruled out). We claim that $pn$ is also a solution:

$$n^2 \mid a^n + b^n \mid a^{np} + b^{np}$$

since $p$ is odd, and by the lifting the exponent lemma $v_p(a^{np} + b^{np}) = 1 + v_p(a^n + b^n) \ge 2$ so $p^2$ divides $a^{np} + b^{np}$ as well. Since $p$ and $n$ are coprime, we have $(np)^2 \mid a^{np} + b^{np}$ as desired. ∎
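
Both halves of this solution can be illustrated numerically. The Python sketch below (the function name and the search limits are arbitrary choices) lists the $n$ up to a bound with $n^2 \mid a^n + b^n$, first for the exceptional pair $\{2, 1\}$ and then for $(a, b) = (3, 2)$, where $a + b = 5$ is not a power of 2 and the inductive construction produces $n = 1, 5, 55, \ldots$

```python
def squares_dividing(a, b, limit):
    """All n <= limit such that n^2 divides a^n + b^n."""
    return [n for n in range(1, limit + 1)
            if (pow(a, n, n * n) + pow(b, n, n * n)) % (n * n) == 0]


print(squares_dividing(2, 1, 20000))  # exceptional pair: [1, 3]
print(squares_dividing(3, 2, 3000))   # starts with 1, 5, 55
```

Here $3^5 + 2^5 = 275 = 5^2 \cdot 11$, so $p = 11$ is the new prime supplied by the construction and $55 = 5 \cdot 11$ is the next solution it yields.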

Prime Factors
Exercise 3.5.11† (ISL 2002). Let $p_1, \ldots, p_n > 3$ be distinct rational primes. Prove that the number

$$2^{p_1 \cdot \ldots \cdot p_n} + 1$$
has at least $2^{2^n}$ distinct divisors.

Solution

Consider the $2^n$ divisors of $p_1 \cdot \ldots \cdot p_n$ and order them $d_1 < \ldots < d_{2^n}$. Then, each $2^{d_i} + 1 \mid 2^{p_1 \cdot \ldots \cdot p_n} + 1$ gives a primitive prime factor by Zsigmondy's theorem (no exception since $p_i > 3$), so there are at least $2^n$ prime factors in total and thus at least $2^{2^n}$ divisors. ∎

Exercise 3.5.12† (Problems from the Book). Let $a \ge 2$ be a rational integer. Prove that there exist infinitely many integers $n \ge 1$ such that the greatest prime factor of $a^n - 1$ is greater than $n\log_a n$.

Solution

We choose $n = a^k$, so that $n\log_a n = ka^k$. We consider prime factors of $\Phi_{a^k}(a) \mid a^{a^k} - 1$. They are all congruent to 1 modulo $a^k$, and suppose for the sake of a contradiction that they are all less than $ka^k$ (which is the same as being at most $ka^k$ since they are congruent to 1 modulo $a^k$). Since $\Phi_{a^k}(a) < a^{a^k}$, it has fewer than $a^k/k$ prime factors (counted with multiplicity) since they are all greater than $a^k$. Let these prime factors be $k_1a^k + 1, \ldots, k_ma^k + 1$. The key claim is that $\Phi_{a^k}(a) \equiv 1 \pmod{a^{2k}}$, but
$$\prod_{i=1}^{m} (k_ia^k + 1) \equiv 1 + a^k\sum_{i=1}^{m} k_i \pmod{a^{2k}}$$

and $0 < \sum_{i=1}^{m} k_i < km < a^k$ since each $k_i$ is less than $k$ and $m$ is less than $a^k/k$. Thus, it remains to prove that $\Phi_{a^k}(a) \equiv 1 \pmod{a^{2k}}$. We shall prove that this holds modulo $p^{2vk}$ for any prime power $p^v$ which exactly divides $a$.

By Proposition 3.1.2, we have

$$\Phi_{a^k}(a) = \Phi_{p(a/p^v)^k}\left(a^{p^{kv-1}}\right).$$

Since $p^{kv-1} \ge 2kv$ for sufficiently large $k$ (in fact $k \ge 4$ suffices), modulo $p^{2kv}$ we get

$$\Phi_{a^k}(a) \equiv \Phi_{p(a/p^v)^k}(0) \equiv 1$$

as wanted. ∎

Exercise 3.5.13† (Inspired by IMO 2003). Let $m \ge 2$ be an integer. Prove that there is some rational prime $p$ such that $p \nmid n^m - m$ for any rational integer $n$.

Solution

In fact, we prove more: if $p$ is a prime factor of $m$ and $k = v_p(m)$, there is some prime $q$ such that $m$ is not a $p^k$th power modulo $q$. For didactic purposes, we shall first do the case $k = 1$ (this whole paragraph will be about motivation, and the following paragraph will have the real proof). By Exercise 4.6.19†, $m$ is a $p$th power modulo $q$ if and only if
$$m^{\frac{q-1}{\gcd(p, q-1)}} \equiv 1 \pmod{q}.$$

In particular, we must have $q \equiv 1 \pmod{p}$, otherwise this is always true. Hence, we want to have $m^{\frac{q-1}{p}} \not\equiv 1$, i.e. the order $r$ of $m$ modulo $q$ doesn't divide $\frac{q-1}{p}$, or in other words $v_p(r) = v_p(q - 1)$. This suggests to try, for instance, $r = p$ and $q \not\equiv 1 \pmod{p^2}$. Hence, we want to pick a prime factor $q$ of $\Phi_p(m)$ which is not congruent to 1 modulo $p^2$. If there was no such prime, we would have $\Phi_p(m) \equiv 1 \pmod{p^2}$, which is impossible since $\Phi_p(m) \equiv m + 1 \pmod{p^2}$ and $p^2 \nmid m$.

Now, let's do the general case. The proof is almost identical: we find a prime $q$ for which $m$ has order $p$ modulo $q$, and such that $q \not\equiv 1 \pmod{p^{k+1}}$. That way, $\frac{q-1}{\gcd(q-1, p^k)}$ is not divisible by $p$. Hence, if $m$ were congruent to a $\gcd(q - 1, p^k)$th power $n^{\gcd(q-1, p^k)}$ modulo $q$, we would have
$$m^{\frac{q-1}{\gcd(q-1, p^k)}} \equiv n^{q-1} \equiv 1$$
but the order of $m$ doesn't divide $\frac{q-1}{\gcd(q-1, p^k)}$. To find such a prime $q$, consider $\Phi_p(m)$ as before. If all its prime divisors were congruent to 1 modulo $p^{k+1}$, we would have $\Phi_p(m) \equiv 1 \pmod{p^{k+1}}$, which is impossible since it is congruent to $1 + m$. ∎

Remark 3.5.1
This is also a consequence of (a corollary of) the Chebotarev density theorem: as said in Remark 4.6.3, if there were no such prime, $m$ would be an $m/2$th power if $8 \mid m$, which is impossible since $2^{m/2} > m$ for $m \ge 8$, or an $m$th power if $8 \nmid m$, which is also impossible since $2^m > m$ for $m \ge 1$.

Exercise 3.5.14† . Prove that ϕ(n)/n can get arbitrarily small. Deduce that π(n)/n → 0, where π(n)
denotes the number of primes at most n.

Solution

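We take $n = p_1 \cdot \ldots \cdot p_k$, where $p_1, \ldots, p_k$ are the first $k$ primes. We need to prove that

$$\frac{\varphi(n)}{n} = \left(1 - \frac{1}{p_1}\right) \cdot \ldots \cdot \left(1 - \frac{1}{p_k}\right) \to 0,$$

i.e.
$$\prod_{p} \left(1 - \frac{1}{p}\right) = 0.$$

This follows from the following equality:

$$\prod_{p} \frac{1}{1 - p^{-1}} = \sum_{n=1}^{\infty} \frac{1}{n} = \infty.$$

To deduce that $\pi(n) = o(n)$, one can notice that there are $\left(1 - \frac{1}{p_1}\right) \cdot \ldots \cdot \left(1 - \frac{1}{p_k}\right)n + o(n)$ numbers less than $n$ which are not divisible by any of $p_1, \ldots, p_k$. ∎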

Exercise 3.5.15†. Let $P(n)$ denote the greatest prime factor of any rational integer $n \ge 1$ ($P(1) = 0$). Let $\varepsilon > 0$ be a real number. Prove that there exist infinitely many rational integers $n \ge 2$ such that

$$P(n - 1), P(n), P(n + 1) < n^{\varepsilon}.$$

Solution

We choose $n = 2^{p_1 \cdot \ldots \cdot p_k}$, where $p_1, \ldots, p_k$ are the first $k$ odd primes. It is clear that $P(n) = n^{o(1)}$. By factorising the other two numbers into cyclotomic polynomials, we get that $P(n \pm 1)$ is at most

$$\max_{d \mid p_1 \cdot \ldots \cdot p_k} \left(\Phi_d(2), \Phi_d(-2)\right) \le 3^{\varphi(p_1 \cdot \ldots \cdot p_k)} = 2^{o(p_1 \cdot \ldots \cdot p_k)}$$

since $\frac{\varphi(p_1 \cdot \ldots \cdot p_k)}{p_1 \cdot \ldots \cdot p_k} \to 0$ by Exercise 3.5.14†. ∎

Exercise 3.5.16† (Brazilian Mathematical Olympiad 1995). Let P (n) denote the greatest prime
factor of any rational integer n ≥ 1. Prove that there exist infinitely many rational integers n ≥ 2 such
that
P (n − 1) < P (n) < P (n + 1).

Solution
Let $p$ be an odd prime. Let $k \ge 0$ be the smallest integer such that $P(p^{2^k} + 1) > p$; there exists one since $P(p^{2^k} + 1) \to \infty$ by Zsigmondy (one may also note that two numbers of the form $p^{2^k} + 1$ have gcd 2). Note that $k \ge 1$ since $P(p + 1) < p$. We claim that $n = p^{2^k}$ works. Indeed, we have $P(p^{2^k} + 1) > p = P(p^{2^k})$ by assumption, and
$$P(p^{2^k} - 1) = P\left((p - 1)\prod_{i=0}^{k-1} \left(p^{2^i} + 1\right)\right) < p$$

by minimality of $k$. ∎
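
The construction $n = p^{2^k}$ is explicit enough to run on small primes. The Python sketch below (hypothetical function names, naive trial division) squares $p$ repeatedly until $P(n + 1) > p$ and then prints the three values $P(n - 1), P(n), P(n + 1)$.

```python
def greatest_prime_factor(m):
    """P(m) for m >= 2, by trial division."""
    largest, d = 1, 2
    while d * d <= m:
        while m % d == 0:
            largest, m = d, m // d
        d += 1
    return max(largest, m) if m > 1 else largest


def construct(p):
    """Return n = p^(2^k) with k >= 0 minimal such that P(n + 1) > p."""
    n = p
    while greatest_prime_factor(n + 1) <= p:
        n = n * n
    return n


for p in (3, 5, 7, 11, 13):
    n = construct(p)
    print(p, n, [greatest_prime_factor(n + e) for e in (-1, 0, 1)])
```

For $p = 3$ this gives $n = 9$ with $P(8) = 2 < P(9) = 3 < P(10) = 5$, and for $p = 7$ it gives $n = 7^4 = 2401$.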

Exercise 3.5.18† (Structure of units of $\mathbb{Z}/n\mathbb{Z}$). Let $p$ be an odd rational prime and $n \ge 1$ an integer. Prove that there is a primitive root modulo $p^n$, i.e. a number $g$ which generates all the numbers coprime with $p$ modulo $p^n$. Moreover, show that there doesn't exist a primitive root mod $2^n$ for $n \ge 3$, but that, in that case, there exist a rational integer $g$ and a rational integer $a$ such that each odd rational integer is congruent to either $g^k$ for some $k$ or $ag^k$ modulo $2^n$.²

Solution

Let $g \in \mathbb{Z}$ be a primitive root modulo $p$. Then, if $g^{p-1} \not\equiv 1 \pmod{p^2}$, we have $v_p(g^{n(p-1)} - 1) = 1 + v_p(n)$ by LTE, which shows that $g$ is a primitive root modulo $p^n$ for any $n$. If $g^{p-1} \equiv 1 \pmod{p^2}$, then $g + p$ is also a primitive root modulo $p$ and
$$(g + p)^{p-1} \equiv g^{p-1} + p(p - 1)g^{p-2} \not\equiv 1 \pmod{p^2}$$
so our previous argument shows that $g + p$ is a primitive root modulo any power of $p$.

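² In group-theoretic terms, this says that $(\mathbb{Z}/p^n\mathbb{Z})^\times \simeq \mathbb{Z}/\varphi(p^n)\mathbb{Z}$ and that $(\mathbb{Z}/2^n\mathbb{Z})^\times \simeq (\mathbb{Z}/2\mathbb{Z}) \times (\mathbb{Z}/2^{n-2}\mathbb{Z})$ for $n \ge 2$. The Chinese remainder theorem then yields

$$(\mathbb{Z}/2^np_1^{n_1} \cdots p_m^{n_m}\mathbb{Z})^\times \simeq (\mathbb{Z}/2\mathbb{Z}) \times (\mathbb{Z}/2^{n-2}\mathbb{Z}) \times (\mathbb{Z}/\varphi(p_1^{n_1})\mathbb{Z}) \times \ldots \times (\mathbb{Z}/\varphi(p_m^{n_m})\mathbb{Z}).$$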

For $p = 2$, we have $v_p(g^n - 1) = v_p(n) + v_p(g^2 - 1) - 1 \ge v_p(n) + 2$ for even $n$ and odd $g$ since $8 \mid g^2 - 1$, so the order of any odd integer modulo $2^n$ divides $2^{n-2} < \varphi(2^n)$. However, the same argument as before shows that, if $g^2 \not\equiv 1 \pmod{16}$ (e.g. $g = 3$), then $g$ has order exactly $2^{n-2}$. Then, note that powers of $g$ are all congruent to 1 or $g$ modulo 8, and since there are exactly $2^{n-2}$ such elements modulo $2^n$, this means that it goes through all of them. Thus, if $a \not\equiv 1, g \pmod{8}$ is odd, every element of $(\mathbb{Z}/2^n\mathbb{Z})^\times$ can be represented in exactly one way as either $g^k$ or $ag^k$ as wanted. ∎
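
Both parts of this solution are easy to observe on small moduli. The Python sketch below is an illustration with arbitrarily chosen values ($p = 7$, $g = 3$ for the odd case, and $g = 3$, $a = -1$ for powers of 2): it checks that 3 generates the units modulo $7^n$, and that modulo $2^n$ the powers of 3 have order $2^{n-2}$ and, together with their negatives, cover all odd residues.

```python
def order_mod(g, modulus):
    """Multiplicative order of g modulo modulus (gcd assumed to be 1)."""
    order, power = 1, g % modulus
    while power != 1:
        power = power * g % modulus
        order += 1
    return order


# Odd prime: 3 is a primitive root mod 7 and 3^6 = 729 is not 1 mod 49.
p, g = 7, 3
for n in range(1, 5):
    print(p ** n, order_mod(g, p ** n) == (p - 1) * p ** (n - 1))

# p = 2: powers of 3 have order 2^(n-2); with a = -1 they cover all odd residues.
for n in range(3, 11):
    modulus = 2 ** n
    powers = {pow(3, k, modulus) for k in range(2 ** (n - 2))}
    covered = powers | {(-x) % modulus for x in powers}
    print(modulus, order_mod(3, modulus) == 2 ** (n - 2), len(covered) == 2 ** (n - 1))
```

Every printed check should be `True`; the choice $a = -1$ works because $-1 \equiv 7 \pmod 8$ is congruent to neither 1 nor 3.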

Coefficients of Cyclotomic Polynomials


Exercise 3.5.20†. Let $m \ge 0$ be an integer. Prove that the coefficient of $X^m$ of $\Phi_n$ is bounded when $n$ varies.

Solution

This follows from the formula $\Phi_n = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)}$ of Exercise 3.5.19. Indeed, modulo $X^{m+1}$, all terms with $d > m$ vanish (possibly changing the sign also) and we are left with a finite number of cases. ($(X^d - 1)^{-1}$ is to be interpreted as the inverse of $X^d - 1$ modulo $X^{m+1}$.) ∎

Remark 3.5.2
In fact, if we define $\mu(x)$ to be 0 when $x$ is not an integer, we get (for $n > 1$)

$$\Phi_n = \prod_{d=1}^{\infty} (1 - X^d)^{\mu(n/d)}$$

since the total number of times $\mu(n/d)$ is $\pm 1$ is even, for it is $2^r$ where $r$ is the number of prime factors of $n$. This can be used to give explicit formulas for the coefficients of $\Phi_n$, since the coefficient $a_n(k)$ of $X^k$ depends on the finite product
$$\prod_{d=1}^{k} (1 - X^d)^{\mu(n/d)},$$

which we can expand as

$$\prod_{d=1}^{k} \sum_{i} \binom{\mu(n/d)}{i} (-1)^iX^{di}$$

(this is an equality of formal power series) and then extract the coefficient of $X^k$ of this expression. For instance, we get the formulas

$$a_n(1) = -\mu(n)$$
$$a_n(2) = \frac{\mu(n)(\mu(n) - 1)}{2} - \mu(n/2)$$
$$a_n(3) = \frac{\mu(n)(\mu(n) - 1)}{2} + \mu(n/2)\mu(n) - \mu(n/3).$$
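
These three formulas can be compared with the actual coefficients of $\Phi_n$ for small $n > 1$. The Python sketch below is an illustration only: it recomputes $\Phi_n$ from the recursion $\Phi_n = (X^n - 1)/\prod_{d \mid n,\, d < n} \Phi_d$, implements the Möbius function by trial division, and uses the convention $\mu(n/d) = 0$ when $d \nmid n$. All helper names are hypothetical.

```python
def poly_divide(num, den):
    """Exact quotient of integer polynomials (lowest degree first), den monic."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        quot[i] = num[i + len(den) - 1]
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    return quot


_cache = {}


def cyclotomic(n):
    """Coefficients of Phi_n, lowest degree first."""
    if n not in _cache:
        poly = [-1] + [0] * (n - 1) + [1]  # X^n - 1
        for d in range(1, n):
            if n % d == 0:
                poly = poly_divide(poly, cyclotomic(d))
        _cache[n] = poly
    return _cache[n]


def mobius(n):
    """Moebius function of a positive integer."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # square factor
            result = -result
        d += 1
    return -result if n > 1 else result


def mu(n, d):
    """mu(n / d), taken to be 0 when d does not divide n."""
    return mobius(n // d) if n % d == 0 else 0


def predicted(n):
    """[a_n(1), a_n(2), a_n(3)] according to the formulas of the remark."""
    m = mu(n, 1)
    return [-m,
            m * (m - 1) // 2 - mu(n, 2),
            m * (m - 1) // 2 + mu(n, 2) * m - mu(n, 3)]


def actual(n):
    """Coefficients of X, X^2, X^3 in Phi_n (0 beyond the degree)."""
    coefficients = cyclotomic(n) + [0, 0, 0]
    return coefficients[1:4]


print(all(predicted(n) == actual(n) for n in range(2, 151)))
```

For example, $n = 105$ gives $a_n(1) = a_n(2) = 1$ and $a_n(3) = 0$, matching $\Phi_{105} = 1 + X + X^2 - X^5 - \ldots$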

Exercise 3.5.21†. Let $\psi(x) = \sum_{p^\alpha \le x} \log p$. By noticing that

$$\exp(\psi(2n + 1)) \int_0^1 x^n(1 - x)^n \, dx \le \frac{\exp(\psi(2n + 1))}{4^n},$$

prove that $\pi(n)$, the number of primes at most $n$, is greater than $Cn/\log n$ for some constant $C > 0$.

Solution
We have $x(1 - x) \le \frac{1}{4}$ for $x \in [0, 1]$ so
$$\int_0^1 x^n(1 - x)^n \, dx \le \int_0^1 \frac{1}{4^n} \, dx = \frac{1}{4^n}.$$
However, since $\exp(\psi(2n + 1)) = \operatorname{lcm}(1, \ldots, 2n + 1)$, $\exp(\psi(2n + 1)) \int_0^1 x^n(1 - x)^n \, dx$ is a positive integer, since if $X^n(1 - X)^n = \sum_i a_iX^i$ we get
$$\int_0^1 x^n(1 - x)^n \, dx = \sum_i \frac{a_i}{i + 1}.$$

Hence, $\exp(\psi(2n + 1)) \ge 4^n$, which implies $\psi(2n + 1) \ge 2n\log 2$. In particular, $\psi(2n) \ge 2(n - 1)\log 2$ so $\psi(n) \ge (n - 2)\log 2$ for all $n$. Since
$$\psi(n) = \sum_{p \le n} \lfloor \log_p(n) \rfloor \log p \le \sum_{p \le n} \frac{\log n}{\log p} \log p = \pi(n)\log n,$$

we get $\pi(n) \ge \frac{(n - 2)\log 2}{\log n}$ as wanted. ∎
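
The key inequality $\operatorname{lcm}(1, \ldots, 2n + 1) = \exp(\psi(2n + 1)) \ge 4^n$ obtained in this solution can be observed directly with exact integer arithmetic. The Python sketch below is only a numerical illustration; the helper name `lcm_upto` is a hypothetical choice.

```python
from math import gcd


def lcm_upto(m):
    """lcm(1, ..., m), which equals exp(psi(m))."""
    result = 1
    for k in range(2, m + 1):
        result = result * k // gcd(result, k)
    return result


for n in (1, 2, 5, 10, 50):
    print(n, lcm_upto(2 * n + 1) >= 4 ** n)
```

For $n = 1$ this compares $\operatorname{lcm}(1, 2, 3) = 6$ with $4$.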

Exercise 3.5.22†. Let $m \ge 3$ be an odd integer and suppose that $p_1 < \ldots < p_m = p$ are rational primes such that $p_1 + p_2 > p_m$ and let $n = p_1 \cdot \ldots \cdot p_m$. What are the coefficients of $X^p$ and $X^{p-2}$ of $\Phi_n$? Deduce that any rational integer arises as a coefficient of a cyclotomic polynomial.³

Solution

By Exercise 3.5.19, we have $\Phi_n = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)}$. Modulo $X^{p+1}$, if $d \ge p+1$, $(X^d - 1)^{\mu(n/d)}$ becomes $-1$, since $n/d$ is always squarefree. Hence,
\[\Phi_n = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)} \equiv \frac{(X^{p_1} - 1) \cdot \ldots \cdot (X^{p_m} - 1)}{X - 1} \pmod{X^{p+1}}\]
since we removed $2^m - (m+1)$ factors, which is even as $m$ is odd, so the sign doesn't change. Moreover, since $p_i + p_j \ge p + 1$ for any $i \ne j$, we have
\begin{align*}
\Phi_n &\equiv \frac{X^p - 1}{X - 1} (1 - X^{p_1}) \cdot \ldots \cdot (1 - X^{p_{m-1}}) \\
&\equiv (1 + X + \ldots + X^{p-1})(1 - X^{p_1} - X^{p_2} - \ldots - X^{p_{m-1}}) \pmod{X^{p+1}}
\end{align*}
so the coefficient of $X^p$ is $-m + 1$ since each monomial of the second factor has a contribution of $-1$ except the first one, which has no contribution since the degree of the first factor is less than $p$. Similarly, the coefficient of $X^{p-2}$ is $-m + 2$ since now the first monomial of the second factor has a contribution of $1$ since the degree of the first factor is large enough.

Suppose for the sake of a contradiction that there are no odd primes $p_1 < \ldots < p_m = p$ such that $p_1 + p_2 > p$. In particular, if $p_1 < \ldots < p_m$ are odd primes, we have $p_m > 2p_1$. Hence, the number of primes between $2^k$ and $2^{k+1}$ is always less than $m$. As a consequence, the number of primes less than $2^k$, $\pi(2^k)$, is less than $km$. This contradicts Exercise 3.5.21†. This shows that any negative coefficient can be represented, and for the positive coefficients (we can trivially get 0, e.g. $\Phi_9 = X^6 + X^3 + 1$) simply consider $\Phi_{2n}$ which is $\Phi_n(-X)$ for odd $n$: this negates our coefficients since $p$ and $p - 2$ are odd. ∎
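The smallest instance of this construction is $m = 3$ with the primes $3 < 5 < 7$ (indeed $3 + 5 > 7$), i.e. $n = 105$, where the solution predicts the coefficients $-m + 1 = -2$ for $X^7$ and $-m + 2 = -1$ for $X^5$, and the opposite signs for $\Phi_{210}$. Here is a hedged sketch that checks this; the helpers `exact_div` and `cyclotomic` are illustrative names, not from the text.

```python
from functools import lru_cache

def exact_div(num, den):
    """Quotient of integer polynomials (coefficients from degree 0 up); den must be monic."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in reversed(range(len(quot))):
        quot[i] = num[i + len(den) - 1]
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    assert not any(num), "division was not exact"
    return tuple(quot)

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n, computed from X^n - 1 = prod_{d | n} Phi_d."""
    poly = (-1,) + (0,) * (n - 1) + (1,)
    for d in range(1, n):
        if n % d == 0:
            poly = exact_div(poly, cyclotomic(d))
    return poly

phi_105 = cyclotomic(105)   # n = 3 * 5 * 7, so p = 7 and m = 3
phi_210 = cyclotomic(210)   # equals Phi_105(-X)
print(phi_105[7], phi_105[5])   # coefficients of X^p and X^(p-2)
print(phi_210[7], phi_210[5])   # same coefficients with the sign flipped
```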

3 This may come off as a bit surprising considering that all the cyclotomic polynomials we saw had only ±1 and 0

coefficients.

Exercise 3.5.23† . Let p and q be two rational primes. Prove that the coefficients of Φpq are in
{−1, 0, 1}.

Solution

Let $a$ and $b$ be non-negative rational integers such that $ap + bq = \varphi(pq)$; there exist such integers by ??. We claim that
\[\Phi_{pq} = \left(\sum_{i=0}^{a} X^{pi}\right)\left(\sum_{j=0}^{b} X^{qj}\right) - X^{-pq}\left(\sum_{i=a+1}^{q-1} X^{pi}\right)\left(\sum_{j=b+1}^{p-1} X^{qj}\right).\]

Note that this is monic and has degree $ap + bq = \varphi(pq)$ so it suffices to show that it is zero at any primitive $pq$th root of unity $\omega$. Here is a hint of motivation for this formula (which I don't find extremely convincing, if anyone has something better please contact me): we start with the equations
\[\Phi_p(\omega^q) = \sum_{j=0}^{p-1} \omega^{qj} = 0\]
and
\[\Phi_q(\omega^p) = \sum_{i=0}^{q-1} \omega^{pi} = 0.\]
To construct a polynomial vanishing at $\omega$, we can consider polynomials in $\Phi_p(X^q)$ and $\Phi_q(X^p)$, but it is easy to see that this will have a degree which is too high. Then, we try splitting the sum $\Phi_p(\omega^q)$, but it is again easy to see that the degree will be too high, unless we factorise one of the parts by powers of $\omega$. However, for this to work, we need to factorise by exactly $X^{pq}$ (the exponent must be a multiple of $pq$ since $\omega$ has order $pq$, and the higher the exponent the more cancellation is needed so the best guess is $X^{pq}$).

Back to the problem, it is trivial to show that our polynomial is zero at $\omega$: we have $\sum_{i=0}^{a} \omega^{pi} = -\sum_{i=a+1}^{q-1} \omega^{pi}$ and $\sum_{j=0}^{b} \omega^{qj} = -\sum_{j=b+1}^{p-1} \omega^{qj}$ so, since $\omega^{pq} = 1$,
\[\left(\sum_{i=0}^{a} \omega^{pi}\right)\left(\sum_{j=0}^{b} \omega^{qj}\right) = \left(\sum_{i=a+1}^{q-1} \omega^{pi}\right)\left(\sum_{j=b+1}^{p-1} \omega^{qj}\right)\]
as wanted.

Finally, let us return to the original problem. Showing that $\Phi_{pq}$ has coefficients in $\{-1, 0, 1\}$ is now reduced to showing that there is at most one way to write any integer $n$ in the form $pi + qj$ for $i \in [\![0, a]\!]$ and $j \in [\![0, b]\!]$ or in the form $pi + qj - pq$ for $i \in [\![a+1, q-1]\!]$ and $j \in [\![b+1, p-1]\!]$.

For this, note that $n$ can be written in two ways if and only if there are distinct pairs $(i, j)$ and $(i', j')$ with $i, i' \in [\![0, q-1]\!]$ and $j, j' \in [\![0, p-1]\!]$ such that
\[pi + qj \equiv pi' + qj' \pmod{pq}.\]
(It is clear that two such expressions give us an equality of this form, and the converse follows from $|pi + qj - (pi' + qj')| < 2pq$, although it is technically not needed in our case.) This is equivalent to $p(i - i') \equiv q(j' - j) \pmod{pq}$, which implies that $p \mid j' - j$ so $j' = j$ and $q \mid i - i'$ so $i = i'$. This contradicts the assumption that $(i, j) \ne (i', j')$. (In fact, this is a special case of the Chinese remainder theorem: the map $\psi : \mathbb{Z}/q\mathbb{Z} \times \mathbb{Z}/p\mathbb{Z} \to \mathbb{Z}/pq\mathbb{Z}$ given by $(i, j) \mapsto pi + qj$ is bijective.) ∎
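As a sanity check, the claimed expression for $\Phi_{pq}$ can be compared with a direct computation for a few pairs of distinct primes. The sketch below assumes $p \ne q$ and finds $a, b \ge 0$ by a small search; the function names (`phi_pq_from_formula`, `cyclotomic`, `exact_div`) are invented for the illustration.

```python
from functools import lru_cache

def exact_div(num, den):
    """Quotient of integer polynomials (coefficients from degree 0 up); den must be monic."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in reversed(range(len(quot))):
        quot[i] = num[i + len(den) - 1]
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    assert not any(num), "division was not exact"
    return tuple(quot)

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n, computed from X^n - 1 = prod_{d | n} Phi_d."""
    poly = (-1,) + (0,) * (n - 1) + (1,)
    for d in range(1, n):
        if n % d == 0:
            poly = exact_div(poly, cyclotomic(d))
    return poly

def phi_pq_from_formula(p, q):
    """Coefficients of the claimed expression for Phi_pq, for distinct primes p and q."""
    phi = (p - 1) * (q - 1)
    # Search for a, b >= 0 with a*p + b*q = phi(pq).
    a = next(a for a in range(phi // p + 1) if (phi - a * p) % q == 0)
    b = (phi - a * p) // q
    coeffs = [0] * (phi + 1)
    for i in range(a + 1):                 # first product: exponents p*i + q*j
        for j in range(b + 1):
            coeffs[p * i + q * j] += 1
    for i in range(a + 1, q):              # second product, shifted by X^(-pq)
        for j in range(b + 1, p):
            coeffs[p * i + q * j - p * q] -= 1
    return tuple(coeffs)

for p, q in [(2, 3), (3, 5), (5, 7), (7, 11), (11, 13)]:
    assert phi_pq_from_formula(p, q) == cyclotomic(p * q)
    assert set(cyclotomic(p * q)) <= {-1, 0, 1}
```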

Cyclotomic Fields and Fermat’s Last Theorem


Exercise 3.5.24† (Sophie-Germain's Theorem). Let $p$ be a Sophie-Germain prime, i.e. a rational prime such that $2p + 1$ is also prime. Prove that the equation $a^p + b^p = c^p$ does not have rational integer solutions with $p \nmid abc$.

Solution

Suppose that $a^p + b^p = c^p$ for some coprime rational integers $a, b, c$ such that $p \nmid abc$. Modulo $q = 2p + 1$, $p$th powers are congruent to $\pm 1$ or $0$, so $q \mid abc$, say $q \mid c$. We have
\[\Phi_2(a, b)\Phi_{2p}(a, b) = c^p\]
and the gcd of $\Phi_2(a, b)$ and $\Phi_{2p}(a, b)$ divides $p$ by LTE and Theorem 3.3.1. Since $p \nmid c$ by assumption, the two factors are coprime and hence are both $p$th powers. Modulo $q$, this implies that $a + b$ is congruent to $0$ or $\pm 1$. The same goes for $a - c$ and $b - c$ by symmetry. Since
\[0 \equiv 2c = (a + b) - (a - c) - (b - c) \pmod q,\]
one of $a + b$, $a - c$ and $b - c$ must be divisible by $q$. If it is $a - c$ or $b - c$, then $q \mid a, b, c$ contradicting the hypothesis that they are coprime. Thus, $q \mid a + b$. Since $a - c \equiv a$ and $b - c \equiv b$ are also congruent to $\pm 1$ or $0$ modulo $q$, we get $a \equiv -b \equiv \pm 1$, i.e. $a \equiv 1$ and $b \equiv -1$ without loss of generality. But then,
\[\Phi_{2p}(a, b) = \sum_{k=0}^{p-1} a^k (-b)^{p-1-k} \equiv p \pmod q\]
which is not a $p$th power modulo $q$. This is a contradiction. ∎
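Both modular facts used here ($p$th powers modulo $q = 2p + 1$ lie in $\{0, \pm 1\}$, and $p$ itself is not among them) are easy to confirm for the first few Sophie-Germain primes. A minimal sketch, with the function name `pth_power_residues` chosen for the illustration:

```python
def pth_power_residues(p):
    """Set of p-th powers modulo q = 2p + 1."""
    q = 2 * p + 1
    return {pow(x, p, q) for x in range(q)}

for p in (2, 3, 5, 11, 23, 29):       # here 2p + 1 = 5, 7, 11, 23, 47, 59 is prime
    q = 2 * p + 1
    assert pth_power_residues(p) == {0, 1, q - 1}
    assert p % q not in {0, 1, q - 1}  # so p is not a p-th power modulo q
```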

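Exercise 3.5.25†. Let $\omega$ be an $n$th root of unity. Define $\mathbb{Q}(\omega)$ as $\mathbb{Q} + \omega\mathbb{Q} + \ldots + \omega^{n-1}\mathbb{Q}$. Prove that
\[\mathbb{Q}(\omega) \cap \mathbb{R} = \mathbb{Q}(\omega + \omega^{-1})\]
where $\mathbb{Q}(\omega + \omega^{-1}) = \mathbb{Q} + (\omega + \omega^{-1})\mathbb{Q} + \ldots + (\omega + \omega^{-1})^{n-1}\mathbb{Q}$.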

Solution

Let $f(\omega) = \sum_i a_i \omega^i$ be a real element of $\mathbb{Q}(\omega)$. Note that
\[2f(\omega) = f(\omega) + f(\omega^{-1}) = \sum_i a_i (\omega^i + \omega^{-i}) \in \mathbb{Q}(\omega + \omega^{-1})\]
since $X^i + \frac{1}{X^i}$ is a polynomial with rational coefficients in $X + \frac{1}{X}$ by induction on $i$ or by the fundamental theorem of symmetric polynomials: $X^i + \frac{1}{X^i}$ is symmetric in $X$ and $\frac{1}{X}$ so it is a polynomial in $X + \frac{1}{X}$ and $X \cdot \frac{1}{X} = 1$. ∎

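Exercise 3.5.26†. Let $\omega$ be a primitive $p$th root of unity, where $p$ is prime. Prove that the ring of integers of $\mathbb{Q}(\omega)$, $\mathcal{O}_{\mathbb{Q}(\omega)} := \mathbb{Q}(\omega) \cap \overline{\mathbb{Z}}$, is
\[\mathbb{Z}[\omega] := \mathbb{Z} + \omega\mathbb{Z} + \ldots + \omega^{p-1}\mathbb{Z}.\]
(In fact this holds for any $n$th root of unity but it is harder to prove.)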

Solution
Suppose that $\sum_{i=0}^{p-2} a_i \omega^i = \alpha \in \overline{\mathbb{Z}}$ for some rational numbers $a_0, \ldots, a_{p-2}$. Then, the same is true for its conjugates $\sum_{i=0}^{p-2} a_i \omega^{ki} = \alpha_k$. If we consider this as a system of equations, we know from Proposition C.3.7 that $a_\ell$ times the determinant of $\Omega = (\omega^{ij})_{i,j \in [p-2]}$ is an algebraic integer for all $\ell$. Since this is the Vandermonde determinant $\prod_{1 \le i < j \le p-2} (\omega^i - \omega^j)$, we get
\[a_\ell \prod_{0 \le i \ne j \le p-1} (\omega^i - \omega^j) \in \overline{\mathbb{Z}}.\]
Finally, as we saw in Theorem 3.2.1, the product $\prod_{i \ne j} (\omega^i - \omega^j)$ is $\pm p^p$ by Exercise 3.2.2∗, which means that the denominator of the $a_i$ is a power of $p$. To conclude, we shall prove that if $\sum_{i=0}^{p-2} b_i \omega^i$ is divisible by $p$, then all $b_i$ are divisible by $p$, thus showing that the denominator of the $a_i$ is in fact not divisible by $p$, i.e. $a_i \in \mathbb{Z}$ as desired.

Hence, suppose that $\sum_{i=0}^{p-2} b_i \omega^i \equiv 0 \pmod p$. The same is true for its conjugates, and summing them we get $(p-1)b_0 \equiv 0$, i.e. $p \mid b_0$. Since $\omega$ is invertible ($\omega^p = 1$), we can simply remove $b_0$, divide by $\omega$ and repeat this process to get $p \mid b_i$ for all $i$. ∎

Exercise 3.5.27†. Let $\omega$ be a primitive $p$th root of unity, where $p$ is prime. Prove that $p = u(1 - \omega)^{p-1}$, where $u \in \overline{\mathbb{Z}}$ is a unit of $\overline{\mathbb{Z}}$, i.e. $1/u$ is also an algebraic integer. Deduce that $1 - \omega$ is prime in $\mathbb{Q}(\omega)$.

Solution

We have
\[p = \Phi_p(1) = \prod_{k=1}^{p-1} (1 - \omega^k)\]
so we want to prove that $\frac{1 - \omega^k}{1 - \omega}$ is a unit for every $p \nmid k$. Note that this already shows that $1 - \omega$ is prime since it has prime norm (the norm of $f(\omega)$ is defined as $\prod_{k=1}^{p-1} f(\omega^k)$ and this is clearly multiplicative). We wish to show that $\frac{1 - \omega}{1 - \omega^k}$ is also an algebraic integer, which is true by symmetry between primitive roots of unity: if $\zeta = \omega^k$ and $\omega = \zeta^\ell$ then this is just $\frac{1 - \zeta^\ell}{1 - \zeta}$. ∎

Exercise 3.5.28† (Kummer). Let ω be a root of unity of odd prime order p and suppose ε is a unit
of Q(ω). Prove that ε = ηω n for some n ∈ Z and η ∈ R.

Solution

Let $\varepsilon = f(\omega)$ be a unit of $\mathbb{Q}(\omega)$. Consider $\theta = \varepsilon/\overline{\varepsilon} = f(\omega)/f(\omega^{-1})$. Then, its conjugates are $f(\omega^k)/f(\omega^{-k})$ which all have modulus 1, so $\theta$ is a root of unity by Kronecker's theorem 1.5.27†. We shall now analyze the roots of unity of $\mathbb{Q}(\omega)$: by Bézout, if $\zeta \in \mathbb{Q}(\omega)$ is a primitive $m$th root of unity, then $\xi \in \mathbb{Q}(\omega)$ where $\xi$ is a primitive $\operatorname{lcm}(m, p)$th root of unity. Indeed,
\[\exp\left(\frac{2i\pi}{m}\right)^{a} \exp\left(\frac{2i\pi}{p}\right)^{b} = \exp\left(\frac{2i\pi}{\operatorname{lcm}(p, m)}\right)\]
where $ap + bm = \gcd(m, p)$ by Bézout's lemma. However, the degree of a primitive $kp$th root of unity is $\varphi(kp)$ which is always greater than $\varphi(p)$ (which is the maximum degree of an element of $\mathbb{Q}(\omega)$ by the fundamental theorem of symmetric polynomials), except when $k \le 2$. Thus, the roots of unity of $\mathbb{Q}(\omega)$ have the form $\pm\omega^k$, and this means that $\theta = \pm\omega^n$ for some $n$.
Without loss of generality, we may assume that $n$ is even (by replacing it by $n + p$ if necessary). Then, consider $\eta = \varepsilon\omega^{-n/2}$. We wish to prove that it is real. By definition, $\eta/\overline{\eta} = \pm 1$, so it is

either real or purely imaginary: we want to rule the second case out. Thus, suppose that $\overline{\eta} = -\eta$. We claim that $\eta$ is divisible by $1 - \omega$, and thus not a unit by Exercise 3.5.27†. Since $1 - \omega \mid p$, 2 is invertible modulo $1 - \omega$ so it suffices to show that $2\eta = \eta - \overline{\eta}$ is divisible by $1 - \omega$. Finally, if $\eta = \sum_i a_i \omega^i$ then
\[\eta - \overline{\eta} = \sum_i a_i (\omega^i - \omega^{-i})\]
which is divisible by $1 - \omega$ since $1 - \omega \mid 1 - \omega^{2i} = \omega^i(\omega^{-i} - \omega^i)$. ∎

Exercise 3.5.29†. Let $\alpha \in \mathbb{Z}[\omega]$, where $\omega$ is a primitive $p$th root of unity. Prove that $\alpha^p$ is congruent to a rational integer modulo $p$.

Solution

Note that
\[(a_0 + a_1\omega + \ldots + a_{p-1}\omega^{p-1})^p \equiv a_0^p + a_1^p\omega^p + \ldots + a_{p-1}^p\omega^{p(p-1)} \equiv a_0 + a_1 + \ldots + a_{p-1} \pmod p\]
by Frobenius. ∎
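The congruence can be observed concretely by computing in $(\mathbb{Z}/p\mathbb{Z})[X]/(X^p - 1)$, which maps onto $\mathbb{Z}[\omega]/p\mathbb{Z}[\omega]$ by $X \mapsto \omega$. This is an illustrative sketch only; the helper names `cyclic_mul` and `pth_power` are not from the text.

```python
def cyclic_mul(u, v, p):
    """Product in (Z/pZ)[X]/(X^p - 1); u and v are coefficient lists of length p."""
    out = [0] * p
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[(i + j) % p] = (out[(i + j) % p] + ui * vj) % p
    return out

def pth_power(alpha, p):
    """alpha^p in (Z/pZ)[X]/(X^p - 1), by repeated multiplication."""
    result = [1] + [0] * (p - 1)
    for _ in range(p):
        result = cyclic_mul(result, alpha, p)
    return result

p = 5
alpha = [2, 1, 0, 4, 3]                  # 2 + w + 4w^3 + 3w^4
power = pth_power(alpha, p)
# Only the constant term survives, and it is the sum of the coefficients mod p.
assert power[1:] == [0] * (p - 1)
assert power[0] == sum(alpha) % p
```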

Exercise 3.5.30† (Kummer). Let $p$ be an odd prime and $\omega$ a primitive $p$th root of unity. Suppose that $\mathbb{Z}[\omega]$ is a UFD.⁴ Prove that there do not exist non-zero rational integers $a, b, c \in \mathbb{Z}$ such that
\[a^p + b^p + c^p = 0.\]
(You may assume that, if a unit of $\mathbb{Z}[\omega]$ is congruent to a rational integer modulo $p$, it is a $p$th power of a unit. This is known as "Kummer's lemma". See Borevich-Shafarevich [6] or Conrad [10] for a $(1 - \omega)$-adic proof of this.)

Solution

Suppose that there are non-zero coprime $a, b, c \in \mathbb{Z}$ such that $a^p + b^p = c^p$ and, without loss of generality, $p \nmid a, b$. Working in $\mathbb{Z}[\omega]$, this gives us
\[(a + b)(a + \omega b) \cdot \ldots \cdot (a + \omega^{p-1} b) = c^p.\]
The gcd of two factors divides $(\omega^i - 1)b$ and $(\omega^j - 1)a$ for some $p \nmid i, j$. Since $\frac{\omega^k - 1}{\omega - 1}$ is a unit whenever $p \nmid k$ by Exercise 3.5.27†, the gcd of two factors divides $(\omega - 1)a$ and $(\omega - 1)b$ so divides $\omega - 1$. Since $\omega - 1$ is prime by the same exercise, either all factors are divisible by $1 - \omega$ (since $a + b\omega^k \equiv a + b \pmod{1 - \omega}$) or none of them are. We will distinguish these two cases.

First, suppose that $1 - \omega \nmid a + b$. This corresponds to $p \nmid c$. Then, by unique factorisation, there are units $\varepsilon_k \in \mathbb{Z}[\omega]^\times$ and elements $c_k \in \mathbb{Z}[\omega]$ such that $a + b\omega^k = \varepsilon_k c_k^p$. Consider $k = 1$ and set $\varepsilon = \varepsilon_1$, so that $\varepsilon = \omega^m \overline{\varepsilon}$ for some $m$ by Exercise 3.5.28†. Then, since $c_1^p \equiv \overline{c_1}^p \pmod p$ by Exercise 3.5.29†, we

4 Sadly, it has been proven that $\mathbb{Z}[\omega]$ is only a UFD when $p \in \{3, 5, 7, 11, 13, 17, 19\}$. This approach works however almost verbatim when the class number $h$ of $\mathbb{Q}(\omega)$ is not divisible by $p$. The case $h = 1$ corresponds to $\mathbb{Z}[\omega]$ being a UFD. That said, it has not been proven that there exist infinitely many $p$ such that $p \nmid h$ (but it has been conjectured to be the case), while it has been proven that there exist infinitely many $p$ such that $p \mid h$.

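have
\begin{align*}
a + b\omega &= \varepsilon c_1^p \\
&\equiv \varepsilon\, \overline{c_1}^p \\
&= \omega^m\, \overline{\varepsilon}\, \overline{c_1}^p \\
&= \omega^m\, \overline{a + b\omega} \\
&= a\omega^m + b\omega^{m-1} \pmod p.
\end{align*}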

Hence, $p \mid a + b\omega - a\omega^m - b\omega^{m-1}$. If $m \ne 1, 0$, then the coefficient of $\omega^m$ of this expression is $-a$ and this is not divisible by $p$ so the expression isn't either by Exercise 3.5.26†. Similarly, when $m \ne 1, 2$, the coefficient of $\omega^{m-1}$ is $-b$ which isn't divisible by $p$. Thus, $m = 1$, which yields $a \equiv b \pmod p$. But then, by symmetry, we must also have $a \equiv -c \pmod p$. This implies that $0 = a^p + b^p - c^p \equiv 3a^p$ which forces $p = 3$. It is however easy to see that $a^3 + b^3 = c^3$ has no solution with $3 \nmid abc$ by working modulo 9, which finishes the first case.

Now, we consider the second case. As in our proof of Theorem 2.4.1, we consider the more general equation
\[\alpha^p + \beta^p = \varepsilon(1 - \omega)^{pn}\gamma^p\]
with coprime $\alpha, \beta, \gamma \in \mathbb{Z}[\omega]$ not divisible by $1 - \omega$, $\varepsilon \in \mathbb{Z}[\omega]^\times$ a unit, and $n \ge 1$. Suppose that $\alpha, \beta, \gamma$ is a non-trivial solution with minimal $n$. As we saw before, the gcd of two numbers of the form $\alpha + \beta\omega^k$ is $1 - \omega$. First, we prove that there are no solutions when $n = 1$. In that case, $v_{1-\omega}(\alpha + \beta\omega^k)$ must be 1 for all $k$, which implies that the numbers $\alpha + \beta\omega^k$ are non-zero multiples of $1 - \omega$ modulo $(1 - \omega)^2$. Since there are only $p - 1$ such multiples as $|(\mathbb{Z}[\omega]/(1 - \omega)\mathbb{Z}[\omega])^\times| = p - 1$, two of them must be equal which is impossible as we saw previously. Hence, $n \ge 2$.

By replacing $\beta$ by $\beta\omega^m$ for some $m$, we may assume that $v_{1-\omega}(\alpha + \beta\omega^k) = 1$ for all $p \nmid k$ and $v_{1-\omega}(\alpha + \beta) = p(n-1) + 1$. By unique factorisation, set $\alpha + \beta\omega = \eta(1 - \omega)\rho^p$ and $\alpha + \beta = \mu(1 - \omega)^{p(n-1)+1}\tau^p$ for some units $\eta, \mu$. Then, since
\[(\alpha + \beta\omega) + \omega(\alpha + \beta\omega^{-1}) = (\omega + 1)(\alpha + \beta),\]

we get
ηρp + ωηρp = (ω + 1)µ(1 − ω)p(n−1) τ p .
Dividing by $\eta$ and noticing that $\omega + 1 = \frac{\omega^2 - 1}{\omega - 1}$ is a unit, this gives us an equation of the form $x^p + uy^p = v(1 - \omega)^{p(n-1)}z^p$ for $1 - \omega \nmid x, y, z$ and $u, v$ units. We wish to prove that $u$ is a $p$th power. This is where we use this fundamental lemma of Kummer: modulo $p$, $u$ is congruent to the $p$th power $(-x/y)^p$ so is a $p$th power itself. This contradicts the minimality since $n - 1 \ge 1$, so we are done. ∎

Exercise 3.5.31† (Fleck's Congruences). Let $n \ge 1$ be an integer, $p$ a prime number and $q = \left\lfloor \frac{n-1}{p-1} \right\rfloor$. Prove that, for any rational integer $m$,
\[p^q \mid \sum_{k \equiv m \pmod p} (-1)^k \binom{n}{k}.\]

Solution

Let $\omega$ be a primitive $p$th root of unity. We use a unity root filter on the polynomial $X^{-m \pmod p}(1 - X)^n$ (see Exercise A.3.9†):
\[S := \sum_{k \equiv m \pmod p} (-1)^k \binom{n}{k} = \frac{\sum_k \omega^{-km}(1 - \omega^k)^n}{p}.\]

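Now, note that the numerator is divisible by $(1 - \omega)^n$. This means that $v_{1-\omega}(S) \ge n - (p - 1)$ since $v_{1-\omega}(p) = p - 1$ by Exercise 3.5.27† (and $1 - \omega$ is prime in $\mathbb{Q}(\omega)$). Thus,
\[v_p(S) \ge \frac{v_{1-\omega}(S)}{p - 1} \ge \frac{n}{p - 1} - 1\]
which implies that $v_p(S) \ge \left\lceil \frac{n}{p-1} \right\rceil - 1 = \left\lfloor \frac{n-1}{p-1} \right\rfloor$ as wanted. ∎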
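Fleck's congruence is easy to test by brute force for small parameters. The following sketch simply evaluates the alternating sum over $k \equiv m \pmod p$ and checks divisibility by $p^q$; the function name `fleck_sum` is chosen for the illustration.

```python
from math import comb

def fleck_sum(n, p, m):
    """Sum of (-1)^k C(n, k) over 0 <= k <= n with k congruent to m modulo p."""
    return sum((-1) ** k * comb(n, k) for k in range(n + 1) if (k - m) % p == 0)

for p in (2, 3, 5, 7):
    for n in range(1, 40):
        q = (n - 1) // (p - 1)
        for m in range(p):
            assert fleck_sum(n, p, m) % p ** q == 0
```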

Miscellaneous
Exercise 3.5.33† (Korea Winter Program Practice Test 1 2019). Find all non-zero polynomials $f \in \mathbb{Z}[X]$ such that, for any prime number $p$ and any integer $n$, if $p \nmid n, f(n)$, the order of $f(n)$ modulo $p$ is at most the order of $n$ modulo $p$.

Solution

Let’s see what the condition means: it says that, if n is a mth root of unity in Fp , then f (n) is
either zero or a root of unity of order ≤ m in Fp . Thus, if we remember Proposition 1.3.1, we
might try to prove that the same holds over C. In fact we only need an assertion a lot weaker
than this to finish with Exercise A.3.23† , but since we can prove the general result directly with
cyclotomic polynomials let’s do it.

Let $m \ge 1$ be an integer and $\omega$ a complex primitive $m$th root of unity. Let $p \equiv 1 \pmod m$ be a rational prime and $z \in \mathbb{F}_p$ an element of order $m$. Then, $f(z)$ has order at most $m$ (or is zero). There are infinitely many such primes, so let $m_0 \le m$ be such that for infinitely many $p \equiv 1 \pmod m$, $f(z)$ has order $m_0$ (or is zero). Then,
\[\prod_{\gcd(k, m) = 1} f(\omega^k)\Phi_{m_0}(f(\omega^k)) \equiv \prod_{\gcd(k, m) = 1} f(z^k)\Phi_{m_0}(f(z^k)) \equiv 0 \pmod p\]
is divisible by infinitely many primes so must be zero, i.e. $f(\omega^k) = 0$ or $\Phi_{m_0}(f(\omega^k)) = 0$ for some $k$. We have shown our claim: $f(\omega)$ is zero or a root of unity of order $m_0 \le m$.

Finally, we can use Exercise A.3.23†: its assumption is a bit weaker than what we have, but we can see that it works for any polynomial which sends infinitely many points on the unit circle to itself, which is clearly the case here. Thus, $f = \pm X^k$ for some $k$ since the only real numbers on the unit circle are $\pm 1$, and it is easily seen that $-X^k$ does not work as $f(1)$ must be a root of unity of order at most 1. Conversely, it is easy to see that $X^k$ works. ∎

Exercise 3.5.34† (Korea Mathematical Olympiad Final Round 2019). Show that there exist infinitely many positive integers $k$ such that the sequence $(a_n)_{n \ge 0}$ defined by $a_0 = 1$, $a_1 = k + 1$ and
\[a_{n+2} = ka_{n+1} - a_n\]
for $n \ge 0$ contains no prime number.

Solution
Using Theorem C.4.1, we can see that $a_n = \frac{\alpha^n(1 + \alpha) - \beta^n(1 + \beta)}{\alpha - \beta}$, where $\alpha$ and $\beta$ are the roots of the characteristic polynomial $X^2 - kX + 1$. Indeed, we have $a_0 = 1$ and $a_1 = \alpha + \beta + 1 = k + 1$.

Let's express this formula in a more convenient form:
\begin{align*}
a_n &= \frac{\alpha^n(1 + \alpha) - \beta^n(1 + \beta)}{\alpha - \beta} \\
&= \frac{\alpha^n(1 + \alpha) - \frac{1}{\alpha^n}\left(1 + \frac{1}{\alpha}\right)}{\alpha - \frac{1}{\alpha}} \\
&= \frac{\alpha^n(1 + \alpha) - \frac{1}{\alpha^n} \cdot \frac{1 + \alpha}{\alpha}}{\frac{(1 + \alpha)(\alpha - 1)}{\alpha}} \\
&= \frac{1}{\alpha^n} \cdot \frac{\alpha^{2n+1} - 1}{\alpha - 1}.
\end{align*}
These manipulations might seem a bit random at first, but they are very simple and motivated: we have simply replaced $\beta$ by $\frac{1}{\alpha}$ and simplified it as much as possible. We can now see where cyclotomic polynomials appear:
\[\frac{\alpha^{2n+1} - 1}{\alpha - 1} = \prod_{d \mid 2n+1,\, d > 1} \Phi_d(\alpha)\]
is a product of cyclotomic polynomials! Note however that this product is trivial when $2n + 1 = p$ is prime, and that it is a product of cyclotomic polynomials evaluated at quadratic integers, so we need to be a bit careful, but this is still a good sign. How could we transform this into a non-trivial product even when $2n + 1 = p$ is prime? If $\alpha = \gamma^m$ was an $m$th power, we would have
\[\frac{\alpha^{2n+1} - 1}{\alpha - 1} = \frac{\gamma^{m(2n+1)} - 1}{\gamma^m - 1} = \prod_{d \mid m(2n+1),\, d \nmid m} \Phi_d(\gamma).\]
In particular, for $m = 2$, this product is always non-trivial. Note that given a quadratic integer $\gamma$ of norm 1, we can always construct a sequence $a_n$ associated with $\alpha = \gamma^m$, since $\alpha$ is also a quadratic integer of norm 1, and quadratic integers of norm 1 are exactly the roots of polynomials of the form $X^2 - kX + 1$. Now let's show that all these $\alpha$ work.

To show this, we take the norm of $\Phi_d(\gamma)$: if $\delta$ is the conjugate of $\gamma$, we have
\[a_n^2 = \frac{\gamma^{m(2n+1)} - 1}{\gamma^m - 1} \cdot \frac{\delta^{m(2n+1)} - 1}{\delta^m - 1} = \prod_{d \mid m(2n+1),\, d \nmid m} \Phi_d(\gamma)\Phi_d(\delta)\]
and $\Phi_d(\gamma)\Phi_d(\delta)$ is now a rational integer. First, we will prove that these factors are non-trivial, and then that they cannot be all equal to a rational prime, thus establishing that $a_n^2$ has at least two distinct prime factors so that $a_n$ isn't prime as wanted. (Note that this last step isn't needed if we had chosen, say, $m = 4$, but we prefer to give the smallest possible $m$.)

Without loss of generality suppose that $\gamma > \delta$. Since $\Phi_d(\delta) = \Phi_d(\gamma)/\gamma^{\varphi(d)}$, we want to have
\[\Phi_d(\gamma)^2 > \gamma^{\varphi(d)}.\]
Since $\Phi_d(\gamma) > (\gamma - 1)^{\varphi(d)}$, this is true for $(\gamma - 1)^2 \ge \gamma$, i.e. when $\gamma^2 + 1 = k\gamma \ge 3\gamma$. (This is not a bad result at all: for $k < 3$, the roots of $X^2 - kX + 1$ are either rational or non-real so it is normal that the situation gets weirder there. In general, it is very hard to estimate the size of linear recurrences with non-real roots. For instance, that's why we have this condition on $a^2 - 4b$ in Exercise 4.6.33†. See also Theorem 8.5.1.)

Now suppose that
\[\Phi_{2n+1}(\gamma)\Phi_{2n+1}(\delta) = p = \Phi_{2(2n+1)}(\gamma)\Phi_{2(2n+1)}(\delta).\]
If we were dealing with rational integers, we could say that this is impossible since $\frac{2(2n+1)}{2n+1}$ must be a power of $p$ but $p > 2$ by our previous inequalities. We are dealing with quadratic

integers instead, but it is not that different: we just use higher finite fields instead of only $\mathbb{F}_p$ (see Chapter 4). If $\eta \in \mathbb{F}_{p^2}$ is a root of $\pi_\gamma = X^2 - kX + 1$, we get
\[\Phi_{2n+1}(\eta) = \Phi_{2(2n+1)}(\eta) = 0\]
so $\eta$ has order both $\frac{2n+1}{p^{v_p(2n+1)}}$ and $\frac{2(2n+1)}{p^{v_p(2(2n+1))}}$ which implies that $p = 2$ as wanted. ∎
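For $m = 2$ the factorisation can be made completely explicit, which gives a quick numerical check. Assuming $\gamma + 1/\gamma = t$ with $t \ge 3$, so that $\alpha = \gamma^2$ corresponds to $k = t^2 - 2$, splitting $\frac{\gamma^{2(2n+1)} - 1}{\gamma^2 - 1}$ as $\frac{\gamma^{2n+1} - 1}{\gamma - 1} \cdot \frac{\gamma^{2n+1} + 1}{\gamma + 1}$ suggests $a_n = b_n c_n$, where $b_n$ and $c_n$ satisfy $x_{n+2} = t x_{n+1} - x_n$ with $b_0 = c_0 = 1$, $b_1 = t + 1$ and $c_1 = t - 1$. The auxiliary sequences $b_n$, $c_n$ and the function names below are introduced only for this sketch and do not appear in the solution.

```python
def sequence(first, second, k, length):
    """First `length` terms of x_0 = first, x_1 = second, x_{n+2} = k * x_{n+1} - x_n."""
    terms = [first, second]
    while len(terms) < length:
        terms.append(k * terms[-1] - terms[-2])
    return terms[:length]

def factorisation_holds(t, length=12):
    """Check a_n = b_n * c_n with both factors > 1 for 1 <= n < length, where k = t^2 - 2."""
    k = t * t - 2
    a = sequence(1, k + 1, k, length)     # the sequence of the exercise
    b = sequence(1, t + 1, t, length)     # hypothetical factor for (g^(2n+1) - 1) / (g - 1)
    c = sequence(1, t - 1, t, length)     # hypothetical factor for (g^(2n+1) + 1) / (g + 1)
    return all(x == y * z and y > 1 and z > 1
               for x, y, z in list(zip(a, b, c))[1:])

assert all(factorisation_holds(t) for t in range(3, 20))
print(sequence(1, 8, 7, 5))   # the case t = 3, k = 7
```

In this sketch every $k$ of the form $t^2 - 2$ with $t \ge 3$ yields a sequence whose terms with $n \ge 1$ are visibly composite, while $a_0 = 1$ is not prime.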

Remark 3.5.3
If $\alpha \in \mathbb{R}$ and we choose $\gamma$ to be the fundamental unit of $\mathbb{Q}(\alpha)$ (which may have norm $-1$, see Chapter 7), the same reasoning shows that if $\alpha = \gamma^m$ and $m$ is not an odd prime $p$, $a_n$ is composite except finitely many times (if $m = p^r$ the factorisation of $a_{\frac{p^s - 1}{2}}$ for $s \le r$ is trivial). In particular, if the fundamental unit has norm $-1$, any $\alpha$ works since it has norm 1 so $2 \mid m$. Conversely, we can conjecture that, for $m = p$ an odd prime, $a_n$ is prime infinitely many times. This is an analogue of the conjecture that there exist infinitely many Mersenne primes.

Exercise 3.5.35† (Iran Mathematical Olympiad 3rd round 2018). Let $a$ and $b$ be positive rational integers distinct from $\pm 1, 0$. Prove that there are infinitely many rational primes $p$ such that $a$ and $b$ have the same order modulo $p$. (You may assume Dirichlet's theorem.)

Solution

Without loss of generality, suppose that $a \ne b$. Note that, modulo $p$, if $\gcd(q, p - 1) = 1$, $a$ and $a^q$ always have the same order. Hence, we pick a prime $q$ and look at prime factors $p$ of $a^q - b$. Our goal is to prove that there are infinitely many of them which are not congruent to 1 modulo $q$. Note that if they were all congruent to 1 modulo $q$, then $a^q - b$ would be congruent to 1 modulo $q$ too so $q \mid a - b - 1$, which is easy to avoid. The idea will be to control the (for the sake of a contradiction) finitely many primes not congruent to 1 modulo $q$ to reach the same contradiction.

Say these primes are $p_1, \ldots, p_k$. We allow $q$ to vary here: these are the primes $p$ which divide at least one term of the form $a^q - b$ without being congruent to 1 modulo $q$. We wish to bound the $p$-adic valuation of $a^q - b$: for each $i$, depending on whether $p_i$ divides $a$ or not, set $m_i = v_{p_i}(b) + 1$ in the former case and $m_i = v_{p_i}(ab - 1) + 1$ in the latter. Now consider $N = \varphi(p_1^{m_1}) \cdot \ldots \cdot \varphi(p_k^{m_k})$ and a prime $q \equiv -1 \pmod N$ (there exists one by Dirichlet's theorem, or by Theorem 4.4.1). We have
\[a^q - b \equiv \begin{cases} -b \pmod{p_i^{m_i}} & \text{if } p_i \mid a \\ a^{-1} - b \equiv \frac{1 - ab}{a} \pmod{p_i^{m_i}} & \text{otherwise.} \end{cases}\]

We have successfully evaluated the contribution of our primes $p_i$: if $p = p_i \mid a$, then $v_p(a^q - b) = v_p(b)$, otherwise $v_p(a^q - b) = v_p(ab - 1)$. If all other prime factors of $a^q - b$ were congruent to 1 modulo $q$, we would thus have
\[a - b \equiv a^q - b \equiv \prod_{p_i \mid a} p_i^{v_{p_i}(b)} \prod_{p_i \nmid a} p_i^{v_{p_i}(ab - 1)} \pmod q.\]
In particular, for large $q$,
\[\prod_{p_i \mid a} p_i^{v_{p_i}(b)} \prod_{p_i \nmid a} p_i^{v_{p_i}(ab - 1)} = a - b.\]

Now, note that the only property of the $p_i$ we have used is that every other prime factor of $a^q - b$ is congruent to 1 modulo $q$. Hence, we may assume that the prime factors of $ab - 1$ are among them. Since any $p \mid ab - 1$ doesn't divide $a$, this yields $v_p(ab - 1) = v_p(a - b)$ for every $p \mid ab - 1$. Hence, $ab - 1 \mid a - b$. This is clearly impossible since $a \ne b$ and $a, b > 1$ so $|ab - 1| > |a - b| > 0$.

Remark 3.5.4
It is more natural to try this approach with $q \equiv 1 \pmod N$ at first. However, this only gives us the equality
\[a - b \equiv a^q - b \equiv \prod_{p_i \mid a} p_i^{v_{p_i}(b)} \prod_{p_i \nmid a} p_i^{v_{p_i}(a - b)}\]
which provides almost no information: it only yields $v_p(a - b) = v_p(b)$ for $p \mid a$ (it implies $v_p(a) \ge v_p(b)$, and by symmetry $v_p(a) = v_p(b)$, but since this is only for $p \mid \gcd(a, b)$ it is not sufficient to finish). Thus, we need to try with $q \equiv r \pmod N$. Since $q$ is prime, $r$ needs to be coprime with $N$ and this can give complicated choices of $r$ such as the smallest prime which doesn't divide $N$. Then, we need to evaluate $v_p(a^r - b)$: if it is larger than $v_p(a - b)$ we are done by the equality $a - b = \prod_{p_i \mid a} p_i^{v_{p_i}(b)} \prod_{p_i \nmid a} p_i^{v_{p_i}(a^r - b)}$, and by the same equality we are done if it is smaller. Then, we can vary $r$ so that $v_p(a^r - b) < v_p(a - b)$ but this is complicated since we need to take into account the prime factors of $N$ and choose a coprime $r$. Finally, we can realise that in fact there is a very natural choice of $r$ coprime with $N$ apart from 1, and that is $-1$. This is also very good in the sense that $a$ and $\frac{1}{a}$ are the powers of $a$ which are the most likely to be distinct modulo $p$: if they aren't, we have $a \equiv a^r$ for any odd $r$ so all other choices of $r$ aren't better. These considerations give the above solution.

Exercise 3.5.37† (IMC 2010). Let $f : \mathbb{R} \to \mathbb{R}$ be a function and $a < b$ two real numbers. Suppose that $f$ is zero on $[a, b]$, and
\[\sum_{k=0}^{p-1} f\left(x + \frac{k}{p}\right) = 0\]
for any $x \in \mathbb{R}$ and any rational prime $p$. Prove that $f$ is zero everywhere.

Solution

Let $N$ be a positive integer that we will choose later. Define $I \subseteq \mathbb{R}[X]$ as the set of polynomials $a_n X^n + \ldots + a_0$ such that the function
\[x \mapsto \sum_{k=0}^{n} a_k f\left(x + \frac{k}{N}\right)\]

is identically zero. We claim that I is an ideal of R[X], meaning that it is closed under addition
and closed under multiplication by any polynomial. The former is clear, for the latter note
that multiplication by X k corresponds to a translation (and that multiplication by constants
obviously doesn’t change anything to the condition). The key point about ideals is that we can
take the gcd of two polynomials: indeed, if u, v are elements of I, by Bézout’s lemma there exist
polynomials r, s ∈ R[X] such that ru + sv = gcd(u, v), and since I is an ideal, ru + sv ∈ I.

Now, we use the second condition of the statement. This gives us that, for any rational prime $p \mid N$, the polynomial
\[u_p = 1 + X^{N/p} + X^{2N/p} + \ldots + X^{(p-1)N/p} = \frac{X^N - 1}{X^{N/p} - 1}\]
is in $I$. Let's compute the gcd of these polynomials when $p$ ranges through the prime factors of $N$: the roots of $u_p$ are $N$th roots of unity with order not dividing $N/p$. Thus, the gcd of the $u_p$ is exactly the polynomial whose roots are the primitive $N$th roots of unity, i.e. $\Phi_N$.

Now, since $\varphi(N)/N$ can be arbitrarily small by Exercise 3.5.14, choose $N$ so that $\varphi(N)/N \le b - a$. Let $x$ be an element of $\left[a - \frac{1}{N}, a\right]$. By definition of $I$, since $\Phi_N = \sum_i \phi_i X^i \in I$, we have
\[\sum_{k=0}^{\varphi(N)} \phi_k f\left(x + \frac{k}{N}\right) = 0.\]

Note that the arguments of all terms in this sum are in $[a, b]$ except the first one, since
\[a \le x + \frac{k}{N} \le x + \frac{\varphi(N)}{N} \le a + (b - a) = b\]
for $1 \le k \le \varphi(N)$. Thus, we also have $f(x) = \phi_0 f(x) = 0$, i.e. $f$ is identically zero on $\left[a - \frac{1}{N}, b\right]$. Similarly, $f$ is identically zero on $\left[a, b + \frac{1}{N}\right]$. By induction, $f$ is identically zero on $\left[a - \frac{k}{N}, b + \frac{k}{N}\right]$ for any $k \in \mathbb{N}$, i.e. $f$ is zero on $\mathbb{R}$ as wanted. ∎

Exercise 3.5.40†. Let $n \ge 1$ be an integer. Prove that $\Phi_n(x) \ge (x - 1)x^{\varphi(n) - 1}$ with equality if and only if $n = 1$.⁵

Solution

We clearly have equality when $n = 1$, thus assume that $n \ge 2$. We present ABCDE's solution on AoPS, see https://artofproblemsolving.com/community/c6h1596694p9917603. Write
\[\Phi_n(x) = \prod_{d \mid n} (x^d - 1)^{\mu(n/d)},\]
by Exercise 3.5.19. We wish to prove that this product is greater than $(x - 1)x^{\varphi(n) - 1}$, i.e., by dividing by $x^{\varphi(n)}$, that
\[\prod_{d \mid n} (1 - x^{-d})^{\mu(n/d)} \ge 1 - x^{-1}.\]
Now, take the logarithm to get
\[\sum_{d \mid n} \mu(n/d) \log(1 - x^{-d}) \ge \log(1 - x^{-1}).\]
Recall the Taylor series of the logarithm:
\[\log(1 - y) = -\sum_{k=1}^{\infty} \frac{y^k}{k},\]
valid for $|y| < 1$. Thus, we wish to prove that
\[\sum_{k=1}^{\infty} \frac{1}{k} \sum_{d \mid n} \mu(n/d) x^{-kd} = \sum_{d \mid n} \mu(n/d) \sum_{k=1}^{\infty} \frac{x^{-kd}}{k} \le \sum_{k=1}^{\infty} \frac{x^{-k}}{k}.\]
Note that we exchanged the two sums thanks to absolute convergence. Finally, to show this, we will prove that each term on the left is less than the term on the right, i.e. that
\[\sum_{d \mid n} \mu(n/d) x^{-kd} \le x^{-k}.\]
More specifically, we will prove that $\sum_{d \mid n} \mu(n/d) y^{-d} \le y^{-1}$ for all $y \ge 2$. We distinguish a few cases.
1. $n$ is squarefree and has an even number of prime factors, i.e. $\mu(n) = 1$. This is the most interesting case, and the one where the inequality is the sharpest. Since $\mu(n) = 1$,
\[\sum_{d \mid n} \mu(n/d) y^{-d} = y^{-1} - y^{-p} + \ldots,\]
5 In particular, $\Phi_n(2) \ge 2^{\varphi(n) - 1}$.



where $p$ is the smallest prime factor of $n$. Now, notice that the dots have absolute value less than
\[\sum_{d=p+1}^{\infty} y^{-d} = y^{-(p+1)} \cdot \frac{1}{1 - y^{-1}} \le y^{-p}\]
since $\frac{1}{1 - y^{-1}} \le 2 \le y$. Thus, $|\ldots| < y^{-p}$, i.e.
\[\sum_{d \mid n} \mu(n/d) y^{-d} = y^{-1} - (y^{-p} + \ldots) \le y^{-1}\]
as wanted.
2. $n$ is squarefree and has an odd number of prime factors, i.e. $\mu(n) = -1$. In that case, we have
\[\sum_{d \mid n} \mu(n/d) y^{-d} = -y^{-1} + \ldots,\]
where the dots have absolute value less than
\[\sum_{d=2}^{\infty} y^{-d} = y^{-2} \cdot \frac{1}{1 - y^{-1}} \le y^{-1}\]
so
\[\sum_{d \mid n} \mu(n/d) y^{-d} = -y^{-1} + \ldots < 0 < y^{-1}.\]

3. $n$ is not squarefree, i.e. $\mu(n) = 0$. In that case, we have
\[\sum_{d \mid n} \mu(n/d) y^{-d} \le \sum_{d=2}^{\infty} y^{-d} = y^{-2}\left(1 - y^{-1}\right)^{-1} \le y^{-1}.\]
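The inequality of the exercise can be spot-checked for small integer values. The sketch below assumes integer $x \ge 2$ (the range used in the solution) and reads $\varphi(n)$ off as the degree of $\Phi_n$; the helper names are illustrative only.

```python
from functools import lru_cache

def exact_div(num, den):
    """Quotient of integer polynomials (coefficients from degree 0 up); den must be monic."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in reversed(range(len(quot))):
        quot[i] = num[i + len(den) - 1]
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    assert not any(num), "division was not exact"
    return tuple(quot)

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n, computed from X^n - 1 = prod_{d | n} Phi_d."""
    poly = (-1,) + (0,) * (n - 1) + (1,)
    for d in range(1, n):
        if n % d == 0:
            poly = exact_div(poly, cyclotomic(d))
    return poly

def evaluate(poly, x):
    """Horner evaluation of a coefficient tuple (degree 0 first) at x."""
    value = 0
    for c in reversed(poly):
        value = value * x + c
    return value

violations, equalities = [], []
for n in range(1, 81):
    phi_n = len(cyclotomic(n)) - 1          # degree of Phi_n, i.e. Euler's phi(n)
    for x in range(2, 8):
        lhs, rhs = evaluate(cyclotomic(n), x), (x - 1) * x ** (phi_n - 1)
        if lhs < rhs:
            violations.append((n, x))
        elif lhs == rhs:
            equalities.append((n, x))
print(violations, {n for n, _ in equalities})
```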


Chapter 4

Finite Fields

Exercise 4.0.1. Suppose $K$ is a field of characteristic zero, i.e.
\[\underbrace{1 + \ldots + 1}_{n \text{ times}}\]
(where 1 is the multiplicative identity) is never zero for any $n \ge 1$. Prove that $K$ contains (up to relabelling of the elements) $\mathbb{Q}$.¹

Solution

We consider the following injective morphism $\mathbb{Q} \to K$; its image will be the copy of $\mathbb{Q}$ inside $K$. Send $n \in \mathbb{N}$ to
\[\xi(n) = \underbrace{1 + \ldots + 1}_{n \text{ times}}\]
where 1 is the multiplicative identity of $K$. Then send $-n$ to the additive inverse $-\xi(n)$ of $\xi(n)$. Finally, send $a/b$ to $\xi(a)/\xi(b)$. It is clear that this is a well defined morphism, and this is injective since $K$ has characteristic zero. Indeed, by expanding we get $\xi(mn) = \xi(m)\xi(n)$ for any $m, n \in \mathbb{N}$, which means it's true for $m, n \in \mathbb{Z}$ too by adding signs where needed. In particular, if $a/b = c/d$ then $\xi(a)/\xi(b) = \xi(c)/\xi(d)$, which shows that it is well-defined. To show that it is multiplicative on all of $\mathbb{Q}$ and thus a morphism, we see that, for $a, b, c, d \in \mathbb{Z}$ with $b, d \ne 0$, we have
\[\xi(ac/bd) = \xi(a/b)\xi(c/d) \iff \frac{\xi(ac)}{\xi(bd)} = \frac{\xi(a)\xi(c)}{\xi(b)\xi(d)}\]
which is true since $\xi$ is multiplicative on $\mathbb{Z}$. ∎

Exercise 4.0.2∗ . Let p be a rational prime. Prove that there exists a unique field with p elements
(it’s Z/pZ).

Solution

Let $F$ be a field with $p$ elements. First we prove that $F$ has characteristic $p$. For this, we shall prove that the characteristic of a finite ring divides its cardinality, thus proving that $F$ has characteristic 1 or $p$ but the former is impossible since it is non-trivial. Let $m$ be the characteristic of a ring $R$. Partition $R$ into sets of the form $\{a, a + 1, \ldots, a + m - 1\}$. These sets either coincide or are pairwise disjoint: if $a + i = b + j$ then $\{a, \ldots, a + m - 1\} = \{b, \ldots, b + m - 1\}$. Thus the cardinality of $R$ is divisible

1 Technically, it will usually not contain Q because Q is a very specific object. Indeed, the definition of a field is

extremely sensitive: if you change the set K (relabel its elements) but keep everything else the same you get a different
field. In that case we say the new field is isomorphic to the old one. So you must prove that K contains a field isomorphic
to Q, i.e. Q up to relabeling of its elements.


by $m$ since each such set has cardinality $m$.

Now, identify $n \in \mathbb{F}_p$ with
\[\underbrace{1 + \ldots + 1}_{n \text{ times}}.\]
This is well defined because $\mathbb{F}_p$ and $F$ have the same characteristic, thus yields a morphism (it is clearly multiplicative and additive) between $\mathbb{F}_p$ and $F$, which is clearly injective. Since $F$ and $\mathbb{F}_p$ have the same cardinality, this is an isomorphism. ∎

Remark 4.0.1
What we did can be summarised as follows: use Lagrange's theorem on the additive group of $F$ to prove that $F$ has characteristic $p$, then conclude with Exercise A.2.3∗ that $F$ contains a copy of $\mathbb{F}_p$, which means that they are isomorphic since they have the same cardinality.

Exercise 4.0.3∗ . Prove that F3 (i) := F3 + iF3 is a field (with 9 elements). (The hard part is to prove
that each element has an inverse.)

Solution

The inverse of $a + ib$ is given by $\frac{a - ib}{a^2 + b^2}$ since $(a + ib)(a - ib) = a^2 + b^2$. Note that this is well defined since $a^2 + b^2 = 0$ iff $a = b = 0$, as the polynomial $X^2 + 1$ has no root in $\mathbb{F}_3$. ∎
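Since $\mathbb{F}_3(i)$ has only 9 elements, the inverse formula can be verified exhaustively. In this sketch an element $a + ib$ is represented by the pair `(a, b)` with entries in $\{0, 1, 2\}$; the function names are illustrative, and `pow(x, -1, 3)` (a modular inverse) assumes Python 3.8 or later.

```python
from itertools import product

def mul(u, v):
    """Product of a + ib and c + id in F_3(i), using i^2 = -1."""
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % 3, (a * d + b * c) % 3)

def inverse(u):
    """Inverse (a - ib) / (a^2 + b^2) of a non-zero element a + ib."""
    a, b = u
    norm_inverse = pow(a * a + b * b, -1, 3)   # fails only if a^2 + b^2 = 0 in F_3
    return ((a * norm_inverse) % 3, (-b * norm_inverse) % 3)

nonzero = [u for u in product(range(3), repeat=2) if u != (0, 0)]
assert len(nonzero) == 8
assert all(mul(u, inverse(u)) == (1, 0) for u in nonzero)
```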

4.1 Frobenius Morphism


Exercise 4.1.1. Why is commutativity (of R) needed?

Solution

We need $R$ to be commutative for the binomial expansion to work: for instance, $(a + b)^2 = a^2 + ab + ba + b^2$ which is $a^2 + 2ab + b^2$ if and only if $a$ and $b$ commute. ∎

Exercise 4.1.2∗ . Prove that an = αn + β n + γ n .

Solution

We have $1 + 1 + 1 = 3 = u_0$, $\alpha + \beta + \gamma = 0 = u_1$ by Vieta's formulas, and
\[\alpha^2 + \beta^2 + \gamma^2 = (\alpha + \beta + \gamma)^2 - 2(\alpha\beta + \beta\gamma + \gamma\alpha) = 2 = u_2\]
by Vieta's formulas. ∎

4.2 Existence and Uniqueness


Exercise 4.2.1∗. Let $K$ be a field and $f \in K[X]$ an irreducible polynomial of degree $n$. Prove that
\[K(\alpha) := K + \alpha K + \ldots + \alpha^{n-1} K\]
is a field, where $\alpha$ is defined as a formal root of $f$, i.e. an object satisfying $f(\alpha) = 0$.

Solution

It is clear that $K(\alpha)$ is a commutative ring, since $\alpha^n$ is a linear combination of $1, \ldots, \alpha^{n-1}$ by definition ($f(\alpha) = 0$) so it is closed under multiplication (the other ring axioms are obvious). Thus the tricky part is to prove that every non-zero element has an inverse. Note that this is not necessarily true if $f$ is reducible: if $f = gh$ we have $g(\alpha)h(\alpha) = 0$ and $g(\alpha), h(\alpha)$ could both be non-zero (keep in mind that $\alpha$ is just a formal object satisfying $f(\alpha) = 0$).

Let $g(\alpha)$ be a non-zero element of $K(\alpha)$, i.e. $f \nmid g$. We use Bézout's lemma in $K[X]$ ($K[X]$ is Euclidean for the degree map so Bézout's lemma holds there too): since $f$ is irreducible, it is coprime with $g$ so there exist $r, s \in K[X]$ such that
\[rf + sg = 1.\]
Evaluation at $\alpha$ yields $s(\alpha)g(\alpha) = 1$ as wanted. ∎

Remark 4.2.1
In fact, if $\alpha$ is a root of $f$, we have $K(\alpha) \cong K[X]/(f)$, where $K[X]/(f)$ means $K[X]$ modulo $f$.
This gives a more abstract way of constructing a field extension of K where f has a root. Indeed,
the element X ∈ K[X]/(f ) is a root of f : f (X) is divisible by f . (Note that we treat f as an
element of K[X]/(f )[Y ] here, i.e. a polynomial in Y with coefficients in K[X]/(f ) (in fact its
coefficients are simply in K).)

4.3 Properties
Exercise 4.3.1∗. Let $a$ and $b$ be positive integers and $K$ a field. Prove that $X^a - 1$ divides $X^b - 1$ in $K[X]$ if and only if $a \mid b$. Similarly, if $x \ge 2$ is a rational integer, prove that $x^a - 1$ divides $x^b - 1$ in $\mathbb{Z}$ if and only if $a \mid b$.

Solution

Note that the roots of $X^a - 1$ are $a$th roots of unity and that these are all $b$th roots of unity if and only if $a \mid b$ (for instance by considering a primitive $a$th root of unity). Thus $X^a - 1 \mid X^b - 1$ if and only if $a \mid b$.

For the second part, $x^a - 1 \mid x^b - 1$ if and only if the order of $x$ modulo $x^a - 1$ divides $b$. Since $x$ has order $a$ modulo $x^a - 1$, this means $a \mid b$. ∎

Exercise 4.3.2∗. Let $f \in \mathbb{F}_p[X]$ be a polynomial of degree $n$. Prove that $f$ splits over $\mathbb{F}_{p^{n!}}$.

Solution

It suffices to prove that an irreducible polynomial $g$ of degree at most $n$ has its roots in $\mathbb{F}_{p^{n!}}$, since any polynomial of degree $n$ is a product of such polynomials. This is true by Corollary 4.3.3: $\deg g$ is at most $n$ and thus divides $n!$. ∎

4.4 Cyclotomic Polynomials


Exercise 4.4.1∗ . Prove Proposition 4.4.1.
268 CHAPTER 4. FINITE FIELDS

Solution

The formula for $\Phi_m$ follows from Vieta's formulas. The formula for $\Phi_n$ follows from Proposition 3.1.2 by induction on $n/m = p^k$. ∎

Exercise 4.4.2∗. Let $m$ be a positive integer with $p \nmid m$. Prove that $\Phi_m$ has a root in $\mathbb{F}_{p^n}$ if and only if $m \mid p^n - 1$.

Solution

Since the order of any element of $\mathbb{F}_{p^n}^\times$ divides $p^n - 1$ by Theorem 4.2.1, if $\Phi_m$ has a root in $\mathbb{F}_{p^n}$ we get $m \mid p^n - 1$, since this root has order $m$. For the converse, if $m \mid p^n - 1$ then
\[ \Phi_m \mid X^{p^n - 1} - 1 = \prod_{a \in \mathbb{F}_{p^n}^\times} (X - a). \]
The RHS splits in $\mathbb{F}_{p^n}$ so the LHS does too, and in particular it has at least one root there. ∎

Exercise 4.4.3. Prove that p2 ≡ 1 (mod 9) if and only if p ≡ ±1 (mod 9).

Solution

$p^2 \equiv 1 \pmod 9$ iff $9 \mid p^2 - 1 = (p - 1)(p + 1)$. The two factors have gcd dividing 2, so 3 divides at most one of them. Hence 9 divides $p^2 - 1$ iff 9 divides $p - 1$ or $p + 1$, i.e. iff $p \equiv \pm 1 \pmod 9$. ∎

Exercise 4.4.4. Compute Ψ1 , . . . , Ψ8 .

Solution

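We have $\Psi_1 = X - 2$, $\Psi_2 = X + 2$,
\[ \Psi_3\left(X + \frac{1}{X}\right) = \frac{\Phi_3}{X} = X + 1 + \frac{1}{X} \]
so $\Psi_3 = X + 1$,
\[ \Psi_4\left(X + \frac{1}{X}\right) = \frac{\Phi_4}{X} = X + \frac{1}{X} \]
so $\Psi_4 = X$,
\begin{align*}
\Psi_5\left(X + \frac{1}{X}\right) &= \frac{\Phi_5}{X^2} \\
&= X^2 + X + 1 + \frac{1}{X} + \frac{1}{X^2} \\
&= \left(X + \frac{1}{X}\right)^2 + \left(X + \frac{1}{X}\right) - 1
\end{align*}
so $\Psi_5 = X^2 + X - 1$,
\[ \Psi_6\left(X + \frac{1}{X}\right) = \frac{\Phi_6}{X} = X - 1 + \frac{1}{X} \]
so $\Psi_6 = X - 1$,
\begin{align*}
\Psi_7\left(X + \frac{1}{X}\right) &= \frac{\Phi_7}{X^3} \\
&= X^3 + X^2 + X + 1 + \frac{1}{X} + \frac{1}{X^2} + \frac{1}{X^3} \\
&= \left(X + \frac{1}{X}\right)^3 + \left(X + \frac{1}{X}\right)^2 - 2\left(X + \frac{1}{X}\right) - 1
\end{align*}
so $\Psi_7 = X^3 + X^2 - 2X - 1$, and finally
\[ \Psi_8\left(X + \frac{1}{X}\right) = \frac{\Phi_8}{X^2} = X^2 + \frac{1}{X^2} = \left(X + \frac{1}{X}\right)^2 - 2 \]
so $\Psi_8 = X^2 - 2$. ∎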
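The values of $\Psi_3, \ldots, \Psi_8$ can be double-checked numerically: the identity $\Psi_n(x + 1/x)\, x^{\varphi(n)/2} = \Phi_n(x)$ is an identity between polynomials of small degree, so verifying it at many rational points is enough. This is a sketch only; the dictionaries `PHI` and `PSI` (coefficients listed from the constant term upwards) and the helper names are assumptions made for the illustration.

```python
from fractions import Fraction

# Coefficient lists, constant term first (illustrative encoding).
PHI = {3: [1, 1, 1], 4: [1, 0, 1], 5: [1, 1, 1, 1, 1],
       6: [1, -1, 1], 7: [1, 1, 1, 1, 1, 1, 1], 8: [1, 0, 0, 0, 1]}
PSI = {3: [1, 1], 4: [0, 1], 5: [-1, 1, 1],
       6: [-1, 1], 7: [-1, -2, 1, 1], 8: [-2, 0, 1]}


def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list at x."""
    return sum(c * x**i for i, c in enumerate(coeffs))


def identity_holds(n, x):
    """Check Psi_n(x + 1/x) * x^(phi(n)/2) == Phi_n(x) at a non-zero rational x."""
    half_degree = (len(PHI[n]) - 1) // 2
    return evaluate(PSI[n], x + 1 / x) * x**half_degree == evaluate(PHI[n], x)


psi_ok = all(identity_holds(n, Fraction(a, b))
             for n in PHI for a in range(1, 8) for b in range(1, 8))
```

Exact rational arithmetic is used on purpose, so that no floating-point tolerance is needed.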

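Exercise 4.4.5∗. Let $p$ be an integer with $|p| \ge 2$. Prove that the numbers $m/p^k$ with $m \in \mathbb{Z}$ and $k \in \mathbb{N}$ are dense in $\mathbb{R}$.

Solution

Let $x \in \mathbb{R}$ be a real number. We may assume that $p \ge 2$ and $x \ge 0$, since the set of numbers $m/p^k$ is unchanged when $p$ is replaced by $-p$ and is symmetric about 0. Write $x$ in base $p$: $x = \sum_{i=-\infty}^{N} a_i p^i$ with $a_i \in \{0, \ldots, p - 1\}$. Let $\alpha = \sum_{i=-(M-1)}^{N} a_i p^i$, which is a fraction with denominator a power of $p$. Then,

\[ \left| x - \sum_{i=-(M-1)}^{N} a_i p^i \right| \le \sum_{i=-\infty}^{-M} (p-1) p^i = \frac{p-1}{p^M} \sum_{i=-\infty}^{0} p^i = \frac{1}{p^M} \cdot \frac{p-1}{1 - \frac{1}{p}} \]

which goes to zero as $M \to \infty$, thus showing the wanted density. ∎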

Exercise 4.4.6∗ . Prove that the leading coefficient of Ψn is 1.

Solution

$\Psi_1 = X - 2$ and $\Psi_2 = X + 2$ are clearly monic so assume $n > 2$. Let $a$ be the leading coefficient of $\Psi_n$. Then, the leading term of $\Phi_n / X^{\varphi(n)/2} = \Psi_n(X + 1/X)$ comes from $a(X + 1/X)^{\varphi(n)/2}$ and is thus $aX^{\varphi(n)/2}$. Since $\Phi_n$ is monic, $\Psi_n$ is too. ∎

4.5 Quadratic Reciprocity


Exercise 4.5.1∗ . Prove Proposition 4.5.1.

Solution

Let $g$ be a primitive root modulo $p$. Then $a$ is a non-zero square modulo $p$ if and only if it has the form $g^{2k}$ for some $k$, which is exactly equivalent to $a^{\frac{p-1}{2}} = 1$ (for $a = g^{2k}$ we get $a^{\frac{p-1}{2}} = g^{k(p-1)} = 1$).

Without primitive roots, one can also do a bit of elementary counting: there are exactly $\frac{p-1}{2}$ non-zero quadratic residues (they come in pairs $x^2, (-x)^2$, since $x^2 = y^2 \iff x = \pm y$) and all of them are roots of $X^{\frac{p-1}{2}} - 1$ by Fermat's little theorem. The quadratic non-residues must therefore be roots of
\[ \frac{X^{p-1} - 1}{X^{\frac{p-1}{2}} - 1} = X^{\frac{p-1}{2}} + 1. \]


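Exercise 4.5.2. Compute $\left(\frac{77}{101}\right)$.

Solution

We have
\begin{align*}
\left(\frac{77}{101}\right) &= \left(\frac{7}{101}\right)\left(\frac{11}{101}\right) \\
&= \left(\frac{101}{7}\right)\left(\frac{101}{11}\right) \\
&= \left(\frac{3}{7}\right)\left(\frac{2}{11}\right) \\
&= \left(\frac{7}{3}\right) \\
&= 1.
\end{align*}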
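The computation above can be cross-checked with Euler's criterion (Proposition 4.5.1), which evaluates a Legendre symbol with one modular exponentiation. A minimal sketch; the function name `legendre` is illustrative.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)  # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r


symbol_77_101 = legendre(77, 101)
# Independent brute-force check: is 77 actually a square modulo 101?
has_square_root = any(x * x % 101 == 77 for x in range(101))
```

The brute-force search is only there as an independent confirmation; it is quadratic in spirit and not meant for large primes.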

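Exercise 4.5.3. Prove that $\Psi_8 = X^2 - 2$ and that $(-1)^{\frac{p^2-1}{8}} = 1$ if and only if $p \equiv \pm 1 \pmod 8$.

Solution

We have already computed $\Psi_8$ in Exercise 4.4.4. For the second part, write $p = 8k + r$ with $r \in \{\pm 1, \pm 3\}$: then $\frac{p^2-1}{8} \equiv \frac{r^2-1}{8} \pmod 2$, which is 0 for $r = \pm 1$ and 1 for $r = \pm 3$. ∎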

 
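Exercise 4.5.4∗. Prove that, for any $\ell \in \mathbb{F}_q$, $g_\ell = \left(\frac{\ell}{q}\right) g$.

Solution

If $\ell = 0$ then both sides are 0. Otherwise, $\ell$ is invertible so

\[ \left(\frac{\ell}{q}\right) g_\ell = \sum_{k \in \mathbb{F}_q} \left(\frac{k\ell}{q}\right) \omega^{k\ell} = g \]

which yields $g_\ell = \left(\frac{\ell}{q}\right) g$ since $\left(\frac{\ell}{q}\right)^2 = 1$. ∎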

Exercise 4.5.5∗ . Prove without computing g 2 that g has exactly 2 conjugates, i.e. is a quadratic
number.

Solution
$\prod_i (X - g_i)$ has rational coefficients by the fundamental theorem of symmetric polynomials, so the conjugates of $g$ are among $g$ and $-g$. Conversely, $g$ and $-g$ are both conjugates of $g$ since, if
\[ f\left(\sum_i \left(\frac{i}{p}\right) X^i\right) \]
has a root at $\omega$, it also has a root at $\omega^k$ for $p \nmid k$. Thus, if $f(g) = 0$ then $f(g_i) = 0$ too. ∎

4.6 Exercises
Dirichlet Convolutions
Exercise 4.6.1† (Dirichlet Convolution). A function $f$ from $\mathbb{N}^*$ to $\mathbb{C}$ is said to be an arithmetic function. Define the Dirichlet convolution$^2$ $f * g$ of two arithmetic functions $f$ and $g$ as
\[ n \mapsto \sum_{d \mid n} f(d) g(n/d) = \sum_{ab = n} f(a) g(b). \]

Prove that the Dirichlet convolution is associative. In addition, prove that if $f$ and $g$ are multiplicative$^3$, meaning that $f(mn) = f(m)f(n)$ and $g(mn) = g(m)g(n)$ for all coprime $m, n \in \mathbb{N}^*$, then so is $f * g$.

Solution

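Let $f, g, h$ be three arithmetic functions and let $n \in \mathbb{N}^*$. Then,

\begin{align*}
((f * g) * h)(n) &= \sum_{cd = n} (f * g)(d) h(c) \\
&= \sum_{cd = n,\, ab = d} f(a) g(b) h(c) \\
&= \sum_{abc = n} f(a) g(b) h(c).
\end{align*}

Similarly,
\begin{align*}
(f * (g * h))(n) &= \sum_{ad = n} f(a) (g * h)(d) \\
&= \sum_{ad = n,\, bc = d} f(a) g(b) h(c) \\
&= \sum_{abc = n} f(a) g(b) h(c)
\end{align*}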

which shows that the Dirichlet convolution is associative. Now, suppose that $f$ and $g$ are multiplicative and let $m, n$ be two coprime positive integers. We have
\[ (f * g)(mn) = \sum_{d \mid mn} f(d) g\left(\frac{mn}{d}\right) = \sum_{a \mid m,\, b \mid n} f(ab) g\left(\frac{mn}{ab}\right) \]
because $m$ and $n$ are coprime, so each divisor of $mn$ is uniquely a divisor of $m$ times a divisor of $n$. By multiplicativity of $f$ and $g$, this is
\[ \sum_{a \mid m,\, b \mid n} f(a) f(b) g(m/a) g(n/b) = \left(\sum_{a \mid m} f(a) g(m/a)\right) \left(\sum_{b \mid n} f(b) g(n/b)\right) = (f * g)(m)(f * g)(n) \]
so $f * g$ is also multiplicative. ∎

2 The Dirichlet convolution appears naturally in the study of Dirichlet series: the product of two Dirichlet series $\sum_{n=1}^{\infty} \frac{f(n)}{n^s}$ and $\sum_{n=1}^{\infty} \frac{g(n)}{n^s}$ is the Dirichlet series corresponding to the convolution of the coefficients, $\sum_{n=1}^{\infty} \frac{(f * g)(n)}{n^s}$.
3 This terminology has conflicting meanings: in algebra, it means that $f(xy) = f(x)f(y)$ for all $x, y$, while for arithmetic functions, it only means that $f(xy) = f(x)f(y)$ for coprime $x, y$.
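Both properties proved above are easy to test on concrete arithmetic functions. The sketch below assumes a naive divisor enumeration and three sample multiplicative functions (`one`, `ident`, `square`), all chosen only for illustration.

```python
from math import gcd


def divisors(n):
    """All positive divisors of n (naive enumeration)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def conv(f, g):
    """Dirichlet convolution f * g, returned as a function on positive integers."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))


def is_multiplicative(f, bound):
    """Check f(mn) = f(m) f(n) for all coprime m, n below the bound."""
    return all(f(m * n) == f(m) * f(n)
               for m in range(1, bound) for n in range(1, bound)
               if gcd(m, n) == 1)


one = lambda n: 1
ident = lambda n: n
square = lambda n: n * n

lhs = conv(conv(one, ident), square)
rhs = conv(one, conv(ident, square))
associative_ok = all(lhs(n) == rhs(n) for n in range(1, 60))
multiplicative_ok = is_multiplicative(conv(ident, square), 25)
```

For instance, `conv(one, one)` is the divisor-counting function.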

Exercise 4.6.2† (Möbius Inversion). Define the Möbius function $\mu : \mathbb{Z}_{\ge 1} \to \{-1, 0, 1\}$ by $\mu(n) = (-1)^k$ where $k$ is the number of prime factors of $n$ if $n$ is squarefree, and $\mu(n) = 0$ otherwise. Define also $\delta$ as the function mapping 1 to 1 and everything else to 0. Prove that $\delta$ is the identity element for the Dirichlet convolution: $f * \delta = \delta * f = f$ for all arithmetic functions $f$. In addition, prove that $\mu$ is the inverse of 1 for the Dirichlet convolution, meaning that $\mu * 1 = 1 * \mu = \delta$ where 1 is the function $n \mapsto 1$.$^4$

Solution

The first claim is very easy: for any $n \in \mathbb{N}^*$,

\[ (f * \delta)(n) = \sum_{d \mid n} \delta(d) f(n/d) = f(n). \]

For the second claim, note that the Möbius function is multiplicative. Hence, by Exercise 4.6.1†, $\mu * 1$ is as well. This means that, to prove that $\mu * 1$ is zero everywhere except at 1, we just need to prove that it's zero on prime powers. Thus, let $p^m \ne 1$ be a prime power. We have
\[ (\mu * 1)(p^m) = \sum_{d \mid p^m} \mu(d) = \mu(1) + \mu(p) = 1 - 1 = 0 \]
since $\mu(p^j) = 0$ when $j \ge 2$. To finish, we also have $(\mu * 1)(1) = \mu(1) = 1 = \delta(1)$. ∎
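The identity $\mu * 1 = \delta$ can be confirmed numerically for small $n$. This is a sketch under the assumption that $\mu$ is computed by plain trial division; the names `mobius` and `mu_conv_one` are illustrative.

```python
def mobius(n):
    """Möbius function computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original argument
            result = -result
        p += 1
    if n > 1:
        result = -result  # one prime factor is left over
    return result


def mu_conv_one(n):
    """(mu * 1)(n), i.e. the sum of mu(d) over the divisors d of n."""
    return sum(mobius(d) for d in range(1, n + 1) if n % d == 0)


delta_ok = all(mu_conv_one(n) == (1 if n == 1 else 0) for n in range(1, 500))
```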

Exercise 4.6.3† (Prime Number Theorem in Function Fields). Prove that the number of monic irreducible polynomials in $\mathbb{F}_p[X]$ of degree $n$ is
\[ N_n = \frac{1}{n} \sum_{d \mid n} \mu\left(\frac{n}{d}\right) p^d \]
and show that this is asymptotically equivalent to $\frac{p^n}{\log_p(p^n)}$.

Solution

The fact that $\mu$ is the inverse of 1 means that the equalities $g = 1 * f$ and $f = \mu * g$ are equivalent. Now, consider the number $f(n)$ of elements of $\overline{\mathbb{F}_p}$ of degree $n$. This is $n$ times the number of monic irreducible polynomials of degree $n$, by grouping them by minimal polynomial. However, we also have
\[ \sum_{d \mid n} f(d) = p^n \]
since this is the number of elements of $\mathbb{F}_{p^n}$. In other words, $f * 1 = (n \mapsto p^n)$. This means that $f = (n \mapsto p^n) * \mu$, i.e.
\[ f(n) = \sum_{d \mid n} \mu\left(\frac{n}{d}\right) p^d. \]

4 This also explains how we found the formula for Φn from Exercise 3.5.19
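Division by $n$ yields the formula for the number of irreducible polynomials of degree $n$. Now, observe that
\[ N_n - \frac{p^n}{n} = \frac{1}{n} \sum_{d \mid n,\, d < n} \mu\left(\frac{n}{d}\right) p^d \]
is at most
\[ \frac{1}{n} \sum_{k=1}^{\lfloor n/2 \rfloor} p^k \le \frac{p^{\lfloor n/2 \rfloor + 1} - 1}{n(p - 1)} = O\left(\frac{p^{n/2}}{n}\right) \]
in absolute value since the greatest strict divisor of $n$ is at most $n/2$. We conclude that

\[ N_n = \frac{p^n}{n} + O\left(\frac{p^{n/2}}{n}\right) \sim \frac{p^n}{n}. \]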
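For small $p$ and $n$, the formula for $N_n$ can be compared with a brute-force count. The sketch below assumes polynomials are stored as coefficient tuples (constant term first) and counts monic irreducibles by removing every product of two monic polynomials of lower degree; all function names are illustrative.

```python
from itertools import product


def mobius(n):
    """Möbius function computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def count_formula(p, n):
    """N_n = (1/n) * sum over d | n of mu(n/d) * p^d."""
    total = sum(mobius(n // d) * p**d for d in range(1, n + 1) if n % d == 0)
    return total // n


def monic_polys(p, deg):
    """All monic polynomials of the given degree over F_p."""
    return [c + (1,) for c in product(range(p), repeat=deg)]


def poly_mul(f, g, p):
    """Product of two coefficient tuples modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return tuple(out)


def count_bruteforce(p, n):
    """Monic irreducibles of degree n = all monic polynomials minus the reducible ones."""
    reducible = set()
    for d in range(1, n // 2 + 1):
        for f in monic_polys(p, d):
            for g in monic_polys(p, n - d):
                reducible.add(poly_mul(f, g, p))
    return p**n - len(reducible)


cases = [(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (3, 2), (3, 3)]
counts_agree = all(count_formula(p, n) == count_bruteforce(p, n) for p, n in cases)
```

The brute-force count grows like $p^n$, so it is only practical for the tiny cases listed.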

Linear Recurrences
Exercise 4.6.4† (China TST 2008). Define the sequence $(x_n)_{n \ge 1}$ by $x_1 = 2$, $x_2 = 12$ and $x_{n+2} = 6x_{n+1} - x_n$ for $n \ge 1$. Suppose $p$ and $q$ are rational primes such that $q \mid x_p$. Prove that, if $q \ne 2, 3$, then $q \ge 2p - 1$.

Solution

Without loss of generality, suppose that $p$ is odd. It is easy to see that $x_n = 2 \cdot \frac{\alpha^n - \beta^n}{\alpha - \beta}$ where $\alpha$ and $\beta$ are the roots of $X^2 - 6X + 1$. From now on we will assume $\alpha$ and $\beta$ to be the roots of $X^2 - 6X + 1$ in $\overline{\mathbb{F}_q}$, since we are working modulo $q$. We have $\alpha, \beta \in \mathbb{F}_{q^2}$ and $\Phi_p(\alpha, \beta) = 0$ by assumption. Thus, $\alpha/\beta$ has order $p$ unless $p = q$. If $p = q$, then $x_p \equiv \pm x_1 = \pm 2$ so $q = 2$. Otherwise, since the order of $\alpha/\beta$ divides $q^2 - 1$, we have $p \mid q^2 - 1$, i.e. $q \equiv \pm 1 \pmod p$ so $q \ge 2p - 1$ as wanted since $p \pm 1$ is even. ∎

 
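Exercise 4.6.6†. Let $p \ne 2, 5$ be a prime number. Prove that $p \mid F_{p - \varepsilon}$ where $\varepsilon = \left(\frac{5}{p}\right)$.

Solution

Let $\alpha \in \mathbb{F}_{p^2}$ be a square root of 5. The key point is that $\left(\frac{1 \pm \alpha}{2}\right)^{p - \varepsilon} = \varepsilon$. Indeed, $\left(\frac{1 \pm \alpha}{2}\right)^{p} = \frac{1 \pm \varepsilon\alpha}{2}$ by Frobenius (as $\alpha^p = 5^{\frac{p-1}{2}} \alpha = \varepsilon\alpha$). When $\varepsilon = 1$ this gives the claim directly, and when $\varepsilon = -1$ it gives the claim since $\frac{1 \pm \alpha}{2}$ times its conjugate is $-1$ (it is a root of $X^2 - X - 1$). Thus,
\[ F_{p - \varepsilon} \equiv \frac{\left(\frac{1 + \alpha}{2}\right)^{p - \varepsilon} - \left(\frac{1 - \alpha}{2}\right)^{p - \varepsilon}}{\alpha} = 0 \]
as wanted. ∎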

 
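Exercise 4.6.7†. Let $p \ne 2, 5$ be a rational prime. Prove that $p \mid F_p - \left(\frac{5}{p}\right)$.

Solution

Let $\alpha \in \mathbb{F}_{p^2}$ be a square root of 5. Then,

\[ F_p \equiv \frac{(1 + \alpha)^p - (1 - \alpha)^p}{2^p \alpha} = \frac{(1 + \alpha^p) - (1 - \alpha^p)}{2\alpha} = \alpha^{p-1} \]
which is $\left(\frac{5}{p}\right)$ as $\alpha^{p-1} = 5^{\frac{p-1}{2}}$. ∎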
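The two Fibonacci congruences of Exercises 4.6.6† and 4.6.7† can be tested directly for small primes. A minimal sketch, with illustrative helper names and the convention $F_0 = 0$, $F_1 = 1$ assumed:

```python
def fib(n):
    """n-th Fibonacci number with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))


primes = [p for p in range(3, 200) if is_prime(p) and p != 5]
# Exercise 4.6.6: p divides F_{p - eps} with eps = (5/p).
rank_ok = all(fib(p - legendre(5, p)) % p == 0 for p in primes)
# Exercise 4.6.7: F_p is congruent to (5/p) modulo p.
value_ok = all((fib(p) - legendre(5, p)) % p == 0 for p in primes)
```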

Exercise 4.6.8† . Let m ≥ 1 be an integer and p a rational prime. Find the maximal possible period
modulo p ≥ m of a sequence satisfying a linear recurrence of order m.

Solution

We prove that the maximum possible period is $p^m - 1$. Here is a construction: let $\alpha \in \mathbb{F}_{p^m}$ be an element of order $p^m - 1$, i.e. a primitive root of $\mathbb{F}_{p^m}$, with conjugates $\alpha_1, \ldots, \alpha_m$. Consider the sequence
\[ a_n := \sum_{i=1}^{m} \alpha_i^n, \]
which takes values in $\mathbb{F}_p$ by the fundamental theorem of symmetric polynomials (and is a linear recurrence of order $m$). Suppose that it has period $t$, i.e.

\[ a_{n+t} = a_n,\ a_{n+t+1} = a_{n+1},\ \ldots,\ a_{n+t+m-1} = a_{n+m-1} \]

for some $n$. Then, the Vandermonde determinant gives that $\alpha_i^{n+t} = \alpha_i^n$, i.e. $\alpha_i^t = 1$, by considering this as a system of equations with coefficients $\alpha_i^j$ and unknowns $\alpha_i^{n+t} - \alpha_i^n$. This shows that the period is $p^m - 1$.

Now, let $\sum_{i=1}^{r} f_i(n) \alpha_i^n$ be a linear recurrence of order $m$ (the $\alpha_i$ are not necessarily conjugates anymore). Suppose first that all $f_i$ are constant. Group the $\alpha_i$ into classes of conjugates and let $k_1, \ldots, k_s$ be the degrees of these classes. Since the period is at most the product of the orders of one representative of each class, and the order of an element of degree $k$ divides $p^k - 1$, the period is at most

\[ (p^{k_1} - 1) \cdot \ldots \cdot (p^{k_s} - 1) \]
which is at most $p^m - 1$ since $\sum_{i=1}^{s} k_i \le m$ (there might be repeated roots so we don't necessarily have equality).

Finally, if one of the $f_i$ is not constant, then we group the $\alpha_i$ as before. The difference is that now $\sum_{i=1}^{s} k_i \le m - 1$ (there is at least one repeated root). Since all polynomials have period dividing $p$, the period is at most

\[ p(p^{k_1} - 1) \cdot \ldots \cdot (p^{k_s} - 1) < p^m - 1. \]

Remark 4.6.1
It is interesting to note that this proof also characterises the linear recurrences with maximal period. Indeed, their characteristic polynomial must be the minimal polynomial of a primitive root of $\mathbb{F}_{p^m}$ by what we have seen, and, conversely, Vandermonde shows that all such sequences have period $p^m - 1$.
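The bound $p^m - 1$ can be observed experimentally on small cases. The sketch below assumes the recurrence is written $a_{n+m} = \sum_j c_j a_{n+j}$ with $c_0$ invertible modulo $p$ (so that the state sequence is purely periodic); the function name `period` and the sample recurrences are chosen for illustration.

```python
from itertools import product


def period(coeffs, init, p):
    """Period of a_{n+m} = sum(coeffs[j] * a_{n+j}) mod p started at init.

    Assumes coeffs[0] is invertible mod p, so the initial state recurs.
    """
    state = tuple(x % p for x in init)
    start = state
    steps = 0
    while True:
        nxt = sum(c * x for c, x in zip(coeffs, state)) % p
        state = state[1:] + (nxt,)
        steps += 1
        if state == start:
            return steps


# Fibonacci modulo 3: characteristic polynomial X^2 - X - 1, order m = 2.
fib_mod3 = period((1, 1), (0, 1), 3)
# a_{n+3} = a_{n+1} + a_n modulo 2: characteristic polynomial X^3 + X + 1.
cubic_mod2 = period((1, 1, 0), (0, 0, 1), 2)
# Exhaustive maximum over all order-2 recurrences modulo 3 with c_0 != 0.
max_order2_mod3 = max(
    period((c0, c1), init, 3)
    for c0 in (1, 2) for c1 in range(3)
    for init in product(range(3), repeat=2) if init != (0, 0)
)
```

In both sample recurrences the characteristic polynomial is the minimal polynomial of a primitive root, in line with Remark 4.6.1.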
Exercise 4.6.9†. Let $f \in \mathbb{Z}[X]$ be a polynomial and $(a_n)_{n \ge 0}$ be a linear recurrence of rational integers. Suppose that $f(n) \mid a_n$ for any rational integer $n \ge 0$. Prove that $\left(\frac{a_n}{f(n)}\right)_{n \ge 0}$ is also a linear recurrence.$^5$

Solution
Write $a_n = \sum_{i=1}^{m} f_i(n) \alpha_i^n$. We shall prove that $f \mid f_i$ for every $i$, thus showing the wanted result. Choose some $n$ and a large prime $p$ such that $p \mid f(n)$ using Theorem 5.2.1. Then consider $a_n, a_{n+p}, \ldots, a_{n+(m-1)p}$. These are all zero modulo $p$ since $f(k) \mid a_k$ and $f(n + jp) \equiv f(n) \equiv 0$. However, the Vandermonde determinant shows that this implies that either $f_i(n) \equiv 0$ for all $i$, or the determinant of $(\alpha_i^{jp})$ is zero, i.e. $\alpha_i^p \equiv \alpha_j^p$ for some $i \ne j$. This is clearly impossible for large $p$ since this implies
\[ p \mid \prod_{i \ne j} (\alpha_i - \alpha_j), \]
as $\alpha_i^p - \alpha_j^p \equiv (\alpha_i - \alpha_j)^p$ by Frobenius. Thus, we get $p \mid f(n) \implies p \mid f_i(n)$ for large $p$. We can then use Corollary 5.4.2 (it is clear that the proof also works for $f, g \notin \mathbb{Z}[X]$) to deduce that all irreducible factors of $f$ divide $f_i$ for every $i$. Simply divide $f$ and all $f_i$ by these irreducible factors, and repeat the argument. ∎

Polynomials and Elements of Fp


Exercise 4.6.11†. Let $a \in \mathbb{F}_p$ be non-zero. Prove that $X^{p^n} - X - a$ is irreducible over $\mathbb{F}_p$ if and only if $n = 1$, or $n = p = 2$.

Solution

Exercise 4.6.12† (ISL 2003). Let (an )n≥0 be a sequence of rational integers such that an+1 = a2n − 2.
Suppose an odd rational prime p divides an . Prove that p ≡ ±1 (mod 2n+2 ).

Solution
We prove that $f = X^2 - 2$ iterated $n$ times is $\Psi_{2^{n+2}}$. This means that $f^n\left(X + \frac{1}{X}\right) = \Phi_{2^{n+2}} / X^{2^n} = X^{2^n} + \frac{1}{X^{2^n}}$. Note that $f\left(X + \frac{1}{X}\right) = X^2 + \frac{1}{X^2}$ so this follows by induction.



Exercise 4.6.14† . Let f ∈ Fp [X] be an irreducible polynomial of odd degree. Prove that its discrim-
inant is a square in Fp .

Solution
The square root of the discriminant $\Delta = \prod_{i<j} (\alpha_i - \alpha_j)^2$ of a polynomial $f = \prod_{i=1}^{n} (X - \alpha_i)$ is
\[ \sqrt{\Delta} = \pm \prod_{i<j} (\alpha_i - \alpha_j). \]
Thus, if this was not in $\mathbb{F}_p$, $\mathbb{F}_{p^n}$ would contain $\mathbb{F}_p(\sqrt{\Delta}) = \mathbb{F}_{p^2}$ which is impossible since $2 \nmid n$. ∎

5 In fact, the Hadamard quotient theorem states that if a linear recurrence $b_n$ always divides another linear recurrence $a_n$ then $\left(\frac{a_n}{b_n}\right)_n$ is also a linear recurrence.

Remark 4.6.2

In particular, for $n = 3$, $\sqrt{\Delta} \in \mathbb{F}_p$ if and only if $f$ is irreducible or splits in $\mathbb{F}_p$.

Exercise 4.6.15† (Chevalley-Warning Theorem). Let $f_1, \ldots, f_m \in \mathbb{F}_{p^k}[X_1, \ldots, X_n]$ be polynomials such that $d_1 + \ldots + d_m < n$, where $d_i$ is the degree of $f_i$. Prove that, if $f_1, \ldots, f_m$ have a common root in $\mathbb{F}_{p^k}$, then they have another one.

Solution

We shall prove more strongly that the number of common roots is divisible by $p$. This follows from the following result: we have $\sum_{x \in \mathbb{F}_p} x^k = 0$ for $k < p - 1$ by Exercise A.3.11†, so the sum over $\mathbb{F}_p^n$ of $f(x)$ for any polynomial $f \in \mathbb{F}_p[X_1, \ldots, X_n]$ of degree less than $n(p - 1)$ also vanishes (since, in each monomial, one variable must have degree less than $p - 1$). This yields our claim when applied to the polynomial
\[ f = (1 - f_1^{p-1}) \cdot \ldots \cdot (1 - f_m^{p-1}) \]
(the powers mean exponentiation and not iteration). Indeed, this has degree less than $n(p - 1)$ by assumption, and $f(x)$ is 1 if $x$ is a common root of $f_1, \ldots, f_m$ and 0 otherwise. ∎

Squares and the Law of Quadratic Reciprocity


Exercise 4.6.19†. Let $q$ be a prime power, $a \in \mathbb{F}_q^\times$ and $m \ge 1$ an integer. Prove that $a$ is an $m$th power in $\mathbb{F}_q$ if and only if $a^{\frac{q-1}{\gcd(q-1,m)}} = 1$.

Solution

Let $g$ be a primitive root of $\mathbb{F}_q^\times$. Let $k$ be such that $a = g^k$. Then, $a$ is an $m$th power if and only if there is an $n$ such that $g^k = g^{mn}$, i.e. $k \equiv mn \pmod{q - 1}$, which is equivalent to $\gcd(q - 1, m) \mid k$. Finally, this is itself equivalent to $a^{\frac{q-1}{\gcd(q-1,m)}} = 1$. ∎
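The criterion can be compared with a brute-force search in the special case where $q = p$ is prime, so that $\mathbb{F}_q$ is just the integers modulo $p$. A minimal sketch with illustrative function names:

```python
from math import gcd


def is_mth_power_bruteforce(a, m, p):
    """Search for x with x^m = a in F_p (p prime, a non-zero)."""
    return any(pow(x, m, p) == a % p for x in range(1, p))


def is_mth_power_criterion(a, m, p):
    """Criterion a^((p-1)/gcd(p-1, m)) = 1 from the exercise, for q = p prime."""
    return pow(a, (p - 1) // gcd(p - 1, m), p) == 1


criteria_agree = all(
    is_mth_power_bruteforce(a, m, p) == is_mth_power_criterion(a, m, p)
    for p in (5, 7, 11, 13, 31)
    for m in range(1, 13)
    for a in range(1, p)
)
```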

Exercise 4.6.20†. Let $a$ be a rational integer. Suppose $a$ is a quadratic residue modulo every rational prime $p \nmid a$. Prove that $a$ is a perfect square.

Solution

Without loss of generality, suppose that $a = \varepsilon 2^n p_1 \cdot \ldots \cdot p_k$ is squarefree, where $\varepsilon = \pm 1$, $n \in \{0, 1\}$ and $p_1, \ldots, p_k$ are distinct odd primes. Suppose for the sake of a contradiction that $k \ge 1$. Let $r$ be a quadratic non-residue modulo $p_1$. Pick a prime $p \equiv 1 \pmod{8 p_2 \cdot \ldots \cdot p_k}$ and $p \equiv r \pmod{p_1}$, using Dirichlet's theorem. Then, since $p \equiv 1 \pmod 4$ we have $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ by the law of quadratic reciprocity for any odd prime $q \ne p$, and since $p \equiv 1 \pmod 8$ we have $\left(\frac{2}{p}\right) = \left(\frac{-1}{p}\right) = 1$. Thus,
\begin{align*}
\left(\frac{a}{p}\right) &= \left(\frac{\varepsilon}{p}\right) \left(\frac{2}{p}\right)^n \left(\frac{p_1}{p}\right) \cdot \ldots \cdot \left(\frac{p_k}{p}\right) \\
&= \left(\frac{p}{p_1}\right) \cdot \ldots \cdot \left(\frac{p}{p_k}\right) \\
&= \left(\frac{r}{p_1}\right) \left(\frac{1}{p_2}\right) \cdot \ldots \cdot \left(\frac{1}{p_k}\right) \\
&= (-1) \cdot 1 \cdot \ldots \cdot 1 \\
&= -1
\end{align*}

which is a contradiction. This means that $a \in \{\pm 1, \pm 2\}$; we could simply give counterexamples to $-1, \pm 2$, but we construct arbitrarily large ones so that the problem still holds with the slightly weaker assumption that $a$ is a quadratic residue modulo sufficiently large primes. For $a = -1$, simply pick any prime congruent to $-1$ modulo 4. For $a = 2$, pick a prime congruent to 3 modulo 8, and for $a = -2$, pick a prime congruent to $-1$ modulo 8.

Finally, we give a way to avoid the use of Dirichlet's theorem on primes in arithmetic progressions. Instead of picking a prime $p \equiv 1 \pmod{8 p_2 \cdot \ldots \cdot p_k}$ and $p \equiv r \pmod{p_1}$, we could simply choose $p$ to be such an integer with sufficiently large prime factors, and replace Legendre symbols by Jacobi symbols. ∎

Remark 4.6.3
This result illustrates the celebrated Chebotarev density theorem, which implies that the set of primes $p$ such that the polynomial $X^2 - a$ splits over $\mathbb{F}_p$ has density $\frac{1}{2}$ when it is irreducible (and of course 1 otherwise). (This can also be seen from a more careful observation of the quadratic reciprocity law and our solution of the exercise.) This theorem also implies that, if $a$ is an $n$th power modulo all sufficiently large primes, then it is an $n$th power if $8 \nmid n$, and an $\frac{n}{2}$th power otherwise (which is sharp, as shown by Exercise 4.6.21†). A note on this theorem: it does not imply that the set of primes $p$ such that an irreducible polynomial $f$ of degree $n$ splits over $\mathbb{F}_p$ has density $\frac{1}{n}$; this depends on its Galois group (see Chapter 6).

Exercise 4.6.21† . Prove that 16 is an eighth power modulo every prime but not an eighth power in
Q.

Solution

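Notice that $X^8 - 16 = (X^2 - 2)(X^2 + 2)(X^4 + 4)$. Thus, it has a root in $\mathbb{F}_p$ if 2 or $-2$ is a quadratic residue. Otherwise, $p \equiv 5 \pmod 8$ which implies that $-4$ is a fourth power in $\mathbb{F}_p$ so it has a root as well. Indeed,
\[ (-1)^{\frac{p-1}{4}} = -1 = 2^{\frac{p-1}{2}} = 4^{\frac{p-1}{4}} \]
so $(-4)^{\frac{p-1}{4}} = 1$ which means that $-4$ is a fourth power by Exercise 4.6.19†. Finally, $16 = 2^4$ is not an eighth power in $\mathbb{Q}$ since $\sqrt{2}$ is irrational. ∎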
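The statement is simple to test by exhaustive search modulo small primes. The following sketch uses illustrative helper names and checks every prime below 500.

```python
def has_eighth_root_of_16(p):
    """True if some x in F_p satisfies x^8 = 16."""
    return any(pow(x, 8, p) == 16 % p for x in range(p))


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))


small_primes = [p for p in range(2, 500) if is_prime(p)]
root_mod_every_prime = all(has_eighth_root_of_16(p) for p in small_primes)
# A rational root of the monic polynomial X^8 - 16 would be an integer of
# absolute value at most 2, so three candidates suffice.
no_integer_root = all(x**8 != 16 for x in (0, 1, 2))
```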

Exercise 4.6.22†. Prove that, if a polynomial $f \in \mathbb{Z}[X]$ of degree 2 has a root in $\mathbb{F}_p$ for any rational prime $p$, then it has a rational root. However, show that there exist polynomials of degree 5 and 6 that have a root in $\mathbb{F}_p$ for every prime $p$ but no rational root.$^6$

6 The Chebotarev density theorem implies that such a polynomial must be reducible. In fact it even characterises

polynomials which have a root in Fp for every rational prime p based on the Galois groups of their splitting field (see
Chapter 6). In particular, it shows that 5 and 6 are minimal.

Solution

For odd p, a quadratic polynomial f ∈ Z[X] has a root in Fp if and only if its discriminant ∆
is a square in Fp . Hence, ∆ is a square modulo sufficiently large primes, so it is a square by
Exercise 4.6.20† , i.e. f has rational roots.

For $n = 6$, the following polynomial works: $(X^2 + 1)(X^2 + 2)(X^2 - 2)$. Indeed, for any odd prime $p$, if both 2 and $-1$ are quadratic non-residues, then $-2$ is a quadratic residue (and 1 is a quadratic residue modulo 2). For $n = 5$, the following works: $(X^2 + X + 1)(X^3 - 2)$. Indeed, if $p \equiv 1 \pmod 3$ then $\Phi_3$ has a root modulo $p$, and otherwise 2 is a cube modulo $p$ by Exercise 4.6.19†.

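Exercise 4.6.23† (Jacobi Reciprocity). Define the Jacobi symbol $\left(\frac{\cdot}{n}\right)$ of an odd positive integer $n$ as the product
\[ \left(\frac{\cdot}{p_1}\right)^{n_1} \cdot \ldots \cdot \left(\frac{\cdot}{p_k}\right)^{n_k} \]
where $n = p_1^{n_1} \cdot \ldots \cdot p_k^{n_k}$ is the prime factorisation of $n$. Prove the following statements: for any odd positive integers $m, n$,
• $\left(\frac{m}{n}\right) \left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}}$ (for coprime $m$ and $n$).
• $\left(\frac{-1}{m}\right) = (-1)^{\frac{m-1}{2}}$.
• $\left(\frac{2}{m}\right) = (-1)^{\frac{m^2-1}{8}}$.
(The Jacobi symbol $\left(\frac{m}{n}\right)$ is 1 if $m$ is a quadratic residue modulo $n$ but may also be 1 if $m$ isn't.)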

Solution
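Let $m = \prod_i p_i$ and $n = \prod_j q_j$ (the primes are not necessarily distinct). Then,
\[ \left(\frac{m}{n}\right) \left(\frac{n}{m}\right) = \prod_{i,j} \left(\frac{p_i}{q_j}\right) \left(\frac{q_j}{p_i}\right) = \prod_{i,j} (-1)^{\frac{p_i-1}{2} \cdot \frac{q_j-1}{2}} = (-1)^{\sum_{i,j} \frac{p_i-1}{2} \cdot \frac{q_j-1}{2}}. \]

Thus, we want to show that $\frac{a-1}{2} + \frac{b-1}{2} \equiv \frac{ab-1}{2} \pmod 2$ for any odd $a$ and $b$, since this implies that
\[ \sum_{i,j} \frac{p_i-1}{2} \cdot \frac{q_j-1}{2} = \left(\sum_i \frac{p_i-1}{2}\right) \left(\sum_j \frac{q_j-1}{2}\right) \equiv \frac{\prod_i p_i - 1}{2} \cdot \frac{\prod_j q_j - 1}{2} = \frac{m-1}{2} \cdot \frac{n-1}{2} \pmod 2 \]

as wanted. This is equivalent to $a - 1 + b - 1 \equiv ab - 1 \pmod 4$, i.e. $4 \mid (a - 1)(b - 1)$ which is clearly true. Similarly, we have
\[ \left(\frac{-1}{m}\right) = \prod_i \left(\frac{-1}{p_i}\right) = \prod_i (-1)^{\frac{p_i-1}{2}} = (-1)^{\sum_i \frac{p_i-1}{2}} \]

which is $(-1)^{\frac{m-1}{2}}$ by the previous computation. Finally,
\[ \left(\frac{2}{m}\right) = \prod_i \left(\frac{2}{p_i}\right) = \prod_i (-1)^{\frac{p_i^2-1}{8}} = (-1)^{\sum_i \frac{p_i^2-1}{8}} \]

so we want to show that $\frac{a^2-1}{8} + \frac{b^2-1}{8} \equiv \frac{(ab)^2-1}{8} \pmod 2$, i.e. $16 \mid (a^2 - 1)(b^2 - 1)$ which is true.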
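The three rules above are exactly what makes the Jacobi symbol computable without factoring $n$. The sketch below implements the standard reduction loop built on them and compares it with the product definition; the function names are illustrative, and the definition-based version relies on trial division.

```python
def jacobi(m, n):
    """Jacobi symbol (m/n) for odd positive n, using the three reciprocity rules."""
    assert n > 0 and n % 2 == 1
    m %= n
    result = 1
    while m != 0:
        while m % 2 == 0:  # (2/n) = (-1)^((n^2 - 1)/8), i.e. -1 iff n = 3, 5 mod 8
            m //= 2
            if n % 8 in (3, 5):
                result = -result
        m, n = n, m  # reciprocity: the sign flips iff both are 3 mod 4
        if m % 4 == 3 and n % 4 == 3:
            result = -result
        m %= n
    return result if n == 1 else 0


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def jacobi_by_definition(m, n):
    """Product of Legendre symbols over the prime factorisation of odd n."""
    result, p = 1, 3
    while n > 1:
        while n % p == 0:
            result *= legendre(m, p)
            n //= p
        p += 2
    return result


jacobi_agrees = all(jacobi(m, n) == jacobi_by_definition(m, n)
                    for n in range(1, 200, 2) for m in range(200))
# (2/15) = 1 although 2 is not a square modulo 15.
two_is_square_mod_15 = any(x * x % 15 == 2 for x in range(15))
```

The last two lines illustrate the parenthetical remark in the exercise: a Jacobi symbol equal to 1 does not guarantee a quadratic residue.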

Exercise 4.6.24†. Suppose $a_1, \ldots, a_n$ are distinct squarefree rational integers such that
\[ \sum_{i=1}^{n} b_i \sqrt{a_i} = 0 \]

for some rational numbers $b_1, \ldots, b_n$. Prove that $b_1 = \ldots = b_n = 0$.

Solution
Let $p_1, \ldots, p_k$ be the prime factors of the $a_i$. We proceed by induction on $k$. Write $\sum_{i=1}^{n} b_i \sqrt{a_i}$ as
\[ A + B\sqrt{p_k} := A(\sqrt{p_1}, \ldots, \sqrt{p_{k-1}}) + B(\sqrt{p_1}, \ldots, \sqrt{p_{k-1}}) \sqrt{p_k}, \]

where $A$ and $B$ are linear combinations of square roots of integers with prime factors among $p_1, \ldots, p_{k-1}$. The key point is that quadratic reciprocity gives us infinitely many primes $p$ such that all $p_i$ for $i < k$ are quadratic residues, while $p_k$ isn't. This implies that $p \mid B$, and for sufficiently large $p$ we get $B = 0$. We can then remove it and repeat the argument. Note that $A + B\sqrt{p_k}$ does not directly make sense modulo $p$, but if we consider square roots $\alpha_i$ of $p_i$ we get that (by symmetric polynomials), for some choice of $\pm 1$,

\[ A(\pm\alpha_1, \ldots, \pm\alpha_{k-1}) + B(\pm\alpha_1, \ldots, \pm\alpha_{k-1}) \alpha_k \in \mathbb{F}_p \]

(for sufficiently large $p$ so there's no problem with the denominators of the coefficients). For the $p$ mentioned earlier, we get that $B(\pm\alpha_1, \ldots, \pm\alpha_{k-1}) = 0$, otherwise this sum is in $\mathbb{F}_{p^2}$ and not $\mathbb{F}_p$. This implies that $B = 0$ (we see that infinitely many primes divide its norm, i.e. the product of its conjugates).

Now, we prove this key claim. The proof is the same as the solution of Exercise 4.6.20†. Pick a quadratic non-residue $r$ modulo $p_k$ and a large prime $p \equiv r \pmod{p_k}$, $p \equiv 1 \pmod{8 p_1 \cdot \ldots \cdot p_{k-1}}$ (if $p_k \ne 2$; otherwise this simply corresponds to the irrationality of $\sqrt{2}$, or we can pick a prime $p \equiv 5 \pmod 8$ instead of $1 \pmod 8$). As in Exercise 4.6.20†, we can substitute our use of the quadratic reciprocity law by the Jacobi reciprocity law, although we need to adapt it slightly because we used the convenient formalism of field theory (when $p$ isn't prime, $\mathbb{Z}/p\mathbb{Z}$ is not a field, however the fundamental theorem of symmetric polynomials works in any ring). ∎

Exercise 4.6.25†. Let $n \ge 2$ be an integer and $p$ a prime factor of $2^{2^n} + 1$. Prove that $p \equiv 1 \pmod{2^{n+2}}$.

Solution

Note that, since $p \mid \Phi_{2^{n+1}}(2)$ and $n \ge 2$, $p \equiv 1 \pmod 8$ which implies that 2 is a quadratic residue modulo $p$. Thus, we have $2^{2^n} \equiv -1 \pmod p$ but $2^{\frac{p-1}{2}} \equiv 1 \pmod p$ which implies that $v_2\left(\frac{p-1}{2}\right) > n$, i.e. $p \equiv 1 \pmod{2^{n+2}}$. ∎
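The congruence can be observed on the first few Fermat numbers, which are small enough to factor by trial division. A sketch with an illustrative `prime_factors` helper; the range stops at $n = 5$ because trial division on $2^{64} + 1$ would already be slow.

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


fermat_factors = {n: prime_factors(2**(2**n) + 1) for n in range(2, 6)}
congruence_ok = all((p - 1) % 2**(n + 2) == 0
                    for n, factors in fermat_factors.items() for p in factors)
```

For $n = 5$ this recovers Euler's factor $641 = 5 \cdot 2^7 + 1$.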

Exercise 4.6.26† (USA TST 2014). Find all functions f : Z → Z such that (m − n)(f (m) − f (n)) is
a perfect square for all m, n ∈ Z.

Solution

We shall prove that, for any prime $p$, if $f(a) \equiv f(b) \pmod p$ for some $a \not\equiv b \pmod p$, then $f$ is constant modulo $p$. Since $(f(a) - f(b))(a - b)$ is a square, we also get that $f(a) \equiv f(b) \pmod{p^2}$. Then, since $(f(n) - f(b))(n - b)$ and $(f(n) - f(a))(n - a)$ are squares, we get that, in fact, $f$ is constant modulo $p^2$, by looking at the $v_p$. Thus, we can divide $f$ by $p^2$ and give rise to another solution with a smaller value of $f(1)$. Hence, assuming we have shown this, we can assume that $f(a) \equiv f(b) \pmod p \implies a \equiv b \pmod p$. Now, suppose we are working under this assumption. Then, $f(n + 1) - f(n)$ has no prime factor so must be $\pm 1$. Moreover, since $f$ is injective, it must be always 1 or always $-1$ but since $f(n + 1) - f(n)$ is a square, it must be always 1. Thus, $f(n) = n + c$, and if we remove the assumption that $f(1)$ was minimal, we get that all solutions have the form $f(n) = a^2(n + c)$ (and clearly these work).

Hence, suppose that $f(a) \equiv f(b) \pmod p$ for some $a \not\equiv b \pmod p$. We need to show that $f$ is constant modulo $p$. Without loss of generality, suppose that $f(a) \equiv 0$ by translating $f$, and that $b = 0$ by translating $f$ inside (replace $f$ by $x \mapsto f(x + b)$). Let $S$ be the set of integers $s$ such that $f(s) \equiv 0 \pmod p$. Note that, if $f(x) \not\equiv 0$ for some $x \not\equiv 0, a$, then $xf(x)$ and $f(x)(x - a)$ are quadratic residues modulo $p$, and thus
\[ \frac{x - a}{x} \]
too. Hence, if we choose $x$ such that $\frac{x - a}{x} = t \iff x = \frac{a}{1 - t}$ where $t$ is a quadratic non-residue, then $f(x) \equiv 0$, since $x \equiv 0$ or $a$ is impossible in that case. Now, note the only condition on $a \in S$ is $a \not\equiv 0$, so we can replace it by $\frac{a}{1 - t}$ where $t$ is a quadratic non-residue. This gives that $\frac{a/(1 - t)}{1 - t} = \frac{a}{(1 - t)^2}$ also satisfies these conditions. Iterating this process, we get that $\frac{a}{(1 - t)^k}$ is in $S$ for any integer $k$. In particular, for $k = p - 2$, we have $a(1 - t) \in S$. Hence, $\frac{a(1 - t)}{1 - \frac{1}{t}} = -at \in S$ too, since $\frac{1}{t}$ is also a quadratic non-residue. Thus, $att' = -(-at)t' \in S$ for any quadratic non-residues $t, t'$, i.e. $ar \in S$ for any quadratic residue $r$.

It remains to get $ar \in S$ for quadratic non-residues $r$. For this, note that if we have a $b \in S$ such that $\left(\frac{b}{p}\right) = -\left(\frac{a}{p}\right)$ then we can get $br$ for quadratic residues $r$ and this corresponds to $ar$ for quadratic non-residues $r$. If there was no such $b$, since $\frac{a}{1 - t} \in S$, we would have $\left(\frac{1 - t}{p}\right) = 1$ for any quadratic non-residue $t$, i.e. the set of quadratic residues would be 1 minus the set of quadratic non-residues. This is impossible since 1 is never reached by $1 - t$. Thus, we have $f(x) \equiv 0$ for any $x \not\equiv 0, a$.

Finally, if $p \ge 3$, by replacing $a$ by a $b \not\equiv 0, a$, we also get $f(x) \equiv 0$ for any $x \not\equiv 0, b$ and thus for all $x \equiv a$. Similarly, by replacing 0 by $b \not\equiv 0$, we get $f(x) \equiv 0$ for all $x \equiv 0$. If $p = 2$, by translating $f$ (on the inside) if necessary, it suffices to show that $f(n)$ is even when $n$ is (to also show that $f(n)$ is odd when $n$ is). Since $nf(n)$ is a square, we have $f(n) \equiv 0$ for $n \equiv 2 \pmod 4$. Then, since $(n - 2)(f(n) - f(2))$ is a square, we get $f(n) \equiv 0 \pmod 4$ for $n \equiv 0 \pmod 4$. ∎

Sums and Products


Exercise 4.6.27† (Tuymaada 2012). Let $p$ be an odd prime. Prove that
\[ \frac{1}{0^2 + 1} + \frac{1}{1^2 + 1} + \ldots + \frac{1}{(p - 1)^2 + 1} \equiv \frac{(-1)^{\frac{p+1}{2}}}{2} \pmod p \]

where the sum is taken over the $k$ for which $k^2 + 1 \not\equiv 0$.

Solution

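Note that we have the following partial fractions decomposition in $\mathbb{F}_{p^2}$:

\[ \frac{1}{k^2 + 1} = \frac{1}{(k - i)(k + i)} = \frac{i}{2} \left(\frac{1}{k + i} - \frac{1}{k - i}\right) \]

where $i \in \mathbb{F}_{p^2}$ satisfies $i^2 = -1$. First, we treat the case $p \equiv 1 \pmod 4$, i.e. $i \in \mathbb{F}_p$. Then, the sum is telescopic: $\frac{1}{k + i}$ cancels out with $\frac{1}{(k + 2i) - i}$, except when $k + 2i$ is one of the excluded values $\pm i$. Thus, the only terms that don't cancel are $\frac{1}{-2i}$ and $-\frac{1}{2i}$, i.e. the sum is

\[ \frac{i}{2} \cdot \left(-\frac{1}{i}\right) = \frac{-1}{2} \]
as wanted.

Now, suppose $p \equiv -1 \pmod 4$. By Exercise A.3.26† and Fermat's little theorem, we have
\[ \sum_{k \in \mathbb{F}_p} \frac{1}{X - k} = \frac{(X^p - X)'}{X^p - X} = -\frac{1}{X^p - X}. \]

Evaluating this at $i$, we get

\[ \sum_{k \in \mathbb{F}_p} \frac{1}{i - k} = \frac{1}{2i} \]

since $i^p$ is the conjugate of $i$, i.e. $-i$. We finally conclude that

\begin{align*}
\sum_{k \in \mathbb{F}_p} \frac{1}{k^2 + 1} &= \frac{i}{2} \left(\sum_{k \in \mathbb{F}_p} \frac{1}{k + i} - \sum_{k \in \mathbb{F}_p} \frac{1}{k - i}\right) \\
&= \frac{i}{2} \left(\frac{1}{2i} - \frac{1}{-2i}\right) \\
&= \frac{1}{2}.
\end{align*}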


Remark 4.6.4
This argument can be adapted to compute
\[ \sum_{k \in \mathbb{F}_p} \frac{1}{f(k)}, \]

where $f \in \mathbb{F}_p[X]$ is a monic polynomial which is irreducible over $\mathbb{F}_p$. Indeed, it must have distinct roots since it is irreducible: $\gcd(f, f')$ must be 1 or $f$, and the only way it could have a common root with its derivative is if $f' = 0$, i.e. $f = \sum_i a_i X^{pi} = \left(\sum_i a_i X^i\right)^p$ which is not irreducible. Thus,
\[ \frac{1}{f} = \sum_{f(\alpha) = 0} \frac{1}{f'(\alpha)(X - \alpha)}. \]

This is called the partial fractions decomposition of $\frac{1}{f}$ (see also Remark C.4.1). Indeed, it is equivalent to
\[ \sum_{f(\alpha) = 0} \frac{f}{f'(\alpha)(X - \alpha)} = 1 \]

which is true, since by evaluating at $\alpha$ we get 1 as desired by Exercise 3.2.2∗, so this is a polynomial of degree less than $\deg f$ taking $\deg f$ times the value 1. (If this seems unmotivated, notice that there must exist such a partial fractions decomposition since the polynomials $\frac{f}{X - \alpha}$ are linearly independent, as can be seen by evaluating a linear combination at $\alpha$, and then we get the precise coefficients by also evaluating at $\alpha$.)

This implies that

\begin{align*}
\sum_{k \in \mathbb{F}_p} \frac{1}{f(k)} &= \sum_{f(\alpha) = 0} \frac{1}{f'(\alpha)} \sum_{k \in \mathbb{F}_p} \frac{1}{k - \alpha} \\
&= \sum_{f(\alpha) = 0} \frac{1}{f'(\alpha)(\alpha^p - \alpha)}.
\end{align*}

Exercise 4.6.30†. Let $n \ge 1$ be an integer. Prove that, for any rational prime $p$,
\[ \prod_{k=1}^{p-1} \Phi_n(k) \equiv \Phi_{n/\gcd(n,p-1)}(1)^{\frac{\varphi(n)}{\varphi(n/\gcd(n,p-1))}} \pmod p. \]

Solution

First, notice that when we replace $n$ by $np$ both sides of the equality are raised to the $p$th or $(p - 1)$th power (depending on whether $p \mid n$ or not). Thus, we may assume without loss of generality that $p \nmid n$. We have $\Phi_n = \prod_{\omega} (X - \omega)$ where the product is over the elements of order $n$ of $\overline{\mathbb{F}_p}$. Thus,
\begin{align*}
\prod_{k \in \mathbb{F}_p^\times} \Phi_n(k) &= \prod_{k \in \mathbb{F}_p^\times} \prod_{\omega} (k - \omega) \\
&= \prod_{\omega} \prod_{k \in \mathbb{F}_p^\times} (\omega - k) \\
&= \prod_{\omega} (\omega^{p-1} - 1)
\end{align*}

since $X^{p-1} - 1 = \prod_{k \in \mathbb{F}_p^\times} (X - k)$ (we exchanged $k - \omega$ with $\omega - k$ since this multiplies everything by $(-1)^{p-1} = 1$ when $p$ is odd and $-1 = 1$ when $p = 2$). To conclude, we claim that each primitive $n/\gcd(n, p - 1)$th root is represented exactly $\frac{\varphi(n)}{\varphi(n/\gcd(n,p-1))}$ times by $\omega^{p-1}$, thus yielding the wanted result (when $n > 2$ we can replace $\omega^{p-1} - 1$ by $1 - \omega^{p-1}$ since this multiplies the product by $(-1)^{\varphi(n)} = 1$, and when $n \le 2$ the product is zero).

Let $m = n/\gcd(n, p - 1)$. Note that, if we fix an element of order $n$ and write the others as powers of it, this becomes equivalent to the set of elements coprime with $n$ modulo $n$ restricting to $\frac{\varphi(n)}{\varphi(m)}$ copies of the set of elements coprime with $m$ modulo $m$. Note that the cardinalities agree:

\[ \varphi(n) = \frac{\varphi(n)}{\varphi(m)} \cdot \varphi(m). \]

Thus, we only need to check that each element of $(\mathbb{Z}/m\mathbb{Z})^\times$ (the subset of $\mathbb{Z}/m\mathbb{Z}$ with invertible elements) is reached as many times by evaluating elements of $(\mathbb{Z}/n\mathbb{Z})^\times$ modulo $m$. This is easy: if $a_1 \equiv \ldots \equiv a_k \equiv a \pmod m$, then $ca_1 \equiv \ldots \equiv ca_k \equiv b \pmod m$ for any $b \in (\mathbb{Z}/m\mathbb{Z})^\times$, where $c$ is any element of $(\mathbb{Z}/n\mathbb{Z})^\times$ congruent to $ba^{-1}$ modulo $m$. ∎

Miscellaneous
Exercise 4.6.32† (Lucas's Theorem). Let $p$ be a prime number and

\[ n = p^m n_m + \ldots + p n_1 + n_0 \]

and
\[ k = p^m k_m + \ldots + p k_1 + k_0 \]
be the base $p$ expansion of rational integers $k, n \ge 0$ ($n_i$ and $k_i$ can be zero). Prove that
\[ \binom{n}{k} \equiv \prod_{i=0}^{m} \binom{n_i}{k_i} \pmod p. \]

Solution

We have
\[ (X + 1)^n = \prod_{i=0}^{m} (X + 1)^{n_i p^i} \equiv \prod_{i=0}^{m} (X^{p^i} + 1)^{n_i} \]
and considering the coefficient of $X^k$ yields the desired result. ∎
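Lucas's theorem translates directly into a digit-by-digit algorithm for binomial coefficients modulo $p$. A minimal sketch (the function name `lucas_binom` is illustrative), compared against Python's exact `math.comb`:

```python
from math import comb


def lucas_binom(n, k, p):
    """Binomial coefficient C(n, k) modulo a prime p, digit by digit in base p."""
    result = 1
    while n or k:
        # comb returns 0 when the lower digit exceeds the upper one.
        result = result * comb(n % p, k % p) % p
        n //= p
        k //= p
    return result


lucas_ok = all(lucas_binom(n, k, p) == comb(n, k) % p
               for p in (2, 3, 5, 7)
               for n in range(100)
               for k in range(n + 1))
```

One design note: only binomials of single base-$p$ digits are ever computed, so the intermediate numbers stay below $p$ even when $n$ and $k$ are very large.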

Exercise 4.6.33† (Carmichael's Theorem). Let $a, b$ be two coprime integers such that $a^2 - 4b > 0$, and let $(u_n)_{n \ge 0}$ denote the linear recurrence defined by $u_0 = 0$, $u_1 = 1$, and
\[ u_{n+2} = a u_{n+1} - b u_n. \]
Prove that for $n \ne 1, 2, 6$, $u_n$ always has a primitive prime factor, except when $n = 12$ and $a = \pm 1$, $b = -1$ (corresponding to the Fibonacci sequence).

Solution
Notice that $u_n = \frac{\alpha^n - \beta^n}{\alpha - \beta}$ where $\alpha$ and $\beta$ are the roots of $X^2 - aX + b$, which are real by assumption.
Thus, Carmichael's theorem is an analogue of Zsigmondy's theorem for real conjugate quadratic
integers. We proceed as in the case of $\mathbb{Z}$. Note that $\Phi_n(\alpha, \beta)$ is a rational integer since it is its
own conjugate. Suppose that $p$ is a non-primitive prime factor of $\Phi_n(\alpha, \beta)$. Since $p$ is not
primitive, $p$ also divides $\Phi_m(\alpha, \beta)$ for some $m < n$. Consider temporarily $\alpha$ and $\beta$ as elements of
$\mathbb{F}_{p^2}$. If one of them is zero, i.e. $p \mid b$, the other must be too since

\[ \Phi_m(\alpha, \beta) = \alpha^{\varphi(m)} + \beta^{\varphi(m)} + \alpha\beta(\cdots). \]

This is impossible since $p \nmid a \equiv \alpha + \beta$ as $a$ and $b$ are coprime. Hence, $\alpha/\beta$ is well-defined and has
both order $m/p^{v_p(m)}$ and $n/p^{v_p(n)}$, which implies that $p \mid n$. Since $n/p^{v_p(n)} \mid p^2 - 1$, we conclude
that $p$ is either the greatest prime factor of $n$ or the second greatest, since if $p < q, r \mid n$, we
get $n/p^{v_p(n)} \ge qr > p^2$. This means that there are at most two non-primitive prime factors not
dividing $b$.

The second step of the proof is to bound the $p$-adic valuations of $\Phi_n(\alpha, \beta)$. When $p$ is odd, the
same proof as the usual LTE works since $p \mid \Phi_{n/p}(\alpha, \beta) \mid \alpha^{n/p} - \beta^{n/p}$ so we prove the same way
that
\[ \frac{\alpha^n - \beta^n}{\alpha^{n/p} - \beta^{n/p}} \equiv p\alpha^n \pmod{p^2}. \]
When $p = 2$, things get trickier. Since $n/p^{v_p(n)} \mid p^2 - 1 = 3$, we need to consider the cases
$n = 2^k$ and $n = 3 \cdot 2^k$. In fact, we will only consider the cases $n = 4$ and $n = 12$ since
$\Phi_{2m \cdot 2^k}(\alpha, \beta) = \Phi_{2m}(\alpha^{2^k}, \beta^{2^k})$ and the coefficients of $(X - \alpha^{2^k})(X - \beta^{2^k})$ are also coprime.
Indeed, the coefficients of $(X - \alpha^2)(X - \beta^2)$ are $a^2 - 2b$ and $b^2$ and we conclude by induction. (This also
follows from ideal factorisation: if $\alpha$ and $\beta$ are coprime, so are $\alpha^{2^k}$ and $\beta^{2^k}$.)

We first do the easy case, $n = 4$. We wish to show that $\alpha^2 + \beta^2$ cannot be a power of 2. If we
write $\alpha = u + v\sqrt{d}$ and $\beta = u - v\sqrt{d}$ with positive $d$, we have

\[ \Phi_4(\alpha, \beta) = \alpha^2 + \beta^2 = 2(u^2 + dv^2). \]

Since $a = 2u$ and $b = u^2 - dv^2$ are coprime, either $u, v \in \mathbb{Z}$ and $u^2 - dv^2$ is odd, so $u^2 + dv^2$ is
too and can't be a power of 2 (it's greater than 1), or $2u, 2v$ are odd integers and $u^2 + dv^2 = 2u^2 - b$
is a half integer so $2(u^2 + dv^2)$ is odd and can't be a power of 2 (it's greater than 1).

Now, we treat the case where $n = 12$. We write $\alpha, \beta = u \pm v\sqrt{d}$ again. We need to determine
when
\[ \Phi_{12}(\alpha, \beta) = \alpha^4 - (\alpha\beta)^2 + \beta^4 = u^4 + 14u^2v^2d + v^4d^2 \]

is a power of 2 or 3 times a power of 2. Since $a = 2u$ and $b = u^2 - dv^2$ are coprime, if $u, v \in \mathbb{Z}$
then $u^2 - dv^2$ is odd, but then so is $u^4 + 14u^2v^2d + v^4d^2$ so it can't be a power of 2. Hence, let
$u = r/2$, $v = s/2$ with odd $r, s$. Then, an absolutely miraculous (computer) computation shows
that $r^4 + 14r^2s^2d + s^4d^2$ can never be divisible by 64 for odd $r, s, d$. Since $d \equiv 1 \pmod 4$, it is
at least $1 + 5 \cdot 14 + 5^2 = 96$ so must be exactly 96 since this is the only integer of the form
$2^k$ or $3 \cdot 2^k$ for some $k \le 5$ which is greater than 70. This yields $|r| = |s| = 1$ and $d = 5$, i.e.
$\{\alpha, \beta\} = \pm\left\{\frac{1+\sqrt{5}}{2}, \frac{1-\sqrt{5}}{2}\right\}$, corresponding to $a = \pm 1$, $b = -1$ as desired.
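
The computer computation mentioned in the case $n = 12$ can be reproduced with a short exhaustive search. Since the value of $r^4 + 14r^2s^2d + s^4d^2$ modulo 64 only depends on $r, s, d$ modulo 64, it is enough to try all odd residues. This is only a sketch of one way to run the check; the function name is ours.

```python
def residues_divisible_by_64():
    """List odd residues (r, s, d) mod 64 with r^4 + 14 r^2 s^2 d + s^4 d^2 = 0 mod 64."""
    odd = range(1, 64, 2)
    return [
        (r, s, d)
        for r in odd
        for s in odd
        for d in odd
        if (r**4 + 14 * r**2 * s**2 * d + s**4 * d**2) % 64 == 0
    ]

# An empty list means the expression is never divisible by 64 for odd r, s, d.
assert residues_divisible_by_64() == []
```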

To conclude, if $\Phi_n(\alpha, \beta)$ doesn't have a primitive prime factor and $n$ is not a power of 2 or 3
times a power of 2 (we have already covered these cases), then $\Phi_n(\alpha, \beta) = p$ for some prime $p \mid n$
or $\Phi_n(\alpha, \beta) = pq$ for some distinct primes $p, q \mid n$. By Proposition 3.4.1, we have

\[ pq \ge \Phi_n(\alpha, \beta) > |\alpha - \beta|^{\varphi(n)} \]

(set $q = 1$ if $\Phi_n(\alpha, \beta)$ is prime). Since $\varphi(p) = p - 1 > p/2$ and $\varphi(pq) = (p-1)(q-1) \ge pq/2$, we
conclude that
\[ pq > |\alpha - \beta|^{pq/2}, \]
i.e. $C^N < N^2$ where $C = |\alpha - \beta|$ and $N = pq$. However, it is easy to see that, when $C > 2.2$,
$C^N > N^2$ for any positive integer $N$. Indeed, this is true for $N \le 4$, and for $N \ge 5$ we have
\[ 2^N = (1+1)^N \ge 2\binom{N}{2} + \binom{N}{1} = N^2. \]

Hence, we must have $|\alpha - \beta| \le 2.2$. Now, notice that $|\alpha - \beta| = \sqrt{a^2 - 4b}$ so that $0 \le a^2 - 4b \le 2.2^2 < 5$.
Since $a^2 - 4b$ is congruent to 0 or 1 modulo 4, it must be equal to 0, 1, or 4. In all
these cases $\alpha$ and $\beta$ are rational integers and we have already proven the Zsigmondy theorem for
rational integers so we are done. 
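
To see the exceptional indices of Carmichael's theorem concretely, here is a small experiment on the Fibonacci sequence ($a = 1$, $b = -1$). It lists the indices $n$ for which $u_n$ has no prime factor that did not already divide an earlier term. The helper names are illustrative, and plain trial division is used, so this is only meant for small indices.

```python
def prime_factors(n):
    """Set of prime factors of |n|, by trial division (fine for small n)."""
    n = abs(n)
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def indices_without_primitive_factor(a, b, limit):
    """Indices 1 <= n <= limit such that u_n has no primitive prime factor."""
    u = [0, 1]
    while len(u) <= limit:
        u.append(a * u[-1] - b * u[-2])
    seen = set()   # primes dividing some earlier term
    bad = []
    for n in range(1, limit + 1):
        primes = prime_factors(u[n])
        if not primes - seen:
            bad.append(n)
        seen |= primes
    return bad

print(indices_without_primitive_factor(1, -1, 30))  # Fibonacci sequence
```

For the Fibonacci sequence the indices found are exactly 1, 2, 6 and 12, in agreement with the statement.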

Exercise 4.6.34†. Suppose $p \equiv 2$ or $p \equiv 5 \pmod 9$ is a rational prime. Prove that the equation
\[ \alpha^3 + \beta^3 + \varepsilon a \gamma^3 = 0 \]
where $\varepsilon \in \mathbb{Z}[j]$ is a unit and $2 \ne a \in \{p, p^2\}$ does not have solutions in $\mathbb{Z}[j]$.

Solution

Note that $p \equiv 2 \pmod 3$ so $p$ is prime in $\mathbb{Z}[j]$. Suppose that there is a solution $\alpha, \beta, \gamma \ne 0$, and
pick one which minimises $|N(\alpha\beta\gamma)|$. In particular, $\alpha, \beta, \gamma$ are coprime. Rewrite the equation as

\[ (\alpha + \beta)(\alpha + j\beta)(\alpha + j^2\beta) = -\varepsilon a \gamma^3. \]

Consider the numbers $x = \alpha + \beta$, $y = j\alpha + j^2\beta$ and $z = j^2\alpha + j\beta$. By assumption, $xyz = -\varepsilon a \gamma^3$.
Since $p^3$ does not divide the RHS, exactly one of $x, y, z$ is divisible by $a$ and the other ones
are not divisible by $p$, by unique factorisation. By replacing $\alpha$ and $\beta$ by $j^k\alpha$ and $j^k\beta$ for some
$k$ if necessary, suppose without loss of generality that it is $z$. Let $d$ be the gcd of $x, y, z$ and
write
\[ x/d = \varepsilon_1 u^3, \qquad y/d = \varepsilon_2 v^3, \qquad z/d = \varepsilon_3 a w^3 \]
by unique factorisation. Since $x + y + z = 0$, we have

\[ \varepsilon_1 u^3 + \varepsilon_2 v^3 + \varepsilon_3 a w^3 = 0, \]

i.e.
\[ u^3 + \mu v^3 + \eta a w^3 = 0 \]

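for some units $\mu, \eta$. Suppose for a moment that we manage to prove $\mu = \pm 1$. Then, we get the
smaller solution
\[ u^3 + (\pm v)^3 + \eta a w^3 = 0 \]
for non-zero $u, v, w$. This implies, by minimality, that

\[ |N(\alpha\beta\gamma)|^3 \le |N(uvw)|^3 = \left|N\!\left(\frac{xyz}{d^3 a}\right)\right| = \left|N\!\left(\frac{\gamma}{d}\right)\right|^3 \]

which implies that $|N(d\alpha\beta)| \le 1$: $\alpha$ and $\beta$ are units. This yields the equation $\pm 1 \pm 1 + \varepsilon a \gamma^3 = 0$,
which is clearly impossible since $a \nmid \pm 1 \pm 1$ as $a > 2$.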

It remains to prove that $\mu = \pm 1$ from the equation $u^3 + \mu v^3 + \eta a w^3 = 0$. What else can we do to
prove this apart from considering the equation modulo $p$? This gives us that $\mu$ is congruent to a
cube modulo $p$. You might be wondering what the link between this problem and the theory of
finite fields is. Here is the answer: since $p \equiv 2 \pmod 3$ is prime, $\mathbb{Z}[j]/p\mathbb{Z}[j] \simeq \mathbb{F}_{p^2}$ is a field with
$p^2$ elements. Since $p \not\equiv -1 \pmod 9$, there is no primitive ninth root of unity as $9 \nmid p^2 - 1$. Hence,
if $\mu$ were a primitive cube root of unity, it could not be a cube modulo $p$ since a cube root of $\mu$
would be a primitive ninth root of unity modulo $p$. This implies that $\mu = \pm 1$ as wanted. 

Exercise 4.6.35† (Class Equation of a Group Action and Wedderburn's Theorem). Let $G$ be a finite
group, $S$ a finite set, and $\cdot$ a group action of $G$ on $S$.⁷ Given an element $s \in S$, let $\operatorname{Stab}(s)$ and $\operatorname{Fix}(G)$
denote the set of elements of $G$ fixing $s$ and the set of elements of $S$ fixed by all of $G$ respectively. Finally,
let $O_i = Gs_i$ be the (disjoint) orbits of size greater than 1. Prove the class equation:
\[ |S| = |\operatorname{Fix}(G)| + \sum_{|O_i| > 1} \frac{|G|}{|\operatorname{Stab}(s_i)|}. \]

Deduce Wedderburn's theorem: any finite skew field is a field.

Solution

For the first part, notice that an orbit $O = Gs$ has size 1 if and only if $s$ is fixed by all of $G$, i.e.
is in $\operatorname{Fix}(G)$, and that
\[ |O| = |Gs| = |G/\operatorname{Stab}(s)| = \frac{|G|}{|\operatorname{Stab}(s)|} \]
for any $s$. Indeed, the map $G/\operatorname{Stab}(s) \to Gs$ sending $h\operatorname{Stab}(s)$ to the only element $hs$ of
$h\operatorname{Stab}(s)s$ is a bijection: if $gs = hs$ then $h^{-1}gs = s$ so $h^{-1}g \in \operatorname{Stab}(s)$, i.e. $g\operatorname{Stab}(s) = h\operatorname{Stab}(s)$.
Thus, the class formula becomes
\[ |S| = \sum_{|O_i| = 1} 1 + \sum_{|O_i| > 1} |O_i| \]

which is obviously true. (See Exercise A.3.14† for the definition of the group quotient $G/H$.)
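
The class equation can be checked by brute force on a small example. The sketch below (all names are ours) takes $G = S = S_4$, with permutations stored as tuples, and the conjugation action $g \cdot s = gsg^{-1}$; it computes both sides of the equation independently.

```python
from itertools import permutations

def compose(g, h):
    """Composition g∘h of permutations given as tuples: i -> g[h[i]]."""
    return tuple(g[h[i]] for i in range(len(g)))

def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def class_equation_sides(group, points, act):
    """Return (|S|, |Fix(G)| + sum of |G|/|Stab(s_i)| over orbits of size > 1)."""
    fixed = [s for s in points if all(act(g, s) == s for g in group)]
    remaining = set(points) - set(fixed)
    total = len(fixed)
    while remaining:
        s = next(iter(remaining))            # representative of a new orbit
        orbit = {act(g, s) for g in group}
        stabiliser = [g for g in group if act(g, s) == s]
        total += len(group) // len(stabiliser)
        remaining -= orbit
    return len(points), total

G = list(permutations(range(4)))

def conjugate(g, s):
    return compose(compose(g, s), inverse(g))

lhs, rhs = class_equation_sides(G, G, conjugate)
assert lhs == rhs
```

Here the fixed points are the centre of $S_4$ and the orbits are the conjugacy classes, which is the same specialisation used in the proof of Wedderburn's theorem.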

We now consider the second part. We consider a finite skew field $F$ as a multiplicative group
once we remove its zero element, and our goal is to prove that it is abelian. Hence, we define its
centre $Z = Z(F^\times)$ as the group of non-zero elements which commute with every other element.
Note that $Z \cup \{0\}$ is a finite field, say of cardinality $q$. Then, $F$ is naturally a vector space over
$Z \cup \{0\}$, hence of cardinality $q^n$ for some $n$.

⁷ In other words, a map $\cdot : G \times S \to S$ such that $e \cdot s = s$ and $(gh) \cdot s = g \cdot (h \cdot s)$ for any $g, h \in G$ and $s \in S$. See also Exercise A.3.20†.

The class equation for the action of $F^\times$ on itself defined by conjugation $g \cdot s := gsg^{-1}$ is
\[ |F^\times| = |Z| + \sum_{i=1}^{r} \frac{|F^\times|}{|C(x_i)|} \]

where $C(x)$ is the centraliser of $x$, i.e. the group of elements which commute with $x$. Indeed,
$\operatorname{Fix}(F^\times)$ is the set of elements $x$ such that $gxg^{-1} = x$ for any $g$, i.e. $x$ commutes with every $g$,
while $\operatorname{Stab}(x)$ is the set of $g$ such that $gxg^{-1} = x$, i.e. elements which commute with $x$. Here
is the key point: for any $x \in F$, $C(x) \cup \{0\}$ is a vector space over $Z \cup \{0\}$ as well since $Z$ commutes
with everything so $ZC(x) = C(x)$. This implies that its cardinality is a power of $q$ too, say
$|C(x_i)| = q^{n_i} - 1$ for some $n_i \mid n$ since $q^{n_i} - 1 \mid |F^\times| = q^n - 1$. Hence, our equation becomes
\[ q^n - 1 = q - 1 + \sum_{i=1}^{r} \frac{q^n - 1}{q^{n_i} - 1}. \]

Since the sum is taken over the orbits of size greater than 1, we have $n_i < n$ so, modulo $\Phi_n(q)$,
this becomes
\[ 0 \equiv q - 1 + 0. \]
In other words, $\Phi_n(q)$ divides $q - 1$. Since $\Phi_n(q) \ge (q - 1)^{\varphi(n)}$ with equality iff $n = 1$, we conclude
that $n = 1$, i.e. $F^\times = Z$ is commutative! 
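
The two arithmetic facts used at the end of the proof, namely that $\Phi_n(q)$ divides $\frac{q^n - 1}{q^d - 1}$ for every proper divisor $d$ of $n$ and that $\Phi_n(q) > q - 1$ when $n > 1$, can be checked numerically. The sketch below computes the integer $\Phi_n(q)$ from the identity $q^n - 1 = \prod_{d \mid n} \Phi_d(q)$; the function name is ours.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def cyclotomic_value(n, q):
    """Integer value of the n-th cyclotomic polynomial at q, via q^n - 1 = prod_{d | n} Phi_d(q)."""
    value = q**n - 1
    for d in range(1, n):
        if n % d == 0:
            value //= cyclotomic_value(d, q)   # exact division
    return value

for q in range(2, 6):
    for n in range(2, 13):
        phi = cyclotomic_value(n, q)
        assert phi > q - 1
        for d in range(1, n):
            if n % d == 0:
                assert ((q**n - 1) // (q**d - 1)) % phi == 0
```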
Chapter 5

Polynomial Number Theory

5.1 Factorisation of Polynomials


Exercise 5.1.1∗. Prove that the content is well-defined: $c(Nf)/|N| = c(Mf)/|M|$ for any non-zero
$M, N \in \mathbb{Z}$ such that $Nf, Mf \in \mathbb{Z}[X]$.

Solution

Assume without loss of generality that $M$ and $N$ are positive. Note that $rf$ has integer coefficients if and
only if $r$ is divisible by the lcm of the denominators of the coefficients of $f$. Thus it suffices to
prove the result for $f \in \mathbb{Z}[X]$. Indeed, if we write $N = mN'$ and $M = mM'$ where $m$ is the
lcm of the denominators of the coefficients, we have

\[ M c(Nf) = N c(Mf) \iff M' c(N'g) = N' c(M'g) \]

where $g = mf$ has integer coefficients. This follows from the fact that $c(rg) = rc(g)$ for any
positive $r \in \mathbb{Z}$ (when you multiply all coefficients by $r$, the gcd also gets multiplied by $r$). 
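
The independence of the content from the choice of the clearing multiplier can be illustrated with exact rational arithmetic. In this sketch the function `content` and the sample polynomial are ours; coefficients are listed from the constant term upwards.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def content(coeffs, N):
    """Compute c(N f) / |N| for a non-zero integer N such that N f has integer coefficients."""
    scaled = [Fraction(c) * N for c in coeffs]
    assert all(c.denominator == 1 for c in scaled), "N f must lie in Z[X]"
    g = reduce(gcd, (abs(c.numerator) for c in scaled))
    return Fraction(g, abs(N))

# f = 3/2 + (9/4) X + 6 X^2
f = [Fraction(3, 2), Fraction(9, 4), 6]
assert content(f, 4) == content(f, 12) == content(f, -8) == Fraction(3, 4)
```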

Exercise 5.1.2∗ . Suppose f ∈ Q[X] has integral content. Prove that f has integer coefficients.

Solution

Write $f = g/N$ with $g \in \mathbb{Z}[X]$ and $0 \ne N \in \mathbb{Z}$. The content of $f$ is $c(g)/|N|$. This is an integer
iff $N$ divides $c(g)$, i.e. $N$ divides all coefficients of $g$, which is equivalent to $f = g/N$ having
integer coefficients. 

Exercise 5.1.3∗ . Prove Proposition 5.1.2.

Solution

Write g = f ∗ h. Then c(h) = c(g) ∈ Z. Thus h ∈ Z[X]. 

Exercise 5.1.4∗ . Prove Corollary 5.1.2.


Solution

Consider the factorisation into irreducible polynomials of $f$ in $\mathbb{Q}[X]$: $f = af_1 \cdots f_k$. The
factorisation of $f$ in $\mathbb{Z}[X]$ is then given as follows: replace each $f_i$ by its primitive part $f_i^* =
f_i/c(f_i) \in \mathbb{Z}[X]$. The multiplicative constant is then $c(f) \in \mathbb{Z}$ by Gauss's lemma. The uniqueness
of the factorisation in $\mathbb{Q}[X]$ shows that this factorisation in $\mathbb{Z}[X]$ is unique too (each primitive
irreducible factor must be a constant times an irreducible factor occurring in the factorisation in
$\mathbb{Q}[X]$). 

Exercise 5.1.5. Prove that $\Phi_{p^n}$ is irreducible with Eisenstein's criterion.

Solution

We have
\[ \Phi_{p^n} = \frac{X^{p^n} - 1}{X^{p^{n-1}} - 1} \equiv (X - 1)^{p^n - p^{n-1}} \pmod p \]
by Proposition 3.1.1 and Frobenius. Thus, $\Phi_{p^n}(X + 1)$ has all its coefficients divisible by $p$ except
the leading one. Moreover, its constant coefficient is $\Phi_{p^n}(1) = p$ by expanding the division, which is not divisible by $p^2$.
Hence, by Eisenstein's criterion, $\Phi_{p^n}(X + 1)$ is irreducible and thus $\Phi_{p^n}$ too. 
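
The Eisenstein conditions for $\Phi_{p^n}(X+1)$ can be verified directly for small cases. The sketch below uses the expansion $\Phi_{p^n}(X) = \sum_{i=0}^{p-1} X^{i p^{n-1}}$, so that the coefficient of $X^j$ in $\Phi_{p^n}(X+1)$ is $\sum_i \binom{i p^{n-1}}{j}$. The function names are illustrative.

```python
from math import comb

def shifted_cyclotomic_prime_power(p, n):
    """Coefficients (constant term first) of Phi_{p^n}(X + 1)."""
    step = p ** (n - 1)
    degree = (p - 1) * step
    return [sum(comb(i * step, j) for i in range(p)) for j in range(degree + 1)]

def satisfies_eisenstein(coeffs, p):
    """Check Eisenstein's criterion at p for coefficients listed constant term first."""
    *lower, leading = coeffs
    return (
        leading % p != 0
        and all(c % p == 0 for c in lower)
        and lower[0] % (p * p) != 0
    )

for p, n in [(2, 1), (2, 3), (3, 1), (3, 2), (5, 1), (5, 2), (7, 1)]:
    assert satisfies_eisenstein(shifted_cyclotomic_prime_power(p, n), p)
```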

5.2 Prime Divisors of Polynomials


Exercise 5.2.1∗. Why does CRT imply that there is a value reached $2^m$ times modulo $p_1 \cdots p_m$?

Solution

For each $p_i$, there is a value which is reached twice: $N_i = f(a_i) \equiv f(b_i) \pmod{p_i}$. Thus, the value
congruent to $N_i$ modulo $p_i$ for $i = 1, \ldots, m$ is reached by $x$ as long as $x \equiv a_i$ or $x \equiv b_i \pmod{p_i}$ for
each $i$. There are two choices for each $p_i$, so $2^m$ possible systems of congruences, and by CRT
all of these systems have a solution modulo $p_1 \cdots p_m$. 
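
As a small illustration of this counting argument, take $f = X^2$ and the primes 3, 5, 7: the value 1 is reached twice modulo each prime (by $\pm 1$), so it should be reached $2^3 = 8$ times modulo 105. The choice of $f$ and of the primes is ours.

```python
def count_preimages(f, value, modulus):
    """Number of residues x modulo `modulus` with f(x) congruent to `value`."""
    return sum(1 for x in range(modulus) if (f(x) - value) % modulus == 0)

def square(x):
    return x * x

assert count_preimages(square, 1, 3) == 2
assert count_preimages(square, 1, 3 * 5 * 7) == 2**3
```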

Exercise 5.2.2. Prove that X − v works iff 0 ≤ v ≥ −2022, and −X + v works iff 0 ≤ v ≤ 2022.

Solution

Without loss of generality, suppose $f = X - v$ by multiplying $f$ by $-1$ if necessary. Note that,
for $0 < a, b \le n$, we have
\[ \big||f(a)| - |f(b)|\big| \le |f(a) - f(b)| < n \]
so the only pairs which work are those for which $|f(a)| = |f(b)|$. Since $a < b$, this means that
$a < v$ and $b > v$ (they must lie in different affine parts). Thus we have $v - a = b - v$, i.e.
$b = 2v - a$. This is indeed greater than $2a - a = a$ and less than $n$ for sufficiently large $n$. Thus,
for sufficiently large $n$, there are exactly $v - 1$ solutions and for smaller $n$ there may be fewer. This
means that our solutions are those for which $v - 1 \le 2021$, i.e. $v \le 2022$ as wanted. 

5.3 Hensel’s Lemma


Exercise 5.3.1∗. Let $p$ be an odd prime and $a$ a quadratic residue modulo $p$. Prove that $a$ is a
quadratic residue modulo $p^n$, i.e. a square modulo $p^n$ (coprime with $p^n$), for any positive integer $n$.

Solution

We apply Hensel's lemma on the polynomial $X^2 - a$. It has a root $\alpha$ modulo $p$ by assumption, and the
derivative $2\alpha$ is not divisible by $p$ since $p$ is odd and $p \nmid a$. 
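
The lifting guaranteed by Hensel's lemma can be carried out explicitly. The sketch below uses a Newton-type step $x \mapsto x - \frac{x^2 - a}{2x}$, computed modulo increasing powers of $p$; this is one standard way to realise the lift, not necessarily the one used in the text, and the function name is ours. It requires Python 3.8 or later for modular inverses via `pow`.

```python
def lift_square_root(a, p, n):
    """Return x with x^2 = a (mod p^n), for an odd prime p and a quadratic residue a mod p."""
    # Find a square root modulo p by brute force.
    root = next(x for x in range(1, p) if (x * x - a) % p == 0)
    modulus = p
    for _ in range(n - 1):
        modulus *= p
        # 2*root is invertible modulo p since p is odd and p does not divide a.
        inverse = pow(2 * root, -1, modulus)
        root = (root - (root * root - a) * inverse) % modulus
    return root

x = lift_square_root(2, 7, 5)
assert (x * x - 2) % 7**5 == 0
```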

Exercise 5.3.2∗. Prove that an odd rational integer $a \in \mathbb{Z}$ is a quadratic residue modulo $2^n$ for $n \ge 3$
if and only if $a \equiv 1 \pmod 8$.

Solution

Since the only odd square modulo 8 is 1, we must have $a \equiv 1 \pmod 8$. For the converse, we
consider the polynomial
\[ \frac{(2Y + 1)^2 - a}{4} = Y^2 + Y - \frac{a - 1}{4}. \]
It has a root modulo 2 (e.g. 0) since $\frac{a-1}{4}$ is even, and its derivative is 1 which is indeed non-zero
so we can use Hensel's lemma. 
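
This characterisation is easy to confirm by brute force for small exponents. The following sketch (function name ours) lists the odd squares modulo $2^n$ and compares them with the residues congruent to 1 modulo 8.

```python
def odd_squares(n):
    """Set of squares of odd residues modulo 2^n."""
    modulus = 2**n
    return {x * x % modulus for x in range(1, modulus, 2)}

for n in range(3, 10):
    expected = {a for a in range(1, 2**n, 2) if a % 8 == 1}
    assert odd_squares(n) == expected
```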

5.4 Bézout’s Lemma


Exercise 5.4.1∗ . Prove Corollary 5.4.1.

Solution

Just multiply u and v in Proposition 5.4.1 by the lcm N of the denominators of their coefficients
to get something in Z[X]. 

5.5 Exercises
Algebraic Results
Exercise 5.5.1†. Suppose $f, g \in \mathbb{Z}[X]$ are polynomials such that $f(n) \mid g(n)$ for infinitely many
rational integers $n \in \mathbb{Z}$. Prove that $f \mid g$. In addition, generalise the previous statement to $f, g \in
\mathbb{Z}[X_1, \ldots, X_m]$ such that $f(x) \mid g(x)$ for $x \in S_1 \times \ldots \times S_m$, where $S_1, \ldots, S_m \subseteq \mathbb{Z}$ are infinite sets.

Solution

We have seen in Corollary 5.4.2 that this holds when the assumption $f(n) \mid g(n)$ is true for
sufficiently large $n$ (divide $f$ and $g$ by a primitive irreducible factor of $f$ and repeat this process).
To prove this stronger result, we will use a different method, a completely analytical one. Let
$g = qf + r$ be the Euclidean division of $g$ by $f$, and let $N \in \mathbb{Z}$ be a non-zero integer such that
$Nq, Nr \in \mathbb{Z}[X]$. Then,
\[ f(n) \mid Ng(n) - Nq(n)f(n) = Nr(n) \]
for infinitely many integers $n$. However, $\lim_{|n| \to \infty} \frac{Nr(n)}{f(n)} = 0$. Since, along these $n$, it's a sequence of rational
integers, it must be zero for sufficiently large $n$, which implies that $r = 0$, i.e. $f \mid g$.
For the second part, we proceed by induction on $m$ (we have just done the case $m = 1$). If we fix
$x_m$, we get that $f(X_1, \ldots, X_{m-1}, x_m) \mid g(X_1, \ldots, X_{m-1}, x_m)$ for any $x_m \in S_m$. Suppose that some
irreducible factor $\pi$ of $f$ doesn't divide $g$ (if all of them do we can divide $f$ and $g$ by them and repeat
the argument, as we outlined in the first part). Note that this makes sense as $\mathbb{Z}[X_1, \ldots, X_m]$

is a UFD by Proposition 5.1.3. We shall use Bézout's lemma in $\mathbb{Q}(X_1, \ldots, X_{m-1})[X_m]$ to get
two polynomials $u, v \in \mathbb{Z}[X_1, \ldots, X_m]$ such that

\[ 0 \ne u\pi + vg = h \in \mathbb{Z}[X_1, \ldots, X_{m-1}] \]

(by clearing denominators in the Bézout relation). We know that this $h$ is divisible by
$\pi(X_1, \ldots, X_{m-1}, n)$ for infinitely many $n$. Since $h$ has a finite number of divisors in
$\mathbb{Z}[X_1, \ldots, X_{m-1}]$, we get $\pi(X_1, \ldots, X_{m-1}, n) = d$ for a fixed $d$ and infinitely many $n$. Thus,
the polynomial $\pi(X_1, \ldots, X_{m-1}, X) - d$ has infinitely many roots so is identically zero, i.e.
$\pi(X_1, \ldots, X_{m-1}, X_m)$ is constant in $X_m$. In that case, we can proceed by induction on $\deg_{X_m} g$:
we have $\pi \mid g(k)$ for some $k$ so we get
\[ \pi \mid \frac{g(X_1, \ldots, X_{m-1}, n) - g(k)}{n} \]
for any $n \in \mathbb{Z}$, and this has a smaller degree in $n$. 

Exercise 5.5.2†. Let $f \in \mathbb{Q}[X]$ be a polynomial. Suppose that $f$ always takes values which are $m$th
powers in $\mathbb{Q}$. Prove that $f$ is the $m$th power of a polynomial with rational coefficients. More generally,
find all polynomials $f \in \mathbb{Q}[X_1, \ldots, X_m]$ such that $f(x_1, \ldots, x_m)$ is a (non-trivial) perfect power for
any $(x_1, \ldots, x_m) \in \mathbb{Z}^m$.

Solution
Without loss of generality, suppose $f \in \mathbb{Z}[X]$. Let $f = a\prod_{i=1}^{k} \pi_i^{r_i}$ be the factorisation of $f$ in
primitive irreducible polynomials. As stated in ??, for any $i$, we can find an $n$ and a prime $p$ such
that $v_p(f(n)) = r_i$. Thus, this implies $m \mid r_i$ for all $i$, and clearly $a$ must also be an $m$th power
then: $f$ is an $m$th power as wanted. For the general case where $f(n)$ is just always a perfect
power, we can pick distinct primes $p_i$ and an integer $n_i$ such that $v_{p_i}(f(n_i)) = r_i$. Then, if we
pick an integer $n \equiv n_i \pmod{p_i}$, we get $v_{p_i}(f(n)) = r_i$, which means that the $r_i$ must all have a
non-trivial common divisor. In other words, $f$ is a constant times a perfect power, and it suffices
to look at $v_p(f(n))$ for $p$ dividing the constant to see that it is in fact a perfect power.
Now, we deduce the general case from the one variable case. Suppose again that $f \in \mathbb{Z}[X_1, \ldots, X_m]$. Let
$\pi$ be a primitive irreducible factor of $f$. We shall find an arbitrarily large prime $p$ such that
$v_p(f(x)) = v_\pi(f)$ for some $x \in \mathbb{Z}^m$ if $f$ is non-constant. Then, using CRT, we can find primes
$p_\pi$ for each primitive irreducible factor $\pi$ of $f$ and an element $y \in \mathbb{Z}^m$ such that $v_{p_\pi}(f(y)) = v_\pi(f)$
for each $\pi$. Since $f(y)$ is a perfect power by assumption, we get that all $v_\pi(f)$ have a non-trivial
common divisor, which means that $f$ is a constant times a perfect power, and it is again easy
to see that it is in fact a perfect power. Let $\pi$ be a primitive irreducible factor of $f$. Suppose
without loss of generality that it is non-constant in $X_m$, and let $\pi'$ denote its derivative with
respect to $X_m$. Use Bézout's lemma as in Exercise 5.5.1† to get $u, v \in \mathbb{Z}[X_1, \ldots, X_m]$ such
that
\[ 0 \ne u\pi + v\pi' = h \in \mathbb{Z}[X_1, \ldots, X_{m-1}]. \]
Also, for every other primitive irreducible factor $\tau$ of $f$, consider a Bézout relation
\[ 0 \ne u_\tau \pi + v_\tau \tau = h_\tau \in \mathbb{Z}[X_1, \ldots, X_{m-1}]. \]
Now, choose $x \in \mathbb{Z}^{m-1}$ such that $\pi(x, X)$ is non-constant, and such that $h(x)$ and $h_\tau(x)$ are
non-zero for all $\pi \ne \tau \mid f$. This is possible by e.g. Exercise A.1.7∗ used on the product of the
leading coefficient of $f$ as a coefficient in $X_m$ with $h$ and all $h_\tau$. Then, pick a large prime $p$ and
an integer $n$ such that $p \mid \pi(x, n)$; there exists one by Theorem 5.2.1. When $p$ is sufficiently large,
by our Bézout relations, $p \nmid \tau(x, n)$ for $\pi \ne \tau \mid f$. Thus, $v_p(f(x, n)) = v_\pi(f)v_p(\pi(x, n))$. Now, if
$p^2 \mid \pi(x, n)$, by assumption
\[ p^2 \nmid \pi(x, n + p) \equiv \pi(x, n) + p\pi'(x, n). \]

Thus, there is an $n$ such that $v_p(f(x, n)) = v_\pi(f)$ as wanted and we are done. 

Exercise 5.5.3† . Suppose f, g ∈ Z[X] are polynomials such that f (a) − f (b) | g(a) − g(b) for any
rational integers a, b ∈ Z. Prove that there exists a polynomial h ∈ Z[X] such that g = h ◦ f .

Solution

By Exercise 5.5.1†, we know that $f(X) - f(n) \mid g(X) - g(n)$ for all $n$ (in fact we even have
$f(X) - f(Y) \mid g(X) - g(Y)$ but we won't use that). Consider the base $f$ expansion of $g$: $g =
\sum_i h_i f^i$, where $f^i$ means exponentiation and not iteration, and where $h_i \in \mathbb{Q}[X]$ are polynomials
of degree less than $\deg f$. We have
\[ g(X) \equiv \sum_i h_i(X)f(X)^i \equiv \sum_i h_i(X)f(n)^i \pmod{f(X) - f(n)} \]

so
\[ f(X) - f(n) \mid \sum_i (h_i(X) - h_i(n))f(n)^i. \]

However, the RHS has degree strictly less than $\deg f$ so must be identically zero. By taking
$n$ sufficiently large, we see that all $h_i$ must be constant, otherwise the RHS will be non-zero.
Indeed, if $a_i$ is the coefficient of $X^k$ of $h_i$ for some $k \ge 1$, then the coefficient of $X^k$ of the RHS
is $\sum_i a_i f(n)^i$ so the polynomial $\sum_i a_i X^i$ must have infinitely many roots and thus be zero, i.e.
$a_i = 0$ for all $i$. The fact that all $h_i$ are constant is exactly what it means for $g$ to be a polynomial
in $f$. 
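
The base $f$ expansion used in this solution can be computed by repeated Euclidean division, exactly like a base $b$ expansion of an integer. The sketch below (all names ours) represents polynomials as coefficient lists with the constant term first, and assumes $\deg f \ge 1$.

```python
from fractions import Fraction

def trim(poly):
    """Remove trailing zero coefficients in place and return the list."""
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def poly_divmod(num, den):
    """Euclidean division in Q[X]; returns (quotient, remainder)."""
    num = [Fraction(c) for c in num]
    quotient = [Fraction(0)] * max(len(num) - len(den) + 1, 0)
    while len(trim(num)) >= len(den):
        shift = len(num) - len(den)
        factor = num[-1] / den[-1]
        quotient[shift] = factor
        for i, c in enumerate(den):
            num[shift + i] -= factor * c
    return trim(quotient), num

def base_f_digits(g, f):
    """Digits h_0, h_1, ... with g = sum of h_i f^i and deg h_i < deg f (needs deg f >= 1)."""
    digits = []
    g = list(g)
    while g:
        g, h = poly_divmod(g, f)
        digits.append(h)
    return digits

f = [1, 0, 1]                       # X^2 + 1
g = [8, 0, 5, 0, 3, 0, 1]           # (X^2 + 1)^3 + 2 (X^2 + 1) + 5
print(base_f_digits(g, f))          # every digit is a constant (or zero)
print(base_f_digits([0, 1, 0, 1], f))  # X^3 + X = X * f: the digit X is not constant
```

In the first example all digits are constants, reflecting that $g$ is a polynomial in $f$; in the second, a non-constant digit appears, so $g$ is not of the form $h \circ f$.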

Exercise 5.5.4 (RMM SL 2016). Let $p$ be a prime number. Prove that there are only finitely many
primes $q$ such that
\[ q \mid \sum_{k=1}^{\lfloor q/p \rfloor} k^{p-1}. \]

Solution

Write $q = pn + r$ with $r \in [p - 1]$. It suffices to prove that, for each $r$, $pn + r$ divides
\[ \sum_{k=1}^{n} k^{p-1} := f_r(n) \]
finitely many times only. Note that, as we saw in Exercise A.3.8†, $f_r(n)$ is a polynomial in $n$ of
degree $p$ and leading coefficient $1/p$. As a consequence, there is some integer $N \in \mathbb{Z}$ such that
$Nf_r$ has integer coefficients and is non-zero modulo $p$. By Exercise 5.5.1†, if $pn + r$ divides $f_r(n)$
infinitely many times, $pX + r \mid f_r$ in $\mathbb{Q}[X]$. By Gauss's lemma, since $pX + r$ is primitive, $pX + r$
divides $Nf_r$ in $\mathbb{Z}[X]$. In particular, $p$ divides the leading coefficient of $Nf_r$ so $Nf_r$ has degree
at most $p - 1$ over $\mathbb{F}_p$. Since $Nf_r(n)$ is identically zero modulo $p$, as $f_r(n) \in \mathbb{Z}$ and $p \mid N$, this
implies that $Nf_r = 0$ over $\mathbb{F}_p$ since it has degree at most $p - 1$ and $p$ roots. This contradicts our
initial assumption. 

Polynomials over Fp
Exercise 5.5.9† (Generalised Eisenstein's Criterion). Let $f = a_nX^n + \ldots + a_0 \in \mathbb{Z}[X]$ be a polynomial
and let $p$ be a rational prime. Suppose that $p \nmid a_n$, $p \mid a_0, \ldots, a_{n-1}$, and $p^2 \nmid a_k$ for some $k < n$. Then
any factorisation $f = gh$ in $\mathbb{Q}[X]$ satisfies $\min(\deg g, \deg h) \le k$.

Solution

Without loss of generality suppose $f = gh$ for some $g, h \in \mathbb{Z}[X]$ using Gauss's lemma. Modulo
$p$, $f \equiv a_nX^n$ so $g \equiv bX^r$, $h \equiv cX^s$. If $r, s > k$, we get $p^2 \mid a_k$ which is a contradiction. Indeed, if
$g = bX^r + pu$ and $h = cX^s + pv$, then $f \equiv bcX^{r+s} + p(bvX^r + cuX^s) \pmod{p^2}$ and the coefficient
of $X^k$ of this polynomial is zero when $r, s > k$. 

Exercise 5.5.10† (China TST 2008). Let $f \in \mathbb{Z}[X]$ be a (non-zero) polynomial with coefficients in
$\{-1, 1\}$. Suppose that $(X - 1)^{2^k}$ divides $f$ and $\deg f \ge 2^k$. Prove that $\deg f \ge 2^{k+1} - 1$.

Solution

We proceed as in Exercise 5.5.11†. Since $(X - 1)^{2^k} \mid f$, we have $n := \deg f \ge 2^k$. Modulo 2,
$f \equiv \frac{X^{n+1} - 1}{X - 1}$ and $(X - 1)^{2^k} = X^{2^k} - 1$ by Frobenius. Thus, $X^{2^k} - 1 \mid X^{n+1} - 1$, which implies
$2^k \mid n + 1$ by Exercise 4.3.1∗, and in particular $n \ge 2^{k+1} - 1$ as $n \ge 2^k$. 

Exercise 5.5.11† (Romania TST 2002). Let $f, g \in \mathbb{Z}[X]$ be polynomials with coefficients in $\{1, 2002\}$
such that $f \mid g$. Prove that $\deg f + 1 \mid \deg g + 1$.

Solution

Modulo 3 (which was chosen so that $1 \equiv 2002$), by Gauss's lemma, we get $f \equiv \frac{X^{\deg f + 1} - 1}{X - 1}$ and
$g \equiv \frac{X^{\deg g + 1} - 1}{X - 1}$. Thus, $X^{\deg f + 1} - 1 \mid X^{\deg g + 1} - 1$ over $\mathbb{F}_3$, which implies $\deg f + 1 \mid \deg g + 1$ by
Exercise 4.3.1∗. 

Exercise 5.5.12† (USAMO 2006). Find all polynomials $f \in \mathbb{Z}[X]$ such that the sequence $(P(f(n^2)) -
2n)_{n \ge 0,\, f(n^2) \ne 0}$ is bounded above, where $P$ is the greatest prime factor function. (In particular, since
$P(0) = +\infty$, we have $f(n^2) \ne 0$ for any $n \in \mathbb{Z}$.)

Solution

Suppose that $P(f(n^2)) - 2n \le N$ for all $n$. Suppose that $p \mid f(n^2)$ for some odd prime $p$ and a
rational integer $n$. Without loss of generality, suppose $0 \le n < \frac{p}{2}$ by replacing $n$ by its remainder
upon its Euclidean division by $p$, and then replacing it by $p - n$ if necessary. By assumption,
$p - 2n \in [N]$. Thus, the odd prime factors of $f(n^2)$ always divide the ones of

\[ (2n + 1) \cdots (2n + N)(2n - 1) \cdots (2n - N) = (4n^2 - 1) \cdots (4n^2 - N^2). \]

By Exercise 5.5.1†, this implies that $f$ divides

\[ (4X - 1) \cdots (4X - N^2), \]

i.e. $f$ has the form $a\prod_i (4X - a_i^2)$ for some $a \in \mathbb{Q}$ and $a_i \in \mathbb{Z}$. Since $f(n^2)$ has no root, all $a_i$
must be odd, and this implies $a = \pm 1$ by Gauss's lemma since $4X - k$ is primitive for odd $k$.
Conversely, these all work: $p \mid f(n^2) \implies p \mid 2n \pm a_i$ and thus $p - 2n \le \max_i(|a_i|) := N$. 

Exercise 5.5.14† (China TST 2021). Suppose the polynomials $f, g \in \mathbb{Z}[X]$ are such that, for any
sufficiently large rational prime $p$, there is an element $r_p \in \mathbb{F}_p$ such that $f \equiv g(X + r_p) \pmod p$. Prove
that there exists a rational number $r \in \mathbb{Q}$ such that $f = g(X + r)$.

Solution

It is clear that $f$ and $g$ have the same degree. We proceed by induction on $\deg f$. When $f$ is
constant it is trivial. For the inductive step, notice that the statement still holds for $f'$ and $g'$.
Since they have degree $\deg f - 1$, we conclude that $f' = g'(X + r)$ for some $r$, i.e. $f = g(X + r) + c$
for some $c \in \mathbb{Q}$. When $\deg f = 1$ this already implies that $f = g(X + s)$ for some $s$. Otherwise,
since we have $g'(X + r_p) \equiv g'(X + r)$, we get $r_p \equiv r$. In that case, we get $c \equiv 0 \pmod p$ for
infinitely many $p$ so $c = 0$ as wanted. 

Iterates
Exercise 5.5.16†. Let $f \in \mathbb{Z}[X]$ be a polynomial. Show that the sequence $(f^n(0))_{n \ge 0}$ is a Mersenne
sequence, i.e.
\[ \gcd(f^i(0), f^j(0)) = f^{\gcd(i,j)}(0) \]
for any $i, j \ge 0$.

Solution

If $f^i(0) \equiv 0 \pmod d$, then we get $f^{ki}(0) \equiv 0$ for any integer $k \ge 0$ by applying $f^i$ multiple times
on both sides. This shows that $f^{\gcd(i,j)}(0) \mid \gcd(f^i(0), f^j(0))$. For the converse, suppose that $d$
divides $f^i(0)$ and $f^j(0)$ where $j > i$. Then,

\[ 0 \equiv f^j(0) \equiv f^{j-i}(f^i(0)) \equiv f^{j-i}(0) \]

so $d \mid f^{j-i}(0)$ too. This shows that we can apply Euclid's algorithm to get $d \mid f^{\gcd(i,j)}(0)$. 
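
The gcd identity for the iterates of 0 can be tested on examples. In this sketch (names ours) the identity is checked up to sign, since the iterates may be negative; the two sample polynomials are arbitrary choices.

```python
from math import gcd

def iterates_at_zero(f, count):
    """Return [f^0(0), f^1(0), ..., f^count(0)]."""
    values = [0]
    for _ in range(count):
        values.append(f(values[-1]))
    return values

def is_mersenne_sequence(values):
    """Check gcd(v_i, v_j) = |v_gcd(i, j)| for all indices in range."""
    return all(
        gcd(values[i], values[j]) == abs(values[gcd(i, j)])
        for i in range(len(values))
        for j in range(len(values))
    )

assert is_mersenne_sequence(iterates_at_zero(lambda x: x * x + 1, 7))
assert is_mersenne_sequence(iterates_at_zero(lambda x: x * x - 2, 7))
```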

Exercise 5.5.17†. Suppose the non-constant polynomial

\[ f = a_dX^d + \ldots + a_2X^2 + a_0 \in \mathbb{Z}[X] \]

has positive coefficients and satisfies $f'(0) = 0$. Prove that the sequence $(f^n(0))_{n \ge 1}$ always has a
primitive prime factor.

Solution

Notice that the coefficient of $X$ in $f^i$ is also 0 for any $i$ (this can be seen via direct expansion
or via $(f^i)'(0) = f'(0)\,(f^{i-1})'(f(0)) = 0$). Thus, we have $f^i(x) \equiv f^i(0) \pmod{x^2}$ for any $x \in \mathbb{Z}$.
Letting $x = f^j(0)$ yields $f^{i+j}(0) \equiv f^i(0) \pmod{f^j(0)^2}$. By induction, we get

\[ f^{km}(0) \equiv f^m(0) \pmod{f^k(0)^2} \tag{$*$} \]

for any integers $k, m \ge 0$.

Suppose now that $f^n(0)$ doesn't have a primitive prime factor for some $n \ge 2$. This means that,
if $p \mid f^n(0)$ is prime, $p \mid f^k(0)$ for some $k < n$. By Exercise 5.5.16†, we may assume that $k \mid n$ by
replacing it by $\gcd(k, n)$ if necessary. Then, by $(*)$, we have $f^n(0) \equiv f^k(0) \pmod{f^k(0)^2}$. In
particular, $v_p(f^n(0)) = v_p(f^k(0))$. Thus, we conclude that

\[ v_p(f^n(0)) \le v_p(f(0) \cdots f^{n-1}(0)) \]

for any prime $p$. This means that

\[ f^n(0) \le f(0) \cdots f^{n-1}(0). \]



We shall prove that this is impossible for $n \ge 2$ by induction. We clearly have $f(0) \ge 1$ since the
coefficients are positive. Now, if $f^n(0) \ge f(0) \cdots f^{n-1}(0)$, then

\[ f^n(0)^2 \ge f(0) \cdots f^{n-1}(0) \cdot f^n(0) \]

so it suffices to prove that $f^{n+1}(0) > f^n(0)^2$. This is clearly true since $f(x) > x^2$ for positive
$x$. 

Exercise 5.5.18† (Tuymaada 2003). Let $f \in \mathbb{Z}[X]$ be a polynomial and $a \in \mathbb{Z}$ a rational integer.
Suppose $|f^n(a)| \to \infty$. Prove that there are infinitely many primes $p$ such that $p \mid f^n(a)$ for some
$n \ge 0$ unless $f = AX^d$ for some $A, d$.

Solution

Suppose that there are finitely many such primes $p_1, \ldots, p_m$. Suppose first that $f(0) = 0$. Then,
if we write $f = X^kg$ where $g(0) \ne 0$, we get

\[ g(f^i(a)) \equiv g(0) \pmod{f^{i-1}(a)} \tag{$*$} \]

since $f(0) = 0$. Choose an $n$ such that $p_1 \cdots p_m \mid f^n(a)$; there exists one by assumption (if
$p \mid f^j(a)$ then $p \mid f^{j+1}(a)$). Let $p$ be one of $p_1, \ldots, p_m$. By $(*)$, we have $v_p(g(f^i(a))) = 0$ if $p \nmid g(0)$,
and, for $i > n$, if $p \mid g(0)$,

\[ v_p(f^i(a)) = v_p(f^{i-1}(a)^k g(f^{i-1}(a))) \ge v_p(f^{i-1}(a)) + 1 \]

since $g(f^{i-1}(a)) \equiv g(0) \equiv 0$. Hence, for sufficiently large $i$ we get $v_p(f^{i-1}(a)) > v_p(g(0))$ for
all $p = p_1, \ldots, p_m$. Combined with $(*)$, we must have $g(f^k(a)) = g(0)$ since the $p_i$ are the only
prime factors, and for large $k$ this implies that $g$ is constant as wanted.

Let $n$ be a large integer. Consider the $m + 1$ numbers $f^n(a), f^{n+1}(a), \ldots, f^{n+m}(a)$. Each of them
is divisible by a large power of some $p_k$, since by assumption they are large and their only prime
factors are the $p_i$. By the pigeonhole principle, two of them must be divisible by a large power
of the same $p_k$, say $p_k^r \mid f^{n+i}(a), f^{n+j}(a)$ with $j > i$. Then, $p_k^r \mid f^{j-i}(0)$. Taking $n \to \infty$ and
thus $r \to \infty$ yields $f^s(0) = 0$ for some $1 \le s \le m$. Now, the previous discussion implies that
$f^s = AX^d$ for some $d$. However, if $\deg f \ge 2$, it is easy to see that $f^s$ can only be of this form if
$f$ is. Indeed, if the two leading terms of $g$ are $UX^u + VX^v$ and the leading term of $f$ is $WX^w$,
then the leading terms of $f \circ g$ are

\[ WU^wX^{uw} + wWVU^{w-1}X^{u(w-1)+v} \]

(the situation is different when $\deg f = 1$ because we also need to take into account
the second term of $f$). Thus, we are done when $\deg f \ne 1$. Otherwise, suppose that $f = uX + v$
with $v \ne 0$. Then,
\[ f^n(a) = u^n\left(a + \frac{v}{u - 1}\right) - \frac{v}{u - 1}. \]
Our previous discussion implies that $a \ne 0$, otherwise the sequence $(f^n(a))_{n \ge 0}$ is bounded. If we
take
\[ n = \varphi((p_1 \cdots p_m)^N), \]
we get $f^n(a) \in \{a, -\frac{v}{u-1}\}$ modulo $p_i^N$ for every $i$. For sufficiently large $N$, both of these are
non-zero modulo $p_i^N$, so the $p$-adic valuations are bounded which implies that the sequence is too
since these are the only prime factors. This is a contradiction. 

Exercise 5.5.19† (USA TST 2020). Find all integers $n \ge 2$ for which there exist a rational integer
$m > 1$ and a polynomial $f \in \mathbb{Z}[X]$ such that $\gcd(m, n) = 1$ and $n \mid f^k(0) \iff m \mid k$ for any positive
rational integer $k$.

Solution

Let $k\#$ denote the product of the first $k$ prime numbers (the $k$th primorial). We shall prove that
$n$ works if and only if $\operatorname{rad} n \ne k\#$ for any integer $k \ge 1$.

Suppose first that $\operatorname{rad} n \ne k\#$ for any integer $k \ge 1$. Let $p$ be the greatest prime factor of $n$
and $r = v_p(n)$, and let $q$ be the smallest prime which doesn't divide $n$. By assumption, $q < p$.
Consider the cycle $\tau : 0 \to 1 \to \ldots \to q - 1 \to 0$. Construct the polynomial
\[ g = \sum_{i=0}^{p-1} \tau(i) \prod_{j \ne i} \frac{X - j}{i - j} \in (\mathbb{Z}/p^r\mathbb{Z})[X] \]

which interpolates $\tau$. Note that this is indeed in $\mathbb{Z}/p^r\mathbb{Z}$ as the denominators are in $]-p, p[$ and
non-zero, and thus coprime with $p$. Lift $g$ to any polynomial with integer coefficients which is
congruent to $g$ modulo $p^r$. We shall denote this new $g$ abusively by $g$ again. Let $u \in \mathbb{Z}$ be such
that $u \cdot \frac{n}{p^r} \equiv 1 \pmod{p^r}$. Then, $f := u\frac{n}{p^r}g$ works as we have

\[ n \mid f^k(0) \iff p^r \mid g^k(0) \iff q \mid k \]

($m = q$).

Now, suppose $\operatorname{rad} n = k\#$ for some $k$. Suppose for the sake of a contradiction that there is some
$f \in \mathbb{Z}[X]$ and $m \in \mathbb{Z}$ coprime with $n$ such that $n \mid f^k(0)$ if and only if $m \mid k$. Notice that this implies
that the sequence $(f^k(0))_{k \ge 0}$ is periodic modulo $n$, and thus also modulo $p$ for any $p \mid n$. Since it
can take only $p$ values modulo $p$, the period is at most $p$. Since it is coprime with $n$, it must be
1 (by assumption all primes $q \le p$ divide $n$). Thus, the sequence $(f^k(0))_{k \ge 0}$ is constant modulo
$\operatorname{rad} n$. We shall prove by induction on $\ell$ that $p^\ell \mid f(0)$ for any $p \mid n$ and $\ell \le v_p(n)$, which implies
that $f(0) \equiv 0$, contradicting the fact that $m > 1$. We have already proved the base case. Note
that, by Corollary 5.3.1, we have

\[ f^{k+1}(0) \equiv f(0) + f^k(0)f'(0) \pmod{p^{\ell+1}}. \tag{$*$} \]

Suppose for the sake of a contradiction that $p \nmid f'(0)$. If $f'(0) \equiv 1 \pmod p$, by induction, we
get $f^k(0) \equiv kf(0) \pmod{p^{\ell+1}}$. Thus, if $p^{\ell+1} \nmid f(0)$, we have $p^{\ell+1} \mid f^k(0) \iff p \mid k$ which
contradicts the fact that $m$ was coprime with $n$.

Thus, $f'(0) \not\equiv 1 \pmod p$. Accordingly, $(*)$ becomes

\[ f^{k+1}(0) + \frac{f(0)}{f'(0) - 1} \equiv f'(0)\left(f^k(0) + \frac{f(0)}{f'(0) - 1}\right). \]

By induction, we get
\[ f^k(0) \equiv f(0) \cdot \frac{f'(0)^k - 1}{f'(0) - 1}. \]
If $p^{\ell+1} \nmid f(0)$, we have
\[ p^{\ell+1} \mid f^k(0) \iff p \mid f'(0)^k - 1 \iff s \mid k \]
where $s$ is the order of $f'(0)$ modulo $p$. However, $s \mid p - 1$ so $s < p$ and is thus not coprime with
$n$ which is again a contradiction.

Finally, we conclude that we must in fact have $p \mid f'(0)$. But then, $(*)$ becomes $f^k(0) \equiv f(0)$,
and since $f^m(0) \equiv 0$ we get $f(0) \equiv 0$ as wanted. 

Exercise 5.5.20†. Let $f \in \mathbb{Q}[X]$ be a polynomial of degree $k$. Prove that there is a constant $h > 0$
such that, for any $x \in \mathbb{Q}$, the denominator of $f(x)$ is greater than $h$ times the denominator of $x^k$.

Solution
Let $x = \frac{m}{n}$ where $m$ and $n$ are coprime rational integers, write $f = \sum_{i=0}^{k} a_iX^i$ (with $k = \deg f$)
and pick $0 \ne N \in \mathbb{Z}$ such that $Nf \in \mathbb{Z}[X]$. Denote by $\mathbb{Z}_{(p)}$ the set of rational numbers with
denominator not divisible by $p$. Let $p$ be a prime factor of $n$ and $c \ge 0$ be an integer. Then,
\[ Np^{kv_p(n) - c} f\left(\frac{m}{n}\right) = \frac{Na_km^k}{p^c}\left(\frac{p^{v_p(n)}}{n}\right)^k + \sum_{i=0}^{k-1} Na_im^i\left(\frac{p^{v_p(n)}}{n}\right)^i p^{(k-i)v_p(n) - c}. \]

For $v_p(n) \ge c$, the second sum is in $\mathbb{Z}_{(p)}$, while for $c > v_p(Na_k)$, the first term is not in $\mathbb{Z}_{(p)}$.
Thus, for $v_p(n) \ge c > v_p(Na_k) := c_p$, $Np^{kv_p(n) - c} f\left(\frac{m}{n}\right) \notin \mathbb{Z}_{(p)}$, i.e. the denominator $D$ of
$Nf\left(\frac{m}{n}\right)$ is divisible by $p^{kv_p(n) + 1 - c}$. We conclude that, for $v_p(n) > c_p$, we have $p^{kv_p(n) - c_p} \mid D$.

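We are now almost done. Let $P$ be the product of $p^{v_p(n)}$ over the primes $p$ for which $v_p(n) \le c_p$.
Then,
\[ P \le \prod_p p^{c_p} = |Na_k| := C. \]
Thus, the denominator of $Nf\left(\frac{m}{n}\right)$, and hence of $f\left(\frac{m}{n}\right)$, is at least

\[ \prod_{v_p(n) > c_p} p^{kv_p(n) - c_p} \ge \frac{1}{C^k} \cdot \prod_{p \mid n} p^{kv_p(n) - c_p} \ge \frac{1}{C^{k+1}} \prod_{p \mid n} p^{kv_p(n)} = \frac{|n|^k}{C^{k+1}} \]

as desired ($h = \frac{1}{C^{k+1}}$). 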
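
The resulting bound, namely that the denominator of $f\left(\frac{m}{n}\right)$ is at least $|n|^k / C^{k+1}$ with $C = |Na_k|$, can be tested numerically. In the sketch below the sample polynomial $f = \frac{1}{2}X^2 + \frac{1}{3}X$ and all names are ours; for this $f$ we may take $N = 6$, so $C = 3$ and $h = \frac{1}{27}$.

```python
from fractions import Fraction
from math import gcd

def evaluate(coeffs, x):
    """Evaluate a polynomial with coefficients listed constant term first."""
    return sum(Fraction(c) * x**i for i, c in enumerate(coeffs))

def denominator_bound_holds(coeffs, max_n):
    """Check denominator(f(m/n)) >= n^k / C^(k+1) for coprime m, n with 1 <= n <= max_n."""
    k = len(coeffs) - 1
    N = 1
    for c in coeffs:                       # N = lcm of the denominators
        d = Fraction(c).denominator
        N = N * d // gcd(N, d)
    C = abs(N * Fraction(coeffs[-1]))      # C = |N a_k|
    h = 1 / C ** (k + 1)
    return all(
        evaluate(coeffs, Fraction(m, n)).denominator >= h * n**k
        for n in range(1, max_n + 1)
        for m in range(-max_n, max_n + 1)
        if gcd(m, n) == 1
    )

f = [0, Fraction(1, 3), Fraction(1, 2)]    # f = X^2/2 + X/3
assert denominator_bound_holds(f, 40)
```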

Exercise 5.5.21†. Let $f \in \mathbb{Q}[X]$ be a polynomial of degree at least 2. Prove that
\[ \bigcap_{k=0}^{\infty} f^k(\mathbb{Q}) \]
is finite.

Solution

Let D(x) denote the denominator of a rational number x ∈ Q. Note that, by Exercise 5.5.20†,
when D(x) is sufficiently large, D(f(x)) > D(x) as deg f ≥ 2. This implies that, for a fixed
r, if r = f^k(s_k) for all k, the denominator of s_k is bounded (otherwise f^k(s_k) would have a
denominator which is too large). However, its absolute value is also bounded, since
|f(x)| > max(|x|, |r|) for sufficiently large |x|. Thus, there are a finite number of possible s_k,
and this implies that f^i(s) = r = f^j(s) for some i ≠ j and s := s_i = s_j. In other words, the
intersection ⋂_{k=0}^{∞} f^k(Q) consists only of pre-periodic points, since we also have
f^i(r) = f^{i+j}(s) = f^j(r). Thus, it suffices to show that there are finitely many pre-periodic points.

Let r be a pre-periodic point, i.e. such that f^i(r) = f^j(r) for some j > i. The same trick as
before shows that D(r) is bounded. Indeed, if D(r) is sufficiently large, then D(f^i(r)) ≥ D(r) is
too, and thus
D(f^j(r)) = D(f^{j−i}(f^i(r))) > D(f^i(r)).
But at the same time, the absolute value of r is also bounded since |f(x)| > |x| for |x| sufficiently
large (as deg f ≥ 2), so there are a finite number of pre-periodic points as wanted.
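To see why "sufficiently large" matters in the denominator argument, here is a small sketch with the polynomial f = X^2 − 29/16, chosen for this illustration only (it does not appear in the text). It has a rational 3-cycle with denominator 4, while every reduced fraction with denominator at least 5 gets a strictly larger denominator after one application of f.

```python
from fractions import Fraction
from math import gcd

C = Fraction(-29, 16)   # hypothetical example: f = X^2 - 29/16

def f(x):
    return x * x + C

# A periodic orbit with small denominators: -1/4 -> -7/4 -> 5/4 -> -1/4.
cycle = [Fraction(-1, 4), Fraction(-7, 4), Fraction(5, 4)]

def denominators_grow(lo, hi):
    """Check D(f(m/n)) > D(m/n) for all reduced m/n with lo <= n <= hi, |m| <= hi."""
    for n in range(lo, hi + 1):
        for m in range(-hi, hi + 1):
            if gcd(m, n) == 1:
                x = Fraction(m, n)
                if f(x).denominator <= x.denominator:
                    return False
    return True
```

So any pre-periodic point of this particular f must have denominator at most 4, which leaves finitely many candidates once the absolute value is bounded as well.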

Exercise 5.5.22† (Iran TST 2004). Let f ∈ Z[X] be a polynomial such that f(n) > n for any positive
rational integer n. Suppose that, for any N ∈ Z, there is some positive rational integer n such that
N | f^n(1).
Prove that f = X + 1.

Solution

Choose N = f^{n+1}(1) − f^n(1) for some n. Then, the sequence f^i(1) modulo N goes as follows:

f(1), f^2(1), . . . , f^n(1), f^n(1), f^n(1), . . . .

Thus, by assumption, N | f^k(1) for some k ≤ n. If k = n, then f^{n+1}(1) − f^n(1) ≡ f(0). Thus,
we get
f^{n+1}(1) − f^n(1) ≤ f^{n−1}(0)
or f^{n+1}(1) − f^n(1) ≤ f(0). It is easy to see that this forces f = X + m. Finally, modulo m the
sequence f^n(1) is constant equal to 1 so m = ±1, and since f(n) > n this means that f = X + 1
as wanted.

Divisibility Relations
Exercise 5.5.23†. Find all polynomials f ∈ Z[X] such that f(n) | n^{n−1} − 1 for sufficiently large n.

Solution

Let n be a sufficiently large rational integer, and p be a prime factor of f(n). Let m be an integer
such that m ≡ n (mod p) and m ≡ 2 (mod p − 1). Then, p | f(m) | m^{m−1} − 1 ≡ n − 1 (mod p). Thus,
every prime factor of f(n) divides n − 1. By Corollary 5.4.2, this implies that f is a constant times
a power of X − 1, say c(X − 1)^k. By LTE, for any odd prime p | n − 1, we have

v_p(n^{n−1} − 1) = v_p(n − 1) + v_p(n − 1) = 2v_p(n − 1)

so k ≤ 2. Finally, the constant divides n − 1 for every sufficiently large n so must be ±1.
Conversely, by the previous discussion, ±1, ±(X − 1), ±(X − 1)^2 all work.
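As a quick sanity check of the answer, the following sketch tests the divisibility (n − 1)^2 | n^{n−1} − 1 on a range of integers, and shows that the exponent 2 cannot be raised to 3 in general. The function names are made up for this illustration.

```python
def square_divides(n):
    """True if (n - 1)^2 divides n^(n - 1) - 1."""
    modulus = (n - 1) ** 2
    return (pow(n, n - 1, modulus) - 1) % modulus == 0

def cube_divides(n):
    """True if (n - 1)^3 divides n^(n - 1) - 1."""
    modulus = (n - 1) ** 3
    return (pow(n, n - 1, modulus) - 1) % modulus == 0
```

For instance, n = 4 gives 4^3 − 1 = 63, which is divisible by 9 but not by 27.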

Exercise 5.5.25† (ISL 2012 Generalised). Find all polynomials f ∈ Z[X] such that rad f(n) |
rad f(n^{rad n}). (You may assume Dirichlet’s theorem.)

Solution

Let n ∈ Z and suppose that p is a prime factor of f(n). Suppose that p ∤ n. Let k ∈ (Z/(p−1)Z)^×
be arbitrary. Pick a prime number q ≡ n (mod p) and q ≡ k (mod p − 1) using Dirichlet’s
theorem and CRT. Then, p | rad f(q) | rad f(q^q) and q^q ≡ n^k (mod p), so p | f(n^k). Thus,
whenever p | f(n), either p | n or p | f(m) for any m with the same order as n modulo p. In
particular, if n has order u modulo p, then f has u roots in F_p. Let d be the degree of f. Since f
has at most d roots in F_p, this implies that the prime factors of f(n) all divide g(n) where g is
\[
X \Phi_1 \Phi_2 \cdots \Phi_d.
\]
Now, suppose f = λ X^{a_0} ∏_{s∈S} Φ_s^{a_s}. We claim that f works iff r | s ⟹ r ∈ S for any s ∈ S.
Clearly, these all work since if n has order s modulo p, i.e. p | Φ_s(n), then n^{rad n} has order r
dividing s and p | Φ_r(n^{rad n}) | f(n^{rad n}) as wanted.

It suffices to show that r | s ⟹ r ∈ S when s/r is prime, since we can obtain any divisor of s by
dividing multiple times by a prime. Thus, suppose that s = qr for some prime q. Let p ≡ 1
(mod s) be a prime not dividing any element of S and let n be an element of order s modulo p.
Pick a prime q′ ≠ q such that qq′ ≡ n (mod p) and q′ ≡ 1 (mod (p−1)/q); there exists one by CRT
and Dirichlet’s theorem again. Then, p | rad f(qq′) | rad f((qq′)^{qq′}) and (qq′)^{qq′} ≡ n^q (mod p),
so p | f(n^q). Since n^q is an element of order r modulo p, this means that Φ_r | f as wanted. We
conclude that all solutions have the form
\[
f = \lambda X^{a_0} \prod_{s \in S} \Phi_s^{a_s}
\]
where S is such that r | s ∈ S implies r ∈ S.
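The closure condition on S can be observed on two small cases. In the sketch below, both polynomials are chosen for this illustration only: X Φ_1 Φ_2 = X(X − 1)(X + 1) has S = {1, 2}, which is closed under taking divisors, while Φ_2 = X + 1 alone has S = {2}, which is not. The check uses the fact that rad a | rad b exactly when every prime factor of a divides b (for non-zero a and b).

```python
def prime_factors(n):
    """Set of prime factors of a non-zero integer, by trial division."""
    n = abs(n)
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def rad(n):
    result = 1
    for p in prime_factors(n):
        result *= p
    return result

def satisfies(f, values):
    """Check rad f(n) | rad f(n^rad(n)) for every n in values (f(n) assumed non-zero)."""
    for n in values:
        target = f(n ** rad(n))
        if any(target % p != 0 for p in prime_factors(f(n))):
            return False
    return True

def closed(x):          # X * Phi_1 * Phi_2, with S = {1, 2}
    return x * (x - 1) * (x + 1)

def not_closed(x):      # Phi_2 alone, with S = {2}
    return x + 1
```

The second polynomial already fails at n = 2, where f(2) = 3 but f(2^2) = 5.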

Remark 5.5.1
There is an elementary way to avoid the use of Dirichlet’s theorem. When we take a prime
satisfying some congruence condition, we do not really care that it’s prime, we just care about
the value of its radical. Thus, it suffices to have a (sufficiently large) squarefree n such that
n ≡ a (mod b) for some coprime a, b. This can be done by showing that the density of such n is
positive. More specifically, let N be a positive integer. We want to count how many n ∈ [N] are
congruent to a (mod b) and not squarefree. If we show that this is at most cN + o(N) for some
c < 1/b we are done since there are N/b + O(1) integers congruent to a modulo b in [N]. How many
such integers are there then? Well, n is not squarefree if it is divisible by p^2 for some prime p.
We can assume p ∤ b as these primes can’t divide n ≡ a (mod b). Then, there are N/(bp^2) + O(1)
such integers congruent to a modulo b since the conditions n ≡ 0 (mod p^2) and n ≡ a (mod b) are
independent by CRT. Thus, the number of n ≡ a (mod b) in [N] which are not squarefree is at
most
\[
\frac{N}{b} \sum_{p \nmid b,\ p \le N} \left( \frac{1}{p^2} + O(1/N) \right) \le O(\pi(N)) + \frac{N}{b} \sum_{p \nmid b} \frac{1}{p^2}
\]
where π(n) is the number of primes less than n. In Exercise 3.5.14†, we proved that π(N) = o(N),
so we only need to show that Σ_{p∤b} 1/p^2 < 1 to be done. This follows from the following estimate:
\[
\sum_p \frac{1}{p^2} < \sum_{n=2}^{\infty} \frac{1}{n(n-1)} = \sum_{n=2}^{\infty} \left( \frac{1}{n-1} - \frac{1}{n} \right) = 1.
\]

We end this remark with one more observation: our previous approach can be refined to show
that, in fact, the exact density of squarefree n ≡ a (mod b) is
\[
\frac{1}{b} \prod_{p \nmid b} \left( 1 - \frac{1}{p^2} \right) = \frac{1}{\zeta(2)\, b \prod_{p \mid b} \left( 1 - \frac{1}{p^2} \right)} = \frac{6}{\pi^2 b \prod_{p \mid b} \left( 1 - \frac{1}{p^2} \right)}.
\]

Indeed, the product of the 1 − 1/p^2 comes from the inclusion-exclusion principle: the density of n
not divisible by p^2 is 1 − 1/p^2 and these densities are independent by CRT. More precisely, there are
\[
\frac{N}{b} \prod_{p \nmid b,\ p \le N} \left( 1 - \frac{1}{p^2} + O(1/N) \right)
\]
integers in [N] congruent to a modulo b which are not divisible by the square of a prime. If we take
the logarithm, since
log(x + ε) − log x = log(1 + ε/x) = ε/x + O(ε^2/x^2),
we will be able to sum the O(1/N) and get O(π(N)/N) → 0. When we retake the exponential,
this gives us a constant which goes to 1. Hence, after dividing by N, we get the wanted result.
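The density of squarefree integers in a residue class can be compared with a direct count. This is a rough numerical sketch, not a proof: the classes 1 (mod 3) and 3 (mod 4) and the cutoff 100000 are arbitrary choices, and the predicted value is the product (1/b) ∏_{p∤b} (1 − 1/p^2) rewritten with ζ(2) = π^2/6.

```python
from math import pi

def squarefree_flags(limit):
    """flags[n] is True iff n is squarefree, for 1 <= n <= limit."""
    flags = [True] * (limit + 1)
    flags[0] = False
    d = 2
    while d * d <= limit:
        for multiple in range(d * d, limit + 1, d * d):
            flags[multiple] = False
        d += 1
    return flags

def observed_density(a, b, limit):
    """Proportion of n <= limit that are squarefree and congruent to a mod b."""
    flags = squarefree_flags(limit)
    count = sum(1 for n in range(1, limit + 1) if n % b == a % b and flags[n])
    return count / limit

def predicted_density(b):
    """6 / (pi^2 * b * prod over primes p | b of (1 - 1/p^2))."""
    product = 1.0
    m, p = b, 2
    while m > 1:
        if m % p == 0:
            product *= 1 - 1 / p**2
            while m % p == 0:
                m //= p
        p += 1
    return 6 / (pi**2 * b * product)
```

For b = 4 the prediction is 2/π^2 ≈ 0.2026, which is half the density of odd squarefree integers, as one would expect.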

Exercise 5.5.27†. Find all polynomials f ∈ Z[X] such that f(p) | 2^p − 2 for any prime p. (You may
assume Dirichlet’s theorem.)

Solution

Let n ∈ Z be an integer and suppose p is an odd prime factor of f(n). Suppose for the sake of
a contradiction that p ∤ n. Using Dirichlet’s theorem, we may find a prime q ≡ n (mod p) and
q ≡ −1 (mod p − 1). This gives p | f(q) | 2^q − 2 ≡ −3/2 (mod p) so p = 3. Thus, the sufficiently
large prime divisors of f(n) divide n which implies that f = aX^k for some a ∈ Z and some
integer k ≥ 0 by Corollary 5.4.2. Since f(2) | 2, we get the solutions f = ±1, f = ±2 and f = ±X,
which work by Fermat’s little theorem.

Miscellaneous
Exercise 5.5.29† (Generalised Hensel’s Lemma). Let f ∈ Z[X] be a polynomial and a ∈ Z an integer.
Let m = v_p(f′(a)). If p^{2m+1} | f(a), prove that f has exactly one root b modulo p^k which is congruent
to a modulo p^{m+1} for all k ≥ 2m + 1.

Solution

We need to show that we can still perform the inductive step for k ≥ 2m + 1. Write b_{k+1} =
b_k + u p^{k−m}. We have

f(b_{k+1}) ≡ f(b_k) + u p^{k+1−m} f′(a) (mod p^{k+1})

as before. This can be congruent to 0 if and only if v_p(p^{k+1−m} f′(a)) < v_p(f(b_k)), which is true
since
v_p(f(b_k)) > k = (k − m) + v_p(f′(a)).
As before, this u is unique modulo p which shows the wanted result.

Remark 5.5.2
This doesn’t work under the weaker assumption that v_p(f(a)) > m, as can be seen from f =
14X^2 + 3X + 9 which doesn’t have a root modulo 3^3 (this may seem very random but it was
in fact carefully constructed from our previous proof), because the congruence we get with the
derivative holds modulo p^{2(k−m)}, and 2(k − m) ≥ k + 1 only for k ≥ 2m + 1.
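The counterexample of this remark is small enough to verify by brute force. The sketch below does so, and also looks at a case where the hypothesis p^{2m+1} | f(a) does hold; that second polynomial, X^2 − 17 with p = 2 and a = 1 (so m = 1 and 2^3 | f(1) = −16), is a hypothetical companion example chosen for this illustration.

```python
def roots_mod(coeffs, modulus):
    """All residues x with f(x) = 0 mod modulus; coeffs[i] is the coefficient of X^i."""
    def f(x):
        return sum(c * x**i for i, c in enumerate(coeffs))
    return [x for x in range(modulus) if f(x) % modulus == 0]

f = [9, 3, 14]     # 14X^2 + 3X + 9, with a = 0, p = 3, m = v_3(f'(0)) = 1
g = [-17, 0, 1]    # X^2 - 17, with a = 1, p = 2, m = v_2(g'(1)) = 1
```

Here f(0) = 9 has 3-adic valuation 2 > m, so f has roots modulo 9, but `roots_mod(f, 27)` is empty. For g, a root congruent to 1 modulo 4 exists modulo every power of 2 that was tried.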

Exercise 5.5.30† . Let f ∈ Z[X] be a non-constant polynomial. Is it possible that f (n) is prime for
any n ∈ Z?

Solution

Let f be a polynomial which is always prime. If f (n) = p, then p | f (n + kp), so we must have
f (n + kp) = p. For sufficiently large k, this implies that f = p is constant. 

Exercise 5.5.31† . Find all polynomials f ∈ Q[X] which are surjective onto Q.

Solution

Without loss of generality, suppose that f ∈ Z[X] and f (0) = 0. We will prove that f must have
degree 1 (and conversely, these polynomials are surjective). Let p be a rational prime. Examine
the equation f(r) = p: by Exercise 1.1.2, the potential rational solutions have the form p/s or 1/s
where s is a divisor of the leading coefficient of f. The latter is impossible for large p, while the
former is only possible for large p if f has degree 1, otherwise f(r) = f(p/s) grows too fast, of the
order of p^n where n is the degree of f.

Exercise 5.5.35† (ISL 2005). Let f ∈ Z[X] be a non-constant polynomial with positive leading
coefficient. Prove that there are infinitely many positive rational integers n such that f (n!) is composite.

Solution

Wilson’s theorem tells us that, if p is prime and 0 ≤ n ≤ p − 1,
\[
(p - 1 - n)! \equiv \frac{(p-1)!}{(-1) \cdot (-2) \cdot \ldots \cdot (-n)} \equiv \frac{(-1)^{n-1}}{n!} \pmod p.
\]
In other words, if we let g(X) be the polynomial X^{deg f} f(1/X), for any prime p and any positive
rational integer n < p, p | f(n!) ⟺ p | g((−1)^{n−1}(p − 1 − n)!). Hence, we wish to find an integer
m such that g((−1)^{m−1} m!) has a prime factor p > m for which f((p − 1 − m)!) is greater than p.
That way, f((p − 1 − m)!) will be divisible by p and not equal to p which means that it’s composite
as desired. Suppose that there are finitely many such primes. Note that, for sufficiently large m,
if p | g(m!) and p ≤ m, then p | g(0). In particular, there are finitely many such primes since
g(0) is the leading coefficient of f, and the same argument shows that
\[
\prod_{p \mid g(m!),\ p \le m} p^{v_p(g(m!))} \le g(0).
\]

This implies that, for sufficiently large m, g((−1)^{m−1} m!) has a prime factor p > m. Then, for
this p, we have p | f((p − 1 − m)!) by construction so f((p − 1 − m)!) = p for sufficiently large
p by assumption. In particular, since f(n!) ≥ n!/2 > 2^n for large n, we have p ≤ 2m, otherwise
f((p − 1 − m)!) > 2^{m−1} ≥ p. In other words, p is fairly close to m: between m and 2m. We are
almost done. Consider the sequence (f(n!))_{n≥0}. By assumption, p is an element of this sequence.
However, the terms of f(n!) grow further and further away: f((n + 1)!)/f(n!) → ∞. Hence, if
we choose m = 2f(n!) for instance, p will be greater than f(n!) but smaller than f((n + 1)!) and
so won’t be in the sequence (for large n).
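The consequence of Wilson’s theorem used at the start of this solution, written without fractions as (p − 1 − n)! · n! ≡ (−1)^{n−1} (mod p), is easy to test. The helper name below is made up for this sketch.

```python
from math import factorial

def wilson_identity_holds(p):
    """Check (p - 1 - n)! * n! = (-1)^(n - 1) mod p for every 0 <= n <= p - 1."""
    for n in range(p):
        sign = -1 if n % 2 == 0 else 1      # (-1)^(n - 1)
        if (factorial(p - 1 - n) * factorial(n) - sign) % p != 0:
            return False
    return True
```

The case n = 0 is Wilson’s theorem itself, (p − 1)! ≡ −1 (mod p).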
Chapter 6

The Primitive Element Theorem and Galois Theory

6.1 General Definitions


Exercise 6.1.1∗. Prove that R[α_1, . . . , α_n] is indeed the smallest ring containing R and α_1, . . . , α_n,
in the sense that any other such ring must contain R[α_1, . . . , α_n]. Similarly, prove that any field
containing K and α_1, . . . , α_n contains K(α_1, . . . , α_n).

Solution

If a ring contains R and α1 , . . . , αn , it contains all polynomials in α1 , . . . , αn with coefficients in


R since these are obtained from multiplication and addition of elements of R and the αi . Thus,
it contains R[α1 , . . . , αn ]. Similarly, if a field contains K and α1 , . . . , αn , it contains all rational
functions in α1 , . . . , αn with coefficients in K since these are obtained from multiplication of
elements of K[α1 , . . . , αn ] with inverses of other elements (we have already shown that the field
must contain K[α1 , . . . , αn ] since a field is also a ring). 

Exercise 6.1.2∗. Let α ∈ Q̄ be an algebraic number. Prove that Q(α) = Q[α].

Solution

We wish to prove that Q[α] is closed under inversion. Let f(α) be a non-zero element of Q[α],
i.e. π_α ∤ f. Then, since π_α is irreducible, it is coprime with f. Thus, by Bézout’s lemma, we
have rf + sπ_α = 1 for some r, s ∈ Q[X]. Evaluating at α yields r(α)f(α) = 1 as wanted.


Exercise 6.1.3∗. Prove that the minimal polynomial of (i+1)√2/2 over Q(i) is X^2 − i.

Solution

(i+1)√2/2 is a root of X^2 − i and is not in Q(i), so its minimal polynomial has degree 2 and divides
X^2 − i, and is therefore equal to it.

Exercise 6.1.4∗ . Check that L is a K-vector space.


Solution

Since K ⊆ L, multiplication of elements of L (vectors) by elements of K (scalars) is well-defined


and satisfies the obvious properties since it’s just the multiplication of two elements of L! 

Exercise 6.1.5∗. Prove that (u_i v_j)_{i∈[m], j∈[n]} is a K-basis of M.

Solution
Suppose that Σ_{i,j} a_{i,j} u_i v_j = 0 for some a_{i,j} ∈ K. Rewrite it as
\[
\sum_i u_i \left( \sum_j a_{i,j} v_j \right) = 0.
\]
This is a linear combination of the L-basis of M. Thus, by definition of a basis, Σ_j a_{i,j} v_j = 0
for each i. Again by the definition of a basis, this means that a_{i,j} = 0 for each j and each i.

Thus this family is linearly independent. It remains to show that it generates all of M. We can
proceed exactly as we did but in the reverse direction: let α = Σ_i b_i u_i be an element of M, with
b_i ∈ L (recall that the u_i form the L-basis of M). Write each b_i as a linear combination Σ_j a_{i,j} v_j
with a_{i,j} ∈ K (the v_j form the K-basis of L). We get
\[
\alpha = \sum_{i,j} a_{i,j} u_i v_j
\]
as wanted.

Exercise 6.1.6∗ . Let M/L/K be a tower of extensions and α ∈ M . Prove that the minimal
polynomial of α over L divides the minimal polynomial of α over K. In other words, its L-conjugates
are among its K-conjugates.

Solution

The minimal polynomial of α over K is also a polynomial over L since K ⊆ L. Since it vanishes
at α, this means that it is divisible by the minimal polynomial of α over L. 

Exercise 6.1.7∗ . Prove that finite extensions of K are exactly the fields of the form K(α1 , . . . , αn )
for α1 , . . . , αn algebraic elements over K, using Proposition 6.1.1.

Solution

We proceed by induction on [L : K], where L is a finite extension of K (we do not fix K). When
this is one we have L = K. For the induction step, let α ∈ L be an element which is not in K.
By the tower law,
[L : K] = [L : K(α)][K(α) : K].
Since α ∉ K, [K(α) : K] > 1 so that [L : K(α)] < [L : K]. By the induction hypothesis, this
means
L = K(α)(α1 , . . . , αn ) = K(α, α1 , . . . , αn )
as wanted. 

6.2 The Primitive Element Theorem and Field Theory


Exercise 6.2.1. Let K be a number field. Prove that the embeddings of K are the non-zero functions
f : K → C which are both multiplicative and additive.

Solution

This is the Cauchy equation: we shall show that any additive function is Q-linear. By induction,
we have f(nx) = nf(x) for any n ∈ Z. Thus, for 0 ≠ m, n ∈ Z, we have

nf (xm/n) = f (xm) = mf (x)

which means f (xm/n) = f (x)m/n, i.e. f is Q-linear. 

Exercise 6.2.2∗ . Let L/K be a finite extension and ϕ ∈ EmbK (L) an embedding of L. Prove that
ϕ(f (α)) = f (ϕ(α)) for any f ∈ K[X] and α ∈ L.

Solution

Let f = Σ_i a_i X^i. Then,
\[
\varphi\left( \sum_i a_i \alpha^i \right) = \sum_i \varphi(a_i \alpha^i) = \sum_i \varphi(a_i) \varphi(\alpha)^i = \sum_i a_i \varphi(\alpha)^i
\]
since ϕ fixes K. (Note that we have used the fact that the sum is finite here; it is not true in
general that embeddings commute with power series.)

Exercise 6.2.3∗ . Let α ∈ L be an element and σ ∈ EmbK (L) be an embedding. Prove that σ(α) is
a conjugate of α.

Solution

Let f be the minimal polynomial of α. By Exercise 6.2.2∗ , we have

0 = σ(f (α)) = f (σ(α))

so σ(α) is a conjugate of α. 

Exercise 6.2.4∗ . Prove that an embedding is injective.

Solution

Suppose that α ≠ β. Then,
\[
\sigma(\alpha - \beta)\, \sigma\left( \frac{1}{\alpha - \beta} \right) = \sigma(1) = 1
\]
so σ(α − β) = σ(α) − σ(β) is non-zero which means that σ(α) ≠ σ(β). (In algebraic terms,
we have shown that the kernel is trivial to prove that the morphism is injective. See
Exercise A.2.13∗.)

Exercise 6.2.5. Solve Problem 6.2.1 without field theory, i.e. using only the content of Chapter 1.

Solution

One could proceed as follows. Let S denote the sum of the α_i. Choose a k ∈ [n]; we shall prove
that α_k is rational. For this, consider
\[
\alpha_k = S - \sum_{i \ne k} \alpha_i.
\]
The fundamental theorem of symmetric polynomials tells us that each conjugate α′_k of α_k has
the form S − Σ_{i≠k} α′_i where α′_i is some conjugate of α_i. Thus, we have
\[
\sum_{i=1}^{n} \alpha_i = S = \sum_{i=1}^{n} \alpha_i'.
\]
Since the α_i are maximal among their conjugates, proceeding as in the field theory solution, we
get α′_i = α_i for each i. But since α′_k was an arbitrary conjugate of α_k, this means α_k has only
one conjugate, i.e. α_k ∈ Q. You can see that this was a lot messier than with field theory!

Exercise 6.2.6∗. Check that N_{Q(∛2)}(a + b∛2 + c∛4) = a^3 + 2b^3 + 4c^3 − 6abc.

Solution
Let j be a primitive cube root of unity. The norm of a + b∛2 + c∛4 is
\[
(a + b\sqrt[3]{2} + c\sqrt[3]{4})(a + bj\sqrt[3]{2} + cj^2\sqrt[3]{4})(a + bj^2\sqrt[3]{2} + cj\sqrt[3]{4}).
\]
This is
\[
\begin{aligned}
& a^3 + (b\sqrt[3]{2})^3 + (c\sqrt[3]{4})^3 + (1 + j + j^2)\left( a(b\sqrt[3]{2})^2 + a(c\sqrt[3]{4})^2 \right) + (1 + j + j^2)\left( a^2 b\sqrt[3]{2} + a^2 c\sqrt[3]{4} \right) \\
& \quad + (1 + j + j^2)\left( (b\sqrt[3]{2})^2 (c\sqrt[3]{4}) + (b\sqrt[3]{2})(c\sqrt[3]{4})^2 \right) + 3(j + j^2)\, a (b\sqrt[3]{2})(c\sqrt[3]{4}) \\
& = a^3 + 2b^3 + 4c^3 + 0 + 0 + 0 - 3(2abc) \\
& = a^3 + 2b^3 + 4c^3 - 6abc.
\end{aligned}
\]

Remark 6.2.1
It is perhaps easier to use the definition of the norm as a determinant (see Remark C.3.5). One
can check that the matrix of the linear map x ↦ (a + b∛2 + c∛4)x in the basis 1, ∛2, ∛4 is
\[
\begin{pmatrix} a & 2c & 2b \\ b & a & 2c \\ c & b & a \end{pmatrix}
\]
which has determinant
\[
a\left( a^2 - 2bc \right) - 2c\left( ba - 2c^2 \right) + 2b\left( b^2 - ac \right) = a^3 + 2b^3 + 4c^3 - 6abc.
\]
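The determinant computation of this remark can be double-checked mechanically. The sketch below builds the multiplication matrix in the basis 1, ∛2, ∛4 and compares its determinant with the closed formula on a grid of integer triples; the function names are made up for this illustration.

```python
def multiplication_matrix(a, b, c):
    """Matrix of x -> (a + b*t + c*t^2) * x in the basis 1, t, t^2, where t^3 = 2."""
    return [[a, 2 * c, 2 * b],
            [b, a, 2 * c],
            [c, b, a]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def norm_formula(a, b, c):
    return a**3 + 2 * b**3 + 4 * c**3 - 6 * a * b * c
```

For example, the element 1 + ∛2 (a = b = 1, c = 0) has norm 3.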


 

6.3 Galois Theory


Exercise 6.3.1∗ . Check that K(α1 , . . . , αn )/K is Galois and prove that any Galois extension has this
form.

Solution

K(α1 , . . . , αn )/K is Galois because K-embeddings send αi to some other αj so send


K(α1 , . . . , αn ) to itself and are thus all automorphisms. Conversely, if L/K is Galois, let α
be a primitive element for L and α1 , . . . , αn its conjugates. Then,

L = K(α) = K(α1 , . . . , αn )

since L/K is Galois. 

Exercise 6.3.2∗ . Can you write the Galois group of a quadratic extension L/K in a way that doesn’t
depend on L or K? (More specifically, show that the Galois groups of quadratic extensions are all
isomorphic.)

Solution

The embeddings of a quadratic extension K(√d)/K are the identity and the conjugation
a + b√d ↦ a − b√d. The Galois group is isomorphic to Z/2Z (with addition): the identity is sent
to 0 and the conjugation to 1: conjugation composed with conjugation gives the identity, i.e.
1 + 1 = 0. (Coincidentally the only group with two elements is Z/2Z so we could have directly
concluded that they were isomorphic.) 

Exercise 6.3.3∗ . Check that the Galois group is a group under composition. (You may assume that
each element has an inverse, this will be proven later as a corollary of Theorem 2.5.1.)

Solution

There are two things to check: that the operation is associative and that it has an identity. The
former is trivial since composition is associative, and the latter too since σ ◦ id = id ◦ σ = σ for
any embedding σ. 

Exercise 6.3.4. Let L/K be Galois and K ⊆ M ⊆ L be an intermediate field. Prove that Emb_K(M)
is a system of representatives of Gal(L/K)/Gal(L/M), where the quotient means Gal(L/K) modulo
Gal(L/M), i.e. we say σ′ ≡ σ if σ^{−1} ∘ σ′ ∈ Gal(L/M). (Our quotient A/B is more commonly thought
of as the set of right-cosets of B in A, i.e. the sets Ba for a ∈ A (which we just wrote as a in our
case).) (See also Exercise A.3.14†.)

Solution

Extend any embedding σ ∈ Emb_K(M) to an embedding σ′ ∈ Emb_K(L) = Gal(L/K). This is
a well defined map from Emb_K(M) to Gal(L/K)/Gal(L/M): if ϕ and ψ are equal on M, then
ψ^{−1} ∘ ϕ is the identity on M so is in Gal(L/M) as wanted.

This map is clearly injective; to show that it is bijective we just follow our argument in the other
direction. Let σ ∈ Gal(L/K)/Gal(L/M). We prove that its image on M is well defined: if σ′ ≡ σ
then σ and σ′ have the same images on M since σ^{−1} ∘ σ′ is the identity on M.

Exercise 6.3.5. Prove Proposition 6.2.4 using Exercise 6.3.4. (This is a bit technical.)

Solution
We have N_{M/K} = ∏_{σ∈Emb_K(M)} σ and
\[
N_{L/K} \circ N_{M/L} = \prod_{\varphi \in \mathrm{Emb}_K(L)} \varphi \circ \prod_{\psi \in \mathrm{Emb}_L(M)} \psi.
\]
We would like to say that this is ∏_{ϕ,ψ} ϕ ∘ ψ and then show that the ϕ ∘ ψ correspond to the
K-embeddings of M but there is one problem: im ψ is in general not contained in L, the domain
of ϕ. Thus, let F be the Galois closure of M, i.e., if M = K(α), then F = K(α_1, . . . , α_n)
where α_1, . . . , α_n are the conjugates of α. Using Exercise 6.3.4, we extend embeddings of L to
embeddings of F.

Let G_K = Gal(F/K), G_M = Gal(F/M) and G_L = Gal(F/L). By Exercise 6.3.4, we have
N_{M/K} = ∏_{σ∈G_K/G_M} σ and
\[
N_{L/K} \circ N_{M/L} = \prod_{\varphi \in G_K/G_L} \varphi \circ \prod_{\psi \in G_L/G_M} \psi = \prod_{\varphi \in G_K/G_L,\ \psi \in G_L/G_M} \varphi \circ \psi.
\]

To conclude, we prove that if the ϕ_i are a system of representatives of G_K/G_L and the ψ_j of
G_L/G_M, then the ϕ_i ∘ ψ_j are a system of representatives of G_K/G_M. By looking at the
cardinalities, it suffices to show that they are distinct in G_K/G_M. Thus, suppose that

ϕ′ψ′ ≡ ϕψ (mod G_M) ⟺ ϕ^{−1}ϕ′ψ′ ≡ ψ (mod G_M).

If we look at this modulo G_L, we get ϕ^{−1}ϕ′ = id, i.e. ϕ = ϕ′. Then if we look at the remainder
modulo G_M, we have ψ′ = ψ which shows what we wanted.

Exercise 6.3.6∗. Compute σ_(0,−1) ∘ σ_(1,1) and σ_(1,1) ∘ σ_(0,−1).

Solution

This follows from the fact that σ_(a,b) ∘ σ_(c,d) = σ_(a+bc,bd). Indeed, σ_(a,b) ∘ σ_(c,d) sends ζ to

σ_(a,b)(ζ^d) = (ζ^d)^b = ζ^{bd}

and ∛2 to

σ_(a,b)(ζ^c ∛2) = (ζ^c)^b · ζ^a ∛2 = ζ^{a+bc} ∛2.

Thus,
σ_(0,−1) ∘ σ_(1,1) = σ_(0−1·1,−1·1) = σ_(−1,−1)
but
σ_(1,1) ∘ σ_(0,−1) = σ_(1+1·0,1·(−1)) = σ_(1,−1).
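The composition rule can also be obtained by tracking what happens to the generators. In the sketch below, which assumes the convention read off from this solution (σ_(a,b) sends ζ to ζ^b and ∛2 to ζ^a ∛2), a monomial ζ^e (∛2)^f is stored as the pair (e mod 3, f); the encoding and function names are choices made for this illustration.

```python
def apply(sigma, monomial):
    """Image of zeta^e * t^f under sigma_(a,b), where t is the real cube root of 2."""
    a, b = sigma
    e, f = monomial
    return ((b * e + a * f) % 3, f)

def compose(sigma, tau):
    """Label (a, b) of sigma o tau, read off from the images of t and zeta."""
    zeta_image = apply(sigma, apply(tau, (1, 0)))
    theta_image = apply(sigma, apply(tau, (0, 1)))
    return (theta_image[0], zeta_image[0])
```

Labels are reduced modulo 3, so (−1, −1) appears as (2, 2) and (1, −1) as (1, 2). The two orders of composition give different results, so the Galois group is not abelian.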


Exercise 6.3.7∗ . Prove that ei (σ1 (α), . . . , σk (α)) is fixed by H for any i.

Solution

We shall prove that any symmetric polynomial f evaluated at the σ_i(α) is fixed by H (this is in fact
equivalent to what we need to prove by the fundamental theorem of symmetric polynomials).
For this, note that
σ(f(σ_1(α), . . . , σ_k(α))) = f(σσ_1(α), . . . , σσ_k(α))
by a slightly generalised version of Exercise 6.2.2∗. Since σ_i ↦ σσ_i is a permutation of H (since
H is a group!) and f is symmetric, this is exactly equal to f(σ_1(α), . . . , σ_k(α)).

Exercise 6.3.8∗ . Given two subfields A and B of a field L, define their compositum or composite
field AB as the smallest subfield of L containing both A and B (in other words, the field generated
by A and B). Let L/K be a finite Galois extension and A, B be intermediate fields. Prove that
Gal(L/AB) = Gal(L/A) ∩ Gal(L/B).

Solution

Note that any embedding which fixes AB fixes both A and B. Hence, Gal(L/AB) ⊆
Gal(L/A) ∩ Gal(L/B). Conversely, if A = K(α) and B = K(β), then AB = K(α, β) and it is
clear that the embeddings which fix both α and β fix AB. 

Exercise 6.3.9∗. Given two subgroups H_1, H_2 of a group H, define the subgroup they generate,
⟨H_1, H_2⟩, as the smallest subgroup containing both H_1 and H_2. Let L/K be a finite Galois extension
and A, B be intermediate fields. Prove that Gal(L/A ∩ B) = ⟨Gal(L/A), Gal(L/B)⟩.

Solution

Note that any embedding which fixes A or B fixes A ∩ B. Hence, ⟨Gal(L/A), Gal(L/B)⟩ ⊆
Gal(L/A ∩ B) (if a group contains two subgroups it also contains the subgroup generated by
them, by definition). Conversely, if ⟨Gal(L/A), Gal(L/B)⟩ fixes all of M, then M is fixed both
by the embeddings fixing A and by the embeddings fixing B, which implies that M ⊆ A ∩ B. This
shows that Gal(L/A ∩ B) ⊆ ⟨Gal(L/A), Gal(L/B)⟩.

Exercise 6.3.10∗ . Prove Proposition 6.3.2.

Solution

If H_1 ⊆ H_2 then L^{H_1} is fixed by fewer embeddings than L^{H_2} so has more elements (any element
fixed by the embeddings of H_2 is also fixed by the embeddings of H_1). Conversely, if M_1 ⊆ M_2,
there are more embeddings which fix M_1 than M_2 since any embedding which fixes M_2 also fixes
M_1.

Exercise 6.3.11∗. Let L/K be a finite Galois extension and let M be an intermediate field. Prove
that, for any σ ∈ Gal(L/K), Gal(L/σM) = σ Gal(L/M) σ^{−1}. Deduce that the intermediate fields
which are also Galois (over K) are the M = L^H where H is a normal subgroup of G = Gal(L/K),
meaning that σHσ^{−1} = H for any σ ∈ G. In particular, if L/K is abelian, meaning that its Galois
group is, any intermediate field is Galois over K.

Solution

If ϕ fixes M, then σϕ sends M to σM and thus σϕσ^{−1} fixes σM as wanted (this
is of course reversible). M is Galois over K iff it is fixed under conjugation, i.e. σM = M
for any σ ∈ Gal(L/K). By the fundamental theorem of Galois theory, this is equivalent to
Gal(L/M) = Gal(L/σM) = σ Gal(L/M) σ^{−1}. This proves the second part. For the third, simply
note that σHσ^{−1} = σσ^{−1}H = H when G is abelian.

Exercise 6.3.12∗ . Fill in the details of this proof of the quadratic reciprocity law.

Solution
Here is a summary of the proof. First, we prove that √q∗ ∈ Q(ω_q) where ω_q is a primitive qth root
of unity, say √q∗ = f(ω_q). Then, we assume that f ∈ Z_(p)[X], where Z_(p) denotes the rational
numbers with non-negative p-adic valuation. This follows for instance from Exercise 3.5.26†.
After that, we apply the Frobenius morphism on both sides to get
\[
\sigma_p(\sqrt{q^*}) = f(\omega_q^p) \equiv (\sqrt{q^*})^p,
\]
i.e.
\[
\left( \frac{p}{q} \right) \sqrt{q^*} \equiv (\sqrt{q^*})^p
\]
since the Galois group of Q(ω_q)/Q(√q∗) is {σ_k | (k/q) = 1}, i.e. these embeddings fix √q∗ and
the others negate it. To see that this is necessarily the Galois group of Q(ω_q)/Q(√q∗), notice
that the only subgroup of cardinality (q−1)/2 of Z/(q − 1)Z is 2Z/(q − 1)Z, which corresponds to the
quadratic residues once we raise a primitive root to these powers (since primitive roots are what
give us an isomorphism Z/(q − 1)Z ≃ (Z/qZ)^×).

Now that we have this equality, we can rewrite it as
\[
\left( \frac{p}{q} \right) \equiv \left( (-1)^{\frac{q-1}{2}} q \right)^{\frac{p-1}{2}} \pmod p,
\]
i.e.
\[
\left( \frac{p}{q} \right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left( \frac{q}{p} \right)
\]
as wanted.
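Both the intermediate congruence and the final statement can be tested on small odd primes. The sketch below computes Legendre symbols with Euler’s criterion; the function names and the list of primes are choices made for this illustration.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p not dividing a, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def intermediate_step_holds(p, q):
    """(p/q) = ((-1)^((q-1)/2) * q)^((p-1)/2) mod p."""
    q_star = (-1) ** ((q - 1) // 2) * q
    return (legendre(p, q) - pow(q_star, (p - 1) // 2, p)) % p == 0

def reciprocity_holds(p, q):
    """(p/q) = (-1)^((p-1)/2 * (q-1)/2) * (q/p)."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) == sign * legendre(q, p)
```

For instance, 3 is not a square modulo 7 and 7 ≡ 1 is a square modulo 3, which matches the sign (−1)^{1·3} = −1.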

Exercise 6.3.13∗ . Convince yourself of this solution.

Solution

Here is the proof written in the correct order. We pick a primitive element r of Q(ω)^H, say
r = ∏_{h∈H} (u − ω^h) for some u ∈ Z by ??, and consider its minimal polynomial f. Then, let p be
any prime not dividing n and consider an element ζ ∈ F̄_p of order n. If p (mod n) ∈ H, then
\[
\rho_k = \prod_{h \in H} \left( u - \zeta^{kh} \right)
\]
is in F_p since
\[
\rho_k^p = \prod_{h \in H} \left( u - \zeta^{khp} \right) = \rho_k
\]
as h ↦ hp is a permutation of H since p (mod n) ∈ H. By the fundamental theorem of symmetric
polynomials,
\[
\prod_{h \in H} \left( X - f(\rho_h) \right) \equiv \prod_{h \in H} \left( X - f(\sigma_h(r)) \right) = X^{\varphi(n)} \pmod p
\]
so f has all its roots ρ_h in F_p as asserted.

Now, suppose for the sake of a contradiction that infinitely many p ≡ m (mod n) are such that
f has a root in F_p. Since the roots of f are still the ρ_h, this means that, for some h ∈ H, ρ_h ∈ F_p
for infinitely many p ≡ m (mod n). Since ρ_h ∈ F_p is equivalent to ρ_h^p = ρ_h and ρ_h^p = ρ_{hp} = ρ_{hm},
we get
ρ_{hm} − ρ_h = 0
for infinitely many p. As a consequence, the product
\[
\prod_{g \in (\mathbb{Z}/n\mathbb{Z})^\times / H} \left( \rho_{ghm} - \rho_{gh} \right) \equiv \prod_{g \in (\mathbb{Z}/n\mathbb{Z})^\times / H} \left( \sigma_{ghm}(r) - \sigma_{gh}(r) \right) \pmod p
\]
is divisible by infinitely many primes p by the fundamental theorem of symmetric polynomials.
This implies that σ_{ghm}(r) = σ_{gh}(r) for some g, i.e. ghm and gh are in the same coset, which is
false since m ∉ H. (We can also assume that g = 1 by replacing ω by ω^g.)

Exercise 6.3.14∗ . Prove that the identity of a group is unique.

Solution

If e and e′ are two identities then e = ee′ = e′ so they are equal.

Exercise 6.3.15∗ . Prove the following refinement of Theorem 2.5.1: if G is a finite group and H a
subgroup of G, |H| divides |G|. Why does it imply Theorem 2.5.1?

Solution

Partition G into left-cosets aH, a ∈ G. Each coset has cardinality |H|, and two distinct cosets
are disjoint so this is indeed a partition: if ag = bh with g, h ∈ H, then a = bhg^{−1} so aH = bH.
Thus, the cardinality of a coset divides the cardinality of the union, i.e. |H| divides |G|. When
H is the subgroup generated by an element g, this means that the order of g divides the order
of G. 

6.4 Splitting of Polynomials


Exercise 6.4.1. Does there exist an a ≢ 1 (mod n) such that any non-constant f ∈ Z[X] has infinitely
many prime factors congruent to a modulo n?

Solution

No, a counterexample is f = Φn . 
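The reason Φ_n is a counterexample is that a prime factor of Φ_n(x) either divides n or is congruent to 1 modulo n. The sketch below illustrates this for prime n, where Φ_n = 1 + X + . . . + X^{n−1}; restricting to prime n is a simplification made for this example only.

```python
def prime_factors(n):
    """Set of prime factors of a positive integer, by trial division."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def cyclotomic_prime_index(n, x):
    """Phi_n(x) for a prime n, i.e. 1 + x + ... + x^(n - 1)."""
    return sum(x**i for i in range(n))

def factors_are_one_mod_n(n, x):
    """True if every prime factor of Phi_n(x) equals n or is congruent to 1 mod n."""
    return all(p == n or p % n == 1
               for p in prime_factors(cyclotomic_prime_index(n, x)))
```

For example Φ_5(2) = 31 and Φ_5(6) = 1555 = 5 · 311, with 31 ≡ 311 ≡ 1 (mod 5).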

6.5 Exercises
Field and Galois Theory
Exercise 6.5.1† . Let L/K be a finite separable extension of prime degree p. If f ∈ K[X] has prime
degree q and is irreducible over K but reducible over L, then p = q.

Solution

Let α be a root of f . On the one hand, [L(α) : K] = [L(α) : K(α)][K(α) : K] is divisible by q.


On the other hand,
[L(α) : K] = [L(α) : L][L : K].
The first factor is smaller than q so not divisible by q, and the second is p. Thus, q | p, i.e.
p = q. 

Exercise 6.5.2†. Let L/K be a finite Galois extension and M/K be a finite extension. Prove that
Gal(LM/M) ≃ Gal(L/L ∩ M). In particular, [LM : M] = [L : L ∩ M]. Conclude that, if L/K and
M/K are Galois, we have
[LM : K][L ∩ M : K] = [L : K][M : K].

Solution

For the first part, let L = K(α). Consider the restriction map ϕ : Gal(LM/M) → Gal(L/K)
(an element of Gal(LM/M) is a function σ : LM → LM fixing M, which can be restricted to a
function σ : L → L fixing K). This is injective, since σ ∈ Gal(LM/M) is determined by its value
at α, and the same goes for σ ∈ Gal(L/K). We wish to show that the image of this restriction is
Gal(L/L ∩ M) (thus corresponding to an isomorphism between Gal(LM/M) and Gal(L/L ∩ M)
as wanted).

Notice for this that
\[
L^{\varphi \mathrm{Gal}(LM/M)} = \{ x \in L \mid \sigma(x) = x \ \forall \sigma \in \varphi \mathrm{Gal}(LM/M) \} = \{ x \in L \mid \sigma(x) = x \ \forall \sigma \in \mathrm{Gal}(LM/M) \} = M \cap L
\]
since Gal(LM/M) fixes exactly M but here we restrict it to L.

For the second part, we have [LM : K] = [LM : L][L : K] by the tower law and [LM : L] =
[M : L ∩ M] by the first part (with the roles of L and M exchanged). Now,
[M : L ∩ M] = [M : K]/[L ∩ M : K] by the tower law again. Thus, we conclude that
\[
[LM : K] = [LM : L][L : K] = [M : L \cap M][L : K] = \frac{[L : K][M : K]}{[L \cap M : K]}
\]
as wanted.

Exercise 6.5.3† . Prove that, for any n, there is a finite Galois extension K/Q such that Gal(K/Q) '
Z/nZ.

Solution

Pick a prime p ≡ 1 (mod n), let ω be a primitive pth root of unity, and set K = Q(ω). Note
that, since K has abelian Galois group, every subfield of K is also Galois over Q. We wish
to find a subfield L such that Gal(L/Q) ≃ Z/nZ. By Exercise 6.3.4, we have Gal(L/Q) ≃
Gal(K/Q)/Gal(K/L), so we want to find a subgroup H of Z/(p − 1)Z such that
\[
\left( \mathbb{Z}/(p-1)\mathbb{Z} \right) / H \simeq \mathbb{Z}/n\mathbb{Z}.
\]
Now, note that the subgroups of Z/(p − 1)Z have the form d · Z/(p − 1)Z ≃ Z/((p−1)/d)Z, and that
(e.g. by Exercise A.3.15†)
\[
\left( \mathbb{Z}/(p-1)\mathbb{Z} \right) / \left( d \cdot \mathbb{Z}/(p-1)\mathbb{Z} \right) \simeq \mathbb{Z}/d\mathbb{Z}.
\]
Thus, if H is the subgroup of Gal(K/Q) corresponding to n · Z/(p − 1)Z, we get Gal(K^H/Q) ≃
Z/nZ as wanted.

Remark 6.5.1
If we combine the structure of units from Exercise 3.5.18† with the structure of finite abelian
groups from Exercise A.3.19† , we get that every finite abelian group is a Galois group Gal(K/Q)
for some number field K.

Exercise 6.5.4† (Cayley’s Theorem). Let G be a finite group. Prove that it is a subgroup of Sn for
some n. Conclude that there is a finite Galois extension L/K of number fields such that G ' Gal(L/K).
(This is part of the inverse Galois problem. So far, it has only been conjectured that we can choose
K = Q.)

Solution

We claim that G ⊆ S_{|G|}. Indeed, left-multiplication by g defines a permutation s_g of G, and it
is clear that g ↦ s_g is an isomorphism onto its image: s_g ∘ s_{g^{−1}} = id and

s_g ∘ s_h = x ↦ hx ↦ ghx = s_{gh}.

For the second part, we can consider an L such that Gal(L/Q) ≃ S_{|G|} and then take K = L^G
since G is a subgroup of S_{|G|} by Cayley’s theorem. To prove that such an L exists without
invoking Exercise 6.5.22†, one can consider a prime number p ≥ n and a subgroup S ≃ S_n of S_p.
By Exercise 6.5.21†, there exists a number field M Galois over Q such that Gal(M/Q) ≃ S_p.
Indeed, it is clear that there exists a polynomial of degree p with real coefficients and exactly
two non-real roots. We can then refine it to an irreducible polynomial with rational coefficients
by replacing its coefficients by close rational numbers of p-adic valuation 1, except for its leading
coefficient which we replace by a close rational number of p-adic valuation 0. This gives us a
polynomial irreducible over Q by Eisenstein’s criterion. Finally, we pick L = M^S.

Exercise 6.5.5† (Dedekind’s Lemma). Let L/K be a finite separable extension in characteristic 0.
Prove that the K-embeddings of L are linearly independent.

Solution

Suppose for the sake of a contradiction that a non-zero linear combination annihilates the em-
beddings:
a1 σ1 + . . . + ak σk = 0
and pick k to be minimal. Pick an element a ∈ L such that σ_1(a) ≠ σ_k(a). Then,

a1 σ1 (ax) + . . . + ak σk (ax) = 0

for all x ∈ L by assumption, but this is also

a1 σ1 (a)σ1 (x) + . . . + ak σk (a)σk (x)



so we conclude that
\[ a_1\left(1 - \frac{\sigma_1(a)}{\sigma_k(a)}\right)\sigma_1 + \ldots + a_{k-1}\left(1 - \frac{\sigma_{k-1}(a)}{\sigma_k(a)}\right)\sigma_{k-1} = 0, \]

contradicting the minimality of k. 

Exercise 6.5.6† (Hilbert’s Theorem 90). Suppose L/K is a cyclic extension in characteristic 0, meaning
its Galois group Gal(L/K) ≃ (Z/nZ, +) for some n (like Gal(F_{p^n}/F_p) or Gal(Q(exp(2iπ/p))/Q)).
Prove that α ∈ L has norm 1 if and only if it can be written as β/σ(β) for some β ∈ L, where σ is a
generator of the Galois group (element of order n).

Solution

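It is clear that β/σ(β) has norm 1 for any β. Now suppose α has norm 1 and let σ ∈ Gal(L/K)
be a generator. By Exercise 6.5.5†, pick a γ such that
\[ \beta = \sum_{k=0}^{n-1} \sigma^k(\gamma) \prod_{i=0}^{k-1} \sigma^i(\alpha) \]
is non-zero. Then,
\begin{align*}
\alpha\sigma(\beta) &= \alpha \sum_{k=0}^{n-1} \sigma^{k+1}(\gamma) \prod_{i=0}^{k-1} \sigma^{i+1}(\alpha) \\
&= \sigma^n(\gamma)\,\alpha\sigma(\alpha) \cdot \ldots \cdot \sigma^{n-1}(\alpha) + \sum_{k=1}^{n-1} \sigma^k(\gamma) \prod_{i=0}^{k-1} \sigma^i(\alpha) \\
&= \gamma N_{L/K}(\alpha) + \sum_{k=1}^{n-1} \sigma^k(\gamma) \prod_{i=0}^{k-1} \sigma^i(\alpha) \\
&= \beta
\end{align*}
since N_{L/K}(α) = 1 by assumption. Thus, α = β/σ(β) as wanted.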

Remark 6.5.2

This theorem gives us very interesting corollaries such as Exercise 7.5.11†: $x + y\sqrt{d}$ has norm 1 iff
\[ x + y\sqrt{d} = \frac{a + b\sqrt{d}}{a - b\sqrt{d}} = \frac{a^2 + db^2}{a^2 - db^2} + \frac{2ab}{a^2 - db^2}\sqrt{d}. \]
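
The parametrisation in this remark is easy to test with exact rational arithmetic. The following sketch (the function name and the sample values a = 3, b = 1, d = 2 are ours) computes x and y from a and b and lets us check that x² − dy² = 1.

```python
from fractions import Fraction

def norm_one_element(a, b, d):
    """Return (x, y) with x + y*sqrt(d) = (a + b*sqrt(d)) / (a - b*sqrt(d))."""
    denom = Fraction(a * a - d * b * b)      # norm of a - b*sqrt(d), assumed non-zero
    x = (a * a + d * b * b) / denom
    y = 2 * a * b / denom
    return x, y

x, y = norm_one_element(3, 1, 2)             # beta = 3 + sqrt(2)
norm = x * x - 2 * y * y                     # norm of x + y*sqrt(2)
```

For these values we get x = 11/7 and y = 6/7, and indeed (11/7)² − 2(6/7)² = 1.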

Exercise 6.5.8† (Lüroth’s Theorem). Let K be a field and L a field between K and K(T ). Prove
that there exists a rational function f ∈ K(T ) such that L = K(f ).

Solution

The proof given here is taken from Bergman [3]. Without loss of generality, suppose that L ≠ K.
Given an element h ∈ K(T)[X], express it as h = f(T, X)/g(T) with coprime f, g ∈ K[T, X] (g
is constant in X) and define its height ht(h) as max(deg_T(f), deg_T(g)). Now, pick any non-constant element
of minimal height u = f(T)/g(T) ∈ L. We will prove that
\[ f(X) - u g(X) \]
is the minimal polynomial of T both over L and K(u). This implies that [K(T) : K(u)] = [K(T) : L]
and K(u) ⊆ L, i.e. L = K(u) as desired.

Without loss of generality, we may assume that deg f ≠ deg g, by replacing u by u + t where
t ∈ K is such that f + tg has degree less than deg f if deg f = deg g. Similarly, we can assume that
deg f > deg g by replacing u by 1/u if necessary. Finally, by multiplying u by a constant, we can
suppose that f and g are monic. That way, the polynomial f(X) − ug(X) is monic in X of degree
deg f. Note that, when c = a(T, X)/b(T) is monic in X, b(T) divides the leading coefficient in X
of a so deg_T a ≥ deg_T b which implies that ht(c) = deg_T(a). In particular, ht(cd) = ht(c) + ht(d)
for polynomials c, d monic in X.

Suppose now that f(X) − ug(X) is divisible by another monic polynomial $\pi = \sum_i u_i X^i \in L[X]$.
We have
\[ \operatorname{ht}(\pi(X)) \ge \operatorname{ht}(u_k) \ge \operatorname{ht}(u) = \operatorname{ht}(f(X) - ug(X)) \]
where k is chosen so that u_k is non-constant. Hence, we conclude that, if f(X) − ug(X) =
π(X)τ(X), we have ht(τ) = 0, i.e. τ = h(X) ∈ K[X]. This h divides both f and g: indeed, 1
and u are linearly independent since u ∉ K so f/h − ug/h ∈ K(T)[X] implies that f/h and g/h
are in K[X]. This is of course impossible if τ is non-constant since f and g are coprime. Hence,
f(X) − ug(X) is the minimal polynomial of T over L.

The second part is a lot easier. Suppose that $\pi = \sum_i f_i(X) u^i \in K[u][X]$ with $f_i \in K[X]$ vanishes
at X = T. We will prove that deg_X π ≥ ht(u) = deg f, thus showing that f(X) − ug(X) is the
minimal polynomial of T over K(u). Let m = deg_u π. We have
\[ 0 = g(T)^{m-1}\pi(T) = \frac{f_m(T) f(T)^m}{g(T)} + \sum_{i=0}^{m-1} f_i(T) f(T)^i g(T)^{m-1-i}. \]

Note that every term in the second sum is a polynomial, hence fm (T )f (T )m /g(T ) is too: g(T )
divides fm (T ) since f and g are coprime so

degX π ≥ deg fm ≥ deg f = ht u

as claimed. 

nth Roots
Exercise 6.5.9† . Let K be a field, p a prime number, and α an element of K. Prove that X p − α is
irreducible over K if and only if it has no root.

Solution

Suppose that X^p − α is reducible for some α ≠ 0. We will show that α is a pth power in K. Let
f be a non-trivial factor of X^p − α, say of degree k ∈ [p − 1]. Its constant term has the form
ωα^{k/p} by Vieta’s formula (we are working in a field extension where X^p − α splits here), where
ω is a pth root of unity. Thus, α^k := β^p is a pth power. If m is the inverse of k modulo p, say
mk = np + 1, we get
\[ \alpha = \frac{\alpha^{mk}}{\alpha^{np}} = \left(\frac{\beta^m}{\alpha^n}\right)^p \]
as wanted.

Exercise 6.5.10† . Let f ∈ K[X] be a monic irreducible polynomial and p a rational prime. Suppose
that (−1)deg f f (0) is not a pth power in K. Prove that f (X p ) is also irreducible.

Solution

Suppose that f(X^p) is reducible and let α be a root of f. By Lemma 6.1.1, X^p − α is reducible
over K(α). By Exercise 6.5.9†, α is a pth power in K(α),
say α = g(α)^p with g ∈ K[X]. Let α_1, . . . , α_n be the conjugates of α. Then, by Vieta’s formulas,
\[ (-1)^n f(0) = \prod_{k=1}^{n} \alpha_k = \prod_{k=1}^{n} g(\alpha_k)^p = \left(\prod_{k=1}^{n} g(\alpha_k)\right)^p \]

is a pth power. 

Exercise 6.5.11† (Vahlen, Capelli, Redei). Let K be a field and α ∈ K. When is X n − α irreducible
over K?

Solution

Suppose that X^m − α and X^n − α are irreducible over K, where m and n are coprime. Then, so is X^{mn} − α. Indeed,
suppose that β is a root of X^{mn} − α. Then, [K(β) : K] is divisible by [K(β^m) : K] = n and
[K(β^n) : K] = m, which implies that it’s divisible by mn as well since they are coprime. Thus,
[K(β) : K] = mn as wanted since it’s clearly at most mn.

Hence, it suffices to study the case where n = p^k is a prime power. First, suppose that p is odd.
Then, we prove by induction that $X^{p^k} - \alpha$ is irreducible if and only if α is not a pth power, the
base case being Exercise 6.5.9†. For the induction step, by Exercise 6.5.10†, if $f = X^{p^{k+1}} - \alpha$ is
reducible, then $(-1)^{p^{k+1}} f(0) = \alpha$ is a pth power. For p = 2 we get −α which could be a square
while α isn’t, except if K has characteristic 2 since we then have α = −α. Thus, we are already done
if char K = 2 so we may now suppose that char K ≠ 2.

Finally, it remains to study $X^{2^k} - \alpha$. We claim that, for k ≥ 2, this is reducible iff α is a square
or −4 times a fourth power. One implication is Sophie Germain’s identity: if α = −4β^4, then
\[ X^4 + 4\beta^4 = (X^2 + 2\beta X + 2\beta^2)(X^2 - 2\beta X + 2\beta^2). \]
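
Sophie Germain’s identity X^4 + 4β^4 = (X^2 + 2βX + 2β^2)(X^2 − 2βX + 2β^2) can be checked mechanically by multiplying out the two quadratic factors. Here is a minimal sketch with polynomials stored as coefficient lists (lowest degree first); the helper names are ours.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sophie_germain_product(beta):
    """Expand (X^2 + 2*beta*X + 2*beta^2)(X^2 - 2*beta*X + 2*beta^2)."""
    left = [2 * beta**2, 2 * beta, 1]
    right = [2 * beta**2, -2 * beta, 1]
    return poly_mul(left, right)

product = sophie_germain_product(3)   # coefficients of X^4 + 4 * 3^4
```

For β = 3 the middle coefficients cancel and only X^4 + 324 remains, so X^4 + 324 is reducible over Q even though −324 is not a square.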


It remains to prove that $X^{2^k} - \alpha$ is irreducible if α is not a square or −4 times a fourth power.
Suppose that $X^{2^{k+1}} - \alpha$ is reducible. Then, α = −β^2 for some β by Exercise 6.5.10†. Since α
is not a square, −1 isn’t as well. Let γ be a root of $X^{2^{k+1}} - \alpha$. We have $\gamma^{2^k} = i\beta$ for some
i with i^2 = −1 ∈ K. We will prove that $X^{2^k} - i\beta$ is irreducible over K(i), thus showing that
\[ [K(\gamma) : K] = [K(\gamma) : K(i)][K(i) : K] = 2^k \cdot 2 = 2^{k+1} \]
as wanted. If it were reducible, iβ would have the form −(u + vi)^2 = (v + ui)^2 for some u, v ∈ K.
This gives us u^2 − v^2 = 0 and β = 2uv so
\[ \alpha = -\beta^2 = -4u^4 \]

as wanted.

We can summarise the previous discussion as follows.


• X^n − α is irreducible iff α is not a pth power for any p | n, and not −4 times a fourth power
in the case that 4 | n and char K ≠ 2.
• As a corollary, if α is not −4 times a fourth power or 4 ∤ n or char K = 2, the minimal
polynomial of $\sqrt[n]{\alpha}$ is $X^{n/d} - \sqrt[n]{\alpha^d}$ where d | n is the greatest integer such that α^d is an nth
power.



Exercise 6.5.12. Let n ≥ 1 be an integer and ζ a primitive nth root of unity. What is the Galois
group of $\mathbb{Q}(\sqrt[n]{2}, \zeta)$ over Q?

Solution
Knowing the embeddings of $\mathbb{Q}(\zeta, \sqrt[n]{2})$ is equivalent to knowing the conjugates of $\sqrt[n]{2}$ over
Q(ζ). By Exercise 6.5.11† and Problem 6.3.3, we know that the minimal polynomial of $\sqrt[n]{2}$ is
$X^{n/2} - \sqrt{2}$ if $\sqrt{2} \in \mathbb{Q}(\zeta)$ and 2 | n, and X^n − 2 otherwise. The embeddings are thus
\[ \sigma_{(a,b)} : \begin{cases} \zeta^2 \mapsto \zeta^{2a} \\ \sqrt[n]{2} \mapsto \zeta^b \sqrt[n]{2} \end{cases} \]
for independent a ∈ Z/(n/2)Z and b ∈ Z/nZ in the first case, and
\[ \sigma_{(a,b)} : \begin{cases} \zeta \mapsto \zeta^{a} \\ \sqrt[n]{2} \mapsto \zeta^b \sqrt[n]{2} \end{cases} \]
for independent a ∈ (Z/nZ)^× and b ∈ Z/nZ in the second case. (One may note that neither of
these Galois groups is abelian for n ≥ 3.) Since $\sqrt{2}$ is in Q(ζ) iff 8 | n by Exercise 6.5.27, we
are done.

Exercise 6.5.13†. Let n ≥ 1 be an integer and p_1, . . . , p_m distinct rational primes. Prove that
\[ [\mathbb{Q}(\sqrt[n]{p_1}, \ldots, \sqrt[n]{p_m}) : \mathbb{Q}] = n^m. \]
(This is a generalisation of Exercise 4.6.24† .)

Solution

Suppose that the degree is strictly smaller than n^m, i.e. a non-trivial linear combination of
powers is zero:
\[ \sum_{i=1}^{k} b_i \sqrt[n]{a_i} = 0 \]
for non-zero rational numbers b_1, . . . , b_k and distinct nth-power-free positive integers a_1, . . . , a_k.
By multiplying this equation by $\sqrt[n]{a_1^{n-1}}$, we may assume that a_1 = 1. We will prove that b_1 = 0,
thus reaching a contradiction. Let K be a Galois extension of Q containing all $\sqrt[n]{a_i}$, with Galois
group G. Take the sum of the equations
\[ \sum_{i=1}^{k} b_i \sigma(\sqrt[n]{a_i}) = 0 \]
over σ ∈ G. Exercise 6.5.11† shows that the sum of the conjugates of $\sqrt[n]{a_i}$ is zero for i ≠ 1. Since
the action of Gal(K/Q) on Q(α) restricts to multiple copies of Emb(Q(α)) for any α ∈ K, the
sum $\sum_{\sigma \in G} \sigma(\sqrt[n]{a_i})$ is zero for any i ≠ 1. Thus, we are left with the sum
\[ 0 = \sum_{\sigma \in G} b_1 \sigma(1) = b_1 |G|, \]
which means b_1 = 0 as wanted.

Exercise 6.5.14† (Kummer Theory). Let L/K be a finite Galois extension in characteristic 0. Suppose
that Gal(L/K) ≃ Z/nZ. If K contains a primitive nth root of unity, prove that L = K(α) for some
α with α^n ∈ K.

Solution

Let σ be a generator of Gal(L/K) and let ω ∈ K be a primitive nth root of unity. If we find a
non-zero α ∈ L such that σ(α) = ωα, we are done, since, by iterating σ, we get σ^k(α) = ω^kα,
i.e. L is generated by the nth root α of some element of K (as can be seen from considering
its norm for instance). For this, consider
\[ f = \frac{X^n - 1}{X - \omega} = X^{n-1} + \omega X^{n-2} + \ldots + \omega^{n-1}. \]
We have 0 = σ^n − id = (σ − ω id) ◦ f(σ), hence, α = f(σ)(β) works for any β. We only need
to ensure that α ≠ 0, and this follows from the linear independence of the embeddings from
Exercise 6.5.5†.

Exercise 6.5.15† (Artin-Schreier Theorem). Let L/K be a finite extension such that L is algebraically
closed. Prove that [L : K] ≤ 2.

Solution

First, we prove that [L : K] is a power of 2. Suppose otherwise. Using Cauchy’s theorem and the
fundamental theorem of Galois theory, we can find an intermediate field M such that [L : M] = p
where p | [L : K] is an odd prime. Thus, suppose without loss of generality that [L : K] = p.
Consider a primitive pth root of unity ω ∈ L. Since ω has degree at most p − 1 over K, its
degree is not divisible by p which means that ω must be in K. Then, Kummer theory from
Exercise 6.5.14† implies that L = K(α) for some β = α^p ∈ K. Since α is not in K, β is not a pth
power in K. This implies that the polynomial $X^{p^2} - \beta$ is irreducible over K by Exercise 6.5.11†.
This is a contradiction since any element algebraic over K has degree at most p by assumption.

Thus, [L : K] = 2^k for some k. In particular, any polynomial of odd degree has a root in K.
Indeed, group the roots of a polynomial f ∈ K[X] in disjoint orbits of the form
\[ \sigma_{1,i}(\alpha_i), \ldots, \sigma_{k_i,i}(\alpha_i) \]
where α_i is a root of f and σ_{1,i}, . . . , σ_{k_i,i} form a set of representatives of
Gal(L/K)/ Gal(L/K(α_i)). In other words, we are simply asking that, for a fixed i, σ_{j,i}(α_i) go
through every conjugate of α_i exactly once. Then, if α_i ∉ K,
\[ k_i = |\operatorname{Gal}(L/K)/\operatorname{Gal}(L/K(\alpha_i))| = 2^k / |\operatorname{Gal}(L/K(\alpha_i))| \]
is even since |Gal(L/K(α_i))| < 2^k. If this is the case for all i, then f has an even number of roots,
contradicting the assumption that its degree is odd. (This is a generalisation of the fact that
non-real roots always come in pairs of complex conjugates.)

Accordingly, any polynomial of odd degree has a root in K. We will prove that any element
α ∈ K is such that α is a square or −α is. Assuming this, notice that these are exactly the
assumptions we used in our proof that C = R(i) is algebraically closed, in Section B.3. Hence, L = K(i)
where i^2 = −1, which means in particular that [L : K] ≤ 2 as wanted.

It remains to show this claim. For this, notice that $X^{2^{k+1}} - \alpha$ is reducible over K since L/K has
degree 2^k. By Exercise 6.5.11†, this implies that α is a square, or −4 times a fourth power, and
thus minus a square.

Constructibility and Solvability


Exercise 6.5.16†. Given two points, you are allowed to draw the line between them, as well as the
circle centred at one of the points and passing through the other. Initially, you may start with the points (0, 0)
and (0, 1) and define additional points that way. We say a real number r is constructible if the point
(0, r) is constructible. Prove that, if x and y are constructible, so are x + y, xy, −x, $\sqrt{|x|}$, and 1/x if
x ≠ 0.

Solution

Exercise 6.5.17† . Prove that a real number is constructible if and only if it is algebraic and the
degree of its splitting field, meaning the field generated by its conjugates, is a power of 2. Deduce that,
using only a (non-graded) ruler and a compass,
1. A regular n-gon is constructible if and only if ϕ(n) is a power of 2.
2. It is not always possible to trisect an angle.
3. Given a cube with volume V, it is not always possible to construct the side of a cube with volume 2V.

Solution

Note that α = cos(2π/n) is constructible iff its degree ϕ(n)/2 (for n ≥ 3) is a power of 2. In
that case, we can construct the point (cos(2kπ/n), 0) and the point (0, sin(2kπ/n)), and hence
the point (cos(2kπ/n), sin(2kπ/n)) as well. Similarly, if we are able to trisect an angle of α,
by intersecting with a vertical line we are able to construct cos(α/3) (and this is equivalent).
However, for α = 2π/3, we get cos(2π/9) which has degree 3 and is thus not constructible.

It remains to prove the characterisation of the constructible numbers. One direction is easy:
when we intersect a line with a line, the field generated by the coordinates does not change, and
when we intersect a line with a circle, the field generated by the coordinates becomes a quadratic
extension of itself or does not change. Thus, the degree over Q gets multiplied by 1 or by 2
each time, and must thus be a power of 2. It is also clear that this is Galois: each time we
take a square root, we could also take the other square root (this amounts to considering the
other intersection with the circle). The other direction is almost given by Exercise 6.5.16†. To
finish, we proceed by induction on [K : Q] = 2^k where K is the splitting field of α. By Cauchy’s
theorem 6.3.3, Gal(K/Q) has an element of order 2, and thus, K has a subfield of index 2 by the
Galois correspondence, say [K : L] = 2. By assumption, the points of L are constructible since
[L : Q] = 2^{k−1}, and if K = L(α), we can recover α from L using the quadratic formula so α is
also constructible by Exercise 6.5.16†.
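
To see the criterion "ϕ(n) is a power of 2" in action, the following sketch lists the constructible regular n-gons for small n. The function names and the bound 20 are ours; ϕ is computed naively from its definition.

```python
from math import gcd

def phi(n):
    """Euler's totient function, computed directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def is_power_of_two(k):
    return k > 0 and k & (k - 1) == 0

def constructible_polygons(limit):
    """All n in [3, limit] such that phi(n) is a power of 2."""
    return [n for n in range(3, limit + 1) if is_power_of_two(phi(n))]

small_cases = constructible_polygons(20)
```

The list contains 17 (Gauss’s heptadecagon, since ϕ(17) = 16) but neither 7 nor 9, the latter being the obstruction to trisecting the angle 2π/3.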

Exercise 6.5.18† . We say a finite Galois extension L/K in characteristic 0 is solvable by radicals if
there is a tower of extensions
K = K0 ⊂ K1 ⊂ . . . ⊂ Km ⊇ L
such that Ki+1 is obtained from Ki by adjoining an nth root of some element of Ki to Ki , for some
n. We also say a group G is solvable if there is a chain 0 = G0 ⊂ G1 ⊂ . . . ⊂ Gm = G such that Gi is
normal in Gi+1 (see Exercise 6.3.11∗ ) and Gi+1 /Gi is cyclic. Prove that L/K is solvable by radicals if
and only if its Galois group is. (When L is the field generated by the roots of a polynomial f ∈ K[X],
L/K being solvable by radicals means that the roots of f can be written with radicals, which explains
the name.)

Solution

First, suppose that G is solvable. Consider the tower of fields K_i = L^{G_i}. Each K_{i+1}/K_i is
Galois by Exercise 6.3.11∗, with Galois group isomorphic to G_{i+1}/G_i by Exercise 6.3.4. We
wish to conclude that K_{i+1} is generated from K_i by adding the nth root of an element. The
problem is that this is almost always false, since then K_{i+1}/K_i wouldn’t be a Galois extension!
Hence, we shall consider the tower of fields K_i(ω), where ω is a primitive |G|th root of unity
(since |G_{i+1}/G_i| divides |G| by Lagrange). By Exercise 6.5.2†, the Galois group of K_{i+1}(ω)/K_i(ω) is
isomorphic to the subgroup Gal(K_{i+1}/K_i(ω) ∩ K_{i+1}) of Gal(K_{i+1}/K_i), and hence cyclic as well,
say of order n. Then, Kummer theory from Exercise 6.5.14† states that K_{i+1}(ω) is K_i(ω)(α_i)
for some α_i with α_i^n ∈ K_i(ω). This gives us that L/K is solvable as wanted since we have the tower
\[ K \subseteq K(\omega) \subseteq K(\omega)(\alpha_1) \subseteq \ldots \subseteq K(\omega)(\alpha_m) = L(\omega) \supseteq L. \]

Now, we need to prove that G is solvable if L/K is. Note that, if G is solvable, then so is G/H
for any normal subgroup H. Indeed, if 0 = G_0 ⊆ . . . ⊆ G_m = G, then H/H = G_0H/H ⊆ . . . ⊆
G_mH/H = G/H and
\[ (G_{i+1}H/H)/(G_iH/H) \simeq G_{i+1}H/G_iH \]
by Exercise A.3.15†, and this is cyclic, since if gG_i generates G_{i+1}/G_i, then gG_iH generates
G_{i+1}H/G_iH.

Set M = K_m(ω), where ω is a root of unity chosen so that K_{i+1}(ω)/K is Galois for each i (and
in particular M/K is). Since Gal(L/K) ≃ Gal(M/K)/ Gal(M/L) is a quotient of Gal(M/K), it
suffices to prove that Gal(M/K) is solvable. There is only one thing left to do now: add ω to all
K_i so that K_{i+1}/K_i becomes Galois and we can simply take its Galois group. This gives us that
Gal(M/K(ω)) is solvable, but what we want is for Gal(M/K) to be. However,
\[ \operatorname{Gal}(M/K)/\operatorname{Gal}(M/K(\omega)) \simeq \operatorname{Gal}(K(\omega)/K) \]
is cyclic, so we can add it to a chain 0 = G_0 ⊆ . . . ⊆ G_m = Gal(M/K(ω)) ⊆ Gal(M/K) to conclude
that Gal(M/K) is solvable as wanted!

We shall however explain a bit more why taking Galois groups gives us cyclic extensions. Let
us write M_i = K_i(ω) = M_{i−1}(α_i) for some α_i with $\alpha_i^{n_i} \in M_{i-1}$. Then, Gal(M_{i+1}/M_i) is cyclic since its
embeddings have the form σ(α_{i+1}) = ω^kα_{i+1} for some k, so they form a subgroup of Z/nZ where n is
the order of ω, which is thus cyclic. If we let G_i = Gal(M/M_i), we have
\[ 0 = \operatorname{Gal}(M/M) = G_m \subseteq \ldots \subseteq G_0 = \operatorname{Gal}(M/K(\omega)) \]
and
\[ G_i/G_{i+1} = \operatorname{Gal}(M/M_i)/\operatorname{Gal}(M/M_{i+1}) \simeq \operatorname{Gal}(M_{i+1}/M_i) \]
by Exercise 6.3.4 which is cyclic as wanted.

Exercise 6.5.19† . Let n ≥ 1 be an integer. Prove that Sn is not solvable for n ≥ 5. Conclude
from Exercise 6.5.21† that some polynomial equations are not solvable by radicals.1 (This is quite
technical.)

Solution

The usual proof proves a lot more than just the non-solvability of S_n: it completely characterises
all its descending chains of normal subgroups. More precisely, the normal subgroups of S_n are
the symmetric group S_n, the alternating group of even permutations (Definition C.3.2) A_n, as
well as the trivial group 0 = {id}, while the only strict normal subgroup of A_n is 0 (we say it’s
simple). However, this demands a lot of work, and since this is a number theory book and not
an algebra one, we will not prove this. See Weintraub [31], appendix A, section 3 for an account
of the more general result.

1 If one only wants to show that there is no general formula, one doesn’t need to do the first part since the general
polynomial $\prod_{i=1}^{n} (X - A_i) \in \mathbb{Q}(A_1, \ldots, A_n)[X]$ already has Galois group S_n (where A_1, . . . , A_n are formal variables).

Note that if G is solvable, so is any of its subgroups H: 0 = G_0 ⊆ . . . ⊆ G_m = G becomes
\[ 0 = G_0 \cap H \subseteq \ldots \subseteq G_m \cap H = H \]
and
\[ \frac{G_{i+1} \cap H}{G_i \cap H} \simeq \frac{(H \cap G_{i+1})G_i}{G_i} \]
by the second isomorphism theorem (see Exercise A.3.15†). This is a subgroup of G_{i+1}/G_i, which
is cyclic, so it’s cyclic itself. Hence, to show that S_n is not solvable for n ≥ 5, it suffices to prove
that S_5 is not solvable. This can be done using a computer for instance, as there are only 120
elements. However, we will still present a somewhat more satisfactory proof.

We first prove that the only strict normal subgroup G of S_n such that S_n/G is abelian is A_n. Let G be
such a subgroup. Note that, in cycle notation, we have

(1, 2, 4)(1, 4, 2) = id
(1, 3, 5)(1, 5, 3) = id
(1, 2, 3) = (1, 2, 4)(1, 3, 5)(1, 4, 2)(1, 5, 3).

Hence, if we denote by f the projection S_n → S_n/G and let f((1, 2, 4)) = x and f((1, 3, 5)) = y, we get
\[ f((1, 2, 3)) = f((1, 2, 4)(1, 3, 5)(1, 4, 2)(1, 5, 3)) = xyx^{-1}y^{-1} = \mathrm{id} \]
in S_n/G which is abelian by assumption. By symmetry, G contains all 3-cycles. It remains to
prove that the 3-cycles generate A_n. Since A_n is generated by the products of two transpositions,
it suffices to prove that 3-cycles generate all products of two transpositions. This follows from
the following equalities

(i, k, j) = (i, j)(i, k)

(i, k, j)(i, k, ℓ) = (i, j)(k, ℓ)

for distinct i, j, k, ℓ.
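
The cycle identities used above depend on the composition convention (here the rightmost permutation is applied first), so it is worth checking them mechanically. A minimal sketch, with illustrative helper names of our own and the sample values i, j, k, ℓ = 1, 2, 3, 4:

```python
def cycle(n, *pts):
    """Permutation of {1, ..., n}, as a dict, sending pts[0] -> pts[1] -> ... -> pts[0]."""
    perm = {x: x for x in range(1, n + 1)}
    for a, b in zip(pts, pts[1:] + pts[:1]):
        perm[a] = b
    return perm

def compose(*perms):
    """Product of permutations; the rightmost factor is applied first."""
    result = dict(perms[-1])
    for p in reversed(perms[:-1]):
        result = {x: p[result[x]] for x in result}
    return result

# (1,2,3) as a commutator of the 3-cycles (1,2,4) and (1,3,5)
commutator = compose(cycle(5, 1, 2, 4), cycle(5, 1, 3, 5),
                     cycle(5, 1, 4, 2), cycle(5, 1, 5, 3))

# (i,k,j) = (i,j)(i,k) and (i,k,j)(i,k,l) = (i,j)(k,l) for i, j, k, l = 1, 2, 3, 4
three_cycle = compose(cycle(5, 1, 2), cycle(5, 1, 3))
double_transposition = compose(cycle(5, 1, 3, 2), cycle(5, 1, 3, 4))
```

With this convention, `commutator` equals the 3-cycle (1, 2, 3), `three_cycle` equals (1, 3, 2), and `double_transposition` equals (1, 2)(3, 4).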

Now, we need to prove that G = A5 has no non-trivial normal subgroup. For this, we shall find
the conjugacy classes Cg = {hgh−1 | h ∈ G}. We can check that there are 5 such conjugacy
classes, of respective size 1, corresponding to the identity, 15, 20, 12 and 12. Assume we have
proven this. Since a normal subgroup H is a union of conjugacy classes by definition, it must
contain the trivial class of size 1, corresponding to the identity. But then, it is easy to see that
its cardinality only divides |A5 | = 60 for H = A5 or H = {id}. Thus, by Lagrange’s theorem
from Exercise 6.3.15∗ , H must be A5 or {id} as wanted. But A5 /{id} ' A5 is not cyclic so we
are done.
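
Since A_5 only has 60 elements, the class sizes 1, 15, 20, 12, 12 and the divisibility argument can also be confirmed by brute force. The sketch below (helper names are ours) computes the conjugacy classes of A_5 directly and then lists the possible orders of a union of classes containing the identity that divide 60.

```python
from itertools import combinations, permutations

def is_even(p):
    """A permutation is even iff it has an even number of inversions."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j]) % 2 == 0

def mul(g, h):
    """Composition g∘h: apply h first, then g."""
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for i, x in enumerate(g):
        inv[x] = i
    return tuple(inv)

A5 = [p for p in permutations(range(5)) if is_even(p)]

def class_sizes(group):
    """Sorted sizes of the conjugacy classes {h g h^-1 : h in group}."""
    seen, sizes = set(), []
    for g in group:
        if g not in seen:
            cls = {mul(mul(h, g), inverse(h)) for h in group}
            seen |= cls
            sizes.append(len(cls))
    return sorted(sizes)

sizes = class_sizes(A5)
non_trivial = sizes[1:]  # drop the class of the identity
possible_orders = sorted({
    1 + sum(c)
    for r in range(len(non_trivial) + 1)
    for c in combinations(non_trivial, r)
    if 60 % (1 + sum(c)) == 0
})
```

The only orders that survive are 1 and 60, which is exactly the statement that a normal subgroup of A_5 is trivial or everything.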

To prove that these are the cardinalities of the conjugacy classes, recall Remark 6.5.3: if γ =
(i_1, . . . , i_k) is a cycle, we have
σγσ^{−1} = (σ(i_1), . . . , σ(i_k)).
This means that the conjugates of 3-cycles are all 3-cycles, thus forming the class of size 20 (it is
not hard to see that we can pick an even σ). Similarly, there are two conjugacy classes
of 5-cycles, each of size 12, because this time our σ will not always be even. It only remains to prove that all
15 products of two transpositions (i, j)(k, ℓ) with i, j, k, ℓ distinct are conjugate. This is not very
hard:
σ(i, j)(k, ℓ)σ^{−1} = (σ(i), σ(j))(σ(k), σ(ℓ)).
Hence, if (i′, j′)(k′, ℓ′) is another product of two transpositions, we pick σ(i) = i′, σ(j) = j′,
σ(k) = k′ and σ(ℓ) = ℓ′. If this has even signature we are done, otherwise exchange i′ and j′.

Finally, to conclude that some algebraic numbers are not expressible by radicals, we only need to
prove that there exist irreducible polynomials of prime degree p ≥ 5 with exactly two non-real roots. One example
is X^5 − 4X − 2 (irreducible by Eisenstein’s criterion).
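
For the quintic X^5 − 4X − 2, both properties can be checked by hand, and the following sketch merely automates the bookkeeping (the helper names and sample points are ours). Sign changes of f at a few integers give at least three real roots, while f′ = 5X^4 − 4 has exactly two real roots, so by Rolle’s theorem f has at most three real roots, hence exactly two non-real ones.

```python
def f(x):
    return x**5 - 4 * x - 2

def sign_changes(values):
    """Number of sign changes in a list of non-zero numbers."""
    signs = [v > 0 for v in values]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def eisenstein(coeffs, p):
    """Eisenstein's criterion at p; integer coefficients, constant term first."""
    *lower, lead = coeffs
    return (lead % p != 0
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)

samples = [f(x) for x in (-2, -1, 0, 2)]
real_roots_lower_bound = sign_changes(samples)          # roots in (-2,-1), (-1,0), (0,2)
irreducible = eisenstein([-2, -4, 0, 0, 0, 1], 2)       # X^5 - 4X - 2 at p = 2
```

The sampled values are −26, 1, −2 and 22, giving three sign changes, and the criterion applies at p = 2.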

Exercise 6.5.20† . We say a finite Galois extension L/K of real fields, i.e. L ⊆ R, is solvable by real
radicals if there is a tower of extensions
K = K0 ⊂ K1 ⊂ . . . ⊂ Km ⊇ L
such that Ki+1 is obtained from Ki by adjoining the nth root of some positive element of Ki to Ki .
Prove that L/K is solvable by real radicals if and only if [L : K] is a power of 2.

Solution

Without loss of generality, by adding more intermediate fields, we can suppose that K_{i+1} =
K_i(α_i), where α_i^p ∈ K_i for some prime p. The key point is that, if [L : K] is equal to an odd
prime q, then [L(α) : K(α)] is also equal to q for any real α with α^p ∈ K. Let’s see first how this implies
our result: if [L : K] is not a power of 2, say is divisible by an odd prime q, Gal(L/K) has an
element of order q by Cauchy’s theorem 6.3.3, and hence L has a subfield of index q by the Galois
correspondence, say [L : M] = q. Then, L/M is also solvable by real radicals, but
\[ [L : M] = [L(\alpha_1) : M(\alpha_1)] = [L(\alpha_1, \alpha_2) : M(\alpha_1, \alpha_2)] = \ldots, \]
i.e. [LK_i : MK_i] = [L : M] by our lemma. This is a contradiction for i = m: we get [L : L] =
[L : M]. Conversely, if [L : K] is a power of 2, say L = K(α), L has a subfield M of index 2,
i.e. [L : M] = 2. Then, α can be obtained by real radicals from M using the quadratic formula.
Repeating this process yields that L/K is solvable by square roots as well.

Hence, it suffices to show that, when [L : K] = q, [L(α) : K(α)] = q as well. Note that
Gal(L(α)/K(α)) is a subgroup of Gal(L/K), as we saw in Exercise 6.5.2† . In particular, [L(α) :
K(α)] divides q so must be 1 or q. Suppose for the sake of a contradiction that it is 1 (in
particular α 6∈ K). This gives L ⊆ K(α), so q divides [K(α) : K], which is p by Exercise 6.5.9† .
Hence, p = q and L = K(α). But this is impossible, since the conjugates of α are not in L as
primitive qth roots of unity are not real since q ≥ 3. 

Exercise 6.5.21† . Let p be a prime number and G ⊆ Sp a subgroup containing a transposition τ


(see the paragraph after Definition C.3.2) and an element γ of order p. Prove that G = Sp . Deduce
that, if f ∈ Q[X] is an irreducible polynomial of degree p with precisely two non-real complex roots,
then the Galois group of the field generated by its roots (called its splitting field , because it is a field
where it splits) over Q is Sp .

Solution

Suppose without loss of generality that τ is 1 ↔ 2 (which we usually denote in cycle notation
τ = (1, 2)). A power of γ, say γ^k, sends 1 to 2. Since γ^k also has order p, suppose without loss of generality that k = 1, i.e. that γ itself sends 1 to 2. By
symmetry between 3, 4, . . . , p, we can in fact suppose that γ is the cycle (1, 2, . . . , p) which sends
1 → 2 → . . . → p → 1. We will prove that G contains all transpositions and thus must be S_p
since transpositions generate the symmetric group by Exercise C.3.12∗. Notice that γτγ^{−1} is the
transposition (2, 3) since it goes 2 → 1 → 2 → 3 and 3 → 2 → 1 → 2 and is the identity
elsewhere since τ is. Similarly, γ^kτγ^{−k} is the transposition (k + 1, k + 2). Since
(1, k) = (1, k − 1)(k − 1, k)(1, k − 1),
a straightforward induction tells us that (1, k) ∈ G for all k. Finally, (1, i)(1, j)(1, i) = (i, j) so
G contains all transpositions as wanted.
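
For a small prime, the group-theoretic claim can be confirmed by computing the subgroup generated by a transposition and a p-cycle. The sketch below does this for p = 5, with permutations of {0, ..., p − 1} stored as tuples; the closure routine is a generic breadth-first search and the names are ours.

```python
from math import factorial

def generated_group(gens):
    """Subgroup generated by the given permutations (tuples p with p[i] the image of i)."""
    n = len(gens[0])
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = tuple(s[g[i]] for i in range(n))   # s∘g
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

p = 5
tau = (1, 0, 2, 3, 4)                              # the transposition exchanging 0 and 1
gamma = tuple((i + 1) % p for i in range(p))       # the p-cycle 0 -> 1 -> ... -> p-1 -> 0
G = generated_group([tau, gamma])
```

In a finite group, closing under left multiplication by the generators already yields the generated subgroup, and here it has p! = 120 elements, i.e. it is all of S_5.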
Now, suppose f is an irreducible polynomial of degree p with only two non-real roots. Since its
degree divides the degree of its splitting field, its Galois group G has an element of order p by
Cauchy’s theorem 6.3.3. Moreover, it contains the transposition corresponding to the complex
conjugation, which exchanges the two non-real roots. 

Exercise 6.5.22† . Let n be a positive integer. Prove that there is a number field K, Galois over
Q, such that Gal(K/Q) ' Sn . (You may assume the following result of Dedekind: if f ∈ Z[X] is a
polynomial, for any prime number p not dividing the discriminant ∆ of f , the Galois group of f over
Fp is a subgroup of the Galois group of f over Q.2 )

Solution

Let’s look at what the injection of GalFp (f ) in GalQ (f ) gives us. We use the cycle notation
σ1 · . . . · σk to mean the permutation σ which decomposes into the disjoint cycles σ1 , . . . , σk . A
cycle
σ : i1 7→ i2 7→ . . . 7→ ik 7→ i1
will be denoted (i1 , . . . , ik ) (it is the identity on other elements).

We know from Chapter 4 that GalFp (f ) is generated by the Frobenius morphism. Write f ≡
f1 ·. . .·fk (mod p) with distinct irreducible polynomials fi ∈ Fp [X] of respective degrees ni (there
is no repeated root since p doesn’t divide the discriminant). Then, the Frobenius morphism acts
as a cycle on the roots of each fi , of length ni . Hence, Frob can be written in cycle notation as
σ_1 · . . . · σ_k with σ_i a cycle of length n_i. This implies, by assumption, that Gal_Q(f) also has an
element of this form.

We have a lot of freedom on the factorisation of f modulo primes, so let’s put aside the question
of ensuring that the primes we choose do not divide the discriminant of f for the moment and
focus on the rest of the proof. Suppose that G = GalQ (f ) has an n-cycle σ, a transposition τ ,
and an (n − 1)-cycle ψ. Without loss of generality, by symmetry, suppose that ψ = (2, 3, . . . , n).
By considering σ^m τ σ^{−m} for an appropriate m, we can assume that τ is the transposition (1, k) for
some k. Indeed, by symmetry, if σ = (1, 2, . . . , n) and τ = (i, j), we have σ m τ σ −m = (i+m, j+m).
Since
ψ = (k, k + 1, . . . , n, 2, 3, . . . , k − 1),
we can, again by symmetry, suppose that in fact k = 2. To conclude, we will prove that (1, 2)
and (2, 3, . . . , n) generate Sn , thus implying that G = Sn as desired. By Exercise C.3.12∗ ,
it suffices to prove that they generate all transpositions. Since ϕ := τ ψ = (1, 2, . . . , n) ∈ G,
we can proceed as we did in Exercise 6.5.21† : (m + 1, m + 2) = ϕm τ ϕ−m ∈ G, and since
(1, m) = (1, m − 1)(m − 1, m)(1, m − 1), we have (1, m) ∈ G for all m by induction. Finally, since
(i, j) = (1, i)(1, j)(1, i), we have all transpositions and we are done: G = Sn .

It remains to prove that we can ensure that the Galois group contains an n-cycle, a transposition,
and an (n − 1)-cycle. For this, pick three primes p, q, r. Then, choose a polynomial f ∈ Z[X], using
the Chinese remainder theorem, such that
• f is irreducible modulo p,
• f factorises as a product of a polynomial of degree 1 and an irreducible polynomial of
degree n − 1 modulo q, and
• f factorises as a product of an irreducible polynomial of degree 2 and n − 2 polynomials of
degree 1 modulo r (choose r sufficiently large so that there is no common root).
Then, Gal_{F_p}(f) contains an n-cycle, Gal_{F_q}(f) an (n − 1)-cycle, and Gal_{F_r}(f) a transposition. In
addition, the discriminant of f is not divisible by p, q, r since f has no repeated factor modulo
these primes. Hence, by Dedekind’s result and our previous observation, Gal_Q(f) = S_n as
desired.

2 The Galois group of a polynomial f over a field F is defined as the Galois group of its splitting field over F , i.e. as

Gal(F (α1 , . . . , αk )/F ), where α1 , . . . , αk are the roots of f .



Remark 6.5.3
If γ = (i1 , . . . , ik ) is a cycle, then it is straightforward to see that σγσ −1 = (σ(i1 ), . . . , σ(ik )).
This explains how we found our relations.

Cyclotomic Fields
Exercise 6.5.23† . Let ω be a primitive nth root of unity. When is Φm irreducible over Q(ω)?

Solution

Let ζ be a primitive mth root of unity and let ξ be a primitive lcm(m, n)th root of unity. Φ_m is
irreducible over Q(ω) if and only if ζ has degree ϕ(m) over Q(ω). We have, by Problem 6.3.2,
\[ [\mathbb{Q}(\zeta, \omega) : \mathbb{Q}(\omega)] = [\mathbb{Q}(\xi) : \mathbb{Q}(\omega)] = \frac{[\mathbb{Q}(\xi) : \mathbb{Q}]}{[\mathbb{Q}(\omega) : \mathbb{Q}]} = \frac{\varphi(\operatorname{lcm}(m, n))}{\varphi(n)}. \]
Thus, Φ_m is irreducible over Q(ω) if and only if ϕ(lcm(m, n)) = ϕ(m)ϕ(n). Finally, note that
we always have
\begin{align*}
\varphi(m)\varphi(n) &= \prod_{p \mid m} p^{v_p(m)-1}(p-1) \prod_{q \mid n} q^{v_q(n)-1}(q-1) \\
&= \prod_{p \mid m,n} p^{\min(v_p(m), v_p(n))-1}(p-1) \prod_{q \mid mn} q^{\max(v_q(m), v_q(n))-1}(q-1) \\
&= \varphi(\operatorname{lcm}(m, n))\,\varphi(\gcd(m, n))
\end{align*}
since the p − 1 factor is repeated twice when p | m, n and only once otherwise, with exponent
v_p(m) + v_p(n) − 2 = min(v_p(m), v_p(n)) − 1 + max(v_p(m), v_p(n)) − 1 in the first case and exponent
max(v_p(m), v_p(n)) − 1 in the second. Thus, Φ_m is irreducible over Q(ω) iff ϕ(gcd(m, n)) = 1, i.e.
iff gcd(m, n) = 1 or 2.
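
The identity ϕ(m)ϕ(n) = ϕ(lcm(m, n))ϕ(gcd(m, n)) and the resulting criterion are easy to test numerically. A minimal sketch (the function names and the search bound 60 are ours):

```python
from math import gcd

def phi(n):
    """Euler's totient function via trial division."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def identity_holds(m, n):
    g = gcd(m, n)
    return phi(m) * phi(n) == phi(m * n // g) * phi(g)

def cyclotomic_stays_irreducible(m, n):
    """Criterion from the solution: phi(gcd(m, n)) = 1, i.e. gcd(m, n) is 1 or 2."""
    return phi(gcd(m, n)) == 1

all_ok = all(identity_holds(m, n) for m in range(1, 61) for n in range(1, 61))
```

For example, Φ_6 stays irreducible over Q(i) since gcd(6, 4) = 2, whereas Φ_8 does not since gcd(8, 4) = 4.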

Exercise 6.5.25† . Let n be an integer and m ∈ Z/nZ be such that m2 ≡ 1 (mod n). Prove that
there exist infinitely many primes congruent to m modulo n, provided that there exists at least one
which is greater than n2 . (It is also true that our Euclidean approach to special cases of Dirichlet’s
theorem only works for m2 ≡ 1 (mod n), see ??.)

Solution

Suppose that p > n^2 is a prime congruent to m (mod n). We have already done the case m = 1
in Exercise 3.3.8∗ and the case m = −1 in Theorem 4.4.1, so suppose m ≠ ±1. Let ω be a
primitive nth root of unity, and let H = {1, m}. Let also σ_k denote the embedding ω ↦ ω^k.
Consider g = (X − ω)(X − ω^m). For large N, we have Q(g(N)) = Q(ω)^H by Remark 6.3.2.
Now, let f be the minimal polynomial of g(N), where N is an integer that we will choose later.
Consider the discriminant
\[ \Delta = \pm \prod_i f'(\sigma_i(g(N))) \]
by Exercise 3.2.2∗. This is a polynomial of degree ϕ(n)(ϕ(n) − 1) < n^2 in N, which can’t always
be divisible by p since p > n^2 by assumption. Hence, there is some N such that this is not
divisible by p. Choose this N to also be divisible by n, using CRT.
We are now ready to finish. Suppose for the sake of a contradiction that there are a finite number
of such primes p_1 = p, p_2, . . . , p_k. Let Q be the product of the possible exceptions, i.e. the prime
divisors of f which are not congruent to 1 or m modulo n. Pick an M congruent to g(N) or
g(N) + p modulo p^2, so that v_p(f(M)) = 1 using Corollary 5.3.1. Pick also M to be divisible by
np_2 · . . . · p_kQ. Since
f(0) = ±Φ_n(N) ≡ Φ_n(0) = ±1 (mod n),
we know that the prime factors of f(0) are all congruent to 1 modulo n. Thus, the prime factors
of f(M) ≡ f(0) (mod np_2 · . . . · p_kQ) are all congruent to 1 or m modulo n since we don’t run
into an exception. Since, by assumption, the only primes congruent to m modulo n are the p_i,
this means that f(M) is divisible only by primes congruent to 1 modulo n, and potentially by p
too. Since we also have v_p(f(M)) = 1, we get f(M) ≡ ±m (mod n) depending on its sign. This is
a contradiction if m ≠ ±1 since we have f(M) ≡ f(0) ≡ ±1 (mod n).

Exercise 6.5.26† (Mann). Suppose that $\omega_1, \ldots, \omega_n$ are roots of unity such that $\sum_{i=1}^n a_i\omega_i = 0$ for some $a_i \in \mathbb{Q}$ and $\sum_{i \in I} a_i\omega_i \ne 0$ for any non-empty strict subset $I \subseteq [n]$. Prove that $\omega_i^m = \omega_j^m$ for any $i, j \in [n]$, where $m$ is the product of the primes at most $n$.

Solution

Suppose without loss of generality that $\omega_1 = 1$, by dividing everything by $\omega_1$. Next, let $m$ be the smallest positive integer such that $\omega_i^m = 1$ for all $i$, let $p$ be a prime factor of $m$ and write $m = p^k r$ with $p \nmid r$. We will prove that $p \le n$ and $k \le 1$, thus yielding the wanted result. Let $\omega = \exp\left(\frac{2i\pi}{p^k}\right)$ be a primitive $p^k$th root of unity, and write $\omega_i = \zeta^{t_i}\omega^{s_i}$ for some $s_i \le p - 1$, where $\zeta = \exp\left(\frac{2i\pi}{m/p}\right)$ is a primitive $m/p$th root of unity. Indeed, if $\omega_i = \exp\left(\frac{2\ell i\pi}{m}\right)$, we have
\[ \zeta^{t_i}\omega^{s_i} = \exp\left(2i\pi\,\frac{t_i p + s_i r}{m}\right) \]
and it suffices to choose $s_i r \equiv \ell \pmod p$. The equation
\[ \sum_{i=1}^n a_i\omega_i = 0 \]
thus becomes $f(\omega) = 0$ for some $f \in \mathbb{Q}(\zeta)[X]$ of degree at most $p - 1$, which is non-zero by assumption. Let’s compute the degree of $\omega$ over $\mathbb{Q}(\zeta)$:
\[ [\mathbb{Q}(\omega,\zeta):\mathbb{Q}(\zeta)] = \frac{[\mathbb{Q}(\omega,\zeta):\mathbb{Q}]}{[\mathbb{Q}(\zeta):\mathbb{Q}]} = \frac{\varphi(p^k r)}{\varphi(p^{k-1} r)} = \begin{cases} p - 1 & \text{if } k = 1 \\ p & \text{if } k \ge 2. \end{cases} \]
Thus, we already reach a contradiction if $k \ge 2$ since $f$ has degree less than $p$. Hence, $k = 1$ and we must have
\[ f = \alpha(1 + X + \ldots + X^{p-1}). \]
However, $f$ has at most $n$ non-zero coefficients, which implies $p \le n$ as wanted. □

Exercise 6.5.27. Which quadratic subfields does a cyclotomic field contain?

Solution

Given a positive integer $m$, we let $\omega_m$ denote a primitive $m$th root of unity. We have seen that, when $p$ is odd, $\sqrt{\left(\frac{-1}{p}\right)p} \in \mathbb{Q}(\omega_p)$. Hence, $\sqrt{\left(\frac{-1}{p}\right)p} \in \mathbb{Q}(\omega_n)$ whenever $p \mid n$. This implies that $\sqrt{\left(\frac{-1}{m}\right)m} \in \mathbb{Q}(\omega_n)$ whenever $m \mid n$ is odd, where $\left(\frac{-1}{m}\right)$ is the Jacobi symbol. In Problem 6.3.3, we also saw that $\sqrt2 \in \mathbb{Q}(\omega_8)$ so $\sqrt2 \in \mathbb{Q}(\omega_n)$ when $8 \mid n$. Also, we of course have $\sqrt{-1} \in \mathbb{Q}(\omega_4)$ so $\sqrt{-1} \in \mathbb{Q}(\omega_n)$ when $4 \mid n$. To summarise the above discussion, $\mathbb{Q}(\omega_n)$ contains all the quadratic subfields of the form

• $\mathbb{Q}\left(\sqrt{\left(\frac{-1}{m}\right)m}\right)$ when $m$ is a squarefree positive odd divisor of $n$

• $\mathbb{Q}(\sqrt{\pm m})$ when $m$ is a squarefree odd divisor of $n$ and $4 \mid n$

• $\mathbb{Q}(\sqrt{\pm 2m})$ when $8 \mid n$ and $m$ is a squarefree odd divisor of $n$.

We claim that these are all the quadratic subfields of $\mathbb{Q}(\omega_n)$. Suppose that $\mathbb{Q}(\omega_n)$ contains $\mathbb{Q}(\sqrt m)$ with minimal $m$ not of the wanted form. Then, $\mathbb{Q}(\omega_n)$ and $\mathbb{Q}(\omega_{4m})$ have a non-trivial intersection so $n$ and $4m$ have gcd at least $3$ by ??. First suppose that they have a common odd prime divisor $p$. Then $\mathbb{Q}(\omega_n)$ contains $\mathbb{Q}\left(\sqrt{\left(\frac{-1}{p}\right)m/p}\right)$ which contradicts the minimality of $m$. Otherwise, if the gcd is exactly $4$, the intersection is $\mathbb{Q}(i)$ so $m = -1$ which is in our list of described subfields. Otherwise, the gcd is at least $8$ which means that $m$ is even and $\sqrt2 \in \mathbb{Q}(\omega_n)$ so $\mathbb{Q}(\omega_n)$ contains $\mathbb{Q}(\sqrt{m/2})$ which contradicts the minimality of $m$ again. □
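The first fact quoted in this solution is usually obtained from the quadratic Gauss sum, whose square is $\left(\frac{-1}{p}\right)p$. The sketch below checks this numerically in floating point; it is an illustration under that assumption, and the function names `legendre`, `gauss_sum` and `p_star` are hypothetical.

```python
import cmath

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """Quadratic Gauss sum: sum of (a/p) * w^a with w = exp(2*pi*i/p)."""
    w = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(a, p) * w**a for a in range(1, p))

def p_star(p):
    """The signed prime (-1/p) * p."""
    return legendre(-1, p) * p
```

Since the Gauss sum is a rational combination of powers of a primitive $p$th root of unity, its square being $\left(\frac{-1}{p}\right)p$ exhibits the square root inside the $p$th cyclotomic field.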

Exercise 6.5.28†. Prove the Gauss and Lucas formulas: given an odd squarefree integer $n > 1$, there exist polynomials $A_n, B_n, C_n, D_n \in \mathbb{Z}[X]$ such that
\[ 4\Phi_n = A_n^2 - (-1)^{\frac{n-1}{2}} nB_n^2 = C_n^2 - (-1)^{\frac{n-1}{2}} nXD_n^2. \]
Deduce that, given any non-zero rational number $r$, there are infinitely many pairs of distinct rational primes $(p, q)$ such that $r$ has the same order modulo $p$ and modulo $q$.

Solution
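Let $\omega$ be a primitive $n$th root of unity and set $n^* = \left(\frac{-1}{n}\right)n = (-1)^{\frac{n-1}{2}}n$. Notice that the expression $A^2 - n^*B^2$ is a norm in $\mathbb{Q}[X](\sqrt{n^*})$. Exercise 6.5.27 tells us that $\mathbb{Q}(\omega)$ contains $\mathbb{Q}(\sqrt{n^*})$, so we just need to write $\Phi_n$ in the form $UV$ where $U, V$ are polynomials conjugate in $\mathbb{Q}[X](\sqrt{n^*})$. This is easy: in $\mathbb{Q}(\omega)/\mathbb{Q}(\sqrt{n^*})$, we can see that the conjugates of $\omega$ are $\omega^k$ for $\left(\frac{k}{n}\right) = 1$. Indeed, $\sigma_k : \omega \mapsto \omega^k$ fixes $\sqrt{p^*}$ iff $\left(\frac{k}{p}\right) = 1$ and negates it otherwise. Since $\left(\frac{k}{n}\right) = \prod_{p \mid n}\left(\frac{k}{p}\right)$, $\sigma_k$ negates an even number of $\sqrt{p^*}$, i.e. fixes $\sqrt{n^*} = \prod_{p \mid n}\sqrt{p^*}$, iff $\left(\frac{k}{n}\right) = 1$. This means that $\operatorname{Gal}(\mathbb{Q}(\omega)/\mathbb{Q}(\sqrt{n^*})) = \{\sigma_k \mid \left(\frac{k}{n}\right) = 1\}$. Thus, we can write
\[ \Phi_n = \prod_{\left(\frac{k}{n}\right) = 1} (X - \omega^k) \prod_{\left(\frac{k}{n}\right) = -1} (X - \omega^k) \]
as wanted. Note that the ring of integers of $\mathbb{Q}(\sqrt{n^*})$ is $\mathbb{Z}\left[\frac{1+\sqrt{n^*}}{2}\right]$ so the coefficients of $A$ and $B$ are in $\frac12\mathbb{Z}$ which gives us the factor of $4$ on the left as wanted. We also have the formula
\[ A_n + B_n\sqrt{n^*} = \prod_{\left(\frac{k}{n}\right) = 1} (X - \omega^k), \]
which can be used to derive the explicit formulas
\[ 2A_n = \prod_{\left(\frac{k}{n}\right) = 1} (X - \omega^k) + \prod_{\left(\frac{k}{n}\right) = -1} (X - \omega^k) \]
and
\[ 2B_n\sqrt{n^*} = \prod_{\left(\frac{k}{n}\right) = 1} (X - \omega^k) - \prod_{\left(\frac{k}{n}\right) = -1} (X - \omega^k). \]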

For Lucas’s formula, we wish to have
\[ \Phi_n(X)\Phi_n(-X) = \Phi_n(X^2) = C_n(X^2)^2 - n^*(XD_n(X^2))^2. \]
For this consider the equality
\[
\begin{aligned}
U_n + V_n\sqrt{n^*} &= (A_n(X) + B_n(X)\sqrt{n^*})(A_n(-X) - B_n(-X)\sqrt{n^*}) \\
&= (A_n(X)A_n(-X) - n^*B_n(X)B_n(-X)) + \sqrt{n^*}(A_n(-X)B_n(X) - A_n(X)B_n(-X)).
\end{aligned}
\]
Note that $U_n(-X) = U_n(X)$ so $U_n = C_n(X^2)$ for some $C_n$ while $V_n(-X) = -V_n(X)$ so $V_n = XD_n(X^2)$ for some $D_n$. These $C_n$ and $D_n$ are the ones we were looking for.
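For small $n$ the Gauss and Lucas identities can be verified by direct polynomial arithmetic. The sketch below does this for $n = 3, 5, 7$; the specific polynomials $A_n, B_n, C_n, D_n$ used in the checks are standard small cases supplied for illustration (they are not taken from the text), and all function names are hypothetical. Polynomials are coefficient lists, lowest degree first.

```python
def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_sub(a, b):
    """Subtract b from a, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def gauss_rhs(A, B, n_star):
    """A^2 - n_star * B^2."""
    return poly_sub(poly_mul(A, A), [n_star * c for c in poly_mul(B, B)])

def lucas_rhs(C, D, n_star):
    """C^2 - n_star * X * D^2 (the leading [0] multiplies by X)."""
    return poly_sub(poly_mul(C, C), [0] + [n_star * c for c in poly_mul(D, D)])

def four_phi_prime(p):
    """4 * Phi_p = 4 * (1 + X + ... + X^(p-1)) for a prime p."""
    return [4] * p
```

For instance, `gauss_rhs([2, 1, 2], [0, 1], 5)` expands $(2X^2 + X + 2)^2 - 5X^2$, which equals $4\Phi_5$.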

Finally, we prove that, for any non-zero $r = a/b \in \mathbb{Q}$, there are infinitely many integers $n$ such that (the numerator of) $\Phi_n(a, b)$ has at least two distinct prime factors, unless $r = \pm 1$ but these $r$ have the same order modulo any odd prime. One piece of notation: by multiplying the equality $4\Phi_n(X/Y) = C_n(X/Y)^2 - n^*(X/Y)D_n(X/Y)^2$ by a suitable power of $Y$, we get an equality of the form
\[ 4\Phi_n(X, Y) = C_n(X, Y)^2 - n^*XYD_n(X, Y)^2. \]
Without loss of generality, $a$ and $b$ are coprime and $b > 0$. First, we treat the case where $ab$ has even dyadic valuation. If we restrict ourselves to odd $n$, we can also assume that $a$ is positive, since we then have $\Phi_{2n}(a, b) = \Phi_n(-a, b)$. Thus, suppose that $a$ and $b$ are positive and coprime and let $m$ be the squarefree part of $ab$. Suppose that $m \ne 1$. Then, if $p$ is a prime factor of $m$, we have
\[ 4\Phi_{p^k m}(a, b) = 4\Phi_m(a^{p^k}, b^{p^k}) = C_m(a^{p^k}, b^{p^k})^2 - m(ab)^{p^k}D_m(a^{p^k}, b^{p^k})^2. \]
Here is the magic: $m(ab)^{p^k}$ is a perfect square so this is a difference of two squares which factorises! It remains to prove that the two factors are not both of the form $2q^\ell$ for some prime $q$. Indeed, we can ensure that all prime factors $q$ of $\Phi_{p^k m}(a, b)$ are such that $a/b$ has order $p^k m$ modulo $q$: the only other possible prime factors are common divisors of $a$ and $b$, of which there are none, and prime factors of $m \mid ab$, which also implies that they are common divisors of $a$ and $b$. Finally, since $\deg C_n > \deg D_n$, the two factors are asymptotically equivalent (the quotient goes to $1$), so if they both had the form $2q^\ell$, they would need to be equal for large $p$. This is of course impossible. When $m = 1$, we can consider the equation $\Phi_3 = X^2 + X + 1 = (X + 1)^2 - X$ to get a difference of squares in the same way (replace $p$ by $3$) and the same conclusion applies.

It only remains to treat the case where $ab$ has odd dyadic valuation. In that case, we shall derive a formula of the form
\[ \Phi_{2n} = C_n(X)^2 - nXD_n(X)^2 \]
for any squarefree even $n$. It is clear that the above argument will work as before as long as the squarefree part $m$ of $ab$ has an odd prime factor $p$, i.e. is not equal to $2$ since it’s even by assumption. We can simply consider $\Phi_{12} = X^4 - X^2 + 1 = (X^2 + X + 1)^2 - 2X(X + 1)^2$ in that case (and raise $X$ to the power $3^k$ for large $k$).
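The identity for $\Phi_{12}$ quoted above, and the way it turns into a genuine factorisation once $2X$ is a perfect square, can be checked on integers. This is a small numerical sketch; the substitution $X = 2t^2$ and the function names are choices made for this illustration.

```python
def phi12(x):
    """Phi_12(x) = x^4 - x^2 + 1."""
    return x**4 - x**2 + 1

def phi12_as_difference(x):
    """(x^2 + x + 1)^2 - 2x(x + 1)^2, the identity quoted in the text."""
    return (x * x + x + 1) ** 2 - 2 * x * (x + 1) ** 2

def split_phi12_at_2t2(t):
    """For x = 2 t^2, 2x(x+1)^2 = (2t(x+1))^2 is a square, so Phi_12(x) splits."""
    x = 2 * t * t
    c = x * x + x + 1
    s = 2 * t * (x + 1)
    return c - s, c + s
```

With $t = 2$ (so $X = 8$) the two factors are 37 and 109, and indeed $\Phi_{12}(8) = 4033 = 37 \cdot 109$.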

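Hence, it suffices to show that there exist such polynomials $C_n$ and $D_n$ for even squarefree $n$. Without loss of generality, we can assume that $n$ is positive since $\Phi_{2n}(X) = \Phi_n(X^2) = \Phi_{2n}(-X)$. Our formula is equivalent to $\Phi_{4n}(X) = C_n(X^2)^2 - n(XD_n(X^2))^2$. We will now proceed as before. Let $\omega$ be a primitive $4n$th root of unity. By Exercise 6.5.27, $\mathbb{Q}(\omega)$ contains $\mathbb{Q}(\sqrt n)$ and we have $\operatorname{Gal}(\mathbb{Q}(\omega)/\mathbb{Q}(\sqrt n)) = \{\sigma_k \mid \left(\frac{k}{n}\right) = 1\}$, by definition of the Kronecker symbol $\left(\frac{k}{n}\right) = \left(\frac{k}{n/2}\right)\left(\frac{k}{2}\right)$, where $\left(\frac{k}{2}\right) = (-1)^{\frac{k^2-1}{8}}$. Indeed, it is easy to see that $\sigma_k$ fixes $\sqrt2$ iff $\left(\frac{k}{2}\right) = 1$, and the rest follows from a parity argument as before. Thus,
\[ U_n + \sqrt n V_n = \prod_{\left(\frac{k}{n}\right) = 1} (X - \omega^k) \]
satisfy $\Phi_{4n} = U_n^2 - nV_n^2$. We wish to show that $U_n$ is a polynomial in $X^2$ and $V_n$ is $X$ times a polynomial in $X^2$, i.e. that $U_n$ is even and $V_n$ odd. This follows from the equalities
\[ 2U_n = \prod_{\left(\frac{k}{n}\right) = 1} (X - \omega^k) + \prod_{\left(\frac{k}{n}\right) = -1} (X - \omega^k) \]
and
\[ 2\sqrt n V_n = \prod_{\left(\frac{k}{n}\right) = 1} (X - \omega^k) - \prod_{\left(\frac{k}{n}\right) = -1} (X - \omega^k). \]
Indeed, we have
\[
\begin{aligned}
\prod_{\left(\frac{k}{n}\right) = \pm 1} (-X - \omega^k) &= \prod_{\left(\frac{k}{n}\right) = \pm 1} (X + \omega^k) \\
&= \prod_{\left(\frac{k}{n}\right) = \pm 1} (X - \omega^{k+2n}) \\
&= \prod_{\left(\frac{k}{n}\right) = \mp 1} (X - \omega^k)
\end{aligned}
\]
since $\left(\frac{k+2n}{p}\right) = \left(\frac{k}{p}\right)$ for any odd prime $p$ but $\left(\frac{k+2n}{2}\right) = -\left(\frac{k}{2}\right)$ as $2n \equiv 4 \pmod 8$. This concludes the proof. □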

Remark 6.5.4
Schinzel has generalised our identities to give a wide class of cyclotomic polynomials with at least
two distinct prime factors. See [25].

Miscellaneous
Exercise 6.5.30† . Let f ∈ Q[X] be an irreducible polynomial with exactly one real root of degree at
least 2. Prove that the real parts of its non-real roots are all irrational.

Solution

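Let $\alpha$ be the real root and let $\beta$ be any non-real root. Suppose for the sake of a contradiction that $2\Re(\beta) = \beta + \overline{\beta}$ is rational. Let $\sigma$ be an embedding of $\mathbb{Q}(\beta, \overline{\beta})$ sending $\beta$ to $\alpha$. Then, $\alpha + \sigma(\overline{\beta}) = 2\Re(\beta)$ since $2\Re(\beta)$ is rational so fixed by $\sigma$, which implies that $\sigma(\overline{\beta})$ is real. Since $\alpha$ is the only real root by assumption, we get $\sigma(\overline{\beta}) = \alpha$ which implies that $\alpha = \Re(\beta)$ is rational and is a contradiction. □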

Remark 6.5.5
It is not true at all in general that embeddings commute with complex conjugation. For instance, over $\mathbb{Q}(\sqrt[3]{2}, j)$, the embedding $\sigma : \sqrt[3]{2} \mapsto \sqrt[3]{2}j,\ j \mapsto j^2$ sends $\sqrt[3]{2}j$ to $\sqrt[3]{2}$, and $\sqrt[3]{2}j^2$ to $\sqrt[3]{2}j^2$, which are not complex conjugate.

Exercise 6.5.31†. Let $K$ be a number field of degree $n$. Prove that there are elements $\alpha_1, \ldots, \alpha_n$ of $K$ such that
\[ \mathcal{O}_K \subseteq \alpha_1\mathbb{Z} + \ldots + \alpha_n\mathbb{Z}. \]
By showing that any submodule of a $\mathbb{Z}$-module generated by $n$ elements is also generated by $n$ elements, deduce that $\mathcal{O}_K$ has an integral basis, i.e. elements $\beta_1, \ldots, \beta_n$ such that
\[ \mathcal{O}_K = \beta_1\mathbb{Z} + \ldots + \beta_n\mathbb{Z}. \]

Solution
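Let $\alpha \in \mathcal{O}_K$ be a primitive element for $K$. Suppose that $\sum_{i=0}^{n-1} a_i\alpha^i \in \overline{\mathbb{Z}}$ (i.e. is an algebraic integer) for some $a_i \in \mathbb{Q}$. We will prove that the denominators of the $a_i$ are bounded. Set $\operatorname{Emb}(K) = \{\sigma_1, \ldots, \sigma_n\}$. Consider the following system of equations:
\[
\begin{cases}
a_0 + a_1\sigma_1(\alpha) + \ldots + a_{n-1}\sigma_1(\alpha)^{n-1} = A_1 \\
a_0 + a_1\sigma_2(\alpha) + \ldots + a_{n-1}\sigma_2(\alpha)^{n-1} = A_2 \\
\qquad\vdots \\
a_0 + a_1\sigma_n(\alpha) + \ldots + a_{n-1}\sigma_n(\alpha)^{n-1} = A_n
\end{cases}
\]
for some $A_1, \ldots, A_n \in \overline{\mathbb{Z}}$. This can be written in matrix form as
\[
\begin{pmatrix}
1 & \sigma_1(\alpha) & \cdots & \sigma_1(\alpha)^{n-1} \\
1 & \sigma_2(\alpha) & \cdots & \sigma_2(\alpha)^{n-1} \\
\vdots & \vdots & \ddots & \vdots \\
1 & \sigma_n(\alpha) & \cdots & \sigma_n(\alpha)^{n-1}
\end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix}
=
\begin{pmatrix} A_1 \\ A_2 \\ \vdots \\ A_n \end{pmatrix}.
\]
Write this equation as $Ma = A \iff a = M^{-1}A$. Then, by Exercise C.3.18∗, we know that the $a_i$ are linear combinations of algebraic integers divided by $\det M$. However,
\[ D := \det(M)^2 = \pm\prod_{i \ne j} (\sigma_i(\alpha) - \sigma_j(\alpha)) \in \mathbb{Z} \]
by Vandermonde, so that $Da_i \in \overline{\mathbb{Z}}$ for any $i$, i.e. $Da_i \in \mathbb{Z}$ since $Da_i \in \mathbb{Q}$. (This is the application of the exact value of the Vandermonde determinant promised in Remark C.3.2!) This shows that
\[ \mathcal{O}_K \subseteq \frac{1}{D}\mathbb{Z} + \ldots + \frac{\alpha^{n-1}}{D}\mathbb{Z} \]
as wanted.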

It remains to prove the second part. We proceed by induction on the number of generators $n$ (it is trivial when $n = 0$). Suppose that $M = \sum_{i=1}^n \alpha_i\mathbb{Z}$ and $N \subseteq M$ is a submodule. Define $M'$ as $\sum_{i=2}^n \alpha_i\mathbb{Z}$, and $N'$ as $N \cap M'$. Using the induction hypothesis, set $N' = \beta_2\mathbb{Z} + \ldots + \beta_n\mathbb{Z}$. Now, consider the set $A$ of rational integers $k$ such that $k\alpha_1 + \beta \in N$ for some $\beta \in M'$. Since $N$ is a $\mathbb{Z}$-module, $A$ is an additive subgroup of $\mathbb{Z}$ which thus has the form $b_1\mathbb{Z}$ for some $b_1$. (Indeed, pick the smallest positive element $b_1 \in A$ and consider the remainder of the Euclidean division of $k \in A$ by $b_1$ to show that $b_1 \mid k$, as otherwise $A$ would contain a smaller positive element.)

Set
\[ \beta_1 = b_1\alpha_1 + b_2\alpha_2 + \ldots + b_n\alpha_n \in N \]
for some $b_2, \ldots, b_n$. If $\alpha = \sum_{i=1}^n a_i\alpha_i$ is an element of $N$, we have $a_1 \in A$ so $a_1 = kb_1$ for some $k$ and thus $\alpha - k\beta_1 \in N'$. Hence,
\[ N = \beta_1\mathbb{Z} + N' = \beta_1\mathbb{Z} + \beta_2\mathbb{Z} + \ldots + \beta_n\mathbb{Z} \]
as wanted. □
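The inductive argument is effectively an algorithm: handle one coordinate at a time and use Euclidean division to keep a single generator with a non-zero entry there. The following is a minimal sketch for subgroups of $\mathbb{Z}^n$ (so it assumes the $\alpha_i$ form a basis, which is stronger than what the proof needs); the function name `submodule_basis` is hypothetical.

```python
def submodule_basis(generators):
    """Return at most n vectors generating the same subgroup of Z^n
    as the given list of integer vectors of length n."""
    if not generators:
        return []
    n = len(generators[0])
    rows = [list(v) for v in generators if any(v)]
    basis = []
    for col in range(n):
        # Euclidean algorithm on the entries of column `col`:
        # afterwards at most one remaining row is non-zero there.
        while True:
            active = [r for r in rows if r[col] != 0]
            if len(active) <= 1:
                break
            pivot = min(active, key=lambda r: abs(r[col]))
            for r in active:
                if r is not pivot:
                    q = r[col] // pivot[col]
                    for j in range(n):
                        r[j] -= q * pivot[j]
        if active:
            # This row plays the role of beta_1 in the proof.
            basis.append(active[0])
            rows = [r for r in rows if r is not active[0]]
        rows = [r for r in rows if any(r)]
    return basis
```

For example, the three generators $(2,0)$, $(0,2)$, $(1,1)$ of the subgroup of pairs with even coordinate sum are reduced to the two generators $(1,1)$ and $(0,-2)$.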

Exercise 6.5.32† . Let f ∈ Q[X] be an irreducible polynomial of prime degree p and denote its roots
by α0 , . . . , αp−1 . Suppose that
λ0 α0 + . . . + λp−1 αp−1 ∈ Q

for some rational λi . Prove that λ0 = . . . = λp−1 .



Solution

Let $K = \mathbb{Q}(\alpha_0, \ldots, \alpha_{p-1})$. Since $p = [\mathbb{Q}(\alpha_0):\mathbb{Q}]$ divides $[K:\mathbb{Q}] = |\operatorname{Gal}(K/\mathbb{Q})|$, $\operatorname{Gal}(K/\mathbb{Q})$ has an element $\sigma$ of order $p$ by Cauchy’s theorem 6.3.3. By relabelling the $\alpha_i$, suppose without loss of generality that $\sigma$ sends $\alpha_k$ to $\alpha_{k+1}$. Let $S = \sum_{i=0}^{p-1} \lambda_i\alpha_i \in \mathbb{Q}$. Then, by applying $\sigma$ to the equation $\sum_{i=0}^{p-1} \lambda_i\alpha_i = S$ multiple times, we get the following system of equations in the $\alpha_i$:
\[
\begin{cases}
\lambda_0\alpha_0 + \ldots + \lambda_{p-1}\alpha_{p-1} = S \\
\lambda_0\alpha_1 + \ldots + \lambda_{p-1}\alpha_0 = S \\
\qquad\vdots \\
\lambda_0\alpha_{p-1} + \ldots + \lambda_{p-1}\alpha_{p-2} = S.
\end{cases}
\]
Since this system has a non-trivial solution (the trivial one being $\alpha_0 = \ldots = \alpha_{p-1}$), its determinant must be zero. By Exercise C.5.6, this circulant determinant is
\[ g(\omega)g(\omega^2) \cdots g(\omega^{p-1}) \]
where $\omega$ is a primitive $p$th root of unity and $g = \sum_{i=0}^{p-1} \lambda_iX^i$. Thus, $g(\omega^k) = 0$ for some $k \in [p-1]$. Since $p$ is prime, the $\omega^k$ for $k \in [p-1]$ are all conjugate with minimal polynomial $\Phi_p = 1 + \ldots + X^{p-1}$. Thus, $\Phi_p \mid g$. Since $\deg \Phi_p \ge \deg g$, we have $g = \lambda\Phi_p$ for some $\lambda \in \mathbb{Q}$. This yields $\lambda_0 = \ldots = \lambda_{p-1} = \lambda$ as wanted. (Conversely, all such $\lambda_i$ work since the sum of the conjugates of an algebraic number is rational.) □

Exercise 6.5.33† (TFJM 2019). Let $N$ be an odd integer. Prove that there exist infinitely many rational primes $p \equiv 1 \pmod N$ such that $x \mapsto x^{n+1} + x$ is a bijection of $\mathbb{F}_p$, where $n = \frac{p-1}{N}$.

Solution

The idea is that, if $\omega + 1$ is an $N$th power in $\mathbb{F}_p$ for all $N$th roots of unity $\omega \in \mathbb{F}_p$, then $x \mapsto x^{n+1} + x$ is a bijection. Indeed, this means that $(\omega + 1)^n = (\omega + 1)^{\frac{p-1}{N}} = 1$ for any such $\omega$. Then, if $x^{n+1} + x = y^{n+1} + y$, by raising the equation to the $n = \frac{p-1}{N}$th power, since $x^n$ and $y^n$ are $N$th roots of unity, we get
\[ x^n = y^n \]
as $(x^n + 1)^n = (y^n + 1)^n = 1$. But then, our original equation becomes
\[ x(1 + x^n) = y(1 + y^n) \]
so $x = y$ or $x^n = -1$. However, the latter implies $(-1)^N = x^{p-1} = 1$ which is impossible since $N$ is odd by assumption.

Thus, we are done if we find infinitely many $p$ such that $\omega + 1$ is an $N$th power in $\mathbb{F}_p$ for any $N$th root $\omega \in \mathbb{F}_p$. Let $\zeta \in \mathbb{C}$ be a complex primitive $N$th root of unity. Consider the polynomial
\[ f = \prod_{i,j} \left(X - \zeta^i\sqrt[N]{\zeta^j + 1}\right), \]
which has integer coefficients by the fundamental theorem of symmetric polynomials. Then, any $p \in \mathcal{P}_{\text{split}}(f)$ which doesn’t divide $N$ nor $f(0)$ works (the idea is that $\sqrt[N]{\omega + 1}$ will exist in $\mathbb{F}_p$ for such a $p$). Let $p$ be such a prime, and let $\alpha_1, \ldots, \alpha_m$ be the roots of $f$ in $\mathbb{F}_p$. We will prove that, for any $N$th root of unity $\omega \in \overline{\mathbb{F}_p}$, $\omega + 1$ is the $N$th power of an element of $\mathbb{F}_p$ (in particular, $\omega + 1 \in \mathbb{F}_p$). Note that $\omega + 1 \ne 0$ since $p \nmid f(0)$. By the fundamental theorem of symmetric polynomials, we have
\[ \prod_\omega \prod_k \left(\omega + 1 - \alpha_k^N\right) = \prod_k \prod_{i,j} \left(\zeta^k + 1 - \left(\zeta^i\sqrt[N]{\zeta^j + 1}\right)^N\right) = 0 \]
as wanted. □

Exercise 6.5.34†. Let $f \in \mathbb{C}(X)$ be a rational function, and suppose $f$ sends rational integers to algebraic integers. Prove that $f$ is a polynomial.

Solution

By linear algebra, $f$ has coefficients in a number field $K$ (which we will assume without loss of generality to be Galois). Indeed, consider the system of linear equations in the coefficients of its numerator $g$ and denominator $h$
\[ g(n_1) = \alpha_1h(n_1), \ldots, g(n_k) = \alpha_kh(n_k) \]
for $n_1, \ldots, n_k \in \mathbb{Z}$ and $\alpha_1, \ldots, \alpha_k \in \overline{\mathbb{Z}}$. It has a solution in some finite-dimensional $K := \mathbb{Q}(\alpha_1, \ldots, \alpha_k)$-vector space $V$ (the one generated by the coefficients), and thus also in $K$, for instance by considering a basis $1, v_1, \ldots, v_m$ of $V$ and looking at the coefficient of $1$. Next, notice that for $k > \deg g + \deg h$, these conditions completely determine $f$: if $\frac{g(n)}{h(n)} = \frac{u(n)}{v(n)}$ for $\deg g + \deg h + 1$ values of $n$ and some polynomials $u$ and $v$ of same degrees as $g$ and $h$, the polynomial $gv - hu$ has more roots than its degree so must be identically zero.

Now, consider its conjugates

(f1 , . . . , fk ) = (σ1 (f ), . . . , σk (f )),

where {σ1 , . . . , σk } = Gal(K/Q). By assumption, for any i,

ei (f1 , . . . , fk )

takes infinitely many values which are rational integers at rational integers. Thus, it is a poly-
nomial by Exercise 5.5.1† . This implies that f is integral over the ring of polynomials C[X]: it’s
a root of the monic polynomial

(Y − f1 ) · . . . · (Y − fk ) ∈ C[X][Y ].

However, it is also rational over C[X] since it is in C(X). Thus, an analogue of Proposition 1.1.1
shows that it must be in C[X], as a rational integral (over C[X]) element of C(X) (C[X] is a
UFD so the same proof as Proposition 1.1.1 works). 
Chapter 7

Units in Quadratic Fields and Pell’s Equation

7.1 Fundamental Unit


Exercise 7.1.1∗ . Prove that α is invertible if and only if its norm is ±1.

Solution

If $\alpha$ is invertible then so are its conjugates $\sigma_i(\alpha)$ since $\alpha\beta = 1$ transforms into $\sigma_i(\alpha)\sigma_i(\beta) = 1$. Thus, so is the product of its conjugates, i.e. its norm. But we have seen that the only invertible rational integers are $\pm 1$. Conversely, if $N(\alpha) = \pm 1$ then $\alpha$ times $\pm$ the product of its other conjugates is $1$ so $\alpha$ is invertible. □

7.2 Pell-Type Equations


Exercise 7.1.2∗ . Prove Proposition 7.1.1.

Solution

We have already shown that these were the only units of $\mathbb{Q}(i)$ and $\mathbb{Q}(j)$ in Chapter 2. The units of $\mathbb{Q}(\sqrt{-d})$ are the elements $a + b\sqrt{-d}$ of $\mathcal{O}_{\mathbb{Q}(\sqrt{-d})}$ satisfying $N(a + b\sqrt{-d}) = a^2 + db^2 = 1$ since the norm is positive so cannot be $-1$. For $d = 2$, there are only the trivial solutions $\pm 1$ since $\mathcal{O}_{\mathbb{Q}(\sqrt{-2})} = \mathbb{Z}[\sqrt{-2}]$ and $|b| \ge 1$ implies $a^2 + 2b^2 \ge 2$ while $|a| \ge 2$ implies $a^2 + 2b^2 \ge 4$.

If $d \ge 5$ ($d = 4$ is not squarefree), $a$ and $b$ are both half integers so $a^2 + db^2 \ge \frac{5}{4} > 1$ if $b \ne 0$, which implies $b = 0$ from which we get $a = \pm 1$, corresponding to the units $\pm 1$. □

Exercise 7.2.1∗. Prove that $\mathcal{O}_{\mathbb{Q}(\sqrt u)}/\beta\mathcal{O}_{\mathbb{Q}(\sqrt u)}$ is finite if $\beta \ne 0$.

Solution

Let $\alpha$ be such that $\mathcal{O} := \mathcal{O}_{\mathbb{Q}(\sqrt u)} = \mathbb{Z}[\alpha]$ (see Proposition 2.1.1). Note that, when $\beta = m$ is a rational integer, this has exactly $m^2 = N(m)$ elements since $m \mid a + b\alpha$ iff $m \mid a, b$ (by definition $c + d\alpha$ is an algebraic integer for rational $c, d$ iff $c$ and $d$ are integers).

For the general case, note that $\mathcal{O}/\beta\mathcal{O}$ is a quotient of $\mathcal{O}/N(\beta)\mathcal{O}$ since $\beta \mid N(\beta)$ so the former has a finite number of elements too. □

7.3 Størmer’s Theorem


Exercise 7.3.1∗ . Prove that ym | yn iff m | n.

Solution

This is exactly the same as Exercise 4.3.1∗ but in number fields. Write $y_n = \frac{\alpha^n - \beta^n}{2\sqrt d}$ with $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt d)}$. Note that $y_m \mid y_n$ iff $\alpha^m - \beta^m \mid \alpha^n - \beta^n$ which is equivalent to $\alpha/\beta = \alpha^2$ having order dividing $n$ modulo $\alpha^m - \beta^m$. Since its order is exactly $m$, this is equivalent to $m \mid n$. □

Remark 7.3.1
This solution also works for the more general problem where α, β are in any number field K, but
the difficulty lies in showing that OK mod γ is finite for any non-zero γ ∈ OK , so that talking
about the order makes sense. This follows for instance from Exercise 6.5.31† .

7.4 Units in Complex Cubic Fields and Kobayashi’s Theorem


Exercise 7.4.1. Why does looking at the (2k )2 Pell-type equations ax2 − by 2 = k for squarefree
integral S-units a, b not prove that u − v = k has finitely many integral S-units solutions?

Solution

It doesn’t work because there is no reason for it to work: we were simply lucky before. Indeed, the solutions of Proposition 7.2.4 are very messy: they have the form $\alpha^k\beta_i$ for some $\alpha$ and elements $\beta_1, \ldots, \beta_n$. The problem is the additional factor: if we look at the $\sqrt d$ part we get something of the form
\[ y_n = a\sum_k \binom{n}{2k+1} x^{n-2k-1}y^{2k}d^k + b\sum_k \binom{n}{2k} x^{n-2k}y^{2k}d^k \]
which has too many terms to work with. In particular, we can’t even restrict ourselves to the $n = p$ prime case since we do not necessarily have $y_n \mid y_m$ when $n \mid m$. □

Exercise 7.4.2∗ . Prove Theorem 7.4.2 in the case where a/b is a rational cube.

Solution

Write $au^3 = bv^3 = c$ with non-zero $u, v \in \mathbb{Z}$. Then, $ax^3 + by^3 = k$ becomes
\[ (xv)^3 + (yu)^3 = ku^3v^3/c \]
so it suffices to consider the case where $a = b = 1$. We have $x + y \mid x^3 + y^3 = k$, say $x + y = d$. It suffices to show that there are finitely many solutions for a fixed $d$ since $k$ has finitely many divisors. In fact we will prove that there is at most one solution for a fixed $d$. We have
\[ x^2 - xy + y^2 = \frac{k}{x + y} = \frac{k}{d} := d'. \]
Thus, $3xy = (x + y)^2 - (x^2 - xy + y^2) = d^2 - d'$. Hence, $x$ and $y$ are roots of
\[ X^2 - dX + \frac{d^2 - d'}{3} \]
by Vieta’s formulas which implies that there is at most one (unordered) pair of solutions as wanted. □
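The argument is constructive for $a = b = 1$: for each divisor $d$ of $k$, the pair $(x, y)$ is recovered from $x + y = d$ and $3xy = d^2 - k/d$. The sketch below enumerates all integer solutions this way; the function names are hypothetical and $k$ is assumed to be a non-zero integer.

```python
from math import isqrt

def divisors(k):
    """All positive and negative divisors of a non-zero integer k."""
    k = abs(k)
    small = [d for d in range(1, isqrt(k) + 1) if k % d == 0]
    pos = set(small) | {k // d for d in small}
    return sorted(pos | {-d for d in pos})

def sum_of_two_cubes(k):
    """All integer pairs (x, y) with x^3 + y^3 = k, for a non-zero integer k."""
    solutions = set()
    for d in divisors(k):                  # d plays the role of x + y
        d_prime = k // d                   # d' = x^2 - xy + y^2
        if (d * d - d_prime) % 3:
            continue
        prod = (d * d - d_prime) // 3      # xy, from 3xy = d^2 - d'
        disc = d * d - 4 * prod            # discriminant of X^2 - dX + xy
        if disc < 0:
            continue
        r = isqrt(disc)
        if r * r != disc or (d + r) % 2:
            continue
        x, y = (d + r) // 2, (d - r) // 2
        solutions.update({(x, y), (y, x)})
    return sorted(solutions)
```

Each divisor contributes at most one unordered pair, which matches the uniqueness claim in the solution.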


Exercise 7.4.3∗ . Prove that the only roots of unity of Q( 3 d, j) are ±1, ±j and ±j 2 .

Solution

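Note that
\[ \mathbb{Q}\left(\exp\left(\frac{2i\pi}{n}\right), \exp\left(\frac{2i\pi}{m}\right)\right) = \mathbb{Q}\left(\exp\left(\frac{2i\pi}{\operatorname{lcm}(m,n)}\right)\right) \]
since the RHS clearly contains the LHS, and $\exp\left(\frac{2i\pi}{\operatorname{lcm}(m,n)}\right) = \exp\left(\frac{2i\pi}{n}\right)^a\exp\left(\frac{2i\pi}{m}\right)^b$ where $am + bn = \gcd(m, n)$ by Bézout. Thus, if $n$ is the greatest order of a root of unity $\omega$ contained in $K = \mathbb{Q}(\sqrt[3]{d}, j)$, then $6 \mid n$ since $-j$ has order $6$.

By Chapter 3, $\omega$ has degree $\varphi(n)$, and since $K$ has degree $6$ we get $\varphi(n) \mid 6$. Together with $6 \mid n$, this implies $n \in \{6, 18\}$ (note that $\varphi(12) = 4$ does not divide $6$). $n = 6$ is what we want, so we need to show that the other case is impossible. For this, note that $\varphi(18) = 6$, so if it were the case then we would have $K = \mathbb{Q}(\omega)$.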

To finish, we shall imitate the solution of Problem 6.3.1 to show that this is impossible, i.e. that the Galois group of $K$ is not abelian. Note that the embeddings of $K$ are
\[ \sigma_{(a,b)} : \begin{cases} \sqrt[3]{d} \mapsto j^a\sqrt[3]{d} \\ j \mapsto j^b \end{cases} \]
for $a \in \mathbb{Z}/3\mathbb{Z}$ and $b \in (\mathbb{Z}/3\mathbb{Z})^\times$. Moreover, by Exercise 6.3.6∗, we have
\[ \sigma_{(0,-1)} \circ \sigma_{(1,1)} = \sigma_{(-1,-1)} \]
and
\[ \sigma_{(1,1)} \circ \sigma_{(0,-1)} = \sigma_{(1,-1)} \]
so $\sigma_{(0,-1)}$ and $\sigma_{(1,1)}$ do not commute which means that the Galois group of $K$ is not abelian as wanted. □

Exercise 7.4.4∗ . Prove that θ/σ(θ) ∈ {±j, ±j 2 } is also impossible.

Solution

• $\theta/\sigma(\theta) = \pm j$ yields
\[ x + y\sqrt[3]{d} + z\sqrt[3]{d^2} = \pm j\left(x + yj\sqrt[3]{d} + zj^2\sqrt[3]{d^2}\right) \]
which means $x = 0$ since there’s no $j$ term on the left and $y = 0$ since there’s no $j^2$ term on the left. Thus $\theta = z\sqrt[3]{d^2}$ which is impossible since the norm of $z\sqrt[3]{d^2}$ is $z^3d^2$ which can’t be $1$.

• $\theta/\sigma(\theta) = \pm j^2$ yields
\[ x + y\sqrt[3]{d} + z\sqrt[3]{d^2} = \pm j^2\left(x + yj\sqrt[3]{d} + zj^2\sqrt[3]{d^2}\right) \]
which means $x = 0$ since there’s no $j^2$ term on the left and $z = 0$ since there’s no $j$ term on the left. Thus $\theta = y\sqrt[3]{d}$ which is impossible since the norm of $y\sqrt[3]{d}$ is $y^3d$ which can’t be $1$.

7.5 Exercises
Diophantine Equations
Exercise 7.5.1† (ISL 1990). Find all positive rational integers $n$ such that $\frac{1^2 + \ldots + n^2}{n}$ is a perfect square.

Solution

Note that
\[ \frac{1^2 + \ldots + n^2}{n} = \frac{(n + 1)(2n + 1)}{6}. \]
Thus, this is equal to $k^2$ if and only if
\[ 2n^2 + 3n + 1 = (n + 1)(2n + 1) = 6k^2, \]
i.e.
\[ (4n + 3)^2 - 48k^2 = 1. \]
Thus, we want to solve the Pell equation $x^2 - 48y^2 = 1$ with $x \equiv 3 \pmod 4$. The solutions are given by $x = \frac{(7 + \sqrt{48})^m + (7 - \sqrt{48})^m}{2}$ and it is easy to see that this is congruent to $3$ modulo $4$ exactly when $m$ is odd. Indeed, modulo $4$ it is congruent to
\[ \frac{(7^m + m7^{m-1}\sqrt{48}) + (7^m - m7^{m-1}\sqrt{48})}{2} = 7^m \equiv (-1)^m. \]
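The resulting values of $n$ can be generated from the Pell recurrence and checked directly. The sketch below does this; the function names are hypothetical, and the recurrence simply multiplies $x + y\sqrt{48}$ by the fundamental solution $7 + \sqrt{48}$.

```python
from math import isqrt

def pell_solutions(count):
    """First `count` positive solutions (x, y) of x^2 - 48 y^2 = 1."""
    x, y = 7, 1
    out = []
    for _ in range(count):
        out.append((x, y))
        x, y = 7 * x + 48 * y, x + 7 * y   # multiply by 7 + sqrt(48)
    return out

def square_mean_indices(count):
    """Values n = (x - 3) / 4 coming from solutions with x ≡ 3 (mod 4)."""
    xs = [x for x, _ in pell_solutions(2 * count) if x % 4 == 3]
    return [(x - 3) // 4 for x in xs][:count]

def mean_of_squares_is_square(n):
    """Direct check that (1^2 + ... + n^2) / n is a perfect square."""
    total = n * (n + 1) * (2 * n + 1) // 6
    if total % n:
        return False
    q = total // n
    return isqrt(q) ** 2 == q
```

The first values produced are $n = 1, 337, 65521$; for instance $(1^2 + \ldots + 337^2)/337 = 195^2$.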



Exercise 7.5.2† (BMO 1 2006). Let $n$ be a rational integer. Prove that, if $2 + 2\sqrt{1 + 12n^2}$ is a rational integer, then it is a perfect square.

Solution
The solutions to the equation $x^2 - 3y^2 = 1$ are given by $x_m = \frac{\alpha^m + \beta^m}{2}$, $y_m = \frac{\alpha^m - \beta^m}{\alpha - \beta}$, where $\alpha$ and $\beta$ are the conjugate fundamental units $2 \pm \sqrt3$. Since $y_1 = 1$ and $y_2 = 4$, $y_m$ is even iff $m$ is. Thus, by assumption, since $\sqrt{1 + 3(2n)^2}$ is an integer, we have $2n = y_{2m}$ and $\sqrt{1 + 12n^2} = x_{2m}$ for some $m$, i.e.
\[ 2 + 2\sqrt{1 + 12n^2} = 2 + 2 \cdot \frac{\alpha^{2m} + \beta^{2m}}{2} = 2\alpha^m\beta^m + \alpha^{2m} + \beta^{2m} = (\alpha^m + \beta^m)^2 = (2x_m)^2. \]

Exercise 7.5.4† (RMM 2011). Let Ω(·) denote the number of prime factors counted with multiplicity
of a rational integer, and define λ(·) = (−1)Ω(·) . Prove that there are infinitely many rational integers n
such that λ(n) = λ(n + 1) = 1 and infinitely many rational integers n such that λ(n) = λ(n + 1) = −1.

Solution

For the first part, let $(x, y)$ be a solution to the Pell equation $x^2 - 6y^2 = 1$. Then, $n = 6y^2$ has an even number of prime factors and so does $n + 1 = x^2$.

For the second part, let $(x, y)$ be a solution to the Pell-type equation $3x^2 - 2y^2 = 1$. Then, $n = 2y^2$ has an odd number of prime factors, and so does $n + 1 = 3x^2$. Note that this equation has infinitely many solutions. Indeed, since $(\sqrt3 + \sqrt2)(\sqrt3 - \sqrt2) = 1$ and every odd power of $\sqrt3 + \sqrt2$ has the form $x\sqrt3 + y\sqrt2$ with $x, y \in \mathbb{Z}$, the pairs defined by $x\sqrt3 + y\sqrt2 = (\sqrt3 + \sqrt2)^{2k+1}$ satisfy $3x^2 - 2y^2 = 1$ for every $k \ge 0$. □

Exercise 7.5.5†. Let $k$ be a rational integer. Prove that there are infinitely many positive integers $n$ such that $n^2 + k \mid n!$.

Solution

By Proposition 7.2.3, the equation $x^2 - dy^2 = -k$ has infinitely many solutions if it has at least one and $d$ isn’t a perfect square. Thus, pick an $r$ such that $r^2 + k = d$ is not a perfect square (this is true for sufficiently large $r$ since gaps between consecutive perfect squares are increasing) and consider any solution $n$ to the equation $n^2 - dy^2 = -k$, which has infinitely many solutions since $(n, y) = (r, 1)$ is one. Finally, note that

n2 + k = dy 2 = y · dy | (dy 2 − k)! = n!

for sufficiently large y. 

Pell-Type Equations
Exercise 7.5.11† . Let d be a rational integer. Solve the equation x2 − dy 2 = 1 over Q.

Solution

We will solve this geometrically! The idea is that we (almost) get a correspondence between the rational points of our conic (the curve $x^2 - dy^2 = 1$), and the rational points of the vertical line $x = 0$. Indeed, if we have a rational point $p$ on the conic, we get a rational point on the vertical line by intersecting it with the line going through $p$ and $(1, 0)$. Conversely, if we have a rational point $q$ on the vertical line and intersect the conic with the line going through $q$ and $(1, 0)$, we get a rational point on the conic.

Let’s make this more explicit. Let $(0, t)$ be a rational point on the vertical line. Then, the line joining $(0, t)$ with $(1, 0)$ is $y = t(1 - x)$. When we intersect this with the conic, we get (for $x \ne 1$)
\[ x^2 - dt^2(1 - x)^2 = 1 \iff x + 1 + dt^2(1 - x) = 0 \iff x = \frac{dt^2 + 1}{dt^2 - 1}. \]
From this, we get $y = t(1 - x) = \frac{-2t}{dt^2 - 1}$. Thus, the solutions are
\[ \left\{\left(\frac{dt^2 + 1}{dt^2 - 1}, \frac{-2t}{dt^2 - 1}\right) \,\middle|\, t \in \mathbb{Q},\ dt^2 \ne 1\right\} \cup \{(1, 0)\}. \]
(Replacing $t$ by $-t$ shows that the sign of the second coordinate is immaterial.)
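The parametrisation can be tested with exact rational arithmetic. In the sketch below, `rational_point` intersects the conic $x^2 - dy^2 = 1$ with the line through $(1, 0)$ and $(0, t)$, and `parameter_of` is the inverse map; both names are hypothetical, and parameters with $dt^2 = 1$ (lines parallel to an asymptote) are reported as `None`.

```python
from fractions import Fraction

def rational_point(d, t):
    """Second intersection of x^2 - d*y^2 = 1 with the line through (1, 0) and (0, t)."""
    t = Fraction(t)
    denom = d * t * t - 1
    if denom == 0:
        return None          # the line is parallel to an asymptote
    x = (d * t * t + 1) / denom
    y = t * (1 - x)
    return x, y

def parameter_of(x, y):
    """Inverse map: the line through (1, 0) and (x, y) meets x = 0 at (0, t)."""
    return Fraction(y) / (1 - Fraction(x))
```

For example, `rational_point(2, 1)` returns the point $(3, -2)$, and indeed $3^2 - 2 \cdot 2^2 = 1$.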

Exercise 7.5.13† . Prove that the equation x2 − 34y 2 = −1 has no non-trivial solution in Z despite
−1 being a square modulo 34.

Solution
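The fundamental unit of $\mathbb{Q}(\sqrt{34})$ is $35 + 6\sqrt{34}$, which has norm $35^2 - 34 \cdot 6^2 = 1$. Every unit is, up to sign, a power of the fundamental unit, so every unit has norm $1$ and none has norm $-1$. □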
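The claim about the fundamental unit can be confirmed by brute force, since for $d \equiv 2, 3 \pmod 4$ the units greater than $1$ are the $x + y\sqrt d$ with $x, y \ge 1$ and $x^2 - dy^2 = \pm 1$, ordered by $y$. The search below is only a sketch under that assumption; `smallest_unit` is a hypothetical name.

```python
from math import isqrt

def smallest_unit(d):
    """Smallest x + y*sqrt(d) > 1 with x^2 - d*y^2 = ±1, as (x, y, norm).
    Assumes d > 1 is squarefree with d ≡ 2 or 3 (mod 4)."""
    y = 1
    while True:
        for norm in (-1, 1):             # norm -1 gives the smaller x, so try it first
            x2 = d * y * y + norm
            x = isqrt(x2)
            if x * x == x2:
                return x, y, norm
        y += 1
```

For $d = 34$ the search returns $(35, 6, 1)$, whereas for $d = 2$ or $d = 10$ it finds a unit of norm $-1$.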

Fundamental Units
Exercise 7.5.15†. Let $d \equiv 1 \pmod 4$ be a squarefree integer, and suppose $\eta = \frac{a + b\sqrt d}{2} \notin \mathbb{Z}[\sqrt d]$ is the fundamental unit of $\mathbb{Q}(\sqrt d)$. Prove that $\eta^n \in \mathbb{Z}[\sqrt d]$ if and only if $3 \mid n$.

Solution

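Let’s look at $\mathbb{Z}[\eta] = \mathcal{O}_{\mathbb{Q}(\sqrt d)}$ modulo $2$: there are four elements since $\frac{a + b\eta}{2} \in \mathbb{Z}[\eta]$ iff $\frac{a}{2}, \frac{b}{2} \in \mathbb{Z}$. Out of these four, three are invertible modulo $2$ and the last isn’t (it’s $0$): $1 \cdot 1 \equiv 1$, $\eta \cdot \overline{\eta} \equiv 1$, and $\eta + 1 \equiv \overline{\eta}$ which is invertible by the previous equality. Indeed, by assumption $\eta - \overline{\eta} \equiv \eta + \overline{\eta} \equiv 1 \pmod 2$.

In other words, $2$ is prime in $\mathbb{Z}[\eta]$, since an element is either invertible modulo $2$ or divisible by $2$ (note that this relies on the assumption that $\eta \notin \mathbb{Z}[\sqrt d]$; in general, for $d \equiv 1 \pmod 4$, $2$ is prime iff $d \equiv 5 \pmod 8$). Thus, by Fermat’s little theorem in $\mathbb{Z}[\eta]/2\mathbb{Z}[\eta]$ (this is a finite field with $4$ elements, see Theorem 4.2.1), the order of $\eta$ modulo $2$ divides $3$. Since $\eta \not\equiv 1$, its order must be exactly $3$. Finally, we have
\[ \eta^n \in \mathbb{Z}[\sqrt d] \iff \frac{\eta^n - \eta^{-n}}{2} \in \overline{\mathbb{Z}} \iff 2 \mid \eta^{2n} - 1 \iff 3 \mid 2n \iff 3 \mid n \]
as wanted. □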
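The statement is easy to test on examples such as $\eta = \frac{1 + \sqrt5}{2}$ or $\eta = \frac{3 + \sqrt{13}}{2}$. The sketch below tracks $\eta^n = \frac{u + v\sqrt d}{2}$ with integers $u, v$ and records when both are even; the function name is hypothetical, and it assumes $a, b$ odd and $d \equiv 1 \pmod 4$ so that the halvings are exact.

```python
def exponents_in_Z_sqrt_d(a, b, d, max_n):
    """For eta = (a + b*sqrt(d)) / 2 with a, b odd and d ≡ 1 (mod 4),
    return the exponents 1 <= n <= max_n with eta^n in Z[sqrt(d)]."""
    hits = []
    u, v = a, b                          # eta^n = (u + v*sqrt(d)) / 2
    for n in range(1, max_n + 1):
        if u % 2 == 0 and v % 2 == 0:
            hits.append(n)
        # Multiply by eta:
        # ((u + v√d)/2) * ((a + b√d)/2) = ((ua + vbd)/2 + ((ub + va)/2)√d) / 2
        u, v = (u * a + v * b * d) // 2, (u * b + v * a) // 2
    return hits
```

In both examples the exponents found are exactly the multiples of $3$; for instance $\left(\frac{1 + \sqrt5}{2}\right)^3 = 2 + \sqrt5$.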

Exercise 7.5.16†. Let $d \ne 1$ be a squarefree rational integer, and suppose that $2^{2n} + 1 = dm^2$ for some integers $n, m \ge 0$. Show that $2^n + m\sqrt d$ is the fundamental unit of $\mathbb{Q}(\sqrt d)$, provided that $d \ne 5$.

Solution

Clearly, $m$ is odd. Suppose for the sake of a contradiction that $2^n + m\sqrt d$ is not the fundamental unit of $\mathbb{Q}(\sqrt d)$. Then, $2^n + m\sqrt d$ is the $\ell$th power of an element $\eta \in \mathbb{Q}(\sqrt d)$ for some $\ell \ge 2$. Without loss of generality, we may assume that $\ell = p$ is prime (by replacing $\eta$ by $\eta^{\ell/p}$). First, suppose that $p$ is odd. Suppose also that $\eta = u + v\sqrt d \in \mathbb{Z}[\sqrt d]$, where $u$ and $v$ are positive. Then,
\[ 2^n + m\sqrt d = (u + v\sqrt d)^p \]
gives us
\[ 2^n = \sum_k \binom{p}{2k} u^{p-2k}v^{2k}d^k = u\sum_k \binom{p}{2k} u^{p-2k-1}v^{2k}d^k. \]
Thus, $u \mid 2^n$. Notice that the second factor is congruent to $pv^{p-1}d^{\frac{p-1}{2}}$ modulo $u$ and that this is coprime with $u$ as $p$ and $d$ are odd and $u, v$ are coprime since $2^n$ and $m$ are. Thus, either $u = 1$ or $u = 2^n$. The former is impossible since the unit is non-trivial, and the latter as well since the second factor is at least $pv^{p-1}d^{\frac{p-1}{2}} > 1$.

Now, suppose that $\eta = \frac{u + v\sqrt d}{2}$ for some odd $u, v$. Then, we get exactly the same equation as before, but with $n$ replaced by $n + p$:
\[ 2^{n+p} = \sum_k \binom{p}{2k} u^{p-2k}v^{2k}d^k = u\sum_k \binom{p}{2k} u^{p-2k-1}v^{2k}d^k. \]
(We distinguished the two cases because we want $u$ and $v$ to be coprime for the two factors to be as well.) As before, $u = 1$ or $u = 2^{n+p}$: the latter is still impossible as the second factor is at least $pv^{p-1}d^{\frac{p-1}{2}} > 1$, but now the former could be possible since the unit is non-trivial (the rational part is now $\frac12$ and not $1$). However, it gives $dv^2 \in \{5, -3\}$: the first case is ruled out by the hypothesis and the latter is impossible.

It only remains to settle the case $p = 2$ now. In that case, suppose first that $\eta = u + v\sqrt d \in \mathbb{Z}[\sqrt d]$. Then, we get
\[ 2^n + m\sqrt d = (u + v\sqrt d)^2 = (u^2 + dv^2) + 2uv\sqrt d \]
and this is impossible since $m$ is odd. Finally, if $\eta = \frac{u + v\sqrt d}{2}$ for some odd $u, v$, we get
\[ 2^{n+2} + 4m\sqrt d = (u + v\sqrt d)^2 = (u^2 + dv^2) + 2uv\sqrt d \]
which is impossible since $2uv$ is not divisible by $4$. □

Remark 7.5.1
We could have also used Carmichael’s theorem from Exercise 4.6.33†: we have
\[ 2^n = \frac{\alpha^m + \beta^m}{2}, \]
with $m$ odd (since $2^n + y\sqrt d$ is not a square), where $\alpha$ and $\beta$ are the conjugate fundamental units of $\mathbb{Z}[\sqrt d]$. Since $m$ is odd, we do not have to consider any exceptions, and we get that $\alpha^m - (-\beta)^m$ has a primitive prime factor $p$ which does not divide $\alpha + \beta$. Since $\alpha + \beta$ is even (it’s twice the rational part of $\alpha$), this implies that $p$ is odd, which is a contradiction since $2^n$ has no odd prime factor. Thus, $2^n + y\sqrt d$ is the fundamental unit of $\mathbb{Z}[\sqrt d]$, but it might not be the fundamental unit of $\mathbb{Q}(\sqrt d)$. The last case we need to consider is when $2^n + y\sqrt d = \eta^3$, where $\eta$ is the fundamental unit of $\mathbb{Q}(\sqrt d)$, by Exercise 7.5.15†.

Exercise 7.5.17† . Suppose that d = a2 ± 1 is squarefree, where a ≥ 1 is some rational integer and
let k ≥ 0 be a rational integer. Suppose that the equation x2 − dy 2 = m has a solution in Z for some
|m| < ka. For sufficiently large d, prove that |m|, d + m or d − m is a square.

Solution

Note that the assumption that $d = a^2 \pm 1$ gives us that $\theta = a + \sqrt d$ is the fundamental unit of $\mathbb{Q}(\sqrt d)$. Indeed, if $x^2 - dy^2 = \pm 1$ for some $y \ne 0$, then $x \ge \sqrt{d - 1}$ so $x \ge a$.

Suppose $(x, y)$ is a positive solution to $x^2 - dy^2 = m$. By dividing $x + y\sqrt d$ by a suitable power of $\theta$ (this may change the sign of $m$ but doesn’t change its absolute value), we may assume that $1 \le x + y\sqrt d < \theta$. Then,
\[ |2y\sqrt d| \le |x + y\sqrt d| + |x - y\sqrt d| < \theta + \frac{|m|}{\theta} < 2a + k + o(1). \]
Thus, $|y| < 1 + o(1)$. For sufficiently large $a$, our inequality forces $|y| = 1$ or $y = 0$. If $|y| = 1$, we get that $m + d$ is a perfect square, and if $y = 0$ we get that $m$ is a perfect square as wanted. □

Remark 7.5.2
The argument used in Remark 5.5.1 can be slightly modified to show that, for any choice of ±1,
there exist infinitely many squarefree numbers of the form a2 ± 1.

Exercise 7.5.18† . Solve completely the equation x3 + 2y 3 + 4z 3 = 6xyz + 1 which was seen in
Problem 6.2.2.

Solution
Since the norm of $x + y\sqrt[3]{2} + z\sqrt[3]{4}$ is $x^3 + 2y^3 + 4z^3 - 6xyz$ (see Problem 6.2.2), we wish to find units in $K = \mathbb{Q}(\sqrt[3]{2})$. We claim that the fundamental unit of $\mathbb{Q}(\sqrt[3]{2})$ is $\theta = 1 + \sqrt[3]{2} + \sqrt[3]{4}$. Thus, the solutions will be the ones considered in Problem 6.2.2, i.e. $x + y\sqrt[3]{2} + z\sqrt[3]{4} = (1 + \sqrt[3]{2} + \sqrt[3]{4})^n$ for some $n$ (the only roots of unity of $\mathbb{Q}(\sqrt[3]{2})$ are $\pm 1$ and $-1$ has norm $-1$ so does not work).

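We need to show that this unit has minimal absolute value (among the ones greater than 1). Suppose that there is a unit greater than 1, $\varepsilon := a + b\sqrt[3]{2} + c\sqrt[3]{4}$, with $\varepsilon < \theta < 4$. Let $\sigma$ be a complex embedding of $K$. Then, $|\sigma(\varepsilon)|^2 = \frac{1}{\varepsilon} \in \left(\frac{1}{4}, 1\right)$. Hence, the minimal polynomial $X^3 + uX^2 + vX \pm 1$ of $\varepsilon$ satisfies
$$0 < -u = \varepsilon + \sigma(\varepsilon) + \overline{\sigma(\varepsilon)} < \varepsilon + \frac{2}{\sqrt{\varepsilon}} < 5$$
and
$$|v| = \left|\varepsilon\left(\sigma(\varepsilon) + \overline{\sigma(\varepsilon)}\right) + |\sigma(\varepsilon)|^2\right| \leq 2\sqrt{\varepsilon} + \frac{1}{\varepsilon} < 5.$$
Thus, $u \in [-4, -1]$ and $v \in [-4, 4]$. However, we also have $u = -3a$ and $v = 3(a^2 - 2bc)$. Thus, $u = -3$ and $v = \pm 3$, which yield $a = 1$, and $b = c = 1$ or $b = 0$ or $c = 0$. If $b = 0$, then $a^3 + 2b^3 + 4c^3 - 6abc = 1 + 4c^3$ so $c$ must also be 0, and if $c = 0$ then $a^3 + 2b^3 - 6abc = 1 + 2b^3$ so $b = 0$ or $b = -1$.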

Thus, we conclude that $\varepsilon = 1 + \sqrt[3]{2} + \sqrt[3]{4}$ as wanted, or $\varepsilon = 1 - \sqrt[3]{2}$. However, $1 - \sqrt[3]{2} < 1$ so we must be in the first case, as asserted. (In fact, it turns out that $1 - \sqrt[3]{2} = -\frac{1}{1 + \sqrt[3]{2} + \sqrt[3]{4}}$, which is perhaps a neater choice of fundamental unit.) □
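
As a sanity check on the solution above, here is a minimal Python sketch that multiplies elements of $\mathbb{Z}[\sqrt[3]{2}]$ and confirms that the first few powers of $\theta$ have norm 1. The triple representation and the names `mul`, `norm`, `theta_powers` are illustrative choices, not notation from the text, and checking a few powers is of course no substitute for the proof.

```python
# Elements of Z[t] with t^3 = 2 are stored as triples (x, y, z) = x + y*t + z*t^2.

def mul(u, v):
    """Multiply two elements, reducing with t^3 = 2 and t^4 = 2t."""
    a, b, c = u
    d, e, f = v
    return (a * d + 2 * (b * f + c * e),
            a * e + b * d + 2 * c * f,
            a * f + b * e + c * d)


def norm(u):
    """Norm form x^3 + 2y^3 + 4z^3 - 6xyz (see Problem 6.2.2)."""
    x, y, z = u
    return x**3 + 2 * y**3 + 4 * z**3 - 6 * x * y * z


THETA = (1, 1, 1)       # 1 + t + t^2
THETA_INV = (-1, 1, 0)  # t - 1, since (t - 1)(1 + t + t^2) = t^3 - 1 = 1


def theta_powers(count):
    """Return theta^0, ..., theta^(count - 1) as triples."""
    powers, current = [], (1, 0, 0)
    for _ in range(count):
        powers.append(current)
        current = mul(current, THETA)
    return powers


if __name__ == "__main__":
    for n, u in enumerate(theta_powers(6)):
        print(n, u, norm(u))  # every printed norm should be 1
```

Each printed triple $(x, y, z)$ is a solution of $x^3 + 2y^3 + 4z^3 = 6xyz + 1$; negative exponents can be explored by multiplying with `THETA_INV` instead.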

Exercise 7.5.19† (Weak Dirichlet's Unit Theorem). Let $K$ be a number field with $r$ real embeddings and $s$ pairs of complex embeddings. Prove that there exist units $\varepsilon_1, \ldots, \varepsilon_k$ with $k \leq r + s - 1$ such that any unit of $K$ can be written uniquely in the form
$$\zeta\varepsilon_1^{n_1} \cdot \ldots \cdot \varepsilon_k^{n_k}$$
for some integers $n_i$ and a root of unity $\zeta$.



Solution

Let $\sigma_1, \ldots, \sigma_r$ be the real embeddings of $K$, and $\sigma_{r+1}, \overline{\sigma_{r+1}}, \ldots, \sigma_{r+s}, \overline{\sigma_{r+s}}$ its pairs of complex embeddings and let $U$ be the group of units of $K$. We look at the logarithms of the embeddings of units:
$$L = \{(\log|\sigma_1(\varepsilon)|, \ldots, \log|\sigma_{r+s-1}(\varepsilon)|) \mid \varepsilon \in U\} \subseteq \mathbb{R}^{r+s-1}.$$
We claim that this set is a discrete additive subgroup of $\mathbb{R}^{r+s-1}$, meaning that it's closed under addition and subtraction, and, for any $x \in \mathbb{R}^{r+s-1}$, there is no sequence of distinct elements of $L$ tending to $x$. To show this, we will prove that, for any $A, B > 0$, there are finitely many units such that $A < |\sigma_i(\varepsilon)| < B$ for $i = 1, \ldots, r + s - 1$. Notice that such a number also satisfies
$$\frac{1}{B^{r+s-1}} < |\sigma_{r+s}(\varepsilon)| < \frac{1}{A^{r+s-1}}$$
since it is a unit (so the product of the $|\sigma_i(\varepsilon)|$ is 1). Thus, all the conjugates of $\varepsilon$ have bounded absolute value, which implies that its minimal polynomial has bounded coefficients. This shows that there are a finite number of such $\varepsilon$.

Next, we show that any discrete additive subgroup $\Gamma$ of $\mathbb{R}^m$ is a lattice, i.e. admits a linearly independent basis as a $\mathbb{Z}$-module, or in other words, there are $\alpha_1, \ldots, \alpha_k$ such that any element of $\Gamma$ can be written in a unique way as $\sum_{i=1}^k n_i\alpha_i$ with $n_i \in \mathbb{Z}$. Since $\mathbb{R}^m$ has dimension $m$ as an $\mathbb{R}$-vector space, this implies that $k \leq m$ by Proposition C.1.2. To show this, pick any maximal set of linearly independent elements $\beta_1, \ldots, \beta_k \in \Gamma$ and let $\Gamma' = \beta_1\mathbb{Z} + \ldots + \beta_k\mathbb{Z}$. We will prove that there are a finite number of elements in $\Gamma$ modulo $\Gamma'$, i.e. that $\Gamma/\Gamma'$ is finite, say has $N$ elements. Then, Lagrange's theorem 2.5.1 implies that $N\alpha \in \Gamma'$ for any $\alpha \in \Gamma$, i.e.
$$\Gamma \subseteq \frac{1}{N}\Gamma' = \frac{\beta_1}{N}\mathbb{Z} + \ldots + \frac{\beta_k}{N}\mathbb{Z}.$$
We can then conclude with Exercise 6.5.31† (so many intermediate results!) that $\Gamma$ also has a $\mathbb{Z}$-basis. Thus, it remains to prove that $\Gamma/\Gamma'$ is finite. For this, note that it suffices to prove that $\beta_1[0, 1] + \ldots + \beta_k[0, 1]$ contains finitely many elements of $\Gamma$, as this is a system of representatives of $\mathbb{R}^m/\Gamma'$. If there were infinitely many elements of $\Gamma$ there, $\Gamma$ would have a convergent subsequence by the Bolzano-Weierstrass theorem from 8.7.10†, contradicting its discreteness (we do not actually need BW since we directly showed that there were a finite number of elements of $L$ in any interval).

Finally, the previous discussion implies that $L$ has a basis corresponding to the image of $\varepsilon_1, \ldots, \varepsilon_k$ under the logarithmic embedding, for some $k \leq r + s - 1$. Thus, by raising everything to the exponential, we get that, for every unit $\varepsilon$, there are unique integers $n_1, \ldots, n_k$ such that the number
$$\frac{\varepsilon}{\varepsilon_1^{n_1} \cdot \ldots \cdot \varepsilon_k^{n_k}}$$
has all its conjugates on the unit circle. By Exercise 1.5.27†, this implies that it is a (unique) root of unity $\zeta$ as wanted. □

Remark 7.5.3
There is nothing particularly deep about the logarithm in this proof, apart from the fact that
it transforms multiplication into addition and that we feel more comfortable working with addi-
tion. We could of course transform our additive proof into a multiplicative one by removing the
logarithms and turning addition into multiplication.

Exercise 7.5.20† (Gabriel Dospinescu). Find all monic polynomials $f \in \mathbb{Q}[X]$ such that $f(X^n)$ is reducible in $\mathbb{Q}[X]$ for all $n \geq 2$ but $f$ is irreducible.

Solution

Let $\alpha$ be a root of $f$ and let $K$ be the splitting field of $f$, i.e. $\mathbb{Q}(\alpha_1, \ldots, \alpha_k)$ where the $\alpha_i$ are the roots of $f$ (the conjugates of $\alpha$). Note that the statement is equivalent to $f(X^p)$ being reducible for any prime $p$. In fact we only need this assumption for infinitely many primes. By Lemma 6.1.1, $f(X^p)$ is reducible over $\mathbb{Q}$ if and only if $X^p - \alpha$ is reducible over $\mathbb{Q}(\alpha)$, and by Exercise 6.5.9†, this is equivalent to $\alpha$ being a $p$th power in $\mathbb{Q}(\alpha)$, and thus in $K$ too.

By looking at the norm of $\alpha$ in $K$, we see that $\alpha$ must have norm 1 or 0 since its norm is a $p$th power in $\mathbb{Q}$ for infinitely many $p$. If $\alpha = 0$, then $f = X$ since it is irreducible, which works. Otherwise, $\alpha$ must be a unit. By Exercise 7.5.19†, there are multiplicatively independent units $\varepsilon_1, \ldots, \varepsilon_k \in K$ and integers $n_1, \ldots, n_k$ as well as a root of unity $\zeta$ such that
$$\alpha = \zeta\varepsilon_1^{n_1} \cdot \ldots \cdot \varepsilon_k^{n_k}.$$
Since $\varepsilon_1, \ldots, \varepsilon_k$ are multiplicatively independent, the fact that $\alpha$ is a $p$th power means that $p \mid n_i$ for every $i$. For sufficiently large $p$, we find $n_1 = \ldots = n_k = 0$. Thus, $\alpha = \zeta$ is a root of unity. However, since a primitive $m$th root of unity has degree $\varphi(m)$ over $\mathbb{Q}$, $K$ contains finitely many roots of unity ($\varphi(m)$ is greater than $[K : \mathbb{Q}]$ for sufficiently large $m$), which implies $\zeta = 1$ since it's a $p$th power for infinitely many primes $p$. Thus, we conclude that $\alpha = 1$. The two solutions are hence $f = X$ and $f = X - 1$, which indeed work. □

Miscellaneous
Exercise 7.5.21† (Liouville's Theorem). Let $\alpha$ be an algebraic number of degree $n$. Prove that there exists a constant $C > 0$ such that
$$\left|\alpha - \frac{p}{q}\right| > \frac{C}{q^n}$$
for any $p, q \in \mathbb{Z}$ (with $q > 0$).

Solution

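Let $\alpha_1, \ldots, \alpha_n$ be the conjugates of $\alpha$. We have
$$q^n\prod_{i=1}^n \left|\frac{p}{q} - \alpha_i\right| = \prod_{i=1}^n |p - q\alpha_i| \geq 1$$
since it's a non-zero integer. If $\left|\frac{p}{q} - \alpha\right| < 1$, then $\left|\frac{p}{q} - \alpha'\right| < 1 + |\alpha - \alpha'|$ for any conjugate $\alpha' \neq \alpha$. Thus, in this case we have
$$\left|\frac{p}{q} - \alpha\right|\prod_{i=1}^n (1 + |\alpha - \alpha_i|) \geq \frac{1}{q^n}$$
as wanted ($C = \frac{1}{\prod_{i=1}^n (1 + |\alpha - \alpha_i|)}$). Otherwise, we have $\left|\frac{p}{q} - \alpha\right| \geq 1 > \frac{C}{q^n}$ too. □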

Exercise 7.5.22†. Prove that $5n^2 \pm 4$ is a perfect square for some choice of $\pm$ if and only if $n$ is a Fibonacci number.

Solution
Simply note that $\frac{1+\sqrt{5}}{2}$ is the fundamental unit of $\mathbb{Q}(\sqrt{5})$, and that the solutions to the equation $x^2 - 5y^2 = \pm 1$ for $x \equiv y \pmod 1$ half-integers, i.e. the rational integer solutions to the equation $(2x)^2 - 5(2y)^2 = \pm 4$, are thus
$$2y = \frac{\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n}{\sqrt{5}} = F_n.$$
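
The criterion is easy to test by brute force. The following Python sketch (function names are illustrative) compares "is a Fibonacci number" with "$5n^2 + 4$ or $5n^2 - 4$ is a perfect square" for all $n$ up to a bound.

```python
from math import isqrt


def is_square(m):
    """True when m is a perfect square (negative numbers are not)."""
    return m >= 0 and isqrt(m) ** 2 == m


def fibonacci_numbers(limit):
    """Set of all Fibonacci numbers F_0, F_1, ... not exceeding limit."""
    fibs, a, b = {0}, 0, 1
    while b <= limit:
        fibs.add(b)
        a, b = b, a + b
    return fibs


def criterion_holds_up_to(limit):
    """Check the equivalence of the exercise for every 0 <= n <= limit."""
    fibs = fibonacci_numbers(limit)
    return all(
        (is_square(5 * n * n + 4) or is_square(5 * n * n - 4)) == (n in fibs)
        for n in range(limit + 1)
    )


if __name__ == "__main__":
    print(criterion_holds_up_to(10_000))
```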


Exercise 7.5.23† (ELMO 2020). Suppose n is a Fibonacci number modulo every rational prime.
Must it follow that n is a Fibonacci number?

Solution

By Exercise 7.5.22†, the statement means that, for every $p$, $5n^2 + 4$ or $5n^2 - 4$ is a quadratic residue (or zero). This implies that $5n^2 + 4$ or $5n^2 - 4$ is a perfect square by an argument similar to Exercise 4.6.20†, i.e. $n$ is a Fibonacci number too. Indeed, if, modulo sufficiently large primes, one of $a$ and $b$ is a quadratic residue, then one of them must be a square (this is not true anymore with 3 numbers, see Exercise 4.6.22†). By Exercise 4.6.20†, we may assume that $a \neq b$.

Suppose without loss of generality that $a$ and $b$ are squarefree. Write $a = \varepsilon 2^r p_1 \cdot \ldots \cdot p_k$ and $b = \eta 2^s q_1 \cdot \ldots \cdot q_m$ with $\varepsilon, \eta \in \{-1, 1\}$, $r, s \in \{0, 1\}$, and $p_1, \ldots, p_k, q_1, \ldots, q_m$ odd primes. Let $t$ be a quadratic non-residue modulo $p_1$ (if $k \geq 1$). If $a$ and $b$ are both divisible by an odd prime, say $p_1 = q_1$, then pick a large prime
$$p \equiv 1 \pmod{8p_2 \cdot \ldots \cdot p_k q_2 \cdot \ldots \cdot q_m}$$
and $p \equiv t \pmod{p_1}$. Then, quadratic reciprocity gives us $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right) = -1$ which is a contradiction.

Suppose for the sake of a contradiction that $k, m \geq 1$. Then, pick a large prime
$$p \equiv 1 \pmod{8p_2 \cdot \ldots \cdot p_k q_2 \cdot \ldots \cdot q_m},$$
$p \equiv t \pmod{p_1}$ and $p \equiv t' \pmod{q_1}$ where $t'$ is a quadratic non-residue modulo $q_1$ to get $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right) = -1$. Thus, suppose without loss of generality that $m = 0$. If $k \geq 1$, pick a large prime $p \equiv 8 \pmod{p_1 \cdot \ldots \cdot p_k}$ and $p \equiv t \pmod{p_1}$ to get $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right) = -1$ again.

Finally, we have $k = m = 0$ so $a, b \in \{1, -1, 2, -2\}$. It remains to show that $\{a, b\} = \{2, -2\}$, $\{a, b\} = \{-1, -2\}$ and $\{a, b\} = \{-1, 2\}$ are all impossible. For the first, note that they are both quadratic non-residues modulo $p \equiv 5 \pmod 8$, for the second, note that they are both quadratic non-residues modulo $p \equiv -1 \pmod 8$, and for the last note that they are both quadratic non-residues modulo $p \equiv 3 \pmod 8$.
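
The three residue classes modulo 8 used in the last step can be checked numerically with Euler's criterion. This is only a sketch over small primes (the helper names and the bound are arbitrary choices), not a replacement for the supplementary laws of quadratic reciprocity.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_nonresidue(a, p):
    """Euler's criterion: a is a quadratic non-residue mod the odd prime p."""
    return pow(a % p, (p - 1) // 2, p) == p - 1


# Residue class of p modulo 8 -> pair {a, b} claimed to consist of non-residues.
CASES = {5: (2, -2), 7: (-1, -2), 3: (-1, 2)}


def check_cases(bound):
    """Verify the claim for every prime p < bound in the three classes."""
    for p in range(3, bound):
        if is_prime(p) and p % 8 in CASES:
            a, b = CASES[p % 8]
            if not (is_nonresidue(a, p) and is_nonresidue(b, p)):
                return False
    return True


if __name__ == "__main__":
    print(check_cases(2000))
```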

As a final remark, as in Exercise 4.6.20† , we may avoid Dirichlet’s theorem on primes in arithmetic
progressions with Jacobi’s quadratic reciprocity law (by picking any rational integer p ≡ u
(mod v) with sufficiently large prime factors instead of a prime). 

Exercise 7.5.24† (Nagell, Ko-Chao, Chein). Let $p$ be an odd rational prime. Suppose that $x, y \in \mathbb{Z}$ are rational integers such that $x^2 - y^p = 1$. Prove that $2 \mid y$ and $p \mid x$. Deduce that this equation has no solution for $p \geq 5$. (The case $p = 3$ is Exercise 8.7.19†.)

Solution

If $y$ is odd, then the two factors of $y^p = (x - 1)(x + 1)$ are coprime so $x - 1$ and $x + 1$ are $p$th powers. This is impossible, as there are no $p$th powers distant by 2: $(m + 1)^p - m^p \geq p + 1$. Now, suppose for the sake of a contradiction that $p \nmid x$. Then, the two factors of $x^2 = (y + 1) \cdot \frac{y^p + 1}{y + 1}$ are coprime. Indeed, this is a product of cyclotomic polynomials, but it can also be seen more elementarily: $\frac{y^p + 1}{y + 1} \equiv p \pmod{y + 1}$. This implies that $y + 1 = a^2$ and $y^p + 1 = b^2$. Now consider the Pell equation $u^2 - yv^2 = 1$. We have two solutions: $(u, v) = (a, 1)$ and $(u, v) = (b, y^{\frac{p-1}{2}})$. Notice that, for both of them, $v$ is a $y$-unit. By Størmer's theorem, this implies that they are both the fundamental solution, which is impossible.

Thus, $p \mid x$ and $2 \mid y$. Without loss of generality, suppose that $x + 1 = 2^{p-1}a^p$ and $x - 1 = 2b^p$ (by replacing $x$ by $-x$ if necessary). Since $|x| > 1$, $a$ and $b$ have the same sign, and $|a| < |b|$. The key (magical?) point is that
$$b^{2p} + (2a)^p = \left(\frac{x - 1}{2}\right)^2 + 2(x + 1) = \left(\frac{x + 3}{2}\right)^2.$$
For $p \neq 3$, this is not divisible by $p$ since $p \mid x$, so $b^2 + 2a$ and $\frac{b^{2p} + (2a)^p}{b^2 + 2a}$ are perfect squares. However,
$$b^2 < b^2 + 2a < (b + 1)^2$$
if $a$ and $b$ are positive, and
$$(b - 1)^2 < b^2 + 2a < b^2$$
if $a$ and $b$ are negative. In all cases, we have reached a contradiction. □

Exercise 7.5.25†. Prove that there are at most $3^{|S|}$ pairs of $S$-units distant by 2.

Solution

If $u - v = 2$, then $(v + 1)^2 - uv = 1$. We let $\operatorname{rad}(uv) \mid d$ be minimal such that $uv/d$ is a square. There are $3^{|S|}$ possible $d$. As before, any $u - v = 2$ gives rise to a solution to the Pell equation $x^2 - dy^2 = 1$ for some $d$-unit $y$, which must thus be the minimal unit by Proposition 7.3.1. Thus, there are also at most $3^{|S|}$ pairs of $S$-units distant by 2. □
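
To get a feel for the bound, here is a small Python experiment for $S = \{2, 3\}$. It assumes that "$S$-unit" means a positive integer all of whose prime factors lie in $S$, and it only searches below a finite limit, so it illustrates the statement rather than proving anything.

```python
def s_units(primes, limit):
    """All positive integers <= limit whose prime factors lie in `primes`."""
    units = {1}
    for p in primes:
        extended = set()
        for u in units:
            while u <= limit:
                extended.add(u)
                u *= p
        units = extended
    return units


def pairs_distant_by_two(primes, limit):
    """Sorted list of pairs (u, v) of S-units with u - v = 2 and u <= limit."""
    units = s_units(primes, limit)
    return sorted((u, u - 2) for u in units if u - 2 in units)


if __name__ == "__main__":
    found = pairs_distant_by_two([2, 3], 10**6)
    print(found, len(found), "<=", 3**2)
```

For $S = \{2, 3\}$ the search returns five pairs, comfortably below $3^{|S|} = 9$.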

Exercise 7.5.26† . Assuming the finiteness of rational solutions to the S-unit equation u + v = 1 for
any finite S, determine all functions f : Z → Z such that m − n | f (m) − f (n) for any m, n and f is a
bijection modulo sufficiently large primes.

Solution

Let S be the set of primes p for which f is not a bijection modulo p or p = 2. By assumption,
f (n + 1) − f (n), f (n + 2) − f (n + 1), and f (n + 2) − f (n) are all S-units. Thus, we have a solution

(f (n + 2) − f (n + 1)) + (f (n + 1) − f (n)) = f (n + 2) − f (n)

to the $S$-unit equation. There are a finite number of solutions to this equation (up to scaling), so we get that $\frac{f(n+2) - f(n+1)}{f(n+1) - f(n)}$ is in a finite set $U$. Now, pick a large prime $p \notin S$ such that $|U \pmod p| = |U|$ and let $a \in \mathbb{Z}$. Since $\frac{f(n+2) - f(n+1)}{f(n+1) - f(n)}$ and $\frac{f(n+ap+2) - f(n+ap+1)}{f(n+1+ap) - f(n+ap)}$ are congruent modulo $p$ and in $U$, they must be equal. By picking another sufficiently large prime $q \neq p$ and $b \in \mathbb{Z}$ such that $ap + bq = 1$, we get
$$\frac{f(n+2) - f(n+1)}{f(n+1) - f(n)} = \frac{f(n+ap+bq+2) - f(n+ap+bq+1)}{f(n+1+ap+bq) - f(n+ap+bq)} = \frac{f(n+3) - f(n+2)}{f(n+2) - f(n+1)}$$
which means that the quotient $\frac{f(n+2) - f(n+1)}{f(n+1) - f(n)}$ is in fact constant, say equal to $r$. Then, $f$ satisfies the following linear recurrence: $f(n+2) = f(n+1) + r(f(n+1) - f(n)) = sf(n+1) - f(n)$ which, unless $s = 2$ which implies that the characteristic polynomial has a double root, reduces to $f(n) = u\alpha^n + v\beta^n$ for some conjugate quadratic integers $\alpha, \beta$. But then, if $p \neq 2$ is such that the characteristic polynomial $X^2 - sX + 1$ splits modulo $p$, we get
$$f(n) \equiv u_p\alpha_p^n + v_p\beta_p^n \pmod p$$
for some $u_p, v_p, \alpha_p, \beta_p \in \mathbb{F}_p$, so that $f(p - 1) \equiv u_p + v_p \equiv f(0) \pmod p$. This contradicts the bijectivity of $f$ modulo $p$. Thus, $s = 2$, which gives $f(n) = un + v$ for some $u, v \in \mathbb{Z}$. Conversely, it is clear that arithmetic progressions work. □

Remark 7.5.4
If $f$ is taken to be $|g|_p$ for some $g$, which is usually how the theorem is used in $p$-adic analysis, there is in fact a simpler argument. Since the distance of $\mathbb{Q}_p$ is almost discrete, i.e. the values that it reaches $0, \ldots, 1/p^2, 1/p, 1, p, p^2, \ldots$ are all isolated except 0, we get the stronger conclusion
that f has a maximum if and only if it is bounded above, and it has a minimum if 0 ∈ im f or if
it is bounded below by a positive number.
Chapter 8

p-adic Analysis

8.1 p-adic Integers and Numbers


Exercise 8.1.1∗ . Check that Zp is an integral domain. What is its characteristic?

Solution

$ab = 0$ means $a_ib_i = 0$ for all $i$, where $a = (a_1, a_2, a_3, \ldots)$ and $b = (b_1, b_2, b_3, \ldots)$. Suppose that $a$ is non-zero, and let $k$ be such that $a_k \neq 0$. Then, $v_p(a_i) = v_p(a_k)$ for $i \geq k$ since $a_i \equiv a_k \pmod{p^k}$. Thus, for $i \geq k$, we have $v_p(b_i) \geq i - v_p(a_k)$. Hence, the coordinates of $b$ have arbitrarily large $p$-adic valuation which means that they are all zero by compatibility: if $v_p(b_i) \geq N$ and $i \geq N$, then $b_N \equiv b_i \equiv 0 \pmod{p^N}$.

$\mathbb{Z}_p$ has characteristic zero since $(n, n, n, \ldots)$ is zero only when $n = 0$, otherwise $n$ has a finite $v_p$ and thus a non-zero coordinate too. □

Exercise 8.1.2∗. Check that $a \mapsto (a \pmod p, a \pmod{p^2}, a \pmod{p^3}, \ldots)$ is indeed an embedding of $\mathbb{Z}_{(p)}$ into $\mathbb{Z}_p$, i.e. that it's injective.

Solution

It is clearly additive and multiplicative, and it is injective since the kernel is trivial: if a is
non-zero then it has a non-zero vp so a non-zero component under this embedding too. 

Exercise 8.2.1∗ . Convince yourself of this proof.

Solution

Another way to write this proof is to define $b_k$ as
$$\sum_{i=0}^{\infty} a_i \pmod{p^k} \equiv \sum_{i=0}^{|a_i|_p < p^{-k}} a_i.$$
This sequence is Cauchy since $|b_i - b_j|_p < p^{-\min(i,j)}$ by the strong triangle inequality and clearly converges to $\sum_{i=0}^{\infty} a_i$ by the strong triangle inequality again. □


8.2 p-adic Absolute Value


Exercise 8.2.2∗. Prove that the strong triangle inequality also holds for series: if $a_i \to 0$ then $\left|\sum_i a_i\right|_p \leq \max_i |a_i|_p$ with equality if the maximum is achieved only once.

Solution

We have
$$\left|\sum_{i=1}^n a_i\right|_p \leq \max_{1 \leq i \leq n} |a_i|_p \leq \max_i |a_i|_p$$
for all $n$ and this yields the wanted inequality by taking the limit as $n \to \infty$. For the equality part, just note that when the maximum is achieved only once, we have equality when $n$ is sufficiently large so taking the limit yields the equality again. □

Exercise 8.2.3∗ . Prove the product formula.

Solution

This is a consequence of the prime factorisation:
$$\prod_p |x|_p = \prod_p p^{-v_p(x)} = \frac{1}{|x|_\infty}.$$
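
The product formula can be checked on examples with exact rational arithmetic. In the Python sketch below, the helper names are illustrative; only the finitely many primes dividing the numerator or denominator contribute a factor different from 1.

```python
from fractions import Fraction


def vp(x, p):
    """p-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v


def prime_factors(n):
    """Set of primes dividing the positive integer n."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def product_of_absolute_values(x):
    """|x|_infinity times the product of |x|_p over all primes p."""
    x = Fraction(x)
    primes = prime_factors(abs(x.numerator)) | prime_factors(x.denominator)
    product = abs(x)  # Archimedean absolute value
    for p in primes:
        product *= Fraction(p) ** (-vp(x, p))  # |x|_p = p^(-v_p(x))
    return product


if __name__ == "__main__":
    print(product_of_absolute_values(Fraction(-360, 49)))
```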

8.3 Binomial Series


Exercise 8.3.1∗ . Prove that Q is dense in Qp .

Solution

This is a consequence of the density of $\mathbb{Z}$ in $\mathbb{Z}_p$: if $\alpha$ is an element of $\mathbb{Q}_p$ then write $\alpha = p^ka$ with $a \in \mathbb{Z}_p$. There is a sequence of rational integers approaching $a$, and multiplying this sequence by $p^k$ yields a sequence of rational numbers approaching $\alpha$. □

Exercise 8.3.2∗ . Let f ∈ Qp [X] be a polynomial. Prove that f is continuous on Qp .

Solution

The proof is the same as in $\mathbb{R}$: if $\varepsilon$ is very small then
$$(x + \varepsilon)^n - x^n = \varepsilon\sum_{k=1}^n \binom{n}{k}\varepsilon^{k-1}x^{n-k}$$
is also very small by the triangular inequality (it is in fact even neater in $\mathbb{Q}_p$ since we have the strong triangle inequality to bound the second factor). □

Exercise 8.3.3∗. Let $f : \mathbb{Z}_p \to \mathbb{Q}_p$ be a continuous function. If $|f(x)|_p \leq 1$ for any $x$ in a dense subset (in $\mathbb{Z}_p$), prove that $|f(x)|_p \leq 1$ for any $x \in \mathbb{Z}_p$.

Solution

Let $x \in \mathbb{Z}_p$ be a $p$-adic integer and let $a_1, a_2, \ldots$ be a sequence of elements of that dense subset approaching $x$. By the triangular inequality, we have

|f (x)|p − 1 ≤ |f (x)|p − |f (an )|p ≤ |f (x) − f (an )|p → 0

which means that |f (x)|p ≤ 1 as wanted. 

Exercise 8.3.4∗. Prove that, if $p > 5$ is a rational prime, $p^2 \mid \sum_{k=1}^{p-1} \frac{1}{k^3}$ and $p \mid \sum_{k=1}^{p-1} \frac{1}{k^4}$.

Solution

Note that
$$\frac{1}{k^3} + \frac{1}{(p-k)^3} = \frac{k^3 + (p-k)^3}{(k(p-k))^3} \equiv \frac{3k^2p}{(k(p-k))^3} \pmod{p^2}$$
so we need to prove that $p \mid \sum_{k=1}^{\frac{p-1}{2}} \frac{k^2}{(k(p-k))^3}$. Since $p$ is odd and $\frac{k^2}{(k(p-k))^3} \equiv \frac{(p-k)^2}{((p-k)k)^3}$, this is equivalent to
$$p \mid \sum_{k=1}^{p-1} \frac{k^2}{(k(p-k))^3}.$$
Since
$$\frac{k^2}{(k(p-k))^3} \equiv \frac{k^2}{(-k^2)^3} \equiv -k^{-4},$$
we need to show that $\sum_{k=1}^{p-1} k^{-4} \equiv 0 \pmod p$. Let $\omega$ be a primitive root modulo $p$. Then,
$$\sum_{k=1}^{p-1} k^{-4} \equiv \sum_{k=1}^{p-1} \omega^{-4k} \equiv \frac{\omega^{-4(p-1)} - 1}{\omega^{-4} - 1} \equiv 0$$
since the numerator is zero and the denominator is non-zero as $p > 5$. Note that this is also the second claim. □
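
Both divisibility claims can be confirmed for small primes with exact fractions. The sketch below (illustrative function names) reduces each sum to lowest terms and inspects the numerator; it also shows that the second claim fails for $p = 5$, so the hypothesis $p > 5$ is needed there.

```python
from fractions import Fraction


def power_sum(p, e):
    """Exact value of sum_{k=1}^{p-1} 1/k^e as a reduced fraction."""
    return sum(Fraction(1, k**e) for k in range(1, p))


def claims_hold(p):
    """p^2 divides the numerator of sum 1/k^3 and p divides that of sum 1/k^4."""
    cubes_ok = power_sum(p, 3).numerator % (p * p) == 0
    fourths_ok = power_sum(p, 4).numerator % p == 0
    return cubes_ok and fourths_ok


if __name__ == "__main__":
    for p in (5, 7, 11, 13, 17, 19, 23):
        print(p, claims_hold(p))
```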

Exercise 8.3.5∗ . Prove Proposition 8.3.4.

Solution

If we consider only the nth coordinate of these series, then, since ai,j → 0, the series become
finite sums (the nth coordinate of ai,j is zero for sufficiently large i + j). In particular, both sides
are equal. Letting n go to infinity, this shows both that the series converge and that they are
equal. 

Exercise 8.3.6∗. Let $n \in \mathbb{N}$ be a positive rational integer and $p$ be a prime number. Prove that
$$v_p(n!) = \sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor.$$

Solution
There are $\left\lfloor \frac{n}{p} \right\rfloor$ numbers in $[n]$ which are divisible by $p$, which explains this term in the sum. However, their contribution to the total $p$-adic valuation might not always be one: some of these numbers are divisible by $p^2$ too. Hence we add $\left\lfloor \frac{n}{p^2} \right\rfloor$ to account for them too (combining with $\left\lfloor \frac{n}{p} \right\rfloor$ this constitutes a contribution of 2 for the multiples of $p^2$). Then we take into account the contribution of multiples of $p^3$ with $\left\lfloor \frac{n}{p^3} \right\rfloor$, then the multiples of $p^4$, etc. □
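
The counting argument translates directly into code. The following Python sketch (names are illustrative) computes the sum of floors and compares it with the valuation obtained by factoring $n!$ directly.

```python
from math import factorial


def legendre(n, p):
    """v_p(n!) via Legendre's formula: sum of floor(n / p^k) over k >= 1."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total


def vp_int(m, p):
    """p-adic valuation of a non-zero integer m, by repeated division."""
    v = 0
    while m % p == 0:
        m //= p
        v += 1
    return v


if __name__ == "__main__":
    print(legendre(100, 5), vp_int(factorial(100), 5))
```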

Exercise 8.3.7∗ . Prove Corollary 8.3.1.

Solution

Since $(1 + u)^x = \sum_k \binom{x}{k}u^k$, we need to prove that $u^k/k! \to 0$ for $|u|_p < p^{-1/(p-1)}$ by Proposition 8.3.3. Proposition 8.3.5 gives us $|k!|_p \geq p^{-k/(p-1)}$ so that
$$|u^k/k!|_p = |u|_p^k/|k!|_p \leq \left(\frac{|u|_p}{p^{-1/(p-1)}}\right)^k \to 0.$$

8.4 Analytic Functions


Exercise 8.4.1∗ . Prove that locally analytic functions are continuous.

Solution

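Let $\alpha \in \mathbb{Z}_p$ be a $p$-adic integer. We prove that, if $f$ is analytic at $\alpha$, it is continuous at $\alpha$. Write $f(x) = \sum_{i=0}^{\infty} a_i(x - \alpha)^i$ around $\alpha$. Then,
$$|f(x) - f(\alpha)|_p = \left|\sum_{i=1}^{\infty} a_i(x - \alpha)^i\right|_p \leq \max_{i \geq 1} \left|a_i(x - \alpha)^i\right|_p \leq C|x - \alpha|_p$$
for $|x - \alpha|_p$ sufficiently small and some constant $C$. Indeed, the series converges when $|x - \alpha|_p \leq \varepsilon$ for some $\varepsilon > 0$ so $a_i\varepsilon^i \to 0$ which implies
$$\max_{i \geq 1} \left|a_i(x - \alpha)^i\right|_p = \max_{i \geq 1} \left|a_i\varepsilon^i\left(\frac{x - \alpha}{\varepsilon}\right)^i\right|_p \leq \left|\frac{x - \alpha}{\varepsilon}\right|_p \max_{i \geq 1} |a_i\varepsilon^i|_p.$$
Thus, when $x \to \alpha$ we also have $f(x) \to f(\alpha)$ as wanted. □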

Exercise 8.4.2∗ . Prove that the sum and product of two locally analytic functions is again a locally
analytic function.

Solution

It is clear that the sum of two analytic functions is analytic, thus we need to prove that the product of two analytic functions is also analytic. Let $f$ and $g$ be two functions analytic at $\alpha \in \mathbb{Z}_p$. Write $f(x) = \sum_{i=0}^{\infty} a_i(x - \alpha)^i$ and $g(x) = \sum_{j=0}^{\infty} b_j(x - \alpha)^j$ for $|x - \alpha|_p \leq \varepsilon$. Then,
$$f(x)g(x) = \sum_{i=0}^{\infty} a_i(x - \alpha)^i \sum_{j=0}^{\infty} b_j(x - \alpha)^j = \sum_k \sum_{i+j=k} a_ib_j(x - \alpha)^k.$$
We are allowed to do this expansion by Proposition 8.3.4 since
$$|a_ib_j(x - \alpha)^{i+j}|_p \leq \max(|a_i|_p, |b_j|_p)\varepsilon^{i+j} \to 0$$
when $i + j \to \infty$. □

Exercise 8.4.3∗ . Prove that polynomials are locally analytic (everywhere).

Solution

This is Proposition 5.3.1. 

Exercise 8.4.4∗ . Prove Proposition 8.4.2.

Solution

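Since $|\alpha^kx^na_{k+n}|_p \leq \max(|\alpha|_p, |x|_p)^{k+n}|a_{k+n}|_p \to 0$ when $x \in \mathbb{Z}_p$, Proposition 8.3.4 gives us
$$\begin{aligned}
f(x) &= \sum_k a_k(\alpha + (x - \alpha))^k \\
&= \sum_k \sum_n a_k\binom{k}{n}\alpha^{k-n}(x - \alpha)^n \\
&= \sum_n \sum_k a_k\binom{k}{n}\alpha^{k-n}(x - \alpha)^n \\
&= \sum_n (x - \alpha)^n \sum_k a_k\binom{k}{n}\alpha^{k-n} \\
&= \sum_n (x - \alpha)^n \sum_k a_{k+n}\binom{k+n}{n}\alpha^{k} \\
&= \sum_n \frac{f^{(n)}(\alpha)}{n!}(x - \alpha)^n
\end{aligned}$$
as wanted. □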

8.5 The Skolem-Mahler-Lech Theorem


Exercise 8.5.1∗ . Convince yourself of this proof.

Solution

Not much I can say here. 

Exercise 8.5.2∗ . Do you think this proof could be formulated without appealing to p-adic analysis?

Solution

As said before, there is no known proof which doesn't use $p$-adic ideas. However, one could phrase the proof without mentioning $p$-adic numbers by looking at partial sums of our analytic functions modulo powers of $p$. See Block ?? for an example. □

Exercise 8.5.3∗ . Prove that any number field has a finite number N of roots of unity, and that
ω N = 1 for any root of unity ω of K. (In other words, the roots of unity of K are exactly the N th
roots of unity.)

Solution

Since $\varphi(n) \to \infty$ and a primitive $n$th root of unity has degree $\varphi(n)$ over $\mathbb{Q}$, $K$ contains finitely many roots of unity.

One way to finish from this is to say that the roots of unity of $K$ form a subgroup of the multiplicative group of $n$th roots of unity for some $n$. Since this is a cyclic group, any of its subgroups is also cyclic, and in particular the group of roots of unity of $K$.

Another way to finish from the first observation is to pick a root of unity $\omega \in K$ of maximal order $N$. Then, if $\zeta$ is another root of unity of $K$, say of order $n$, $\mathbb{Q}(\omega, \zeta) \subseteq K$ contains a root of unity of order $\operatorname{lcm}(N, n)$ by Problem 6.3.2 which implies that $n$ divides $N$ by maximality of $N$. □

8.6 Strassmann’s Theorem



Exercise 8.6.1. Prove that $\mathbb{Q}(\sqrt{-7})$ is norm-Euclidean. (This is also Exercise 2.6.4†.)

Solution
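Let $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{-7})} = \mathbb{Z}\left[\frac{1+\sqrt{-7}}{2}\right]$ be quadratic integers with $\beta \neq 0$. Write $\frac{\alpha}{\beta} = x + y\sqrt{-7}$ with $x, y \in \mathbb{Q}$. Pick a half-integer $n$ such that $|y - n| \leq \frac{1}{4}$, and a half-integer $m \equiv n \pmod 1$ such that $|x - m| \leq \frac{1}{2}$. Then,
$$|N((x - m) + (y - n)\sqrt{-7})| \leq \left(\frac{1}{2}\right)^2 + 7\left(\frac{1}{4}\right)^2 = \frac{11}{16} < 1.$$
Thus, the remainder $\tau = \beta((x - m) + (y - n)\sqrt{-7})$ works since it has norm less than $|N(\beta)|$ by the previous computation and $\alpha = \beta(m + n\sqrt{-7}) + \tau$. □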
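
The division procedure of this solution can be turned into a short program. The Python sketch below stores $x + y\sqrt{-7}$ as a pair of exact fractions `(x, y)`; the function names are illustrative, and the rounding follows the recipe above (first $n$, the nearest half-integer to $y$, then $m \equiv n \pmod 1$ nearest to $x$).

```python
from fractions import Fraction


def mul(u, v):
    """(x1 + y1*s)(x2 + y2*s) with s^2 = -7."""
    return (u[0] * v[0] - 7 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def norm(u):
    """Norm x^2 + 7y^2 of x + y*s."""
    return u[0] ** 2 + 7 * u[1] ** 2


def in_ring(u):
    """x + y*s lies in Z[(1 + s)/2] iff 2x, 2y and x - y are integers."""
    x, y = Fraction(u[0]), Fraction(u[1])
    return ((2 * x).denominator == 1 and (2 * y).denominator == 1
            and (x - y).denominator == 1)


def euclid_div(alpha, beta):
    """Return (q, r) with alpha = beta*q + r and norm(r) < norm(beta)."""
    alpha = tuple(map(Fraction, alpha))
    beta = tuple(map(Fraction, beta))
    num = mul(alpha, (beta[0], -beta[1]))     # alpha * conjugate(beta)
    x, y = num[0] / norm(beta), num[1] / norm(beta)
    n = Fraction(round(2 * y), 2)             # |y - n| <= 1/4
    m = n + round(x - n)                      # m = n (mod 1), |x - m| <= 1/2
    q = (m, n)
    bq = mul(beta, q)
    return q, (alpha[0] - bq[0], alpha[1] - bq[1])


if __name__ == "__main__":
    print(euclid_div((7, 3), (Fraction(1, 2), Fraction(1, 2))))
```

The bound $\frac{11}{16} < 1$ from the solution is what guarantees that the remainder always has strictly smaller norm than the divisor.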

Exercise 8.6.2. Prove that, if $x^2 + 7 = 2^n$, then $\frac{x \pm \sqrt{-7}}{2} = \left(\frac{1+\sqrt{-7}}{2}\right)^{n-2}$ for some choice of $\pm$.

Solution

By the uniqueness of the prime factorisation in $\mathbb{Q}(\sqrt{-7})$ (Exercise 8.6.1), we have $\frac{x+\sqrt{-7}}{2} = \alpha^k\beta^m$. Since the LHS is not divisible by 2, this means $\min(k, m) = 0$ as otherwise $2 = \alpha\beta$ divides the LHS. □

Exercise 8.6.3∗. Compute the Strassmann bounds for the function $s \mapsto (\alpha - \beta)(u_{r+10s} \pm 1)$, for each $r \in \{0, 1, \ldots, 9\}$. (If you do not want to do it all by hand, you may use a computer. In any case, it is better to do it to have a feel for why it works because it's very cool.)

Solution

Modulo 11, we can see that

αr − β r ± (α − β) ≡ 5r − 7r ∓ 2 (mod 11)

can be zero only for r ∈ {1, 2, 3, 5}, which means that in the other cases the Strassmann bound
is 0. Now, let’s study the second coefficient:

aαr − bβ r ≡ 99 · 5r − 77 · 7r .

When r = 1, this is
99 · 16 − 77 · 7 ≡ 77 (mod 112 )
so the Strassmann bound is 1 since all other coefficients are divisible by 112 . Similarly, when
r = 2 this is 33 6≡ 0, and when r = 5 this is 55 6≡ 0.

Finally, we need to treat the case r = 3. This time, the second coefficient is divisible by 112 so
we need to consider the third one:

$$\begin{aligned}
(\alpha - \beta)(u_{r+10s} \pm 1) &= \alpha^r(1 + a)^s - \beta^r(1 + b)^s \pm (\alpha - \beta) \\
&= \alpha^r\sum_k \binom{s}{k}a^k - \beta^r\sum_k \binom{s}{k}b^k \pm (\alpha - \beta) \\
&\equiv \alpha^r\left(1 + as + a^2\frac{s(s-1)}{2}\right) - \beta^r\left(1 + bs + b^2\frac{s(s-1)}{2}\right) \pm (\alpha - \beta) \pmod{11^3}.
\end{aligned}$$
The coefficient of $s^2$ is $\frac{a^2\alpha^r - b^2\beta^r}{2}$. Since we are now working modulo $11^3$, we need to compute $a$ and $b$ modulo $11^3$. For this, we also need to compute $\alpha$ and $\beta$ modulo $11^3$, but afterwards we can return to their values modulo 11 since $11^2x \equiv 11^2y \pmod{11^3}$ if and only if $x \equiv y \pmod{11}$. With the help of Hensel's lemma, we find $\alpha \equiv 137$ and $\beta \equiv 1195$. This yields $a \equiv 1188$ and $b \equiv 198$. Finally, $a^2\alpha^r - b^2\beta^r$ is

$$1188^2 \cdot 5^3 - 198^2 \cdot 7^3 \equiv 726 \not\equiv 0 \pmod{11^3}$$

so the Strassmann bound is 3 as claimed. 
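
Since the exercise invites a computer check, here is a Python sketch reproducing the numbers quoted in this solution. It assumes, as the expansion above suggests, that $a = \alpha^{10} - 1$ and $b = \beta^{10} - 1$, with $\alpha \equiv 137$ and $\beta \equiv 1195 \pmod{11^3}$ the two roots of $X^2 - X + 2$; the variable and function names are illustrative.

```python
M = 11**3
ALPHA, BETA = 137, 1195        # roots of X^2 - X + 2 modulo 11^3
A = (pow(ALPHA, 10, M) - 1) % M  # alpha^10 = 1 + a
B = (pow(BETA, 10, M) - 1) % M   # beta^10 = 1 + b


def first_coefficient_can_vanish(r):
    """True if alpha^r - beta^r +/- (alpha - beta) is 0 mod 11 for some sign."""
    diff = pow(ALPHA, r, 11) - pow(BETA, r, 11)
    return any((diff + sign * (ALPHA - BETA)) % 11 == 0 for sign in (1, -1))


def second_coefficient(r):
    """a*alpha^r - b*beta^r modulo 11^2."""
    return (A * pow(ALPHA, r) - B * pow(BETA, r)) % 11**2


def third_coefficient_numerator(r):
    """a^2*alpha^r - b^2*beta^r modulo 11^3."""
    return (A * A * pow(ALPHA, r) - B * B * pow(BETA, r)) % M


if __name__ == "__main__":
    print("a, b =", A, B)
    print([r for r in range(10) if first_coefficient_can_vanish(r)])
    print({r: second_coefficient(r) for r in (1, 2, 3, 5)})
    print(third_coefficient_numerator(3))
```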

Exercise 8.6.4. Prove that 3, 4, 5, 7, 15 are indeed solutions to the given equation. (You may use a
computer for n = 15.)

Solution

We have
• $1^2 + 7 = 8 = 2^3$.
• $3^2 + 7 = 16 = 2^4$.
• $5^2 + 7 = 32 = 2^5$.
• $11^2 + 7 = 128 = 2^7$.
• $181^2 + 7 = 32768 = 2^{15}$.
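
For the computer check suggested in the exercise, a direct search is enough. This Python sketch (the function name and the search bound are arbitrary) lists all $n$ up to a bound for which $2^n - 7$ is a perfect square.

```python
from math import isqrt


def square_solutions(max_n):
    """Pairs (x, n) with 3 <= n <= max_n, x >= 0 and x^2 + 7 = 2^n."""
    found = []
    for n in range(3, max_n + 1):
        target = 2**n - 7
        x = isqrt(target)
        if x * x == target:
            found.append((x, n))
    return found


if __name__ == "__main__":
    print(square_solutions(200))
```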


8.7 Exercises
Analysis
Exercise 8.7.1† (Vandermonde's Identity). Let $x$ and $y$ be $p$-adic integers. Prove that
$$\binom{x+y}{k} = \sum_{i+j=k,\ i,j \geq 0} \binom{x}{i}\binom{y}{j}$$
for any $k$.

Solution

When $x$ and $y$ are natural integers, this follows from considering the coefficient of $X^k$ in $(X + 1)^{x+y} = (X + 1)^x(X + 1)^y$. For arbitrary $p$-adic integers, this follows from the density of $\mathbb{N}$ in $\mathbb{Z}_p$. □

Exercise 8.7.2† (Mahler's Theorem). Prove that a function $f : \mathbb{Z}_p \to \mathbb{Q}_p$ is continuous if and only if there exist $a_i \to 0$ such that
$$f(x) = \sum_{i=0}^{\infty} a_i\binom{x}{i}$$
for all $x \in \mathbb{Z}_p$. These $a_i$ are called the Mahler coefficients of $f$. Moreover, show that $\max_x(|f(x)|_p) = \max_i(|a_i|_p)$.

Solution

It is clear that any such function is continuous on $\mathbb{Z}_p$, hence we need to prove that the reverse holds as well. Let $\Delta f = x \mapsto f(x + 1) - f(x)$ denote the discrete derivative operator from Exercise A.3.6†. The coefficients $a_k$ are then $\Delta^kf(0)$: indeed, a straightforward induction shows that $f(n) = \sum_{k=0}^n a_k\binom{n}{k}$ for any $n \in \mathbb{N}$, so if these $a_k$ go to 0, $f$ must be equal to $x \mapsto \sum_{k=0}^{\infty} a_k\binom{x}{k}$ by density and continuity.

Thus, it only remains to show that $\Delta^kf(0) \to 0$. To prove this, we will show that they eventually all become divisible by $p$. We can then subtract $\sum_{p \nmid \Delta^kf(0)} \Delta^kf(0)\binom{x}{k}$ from $f(x)$ and divide everything by $p$ to conclude that $p^2 \mid \Delta^kf(0)$ for large $k$. Iterating this process yields that $v_p(\Delta^kf(0)) \to +\infty$ as desired.

To show this, let $N$ be such that $p \mid f(x + p^N) - f(x)$ for any $x$. There exists such an $N$ since $f$ is continuous by assumption. Then, by Exercise A.3.7†,
$$\Delta^{p^N}f(x) = \sum_{k=0}^{p^N} (-1)^{p^N - k}\binom{p^N}{k}f(x + k)$$
for any $x \in \mathbb{Z}_p$. Now, by Frobenius, $(1 + X)^{p^N} \equiv 1 + X^{p^N} \pmod p$ which means that $p \mid \binom{p^N}{k}$ for any $1 \leq k \leq p^N - 1$. Hence,
$$\Delta^{p^N}f(x) \equiv f(x + p^N) + (-1)^{p^N}f(x) \pmod p.$$
When $p$ is odd this is $f(x + p^N) - f(x)$ which is divisible by $p$ by construction, and when $p = 2$ the same holds since $-1 \equiv 1$. Hence, $p \mid \Delta^{p^N}f(x)$ for all $x \in \mathbb{Z}_p$ which implies that $p \mid \Delta^nf(x)$ for all $n \geq p^N$ as well by applying $\Delta$ multiple times to $\Delta^{p^N}f(x)$. In particular, $p \mid \Delta^nf(0)$ for sufficiently large $n$ as wanted. □
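
The identification $a_k = \Delta^kf(0)$ is easy to experiment with. The Python sketch below (illustrative names) computes the first few iterated differences of an integer-valued function at 0 and rebuilds the function from them on natural numbers; for a polynomial such as $n^3$ the coefficients vanish beyond the degree.

```python
from math import comb


def mahler_coefficients(f, count):
    """Return [Delta^k f(0) for k < count] using repeated finite differences."""
    values = [f(n) for n in range(count)]
    coefficients = []
    for _ in range(count):
        coefficients.append(values[0])
        # One application of the discrete derivative to the table of values.
        values = [values[i + 1] - values[i] for i in range(len(values) - 1)]
    return coefficients


def evaluate(coefficients, n):
    """Evaluate sum_k a_k * binomial(n, k) at a natural number n."""
    return sum(a * comb(n, k) for k, a in enumerate(coefficients))


if __name__ == "__main__":
    coeffs = mahler_coefficients(lambda n: n**3, 6)
    print(coeffs)
    print([evaluate(coeffs, n) for n in range(8)])
```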

Exercise 8.7.4†. Prove that the following power series converge if and only if $|x|_p < 1$ and $|x|_p < p^{-1/(p-1)}$ respectively:
$$\log_p(1 + x) = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}x^k}{k}, \qquad \exp_p(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$
In addition, prove that

1. $\exp_p(x + y) = \exp_p(x)\exp_p(y)$ for $|x|_p, |y|_p < p^{-1/(p-1)}$.

2. $\log_p(xy) = \log_p(x) + \log_p(y)$ for $|x - 1|_p, |y - 1|_p < 1$.

3. $\exp_p(\log_p(1 + x)) = 1 + x$ for $|x|_p < p^{-1/(p-1)}$.

4. $\log_p(\exp_p(x)) = x$ for $|x|_p < p^{-1/(p-1)}$.

Solution

We shall only prove the convergence, the claimed equalities follow from the general theory of power series: if $g(x)$, $(f \circ g)(x)$ and $f(g(x))$ all converge, we have $(f \circ g)(x) = f(g(x))$ (this is even easier over $\mathbb{Q}_p$ because we have the strong triangle inequality). The convergence for $\log_p$ follows from the fact that $|x^k/k|_p = |x|_p^k/|k|_p$ goes to 0 when $|x|_p < 1$ since $|k|_p \geq 1/k$, but does not go to 0 when $|x|_p = 1$ since $|k|_p \leq 1$ for all $k$.

The convergence for $\exp_p$ is very similar: by Legendre's formula,
$$v_p(x^k/k!) = kv_p(x) - \frac{k}{p-1} + \frac{s_p(k)}{p-1} = k\left(v_p(x) - \frac{1}{p-1}\right) + o(k)$$
where $o(k)/k \to 0$. This forces $v_p(x) \geq \frac{1}{p-1}$, i.e. $|x|_p \leq p^{-1/(p-1)}$. Finally, we need to see that we can't have equality. This is easy: when $v_p(x) = \frac{1}{p-1}$, $v_p(x^k/k!)$ is $\frac{s_p(k)}{p-1}$ which is bounded when $k$ is a power of $p$, so does not go to infinity. □

Exercise 8.7.5†. Prove that
$$v_2\left(\sum_{k=1}^n \frac{2^k}{k}\right) \to \infty.$$

Solution

The problem is equivalent to showing that $\sum_{k=1}^{\infty} \frac{2^k}{k} = 0$ in $\mathbb{Q}_2$. Note that this sum is exactly $-\log_2(1 + (-2)) = -\log_2(-1)$, and $2\log_2(-1) = \log_2(1) = 0$ by Exercise 8.7.4†. □
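
The growth of the valuation is visible numerically. The Python sketch below (illustrative names) computes the partial sums exactly and their 2-adic valuations. The lower bound it checks, $v_2 \geq n + 1 - \lceil \log_2(n+1) \rceil$, is not stated in the text: it comes from bounding the tail $\sum_{k > n} 2^k/k$ term by term, using the fact that the full series is 0.

```python
from fractions import Fraction


def v2(x):
    """2-adic valuation of a non-zero rational number."""
    x = Fraction(x)
    n, d, v = x.numerator, x.denominator, 0
    while n % 2 == 0:
        n //= 2
        v += 1
    while d % 2 == 0:
        d //= 2
        v -= 1
    return v


def partial_sum(n):
    """Exact value of sum_{k=1}^{n} 2^k / k."""
    return sum(Fraction(2**k, k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in range(1, 33):
        print(n, v2(partial_sum(n)))
```

The valuations are not monotone (they dip at $n = 6$, for instance), but they are bounded below by a quantity tending to infinity.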

Exercise 8.7.6† (Mean Value Theorem). Let $f(x) = \sum_{i=0}^{\infty} a_ix^i$ be a $p$-adic power series converging for $|x|_p \leq 1$, i.e. $a_i \to 0$. Prove that
$$|f(t + h) - f(t)|_p \leq |h|_p \max_i(|a_i|_p)$$
for any $|t|_p \leq 1$ and $|h|_p \leq p^{-1/(p-1)}$.

Solution

We shall prove that $|(t + h)^n - t^n|_p \leq |h|_p$ for any $|t|_p \leq 1$ and $|h|_p \leq p^{-1/(p-1)}$. The strong triangle inequality then implies that
$$\begin{aligned}
\left|\sum_{i=0}^{\infty} a_i(t + h)^i - \sum_{i=0}^{\infty} a_it^i\right|_p &= \left|\sum_{i=0}^{\infty} a_i((t + h)^i - t^i)\right|_p \\
&\leq \max_i(|a_i((t + h)^i - t^i)|_p) \\
&\leq |h|_p \max_i(|a_i|_p)
\end{aligned}$$
as wanted. Our claim is however very easy to prove: since $|h|_p \leq p^{-1/(p-1)}$, we have $|h^k/k!|_p \leq 1$ by Legendre's formula so that
$$(t + h)^n - t^n = \sum_{k=1}^n t^{n-k}n(n - 1) \cdot \ldots \cdot (n - (k - 1))h^k/k!$$
has absolute value at most $|h|_p$ by the strong triangle inequality. □

Absolute Values
Exercise 8.7.7†. We say an absolute value $|\cdot|$ over a field $K$, i.e. a function $|\cdot| : K \to \mathbb{R}_{\geq 0}$ such that
• $|x| = 0 \iff x = 0$
• $|x + y| \leq |x| + |y|$
• $|xy| = |x| \cdot |y|$
is non-Archimedean if $|m| \leq 1$ for all $m \in \mathbb{Z}$ and Archimedean otherwise. Prove that $|\cdot|$ is non-Archimedean if and only if it satisfies the strong triangular inequality $|x + y| \leq \max(|x|, |y|)$ for all $x, y \in K$. In addition, prove that, if $|\cdot|$ is non-Archimedean, we have $|x + y| = \max(|x|, |y|)$ whenever $|x| \neq |y|$.

Solution

It is clear that $|\cdot|$ is non-Archimedean if it satisfies the strong triangle inequality. Thus, suppose that $|m| \leq 1$ for all $m \in \mathbb{Z}$. Now, notice that, for any positive integer $n$,
$$\begin{aligned}
|x + y|^n &= |(x + y)^n| \\
&= \left|\sum_{k=0}^n \binom{n}{k}x^ky^{n-k}\right| \\
&\leq \sum_{k=0}^n \left|\binom{n}{k}\right| |x|^k|y|^{n-k} \\
&\leq (n + 1)\max(|x|, |y|)^n.
\end{aligned}$$
Taking the $n$th root and letting $n$ go to $\infty$, we get
$$|x + y| \leq (n + 1)^{1/n}\max(|x|, |y|) \to \max(|x|, |y|)$$
as wanted. For the equality, if $|x| > |y|$, note that, by the same inequality, we also have $|x| \leq \max(|-y|, |x + y|)$. Since $|-y| = |y| < |x|$, we must have $\max(|x + y|, |-y|) = |x + y|$ so $|x + y| \geq |x| \geq |x + y|$ as wanted. □

Exercise 8.7.8† . Let K be a field and let | · | : K → R≥0 be a multiplicative function which is an
absolute value on Q. Suppose that | · | satisfies the modified triangular inequality |x + y| ≤ c(|x| + |y|)
for all x, y ∈ K, where c > 0 is some constant. Prove that it satisfies the triangular inequality.

Solution

The argument is very similar to our proof of Exercise 8.7.7†. Let $x, y$ be elements of $K$. For any positive integer $n$,
$$\begin{aligned}
|x + y|^n &= |(x + y)^n| \\
&\leq c\sum_{k=0}^n \left|\binom{n}{k}x^{n-k}y^k\right| \\
&= c\sum_{k=0}^n \left|\binom{n}{k}\right| |x|^{n-k}|y|^k \\
&\leq c\sum_{k=0}^n \binom{n}{k}|x|^{n-k}|y|^k \\
&= c(|x| + |y|)^n.
\end{aligned}$$
Indeed, a straightforward induction shows that $|m| \leq m$ for $m \in \mathbb{N}$ since $|\cdot|$ is an absolute value on $\mathbb{Q}$ so $|m + 1| \leq |m| + |1| = |m| + 1$ for any $m \in \mathbb{N}$ since $|1|^2 = |1|$ and $|1| \neq 0$. Taking the $n$th root and letting $n$ tend to infinity, we get
$$|x + y| \leq c^{1/n}(|x| + |y|) \to |x| + |y|$$
as wanted. □

Exercise 8.7.9† (Ostrowski's Theorem). Let $|\cdot|$ be an absolute value of $\mathbb{Q}$. Prove that $|\cdot|$ is equal to $|\cdot|_p^r$ for some prime $p$ and some $r \geq 1$, or to $|\cdot|_\infty^r$ for some $0 < r \leq 1$ or is the trivial absolute value $|\cdot|_0$ which is 0 at 0 and 1 everywhere else.

Solution

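First, note that $|1|^2 = |1|$ so $|1| = 1$ since $|1| \ne 0$. For the same reason, $|-1|^2 = |1| = 1$ so $|-1| = 1$. Now, suppose that there is some $a \in \mathbb{N}$ such that $|a| > 1$ and let $b \ge 2$ be any integer. By the previous remark, we have $a > 1$, so, for any positive integer $m$, write $a^m$ in base $b$:

\[ a^m = \sum_{i=0}^{\lfloor m\log_b(a)\rfloor} a_i b^i, \qquad a_i \in \{0, 1, \ldots, b-1\}. \]

We get

\[ |a|^m \le \sum_{i=0}^{\lfloor m\log_b(a)\rfloor} |a_i||b|^i \]

which implies that $|b| > 1$ as well: otherwise the right-hand side would grow at most linearly in $m$, while the left-hand side grows exponentially. But then,

\[ |a|^m \le \sum_{i=0}^{\lfloor m\log_b(a)\rfloor} |a_i||b|^i \le C\left(\lfloor m\log_b(a)\rfloor + 1\right)|b|^{\lfloor m\log_b(a)\rfloor} \]

for the constant $C = \max(|1|, |2|, \ldots, |b-1|) > 0$, which implies that $|a| \le |b|^{\log_b(a)}$ when we take $m$th roots and let $m \to \infty$, i.e. $|a|^{1/\log a} \le |b|^{1/\log b}$. Since the reverse inequality is true as well by symmetry, we get that $|a|^{1/\log a} = c$ is constant. This gives us $|a| = a^{\log c} := a^r$. It is then easy to see that this extends to $|\cdot|_\infty^r$ on all of $\mathbb{Q}$ using the multiplicativity of $|\cdot|$. Finally, it is easy to check that this satisfies the triangular inequality only for $0 < r \le 1$.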
Now suppose that $|n| \le 1$ for all $n \in \mathbb{Z}$. By Exercise 8.7.7†, $|\cdot|$ satisfies the strong triangle inequality. Without loss of generality, assume that $|\cdot|$ is non-trivial and let $p \in \mathbb{N}$ be the smallest positive integer such that $|p| < 1$. Since $|\cdot|$ is multiplicative, $p$ must be prime: it is distinct from 1, and a non-trivial factorisation $p = uv$ would give $|u| = |v| = 1$ by minimality, hence $|p| = 1$. By assumption, $|a| = 1$ for any $1 \le a \le p-1$. We shall prove that $|n| = 1$ for any $p \nmid n$ to conclude that, in general,

\[ |n| = |p|^{v_p(n)}\left|n/p^{v_p(n)}\right| = |p|^{v_p(n)} = |n|_p^{-\log_p |p|}. \]

Consider any $p \nmid n$ now and express it in base $p$ as $\sum_i a_i p^i$. Since $p \nmid n$, we have $0 < a_0 < p$, so $1 = |a_0| > \max_{i \ge 1} |a_i p^i|$. By Exercise 8.7.7†, we are in the equality case of

\[ \left|\sum_i a_i p^i\right| \le \max_i |a_i p^i| = 1 \]

so $|n| = 1$ as wanted. To conclude, this time $|\cdot|_p^r$ satisfies the strong triangle inequality $|x+y| \le \max(|x|,|y|)$ for every $r > 0$, and here $r = -\log_p|p| > 0$. □
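The role of the exponent $r$ can be illustrated numerically. The following Python sketch (function names are ours, and the check is only over a finite sample of integers with a small floating-point tolerance) tests the triangular inequality for $|\cdot|_\infty^r$ and the strong triangle inequality for $|\cdot|_p^r$: raising to a power $r > 1$ breaks the former, for instance at $x = y = 1$, while the latter is insensitive to $r > 0$ because $t \mapsto t^r$ is increasing.

```python
def v_p(n: int, p: int) -> int:
    """p-adic valuation of a non-zero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def padic_power(n: int, p: int, r: float) -> float:
    """|n|_p^r for an integer n, with the convention |0|_p = 0."""
    return 0.0 if n == 0 else float(p) ** (-r * v_p(n, p))


PAIRS = [(x, y) for x in range(-20, 21) for y in range(-20, 21)]
TOL = 1e-9  # tolerance for floating-point comparisons


def archimedean_ok(r: float) -> bool:
    """Does |x+y|^r <= |x|^r + |y|^r hold on the sample?"""
    return all(abs(x + y) ** r <= abs(x) ** r + abs(y) ** r + TOL for x, y in PAIRS)


def ultrametric_ok(p: int, r: float) -> bool:
    """Does |x+y|_p^r <= max(|x|_p^r, |y|_p^r) hold on the sample?"""
    return all(
        padic_power(x + y, p, r) <= max(padic_power(x, p, r), padic_power(y, p, r)) + TOL
        for x, y in PAIRS
    )
```

With these definitions, `archimedean_ok(2.0)` fails because $|1+1|^2 = 4 > 2$, whereas `ultrametric_ok(3, r)` succeeds for every positive `r` tried.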

Exercise 8.7.10† (Bolzano-Weierstrass Theorem). Prove that a set $S \subseteq \mathbb{R}^n$ is sequentially compact if and only if it is closed, meaning that any sequence of elements of $S$ converging in $\mathbb{R}^n$ (for the Euclidean distance) converges in $S$, and bounded.

Solution

Clearly, if $S$ is unbounded or not closed, one can extract a sequence which diverges to infinity or converges to an element not in $S$, and which thus has no subsequence converging in $S$. Now, suppose that $S$ is closed and bounded and let $s = (s_m)_{m \ge 0}$ be a sequence of elements of $S$. Without loss of generality, by translating $S$, suppose that all its elements have coordinates in $[0, M]$. We shall proceed by dichotomy to extract a subsequence of $s$ which converges in $\mathbb{R}^n$; its limit will then be in $S$ since $S$ is closed. By the (infinite) pigeonhole principle, there must be some $I_1^{(1)}, \ldots, I_1^{(n)} \in \{[0, M/2], [M/2, M]\}$ such that the box

\[ I_1^{(1)} \times \ldots \times I_1^{(n)} \]

contains $s_m$ for infinitely many $m$. Pick a term $r_1$ of $s$ in this product of intervals and then repeat the operation: if $I_1^{(i)} = [a_1^{(i)}, b_1^{(i)}]$, there must be some

\[ I_2^{(i)} \in \left\{\left[a_1^{(i)}, \frac{a_1^{(i)}+b_1^{(i)}}{2}\right], \left[\frac{a_1^{(i)}+b_1^{(i)}}{2}, b_1^{(i)}\right]\right\} \]

such that

\[ I_2^{(1)} \times \ldots \times I_2^{(n)} \]

contains $s_m$ for infinitely many $m$. Pick a term $r_2$ of $s$ in this product of intervals, appearing later in $s$ than $r_1$, and proceed inductively that way to get chains of intervals $I_m^{(i)} = [a_m^{(i)}, b_m^{(i)}]$ of length $M/2^m$ such that $I_{m+1}^{(i)} \subseteq I_m^{(i)}$ and

\[ I_m^{(1)} \times \ldots \times I_m^{(n)} \]

contains infinitely many terms of $s$, and in particular $r_m$. Since the length of $I_m^{(i)}$ is $M/2^m$, the sequences $(a_m^{(i)})_{m \ge 0}$ and $(b_m^{(i)})_{m \ge 0}$ are Cauchy, say they converge to $c_i$. Then, the subsequence $(r_m)_{m \ge 1}$ we produced converges to $(c_1, \ldots, c_n)$ as desired. □

Exercise 8.7.11† (Extremal Value Theorem). Let $M$ be a metric space, i.e. a set with a distance $d : M \times M \to \mathbb{R}_{\ge 0}$ such that $d(x, y) = 0$ iff $x = y$, $d(x, y) = d(y, x)$ (symmetry) and $d(x, y) \le d(x, z) + d(z, y)$ (triangle inequality) for any $x, y, z \in M$, and let $S$ be a sequentially compact subset of $M$. Suppose $f : S \to \mathbb{R}$ is a continuous function. Prove that $f$ has a maximum and a minimum.

Solution

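Suppose otherwise, say that $f$ has no maximum (the case of the minimum is symmetric), and let $s$ be the supremum of $f$ on $S$. There is a sequence $(s_n)_{n \ge 0}$ of elements of $S$ such that

\[ f(s_n) \to s \notin \operatorname{im} f \]

($s$ can be $\pm\infty$). Let $(r_n)_n$ be a subsequence of $(s_n)_n$ converging to $r \in S$. Then, by continuity, we get

\[ f(r) = \lim_{n \to \infty} f(r_n) = s \]

which is a contradiction. □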

Exercise 8.7.12† (Equivalence of Norms). Let $(K, |\cdot|)$ be a complete valued field in characteristic 0, i.e. a field with an absolute value $|\cdot|$ which is complete¹ for the distance induced by this absolute value. A norm on a vector space $V$ over $K$ is a function $\|\cdot\| : V \to \mathbb{R}_{\ge 0}$ such that
• $\|x\| = 0 \iff x = 0$
• $\|x + y\| \le \|x\| + \|y\|$
• $\|ax\| = |a|\|x\|$
for all $x, y \in V$ and $a \in K$. We say two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent if there are two positive real numbers $c_1$ and $c_2$ such that $\|x\|_1 \le c_1\|x\|_2$ and $\|x\|_2 \le c_2\|x\|_1$ for all $x \in V$.² Prove that any two norms are equivalent over a finite-dimensional $K$-vector space $V$. In addition, prove that $V$ is complete under the induced distance of any norm $\|\cdot\|$.

Solution

Since we wish to show that all norms are equivalent, it suffices to prove that any norm is equivalent to a fixed norm we choose. A particularly simple one is the maximum norm

\[ \|x\|_\infty = \max_i |a_i| \]

where $e_1, \ldots, e_n$ is a basis of $V$ and $x = \sum_{i=1}^n a_i e_i$ for some $a_i \in K$. In other words, this is simply the maximum of the absolute values of the coefficients of $x$ in the basis $(e_1, \ldots, e_n)$. Clearly, $V$ is complete under this norm, since if $x_k = \sum_{i=1}^n a_{k,i} e_i$ is a Cauchy sequence, then so is every $(a_{k,i})_{k \ge 0}$ for the distance induced by $|\cdot|$, which means that $a_{k,i} \to a_i$ as $k \to +\infty$ for some $a_i$ and

\[ x_k \to \sum_{i=1}^n a_i e_i. \]

Since two equivalent norms induce the same topology (a sequence is Cauchy for one norm if and only if it is Cauchy for the other), we are done if we prove that any norm $\|\cdot\|$ is equivalent to $\|\cdot\|_\infty$. One inequality is very easy: if $x = \sum_{i=1}^n a_i e_i$, we have

\[
\begin{aligned}
\|x\| &= \left\|\sum_{i=1}^n a_i e_i\right\| \\
&\le \sum_{i=1}^n |a_i|\|e_i\| \\
&\le \|x\|_\infty \cdot \left(\sum_{i=1}^n \|e_i\|\right).
\end{aligned}
\]

¹ Recall that completeness means that all Cauchy sequences converge. A Cauchy sequence $(u_n)_{n \ge 0}$ is a sequence such that, for any $\varepsilon > 0$, there is an $N$ such that $|u_m - u_n| \le \varepsilon$ for all $m, n \ge N$.
² This means that they induce the same topology on $V$.

For the other inequality, suppose for the sake of a contradiction that there doesn’t exist a $c > 0$ such that $\|x\| \ge c\|x\|_\infty$ for all $x \in V$. In other words, for all $\varepsilon > 0$, there is some $x$ such that $\|x\| < \varepsilon\|x\|_\infty$. In particular, $x \ne 0$. Since we have infinitely many $x$, by the pigeonhole principle, we can assume that $\|x\|_\infty = |a_k|$ for a fixed $k$, where $x = \sum_{i=1}^n a_i e_i$. By dividing $x$ by $a_k$, we may also assume that $a_k = 1$. This gives us a sequence

\[ x_m = y_m + e_k \]

converging to 0 for $\|\cdot\|$, where $y_m$ is in the space $W$ spanned by $e_1, \ldots, e_{k-1}, e_{k+1}, \ldots, e_n$. In particular,

\[ \|y_m - y_\ell\| \le \|y_m + e_k\| + \|y_\ell + e_k\| \]

also converges to 0 when $\min(m, \ell) \to +\infty$. In other words, $(y_m)_{m \ge 0}$ is a Cauchy sequence. Now, we use induction on $n = \dim V$. When $n = 1$ the result is trivial since $V = K$ and $\|\cdot\| = \|1\||\cdot|$. For the inductive step, notice that $W$ has dimension $n - 1$ so, by assumption, it is complete under $\|\cdot\|$. Hence, $(y_m)_{m \ge 0}$ converges to some $y \in W$. This means that

\[ \|y + e_k\| = \lim_{m \to +\infty} \|y_m + e_k\| = 0, \]

which is impossible since $y + e_k \ne 0$. □

Exercise 8.7.13†. Let $K = \mathbb{Q}_p$ be a local field³, where $p$ is a prime number or $\infty$, and let $L$ be a finite extension of $K$. Prove that there is only one absolute value of $L$ extending $|\cdot|_p$ on $K$, and that it’s given by $|\cdot|_p = \left|N_{L/K}(\cdot)\right|_p^{1/[L:K]}$.⁴ ⁵ ⁶

Solution

For simplicity purposes, we write $|\cdot|$ for $|\cdot|_p$. We first prove the uniqueness. Suppose that $|\cdot|_{(1)}$ and $|\cdot|_{(2)}$ are two absolute values extending $|\cdot|$. Then, they are norms over the $K$-vector space $L$. By Exercise 8.7.12†, they must be equivalent:

\[ a|x|_{(1)} \le |x|_{(2)} \le b|x|_{(1)} \]

for some positive real numbers $a, b$. In particular, if we let $x = y^n$, we get $a|y|_{(1)}^n \le |y|_{(2)}^n \le b|y|_{(1)}^n$. By taking $n$th roots and letting $n$ tend to infinity, this gives us

\[ |y|_{(1)} \leftarrow a^{1/n}|y|_{(1)} \le |y|_{(2)} \le b^{1/n}|y|_{(1)} \to |y|_{(1)} \]

so $|y|_{(1)} = |y|_{(2)}$ as wanted. Note that we didn’t use the fact that $K$ was a field of the form $\mathbb{Q}_p$ here.
Now, we prove the existence. Multiplicativity is obvious, and |x| = 0 iff x = 0 too. The tricky
part is to prove that it satisfies the triangular inequality |x + y| ≤ |x| + |y|. After dividing by |y|,
this is equivalent to |x+1| ≤ |x|+1. We will however not prove this directly, but rather that there
is a constant c > 0 such that |x + 1| ≤ c(|x| + 1). Assuming we have proven this, Exercise 8.7.8†
tells us that we can in fact pick c = 1, i.e. that | · | satisfies the triangular inequality (and is thus
an absolute value).

³ This result is true for any complete valued field $(K, |\cdot|)$, but it is harder to prove.
⁴ In particular, this absolute value is still non-Archimedean if it initially was. For instance, by Exercise 8.7.7†, if $p$ is prime, the extension of $|\cdot|_p$ still satisfies the strong triangle inequality. In fact, this is the only interesting case since it’s not too hard to treat the case $K = \mathbb{R}$ separately.
⁵ Here is why this absolute value is intuitive: by symmetry between the conjugates, we should have $|\alpha|_p = |\beta|_p$ if $\alpha$ and $\beta$ are conjugates. Taking the norm yields $|N_{K/\mathbb{Q}_p}(\alpha)|_p = |\alpha|_p^{[K:\mathbb{Q}_p]}$ as indicated.
⁶ One might be tempted to also define a $p$-adic valuation for elements of $K$ as $v_p(\cdot) = -\log(|\cdot|_p)/\log(p)$, and this is also what we will do in some of the exercises. However, we warn the reader that, if $\alpha \in \overline{\mathbb{Z}}$ is an algebraic integer and $\alpha_p$ is a root of its minimal polynomial in $\overline{\mathbb{Q}}_p$, $v_p(\alpha_p) \ge 1$ does not mean anymore that $p$ divides $\alpha$ in $\overline{\mathbb{Z}}$, it only means that $p$ divides $\alpha_p$ in $\overline{\mathbb{Z}}_p := \{x \in \overline{\mathbb{Q}}_p \mid |x| \le 1\}$.

It remains to prove that such a $c$ exists. Let $e_1, \ldots, e_n$ be a $K$-basis of $L$ (for instance $e_i = \alpha^i$ for some primitive element $\alpha$). Define the maximum norm as

\[ \left\|\sum_i a_i e_i\right\|_\infty = \max_i |a_i|. \]

The point is that this defines a distance $d(x, y) = \|x - y\|_\infty$ and that the unit sphere $S = \{x \mid \|x\|_\infty = 1\}$ is sequentially compact for this distance, so that our extension of $|\cdot|$ will have a (non-zero) minimum there by the extreme value theorem from Exercise 8.7.11†.

It is also not very hard to see that the unit sphere is indeed sequentially compact: this is the Bolzano-Weierstrass theorem from Exercise 8.7.10† for $p = \infty$, i.e. $K = \mathbb{R}$, and is very easy when $p$ is prime by an argument similar to the proof of ??.

To conclude, our extension of $|\cdot|$, namely $\sqrt[n]{|N(\cdot)|}$, is continuous for the distance induced by $\|\cdot\|_\infty$ because $N\left(\sum_i a_i e_i\right)$ is polynomial in the $a_i$. Thus, there are positive $a$ and $b$ such that $a \le |x| \le b$ for $\|x\|_\infty = 1$ by the extremal value theorem from Exercise 8.7.11†. Note that $a$ is positive as $|\cdot|$ doesn’t vanish on $S$. From this we conclude that $a\|x\|_\infty \le |x| \le b\|x\|_\infty$ for any $x$. But then, we have

\[ |x + 1| \le b\|x + 1\|_\infty \le b(\|x\|_\infty + \|1\|_\infty) \le \frac{b}{a}(|x| + 1) \]

which is what we wanted to show. □

Exercise 8.7.14†. Let $(K, |\cdot|)$ be a complete valued field in characteristic 0 and let $f \in K[X]$ be a polynomial. Prove that $f$ either has a root in $K$, or there is a real number $c > 0$ such that $|f(x)| \ge c$ for all $x \in K$.

Solution

Suppose without loss of generality that f is irreducible and that there does not exist a c > 0 such
that |f (x)| ≥ c for all x ∈ K. In other words, |f (x)| takes arbitrarily small values for x ∈ K.
We will produce a Cauchy sequence (xn )n≥0 such that |f (xn )| → 0. The limit x of (xn )n≥0 will
then clearly be a root of f .

We use the Newton method to find such a sequence. Let $x_0 \in K$ be such that $|f(x_0)| < 1$ is small (we will specify this later). Note that $|x_0|$ is bounded since, by the triangular inequality, if $f = a_d X^d + \ldots + a_0$, we have

\[ |f(x_0)| \ge |a_d||x_0|^d - |a_{d-1}||x_0|^{d-1} - \ldots - |a_0|. \]

Define the sequence $(x_n)_{n \ge 0}$ by $x_{n+1} = x_n + \varepsilon_n$, where the small increment $\varepsilon_n$ will be chosen in the next sentences. We have, by Taylor’s formula 5.3.1,

\[ f(x_{n+1}) = \sum_{k=0}^{\deg f} \frac{\varepsilon_n^k f^{(k)}(x_n)}{k!} = f(x_n) + \varepsilon_n f'(x_n) + O(\varepsilon_n^2). \]

Hence, to kill the greatest term of this sum, we choose $\varepsilon_n = -\frac{f(x_n)}{f'(x_n)}$. Let’s justify a bit the notation $O(\varepsilon_n^2)$: we have shown that, if $f(x)$ is very small then $x$ is bounded, so the derivatives $f^{(k)}(x)$ are bounded as well. We also need to ensure that $f'(x)$ is not too small when $f(x)$ is, so that $\varepsilon_n = -\frac{f(x_n)}{f'(x_n)}$ is very small. This follows from Bézout’s lemma: since $f$ is irreducible, it is coprime with its derivative $f'$ (we are in characteristic zero) so there are $u, v \in K[X]$ such that

\[ uf + vf' = 1. \]

When $f(x)$ is very small, $u(x)f(x)$ is very small too since $u(x)$ is bounded (as $x$ is), so $|v(x)f'(x)|$ is very close to 1. Since $v(x)$ is also bounded, we get that $|f'(x)|$ is bounded below as wanted.

To conclude, since the first two terms of the Taylor expansion cancel, we have

\[ |f(x_{n+1})| = \left|\sum_{k=2}^{\deg f} \frac{\varepsilon_n^k f^{(k)}(x_n)}{k!}\right| < c|\varepsilon_n|^2 \]

when $f(x_n)$ is very small. Since

\[ |\varepsilon_n|^2 = \frac{|f(x_n)|^2}{|f'(x_n)|^2}, \]

there is some $\theta < 1$ such that

\[ |f(x_{n+1})| \le \theta|f(x_n)| \]

when $f(x_n)$ is sufficiently small (in particular, it suffices to have $f(x_0)$ sufficiently small). Hence, pick an $x_0$ such that $|f(x_0)|$ is sufficiently small and this inequality is true. Then, $|f(x_n)| \le \theta^n|f(x_0)| \le \theta^n$ by induction so that

\[ |x_{n+1} - x_n| = \frac{|f(x_n)|}{|f'(x_n)|} \le c\theta^n \]

for some constant $c > 0$. It is not hard to see that this implies that $(x_n)_{n \ge 0}$ is Cauchy, so we are done. □
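To see the Newton iteration at work, here is a small Python sketch in the special case $K = \mathbb{R}$ and $f = X^2 - 2$, computed exactly with rational numbers. The choice of $f$, the starting point and the function name `newton_steps` are our own illustrative assumptions. For a quadratic polynomial the Taylor expansion stops at order 2, so with $\varepsilon_n = -f(x_n)/f'(x_n)$ one gets exactly $f(x_{n+1}) = \varepsilon_n^2$, which is the quadratic decay used above.

```python
from fractions import Fraction


def f(x: Fraction) -> Fraction:
    return x * x - 2


def df(x: Fraction) -> Fraction:
    return 2 * x


def newton_steps(x0: Fraction, steps: int):
    """Return the iterates x_0, ..., x_steps and the increments eps_0, ..., eps_{steps-1}."""
    xs, eps = [x0], []
    for _ in range(steps):
        x = xs[-1]
        e = -f(x) / df(x)      # increment chosen to cancel f(x_n) + eps * f'(x_n)
        eps.append(e)
        xs.append(x + e)
    return xs, eps


xs, eps = newton_steps(Fraction(3, 2), 5)
for n, e in enumerate(eps):
    # For this quadratic f, f(x_{n+1}) equals eps_n^2 exactly.
    print(n, float(f(xs[n + 1])), float(e * e))
```

The printed values of $f(x_{n+1})$ and $\varepsilon_n^2$ coincide and shrink very quickly, and the differences $|x_{n+1} - x_n| = |\varepsilon_n|$ shrink accordingly, which is what makes the sequence Cauchy.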

Exercise 8.7.15† (Ostrowski). Let $(K, |\cdot|)$ be a complete valued Archimedean field in characteristic 0⁷. Prove that it is isomorphic to $(\mathbb{R}, |\cdot|_\infty)$ or $(\mathbb{C}, |\cdot|_\infty)$.

Solution

Without loss of generality, suppose that $\mathbb{Q} \subseteq K$. By Exercise 8.7.9†, we may also assume that $|\cdot|$ extends the usual absolute value $|\cdot|_\infty$ of $\mathbb{Q}$, by replacing $|\cdot|$ by $|\cdot|^r$ for some suitable $r \ge 1$. A priori, this new absolute value might not satisfy the triangular inequality, but in fact it does. Indeed, by the power mean inequality, we have

\[ \frac{|x|^r + |y|^r}{2} \ge \left(\frac{|x| + |y|}{2}\right)^r \ge \frac{|x + y|^r}{2^r}. \]

Setting $c = 2^{r-1}$, we get that this absolute value, which we will from now on abusively denote $|\cdot|$ as well, satisfies the modified triangular inequality $|x + y| \le c(|x| + |y|)$. Then, by Exercise 8.7.8†, $|\cdot|$ satisfies the triangular inequality as desired.
Now, note that $K$ contains (a field isomorphic to) $\mathbb{R}$ since it is complete and $\mathbb{R}$ is the set of limits of Cauchy sequences of rational numbers. $|\cdot|$ is then the usual absolute value of $\mathbb{R}$, by construction of $\mathbb{R}$. Without loss of generality, suppose also that $\mathbb{C} \subseteq K$, by extending $|\cdot|$ to $K(i)$ if necessary. By Exercise 8.7.13†, we know that we should extend $|\cdot|$ to $K(i)$ by

\[ |\alpha + \beta i| = \sqrt{|\alpha^2 + \beta^2|}, \]

but we don’t know if it is indeed an absolute value. To show that it is, note that, if $i \notin K$, by Exercise 8.7.14† there is a constant $c > 0$ such that

\[ |\alpha^2 + \beta^2| \ge c(|\alpha|^2 + |\beta|^2) \]

for all $\alpha, \beta \in K$. Indeed, if $|x^2 + 1| \ge 2c$ for all $x \in K$, we have, for any $|\beta| \ge |\alpha|$ with $\beta \ne 0$,

\[
\begin{aligned}
|\alpha^2 + \beta^2| &= |\beta|^2|(\alpha/\beta)^2 + 1| \\
&\ge 2c|\beta|^2 \\
&\ge c(|\alpha|^2 + |\beta|^2).
\end{aligned}
\]

⁷ In fact it is quite easy to show that char $K = 0$ follows from the assumption that $|\cdot|$ is Archimedean, but we add this assumption for the convenience of the reader.



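Thus, for any $\alpha, \beta, \gamma, \delta \in K$,

\[
\begin{aligned}
|(\alpha + \beta i) + (\gamma + \delta i)|^2 &= |(\alpha + \gamma)^2 + (\beta + \delta)^2| \\
&\le 2(|\alpha|^2 + |\beta|^2 + |\gamma|^2 + |\delta|^2) \\
&\le \frac{2}{c}\left(|\alpha + \beta i|^2 + |\gamma + \delta i|^2\right)
\end{aligned}
\]

where the second line follows from the triangular inequality and the inequality between the arithmetic and geometric mean, and the third line from the previous bound, so $|\cdot|$ satisfies the triangular inequality by Exercise 8.7.8† (and the quadratic-geometric mean inequality).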

We will now prove that any element of K is in fact in C, thus showing that K = C as wanted.

Let $\alpha$ be an element of $K$ and let $m$ be the minimum of $|\alpha - x|$ for $x \in \mathbb{C}$. This minimum exists by the Bolzano-Weierstrass theorem: we have $|\alpha - x| \ge |x| - |\alpha|$ so $|\alpha - x| \to \infty$ as $|x| \to \infty$. If we choose $r$ such that $|\alpha - x| > |\alpha|$ for $|x| > r$, we get that the minimum of $|\alpha - x|$ over $\mathbb{C}$ is also its minimum over the ball $\{x \mid |x| \le r\}$. However, this ball is compact by the Bolzano-Weierstrass theorem, and the function $x \mapsto |\alpha - x|$ is continuous by the triangular inequality, so a minimum exists by the extremal value theorem. We wish to prove that this minimum $m$ is zero.

The idea is now to take an $x$ such that $|\alpha - x|$ is large, and, at the same time, $A - x$ divides a polynomial $f$ such that $|f(\alpha)|$ is small (here $A$ denotes the indeterminate). If we let $g = \frac{f}{A - x}$, we get that $|g(\alpha)|$ is quite small so that one of $|\alpha - z|$ where $z$ is a root of $g$ is small, and in particular smaller than $m$. Since the remainder of a polynomial $f$ modulo $A - x$ is $f(x)$, we can relax the condition to $|f(\alpha)|$ small and $|f(x)|$ as well. With these conditions, it is natural to pick $f$ first and then $x$: an obvious candidate for $f$ is

\[ f = (A - y)^n \]

where $y$ is such that $|\alpha - y| = m$. Now, we need to estimate $|f(\alpha) - f(x)|$. By the triangular inequality, it is at most $m^n + |x - y|^n$. In particular, if $\varepsilon = |x - y| < m$, it is at most $m^n$ plus something comparatively very small. In addition, by definition of $m$, we know that $|g(\alpha)| \ge m^{n-1}$, where $g = \frac{f - f(x)}{A - x}$ is monic of degree $n - 1$ with all its roots in $\mathbb{C}$. Hence,

\[ |\alpha - x| m^{n-1} \le |g(\alpha)||\alpha - x| = |f(\alpha) - f(x)| \le m^n + \varepsilon^n. \]

This means that, if $m$ is non-zero, by dividing by $m^{n-1}$,

\[ |\alpha - x| \le m\left(1 + (\varepsilon/m)^n\right) \to m. \]

Thus, $|\alpha - x| = m$ for all $|x - y| < m$. Iterating this process, we get $|\alpha - x| = m$ for all $x \in \mathbb{C}$, which is obviously a contradiction since $|\alpha - x|$ goes to $\infty$ when $|x| \to \infty$. Hence, $|\alpha - z| = 0$ for some $z \in \mathbb{C}$, i.e. $\alpha = z \in \mathbb{C}$ as wanted. □

Diophantine Equations
Exercise 8.7.16† (Brazilian Mathematical Olympiad 2010). Find all positive rational integers $n$ and $x$ such that $3^n = 2x^2 + 1$.

Solution
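We proceed as in Proposition 8.6.1: working in $\mathbb{Q}(\sqrt{-2})$, we find $1 + \sqrt{-2}\,x = \pm(1 \pm \sqrt{-2})^n$, i.e.

\[ (1 + \sqrt{-2})^n + (1 - \sqrt{-2})^n = \pm 2. \]

To solve this, we shall work in $\mathbb{Q}_{11}$. We thus consider $\alpha = 1 \pm \sqrt{-2}$ and $\beta = 1 \mp \sqrt{-2}$ as elements of $\mathbb{Q}_{11}$; Hensel’s lemma gives us $\alpha \equiv 20 \pmod{121}$ and $\beta \equiv 103 \pmod{121}$. We wish to find the zeros of the linear recurrence $\alpha^n + \beta^n \pm 2$. Note that $\alpha^5 \equiv \beta^5 \equiv 1 \pmod{11}$, and that we have $\alpha^n + \beta^n \equiv \pm 2$ modulo 11 only when $n$ is congruent to 0, 1 or 2 modulo 5, so we restrict our attention to these $n$.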

Set $a = \alpha^5 - 1 \equiv 0 \pmod{11}$ and $b = \beta^5 - 1 \equiv 0 \pmod{11}$. Writing $n = 5s + r$, we shall compute the Strassmann bounds of the analytic functions

\[ f_r(s) = \alpha^r(1 + a)^s + \beta^r(1 + b)^s - c_r \]

for $r \in \{0, 1, 2\}$, where $c_r = \alpha^r + \beta^r$, i.e. $c_0 = c_1 = 2$ and $c_2 = -2$ (with the other sign, the constant term is non-zero modulo 11 and there is no zero). Modulo $11^2$, we have

\[ f_r(s) \equiv \alpha^r(1 + as) + \beta^r(1 + bs) - c_r. \]

The coefficient of $s$ is $\alpha^r a + \beta^r b$. However, for $r \in \{1, 2\}$, this is respectively 44 and 88 modulo $11^2$ so non-zero in both cases. Hence, the Strassmann bounds for $f_1$ and $f_2$ are 1. It remains to compute the Strassmann bound for $f_0$. This time, we have $a + b \equiv 0 \pmod{11^2}$ so we need to expand one more term. We get

\[ f_0(s) = (1 + a)^s + (1 + b)^s - 2 \equiv 1 + as + a^2\binom{s}{2} + 1 + bs + b^2\binom{s}{2} - 2 \pmod{11^3}. \]

The coefficient of $s^2$ is thus $\frac{a^2 + b^2}{2}$ modulo $11^3$. However, we can check with Hensel’s lemma that $\alpha \equiv 746 \pmod{11^3}$ and $\beta \equiv 587 \pmod{11^3}$, which yields $a \equiv 1122 \pmod{11^3}$ and $b \equiv 209 \pmod{11^3}$. One can then verify that

\[ a^2 + b^2 \equiv 847 \not\equiv 0 \pmod{11^3}. \]

Hence, the Strassmann bound for $f_0$ is 2.

To finish, we need to find solutions: two solutions congruent to 0 modulo 5, one congruent to 1 modulo 5, and one congruent to 2 modulo 5. It is not hard to see that we indeed have

\[
\begin{aligned}
3^0 &= 2 \cdot 0^2 + 1 \\
3^1 &= 2 \cdot 1^2 + 1 \\
3^2 &= 2 \cdot 2^2 + 1 \\
3^5 &= 2 \cdot 11^2 + 1.
\end{aligned}
\]

Hence, we have found all solutions: $(n, x) \in \{(0, 0), (1, 1), (2, 2), (5, 11)\}$. □
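The 11-adic data used in this solution can be recomputed mechanically. The Python sketch below (the function names are ours; `pow(x, -1, m)` for modular inverses assumes Python 3.8 or later) lifts a square root of $-2$ modulo 11 to a root modulo $11^3$ by Newton's method, deduces $\alpha$, $\beta$, $a = \alpha^5 - 1$ and $b = \beta^5 - 1$ modulo $11^3$, and also brute-forces small exponents $n$ of $3^n = 2x^2 + 1$. Which of the two roots gets called $\alpha$ is a convention fixed here by the congruence $\alpha \equiv 20 \pmod{121}$.

```python
from math import isqrt


def sqrt_minus2_mod(p: int, k: int) -> int:
    """A square root of -2 modulo p**k, lifted from a root modulo p (assumed to exist, p odd)."""
    root = next(t for t in range(p) if (t * t + 2) % p == 0)
    modulus = p
    for _ in range(k - 1):
        modulus *= p
        # Newton/Hensel step for f(X) = X^2 + 2: root <- root - f(root) / f'(root)
        root = (root - (root * root + 2) * pow(2 * root, -1, modulus)) % modulus
    return root


P, M = 11, 11 ** 3
t = sqrt_minus2_mod(P, 3)
alpha, beta = (1 - t) % M, (1 + t) % M   # the two roots of X^2 - 2X + 3 modulo 11^3
a = (pow(alpha, 5, M) - 1) % M
b = (pow(beta, 5, M) - 1) % M


def exponents_with_solution(limit: int) -> list:
    """Exponents n < limit such that (3^n - 1) / 2 is a perfect square."""
    return [n for n in range(limit) if isqrt((3 ** n - 1) // 2) ** 2 == (3 ** n - 1) // 2]


print(alpha, beta, a, b, (a * a + b * b) % M)
print(exponents_with_solution(80))
```

Reducing `alpha`, `beta`, `a` and `b` modulo 121 recovers the values used for the coefficients of $s$ in $f_1$ and $f_2$.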

Exercise 8.7.19†. Solve the diophantine equation $x^2 - y^3 = 1$ over $\mathbb{Z}$.

Solution

Write this equation as $(x - 1)(x + 1) = y^3$. The gcd of the two factors divides 2, so we have $x - 1 = a^3$ and $x + 1 = b^3$, or $x \pm 1 = 2a^3$ and $x \mp 1 = 4b^3$ for some $a, b \in \mathbb{Z}$. In the former case, $b^3 - a^3 = 2$ forces $(a, b) = (-1, 1)$, which gives the solution $(x, y) = (0, -1)$. We now treat the latter case. The problem thus reduces to solving the equation $a^3 - 2b^3 = \pm 1$ in rational integers. We know by Section 7.4 that

\[ a - b\sqrt[3]{2} = \pm\theta^n \]

for some $n$, where $\theta$ is the fundamental unit of $\mathbb{Q}(\sqrt[3]{2})$. In addition, by Exercise 7.5.18†, we know that we can choose $\theta = 1 - \sqrt[3]{2} = -\frac{1}{1 + \sqrt[3]{2} + \sqrt[3]{4}}$. Hence, we wish to have $a - b\sqrt[3]{2} = \pm(1 - \sqrt[3]{2})^n$. As we saw in the proof of Theorem 7.4.2, for a given $n$, there are such $a, b$ if and only if

\[ (1 - \sqrt[3]{2})^n + j(1 - j\sqrt[3]{2})^n + j^2(1 - j^2\sqrt[3]{2})^n = 0. \]

We work in $\mathbb{Q}_3(\alpha, j)$, where $\alpha^3 = 2$ and $j$ is now a tryadic root of unity of order 3. Note that this has degree 6 over $\mathbb{Q}_3$ since $j \notin \mathbb{Q}_3$ and $\alpha \notin \mathbb{Q}_3(j)$ (for instance because $\mathrm{Gal}(\mathbb{Q}_3(j)/\mathbb{Q}_3)$ is abelian but $\mathrm{Gal}(\mathbb{Q}_3(\alpha)/\mathbb{Q}_3)$ isn’t). In particular, $\alpha_0 = \alpha$, $\alpha_1 = j\alpha$ and $\alpha_2 = j^2\alpha$ are conjugate.

We wish to find when the linear recurrence $(1 - \alpha)^n + j(1 - j\alpha)^n + j^2(1 - j^2\alpha)^n$ is zero. Here is the magic: this is already almost a tryadic analytic function. Indeed, we can rewrite it as

\[ 2^n\left((1 + \pi_0)^n + j(1 + \pi_1)^n + j^2(1 + \pi_2)^n\right) \]

where $\pi_k = -(1 + \alpha j^k)/2$ has norm $-3/8$ and thus tryadic absolute value $3^{-1/3} < 1$. (In fact, $1 + \sqrt[3]{2}$ is prime in $\mathcal{O}_{\mathbb{Q}(\sqrt[3]{2})} = \mathbb{Z}[\sqrt[3]{2}]$.) However, $3^{-1/3}$ is still too large: it’s greater than $3^{-1/(3-1)}$.

Hence, we consider the function

\[ f_r(s) = (1 + \pi_0)^r(1 + \pi_0)^{3s} + j(1 + \pi_1)^r(1 + \pi_1)^{3s} + j^2(1 + \pi_2)^r(1 + \pi_2)^{3s} \]

for $r \in \{0, 1, 2\}$. Indeed, these converge since $(1 + \pi_k)^3 = 1 + 3(-3/8 - \alpha_k/8 + \alpha_k^2/8)$, so that $(1 + \pi_k)^3 - 1$ has absolute value $3^{-1} < 3^{-1/(3-1)}$. It is then straightforward to compute the Strassmann bounds: we claim that it is 1 for $r = 0$ and $r = 1$, and 0 for $r = 2$. Let us start with $r = 0$. In that case,

\[ f_0(s) \equiv s\sum_{k=0}^{2} j^k \cdot 3\left(-3/8 - \alpha j^k/8 + \alpha^2 j^{2k}/8\right) \pmod{27}. \]

It is in fact very easy to compute a sum of the form $\sum_{k=0}^{2} j^k\sum_i a_i\alpha^i j^{ki}$: this is a unity root filter so it is the sum $3\sum_{i \equiv -1 \pmod 3} a_i\alpha^i$ (see Exercise A.3.9†). This is actually normal: it’s why we considered this sum in the first place. In particular, this also explains why this congruence holds modulo $3^3$ instead of simply $3^2$: it’s because of the additional factor of 3 added by the unity root filter. Hence, the coefficient of $s$ of $f_0$ is $9\alpha^2/8$, which has tryadic valuation $2 < 3$, and 1 is thus the Strassmann bound for $f_0$. Conversely, it is clear that $s = 0$ is a solution (corresponding to $1 = (1 - \sqrt[3]{2})^0$).

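We now consider $f_1$. As before, we are done if the coefficient of $s$ has tryadic valuation 2 since all the following ones have valuation at least 3. We expand $f_1$ modulo 27, and remember that we only care about the coefficient of $\alpha^{3n+2}$:

\[
\begin{aligned}
f_1(s) &= \sum_{k=0}^{2} j^k\,\frac{1 - \alpha_k}{2}\cdot\left(1 + 3(-3/8 - \alpha_k/8 + \alpha_k^2/8)\right)^s \\
&\equiv \sum_{k=0}^{2} j^k\,\frac{1 - \alpha_k}{2} + 3s\sum_{k=0}^{2} j^k\,\frac{1 - \alpha_k}{2}\cdot\left(-3/8 - \alpha_k/8 + \alpha_k^2/8\right) \pmod{27} \\
&= 9s\alpha^2/8
\end{aligned}
\]

(the coefficient of $\alpha^2$ in $(1 - \alpha)(-3 - \alpha + \alpha^2)/16$ is $1/8$, using $\alpha^3 = 2$), which has absolute value $3^{-2}$ as desired. It is again clear that $s = 0$ is a solution (corresponding to $1 - \sqrt[3]{2} = (1 - \sqrt[3]{2})^1$).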
3

Finally, we consider $f_2$. Here is what changes: the coefficient of $s^0$ is no longer zero because $(1 - \alpha)^2$ now has a non-zero coefficient for some $\alpha^{3n+2}$. More specifically,

\[
\begin{aligned}
f_2(s) &= \sum_{k=0}^{2} j^k\,\frac{(1 - \alpha_k)^2}{4}\cdot\left(1 + 3(-3/8 - \alpha_k/8 + \alpha_k^2/8)\right)^s \\
&\equiv \sum_{k=0}^{2} j^k\,\frac{(1 - \alpha_k)^2}{4} \pmod 9 \\
&= 3\alpha^2/4
\end{aligned}
\]

which has absolute value $3^{-1}$ as desired. This shows that the Strassmann bound is 0, and concludes our study of the equation $a - b\sqrt[3]{2} = \pm(1 - \sqrt[3]{2})^n$: the only solutions are $a = \pm 1$, $b = 0$ as well as $a = b = \pm 1$. If we go back to the original problem, these correspond to $x \pm 1 = 2a^3 = \pm 2$, i.e. $x \in \{\pm 1, \pm 3\}$. These then yield $(x, y) \in \{(\pm 1, 0), (\pm 3, 2)\}$ which, together with the solution $(0, -1)$ (for which $x - 1 = (-1)^3$ and $x + 1 = 1^3$), are, in conclusion, the only rational integer solutions to the equation $x^2 - y^3 = 1$. □
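A brute-force search gives an independent check on the list of solutions. The Python sketch below (the function name and the search bound are our own choices) looks for all integer pairs with $|y|$ below a bound such that $y^3 + 1$ is a perfect square. It cannot replace the proof, but it confirms which small solutions exist, including the one with $y = -1$.

```python
from math import isqrt


def small_solutions(bound: int) -> set:
    """All integer (x, y) with |y| <= bound and x^2 - y^3 = 1."""
    found = set()
    for y in range(-bound, bound + 1):
        t = y ** 3 + 1          # we need x^2 = y^3 + 1
        if t < 0:
            continue
        x = isqrt(t)
        if x * x == t:
            found.update({(x, y), (-x, y)})
    return found


print(sorted(small_solutions(500)))
```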

Exercise 8.7.20† (Lebesgue). Solve the equation $x^2 + 1 = y^n$ over $\mathbb{Z}$, where $n \ge 3$ is an odd integer.

Solution

Suppose $(x, y)$ is a solution. By the unique factorisation in $\mathbb{Z}[i]$, we have $xi + 1 = \varepsilon(a + bi)^n$ for some $a, b \in \mathbb{Z}$ and $\varepsilon \in \mathbb{Z}[i]$ a unit. Note that $y = a^2 + b^2$ is odd, since $x^2 + 1$ is never divisible by 4, so one of $a, b$ is even and the other is odd. Since the units of $\mathbb{Z}[i]$ have the form $i^k$ for some $k$ by Exercise 2.2.3∗, they are all $n$th powers since $n$ is odd, so we can assume $\varepsilon = 1$.

Hence, we wish to find the $a, b \in \mathbb{Z}$ such that $(a + bi)^n + (a - bi)^n = 2$. Since $n$ is odd, this is divisible by $2a$ so $a$ is $\pm 1$. Since $y = a^2 + b^2$ is odd, $b$ must be even. Expanding the real part of $(a + bi)^n$ (which must be 1), we get

\[ \sum_{k=0}^{\frac{n-1}{2}} \binom{n}{2k} a^{n-2k}(-1)^k b^{2k} = 1. \]

Modulo $b^2$ we get $b^2 \mid 1 - a^n \in \{0, 2\}$, and since $b^2$ is at least 4 when $b \ne 0$ (as $b$ is even), $a$ must be 1. In other words, our equation becomes

\[ (1 + bi)^n + (1 - bi)^n = 2. \]

We wish to expand the LHS as a dyadic analytic function, but this is not possible because $|b|_2$ might be equal to $2^{-\frac{1}{2-1}} = 2^{-1}$, i.e. $b$ might have dyadic valuation equal to 1. To remedy this situation, we use the LTE lemma:

\[ (1 + bi)^2 = 1 + 2b(i - b/2). \]

Since $n$ is odd, we can set $n = 2m + 1$ and reduce the problem to finding the zeros of the now dyadic analytic function

\[ f(m) = (1 + bi)(1 + 2b(i - b/2))^m + (1 - bi)(1 + 2b(-i - b/2))^m - 2 \]

where $i$ is now a square root of $-1$ in $\overline{\mathbb{Q}}_2$. Since this is a root of unity filter (see Exercise A.3.9†), as in Exercise 8.7.19†, $f(m)$ is twice the "real dyadic" part of $(1 + bi)(1 + 2b(i - b/2))^m - 1$, i.e. the coefficient of 1 in this expression. Now expand this as

\[ -1 + (1 + bi)\sum_{k=0}^{\infty} \binom{m}{k}\left(2b(i - b/2)\right)^k. \]

Suppose that $b \ne 0$, otherwise we get $x = 0$ and $y = 1$. Since $v_2(k!) \le k - 1$ by Legendre’s formula, every term except the first two vanishes modulo $2b^2$. Hence, modulo $2b^2$, this is simply

\[ -1 + (1 + bi)(1 + 2bm(i - b/2)). \]

If we expand this while focusing only on the real dyadic part, we get

\[ (1 - b^2m) + bi \cdot 2bim - 1 = -3b^2m. \]

Since $|b^2|_2 > |2b^2|_2$, we conclude that the Strassmann bound is (at most) 1. Since $m = 0$ is a trivial solution, we conclude that it is the only solution (corresponding to $n = 1$, which is not the case). Thus, the only solution is $(x, y) = (0, 1)$. □
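The congruence behind the Strassmann bound can be tested with ordinary integer arithmetic. In the Python sketch below (the helper name `real_part_power` and the sampled values of $b$ and $m$ are ours), we compute the real part of $(1 + bi)^{2m+1}$ exactly for even $b$ and check that it is congruent to $1 - 3b^2m$ modulo $2b^2$, and that it never equals 1 for $m \ge 1$ in the sampled range.

```python
def real_part_power(b: int, n: int) -> int:
    """Real part of (1 + b*i)^n, computed with exact integer arithmetic."""
    re, im = 1, 0
    for _ in range(n):
        # multiply (re + im*i) by (1 + b*i)
        re, im = re - b * im, im + b * re
    return re


def congruence_holds(b: int, m: int) -> bool:
    """Is Re((1 + b*i)^(2m+1)) - 1 congruent to -3*b^2*m modulo 2*b^2?"""
    return (real_part_power(b, 2 * m + 1) - 1 + 3 * b * b * m) % (2 * b * b) == 0


for b in (2, 4, 6):
    print(b, [real_part_power(b, 2 * m + 1) for m in range(4)])
```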

Remark 8.7.1

We can also finish directly with a slightly ad-hoc dyadic method once we reach

\[ \sum_{k=1}^{\frac{n-1}{2}} \binom{n}{2k}(-1)^k b^{2k-2} = 0. \]

Let $m$ be the dyadic valuation of $\binom{n}{2} = \frac{n(n-1)}{2}$. We will prove that $2^{m+1}$ divides $\binom{n}{2}$, which is of course a contradiction. The denominator of $\binom{n}{2k}$ is $(2k)!$. By Legendre’s formula, we have $v_2((2k)!) = 2k - s_2(2k) \le 2k - 1$. As a result, $\frac{b^{2k-2}}{(2k)!}$ has dyadic valuation at least $-1$. Since $(n - 1)(n - 3)$ divides $(2k)!\binom{n}{2k}$ for $k \ge 2$, we conclude that

\[ v_2\left(\binom{n}{2k} b^{2k-2}\right) \ge v_2((n - 1)(n - 3)) + v_2\left(b^{2k-2}/(2k)!\right) \ge m + 2 - 1 = m + 1 \]

as wanted since $m = v_2\left(\frac{n-1}{2}\right) = v_2(n - 1) - 1$. Hence, $2^{m+1}$ divides every term of the sum

\[ \sum_{k=1}^{\frac{n-1}{2}} \binom{n}{2k}(-1)^k b^{2k-2} = 0 \]

except the first one, $-\binom{n}{2}$, which means that it also divides the first one.
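The valuation estimate of this remark is easy to test numerically. The following Python sketch (the function names are ours, and `math.comb` assumes Python 3.8 or later) checks, for odd $n$ and even $b$ in a small range, that every term $\binom{n}{2k} b^{2k-2}$ with $k \ge 2$ is divisible by $2^{m+1}$, where $m = v_2\binom{n}{2}$.

```python
from math import comb


def v2(x: int) -> int:
    """Dyadic valuation of a non-zero integer."""
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v


def higher_terms_divisible(n: int, b: int) -> bool:
    """For odd n >= 3 and even b != 0: v2(C(n,2k) * b^(2k-2)) >= v2(C(n,2)) + 1 for all k >= 2."""
    m = v2(comb(n, 2))
    return all(
        v2(comb(n, 2 * k) * b ** (2 * k - 2)) >= m + 1
        for k in range(2, (n - 1) // 2 + 1)
    )


print(all(higher_terms_divisible(n, 2) for n in range(5, 61, 2)))
```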
Exercise 8.7.21†. Solve the equation $x^2 + 1 = 2y^n$ over $\mathbb{Z}$, where $n \ge 3$ is an odd integer.

Solution

Suppose $(x, y)$ is a solution. By factorising in $\mathbb{Z}[i]$, we get $xi + 1 = \varepsilon(1 + i)(a + bi)^n$ for some $a, b \in \mathbb{Z}$ and a unit $\varepsilon \in \mathbb{Z}[i]$. Note that $y = a^2 + b^2$ is odd, since $x^2 + 1$ is never divisible by 4, so one of $a, b$ is even and the other is odd. Since the units of $\mathbb{Z}[i]$ have the form $i^k$ for some $k$ by Exercise 2.2.3∗, they are all $n$th powers since $n$ is odd, so we can assume $\varepsilon = 1$.

By assumption,

\[
\begin{aligned}
2 &= (1 + ix) + (1 - ix) \\
&= (1 + i)(a + bi)^n + (1 - i)(a - bi)^n \\
&= i(1 - i)(a + bi)^n + (1 - i)(a - bi)^n \\
&= (1 - i)\left((\pm b \mp ia)^n + (a - bi)^n\right)
\end{aligned}
\]

where the $\pm$ sign depends on $n$ modulo 4. Since $n$ is odd, this last expression is divisible by

\[ \pm b \mp ia + a - bi = (a - b)(1 \mp i). \]

Thus, $(1 - i)(a - b)(1 \mp i)$ divides 2. This is equivalent to $a - b \mid 1$, so $a - b = \pm 1$. Without loss of generality, suppose that $a$ and $b$ are non-zero since $\{|a|, |b|\} = \{0, 1\}$ yields $y = 1$ and thus $x = \pm 1$. Now we distinguish a few cases, depending on which one of $a$ and $b$ is even and whether $a - b$ is 1 or $-1$.
1. $b$ is even and $a - b = 1$. In that case, our equation is

\[ f(n) := (1 + i)(1 + b(1 + i))^n + (1 - i)(1 + b(1 - i))^n - 2 = 0. \]

Unlike Exercise 8.7.20†, this is already a dyadic analytic function since $|1 + i|_2 = 2^{-1/2}$, which means that $|b(1 + i)|_2 \le 2^{-3/2} < 2^{-\frac{1}{2-1}}$ (we are working with the dyadic $i \in \overline{\mathbb{Q}}_2$). This is a unity root filter, so we are just focusing on the "real dyadic" part of $(1 + i)(1 + b(1 + i))^n - 1$. When we expand this modulo $b^3$, we get

\[ (1 + i)\left(1 + b(1 + i)n + b^2(1 + i)^2 n(n - 1)/2\right) - 1 = i + 2bin + (1 + i)2b^2 i\,n(n - 1)/2 \]

since $(1 + i)^2 = 2i$, which has real dyadic part $-b^2 n(n - 1)$. Clearly, $|b^2|_2 > |b^3|_2$ since $b \ne 0$ so the Strassmann bound is 2. The previous computation in fact shows that the first two coefficients of $f$ are zero (when written as a Mahler series $\sum_{k \ge 0} a_k\binom{x}{k}$), which means that $n = 0$ and $n = 1$ are solutions. In other words, these are the only solutions, which are ruled out by the statement.

2. $b$ is even and $a - b = -1$. Since $n$ is odd, we have $(-1 + b(1 \pm i))^n = -(1 - b(1 \pm i))^n$ so our equation is

\[ f(n) := (1 + i)(1 - b(1 + i))^n + (1 - i)(1 - b(1 - i))^n + 2 = 0. \]

The same computation as before shows that the coefficient of $n$ is $-2bi + 2bi = 0$. Thus, modulo $2b^2$, we have

\[ f(n) \equiv 4 + 0n \]

so the Strassmann bound is 0 since $|2b^2|_2 < |4|_2$. There are no solutions in this case.
3. $a$ is even. Then, $a + bi = \pm i + a(1 + i)$ so the equation is

\[ (1 + i)(\pm i + a(1 + i))^n + (1 - i)(\pm i + a(1 - i))^n - 2 = 0. \]

Since $(\pm i + \alpha)^n = \pm i(1 \pm i\alpha)^n$ for any $\alpha \in \mathbb{Q}_2(i)$, where the $\pm$ are independent and depend on whether $n \equiv 1 \pmod 4$ or $n \equiv -1 \pmod 4$, our equation is

\[ f(n) = (1 + i)(1 \pm ia(1 + i))^n + (1 - i)(1 \pm ia(1 - i))^n \pm 2i = 0 \]

where the first two $\pm$ signs are the same and the last one is independent. We will prove that the Strassmann bound is always 0. Modulo $2a$ (this is a unity root filter so the "real dyadic" part gets doubled), we have

\[ f(n) \equiv 2(1 \pm i). \]

Since $|2a|_2 < |2(1 \pm i)|_2$, we are done.

To conclude, there are no solutions to the equation $x^2 + 1 = 2y^n$ when $y$ is not equal to 1 and $n \ge 3$, i.e. the only solutions to our equation are $(\pm 1, 1)$. □
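As with the previous equations, a direct search over a small range is a cheap cross-check of the conclusion. The Python sketch below (the function name and the bounds on $y$ and $n$ are our own choices) lists all triples $(x, y, n)$ with $x \ge 0$, odd $3 \le n \le$ `max_n` and $1 \le y \le$ `max_y` such that $x^2 + 1 = 2y^n$.

```python
from math import isqrt


def small_solutions(max_y: int, max_n: int) -> list:
    """Triples (x, y, n), x >= 0, n odd in [3, max_n], 1 <= y <= max_y, with x^2 + 1 = 2*y^n."""
    found = []
    for n in range(3, max_n + 1, 2):
        for y in range(1, max_y + 1):
            t = 2 * y ** n - 1      # we need x^2 = 2*y^n - 1
            x = isqrt(t)
            if x * x == t:
                found.append((x, y, n))
    return found


print(small_solutions(200, 11))
```

In this range only $y = 1$ (with $x = 1$, and hence also $x = -1$) appears, in line with the statement above.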

Linear Recurrences
Exercise 8.7.22†. Let $(u_n)_{n \ge 0}$ be a linear recurrence of rational integers given by $\sum_i f_i(n)\alpha_i^n$ such that $\alpha_i/\alpha_j$ is not a root of unity for $i \ne j$. If $u_n$ is not of the form $a\alpha^n$ for some $a, \alpha \in \mathbb{Z}$, prove that there are infinitely many prime numbers $p$ such that $p \mid u_n$ for some integer $n \ge 0$.

Solution

Without loss of generality, suppose that $u_n$ is not identically zero. By Corollary 8.5.2 and Corollary 8.5.1, the condition on the $\alpha_i$ tells us that $|u_n| \to \infty$. The idea is that we will bound the $p$-adic valuation of $u_n$ over a subsequence $(u_{an+b})_{n \ge 0}$ to get a contradiction if $(u_n)_{n \ge 0}$ has finitely many prime divisors (since $(u_{an+b})_{n \ge 0}$ would then be bounded).
We shall analyze the local behaviour of $(u_n)_{n \ge 0}$ for a fixed prime $p$. Write $u_n = \sum_i f_i(n)\alpha_i^n$. We wish to factorise $\alpha_i$ by a suitable power of $p$ so that $\max_i |\alpha_i|_p = 1$. Indeed, since $|p^{1/n}|_p^n = |p|_p = 1/p$, the absolute values of rational powers of $p$ take any value which can be taken by $|\cdot|_p$. Thus, suppose that $\max_i |\alpha_i|_p = 1$ and consider the sequence $v_n = \sum_{i \in I} f_i(n)\alpha_i^n$ where $I$ denotes the set of $i$ such that $|\alpha_i|_p = 1$. Let $K_p$ be the field generated by the $\alpha_i$. We claim that the integers $\mathcal{O}_{K_p} := \{x \in K_p \mid |x|_p \le 1\}$ of $K_p$ are finite modulo $p^k$, for any fixed $k$. This implies (by the pigeonhole principle) that $(v_n)_{n \ge 0}$ is periodic modulo $p^k$ for any $k$. To prove our

claim, suppose for the sake of a contradiction that there were infinitely many elements of $\mathcal{O}_{K_p}$ non-congruent modulo $p^k$, say $S$ is a set of such elements. Pick a primitive element $\beta$ of $K_p/\mathbb{Q}_p$ with conjugates $\beta_1, \ldots, \beta_d$, and consider an element $x = \sum_{i=0}^{d-1} b_i\beta^i \in \mathcal{O}_{K_p}$. By definition of the $p$-adic absolute value, we also have $|x_i| \le 1$, where $x_i$ is the image of $x$ under the embedding $\beta \mapsto \beta_i$. To conclude, Cramer’s rule (Exercise C.5.7) or the adjugate (Proposition C.3.7) let us express the $b_i$ as linear combinations of the $x_j$ with coefficients depending only on the $\beta_j$. Then, using the triangle inequality, we conclude that $|b_0|_p, \ldots, |b_{d-1}|_p$ are bounded. As a consequence, the set

\[ \left\{(b_0, \ldots, b_{d-1}) \;\middle|\; \left|\sum_{i=0}^{d-1} b_i\beta^i\right| \le 1\right\} \subseteq \mathbb{Q}_p^d \]

is compact. In particular, $\mathcal{O}_{K_p}$ is as well, and thus the closure of $S$ too. This implies that there are distinct $s, r \in S$ such that $|s - r|_p$ is arbitrarily small, but then they will be equal modulo $p^k$ since

\[ u \equiv v \pmod{p^k} \iff \frac{u - v}{p^k} \in \mathcal{O}_{K_p} \iff |u - v|_p \le |p^k| = p^{-k}. \]

To conclude, note that $(v_n)_{n \ge 0}$ is non-zero for large $n$ by the Skolem-Mahler-Lech theorem. Pick any $N$ so that $v_N$ is non-zero, and let $T_p$ be the period of $(v_n)_{n \ge 0}$ modulo $p^{\lfloor v_p(v_N)\rfloor + 1}$. Then, $|v_{N + kT_p}|_p$ is greater than some constant $c > 0$ for any $k$, and thus $|u_{N + kT_p}|_p$ as well for sufficiently large $N + kT_p$, since $u_n - v_n \to 0$. If we finally return to the global behaviour and let $p$ vary among our finitely many prime divisors of $(u_n)_{n \ge 0}$, we get that, for any sufficiently large $N$, $v_p\left(u_{N + k\prod_p T_p}\right)$ is bounded for any $p$ and for any sufficiently large $k$. This contradicts the assumption that $|u_n| \to \infty$. □
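To get a feeling for the statement, one can simply list the primes dividing the first few terms of a concrete sequence. The Python sketch below uses $u_n = 2^n + 3^n$, which is our own choice of example (here $3/2$ is not a root of unity and $u_n$ is not of the form $a\alpha^n$), and factors the terms by trial division; the function names are illustrative.

```python
def prime_factors(n: int) -> set:
    """Set of prime divisors of n >= 2, by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def primes_dividing_terms(limit: int) -> set:
    """Primes dividing u_n = 2^n + 3^n for some 0 <= n < limit."""
    primes = set()
    for n in range(limit):
        primes |= prime_factors(2 ** n + 3 ** n)
    return primes


for limit in (5, 10, 15, 20):
    print(limit, len(primes_dividing_terms(limit)))
```

The number of primes found keeps growing with `limit`, as the exercise predicts.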

Remark 8.7.2
Note that, to prove that $\alpha^n$ is periodic modulo $p$ for $\alpha \in \mathcal{O}_{K_p}$, we cannot simply "convert" (with
the fundamental theorem of symmetric polynomials, after having introduced its conjugates) $\alpha$ to
an element of $\overline{\mathbb{F}}_p$ and use the Frobenius morphism. Why? Because the minimal polynomial of
$\alpha$ does not necessarily have coefficients in $\mathbb{Z}_p$. Indeed, we only consider the constant coefficient
of the minimal polynomial of $\alpha$ to compute its $p$-adic absolute value, and disregard all other
coefficients. For instance, the roots of $X^2 - X/2 + 1$ over $\mathbb{Q}_2$ are in the unit ball.

As another remark, it has in fact been proven, using a generalisation of (a $p$-adic extension of)
the Thue-Siegel-Roth theorem (see Remark 7.4.3) that $u_n$ either has the form $c\alpha^n$, or its greatest
prime factor tends to infinity. See [28].

Exercise 8.7.23† . Does there exist an unbounded linear recurrence $(u_n)_{n\ge0}$ such that $u_n$ is prime
for all $n$?

Solution

Suppose for the sake of a contradiction that $(u_n)_{n\ge0}$ is such a sequence. Without loss of generality,
suppose that $|u_n| \to \infty$ by replacing $(u_n)_{n\ge0}$ by $(u_{Nn+m})_{n\ge0}$ for some suitable $N, m$, as indicated
after Corollary 8.5.2. Now, let $m$ be sufficiently large so that $u_m = p$ is a prime which doesn't
divide the denominator of the norm of any algebraic number appearing in the formula of $u_m$
(so that they still make sense modulo $p$). Finite field theory (e.g. Theorem 4.2.1) tells us that
there exists a $k$ such that $u_{np^k} \equiv u_n \pmod p$ for any $n$. Indeed, if $u_n \equiv \sum_i f_i(n)\alpha_i^n$, with $f_i \in \overline{\mathbb{F}}_p[X]$ and
$\alpha_i \in \overline{\mathbb{F}}_p$, it suffices to pick $k$ so that $f_i \in \mathbb{F}_{p^k}[X]$ and $\alpha_i \in \mathbb{F}_{p^k}$, by the Frobenius morphism.

In particular, $u_{mp^{\ell k}} \equiv 0 \pmod p$ for any $\ell$. By assumption, this means that $u_{mp^{\ell k}} = p$, contradicting
the fact that $|u_n| \to \infty$. ∎

Miscellaneous
Exercise 8.7.24† . Which roots of unity are in Qp ?

Solution

Let $\alpha = (a_1, a_2, \ldots)$ be a root of unity of order $n$ in $\mathbb{Q}_p$. Suppose initially that $p$ is odd. We first
focus on the case where $p \nmid n$. We have $a_k^n \equiv 1 \pmod{p^k}$. However, the group of units modulo $p^k$
is isomorphic to $\mathbb{Z}/p^{k-1}(p-1)\mathbb{Z}$ by Exercise 3.5.18† (in more elementary terms: there is a primitive
root) so we also have
$$a_k^{\gcd(p-1,n)} \equiv a_k^{\gcd(p^{k-1}(p-1),n)} \equiv 1 \pmod{p^k}.$$
Hence, $\alpha$ has order dividing $\gcd(p-1, n)$, which implies that $n \mid p-1$ since $n$ is the order of $\alpha$.
Now suppose that $p \mid n$. We wish to reach a contradiction, so suppose without loss of generality
that $\alpha$ has order exactly $p$, by replacing it by $\alpha^{n/p}$. Then, $a_k^p \equiv 1 \pmod{p^k}$ so $a_k \equiv 1 \pmod p$
which implies that
$$v_p(a_k^p - 1) = 1 + v_p(a_k - 1)$$
by LTE. For large $k$, $v_p(a_k - 1)$ stabilises since $\alpha \ne 1$, which means that this cannot be at least
$k$.

It remains to provide a construction for $(p-1)$th roots of unity. One can do this using the
structure of $(\mathbb{Z}/p^k\mathbb{Z})^\times$, or by means of Hensel's lemma: the derivative of $X^p - X$ is $pX^{p-1} - 1 \equiv -1 \pmod p$, which
is never zero modulo $p$, so we can lift all roots of $X^p - X$ modulo $p$ to roots in $\mathbb{Q}_p$. The map $\omega$
which sends $x \in (\mathbb{Z}/p\mathbb{Z})^\times$ to the unique root of $X^{p-1} - 1$ congruent to
$x$ modulo $p$ is called the Teichmüller character.

It remains to treat the case where $p = 2$. When $n$ is odd, the same argument as before works:
this time we even have $a_k^{\gcd(2^{k-2}(2-1),n)} \equiv 1 \pmod{2^k}$ for $k \ge 2$. However, unlike the previous
case, there is now a root of unity of order 2: $-1$. Since the only root of unity of odd order is 1,
the order of any root of unity must be a power of 2, since $\alpha^{2^{v_2(n)}}$ is a root of unity of odd order.
Hence, we shall prove that there is no root of unity of order 4. This is easy: we use the LTE for
$p = 2$ (which simply amounts to the fact that an odd square is always 1 modulo 4) to get

$$v_2(a_k^4 - 1) = 1 + v_2(a_k^2 - 1)$$

and this stabilises since $\alpha^2 \ne 1$ by assumption. This time, the Teichmüller character is defined
as $\omega : (\mathbb{Z}/4\mathbb{Z})^\times \to \mathbb{Q}_2$ sending 1 to 1 and $-1$ to $-1$.

To conclude, the roots of unity of $\mathbb{Q}_p$ are all $(p-1)$th roots of unity, as well as a root of order 2
when $p = 2$. ∎
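
For readers who like to experiment, here is a minimal Python sketch computing the $(p-1)$th roots of unity modulo $p^k$ for an odd prime $p$. It does not follow Hensel's lemma step by step: it uses instead the iteration $t \mapsto t^p$, which gains at least one $p$-adic digit of precision per step. The function names `teichmuller` and `roots_of_unity_mod` are ours and purely illustrative.

```python
def teichmuller(x, p, k):
    """Truncation modulo p**k of the (p-1)th root of unity congruent to x modulo p.

    Assumes p is prime and p does not divide x.
    """
    modulus = p ** k
    t = x % modulus
    # If t is correct modulo p**j, then t**p is correct modulo p**(j+1).
    for _ in range(k):
        t = pow(t, p, modulus)
    return t


def roots_of_unity_mod(p, k):
    """One (p-1)th root of unity modulo p**k above each non-zero residue modulo p."""
    return [teichmuller(x, p, k) for x in range(1, p)]


if __name__ == "__main__":
    p, k = 7, 5
    for x, w in enumerate(roots_of_unity_mod(p, k), start=1):
        print(x, w, pow(w, p - 1, p ** k))
```

Each printed line shows a residue $x$, its lift $w$ modulo $7^5$, and $w^{6}$ modulo $7^5$, which should be 1.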

Exercise 8.7.27† (China TST 2010). Let $k \ge 1$ be a rational integer. Prove that, for sufficiently
large $n$, $\binom{n}{k}$ has at least $k$ distinct prime factors.

Solution
The key lemma is that, for any prime $p$ and any positive integer $n$, $p^{v_p\left(\binom{n}{k}\right)} \le n$. Suppose that
we have proven this. Then, if $\binom{n}{k}$ has at most $k-1$ prime factors, say $p_1, \ldots, p_m$, we have
$$\binom{n}{k} = \prod_{i=1}^m p_i^{v_{p_i}\left(\binom{n}{k}\right)} \le n^m \le n^{k-1}$$
which is impossible for large $n$ since $\binom{n}{k}$ is a polynomial of degree $k$ in $n$.

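It remains to prove this key claim. We use Legendre's formula and the fact that $\binom{n}{k} = \frac{n!}{k!(n-k)!}$
to write
$$v_p\left(\binom{n}{k}\right) = \sum_{i=1}^{\lfloor \log_p(n) \rfloor} \left(\left\lfloor \frac{n}{p^i} \right\rfloor - \left\lfloor \frac{n-k}{p^i} \right\rfloor - \left\lfloor \frac{k}{p^i} \right\rfloor\right).$$

The wanted result now follows from the trivial inequality $\lfloor x + y \rfloor \le \lfloor x \rfloor + \lfloor y \rfloor + 1$: each of the
terms $\left\lfloor \frac{n}{p^i} \right\rfloor - \left\lfloor \frac{n-k}{p^i} \right\rfloor - \left\lfloor \frac{k}{p^i} \right\rfloor$ is at most 1, so the whole sum is less than or equal to $\lfloor \log_p(n) \rfloor$.
This gives us $v_p\left(\binom{n}{k}\right) \le \log_p(n)$, i.e. $p^{v_p\left(\binom{n}{k}\right)} \le n$ as claimed. ∎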
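
The key lemma is easy to test numerically. The following Python sketch (with helper names of our own choosing) computes $v_p\left(\binom{n}{k}\right)$ term by term with Legendre's formula, so that one can compare it with the valuation computed directly and check the bound $p^{v_p\left(\binom{n}{k}\right)} \le n$ on small cases.

```python
from math import comb


def vp(n, p):
    """p-adic valuation of a positive integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def legendre_vp_binomial(n, k, p):
    """v_p(C(n, k)) as the sum of floor(n/p^i) - floor((n-k)/p^i) - floor(k/p^i)."""
    total, q = 0, p
    while q <= n:  # terms with p^i > n vanish
        total += n // q - (n - k) // q - k // q
        q *= p
    return total


def lemma_holds(n, k, p):
    """Check the key lemma p^(v_p(C(n, k))) <= n."""
    return p ** legendre_vp_binomial(n, k, p) <= n
```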



Exercise 8.7.28† . Find all additive functions $f : \mathbb{Z}^{\mathbb{N}} \to \mathbb{Z}$, where addition is defined componentwise.
(To those who have read Section C.2, the fact that there is a nice characterisation of those functions
should come off as a surprise.)

Solution

We claim that the $\mathbb{Z}$-linear functions from $\mathbb{Z}^{\mathbb{N}}$ to $\mathbb{Z}$ are given by linear combinations of the
coordinates, which is surprising since the vectors $e_i$ with 1 in the $i$th coordinate and 0 everywhere
else do not form a basis of $\mathbb{Z}^{\mathbb{N}}$: any linear combination of them has finitely many non-zero
coordinates (so $(1, 1, \ldots)$ isn't one for instance)! This problem thus has two parts: proving that
any such function is 0 on all but finitely many $e_i$, and proving that an additive function which is
0 on the $e_i$ is identically 0. We will do the second part first.

Suppose that $f : \mathbb{Z}^{\mathbb{N}} \to \mathbb{Z}$ is additive and $f(e_i) = 0$ for all $i$, i.e. $f$ is 0 on the space of vectors
with finitely many non-zero coordinates. The special property of $\mathbb{Z}$ is that we can use the theory
of divisibility. More precisely, if the coordinates of $x \in \mathbb{Z}^{\mathbb{N}}$ are eventually all divisible by $m$, then
$m \mid f(x)$. Indeed, if $x = (x_0, x_1, \ldots)$ is such that $m \mid x_n$ for any $n \ge N$, we have

$$f(x) = f(0, \ldots, 0, x_N, x_{N+1}, \ldots) = m f(0, \ldots, 0, x_N/m, x_{N+1}/m, \ldots).$$

Thus, if the $x_i$ are eventually all divisible by increasingly large integers, $f(x)$ must be zero! For
instance, $f(a_0, a_1 p, a_2 p^2, \ldots)$ is divisible by $p^n$ for any $n$ so must be zero. You should now be able
to see the $p$-adic flavor of this problem (even if we won't really use any of the theory developed
in this chapter)! In particular, $f(x)$ is congruent modulo $p$ to

$$f(x_0, x_1(p+1), x_2(p+1)^2, \ldots) = 0.$$

Since this is true for arbitrary $p$, $f(x)$ must be 0 too. Alternatively, using Bézout's lemma, there
are integers $y_n$ and $z_n$ such that $2^n y_n + 3^n z_n = x_n$. Thus,

$$f(x) = f(y_0, 2y_1, 4y_2, \ldots) + f(z_0, 3z_1, 9z_2, \ldots) = 0 + 0 = 0.$$
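
The Bézout decomposition used in the alternative argument can be made explicit. Here is a small Python sketch, assuming nothing beyond the extended Euclidean algorithm; the names `extended_gcd` and `split_coordinate` are illustrative choices of ours.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b), for positive a and b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def split_coordinate(x, n):
    """Return integers (y, z) with 2**n * y + 3**n * z == x."""
    g, s, t = extended_gcd(2 ** n, 3 ** n)
    assert g == 1  # 2^n and 3^n are coprime
    return s * x, t * x
```

Applying `split_coordinate` to each coordinate $x_n$ produces the two sequences $(y_n)$ and $(z_n)$ of the argument.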

Now we prove that any additive function $f : \mathbb{Z}^{\mathbb{N}} \to \mathbb{Z}$ is 0 on all but finitely many $e_i$, say on all $e_i$ with $i \notin I$ for some finite set $I$.
This implies that the function $x \mapsto f(x) - \sum_{i \in I} f(e_i) x_i$, where $x_i$ denotes the $i$th coordinate of $x$, is
zero on every $e_i$ so must be identically zero by the previous step. This shows that any additive
function is a linear combination of the coordinates.

The idea will again be $p$-adic. We will produce a sequence $x = (x_0, x_1, \ldots)$ such that $v_2(x_n)$ is
increasing and grows so fast that $f(e_i)$ must be 0 for large $i$, since we have the congruence
$$f(x) \equiv \sum_{i=0}^{n-1} x_i f(e_i) \pmod{2^{v_2(x_n)}}.$$
We can rephrase this as saying that the series $\sum_{i=0}^{\infty} x_i f(e_i)$ converges dyadically to the rational
integer $f(x)$. The point is that we have too many degrees of freedom for this to be always a

rational integer, unless $f(e_i) = 0$ for sufficiently large $i$. This follows from the fact that, if we write
the dyadic expansion of a dyadic integer as $\sum_{i=0}^{\infty} a_i 2^i$ with $a_i \in \{0, 1\}$, then the dyadic integers
with a finite dyadic expansion are exactly the rational integers. Indeed, this decomposition is
unique for the same reason that the base 2 decomposition is: if $\sum_{i=0}^{\infty} a_i 2^i = \sum_{i=0}^{\infty} b_i 2^i$, pick
the smallest $n$ such that $a_n \ne b_n$ to get $a_n 2^n \equiv b_n 2^n \pmod{2^{n+1}}$, i.e. $a_n = b_n$, which is a
contradiction. Thus, the dyadic expansion of a rational integer must be its base 2 expansion,
which is indeed finite.

Hence, we pick $x_i = 2^{n_i}$ with $(n_i)_{i \ge 0}$ an increasing sequence which grows sufficiently fast. More
precisely, if we write $2^{n_i} f(e_i)$ in base 2 as $\sum_{k=n_i}^{m_i} a_k 2^k$, we want $n_{i+1}$ to be larger than $m_i$.
That way, the base 2 expansion of $2^{n_{i+1}} f(e_{i+1})$ only adds new terms to the dyadic expansion of
$f(x) = \sum_{i=0}^{\infty} 2^{n_i} f(e_i)$, unless $f(e_{i+1}) = 0$. Since the dyadic expansion of $f(x) \in \mathbb{Z}$ is finite, for
sufficiently large $i$, $2^{n_i} f(e_i)$ cannot add new terms to it, which means $f(e_i) = 0$ as wanted. This
concludes the solution. ∎

Exercise 8.7.29† . Prove that the Skolem-Mahler-Lech theorem holds over any field of characteristic
zero.

Solution

The idea is to reduce again the problem to sequences of algebraic numbers. More precisely, let $K$
be the field generated by the numbers involved in the formula for $u_n$ and pick a transcendence
basis $\alpha_1, \ldots, \alpha_k$ (see Exercise B.4.9†), i.e. a maximal subset of elements which are algebraically
independent over $\mathbb{Q}$. The primitive element theorem then yields $K = \mathbb{Q}(\alpha_1, \ldots, \alpha_k)(\alpha)$ for some
$\alpha$ algebraic over $\mathbb{Q}(\alpha_1, \ldots, \alpha_k)$ with minimal polynomial $\pi(\alpha_1, \ldots, \alpha_k) \in \mathbb{Z}(\alpha_1, \ldots, \alpha_k)[X]$. The
key point is that we can "replace" $\alpha_1, \ldots, \alpha_k$ by any rational integers $a_1, \ldots, a_k$ and $\alpha$ by any
root $a$ of $\pi(a_1, \ldots, a_k)$, since an equality in $\mathbb{Q}(\alpha_1, \ldots, \alpha_k)(\alpha)$ reduces to an equality of algebraic
functions in $\mathbb{Q}(X_1, \ldots, X_k)$ modulo $\pi(X_1, \ldots, X_k)$.

Given an $A = (a_1, \ldots, a_k)$ and a root $a$ of $\pi(a_1, \ldots, a_k)$, we shall denote the image of $u_n$ under
the substitution $\alpha_i \to a_i$, $\alpha \to a$ by $u_n^{[A,a]}$ (note that this only makes sense if the denominator of
$u_n^{[A,a]}$ is non-zero). Our main claim is the following: there is a $T$ such that, for any $u_N \ne 0$, there
is an $A$ for which $u_N^{[A,a]}$ stays non-zero and such that the common difference of the arithmetic
progressions of zeros of $\left(u_n^{[A,a]}\right)_{n \in \mathbb{Z}}$ divides $T$. It is straightforward to see that this implies the
wanted theorem: if $(u_{nT+m})_{n \in \mathbb{Z}}$ is not identically zero, then pick an $N \in T\mathbb{Z} + m$ such that $u_N \ne 0$
and an $A$ as before to get that $\left(u_{nT+m}^{[A,a]}\right)_{n \in \mathbb{Z}}$ has finitely many zeros and thus $(u_{nT+m})_{n \in \mathbb{Z}}$ as
well.

It remains to prove this claim. Write $u_n = \sum_i f_i(n) r_i^n$, where $r_i = R_i(\alpha_1, \ldots, \alpha_k, \alpha)$ and let
$L_i(\alpha_1, \ldots, \alpha_k)$ and $C_i(\alpha_1, \ldots, \alpha_k)$ be the leading and constant coefficients of the minimal
polynomial of $r_i$, seen as a polynomial in $\mathbb{Z}[\alpha_1, \ldots, \alpha_k][X]$. Pick $B = (b_1, \ldots, b_k) \in \mathbb{Z}^k$ such that
$L_i(B)$ and $C_i(B)$ are non-zero: this implies that the $r_i$ are too (when evaluated at $B$ and any
root of $\pi(B)$) since their norm is. Then pick a prime $p$ which divides none of them. Finally,
we choose $(a_1, \ldots, a_k)$ such that $a_i \equiv b_i \pmod p$ (in particular these coefficients are still
non-zero) and $u_N^{[A,a]}$ is non-zero, where $A = (a_1, \ldots, a_k)$ and $a$ is always an arbitrary root of
$\pi(a_1, \ldots, a_k)$. This is possible since if $u_N^{[A,a]}$ were always zero, then the norm of $u_N$ is zero on
$(b_1 + p\mathbb{Z}) \times \ldots \times (b_k + p\mathbb{Z})$ so is zero by Exercise A.1.7∗ which implies that $u_N = 0$.
Now, we consider the norm $(v_n)_{n \in \mathbb{Z}}$ of $u_n^{[A,a]}$ to get a sequence of rational numbers. We will
consider $(v_n)_{n \in \mathbb{Z}}$ as a union of $p$-adic analytic functions to deduce information about its zeros,
as usual. Hence, we shall abusively consider the $r_i^{[A,a]}$ as elements of a finite extension $K_p$
of $\mathbb{Q}_p$. Note that they have zero $p$-adic valuation, i.e. are units in $K_p$. Indeed, note that
$r = L_i(A) r_i^{[A,a]}$ is a root of a monic polynomial with constant coefficient $c = C_i(A) L_i(A)^{m-1}$: if

$R_i(A) = L_i(A) X^m + \ldots + C_i(A)$, then

$$R = L_i(A)^{m-1} R_i(A)(X/L_i(A)) = X^m + \ldots + C_i(A) L_i(A)^{m-1}.$$

Since $c$ is not divisible by $p$, the norm of $r$ cannot be smaller than 1 by the strong triangle
inequality, otherwise the norm of $R(r)$ would be $|c|_p = 1$. Similarly, if it has norm greater than
1, the norm of this polynomial evaluated at $r$ would be $|r^m|_p$. Since $p \nmid L_i(A)$, we conclude that
$\left|r_i^{[A,a]}\right|_p = 1$ as wanted.
Finally, we wish to transform $(v_n)_{n \in \mathbb{Z}}$ into analytic functions, i.e. have $\left|\left(r_i^{[A,a]}\right)^t - 1\right|_p \le 1/p \le 1/p^{\frac{1}{p-1}}$.
This $t$ will be our bound for the common period of the arithmetic progressions of zeros.
For this, consider the $r_i^{[A,a]}$ as algebraic numbers again and then as elements of $\overline{\mathbb{F}}_p$. Since their
degree is bounded (by $[K : \mathbb{Q}]$), their order in $\overline{\mathbb{F}}_p$ is bounded too, say divides $T$. Then, we have
$\left(r_i^{[A,a]}\right)^T = 1$ in $\overline{\mathbb{F}}_p$, so if we return to our $p$-adic $r_i^{[A,a]} \in \overline{\mathbb{Q}}_p$, the fundamental theorem of symmetric
polynomials shows that we also have $\left(r_i^{[A,a]}\right)^T \equiv 1 \pmod p$ there. We conclude that $(v_{nT+m})_{n \in \mathbb{Z}}$
is analytic over $\mathcal{O}_K$ for any $m \in [T]$, which finishes the proof of our claim. Indeed, note that $T$
only depends on $[K : \mathbb{Q}]$ and $p$, which was fixed at the beginning of the proof, so does not depend
on $N$ (our chosen index such that $u_N \ne 0$). ∎
Appendix A

Polynomials

A.1 Fields and Polynomials


Exercise A.1.1∗ . Let $K$ be a field. Prove that $0_K a = 0_K$ for any $a \in K$.

Solution

$0_K a = (0_K + 0_K)a = 0_K a + 0_K a$ so $0_K a = 0_K$. ∎

Exercise A.1.2∗ . Let † be a binary (taking two arguments) associative operation on a set M . Suppose
that M has an identity. Prove that it is unique. Similarly, prove that, if an element g ∈ M has an
inverse, then it is unique.1

Solution

If $e$ and $e'$ are identities, then $e = ee' = e'$ so $e = e'$. Similarly, if $b$ and $b'$ are two inverses of $a$,
then
$$b = (b'a)b = b'(ab) = b'$$
by associativity. ∎

Exercise A.1.3∗ . Prove that multiplication of polynomials is associative and commutative.

Solution

Let $f = \sum_i a_i X^i$, $g = \sum_j b_j X^j$ and $h = \sum_k c_k X^k$ be three polynomials. We have
$$fg = \sum_{i+j=\ell} a_i b_j X^\ell = \sum_{i+j=\ell} b_j a_i X^\ell = gf$$

since multiplication is commutative in a field and

$$(fg)h = \sum_{i+j+k=\ell} (a_i b_j) c_k X^\ell = \sum_{i+j+k=\ell} a_i (b_j c_k) X^\ell = f(gh)$$

since multiplication is associative in a field. (This also works for formal power series.) ∎
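
As a sanity check, the product formula above translates directly into code. The following Python sketch represents a polynomial by its list of coefficients (index equal to exponent); the helper name `poly_mul` is ours.

```python
def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists (index = exponent).

    The coefficient of X^l is the sum of a_i * b_j over all i + j = l.
    """
    if not f or not g:
        return []  # the empty list stands for the zero polynomial
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


if __name__ == "__main__":
    f, g, h = [1, 2, 3], [0, 1], [5, -1, 4, 2]
    print(poly_mul(f, g) == poly_mul(g, f))
    print(poly_mul(poly_mul(f, g), h) == poly_mul(f, poly_mul(g, h)))
```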

1 Such a structure is called a monoid.


Exercise A.1.4∗ . Prove that the gcd of 0 and 0 is 0.

Solution

Any polynomial divides 0 and 0 if and only if it divides 0. 

Exercise A.1.5∗ . Prove that the Euclidean algorithm produces the gcd. Deduce that the gcd of
two polynomials in $K[X]$ is also in $K[X]$. (As a consequence, the fundamental theorem of algebra
Theorem A.1.1 implies that two polynomials with rational coefficients are coprime in $\mathbb{Q}[X]$ if and only
if they have no common complex root.)

Solution

We need to prove that gcd(f, g) = gcd(f − gq, g) for any f, g. This implies that the steps in the
Euclidean algorithm preserve the gcd. Since deg f + deg g decreases at each step, we eventually
reach a situation where f = 0, and the gcd of 0 and g is clearly g. It is however very trivial that
the gcd is conserved since
h | f, g =⇒ h | f − gq, g
and
h | f − gq, g =⇒ h | g, (f − gq) + gq = f.
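
To see the algorithm at work, here is a minimal Python sketch of the Euclidean algorithm over $\mathbb{Q}$, using exact rational arithmetic. Polynomials are coefficient lists (index equal to exponent), the empty list is the zero polynomial, and the gcd is normalised to be monic; these conventions and the function names are our own assumptions.

```python
from fractions import Fraction


def trim(f):
    """Copy of f without trailing zero coefficients."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_divmod(f, g):
    """Euclidean division f = g*q + r over Q, with r = 0 or deg r < deg g."""
    f, g = trim(map(Fraction, f)), trim(map(Fraction, g))
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)
        c = r[-1] / g[-1]  # cancel the leading term of r
        q[shift] = c
        for i, b in enumerate(g):
            r[i + shift] -= c * b
        r = trim(r)
    return q, r


def poly_gcd(f, g):
    """Monic gcd of f and g: repeatedly replace (f, g) by (g, f - g*q)."""
    f, g = trim(map(Fraction, f)), trim(map(Fraction, g))
    while g:
        f, g = g, poly_divmod(f, g)[1]
    return [c / f[-1] for c in f] if f else []
```

For instance, `poly_gcd([2, -3, 1], [-3, 2, 1])` computes the gcd of $(X-1)(X-2)$ and $(X-1)(X+3)$.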


Exercise A.1.6∗ (Bézout’s Lemma). Consider two polynomials f, g ∈ K[X]. Prove that there exist
polynomials u, v ∈ K[X] such that uf + vg = gcd(f, g).

Solution

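Dividing $f$ and $g$ by their gcd, we may assume that they are coprime. Without loss of generality,
suppose that $\deg g \le \deg f$. We proceed by induction on $\deg g$. When
this is 0, i.e. $g$ is a non-zero constant, we have $0 \cdot f + (1/g) \cdot g = 1$ as wanted. For the induction step, perform
the Euclidean division of $f$ by $g$: $f = gq + r$. Since $g$ and $r$ are still coprime and $\deg r < \deg g$, by the induction hypothesis,
there are $u$ and $v$ such that $ug + vr = 1$. Then,

$$\begin{aligned} 1 &= ug + vr \\ &= ug + v(f - gq) \\ &= vf + (u - qv)g \end{aligned}$$

as wanted. ∎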

Remark A.1.1
One might, at first sight, think that this proof also works for non-coprime $f, g$ (which is impossible
for obvious reasons). However, we used the assumption that they were coprime when we said
the base case was $\deg g = 0$: this is only true because the gcd is 1 so the Euclidean algorithm
eventually yields a pair $\{1, f\}$ with $f \ne 0$, right before the pair $\{1, 0\}$. Otherwise, we would have
to do the base case when $\deg g = -\infty$ which is clearly impossible.

Exercise A.1.7∗ . Let f ∈ K[X1 , . . . , Xn ] be a polynomial in n variables and suppose S1 , . . . , Sn ⊆ K


are subsets of K such that |Si | > degXi f . If f vanishes on S1 × . . . × Sn , prove that f = 0. (This is
the generalisation of Corollary A.1.1 to multivariate polynomials.)

Solution

We proceed by induction on $n$, the base case being the previous proposition. Fix $x_n \in S_n$. Then,
the polynomial
$$g(x_n) = f(X_1, \ldots, X_{n-1}, x_n) \in K[X_1, \ldots, X_{n-1}]$$
vanishes on $S_1 \times \ldots \times S_{n-1}$ and has degree less than $|S_i|$ in $X_i$. Hence, $g(x_n) = 0$ by the induction hypothesis. Finally, $g$ is a
polynomial in $X_n$ (with coefficients in the ring $K[X_1, \ldots, X_{n-1}]$) of degree less than $|S_n|$ vanishing
on $S_n$, which implies that it's 0 by Corollary A.1.1. (Technically, to use Corollary A.1.1 we would
need to work over a field, while we are only working over a ring: $K[X_1, \ldots, X_{n-1}]$. However, this
is trivial to fix: this is an integral domain so we can embed it in its field of fractions, i.e. work over
the field $K(X_1, \ldots, X_{n-1})$.) ∎

Exercise A.1.8∗ . Prove that $(fg)' = f'g + fg'$ and $(f + g)' = f' + g'$ for any $f, g \in K[X]$. Show
also that $(f^n)' = n f' f^{n-1}$ for any positive integer $n$, where $f^k$ denotes the $k$th power and not the $k$th
iterate. More generally, show that

$$\left(\prod_{i=1}^n f_i\right)' = \sum_{i=1}^n f_i' \prod_{j \ne i} f_j.$$

Solution

Write $f = \sum_i a_i X^i$ and $g = \sum_j b_j X^j$. We have
$$(f + g)' = \sum_k k(a_k + b_k) X^{k-1} = \sum_i i a_i X^{i-1} + \sum_j j b_j X^{j-1} = f' + g'$$

which shows additivity. For the multiplication, we have

$$(fg)' = \left(\sum_{i+j=k} a_i b_j X^k\right)' = \sum_{i+j=k} k a_i b_j X^{k-1}$$

and
$$f'g + g'f = \sum_{i+j=k} i a_i b_j X^{k-1} + \sum_{i+j=k} j a_i b_j X^{k-1} = \sum_{i+j=k} (i a_i b_j + j a_i b_j) X^{k-1} = \sum_{i+j=k} k a_i b_j X^{k-1}$$

as wanted. Finally, the last point follows from $(fg)' = f'g + g'f$ by induction:
$$\left(\prod_{i=1}^n f_i\right)' = f_n' \prod_{i=1}^{n-1} f_i + f_n \sum_{i=1}^{n-1} f_i' \prod_{n \ne j \ne i} f_j = \sum_{i=1}^n f_i' \prod_{j \ne i} f_j.$$

The previous point is the case $f_1 = \ldots = f_n = f$. ∎

Exercise A.1.9∗ . Prove that every function f : Fp → Fp is polynomial.

Solution

This follows from Lagrange’s interpolation theorem since Fp is finite. 
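
Lagrange interpolation over $\mathbb{F}_p$ is entirely explicit, as the following Python sketch illustrates. It assumes $p$ is prime and Python 3.8 or later (for modular inverses via `pow(x, -1, p)`); the function names are ours.

```python
def poly_mul_mod(f, g, p):
    """Product of two coefficient lists (index = exponent), reduced modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def interpolate_mod_p(values, p):
    """Coefficients of a polynomial of degree < p taking the value values[a] at each a in F_p."""
    coeffs = [0] * p
    for a in range(p):
        # Lagrange basis polynomial: product over b != a of (X - b) / (a - b).
        basis, denom = [1], 1
        for b in range(p):
            if b != a:
                basis = poly_mul_mod(basis, [-b % p, 1], p)
                denom = denom * (a - b) % p
        scale = values[a] * pow(denom, -1, p) % p
        for i, c in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * c) % p
    return coeffs


def evaluate_mod_p(coeffs, x, p):
    """Evaluate the polynomial at x modulo p with Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result
```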



Exercise A.1.10∗ . Prove that the derivative of a rational function does not depend on its form: i.e.
$(f/g)' = ((hf)/(hg))'$ for any $f, g, h \in K[X]$ with $g, h \ne 0$.

Solution

We have
$$(f/g)' = \frac{f'g - g'f}{g^2}$$
and
$$(hf/hg)' = \frac{(hf)'(hg) - (hg)'(hf)}{(hg)^2} = \frac{h^2(f'g - g'f)}{h^2 g^2} = \frac{f'g - g'f}{g^2}.$$


A.2 Algebraic Structures and Morphisms


Exercise A.2.1∗ . Prove that 1R and 0R are unique, and that any element has a unique additive
inverse and a unique multiplicative inverse if it is non-zero.

Solution

This follows from Exercise A.2.9∗ . 

Exercise A.2.2∗ . Let R be a ring. Prove that 0R a = a0R = 0R for any a ∈ R.

Solution

The proof is the same as for Exercise A.1.1∗ . 

Exercise A.2.3∗ . Prove that $\operatorname{char} R$ is the smallest $m \ge 0$ such that $R$ contains a copy of $\mathbb{Z}/m\mathbb{Z}$.

Solution

If $R$ contains a copy of $\mathbb{Z}/m\mathbb{Z}$ with $m \ge 1$ then $R$ has characteristic dividing $m$ which shows the
result when $m \ne 0$. If $m = 0$, then $R$ has characteristic zero since $n \ne 0$ in $R$ for all non-zero $n \in \mathbb{Z}$. The
converse is clear: the copy of $\mathbb{Z}/m\mathbb{Z}$ is given by $a \pmod m \mapsto \underbrace{1 + \ldots + 1}_{a \text{ times}}$ for $a \in \mathbb{N}$. (This is well-defined
because the characteristics are the same.) ∎

Exercise A.2.4∗ . Prove that the characteristic of a field is either 0 or a prime number p.

Solution

Let $c$ denote the characteristic of a given field $K$. If $c \ne 0$, then $c \ge 2$ since the trivial ring is not
a field. Suppose that $c = ab$. Then, in $K$, we have $ab = 0$ which means $a = 0$ or $b = 0$ since it's
an integral domain. By minimality of the characteristic, this means that $c = a$ or $c = b$. ∎

Exercise A.2.5. Let $R$ be a finite integral domain (i.e. with finite cardinality). Prove that it is a
field.

Solution

Let $a \in R$ be non-zero. Consider the powers of $a$: $a, a^2, \ldots$. Since $R$ is finite, there exist $i < j$
such that $a^i = a^j$, i.e. $a^i(a^{j-i} - 1) = 0$. Since $a \ne 0$ and $R$ is an integral domain, we get
$a^{j-i} - 1 = 0$, so that $a^{j-i-1}$ is the inverse of $a$. ∎

Exercise A.2.6∗ . Prove that a subring of a field is an integral domain.

Solution

If $ab = 0$ and $a \ne 0$ then $b = a^{-1}ab = 0$. ∎

Exercise A.2.7. What goes wrong if you try to construct the field of fractions of a commutative ring
which isn’t a domain?

Solution

Clearly, if $uv = 0$ with $u, v \ne 0$, there is something wrong with $1/u$. Indeed, we would have $1/u = v/(uv) = v/0$
which doesn't make sense (even formally: $1 \cdot 0$ is not equal to …). The problem is that the relation $a/b = c/d$ if
$ad = bc$ is not an equivalence relation anymore: we can have $a/b = c/d$ and $c/d = x/y$ but
$a/b \ne x/y$. Indeed, this is how the usual proof of transitivity goes: we have $ad = bc$ and $cy = dx$
so
$$ady = bcy = bdx$$
which doesn't necessarily mean $ay = bx$ since $d$ might not be cancellable. Here is a concrete
counterexample: if $dd' = 0$ with $d \ne 0$, then $1/d = d'/0$ and $d'/0 = 1/0$ but $1/d \ne 1/0$. ∎
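
The failure of transitivity can be checked concretely in $\mathbb{Z}/6\mathbb{Z}$, where $2 \cdot 3 = 0$. The following Python sketch is only an illustration; the choice of the ring $\mathbb{Z}/6\mathbb{Z}$ and the helper name `same_fraction` are ours.

```python
def same_fraction(a, b, c, d, m):
    """The would-be relation a/b ~ c/d, i.e. a*d == b*c, computed in Z/mZ."""
    return (a * d - b * c) % m == 0


if __name__ == "__main__":
    m, d, d_prime = 6, 2, 3  # d * d_prime == 0 in Z/6Z
    print(same_fraction(1, d, d_prime, 0, m))  # 1/d ~ d'/0
    print(same_fraction(d_prime, 0, 1, 0, m))  # d'/0 ~ 1/0
    print(same_fraction(1, d, 1, 0, m))        # but 1/d is not related to 1/0
```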

Exercise A.2.8∗ . Let R be an integral domain. Prove that R[X] is also one.

Solution

Suppose that $f$ and $g$ are non-zero elements of $R[X]$ with respective leading coefficients $a$ and $b$.
Then, the leading coefficient of $fg$ is $ab$ since $ab \ne 0$ as $R$ is an integral domain, which implies
in particular that $fg$ is non-zero. ∎

Exercise A.2.9∗ . Prove that the identity e of a group G is unique, and that any a ∈ G has a unique
inverse. Moreover, prove that (xy)−1 = y −1 x−1 .

Solution

If $e$ and $e'$ are two identities then $e = ee' = e'$. The inverse of $xy$ is $y^{-1}x^{-1}$ since $(xy)(y^{-1}x^{-1}) =
xx^{-1} = e$. ∎

Exercise A.2.10∗ . Check that (Sn , ◦) is a group.



Solution

Since permutations are bijective, they are invertible. Moreover, the identity permutation is the
identity of the group. Finally, it is clear that the operation is associative since composition is.

Exercise A.2.11∗ . Prove that a morphism of groups from $(G, \dagger)$ to $(H, \star)$ maps the identity of $G$ to
the identity of $H$.

Solution

Let $\varphi$ be such a morphism and $e_G, e_H$ be the identities of $G$ and $H$ respectively. We have

$$\varphi(e_G) = \varphi(e_G \dagger e_G) = \varphi(e_G) \star \varphi(e_G)$$

so $\varphi(e_G) = e_H$ as wanted (by starring both sides by the inverse of $\varphi(e_G)$). ∎

Exercise A.2.12∗ . Prove that the kernel of a morphism (of rings or groups) is closed under addition.

Solution

If ϕ(a) = 0 and ϕ(b) = 0 then ϕ(a + b) = ϕ(a) + ϕ(b) = 0. 

Exercise A.2.13∗ . Prove that a morphism of groups is injective iff its kernel is trivial, i.e. consists
of only the identity.

Solution

If it is injective, then the kernel is trivial. Otherwise, suppose that $\varphi(a) = \varphi(b)$ and $a \ne b$. Then
$\varphi(ab^{-1}) = e$ so the kernel is non-trivial. ∎

A.3 Exercises
Derivatives
Exercise A.3.1† . Let $f, g \in K[X]$ be two polynomials. Prove that the derivative of $f \circ g$ is $g' \cdot (f' \circ g)$.

Solution

Write $f = \sum_i a_i X^i$. Then, $(f \circ g)' = \sum_i a_i (g^i)' = \sum_i i a_i g' g^{i-1} = g' \cdot (f' \circ g)$. ∎

Exercise A.3.2† . Let $f \in K[X]$ be a non-constant polynomial. Prove that there are a finite number
of $g, h \in K[X]$ such that $g \circ h = f$, up to affine translation, meaning $(g, h) \equiv \left(g(aX + b), \frac{h-b}{a}\right)$.

Solution

By composing with an affine transformation, we may assume that $h(0) = 0$ and that $h$ is monic.
If we differentiate the equation $g \circ h = f$, we get $h' \mid f'$. There is a finite number of such $h'$
since we fixed its leading coefficient and $f$ is non-constant. Since $h(0) = 0$, there is also a finite
number of $h$. Since $g$ is uniquely determined from $h$, we are done. ∎

Exercise A.3.4† (USA TST 2017). Let f, g ∈ R[X] be non-constant coprime polynomials. Prove
that there are at most three real numbers λ such that f + λg is the square of a polynomial.

Solution

The key point is that, if $f + \lambda g$ is a square $h^2$, then $h$ divides $f + \lambda g$ as well as $f' + \lambda g' = 2hh'$
so must divide
$$g'(f + \lambda g) - g(f' + \lambda g') = fg' - f'g$$
which is independent of $\lambda$. (Note that this is the determinant of $\begin{pmatrix} f & g \\ f' & g' \end{pmatrix}$ which was also used
in Exercise A.3.22†. This explains why it doesn't depend on $\lambda$.)

Also, if $f + \lambda g = r^2$ and $f + \mu g = s^2$ for $\mu \ne \lambda$, $r$ and $s$ are coprime since $r^2$ and $s^2$ are two linearly
independent linear combinations of $f$ and $g$, and we know $f$ and $g$ are coprime. Thus, if
$f + \lambda_i g = h_i^2$ for $i = 1, \ldots, n$, we get $h_1 \cdot \ldots \cdot h_n \mid f'g - g'f$ as they all divide it and are pairwise coprime.
However, when $n$ is large (i.e. greater than 3), the degree of the LHS will be too big so this will
be impossible. Indeed, from $f + \lambda_i g = h_i^2$, we deduce that $\deg h_i$ is $\max(\deg f, \deg g)/2$, except
possibly for one value of $\lambda$ when $\deg f = \deg g$. In the first case we are done since

$$4 \max(\deg f, \deg g)/2 > \deg f + \deg g - 1,$$

so we must have $f'g = g'f$ which is impossible as this would mean $f \mid f'$ since $f$ and $g$ are coprime.
For the second case, if $\deg(f + \lambda g)$ is small, note that we can replace $f$ by $f + \lambda g$ (and replace the
$\lambda_i$ by other real numbers $\mu_i$) and this case is now impossible since $\deg(f + \lambda g) < \deg g = \deg f$.
Note that this doesn't change the value of $f'g - g'f$ up to sign because we constructed it to be

$$g'(f + \lambda g) - g(f' + \lambda g') = fg' - f'g.$$

Exercise A.3.6† (Discrete Derivative). Let $f \in K[X]$ be a polynomial of degree $n$ and leading
coefficient $a$. Define its discrete derivative as $\Delta f := f(X + 1) - f(X)$. Prove that, for any $g \in K[X]$,
$\Delta f = \Delta g$ if and only if $f - g$ is constant, and that $\Delta f$ is a polynomial of degree $n - 1$ with leading
coefficient $an$. Deduce the minimal degree of a monic polynomial
$f \in \mathbb{Z}[X]$ identically zero modulo $m$, for a given integer $m \ge 1$.

Solution

The discrete derivative operator is a morphism (from the space of polynomials of degree at most
$n$ to the space of polynomials of degree at most $n - 1$), thus it suffices to show that its kernel
consists only of constants. This follows from the second part, that $\Delta f$ is a polynomial of degree
$n - 1$. For this, simply write $f = \sum_{i=0}^n a_i X^i$. Then,
$$\Delta f = \sum_{i=0}^n a_i\left((X + 1)^i - X^i\right) = \sum_{i=0}^n a_i \sum_{j=0}^{i-1} \binom{i}{j} X^j$$
and the term in $X^{n-1}$ is reached only once for $i = n$, $j = n - 1$, with coefficient $a_n \binom{n}{n-1} = a_n n$.
Finally, if a polynomial is identically zero modulo $m$ and monic of degree $n$, $\Delta^n f = n!$ since the
degree decreases by one every time we apply $\Delta$, while the leading coefficient gets multiplied by
the degree. Thus, $m \mid n!$. Conversely, if $n$ is the minimal integer such that $m \mid n!$, the polynomial
$$f = n! \binom{X}{n} = X(X - 1) \cdot \ldots \cdot (X - (n - 1))$$

works. ∎
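
The answer is easy to compute in practice. The Python sketch below (function names ours) finds the smallest $n$ with $m \mid n!$ and evaluates the falling factorial $X(X-1)\cdots(X-(n-1))$, so that one can check on examples that it vanishes modulo $m$ at every integer.

```python
def minimal_degree(m):
    """Smallest n >= 0 such that m divides n!."""
    n, factorial = 0, 1
    while factorial % m != 0:
        n += 1
        factorial *= n
    return n


def falling_factorial(x, n):
    """Value of X (X - 1) ... (X - (n - 1)) at X = x."""
    result = 1
    for i in range(n):
        result *= x - i
    return result


if __name__ == "__main__":
    m = 12
    n = minimal_degree(m)
    print(n, all(falling_factorial(x, n) % m == 0 for x in range(-20, 21)))
```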

Exercise A.3.7† . Let $f : \mathbb{R} \to \mathbb{R}$ be a function. Define its discrete derivative $\Delta f$ as $x \mapsto f(x + 1) - f(x)$.
Prove that, for any integer $n \ge 0$,

$$\Delta^n f(x) = \sum_{k=0}^n (-1)^{n-k} \binom{n}{k} f(x + k).$$

Solution

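We proceed by induction on $n$. For $n = 0$ it is of course trivial. If it's true for $n$, then, with the
convention that $\binom{n}{-1} = \binom{n}{n+1} = 0$,

$$\begin{aligned}
\Delta^{n+1} f(x) &= \Delta(\Delta^n f)(x) \\
&= \sum_{k=0}^n (-1)^{n-k} \binom{n}{k} \left(f(x + k + 1) - f(x + k)\right) \\
&= \sum_{k=0}^{n+1} \left((-1)^{n+1-k} \binom{n}{k-1} - (-1)^{n-k} \binom{n}{k}\right) f(x + k) \\
&= \sum_{k=0}^{n+1} (-1)^{n+1-k} \left(\binom{n}{k-1} + \binom{n}{k}\right) f(x + k) \\
&= \sum_{k=0}^{n+1} (-1)^{n+1-k} \binom{n+1}{k} f(x + k).
\end{aligned}$$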
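
The formula can be compared with the iterated definition on any concrete function. Here is a short Python sketch doing so; the helper names `delta`, `iterated_delta` and `closed_form` are ours.

```python
from math import comb


def delta(f):
    """Discrete derivative: x -> f(x + 1) - f(x)."""
    return lambda x: f(x + 1) - f(x)


def iterated_delta(f, n):
    """Apply the discrete derivative n times."""
    for _ in range(n):
        f = delta(f)
    return f


def closed_form(f, n, x):
    """Right-hand side: sum of (-1)^(n-k) C(n, k) f(x + k) for k = 0..n."""
    return sum((-1) ** (n - k) * comb(n, k) * f(x + k) for k in range(n + 1))
```

With integer-valued test functions such as $x \mapsto x^5 - 3x + 2$, both sides are computed exactly.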

Exercise A.3.8† . Let $m \ge 0$ be an integer. Prove that there is a polynomial $f_m \in \mathbb{Q}[X]$ of degree
$m + 1$ such that

$$\sum_{k=0}^n k^m = f_m(n)$$

for any $n \in \mathbb{N}$.

Solution
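We proceed by induction on $m$ by noting that $\sum_{k=0}^n k^0 = n + 1 := f_0(n)$ and that
$$\begin{aligned}
(n + 1)^{m+1} &= \sum_{k=0}^n \left((k + 1)^{m+1} - k^{m+1}\right) \\
&= \sum_{i=0}^m \binom{m+1}{i} \sum_{k=0}^n k^i \\
&= (m + 1)\left(\sum_{k=0}^n k^m\right) + \sum_{i=0}^{m-1} \binom{m+1}{i} f_i(n)
\end{aligned}$$

so that
$$f_m(n) = \sum_{k=0}^n k^m = \frac{(n + 1)^{m+1}}{m + 1} - \sum_{i=0}^{m-1} \binom{m+1}{i} \frac{f_i(n)}{m + 1}$$

is a polynomial as well. Note also that its leading coefficient is $\frac{1}{m+1}$. ∎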
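
The recurrence above is effective: it computes the coefficients of $f_m$ from those of $f_0, \ldots, f_{m-1}$. A minimal Python sketch with exact rational arithmetic follows (the function names are ours, and we use the convention $0^0 = 1$ as in the solution).

```python
from fractions import Fraction
from math import comb


def power_sum_polynomials(max_m):
    """Coefficient lists (index = exponent of n) of f_0, ..., f_{max_m}."""
    polys = []
    for m in range(max_m + 1):
        # (n + 1)^(m + 1) has coefficient C(m + 1, j) on n^j.
        coeffs = [Fraction(comb(m + 1, j)) for j in range(m + 2)]
        # Subtract C(m + 1, i) * f_i for every i < m.
        for i, f_i in enumerate(polys):
            for j, c in enumerate(f_i):
                coeffs[j] -= comb(m + 1, i) * c
        polys.append([c / (m + 1) for c in coeffs])
    return polys


def evaluate(coeffs, n):
    """Evaluate a coefficient list at n."""
    return sum(c * n ** j for j, c in enumerate(coeffs))
```

For example, `power_sum_polynomials(1)[1]` is the coefficient list of $n(n+1)/2$.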

Roots of Unity
Exercise A.3.9† (Root of Unity Filter). Let $f = \sum_i a_i X^i \in K[X]$ be a polynomial, and suppose that
$\omega_1, \ldots, \omega_n \in K$ are distinct $n$th roots of unity. Prove that
$$\frac{f(\omega_1) + \ldots + f(\omega_n)}{n} = \sum_{n \mid k} a_k.$$

Deduce that, if $K = \mathbb{C}$,
$$\max_{|z|=1} |f(z)| \ge |f(0)|.$$

(You may assume the existence of a primitive $n$th root of unity $\omega$, meaning that $\omega^k \ne 1$ for all $0 < k < n$,
or, equivalently, every $n$th root of unity is a power of $\omega$. This will be proven in Chapter 3.)

Solution

Let $\omega$ be a primitive $n$th root of unity. Note that, if $n \nmid m$,

$$\sum_{k=1}^n \omega_k^m = \sum_{k=0}^{n-1} \omega^{km} = \frac{\omega^{mn} - 1}{\omega^m - 1} = 0$$
since the numerator is zero and the denominator isn't. When $n \mid m$, the sum is simply $\sum_{k=1}^n 1 =
n$. Thus, we have proven the result for monomials, and the general case follows by taking linear
combinations (if it's true for $f$ and $g$ it's also true for $af$ and $f + g$).

For $n > \deg f$ we have $\frac{f(\omega_1) + \ldots + f(\omega_n)}{n} = f(0)$ so

$$\max_k |f(\omega_k)| \ge \left|\frac{f(\omega_1) + \ldots + f(\omega_n)}{n}\right| = |f(0)|$$

by the triangle inequality. ∎
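
Over $\mathbb{C}$, the filter can be checked numerically (up to floating-point error). The Python sketch below averages $f$ over the $n$th roots of unity $e^{2i\pi k/n}$; the function name `filtered_sum` is ours.

```python
import cmath


def filtered_sum(coeffs, n):
    """Average of f over the n-th roots of unity, f given by its coefficient list."""
    roots = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

    def f(z):
        return sum(a * z ** i for i, a in enumerate(coeffs))

    return sum(f(w) for w in roots) / n


if __name__ == "__main__":
    coeffs = [1, 2, 3, 4, 5, 6, 7]  # f = 1 + 2X + ... + 7X^6
    # Should be close to a_0 + a_3 + a_6.
    print(filtered_sum(coeffs, 3))
```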

Exercise A.3.10† . Let $f = \sum_i a_i X^i \in \mathbb{R}[X]$ be a polynomial and $\omega_1, \ldots, \omega_n \in \mathbb{C}$ be distinct $n$th
roots of unity with $n > \deg f$. Prove that
$$\frac{|f(\omega_1)|^2 + \ldots + |f(\omega_n)|^2}{n} = \sum_i a_i^2.$$

Denote by $S(f)$ the sum of the squares of the coefficients of $f$. Deduce that $S(fg) = S(f X^{\deg g} g(1/X))$
for all $f, g \in \mathbb{R}[X]$. ($X^{\deg g} g(1/X)$ is the polynomial obtained by reversing the coefficients of $g$.)

Solution

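Note that
$$|f(\omega)|^2 = f(\omega)\overline{f(\omega)} = f(\omega) f(\overline{\omega}) = f(\omega) f(\omega^{-1})$$
for any $\omega$ on the unit circle, since $\omega\overline{\omega} = |\omega|^2 = 1$ for these $\omega$. Thus,
$$\begin{aligned}
\frac{1}{n} \sum_{k=1}^n |f(\omega_k)|^2 &= \frac{1}{n} \sum_{k=1}^n f(\omega_k) f(\omega_k^{-1}) \\
&= \frac{1}{n} \sum_{k=1}^n \sum_i a_i \omega_k^i \sum_j a_j \omega_k^{-j} \\
&= \frac{1}{n} \sum_{i,j} a_i a_j \sum_{k=1}^n \omega_k^{i-j} \\
&= \sum_i a_i^2
\end{aligned}$$

by Exercise A.3.9† since $n \mid i - j$ iff $i = j$ for $i, j \in [[0, \deg f]]$, as $n > \deg f$. For the second part,
note that $|f(\omega) g(\omega)| = |f(\omega) \omega^{\deg g} g(1/\omega)|$ for any $\omega$ on the unit circle. ∎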

Exercise A.3.11† . Let $k$ be an integer. Prove that $\sum_{a \in \mathbb{F}_p} a^k$ is 0 if $p - 1 \nmid k$ and $-1$ otherwise.
Deduce that any non-constant polynomial $f \in \mathbb{F}_p[X]$ satisfying $f(a) \in \{0, 1\}$ for all $a \in \mathbb{F}_p$ must have
degree at least $p - 1$.

Solution

The first part is Exercise A.3.9† for $K = \mathbb{F}_p$, since non-zero elements of $\mathbb{F}_p$ are $(p - 1)$th roots of
unity by Fermat's little theorem. For the second, let $m$ be the number of times $f(a) = 1$. Then,
$\sum_{a \in \mathbb{F}_p} f(a) \equiv m \pmod p$. If $\deg f < p - 1$, this sum is zero modulo $p$ by the first part which is
impossible since $m \in [1, p - 1]$ (if $f$ is constant over $\mathbb{F}_p$ and has degree less than $p$, $f - f(0)$ has
more roots than its degree so is zero). ∎
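
The first part is quick to confirm by brute force for small primes and positive exponents $k$. A minimal Python sketch (the function name is ours):

```python
def power_sum_mod_p(k, p):
    """Sum of a^k over all a in F_p, reduced modulo p (for k >= 1)."""
    return sum(pow(a, k, p) for a in range(p)) % p


if __name__ == "__main__":
    p = 7
    for k in range(1, 13):
        # Expect p - 1 (i.e. -1 modulo p) exactly when p - 1 divides k.
        print(k, power_sum_mod_p(k, p))
```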

Exercise A.3.12† . Let $p \ne 3$ be a prime number. Suppose that $a$ and $b$ are integers such that
$p \mid a^2 + ab + b^2$. Prove that $(a + b)^p \equiv a^p + b^p \pmod{p^3}$.

Solution

Note that we can suppose that $a, b \not\equiv 0 \pmod p$ and reduce the problem to the case where $b = 1$
by considering $x \equiv ab^{-1} \pmod p$ so that $x^2 + x + 1 \equiv 0$. In particular, $x$ has order 3 modulo $p$
since $x^3 - 1 \equiv (x - 1)(x^2 + x + 1)$ but $x \not\equiv 1$ since $p \ne 3$. This implies that $p \equiv 1 \pmod 3$ by
Exercise 3.3.4∗. (This is a special case of Theorem 3.3.1.)

The key point is that, since $p \equiv 1 \pmod 3$, we have $(X^2 + X + 1)^2 \mid (X + 1)^p - X^p - 1 := f$.
Since $(X + 1)^p - X^p - 1 \equiv 0 \pmod p$ by the binomial expansion (see ?? for more details), this
means that $(X^2 + X + 1)^2$ divides the polynomial $f/p$ in $\mathbb{Q}[X]$, and hence in $\mathbb{Z}[X]$ too since it is
monic. We conclude that $p(X^2 + X + 1)^2 \mid f$ in $\mathbb{Z}[X]$ so that

$$v_p(f(x)) \ge v_p\left(p(x^2 + x + 1)^2\right) \ge 3$$

as wanted. To prove the key point, first note that $X^2 + X + 1$ is irreducible over $\mathbb{Q}$ and that its roots are the primitive
third roots of unity $\omega$, since $X^3 - 1 = (X^2 + X + 1)(X - 1)$. Hence, we wish to show that $f(\omega) = 0$
and $f'(\omega) = 0$. We have

$$f(\omega) = (\omega + 1)^p - \omega^p - 1 = (-\omega^2)^p - \omega^p - 1 = 0$$

since $\omega^p$ is also a primitive third root of unity. Similarly,

$$f'(\omega) = p(\omega + 1)^{p-1} - p\omega^{p-1} = p\omega^{2(p-1)} - p\omega^{p-1} = p - p = 0$$

since $3 \mid p - 1$ and $p - 1$ is even so we are done. ∎
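
The statement can be tested by brute force for small primes. The Python sketch below (function names ours) lists the pairs $(a, b)$ with $1 \le a, b < p$ and $p \mid a^2 + ab + b^2$, and checks the congruence modulo $p^3$ with exact integer arithmetic.

```python
def pairs_with_p_dividing(p):
    """Pairs (a, b) in [1, p)^2 such that p divides a^2 + ab + b^2."""
    return [(a, b) for a in range(1, p) for b in range(1, p)
            if (a * a + a * b + b * b) % p == 0]


def congruence_holds(a, b, p):
    """Check (a + b)^p == a^p + b^p modulo p^3."""
    return ((a + b) ** p - a ** p - b ** p) % p ** 3 == 0


if __name__ == "__main__":
    for p in (7, 13, 19):
        pairs = pairs_with_p_dividing(p)
        print(p, len(pairs), all(congruence_holds(a, b, p) for a, b in pairs))
```

Such pairs only exist when $p \equiv 1 \pmod 3$, in agreement with the first paragraph of the solution.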

Remark A.3.1
It has been conjectured that the polynomials $\frac{(X+1)^n - X^n - 1}{(X^2+X+1)^\varepsilon}$, where $\varepsilon = v_{X^2+X+1}\left((X + 1)^n - X^n - 1\right)$
is 2 if $n \equiv 1 \pmod 3$, 1 if $n \equiv -1 \pmod 3$ and 0 if $n \equiv 0 \pmod 3$, are irreducible. These are
called the Cauchy-Mirimanoff polynomials.

Group Theory
Exercise A.3.14† . Given a group G and a normal subgroup H ⊆ G, i.e. a subgroup such that

x+H −x=H

for any $x \in G$,² we define the quotient $G/H$ of $G$ by $H$ as $G$ modulo $H$,³ i.e. we say $x \equiv y \pmod H$
if $x - y \in H$.⁴ Prove that this is indeed a group, and that $|G/H| = |G|/|H|$ for any such $G, H$.

Solution

$G/H$ is clearly closed under the operation of $G$ and has inverses and an identity. We need however
to check that the operation is well defined: if $x \equiv y \pmod H$ and $z \in G$, then $x + z \equiv y + z \pmod H$
and $z + x \equiv z + y \pmod H$. For the former, note that $(x + z) - (y + z) = x - y \in H$ since the
inverse of $y + z$ is $-z - y$, and for the latter note that $(z + x) - (z + y) = z + (x - y) - z$ is in
$H$ because $H$ is normal in $G$. The second part is obvious: any $x \in G$ is congruent to exactly $|H|$
elements modulo $H$: $x + y$ for $y \in H$. ∎

Exercise A.3.15† (Isomorphism Theorems). Prove the following first, second, and third isomorphism
theorems.

1. Let ϕ : A → B be a morphism of groups. Then, A/ ker ϕ ' im ϕ. (In particular, ker ϕ is normal
in A and | im ϕ| · | ker ϕ| = |A|.)

2. Let H be a subgroup of a group G, and N a normal subgroup of G. Then, H/H ∩ N ' HN/N .
(In particular, you need to show that this makes sense: HN is a group and H ∩ N is normal in
H.)

3. Let N ⊆ H be normal subgroups of a group G. Then, (G/N )/(H/N ) ' G/H.

² In particular, when G is abelian, any subgroup is normal.
³ This is where the notation Z/nZ comes from! In fact this shows that, in reality, we should say "modulo nZ" instead
of "modulo n".
⁴ A better formalism is to say that G/H is the set of cosets g + H for g ∈ G. In fact, we will almost always use this
definition in the solutions of exercises (since this is the only place where this will appear), but we introduced it that way
to make the analogy with Z/nZ clearer.

Solution

1. Note that ker ϕ is normal in A. Indeed, if ϕ(x) = 1, then $\varphi(yxy^{-1}) = \varphi(y)\varphi(x)\varphi(y)^{-1} = 1$
too. Second, note that every element in the image of ϕ has exactly one preimage in
A/ ker ϕ: indeed, if ϕ(x) = ϕ(y), then $xy^{-1} \in \ker\varphi$ so they are equal modulo ker ϕ. This
shows that the induced map A/ ker ϕ → im ϕ is an isomorphism (it is clearly surjective, and we
have shown that it is injective too).
2. Note that H ∩ N is normal in H: since N is normal in G, we have $h(H \cap N)h^{-1} \subseteq N$ for
any h ∈ H, but this is also contained in H when h ∈ H, so it is contained in H ∩ N (and hence
equal to it, by applying this to $h^{-1}$ as well). Note also that HN is indeed a group since, if
gm, hn ∈ HN, then mh = hℓ for some ℓ ∈ N as N is normal, so

gmhn = ghℓn ∈ HN.

Similarly, gm = kg for some k ∈ N so $(gm)^{-1} = g^{-1}k^{-1} \in HN$. Now, consider the natural
map from H to HN/N, sending h to hN. It is surjective since hnN = hN for any h ∈ H and n ∈ N,
and its kernel consists of the h such that hN = N, i.e. h ∈ N. Hence, its kernel is H ∩ N so we
get H/H ∩ N ≃ HN/N by the first isomorphism theorem.
3. Consider the surjective map G/N → G/H which sends gN to gH. It is well defined
since N ⊆ H. gN is in the kernel if gH = H, i.e. g ∈ H. Hence, the kernel consists
of hN for h ∈ H, i.e. H/N . We conclude from the first isomorphism theorem that
G/H ' (G/N )/(H/N ) as wanted.

Exercise A.3.16†. Let G be a finite group, ϕ : G → C× be a non-trivial group morphism (i.e. not
the constant function 1), where (C×, ·) is the group of non-zero complex numbers under multiplication.
Prove that $\sum_{g\in G} \varphi(g) = 0$.

Solution

Note that, for any h ∈ G, g ↦ hg is a bijection so

$$\sum_{g\in G} \varphi(g) = \sum_{g\in G} \varphi(hg) = \varphi(h)\sum_{g\in G} \varphi(g)$$

which means that $\sum_{g\in G} \varphi(g) = 0$ by picking an h such that ϕ(h) ≠ 1. ∎
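
As an illustration, the statement can be checked numerically on the cyclic group Z/nZ. This is a minimal sketch under the assumption that ϕ is the character g ↦ exp(2πikg/n), which is non-trivial exactly when n does not divide k; the function name `character_sum` is made up for this example.

```python
import cmath

def character_sum(n, k):
    """Sum of phi(g) = exp(2*pi*i*k*g/n) over all g in Z/nZ."""
    return sum(cmath.exp(2j * cmath.pi * k * g / n) for g in range(n))

print(abs(character_sum(12, 5)))  # numerically zero: phi is non-trivial
print(character_sum(12, 0))       # trivial morphism: the sum is |G| = 12
```

The second call shows why the non-triviality assumption is needed: for the constant function 1 the sum is |G|.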

Remark A.3.2
Alternatively, this can be done as follows: the image of ϕ is a subgroup of the group of |G|th roots
of unity by Lagrange, so must be the group of nth roots for some n, greater than 1 by assumption
(this is just the fact that subgroups of cyclic groups are also cyclic). Let ω = exp(2iπ/n) be a
primitive nth root of unity. Hence, we have
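$$\sum_{x\in\operatorname{im}\varphi} x = \sum_{k=0}^{n-1} \omega^k = \frac{\omega^n - 1}{\omega - 1} = 0$$

since the numerator is zero while the denominator isn’t, as n > 1. To conclude, by the first
isomorphism theorem from Exercise A.3.15†, every element of im ϕ has exactly |ker ϕ| preimages, so we have

$$\sum_{g\in G} \varphi(g) = |\ker\varphi| \sum_{x\in\operatorname{im}\varphi} x = 0.$$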
Exercise A.3.17† (Lagrange’s Theorem). Let G be a group of cardinality n (also called the order of
G). Prove that g n = e for all g ∈ G. In other words, the order of an element divides the order of the
group. More generally, prove that the order of a subgroup divides the order of the group.

Solution

See Theorem 2.5.1 and Exercise 6.3.15∗ . 

Exercise A.3.18† (5/8 Theorem). Let G be a non-commutative finite group. Prove that the probability

$$p(G) = \frac{|\{(x, y) \in G^2 \mid xy = yx\}|}{|G|^2}$$

that two elements commute is at most 5/8.

Solution

Denote by Z the center of the group, i.e. the set of elements which commute with every other
one. For a given x ∈ G, denote also by C(x) the centraliser of x, i.e. the set of y such that x and
y commute. The wanted probability is $\frac{1}{|G|^2}\sum_{x\in G} |C(x)|$. Note that the C(x) are subgroups of G (and
hence Z is too): if xy = yx and xz = zx then

xyz = yxz = yzx.

First, let’s see how big the center can be. It’s a subgroup of G, so its cardinality divides |G| by
Lagrange’s theorem (Exercise A.3.17†). It can’t be |G| since G is non-abelian. It can’t be |G|/2 since
G/Z is then isomorphic to Z/2Z so is generated by one element aZ, and hence G is generated by Z
and the additional element a, which means that it’s commutative:

$$a^m x a^n y = a^{m+n} xy = a^n y a^m x$$

for x, y ∈ Z. For the same reason, it can’t be |G|/3 since G/Z still has prime order so must be
generated by one element by Lagrange’s theorem. Thus, $\frac{|Z|}{|G|} \le \frac{1}{4}$.

Now, if x ∉ Z, C(x) is a subgroup of G distinct from it so has cardinality at most |G|/2. To
conclude,

$$\begin{aligned}
\frac{|\{(x, y) \in G^2 \mid xy = yx\}|}{|G|^2} &= \sum_{x\in G} \frac{|C(x)|}{|G|^2} \\
&= \sum_{x\in Z} \frac{|G|}{|G|^2} + \sum_{x\notin Z} \frac{|C(x)|}{|G|^2} \\
&\le \frac{|Z|}{|G|} + (|G| - |Z|)\cdot\frac{|G|/2}{|G|^2} \\
&= \frac{|Z|}{|G|} + \frac{1}{2} - \frac{|Z|}{2|G|} \\
&= \frac{|Z|}{2|G|} + \frac{1}{2} \\
&\le \frac{1}{8} + \frac{1}{2} \\
&= \frac{5}{8}.
\end{aligned}$$


Remark A.3.3
One can check that the bound 5/8 is achieved by the quaternion group $Q_8$ consisting of the
elements $e, b, b^2, b^3, a, ab, ab^2, ab^3$ under the presentation $a^4 = b^4 = e$, $a^2 = b^2$, and $ba = ab^3$.
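
The claim of the remark can be verified by brute force. The sketch below assumes a concrete model of Q8 as the quaternions ±1, ±i, ±j, ±k stored as integer 4-tuples; the names `qmul` and `commuting_probability` are chosen for this example only.

```python
from fractions import Fraction
from itertools import product

def qmul(p, q):
    """Hamilton product of two quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

# The eight elements +-1, +-i, +-j, +-k of the quaternion group.
units = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
Q8 = units + [tuple(-c for c in u) for u in units]

def commuting_probability(elements, mul):
    """Proportion of ordered pairs (x, y) with xy = yx."""
    pairs = sum(1 for x, y in product(elements, repeat=2)
                if mul(x, y) == mul(y, x))
    return Fraction(pairs, len(elements) ** 2)

print(commuting_probability(Q8, qmul))  # 5/8
```

Here 40 of the 64 ordered pairs commute: the centre {±1} contributes 2 · 8 pairs and each of the 6 remaining elements has a centraliser of size 4.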

Exercise A.3.19† (Fundamental Theorem of Finitely Generated Abelian Groups). Let G be an
abelian group which is finitely generated, i.e., if we write its operation as +, there are g1, …, gk ∈ G
such that any g ∈ G can be represented as n1 g1 + … + nk gk for integers ni ∈ Z. Prove that there
is a unique integer n ≥ 0 (called the rank of the group) and a unique sequence of integers
d1 | … | dm, all greater than 1, such that
$$(G, +) \simeq (\mathbb{Z}^n \times \mathbb{Z}/d_1\mathbb{Z} \times \dots \times \mathbb{Z}/d_m\mathbb{Z}, +).$$

Solution

This problem has two parts: proving that a finite abelian group is isomorphic to a product of
cyclic groups in the wanted way, and proving that the torsion T of a finitely generated abelian
group, i.e. the set of elements with finite order (which is a subgroup here since G is abelian) is
finite and that G ' Zn × T for some n.
For the first part, pick an element h ∈ G of maximal order m. We claim that the order of any
element g ∈ G divides m. (We know that this must be true by the statement: this m is the largest
of the d_i. Note however that this is false for non-abelian groups.) Indeed, suppose that x, y ∈ G have
orders a, b. We will construct an element of order lcm(a, b); applied to h and g, this contradicts the
maximality of m unless the order of g divides m. First suppose that a and b are coprime.
Then, a(x + y) = ay has order b since gcd(a, b) = 1, and similarly b(x + y) = bx has order a.
Thus, the order of x + y is divisible by a and b, and hence by ab. Conversely, it clearly divides
ab so must be exactly ab.
Now, if a and b are not necessarily coprime, let $a' = \prod_{v_p(a) \ge v_p(b)} p^{v_p(a)}$ and $b' = \prod_{v_p(a) < v_p(b)} p^{v_p(b)}$
so that a′, b′ are coprime and have product lcm(a, b). The elements (a/a′)x and (b/b′)y have
respective orders a′ and b′ so we are done by the previous step since a′ and b′ are coprime.
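
The splitting of lcm(a, b) into two coprime factors used in this step can be made concrete. The following sketch uses trial division; the names `valuation` and `split_lcm` are hypothetical helpers and not notation from the text.

```python
from math import gcd

def valuation(n, p):
    """Exponent of the prime p in the positive integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def split_lcm(a, b):
    """Return coprime (a1, b1) with a1 | a, b1 | b and a1 * b1 == lcm(a, b).

    A prime p goes to a1 with exponent v_p(a) when v_p(a) >= v_p(b),
    and to b1 with exponent v_p(b) otherwise."""
    a1, b1 = 1, 1
    n, p = a * b, 2
    while n > 1:
        if n % p == 0:
            va, vb = valuation(a, p), valuation(b, p)
            if va >= vb:
                a1 *= p ** va
            else:
                b1 *= p ** vb
            while n % p == 0:
                n //= p
        p += 1
    return a1, b1

a1, b1 = split_lcm(12, 18)
print(a1, b1, gcd(a1, b1), a1 * b1)  # 4 9 1 36
```

For a = 12 and b = 18 this gives a′ = 4 and b′ = 9, so (12/4)x has order 4 and (18/9)y has order 9, and their sum has order 36 = lcm(12, 18).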
Now, let H = ⟨h⟩ be the subgroup generated by h, i.e. {0, h, …, (m − 1)h}. This is isomorphic
to Z/mZ. We claim that
G ≃ H × G/H.
Continuing in this fashion with G/H (which has a strictly smaller cardinality unless G is already
trivial) yields the wanted decomposition, since we have shown that the d_i are divisible by the
previous one (m is divisible by the order of any element). To prove that G ≃ H × G/H, we will
find a morphism ϕ from G to H which is the identity on H. Indeed, g ↦ (ϕ(g), g (mod H)) will
then be the wanted isomorphism between G and H × G/H: if ϕ(g) = ϕ(g′) and g ≡ g′ (mod H),
then 0 = ϕ(g − g′) = g − g′ since ϕ is the identity on H, so we must have g = g′. Thus, our morphism
is injective and hence bijective since |G| = |H| · |G/H|.

We proceed by induction on the minimal number of elements needed to generate G from H.
When H = G it is trivial. Now, suppose ϕ is a morphism from G′ ⊆ G to H (where H ⊆ G′) and let g ∈ G \ G′.
We will extend ϕ to ⟨G′, g⟩, the subgroup generated by G′ and g, as desired. Let n be the
order of g in G/G′, i.e. the smallest positive integer such that ng ∈ G′. Then, kg ∈ G′ ⟺ n | k. Thus,
ϕ(g′ + kg) := ϕ(g′) + kϕ(g) is well-defined as long as ϕ(g) is such that

ϕ(ng) = nϕ(g).

Now, note that n divides the order of g which divides m = |H|. Hence, it is always possible to
find such a ϕ(g): if ϕ(ng) = kh, since mg = 0, we have (mk/n)h = 0, i.e. n | k which means that
ϕ(g) = (k/n)h works. Note also that this reasoning shows that the decomposition is unique too.

Now, we prove that torsion-free finitely generated abelian groups are isomorphic to Zn for a
unique n. But first, we show how the problem follows from these two special cases. Note that
G/T is torsion-free: if x (mod T ) has finite order, then nx ∈ T for some n so x has finite order,
i.e. x ∈ T . Pick a basis α1 (mod T ), . . . , αn (mod T ). Now, we claim that

G ' T × (α1 Z + . . . + αn Z) ' T × Zn

as wanted. This follows from the simple isomorphism (x, y) 7→ x + y. This is surjective by
definition, since α1 Z + . . . + αn Z is a system of representatives of G/T . For the injectivity,
note that, if x + y = x0 + y 0 , then y − y 0 = x − x0 ∈ T so y = y 0 and thus x = x0 since
α1 Z + . . . + αn Z ' G/T has trivial torsion. There is one last thing we need to show however:
that T is finite. Pick an isomorphism $\varphi : G \to T \times \mathbb{Z}^n$. Then, the first coordinates of the image
of a generating family of elements of G generate T . Since they all have finite order, they generate
a finite number of elements as wanted.

Hence, we only need to prove that if G has trivial torsion, it is isomorphic to Zn for some n.
Note that this n is unique: if we had an isomorphism from Zm to Zn , we would have one from
(Z/2Z)m → (Z/2Z)n by reducing it modulo 2, and this forces m = n. Pick a generating set of
minimal cardinality α1, …, αn. We wish to prove that it is linearly independent. Suppose that
it is not the case, and let N ≠ 0 be the minimum value of the sum of the absolute values of the
coefficients of a non-trivial linear combination which is zero. In fact, we shall also pick the generating set
to minimise N. The contradiction will then come from a construction of another generating set
with a zero linear combination with smaller coefficients.

Suppose that k1 α1 + … + kn αn = 0 and N = |k1| + … + |kn|. At least two of the ki are non-zero:
if, say, k1 α1 = 0 with k1 ≠ 0, then α1 = 0 since G is torsion-free, contradicting the minimality of the
generating set. Suppose without loss of generality
that 0 < |k1| ≤ |k2|. Say we replace the family α1, …, αn by α1 ± α2, α2, …, αn. Then,
k1 α1 + … + kn αn = 0 becomes

k1(α1 ± α2) + (k2 ∓ k1)α2 + k3 α3 + … + kn αn = 0.

By picking the sign ± appropriately, we ensure that |k2 ∓ k1| < |k2|, thus leading to a smaller
value of N, which is a contradiction. We are done. ∎

Exercise A.3.20† (Burnside’s Lemma). Let G be a finite group, S a finite set, and · a group action
of G on S, meaning a map · : G × S → S such that e · s = s and (gh) · s = g · (h · s) for any g, h ∈ G
and s ∈ S. Given a g ∈ G, denote by Fix(g) the set of elements of S fixed by g. Prove that

$$|S/G| = \frac{1}{|G|}\sum_{g\in G} |\operatorname{Fix}(g)|,$$

where |S/G| denotes the number of (disjoint) orbits $O_i = Gs_i$. Deduce the number of necklaces that
have p beads which can be of a colours, where p is a prime number and two necklaces are considered
to be the same up to rotation.

Solution
Consider the sum $\sum_{g\in G} |\operatorname{Fix}(g)|$. This is equal to the number of pairs (g, s) such that gs = s.
Hence, this is also equal to $\sum_{s\in S} |\operatorname{Stab}(s)|$, where Stab(s) denotes the set of elements of G fixing s.
Now consider the orbit Gs of s. We claim that |Gs| = |G/ Stab(s)| = |G|/| Stab(s)|. Indeed,
the map from the left-cosets G/ Stab(s) to Gs sending g Stab(s) to gs is clearly a bijection: if
gs = hs then $h^{-1}g \in \operatorname{Stab}(s)$ so g Stab(s) = h Stab(s). Hence,

$$\begin{aligned}
\frac{1}{|G|}\sum_{g\in G} |\operatorname{Fix}(g)| &= \sum_{s\in S} \frac{1}{|Gs|} \\
&= \sum_{O\in S/G}\ \sum_{x\in O} \frac{1}{|O|} \\
&= \sum_{O\in S/G} 1 \\
&= |S/G|
\end{aligned}$$

as desired.

For the second part, consider the cyclic group Z/pZ acting on the set of words of length p (necklaces) in
an alphabet (the set of colours) of size a. Why did we choose Z/pZ? Because we consider the
necklaces up to rotation. The action of g ∈ Z/pZ is of course a rotation of g beads, say to
the right. Then, there is one element fixing all $a^p$ words: 0, and all the other ones only fix words
with all letters equal, i.e. the a monochromatic necklaces. Indeed, if 0 ≠ g ∈ Z/pZ fixes a necklace,
then so does every multiple of g, i.e. every element of Z/pZ since g generates Z/pZ, which means that
the necklace is invariant under all rotations, i.e. monochromatic. Hence, the number of necklaces is

$$\frac{a^p + (p - 1)a}{p}$$

by Burnside’s lemma. Notice that this also proves Fermat’s little theorem. ∎
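
The necklace count can be compared with a direct enumeration for small parameters. This is a brute-force sketch (the function name `count_necklaces` is an assumption of this example): each word is replaced by its lexicographically smallest rotation, so that each orbit is counted once.

```python
from itertools import product

def count_necklaces(p, a):
    """Count words of length p over a colours up to rotation, by brute force."""
    seen = set()
    for word in product(range(a), repeat=p):
        # canonical representative of the orbit: the smallest rotation
        seen.add(min(word[i:] + word[:i] for i in range(p)))
    return len(seen)

for p, a in [(3, 2), (5, 3), (7, 2)]:
    print(p, a, count_necklaces(p, a), (a ** p + (p - 1) * a) // p)
```

For p = 5 and a = 3, both numbers are (243 + 12)/5 = 51.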

Miscellaneous
Exercise A.3.21† (China TST 2009). Prove that there exists a real number c > 0 such that, for any
prime number p, there are at most $cp^{2/3}$ positive integers n satisfying n! ≡ −1 (mod p).

Solution

We shall prove that any set S ⊆ {0, 1, …, p − 1} such that a! ≡ b! ≢ 0 (mod p) for all a, b ∈ S has
cardinality at most $2p^{2/3}$. Consider the following polynomial

$$f_m = (X + 1) \cdot \ldots \cdot (X + m) - 1 \in \mathbb{F}_p[X].$$

Since $\mathbb{F}_p$ is a field, $f_m$ has at most m roots in $\mathbb{F}_p$. Thus, there are at most m elements n ∈ S such that
m + n ∈ S, since n! ≡ (m + n)! ≢ 0 is equivalent to $f_m(n) = 0$.

Let k be an integer which we will specify later on. Let N be the set of pairs of elements of S at
a distance less than k, i.e.

N = {{a, b} ⊆ S | a ≠ b, |a − b| < k}.

By our previous result,

$$|N| \le 1 + 2 + \ldots + (k - 1) < \frac{k^2}{2}.$$

Now, let M = {a | ∃b > a : {a, b} ∈ N}, so that |M| ≤ |N|. Consider S \ M. By definition, for any
distinct a, b ∈ S \ M, we have |a − b| ≥ k. Since the elements of S are between 0 and p − 1, by the
pigeonhole principle, we have $|S \setminus M| \le \frac{p}{k} + 1$. To conclude,

$$|S| \le |S \setminus M| + |M| \le |S \setminus M| + |N| \le \frac{p}{k} + \frac{k^2}{2} + 1.$$

If we now pick k to be $\sqrt[3]{p}$ rounded to an integer, we get $|S| \le 2p^{2/3}$ as wanted. ∎
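
To get a feeling for the sizes involved, one can list the solutions of n! ≡ −1 (mod p) for small primes. The sketch below is only an experiment, with a hypothetical function name; note that n ≥ p never works since then n! ≡ 0, and that n = p − 1 always works by Wilson's theorem.

```python
def factorial_minus_one_solutions(p):
    """Return all n in [1, p-1] with n! congruent to -1 modulo the prime p."""
    solutions, fact = [], 1
    for n in range(1, p):
        fact = fact * n % p
        if fact == p - 1:
            solutions.append(n)
    return solutions

for p in (7, 101, 1009):
    sols = factorial_minus_one_solutions(p)
    print(p, len(sols), round(2 * p ** (2 / 3), 1))
```

For p = 7 the factorials 1!, …, 6! are 1, 2, 6, 3, 1, 6 modulo 7, so the solutions are n = 3 and n = 6.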

Exercise A.3.22† (Mason-Stothers Theorem, ABC conjecture for polynomials). Suppose that A, B, C ∈
C[X] are coprime polynomials, not all constant, such that A + B = C. Prove that

1 + max(deg A, deg B, deg C) ≤ deg(rad ABC)

where rad ABC is the greatest squarefree divisor of ABC (in other words, deg(rad ABC) is the number
of distinct complex roots of ABC). Deduce that the Fermat equation $f^n + g^n = h^n$ for f, g, h ∈ C[X]
does not have non-trivial solutions for n ≥ 3.

Solution

Consider the determinant $D = \det\begin{pmatrix} A & B \\ A' & B' \end{pmatrix} = AB' - BA'$. Note that this is the same up to sign
when we replace A and B by two polynomials out of A, B, C: this is because the determinant
is invariant up to sign under column operations (adding certain columns to other columns and
exchanging columns, see Proposition C.3.4). Of course, it can also be proven by computing it
explicitly: (A + B)B′ − B(A + B)′ = AB′ − BA′ (and the rest follows by symmetry). Note also that
D ≠ 0: otherwise A | A′B would force A | A′ since A and B are coprime, i.e. A constant, and similarly
B and C would be constant. Thus, r
is a double root of ABC only if it is a root of D: indeed such a root must be a double root of one
of A, B, C since they are coprime, say A. It is then a common root of A and A′ so of D too.
However, a lot more holds: if v is the multiplicity of r in ABC (thus in A in our case), r is a
root of multiplicity at least v − 1 of D since it’s a root of multiplicity v − 1 of A′. Thus,

ABC | rad(ABC)D,

which gives the wanted bound since deg D ≤ deg A + deg B − 1 and the same with B, C and C, A
by symmetry.

Suppose that $A = f^n$, $B = g^n$, $C = h^n$ are non-zero and satisfy A + B = C. Then,

$$\begin{aligned}
1 + n\max(\deg f, \deg g, \deg h) &= 1 + \max(\deg A, \deg B, \deg C) \\
&\le \deg(\operatorname{rad} ABC) \\
&= \deg(\operatorname{rad} fgh) \le \deg f + \deg g + \deg h
\end{aligned}$$

so n < 3 as wanted. ∎
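
To see that the bound cannot be improved and that the exponent n = 2 really is an exception, here is a worked example (the particular polynomials are chosen for illustration and do not come from the text): the classical parametrisation of Pythagorean triples gives a non-trivial solution for n = 2 which reaches equality in the Mason-Stothers inequality.

```latex
\[
  A = (1 - X^2)^2, \qquad B = (2X)^2, \qquad C = (1 + X^2)^2,
\]
\[
  A + B = 1 - 2X^2 + X^4 + 4X^2 = 1 + 2X^2 + X^4 = C .
\]
% A, B, C are pairwise coprime and all have degree at most 4, with
\[
  \operatorname{rad}(ABC) = X (X - 1)(X + 1)(X - i)(X + i),
  \qquad
  1 + \max(\deg A, \deg B, \deg C) = 5 = \deg(\operatorname{rad} ABC).
\]
```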

Exercise A.3.23† . Find all polynomials f ∈ C[X] which send the unit circle to itself.

Solution

As in Exercise A.3.9†, $\overline{f(z)} = \overline{f}(z^{-1})$ for any z on the unit circle, where $\overline{f}$ denotes the polynomial
obtained by conjugating the coefficients of f. Thus, $1 = |f(z)|^2 = f(z)\overline{f}(z^{-1})$.
Hence, $f(z)\left(z^n \overline{f}(z^{-1})\right) = z^n$ for z on the unit circle, where n = deg f. Note that $X^n \overline{f}(1/X)$ is
indeed a polynomial: if $f = \sum_{i=0}^n a_i X^i$, then $X^n \overline{f}(1/X) = \sum_{i=0}^n \overline{a_{n-i}} X^i$.

Thus, the polynomials $f(X)\left(X^n \overline{f}(1/X)\right)$ and $X^n$ agree at infinitely many points, which
means that they are equal. In particular, $f \mid X^n$, which implies that $f = \varepsilon X^k$ for some ε and
some k. It is clear that ε must be on the unit circle, and conversely any such ε works (in other
words, the polynomials which send the unit circle to itself contract it and then rotate it). ∎

Exercise A.3.26† (Gauss-Lucas Theorem). Let f ∈ C[X] be a polynomial with roots α1, …, αn
(counted with multiplicity). Prove that
$$\frac{f'}{f} = \sum_{k=1}^{n} \frac{1}{X - \alpha_k}.$$

Deduce the Gauss-Lucas theorem: if f ∈ C[X] is non-constant, the roots of f′ are in the convex hull of
the roots of f, that is, any root β of f′ is a linear combination $\sum_i \lambda_i \alpha_i$ with $\sum_i \lambda_i = 1$ and non-negative
$\lambda_i \in \mathbb{R}$.

Solution

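The identity follows from Exercise A.1.8∗. Let α be a root of f′, without loss of generality such
that f(α) ≠ 0. We have
$$0 = \sum_{k=1}^{n} \frac{1}{\alpha - \alpha_k} = \sum_{k=1}^{n} \frac{\overline{\alpha - \alpha_k}}{|\alpha - \alpha_k|^2}$$
so that
$$\overline{\alpha} \sum_{k=1}^{n} \frac{1}{|\alpha - \alpha_k|^2} = \sum_{k=1}^{n} \frac{\overline{\alpha_k}}{|\alpha - \alpha_k|^2}.$$
If we now conjugate this equality, we get
$$\alpha = \frac{\sum_{k=1}^{n} \frac{\alpha_k}{|\alpha - \alpha_k|^2}}{\sum_{k=1}^{n} \frac{1}{|\alpha - \alpha_k|^2}}$$

which has the desired expression. ∎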
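
The explicit convex combination obtained in this solution can be tested numerically. The sketch below assumes a cubic with three distinct roots (so that f′ is a quadratic whose roots are given by the quadratic formula and are not roots of f); the function names are illustrative only.

```python
import cmath

def critical_points_of_cubic(r1, r2, r3):
    """Roots of f' for f = (X - r1)(X - r2)(X - r3), via the quadratic formula."""
    e1 = r1 + r2 + r3
    e2 = r1 * r2 + r1 * r3 + r2 * r3
    # f = X^3 - e1 X^2 + e2 X - e3, hence f' = 3X^2 - 2 e1 X + e2
    disc = cmath.sqrt(4 * e1 * e1 - 12 * e2)
    return [(2 * e1 + disc) / 6, (2 * e1 - disc) / 6]

def convex_weights(beta, roots):
    """Weights proportional to 1/|beta - alpha_k|^2, normalised to sum to 1."""
    raw = [1 / abs(beta - r) ** 2 for r in roots]
    total = sum(raw)
    return [w / total for w in raw]

roots = [0, 4, 3j]
for beta in critical_points_of_cubic(*roots):
    lam = convex_weights(beta, roots)
    combo = sum(l * r for l, r in zip(lam, roots))
    print(beta, lam, abs(combo - beta))  # the last number is numerically zero
```

The weights are positive and sum to 1 by construction, so the only thing being checked is that the weighted average of the roots of f really equals the critical point.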

Remark A.3.4
You may notice that the first identity is the logarithmic derivative (log f )0 . This can be used to
produce an analytic proof of this identity: it holds when X > αk for all k (in particular they are
all real), but is also a polynomial identity in X and the αk , so it must hold polynomially. More
specifically, if we fix the αi ∈ R, it holds for sufficiently large X so it must hold for all X. Thus,
it holds for all αi , X ∈ R which means that it always holds by Exercise A.1.7∗ .

Exercise A.3.27† (Sturm’s Theorem). Given a squarefree polynomial f ∈ R[X], define the sequence
$f_0 = f$, $f_1 = f'$ and $f_{n+2}$ is minus the remainder of the Euclidean division of $f_n$ by $f_{n+1}$. Define also
V(ξ) as the number of sign changes in the sequence $f_0(\xi), f_1(\xi), \ldots$, ignoring zeros. Prove that the
number of distinct real roots of f in the interval ]a, b] is V(a) − V(b).⁵

⁵ If we choose a = −∞, b = +∞, this gives an algorithm to compute the number of real roots of f, by looking at the
signs of the leading coefficients of $f_0, f_1, \ldots$.

Solution

When x increases from a to b, it may pass through a zero of some $f_k$ (otherwise, by the
intermediate value theorem, V(a) = V(b) and there is clearly no root in the interval as claimed). We
shall prove that this leaves V(x) invariant if k ≥ 1, and decreases it by 1 precisely when k = 0,
i.e. x is a root of f. Before doing that, note that the important part of the definition of $(f_n)_{n\ge 0}$
is that $f_{n+1} \equiv -f_{n-1} \pmod{f_n}$ for all n ≥ 1. In particular, if $f_n(x)$ and $f_{n+1}(x)$ are zero, then so is
$f_{n-1}(x)$, which implies, by induction, that x is a root of every $f_i$. This is impossible since $f_0 = f$
and $f_1 = f'$ have no common root by assumption.

First, suppose that fi (ξ) = 0 for some ξ and i ≥ 1. Then, since fi+1 ≡ −fi−1 (mod fi ), fi+1 (x)
and fi−1 (x) have opposite signs around ξ (and are non-zero by our previous observation). This
means that, before ξ, we had one sign change in (fi−1 (x), fi (x), fi+1 (x)) since this has the form
(±1, ε, ∓1) for ε ∈ {−1, 1}. After ξ and at ξ, we still have one sign change for the same reason.
Hence, V (x) stays invariant when x passes through a root of some fi with i ≥ 1.

Now, suppose that f(ξ) = 0. Then, around ξ, $f(\xi + \varepsilon) = \varepsilon f'(\xi) + O(\varepsilon^2)$ which means that the
sign of f(x) flips before and after ξ, while the sign of f′ does not change since ξ is a simple root.
More precisely, before ξ, f(x) and f′(x) had opposite signs, while they have the same sign after
ξ. At ξ, we do not count a sign change since f(ξ) = 0 so V(ξ) = V(ξ + ε) for sufficiently small
ε > 0, which finishes the proof. ∎
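
Sturm's theorem translates directly into an algorithm. The following is a small sketch with exact rational arithmetic, assuming that polynomials are given as lists of coefficients from the highest degree down and that f is squarefree of degree at least 1; all function names are choices made for this example.

```python
from fractions import Fraction

def strip(p):
    """Drop leading zero coefficients (coefficients are highest degree first)."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def poly_rem(a, b):
    """Remainder of the Euclidean division of a by b (b has non-zero leading term)."""
    a = list(a)
    while len(a) >= len(b):
        q = a[0] / b[0]
        padded = b + [Fraction(0)] * (len(a) - len(b))
        a = [x - q * y for x, y in zip(a, padded)][1:]  # leading term cancels
    return strip(a)

def sturm_sequence(f):
    """The sequence f_0 = f, f_1 = f', f_{n+2} = -(f_n mod f_{n+1})."""
    f = strip([Fraction(c) for c in f])
    deg = len(f) - 1
    derivative = [c * (deg - i) for i, c in enumerate(f[:-1])]
    seq = [f, derivative]
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if not r:            # zero remainder: the sequence stops here
            return seq
        seq.append([-c for c in r])

def evaluate(p, x):
    value = Fraction(0)
    for c in p:              # Horner's rule
        value = value * x + c
    return value

def sign_changes(seq, x):
    """V(x): number of sign changes in f_0(x), f_1(x), ..., ignoring zeros."""
    values = [v for v in (evaluate(p, x) for p in seq) if v != 0]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

def count_roots(f, a, b):
    """Number of distinct real roots of the squarefree polynomial f in ]a, b]."""
    seq = sturm_sequence(f)
    return sign_changes(seq, a) - sign_changes(seq, b)

f = [1, 0, -1, 0]              # X^3 - X, with roots -1, 0, 1
print(count_roots(f, -2, 2))   # 3
```

For f = X³ − X the sequence is X³ − X, 3X² − 1, 2X/3, 1, and the half-open convention is visible on small intervals: ]0, 2] contains only the root 1, while ]−1, 1] contains the roots 0 and 1.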

Exercise A.3.28† (Ehrenfeucht’s Criterion). Let K be a characteristic zero field, let f1 , . . . , fk ∈ K[X]
be polynomials and define

f = f1 (X1 ) + . . . + fk (Xk ) ∈ K[X1 , . . . , Xk ].

If k ≥ 3, prove that f is irreducible. In addition, prove that this result still holds if k = 2 and f1 and
f2 have coprime degrees.

Solution

Let us first do the case k = 2. Suppose that f(X) + g(Y) is reducible, say equal to uv. Let
m = deg f and n = deg g. Consider $f(X^n) + g(Y^m)$, which is a polynomial of degree mn in
both X and Y. Let r and s be the homogeneous parts of $u(X^n, Y^m)$ and $v(X^n, Y^m)$, i.e. the
polynomials formed by the monomials of highest degree of $u(X^n, Y^m)$ and $v(X^n, Y^m)$. By looking
at the degrees, we must have $rs = aX^{mn} + bY^{mn}$ where a and b are the leading coefficients of f
and g respectively.

Suppose without loss of generality (by symmetry) that r has at least two monomials, i.e. u has
at least two monomials $X^{i_1}Y^{j_1}$ and $X^{i_2}Y^{j_2}$ such that

$$ni_1 + mj_1 = ni_2 + mj_2 \iff n(i_1 - i_2) = m(j_2 - j_1).$$

Since m and n are coprime, this implies $n \mid j_1 - j_2$ and $m \mid i_1 - i_2$. But then, $\deg_X u \ge m$ and
$\deg_Y u \ge n$, which implies that v (and hence s) is constant in both X and Y, i.e. constant, since
f(X) + g(Y) = uv. This is a contradiction.

Now suppose k ≥ 3 and f = uv. Let $n_i = \deg f_i$ and let $a_i$ be the leading coefficient of $f_i$. The
same argument as before shows that

$$rs = a_1 X_1^N + \ldots + a_k X_k^N,$$

where $N = n_1 \cdot \ldots \cdot n_k$ (we replace $X_i$ by $X_i^{N/n_i}$ and take homogeneous parts). Thus, we have
reduced the problem to the case of monomials. We can however reduce it even further: if we
evaluate this at (X, Y, 1, 0, …, 0), we get that $aX^N + bY^N + c$ is reducible in K[X, Y] (the
factorisation we get is non-trivial since r and s have degree < N so still degree < N when we
evaluate them), say

$$aX^N + bY^N + c = (g_M X^M + \ldots + g_0)(h_{N-M} X^{N-M} + \ldots + h_0)$$

for some polynomials $g_i, h_i$ in Y of degree < N. Now, substitute for Y a root y of $bY^N + c$ (in an
algebraic closure of K). This gives us the polynomial $aX^N$ which can only be factored as a product of
two monomials, so
$$g_0(y) = \ldots = g_{M-1}(y) = h_{N-M-1}(y) = \ldots = h_0(y) = 0.$$
But since the roots of $bY^N + c$ are distinct (there is no common root with the derivative $NbY^{N-1}$),
$g_i$ and $h_j$ for i < M and j < N − M vanish at N distinct points, which is more than their degree.
Thus, they must be zero. This leaves us with the factorisation $aX^N + bY^N + c = g_M h_{N-M} X^N$
which is clearly impossible since $X^N$ doesn’t divide the LHS. ∎

Exercise A.3.29† (IMC 2007). Let a1, …, an be integers. Suppose f : Z → Z is a function such that
$$\sum_{i=1}^{n} f(ka_i + \ell) = 0$$

for any k, ℓ ∈ Z. Prove that f is identically zero.

Solution
Consider the set I of polynomials $P = \sum_{i=0}^{m} b_i X^i \in \mathbb{Q}[X]$ such that
$$\sum_{i=0}^{m} b_i f(i + x) = 0$$

for any x ∈ Z. We claim that this set is an ideal of Q[X], meaning that it’s closed under
addition, and closed under multiplication by any polynomial in Q[X]. The first fact is clear. For
the second, note that multiplication by $X^i$ corresponds to a translation and that multiplication
by a constant is trivial, so we can deduce it from the first fact. Thus, I is closed under gcd: by
Bézout’s lemma, if P, Q ∈ I, there are u, v ∈ Q[X] such that

gcd(P, Q) = uP + vQ ∈ I.

Our goal is to show that I contains the element 1: this gives f(x) = 0 for any x ∈ Z as wanted.
The statement gives us that
$$\sum_{i=1}^{n} X^{ka_i + \ell} \in I$$

for any k, ℓ such that $ka_i + \ell \ge 0$ for all i. Hence, the problem reduces to proving that these
polynomials are coprime, i.e., that for any algebraic number α, $\sum_{i=1}^{n} \alpha^{ka_i}$ can not always be
zero. This follows from our proof of Theorem C.4.1: this is a linear recurrence, and the only
linear recurrence which is identically zero is the zero recurrence. However, $\sum_{i=1}^{n} \alpha^{ka_i}$ is clearly
not the zero recurrence since the coefficient before $\alpha^{ka_i}$ is non-zero for every i. ∎
Appendix B

Symmetric Polynomials

B.1 The Fundamental Theorem of Symmetric Polynomials


Exercise B.1.1. Let f ∈ K(X1 , . . . , Xn ) be a rational function, where K is a field. Suppose f is
symmetric, i.e. invariant under permutations of X1 , . . . , Xn . Prove that f = g/h for some symmetric
polynomials g, h ∈ K[X1 , . . . , Xn ].

Solution

Let r = f/g be a symmetric rational function, with f, g ∈ K[X1, …, Xn]. We write it as

$$r = \frac{\prod_{\sigma\in S_n} f(\sigma(X_1, \ldots, X_n))}{g \prod_{\mathrm{id}\ne\sigma\in S_n} f(\sigma(X_1, \ldots, X_n))}.$$

The numerator is a symmetric polynomial so the denominator must be too since the quotient
is. ∎

Exercise B.1.2. Prove that the decomposition of a symmetric polynomial f as g(e1 , . . . , en ) is unique.

Solution

This amounts to proving that $f(e_1, \ldots, e_n) = 0$ if and only if f = 0. We proceed by induction on
n (the case n = 1 is trivial). Suppose for the sake of a contradiction that there is a non-zero f such
that $f(e_1, \ldots, e_n) = 0$, and pick one of minimal degree. Then $X_n \nmid f$: otherwise $f = X_n g$ and
$e_n g(e_1, \ldots, e_n) = 0$ would give $g(e_1, \ldots, e_n) = 0$, contradicting the minimality of f. If we set
$X_n = 0$ in the identity $f(e_1, \ldots, e_n) = 0$, we get

$$f(e_1, \ldots, e_{n-1}, 0) = 0$$

where the $e_i$ are now the elementary symmetric polynomials in $X_1, \ldots, X_{n-1}$. By the induction
hypothesis, this means that $f(X_1, \ldots, X_{n-1}, 0) = 0$, i.e. $X_n \mid f$, a contradiction. ∎

B.2 Newton’s Formulas


Exercise B.2.1∗ . Prove Corollary B.2.1.


Solution

We have K(p1 , . . . , pn ) ⊆ K(e1 , . . . , en ) by the fundamental theorem of symmetric polynomials,


and the reverse inclusion comes from the Newton formulas by induction, as explained before. (We
need the assumption that K is a field because the LHS of the Newton’s formulas has a factor of
k which we need to divide by in the inductive step, and we need K to have characteristic zero
so that k 6= 0.) 

B.3 The Fundamental Theorem of Algebra


Exercise B.3.1∗ . Prove Proposition B.3.2.

Solution

By the quadratic formula (or completing the square), solving quadratic equations is equivalent
to finding square roots. Thus, let a + bi ∈ C be a complex number, with a, b ∈ R. We wish to
find a square root x + iy of a + bi, i.e.

$$x^2 - y^2 + 2ixy = (x + iy)^2 = a + bi.$$

This means $x^2 - y^2 = a$ and 2xy = b. In particular, $x^2$ and $-y^2$ are roots of $X^2 - aX - b^2/4$
by Vieta’s formulas. Since the constant coefficient is non-positive, the roots are real (e.g.
by the intermediate value theorem), and since the product is non-positive, one is non-negative and one
is non-positive. Label the non-negative one as $x^2$ and the non-positive one as $-y^2$, take the square roots
to find x and y and adjust the sign to have 2xy = b. ∎
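
The recipe of this solution is easy to turn into code. The sketch below is one possible implementation (the function name `complex_sqrt` is hypothetical): the two roots of X² − aX − b²/4 are (a ± √(a² + b²))/2, and the sign of y is chosen so that 2xy = b.

```python
import math

def complex_sqrt(a, b):
    """Return real numbers (x, y) such that (x + iy)^2 = a + bi."""
    modulus = math.hypot(a, b)                  # sqrt(a^2 + b^2)
    x = math.sqrt(max(0.0, (modulus + a) / 2))  # x^2: non-negative root of X^2 - aX - b^2/4
    y = math.sqrt(max(0.0, (modulus - a) / 2))  # -y^2: the other root
    if b < 0:                                   # choose the sign so that 2xy = b
        y = -y
    return x, y

print(complex_sqrt(3, 4))   # (2.0, 1.0)
```

Indeed (2 + i)² = 3 + 4i. The `max(0.0, ...)` guards are only there to protect against rounding and are an implementation detail of this sketch.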

B.4 Exercises
Newton’s Formulas
Exercise B.4.2† (Hermite’s Theorem). Prove that a function $f : \mathbb{F}_p \to \mathbb{F}_p$ is a bijection if and only
if $\sum_{a\in\mathbb{F}_p} f(a)^k$ is 0 for k = 1, …, p − 2 and −1 for k = p − 1.

Solution

If f is a bijection, then this is Exercise A.3.11†. Now suppose that this condition holds. Newton’s
formulas (note k ≠ 0 for k < p so Corollary B.2.1 still holds) tell us that there is only one possible
value of $e_k(f(0), \ldots, f(p - 1))$ for any fixed k ≤ p − 1. Hence, we must have

$$e_k(f(0), \ldots, f(p - 1)) = e_k(0, \ldots, p - 1)$$

for k ≤ p − 1, since 0, …, p − 1 satisfy the condition. Thus, the polynomials (X − f(0)) · … · (X − f(p − 1))
and (X − 0) · … · (X − (p − 1)) differ by a constant, and this constant is zero since both vanish at
X = f(0). This implies that

(X − f(0)) · … · (X − f(p − 1)) = (X − 0) · … · (X − (p − 1))

so f is a bijection as wanted. ∎
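
For a small prime, the criterion can be confirmed exhaustively. The following brute-force sketch (with a made-up function name) represents f by its list of values and compares the power-sum condition with bijectivity for all p^p functions when p = 5.

```python
from itertools import product

def satisfies_hermite(values, p):
    """Check the power-sum condition for f given by its list of values modulo p."""
    for k in range(1, p):
        power_sum = sum(pow(v, k, p) for v in values) % p
        expected = p - 1 if k == p - 1 else 0   # -1 modulo p when k = p - 1
        if power_sum != expected:
            return False
    return True

p = 5
for values in product(range(p), repeat=p):
    is_bijection = len(set(values)) == p
    assert satisfies_hermite(values, p) == is_bijection
print("criterion agrees with bijectivity for all", p ** p, "functions on F_5")
```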

Exercise B.4.3†. Suppose that α1, …, αk are such that $\alpha_1^n + \ldots + \alpha_k^n$ is an algebraic integer for all
n ≥ 1. Prove that α1, …, αk are algebraic integers.

Solution

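Newton’s formulas give us $k!\,e_i(\alpha_1, \ldots, \alpha_k) \in \overline{\mathbb{Z}}$ for any i. Thus, $k!\,\alpha \in \overline{\mathbb{Z}}$ for any $\alpha = \alpha_i$, by
Exercise 1.5.22†. In particular, since the statement is also true when we replace the $\alpha_i$ by $\alpha_i^m$
for any fixed m, we get $k!\,\alpha^m \in \overline{\mathbb{Z}}$ for any m.

Thus, the problem reduces to showing that, if $\alpha \in \overline{\mathbb{Q}}$ is algebraic and such that $N\alpha^n \in \overline{\mathbb{Z}}$ (i.e.
powers of α have bounded denominator) for some non-zero N ∈ Z and any positive integer n,
then $\alpha \in \overline{\mathbb{Z}}$. For large n, the degree of $\alpha^{2^n}$ is constant, since the degrees $[\mathbb{Q}(\alpha^{2^n}) : \mathbb{Q}]$ form
a non-increasing sequence of positive integers by
$$[\mathbb{Q}(\alpha^{2^{n-1}}) : \mathbb{Q}] = [\mathbb{Q}(\alpha^{2^{n-1}}) : \mathbb{Q}(\alpha^{2^n})]\,[\mathbb{Q}(\alpha^{2^n}) : \mathbb{Q}].$$
By replacing α by $\alpha^{2^m}$ for some large m, we may assume
that this is true for any n ≥ 0. Let $\beta_1, \ldots, \beta_\ell$ be the conjugates of α. Consider the minimal
polynomial
$$f_{2^k} = \prod_{i=1}^{\ell} \left(X - \beta_i^{2^k}\right)$$
of $\alpha^{2^k}$ and let $N_k = 1/c(f_{2^k})$ be the smallest positive integer such that $N_k f_{2^k} \in \mathbb{Z}[X]$. By
assumption $N_k$ is bounded. However, we have
$$\begin{aligned}
N_k^2 f_{2^{k+1}}(X^2) &= N_k^2 \prod_{i=1}^{\ell} \left(X^2 - \beta_i^{2^{k+1}}\right) \\
&= \left(N_k \prod_{i=1}^{\ell} \left(X - \beta_i^{2^k}\right)\right)\left(N_k \prod_{i=1}^{\ell} \left(X + \beta_i^{2^k}\right)\right) \\
&= \pm (N_k f_{2^k})(N_k f_{2^k}(-X))
\end{aligned}$$
which is primitive by Gauss’ lemma 5.1.2. Hence, $N_{k+1} = N_k^2$ so $N_0$ must be 1, as otherwise
$N_k = N_0^{2^k} \to \infty$. This means that the minimal polynomial of α has integral coefficients, i.e. α
is an algebraic integer. ∎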

Remark B.4.1
It is necessary to mention that the key claim admits a very short and intuitive proof if we allow
ourselves some ideal theory. The idea is that, if α ∈ Q, we can simply look at the p-adic valuations
to get $n v_p(\alpha) + v_p(N) \ge 0$ for all n, which gives us $v_p(\alpha) \ge 0$ by taking n large enough. Hence, α is
an integer. For arbitrary algebraic numbers, the same proof works almost verbatim: the ring of integers of
a number field K is not always a UFD but always has ideal factorisation. This means that we can this
time consider prime ideals $\mathfrak{p}$ of K to get $n v_{\mathfrak{p}}(\alpha) + v_{\mathfrak{p}}(N) \ge 0$ which implies $v_{\mathfrak{p}}(\alpha) \ge 0$ again. Finally,
since this is true for any prime ideal $\mathfrak{p}$, we get $\alpha \in \mathcal{O}_K$.

Algebraic Geometry

Exercise B.4.4† (Resultant). Let R be a commutative ring, and f, g ∈ R[X] be two polynomials of
respective degrees m and n. For any integer k ≥ 0, denote by $R_k[X]$ the subset of R[X] consisting of
polynomials of degree less than k. The resultant Res(f, g) is defined as the determinant of the linear
map

(u, v) ↦ uf + vg

from $R_n[X] \times R_m[X]$ to $R_{m+n}[X]$. Prove that, if $f = \sum_i a_i X^i$ and $g = \sum_i b_i X^i$, we have¹

$$\operatorname{Res}(f, g) = \begin{vmatrix}
a_0 & 0 & \cdots & 0 & b_0 & 0 & \cdots & 0 \\
a_1 & a_0 & \cdots & 0 & b_1 & b_0 & \cdots & 0 \\
a_2 & a_1 & \ddots & 0 & b_2 & b_1 & \ddots & 0 \\
\vdots & \vdots & \ddots & a_0 & \vdots & \vdots & \ddots & b_0 \\
a_m & a_{m-1} & \cdots & \vdots & b_n & b_{n-1} & \cdots & \vdots \\
0 & a_m & \ddots & \vdots & 0 & b_n & \ddots & \vdots \\
\vdots & \vdots & \ddots & a_{m-1} & \vdots & \vdots & \ddots & b_{n-1} \\
0 & 0 & \cdots & a_m & 0 & 0 & \cdots & b_n
\end{vmatrix},$$

and, if $f = a\prod_i (X - \alpha_i)$ and $g = b\prod_j (X - \beta_j)$, then²
$$\operatorname{Res}(f, g) = a^n b^m \prod_{i,j} (\alpha_i - \beta_j).$$

In addition, prove that Res(f, g) ∈ (f R[X] + gR[X]).³ Finally, prove that if f, g ∈ Z[X] are monic and
uf + vg = 1 for some u, v ∈ Z[X], Res(f, g) = ±1. (It is not necessarily true that (f R[X] + gR[X]) ∩ R =
Res(f, g)R for specific polynomials f, g, but we always have Res(f, g) ∈ f R[X] + gR[X] by the previous
point.)

Solution

The determinant form of the resultant simply follows from considering the matrix of the linear
function corresponding to the basis $1, X, \ldots, X^{m+n-1}$. To prove the explicit formula, consider
the case where A = a, B = b, $\alpha_i = A_i$ and $\beta_j = B_j$ are indeterminates. Working over a field
K, the resultant vanishes when $A_i = B_j$ for some i, j since the map is not surjective: it never
reaches 1. Thus, the resultant is divisible by $A_i - B_j$ for all i, j. Looking at the determinant
formula, we see that the degree of Res(f, g) in $A_1$ is n and its leading coefficient is $A^n B^m$, which
proves the wanted formula.

For the second part, write the equation uf + vg = r in the monomial basis as RV = (r, 0, …, 0) :=
$re_1$, where R is the matrix corresponding to the linear map (u, v) ↦ uf + vg. Hence, we wish to
have $rR^{-1}e_1 \in R^{m+n}$. ?? tells us that r = det R = Res(f, g) works.

Now let f and g be monic polynomials with integer coefficients of respective degrees m and n.
Suppose finally that (f Z[X] + gZ[X]) ∩ Z = Z, i.e. uf + vg = 1 for some u, v ∈ Z[X]. Write f and g
as $\prod_{i=1}^{m} (X - \alpha_i)$ and $\prod_{j=1}^{n} (X - \beta_j)$. We have $u(\beta_j)f(\beta_j) = 1$ for each j, so
$$\prod_{j=1}^{n} f(\beta_j) = \pm\operatorname{Res}(f, g)$$

divides 1 as desired. ∎
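
The two descriptions of the resultant can be compared on small examples. The sketch below makes one assumption that should be kept in mind: it builds the usual Sylvester matrix with rows of coefficients written from the highest degree down, which may differ from the matrix displayed in the exercise by a sign depending on the ordering of rows and columns. The function names are illustrative.

```python
from fractions import Fraction

def sylvester_matrix(f, g):
    """Sylvester matrix of f and g, given as coefficient lists (highest degree first)."""
    m, n = len(f) - 1, len(g) - 1
    rows = [[0] * i + list(f) + [0] * (n - 1 - i) for i in range(n)]   # n shifted copies of f
    rows += [[0] * i + list(g) + [0] * (m - 1 - i) for i in range(m)]  # m shifted copies of g
    return [[Fraction(c) for c in row] for row in rows]

def determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    n, det = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def resultant(f, g):
    return determinant(sylvester_matrix(f, g))

# f = (X - 1)(X - 2) and g = X - 3: the product of the (alpha_i - beta_j) is (1 - 3)(2 - 3) = 2
print(resultant([1, -3, 2], [1, -3]))  # 2
```

With non-monic inputs the leading coefficients appear as in the product formula: for f = 2(X − 1)(X + 1) and g = 3(X − 2), the value is 2¹ · 3² · (1 − 2)(−1 − 2) = 54.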

Exercise B.4.6† (Hilbert’s Nullstellensatz). Let K be an algebraically closed field. Suppose that
f1, …, fm ∈ K[X1, …, Xn] have no common zeros in $K^n$. Prove that there exist polynomials g1, …, gm
such that
f1 g1 + … + fm gm = 1.

¹ This is an (m + n) × (m + n) matrix, with n times the element $a_0$ and m times the element $b_0$.
² In particular, the discriminant of f is $\frac{(-1)^{n(n-1)/2}}{a} \cdot \operatorname{Res}(f, f')$.
³ In other words, the resultant provides an explicit value of a possible constant in Bézout’s lemma for arbitrary rings
(such as Z).

Deduce that, more generally, if f is a polynomial which is zero at common roots of polynomials
f1 , . . . , fm (we do not assume anymore that they have no common roots), then there is an integer k
and polynomials g1 , . . . , gm such that

f k = f1 g1 + . . . + fm gm .

Solution

We proceed by induction on the number n of variables. When n = 1 this is just Bézout’s lemma.
Now, if n ≥ 1, we will eliminate one variable with the resultant. Consider the polynomial

g = ResXn (fm , U1 f1 + . . . + Um−1 fm−1 ) ∈ K[U1 , . . . , Um−1 ][X1 , . . . , Xn−1 ],

where U1 , . . . , Um−1 are formal variables. By Exercise B.4.4† , (x1 , . . . , xn−1 ) is a root of g if and
only if fm and U1 f1 + . . . + Um−1 fm−1 have a common root xn at (x1 , . . . , xn−1 ), i.e. (x1 , . . . , xn )
is a common root of f1 , . . . , fm , or if the leading coefficient in Xn of fm and U1 f1 +. . .+Um−1 fm−1
vanish at (x1 , . . . , xn−1 ), i.e. the leading coefficient in Xn of f1 , . . . , fm vanish at (x1 , . . . , xn−1 )
(we say (x1 , . . . , xn−1 ) is a common root at infinity). We wish to rule out the second case. This
is not very hard: perform the change of coordinates Xi → Xi + ci Xn for i = 1, . . . , n − 1 and
some ci to get constant leading coefficients in Xn (thus sharing no common root).

Hence, g has no root by assumption since f1 , . . . , fm have no common root. However, a root of
g is simply a common root of its coefficients gi when g is seen as a polynomial in U1 , . . . , Um−1 .
This implies that a linear combination of the gi is 1, by the induction hypothesis. Finally, notice
that
\[ g = \operatorname{Res}_{X_n}(f_m, U_1 f_1 + \ldots + U_{m-1} f_{m-1}) = u f_m + v(U_1 f_1 + \ldots + U_{m-1} f_{m-1}) \]
for some u, v ∈ K[X1 , . . . , Xn ][U1 , . . . , Um−1 ], by Exercise B.4.4† . Hence, the coefficients gi of g
are linear combinations of the fi (with coefficients in K[X1 , . . . , Xn ]). We conclude that a linear
combination of the fi is 1 as wanted.

For the second part, suppose without loss of generality that f ≠ 0. Use the first part on
f1 , . . . , fm , 1−Xn+1 f which have no common root by assumption (this is known as Rabinowitsch’s
trick). Thus, there are g1 , . . . , gm , g ∈ K[X1 , . . . , Xn+1 ] such that

g1 f1 + . . . + gm fm + g(1 − Xn+1 f ) = 1.

Now, evaluate this at Xn+1 = 1/f and multiply by a large enough power of f to get the wanted
equality. 

Exercise B.4.7† (Weak Bézout’s Theorem). Prove that two coprime polynomials f, g ∈ K[X, Y ] of
respective degrees m and n have at most mn common roots in K. (Bézout’s theorem states that they
have exactly mn common roots counted with multiplicity, possibly at infinity.4 )

Solution

We can assume without loss of generality that K has as many elements as we want by iteratively adding new elements to K using Exercise 4.2.1∗.

We shall proceed as in Exercise B.4.6† . Consider the resultant h = ResY (f, g). This is a
polynomial of degree at most mn by its matrix expression of Exercise B.4.4† . By the same
exercise, if (x, y) is a common root of f, g, then x is a root of h. Thus, we would be done if
there was at most one possible value of y for each x, since h has degree at most mn and thus

4 This requires some care: we need to define the multiplicity of common roots as well as what infinity means. See any introductory text on algebraic geometry, e.g. Shafarevich [shafarevich]. See also the appendix on projective geometry of Silverman-Tate [26].

has at most mn roots. Note that we already get that there are finitely many common roots
(although that’s already a consequence of Bézout’s lemma). Here is how we can achieve that: do
a change of coordinates X → X + cY for some c chosen so that each x appears at most once as a common root (x, y) of f and g: this is possible because the common roots in this new system of coordinates are (α + cβ, β) and there are finitely many c for which two distinct common roots (α, β) and (α′, β′) get the same first coordinate, since
\[ \alpha + c\beta = \alpha' + c\beta' \iff c = \frac{\alpha - \alpha'}{\beta' - \beta}. \]


Exercise B.4.8† . Prove that n + 1 polynomials f1 , . . . , fn+1 ∈ K[X1 , . . . , Xn ] in n variables are


algebraically dependent, meaning that there is some non-zero polynomial f ∈ K[X1 , . . . , Xn+1 ] such
that
f (f1 , . . . , fn+1 ) = 0.

Solution

We present two solutions: one with linear algebra and one with resultants.

For the first solution, consider the linear system of equations in $(N + 1)^{n+1}$ variables
\[ \sum_{i_1, \ldots, i_{n+1} \le N} a_{i_1, \ldots, i_{n+1}} f_1^{i_1} \cdot \ldots \cdot f_{n+1}^{i_{n+1}} = 0. \tag{$*$} \]

We wish to find a non-trivial solution to this system. Let us count the number of equations we
have. Set M = maxi (deg fi ). Then, the LHS of (∗) is a polynomial of degree (n + 1)M N , when
we consider the $a_{i_1, \ldots, i_{n+1}}$ as formal variables. Hence, we have $(N + 1)^{n+1}$ unknowns and at most
\[ \sum_{k=0}^{n} ((n + 1)MN)^k = \frac{((n + 1)MN)^{n+1} - 1}{(n + 1)MN - 1} \]
equations, one for each coefficient. For large N, $(N + 1)^{n+1} > \frac{((n + 1)MN)^{n+1} - 1}{(n + 1)MN - 1}$, which means that there is a non-trivial solution as wanted (the kernel is non-trivial by e.g. the rank-nullity theorem ??, or Proposition C.1.2).

To make the idea of the second solution clearer, we treat the case n = 1 first. If f, g ∈ K[X]
are polynomials, the resultant h = ResX (f − S, g − T ) is a non-zero polynomial in S, T with
coefficients in K. Indeed, it is non-zero since when S − f and T − g are coprime it takes a
non-zero value (we can choose T = 0 and S ∈ K to be a large constant for instance). However,
when S = f and T = g, the polynomials f − S and g − T are not coprime anymore so h(f, g) = 0
as wanted.
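
To see the case n = 1 on a concrete pair, one can take f = X² and g = X³ (a choice made here purely for illustration) and evaluate h = Res_X(f − S, g − T) at numerical values of S and T. Since both polynomials are monic in X, evaluating S and T before or after taking the resultant gives the same value. The sketch below assumes the usual Sylvester-matrix determinant for the resultant; the function names are hypothetical.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not matrix:
        return 1
    total = 0
    for j, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def sylvester(f, g):
    """Sylvester matrix of f and g (coefficient lists, leading coefficient first)."""
    m, n = len(f) - 1, len(g) - 1
    rows = [[0] * i + f + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + g + [0] * (m - 1 - i) for i in range(m)]
    return rows


def h(s, t):
    """Value at (S, T) = (s, t) of Res_X(X^2 - S, X^3 - T)."""
    return det(sylvester([1, 0, -s], [1, 0, 0, -t]))


# h vanishes at (S, T) = (f(x), g(x)) = (x^2, x^3) but is not identically zero.
print([h(x ** 2, x ** 3) for x in range(-3, 4)])
print(h(1, 0))
```

With this convention h(S, T) = T² − S³, which recovers the familiar relation g² = f³ between X² and X³.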

Now, we construct by backwards induction on k a polynomial with coefficients in K[X1 , . . . , Xk ]


vanishing at f1 , . . . , fn+1 . In other words, we eliminate one variable each time. Here is how we
do it: at first, fn,i = fi . Then, we define the polynomials

fk−1,i = ResXk (fk,k+1 − Tk,k+1 , fk,i − Tk,i )

for i = 1, . . . , k. At each step we get rid of Xk and introduce k + 1 new variables. Thus,
f0,1 ∈ K[{Ti,j | i ≤ j − 1}]. It is clear that it is zero when evaluated at Tn,i = fi for every i and
Tk,i constant for i ≤ k − 1 ≤ n − 2. Indeed, note that Res(A, B)(t) is not in general equal to
Res(A(t), B(t)), since A(t), B(t) do not have the same degree as A, B. If we consider constant
polynomials as polynomials of degree deg(fk,i − Tk,i ) > 0, then

ResXk (fk,k+1 − Tk,k+1 , fk,i − Tk,i ) = 0,



as can be seen from the matrix expression of Exercise B.4.4† . It remains to prove that there
is some choice of such Tk,i for which f0,1 is not the zero polynomial. This is easy to see: we
can choose $T_{k,k+1} = 0$ for all k and at each step we pick $T_{k,i}$ so that $f_{k,k+1}$ and $f_{k,i} - T_{k,i}$ are coprime. Indeed, if $f_{k,k+1}$ has ℓ irreducible factors, if we pick ℓ + 1 values of $T_{k,i}$ one of them must work, as otherwise we would have
\[ \pi \mid (f_{k,i} - T'_{k,i}) - (f_{k,i} - T_{k,i}) = T_{k,i} - T'_{k,i} \]
for some irreducible $\pi \mid f_{k,k+1}$ and distinct $T_{k,i}, T'_{k,i} \in K$ by the pigeonhole principle. This is impossible since it implies $T_{k,i} = T'_{k,i}$. There is still one slight technicality: we could have ℓ ≥ |K|.
However, we can simply add elements to K to get a sufficiently large K as in Exercise B.4.7† ,
and then consider the norm of the polynomial f we obtain (i.e. take the product over each of its
conjugates, exactly like we did in the solution of Exercise 1.5.22† ). 

Exercise B.4.9† (Transcendence Bases). Let L/K be a field extension. Call a maximal set of K-
algebraically independent elements of L a transcendence basis. Prove that, if L/K has a transcendence
basis of cardinality n, then all transcendence bases have cardinality n. This n is called the transcendence
degree trdegK L. Finally, show that, if L = K(α1 , . . . , αn ) any maximal algebraically independent
subset of α1 , . . . , αn is a transcendence basis. (In particular trdegK L ≤ n.)

Solution

We prove a result analogous to Proposition C.1.2: if $\alpha_1, \ldots, \alpha_m \in L$ are K-algebraically independent and $\beta_1, \ldots, \beta_n \in L$ are such that any element of L is algebraic over $K(\beta_1, \ldots, \beta_n)$, then m ≤ n. Since transcendence bases satisfy both conditions, this shows that $\operatorname{trdeg}_K L$ is well-defined. This is almost Exercise B.4.8†: any family of n + 1 elements algebraic over $K(\beta_1, \ldots, \beta_n)$ is algebraically dependent over K. The only difference is that, in our case, $\alpha_1, \ldots, \alpha_m$ are not necessarily in $K(\beta_1, \ldots, \beta_n)$. However, the first argument still works perfectly fine, the only difference is that, if $\alpha_i$ has degree $d_i$ over $K(\beta_1, \ldots, \beta_n)$, we get (at most)
\[ \left(\prod_{i=1}^m d_i\right) \cdot \frac{(mMN)^{n+1} - 1}{mMN - 1} \]
equations this time, which is still less than $(N + 1)^m$ for large N if m > n.

For the second part, note that, by the same argument as Theorem 1.3.2 or by Chapter 6, any element of $K(\alpha_1, \ldots, \alpha_n)$ is algebraic over K(S), where $S \subseteq \{\alpha_1, \ldots, \alpha_n\}$ is a maximal subset of K-algebraically independent elements.

Exercise B.4.10† . Let K be an algebraically closed field which is contained in another field L.
Suppose that f1 , . . . , fm ∈ K[X1 , . . . , Xn ] are polynomials with a common root in L. Prove that they
also have a common root in K.

Solution

We present two solutions, one based on Hilbert’s Nullstellensatz from Exercise B.4.6† and one
in characteristic 0 based on transcendence bases from Exercise B.4.9† . For the first solution, note
that f1 , . . . , fm have a common root in L if and only if there are no g1 , . . . , gm ∈ L[X1 , . . . , Xn ]
such that f1 g1 + . . . + fm gm = 1. In that case, there are no such gi in K[X1 , . . . , Xn ] either, so
f1 , . . . , fm also have a common root in K.

We now present the second solution, which is perhaps more intuitive as it "lifts" (or "reduces"
in our case) the common root over L to a common root over K. Thus, suppose that char K = 0

and let α1 , . . . , αk be a K-transcendence basis for the field generated by K and the common
root. Then, let α be such that this field is equal to K(α1 , . . . , αk , α), there exists such an α by
the primitive element theorem 6.2.1. Let

(r1 (α1 , . . . , αk )(α), . . . , rn (α1 , . . . , αk )(α))

be the common root, with ri ∈ K(X). The equality

fi (r1 (α1 , . . . , αk )(α), . . . , rn (α1 , . . . , αk )(α)) = 0

is an equality modulo the minimal polynomial π(α1 , . . . , αk ) of α. Thus, if we replace αi by


ai ∈ K and α by a root a ∈ K of π(a1 , . . . , ak ), we get a common root in K. We just need to
check that the ri (a1 , . . . , ak )(a) are well-defined, i.e. their denominator is non-zero. This follows
from Exercise A.1.7∗ : the denominator is non-zero so it stays non-zero infinitely many times in
K n . Note that ri (α) is not necessarily a polynomial, instead it is algebraic over K(α1 , . . . , αk ),
but by considering its norm (the product with its conjugates over K(α1 , . . . , αk )) we can get a
polynomial. Indeed, if the norm of ri (α) is non-zero then so is ri (α). (We also need to be careful
with the leading coefficient of π: if it vanishes α has too few conjugates and things can get weird,
but we can simply pick a1 , . . . , ak so that it doesn’t vanish either.) 

Miscellaneous
Exercise B.4.11† (ISL 2020 Generalised). Let n ≥ 1 be an integer. Find the maximal N for which
there exists a monomial f of degree N which can not be written as a sum
\[ \sum_{i=1}^n e_i f_i \]

with fi ∈ Z[X1 , . . . , Xn ].

Solution

The answer is $N = \frac{n(n-1)}{2}$. First, we prove that $X_2 X_3^2 \cdot \ldots \cdot X_n^{n-1}$ can not be written in the desired form. Suppose for the sake of a contradiction that $X_2 X_3^2 \cdot \ldots \cdot X_n^{n-1} = \sum_i e_i f_i$ for some polynomials $f_i$, which we suppose without loss of generality to be homogeneous of degree $\frac{n(n-1)}{2} - i$ (by ignoring all other monomials). Then, we sum $\varepsilon(\sigma) X_{\sigma(2)}^1 \cdot \ldots \cdot X_{\sigma(n)}^{n-1}$ over all permutations $\sigma \in S_n$ of [n], where ε denotes the signature (see Definition C.3.2). Since the $e_i$ are symmetric, we have
\[ \sum_{\sigma \in S_n} \varepsilon(\sigma) X_{\sigma(2)}^1 \cdot \ldots \cdot X_{\sigma(n)}^{n-1} = \sum_i e_i \sum_{\sigma \in S_n} \varepsilon(\sigma) f_i(X_{\sigma(1)}, \ldots, X_{\sigma(n)}). \]

Here is the key point: if f has degree less than $\frac{n(n-1)}{2}$, then $\sum_{\sigma \in S_n} \varepsilon(\sigma) f(X_{\sigma(1)}, \ldots, X_{\sigma(n)}) = 0$. This is an obvious contradiction as the LHS is a sum of distinct monomials so is non-zero.

To prove this claim, suppose without loss of generality that f is a monomial $\prod_{i=1}^n X_i^{a_i}$. Since $\sum_{i=1}^n a_i < \sum_{i=1}^n (i - 1)$, two $a_i$ must be equal, say $a_i = a_j$. Denote by τ the transposition i ↔ j. Then, by grouping permutations of [n] by orbits σ, σ ◦ τ, the sum is zero since
\[ f(X_{\sigma(1)}, \ldots, X_{\sigma(n)}) = f(X_{\sigma\circ\tau(1)}, \ldots, X_{\sigma\circ\tau(n)}) \]
but ε(σ ◦ τ) = −ε(σ) by Exercise C.3.11∗ so the sum over each orbit cancels out.
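
The cancellation in the key claim can be observed directly on small monomials. The Python sketch below (an illustration only; `sign` and `antisymmetrize` are hypothetical helper names, and variables are indexed from 0) represents a monomial by its tuple of exponents and computes the signed sum over all permutations as a dictionary from exponent tuples to coefficients.

```python
from itertools import permutations


def sign(perm):
    """Signature of a permutation of range(n), computed by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def antisymmetrize(exponents):
    """Signed sum over sigma of the monomial with X_{sigma(i)} raised to exponents[i].

    Returns a dict {exponent tuple: coefficient} with zero coefficients dropped.
    """
    n = len(exponents)
    result = {}
    for perm in permutations(range(n)):
        monomial = [0] * n
        for i, a in enumerate(exponents):
            monomial[perm[i]] = a  # the variable X_{sigma(i)} receives exponent a_i
        key = tuple(monomial)
        result[key] = result.get(key, 0) + sign(perm)
    return {key: coeff for key, coeff in result.items() if coeff != 0}


print(antisymmetrize((0, 1, 1)))       # two equal exponents: everything cancels
print(len(antisymmetrize((0, 1, 2))))  # distinct exponents: n! distinct monomials survive
```

For n = 3, any monomial of degree less than 3 has two equal exponents, so its signed sum is empty, while the exponents (0, 1, 2) of X₂X₃² give six monomials with coefficients ±1.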

It remains to prove that $X_1^{a_1} \cdot \ldots \cdot X_n^{a_n}$ works when $a_1 + \ldots + a_n > \frac{n(n-1)}{2}$. We proceed by induction on $a_1^2 + \ldots + a_n^2$, with the following base case: when $a_1, \ldots, a_n \ge 1$, the monomial is divisible by $e_n = X_1 \cdot \ldots \cdot X_n$.

Now suppose that $a_1 + \ldots + a_n > \frac{n(n-1)}{2}$ and, without loss of generality, $0 = a_1 \le a_2 \le \ldots \le a_n$. There must exist some k such that $a_{k+1} \ge a_k + 2$, since otherwise $a_k \le k - 1$ for all k, contradicting our initial assumption on the sum. Now consider
\[ X_1^{a_1} \cdot \ldots \cdot X_n^{a_n} - X_1^{a_1} \cdot \ldots \cdot X_{k-1}^{a_{k-1}} X_k^{a_k - 1} \cdot \ldots \cdot X_n^{a_n - 1} e_{n-k}. \]

We claim that the sum of the squares of the exponents in any monomial appearing in this polynomial is less than $a_1^2 + \ldots + a_n^2$, thus concluding the inductive step. To see this, express a monomial of $X_1^{a_1} \cdot \ldots \cdot X_{k-1}^{a_{k-1}} X_k^{a_k - 1} \cdot \ldots \cdot X_n^{a_n - 1} e_{n-k}$ as
\[ X_1^{a_1 + b_1} \cdot \ldots \cdot X_{k-1}^{a_{k-1} + b_{k-1}} X_k^{a_k + b_k - 1} \cdot \ldots \cdot X_n^{a_n + b_n - 1} \]

for some bi ∈ {0, 1} with b1 + . . . + bn = n − k. The wanted result then follows from the
convexity of the square function: if $b_i = 1$ for some i < k and $b_j = 0$ for some j ≥ k, then $(a_i + 1)^2 + (a_j - 1)^2 < a_i^2 + a_j^2$. Iterating this process to "push" all the ones to the positions greater than or equal to k, we get
\[ (a_1 + b_1)^2 + \ldots + (a_{k-1} + b_{k-1})^2 + (a_k + b_k - 1)^2 + \ldots + (a_n + b_n - 1)^2 \le a_1^2 + \ldots + a_n^2 \]

with equality if and only if we already had equality in the beginning, i.e. if the monomial is
X1a1 · . . . · Xnan . However, we have ruled that case out by subtracting precisely this monomial, so
we are done. 

Exercise B.4.12† (Lagrange). Given a rational function f ∈ K(X1 , . . . , Xn ), we denote by Gf the set of permutations σ ∈ Sn such that

f (X1 , . . . , Xn ) = f (Xσ(1) , . . . , Xσ(n) ).

Let f, g ∈ K(X1 , . . . , Xn ) be two rational functions. If Gf ⊆ Gg , prove that there exists a rational
function r ∈ K[e1 , . . . , en ](X) such that
g = r ◦ f.

Solution

We present the proofs in Prasolov [21]. Partition $S_n$ into disjoint cosets $G_f = h_1 G_f, h_2 G_f, \ldots, h_k G_f$ and write $f_i = h_i f$ and $g_i = h_i g$ for each i, where σf means $f(X_{\sigma(1)}, \ldots, X_{\sigma(n)})$ (we say the group of permutations $S_n$ acts on the field $K(X_1, \ldots, X_n)$). (This is where we use the assumption that $G_f \subseteq G_g$: it ensures that $g_i$ does not depend on the choice of the representative $h_i$.)

For the first proof, notice that
\[ \sum_{i=1}^k \frac{g_i}{T - f_i} \]
is, by definition, symmetric in $X_1, \ldots, X_n$. Since $\Omega = \prod_{i=1}^k (T - f_i)$ is as well, we get
\[ \sum_{i=1}^k \frac{g_i}{T - f_i} = \frac{F(T)}{\Omega(T)} \]
for some $F \in K(e_1, \ldots, e_n)[T]$ by the fundamental theorem of symmetric polynomials. Notice that $\Omega'(f) = \prod_{i=2}^k (f - f_i)$ is $\Omega/(T - f)$ evaluated at T = f by Exercise 3.2.2∗. Hence, we conclude that
\[ \frac{F(f)}{\Omega'(f)} = \sum_{i=1}^k g_i \left(\frac{\Omega}{(T - f_i)\Omega'}\right)(f) = g \]
since $\frac{\Omega}{(T - f_i)\Omega'}$ vanishes at T = f when $f \ne f_i$ (and takes the value 1 when $f_i = f_1 = f$).

The second proof is perhaps more intuitive. We consider the system of equations
\[ \sum_{i=1}^k f_i^s g_i = T_s, \]
where the exponent represents powers and not iterates. Cramer’s rule from Exercise C.5.7 and the Vandermonde determinant from Appendix C tell us that
\[ g = \frac{D}{\Delta} \]
where
\[ \Delta = \begin{vmatrix} 1 & \cdots & 1 \\ f_1 & \cdots & f_k \\ \vdots & \ddots & \vdots \\ f_1^{k-1} & \cdots & f_k^{k-1} \end{vmatrix} = \prod_{i<j} (f_i - f_j) \]
and
\[ D = \begin{vmatrix} T_0 & 1 & \cdots & 1 \\ T_1 & f_2 & \cdots & f_k \\ \vdots & \vdots & \ddots & \vdots \\ T_{k-1} & f_2^{k-1} & \cdots & f_k^{k-1} \end{vmatrix}. \]

Write this as $g = \frac{D\Delta}{\Delta^2}$. Notice that $\Delta^2$ is symmetric, while D and Δ both change sign when two $f_i$ are switched, so DΔ is symmetric in $f_2, \ldots, f_k$. However, it is easy to see that, for any i, $e_i(f_2, \ldots, f_k)$ can be expressed polynomially in terms of $f_1$ and $e_j(f_1, \ldots, f_k)$. Hence, this $\frac{D\Delta}{\Delta^2}$ is a rational function in f with symmetric coefficients by the fundamental theorem of symmetric polynomials.

Exercise B.4.13† (Iran Mathematical Olympiad 2012). Prove that there exists a polynomial f ∈
R[X0 , . . . , Xn−1 ] such that, for all a0 , . . . , an−1 ∈ R,
f (a0 , . . . , an−1 ) ≥ 0
is equivalent to the polynomial $X^n + a_{n-1} X^{n-1} + \ldots + a_0$ having only real roots, if and only if n ∈ {1, 2, 3}.

Solution
If n ≤ 3, the discriminant satisfies the condition. Indeed, the discriminant of $f = \prod_i (X - \alpha_i)$ is the square of $\prod_{i<j} (\alpha_i - \alpha_j)$ so is non-negative if all $\alpha_i$ are real. It remains to prove that, for these n, $\prod_{i<j} (\alpha_i - \alpha_j)$ is real if and only if all $\alpha_i$ are (its square is real so it must be real or purely imaginary). For n = 1, it is trivial since any polynomial of degree 1 with real coefficients splits in R. For n = 2, if the roots of f are $\alpha \ne \overline{\alpha}$, then $\alpha - \overline{\alpha}$ is not real since complex conjugation negates it. For n = 3, if the roots of f are $\alpha \ne \overline{\alpha}$ and β ∈ R, then complex conjugation also negates
\[ (\alpha - \overline{\alpha})(\beta - \alpha)(\beta - \overline{\alpha}) \]
so it isn’t real as desired.
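
The sign behaviour of the discriminant is easy to test numerically. The sketch below is an illustration under stated assumptions: `cubic_discriminant` uses the standard closed formula for a monic cubic, `discriminant_from_roots` squares the product of root differences, and the quartic example (X² + 1)(X² + 4) is chosen here to show why the discriminant alone no longer works for n = 4.

```python
def cubic_discriminant(a, b, c):
    """Discriminant of X^3 + a X^2 + b X + c (standard closed formula)."""
    return a * a * b * b - 4 * b ** 3 - 4 * a ** 3 * c - 27 * c * c + 18 * a * b * c


def discriminant_from_roots(roots):
    """Square of the product of (alpha_i - alpha_j) over i < j, for a monic polynomial."""
    product = 1
    for i in range(len(roots)):
        for j in range(i + 1, len(roots)):
            product *= roots[i] - roots[j]
    return product * product


# (X - 1)(X - 2)(X - 3) = X^3 - 6 X^2 + 11 X - 6: three real roots.
print(cubic_discriminant(-6, 11, -6), discriminant_from_roots([1, 2, 3]))
# X^3 + X has roots 0, i, -i: one real root only, and the discriminant is negative.
print(cubic_discriminant(0, 1, 0), discriminant_from_roots([0, 1j, -1j]))
# (X^2 + 1)(X^2 + 4) has no real root, yet its discriminant is positive.
print(discriminant_from_roots([1j, -1j, 2j, -2j]))
```

The last line illustrates that two pairs of complex conjugate roots contribute two sign changes that cancel, which is consistent with the impossibility result for n ≥ 4.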
Now, if there exists such a polynomial for n ≥ 4, there exists one for n = 4 by setting g(a, b, c, d) =
f (a, b, c, d, 0, . . . , 0). Thus, it only remains to prove that there doesn’t exist such a polynomial
for n = 4. For this, consider the special polynomial f (0, b, 0, d) since we know precisely when
the roots of X 4 + bX 2 + d are real. For convenience, we shall in fact consider the polynomial
g(r, s) = f (0, −r − s, 0, rs) which is non-negative iff the roots of
X 4 − (r + s)X 2 + rs = (X 2 − r)(X 2 − s)

are all real. In other words, g(r, s) is non-negative if and only if r and s are. This implies that
\[ 0 \ge \lim_{s \to 0^-} g(r, s) = g(r, 0) = \lim_{s \to 0^+} g(r, s) \ge 0, \]

i.e. g(r, 0) = 0 for any non-negative r. But then, the polynomial g(R, 0) must be zero since it
has infinitely many roots, so g(r, 0) is also zero (and in particular non-negative) for negative r
which is a contradiction. 
Appendix C

Linear Algebra

C.1 Vector Spaces


Exercise C.1.1∗ . Prove Proposition C.1.3.

Solution

Let $u_1, \ldots, u_k$ be linearly independent elements of the n-dimensional vector space V. Proceed as follows to complete it into a basis: as long as it does not generate everything, add one element that is not generated (and thus linearly independent with the previous ones). This process must stop since n + 1 vectors are always linearly dependent by Proposition C.1.2.

For the second part, let $u_1, \ldots, u_k$ be a generating family of elements and suppose without loss of generality that $u_1, \ldots, u_m$ is a maximal subset of linearly independent elements. This is a basis, since every other $u_j$ can be represented as a linear combination of them (and thus all of V too). Indeed, since $u_1, \ldots, u_m, u_j$ are linearly dependent for j > m, we have $a u_j + \sum_{i=1}^m a_i u_i = 0$ for some $a, a_i \in K$ not all zero. Since $u_1, \ldots, u_m$ are linearly independent, a must be non-zero and we get
\[ u_j = -\sum_{i=1}^m \frac{a_i}{a} u_i \]

as wanted. 

C.2 Linear Maps and Matrices


         
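Exercise C.2.1∗ . Prove that
\[ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \quad \text{but} \quad \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \]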

Solution

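We have
\[ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 0 \cdot 0 & 1 \cdot 1 + 0 \cdot 0 \\ 0 \cdot 1 + 0 \cdot 0 & 0 \cdot 1 + 0 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \]
and
\[ \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 1 \cdot 0 & 1 \cdot 0 + 1 \cdot 0 \\ 0 \cdot 1 + 0 \cdot 0 & 0 \cdot 0 + 0 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \]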
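
The same computation can be replayed mechanically. This short Python sketch (with a hypothetical helper `matmul` working on lists of rows) multiplies the two matrices in both orders.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


P = [[1, 0], [0, 0]]
Q = [[1, 1], [0, 0]]
print(matmul(P, Q))  # [[1, 1], [0, 0]]
print(matmul(Q, P))  # [[1, 0], [0, 0]]
```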



Exercise C.2.2∗ . Prove that matrix multiplication is distributive over matrix addition, i.e. A(B +
C) = AB + AC and (A + B)C = AC + BC for any A, B, C of compatible dimensions.

Solution

Let $A = (a_{i,j})$, $B = (b_{i,j})$ and $C = (c_{i,j})$. Then, the (i, j) coordinate of A(B + C) is
\[ \sum_k a_{i,k}(b_{k,j} + c_{k,j}) \]
which is equal to $\sum_k a_{i,k} b_{k,j} + \sum_k a_{i,k} c_{k,j}$, i.e. the (i, j) coordinate of AB + AC. The right-distributivity is completely analogous by symmetry of the left and right multiplication.

C.3 Determinants
Exercise C.3.1. Prove that an m × n matrix can only have a right-inverse if m ≤ n, and only a left-inverse if m ≥ n. When does such an inverse exist?

Solution

By symmetry, it suffices to consider right-inverses. Suppose that a matrix A has dimensions m × n and is right-invertible. Consider the surjective linear map from $K^{n \times m}$ to $K^{m \times m}$ defined by B ↦ AB. By surjectivity, mn ≥ m², i.e. n ≥ m as wanted.

For the converse, we shall refine a bit our original argument. Note that each column of AB is a linear combination of the columns of A: if $A_1, \ldots, A_n$ are the columns of A and $B = (b_{i,j})$, then
\[ \sum_i b_{i,k} A_i \]
is the kth column of AB. Hence, A has a right-inverse if and only if its columns generate $K^m$ (which requires n ≥ m).

Exercise C.3.2∗ . Prove that $(AB)^T = B^T A^T$ for any n × n matrices A, B.

Solution

Let $A = (a_{i,j})$ and $B = (b_{i,j})$. The (i, j) coordinate of AB is $\sum_k a_{i,k} b_{k,j}$ so the (i, j) coordinate of its transpose is $\sum_k a_{j,k} b_{k,i} = \sum_k b_{k,i} a_{j,k}$ which is also the (i, j) coordinate of $B^T A^T$.

Exercise C.3.3. Prove this identity.

Solution

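We have
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} ad + b(-c) & a(-b) + ba \\ cd + d(-c) & c(-b) + da \end{pmatrix} = \begin{pmatrix} ad - bc & 0 \\ 0 & ad - bc \end{pmatrix} = (ad - bc) I_2. \]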

Exercise C.3.4∗ . Prove that det In = 1.



Solution

As always, by induction. From the definition of the determinant, we have det In = 1 · det In−1 +
0 · . . . = det In−1 and det I1 = 1. 

Exercise C.3.5∗ . Prove that the determinant of a matrix with a zero column is zero.

Solution

Suppose that the kth column $M^k$ of M is zero. By Proposition C.3.1 (with t = 0), we have
\[ \det M = \det{}^k_0(M) = 0 \cdot \det{}^k_0(M) = 0. \]

Exercise C.3.6. Prove that the determinant of a non-invertible matrix is 0.

Solution
Suppose that the columns of M are linearly dependent, i.e. $\sum_i a_i M^i = 0$ for some $a_i \in K$ with $a_k \ne 0$. Then,
\[ 0 = \det{}^k_0(M) = \sum_i a_i \det{}^k_{M^i}(M) \]

by Exercise C.3.5∗ and Proposition C.3.1. Now, by Proposition C.3.3, all the determinants vanish
except the one with i = k. Thus, we get 0 = ak det M which means det M = 0 as wanted. 

Exercise C.3.7∗ . Prove that an upper triangular matrix is invertible if and only if its determinant
is non-zero, i.e. if the elements on its diagonal are non-zero.

Solution

Let $\alpha_i$ denote the ith element on the diagonal (i.e. the (i, i) coordinate). First, suppose that $\alpha_i \ne 0$ for every i and that
\[ \sum_{i=1}^n a_i M^i = 0 \]
for some $a_i \in K$ not all zero. Consider the least k such that $a_k \ne 0$ (recall that $\alpha_k$ is the (k, k) coordinate of M). The kth coordinate of $\sum_{i=1}^n a_i M^i$ is $\sum_{i=1}^k a_i \alpha_i = a_k \alpha_k$ since $a_i = 0$ for i < k. This means that $a_k = 0$ which contradicts our assumption.

For the converse, let k be such that $\alpha_k = 0$. Then, the columns $M^k, M^{k+1}, \ldots, M^n$ all have the top k coordinates zero. Thus, we can view them as vectors with n − k coordinates. We have n − k + 1 vectors in a space of dimension n − k so they must be linearly dependent.

Exercise C.3.8. Prove that $\overline{\mathbb{Z}}$ is integrally closed, meaning that, if f is a monic polynomial with algebraic integer coefficients, then any of its roots is also an algebraic integer. (This is also Exercise 1.5.22† .)

Solution

Let f = X n + αn−1 X n−1 + . . . + α0 be a monic polynomial with algebraic integer coefficients


and let β be one of its roots. Then,

M = Z[αn−1 , . . . , α0 , β]

is a finitely generated Z-module such that βM ⊆ M , so β is an algebraic integer. 

Exercise C.3.9∗ . Prove Lemma C.3.1.

Solution

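By induction:
\begin{align*}
\det A &= \sum_{i=1}^n (-1)^{i-1} a_{i,1} \det A_{i,1} \\
&= \sum_{i=1}^n (-1)^{i-1} a_{i,1} \sum_{\sigma} \varepsilon(\sigma) a_{\sigma(2),2} \cdot \ldots \cdot a_{\sigma(n),n} \\
&= \sum_{i=1}^n \sum_{\sigma} (-1)^{i-1} \varepsilon(\sigma) a_{i,1} a_{\sigma(2),2} \cdot \ldots \cdot a_{\sigma(n),n}
\end{align*}
where the sum is over the bijections σ : [n] \ {1} → [n] \ {i}. This has the desired form. Finally, note that the sign of $a_{1,1} \cdot \ldots \cdot a_{n,n}$ is $(-1)^{1-1}$ times the sign of $a_{2,2} \cdot \ldots \cdot a_{n,n}$ which is 1 as wanted.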

Exercise C.3.10∗ . Prove that the number of derangements of [m] is
\[ \sum_{i=0}^m \frac{(-1)^i m!}{i!} \]
and that this number is odd if m is even and even if m is odd.

Solution

We shall instead count the number of permutations with at least one fixed point (writing n for m). We use the principle of inclusion-exclusion. For a fixed k, there are (n − 1)! permutations σ satisfying σ(k) = k. Thus, we count n · (n − 1)! = n! permutations. However, we have counted some permutations twice: the ones which have at least two fixed points. Thus, we remove $\binom{n}{2} \cdot (n - 2)!$ from our count (we need to choose two fixed points and then permute the remaining n − 2 elements arbitrarily). But now, we have removed some permutations too many times: the ones with at least three fixed points, so we must add $\binom{n}{3} \cdot (n - 3)!$ to our count, etc. At the end, we get
\[ \sum_{k=1}^n (-1)^{k-1} \binom{n}{k} (n - k)! = \sum_{k=1}^n (-1)^{k-1} \frac{n!}{k!}. \]

If we subtract this from n! (the total number of permutations), we get exactly the wanted formula.
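
Both the formula and the parity statement can be checked by brute force for small m. The following Python sketch is only a numerical check, with hypothetical function names; it enumerates all permutations, so it is meant for small values of m.

```python
from itertools import permutations
from math import factorial


def derangements_formula(m):
    """Inclusion-exclusion formula: sum of (-1)^i * m! / i! for i = 0..m."""
    return sum((-1) ** i * (factorial(m) // factorial(i)) for i in range(m + 1))


def derangements_brute_force(m):
    """Count the permutations of range(m) without any fixed point."""
    return sum(1 for p in permutations(range(m)) if all(p[i] != i for i in range(m)))


for m in range(7):
    count = derangements_formula(m)
    # The three printed values: m, the count, and its parity (1 when m is even).
    print(m, count, count % 2)
```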

Exercise C.3.11∗ . Prove that the signature is negated when one exchanges two values of σ (i.e.
compose a transposition with σ).

Solution

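Say that we exchange σ(i) with σ(j), i.e. we apply the transposition $\tau = \tau_{\sigma(i),\sigma(j)}$. We have
\[ \frac{\varepsilon(\sigma)}{\varepsilon(\tau \circ \sigma)} = \frac{\frac{\sigma(i) - \sigma(j)}{j - i} \cdot \prod_{k \ne i, j} \frac{\sigma(k) - \sigma(j)}{k - i} \cdot \prod_{k \ne i, j} \frac{\sigma(k) - \sigma(i)}{k - j}}{\frac{\sigma(j) - \sigma(i)}{j - i} \cdot \prod_{k \ne i, j} \frac{\sigma(k) - \sigma(i)}{k - i} \cdot \prod_{k \ne i, j} \frac{\sigma(k) - \sigma(j)}{k - j}} = \frac{\frac{\sigma(i) - \sigma(j)}{j - i}}{\frac{\sigma(j) - \sigma(i)}{j - i}} = -1 \]
as wanted.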

Exercise C.3.12∗ . Prove that transpositions $\tau_{i,j}$ : i ↔ j and k ↦ k for k ≠ i, j generate all permutations (through composition).

Solution

By induction: we start with $\tau_{n,\sigma^{-1}(n)}$ so that $\sigma \circ \tau_{n,\sigma^{-1}(n)}(n) = n$. Then, ignoring the last element of $\sigma \circ \tau_{n,\sigma^{-1}(n)}$, it is a permutation of [n − 1] so a composition of transpositions. We get the wanted result by applying $\tau_{n,\sigma^{-1}(n)}^{-1} = \tau_{n,\sigma^{-1}(n)}$ to both sides.
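
The induction translates into a short procedure: fix the largest element with one transposition, then recurse on the rest. The Python sketch below follows that idea with 0-indexed permutations stored as lists (`sigma[k]` is the image of k); the function names are hypothetical and the representation is an assumption made for the example.

```python
from itertools import permutations


def to_transpositions(sigma):
    """Transpositions t_1, ..., t_r such that sigma o t_1 o ... o t_r is the identity."""
    current = list(sigma)
    swaps = []
    for target in range(len(current) - 1, 0, -1):
        j = current.index(target)  # j is the preimage of target under the current permutation
        if j != target:
            # Swapping the entries at positions j and target composes on the right
            # with the transposition (target, j), after which target is fixed.
            current[j], current[target] = current[target], current[j]
            swaps.append((target, j))
    return swaps


def from_transpositions(swaps, n):
    """Rebuild sigma = t_r o ... o t_1 from the list returned by to_transpositions."""
    perm = list(range(n))
    for a, b in reversed(swaps):
        perm[a], perm[b] = perm[b], perm[a]
    return perm


def sign_by_inversions(sigma):
    """Signature computed by counting inversions."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1


print(to_transpositions([1, 2, 0]))  # a 3-cycle needs two transpositions here
```

Since each transposition negates the signature (Exercise C.3.11∗), the parity of the number of transpositions found gives the signature of σ, which the tests below confirm on S₄.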

Exercise C.3.13∗ . Prove Theorem C.3.3.

Solution

We proceed as in Exercise C.3.9∗ : we have
\[ \det(a_{i,j}) = \sum_{i=1}^n \sum_{\sigma} (-1)^{i-1} \varepsilon(\sigma) a_{i,1} a_{\sigma(2),2} \cdot \ldots \cdot a_{\sigma(n),n} \]
where the sum is over the bijections σ : [n] \ {1} → [n] \ {i}. It remains to prove that $\varepsilon(\sigma) = (-1)^{\sigma(1)-1} \varepsilon(\sigma')$ where σ′ denotes the bijection [n] \ {1} → [n] \ {σ(1)} obtained by forgetting the first element. This is easy: we want to count the number of inversions of σ which are not inversions of σ′. This is exactly the number of inversions (k, 1) since 1 is the only difference between σ and σ′. Since k > 1 for each k ≠ 1, the number of such inversions is the number of k such that σ(k) < σ(1), i.e. σ(1) − 1 as wanted.

Exercise C.3.14∗ . Prove that det A = det AT for any square matrix A.

Solution

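Our formula from Theorem C.3.3 is symmetric in rows and columns:
\begin{align*}
\det A &= \sum_{\sigma \in S_n} \varepsilon(\sigma) a_{\sigma(1),1} \cdot \ldots \cdot a_{\sigma(n),n} \\
&= \sum_{\sigma \in S_n} \varepsilon(\sigma^{-1}) a_{1,\sigma^{-1}(1)} \cdot \ldots \cdot a_{n,\sigma^{-1}(n)} \\
&= \sum_{\sigma \in S_n} \varepsilon(\sigma) a_{1,\sigma(1)} \cdot \ldots \cdot a_{n,\sigma(n)} \\
&= \det A^T.
\end{align*}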
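
The explicit formula is straightforward to evaluate for small matrices, which gives a quick numerical check of det A = det Aᵀ. The sketch below is an illustration with hypothetical names (`sign`, `det_leibniz`, `transpose`); it sums over all n! permutations, so it is only practical for small n.

```python
from itertools import permutations


def sign(perm):
    """Signature of a permutation of range(n), computed by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det_leibniz(A):
    """Sum over sigma of eps(sigma) * a_{sigma(1),1} * ... * a_{sigma(n),n}."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for col in range(n):
            term *= A[perm[col]][col]
        total += term
    return total


def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(column) for column in zip(*A)]


A = [[2, 1, 0], [1, 3, 4], [0, 5, 6]]
print(det_leibniz(A), det_leibniz(transpose(A)))  # both equal -10
```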

Exercise C.3.15∗ . Prove that Dk (In−1 ) = (−1)k−1 .

Solution

Consider the n × n matrix I with which we defined Dk and substitute A = In−1 . We will exchange
k − 1 columns to transform it into In , thus getting a determinant of (−1)k−1 det In = (−1)k−1
as wanted.

Note that I is already almost equal to In : its first column should be kth and the 1 to k ones
should be shifted to the left. Here is how we do this shift using transpositions.

We first exchange the first column of I with the second one so that a1,1 = 1 becomes the (1, 1)
coordinate of I, then we exchange the (new) second column with the third one so that a2,2 = 1
becomes the (2, 2) coordinate of I, etc., until we exchange the k − 1th column with the kth one so
that ak,k = 1 becomes the (k, k) coordinate of I. Thus, the k − 1th column, which was originally
the first one, becomes the kth one as wanted. 

Exercise C.3.16. Prove that the determinant is multiplicative by using the explicit formula of
Theorem C.3.3.

Solution

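Write C = AB, so that
\[ C^k = b_{1,k} A^1 + \ldots + b_{n,k} A^n. \]
Then, by multilinearity of the determinant,
\begin{align*}
\det C &= \det(b_{1,1} A^1 + \ldots + b_{n,1} A^n, \ldots, b_{1,n} A^1 + \ldots + b_{n,n} A^n) \\
&= \sum_{\sigma \in S_n} b_{\sigma(1),1} \cdot \ldots \cdot b_{\sigma(n),n} \det(A^{\sigma(1)}, \ldots, A^{\sigma(n)}) \\
&= \sum_{\sigma \in S_n} \varepsilon(\sigma) b_{\sigma(1),1} \cdot \ldots \cdot b_{\sigma(n),n} \det(A^1, \ldots, A^n) \\
&= \det B \det A
\end{align*}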

where the second-to-last equality comes from Remark C.3.4. 

Exercise C.3.17. Let L/K be a finite extension. Prove that the determinant of the K-linear map L → L defined by x ↦ xα is the norm of α defined in Definition 6.2.3.

Solution

We first treat the case where L = K(α). Consider the basis $1, \alpha, \ldots, \alpha^{n-1}$ of L and let $f = X^n + a_{n-1} X^{n-1} + \ldots + a_0$ be the minimal polynomial of α. The matrix corresponding to x ↦ xα in this basis is
\[ \begin{pmatrix} 0 & 0 & 0 & \cdots & -a_0 \\ 1 & 0 & 0 & \cdots & -a_1 \\ 0 & 1 & 0 & \cdots & -a_2 \\ 0 & 0 & 1 & \cdots & -a_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & -a_{n-1} \end{pmatrix}. \]
Using Theorem C.3.3, we see that this is −ε(σ)a0 , where σ is the cycle (1, 2, . . . , n). Hence, we
need to prove that ε(σ) = (−1)n−1 since the product of the conjugates of α is (−1)n a0 by Vieta’s
formulas. This follows from the fact that

σ = (1, n) · . . . · (1, 3)(1, 2)

is a product of n − 1 transpositions.
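
For small degrees, the value (−1)ⁿa₀ can be checked by building the matrix above and computing its determinant. The Python sketch below is an illustration only: `companion` and `det` are hypothetical helper names, and the coefficient list is ordered as [a₀, ..., a₍ₙ₋₁₎] by assumption.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not matrix:
        return 1
    total = 0
    for j, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def companion(coeffs):
    """Matrix of x -> x * alpha in the basis 1, alpha, ..., alpha^(n-1).

    coeffs = [a_0, ..., a_{n-1}] are the lower coefficients of the monic
    polynomial X^n + a_{n-1} X^{n-1} + ... + a_0 of which alpha is a root.
    """
    n = len(coeffs)
    M = [[0] * n for _ in range(n)]
    for j in range(n - 1):
        M[j + 1][j] = 1            # alpha^j * alpha = alpha^(j+1)
    for i in range(n):
        M[i][n - 1] = -coeffs[i]   # alpha^n = -(a_0 + ... + a_{n-1} alpha^(n-1))
    return M


print(det(companion([-2, 0])))     # X^2 - 2: the norm of sqrt(2) over Q is -2
print(det(companion([-2, 0, 0])))  # X^3 - 2: the norm of the cube root of 2 is 2
```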

Now here is what we can do for the general case. Note that the norm is multiplicative. Indeed,
for any linear maps ϕ and ψ,
det ϕ · det ψ = det ϕ ◦ ψ
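since the determinant is multiplicative. Since the composite of x ↦ αx and x ↦ βx is x ↦ αβx, the norm of αβ is the norm of α times the norm of β. Then, pick a primitive element γ of L/K such that α/γ is also a primitive element. We can do this since
\[ \sigma(\alpha)/\sigma(\gamma) = \alpha/\gamma \iff \sigma(\gamma) = \gamma \cdot \frac{\sigma(\alpha)}{\alpha} \]
is false for γ = aδ + b for well-chosen a, b ∈ K and a fixed primitive element δ. Thus, from the first observation, we get
\begin{align*}
N_{L/K}(\alpha) &= N_{L/K}(\gamma) N_{L/K}(\alpha/\gamma) \\
&= \prod_{\sigma \in \operatorname{Emb}_K(L)} \sigma(\gamma) \prod_{\sigma \in \operatorname{Emb}_K(L)} \sigma(\alpha/\gamma) \\
&= \prod_{\sigma \in \operatorname{Emb}_K(L)} \sigma(\alpha)
\end{align*}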

as wanted. 

Exercise C.3.18∗ . Prove that adj AA = (det A)In .

Solution

Set $(b_{i,j}) = \operatorname{adj} A \cdot A$. This time, we have
\[ b_{i,j} = \sum_{k=1}^n (-1)^{i+k} \det(A_{k,i}) a_{k,j}. \]

When i = j this is the column expansion of the determinant of A which is det A, and when i ≠ j this is $(-1)^{i+j}$ times the determinant of the matrix obtained by replacing the ith column of A by its jth column. This matrix has two identical columns so its determinant is zero as wanted.
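
The identity can be checked on explicit integer matrices. In the Python sketch below (hypothetical helper names), the entry (i, j) of the adjugate is taken to be (−1)^(i+j) det A_{j,i}, where A_{j,i} is A with row j and column i removed, matching the formula for b_{i,j} above.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not matrix:
        return 1
    total = 0
    for j, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def minor(A, i, j):
    """Matrix A with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def adjugate(A):
    """Adjugate: entry (i, j) is (-1)^(i+j) * det(minor(A, j, i)); note the transposition."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


A = [[2, 1, 0], [1, 3, 4], [0, 5, 6]]
print(det(A))
print(matmul(adjugate(A), A))  # det(A) times the identity matrix
```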

C.4 Linear Recurrences


Exercise C.4.1. Prove that Theorem C.4.1 holds in a field K of characteristic p ≠ 0 as long as the
multiplicities of the roots of the characteristic polynomial are at most p. In particular, for a fixed
characteristic equation, it holds for sufficiently large p.

Solution

The only thing to check is that if $f_i(n) = 0$ for all n ∈ Z and $f_i$ has degree less than the multiplicity of $\alpha_i$, so less than p, then $f_i = 0$. This is true because Z reduces to p elements in a field of characteristic p ≠ 0, so $f_i$ has p roots, which is more than its degree, and is thus zero.

C.5 Exercises
Vector Spaces and Bases
Exercise C.5.1 (Grassmann’s Formula). Let U be a vector space and V, W be two finite-dimensional
subspaces of U . Prove that

dim(V + W ) = dim V + dim W − dim(V ∩ W ).

Solution

Let u1 , . . . , uk be a basis of V ∩ W . Complete it to a basis u1 , . . . , uk , v1 , . . . , vm of V and a basis


u1 , . . . , uk , w1 , . . . , wn of W . We claim that

u1 , . . . , uk , v1 , . . . , vm , w1 , . . . , wn

is a basis of V + W. It clearly spans all of V + W so it remains to check that it’s linearly independent. If
\[ \sum_{i=1}^k a_i u_i + \sum_{i=1}^m b_i v_i + \sum_{i=1}^n c_i w_i = 0, \]
then $\sum_{i=1}^m b_i v_i$ is both in V and W so is in V ∩ W. This means that it’s a linear combination of the $u_i$, but by construction this implies $b_1 = \ldots = b_m = 0$ since the $u_i$ and $v_i$ are linearly independent (they form a basis of V together). By symmetry, $c_1 = \ldots = c_n = 0$. Finally, this forces $a_1 = \ldots = a_k = 0$ too.

We conclude that

dim(V + W ) = k + m + n = (k + m) + (k + n) − k = dim V + dim W − dim V ∩ W.

Exercise C.5.3† . Given a vector space V of dimension n, we say a subspace H of V is a hyperplane


of V if it has dimension n − 1. Prove that H is a hyperplane of K n if and only if there are elements
a1 , . . . , an ∈ K not all zero such that

H = {(x1 , . . . , xn ) ∈ K n | a1 x1 + . . . + an xn = 0}.

Solution

Clearly, if H is defined as the zero set of a1 X1 + . . . + an Xn then H has dimension n − 1 since,



assuming without loss of generality that $a_n \ne 0$, we get a bijective linear map $K^{n-1} \to H$ given by
\[ (x_1, \ldots, x_{n-1}) \mapsto \left(x_1, \ldots, x_{n-1}, -\frac{a_1 x_1 + \ldots + a_{n-1} x_{n-1}}{a_n}\right). \]

For the converse, pick a linear map $\varphi : K^n \to K$ mapping H to 0 without being identically 0. Then,
\[ (x_1, \ldots, x_n) = x \in H \iff \varphi(x) = 0 \iff \sum_{i=1}^n x_i \varphi(e_i) = 0 \]
where $e_i$ is the ith vector of the canonical basis of $K^n$: 0 everywhere except in its ith coordinate where there is a 1.

Determinants
Exercise C.5.6. Let $a_0, \dots, a_{n-1}$ be elements of $K$ and $\omega$ a primitive $n$th root of unity. Prove that the circulant determinant

\[ \begin{vmatrix} a_0 & a_1 & \cdots & a_{n-1} \\ a_{n-1} & a_0 & \cdots & a_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ a_1 & a_2 & \cdots & a_0 \end{vmatrix} \]

is equal to

\[ f(1) f(\omega) f(\omega^2) \cdots f(\omega^{n-1}) \]

where $f = a_0 + a_1 X + \dots + a_{n-1} X^{n-1}$. Deduce that this determinant is congruent to $a_0 + \dots + a_{p-1}$ modulo $p$ when $n = p$ is prime and $a_0, \dots, a_{p-1}$ are integers.

Solution

Let $A$ be the circulant matrix. We shall imitate the proof of Theorem C.3.2: we need to prove that $\det A$ is zero when $a_0 = -(a_1 \omega + \dots + a_{n-1} \omega^{n-1})$ for any $n$th root of unity $\omega$. This shows that the product of the $f(\omega)$ divides the determinant as a polynomial in $a_0$. Then, by looking at the coefficient of $a_0^n$, we can conclude that they are in fact equal. To show that the determinant vanishes for these $a_0$, note that the following linear combination of the columns $A^1, \dots, A^n$ of $A$ is zero:

\[ \sum_{i=1}^{n} \omega^i A^i = 0. \]

Alternatively, one could note that $A = f(J)$, where $J$ is the circulant matrix with $a_1 = 1$ and $a_0 = a_2 = a_3 = \dots = a_{n-1} = 0$. We can check that all $n$th roots of unity are eigenvalues of $J$. To finish, Exercise C.5.15 implies that the eigenvalues of $f(J)$ are the $f(\omega)$, and the determinant is their product as wanted. □
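As a sanity check (not part of the solution), here is a minimal Python sketch that compares an exact circulant determinant with the product of the $f(\omega^k)$ for $k = 0, \dots, n-1$, i.e. over all $n$th roots of unity including 1, and with $a_0 + \dots + a_{p-1}$ modulo $p$. The function names and the sample coefficients are my own choices for illustration.

```python
import cmath
from fractions import Fraction


def circulant(a):
    """Circulant matrix whose row r is the first row shifted r steps to the right."""
    n = len(a)
    return [[a[(c - r) % n] for c in range(n)] for r in range(n)]


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def product_formula(a):
    """Floating-point product of f(w^k) over all n-th roots of unity w^k."""
    n = len(a)
    prod = 1
    for k in range(n):
        w = cmath.exp(2j * cmath.pi * k / n)
        prod *= sum(coef * w**j for j, coef in enumerate(a))
    return prod


a = [3, 1, 4, 1, 5]  # arbitrary sample coefficients, n = p = 5
d = det(circulant(a))
print(d, product_formula(a), d % 5, sum(a) % 5)
```

The determinant is computed exactly with `Fraction`, so the congruence modulo $p$ is checked without rounding; only the product over roots of unity uses floating point.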

Exercise C.5.7 (Cramer's Rule). Consider the system of equations $MV = X$ where $M$ is an invertible $n \times n$ matrix and $V = (v_i)_{i \in [[1,n]]}$ and $X = (x_i)_{i \in [[1,n]]}$ are column vectors. Prove that, for any $k \in [[1, n]]$, $v_k$ is equal to $\det M_{k,X} / \det M$, where $M_{k,X}$ denotes the matrix $[M^1, \dots, M^{k-1}, X, M^{k+1}, \dots, M^n]$ obtained from $M$ by replacing the $k$th column by $X$.

Solution

Note that this formula is linear in $X$ since the determinant is multilinear. Since the $k$th coordinate of $V = M^{-1} X$ is also linear in $X$, we just need to check that both formulas agree on a basis of $K^n$. This is easy: when $X = M^j$, the formula gives $v_j = 1$ and $v_i = 0$ for $i \neq j$ (the matrix $M_{i,X}$ then has two equal columns), which is indeed the solution to $MV = M^j$. Since the columns $M^1, \dots, M^n$ form a basis of $K^n$ as $M$ is invertible, we are done. □
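To see the rule in action, here is a small Python sketch that solves a system with exact rational arithmetic. The helper names `det` and `cramer` and the sample system are assumptions of mine, and the Laplace expansion is only sensible for small $n$.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if not m:
        return Fraction(1)
    total = Fraction(0)
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def cramer(M, X):
    """Solve M V = X with v_k = det(M_{k,X}) / det(M)."""
    d = det(M)
    if d == 0:
        raise ValueError("M must be invertible")
    solution = []
    for k in range(len(M)):
        # Replace the k-th column of M by X.
        Mk = [row[:k] + [X[r]] + row[k + 1:] for r, row in enumerate(M)]
        solution.append(det(Mk) / d)
    return solution


M = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]  # sample invertible matrix
X = [1, 2, 3]
V = cramer(M, X)
print(V)
```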

Exercise C.5.8† . Let (un )n≥0 be a sequence of elements of a field K. Suppose that the (m+1)×(m+1)
determinant det(un+i+j )i,j∈[[0,m]] is 0 for all sufficiently large n. Prove that there is some N such that
(un )n≥N is a linear recurrence of order at most m.

Solution

We proceed by induction on $m$ (it's trivial when $m = 0$). More precisely, we prove that, under the assumptions of the problem, if the $m \times m$ determinant $\det(u_{n+i+j})_{i,j \in [[0,m-1]]}$ vanishes for one value of $n$, then it vanishes for all the next ones, which means by induction that there exists some $N$ for which $(u_n)_{n \geq N}$ is a linear recurrence of order at most $m - 1 \leq m$. If it doesn't vanish for $n \geq N$, then $(u_{n+i+j})_{i,j \in [[0,m]]}$ has rank $m$ by definition of the rank (see Exercise C.5.26†), and, more precisely, its first $m$ rows as well as its last $m$ rows are linearly independent and generate the same hyperplane $H$. Notice that the last $m$ rows of $(u_{n+i+j})_{i,j \in [[0,m]]}$ are the first $m$ rows of $(u_{n+1+i+j})_{i,j \in [[0,m]]}$ so this hyperplane $H$ is always the same. Finally, with Exercise C.5.3†, we conclude that

\[ a_0 u_n + a_1 u_{n+1} + \dots + a_m u_{n+m} = 0 \]

for all $n \geq N$, i.e. $(u_n)_{n \geq N}$ is a linear recurrence of order at most $m$ as claimed.

It remains to prove that, if $\det(u_{n+i+j})_{i,j \in [[0,m-1]]} = 0$, then $\det(u_{n+1+i+j})_{i,j \in [[0,m-1]]} = 0$ as well. Hence, suppose that the first determinant is 0, i.e. that there is a linear dependence between the rows. If this dependence does not involve the first row, then it also creates a linear dependence in the rows of the second matrix, which implies that its determinant is 0 as wanted. Otherwise, the first row is a linear combination of the $m - 1$ next ones. This implies that the first row of $(u_{n+i+j})_{i,j \in [[0,m]]}$ is a linear combination of the $m - 1$ next ones as well as a vector of the form $(0, \dots, 0, a)$ for some $a \in K$. Then, by performing row operations, we find

\[ 0 = \det(u_{n+i+j})_{i,j \in [[0,m]]} = \begin{vmatrix} 0 & \cdots & 0 & a \\ u_{n+1} & \cdots & u_{n+m} & u_{n+m+1} \\ \vdots & \ddots & \vdots & \vdots \\ u_{n+m} & \cdots & u_{n+2m-1} & u_{n+2m} \end{vmatrix} = \pm a \det(u_{n+1+i+j})_{i,j \in [[0,m-1]]} \]

by expanding with respect to the first row. We are done. □
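The exercise goes from vanishing determinants to a recurrence. The easy converse direction can be observed numerically, which helps to get a feel for these determinants: the sketch below (function names are mine, purely illustrative) evaluates $\det(u_{n+i+j})_{i,j \in [[0,m]]}$ for the Fibonacci numbers, which satisfy a recurrence of order 2, and for the squares, which satisfy one of order 3.

```python
def det(m):
    """Integer determinant by Laplace expansion along the first row (small sizes only)."""
    if not m:
        return 1
    return sum(
        (-1) ** c * entry * det([row[:c] + row[c + 1:] for row in m[1:]])
        for c, entry in enumerate(m[0])
    )


def hankel_det(u, n, m):
    """The (m+1) x (m+1) determinant det(u(n+i+j)) for i, j in [0, m]."""
    return det([[u(n + i + j) for j in range(m + 1)] for i in range(m + 1)])


def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def square(n):
    return n * n


# Order-2 recurrence: the 3 x 3 determinants vanish, the 2 x 2 ones do not.
print([hankel_det(fib, n, 2) for n in range(6)])
print([hankel_det(fib, n, 1) for n in range(6)])
# Order-3 recurrence: the 4 x 4 determinants vanish, the 3 x 3 ones do not.
print([hankel_det(square, n, 3) for n in range(6)])
print([hankel_det(square, n, 2) for n in range(6)])
```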

Exercise C.5.9†. Let $f_1, \dots, f_n : \mathbb{N} \to \mathbb{C}$ be functions which grow at different rates, i.e.

\[ \frac{f_1(m)}{f_2(m)}, \frac{f_2(m)}{f_3(m)}, \dots, \frac{f_{n-1}(m)}{f_n(m)} \xrightarrow[m \to \infty]{} 0. \]

Prove that there exist $n$ integers $m_1, \dots, m_n$ such that the tuples

(f1 (m1 ), . . . , fn (m1 )), . . . , (f1 (mn ), . . . , fn (mn ))

are linearly independent over C.

Solution

We proceed by induction on $n$. It is of course trivial when $n = 1$. Fix $m_1, \dots, m_{n-1}$ such that

\[ (f_1(m_1), \dots, f_{n-1}(m_1)), \dots, (f_1(m_{n-1}), \dots, f_{n-1}(m_{n-1})) \]

are linearly independent, i.e. such that the determinant

\[ C = \begin{vmatrix} f_1(m_1) & \cdots & f_{n-1}(m_1) \\ \vdots & \ddots & \vdots \\ f_1(m_{n-1}) & \cdots & f_{n-1}(m_{n-1}) \end{vmatrix} \]

is non-zero. We wish to show that there is some $m_n$ such that

\[ \begin{vmatrix} f_1(m_1) & \cdots & f_n(m_1) \\ \vdots & \ddots & \vdots \\ f_1(m_n) & \cdots & f_n(m_n) \end{vmatrix} \]

is non-zero. Expand it with respect to the last row to get

\[ \sum_{i=1}^{n} C_i f_i(m_n) \]

where the $C_i$ are constants (they do not depend on $m_n$) and $C_n = C$. Since $f_n$ dominates all other $f_i$ and $C_n$ is non-zero by construction, this is non-zero for sufficiently large $m_n$ as wanted. □

Algebraic Combinatorics
Exercise C.5.12†. Let $A_1, \dots, A_{n+1}$ be non-empty subsets of $[n]$. Prove that there exist disjoint subsets $I$ and $J$ of $[n + 1]$ such that

\[ \bigcup_{i \in I} A_i = \bigcup_{j \in J} A_j. \]

Solution

Identify each subset $A_j \subseteq [n]$ with the vector $v_j \in \mathbb{R}^n$ whose $i$th coordinate is 1 if $i \in A_j$ and 0 otherwise. This makes the set of subsets of $[n]$ into a subset of an $\mathbb{R}$-vector space of dimension $n$. (It's not a vector space itself, though it would be if we chose $\mathbb{F}_2$ instead of $\mathbb{R}$. $\mathbb{F}_2$ is usually very useful in algebraic combinatorics but doesn't work here.) We have $n + 1$ vectors in a space of dimension $n$ so they must be linearly dependent, say

\[ \sum_{i=1}^{n+1} c_i v_i = 0 \]

with the $c_i$ not all zero. Now consider the set $I$ of indices of positive $c_i$, and the set $J$ of indices of negative $c_i$. We have

\[ \sum_{i \in I} |c_i| v_i = \sum_{j \in J} |c_j| v_j \]

which gives us $\bigcup_{i \in I} A_i = \bigcup_{j \in J} A_j$ as wanted, since a coordinate of either side is non-zero exactly when the corresponding element belongs to the union. In addition, $I$ and $J$ are disjoint by construction.
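The argument is constructive, so it can be turned into a short program. Here is a sketch in Python under my own naming (`kernel_vector`, `equal_unions` are hypothetical helpers, and indices are 0-based in the code): it finds a rational linear dependence by Gaussian elimination and splits the indices by the sign of the coefficients.

```python
from fractions import Fraction


def kernel_vector(columns):
    """Return a non-zero c with sum(c[j] * columns[j]) == 0.

    Assumes there are more vectors than coordinates, so a dependence exists.
    """
    n_rows, n_cols = len(columns[0]), len(columns)
    # Matrix whose j-th column is columns[j].
    m = [[Fraction(columns[j][i]) for j in range(n_cols)] for i in range(n_rows)]
    pivots = {}  # pivot column -> row holding its leading 1
    row = 0
    for col in range(n_cols):
        pivot = next((r for r in range(row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[row], m[pivot] = m[pivot], m[row]
        lead = m[row][col]
        m[row] = [x / lead for x in m[row]]
        for r in range(n_rows):
            if r != row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[row])]
        pivots[col] = row
        row += 1
    free = next(c for c in range(n_cols) if c not in pivots)
    c_vec = [Fraction(0)] * n_cols
    c_vec[free] = Fraction(1)
    for col, r in pivots.items():
        c_vec[col] = -m[r][free]
    return c_vec


def equal_unions(sets, n):
    """Disjoint index lists I, J (0-based) with equal unions, for n+1 subsets of [n]."""
    vectors = [[1 if i in s else 0 for i in range(1, n + 1)] for s in sets]
    c = kernel_vector(vectors)
    I = [j for j, x in enumerate(c) if x > 0]
    J = [j for j, x in enumerate(c) if x < 0]
    return I, J


sets = [{1, 2}, {2, 3}, {1, 3}, {1, 2, 3}]  # sample input with n = 3
I, J = equal_unions(sets, 3)
print(I, J)
```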

The Characteristic Polynomial and Eigenvalues


Exercise C.5.15 (Characteristic Polynomial). Let $K$ be an algebraically closed field. Let $M \in K^{n \times n}$ be an $n \times n$ matrix. Define its characteristic polynomial as $\chi_M = \det(M - X I_n)$. Its roots (counted with multiplicity) are called the eigenvalues $\lambda_1, \dots, \lambda_n \in K$ of $M$. Prove that $\det M$ is the product of the eigenvalues of $M$, and that $\operatorname{Tr} M$ is the sum of the eigenvalues. In addition, prove that $\lambda$ is an eigenvalue of $M$ if and only if there is a non-zero column vector $V$ such that $MV = \lambda V$ (in other words, $M$ acts like a homothety on $V$). Conclude that, if $f \in K[X]$ is a polynomial, the eigenvalues of $f(M)$ are $f(\lambda_i)$ (with multiplicity). (We are interpreting $1 \in K$ as $I_n$ for $f(M)$ here, i.e., if $f = X + 1$, $f(M)$ is $M + I_n$.) In particular, the eigenvalues of $M + \alpha I_n$ are $\lambda_1 + \alpha, \dots, \lambda_n + \alpha$, and the eigenvalues of $M^k$ are $\lambda_1^k, \dots, \lambda_n^k$.¹

Solution

The first part follows simply from expanding the determinant and using Vieta's formulas. For the second one, there is a non-zero vector $V$ such that $MV = \lambda V \iff (M - \lambda I_n)V = 0$ if and only if $\ker(M - \lambda I_n)$ is non-trivial, which is equivalent to $\det(M - \lambda I_n) = 0$. Now, note that $MV = \lambda V$ gives $M^k V = \lambda^k V$ and thus, by taking linear combinations, $f(M)V = f(\lambda)V$. This shows that the $f(\lambda)$ are eigenvalues of $f(M)$. To account for the multiplicity, note that we have established the result when the $f(\lambda_i)$ are distinct, and that the general result follows by density, analytical or algebraic. We present the proof by algebraic density (more technically, Zariski density) since it works over any field, which is similar to Remark C.3.7.

Note that the equality $\chi_{f(M)} = \prod_{i=1}^{n} (f(\lambda_i) - X)$ is a polynomial equality in the coefficients of $f$ and the coordinates of $M$ by the fundamental theorem of symmetric polynomials. Let $\Delta_f$ be the discriminant of $\prod_{i=1}^{n} (f(\lambda_i) - X)$, i.e. $\pm \prod_{i \neq j} (f(\lambda_i) - f(\lambda_j))$, which is again polynomial in the coefficients of $f$ and the coordinates of $M$. We have shown that

\[ \left( \chi_{f(M)} - \prod_{i=1}^{n} (f(\lambda_i) - X) \right) \Delta_f \]

always takes the value zero. Hence, it must be the zero polynomial. If we show that $\Delta_f$ is non-zero, we are hence done. Choosing $f = X$, this amounts to finding a matrix with distinct eigenvalues. This is not very hard: we can fix all coordinates of $M$ except one and let it vary. The determinant $\det(M - X I_n)$ then varies affinely in this coordinate, say it is $ua + v$, where $a$ is the varying coordinate. By induction, we can choose $u$ to have simple roots. A multiple root of $ua + v$ would also be a root of $u'$, but this is impossible for large $a$ unless this root is a common root of $u$ and $v$, which must thus have multiplicity one: this is a contradiction. (Alternatively, we can consider the matrix $J$ from Exercise C.5.6.) □

Exercise C.5.16 (Cayley-Hamilton Theorem). Prove that, for any $n \times n$ matrix $M$, $\chi_M(M) = 0$ where $\chi_M$ is the characteristic polynomial of $M$ and $0 = 0 I_n$. Conclude that, if every eigenvalue of $M$ is zero, $M$ is nilpotent, i.e. $M^k = 0$ for some $k$.²

Solution

Using Proposition C.3.7, we get


(M − XIn ) adj(M − XIn ) = adj(M − XIn )(M − XIn ) = χM In . (*)
We wish to substitute M for X in M − XIn , but we can only do that if M commutes with the
coefficients of adj(M − XIn ) (which are matrices). Note that this is the case since
XIn adj(M − XIn ) = adj(M − XIn )XIn
(X is a formal variable so commutes with everything, and In does as well) so M adj(M − XIn ) =
adj(M − XIn )M by (∗). The second part is obvious: if all eigenvalues of M are zero, then
χM = ±X n so M n = 0. 
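For readers who like to experiment, the theorem is easy to test on examples. The sketch below is only an illustration under my own choices: it computes the coefficients of $\det(X I_n - M)$, which differs from $\chi_M$ by the sign $(-1)^n$, with the Faddeev-LeVerrier recursion (a different technique from the adjugate argument above, and one that needs division by $1, \dots, n$, hence rational arithmetic), then evaluates the polynomial at $M$ by Horner's scheme.

```python
from fractions import Fraction


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def char_poly(m):
    """Coefficients c[0..n] of det(X*I - m) = sum c[i] X^i (Faddeev-LeVerrier)."""
    n = len(m)
    c = [Fraction(0)] * n + [Fraction(1)]
    aux = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # aux_k = m * aux_{k-1} + c[n-k+1] * I
        aux = mat_mul(m, aux)
        for i in range(n):
            aux[i][i] += c[n - k + 1]
        product = mat_mul(m, aux)
        c[n - k] = -sum(product[i][i] for i in range(n)) / k
    return c


def evaluate_at_matrix(c, m):
    """Horner evaluation of sum c[i] m^i, where the constant c[0] stands for c[0] * I."""
    n = len(m)
    result = [[Fraction(0)] * n for _ in range(n)]
    for coef in reversed(c):
        result = mat_mul(result, m)
        for i in range(n):
            result[i][i] += coef
    return result


M = [[2, 1, 0], [0, 2, 1], [1, 0, 3]]  # sample matrix
coefficients = char_poly(M)
print(coefficients)
print(evaluate_at_matrix(coefficients, M))
```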

¹ One of the advantages of the characteristic polynomial is that we are able to use algebraic number theory, or more generally polynomial theory, to deduce linear algebra results, since the eigenvalues say a lot about a matrix (if we combine this with the Cayley-Hamilton theorem). See for instance Exercise C.5.18 and the third solution of Exercise C.5.19.
² Note that if, in the definition of $\chi_M$, we replace $\det$ by an arbitrary multilinear form in the coordinates of $M$ (such as the permanent $\operatorname{perm}(A) = \sum_{\sigma \in S_n} a_{1,\sigma(1)} \cdots a_{n,\sigma(n)}$), the result becomes false, so we cannot just say that "$\chi_M(M) = \det(M - M I_n) = \det 0 = 0$" (this "proof" is nonsense because the scalar 0 is not the matrix 0, but the point is that this intuition is fundamentally incorrect).

Exercise C.5.19. Let $p$ be an odd prime number, and $G$ be a finite (multiplicative) group of $n \times n$ matrices with integer coordinates. Prove that two distinct elements of $G$ stay distinct modulo $p$. What if the elements of $G$ only have algebraic integer coordinates and $p$ is an algebraic integer with all conjugates greater than 2 in absolute value?

Solution

We will present three solutions, in increasing order of non-elementariness, and of generality.


Suppose that the reduction modulo $p$ (group) morphism $\varphi$ is not injective, i.e. has non-trivial kernel (see Exercise A.2.13∗; this is really trivial, I'm only using this language so that you familiarise yourself with it), say $I := I_n \neq A \in \ker \varphi$, i.e. $A = I + pB$ for some non-zero $B$. Since $G$ is a finite group, we have $A^m = I$ for $m = |G|$ by Lagrange's theorem (Exercise A.3.17†). We will show that this is impossible (in fact it is equivalent to the problem: if it were possible, the group generated by $A$ would be a counterexample). (Note that all solutions will use the assumption that $p \neq 2$ somewhere, and for a good reason: $-I \equiv I \pmod 2$.)
Here is the first solution, which works only for the first part of the problem. Let $k$ be the greatest integer such that $B/p^k$ has integer coordinates. Suppose first that $p \nmid m$. Then, modulo $p^{k+2}$, using the binomial expansion (which we can use since $I$ and $pB$ commute), we have

\[ (I + pB)^m \equiv I + mpB \not\equiv I \pmod{p^{k+2}} \]

which is a contradiction. We will now replace $B$ by $C$ such that $(I + pB)^p = I + pC$. That way, $m$ gets replaced by $m/p$, and by iterating this process we will eventually reach a $p \nmid m$ which is a contradiction. However, we need to prove that $C \neq 0$ too. Thus, suppose that $(I + pB)^p = I$. We then have

\[ (I + pB)^p \equiv I + p \cdot pB + \frac{p(p-1)}{2} p^2 B^2 \equiv I + p^2 B \not\equiv I \pmod{p^{k+3}} \]

since $p$ is odd, which is also a contradiction.
We now present the second solution. Let $|M|$ denote the maximum of the absolute values of the coordinates of a matrix $M$. Since $A^m = I$ for some $m$, $|A^k|$ is bounded when $k$ varies, say by $C$. We have

\[ |(pB)^r| = |(I - A)^r| \leq \sum_{k=0}^{r} \binom{r}{k} |A^k| \leq C 2^r. \]

However, the left-hand side is divisible by $p^r$ so is at least $p^r$ unless it is 0. Since $p > 2$, by taking $r$ sufficiently large, we get $|B^r| = 0$, i.e. $B^r = 0$: $B$ is nilpotent. When $p$ is only an algebraic integer with all conjugates greater than 2 in absolute value, the same argument works: the coordinates of $B^r = (A - I)^r / p^r$ are algebraic integers with absolute value at most $C 2^r / |p|^r$, so less than 1 for large $r$. However, the same goes for their conjugates by assumption. Since the only algebraic integer whose conjugates are all strictly in the unit disk is 0 (by looking at the constant coefficient of its minimal polynomial), we get $B^r = 0$ for large $r$ as wanted.

Consider $k$ such that $B^k \neq 0$ but $B^{k+1} = 0$. Since $B \neq 0$, we have $k \geq 1$. By expanding $(I + pB)^m = I$, we get

\[ pmB + \frac{m(m-1)}{2} p^2 B^2 + \dots = 0. \]

Finally, by multiplying this equation by $B^{k-1}$, we get $pmB^k = 0$ which implies $m = 0$ and is a contradiction.
Finally, the third solution uses more advanced linear algebra. Let $\beta$ be an eigenvalue of $B$. Then, $\alpha = 1 + p\beta$ is an eigenvalue of $A$ which is congruent to 1 modulo $p$. Further, since $A^m = I$, we have $\alpha^m = 1$ so it is a root of unity. This implies that $\beta = \frac{\alpha - 1}{p}$ has modulus less than 1 since $p > 2$. This is also true for all its conjugates. Thus, the constant coefficient of its minimal polynomial must be 0, i.e. $\beta = 0$. Hence, all eigenvalues of $B$ are zero, which implies that $B$ is nilpotent by Exercise C.5.16. We finish as in the previous solution. □

Miscellaneous
Exercise C.5.20 (USA TST 2019). For which integers n does there exist a function f : Z/nZ → Z/nZ
such that
f, f + id, f + 2id, . . . , f + mid
are all bijections?

Solution

There exists such a function if and only if all prime factors of $n$ are greater than $m + 1$. In that case, it is clear that $f = \mathrm{id}$ works. Now suppose that $n$ has a prime factor $p \leq m + 1$. Pick $p$ to be minimal, and suppose without loss of generality that $m = p - 1$. For $1 \leq k \leq m$, since $f$ and $f + k\,\mathrm{id}$ are both bijections, we have

\[ \sum_{x=1}^{n} f(x)^m \equiv \sum_{x=1}^{n} (f(x) + kx)^m = \sum_{i=0}^{m} \binom{m}{i} k^{m-i} \sum_{x=1}^{n} f(x)^i x^{m-i}, \]

i.e.

\[ \sum_{i=0}^{m-1} \binom{m}{i} k^{m-i} \sum_{x=1}^{n} f(x)^i x^{m-i} \equiv 0. \]

Thus, we have a linear system with coefficients $k^{m-i}$ and solution $x_i = \binom{m}{i} \sum_{x=1}^{n} f(x)^i x^{m-i}$. By Vandermonde, the determinant of the matrix $M = (k^{m-i})_{k,i}$ is, up to sign,

\[ m! \prod_{1 \leq i < j \leq m} (j - i) \]

which is invertible modulo $n$ since $p$ is the smallest prime factor of $n$ by assumption and $m = p - 1$. Thus, by Exercise C.3.18∗, $M$ is invertible modulo $n$ which implies that our system has exactly one solution. Since $x_0 \equiv \dots \equiv x_{m-1} \equiv 0$ is of course a solution (the trivial one), this must in fact be the case. In particular,

\[ x_0 = \sum_{x=1}^{n} x^{p-1} \]

is zero modulo $n$. We will prove that this is impossible. If $p = 2$, this sum is congruent to $\frac{n(n-1)}{2}$, so $n \mid \frac{n(n-1)}{2}$, which implies that $n$ is odd and is a contradiction.

Now suppose that $p$ is odd. Let $k = v_p(n)$. Since this sum is congruent to

\[ \frac{n}{p^k} \sum_{x=1}^{p^k} x^{p-1} \]

modulo $p^k$, it suffices to prove that $p^k \nmid \sum_{x=1}^{p^k} x^{p-1}$. Let $g$ be a primitive root modulo $p^k$; there exists one by Exercise 3.5.18†. Then,

\[ \sum_{x \in (\mathbb{Z}/p^k\mathbb{Z})^\times} x^{p-1} \equiv \sum_{j=0}^{p^{k-1}(p-1)-1} g^{j(p-1)} = \frac{g^{p^{k-1}(p-1)^2} - 1}{g^{p-1} - 1}. \]

By LTE 3.4.3, the $p$-adic valuation of this is $k - 1 < k$. To take care of the terms of the sum which are divisible by $p$, simply note that

\[ v_p\left( \sum_{v_p(x) = \ell,\ x \in \mathbb{Z}/p^k\mathbb{Z}} x^{p-1} \right) = (p-1)\ell + v_p\left( \sum_{x \in (\mathbb{Z}/p^{k-\ell}\mathbb{Z})^\times} x^{p-1} \right) = (p-1)\ell + k - \ell - 1 \geq k \]

by our previous computation. □
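For small values the answer can be confirmed by exhaustive search. The following Python sketch (helper names are mine) tries every bijection $f$ of $\mathbb{Z}/n\mathbb{Z}$ and compares the outcome with the condition "all prime factors of $n$ are greater than $m + 1$". It is a brute-force check for tiny $n$ only, not a proof.

```python
from itertools import permutations


def exists_function(n, m):
    """Search for f on Z/nZ such that f, f + id, ..., f + m*id are all bijections."""
    for f in permutations(range(n)):  # f itself has to be a bijection
        if all(
            len({(f[x] + k * x) % n for x in range(n)}) == n
            for k in range(1, m + 1)
        ):
            return True
    return False


def all_prime_factors_exceed(n, bound):
    """True when every prime factor of n is greater than bound."""
    return all(n % d != 0 for d in range(2, bound + 1))


for n in range(1, 8):
    print(n, [exists_function(n, m) for m in range(6)])
```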



Exercise C.5.21 (Finite Fields Kakeya Conjecture, Zeev Dvir). Let $n \geq 1$ be an integer and $\mathbb{F}$ a finite field. We say a set $S \subseteq \mathbb{F}^n$ is a Kakeya set if it contains a line in every direction, i.e., for every $y \in \mathbb{F}^n$, there exists an $x \in \mathbb{F}^n$ such that $S$ contains the line $x + y\mathbb{F}$. Prove that any polynomial of degree less than $|\mathbb{F}|$ vanishing on a Kakeya set must be zero. Deduce that there is a constant $c_n > 0$ such that, for any finite field $\mathbb{F}$, any Kakeya set of $\mathbb{F}^n$ has cardinality at least $c_n |\mathbb{F}|^n$.

Solution

Let $q$ be the cardinality of $\mathbb{F}$. The proof will be in two steps. Suppose that $f$ is a non-zero polynomial of degree $d < q$ vanishing on a Kakeya set $S$. Fix any $y \in \mathbb{F}^n$. Then, for some $x \in \mathbb{F}^n$, $f(x + ty) = 0$ for any $t \in \mathbb{F}$. The polynomial $f(x + Ty)$ has more roots than its degree so is zero. Let $g$ be the homogeneous part of degree $d$ of $f$, i.e. the polynomial formed by the monomials of degree $d$ of $f$. Notice that the coefficient of $T^d$ in $f(x + Ty)$ is exactly $g(y)$. Hence, $g(y) = 0$ for any $y \in \mathbb{F}^n$, which implies that $g = 0$ by Exercise A.1.7∗. This contradicts the assumption that $f$ had degree $d$.

For the second step, note that the dimension of the vector space $V$ of polynomials of degree at most $q - 1$ is $\binom{n+q-1}{n}$. Indeed, the monomials $X_1^{d_1} \cdots X_n^{d_n}$ for $d_1 + \dots + d_n \leq q - 1$ form a basis of this space. However, the number of such tuples is the same as the number of ways to choose $n$ elements from $[n + q - 1]$: choose $a_1 < \dots < a_n$ and decide that $d_1 = a_1 - 1$, $d_1 + d_2 = a_2 - 2$, etc., until $d_1 + \dots + d_n = a_n - n$ (this technique is usually called stars and bars because you have $q - 1$ stars and $n$ bars used to separate them). Now consider the linear map $T : V \to \mathbb{F}^{|S|}$ defined by $T(f) = (f(s))_{s \in S}$. If $|S| < \dim V$, it must have a non-trivial kernel by Proposition C.1.2 (or the rank-nullity theorem). This contradicts the first step.

We conclude that

\[ |S| \geq \binom{n+q-1}{n} = \frac{q(q+1) \cdots (n+q-1)}{n!} \geq \frac{q^n}{n!} \]

so we can take $c_n = \frac{1}{n!}$. □
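The stars and bars count in the second step is easy to confirm by enumeration. Here is a minimal Python sketch (the function name is mine) that lists the exponent tuples $(d_1, \dots, d_n)$ with $d_1 + \dots + d_n \leq q - 1$ and compares their number with $\binom{n+q-1}{n}$ and with the lower bound $q^n / n!$.

```python
from itertools import product
from math import comb, factorial


def count_monomials(n, q):
    """Number of monomials in n variables of total degree at most q - 1."""
    return sum(1 for degs in product(range(q), repeat=n) if sum(degs) <= q - 1)


for n in range(1, 4):
    for q in (2, 3, 4, 5):
        count = count_monomials(n, q)
        print(n, q, count, comb(n + q - 1, n), q**n / factorial(n))
```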

Exercise C.5.22 (Siegel's Lemma). Let $a = (a_{i,j})$ be an $m \times n$ matrix with integer coordinates. Prove that, if $n > m$, the system

\[ \sum_{j=1}^{n} a_{i,j} x_j = 0 \]

for $i = 1, \dots, m$ always has a non-zero solution in integers with

\[ \max_i |x_i| \leq \left( n \max_{i,j} |a_{i,j}| \right)^{\frac{m}{n-m}}. \]

Solution

Let $M = \max_{i,j} |a_{i,j}|$. Fix a constant $N$ to be chosen later. Suppose that the integers $a_1, \dots, a_k$ are negative while $a_{k+1}, \dots, a_n$ are non-negative. Then, for any $(x_1, \dots, x_n) \in [[0, N]]^n$, we have

\[ N(a_1 + \dots + a_k) \leq a_1 x_1 + \dots + a_n x_n \leq N(a_{k+1} + \dots + a_n). \]

Thus, the expression $a_1 x_1 + \dots + a_n x_n$ can take at most $1 + N(|a_1| + \dots + |a_n|)$ values.

Now, we return to the problem. Set $A = (a_{i,j})$. We have shown that, when $X \in [[0, N]]^n$, each row of $AX$ can take at most $1 + NnM$ values. Thus, when $X$ ranges through $[[0, N]]^n$, $AX$ takes at most $(1 + NnM)^m < (1 + N)^m (nM)^m$ values. Since $X$ can take $(1 + N)^n$ values, if

\[ (1 + N)^n > (1 + N)^m (nM)^m \iff (1 + N)^{n-m} > (nM)^m \iff 1 + N > (nM)^{\frac{m}{n-m}} \]

then one value will be taken twice by the pigeonhole principle, say $AX = AY$ with $X \neq Y$. This yields $AZ = 0$ for some non-zero $Z = X - Y \in [[-N, N]]^n$ as wanted. It is clear that $N = \left\lfloor (nM)^{\frac{m}{n-m}} \right\rfloor$ works and gives us what we want. □
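The pigeonhole argument is effective, if very slow: one can literally search the box $[[0, N]]^n$ for a collision. The Python sketch below does this for a tiny example. The names `siegel_bound` and `small_solution` and the sample matrix are my own, the matrix is assumed to be non-zero with $n > m$, and the running time grows like $(1 + N)^n$, so this is only meant for toy inputs.

```python
from itertools import product


def siegel_bound(n, m, M):
    """Largest integer N with N^(n-m) <= (n*M)^m, i.e. floor((n*M)^(m/(n-m)))."""
    target = (n * M) ** m
    N = 0
    while (N + 1) ** (n - m) <= target:
        N += 1
    return N


def small_solution(A):
    """Find a non-zero integer Z with A Z = 0 and max |Z_i| <= N by pigeonhole."""
    m, n = len(A), len(A[0])
    M = max(abs(a) for row in A for a in row)
    N = siegel_bound(n, m, M)
    seen = {}  # value of A X -> first X in the box producing it
    for X in product(range(N + 1), repeat=n):
        image = tuple(sum(a * x for a, x in zip(row, X)) for row in A)
        if image in seen:
            Y = seen[image]
            return [x - y for x, y in zip(X, Y)], N
        seen[image] = X
    raise AssertionError("no collision: check that n > m and that A is non-zero")


A = [[1, 2, 0, 1], [0, 1, 1, -1]]  # sample 2 x 4 system
Z, N = small_solution(A)
print(Z, N)
```

The bound is computed with integer arithmetic rather than a floating-point power, to avoid rounding errors when $(nM)^{m/(n-m)}$ is an integer.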

Exercise C.5.24. How many invertible $n \times n$ matrices with coefficients in $\mathbb{F}_p$ are there? Deduce the number of (additive) subgroups of cardinality $p^m$ that $(\mathbb{Z}/p\mathbb{Z})^n$ has.

Solution

We proceed inductively to determine the number of tuples of $k$ linearly independent vectors. At first, we can pick any non-zero vector in $\mathbb{F}_p^n$; there are thus $p^n - 1$ choices. Then, we can pick any vector which is not a linear combination of the first one; there are thus $p^n - p$ possible choices. Continuing like that, if we have picked $k$ vectors, their linear combinations generate $p^k$ elements so we have $p^k$ elements to avoid and thus $p^n - p^k$ possibilities for the next vector. In conclusion, the number of invertible $n \times n$ matrices with coefficients in $\mathbb{F}_p$ is

\[ (p^n - 1)(p^n - p) \cdots (p^n - p^{n-1}). \]

For the second part, note that a subgroup of $(\mathbb{Z}/p\mathbb{Z})^n$ is an $\mathbb{F}_p$-vector space, and the fact that it has $p^m$ elements means that its dimension is $m$. Thus, we want to count the subspaces of $\mathbb{F}_p^n$ of dimension $m$. Here is how we will proceed: we count the number of tuples of $m$ linearly independent elements, and divide this by the number of tuples which represent a fixed subspace. We have already computed the first one: it is

\[ (p^n - 1) \cdots (p^n - p^{m-1}). \]

We have also determined the second: if we fix a subspace of dimension $m$, it has

\[ (p^m - 1) \cdots (p^m - p^{m-1}) \]

bases. We conclude that $(\mathbb{Z}/p\mathbb{Z})^n$ has

\[ \frac{(p^n - 1) \cdots (p^n - p^{m-1})}{(p^m - 1) \cdots (p^m - p^{m-1})} \]

subgroups of cardinality $p^m$. □
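Both formulas can be checked against brute force for very small $n$ and $p$. This is a sketch with names of my own choosing. It uses `pow(x, -1, p)` for modular inverses, which needs Python 3.8 or later, and the exhaustive count visits all $p^{n^2}$ matrices, so it is only usable for tiny cases.

```python
from itertools import product


def rank_mod_p(rows, p):
    """Rank over F_p of the matrix with the given rows, by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                factor = rows[r][col]
                rows[r] = [(x - factor * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def count_invertible_brute(n, p):
    vectors = list(product(range(p), repeat=n))
    return sum(1 for rows in product(vectors, repeat=n) if rank_mod_p(rows, p) == n)


def count_invertible_formula(n, p):
    result = 1
    for k in range(n):
        result *= p**n - p**k
    return result


def count_subspaces(n, m, p):
    """Number of subgroups of cardinality p^m of (Z/pZ)^n, by the formula above."""
    numerator = denominator = 1
    for k in range(m):
        numerator *= p**n - p**k
        denominator *= p**m - p**k
    return numerator // denominator


print(count_invertible_brute(2, 3), count_invertible_formula(2, 3))
print(count_subspaces(3, 1, 2), count_subspaces(4, 2, 2))
```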

Exercise C.5.25†. Let $K$ be a field, and let $S \subseteq K^2$ be a set of $n$ points. Prove that there exists a non-zero polynomial $f \in K[X, Y]$ of degree at most $\sqrt{2n}$ such that $f(x, y) = 0$ for every $(x, y) \in S$.

Solution

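Write $f = \sum_{i,j} a_{i,j} X^i Y^j$ where the sum is over the $i, j$ such that $i + j \leq \sqrt{2n}$. By stars and bars, there are $\binom{2 + \lfloor \sqrt{2n} \rfloor}{2}$ such pairs: we choose the two values $i$ and $i + j + 1$ in $[[0, \lfloor \sqrt{2n} \rfloor + 1]]$. Since

\[ \binom{2 + \lfloor \sqrt{2n} \rfloor}{2} = \frac{(1 + \lfloor \sqrt{2n} \rfloor)(2 + \lfloor \sqrt{2n} \rfloor)}{2} > \frac{\sqrt{2n} \cdot \sqrt{2n}}{2} = n, \]

the linear system $f(x, y) = 0$ for $(x, y) \in S$ has more unknowns than equations, so there is a non-zero solution. □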

Exercise C.5.26† . Given an m×n matrix M , we define its row rank as the maximal number of linearly
independent rows of M . Similarly, its column rank is the maximal number of linearly independent
columns of M . Prove that these two numbers are the same, called the rank of M and denoted rank M .

Solution

Without loss of generality, by removing some rows if necessary, suppose that all m rows of M
are linearly independent. We will prove that M has at least m linearly independent columns,
which implies that the column rank is greater than or equal to the row rank (if we add rows the
linearly independent columns stay linearly independent). Taking the transpose then yields the
reverse inequality, so they are in fact both equal.

Suppose for the sake of a contradiction that $M$ has at most $m - 1$ linearly independent columns, say $M^1, \dots, M^k$. If we consider the $m$ vectors corresponding to the rows of $[M^1, \dots, M^k]$, they are linearly dependent by Proposition C.1.2. Now, if we add a column $M^{k+1}$ which is a linear combination of $M^1, \dots, M^k$, the vectors corresponding to the rows of $[M^1, \dots, M^{k+1}]$ stay linearly dependent. Indeed, if

\[ M^{k+1} = \sum_{i=1}^{k} a_i M^i \]

and the linear dependence of the rows of $[M^1, \dots, M^k]$ is

\[ \sum_{i=1}^{m} b_i m_{i,j} = 0 \]

for any $j \in [k]$, then

\[ \sum_{i=1}^{m} b_i m_{i,k+1} = \sum_{i=1}^{m} b_i \sum_{j=1}^{k} a_j m_{i,j} = \sum_{j=1}^{k} a_j \sum_{i=1}^{m} b_i m_{i,j} = 0. \]

In other words, a linear dependence between the rows extends to a linear dependence with the same coefficients between the rows when we add a linearly dependent column. Continuing like that shows that, finally, the rows of $[M^1, \dots, M^n] = M$ are linearly dependent, contradicting our initial assumption. □
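Here is a quick numerical illustration, with hypothetical helper names: the row rank is computed by Gaussian elimination over the rationals, and the column rank is simply the row rank of the transpose.

```python
from fractions import Fraction


def rank(matrix):
    """Row rank: number of pivots after Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


M = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 0, 1, 0]]  # sample 3 x 4 matrix
print(rank(M), rank(transpose(M)))
```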

Exercise C.5.28 (Nakayama's Lemma). Let $R$ be a commutative ring, $I$ an ideal of $R$, i.e. an $R$-module inside $R$,³ and $M$ a finitely-generated $R$-module. Suppose that $IM = M$, where $IM$ does not mean the set of products of elements of $I$ and $M$, but instead the $R$-module it generates (i.e. the set of linear combinations of products). Prove that there exists an element $r \equiv 1 \pmod I$ of $R$ such that $rM = 0$.

Solution

Let $\alpha_1, \dots, \alpha_n$ be generators of $M$. We have a system of equations as follows:

\[ \alpha_i = \sum_{j=1}^{n} \beta_{i,j} \alpha_j \]

for $i = 1, \dots, n$ and $\beta_{i,j} \in I$. Let $B = (\beta_{i,j})$ and $A = (\alpha_i)$ (as a column vector), so that $BA = A$, i.e. $(I_n - B)A = 0$. We claim that $r = \det(I_n - B)$ works. First, note that $r \equiv 1 \pmod I$ since $I_n - B \equiv I_n \pmod I$. By Proposition C.3.7, we have

\[ r I_n = \operatorname{adj}(I_n - B)(I_n - B) \]

so, after right-multiplying by $A$, we get $rA = 0$. This implies that $rM = 0$ as wanted. □

³ See also Proposition C.3.5. A module is like a vector space but the underlying structure is not necessarily a field (in this case it's $R$).


Further Reading

Here are some books I like⁴. As said in the foreword, I particularly recommend Andreescu-Dospinescu [1, 2], Ireland-Rosen [11], and Murty [19].

For classical algebraic number theory, I suggest Murty [19]. I’ll just define ideals here because they
are not defined in the book (but no prior exposure to them is assumed, so a Wikipedia search would
also do). An ideal I of a commutative ring R is simply a set which is closed under addition, and
closed under multiplication by elements of R. In number fields, ideals are all finitely generated, i.e.
I = a1 R + . . . + an R for some a1 , . . . , an (see chapter 5 of Murty). As an exercise you can prove that
the ideals of Z have the form nZ for some n. Milne [18] (pages 7–10) has good motivation on why to
consider ideals.

For p-adic analysis, I wholeheartedly recommend the addendum 3B of SFTB first, and then the
excellent book by Cassels [7], which, although a bit old⁵, is full of number-theoretic applications like
the one in Section 8.6. Robert [23] is also very good, but focuses a lot more on analysis than on number
theory, and assumes a fair amount of topology.⁶ Also, Borevich-Shafarevich [6] has a great proof of
Thue’s theorem 7.4.3 using local methods (p-adic methods).

The elementary theory of elliptic curves is also of similar flavour to the topics of the present book,
see Silverman-Tate [26] for a wonderful introduction.

For more on polynomials, see Prasolov [21] (things like Appendix A). See also Rosen [24] for number
theory in function fields, i.e. algebraic number theory but with polynomials over Fq instead of rational
integers!

For abstract algebra (including Galois theory), I recommend Lang [15]. For even more advanced
algebra, see his other book [13].

For linear algebra, I also recommend Lang [14] because I like everything I read by him. For
applications of linear algebra to combinatorics, see chapter 12 of PFTB [1], which assumes approximately
Appendix C as background, and Stanley [27], which assumes approximately Lang's book as
background.

⁴ Disclaimer: I haven't finished reading all of them.
⁵ It's not typeset in LaTeX!
⁶ The first chapter is particularly topologically heavy but it gets better afterwards. I would suggest skipping it at first since it just defines the p-adic numbers, and having a look at the other chapters if you're interested in the analytical theory.
Further Reading

[1] T. Andreescu and G. Dospinescu. Problems from the Book. 2nd ed. XYZ Press, 2010.
[2] T. Andreescu and G. Dospinescu. Straight from the Book. XYZ Press, 2012.
[6] Z. I. Borevich and I. Shafarevich. Number Theory. Academic Press, 1964.
[7] J. W. S. Cassels. Local fields. Cambridge University Press, 1986.
[11] K. Ireland and M. Rosen. A Classical Introduction to Modern Number Theory. 2nd ed. Vol. 84. Graduate Texts in Mathematics. Springer-Verlag, 1990.
[13] S. Lang. Algebra. 5th ed. Vol. 211. Graduate Texts in Mathematics. Springer-Verlag, 2002.
[14] S. Lang. Linear Algebra. 3rd ed. Undergraduate Texts in Mathematics. Springer-Verlag, 1987.
[15] S. Lang. Undergraduate Algebra. 3rd ed. Undergraduate Texts in Mathematics. Springer-Verlag,
1987.
[18] J. S. Milne. Algebraic Number Theory. url: https://www.jmilne.org/math/CourseNotes/ant.html. (accessed: 26.09.2021).
[19] M. R. Murty and J. Esmonde. Problems in Algebraic Number Theory. 2nd ed. Vol. 190. Graduate
Texts in Mathematics. Springer-Verlag, 2005.
[21] V. V. Prasolov. Polynomials. Vol. 11. Algorithms and Computation in Mathematics. Springer-
Verlag, 2004.
[23] A. M. Robert. A Course in p-adic Analysis. Vol. 198. Graduate Texts in Mathematics. Springer-
Verlag, 2000.
[24] M. Rosen. Number Theory in Function Fields. Vol. 210. Graduate Texts in Mathematics. Springer-
Verlag, 2002.
[26] J. H. Silverman and J. T. Tate. Rational Points on Elliptic Curves. 2nd ed. Undergraduate Texts
in Mathematics. Springer-Verlag, 2015.
[27] R. Stanley. Algebraic Combinatorics. 2nd ed. Undergraduate Texts in Mathematics. Springer-
Verlag, 2018.

Bibliography

[3] G. M. Bergman. Luroth's Theorem and some related results, developed as a series of exercises. url: https://math.berkeley.edu/~gbergman/grad.hndts/. (accessed: 26.09.2021).
[4] Y. Bilu, Y. Bugeaud, and M. Mignotte. The Problem of Catalan. Springer-Verlag, 2014.
[5] A. B. Block. The Skolem-Mahler-Lech Theorem. url: http://www.columbia.edu/~abb2190/Skolem-Mahler-Lech.pdf. (accessed: 26.09.2021).
[8] E. Chen. A trailer for p-adic analysis, first half: USA TST 2003. url: https://blog.evanchen.cc/2018/10/10/a-trailer-for-p-adic-analysis-first-half-usa-tst-2003/. (accessed: 26.09.2021).
[9] E. Chen. Napkin. url: https://web.evanchen.cc/napkin.html. (accessed: 26.09.2021).
[10] K. Conrad. Kummer's lemma. url: https://kconrad.math.uconn.edu/blurbs/gradnumthy/kummer.pdf. (accessed: 26.09.2021).
[12] M. Klazar. Størmer's solution of the unit equation x − y = 1. url: https://kam.mff.cuni.cz/~klazar/stormer.pdf. (accessed: 26.09.2021).
[16] P-S. Loh. Algebraic Methods in Combinatorics. url: https://www.math.cmu.edu/~ploh/docs/math/mop2009/alg-comb.pdf. (accessed: 26.09.2021).
[17] D. Masser. Auxiliary Polynomials in Number Theory. Cambridge Tracts in Mathematics. Cam-
bridge University Press, 2016.
[20] M. R. Murty and N. Thain. “Primes in Certain Arithmetic Progressions”. In: Functiones et
Approximatio 35 (2006), pp. 249–259. doi: 10.7169/facm/1229442627.
[22] P. Ribenboim. 13 Lectures on Fermat’s Last Theorem. Springer-Verlag, 1979.
[25] A. Schinzel. “On Primitive Prime Factors of a^n − b^n”. In: Mathematical Proceedings of the Cambridge Philosophical Society 58 (4 1962), pp. 556–. doi: 10.1017/s0305004100040561.
[28] C. L. Stewart. “On the greatest prime factor of terms of a linear recurrence sequence”. In: Rocky
Mountain Journal of Mathematics 35 (2 1985), pp. 599–608. doi: 10.1216/RMJ-1985-15-2-599.
[29] R. Thangadurai. On the Coefficients of Cyclotomic Polynomials. url: https://www.bprim.org/sites/default/files/th.pdf. (accessed: 26.09.2021).
[30] N. Tsopanidis. “The Hurwitz and Lipschitz Integers and Some Applications”. PhD thesis. Facul-
dade De Ciências da Universidade do Porto, 2020.
[31] S. Weintraub. Galois Theory. 2nd ed. Universitext. Springer-Verlag, 2009.

Index

Symbols

5/8 theorem 158, 383

A

abelian
  field extension 101, 155, 307
  group 98, 103, 155, 158, 384
absolute convergence 263
action
  of a group 74, 158, 285, 385, 399
algebra
  closure 149, 166, 193, 394, 412
  fundamental theorem of 149, 163
algebraic
  closure 64, 90, 240
  field extension 94
  independence 166, 396
  integer 14
  number 14
AMM 21, 23
APMO 80, 149
Archimedean 126, 140, 353
arithmetic function 71, 271
  multiplicative 71, 231, 271
Artin-Schreier theorem 106, 316
associate 29
  left or right 38
associative 37, 143, 173, 219
automorphism 28, 97, 211

B

Bézout
  domain 30, 83, 114
    left or right 38
  lemma 29, 145, 256, 332, 358, 372, 390
  theorem 166, 395
Bézout’s lemma 262
BAMO 120
basis 91, 169
  canonical 176, 186
  changes of bases 174
  integral 108, 326
  transcendence 166, 397
binomial
  coefficient 74, 80, 86, 282
  expansion 51, 58, 78, 81, 115, 127
  series 126
binomial expansion 247, 252, 414
BMO 1 120, 333
Bolzano-Weierstrass theorem 132, 338
Brazil
  MO 54, 65, 141, 251, 360
Bulgaria
  MO 120

C

Capelli 106, 314
Carmichael’s theorem 74, 283
Cauchy
  equation 94, 172, 303
  sequence 126
  theorem 103
Cauchy-Mirimanoff polynomials 381
Cauchy-Schwarz inequality 22
Cayley
  theorem 105, 311
Cayley-Hamilton theorem 193, 413
center 383
centraliser 286, 383
characteristic 147
  of a ring 57, 58, 69, 93, 131, 144, 152, 162, 171, 189, 265
  polynomial
    of a linear recurrence 138, 189
    of a matrix 193, 412
Chebotarev density theorem 73, 277
Chebyshev polynomial 66, 255
Chevalley-Warning Theorem 72, 276
China
  MO 41
  TST 72, 86, 87, 142, 157, 159, 273, 292, 367, 386
Chinese remainder theorem 54, 80, 81, 231, 242, 251, 254, 297, 321
circulant determinant 192, 328, 410
class equation (of a group action) 74, 285
class number
  of a number field 55, 257
closure
  algebraic 64, 90, 240
  Galois 97
  integral 24, 181, 205, 404
column operations 179
comatrix 187
commutative
  group 155
  operation 37, 143, 173
  ring 18, 58, 153, 188
compact
  sequentially 132
complete homogeneous 165
completion 126
complex field
  cubic 116
  quadratic 109
composite field 100, 307
compositum 100, 307
congruence 16
conjugate
  complex 15, 17
  of an algebraic number 17
  quadratic 28
  quaternion 37
constructible
  number 106, 317
content (of a polynomial) 76
convergence 22
  absolute 263
  p-adic 125
convex hull 159, 388
coset 98, 305, 309
cosine 23–25, 47, 56, 66, 202, 203, 207
  law 202
  quadratic 47, 244
  rational 15
cyclic
  field extension 105, 312
  group 106, 155, 317
cyclotomic
  field 25, 55, 97, 99, 207, 255
    quadratic subfield 107, 323
    units 55, 256
  polynomial 22, 34, 43, 65, 73, 78, 216, 282
  ring of integers 55, 255

D

d’Alembert-Gauss theorem 164
Dedekind 107, 321
  lemma 105, 311
  zeta function 226
degree
  of a field extension 90
  of a polynomial 144
  of an algebraic number 16
density 127
derangements 184
derivative 147, 172
  discrete 157, 377
determinant 177
  circulant 192, 328, 410
  norm 187
  resultant 165, 393
  Vandermonde 182
dimension 143, 170
  finite 170
  transcendence 166, 397
Dirichlet
  approximation theorem 110
  convolution 71, 271
  L-function 226
  series 71, 271
  theorem 88, 276, 297
  theorem (on arithmetic progressions) 49, 67, 86, 88, 298
  unit theorem 119, 121, 337
discrete 343
discrete derivative 157, 377
discriminant 21, 72, 108, 275
distance 125, 140, 355
distributivity 143, 174, 403
divisibility 16, 30
  left or right 38
  of polynomials 145
domain 153
  Bézout 30
  Euclidean 31
  integral 30, 153, 213
  principal ideal 42
  unique factorisation 29

E

effective 116, 136
EGMO 120
Ehrenfeucht’s criterion 159, 389
eigenvalues 193, 412
Eisenstein
  criterion 78, 92
  integers 34
ELMO 121, 340
embedding 123
  complex 117
  of a field extension 94
  real 117
equivalence relation 38, 222
equivalent 141, 356
Euclid 67
  algorithm 293
  lemma 29
Euclidean
  algorithm 93, 145
  division 17, 31, 114, 144, 154
  domain 31, 75, 83, 138
  function 31, 38
    left or right 38
Euler 41, 226
  criterion 68
extremal value theorem 140, 355

F

Fermat 92
  last theorem 34, 41, 55, 227, 257
    for polynomials 159, 387
  little
    theorem 386
  little theorem 60, 102, 103, 144, 147–149, 281, 380
  two square theorem 34
Fibonacci sequence 42, 54, 72, 74, 121, 235, 273, 283, 339
field 143, 153
  algebraically closed 149, 166, 193, 394, 412
  complex
    cubic 116
    quadratic 109
  extension 90
  finite 57, 93, 99, 136, 171, 366
  fixed 99
  of fractions 153
  real
    quadratic 109
    totally 117
field extension 90
  abelian 101, 155, 307
  algebraic 94
  cyclic 105, 312
  finite 92
  Galois 97
  separable 94
  solvable
    radicals 106, 317
    real radicals 107, 320
  tower 91, 97
finite
  field 366
finite field 57, 93, 99, 136, 171
fixed field 99
Fleck’s congruences 56, 258
formal 123, 132, 144, 164
  power series 144, 163
France
  TST 53, 248
Frobenius
  morphism 366
Frobenius morphism 47, 58, 71, 79, 101, 242
fundamental theorem
  of algebra 149, 163
  of finitely generated abelian groups 158, 384
  of Galois theory 99
  of symmetric polynomials 161
  symmetric polynomials 19
fundamental unit 109

G

Galois
  closure 97
  correspondence 99
  field extension 97
  fundamental theorem 99
  group 73, 97, 277
  inverse problem 105, 311
Galois theory 244
Gauss
  formula 107, 324
  integers 32
    primes 33
  lemma 76
  sum 70, 101
gcd 145
Gelfond-Schneider theorem 126
generating functions 162
generator 48, 93, 102, 105, 181, 245, 312
global 131, 136
  field 126
Grassmann’s Formula 191, 409
greatest common divisor 30
  left or right 38
group 49, 54, 97, 154, 155, 245, 251
  abelian 98, 103, 155
  action 74, 158, 285, 385, 399
  commutative 155
  cyclic 106, 155, 317
  of units 60
  quotient 158, 381
  simple 318
  solvable 106, 317
  symmetric 155

H

Hadamard quotient theorem 72, 275
height 312
Hensel’s lemma 81, 124, 350
Hermite
  matrix 193
Hilbert
  Theorem 90 105, 312
homogeneous 28, 45, 119, 165, 177, 239
Hurwitz
  integers 37
hyperplane 191, 409

I

ideal 194, 237, 262, 390, 418
  prime 208
ideal factorisation 283
image 156, 174
IMC 23, 56, 159, 262, 390
IMO 42, 54, 87, 236, 249
  SL 49, 54, 72, 83, 86, 88, 120, 249, 275, 297, 300, 333
inclusion-exclusion principle 405
inert prime 33, 217
integer
  algebraic 14
  Eisenstein 34
  Gaussian 32
  Hurwitz 37
  p-adic 122
  quadratic 18, 197
  rational 15
integral basis 108, 326
integral closure 24, 181, 205, 404
integral domain 30, 57, 76, 89, 122, 153, 213
integral element 329
intermediate value theorem 164, 247, 389
inverse Galois problem 105, 311
Iran
  MO 56, 88, 167, 261, 400
  TST 54, 86–88, 296
Ireland 23, 199
irrationality measure 119
irreducible
  element 30
  polynomial 150
ISL 108
isolated 133
isomorphism 57, 62, 105, 154, 265
IZHO 24

J

Jacobi
  four square theorem 42, 233
  reciprocity 73, 278
  symbol 73, 277, 278
Japan
  MO 53

K

Kakeya
  conjecture 193, 416
  set 193, 416
kernel 156, 174, 303
Kobayashi’s theorem 116
Korea
  MO 56, 72, 259
  winter program 56, 259
Kronecker’s theorem 24, 206
Kronecker-Weber theorem 99
Kummer 55, 257
  lemma 55, 257

L

Lüroth’s theorem 105, 312
Lagrange 167, 399
  four square theorem 39
  interpolation 150, 172, 182
  theorem 103, 158, 383
lattice 338
Legendre
  formula 129
  symbol 68, 277
lifting the exponent lemma 50, 54, 297, 363
linear
  independence 168
  map 172
  multi 177
  recurrence 189
    order 189
  transformation 172
linear recurrence 59, 60, 72, 74, 274, 283
Liouville’s theorem 121, 339
local 131, 136
  field 126
local-global 20
localisation 123
locally analytic function 130
Lucas
  formula 107, 324
  sequence 42
  theorem 74, 282

M

Möbius Function 55, 71, 272
Mahler’s theorem 139, 351
Mann 107, 323
Mason-Stothers theorem 159, 387
matrix
  adjacency 176
  adjugate 187
  change of bases 174
  comatrix 187
  Hermitian 193
  identity 174
  multiplication 173
  transpose 176
  upper triangular 179
mean value theorem 140, 352
Mersenne
  prime 261
Mersenne sequence 87, 293
metric space 140, 355
Miklós Schweitzer 23, 72
minimal polynomial 16
module 180, 194, 418
monic
  polynomial 144
monoid 143, 371
morphism 155
multilinear 177
multiplicative 68
multiplicity
  of a root 147

N

Nagell 105, 138, 309
Nakayama’s lemma 194, 418
Newton
  method 358
Newton’s formulas 22, 162
nilpotent 193, 413
Noether’s Lemma 191
norm 92
  absolute 28
  determinant 187
  Euclidean 31, 32, 42
  of a field extension 96
  quadratic 28
  quaternion 37
norm (on a vector space) 141, 356
normal subgroup 158, 381
number field 92

O

order
  group 103
  maximal 37
  of a group 158, 383
  of a linear recurrence 189
Ostrowski 140, 141, 354, 359

P

p-adic 55, 257
  absolute value 124
  convergence 125
  exponential 140, 352
  integer 122
  logarithm 140, 352
  number 123
  unit 123
  valuation 123
partial fractions decomposition 190, 281
Pell’s equation 109
Pell-type 113
permanent 193, 413
permutation
  even 185
  odd 185
  transposition 185
PFTB 22, 54, 249
pigeonhole principle 110, 111, 294, 355, 357, 387, 397, 416
Poland
  MO 85
polynomial 144
  Chebyshev 66, 255
  constant coefficient 144
  cyclotomic 43
  divisibility 145
  elementary symmetric 18
  irreducible 150
  leading coefficient 144
  monic 144
  power sum 161
  primitive 75
  root 145, 147
  symmetric 18
power mean inequality 200, 359
power series
  formal 144
pre-periodic points 87, 296
primary
  Hurwitz integer 41, 229
prime
  divisors of a polynomial 79
  element 29, 30
  Gaussian 33
  ideal 58, 123, 208
  inert 33, 217
  ramified 33, 217
  rational 30
  split 33, 217
primitive
  root (of finite fields) 346
  element 93, 108
  Hurwitz integer 41, 228
  polynomial 75
  prime factor 50, 87, 293
  root (modulo n) 54, 68, 251
  root (of finite fields) 48, 68, 156, 245
  root of unity 17, 43, 65, 92, 98, 116, 134, 157, 192, 379, 410
primitive element theorem 90, 93, 104, 108
primorial 295
principal ideal domain 42

Q

quadratic
  complex field 109
  conjugate 28
  cosine 47, 244
  field 26, 90, 97
  integer 18, 197
  norm 28
  number 18, 197
  real field 109
  reciprocity 69, 102
  residue 68, 83, 288
  unit 109
quaternion
  conjugate 37
  norm 37
  numbers 36

R

Rabinowitsch’s trick 395
Ramanujan 138
ramified prime 33, 217
rank 194, 417
  of a linear map 174
  of an abelian group 158, 384
rank-nullity theorem 174
rational function 151
rational root theorem 15, 196
real field
  quadratic 109
  totally real 117
Redei 106, 314
resultant 165, 393
Riemann
  zeta function 226, 250
ring 152
  of integers 27, 58, 92
  commutative 18, 58, 153, 188
  multiplicative group 60
  of integers 121
RMM 120, 334
  SL 86, 291
Romania
  TST 86, 292
root of unity filter 157, 379
Russia
  All-Russian Olympiad 157

S

scalars 168
series 125
Siegel’s Lemma 193, 416
signature 184
simple group 318
sine law 202
skew field 37, 153, 221
Skolem-Mahler-Lech theorem 118, 133, 142, 369
solvability
  by radicals 106, 317
  by real radicals 107, 320
Sophie Germain’s identity 314
Sophie-Germain
  prime 55, 255
  theorem 55, 255
split
  polynomial 57, 104
  prime 33, 217
splitting field 62, 73, 107, 277, 320, 321
squarefree 55, 71, 272
Størmer’s theorem 114, 121, 341
stars and bars 416
Strassmann’s theorem 137
Sturm’s theorem 159, 388
subgroup 99
  normal 101, 307
symmetric
  group 155
  polynomial 18, 160
    complete homogeneous 165
    elementary 18, 160
    fundamental theorem of 19, 161
    power sum 161

T

Taiwan
  TST 141
Taylor
  expansion 132
  formula 81
  series 263
Taylor’s formula 358
Teichmüller character 367
TFJM 108, 328
Thue’s equation 116
Thue-Siegel-Roth theorem 119
torsion 384
totally real 117
trace 194
  of a field extension 194
transcendence
  basis 166, 397
  degree 166, 397
transcendental number 14, 95
transposition 185
Tuymaada 73, 87, 280, 294

U

unique factorisation domain 29, 77
unit 24, 54, 109, 205, 206, 251
  circle 159, 387
  complex cubic 117
  complex quadratic 109
  Dirichlet theorem 119, 121, 337
  fundamental 109, 117
  group 60
  of a ring 29, 155
  of cyclotomic fields 55, 256
  p-adic 123
  real quadratic 110
  S 114, 119, 121, 341
USA
  MO 23, 73, 87, 107, 159, 199, 292
  TST 23, 24, 53, 82, 87, 88, 97, 128, 139, 157, 193, 204, 247, 294, 377, 415
  TSTST 74

V

Vahlen 106, 314
Vandermonde 139, 351
  determinant 182
Vandermonde determinant 415
vector space 156, 168
  dimension 90
vectors 168
Vieta’s formulas 18, 19, 47, 148, 164

W

Wedderburn’s theorem 74, 285
Wilson’s theorem 148, 300

Z

Zariski density 413
Zeev Dvir 193, 416
zeta function
  Dedekind 226
  Riemann 226, 250
Zsigmondy’s theorem 50
