Mathematics of Coding:
Information, Compression, Error Correction, and Finite Fields

Paul Garrett
Contents

Preface . . . ix

1 Probability . . . 1
  1.1 Sets and functions . . . 1
  1.2 Counting . . . 5
  1.3 Preliminary ideas of probability . . . 8
  1.4 More formal view of probability . . . 13
  1.5 Random variables, expected values, variance . . . 20
  1.6 Markov's inequality, Chebysheff's inequality . . . 27
  1.7 Law of Large Numbers . . . 27

2 Information . . . 33
  2.1 Uncertainty, acquisition of information . . . 33
  2.2 Definition of entropy . . . 37

3 Noiseless Coding . . . 44
  3.1 Noiseless coding . . . 44
  3.2 Kraft and McMillan inequalities . . . 48
  3.3 Noiseless coding theorem . . . 51
  3.4 Huffman encoding . . . 54

4 Noisy Coding . . . 61
  4.1 Noisy channels . . . 61
  4.2 Example: parity checks . . . 63
  4.3 Decoding from a noisy channel . . . 66
  4.4 Channel capacity . . . 67
  4.5 Noisy coding theorem . . . 71

6 The Integers . . . 93
  6.1 The reduction algorithm . . . 93
  6.2 Divisibility . . . 96
  6.3 Factorization into primes . . . 99
  6.4 A failure of unique factorization . . . 103
  6.5 The Euclidean Algorithm . . . 105
  6.6 Equivalence relations . . . 108
  6.7 The integers modulo m . . . 111
  6.8 The finite field Z/p for p prime . . . 115
  6.9 Fermat's Little Theorem . . . 117
  6.10 Euler's theorem . . . 118
  6.11 Facts about primitive roots . . . 120
  6.12 Euler's criterion . . . 121
  6.13 Fast modular exponentiation . . . 122
  6.14 Sun-Ze's theorem . . . 124
  6.15 Euler's phi-function . . . 128

8 Groups . . . 145
  8.1 Groups . . . 145
  8.2 Subgroups . . . 147
  8.3 Lagrange's Theorem . . . 148
  8.4 Index of a subgroup . . . 150
  8.5 Laws of exponents . . . 151
  8.6 Cyclic subgroups, orders, exponents . . . 153
  8.7 Euler's Theorem . . . 154
  8.8 Exponents of groups . . . 155
  8.9 Group homomorphisms . . . 156
  8.10 Finite cyclic groups . . . 158
  8.11 Roots, powers . . . 161

10 Polynomials . . . 178
  10.1 Polynomials . . . 178
  10.2 Divisibility . . . 181
  10.3 Factoring and irreducibility . . . 184
  10.4 Euclidean algorithm for polynomials . . . 187
  10.5 Unique factorization of polynomials . . . 189

11 Finite Fields . . . 192
  11.1 Making fields . . . 192
  11.2 Examples of field extensions . . . 195
  11.3 Addition mod P . . . 197
  11.4 Multiplication mod P . . . 197
  11.5 Multiplicative inverses mod P . . . 197

12 Linear Codes . . . 200
  12.1 An ugly example . . . 200
  12.2 A better approach . . . 203
  12.3 An inequality from the other side . . . 204
  12.4 The Hamming binary [7, 4] code . . . 205
  12.5 Some linear algebra . . . 208
  12.6 Row reduction: a review . . . 211
  12.7 Linear codes . . . 218
  12.8 Dual codes, syndrome decoding . . . 222

13 Bounds for Codes . . . 228
  13.1 Hamming (sphere-packing) bound . . . 228
  13.2 Gilbert-Varshamov bound . . . 230
  13.3 Singleton bound . . . 232

16 Primitive Polynomials . . . 260
  16.1 Definition of primitive polynomials . . . 260
  16.2 Examples mod 2 . . . 261
  16.3 Testing for primitivity . . . 264
  16.4 Periods of LFSRs . . . 267
  16.5 Two-bit errors in CRCs . . . 272

17 RS and BCH Codes . . . 276
  17.1 Vandermonde determinants . . . 277
  17.2 Variant check matrices for cyclic codes . . . 280
  17.3 Reed-Solomon codes . . . 282
  17.4 Hamming codes . . . 285
  17.5 BCH codes . . . 287

18 Concatenated Codes . . . 297
  18.1 Mirage codes . . . 297
  18.2 Concatenated codes . . . 301
  18.3 Justesen codes . . . 303
  18.4 Some explicit irreducible polynomials . . . 306

Bibliography . . . 384
Select Answers . . . 386
Index . . . 393
Preface
This book is intended to be accessible to undergraduate students with two
years of typical mathematics experience, most likely meaning calculus with a little
linear algebra and differential equations. Thus, specifically, there is no assumption
of a background in abstract algebra or number theory, nor of probability, nor of
linear algebra. All these things are introduced and developed to a degree sufficient
to address the issues at hand.
We will address the fundamental problem of transmitting information effectively and accurately. The specific mode of transmission does not really play
a role in our discussion. On the other hand, we should mention that the importance
of the issues of efficiency and accuracy has increased largely due to the advent of
the internet and, even more so, due to the rapid development of wireless communications. For this reason it makes sense to think of networked computers or wireless
devices as archetypical fundamental practical examples.
The underlying concepts of information and information content of data
make sense independently of computers, and are relevant in looking at the operation
of natural languages such as English, and of other modes of operation by which
people acquire and process data.
The issue of efficiency is the obvious one: transmitting information costs time,
money, and bandwidth. It is important to use as little as possible of each of these
resources. Data compression is one way to pursue this efficiency. Some well-known
examples of compression schemes are commonly used for graphics: GIFs,
JPEGs, and more recently PNGs. These clever file format schemes are enormously
more efficient in terms of filesize than straightforward bitmap descriptions of graphics files. There are also general-purpose compression schemes, such as gzip, bzip2,
ZIP, etc.
The issue of accuracy is addressed by detection and correction of errors
that occur during transmission or storage of data. The single most important
practical example is the TCP/IP protocol, widely used on the internet: one basic
aspect of this is that if any of the packets composing a message is discovered to be
mangled or lost, the packet is simply retransmitted. The detection of lost packets
is based on numbering the collection making up a given message. The detection
of mangled packets is by use of 16-bit checksums in the headers of IP and TCP
packets. We will not worry about the technical details of TCP/IP here, but only
note that email and many other types of internet traffic depend upon this protocol,
which makes essential use of rudimentary error-detection devices.
And it is a fact of life that dust settles on CD-ROMs, static permeates network
lines, etc. That is, there is noise in all communication systems. Human natural
languages have evolved to include sufficient redundancy so that usually much
less than 100% of a message need be received to be properly understood. Such
redundancy must be designed into CD-ROM and other data storage protocols to
achieve similar robustness.
There are other uses for detection of changes in data: if the data in question is
the operating system of your computer, a change not initiated by you is probably
a sign of something bad, either failure in hardware or software, or intrusion by
hostile agents (whether software or wetware). Therefore, an important component
of systems security is implementation of a suitable procedure to detect alterations
in critical files.
In pre-internet times, various schemes were used to reduce the bulk of communication without losing the content: this influenced the design of the telegraphic
alphabet, traffic lights, shorthand, etc. With the advent of the telephone and radio, these matters became even more significant. Communication with exploratory
spacecraft having very limited resources available in deep space is a dramatic example of how the need for efficient and accurate transmission of information has
increased in our recent history.
In this course we will begin with the model of communication and information
made explicit by Claude Shannon in the 1940s, after some preliminary forays by
Hartley and others in the preceding decades.
Many things are omitted due to lack of space and time. In spite of their
tremendous importance, we do not mention convolutional codes at all. This is
partly because there is less known about them mathematically. Concatenated codes
are mentioned only briefly. Finally, we also omit any discussion of the so-called
turbo codes. Turbo codes have recently been developed experimentally. Their
remarkably good behavior, seemingly approaching the Shannon bound, has led to
the conjecture that they are explicit solutions to the fifty-year-old existence results
of Shannon. However, at this time there is insufficient understanding of the reasons
for their good behavior, and for this reason we will not attempt to study them here.
We do give a very brief introduction to geometric Goppa codes, attached to
algebraic curves, which are a natural generalization of Reed-Solomon codes (which
we discuss), and which exceed the Gilbert-Varshamov lower bound for performance.
The exercises at the ends of the chapters are mostly routine, with a few more
difficult exercises indicated by single or double asterisks. Short answers are given
at the end of the book for a good fraction of the exercises, indicated by (ans.)
following the exercise.
I offer my sincere thanks to the reviewers of the notes that became this volume.
They found many unfortunate errors, and offered many good ideas about improvements to the text. While I did not choose to take absolutely all the advice given, I
greatly appreciate the thought and energy these people put into their reviews: John
Bowman, University of Alberta; Sergio Lopez, Ohio University; Navin Kashyap,
University of California, San Diego; James Osterburg, University of Cincinnati;
LeRoy Bearnson, Brigham Young University; David Grant, University of Colorado
at Boulder; Jose Voloch, University of Texas.
Paul Garrett
[email protected]
http://www.math.umn.edu/garrett/
Chapter 1

Probability

1.1 Sets and functions

We use the standard notation for the familiar sets of numbers:

Z = the integers
Q = the rational numbers
R = the real numbers
C = the complex numbers

A set can be described by simply listing its elements: for example,

S = {1, 2, 3, 4, 5, 6, 7, 8}
which is the set of integers greater than 0 and less than 9. This set can also be
described by a rule like
S = {1, 2, 3, 4, 5, 6, 7, 8} = {x : x is an integer and 1 ≤ x ≤ 8}
This follows the general format and notation
{x : x has some property}
If x is in a set S, then write x ∈ S or S ∋ x, and say that x is an element of S.
Thus, a set is the collection of all its elements (although this remark only explains
the language). It is worth noting that the ordering of a listing has no effect on a set,
and if in the listing of elements of a set an element is repeated, this has no effect.
For example,
{1, 2, 3} = {1, 1, 2, 3} = {3, 2, 1} = {1, 3, 2, 1}
A subset T of a set S is a set all of whose elements are elements of S. This
is written T ⊂ S or S ⊃ T. So always S ⊂ S and ∅ ⊂ S. If T ⊂ S and T ≠ ∅
and T ≠ S, then T is a proper subset of S. Note that the empty set ∅ is a subset of
every set. For a subset T of a set S, the complement of T (inside S) is

Tᶜ = S − T = {s ∈ S : s ∉ T}
Sets can also be elements of other sets. For example, {Q, Z, R, C} is the set
with 4 elements, each of which is a familiar set of numbers. Or, one can check that
{{1, 2}, {1, 3}, {2, 3}}
is the set of two-element subsets of {1, 2, 3}.
The intersection of two sets A, B is the collection of all elements which lie in
both sets, and is denoted A ∩ B. Two sets are disjoint if their intersection is ∅. If
the intersection is not empty, then we may say that the two sets meet. The union
of two sets A, B is the collection of all elements which lie in one or the other of the
two sets, and is denoted A ∪ B.
Note that, for example, 1 ≠ {1}, and {{1}} ≠ {1}. That is, the set {a} with
sole element a is not the same thing as the item a itself.
An ordered pair (x, y) is just that, a list of two things in which there is a
first thing, here x, and a second thing, here y. Two ordered pairs (x, y) and (x′, y′)
are equal if and only if x = x′ and y = y′.
The (cartesian) product of two sets A, B is the set of ordered pairs (a, b)
where a ∈ A and b ∈ B. It is denoted A × B. Thus, while {a, b} = {b, a} might be
thought of as an unordered pair, for ordered pairs (a, b) ≠ (b, a) unless by chance
a = b.
In case A = B, the cartesian power A × A is often denoted A². More generally,
for a fixed positive integer n, the nth cartesian power Aⁿ of a set is the set of
ordered n-tuples (a₁, a₂, …, aₙ) of elements aᵢ of A.
Some very important examples of cartesian powers are those of R or Q or C,
which arise in other contexts as well: for example, R² is the collection of ordered
pairs of real numbers, which we use to describe points in the plane. And R³ is
the collection of ordered triples of real numbers, which we use to describe points in
three-space.
The power set of a set S is the set of subsets of S. This is sometimes denoted
by PS. Thus,

P∅ = {∅}

P{1, 2} = {∅, {1}, {2}, {1, 2}}
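As a quick computational aside (ours, not the text's), power sets are easy to enumerate by machine. The sketch below encodes each subset as a frozenset, so that subsets can themselves be elements of a set:

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s, each subset as a frozenset."""
    elems = list(s)
    return {frozenset(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)}

# P(emptyset) = {emptyset}, and P({1, 2}) has the four subsets listed above.
print(sorted(map(sorted, power_set({1, 2}))))  # [[], [1], [1, 2], [2]]
```

A set with n elements has 2ⁿ subsets, which is why the power set is sometimes written 2^S.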
Intuitively, a function f from one set A to another set B is supposed to be
a rule which assigns to each element a ∈ A an element b = f(a) ∈ B. This is
written as

f : A → B

although the latter notation gives no information about the nature of f in any
detail.
More rigorously, but less intuitively, we can define a function by really telling
its graph: the formal definition is that a function f : A → B is a subset of the
product A × B with the property that for every a ∈ A there is a unique b ∈ B so
that (a, b) ∈ f. Then we would write f(a) = b.
This formal definition is worth noting at least because it should make clear that
there is absolutely no requirement that a function be described by any recognizable
or simple formula.
Map and mapping are common synonyms for function.
As a silly example of the formal definition of function, let f : {1, 3} → {2, 6}
be the function multiply-by-two, so that f(1) = 2 and f(3) = 6. Then the official
definition would say that really f is the subset of the product set {1, 3} × {2, 6}
consisting of the ordered pairs (1, 2), (3, 6). That is, formally the function f is the
set

f = {(1, 2), (3, 6)}
Of course, no one usually operates this way, but it is important to have a precise
meaning underlying more intuitive usage.
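The function-as-graph viewpoint is mirrored almost literally in code; the following illustration is ours, not the text's:

```python
# The function f : {1, 3} -> {2, 6}, "multiply by two", given as its graph:
f = {(1, 2), (3, 6)}

def apply(graph, a):
    """Evaluate a function-as-graph at a, checking the defining property:
    for each a in the domain there is exactly one pair (a, b)."""
    matches = [b for (x, b) in graph if x == a]
    assert len(matches) == 1, "not a function at " + repr(a)
    return matches[0]

print(apply(f, 1), apply(f, 3))  # 2 6
```

A Python dict is exactly this structure with the uniqueness condition enforced automatically.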
A function f : A → B is surjective (or onto) if for every b ∈ B there is
a ∈ A so that f(a) = b. A function f : A → B is injective (or one-to-one) if
f(a) = f(a′) implies a = a′. That is, f is injective if for every b ∈ B there is at
most one a ∈ A so that f(a) = b. A map is a bijection if it is both injective and
surjective.
The number of elements in a set is its cardinality. Two sets are said to have
the same cardinality if there is a bijection between them. Thus, this is a trick
so that we don't have to actually count two sets to see whether they have the same
number of elements. Rather, we can just pair them up by a bijection to achieve
this purpose.
Since we can count the elements in a finite set in a traditional way, it is clear
that a finite set has no bijection to a proper subset of itself. After all, a proper
subset has fewer elements.
By contrast, for infinite sets it is easily possible that proper subsets have bijections to the whole set. For example, the set A of all natural numbers and the set
E of even natural numbers have a bijection between them given by

n → 2n
But certainly E is a proper subset of A! Even more striking examples can be
arranged. In the end, we take as the definition that a set is infinite if it has a
bijection to a proper subset of itself.
Let f : A → B be a function from a set A to a set B, and let g : B → C be a
function from the set B to a set C. The composite function g ∘ f is defined to
be

(g ∘ f)(a) = g(f(a))

for a ∈ A.
The identity function on a non-empty set S is the function f : S → S so
that f(a) = a for all a ∈ S. Often the identity function on a set S is denoted by
idS.
Let f : A → B be a function from a set A to a set B. An inverse function
g : B → A for f (if such g exists at all) is a function so that (f ∘ g)(b) = b for all
b ∈ B, and also (g ∘ f)(a) = a for all a ∈ A. That is, the inverse function (if it
exists) has the two properties

f ∘ g = idB
g ∘ f = idA
1.2 Counting
Here we go through various standard elementary-but-important examples of counting as preparation for finite probability computations. Of course, by counting we
mean structured counting.
Example: Suppose we have n different things, for example the integers from 1 to
n inclusive. The question is how many different orderings or ordered listings
i₁, i₂, i₃, …, i_{n−1}, i_n

of these numbers are there? Rather than just tell the formula, let's quickly derive
it. The answer is obtained by noting that there are n choices for the first thing i₁,
then n − 1 remaining choices for the second thing i₂ (since we can't reuse whatever
i₁ was), n − 2 remaining choices for i₃ (since we can't reuse i₁ nor i₂, whatever they
were!), and so on down to 2 remaining choices for i_{n−1} and then just one choice for
i_n. Thus, there are

n · (n − 1) · (n − 2) · … · 2 · 1

possible orderings of n distinct things. This kind of product arises often, and there
is a notation and name for it: n-factorial, denoted n!, is the product

n! = n · (n − 1) · (n − 2) · … · 2 · 1
It is an important and useful convention that
0! = 1
The factorial n! is defined only for non-negative integers.
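The count of orderings is easy to check by brute force for small n; the sketch below (ours, not the book's) compares n! against direct enumeration of permutations:

```python
from itertools import permutations

def factorial(n):
    """n! = n * (n-1) * ... * 2 * 1, with the convention 0! = 1."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

# The number of orderings of n distinct things is n!:
for n in range(6):
    assert factorial(n) == len(list(permutations(range(n))))
print(factorial(5))  # 120
```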
Example: How many ordered k-tuples of elements can be chosen (allowing
repetition) from a set of n things? There are n possibilities for the first choice. For
each choice of the first there are n choices for the second. For each choice of the
first and second there are n for the third, and so on down to n choices for the kth
for each choice of the first through (k−1)th. That is, altogether there are

n · n · … · n (k factors) = nᵏ

ordered k-tuples that can be chosen from the set.
Example: How many ordered k-tuples of distinct elements can be chosen from
a set of n things? There are n possibilities for the first choice. For each choice of
the first,
there are n − 1 remaining choices for the second, since the second element must
be different from the first. For each choice of the first and second there are n − 2
remaining choices for the third, since it must be different from the first and second.
This continues, down to n − (k−1) choices for the kth for each choice of the first through
the (k−1)th, since the k − 1 distinct elements already chosen can't be reused. That is,
altogether there are

n · (n − 1) · (n − 2) · … · (n − (k−2)) · (n − (k−1)) = n!/(n − k)!

ordered k-tuples of distinct elements that can be chosen from a set with n elements.
Example: How many (unordered!) subsets of k elements are there in a set of
n things? There are n possibilities for the first choice, n − 1 remaining choices for
the second (since the first item is removed), n − 2 for the third (since the first and
second items are no longer available), and so on down to n − (k−1) choices for the
kth. This number is n!/(n − k)!, but is not what we want, since it includes a count
of all different orders of choices, but subsets are not ordered. That is,

\frac{n!}{(n-k)!} = k! \cdot (\text{the actual number})

since we saw in a previous example that there are k! possible orderings of k distinct
things. Thus, there are

\frac{n!}{k!\,(n-k)!}

choices of subsets of k elements in a set with n elements.
The number n!/(k!(n−k)!) also occurs often enough to warrant a name and
notation: it is called a binomial coefficient, is written

\binom{n}{k} = \frac{n!}{k!\,(n-k)!}
and is pronounced n choose k in light of the previous example. The name binomial
coefficient is explained below in the context of the Binomial Theorem.
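In code, the binomial coefficient can be written straight from the formula above; this sketch (ours) checks it against Python's built-in math.comb:

```python
from math import comb, factorial

def binomial(n, k):
    """n-choose-k = n! / (k! (n-k)!), the number of k-element subsets."""
    if k < 0 or k > n:
        return 0  # no k-element subsets in these cases
    return factorial(n) // (factorial(k) * factorial(n - k))

# Agreement with the standard library for small cases:
assert all(binomial(n, k) == comb(n, k)
           for n in range(12) for k in range(n + 1))
print(binomial(10, 3))  # 120
```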
Example: How many disjoint pairs of 3-element and 5-element subsets are there
in a set with 10 elements? We just saw that there are \binom{10}{3} choices for the
first subset, with 3 elements. Then the remaining part of the original set has just
10 − 3 = 7 elements, so there are \binom{7}{5} choices for the second subset, of 5 elements.
Therefore, there are

\binom{10}{3}\binom{7}{5} = \frac{10!}{7!\,3!} \cdot \frac{7!}{5!\,2!} = \frac{10!}{3!\,5!\,2!}
= \frac{10!}{5!\,5!} \cdot \frac{5!}{3!\,2!} = \binom{10}{5}\binom{5}{3}
pairs of disjoint subsets of 3 and 5 elements inside a set with 10 elements. Note
that we obtain the same numerical outcome regardless of whether we first choose
the 3-element subset or the 5-element subset.
Example: How many disjoint pairs of subsets, each with k elements, are there in
a set with n elements, where 2k ≤ n? We saw that there are \binom{n}{k} choices for the
first subset with k elements. Then the remaining part of the original set has just
n − k elements, so there are \binom{n-k}{k} choices for the second subset of k elements.
But our counting so far inadvertently takes into account a first subset and a second
one, which is not what the question is. By now we know that there are 2! = 2
choices of ordering of two things (subsets, for example). Therefore, there are

\frac{1}{2}\binom{n}{k}\binom{n-k}{k} = \frac{1}{2} \cdot \frac{n!}{(n-k)!\,k!} \cdot \frac{(n-k)!}{k!\,(n-2k)!} = \frac{n!}{2\,k!\,k!\,(n-2k)!}

pairs of disjoint k-element subsets in a set with n elements. More generally, for a
family of ℓ disjoint subsets, each with k elements, in a set with n elements (where
ℓk ≤ n), there are

\binom{n}{k},\ \binom{n-k}{k},\ \binom{n-2k}{k},\ \ldots,\ \binom{n-(\ell-1)k}{k}

choices in succession, ending with \binom{n-(\ell-1)k}{k} for the ℓth subset. But since the ordering of these subsets is inadvertently counted
here, we have to divide by ℓ! to have the actual number of families. There is some
cancellation among the factorials, so that the actual number is

number of families of ℓ disjoint subsets of k elements = \frac{n!}{\ell!\,(k!)^{\ell}\,(n-\ell k)!}
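This formula can be checked against brute-force enumeration for small parameters; the check below is our sketch, not part of the text:

```python
from itertools import combinations
from math import factorial

def families_formula(n, k, l):
    """n! / ( l! * (k!)^l * (n - l*k)! ): families of l disjoint k-subsets."""
    return factorial(n) // (factorial(l) * factorial(k) ** l
                            * factorial(n - l * k))

def families_bruteforce(n, k, l):
    """Count unordered families of l pairwise-disjoint k-subsets of an n-set."""
    subsets = list(combinations(range(n), k))
    count = 0
    for fam in combinations(subsets, l):
        elems = [x for s in fam for x in s]
        if len(set(elems)) == l * k:  # the l subsets are pairwise disjoint
            count += 1
    return count

for (n, k, l) in [(6, 2, 2), (6, 2, 3), (7, 3, 2), (10, 3, 1)]:
    assert families_formula(n, k, l) == families_bruteforce(n, k, l)
print(families_formula(10, 5, 2))  # 126
```

For instance, with n = 6, k = 2, ℓ = 3 the formula gives 6!/(3! · (2!)³ · 0!) = 15, the familiar count of perfect matchings on six points.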
The binomial coefficients are the coefficients appearing in the Binomial Theorem:

(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}

This identity shows that the binomial coefficients are integers, and is the basis for
other identities as well. This identity is proven by induction, as follows. For n = 1
the assertion is immediately verified. Assume it is true for exponent n, and prove
the corresponding assertion for exponent n + 1. Thus,

(x + y)^{n+1} = (x + y)(x + y)^n = (x + y)\sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}

= \sum_{k=0}^{n} \binom{n}{k} \left( x^{k+1} y^{n-k} + x^k y^{n-k+1} \right)

= x^0 y^{n+1} + x^{n+1} y^0 + \sum_{k=1}^{n} \left[ \binom{n}{k-1} + \binom{n}{k} \right] x^k y^{n+1-k}

Thus, to prove the formula of the Binomial Theorem for exponent n + 1 we must
prove that for 1 ≤ k ≤ n

\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}

We do this by expressing the left-hand side in terms of factorials:

\binom{n}{k} + \binom{n}{k-1} = \frac{n!}{(k-1)!\,(n-k+1)!} + \frac{n!}{k!\,(n-k)!}
= \frac{n!\,k}{k!\,(n-k+1)!} + \frac{n!\,(n-k+1)}{k!\,(n-k+1)!}
= \frac{(n+1)!}{k!\,(n-k+1)!} = \binom{n+1}{k}

as claimed.
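Both the Binomial Theorem and Pascal's identity are easy to spot-check numerically; the sketch below is ours:

```python
from math import comb

# Pascal's rule: C(n, k-1) + C(n, k) = C(n+1, k) for 1 <= k <= n
for n in range(1, 20):
    for k in range(1, n + 1):
        assert comb(n, k - 1) + comb(n, k) == comb(n + 1, k)

# Binomial Theorem at sample integer points:
# (x + y)^n = sum over k of C(n, k) * x^k * y^(n-k)
for (x, y, n) in [(2, 3, 5), (1, 1, 10), (-1, 4, 6)]:
    assert (x + y) ** n == sum(comb(n, k) * x**k * y**(n - k)
                               for k in range(n + 1))
print("identities check out for small cases")
```

Setting x = y = 1 in the Binomial Theorem recovers the fact that an n-element set has 2ⁿ subsets in all.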
1.3 Preliminary ideas of probability

The same rule is taken as true for chance: the sum of the percentages of all possible
outcomes should be 100%.
But what is probability?
Consider flipping a fair coin n times: each toss comes up heads or tails, so there
are 2ⁿ possible sequences of n outcomes. The assumptions that the coin is fair and
that the separate coin tosses do not influence each other is interpreted as saying
that each one of the 2ⁿ possible sequences of coin-toss outcomes is equally likely.
Therefore, the probability of any single sequence of n outcomes is 1/2ⁿ. Further,
for any subset S of the set A of all 2ⁿ possible sequences of outcomes, we assume
that

probability of a sequence of n tosses giving an outcome in S
= \frac{\text{number of elements in } S}{\text{number of elements in } A} = \frac{\text{number of elements in } S}{2^n}
Then the probability that exactly k heads will occur out of n tosses (with
0 ≤ k ≤ n) is computed as

probability of k heads out of n tosses = \frac{\binom{n}{k}}{2^n}

where n-choose-k is

\binom{n}{k} = \frac{n!}{k!\,(n-k)!}

Thus, for example, the probability that exactly 5 heads come up in 10 tosses is

\frac{\binom{10}{5}}{2^{10}} = \frac{\left( \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} \right)}{1024} = \frac{252}{1024} \approx \frac{1}{4}
as commented just above. And the probability that 6 heads and 4 tails or 4 heads
and 6 tails occur is

\frac{\text{number of sequences of 10 with exactly 6 or exactly 4 heads}}{2^{10}}
= \frac{\binom{10}{6} + \binom{10}{4}}{2^{10}} = \frac{2 \cdot \frac{10 \cdot 9 \cdot 8 \cdot 7}{4 \cdot 3 \cdot 2 \cdot 1}}{1024} = \frac{420}{1024} \approx \frac{2}{5}
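These two computations can be reproduced mechanically with exact rational arithmetic; the sketch is ours:

```python
from math import comb
from fractions import Fraction

def p_heads(k, n):
    """Probability of exactly k heads in n fair tosses: C(n, k) / 2^n."""
    return Fraction(comb(n, k), 2 ** n)

assert p_heads(5, 10) == Fraction(252, 1024)                   # about 1/4
assert p_heads(6, 10) + p_heads(4, 10) == Fraction(420, 1024)  # about 2/5
print(float(p_heads(5, 10)))  # 0.24609375
```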
Perhaps not entirely surprisingly, the probability of getting exactly half heads
and half tails out of 2n flips goes down as the number of flips goes up, and in fact
goes to 0 as the number of flips goes to infinity. Nevertheless, more consistent
with our intuition, the sense that the number of heads is approximately one half is
correct. Still, in terms of the expression above,

\lim_{n \to \infty} \frac{\binom{2n}{n}}{2^{2n}} = 0

It is not so easy to verify this directly, but consideration of some numerical examples
is suggestive if not actually persuasive. Quantification of the notion that the number
of heads is approximately one half is filled in a little later by the Law of Large
Numbers.
Some numerical values of \binom{2n}{n}/2^{2n} for increasing n (for example 0.5,
0.375, 0.3125, 0.2734, 0.2461, and so on, down past 0.0252) illustrate the slow
decay toward 0.
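The decay of these central probabilities can be recomputed directly; this sketch is ours:

```python
from math import comb

def central(n):
    """Probability of exactly n heads in 2n fair tosses: C(2n, n) / 2^(2n)."""
    return comb(2 * n, n) / 4 ** n

vals = [central(n) for n in range(1, 6)]
print([round(v, 4) for v in vals])  # [0.5, 0.375, 0.3125, 0.2734, 0.2461]

# The sequence decreases monotonically toward 0 (roughly like 1/sqrt(pi*n)),
# since central(n+1)/central(n) = (2n+1)/(2n+2) < 1.
assert all(central(n + 1) < central(n) for n in range(1, 50))
```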
Remark: We're not really answering the question "what is probability?", but
instead we're telling how to compute it.
One attempt to be more quantitative, taken in the past but which has several
flaws, is the limiting frequency definition of probability, described as follows in
the simplest example. Let N(n) be the number of times that a head came up in n
trials. Then as n grows larger and larger we might imagine that the ratio N(n)/n
should get closer and closer to the probability of heads (1/2 for a fair coin). Or,
in the language of limits, it should be that

probability of heads = \lim_{n \to \infty} \frac{N(n)}{n}

(And probably this limit really is 1/2.) But there are problems with this definition.
It's not that the assertion itself is false, but rather that this isn't a good definition
of probability from which to start. For example, either in real life or in theory it's
not convenient to do infinitely many flips. Second, if we try to do only finitely many
flips and approximate the probability, how many do we need to do? Third, how
do we know that every infinite sequence of trials will give the same limiting value?
There are many further objections to this as a fundamental definition, but we should
be aware of interpretations in this direction. A more supportable viewpoint would
make such limiting frequency assertions a consequence of other things, called the
Law of Large Numbers. We'll prove a special case of this a bit later.
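The limiting-frequency idea is easy to watch empirically. This simulation (ours, with a fixed seed so it is reproducible) shows N(n)/n drifting toward 1/2, while also illustrating the objection: any finite run only approximates the limit.

```python
import random

random.seed(0)  # fixed seed: a reproducible simulation sketch, not a definition

def head_frequency(n):
    """Fraction of heads in n simulated fair-coin flips."""
    heads = sum(random.random() < 0.5 for _ in range(n))
    return heads / n

for n in (100, 10_000, 1_000_000):
    print(n, head_frequency(n))
# The printed frequencies hover near 0.5, more tightly as n grows.
```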
Example: The next traditional example involves picking colored balls out of an
urn. Suppose, for example, that there are N balls in the urn, r red ones and
b = N r blue ones, and that they are indistinguishable by texture, weight, size,
or in any way. Then in choosing a single ball from the urn we are equally likely
to choose any one of the N . As in the simpler case of coin flips, there are N
possibilities each of which is equally likely, and the probabilities must add up to 1,
so the probability of drawing any particular ball must be 1/N . Further, it may seem
reasonable to postulate that the probability of picking out one ball from among a
fixed subset of k would be k times the probability of picking a single ball. Granting
this, with r red balls and b blue ones, we would plausibly say that the probability
is r/N that a red ball will be chosen and b/N that a blue ball will be chosen. (We
should keep in mind that some subatomic particles do not behave in this seemingly
reasonable manner!) So without assigning meaning to probability, in some cases we
can still reach some conclusions about how to compute it.
We suppose that one draw (with replacement) has no effect on the next one,
so that they are independent. Let r(n) be the number of red balls drawn in a
sequence of n trials. Then, in parallel with the discussion just above, we would
presume that for any infinite sequence of trials

\lim_{n \to \infty} \frac{r(n)}{n} = \frac{r}{N}

But, as noted above, this should not be the definition, but rather should be a
deducible consequence of whatever definition we make.
Running this in the opposite direction: if there are N balls in an urn, some
red and some blue, if r(n) denotes the number of red balls chosen in n trials, and if

\lim_{n \to \infty} \frac{r(n)}{n} = f
1.4 More formal view of probability

For a single toss of a fair coin, for example, the sample space is Ω = {H, T}, with

P(H) = \frac{1}{2} \qquad P(T) = \frac{1}{2}

More generally, for an event A we take

P(A) = \sum_{x_i \in A} P(x_i)

where (to repeat) the sum is over the points xᵢ that lie in A. The function P()
extended in this fashion is really what a probability measure is. The event A
occurs if any one of the xᵢ ∈ A occurs. Thus, for A = {x_{i₁}, …, x_{i_k}},

P(A) = P(x_{i₁} or x_{i₂} or … or x_{i_k})

As extreme cases,

P(Ω) = 1

and

P(∅) = 0

Generally, for an event A, the event not-A is the set-theoretic complement Aᶜ =
Ω − A of A inside Ω. Then

P(not A) = P(Aᶜ) = P(Ω − A) = 1 − P(A)
For example, let the sample space consist of ten points b₁, …, b₁₀, with

P(bᵢ) = P(bⱼ) = \frac{1}{10}

for all i and j. This is the model of 10 balls in an urn. Then the subsets
A = {b₁, b₂, b₃} and B = {b₁, b₂, …, b₇} have

P(A) = P(b₁) + P(b₂) + P(b₃) = \frac{1}{10} + \frac{1}{10} + \frac{1}{10} = \frac{3}{10}

P(B) = P(b₁) + P(b₂) + … + P(b₇) = \frac{7}{10}

We can assign these probabilities pᵢ by intuition, by using the limiting frequency idea, or by other means. In fact, they might be measured experimentally,
or assigned in some operational manner possibly hard to justify rigorously. Let's
repeat the limiting frequency story one more time in this situation. We imagine
that the same experiment X is conducted over and over, and that subsequent trials
are unaffected by the earlier ones, that is, they are independent trials. For n
such independent trials let n(ωᵢ) be the number of times that the event ωᵢ occurs.
Suppose that for any infinite sequence of trials the limit

p_i = \lim_{n \to \infty} \frac{n(\omega_i)}{n}

exists and is unique. Then this limiting frequency pᵢ should be the probability
of the event ωᵢ.
Example: Consider the experiment of drawing a ball from an urn in which there
are 3 red balls, 3 blue balls, and 4 white balls (otherwise indistinguishable). As
above, we would postulate that the probability of drawing any particular individual
ball is 1/10. (These atomic events are indeed mutually exclusive, because we only
draw one ball at a time.) Thus, the smallest events x1 , x2 , . . . , x10 are the possible
drawings of each one of the 10 balls. Since they have equal chances of being drawn,
the probabilities pi = P (xi ) are all the same (and add up to 1):
p1 = p2 = p3 = . . . = p10
Then the (compound) event A of drawing a red ball is the subset with three
elements consisting of draw red ball one, draw red ball two, and draw red ball
three. Thus,
    P(A) = 1/10 + 1/10 + 1/10 = 3/10
Let B be the event draw a white ball. Then, since A and B are disjoint events,
the probability of drawing either a red ball or a white ball is the sum:
    P(drawing red or white) = P(A ∪ B) = P(A) + P(B) = 3/10 + 4/10 = 7/10
Proof: When N = 1, the probability that A occurs is p, and the binomial coefficient
C(1, 1) is 1. The probability that A does not occur is 1 − p, and C(1, 0) = 1 also.
The main part of the argument is an induction on N . Since the different trials are
independent, by assumption, we have
P(A occurs in k of N)
    = P(A occurs in k of the first N − 1) · P(A does not occur in the N-th)
    + P(A occurs in k − 1 of the first N − 1) · P(A occurs in the N-th)
    = C(N−1, k) p^k (1−p)^{N−1−k} · (1−p) + C(N−1, k−1) p^{k−1} (1−p)^{(N−1)−(k−1)} · p
    = [ C(N−1, k) + C(N−1, k−1) ] p^k (1−p)^{N−k}
    = C(N, k) p^k (1−p)^{N−k}

by Pascal's identity C(N−1, k) + C(N−1, k−1) = C(N, k), completing the induction.
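As a sanity check (not from the text), the recursion used in this induction can be run directly and compared with the closed form C(N, k) p^k (1−p)^{N−k}; the function name `binom_prob` below is ours, chosen for illustration:

```python
from math import comb

def binom_prob(N, k, p):
    """P(A occurs in exactly k of N independent trials), computed by the
    recursion from the proof rather than by the closed formula."""
    if N == 1:
        return p if k == 1 else (1 - p) if k == 0 else 0.0
    if k < 0 or k > N:
        return 0.0
    # either k occurrences among the first N-1 trials and none on the N-th,
    # or k-1 among the first N-1 and one on the N-th
    return binom_prob(N - 1, k, p) * (1 - p) + binom_prob(N - 1, k - 1, p) * p

p = 0.3
for k in range(11):
    closed_form = comb(10, k) * p**k * (1 - p)**(10 - k)
    assert abs(binom_prob(10, k, p) - closed_form) < 1e-12
```

The recursion and the closed form agree for every k, which is exactly what the induction asserts.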
The conditional probability of the event B, given that the event A occurs, is defined to be

    P(B|A) = P(A ∩ B) / P(A)
In effect, the phrase given that A occurs means that we replace the universe
of possible outcomes by the smaller universe A of possibilities, and renormalize
all the probabilities accordingly.
The formula P (B|A) = P (A B)/P (A) allows us to compute the conditional
probability in terms of the other two probabilities. In real-life situations, it may
be that we know P (B|A) directly, for some other reasons. If we also know P (A),
then this gives us the formula for
P (A and B) = P (A B)
namely
P (A B) = P (B|A) P (A)
Example: What is the probability that 7 heads appear in 10 flips of a fair coin
given that at least 4 heads appear? This is a direct computation of conditional
probability:
P(7 heads | at least 4 heads) = P(7 heads) / P(at least 4 heads)

    = [ C(10, 7) · 1/2^10 ] / [ ( C(10, 4) + C(10, 5) + C(10, 6) + C(10, 7) + C(10, 8) + C(10, 9) + C(10, 10) ) · 1/2^10 ]

    = C(10, 7) / ( C(10, 4) + C(10, 5) + C(10, 6) + C(10, 7) + C(10, 8) + C(10, 9) + C(10, 10) )

since the requirement of 7 heads and at least 4 is simply the requirement of 7 heads.
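The arithmetic can be carried out directly; here is a small Python check (the variable names are ours, for illustration):

```python
from math import comb

# all 2^10 outcomes are equally likely, so the conditional probability
# reduces to a ratio of counts of outcomes
p_seven = comb(10, 7) / 2**10
p_at_least_four = sum(comb(10, k) for k in range(4, 11)) / 2**10
conditional = p_seven / p_at_least_four  # C(10,7) / (C(10,4)+...+C(10,10))
```

This gives 120/848, a bit more than 1/8.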
Two subsets A, B of a probability space are independent if
P (A B) = P (A) P (B)
In simple examples, it usually happens that independence of events is due to some
fairly obvious independence of causality. Equivalently,
P (B) = P (B|A)
and equivalently
P (A) = P (A|B)
Example: Let Ω = {10, 11, . . . , 99} be the collection of all integers from 10 to 99,
inclusive. Let A be the subset of Ω consisting of integers x whose ones-place
digit is 3, and let B be the subset of integers x in Ω whose tens-place digit is 6.
Then it turns out that
P (A B) = P (A) P (B)
so, by definition, these two (compound) events are independent. Usually we expect
an explanation for an independence result, rather than just numerical verification
that the probabilities behave as indicated. In the present case, the point is that
there is no causal relation between the ones-place and tens-place digits in this
example.
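The claimed numerical verification is easy to carry out; a short Python sketch (the names `omega`, `A`, `B`, `P` are ours):

```python
# the 90 equally likely outcomes 10..99
omega = range(10, 100)
A = {x for x in omega if x % 10 == 3}    # ones-place digit is 3
B = {x for x in omega if x // 10 == 6}   # tens-place digit is 6

def P(event):
    return len(event) / len(omega)

# P(A ∩ B) = P(A) · P(B), so A and B are independent by definition
assert abs(P(A & B) - P(A) * P(B)) < 1e-12
```

Here P(A) = 9/90, P(B) = 10/90, and A ∩ B is the single integer 63, with probability 1/90 = (9/90)·(10/90).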
To model repeated events in this style, we need to use the set-theoretic idea of
cartesian product: again, the cartesian product of n sets X1, . . . , Xn is simply
the collection of all ordered n-tuples (x1, . . . , xn) (the parentheses and commas are
mandatory), where xi ∈ Xi. The notation is

    X1 × . . . × Xn = { (x1, . . . , xn) : xi ∈ Xi, 1 ≤ i ≤ n }

(No, we are not in any sense generalizing the notion of multiplication: it's just a
notation.) If all the sets Xi are the same set X, then there is a shorter notation,

    X^n = X × . . . × X    (n factors)
    P( (ω_{i1}, ω_{i2}, . . . , ω_{in}) ) = P(ω_{i1}) · P(ω_{i2}) · · · P(ω_{in})

for any n-tuple (ω_{i1}, ω_{i2}, . . . , ω_{in}). It's not hard to check that with this probability
measure Ω^n is a probability space. Further, even for compound events A1, . . ., An
in Ω, it's straightforward to show that

    P(A1 × . . . × An) = P(A1) · · · P(An)

where A1 × . . . × An is the cartesian product of the Ai's and naturally sits inside
the cartesian product Ω × . . . × Ω = Ω^n.
The idea is to imagine that (ω_{i1}, ω_{i2}, . . . , ω_{in}) is the event that ω_{i1} occurs on the
first trial, ω_{i2} on the second, and so on, until ω_{in} occurs on the n-th. Implicit in this
model is the idea that later events are independent of earlier ones. Otherwise that
manner of assigning a probability measure on the cartesian power is not appropriate!
Example: Let Ω = {H, T} with P(H) = 1/2 and P(T) = 1/2, the fair-coin-flipping
model. To model flipping a fair coin 10 times, one approach is to look at Ω^10,
which is the set of all 10-tuples of values which are either heads or tails. Each such
10-tuple is assigned the same probability, 1/2^10. Now consider the (compound)
event

    A = exactly 7 heads in 10 flips

This subset of Ω^10 consists of all (ordered!) 10-tuples with exactly 7 heads values
among them, and (by definition) the probability of A is the number of such multiplied by 1/2^10, since each such atomic event has probability 1/2^10. Thus, to
compute P(A) we need only count the number of elements of A. It is the number
of ways to choose 7 from among 10 things, which is the binomial coefficient C(10, 7).
Thus,

    P(7 heads in 10 flips) = P(A) = C(10, 7) · 1/2^10
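The counting argument can be cross-checked both exactly and by simulation; the following Python sketch (names ours) does both:

```python
from math import comb
import random

exact = comb(10, 7) / 2**10  # = 120/1024 = 0.1171875

# Monte Carlo sanity check: repeat the 10-flip experiment many times
random.seed(0)
trials = 50_000
hits = sum(1 for _ in range(trials)
           if sum(random.random() < 0.5 for _ in range(10)) == 7)
estimate = hits / trials  # should be close to the exact value
```

The simulated frequency settles near the exact 120/1024, illustrating the limiting-frequency interpretation from earlier in the chapter.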
Note that the numerical value does not depend on the exact locations of the blue
draws and the red draws, but only on the number of them. Thus, the number of ways to choose the
3 locations of the blue draws from among the 5 places is the binomial coefficient
C(5, 3). Thus,

    P(3 blues in 5 draws) = P(A) = C(5, 3) · (1/3)^3 · (2/3)^2
Remark: All our examples so far are finite probability spaces, meaning the obvious thing, that there are only finitely many elements of the set, so only finitely-many
atomic events. This restriction is not really terribly confining, and already gives
ample opportunity to illustrate many fundamental phenomena, but nevertheless we
might want to see how to treat some infinite probability spaces.
Example: Suppose we take to be the interval [0, 1] of real numbers and that
we want every real number in that interval to be equally probable to be selected.
If we try to assign equal values P(x) to the individual points x ∈ [0, 1], then, since
infinitely many of these equal values must add up to 1, we
find ourselves in an impossible situation. Instead, we
give up on the idea of assigning a probability to every subset A of = [0, 1], and
give up on the too-naive idea that
    P(A) = Σ_{x ∈ A} P(x)
and instead only assign probabilities to a restricted class of subsets. For example,
we might assign
P ([a, b]) = b a
for any subinterval [a, b] of [0, 1], and then define
    P([a1, b1] ∪ . . . ∪ [an, bn]) = P([a1, b1]) + . . . + P([an, bn])
for disjoint collections of intervals [ai , bi ]. This is a start, but we need more. In
fact, for a collection of mutually disjoint intervals
[a1 , b1 ], [a2 , b2 ], [a3 , b3 ], . . .
indexed by positive integers, we can compute the probability of the union by the
obvious formula
    P([a1, b1] ∪ [a2, b2] ∪ . . .) = P([a1, b1]) + P([a2, b2]) + . . .
(A collection indexed by the positive integers is called countable.) We could also
compute the probability measure of the complement
    {ω ∈ Ω : ω ∉ A}
as P(Ω − A) = 1 − P(A). Likewise, for a countable collection of mutually disjoint
events A1, A2, . . . the probability of the union is

    P( ⋃_{i=1}^{∞} Ai ) = Σ_{i=1}^{∞} P(Ai)
1.5 Random variables, expected values, variance
Remark: Yes, due to tradition at least, instead of the f otherwise often used
for functions, an X is used, perhaps to be more consonant with the usual use of x
for a (non-random?) variable. Further, there is a tradition that makes the values
of X be labeled xi (in conflict with the calculus tradition).
For a possible value x of X, we extend the notation by writing
    P(X = x) = P( {ω ∈ Ω : X(ω) = x} )
That is, the probability that X = x is defined to be the probability of the subset
of on which X has the value x.
The expected value of such a random variable on a probability space Ω =
{ω1, . . . , ωn} is defined to be

    E(X) = P(ω1) · X(ω1) + P(ω2) · X(ω2) + . . . + P(ωn) · X(ωn)
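The definition is a finite weighted sum, so it translates directly into code; a minimal sketch (the helper name `expected_value` and the dict representation are ours):

```python
def expected_value(P, X):
    """E(X) = P(w1)*X(w1) + ... + P(wn)*X(wn) over a finite sample space,
    with P and X given as dicts indexed by the points of the space."""
    return sum(P[w] * X[w] for w in P)

# the roll of one fair die
die_P = {i: 1 / 6 for i in range(1, 7)}
die_X = {i: i for i in range(1, 7)}
assert abs(expected_value(die_P, die_X) - 3.5) < 1e-12
```

For the fair wager below, the same helper gives expected gain 0, corroborating the intuition that the bet is fair.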
Of course, we may imagine that after a large number of independent trials
with outcomes ω_{i1}, ω_{i2}, . . ., ω_{iN} the average value

    (1/N) · ( X(ω_{i1}) + X(ω_{i2}) + . . . + X(ω_{iN}) )
will be close to E(X). But in fact we can prove such a thing, rather than just
imagine that it's true: again, it is a Law of Large Numbers.
The simplest models for the intuitive content of this idea have their origins
in gambling. For example, suppose Alice and Bob (A and B) have a fair coin
(meaning heads and tails both have probability 0.5) and the wager is that if the
coin shows heads Alice pays Bob a dollar, and if it shows tails Bob pays Alice a
dollar. Our intuition tells us that this is fair, and the expected value computation
corroborates this, as follows. The sample space is Ω = {ω0, ω1} (index 0 for heads
and 1 for tails), with each point having probability 0.5. Let X be the random
variable which measures Alice's gain (or loss):

    X(ω0) = −1        X(ω1) = +1
It is not true that we can take the average of the values 0 to 10 first (namely, 5) and square it
(getting 25) to obtain the expected value.
    E(X + Y) = Σ_i pi · ( X(ωi) + Y(ωi) ) = Σ_i pi · X(ωi) + Σ_i pi · Y(ωi) = E(X) + E(Y)    ///

and

    E(c · X) = Σ_i pi · c · X(ωi) = c · Σ_i pi · X(ωi) = c · E(X)    ///
Let be a sample space. Let X and Y be random variables on . The product
random variable XY is defined on the sample space in the reasonable way:
    (XY)(ω) = X(ω) · Y(ω)
These two random variables X and Y are independent random variables if for
every pair x, y of possible values of X, Y , we have
P (X = x and Y = y) = P (X = x) P (Y = y)
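A concrete instance, sketched in Python (the product-space model and names are ours): two fair dice modeled on the cartesian product behave as independent random variables, and the expectation of their product factors accordingly.

```python
import itertools

# two independent fair dice, modeled on the product space
omega = list(itertools.product(range(1, 7), range(1, 7)))
P = 1 / len(omega)                       # each of the 36 pairs equally likely

E_X = sum(x * P for x, y in omega)       # first die
E_Y = sum(y * P for x, y in omega)       # second die
E_XY = sum(x * y * P for x, y in omega)  # product random variable
```

Here E_X = E_Y = 3.5 and E_XY = 12.25 = 3.5 · 3.5, consistent with independence.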
    E(XY) = Σ_ω P(ω) · (XY)(ω) = Σ_ω P(ω) · X(ω) · Y(ω)

To prove the proposition gracefully it is wise to use the notation introduced above:
let x range over possible values of X and let y range over possible values of Y.
Then we can rewrite the expected value by grouping according to values of X and
Y: it is

    Σ_{x,y} Σ_{ω : X(ω)=x, Y(ω)=y} P(ω) · X(ω) · Y(ω)
    = Σ_{x,y} P(X = x and Y = y) · x · y
    = Σ_{x,y} P(X = x) · P(Y = y) · x · y
    = ( Σ_x P(X = x) · x ) · ( Σ_y P(Y = y) · y ) = E(X) · E(Y)

using the independence in the middle step.
///
Remark: If X and Y are not independent the conclusion of the previous proposition may be false. For example, let X and Y both be the number of heads obtained
in a single flip of a fair coin. Then XY = X = Y , and we compute that
    E(X) = E(Y) = E(XY) = P(head) · 1 + P(tail) · 0 = (1/2) · 1 + (1/2) · 0 = 1/2
Then

    E(XY) = 1/2 ≠ 1/4 = (1/2) · (1/2) = E(X) · E(Y)
An important case of independent random variables arises when several independent trials are conducted (with the same experiment). Let Ω be a sample
space. Consider N independent trials. Consider the product

    Ω^N = Ω × . . . × Ω    (N factors)

In particular, let X be the random variable counting the number of successes in n
independent trials, where each trial has probability p of success. Then X has the
binomial distribution

    P(X = i) = C(n, i) · p^i · (1 − p)^{n−i}    (for 0 ≤ i ≤ n)
    P(X = i) = 0    (otherwise)
Remark: The expected value assertion is certainly intuitively plausible, and there
are also easier arguments than what we give below, but it seems reasonable to warm
up to the variance computation by a similar but easier computation of the expected
value.
Proof: This computation will illustrate the use of generating functions to evaluate
naturally occurring but complicated looking expressions. Let q = 1 p.
First, let's get an expression for the expected value of X: from the definition,

    EX = Σ_{i=0}^{n} i · P(X = i) = Σ_{i=0}^{n} i · C(n, i) p^i q^{n−i}
An astute person who remembered the binomial theorem might remember that it
asserts exactly that the analogous summation without the factor i in front of each
term is simply the expanded form of (p + q)n :
    Σ_{i=0}^{n} C(n, i) p^i q^{n−i} = (p + q)^n
This is encouraging! The other key point is to notice that if we differentiate the
latter expression with respect to p, without continuing to require q = 1 p, we get
    Σ_{i=0}^{n} C(n, i) · i · p^{i−1} q^{n−i} = n · (p + q)^{n−1}
The left-hand side is nearly the desired expression, but were missing a power of p
throughout. To remedy this, multiply both sides of the equality by p, to obtain
    Σ_{i=0}^{n} C(n, i) · i · p^i q^{n−i} = n · p · (p + q)^{n−1}
Once again requiring that p + q = 1, this simplifies to give the expected value
    EX = Σ_{i=0}^{n} i · C(n, i) p^i q^{n−i} = n · p
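The closed form EX = np can be confirmed exactly, with no floating-point slack, using rational arithmetic; a short Python sketch (the sample values n = 12, p = 3/10 are ours):

```python
from fractions import Fraction
from math import comb

# exact rational arithmetic, so the identity holds on the nose
n, p = 12, Fraction(3, 10)
q = 1 - p
EX = sum(i * comb(n, i) * p**i * q**(n - i) for i in range(n + 1))
assert EX == n * p
```

The sum collapses to exactly np = 18/5, matching the generating-function derivation above.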
To compute the variance we first compute E(X²), which by definition is

    E(X²) = Σ_{k=0}^{n} P(X = k) · k²
As usual, there are C(n, k) ways to have exactly k 1's, and each way occurs with
probability p^k q^{n−k}. Thus,

    E(X²) = Σ_{k=0}^{n} k² · C(n, k) p^k q^{n−k}
This is very similar to the expression that occurred above in computing the
expected value, but now we have the extra factor k² in front of each term instead
of k. But of course we might repeat the trick we used above and see what happens:
since

    p · (d/dp) p^k = k · p^k

then by repeating it we have

    ( p · d/dp )( p · d/dp ) p^k = k² · p^k

Thus,

    E(X²) = Σ_{k=0}^{n} C(n, k) [ ( p · d/dp )( p · d/dp ) p^k ] q^{n−k}
          = ( p · d/dp )( p · d/dp ) Σ_{k=0}^{n} C(n, k) p^k q^{n−k}
          = ( p · d/dp )( p · d/dp ) (p + q)^n

since after getting the k² out from inside the sum we can recognize the binomial
expansion. Taking derivatives gives

    ( p · d/dp )( p · d/dp ) (p + q)^n = p · (d/dp) [ n p (p + q)^{n−1} ]
          = p [ n (p + q)^{n−1} + n(n−1) p (p + q)^{n−2} ]

Using p + q = 1 gives
    E(X²) = p( n + p · n(n−1) )
So then
    σ² = E(X²) − μ² = p( n + p · n(n−1) ) − (pn)² = pn + p²n² − p²n − p²n²
       = p(1 − p) · n
This finishes the computation of the variance of a binomial distribution.
///
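As with the expected value, the variance formula σ² = p(1−p)n can be verified exactly with rational arithmetic; the following Python sketch (sample values n = 9, p = 1/4 are ours) recomputes E(X) and E(X²) from the definitions:

```python
from fractions import Fraction
from math import comb

n, p = 9, Fraction(1, 4)
q = 1 - p
EX = sum(k * comb(n, k) * p**k * q**(n - k) for k in range(n + 1))
EX2 = sum(k * k * comb(n, k) * p**k * q**(n - k) for k in range(n + 1))
variance = EX2 - EX**2
assert variance == p * (1 - p) * n  # = sigma^2 of the binomial distribution
```

Here the variance comes out to exactly (1/4)(3/4)·9 = 27/16, as the formula predicts.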
Remark: The critical or skeptical reader might notice that there's something
suspicious about differentiating with respect to p in the above arguments, as if p and
q were independent variables, when in fact p+q = 1. Indeed, if a person had decided
that p was a constant, then they might feel inhibited about differentiating with
respect to it at all. But, in fact, there is no imperative to invoke the relationship
p + q = 1 until after the differentiation, so the computation is legitimate.
    P( f(X) ≥ a ) ≤ E f(X) / a

Proof: Let χ be the auxiliary random variable defined by

    χ(t) = 1    (if f(t) ≥ a)
    χ(t) = 0    (if f(t) < a)

Then

    f(X) ≥ a · χ(X)

Note that the expected value of the random variable χ(X) is simply the probability
that f(X) ≥ a:

    E χ(X) = Σ_x P(χ(X) = x) · x = P(χ(X) = 0) · 0 + P(χ(X) = 1) · 1 = P( f(X) ≥ a )

by the definition of χ. Taking the expected value of both sides of f(X) ≥ a · χ(X)
gives

    E f(X) ≥ a · E χ(X) = a · P( f(X) ≥ a )

by the previous observation.
///
    P( |X − E(X)| ≥ ε ) ≤ σ²(X) / ε²

Proof: This follows directly from the Markov inequality, with f(X) = (X − E(X))²
and a = ε².
Proof: We will obtain this by making a good choice of the parameter in Chebysheff's inequality. We know from computations above that E(Xn) = p · n and
σ²(Xn) = p(1 − p) · n. Chebysheff's inequality asserts in general that

    P( |X − E(X)| > t · σ(X) ) < 1 / t²

Take t = ε n / σ(Xn) = ε n / √( p(1 − p) n ) to obtain

    P( |Xn − p n| > ε n ) < p(1 − p) / (n ε²)
///
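The bound is quite weak but easy to observe empirically; a simulation sketch in Python (parameter choices and names are ours), comparing the observed frequency of large deviations with the Chebysheff bound:

```python
import random

# n flips of a fair coin; Chebysheff bound on P(|X_n - pn| > eps*n)
n, p, eps = 1000, 0.5, 0.05
bound = p * (1 - p) / (n * eps**2)  # here 0.1

random.seed(1)
trials = 2000
exceed = sum(1 for _ in range(trials)
             if abs(sum(random.random() < p for _ in range(n)) - p * n) > eps * n)
# the observed frequency of large deviations stays below the bound
assert exceed / trials <= bound
```

In practice deviations of more than 50 heads from the mean of 500 are far rarer than the bound of 1/10 suggests, which is typical of Chebysheff-type estimates.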
Exercises
1.01
How many elements are in the set {1, 2, 2, 3, 3, 4, 5}? How many are in the
set {1, 2, {2}, 3, {3}, 4, 5}? In {1, 2, {2, 3}, 3, 4, 5}? (ans.)
1.02
1.03
List all the elements of the power set (set of subsets) of {1, 2, 3}. (ans.)
1.04
Let A = {1, 2, 3} and B = {2, 3}. List (without repetition) all the elements
of the cartesian product set A × B. (ans.)
1.05
How many different ways are there to order the set {1, 2, 3, 4}? (ans.)
1.06
How many choices of 3 things from the list 1, 2, 3, . . . , 9, 10 are there? (ans.)
1.07
1.08
How many different choices are there of an unordered pair of distinct numbers
from the set {1, 2, . . . , 9, 10}? How many choices of ordered pair are there?
(ans.)
1.09
How many functions are there from the set {1, 2, 3} to the set {2, 3, 4, 5}?
(ans.)
1.10
How many injective functions are there from {1, 2, 3} to {1, 2, 3, 4}? (ans.)
1.11
How many injective functions are there from {1, 2, 3} to {1, 2, 3, 4, 5}?
1.12
How many surjective functions are there from {1, 2, 3, 4} to {1, 2, 3}? (ans.)
1.13
How many surjective functions are there from {1, 2, 3, 4, 5} to {1, 2, 3, 4}?
1.14
How many surjective functions are there from {1, 2, 3, 4, 5} to {1, 2, 3}?
1.15
Prove a formula for the number of injective functions from an m-element set
to an n-element set.
1.16
Let S(m, n) denote the number of surjective functions from an m-element set
to an n-element set. Show that

    S(m, n) = n^m − Σ_{i=1}^{n−1} C(n, i) · S(m, i)
1.17
1.18
1.19
1.21
Verify that the sum of all binomial coefficients C(n, k) with 0 ≤ k ≤ n is 2^n.
(ans.)
Verify that the sum of the expressions (−1)^k · C(n, k) with 0 ≤ k ≤ n is 0.
1.22
How many subsets of all sizes are there of a set S with n elements? (ans.)
1.23
How many pairs are there of disjoint subsets A, B each with 3 elements
inside the set {1, 2, 3, 4, 5, 6, 7, 8}? (ans.)
1.24
1.25
Give a bijection from the collection of all integers to the collection of nonnegative integers. (ans.)
1.26
(*) Give a bijection from the collection of all positive integers to the collection
of all rational numbers.
1.27
(**) This illustrates a hazard in a too naive notion of a rule for forming a
set. Let S be the set of all sets which are not an element of themselves.
That is, let
    S = { sets x : x ∉ x }

Is S ∈ S or is S ∉ S? (Hint: Assuming either that S is or isn't an element
of itself leads to a contradiction. What's going on?)
1.28
1.29
What is the probability that there will be strictly more heads than tails out
of 10 flips of a fair coin? Out of 20 flips? (ans.)
1.30
If there are 3 red balls and 7 blue balls in an urn, what is the probability
that in two trials two red balls will be drawn? (ans.)
1.31
If there are 3 red balls and 7 blue balls in an urn, what is the probability
that in 10 trials at least 4 red balls will be drawn?
1.32
Prove that
    1 + 2 + 3 + 4 + . . . + (n − 1) + n = n(n + 1)/2
1.33
A die is a small cube with numbers 1-6 on its six sides. A roll of two dice
has an outcome which is the sum of the upward-facing sides of the two, so
is an integer in the range 2-12. A die is fair if any one of its six sides is as
likely to come up as any other. What is the probability that a roll of two
fair dice will give either a 7 or an 8? What is the probability of a 2?
1.34
What is the probability that there will be fewer than (or exactly) N heads
out of 3N flips of a fair coin?
1.35
(*) You know that in a certain house there are two children, but you do
not know their genders. You know that each child has a 50-50 chance of
being either gender. When you go to the door and knock, a girl answers the
door. What is the probability of the other child being a boy? (False hint:
out of the 4 possibilities girl-girl, girl-boy, boy-girl, boy-boy, only the first
3 occur since you know there is at least one girl in the house. Of those 3
possibilities, in 2/3 of the cases in addition to a girl there is a boy. So (?) if
a girl answers the door then the probability is 2/3 that the other child is a
boy.) (Comment: In the submicroscopic world of elementary particles, the
behavior of the family of particles known as bosons is contrary to the correct
macroscopic principle illustrated by this exercise, while fermions behave in
the manner indicated by this exercise.)
1.36
The Birthday Paradox: Show that the probability is greater than 1/2 that,
out of a given group of 24 people, at least two will have the same birthday.
1.37
(*) The Monty Hall paradox You are in a game show in which contestants
choose one of three doors, knowing that behind one of the three is a good
prize, and behind the others nothing of any consequence. After you've chosen
one door, the gameshow host (Monty Hall) always shows you that behind
one of the other doors there is nothing and offers you the chance to change
your selection. Should you change? (What is the probability that the prize
is behind the door you did not initially choose? What is the probability that
the prize is behind the other closed door?)
1.38
(**) Suppose that two real numbers are chosen at random between 0 and
1. What is the probability that their sum is greater than 1? What is the
probability that their product is greater than 1/2?
1.39
If there are 3 red balls in an urn and 7 black balls, what is the expected
number of red balls to be drawn in 20 trials (replacing whatever ball is
drawn in each trial)? (ans.)
1.40
1.41
What is the expected number of coin flips before a head comes up (with a
fair coin)?
1.42
What is the expected number of coin flips before two consecutive heads come
up?
1.43
1.44
1.45
1.46
(*) What is the expected number of coin flips before n consecutive heads
come up?
1.47
(*) Choose two real numbers at random from the interval [0, 1]. What is
the expected value of their sum? product?
1.48
Compute the variance of the random variable which tells the result of the
roll of one fair die.
1.49
Compute the variance of the random variable which tells the sum of the
result of the roll of two fair dice.
1.50
Compute the variance of the random variable which tells the sum of the
result of the roll of three fair dice.
1.51
(*) Compute the variance of the random variable which tells the sum of the
result of the roll of n fair dice.
1.52
(*) Consider a coin which has probability p of heads. Let X be the random
variable which tells how long before 2 heads in a row come up. What is the
variance of X?
1.53
Gracefully estimate the probability that in 100 flips of a fair coin the number
of heads will be at least 40 and no more than 60. (ans.)
1.54
Gracefully estimate the probability that in 1000 flips of a fair coin the number of heads will be at least 400 and no more than 600. (ans.)
1.55
Gracefully estimate the probability that in 10,000 flips of a fair coin the
number of heads will be at least 4000 and no more than 6000. (ans.)
1.56
With a coin that has probability only 1/10 of coming up heads, show that
the probability is less than 1/9 that in 100 flips the number of heads will be
more than 20. (ans.)
1.57
With a coin that has probability only 1/10 of coming up heads, show that
the probability is less than 1/900 that in 10,000 flips the number of heads
will be less than 2000.
2
Information

2.1 Uncertainty, acquisition of information
2.2 Definition of entropy
The words uncertainty, information, and redundancy all have some intuitive content. The term entropy from thermodynamics may suggest a related
notion, namely a degree of disorder. We can make this more precise, and in our
context we will decide that the three things, uncertainty, information, and entropy,
all refer to roughly the same thing, while redundancy refers to lack of uncertainty.
Noiseless coding addresses the issue of organizing information well for transmission, by adroitly removing redundancy. It does not address issues about noise or
any other sort of errors. The most visible example of noiseless coding is compression of data, although abbreviations, shorthand, and symbols are equally important
examples.
The other fundamental problem is noisy coding, more often called error-correcting coding, meaning to adroitly add redundancy to make information
The first big result in noiseless coding is that the entropy of a memoryless
source gives a lower bound on the length of a code which encodes the source.
And the average word length of such a code is bounded in terms of the entropy.
This should be interpreted as a not-too-surprising assertion that the entropy of a
source correctly embodies the notion of how much information the source emits.
The outcome of the roll of a single fair die (with faces 1-6) is more uncertain
than the toss of a coin: there are more things that can happen, each of which has
rather small probability.
On the other hand, we can talk in a similar manner about acquisition of information. For example, in a message consisting of ordinary English, the completion
of the fragment
Because the weather forecast called for rain, she took her...
to
Because the weather forecast called for rain, she took her umbrella.
imparts very little further information. While it's true that the sentence might
have ended boots instead, we have a clear picture of where the sentence is going.
By contrast, completion of the fragment
The weather forecast called for...
to
The weather forecast called for rain.
imparts a relatively large amount of information, since the first part of the sentence
gives no clues to its ending. Even more uncertainty remains in trying to complete
a sentence like
Then he surprised everyone by...
and commensurately more information is acquired when we know the completion.
In a related direction: the reason we are able to skim newspapers and other
lightweight text so quickly is that most of the words are not at all vital to the content, so if we ignore many of them the message still comes through: the information
content is low, and information is repeated. By contrast, technical writing is harder
to read, because it is more concise, thereby not allowing us to skip over things. It is
usually not as repetitive as more ordinary text. What concise means here is that
it lacks redundancy (meaning that it does not repeat itself). Equivalently, there
is a high information rate.
Looking at the somewhat lower-level structure of language: most isolated typographical errors in ordinary text are not hard to correct. This is because of the
redundancy of natural languages such as English. For example,
The sun was shining brghtly.
is easy to correct to
The sun was shining brightly.
In fact, in this particular example, the modifier brightly is hardly necessary at all:
the content would be almost identical if the word were omitted entirely. By contrast,
typographical errors are somewhat harder to detect and correct in technical writing
than in ordinary prose, because there is less redundancy, especially without a larger
context.
Note that correction of typos is a lower-level task than replacing missing words,
since it relies more upon recognition of what might or might not be an English word
rather than upon understanding the content. Corrections based upon meaning
would be called semantics-based correction, while corrections based upon misspelling or grammatical errors would be syntax-based correction. Syntax-based
correction is clearly easier to automate than semantics-based correction, since the
rules for semantics are much more complicated than the rules for spelling and
grammar (which many people find complicated enough already).
Still, not every typo is easy to fix, because sometimes they occur at critical
points in a sentence:
I cano go with you.
In this example, the cano could be either can with a random o stuck on its
end, or else either cannot with two omissions or maybe cant with two errors. By
contrast, errors in a different part of the message, as in
I can go wih you.
are easier to fix. In the first of these two examples, there would be a lot of information imparted by fixing the typo, but in the second case very little. In other
words, in the first case there was high uncertainty, but in the second not.
Let's look at several examples of the loss of intelligibility of a one-line sentence
subjected to a 12% rate of random errors. That is, for purposes of this example,
we'll randomly change about 12% of the letters to something else. We'll do this
several times to see the various effects. Starting with
Experiment and pattern recognition are important in number theory.
we get
Dxpbviment and pattecn recognition arx iqporxant in kumder theofy.
Expurkmest and pattetn rncognition zrp impoxtant in number theocv.
Expecimeno and pattern recognition ake imboltanj in number thporq.
Experimect utk pattern regognitoon ame important in nkmber theoxy.
Experiment and pattern rncognltion xre important in yumbwr qheory.
Expkriment and pattern recognition bre importajt ip number tceory.
Ewperiment and gattern ieungnition are impjrtdlt in numwer theory.
Experiment awk gattern recognition are important jr qumbea tkeosj.
Euperiment anm paltern recognition are importanr in numbew tpvory.
Exmeriment and piztkrn recognition are importgnt in number theory.
Several things should be observed here. First, the impact on clarity and correctability depends greatly on which letters get altered. For example, the word number is
sensitive in this regard. Second, although the average rate of errors is 12%, sometimes more errors than this occur, and sometimes fewer. And the distribution of
errors is not regular. That is, a 12% error rate does not simply mean that every
8th letter is changed, but only expresses an average. Among the above 10 samples,
in at least 2 the meaning seems quite obscure. Third, using more than one of the
mangled sentences makes it very easy to infer the correct original.
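The corruption and the majority-vote recovery are both easy to reproduce; a Python sketch (the function name `corrupt` and the seed are ours, for illustration):

```python
import random
import string

def corrupt(text, rate, rng):
    """Replace roughly `rate` of the letters with random lowercase letters."""
    return ''.join(rng.choice(string.ascii_lowercase)
                   if ch.isalpha() and rng.random() < rate else ch
                   for ch in text)

rng = random.Random(2)
original = "Experiment and pattern recognition are important in number theory."
samples = [corrupt(original, 0.12, rng) for _ in range(10)]

# a per-position majority vote over the 10 garbled retransmissions;
# it usually (though not always) recovers the original exactly
recovered = ''.join(max(set(chars), key=chars.count)
                    for chars in zip(*samples))
```

Since the errors at each position scatter over many different wrong letters while the correct letter keeps recurring, the vote strongly favors the true message, exactly as described above.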
With an error rate of 20%, there are frequent serious problems in intelligibility:
perhaps none of the ten samples retains its meaning if presented in isolation. From
the same phrase as above
Dxpbviment and pattecn recognition arx dmpottant in kumder theofy.
Expurkmest and pathgrn abcognitiom lre imvortanl vn numser teeory.
Pxpefiment cnd patrern recogibtiyz ure yvmortnnt iy nmmber thodry.
Edwhriment anh putzern mecovnition arl mmportanq im number theory.
Experimewt ang patjern recognition ace iepootant in uumber thkory.
Experiment and patuerj rgcocnitkon gre ihportans in numbej tyeoul.
Vxhdpiment and patoejc rvcognioion are important in ndtbvr theory.
Experiment and pattern rfgojsitreq asp ijportant in wvhber theory.
Exaegiment and paryern rectgrikion aoj imuovtant en thmbyr theory.
Expedimctt anc katcern recagnition rre impertant in numbzr theory.
In these 10 examples few of the words are recognizable. That is, looking for
an English word whose spelling is close to the given, presumably misspelled, word
does not succeed on a majority of the words in these garbled fragments. This is
because so many letters have been changed that there are too many equally plausible
possibilities for correction. Even using semantic information, these sentences are
mostly too garbled to allow recovery of the message.
Notice, though, that when we have, in effect, 9 retransmissions of the original
(each garbled in its own way) it is possible to make inferences about the original
message. For example, the 10 messages can have a majority vote on the correct
letter at each spot in the true message. Ironically, the fact that there are so many
different error possibilities but only one correct possibility makes it easy for the
correct message to win such votes. But a large number of repetitions is an inefficient
method for compensating for noise in a communications channel.
Another version of noise might result in erasures of some characters. Thus,
we might be assured that any letter that comes through is correct, but some are
simply omitted.
One point of this discussion is that while English (or any other natural language) has quite a bit of redundancy in it, this redundancy is unevenly distributed.
In other words, the information in English is not uniformly distributed but is concentrated at some spots and thin at others.
Another way to illustrate the redundancy is to recall an advertisement from
the New York subways of years ago:
F u cn rd ths, u cn gt a gd jb.
An adroit selection of about 40% of the letters was removed, but this is still intelligible.
2.2 Definition of entropy
    I(H) = log2 ( 1 / (1/2) ) = 1        I(T) = log2 ( 1 / (1/2) ) = 1

This simplest example motivates the name for the unit of information, the bit.
The entropy of a sample space Ω is the expected value of the self-information of
(atomic) events in Ω. That is, with the notation as just above,

    entropy of sample space = H(Ω) = Σ_{1≤i≤n} P(ωi) · I(ωi) = − Σ_{1≤i≤n} P(ωi) · log2 P(ωi)
That is, only the probabilities matter, not their ordering or labeling.
H(p1, . . . , pn) ≥ 0, and is 0 only if one of the pi's is 1. That is, uncertainty
disappears entirely only if there is no randomness present.
H(p1, . . . , pn) = H(p1, . . . , pn, 0). That is, impossible outcomes do not contribute to uncertainty.
    H(1/n, . . . , 1/n) ≤ H(1/(n+1), . . . , 1/(n+1))

(with n entries inside the left-hand side and n + 1 entries inside the right).
This is about conditional probabilities and a sensible requirement about uncertainty in such a situation. That is, we group the outcomes of an experiment
into two subsets and then say that the uncertainty is the uncertainty of which
batch the outcome falls into, plus the weighted sum of the uncertainties about
exactly where the outcome falls in the subsets.
Theorem: Any entropy function H(p1, . . . , pn) meeting the above conditions is a positive scalar multiple of
H(p1, . . . , pn) = − Σ_i pi log2 pi
(A term with pi = 0 is interpreted as 0, consistent with the fact that x log2 x → 0 as x → 0+.)
Remark: The logarithm is taken base 2 for historical reasons. Changing the base
of the logarithm to any other number b > 1 merely uniformly divides the values
of the entropy function by log2 b. Thus, for comparison of the relative uncertainty
of different sets of probabilities, it doesn't really matter what base is used for the
logarithm. But base 2 is traditional and also does make some answers come out
nicely. Some early work by Hartley used logarithms base 10 instead, and in that
case the unit of information or entropy is the Hartley, which possibly sounds more
exotic.
Remark: The units for entropy are also bits, since we view entropy as an expected value (thus, a kind of average) of information, whose unit is the bit. This is
compatible with the other use of bit (for binary digit), as the coin-flipping example
illustrates.
Example: The entropy in a single toss of a fair coin is
H(coin) = H(1/2, 1/2) = − (1/2) log2 (1/2) − (1/2) log2 (1/2)
= − (1/2)(−1) − (1/2)(−1) = 1 bit
Indeed, one might imagine that such a coin toss is a basic unit of information.
Further, if we label the coin 0 and 1 instead of heads and tails, then such a
coin toss exactly determines the value of a bit.
Example: The entropy in a single roll of a fair die is
H(1/6, 1/6, 1/6, 1/6, 1/6, 1/6) = − Σ_{i=1}^{6} (1/6) log2 (1/6) = log2 6 ≈ 2.58496250072 bits
Example: The entropy in the sum of a roll of two fair dice is
H(1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, 1/36)
= − (1/36) log2 (1/36) − (2/36) log2 (2/36) − . . . − (1/36) log2 (1/36) ≈ 3.27440191929 bits
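These values are easy to confirm numerically. The following is a small Python sketch (mine, not the text's) computing the same entropies:

```python
from math import log2

def entropy(probs):
    """H(p1, ..., pn) = -sum(pi * log2 pi), in bits; terms with pi = 0
    contribute 0, by the usual convention."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([1/2, 1/2]))      # 1.0 bit: a fair coin
print(entropy([1/6] * 6))       # ≈ 2.585 bits: a fair die, log2(6)

# Probabilities for the sum of two dice: 1/36, 2/36, ..., 6/36, ..., 1/36.
two_dice = [k / 36 for k in [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]]
print(entropy(two_dice))        # ≈ 3.2744 bits
```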
Example: The entropy in a single letter of English (assuming that the various letters will occur with probability equal to their frequencies in typical English) is approximately
H(letter of English) ≈ 4.19
(This is based on empirical information: e occurs about 11% of the time, t occurs about 9% of the time, etc.) By contrast, if all letters were equally likely, then the entropy would be somewhat larger, about
H(1/26, . . . , 1/26) = log2 (26) ≈ 4.7
Remark: The proof that the axioms uniquely characterize entropy is hard, and not necessary for us, so we'll skip it. But an interested reader can certainly use basic properties of logarithms (and a bit of algebra and basic probability) to verify that
H(p1, . . . , pn) = − Σ_i pi log2 pi
meets the conditions, even if it's not so easy to prove that nothing else does.
Joint entropy of a collection X1, . . . , Xn of random variables is defined in the reasonably obvious manner:
H(X1, . . . , Xn) = − Σ_{x1, ..., xn} P(X1 = x1, . . . , Xn = xn) log2 P(X1 = x1, . . . , Xn = xn)
Proposition: (The Fundamental Inequality) Fix p1, . . . , pn with pi ≥ 0 and Σ_i pi = 1. Let q1, . . . , qn vary, subject only to the restriction that qi ≥ 0 for all indices, and Σ_i qi = 1. Then
min over q1, . . . , qn of ( − Σ_i pi log2 qi ) = − Σ_i pi log2 pi
with the minimum occurring only when qi = pi for all indices.
Proof: First, from looking at the graph of ln x we see that the tangent line at x = 1 lies above the graph of ln x and touches it only at x = 1. That is,
ln x ≤ x − 1
with equality only for x = 1. And since
log2 x = (log2 e) ln x
we have
log2 x ≤ (log2 e)(x − 1)
with equality only for x = 1. Then replace x by q/p to obtain
log2 (q/p) ≤ (log2 e)(q/p − 1)
Multiply through by p to get
p log2 (q/p) ≤ (log2 e)(q − p)
and then
p log2 q ≤ p log2 p + (log2 e)(q − p)
with equality occurring only for q = p. Replacing p, q by pi and qi and adding the resulting inequalities, we have
Σ_i pi log2 qi ≤ Σ_i pi log2 pi + (log2 e) Σ_i (qi − pi)
Since Σ_i pi = 1 and Σ_i qi = 1, this simplifies to
Σ_i pi log2 qi ≤ Σ_i pi log2 pi
Multiplying through by −1 reverses the order of inequality and gives the assertion.
///
Corollary: H(p1, . . . , pn) ≤ log2 n, with equality only when pi = 1/n for all indices.
Proof: This corollary follows from the previous inequality by letting qi = 1/n. ///
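The Fundamental Inequality and its corollary can be spot-checked numerically. The following sketch (an illustration, not part of the text) tests both on randomly generated distributions:

```python
import random
from math import log2

def cross_entropy(p, q):
    """-sum(pi * log2 qi); equals the entropy H(p) when q = p."""
    return -sum(pi * log2(qi) for pi, qi in zip(p, q) if pi > 0)

def random_distribution(n):
    xs = [random.random() for _ in range(n)]
    s = sum(xs)
    return [x / s for x in xs]

random.seed(0)
for _ in range(1000):
    n = random.randint(2, 8)
    p, q = random_distribution(n), random_distribution(n)
    H = cross_entropy(p, p)                   # the entropy of p
    assert cross_entropy(p, q) >= H - 1e-12   # minimized only at q = p
    assert H <= log2(n) + 1e-12               # corollary, via qi = 1/n
print("all checks passed")
```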
Set
pi = P(X = xi)
qj = P(Y = yj)
rij = P(X = xi, Y = yj)
We will use the fact that for fixed i we have Σ_j rij = pi, and for fixed j we have Σ_i rij = qj. Then compute directly:
H(X) + H(Y) = − Σ_i pi log2 pi − Σ_j qj log2 qj = − Σ_{ij} rij log2 pi − Σ_{ij} rij log2 qj = − Σ_{ij} rij log2 (pi qj)
The conditional entropy is
H(X|Y) = Σ_j P(Y = yj) H(X | Y = yj)
where we use the previous notion of entropy with respect to the subset where Y = yj. The idea here is that H(X|Y) is the amount of uncertainty or entropy remaining in X after Y is known.
It is pretty easy to check that
H(X|X) = 0
(so knowing the outcome of X removes all uncertainty about the outcome of X,
which seems fair) and that
H(X|Y ) = H(X) if X and Y are independent
since the independence should presumably mean that no information about X is
imparted by knowing about Y .
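Both facts can be checked with a small sketch computing H(X|Y) from a joint distribution table (the table values below are my own illustrative choices):

```python
from math import log2

def H(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

def conditional_entropy(joint):
    """H(X|Y) = sum over j of P(Y = yj) * H(X | Y = yj),
    where joint[i][j] = P(X = xi, Y = yj)."""
    total = 0.0
    for j in range(len(joint[0])):
        qj = sum(row[j] for row in joint)   # P(Y = yj)
        if qj > 0:
            total += qj * H([row[j] / qj for row in joint])
    return total

# X and Y independent, with P(X) = (1/2, 1/2): H(X|Y) = H(X) = 1 bit.
print(conditional_entropy([[1/8, 3/8], [1/8, 3/8]]))   # 1.0

# X determined by Y (diagonal joint distribution): H(X|Y) = 0.
print(conditional_entropy([[1/2, 0.0], [0.0, 1/2]]))   # 0.0
```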
proto-Theorem: We have H(X|Y) = 0 if and only if X is a function of Y.
The previous proto-theorem, which is not a real theorem because the phrase "is a function of" needs clarification, is really a special case of the following genuine theorem.
Exercises
2.01 Compute the entropy of a source with probabilities . . . , 1/16, 1/16, 1/16.
2.02 Compute the entropy of a source with probabilities . . . , 2/27, . . . , 1/12, 1/24.
2.03 Compute the entropy of a source with probabilities . . . , 2/81, 2/243, 1/243.
2.04 Determine the entropy of the random variable which counts the number of heads in flipping three fair coins.
2.05 Determine the entropy of the random variable which gives the sum of three dice.
3 Noiseless Coding
3.1 Noiseless coding
3.2 Kraft and McMillan inequalities
3.3 Noiseless coding theorem
3.4 Huffman encoding
Noiseless coding addresses the issue of organizing information well for transmission, by adroitly removing redundancy. It does not address issues about noise or
any other sort of errors. The most visible example of noiseless coding is compression of data, although abbreviations, shorthand, and symbols are equally important
examples.
The other fundamental problem is noisy coding, more often called error-correcting coding, meaning to adroitly add redundancy to make information robust against noise and other errors.
The first big result in noiseless coding is that the entropy of a memoryless
source gives a lower bound on the length of a code which encodes the source.
And the average word length of such a code is bounded in terms of the entropy.
This should be interpreted as a not-too-surprising assertion that the entropy of a
source correctly embodies the notion of how much information the source emits.
3.1 Noiseless coding
Example: The simplest sort of source is a gadget which emits a stream of 0s and 1s with equal probabilities. In this case, each of the random variables X1, X2, . . . has distribution
P(Xi = 0) = 1/2    P(Xi = 1) = 1/2
Remark: The most general type of source, with no assumptions about interdependence, would be called a stochastic source. Such sources are probably too general to say much about. A more restricted model is a Markov source, meaning that there is a fixed T so that the nth word emitted (that is, the value of the random variable Xn) depends only on the T previous values X_{n−1}, . . . , X_{n−T}. A yet more restricted model is a stationary Markov source, which is a Markov source in which the form of the dependence of Xn on X_{n−1}, . . . , X_{n−T} does not depend upon the time index n. But for our present purposes things are complicated enough already for the simpler i.i.d. model.
An alphabet is simply a finite set. The elements of the set are the characters of the alphabet. For any alphabet Σ, denote by Σ* the set of all finite strings composed of characters from Σ. That is, Σ* is the collection of finite ordered lists of elements of Σ.
For example, the alphabet might simply be Σ = {0, 1}, or it might be Σ = {a, b, . . . , y, z}. Or more characters might be included. It is clear that the precise nature of the characters does not matter, but perhaps at most the number of characters in the alphabet.
A code or encoding f of a (memoryless) source S = X1, X2, . . . (emitting sourcewords in a set W) into codeword strings over an alphabet Σ is simply a map (that is, function)
f : W → Σ*
We extend the definition of the code f by making it behave reasonably with respect to concatenation of strings: define
f(concatenation w1 w2 . . . wn) = concatenation f(w1) f(w2) . . . f(wn)
Example: For encoding the English alphabet into dots and dashes for telegraph
transmission, the alphabet is {dot, dash}, and the collection W of words can be
simply the usual English alphabet. An important design feature of this encoding is
that more commonly used letters such as e and t have shorter expressions in the
Morse alphabet, for example. (Notice that in this case the words are not words
in the colloquial sense, but instead are letters in the English alphabet.)
Example: Another example is the encoding of the English alphabet (along with numerals, punctuation, and some control characters) into ASCII code, that is, into numbers in the range 0-255. Here the source words are again single characters rather than words in the ordinary sense of being a string of letters. For the code alphabet we have 4 different choices, all of which are actually used: if the numbers 0-255 are written in binary, then we just need alphabet Σ = {0, 1}, and the encoding of each word takes up to 8 characters. If the numbers 0-255 are written in octal, then we need alphabet Σ = {0, 1, 2, . . . , 7}, and the encoding of a single source word takes 3 characters. If the numbers 0-255 are written in decimal, then we need alphabet Σ = {0, 1, . . . , 8, 9}, and the encoding of each source word may take 3 characters. If the numbers 0-255 are written in hexadecimal, then we need alphabet Σ = {0, 1, . . . , 8, 9, A, B, C, D, E, F}, and the encoding of a source character takes only 2 characters.
Example: The Braille alphabet system is an encoding of the 26-letter alphabet (along with numerals and some punctuation, as well as a few short common words) into a 3-by-2 pattern of raised-or-not dots. The 3-by-2 grid gives 6 different choices of whether to raise the dot or not, so there are 2^6 = 64 available code words. A practical problem with this code is that it has so little redundancy: if through use the dots are worn away or damaged, it is impossible to deduce the character. But since the patterns of dots need to be fairly large to be discernible to fingertips, it seems infeasible to add redundancy. By contrast, printed letters have enough redundancy so that even if slightly blurred they are often legible.
Example: Systems of abbreviations used in otherwise ordinary English are examples of coding. For example, we may take as the set of words W the set of genuine words in English, and take Σ to be the usual alphabet with numerals and punctuation. Then we can define various encoding maps f : W → Σ*. For example, we might define
f(word) = St. if word = Street
f(word) = Ave. if word = Avenue
f(word) = Blvd. if word = Boulevard
f(word) = Rd. if word = Road
f(word) = word otherwise
This code doesn't do much. Larger systems of abbreviations were often used in telegraphy, both for efficiency and for secrecy.
Note that depending upon the context St. may be an abbreviation for either
Street or for Saint. And IP may be either intellectual property or internet
protocol. Without sufficient information from the context, this abbreviation is not
uniquely decipherable.
We will only consider uniquely decipherable codes, that is, codes in which two different messages will never be encoded the same way. That is, no information is lost in the encoding! This condition requires that the function f : W → Σ* is injective, meaning (by definition) that
f(w1) = f(w2) implies w1 = w2
(An injective function is sometimes called one-to-one, although the latter phrase is a little ambiguous because of its colloquialness.) This hypothesis of unique decipherability simplifies things a bit, and is often a reasonable hypothesis to take. In fact, it might seem that no one would ever want anything but uniquely decipherable codes, but this is not the case. In coding of graphics, for example in use of the JPEG file format, it is tolerable to lose a certain amount of certain kinds of information that are apparently not detectable to the human eye. Further, it turns out that in such scenarios giving up the demand for unique decipherability allows much greater economy.
Given two strings
s = s1 s2 . . . sm
t = t1 t2 . . . tn
in Σ*, say that s is a prefix of t if s is an initial piece of t, that is, if m ≤ n and
t1 = s1, t2 = s2, . . . , tm = sm
This terminology is compatible with colloquial usage.
A code f : W → Σ* is an instantaneous or prefix code if for all words w, w′ in the set W of source words,
f(w) is not a prefix of f(w′) for w ≠ w′
Example: The code with words 00, 01, 110, and 001 is not a prefix code, because
the first codeword 00 is the first part of the fourth codeword 001.
Example: If all codewords are of the same (known) length, then we know when
a codeword is completed.
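The prefix condition is easy to test mechanically; a minimal sketch (mine, not the text's), applied to the two examples just given:

```python
def is_prefix_code(codewords):
    """A code is instantaneous (a prefix code) iff no codeword
    is a prefix of a different codeword."""
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

print(is_prefix_code(["00", "01", "110", "001"]))  # False: 00 is a prefix of 001
print(is_prefix_code(["0", "10", "110", "111"]))   # True
```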
compression programs are an example of progress toward this goal. The notion of
entropy will allow us to understand some theoretical limitations of such techniques.
3.2 Kraft and McMillan inequalities
Theorem: (Kraft's inequality) Let the set W of source words have m elements, and let the encoding alphabet Σ have n characters. A necessary and sufficient condition that there exist an instantaneous uniquely decipherable code f : W → Σ* with lengths ℓ1, . . . , ℓm is that
Σ_{i=1}^{m} 1/n^{ℓi} ≤ 1
Theorem: (McMillan's inequality) Let the set W of source words have m elements, and let the encoding alphabet Σ have n characters. A necessary and sufficient condition that there exist a uniquely decipherable code f : W → Σ* with lengths ℓ1, . . . , ℓm is that
Σ_{i=1}^{m} 1/n^{ℓi} ≤ 1
Corollary: If there is a uniquely decipherable code with given word lengths, then there is an instantaneous code with the same word lengths.
Remark: The corollary follows from the fact that the conditions for the two
theorems are the same. We will prove the two theorems simultaneously, in effect
proving the hard half of each, by proving first that if the indicated inequality holds
then there is an instantaneous code with indicated word lengths, and proving second
that the word lengths of any uniquely decipherable code satisfy the inequality.
Remark: These inequalities give absolute limits on the size of encoding words
necessary to encode a vocabulary W of source words of a certain size. These
limitations are independent of any probabilistic considerations.
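Both directions can be sketched in code: checking the Kraft sum for a proposed set of lengths, and, when the sum is at most 1, building an instantaneous binary code. The canonical-code construction below is equivalent in spirit to the proof that follows, though the codeword-assignment details are my own:

```python
def kraft_sum(lengths, n=2):
    """Left-hand side of Kraft's inequality, for an alphabet of n characters."""
    return sum(n ** (-l) for l in lengths)

def prefix_code_from_lengths(lengths):
    """Build a binary instantaneous code with the given word lengths,
    assuming the Kraft sum is at most 1 (canonical-code construction)."""
    assert kraft_sum(lengths) <= 1, "no such code exists"
    code, value, prev = [], 0, 0
    for l in sorted(lengths):
        value <<= l - prev          # pad the running value out to length l
        code.append(format(value, f"0{l}b"))
        value += 1                  # next codeword starts just past this one
        prev = l
    return code

print(kraft_sum([1, 2, 3, 3]))                 # 1.0, so the bound is met exactly
print(prefix_code_from_lengths([1, 2, 3, 3]))  # ['0', '10', '110', '111']
```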
Remark: Note that the quantities ℓi that occur in Kraft's and McMillan's inequalities are integers. Thus, there is no justification in trying to apply these results with non-integer quantities, and in fact some heuristically plausible conclusions reached in such manner are simply false.
Proof: (of Kraft's inequality) Suppose that the set of encoded word lengths satisfies
Σ_{i=1}^{m} 1/n^{ℓi} ≤ 1
Let ℓ be the maximum word length, and let tj be the number of (encoded) words of length j. Then the supposed inequality can be rewritten as
Σ_{i=1}^{ℓ} ti / n^i ≤ 1
This leaves
(n − t1) n − t2
two-character strings whose first characters are not the encoding of a word, and which themselves are not the encoding of a word. Then there are
((n − t1) n − t2) n
three-character strings whose first character is not the encoding of a source word, and whose two-character prefix is not the encoding of a source word. We arbitrarily choose t3 among these. From the inequality above,
t3 ≤ n^3 − t1 n^2 − t2 n
so this is possible. Continuing in the obvious way gives the code f. This proves the sufficiency half of the assertion of the Kraft inequality theorem.
Now we prove half of McMillan's inequality, namely that given a uniquely decipherable code f : W → Σ* the word lengths satisfy the inequality. Let the set of word lengths be ℓ1, . . . , ℓm. Let ℓ be the maximum length. For any positive integer t, we can re-express
( n^{−ℓ1} + n^{−ℓ2} + . . . + n^{−ℓm} )^t = Σ_{s=0}^{tℓ} Cs n^{−s}
for some coefficients Cs (depending on the number of terms in the sum and also upon t). By the nature of multiplication, Cs is the number of ways a string of length s can be created by concatenating t strings with lengths from among ℓ1, ℓ2, . . . , ℓm. (This style of argument is very similar to use of generating functions in counting problems.)
The assumption of unique decipherability implies that any string obtained by sticking together codewords comes from just one sequence of codewords. That is, a given string of s characters occurs as a concatenation of encoded words in at most one way. Since there are n^s choices of strings of length s made from the alphabet Σ, the unique decipherability implies that Cs ≤ n^s, since each such string can occur in at most a single way as a concatenation of encoded words.
Using Cs ≤ n^s in the expression above, we have
( n^{−ℓ1} + . . . + n^{−ℓm} )^t = Σ_{s=0}^{tℓ} Cs n^{−s} ≤ Σ_{s=0}^{tℓ} n^s n^{−s} = Σ_{s=0}^{tℓ} 1 = 1 + tℓ
Taking t-th roots,
n^{−ℓ1} + . . . + n^{−ℓm} ≤ (1 + tℓ)^{1/t}
Letting t → +∞, the right-hand side goes to 1, and we obtain the necessity half of McMillan's theorem.
Now we combine the two halves to easily complete the proof of both theorems. Since any uniquely decipherable code must satisfy the inequality (by the half of McMillan's theorem we proved), certainly an instantaneous one must. This proves the second half of Kraft's theorem. And, similarly, to prove that a uniquely decipherable code exists for any set of word lengths satisfying the inequality above, it certainly suffices to prove this with the additional condition of instantaneity. Kraft's theorem proved this, so we obtain the second half of the proof of McMillan's theorem.
///
3.3 Noiseless coding theorem
The average word length of an encoding f of a source emitting words with probabilities p1, . . . , pm, with encoded word lengths ℓ1, . . . , ℓm, is
average length = Σ_{i=1}^{m} pi ℓi
Note that this is the expected value of the random variable which returns the length of the codewords.
Example: Let the source words be cat with probability 1/4, dog with probability 1/8, elephant with probability 1/8, and zebra with probability 1/2. Let the code alphabet be Σ = {0, 1}, and let the encoding f be
f(cat) = 011
f(dog) = 11
f(elephant) = 0
f(zebra) = 111
Then the average length of an encoded word is, by definition,
average length = P(cat) · length(f(cat)) + P(dog) · length(f(dog)) + P(elephant) · length(f(elephant)) + P(zebra) · length(f(zebra))
= P(cat) · length(011) + P(dog) · length(11) + P(elephant) · length(0) + P(zebra) · length(111)
= (1/4) · 3 + (1/8) · 2 + (1/8) · 1 + (1/2) · 3 = 21/8 = 2.625
That is, the average codeword length with this encoding is 2.625. Note that the lengths of the source words play no role in this computation.
Let |Σ| denote the number of elements in a finite set Σ (such as an alphabet of symbols).
Theorem: For a memoryless source X with entropy H(X), a uniquely decipherable code f : W → Σ* into strings made from an alphabet Σ (with |Σ| > 1) must have average length satisfying
average length f ≥ H(X) / log2 |Σ|
Further, there exists a code f with
average length f ≤ 1 + H(X) / log2 |Σ|
Proof: Write n = |Σ|, and let ℓ1, . . . , ℓm be the word lengths of a uniquely decipherable code f. Define
qi = n^{−ℓi} / Σ_j n^{−ℓj}
Since (by construction) the sum of the qi's is 1, and since they are non-negative, the collection of numbers q1, . . . , qm fits the hypotheses of the Fundamental Inequality above, and we conclude that
− Σ_i pi log2 pi ≤ − Σ_i pi log2 qi
By its definition,
log2 qi = log2 ( n^{−ℓi} / Σ_j n^{−ℓj} ) = −ℓi log2 n − log2 ( Σ_j n^{−ℓj} )
Thus,
− Σ_i pi log2 pi ≤ log2 n · Σ_i pi ℓi + log2 ( Σ_i n^{−ℓi} )
By McMillan's inequality, Σ_i n^{−ℓi} ≤ 1, so log2 ( Σ_i n^{−ℓi} ) ≤ 0, and thus
− Σ_i pi log2 pi ≤ log2 n · Σ_i pi ℓi
That is, the entropy of the source is less than or equal to the average length of the encoded words times log2 of the size of the alphabet. This proves the lower bound for the average word length.
For the other half of the theorem, we will try to cleverly choose the word lengths according to the rule that ℓi is the smallest integer such that
pi ≥ 1/n^{ℓi}
Of course, it is not immediately clear that this is possible, but the fact that the probabilities pi add up to 1 gives
Σ_i n^{−ℓi} ≤ Σ_i pi = 1
so by Kraft's theorem an instantaneous code f with these word lengths exists. By the minimality of ℓi,
1/n^{ℓi − 1} > pi
so that
ℓi < 1 − (log2 pi) / (log2 n)
Then
average length f = Σ_i pi ℓi < Σ_i pi ( 1 − (log2 pi)/(log2 n) ) = 1 − (1/log2 n) Σ_i pi log2 pi = 1 + H(X)/log2 n
///
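The construction in the second half of the proof, with ℓi the smallest integer satisfying pi ≥ 2^{−ℓi} for a binary alphabet, i.e. ℓi = ceil(−log2 pi), can be checked numerically. The example probabilities below are my own:

```python
from math import ceil, log2

def shannon_lengths(probs):
    """li = smallest integer with pi >= 2**(-li), i.e. li = ceil(-log2 pi)."""
    return [ceil(-log2(p)) for p in probs]

probs = [1/2, 1/4, 1/8, 1/8]
lengths = shannon_lengths(probs)
H = -sum(p * log2(p) for p in probs)
avg = sum(p * l for p, l in zip(probs, lengths))
assert sum(2.0 ** (-l) for l in lengths) <= 1   # Kraft: an instantaneous code exists
assert H <= avg < H + 1                          # the bounds of the theorem
print(lengths, avg, H)
```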
3.4 Huffman encoding
corresponding to the idea that emission of the original source words w_{n−1} and w_n by X are combined into a single emission of w′_{n−1} by the new source X′. Let
f′ : W′ → {0, 1}*
be a binary Huffman encoding for X′, which we can assume by induction to exist. Then define the encoding f for source X by
f(wi) = f′(wi) for i = 1, 2, . . . , n − 2
f(w_{n−1}) = f′(w′_{n−1}) + 0
f(w_n) = f′(w′_{n−1}) + 1
(where + denotes concatenation of strings).
Example: Let X emit words w1, w2, w3, w4 with probabilities 3/5, 1/5, 1/10, 1/10. Combining the two least likely words (total probability 1/5), and then combining the result with w2 (total probability 2/5), leaves a two-word source with probabilities 3/5 and 2/5, encoded by the single bits 0 and 1. Unwinding as above gives codewords of lengths 1, 2, 3, 3, with average length
(3/5) · 1 + (1/5) · 2 + (1/10) · 3 + (1/10) · 3 = 1.6
The entropy of the source is
H(X) = (3/5) log2 (5/3) + (1/5) log2 5 + (1/10) log2 10 + (1/10) log2 10 ≈ 1.57095
We can see that the inequality of the Noiseless Coding Theorem is met:
H(X) = 1.57095 ≤ length = 1.6 ≤ 2.57095 = H(X) + 1
Example: Let X emit words w1, w2, w3, w4 with probabilities 1/3, 1/4, 1/4, 1/6. The least likely word w4 should be combined with one of the next least likely words, say w3, into a single case w′3 for a new source X′. The probability P(X′ = w′3) should be the sum 1/4 + 1/6 = 5/12. The words w′1 = w1 and w′2 = w2 are emitted by X′ with the same probabilities as for X. In this example we need to go one step further, creating a new source X″ by combining the two least likely words emitted by X′, w′1 with probability 1/3 and w′2 with probability 1/4, into a single case w″2 emitted by X″ with probability
1/3 + 1/4 = 7/12
Let w″1 = w′3, emitted by X″ with probability 5/12. Since X″ emits just two words, to make its good encoding f″ we actually don't care about the probabilities any more:
f″(w″1) = 0
f″(w″2) = 1
Working backwards gives the codewords 00, 01, 10, 11, so the average length is 2.0. (Of course this is so, since all the encoding words are of length 2.) The entropy of the source is
H(X) = − (1/3) log2 (1/3) − (1/4) log2 (1/4) − (1/4) log2 (1/4) − (1/6) log2 (1/6) ≈ 1.9591
We can see that the inequality of the Noiseless Coding Theorem is met:
H(X) = 1.9591 ≤ length = 2.0 ≤ 2.9591 = H(X) + 1
Example: Let X emit words w1, w2, w3, w4 with probabilities 1/2, 1/6, 1/6, 1/6. Two of the least likely words, say w3 and w4, should be combined into a single case w′3 for a new source X′. The probability P(X′ = w′3) should be the sum 1/6 + 1/6 = 1/3. The words w′1 = w1 and w′2 = w2 are emitted by X′ with the same probabilities as for X. We need to go one step further, creating a new source X″ by combining the two least likely words emitted by X′, w′2 with probability 1/6 and w′3 with probability 1/3, into a single case w″2 emitted by X″ with probability
1/6 + 1/3 = 1/2
Let w″1 = w′1, emitted by X″ with probability 1/2. Since X″ emits just two words, to make its good encoding f″ we actually don't care about the probabilities any more:
f″(w″1) = 0
f″(w″2) = 1
Working backwards, the encoding f′ for X′ should be
f′(w′1) = f″(w″1) = 0
f′(w′2) = f″(w″2) + 0 = 10
f′(w′3) = f″(w″2) + 1 = 11
and then, one step further back, the encoding f for X is
f(w1) = f′(w′1) = 0
f(w2) = f′(w′2) = 10
f(w3) = f′(w′3) + 0 = 110
f(w4) = f′(w′3) + 1 = 111
In this example, by contrast to the previous one, the word w1 occurs with such high
probability that it is optimal to allocate a very short encoding to it, consisting of
a single bit. Evidently the added cost of having to encode two of the other (least
likely) words by 3 bits is worthwhile.
The (average) length of the latter encoding is
(1/2) · 1 + (1/6) · 2 + (1/6) · 3 + (1/6) · 3 ≈ 1.83333
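The merging procedure of this section can be sketched as code. The following is an illustrative binary Huffman implementation (the data structures and tie-breaking choices are mine, not the text's); on the source just treated it reproduces the codeword lengths 1, 2, 3, 3:

```python
import heapq
from itertools import count

def huffman(probs):
    """Binary Huffman code: repeatedly merge the two least likely groups,
    prepending 0 to one group's codewords and 1 to the other's."""
    tie = count()  # tie-breaker so the heap never has to compare lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    codes = [""] * len(probs)
    while len(heap) > 1:
        p1, _, g1 = heapq.heappop(heap)
        p2, _, g2 = heapq.heappop(heap)
        for i in g1:
            codes[i] = "0" + codes[i]
        for i in g2:
            codes[i] = "1" + codes[i]
        heapq.heappush(heap, (p1 + p2, next(tie), g1 + g2))
    return codes

probs = [1/2, 1/6, 1/6, 1/6]
codes = huffman(probs)
avg = sum(p * len(c) for p, c in zip(probs, codes))
print(sorted(len(c) for c in codes))  # [1, 2, 3, 3]
print(round(avg, 5))                  # 1.83333
```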
For a source emitting just two words w1, w2, the encoding
f(w1) = 0
f(w2) = 1
is optimal, regardless of the probabilities with which the two words are emitted. This is clear from the fact that the encodings can't be any shorter than a single character.
For a compact instantaneous code f : W → {0, 1}*, if
P(X = w1) > P(X = w2)
for two words w1, w2, then necessarily
length(f(w1)) ≤ length(f(w2))
Indeed, if instead
length(f(w1)) > length(f(w2))
then make a new code g by having g be the same as f except interchanging the encodings of w1 and w2:
g(w1) = f(w2)
g(w2) = f(w1)
We can check that the new code g has strictly shorter average length than f (and it is certainly still instantaneous): in the expression for average length, the only thing that will change is the subsum for the two words w1, w2. Letting p1 = P(X = w1), p2 = P(X = w2), ℓ1 = length(f(w1)), and ℓ2 = length(f(w2)),
P(X = w1) · length(g(w1)) + P(X = w2) · length(g(w2)) = p1 ℓ2 + p2 ℓ1
= p2 ℓ2 + (p1 − p2) ℓ2 + p1 ℓ1 + (p2 − p1) ℓ1
= (p1 ℓ1 + p2 ℓ2) + (p1 − p2)(ℓ2 − ℓ1) < p1 ℓ1 + p2 ℓ2
g(w_{n−1}) = s + 0
g(w_n) = s + 1
where the string s is the common prefix shared by g(w_{n−1}) and g(w_n). (Since g is instantaneous, so is g′.) For brevity, let
ℓi = length(g(wi))
Note that
ℓ_{n−1} = ℓ_n = length(s) + 1
since g(w_{n−1}) and g(w_n) have the common prefix s and differ only in the last bit. That is,
length(s) = ℓ_n − 1 = ℓ_{n−1} − 1
Then, writing simply length for average length, we have
length g′ = p1 ℓ1 + . . . + p_{n−2} ℓ_{n−2} + (p_{n−1} + p_n)(ℓ_n − 1)
Exercises
3.01
3.02
3.03 How many source words must there be to require that any (binary) encoding of the source have average word length at least 4? (ans.)
3.04
3.05
3.06 Determine the Huffman encoding of a source with probabilities 2/3, 2/9, 2/27, 2/81, 1/243, 2/243. Compare the average word length to the entropy of the source.
3.07
4 Noisy Coding
4.1 Noisy channels
4.2 Example: parity checks
4.3 Decoding from a noisy channel
4.4 Channel capacity
4.5 Noisy coding theorem
That is, the sum of the probabilities of all the possible output characters that might
be received (for given input xi ) is 1.
Further, we suppose that the channel operates in a manner so that the transmission and receipt of each character are independent of the transmission and
receipt of other characters: the probabilities are independent of what has come
before or what comes after.
(Diagram: the binary erasure channel; each input character 0 or 1 passes through unchanged, or is erased.)
That is, the two characters 0 and 1 never transmute into each other, but either one may be erased, with some fixed erasure probability.
The N th extension C (N ) of a channel C is a channel whose input alphabet is
all N -tuples of characters from the input alphabet of C, whose output alphabet is
the collection of N -tuples of characters from the output alphabet of C, and so that
the transition probabilities are what would occur if we had N copies of the original
channel working independently in parallel:
P_{C^(N)}(out = b1 . . . bN | in = a1 . . . aN) = P_C(out = b1 | in = a1) · · · P_C(out = bN | in = aN)
The situation we'll consider is that a source X emits words w1, . . . , wm with probabilities pi = P(X = wi), which are encoded (perhaps by Huffman encoding) into binary, then sent across a binary symmetric channel C, and decoded on the other side. The encoding to binary is noiseless and is known to the decoder. In a picture, this is
source X emits words wi with probabilities pi → encoder → channel C → decoder
4.2 Example: parity checks
changed, this will not be detected. If all 3 bits change, this will be detected. So the probability of an undetected bit error is the probability of exactly 2 bit errors, which is
(1/8)(1/8)(7/8) + (1/8)(7/8)(1/8) + (7/8)(1/8)(1/8) = 3 · (1/8)^2 (7/8) = 21/512 ≈ 0.041016
This is a huge improvement over the previous 0.234375 probability of undetected error. Of course, there remains the responsibility of correcting an error once it's detected.
Example: Let's look at what happens if there's even more noise on the channel. Suppose that a source emits 2-bit binary codewords 00, 01, 10, 11. Let C be a symmetric binary channel with bit error probability 1/3. Then the probability that at least one bit error occurs in transmission of one of these 2-bit words is
(1/3)(2/3) + (2/3)(1/3) + (1/3)(1/3) = 5/9 ≈ 0.5555
So we'd have scant chance of successful transmission! Add a single parity-check bit to this code by replacing these words with 000, 011, 101, 110, respectively. Again, if just one bit of the 3 bits is changed, then the last bit will not correctly reflect the even/odd-ness of the first 2 bits, so this error will be detected. However, if 2 of the 3 bits are changed, this will not be detected. If all 3 bits change, this will be detected. So the probability of at least one undetected bit error is
(1/3)(1/3)(2/3) + (1/3)(2/3)(1/3) + (2/3)(1/3)(1/3) = 3 · (1/3)^2 (2/3) = 6/27 ≈ 0.22222
Thus, by use of parity-check bits added to the code, in this example we can reduce the probability of undetected bit error within a word to well below 1/2, though it's still quite high.
Example: Finally, let's look at what happens if the channel is as noisy as possible: suppose that the bit error probability is 1/2. Suppose that a source emits 2-bit binary codewords 00, 01, 10, 11. Then the probability that at least one bit error occurs in transmission of one of these 2-bit words is
(1/2)(1/2) + (1/2)(1/2) + (1/2)(1/2) = 3/4 = 0.75 > 0.5
Add a single parity-check bit to this code by replacing these words with 000, 011, 101, 110, respectively. Again, if just one bit of the 3 bits is changed, then the last bit will not correctly reflect the even/odd-ness of the first 2 bits, so this error will be detected. However, if 2 of the 3 bits are changed, this will not be detected. If all 3 bits change, this will be detected. So the probability of at least one undetected bit error is
(1/2)(1/2)(1/2) + (1/2)(1/2)(1/2) + (1/2)(1/2)(1/2) = 3 · (1/2)^3 = 3/8 = 0.375 < 0.5
Thus, by use of parity-check bits added to the code, even with a maximally noisy
channel, we can still reduce the probability of undetected bit error within a word to
3/8, significantly below 1/2.
Example: In the case of 3-bit binary words, adding a parity-check bit creates 4-bit words so that an odd number of bit errors will be detected. Suppose that a binary symmetric channel has bit error probability 1/8. Then the probability that at least one bit error will occur in transmission of a 3-bit word is
(3 choose 1)(1/8)(7/8)^2 + (3 choose 2)(1/8)^2 (7/8) + (3 choose 3)(1/8)^3 ≈ 0.33
Of course, any such error is undetected. When a parity-check bit is added, the probability of an undetected error is the probability of a positive even number of bit errors, which is
(4 choose 2)(1/8)^2 (7/8)^2 + (4 choose 4)(1/8)^4 ≈ 0.072
which is less than 1/4 of the undetected errors that would occur without the parity-check bit.
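These binomial computations generalize readily; a sketch (the function names and structure are mine, not the text's):

```python
from math import comb

def p_at_least_one_error(n, p):
    """P(at least one bit error) in an n-bit word, bit error probability p."""
    return 1 - (1 - p) ** n

def p_undetected_with_parity(n, p):
    """With one parity-check bit appended (n + 1 bits total), exactly the
    positive even numbers of bit errors go undetected."""
    m = n + 1
    return sum(comb(m, k) * p**k * (1 - p) ** (m - k) for k in range(2, m + 1, 2))

p = 1/8
print(p_at_least_one_error(3, p))      # 169/512 ≈ 0.330
print(p_undetected_with_parity(3, p))  # 295/4096 ≈ 0.072
```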
Example: A symmetric binary channel has bit error probability 1/5. A source emits words w1, w2, w3, w4 with probabilities 1/2, 1/4, 1/8, 1/8. These words are Huffman-encoded as 0, 10, 110, 111, respectively. The probability that a word is transmitted with some error is
P(X = w1) · (1/5)
+ P(X = w2) · ( (2 choose 1)(1/5)(4/5) + (2 choose 2)(1/5)^2 )
+ P(X = w3) · ( (3 choose 1)(1/5)(4/5)^2 + (3 choose 2)(1/5)^2 (4/5) + (3 choose 3)(1/5)^3 )
+ P(X = w4) · ( (3 choose 1)(1/5)(4/5)^2 + (3 choose 2)(1/5)^2 (4/5) + (3 choose 3)(1/5)^3 )
= (1/2)(1/5) + (1/4)(9/25) + (1/8)(61/125) + (1/8)(61/125) ≈ 0.312
Now add a parity-check bit, giving codewords 00, 101, 1100, 1111. The probability that there is an undetected error in transmission of a word now becomes
P(X = w1) · (2 choose 2)(1/5)^2
+ P(X = w2) · (3 choose 2)(1/5)^2 (4/5)
+ P(X = w3) · ( (4 choose 2)(1/5)^2 (4/5)^2 + (4 choose 4)(1/5)^4 )
+ P(X = w4) · ( (4 choose 2)(1/5)^2 (4/5)^2 + (4 choose 4)(1/5)^4 )
≈ 0.084
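The same style of computation, specialized to this example, can be sketched as follows (codeword lengths and probabilities as in the text; the helper names are mine). Under these assumptions the second sum comes out to about 0.083, which agrees with the value above up to rounding:

```python
from math import comb

p = 1/5  # bit error probability of the channel

def p_any_error(n):
    """P(some bit error) in an n-bit codeword."""
    return 1 - (1 - p) ** n

def p_undetected(n):
    """P(undetected error): a positive even number of bit errors."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(2, n + 1, 2))

word_probs = [1/2, 1/4, 1/8, 1/8]
plain_lengths = [1, 2, 3, 3]    # codewords 0, 10, 110, 111
parity_lengths = [2, 3, 4, 4]   # codewords 00, 101, 1100, 1111

print(sum(q * p_any_error(n) for q, n in zip(word_probs, plain_lengths)))    # ≈ 0.312
print(sum(q * p_undetected(n) for q, n in zip(word_probs, parity_lengths)))  # ≈ 0.083
```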
4.4 Channel capacity
For a source X emitting the input characters x1, . . . , xm with probabilities pi, and transition probabilities pij = P(out = yj | in = xi), the output Y has distribution
P(Y = yj) = Σ_{i=1}^{m} pij pi
The capacity of the channel is
c = max over X of I(X|Y)
where the maximum is taken over all probability distributions for sources emitting the alphabet accepted as inputs by the channel, and where for each X the source Y is constructed from X and from the channel as just above.
Remark: Note that the expression
I(X|Y) = H(X) − H(X|Y) = H(X) + H(Y) − H(X, Y)
is actually symmetrical in the two random variables, so
I(X|Y) = I(Y|X)
In words, more intuitively, but less precisely, the amount of information about X
imparted by Y is equal to the amount of information about Y imparted by X.
Remark: Since the definition of capacity depends continuously upon the probabilities p1, . . . , pm for the source's emissions, and since the collection of all such m-tuples of probabilities is a closed and bounded set in R^m, the maximum really occurs. This is a special case of the fact that the maximum of a continuous function on a closed and bounded set in R^m is achieved, that is, is bounded above and there is some point where the bounding value actually occurs.
Remark: The units for channel capacity are bits per symbol.
Theorem: Let C be a binary symmetric channel with bit error probability p.
Then the channel capacity of C is
1 + p log2 p + (1 p) log2 (1 p)
bits per symbol.
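The capacity formula is easy to tabulate; a minimal sketch (mine, not the text's):

```python
from math import log2

def bsc_capacity(p):
    """Capacity 1 + p log2(p) + (1 - p) log2(1 - p) of a binary symmetric
    channel with bit error probability p, in bits per symbol."""
    def plog(x):
        return x * log2(x) if x > 0 else 0.0
    return 1 + plog(p) + plog(1 - p)

print(bsc_capacity(0.0))   # 1.0: noiseless, one full bit per symbol
print(bsc_capacity(0.5))   # 0.0: maximally noisy, nothing gets through
print(bsc_capacity(1/8))   # ≈ 0.456
```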
Proof: Write q = 1 − p. With input distribution (a, 1 − a), the conditional entropy is H(Y|X) = −p log2 p − q log2 q regardless of a, so the mutual information H(Y) − H(Y|X) is maximized when H(Y) is largest. By symmetry this happens at a = 1/2, where the output distribution is ((q + p)/2, (q + p)/2) = (1/2, 1/2), giving
c = − (1/2) log2 (1/2) − (1/2) log2 (1/2) + p log2 p + q log2 q = 1 + p log2 p + q log2 q
///
Remark: When p = 1/2, the capacity is 1 + (1/2)(−1) + (1/2)(−1) = 0: a maximally noisy binary symmetric channel conveys no information at all.
Theorem: The capacity c(n) of the nth extension C^(n) is n times the capacity c of the channel C.
Proof: Let c(n) be the capacity of the nth extension. Let X = (X1, . . . , Xn) be a source for C^(n), in which each Xi is a source for C. Let Y = (Y1, . . . , Yn) be the corresponding outputs from the channel C. By definition,
c(n) = max over X of I(X|Y) = max over X of ( H(X) − H(X|Y) )
Computing,
H(Y|X) = Σ_x P(X = x) H(Y|X = x)
Since the channel is memoryless, the results of the various Yj are independent of each other. More precisely,
H(Y|X = x) = Σ_i H(Yi|X = x)    and    H(Yi|X = x) = H(Yi|Xi = xi)
where x = (. . . , xi , . . .). Putting this back into the expression for H(Y|X), we obtain
H(Y|X) = Σ_{x1, ..., xn} P(X1 = x1, . . . , Xn = xn) Σ_i H(Yi|Xi = xi)
Summing over the variables other than the ith, using
Σ_{x2, x3, ..., xn} P(X1 = x1, X2 = x2, . . . , Xn = xn) = P(X1 = x1)
and its analogues for the other indices, this becomes
H(Y|X) = Σ_i H(Yi|Xi)
whether or not the Xi are independent, due to the memorylessness of the channel.
In the general inequality

H(Y1, . . . , Yn) ≤ H(Y1) + . . . + H(Yn)

we have equality if and only if the Yi are independent. Therefore,

c(n) = max over X of I(X|Y) = max over X of ( H(Y) − H(Y|X) ) ≤ max over X of ( Σ_i H(Yi) − Σ_i H(Yi|Xi) )
4.5 Noisy coding theorem
The average word error probability of the decoding procedure, over the N codewords wi, is

average word error probability = (1/N) Σ_{1≤i≤N} P(error | wi sent)
Apart from the obvious objection that this might allow unacceptably large errors in
decoding rare words, it seems that in practice a slightly different measure is used:
The maximum word error probability is

maximum word error probability = max over i of P(error | wi sent)
Certainly

maximum word error probability of f ≥ average word error probability of f

so if we make the maximum error probability small then certainly the average error probability will be small. And there is the virtue that we have made no unwarranted assumptions about the probabilities with which the various codewords are sent. On the other hand, minimizing the maximum word error probability requires that we perhaps overly concern ourselves with rare source words.
Now we return to the simple situation in which all codes are binary, meaning that everything is expressed in 0s and 1s. That means that we think of a binary symmetric channel and its extensions. We use maximum-likelihood (equivalently, minimum-distance) decoding. From our earlier computation, a binary symmetric channel C with bit error probability p has capacity

c = 1 + p log2 p + (1 − p) log2 (1 − p)
The rate of a binary code with maximum word length n and with t codewords is defined to be

rate = (log2 t) / n
Remark: The maximum possible rate is 1, which can occur only for a binary code with maximum word length n in which all 2^n binary codewords of length n are used. This represents the fullest possible transmission of information through the channel.
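A small illustration (ours, not the text's), taking the rate of a binary code with t codewords of length n to be (log2 t)/n as above:

```python
from math import log2

def rate(n: int, t: int) -> float:
    """Rate (log2 t) / n of a binary code with t codewords of length n."""
    return log2(t) / n

print(rate(4, 2 ** 4))   # 1.0 : all 16 words of length 4 used
print(rate(3, 2))        # repetition code {000, 111}: rate 1/3
```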
Remark: In a noisy channel, that is, in a channel whose bit error probability is
greater than 0, it is not reasonable to try to use a code with rate too close to 1,
because such a code will not have enough redundancy in it to allow either detection
or correction of errors.
A little more specifically: given ε > 0, for sufficiently large n there is a code C of length n with rate R′ ≤ R such that

|R′ − R| ≤ 1/n

and

maximum word error probability (C) < ε
Remark: Due to the nature of the proof, the theorem gives no explanation of
how to find or create the codes, nor is there a concrete indication of how rapidly
the maximum word error probability decreases to 0.
Proof: To set the context, we review some basic probabilistic aspects of the situation. Let p be the bit error probability, and let q = 1 − p for brevity. The expected number of bit errors in a binary word of length n is pn, and the variance of the random variable that counts the bit errors in a binary word of length n is npq. The probability of any specific pattern of t bit errors in a word of length n is p^t q^(n−t), depending only upon the number of bit errors, not their locations.

Fix ε > 0, and let

b = sqrt( npq / (ε/2) )

Then, by Chebysheff's inequality,

P(number of bit errors > np + b) ≤ ε/2
Since p < 1/2, for fixed ε and sufficiently large n the integer r = floor(np + b) is surely strictly less than n/2. Recall that the floor function floor(x) is the greatest integer less than or equal to x.
For fixed word length n, let

Br(x) = {words y : d(x, y) ≤ r}

(where d(,) is Hamming distance) denote the ball of radius r centered at the word x. The volume vol Br(x) of the ball Br(x) is the number of words in it, which is the sum of the numbers of words with 0 bit errors (just x itself!), with 1 bit error, with 2 bit errors, . . . , with r bit errors:

vol Br(x) = Σ_{0≤t≤r} C(n, t)

writing C(n, t) for the binomial coefficient.
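The volume formula is easy to cross-check by brute force for small n; this Python sketch (our illustration, not from the text) counts words within Hamming distance r of the all-zeros word:

```python
from math import comb
from itertools import product

def vol_ball(n: int, r: int) -> int:
    """Number of binary words of length n within Hamming distance r of a
    fixed word: the sum of the binomial coefficients C(n, t), 0 <= t <= r."""
    return sum(comb(n, t) for t in range(r + 1))

# brute-force cross-check around the all-zeros word of length 6
n, r = 6, 2
count = sum(1 for w in product((0, 1), repeat=n) if sum(w) <= r)
print(vol_ball(n, r), count)   # 22 22
```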
Next, we recall the big-oh notation: let f(n), g(n), and h(n) be three functions of positive integers n, and K a constant with the property that, for all sufficiently large positive integers n,

|f(n) − g(n)| ≤ K · h(n)

If we don't care much about the specific value of K, then we can write instead

f(n) − g(n) = O(h(n))

or

f(n) = g(n) + O(h(n))

For example,

1/(n + 1) = 1/n + O(1/n^2)

This sort of notation is useful when we're trying to simplify things and don't care too much about relatively small details.
Now we start the proof in earnest, keeping the notation above. Define a function of two length-n words w, w′ by

f(w, w′) = 1 (for d(w, w′) ≤ r),   f(w, w′) = 0 (for d(w, w′) > r)

where r = floor(np + b). Certainly r depends on n, and is approximately pn. For x in the collection C of codewords, define

Fx(w) = 1 − f(w, x) + Σ_{y∈C, y≠x} f(w, y)
This quantity Fx(y) indicates roughly how many decoding errors we might make in decoding y as x. In particular, if there is no codeword within distance r of y other than x, then Fx(y) = 0. Otherwise, if there are other codewords within distance r of y, then Fx(y) ≥ 1.
Let C = {x1 , . . . , xt } be the codewords. We use the following decoding rule: for
a received word y if there is a unique codeword xi within distance r of y, then decode
y as xi , otherwise declare an error (or decode as some fixed default codeword x1 ).
It is clear from its description that this is a sloppier rule than minimum-distance
decoding, so if we can arrange to have this decoding rule achieve a good error
rate then minimum-distance (equivalently, maximum-likelihood) decoding will do
at least as well.
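The decoding rule just described can be sketched in a few lines of Python; the helper names are our own, and this is an illustration rather than anything from the text:

```python
def hamming(x: str, y: str) -> int:
    return sum(a != b for a, b in zip(x, y))

def decode(y: str, codewords, r: int):
    """If exactly one codeword lies within Hamming distance r of the
    received word y, decode as that codeword; otherwise declare an
    error (returned here as None)."""
    near = [x for x in codewords if hamming(x, y) <= r]
    return near[0] if len(near) == 1 else None

C = ["00000", "11111"]
print(decode("00100", C, 1))   # 00000
print(decode("01111", C, 1))   # 11111
print(decode("00011", C, 3))   # None : both codewords within distance 3
```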
Keep in mind that p is the channel's bit error probability. Let Pi be the probability of an incorrect decoding given that xi ∈ C is transmitted. We assume that codewords are transmitted with equal probabilities, each of which would have to be 1/t since there are t codewords in C. Thus, the expected probability of error in decoding the code C is

expected decoding error probability of code C = PC = (1/t) Σ_{1≤i≤t} Pi
Let
Pbest = minimum PC for length n codes C with t codewords
Now we compute some things. Let Vn denote the collection of all binary words of length n. For brevity, for codeword xi ∈ C and received word y ∈ Vn, write

P(y|xi) = P(y received | xi sent)

For a codeword xi, the probability of incorrectly decoding the received word given that xi was sent is

Pi ≤ Σ_{y∈Vn} P(y|xi) · Fxi(y) = Σ_{y∈Vn} P(y|xi) (1 − f(xi, y)) + Σ_{y∈Vn} Σ_{j≠i} P(y|xi) f(xj, y)
The expression

Σ_{y∈Vn} P(y|xi) (1 − f(xi, y))

is the probability that the received word is not inside the ball Br(xi) of radius r around xi. By the choice of b and r above (for given ε and n), we have arranged that Chebysheff's inequality gives

Σ_{y∈Vn} P(y|xi) (1 − f(xi, y)) ≤ ε/2
Thus,

PC ≤ ε/2 + (1/t) Σ_{1≤i≤t} Σ_{y∈Vn} Σ_{j≠i} P(y|xi) f(xj, y)
Shannon's insight was that whatever the average value Pavg of PC is (averaged over all length-n codes C with t codewords), there must be at least one code C0 which has

PC0 ≤ Pavg

This is a relatively elementary assertion about numbers: let a1, . . . , aN be real numbers, and let

A = (a1 + . . . + aN) / N

be the average. We are claiming that there is at least one ai (though we can't really predict which one) with ai ≤ A. To understand why this is so, suppose to the contrary that ai > A for all i. Then (by elementary properties of inequalities)

a1 + . . . + aN > A + . . . + A = N · A

and

(a1 + . . . + aN) / N > A

contradicting the fact that A is the average.
Thus,

Pbest ≤ average (PC) ≤ ε/2 + average ( (1/t) Σ_{1≤i≤t} Σ_{y∈Vn} Σ_{j≠i} P(y|xi) f(xj, y) )
      = ε/2 + (1/t) average ( Σ_{1≤i≤t} Σ_{j≠i} Σ_{y∈Vn} P(y|xi) f(xj, y) )
Since the various codewords xk are chosen independently of each other (because we've given up the requirement that they be distinct from each other!), the averaging processes with respect to x1, x2, . . . can be done independently of each other, allowing smaller and easier computations. In particular, for each j

average over xj of f(xj, y) = ( Σ_{xj∈Vn} f(xj, y) ) / 2^n = vol Br(y) / 2^n
Averaging over the xj with j ≠ i replaces each f(xj, y) by vol Br(y)/2^n, giving

Pbest ≤ ε/2 + (1/t) Σ_{1≤i≤t} average over xi of ( Σ_{y∈Vn} P(y|xi) Σ_{j≠i} vol Br(y)/2^n )

Since the volume of a ball of radius r doesn't depend upon the center, but only upon the radius, we can write

vol Br = volume of any ball of radius r

Then this constant can be brought outside. Thus, so far,

Pbest ≤ ε/2 + ( vol Br / (t · 2^n) ) Σ_{1≤i≤t} average over xi of ( Σ_{y∈Vn} P(y|xi) · (t − 1) )
      = ε/2 + ( vol Br / 2^n ) · ((t − 1)/t) · Σ_{1≤i≤t} average over xi of ( Σ_{y∈Vn} P(y|xi) )
Next, of course

Σ_{y∈Vn} P(y|xi) = 1

for any codeword xi, since some word is received when xi is sent! This simplifies things further to

Pbest ≤ ε/2 + ( vol Br / 2^n ) · ((t − 1)/t) · t ≤ ε/2 + t · vol Br / 2^n
The factor of t comes from the outer sum over the t codewords xi in each code.
Next, we use the estimate (from the lemma below) on the volume vol Br: for r < n/2,

vol Br ≤ (n/2) · n^n / ( r^r (n − r)^(n−r) )

so that

t · vol Br / 2^n ≤ ( t · n / 2^(n+1) ) · n^n / ( r^r (n − r)^(n−r) ) = ( t · n / 2^(n+1) ) · 1 / ( (r/n)^r ((n − r)/n)^(n−r) )
Taking log2 of both sides and dividing through by n:

(1/n) log2 (Pbest − ε/2) ≤ (log2 t + log2 n − 1)/n − 1 − (r/n) log2 (r/n) − ((n − r)/n) log2 ((n − r)/n)

Now r/n is roughly p, and (n − r)/n is roughly q. The other lemma below makes this precise, and gives us

(1/n) log2 (Pbest − ε/2) ≤ (log2 t + log2 n − 1)/n − (1 + p log2 p + q log2 q) + O(n^(−1/2))

The summand (log2 n − 1)/n is also O(n^(−1/2)), so we have

(1/n) log2 (Pbest − ε/2) ≤ (log2 t)/n − (1 + p log2 p + q log2 q) + O(n^(−1/2))

For a code of rate approximately R we take t = 2^floor(Rn) codewords, so that

(log2 t)/n = floor(R · n)/n = R + O(n^(−1)) = R + O(n^(−1/2))

and therefore

(1/n) log2 (Pbest − ε/2) ≤ R − (1 + p log2 p + q log2 q) + O(n^(−1/2))

From the inequality 0 < R < 1 + p log2 p + q log2 q, the quantity

δ = (1 + p log2 p + q log2 q) − R

is positive. Likewise, by that inequality, for sufficiently large n

R − (1 + p log2 p + q log2 q) + O(n^(−1/2)) < −δ/2

For such large n, multiplying through by n, exponentiating base 2, and moving the ε/2 to the other side, we obtain

Pbest ≤ ε/2 + 2^(−δn/2)

Take n large enough so that 2^(−δn/2) < ε/2 (possible since δ > 0). For such n we have Pbest ≤ ε, finishing the proof of Shannon's theorem. ///
Here is the simple estimate on volumes used above in the proof of Shannon's theorem:

Lemma: For r < n/2,

vol Br ≤ (n/2) · n^n / ( r^r (n − r)^(n−r) )
Proof: Since r < n/2, the binomial coefficients grow along the relevant range:

C(n, 0) < C(n, 1) < C(n, 2) < . . . < C(n, r)

from which

vol Br(x) = Σ_{0≤t≤r} C(n, t) ≤ (r + 1) · C(n, r)

and r + 1 ≤ n/2 for the values of r relevant here, since r = floor(np + b) with p < 1/2 and n large. By the binomial theorem,

n^n = (r + (n − r))^n = Σ_{0≤i≤n} C(n, i) r^i (n − r)^(n−i) ≥ C(n, r) r^r (n − r)^(n−r)

so that

C(n, r) ≤ n^n / ( r^r (n − r)^(n−r) )

and

vol Br ≤ (n/2) · n^n / ( r^r (n − r)^(n−r) )

as claimed. ///
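The volume estimate above can be sanity-checked numerically (our illustration, not from the text), using exact integer arithmetic for a few values of n and r below n/2:

```python
from math import comb

def vol(n: int, r: int) -> int:
    # number of binary words within Hamming distance r of a fixed word
    return sum(comb(n, t) for t in range(r + 1))

def bound(n: int, r: int) -> int:
    # (n/2) * n^n / (r^r * (n - r)^(n - r)), floored to an integer
    return n * n ** n // (2 * r ** r * (n - r) ** (n - r))

for n in (10, 20, 30):
    r = n // 3            # a radius safely below n/2
    assert vol(n, r) <= bound(n, r)
print("volume bound holds on these examples")
```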
Lemma: With r = floor(np + b) as above,

(r/n) log2 (r/n) − p log2 p = O(n^(−1/2))   and   ((n − r)/n) log2 ((n − r)/n) − q log2 q = O(n^(−1/2))
Proof: First, because of the way r is defined via the floor function,

|r − (np + b)| ≤ 1

from which, dividing through by n, we have

| r/n − (p + b/n) | ≤ 1/n = O(1/n)

For fixed ε > 0 and for fixed p (and q), by the definition of b we have

b = O(√n)

and then, dividing by n,

b/n = O(1/√n)

Since √n ≤ n we have

O(1/n) + O(1/√n) = O(1/√n)

and therefore

| r/n − p | ≤ | r/n − (p + b/n) | + b/n ≤ 1/n + O(1/√n) = O(1/√n)
Abstracting the situation slightly, fix y = p in the range 0 < y < 1, and let xn = r/n, with xn − y = O(n^(−1/2)). We claim that

xn log2 xn − y log2 y = O(n^(−1/2))

To prove this, we need the Mean Value Theorem from calculus: for a differentiable function f and a < b,

f(b) − f(a) = f′(ξ) · (b − a)

for some ξ between a and b. Also recall that

d/dx (x ln x) = 1 + ln x

and that log2 A = ln A / ln 2, so

d/dx (x log2 x) = (1 + ln x) / ln 2
Then we have

xn log2 xn − y log2 y = (1/ln 2)(ln ξ + 1)(xn − y) = ( 1/ln 2 + log2 ξ ) (xn − y)

for some ξ between xn and y. For n large enough, y/2 ≤ xn ≤ 2y, so y/2 ≤ ξ ≤ 2y. Thus, by the monotonicity of log2,

log2 (y/2) ≤ log2 ξ ≤ log2 (2y)

This gives a bound on log2 ξ which does not depend on n at all, so, in terms of the parameter n, this says

xn log2 xn − y log2 y = O(xn − y) = O(1/√n)

as desired. ///
Exercises

4.01 A symmetric binary channel has error probability 1/4. What is the probability that the binary word 01 is transmitted correctly? (ans.)

4.02 A symmetric binary channel has error probability 1/6. What is the probability that at least one error occurs in transmission of the binary word 0011? (ans.)
4.03
4.04
4.05
4.06
4.07
What is the channel capacity of the binary erasure channel with erasure probability ε?
4.08
4.09
4.10
4.11
What is the rate of a code with binary source words of length 4 in which
codewords are made by sending a source word 3 times in a row (thereby
making them of length 12)? (The correct decoding is decided by a 2/3 vote,
or the word is rejected if no 2 of the 3 4-bit pieces agree.)
4.12
5 Cyclic Redundancy Checks
The idea of a parity check bit can be extended in various ways to detect more errors. (Recall that a single parity check bit only detects an odd number of bit errors, and certainly cannot correct any errors at all.)
0 + 1 = 1    0 · 1 = 0
1 + 0 = 1    1 · 0 = 0
1 + 1 = 0    1 · 1 = 1
Various notations are used for this set with these operations: F2 and GF (2) are
the most common. Also Z/2. The notation Z2 is sometimes used, but this is not
so good since in other contexts Z2 is an entirely different thing, the 2-adic integers.
Also, sometimes this finite field is called a Galois field, in honor of Evariste Galois,
who first systematically studied finite fields such as F2 .
Remark: Since 1 + 1 = 0, it is reasonable to say that

−1 = 1

if by −1 we mean something which, when added to 1, gives 0. Similarly,

−0 = 0
x is 0, we don't write it at all, and if the coefficient is 1 we just write the power of x. As usual, such a polynomial gives rise to a polynomial function from F2 to F2, by evaluation inside the finite field F2: for P(x) = x^3 + 1,

P(0) = 0^3 + 1 = 1
P(1) = 1^3 + 1 = 0

Unlike the case of real numbers, however, different polynomials can give rise to the same function: for example, the two polynomials P(x) = x^2 + x + 1 and Q(x) = 1 have the same values for any input in F2.
Addition of polynomials with coefficients in F2 is as usual: add the coefficients of corresponding powers of x, but now inside the finite field F2. For example,

(x^3 + 1) + (x^3 + x^2 + x + 1) = (1 + 1)·x^3 + (0 + 1)·x^2 + (0 + 1)·x + (1 + 1) = x^2 + x
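A convenient machine representation (our illustration, not the text's) stores an F2 polynomial as an integer bitmask, bit i holding the coefficient of x^i; addition is then exactly bitwise XOR:

```python
def poly_str(p: int) -> str:
    """Render an F2 polynomial stored as a bitmask (bit i = coeff of x^i)."""
    if p == 0:
        return "0"
    terms = ["x^%d" % i if i > 1 else ("x" if i == 1 else "1")
             for i in range(p.bit_length()) if (p >> i) & 1]
    return " + ".join(reversed(terms))

a = 0b1001   # x^3 + 1
b = 0b1111   # x^3 + x^2 + x + 1
print(poly_str(a ^ b))   # x^2 + x : addition is bitwise XOR
```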
Multiplication of polynomials is as usual, satisfying the distributive law. To multiply polynomials is akin to multiplying decimal integers, but keeping track of powers of x instead of tens place, hundreds place, etc. And the multiplication of polynomials is somewhat simpler in that there is no carry, unlike integer multiplication. First, the integer multiplication case is something like

        2 0 3
      × 1 2 3
      -------
        6 0 9
      4 0 6
    2 0 3
    ---------
    2 4 9 6 9
(This example did not have any carries in it.) The polynomial multiplication is very similar. For example, with coefficients in the real numbers:

              2x^3 −  3x^2 +   x −  3
            ×         2x^2 −  3x +  2
    ----------------------------------------
                       4x^3 −  6x^2 + 2x − 6
             − 6x^4 +  9x^3 −  3x^2 + 9x
      4x^5   − 6x^4 +  2x^3 −  6x^2
    ----------------------------------------
      4x^5 − 12x^4 + 15x^3 − 15x^2 + 11x − 6
That is, each term in the first polynomial multiplies each term in the second polynomial. Entirely analogously we can multiply polynomials with coefficients in the finite field F2: again, each term in the first polynomial multiplies each term in the second one, and then we add them all up. Now it's actually easier than for real or complex coefficients, because the arithmetic of the numbers is so easy to do. For example, keeping in mind that 1 + 1 = 0 in F2:
               x^3       + x + 1
             ×       x^2 + x + 1
    ----------------------------
               x^3       + x + 1
         x^4       + x^2 + x
    x^5      + x^3 + x^2
    ----------------------------
    x^5 + x^4                + 1
Note that in all cases we preserved the vertical alignment of like powers of x as a
precaution against miscopying errors. This is much like keeping the tens places,
hundreds places, etc., lined up when doing integer arithmetic by hand, except that
there is no carrying.
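With an F2 polynomial stored as an integer bitmask (bit i = coefficient of x^i, an encoding of our choosing), this carry-free multiplication becomes shifts and XORs; a Python sketch, not from the text:

```python
def poly_mul(a: int, b: int) -> int:
    """Multiply F2 polynomials stored as bitmasks (bit i = coeff of x^i):
    each set bit of b contributes a shifted copy of a, and the partial
    products combine by XOR, i.e. addition without carries."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

# (x^3 + x + 1) * (x^2 + x + 1) over F2
print(bin(poly_mul(0b1011, 0b111)))   # 0b110001, i.e. x^5 + x^4 + 1
```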
Division of one polynomial by another is also analogous to long division (with
remainder) of integers, except again there is no borrowing or carrying. First we
do an example with coefficients viewed as being ordinary integers or real numbers:
                                      x^3 + x^2 − x − 1      R:  x^4 + 3x + 2
    x^5 + x^3 + x + 1 ) x^8 + x^7       + x^4 + x^3             + x + 1
                        x^8       + x^6 + x^4 + x^3
                        -----------------------------------------------
                              x^7 − x^6                         + x + 1
                              x^7       + x^5       + x^3 + x^2
                              -----------------------------------------
                                  − x^6 − x^5       − x^3 − x^2 + x + 1
                                  − x^6       − x^4       − x^2 − x
                                  -------------------------------------
                                        − x^5 + x^4 − x^3      + 2x + 1
                                        − x^5       − x^3       − x − 1
                                        -------------------------------
                                                x^4            + 3x + 2
Thus, in effect, at first we ask how many x^5's go into x^8 (exactly x^3), multiply the divisor by x^3 and subtract from the dividend, getting a temporary remainder, in this case x^7 − x^6 + x + 1. Next, how many x^5's go into x^7? Certainly x^2. Multiply the divisor by x^2 and subtract from the temporary remainder, giving a newer temporary remainder −x^6 − x^5 − x^3 − x^2 + x + 1. Continue until the degree of the remainder is strictly less than the degree of the divisor.
Now we use the same dividend and divisor, but viewed as having coefficients in F2, so that −1 = 1, 1 + 1 = 0, etc. The same computation, with all coefficients reduced mod 2, gives quotient x^3 + x^2 + x + 1 and remainder x^4 + x:

x^8 + x^7 + x^4 + x^3 + x + 1 = (x^3 + x^2 + x + 1) · (x^5 + x^3 + x + 1) + (x^4 + x)
Because in F2 we have −1 = +1, addition and subtraction are conveniently the same thing. Thus, at each line we have the liberty of adding rather than subtracting, which is a little bit easier since addition is a symmetric function of its inputs (rather than being unsymmetric, as subtraction is in general).
Remark: No, it is not possible to tell what kind of numbers the coefficients of a polynomial are intended to be without knowing the context. This is especially true of the simplest expressions such as 1 or 0, and all the more so when we suppress the coefficients, as in x^3 + x.
Remark: In any case, regardless of what kind of coefficients were using, the
exponents of the indeterminate x are still just ordinary non-negative integers.
than descending order as we have done. This doesn't change the idea of the thing, but certainly changes the interpretation in terms of polynomials.

Computation of a single parity bit is computation of a CRC with generating polynomial x + 1.
that d = d(x) is transmitted or played back with some errors, and becomes d̃ = d̃(x) instead. (We may suppress the reference to the indeterminate x in the polynomials here, to reduce the clutter in the notation.)

The error vector or error polynomial is obtained by subtracting:

e = e(x) = d̃(x) − d(x) = d̃ − d

(Since the coefficients are in F2 it doesn't really matter whether we subtract or add.) The number of non-zero coefficients in e(x) is its Hamming weight, and this is exactly the number of bit errors.
Let r = r(x) be the CRC of d(x). It is the remainder when d(x) is divided by g(x), so we can write

d(x) = q(x) · g(x) + r(x) = q · g + r

where q(x) is the quotient obtained by dividing d(x) by g(x). Let r̃(x) be the CRC of d̃(x), and let

d̃(x) = q̃(x) · g(x) + r̃(x) = q̃ · g + r̃
If a single bit error occurs, say in the ith position, then e(x) = x^i. This will fail to be detected by the CRC if and only if g(x) divides e(x) = x^i. Since x^i is just the product of i copies of the factor x, g(x) cannot divide x^i unless g(x) is x^j for some j ≤ i. So already g(x) = x + 1 will detect single bit errors. Single bit errors are easy to detect.
If there are just two bit errors, at the mth and nth positions (with m < n), then the error polynomial is

e(x) = x^m + x^n

This will fail to be detected by the CRC if and only if g(x) divides e(x) = x^m + x^n. This error polynomial can be factored as

e(x) = x^m + x^n = x^m (1 + x^(n−m))

If g(x) has no factor of x, which is easy to arrange by having the constant term be non-zero, then for such an error to go undetected it must be that g(x) divides 1 + x^(n−m) (with remainder 0). Already this is mysterious if we don't know anything else.
algebra identities

x^2 − 1 = (x − 1)(x + 1)
x^3 − 1 = (x − 1)(x^2 + x + 1)
x^4 − 1 = (x − 1)(x^3 + x^2 + x + 1)
x^5 − 1 = (x − 1)(x^4 + x^3 + x^2 + x + 1)
. . .
x^N − 1 = (x − 1)(x^(N−1) + x^(N−2) + . . . + x + 1)

(The fact that we are working with coefficients in F2, and that −1 = +1, does not harm these identities.) Replacing x by x^8, we find

x^16 − 1 = (x^8 − 1)(x^8 + 1)
x^24 − 1 = (x^8 − 1)(x^16 + x^8 + 1)
x^32 − 1 = (x^8 − 1)(x^24 + x^16 + x^8 + 1)
x^40 − 1 = (x^8 − 1)(x^32 + x^24 + x^16 + x^8 + 1)
. . .
x^(8N) − 1 = (x^8 − 1)(x^(8(N−1)) + . . . + x^8 + 1)
That is, x^8 − 1 divides (with remainder 0) any polynomial x^(8N) − 1. For error detection, that means that if two bit errors occur a distance apart which is a multiple of 8, the XOR checksum CRC will not detect it.
Example: Still thinking about 2-bit errors: the CRC with generating polynomial x^4 + x + 1, even though it's only of degree 4, fails to detect two-bit errors only when they're a multiple of 15 apart. That is, x^4 + x + 1 divides x^N − 1 (with remainder 0) only when N is a multiple of 15. You can certainly check, by trying to divide, that no smaller N works. This is some sort of proof that the XOR checksum is inefficient.
Example: Still thinking about 2-bit errors: the CRC with generating polynomial x^5 + x^2 + 1, even though it's only of degree 5, fails to detect two-bit errors only when they're a multiple of 31 apart. That is, x^5 + x^2 + 1 divides x^N − 1 (with remainder 0) only when N is a multiple of 31. You can certainly check by trying to divide that no smaller N works, but this is not the intelligent way to verify the property.
Example: Still thinking about 2-bit errors: let's change the generating polynomial from the previous example slightly, from x^5 + x^2 + 1 to x^5 + x + 1. Mysteriously, the performance deteriorates, so that two-bit errors which are a multiple of 21 apart will pass undetected.
Indeed, let f(x) = x^n + g(x), with g(x) being the lower-degree terms in f(x). Then

f(x)^2 = (x^n)^2 + 2 x^n g(x) + g(x)^2 = (x^n)^2 + 0 + g(x)^2

since 2 = 0 in F2. By induction on the number of terms in the polynomial, we can assume that g(x)^2 = g(x^2), so this gives

f(x)^2 = (x^n)^2 + g(x^2) = (x^2)^n + g(x^2) = f(x^2)

as asserted.
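The identity f(x)^2 = f(x^2) over F2 can be verified exhaustively for small polynomials; a sketch of ours, again storing an F2 polynomial as a bitmask with bit i the coefficient of x^i:

```python
def poly_mul(a: int, b: int) -> int:
    # carry-less multiplication of F2 polynomials stored as bitmasks
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def substitute_x_squared(f: int) -> int:
    # f(x^2): move the coefficient of x^i to x^(2i)
    return sum(1 << (2 * i) for i in range(f.bit_length()) if (f >> i) & 1)

# f(x)^2 == f(x^2) for every F2 polynomial of degree < 6
for f in range(64):
    assert poly_mul(f, f) == substitute_x_squared(f)
print("f(x)^2 == f(x^2) verified for all polynomials of degree < 6")
```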
Exercises

5.01

5.02 Compute the sum 1 + 1 + 1 + . . . + 1 (with 107 ones) in F2.
5.04
5.05
5.06
5.07
5.08
6 The Integers

6.1 The reduction algorithm
(Very often the word modulo is abbreviated as mod.) The non-negative integer
m is the modulus. For example,
10 reduced modulo 7 = 3
10 reduced modulo 5 = 0
12 reduced modulo 2 = 0
15 reduced modulo 7 = 1
100 reduced modulo 7 = 2
1000 reduced modulo 2 = 0
1001 reduced modulo 2 = 1
Remark: In some sources, and sometimes for brevity, this terminology is abused by replacing the phrase "N reduced mod m" by "N mod m". This is not so terrible, but there is also a related but significantly different meaning that "N mod m" has. Usually the context will make clear what the phrase "N mod m" means, but watch out. We will use a notation which is fairly compatible with many computer languages: write

x % m = x reduced modulo m
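A caution worth noting (our aside, not the text's): programming languages disagree about % on negative operands. In Python the result already lies in the range 0 ≤ r < m for a positive modulus m, while C and Java can return a negative value:

```python
# Python's % already yields the representative with 0 <= r < m for a
# positive modulus m, even when x is negative:
print(10 % 7)    # 3
print(-10 % 7)   # 4, since -10 = (-2)*7 + 4
# In C or Java, -10 % 7 evaluates to -3 instead; adding m back and
# reducing again recovers the non-negative remainder:
print((-10 % 7 + 7) % 7)   # 4
```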
The little theorem which describes existence and uniqueness, enabling us to prove things but not necessarily do numerical computations, is:

Theorem: Given a non-zero integer m and an arbitrary integer x, there are unique integers q (for quotient) and r (for remainder) with 0 ≤ r < |m| such that

x = q · m + r
Proof: Let's do the proof just for positive x and m. Let S be the set of all non-negative integers expressible in the form x − s·m for some integer s. The set S is non-empty, since x = x − 0·m lies in it. Let r = x − q·m be the least non-negative element of the set S. (This exists by the well-orderedness of the non-negative integers.) We claim that r < m. (Keep in mind that we're treating only m > 0, so m = |m|.) If not, that is, if r ≥ m, then still r − m ≥ 0, and also

r − m = (x − q·m) − m = x − (q + 1)·m

is still in the set S. But this would contradict the fact that r is the smallest non-negative element in S. Thus, r < m. For uniqueness, suppose that both x = q·m + r and x = q′·m + r′. Then subtract to find

r − r′ = m · (q′ − q)

Thus, r − r′ is a multiple of m. For such quantities r − r′ in the obvious range −m < r − r′ < m, the only one divisible by m is 0, so r = r′. Then it follows easily that q = q′ also.
///
6.2 Divisibility
The ordinary integers Z with operations of addition, subtraction, multiplication,
and division (when possible, since not every quotient x/y of integers is an integer
itself), are intuitive and familiar. In this section we establish some terminology and
basic facts. In particular, at the end we resolve the question of when an integer x
has a multiplicative inverse modulo m.
For two integers d, n, the integer d divides n (or is a divisor of n) if n/d is
an integer. This is equivalent to there being another integer k so that n = kd. We
may also (equivalently) say that n is a multiple of d if d divides n. We write
a|b
if a divides b. As a good sample of how to prove things about divisibility, we have:
Proposition:
///
d ≤ √n < n (where we are looking at inequalities among real numbers!). Therefore, neither of the two factors d nor n/d is 1 or n. So n is not prime.

On the other hand, suppose that n has a proper factorization n = d · e, where e is the larger of the two factors. Then

d = n/e ≤ n/d

gives d^2 ≤ n, so d ≤ √n. ///
Two integers are relatively prime or coprime or mutually prime if for every integer d, if d|m and d|n then d = ±1. Also we may say that m is prime to n if they are relatively prime. For a positive integer n, the number of positive integers less than n and relatively prime to n is denoted by φ(n). This is called the Euler phi-function or Euler totient function. (The trial-and-error approach to computing φ(n) is suboptimal, but works.)
An integer d is a common divisor of a family of integers n1 , . . . , nm if d
divides each of the integers ni . An integer N is a common multiple of a family
of integers n1 , . . . , nm if N is a multiple of each of the integers ni . The following
theorem gives an unexpected and strange-looking characterization of the greatest
common divisor of two integers.
Theorem: Let m, n be integers, not both zero. Among all common divisors of
m, n there is a unique one, call it d, so that for every other common divisor e of
m, n we have e|d, and also d > 0. This divisor d is the greatest common divisor or
gcd of m, n, denoted gcd(m, n). The greatest common divisor of two integers m,
n (not both zero) is the least positive integer of the form x·m + y·n with x, y ∈ Z.
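The characterization of gcd(m, n) as the least positive value of x·m + y·n can be spot-checked by brute force over a bounded range of coefficients; this Python sketch is an illustration of ours, not a proof, and the range must be large enough to reach the gcd:

```python
from math import gcd

def least_positive_combination(m: int, n: int, bound: int = 50) -> int:
    """Least positive value of x*m + y*n over a small brute-force range
    of coefficients x, y (illustration only; the range must be large
    enough for a combination realizing the gcd to appear)."""
    return min(x * m + y * n
               for x in range(-bound, bound + 1)
               for y in range(-bound, bound + 1)
               if x * m + y * n > 0)

for m, n in [(12, 18), (35, 21), (101, 8)]:
    assert least_positive_combination(m, n) == gcd(m, n)
print("least positive x*m + y*n agreed with gcd(m, n) in each case")
```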
Corollary: Let m, n be integers, neither zero. Then the least common multiple lcm(m, n) exists, and

lcm(m, n) = m · n / gcd(m, n)
Proof: This is a corollary of the previous result, because we use the existence and form of the gcd to prove that of the lcm. Indeed, we will grant the existence of the gcd and show that the quantity

L = m · n / gcd(m, n)

is the least common multiple of m and n. First we show that this L is a multiple of both. Indeed, gcd(m, n) divides n, so n/gcd(m, n) is an integer, and so

L = m · ( n / gcd(m, n) )

is an integer multiple of m; symmetrically, L is an integer multiple of n. Now let M = r·m = s·n be any common multiple of m and n, and write g = gcd(m, n) = a·m + b·n for some integers a, b. Then

M = M · (a·m + b·n)/g = a·(s·n)·m/g + b·(r·m)·n/g = (a·s + b·r) · (m·n/g) = (as + br) · L

so that every common multiple of m and n is an integer multiple of L, as claimed. ///
Proof: (of Lemma) If p|a we are done. So suppose that p does not divide a. Then the greatest common divisor gcd(p, a) cannot be p. But this greatest common divisor is also a divisor of p, and is positive. Since p is prime, the only positive divisor of p other than p itself is just 1. Therefore, gcd(p, a) = 1. We saw that there exist integers x, y so that x·p + y·a = 1. Since p|(ab), we can write ab = h·p for some integer h. Then

b = b · 1 = b · (x·p + y·a) = b·x·p + y·(a·b) = b·x·p + y·h·p = (bx + yh) · p

This shows that b is a multiple of p. ///
Proof: (of Corollary) This is by induction on n. The Lemma is the assertion for
n = 2. Suppose p|(a1 . . . an ). Then write the latter product as
a1 . . . an = (a1 . . . an1 ) an
By the lemma, either p divides an or p divides a1 a2 . . . an1 . If p|an we are done.
If not, then p|(a1 . . . an1 ). By induction, this implies that p divides one of the
factors a1 , a2 , . . . , an1 . Altogether, we conclude that in any case p divides one of
the factors a1 , . . . , an .
///
Proof: (of Theorem) First we prove that every integer has a factorization into primes, and then that it is unique. It certainly suffices to treat only factorizations of positive integers, since factorizations for n and −n are obviously related.

For existence, suppose that some integer n > 1 did not have a factorization into primes, and let n be the least such integer. Then n cannot be prime itself, or else n = n is already a factorization into primes. Therefore n has a proper factorization n = x·y with x, y > 1. Since the factorization is proper, both x and y are strictly smaller than n, so x and y both can be factored into primes. Putting together the two factorizations gives a factorization of n, contradicting the assumption that n lacks a prime factorization.
Now we prove uniqueness. Suppose we have two factorizations

q1^e1 · · · qm^em = N = p1^f1 · · · pn^fn

into powers of distinct primes qi and distinct primes pj, with all exponents positive. By the corollary, q1 divides some pj; since both are primes, q1 = pj, and we may renumber so that q1 = p1 and, comparing the powers of q1 dividing N, e1 = f1. Then

N / q1^e1 = q2^e2 · · · qm^em = p2^f2 · · · pn^fn
We had assumed that all the exponents ei were positive, so N/q1^e1 < N. Thus, by induction, N/q1^e1 has unique factorization, and we conclude that all the remaining factors must match up. This finishes the proof of the unique factorization theorem. ///
Now we prove the corollary, giving the formula for Euler's phi-function:

φ(N) = (p1 − 1) p1^(e1−1) · (p2 − 1) p2^(e2−1) · · · (pn − 1) pn^(en−1)

where N = p1^e1 · · · pn^en is the factorization into distinct prime factors pi, and all exponents are positive integers. The argument is by counting: we'll count the number of numbers x in the range from 0 through N − 1 which do have a common factor with N, and subtract. And, by unique factorization, if x has a common factor with N then it has a common prime factor with N. There are exactly N/pi numbers divisible by pi between 0 and N − 1, so we would be tempted to say that the number of numbers in that range with no common factor with N would be

N − N/p1 − N/p2 − . . . − N/pn
However, this is not correct in general: we have accounted twice for numbers divisible by two different pi's, so we should add back in all the expressions N/(pi·pj) with i ≠ j. But then we've added back in too many things, and have to subtract all the expressions N/(pi·pj·pk) with i, j, k distinct. And so on:

φ(N) = N − Σ_i N/pi + Σ_{i≠j} N/(pi·pj) − Σ_{i,j,k distinct} N/(pi·pj·pk) + . . .
     = N · (1 − 1/p1)(1 − 1/p2) · · · (1 − 1/pn)
     = p1^e1 (1 − 1/p1) · p2^e2 (1 − 1/p2) · · · pn^en (1 − 1/pn)

///
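The formula can be compared against the trial-and-error count mentioned earlier; a Python sketch of ours, with the prime factorization supplied by hand:

```python
from math import gcd, prod

def phi_from_factorization(factors: dict) -> int:
    """Euler phi from a prime factorization {p: e}: the product of
    (p - 1) * p**(e - 1) over the distinct primes p."""
    return prod((p - 1) * p ** (e - 1) for p, e in factors.items())

def phi_brute(N: int) -> int:
    # the trial-and-error count: positive x < N with gcd(x, N) = 1
    return sum(1 for x in range(1, N) if gcd(x, N) == 1)

# 720 = 2^4 * 3^2 * 5
print(phi_from_factorization({2: 4, 3: 2, 5: 1}))   # 192
print(phi_brute(720))                               # 192
```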
d3 > 1 of N/(d1·d2) is found, or it is determined that N/(d1·d2) has no proper divisor ≤ √(N/(d1·d2)). In the latter case N/(d1·d2) is prime. In the former case the process continues. This recursive procedure ends when some N/(d1·d2 · · · dm) is prime. At the same time, if N has no divisor d in the range 1 < d ≤ √N, then N is prime.
6.4 A failure of unique factorization

In the ring

R = {a + b√−5 : a, b ∈ Z}

the equation

6 = 2 · 3 = (1 + √−5)(1 − √−5)

gives two different-looking factorizations of 6 in R. But to genuinely verify that we've factored 6 in two different ways into primes, we should verify that 2, 3, 1 + √−5, and 1 − √−5 really cannot be factored further in R. Define a conjugation on R by

conj(a + b√−5) = a − b√−5

For α, β in R, we have the property conj(αβ) = conj(α) · conj(β): with α = a + b√−5 and β = c + d√−5,

αβ = (ac − 5bd) + (ad + bc)√−5

whose conjugate is

(ac − 5bd) − (ad + bc)√−5 = (a − b√−5)(c − d√−5) = conj(α) · conj(β)

This computation is a special case of more general ones, as is already visible from the fact that the −5 played no serious role.
If 1 + √−5 = α · β were a proper factorization in R, then taking norms would give

6 = N(1 + √−5) = N(α) · N(β)

By unique factorization in the ordinary integers Z, and since these norms are non-negative integers, the integers N(α) and N(β) must be either 1, 6, or 2, 3, or 3, 2, or 6, 1. By our observation on the possible small values of the norm, the middle two cases are impossible. In the remaining two cases, one of α or β is ±1, and the factorization is not proper. That is, 1 + √−5 cannot be factored further in R. An essentially identical discussion applies to 1 − √−5. Thus,

6 = 2 · 3 = (1 + √−5)(1 − √−5)

really exhibits a failure of unique factorization in R.
6.5 The Euclidean Algorithm
To find gcd(614, 513):

614 − 1 · 513 = 101    (reduction of 614 mod 513)
513 − 5 · 101 = 8      (reduction of 513 mod 101)
101 − 12 · 8  = 5      (reduction of 101 mod 8)
8 − 1 · 5     = 3      (reduction of 8 mod 5)
5 − 1 · 3     = 2      (reduction of 5 mod 3)
3 − 1 · 2     = 1      (reduction of 3 mod 2)
2 − 2 · 1     = 0      (reduction of 2 mod 1)
Notice that the first step is reduction of the larger of the given numbers modulo
the smaller of the two. The second step is reduction of the smaller of the two modulo
the remainder from the first step. At each step, the modulus of the previous step
becomes the dividend for the next step, and the remainder from the previous
step becomes the modulus for the next step.
In this example, since we obtained a 1 as the last non-zero remainder, we know
that the greatest common divisor of 614 and 513 is just 1, that is, that 614 and 513
are relatively prime. By the time we got close to the end, it could have been clear
that we were going to get 1 as the gcd, but we carried out the procedure through
the final step.
Notice that we did not need to find prime factorizations in order to use the
Euclidean Algorithm to find the greatest common divisor. Since it turns out to be
a time-consuming task to factor numbers into primes, this fact is worth something.
As another example, let's find the gcd of 1024 and 888:

1024 − 1 · 888 = 136   (reduction of 1024 mod 888)
888 − 6 · 136  = 72    (reduction of 888 mod 136)
136 − 1 · 72   = 64    (reduction of 136 mod 72)
72 − 1 · 64    = 8     (reduction of 72 mod 64)
64 − 8 · 8     = 0     (reduction of 64 mod 8)
In this case, since we got a remainder 0, we must look at the remainder on the
previous line: 8. The conclusion is that 8 is the greatest common divisor of 1024
and 888.
x − q1 · y = r1
y − q2 · r1 = r2
r1 − q3 · r2 = r3

where 0 ≤ r1 < |y|, 0 ≤ r2 < r1, and 0 ≤ r3 < r2. We claim that r3 < r1/2. That is, we claim that in each two steps of the algorithm the remainder decreases at least by a factor of 1/2. (Keep in mind that all the remainders are non-negative integers.)
If already r2 ≤ r1/2, then since r3 < r2 we certainly have what we want. On the other hand, if r2 > r1/2 (but still r2 < r1), then evidently q3 = 1, and

r3 = r1 − q3 · r2 = r1 − r2 < r1 − (1/2) r1 = (1/2) r1

as desired.
Since in the first step the remainder is (strictly) smaller than |y|, after 1 + 2n steps the remainder is (strictly) smaller than |y|/2^n. The algorithm stops when this remainder is 0 or 1. The remainder is an integer, so we can say that the algorithm stops when the remainder is strictly less than 2. Thus, the algorithm stops by the time

|y| / 2^n < 2

which is equivalent to

log2 |y| < 1 + n

or

2 log2 |y| − 1 < 1 + 2n = number of steps

The way we have arranged it here, the number of steps is an odd integer, and the latter inequality is satisfied for an odd integer at most 2 log2 |y| + 1. This proves the proposition. ///
So far we've only seen how to find gcd(x, y). For small numbers we might feel that it's not terribly hard to do this just by factoring x, y into primes and comparing the factorizations.

6.5
Running the algorithm backward, the gcd 1 is expressed as an integer linear combination of 614 and 513:
1 = 3 − 1 · 2                          (from the last line of the forward algorithm)
  = 3 − 1 · (5 − 1 · 3)                (replacing 2 by its expression from the previous line)
  = −1 · 5 + 2 · 3                     (rearranging as sum of 5s and 3s)
  = −1 · 5 + 2 · (8 − 1 · 5)           (replacing 3 by its expression from the previous line)
  = 2 · 8 − 3 · 5                      (rearranging as sum of 8s and 5s)
  = 2 · 8 − 3 · (101 − 12 · 8)         (replacing 5 by its expression from the previous line)
  = −3 · 101 + 38 · 8                  (rearranging as sum of 101s and 8s)
  = −3 · 101 + 38 · (513 − 5 · 101)    (replacing 8 by its expression from the previous line)
  = 38 · 513 − 193 · 101               (rearranging as sum of 513s and 101s)
  = 38 · 513 − 193 · (614 − 513)       (replacing 101 by its expression from the previous line)
  = 231 · 513 − 193 · 614              (rearranging as sum of 614s and 513s)
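The back-substitution just performed can be mechanized; this sketch (the name `extended_gcd` is ours) carries the coefficients forward instead of substituting backward, and arrives at the same kind of identity:

```python
def extended_gcd(x, y):
    """Return (g, s, t) with g = gcd(x, y) and g = s*x + t*y."""
    # invariant: a = s0*x + t0*y and b = s1*x + t1*y throughout
    a, b = x, y
    s0, t0, s1, t1 = 1, 0, 0, 1
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return a, s0, t0

g, s, t = extended_gcd(614, 513)
print(g, s, t)  # 1 -193 231, matching 231 * 513 - 193 * 614 = 1
```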
Proof: (that the Euclidean Algorithm computes greatest common divisors): The crucial claim is that if
x − q · y = r
with 0 ≤ r < |y|, then gcd(x, y) = gcd(y, r). If we can prove this claim, then we
know that the gcd of the two numbers at each step of the algorithm is the same
as the gcd of the two initial inputs to the algorithm. And, at the end, when the
remainder is 0, the last two equations will be of the form
x′ − q′ · y′ = d
y′ − q″ · d = 0
This shows that d divides y′, so gcd(y′, d) = d. At the same time, if we grant the crucial claim just above, gcd(y′, d) is the same as the gcd gcd(x, y) of the original inputs. Thus, the gcd of the two original inputs is indeed the last non-zero remainder.
Now we prove the crucial claim, that if
x − q · y = r
with 0 ≤ r < |y|, then gcd(x, y) = gcd(y, r). On one hand, if d divides both x and y, say x = A·d and y = B·d, then
r = x − q·y = A·d − q·B·d = (A − q·B) · d
so d divides r. On the other hand, if d divides both y and r, say y = B·d and r = C·d, then
x = q·y + r = q·B·d + C·d = (q·B + C) · d
so d divides x. This proves that the two gcds are the same.
///
6.6 Equivalence relations
The goal here is to make precise both the idea and the notation in writing something like x ∼ y to mean that x and y have some specified common feature. We can set up a general framework for this without worrying about the specifics of what the features might be.
Recall the formal definition of a function f from a set S to a set T: while we think of f as being some sort of rule which, to an input s ∈ S, computes or associates an output f(s) ∈ T, this way of talking is inadequate, for many reasons. Rather, the formal (possibly non-intuitive) definition of a function f from a set S to a set T is that it is a subset G of the cartesian product S × T with the property:
For each s ∈ S there is exactly one t ∈ T so that (s, t) ∈ G.
Then connect this to the usual notation by
f(s) = t
if
(s, t) ∈ G
(Again, this G would be the graph of f if S and T were simply the real line, for example.)
In this somewhat formal context, first there is the primitive general notion of a relation R on a set S: a relation R on a set S is simply a subset of the cartesian product S × S. Write
x R y
if the ordered pair (x, y) lies in the subset R of S × S.
This definition of "relation", compared to the formal definition of "function", makes it clear that every function is a relation. But most relations do not meet the condition to be functions. This definition of relation is not very interesting except as set-up for further development.
An equivalence relation R on a set S is a special kind of relation, satisfying
Reflexivity: x R x for all x ∈ S
Symmetry: if x R y then y R x
Transitivity: if x R y and y R z then x R z
The fundamental example of an equivalence relation is ordinary equality of
numbers. Or equality of sets. Or any other version of equality to which we are
accustomed. It should also be noted that a very popular notation for an equivalence relation is
x ∼ y
(that is, with a tilde rather than an R). Sometimes this is simply read as "x tilde y", but also sometimes as "x is equivalent to y", with only implicit reference to the equivalence relation.
A simple example of an equivalence relation on the set R^2 can be defined by
(x, y) ∼ (x′, y′) if and only if x = x′
That is, in terms of analytic geometry, two points are equivalent if and only if they
lie on the same vertical line. Verification of the three required properties in this
case is easy, and should be carried out by the reader.
Proof: The fact that the union of the equivalence classes is the whole thing is not so amazing: given x ∈ S, x certainly lies inside the equivalence class
{ y ∈ S : y ∼ x }
Now let A and B be two equivalence classes. Suppose that A ∩ B ≠ ∅, and show that then A = B (as sets). Since the intersection is non-empty, there is some element y ∈ A ∩ B. Then, by the definition of equivalence class, for all a ∈ A we have a ∼ y, and likewise for all b ∈ B we have b ∼ y. By transitivity, a ∼ b. This is true for all a ∈ A and b ∈ B, so (since A and B are equivalence classes) we have A = B.
///
6.7
A collection X of non-empty subsets of a set S, whose union is the whole set S and which are mutually disjoint, is called a partition of S. The previous proposition can be run the other direction as well:
Proof: Since the union of the sets in X is the whole set S, each element x ∈ S is contained in some X ∈ X. Thus, we have the reflexivity property x ∼ x. If x ∼ y then there is X ∈ X containing both x and y, and certainly y ∼ x, so we have symmetry.
Finally, the mutual disjointness of the sets in X assures that each y ∈ S lies in just one of the sets from X. For y ∈ S, let X be the unique set from X which contains y. If x ∼ y and y ∼ z, then it must be that x ∈ X and z ∈ X, since y lies in no other subset from X. Then x and z both lie in X, so x ∼ z, and we have transitivity.
Verification that the equivalence classes are the elements of X is left as an exercise.
///
///
Z/3 = { 0-mod-3, 1-mod-3, 2-mod-3 } (so that, for example, 6 lies in the class 0-mod-3, and 10 lies in the class 1-mod-3)
Remark: On many occasions, the bar is dropped, so that x-mod-m may be written
simply as x with only the context to make clear that this means x-mod-m and not
the integer x. Also, of course, we can use symbols without bars for elements of the
set Z/m.
Thus, for example, modulo 12 we have
0 = 12 = −12 = 2400
7 = −5 = 2407
1 = 13 = −11 = 2401
or, equivalently,
0-mod-12 = 12-mod-12 = −12-mod-12 = 2400-mod-12
7-mod-12 = −5-mod-12 = 2407-mod-12
Proof: It suffices to prove only the more general assertions. Since x′ = x mod m, m | (x′ − x), so there is an integer k so that mk = x′ − x. That is, we have x′ = x + mk. Similarly, we have y′ = y + ℓm for some integer ℓ. Then
x′ + y′ = (x + mk) + (y + mℓ) = x + y + m · (k + ℓ)
Thus, x′ + y′ = x + y mod m. And
x′ · y′ = (x + mk) · (y + mℓ) = x · y + xmℓ + mky + mk · mℓ = x · y + m · (xℓ + ky + mkℓ)
Thus, x′ · y′ = x · y mod m.
///
Remark: Don't become over-confident, though. For example, it is not true that
2^10 = 2^(10%5) mod 5
as we can check by noting that
2^10 % 5 = 1024 % 5 = 4
while
2^(10%5) % 5 = 2^0 % 5 = 1 % 5 = 1
and 1 ≠ 4. That is, exponents can't be simply reduced modulo the modulus.
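This caution is easy to check for oneself; Python's three-argument pow does modular exponentiation:

```python
# Reducing the base modulo the modulus is safe; reducing the
# exponent modulo the modulus changes the answer.
print(pow(2, 10, 5))      # 2^10 % 5 = 1024 % 5 = 4
print(pow(2, 10 % 5, 5))  # 2^(10%5) % 5 = 2^0 % 5 = 1
```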
As a corollary of this last proposition, congruences immediately inherit some properties from ordinary arithmetic, simply because x = y implies x = y mod m:
Distributivity: x(y + z) = xy + xz mod m
Associativity of addition: (x + y) + z = x + (y + z) mod m
Associativity of multiplication: (xy)z = x(yz) mod m
Property of 1: 1 · x = x · 1 = x mod m
Property of 0: 0 + x = x + 0 = x mod m
Recall that we proved that a has a multiplicative inverse if and only if gcd(a, m) = 1, in which case the Euclidean Algorithm is an effective means to actually find the inverse. There is a separate notation for the integers-mod-m which are relatively prime to m and hence have inverses:
(Z/m)^× = { x ∈ Z/m : gcd(x, m) = 1 }
The superscript is not an x but is a "times", making a reference to multiplication and multiplicative inverses mod m. Note also that gcd(x, m) is independent of the representative x of the equivalence class, so this is well-defined!
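As a quick illustration (the function name is ours), (Z/m)^× can be listed by testing gcds of the representatives 0, 1, . . . , m − 1:

```python
from math import gcd

def units_mod(m):
    """The elements of (Z/m)^x, as least non-negative representatives."""
    return [x for x in range(m) if gcd(x, m) == 1]

print(units_mod(12))       # [1, 5, 7, 11]
print(len(units_mod(7)))   # 6: every non-zero residue mod a prime
```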
Proof: One way to think about this would be in terms of prime factorizations, but let's do without that. Rather, let's use the fact that the gcd of two integers a, b can be expressed as
gcd(a, b) = sa + tb
for some integers s, t. Since gcd(x, m) = 1 and gcd(y, m) = 1, there are integers a, b, c, d so that
1 = ax + bm
1 = cy + dm
Then
1 = 1 · 1 = (ax + bm)(cy + dm) = (ac)(xy) + (bcy + axd + bdm) · m
Thus, 1 is expressible in the form A(xy) + Bm, so (by the sharp form of this principle!) necessarily xy and m are relatively prime.
///
So in the batch of things denoted (Z/m)^× we can multiply and take inverses (so, effectively, divide).
Proof: We've already done most of the work to prove this. First, prior to proving any of these properties, there was the funny business of verifying that addition and multiplication are well-defined modulo p, meaning that the operations really make sense mod p.
After the well-definedness is proven, the associativity and distributivity and commutativity are simply inherited by Z/p from the corresponding properties of the ordinary integers. For example, the additive inverse of x-mod-p is just (−x)-mod-p. But, for example, if we try to find −2 among {0, 1, 2, 3, 4}, then we might mistakenly think that 2 has no additive inverse modulo 5. In reality, the additive inverse of 2 modulo 5 is 3, since −2 = 3 mod 5.
The only real issue is verifying that non-zero things x modulo p have multiplicative inverses. Note that non-zero modulo p means that p does not divide x. Thus, gcd(x, p) is not p, but some proper divisor of p. But since p is prime there are few choices left: we must have gcd(x, p) = 1. By the funny characterization of gcds, there are integers a, b so that ax + bp = 1, and then (as we've discussed already on another occasion) a is a multiplicative inverse of x mod p. That is, x^(−1) = a mod p.
///
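The inverse can be computed exactly as in this argument: run the Euclidean Algorithm to find a with ax + bp = 1. A sketch (the name is ours; Python 3.8+ also offers pow(x, -1, p)):

```python
def inverse_mod(x, p):
    """Find a with a*x = 1 mod p, via a*x + b*p = gcd(x, p) = 1."""
    # track only the coefficient of x; modulo p, the coefficient of p is 0
    a, b, g0, g1 = 1, 0, x, p
    while g1 != 0:
        q = g0 // g1
        g0, g1 = g1, g0 - q * g1
        a, b = b, a - q * b
    if g0 != 1:
        raise ValueError("not invertible: gcd(x, p) != 1")
    return a % p

print(inverse_mod(2, 5))  # 3, since 2 * 3 = 6 = 1 mod 5
```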
Corollary: For every prime integer p, there exists a finite field with p elements,
denoted Fp .
///
Remark: What is not clear is that there are no other finite fields with a prime
number p of elements than Z/p. In fact, it is true that a field F with a prime
number p of elements is the same as Z/p, but this requires proof.
Collections of numbers such as the ordinary integers Z don't quite have all the properties of a field, however. The particular missing property is that not every non-zero element of Z has a multiplicative inverse (in Z). Such a collection, which meets all the conditions above except possibly the requirement that every non-zero element has a multiplicative inverse, is a commutative ring. If the commutativity of multiplication is also dropped, then we have a ring.
Remark: The entity Z/m is not a field for m not prime, although it is a commutative ring. Indeed, let m = a · b be a proper factorization of m. Then it's pretty easy to see that neither a nor b is actually 0 modulo m, but also that neither has a multiplicative inverse.
Remark: We will see later that for any prime power p^n there exists a finite field F_(p^n) with p^n elements. It is important to realize that for n > 1 it is never the case that Z/p^n gives a field, despite the fact that it has p^n elements. As in the previous remark, if n > 1 there are many non-zero elements in Z/p^n which have no multiplicative inverses, failing that important property of a field.
Remark: We could also spend time proving that in a field there is only one
element that behaves like 0, only one element that behaves like 1, that additive or
multiplicative inverses are unique, and such things, but since our viewpoint will be
mostly computational, proof of these very unsurprising facts is not urgent.
6.9
Remark: The corollary follows easily from the theorem by remembering that if gcd(x, p) = 1 then x has a multiplicative inverse x^(−1) modulo p. Then multiply both sides of the equation x^p = x mod p by x^(−1) to obtain the assertion of the corollary.
Proof: We will first prove that the prime p divides the binomial coefficients (p choose i) with 1 ≤ i ≤ p − 1, keeping in mind that the extreme cases i = 0 and i = p can't possibly also have this property, since
(p choose 0) = 1      (p choose p) = 1
Indeed, from its definition,
(p choose i) = p! / (i! · (p − i)!)
Certainly p divides the numerator. Since 0 < i < p, the prime p divides none of the factors in the factorials in the denominator. By unique factorization into primes, this means that p does not divide the denominator at all. Indeed,
(p choose i) · i! · (p − i)! = p!
The prime p divides the right-hand side, so it divides the left-hand side. But p cannot divide i! nor (p − i)! (for 0 < i < p), since these two numbers are products of integers smaller than p and hence not divisible by p. (And, even more important, we have seen that if a prime p does not divide a, b then p does not divide ab.)
The Binomial Theorem asserts that
(x + y)^p = Σ_{0 ≤ i ≤ p} (p choose i) · x^i · y^(p−i)
In particular, since the coefficients of the left-hand side are integers, the same must be true of the coefficients on the right-hand side. Thus, all the binomial coefficients are integers. We did not use the fact that p is prime to reach this conclusion.
Thus, the binomial coefficients with 0 < i < p are integers expressed as fractions whose numerators are divisible by p and whose denominators are not divisible
by p. Thus, when all cancellation is done in the fraction, there must remain a factor of p in the numerator. This proves the desired fact about binomial coefficients.
(One might notice that unique factorization is used here!)
Now we prove Fermat's Little Theorem for positive integers x by induction on x. First, certainly 1^p = 1 mod p. For the induction step, suppose that we already know for some particular x that
x^p = x mod p
Then
(x + 1)^p = Σ_{0 ≤ i ≤ p} (p choose i) · x^i · 1^(p−i) = x^p + Σ_{0 < i < p} (p choose i) · x^i + 1
All the coefficients in the sum in the middle of the last expression are divisible by p. Therefore,
(x + 1)^p = x^p + 0 + 1 = x + 1 mod p
since our induction hypothesis is that x^p = x mod p. This proves the theorem for positive x.
///
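Fermat's Little Theorem is easy to spot-check numerically (a sketch; the helper name is ours):

```python
def fermat_holds(p):
    """Check x^p = x mod p for every residue x mod p."""
    return all(pow(x, p, p) == x % p for x in range(p))

print(all(fermat_holds(p) for p in (2, 3, 5, 7, 11, 13)))  # True
print(pow(2, 4, 4))  # 0, not 2: the statement can fail for composite moduli
```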
Remark: The special case that n is prime is just Fermat's Little Theorem, since for prime p we easily see that φ(p) = p − 1.
6.10 Euler's theorem
///
Remark: On one hand, this argument might hint that it is a mere shadow of
some more systematic general approach. This is indeed the case. On the other
hand, there are other equally important techniques toward which this little proof
gives no hint.
Proof: Basically by the definition of φ(n), the number of distinct residue classes x modulo n with x relatively prime to n is φ(n). We claim that if ℓ is the smallest positive integer so that g^ℓ = 1 mod n, then we can only get ℓ different values of g^L mod n, no matter what integer exponent L we use. Indeed, write L = qℓ + r with 0 ≤ r < ℓ. Then
g^L = g^(qℓ+r) = (g^ℓ)^q · g^r = 1^q · g^r = g^r mod n
That is, in fact, all possible values of g^L mod n lie in the list
g^0, g^1, g^2, . . . , g^(ℓ−1)
This proves the proposition.
///
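The proposition can be illustrated by computing such orders directly (a sketch; `order_mod` is our name):

```python
from math import gcd

def order_mod(g, n):
    """Smallest positive l with g^l = 1 mod n, for gcd(g, n) = 1."""
    assert gcd(g, n) == 1
    l, power = 1, g % n
    while power != 1:
        power = (power * g) % n
        l += 1
    return l

print(order_mod(2, 7))   # 3: powers of 2 mod 7 cycle through 2, 4, 1
print(order_mod(3, 7))   # 6: 3 is a primitive root mod 7
```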
For most integers n there is no primitive root modulo n. The precise statement is
Theorem: The only integers n for which there is a primitive root modulo n are
those of the forms
n = p^e with an odd prime p and e ≥ 1
n = 2 · p^e with an odd prime p and e ≥ 1
n = 2, 4
This will be proven later. In particular, the most important case is that there
do exist primitive roots modulo primes. It is useful to make clear one important
property of primitive roots:
Proposition: Let g be a primitive root modulo a prime p. Let ℓ be an integer so that
g^ℓ = 1 mod p
Then p − 1 divides ℓ.
6.12 Euler's criterion
Proof: (Easy half) Suppose that x = y^n mod p. Then, invoking Fermat's Little Theorem,
x^((p−1)/n) = (y^n)^((p−1)/n) = y^(p−1) = 1 mod p
as claimed.
(Hard half) Now suppose that x^((p−1)/n) = 1 mod p, and show that x is an nth power. Let g be a primitive root modulo p, and let ℓ be a positive integer so that g^ℓ = x. We have
(g^ℓ)^((p−1)/n) = 1 mod p
From the discussion of primitive roots above, this implies that
(p − 1) | ℓ · (p − 1)/n
Let k be an integer such that
k · (p − 1) = ℓ · (p − 1)/n
Then kn = ℓ, and
x = g^ℓ = g^(kn) = (g^k)^n mod p
That is, x is the nth power of g^k.
///
x^16 = (x^8)^2
x^32 = (x^16)^2
. . .

6.13
Then
x^e = x^(e_0) · (x^2)^(e_1) · (x^4)^(e_2) · (x^8)^(e_3) · (x^16)^(e_4) · . . . · (x^(2^n))^(e_n)
Again, the e_i's are just 0 or 1, so in fact this notation is clumsy: we omit the factor x^(2^k) if e_k = 0 and include the factor x^(2^k) if e_k = 1.
A fairly good way of implementing this is the following, which we call the Fast Modular Exponentiation algorithm. To compute x^e % m, we will keep track of a triple (X, E, Y) which initially is (X, E, Y) = (x, e, 1). At each step of the algorithm:
If E is odd, then replace Y by (X · Y) % m and replace E by E − 1.
If E is even, then replace X by (X · X) % m and replace E by E/2.
When E = 0, the value of Y at that time is x^e % m.
This algorithm takes at most 2 log2 E steps. Note that in the fast exponentiation modulo m, no number larger than m^2 will arise. Thus, for example, to compute something like
2^1000 % 1000001
would require no more than 2 log2 1000 ≈ 2 · 10 = 20 multiplications of 6-digit numbers.
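In code the (X, E, Y) procedure is just a few lines (a sketch; Python's built-in pow(x, e, m) does the same job):

```python
def fast_exp_mod(x, e, m):
    """Fast Modular Exponentiation via the (X, E, Y) triple."""
    X, E, Y = x % m, e, 1
    while E > 0:
        if E % 2 == 1:
            Y = (X * Y) % m   # E odd: fold X into the answer, E -> E - 1
            E = E - 1
        else:
            X = (X * X) % m   # E even: square X, E -> E / 2
            E = E // 2
    return Y

print(fast_exp_mod(2, 1000, 89))  # 45
```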
For example, let's directly evaluate 2^1000 mod 89. Setting this up as indicated just above, we have

  X      E      Y
  2    1000     1     initial state
  4     500     1     E was even: square X mod 89
 16     250     1     E was even: square X mod 89
 78     125     1     E was even: square X mod 89
 78     124    78     E was odd: multiply Y by X mod 89
 32      62    78     E was even: square X mod 89
 45      31    78     E was even: square X mod 89
 45      30    39     E was odd: multiply Y by X mod 89
 67      15    39     E was even: square X mod 89
 67      14    32     E was odd: multiply Y by X mod 89
 39       7    32     E was even: square X mod 89
 39       6     2     E was odd: multiply Y by X mod 89
  8       3     2     E was even: square X mod 89
  8       2    16     E was odd: multiply Y by X mod 89
 64       1    16     E was even: square X mod 89
 64       0    45     E was odd: multiply Y by X mod 89

We conclude that
2^1000 % 89 = 45
Proof: First, the peculiar characterization of the gcd(m, n) as the smallest positive
integer expressible in the form am + bn for integers a and b assures (since here
gcd(m, n) = 1) that integers r and s exist such that rm + sn = 1. Second, we
should check that the function f is well-defined, that is, that if x′ = x + am and y′ = y + bn for some integers a and b, then still
f(x′, y′) = f(x, y)
Indeed,
f(x′, y′) = y′rm + x′sn = (y + bn)rm + (x + am)sn = yrm + xsn + mn(br + as) = f(x, y) mod mn
This proves the well-definedness.
To prove surjectivity of f, for any integer z, let x = z and y = z. Then
f(x, y) = zrm + zsn = z(rm + sn) = z · 1 mod mn
To prove injectivity, we could use the fact that Z/m × Z/n and Z/mn are finite sets of the same size, so a surjective function is necessarily injective. But we can learn a little more by a more direct proof. Suppose that
f(x′, y′) = f(x, y)
6.14 Sun-Ze's theorem
Then modulo m the terms yrm and y′rm are 0, so this asserts that
xsn = x′sn mod m
From rm + sn = 1 we obtain sn = 1 mod m, so
x = x′ mod m
Symmetrically,
y = y′ mod n
This proves injectivity.
Finally, observe that (by the same reasoning)
f(x, y) = yrm + xsn = y · 0 + x · 1 = x mod m
and similarly
f(x, y) = yrm + xsn = y · 1 + x · 0 = y mod n
These facts, together with the identity f(z, z) = z mod mn already proven, show that f^(−1) is as claimed.
///
The more general version is
Theorem: (Sun-Ze) For m1, . . . , mn mutually relatively prime, the map
g : Z/(m1 · · · mn) → Z/m1 × Z/m2 × · · · × Z/mn
defined by
g(x) = (x mod m1, x mod m2, . . . , x mod mn)
is a bijection.
///
The discussion of the congruence modulo n is nearly identical, with roles reversed. Let's do it:
x_o = (a(tn) + b(sm)) mod n = 0 + b(sm) mod n
= b(sm) mod n = b(1 − tn) mod n
= b · 1 mod n = b mod n
Thus, anything congruent to this x_o modulo mn is a solution to the system.
On the other hand, suppose x is a solution to the system, and let's prove that it is congruent to x_o modulo mn. Since x = a mod m and x = b mod n, we have
x − x_o = a − a = 0 mod m
and
x − x_o = b − b = 0 mod n
That is, both m and n divide x − x_o. Since m and n are relatively prime, we can conclude that mn divides x − x_o, as desired.
Note that the process of sticking the solutions together via the formula above uses the Euclidean Algorithm in order to be computationally effective (rather than just theoretically possible).
For example, let's solve the system
x = 2 mod 11
x = 7 mod 13
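Following the formula x_o = a(tn) + b(sm) with sm + tn = 1 from the proof, a small solver (names ours, not the text's) handles such a system:

```python
def solve_pair(a, m, b, n):
    """Solve x = a mod m and x = b mod n for coprime m, n."""
    # extended Euclid: find s, t with s*m + t*n = 1
    r0, r1, s0, s1 = m, n, 1, 0
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
    assert r0 == 1, "moduli must be relatively prime"
    t = (1 - s0 * m) // n
    return (a * t * n + b * s0 * m) % (m * n)

print(solve_pair(2, 11, 7, 13))  # 46: 46 = 2 mod 11 and 46 = 7 mod 13
```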
More generally, consider a system of congruences
x = b1 mod m1
x = b2 mod m2
x = b3 mod m3
. . .
x = bn mod mn
We'll only consider the scenario that m_i and m_j are relatively prime (for i ≠ j). We solve it in steps: first, just look at the subsystem
x = b1 mod m1
x = b2 mod m2
and use the method above to turn this into a single (equivalent!) congruence of the form
x = c2 mod m1·m2
Then look at the system
x = c2 mod m1·m2
x = b3 mod m3
and use the method above to combine these two congruences into a single equivalent one, say
x = c3 mod m1·m2·m3
and so on.
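The step-by-step combination described above is a fold over the list of congruences; a sketch (names ours):

```python
def solve_system(congruences):
    """Solve x = b_i mod m_i (pairwise coprime m_i), two at a time."""
    c, M = congruences[0]
    for b, m in congruences[1:]:
        # combine x = c mod M with x = b mod m into x = c' mod M*m
        r0, r1, s0, s1 = M, m, 1, 0
        while r1 != 0:
            q = r0 // r1
            r0, r1 = r1, r0 - q * r1
            s0, s1 = s1, s0 - q * s1
        t = (1 - s0 * M) // m
        c, M = (c * t * m + b * s0 * M) % (M * m), M * m
    return c, M

print(solve_system([(2, 3), (3, 5), (2, 7)]))  # (23, 105)
```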
Proof: Note that we already proved several things relevant to this in our earlier discussion of Euler's theorem, such as the fact that φ(n) is the cardinality of (Z/n)^×. If we can prove the first formula, then by unique factorization it suffices to prove the second formula for prime powers n = p^e. Recall that Sun-Ze's theorem gives a bijection Z/m × Z/n → Z/mn. In general, the cardinality of a cartesian product A × B of sets (meaning the set of ordered pairs (a, b) with a ∈ A and b ∈ B) is the product of the cardinalities of A and B. By now we know that (Z/t)^× exactly consists of x modulo t with gcd(x, t) = 1. Combining these facts proves the first formula of the theorem.
Next, we prove the second formula for prime powers n = p^e with e a positive integer. In this special case, gcd(ℓ, p^e) > 1 if and only if p divides ℓ. There are p^e/p multiples of p between (inclusive) 1 and p^e, so
φ(p^e) = p^e − p^(e−1) = (p − 1) · p^(e−1)
as claimed.
///
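Both formulas are easy to check against a brute-force count of (Z/n)^× (a sketch):

```python
from math import gcd

def phi(n):
    """Euler's phi by brute force: count 1 <= x <= n with gcd(x, n) = 1."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

print(phi(8), phi(9), phi(72))   # 4 6 24: phi(72) = phi(8) * phi(9)
print(phi(5 ** 3))               # 100 = (5 - 1) * 5^2
```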
Exercises
6.01
6.02
6.03
6.04
Prove that the reduction mod 10 of a positive integer N is simply the ones-place digit of N in decimal expansion. (ans.)
6.05
Prove that the reduction mod 100 of a positive integer N is the two-digit
number made up of the tens-place and ones-place digits of N .
6.06
6.07
6.08
6.09
By brute force, check that among 1,2,...,25 the integers with multiplicative
inverses modulo 26 are the odd integers in that range, excluding 13. Is
there any shortcut here, by cleverness alone, without invoking any fancier
mathematics?
6.10
(*) (This is a little hard to do without using anything further than what we
have already!) Let m be a positive integer. Prove that for all integers x, y
((x%m) + (y%m)) % m = (x + y) % m
and
((x%m) · (y%m)) % m = (x · y) % m
6.11
Find all the divisors of 60. Why are you sure that you have them all? (ans.)
6.12
For all the numbers under 100, note either that they are prime, or factor
them into primes.
6.13
Show directly from the definition of divisibility that if d|m then d|(m).
(ans.)
6.14
Prove directly, from the very definition of divisibility, that if d|x and d|y
then d|(x y) and d|(x + y).
6.15
Observe that 1331 and 14641 cannot be prime, without computation. (ans.)
6.16
6.17
6.18
Find the least common multiple of 2, 4, 8, 16, 32, 64, and 128. (ans.)
6.19
Show that for any integer n if d|n and d|(n + 2) then d|2. (ans.)
6.20
Show that for any integer n the two integers n and n + 1 are relatively
prime.
6.21
6.22
Show that for any integer n, the integers n and n^2 + 1 are relatively prime.
6.23
(*) Show that for any integer n the greatest common divisor of 16n^2 + 8n + 1 and 16n^2 − 8n + 1 is 1.
6.24
Prove that for any two integers m, n, the least common multiple lcm(m, n) exists, and is given by the formula lcm(m, n) = m · n / gcd(m, n). (Caution: do not accidentally assume that the lcm exists to prove the formula.)
6.25
(**) How likely is it that two randomly chosen positive integers will be relatively prime? (Hint: Part of the issue is to make suitable sense of the question. First look in the range 1, . . . , N with N = p1 · · · pt with distinct primes p1, . . . , pt, and take a limit. Second, estimate the inaccuracy in this approach. There remains the question of evaluating
∏_{p prime} ( 1 − 1/p^2 ) = ( Σ_{n=1}^∞ 1/n^2 )^(−1)  .)
6.26
Find a proper factor of 111, 111, 111, 111, 111 without using a calculator.
(ans.)
6.27
Find a proper factor of 101, 010, 101, 010, 101 without using a calculator.
(ans.)
6.28
6.29
6.30
6.31
6.32
6.33
6.34
6.35
Show that
( Σ_i x_i y_i )^2 = ( Σ_i x_i^2 ) · ( Σ_i y_i^2 ) − Σ_{i<j} ( x_i y_j − x_j y_i )^2
6.36
(*) (Euclids proof that there are infinitely-many primes) Suppose there
were only finitely many primes p1 , p2 , . . . , pn . Consider the number N =
p1 . . . pn + 1. Show that none of the pi can divide N . Conclude that there
must be some other prime than those on this list, from which one would
obtain a contradiction.
6.37
Find gcd(1112, 1544) and express it in the form 1112x + 1544y for some
integers x and y by hand computation. (ans.)
6.38
6.39
For an integer n, show that the greatest common divisor of the two integers n^3 + n^2 + n + 1 and n^2 + n + 1 is unavoidably just 1.
6.40
For an integer n, show that the greatest common divisor of the two integers n^3 + n^2 + n + 1 and n^8 + n^7 + n^6 + n^5 + n^4 + n^3 + n^2 + n + 1 is unavoidably just 1.
6.41
Show that the subset {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)} of {1, 2, 3} × {1, 2, 3} is an equivalence relation on the set {1, 2, 3}.
6.42
6.43
6.44
How many equivalence relations are there on the set {1, 2, 3, 4}? (ans.)
6.45
Take two positive integers n and N with n not dividing N . Find an integer
x so that
(x%N)%n ≠ x%n
6.46
6.47
6.48
6.49
Compute and reduce modulo the indicated modulus: 110 · 124 modulo 3, and also 12 + 1234567890 mod 10.
6.50
6.51
6.52
6.53
6.54
Find four distinct residue classes x modulo 15 so that x^2 = 1 mod 15. (ans.)
6.55
Find three distinct residue classes x modulo 105 so that x^2 = 1 mod 105.
(ans.)
6.56
6.57
6.58
6.59
6.60
6.61
6.62
6.63
6.64
6.65
6.66
6.67
By direct computation, check that 2 is not a primitive root modulo 17, but
that 3 is.
6.68
6.69
6.70
6.71
6.72
6.73
6.74
6.75
6.76
6.77
6.78
Let r be a prime, and let p be a prime with r ∤ (p − 1). Show that every element of Z/p has a unique rth root.
6.79
Let r be a prime, and let p be a prime with r | (p − 1) but r^2 ∤ (p − 1). Let s be a multiplicative inverse of r modulo (p − 1)/r. Show that if b is an rth power modulo p then b^s is an rth root of b modulo p. (ans.)
6.80
6.81
6.82
6.83
7
Permutations and Interleavers
7.1 Permutations of sets
7.2 Shuffles
7.3 Block interleavers
The "graph-listing" notation displays a permutation f of the set {1, 2, 3, . . . , n} as

1      2      3     . . .    n
f(1)  f(2)  f(3)   . . .   f(n)

7.1 Permutations of sets
Among the permutations of {1, 2, . . . , n} there is the identity permutation e, which does not move any element of the set. That is, for all i, e(i) = i.
Of course, one permutation may be applied after another. If g, h are two permutations, write
g ∘ h
for the permutation that we get by first applying h and then applying g. This is the
composition or product of the two permutations. It is important to appreciate
that, in general,
g ∘ h ≠ h ∘ g
We'll see examples of this below. But in any case this notation is indeed compatible with the notation for (and the idea of) composition of functions. Thus, for 1 ≤ i ≤ n, by definition
(g ∘ h)(i) = g(h(i))
It is a consequence of the definition of permutations as (bijective) functions from a set to itself that composition of permutations is associative: for all permutations g, h, k of a set,
(g ∘ h) ∘ k = g ∘ (h ∘ k)
Indeed, for any element x of the set, the definition of composition of permutations gives
((g ∘ h) ∘ k)(x) = (g ∘ h)(k(x))      (definition of (g ∘ h) ∘ k, applied to x)
= g(h(k(x)))        (definition of g ∘ h, applied to k(x))
= g((h ∘ k)(x))     (definition of h ∘ k, applied to x)
= (g ∘ (h ∘ k))(x)  (definition of g ∘ (h ∘ k), applied to x)
To compute a composite such as

( 1 2 3 ) ( 1 2 3 )
( 2 3 1 ) ( 3 2 1 )

we see what this composite does to each of 1, 2, 3. The permutation on the right is applied first. It sends 1 to 3, which is sent to 1 by the second permutation (the one on the left). Similarly, 2 is sent to 2 (by the permutation on the right), which is sent to 3 (by the permutation on the left). Similarly, 3 is sent to 1 (by the permutation on the right), which is sent to 2 (by the permutation on the left). Graph-listing this information, we have

( 1 2 3 ) ( 1 2 3 )   ( 1 2 3 )
( 2 3 1 ) ( 3 2 1 ) = ( 1 3 2 )

If we multiply (compose) in the opposite order, we get something different:

( 1 2 3 ) ( 1 2 3 )   ( 1 2 3 )
( 3 2 1 ) ( 2 3 1 ) = ( 2 1 3 )

In cycle notation, for example, the permutation

( 1 2 3 )
( 3 2 1 ) = (1 3)

is a 2-cycle, while

( 1 2 3 )
( 2 3 1 ) = (1 2 3)

is a 3-cycle.
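Writing a permutation of {1, . . . , n} as the tuple (f(1), . . . , f(n)), the computation above can be replayed in code (a sketch, not from the text):

```python
def compose(g, h):
    """(g o h)(i) = g(h(i)): apply h first, then g."""
    return tuple(g[h[i] - 1] for i in range(len(h)))

g = (2, 3, 1)  # the 3-cycle (1 2 3)
h = (3, 2, 1)  # the 2-cycle (1 3)
print(compose(g, h))  # (1, 3, 2)
print(compose(h, g))  # (2, 1, 3): composition is not commutative
```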
The decomposition into disjoint cycles tells how many times a permutation
must be repeated in order to have no net effect: the least common multiple of the
lengths of the disjoint cycles appearing in its decomposition.
The order of a permutation is the number of times it must be applied in order
to have no net effect. (Yes, there is possibility of confusion with other uses of the
word order). Thus,
The order of a k-cycle is k. The order of a product of disjoint cycles is the
least common multiple of the lengths.
We might imagine that permutations with larger orders mix better than permutations with smaller orders, since more repetitions are necessary before the mixing effect is cancelled. In this context, it may be amusing to realize that if a card shuffle is done perfectly, then after some number of repetitions the cards will be returned to their original order! But the number is pretty large with a 52-card deck, and it's not easy to do perfect shuffles anyway.
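The lcm rule gives a direct way to compute orders; here is a sketch (names ours) that extracts the disjoint-cycle lengths of a permutation written as a tuple of images of 1, . . . , n:

```python
from math import gcd

def cycle_lengths(p):
    """Lengths of the disjoint cycles of p (tuple of images of 1..n)."""
    seen, lengths = set(), []
    for start in range(1, len(p) + 1):
        if start not in seen:
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = p[i - 1]
                length += 1
            lengths.append(length)
    return lengths

def order(p):
    """Order of p: lcm of its disjoint-cycle lengths."""
    result = 1
    for length in cycle_lengths(p):
        result = result * length // gcd(result, length)
    return result

# disjoint 3-cycle and 4-cycle on 7 symbols: order lcm(3, 4) = 12
print(order((2, 3, 1, 5, 6, 7, 4)))  # 12
```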
As an example, let's examine all the elements of S7, determining their structure as products of disjoint cycles, counting the number of each kind, and noting their order.
First, let's count the 7-cycles (i1 . . . i7): there are 7 choices for i1, 6 for i2, and so on, but there are 7 different ways to write each 7-cycle, so there are 7!/7 distinct 7-cycles altogether.
Next, 6-cycles (i1 . . . i6): there are 7 choices for i1, 6 for i2, and so on down to 2 choices for i6, but there are 6 different ways to write each 6-cycle, so there are 7!/6 distinct 6-cycles altogether.
Next, 5-cycles (i1 . . . i5): there are 7 choices for i1, 6 for i2, and so on down to 3 choices for i5, but there are 5 different ways to write each 5-cycle, so there are 7!/(2! · 5) distinct 5-cycles altogether.
For variety, let's count the number of permutations writeable as a product of a disjoint 5-cycle and 2-cycle. We just counted that there are 7!/(2! · 5) distinct 5-cycles. But each choice of 5-cycle leaves just one choice for a 2-cycle disjoint from it, so there are again 7!/(2! · 5) distinct products of a disjoint 5-cycle and 2-cycle. And we note that the order of a product of a disjoint 5-cycle and 2-cycle is lcm(2, 5) = 10.
There are 7!/(3! · 4) distinct 4-cycles, by reasoning similar to previous examples.
There are 7!/(3! · 4) · 3!/2 choices of disjoint 4-cycle and 2-cycle. The order of the product of such is lcm(2, 4) = 4.
There are 7!/(3! · 4) · 3!/3 choices of disjoint 4-cycle and 3-cycle. The order of the product of such is lcm(3, 4) = 12.
There are 7!/(4! · 3) distinct 3-cycles, by reasoning similar to previous examples.
There are 7!/(4! · 3) · 4!/(2! · 2) choices of disjoint 3-cycle and 2-cycle. The order of the product of such is lcm(2, 3) = 6.
The number of products of a disjoint 3-cycle, 2-cycle, and 2-cycle is slightly subtler, since the two 2-cycles are indistinguishable. Thus, there are
(7!/(4! · 3)) · (4!/(2! · 2)) · (2!/(0! · 2)) · (1/2!)
where the last division by 2! is to take into account the 2! different orderings of the two 2-cycles, which make only a notational difference, not a difference in the permutation itself. The order of such a permutation is lcm(2, 2, 3) = 6.
The number of disjoint pairs of 3-cycle and 3-cycle is similar: the two 3-cycles are not actually ordered, although our choosing of them gives the appearance that they are ordered. There are
(7!/(4! · 3)) · (4!/(1! · 3)) · (1/2!)
such pairs, where the last division by 2! is to take into account the 2! different orderings of the two 3-cycles, which make only a notational difference, not a difference in the permutation itself. The order of such a permutation is lcm(3, 3, 1) = 3.
There are 7!/(5! · 2) distinct 2-cycles, each of order 2.
There are 7!/(5! · 2) · 5!/(3! · 2) · 1/2! pairs of disjoint 2-cycles, where the last division by 2! is to take into account the possible orderings of the two 2-cycles, which affect the notation but not the permutation itself.
Finally, there are
(7!/(5! · 2)) · (5!/(3! · 2)) · (3!/(1! · 2)) · (1/3!)
triples of disjoint 2-cycles, where the last division by 3! is to account for the possible orderings of the 3 2-cycles, which affect the notation but not the permutation itself. The order of such a permutation is just lcm(2, 2, 2) = 2.
As a by-product of this discussion, we see that the largest order of any permutation of 7 things is 12, which is obtained by taking the product of a disjoint 3-cycle
and 4-cycle.
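As a sanity check, all of these counts can be verified by brute force: enumerate the 7! = 5040 permutations of seven things by machine and classify each by its cycle type. A Python sketch (the helper `cycle_type` is ours, with permutations written 0-indexed):

```python
from itertools import permutations
from math import factorial

def cycle_type(p):
    # cycle lengths of the permutation p of {0,...,n-1}, largest first,
    # with fixed points (1-cycles) omitted
    seen, lens = [False] * len(p), []
    for i in range(len(p)):
        if not seen[i]:
            length, j = 0, i
            while not seen[j]:
                seen[j], j, length = True, p[j], length + 1
            if length > 1:
                lens.append(length)
    return tuple(sorted(lens, reverse=True))

counts = {}
for p in permutations(range(7)):
    t = cycle_type(p)
    counts[t] = counts.get(t, 0) + 1

f = factorial
assert counts[(5,)] == f(7) // (f(2) * 5)      # 504 distinct 5-cycles
assert counts[(5, 2)] == f(7) // (f(2) * 5)    # 504 disjoint 5-cycle, 2-cycle
assert counts[(4, 3)] == 420 and counts[(3, 3)] == 280
assert counts[(3, 2, 2)] == 210 and counts[(2, 2, 2)] == 105
```

Every count claimed above matches the brute-force tally.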
As a more extreme example of the counting issues involved, let's count the
disjoint products of three 2-cycles and three 5-cycles in S24. As above, this is

[24!/(22! · 2)] · [22!/(20! · 2)] · [20!/(18! · 2)] · (1/3!) · [18!/(13! · 5)] · [13!/(8! · 5)] · [8!/(3! · 5)] · (1/3!)
7.2 Shuffles
Overhand shuffles and riffle shuffles of decks of cards, viewed as permutations of
the set of cards in the deck, are amenable to analysis. Some of the conclusions may
be surprising. A mixing procedure identical to a riffle shuffle is used in interleaving
convolutional codes.
The simplest type of overhand shuffle applied to a deck of n cards consists of
choosing a random spot to break the deck in two, and then interchanging the two
parts. For example, with a deck of just 6 cards labeled 0, 1, 2, 3, 4, 5, the deck might
be broken into pieces 0, 1 and 2, 3, 4, 5. Then the two pieces are put back together
as 2, 3, 4, 5, 0, 1. With a deck of n cards, the ith overhand shuffle fi is defined as
being the permutation that has the effect
0, 1, 2, 3, . . . , n−2, n−1  →  i, 1+i, . . . , n−2, n−1, 0, 1, 2, 3, . . . , i−1
meaning that 0 is sent to the ith position, and so on. In the
graph-listing notation above, starting the indexing with 0 rather than 1, this is

0   1     2     . . .   n−1
i   i+1   i+2   . . .   (n−1+i)%n
That is, in terms of reduction of integers modulo n, as a function

f_i : Z/n → Z/n

this shuffle is

f_i(x) = (x + i)%n

where y%n denotes the reduction of y modulo n. That is, an overhand shuffle on a
deck of n cards simply amounts to adding modulo n. In particular,

f_j(f_i(x)) = f_{i+j}(x)
That is, the effect of two overhand shuffles is identical to that of a single overhand
shuffle. In particular, in that regard overhand shuffles are not very thorough mixers,
since you can overhand shuffle a deck all day long and have no more effect than
just doing a single overhand shuffle.
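In code, the fact that two overhand shuffles compose to a single overhand shuffle is just arithmetic mod n. A Python sketch (the names are ours):

```python
n = 6  # a deck of 6 cards labeled 0..5

def overhand(i):
    # the i-th overhand shuffle: break off the top i cards and move
    # them to the bottom, i.e. the card labeled x goes to (x + i) % n
    return lambda x: (x + i) % n

f2, f3 = overhand(2), overhand(3)
# two overhand shuffles have exactly the effect of a single one
assert all(f3(f2(x)) == overhand((2 + 3) % n)(x) for x in range(n))
```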
It turns out that riffle shuffles are best described by labeling the cards starting
from 1, rather than starting from 0 as in the case of the simplest overhand shuffle.
A good riffle shuffle of a deck of 2n cards consists of breaking the deck into two
equal pieces
1, 2, 3, . . . , n    and    n+1, n+2, . . . , 2n−1, 2n
and then interleaving the cards from one half with the cards from the other as
n+1, 1, n+2, 2, n+3, 3, . . . , 2n−1, n−1, 2n, n    (good riffle)
Note that the top and bottom cards do not stay in their original positions. There
is a bad riffle shuffle, which may be useful in various card tricks, in which the top
and bottom cards stay in the same position: the interleaving in the bad case is
1, n+1, 2, n+2, 3, n+3, . . . , n−1, 2n−1, n, 2n    (bad riffle)
This bad riffle shuffle is the same thing as a good riffle shuffle on the deck of cards
obtained by removing the top and bottom cards from the deck. Also, note that
there is really just one riffle shuffle, unlike the overhand shuffles where there is a
parameter.
Proposition: The good riffle shuffle on a deck of 2n cards 1, 2, . . . , 2n is the
function

f(x) = (2x)%(2n + 1)
That is, a good riffle shuffle is multiplication by 2 followed by reduction modulo
2n + 1.
Proof: On one hand, if 1 ≤ x ≤ n, then by its definition the riffle shuffle sends
x to the 2xth spot in the deck, because of the interleaving. On the other hand, if
n < x ≤ 2n, write x = n + i. Then by definition of the shuffle x is sent to the
(2i−1)th spot in the deck. We can re-express this as

f(n + i) = 2i − 1 = 2(n + i) − (2n + 1) = 2(n + i)%(2n + 1)

since 2n + 1 < 2(n + i) < 2(2n + 1). This proves that the riffle shuffle is just
multiplication by 2 modulo 2n + 1, as claimed.
///
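The proposition is easy to confirm numerically. A Python sketch for a deck of 2n = 10 cards (the function `riffle` is ours):

```python
def riffle(deck):
    # good riffle shuffle: cut into halves 1..n and n+1..2n, then
    # interleave starting with the bottom half: n+1, 1, n+2, 2, ...
    n = len(deck) // 2
    out = []
    for bottom, top in zip(deck[n:], deck[:n]):
        out += [bottom, top]
    return out

deck = list(range(1, 11))            # 2n = 10, so we reduce mod 11
shuffled = riffle(deck)
assert shuffled == [6, 1, 7, 2, 8, 3, 9, 4, 10, 5]
# card x ends up in spot (2x) % 11, counting spots 1..10
assert all(shuffled[(2 * x) % 11 - 1] == x for x in deck)
```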
Proposition: Let e be the smallest positive integer such that 2^e = 1 mod 2n + 1.
Then e applications of the good riffle shuffle return a deck of 2n cards to its original
order, and no smaller positive number of applications does so.

Proof: The xth card is put into position 2^t · x mod 2n + 1 by t applications of the
riffle shuffle. The equations

2^t · x = x mod 2n + 1

for x = 1, 2, 3, . . . , 2n include as a special case x = 1, which is

2^t = 1 mod 2n + 1

The smallest positive solution is t = e, and then indeed 2^e · x = x mod 2n + 1 for
all x.
///
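The proof reduces the question "how many good riffle shuffles restore the deck?" to computing the multiplicative order of 2 mod 2n + 1. A Python sketch (the function name is ours) confirms the two decks mentioned in exercise 7.15:

```python
def riffle_order(deck_size):
    # multiplicative order of 2 modulo deck_size + 1: the least t with
    # 2^t = 1 mod (2n+1), i.e. the number of good riffle shuffles after
    # which every card is back in its original spot
    modulus, t, power = deck_size + 1, 1, 2
    while power != 1:
        power = (2 * power) % modulus
        t += 1
    return t

assert riffle_order(50) == 8      # 2^8 = 256 = 5*51 + 1
assert riffle_order(52) == 52     # the order of 2 mod 53 is 52
```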
7.3 Block interleavers

An m-by-n block interleaver permutes the mn indices 0, 1, . . . , mn−1 as follows:
write them into an m-by-n array, row by row, from left to right:

0          1          2          . . .   n−1
n          n+1        n+2        . . .   2n−1
. . .
mn−n       mn−n+1     mn−n+2     . . .   mn−1

Then read the numbers out by columns, from left to right, top to bottom:

0, n, 2n, . . . , mn−n, 1, n+1, 2n+1, . . . , mn−n+1, . . . , mn−1
This has the bad feature that 0 and mn−1 are left in the same positions. This
disadvantage is offset by some other positive features and simplicity. Variations on
this idea can certainly avoid these fixed points.
From the physical description of the interleaver, we can get a formula for the
effect of the m-by-n block interleaver: given x, let x = qm + r with 0 ≤ r < m.
Then the interleaver sends

qm + r = x  →  q + rn

Indeed, notice that the row from which x is read out is the integer part of x/n and
the column is x%n. Writing into the array reverses the roles of column and row,
and interchanges the roles of n and m.
For example, the 3-by-4 block interleaver is computed by creating the array

0   1   2    3
4   5   6    7
8   9   10   11
and then reading the entries out by columns, giving, in the two-row notation
above, the permutation

0   1   2   3   4   5   6   7   8    9   10   11
0   4   8   1   5   9   2   6   10   3   7    11
Proposition: Ignoring the obvious fixed point mn − 1, the m-by-n block interleaver acts on the set

{0, 1, 2, 3, . . . , mn − 2}

by multiplication by n followed by reduction modulo mn − 1. That is,

x → (nx)%(mn − 1)
///
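A Python sketch of the m-by-n block interleaver (names ours) confirms the proposition in the 3-by-4 case:

```python
def block_interleave(m, n):
    # write 0 .. m*n-1 into an m-by-n array row by row, then read
    # the array out by columns, left to right, top to bottom
    array = [[r * n + c for c in range(n)] for r in range(m)]
    return [array[r][c] for c in range(n) for r in range(m)]

out = block_interleave(3, 4)
assert out == [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]
# apart from the fixed point mn - 1 = 11, this is multiplication
# by n = 4 followed by reduction mod mn - 1 = 11
assert all(out[x] == (4 * x) % 11 for x in range(11))
assert out[11] == 11
```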
Exercises
7.01
Express the following permutation as a product of disjoint cycles and determine its order
1 2 3 4 5
2 5 4 3 1
(ans.)
7.02
Express the following permutation as a product of disjoint cycles and determine its order
1 2 3 4 5 6 7
2 5 4 7 1 3 6
7.03
Express the following permutation as a product of disjoint cycles and determine its order
1 2 3 4 5 6 7
2 3 4 7 1 5 6
(ans.)
7.04
Express the following permutation as a product of disjoint cycles and determine its order
1 2 3 4 5 6 7 8 9
2 3 4 8 9 7 1 5 6
7.05
[two permutations of seven things in two-row notation; the tables are garbled in this copy]
(ans.)
7.06
7.07
How many distinct 3-cycles are there in the symmetric group S5 of permutations of 5 things? (ans.)
7.08
How many distinct 3-cycles are there in the symmetric group S6 of permutations of 6 things?
7.09
7.10
7.11
7.12
7.13
7.14
7.15
Recall that a good riffle shuffle of a deck of 2n cards consists of breaking the
deck into two equal pieces

1, 2, 3, . . . , n    and    n+1, n+2, . . . , 2n−1, 2n

and then interleaving the cards from one half with the cards from the other
as

n+1, 1, n+2, 2, n+3, 3, . . . , 2n−1, n−1, 2n, n
(The top and bottom cards do not stay in their original positions.) Show
that if a good riffle shuffle on a deck of 50 cards is executed just 8 times in a
row, then all cards return to their original positions. Show that if a good
riffle shuffle on a deck of 52 cards is executed repeatedly, no card returns to
its original position until the riffle shuffle has been executed 52 times.
7.16
7.17
7.18
7.19
7.20
7.21
7.22
7.23
7.24
7.25
7.26
7.27
8
Groups

8.1  Groups
8.2  Subgroups
8.3  Lagrange's Theorem
8.4  Index of a subgroup
8.5  Laws of exponents
8.6  Cyclic subgroups, orders, exponents
8.7  Euler's Theorem
8.8  Exponents of groups
8.9  Group homomorphisms
8.10 Finite cyclic groups
8.11 Roots, powers
Here we encounter the first instance of abstract algebra rather than the tangible
algebra studied in high school. One way to think of the point of this is that it is
an attempt to study the structure of things directly, without reference to irrelevant
particular details.
This also achieves amazing efficiency (in the long run, anyway), since it turns
out that the same underlying structures occur over and over again in mathematics. Thus, a careful study of these basic structures is amply repaid by allowing
a much simpler and more unified mental picture of otherwise seemingly different
phenomena.
8.1 Groups
The simplest (but maybe not most immediately intuitive) object in abstract algebra is a group. This idea is pervasive in modern mathematics. Many seemingly
complicated or elusive issues seem to be merely secret manifestations of facts about
groups. This is especially true in number theory, where it is possible to give elementary proofs of many results, but only at the cost of having everything be
complicated and so messy that it can't be remembered.
For example, the collection of invertible 2-by-2 matrices (with rational entries,
say), with matrix multiplication as the operation, forms a group, whose identity
element is the identity matrix

1  0
0  1
The existence of inverses is just part of the definition. The fact that matrix
multiplication is associative is not obvious from the definition, but this can
either be checked by hand or inferred from higher principles, namely that
the composition of functions is associative. The fact that the product of two
invertible matrices is invertible is interesting: suppose that g, h both have
inverses, g^-1 and h^-1, respectively. Then you can check that h^-1 g^-1 is an
inverse of gh. This group is certainly not abelian.
Remark: Indeed, in general the best proof of associativity is by finding an interpretation as functions among sets, and invoking the associativity of functions.
Permutations of a set, that is, bijective functions from the set to itself, form a
group, with operation being composition (as functions) of permutations. The
do-nothing permutation (the function which sends every element to itself) is
the identity. The associativity follows because permutations are functions.
If there are more than two things in the set, these permutation groups are
certainly non-abelian.
8.2 Subgroups
Subgroups are subsets of groups which are groups in their own right.
A subset H of a group G is said to be a subgroup if, with the same operation
as that used in G, it is a group.
That is, if H contains the identity element e ∈ G, if H contains inverses of all
elements in it, and if H contains products of any two elements in it, then H is a
subgroup. (The associativity of the operation is assured since the operation was
assumed associative for G itself to be a group.)
Another paraphrase: if e ∈ H, and if for all h ∈ H the inverse h^-1 is also in
H, and if for all h_1, h_2 ∈ H the product h_1 h_2 is again in H, then H is a subgroup
of G.

Another cute paraphrase is: if e ∈ H, and if for all h_1, h_2 ∈ H the product
h_1 h_2^-1 is again in H, then H is a subgroup of G. (If we take h_1 = e, then the latter
condition assures the existence of inverses! And so on.)
In any case, one usually says that H is closed under inverses and closed
under the group operation. (These two conditions are independent of each
other.)
For example, the collection of all even integers is a subgroup of the additive
group of integers. More generally, for fixed integer m, the collection H of all
multiples of m is a subgroup of the additive group of integers. To check this: first,
0 = 0 · m is a multiple of m; second, the negative of a multiple km of m is (−k)m,
again a multiple of m; and, third, the sum km + ℓm = (k + ℓ)m of two multiples
of m is again a multiple of m.
Proof: First, we will prove that the collection of all left cosets of H is a partition
of G, meaning that every element of G lies in some left coset of H, and if two left
cosets xH and yH have non-empty intersection then actually xH = yH. (Note
that this need not imply x = y.)
Certainly x = x · e ∈ xH, so every element of G lies in a left coset of H.
Now suppose that xH ∩ yH ≠ ∅ for x, y ∈ G. Then for some h_1, h_2 ∈ H we
have x h_1 = y h_2. Multiply both sides of this equality on the right by h_2^-1 to obtain

(x h_1) h_2^-1 = (y h_2) h_2^-1 = y (h_2 h_2^-1) = y · e = y

(by associativity and the properties of inverses and of the identity e).
Let z = h_1 h_2^-1 for brevity. By associativity in G,

y = (x h_1) h_2^-1 = x (h_1 h_2^-1) = x z
Since H is a subgroup, z ∈ H. Then
yH = {y h : h ∈ H} = {(x z) h : h ∈ H} = {x (z h) : h ∈ H}

On one hand, since H is closed under multiplication, for each h ∈ H the product
z h is in H. Therefore,

yH = {x (z h) : h ∈ H} ⊂ {x h′ : h′ ∈ H} = xH
Thus, yH ⊂ xH. But the relationship between x and y is completely symmetrical,
so also xH ⊂ yH. Therefore xH = yH. (In other words, we have shown that the
left cosets of H in G really do partition G.)
Next, we will show that the cardinalities of the left cosets of H are all the
same. To do this, we show that there is a bijection from H to xH for any x G.
In particular, define

f(g) = x g

(It is clear that this really does map H to xH.) Second, we prove injectivity: if
f(g) = f(g′), then

x g = x g′

Left multiplying by x^-1 gives

x^-1 (x g) = x^-1 (x g′)

Using associativity gives

(x^-1 x) g = (x^-1 x) g′

Using the property x^-1 x = e of the inverse x^-1 gives

e g = e g′
Since e g = g and e g′ = g′, by the defining property of the identity e, this is

g = g′
which is the desired injectivity. For surjectivity, we simply note that by its very
definition the function f was arranged so that
f (h) = xh
Thus, any element in xH is of the form f (h) for an element h of H. Thus, we have
that f is bijective, and all left cosets of H have the same number of elements as
does H itself.
So G is the union of all the different left cosets of H (no two of which overlap).
Let i be the number of different cosets of H. We just showed that every left coset
of H has |H| elements. Then we can count the number of elements in G as
|G| = sum of cardinalities of cosets = i · |H|
Both sides of this equation are integers, so |H| divides |G|, as claimed.
///
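The partition-into-equal-cosets argument can be watched concretely. A Python sketch with G = Z/12 (with addition) and H the subgroup of multiples of 3:

```python
G = set(range(12))                       # Z/12 with addition
H = {0, 3, 6, 9}                         # subgroup of multiples of 3
cosets = {frozenset((x + h) % 12 for h in H) for x in G}
# the cosets partition G into pieces of size |H|, so |H| divides |G|
assert all(len(c) == len(H) for c in cosets)
assert len(cosets) * len(H) == len(G)    # 3 cosets of size 4
```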
Proof: We repeatedly use the previous corollary of Lagrange's theorem, and the
fact that I is a subgroup of H as well as a subgroup of G. Thus, on one hand
|G| = [G : H] · |H| = [G : H] · ([H : I] · |I|)

On the other hand

|G| = [G : I] · |I|

Thus, equating these two expressions for |G| gives

[G : I] · |I| = [G : H] · ([H : I] · |I|)
Canceling the order of I gives the asserted result.
///
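The multiplicativity of indices can be checked on a small chain of subgroups, say Z/12 ⊃ (multiples of 2) ⊃ (multiples of 4), in Python (names ours):

```python
G = set(range(12))                      # Z/12 with addition
H = set(range(0, 12, 2))                # multiples of 2: order 6
I = set(range(0, 12, 4))                # multiples of 4: order 3

def index(A, B):
    # for finite groups, [A : B] = |A| / |B| by Lagrange's theorem
    return len(A) // len(B)

assert index(G, I) == index(G, H) * index(H, I)   # 4 = 2 * 2
```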
8.5 Laws of exponents

For g in a group G, define powers g^n for integers n by

g^n = g · g · . . . · g   (n factors, for n ≥ 0)
g^n = g^-1 · g^-1 · . . . · g^-1   (|n| factors, for n ≤ 0)

(so g^0 = e and g^1 = g). The Laws of Exponents assert that for all integers m, n

g^(m+n) = g^m · g^n
g^(mn) = (g^m)^n
Proof: For m, n ≥ 0,

g^m · g^n = (g · . . . · g) · (g · . . . · g) = g · . . . · g = g^(m+n)

with m factors in the first group, n in the second, and m + n altogether, and

(g^m)^n = (g^m) · . . . · (g^m) = g · . . . · g = g^(mn)

with n factors g^m on the left and mn factors g on the right. The cases of negative
exponents are similar.  ///
8.6 Cyclic subgroups, orders, exponents
Proof: The associativity is inherited from G. The closure under the group operation and the closure under taking inverses both follow immediately from the Laws
of Exponents, as follows. First, the inverse of g^n is just g^-n, since

g^n · g^-n = g^(n+(-n)) = g^0 = e

And closure under multiplication is

g^m · g^n = g^(m+n)
///
g^i = g^j    if and only if    i = j mod n
Proof: The last assertion easily implies the first two, so we'll just prove the last
assertion. On one hand, if i = j mod n, then write i = j + ℓn and compute (using
Laws of Exponents):

g^i = g^(j+ℓn) = g^j · (g^n)^ℓ = g^j · e^ℓ = g^j · e = g^j
Proof: We just proved that |g| = |⟨g⟩|. By Lagrange's theorem, |⟨g⟩| divides |G|,
which yields this corollary.
///
Proof: The set (Z/n)^× of integers-mod-n which are relatively prime to n has φ(n)
elements. By Lagrange's theorem and its corollaries just above, this implies that
the order k of g ∈ (Z/n)^× divides φ(n). Therefore, φ(n)/k is an integer, and

g^φ(n) = (g^k)^(φ(n)/k) = e^(φ(n)/k) = e

Applied to x-mod-n this is the desired result.
///
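Euler's theorem is easy to test by machine. A Python sketch (the helper `phi`, computing the Euler totient by brute force, is ours):

```python
from math import gcd

def phi(n):
    # Euler totient: the count of 1 <= k <= n with gcd(k, n) = 1
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

n = 36
units = [x for x in range(1, n) if gcd(x, n) == 1]
assert len(units) == phi(n) == 12
# Euler's theorem: x^phi(n) = 1 mod n for x relatively prime to n
assert all(pow(x, phi(n), n) == 1 for x in units)
```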
Remark: This approach also gives another proof of Fermat's theorem, dealing
with the case where n is prime, without mention of binomial coefficients.
Further, keeping track of what went into the proof of Euler's theorem in the
first place, we have
Corollary: For x relatively prime to n, the order of x mod n is a divisor of φ(n).
That is, the order of x in the multiplicative group (Z/n)^× is a divisor of φ(n).

Proof: The proof is really the same: the order of x is equal to the order of the
subgroup ⟨x⟩, which by Lagrange's theorem is a divisor of the order of the whole
group (Z/n)^×.
///
Proof: If g^k = e, then we know from the discussion of cyclic subgroups above that |g|
divides k. And, on the other hand, if k = m · |g| then

g^k = g^(m·|g|) = (g^|g|)^m = e^m = e
Since G is finite, every element g of it is of finite order. Indeed, the list g^1, g^2, . . .
can contain at most |G| distinct elements, so for some i < j it must be that g^i = g^j.
Then g^(j−i) = e, and we conclude that the order of g is at most j − i. And, since
there are only finitely-many elements in G, the least common multiple M of their
orders exists. From what we've just seen, surely g^M = e for any g. Thus, G does
have an exponent. And if g^k = e for all g ∈ G then k is divisible by the orders of all
elements of G, so is divisible by their least common multiple. Thus, the exponent
of G really is the least common multiple of the orders of its elements.
///
Remark: The principle that a choice of N things from among n (with replacement) must result in duplication when n < N is the Pigeon-Hole Principle.
And Lagrange's theorem gives a limitation on what we can expect the exponent
to be:

Corollary: (of Lagrange's theorem) Let G be a finite group. Then the exponent
of G divides the order |G| of G.

Proof: From the proposition, the exponent is the least common multiple of the
orders of the elements of G. From Lagrange's theorem, each such order is a divisor
of |G|. The least common multiple of any collection of divisors of a fixed number
is certainly a divisor of that number.
///
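For a small example, the exponent of the group (Z/15)^× can be computed directly as the least common multiple of the orders of its elements (a Python sketch, names ours; `math.lcm` needs Python 3.9 or later):

```python
from math import gcd, lcm

n = 15
units = [x for x in range(1, n) if gcd(x, n) == 1]   # (Z/15)^x, order 8

def order(g):
    # least t >= 1 with g^t = 1 mod n
    t, p = 1, g
    while p != 1:
        p = (p * g) % n
        t += 1
    return t

exponent = lcm(*[order(g) for g in units])
assert exponent == 4                        # every unit satisfies g^4 = 1
assert len(units) % exponent == 0           # the exponent divides |G| = 8
assert all(pow(g, exponent, n) == 1 for g in units)
```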
Proof: The image f(e_G) under f of the identity e_G in G has the property

f(e_G) = f(e_G · e_G) = f(e_G) · f(e_G)

using the property of the identity in G and the group homomorphism property.
Left multiplying by f(e_G)^-1 (whatever this may be!), we get

f(e_G)^-1 · f(e_G) = f(e_G)^-1 · (f(e_G) · f(e_G))

Simplifying and rearranging a bit, this is

e_H = (f(e_G)^-1 · f(e_G)) · f(e_G) = e_H · f(e_G) = f(e_G)

This proves that the identity in G is mapped to the identity in H.
To check that the image of an inverse is the inverse of the image, we simply
compute

f(g^-1) · f(g) = f(g^-1 · g) = f(e_G) = e_H

so f(g^-1) = f(g)^-1.
Remark: At least from a theoretical viewpoint, two groups that are isomorphic
are considered to be the same, in the sense that any intrinsic group-theoretic
assertion about one is also true of the other. In practical terms, however, the
transfer of structure via the isomorphism may be difficult to compute. That is,
knowing that two groups are isomorphic, and knowing the isomorphism
explicitly, may be two quite different things.
Remark: Some aspects of this can be paraphrased nicely in words: for example,
Every subgroup of a finite cyclic group is again a finite cyclic group, with order
dividing the order of the group. Conversely, for every divisor of the order of the
group, there is a unique subgroup of that order.
8.10 Finite cyclic groups
Proof: Let's prove that the order of g^k is N/gcd(k, N). First, if (g^k)^ℓ = e = g^0,
then kℓ = 0 mod N, from the simpler facts recalled above. That is, N | kℓ. That
is, there is an integer m so that kℓ = mN. Then divide both sides of this equality
by gcd(k, N), obtaining

(k/gcd(k, N)) · ℓ = m · (N/gcd(k, N))

Since now N/gcd(k, N) and k/gcd(k, N) are relatively prime, by unique factorization we conclude that

N/gcd(k, N) divides ℓ
Therefore, the actual order of g^k is a multiple of N/gcd(k, N). On the other hand,

(g^k)^(N/gcd(k,N)) = (g^N)^(k/gcd(k,N)) = e^(k/gcd(k,N)) = e

Note that we use the fact that N/gcd(k, N) and k/gcd(k, N) are both integers, so
that all the expressions here have genuine content and sense. This finishes the proof
that the order of g^k is N/gcd(k, N).

As a special case of the preceding, if k | N then the order of g^k is N/gcd(k, N) =
N/k, as claimed above.

Since we know by now that |⟨h⟩| = |h| for any h, certainly

|⟨g^k⟩| = |g^k| = N/gcd(k, N)
Given an integer k, let's show that

⟨g^k⟩ = ⟨g^gcd(k,N)⟩

Let d = gcd(k, N), and let s, t be integers so that

d = sk + tN

Then

g^d = g^(sk+tN) = (g^k)^s · (g^N)^t = (g^k)^s · e^t = (g^k)^s · e = (g^k)^s

so g^d ∈ ⟨g^k⟩. On the other hand,

g^k = (g^d)^(k/d)

since d | k. Thus, g^k ∈ ⟨g^d⟩. Therefore, since the subgroups ⟨g^k⟩ and ⟨g^d⟩ are closed
under multiplication and under inverses, for any integer ℓ

(g^k)^ℓ ∈ ⟨g^d⟩    and    (g^d)^ℓ ∈ ⟨g^k⟩
But ⟨g^d⟩ is just the set of all integer powers of g^d (and similarly for g^k), so we have
shown that

⟨g^d⟩ ⊂ ⟨g^k⟩

and vice versa, so we find at last that

⟨g^d⟩ = ⟨g^k⟩
Therefore, all the cyclic subgroups of ⟨g⟩ = G are of the form ⟨g^d⟩ for some
positive d dividing N = |G| = |g|. And different divisors d give different subgroups.
This proves the uniqueness.
Let H be an arbitrary subgroup of G. We must show that H is generated by
some g^k (so is in fact cyclic). Let k be the smallest positive integer so that g^k ∈ H.
We claim that ⟨g^k⟩ = H. For any other g^m ∈ H, we can write

m = qk + r

with 0 ≤ r < k. Then

g^r = g^(m−qk) = g^m · (g^k)^(−q) ∈ H

since H is a subgroup. Since k was the smallest positive integer so that g^k ∈ H,
and 0 ≤ r < k, it must be that r = 0. Therefore, m is a multiple of k, and g^k
generates H.
As another particular case, notice that ⟨g^k⟩ = ⟨g⟩ if and only if gcd(k, N) = 1.
And we may as well only consider 0 < k < N, since otherwise we start repeating
elements. That is, the distinct generators of ⟨g⟩ are the elements g^k with 0 < k < N
and gcd(k, N) = 1. So there certainly are φ(N) of them.
Likewise, since

|g^k| = |⟨g^k⟩| = |⟨g^gcd(k,N)⟩| = |g^gcd(k,N)|

it is not hard to count the number of elements of a given order in ⟨g⟩.
///

A homomorphic image of a finite cyclic group is finite cyclic.

Proof: This follows by checking that the image of a generator is a generator for
the image.  ///
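The formula |g^k| = N/gcd(k, N) and the count of generators can be checked in the additive model Z/12, where the generator g corresponds to 1 and g^k corresponds to k (a Python sketch, names ours):

```python
from math import gcd

N = 12  # the cyclic group Z/12 with addition, generated by 1

def additive_order(k):
    # least t >= 1 with t*k = 0 mod N
    t = 1
    while (t * k) % N != 0:
        t += 1
    return t

# the order of g^k (here: of k in Z/12) is N / gcd(k, N)
assert all(additive_order(k) == N // gcd(k, N) for k in range(N))
# the generators are exactly the k with gcd(k, N) = 1: phi(12) = 4 of them
assert [k for k in range(1, N) if additive_order(k) == N] == [1, 5, 7, 11]
```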
A finite cyclic group of order N is isomorphic to Z/N with addition. Specifically, for any choice of generator g of the cyclic group G, the map

f : n → g^n

describes an isomorphism f : Z/N → G.
turns out that everything is ok, because we've already shown (in discussion of cyclic
subgroups) that g^m = g^n if and only if m = n mod N.
For emphasis, we'll write the group operation in the cyclic group G as ∗ rather
than as multiplication or addition. The crucial property which must be demonstrated is the homomorphism property

f(m + n) = f(m) ∗ f(n)

Indeed,

f(m + n) = f((m + n)%N) = g^((m+n)%N) = g^(m+n)

since we proved (in the discussion of cyclic subgroups) that g^i = g^j whenever
i = j mod N. And then this is

= g^m ∗ g^n = f(m) ∗ f(n)

as desired.
To see that f is injective, suppose that f(m) = f(n) for integers m, n. Then
g^m = g^n. Again, this implies that m = n mod N, which says that m-mod-N =
n-mod-N, as desired. So f is injective.
The surjectivity is easy: given g^n ∈ ⟨g⟩, f(n) = g^n.
Therefore, the map f is a bijective homomorphism, so by definition is an
isomorphism.
///
8.11 Roots, powers

Proposition: Let G be a finite cyclic group of order n, and let r be an integer.
Then the map f(x) = x^r is a group homomorphism from G to itself, and if
gcd(r, n) = 1 it is an isomorphism.

Proof: Certainly

f(x · y) = (xy)^r = x^r · y^r   (since G is abelian)
= f(x) · f(y)

which shows that f is a homomorphism.
We may as well use the fact that G is isomorphic to Z/n with addition (proven
just above). This allows us to directly use things we know about Z/n and the
relatively simple behavior of addition mod n to prove things about arbitrary finite
cyclic groups. Thus, converting to the additive notation appropriate for Z/n-with-addition, the map f is

f(x) = r · x

We already know that if gcd(r, n) = 1 then there is a multiplicative inverse r^-1 to
r mod n. Thus, the function

g(x) = r^-1 · x

gives an inverse function to f. This proves that f is both surjective and injective,
so is a bijection, and thus an isomorphism.
For arbitrary r, let's look at the solvability of

r · x = y mod n

for given y. Rewritten in more elementary terms, this is

n | (rx − y)

or, for some integer m,

mn = rx − y
Let d = gcd(r, n). Then certainly it is necessary that d | y or this equation is impossible. On the other hand, suppose that d | y. Write y = d y′ with some integer y′.
Then we want to solve

r · x = d y′ mod n

Dividing through by the common divisor d, this congruence is equivalent to

(r/d) · x = y′ mod (n/d)
The removal of the common divisor has made r/d relatively prime to n/d, so there
is a multiplicative inverse (r/d)^-1 to r/d mod n/d, and

x = (r/d)^-1 · y′ mod (n/d)

That is, any integer x meeting this condition is a solution to the original congruence.
Letting x_0 be one such solution, the integers

x_0, x_0 + (n/d), x_0 + 2(n/d), x_0 + 3(n/d), . . . , x_0 + (d−1)(n/d)

are also solutions, and are distinct mod n. That is, we have d distinct solutions
mod n.
The necessary and sufficient condition gcd(r, n) | y for the equation rx =
y mod n to have a solution shows that there are exactly n/gcd(r, n) integers y mod
n which fulfill this condition. That is, there are exactly n/gcd(r, n) rth powers.
The kernel of f is the collection of x so that rx = 0 mod n. Taking out the
common divisor d = gcd(r, n), this is (r/d) · x = 0 mod (n/d), which means
(n/d) | (r/d) · x. Since now r/d and n/d have no common factor, by unique factorization this implies that n/d divides x. Thus, mod n, there are d different solutions
x. That is, the kernel of f has d elements.
///
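The discussion above is effectively an algorithm for solving r · x = y mod n. A Python sketch (the function name is ours; `pow(a, -1, m)` computes a modular inverse and needs Python 3.8 or later):

```python
from math import gcd

def solve_congruence(r, y, n):
    # all solutions x mod n of r*x = y mod n; solvable iff gcd(r,n) | y,
    # in which case there are exactly gcd(r,n) solutions mod n
    d = gcd(r, n)
    if y % d != 0:
        return []
    rd, yd, nd = r // d, y // d, n // d
    x0 = (pow(rd, -1, nd) * yd) % nd     # (r/d)^-1 exists mod n/d
    return [(x0 + k * nd) % n for k in range(d)]

assert solve_congruence(6, 4, 15) == []   # gcd(6,15) = 3 does not divide 4
sols = solve_congruence(6, 9, 15)         # d = 3 solutions mod 15
assert sorted(sols) == [4, 9, 14]
assert all((6 * x) % 15 == 9 for x in sols)
```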
Exercises
8.01
8.02
Prove (by induction) that in any group G, for any elements g, h ∈ G, and for
any integer n

h g^n h^-1 = (h g h^-1)^n
8.03
Make an addition table for Z/4 and a multiplication table for (Z/5)^×.
8.04
8.05
8.06
Show that

(gh)^2 = g^2 h^2

in a group if and only if gh = hg.
8.07
8.08
8.09
8.10
Show that in an abelian group G, for a fixed positive integer n, the set X_n
of elements g of G so that g^n = e is a subgroup of G.
8.11
Find all 5 of the distinct subgroups of the group Z/16 (with addition). (List
each subgroup only once.) (ans.)
8.12
Find all 6 of the distinct subgroups of the group Z/12 (with addition). (List
each subgroup only once.)
8.13
There are 8 subgroups of the group Z/30 (with addition). Find them all. (ans.)
8.14
8.15

Check that the collection of matrices g in GL(2, Q) of the form

g = ( a  0 )
    ( 0  d )

(that is, with lower left and upper right entries 0) is a subgroup of GL(2, Q).

8.16

Check that the collection of matrices g in GL(2, Q) of the form

g = ( a  b )
    ( 0  d )

(that is, with lower left entry 0) is a subgroup of GL(2, Q).

8.17

(Casting out nines) Show that

123456789123456789 + 234567891234567891 ≠ 358025680358025680

(Hint: Look at things modulo 9: if two things are not equal mod 9 then they
certainly aren't equal. And notice the funny general fact that, for example,

1345823416 = 1 + 3 + 4 + 5 + 8 + 2 + 3 + 4 + 1 + 6 mod 9

since 10 = 1 mod 9, and 100 = 1 mod 9, and so on. The assertion is that
a decimal number is congruent to the sum of its digits modulo 9! This is
casting out nines, which allows detection of some errors in arithmetic.)
8.18
8.19
Prove that a group element and its inverse have the same order.
8.20
Without computing, show that in the group Z/100 (with addition) the elements 1, 99 have the same order, as do 11 and 89. (ans.)
8.21
Let g and h be the matrices

g = ( 0  1 )        h = ( 0  1 )
    ( 1  0 )            ( 1  1 )

Compute the product gh, compute (gh)^n for integers n, and then show that
gh is necessarily of infinite order in the group. (ans.)
8.22
Let G be a finite group. Let N be the least common multiple of the orders
of the elements of G. Show that for all g ∈ G we have g^N = e.
8.23
8.24
8.25
Show that any integer i so that 1 ≤ i < 11 is a generator for the additive
group Z/11 of integers modulo 11.
8.26
8.27
8.28
8.29
8.30
Let

det : GL(2, Q) → Q^×

be the usual determinant map

det ( a  b )  =  ad − bc
    ( c  d )

Show by direct computation that det is a group homomorphism.
8.31
Show that for any integer n and positive integer N the map

f : Z/N → Z/N

defined by

f(x) = n · x

is a group homomorphism (with addition mod N).
8.32
Show that for any integer n and positive integer N the map

f : (Z/N)^× → (Z/N)^×

defined by

f(x) = x^n

is a group homomorphism.
8.33
8.34

Show that the map

t  →  ( 1  0 )
      ( t  1 )

is a group homomorphism from Q with addition to GL(2, Q).

8.35

Show that the map

( a  b )  →  a
( 0  d )

is a homomorphism from the group of all matrices ( a b / 0 d ) in which a, d are
non-zero rational numbers and b is any rational number, to the multiplicative
group Q^× of non-zero rational numbers. What is its kernel?
8.36

Show that the map

( a  b )  →  b
( 0  d )

is not a homomorphism.
8.37
8.38

Define E(x) to be the matrix

( 1  x  x^2/2 )
( 0  1  x     )
( 0  0  1     )

Show that E is a group homomorphism from Q with addition to a subgroup
of GL(3, Q).
8.39
8.40
8.41
9
Rings and Fields

9.1 Rings
9.2 Ring homomorphisms
9.3 Fields
9.1 Rings
The idea of ring generalizes the idea of numbers, among other things, so maybe
it is a little more intuitive than the idea of group. A ring R is a set with two
operations, + and , and with a special element 0 (additive identity) with most
of the usual properties we expect or demand of addition and multiplication.
The addition is associative: a + (b + c) = (a + b) + c for all a, b, c R.
The addition is commutative: a + b = b + a for all a, b R.
For every a R there is an additive inverse denoted a, with the property
that a + (a) = 0.
The zero has the property that 0 + a = a + 0 = a for all a R.
The multiplication is associative: a(bc) = (ab)c for all a, b, c R.
The multiplication and addition have left and right distributive properties:
a(b + c) = ab + ac and (b + c)a = ba + ca for all a, b, c R.
When we write this multiplication, just as in high school algebra, very often
we omit the dot and just write

ab = a · b
Very often, a particular ring has some additional special features or properties:
If there is an element 1 in a ring with the property that 1 · a = a = a · 1 for all
a ∈ R, then 1 is said to be the (multiplicative) identity or unit in the
ring, and the ring is said to have an identity or have a unit or be a ring
with unit. And 1 is the unit in the ring. We also demand that 1 ≠ 0 in a
ring.
Remark: Sometimes the word unity is used in place of unit for the special
element 1, but this cannot be relied upon.
If ab = ba for all a, b in a ring R, then the ring is said to be a commutative
ring. That is, a ring is called commutative if and only if the multiplication
is commutative.
Most often, but not always, our rings of interest will have units 1. The
condition of commutativity of multiplication is often met, but, for example, matrix
multiplication is not commutative.
In a ring R with 1, for a given element a ∈ R, if there is a^-1 ∈ R so that
a · a^-1 = 1 and a^-1 · a = 1, then a^-1 is said to be a multiplicative inverse
for a. If a ∈ R has a multiplicative inverse, then a is called a unit in R. The
collection of all units in a ring R is denoted R^× and is called the group of
units in R.
A commutative ring in which every nonzero element is a unit is called a field.
A not-necessarily commutative ring in which every nonzero element is a unit
is called a division ring.
In a ring R, an element r so that r · s = 0 or s · r = 0 for some nonzero s ∈ R
is called a zero divisor. A commutative ring without nonzero zero-divisors
is an integral domain.
A commutative ring R has the cancellation property if, for any r ≠ 0 in
R, if rx = ry for x, y ∈ R, then x = y. Most rings with which we're familiar
have this property.
Remark: There is indeed an inconsistency in the use of the word unit. But that's
the way the word is used. So the unit is 1, while a unit is merely something which
has a multiplicative inverse. Of course, there are no multiplicative inverses unless
there is a unit (meaning that there is a 1). It is almost always possible to tell from
context what is meant.
It is very important to realize that the notations −a for an additive inverse and
a^-1 for a multiplicative inverse are meant to suggest minus a and divide-by-a, but
that at the moment we are not justified in believing any of the usual high school
algebra properties. We have to prove that all the usual things really do still work
in this abstract situation.
If we take a ring R with 0 and with its addition, then we get an abelian group,
called the additive group of R.
The group of units R^× in a ring with unit certainly is a group. Its identity is
the unit 1. This group is abelian if R is commutative.
In somewhat more practical terms: as our examples above show, very often
a group really is just the additive group of a ring, or is the group of units in a
ring. There are many examples where this is not really so, but many fundamental
examples are of this nature.
The integers Z with usual addition and multiplication form a ring. This ring
is certainly commutative and has a multiplicative identity 1. The group of units
Z^× is just {±1}. This ring is an integral domain.
The even integers 2Z with the usual addition and multiplication form a commutative ring without unit. Just as this example suggests, very often the lack of
a unit in a ring is somewhat artificial, because there is a larger ring it sits inside
which does have a unit. There are no units in this ring.
The integers mod m, denoted Z/m, form a commutative ring with identity. As the notation suggests, the group of units really is (Z/m)^× : notice that we used the group-of-units notation in this case before we even introduced the terminology.
Take p to be a prime. The ring of integers mod p, denoted Z/p, is a field, since all positive integers less than p have a multiplicative inverse modulo p for p prime (computable by the Euclidean algorithm!). The group of units really is (Z/p)^×.
The collection of n-by-n real matrices (for fixed n) is a ring, with the usual
matrix addition and multiplication. Except for the silly case n = 1, this ring is
non-commutative. The group of units is the group GL(n, R).
The rational numbers Q, the real numbers R, and the complex numbers C
are all examples of fields, because all their nonzero elements have multiplicative
inverses.
Just as in the beginning of our discussion of groups, there are some things which
we might accidentally take for granted about how rings behave. In general these
presumptions are reasonable, based on all our previous experience with numbers,
etc. But it is certainly better to give the easy little proofs of these things and to
be conscious of what we believe, rather than to be unconscious.
Let R be a ring. We will prove the following fundamental properties:
Uniqueness of additive identity: If there is an element z ∈ R and another r ∈ R so that r + z = r, then z = 0. (Note that we need this condition only for one other r ∈ R, not for all r ∈ R.)
Uniqueness of additive inverses: Fix r ∈ R. If there is r′ ∈ R so that r + r′ = 0, then actually r′ = −r, the additive inverse of r.
Uniqueness of multiplicative identity: Suppose that R has a unit 1. If there is u ∈ R so that for all r ∈ R we have u·r = r, then u = 1. Or, if for all r ∈ R we have r·u = r, then u = 1. Actually, all we need is that either 1·u = 1 or u·1 = 1 to assure that u = 1.
Uniqueness of multiplicative inverses: If r ∈ R has a multiplicative inverse r^{-1}, and if r′ ∈ R is such that r·r′ = 1, then r′ = r^{-1}. Or, assuming instead that r′·r = 1, we still conclude that r′ = r^{-1}.
For r ∈ R, we have −(−r) = r. That is, the additive inverse of the additive inverse of r is just r.
Proof: (of uniqueness of multiplicative inverses) Assume that r ∈ R has a multiplicative inverse r^{-1}, and that r′ ∈ R is such that r·r′ = 1. Then multiply the latter equation by r^{-1} on the left to obtain
r^{-1}·(r·r′) = r^{-1}·1 = r^{-1}
by the property of 1. Using the associativity of multiplication, the left-hand side is
r^{-1}·(r·r′) = (r^{-1}·r)·r′ = 1·r′ = r′
by the property of multiplicative inverses and of the identity. Putting this together, we have r′ = r^{-1} as desired.
///
The proof that −(−r) = r (that is, that the additive inverse of the additive inverse of r is just r) is identical to the argument given for groups that the inverse of the inverse is the original thing.
There are several slogans that we all learned in high school or earlier, such as "minus times minus is plus" and "zero times anything is zero." It may be interesting to see that from the axioms for a ring we can prove those things. (We worried over the so-called laws of exponents already a little earlier.)
These things are a little subtler than the obvious things above, insofar as they
involve the interaction of the multiplication and addition. These little proofs are
good models for how to prove simple general results about rings.
Let R be a ring.
For any r ∈ R, 0·r = r·0 = 0.
Suppose that there is a 1 in R. Let −1 be the additive inverse of 1. Then for any r ∈ R we have (−1)·r = r·(−1) = −r, where as usual −r denotes the additive inverse of r.
Let −x, −y be the additive inverses of x, y ∈ R. Then (−x)·(−y) = x·y.
Proof: Throughout this discussion, keep in mind that to prove that b = −a means just to prove that a + b = 0.
0·r = (0 + 0)·r       (since 0 + 0 = 0)
    = 0·r + 0·r       (distributivity)
9.2 Ring homomorphisms
That is, x + y is again in the kernel. And f(0) = 0, so 0 is in the kernel. And for x in the kernel, f(−x) = −f(x) = −0 = 0, so −x is in the kernel.
///
Remark: Before proving this, note that our experience makes us anticipate the
fact that such maps really are ring homomorphisms: indeed, we know that to
evaluate the product or sum of two polynomials we can evaluate them individually
and then multiply/add, or multiply/add first and then evaluate. This is exactly
the assertion that evaluation is a ring homomorphism.
With
P = Σ_{0≤i≤m} a_i x^i,    Q = Σ_{0≤j≤n} b_j x^j,
we have
e_{r0}(P + Q) = e_{r0}( Σ_j (a_j + b_j) x^j ) = Σ_j (a_j + b_j) r0^j = Σ_j a_j r0^j + Σ_j b_j r0^j = e_{r0}(P) + e_{r0}(Q)
where without harming anything we put a_j = 0 and b_j = 0 for any index outside the range for which the coefficients are defined. This proves that evaluation respects sums. For products:
e_{r0}(P·Q) = e_{r0}( Σ_{i,j} (a_i b_j) x^{i+j} ) = Σ_{i,j} (a_i b_j) r0^{i+j} = ( Σ_i a_i r0^i )·( Σ_j b_j r0^j ) = e_{r0}(P)·e_{r0}(Q)
Proof: First,
f(0_R) + f(0_R) = f(0_R + 0_R)
by the defining property of a group homomorphism. Then
0_R + 0_R = 0_R
(by the property of the additive identity in R), so
f(0_R + 0_R) = f(0_R)
Thus, together, we have
f(0_R) + f(0_R) = f(0_R + 0_R) = f(0_R)
Add the additive inverse −f(0_R) to both sides:
(f(0_R) + f(0_R)) − f(0_R) = f(0_R) − f(0_R) = 0_S
where the last equality uses the definition of additive inverse. Using associativity of addition,
(f(0_R) + f(0_R)) − f(0_R) = f(0_R) + (f(0_R) − f(0_R)) = f(0_R) + 0_S = f(0_R)
where we also use the defining property of 0_S. Putting these together (repeating a little):
f(0_R) = f(0_R) + f(0_R) − f(0_R) = f(0_R + 0_R) − f(0_R) = f(0_R) − f(0_R) = 0_S
as claimed.
///
Remark: Notice that, unlike the discussion about the additive identity, here we need the further hypothesis of surjectivity. Otherwise the assertion is false: see the remark after the proof.
Thus, f (1R ) behaves like the unit in S. By the already proven uniqueness of units,
it must be that f (1R ) = 1S .
///
9.3 Fields
An important subclass of the commutative rings is the class of fields. Many of the familiar types of numbers, such as complex numbers, real numbers, rational numbers, and Z modulo primes, are all fields. But other familiar sets of numbers, such as the integers themselves, are not fields.
A commutative ring R with unit 1 and such that any non-zero element of R
has a multiplicative inverse (in R) is called a field.
The commutative ring of ordinary integers Z is not a field, because non-zero integers other than ±1 do not have multiplicative inverses in the integers (though they have inverses in the larger ring Q).
The commutative ring of rational numbers Q is a field, because every non-zero
rational number a/b (with a and b non-zero integers) has the multiplicative inverse
b/a.
The commutative ring of real numbers R is a field. The commutative ring of
complex numbers C is a field.
The commutative ring Z/p with p prime is a field. To be sure of this, let x ≠ 0 mod p. Then p does not divide x. Always gcd(x, p) is a divisor of p (and of x), and since p is prime and does not divide x we have gcd(x, p) = 1. Therefore, there are integers r and s such that rx + sp = 1. Then rx = 1 mod p, so r is a multiplicative inverse of x in Z/p.
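The Euclidean-algorithm computation mentioned parenthetically above can be made concrete. A minimal Python sketch (the function names ext_gcd and inverse_mod are ours, not the text's):

```python
def ext_gcd(a, b):
    # Extended Euclidean algorithm: returns (g, r, s)
    # with r*a + s*b == g == gcd(a, b).
    if b == 0:
        return (a, 1, 0)
    g, r, s = ext_gcd(b, a % b)
    return (g, s, r - (a // b) * s)

def inverse_mod(x, p):
    # For p prime and x not 0 mod p: from r*x + s*p = 1
    # we get r*x = 1 mod p, so r (reduced mod p) is the inverse of x.
    g, r, s = ext_gcd(x % p, p)
    return r % p
```

For instance, inverse_mod(3, 7) returns 5, since 3·5 = 15 = 1 mod 7.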
If n is a composite integer, then Z/n is not a field. In particular, let d be a proper divisor of n. Then d ≠ 0 mod n, but d has no multiplicative inverse modulo n: if d·e = 1 mod n, then multiplying both sides by n/d would give 0 = (n/d)·d·e = n/d mod n, which is false since 0 < n/d < n.
Exercises
9.01
9.02
9.03
9.04
9.05 Find the group of units in the rings Z/4, Z/5, Z/6. (ans.)
9.06
9.07 Check that the collection 2Z of all even integers is a ring, without unit.
9.08
9.09
9.10
9.11 Let R be the collection of numbers of the form a + bi where a, b ∈ Q and i = √−1. Just to keep in practice, check that R is closed under multiplication and addition. Then, granting that R is a ring (meaning not to worry about associativity, etc.), show that R is a field. (Hint: Remember "rationalizing denominators"?)
9.12
9.13
9.14 … that R is a ring (meaning not to worry about associativity, etc.), show that R is a field. (Hint: Rationalizing denominators.)
9.15 Show that in a ring the equation r + r = r can hold only for r = 0. (ans.)
9.16 Find several examples of nonzero elements x, y in the ring Z/15 whose product is nevertheless 0. (ans.)
9.17 Find several examples of nonzero elements x, y in the ring Z/21 whose product is nevertheless 0.
9.18 Find several examples of nonzero elements x, y in the ring Z/16 whose product is nevertheless 0.
9.19 Show that in the ring Z/n with n a composite (that is, not prime) number, the so-called cancellation law fails: that is, for such n, find (non-zero) elements a, b, c ∈ Z/n so that ca = cb but a ≠ b.
9.20 Fix an integer N > 1. Prove carefully that the map f : Z → Z/N·Z given by f(x) = x + N·Z is a ring homomorphism. (We'd really known this all along.)
9.21
9.22 Let f : R → S be a ring homomorphism (with R, S commutative, for simplicity). Let J be an ideal in S. Show that I = {i ∈ R : f(i) ∈ J} is an ideal in R.
9.23 (*) Show that the only two two-sided ideals in the ring R of 2-by-2 rational matrices are {0} and the whole ring R itself.
10 Polynomials
10.1 Polynomials
10.2 Divisibility
10.3 Factoring and irreducibility
10.4 Euclidean algorithm for polynomials
10.5 Unique factorization of polynomials
10.1 Polynomials
We need to understand polynomials algebraically, as being analogous in many regards to the ordinary integers. Thus, the intuition we have for the integers can be
reused to a great extent in reasoning about polynomials with coefficients in a field.
Let k be a field, which we can think of as being a finite field GF (q) = Fq with
q elements, especially Fp = Z/p for prime p, or also possibly the rational numbers
Q, or real numbers R, or complex numbers C. For indeterminate x, define the
polynomial ring over k in one variable to be
k[x] = {polynomials with coefficients in k}
The ring k[x] is a commutative ring, since it is a ring and polynomial multiplication
is commutative.
We write a polynomial as a sum of constants (from k) times non-negative
integer powers of x:
f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_m x^m
The a_i's are the coefficients of the polynomial. The constant coefficient is a_0. If a_n ≠ 0, then a_n x^n is called the highest-order term and a_n is the highest-order coefficient. We refer to the summand a_i x^i as the degree i term. Also sometimes i is called the order of the summand a_i x^i. The largest index i such that the coefficient a_i is non-zero is the degree of the polynomial. Equivalently, the degree of such a polynomial is the largest exponent i of x so that the ith coefficient
a_i is not 0. Note that just writing the term a_n x^n does not imply that a_n ≠ 0. A polynomial is said to be monic if its (highest-order) coefficient is 1.
Two polynomials in indeterminate x are equal if and only if the coefficients
of respective powers of x are all equal.
Remark: At this point we must distinguish between polynomials and the functions
given by them. In particular, we do not say that two polynomials are equal if they
merely assume the same values for all inputs. While the latter principle is provably
correct in the case that the possible inputs lie in an infinite field, it is definitely
false when the inputs must be in a finite field. The simplest case is the polynomial
f(x) = x^p − x
with coefficients in Z/p, with p a prime. Fermat's Little Theorem tells us that for all inputs x in Z/p this polynomial has value 0. Yet it is not the 0 polynomial.
We have the usual addition and multiplication of polynomials. Addition is easy
to describe: the ith coefficient of the sum of two polynomials f (x) and g(x) is the
sum of the ith coefficient of f (x) and the ith coefficient of g(x). (This is completely
parallel to vector addition.) Multiplication is somewhat messier, but is reasonable:
the coefficient of x^k in the product of
f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_m x^m
and
g(x) = b_0 + b_1 x + b_2 x^2 + ... + b_n x^n
is the sum of the products a_i b_j over all pairs of indices i, j that satisfy i + j = k. That is,
coefficient of x^k in f·g = Σ_{i+j=k} a_i b_j
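This convolution formula translates directly into code. A short Python sketch, with coefficients taken in Z/p (the representation and names are ours):

```python
def poly_mul(f, g, p):
    # f, g: coefficient lists over Z/p, index i holding the coefficient of x^i.
    # The coefficient of x^k in f*g is the sum of a_i * b_j over i + j = k.
    result = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            result[i + j] = (result[i + j] + a * b) % p
    return result
```

For instance, over F2, poly_mul([1, 1], [1, 1], 2) computes (x + 1)^2 = x^2 + 1, illustrating the "less obvious" arithmetic of finite coefficient fields.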
Remark: To make this true even when one of the two polynomials is the 0 polynomial, the 0 polynomial is by convention given degree −∞.
Proof: The result is clear if either polynomial is the zero polynomial, so suppose that both are non-zero. Let
P(x) = a_m x^m + a_{m−1} x^{m−1} + ... + a_2 x^2 + a_1 x + a_0
Q(x) = b_n x^n + b_{n−1} x^{n−1} + ... + b_2 x^2 + b_1 x + b_0
where the apparent highest-degree coefficients a_m and b_n really are non-zero. Then in the product P·Q the highest-degree term is a_m b_n x^{m+n}, which occurs only in one way, as the product of the highest-degree terms from P and Q, so it has coefficient
Remark: From the latter proof we see that the crucial property is that a ≠ 0 and b ≠ 0 should imply a·b ≠ 0. We know that this is true in Q, R, and C, and we have verified earlier that this is true for k = Z/p for p prime. The latter fact comes from the key lemma that if a prime p divides a product ab, then either p|a or p|b, which in turn is proven from the peculiar characterization of the gcd of a, p as the smallest positive integer of the form sa + tp.
Proposition: (Cancellation property) Let A·P = B·P for some non-zero polynomial P, where all these polynomials have coefficients in a field k. Then A = B.
Remark: Sometimes polynomials are thought of as simply being a kind of function, but that is too naive. Polynomials give rise to functions, but they are more
than just that. It is true that a polynomial
f(x) = c_n x^n + c_{n−1} x^{n−1} + ... + c_1 x + c_0
with coefficients in a field k gives rise to k-valued functions on the field k, writing
as usual
f(a) = c_n a^n + c_{n−1} a^{n−1} + ... + c_1 a + c_0
for a k. That is, as usual, we imagine that the indeterminate x is replaced by
a everywhere (or a is substituted for x). This procedure gives functions from k
to k.
But polynomials themselves have features which may become invisible if we
mistakenly think of them as just being functions. For example, suppose that we
look at the polynomial f(x) = x^3 + x^2 in the polynomial ring F2[x], that is, with coefficients in GF(2) = F2 = Z/2. Then
f(0) = 0^3 + 0^2 = 0 + 0 = 0 ∈ F2
f(1) = 1^3 + 1^2 = 1 + 1 = 0 ∈ F2
That is, the function attached to the polynomial is the 0-function, but the polynomial is visibly not the zero polynomial.
As another example, consider f(x) = x^3 − x as a polynomial with coefficients in Z/3. Once again, f(0), f(1), f(2) are all 0, but the polynomial is certainly not the zero polynomial.
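This can be checked mechanically. A small Python check of the Z/3 example (poly_eval is our own helper; a coefficient list has index i = coefficient of x^i):

```python
def poly_eval(f, a, p):
    # Evaluate the polynomial with coefficient list f at the point a, in Z/p.
    return sum(c * a**i for i, c in enumerate(f)) % p

# x^3 - x over Z/3 is the coefficient list [0, -1 % 3, 0, 1] = [0, 2, 0, 1]
f = [0, 2, 0, 1]
values = [poly_eval(f, a, 3) for a in range(3)]
# values == [0, 0, 0]: the attached function is identically zero,
# yet the coefficient list is not the zero list.
```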
Remark: We did not verify the associativity of addition, associativity of multiplication, distributivity, etc., to really prove that k[x] is a commutative ring. It's not hard to do so just using the definitions above, but it's not very interesting.
10.2 Divisibility
In a polynomial ring k[x] with k a field, there is a division algorithm and (therefore) there will be a Euclidean algorithm nearly identical in form to the analogous
algorithms for the ordinary integers Z.
The division algorithm is just the usual division of one polynomial by another,
with remainder, as we all learned in high school or earlier. It takes just a moment's
reflection to see that the procedure we all learned does not depend upon the nature
of the field that the coefficients are in, and that the degree of the remainder is
indeed less than the degree of the divisor!
Proposition: Let k be a field and M a non-zero polynomial in k[x]. Let H be
any other polynomial in k[x]. Then there are unique polynomials Q (quotient)
and R (remainder) in k[x] so that deg R < deg M and
H = Q·M + R
In this situation use the notation
R = H%M = reduction of H modulo M
in parallel to the usage for integers.
Proof: Let X be the set of polynomials expressible in the form H − S·M for some polynomial S. Let R = H − Q·M be an element of X of minimal degree. We claim that deg R < deg M. If not, let a be the highest-degree coefficient of R, let b be the highest-degree coefficient of M, and define a polynomial
G = (a·b^{-1})·x^{deg R − deg M}
Then the subtraction
R − G·M
exactly removes the highest-order term of R, so
deg(R − G·M) < deg R
But this modified version of R would still be in X, since
R − G·M = (H − Q·M) − G·M = H − (Q + G)·M
By choice of R this is impossible. Therefore, deg R < deg M. This proves existence.
To prove uniqueness, suppose we had
H = Q·M + R = Q′·M + R′
Then subtract to obtain
R − R′ = (Q′ − Q)·M
Since the degree of a product is the sum of the degrees, and since the degrees of R, R′ are less than the degree of M, this is impossible unless Q′ − Q = 0, in which case also R − R′ = 0.
///
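The existence half of this proof is effectively an algorithm: repeatedly kill the top term of R with a multiple of M. A Python sketch over Z/p with p prime (the representation and names are ours):

```python
def poly_divmod(H, M, p):
    # Division with remainder in (Z/p)[x], p prime. Coefficient lists,
    # index i = coefficient of x^i. Returns (Q, R) with H = Q*M + R
    # and deg R < deg M, as in the proposition above.
    M = [c % p for c in M]
    R = [c % p for c in H]
    while M and M[-1] == 0:
        M.pop()
    while R and R[-1] == 0:
        R.pop()
    Q = [0] * max(len(R) - len(M) + 1, 1)
    binv = pow(M[-1], p - 2, p)          # b^{-1}, by Fermat's little theorem
    while len(R) >= len(M):
        shift = len(R) - len(M)
        c = (R[-1] * binv) % p           # the a * b^{-1} of the proof
        Q[shift] = c
        for i, m in enumerate(M):        # R := R - c * x^shift * M
            R[i + shift] = (R[i + shift] - c * m) % p
        while R and R[-1] == 0:          # the top term of R is now gone
            R.pop()
    return Q, R
```

For instance, dividing x^2 + 1 by x + 1 over F2 gives quotient x + 1 and remainder 0, since x^2 + 1 = (x + 1)^2 there.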
A polynomial D divides another polynomial P if there is a polynomial Q so that P = Q·D. Equivalently, P is a multiple of D. We may also say that D is a divisor of P. We use the notation D|P when D divides P. A divisor D of P is a proper divisor of P if
proper divisor of P if
0 < deg D < deg P
A non-zero polynomial is irreducible (prime) if it has no proper divisors.
Remark: Since the division algorithm works for polynomials with coefficients in a field, it is merely a corollary that we have a Euclidean algorithm! If we think about
it, the crucial thing in having the Euclidean algorithm work was that the division
algorithm gave us progressively smaller numbers at each step. (And, indeed, each
step of the Euclidean algorithm is just a division algorithm!)
The greatest common divisor of two polynomials A, B is the monic polynomial g of highest degree dividing both A and B.
Proof: Among the non-negative integer values deg(sf + tg) there is at least one
which is minimal. (We reject any choice of s, t which gives sf + tg = 0, which has
The naive way to compute the greatest common divisor of two polynomials is to factor both of them (as in the following section) and determine all the common factors. However, this is suboptimal. It is better to use the Euclidean algorithm, discussed a little further below.
10.3 Factoring and irreducibility
On the other hand, suppose that F(a) = 0. Use the division algorithm to write
F(x) = Q(x)·(x − a) + R
Since deg R < deg(x − a) = 1, R must be a constant. Evaluate both sides at a:
0 = F(a) = Q(a)·(a − a) + R = Q(a)·0 + R = R
Therefore, R = 0 and so x − a divides F(x).
///
This gives a slightly more economical way to test for linear factors.
There is only one degree 0 polynomial in F2[x], namely the constant 1. The 0 polynomial has degree −∞.
There are just two linear polynomials in F2 [x], namely x and x + 1. Since
every linear polynomial is irreducible, they are irreducible.
For quadratic polynomials, there are 2 choices for the linear coefficient and 2 choices for the constant coefficient, so 2·2 = 4 quadratic polynomials in F2[x].
Testing for irreducibility, here the algebra is easy:
Obviously x^2 = x·x.
Obviously x^2 + x = x·(x + 1).
Less obviously x^2 + 1 = (x + 1)^2. Here use the fact that 2 = 0, so (x + 1)^2 = x^2 + 2x + 1 = x^2 + 0 + 1 = x^2 + 1.
x^2 + x + 1: Now it's a little easier to see whether or not this is 0 when the values 0, 1 are plugged in:
0^2 + 0 + 1 = 1 ≠ 0
1^2 + 1 + 1 = 1 ≠ 0
So x^2 + x + 1 is irreducible in F2[x]. It's the only irreducible quadratic polynomial in F2[x].
For cubic polynomials with coefficients in F2 , there are 2 choices for quadratic
coefficient, 2 for linear coefficient, and 2 for constant, so 8 altogether. If we are
looking only for irreducible ones, we should exclude those with constant coefficient 0, because they'll have value 0 for input 0 (equivalently, they'll have linear factor x). Also, those with an even number of non-zero coefficients will have value 0 for input 1, so will have a linear factor x + 1. Keep in mind that if a cubic is not
irreducible then it has at least one linear factor.
We conclude that a cubic polynomial in F2[x] with constant coefficient 1 and with an odd total number of non-zero coefficients is necessarily irreducible. Thus, the only two irreducible cubics in F2[x] are
x^3 + x^2 + 1
x^3 + x + 1
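This count is easy to confirm by brute force, since a cubic over F2 is reducible exactly when it has a root in F2. A quick Python check (the names are ours):

```python
def is_irreducible_cubic(c0, c1, c2):
    # x^3 + c2*x^2 + c1*x + c0 over F2 is irreducible iff it has
    # no root in F2 (a reducible cubic must have a linear factor).
    f = lambda x: (x**3 + c2 * x**2 + c1 * x + c0) % 2
    return f(0) != 0 and f(1) != 0

irreducible_cubics = [(c0, c1, c2)
                      for c0 in (0, 1) for c1 in (0, 1) for c2 in (0, 1)
                      if is_irreducible_cubic(c0, c1, c2)]
# exactly two survive: x^3 + x^2 + 1 and x^3 + x + 1
```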
Irreducible quartic polynomials in F2[x]: there are 2^4 = 16 choices for cubic, quadratic, linear, and constant coefficients. If the constant term is 0, or if the total
10.4 Euclidean algorithm for polynomials
Remark: Yes, we need to know the context in order to determine what field the
coefficients lie in. There is no way to simply look at the coefficients and know
directly.
To compute the gcd of x^7 + x^6 + x^4 + x^3 + x + 1 and x^5 + x^4 + x + 1 considered
as polynomials in F2[x]:
(x^7 + x^6 + x^4 + x^3 + x + 1) − (x^2)(x^5 + x^4 + x + 1) = x^4 + x^2 + x + 1
(x^5 + x^4 + x + 1) − (x + 1)(x^4 + x^2 + x + 1) = x^3 + x
(x^4 + x^2 + x + 1) − (x)(x^3 + x) = x + 1
(x^3 + x) − (x^2 + x)(x + 1) = 0
Since we have a 0 on the right-hand side, the algorithm terminates. The right-hand
side of the next-to-last line is x + 1, a non-zero constant, so the greatest common
divisor of these two polynomials is x + 1.
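Over F2 the arithmetic is especially simple: subtracting coefficients is XOR. A Python sketch of the algorithm just carried out (coefficient lists with index i = coefficient of x^i; the names are ours):

```python
def poly_mod_F2(H, M):
    # Remainder of H modulo M in F2[x]; M must be non-zero, and its
    # leading (last) coefficient is necessarily 1 over F2.
    R = H[:]
    while R and R[-1] == 0:
        R.pop()
    while len(R) >= len(M):
        shift = len(R) - len(M)
        for i, m in enumerate(M):        # subtract x^shift * M: XOR over F2
            R[i + shift] ^= m
        while R and R[-1] == 0:
            R.pop()
    return R

def poly_gcd_F2(A, B):
    # Euclidean algorithm: each step is one division with remainder.
    while B:
        A, B = B, poly_mod_F2(A, B)
    return A
```

Run on the example above, it finds gcd x + 1, reproducing the hand computation.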
To compute the gcd of x^7 + x^6 + x^4 + x^3 + x + 1 and x^6 + x^4 + x^2 + 1 considered as polynomials in F3[x]:
(x^7 + x^6 + x^4 + x^3 + x + 1) − (x + 1)(x^6 + x^4 + x^2 + 1) = 2x^5 + 2x^2
(x^6 + x^4 + x^2 + 1) − (2x)(2x^5 + 2x^2) = x^4 + 2x^3 + x^2 + 1
(2x^5 + 2x^2) − (2x + 2)(x^4 + 2x^3 + x^2 + 1) = x + 1
(x^4 + 2x^3 + x^2 + 1) − (x^3 + x^2)(x + 1) = 1
(x + 1) − (x + 1)(1) = 0
Since we have a 0 on the right-hand side, the algorithm terminates. The right-hand
side of the next-to-last line is 1, a non-zero constant, so the greatest common divisor
of these two polynomials is 1.
Remark: Notice that in the last two examples the differing interpretation of
where the coefficients are has a big impact on what the greatest common divisor
is!
Compute the greatest common divisor of the two polynomials x^7 + x^5 + x^4 + x^3 + x + 1 and x^6 + x^3 + x^2 + x + 1 (with coefficients in the finite field GF(2) = F2 = Z/2 with just two elements) by the Euclidean algorithm.
(x^7 + x^5 + x^4 + x^3 + x + 1) − (x)(x^6 + x^3 + x^2 + x + 1) = x^5 + x^2 + 1
(x^6 + x^3 + x^2 + x + 1) − (x)(x^5 + x^2 + 1) = x^2 + 1
(x^5 + x^2 + 1) − (x^3 + x + 1)(x^2 + 1) = x
(x^2 + 1) − (x)(x) = 1
(x) − (x)(1) = 0
Thus, since the last non-zero entry on the right-hand side is 1, the gcd of x^7 + x^5 + x^4 + x^3 + x + 1 and x^6 + x^3 + x^2 + x + 1 is 1.
With coefficients in GF(2) = F2, compute the gcd of x^6 + x^5 + x^4 + x^3 + x^2 + 1 and x^5 + x^4 + x^3 + 1.
(x^6 + x^5 + x^4 + x^3 + x^2 + 1) − (x)(x^5 + x^4 + x^3 + 1) = x^3 + x^2 + x + 1
(x^5 + x^4 + x^3 + 1) − (x^2)(x^3 + x^2 + x + 1) = x^2 + 1
(x^3 + x^2 + x + 1) − (x + 1)(x^2 + 1) = 0
Since the last non-zero entry on the right-hand side is x^2 + 1, the greatest common divisor is x^2 + 1.
10.5 Unique factorization of polynomials
Proof: It suffices to prove that if P|AB and P∤A then P|B. Since P∤A, and since P is irreducible, the gcd of P and A is just 1. Therefore, there are s, t ∈ k[x] so that
1 = sA + tP
Then
B = B·1 = B·(sA + tP) = s(AB) + (Bt)P
Since P|AB, surely P divides the right-hand side. Therefore, P|B, as claimed.
Generally, if P divides A_1 ... A_n, rewrite this as (A_1)(A_2 ... A_n). By the first part, either P|A_1 or P|A_2 ... A_n. In the former case we're done. In the latter case, we continue: rewrite A_2 ... A_n = (A_2)(A_3 ... A_n). So either P|A_2 or P|A_3 ... A_n. Continuing (induction!), we find that P divides at least one of the factors A_i. ///
Now we prove the existence of factorizations into irreducibles. Suppose that some polynomial in k[x] did not have a factorization. Then there is an f ∈ k[x] without a factorization and with deg f smallest among all elements lacking a factorization. This f cannot be irreducible, since an irreducible is already its own factorization into irreducibles. So f has a proper factorization f = A·B, meaning that 0 < deg A < deg f and 0 < deg B < deg f. By the minimality of f among polynomials not having factorizations, it must be that both A and B have factorizations into irreducibles. Then a factorization of f into irreducibles is obtained by multiplying together those of A and B. ///
Exercises
10.01 Factor x^3 − x into linear factors in F3[x].
10.02 Factor x^5 − x into linear factors in F5[x]. (ans.)
10.03 Factor x^5 + x + 1 into irreducibles in F2[x], by trial division. (ans.)
10.04 Factor x^5 + x^4 + 1 into irreducibles in F2[x] by trial division.
10.05 Factor x^6 + x^3 + x + 1 into irreducibles in F2[x] by trial division. (ans.)
10.06 Let k[x] be the polynomial ring in one variable x over the field k. What is the group of units k[x]^× (meaning the collection of polynomials that have multiplicative inverses which are also polynomials)?
10.07 Find the greatest common divisor of x^5 + x^4 + x^3 + x^2 + x + 1 and x^4 + x^2 + 1 in the ring Q[x] of polynomials over Q. (ans.)
10.08 Find the greatest common divisor of x^6 + x^3 + 1 and x^2 + x + 1 in the ring k[x] of polynomials over the finite field k = Z/3 with 3 elements.
10.09 Find the greatest common divisor of the two polynomials x^6 + x^4 + x^2 + 1 and x^8 + x^6 + x^4 + x^2 + 1 in the ring k[x] of polynomials over the finite field k = Z/2 with 2 elements.
10.10 Find the greatest common divisor of the two polynomials x^5 + x + 1 and x^5 + x^4 + 1 in the polynomial ring F2[x]. (ans.)
10.11 Find the greatest common divisor of the two polynomials x^5 + x^4 + x^3 + 1 and x^5 + x^2 + x + 1 in F2[x].
10.12 Find the greatest common divisor of x^7 + x^6 + x^5 + x^4 + 1 and x^6 + x^5 + x^4 + x^3 + x^2 + x + 1 in F2[x]. (ans.)
10.13 Find the greatest common divisor of x^5 + x^3 + x^2 + 1 and x^6 + x^5 + x + 1 in F2[x].
11 Finite Fields
11.1 Making fields
11.2 Examples of field extensions
11.3 Addition mod P
11.4 Multiplication mod P
11.5 Multiplicative inverses mod P
Again, while we are certainly accustomed to (and entitled to) think of the fields of rationals, reals, and complex numbers as natural batches of numbers, it is
important to realize that there are many other important and useful fields. Perhaps
unexpectedly, there are many finite fields: For example, for a prime number p, the
quotient Z/p is a field (with p elements).
On the other hand, for example, there is no finite field with 6 or with 10
elements. (Why?)
While it turns out that there are finite fields with, for example, 9 elements,
128 elements, or any prime power number of elements, it requires more preparation
to find them.
The simplest finite fields are the rings Z/p with p prime. For many different
reasons, we want more finite fields than just these. One immediate reason is that for
machine implementation (and for other computational simplifications) it is optimal
to use fields of characteristic 2, that is, in which 1 + 1 = 2 = 0. Among the fields
Z/p only Z/2 satisfies this condition. At the same time, for various reasons we
might want the field to be large. If we restrict our attention to the fields Z/p we can't meet both these conditions simultaneously.
11.1 Making fields
For brevity, write Fq for the finite field with q elements (if it exists!). For a
prime p at least we have one such finite field, namely Z/p = Fp . Again, another
notation often seen is
GF (q) = Fq
Here GF stands for Galois field.
Remark: There is the issue of uniqueness of a finite field with a given number of
elements. It is true that there is essentially at most one such, but this is not easy
to prove. Also, in practice the various possible computational models of the same
underlying abstract object have a great impact, so we will often be more concerned
with the many different models themselves.
Remark: The present discussion continues to be entirely analogous to our discussion of Z/m.
For a polynomial P (not necessarily irreducible), and for two other polynomials
f, g, all with coefficients in Fp , write
f = g mod P
if P divides f g. This is completely analogous to congruences for ordinary
integers. And, continuing with that analogy, define
Fp[x]/P = {congruence classes mod P}
where the congruence class f̄ mod P of a polynomial f is
f̄ = {g ∈ Fp[x] : g = f mod P}
Usually one just writes f rather than f̄.
A polynomial f is reduced mod P if
deg f < deg P
Via the division/reduction algorithm in the polynomial ring Fp [x], every polynomial
in Fp [x] is equal-mod-P to a reduced polynomial mod P : indeed, given f , by
division-with-remainder we obtain polynomials Q and R with deg R < deg P and
so that
f = Q·P + R
That is,
f − R = Q·P
which is to say that f = R mod P .
Proposition: Two polynomials f, g which are reduced mod P are equal modulo
P if and only if they are equal (in Fp [x]).
Proof: Certainly if f and g are equal then they are equal modulo P , whether or
not they are reduced. On the other hand, suppose that f and g are reduced modulo
P and equal modulo P . Then
f − g = Q·P
Proof: From the previous proposition, the set of polynomials f (x) of degree
strictly less than the degree of P is an irredundant set of representatives for Fp [x]/P ,
whether or not P(x) is irreducible. There are p choices (from Fp) for each of the n coefficients of a polynomial of degree strictly less than n, so there are p^n choices altogether, and thus p^n elements in the quotient Fp[x]/P.
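For instance, with deg P = n, the reduced representatives are exactly the coefficient tuples (a_0, ..., a_{n−1}), and enumerating them directly confirms the count p^n. A one-function Python sketch (our own, not from the text):

```python
from itertools import product

def reduced_representatives(p, n):
    # All polynomials of degree < n over F_p, as coefficient tuples:
    # one representative per congruence class mod a degree-n polynomial P.
    return list(product(range(p), repeat=n))
```

So reduced_representatives(2, 3) lists the 8 elements of a field of order 2^3.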
Next, we prove the existence of multiplicative inverses for non-zero elements f in Fp[x]/P. Given f ≠ 0 in Fp[x]/P, we may suppose that 0 ≤ deg f < deg P. Since P does not divide f,
deg gcd(f, P) < deg P
Since P is irreducible, gcd(f, P ) cannot have positive degree, or it would be a proper
factor of P . Thus,
deg gcd(f, P ) = 0
That is, the gcd is a non-zero constant. Since we can adjust gcds to be monic
polynomials by multiplying through by non-zero constants, we have
gcd(f, P ) = 1
Therefore, from above, there are polynomials a, b such that
af + bP = 1
Then
a·f = 1 mod P
giving a multiplicative inverse of f as desired. The other requirements of a field,
namely the associativity, distributivity, and commutativity of addition and multiplication, and so on, follow from the analogous properties for polynomials themselves.
Last, we verify that α = x-mod-P satisfies
P(α) = 0 mod P
(This is easier than one might anticipate.) We will verify that for any polynomial M there is a polynomial N such that
P(x + M·P) = N·P
Actually, we will prove more, namely that for any polynomial h,
h(x + M·P) = h(x) mod P
Indeed, by the Binomial Theorem, for any exponent k
(x + M·P)^k = x^k + Σ_{1≤i≤k} C(k, i) x^{k−i} (M·P)^i
in which every summand after x^k is divisible by P. That is,
(x + M·P)^k = x^k mod P
Adding together suitable constant multiples of powers gives
h(x + M·P) = h(x) mod P
In particular,
P(x + M·P) = P(x) = 0 mod P
That is, any polynomial differing from x by a multiple of P, when used as the input to P, gives 0 modulo P. That is, α = x-mod-P is a root of the equation P(y) = 0 in Fp[x]/P, as claimed.
///
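The conclusion P(α) = 0 can be checked numerically in a small case. Below, elements of F2[x] are packed into integer bitmasks (bit i = coefficient of x^i), a common implementation trick; the names are ours, not the text's. For P = x^2 + x + 1 (bitmask 0b111), substituting α = x-mod-P into P gives 0:

```python
def mulmod_F2(a, b, P):
    # Multiply two elements of F2[x]/P in bitmask form:
    # carry-less multiplication, reducing by P whenever a reaches deg P.
    deg = P.bit_length() - 1
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> deg) & 1:
            a ^= P
    return r

def eval_at_alpha(hbits, P):
    # Evaluate the polynomial with bitmask hbits at alpha = x-mod-P,
    # inside F2[x]/P: XOR together alpha^k for each set bit k of hbits.
    alpha, power, total = 0b10, 0b1, 0
    for k in range(hbits.bit_length()):
        if (hbits >> k) & 1:
            total ^= power
        power = mulmod_F2(power, alpha, P)
    return total
```

Evaluating P itself at α returns 0, for P = x^2 + x + 1 and also for P = x^3 + x + 1 (bitmask 0b1011), as the proposition predicts.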
Usually it is desirable in such a field Fp [x]/P to express anything in reduced
form, since then it is easy to test two things for equality: just compare their
coefficients.
Let k be a field. Another field K containing k is called an extension field of
k, and k is a subfield of K. The degree of the extension K of k is the degree of
the polynomial P used in the construction of K as k[x] modulo P .
Remark: In this situation, thinking of
α = x-mod-P
as existing in its own right now, and being a root of the equation P(x) = 0 mod P, we say that we have adjoined a root of P(x) = 0 to k, and write
k[α] = k[x] mod P
so x-mod-(x^2 + 1) is a √−1.
We also showed (by showing that every element has a unique reduced representative) that any element of the extension is expressible uniquely in the form β = a + bα for a, b ∈ R. Of course we usually would write i for the image of x in that extension field rather than α.
Example: Let's adjoin a square root of 2 to the field Z/5. First, note that there is no a in Z/5 so that a^2 = 2. Thus, the quadratic polynomial x^2 − 2 does not factor in Z/5[x] (since if it did it would have a root in Z/5, which it doesn't). Then Z/5[x] mod x^2 − 2 is a field, inside which we can view Z/5 as sitting. And
x^2 = 2 mod x^2 − 2
11.5 Multiplicative inverses mod P
Because f is not 0 mod P , and because P is irreducible, the gcd of the two is 1, so
such S, T do exist.
For example, to find the multiplicative inverse of x in F2 [x]/(x2 + x + 1), first
do the Euclid Algorithm (which is very quick here)
(x2 + x + 1) (x + 1)(x) = 1
Thus, already we have the desired expression
(x + 1)(x) + (1)(x2 + x + 1) = 1
from which
(x + 1)(x) = 1 mod x2 + x + 1
In other words,
x1 = x + 1 mod x2 + x + 1
To find the multiplicative inverse of x² + x + 1 in F2[x]/(x⁴ + x + 1), first do the Euclidean Algorithm

    (x⁴ + x + 1) − (x² + x)(x² + x + 1) = 1

Thus, already we have the desired expression

    (x² + x)(x² + x + 1) + (1)(x⁴ + x + 1) = 1

from which

    (x² + x)(x² + x + 1) = 1 mod x⁴ + x + 1

In other words,

    (x² + x + 1)⁻¹ = x² + x mod x⁴ + x + 1
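The same computation can be carried out mechanically. The following sketch encodes a polynomial over F2 as an integer bitmask (bit i holding the coefficient of xⁱ) and runs the extended Euclidean algorithm; the encoding and helper names are illustrative, not from the text.

```python
# A sketch of inversion in F2[x]/P via the extended Euclidean algorithm,
# with a polynomial over F2 encoded as an integer bitmask.

def deg(f):
    return f.bit_length() - 1   # degree; -1 for the zero polynomial

def polymul(f, g):
    # multiply two polynomials over F2 (carry-less multiplication)
    out = 0
    while g:
        if g & 1:
            out ^= f
        f <<= 1
        g >>= 1
    return out

def polydivmod(f, g):
    # quotient and remainder of f divided by g in F2[x]
    q = 0
    while f and deg(f) >= deg(g):
        shift = deg(f) - deg(g)
        q ^= 1 << shift
        f ^= g << shift
    return q, f

def inverse(f, P):
    # extended Euclid on (P, f); ends with gcd 1 since P is irreducible
    r0, r1 = P, f
    s0, s1 = 0, 1
    while r1:
        q, r = polydivmod(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, s0 ^ polymul(q, s1)
    return s0

print(bin(inverse(0b10, 0b111)))     # x^(-1) = x + 1 in F2[x]/(x^2+x+1)
print(bin(inverse(0b111, 0b10011)))  # (x^2+x+1)^(-1) = x^2+x mod x^4+x+1
```

Both examples above reproduce the hand computations: the inverse of x is x + 1, and the inverse of x² + x + 1 modulo x⁴ + x + 1 is x² + x.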
Exercises
11.01 In the field K = (Z/2)[x]/(x² + x + 1) let α be the image of x, and compute in reduced form α⁵. (ans.)
11.02 In the field K = (Z/2)[x]/(x² + x + 1) let α be the image of x, and compute in reduced form α⁷.
11.03 In the field K = (Z/2)[x]/(x³ + x + 1) let α be the image of x, and compute in reduced form α⁵. (ans.)
11.04 In the field K = (Z/2)[x]/(x³ + x² + 1) let α be the image of x, and compute in reduced form α⁵.
11.05 In the field K = (Z/2)[x]/(x² + x + 1) let α be the image of x, and compute in reduced form α⁻¹. (ans.)
11.06 In the field K = (Z/2)[x]/(x³ + x + 1) let α be the image of x and compute in reduced form (1 + α + α²)⁻¹. (ans.)
12
Linear Codes

12.1 An ugly example
12.2 A better approach
12.3 An inequality from the other side
12.4 The Hamming binary [7, 4] code
12.5 Some linear algebra
12.6 Row reduction: a review
12.7 Linear codes
12.8 Dual codes, syndrome decoding
It turns out to be hard to actually make good codes, meaning codes that approach the bound indicated by Shannon's Noisy Coding Theorem. (They should also be relatively easy to encode and decode, in addition to their error-correcting facility.)
The class of codes easiest to study is that of linear codes, which does include
some fairly good codes, and has enough structure so that encoding and decoding
are not ridiculously complicated.
There are many standard introductory texts and references for coding theory; we mention only a few: [MacWilliams Sloane 1977], [McEliece 1977], [Pless 1998], [Pretzel 1999], [Roman 1992], [Wells 1999], [Welsh 1988]. All the error-correcting coding material in the sequel is treated in most of these, and the many sorts of codes we do not treat are covered in various of these sources.
12.1 An ugly example

The Hamming distance between two words is defined to be the number of positions at which they differ. Here, the first codeword, 0001, is Hamming distance 3 from the other two codewords, 0110 and 1100, which are Hamming distance 2 from each other.
Suppose that a binary symmetric channel has bit error probability p = 1/10. Using this code over that channel (or really its fourth extension, so that we send 4 bits at a time), what is the probability of an uncorrectible error? We are using minimum distance decoding, so the question means: what is the probability that a codeword will get mangled into a 4-bit word that is closer (in Hamming distance) to some other codeword than to the original codeword?

We'll first compute this in the most obvious but labor-intensive approach. The naive aspect will be that we'll try to get an exact answer, but this exactness will not really be relevant to anything, so is a bit silly. And the more trouble it takes to preserve this needless exactness the sillier it becomes. So we'll do a second computation in which we only get an estimate, rather than striving for an expensive and pointless precision.
Let's make a table of all possible 4-bit words and their Hamming distances from the 3 codewords. Each 4-bit word would be decoded/corrected as the closest codeword to it. The minimum distances are marked with an asterisk.

    word   0001   0110   1100
    0000    1*     2      2
    0001    0*     3      3
    0010    2      1*     3
    0011    1*     2      4
    0100    2      1*     1*    ambiguous decoding
    0101    1*     2      2
    0110    3      0*     2
    0111    2      1*     3
    1000    2      3      1*
    1001    1*     4      2
    1010    3      2*     2*    ambiguous decoding
    1011    2*     3      3
    1100    3      2      0*
    1101    2      3      1*
    1110    4      1*     1*    ambiguous decoding
    1111    3      2*     2*    ambiguous decoding
There are exactly 4 cases where there would be ambiguous decoding, that is,
where the minimum distance of the received word to a codeword is achieved for two
different codewords. These received words cannot be corrected (with certainty) in
any case.
A possibly multi-bit error in a 4-bit word is not correctible if either the received
word is one of those whose smallest distance to a codeword occurs for two different
codewords, or if the received word is closer (or equal) to another codeword than to
the original codeword.
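The table can also be generated mechanically. A short script (with the three codewords as given) finds the same four ambiguously-decoded words:

```python
from itertools import product

codewords = ["0001", "0110", "1100"]

def hamming(a, b):
    # number of positions at which two words differ
    return sum(x != y for x, y in zip(a, b))

ambiguous = []
for y in ("".join(bits) for bits in product("01", repeat=4)):
    dists = [hamming(c, y) for c in codewords]
    if dists.count(min(dists)) > 1:   # minimum achieved by two codewords
        ambiguous.append(y)

print(ambiguous)  # ['0100', '1010', '1110', '1111']
```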
The probability that a codeword gets mangled into a given 4-bit word is completely computable just from knowledge of the number of bit errors that would turn
the codeword into the received word, that is, from the Hamming distance between
the codeword and the received word. With error probability p, the probability of a specific 0-bit error in a 4-bit word is (1 − p)⁴, the probability of a specific 1-bit error is (1 − p)³p, the probability of a specific 2-bit error is (1 − p)²p², of a specific 3-bit error is (1 − p)p³, and of a specific 4-bit error is p⁴. With p = 1/10, these numbers are approximately

    P(no error)             = 0.6561
    P(specific 1-bit error) = 0.0729
    P(specific 2-bit error) = 0.0081
    P(specific 3-bit error) = 0.0009
    P(specific 4-bit error) = 0.0001
And note that there are no binomial coefficients appearing here, since after all it's not just any error that turns a given codeword into a given received word. For example, to turn codeword 0001 into 0111, there must be bit errors at the two middle bit positions, and no other errors.
Now rewrite the table above, writing the probabilities that the 4-bit words will
arise as mangled versions of codewords other than the codewords closest to them.
We also include the cases that the received word is closest to two or more codewords.
That is, we are tabulating the probabilities of various mistakes in decoding:
    word    0001    0110    1100
    0000            .0081   .0081
    0001            .0009   .0009
    0010    .0081           .0009
    0011            .0081   .0001
    0100    .0081   .0729   .0729   ambiguous decoding
    0101            .0081   .0081
    0110    .0009           .0081
    0111    .0081           .0009
    1000    .0081   .0009
    1001            .0001   .0081
    1010    .0009   .0081   .0081   ambiguous decoding
    1011            .0009   .0009
    1100    .0009   .0081
    1101    .0081   .0009
    1110    .0001   .0729   .0729   ambiguous decoding
    1111    .0009   .0081   .0081   ambiguous decoding
Thus, under each codeword, the probabilities listed are that the codeword will
get mangled into the 4-bit word on the left. The omitted cases are where the
codeword gets slightly mangled, but only into a word that is still closer to the
original codeword than to any other codeword.
Since the codewords are sent with equal probabilities, the probability of an uncorrectible (or falsely correctible) received word is

    (1/3)(sum of first column) + (1/3)(sum of second column) + (1/3)(sum of third column)
    = (1/3)(5·0.0081 + 4·0.0009 + 1·0.0001)
      + (1/3)(2·0.0729 + 6·0.0081 + 4·0.0009 + 1·0.0001)
      + (1/3)(2·0.0729 + 6·0.0081 + 4·0.0009 + 1·0.0001)

(We know that the last two subsums are the same, by symmetry.) This is

    (1/3)(4·0.0729 + 17·0.0081 + 12·0.0009 + 3·0.0001) ≈ 0.1468

That is, the probability of an uncorrectible error is ≈ 0.1468, so the probability that an error is correctible (counting the no-error case) is ≈ 0.8532.
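This exact figure can be checked mechanically; the following sketch (assuming equal sending probabilities 1/3, as in the text) reproduces it:

```python
from itertools import product

codewords = ["0001", "0110", "1100"]
p = 1 / 10

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

total = 0.0
for c in codewords:                 # each codeword sent with probability 1/3
    for y in ("".join(bits) for bits in product("01", repeat=4)):
        d = hamming(c, y)
        # decoding fails (or is ambiguous) when some other codeword
        # is at least as close to the received word
        if min(hamming(c2, y) for c2 in codewords if c2 != c) <= d:
            total += (1 / 3) * p**d * (1 - p) ** (4 - d)

print(round(total, 4))  # 0.1468
```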
12.2 A better approach

The rate of this code is

    rate = log₂(number of codewords) / (length of codewords) = log₂(3)/4 ≈ 0.39624
The only reason to use a code with rate much below 1 would be to try to correct errors by adding redundancy. To judge the success of an attempt at error correction, we should make a comparison (for example) to the situation in which we'd use all 4-bit words, and see what the probability of uncorrectible error is in that case. In that case, any error is uncorrectible, so using all 4-bit words as codewords
    P(correctible error) = P(no error) = (1 − 1/10)⁴ ≈ 0.6561
And note that 0110 can be mangled into 0111 or 0010 by 1-bit errors, and that these are still closer to 0110 than to 1100. Likewise, 1100 can get mangled into 1101 or 1000, which are still closer to 1100 than to 0110. For simplicity, we just ignore any other possibilities of correctible errors. Thus, we'll know that the probability of a correctible error is at least the sum of these probabilities.
Computing:

    P(correctible error) ≥ P(no error) + P(0001 sent, any 1-bit error)
        + P(0110 sent, 0111 or 0010 received) + P(1100 sent, 1101 or 1000 received)

    = (9/10)⁴ + (1/3)·4·(9/10)³(1/10) + (1/3)·2·(9/10)³(1/10) + (1/3)·2·(9/10)³(1/10)

    = (9/10)⁴ + (1/3)(4 + 2 + 2)·(9/10)³(1/10)

    ≈ 0.6561 + 0.1944 = 0.8505
That is, the probability that an error will be correctible is at least 0.8505.

First, notice that 0.8505 is much higher than the 0.6561 probability of correctible error (that is, of no error) using all 4-bit words. So we've already proven that this code is significantly better in terms of error correction, even though we don't know exactly the probability of errors being correctible.

Second, we had earlier computed that the exact probability that an error is correctible is 0.8532. The difference is less than 1% of the actual number. In this context, such an error of 1% in computing a probability is completely irrelevant! So we conclude that we could have done the easier second computation and skipped all the work of constructing complete tables of Hamming distances, etc., and all the arithmetic.
12.3 An inequality from the other side
So in the present example let's see whether we can give a simple computational approach to show that our approximation

    P(uncorrectible error) ≤ 0.15

is not needlessly weak. That is, we'll find a relatively simple approach to obtain an inequality of the form

    P(uncorrectible error) ≥ number

and hope that the number on the right-hand side is close to 0.15.

We'll only pay attention to uncorrectible single-bit errors. As a complement to the earlier discussion of correctible single-bit errors, we know 4 uncorrectible single-bit errors:
    0110 sent, 1110 received
    0110 sent, 0100 received
    1100 sent, 1110 received
    1100 sent, 0100 received
These are uncorrectible because they are Hamming distance 1 from both 0110 and 1100. These are disjoint events, so the probability that at least one of them occurs is the sum of the separate probabilities: this probability is

    (1/3)(9/10)³(1/10) + (1/3)(9/10)³(1/10) + (1/3)(9/10)³(1/10) + (1/3)(9/10)³(1/10) ≈ 0.0972
Then we can say

    P(uncorrectible error) ≥ P(those 4 specific uncorrectible errors) ≈ 0.0972

Combining this inequality with the earlier one, we have

    0.0972 ≤ P(uncorrectible error) ≤ 0.15

The right-hand inequality gives a quality assurance, while the left-hand inequality tells us that we are not wastefully underselling the quality of the code. And, again, we get this estimate on the probability of an uncorrectible error without looking at anything beyond single-bit errors.
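Both sides of this bracketing can be checked in a few lines (a sketch, assuming p = 1/10 and the counts from the computations above):

```python
p = 1 / 10
single = (1 - p)**3 * p      # probability of a specific single-bit error
# lower bound: the 4 ambiguous single-bit errors, each weighted by the
# 1/3 chance of the relevant codeword being sent
lower = 4 * (1 / 3) * single
# upper bound: 1 minus the earlier lower bound on correctible errors
upper = 1 - ((1 - p)**4 + (1 / 3) * (4 + 2 + 2) * single)
print(round(lower, 4), round(upper, 4))  # 0.0972 0.1495
```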
12.4 The Hamming binary [7, 4] code
    1000011
    0100101
    0010110
    0001111
The Hamming decoding procedure is one of the good features of this code. First, we'll write all the codewords as vectors, like

    1000011 = (1, 0, 0, 0, 0, 1, 1)
    0100101 = (0, 1, 0, 0, 1, 0, 1)

(Keep in mind that the components of these vectors are in F2.) Define auxiliary vectors by

    r = (0, 0, 0, 1, 1, 1, 1)
    s = (0, 1, 1, 0, 0, 1, 1)
    t = (1, 0, 1, 0, 1, 0, 1)

(No, it's not at all clear why these are the right things...) We'll use the inner product (also called dot or scalar product) on vectors, defined as usual by the expression

    (x1, . . . , xn) · (y1, . . . , yn) = x1 y1 + x2 y2 + . . . + xn yn

although in the present context all the indicated arithmetic is done inside the finite field F2 rather than the real or complex numbers. Then for a source word such as 0100 the Hamming [7, 4] code encodes it as x = 0100101, or, equivalently, as the vector x = (0, 1, 0, 0, 1, 0, 1). Suppose that a binary symmetric channel transmits the word as y = (1, 1, 0, 0, 1, 0, 1), that is, with a single bit error (in this example in the first position).
For example, for a received word y = (0, 1, 1, 0, 1, 0, 1), compute

    y · r = (0, 1, 1, 0, 1, 0, 1) · (0, 0, 0, 1, 1, 1, 1) = 0
    y · s = (0, 1, 1, 0, 1, 0, 1) · (0, 1, 1, 0, 0, 1, 1) = 1
    y · t = (0, 1, 1, 0, 1, 0, 1) · (1, 0, 1, 0, 1, 0, 1) = 1
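A sketch of how the auxiliary vectors can drive single-error correction: reading the three dot products as a binary number locates the flipped bit (this convention, and the helper names, are spelled out here as assumptions):

```python
# Syndrome decoding for the Hamming [7,4] code via the vectors r, s, t:
# the three dot products, read as a binary number, give the position of
# a single bit error (0 meaning no error detected).
r = (0, 0, 0, 1, 1, 1, 1)
s = (0, 1, 1, 0, 0, 1, 1)
t = (1, 0, 1, 0, 1, 0, 1)

def dot(u, v):
    # inner product with arithmetic in F2
    return sum(a * b for a, b in zip(u, v)) % 2

def correct(y):
    pos = 4 * dot(y, r) + 2 * dot(y, s) + dot(y, t)
    if pos:                     # flip the offending bit (1-indexed)
        y = list(y)
        y[pos - 1] ^= 1
    return tuple(y)

# the codeword 0100101 with a single bit error in the first position
print(correct((1, 1, 0, 0, 1, 0, 1)))  # (0, 1, 0, 0, 1, 0, 1)
```

For the received word y = (0, 1, 1, 0, 1, 0, 1) above, the dot products 0, 1, 1 read as binary 011 = 3, and flipping the third bit likewise recovers the codeword 0100101.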
If the bit error probability goes down to 1/12, then the word error probability with the do-nothing encoding is

    1 − (11/12)⁴ ≈ 0.2939

while for the Hamming [7, 4] code it is

    1 − ( (11/12)⁷ + 7·(1/12)·(11/12)⁶ ) ≈ 0.1101

With bit error probability further reduced to 1/20, the word error probability for do-nothing encoding is

    1 − (19/20)⁴ ≈ 0.18549

while for the Hamming [7, 4] code it is

    1 − ( (19/20)⁷ + 7·(1/20)·(19/20)⁶ ) ≈ 0.0444
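These comparisons can be reproduced in a couple of lines (a sketch; "do-nothing" sends 4 bits as-is and fails on any bit error, while the Hamming [7, 4] code succeeds whenever at most one of its 7 bits is flipped):

```python
def do_nothing(p):
    # word error for plain 4-bit transmission: any bit error is fatal
    return 1 - (1 - p)**4

def hamming74(p):
    # word error for the Hamming [7,4] code: fails only on 2+ bit errors
    return 1 - ((1 - p)**7 + 7 * p * (1 - p)**6)

for p in (1 / 12, 1 / 20):
    print(round(do_nothing(p), 4), round(hamming74(p), 4))
# prints 0.2939 0.1101, then 0.1855 0.0444
```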
Remark: The Hamming [7, 4] code can correct single bit errors, by converting 4-bit words into 7-bit words in a clever manner. This is much better than simply repeating messages, but what about 2-bit errors, etc.?
12.5 Some linear algebra
Row reduction (Gaussian elimination), reviewed in the next section, will provide one systematic approach.
The dot product or scalar product or inner product of two vectors x = (x1, . . . , xn) and y = (y1, . . . , yn) is

    x · y = x1 y1 + x2 y2 + . . . + xn yn

Say that two vectors are orthogonal if their dot product is 0.
Remark: When the scalars are real numbers or complex numbers, the dot product
has geometric significance, but when the scalars are F2 or other things, the geometric interpretation is less elementary. Likewise, while 2-dimensional or 3-dimensional
vectors over the real numbers have popular and important physical interpretations
as arrows or points in the plane or space, we have no such interpretation here. But
the mathematical operations are the same.
For positive integers m and n, an m-by-n matrix with entries in F is simply a block of numbers with m rows and n columns, with big parentheses enclosing it. The ijth entry or component is the entry in the ith row and jth column. For matrix M, very often the ijth component is denoted by Mij. For example,

    M = ( 11  12  13 )
        ( 21  22  23 )

is a 2-by-3 matrix. Here M11 = 11, M12 = 12, etc.
The diagonal of a matrix M is the upper-left to lower-right diagonal, consisting of the entries M11 , M22 , M33 , etc.
The transpose Mᵗ of an m-by-n matrix M is an n-by-m matrix obtained by flipping M across the diagonal, whose ijth entry is the jith entry of the original. For example,

    ( 11  12  13 )ᵗ   ( 11  21 )
    ( 21  22  23 )  = ( 12  22 )
                      ( 13  23 )
The size-n identity matrix In is the n-by-n matrix with 1s on the diagonal and 0s off the diagonal. For example,

    I1 = ( 1 )

    I2 = ( 1  0 )
         ( 0  1 )

    I3 = ( 1  0  0 )
         ( 0  1  0 )
         ( 0  0  1 )

    I4 = ( 1  0  0  0 )
         ( 0  1  0  0 )
         ( 0  0  1  0 )
         ( 0  0  0  1 )
Notice that this makes sense regardless of what kind of scalars we are using. At
the same time, looking at an identity matrix doesnt give any clue as to what the
scalars are. The n-by-n zero matrix consists entirely of zeros.
The transpose of a column vector is the corresponding row vector: for example,

    ( 1 )
    ( 2 )ᵗ
    ( 3 )  = (1, 2, 3, 4, 5)
    ( 4 )
    ( 5 )
Remark: In any case, the idea of a vector as ordered n-tuple is slightly more
abstract than the tangible notational manifestations as row or column vectors.
For most purposes, there is little reason to try to worry about whether a vector is
naturally a row vector versus column vector. We will consider these as just being
different notational devices for the same underlying thing.
For example, interchanging the second and third rows of a 3-by-2 matrix

    M = ( a  b )
        ( c  d )
        ( e  f )

is the same as left-multiplying M by

    ( 1  0  0 )
    ( 0  0  1 )
    ( 0  1  0 )

As another example, adding t times the third row of M to the first row of M is achieved by left-multiplying M by

    ( 1  0  t )
    ( 0  1  0 )
    ( 0  0  1 )
(To prove that this is so in general is an exercise in notation and the definition of
matrix multiplication.)
The row space of an m-by-n matrix M is the subset of F n consisting of all
linear combinations of rows of M (viewed as n-dimensional vectors). Similarly, the
column space of an m-by-n matrix M is the subset of F m consisting of all linear
combinations of columns of M (viewed as m-dimensional vectors).
An m-by-n matrix M is (strongly) row reduced if the following slightly complicated but important condition is met. Look at the ith row of M, which has entries Mi1, Mi2, . . ., Min. If all these entries are 0, then there is no condition. If not all these entries are 0, let ji be the smallest integer so that Miji ≠ 0. Call this the leading entry of the ith row, or pivot in the ith row. For M to be (strongly) row reduced we require that for every row index i

    Miji = 1    and    Mi′ji = 0 for i′ ≠ i

(For the weaker sense of row reduced, require only that Mi′ji = 0 for i′ > i.) Further, the leading entries should occur farther and farther to the right as we go down the rows. For example, the matrix

    1 0 0 0 1 1
    0 1 0 0 1 0
    0 0 0 1 1 1
is row reduced (in the strong sense): the leading entry in the top row occurs in the
first column, and all the other entries in the first column are 0. The leading entry in
the second row occurs in the second column, and all the other entries in the second
column are 0. The leading entry of the third row occurs in the fourth column,
and all other entries in that column are 0. The fact that the third column is all
0s is irrelevant. Also, the contents of the fifth and sixth columns are irrelevant to
the question of whether or not the matrix is row-reduced. And the leading entries
occur farther and farther to the right as we go down the rows. On the other hand,
the matrix
1 0 0 0 1 1
0 1 0 0 1 0
1 0 0 1 1 1
is not row reduced (in either sense): in the first column there are two 1s. That is,
the leading entry in both the first and third row occurs in the first column. Also,
the matrix
0 0 1 0 1 1
0 1 0 0 1 0
1 0 0 1 1 1
is not row reduced, since the leading entries do not occur farther to the right as we
move down the rows. That is, the leading entry in the second row is in the second
column, which is farther to the left than the leading entry of the first row, which is
in the third column. Likewise, the leading entry of the third row occurs still farther to the left of the leading entry of the second row.
Elementary row operations can be used to put a matrix into row-reduced form (in either the stronger or the weaker sense). Reasonably enough, the process of doing elementary row operations to put a matrix into row-reduced form is called row reduction.
(Strong) row reduction is easy to illustrate in an example. Let's start with the matrix
0 1 1 0 1 1
1 1 1 0 1 0
1 0 0 1 1 1
with entries in the field with two elements F2. First look in the first column: there is a non-zero entry, but it's not in the first row, so we interchange the first and second rows, to get
1 1 1 0 1 0
0 1 1 0 1 1
1 0 0 1 1 1
to make a non-zero entry occur in the first row. (We could also have interchanged
the first and third rows.) Then, since there is still a non-zero entry in the third
row, we subtract the first row from the third, obtaining
1 1 1 0 1 0
0 1 1 0 1 1
0 1 1 1 0 1
So the first column looks the way it should.
Next, look in the second column, but only below the first row. There are two
1s, and in particular there is a 1 in the second row, so we don't need to interchange
any rows. Thus, the leading term in the second row occurs in the second column.
But there are two other non-zero entries in the second column (in both first and
third rows), so we subtract the second row from both first and third rows, obtaining
    1 0 0 0 0 1
    0 1 1 0 1 1
    0 0 0 1 1 0
So the second column is arranged the way it should be. The third row has its leading entry not in the third column, but in the fourth. (So we just don't worry about what's going on in the third column.) And, in fact, the other entries of the fourth column are already 0s, so we don't have to do any further work. That is, the matrix is now in row-reduced form.
Remark: The weaker version of row reduction merely omits some work by not bothering to do the row operations to make the entries above a pivot 0. This approximately cuts in half the total number of operations necessary, and is sometimes a good-enough version of row reduction.

Remark: The term row reduced in the literature is ambiguous, and one must look at the context to discern whether it is strongly or weakly row reduced. For many purposes it does not matter much which sense is taken.
We can describe the row-reduction process for an m-by-n matrix M a little more abstractly. Start with two auxiliary indices s, t both set equal to 1. While s ≤ m and t ≤ n repeat the following:

If the s, s + 1, s + 2, . . . , m entries in the tth column are all 0, replace t by t + 1 and restart this block.

Else if the (s, t)th entry is non-zero, divide the sth row by the (s, t)th entry and then go to the next block.

Else if the (s, t)th entry is 0, but the (s′, t)th entry is non-zero (with s′ > s), then divide the s′th row by the (s′, t)th entry, interchange the sth and s′th rows, and go to the next block of operations.

For every s′ ≠ s, if the (s′, t)th entry is not 0, then subtract (s′, t)th-entry times the sth row from the s′th row. (This applies also to the indices s′ < s.) After all these subtractions, replace s by s + 1. Go back to the previous block (as long as s ≤ m).

When finally s = m + 1 or t = n + 1, the matrix will be in row-reduced form.
Remark: In the special case that the field is F2, the above process is simpler, since any non-zero element is already 1, so no division is ever necessary.

Remark: Again, if the weaker version of row reduction will suffice in a given application, then simply don't bother to subtract a lower row from a higher row. That is, don't bother to do the row operations to make entries above a pivot 0.
Remark: Most often these algorithms are studied in contexts in which floating-point real numbers are used. In that setting, the issue of loss of precision is critical. But in the present scenario, as well as when computing with numbers from arbitrary finite fields, we effectively have infinite precision, so we need not worry about round-off error, etc. This avoids many of the technical worries which require lengthy consideration in the floating-point case.
One problem we need to solve is the following: let

    v1 = (v11, v12, . . . , v1,n)
    v2 = (v21, v22, . . . , v2,n)
    v3 = (v31, v32, . . . , v3,n)
        . . .
    vm = (vm1, vm2, . . . , vm,n)

be m vectors of length n, and find a non-trivial linear dependence relation among them (if one exists). To do this, let M be the m-by-n matrix whose ith row is vi. Recall that an m-by-m identity matrix Im is an m-by-m matrix with 1s on the (upper-left to lower-right) diagonal and 0s off this diagonal:

    Im = ( 1 0 0 . . . 0 )
         ( 0 1 0 . . . 0 )
         ( 0 0 1 . . . 0 )
         (    . . .      )
         ( 0 0 0 . . . 1 )

Glue the identity matrix Im onto the right of M, obtaining an m-by-(n + m) matrix

    M~ = ( M  Im ) = ( v11  v12  . . .  v1,n   1  0  . . .  0 )
                     ( v21  v22  . . .  v2,n   0  1  . . .  0 )
                     (               . . .                    )
                     ( vm1  vm2  . . .  vm,n   0  0  . . .  1 )
That identity matrix (or, really, what it turns into subsequently) will keep track of the operations we perform. This type of larger matrix created from M is sometimes called an augmented matrix, but this terminology is nonspecific, so you shouldn't rely upon it.
The goal is to do elementary row operations until the matrix M (as a part of the larger matrix M~) has one or more rows which are all 0s, if possible. (The identity matrix stuck onto M on the right can never have this property...) That is, the leftmost n entries of one or more rows of M~ should be 0, if possible.
Doing the weak version of row reduction will accomplish this. We go through
it again: Starting in the leftmost column, if the top entry is 0, but if there is some
entry in the first column that is non-zero, interchange rows to put the non-zero
entry at the top. Divide through by the leftmost entry in the (new) first row so
that the leftmost entry is now 1. Let ai1 be the leftmost entry in the ith row. Then
for i > 1 subtract ai1 times the top row from all other rows. This has the effect of
making all entries in the first column 0 except for the top entry. (If the leftmost or
any other column is all 0s, just ignore it.)
Next look at the second column. If necessary, interchange the second row with
another row below it in order to arrange that the second entry of the second row is
not 0. (The first entries of all rows below the top one have already been made 0.)
Divide through the second row by the second entry, so that the second row starts
0, 1. Let ai2 be the ith entry from the top in the second column. Then subtract ai2
times the second row from all lower rows.
Continue this with the third, fourth, up to nth columns, or until the remaining entries among the first n columns are all 0s. Suppose that the row-reduced version of M~ is

    M~red = ( Mred  A )

where Mred is the reduced version of M, and the m-by-m matrix A is what Im turns into by this process.
Let wi be the left n entries of the ith row of the new matrix M~red. (So these are length-n row vectors.) Then what we have is

    A ( v1 )   ( w1 )
      ( v2 ) = ( w2 )
      ( .. )   ( .. )
      ( vm )   ( wm )
If m > n, then at least the last m − n of the wi's will be the zero vector. For example, wm will certainly be the length-n zero vector. That is, we have

    am1 v1 + am2 v2 + am3 v3 + . . . + amm vm = (0, . . . , 0)
It is important that (due to the way we obtained the matrix A) for each index i at
least one aij is nonzero.
In other words, we have found a linear combination of the vectors vi
which is zero. (And not all the coefficients in the linear combination are zero.)
Further, it may happen that there is more than one row of the reduced matrix
Mred which is all 0s. So, quite generally, if the ith row of Mred is all 0s, then
ai1 v1 + ai2 v2 + ai3 v3 + . . . + aim vm = (0, . . . , 0)
A numerical example: Find a (non-trivial) linear dependence relation among
the five 4-dimensional binary vectors
1101, 1011, 1100, 1111, 0110
First, stack these up as the rows of a matrix

    1 1 0 1
    1 0 1 1
    1 1 0 0
    1 1 1 1
    0 1 1 0

and then glue a 5-by-5 identity matrix onto the right of it:

    1 1 0 1 1 0 0 0 0
    1 0 1 1 0 1 0 0 0
    1 1 0 0 0 0 1 0 0
    1 1 1 1 0 0 0 1 0
    0 1 1 0 0 0 0 0 1
Now do (weak) row reduction. We already have the proper pivot in the first row,
so subtract the first row from the second, third, and fourth (but not fifth) to make
the other entries in the first column 0:
1 1 0 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0
0 0 0 1 1 0 1 0 0
0 0 1 0 1 0 0 1 0
0 1 1 0 0 0 0 0 1
The pivot in the second row is already prepared as well. Then we want to make
all the lower entries in the second column 0. (We don't care about the higher ones because we're doing the weaker form of row reduction.) Thus, just subtract the second from the last row:
1 1 0 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0
0 0 0 1 1 0 1 0 0
0 0 1 0 1 0 0 1 0
0 0 0 0 1 1 0 0 1
In the third column to get a pivot into the right spot we must interchange the third
and fourth rows:
1 1 0 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0
0 0 1 0 1 0 0 1 0
0 0 0 1 1 0 1 0 0
0 0 0 0 1 1 0 0 1
In this case all the entries below that 1 in the (3, 3) position are already 0s, so no
subtractions are necessary. In the fourth column the pivot is already in the right
spot, and there is only a 0 below it, so no subtractions are necessary. Likewise, in
the fifth column we already have a pivot, and no subtractions are necessary.
By this point, looking at the left 4 columns only of this big reduced matrix (since the original vectors were length/dimension 4):

    1 1 0 1
    0 1 1 0
    0 0 1 0
    0 0 0 1
    0 0 0 0
we see that we succeeded in getting a row of 0s along the bottom. Thus, taking the
bottom row 11001 of the right part of the large reduced matrix as coefficients for a
linear combination of the original vectors, we have (as predicted)
    1·1101 + 1·1011 + 0·1100 + 0·1111 + 1·0110 = 0000
This is the desired linear dependency relation.
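The recipe just carried out by hand can be sketched as a script (helper names illustrative): glue on an identity matrix, weakly row reduce the left part over F2, and read coefficients off a zero row:

```python
vecs = [[1, 1, 0, 1], [1, 0, 1, 1], [1, 1, 0, 0], [1, 1, 1, 1], [0, 1, 1, 0]]
m, n = len(vecs), len(vecs[0])
# augment with an identity matrix to record the row operations
aug = [v + [int(i == j) for j in range(m)] for i, v in enumerate(vecs)]

s = 0
for t in range(n):
    piv = next((i for i in range(s, m) if aug[i][t]), None)
    if piv is None:
        continue
    aug[s], aug[piv] = aug[piv], aug[s]
    for i in range(s + 1, m):          # weak reduction: clear below only
        if aug[i][t]:
            aug[i] = [a ^ b for a, b in zip(aug[i], aug[s])]
    s += 1

# any row whose left n entries are all 0 yields a dependency
dep = next(row[n:] for row in aug if not any(row[:n]))
print(dep)  # [1, 1, 0, 0, 1], i.e. 1101 + 1011 + 0110 = 0000 over F2
```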
12.7 Linear codes
Example: The generating matrix for the Hamming binary [7, 4] code has standard form

    1 0 0 0 0 1 1
    0 1 0 0 1 0 1
    0 0 1 0 1 1 0
    0 0 0 1 1 1 1
Proof: We will take for granted that row reduction can put any matrix into a
(strongly) row-reduced form. (One could prove this by induction.) That is, each
non-zero row begins with a 1, that 1 has only 0s above and below it, and, last,
these leading 1s are farther and farther to the right as one goes down the rows. By
permuting the columns we may move the columns in which these leading 1s occur
as far left as possible, putting the matrix into a form like
    ( 1 0 0 . . . 0  *  . . .  * )
    ( 0 1 0 . . . 0  *  . . .  * )
    ( 0 0 1 . . . 0  *  . . .  * )
    (     . . .          . . .   )
    ( 0 0 0 . . . 1  *  . . .  * )
    ( 0 0 0 . . . 0  0  . . .  0 )
    (     . . .          . . .   )
    ( 0 0 0 . . . 0  0  . . .  0 )
That is, the upper left corner is an identity matrix, to the right of this there is an
unpredictable rectangular matrix, and below these everything is 0. If not for the
possibility of rows of 0s at the bottom, this would be the desired form.
The hypothesis that the k rows of the original matrix are linearly independent
will assure that there can be no rows of 0s. Elementary row operations do not
change the row space, and permutations of columns do not change the dimension
of the row space. Thus, the dimension of the row space of the matrix above must
still be k, the number of its rows. Thus, since dimension of a vector (sub-) space
is well-defined, it cannot be that a k-dimensional space is spanned by fewer than k
non-zero vectors. That is, there can be no rows of 0s in the reduced form. Thus,
with suitable permutations of columns, we have the standard form as claimed. ///
For a generating matrix G of size k-by-n, the associated linear [n, k]-code C consists of all linear combinations of rows of G other than the zero vector. That is, the codewords of the code C are exactly the vectors in Fqⁿ which are linear combinations of the rows of G. Such codes are linear codes. The set of all linear combinations of rows of a matrix is the row space of the matrix.
If G is in standard form, then the first k entries of a codeword are called the information positions or information symbols, and the remaining n − k entries are called the parity check positions or parity-check symbols.
Associated to a k-by-n generating matrix G is an encoding (map)

    f : Fqᵏ → row space of G

defined by

    f(v1, v2, . . . , vk) = ( v1  v2  v3  . . .  vk ) G
Thus, the source words are taken to be all k-length words over Fq (other than all
0s), and the encoding is by n-length words over Fq .
Remark: In the special case that the code is binary, instead of saying k-length
and n-length, we can say k-bit and n-bit.
Remark: Note that if G is in standard form then the first k positions of the
encoded word are the same as the word itself:
f (v1 , v2 , . . . , vk ) = (v1 , v2 , . . . , vk , . . .)
Therefore, for G in standard form, the information positions of a vector in the
associated code completely determine the parity-check symbols. In particular, if
    G = ( Ik  A )

and v = (v1, . . . , vk) is a k-tuple of scalars, then (viewing v as a row vector)

    f(v) = vG = v( Ik  A ) = ( vIk  vA ) = ( v  vA )
That is, the parity-check positions are obtained from the information positions by
matrix multiplication.
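A sketch of this encoding for the Hamming [7, 4] generating matrix in standard form (the function name is an illustrative choice):

```python
# Encoding f(v) = vG over F2, with G = (I4 | A) for the Hamming [7,4] code.
G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]

def encode(v):
    # v G, with arithmetic in F2; zip(*G) iterates over columns of G
    return tuple(sum(vi * g for vi, g in zip(v, col)) % 2 for col in zip(*G))

word = (0, 1, 0, 0)
print(encode(word))  # (0, 1, 0, 0, 1, 0, 1): first 4 entries repeat the word
```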
Remark: In general, for a k-by-n generating matrix G not necessarily in standard form, a collection of k indices i in the range 1 ≤ i ≤ n is called a set of information positions if the values of a codeword at these entries completely determine the other symbols in the codeword. A code is called systematic on a set of positions if that set of positions is a set of information positions for the code.
As defined earlier, the (Hamming) distance between two vectors v = (v1, . . . , vn) and w = (w1, . . . , wn) in Fqⁿ is the number of indices i so that

    vi ≠ wi

This is not the usual notion of distance, but it is appropriate for linear codes. And, again, the Hamming weight of a binary vector is the number of non-zero components (that is, the number of components which are 1). The minimum distance of a linear [n, k] code C is the minimum of d(v, w) over distinct vectors v ≠ w in C. (Recall that a linear [n, k] code is a particular kind of subset of Fqⁿ, being the row space of a generating matrix.)
Proposition: The Hamming distance on Fqⁿ has the formal properties of a distance function: for u, v, w in Fqⁿ, (1) d(v, w) ≥ 0; (2) d(v, w) = 0 if and only if v = w; (3) d(v, w) = d(w, v); and (4) (triangle inequality) d(u, w) ≤ d(u, v) + d(v, w).
Proof: The first three assertions are immediate. The reason the fourth is true is
that if u and w differ at the ith position, then any vector v must differ from at least
one of the two at the ith position.
///
Minimum distance decoding of a code C is done by choosing the closest codeword x in C to a received word y, using the Hamming distance, and declaring x to be the decoding of y. (If there are two closest codewords, then either pick one arbitrarily, or refuse to decode.)
Remark: There still remains the issue of how to do the decoding efficiently. Well
address this shortly in discussing so-called syndrome decoding.
As earlier, let floor(t) denote the floor function of t, which is the largest
integer less-than-or-equal t, for real numbers t.
Theorem: Using minimum distance decoding, a linear code C with minimum distance d can correct floor((d − 1)/2) errors, and detect floor(d/2) errors.
Proof: Let e = floor((d − 1)/2), so that 2e < d. Suppose that a codeword x is sent, and that the received word y has at most e symbol errors, that is, d(x, y) ≤ e. For minimum-distance decoding to recover x, we need d(x, y) < d(x′, y) for every codeword x′ other than x. Suppose, to the contrary, that d(x′, y) ≤ e as well. We use the triangle inequality (proven just above) in a slightly clever (but completely standard!) way:

    d(x, x′) ≤ d(x, y) + d(y, x′) ≤ e + e = 2e

But this contradicts

    2e < d ≤ d(x, x′)

Thus, it could not have been that y was as close to x′ as to x. Thus, with 2e < d, e symbol errors can be corrected by minimum-distance decoding.

Similarly, if e ≤ floor(d/2), then 2e ≤ d. To be sure to detect this error, it must be that y is still at least as close to x (in Hamming distance) as y is to any other codeword. That is, it must be that

    d(x, y) ≤ d(x′, y)

for every other codeword x′. Suppose, to the contrary, that d(x′, y) < e. Then we use the triangle inequality as we did just above:

    d(x, x′) ≤ d(x, y) + d(y, x′) < e + e = 2e
Proposition: The dual of a linear [n, k] code over F_q is a linear [n, n − k] code over F_q.
Proof: In the terminology of the appendix on linear algebra, for fixed vector w of length n, the map v → v · w is a linear functional on the vector space of length-n vectors. That is, we are identifying the dual space V* of the space V of all length-n vectors with V itself via the dot product. Thus, with this identification, the dual code is the orthogonal complement of the original code, in the sense of that appendix. And, from that appendix, we have the relation
dim C + dim C^⊥ = n
Thus, for dim C = k, we have dim C^⊥ = n − k.
///
Theorem: For a linear code C, the dual code of the dual code is again C itself. In symbols:
(C^⊥)^⊥ = C
Proof: The fact that the second dual of a subspace W is W itself again is proven
in the appendix on linear algebra.
///
Proposition: If G is in standard form G = (I_k A), then a generating matrix G^⊥ for the dual code C^⊥ to the code C associated to G is
G^⊥ = (−A^t  I_{n−k})
Proof: First, compute
G (G^⊥)^t = (I_k  A) (−A^t  I_{n−k})^t = I_k · (−A) + A · I_{n−k} = −A + A = 0
so every row of (−A^t I_{n−k}) is orthogonal to every row of G. That is,
rowspace (−A^t  I_{n−k}) ⊆ (rowspace G)^⊥ = C^⊥
On the other hand, because of the block I_{n−k}, the matrix (−A^t I_{n−k}) has rank n − k, so its rowspace has dimension n − k. By the proposition above, the dual code C^⊥ also has dimension n − k. A subspace of dimension n − k of a vector space of dimension n − k is the whole space, so in fact
rowspace (−A^t  I_{n−k}) = (rowspace G)^⊥
which proves the result.
///
Let y_1, . . . , y_{n−k} be the rows of a generating matrix for the dual code C^⊥ of a linear [n, k] code C over F_q. Then the last result implies that a vector v in F_q^n is in the code C if and only if
v · y_i = 0 for all y_i
Each such condition is a parity-check condition for v to lie in C. An entire generating matrix H for the dual code is called a (parity) check matrix for the original code. Note that a parity-check matrix for an [n, k] code is (n − k)-by-n. And the simultaneous conditions imposed by all the separate parity-check equations are equivalent to the matrix equation
v H^t = 0
where the last 0 is a zero vector.
For a code C with parity-check matrix H, for v ∈ F_q^n the vector
v H^t   (matrix multiplication)
is the syndrome of v. By the theorem above that asserts that the dual of the dual is the original code, the syndrome x H^t of a codeword x is the zero-vector. Thus, if a vector y is received which is a codeword x with an error vector e added,
y = x + e
then the syndromes of y and e are the same, as is easily verified by a little algebra:
y H^t = (x + e) H^t = x H^t + e H^t = 0 + e H^t = e H^t
since x H^t = 0 (because x is a codeword).
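The algebra above can be checked mechanically. The following Python sketch is an illustration, not from the text; the check matrix used is one for the binary [3, 1] repetition code. It computes syndromes over F_2 and confirms that a received word and its error vector have the same syndrome.

```python
def syndrome(v, H):
    """Compute v H^t over F_2: one parity bit per row of the check matrix H."""
    return tuple(sum(vi * hi for vi, hi in zip(v, row)) % 2 for row in H)

# A check matrix for the binary [3, 1] repetition code (rows span the dual code).
H = [(1, 1, 0),
     (0, 1, 1)]

x = (1, 1, 1)                               # a codeword: its syndrome is zero
e = (0, 1, 0)                               # an error vector
y = tuple((a + b) % 2 for a, b in zip(x, e))  # received word y = x + e

assert syndrome(x, H) == (0, 0)
# The received word and the error have the same syndrome, since x H^t = 0.
assert syndrome(y, H) == syndrome(e, H)
```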
A coset or syndrome of C in F_q^n is a subset of F_q^n of the form (with fixed v_0 ∈ F_q^n)
{v_0 + v : v ∈ C}
The standard notation for this coset (equivalently, syndrome) is
v_0 + C = {v_0 + v : v ∈ C}
This is also called the v_0-th coset or syndrome of C.
Proposition: Let C be a binary linear [n, k] code. For vectors v, w in F_2^n:
A vector v is in C if and only if v + C = C.
If (v + C) ∩ (w + C) ≠ ∅, then v + C = w + C.
There are exactly 2^{n−k} distinct cosets (syndromes) of C.
Proposition: Fix an n-bit binary linear code C in F_2^n. For a sent codeword x received as vector y with error e = y − x, we have an equality of cosets
e + C = y + C
///
Theorem: For a binary linear code C on a binary symmetric channel with bit error probability less than 1/2, syndrome decoding (decode a received word y as y − e, where e is a coset leader, that is, a word of minimum Hamming weight, of the coset y + C) is equivalent to minimum-distance decoding, hence to maximum-likelihood decoding.
Proof: Note that of course the hypothesis that the bit error probability is less than 1/2 would hold in any interesting or realistic situation.
The first assertion is straightforward, as follows. Given a received word y, let e be a coset leader for the coset y + C. That is, e ∈ y + C and e has minimum Hamming weight among all words in that coset y + C. Write e in the form y + x for some x ∈ C. Then syndrome decoding says to decode y as y − e, which is
y − e = y − (y + x) = −x ∈ C
The Hamming distance from y to the decoded word −x is the Hamming weight of
y − (−x) = y − (y − e) = e
Thus, minimizing the Hamming weight of e is equivalent to minimizing the Hamming distance from y to an element of C. This makes the equivalence of syndrome decoding and minimum distance decoding clear. (And we had earlier verified the equivalence of minimum distance decoding and maximum likelihood decoding.) ///
Remark: Of course a coset y + C with more than one coset leader is bad, because
the presence of more than one coset leader means that maximum likelihood decoding
is ambiguous, so has a good chance of failure, in the sense that such an error is
detected but cannot be corrected. But this is inescapable for the worst cosets in
most codes.
Exercises
12.01 Encode the 4-bit word 1101 using the Hamming [7, 4] code.
12.02 Decode the 7-bit word 1101111 using the Hamming [7, 4] code.
12.03 What is the rate of the Hamming [7, 4] code? (ans.)
12.04 What is the word error probability using the Hamming [7, 4] code and a
binary symmetric channel with bit error probability 1/6?
12.05 Express (1, 2) as a linear combination of (3, 4) and (5, 7) (with real scalars).
(ans.)
12.06 Express (2, 1) as a linear combination of (3, 4) and (5, 7) (with
rational scalars).
12.07 Express (1, 2, 3) as a linear combination of (8, 3, 2), (4, 2, 1), (3, 1, 1) (with
rational scalars).
12.08 Express (1, 0, 1) as a linear combination of (8, 3, 2), (4, 2, 1), (3, 1, 1) (with
rational scalars). (ans.)
12.09 Express (1, 1, 1) as a linear combination of (1, 0, 1), (0, 0, 1), (0, 1, 1) (with
scalars F2 ). (ans.)
12.10 Express (1, 0, 1) as a linear combination of (1, 1, 1), (0, 1, 1), (1, 1, 0) (with
scalars F2 ).
12.11 Express (1, 0, 1, 1) as a linear combination of (1, 1, 1, 0), (0, 1, 1, 1), (1, 1, 0, 1),
(1, 1, 1, 1) (with scalars F2 ). (ans.)
12.12 Express (1, 1, 0, 1) as a linear combination of (1, 0, 0, 1), (0, 1, 1, 1), (1, 1, 0, 1),
(1, 1, 1, 1) (with scalars F2 ).
12.13 What is the rate of a binary linear [n, k] code?
12.14 Let
G = ( 1 0 0 1 )
    ( 0 1 1 1 )
(ans.)
12.15 Let
G = [ binary generating matrix; entries garbled in extraction ]
13
Bounds for Codes
13.1 Hamming (sphere-packing) bound
13.2 Gilbert-Varshamov bound
13.3 Singleton bound
There are some general facts that can be proven about codes without actual
construction of any codes themselves, giving us guidance in advance about what
may be possible and what is impossible.
13.1 Hamming (sphere-packing) bound
The ball of radius r in F_q^n centered at a vector v is the collection of all vectors w so that d(v, w) ≤ r. The volume of this ball is the number of vectors in it.
Lemma: The volume of a ball of radius r in F_q^n is
volume of ball of radius r centered at v
= 1 + (q − 1) C(n, 1) + (q − 1)^2 C(n, 2) + . . . + (q − 1)^r C(n, r)
where C(n, k) denotes the binomial coefficient n-choose-k.
Proof: The 1 counts the vector v itself. Next we count vectors which differ at one position from v: there are C(n, 1) = n positions at which they might be different, and q − 1 choices for the other character to appear at the chosen position. Next we count vectors which differ at two positions from v: there are C(n, 2) choices of the two positions at which they might be different, and q − 1 choices for each of the other two characters at the chosen positions:
number of vectors at distance 2 from v = (q − 1)^2 C(n, 2)
Continue: finally we count vectors which differ at r positions from v: there are C(n, r) choices of the r positions at which they might be different, and q − 1 choices for each of the other r characters at the chosen positions:
number of vectors at distance r from v = (q − 1)^r C(n, r)
///
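The lemma's count is easy to compute. A small Python sketch (an illustration, not part of the text):

```python
from math import comb

def ball_volume(q, n, r):
    """Number of vectors in F_q^n within Hamming distance r of a fixed center."""
    return sum((q - 1) ** i * comb(n, i) for i in range(r + 1))

# Sanity check: a ball of radius n is all of F_q^n, by the binomial theorem.
assert ball_volume(2, 7, 7) == 2 ** 7
assert ball_volume(2, 7, 1) == 1 + 7   # the center plus the 7 vectors at distance 1
```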
Lemma: Let x and y be two vectors in F_q^n with d(x, y) > 2e for some integer e. Then for another vector z with d(x, z) ≤ e, it must be that d(y, z) > e.
///
Corollary: Given vectors x, y with d(x, y) > 2e, the balls of radius e centered at x and at y are disjoint.
Proof: If x′ is in the ball of radius e centered at x, then apply the previous lemma to see that its distance from y must be > e, so x′ cannot lie in the ball of radius e centered at y.
///
13.2 Gilbert-Varshamov bound
Theorem: (Gilbert-Varshamov) A binary linear [n, k] code with minimum distance at least d exists provided that
2^{n−k} > 1 + C(n−1, 1) + C(n−1, 2) + . . . + C(n−1, d−2)
Remark: Although this theorem is a concrete assurance that good codes exist, it
does not give any efficient procedure to find them (nor to decode them).
Remark: There is no assertion that this is the best that a code can do, only that
we can expect at least this level of performance.
Proof: The starting point for the proof is the fundamental fact, proven in the next chapter, that for a linear code the minimum distance is at least d if and only if any d − 1 columns of a check matrix are linearly independent. Granting that, consider the process of choosing n columns in a check matrix so that any d − 1 of them are linearly independent. The code is the row space of a k-by-n generating matrix. Its check matrix is an (n − k)-by-n matrix of full rank, that is, of rank n − k. (That means that if you do row reduction there will not be any row of all 0's when you're done.)
We just do the binary case for simplicity.
Let's suppose that in the construction of a check matrix we have successfully chosen ℓ columns so far (with ℓ ≥ d − 2) so that no d − 1 of them are linearly dependent. Now we want to choose an (ℓ + 1)-th column. The choice must be made from among column vectors of size n − k. There are 2^{n−k} such vectors. We must exclude the all-0 column, exclude any previous column, exclude the sum of any previous two columns, exclude the sum of any previous three columns, and so on up to excluding the sum of any previous d − 2 columns. In the worst case, all these things that we must exclude are different. This is the case we consider. That is, at worst there might be only
2^{n−k} − ( 1 + C(ℓ, 1) + C(ℓ, 2) + . . . + C(ℓ, d−2) )
available vectors. Thus, to be sure that a choice is available, this number must be positive.
For i < d, the inequality that must be satisfied in order to be able to choose the i-th column is
2^{n−k} ≥ 1 + C(i−1, 1) + C(i−1, 2) + . . . + C(i−1, i−1)
By the binomial theorem, the right-hand side is just 2^{i−1}. Since
i − 1 ≤ d − 1 ≤ n − k
(the columns have length n − k, and d − 1 of them are to be linearly independent, so d − 1 ≤ n − k), this inequality certainly holds.
The binomial coefficients C(ℓ, i) are increasing as ℓ increases. Thus, if
2^{n−k} > 1 + C(ℓ, 1) + C(ℓ, 2) + . . . + C(ℓ, d−2)
for ℓ = n − 1 (to try to choose the n-th column), then the inequality is certainly satisfied for smaller ℓ. Thus, we obtain the condition of the theorem.
///
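The condition derived in the proof is easy to test numerically. A Python sketch (illustrative; the function name is mine):

```python
from math import comb

def gilbert_varshamov_ok(n, k, d):
    """True if the Gilbert-Varshamov condition guarantees a binary linear
    [n, k] code of minimum distance at least d:
        2^(n-k) > 1 + C(n-1, 1) + ... + C(n-1, d-2)."""
    excluded = sum(comb(n - 1, i) for i in range(d - 1))   # i = 0 .. d-2
    return 2 ** (n - k) > excluded

# The Hamming [7, 4] code has minimum distance 3, and the condition confirms
# such parameters are achievable: 2^3 = 8 > 1 + C(6, 1) = 7.
assert gilbert_varshamov_ok(7, 4, 3)
```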
Remark: The converse assertion is false. That is, there may exist linear codes
exceeding what the theorem guarantees, although such codes would be very good
indeed. In fact, certain of the geometric Goppa codes were proven by Tsfasman,
Vladut, and Zink to exceed the Gilbert-Varshamov bound.
Theorem: (Singleton bound) A code over F_q with ℓ codewords, block length n, and minimum distance d satisfies
ℓ ≤ q^{n−(d−1)}
Remark: If the code is linear, for example is an [n, k] code, then the number of codewords is ℓ = q^k, and by taking logarithms base q the Singleton bound becomes
n − (d − 1) ≥ k
or, alternatively,
n + 1 ≥ k + d
Thus, on the other hand, if k + d > n + 1 then no such code can exist.
Proof: Since every pair of codewords differs in at least d positions, even if we ignore the last d − 1 positions, no two codewords will agree in all of the first n − (d − 1) positions. So if we just chop off the last d − 1 positions, all the ℓ codewords are still different. So we'll get a code with ℓ codewords and block length n − (d − 1). Since there are q^{n−(d−1)} words of length n − (d − 1) altogether, certainly ℓ ≤ q^{n−(d−1)}. This is the desired inequality.
///
If a code meets the Singleton bound, then it is called a maximum distance separable code, or MDS code.
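The Singleton bound likewise gives a quick feasibility test, as in this illustrative Python sketch (not from the text):

```python
def singleton_ok(q, n, num_words, d):
    """True if a q-ary code with num_words codewords, block length n, and
    minimum distance d is not ruled out by the Singleton bound."""
    return num_words <= q ** (n - (d - 1))

# A binary linear [n, k] code must satisfy k + d <= n + 1. For example,
# dimension 3 with distance 3 and length 5 passes, but dimension 4 with the
# same n and d is ruled out.
assert singleton_ok(2, 5, 2 ** 3, 3)
assert not singleton_ok(2, 5, 2 ** 4, 3)
```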
Exercises
13.01 Is there a binary code with 17 codewords, with minimum distance 3, and
with length 7? (ans.)
13.02 Is there a binary code with 29 codewords, with minimum distance 3, and
with length 8?
13.03 Is there a binary code with 7 codewords, with minimum distance 5, and with
length 8? (ans.)
13.04 Is there a binary linear code of dimension 2, with minimum distance 3, and
with block length 5? (ans.)
13.05 Is there a binary linear code of dimension 3, with minimum distance 3, and
with block length 6?
13.06 Is there a binary linear code of dimension 3, with minimum distance 3, and
with block length 5? (ans.)
13.07 Is there a binary linear code of dimension 3, with minimum distance 4, and
with block length 8?
13.08 Is there a binary linear code of dimension 2, with minimum distance 4, and
with block length 7? (ans.)
13.09 Is there a binary code with 9 codewords, with minimum distance 3, and with
length 5?
13.10 Is there a binary code with 17 codewords, with minimum distance 5, and
with length 8?
14
More on Linear Codes
14.1 Minimum distances in linear codes
14.2 Cyclic codes
Theorem: A linear code C with check matrix H has minimum distance at least d + 1 if and only if any d columns of H are linearly independent.
Proof: Since the code is linear, the minimum distance between two codewords is also the minimum distance from the 0 codeword to any other codeword. Let the columns of the check matrix H be
H = ( r_1  r_2  . . .  r_n )
A vector v = (c_1, . . . , c_n) is a codeword exactly when v H^t = 0, that is, when
c_1 r_1 + c_2 r_2 + . . . + c_n r_n = 0
If any d of the r_i's are linearly independent, then for any non-zero codeword v there must be at least d + 1 non-zero c_i's. On the other hand, if some d of the r_i's are linearly dependent, then for some non-zero codeword v there are at most d non-zero c_i's.
///
Remark: This theorem finally gives an idea of how to look for linear codes that can correct many errors. The issue is converted to looking for (check) matrices H so that any e columns are linearly independent. This is what motivates the constructions that follow.
Corollary: If any 2e columns of the check matrix are linearly independent, then
the code can correct any e errors, and vice versa.
Corollary: For a binary linear code, if no 2 columns of a check matrix are the
same, and if no column of the check matrix is 0, then the code can correct any
single error.
Proof: The case of two binary vectors being linearly independent is especially
simple: this means that neither is a multiple of the other. And because the scalars
are just the field with two elements, if the columns are not all 0s then the only
possible scalar multiple would be by the scalar 1, so to be scalar multiples and
non-zero two columns would actually have to be the same.
///
Remark: In general, it can happen that some of those cycled rows are linearly dependent. Then how do we systematically determine the dimension of a cyclic code C manufactured in such manner? That is, what is the k so that C will be an [n, k]-code?
To answer this question about the dimension of a cyclic code, we identify vectors (b_0, . . . , b_{n−1}) with polynomials with coefficients in F_q, this time in ascending degree:
(b_0, . . . , b_{n−1}) ↔ b_0 + b_1 x + b_2 x^2 + b_3 x^3 + . . . + b_{n−1} x^{n−1}
(Notice how we altered the indexing of the vector to match the exponents in the polynomial.) With this convention, the vector v′ obtained from a given vector v by cycling the symbols of v to the right (and wrapping around) is simply described as
v′(x) = (x · v(x)) % (x^n − 1)   (polynomial multiplication)
where A%B denotes the reduction of A modulo B. Thus, the set of all such cycled-and-wrapped versions of v is the set of all polynomials
(x^i · v(x)) % (x^n − 1)
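The cycling-as-multiplication-by-x description can be checked directly; here is an illustrative Python sketch (not from the text) over F_2, with index i of the vector holding the coefficient of x^i:

```python
def cycle_right(v):
    """Right cyclic shift of v, computed as (x * v(x)) % (x^n - 1) over F_2."""
    n = len(v)
    # Multiply by x: the coefficient of x^i moves to x^(i+1).
    shifted = {i + 1: c for i, c in enumerate(v)}
    # Reduce mod x^n - 1: x^n wraps around to 1.
    out = [0] * n
    for i, c in shifted.items():
        out[i % n] = (out[i % n] + c) % 2
    return tuple(out)

assert cycle_right((1, 1, 0, 0)) == (0, 1, 1, 0)
assert cycle_right((0, 0, 0, 1)) == (1, 0, 0, 0)   # x^3 * x = x^4 = 1 mod x^4 - 1
```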
And since the code is linear, we can add any number of these cycled vectors together. Thus, the code C is exactly the collection of all vectors expressible (in polynomial terms) in the form
(P(x) · v(x)) % (x^n − 1)
where P(x) is any polynomial. Further, suppose that a linear combination of the cycled versions of v is 0, that is, that
(( Σ_i c_i x^i ) · v(x)) % (x^n − 1) = 0
Further, let
h(x) = c_0 + c_1 x + c_2 x^2 + . . . + c_ℓ x^ℓ
be such a polynomial (with k + ℓ = n). We claim that a check matrix H for C is given by
H =
( c_ℓ  c_{ℓ−1}  c_{ℓ−2}  . . .  c_1  c_0  0  0  . . .  0 )
( 0  c_ℓ  c_{ℓ−1}  c_{ℓ−2}  . . .  c_1  c_0  0  . . .  0 )
( 0  0  c_ℓ  c_{ℓ−1}  c_{ℓ−2}  . . .  c_1  c_0  . . .  0 )
( . . . )
( 0  . . .  0  0  c_ℓ  c_{ℓ−1}  c_{ℓ−2}  . . .  c_1  c_0 )
That is, the top row is made from the coefficients of h in descending order, followed by 0's, and each subsequent row is the previous one cycled to the right (with wrap-around).
Proof: Since we'll see just below that there is no loss of generality in taking g(x) = b_k x^k + . . . + b_1 x + b_0 to be a divisor of x^n − 1, we assume so now. Then we have g(x) h(x) = x^n − 1. By the very definition of polynomial multiplication, the coefficient of x^m in the product g(x) h(x) = x^n − 1 is
Σ_{i+j=m} b_i c_j = 1 (if m = n),  −1 (if m = 0),  0 (if 0 < m < n)
Note that the set of the n − 1 of these expressions with 0 < m < n is the same as the set of expressions for entries of G H^t, that is, scalar products of rows of G and rows of H, though the set of scalar products of the rows has many repeats of the same thing. Thus, g(x) h(x) = x^n − 1 implies that G H^t = 0. However, this does not quite assure that H is a check matrix, since without any further information it is conceivable that there would be a vector v of length n not in the code (the rowspace of G) but with v H^t = 0. We will show that this does not happen, invoking some standard linear algebra. In particular, for a vector subspace C of the length-n vectors F_q^n, define the orthogonal complement
C^⊥ = {w ∈ F_q^n : w · v = 0 for all v ∈ C}
From the appendix on linear algebra, we have
dim C + dim C^⊥ = n
Since the rowspace of H is contained in the orthogonal complement C^⊥ of the rowspace C of G, and since visibly
dim rowspace G + dim rowspace H = n
we find that
dim rowspace H = dim C^⊥
Again, rowspace H ⊆ C^⊥, so necessarily rowspace H = C^⊥. That is, v H^t = 0 implies v ∈ C. That is, H is a check matrix for the code with generator matrix G.
///
To find the smallest-degree polynomial h so that (h(x) · v(x)) % (x^n − 1) = 0, we take
h(x) = (x^n − 1) / gcd(v(x), x^n − 1)
where gcd is greatest common divisor. The greatest common divisor of two polynomials is easily found via the Euclidean algorithm for polynomials.
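The Euclidean algorithm for polynomials over F_2 is short; in this illustrative Python sketch (not from the text), a binary polynomial is stored as a bitmask, with bit i holding the coefficient of x^i. Subtraction and addition both become XOR.

```python
def poly_degree(a):
    """Degree of a binary polynomial stored as a bitmask (bit i = coeff of x^i)."""
    return a.bit_length() - 1

def poly_mod(a, b):
    """Remainder of a divided by b, coefficients in F_2."""
    while a and poly_degree(a) >= poly_degree(b):
        a ^= b << (poly_degree(a) - poly_degree(b))
    return a

def poly_gcd(a, b):
    """Euclidean algorithm for polynomials over F_2."""
    while b:
        a, b = b, poly_mod(a, b)
    return a

# Over F_2, gcd(x^4 - 1, x^6 - 1) = x^gcd(4,6) - 1 = x^2 - 1 (i.e. x^2 + 1):
x4_1 = (1 << 4) ^ 1
x6_1 = (1 << 6) ^ 1
assert poly_gcd(x6_1, x4_1) == (1 << 2) ^ 1
```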
In fact, continuing this line of thought, every cyclic code of block length n is obtainable by taking g(x) to be a divisor of x^n − 1. With
k = n − deg g
we obtain an [n, k]-code. The check polynomial h(x) is
h(x) = (x^n − 1) / g(x)
Proof: Given arbitrary g(x) as a generator polynomial, from above we see that a check polynomial is
h(x) = (x^n − 1) / gcd(g(x), x^n − 1)
That is, the dual code is generated by the collection of all shifts (with wrap-around) of the coefficients of h(x) (padded by 0's). (If the coefficients of g(x) are arranged by ascending degree, then those of h(x) are arranged by descending degree, and vice versa.) Similarly, the dual code to the dual has generator polynomial
f(x) = (x^n − 1) / gcd(h(x), x^n − 1) = (x^n − 1) / h(x)
since h(x) divides x^n − 1. We have shown earlier that the dual of the dual is the original code. Thus, the cyclic code with generator polynomial f(x) is identical to the original code with generator g(x). That is, we could as well have taken f(x) in the first place. Since f(x) = (x^n − 1)/h(x), we have f(x) · h(x) = x^n − 1, and thus it is clear that the generator polynomial could have been chosen to be a divisor of x^n − 1.
///
In this situation, the smaller the degree of h(x), the bigger the code (so the
higher the rate).
Exercises
14.01 Find the dimension of the row space of
( 1 1 1 0 0 0 )
( 0 1 1 1 0 0 )
( 0 0 1 1 1 0 )
14.02 Find the dimension of the row space of the matrix
[ large binary matrix; entries garbled in extraction ]
(ans.)
14.03 Let C be the binary cyclic code of length 9 with generator polynomial
100111011 (coefficients ordered by ascending degree). Find a check matrix
for C. (ans.)
Exercises
239
14.04 Let C be the binary cyclic code of length 9 with generator polynomial
010001011 (coefficients ordered by ascending degree). Find a check matrix
for C.
14.05 Let C be the binary cyclic code of length 9 with generator polynomial
011010001 (coefficients ordered by ascending degree). Find a check matrix
for C.
15
Primitive Roots
15.1 Primitive elements in finite fields
15.2 Characteristics of fields
15.3 Multiple factors in polynomials
15.4 Cyclotomic polynomials
15.5 Primitive elements in finite fields: proofs
15.6 Primitive roots in Z/p
15.7 Primitive roots in Z/pe
15.8 Counting primitive roots
15.9 Non-existence of primitive roots
15.10 An algorithm to find primitive roots
15.2 Characteristics of fields
Theorem: Let k be a field of positive characteristic p. Then p is prime, and for any integer n, if n · 1_k = 0_k, then p divides n.
Proof: Let n be the characteristic, that is, the least positive integer so that n · 1_k = 0_k, and suppose n = a · b is a factorization into positive integers. Then
0_k = n · 1_k = (a · 1_k) · (b · 1_k)
Since a field has no proper zero-divisors, it must be that either a · 1_k = 0 or b · 1_k = 0. By the hypothesis that n was minimal, if a · 1_k = 0 then a = n, and similarly for b. Thus, the factorization n = a · b was not proper. Since n has no proper factorization, it is prime.
Now suppose that n · 1_k = 0_k for some integer n, and write p for the characteristic. By the division algorithm, we have n = qp + r with 0 ≤ r < p. Then
0_k = n · 1_k = q(p · 1_k) + r · 1_k = 0_k + r · 1_k
From this, r · 1_k = 0_k. Since r < p and p was the least positive integer with p · 1_k = 0_k, it follows that r = 0 and p divides n.
///
Fields with positive characteristic p have a peculiarity which is at first counter-intuitive, but which plays an important role in both theory and applications: for x, y in a field (or polynomials over a field) of characteristic p,
(x + y)^p = x^p + y^p
since the prime p divides the binomial coefficients C(p, i) for 0 < i < p.
Also
(x^2 + 1)^p = x^{2p} + 1
(x^2 + x + 1)^p = x^{2p} + x^p + 1
and such things.
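Both facts (that p divides the middle binomial coefficients, and the resulting p-th power formula for polynomials) can be verified computationally; an illustrative Python sketch (not from the text):

```python
from math import comb

def freshman_dream_holds(p):
    """Check that p divides C(p, i) for 0 < i < p, which is what makes
    (x + y)^p = x^p + y^p in characteristic p."""
    return all(comb(p, i) % p == 0 for i in range(1, p))

assert freshman_dream_holds(5)
assert not freshman_dream_holds(6)   # C(6, 2) = 15 is not divisible by 6

def poly_pow_mod_p(coeffs, p):
    """Compute f(x)^p with coefficients reduced mod p (ascending dense list)."""
    result = [1]
    for _ in range(p):
        new = [0] * (len(result) + len(coeffs) - 1)
        for i, a in enumerate(result):
            for j, b in enumerate(coeffs):
                new[i + j] = (new[i + j] + a * b) % p
        result = new
    return result

# (x^2 + x + 1)^5 mod 5 is x^10 + x^5 + 1, as the text claims:
f5 = poly_pow_mod_p([1, 1, 1], 5)
expected = [0] * 11
expected[0] = expected[5] = expected[10] = 1
assert f5 == expected
```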
Remark: Note that we simply define a derivative this way, purely algebraically, without taking any limits: the derivative of f(x) = a_n x^n + . . . + a_1 x + a_0 is f′(x) = n a_n x^{n−1} + . . . + 2 a_2 x + a_1. Of course (!) this formula is still supposed to yield a thing with familiar properties, such as the product rule. So we've simply used our calculus experience to make a good guess.
Σ_{i+j=ℓ} (i a_i) b_j + Σ_{i+j=ℓ} a_i (j b_j)
which matches the coefficient in (f g)′. This proves the product rule.
///
A field k is called perfect if either the characteristic of k is 0, as is the case for Q, R, and C, or if for characteristic p > 0 there is a p-th root a^{1/p} in k for every a ∈ k.
Remark: By Fermat's little theorem, the finite field Z/p (for p prime) is perfect. Similarly, any finite field is perfect.
Proposition: Let f be a polynomial with coefficients in a field k, and P an irreducible polynomial with coefficients in k. If P^2 divides f then P divides gcd(f, f′). On the other hand, if k is perfect, then P^2 divides f if P divides gcd(f, f′).
15.4 Cyclotomic polynomials
Remark: Note that the analogous formula for least common multiples (namely, that lcm(x^m − 1, x^n − 1) might be x^{lcm(m,n)} − 1) would be false in general. For example, using lcm(A, B) = A · B / gcd(A, B),
lcm(x^4 − 1, x^6 − 1) = (x^4 − 1)(x^6 − 1) / gcd(x^4 − 1, x^6 − 1) = (x^4 − 1)(x^6 − 1) / (x^2 − 1)
which has degree 8, not degree lcm(4, 6) = 12.
Lemma: Let n be a positive integer not divisible by the characteristic of the field k. (This is no condition if the characteristic is 0.) Then the polynomial x^n − 1 has no repeated factors.
Proof: From above, it suffices to check that the gcd of x^n − 1 and its derivative n x^{n−1} is 1. Since the characteristic of the field does not divide n, n · 1_k has a multiplicative inverse t in k. Then, doing a division with remainder,
(x^n − 1) − (t x) · (n x^{n−1}) = −1
Thus, the gcd is 1.
///
Now suppose that n is not divisible by the characteristic of the field k, and define the n-th cyclotomic polynomial Φ_n(x) (with coefficients in k) by
Φ_1(x) = x − 1
and for n > 1, inductively,
Φ_n(x) = (x^n − 1) / (lcm of all x^d − 1 with 0 < d < n, d dividing n)
Theorem: Φ_n(x) is a monic polynomial of degree φ(n) (Euler's phi-function); its roots are exactly the elements α with α^n = 1 and α^t ≠ 1 for 0 < t < n; gcd(Φ_m(x), Φ_n(x)) = 1 for m < n; and
x^n − 1 = ∏_{1≤d≤n, d|n} Φ_d(x)
Proof: First, we really should check that the least common multiple of the x^d − 1 with d < n and d|n divides x^n − 1, so that Φ_n is a polynomial. We know that d|n (and d > 0) implies that x^d − 1 divides x^n − 1 (either by high school algebra or from the lemma above). Therefore, using the unique factorization of polynomials with coefficients in a field, it follows that the least common multiple of a collection of things each dividing x^n − 1 will also divide x^n − 1.
Next, the assertion that Φ_n is monic follows from its definition, since it is the quotient of the monic polynomial x^n − 1 by the monic lcm of polynomials.
For α in a field, x − α divides Φ_n(x) if and only if α is a root of the equation Φ_n(x) = 0, from unique factorization of polynomials in one variable with coefficients in a field. Similarly, α^t = 1 if and only if x − α divides x^t − 1. Thus, having shown that Φ_n(x) truly is a polynomial, the definition
Φ_n(x) = (x^n − 1) / (lcm of all x^d − 1 with 0 < d < n, d dividing n)
shows that Φ_n(α) = 0 implies that α^n = 1 and α^t ≠ 1 for all 0 < t < n, as claimed in the theorem.
To determine the gcd of Φ_m and Φ_n, let d = gcd(m, n). Observe that Φ_m divides x^m − 1 and Φ_n divides x^n − 1, so
gcd(Φ_m, Φ_n) divides gcd(x^m − 1, x^n − 1)
In the lemma above we computed that
gcd(x^m − 1, x^n − 1) = x^{gcd(m,n)} − 1 = x^d − 1
Since
d ≤ m < n
d is a proper divisor of n. Thus, from
Φ_n(x) = (x^n − 1) / (lcm of all x^d − 1 with 0 < d < n, d dividing n)
we see that Φ_n(x) divides (x^n − 1)/(x^d − 1). Since x^n − 1 has no repeated irreducible factors, Φ_n(x) has no factors in common with x^d − 1. Thus, in summary, the gcd of Φ_m(x) and Φ_n(x) divides x^d − 1, but Φ_n(x) has no factor in common with x^d − 1, so gcd(Φ_m, Φ_n) = 1.
Next, we use induction to prove that
x^n − 1 = ∏_{1≤d≤n, d|n} Φ_d(x)
By induction, for every proper divisor d of n,
x^d − 1 = ∏_{0<e≤d, e|d} Φ_e(x)
Since we have already shown that for m < n the gcd of Φ_m and Φ_n is 1, the least common multiple of all the x^d − 1 with d|n and d < n is
∏_{d|n, d<n} Φ_d(x)
Thus,
x^n − 1 = Φ_n(x) · ∏_{d|n, d<n} Φ_d(x)
as claimed.
The assertion about the degree of Φ_n follows from the identity proven below for Euler's phi-function:
Σ_{d|n, d>0} φ(d) = n
///
Recall that Euler's phi-function is defined by
φ(x) = number of integers ℓ with 1 ≤ ℓ ≤ x and gcd(ℓ, x) = 1
Theorem: For gcd(m, n) = 1,
φ(mn) = φ(m) · φ(n)   (weak multiplicativity)
and for p prime, φ(p^e) = (p − 1) p^{e−1}. Further,
Σ_{d|n, d>0} φ(d) = n
Proof: For the weak multiplicativity, let r and s be integers so that rm + sn = 1. Then the map given by
f : (x, y) → rmy + snx
from pairs (x, y) with 1 ≤ x ≤ m and 1 ≤ y ≤ n to integers modulo mn is a bijection. From rm + sn = 1, rm = 1 mod n, so rm is relatively prime to n, and sn = 1 mod m, so sn is relatively prime to m. Thus, rmy + snx has a common factor with m if and only if x does, and rmy + snx has a common factor with n if and only if y does. Thus, f also gives a bijection
{x : 1 ≤ x ≤ m, gcd(x, m) = 1} × {y : 1 ≤ y ≤ n, gcd(y, n) = 1} → {z : 1 ≤ z ≤ mn, gcd(z, mn) = 1}
This proves that for gcd(m, n) = 1
φ(mn) = φ(m) · φ(n)
Using unique factorization, this reduces the calculation of φ(n) to its evaluation on prime powers p^e (p prime). This is easy, as an integer x in the range 1 ≤ x < p^e is relatively prime to p^e if and only if it is not divisible by p, so there are
φ(p^e) = p^e − p^{e−1} = (p − 1) p^{e−1}
such x, as claimed.
To obtain the formula
Σ_{d|n, d>0} φ(d) = n
first note that for a prime power p^e
Σ_{0≤k≤e} φ(p^k) = 1 + Σ_{1≤k≤e} (p − 1) p^{k−1} = p^e
Then use the weak multiplicativity and unique factorization of divisors into their prime power factors. Let n = p_1^{e_1} · · · p_t^{e_t} be the prime factorization of n into powers of distinct primes p_i. We have
Σ_{d|n} φ(d) = ∏_{i=1,...,t} ( Σ_{d | p_i^{e_i}} φ(d) ) = ∏_{i=1,...,t} p_i^{e_i} = n
///
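Euler's phi-function and the divisor-sum identity just proven can be checked directly; an illustrative Python sketch (not from the text):

```python
def euler_phi(n):
    """Euler's phi-function, via the prime factorization of n."""
    result, m, p = 1, n, 2
    while p * p <= m:
        if m % p == 0:
            e = 0
            while m % p == 0:
                m //= p
                e += 1
            result *= (p - 1) * p ** (e - 1)
        p += 1
    if m > 1:                    # leftover prime factor
        result *= m - 1
    return result

assert euler_phi(12) == 4
# The divisor-sum identity proven above: the sum of phi(d) over d | n is n.
for n in range(1, 50):
    assert sum(euler_phi(d) for d in range(1, n + 1) if n % d == 0) == n
```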
Proof: The multiplicative group k^× of a finite field k with q elements has q − 1 elements, and
x^{q−1} − 1 = ∏_{d|q−1} Φ_d(x)
Since x^{q−1} − 1 has q − 1 roots in k, and since the Φ_d's here are relatively prime to each other, each Φ_d with d|q − 1 must have a number of roots (in k) equal to its degree. Thus, Φ_d for d|q − 1 has φ(d) > 0 roots in k (Euler's phi-function).
Finally, the roots of Φ_{q−1}(x) are those field elements b so that b^{q−1} = 1 and no smaller positive power than q − 1 has this property. The primitive roots are exactly the roots of Φ_{q−1}(x). The cyclotomic polynomial Φ_{q−1} has φ(q − 1) roots. Therefore, there are φ(q − 1) > 0 primitive roots. Thus, the group k^× has a generator. That is, the group k^× is cyclic.
///
Theorem: Let k be the finite field Z/p with p prime. Then the multiplicative group (Z/p)^× is cyclic.
15.7 Primitive roots in Z/p^e
Proposition: Let p be an odd prime, k ≥ 1, and x an integer with p ∤ x. Then for e ≥ k the multiplicative order of 1 + p^k x modulo p^e is p^{e−k}.
Proof: (of proposition) The main trick here is that the prime p divides the binomial coefficients
C(p, 1), C(p, 2), . . . , C(p, p−2), C(p, p−1)
Thus,
(1 + p^k x)^p = 1 + C(p, 1) p^k x + C(p, 2) p^{2k} x^2 + . . . + p^{pk} x^p = 1 + p^{k+1} · (x + multiple of p)
Repeating this ℓ times,
(1 + p^k x)^{p^ℓ} = 1 + p^{k+ℓ} y
with y = x mod p, so this is not 1 mod p^e unless k + ℓ ≥ e. (And if k + ℓ ≥ e it is 1 mod p^e.) Thus,
(multiplicative) order of 1 + p^k x mod p^e is p^{e−k}
This proves the proposition.
///
Proof: (of theorem) The assertion of the corollary is stronger than the theorem, so it certainly suffices to prove the more specific assertion of the corollary in order to prove the theorem.
Before the most serious part of the proof, let's see why an integer g which is a primitive root for Z/p^e will also be a primitive root for Z/2p^e. The main point is that for an odd prime p
φ(2p^e) = (2 − 1) · (p − 1) p^{e−1} = (p − 1) p^{e−1} = φ(p^e)
Let g be a primitive root modulo p^e. Then ℓ = φ(p^e) is the smallest exponent so that g^ℓ = 1 mod p^e. Thus, surely there is no smaller exponent ℓ so that g^ℓ = 1 mod 2p^e, since p^e | 2p^e. Therefore, a primitive root mod p^e also serves as a primitive root modulo 2p^e.
Now the central case, that of primitive roots for Z/p^e. That is, we want to show that the multiplicative group (Z/p^e)^× is of the form ⟨g⟩ for some g. Let g_1 be a primitive root mod p, which we already know exists for other reasons. The plan is to adjust g_1 suitably to obtain a primitive root mod p^e. It turns out that at most a single adjustment is necessary altogether.
If (by good luck?)
g_1^{p−1} = 1 + p x
with p ∤ x, then let's show that g_1 is already a primitive root mod p^e for any e ≥ 1. By Lagrange's theorem, the order of g_1 in (Z/p^e)^× is a divisor of φ(p^e) = (p − 1) p^{e−1}. Since p − 1 is the smallest positive exponent ℓ so that g_1^ℓ = 1 mod p, p − 1 divides the order of g_1 in (Z/p^e)^× (from our discussion of cyclic subgroups). Thus, the order of g_1 is in the list
p − 1, (p − 1) p, (p − 1) p^2, . . . , (p − 1) p^{e−1}
Thus, the question is to find the smallest ℓ so that
g_1^{(p−1) p^ℓ} = 1 mod p^e
We are assuming that
g_1^{p−1} = 1 + p x
with p ∤ x, so the question is to find the smallest ℓ so that
(1 + p x)^{p^ℓ} = 1 mod p^e
From the proposition, the smallest such ℓ is ℓ = e − 1. That is, we have proven that g_1 is a primitive root mod p^e for every e ≥ 1.
Now suppose that
g_1^{p−1} = 1 + p x
with p | x. Then consider
g = (1 + p) g_1
Certainly g is still a primitive root mod p, because g = g_1 mod p. And we compute
(1 + p)^{p−1} = 1 + C(p−1, 1) p + C(p−1, 2) p^2 + . . . + p^{p−1} = 1 + p y
where
y = C(p−1, 1) + C(p−1, 2) p + C(p−1, 3) p^2 + . . .
Since
C(p−1, 1) = p − 1
we see that
y = p − 1 mod p
so p ∤ y. Thus,
g^{p−1} = ((1 + p) g_1)^{p−1} = (1 + p y)(1 + p x) = 1 + p (y + x + p x y)
Since p | x, we have
y + x + p x y = y mod p
so p does not divide y + x + pxy, and by the first part of the argument g is a primitive root mod p^e for every e ≥ 1. ///
Corollary: (of proof) In fact, for an integer g which is a primitive root mod p, either g is a primitive root mod p^e and mod 2p^e for all e ≥ 1, or else (1 + p) g is. In particular, if g^{p−1} ≠ 1 mod p^2, then g is a primitive root mod p^e and mod 2p^e for all e ≥ 1. Otherwise, (1 + p) g is.
15.8 Counting primitive roots
Theorem: If Z/n has a primitive root, then there are exactly φ(φ(n)) primitive roots modulo n.
Proof: The hypothesis that Z/n has a primitive root is that the multiplicative group (Z/n)^× is cyclic. That is, for some element g (the primitive root)
(Z/n)^× = ⟨g⟩
Of course, the order |g| of g must be the order φ(n) of (Z/n)^×. From the general discussion of cyclic subgroups, we know that
g^0, g^1, g^2, g^3, . . . , g^{φ(n)−1}
is a complete list of all the different elements of ⟨g⟩. And from general properties of cyclic groups
order of g^k = (order of g) / gcd(k, |g|)
So the generators for ⟨g⟩ are exactly the elements
g^k with 1 ≤ k < |g| and k relatively prime to |g|
By definition of Euler's φ-function, there are φ(|g|) of these. Thus, since |g| = φ(n), there are φ(φ(n)) primitive roots.
///
Corollary: For an odd prime p and e ≥ 2, the fraction φ(p − 1)/p of the elements of (Z/p^e)^× consists of primitive roots.
Proof: From the theorem just proven the ratio of primitive roots to all elements is
φ(φ(p^e)) / φ(p^e) = ( φ(p − 1) · (p − 1) p^{e−2} ) / ( (p − 1) p^{e−1} ) = φ(p − 1)/p
as claimed.
///
15.9 Non-existence of primitive roots
Theorem: Primitive roots modulo n exist only for n = 1, 2, 4, p^e, and 2p^e, with p an odd prime.
Proof: First, let's look at Z/2^e with e ≥ 3. Any b ∈ (Z/2^e)^× can be written as b = 1 + 2x for integer x. Then
(1 + 2x)^2 = 1 + 4x + 4x^2 = 1 + 4x(x + 1)
The peculiar feature here is that for any integer x, the expression x(x + 1) is divisible by 2. Indeed, if x is even surely x(x + 1) is even, and if x is odd then x + 1 is even and x(x + 1) is again even. Thus,
(1 + 2x)^2 = 1 mod 8
(rather than merely modulo 4). And from the pattern
(1 + 2^k x)^2 = 1 + 2^{k+1} x + 2^{2k} x^2
we can prove by induction that
(1 + 8x)^{2^{e−3}} = 1 mod 2^e
Thus, since b^2 is of the form 1 + 8x,
b^{2^{e−2}} = 1 mod 2^e
But 2^{e−2} < 2^{e−1} = φ(2^e). That is, there cannot be a primitive root modulo 2^e with e > 2.
Now consider n not a power of 2. Then write n = p^e m with p an odd prime not dividing m. By Euler's theorem, we know that for b relatively prime to n
b^{φ(p^e)} = 1 mod p^e
b^{φ(m)} = 1 mod m
Let M = lcm(φ(p^e), φ(m)). Then
b^M = (b^{φ(p^e)})^{M/φ(p^e)} = 1^{M/φ(p^e)} = 1 mod p^e
and
b^M = (b^{φ(m)})^{M/φ(m)} = 1^{M/φ(m)} = 1 mod m
Thus, certainly
b^M = 1 mod p^e m
But a primitive root g would have the property that no smaller exponent ℓ than φ(p^e m) has the property that g^ℓ = 1 mod p^e m. Therefore, unless gcd(φ(p^e), φ(m)) = 1, we'll have
M = lcm(φ(p^e), φ(m)) < φ(p^e) · φ(m) = φ(p^e m)
which would deny the possibility that there is a primitive root.
Thus, we need φ(m) relatively prime to φ(p^e) = (p − 1) p^{e−1}. Since p − 1 is even, this means that φ(m) must be odd. If an odd prime q divides m, then q − 1 divides φ(m), which would make φ(m) even, which is impossible. Thus, no odd prime can divide m. Further, if any power of 2 greater than just 2 itself divides m, again φ(m) would be even, and no primitive root could exist.
Thus, except for the cases where we've already proven that a primitive root does exist, there is no primitive root mod n.
///
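The classification just proven can be double-checked by brute force for small moduli. The following sketch (function names are my own) computes the largest multiplicative order among the units of Z/n and records which n possess a primitive root:

```python
from math import gcd

def max_unit_order(n):
    """Return (phi(n), largest multiplicative order among units of Z/n)."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]

    def order(a):
        k, x = 1, a
        while x != 1:
            x = x * a % n
            k += 1
        return k

    return len(units), max(order(a) for a in units)

# n has a primitive root exactly when some unit has order phi(n):
have_root = []
for n in range(2, 50):
    phi, biggest = max_unit_order(n)
    if biggest == phi:
        have_root.append(n)
# have_root consists of 2, 4, odd prime powers, and twice odd prime powers.
```

Running this confirms, for instance, that 9, 18, and 49 have primitive roots while 8, 12, and 15 do not.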
Proof: If b is a primitive root, certainly the conditions of the lemma are met.
On the other hand, suppose that the conditions of the lemma are fulfilled for a
particular b. Let q^e be the exact power of q dividing p - 1, and let t be the order of
b in (Z/p)×. Then t | p - 1 by Fermat's Little Theorem. If q^e did not divide t, then still
t would divide (p - 1)/q. But by hypothesis t does not divide (p - 1)/q. Therefore,
q^e | t. Since this is true for every prime q dividing p - 1, the least common multiple
m of all these prime powers also divides t, by unique factorization of integers. Of
course, the least common multiple of all the maximal prime powers dividing a number
is that number itself. Thus, m = p - 1, and p - 1 divides t. Since t divides p - 1,
this gives t = p - 1. That is, b is a primitive root modulo p.
///
Remark: Note that the number of primes dividing p - 1 is well below log2(p).
Remark: And recall that the number of primitive roots modulo p (for p prime)
is φ(p - 1), which is typically greater than (p - 1)/4. Thus, choosing primitive
root candidates at random has roughly a 1/4 chance of success. Thus, by a typical
expected value computation, as a heuristic, we should usually expect to find a
primitive root after about 4 tries.
The algorithm to find a primitive root b modulo a prime p, using knowledge
of the factorization of p 1, and to verify that b really is a primitive root, is as
follows:
Pick a random b.
For each prime q dividing p - 1, compute b^((p-1)/q) mod p.
If any of these values is 1 mod p, reject b and try a different random candidate.
Else if none of these values is 1 mod p, then b is a primitive root modulo p.
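This procedure is easy to render in code. A minimal sketch (helper names are mine; for brevity it scans small candidates in order rather than picking at random):

```python
from math import gcd

def prime_factors(n):
    """Distinct prime factors of n by trial division (fine for small n)."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def is_primitive_root(b, p):
    """Test b by the criterion above: b^((p-1)/q) != 1 mod p
    for every prime q dividing p - 1."""
    return all(pow(b, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))

def find_primitive_root(p):
    """Deterministic stand-in for the random search in the text."""
    for b in range(2, p):
        if is_primitive_root(b, p):
            return b
```

For example, `find_primitive_root(7)` rejects 2 (since 2^3 = 1 mod 7) and returns 3.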
Remark: Again, roughly a quarter or more of the elements of Z/p are primitive
roots, so random guessing will succeed in finding a primitive root very quickly. And
the lemma above justifies the fairly efficient procedure to verify whether or not a
given candidate is a primitive root. And of course we take for granted that we use
an efficient exponentiation algorithm.
Remark: Very often 2 or 3 is a primitive root. For example, among the 168
primes under 1000, only 60 have the property that neither 2 nor 3 is a primitive
root. Among the 168 primes under 1000, only the 7 moduli 191, 311, 409, 439, 457,
479, and 911 have the property that none of 2,3,5,6,7,10,11 is a primitive root.
Exercises
15.01 Find a prime p > 2 such that 2 is not a primitive root modulo p.
15.02 Find all the primitive roots in Z/17. (ans.)
15.03 Find all the primitive roots in Z/19.
15.04 Find any repeated factors of x^4 + x^2 + 1 in F2[x]. (ans.)
15.05 Find any repeated factors of x^6 + x^4 + x^2 + 1 in F2[x].
15.06 Determine the cyclotomic polynomials ϕ_2, ϕ_3, ϕ_4, ϕ_5, ϕ_6. (ans.)
15.07 Determine the cyclotomic polynomials ϕ_8, ϕ_9, ϕ_12.
15.08 Use a bit of cleverness to avoid working too much, and determine the cyclotomic polynomials ϕ_14, ϕ_16, ϕ_18. (ans.)
15.09 Use a bit of cleverness to avoid working too much, and determine the cyclotomic polynomials ϕ_20, ϕ_24, ϕ_25.
15.10 Use some cleverness as well as perseverance to determine the cyclotomic
polynomials ϕ_15, ϕ_21.
15.11 Find a primitive root in F2[x] modulo x^2 + x + 1. (ans.)
15.12 Find a primitive root in F2[x] modulo x^3 + x + 1. (ans.)
15.13 Find a primitive root in F2[x] modulo x^3 + x^2 + 1.
15.14 Find a primitive root in F2[x] modulo x^4 + x + 1.
15.15 (*) Find a cyclotomic polynomial that has coefficients other than 0, +1, -1.
(ans.)
16
Primitive Polynomials
16.1
16.2 Examples mod 2
16.3
16.4 Periods of LFSRs
16.5
x^(q^N - 1) = 1 mod P
and
x^ℓ ≠ 1 mod P
for 0 < ℓ < q^N - 1.
Proof: The only thing to worry about is that Fq [x]/P should be a field. This
requires exactly that P be irreducible.
///
Theorem: An irreducible polynomial P of degree N in Fq[x] is primitive if and
only if P divides the (q^N - 1)th cyclotomic polynomial in Fq[x].
Proof: On one hand, suppose that P is primitive. By definition, this means that
x^(q^N - 1) = 1 mod P
but that no smaller positive exponent will do. That is, P divides x^(q^N - 1) - 1 but
not x^M - 1 for any positive M smaller than q^N - 1. We have seen that
ϕ_ℓ(x) = (x^ℓ - 1)/(lcm of all x^t - 1 with t | ℓ and t < ℓ)
so, by unique factorization, P divides the (q^N - 1)th cyclotomic polynomial.
On the other hand, suppose that P divides the (q^N - 1)th cyclotomic polynomial.
Since the latter divides x^(q^N - 1) - 1, certainly
x^(q^N - 1) = 1 mod P
That is, P divides x^(q^N - 1) - 1. We must prove that P does not divide x^t - 1 for
any divisor t of q^N - 1 smaller than q^N - 1 itself. Note that the characteristic of the
field (the prime dividing the prime power q) certainly does not divide q^N - 1. Thus,
x^(q^N - 1) - 1 has no repeated roots. Thus, again using the definition of the cyclotomic
polynomial quoted just above, and again using unique factorization, P(x) has no
common factors with x^t - 1 for divisors t of q^N - 1 with t < q^N - 1. In particular,
P does not divide x^t - 1 for any divisor t of q^N - 1 with 0 < t < q^N - 1. Thus,
x is of order exactly q^N - 1 modulo P, rather than of order any proper divisor of
q^N - 1. That is, P is primitive.
///
///
Remark: In fact, a little further preparation would allow proof that for N =
2, 3, 4, 5, 6, ... the irreducible factors of the (q^N - 1)th cyclotomic polynomial in
Fq[x] are all of degree exactly N.
The order of a polynomial Q modulo P is the smallest positive integer M so
that
Q^M = 1 mod P
Thus, paraphrasing the previous proposition, an irreducible polynomial P of degree
N in Fq[x] is primitive if and only if the order of x mod P is q^N - 1. In any case,
by Lagrange's theorem, the order of Q mod P is always a divisor of q^N - 1.
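Computing such orders is mechanical. In the sketch below (the encoding is a convention of mine, not the book's), a polynomial in F2[x] is a bitmask whose bit i is the coefficient of x^i, and the order is found by repeated multiplication:

```python
def polymulmod(a, b, p):
    """Multiply two F2[x] polynomials (bitmask: bit i = coeff of x^i), mod p."""
    deg = p.bit_length() - 1
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> deg) & 1:
            a ^= p       # reduce when the degree reaches deg(p)
    return r

def order_mod(q, p):
    """Multiplicative order of q modulo p in F2[x]/p (p assumed irreducible)."""
    acc, n = q, 1
    while acc != 1:
        acc = polymulmod(acc, q, p)
        n += 1
    return n

# x has order 3 = 2^2 - 1 mod x^2 + x + 1 (0b111), so that polynomial is primitive;
# x has order 5 mod x^4 + x^3 + x^2 + x + 1 (0b11111), so that one is not.
```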
This directly shows that the linear polynomial x + 1 is primitive, while x is not.
A quadratic polynomial in F2[x] is primitive, by definition, if it is irreducible
and divides the (2^2 - 1)th cyclotomic polynomial, which is
ϕ_3 = (x^3 - 1)/(x - 1) = x^2 + x + 1
which is by coincidence quadratic itself. It's easy to check (by trial division) that
x^2 + x + 1 is irreducible, so x^2 + x + 1 is the only primitive quadratic polynomial
mod 2.
A cubic polynomial in F2[x] is primitive, by definition, if it is irreducible and
divides the (2^3 - 1)th cyclotomic polynomial, which is
ϕ_7 = (x^7 - 1)/(x - 1) = x^6 + x^5 + x^4 + x^3 + x^2 + x + 1
From earlier (trial division), we know that the irreducible cubics mod 2 are exactly
x^3 + x^2 + 1 and x^3 + x + 1
and since
(x^3 + x^2 + 1)(x^3 + x + 1) = x^6 + x^5 + x^4 + x^3 + x^2 + x + 1 = ϕ_7
both irreducible cubics are primitive.
A quartic polynomial in F2[x] is primitive, by definition, if it is irreducible and
divides the (2^4 - 1)th cyclotomic polynomial, which is
ϕ_15 = (x^15 - 1)/(ϕ_1(x) ϕ_3(x) ϕ_5(x)) = (x^15 - 1)/(ϕ_3(x) · (x^5 - 1))
     = (x^15 - 1)/((x^2 + x + 1)(x^5 - 1)) = x^8 + x^7 + x^5 + x^4 + x^3 + x + 1
From earlier (trial division), we know that the irreducible quartics mod 2 are exactly
the 3
x^4 + x^3 + x^2 + x + 1, x^4 + x^3 + 1, x^4 + x + 1
If we make a lucky (?!) guess that the second two are the primitive ones (since they
are somewhat related to each other) then we check:
(x^4 + x^3 + 1)(x^4 + x + 1) = x^8 + x^7 + x^5 + x^4 + x^3 + x + 1
Yes, these two are the primitive quartics mod 2. So 2 out of 3 irreducible quartics
are primitive.
Note that we might have recognized that x^4 + x^3 + x^2 + x + 1 is not primitive
by the fact that it is in fact exactly ϕ_5. That implies that every root of it has
order 5 rather than 15.
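The divisibility claims here can be checked by machine. A minimal sketch of F2[x] division, with the same bitmask encoding as before (bit i = coefficient of x^i, a convention of mine):

```python
def poly_divmod2(num, den):
    """Divide F2[x] polynomials encoded as bitmasks; return (quotient, remainder)."""
    q = 0
    dd = den.bit_length() - 1
    while num and num.bit_length() - 1 >= dd:
        shift = (num.bit_length() - 1) - dd
        q |= 1 << shift
        num ^= den << shift       # subtract (= XOR in characteristic 2)
    return q, num

# phi_15 = x^8 + x^7 + x^5 + x^4 + x^3 + x + 1  ->  bits 8,7,5,4,3,1,0
phi15 = 0b110111011
```

Dividing phi15 by x^4 + x + 1 (0b10011) leaves remainder 0 with quotient x^4 + x^3 + 1, while dividing by x^4 + x^3 + x^2 + x + 1 (0b11111) leaves a non-zero remainder, matching the discussion above.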
A quintic polynomial in F2[x] is primitive, by definition, if it is irreducible and
divides the (2^5 - 1)th cyclotomic polynomial, which is
ϕ_31 = (x^31 - 1)/(x - 1)
This is of degree 30, so if we imagine that it's exactly the product of the primitive
quintics, then there should be 6 of them.
But in fact we already saw (by trial division) that there are exactly 6 irreducible
quintics,
x^5 + x^3 + x^2 + x + 1
x^5 + x^4 + x^2 + x + 1
x^5 + x^4 + x^3 + x + 1
x^5 + x^4 + x^3 + x^2 + 1
x^5 + x^3 + 1
x^5 + x^2 + 1
A person who cared enough could really check that the product of these is ϕ_31.
A sextic polynomial in F2[x] is primitive, by definition, if it is irreducible and
divides the
(2^6 - 1)th = 63rd = (3 · 3 · 7)th
cyclotomic polynomial, which is of degree
φ(2^6 - 1) = φ(3 · 3 · 7) = (3 - 1) · 3 · (7 - 1) = 36
So if we imagine that it's exactly the product of the primitive sextics, then there
should be 36/6 = 6 of them.
How many irreducible sextics are there mod 2? It turns out (as we will see
later in our discussion of the Frobenius automorphism and other further structure
of finite fields) that there are
(2^6 - 2^3 - 2^2 + 2^1)/6 = 54/6 = 9
So three irreducible sextics are not primitive. Which three?
A septic polynomial in F2[x] is primitive, by definition, if it is irreducible and
divides the
(2^7 - 1)th = 127th
cyclotomic polynomial, which is of degree
φ(2^7 - 1) = 127 - 1 = 2 · 3 · 3 · 7
(since 127 is prime). So if we imagine that it's exactly the product of the primitive
septics, then there should be 2 · 3 · 3 = 18 of them.
How many irreducible septics are there mod 2? It turns out that there are
(2^7 - 2^1)/7 = 126/7 = 18
16.3
To test the nonic P = 1 + x^4 + x^9 for primitivity (note that 2^9 - 1 = 511 = 7 · 73),
we compute x^73 mod 1 + x^4 + x^9 using the Fast Modular Exponentiation Algorithm,
displaying the values of (X, E, Y) at successive steps in the execution, where a
polynomial is abbreviated by the list of exponents of its non-zero terms:
[1]           73   [0]
[1]           72   [1]
[2]           36   [1]
[4]           18   [1]
[8]            9   [1]
[8]            8   [0, 4]
[2, 6, 7]      4   [0, 4]
[0, 3, 5, 7]   2   [0, 4]
[1, 4, 6]      1   [0, 4]
[1, 4, 6]      0   [4, 6, 8]
Thus,
x^73 = x^4 + x^6 + x^8 mod 1 + x^4 + x^9
which is not 1 mod 1 + x^4 + x^9.
Just to check, let's verify that x^511 = 1 mod 1 + x^4 + x^9. Of course, taking
advantage of the power-of-2 situation, it would be smarter to verify that x^512 =
x mod 1 + x^4 + x^9. We again use the Fast Modular Exponentiation Algorithm,
displaying the values of (X, E, Y) at successive steps in the execution:
[1]           512   [0]
[2]           256   [0]
[4]           128   [0]
[8]            64   [0]
[2, 6, 7]      32   [0]
[0, 3, 5, 7]   16   [0]
[1, 4, 6]       8   [0]
[2, 3, 7, 8]    4   [0]
[0, 2, 5, 7]    2   [0]
[1]             1   [0]
[1]             0   [1]
So x^512 = x mod 1 + x^4 + x^9, as it should be.
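The tables above can be reproduced by a short square-and-multiply routine over F2[x], again encoding a polynomial as a bitmask (bit i = coefficient of x^i; the function names are mine):

```python
def polymulmod2(a, b, p):
    """Multiply bitmask-encoded F2[x] polynomials modulo p."""
    deg = p.bit_length() - 1
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> deg) & 1:
            a ^= p
    return r

def polypowmod2(x, e, p):
    """Square-and-multiply: x^e in F2[x]/p, the algorithm traced in the text."""
    acc = 1                     # Y, the accumulated answer
    while e:
        if e & 1:
            acc = polymulmod2(acc, x, p)
        e >>= 1
        x = polymulmod2(x, x, p)   # X is squared at each halving of E
    return acc

P = 0b1000010001                # 1 + x^4 + x^9
```

With this, x^73 mod 1 + x^4 + x^9 comes out to x^4 + x^6 + x^8, and x^512 reduces to x, matching the tables.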
Had it happened that
x^512 ≠ x mod P
then we would know that the nonic P was reducible. (Why?) For example, with
P(x) = 1 + x^9 = (1 + x^3)(1 + x^3 + x^6) = (1 + x)(1 + x + x^2)(1 + x^3 + x^6)
we compute x^512 mod P (here there is an unusual shortcut):
x^512 = x^(9·56+8) = (x^9)^56 · x^8 = 1 · x^8 = x^8 mod x^9 + 1
which is not x mod x^9 + 1, so we have proven indirectly that x^9 + 1 is not irreducible.
On the other hand, let's look at the second nonic mentioned above, 1 + x + x^9.
Compute x^73 mod 1 + x + x^9 by fast exponentiation, with the abbreviation used
above:
[1]                        73   [0]
[1]                        72   [1]
[2]                        36   [1]
[4]                        18   [1]
[8]                         9   [1]
[8]                         8   [0, 1]
[7, 8]                      4   [0, 1]
[5, 6, 7, 8]                2   [0, 1]
[1, 2, 3, 4, 5, 6, 7, 8]    1   [0, 1]
[1, 2, 3, 4, 5, 6, 7, 8]    0   [0]
That is,
x^73 = 1 mod 1 + x + x^9
which proves that 1 + x + x^9 is not primitive.
Just to check, let's compute x^512 mod 1 + x + x^9:
[1]                        512   [0]
[2]                        256   [0]
[4]                        128   [0]
[8]                         64   [0]
[7, 8]                      32   [0]
[5, 6, 7, 8]                16   [0]
[1, 2, 3, 4, 5, 6, 7, 8]     8   [0]
[1, 3, 5, 7]                 4   [0]
[1, 5]                       2   [0]
[1]                          1   [0]
[1]                          0   [1]
So x^512 = x mod 1 + x + x^9, as it should be for an irreducible (though not
primitive) nonic.
16.4 Periods of LFSRs
For example, with size N = 2, modulus 2, coefficients c = (1, 1), and seed
s = (s0, s1) = (0, 1) we have
s0 = 0
s1 = 1
s2 = s1 + s0 = 1 + 0 = 1
s3 = s2 + s1 = 1 + 1 = 0
s4 = s3 + s2 = 0 + 1 = 1
s5 = s4 + s3 = 1 + 0 = 1
s6 = s5 + s4 = 1 + 1 = 0
...
In this example it is apparent (and can be proven by induction) that the pattern
repeats in blocks of 3 bits, with each block being 0, 1, 1.
As another example, with size N = 3, modulus 2, coefficients c = (1, 1, 1), and
seed s = (s0 , s1 , s2 ) = (0, 0, 1) we have
s0 = 0
s1 = 0
s2 = 1
s3 = s2 + s1 + s0 = 1 + 0 + 0 = 1
s4 = s3 + s2 + s1 = 1 + 1 + 0 = 0
s5 = s4 + s3 + s2 = 0 + 1 + 1 = 0
s6 = s5 + s4 + s3 = 0 + 0 + 1 = 1
s7 = s6 + s5 + s4 = 1 + 0 + 0 = 1
...
and the pattern repeats in blocks of 0011. With the seed 101 and the same coefficients we have
s0 = 1
s1 = 0
s2 = 1
s3 = s2 + s1 + s0 = 1 + 0 + 1 = 0
s4 = s3 + s2 + s1 = 0 + 1 + 0 = 1
s5 = s4 + s3 + s2 = 1 + 0 + 1 = 0
s6 = s5 + s4 + s3 = 0 + 1 + 0 = 1
s7 = s6 + s5 + s4 = 1 + 0 + 1 = 0
...
so the pattern repeats in blocks of 2 bits. As another example, with size N = 3,
coefficients c = (1, 0, 1), and seed s = (s0, s1, s2) = (0, 0, 1) we have
s0 = 0
s1 = 0
s2 = 1
s3 = s2 + s0 = 1 + 0 = 1
s4 = s3 + s1 = 1 + 0 = 1
s5 = s4 + s2 = 1 + 1 = 0
s6 = s5 + s3 = 0 + 1 = 1
s7 = s6 + s4 = 1 + 1 = 0
s8 = s7 + s5 = 0 + 0 = 0
s9 = s8 + s6 = 0 + 1 = 1
and thereafter the pattern repeats. Note that with size N = 3 this example repeats
in blocks of 7, by contrast to the previous example.
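These keystream computations are easy to script. A sketch (names mine), with coefficients c = (c0, ..., c_(N-1)) acting as s_n = c0·s_(n-1) + ... + c_(N-1)·s_(n-N) mod 2:

```python
def lfsr_stream(coeffs, seed, steps):
    """Generate `steps` bits of an LFSR keystream over F2.
    coeffs = (c0, ..., c_{N-1}); seed = (s0, ..., s_{N-1})."""
    state = list(seed)
    out = list(seed)
    for _ in range(steps - len(seed)):
        # reversed(state) pairs c0 with the most recent bit, c_{N-1} with the oldest
        nxt = sum(c * s for c, s in zip(coeffs, reversed(state))) % 2
        out.append(nxt)
        state = state[1:] + [nxt]
    return out
```

For instance, `lfsr_stream((1, 1), (0, 1), 9)` reproduces the period-3 stream 0,1,1,0,1,1,... above, and `lfsr_stream((1, 0, 1), (0, 0, 1), 10)` reproduces the period-7 example.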
The kind of recursive definition used to define the keystream here can be written in terms of matrices. For simplicity, let's just suppose that N = 4. From
coefficients c = (c0, c1, c2, c3), we make a matrix
    [ c0  c1  c2  c3 ]
C = [ 1   0   0   0  ]
    [ 0   1   0   0  ]
    [ 0   0   1   0  ]
so that
[ s_{n+1} ]     [ s_n     ]
[ s_n     ] = C [ s_{n-1} ]
[ s_{n-1} ]     [ s_{n-2} ]
[ s_{n-2} ]     [ s_{n-3} ]
(all modulo m)
More generally, for arbitrary N, from coefficients c = (c0, c1, ..., c_{N-1}) we make
the N-by-N matrix
    [ c0  c1  c2  ...  c_{N-2}  c_{N-1} ]
    [ 1   0   0   ...  0        0       ]
L = [ 0   1   0   ...  0        0       ]
    [ ...                               ]
    [ 0   0   ...  1   0        0       ]
    [ 0   0   ...  0   1        0       ]
That is, the top row consists of the coefficients (note the ordering of them!) and
everything else is 0 except for the 1s on the subdiagonal. Then
[ s_{n+1}     ]     [ s_n         ]
[ s_n         ]     [ s_{n-1}     ]
[ ...         ] = L [ ...         ]
[ s_{n-(N-3)} ]     [ s_{n-(N-2)} ]
[ s_{n-(N-2)} ]     [ s_{n-(N-1)} ]
Suppose that the initial state vector can be expressed as a linear combination
( s_{N-1}, s_{N-2}, ..., s1, s0 )^T = a1 v1 + ... + aN vN
of eigenvectors vi of L, with eigenvalues λi. Then, applying L k times,
L^k ( s_{N-1}, s_{N-2}, ..., s1, s0 )^T = λ1^k a1 v1 + ... + λN^k aN vN
If you understand how determinants work, then you can see that the characteristic polynomial of a matrix of the special form that L has is just
P_L(x) = x^N - c0 x^(N-1) - c1 x^(N-2) - c2 x^(N-3) - ... - c_{N-1}
But we cannot expect to solve high-degree polynomial equations explicitly.
Instead of being so explicit, we could demand that all the eigenvalues λi have
the largest possible order(s). That is, the smallest positive integer ℓ so that λi^ℓ = 1
is as large as possible.
Remark: A possible problem here is that if we've only discussed the finite fields
Fp and no others, then we don't know where to look to find these eigenvalues, since
we probably cannot solve the characteristic equation in Fp! We need to understand
finite fields more generally to understand what's going on with LFSRs, even though
we don't mention finite fields in the definition of LFSR! But let's not worry too
much.
Assuming (as we did) that P_L is irreducible, these eigenvalues lie in the finite
field F_{p^N} (since P_L is of degree N). In the discussion of primitive roots, we actually
showed that the multiplicative group of any finite field is cyclic, so the multiplicative
group of F_{p^N} is cyclic, of order p^N - 1. That is, every non-zero element of F_{p^N} satisfies
x^(p^N - 1) - 1 = 0
But we want to exclude elements with smaller orders (so, by Lagrange's theorem,
elements having as order a proper divisor of p^N - 1). That is, we want to look at the polynomial
that's left after removing from x^(p^N - 1) - 1 all its common factors with the polynomials x^d - 1 where d is a proper divisor of p^N - 1. From the discussion of cyclotomic
polynomials, what remains after such common factors are removed is exactly the
(p^N - 1)th cyclotomic polynomial.
Therefore the hypothesis that P_L is irreducible and divides the (p^N - 1)th
cyclotomic polynomial assures that the order of each eigenvalue λi is p^N - 1.
Then look at the periodicity condition
λ1^(i+ℓ) a1 v1 + ... + λN^(i+ℓ) aN vN = λ1^i a1 v1 + ... + λN^i aN vN
This simplifies to
(λ1^(i+ℓ) - λ1^i) a1 v1 + ... + (λN^(i+ℓ) - λN^i) aN vN = 0
or
(λ1^ℓ - 1) λ1^i a1 v1 + ... + (λN^ℓ - 1) λN^i aN vN = 0
Certainly if p^N - 1 divides ℓ then the quantities in parentheses are all 0, so the sum
is 0.
To prove the other half, that this vector sum being zero implies that every
λi^ℓ - 1 = 0, we need a bit more information about eigenvectors. First, in this
situation we know that P_L has distinct roots, since P_L is a divisor of the (p^N - 1)th
cyclotomic polynomial, which we know to have distinct roots. Now we claim that
for an N-by-N matrix M with distinct eigenvalues λ1, ..., λN, any relation
a1 v1 + ... + aN vN = 0
among corresponding eigenvectors vi must have all coefficients ai equal to 0. To
prove this, make the clever hypothesis that we have such a relation, and that among
all such relations it has the fewest non-zero ai's. Apply M to both sides of that
relation, obtaining
λ1 a1 v1 + ... + λN aN vN = 0
Multiplying the first relation by λj and subtracting from the second gives a vector
relation
(λ1 - λj) a1 v1 + ... + (λN - λj) aN vN = 0
This has the effect of getting rid of the jth term. Note that since the eigenvalues
are all distinct, none of the quantities λi - λj is 0 except for i = j. Thus, we can
obtain a relation with fewer non-zero coefficients by using this trick to kill off some
non-zero coefficient. Contradiction.
Thus, in the case at hand, the relation
(λ1^ℓ - 1) λ1^i a1 v1 + ... + (λN^ℓ - 1) λN^i aN vN = 0
forces each coefficient (λi^ℓ - 1) λi^i ai to be 0. Since λi^(p^N - 1) = 1, λi is non-zero itself,
so (for a seed making all the ai non-zero) this requires λi^ℓ = 1 for every i.
Since every λi has order p^N - 1, this holds if and only if p^N - 1 divides ℓ. That is,
we've proven that the order of such an LFSR is p^N - 1.
///
Remark: We did not prove above that every vector can be expressed as a linear
combination
v = a1 v1 + . . . + aN vN
of eigenvectors vi . Since the eigenvalues are all different from each other, this
is true. We proved above that eigenvectors attached to different eigenvalues are
linearly independent. On the other hand, N linearly independent vectors in an
N -dimensional space implies that these vectors are a basis. (See the appendix on
linear algebra.)
Exercises
16.01 Find the (multiplicative) order of x mod x^3 + x + 1 with coefficients in Z/2.
(ans.)
16.02 Find the (multiplicative) order of x + 1 mod x^3 + x + 1 with coefficients in
Z/2.
16.03 Find the (multiplicative) order of x^2 + x + 1 mod x^3 + x + 1 with coefficients
in Z/2.
16.04 Find the (multiplicative) order of x mod x^4 + x^3 + x^2 + x + 1 with coefficients
in Z/2. (ans.)
16.05 Find the (multiplicative) order of x mod x^4 + x + 1 with coefficients in Z/2.
16.06 Find the (multiplicative) order of x^2 + x + 1 mod x^4 + x + 1 with coefficients
in Z/2.
16.07 Find an element of order 63 in F64, where F64 is modeled as F2[x] modulo
1010111, where those are the coefficients in order of decreasing degree. (ans.)
16.08 Find an element of order 63 in F64, where F64 is modeled as F2[x] modulo
1110101, where those are the coefficients in order of decreasing degree.
where for index n the state is a list of 6 bits (s_n, s_{n-1}, s_{n-2}, s_{n-3}, s_{n-4}, s_{n-5}).
With initial state (s5, s4, s3, s2, s1, s0) = (1, 0, 0, 0, 0, 0), after how many steps
will the state return to this?
16.16 Let F16 be modeled as F2[x] modulo 10011, the latter indicating coefficients
in order of decreasing degree. Find two roots of the equation y^2 + y + 1 = 0
in this field. (ans.)
16.17 Let F16 be modeled as F2[x] modulo 10111, the latter indicating coefficients
in order of decreasing degree. Find two roots of the equation y^2 + y + 1 = 0
in this field.
17
RS and BCH Codes
17.1
17.2
17.3
17.4
17.5
Vandermonde determinants
Variant check matrices for cyclic codes
Reed-Solomon codes
Hamming codes
BCH codes
So far in our story we have not been very successful in making error-correcting
codes. Yet Shannon's Noisy Coding Theorem assures us of the existence of codes
which correct as close to 100% of errors as we want (with chosen rate, also). It is
simply hard to find these codes.
We know that a linear code can correct e errors if and only if any 2e columns
of its check matrix are linearly independent. This transformation of the question is
much more helpful than the more primitive (though entirely correct) idea that to
correct e errors the minimum distance must be at least 2e + 1. (Equivalently, for linear
codes, the minimum weight of non-zero vectors in the code must be at least 2e + 1.) The
linear algebra condition about linear independence is more accessible. For example,
we can use this criterion to easily construct the Hamming [7, 4] code, which can
correct any single error. The next immediate question is how to achieve this linear
independence property for correction of multiple errors.
The examples we give here are not merely linear, but cyclic. The simplest ones
after the Hamming codes are Reed-Solomon codes or RS codes, and these do achieve
correction of arbitrarily large numbers of errors. Generalizing these somewhat are
the BCH codes. They were created by Bose, Chaudhuri, and independently by
Hocquenghem, about 1959-60, and were considered big progress at the time. All
of these can be viewed as a certain kind of generalization of Hamming codes. In
the end, these codes are not so good, but, still, they are the simplest examples of
multiple-error-correcting codes. (Actually, we'll only consider primitive, narrow-sense BCH codes.)
To construct these codes we need larger and larger finite fields, at least for
auxiliary purposes. Our little friend F2 = {0, 1} is unfortunately not adequate.
One approach is simply to use Z-mod-p = Z/p for large prime numbers p. From
a theoretical viewpoint this is fine, but from some practical viewpoints we would
much prefer to be able to rearrange everything as a binary code in the end. This
will require us to use finite fields F_{2^n} = GF(2^n) with 2^n elements.
Remark: It is very important to realize that we cannot realize finite fields F_{2^n}
by using simply Z-mod-something:
Z/2^n ≠ F_{2^n} (unless n = 1)
Instead, we need an irreducible polynomial P(x) of degree n with coefficients in
F2, and then
F_{2^n} = F2[x]/P = F2[x] modulo P(x)
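The distinction can be made concrete. Assuming the bitmask encoding used earlier (bit i = coefficient of x^i, my convention), the sketch below checks that F2[x] modulo the irreducible x^4 + x + 1 behaves as a field with 16 elements, while Z/16 does not:

```python
def mulmod_gf2(a, b, p):
    """Multiply bitmask-encoded F2[x] polynomials modulo p."""
    deg = p.bit_length() - 1
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> deg) & 1:
            a ^= p
    return r

P = 0b10011   # x^4 + x + 1, irreducible over F2
# Every non-zero residue has a multiplicative inverse, so F2[x]/P is a field F16:
assert all(any(mulmod_gf2(a, b, P) == 1 for b in range(1, 16)) for a in range(1, 16))
# By contrast, Z/16 is not a field: 2 has no inverse mod 16.
assert not any((2 * b) % 16 == 1 for b in range(16))
```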
It is important to realize that the issue is not so much the number of errors
corrected, but rather the ratio
relative error correction = (number of errors corrected)/(block length)
while also maintaining a high rate. After all, correcting 2 errors but needing a
block size of 1000000 is not very good, since it is quite likely that more than
2 errors will occur in a block that size! Thus, a code can be a failure even if
its rate is high, and even if it corrects many errors, if its block length is just
too long by comparison.
Despite having been used for decades in engineering applications, from an
abstract viewpoint the Hamming, RS, and BCH codes are very limited successes.
Specifically, as we try to use these ideas to correct more and more errors, the block
size goes up too fast, and the relative error correction goes to 0.
Recall that vectors v1, ..., vn, with vi = (vi1, vi2, ..., vin),
are linearly independent if and only if the determinant of the n-by-n matrix made
by sticking the vectors in (either as rows or columns) is non-zero:
    [ v11  v12  v13  v14  ...  v1n ]
    [ v21  v22  v23  v24  ...  v2n ]
det [ v31  v32  v33  v34  ...  v3n ] ≠ 0
    [ ...                          ]
    [ vn1  vn2  vn3  vn4  ...  vnn ]
In particular, for x1, ..., xn in a field, consider the n-by-n Vandermonde matrix
    [ 1         1         1         1         ...  1         ]
    [ x1        x2        x3        x4        ...  xn        ]
    [ x1^2      x2^2      x3^2      x4^2      ...  xn^2      ]
M = [ x1^3      x2^3      x3^3      x4^3      ...  xn^3      ]
    [ x1^4      x2^4      x3^4      x4^4      ...  xn^4      ]
    [ ...                                                    ]
    [ x1^(n-1)  x2^(n-1)  x3^(n-1)  x4^(n-1)  ...  xn^(n-1)  ]
Its determinant is the product of all the differences xj - xi with i < j, which is a
product of non-zero factors when the xi are distinct.
Remark: Keep in mind that in greatest generality the product of a bunch of nonzero things can nevertheless be 0. But this counter-intuitive phenomenon certainly
does not occur in fields, for example. More generally, recall that a commutative
ring in which ab = 0 only when either a or b is 0 is an integral domain. Every
field is an integral domain. The ordinary integers Z are an example of an integral
domain which is not a field.
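As a sanity check, one can compare a Vandermonde determinant computed by elimination with the product of differences, over a prime field. A sketch (helper name mine):

```python
def det_mod(m, p):
    """Determinant of a square matrix over Z/p (p prime) by Gaussian elimination."""
    m = [row[:] for row in m]
    n, det = len(m), 1
    for c in range(n):
        piv = next((r for r in range(c, n) if m[r][c] % p), None)
        if piv is None:
            return 0
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            det = -det                      # row swap flips the sign
        det = det * m[c][c] % p
        inv = pow(m[c][c], p - 2, p)        # inverse mod p via Fermat
        for r in range(c + 1, n):
            f = m[r][c] * inv % p
            for k in range(c, n):
                m[r][k] = (m[r][k] - f * m[c][k]) % p
    return det % p

xs, p = [1, 2, 3, 4], 7                     # distinct elements of Z/7
V = [[pow(x, i, p) for x in xs] for i in range(len(xs))]
lhs = det_mod(V, p)
rhs = 1
for i in range(len(xs)):
    for j in range(i + 1, len(xs)):
        rhs = rhs * (xs[j] - xs[i]) % p
assert lhs == rhs != 0                      # both equal 5 mod 7 here
```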
Thus, for example, for a non-zero element β of a field such that the powers
1, β, β^2, β^3, ..., β^(n-1)
are distinct, we have a non-zero determinant
    [ 1  1         1            1            ...  1               ]
    [ 1  β         β^2          β^3          ...  β^(n-1)         ]
    [ 1  β^2       (β^2)^2      (β^3)^2      ...  (β^(n-1))^2     ]
det [ 1  β^3       (β^2)^3      (β^3)^3      ...  (β^(n-1))^3     ] ≠ 0
    [ 1  β^4       (β^2)^4      (β^3)^4      ...  (β^(n-1))^4     ]
    [ ...                                                         ]
    [ 1  β^(n-1)   (β^2)^(n-1)  (β^3)^(n-1)  ...  (β^(n-1))^(n-1) ]
More generally, there is no reason that the different powers of β have to be consecutive: the only requirement is that they're not equal to each other. That is, for a
non-zero element β of a field and for integers ℓ1, ..., ℓn so that
β^ℓ1, β^ℓ2, β^ℓ3, ..., β^ℓn
are distinct, we have a non-zero determinant
    [ 1  1           1             1             ...  1                ]
    [ 1  β^ℓ1        β^ℓ2          β^ℓ3          ...  β^ℓn             ]
    [ 1  (β^ℓ1)^2    (β^ℓ2)^2      (β^ℓ3)^2      ...  (β^ℓn)^2         ]
det [ 1  (β^ℓ1)^3    (β^ℓ2)^3      (β^ℓ3)^3      ...  (β^ℓn)^3         ] ≠ 0
    [ 1  (β^ℓ1)^4    (β^ℓ2)^4      (β^ℓ3)^4      ...  (β^ℓn)^4         ]
    [ ...                                                              ]
    [ 1  (β^ℓ1)^(n-1) (β^ℓ2)^(n-1) (β^ℓ3)^(n-1)  ...  (β^ℓn)^(n-1)     ]
Likewise, the matrix
    [ 1         1         1         1         ...  1         ]
    [ x1        x2        x3        x4        ...  xn        ]
    [ x1^2      x2^2      x3^2      x4^2      ...  xn^2      ]
M = [ x1^3      x2^3      x3^3      x4^3      ...  xn^3      ]
    [ x1^4      x2^4      x3^4      x4^4      ...  xn^4      ]
    [ ...                                                    ]
    [ x1^(n-1)  x2^(n-1)  x3^(n-1)  x4^(n-1)  ...  xn^(n-1)  ]
has non-zero determinant for distinct x1, ..., xn, and we can multiply the jth column
through by xj to obtain
    [ x1    x2    x3    x4    ...  xn   ]
    [ x1^2  x2^2  x3^2  x4^2  ...  xn^2 ]
det [ x1^3  x2^3  x3^3  x4^3  ...  xn^3 ] ≠ 0
    [ ...                               ]
    [ x1^n  x2^n  x3^n  x4^n  ...  xn^n ]
for the x1 , . . . , xn all different from each other, and non-zero. This type of matrix
is also called a Vandermonde matrix.
Let g(x) = c0 + c1 x + c2 x^2 + ... + cs x^s be the generator polynomial of a cyclic
code of block length n, with generating matrix
    [ c0  c1  c2  ...  cs      0   0  ...  0  ]
    [ 0   c0  c1  ...  cs-1    cs  0  ...  0  ]
G = [ 0   0   c0  ...  ...     cs  ...     0  ]
    [ ...                                     ]
    [ 0   ...  0   c0  c1      ...        cs  ]
and let
h(x) = (x^n - 1)/g(x) = b0 + b1 x + ... + bt x^t
(with s + t = n). We can easily make one kind of check matrix, as we saw earlier
works for any cyclic code, by
    [ bt  bt-1  bt-2  bt-3  ...  b1  b0  0   ...  0 ]
H = [ 0   bt    bt-1  bt-2  ...  b2  b1  b0  ...  0 ]
    [ 0   0     bt    bt-1  ...  b3  b2  b1  ...  0 ]
    [ ...                                           ]
Now we'll make a different kind of check matrix for such a cyclic code, which
illustrates better the error-correction possibilities via the linear-independence-of-columns condition. Suppose that the above discussion took place with all coefficients bi and cj in a finite field Fq. Let Fqm be a larger finite field containing
the finite field Fq and large enough so that the polynomial g(x) factors into linear
factors when we allow coefficients in Fqm. To be sure that this happens, we need
the following proposition.
Proposition: Given a non-constant polynomial g(x) with coefficients in Fq , there
is a larger finite field Fqm in which g(x) factors into linear factors.
Proof: Let f(x) be an irreducible factor of g(x), of degree d > 1. Then from our
discussion of finite fields and polynomial rings the quotient ring Fq[x]/f(x) is a
field, with q^d elements. (In fact, f(x) factors into linear polynomials in this field,
but seeing this requires more effort than we can exert at the moment, so we must
continue as though we did not know this.) Let α be the image of x in this quotient.
Then as seen earlier f(α) = 0. Thus, by unique factorization of polynomials, x - α
is a factor of f(x), hence of g(x). Thus, we can divide, obtaining a polynomial of
lower degree
h(x) = g(x)/(x - α)
What we have shown so far is that if a polynomial does not already factor into linear
factors then we can enlarge the finite field so as to find a further linear factor. We
repeat this process (by induction on degree) to enlarge the field sufficiently to factor
g(x) into linear factors entirely.
///
Factor g(x) into irreducible polynomials
g(x) = f1(x) f2(x) ... fℓ(x)
where each fi has coefficients in Fq. For the subsequent discussion we need to
assume that no factor fi occurs more than once. An easy way to be sure
that this is so is to require that gcd(n, q) = 1, for example.
Let β_i be a root of the ith irreducible factor f_i in Fqm. We claim that
    [ 1  β_1  β_1^2  β_1^3  ...  β_1^(n-1) ]
    [ 1  β_2  β_2^2  β_2^3  ...  β_2^(n-1) ]
H = [ 1  β_3  β_3^2  β_3^3  ...  β_3^(n-1) ]
    [ ...                                  ]
    [ 1  β_ℓ  β_ℓ^2  β_ℓ^3  ...  β_ℓ^(n-1) ]
is a check matrix for the cyclic code with generator polynomial g(x).
Proof: The ith row of the cyclic generator matrix G, interpreted as the coefficients
of a polynomial ordered by ascending degree, is x^(i-1) g(x). Thus, G · H^T = 0 if and
only if
β_j^(i-1) g(β_j) = 0
for all indices j and i. Since none of the β_j's is 0, this is equivalent to the set of
equations
g(β_j) = 0
for all indices j. Since the β_j's are roots of g(x) = 0, certainly G · H^T = 0.
Now we prove that v · H^T = 0 implies that v is in the code. Since β_j is a root
of f_j(x) = 0 with f_j(x) irreducible, as in our earlier discussion of field extensions
we can take
β_j = x mod f_j(x)
Again interpreting v as a polynomial, the condition v · H^T = 0 is equivalent to
v(β_j) = 0
Remark: Thus, calling t the designed distance is reasonable at least in the sense
that we are sure that the minimum distance is at least t.
Corollary: With t = 2e + 1, the minimum distance ≥ 2e + 1 assures that the Reed-Solomon [q - 1, q - 1 - 2e] code with alphabet Fq can correct any e errors.
///
Proof: We will make a variant-type check matrix for C in which any t - 1 columns are
linearly independent. This linear independence will be proven by observing that
the (t-1)-by-(t-1) matrix consisting of any t - 1 columns is a Vandermonde matrix.
Let
    [ 1  β        β^2          β^3          ...  β^(n-1)         ]
    [ 1  β^2      (β^2)^2      (β^3)^2      ...  (β^(n-1))^2     ]
H = [ 1  β^3      (β^2)^3      (β^3)^3      ...  (β^(n-1))^3     ]
    [ 1  β^4      (β^2)^4      (β^3)^4      ...  (β^(n-1))^4     ]
    [ ...                                                        ]
    [ 1  β^(t-1)  (β^2)^(t-1)  (β^3)^(t-1)  ...  (β^(n-1))^(t-1) ]
The jth column consists of powers of β^(j-1). Since β is a primitive root, the entries
of the top row are distinct. Thus, any t - 1 columns together form a Vandermonde
matrix with non-zero determinant! This proves linear independence, and by earlier
discussions proves the minimum distance assertion.
///
Example: Let's make a code that will correct 2 errors. For this, we'll need
designed distance t = 5. Since we need t ≤ q - 1, we need 5 ≤ q - 1. For simplicity
we'll have q be a prime number, so to satisfy 5 ≤ q - 1 take q = 7. Let β be a
primitive root mod 7, for example β = 3. Then take a generating polynomial
g(x) = (x - 3)(x - 3^2)(x - 3^3)(x - 3^4)
     = (x - 3)(x - 2)(x - 6)(x - 4) = x^4 + 6x^3 + 3x^2 + 2x + 4
Thus, this will make a [6, 2]-code, since deg g = t - 1 = 4 and 2 = 6 - 4. A
generating matrix is
G = [ 4 2 3 6 1 0 ]
    [ 0 4 2 3 6 1 ]
To obtain a check matrix in the usual form for cyclic codes, take
h(x) = (x^6 - 1)/g(x) = (x - 1)(x - 5) = x^2 + x + 5
Then the check matrix is (note, as usual, the reversal of order of coefficients)
H = [ 1 1 5 0 0 0 ]
    [ 0 1 1 5 0 0 ]
    [ 0 0 1 1 5 0 ]
    [ 0 0 0 1 1 5 ]
Evidently any 4 columns are linearly independent. To verify this directly we'd
have to check (6 choose 4) = 15 different possibilities, which would be too tedious. But our
variant check matrix proves this for us indirectly! This gives us a [6, 2]-code with
alphabet Z/7 which can correct any 2 errors. The rate is 2/6.
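The generating polynomial in this example can be recomputed mechanically. A sketch (helper name mine; polynomials as coefficient lists in ascending degree):

```python
def polymul_modp(a, b, p):
    """Multiply polynomials over Z/p, coefficients listed by ascending degree."""
    r = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            r[i + j] = (r[i + j] + ai * bj) % p
    return r

p, beta = 7, 3                       # beta = 3 is a primitive root mod 7
g = [1]
for k in range(1, 5):                # multiply in the roots beta, ..., beta^4
    g = polymul_modp(g, [(-pow(beta, k, p)) % p, 1], p)
print(g)                             # [4, 2, 3, 6, 1], i.e. x^4 + 6x^3 + 3x^2 + 2x + 4
```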
Example: Let's make a Reed-Solomon code that can correct any 3 errors, with
alphabet Fq. For this, we need designed distance t = 2 · 3 + 1 = 7. Since we must
have t ≤ q - 1, we take q = 11, so Fq = Z/11. This will make a code of block
length 10 = 11 - 1 and dimension 10 - 6 = 4. With the primitive root 2 mod 11, the
generating polynomial is
g(x) = (x - 2)(x - 2^2)(x - 2^3)(x - 2^4)(x - 2^5)(x - 2^6)
     = x^6 + 6x^5 + 5x^4 + 7x^3 + 2x^2 + 8x + 2
so a generating matrix is
G = [ 2 8 2 7 5 6 1 0 0 0 ]
    [ 0 2 8 2 7 5 6 1 0 0 ]
    [ 0 0 2 8 2 7 5 6 1 0 ]
    [ 0 0 0 2 8 2 7 5 6 1 ]
To obtain a check matrix, take
h(x) = (x^10 - 1)/g(x) = x^4 + 5x^3 + 9x^2 + 2x + 5
and (reversing the order of coefficients, as usual)
H = [ 1 5 9 2 5 0 0 0 0 0 ]
    [ 0 1 5 9 2 5 0 0 0 0 ]
    [ 0 0 1 5 9 2 5 0 0 0 ]
    [ 0 0 0 1 5 9 2 5 0 0 ]
    [ 0 0 0 0 1 5 9 2 5 0 ]
    [ 0 0 0 0 0 1 5 9 2 5 ]
It's not obvious, but evidently any 6 columns of H are linearly independent. Thus,
this [10, 4]-code over alphabet Z/11 can correct any 3 errors. It has rate 4/10.
In general, the rate of a Reed-Solomon code is
(n - deg g)/n = (q - t)/(q - 1) = 1 - (t - 1)/(q - 1)
while the relative error correction is
2e/(block length) = 2e/(q - 1)
Thus, we can either correct lots of errors per block length, or maintain a high rate,
but not both.
Remark: The Reed-Solomon codes are maximum distance separable (MDS) codes,
meaning that they meet the Singleton bound.
Remark: And, although now we finally have examples where arbitrary numbers
of errors can be corrected, these are not binary codes. For practical applications
we might be compelled to have a binary code, which is one of the possibilities of
the BCH codes treated below.
H = ( 1  β  β^2  β^3  ...  β^(n-1) )
Since any two columns of this check matrix are linearly independent, simply by
not being multiples of each other, the minimum distance of the code arising from
this is at least 3 (and it can correct any single error: (3 - 1)/2 = 1).
Proof: We want to prove directly in this example that the minimum distance is at
least 3. Since the code is linear, this is equivalent to the assertion that the minimum
weight is at least 3. Suppose not. Then there is at least one vector v in the code
with weight 2, that is, with only two non-zero entries, say at the ith and jth places
(indexing from 0 to n - 1). Then
0 = v · H^T = β^i + β^j
Without loss of generality, i < j. Then
β^(j-i) + 1 = 0
That is, β seemingly satisfies an equation of degree j - i with coefficients in F2.
But by hypothesis g(x) = 0 is the lowest-degree equation satisfied by β, and g has
degree n > j ≥ j - i, so this is impossible. Thus, the minimum distance is at least
3.
///
Example: Taking the primitive cubic g(x) = x^3 + x + 1 as generating polynomial
makes a binary cyclic [7, 4] Hamming code. Using the coefficients of g(x) in
ascending order, a generating matrix is
G = [ 1 1 0 1 0 0 0 ]
    [ 0 1 1 0 1 0 0 ]
    [ 0 0 1 1 0 1 0 ]
    [ 0 0 0 1 1 0 1 ]
This is not in the standard form, but since the code is cyclic, we have an easy
way to make the check matrix. We will not make the variant check matrix as just
above, but rather the general type applicable to any cyclic code. Take
h(x) = (x^7 - 1)/g(x) = (x^7 - 1)/(x^3 + x + 1) = x^4 + x^2 + x + 1 = (x + 1)(x^3 + x^2 + 1)
Then, reversing the order of the coefficients of h,
H = [ 1 0 1 1 1 0 0 ]
    [ 0 1 0 1 1 1 0 ]
    [ 0 0 1 0 1 1 1 ]
From examination of the variant-type check matrix we know that the code has
minimum distance at least 3, so we know without directly checking that any two
columns of this H are linearly independent. This can also be verified directly, but
it would be silly to do so.
The rate of this code is dimension/length = 4/7.
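One can verify by machine that the two matrices above are indeed orthogonal modulo 2, which is the check-matrix property:

```python
# The cyclic [7,4] Hamming generating matrix (rows: shifts of g = x^3 + x + 1)
G = [[1, 1, 0, 1, 0, 0, 0],
     [0, 1, 1, 0, 1, 0, 0],
     [0, 0, 1, 1, 0, 1, 0],
     [0, 0, 0, 1, 1, 0, 1]]
# The check matrix (rows: shifts of the reversed coefficients of h)
H = [[1, 0, 1, 1, 1, 0, 0],
     [0, 1, 0, 1, 1, 1, 0],
     [0, 0, 1, 0, 1, 1, 1]]
# Every row of G is orthogonal to every row of H over F2:
prods = [sum(g[i] * h[i] for i in range(7)) % 2 for g in G for h in H]
assert all(p == 0 for p in prods)
```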
Example: With k = 4, n = 2^4 - 1 = 15, we need a primitive quartic. We can
use g(x) = x^4 + x + 1 (or x^4 + x^3 + 1) as the generating polynomial. This will
make a binary cyclic [15, 11]-code, since 15 = 2^4 - 1 and 11 = 2^4 - 1 - 4. Using the
coefficients of g(x) in ascending order, a generating matrix is
G = [ 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 ]
The rate of this code is 11/15. Note that as the size of these Hamming codes
grows the rate approaches 1. But on the other hand the relative error correction
(number of errors correctible)/(block size)
goes to 0 as the block size grows. This is bad. Thus, the useful codes among the
Hamming codes are the relatively small ones.
        [ 1  α    α^2      α^3      ...  α^(n-1)     ]
    H = [ 1  α^2  (α^2)^2  (α^3)^2  ...  (α^(n-1))^2 ]
        [ 1  α^3  (α^2)^3  (α^3)^3  ...  (α^(n-1))^3 ]
        [ 1  α^4  (α^2)^4  (α^3)^4  ...  (α^(n-1))^4 ]

with α possibly lying in some larger field F_(q^m). We suppose that 1, α, α^2, α^3, ..., α^(n-1) are all distinct and non-zero, and that n - 1 ≥ 4, so n ≥ 5. The discussion of Vandermonde determinants above implies that the determinant of any 4-by-4 matrix made from 4 different columns of this matrix is non-zero. Thus, the corresponding linear code will correct 2 errors, since the minimum distance is 4 + 1 = 5 and a code with minimum distance d will correct any number of errors < d/2.
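The Vandermonde fact invoked here can be spot-checked numerically. The following sketch (an illustration of mine, with the prime field F_11 standing in for the relevant finite field) verifies that every 4-by-4 matrix built from columns (x, x^2, x^3, x^4) at distinct non-zero x has non-zero determinant mod p:

```python
# Sketch (mine, not the book's): determinants of Vandermonde-style column
# selections over F_11 are non-zero, matching the linear-independence claim.
from itertools import combinations

p = 11

def det_mod_p(m, p):
    """Determinant mod p via Gaussian elimination over the prime field F_p."""
    m = [row[:] for row in m]
    n, d = len(m), 1
    for c in range(n):
        piv = next((r for r in range(c, n) if m[r][c] % p), None)
        if piv is None:
            return 0
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d = d * m[c][c] % p
        inv = pow(m[c][c], p - 2, p)   # inverse via Fermat's little theorem
        for r in range(c + 1, n):
            f = m[r][c] * inv % p
            m[r] = [(a - f * b) % p for a, b in zip(m[r], m[c])]
    return d % p

# columns (x, x^2, x^3, x^4) for distinct non-zero x in F_11
for xs in combinations([1, 2, 3, 4, 5], 4):
    M = [[pow(x, j, p) for x in xs] for j in range(1, 5)]
    assert det_mod_p(M, p) != 0
print("all selections non-singular")
```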
Further, we can just as well make check matrices with any 6 columns linearly independent (so any 3 errors correctable), any 8 columns linearly independent (so any 4 errors correctable), and so on. This much was already done by Reed-Solomon codes, as long as the alphabet F_q was sufficiently large.
The new ingredient is that we want to allow the element α to be in a larger field F_(q^m) than just the field F_q which is used as the actual alphabet for the code.
would be in contrast to the Reed-Solomon codes where we never went outside the
finite field Fq used as the code alphabet. Staying inside Fq was what required that
Reed-Solomon codes use larger and larger Fq as the block size goes up. Instead,
if we can make check matrices over larger fields but keep the code alphabet itself
fixed, we can make multiple-error-correcting codes using small alphabets. Such
possibilities were already hinted at in the discussion of variant check matrices, and
illustrated in a rather trivial case in the reconstruction of Hamming codes above.
More generally, the matrix

        [ 1  α        α^2          α^3          ...  α^(n-1)          ]
        [ 1  α^2      (α^2)^2      (α^3)^2      ...  (α^(n-1))^2      ]
    H = [ 1  α^3      (α^2)^3      (α^3)^3      ...  (α^(n-1))^3      ]
        [ ...                                                         ]
        [ 1  α^(t-2)  (α^2)^(t-2)  (α^3)^(t-2)  ...  (α^(n-1))^(t-2)  ]
        [ 1  α^(t-1)  (α^2)^(t-1)  (α^3)^(t-1)  ...  (α^(n-1))^(t-1)  ]
has the property that the (t 1)-by-(t 1) matrix formed from any t 1 columns
has non-zero determinant. This follows from properties of Vandermonde matrices.
In this notation, the quantity t is the designed distance, because if we make
a code with this check matrix then it will have minimum distance at least t.
This generalizes the use of the terminology in the Reed-Solomon case.
We would need to connect such a variant check matrix with generating polynomials for a cyclic code. For 1 ≤ i ≤ t - 1, let f_i be the irreducible polynomial with coefficients in F_q so that

    f_i(α^i) = 0

Since α lies in F_(q^m) (and is primitive besides), by Lagrange's theorem (or even more elementary reasoning) each irreducible f_i(x) must be a factor of x^(q^m - 1) - 1. Then let

    g(x) = least common multiple of f_1, ..., f_(t-1)

This will be a polynomial (with coefficients in F_q) dividing x^n - 1.
Since by assumption n = q^m - 1 and q are relatively prime, x^(q^m - 1) - 1 has no repeated factors, so unless two or more of the f_i are simply the same, their least common multiple is their product. The lack of repeated factors follows from the fact that any repeated factor of a polynomial f(x) must also be a factor of the derivative f'(x) of f(x), as observed earlier: if f = P^2 Q with polynomials P, Q, then

    f' = 2P P' Q + P^2 Q' = P (2P' Q + P Q')

Using the Euclidean algorithm, we can compute without great difficulty that the gcd of x^n - 1 and its derivative n x^(n-1) is 1 when n ≠ 0 in the field F_q.
So the generating polynomial g(x) for the code C with the check matrix H above is the product of the different irreducible polynomials f_i which have roots α^i (for 1 ≤ i < t), not repeating a given polynomial f_i if two different α^i and α^j are roots of the same f_i. Then the question is: given q, m, block length n = q^m - 1, primitive element α, and designed distance t,

    How can we nicely determine g(x) as a function of t?

The answer to this question cannot be described well by a simple formula, but can be answered systematically.
For a in the finite field F_(q^m), the Frobenius map F_(q^m) → F_(q^m) is defined to be

    a → a^q

(There is no completely standard symbol for this map!)

Theorem: For x, y in the finite field F_(q^m), the Frobenius map a → a^q has the properties

    (xy)^q = x^q y^q
    (x + y)^q = x^q + y^q

For a polynomial f with coefficients in F_q, suppose the equation f(x) = 0 has root a ∈ F_(q^m). Then also f(a^q) = 0.

(Let's not worry about the proof of all this right now. It's not very hard in any case.)
In particular, this means that not only is α = α^1 a root of f_1, but also α^q, α^(q^2), etc. are all roots of f_1. Of course this is not really an infinite list, because α^(q^m) = α, so the list cycles on itself. And α^2, (α^2)^q, (α^2)^(q^2), etc. are roots of f_2. Further, α^3, (α^3)^q, (α^3)^(q^2), etc. are roots of f_3. And so on.
The key question is: among α, α^2, ..., α^(t-1), how many different polynomials f_i do we need? The more we need, the larger the degree of g, and thus the smaller the quantity n - deg g, which occurs in the expression for the rate of the code

    rate = 1 - (deg g)/n
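The bookkeeping of which α^i share a minimal polynomial f_i is exactly the computation of the orbits of the exponents under multiplication by q modulo n. A short sketch (my own, not the book's) for q = 2, n = 15:

```python
# Sketch (mine): the q-power (Frobenius) orbits of the exponents 1..t-1 modulo
# n group the alpha^i by shared minimal polynomial f_i; orbit sizes are the
# degrees of the f_i, so summing distinct orbit sizes gives deg g and the rate.

def frobenius_orbit(i, q, n):
    """Orbit of the exponent i under repeated multiplication by q modulo n."""
    orbit, j = set(), i
    while j not in orbit:
        orbit.add(j)
        j = j * q % n
    return frozenset(orbit)

def bch_degree(n, q, t):
    """Degree of the generating polynomial for designed distance t."""
    orbits = {frobenius_orbit(i, q, n) for i in range(1, t)}
    return sum(len(o) for o in orbits)

# Designed distance 5, n = 15, q = 2: exponents 1, 2, 4 share the orbit
# {1, 2, 4, 8}, exponent 3 gives {3, 6, 12, 9}, so deg g = 8 and rate = 7/15.
d = bch_degree(15, 2, 5)
print(d, 1 - d / 15)
```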
In the worst-case scenario, even with designed distance t much less than the block length n, it can be that by accident g(x) must be

    g(x) = (x^(q^m - 1) - 1)/(x - 1) = (x^n - 1)/(x - 1)

If no two of the α, α^2, ..., α^(t-1) are roots of the same irreducible factor f_i, then the degree of the generating polynomial g for the code can be as large as

    deg g = (t - 1) · m ≤ q^m - 1 = deg (x^n - 1)  (= deg (x^(q^m - 1) - 1))
    [ 1  α    α^2      ...  α^6      ]
    [ 1  α^2  (α^2)^2  ...  (α^2)^6  ]
    [ 1  α^3  (α^3)^2  ...  (α^3)^6  ]
    [ 1  α^4  (α^4)^2  ...  (α^4)^6  ]
    [ 1  α    α^2      ...  α^14      ]
    [ 1  α^2  (α^2)^2  ...  (α^2)^14  ]
    [ 1  α^3  (α^3)^2  ...  (α^3)^14  ]
    [ 1  α^4  (α^4)^2  ...  (α^4)^14  ]
Using the coefficients of g(x) = (x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1) = x^8 + x^7 + x^6 + x^4 + 1, a generating matrix is

        [ 1 1 1 0 1 0 0 0 1 0 0 0 0 0 0 ]
        [ 0 1 1 1 0 1 0 0 0 1 0 0 0 0 0 ]
        [ 0 0 1 1 1 0 1 0 0 0 1 0 0 0 0 ]
    G = [ 0 0 0 1 1 1 0 1 0 0 0 1 0 0 0 ]
        [ 0 0 0 0 1 1 1 0 1 0 0 0 1 0 0 ]
        [ 0 0 0 0 0 1 1 1 0 1 0 0 0 1 0 ]
        [ 0 0 0 0 0 0 1 1 1 0 1 0 0 0 1 ]
This is a binary [15, 7]-code and has minimum distance at least 5, by construction,
so can correct any 2 errors.
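The claimed minimum distance can in fact be confirmed by brute force, since there are only 2^7 codewords. A sketch of mine, multiplying every m(x) of degree < 7 by g(x) over F_2:

```python
# Brute-force check (my addition, not from the text): all non-zero multiples
# m(x) g(x) with deg m < 7, for g(x) = x^8 + x^7 + x^6 + x^4 + 1, have minimum
# Hamming weight 5, so this linear [15, 7] code has minimum distance exactly 5.

G_POLY = 0b111010001   # bit i holds the coefficient of x^i in g(x)

def poly_mul_gf2(a, b):
    """Carry-less (F_2) polynomial multiplication on bitmask representations."""
    out = 0
    while a:
        if a & 1:
            out ^= b
        a >>= 1
        b <<= 1
    return out

weights = [bin(poly_mul_gf2(m, G_POLY)).count("1") for m in range(1, 1 << 7)]
print(min(weights))  # 5
```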
Remark: Notice that it was not so easy to predict from the specification that the resulting binary code would be [15, 7]. Indeed, since the generalized check matrix had only 4 rows and block size 15, a person might have mistakenly thought that the code would have dimension 15 - 4 = 11 rather than just 7. The difference is accounted for in a subtle way by the fact that we get a binary code, not a code with alphabet F_16, though the check matrix uses F_16.
One more example: let's make a code with designed distance t = 7, to make a binary code to correct any 3 errors. We can use the block size n = 15 = 2^4 - 1 again, and check matrix
        [ 1  α    α^2      ...  α^14      ]
        [ 1  α^2  (α^2)^2  ...  (α^2)^14  ]
    H = [ 1  α^3  (α^3)^2  ...  (α^3)^14  ]
        [ 1  α^4  (α^4)^2  ...  (α^4)^14  ]
        [ 1  α^5  (α^5)^2  ...  (α^5)^14  ]
        [ 1  α^6  (α^6)^2  ...  (α^6)^14  ]
The generating polynomial g(x) will be the minimal polynomial so that

    g(α) = 0, g(α^2) = 0, g(α^3) = 0, g(α^4) = 0, g(α^5) = 0, g(α^6) = 0

By applying the Frobenius map and the little theorem above, we see that α, α^2, α^4 will all be roots of a single irreducible factor of x^15 - 1, namely x^4 + x + 1, since α was effectively defined as a root of this. Likewise, α^3 and α^6 = (α^3)^2 will both be roots of the irreducible polynomial x^4 + x^3 + x^2 + x + 1 which has α^3 as a root (determined by luck above). All that is left is α^5. We'll again try to be lucky rather than systematic. Since α^15 = 1, we have (α^5)^3 = 1, or

    ((α^5)^2 + (α^5) + 1)(α^5 + 1) = 0
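The conclusion that α^5 satisfies x^2 + x + 1 can be verified directly in the model F_16 = F_2[x]/(x^4 + x + 1). A small sketch of mine, with field elements stored as bitmasks:

```python
# Direct verification (my sketch, not the book's): in F_16 = F_2[x]/(x^4+x+1),
# with a the image of x, the element a^5 satisfies y^2 + y + 1 = 0, so its
# minimal polynomial over F_2 is x^2 + x + 1.

MOD, DEG = 0b10011, 4          # modulus x^4 + x + 1

def mul(a, b):
    """Multiply two field elements (bitmask form), reducing mod x^4 + x + 1."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        b >>= 1
        a <<= 1
        if a >> DEG & 1:       # degree reached 4: subtract the modulus
            a ^= MOD
    return out

a5 = 1
for _ in range(5):
    a5 = mul(a5, 0b10)         # multiply by a five times to get a^5

print(a5, mul(a5, a5) ^ a5 ^ 1)  # a^5 = a^2 + a = 0b110 = 6; the check gives 0
```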
Since α^5 ≠ 1 (α has order 15), α^5 must be a root of x^2 + x + 1. Thus,

    g(x) = (x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1)(x^2 + x + 1) = x^10 + x^8 + x^5 + x^4 + x^2 + x + 1

and a generating matrix is

        [ 1 0 1 0 0 1 1 0 1 1 1 0 0 0 0 ]
        [ 0 1 0 1 0 0 1 1 0 1 1 1 0 0 0 ]
    G = [ 0 0 1 0 1 0 0 1 1 0 1 1 1 0 0 ]
        [ 0 0 0 1 0 1 0 0 1 1 0 1 1 1 0 ]
        [ 0 0 0 0 1 0 1 0 0 1 1 0 1 1 1 ]
This is a binary [15, 5]-code that can correct any 3 errors. The fact that the rate is
only 5/15 = 1/3 might be disappointing. Also, the relative error correction rate
is only 3/15 = 1/5.
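Again the true minimum distance is within reach of brute force, since there are only 2^5 - 1 non-zero codewords. A sketch of mine, reusing the same carry-less multiplication idea:

```python
# Brute-force check (my addition) for the designed-distance-7 code: the
# non-zero multiples m(x) g(x) with deg m < 5, where
# g(x) = x^10 + x^8 + x^5 + x^4 + x^2 + x + 1, have minimum weight 7,
# so this is actually a [15, 5, 7] code, correcting any 3 errors.

G_POLY = 0b10100110111   # bit i holds the coefficient of x^i in g(x)

def poly_mul_gf2(a, b):
    """Carry-less (F_2) polynomial multiplication on bitmask representations."""
    out = 0
    while a:
        if a & 1:
            out ^= b
        a >>= 1
        b <<= 1
    return out

weights = [bin(poly_mul_gf2(m, G_POLY)).count("1") for m in range(1, 1 << 5)]
print(min(weights))  # 7
```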
One more example: using block size n = 2^5 - 1 = 31, let's try again to correct 3 errors, so we need designed distance t = 7. Thus, the check matrix is
        [ 1  α    α^2      ...  α^30      ]
        [ 1  α^2  (α^2)^2  ...  (α^2)^30  ]
    H = [ 1  α^3  (α^3)^2  ...  (α^3)^30  ]
        [ 1  α^4  (α^4)^2  ...  (α^4)^30  ]
        [ 1  α^5  (α^5)^2  ...  (α^5)^30  ]
        [ 1  α^6  (α^6)^2  ...  (α^6)^30  ]
where α is a primitive element in F_32. For example, let's model F_32 = F_2[x]/P(x) where P(x) is the primitive quintic

    P(x) = x^5 + x^2 + 1

Then α, α^2, α^4 are all roots of P(x) = 0, α^3 and α^6 go together, and α^5 is by itself. Since 31 = 2^5 - 1 is prime, all these elements are again primitive, so we need 2 more distinct irreducible quintics as factors to be able to have α, α^2, α^3, α^4, α^5, α^6 as roots; without worrying about determining these quintics, we do know

    deg g = 3 · 5 = 15

Therefore, we get a binary [31, 16]-code, since 16 = 31 - 15. This gives a rate above 1/2 again. But now the relative error correction is only 3/31 < 1/10.
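The count deg g = 3 · 5 can be double-checked by computing the 2-power orbits of the exponents 1..6 modulo 31 (a sketch of mine):

```python
# Sketch (mine): the 2-power orbits of the exponents 1..6 modulo 31 are three
# orbits of size 5 each, confirming deg g = 15 for this designed-distance-7 code.

def frobenius_orbit(i, q, n):
    """Orbit of the exponent i under repeated multiplication by q modulo n."""
    orbit, j = set(), i
    while j not in orbit:
        orbit.add(j)
        j = j * q % n
    return frozenset(orbit)

orbits = {frobenius_orbit(i, 2, 31) for i in range(1, 7)}
print(sorted(len(o) for o in orbits), sum(len(o) for o in orbits))  # [5, 5, 5] 15
```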
Remark: Although our success here was very limited, we did manage to make a
relatively straightforward class of multiple-error-correcting codes. Since they are
Exercises
17.03 Compute the determinant of

    [ 1 2 3 ]
    [ 4 5 6 ]
    [ 7 8 9 ]

(ans.)

17.04 Compute the determinant of

    [ 1 2 3 ]
    [ 4 5 6 ]
    [ 7 8 0 ]

(ans.)

17.05 Compute the determinant of

    [ 1 1  1  1 ]
    [ 1 2  3  4 ]
    [ 1 4  9 16 ]
    [ 1 8 27 64 ]

(ans.)

17.06 Compute the determinant of

    [   1   1   1   1 ]
    [   5   6   7   8 ]
    [  25  36  49  64 ]
    [ 125 216 343 512 ]
17.07 Using the alphabet GF(13) find a generator matrix for a Reed-Solomon code correcting any 5 errors. (Use primitive root 2 mod 13.) (ans.)

17.08 Using the alphabet GF(17) find a generator matrix for a Reed-Solomon code correcting any 7 errors. (Use primitive root 3 mod 17.)

17.09 Using the alphabet GF(11) find a generator matrix for a Reed-Solomon code correcting any 4 errors. (Use primitive root 2 mod 11.)

17.10 Using the alphabet GF(19) find a generator matrix for a Reed-Solomon code correcting any 8 errors. (Use primitive root 2 mod 19.)
17.11 Determine the dimension and minimum distance of the BCH code of length 48 constructed with designed distance 9 using the field extension GF(7^2) of the finite field GF(7). (ans.)

17.12 Determine the dimension and minimum distance of the BCH code of length 31 constructed with designed distance 7 using the field extension GF(2^5) of the finite field GF(2).

17.13 Determine the dimension and minimum distance of the BCH code of length 124 constructed with designed distance 9 using the field extension GF(5^3) of the finite field GF(5).

17.14 For a polynomial f with coefficients in F_q, suppose the equation f(x) = 0 has root a ∈ F_(q^m). Prove that also f(a^q) = 0.
18
Concatenated Codes

18.1 Mirage codes
18.2 Concatenated codes
18.3 Justesen codes
18.4 Some explicit irreducible polynomials
In this chapter we first give a non-constructive proof that there exist infinite families of linear codes with information rate 1/2 and minimum distance at least 1/10 of the length, all lying in a very restricted class of linear codes. That is, both the information rate and the error correction rate (ratio of minimum distance to length) are bounded away from 0. We say that such an infinite family of codes is asymptotically good. RS and BCH codes do not have this property. But the non-constructive nature of this proof means basically that we cannot find these codes or describe them in any useful manner.

But there is a clever adaptation of the idea of that proof to make tangible examples of asymptotically good codes, due to Justesen. This was achieved relatively recently, in the late 1970s. The codes Justesen made were examples of concatenated codes, meaning that they are made by combining simpler building-block codes in a way that has a synergistic effect on the error-correction properties, etc. Concatenated codes were apparently introduced by G.D. Forney in about 1966 (in the context of convolutional codes rather than block codes). The general idea is simple enough: repeatedly encode a message. Of course working out the details in a way to make it advantageous is the whole trick of it.

For now, this is the only known way to make a constructive infinite family of longer and longer codes so that neither the information rate nor the relative error correction go to 0 as the length goes to infinity.
18.1  Mirage codes

Theorem: For n ≥ 4 there is λ ∈ F_(2^n) so that the binary code C_2n(λ) = {(v, λv) : v ∈ F_(2^n)}, of length 2n and information rate 1/2, has minimum distance at least 2n/10.
Proof: The idea of the proof is to count the number of λ's which give relatively low minimum Hamming weight, and see that there are many λ's left over. These leftovers must necessarily give relatively good codes.

Notice that any single non-zero binary word (v, λv) in the code C_2n(λ) determines λ, by the relation

    λ = (λv) · v^(-1)

where inverse and multiplication are as elements of F_(2^n). Fix a real number c in the range 0 < c < 1/2. If the code has minimum weight < c · 2n, then there is a non-zero word (v, λv) in C_2n(λ) with Hamming weight < c · 2n, so λ is expressible as λ = (λv) · v^(-1) for some v so that (v, λv) is in C_2n(λ) and has Hamming weight < c · 2n.

The number of (v, λv)'s in C_2n(λ) with Hamming weight < c · 2n is certainly at most the number of binary vectors of length 2n with Hamming weight < c · 2n (that is, without assuming they're in C_2n(λ)). Thus,

    (number of (v, λv)'s with Hamming weight < c · 2n) ≤ Σ_(i ≤ c·2n) C(2n, i)

since there are C(2n, i) locations to put i 1's in a length-2n vector. From the lemma below, we have a slightly subtle estimate of this sum of binomial coefficients:

    Σ_(i < c·2n) C(2n, i) ≤ 2^(2n·H(c))

where as usual

    H(c) = -c log_2 c - (1 - c) log_2 (1 - c)
We wish to choose c such that the number of these bad λ's is ≤ 2^n - 2, so, since there are 2^n - 1 λ's available altogether, not all the λ's would be bad. That is, there would be at least one λ so that C_2n(λ) has minimum weight ≥ c · 2n. That is, we want

    2^(H(c)·2n) ≤ 2^n - 2

so must choose c such that H(c) is somewhat less than 1/2. Some numerical experimentation yields

    H(0.1) ≈ 0.46899559358928... < 1/2

Therefore, certainly asymptotically as n becomes large we have the desired inequality. In fact, already for n ≥ 4 we have the inequality, by numerical computation:

    2^(H(0.1)·2·4) ≈ 13.472678057860... ≤ 2^4 - 2
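These numerical claims are easy to reproduce (a sketch of mine, using the entropy function just defined):

```python
# Numeric companion (my addition) to the inequality 2^(2n H(0.1)) <= 2^n - 2.
from math import log2

def H(c):
    """Binary entropy H(c) = -c log2 c - (1-c) log2 (1-c)."""
    return -c * log2(c) - (1 - c) * log2(1 - c)

h = H(0.1)
print(round(h, 6))                   # 0.468996
n = 4
print(2 ** (2 * n * h), 2 ** n - 2)  # about 13.47 versus 14
```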
The derivative

    d/dt [ 2^t - 2 - 2^(H(0.1)·2t) ] = ln 2 · ( 2^t - 2·H(0.1) · 2^(H(0.1)·2t) )

is likewise positive for t ≥ 4, since for this value of c

    1 - 2 H(c) > 0

Thus, the desired inequality holds for all n ≥ 4. That is, for n ≥ 4 there is λ such that the length-2n code C_2n(λ) has minimum distance at least c · 2n = 2n/10.
///
Now we prove the lemma giving the necessary estimate on sums of binomial
coefficients used in the proof above.
Lemma: Fix 0 ≤ c ≤ 1/2. As usual let

    H(c) = -c log_2 c - (1 - c) log_2 (1 - c)

For positive integers ℓ,

    Σ_(i < cℓ) C(ℓ, i) ≤ 2^(ℓ·H(c))

Proof: Since c ≤ 1/2, we have c/(1 - c) ≤ 1, and so

    1 = (c + (1 - c))^ℓ = Σ_(0 ≤ i ≤ ℓ) C(ℓ, i) c^i (1 - c)^(ℓ-i)

      ≥ Σ_(0 ≤ i ≤ cℓ) C(ℓ, i) c^i (1 - c)^(ℓ-i)

      = Σ_(0 ≤ i ≤ cℓ) C(ℓ, i) (c/(1 - c))^i (1 - c)^ℓ

      ≥ Σ_(0 ≤ i ≤ cℓ) C(ℓ, i) (c/(1 - c))^(cℓ) (1 - c)^ℓ

      = ( (c/(1 - c))^c (1 - c) )^ℓ Σ_(0 ≤ i ≤ cℓ) C(ℓ, i)

      = 2^(-ℓ·H(c)) Σ_(0 ≤ i ≤ cℓ) C(ℓ, i)

since log_2 ( c^c (1 - c)^(1-c) ) = c log_2 c + (1 - c) log_2 (1 - c) = -H(c). Multiplying through by 2^(ℓ·H(c)) gives the asserted inequality.
///
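The lemma can be sanity-checked numerically for a few values of ℓ (a sketch of mine, using c = 0.3):

```python
# Spot-check (my addition) of the lemma: sum over i <= c*l of C(l, i) is at
# most 2^(l H(c)), sampled at c = 0.3 and a few lengths l.
from math import comb, log2

def H(c):
    """Binary entropy H(c) = -c log2 c - (1-c) log2 (1-c)."""
    return -c * log2(c) - (1 - c) * log2(1 - c)

c = 0.3
for l in (10, 50, 200):
    lhs = sum(comb(l, i) for i in range(int(c * l) + 1))  # i = 0, ..., floor(c*l)
    rhs = 2 ** (l * H(c))
    assert lhs <= rhs
print("lemma holds for the sampled cases")
```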
Example: Let's make a length-6 binary code with information rate 1/2 and determine its minimum distance by brute force. Model F_8 as F_2[x]/(x^3 + x + 1). Let α be the image of x in that field. First try λ = α and make the corresponding binary code. The encoding of F_8 in binary is by encoding v = a + bα + cα^2 as (a, b, c). Then, starting to compute the encodings of the 7 non-zero length-3 words, we have

    100   1          (1, α)                  100 010
    010   α          (α, α^2)                010 001
    001   α^2        (α^2, 1 + α)            001 110
    110   1 + α      (1 + α, α + α^2)        110 011
    101   1 + α^2    (1 + α^2, 1)            101 100
    011   α + α^2    (α + α^2, 1 + α + α^2)  011 111

It is clear that using λ = α has the disadvantage that some weight-1 length-3 codewords v will give (v, λv) of weight only 2, not enough to correct even a single error. Thus, we try this again with λ = α + 1:

    100   1          (1, 1 + α)              100 110
    010   α          (α, α + α^2)            010 011
    001   α^2        (α^2, 1 + α + α^2)      001 111
    110   1 + α      (1 + α, 1 + α^2)        110 101
    101   1 + α^2    (1 + α^2, α^2)          101 001
    011   α + α^2    (α + α^2, 1)            011 100
This is a relative success, in the sense that we have achieved a minimum distance
3, so one bit error can be corrected. This is not as good as the Hamming [7, 4]
code, however, since the information rate of the present code is 1/2 while that of
the Hamming code is 4/7 > 1/2.
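The brute-force computation just described is short enough to automate. A sketch of mine reproducing the minimum weight for λ = α + 1:

```python
# Brute force (my sketch, not the book's) over the mirage code v -> (v, l*v)
# in F_8 = F_2[x]/(x^3 + x + 1) with l = 1 + a: the minimum weight is 3.

MOD, DEG = 0b1011, 3           # modulus x^3 + x + 1

def mul(a, b):
    """Multiply two field elements (bitmask form), reducing mod x^3 + x + 1."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        b >>= 1
        a <<= 1
        if a >> DEG & 1:       # degree reached 3: subtract the modulus
            a ^= MOD
    return out

LAM = 0b011                    # the element 1 + a
weights = [bin(v).count("1") + bin(mul(LAM, v)).count("1") for v in range(1, 8)]
print(min(weights))  # 3
```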
Example: Now make a length-8 code of information rate 1/2, hoping to have a minimum distance greater than 3. Model F_16 as F_2[x]/(x^4 + x + 1). Let α be the image of x in the field. Encode

    a + bα + cα^2 + dα^3 → (a, b, c, d)

To avoid having the weight-1 length-4 binary words encode as weight 3 or less, we try

    λ = 1 + α + α^2

The weight-1 words encode as

    1000   1      (1, 1 + α + α^2)           1000 1110
    0100   α      (α, α + α^2 + α^3)         0100 0111
    0010   α^2    (α^2, 1 + α + α^2 + α^3)   0010 1111
    0001   α^3    (α^3, 1 + α^2 + α^3)       0001 1011
We get no word with Hamming weight less than 4. Continuing with our brute-force computations, encoding weight-2 length-4 words we have

    0011   0011 0100
    0101   0101 1100
    1001   1001 0101
    0110   0110 1000
    1010   1010 0001
    1100   1100 1001

Unfortunately, the fourth one encodes as a weight-3 codeword. Indeed, the Hamming bound says that this attempt is too optimistic, because a length-8 binary code with 2^4 = 16 codewords and minimum distance 5 would have to satisfy

    2^4 · ( 1 + C(8, 1) + C(8, 2) ) ≤ 2^8

which is false. Thus, we would need to consider longer mirage codes to achieve minimum distance 5 and correction of two bit errors. The computations are not illuminating, and we might concede once again that we do not understand how to explicitly construct good codes.
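The sphere-packing arithmetic is worth seeing explicitly (a sketch of mine):

```python
# The Hamming (sphere-packing) computation (my addition): a binary length-8
# code with 2^4 codewords and minimum distance 5 would need disjoint radius-2
# balls around codewords to fit inside the 2^8 words, which fails.
from math import comb

balls = 2 ** 4 * (1 + comb(8, 1) + comb(8, 2))
print(balls, 2 ** 8, balls <= 2 ** 8)  # 592 256 False
```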
18.2  Concatenated codes
The basic idea of a concatenated code, in our context, is that the symbols in the alphabet of the outer code Cout are interpreted as being the words of the inner code Cin, and after the outer encoding is applied the inner encoding is applied to each alphabet symbol. Sometimes such a process is also called superencoding, and the concatenated code is sometimes called a supercode.

We'll take both codes to be linear. More specifically, let Cout be a linear code with alphabet F_(2^k), and let Cin be a binary linear code of length n. (To identify elements of F_(2^k) with binary vectors of length k, as usual choose an irreducible polynomial P(x) in F_2[x] of degree k so that we have a model for F_(2^k) as F_2[x]/P(x).) If the outer code Cout is an (N, K, D) code (with alphabet F_(2^k)) and the inner code Cin is a binary (n, k, d) code, then a word

    a = (a_0, a_1, ..., a_(K-1)) ∈ F_(2^k)^K

is encoded by the outer code as some N-tuple

    b = (b_0, b_1, ..., b_(N-1)) ∈ F_(2^k)^N

Then each b_i ∈ F_(2^k) is rewritten as some binary vector

    b_i = (b_(i,0), b_(i,1), b_(i,2), ..., b_(i,k-1)) ∈ F_2^k

(for a choice of irreducible degree-k polynomial P(x) ∈ F_2[x]), and encoded by the inner code to

    c_i = (c_(i,0), c_(i,1), c_(i,2), ..., c_(i,n-1)) ∈ F_2^n
Proposition: The concatenated code made from an outer (N, K, D) code with alphabet F_(2^k) and an inner binary (n, k, d) code is essentially a binary (nN, kK)-code. That is, the information rate of such a concatenated code is equal to the product of the rates of the inner and outer codes.

Proof: This is almost clear from the very definition of the concatenated code:

    information rate = log_2(number of codewords) / log_2(number of all words) = kK / nN
Proposition: The minimum distance of such a concatenated code is at least Dd, the product of the minimum distances of the outer and inner codes.

Proof: Let D be the minimum distance of the outer code and d the minimum distance of the inner code. Let w = (β_0, ..., β_(N-1)) be a non-zero codeword in F_(2^k)^N. Then there are at least D nonzero entries among the β_i. Now viewing each β_i as a binary vector, if β_i is non-zero then it has at least d non-zero entries. Thus, the length-Nn binary version of w has at least Dd non-zero entries, which is to say that the minimum distance is at least Dd, as claimed.
///
Remark: These varying inner codes are the codes discussed earlier, among which
the really good ones are mirages since we cannot readily locate them. But the idea
is that even the average is good enough so that the supercode is still pretty good.
But of course we need to verify some details.
18.3  Justesen codes
In the discussion of the mirage codes we proved a lemma asserting the inequality

    Σ_(i < c·2k) C(2k, i) ≤ 2^(H(c)·2k)

We know that at least D/N of the codeword symbols from the outer code are not 0, and we want to use this inequality to estimate how good or bad the better half of the worst D/N fraction of the inner codes can be. That is, we are interested in guessing a fraction 0 < c < 1/2 so that at most the fraction (1/2)·D/N of the inner codes have minimum distance < c · 2k. That is, using the lemma, we want to find c so that

    2^(H(c)·2k) ≤ (1/2) · (D/N) · N

This simplifies to

    2^(H(c)·2k) ≤ (1/2) · D

Since then at least (1/2)·D of the inner codes at the non-zero positions have minimum distance ≥ c · 2k, we'll get a lower bound for the minimum distance of the supercode:

    (min distance for supercode) ≥ (c · 2k) · ((1/2) · D)
The length of the binary supercode is

    2k · N = 2k · (2^k - 1)

(A Reed-Solomon code over the field F_(2^k) has length N = 2^k - 1.)
If we try taking

    D = (1/2) · 2^k ≈ (1/2) · N

then we require

    2^(H(c)·2k) ≤ (1/2) · D = (1/2) · (1/2) · 2^k = (1/4) · 2^k
If we take c so that H(c) < 1/2, then for sufficiently large k this inequality will hold. Specifically, for

    k ≥ 2 / (1 - 2·H(c))

the inequality holds. For example, as with the mirage codes earlier, taking c = 1/10 gives

    H(c) = H(0.1) ≈ 0.469 < 1/2

Then with this choice of D and c we want k large enough such that

    (1/2) · D = (1/4) · 2^k ≥ 2^(H(c)·2k) ≈ 2^(0.469·2k)

For k ≥ 33 this holds. In any case, to get H(c) < 1/2 we must take c < 0.11, since

    H(0.11) ≈ 0.4999... < 1/2
Thus, for k ≥ 33 and c = 1/10 for example, with D = 2^k/2, the minimum distance is at least

    (c · 2k) · ((1/2) · D) ≥ ((1/10) · 2k) · ((1/2) · (1/2) · 2^k) = k · 2^k / 20
The length of the binary supercode is 2k(2^k - 1) ≈ 2k · 2^k, so we have

    (min distance)/(length) ≥ (k · 2^k / 20) / (2k · 2^k) = 1/40
From the relation D = N - K + 1 for a Reed-Solomon code, we see that the information rate for the outer code is K/N ≈ 1/2. The inner code information rates are all 1/2, so the information rate of the concatenated code is about 1/4. (The information rate goes to 1/4 as k goes to infinity.)

That is, for k = 33, 34, 35, ... we can make concatenated binary codes of lengths 2k(2^k - 1), with information rate going to 1/4 in the limit, and with minimum distance asymptotically at least 1/40 of the length.
///
Remark: We will use the fact that if α is a root of f(x) = 0 for irreducible f(x) ∈ F_2[x] of degree t, then α^2 is also, and in fact the complete list of roots is

    α, α^2, α^(2^2), α^(2^3), ..., α^(2^(t-1))
Proof: First, we check that the order t of 2 modulo 3^(ℓ+1) is exactly 2·3^ℓ = φ(3^(ℓ+1)). By Euler's theorem and Lagrange's theorem, t is a divisor of 2·3^ℓ. So what we want to show is that 2 is a primitive root in Z/3^(ℓ+1). From our discussion of primitive roots, we know that it suffices to show that 2 is a primitive root mod 3^2. And, indeed, 2^3 = 8 ≠ 1 mod 3^2 and 2^2 = 4 ≠ 1 mod 3^2, so 2 is a primitive root as desired.

Next, let P be any irreducible polynomial of degree 2m = 2·3^ℓ over F_2, so that we have a concrete model

    F_(2^(2m)) ≈ F_2[x]/P(x)

(Note that we do not have to give a constructive proof that P exists! Why?) Since 3^(ℓ+1) divides 2^(2m) - 1 (that is, since the order of 2 mod 3^(ℓ+1) is 2·3^ℓ = 2m) and since there is a primitive root g in F_(2^(2m)), the element

    γ = g^((2^(2m) - 1)/3^(ℓ+1))
has order 3^(ℓ+1). The polynomial x^(3^(ℓ+1)) - 1 has roots in F_(2^(2m)) consisting of all elements whose orders are powers of 3 less than or equal to 3^(ℓ+1). The element of order 1 (namely 1 itself) is a root of x - 1, the elements of order 3 are roots of

    (x^3 - 1)/(x - 1) = x^2 + x + 1

and in general the elements of order 3^i are roots of

    (x^(3^i) - 1)/(x^(3^(i-1)) - 1) = x^(2·3^(i-1)) + x^(3^(i-1)) + 1

The degrees of these factors add up correctly:

    1 + Σ_(i = 1..ℓ+1) 2·3^(i-1) = 1 + 2·(3^(ℓ+1) - 1)/(3 - 1) = 3^(ℓ+1)

That is, we've found a factorization of x^(3^(ℓ+1)) - 1. Since every factor but the last has degree strictly less than 2m, and since the irreducible polynomial of which γ is a root is a divisor of x^(3^(ℓ+1)) - 1, the last factor must be the irreducible polynomial of which γ is a root. This proves the proposition.
///
Exercises

18.01 Let λ be the image of x in F_4 = F_2[x]/(x^2 + x + 1). Write out the 3 non-zero codewords of the corresponding mirage code and observe the minimum distance. (ans.)

18.02 Let λ be the image of x + 1 in F_4 = F_2[x]/(x^2 + x + 1). Write out the 3 non-zero codewords and observe the minimum distance.

18.03 Let F_8 = F_2[x]/(x^3 + x^2 + 1), let λ be the image of 1 + x, and write out the encoding of the corresponding mirage code of length 6 and information rate 1/2. Note the minimum distance.

18.04 Can you find λ ∈ F_32 to give a mirage code of length 10 with minimum distance at least 4?

18.05 Find a constant c such that

    Σ_(i < c·3n) C(3n, i) < 2^(2n)

(ans.)

18.06 In a variant on mirage codes, encode binary n-bit strings v by v → (v, λv, μv), viewing v as lying in F_(2^n) and with λ, μ chosen in F_(2^n) to make a binary code of length 3n with information rate 1/3. As with the mirage codes, prove a lower bound on the error-correction rate for such a code, with optimal (but inexplicit) λ and μ. (ans.)

18.07 Prove that the minimum distance of a concatenated code is at least the product of the minimum distances of the inner and outer codes.
19
More on Rings and Fields

Here we develop in greater detail some aspects of ring theory touched upon very briefly earlier.
Example: Let R = k[x] be the ring of polynomials in one variable x with coefficients in a field k. Fix a polynomial P(x), and let I ⊂ R be the set of all polynomial multiples M(x) · P(x) of P(x). Verification that I is an ideal is identical in form to the previous example.

Example: Abstracting the previous two examples: let R be any commutative ring with unit 1, and fix n ∈ R. Then the set I = n · R = {rn : r ∈ R} consisting of all multiples of n is an ideal, called the principal ideal generated by n; the same argument as before proves that it is an ideal.

Example: In any ring, the trivial ideal is just the set I = {0}. Consistent with typical usage in mathematics, an ideal I is proper if it is neither the trivial ideal {0} nor the whole ring R (which is also an ideal).
The following proposition is an important basic principle.

Proposition: Let I be an ideal in a commutative ring R with unit 1. If I contains any unit u ∈ R^×, then I = R.

Proof: Suppose I contains u ∈ R^×. The fact that u is a unit means that there is a multiplicative inverse u^(-1) to u. Then, for any r ∈ R,

    r = r · 1 = r · (u^(-1) · u) = (r · u^(-1)) · u

That is, r is a multiple of u. Since I is an ideal, it must contain every multiple of u, so I contains r. Since this is true of every element r ∈ R, it must be that R = I.
///
Proof: This will follow from the previous proposition if we check that non-zero constant polynomials are units (that is, have multiplicative inverses). Indeed, for a ∈ k with a ≠ 0, since k is a field there is a^(-1) ∈ k ⊂ k[x]. Thus, certainly a is invertible in the polynomial ring k[x].
///
We can recycle the notation we used for cosets to write about ideals in a more economical fashion. For two subsets X, Y of a ring R, write

    X + Y = {x + y : x ∈ X, y ∈ Y}

    X · Y = {finite sums Σ_i x_i y_i : x_i ∈ X, y_i ∈ Y}

Note that in the context of ring theory the notation X · Y has a different meaning than it does in group theory. Then we can say that an ideal I in a commutative ring R is an additive subgroup so that R · I ⊂ I.
also lies in I. That is, there is indeed a monic polynomial P of lowest degree in the ideal. Let x ∈ I, and use the Division Algorithm to get Q, R ∈ k[x] with deg R < deg P and

    x = Q · P + R

Certainly Q · P is still in I, and then -Q · P ∈ I also. Since R = x - Q · P, we conclude that R ∈ I. Since P was the monic polynomial in I of smallest degree, it must be that R = 0. Thus, x = Q · P ∈ P · k[x], as desired.
///
Remark: The proofs of these two propositions can be abstracted to prove that
every ideal in a Euclidean ring is principal.
Example: Let R be a commutative ring with unit 1, and fix two elements x, y ∈ R. Then

    I = R · x + R · y = {rx + sy : r, s ∈ R}

is an ideal in R. This is checked as follows. First,

    0 = 0 · x + 0 · y

so 0 lies in I. Second,

    -(rx + sy) = (-r)x + (-s)y

so I is closed under additive inverses.
Example: To construct new, larger ideals from old, smaller ideals we can proceed as follows. Let I be an ideal in a commutative ring R. Let x be an element of R. Then let

    J = R · x + I = {rx + i : r ∈ R, i ∈ I}

Let's check that J is an ideal. First,

    0 = 0 · x + 0

so 0 lies in J. Second,

    -(rx + i) = (-r)x + (-i)

so J is closed under additive inverses. Third, for two elements rx + i and r′x + i′ in J (with r, r′ ∈ R and i, i′ ∈ I) we have

    (rx + i) + (r′x + i′) = (r + r′)x + (i + i′)

so J is closed under addition. Finally, for rx + i ∈ J with r ∈ R, i ∈ I, and for r′ ∈ R,

    r′ · (rx + i) = (r′r)x + (r′ · i)

so R · J ⊂ J as required. Thus, this type of set J is indeed an ideal.
Remark: In the case of rings such as Z, where we know that every ideal is
principal, the previous construction does not yield any more general type of ideal.
Remark: In some rings R, it is definitely the case that not every ideal is principal. That is, there are some ideals that cannot be expressed as R · x. The simplest example is the following. Let

    R = {a + b·√-5 : a, b ∈ Z}

It is not hard to check that this is a ring. Let

    I = {x · 2 + y · (1 + √-5) : x, y ∈ R}

With just a little bit of cleverness, one can show that this ideal is not principal. This phenomenon is closely related to the failure of unique factorization into primes in this ring. For example, we have two apparently different factorizations

    2 · 3 = 6 = (1 + √-5) · (1 - √-5)

(All the numbers 2, 3, 1 + √-5, 1 - √-5 are prime in the naive sense that they can't be further factored in the ring R.) These phenomena are not of immediate relevance, but did provide considerable motivation in the historical development of algebraic number theory.
Remark: In rings R that are not necessarily commutative, there are three different kinds of ideals. A left ideal I is an additive subgroup so that R · I ⊂ I, a right ideal I is an additive subgroup so that I · R ⊂ I, and a two-sided ideal I is an additive subgroup so that R · I · R ⊂ I. Mostly we'll only care about ideals in commutative rings, so we can safely ignore this complication most of the time.
19.2  Ring homomorphisms

The reduction map x → x-mod-n respects addition, and

    (x-mod-n) · (y-mod-n) = (x · y)-mod-n

Even though it is slightly misleading, this homomorphism is called the reduction mod n homomorphism.
Now we prove that the kernel of any ring homomorphism f : R → S is an ideal in R. Let x be in the kernel, and r ∈ R. Then

    f(rx) = f(r)f(x) = f(r) · 0 = 0

since by now we've proven that in any ring the product of anything with 0 is 0. Thus, rx is in the kernel of f. And, for x, y both in the kernel,

    f(x + y) = f(x) + f(y) = 0 + 0 = 0

That is, x + y is again in the kernel. And f(0) = 0, so 0 is in the kernel. And for x in the kernel, f(-x) = -f(x) = -0 = 0, so -x is in the kernel.
///
The evaluation map e_(r0) also respects multiplication: writing g = Σ_i a_i x^i and h = Σ_j b_j x^j,

    e_(r0)(g · h) = Σ_k ( Σ_(i+j=k) a_i b_j ) r0^k = ( Σ_i a_i r0^i ) · ( Σ_j b_j r0^j ) = e_(r0)(g) · e_(r0)(h)

///
Remark: Notice that, unlike the discussion about the additive identity, here we
need the further hypothesis of surjectivity. Otherwise the assertion is false: see the
remark after the proof.
Remark: It is important to note that it is not necessarily true that the image of the multiplicative identity 1_R under a ring homomorphism f : R → S has to be the multiplicative identity 1_S of S. For example, define a ring homomorphism

    f : Q → S

from the rational numbers Q to the ring S of 2-by-2 rational matrices by

    f(x) = [ x 0 ]
           [ 0 0 ]

Then the image of 1 is

    [ 1 0 ]
    [ 0 0 ]

which is not the multiplicative identity

    [ 1 0 ]
    [ 0 1 ]

in the ring S.
There are also examples in commutative rings where the unit is mapped to something other than the unit. For example, let R = Z/3 and S = Z/6, and define f : R → S by

    f(r mod 3) = 4r mod 6

Check that this is well-defined: if r = r′ mod 3, then 3 | (r - r′). Then surely 6 | 4(r - r′), so indeed 4r = 4r′ mod 6. This proves well-definedness. Check that this is a homomorphism:

    f(x + y) = 4(x + y) = 4x + 4y = f(x) + f(y)

This would have worked with any number, not just 4. To see that f preserves multiplication, the crucial feature of the situation is that

    4 · 4 = 4 mod 6

Then

    f(x · y) = 4(x · y) = (4 · 4)(x · y) = (4x) · (4y) = f(x) · f(y)

Thus, f is a homomorphism. But f(1) = 4 ≠ 1.
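This example can be checked exhaustively, since Z/3 has only three elements (a sketch of mine):

```python
# Exhaustive check (my addition) that f(r mod 3) = 4r mod 6 preserves both
# operations even though f(1) = 4 != 1; the key identity is 4 * 4 = 4 mod 6.

def f(r):
    return 4 * r % 6

for x in range(3):
    for y in range(3):
        assert f((x + y) % 3) == (f(x) + f(y)) % 6   # additive
        assert f((x * y) % 3) == (f(x) * f(y)) % 6   # multiplicative
print(f(1))  # 4, not the unit of Z/6
```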
19.3  Quotient rings
Proof: Let x + I be a non-zero element of R/I. Then x + I ≠ I, so x ∉ I. Note that the ideal Rx + I is therefore strictly larger than I. Since I was already maximal, it must be that Rx + I = R. Therefore, there are r ∈ R and i ∈ I so that rx + i = 1. Looking at this last equation modulo I, we have rx ≡ 1 mod I. That is, r + I is the multiplicative inverse to x + I. Thus, R/I is a field.

On the other hand, suppose that R/I is a field. Let x ∈ R but x ∉ I. Then x + I ≠ 0 + I in R/I. Therefore, x + I has a multiplicative inverse r + I in R/I. That is,

    (r + I) · (x + I) = 1 + I

From the definition of the multiplication in the quotient, this is rx + I = 1 + I, or 1 ∈ rx + I, which implies that the ideal Rx + I is R. But Rx + I is the smallest ideal containing I and x. Thus, there cannot be any proper ideal strictly larger than I, so I is maximal.
///
19.5  Field extensions
suppose that q(R(x)) = q(S(x)) for two polynomials R, S of degrees less than the degree of P. Then R(x) = S(x) mod P(x), which is to say that P(x) divides R(x) - S(x). Since the degree of R(x) - S(x) is strictly less than that of P(x), this can happen only for R(x) = S(x). This is the desired result.
///
Corollary: When the field k is finite with q elements, for an irreducible polynomial
P of degree n, the field extension K = k[x]/P(x) has q^n elements.

Proof: Let α be the image of x in K. We use the fact that every element of K
has a unique expression as R(α) for a polynomial R of degree less than n. There
are q choices for each of the n coefficients (for powers of α ranging from 0 to n − 1),
so there are q^n elements altogether.
///
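The counting in this corollary can be made concrete (our own sketch, not from the text): representing residues modulo a degree-n polynomial by their coefficient tuples makes the q^n count visible.

```python
from itertools import product

# Elements of K = F_q[x]/P(x), for P irreducible of degree n, are uniquely
# represented by coefficient tuples (c_0, ..., c_{n-1}) of remainders of
# degree < n, giving q^n elements. Here q = 2, n = 2 (e.g. P = x^2 + x + 1).
q, n = 2, 2
elements = list(product(range(q), repeat=n))
print(len(elements))   # q^n = 4
```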
Proof: Again, since the Frobenius map σ is just taking q-th powers and K is
closed under multiplication, σ maps K to itself. What needs more attention is the
injectivity and surjectivity. One way to prove injectivity is to note that the kernel
of σ^N (or of any power of σ) is {0}, so σ^N is injective, by the trivial-kernel criterion
for injectivity of a ring homomorphism. And for functions from a finite set to itself,
injectivity implies surjectivity. Also, some thought should make clear that proving
σ^N is the identity map on K certainly is sufficient to prove that σ is both injective
and surjective.

The multiplicative group K^× is of order q^N − 1, so by Lagrange's theorem and
its corollaries the order of any α ∈ K^× is a divisor of q^N − 1, and

α^(q^N − 1) = 1

Then

σ^N(α) = α^(q^N) = α^(q^N − 1) · α = 1 · α = α
///
Proposition: The Frobenius map σ restricted to k is the identity map. That is,
for every α in k = F_q, σ(α) = α. If α ∈ K has the property that σ(α) = α, then
in fact α ∈ k.

Proof: Half of the proposition is really just a corollary of Lagrange's theorem. The
first point is that the multiplicative group k^× of nonzero elements in k has q − 1
elements. So, by Lagrange's theorem and its corollaries, the order of any element
in k^× is a divisor d of q − 1, and, further, α^(q−1) = 1 for that reason. Then for nonzero
α ∈ k we have

σ(α) = α^q = α^(q−1) · α = 1 · α = α

And certainly 0^q = 0, so this proves half of the proposition.

Now suppose that α ∈ K and σ(α) = α. By the definition of σ this means
that α is a solution of the equation x^q − x = 0 lying inside the field K. By unique
factorization of polynomials (with coefficients in a field), we know that a polynomial
equation of degree q has at most q roots in a field. We already found q roots of
this equation, namely the elements of the smaller field k sitting inside K. So there
simply can't be any other roots of that equation other than the elements of k. This
shows that σ(α) = α implies α ∈ k, which is the second half of the proposition. ///
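A hands-on check of this proposition (our own sketch, not from the text): represent F_9 = F_3[x]/(x^2 + 1) as pairs (a, b) standing for a + b·i with i^2 = −1 = 2 mod 3, and verify that σ(z) = z^3 fixes exactly the prime field F_3.

```python
# Verify over F_9 that the Frobenius z -> z^3 fixes exactly F_3 = {(a, 0)}.
p = 3

def mul(u, v):
    # multiplication of a + b*i and c + d*i with i^2 = 2 (i.e. -1 mod 3)
    a, b = u
    c, d = v
    return ((a * c + 2 * b * d) % p, (a * d + b * c) % p)

def power(u, e):
    r = (1, 0)
    for _ in range(e):
        r = mul(r, u)
    return r

for a in range(p):
    for b in range(p):
        z = (a, b)
        fixed = (power(z, p) == z)   # sigma(z) == z ?
        assert fixed == (b == 0)     # fixed exactly when z lies in F_3
print("Frobenius fixes exactly the prime field F_3 inside F_9")
```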
Remark: A more systematic development of general field theory would make the
result of the last lemma much clearer, but would have taken more time altogether
than the funny proof given below.
Proof: By Lagrange's theorem and its corollaries, in the group obtained by taking
K with its addition (ignoring for the moment its multiplication), the order of any
element α is a divisor of the order q^N of the group, and

α + · · · + α = 0   (with q^N summands α)

Since q = p^n, this is

α + · · · + α = 0   (with p^(nN) summands α)

Taking α = 1_K, and noting that 1_K + · · · + 1_K with p^(nN) summands is the
(nN)-th power of 1_K + · · · + 1_K with p summands, we find

(1_K + · · · + 1_K)^(nN) = 0   (with p summands inside the parentheses)

Since K is a field, whenever the product of several elements is 0, one of the factors
is itself 0. Thus, 1_K + · · · + 1_K with p summands is 0, as asserted in the lemma.
And then, for any α,

α + · · · + α = α · 1_K + · · · + α · 1_K = α · (1_K + · · · + 1_K) = α · 0 = 0   (with p summands in each sum)
///
Proposition: The Frobenius map σ of K over k has the property that for any
α, β in K

σ(α + β) = σ(α) + σ(β)
σ(α · β) = σ(α) · σ(β)

That is, σ preserves addition and multiplication. Since we already saw that σ is
bijective, σ is said to be a ring isomorphism.

Proof: The second assertion, about preserving multiplication, is simply the assertion that the q-th power of a product is the product of the q-th powers. This is true
in great generality as long as the multiplication is commutative, which it is here.
This doesn't depend at all on what the particular exponent is.

The proof that σ preserves addition makes quite sharp use of the fact that the
exponent is q, which is a power of a prime number p. This wouldn't work for other
exponents. To start with, we claim that for α, β in K

(α + β)^p = α^p + β^p
Granting this claim (the binomial coefficients with 0 < i < p are divisible by p, so
the cross terms vanish), repeated application gives

(α + β)^(p^2) = (α^p + β^p)^p = α^(p^2) + β^(p^2)

(α + β)^(p^3) = (α^(p^2) + β^(p^2))^p = α^(p^3) + β^(p^3)

and so on, so by induction we could prove that

(α + β)^(p^(nN)) = α^(p^(nN)) + β^(p^(nN))
///
Proposition: Let

A = {α_1, ..., α_t}

be a set of (t distinct) elements of K, with the property that for any α in A, σ(α)
is again in A. Then the polynomial

(x − α_1)(x − α_2) · · · (x − α_t)

(when multiplied out) has coefficients in k.
Extend σ to polynomials with coefficients in K by applying σ to each coefficient.
This extended map satisfies

σ(P + Q) = σ(P) + σ(Q)
σ(P · Q) = σ(P) · σ(Q)

since σ preserves addition and multiplication in K. Applying σ to the product
(x − α_1)(x − α_2) · · · (x − α_t) therefore only permutes the factors, since σ permutes
the α_i, so the multiplied-out polynomial Σ_i c_i x^i is unchanged by σ.
The meaning of equality for polynomials is that the corresponding coefficients are
equal, so the previous equality implies that σ(c_i) = c_i for all indices i. By now
we know that this implies that c_i ∈ k, for all indices i.
///
Proof: Consider the successive images σ^i(α) of α under the Frobenius map. Since
the field is finite, at some point σ^i(α) = σ^j(α) for some 0 ≤ i < j. Since σ is a
bijection of K to K, it has an inverse map σ^(−1). Applying this inverse i times to
the equation σ^i(α) = σ^j(α), we find

α = σ^0(α) = σ^(j−i)(α)

That is, we may take i = 0. That means that for the smallest j so that σ^j(α) is already
σ^i(α) for some 0 ≤ i < j, in fact this duplication occurs as

σ^j(α) = α

rather than duplicating some other element farther along on the list. Let

α, σ(α), ..., σ^(d−1)(α)

be the distinct images of α under the Frobenius map. We just saw that σ^d(α) = α.
Let

P(x) = (x − α)(x − σ(α))(x − σ^2(α)) · · · (x − σ^(d−1)(α))

As just above, application of σ to P only permutes the factors on the right-hand
side, by shifting indices forward by one, and wrapping around at the end since
σ^d(α) = α. Thus, when multiplied out, the polynomial P is unchanged by application of σ, so has coefficients in the smaller field k. We saw this phenomenon already
in the discussion of the Frobenius map. And visibly α is a root of the equation
P(x) = 0.

From just above, if β is a root in K of a polynomial equation with coefficients
in the smaller field k, then σ(β) is also a root. So any polynomial with coefficients
in k of which α is a zero must have factors x − σ^i(α) as well, for 1 ≤ i < d. By
unique factorization of polynomials with coefficients in a field, this shows that P
is the unique such polynomial of smallest degree.

In particular, P must be irreducible in k[x], because if it properly factored
in k[x] as P = P_1 · P_2 then (by unique factorization) α would be a root of either
P_1(x) = 0 or P_2(x) = 0, and then all the d distinct elements σ^i(α) would be roots
of the same equation as well. Since the number of roots is at most the degree, there
cannot be any proper factorization, so P is irreducible in k[x].
///
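The orbit construction can be checked in a small case (our own sketch, not from the text): in F_9 = F_3[x]/(x^2 + 1), represented below as pairs (a, b) meaning a + b·i, the orbit of α = i under σ(z) = z^3 is {i, 2i}, and multiplying out (x − α)(x − σ(α)) gives a polynomial with coefficients in F_3.

```python
# Multiply out P(x) = (x - alpha)(x - sigma(alpha)) over F_9 = F_3[x]/(x^2+1)
# and confirm its coefficients lie in the prime field F_3.
p = 3

def mul(u, v):                        # multiplication in F_9, with i^2 = 2
    a, b = u
    c, d = v
    return ((a * c + 2 * b * d) % p, (a * d + b * c) % p)

def add(u, v):
    return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)

def neg(u):
    return ((-u[0]) % p, (-u[1]) % p)

alpha = (0, 1)                        # i, a root of x^2 + 1
frob = lambda z: mul(mul(z, z), z)    # sigma(z) = z^3

orbit = [alpha, frob(alpha)]          # the distinct images {i, 2i}
poly = [(1, 0)]                       # constant polynomial 1, coeffs in F_9
for root in orbit:
    # multiply poly by (x - root), coefficients listed constant-first
    new = [(0, 0)] * (len(poly) + 1)
    for k, c in enumerate(poly):
        new[k + 1] = add(new[k + 1], c)              # contribution of x * c x^k
        new[k] = add(new[k], mul(neg(root), c))      # contribution of -root * c x^k
    poly = new

print(poly)   # [(1, 0), (0, 0), (1, 0)], i.e. x^2 + 1, with coefficients in F_3
assert all(b == 0 for (_, b) in poly)
```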
///
We need to develop one further abstraction. Let e denote the identity map of
K = F_q[x]/Q to itself, and let

G = {e, σ, σ^2, ..., σ^(n−1)}

where Q is of degree n. This is a set of maps of K to itself. As noted above, each
one of these maps when restricted to F_q is the identity map on F_q. Since each σ^i
is the identity on F_q and maps K bijectively to itself, we say that G is a set of
automorphisms of K over F_q.
Proposition: This set G of automorphisms of K over Fq is a group, with identity
e.
///
Proof: What we should really claim here is that the collection of distinct images
σ^i(α) is naturally in bijection with the collection of cosets G/G_α, where G_α is the
stabilizer subgroup of α in the automorphism group G. Indeed, if g ∈ G and h ∈ G_α,
then

(gh)(α) = g(h(α)) = g(α)

This proves that gG_α → g(α) is well-defined. And if g(α) = g′(α), then α =
g^(−1)g′(α), so g^(−1)g′ is in the stabilizer subgroup G_α. This proves that no two distinct
cosets gG_α and g′G_α of G_α send α to the same thing.
///
Corollary: For α in the field K = k[x]/Q, the degree of the unique monic irreducible polynomial P with coefficients in k so that P(α) = 0 is a divisor of the
degree n of Q.
Remark:
///
19.7 Counting irreducibles
Recall that the Möbius function μ is defined by

μ(n) = 0 if n is divisible by the square of some prime
μ(n) = (−1)^t if n is the product of t distinct primes

The count N_n of irreducible monic polynomials of degree n in F_q[x] is then

N_n = (1/n) Σ_{1 ≤ d ≤ n, d|n} μ(d) q^(n/d)
    = (1/n) ( q^n − Σ_{p_1|n} q^(n/p_1) + Σ_{p_1,p_2|n} q^(n/(p_1 p_2)) − Σ_{p_1,p_2,p_3|n} q^(n/(p_1 p_2 p_3)) + · · · )

where the p_i run over distinct primes dividing n.
///
For example, for n = p_1 p_2 the product of two distinct primes, this gives

N_n = ( q^(p_1 p_2) − q^(p_1) − q^(p_2) + q ) / (p_1 p_2)

Corollary: If n = p_1^(e_1) is a prime power, then

N_n = ( q^(p_1^(e_1)) − q^(p_1^(e_1 − 1)) ) / n

///
///
Proof: The quotient ring L = k[x]/P is a field. Let α be the image of x there.
We know that P(α) = 0, and from the discussion of the Frobenius map we know that

P(x) = (x − α)(x − σ(α))(x − σ^2(α)) · · · (x − σ^(d−1)(α))

By Lagrange's theorem and its corollaries, we know that α^(q^d − 1) = 1, since the order
of L^× is q^d − 1. By unique factorization of polynomials with coefficients in a field,
this implies that P(x) divides x^(q^d − 1) − 1 as polynomials with coefficients in k = F_q.

On the other hand, the existence of a primitive root g in K means exactly
that g^(q^n − 1) = 1 but no smaller positive exponent makes this true. And, thus, the
elements g^1, g^2, g^3, ..., g^(q^n − 1) are all distinct (and nonzero). For any integer t

(g^t)^(q^n − 1) = (g^(q^n − 1))^t = 1^t = 1

so each g^t is a root of x^(q^n − 1) − 1 = 0, and

x^(q^n − 1) − 1 = (x − g^1)(x − g^2)(x − g^3) · · · (x − g^(q^n − 1))
Proof: (of theorem) At last we can count the elements of K by grouping them in
d-tuples of roots of irreducible monic polynomials with coefficients in
19.8 Counting primitives
By Möbius inversion we obtain the formula

n · N_n = Σ_{d|n} μ(d) q^(n/d)
///
///
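The counting formula can be checked directly (our own illustration, not from the text): over F_2 there are exactly 2 irreducible monic cubics, namely x^3 + x + 1 and x^3 + x^2 + 1.

```python
# Evaluate N_n = (1/n) * sum over d|n of mu(d) * q^(n/d).

def mu(n):
    # Moebius function: 0 on numbers with a square factor, else (-1)^(#prime factors)
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0          # square factor present
            primes.append(d)
        else:
            d += 1
    if n > 1:
        primes.append(n)
    return (-1) ** len(primes)

def count_irreducibles(q, n):
    total = sum(mu(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

print(count_irreducibles(2, 3))   # 2  (x^3+x+1 and x^3+x^2+1)
print(count_irreducibles(2, 4))   # 3
```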
Proof: Without loss of generality, we only consider the case n > 1, since the linear
case can be treated separately and easily. In particular, this excludes the case that
an irreducible polynomial is divisible by x.
x^(q^n − 1) = 1 mod Q(x)

Letting α be the image of x in F_q[x]/Q, this is

α^(q^n − 1) = 1

which is equivalent to

α^(q^n) = α

which is

σ^n(α) = α
From the discussion of the Frobenius automorphism, this implies that the unique
monic irreducible polynomial f(x) in F_q[x] so that f(α) = 0 is of degree at most n.
At the same time, in the construction of finite fields we saw that Q(α) = 0 as well.
As a corollary of the discussion of the Frobenius automorphism, there is exactly
one monic irreducible polynomial f so that f(α) = 0, so Q = f. Since f has degree
at most n, the degree of Q is at most n. Thus, all the irreducible factors of the
cyclotomic polynomial φ_N are of degree n, where N = q^n − 1.

Finally, we observe that primitive polynomials are necessarily irreducible. Indeed, a primitive polynomial Q of degree n in F_q[x] divides the cyclotomic polynomial φ_N with N = q^n − 1. Just above we proved that all the irreducible factors
of φ_N are of degree n, so by unique factorization Q has no alternative but to be
irreducible.
///
Exercises
19.01 Let N be an integer. Prove carefully that N Z is an ideal in Z.
19.02 Fix an integer N > 1. Prove carefully that the map f : Z → Z/NZ given
by f(x) = x + NZ is a ring homomorphism.
19.03 Show that x² − y² = 102 has no solution in integers. (ans.)
19.04 Show that x3 + y 3 = 3 has no solution in integers. (ans.)
19.05 Show that x3 + y 3 + z 3 = 4 has no solution in integers. (ans.)
19.06 Show that x² + 3y² + 6z³ − 9w⁵ = 2 has no solution in integers.
19.07 Let I, J be two ideals in a ring R. Show that I ∩ J is also an ideal in R.
19.08 Let I, J be two ideals in a ring R. Let
I + J = {i + j : i ∈ I and j ∈ J}
Show that I + J is an ideal.
19.09 Let f : R → S be a surjective ring homomorphism (with R, S commutative,
for simplicity). Let I be an ideal in R. Show that J = {f(i) : i ∈ I} is an
ideal in S.
19.10 Let f : R → S be a ring homomorphism (with R, S commutative, for simplicity). Let J be an ideal in S. Show that I = {x ∈ R : f(x) ∈ J} is an ideal
in R.
19.11 Show that there is no element x ∈ F_13 so that x⁴ + x³ + x² + x + 1 = 0.
(ans.)
19.12 Show that there is no solution to x² + 1 = 0 in F_11.
19.13 Consider the polynomial ring Z[x] in one variable over the ring of integers.
Show that the ideal
I = Z[x] · 2 + Z[x] · x
generated by 2 and x is not principal, that is, that there is no single polynomial f (x) such that I consists of all polynomial multiples of f (x). (ans.)
19.14 Let k be a field. Show that in the polynomial ring k[x, y] in two variables
the ideal I = k[x, y] · x + k[x, y] · y is not principal.
19.15 (*) Show that the maximal ideals in R = Z[x] are all of the form I =
R · p + R · f(x) where p is a prime and f(x) is a monic polynomial which is
irreducible modulo p.
20
Curves and Codes
20.1 Plane curves
20.2 Singularities of curves
20.3 Projective plane curves
20.4 Curves in higher dimensions
20.5 Genus, divisors, linear systems
20.6 Geometric Goppa codes
20.7 The Tsfasman-Vladut-Zink-Ihara bound
The material of this chapter is significantly more difficult than earlier ones, and
we are very far from giving a complete treatment. Indeed, the theory of algebraic
curves over fields of positive characteristic was only developed within the last 60
years or so, and is not elementary. An introductory treatment of the applications
to coding theory, at a similar level, is to be found in [Walker 2000].
The discovery described in [Tsfasman Vladut Zink 1982] of codes exceeding
the Gilbert-Varshamov bound was sensational not only because these codes were
so good, but also because their construction used such esoteric mathematics.
We cannot pretend to do justice to the theory of algebraic curves or to the
additional mathematics necessary to fully explain these codes, but can only give an
approximate idea of the more accessible aspects of these ideas.
We may add the modifier algebraic to the word curve to emphasize that we use
only algebra to define this set of points, rather than transcendental functions such
as ex , which might not work right over finite fields anyway. The field k is sometimes
called the field of definition of the curve.
Remark: Often we say "the curve f(x, y) = 0" rather than the more proper "the
curve defined by f(x, y) = 0". But do not be deceived: an equation is not a curve.
Actually, what we have defined here is the set of k-rational points on a curve.
For example, taking k to be the real numbers R the curve x² + y² = −1 has no
(real-valued) points on it at all, since the square of a real number is non-negative.
Yet taking k to be the complex numbers C it has infinitely many.
Remark: Unlike the family of examples y = f (x), in general it is a highly nontrivial problem to figure out what rational points lie on a curve, or even whether
there are any at all.
Example: Consider the curve X defined over the rational numbers Q by x2 +y 2 =
1. This is the set of points on the usual unit circle in the plane R2 which happen
to have rational coordinates (here rational means in Q). We can systematically
find all the rational points on this curve, as follows. First, there is at least one
easy point to notice on the curve: (1, 0) is such. Second, and this is considerably
more subtle, it turns out that if we consider any (straight) line y = t(x − 1)
passing through the point (1, 0) with rational slope t, then the other point in
which this line intersects the circle is also rational. (The converse is easier: for
any other rational point on the curve, the straight line connecting it to (1, 0) will
have rational slope.) Let's verify the claim that the second point of intersection is
rational: we want to solve the system of equations
x² + y² = 1
y = t(x − 1)
where t is viewed as a fixed parameter. The situation suggests that we replace y
by t(x 1) in the first equation, to obtain
x² + t²(x − 1)² = 1
This might look unpromising, but we know in advance that this quadratic equation
in x has one root x = 1. Recall that the discriminant of a quadratic equation
ax² + bx + c = 0 is defined to be

Δ = b² − 4ac

and the quadratic formula (obtained really by simply completing the square) gives
a formula for the roots of the quadratic equation ax² + bx + c = 0: the roots are

roots = ( −b ± √(b² − 4ac) ) / 2a
Since the parameter t is rational, the discriminant of the quadratic equation is
rational. Thus, the other root must also be rational, however unclear this is from
the equation itself. This should give us sufficient courage to go ahead with the
computation. Rearrange the equation to the standard form
(1 + t²)x² − 2t²x + (t² − 1) = 0
Then, invoking the quadratic formula,
the roots of x² + t²(x − 1)² = 1 are

x = ( 2t² ± √( 4t⁴ − 4(1 + t²)(t² − 1) ) ) / ( 2(1 + t²) )
  = ( t² ± √( t⁴ + (1 + t²)(1 − t²) ) ) / ( 1 + t² )
  = ( t² ± √( t⁴ + 1 − t⁴ ) ) / ( 1 + t² )
  = ( t² ± 1 ) / ( t² + 1 )

That is, the two roots are

x = 1  and  x = (t² − 1)/(t² + 1)

and the y-coordinate of the second point of intersection is

y = t · ( (t² − 1)/(t² + 1) − 1 ) = −2t/(t² + 1)

Alternatively, the rearranged equation (1 + t²)x² − 2t²x + (t² − 1) = 0
shows that the sum of the two roots is 2t²/(t² + 1). We know that one of the two
roots is 1, so the other root is

2t²/(t² + 1) − 1 = (t² − 1)/(t² + 1)
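This parametrization can be checked numerically (our own sketch, not from the text), using exact rational arithmetic:

```python
from fractions import Fraction

# The line y = t(x - 1) through (1, 0) meets x^2 + y^2 = 1 again at
# ((t^2 - 1)/(t^2 + 1), -2t/(t^2 + 1)), rational whenever the slope t is.
def second_point(t):
    t = Fraction(t)
    x = (t * t - 1) / (t * t + 1)
    y = -2 * t / (t * t + 1)
    return x, y

for t in (Fraction(1, 2), Fraction(2, 3), Fraction(7)):
    x, y = second_point(t)
    assert x * x + y * y == 1          # the point really lies on the circle
    assert y == t * (x - 1)            # and on the line of slope t through (1, 0)
print(second_point(Fraction(1, 2)))    # (Fraction(-3, 5), Fraction(-4, 5))
```

The slope t = 1/2 recovers the familiar 3-4-5 Pythagorean triple, up to signs.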
Remark: In the last example we used the fact that a quadratic equation has (at
most) two roots, and that if the coefficients of the equation are rational, and if
one root is rational, then so is the other root. More generally, as a consequence of
unique factorization of polynomials with coefficients in a field, we have:
Proof: We have earlier observed that α is a root of the equation f(x) = 0 if and
only if the linear polynomial x − α divides the polynomial f(x). From this, and
from unique factorization, we conclude that each of x − α_1, x − α_2, ..., x − α_(n−1)
divides f(x), and so does the product

(x − α_1)(x − α_2) · · · (x − α_(n−1))

We also know by now that the degree of f(x) divided by this product will be the
difference of their degrees, hence just 1. That is, this division will leave a linear
factor c(x − α_n) (with c ≠ 0). Since the computation takes place inside k[x], the
coefficients c and c·α_n are in k. Thus, α_n is necessarily in k, since c is non-zero. This
α_n is the last root of the equation.
///
20.2 Singularities of curves
be two real square roots y of that expression, although only occasionally will these
square roots be rational.
This phenomenon is even more pronounced over finite fields Fq . For example,
over F_5 = Z/5 the equation

y² = x⁵ − x + 2

has no rational points at all: by Fermat's little theorem x⁵ − x = 0 for every x in
F_5, so for x, y in F_5 the equation is equivalent to y² = 2. But 2 is not a square in
Z/5, so there is no such y in F_5. Thus, the set of rational points of this curve is
the empty set. This might correctly seem to be missing some information, and it
is indeed so. We should consider the family of all finite fields F5n containing the
little field F5 in this case, and ask about points (x, y) with x, y in F5n satisfying
y 2 = x5 x + 2 as n gets larger and larger.
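The emptiness of the F_5-point set is easy to confirm by brute force (our own illustration, not from the text):

```python
# Brute-force check that y^2 = x^5 - x + 2 has no points with x, y in F_5.
p = 5
points = [(x, y) for x in range(p) for y in range(p)
          if (y * y - (x ** 5 - x + 2)) % p == 0]
print(points)   # [] : the set of F_5-rational points is empty
```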
We can give a preliminary imprecise definition: an algebraic closure of Fq
is the union of all the extension fields Fqn of it.
Thus, without worrying about how to organize that definition properly and
make it precise, we should anticipate that we might care about points on a curve
(defined over Fq ) with coordinates in the algebraic closure of Fq .
Remark: We should also be sure that up to inessential differences there is only
one finite field F_{p^n} with p^n elements for p prime. There is more to be said here!
For a polynomial f(x, y) in two variables, let

f_x(x, y) = f_1(x, y) = ∂f/∂x (x, y)    and    f_y(x, y) = f_2(x, y) = ∂f/∂y (x, y)

be the partial derivatives with respect to the first and second inputs, respectively.
Remark: Most often the first input will be x, and likewise most often the second
input will be y, but it is slightly dangerous to rely upon this, and even to use the
notation fx which presumes this. The notation fi for the partial derivative with
respect to the i-th input is more reliable.
While derivative-taking as a limit is familiar in case the underlying field k is
R or C, we cannot possibly define derivatives in such manner when the underlying
field is a finite field Fq . Rather, as done in the one-variable case earlier to look at
the issue of multiple factors, the formulas that we prove in the familiar case will be
taken as the definition in the abstract case. That is, bring the exponent down into
the coefficient and subtract one from the exponent:
∂/∂x Σ_{i,j} c_ij x^i y^j = Σ_{i,j} i · c_ij x^(i−1) y^j

∂/∂y Σ_{i,j} c_ij x^i y^j = Σ_{i,j} j · c_ij x^i y^(j−1)
Unfortunately, it requires proof to know that this direct algebraic definition really
does have the other properties to which we're accustomed. It does!
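The bring-down-the-exponent rule is straightforward to implement (our own sketch, not from the text), representing a polynomial over a prime field F_p as a dict {(i, j): c_ij}:

```python
# Formal partial derivatives of sum c_ij x^i y^j over a prime field F_p.
def partial_x(f, p):
    return {(i - 1, j): (i * c) % p
            for (i, j), c in f.items() if i > 0 and (i * c) % p != 0}

def partial_y(f, p):
    return {(i, j - 1): (j * c) % p
            for (i, j), c in f.items() if j > 0 and (j * c) % p != 0}

# f(x, y) = x^2 + y^2 - 1 over F_5: the gradient is (2x, 2y)
f = {(2, 0): 1, (0, 2): 1, (0, 0): 4}
print(partial_x(f, 5))   # {(1, 0): 2}
print(partial_y(f, 5))   # {(0, 1): 2}

# Note: d/dx of x^5 over F_5 is 5x^4 = 0, as the rule predicts.
print(partial_x({(5, 0): 1}, 5))   # {}
```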
A plane curve defined by a single equation f (x, y) = 0 is non-singular at a
point (x0 , y0 ) on the curve if and only if (in addition to the condition f (x0 , y0 ) = 0)
we have
f_x(x_0, y_0) ≠ 0 or f_y(x_0, y_0) ≠ 0

If both of these partial derivatives vanish, then the point (x_0, y_0) is a singular
point of the curve. That is, the point (x_0, y_0) is a non-singular point of the curve
if and only if the gradient

∇f(x_0, y_0) = ( ∂f/∂x (x_0, y_0) , ∂f/∂y (x_0, y_0) )

is not the zero vector.
Remark: These conditions on partial derivatives should remind an astute observer
of the implicit function theorem.
Over the rational numbers Q, the real numbers R, and even more so in the
case of finite fields Fq , there may be singular points which are not rational over the
field of definition. An example is
y² = (x² − 3)²
Example: Consider f(x, y) = x² + y² − 1. Since over the real numbers the equation
f(x, y) = 0 defines the unit circle, we expect that
this curve is non-singular. We can practice the partial derivative criterion to verify
this computationally, as follows. To find singular points (x, y), solve the system
f (x, y) = 0
fx (x, y) = 0
fy (x, y) = 0
In the present case this is
x² + y² = 1
2x = 0
2y = 0
From the last two equations, if 2 ≠ 0 in the field, the only possible singular point
is (0, 0), but it doesn't lie on the curve. Therefore, there are no singular points if
2 ≠ 0 in the field. (If 2 = 0 in the field, this curve degenerates.)
Example: Consider f(x, y) = x² − y² − 1. Since over the real numbers the equation
f(x, y) = 0 defines a hyperbola, which is plausibly non-singular, we expect that this
curve is non-singular. Again, we use the partial derivative criterion to verify this
computationally. To find singular points (x, y), solve the system

x² − y² = 1
2x = 0
−2y = 0

Again, from the last two equations, if 2 ≠ 0 in the field, then the only possible
singular point is (0, 0), but it doesn't lie on the curve. Therefore, there are no
singular points. (If 2 = 0 in the field, this curve degenerates.)
Example: Consider f(x, y) = x³ − y². Use the partial derivative criterion to find
singular points, if any. To find singular points (x, y), solve the system

x³ − y² = 0
3x² = 0
−2y = 0

From the last two equations, if neither 2 = 0 nor 3 = 0 in the field, then the only
possible singular point is (0, 0), which does lie on the curve. Therefore, if neither
2 = 0 nor 3 = 0 in the field, then the only singular point is (0, 0).
Example: A general family of curves whose non-singularity can be verified systematically is those of the form
y 2 = f (x)
where f is a polynomial in k[x] without repeated factors (even over extension fields).
These are called hyperelliptic. (The cases that the degree of f is 1, 2, 3, or 4
are special: the degree 2 curves are called rational curves, and the degree 3 and
4 curves are called elliptic curves.) The system of equations to be solved to find
singularities is
y² = f(x)
0 = f_x(x)
2y = 0
The last equation shows that the only possible singularities occur at points with
y-coordinate 0, if 2 ≠ 0 in k. Substituting that into the first equation and carrying
along the second equation gives the system

0 = f(x)
0 = f_x(x)

We have already observed that if this system has a solution x_0 then at least the
square (x − x_0)² of the linear factor x − x_0 must divide f(x). That is, f(x) would
have a repeated factor, contrary to assumption. Thus, the curve has no singularities
in the plane (at least if 2 ≠ 0 in the field).
Remark: On the other hand, in the previous example, if the underlying field does
have 2 = 0, as happens in the case that k = F2 , then the curve degenerates. It
turns out that a more proper analogue of hyperelliptic curve in characteristic 2
(that is, when 2 = 0) is a modified equation of the form

y² + ay + bxy = f(x)
with at least one of a, b non-zero.
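Over a finite field, the singularity search described in these examples can be carried out by exhaustion (our own brute-force sketch, not from the text), here for the curve x³ − y² = 0 over F_7:

```python
# Find singular points of f(x, y) = x^3 - y^2 over F_7 by brute force,
# checking f = f_x = f_y = 0 with the formal derivatives f_x = 3x^2, f_y = -2y.
p = 7
singular = [(x, y) for x in range(p) for y in range(p)
            if (x ** 3 - y ** 2) % p == 0
            and (3 * x ** 2) % p == 0
            and (-2 * y) % p == 0]
print(singular)   # [(0, 0)], matching the singular point found in the example above
```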
20.3 Projective plane curves
Remark: There is simply more than one way to give coordinates to a point in
projective space, in the same way that there is more than one way to specify a
fraction

1/2 = 3/6 = 7/14

and in the same way that there is more than one way to specify a residue class mod m:

3 = 3 + m mod m = 3 + 2m mod m = 3 − 7m mod m = ...
Remark: The projective plane P² can also be thought of as the collection of lines
through the origin in three-space, by identifying a point (a, b, c) in homogeneous
coordinates with the line through the origin and the point (a, b, c). Note that
changing the representative (a, b, c) by multiplying through by a non-zero scalar
does not change the line.
The affine plane k² embeds into P² nicely by

(x, y) → (x, y, 1)

Note that (x, y, 1) ~ (x′, y′, 1) if and only if (x, y) = (x′, y′).
Thus, for example, the point (3, 4) in the (affine) plane is identified with (3, 4, 1)
in homogeneous coordinates on the projective plane P².
A point given in projective coordinates (x, y, z) with z = 0 is called a point
at infinity, and the line at infinity is the set of points (x, y, z) in P2 with z = 0.
That is,
all points at infinity = line at infinity = {(x, y, 0) : not both x, y are 0 }
One justification for this terminology is that no such point lies in the embedded
copy of the (affine) plane k 2 , so whatever they are these points at infinity really do
lie outside the usual plane.
The total degree of a term c_ijk x^i y^j z^k (with c_ijk ≠ 0) is simply the sum i + j + k
of the exponents. The total degree deg(f) of a polynomial f is the maximum of
the total degrees of all the summands in it. A polynomial in 3 variables x, y, z is
homogeneous of degree N if there is a non-negative integer N so that every
term c_ijk x^i y^j z^k (with c_ijk ≠ 0) has the same total degree N, that is,

i + j + k = N
The homogenization of a polynomial f(x, y) = Σ_{i,j} c_ij x^i y^j of total degree N is

F(x, y, z) = Σ_{i,j} c_ij x^i y^j z^(N−i−j)
Proposition: Let F(x, y, z) be the homogeneous polynomial attached to a polynomial f(x, y) in two variables, and let x, y, z in k satisfy F(x, y, z) = 0. Then for
any t ∈ k

F(tx, ty, tz) = 0

That is, the equation F(x, y, z) = 0 (not including the point (0, 0, 0)) specifies a
well-defined subset of P². And the intersection of this set with the imbedded affine
plane k² is the original affine curve f(x, y) = 0.
Proof: One basic feature of the homogenization process is that for any x, y, z
F (tx, ty, tz) = tdeg(f ) F (x, y, z)
The other basic feature is that
F (x, y, 1) = f (x, y)
The first property makes clear the first assertion of the proposition, and then the
second makes clear the second assertion of the proposition.
///
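Both basic features of homogenization used in this proof can be verified numerically (our own sketch, not from the text), over the prime field F_11 for concreteness:

```python
from math import prod

# Homogenize f(x, y) = sum c_ij x^i y^j of total degree N to
# F(x, y, z) = sum c_ij x^i y^j z^(N - i - j), then check F(x, y, 1) = f(x, y)
# and F(tx, ty, tz) = t^N * F(x, y, z) at all affine points mod 11.
p = 11

def evaluate(poly, point):
    # poly: dict mapping exponent tuples to coefficients
    return sum(c * prod(v ** e for v, e in zip(point, exps))
               for exps, c in poly.items()) % p

f = {(2, 0): 1, (0, 2): 1, (0, 0): p - 1}        # x^2 + y^2 - 1 (mod 11)
N = max(i + j for (i, j) in f)
F = {(i, j, N - i - j): c for (i, j), c in f.items()}

for x in range(p):
    for y in range(p):
        assert evaluate(F, (x, y, 1)) == evaluate(f, (x, y))
        for t in range(1, p):
            lhs = evaluate(F, (t * x % p, t * y % p, t % p))
            rhs = t ** N * evaluate(F, (x, y, 1)) % p
            assert lhs == rhs
print("checked", p * p, "affine points")
```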
Now we give the beginning of better justification of the terminology and construction by returning to the simple example of two lines in the plane. From the
equation of a straight line L
ax + by + c = 0
in the usual coordinates on k 2 (with not both a, b zero), we create the associated
homogenized equation
ax + by + cz = 0
If (x, y, z) satisfies this equality, then so does (tx, ty, tz) for any t ∈ k^×. That is,
the homogenized equation defines a curve L̃ in the projective plane P². And it has
the desirable property that under the embedding of k² into P² the original line L
is mapped to a subset of L̃.
What points at infinity are there on the extended version L̃ of L? This
amounts to looking for solutions to ax + by + cz = 0 with z = 0: that means
ax + by = 0. Since not both a, b are zero, without loss of generality we may suppose
that b ≠ 0 and get y = −(a/b)x. Thus, we get points

(x, −ax/b, 0) ~ (1, −a/b, 0)

That is, these are just different homogeneous coordinates for the same point on L̃:
there is a single point at infinity lying on a given line.
We really do have the smoothed-out symmetrical assertion:
Theorem: Any two (distinct) lines in the projective plane P2 intersect in exactly
one point.
Proof: The two lines are given by equations

ax + by + cz = 0
a′x + b′y + c′z = 0
The assumption on the lines is that (a, b, c) is not a scalar multiple of (a0 , b0 , c0 )
(and equivalently (a0 , b0 , c0 ) is not a scalar multiple of (a, b, c)). We must solve this
system of equations for (x, y, z).
Suggested by basic linear algebra, we might view this as hunting for a vector
(x, y, z) so that

(x, y, z) · (a, b, c) = 0
(x, y, z) · (a′, b′, c′) = 0

with the usual dot product. Suggested by basic linear algebra over the real
numbers, we might anticipate that the cross product of (a, b, c) and (a′, b′, c′) is
a solution: try

(x, y, z) = (bc′ − b′c, −ac′ + a′c, ab′ − a′b)

Indeed,

a(bc′ − b′c) + b(−ac′ + a′c) + c(ab′ − a′b) = 0
a′(bc′ − b′c) + b′(−ac′ + a′c) + c′(ab′ − a′b) = 0

Note that it is the fact that (a, b, c) and (a′, b′, c′) are not scalar multiples of each
other that makes (x, y, z) ≠ (0, 0, 0).

A little more work, as in the preceding proposition, would show that the collection of all solutions is exactly the collection of scalar multiples of a given solution.
Thus, in P² there is a unique solution.
///
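The cross-product recipe from this proof is easy to exercise numerically (our own sketch, not from the text); note how two parallel affine lines meet at a point at infinity (third coordinate 0):

```python
# The cross product of the coefficient vectors of two projective lines gives
# homogeneous coordinates of their unique intersection point.
def cross(u, v):
    a, b, c = u
    ap, bp, cp = v
    return (b * cp - bp * c, -(a * cp - ap * c), a * bp - ap * b)

L1 = (1, -1, 0)     # the line x - y = 0
L2 = (1, 1, -2)     # the line x + y - 2z = 0
P = cross(L1, L2)
print(P)            # (2, 2, 2) ~ (1, 1, 1): the affine point (1, 1)
assert sum(l * x for l, x in zip(L1, P)) == 0
assert sum(l * x for l, x in zip(L2, P)) == 0

L3 = (1, -1, -1)    # x - y - 1 = 0, parallel to L1 in the affine plane
print(cross(L1, L3))   # (1, 1, 0): a point at infinity, since z = 0
```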
to study the part of the curve near (x1 , y1 , z1 ) with y1 6= 0. Thus, reasonably
declaring a projective curve to be non-singular if all these affine parts of it are
non-singular in the usual affine sense, we see that
///
Proof: Omitted.
Remark: The notion of multiplicity is familiar in the case of polynomials in a
single variable, where the left-hand side of a polynomial equation f (x) = 0 may
have repeated linear factors, as in the case
(x − 1)²(x − 2) = x³ − 4x² + 5x − 2 = 0
which has a double root at x = 1. In the case at hand, the spirit of the idea of
multiplicity is similar, but is more complicated to define precisely and so as to make
Bezouts theorem exactly true. Perhaps one may view the notion of multiplicity
as a technical refinement to deal with somewhat special or extreme cases, since
generically there will be no multiple points, in the same way that it is unusual for
a polynomial in a single variable to have a multiple root.
Proof: (of corollary) Affine plane curves will have the same points of intersection
as the corresponding projective plane curves, except that they'll be missing any
points of intersection that are points at infinity. Thus, instead of the equality
that we get in Bézout's theorem for projective curves, we merely get an inequality,
348
Chapter 20
since we may have lost some points (and we may have failed to properly count
multiplicities).
///
Remark: Notice that the only notational device that distinguishes this field of
rational functions from the ring of polynomials is the use of (round) parentheses
rather than (square) brackets.
Let X be an affine plane curve defined by an equation f (x, y) = 0 over Fq .
We suppose that f(x, y) has no repeated factors. (It is in fact a little tricky to
understand what this means, since we're talking about polynomials in two variables,
not one.) For two other polynomials r(x, y) and s(x, y), we write
r(x, y) = s(x, y) mod f (x, y)
if r(x, y) s(x, y) is a polynomial multiple of f (x, y). Note that there is no division/reduction algorithm for polynomials in two variables, so that we cannot be
so glib about dividing one polynomial by another in this context as we were in
the one-variable context. Further, for two rational functions r1 (x, y)/s1 (x, y) and
r2 (x, y)/s2 (x, y), we write
r1(x, y)/s1(x, y) = r2(x, y)/s2(x, y) mod f(x, y)
if
r1(x, y) · s2(x, y) = r2(x, y) · s1(x, y) mod f(x, y)
That is, two fractions are equal modulo f if and only if we have equality when we
multiply out the fractions to have simply polynomials.
Recall that non-zero elements r, s in a commutative ring R are zero divisors
if r · s = 0. The field of fractions of a commutative ring R without zero divisors
is the field consisting of all fractions r/s with r ∈ R, 0 ≠ s ∈ R, where

r/s = r′/s′ if rs′ = sr′
(The latter is the expected condition for equality of fractions.) Addition, multiplication, and inversion of fractions are by the usual formulas
a/b + c/d = (ad + bc)/(bd)

(a/b) · (c/d) = (ac)/(bd)

1/(a/b) = b/a   (for a ≠ 0)
One should verify that the axioms for a field are met. As perhaps expected, assuming that R has a unit 1, there is an injective ring homomorphism of the original
ring R into this field of fractions by r → r/1.
The field of rational functions on the plane curve X defined by f (x, y) = 0
is defined to be the field of fractions of
Fq [x, y] mod f (x, y)
That is, it consists of all ratios of polynomials modulo f(x, y). This field is denoted
Fq (X).
Remark: So every rational function on a plane curve X is expressible as a ratio
of polynomials, but in more than one way, since two such ratios are equal on X
if they are equal modulo f (x, y).
Thinking about it in a projective context, we would define the field of functions on the projective plane P2 to be the collection of ratios of homogeneous
polynomials in 3 variables where the degree of the numerator is equal to the degree
of the denominator. On the face of it, this is a different thing than the field of rational functions in two variables defined above. But they are in fact essentially the
same thing: every ratio of not necessarily homogeneous polynomials in two variables
may be converted to a ratio of homogeneous polynomials (with degree of numerator equal to degree of denominator) simply by homogenizing both numerator and
denominator (and adding extra factors of z to either numerator or denominator in
order to make the degrees equal). The process can likewise be reversed.
The idea of field-of-functions on a plane curve has a projective version as well.
Everything here is predictable, in principle, from thinking about the proper way
to projectivize the affine case discussed just above. But we'll repeat it in the
projectivized setting just for emphasis. Let X be a projective curve defined by
a homogeneous equation F (x, y, z) = 0 over Fq . We suppose that F (x, y, z) has
no repeated factors (!?). For two other homogeneous polynomials R(x, y, z) and
S(x, y, z), say
    R(x, y, z) = S(x, y, z) mod F(x, y, z)

if R(x, y, z) − S(x, y, z) is a polynomial multiple of F(x, y, z). For two ratios r1(x, y, z)/s1(x, y, z) and r2(x, y, z)/s2(x, y, z) of homogeneous polynomials, we write

    r1(x, y, z)/s1(x, y, z) = r2(x, y, z)/s2(x, y, z) mod F(x, y, z)

if

    r1(x, y, z) · s2(x, y, z) = r2(x, y, z) · s1(x, y, z) mod F(x, y, z)

That is, two fractions are equal modulo F if and only if we have equality when we multiply out the fractions to have simply polynomials.
The field of rational functions on the projective curve X defined by
F (x, y, z) = 0 is defined to be
Fq (x, y, z) mod F (x, y, z)
That is, it consists of all ratios of homogeneous polynomials (of equal degree), and two such ratios are the same if they are equal modulo F(x, y, z). This field is denoted Fq(X).
is the defining relation of the curve. In general, things need not be quite so simple.
Let P1, P2, . . . , Pn be a set of distinct points on a projective curve X defined over Fq. A divisor on X is an expression

    D = ℓ1 P1 + . . . + ℓn Pn

with integers ℓi. Such a thing once would have been called a formal finite sum of points. However, rather than misguidedly wasting our time trying to legitimize this by imagining what adding points might mean, or what multiplying points by integers might mean, we make sense of this as follows. A divisor on X is an integer-valued function ℓ on X which takes the value 0 at all but finitely many points. That is certainly legal. Then the expression above is really the sum

    Σ_(P∈X) ℓ(P) · P
Example: Let C be the projective plane curve defined by the homogeneous equation x^3 + y^3 + z^3 = 0. The points P1 = (1, −1, 0), P2 = (1, 0, −1), P3 = (0, 1, −1) are distinct points on the projective plane, and all lie on the curve C. Thus, the expression

    5 P1 − 17 P2 + 11 P3

is a divisor on the curve.
Remark: The previous expression does not indicate scalar multiplication or vector
addition, but rather operations inside a more abstract object (which is a group).
The degree of a divisor is just the sum of the coefficients:
deg (`1 P1 + . . . + `n Pn ) = `1 + . . . + `n
Assuming that the points Pi are distinct, the coefficient `i corresponding to the
point Pi is the multiplicity of the divisor at Pi . The support of a divisor is the
set of points at which the divisor has non-zero multiplicity. A non-zero divisor is
positive if all its multiplicities are non-negative.
Example: Let C again be the projective plane curve defined by the homogeneous equation x^3 + y^3 + z^3 = 0, with points P1 = (1, −1, 0), P2 = (1, 0, −1), P3 = (0, 1, −1) on it. The degree of the divisor

    D = 5 P1 − 17 P2 + 11 P3

is

    deg D = 5 − 17 + 11 = −1
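Since a divisor is just a finitely-supported integer-valued function on the curve, it can be modeled directly as a finite mapping; the sketch below (our own ad hoc representation, with points as coordinate tuples) computes the degree, support, and positivity just defined:

```python
# A divisor modeled as its multiplicity function: a finite mapping
# from points to integers.  Points are just hashable labels (tuples
# of projective coordinates); the curve itself plays no role here.

def degree(divisor):
    """Sum of the multiplicities."""
    return sum(divisor.values())

def support(divisor):
    """Set of points with non-zero multiplicity."""
    return {P for P, m in divisor.items() if m != 0}

def is_positive(divisor):
    """Non-zero, with all multiplicities non-negative."""
    return any(m != 0 for m in divisor.values()) and \
           all(m >= 0 for m in divisor.values())

# The example divisor D = 5 P1 - 17 P2 + 11 P3:
P1, P2, P3 = (1, -1, 0), (1, 0, -1), (0, 1, -1)
D = {P1: 5, P2: -17, P3: 11}
assert degree(D) == -1
assert support(D) == {P1, P2, P3}
assert not is_positive(D)
```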
Let C1 and C2 be two projective plane curves (defined by single equations without a common factor). Then by Bezout's theorem the intersection is a finite set of points P1, . . . , Pn, whose number is equal to the product of the degrees (being sure to count multiplicities). The intersection divisor is the divisor which is the sum of all the points in this intersection, counting multiplicities:

    div(C1 ∩ C2) = Σ_i Pi
Let f = g/h be a rational function on a projective plane curve X, with homogeneous polynomials g and h of the same degree. Let Zg be the projective curve defined by g(x, y, z) = 0, and let Zh be the curve defined by h(x, y, z) = 0. We have

    zeros of f on X = div(Zg ∩ X)
    poles of f on X = div(Zh ∩ X)

One should think of the poles as being where the function blows up. The divisor div(f) is

    div(f) = div(Zg ∩ X) − div(Zh ∩ X)

That is, roughly, the divisor of a function is simply the formal sum of its zeros minus the formal sum of its poles.
The linear system L(D) attached to a divisor D on a projective plane curve
X is
    L(D) = {f ∈ Fq(X) : div(f) + D is a positive divisor}
By convention, we also include the zero function in any such linear system. This
linear system is a vector space over Fq , in the sense that any two elements of it can
be added, there are additive inverses, and there is a scalar multiplication. Thus,
there is a corresponding notion of dimension of L(D).
Remark: The integer g, the genus, occurring in the following theorem, has a
more intuitive origin that is described afterward, but in reality its importance lies
in such things as the more technical content of this theorem.
Theorem: (Riemann-Roch) Let X be a non-singular projective plane curve. There is a non-negative integer g such that for any divisor D,

    dimension of vector space L(D) ≥ deg(D) + 1 − g

and in fact if deg(D) > 2g − 2, we have the equality

    dimension of vector space L(D) = deg(D) + 1 − g
Example: Suppose that X is the usual line with the added point at infinity. (That is, X is the projective line.) The point at infinity we will suggestively denote by ∞. Then for a positive integer ℓ the linear system L(ℓ · ∞) consists of the polynomial functions of degree less than or equal to ℓ. In particular, the functions in that linear system are not allowed to have poles anywhere but at ∞, and the order of the pole at ∞ is bounded by ℓ.
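For this genus-0 example the exact case of the theorem can be checked mechanically: L(ℓ · ∞) has the monomial basis 1, x, . . . , x^ℓ, of size ℓ + 1 = deg(ℓ · ∞) + 1 − 0. A small sketch (function names are our own):

```python
# Riemann-Roch sanity check on the projective line (genus g = 0):
# L(l*infinity) is the space of polynomials of degree <= l, with
# monomial basis 1, x, ..., x^l, so its dimension is l + 1.

def riemann_roch_dimension(deg_D, g):
    """Dimension of L(D) in the exact case deg(D) > 2g - 2."""
    assert deg_D > 2 * g - 2
    return deg_D + 1 - g

for l in range(0, 10):
    basis = [f"x^{k}" for k in range(l + 1)]   # monomials of degree <= l
    assert len(basis) == riemann_roch_dimension(l, g=0)
```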
It is not possible to give a complete-yet-accurate description of where genus
came from, but a heuristic description is possible, along with a family of examples.
First, a curve defined over C, with points at infinity properly inserted, is provably
geometrically equivalent to a sphere with a number of handles attached. For
example, attaching a single handle gives a geometric thing equivalent to the surface
of a doughnut (called a torus). Attaching no handles leaves us simply with a sphere.
Any number of handles may be attached, and in fact in slightly different ways, but
for the moment all we care about is the number of handles attached, in effect. If
a curve X over C is described as a sphere with g handles attached, then the curve
X is said to have genus g. In some cases there is a computationally effective way
to determine genus: for hyperelliptic curves

    y^2 = f(x)

(after desingularizing this curve at infinity when the degree of f is odd) the genus is

    genus of {y^2 = f(x)} = (deg f − 2)/2    (deg f even)
    genus of {y^2 = f(x)} = (deg f − 1)/2    (deg f odd)
This geometric discussion doesn't quite make sense if the defining field is Fq,
and in that case the abstracted version of genus is a trickier thing. Nevertheless, for
hyperelliptic curves, for example, the formula for genus is the same as for complex
curves.
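The genus formula for hyperelliptic curves is easy to encode; the following sketch (our own helper) applies it to a few familiar degrees:

```python
def hyperelliptic_genus(deg_f):
    """Genus of y^2 = f(x) for squarefree f of the given degree
    (with the curve desingularized at infinity)."""
    if deg_f % 2 == 0:
        return (deg_f - 2) // 2
    return (deg_f - 1) // 2

assert hyperelliptic_genus(3) == 1   # elliptic curves have genus 1
assert hyperelliptic_genus(4) == 1
assert hyperelliptic_genus(5) == 2
assert hyperelliptic_genus(6) == 2
```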
    (deg(D) + 1 − g)/n        and        (n − deg(D))/n

We would want both of these to be large. Or, as a sort of compromise, we'd want the sum to be large. Here

    rate + relative min distance ≥ (n + 1 − g)/n = 1 + 1/n − g/n

We want g/n to become small, or, equivalently, n/g to become large. In particular, this means that we want the curves to have a large number of Fq-rational points available to use as the Pi's.
    |N − (q + 1)| ≤ 2g √q
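The inequality is easy to test by brute force on a small example. The sketch below counts points on the illustrative genus-1 curve y^2 = x^3 + x + 1 over several prime fields (this particular curve is our own choice) and checks the bound in the squared form (N − (q + 1))^2 ≤ 4 g^2 q with g = 1:

```python
def count_projective_points(p):
    """Points over F_p on the (illustrative) curve y^2 = x^3 + x + 1:
    affine solutions plus the one point at infinity (0 : 1 : 0)."""
    affine = sum(1 for x in range(p) for y in range(p)
                 if (y * y - (x ** 3 + x + 1)) % p == 0)
    return affine + 1

# Hasse-Weil with genus g = 1: (N - (p + 1))^2 <= (2 sqrt(p))^2 = 4 p.
# (The curve is non-singular for these p: 4*1^3 + 27*1^2 = 31 != 0 mod p.)
for p in (5, 7, 11, 13):
    N = count_projective_points(p)
    assert (N - (p + 1)) ** 2 <= 4 * p
```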
A curve that achieves the maximum allowed by this inequality is the hermitian curve

    y^q z + y z^q = x^(q+1)

over the field F_(q^2) with q^2 elements.
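For q = 2 this can be confirmed by exhaustive search. The sketch below (with an assumed encoding F4 = F2[w]/(w^2 + w + 1), elements stored as 2-bit integers) counts the projective points of the hermitian curve and finds q^3 + 1 = 9 of them, the maximum for genus g = q(q − 1)/2 = 1:

```python
from itertools import product

# Brute-force count of points of the hermitian curve y^2 z + y z^2 = x^3
# (the case q = 2) over F_4, encoded as F_2[w]/(w^2 + w + 1) with
# elements stored as 2-bit integers 0..3.

def gf4_mul(a, b):
    r = 0
    for i in range(2):            # carry-less (polynomial) multiplication
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:                 # reduce modulo w^2 + w + 1
        r ^= 0b111
    return r

def on_curve(x, y, z):
    # Characteristic 2, so equality is "xor of the two sides is 0".
    lhs = gf4_mul(gf4_mul(y, y), z) ^ gf4_mul(y, gf4_mul(z, z))
    return lhs == gf4_mul(x, gf4_mul(x, x))

triples = [t for t in product(range(4), repeat=3) if t != (0, 0, 0)]
raw = sum(1 for (x, y, z) in triples if on_curve(x, y, z))
num_points = raw // 3             # each projective point has 3 nonzero scalings
assert num_points == 9            # q^3 + 1, the Hasse-Weil maximum for g = 1
```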
Theorem: (Tsfasman, Vladut, Zink, Ihara) Let q be an even power of a prime. There is a constructible infinite sequence of curves Xi defined over Fq so that

    lim_i (number of Fq-rational points on Xi) / (genus of Xi) = √q − 1
Remark: Thus, if we trace back the implications of this for the geometric Goppa codes, and compare asymptotically with the Gilbert-Varshamov bound, we would see that we've done better!
Remark: The curves constructed to exceed the Gilbert-Varshamov bound are
modular curves, meaning that they are very special among algebraic curves, in
particular admitting a very sharp analysis of their rational points!
Exercises
20.01 Find all solutions (x, y) to x^2 − 2y^2 = 1 with x, y in Q. (ans.)

20.02 Find all solutions (x, y) to x^2 − 3y^2 = 1 with x, y in Q.

20.03 Find all solutions (x, y) to x^2 + xy + y^2 = 1 with x, y in Q.

20.04 Determine all the points at infinity of the projectivization of the curve x^2 − 2y^2 = 1. (ans.)

20.05 Determine the singular points of the affine curve x^2 − 2y^2 = 1. (ans.)

20.06 Are the points at infinity of the projectivization of x^2 − 2y^2 = 1 singular? (ans.)

20.07 Determine all the points at infinity of the projectivization of the curve x^2 + xy + y^2 = 1. Which are singular points?

20.08 Over a field of characteristic 2, show that y^2 + y = x^3 + x + 1 is non-singular. Find all the points at infinity and show that they are non-singular. Determine the genus.

20.09 Prove Euler's identity: for a homogeneous polynomial f(x1, . . . , xn) of total degree d, prove that

    Σ_(i=1)^n  xi · ∂f/∂xi = d · f

20.10 Show that the hermitian curve y^q z + y z^q = x^(q+1) over the field F_(q^2) with q^2 elements is non-singular and has a unique point at infinity.

20.11 Determine the genus of the hermitian curve y^q z + y z^q = x^(q+1).
    √(2π) · n^(n + 1/2) · e^(−n) · e^(1/(12(n+1)))  <  n!  <  √(2π) · n^(n + 1/2) · e^(−n) · e^(1/(12n))

    ∫_1^(n+1) ln x dx = [x ln x − x]_1^(n+1) = (n + 1) ln(n + 1) − (n + 1) + 1

and

    ∫_0^n ln x dx = [x ln x − x]_0^n = n ln n − n
Writing E_n = ln n! − (n + 1/2) ln n + n, we have

    E_n − E_(n+1) = [ln n! − (n + 1/2) ln n + n] − [ln (n + 1)! − (n + 1 + 1/2) ln(n + 1) + (n + 1)]
                  = (n + 1/2) ln(1 + 1/n) − 1
Use the expansion

    ln(1 + x) = x − x^2/2 + x^3/3 − x^4/4 + . . .

to obtain
    E_n − E_(n+1) = (n + 1/2) (1/n − 1/(2n^2) + 1/(3n^3) − . . .) − 1
                  = (1 − 1/(2n) + 1/(3n^2) − . . .) + (1/2) (1/n − 1/(2n^2) + 1/(3n^3) − . . .) − 1
                  = (1/3 − 1/4) (1/n^2) − (1/4 − 1/6) (1/n^3) + (1/5 − 1/8) (1/n^4) − (1/6 − 1/10) (1/n^5) + . . .

by cancelling the 1 and the 1/(2n). For any n ≥ 1 this is an alternating decreasing series. Recall that for an alternating decreasing series

    a1 − a2 + a3 − a4 + . . .

(that is, with each ai > 0 and ai > ai+1 for all i), we have inequalities such as

    a1 − a2 < a1 − a2 + a3 − a4 + . . . < a1 − a2 + a3
Therefore,

    (1/12) (1/n^2 − 1/n^3) < E_n − E_(n+1) < 1/(12n^2) − 1/(12n^3) + 3/(40n^4)

In particular, since the left-hand side is non-negative and the inequality is strict, each of the values E_n − E_(n+1) is positive, so the sequence E_n itself is decreasing.
Subtracting 1/(12n) and adding 1/(12(n + 1)) to the right-hand inequality here, we get

    (E_n − 1/(12n)) − (E_(n+1) − 1/(12(n + 1))) < 1/(12n^2) − 1/(12n^3) + 3/(40n^4) − 1/(12n) + 1/(12(n + 1))
A somewhat tedious elementary estimate shows that this right-hand side is negative, so the sequence E_n − 1/(12n) is increasing; a similar estimate shows that the sequence E_n − 1/(12(n + 1)) is decreasing. In summary, since 1/(12n) and 1/(12(n + 1)) both go to 0, the two sequences have a common limit C:

    E_n − 1/(12n) increases to C        E_n − 1/(12(n + 1)) decreases to C

and

    lim_n E_n = C
Therefore, for n ≥ 2,

    C + 1/(12(n + 1)) < E_n < C + 1/(12n)

That is,

    C + 1/(12(n + 1)) + (n + 1/2) ln n − n < ln n! < C + 1/(12n) + (n + 1/2) ln n − n

This is the statement of the proposition.
///
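The two-sided bound just proved can be spot-checked numerically (a quick sketch using the standard library):

```python
from math import exp, factorial, pi, sqrt

# Check: sqrt(2 pi) n^(n+1/2) e^(-n) e^(1/(12(n+1)))
#          <  n!  <  sqrt(2 pi) n^(n+1/2) e^(-n) e^(1/(12n))

def stirling_bounds(n):
    """Lower and upper bounds on n! from the proposition above."""
    main = sqrt(2 * pi) * n ** (n + 0.5) * exp(-n)
    return main * exp(1 / (12 * (n + 1))), main * exp(1 / (12 * n))

for n in range(1, 40):
    lo, hi = stirling_bounds(n)
    assert lo < factorial(n) < hi
```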
Appendix: Linear Algebra

A.1 Basics
A.2 Dimension
A.3 Homomorphisms and duals
A.4 Scalar products
A.5 Vandermonde determinants
This appendix consists entirely of proofs of some basic linear algebra results.
Even though our immediate interests are in tangible linear algebra involving row
vectors, row spaces, and other concrete versions of vectors and operations upon
them, it turns out to be economical to give relatively abstract proofs of some important basic properties of them.
A.1 Basics
This section gives definitions for abstract vector spaces over arbitrary fields that
subsume concrete definitions in terms of matrices, row-vectors, and column-vectors.
Let k be a field, whose elements we may also call scalars. A vector space over k is a set V with a special element 0 (the zero vector) and with a vector addition denoted + with the expected properties

    0 + v = v + 0 = v             for all v ∈ V              (property of 0)
    v + w = w + v                 for all v, w ∈ V           (commutativity)
    (u + v) + w = u + (v + w)     for all u, v, w ∈ V        (associativity)
    v + (−v) = 0                  for all v ∈ V              (additive inverse)

and with a scalar multiplication k × V → V, written α · v, with the properties

    α · (v + w) = α · v + α · w   for all v, w ∈ V, α ∈ k    (distributivity)
    (α + β) · v = α · v + β · v   for all v ∈ V, α, β ∈ k    (distributivity)
    (α · β) · v = α · (β · v)     for all v ∈ V, α, β ∈ k    (associativity)
    1 · v = v                     for all v ∈ V              (property of 1)
Remark: The last axiom may seem unnecessary, but in abstract situations it is
not automatically satisfied, and must be explicitly required.
A familiar vector space over a field k is the space of ordered n-tuples of elements of k

    k^n = {(x1, . . . , xn) : x1, . . . , xn ∈ k}

with component-wise addition and scalar multiplication. The 0-vector is

    0 = (0, . . . , 0)
Remark: Note the use of the word over in the above definition to tell what the
field of scalars is. This is a peculiar but standard mathematical usage.
Remark: Ignoring the scalar multiplication, a vector space V is an abelian group.
Thus, some basic properties of groups could be recycled here to immediately deduce
some properties of vector spaces.
There is exactly one vector in V with the property of the zero vector 0.
Proof: Let z ∈ V also have the property that z + v = v for even a single v ∈ V. Then, using associativity and the definition of −v,

    z = z + 0 = z + (v + (−v)) = (z + v) + (−v) = v + (−v) = 0
as claimed.
///
Given v ∈ V, there is exactly one additive inverse of v. Proof: Let x, y ∈ V be such that x + v = 0 and y + v = 0. Then, using associativity,

    y = y + 0 = y + (v + x) = (y + v) + x = 0 + x = x

as claimed.
///
For any α ∈ k, we have α · 0 = 0. Proof: First, α · 0 = α · (0 + 0) = α · 0 + α · 0. Then add −(α · 0) (whatever the latter is!) to both sides to obtain, via associativity and the property of the additive inverse,

    0 = α · 0 − (α · 0) = (α · 0 + α · 0) − (α · 0) = α · 0 + (α · 0 − (α · 0)) = α · 0 + 0 = α · 0

as claimed.  ///

The additive inverse −v of a vector v is (−1) · v:

    −v = (−1) · v
Proof: Since a vector subspace is closed under scalar multiplication and vector
addition, certainly every linear combination of vectors taken from X must lie in
any vector subspace containing X. On the other hand, we must show that any
vector in the intersection of all subspaces containing X is expressible as a linear
combination of vectors in X. But it is not hard to check that the collection of these
linear combinations is a vector subspace of V , and certainly contains X. Therefore,
the intersection is no larger than this set of linear combinations.
///
A linearly independent set of vectors spanning a subspace W of V is a basis
for W .
Proposition: Given a basis e1 , . . . , en for a vector space V , there is exactly one
expression for an arbitrary vector v V as a linear combination of e1 , . . . , en .
Proof: That there is at least one such expression follows from the spanning property. If

    Σ_i ai ei = v = Σ_i bi ei

then, subtracting,

    Σ_i (ai − bi) ei = 0

Since the ei are linearly independent, this implies that ai = bi for all indices i.  ///
A.2 Dimension
The first main results involve the notion of dimension. The conclusions of this section are not surprising, but must be considered carefully for subsequent discussions
to be well-founded.
The argument in the proof of the following fundamental theorem is sometimes
called the Lagrange replacement principle. This is the first and main non-trivial
result in linear algebra.
Theorem: Let v1, . . . , vm be a linearly independent set of vectors in a vector space V, and let w1, . . . , wn be a basis for V. Then m ≤ n, and (renumbering the vectors wi if necessary!) the vectors

    v1, . . . , vm, wm+1, wm+2, . . . , wn

are a basis for V.
Thus, v1 , . . . , vi+1 , wi+2 , . . . , wn span V . We claim that these vectors are linearly
independent. Indeed, if for some coefficients aj and bj
a1 v1 + . . . + ai+1 vi+1 + bi+2 wi+2 + . . . + bn wn = 0
then the coefficient ai+1 must be non-zero, because of the (inductively assumed) linear independence of v1, . . . , vi, wi+1, . . . , wn, thus surely of the subcollection
v1 , . . . , vi , wi+2 , . . . , wn . Thus, we can rearrange to express vi+1 as a linear combination of v1 , . . . , vi , wi+2 , . . . , wn . Then the expression for wi+1 in terms of
v1 , . . . , vi , vi+1 , wi+2 , . . . , wn becomes an expression for wi+1 as a linear combination of v1 , . . . , vi , wi+2 , . . . , wn . But this would contradict the (inductively assumed)
linear independence of v1 , . . . , vi , wi+1 , wi+2 , . . . , wn .
Consider the possibility that m > n. Then, by the above argument, v1, . . . , vn is a basis for V. Thus, vn+1 is a linear combination of v1, . . . , vn, contradicting their linear independence. Therefore, m ≤ n, and v1, . . . , vm, wm+1, . . . , wn is a basis for V, as claimed.
///
The collection of vectors

    e1 = (1, 0, 0, . . . , 0, 0)
    e2 = (0, 1, 0, . . . , 0, 0)
    e3 = (0, 0, 1, . . . , 0, 0)
        . . .
    en = (0, 0, 0, . . . , 0, 1)

spans k^n, since

    (c1, . . . , cn) = c1 e1 + . . . + cn en

On the other hand, a linear dependence relation

    0 = c1 e1 + . . . + cn en

gives

    (c1, . . . , cn) = (0, . . . , 0)

from which each ci is 0. Thus, these vectors are a basis for k^n.  ///

Remark: The vectors in the latter proof are the standard basis for k^n.
A vector space homomorphism f : V → W is a map satisfying

    f(v1 + v2) = f(v1) + f(v2)    (for all v1, v2 ∈ V)
    f(α · v) = α · f(v)           (for all α ∈ k, v ∈ V)

The kernel of f is

    ker f = {v ∈ V : f(v) = 0}

and the image of f is

    Im f = {f(v) : v ∈ V}

Vector space homomorphisms are also called linear. A homomorphism is an isomorphism if it is one-to-one (injective) and onto (surjective).
A vector space homomorphism f : V → W sends 0 (in V) to 0 (in W).

Proof: First,

    f(0) = f(0 + 0) = f(0) + f(0)

Then add −f(0) (whatever it may be) to both sides, obtaining

    0 = −f(0) + f(0) = −f(0) + (f(0) + f(0)) = (−f(0) + f(0)) + f(0) = 0 + f(0) = f(0)

proving what was claimed.  ///

For a vector space homomorphism f : V → W, for v ∈ V,

    f(−v) = −f(v)

///
Proof: Regarding the kernel, the previous proposition shows that it contains 0. The last bulleted point observed that additive inverses of elements in the kernel are again in the kernel. And for x, y ∈ ker f

    f(x + y) = f(x) + f(y) = 0 + 0 = 0

so the kernel is closed under addition. Finally, for α ∈ k and v ∈ ker f

    f(α · v) = α · f(v) = α · 0 = 0

so the kernel is closed under scalar multiplication. Thus, the kernel is a vector subspace.
///
Proof: Let v1 , . . . , vm be a basis for ker f , and, invoking the theorem, let
wm+1 , . . . , wn be vectors in V such that v1 , . . . , vm , wm+1 , . . . , wn form a basis for
V . We claim that the images f (wm+1 ), . . . , f (wn ) form a basis for Im f . First,
show that these vectors span the image. Indeed, for f (v) = w, express v as a linear
combination
v = a1 v1 + . . . + am vm + bm+1 wm+1 + . . . + bn wn
and apply f , using its linearity
    w = a1 f(v1) + . . . + am f(vm) + bm+1 f(wm+1) + . . . + bn f(wn)
      = a1 · 0 + . . . + am · 0 + bm+1 f(wm+1) + . . . + bn f(wn)
      = bm+1 f(wm+1) + . . . + bn f(wn)
since the vi s are in the kernel. This shows that the f (wj )s span the image. For
linear independence, suppose that
0 = bm+1 f (wm+1 ) + . . . + bn f (wn )
Then
0 = f (bm+1 wm+1 + . . . + bn wn )
Then, bm+1 wm+1 +. . .+bn wn would be in the kernel of f , so would be a linear combination of the vi s, which would contradict the fact that v1 , . . . , vm , wm+1 , . . . , wn
is a basis, unless all the bj s were 0. Thus, the images f (wj ) are linearly independent, so form a basis for Im f .
///
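The dimension count dim ker f + dim Im f = dim V proved here is easy to check numerically for maps given by matrices; the following sketch (our own row-reduction helper, over Q) illustrates it:

```python
from fractions import Fraction

# Rank-nullity check (dim ker f + dim Im f = dim V) for a linear map
# f : Q^3 -> Q^2 given by a matrix; helper names are ad hoc.

def rank(mat):
    """Row-reduce a matrix of Fractions and count the pivot rows."""
    m = [row[:] for row in mat]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        m[r] = [x / m[r][c] for x in m[r]]        # normalize pivot row
        for i in range(len(m)):
            if i != r and m[i][c] != 0:           # clear the column
                m[i] = [a - m[i][c] * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[Fraction(1), Fraction(2), Fraction(3)],
     [Fraction(2), Fraction(4), Fraction(6)]]     # second row = 2 * first
dim_V = 3
dim_im = rank(A)                                  # dim Im f = rank = 1
dim_ker = dim_V - dim_im                          # dim ker f = 2
assert dim_im + dim_ker == dim_V
assert dim_im == 1 and dim_ker == 2
```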
Let f : V → W be an injective vector space homomorphism, and let e1, . . . , en be a basis for V. Then the images f(e1), . . . , f(en) are linearly independent. Indeed, suppose that

    Σ_i ci f(ei) = 0

Then, by the linearity of f,

    0 = Σ_i ci f(ei) = Σ_i f(ci ei) = f(Σ_i ci ei)

Since f is injective, Σ_i ci ei = 0. Since the ei form a basis for V, it must be that all the ci's are 0. This proves that the f(ei)'s are linearly independent.  ///
A (linear) functional λ : V → k on a vector space V over k is a linear map from V to the field k itself, viewed as a one-dimensional vector space over k. The collection V* of all such linear functionals is the dual space of V.

Proposition: The collection V* of linear functionals on a vector space V over k is itself a vector space over k, with the addition

    (λ + μ)(v) = λ(v) + μ(v)

and scalar multiplication

    (α · λ)(v) = α · λ(v)

Proof: The 0-vector in V* is the linear functional which sends every vector to 0. The additive inverse is defined by

    (−λ)(v) = −λ(v)

The distributivity properties are readily verified:

    (α · (λ + μ))(v) = α · ((λ + μ)(v)) = α · (λ(v) + μ(v)) = α · λ(v) + α · μ(v) = (α · λ)(v) + (α · μ)(v)

and

    ((α + β) · λ)(v) = (α + β) · λ(v) = α · λ(v) + β · λ(v) = (α · λ)(v) + (β · λ)(v)

as desired.  ///
Given a basis e1, . . . , en for V, the corresponding dual basis λ1, . . . , λn for V* is defined by the conditions

    λj(ei) = 1    (for i = j)
    λj(ei) = 0    (for i ≠ j)

From the definition alone it is not at all clear that a dual basis exists, but the following proposition proves that it does.
Proof: Proving the existence of a dual basis corresponding to the given basis will certainly prove the dimension assertion. Using the uniqueness of expression of a vector in V as a linear combination of the basis vectors, we can unambiguously define a linear functional λj by

    λj(Σ_i ci ei) = cj

These functionals certainly have the desired relation to the basis vectors ei. We must prove that the λj are a basis for V*. If

    Σ_j bj λj = 0

then, applying this functional to ei,

    bi = Σ_j bj λj(ei) = 0(ei) = 0

This holds for every index i, so all coefficients are 0, proving the linear independence of the λj. To prove the spanning property, let λ be an arbitrary linear functional on V. We claim that

    λ = Σ_j λ(ej) λj

Indeed, evaluating both sides on v = Σ_i ai ei gives

    Σ_(i,j) ai λ(ej) λj(ei) = Σ_i ai λ(ei) = λ(Σ_i ai ei)

since λj(ei) = 0 for i ≠ j. This proves that any linear functional is a linear combination of the λj.  ///
Let W be a subspace of a vector space V over k. The orthogonal complement W⊥ of W in V* is

    W⊥ = {λ ∈ V* : λ(w) = 0 for all w ∈ W}

Choose a basis e1, . . . , em for W and extend it to a basis e1, . . . , en of V, with dual basis λ1, . . . , λn. A functional λ lies in W⊥ exactly when λ(ej) = 0 for 1 ≤ j ≤ m, by the defining property of the dual basis. That is, every functional in W⊥ is a linear combination of the λj with m + 1 ≤ j ≤ n, and thus the latter form a basis for W⊥. Then

    dim W + dim W⊥ = m + (n − m) = n = dim V

as claimed.  ///
The second dual V** of a vector space V is the dual of its dual. There is a natural vector space homomorphism ι : V → V** of a vector space V to its second dual V** given by

    ι(v)(λ) = λ(v)

for v ∈ V, λ ∈ V*.
Corollary: Let V be a finite-dimensional vector space. Then the natural map ι of V to V** is an isomorphism.

Proof: If v is in the kernel of the linear map v → ι(v), then ι(v)(λ) = 0 for all λ, so λ(v) = 0 for all λ. But if v is non-zero then v can be part of a basis for V, which has a dual basis, among which is a functional λ such that λ(v) = 1. Thus, for ι(v)(λ) to be 0 for all λ it must be that v = 0. Thus, the kernel of ι is {0}, so ι (from above) is an injection. From the formula

    dim ker ι + dim Im ι = dim V

it follows that dim Im ι = dim V. We showed above that the dimension of V* is the same as that of V, since V is finite-dimensional. Likewise, the dimension of V** = (V*)* is the same as that of V*, hence the same as that of V. Since the dimension of the image of ι in V** is equal to the dimension of V, which is the same as the dimension of V**, the image must be all of V**. Thus, ι : V → V** is an isomorphism.  ///
    dim C = dim R
Then
column rank of M = row rank of M
///
Remark: From the symmetry it follows that there are corresponding linearity and non-degeneracy properties for the second argument, as well:

    ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩

and

    ⟨u, α · v⟩ = α · ⟨u, v⟩

Remark: When the scalars are the complex numbers C, often a variant of the symmetry condition is useful, namely a hermitian condition that ⟨u, v⟩ is the complex conjugate of ⟨v, u⟩.

Remark: When the scalars are real, sometimes the non-degeneracy condition is usefully replaced by a positive-definiteness condition, namely that ⟨v, v⟩ ≥ 0, with equality only for v = 0. An analogous condition is likewise often appropriate in the complex-scalar case.

When a vector space V has a scalar product ⟨, ⟩, there is a natural linear map v → λv from V to its dual V* given by

    λv(w) = ⟨v, w⟩
Proof: The non-degeneracy in the first argument means that for v ≠ 0 the linear functional λv is not 0, since there is w ∈ V such that λv(w) ≠ 0. Thus, the linear map v → λv has kernel {0}, so v → λv is injective. Since V is finite-dimensional, from above we know that it and its dual have the same dimension. Let L(v) = λv. Since (from above)

    dim Im L + dim ker L = dim V

the dimension of the image of V under v → λv in V* is that of V. Since (from above) proper subspaces have strictly smaller dimension, it must be that L(V) = V*.  ///

Proof: Suppose that L(v) ∈ W⊥. Then λv(w) = 0 for all w ∈ W. That is, ⟨v, w⟩ = 0 for all w ∈ W. On the other hand, suppose that ⟨v, w⟩ = 0 for all w ∈ W. Then λv(w) = 0 for all w ∈ W, so λv ∈ W⊥.  ///
Corollary: Redefine

    W⊥ = {v ∈ V : ⟨v, w⟩ = 0 for all w ∈ W}

Then

    dim W + dim W⊥ = dim V

and

    W⊥⊥ = W

Proof: With our original definition of W⊥_orig as

    W⊥_orig = {λ ∈ V* : λ(w) = 0 for all w ∈ W}

we had proven

    dim W + dim W⊥_orig = dim V

Thus, since the map v → λv identifies the present W⊥ with W⊥_orig, the same dimension count holds for the redefined W⊥ as well.  ///
We have

    det | 1          1          . . .   1          |
        | x1         x2         . . .   xn         |
        | x1^2       x2^2       . . .   xn^2       |
        | x1^3       x2^3       . . .   xn^3       |
        | . . .      . . .              . . .      |
        | x1^(n−1)   x2^(n−1)   . . .   xn^(n−1)   |   =   (−1)^(n(n−1)/2) · ∏_(i<j) (xi − xj)

where the ∏ means product over the indicated indices, just as Σ means sum. A matrix of that form is a Vandermonde matrix, and its determinant is a Vandermonde determinant. Granting some standard facts about determinants, the idea of the proof of this formula is straightforward. However, to make the idea fully legitimate, it is necessary to do some further work, namely verify that polynomial rings in several variables are unique factorization domains. We do that in the following appendix.
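The identity can be verified numerically for small n; the sketch below computes the determinant directly from the permutation-sum formula discussed in this section (helper names are our own):

```python
from itertools import permutations
from fractions import Fraction

def sign(p):
    """Parity of a permutation, counted by inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return -1 if inv % 2 else 1

def prod_entries(M, p):
    """Product M[0][p(0)] * ... * M[n-1][p(n-1)]."""
    r = Fraction(1)
    for i, j in enumerate(p):
        r *= M[i][j]
    return r

def det(M):
    """Determinant via the permutation-sum (Leibniz) formula."""
    return sum(sign(p) * prod_entries(M, p)
               for p in permutations(range(len(M))))

def vandermonde(xs):
    """Rows are powers 0..n-1, columns are the points xs."""
    n = len(xs)
    return [[Fraction(x) ** i for x in xs] for i in range(n)]

xs = [1, 2, 4, 7]
n = len(xs)
product = Fraction(1)
for i in range(n):
    for j in range(i + 1, n):
        product *= xs[i] - xs[j]
assert det(vandermonde(xs)) == (-1) ** (n * (n - 1) // 2) * product
```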
The idea of the proof of the identity is as follows. First, we note that whatever the determinant is, it is a polynomial in x1, . . ., xn. It is a standard fact that if two columns of a matrix are the same, then the determinant is 0. From this we conclude (!) that for i ≠ j the determinant is divisible by xi − xj. Since xi − xj and xj − xi differ only by ±1, and we do not want inadvertently to include the same factor twice, we conclude (!) that the determinant is divisible by the product

    ∏_(i<j) (xi − xj)
We want to argue that the determinant can have no further polynomial factors, so up to a constant (which we'll determine) it is equal to the latter product. The notion of total degree is useful. The total degree of a monomial x1^(m1) · · · xn^(mn) is m1 + . . . + mn. The total degree of a polynomial is the maximum of the total degrees of the monomials occurring in it. We grant for the moment the result of the proposition below, that the total degree of a product is the sum of the total degrees of the factors. The total degree of our product is

    Σ_(1≤i<j≤n) 1 = Σ_(1≤i<n) (n − i) = n(n − 1)/2
To determine the total degree of the determinant, we invoke a standard formula for the determinant of a matrix M with entries Mij, namely that

    det M = Σ_p σ(p) · M_(1, p(1)) · M_(2, p(2)) · · · M_(n, p(n))

where p runs over permutations of n things and σ(p) is the sign or parity of p, that is, σ(p) is +1 if p is a product of an even number of 2-cycles and is −1 if p is the product of an odd number of 2-cycles. Qualitatively, up to ±1, this expresses the determinant as a sum of products of elements from the first, second, third, . . ., nth columns, no two in the same row. In particular, since the matrix is square, there must be exactly one factor from each row. In a Vandermonde matrix all the top row entries have total degree 0, all the second row entries have total degree 1, and so on. Thus, in this sort of sum for a Vandermonde determinant, each summand has total degree

    0 + 1 + 2 + . . . + (n − 1) = n(n − 1)/2
That is, the total degree of the determinant is equal to the total degree of the product

    Σ_(1≤i<j≤n) 1 = Σ_(1≤i<n) (n − i) = n(n − 1)/2

Thus,

    det | 1          1          . . .   1          |
        | x1         x2         . . .   xn         |
        | x1^2       x2^2       . . .   xn^2       |
        | x1^3       x2^3       . . .   xn^3       |
        | . . .      . . .              . . .      |
        | x1^(n−1)   x2^(n−1)   . . .   xn^(n−1)   |   =   constant · ∏_(i<j) (xi − xj)
To determine the constant, consider the coefficient of the monomial x1^(n−1) · x2^(n−2) · · · x_(n−1) on both sides. In the product, this monomial is obtained by taking all the x1's in the linear factors x1 − xj with 1 < j, then all the x2's in the linear factors x2 − xj with 2 < j. Continuing in this manner, we get a coefficient of +1 in the product.
In the determinant, the only way to obtain this monomial is as the product of entries from lower left to upper right. The indices of these entries are (n, 1), (n − 1, 2), . . . , (2, n − 1), (1, n). Thus, the coefficient of this monomial is (−1)^t, where t is the number of 2-cycles necessary to obtain the permutation p with the property

    p(i) = n + 1 − i

There are at least two ways to count this. We might observe that this permutation is expressible as a product of two-cycles

    (1 n)(2 n−1)(3 n−2) · · · (n/2  n/2 + 1)            (for n even)
    (1 n)(2 n−1)(3 n−2) · · · ((n−1)/2  (n+3)/2)        (for n odd)

Thus, for n even there are n/2 two-cycles, and for n odd there are (n − 1)/2 two-cycles. We might insist on arranging a closed form for this. Since these numbers will be the exponent on −1, we only care about their values modulo 2. Thus, because of the division by 2, we only care about n modulo 4, and we have values

    n ≡ 0 mod 4 :  t = n/2, even
    n ≡ 1 mod 4 :  t = (n − 1)/2, even
    n ≡ 2 mod 4 :  t = n/2, odd
    n ≡ 3 mod 4 :  t = (n − 1)/2, odd

///
Proof: The fact that the total degree of the product is less than or equal to the sum of the total degrees is clear. However, it is less clear that there cannot be
any cancellation which might cause the total degree of the product to be strictly
less than the sum of the total degrees. It is true that such cancellation does not
occur, but the proof is a little less clear than in the single-variable case. One way
to demonstrate the non-cancellation is as follows.
Let x1^(e1) · · · xn^(en) and x1^(f1) · · · xn^(fn) be two monomials of highest total degree t occurring with non-zero coefficients in f and g, respectively. We can assume without loss of generality that the exponents e1 and f1 of x1 in the two expressions are the largest among all monomials of total degree t in f and g, respectively. Similarly, we can assume without loss of generality that the exponents e2 and f2 of x2 in the two expressions are the largest among all monomials of total degree t in f and g, respectively, and so on. We claim that the coefficient of the monomial

    M = x1^(e1 + f1) · · · xn^(en + fn)

is simply the product of the coefficients of x1^(e1) · · · xn^(en) and x1^(f1) · · · xn^(fn), so non-zero. Let x1^(u1) · · · xn^(un) and x1^(v1) · · · xn^(vn) be two other monomials occurring in f and g such that for all indices i we have ui + vi = ei + fi. By the maximality assumption on e1 and f1, we have e1 ≥ u1 and f1 ≥ v1, so the only way that the necessary power of x1 can be achieved is that e1 = u1 and f1 = v1. Among exponents with these maximal exponents of x1, e2 and f2 are maximal, so e2 ≥ u2 and f2 ≥ v2, and again it must be that e2 = u2 and f2 = v2 in order to obtain the exponent of x2. Continuing inductively, we find that ui = ei and vi = fi for all indices. That is, the only terms in f and g which contribute to the coefficient of the monomial M in the product f · g are the two monomials x1^(e1) · · · xn^(en) and x1^(f1) · · · xn^(fn). Thus, the coefficient of the monomial M is non-zero, and the total degree is indeed as large as claimed.  ///
Appendix: Polynomials
The goal here is to prove that rings of polynomials in several variables with
coefficients in a field are unique factorization domains, meaning that such polynomials can be factored essentially uniquely into irreducible polynomials. We will
make this precise. Among other uses, this fact is necessary in discussion of Vandermonde determinants, and is useful in the proof that the parity (or sign) of a
permutation is well-defined.
For precision, we need some definitions. Let R be a commutative ring with 1. For r, s in R, r divides s, written r|s, if there is t ∈ R such that s = tr. Such an R is said to be a domain or integral domain if for r, s ∈ R the equation r · s = 0 implies that either r or s is 0. The units R× in R are the elements of R with multiplicative inverses. An irreducible element p in R is a non-unit with the property that if p = xy for x, y ∈ R, then either x or y is a unit.
to express both r and s. Then the greatest common divisor has exponents which
are the minima of those of r and s
min(e1 ,f1 )
gcd(r, s) = p1
m ,fm )
. . . pmin(e
m
Proof: Let
min(e1 ,f1 )
g = p1
m ,fm )
. . . pmin(e
m
First, it is easy to see that g does divide both r and s. On the other hand, let d be any common divisor of both r and s. Enlarge the collection of inequivalent irreducibles p_i if necessary so that d can be expressed as

d = w · p_1^{h_1} · · · p_m^{h_m}

with a unit w and non-negative integer exponents. To say that d|r is to say that there is D ∈ R such that dD = r. Let

D = W · p_1^{H_1} · · · p_m^{H_m}

Then we find

wW · p_1^{h_1 + H_1} · · · p_m^{h_m + H_m} = dD = r = u · p_1^{e_1} · · · p_m^{e_m}

Then unique factorization (and the inequivalence of the p_i's) implies that the exponents are the same: for all indices i we have

h_i + H_i = e_i

Thus, h_i ≤ e_i. The same argument applies with r replaced by s, so h_i ≤ f_i, and hence h_i ≤ min(e_i, f_i). Thus, d|g. For uniqueness, observe that any other greatest common divisor h would have g|h, but also h|r and h|s. Using the (unique up to units) factorizations, it is immediate that the exponents of the irreducibles in g and h must be the same, so g and h must differ only by a unit.
///
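The proposition gives a concrete recipe over R = Z: factor both arguments into primes and take the minimum exponent at each prime. A minimal illustrative sketch (the function names are mine, not the text's):

```python
from collections import Counter

def factor(n):
    # exponents of the primes dividing n > 0, by trial division
    f = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            f[d] += 1
            n //= d
        d += 1
    if n > 1:
        f[n] += 1
    return f

def gcd_by_exponents(r, s):
    # gcd has exponent min(e_i, f_i) at each prime p_i
    fr, fs = factor(r), factor(s)
    g = 1
    for p in set(fr) | set(fs):
        g *= p ** min(fr[p], fs[p])
    return g
```

For example, 12 = 2^2 · 3 and 18 = 2 · 3^2 give gcd 2 · 3 = 6; primes missing from one factorization get exponent 0.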
Corollary: Let g = gcd(r, s). Then gcd(r/g, s/g) = 1.

Proof: The characterization of the greatest common divisor given in the previous proposition shows that for each irreducible p in the factorizations of r and s, at least one of the exponents of p in r/g and s/g is 0.
///
Proposition: Any finite list x_1, . . . , x_n of elements of k (the field of fractions of R) has a greatest common divisor.

Proof: We reduce this to the case where everything is inside the ring R. Given a list of elements x_i = a_i/b_i in k with a_i and b_i all in R, take 0 ≠ r ∈ R such that rx_i ∈ R for all indices i. For example, taking r to be the product of the denominators b_i would do. Let G be the greatest common divisor of the rx_i, and then put g = G/r. We claim that this g is the greatest common divisor of the original x_i's. This is straightforward. On one hand, from G|rx_i it follows immediately that g|x_i. On the other hand, if d|x_i then rd|rx_i, so rd divides G = rg, and then d|g.
///
Definition: The content cont(f) of a polynomial f in k[x] is the greatest common divisor of the coefficients of f.

The following lemma is the crucial point in proving the theorem.

Lemma: (Gauss) Let f and g be two polynomials in k[x]. Then

cont(fg) = cont(f) · cont(g)

Remark: The values of the content function are only well-defined up to units R×. Thus, the equality of Gauss's lemma more properly concerns the equivalence classes of irreducibles dividing the respective coefficients.
Proof: From the remark just above we see that for any c ∈ k

cont(c · f) = c · cont(f)

Thus, from the proposition above, which notes that

gcd(a/gcd(a, b), b/gcd(a, b)) = 1
we can assume without loss of generality that cont(f) = 1 and cont(g) = 1, and must prove that cont(fg) = 1. Suppose not. Then there is a non-unit irreducible element p of R dividing all the coefficients of fg. Put

f(x) = a_0 + a_1 x + a_2 x^2 + . . .
g(x) = b_0 + b_1 x + b_2 x^2 + . . .

But p does not divide all the coefficients of f, nor all those of g. Let i be the smallest integer such that p does not divide a_i. Let j be the smallest integer such that p does not divide b_j. Now consider the coefficient of x^{i+j} in fg. It is

a_0 b_{i+j} + a_1 b_{i+j-1} + . . . + a_{i-1} b_{j+1} + a_i b_j + a_{i+1} b_{j-1} + . . . + a_{i+j-1} b_1 + a_{i+j} b_0

In all the summands to the left of a_i b_j the factor a_k has k < i, so is divisible by p, and in all the summands to the right of a_i b_j the factor b_k has k < j, so is divisible by p. This leaves only the summand a_i b_j to consider. Since the whole sum is divisible by p, it follows that p|a_i b_j. Since R is a unique factorization domain, either p|a_i or p|b_j, contradiction. Thus, it could not have been that p divided all the coefficients of fg.
///
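Over R = Z the multiplicativity of content asserted by Gauss's lemma is easy to check numerically. A minimal sketch (the function names are mine, not the text's), with coefficient lists written lowest degree first:

```python
from math import gcd
from functools import reduce

def content(coeffs):
    # content = gcd of the coefficients (well-defined up to sign, i.e. up to units of Z)
    return reduce(gcd, (abs(c) for c in coeffs))

def poly_mul(f, g):
    # multiply two polynomials given as coefficient lists, lowest degree first
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

f = [6, 2, 4]   # 6 + 2x + 4x^2, content 2
g = [9, 3]      # 9 + 3x, content 3
assert content(poly_mul(f, g)) == content(f) * content(g)  # both sides are 6
```

Here fg = 54 + 36x + 42x^2 + 12x^3, whose coefficients have gcd 6 = 2 · 3, as the lemma predicts.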
Proof: (of theorem) We can now combine the corollaries of Gauss's lemma to prove the theorem. Given a polynomial f in R[x], let c = cont(f), so from above cont(f/c) = 1. The hypothesis that R is a unique factorization domain allows us to factor c into irreducibles in R, and we showed just above that these irreducibles remain irreducible in R[x].
Replace f by f/cont(f) to assume now that cont(f) = 1. Factor f into irreducibles in k[x] as

f = u · p_1^{e_1} · · · p_m^{e_m}

where u is in k×, the p_i's are irreducibles in k[x], and the e_i's are positive integers. We can replace each p_i by p_i/cont(p_i) and replace u by

u · cont(p_1)^{e_1} · · · cont(p_m)^{e_m}

so that the new p_i's are in R[x] and have content 1. Since content is multiplicative, from cont(f) = 1 we find that cont(u) = 1, so u is a unit in R. The previous corollaries demonstrate the irreducibility of the (new) p_i's in R[x], so this gives a factorization of f into irreducibles in R[x]. That is, we have explicit existence of a factorization into irreducibles.
Now suppose that we have two factorizations

f = u · p_1^{e_1} · · · p_m^{e_m} = v · q_1^{f_1} · · · q_n^{f_n}

where u, v are in R (and have unique factorizations there) and the p_i and q_j are irreducibles in R[x] of positive degree. From above, all the contents of these irreducibles must be 1. Looking at this factorization in k[x], it must be that m = n and, up to renumbering, p_i differs from q_i by a constant in k×, with e_i = f_i. Since all these polynomials have content 1, in fact p_i differs from q_i by a unit in R. By equating the contents of both sides, we see that u and v differ by a unit in R×. Thus, by unique factorization in R, their factorizations into irreducibles in R (and, from above, in R[x]) must be essentially the same. Thus, we obtain uniqueness of factorization in R[x].
///
Bibliography
The fundamental paper from which nearly all these things originate is [Shannon 1948]. There are several other introductory texts on coding, meeting various
tastes, mostly emphasizing the error-correction aspects and omitting discussion of
compression. [Roman 1992] includes both. Some devoted mostly to error correction are [Berlekamp 1968], [Lidl Niederreiter 1986], [van Lint 1998], [McEliece 1977],
[Pless 1998], [Pretzel 1999], [Wells 1999], [Welsh 1988]. Discussion of compression
appears in its own right in other sources such as [Salomon 1998] and [Sayood 1996].
An encyclopedic reference for error-correction is [MacWilliams Sloane 1977]. The
collection [Verdu McLaughlin 2000] contains many tutorial and historical articles.
[Conway Sloane 1988] discusses lattices and sphere packing and applications to
coding, among many other uses.
[Berlekamp 1968] E. R. Berlekamp, Algebraic Coding Theory, McGraw-Hill, New
York, 1968.
[Conway Sloane 1988] J. H. Conway, N. J. A. Sloane, Sphere Packings, Lattices, and Groups, Springer-Verlag, New York, 1988.
[Forney 1966] G. D. Forney, Concatenated Codes, M.I.T. Press, Cambridge, MA, 1966.
[Justesen 1972] J. Justesen, A class of constructive asymptotically good algebraic codes, IEEE Trans. Info. Theory 18 (1972), pp. 652-656.
[Lidl Niederreiter 1986] R. Lidl, H. Niederreiter, Introduction to finite fields and their applications, Cambridge University Press, Cambridge, 1986.
[van Lint 1998] J. H. van Lint, Introduction to Coding Theory, third edition,
Springer-Verlag, New York, 1998.
[MacWilliams Sloane 1977] F. J. MacWilliams, N. J. A. Sloane, The Theory of Error-Correcting Codes, North-Holland, Amsterdam, 1977.
[McEliece 1977] R. J. McEliece, The Theory of Information and Coding, Encyclopedia of Math. and its Applications, Vol. 3, Addison-Wesley, Reading, MA,
1977.
Selected Answers
1.01 5, 7, 6 elements, respectively.
1.08 7/10, (4 choose 2)
1.23 (8 choose 3), 8^3, 3^8, 8!/(3! 3! 2!)
1.24 Send 2n to n.
1.25 Send 0 to 0, send n > 0 to 2n, and n < 0 to 2|n| − 1.
1.28 35 and 10 · 9, respectively.
1.29 (10 choose 3)(1/2)^10, 1 − (1/2)^10
1.30 3/(3 + 7) = 3/10, (10 choose 5)(1/2)^10, 1 − (1/2)^20, (20 choose 10)
1.40 Σ_{i=0} . . . is not obvious.)
1.53 Use Chebysheff's inequality.
1.54 Use Chebysheff's inequality.
1.55 Use Chebysheff's inequality.
1.56 Use Chebysheff's inequality.
. . . sum of random variables is the sum of the expected values. The intuitively reasonable answer Np is indeed correct, but does not actually follow easily from the definition of expected value.
2.01 15/8
4.08 (log2 24)/5, 4/5
4.09 1/2
. . . this prime factorization. Thus, we have 3 choices (0, 1, or 2) of how many factors of 2 to include, 2 choices (0 or 1) of how many factors of 3 to include, and similarly 2 choices of how many factors of 5 to include: 3 · 2 · 2 = 12 positive factors.
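The stated choices pin down the number as 2^2 · 3 · 5 = 60, and the divisor count 3 · 2 · 2 = 12 can be confirmed by brute force:

```python
# number of positive divisors of n = 2^2 * 3 * 5 = 60, by brute force
n = 2**2 * 3 * 5
count = sum(1 for d in range(1, n + 1) if n % d == 0)
assert count == 3 * 2 * 2 == 12
```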
6.13 To say that d|m is to say that there is an integer k such that k · d = m. Then (−k) · d = −m, which shows that d divides −m.
6.15 One really should recognize the binomial coefficients: 1331 = (1 + 10)^3 and 14641 = (1 + 10)^4.
6.16 73
6.17 11
6.18 128
6.19 Since d|n there is an integer a such that ad = n. Likewise, there is an integer b such that bd = n + 2. Subtracting, (b − a)d = (n + 2) − n = 2. Thus d|2.
. . . sizes of the mutually disjoint subsets which appear. There is 1 partition with one subset, there are (4 choose 3) partitions with a 3-element and a 1-element subset, (4 choose 2)/2! with two 2-element subsets, (4 choose 2) with a 2-element and two 1-element subsets, and 1 with four 1-element subsets. Adding these up, there are 15 partitions, hence 15 equivalence relations.
6.46 n
6.47 8
6.51 Since 3^2 = 9 = −1 mod 10, we have 3^4 = (−1)^2 = 1 mod 10. Since 999 = 4 · 249 + 3, 3^999 mod 10 is 3^{4·249+3} = (3^4)^249 · 3^3 modulo 10, which simplifies to 1^249 · 3^3 = 27 = 7 modulo 10, since 3^4 = 1 mod 10.
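The computation in 6.51 is easy to confirm with Python's built-in fast modular exponentiation:

```python
# fast modular exponentiation confirms the hand computation in 6.51
assert pow(3, 4, 10) == 1      # 3^4 = 81 = 1 mod 10
assert pow(3, 999, 10) == 7    # so 3^999 = 3^3 = 27 = 7 mod 10
```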
6.52 67. Brute force is plausible here. Or the Extended Euclidean Algorithm.
6.53 This is less palatable by brute force, but the Extended Euclidean Algorithm works well and gives −143, equivalently, 1091.
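The Extended Euclidean Algorithm invoked in 6.52 and 6.53 can be sketched as follows; the actual moduli of those exercises are not recoverable from this text, so the usage numbers below are mine:

```python
def egcd(a, b):
    # returns (g, x, y) with a*x + b*y = g = gcd(a, b)
    if b == 0:
        return (a, 1, 0)
    g, x, y = egcd(b, a % b)
    return (g, y, x - (a // b) * y)

def inverse_mod(a, m):
    # multiplicative inverse of a mod m, via the extended Euclidean algorithm
    g, x, _ = egcd(a, m)
    assert g == 1, "a must be relatively prime to m"
    return x % m
```

For example, inverse_mod(3, 10) returns 7, since 3 · 7 = 21 = 1 mod 10.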
6.56 8, 8, and 8
6.60 18
6.61 375
6.62 4 and 83
6.68 No: by Euler's criterion, since 2^{(101−1)/2} = −1 mod 101.
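Euler's criterion in 6.68 is easy to confirm directly:

```python
p = 101
# Euler's criterion: a is a square mod p exactly when a^((p-1)/2) = 1 mod p
assert pow(2, (p - 1) // 2, p) == p - 1   # -1 mod 101, so 2 is not a square mod 101
```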
. . . (b^{(p+1)/4})^2 = b^{(p+1)/2} = a^{p+1} = a^{p−1} · a · a = a · a = b
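The identity above is the standard recipe for square roots modulo a prime p = 3 mod 4: when b is a square mod p, b^{(p+1)/4} is a square root of it. A sketch (the function name and example values are mine):

```python
def sqrt_mod(b, p):
    # square root of b modulo a prime p = 3 mod 4, valid when b is a square mod p
    assert p % 4 == 3
    r = pow(b, (p + 1) // 4, p)
    assert r * r % p == b % p, "b is not a square mod p"
    return r
```

For example, sqrt_mod(5, 19) returns 9, and indeed 9^2 = 81 = 5 mod 19.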
7.03 (1 2 3 4 7 6 5), of order 7
7.05, 7.07 1 5 5 4 3 3 2 4 3 7 4 6 5 2 6 1 7 3 = 20
7.11 6 = lcm(2, 3)
7.12 12 = lcm(3, 4)
7.13 20 = lcm(4, 5)
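The lcm pattern in 7.11-7.13 reflects the fact that the order of a permutation is the least common multiple of its cycle lengths. A small sketch (not from the text), with a permutation given as a list mapping i to perm[i] on {0, ..., n−1}:

```python
from math import gcd
from functools import reduce

def order(perm):
    # order of a permutation = lcm of its cycle lengths
    seen, lengths = set(), []
    for i in range(len(perm)):
        if i in seen:
            continue
        j, length = i, 0
        while j not in seen:       # walk the cycle through i
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return reduce(lambda a, b: a * b // gcd(a, b), lengths, 1)
```

For example, a product of a 2-cycle and a 3-cycle, such as [1, 0, 3, 4, 2], has order lcm(2, 3) = 6, matching 7.11.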
10.10 x^2 + x + 1
10.12 x^3 + x^2 + 1
11.01 α + 1
11.03 α^2 + α + 1
11.05 α + 1
11.06 α^2
11.09 x(x + 1)(x^2 + x + 1)
11.10 x(x + 1)(x^3 + x + 1)(x^3 + x^2 + 1)
12.03 4/7
12.18
G = ( 1 1 0 1 )
    ( 1 0 0 1 )
and
( 1 0 0 1 0 0 1 0 0 )
( 0 1 0 0 1 0 0 1 0 )
( 0 0 1 0 0 1 0 0 1 )
15.11 x modulo x^2 + x + 1
15.12 x modulo x^3 + x + 1
15.15 Φ_105(x) has some coefficients −2.
16.01 7
16.04 5
. . . 11 0 . . . 11 8 . . . 1 8 0 1

Σ_{i<cN} (N choose i) ≤ 2^{N·H(c)}

where

H(c) = −c log2 c − (1 − c) log2(1 − c)

Thus, we want H(c) < 2/3. Numerical experimentation yields H(.1667) ≈ 0.65.
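The numerical experimentation with the binary entropy function is easy to reproduce:

```python
from math import log2

def H(c):
    # binary entropy function, in bits
    return -c * log2(c) - (1 - c) * log2(1 - c)

# matches the answer's experimentation: H(.1667) is about 0.65, just under 2/3
assert abs(H(0.1667) - 0.65) < 0.005
```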
18.06 Imitate the proof for the mirage codes. There are (2^n − 1) · (2^n − 1) choices for (α, β), and we want the number of bad pairs to be less than the total:

Σ_{i<c·3n} (3n choose i) ≤ (2^n − 1)^2 − 1

That is, for n ≥ 4 there is a choice of α, β such that the code has minimum distance at least c · 3n ≥ (1/6) · 3n = n/2. The error-correction rate is at least 1/6, and the information rate is 1/3.
19.03 Look at this equation modulo 4, and realize (check!) that the only squares mod 4 are 0, 1.
19.04 Look at this modulo 7. The cubes modulo 7 are 0 and ±1.
19.05 Look at this modulo 9.
19.11 Hint: the polynomial x^4 + x^3 + x^2 + x + 1 is the 5th cyclotomic polynomial, and F_13× is a cyclic group of order 12, while 5 is relatively prime to 12.
19.13 First, show that 1 is not in I: if 1 = g(x) · 2 + h(x) · x, then modulo x we have 1 = g(0) · 2, but this would say that 1 is an even integer, which it is not. Next, observe that the quotient ring Z[x]/I is Z/2, since in the homomorphism to the quotient, polynomials lose their higher-degree terms since x goes to 0. Likewise, 2 goes to 0. (But 1 does not go to 0.)
20.01 The point (1, 0) is an obvious point on the curve. Consider the line of slope t through (1, 0), y = t(x − 1). We find its intersection with the curve x^2 − 2y^2 = 1 by substituting for y, obtaining x^2 − 2t^2(x − 1)^2 = 1. This quadratic equation has the root x = 1 and another root x = (2t^2 + 1)/(2t^2 − 1). The corresponding y = t(x − 1) is 2t/(2t^2 − 1). Thus, every rational t gives a solution. Conversely, any solution other than (1, 0) gives a line of rational slope, hence is of this form for some t.
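The parametrization in 20.01 can be checked with exact rational arithmetic (the function name is mine):

```python
from fractions import Fraction

def point(t):
    # rational point on x^2 - 2y^2 = 1 cut out by the line y = t(x - 1) through (1, 0)
    t = Fraction(t)
    x = (2 * t**2 + 1) / (2 * t**2 - 1)
    y = 2 * t / (2 * t**2 - 1)
    return x, y

# every rational slope t gives a rational solution
x, y = point(Fraction(1, 2))
assert x**2 - 2 * y**2 == 1
```

For instance, t = 1/2 gives the point (−3, −2), and (−3)^2 − 2 · (−2)^2 = 9 − 8 = 1.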
20.04 The projectivization is given by the homogenized equation x^2 − 2y^2 = z^2. The points at infinity are those where z = 0, namely given by the equation x^2 − 2y^2 = 0. If there is a square root ±√2 of 2 in whatever the field is, then the points (±√2, 1, 0) (in projective coordinates) are the points at infinity. If there is no square root in the underlying field then there are no points at infinity.
20.05 The gradient of the function f(x, y) = x^2 − 2y^2 is (2x, −4y). If the characteristic of the underlying field is not 2, this vector is the zero vector only for x, y both 0, which is a point not on the curve. That is, there are no singular points in the affine plane.
20.06 From an earlier exercise, just above, if there is no square root of 2 in the underlying field, then there are no points at infinity. If there is a square root of 2, then there are points (±√2, 1, 0). If the characteristic of the field is not 2, then the gradient (2x, −4y, −2z) of f(x, y, z) = x^2 − 2y^2 − z^2 does not vanish there, so these points are non-singular.
Index
abelian group 146
addition modulo P 197
addition of polynomials 84
additive identity 167
additive inverse 167
additivity of degrees 179
affine plane 343
algebraic closure 339
algebraic curve 335
algebraic derivative 244
algebraic geometry code 354
alphabet 45
ambiguous decoding 201
atomic event 13
augmented matrix 216
automorphism 327
average word length 51
concatenated codes 297, 301
conditional entropy 42
conditional probability 16
congruence 111
congruence class 112, 193
congruent modulo m 111
coprime 97
coset 148, 224
coset leader 225
counting irreducibles 329
counting primitive roots 256
counting primitives 331
CRC 86, 272
curves 335
cycles 135
cyclic 158
cyclic codes 235, 282, 287
cyclic redundancy checks 86
cyclic subgroup 153
cyclotomic polynomial 247
cylinders 20
data polynomial 86
decimal 46
decoding 63
decomposition into disjoint cycles 136
degree of divisor 351
degree of field extension 319
degree of polynomial 83, 178, 195
derivative 244
designed distance 282
diagonal 210
dimension 208, 211, 364
discrete logarithm 120
discrete memoryless channel 61
disjoint 2
disjoint cycles 136
distinct 5
distributive 167
divides 96, 182
divisibility 181
division algorithm 93
division of polynomials 85
division ring 168
divisor 96, 348, 351
dot product 210, 371
dual basis 368
dual code 222
dual space 368
efficient code 54
eigenvectors 270
element 2
elementary column operations 211
elementary row operations 211
elliptic curve 342
encoding 45
entries of vector 209
entropy 33, 37, 39
equivalence class 109, 112
equivalence relation 108, 109
erasure channel 62
erasures 36
error polynomial 88
error probability 62
error vector 88
Euclidean algorithm 105, 181, 187
Euler criterion 121
Euler phi-function 97, 118, 128
Euler totient function 97
Euler's theorem 118, 154
evaluation homomorphism 173
evaluation map 173
event 13
exclusive-or 87
expected value 20
exponent 247
exponent of group 155
exponents 151
extension field 195, 319
factorial 5
factorization 100
failure of unique factorization 103
failure to detect error 88
fair coin 9
fair wager 21
fast modular exponentiation 122
Fermat's Little Theorem 117
field 115, 168, 175, 193
field extension 195
field of definition of curve 336
field of fractions 349
field of functions 350
field of rational functions 348
finite field 82, 115, 192, 193
finite group 148
finite-dimensional 364
floor function 73, 221
Frobenius automorphism 321
Frobenius map 289, 321
function 3
functional 368
Galois field 82, 193
gcd 97, 105, 182
generating functions 25
generating matrix 219, 230
generating polynomial 282
generating polynomial for LFSR 267
generating polynomial of CRC 86
generator of group 158
generators of ideal 312
genus 348, 352
geometric Goppa codes 353
Gilbert-Varshamov bound 230
Goppa codes 353
gradient 340
greatest common divisor 97, 105, 182
group 145
group homomorphism 156
group identity 145
group inverse 146
group of units 168
Hamming [7, 4] code 205
Hamming bound 230
Hamming codes 285
Hamming decoding 207
Hamming distance 66, 228
Hamming weight 66, 88, 228
Hartley 39
Hasse-Weil inequality 355
hermitian curve 355
hexadecimal 46
homogeneous coordinates 343
homogenize 344
homogenized equation 345
homomorphism 156, 171, 313, 365
Huffman encoding 54
hyperelliptic curve 341
ideal 172, 309
identity 167
identity function 4
identity in a group 145
identity matrix 210
i.i.d. 45
image 156, 365
independent 12
independent events 17
independent random variables 22
independent trials 14
indeterminate 83
index 120
index of subgroup 150
infinite 4
infinity, points at 342
information 33
information positions 220
information rate 34, 71
initial state 267
injective 3
inner code 301
inner product 210, 371
input alphabet 61
instantaneous code 47
integers modulo m 108, 111, 112
integers modulo primes 83, 115
integral domain 168
interleaver 141
intersection 2
intersection divisor 352
inverse function 4
inverse in group 146
inverse permutation 135
inverses modulo P 197
irreducible 182
irreducible polynomial 90
isomorphism 158, 171, 313, 365
joint entropy 40
Justesen codes 303
kernel 156, 172, 313, 365
Kraft inequality 48
Lagrange replacement principle 363
Lagrange theorem 148
law of large numbers 28
laws of exponents 151
left coset 148
left ideal 313
left translate 148
length 48
LFSR 267
limiting frequency 11, 15
linear [n, k]-code 220
linear algebra 360
linear codes 200, 218, 282, 287
linear combination 362
linear dependence 209, 215, 362
linear feedback shift register 267
linear functional 368
linear independence 209, 362
linear system 352
linear systems on curves 348
list 1
look-ahead 47
McMillan inequality 48
majority vote 36
map 3
mapping 3
Markov inequality 27
matrix 210
maximal ideal 318
maximum-likelihood decoding 66, 71
maximum-likelihood rule 66
MDS code 232, 285
meet 2
memoryless source 44
minimum-distance 221
minimum-distance decoding 67, 71, 221
minimum-distance separating code 232
minimum-distances in linear codes 234
minimum-error rule 66
mirage codes 297
modular curves 355
modular exponentiation 122
modulus 94
monic polynomial 179
Monty Hall paradox 31
multiple 96, 182
multiple factors 243
multiplication modulo P 197
multiplication of polynomials 84
multiplicative inverse 168
multiplicative inverse modulo m 96, 107
multiplicative inverses modulo P 197
multiplicity 347
mutually disjoint 14
mutually exclusive 14
mutually prime 97
noise 61
noiseless coding 44
noiseless coding theorem 51
noiseless decoding 33
noisy coding theorem 72
non-existence of primitive roots 257
non-singular curve 340
non-unique factorization 103
one-to-one 3
onto 3
optimal code 54
optimality of Huffman encoding 58
order of a permutation 138
order of group 148
order of shuffle 141
ordered k-tuple 5
ordered pair 2
ordering 5
orthogonal complement 369
outer code 301
output alphabet 61
overhand shuffle 139
pair 2
parity-check 63, 82
parity-check bit 82, 86, 87
parity-check condition 224
parity-check matrix 224, 280
parity-check positions 220
partition 111
perfect code 230
perfect field 245
period of LFSR 270
permutation 134
phi-function 97, 118, 128
pigeon-hole principle 155
pivot 212
planar curve 335
point at infinity 342, 343
polynomial 83, 178
polynomial ring 178
power, cartesian 2
power residues 121
power set 3
powers in a group 162
prefix 47
prefix code 47
primality test 102, 184
prime factorization 102
prime number 97
primitive element 240, 252
primitive polynomial 90, 260, 331
primitive root 120, 241, 253, 256
primitivity test 264
principal ideal 310
probability 8
probability measure 13
product, cartesian 2
product of permutations 135
product random variable 22
projective plane 343
projective plane curve 342
proper divisor 97, 184
proper ideal 310
proper subset 2
properties of exponents 151
pseudo-random numbers 267
quotient homomorphism 318
quotient ring 317
random variable 20
rate 71, 203
rational curve 341
rational points on curve 336
real-life CRCs 88
reduced form 194, 195, 219
reduced modulo P 193
reduction 181
reduction algorithm 93
reduction homomorphism 172
reduction modulo m 93
redundancy 33, 34
Reed-Solomon code 282
reflexivity 109
relation on a set 109
relative error correction 284
relatively prime 97
repeated factors 243
replacement principle 363
representative for equivalence class 110
residue class 112
retransmission 36
Riemann-Roch theorem 352
riffle shuffle 140
right ideal 313
ring 116, 167
ring homomorphism 171
roots in a group 162
roots modulo primes 121
row operations 211
row reduced 212
row reduction 213
row space 212, 220
RS code 282
rule 2
sample space 13
scalar multiple 209
scalar product 210, 371
second dual 370
seed 267
self-information 37
semantics-based error correction 35
set 1
shuffle 139
Singleton bound 232, 285
singular point of curve 340
singularities of curves 339
source words 44, 220
span 362
sphere-packing bound 230
stabilizer subgroup 328
standard basis 365
standard deviation 24
standard form 219
Stirling's formula 356
stochastic matrix 62
strings 45
subfield 195, 319
subgroup 147
subgroup generated by g 153
subgroup index 150
subring 309
subset 2
subspace spanned 362
sum random variable 22
Sun-Ze's theorem 124
supercode 301
support of divisor 351
surjective 3
symmetric group 134
symmetry 109
syndrome 224
syndrome decoding 222
syntax-based error correction 35
systematic 220
systematic form 219
test primality 102
testing for primitivity 264
total degree 344
transition matrix 269
transition probabilities 62
transitivity 109
translate 148
transpose 210
trial 9
trial division 102, 184
triangle inequality 203, 229
trivial ideal 310
trivial permutation 135
Tsfasman-Vladut-Zink-Ihara bound 354
uncertainty 33
uncorrectible error 201
undetectable error 201
union 2
unique decipherability 46
unique factorization 100, 189
unit 167
unordered list 1
Vandermonde determinant 277
Vandermonde matrix 277
variance 20, 24
variant check matrix 280
Varshamov-Gilbert bound 230
vector 208
vector space 360
vector subspace 362
vector sum 209
volume of ball 228
Weak Law of Large Numbers 28
word error probability 71
word length 48
XOR checksum 87
zero divisor 168
zero matrix 211
zero vector 209