Introduction To Algebraic Coding Theory - 2022
Series Editor
M Zuhair Nashed (University of Central Florida)
Editorial Board
Guillaume Bal, Gang Bao, Liliana Borcea, Raymond Chan, Adrian Constantin, Willi Freeden, Charles W Groetsch, Mourad Ismail, Palle Jorgensen, Marius Mitrea, Otmar Scherzer, Frederik J Simons, Edriss S Titi, Luminita Vese, Hong-Kun Xu, Masahiro Yamamoto
This series aims to inspire new curricula and to integrate current research into texts. Its main scope is to publish:
– Cutting-edge Research Monographs
– Mathematical Plums
– Innovative Textbooks for capstone (special topics) undergraduate and graduate level
courses
– Surveys on recent emergence of new topics in pure and applied mathematics
– Advanced undergraduate and graduate level textbooks that may initiate new directions
and new courses within mathematics and applied mathematics curriculum
– Books emerging from important conferences and special occasions
– Lecture Notes on advanced topics
Monographs and textbooks on topics of interdisciplinary or cross-disciplinary interest are
particularly suitable for the series.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance
Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy
is not required from the publisher.
Printed in Singapore
To my wife Ping
Preface
the proofs are consigned to classic books by Zariski and Samuel [16],
Walker [15], Chevalley [10], and Mumford [13].
The final part of this text is devoted to the well-known geometric Goppa
codes [25]. Their early decoding processes depend on linear algebra only and
are elementary. For the decoding processes, we give explicit descriptions and
try to present them as naturally as possible. Further discussions involving
the remarkable concepts of Feng and Rao [21] on majority voting will also
be presented in this way.
The field of coding theory is too rich to be covered in a one-semester course. We have added appendices to discuss familiar topics such as convolution codes, the sphere-packing problem, other interesting codes, and Berlekamp's algorithm, which may benefit interested readers who wish to gain a broader understanding of the related material.
This book is written to be concise. There are about 204 pages for a one-
semester course. We hope that it will be useful to students and working
algebraic geometers alike in understanding the booming field of coding. We
would like to thank W. Heinzer for reading the whole book and making
some valuable suggestions and B. Lucier for commenting on Parts I & II of
our manuscript.
We wish to thank Ms. Rochelle Kronzek, executive editor at World Scientific Publishing Company, for her constant enthusiasm in initiating this project. We are grateful to Ms. Lai Fun Kwong, managing editor at WSP, for her prompt communications and support during the writing of this book.
We wish to thank the anonymous referee who improved this book and
Mr. T. R. Soundararajan for taking care of the final form of the book.
About the Author
Contents
Preface
About the Author
Introduction
Appendices
Appendix A. Convolution Codes
A.1 Representation
A.2 Combining and Splitting
A.3 Smith Normal Form
A.4 Viterbi Algorithm
References
Index
Introduction
All living beings use signals to communicate with each other. The signals,
also known as codes, can take the form of chemicals, sounds, colors, etc.
About two million years ago, humanity gained its own distinctiveness by
creating abstract signals, languages. All languages can be seen as codes.
Many historians have tried to decipher "lost" languages: the most famous example is probably Egyptian hieroglyphs, which were deciphered using the Rosetta Stone. Since ancient times, poems have been used as a way of transmitting oral tradition, including the Iliad and Odyssey in Greek, the Mahabharata and Ramayana in Sanskrit, and the Book of Odes in Chinese. One advantage of the rhyme and metrical structure of verse over prose is that errors, if they occur, are easy to find. In other words, poetry is the first "error-detecting" form of communication.
In 1944, Erwin Schrödinger published a book entitled What is Life? The Physical Aspect of the Living Cell [3], in which he observed that chromosomes are code-scripts and are molecular in nature; ergo, there must be a code of some kind which allows the molecules in a cell to carry information. This observation motivated many scientists to study the codes transmitted by living beings, eventually leading to the discovery of the double-helix structure of DNA by James D. Watson and Francis Crick.
Poetry uses rhymes, and molecules use chemical bonds, to detect errors and correct them. It is natural to impose algebraic relations on the symbols of letters for the same purpose. A computer scientist, R. W. Hamming, used a computer, primitive by today's standards, to perform his research. At that time, scientists had to queue their work for the computer to process sequentially. If the computer found errors (usually the errors

$$G = \begin{pmatrix} 1&0&0&0&1&1&0\\ 0&1&0&0&0&1&1\\ 0&0&1&0&1&1&1\\ 0&0&0&1&1&0&1 \end{pmatrix}$$
and $[a_1 a_2 a_3 a_4] \times G = [a_1 a_2 a_3 a_4 b_1 b_2 b_3]$. The matrix $G$ is called the generator matrix. The $a_i$'s are called the message symbols. Furthermore, let
$$H = \begin{pmatrix} 1&1&0\\ 0&1&1\\ 1&1&1\\ 1&0&1\\ 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}$$
and $[a_1 a_2 a_3 a_4] \times G \times H = [000]$. The matrix $H$ is called the check matrix, and the $b_i$'s are called the check symbols.
The decoding process is as follows. Suppose that the computer, for whatever reason, reads $[a_1 a_2 a_3 a_4 b_1 b_2 b_3]$ as $[a_1' a_2' a_3' a_4' b_1' b_2' b_3']$, which might be different from the original string. However, this kind of error is
Let us take a simple example. Suppose we check one out of two boxes (e.g.,
male and female) to provide some information. Before choosing a box, both
boxes have equal chance of being selected, and we only pick one. So, before
the selection, the entropy is
$$\tfrac{1}{2}\log_2 2 + \tfrac{1}{2}\log_2 2 = 1 = 1 - \text{information}.$$
Therefore, the information is 0. After the selection, the entropy is
$$-1\cdot\log_2 1 = 0 = 1 - \text{information}.$$
Therefore, the information is 1. The information gained is
$$\text{information} = 1 - 0 = 1.$$
For four boxes to be checked, we either use the definition of entropy directly, or group them into two subsets, $\{$box 1, box 2$\}$ and $\{$box 3, box 4$\}$, pick one subset out of the two, and then pick one box from the chosen subset of two boxes, so
$$\text{information} = 1 + 1 = 2.$$
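A quick numerical check of this bookkeeping (a sketch; the function name is ours):

```python
from math import log2

def entropy(probs):
    """Shannon entropy H = -sum p log2 p, skipping zero-probability events."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Two equally likely boxes: 1 bit of information is gained by choosing one.
assert entropy([0.5, 0.5]) == 1.0
assert entropy([1.0]) == 0.0             # after the selection, entropy is 0

# Four boxes: 2 bits, matching the two-step subset-then-box selection.
assert entropy([0.25] * 4) == 2.0
```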
The material world tends to homogenize distributions, for instance, air
tends to mix all components uniformly. These are the results of the increase
of entropy. On the other hand, living beings tend to select from a mixed
It turns out that DNA has some proof-reading capabilities which RNA lacks, although how those proof-reading capabilities function is unclear. These capabilities slow down the rate of mutations considerably. The lack of proof-reading capabilities for RNA makes some viruses evolve rapidly, and this phenomenon causes many problems for an individual's health and may even cause a worldwide pandemic. Transplanting a single useful gene without the associated proof-reading capability might well be dangerous. The self-correcting codes that occur in nature might be better than all of our algebraic coding theory.
Similar to the codes of life, civilizations and cultures themselves may be viewed as the transmission of codes through time. Due to the decay caused by historical events, thermodynamics, and cosmic rays, we may view the channel of time as a noisy channel. In his old age, Leonardo da Vinci worried about the decay of his masterpieces. Preservation of our heritage is thus an important topic. One way might be to use self-correcting codes to prolong the useful life of our civilization and culture. Oral and written languages are important parts of our heritage. In all oral and written languages, there are many non-functional parts which serve as check symbols. It is dangerous to delete these parts.
We live in the age of technology. Messages are transmitted in sequences
of 0's and 1's through space. Noisy channels may introduce errors, and so self-correcting codes become vital to eradicate all errors (as long as the number of errors is small). Self-correcting codes are widely
used in industries for a variety of applications including e-mail, telephone,
remote sensing (e.g., photographs of Mars), CD, etc. We will present some
essentials of the theory in this book.
Using linear algebra, we have the salient Hamming codes. The next level
of coding theory is through the usage of ring theory, especially polynomials,
rational functions, and power series to produce BCH codes, Reed–Solomon
codes, and classical Goppa codes. The more advanced level of coding theory
is an application of algebraic geometry to geometric Goppa codes. The aim
of this book is to gradually bring interested readers to the most advanced
level of coding theory.
PART I
Linear Codes
In this chapter, we lay the foundation for coding theory using linear algebra,
although certainly there are many other ways. For instance, we could simply
send multiple copies of the same message and determine the correct one bit
by bit by a majority vote, which is called a repetition code. Alternatively, we could use real curve theory to construct a "self-correcting code" as follows. Given data $a_0, a_1$, let us consider the line defined by the linear equation
$$y = f(x) = a_0 x + a_1.$$
We transmit the values $y_0 = f(0)$, $y_1 = f(1)$ (and in general $y_i = f(i)$) instead of $\{a_0, a_1\}$, making the observation that $\{a_0, a_1\}$ and $\{y_0, y_1\}$ determine each other. There is no way to tell if the transmitted $\{y_0, y_1\}$ contain an error. To detect errors, assume that there is at most one error; we may transmit a group of three data, $(y_0, y_1, y_2)$. If $(0, y_0)$, $(1, y_1)$, and $(2, y_2)$ are on a line $L_3 \subset \mathbb{A}^2_{\mathbb{R}}$, then there is no mistake, since we assume that there is at most one error. If they are not on a line, then there is a mistake. However, we cannot decide which one is the error. To correct one possible error, we should add two more symbols $y_2 = f(2)$, $y_3 = f(3)$ instead of just one more $y_2$ and transmit $\{y_0, y_1, y_2, y_3\}$; because we assume that there is at most one error, there must be at least three correct values. For any three correct values $(f(i), f(j), f(k))$ among the four values, the corresponding points $(i, f(i))$, $(j, f(j))$, $(k, f(k))$ will lie on a line $L_4 \subset \mathbb{A}^2_{\mathbb{R}}$, and hence the remaining point $(\ell, f(\ell))$ is determined, i.e., since $\ell$ is known, we just need to determine $f(\ell)$. That is, a brute-force search for
the correct triple will reveal which three values are consistent (for a line)
and hence determine the extra one.
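The brute-force test just described can be sketched as follows, here over the rationals (the function names are ours):

```python
from itertools import combinations
from fractions import Fraction

def decode_line(ys):
    """Given values y_i ~ f(i) for a line f with at most one error,
    find the line consistent with at least three of the four points."""
    pts = [(Fraction(i), Fraction(y)) for i, y in enumerate(ys)]
    for (x0, y0), (x1, y1), (x2, y2) in combinations(pts, 3):
        # Collinearity test: cross-multiplied slopes must agree.
        if (y1 - y0) * (x2 - x0) == (y2 - y0) * (x1 - x0):
            a0 = (y1 - y0) / (x1 - x0)   # slope
            a1 = y0 - a0 * x0            # intercept
            return a0, a1
    raise ValueError("more than one error")

# f(x) = 2x + 3 transmitted as (f(0), f(1), f(2), f(3)) with one error.
assert decode_line([3, 5, 9, 9]) == (2, 3)
```

Any consistent triple contains at least two correct points, which already determine the true line, so the search cannot return a wrong answer under the one-error assumption.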
Let us consider the problem of correcting two errors. Let us add one more value f(4). Now, we have f(0), f(1), f(2), f(3), and f(4). Can we
correct two errors? Say, (0, f (0)), (1, f (1)), (2, f (2)) lie on a line. However,
(2, f (2)), (3, f (3)), (4, f (4)) may lie on a different line. Then, we cannot
tell which line is the correct one. We cannot correct the mistakes. We may
modify the method in the previous paragraph to correct two errors. We shall
add two more points {f (4), f (5)} and transmit {y0 , y1 , . . . , y4 , y5 } because
we assume that there are at most two errors; therefore, there are at least
four correct values. Furthermore, any four correct values will determine the
line and hence the remaining two values. That is, a brute-force search for
the correct four tuple will reveal which four values are consistent (for a
line). Thus, two errors can be corrected this way.
It is easy to generalize the above method to correct any number of
errors.
Instead of lines, we may use curves of higher degree. We may consider all quadratic curves. A quadratic curve is defined by the equation
$$y = f(x) = a_0 x^2 + a_1 x + a_2,$$
representing the original data $\{a_0, a_1, a_2\}$. We transmit the values $y_0 = f(0)$, $y_1 = f(1)$, and $y_2 = f(2)$, instead of $\{a_0, a_1, a_2\}$, making the observation that $\{a_0, a_1, a_2\}$ and $\{y_0, y_1, y_2\}$ determine each other. There is no way to tell if the transmitted $\{y_0, y_1, y_2\}$ contain an error. To correct one possible error, we should add two more symbols $y_3 = f(3)$, $y_4 = f(4)$ and transmit $\{y_0, y_1, y_2, y_3, y_4\}$. Suppose that on the other end, the receiver receives $\{y_0, y_1, y_2, y_3, y_4\}$ with one possible error. The receiver then determines which four-tuple is consistent (for a quadratic curve) and then uses it to correct the fifth value. In general, if we consider curves of higher degree and transmit sufficiently more points than necessary, say $2s$ more points, then we may correct $s$ errors. However, a brute-force search for the correct tuple will be time-consuming. We may modify this curve code slightly to produce the Reed–Solomon code (see Section 3.3) with a fast decoding process.
Both repetition codes and real curve codes are time-consuming for
decoding. Instead, we focus on linear codes, which are more efficient and
have fast decoding methods. We follow the historical development of the
theory of self-correcting codes, primarily using techniques from linear
algebra.
1.2. Preliminaries
Definition 1.2. Let $K$ be a set with two operations $(+, \cdot)$, where $+, \cdot$ are binary operations between elements in $K$ such that all $a, b, c \in K$ satisfy the following conditions:
The only thing we have to establish in Definition 1.3 is that with $n$ the smallest positive integer such that $n \cdot 1 = 0$, if $n$ exists, then $n$ must be a prime number. Otherwise, let $n = \ell \cdot m$ with $0 < \ell, m < n$; it follows that
$$n \cdot 1 = \ell \cdot m \cdot 1 = (\ell \cdot 1)(m \cdot 1) = 0.$$
Since we have a field, either $\ell \cdot 1 = 0$ or $m \cdot 1 = 0$. Therefore, $n$ is not the smallest. A contradiction, i.e., $n$ is prime.
We need some basic knowledge of field theory. The reader is referred
to Commutative Algebra, Vol. I, p. 60, by Zariski and Samuel [16], for field
theory and the following corollary.
Sometimes, we omit the mention of (V, +, ·) if they are obvious and say V
is a vector space.
A set of vectors $\{v_i\}_{i\in I}$ is called a set of generating vectors if for any vector $v \in V$, we always have an expression $v = \sum_{\text{finite}} a_i v_i$. A set of vectors $\{v_i\}_{i\in I}$ is called a set of linearly independent vectors if for any expression $0 = \sum_{\text{finite}} a_i v_i$, we must have $a_i = 0\ \forall i$. A set of vectors $\{v_i\}_{i\in I}$ is called a basis of $V$ if it is a set of generating and linearly independent vectors. A standard theorem of vector spaces is that for a given vector space $V$, all bases have the same cardinality, which is called the dimension of the vector space $V$. A vector space is said to be finite-dimensional if it has a finite basis. In coding theory, we only use finite-dimensional vector spaces.
If we have two fields L ⊃ K, then clearly L is a vector space over K.
1.2.4. Matrices
$$y_1 = c_{1,1}x_1 + \cdots + c_{1,k}x_k,\quad \ldots,\quad y_{n-k} = c_{n-k,1}x_1 + \cdots + c_{n-k,k}x_k. \tag{2}$$
$$y_1 - c_{1,1}x_1 - \cdots - c_{1,k}x_k = 0,\quad \ldots,\quad y_{n-k} - c_{n-k,1}x_1 - \cdots - c_{n-k,k}x_k = 0. \tag{2$'$}$$
1.4. Distances
The only natural metric on $\mathbb{F}_q$ is the discrete one, i.e., we have $d(a,b) = 1$ if $a \neq b$ and $d(a,b) = 0$ if $a = b$. We generalize this distance function to an $n$-dimensional vector space $\mathbb{F}_q^n$ $(= U)$ by using the following definitions.
(1) $d(a,a) = 0$,
(2) $d(a,b) = d(b,a)$,
(3) $d(a,b) + d(b,c) \geq d(a,c)$.
Proof. It is evident.
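In code, the Hamming distance and weight read as follows (a trivial sketch, our own helpers):

```python
def hamming_distance(a, b):
    """Number of coordinates in which two words of equal length differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

def hamming_weight(a):
    """Weight w(a) = d(a, 0): number of non-zero coordinates."""
    return sum(x != 0 for x in a)

assert hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]) == 2
assert hamming_weight([0, 1, 0, 1]) == 2
```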
$$y_1 = c_{1,1}x_1 + \cdots + c_{1,k}x_k,\quad \ldots,\quad y_{n-k} = c_{n-k,1}x_1 + \cdots + c_{n-k,k}x_k. \tag{2$'$}$$
v × H̄ = (v − e) × H̄ = 0.
Proof. It is evident.
Let us continue our discussion of taking a subspace $C$ as the code space; an ad hoc decoding method is the following maximum likelihood decoding method. Maximum likelihood decoding is to find an error vector $e$ in the coset $S$ of $v$ such that
$$w(e) = d(e, 0) = \min\{w(u) : u \in S\}.$$
It means that maximum likelihood decoding selects the correction with the least number of changes. The previous proposition explains the meaning of the term maximum likelihood decoding. We further define the following.
(syndrome space, coset)

of a vector $v$ only forms a set rather than a point as in the real case. For instance, we have the following example.
Example 3: Let us consider a [6, 3] code C over the prime field F2 = Z/2Z
with the following check matrix H:
$$H = \begin{pmatrix} 1&1&0\\ 0&1&1\\ 1&0&1\\ 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}.$$
Furthermore, we let H̄ be the following matrix:
$$\bar{H} = \begin{pmatrix} 0&0&0&1&1&0\\ 0&0&0&0&1&1\\ 0&0&0&1&0&1\\ 0&0&0&1&0&0\\ 0&0&0&0&1&0\\ 0&0&0&0&0&1 \end{pmatrix}.$$
Let $S$ be the coset with syndrome $[000111]$, i.e., $s \times \bar{H} = [000111]$ for any $s \in S$; then, it is easy to see that no element $s$ with $w(s) = 1$ is in $S$, while there are three elements $[100001]$, $[010100]$, and $[001010]$ in $S$ with $w(s) = 2$. They are all coset leaders, and the coset leader is not unique.
If there are several coset leaders for a coset, we will identify any one of
them as the coset leader. The maximum likelihood decoding procedure is
as follows: For any received word $v'$, we find the coset $S$ where it lies (usually by finding the syndrome of $v'$, which determines the coset $S$); then, we find a coset leader $e$, which is the most likely error vector in $S$. Finally, we correct the error by taking $v = v' - e$. In engineering, we may keep a table of $\{(\text{syndrome},\ \text{coset leader})\}$ for the sake of convenience. Since the syndrome space is of dimension $n - k$, it consists of $q^{n-k}$ elements. If $n - k$ is small, we may precompute the table and decode accordingly.
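A sketch of this table-based procedure for the [6, 3] code of Example 3 (the encoding of words as 0/1 tuples is ours):

```python
from itertools import product

# Check matrix H of the [6,3] code from Example 3 (Hbar = [0 | H]).
H = [[1,1,0],[0,1,1],[1,0,1],[1,0,0],[0,1,0],[0,0,1]]

def syndrome(word):
    return tuple(sum(w * h for w, h in zip(word, col)) % 2
                 for col in zip(*H))

# Build {syndrome: coset leader}: scanning words by increasing weight
# keeps a minimum-weight representative for each of the 2^(n-k) cosets.
table = {}
for word in sorted(product([0, 1], repeat=6), key=sum):
    table.setdefault(syndrome(word), word)

def decode(received):
    """Maximum likelihood decoding: subtract (XOR) the coset leader."""
    leader = table[syndrome(received)]
    return tuple(r ^ e for r, e in zip(received, leader))

# The coset with syndrome (1,1,1) has weight-2 leaders, e.g. [1,0,0,0,0,1].
assert sum(table[(1, 1, 1)]) == 2
```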
Note that this particular decoding procedure may not be effective for
all codes, and it may decode to a wrong word, so we shall look for other
possible decoding procedures. The advantage of this procedure is that it
does exist for any code. Therefore, we may study any code with the decoding
procedure of maximum likelihood decoding, which may not be effective. In
Exercises
(1) Prove that Z/pZ is a field, while Z/pm Z is not a field for m ≥ 2.
(2) Let us consider the repetition code $[a_1 \cdots a_n] \mapsto [a_1 a_1 \cdots a_1 \cdots a_n a_n \cdots a_n]$, where each digit repeats itself $m$ times. Find the generator matrix, check matrix, and the minimal distance of this code.
(3) Prove that for a given binary [n, k] code with at least one word of odd
weight, all code words of even weight form an [n, k − 1] code.
(4) Let us consider the example of Exercise (2). If we want to correct two errors, how long should the code be? If we want to correct $\ell$ errors, how long should the code be?
The method is effective in both locating and correcting the error since
error correction is simply switching 0 and 1. The shortcoming is that the
Hamming codes are unable to correct several errors (in particular, a burst
of errors).
The Hamming code is the grandfather of all self-correcting codes; it calls for a more theoretical study of Hamming codes. We have the following proposition.
Definition 1.24. A code that satisfies the Singleton bound with equality,
i.e., d = n − k + 1, is called an MDS code (maximum distance separable
code).
Proposition 1.25. Let $C$ be an $[n, k, d]$ code. Then, for any received word $a' = [a_1' \cdots a_n']$ for a code word $a = [a_1 \cdots a_n]$ with less than or equal to $\lfloor (d-1)/2 \rfloor$ errors, there is at most one element $a \in C$ such that $d(a, a') \leq \lfloor (d-1)/2 \rfloor$. Therefore, if the number of errors is restricted by $\lfloor (d-1)/2 \rfloor$, then the decoded word is unique if it exists.

Proof. It follows from the statement that there is a code word $a$ such that $d(a, a') \leq \lfloor (d-1)/2 \rfloor$. If there are two elements $a = [a_1 \cdots a_n], b = [b_1 \cdots b_n] \in C$ which satisfy the criteria of the proposition, then $d(a, b) \leq d(a, a') + d(b, a') < d$, which is a contradiction.

Furthermore, the number of errors allowed is the maximum of $d(a', a)$, where $a$ is the code word and $a'$ is the received word. The last statement is obvious.
$d(r, M) > t$ and find quickly the correct $c$ if $d(r, M) \leq t$. What we can do now is not up to expectation: if there are less than or equal to $t$ errors, the decoder will find the correct $c$ for us; if $d(r, M) > t$, then the decoder will either find a $c' \in M$ such that $d(c', r) \leq t$ (with a small probability) or return an error message if such a $c'$ cannot be found (with a large probability).
Remark: Let us consider the example at the beginning of Section 1.1. Let us consider the linear case. If we only transmit $L_3 = \{(0, y_0), (1, y_1), (2, y_2)\}$, then two distinct lines can share at most one common point, so two code words can agree in at most one of the three positions; therefore, $d = 2$ and $\lfloor (d-1)/2 \rfloor = 0$. Thus, it cannot correct any error. If we consider $L_4 = \{(0, y_0), (1, y_1), (2, y_2), (3, y_3)\}$, then again two distinct lines can share at most one common point; therefore, $d = 3$ and $\lfloor (d-1)/2 \rfloor = 1$. Thus, it can correct one error.
Note that according to the above proposition, Hamming code may
correct 1 = (3−1)/2 error. This we already know. Later, we will construct
[n, k, d] codes for large d and decode more than one error.
Another important property of Hamming $[2^n - 1, 2^n - n - 1, 3]$ codes $C$ is that any word $a = [a_1 \cdots a_{2^n-1}]$ is within a distance of 1 of the code space, because $a \times H = [c_1, \ldots, c_n] = [0 \cdots 0 1 0 \cdots 0] \times H$; therefore, $(a + e) \times H = [0 \cdots 0]$, where $e = [0 \cdots 0 1 0 \cdots 0]$ and $a + e \in C$. We define the following.
Definition 1.26. Let C be an [n, k, d] code. If all words in the word space
Fnq are within a distance of (d − 1)/2 of C, then C will be called a perfect
code.
Definition 1.26 means that any word can be corrected, with at most $t = \lfloor (d-1)/2 \rfloor$ errors, to a unique code word.
Let us compute the probability that a decoder, which can correct $t$ errors on an $[n, k]$ code, fails. For instance, for Hamming codes $t = 1$; if there are $t + 1$ or more errors, then the Hamming code decoder will not decode correctly. Let the channel have a probability $p$ of being incorrect and $q$ of being correct. Then, $p + q = 1$ and
$$1 = 1^n = (p+q)^n = \sum_{i=0}^{n} C_n^i\, p^i q^{n-i}.$$
The probability $r$ of failing to decode or decoding improperly is
$$r = \sum_{i=t+1}^{n} C_n^i\, p^i q^{n-i} = 1 - \sum_{i=0}^{t} C_n^i\, p^i q^{n-i}.$$
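A sketch of this computation (assuming the binomial formula above; for the [7, 4] Hamming code, t = 1):

```python
from math import comb

def failure_probability(n, t, p):
    """Probability that more than t of n symbols are corrupted,
    i.e. r = 1 - sum_{i <= t} C(n,i) p^i (1-p)^(n-i)."""
    q = 1.0 - p
    return 1.0 - sum(comb(n, i) * p**i * q**(n - i) for i in range(t + 1))

# [7,4] Hamming code on a channel with bit-error probability 1%.
print(f"{failure_probability(7, 1, 0.01):.2e}")   # about 2.0e-03
```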
Exercises
(1) Show that a Hamming code is a perfect code (note that in the definition
of Hamming code, we assume that n ≥ 3).
(2) Let C be a binary perfect code of length n with minimum distance 7.
Show that n = 7 or n = 23.
(3) Let p be a prime number and q = pm . A q-ary Hamming code of length
(q n − 1)/(q − 1) is defined by a check matrix H with the following
properties: (1) the zero vector is not any row vector, (2) any two row
vectors are linearly independent, and (3) any non-zero vector is linearly
dependent on one of the row vectors. Show that it is a perfect code.
(4) Set up a computer program to decode a Hamming code.
(5) Show that if there are more than two errors in a received word r for a
Hamming code, then r will be decoded to a wrong code word.
Shannon’s theorem is a guiding light of coding theory. For all the different
codes, we need common standards of measurements to compare them. We
sometimes define the common standards for linear code first and then
generalize them for arbitrary codes. Let us consider the efficiency of codes
first. We have the following definition.
Certainly, we have 0 < k/n ≤ 1, and we want the number k/n as large
as possible. It is obvious that if k/n = 1, then the code cannot correct any
error. Let us consider all codes (linear or otherwise). A code is defined as a subset $M = \{a_1, \ldots, a_m\}$, with $m$ elements, of the word space $\mathbb{F}_q^n = U$. We may use maximum likelihood decoding to decode any received word $a'$ to $a$, with $a \in M$ and $d(a, a')$ minimal (cf. Remark of Proposition 1.25).
We shall generalize the above definition to the general case. Let us use the linear codes as guidance. For a linear code space of dimension $k$, the number of elements is $q^k$, and $k/n = \log_q(q^k)/n$. Therefore, we naturally generalize the above definition to the following.
22 Introduction to Algebraic Coding Theory
Definition 1.35. Let us use the notations of the preceding paragraph. Let
$$A(n, s) = \max\{m \mid \text{an } (n, m, d) \text{ code exists with } d \geq s\}.$$
Definition 1.36. Let the limit sup of the rate of information be
$$\alpha(\delta) = \limsup_{n\to\infty} n^{-1}\log_q A(n, \delta n).$$
Definition 1.37. For $0 \leq x \leq \frac{q-1}{q}$, we define the entropy function $H_q(x)$ as
$$H_q(0) = 0,\qquad H_q(x) = x\log_q(q-1) - x\log_q x - (1-x)\log_q(1-x),\quad x \neq 0.$$
Proof. Let r = λn. We separate the proof into two cases: Case 1. q = 2;
Case 2. q ≥ 3.
Case 1. We suppose that $q = 2$. Then, we have $\lambda \leq \frac{1}{2}$, and
$$H_2(0) = 0,\qquad H_2(x) = -x\log_2 x - (1-x)\log_2(1-x),\quad x \neq 0,$$
and
$$U_2(n, r) = \sum_{i=0}^{r} C_n^i.$$
Using Stirling's formula, we deduce that
$$n^{-1}\log_2 \sum_{i=0}^{r} C_n^i \geq n^{-1}\log_2 C_n^r = n^{-1}\log_2 \frac{n!}{r!(n-r)!}.$$
On the other hand,
$$\lambda^{\lambda n}(1-\lambda)^{(1-\lambda)n}\sum_{0\leq i\leq\lambda n} C_n^i \leq \sum_{0\leq i\leq\lambda n} C_n^i\,\lambda^i (1-\lambda)^{n-i} \leq (\lambda + (1-\lambda))^n = 1,$$
or
$$\sum_{0\leq i\leq\lambda n} C_n^i \leq 2^{nH_2(\lambda)}.$$
Proof. We have
$$\alpha(\delta) = \limsup_{n\to\infty} n^{-1}\log_q A(n, \delta n) \geq \lim_{n\to\infty}\left(1 - n^{-1}\log_q U_q(n, \delta n)\right) = 1 - H_q(\delta).$$
We may compare Proposition 1.40 with Shannon's theorem. Consider the binary case, i.e., $q = 2$; then an $(n, m, n\delta)$ code with minimal Hamming distance $n\delta$ may correct $\frac{n\delta - 1}{2}$ errors (cf. Proposition 1.23). In this case, we observe that the rate of correcting errors is $\frac{n\delta - 1}{2n} = \frac{\delta}{2} - \frac{1}{2n}$ for large values of $n$. To compensate for a noisy channel having a rate of errors $\wp$, we should choose a code with $\delta = 2\wp$. Then, we have the following equation (note that $\log_2(2-1) = 0$ and recall that $\alpha(\delta) = \alpha(2\wp)$ is the limit sup of the rate of information):
$$\alpha(2\wp) \geq 1 + 2\wp\log_2 2\wp + (1 - 2\wp)\log_2(1 - 2\wp).$$
Gilbert–Varshamov’s bound
For comparison, in Shannon's theorem, we have the rate of information $R(M)$:
$$R(M) \geq 1 + \wp\log_2\wp + (1-\wp)\log_2(1-\wp) - \epsilon.$$
Shannon's theorem
Recall that $1 + x\log_2 x + (1-x)\log_2(1-x)$ is the capacity function (Definition 1.31), which is a monotonically decreasing function on the interval from 0 to 1/2 (Proposition 1.32); we may deduce that the right-hand side (dropping the small $\epsilon$) of the second inequality, from Shannon's theorem, is always bigger than the corresponding number in the first inequality as long as $2\wp < \frac{1}{2}$, which is valid for the symmetric channels of interest (see Exercise (5)). Therefore, the rate of information $R(M)$ from Shannon's theorem satisfies a stronger inequality than the limit sup of the rate of information $\alpha(2\wp)$ from Gilbert–Varshamov's bound. Hence, Shannon's theorem is stronger in this case. However, Shannon's theorem is an existence theorem, while Gilbert–Varshamov's bound is constructive. Therefore, each has its own advantage. For more than 20 years, the Gilbert–Varshamov bound, which serves as a standard to measure all codes, was met by the classical Goppa codes (cf. Proposition 3.22), and it was surpassed only by the geometric Goppa codes in the 1990s (see Theorem 5.11, Part IV).
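For a numerical comparison of the two bounds (a sketch; it uses the identity $1 + x\log_2 x + (1-x)\log_2(1-x) = 1 - H_2(x)$):

```python
from math import log2

def H2(x):
    """Binary entropy function H_2(x)."""
    return 0.0 if x in (0, 1) else -x * log2(x) - (1 - x) * log2(1 - x)

def gv_rate(p):
    """Gilbert-Varshamov guarantee: alpha(2p) >= 1 - H_2(2p)."""
    return 1 - H2(2 * p)

def shannon_rate(p):
    """Shannon capacity 1 - H_2(p) of the binary symmetric channel."""
    return 1 - H2(p)

for p in (0.01, 0.05, 0.10):
    print(f"p = {p}: GV {gv_rate(p):.3f}  <  Shannon {shannon_rate(p):.3f}")
```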
Exercises
Ring Codes
Chapter 2
Rings
2.1. Preliminaries
Recall that letters in coding theory are picked from a finite field. We wish to discuss some basic structure of a finite field $F$. One of the basic properties we assume is the following theorem from Theory of Groups, Vol. 1, p. 147, by Kurosh [12], where abelian additive groups are considered; note that those theorems will be applied multiplicatively in this book.
dx = 0.
Proof. (1) It is easily provable. (2) Let $c, d$ be selected such that $ac + bd = 1$. Let $x$ be a generator of $Z_a$ and $y$ be a generator of $Z_b$. We claim that $z = (dx, cy)$ is a generator of $Z_a \oplus Z_b$.
We have the following computations:
$$az = (adx, acy) = (0, acy) = (0, (1 - bd)y) = (0, y + (-bd)y) = (0, y).$$
dx = 0,
xd = 1.
Then, the above polynomial will have more than $d$ distinct elements as solutions, which is impossible for a field. Now, it follows from (1) of Proposition 2.3 that $Z_{c_i} \oplus Z_{c_j} = Z_{c_i c_j}$. After this recombination, the factorization has only $n - 1$ factors, and our proposition follows.
One interesting criterion for a polynomial f(x) ∈ F[x] to have a multiple
root in an algebraic closure Ω of F is the derivative test. We have to define
the derivative algebraically.
Definition 2.5. Let $f(x) = \sum_i a_i x^i$. Then, the derivative $f'(x)$ is defined to be $f'(x) = \sum_i i a_i x^{i-1}$.
It is easy to see that the derivative obeys the following formal rules.
(1) $a' = 0$ for $a \in F$,
(2) $(f(x) + g(x))' = f'(x) + g'(x)$,
(3) $(f(x)g(x))' = f'(x)g(x) + f(x)g'(x)$.
Proof. The first statement follows trivially from the preceding proposition. For the second statement, let $K_1$ be the collection of all solutions of the equation $f(x) = x^{p^n} - x = 0$ in $\Omega$. Since $\Omega$ is an algebraic closure of $\mathbb{Z}_p$, the equation $f(x) = 0$ splits completely. Since the derivative $f'(x) = -1$, the equation has no multiple root. Therefore, $K_1$ consists of $p^n$ elements. Moreover, let $y, z \in K_1$; then we have
$$(y+z)^{p^n} = y^{p^n} + z^{p^n} = y + z, \quad\text{and}\quad (yz)^{p^n} = y^{p^n} z^{p^n} = yz.$$
Therefore, $y + z, yz \in K_1$. It is easy to check that all requirements of a field are satisfied, establishing that $K_1$ is a field of $p^n$ elements and thus equal to $\mathbb{F}_q$. For the last part of the proposition, let $m = ns$; then we have $p^m - 1 = p^{sn} - 1 = (p^n - 1)r$. Therefore, $x^{p^m-1} - 1 = x^{(p^n-1)r} - 1 = (x^{p^n-1} - 1)g(x)$. Thus, $L \supset \mathbb{F}_q$.
It is not hard to see the following proposition.
Proof. It is easy to see that the Frobenius map $\rho$ is one-to-one. The pigeonhole principle implies $\rho$ is onto. Therefore, for any given $\alpha \in K$, there is a $\beta \in K$ such that
$$\beta^p = \alpha.$$
The coding and decoding processes depend heavily on computer programs. This section is written for those readers who are not familiar with computer programs, especially students of pure mathematics.
Let us assume that $p = 2$ in this section. The reader is requested to discuss the cases of $p > 2$. Let the set $U_m$ of all $m$-bit strings be a vector space of dimension $m$ over $\mathbb{F}_2$. We want to provide a more algebraic structure to $U_m$. In fact, a finite field $\mathbb{F}_{2^m}$ over $\mathbb{F}_2$ can be used to represent $U_m$. This is the right way to generalize a vector space in coding theory. The addition is simple, while the multiplication is very messy in practice. We shall manage the multiplication in three ways as follows.
(1) The table of logarithms: We shall only consider examples. The reader may generalize the following construction process to the general setup for any prime field $\mathbb{Z}/p\mathbb{Z}$ with $p > 2$ and any finite field $\mathbb{F}_{p^m}$.
Let us consider $\mathbb{F}_{2^4}$ over $\mathbb{F}_2$. Let us write $\mathbb{F}_{2^4} = \mathbb{F}_2[\alpha]$, where $\alpha$ satisfies the equation $x^4 + x + 1 = 0$. We have the following list of powers of $\alpha$:
$$\{1 = \alpha^0,\ \alpha,\ \alpha^2,\ \alpha^3,\ \alpha+1 = \alpha^4,\ \alpha^2+\alpha = \alpha^5,\ \alpha^3+\alpha^2 = \alpha^6,\ \alpha^3+\alpha+1 = \alpha^7,\ \alpha^2+1 = \alpha^8,\ \alpha^3+\alpha = \alpha^9,\ \alpha^2+\alpha+1 = \alpha^{10},\ \alpha^3+\alpha^2+\alpha = \alpha^{11},\ \alpha^3+\alpha^2+\alpha+1 = \alpha^{12},\ \alpha^3+\alpha^2+1 = \alpha^{13},\ \alpha^3+1 = \alpha^{14},\ 1 = \alpha^{15}\}.$$
It is clear that 1, α, α2 , α3 are linearly independent over F2 . Let
us write them as [1000], [0100], [0010], [0001], then the above list of the
powers of α may be re-written as [1000], [0100], [0010], [0001], [1100], [0110],
[0011], [1101], [1010], [0101], [1110], [0111], [1111], [1011], [1001], [1000]. One
way to treat the problem of multiplication is to look up the list and find $i, j$ for any two elements $[a_1 a_2 a_3 a_4]$, $[b_1 b_2 b_3 b_4]$ such that
$$[a_1 a_2 a_3 a_4] = \alpha^i, \qquad [b_1 b_2 b_3 b_4] = \alpha^j,$$
and we define
$$\log_\alpha([a_1 a_2 a_3 a_4]) = i, \qquad \log_\alpha([b_1 b_2 b_3 b_4]) = j.$$
Then, we have
$$\log_\alpha([a_1 a_2 a_3 a_4][b_1 b_2 b_3 b_4]) = i + j.$$
Let us find the residue value $k$ of $i + j$ modulo $15 = 2^4 - 1$, i.e., $k = i + j \bmod 15$ and $0 \leq k < 15$. Now, we may look up the above list of
× 2 3 4 5 6 7
2 4 6 3 1 7 5
3 6 5 7 4 1 2
4 3 7 6 2 5 1
5 1 4 2 7 3 6
6 7 1 5 3 2 4
7 5 2 1 6 4 3
For any other finite field, we use the above method to construct a multiplication matrix $(a_i a_j)$ for all pairs of non-zero elements $(a_i, a_j)$ in the field. Once we have the table, we only have to look up the table once to find the result of multiplying the pair $(a_i, a_j)$.
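A sketch that reproduces this $\mathbb{F}_8$ table by encoding each element as a 3-bit integer (bit $i$ = coefficient of $x^i$, an encoding we choose here) and reducing by $x^3 + x + 1$:

```python
def gf_mul(a, b, modulus=0b1011, degree=3):
    """Multiply two GF(2^degree) elements written as bit-polynomials,
    reducing by the defining polynomial (default x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add (XOR) the current shifted copy of a
        b >>= 1
        a <<= 1
        if a >> degree:          # degree overflow: subtract the modulus
            a ^= modulus
    return result

# Reproduce entries of the table, e.g. 5 x 3 = 4 and 2 x 2 = 4.
assert gf_mul(5, 3) == 4
assert gf_mul(2, 2) == 4
```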
Note that the table is symmetric, and the multiplication depends on the polynomial $f(x) = x^3 + x + 1$; for instance, $5 \times 3 = 4$, which comes from the following equation:
$$(x^2+1)(x+1) = x^3 + x^2 + x + 1 \equiv x^2 \pmod{x^3 + x + 1}.$$
Let the defining equation be
$$f(x) = x^4 + x + 1 = 0.$$
Note that we have $1 = -1$ in $\mathbb{F}_2$. Let $\mathbb{F}_{2^4} = \mathbb{F}_2[\alpha]$, where $\alpha$ satisfies the above equation. Let us represent $[a_1 a_2 a_3 a_4]$ as $a_1 + a_2\alpha + a_3\alpha^2 + a_4\alpha^3$ and $[b_1 b_2 b_3 b_4]$ as $b_1 + b_2\alpha + b_3\alpha^2 + b_4\alpha^3$. Then certainly, $[a_1 a_2 a_3 a_4][b_1 b_2 b_3 b_4]$ can be represented as $(a_1 + a_2\alpha + a_3\alpha^2 + a_4\alpha^3)(b_1 + b_2\alpha + b_3\alpha^2 + b_4\alpha^3) = (a_1 b_1) + (a_1 b_2 + a_2 b_1)\alpha + \cdots + (a_4 b_4)\alpha^6 = c_1 + c_2\alpha + \cdots + c_7\alpha^6$. Now, we want to re-write the last expression as a polynomial in $\alpha$ of degree at most 3 by going mod the defining equation. In general, we may reduce any polynomial $\sum_i c_{i+1}\alpha^i$ to a polynomial of degree at most 3 using a linear feedback shift register, which simulates the process of modding out the defining equation as follows:
(Figure: LFSR implementing $x^4 = x + 1$.)
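In software, the same reduction may be sketched by repeatedly replacing the leading term using $x^4 = x + 1$, which is exactly what the shift register does step by step (the bit encoding is ours):

```python
def lfsr_reduce(poly, modulus=0b10011, degree=4):
    """Reduce a bit-polynomial modulo x^4 + x + 1 by cancelling the
    leading term with a shifted copy of the modulus, high degree first."""
    for d in range(poly.bit_length() - 1, degree - 1, -1):
        if poly >> d & 1:
            poly ^= modulus << (d - degree)
    return poly

# alpha^4 -> alpha + 1, i.e. x^4 reduces to x + 1 (0b10000 -> 0b00011).
assert lfsr_reduce(0b10000) == 0b00011
# alpha^6 = alpha^3 + alpha^2 (cf. the table of powers above).
assert lfsr_reduce(0b1000000) == 0b1100
```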
Exercises
$$(\alpha + \beta)^p = \alpha^p + \beta^p.$$
(8) Find an element $\alpha$ which generates the multiplicative group of $\mathbb{F}_{p^2}$, and find its defining equation.
(9) Let $\delta \in \mathbb{F}_{2^4}$ be a non-zero element. Find $\delta^{-1}$.
In the classical Goppa code (see Section 3.4), we use total quotient rings. Given a ring $R$ without non-zero zero divisors, which is called an integral domain, we may consider the quotient field of $R$, defined as the set of fractions $r/s$ with $s \neq 0$, where $r/s \equiv r'/s'$ iff $rs' = r's$. In the classical case of the integral domain $\mathbb{Z}$, the ring of integers, its quotient field is the field of rational numbers $\mathbb{Q}$. A possible generalization of the quotient field to the case of rings which are not integral domains is the total quotient ring.
Example 2: Let R be the ring of real numbers. Then, its total quotient
ring is itself.
Example 3: Let $R$ be the residue class ring $K[x]/(g(x))$. Then, an element $f(x) \in R$ is a regular element $\Leftrightarrow$ $g(x)$ and $f(x)$ are co-prime. In that case, there are elements $h(x)$ and $r(x)$ such that
$$h(x)g(x) + f(x)r(x) = 1.$$
Therefore,
$$\frac{1}{f(x)} = r(x).$$
It is easy to see that the total quotient ring of $K[x]/(g(x))$ is itself. We apply the total quotient ring to the classical Goppa code. In particular, if $f(x) = \prod_i (x - \gamma_i)$ with $g(\gamma_i) \neq 0$, where the $\gamma_i$ are all distinct, then any element of the form $\sum_i \frac{c_i}{x - \gamma_i} \in K[x]/(g(x))$ can be written as $n(x)/f(x)$ with $\deg(n(x)) < \deg(f(x))$.
As we pointed out, the rings used in coding theory are usually $K$-rings (see Chapter 3), and it is easy to see that $R$ can be expressed as $K[\{R\}]$. The simplest and most useful polynomial ring in coding theory is $K[x]$, where $x$ is a variable. Let us use $R$ to denote the polynomial ring $F[x]$ for any field $F$. We have the following definition.
Definition 2.16. Let us consider $R$. Let a polynomial $f(x) = \sum_i c_i x^i$ be given. The degree function $\deg(f(x))$ of any polynomial $f(x)$ is defined as follows:
$$\deg(f(x)) = \begin{cases} \max\{i : c_i \neq 0\} & \text{if } f(x) \neq 0, \\ -\infty & \text{if } f(x) = 0. \end{cases}$$
Using the fact that the field F is an integral domain, we have the
following basic properties of deg(f (x )):
The above proposition can be used to find the GCD of two polynomials $f(x), g(x)$. The following process, known as the "long algorithm", is fundamentally important and is of interest for decoding purposes later in the book. We pay attention to the degrees of the polynomials involved, which turn out to be important in decoding programs. Note that the Euclidean algorithm is fast to compute. The main point is that, in the process of decoding, it will be modified to a Euclidean algorithm with a stopping strategy depending on the degrees involved (cf. Proposition 3.12).
such that after we name $n_j = \deg(f_j(x))$, $m_j = \deg(\beta_j(x))$, we have (1) $n_{j+1} < n_j$ and (2) $m_{j-1} = n_{j-1} - n_j$. Furthermore, after repeated back-substitution (see Proof), we have the following equation for $j \geq 3$:
$$\alpha_j(x) f_1(x) + \gamma_j(x) f_2(x) = f_j(x).$$

Proof. We may apply the Euclidean algorithm to the pair $f_2(x), f_3(x)$ and so on, until we reach the case that $f_s(x) = 0$. The first part (1), (2) of the proposition is routine. Suppose that we have the second part for $j = 4, \ldots, \ell$ with $\ell < s$. Then, we have the following three equations:
$$f_{\ell-1}(x) = \beta_{\ell-1}(x) f_\ell(x) + f_{\ell+1}(x),$$
$$\alpha_{\ell-1}(x) f_1(x) + \gamma_{\ell-1}(x) f_2(x) = f_{\ell-1}(x),$$
$$\alpha_\ell(x) f_1(x) + \gamma_\ell(x) f_2(x) = f_\ell(x).$$
Substituting the last two equations into the first one and collecting coefficients, we get the following equations:
$$\alpha_{\ell+1}(x) = \alpha_{\ell-1}(x) - \alpha_\ell(x)\beta_{\ell-1}(x),$$
$$\gamma_{\ell+1}(x) = \gamma_{\ell-1}(x) - \gamma_\ell(x)\beta_{\ell-1}(x).$$
Note that
$$\deg(\alpha_{\ell-1}(x)) \leq n_2 - n_{\ell-2} \leq n_2 - n_\ell,$$
For instance, if we restrict the denominator to be less than 1000, then $\frac{355}{113}$ is the best approximation to $\pi$.
When we apply the long algorithm to coding problems, we have to know where to truncate the process. Sugiyama, Kasahara, Hirasawa, and Namekawa [35] noticed where to stop the above long algorithm for a modern application to decoding purposes (see Proposition 3.12).
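A sketch of the long algorithm over $\mathbb{F}_2[x]$, with polynomials encoded as bitmasks (the encoding and function names are ours); each step yields a row $\alpha_j f_1 + \gamma_j f_2 = f_j$ with strictly decreasing $\deg f_j$:

```python
def poly_divmod(a, b):
    """Divide bit-polynomials over F2: return (quotient, remainder)."""
    q = 0
    while a and a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        q ^= 1 << shift
        a ^= b << shift
    return q, a

def gf2_mul(a, b):
    """Carry-less product of bit-polynomials over F2."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    return r

def extended_euclid(f1, f2):
    """Yield rows (alpha_j, gamma_j, f_j) with alpha_j*f1 + gamma_j*f2 = f_j."""
    a0, g0, r0 = 1, 0, f1
    a1, g1, r1 = 0, 1, f2
    while r1:
        q, r2 = poly_divmod(r0, r1)
        a0, a1 = a1, a0 ^ gf2_mul(q, a1)
        g0, g1 = g1, g0 ^ gf2_mul(q, g1)
        r0, r1 = r1, r2
        yield a1, g1, r1

# gcd(x^3 + 1, x^2 + 1) over F2 is x + 1 (bitmask 0b11).
rows = list(extended_euclid(0b1001, 0b101))
assert rows[0][2] == 0b11        # 1*f1 + x*f2 = x + 1
```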
From the above proposition, we can deduce that $R = F[x]$ is a principal ideal ring, i.e., each ideal is generated by one element.
2.5.1. LFSR
Let us consider the problem of combining the above proposition with a computer. As is well known, the Euclidean algorithm can be implemented effectively using an LFSR (linear feedback shift register) in any computer. Let us consider
$$f_1(x) = \beta_1(x) f_2(x) + f_3(x)$$
as in the statement of the Euclidean algorithm. Let
$$f_1(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_{n_1-1}x^{n_1-1} + c_{n_1}x^{n_1},$$
$$f_2(x) = a_0 + a_1 x + \cdots + a_{n_2-1}x^{n_2-1} + x^{n_2}.$$
In general, polynomial divisions can be performed using LFSR. Let us
consider a simple case over F2 . We use the following LFSR to perform
the division:
(Figure: LFSR performing the division by $f_2(x)$.)
2.5.2. Ideals
In the ring theoretical coding theory, the word space will be R/I, where
R = F[x], and I an ideal of R. The code space will be J/I, where J ⊃ I an
ideal. We have the following.
Proposition 2.22. Let $\{\beta_i\}_1^n, \{\alpha_i\}_1^n$ be elements in $F$ such that all $\beta_i$'s are distinct. Then, there is a unique polynomial $f(x)$ of degree at most $n - 1$ such that $f(\beta_i) = \alpha_i\ \forall i$. Moreover, $f(x)$ is of the following form:
$$f(x) = \sum_i \alpha_i \frac{\prod_{j\neq i}(x - \beta_j)}{\prod_{j\neq i}(\beta_i - \beta_j)}.$$
Proof. It is easy to see that the f (x) defined above satisfies the require-
ments of the proposition. Suppose g(x) is another. Then, we have f (x)−g(x)
with n roots βi and of degree at most n − 1. Therefore, f (x) − g(x) = 0 or
f (x) = g(x).
Corollary 2.23. Let $\{\beta_i\}$ be a set of $n$ distinct numbers. Let $P_n$ be the set of all polynomials of degree $\leq n-1$. Then $P_n$ is generated by $\{\prod_{j\neq i}(x - \beta_j)\}_i$ as a vector space over $F$.
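A sketch of this interpolation formula over a prime field $\mathbb{F}_p$ (our own helper; it uses modular inverses for the denominators):

```python
def lagrange_interpolate(points, p):
    """Coefficients (constant term first) of the unique polynomial of
    degree < n through n given points (beta_i, alpha_i) over F_p."""
    n = len(points)
    coeffs = [0] * n
    for i, (b_i, a_i) in enumerate(points):
        basis = [1]                       # running product of (x - b_j)
        denom = 1                         # running product of (b_i - b_j)
        for j, (b_j, _) in enumerate(points):
            if j == i:
                continue
            basis = [0] + basis           # multiply by x, then subtract b_j
            for k in range(len(basis) - 1):
                basis[k] = (basis[k] - b_j * basis[k + 1]) % p
            denom = denom * (b_i - b_j) % p
        scale = a_i * pow(denom, -1, p)   # alpha_i / denominator mod p
        for k in range(len(basis)):
            coeffs[k] = (coeffs[k] + scale * basis[k]) % p
    return coeffs

# Recover f(x) = x^2 + 1 over F_7 from three of its values.
assert lagrange_interpolate([(0, 1), (1, 2), (2, 5)], 7) == [1, 0, 1]
```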
We shall use the proof of the Chinese remainder theorem. Since the polynomials $\prod_{j\neq i}(x - \beta_j)^{m_j}$ are co-prime, they generate the unit ideal, i.e., we have
$$\sum_i h_i(x) \prod_{j\neq i}(x - \beta_j)^{m_j} = 1$$
for some suitable $h_i(x)$. We may not take $h_i$ to be $g_i$ because the degree restriction on $g_i$ may not be satisfied. Let $r = \max_i\{\deg(h_i(x)) - m_i\}$. If $r < 0$, then we just let $g_i(x) = h_i(x)$. Note that the degree restriction is satisfied. If $r \geq 0$, let $s$ be the number of $h_i(x)$ such that $\deg(h_i(x)) - m_i = r$. We make a double induction on $r, s$, i.e., we reduce the number $s$ to 0, then $r$ will automatically drop. When $r$ drops below zero, we have found the $g_i(x)$'s.
Note that we always have $s > 1$. Otherwise, $s = 1$, and there is a unique term of the highest degree, which cannot be canceled by any other term, and the above equation cannot be satisfied. Therefore, $s \geq 2$. Let us pick any two terms of the highest degree, say those corresponding to $h_i(x)$ and $h_j(x)$. We have $r = \deg h_i(x) - m_i = \deg h_j(x) - m_j$. For any $c$, $h_i(x)$ and $h_j(x)$ can be replaced by $h_i(x) + cx^r(x - \beta_i)^{m_i}$ and $h_j(x) - cx^r(x - \beta_j)^{m_j}$, respectively, and the above equation is still satisfied. We may select $c$ such that $\deg(h_i(x) + cx^r(x - \beta_i)^{m_i})$ is smaller. Thus, at least one term drops out from the collection of the highest terms. We reduce the number $s$ at least by 1. When the number $s$ drops to zero, then $r$ must drop. So, we find the $g_i(x)$'s by double induction.
Now, we have
$$\sum_i g_i(x)\prod_{j\neq i}(x-\beta_j)^{m_j} = 1 \quad\text{and}\quad \alpha_k\left(\sum_i g_i(x)\prod_{j\neq i}(x-\beta_j)^{m_j}\right) = \alpha_k.$$
Therefore,
$$\left(\sum_i \alpha_i g_i(x)\prod_{j\neq i}(x-\beta_j)^{m_j}\right) - \alpha_k = \left(\sum_i \alpha_i g_i(x)\prod_{j\neq i}(x-\beta_j)^{m_j}\right) - \alpha_k\left(\sum_i g_i(x)\prod_{j\neq i}(x-\beta_j)^{m_j}\right)$$
$$= \left(\sum_{i\neq k} \alpha_i g_i(x)\prod_{j\neq i}(x-\beta_j)^{m_j}\right) - \alpha_k\left(\sum_{i\neq k} g_i(x)\prod_{j\neq i}(x-\beta_j)^{m_j}\right) = (x-\beta_k)^{m_k} f_k(x).$$
It means that we may let $f(x) = \sum_i \alpha_i g_i(x)\prod_{j\neq i}(x-\beta_j)^{m_j}$; then we have $f(x) - \alpha_k = (x-\beta_k)^{m_k} f_k(x)\ \forall k$. Furthermore, if there is another polynomial $f^*(x)$ with the same properties as $f(x)$, namely $f^*(x) - \alpha_k = (x-\beta_k)^{m_k} f_k^*(x)$ for all $k$, then $f(x) - f^*(x)$ will be divisible by $(x-\beta_i)^{m_i}$ for all $i$, i.e., $f(x) - f^*(x)$ will be divisible by $\prod_i (x-\beta_i)^{m_i}$, which has degree $n$, higher than $n-1$. We conclude that $f(x) - f^*(x) = 0$.
Before we study the abstract theory of rings further, let us study an example
to illustrate the usage of ring theory to express the Hamming code and
introduce the readers to the next level of coding theory. Recall the [7, 4, 3]
$$G = \begin{pmatrix} 1&0&0&0&1&1&0\\ 0&1&0&0&0&1&1\\ 0&0&1&0&1&1&1\\ 0&0&0&1&1&0&1 \end{pmatrix}$$
Note that G is the same generator matrix used in the introduction.
We can stay within the ring of polynomials $\mathbb{F}_2[x]$ by starting with the message $[m_1 \ldots m_4]$, forming the polynomial $m(x) = \sum_{1}^{4} m_i x^{i-1}$ with the code word $a(x)$ as
$$a(x) = (1 + x + x^3)m(x) = \sum_{1}^{7} a_i x^{i-1}$$
and sending the encoded message as the coefficients $[a_1 \ldots a_7]$. The receiver will check the received $[a_1' \ldots a_7']$ and compute $a'(\beta) = \sum_{1}^{7} a_i' \beta^{i-1}$.
Assume that there is at most one error in the received word. If the result of the preceding calculation is $a'(\beta) = 0$, then there is no error. Otherwise, $a'(\beta) = \beta^j \neq 0$, and there is an error at the $(j+1)$th spot of $[a_1' \cdots a_7']$. After correcting the error and recovering the original polynomial $a(x)$, one has the polynomial $m(x) = a(x)/(1 + x + x^3)$ and the original message $[m_1 \cdots m_4]$. We call the polynomial $g(x) = 1 + x + x^3$ the generator polynomial and the polynomial $h(x) = (1 + x^7)/g(x) = 1 + x + x^2 + x^4$ the check polynomial. Note that $a(x)$ is a code word iff $a(\beta) = 0$ iff $a(x) = c(x)g(x)$ iff $(1 + x^7) \mid a(x)h(x)$.
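A sketch of this polynomial encoder with bitmask polynomials over $\mathbb{F}_2$ (the encoding is ours); it also checks $g(x)h(x) = 1 + x^7$:

```python
def gf2_mul(a, b):
    """Carry-less product of bit-polynomials over F2."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    return r

g = 0b1011           # g(x) = 1 + x + x^3   (bit i = coefficient of x^i)
h = 0b10111          # h(x) = 1 + x + x^2 + x^4

assert gf2_mul(g, h) == 0b10000001       # g(x) h(x) = 1 + x^7

def encode(m):
    """Code word a(x) = g(x) m(x) for a 4-bit message polynomial m."""
    return gf2_mul(g, m)

# Every code word is a multiple of g(x), e.g. for m(x) = 1 + x^2.
print(bin(encode(0b101)))
```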
The decoder works perfectly if there is no error or if there is one error. However, if there are more than one error, then (1) if $r$ happens to be a code word, the decoder will treat it as the original code word, or (2) if $r$ is not a code word, the decoder will replace it by a wrong code word.
The above example illustrates that Hamming codes can be discussed
purely in terms of polynomial rings. It broadens our horizons. This is
the next level of development in coding theory. Before we continue our
discussion of Fq [x], we must have some understanding of the finite field Fq .
Exercises
(1) Prove the Chinese remainder theorem for K[x], and use it to prove the
Lagrange interpolation theorem.
(2) Find the total quotient ring of Z/4Z.
2.6. Separability
= f (n).
Remark: The converse is also true.
Recall that Ir is the number of monic irreducible polynomials of degree
r in Fqm [x]. It leads to the following proposition.
$$> q^{mr} - q^{mr/2+1} = q^{rm}(1 - q^{-rm/2+1}) > 0,$$
and our proposition is proved.
The above proposition shows the existence of the finite field $\mathbb{F}_{p^m}$ for any $m$ in a fixed algebraic closure $\Omega$. Each finite field $\mathbb{F}_{p^m}$ exists uniquely. We have the following diagram, where lines indicate the inclusion relation.
Figure 2.1. (Lattice of finite fields under inclusion: at the bottom lies $\mathbb{F}_p$; above it $\mathbb{F}_{p^2}, \mathbb{F}_{p^3}, \mathbb{F}_{p^5}, \mathbb{F}_{p^7}, \mathbb{F}_{p^{11}}, \mathbb{F}_{p^{13}}, \mathbb{F}_{p^{17}}, \ldots$; above those $\mathbb{F}_{p^4}, \mathbb{F}_{p^6}, \mathbb{F}_{p^{10}}, \mathbb{F}_{p^{15}}, \mathbb{F}_{p^{35}}, \mathbb{F}_{p^{77}}, \ldots$ A line joins $\mathbb{F}_{p^m}$ to $\mathbb{F}_{p^{m'}}$ exactly when $m \mid m'$.)
We shall consider formal power series in this section. First, the decoding processes of codes on $F[x]$ depend on some properties given in this section (see Proposition 2.36). Second, the important concept of the residue on a Riemann surface (or a smooth algebraic curve), which is all-important for the geometric Goppa codes, can be computed (see Proposition 4.42) with the help of the materials in this section.
Let us consider the expression $f(x) = \sum_{i=0}^{\infty} a_i x^i$, where the $a_i$'s are coefficients in the field $F$. In the past, analysts considered the problem of evaluating the expression at $x = b$ and deduced many concepts of convergence, divergence, etc. If a power series is divergent, then we are likely to disregard it from the point of view of analysis. However, algebraically, if we do not evaluate them (other than the trivial evaluation at $x = 0$), then we may simply treat them as algebraic items, and they play their roles. Henceforth, expressions of the form
$$f(x) = \sum_{i=0}^{\infty} a_i x^i$$
will be called formal power series or power series, and expressions of the form
$$f(x) = \sum_{i=-m}^{\infty} a_i x^i$$
will be called formal Laurent series. We define
$$f(x)\cdot g(x) = \sum_i \Big(\sum_{j+k=i} a_j b_k\Big) x^i, \qquad f(x) + g(x) = \sum_i (a_i + b_i) x^i.$$
Proposition 2.31. The sets F[[x]] ⊂ F((x)) are closed under the usual
addition and multiplication.
Proof. ($\Longrightarrow$) If $f(x)$ is a unit, then there exists $g(x)$ such that $f(x)g(x) = 1$. Then, we have
$$\mathrm{ord}(f(x)) + \mathrm{ord}(g(x)) = 0.$$
Therefore, $\mathrm{ord}(f(x)) = 0$ and $f(0) \neq 0$.
($\Longleftarrow$) If $f(0) = a \neq 0$, then $a^{-1}f(0) = 1$. We may write $a^{-1}f(x)$ as $1 - g(x)$ with $\mathrm{ord}(g(x)) \geq 1$. We have
$$(a^{-1}f(x))^{-1} = 1 + \sum_{i=1}^{\infty} g(x)^i.$$
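A sketch of this inversion, truncated to the first N coefficients (the helper name is ours; it solves f·g = 1 coefficient by coefficient rather than summing the geometric series):

```python
from fractions import Fraction

def invert_series(a, N):
    """First N coefficients of 1/f for f = sum a[i] x^i with a[0] != 0,
    obtained by forcing every higher coefficient of f*g to vanish."""
    assert a[0] != 0
    b = [Fraction(1, 1) / a[0]]           # constant term of the inverse
    for n in range(1, N):
        s = sum(a[i] * b[n - i] for i in range(1, min(n, len(a) - 1) + 1))
        b.append(-s / a[0])
    return b

# 1/(1 - x) = 1 + x + x^2 + ... : the geometric series of the proof above.
assert invert_series([1, -1], 5) == [1, 1, 1, 1, 1]
```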
Proposition 2.35. Given any field $F$, the set $F[[x]]$ is an integral domain and $F((x))$ is a field.
Proof. Let us show the uniqueness. If $\frac{f^*(x)}{g^*(x)}$ is another pair of polynomials, where $f^*(x), g^*(x)$ are of degrees at most $m, n$, respectively, with $g^*(0) \neq 0$ and
$$\mathrm{ord}\left(\frac{f^*(x)}{g^*(x)} - h(x)\right) \geq m + n + 1,$$
then we have
$$\mathrm{ord}\left(\frac{f(x)}{g(x)} - \frac{f^*(x)}{g^*(x)}\right) \geq m + n + 1.$$
On the other hand, we have
$$\frac{f(x)}{g(x)} - \frac{f^*(x)}{g^*(x)} = \frac{f(x)g^*(x) - f^*(x)g(x)}{g(x)g^*(x)}.$$
Note that $g(0)g^*(0) \neq 0$. We conclude
Suppose the contrary: there are polynomials $f(x), g(x), h(x)$ satisfying the above equation. Multiplying the equation $\frac{f(x)}{g(x)} = h(x) \bmod (x^5)$ by $g(x)$ produces
$$f(x) = g(x) + g(0)x^4 \bmod (x^5).$$
Since we have the restrictions on the degrees of $f(x), g(x)$, we must have $f(x) = g(x)$, which does not satisfy all the numerical conditions.
Exercises
Ring Codes
From now on, we shall fix the ground field $\mathbb{F}_q$. A linear code is defined by a subspace $C$ of an $n$-dimensional vector space $\mathbb{F}_q^n$. Note that for the purpose of decoding, we shall want more (algebraic) relations between the vectors in the word space. Naturally, there shall be multiplicative relations between the vectors in the word space. It means that we shall consider ring structures. A simple ring is $\mathbb{F}_q[x]/(f(x))$. It is easy to see that $\mathbb{F}_q^n$ can be represented by $\mathbb{F}_q[x]/(f(x))$ for any polynomial $f(x)$ of degree $n$. In this representation, we have the multiplicative structure of a ring in addition to the additive structure of a vector space. We shall study the concepts of coding theory in the context of the ring $\mathbb{F}_q[x]/(f(x))$. To make our work easier, we shall later select a good $f(x)$ for coding purposes. We have the following definition to begin with.
Definition 3.1. Given any polynomial $h(x)$, to define the Hamming weight of $h(x) \bmod (f(x))$, we consider $\bar{h}(x) \in \mathbb{F}_q[x]/(f(x))$ written as $\bar{h}(x) = \sum_{i=0}^{n-1} c_i x^i$, where $n = \deg(f(x))$; the Hamming weight of $h(x) \bmod (f(x))$ is the Hamming weight of $[c_0, c_1, \ldots, c_{n-1}]$. Note that the preceding expression $\bar{h}(x)$ is unique in any residue class with degree less than $n$.
Proposition 3.3. Any ideal $\bar{I} \neq (0)$ in $\mathbb{F}_q[x]/(x^n - 1)$ defines a cyclic code. Conversely, any cyclic code of length $n$ can be represented by an ideal $\bar{I}$ in $\mathbb{F}_q[x]/(x^n - 1)$.

Proof. If $\bar{I} = (1)$, then the code space is the whole word space, and the code space is cyclic. Let $\bar{I} \neq (1)$ be an ideal in $\mathbb{F}_q[x]/(x^n - 1)$ and $c(x) = c_0 + c_1 x + \cdots + c_{n-1}x^{n-1} \in \bar{I}$ a code word. Then $xc(x) \in \bar{I}$ and $xc(x) = c_{n-1} + c_0 x + \cdots + c_{n-2}x^{n-1}$ since $x^n = 1$, and the code space is cyclic.

Conversely, if $C$ is cyclic of length $n$, then the set of all $c(x) = c_0 + c_1 x + \cdots + c_{n-1}x^{n-1}$ for $[c_0 c_1 \ldots c_{n-1}] \in C$ forms a set $\bar{I}$. The cyclic property of $C$ implies $xc(x) \in \bar{I}$; hence, $x^\ell c(x) \in \bar{I}$. It is easy to see that $\bar{I}$ is an ideal in $\mathbb{F}_q[x]/(x^n - 1)$, and $C$ can be represented by $\bar{I}$.
$$f'(x) = nx^{n-1},$$
Proposition 3.5. Given any set $\{\gamma_j\} \subset \Omega$ such that all $\gamma_j$ are roots of $x^n - 1$ for a fixed $n$, let $g(x)$ be the least common multiple of the monic irreducible polynomials satisfied by the $\gamma_j$. Then, $g(x)$ is the generator polynomial of a cyclic code $C$ in $\mathbb{F}_q[x]/(x^n - 1)$.
Then, we have
$$\det M = \prod_{i>j} (x_i - x_j).$$
Proof. First, we treat all $x_i$ as symbols. Subtracting the first column from the second column, \ldots, the first column from the $n$th column, we may extract $x_i - x_1$ from the $i$th column; we conclude that $\prod_{i>1}(x_i - x_1) \mid \det(M)$. By symmetry, we conclude
$$\prod_{i>j}(x_i - x_j) \mid \det(M).$$
Since both sides are polynomials of degree $n(n-1)/2$, they can only differ by a constant. Comparing the coefficients of $\prod_{i\geq 2} x_i^{i-1}$ on both sides, we conclude that they must be equal. Since the proposition is true for the $x_i$ as symbols, it must be true for the $x_i$ as elements of the field $F$.
Let us further consider the ring Fq [x]/(xn −1) with the usual assumption
that n, p are co-prime. For the purpose of coding theory, we have the
following proposition.
$$\min\{\text{Hamming wt}(a) : a \in C,\ a \neq 0\} \geq \delta.$$
Proof. Recall that the code space is the ideal generated by $g(x)$. Hence, all code polynomials $c(x)$ must be satisfied by $\gamma^{\ell}, \gamma^{\ell+1}, \ldots, \gamma^{\ell+\delta-1}$. Suppose
with $s$ terms where $s < \delta$ such that $c(\gamma^{\ell}) = \cdots = c(\gamma^{\ell+\delta-1}) = 0$. Consider the following system of equations:
$$\sum_{i_j} c_{i_j} \gamma^{\ell i_j} = 0, \quad \cdots\cdots, \quad \sum_{i_j} c_{i_j} \gamma^{(\ell+\delta-1) i_j} = 0.$$
Among the above $\delta$ linear equations in fewer than $\delta$ variables $c_{i_j}$, we pick the first $s$ so that the number of equations matches the number of variables. The coefficient matrix is the following:
$$N = \begin{pmatrix} \gamma^{\ell i_1} & \gamma^{\ell i_2} & \ldots & \gamma^{\ell i_s} \\ \gamma^{(\ell+1)i_1} & \gamma^{(\ell+1)i_2} & \ldots & \gamma^{(\ell+1)i_s} \\ \cdots & \cdots & \cdots & \cdots \\ \gamma^{(\ell+s-1)i_1} & \gamma^{(\ell+s-1)i_2} & \ldots & \gamma^{(\ell+s-1)i_s} \end{pmatrix}.$$
It suffices to show that the matrix $N$ is non-singular; then all $c_{i_j}$ must be zero, which implies $c(x)$ is the zero polynomial. Contradiction!
Let us show that the matrix $N$ is non-singular. Let us pull $\gamma^{\ell i_j}$ from the $j$th column. Then, we have the following matrix $L$:
$$L = \begin{pmatrix} 1 & 1 & \ldots & 1 \\ \gamma^{i_1} & \gamma^{i_2} & \ldots & \gamma^{i_s} \\ \cdots & \cdots & \cdots & \cdots \\ \gamma^{i_1(s-1)} & \gamma^{i_2(s-1)} & \ldots & \gamma^{i_s(s-1)} \end{pmatrix},$$
which is a Vandermonde matrix of rank $s$ with $x_j$ replaced by $\gamma^{i_j}$. Since we have $\gamma^{i_j} - \gamma^{i_k} \neq 0$ for $i_j \neq i_k \leq n - 1$, the matrices $L$ and $N$ are non-singular. Therefore, all $c_{i_j} = 0$, and $c(x)$ is a zero polynomial, contrary to our assumption that $c(x)$ is a non-zero polynomial.
The following cyclic codes were discovered by Bose and Ray-Chaudhuri
(1960) and Hocquenghem (1959) and are known as BCH codes.
The Hamming codes use the vector space structure of Fn2 . The BCH
codes use the ring structure of F2 [x]/(xn − 1). Note that F2 [x]/(xn − 1)
is isomorphic to Fn2 as a vector space, and furthermore, it has a rich ring
structure. The ring structure makes them better.
$$1 + x + x^3 + x^4 + x^5 + x^7 + x^8 = 0, \qquad 1 + x + x^2 + x^3 + x^4 = 0.$$
So, the generator polynomial $g(x)$ is the product of the above two polynomials, which is
$$(1 + x + x^3 + x^4 + x^5 + x^7 + x^8)(1 + x + x^2 + x^3 + x^4) = 1 + x^3 + x^6 + x^9 + x^{12}.$$
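One may verify this product, and that $g(x)$ divides $x^{15} + 1$ (with quotient the check polynomial $x^3 + 1$ used later), by a short computation over $\mathbb{F}_2$ (the bitmask encoding is ours):

```python
def gf2_mul(a, b):
    """Carry-less product of bit-polynomials over F2."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    return r

p1 = 0b110111011        # 1 + x + x^3 + x^4 + x^5 + x^7 + x^8
p2 = 0b11111            # 1 + x + x^2 + x^3 + x^4
g = gf2_mul(p1, p2)

assert g == 0b1001001001001                  # 1 + x^3 + x^6 + x^9 + x^12
assert gf2_mul(g, 0b1001) == (1 << 15) | 1   # g(x)(x^3 + 1) = x^15 + 1
```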
We follow the decoding process of SKHN [35]. Another, slightly faster, decoding process is the Berlekamp algorithm [4] decoder. Please see Appendix D.
Figure 3.1. (Block diagram of the decoding pipeline: received word $r$ → check polynomial → decoder → check polynomial → code word.)
assumption that the number of errors $v$ and erasures $u$ is limited by the numerical condition $2v + u < \delta$; we shall use the theory of power series and the Euclidean algorithm. The decoder will produce an error word $e$ (see Section 3.2) such that $r - e$ might be a code word. Even if the decoder produces an error word $e(x)$, we have to further test whether the assumption of a limited number of errors (the numerical condition) is truly satisfied: we test whether $e(x)$ is the correct error word by checking, using the check polynomial, whether $c = r - e$ is a code word. If it is, then we pass it on to the code-word block. If it is not, then the decoding fails, and we return an error message.
Let us concentrate on the decoder part. We assume that there are $u$ erasures and at most $v$ errors. Let $e(x) = \sum_i e_i x^i$ be the hypothetical error vector. Note that we assume that there are at most $v + u$ non-zero $e_i$'s. It follows from the remark after Proposition 1.25 that if we assume that $2u + 2v < \delta$ and $c(\gamma^i) = 0$ for $i = 1, \ldots, \delta$, so that $e(\gamma^i) = r(\gamma^i)$ for $i = 1, \ldots, \delta$, then there is a unique code word within the error range. Certainly, by brute force, checking all possibilities, we may recover $c(x)$. However, this is too slow. We use the improved numerical condition $2v + u < \delta$, which is better than $2u + 2v < \delta$, and find a clever way of solving the decoding problem. We introduce (following Peterson [30]) the concepts of the error-locator polynomial, which gives the locations of errors,
and the error-evaluator polynomial $\omega(x)$, which gives the values at the error locations.
Definition 3.10. Let $M$ be the set of all places where either there is an erasure or an error. We shall write the set $M$ as the disjoint union $M = N \cup L$, with $N$ consisting of the $u$ erasures and $L$ consisting of the at most $v$ errors. The error-locator polynomial $\sigma(x)$ is defined as
$$\sigma(x) = \prod_{i\in M}(1 - \gamma^i x) = \sigma_1(x)\sigma_2(x), \quad \sigma_1(x) = \prod_{i\in N}(1 - \gamma^i x),\ \ \sigma_2(x) = \prod_{i\in L}(1 - \gamma^i x).$$
Since the set $N$ is known, the function $\sigma_1(x)$ is a known polynomial of degree $u$. Thus, $\sigma(x)$ and $\sigma_2(x)$ determine each other. The error-evaluator polynomial $\omega(x)$ is defined as
$$\omega(x) = \sum_{i\in M} e_i \gamma^i x \prod_{j\in M,\ j\neq i}(1 - \gamma^j x).$$
v + u + v + u − u = 2v + u.
Proof. We have
$$\frac{\omega(x)}{\sigma(x)} = \sum_{i\in M} \frac{e_i\gamma^i x}{1 - \gamma^i x} = \sum_{i\in M} e_i \sum_{j=1}^{\infty} (\gamma^i x)^j = \sum_{j=1}^{\infty}\Big(\sum_{i\in M} e_i \gamma^{ij}\Big) x^j = \sum_{j=1}^{\infty} e(\gamma^j) x^j.$$
$$\geq \delta' + 1.$$
On the other hand, Proposition 2.36 shows that the rational function $\frac{\omega(x)}{\sigma_2(x)}$ is thus uniquely defined. Sometimes, one may establish the uniqueness proposition first (as we prove Proposition 2.36 first), and then tie the object to an equation (as we prove Proposition 2.36). Finally, we use the equation to show the existence. For decoding purposes, we need a fast way to recover the rational function $\frac{\omega(x)}{\sigma_2(x)}$ from Proposition 3.11 (please see the next proposition).
The above equation in the proposition, written slightly differently as follows, is named by Berlekamp the key equation (see Appendix D):
$$(1 + S(x))\sigma(x) \equiv \omega(x) \bmod x^{\delta'+1}.$$
Note that since $\sigma_1(x)$ is known, $f_1(x)$ is known and thus uniquely determined. The conclusion of the above proposition can be re-written as
$$\omega(x) = \sigma(x)\left(\sum_{i=1}^{\delta'} r(\gamma^i)x^i\right) + x^{\delta'+1}h(x) = \sigma_2(x)f_1(x) + x^{\delta'+1}h^*(x).$$
Therefore, $\omega(x) \in$ the ideal $(f_1(x), x^{\delta'+1}) \subset \mathbb{F}_q[x]$. Note that $\sigma_2(0) = 1$, i.e., $\sigma_2(x)$ is a unit in $\mathbb{F}_q[[x]]$. We have the following interesting equation:
$$\frac{\omega(x)}{\sigma_2(x)} = f_1(x) \bmod (x^{\delta'+1}). \tag{10$''$}$$
In fact, due to the uniqueness result of Proposition 2.19, the long algorithm applied to $f_1(x)$ and $f_2(x) = x^{\delta'+1}$ will provide a fast way to find the rational function $\frac{\omega(x)}{\sigma_2(x)}$. It turns out to be one of the most useful tools in decoding. We have the following proposition.
Proposition 3.12 (Euclidean algorithm with stopping strategy) (Sugiyama–Kasahara–Hirasawa–Namekawa). Let us assume that there are non-negative integers $u, v$ with $\delta' = 2v + u$ and polynomials $\omega, \sigma = \sigma_1\sigma_2$. Let $\deg(\omega(x)) \leq v + u$, $\deg(\sigma_1) = u$, and $\deg(\sigma_2) = v$, where $\sigma_1(x)$ is the known function defined in Definition 3.10. We use the notation of Proposition 2.19. Let $f_1(x)$ be defined above in equation (10$'$), let $f_2(x) = x^{\delta'+1}$, and let $f_i(x), n_i$ be defined as in Proposition 2.16. Let (the stopping time) $t$ be determined as $t = 3$ if $n_3 \leq v + u$; otherwise, $t$ is determined by $n_{t-1} > v + u \geq n_t$. For the case $t = 3$, we have
$$\alpha_3(x)f_1(x) + \gamma_3(x)f_2(x) = 1\cdot f_1(x) + 0\cdot f_2(x) = f_3(x).$$
Otherwise (for $t > 3$), in the following equation:
$$\alpha_t(x)f_1(x) + \gamma_t(x)f_2(x) = f_t(x), \tag{1}$$
we have
$$\frac{\omega(x)}{\sigma_2(x)} = \frac{f_t(x)}{\alpha_t(x)}.$$
Furthermore, we have $\sigma_2(x) = h(0)^{-1}\alpha_t(x)$, where $\alpha_t(x) = x^s h(x)$ with $h(0) \neq 0$, and $\omega(x) = h(0)^{-1} f_t(x)$.
Proof. We would like to factor out $\alpha_t(x)$ from the above equation (1) and try to get our conclusion directly. However, there are several technicalities. We want to show that the operation of factoring can be performed in the power-series ring $\mathbb{F}_q[[x]]$. Note that we may assume $2v + u = \delta'$, and according to Proposition 2.16, we have $\deg(\alpha_t(x))\ (\leq \delta' + 1 - n_{t-1}) \leq v$ ($= n$ as in Proposition 2.19) and $\deg f_t(x) = n_t \leq u + v$ ($= m$ as in Proposition 2.36) in any case. Let $\alpha_t(x) = x^s h(x)$ with $h(0) \neq 0$ and $h(x)$ a unit in $\mathbb{F}_q[[x]]$. Note that then
$$\mathrm{ord}(\alpha_t^{-1}) = -s.$$
Since $\frac{\omega(x)}{\sigma_2(x)}$ is the reduced form of the rational function, and the condition $\sigma_2(0) = 1$ makes the rational expression unique, our proposition follows easily.
From the above proposition, we easily find ω(x)/σ(x) = ω(x)/(σ1(x)σ2(x)) with σ1(x) known.
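To see the stopping strategy in action, the following is a minimal sketch in Python over an assumed prime field GF(31); the sample polynomials f1, f2 and the bound v + u below are illustrative stand-ins, not the book's data. Polynomials are lists of coefficients with the constant term first, and division with remainder plays the role of the long algorithm.

p = 31   # an assumed prime; the book's example works over F_16 instead

def trim(f):
    while f and f[-1] == 0:
        f.pop()
    return f

def deg(f):
    return len(f) - 1          # deg(0) = -1 in this convention

def sub(f, g):                 # f - g in GF(p)[x]
    n = max(len(f), len(g))
    return trim([((f[i] if i < len(f) else 0) - (g[i] if i < len(g) else 0)) % p
                 for i in range(n)])

def mul(f, g):
    if not f or not g:
        return []
    h = [0]*(len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a*b) % p
    return trim(h)

def divmod_poly(f, g):         # the long algorithm: f = q*g + r, deg r < deg g
    f, q = f[:], [0]*max(len(f) - len(g) + 1, 1)
    inv = pow(g[-1], p - 2, p)
    while deg(f) >= deg(g):
        c, k = f[-1]*inv % p, deg(f) - deg(g)
        q[k] = c
        f = sub(f, mul([0]*k + [c], g))
    return trim(q), f

def stopped_euclid(f1, f2, bound):
    # keep alpha_t with alpha_t*f1 + gamma_t*f2 = f_t; stop once deg f_t <= bound
    f_prev, f_cur = f2[:], f1[:]        # f_3 = f1, as in the proposition
    a_prev, a_cur = [], [1]
    while deg(f_cur) > bound:
        q, r = divmod_poly(f_prev, f_cur)
        f_prev, f_cur = f_cur, r
        a_prev, a_cur = a_cur, sub(a_prev, mul(q, a_cur))
    return f_cur, a_cur                  # omega and sigma_2, up to the unit h(0)

f1 = [3, 1, 4, 1, 5, 9, 2, 6]            # an assumed stand-in for f1(x)
f2 = [0]*9 + [1]                         # x^9, playing the role of x^(delta+1)
ft, at = stopped_euclid(f1, f2, bound=4)
assert divmod_poly(sub(mul(at, f1), ft), f2)[1] == []  # alpha_t*f1 = f_t mod f2
print("deg f_t =", deg(ft), " deg alpha_t =", deg(at))

The assert line checks the invariant αt(x)f1(x) ≡ ft(x) mod f2(x); by the proposition, ω(x)/σ2(x) is then ft(x)/αt(x) after normalizing by h(0).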
r(γ) = γ^2,    r(γ^2) = γ + 1,
r(γ^3) = γ,    r(γ^4) = γ^2 + 1.
We conclude that

ω(x)/σ(x) = γ^2 x/(1 + γ^2 x + (γ^2(γ^3 + γ + 1))x^2).

It means that we have

ω(x) = γ^2 x,
σ(x) = 1 + γ^2 x + (γ^2(γ^3 + γ + 1))x^2.
We find the two roots of σ(x) are γ^{−3}, γ^{−6} (the simple way is computing all σ(γ^{−i}), which takes at most 15 evaluations).
We have to find e3, e6. We calculate ω(x)/(xσ′(x)) at x = γ^{−3}, γ^{−6}, and both e3, e6 are 1. Therefore, we find that c(x) = r(x) − e(x) = x + x^2 + x^4 + x^5 + x^7 + x^8 + x^10 + x^11 + x^13 + x^14 is a possible code word. A further test by multiplying it with the check polynomial x^3 + 1 yields 0 mod x^15 + 1, and c(x) = (x + x^2)(1 + x^3 + x^6 + x^9 + x^12) = (x + x^2)g(x). We conclude that c(x) is a code word. We find the original message is x + x^2.
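The computation in this example can be replayed mechanically. Below is a small Python check over F_16 = F_2[γ], assuming the defining relation γ^4 = γ + 1 (elements are 4-bit integers, γ = 0b0010). It finds the roots of σ(x) by trying all γ^{−i} and evaluates ω(x)/(xσ′(x)); note that in characteristic 2, σ′(x) is just the coefficient γ^2 of x.

MOD = 0b10011                    # x^4 + x + 1, the assumed defining polynomial

def gmul(a, b):                  # multiplication in F_16 by shift-and-reduce
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b10000:
            a ^= MOD
        b >>= 1
    return r

def gpow(a, n):                  # gamma has multiplicative order 15
    r = 1
    for _ in range(n % 15):
        r = gmul(r, a)
    return r

g = 0b0010
s1 = gpow(g, 2)                              # coefficient of x in sigma
s2 = gmul(s1, gpow(g, 3) ^ g ^ 1)            # gamma^2*(gamma^3 + gamma + 1)

roots = [i for i in range(15)
         if (1 ^ gmul(s1, gpow(g, -i)) ^ gmul(s2, gpow(g, -2*i))) == 0]
print("sigma(gamma^-i) = 0 for i in", roots)          # expect [3, 6]

for i in roots:                              # error values omega(x)/(x*sigma'(x))
    x = gpow(g, -i)
    e = gmul(gmul(s1, x), gpow(gmul(x, s1), 14))      # b^14 = 1/b in F_16
    print("e_%d = %d" % (i, e))                       # expect 1 and 1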
Exercises
The BCH codes introduced in the preceding sections are general, with the disadvantage of being complicated and hard to use. We discuss their simpler counterparts, the Reed–Solomon codes, in this section.
The reader is referred to the beginning of Chapter 1. There, we discuss
a possible way of coding using polynomial curves over real numbers.
The process is as follows. Let [a0, a1, . . . , a_{k−1}] be the original message. It determines a polynomial curve f(x) = Σ_{i=0}^{k−1} a_i x^i of degree at most k − 1. Then, we send out [b1, b2, . . . , bn] = [f(1), f(2), . . . , f(k), . . . , f(n)]. Assuming that there are at most (n − k)/2 errors, we may use the brute force of checking all possibilities to decode the received message [b1, b2, . . . , bn] as follows. Knowing k, n, we may try all subsets of [b1, b2, . . . , bn] of k + (n − k)/2 elements to see which one determines a polynomial curve of degree at most k − 1. Once we find the curve, if it exists, we may use it to correct all errors.
The difficulty is the decoding process since a brute force way will be time-
consuming when n, k are large. Now, we shall change the coefficient field
from the real field R to a finite field K and rename it the Reed–Solomon
code (see the following definition). The important aspect of it is that there
is a fast way of decoding (as seen in the following).
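A direct implementation of the brute-force idea above is easy to write down. The sketch below replaces R by an assumed prime field GF(13) so that the arithmetic is exact; it encodes by evaluation at 1, 2, . . . , n and decodes by trying all subsets of k + (n − k)/2 positions, interpolating and testing agreement.

from itertools import combinations

p, n, k = 13, 8, 3                 # assumed toy parameters
t = (n - k) // 2                   # number of correctable errors

def mulpoly(f, g):
    h = [0]*(len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a*b) % p
    return h

def evalpoly(f, x):
    return sum(a*pow(x, d, p) for d, a in enumerate(f)) % p

def interpolate(pts):              # Lagrange interpolation over GF(p)
    coeffs = [0]*len(pts)
    for xi, yi in pts:
        num, den = [1], 1
        for xj, _ in pts:
            if xj != xi:
                num = mulpoly(num, [-xj % p, 1])
                den = den*(xi - xj) % p
        c = yi*pow(den, p - 2, p) % p
        for d, a in enumerate(num):
            coeffs[d] = (coeffs[d] + c*a) % p
    return coeffs

def brute_force_decode(received):
    xs = list(range(1, n + 1))
    for S in combinations(range(n), k + t):
        f = interpolate([(xs[i], received[i]) for i in S[:k]])  # deg <= k-1
        if all(evalpoly(f, xs[i]) == received[i] for i in S):
            return f               # k + t received values lie on one such curve
    return None                    # more than t errors

msg = [2, 7, 1]                                    # the message 2 + 7x + x^2
word = [evalpoly(msg, x) for x in range(1, n + 1)]
word[4] = (word[4] + 5) % p                        # introduce one error
print(brute_force_decode(word))                    # recovers [2, 7, 1]

The running time is dominated by the number of subsets, which is exactly why a fast decoder is needed when n, k are large.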
Proof. Note that deg(f(x)) < k < n. Let us define a map π: P_k → F_q^n as π(f(x)) = [f(γ), f(γ^2), . . . , f(γ^n)]. Then clearly, π is a linear transformation. The only thing we have to show is that π is a one-to-one map, i.e., if f(γ^i) = 0 for i = 1, 2, . . . , n, then f = 0. Note that then x^n − 1 | f(x), and deg(f(x)) ≥ n if f ≠ 0. A contradiction.
It follows from the remark after Proposition 1.25 that the Reed–Solomon [n, k] code can correct up to (and including) (n − k)/2 errors.
γ^{jn} x^n = x^n,
γ^j x − 1 = γ^j (x − γ^{n−j}),

and

c(x) = x ∏_{i≠(n−j)} (x − γ^i) = g(x)h(x).
r(γ) = γ^3,    r(γ^2) = γ,
r(γ^3) = γ^7,  r(γ^4) = γ^7,
r(γ^5) = γ,    r(γ^6) = 0,
r(γ^7) = γ^9,  r(γ^8) = 0.
We start the long algorithm with f1 = Σ_{j=1}^{8} r(γ^j)x^j = γ^3 x + γx^2 + γ^7 x^3 + γ^7 x^4 + γx^5 + γ^9 x^7 and f2 = x^9 as follows:
1 · f1 + 0 · f2 = f3,
f2 + (γ^13 + γ^6 x^2)f3 = γx + γ^14 x^2 + γ^6 x^3 + γ^13 x^4 + γ^2 x^5 + γ^13 x^6 = f4,
f3 + (1 + γ^11 x)f4 = γ^9 x + γ^2 x^2 + γx^4 + γ^6 x^5 = f5,
f4 + (γ^9 + γ^7 x)f5 = γ^9 x + γ^8 x^2 + γ^5 x^3 + γ^9 x^4 = f6.
with

σ(x) = α6(x) = (γ^9 + γ^7 x) + (γ^13 + γ^6 x^2)
      + (γ^9 + γ^7 x)(1 + γ^11 x)(γ^13 + γ^6 x^2),
ω(x) = f6(x) = γ^9 x + γ^8 x^2 + γ^5 x^3 + γ^9 x^4.
((1 + γ + γ^3) + γ^3 x + (γ + γ^2)x^2 + γ^3 x^3
+ (1 + γ^2 + γ^3)x^4 + (1 + γ^3)x^5 + (1 + γ^3)x^6)(x^15 + 1).

(1 + γ + γ^3) + γ^3 x + (γ + γ^2)x^2 + γ^3 x^3 + (1 + γ^2 + γ^3)x^4
+ (1 + γ^3)x^5 + (1 + γ^3)x^6.
g(x) = α(x)(x − γ) + r.

It means that we have g(γ) = r ≠ 0, α(x) = (g(x) − g(γ))/(x − γ), and α(x)(x − γ) ≡ −r mod g(x). Namely, (x − γ) has an inverse (−r)^{−1}α(x) mod g(x). We say that (x − γ) is regular in F_{q^m}[x]/(g(x)). It is easy to see that in the polynomial ring F_{q^m}[x], we have

(x − γ) · (−g(γ)^{−1}) (g(x) − g(γ))/(x − γ) ≡ 1 mod g(x),

1/(x − γ) = (−g(γ)^{−1}) (g(x) − g(γ))/(x − γ),

or we may write

1/(x − γ) ≡ (−g(γ)^{−1}) (g(x) − g(γ))/(x − γ) mod g(x).   (1)
We have the following definition.
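Equation (1) is easy to exercise numerically. The following sketch, over an assumed prime field GF(11) and with an assumed g(x), computes α(x) = (g(x) − g(γ))/(x − γ) by synthetic division and checks that −g(γ)^{−1}α(x) is indeed the inverse of (x − γ) modulo g(x).

p = 11
g = [3, 0, 1, 0, 0, 1]                     # g(x) = x^5 + x^2 + 3, an assumed g(x)

def geval(f, x):
    return sum(a*pow(x, d, p) for d, a in enumerate(f)) % p

def synth_div(f, c):                       # quotient of f(x) by (x - c)
    q, acc = [0]*(len(f) - 1), 0
    for d in range(len(f) - 1, 0, -1):     # Horner, from the top coefficient down
        acc = (acc*c + f[d]) % p
        q[d - 1] = acc
    return q

def mulmod(f, h):                          # f*h mod g(x), g monic
    prod = [0]*(len(f) + len(h) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(h):
            prod[i + j] = (prod[i + j] + a*b) % p
    for d in range(len(prod) - 1, len(g) - 2, -1):   # reduce degrees >= deg g
        lead = prod[d]
        for i, a in enumerate(g):
            prod[d - len(g) + 1 + i] = (prod[d - len(g) + 1 + i] - lead*a) % p
    return prod[:len(g) - 1]

c = 2                                      # gamma = 2; note g(2) = 6 != 0
alpha = synth_div(g, c)                    # (g(x) - g(c))/(x - c)
inv = [(-pow(geval(g, c), p - 2, p))*a % p for a in alpha]
print(mulmod([(-c) % p, 1], inv))          # (x - c)*inv mod g = [1, 0, 0, 0, 0]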
Proof. Let us follow the definition of classical Goppa codes. From the equation

Σ_{i=1}^{n} c_i/(x − γ_i) ≡ 0 mod (g(x)),

we have that the left-hand side can be written as n(x)/d(x) with d(x) = ∏_i (x − γ_i) and deg(n(x)) < n. Since (d(x), g(x)) = 1, c = [c_1, . . . , c_n] is a code word ⇔ g(x) | n(x), or n(x) = g(x)h(x), where h(x) is of degree < n − s. Since the classical Goppa codes C are parameterized by the coefficients of h(x), the rank is n − s. On the other hand, if c = [c_1, . . . , c_n] is a code word with at most s of the c_i's non-zero, then

Σ_{i=1}^{n} c_i/(x − γ_i) ≡ 0 mod (g(x))
can be rewritten as

Σ_{i=1}^{n} c_i/(x − γ_i) = Σ_{i∈I} c_i/(x − γ_i),
where h_i(x) is a polynomial of degree less than 2t. Then, the syndrome S_r(x) can be expressed as a polynomial h(x) which is equivalent to S_r(x) mod (g(x)) and of degree less than 2t. The important fact is that S_r(x), and hence h(x), is computable and known. Furthermore, let r = c + e, where c is a code word and e is the error word, and let M = {i : e_i ≠ 0} = the set of error locations of r, with cardinality(M) ≤ t. Then, S_c(x) = 0 and

S_r(x) = S_e(x) = Σ_{i∈M} r_i/(x − γ_i) = Σ_{i∈M} r_i h_i(x) mod (g(x)) = h(x) mod g(x).
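For a quick illustration, the syndrome can be computed term by term, expanding each 1/(x − γ_i) by formula (1). A minimal sketch with assumed data (prime field GF(11), the same assumed g(x) as in the sketch above):

p = 11
g = [3, 0, 1, 0, 0, 1]                         # assumed Goppa polynomial
gammas = [0, 1, 2, 3, 4, 5, 6]                 # assumed locators; g(gamma_i) != 0
r = [1, 0, 5, 0, 0, 2, 0]                      # an assumed received word

def geval(f, x):
    return sum(a*pow(x, d, p) for d, a in enumerate(f)) % p

def inv_x_minus(c):                            # 1/(x - c) mod g(x), by formula (1)
    q, acc = [0]*(len(g) - 1), 0
    for d in range(len(g) - 1, 0, -1):         # synthetic division of g by (x - c)
        acc = (acc*c + g[d]) % p
        q[d - 1] = acc
    s = (-pow(geval(g, c), p - 2, p)) % p
    return [s*a % p for a in q]

S = [0]*(len(g) - 1)                           # S_r(x) as a polynomial of degree < 5
for ri, ci in zip(r, gammas):
    for d, a in enumerate(inv_x_minus(ci)):
        S[d] = (S[d] + ri*a) % p
print("S_r(x) =", S)                           # zero iff r passes the syndrome test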
Exercises
Algebraic Geometry
Chapter 4
Algebraic Geometry
In the last chapter, we discussed the Reed–Solomon codes, which are used
to evaluate a polynomial of degree at most k − 1 at n(≥ k) points in the
ring of polynomials Fq [x], where Fq is a field with q elements. We see
that it is equivalent to evaluate L(D) on P^1_{F_q} (see Example 1 of Section 5.1) with D = nP∞ (see Section 4.3). Similarly, the classical Goppa codes can be considered as codes over P^1_{F_q}. We may extend the concept of Reed–
Solomon codes and classical Goppa codes to codes over any projective
smooth curve (instead of lines only). The Riemann–Roch theorem induces a
richer algebraic structure, and the corresponding codes will be more useful.
Now we advance to the next level of coding theory, in which we focus on geometric Goppa codes. Our attention on the ring Fq[x] will be refocused on algebraic functions of one variable, i.e., function rings of curves. We need
some knowledge of algebraic geometry, a rich and beautiful subject. It has
been there for two thousand years. Great minds of the past have rebuilt
it again and again. Some guiding lights from the past will be treated as
simple corollaries of theorems, and the foundations of algebraic geometry
might come last.
We will not try to give a comprehensive description of algebraic geom-
etry in this book. Instead, we emphasize the useful curve theory, especially
Riemann–Roch theorem and Weil’s theorem on the zeta function, which
gives us a count on the number of rational points on a smooth curve over
a finite field Fq , and simply discuss it to the extent that the reader will be
able to bear the burden with us. Sometimes for the readers’ understanding
of the subject, we have to give the general picture of algebraic geometry
which may not be relevant to the study of coding theory. Most proofs will be
The set of all points in affine space A^n(K) plus the points at infinity is called n-dimensional projective space P^n(K). It can be shown that projective space P^n(K) is complete. Rigorously, we define an equivalence relation ≈ on the (n + 1)-dimensional affine space A^{n+1}(K)\{0} = {(a0, a1, . . . , an) : not all ai = 0} as follows:
Let us consider the real field R case. The projective line P^1(R) can be represented by all lines in the plane passing through the origin. Let the line and the horizontal line span an angle θ. As long as θ ≠ 90°, the line will have an intersection (1, tan θ) with the vertical line {(1, t)}; therefore, all lines passing through (0, 0) can be represented as the vertical line x = 1 union one extra point P^0(K), the point at ∞, corresponding to θ = 90°. Therefore, A^1(R) is the real line, and P^1(R) is a circle. If the ground field is the complex field C, then A^1(C) is the complex line, which is a real plane, and P^1(C) is the one-point compactification of the real plane, i.e., a sphere.
In general, for any field K, we have P^n(K) = A^n(K) ∪ P^{n−1}(K).
In the case the ground field is the real field R, we have P 2 (R) = A2 (R) ∪
P 1 (R), i.e., an affine plane with a circle attached at infinity. We have the
following picture for P 2 (R).
92 Introduction to Algebraic Coding Theory
The above does not enlighten any non-expert. Topologically, we may view P^2(R) as a unit closed disc (the open disc is homeomorphic to A^2(R)) with the antipodal boundary points identified (the boundary, after this identification, is homeomorphic to a circle; note that two antipodes determine a line). The interior of the disc is identified with A^2(R) by v → v/(1 − |v|), where v is a vector inside the unit disc and |v| is the length of v. The real projective plane is a non-orientable surface.
From the point of view of algebraic geometry, the affine spaces are not complete (in the sense of valuation) over any field K, while the projective spaces are.
We define a linear subspace L in P^n_K as the image of U\{0}, where U is a positive-dimensional subspace of A^{n+1}_K treated as a vector space. We define dim(L) = dim(U) − 1.
We have the following well-known theorem about vector spaces:
Corollary 4.3. Let L1, L2 be two linear spaces of P^2_K with dim(L1) = dim(L2) = 1. Then, the intersection of L1, L2 is not empty.
Although we only need curve theory for coding theory, we shall enjoy the beautiful theory of algebraic geometry. Let us speak further about A^n_K. There are two kinds of objects involved: (1) the algebraic objects of the affine space A^n_K, including the polynomial ring K[x1, . . . , xn] (over the projective space P^n_K, there is the homogeneous polynomial ring K[x0, x1, . . . , xn]_h = {all homogeneous polynomials}) and the systems of equations there; (2) the geometric objects of the affine space A^n_K, which are the solution sets X(I) of ideals I of equations. Similarly, we have projective geometric objects, i.e., the solution sets X(Ī) of homogeneous ideals of equations. The relation between these two kinds of objects may be complicated in general. Primarily, we take the point of view of algebra. We shall study affine varieties first.
We have the following well-known Hilbert basis theorem.
Note that X(I) may be empty even in the simplest case that I = (f), where f(x1, . . . , xn) = a non-zero constant. Look at the following examples.
Exercises
(1) We define JacRad(I) = ∩_{m∈X(I)} m. Any ring with the property that JacRad(I) = Rad(I) for all ideals I will be called a Hilbert ring (or a Jacobson ring). Show that the power series ring K[[x]] is not a Hilbert ring.
(2) Let f = x2 y 3 ∈ K[x, y]. What is Rad (f )?
(3) Show that Rad((x2 + 1)) = Rad((1)) in the ring R[x], where R is the
field of real numbers.
(4) Find a finite set of generators for the ideal generated by {xi − y i+1 :
i = 10, 11, . . .} in the polynomial ring K[x, y] of two variables.
(5) Is the ideal (x^2 + 1, y^2 + 1) ⊂ R[x, y] prime? radical? Here R is the field of real numbers.
(2) All meromorphic functions with finitely many poles. It can be shown
that this is the set of all rational functions C(x) in one variable.
(3) All meromorphic functions with infinitely many poles, i.e., we include all meromorphic functions with essential singularities. Picard's great theorem tells us that, except possibly for one value, such a function takes every value infinitely many times. This set is not well studied.
Apparently, the first set is too narrow; the field of constants is common to many projective curves and will not tell the underlying geometric sets apart. It may not be useful for our study. The third set is too big and not well studied. Riemann selected the second set and proved an important theorem which tells the underlying geometric sets apart by the concept of genus (see Proposition 4.28). Since then, the field of rational functions has become an indispensable tool of algebraic geometry.
In the present book of algebraic coding theory, we need the results of
Riemann–Roch theorem over algebraic functions of one variable over a finite
field. This was started in 1882 in the work of Dedekind4 and Weber.5 Even the concept of divisors was theirs. The form of the Riemann–Roch theorem we need is due to Weil.6
If we consider all rational functions of any algebraic curve C over a field
K, then we have an infinite-dimensional vector space. We want to classify
them. We define two algebraic curves to be birationally isomorphic iff their
rational function fields are isomorphic over K.
ℓ(D) ≥ d(D) + 1 − g,
Example 4: Let us consider the ring Z of all integers. Let I = (0). Then
clearly, all maximal ideals are of the form (p), where p is a prime number.
We have X((0)) = {(p) : p prime}. And X((p)) = (p), X((12)) = {(2), (3)}.
Moreover, we have Rad((12)) = (2) ∩ (3) = (6).
Example 5: Let X be the affine line A^1_R over the real field R. Then, (x^2 + 1) is a maximal ideal of R[x]. It is a point. However, x^2 + 1 = 0 has no real solution, and it can be shown that R[x]/(x^2 + 1) ≅ C ≠ R, where C is the field of complex numbers. So, the point (x^2 + 1) is not a rational point.
Exercises
xy + x3 + y 3 = 0.
Suppose that (f0(x), f1(x), . . . , fn(x)) = (1) and that there exists b ∈ L, a finite extension of K, such that

fn(b) = 0,
fi(b) = 0, for i < n,
(x − b)^2 | f0(x).
rank[bij] = N − n,

where

bij = (∂fi/∂xj)(a1, . . . , aN).
Proposition 4.15. For our case of a perfect ground field, a point is regular
iff it is smooth.
xn + y n − 1 = 0.
It follows from the above proposition that it is smooth for this affine part.
In general, we may discuss even arithmetic cases. See the following
example.
Example 9: We may consider the plane curve C defined by f(x, y) = y^2 − x(x − 1)(x − a) over a field K of characteristic not 2, where a ≠ 0, 1. It is easy to see that (f, fx, fy) = (1). Hence, the affine curve C is non-singular. Let us homogenize the equation by introducing a variable z as g(x, y, z) = y^2 z − x(x − z)(x − az). Clearly, g(x, y, 1) = f(x, y). It is clear that the affine curves defined by g(1, y, z) and g(x, 1, z) are smooth. We say the projective curve defined by g(x, y, z) is smooth.
Let R be the rational function field of an algebraic variety over a ground
field K. The algebraic closure of K in the quotient field of R is called the
field of constants. For simplicity, we assume that K = the field of constants,
i.e., every element in R outside K is transcendental over K.
field to K(x, y). It means that we find a plane model for the curve. Although any curve has a plane curve as a model (i.e., the two curves share the same rational function field, or the two curves are birationally isomorphic), a projective plane curve will have at most (q^3 − 1)/(q − 1) = q^2 + q + 1 rational points (P^2_K has only that many points). A non-plane curve may have many more smooth rational points (cf. Proposition 4.10). If the number of smooth points matters in our discussion (see the next chapter on geometric Goppa codes), then we may not be able to restrict to smooth curves in the plane, while it is known that any curve can be represented as a smooth space curve.
Let C be a curve; recall that the curve is said to be irreducible if the ideal J(C) is prime. Let us consider the following example.
Example 10: Let the field K = R, the field of real numbers. Let J(C) = (x^2 + 1) in R[x, y]. Then, J(C) is irreducible. However, if we extend the field R to the complex field C, then the generating polynomial (x^2 + 1) splits into the product (x + i)(x − i). So, the curve splits into two lines.
(1) F = x^2 + y^2 + x^3,
(2) F = x^2 + y^3.
Exercises
Example 13: Let n be a positive integer such that p ∤ n. Let us consider the following projective Fermat curves over F_{p^m} with equation

x^n + y^n = 1.
Proof. This theorem follows from Walker’s book, Algebraic Curves [15].
[Valuations and Places] We take the point of view of algebra in the study
of curve theory. We follow the book [10]. First, instead of geometric object,
we are given a function field F of one variable over a perfect field K, i.e.,
F contains an element x which is transcendental over K and F is algebraic
of finite degree over K(x). Furthermore, we assume that K is algebraically
closed in F.
Zariski10 (cf. Vol. II, p. 110 [16]) defined the Riemann surface of F
as the collection of all K-valuations of F (see following definition) since
all valuations of curves are non-singular local rings (see following for a
definition). This shows that the Riemann surfaces are non-singular models
of curves. Over the complex field C, the Riemann surface in the sense of Zariski equals the Riemann surface of analysis as a set; however, they are equipped with different topologies.
Let us recall that the K-valuations are defined as the subrings O with the following properties:

(1) O ⊃ K,
(2) O ≠ F, and
(3) if x ∈ F\O, then x^{−1} ∈ O.
Example 15: Let us consider the real field R and the projective line P^1_R over R. Let us consider the function f = x^2 + 1. It vanishes at the maximal ideal P = (x^2 + 1) with residue field degree μ_P(f) = 2, and it has a double pole at ∞, P∞. Therefore, its divisor is D = (f) = P − 2P∞ with degree d((f)) = 1 · 2 − 2 · 1 = 0.
We have the following proposition (cf. Chevalley [10], p. 18, Theorem 5).
Proposition 4.26. Let us assume that the field K is perfect. Then, every non-constant function f of a projective curve has some poles.
11 Chevalley [10] uses multiplicative format for the divisor group rather than the additive
group format for the divisor group in this book. Note that his L(D) is our L(−D).
Definition 4.27. Let (f) = Σ_i n_i P_i. The collection of all zeroes and poles will be called the support of f, supp(f) = {P_i : n_i ≠ 0}.
Example 16: Let us consider PR1 , where R is the real field. Let P = (x2 +1),
P∞ be the point at ∞ and f = x2 + 1. Then, (f )0 = P and (f )∞ = 2P∞ ,
while μ(P ) = 2. Therefore, d((f )0 ) = 2 = d((f )∞ ).
where n > 2 and p ∤ n (in case p = 0, the second condition p ∤ n is void). We claim that F_n is not birationally equivalent to P^1_K, i.e., the rational function field F(F_n) ≠ K(t). Suppose it is; we may dehomogenize the defining equation by setting x0 = 1, and let x1 = g(t)/f(t), x2 = h(t)/f(t) with (f(t), g(t), h(t)) = (1). We wish to deduce a contradiction if f, g, h are not all in the field K. Suppose that there are such triples f, g, h; we take a triple f, g, h with max(deg(f), deg(g), deg(h)) the smallest possible. Note that if two of f, g, h are constants, then the third one must be constant, which is impossible. The above defining equation becomes
g(t)^n + h(t)^n = ∏_{i=0}^{n−1} (g(t) + ω^i h(t)) = f(t)^n,

where ω is an nth root of unity. Since each prime factor p(t) of two of the (g(t) + ω^i h(t))'s must be a prime factor of g, h and hence a prime factor of f, all the (g(t) + ω^i h(t)) are co-prime. We may rewrite the above equation as

g(t) + ω^i h(t) = c_i α_i^n(t)

with deg(α_i) ≤ deg(f(t)). Now, we select three of them with one of the α_i
non-constant. Take the three corresponding equations and eliminate g, h;
we get a new equation of the following form:
e_i α_i^n(t) + e_j α_j^n(t) + e_k α_k^n(t) = 0.
Absorbing the coefficients ei , ej , ek to the polynomials, we rewrite the
above as
β_i^n(t) + β_j^n(t) = β_k^n(t).
Note that max (deg(f ), deg(g), deg(h)) > max (deg(αi ), deg(αj ), deg(αk )).
A contradiction.
Exercises
Example 18: Let us consider the simplest curve, a straight line in the projective plane P^2_K, defined by x2 = 0. Let us consider the pole set {x0 = 0, x1 = 1, x2 = 0} = {P}. It is not hard to see that the set of all functions with at most one pole at P is {(a0 x0 + a1 x1)/x0}, which is a vector space of dimension 2 over K. In general, let D = nP. It is not hard to see that the set of all functions with at most D = nP as poles is {(a0 x0^n + a1 x0^{n−1} x1 + · · · + an x1^n)/x0^n}, which is a vector space of dimension n + 1 = ℓ(D) over K. Let t = x1/x0. Then, (a0 x0^n + a1 x0^{n−1} x1 + · · · + an x1^n)/x0^n can be re-written as a0 + a1 t + · · · + an t^n, a polynomial in t of degree ≤ n. Then, we verify Riemann's inequality (cf. Proposition 4.28) ℓ(D) ≥ d(D) + 1 − g, with g = 0.
our discussion, let us fix the affine piece A^2_K = {x0 = 1} as the points at finite distance. Let us consider the pole set {x0 = 0 = x1, x2 = 1} = {P}. Let us consider the set of all functions L(nP) with at most n poles at P, where n is a non-negative integer, and no pole elsewhere. We claim that L(nP) is a vector space of dimension n over K.
We shall dehomogenize the equation by setting x = x1/x0, y = x2/x0. Then, a general function on the curve can be written as

f(x, y) = (g0(x) + g1(x)y)/h(x).
We may assume that (h(x), g0(x), g1(x)) = (1); otherwise, we may reduce the form. We make the further assumption that h(x) splits completely (otherwise, go to a finite extension of K; if the reader feels comfortable assuming that K is algebraically closed, then assume it). Suppose that f(x, y) has no pole at finite distance (meaning in A^2_K). We claim that h(x) is a non-zero constant.
If not, we show that f has a pole at finite distance, and thus f ∉ L(nP∞). Consider any non-constant factor x − β of h(x). We have the following two cases: (1) β ≠ 0, 1, a; (2) β = 0, 1, or a. In case (1), the intersection of x − β with the curve C consists of two distinct points P1 and P2 (corresponding to two distinct non-zero values of y on C with x = β). The values g0(β) and g1(β) are not both zero, since (h(x), g0(x), g1(x)) = (1) and h(β) = 0. In any situation, the numerator of f(x, y) cannot be 0 at both non-zero values of y. Therefore, f must have a pole at finite distance, and it is not allowed. In case (2), the intersection of x − β with the curve C is at the point P = (β, 0) twice. If g0(β) ≠ 0, then, since y = 0 at P, the numerator does not vanish at P = (β, 0), so the function will have a double pole at P = (β, 0). It is impossible. Therefore, g0(β) = 0. We shall study the completion of the local ring O_P at the point P. Let us discuss the situation that β = 0 (other situations are similar). The defining equation is
Proposition 4.28. We always have ℓ(E) − d(E) ≤ ℓ(D) − d(D) for any two divisors E and D with E ≥ D.
Example 21: Let us consider the Fermat curve x1^3 + x2^3 + x0^3 over F_{2^m}. Then, by previous discussions, we know it is smooth and regular. By Plücker's formula, its genus g is 1. It follows from Riemann's theorem that ℓ(D) ≥ d(D) always. Let us verify Riemann's theorem for some special divisors. Let P∞ = (0, 1, 1). Let us consider D = nP∞ for some non-negative integer n. Note that d(D) = n. Let us make a projective transformation π:

(1): π(x1) = y1 + y2,
(2): π(x2) = y2,
(3): π(x0) = y0.

Then, the defining equation becomes y1^3 + y2 y1^2 + y2^2 y1 + y0^3. Let us consider the affine part defined by setting y1 = 1 with x = y0, y = y2. The equation becomes

y^2 + y = x^3 + 1.
The function field F(C) is of degree 2 over F_{2^m}(x). In general, the functions in F(C) = K(x)[y] are of the following form:

f(x, y) = (g0(x) + g1(x)y)/h(x).
We may assume that (h(x), g0(x), g1(x)) = (1). We make the further assumption that h(x) splits completely (otherwise, go to a finite extension of K or assume the field is algebraically closed). Suppose that f(x, y) has no pole at finite distance; we claim that h(x) is a non-zero constant.
If not, we show that f has a pole at finite distance and thus f ∉ L(nP∞).
Consider any non-constant factor x−β of h(x). The intersection of x−β with
curve C will be distinct points P1 = (β, y1 ) and P2 = (β, y2 ) (corresponding
to two distinct non-zero values, y1 , y2 , of y on C with x = β, which can
always be achieved if we go to an algebraic extension of K). If the numerator
of f (x, y) is zero at both points, then we have
g0 (β) + y1 g1 (β) = 0,
g0 (β) + y2 g1 (β) = 0.
g0 (β) = 0,
g1 (β) = 0,
v^2 + v = u^3 + v^3.

It is easy to see that ord_{P∞}(v) = 3, ord_{P∞}(u) = 1. Hence, ord_{P∞}(y) = −3 and ord_{P∞}(x) = −2. Therefore, a polynomial of the form g0(x) + g1(x)y will have order −2i or −2j − 3 at P∞. It is easy to see that all functions with at most n poles at P∞ form a vector space of dimension n.
Exercises
(1) Show that the curve defined over F_{p^2} by the equation

ax^2 + by^2 = 1

is birationally equivalent (cf. Definition 4.26) to a projective line for any 0 ≠ a, b ∈ F_{p^2}, where p is an odd prime number.
(2) Finish the arguments of Example 22.
(3) Finish the arguments of Example 23.
(4) Prove Riemann's theorem for P^1_K directly.
(5) Given ch(K) > 2 and the curve C defined by x^2 − y^2 + x^3 + y^4 = 0, let P be the point at infinity. Find L(P), L(2P), L(3P).
Then, we may simply define the residue as a_{−1} and disregard the integration.
Certainly, we have to prove the residue thus defined is algebraically
sound.
for all P.
4.7.4. Residue
In the example at the beginning of this section, we have the important
concept of residue of Cauchy over complex numbers C. We want to
Then there exists a differential ω of F with ω ∈ Ω(D) such that res_{P_i} ω = r_i.
Exercises
(1) Prove the existence theorem for differentials for P^1_K directly.
(2) Prove the approximation theorem for functions for P^1_K directly.
The above proposition means that in our situation, a local ring O (we only consider valuation rings) contains a perfect field K; hence, the residue map O → O/P induces an isomorphism on K. So, the characteristic of K equals the characteristic of O. This is the equicharacteristic case, and O is K[[t]], where K is the field of representatives. We have the following proposition, which will be useful for algebraic coding theory.
Then, we have the following: (1) μ_P(ydx) = r; (2) res_P(ydx) = c_{−1}.
Proof. Statement (1) follows from definition, and for (2), see Chevalley,
p. 110 corollary of Theorem 6 [10].
Remark: In our applications to coding theory, we restrict the above
proposition to the simple cases that the differentials only have simple poles,
i.e., r ≥ −1.
Note that in the classical complex case, every place is rational; this
result shows that this residue matches Cauchy residue.
Example 24: Let us consider the curve C = P^1_C, where C is the field of the complex numbers.
(1) We shall make computations according to the Riemann–Roch theorem. Let D = mP∞, where m is an integer ≥ −1. Since the genus g = 0, we have d(mP) ≥ 2g − 1 = −1; it follows from Proposition 4.28 (Riemann's theorem) that ℓ(mP) = d(mP) + 1 − g. Hence, it follows from Proposition 4.34 (Riemann–Roch theorem) that i(mP) = 0, i.e., there is no differential ω with divisor δ(ω) ≥ mP∞. Let us consider D = −mP∞, where m ≥ 2 is an integer. Let the regular function ring of P^1_C\{P∞} be C[x]. Then, it follows from Proposition 4.29 that ℓ(−mP) = 0. Further, it follows from Proposition 4.34 (Riemann–Roch theorem) that

−i(−mP) = d(−mP) + 1 − g,

and hence, we have

i(−mP) = m − 1.
Let (f(x)/g(x))dx ∈ Ω(−mP). Since the differential has no pole in A^1_C, g(x) must be a non-zero constant; we may assume it is 1, i.e., the differential is f(x)dx. It is easy to check that the vector space Ω(−mP) is generated by {dx, xdx, . . . , x^{m−2}dx}.
(2) We wish to show that the differentials in the classical sense are
equivalent to linear functions on Ξ. First, we generalize our previous
concepts as follows.
Let f(x)dx be a differential in the classical sense. Let v_P = the order of f(x)dx at the point P, i.e., f(x)dx = (Σ_{j=v_P}^{∞} a_j t^j)dt, where a_{v_P} ≠ 0 and t is a uniformization parameter. Then, (f(x)dx) = Σ_P v_P P. Let D = Σ_j n_j P_j be a divisor. Then, we have f(x)dx ∈ Ω(D) ⇐⇒ (f(x)dx) ≥ D. We give a concrete argument for (2) directly. We separate the discussion into the following two steps: (A) every differential in the classical sense induces a differential in Weil's sense, and this induction is a one-to-one correspondence; (B) the induced map is a bijection between Ω(D) and the linear functionals vanishing on Ξ(D) + F.
Proof of (A): We show that every differential in the classical sense can be viewed as a (K-)linear function on Ξ which vanishes on Ξ(D) + F for some suitable divisor D. Let ω = f(x)dx be a differential in the classical sense. Then, for any repartition ξ and any point P ∈ P^1_C (note that all points are rational), let ξ_P be the function element specified by ξ at the point P; we may define

ω_P(ξ) = the residue of ξ_P ω at P.

For the above residue to be non-zero at P, either ω must have a pole or ξ_P must have a pole, and both sets are finite; therefore, there are only finitely many P for which the above residue is non-zero, and we may define

ω(ξ) = Σ_P ω_P(ξ).
(f(x)dx) ≥ −D.

Then, (f(x)dx) = Σ_j n′_j P_j + m′P∞ + Σ_k ℓ_k P_k with n′_j ≥ n_j, m′ ≥ m, and ℓ_k ≥ 0. We see that h(x)f(x) has no poles at finite distance, and thus it is a polynomial g(x) in x. Therefore, (g(x)dx) is Σ_j (n′_j − n_j)P_j + (m′ + Σ_j n_j)P∞ + Σ_k ℓ_k P_k. Let t = x^{−1} be a uniformization parameter at ∞. We have dx = −t^{−2}dt. Then, it follows that at the point ∞,

v_∞(h(x)f(x)dx) = m′ + (Σ_j n′_j) ≥ m + (Σ_j n_j) = −d(D),

so that

v_∞(g(x)) ≥ 2 − d(D).
proved in this case. For the general cases, the reader is referred to Chevalley,
pp. 26–30 [10] or Exercise 4.
According to (1), (2), (3), differentials in the classical sense are the linear
functions which vanish on Ξ(D) + F.
Example 25: Let C = P^1_K with ch(K) > 2. Then, F(C) = K(x). Let ω = (x^2 + x^3)d(x^2) + dx be a differential. It is easy to see that ω = (1 + 2x^3 + 2x^4)dx. At any point P_a = ((x − a)), the local ring at P_a is R = K[x]_{(x−a)} with maximal ideal (x − a)R, where (x − a) is the uniformization parameter. Then, ω = (1 + 2(x − a + a)^3 + 2(x − a + a)^4)d(x − a) = f(x − a)d(x − a). We have ord_{P_a}(ω) = ord_{P_a}(f(x − a)). We have one more point P∞ at ∞. The local ring at P∞ is K[τ]_{(τ)}, where τ = 1/x. The differential ω = (1 + 2τ^{−3} + 2τ^{−4})d(τ^{−1}) = −(τ^{−2} + 2τ^{−5} + 2τ^{−6})dτ = −τ^{−6}(2 + 2τ + τ^4)dτ. So, we have that μ_{P∞}(ω) = ord_{P∞}(−(τ^{−2} + 2τ^{−5} + 2τ^{−6})) = −6.
Example 26: In the definition of residue, we have to use the trace function, which can be illustrated by examples. If the point is not rational, then we have to consider the trace function. Let us consider the curve P^1_K and the following differential:

η = 2xdx/(x^2 + 1).

Let the ground field K be C, the complex field. By partial fractions, we have, with i = √−1,

η = (1/(x + i) + 1/(x − i))dx.
It is easy to see that the residues at (x + i), (x − i) are 1, 1, respectively. At P∞, with uniformization parameter t = x^{−1} (here 1 + t^2 is a unit), η becomes

η = (−2t^{−1} + · · ·)dt.

It is easy to see that Σ_j res_{P_j} η = 0. However, let us consider the case that the ground field K = R, the real field. At ∞, it has a residue −2 as before. There is only one point P = (x^2 + 1) at finite distance with a pole. Let t = x^2 + 1 be a uniformization parameter at this point. Then, we have

η = t^{−1}dt.

The coefficient of the t^{−1} term is 1. However, the point is not rational. Its residue will be defined as the trace of 1; note that 1 · 1 = 1 · 1 + 0 · i and
The sum of all residues at finite distance is clearly Σ_j c_{j1}. We shall show that the residue at P∞ is its negative. Since the residue map is linear, it suffices to check every term in the above formula. Note that at P∞, a uniformizing parameter is x^{−1} = t. Then, we have
Exercises
(1) Given a smooth and regular curve C, the Weierstrass gaps are the integers i such that there is no rational function z which has a pole only at P and ord_P(z) = −i. Show that there are precisely g Weierstrass gaps, where g is the genus of C.
(2) For Example 29, show that ℓ(D) = 1 directly.
(3) Let P be a rational point on P^1_K. Show that i(P) = 0 and ℓ(P) = 2.
(4) Given the curve C = P^1_K, find a differential f(x)dx with residue 1 at x = 1 and −1 at x = −1 and with residue 0 elsewhere.
(5) Given the curve P^1_C, where C is the field of complex numbers, and any divisor D, show that the dimension of the space of all linear functionals on Ξ which vanish on Ξ(−D) + F is max(0, d(D) − 1).
Recall that

ln (1/(1 − x)) = Σ_{r=1}^{∞} x^r/r.
Proof. We may deduce a proof in the curve case from Hartshorne's book Algebraic Geometry, V, Ex. 1.10 and Appendix C, Ex. 5.7.
So, there are in total nine points. This makes the non-restrictive inequality in the Hasse–Weil inequality an equality.
x_0^3 = 1.
α^2 + α + 1 = 0,
β^2 + β + α = 0.
It is easy to see that F22 = F2 [α] and F24 = F22 [β]. Let us count the
number of rational points on C. It is impossible to have two of x0 , x1 , x2
to be 0 since the third one must be 0 and (0, 0, 0) is not a projective point.
So, we have the following two cases: (1) two of them are non-zero and the
third zero; (2) all of them are non-zero.
(y + 1)^5 = x_1^{15} = 1.
Z(t) = P_1(t)/((1 − t)(1 − qt)),

where

P_1(t) = ∏_{i=1}^{2g} (1 − α_i t) = (1 − αt)(1 − βt)
(1) 1 + 2 − α − β = 3,
(2) 1 + 4 − α^2 − β^2 = 9,
(3) 1 + 2^3 − α^3 − β^3 = N_3.

N_r = 1 + q^r − α^r − β^r.
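From (1) and (2), α + β = 0 and α^2 + β^2 = −4, so α = i√2 = −β, in accordance with |α_i| = √q. The counts N_r can then be tabulated numerically; the following is a small sketch (the closed form for α is our deduction from the two equations above):

import math

q = 2
alpha = complex(0, math.sqrt(2))   # i*sqrt(2): alpha + beta = 0, alpha^2 + beta^2 = -4
beta = -alpha
for r in range(1, 5):
    N = 1 + q**r - alpha**r - beta**r
    print("N_%d = %d" % (r, round(N.real)))    # N_1 = 3, N_2 = 9, N_3 = 9, N_4 = 9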
y^2 = x(x − 1)(x − α),
α^2 + α + 1 = 0.
Example 36: For the later applications in coding theory, let us consider the Klein quartic curve over F_2 or F_{2^2} or F_{2^4} defined by the equation x^3 y + y^3 + x = 0. Its genus is g = 3.
For coding theory, let us count the number of rational points. It is clear that x = 0 ⇔ y = 0. We denote this point (0, 0) (or (0, 0, 1) as the projective point) by P. It is easy to see that (1, 1) is not a point. After we homogenize the equation, it becomes x^3 y + y^3 z + xz^3 = 0. Let z = 0. Then, the curve has two other points at ∞, (1, 0, 0) = P1 and (0, 1, 0) = P2, over any field K. So, there are three rational points over F_2: P, P1, and P2.
Now, let us count the points at finite distance. Over F_{2^2}, we have x^3 = 1 for all x ≠ 0. Therefore, the affine equation reduces to y + 1 + x = 0 if x, y ≠ 0, i.e., y = 1 + x, where x ≠ 1, 0. Let α be a field generator of F_{2^2} over F_2, i.e., α^2 + α + 1 = 0. Then, (α, α + 1) = P3 and (α + 1, α) = P4 are the extra rational points on the curve. Therefore, over F_{2^2}, there are five rational points, the two extra points being P3 and P4.
Let us consider the field F_{2^4}. Let F_{2^4} = F_{2^2}[β] with β satisfying β^2 + β + α = 0. First, we consider x = α. Note that α^3 = 1. Therefore, the equation reduces to y^3 + y + α = 0. It is easy to see that y^3 + y + α = (y + α + 1)(y^2 + (α + 1)y + (α + 1)). We have to solve y^2 + (α + 1)y + (α + 1) = 0. The two roots are y = α^2(β + 1), α^2 β. We name (α, α^2 β) = P5 and (α, α^2(β + 1)) = P6. Similarly, we find P7 = (α + 1, (α + 1)^2(β + α)) and P8 = (α + 1, (α + 1)^2(β + α + 1)). Thus, we have four more points P5, P6, P7, and P8.
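These counts are easy to confirm by enumeration. A small sketch over F_16 = F_2[γ], assuming the defining relation γ^4 = γ + 1: elements are 4-bit integers, the subfield F_4 consists of 0 together with the cube roots of 1, and the two points at infinity (1, 0, 0), (0, 1, 0) are added to each affine count.

MOD = 0b10011                        # x^4 + x + 1, the assumed defining polynomial

def gmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b10000:
            a ^= MOD
        b >>= 1
    return r

def cube(a):
    return gmul(a, gmul(a, a))

F2 = [0, 1]
F4 = [0, 1] + [e for e in range(2, 16) if cube(e) == 1]
F16 = list(range(16))

for name, F in (("F_2", F2), ("F_4", F4), ("F_16", F16)):
    affine = sum(1 for x in F for y in F
                 if (gmul(cube(x), y) ^ cube(y) ^ x) == 0)
    print(name, affine + 2, "rational points")   # 3, 5, and 17, respectively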
Exercises
We use the term the theory of smooth algebraic projective curves over a finite
field Fq for the theory of functions of one variable over a finite field Fq .
As we all know, after Shannon, coding theory was separated into two parts: (1) the theoretical part about the existence of good codes and (2) decoding procedures. Note that for the Hamming code, or more generally for any vector-space code, we have the following diagram:
message space F_q^k −π→ word space F_q^n,

where the map π is an injective map whose image is the code space.
For the later developments, we slightly modify the coding theory to the
following diagram:
message space F_q^k −σ1→ function space −σ2→ word space F_q^n.
The first map σ1 is injective. Thus, we use functions to rename the messages, and the second map σ2 is a homomorphism with the image σ2σ1(F_q^k) as the code space. Usually, the map σ2 is an evaluation map which evaluates
a function f at an ordered n-tuple of points (P1 , . . . , Pn ). Thus, it maps a
function f to [f (P1 ), . . . , f (Pn )] ∈ Fnq . Note that σ2 σ1 will send the message
space to the word space, certainly we do not want to send any non-zero
message to zero; thus, we require that the composition σ2 σ1 is an injective
map on the message space. In our previous discussions, the message space
is naturally mapped to the code space. The theorists are mainly working on
the function space, and the engineers work on the methods to correct the
errors after the transmissions of the code words. For Reed–Solomon codes,
the function space is all polynomials with degree k − 1 or less, which is a
subspace of all polynomials with degree n − 1 or less (these pair of vector
spaces are mapped to a pair of a code space ⊂ the word space = Fnq by
evaluations at a sequence of points). For classical Goppa codes, the function
space is the set of rational functions of the following form:
f = Σ_{i=1}^{n} c_i/(x − γ_i) ≡ 0 mod (g(x))
n + 1 ≥ k + d ≥ n + 1 − g.
d ≥ n − d(D).
For the last inequalities, the first one is the classical Singleton bound
(Proposition 1.23). The second one comes from the two inequalities already
established for k and d of this proposition as
k + d ≥ d(D) − g + 1 + d ≥ n + 1 − g.
Remark 2: The number k provides the rate of information k/n, and the lower bound n − d(D) for d provides the bound (n − d(D))/2 for the number t of corrections for the code (cf. the remark after Proposition 1.25: as long as t ≤ (n − d(D) − 1)/2, the code can correct t errors by brute force). The lower bounds for the rank (d(D) − g + 1) and the minimal distance (n − d(D)) are called the designed rank and the designed minimal distance.
Example 1: Let the curve C be the projective line P^1_{F_q} over a finite field F_q. Let n = q − 1 and β be a generator of the cyclic group F_q^*. Let B = P_β + P_{β^2} + · · · + P_{β^n}, where P_{β^j} is the point that solves the equation x − β^j = 0, and let D = (k − 1)P∞, where P∞ is the point at infinity, and we assume that k < n. Then, we have
k ≥ m − 1 + 1 = m,
d ≥ 8 − m.
We consider two codes identical if their word spaces are the same F_q^n and their code spaces are the same subspace.
Hence, we conclude that the code space of CΩ (B, D) and Cp (B, D) are each
the dual code of the code space of CL (B, D). Hence, the geometric Goppa
code in residue form CΩ (B, D) is the geometric Goppa code in primary form
Cp (B, D).
Proof. Clearly, it follows from the remark after Proposition 4.38 that the differential ω exists. The divisor (ω) = −B + U is such that the support of B is disjoint from the support of U. Let us define a map π: C_L(B, D′) → C_Ω(B, D) by

π(f) = fω.

It is easy to see that

f ∈ L(D′) ⇔ (f) + U − D ≥ 0 ⇔ (f) ≥ D − U
⇔ (f) + (ω) ≥ D − U + U − B = D − B ⇔ fω ∈ Ω(D − B).

Furthermore, π maps L(D′ − B) to Ω(D) by

f ∈ L(D′ − B) ⇔ (f) + U − D − B ≥ 0 ⇔ (f) ≥ B + D − U
⇔ (f) + (ω) ≥ B + D − U + U − B = D ⇔ fω ∈ Ω(D).

Hence, π: L(D′)/L(D′ − B) → Ω(D − B)/Ω(D), and π induces an isomorphism.
This theory is sensitive to the curves and divisors involved. For instance, there are curves with very few rational points or even no rational points. Clearly, the theory is bad or void in those cases. For smooth plane curves, since the plane P^2_K has q^2 + q + 1 rational points, there are at most q^2 + q + 1 rational points on the plane curve, and the geometric Goppa code based on it will be short. Or the divisor selected may be poor for the application purposes (we have to take the decoding process into consideration). Therefore, we shall consider space curves. We are interested in special curves with many rational points, so that the selection is easy, with special divisors which may aid us in decoding.
We illustrate the above method by several examples as follows.
As pointed out in the remark after Proposition 1.25, any finite code, linear or not, can be decoded by brute force. In the next two chapters, we discuss two faster ways of decoding algebraic curve Goppa codes and compare them with brute-force decoding. In this section, we show that of two codes with close rates of information and close rates of distance, the longer one is more precise. This is easy to understand. For instance, let C1 be an [n, k, d] code which corrects (d − 1)/2 errors and C2 be a [2n, 2k, 2d] code which corrects (2d − 1)/2 errors. Let them work on a block of length 2n. Then, we use a decoder for C1 twice and a decoder for C2 once. Any received word which can be decoded by the decoder for C1 can be decoded by the decoder for C2. On the other hand, a received word with (d − 1)/2 + 1 errors in the first n letters can only be decoded by the second decoder, provided the total number of errors is less than or equal to (2d − 1)/2. Hence, it is easy to see that C2 is more precise. We show this point by simple computations.
The decoding process for Reed–Solomon codes is faster than the known decoding of algebraic geometry Goppa codes (see the next chapter). However, the algebraic geometry Goppa codes are more precise than the Reed–Solomon codes. The ground field controls the time required for multiplication, which certainly affects the total speed of computation. For our comparisons of different codes, we shall fix a ground field. Let us fix the ground field to be F_q = F_{p^m}; there are q + 1 rational points in P^1_{F_q}. There, the length of the code is ≤ q − 1 = 15 = n for Reed–Solomon codes, classical Goppa codes, etc. Weil's theorem on the number of smooth rational points of a projective curve gives us a much bigger number, and Example 4 of the preceding section shows that a projective curve C over F_{2^4} could have 64 points, which is many more than 2^4 + 1 = 17 points. So, we have a longer code. We discuss why a longer code is more precise.
We follow Pretzel [8], p. 69. A further advance of geometric Goppa codes in the future might be an improvement in the speed of decoding. If speed is not an issue, then the geometric Goppa code has the advantage of being more accurate, which will be illustrated in this section.
Let us consider a Reed–Solomon code over the field F_{2^4} defined by all polynomials of degree < 7. Then, it is a [15, 7, 9]-code, where 15 = 2^4 − 1, 7 = 7, and 9 = 15 − 7 + 1. So, the number of message symbols is 7, and it can correct 4 = (9 − 1)/2 errors. Let us consider a geometric Goppa code based on the curve x_0^5 + x_1^5 + x_2^5 (cf. Example 4 in the preceding section). By Plücker's formula, its genus g is 6. We have computed that there are 65 rational points (cf. Example 35 in Section 4.9). Let one of these 65 points be P. Let us consider a one-point code with B = the sum of the other 64 points and D = 37P. Then, it is an [n, k, d]-code, where n = 64, k = 32, d ≥ 27. Using the SV algorithm (see the following), it may correct (27 − 6 − 1)/2 = 10 errors. Let us process through the said Reed–Solomon code four times. Then, we process 28 message symbols, while we may process 32 symbols through the said geometric Goppa code.
Let us compare these two processors. We have 32 > 28, the geometric Goppa code carrying more message symbols. Another yardstick is the failure rate, i.e., the probability that one fails to recover the messages because more errors than allowed appear, or mistakenly decodes to wrong messages. To simplify our notation, we say that the decoder always returns
an error message in those cases. Let us compute the probability for each processor to return an error message, i.e., that more errors than allowed appear. For the Reed–Solomon process, if there are 5 or more errors, then the processor will not decode and will reject the received word. Let the channel have a probability p of being incorrect and q of being correct. Then, p + q = 1 and

1 = 1^n = (p + q)^n = Σ_{i=0}^{n} C^n_i p^i q^{n−i}.

The probability of a single letter being incorrect is

r_0 = Σ_{i=1}^{1} C^1_i p^i q^{1−i} = p = 1 − q.

For the Reed–Solomon processor, the probability r_1 of returning an error message is

r_1 = Σ_{i=5}^{15} C^{15}_i p^i q^{15−i} = 1 − Σ_{i=0}^{4} C^{15}_i p^i q^{15−i}.
By the same arguments, using the SV algorithm for the geometric Goppa code, the probability r_2 of returning an error message is

r_2 = Σ_{i=11}^{64} C^{64}_i p^i q^{64−i} = 1 − Σ_{i=0}^{10} C^{64}_i p^i q^{64−i}.
Let us assign a numerical value to p. Let r0 = p = 0.01. Then, the above numbers
are r0 = 0.01 and r1 = 0.27627423 × 10−6 , and r2 = 0.595098292 × 10−10 .
Therefore, to use the Reed–Solomon processor four times, the probability
R1 of returning an error message is
R1 = 1 − (1 − r1 )4 = 0.11052 × 10−5 .
and 1 − (1 − r3)^{2188} = 0.000000000655500732%, which is much better than the Reed–Solomon code.
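These numerals can be reproduced with a direct binomial computation; the following is a sketch under the stated channel model:

from math import comb

def fail_rate(n, t, p):               # probability of more than t symbol errors
    return 1 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(t + 1))

p = 0.01
r1 = fail_rate(15, 4, p)              # Reed-Solomon [15, 7, 9], corrects 4
r2 = fail_rate(64, 10, p)             # geometric Goppa [64, 32, >= 27], corrects 10
R1 = 1 - (1 - r1)**4                  # four Reed-Solomon blocks in a row
print(r1, r2, R1)                     # ~0.276e-6, ~0.595e-10, ~0.110e-5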
α(δ) ≥ 1 − Hq (δ).
where the two-by-two matrix above is in the full modular group = the special linear group over Z = SL2(Z) = Γ. Let the principal congruence subgroup of level N be

Γ(N) = { (a, b; c, d) ∈ SL2(Z) : a ≡ d ≡ 1, b ≡ c ≡ 0 mod N }.

Then, Γ0(ℓ) acts on the half-plane H, and Γ0(ℓ)\H is an affine curve Y0(ℓ) (for a detailed discussion, see Tsfasman et al. [33]). We define the reduction of Y0(ℓ) modulo the characteristic p and complete it to a projective curve X0(ℓ) (= Cℓ). Then, its genus is ℓ/12. We have the following proposition.
Remark: The curves mentioned above are interesting. For instance, there are no smooth plane models for them if ℓ is large enough. Note that a projective plane P^2_{F_q} can be decomposed as A^2_{F_q} ∪ A^1_{F_q} ∪ A^0_{F_q}; its number of rational points is q^2 + q + 1. Therefore, a smooth plane curve has at most q^2 + q + 1 rational points; then, the Shimura curves X0(ℓ) do not have a smooth and regular planar model if ℓ > 12(p^4 + p^2 + 1)/(p − 1) − 1, where q = p^2. Note that then we have the number of rational points nℓ ≥ (p − 1)(ℓ + 1)/12 > (p − 1) · 12(p^4 + p^2 + 1)/((p − 1) · 12) = q^2 + q + 1, the number of rational points of a plane.
We want to show that geometric Goppa codes can be better than the well-known Gilbert–Varshamov bound. Recall the entropy function Hq(x), with Hq(0) = 0, defined for 0 < x < (q − 1)/q by

Hq(x) = x log_q(q − 1) − x log_q(x) − (1 − x) log_q(1 − x).

Setting H′q(x) = log_q((q − 1)(1 − x)/x) = 1, or

(q − 1)(1 − x)/(qx) = 1,

we find

x = (q − 1)/(2q − 1).
Theorem 5.11. Let us use the notations of the previous Proposition 5.9. Moreover, we assume that p ≥ 7 and q = p^2. Then, we have a one-point code on X0(ℓ) with block length nℓ − 1 tending to ∞ such that for δ = (q − 1)/(2q − 1), the relative minimum distance dℓ/(nℓ − 1) tends to a limit > δ, and the rate of information kℓ/(nℓ − 1) tends to a limit > 1 − Hq(δ).
Proof. Let us consider the sequence of Shimura curves X0(ℓ). Let us select one rational point P and call the other rational points P1, . . . , P_{nℓ−1}.
Let B = Σ_{i=1}^{nℓ−1} Pi. Let D = αℓP with αℓ > 2g − 2. Then, the rank kℓ and minimal distance dℓ satisfy

kℓ ≥ (nℓ − 1) − αℓ + g,
dℓ ≥ αℓ − 2g + 2,

and the rate of distance δℓ and the rate of information Rℓ are given as

Rℓ = kℓ/(nℓ − 1) ≥ (nℓ − αℓ + g − 1)/(nℓ − 1),
δℓ = dℓ/(nℓ − 1) ≥ (αℓ − 2g + 2)/(nℓ − 1).

Choose αℓ so that (αℓ − 2g + 2)/(nℓ − 1) → δ. Then, the limit of δℓ is at least δ, and the limit of the rate of information Rℓ (cf. Proposition 5.9) is at least

1 − δ − lim(gℓ/nℓ) ≥ 1 − δ − 1/(p − 1).
Under our assumption, p ≥ 7, and we want to prove that log_{p^2}(2p^2 − 1) > p/(p − 1) for all p ≥ 7. For p = 7, we have log_{7^2}(2 · 49 − 1) = 1.1755 . . . > 7/6 = 7/(7 − 1), so the inequality log_{p^2}(2p^2 − 1) − p/(p − 1) > 0 is satisfied; it is not hard to see that the left-hand side is a monotonically increasing function of p. Therefore, we always have log_{p^2}(2p^2 − 1) > p/(p − 1) = 1 + 1/(p − 1) for all p ≥ 7. Note that with q = p^2, the above result can be rewritten as log_q(2q − 1) > p/(p − 1) = 1 + 1/(p − 1). It follows from the preceding Proposition 5.10 that (cf. Definition 1.36, Proposition 1.40)

α(δ) = lim_{ℓ→∞} Rℓ ≥ 1 − δ − lim(gℓ/nℓ) ≥ 1 − δ − 1/(p − 1).
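The numerical inequality used above is easily spot-checked; a small sketch:

from math import log

for p in (7, 11, 13, 17, 19, 23):     # the primes p >= 7 of the theorem
    q = p*p
    lhs, rhs = log(2*q - 1, q), p/(p - 1)
    print(p, round(lhs, 4), round(rhs, 4), lhs > rhs)  # p = 7: 1.1755 > 1.1667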
Exercises
(1) Find the generator matrix for the code defined in Example 2.
(2) Find the check matrix for the code defined in Example 2.
(3) Write a computer program to code any message for the code defined in
Example 2.
(4) Show that the Gilbert–Varshamov’s bound is not reached by a sequence
of Hamming codes.
Chapter 6

Decoding the Geometric Goppa Codes
6.1. Introduction
(1) The sender and the receiver agree on the generator matrix and the check matrix (cf. Exercises 4 and 5). Upon receiving a word, the receiver either (A) lets it go to the left column to check whether it is a code word, by using the check matrix or the check procedure. If it is a code word (it may not be the originally sent code word), then go to the next block of the message. If not, then it is led to the syndrome calculation (see Section 6.2) of the right column. Or (B) we may go to the right column directly. We apply the syndrome calculation to the received word r. There are two possibilities: either it passes the calculation or it fails. If it passes the calculation and comes from the left column with a failure of the check procedure or check matrix, that means the basic assumption 1 ≤ wt(e) ≤ t is not true; therefore, we conclude that there are more than t errors, and we return an error message. Or maybe it comes to this calculation directly and passes; then we have to check it by the check procedure. If it further passes the check, then it is a code word; if it fails, then it goes to SV or DU.
(2) We start either the SV algorithm (Section 6.3) or the DU algorithm (Section 6.4), under the basic assumption that there are t or fewer errors. At the end of the procedures, we construct an error vector e.
(3) Further, a test will decide whether r − e is a code vector. If it is, then we complete the decoding procedure and correct the errors (occasionally, it may correct more than t errors) successfully. If it is not, then there are more than t errors, and we return the message "error".
(Flow chart omitted: the received word r enters either the check procedure or the syndrome calculation, as described in steps (1)–(3) above.)
Definition 6.1. For any rational functions φ, r in the rational function field of C with no poles at B, we define the pseudo-dot (or pairing) product · as

φ · r = Σ_j φ(P_j)r(P_j) = Σ_j φ(P_j)r_j,
Definition 6.2. Given a received word r with the unknown original code word c, let e = (e_1, . . . , e_n) be an error word with c = r + e. The point P_j is called an error location (with respect to e) if e_j ≠ 0. A non-zero function
It is possible that the error vector e we find may not be the true one. This is because our basic assumption 1 ≤ wt(e) ≤ t may not be satisfied. We have to use the check procedure or the check matrix to decide whether e is the true one by checking whether r − e is a code word.
Let us consider L(D). Let {ϕ_0, . . . , ϕ_{ℓ−1}} be a basis of L(D). Then, we may write {ϕ_i(P_j)r_j} as an ℓ × n matrix (ϕ_i(P_j)r_j). Note that if every row adds up to 0, then ϕ_i · r = 0 for all i, so there is no syndrome with respect to L(D), and r = c or e = 0. The above is the check procedure. We use the following proposition to decide whether a received word r has either more than t errors or no error (cf. the remark after Proposition 1.25). It is an important proposition for understanding the decoding procedures; it shows that if we only allow t or fewer errors in any block, we need only check all functions χ in L(Y) with d(Y) ≥ t + 2g − 1 and support(Y) disjoint from support(B), instead of checking L(D). Note that even if there is no syndrome, this may be due to the false assumption that the number of errors is ≤ t (i.e., the true number of errors > t). We have to test further to see whether it is a code word.
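Schematically, the check procedure is nothing but a row-sum test on the matrix (ϕ_i(P_j)r_j); a toy sketch over an assumed field GF(2) with assumed values ϕ_i(P_j):

p = 2
Phi = [[1, 1, 1, 1],                  # assumed values phi_0(P_j)
       [0, 1, 0, 1]]                  # assumed values phi_1(P_j)

def passes_check(r):                  # no syndrome iff phi_i . r = 0 for all i
    return all(sum(f*rj for f, rj in zip(row, r)) % p == 0 for row in Phi)

print(passes_check([1, 1, 1, 1]), passes_check([1, 0, 1, 1]))   # True False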
The following is a useful and important criterion for an error locator θ. This proposition will also be applied in the DU algorithm in the next section.
Proof. Let e′ = θ ∗ e = (θ(P_1)e_1, . . . , θ(P_n)e_n). Note that the weight of e′ is ≤ t since e has at most t non-zero coordinates.
(⇒) Since θ is an error locator, we have e′ = 0. We have

θχ · e = χ · (θ ∗ e) = χ · e′ = 0,

for all χ ∈ L(Y).
(⇐) Since χ · e′ = χ · (θ ∗ e) = θχ · e = 0 for all χ ∈ L(Y), e′ is a code word in C_p(B, Y), which has a minimal weight ≥ d(Y) − 2g + 2 ≥ (2g + t − 1) − 2g + 2 = t + 1 > t. Therefore, e′ = (0, 0, . . . , 0), and θ is an error locator for e.
If we used the check procedure or the check matrix beforehand to decide that there are errors, then certainly there are more than t errors. Otherwise, we complete a basis ϕ_0, . . . , ϕ_{w−1} of L(U) to a basis ϕ_0, . . . , ϕ_{ℓ−1} of L(D); we have to use the remaining part of the check procedure now to complete it. Thus, we can tell whether r is a code word or there are more than t errors.
Let us go back to the syndrome calculation without the check first. (2) If the equation is not satisfied, i.e., if the answer on the right-hand side is a non-zero vector, we assume that there are at most t errors and use the syndrome table to find an error locator and the error vector e (see Propositions 6.7 and 6.8). Certainly, we have to check whether r − e is a code word by using the check matrix or the check procedure. If r − e is a code word, then we succeed in correcting the errors. If r − e is not a code word, then there are more than t errors, and the decoder fails. We need d(U) × n (sometimes only (2g + t − 1) × n, as in the case of one-point codes) multiplications for the syndrome calculation of every block.
Suppose an error locator θ is found. We wish to estimate the size of the set M = {P_i : θ(P_i) = 0}. The following proposition gives an estimate.
If F ≱ 0 or E ≱ 0, then E + F ≱ 0, which is impossible. Thus, F ≥ 0 and E ≥ 0. We wish to show that d(F) ≤ d(A). Suppose the contrary, d(F) > d(A). Then, d(E + F) > d(A), and d(A) = d((θ) + A) > d(A). A contradiction.
Since we usually select the divisor A with support(A) disjoint from support(B), which contains M, we conclude that d(A) ≥ ||M||. This is an important estimate of the size of M.
6.3. SV Algorithm
This is the part of step (2) of Section 6.1 of this chapter. Let us consider
a geometric Goppa code in primary form Cp (B, D) based on a smooth
Decoding the Geometric Goppa Codes 167
projective curve C with genus g and a suitable divisors B, D. Let the rank
k = n − d(D) + g − 1 (note that according to Proposition 4.28 (Riemann’s
theorem), if d(D) ≥ 2g − 1, then we have (D) = d(D) + 1 − g, and k = n −
(D) = n−d(D)+g −1) and the minimal distance d = d(D)−2g +2 and the
number of permissible errors t ≤ d−g−1
2 . The SV algorithm requires many
numerical inequalities. They are for (1) syndrome calculation to decide if
the syndrome of r with respect to U is 0, (2) the existence of error locator θ,
and (3) the usage of θ to compute the error vector e.
The left null space of the above matrix is non-trivial. For any non-trivial solution [α_0, . . . , α_{v−1}], let θ = Σ_j α_j ψ_j. Then, θ is an error locator for e in L(A).
Proof. The first thing we want to show is that the above system of equations has a non-trivial solution. We have ℓ(A) = v > t. Let M be the set of error locations of e (although we do not know e, the set M exists theoretically). Let us consider the following system of equations:

Σ_j ψ_j(P_k)α_j = 0, P_k ∈ M.
for all j. Thus, θ is an error locator for e in L(A). Therefore, equation (2) is satisfied by {α_0, . . . , α_{v−1}}.
Conversely, let the set {α_0, . . . , α_{v−1}} satisfy the above equation (2). Let θ = Σ_j ψ_j α_j. It follows from Proposition 6.4 that it suffices to show that θχ · e = 0 for all χ ∈ L(Y), especially for all χ_j ∈ L(Y). We have

θχ_j · e = Σ_k (ψ_kχ_j · e)α_k = Σ_k (S_{kj} · e)α_k = Σ_k (S_{kj} · r)α_k = 0.
where the E_k are indeterminates, then the non-trivial solution exists uniquely. (iii) Furthermore, the error vector e = [e_1, . . . , e_n] is a solution, with e_j = 0 for all j ∉ M′.
Proof. (i) It follows from Proposition 6.6 that since M′ ⊂ support(B) and support(A) ∩ support(B) = ∅, all elements of M′ are outside the support of A. We have ||M′|| ≤ d(A). (ii) Note that L(X) ⊂ L(D) and φ_j · r = φ_j · e. We may replace all φ_j · r in the above equation by φ_j · e. Let M be the set of all error locations of e; we then have M ⊂ M′. We have

Σ_{k∈M′} φ_j(P_k)E_k = φ_j · e = φ_j · r.
φ_j · (e′ − e*) = 0, j = 1, . . . , u,

which means that (e′ − e*) ∈ C_p(B, X), which has a minimal distance d = d(X) − 2g + 2 ≥ d(A) + 1 (by the sufficient relation (6)). On the other hand, e′, e* have only zero values outside M′, which has a cardinality of at most d(A). We conclude that e′ − e* = 0 or e′ = e*. Thus, we prove the uniqueness of the solutions. It means that we may solve the system of equations (3) to find the error vector e. (iii) This is obvious from the above discussion.
The preceding three propositions are the kernel of the decoding process. Note that we always assume 1 ≤ wt(e) ≤ t. If the system of equations (2) produces only the trivial solution, then our assumption 1 ≤ wt(e) ≤ t is false, i.e., either there is no error or there are more than t errors. The two cases can be separated by the check procedure or by using a check matrix. Even if it does produce an error vector e, it may be just an accident, and we still have to check r + e to be sure. If r + e is a code word, then we decode successfully. Otherwise, if r + e is not a code word, then the decoder fails.
The above system of equations (3) can be written in matrix form, with the P_{i_k} ranging over M′, as

[ φ_0(P_{i_0})     · · ·   φ_0(P_{i_m})     ]   [ E_{i_0} ]   [ φ_0 · r     ]
[    · · ·         · · ·      · · ·         ] · [    ·    ] = [     ·       ]      (4)
[ φ_{u−1}(P_{i_0}) · · ·   φ_{u−1}(P_{i_m}) ]   [ E_{i_m} ]   [ φ_{u−1} · r ]
Example 1: Let us consider the Klein quartic projective curve over F_{2^4} defined by the equation x³y + y³ + x = 0. We shall consider a one-point code with genus g = 3. Let us take t = 3, b = 6, m = 14 = d(D). Then, all numerical conditions are satisfied. This code has n = 16, k = 4 = n − d(D) + g − 1, and d = d(D) − 2g + 2 = 10.
(1) Let us count the number of rational points. There are 17 rational points {P1, P2, . . . , P16, P} over F_{2^4} (cf. Example 36 in Section 4.9; we shall use the notations of that example). We take B = P1 + · · · + P16 and D = 14P, where P is the origin.
(2) We take A = 6P, U = Y = 8P, X = 14P. It is easy to check that d(D) = 14, d(A) = 6, d(Y) = 8, d(X) = 14 ≥ 2g − 1 = 5; therefore, it follows from Proposition 4.28 (Riemann's theorem) that ℓ(D) = d(D) + 1 − g, k = n − d(D) − 1 + g, and D ≽ U, D ≽ A + Y ≽ X.
(3) By direct computation, we know the rank of this code is n − d(D) + g − 1 = 4 (cf. Exercises 4 and 5), and the designed minimal distance is d ≥ d(D) − 2g + 2 = 10. The SV algorithm will correct (d − g − 1)/2 ≥ 3 errors.
(4) We compute a basis of L(D) = L(14P). It is easy to see that {f0, f3, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14} form a basis. Note that ordP(x) = 3, ordP(y) = 1, and ordP(fi) = −i. The reasons that they form a basis are the following: (1) fi ∈ L(14P); (2) they are linearly independent over F_{2^4}; (3) by Riemann's theorem, ℓ(14P) = 14 + 1 − g = 12.
We shall compute the following 12 × 16 matrix C:
$$C = \begin{bmatrix} f_0(P_1) & \cdots & f_0(P_{16}) \\ f_3(P_1) & \cdots & f_3(P_{16}) \\ \vdots & & \vdots \\ f_{14}(P_1) & \cdots & f_{14}(P_{16}) \end{bmatrix}.$$
Note that the last equation is equivalent to the defining equation of the curve x³y + y³ + x = 0. We compute the generator matrix and the check matrix, which are 16 × 4 and 4 × 16 matrices, respectively (cf. Exercises 4 and 5).
Let us consider the received word r, code word c, and error word e. We compute fi · r for fi = χi ∈ L(Y) = L(U) = L(8P), i.e.,
$$\begin{bmatrix} f_0(P_1) & \cdots & f_0(P_{16}) \\ f_3(P_1) & \cdots & f_3(P_{16}) \\ \vdots & & \vdots \\ f_8(P_1) & \cdots & f_8(P_{16}) \end{bmatrix} \cdot \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{16} \end{bmatrix} = \begin{bmatrix} f_0 \cdot r \\ f_3 \cdot r \\ \vdots \\ f_8 \cdot r \end{bmatrix}.$$
For fi ∈ L(D):
$$\begin{bmatrix} f_9(P_1) & \cdots & f_9(P_{16}) \\ f_{10}(P_1) & \cdots & f_{10}(P_{16}) \\ \vdots & & \vdots \\ f_{14}(P_1) & \cdots & f_{14}(P_{16}) \end{bmatrix} \cdot \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{16} \end{bmatrix} = \begin{bmatrix} f_9 \cdot r \\ f_{10} \cdot r \\ \vdots \\ f_{14} \cdot r \end{bmatrix}.$$
The total number of computations, whether there is no error or there is at least one error, is 12 × 16 = 192.
Note that fj = ψj ∈ L(A) = L(6P) for j = 0, 3, 5, 6. Further, note that fk = φk ∈ L(X) = L(D) = L(14P) for k = 0, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, and ψkχj = fkfj for k = 0, 3, 5, 6 and j = 0, 3, 5, 6, 7, 8.
Explicitly, we have
8α0 + 11α3 + 5α5 + 15α6 = 0,
11α0 + 15α3 + 1α5 + 3α6 = 0,
5α0 + 1α3 + 9α5 + 7α6 = 0,
15α0 + 3α3 + 7α5 + 14α6 = 0,
2α0 + 9α3 + 11α5 + 6α6 = 0,
1α0 + 7α3 + 6α5 + 13α6 = 0.
A non-zero solution of the six equations is α0 = 3, α3 = 8, α5 = 6, α6 = 1. Note that the multiplication and addition are not the usual ones between integers; they are the operations on field elements. For instance, 2 + 3 = α + (α + 1) = 1 and 2 × 3 = α × (α + 1) = α² + α = 1, instead of 5 and 6. The fastest way of solving a system of linear equations of this small size is still Gaussian elimination. The number of multiplications involved is 6² × 4/3 = 48. The corresponding error locator is θ = 3f0 + 8f3 + 6f5 + f6.
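For readers who wish to experiment, the following Python sketch (ours, not the book's) shows how such a small linear system over F_{2^4} can be solved. We represent F_{2^4} as F2[z]/(z⁴ + z + 1), which need not agree with the tower F4(β) used in the text, so the integer labels of field elements may differ from those in this example.

```python
# A sketch (our own): arithmetic in GF(16) = GF(2)[z]/(z^4 + z + 1) with
# elements encoded as 4-bit integers, plus Gaussian elimination.
MOD = 0b10011  # z^4 + z + 1 (an assumed modulus; the book builds F_4(beta))

def gf_mul(a, b):
    """Carry-less multiplication followed by reduction mod z^4 + z + 1."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0b10000:
            a ^= MOD
        b >>= 1
    return p

def gf_inv(a):
    # brute force is fine for 16 elements
    return next(x for x in range(1, 16) if gf_mul(a, x) == 1)

def solve(A, b):
    """Gaussian elimination over GF(16); addition is XOR."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c])
        M[c], M[piv] = M[piv], M[c]
        inv = gf_inv(M[c][c])
        M[c] = [gf_mul(inv, v) for v in M[c]]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c]
                M[r] = [v ^ gf_mul(f, w) for v, w in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]
```

A homogeneous system like (2) would instead be handled by computing the null space, but the elimination step is identical.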
Let us find the zero set of θ. We use the following table with the rows of f0, f3, f5, f6 pre-computed (for instance, to compute f5(P6), we have f5(P6) = (αβ + β + α + 1)/α² = β + 1 = 5), and the values of θ are computed by the formula θ = 3f0 + 8f3 + 6f5 + f6. We have the following table of values (Table 6.1) to help us do the computation.
To find the zero set, we only have to add the row vectors [fj(P1), . . . , fj(P16)] with the computed coefficients and observe the resulting 0 coordinates. For instance, the summation corresponding to P14 is 3 · 1 + 8 · 12 + 6 · 4 + 11, which is written in terms of α, β with α² + α + 1 = 0.
Table 6.1.
      P1  P2  P3  P4  P5  P6  P7  P8  P9  P10 P11 P12 P13 P14 P15 P16
f0     1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
f3     0   0   3   2   3   3   2   2  14  13  10   8  15  12   9  11
f5     0   0   1   1   4   5   6   7  12  15   9  11   5   4   7   6
f6     0   0   2   3   2   2   3   3   8  10  14  13   9  11  12  15
θ      3   3   3  10   4  15   5   8   4   4   2  11   0   0   9   0
We have to further check whether the code vector c is really a code vector. For that purpose, note that fi · r = Σk fi(Pk)rk = Σ_{k∈M} fi(Pk)rk + Σ_{k∉M} fi(Pk)rk and fi · c = Σk fi(Pk)ck = Σ_{k∈M} fi(Pk)ck + Σ_{k∉M} fi(Pk)ck, that e = c − r = c + r (since the characteristic is 2), that for k ∉ M, fi(Pk)rk = fi(Pk)ck, and that for k ∈ M, fi(Pk)rk + fi(Pk)ck = fi(Pk)ek. Further, note that fi · c = 0 (which is what we want to prove!) if and only if Σ_{k∈M} fi(Pk)ek = fi · r. Furthermore, for i = 0, 3, 5, 6, 7, 8, the equations have been checked,
$$10 \cdot 1 + 8 \cdot 14 + 14 \cdot 15 = (\alpha\beta + \alpha) + (\alpha\beta)(\alpha\beta + \beta + \alpha) + (\alpha\beta + \beta + \alpha)(\alpha\beta + \beta + \alpha + 1)$$
$$= \beta^2 + \alpha^2 + (\alpha\beta + 1)(\beta + \alpha) + \alpha\beta + \alpha\beta + \alpha = \beta + \alpha = 6.$$
Indeed, all equations are satisfied which implies that fi · c = 0 for all
0 ≤ i ≤ 14. We conclude that c is a code word, and we decode successfully.
Note that if the above equations are not satisfied, then our decoder fails.
For this step, it takes 18 multiplications.
The total number of multiplications needed is as follows: (1) If there is no error: (A) if we use the check matrix, it is 64; (B) otherwise, if we use the syndrome calculation, it is 192. (2) If there are less than t errors, then it is 192 + 48 + 9 + 64 + 18 = 331. (3) If the decoder fails, then it is 96 + 48 + 9 + 64 + 18 = 235. We are processing 4 letters, which is 16 bits. Per bit, we have (1): (A) 4 multiplications, (B) 12 multiplications; (2) 20.69 multiplications; (3) 14.69 multiplications. The maximal number of multiplications for 16 bits (k = 4 and each block contains 4 message symbols) is 32.69, which means 2.04 per bit. A modern computer of 500 MHz can correct at least 15 million blocks or 240 million bits per second.
Remark: The main mistake made by some books about the SV algorithm is in the syndrome calculation. It is taken to be Sk,j · r = ψkχj · r; if all the outcomes are zeroes, then it is faultily claimed that there is no syndrome and hence no error.
The true syndrome calculation uses Proposition 6.4. If a received word r satisfies χ · r = 0 for all χ ∈ L(U), where D ≽ U and ℓ(U) ≥ t + 2g − 1, then either there are more than t errors or there is no error. The only way to show that there is no error is by using the check matrix or by computing φj · r = 0 for a basis {φj} of L(D).
Exercises
(1) Show that [1, 1, 1, 2, 15, 12, 8, 2, 8, 6, 14, 11, 0, 1, 0, 0] is a code word in Example 1.
(2) Show that [5, 5, 15, 9, 3, 2, 14, 4, 12, 10, 14, 4, 0, 0, 1, 0] is a code word in
Example 1.
(3) Show that [1, 1, 3, 1, 3, 13, 9, 11, 8, 15, 5, 13, 0, 0, 0, 1] is a code word in
Example 1.
(4) Find a generator matrix for the code in Example 1.
(5) Find a check matrix for the code in Example 1.
(6) Write a computer program to decode the code in Example 1.
(7) Write down the details of decoding the geometric Goppa code Cp(B, 37P) with d(B) = 64 based on $x_0^5 + x_1^5 + x_2^5$ with the ground field F_{2^4} (cf. Example 35 in Section 4.9).
(8) Write a computer program to decode the code in Exercise 7.
6.4. DU Algorithm
6.4.1. Pre-computation
Let C be a smooth projective curve of genus g (g ≥ 2) over a finite field Fq, where q = 2^m. We assume that we have a primary Goppa code (cf. Definition 5.5) Cp(B, D) which is equivalent to CΩ(B, D) (cf. Proposition 5.7). Thus, we have an integer n. We shall assume that n > d(D) > 0 and d(D) > g − 1
Upon further study, what we really need are only (1) and (4), which we shall use in our study of the example. Note that d = m − 2g + 2 = 2t + 1 and t = (d − 1)/2, which, according to the remark after Proposition 1.25, is the best we can do. Compared with the SV algorithm, the value of m = 2g + 2t − 1 is much smaller than the value 3g + 2t − 1 for the SV algorithm. The difference of the sizes is g. We have to use Feng–Rao majority voting to create g lines of values for the decoding purpose.
Proof. It follows from Proposition 4.30 that, since for any canonical divisor W, W − (A0 + (2g + t − 1)P) is a divisor of negative degree, we have ℓ(W − (A0 + (2g + t − 1)P)) = 0. Then, it follows from the Riemann–Roch theorem that ℓ(A0 + (2g + t − 1)P) = 2g + t − 1 + 1 − g = g + t. The other part of the proposition follows from Proposition 4.28 since ψi ∈ L(A0 + iP) and ℓ(A0 + iP) − ℓ(A0 + (i − 1)P) ≤ 1.
Remark: The classic ordering and the one-point code mentioned above in this subsection are the ones with A0 = 0 and νA(ψi) = −ordP(ψi).
Let us write A = Σi A(Qi)Qi, where A(Qi) is the corresponding coefficient of Qi. We have the following proposition.
$$\theta \in L(A_0 + iP) \Leftrightarrow \theta \in L(E + (i + s − t)P) \Leftrightarrow −\mathrm{ord}_P(\theta) \le i + s − t \Leftrightarrow −\mathrm{ord}_P(\theta) − s + t \le i.$$
Clearly, the minimal possible value of i in the last inequality is −ordP(θ) − s + t = d(A) − A(P) − ordP(θ).
Furthermore, if θ ∈ L(A0 + iP)\L(A0 + (i − 1)P) and θ′ ∈ L(A′0 + jP)\L(A′0 + (j − 1)P), then θθ′ ∈ L(A0 + A′0 + (i + j)P)\L(A0 + A′0 + (i + j − 1)P).
Similarly, we shall construct bases {ψi} for L(A + (2g − 1)P) and {χi} for L(A′ + (2g − 1)P), where 0 ≤ i ≤ 2g + t − 1, and {φi} for L(A + A′ + (3g − 1)P), where 0 ≤ i ≤ 3g + 2t − 1. Let A0 = A − tP and ψi ∈ L(A0 + iP)\L(A0 + (i − 1)P). Note that D = A + A′ + (2g − 1)P; therefore, we construct more φi than needed for coding purposes. Let us consider a special kind of one-point code with A = A′ = tP. Then, we have A0 = 0, {χi = ψi}, and {ψi}, {φi} bases for L((2g + t − 1)P) and L((3g + 2t − 1)P), respectively. Further, ψi ∈ L(iP)\L((i − 1)P), and D = (2g + 2t − 1)P.
We pre-compute the following relations:
$$\psi_i\chi_j = a_{i,j}\varphi_{i+j} + \sum_{k<i+j} b_{i,j,k}\varphi_k,$$
In general, we shall build the syndrome table line by line, for all i, j with 0 ≤ i, j ≤ 2g + t − 1, from the line i + j = 2g + 2t − 1 + s to the next line i + j = 2g + 2t − 1 + s + 1, for s = 0, . . . , g − 1. Every time after we construct Si,j (see the next section) for all i, j with 0 ≤ i, j ≤ 2g + t − 1 and i + j = 2g + 2t − 1 + s + 1 for s = 0, . . . , g − 1, we try to find an error locator θ row-wise or column-wise. There are two possibilities: either (1) we find an error locator (then we proceed to decode) or (2) we cannot find one (then we push on to find the Si,j on the next line using the materials of Feng–Rao majority voting of the next section). We shall handle case (1) in this section. Let us start with s = 0.
$$\sum_{i=0}^{2g+t-1} S_{i,j}\,\alpha_i = 0, \qquad j \le 2g + t − 1\ (= w) \qquad (2)$$
the materials of the next section to construct the syndrome table of the
next line i + j = 2g + 2t − 1 + 1). If we can find an error locator θ this
way (or row-wise), then we may use the following proposition to solve the
decoding problem.
that there is an error locator θ ∈ L(A), where A = A0 + tP. Let the set
M = {Pi : θ(Pi ) = 0}. Then, its cardinality is at most t, and the following
system of equations
$$\sum_{P_k \in M} \varphi_i(P_k)E_k = \varphi_i \cdot r\ (= \varphi_i \cdot e), \qquad i \le 2g + 2t − 1 \qquad (3')$$
Proof. (1) The first part follows from the discussions before the proposition, which show the existence of the error locator θ. Let us prove the second part of (1). We have θe = 0. Hence,
$$0 = \chi_j \cdot (\theta e) = \sum_{i\le t} \alpha_i\,\psi_i \cdot (\chi_j e) = \sum_{i\le t} \alpha_i\,\psi_i\chi_j \cdot e = \sum_{i\le t} \alpha_i S_{i,j}.$$
(2) Since θ ∈ L(A), it follows from Proposition 6.6 that the cardinality of M is at most d(A) = t.
We have
$$\sum_{P_k \notin M} \varphi_i(P_k)E_k + \sum_{P_k \in M} \varphi_i(P_k)E_k = \sum_{k} \varphi_i(P_k)E_k.$$
It means that we may solve the system of equations (3′) to find the error vector e.
Note that the number of equations is in general greater than the number of variables, and there are no non-zero solutions in general. The non-zero solution {α′i} can be extended to a solution set of
$$\sum_{i=0}^{v} S_{i,j}\,\alpha_i = 0, \qquad j \le w\ (= 2g + t − 1). \qquad (2)$$
Proof. (1) The discussions before the proposition show the existence of the error locator θ′. Let us prove the second part of (1). We have θ′e = 0. Hence,
$$0 = \chi_j \cdot (\theta' e) = \sum_{i\le t+s} \alpha'_i\,\psi_i \cdot (\chi_j e) = \sum_{i\le t+s} \alpha'_i\,\psi_i\chi_j \cdot e = \sum_{i\le t+s} \alpha'_i S_{i,j}.$$
We want to verify that the set {α′0, . . . , α′g+t} satisfies the above equation (2′). It follows from Proposition 6.4 that θ′χ · e = 0 for all χ ∈ L(A′ + (2g − 1)P); in particular, it suffices to check a basis {χj} of L(A′ + (2g − 1)P). Hence, we have
$$\sum_i S_{i,j}\,\alpha'_i = \sum_i (\psi_i\chi_j \cdot e)\,\alpha'_i = \chi_j \cdot (\theta' e) = 0.$$
Thus, any non-trivial solution of equation (2′) induces an error locator. It means that the concrete system of equations (2′) is equivalent to the virtual equations indexed by the set M for θ.
Proof. Since θ ∈ L(A + gP), it follows from Proposition 6.6 that the cardinality of M is at most d(A + gP) = g + t.
Since c = r + e and, outside M, c = r and e = 0, and the field Fq is of characteristic 2, we have φ · r = φ · e and φ · (c + r) = 0 outside M. We know that φ · r = φ · e inside M as well. Therefore, the above equation (3′) has a solution. Clearly, e′ = e|M is a solution of (3′). Let e∗ be another solution. Then, the weight of e′ − e∗ is at most g + t, and e′ − e∗ will be a solution of the homogeneous system of equations
$$\sum_{P_k \in M} \varphi_i(P_k)E_k = 0, \qquad i \le 3g + 2t − 1. \qquad (5')$$
This fact will be critically important in the proof of Proposition 6.20. With all points Si,j for i + j ≤ 2g + 2t − 1 + s classified as valid votes and the others as invalid votes, and with the valid votes further classified as correct or incorrect votes, Proposition 6.20 states that the correct vote exists and is the majority of all valid votes. Thus, we shall collect all valid votes and, among them, look for the majority block, which must be the correct vote. Once the correct vote is found, we make the vote unanimous by changing all incorrect votes, invalid votes, and non-existent votes to the correct one. Thus, the values on the line i + j ≤ 2g + 2t − 1 + s will be decided correctly.
So, they must all be equal. Then, applying Gaussian elimination to the last row of S|i,j, when the last row of S|i,j−1 becomes 0, the value of Si,j is determined. Hence, our arbitrarily assigned value for Si,j must be the determined one.
Suppose that there are two ways to give two different values β1, β2 to Si,j. If we assign β2 to the matrix, then clearly the last row of S|i,j will become [0, . . . , 0, a] with a ≠ 0. Then, we have
rank(S|i−1,j−1) = rank(S|i,j) − 1.
This contradicts our hypothesis. So, the value of Si,j is uniquely determined this way. Similarly, its value is uniquely determined column-wise. Let the value of Si,j be b row-wise and c column-wise. Let us take the value b for Si,j and then apply the column-wise Gaussian elimination to the last column. We get a new matrix with last row [0, . . . , 0, b − c]. It follows that b = c.
(⇐) If Si,j casts a valid vote, then by Gaussian elimination applied to the last row, the last row will become [0, . . . , 0]. Let us then apply Gaussian elimination to the last column. The last column will become [0, . . . , 0]^T. It is easy to see that
rank(S|i−1,j−1) = rank(S|i,j).
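To make the rank criterion concrete, here is a small Python sketch of our own: a rank function by Gaussian elimination, used to test whether completing a syndrome table with a candidate corner value preserves the rank, i.e., whether that value casts a valid vote. For simplicity the matrices are over F2, not over the F_{2^4} of the book's example.

```python
# Our illustration of the valid-vote test: a candidate value for the
# unknown corner entry is "valid" when rank(S|_{i,j}) = rank(S|_{i-1,j-1}).
def rank_gf2(rows):
    """Rank of a 0/1 matrix over GF(2) by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rank, ncols = 0, len(rows[0]) if rows else 0
    for c in range(ncols):
        piv = next((r for r in range(rank, len(rows)) if rows[r][c]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][c]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def casts_valid_vote(S, value):
    """Fill the unknown corner S[-1][-1] with `value` and compare ranks."""
    S_prev = [row[:-1] for row in S[:-1]]   # S|_{i-1,j-1}
    S_full = [row[:] for row in S]
    S_full[-1][-1] = value                  # S|_{i,j} completed
    return rank_gf2(S_full) == rank_gf2(S_prev)

# toy example: only the value 1 makes the last row dependent on the others
S = [[1, 0, 1],
     [0, 1, 0],
     [1, 1, None]]
print([v for v in (0, 1) if casts_valid_vote(S, v)])   # -> [1]
```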
We want to illustrate the relation between the rank of S|i,j and the existence of error locators of a certain kind. Let us assume that the value of Si,j casts a valid vote. Let us consider an error locator θ with νA(θ) = i and θ ∈ L(A0 + (2g + t − 1)P). Let
$$\theta = a_i\psi_i + \sum_{k<i} b_k\psi_k,$$
$$\theta\chi_\ell \cdot e = \chi_\ell \cdot (\theta e) = \chi_\ell \cdot 0 = 0, \quad \text{for all } \ell \le 2g + t − 1,$$
and in particular
$$\theta\chi_\ell \cdot e = \chi_\ell \cdot (\theta e) = \chi_\ell \cdot 0 = 0, \quad \text{for all } \ell \le j,$$
or
$$\psi_i\chi_\ell \cdot e + \sum_{k<i} b_k\,\psi_k\chi_\ell \cdot e = 0, \quad \text{for all } \ell \le j.$$
Thus, the last row of S|i,j is a linear combination of the preceding rows. We rewrite the above in the following way: if the last row of S|i,j is a linear combination of the preceding rows, we solve the following equations:
$$\sum_{k<i} \alpha_k S_{k,\ell} = S_{i,\ell}, \quad \text{for } \ell < j. \qquad (7)$$
Proof. Let us assume (1). In the above discussion, we showed that equation (7) can be solved and that it produces a value for Si,j. By Gaussian elimination applied to the last row, the last row is reduced to [0, . . . , 0]. Now, apply the same reasoning to the columns. The last column becomes [0, . . . , 0]^T. Therefore, we have rank(S|i−1,j−1) = rank(S|i,j). It follows from the preceding proposition that Si,j casts a valid vote. Let
$$\theta = a_i\psi_i + \sum_{k<i} b_k\psi_k, \qquad (1)$$
since the characteristic of the field is 2. Now, all ψk · e are correctly defined, as we already know; then ψi · e must be correctly defined and unique.
Moreover, if there is an error locator θ′ with νA(θ′) = j and θ′ ∈ L(A′0 + (2g + t − 1)P), then, repeating the above argument, we conclude that if
$$\theta' = a_j\chi_j + \sum_{k'<j} b_{k'}\chi_{k'},$$
then, since all χk′ e are correctly defined, as we already know, χj e must be correctly defined and unique. It means that we have
$$\chi_j e = \sum_{k'<j} b_{k'}\chi_{k'} e, \qquad \chi_j \cdot e = \sum_{k'<j} b_{k'}\chi_{k'} \cdot e.$$
(1) We claim that there are at least g error locators θ with νA(θ) taking distinct values between t + s + 1 and 2g + t − 1 and with the corresponding number ai = 1 in equation (1) of Proposition 6.19.
Since there is no error locator θ with νA(θ) ≤ t + s, we may count all error locators in L(A + (2g − 1)P). Let the divisor E be Σ_{Pi∈M} Pi, where M is the set of error locations of e. Clearly, an error locator θ is a non-zero element of L(A + (2g − 1)P − E). Note that d(A + (2g − 1)P − E) ≥ 2g − 1. Therefore, it follows from Riemann's theorem that ℓ(A + (2g − 1)P − E) = g. Let {θi} be a basis. We may use a linear change of basis to make the new basis {θi} have distinct values νA(θi) and the corresponding numbers ai = 1 in equation (1) of Proposition 6.19. We still name the new basis {θi}.
(2) Let us count the number m_cor of correctly defined votes and the number m_inc of valid but incorrect votes. According to the preceding Proposition 6.19, m_cor ≥ the number of pairs (i, j) with both error locators θ, θ′ with νA(θ) = i, νA(θ′) = j, θ ∈ L(A + (2g − 1)P), and θ′ ∈ L(A′ + (2g − 1)P); and m_inc ≤ the number of pairs (i, j) with neither an error locator θ with νA(θ) = i, θ ∈ L(A + (2g − 1)P), nor an error locator θ′ with νA(θ′) = j, θ′ ∈ L(A′ + (2g − 1)P).
Let the set J be the set of all integers {i : t + s + 1 ≤ i ≤ 2g + t − 1}; then there is a map π with π(Si,j) = i which sends {Si,j : i + j = 2g + 2t + s} to J. Let I be the subset of J which is the collection of the {νA(θ)} of error locators θ, where the corresponding number ai = 1 in equation (1) of Proposition 6.19. It means that for every such value, there is exactly one error locator. So, it follows from (1) that the cardinality of I is |I| = g.
(3) Let I′ be the subset of J which is the collection of the {νA(θ′)} of error locators θ′. Then, by arguments identical to (1) and (2), we have |I′| = g.
Let us define a reflection σ of the interval J = {i : t + s + 1 ≤ i ≤ 2g + t − 1} by σ(j) = 2g + 2t + s − j. Note that σ² = id. Then, it is easy
to see that
$$i + j = 2g + 2t + s \Leftrightarrow i = (2g + 2t + s) − j = \sigma(j).$$
Hence,
$$m_{inc} \le |J\setminus(I \cup I')| = 2g − s − 1 − |I \cup I'|, \qquad m_{cor} \ge |I \cap I'|.$$
So, we have
$$m_{cor} − m_{inc} \ge |I \cap I'| + |I \cup I'| + s + 1 − 2g = |I| + |I'| + s + 1 − 2g \ge s + 1 > 0.$$
So, we have proved that more than half of the valid votes are correct; they provide the identical value φi+j · e, and they are in the majority.
Remark: After we tally all valid votes, separate them into blocks according to the values of φi+j · e induced by them, and find the winner, which is the correct one, we make the vote unanimous by changing all incorrect votes, invalid votes, and non-existent votes to the correct one. Once we complete the extra line Si,j with i + j = 2g + 2t − 1 + s + 1, we shall try to find an error locator of order t + s + 1, row-wise or column-wise. If we cannot find an error locator either way, then we proceed to the next line i + j = 2g + 2t − 1 + s + 2. We thus proceed to the end; using the materials in The Construction of the Complete Syndrome Table, Final Step s + 1 = g, we find the error locator θ and the error vector e. Finally, we solve the decoding problem.
      SV Algorithm   DU Algorithm
n          16             16
k           4              7
d          10              7
t           3              3
Note that all other features of these two algorithms are comparable, but the contents of their messages are different. For a block of length 16, the SV algorithm carries the information of 4 letters and the DU algorithm carries the information of 7 letters. Therefore, the same length of transmission with the DU algorithm contains 7/4 the amount of information of the SV algorithm. We shall make the following pre-computation.
(1) We pre-compute the generator matrix and the check matrix (cf. Exercises 4 and 5).
(2) We compute a basis of L((3g + 2t − 1)P) = L(14P) = L(A + A′ + (3g − 1)P). It is easy to see that {φ0, φ3, φ5, φ6, φ7, φ8, φ9, φ10, φ11, φ12, φ13, φ14} form a basis,
where ordP(φi) = −i. The reasons that they form a basis are the following: (1) φi ∈ L(14P); (2) they are linearly independent over F_{2^4}; (3) by Riemann's theorem, ℓ(14P) = 14 + 1 − g = 12.
We shall compute the following 12 × 16 matrix C:
$$C = \begin{bmatrix} \varphi_0(P_1) & \cdots & \varphi_0(P_{16}) \\ \varphi_3(P_1) & \cdots & \varphi_3(P_{16}) \\ \vdots & & \vdots \\ \varphi_{14}(P_1) & \cdots & \varphi_{14}(P_{16}) \end{bmatrix}.$$
It is easy to see that the first 6 of the φi form a basis for L(A0 + (2g + t − 1)P) = L(8P).
$$\varphi_0\varphi_i = \varphi_i, \quad i \le 8; \qquad \varphi_3\varphi_i = \varphi_{i+3}, \quad i \le 8;$$
$$\varphi_5\varphi_i = \varphi_{i+5}, \quad i \le 8,\ i \ne 7; \qquad \varphi_6\varphi_i = \varphi_{i+6}, \quad i \le 8;$$
and
$$\varphi_5\varphi_7 = \frac{y^3}{x^5} = \frac{1}{x^4} + \frac{y}{x^2} = \varphi_{12} + \varphi_5,$$
$$\varphi_7\varphi_7 = \frac{y^4}{x^6} = \frac{x^3y^2 + xy}{x^6} = \frac{y}{x^5} + \frac{y^2}{x^3} = \varphi_{14} + \varphi_7.$$
The above are our pre-computations.
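As a sanity check, the two non-trivial reductions can be verified mechanically. The following sympy sketch (ours, not the book's) clears denominators and checks that the numerator differences lie in the ideal generated by the curve equation over F2:

```python
# Our verification that phi5*phi7 = phi12 + phi5 and phi7*phi7 = phi14 + phi7
# modulo the Klein quartic x^3*y + y^3 + x = 0 in characteristic 2.
from sympy import symbols, div, expand, Poly

x, y = symbols('x y')
curve = x**3 * y + y**3 + x

def in_curve_ideal(expr):
    """True if expr is divisible by the curve polynomial over GF(2)."""
    _, r = div(expand(expr), curve, y)       # polynomial division in y
    return Poly(r, x, y, modulus=2).is_zero  # reduce coefficients mod 2

# phi5*phi7 = y^3/x^5 and phi12 + phi5 = 1/x^4 + y/x^2;
# multiplying through by x^5 (char 2, so minus is plus):
print(in_curve_ideal(y**3 + x + x**3 * y))           # True

# phi7*phi7 = y^4/x^6 and phi14 + phi7 = y/x^5 + y^2/x^3;
# multiplying through by x^6:
print(in_curve_ideal(y**4 + x * y + x**3 * y**2))    # True
```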
Let us consider the received word r. We use the check matrix or the check procedure to determine whether r is a code word. If it is a code word, then we pass to the next block of the received message. The computation requires 7 × 16 = 112 multiplications. If it is not a code word, then we start the syndrome calculation as follows. We compute φi · r for φi ∈ L(A) = L(8P), i.e.,
$$\begin{bmatrix} \varphi_0(P_1) & \cdots & \varphi_0(P_{16}) \\ \varphi_3(P_1) & \cdots & \varphi_3(P_{16}) \\ \vdots & & \vdots \\ \varphi_8(P_1) & \cdots & \varphi_8(P_{16}) \end{bmatrix} \cdot \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{16} \end{bmatrix} = \begin{bmatrix} \varphi_0 \cdot r \\ \varphi_3 \cdot r \\ \vdots \\ \varphi_8 \cdot r \end{bmatrix}.$$
For the syndrome table:
$$\begin{bmatrix} \varphi_9(P_1) & \cdots & \varphi_9(P_{16}) \\ \varphi_{10}(P_1) & \cdots & \varphi_{10}(P_{16}) \\ \varphi_{11}(P_1) & \cdots & \varphi_{11}(P_{16}) \end{bmatrix} \cdot \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{16} \end{bmatrix} = \begin{bmatrix} \varphi_9 \cdot r \\ \varphi_{10} \cdot r \\ \varphi_{11} \cdot r \end{bmatrix}.$$
The total number of computations for the case that there is an error is 9 × 16 = 144.
Table 6.2.
      φ0   φ3   φ5   φ6    φ7    φ8
φ0     0    8    4    3    12     8
φ3     8    3    8    7    11    11
φ5     4    8   11   11  S5,7
φ6     3    7   11  S6,6
φ7    12   11  S7,5
φ8     8   11
Table 6.3.
      φ0   φ3   φ5   φ6    φ7    φ8
φ0     0    8    4    3    12     8
φ3     8    3    8    7    11    11
φ5     4    8   11   11     3  S5,8
φ6     3    7   11    7  S6,7
φ7    12   11    3  S7,6
φ8     8   11  S8,5
Table 6.4.
      φ0   φ3   φ5   φ6    φ7    φ8
φ0     0    8    4    3    12     8
φ3     8    3    8    7    11    11
φ5     4    8   11   11     3    12
φ6     3    7   11    7    12  S6,8
φ7    12   11    3   12  S7,7
φ8     8   11   12  S8,6
Table 6.5.
      φ0   φ3   φ5   φ6   φ7   φ8
φ0     0    8    4    3   12    8
φ3     8    3    8    7   11   11
φ5     4    8   11   11    3   12
φ6     3    7   11    7   12    4
φ7    12   11    3   12    8
φ8     8   11   12    4
Table 6.6.
      P1  P2  P3  P4  P5  P6  P7  P8  P9  P10 P11 P12 P13 P14 P15 P16
φ0     1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
φ3     0   0   3   2   3   3   2   2  14  13  10   8  15  12   9  11
φ5     0   0   1   1   4   5   6   7  12  15   9  11   5   4   7   6
φ6     0   0   2   3   2   2   3   3   8  10  14  13   9  11  12  15
θ      3   3   3  10  11  13   9  15   8   3  13   9   0   0   9   0
In general, since θ(Pi) = φ6(Pi) + 6φ5(Pi) + 8φ3(Pi) + 3φ0(Pi), we use the pre-computed values φj(Pi) to locate the zeroes of θ. It takes 4 × 16 = 64 multiplications. For the present case, the zero set M is {P13, P14, P16}. Using Proposition 6.13, we have to solve the last set of equations as follows:
Exercises
(1) Show that [0, 4, 12, 12, 1, 0, 0, 13, 12, 12, 3, 11, 0, 0, 0, 0] is a code word
in Example 2.
(2) Show that [13, 9, 9, 4, 0, 14, 1, 0, 13, 4, 9, 6, 0, 0, 0, 0] is a code word in
Example 2.
(3) Show that [14, 2, 12, 2, 0, 10, 0, 1, 2, 9, 12, 14, 0, 0, 0, 0] is a code word in
Example 2.
(4) Show that [10, 1, 14, 6, 0, 9, 0, 0, 13, 10, 15, 3, 1, 0, 0, 0] is a code word in
Example 2.
(5) Show that [0, 6, 3, 9, 0, 5, 0, 0, 6, 0, 5, 3, 0, 1, 0, 0] is a code word in
Example 2.
(6) Show that [1, 5, 14, 12, 0, 2, 0, 0, 13, 8, 5, 5, 0, 0, 1, 0] is a code word in
Example 2.
(7) Show that [1, 12, 9, 11, 0, 6, 0, 0, 14, 10, 5, 9, 0, 0, 0, 1] is a code word in
Example 2.
(8) Find a generator matrix for the code in Example 2.
(9) Find a check matrix for the code in Example 2.
(10) Write a computer program to decode the code of Example 2.
(11) Using DU algorithm, write down the details of decoding the geometric
Goppa code Cp(B, 37P) based on $x_0^5 + x_1^5 + x_2^5$ with the ground field F_{2^4} (cf. Section 6.8).
Appendices
Appendix A
Convolution Codes
A.1. Representation
[Encoder diagram: the input stream f(x) = (· · · , a1, a0) is split through two memory units (memor1, memor2), producing the output streams g1(x) = (· · · , b1, b0) and g2(x) = (· · · , c1, c0).]
where
b0 = a0 , b1 = a1 , bn = an + an−2 for all n>1,
c0 = a0 , c1 = a1 + a0 , cn = an + an−1 + an−2 for all n>1.
Or we simply write
$$g_1(x) = (1 + x^2)f(x), \qquad g_2(x) = (1 + x + x^2)f(x).$$
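A minimal Python sketch of this encoder (our own illustration, not the book's) follows; it keeps the last two input bits in two memory cells, exactly as the recurrences above describe:

```python
# Our sketch of the rate-1/2 convolutional encoder g1 = (1 + x^2) f,
# g2 = (1 + x + x^2) f over GF(2), using two memory cells.
def encode(bits):
    """bits: list of input bits a0, a1, ...; returns (g1_bits, g2_bits)."""
    m1 = m2 = 0              # memory cells holding a_{n-1}, a_{n-2}
    g1, g2 = [], []
    for a in bits:
        g1.append(a ^ m2)          # b_n = a_n + a_{n-2}
        g2.append(a ^ m1 ^ m2)     # c_n = a_n + a_{n-1} + a_{n-2}
        m1, m2 = a, m1             # shift the register
    return g1, g2

# example: the impulse f(x) = 1 yields the generator coefficients
print(encode([1, 0, 0, 0]))   # ([1, 0, 1, 0], [1, 1, 1, 0])
```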
It seems that the encoder is simply a multiplication by a polynomial. However, there is a catch: a multiplication by x (a delay of time by 1) should be considered as invertible! So, we should enlarge the polynomial ring F[x] to the Laurent polynomial ring
$$F[x]_{(x)} = \left\{ \frac{h(x)}{x^d} : h(x) \in F[x],\ d \ge 0 \right\}.$$
If we let the Laurent polynomial ring F[x]_(x) act on the power series ring F[[x]], then we find it is not even closed under the action induced by x⁻¹. The natural thing to do is to enlarge the power series ring F[[x]] to the meromorphic function field F((x)). Recall that
$$F_2((x)) = \left\{ \sum_{i=-m}^{\infty} a_i x^i : a_i \in F_2,\ m \in \mathbb{Z} \right\}.$$
single error for g1(x), namely replacing g1(x) by g1(x) + x^n, then the inverse will differ from f(x) at infinitely many places. Any encoder with the last problem will be called a catastrophic encoder and will not be used; we also avoid a decoder requiring infinitely many memory units.
It is clear that if we are allowed only one message series g1(x) to decode, then the only good encoders which take one incoming stream of data f(x) and produce one stream of data g1(x) are multiplications by x^n. Those are uninteresting. We should consider using several message series to decode to find one message series. Let us first study a naive technique of combining and splitting data streams. Let
$$f(x) = \sum_{i=-m}^{\infty} a_i x^i,$$
$$g_j(x) = \sum_{i=-m}^{\infty} b_{ji} x^i \quad \text{for } j = 1, 2, \ldots, r,$$
$$h(x) = \sum_{j=0}^{r-1} x^j g_j(x^r).$$
It means that we may split one stream of data f(x) into n streams of data hj(x) for j = 0, 1, . . . , n − 1, in symbols Sn(f(x)) = [h0(x), h1(x), . . . , hn−1(x)], and combine r streams of data gj(x) for j = 0, 1, . . . , r − 1 into one stream of data h(x), in symbols Cr(g0(x), . . . , gr−1(x)) = h(x). The splitting and combining operations are one-to-one and onto maps between F((x)) and F((x))^r or F((x))^n, while they are non-linear with respect to the field F((x)).
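The combining map Cr is just the interleaving of coefficient streams, and Sn is its inverse. A small Python sketch of ours, acting on finite coefficient lists (so the bookkeeping of the formal series exponents is ignored):

```python
# Our toy model of C_r (combine by interleaving) and S_n (split), acting on
# finite coefficient lists a_0, a_1, ... rather than full Laurent series.
def combine(streams):
    """C_r: h receives coefficient i of g_j at position i*r + j."""
    r, length = len(streams), len(streams[0])
    return [streams[k % r][k // r] for k in range(r * length)]

def split(h, n):
    """S_n: the inverse operation, de-interleaving one stream into n."""
    return [h[j::n] for j in range(n)]

g0, g1_, g2_ = [1, 0, 1], [0, 1, 1], [1, 1, 0]
h = combine([g0, g1_, g2_])
print(h)                               # [1, 0, 1, 0, 1, 1, 1, 1, 0]
print(split(h, 3) == [g0, g1_, g2_])   # True
```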
The way of splitting a data stream f(x) in Section A.1 lies outside the method described in Section A.2. Let us have a detailed study of it. The splitting
Using some linear algebra, we may rewrite the above matrix equations as
$$f(x) \cdot \begin{bmatrix} 1 \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 1 + x^2 & 1 + x + x^2 \\ x & 1 + x \end{bmatrix} = \begin{bmatrix} g_1(x) & g_2(x) \end{bmatrix}.$$
The above exhibits the well-known Smith normal form of a matrix over the P.I.D. F2[x], as follows:
$$M = A\Gamma B,$$
where (1) both A and B are invertible and their inverses have entries in R, and (2) Γ is in diagonal form with the invariant factors γi of M on the diagonal. Note that γ1 | γ2 | · · · | γr.
Proof. Omitted.
We conclude that with only finitely many memory units (in fact, at most 4), we may recover f(x) if there is no error in g1(x) and g2(x). Furthermore, if there are single errors in g1(x) and g2(x), say adding x^n and x^m to them, then the decoding results differ from f(x) by polynomials only, not by infinitely long meromorphic functions. Therefore, the encoder is not a catastrophic encoder.
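From the Smith normal form above one can read off an explicit inverse: since the determinant of the 2 × 2 factor is 1, a short computation (ours, not spelled out in the text) gives f(x) = (1 + x)g1(x) + x g2(x). The following Python sketch verifies this with GF(2)[x] polynomials encoded as bitmask integers:

```python
# Our check that f = (1+x)*g1 + x*g2 recovers the input: polynomials over
# GF(2) are encoded as integers (bit i = coefficient of x^i).
def pmul(a, b):
    """Carry-less (GF(2)[x]) multiplication of bitmask polynomials."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        b >>= 1
    return p

f  = 0b1011001                      # an arbitrary test message f(x)
g1 = pmul(f, 0b101)                 # (1 + x^2) f
g2 = pmul(f, 0b111)                 # (1 + x + x^2) f
print(pmul(g1, 0b11) ^ pmul(g2, 0b10) == f)   # (1+x)g1 + x*g2 = f -> True
```

The multipliers 1 + x and x use only a few memory cells, matching the claim that finitely many memory units suffice.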
[Figure A.1 is lost in this extraction.]
[Figure A.2: a trellis over times t = 1, . . . , 8 with states 00, 10, 01, 11; only the state labels, the time axis, and the right-hand column of numbers (3, 3, 2, 3) survive extraction.]
To tile the real plane with tiles of the same size and shape, one may use equilateral triangles, squares, or hexagons; for all other shapes, there are always empty spaces left. Such a tiling is tight. We may instead consider the problem of filling the plane with discs of the same size, with gaps allowed. One knows from experience that the following arrangement of discs is tight, and in fact, it can be proved mathematically.
If the sender selects the centers of the discs as the permissible code words, and the receiver receives a point which is slightly different from the point of the original message, we believe that the distance between those two points indicates the measure of the error that occurred. As long as the received point is in a disc (which is likely), the receiver will decode it as the center of the disc. Naturally, we want to pack the space in the most efficient way, so that we may have the largest possible number of spheres in a given region of the plane, which corresponds to the largest rate of information. We shall generalize the sphere-packing problem to higher dimensions.
Let us consider the beautiful [23, 12] Golay code. In the vector space V = F_2^{23}, a ball of radius 3 centered at [00 · · · 00] contains at least
$$\sum_{i=0}^{3} \binom{23}{i} = 2048 = 2^{11}$$
points. It is easy to see that any ball of radius 3 contains exactly 2^{11} elements. In the vector space V, there are 2^{23} elements, so it is possible to have 2^{12} balls of radius 3 that do not overlap. Indeed this happens (see the following), and we have a code with 2^{12} code words of length 23 and information rate 12/23 = 0.52.
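The counting behind this packing claim is easy to verify; a small Python check of ours:

```python
# Our check of the sphere-packing count for the [23,12] binary Golay code.
from math import comb

ball = sum(comb(23, i) for i in range(4))   # points in a radius-3 ball
print(ball)                                  # 2048 = 2**11
print(2**12 * ball == 2**23)                 # 2^12 disjoint balls fill F_2^23
```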
We shall follow the way of constructing the Reed–Solomon code to construct the Golay code. Let us consider the field F_{2^{11}}. All non-zero elements of F_{2^{11}} satisfy the following equation:
$$x^{2^{11}-1} + 1 = 0.$$
Note that
$$2^{11} − 1 = 23 \times 89.$$
Therefore, F_{2^{11}} contains a 23rd root of unity. We have the following decomposition:
$$x^{23} + 1 = (x + 1)(x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1)(x^{11} + x^{10} + x^6 + x^5 + x^4 + x^2 + 1) = (x + 1)g(x)g^*(x),$$
where g^*(x) = x^{11}g(1/x). We have the following definition:
Note that the [23, 12] Golay code is not a Reed–Solomon code. We have the following proposition.
Proposition B.2. It is possible to have 2^{12} balls of radius 3 that do not overlap and that pack V = F_2^{23}.
Proof. Omitted.
The other Golay code is the [11, 6] Golay code over F3, where balls of radius 2 densely pack F_3^{11}.
The perfect codes over a finite field are like the tilings of the plane. The only perfect codes are (1) repetition codes of odd length, (2) q-ary Hamming codes (cf. Exercise 1.5 (1), (2), (4)), and (3) the [23, 12] binary Golay code and the [11, 6] ternary Golay code. Otherwise, it is impossible to use balls of the same radius to tile a vector space F_q^n. Let us consider the simple case over F2. We study
A(n, d) = the largest integer M such that there exist M code words {x1, . . . , xM} with d(xi, xj) ≥ d if i ≠ j.
Let d be odd. Then, the number A(n, d) is the maximal number of ways of putting balls of radius ⌊d/2⌋ in F_2^n without their touching each other; there might be some extra space allowed. This is similar to the sphere-packing problem in R³. We may form a code C with C = {x1, . . . , xM} which will correct ⌊d/2⌋ errors. In general, it is difficult to find the number A(n, d) except in some easy cases with n, d small or d ≥ n.
We may put more conditions on the code C to make it easier to construct. Let us define the distance d(x, C) and the covering radius ρ(C) of a set C as
$$d(x, C) = \min\{d(x, c) : c \in C\},$$
$$\rho(C) = \max\{d(x, C) : x \in F_2^n\},$$
$$d = \min\{d(c_1, c_2) : c_1 \ne c_2 \in C\}.$$
We have the following definition.
However, to compute A(z) for large q^k, we still have to look over all q^k vectors in C. It is still a formidable task. If n − k is small, then we may be able to compute the weight enumerator of the dual code C⊥ first. The following proposition is due to MacWilliams.
Proposition B.4 (MacWilliams identity). Let A(z) be the weight enumerator of an [n, k] linear code C and B(z) be the weight enumerator of the dual code C⊥. Then, A(z) and B(z) are related by the following formula:
$$B(z) = \frac{1}{q^k} \sum_{i=0}^{n} A_i (1 − z)^i (1 + (q − 1)z)^{n−i}.$$
Example B2: Let C be the [7, 4] Hamming code. Then, its dual code C⊥ is a [7, 3] code. It turns out that each non-zero code word in C⊥ has weight 4. Thus, the weight enumerator of C⊥ is 1 + 7z⁴. Therefore, the weight enumerator of C, which is considered as the dual code of C⊥, is
$$\frac{1}{8}\left[(1 + z)^7 + 7(1 − z)^4(1 + z)^3\right] = 1 + 7z^3 + 7z^4 + z^7.$$
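The identity is easy to check mechanically; the following sympy sketch of ours applies the formula to B(z) = 1 + 7z⁴ with q = 2 and k = 3 (viewing C as the dual of C⊥):

```python
# Our sympy check of the MacWilliams identity for the [7,4] Hamming code,
# starting from the dual's enumerator 1 + 7z^4 (a [7,3] code, so q^k = 8).
from sympy import symbols, expand, Rational

z = symbols('z')
n, q, k_dual = 7, 2, 3
B = {0: 1, 4: 7}          # weight distribution of the dual code

A = expand(Rational(1, q**k_dual) *
           sum(c * (1 - z)**i * (1 + (q - 1)*z)**(n - i)
               for i, c in B.items()))
print(A)    # z**7 + 7*z**4 + 7*z**3 + 1
```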
Appendix C
There are several ways to present the Reed–Muller codes, a family of binary codes correcting several errors. The codes were discovered by Muller in 1954 [29], and the decoding method is due to Reed, also in 1954 [31]. The decoding method is easy; hence, it has been used on several occasions. For instance, during 1969–1977, all of NASA's Mariner-class deep-space probes were equipped with a Reed–Muller code (namely RM(5, 1); see the following definition).
Recall from Section 5.1 that a linear coding theory has the following diagram:
$$\text{message space } F_q^k \xrightarrow{\ \sigma_1\ } \text{function space} \xrightarrow{\ \sigma_2\ } \text{word space } F_q^n.$$
The first map σ1 is injective; thus, we use functions to rename the messages. The second map σ2 is an injective map, and the image σ2σ1(F_q^k) is the code space. Usually, the map σ2 is an evaluation map which evaluates a function f at an ordered n-tuple of points (P1, . . . , Pn); thus, it maps a function f to [f(P1), . . . , f(Pn)] ∈ F_q^n. Note that σ2σ1 sends the message space to the word space; certainly, we do not want to send any non-zero message to zero. Thus, we require that the composition σ2σ1 be an injective map on the message space.
We shall use the following P(m, r) as the function space. Let P(m, r) denote the set of all polynomials of degree ≤ r ≤ m in m variables (x1, . . . , xm) over F2. Note that, computationally, over F2 we always have x² = x; hence, all monomials can be reduced, in the computational sense, to products of distinct xi.
Let us consider F_2^m as the ground vector space. We have a simple way to represent P(m, r). Let n = 2^m. Then, there are n = 2^m vectors (or points) in F_2^m; let (v0, v1, . . . , vn−1) denote a list of all 2^m binary vectors in F_2^m in some order. Then, for each f ∈ P(m, r), we may define f(vj) = h(aj1, . . . , ajm), where vj = [aj1 · · · ajm]. Furthermore, we may use integers nj to represent the vj, as we define nj = Σi 2^{i−1}aji. We have the following example of using integers to represent points in F_2^m.
Example C2: Let us consider m = 4. Then, we have the following Table C.1 for the values of the polynomials x1, x2, x3, x4. Numerically, xi goes to 2^{i−1}, and the coefficients a1, a2, a3, a4 in the table determine the monomial ∏(xi)^{ai}, which in turn goes to Σi 2^{i−1}ai, where ai = 1 or 0.
Table C.1.
      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
x1    0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
x2    0  0  1  1  0  0  1  1  0  0  1  1  0  0  1  1
x3    0  0  0  0  1  1  1  1  0  0  0  0  1  1  1  1
x4    0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1
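The table is simply binary counting; a two-line Python sketch of ours reproduces it:

```python
# Our reconstruction of Table C.1: coordinate x_i of the point labeled n
# is bit i-1 of n, for n = 0, ..., 15.
for i in range(1, 5):
    print(f"x{i}", [(n >> (i - 1)) & 1 for n in range(16)])
```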
$$k = 1 + \binom{m}{1} + \cdots + \binom{m}{r}.$$
Indeed, the number of monomials that are products of distinct xi of degree at most r is $1 + \binom{m}{1} + \cdots + \binom{m}{r}$.
Therefore, we need only to show that no non-zero polynomial f is sent to the zero vector in F_2^n by σ2, i.e., that σ2(f) = (f(v0), f(v1), . . . , f(vn−1)) ≠ (0, 0, . . . , 0). Suppose that h(x1, . . . , xm) = f is a linear combination of those monomials and f(vi) = 0 for all i. If h(0, x2, . . . , xm) ≠ 0, then we can find values for x2, . . . , xm so that it is not zero, by mathematical induction. If h(0, x2, . . . , xm) = 0, then h(x1, . . . , xm) = x1g(x2, . . . , xm) and h(1, x2, . . . , xm) = g(x2, . . . , xm), so that we can find values for x2, . . . , xm such that h(vi) is not zero, by mathematical induction. In either case, we have a contradiction if σ2(f) = (0, 0, . . . , 0).
Proof. We first show that d ≤ 2^{m−r}. Let us use the notations of the proof of the preceding proposition. Let h = x1x2 · · · xr. Then, h = 0 whenever x1 = 0 or x2 = 0 or · · · or xr = 0. Using inclusion–exclusion to compute the number of zeroes among v0, . . . , vn−1, we may conclude that there are 2^m − 2^{m−r} zeroes among v0, . . . , vn−1. Therefore, the minimal weight, hence the minimal distance, is at most 2^{m−r}.
We wish to prove that d ≥ 2^{m−r}. Let us consider a polynomial h(x1, x2, . . . , xm) ≠ 0 in which every variable appears at most once in every term. We have h(0, x2, . . . , xm) = 0 =⇒ h(x1, . . . , xm) = x1g(x2, . . . , xm). Similarly, h(1, x2, . . . , xm) = 0 =⇒ h(x1, . . . , xm) = (x1 − 1)p(x2, . . . , xm). After we try every xi, we conclude that either there is an xi, say x1, such that (1) h(0, x2, . . . , xm) ≠ 0 and h(1, x2, . . . , xm) ≠ 0, or (2) h = ∏_{i=1}^{m}(xi − δi), where δi = 0 or 1. The second case happens only if r = m, and then 2^{m−r} = 1, so our proposition is certainly true. In the first case, both h(0, x2, . . . , xm) = g(x2, . . . , xm) ≠ 0 and h(1, x2, . . . , xm) = p(x2, . . . , xm) ≠ 0. By induction on the number of variables, we conclude that g(x2, . . . , xm) and p(x2, . . . , xm) each have at most 2^{m−1} − 2^{m−1−r} zeroes. Since the sets of zeroes of g(x2, . . . , xm) and p(x2, . . . , xm) lie in x1 = 0 and x1 = 1 respectively, they are disjoint. Therefore, h has at most (2^{m−1} − 2^{m−1−r}) + (2^{m−1} − 2^{m−1−r}) = 2^m − 2^{m−r} zeroes. So the minimal weight, hence the minimal distance, is at least 2^{m−r}.
Puncture: Any code can be punctured by deleting some of its check symbols. We have an example of C2 → C0.
Expurgating: A cyclic code generated by the polynomial g(x) can be expurgated to form another cyclic code by multiplying an additional factor into the generator polynomial. The most common expurgated code is the code C1 generated by g(x)(x − 1). We have the following example of C0 → C1:
$$H_1^T = \begin{bmatrix} 1 & 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}.$$
00000
11100
00111.
Note that the minimal distance of the code for the three letters is 3, and we expect to correct 1 error. Then, we may use some encoding scheme, say an RS-code, to encode the whole e-mail. The way the letters are encoded will be called the inner code, and the way the e-mail is encoded will be called the outer code. In general, we may encode any message by a code C1 (like the encoding of all letters), which is called the inner code, and then encode the code words of C1 by another code C2, which is called the outer code. The most important development along this direction is the Justesen codes [28].
Example C2: Let C be a [7, 3] binary linear code with the check matrix
$$H = \begin{bmatrix} 1 & 1 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 & 1 \end{bmatrix}.$$
c0 = c1 + c3
c0 = c2 + c4
c0 = c5 + c6 .
Now, let us assume the channel is binary symmetric with error probability ℘ < 1/2. Then, we have
$$c_0 = \begin{cases} r_1 + r_3 & \text{if } e_1 + e_3 = 0, \\ r_2 + r_4 & \text{if } e_2 + e_4 = 0, \\ r_5 + r_6 & \text{if } e_5 + e_6 = 0, \\ r_0 & \text{if } e_0 = 0. \end{cases}$$
If there is at most one error, then we may use a majority vote to decide which value is correct for c0. If two errors are allowed and there is a 2–2 tie in the vote, one of the votes has e0 = 0: since e0 = 0 has probability 1 − ℘ and ei + ej = 0 has probability (1 − ℘)² + ℘² = 1 − ℘ − ℘(1 − 2℘) < 1 − ℘, the vote with e0 = 0, i.e., c0 = r0, is favored.
Example C3: Let us consider RM(3, 1), which is an [8, 4, 4] code. We have the following Table C.2.
Table C.2.
0, 1, 2, 3, 4, 5, 6, 7,
m0 1, 1, 1, 1, 1, 1, 1, 1,
m1 0, 1, 0, 1, 0, 1, 0, 1,
m2 0, 0, 1, 1, 0, 0, 1, 1,
m3 0, 0, 0, 0, 1, 1, 1, 1,
c 0 = m0 , c1 = m0 + m1 , c2 = m0 + m2 , c3 = m0 + m1 + m2 , c4 = m0 + m3 ,
c5 = m0 + m1 + m3 , c6 = m0 + m2 + m3 , c7 = m0 + m1 + m2 + m3 .
m1 = c 0 + c 1 = c 2 + c 3 = c 4 + c 5 = c 6 + c 7 ,
m2 = c 0 + c 2 = c 1 + c 3 = c 4 + c 6 = c 5 + c 7 ,
m3 = c 0 + c 4 = c 1 + c 5 = c 2 + c 6 = c 3 + c 7 .
Clearly, we do not know the ci's and only know the ri's. Assume that there is at most 1 error. Let us consider the value of m1; the values of m2, m3 can be treated similarly. At least three of the four values r0 + r1, r2 + r3, r4 + r5, r6 + r7 must be correct. Therefore, a majority vote will decide the correct value of m1. If there are two errors, then the vote might be tied; in that case, the value is indeterminate. After we find the values for m1, m2, m3, we are left to find m0. Note that, once m1, m2, m3 are known, the vector r + m1x1 + m2x2 + m3x3 agrees with the constant vector m0[1, . . . , 1] in a majority of its eight coordinates, so m0 is again decided by a majority vote.
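A runnable Python sketch of this majority-logic decoder (our own illustration of Reed's method for RM(3, 1), not the book's code) follows:

```python
# Our majority-logic decoder for RM(3,1), an [8,4,4] code: recovers
# m1, m2, m3 from four parity sums each, then m0 by a final vote.
def majority(votes):
    return 1 if sum(votes) * 2 > len(votes) else 0

def rm31_decode(r):
    pairs = {1: [(0, 1), (2, 3), (4, 5), (6, 7)],   # m1 = c0+c1 = ...
             2: [(0, 2), (1, 3), (4, 6), (5, 7)],   # m2 = c0+c2 = ...
             3: [(0, 4), (1, 5), (2, 6), (3, 7)]}   # m3 = c0+c4 = ...
    m = {i: majority([r[a] ^ r[b] for a, b in pairs[i]]) for i in pairs}
    # subtract m1*x1 + m2*x2 + m3*x3; what is left should be m0*[1,...,1]
    residue = [r[j] ^ (m[1] * (j & 1)) ^ (m[2] * ((j >> 1) & 1))
                    ^ (m[3] * ((j >> 2) & 1)) for j in range(8)]
    m[0] = majority(residue)
    return [m[0], m[1], m[2], m[3]]

# encode the message [m0,m1,m2,m3] = [1,0,1,1], flip one bit, and decode
c = [(1 ^ (0 * (j & 1)) ^ (1 * ((j >> 1) & 1)) ^ (1 * ((j >> 2) & 1)))
     for j in range(8)]
r = c[:]; r[5] ^= 1                      # one channel error
print(rm31_decode(r))                    # [1, 0, 1, 1]
```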
There is another code which rivals the turbo codes: the low-density parity-check (LDPC) codes [23], created by Robert Gallager, then a Ph.D. student at MIT, in 1958. This code is sometimes called the Gallager code. First, Gallager used sparse matrices for the generator matrices. Second, Gallager used a decoder for every bit and let the decoders talk among themselves, thus creating a huge rumor mill with thousands or tens of thousands of talkers. The patent rights for LDPC were held by Codex Corp. until they expired, without ever being used; one of the reasons was that it was technically infeasible to create the rumor mill in the 1950s.
Appendix D
Let
$$r(x) = r_0 + r_1x + \cdots + r_{n-1}x^{n-1} = \text{the received word},$$
$$S(x) = \sum_{i=1}^{2t} r(\gamma^i)x^i.$$
Note that we already know the existence of σ(x) and ω(x); it follows from Proposition 2.56 that they are unique. The only problem is how to find them fast. Note that we know deg(σ(x)) ≤ t, deg(ω(x)) ≤ t, and deg(S(x)) ≤ 2t. Berlekamp's idea is not to solve the above system of linear equations directly, but rather to find a sequence of σ^{(k)}(x) and ω^{(k)}(x) with σ^{(2t)}(x) = σ(x) and ω^{(2t)}(x) = ω(x). The key equation is generalized to a sequence of equations of the following form:
$$(1 + S(x))\sigma^{(k)}(x) \equiv \omega^{(k)}(x) \mod x^{k+1}, \qquad (D1_k)$$
with the degree restrictions deg(σ^{(k)}(x)), deg(ω^{(k)}(x)) ≤ (k + 1)/2.
We define σ^{(0)} = 1 and proceed inductively: suppose that we have constructed σ^{(k)}(x), ω^{(k)}(x); we want to define σ^{(k+1)}(x), ω^{(k+1)}(x). Let us look at one more term as follows:
$$(1 + S(x))\sigma^{(k)}(x) \equiv \omega^{(k)}(x) + \Delta^{(k)}x^{k+1} \mod x^{k+2}, \qquad (E_k)$$
where Δ^{(k)}, a field element, is the coefficient of x^{k+1} in the above equation. If Δ^{(k)} = 0, then we may take σ^{(k+1)}(x) = σ^{(k)}(x) and ω^{(k+1)}(x) = ω^{(k)}(x) and continue our inductive process of construction. Suppose Δ^{(k)} ≠ 0. Then, it is complicated to define them, and we have to do something more.
We introduce two more functions, τ^{(k)} and γ^{(k)}. Let us define them by the following equations:
$$\Delta(\sigma^{(k)}) = \sigma^{(k)} − \sigma^{(k+1)} = \Delta^{(k)}x\tau^{(k)},$$
$$\Delta(\omega^{(k)}) = \omega^{(k)} − \omega^{(k+1)} = \Delta^{(k)}x\gamma^{(k)}.$$
Inductively, after we define τ^{(k)} and γ^{(k)}, we have to define τ^{(k+1)} and γ^{(k+1)}. Let us take (E_k) − (D1_{k+1}) and simplify. Then, we deduce the following equation:
$$(1 + S(x))\tau^{(k)} \equiv \gamma^{(k)} + x^k \mod x^{k+1}. \qquad (D2_k)$$
Inductively, we define τ^{(k+1)} and γ^{(k+1)} as follows. We may use one of the following two ways: if Δ^{(k)} = 0, let
$$\tau^{(k+1)} = x\tau^{(k)}, \qquad \gamma^{(k+1)} = x\gamma^{(k)}; \qquad (D3)$$
or, if Δ^{(k)} ≠ 0, let
$$\tau^{(k+1)} = \frac{\sigma^{(k)}}{\Delta^{(k)}}, \qquad \gamma^{(k+1)} = \frac{\omega^{(k)}}{\Delta^{(k)}}. \qquad (D4)$$
The critical thing is to control the degrees of σ^{(k)}(x), ω^{(k)}(x); we wish them to be less than or equal to (k + 1)/2.
Initially, Berlekamp adds two more integer-valued functions D(k), B(k) and defines σ^{(0)} = 1, ω^{(0)} = 1, τ^{(0)} = 1, γ^{(0)} = 0, and the integer values D(0) = 0, B(0) = 0. Inductively, we have two cases:
$$\text{Case 1:}\quad \Delta^{(k)} = 0, \quad\text{or}\quad \Delta^{(k)} \ne 0,\ D(k) > \tfrac{k+1}{2}, \quad\text{or}\quad \Delta^{(k)} \ne 0,\ D(k) = \tfrac{k+1}{2},\ B(k) = 0;$$
$$\text{Case 2:}\quad \Delta^{(k)} \ne 0,\ D(k) < \tfrac{k+1}{2}, \quad\text{or}\quad \Delta^{(k)} \ne 0,\ D(k) = \tfrac{k+1}{2},\ B(k) = 1.$$
In Case 1, we define τ^{(k+1)}, γ^{(k+1)} by equation (D3) and set (D(k + 1), B(k + 1)) = (D(k), B(k)).
Proposition D.1. We always have the following: (1) deg(σ^{(k)}) ≤ D(k), with equality if B(k) = 1. (2) deg(ω^{(k)}) ≤ D(k) − B(k), with equality if B(k) = 0. (3) deg(τ^{(k)}) ≤ k − D(k), with equality if B(k) = 0. (4) deg(γ^{(k)}) ≤ k − D(k) − (1 − B(k)), with equality if B(k) = 1.
Proposition D.3. Let σ̃ and ω̃ be any pair of polynomials which satisfy
$$\tilde\sigma(0) = 1 \quad\text{and}\quad (1 + S)\tilde\sigma \equiv \tilde\omega \mod x^{k+1}.$$
Let D = max{deg σ̃, deg ω̃}. Then, there exist polynomials u and v such that
$$u(0) = 1, \quad v(0) = 0, \quad \deg u \le D − D(k), \quad \deg v \le D − [k − D(k)],$$
$$\tilde\sigma = u\sigma^{(k)} + v\tau^{(k)}, \qquad \tilde\omega = u\omega^{(k)} + v\gamma^{(k)}.$$
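For comparison, here is a compact Python sketch of ours of the closely related Berlekamp–Massey iteration over F2, which produces a shortest error-locator polynomial from a syndrome sequence. It mirrors the bookkeeping above (a shifted copy of an older σ plays the role of τ), but it is not the book's exact formulation.

```python
# Our sketch of Berlekamp-Massey over GF(2): polynomials are bitmask ints
# (bit i = coefficient of x^i); s is the list of syndrome bits s_0, s_1, ...
def berlekamp_massey(s):
    """Return (sigma, L): a shortest LFSR (as a bitmask polynomial) for s."""
    sigma, b = 1, 1      # current and previous locator polynomials
    L, m = 0, 1          # LFSR length and shift since last length change
    for n, sn in enumerate(s):
        # discrepancy d = s_n + sigma_1 s_{n-1} + ... + sigma_L s_{n-L}
        d = sn
        for i in range(1, L + 1):
            d ^= ((sigma >> i) & 1) & s[n - i]
        if d == 0:
            m += 1
        elif 2 * L <= n:
            sigma, b, L, m = sigma ^ (b << m), sigma, n + 1 - L, 1
        else:
            sigma ^= (b << m)
            m += 1
    return sigma, L

# the sequence 1, 0, 1, 0 satisfies s_n = s_{n-2}, so sigma = 1 + x^2
print(berlekamp_massey([1, 0, 1, 0]))   # (0b101, 2)
```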
[18] Burton, H.O. and Weldon, E.J. Cyclic product codes. IEEE Trans. Inf.
Theory, IT-11: 433–439, 1965.
[19] Elias, P. Coding for noisy channels. IRE Conv. Record, part 4, 37–46, 1955.
[20] Duursma, I.M. Decoding codes from curves and cyclic codes. Ph.D. dissertation, Eindhoven University of Technology, 1993.
[21] Feng, G.L. and Rao, T.R.N. A simple approach for construction of algebraic-
geometric codes from affine plane curves. IEEE Trans. on Inf. Theory, 40,
1003–1012, 1994.
[22] Forney, G.D.Jr. Generalized minimum distance decoding. IEEE Trans. on
Inf. Theory, 12(2): 125–131, April 1966.
[23] Gallager, R.G. Low-density parity-check codes. IRE Trans. Inf. Theory.,
IT-8: 21–28. January 1962.
[24] Ghorpade, S. and Datta, M. Remarks on Tsfasman–Boguslavsky conjec-
ture and higher weights of projective Reed–Muller codes. In Arithmetic,
Geometry, Cryptography and Coding Theory, Providence, RI: AMS, 2017,
pp. 157–169.
[25] Goppa, V.D. A new class of linear error-correcting codes. Probl. Inf. Trans., 6: 207–21, 1970.
[26] Hartmann, C.R.P. and Tzeng, K.K. Generalizations of the BCH bound. Inf. Control, 20: 489–498, 1972.
[27] Ihara, Y. Congruence relations and Shimura curves. Proc. Symp. Pure Math., 33(2): 291–311, 1979.
[28] Justesen, J. A class of constructive asymptotically good algebraic codes. IEEE Trans. Inf. Theory, 18: 652–656, 1972.
[29] Muller, D.E. Metric Properties of Boolean Algebra and Their Application to
Switching Circuits. Report No. 46, Digital Computer Laboratory, Univ. of
Illinois, April 1954.
[30] Peterson, W.W. Encoding and error-correction procedures for the Bose-
Chaudhuri codes. IRE Trans. Inf. Theory, October 1960.
[31] Reed, I.S. A class of multiple-error-correcting codes and the decoding
scheme. J. IRE Trans. Inf. Theory, September 1954.
[32] Reed, I.S. and Solomon, G. Polynomial codes over certain finite fields. J. Soc.
Ind. Appl. Math., June 1960.
[33] Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J.,
27: 379–423, 623–656, 1948.
[34] Skorobogatov, A.N. and Vlǎduţ, S.C. On the decoding of algebraic-
geometric codes. IEEE Trans. Inf. Theory, 36: 1051–1060, 1990.
[35] Sugiyama, Y., Kasahara, M., Hirasawa, S., and Namekawa, T. A method for solving key equation for decoding Goppa codes. Inf. Control, 27: 87–99, 1975.
[36] Tsfasman, M.A., Vlǎduţ, S.C., and Zink, T. Modular curves, Shimura curves and Goppa codes better than the Varshamov–Gilbert bound. Math. Nachr., 109: 21–28, 1982.
[37] Xing, C. Nonlinear codes from algebraic curves improving the Tsfasman-
Vlǎduţ-Zink bound. IEEE Trans. Inf. Theory, 49(7): 1652–1657, 2003.
References 239
[38] Hasse, H. Zur Theorie der abstrakten elliptischen Funktionenkörper I, II & III. Crelle's Journal, 175: 193–208, 1936.
[39] Serre, J.P. Sur le nombre des points rationnels d'une courbe algébrique sur un corps fini. C.R. Acad. Sc. Paris, 296: 397–402, 1983.
[40] Weil, A. Number of solutions of equations over finite field. Bull. Amer. Math.
Soc., 55: 497–508, 1949.
Index
A
abelian group, 5
absolutely irreducible, 98, 100, 104, 143, 145, 148
affine algebraic variety, 93, 95, 105
affine space, 90–93, 97
algebraic closure, 7, 41, 52, 63, 94, 102
algebraic degree, 98
algebraic geometry, vii, 89, 101, 105
algebraic integer, 131
algebraic object, 93–95
algebraic variety, 97–99, 101
algebraically closed, 90, 94–95
analytic function, 97
analytic geometry, 97
arithmetic, 102
associative law, 5
augmenting code, 228

B
ball, 24
Berlekamp algorithm, 234
Bézout's theorem, 93, 106
binary expansion, 17
birationally equivalent, 96, 109–110
birationally isomorphic, 103

C
calculus, 105, 156
canonical divisor, 120, 129, 181
catastrophic encoder, 211
Čech theory, 101
characteristic, 36, 59, 99, 155
check matrix, xvi, 10–11, 16–17, 49, 157, 160–161, 166, 171–174, 178, 199–200
check polynomial, 51, 63, 66, 68, 74, 79
check symbol, xvi, 10
Chevalley, C., vii–viii, 90, 108, 114, 117, 119–121, 123, 126, 130
Chinese Remainder Theorem, 47, 51
co-prime, 47, 110
code, 75
  BCH, 63, 65–67, 76, 78, 81
  block, xvii, 11
  classical Goppa, 80–85, 142, 147–148
  cyclic, 62–65, 74
  dual, 16, 148–149
  error-correcting, 142
  geometric Goppa, 78, 81, 89, 103, 143, 145, 148–149, 151, 159–160, 166, 179, 199, 206
  Golay, 218
R
radical ideal, 95
rank of code, 9, 82, 144, 147, 150, 157, 173, 199
rank of matrix, 192–195, 202–203
Rao, T.R.N., 160, 179
rate of distance, 22, 24, 76, 157
rate of information, 21–24, 28, 76, 82, 145, 156–157
rational function, 57–58, 69, 71, 83, 97, 105, 130, 142–143, 162, 164, 181–182, 194

S
sender, 160, 180
separable, 52, 59, 102, 130
sequential decoding, 213
Shannon's Theorem, 21–23, 27–28, 76, 153
Shimura, 153, 155–156
shortened code, 228
Singleton, R., 18
Singleton bound, 18–19, 144
singular locus, 102
singular point, 101–102