OXFORD APPLIED MATHEMATICS AND COMPUTING SCIENCE SERIES

A First Course
in Coding Theory

Raymond Hill
Oxford Applied Mathematics
and Computing Science Series

J. Anderson: A First Course in Combinatorial Mathematics (Second Edition)


D. W. Jordan and P. Smith: Nonlinear Ordinary Differential Equations (Second
Edition)
B. Carré: Graphs and Networks
G. D. Smith: Numerical Solution of Partial Differential Equations (Third
Edition)
S. Barnett and R. G. Cameron: Introduction to Mathematical Control Theory
(Second Edition)
A. B. Tayler: Mathematical Models in Applied Mechanics
R. Hill: A First Course in Coding Theory
P. Baxandall and H. Liebeck: Vector Calculus
P. Thomas, H. Robinson, and J. Emms: Abstract Data Types: Their Specification,
Representation, and Use
R. P. Whittington: Database Systems Engineering
J.J. Modi: Parallel Algorithms and Matrix Computation
D. J. Acheson: Elementary Fluid Dynamics
L. M. Hocking: Optimal Control: An Introduction to the Theory with
Applications
S. Barnett: Matrices: Methods and Applications
O. Pretzel: Error-Correcting Codes and Finite Fields
D.C. Ince: An Introduction to Discrete Mathematics, Formal System
Specification, and Z (Second Edition)
A. Davies and P. Samuels: An Introduction to Computational Geometry for
Curves and Surfaces
P. Grindrod: The Theory and Applications of Reaction-Diffusion Equations:
Patterns and Waves (Second Edition)
O. Pretzel: Error-Correcting Codes and Finite Fields (Student Edition)
RAYMOND HILL
University of Salford

A First Course
in Coding Theory

CLARENDON PRESS · OXFORD
Oxford University Press, Great Clarendon Street, Oxford OX2 6DP

Oxford New York


Athens Auckland Bangkok Bogota Bombay Buenos Aires
Calcutta Cape Town Dar es Salaam Delhi Florence Hong Kong Istanbul
Karachi Kuala Lumpur Madras Madrid Melbourne Mexico City
Nairobi Paris Singapore Taipei Tokyo Toronto Warsaw
and associated companies in
Berlin Ibadan

Oxford is a trade mark of Oxford University Press

Published in the United States
by Oxford University Press Inc., New York

© Raymond Hill, 1986

First published 1986


Reprinted 1988 (with corrections), 1990, 1991, 1993, 1994, 1996, 1997

All rights reserved. No part of this publication may be
reproduced, stored in a retrieval system, or transmitted, in any
form or by any means, without the prior permission in writing of Oxford
University Press. Within the UK, exceptions are allowed in respect of any
fair dealing for the purpose of research or private study, or criticism or
review, as permitted under the Copyright, Designs and Patents Act, 1988, or
in the case of reprographic reproduction in accordance with the terms of
licences issued by the Copyright Licensing Agency. Enquiries concerning
reproduction outside those terms and in other countries should be sent to
the Rights Department, Oxford University Press, at the address above.

This book is sold subject to the condition that it shall not,
by way of trade or otherwise, be lent, re-sold, hired out, or otherwise
circulated without the publisher’s prior consent in any form of binding
or cover other than that in which it is published and without a similar
condition including this condition being imposed
on the subsequent purchaser

British Library Cataloguing in Publication Data


Hill, Raymond
A first course in coding theory.—(Oxford
applied mathematics and computing science series)
1. Error-correcting codes (Information theory)
I. Title II. Series
519.4 QA268

Library of Congress Cataloging in Publication Data


Hill, Raymond, 1947-
A first course in coding theory.
(Oxford applied mathematics and computing science series)
Bibliography: p.
Includes index.
1. Error-correcting codes (Information theory)
I. Title. II. Series.
QA268.H55 1986 005.7’2 85-21588
ISBN 0 19 853803 0 (Pbk)

Printed in Northern Ireland by The Universities Press (Belfast) Ltd.


To
Susan, Jonathan, and Kathleen
Preface

The birth of coding theory was inspired by a classic paper of
Shannon in 1948. Since then a great deal of research has been
devoted to finding efficient schemes by which digital information
can be coded for reliable transmission through a noisy channel.
Error-correcting codes are now widely used in applications such
as returning pictures from deep space, design of registration
numbers, and storage of data on magnetic tape. Coding theory is
also of great mathematical interest, relying largely on ideas from
pure mathematics and, in particular, illustrating the power and
the beauty of algebra. Several excellent textbooks have appeared
in recent years, mostly at graduate level and assuming a fairly
advanced level of mathematical knowledge or sophistication. Yet
the basic ideas and much of the theory of coding are readily
accessible to anyone with a minimal mathematical background.
(For a recent article advocating the inclusion of algebraic coding
theory in the undergraduate curriculum, see Brinn (1984).)
The aim of this book is to provide an elementary treatment of
the theory of error-correcting codes, assuming no more than
high school mathematics and the ability to carry out matrix
arithmetic. The book is intended to serve as a self-contained
course for second or third year mathematics undergraduates, or
as a readable introduction to the mathematical aspects of coding
for students in engineering or computer science.
The first eight chapters comprise an introductory course which
I have taught as part of second year undergraduate courses in
discrete mathematics and in algebra. (There is much to be said
for teaching coding theory immediately after, or concurrently
with, a course in algebra, for it reinforces with concrete examples
many of the ideas involved in linear algebra and in elementary
group theory.) I have also used the text as a whole as a Master’s
course taken by students whose first degree is not necessarily in
mathematics. The last eight chapters are largely independent of
one another and so courses can be varied to suit requirements.
For example, Chapters 9, 10, 14, and 15 might be omitted by
students who are not specialist mathematicians.
The book is concerned almost exclusively with block codes for
correcting random errors, although the last chapter includes a
brief discussion of some other codes, such as variable length
source codes and cryptographic codes. The treatment throughout
is motivated by two central themes: the problem of finding the
best codes, and the problem of decoding such codes efficiently.
One departure from several standard texts is that attention is
by no means restricted to binary codes. Indeed, consideration of
codes over fields of order a prime number enables much of the
theory, including the construction and decoding of BCH codes,
to be covered in an elementary way, without needing to work
with the rather more complex fields of order $2^h$ ($h > 1$).
Another feature is the large number of exercises, at varying
levels of difficulty, at the end of each chapter. The inclusion of
the solutions at the end makes the book suitable for self-learning
or for use as a reading course. I believe that the best way to
understand a subject is by solving problems and so the reader is
urged to make good attempts at the exercises before consulting
the solutions.
Finally, it is hoped that the reader will be given a taste for this
fascinating subject and so encouraged to read the more advanced
texts. Outstanding amongst these is MacWilliams and Sloane
(1977); the size of its bibliography—nearly 1500 articles—is a
measure of how coding theory has grown since 1948. Also highly
recommended are Berlekamp (1968), Blahut (1983), Blake and
Mullin (1976), Cameron and van Lint (1980), Lin and Costello
(1983), van Lint (1982), McEliece (1977), Peterson and Weldon
(1972), and Pless (1982).
Salford
February 1985

Acknowledgements
I am grateful to Professors P. G. Farrell and J. H. van Lint, and
Drs J. W. P. Hirschfeld, R. W. Irving, L. O’Carroll, and R.
Sandling for helpful comments and suggestions, and to B.
Banieqbal for acquainting me with the work of Ramanujan,
which now features prominently in Chapter 11.
I should also like to thank Susan Sharples for her excellent
typing of the manuscript.
R.H.
Contents

Notation
1 Introduction to error-correcting codes
2 The main coding theory problem
3 An introduction to finite fields
4 Vector spaces over finite fields
5 Introduction to linear codes
6 Encoding and decoding with a linear code
7 The dual code, the parity-check matrix, and syndrome decoding
8 The Hamming codes
9 Perfect codes
10 Codes and Latin squares
11 A double-error correcting decimal code and an introduction to BCH codes
12 Cyclic codes
13 Weight enumerators
14 The main linear coding theory problem
15 MDS codes
16 Concluding remarks, related topics, and further reading
Solutions to exercises
Bibliography
Index

Notation

For the reader who is unfamiliar with the notation of modern set
theory, we introduce below all that is required in this book.
A set is simply a collection of objects. In this book we shall
make use of the following sets (among others):
R: the set of real numbers.
Z: the set of integers (positive, negative, or zero).
$Z_n$: the set of integers from 0 to $n - 1$ inclusive.
The objects in a set are often called its elements or its
members. If x is an element of the set S, we write $x \in S$, which is
read ‘x belongs to S’ or ‘x belonging to S’ as the context requires.
If x is not an element of S we write $x \notin S$. Thus $2 \in Z$ but $\frac{1}{2} \notin Z$.
Two sets are equal if they contain precisely the same elements.
The set consisting precisely of elements $x_1, x_2, \ldots, x_n$ is often
denoted by $\{x_1, x_2, \ldots, x_n\}$. For example, $Z_3 = \{0, 1, 2\}$. Also
$Z_3 = \{0, 2, 1\} = \{2, 1, 0\}$.
If S is a set and P a property (or combination of properties)
which elements x of S may or may not possess, we can define a
new set with the notation

$$\{x \in S \mid P(x)\}$$
which denotes ‘the set of all elements belonging to S which have
property P’. For example, the set of positive integers could be
written $\{x \in Z \mid x > 0\}$ which we read as ‘the set of elements x
belonging to Z such that x is greater than 0’. The set of all even
integers can be denoted by $\{2n \mid n \in Z\}$.
A set T is called a subset of a set S if all the elements of T
belong to S. We then say that ‘T is contained in S’ and write
$T \subseteq S$, or that ‘S contains T’ and write $S \supseteq T$.
If S and T are sets we define the union $S \cup T$ of S and T to be
the set of all elements in either S or T. We define the intersection
$S \cap T$ of S and T to be the set of all elements which are members
of both S and T. Thus
$$S \cup T = \{x \mid x \in S \text{ or } x \in T\},$$
$$S \cap T = \{x \mid x \in S \text{ and } x \in T\}.$$
If S and T have no members in common, we say that S and T are
disjoint.
The order or cardinality of a finite set S is the number of
elements in S and is denoted by |S|. For example, $|Z_n| = n$.
Given sets S and T we denote by (s, t) an ordered pair of
elements where $s \in S$ and $t \in T$. Two ordered pairs $(s_1, t_1)$ and
$(s_2, t_2)$ are defined to be equal if and only if $s_1 = s_2$ and $t_1 = t_2$.
Thus if $S = T = Z$, $(0, 1) \neq (1, 0)$. The Cartesian product of S and
T, denoted by $S \times T$, is defined to be the set of all ordered pairs
(s, t) such that $s \in S$ and $t \in T$. The product $S \times S$ is denoted by
$S^2$. Thus
$$S^2 = \{(s_1, s_2) \mid s_1 \in S,\ s_2 \in S\}.$$
If S and T are finite sets, then
$$|S \times T| = |S| \cdot |T|$$
for, in forming an element (s, t) of $S \times T$, we have |S| choices for
s and |T| choices for t. In particular $|S^2| = |S|^2$.
More generally we define the Cartesian product of n sets
$S_1, S_2, \ldots, S_n$ to be a set of ordered n-tuples thus:
$$S_1 \times S_2 \times \cdots \times S_n = \{(s_1, s_2, \ldots, s_n) \mid s_i \in S_i,\ i = 1, 2, \ldots, n\}.$$
Two ordered n-tuples $(s_1, s_2, \ldots, s_n)$ and $(t_1, t_2, \ldots, t_n)$ are
defined to be equal if and only if $s_i = t_i$ for $i = 1, 2, \ldots, n$. If
$S_1 = S_2 = \cdots = S_n = S$, the product is denoted by $S^n$. For
example,
$$R^3 = \{(x, y, z) \mid x \in R,\ y \in R,\ z \in R\}$$
is a set-theoretic description of coordinatized 3-space. If S is
finite, then clearly
$$|S^n| = |S|^n.$$
Finally we remark that in this book we shall often write an
ordered n-tuple $(x_1, x_2, \ldots, x_n)$ simply as $x_1 x_2 \cdots x_n$.
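The counting rules for Cartesian products stated above can be checked on small sets. The following is only an illustrative sketch: the sets `S` and `T` and the value of `n` are arbitrary choices made here, not taken from the text.

```python
from itertools import product

# Two small illustrative sets (arbitrary choices for this sketch).
S = {0, 1, 2}
T = {"a", "b"}

# S x T as a set of ordered pairs; its order should be |S| * |T|.
pairs = set(product(S, T))
print(len(pairs) == len(S) * len(T))   # True

# S^n as a set of ordered n-tuples; its order should be |S|^n.
n = 4
tuples = set(product(S, repeat=n))
print(len(tuples) == len(S) ** n)      # True

# Ordered pairs are equal only if they agree in each position.
print((0, 1) == (1, 0))                # False
```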
1 Introduction to error-correcting codes

Error-correcting codes are used to correct errors when messages
are transmitted through a noisy communication channel. For
example, we may wish to send binary data (a stream of 0s and 1s)
through a noisy channel as quickly and as reliably as possible.
The channel may be a telephone line, a high frequency radio
link, or a satellite communication link. The noise may be human
error, lightning, thermal noise, imperfections in equipment, etc.,
and may result in errors so that the data received is different
from that sent. The object of an error-correcting code is to
encode the data, by adding a certain amount of redundancy to
the message, so that the original message can be recovered if
(not too many) errors have occurred. A general digital communication
system is shown in Fig. 1.1. The same model can be
used to describe an information storage system if the storage
medium is regarded as a channel; a typical example is a
magnetic-tape unit including writing and reading heads.

Figure 1.1
Message source --(message)--> Encoder --(codeword)--> Channel
--(received vector)--> Decoder --(decoded message)--> User
Noise enters at the channel.

Let us look at a very simple example in which the only
messages we wish to send are ‘YES’ and ‘NO’.

Example 1.2
Message source (YES or NO) --(message = YES)-->
Encoder (YES = 00000, NO = 11111) --(codeword 00000)-->
Channel (noise) --(received vector 01001)-->
Decoder (01001 decoded as 00000 = YES) --(decoded message YES)--> User

Here two errors have occurred and the decoder has decoded the
received vector 01001 as the ‘nearest’ codeword which is 00000 or
YES.
A binary code is just a given set of sequences of 0s and 1s
which are called codewords. The code of Example 1.2 is
{00000, 11111}. If the messages YES and NO are identified with
the symbols 0 and 1 respectively, then each message symbol is
encoded simply by repeating the symbol five times. The code is
called a binary repetition code of length 5. This is an example of
how ‘redundancy’ can be added to messages to protect them
against noise. The extra symbols sent are themselves subject to
error and so there is no way to guarantee accuracy; we just try to
make the probability of accuracy as high as possible. Clearly, a
good code is one in which the codewords have little resemblance
to each other.
More generally, a q-ary code is a given set of sequences of
symbols where each symbol is chosen from a set
$F_q = \{\lambda_1, \lambda_2, \ldots, \lambda_q\}$ of q distinct elements. The set $F_q$ is called the alphabet
and is often taken to be the set $Z_q = \{0, 1, 2, \ldots, q - 1\}$.
However, if q is a prime power (i.e. $q = p^h$ for some prime
number p and some positive integer h) then we often take the
alphabet $F_q$ to be the finite field of order q (see Chapter 3). As
we have already seen, 2-ary codes are called binary codes; 3-ary
codes are sometimes referred to as ternary codes.

Example 1.3. (i) The set of all words in the English language is
a code over the 26-letter alphabet {A,B,..., Z}.
(ii) The set of all street names in the city of Salford is a
27-ary code (the space between words is the 27th symbol) and
provides a good example of poor encoding, for two street names
on the same estate are HILLFIELD DRIVE and MILLFIELD
DRIVE.
A code in which each codeword is a sequence consisting of a
fixed number n of symbols is called a block code of length n.
From now on we shall restrict our attention almost exclusively to
such codes and so by ‘code’ we shall always mean ‘block code’.
A code C with M codewords of length n is often written as an
$M \times n$ array whose rows are the codewords of C. For example,
the binary repetition code of length 3 is
000
111.
Let $(F_q)^n$ denote the set of all ordered n-tuples $a = a_1 a_2 \cdots a_n$
where each $a_i \in F_q$. The elements of $(F_q)^n$ are called vectors or
words. The order of the set $(F_q)^n$ is $q^n$. A q-ary code of length n
is just a subset of $(F_q)^n$.

Example 1.4 The set of all 10-digit telephone numbers in the
United Kingdom is a 10-ary code of length 10. Little thought
appears to have been given to allocating numbers so that the
frequency of ‘wrong numbers’ is minimized. Yet it is possible to
use a code of over 82 million 10-digit telephone numbers (enough
for the needs of the UK) such that if just one digit of any number
is misdialled the correct connection can nevertheless be made.
We will construct this code in Chapter 7 (Example 7.12).

Example 1.5 Suppose that HQ and X have identical maps
gridded as shown in Fig. 1.6 but that only HQ knows the route
indicated, avoiding enemy territory, by which X can return safely
to HQ. HQ can transmit binary data to X and wishes to send the
route NNWNNWWSSWWNNNNWWN. This is a situation
where reliability is more important than speed of transmission.
Consider how the four messages N, S, E, W can be encoded into
binary codewords. The fastest (i.e. shortest) code we could use is
$$C_1 = \begin{cases} 00 = N \\ 01 = W \\ 10 = E \\ 11 = S. \end{cases}$$

Figure 1.6 (gridded map showing the route from X to HQ)

That is, we identify the four messages N, W, E, S with the four
vectors of $(F_2)^2$. Let us see how, as in Example 1.2, redundancy
can be added to protect these message vectors against noise.
Consider the length 3 code $C_2$ obtained by adding an extra digit
as follows.
$$C_2 = \begin{cases} 000 \\ 011 \\ 101 \\ 110. \end{cases}$$

This takes longer than $C_1$ to transmit but if there is any single
error in a codeword, the received vector cannot be a codeword
(check this!) and so the receiver will recognize that an error has
occurred and may be able to ask for the message to be
retransmitted. Thus $C_2$ has the facility to detect any single error;
we say it is a single-error-detecting code.
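The ‘check this!’ above can also be carried out mechanically. The sketch below is one possible way to do so, representing codewords as strings of 0s and 1s; the function names are invented here for illustration.

```python
# Codewords of the length 3 code of Example 1.5, as strings.
C2 = {"000", "011", "101", "110"}
# The length 2 code of Example 1.5, for comparison.
C1 = {"00", "01", "10", "11"}

def single_error_neighbours(word):
    """Return every binary word obtained from `word` by changing exactly one digit."""
    neighbours = []
    for i, bit in enumerate(word):
        flipped = "1" if bit == "0" else "0"
        neighbours.append(word[:i] + flipped + word[i + 1:])
    return neighbours

def detects_all_single_errors(code):
    """True if no single-digit error turns a codeword into another codeword."""
    return all(v not in code for c in code for v in single_error_neighbours(c))

print(detects_all_single_errors(C2))  # True
print(detects_all_single_errors(C1))  # False
```

The second result reflects the fact that every binary vector of length 2 is a codeword of the shorter code, so an error there can never be noticed.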
Now suppose X can receive data from HQ but is unable to
seek retransmission, i.e. we have a strictly one-way channel. A
similar situation might well apply in receiving photographs from
deep space or in the playing back of an old magnetic tape, and in
such cases it is essential to extract as much information as
possible from the received vectors. By suitable addition of two
further digits to each codeword of $C_2$ we get the length 5 code
$$C_3 = \begin{cases} 00000 \\ 01101 \\ 10110 \\ 11011. \end{cases}$$
If a single error occurs in any codeword of $C_3$, we are able not
only to detect it but actually to correct it, since the received
vector will still be ‘closer’ to the transmitted codeword than to
any other. (Check that this is so and also that if used only for
error-detection $C_3$ is a two-error-detecting code.)
We have so far talked rather loosely about a vector being
‘closer’ to one codeword than to another and we now make this
concept precise by introducing a distance function on $(F_q)^n$,
called the Hamming distance.
The (Hamming) distance between two vectors x and y of $(F_q)^n$
is the number of places in which they differ. It is denoted by
d(x, y). For example, in $(F_2)^5$ we have d(00111, 11001) = 4, while
in $(F_3)^4$ we have d(0122, 1220) = 3.
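As a small illustration of this definition, the following sketch computes the distance between two words written as strings of symbols. The function name `hamming_distance` is a choice made here, not notation from the text.

```python
def hamming_distance(x, y):
    """Number of places in which the equal-length words x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))

# The two examples given in the text.
print(hamming_distance("00111", "11001"))  # 4
print(hamming_distance("0122", "1220"))    # 3
```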
The Hamming distance is a legitimate distance function, or
metric, since it satisfies the three conditions:

(i) d(x, y) = 0 if and only if x = y.
(ii) d(x, y) = d(y, x) for all $x, y \in (F_q)^n$.
(iii) $d(x, y) \le d(x, z) + d(z, y)$ for all $x, y, z \in (F_q)^n$.

The first two conditions are very easy to verify. The third,
known as the triangle inequality, is verified as follows. Note that
d(x, y) is the minimum number of changes of digits required to
change x to y. But we can also change x to y by first making
d(x,z) changes (changing x to z) and then d(z,y) changes
(changing z to y). Thus $d(x, y) \le d(x, z) + d(z, y)$.
The Hamming distance will be the only metric considered in
this book. However, it is not the only one possible and indeed
may not always be the most appropriate. For example, in $(F_{10})^3$
we have d(428, 438) = d(428, 468), whereas in practice, e.g. in
dialling a telephone number, it might be more sensible to use a
metric in which 428 is closer to 438 than it is to 468.
Let us now consider the problem of decoding. Suppose a
codeword x, unknown to us, has been transmitted and that we
receive the vector y which may have been distorted by noise. It
seems reasonable to decode y as that codeword x’, hopefully x,
such that d(x’,y) is as small as possible. This is called nearest
neighbour decoding. This strategy will certainly maximize the
decoder’s likelihood of correcting errors provided the following
assumptions are made about the channel.
(i) Each symbol transmitted has the same probability $p$ ($< \frac{1}{2}$)
of being received in error.
(ii) If a symbol is received in error, then each of the $q - 1$
possible errors is equally likely.
Such a channel is called a q-ary symmetric channel. The binary
symmetric channel is shown in Fig. 1.7.
Figure 1.7
Symbol sent 0: received as 0 with probability 1 − p, as 1 with probability p.
Symbol sent 1: received as 1 with probability 1 − p, as 0 with probability p.

p is called the symbol error probability of the channel.

If the binary symmetric channel is assumed and if a particular
binary codeword of length n is transmitted, then the probability
that no errors will occur is $(1 - p)^n$, since each symbol has
probability $(1 - p)$ of being received correctly. The probability
that one error will occur in a specified position is $p(1 - p)^{n-1}$.
The probability that the received vector has errors in precisely i
specified positions is $p^i (1 - p)^{n-i}$. Since $p < \frac{1}{2}$, the received vector
with no errors is more likely than any other; any received vector
with one error is more likely than any with two or more errors,
and so on. This confirms that, for a binary symmetric channel,
nearest neighbour decoding is also maximum likelihood
decoding.
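Nearest neighbour decoding, in its brute-force form, can be sketched as follows. This is an illustration only: the helper names are invented here, and when two codewords are equally near the received vector this sketch simply returns whichever comes first in the list, a tie-breaking rule the text does not prescribe.

```python
def hamming_distance(x, y):
    """Number of places in which the equal-length words x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def nearest_neighbour_decode(received, code):
    """Return a codeword of `code` at least Hamming distance from `received`."""
    return min(code, key=lambda codeword: hamming_distance(codeword, received))

# The binary repetition code of length 5 from Example 1.2.
REPETITION_5 = ["00000", "11111"]
print(nearest_neighbour_decode("01001", REPETITION_5))  # 00000

# The length 5 code of Example 1.5, with one error in the codeword 01101.
C3 = ["00000", "01101", "10110", "11011"]
print(nearest_neighbour_decode("01111", C3))            # 01101
```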

Example 1.8 Consider the binary repetition code of length 3
$$C = \{000, 111\}.$$
Suppose the codeword 000 is transmitted. Then the received
vectors which will be decoded as 000 are 000, 100, 010 and 001.
Thus the probability that the received vector is decoded as the
transmitted codeword 000 is
$$(1 - p)^3 + 3p(1 - p)^2 = (1 - p)^2(1 + 2p).$$
Note that, by symmetry, the probability is the same if the
transmitted codeword is 111. Thus we can say that the code C
has a word error probability, denoted by $P_{\mathrm{err}}(C)$, which is
independent of the codeword transmitted. In this example, we
have $P_{\mathrm{err}}(C) = 1 - (1 - p)^2(1 + 2p) = 3p^2 - 2p^3$.

In order to compare probabilities given by such polynomials in
p, it is useful to assign an appropriate numerical value to p. For
example we might assume that, on average, the channel causes
one symbol in a hundred to be received in error, i.e. p = 0.01. In
this case $P_{\mathrm{err}}(C) = 0.000298$ and so approximately only one word
in 3355 will reach the user in error.
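The numerical value quoted above can be reproduced with a few lines of code. The sketch below assumes a binary symmetric channel and a binary repetition code of odd length n decoded by nearest neighbour decoding, so that a word is decoded correctly exactly when at most (n − 1)/2 symbols are in error; the function name is invented here.

```python
from math import comb

def repetition_word_error_probability(n, p):
    """Word error probability of the binary repetition code of odd length n
    on a binary symmetric channel with symbol error probability p."""
    t = (n - 1) // 2
    # Probability that at most t of the n symbols are received in error.
    p_correct = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(t + 1))
    return 1 - p_correct

p = 0.01
p_err = repetition_word_error_probability(3, p)
print(round(p_err, 6))  # 0.000298
print(int(1 / p_err))   # 3355
```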
We will show in Chapter 6 that a very important class of codes,
called linear codes, all have the property that the word error
probability is independent of the actual codeword sent. For a
general code, a brute-force decoding scheme is to compare the
received vector with all codewords and to decode as the nearest.
This is impractical for large codes and one of the aims of coding
theory is to find codes which can be decoded by faster methods
than this. We shall see in Chapters 6 and 7 that linear codes have
elegant decoding schemes.
An important parameter of a code C, giving a measure of how
good it is at error-correcting, is the minimum distance, denoted
d(C), which is defined to be the smallest of the distances
between distinct codewords. That is,
$$d(C) = \min \{d(x, y) \mid x, y \in C,\ x \neq y\}.$$
For example, it is easily checked that for the codes of Example
1.5, $d(C_1) = 1$, $d(C_2) = 2$ and $d(C_3) = 3$.
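The three minimum distances just quoted can be checked directly from the definition. The following sketch compares every pair of distinct codewords; the names `hamming_distance` and `minimum_distance` are chosen here for illustration.

```python
from itertools import combinations

def hamming_distance(x, y):
    """Number of places in which the equal-length words x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code):
    """Smallest Hamming distance between two distinct codewords of `code`."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

# The three codes of Example 1.5.
C1 = ["00", "01", "10", "11"]
C2 = ["000", "011", "101", "110"]
C3 = ["00000", "01101", "10110", "11011"]

print([minimum_distance(C) for C in (C1, C2, C3)])  # [1, 2, 3]
```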

Theorem 1.9 (i) A code C can detect up to s errors in any
codeword if $d(C) \ge s + 1$.
(ii) A code C can correct up to t errors in any codeword if
$d(C) \ge 2t + 1$.

Proof (i) Suppose $d(C) \ge s + 1$. Suppose a codeword x is
transmitted and s or fewer errors are introduced. Then the
received vector cannot be a different codeword and so the errors
can be detected.
(ii) Suppose $d(C) \ge 2t + 1$. Suppose a codeword x is transmitted
and the vector y received in which t or fewer errors have
occurred, so that $d(x, y) \le t$. If $x'$ is any codeword other than x,
then $d(x', y) \ge t + 1$. For otherwise, $d(x', y) \le t$, which implies,
by the triangle inequality, that $d(x, x') \le d(x, y) + d(x', y) \le 2t$,
contradicting $d(C) \ge 2t + 1$. So x is the nearest codeword to y
and nearest neighbour decoding corrects the errors.

[Note: The reader may find Remark 2.12 helpful in clarifying this
proof.]
Corollary 1.10 If a code C has minimum distance d, then C can
be used either (i) to detect up to $d - 1$ errors, or (ii) to correct up
to $\lfloor (d - 1)/2 \rfloor$ errors in any codeword.
($\lfloor x \rfloor$ denotes the greatest integer less than or equal to x.)

Proof (i) $d \ge s + 1$ iff $s \le d - 1$. (ii) $d \ge 2t + 1$ iff $t \le (d - 1)/2$.

For example, if d(C) = 3, then C can be used either as a
single-error-correcting code or as a double-error-detecting code.
More generally we have:

d(C)    Number of errors    Number of errors
        detected by C       corrected by C
 1            0                   0
 2            1                   0
 3            2                   1
 4            3                   1
 5            4                   2
 6            5                   2
 7            6                   3
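The two quantities in Corollary 1.10 are simple functions of the minimum distance d. As a sketch (with function names invented here), they can be tabulated for the first few values of d as follows.

```python
def errors_detected(d):
    """Largest s with d >= s + 1, as in Corollary 1.10(i)."""
    return d - 1

def errors_corrected(d):
    """Largest t with d >= 2t + 1, i.e. the integer part of (d - 1)/2."""
    return (d - 1) // 2

# One row per minimum distance: d, errors detected, errors corrected.
for d in range(1, 8):
    print(d, errors_detected(d), errors_corrected(d))
```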

The following notation will be used extensively and should be
memorized.
An (n,M,d)-code is a code of length n, containing M
codewords and having minimum distance d.

Examples 1.11 (i) In Example 1.5, $C_1$ is a (2, 4, 1)-code, $C_2$ a
(3, 4, 2)-code and $C_3$ a (5, 4, 3)-code.
(ii) The q-ary repetition code of length n whose codewords
are
$$00 \cdots 0,\quad 11 \cdots 1,\quad \ldots,\quad (q-1)(q-1) \cdots (q-1)$$
is an (n, q, n)-code.
Example 1.12 The code used by Mariner 9 to transmit pictures
from Mars was a binary (32, 64, 16)-code, called a Reed-Muller
code. This code, which will be constructed in Exercise 2.19, is
well suited to very noisy channels and also has a fast decoding
algorithm. How the code was used will be described in the
following brief history of the transmission of photographs from
NASA space probes.

The transmission of photographs from deep-space


1965: Mariner 4 was the first spaceship to photograph
another planet, taking 22 complete photographs of Mars. Each
picture was broken down into 200 x 200 picture elements. Each
element was assigned a binary 6-tuple representing one of 64
brightness levels from white (=000000) to black (=111111). Thus
the total number of bits (i.e. binary digits) per picture was
240 000. Data was transmitted at the rate of 8⅓ bits per second
and so it took 8 hours to transmit a single picture!
1969-1972: Much improved pictures of Mars were obtained
by Mariners 6, 7 and 9 (Mariner 8 was lost during launching).
There were three important reasons for this improvement:
(1) Each picture was broken down into 700 x 832 elements
(cf. 200 × 200 of Mariner 4 and 400 × 525 of US commercial
television).
(2) Mariner 9 was the first spaceship to be put into orbit
around Mars.
(3) The powerful Reed-Muller (32, 64, 16)-code was used for
error correction. Thus a binary 6-tuple representing the
brightness of a dot in the picture was now encoded as a
binary codeword of length 32 (having 26 redundant bits).
The data transmission rate was increased from 8⅓ to
16 200 bits per second. Even so, picture bits were
produced by Mariner’s cameras at more than 100 000 per
second, and so data had to be stored on magnetic tape
before transmission.
1976: Viking 1 landed softly on Mars and returned high-
quality colour photographs.
Surprisingly, transmission of a colour picture in the form of
binary data is almost as easy as transmission of a black-and-white
one. It is achieved simply by taking the same black-and-white
photograph several times, each time through a different coloured
filter. The black-and-white pictures are then transmitted as
already described and the colour picture reconstructed back on
Earth.

5 March 1979: High-resolution colour pictures of Jupiter and
its moons were returned by Voyager 1.
12 November 1980: Voyager 1 returned the first high-resolution
pictures of Saturn and its moons.
25 August 1981: Voyager 2 returned further excellent pictures
of Saturn.
And to come:

24 January 1986: Voyager 2 passes Uranus.


24 August 1989: Voyager 2 passes Neptune.

Exercises 1

1.1 If the following message were received from outer space,
why might it be conjectured that it was sent by a race of
human-like beings who have one arm twice as long as the
other? [Hint: The number of digits in the message is the
product of two prime numbers.]

0011000001 10001111111101100100110010011001011110001
00100010010001001001100110
1.2 Suppose the binary repetition code of length 5 is used for
a binary symmetric channel which has symbol error
probability p. Show that the word error probability of the
code is $10p^3 - 15p^4 + 6p^5$.
1.3 Show that a code having minimum distance 4 can be used
simultaneously to correct single errors and detect double
errors.
1.4 The code used by Mariner 9 will correct any received
32-tuple provided not more than ... (how many?) errors
have occurred.
1.5 (i) Show that a 3-ary (3, M, 2)-code must have $M \le 9$.
(ii) Show that a 3-ary (3, 9, 2)-code does exist.
(iii) Generalize the results of (i) and (ii) to q-ary
(3, M, 2)-codes, for any integer $q \ge 2$.
2 The main coding theory problem

A good (n, M, d)-code has small n (for fast transmission of
messages), large M (to enable transmission of a wide variety of
messages) and large d (to correct many errors). These are
conflicting aims and what is often referred to as the ‘main coding
theory problem’ is to optimize one of the parameters n, M, d for
given values of the other two. The usual version of the problem
is to find the largest code of given length and given minimum
distance. We denote by $A_q(n, d)$ the largest value of M such that
there exists a q-ary (n, M, d)-code.
The problem is easily solved for d = 1 and d = n, for all q:

Theorem 2.1 (i) $A_q(n, 1) = q^n$. (ii) $A_q(n, n) = q$.

Proof (i) For the minimum distance of a code to be at least 1
we require that the codewords are distinct, and so the largest
q-ary (n, M, 1)-code is the whole of $(F_q)^n$, with $M = q^n$.
(ii) Suppose C is a q-ary (n, M, n)-code. Then any two
distinct codewords of C differ in all n positions. Thus the symbols
appearing in any fixed position, e.g. the first, in the M codewords
must be distinct, giving $M \le q$. Thus $A_q(n, n) \le q$. On the other
hand, the q-ary repetition code of length n (see Example
1.11(ii)) is an (n, q, n)-code and so $A_q(n, n) = q$.
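The parameters of the repetition code used in part (ii) of the proof can be confirmed for particular values of q and n. The sketch below represents codewords as tuples over {0, 1, ..., q − 1}; the function names and the sample values of q and n are choices made here.

```python
from itertools import combinations

def repetition_code(q, n):
    """Codewords 00...0, 11...1, ..., (q-1)(q-1)...(q-1) as tuples of length n."""
    return [(symbol,) * n for symbol in range(q)]

def parameters(code):
    """Return (n, M, d) for a block code given as a list of equal-length tuples."""
    n = len(code[0])
    M = len(code)
    d = min(sum(a != b for a, b in zip(x, y)) for x, y in combinations(code, 2))
    return n, M, d

print(parameters(repetition_code(3, 4)))   # (4, 3, 4)
print(parameters(repetition_code(10, 6)))  # (6, 10, 6)
```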

Example 2.2 We will determine the value $A_2(5, 3)$. The code $C_3$
of Example 1.5 is a binary (5, 4, 3)-code and so $A_2(5, 3) \ge 4$. But
can we do better? To show whether there exists a binary
(5, 5, 3)-code a brute-force method would be to consider all
subsets of order 5 in $(F_2)^5$ and find the minimum distance of each.
Unfortunately there are over 200 000 such subsets (see Example
2.11(iii)), but, by using the following notion of equivalence, the
search can be considerably reduced. We will return to Example
2.2 shortly.
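Although tedious by hand, the brute-force search described above is well within reach of a computer. The sketch below is an illustration only: it represents each vector of length 5 as an integer from 0 to 31 (an encoding chosen here for convenience), so that the Hamming distance between two vectors is the number of 1s in their bitwise exclusive-or.

```python
from itertools import combinations
from math import comb

N, D = 5, 3
WORDS = range(2 ** N)  # each integer 0..31 stands for one binary vector of length 5

# DIST[x][y] is the Hamming distance between the vectors encoded by x and y.
DIST = [[bin(x ^ y).count("1") for y in WORDS] for x in WORDS]

def has_min_distance_at_least(code, d):
    """True if every pair of distinct words in `code` is at distance >= d."""
    return all(DIST[x][y] >= d for x, y in combinations(code, 2))

def exists_code(size, d):
    """True if some subset of `size` vectors has minimum distance >= d."""
    return any(has_min_distance_at_least(c, d) for c in combinations(WORDS, size))

print(comb(2 ** N, 5))    # 201376 subsets of order 5
print(exists_code(4, D))  # True
print(exists_code(5, D))  # False
```

The search examines every subset, which is exactly the work that the notion of equivalence introduced next is designed to avoid.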
Equivalence of codes

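A permutation of a set $S = \{x_1, x_2, \ldots, x_n\}$ is a one-to-one
mapping from S to itself. We denote a permutation f by
$$\begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ \downarrow & \downarrow & & \downarrow \\ f(x_1) & f(x_2) & \cdots & f(x_n) \end{pmatrix}.$$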
Definition Two q-ary codes are called equivalent if one can be
obtained from the other by a combination of operations of the
following types:
(A) permutation of the positions of the code;
(B) permutation of the symbols appearing in a fixed position.
If a code is displayed as an $M \times n$ matrix whose rows are the
codewords, then an operation of type (A) corresponds to a
permutation, or rearrangement, of the columns of the matrix,
while an operation of type (B) corresponds to a re-labelling of
the symbols appearing in a given column.
Clearly the distances between codewords are unchanged by
such operations and so equivalent codes have the same parameters
(n, M, d) and will correct the same number of errors.
Indeed, under the assumptions of a q-ary symmetric channel, the
performances of equivalent codes will be identical in terms of
probabilities of error correction.

Examples (i) The binary code

00100
00011
11111
11000

is equivalent to the code $C_3$ of Example 1.5. (Apply the
permutation
$$\begin{pmatrix} 0 & 1 \\ \downarrow & \downarrow \\ 1 & 0 \end{pmatrix}$$
to the symbols in the third position of the code and then interchange
positions 2 and 4. Note that the codewords will be listed in a
different order from that in Example 1.5.)
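(ii) The ternary code

      012
  C = 120
      201

is equivalent to the ternary repetition code of length 3. Applying
the permutation
$$\begin{pmatrix} 0 & 1 & 2 \\ \downarrow & \downarrow & \downarrow \\ 2 & 0 & 1 \end{pmatrix}$$
to the symbols in the second position and
$$\begin{pmatrix} 0 & 1 & 2 \\ \downarrow & \downarrow & \downarrow \\ 1 & 2 & 0 \end{pmatrix}$$
to the symbols in the third position of C gives the code

  000
  111
  222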
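
The two examples above can be checked mechanically. The following is a minimal Python sketch, not part of the original text; the helper names `relabel` and `swap_positions` are our own, standing for operations of type (B) and type (A) respectively, with codewords held as strings.

```python
from itertools import combinations

def hamming(x, y):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))

def min_distance(code):
    """Smallest distance between two distinct codewords."""
    return min(hamming(x, y) for x, y in combinations(code, 2))

def relabel(code, position, mapping):
    """Type (B): permute the symbols in one position (1-indexed)."""
    i = position - 1
    return [w[:i] + mapping[w[i]] + w[i + 1:] for w in code]

def swap_positions(code, p, q):
    """Type (A): interchange positions p and q (1-indexed)."""
    out = []
    for w in code:
        s = list(w)
        s[p - 1], s[q - 1] = s[q - 1], s[p - 1]
        out.append("".join(s))
    return out

# Example (i): swap 0 and 1 in position 3, then interchange positions 2 and 4.
code_i = ["00100", "00011", "11111", "11000"]
result_i = swap_positions(relabel(code_i, 3, {"0": "1", "1": "0"}), 2, 4)

# Example (ii): the two symbol permutations given in the text.
code_ii = ["012", "120", "201"]
step = relabel(code_ii, 2, {"0": "2", "1": "0", "2": "1"})
result_ii = relabel(step, 3, {"0": "1", "1": "2", "2": "0"})

print(sorted(result_i))
print(sorted(result_ii))
```

Comparing `min_distance` before and after each operation illustrates the remark that equivalent codes have the same parameters.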

Lemma 2.3 Any q-ary (n, M, d)-code over an alphabet
$\{0, 1, \ldots, q-1\}$ is equivalent to an (n, M, d)-code which
contains the all-zero vector $\mathbf{0} = 00\cdots0$.

Proof Choose any codeword $x_1x_2\cdots x_n$ and, for each $x_i \ne 0$, apply
the permutation
$$\begin{pmatrix} 0 & x_i & j \\ \downarrow & \downarrow & \downarrow \\ x_i & 0 & j \end{pmatrix} \quad \text{for all } j \ne 0, x_i$$
to the symbols in position i.

Example 2.2 (continued) We will show not only that a binary
(5, M, 3)-code must have $M \le 4$ but also that the (5, 4, 3)-code is
unique, up to equivalence.
Let C be a (5, M, 3)-code with $M \ge 4$. Then by Lemma 2.3 we
may assume that C contains the vector $\mathbf{0} = 00000$ (replacing C by
an equivalent code which does contain $\mathbf{0}$, if necessary). Now C
contains at most one codeword having 4 or 5 1s, for if there were
two such codewords, x and y say, then x and y would have at
least 3 1s in common positions, giving $d(x, y) \le 2$ and contradicting
$d(C) = 3$.
Since $\mathbf{0} \in C$, there can be no codewords containing just one or
two 1s and so, since $M \ge 4$, there must be at least two codewords
containing exactly 3 1s. By rearranging the positions, if necessary,
we may thus assume that C contains the codewords

  00000
  11100
  00111

It is now very easy to show by trial and error that the only
possible further codeword is 11011.
We have thus shown that $A_2(5, 3) = 4$ and that the code which
achieves this value is, up to equivalence, unique.
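
For parameters this small the conclusion can also be confirmed by machine. The sketch below is our own illustration rather than the method of the text: the function name `largest_code` is hypothetical, and it uses Lemma 2.3 to assume that the code contains the all-zero vector before backtracking over the remaining candidates.

```python
from itertools import product

def hamming(x, y):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))

def largest_code(n, d):
    """Return a largest binary code of length n with minimum distance >= d.

    By Lemma 2.3 we may assume the code contains the all-zero vector.
    """
    words = list(product((0, 1), repeat=n))
    zero = words[0]
    best = []

    def extend(code, candidates):
        nonlocal best
        if len(code) > len(best):
            best = list(code)
        for k, w in enumerate(candidates):
            # Later candidates that are still far enough from w.
            rest = [v for v in candidates[k + 1:] if hamming(v, w) >= d]
            # Explore only if this branch could beat the best code so far.
            if len(code) + 1 + len(rest) > len(best):
                extend(code + [w], rest)

    extend([zero], [w for w in words[1:] if hamming(w, zero) >= d])
    return best

code = largest_code(5, 3)
print(len(code), ["".join(map(str, w)) for w in code])
```

The search is exhaustive, so the size it reports for n = 5, d = 3 is the value of $A_2(5, 3)$; the running time grows very quickly with n, which is why such searches are of limited use for larger parameters.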
Restricting our attention for the time being to binary codes, we
list in Table 2.4 the known non-trivial values of $A_2(n, d)$ for
$n \le 16$ and $d \le 7$. This is taken from the table on p. 156 of Sloane
(1982), which in turn is an updating of the table on p. 674 of
MacWilliams and Sloane (1977). Where the value of $A_2(n, d)$ is
not known, the best available bounds are given; for example, the
entry 72-79 indicates that $72 \le A_2(10, 3) \le 79$.

Table 2.4

   n     d = 3        d = 5      d = 7

   5     4            2          -
   6     8            2          -
   7     16           2          2
   8     20           4          2
   9     40           6          2
  10     72-79        12         2
  11     144-158      24         4
  12     256          32         4
  13     512          64         8
  14     1024         128        16
  15     2048         256        32
  16     2560-3276    256-340    36-37

Many of the entries of Table 2.4 will be established during the
course of this book (we have already verified the first entry in
Example 2.2). In Chapter 16 we shall again consider Table 2.4
and review the progress we have made.
The reason why only odd values of d need to be considered in
the table is that if d is an even number, then
$A_2(n, d) = A_2(n-1, d-1)$, a result (Corollary 2.8) towards which we now
proceed.
Taking $F_2$ to be the set {0, 1}, we define two operations on
$(F_2)^n$. Let $x = x_1x_2\cdots x_n$ and $y = y_1y_2\cdots y_n$ be two vectors in
$(F_2)^n$. Then the sum x + y is the vector in $(F_2)^n$ defined by
$$x + y = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n),$$
while the intersection $x \cap y$ is the vector in $(F_2)^n$ defined by
$$x \cap y = (x_1y_1, x_2y_2, \ldots, x_ny_n).$$
The terms $x_i + y_i$ and $x_iy_i$ are calculated modulo 2 (without
carrying); that is, according to the addition and multiplication
tables

   + | 0  1        · | 0  1
   --+------       --+------
   0 | 0  1        0 | 0  0
   1 | 1  0        1 | 0  1

For example 11100 + 00111 = 11011
and $11100 \cap 00111 = 00100$.
The weight of a vector x in $(F_2)^n$, denoted w(x), is defined to be
the number of 1s appearing in x.

Lemma 2.5 If x and y $\in (F_2)^n$, then $d(x, y) = w(x + y)$.

Proof The sum x + y has a 1 where x and y differ and a 0 where
x and y agree.

Lemma 2.6 If x and y $\in (F_2)^n$, then
$$d(x, y) = w(x) + w(y) - 2w(x \cap y).$$

Proof d(x, y) = w(x + y) = (number of 1s in x) + (number of 1s
in y) − 2(number of positions where both x and y have a
1) = $w(x) + w(y) - 2w(x \cap y)$.
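
Both lemmas are easy to confirm numerically for small n. The following Python sketch is an illustration of ours (the function names are not from the text); it represents binary vectors as tuples of 0s and 1s and checks every pair of vectors of length 5.

```python
from itertools import product

def vec_sum(x, y):
    """Componentwise sum modulo 2."""
    return tuple((a + b) % 2 for a, b in zip(x, y))

def intersection(x, y):
    """Componentwise product: 1 exactly where both vectors have a 1."""
    return tuple(a * b for a, b in zip(x, y))

def weight(x):
    """Number of 1s in a binary vector."""
    return sum(x)

def distance(x, y):
    """Hamming distance."""
    return sum(a != b for a, b in zip(x, y))

# The worked example from the text.
x = (1, 1, 1, 0, 0)
y = (0, 0, 1, 1, 1)
print(vec_sum(x, y), intersection(x, y))

# Check Lemmas 2.5 and 2.6 for every pair of vectors of length 5.
vectors = list(product((0, 1), repeat=5))
lemma_2_5 = all(distance(u, v) == weight(vec_sum(u, v))
                for u in vectors for v in vectors)
lemma_2_6 = all(distance(u, v) == weight(u) + weight(v) - 2 * weight(intersection(u, v))
                for u in vectors for v in vectors)
print(lemma_2_5, lemma_2_6)
```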

Theorem 2.7 Suppose d is odd. Then a binary (n, M, d)-code
exists if and only if a binary (n + 1, M, d + 1)-code exists.

Proof ‘only if’ part: Suppose C is a binary (n, M, d)-code,
where d is odd. Let $\hat{C}$ be the code of length n + 1 obtained from
C by extending each codeword x of C according to the rule
$$x = x_1x_2\cdots x_n \;\to\; \hat{x} = \begin{cases} x_1x_2\cdots x_n0 & \text{if } w(x) \text{ is even} \\ x_1x_2\cdots x_n1 & \text{if } w(x) \text{ is odd.} \end{cases}$$
Equivalently we can define
$$\hat{x} = x_1x_2\cdots x_nx_{n+1},$$
where $x_{n+1} = \sum_{i=1}^{n} x_i$, calculated modulo 2.
This construction of $\hat{C}$ from C is called ‘adding an overall
parity check’ to the code C.
Since $w(\hat{x})$ is even for every codeword $\hat{x}$ of $\hat{C}$, it follows from
Lemma 2.6 that $d(\hat{x}, \hat{y})$ is even for all $\hat{x}, \hat{y}$ in $\hat{C}$. Hence $d(\hat{C})$ is
even. Clearly $d \le d(\hat{C}) \le d + 1$, and so, since d is odd, we must
have $d(\hat{C}) = d + 1$. Thus $\hat{C}$ is an (n + 1, M, d + 1)-code.
‘if’ part: Suppose D is an (n + 1, M, d + 1)-code, where d is
odd. Choose codewords x and y of D such that d(x, y) = d + 1.
Choose a position in which x and y differ and delete this from all
codewords. The result is an (n, M, d)-code.

Corollary 2.8 If d is odd, then $A_2(n + 1, d + 1) = A_2(n, d)$.
Equivalently, if d is even, then $A_2(n, d) = A_2(n - 1, d - 1)$.

Example 2.9 By Example 2.2, $A_2(5, 3) = 4$. Hence, by
Corollary 2.8, $A_2(6, 4) = 4$. To illustrate the ‘only if’ part of
Theorem 2.7 we construct below a (6, 4, 4)-code from the
(5, 4, 3)-code of Example 1.5.

  (5, 4, 3)-code        (6, 4, 4)-code
  00000                 000000
  01101         →       011011
  10110                 101101
  11011                 110110
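
Both directions of Theorem 2.7 can be tried out on this example. The sketch below is a minimal illustration in Python; the names `add_parity_check` and `puncture` are ours, and codewords are held as strings of 0s and 1s.

```python
from itertools import combinations

def distance(x, y):
    """Hamming distance."""
    return sum(a != b for a, b in zip(x, y))

def min_distance(code):
    """Smallest distance between two distinct codewords."""
    return min(distance(x, y) for x, y in combinations(code, 2))

def add_parity_check(code):
    """'Only if' part: append the sum modulo 2 of the symbols of each word."""
    return [w + str(w.count("1") % 2) for w in code]

def puncture(code, position):
    """'If' part: delete one coordinate (1-indexed) from every codeword."""
    return [w[:position - 1] + w[position:] for w in code]

code = ["00000", "01101", "10110", "11011"]
extended = add_parity_check(code)
print(extended, min_distance(extended))

# 000000 and 011011 are at distance 4 and differ in position 2,
# so deleting position 2 gives a code of length 5 and distance 3.
shortened = puncture(extended, 2)
print(shortened, min_distance(shortened))
```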
The trial-and-error method of Example 2.2, which proved that
a binary (5, M, 3)-code must have $M \le 4$, would not be practical
for sets of larger parameters. However, there are some general
upper bounds on how large a code can be (for given n and d),
which sometimes turn out to be the actual value of $A_q(n, d)$. The
best known is the so-called ‘sphere-packing bound’, which we
will prove after introducing a little more notation.

Binomial coefficients

If n and m are integers with $0 \le m \le n$, then the binomial coefficient
$\binom{n}{m}$, pronounced ‘n choose m’, is defined by
$$\binom{n}{m} = \frac{n!}{m!\,(n-m)!},$$
where $m! = m(m-1)\cdots 3 \cdot 2 \cdot 1$ for $m > 0$
and $0! = 1$.

Lemma 2.10 The number of unordered selections of m distinct
objects from a set of n distinct objects is $\binom{n}{m}$.

Proof An ordered selection of m distinct objects from a set of n
distinct objects can be made in
$$n(n-1)\cdots(n-m+1) = \frac{n!}{(n-m)!}$$
ways, for the first object can be chosen in any of n ways, then the
second in any of n − 1 ways, and so on. Since there are
$m(m-1)\cdots 2 \cdot 1 = m!$ ways of ordering the m objects chosen,
the number of unordered selections is
$$\frac{n!}{m!\,(n-m)!}.$$

Examples 2.11 (i) We illustrate the proof of Lemma 2.10 by
listing the ordered and unordered selections of 2 objects from 4.
Labelling the four objects 1, 2, 3, 4, the ordered selections of 2
from 4 are (1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (2, 4), (3, 1),
(3, 2), (3, 4), (4, 1), (4, 2), (4, 3). The number of them is 12 =
4 · 3 = 4!/2!.
The unordered selections of 2 from 4 are {1, 2}, {1, 3}, {1, 4},
{2, 3}, {2, 4}, {3, 4}. Each unordered selection corresponds to
2! = 2 ordered selections and so the number of unordered
selections is $\frac{4!}{2!\,2!} = \binom{4}{2} = 6$.
Note that the unordered selections of m objects from a set S
are just the subsets of S of order m.
(ii) Suppose a bet on a football pool is to be a selection
(unordered) of 8 matches from a large number. The 8 matches
are forecast to be draws (ties). A common plan is to select 10
matches and to ‘choose any 8 from 10’. The number of bets
required is $\binom{10}{8} = 45$.
(iii) The number of different binary codes with M = 5 and
n = 5 is $\binom{32}{5} = 201\,376$. Of course the number of inequivalent
codes will be very much smaller than this.
(iv) The number of binary vectors in $(F_2)^n$ of weight i is $\binom{n}{i}$,
this being the number of ways of choosing i positions out of n to
have 1s. For example, the vectors in $(F_2)^4$ of weight 2 are 1100,
1010, 1001, 0110, 0101, 0011. The one-to-one correspondence
with the list of unordered selections in (i) above should be
evident.
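
The counts in Examples 2.11 can be reproduced with the Python standard library. This is a small sketch of ours, assuming Python 3.8 or later for `math.comb`; the variable names are chosen for illustration only.

```python
from itertools import combinations, permutations
from math import comb, factorial

objects = [1, 2, 3, 4]

# (i) Ordered and unordered selections of 2 objects from 4.
ordered = list(permutations(objects, 2))
unordered = list(combinations(objects, 2))
print(len(ordered), factorial(4) // factorial(2))
print(len(unordered), comb(4, 2))

# (ii) 'Any 8 from 10' and (iii) the number of 5-subsets of (F_2)^5.
bets = comb(10, 8)
codes = comb(2 ** 5, 5)
print(bets, codes)

# (iv) Each 2-subset of positions gives one vector of weight 2 in (F_2)^4.
weight_two = ["".join("1" if i in s else "0" for i in objects) for s in unordered]
print(weight_two)
```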
We now introduce the notion of a sphere in the set $(F_q)^n$.
Provided the analogy is not stretched too far, it can be useful to
think of $(F_q)^n$ as a space not unlike the three-dimensional real
space which we inhabit. The distance between two points of $(F_q)^n$
is of course taken to be the Hamming distance and then the
following definition is quite natural.

Definition For any vector u in $(F_q)^n$ and any integer $r \ge 0$, the
sphere of radius r and centre u, denoted S(u, r), is the set
$\{v \in (F_q)^n \mid d(u, v) \le r\}$.

Remark 2.12 Let us interpret Theorem 1.9(ii) visually. If
$d(C) \ge 2t + 1$, then the spheres of radius t centred on the
codewords of C are disjoint (i.e. they have no overlap). For if a
vector y were in both S(x, t) and S(x′, t), for distinct codewords x and x′
(see Fig. 2.13), then by the triangle inequality we would have
$$d(x, x') \le d(x, y) + d(x', y) \le t + t = 2t,$$
a contradiction to $d(C) \ge 2t + 1$.

Figure 2.13

Figure 2.14

So if t or fewer errors occur in a codeword x, then the received
vector y may be different from the centre of the sphere S(x, t),
but cannot ‘escape’ from the sphere, and so is ‘drawn back’ to x
by nearest neighbour decoding (see Fig. 2.14).

Lemma 2.15 A sphere of radius r in $(F_q)^n$ ($0 \le r \le n$) contains
exactly
$$\binom{n}{0} + \binom{n}{1}(q-1) + \binom{n}{2}(q-1)^2 + \cdots + \binom{n}{r}(q-1)^r$$
vectors.

Proof Let u be a fixed vector in $(F_q)^n$. Consider how many
vectors v have distance exactly m from u, where $m \le n$. The m
positions in which v is to differ from u can be chosen in $\binom{n}{m}$
ways and then in each of these m positions the entry of v can be
chosen in q − 1 ways to differ from the corresponding entry of u.
Hence the number of vectors at distance exactly m from u is
$\binom{n}{m}(q-1)^m$ and so the total number of vectors in S(u, r) is
$$\binom{n}{0} + \binom{n}{1}(q-1) + \binom{n}{2}(q-1)^2 + \cdots + \binom{n}{r}(q-1)^r.$$
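
The formula of Lemma 2.15 can be compared with a direct enumeration of a sphere. The following Python sketch is our own illustration; the names `sphere_size` and `sphere` are hypothetical, and the alphabet is taken to be {0, 1, ..., q − 1}.

```python
from itertools import product
from math import comb

def sphere_size(q, n, r):
    """Number of vectors in a sphere of radius r in (F_q)^n, by Lemma 2.15."""
    return sum(comb(n, m) * (q - 1) ** m for m in range(r + 1))

def sphere(u, r, q):
    """All vectors over {0, ..., q-1} within Hamming distance r of u."""
    return [v for v in product(range(q), repeat=len(u))
            if sum(a != b for a, b in zip(u, v)) <= r]

# A ternary example: radius 2 about an arbitrary centre of length 4.
centre = (0, 1, 2, 0)
print(len(sphere(centre, 2, 3)), sphere_size(3, 4, 2))

# Taking r = n gives the whole space, as used in the remark on the binomial theorem.
print(sphere_size(3, 4, 4), 3 ** 4)
```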
Remark The numbers $\binom{n}{m}$ are called binomial coefficients
because of their role in the binomial theorem, which for any
positive integer n states that
$$(1 + x)^n = 1 + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n.$$
For x a positive integer, the binomial theorem follows from Lemma 2.15
by taking x = q − 1 and r = n, for S(u, n) is the whole of $(F_q)^n$
and so contains $q^n = (1 + x)^n$ vectors.

Theorem 2.16 (The sphere-packing or Hamming bound) A
q-ary (n, M, 2t + 1)-code satisfies
$$M\left\{\binom{n}{0} + \binom{n}{1}(q-1) + \cdots + \binom{n}{t}(q-1)^t\right\} \le q^n. \qquad (2.17)$$

Proof Suppose C is a q-ary (n, M, 2t + 1)-code. As we observed
in Remark 2.12, any two spheres of radius t centred on
distinct codewords can have no vectors in common. Hence the
total number of vectors in the M spheres of radius t centred on
the M codewords is given by the left-hand side of (2.17). This
number must be less than or equal to $q^n$, the total number of
vectors in $(F_q)^n$.
For future reference, we re-state (2.17) for the particular case
of binary codes. That is, any binary (n, M, 2t + 1)-code satisfies
$$M\left\{1 + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{t}\right\} \le 2^n. \qquad (2.18)$$

For given values of q, n and d, the sphere-packing bound
provides an upper bound on $A_q(n, d)$. For example, a binary
(5, M, 3)-code satisfies $M\{1 + 5\} \le 2^5 = 32$, and so $A_2(5, 3) \le 5$.
Of course, just because a set of numbers n, M, d satisfies the
sphere-packing bound, it does not necessarily mean that a code
with those parameters exists. Indeed we saw in Example 2.2 that
there is no binary (5, 5, 3)-code and that the actual value of
$A_2(5, 3)$ is just 4.
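
The bound is simple to evaluate. The sketch below is an illustration of ours, with the hypothetical name `hamming_bound`; it takes t to be the largest integer with 2t + 1 ≤ d, which coincides with the theorem when d is odd and is an assumption of the sketch when d is even.

```python
from math import comb

def hamming_bound(q, n, d):
    """Largest M permitted by (2.17) for a q-ary code of length n and distance d."""
    t = (d - 1) // 2  # spheres of radius t are disjoint when d >= 2t + 1
    volume = sum(comb(n, m) * (q - 1) ** m for m in range(t + 1))
    return q ** n // volume

print(hamming_bound(2, 5, 3))   # the example in the text: M <= 5
print(hamming_bound(2, 7, 3))   # compare with the entry for n = 7, d = 3 in Table 2.4
```

As the text stresses, the value returned is only an upper bound on $A_q(n, d)$: for n = 5, d = 3 it gives 5, whereas the true value is 4.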

Perfect codes

A code which achieves the sphere-packing bound, i.e. such that
equality occurs in (2.17), is called a perfect code. Thus, for a
perfect t-error-correcting code, the M spheres of radius t centred
on codewords ‘fill’ the whole space $(F_q)^n$ without overlapping.
Or, in other words, every vector in $(F_q)^n$ is at distance $\le t$ from
exactly one codeword.
The binary repetition code
$$\{00\cdots0,\ 11\cdots1\}$$
of length n, where n is odd, is a perfect (n, 2, n)-code. Such
codes, together with codes which contain just one codeword or
which are the whole of $(F_q)^n$, are known as trivial perfect codes.
The problem of finding all perfect codes has provided mathematicians
with one of the greatest challenges in coding theory
and we shall return to this problem in Chapter 9. We will
conclude this chapter by giving, in Example 2.23, an example of
a non-trivial perfect code. An alternative construction, as one of
the family of so-called perfect Hamming codes, will be given in
Chapter 8, while the present construction will be generalized in
Exercise 2.15 to a class of binary codes known as Hadamard
codes.
The construction given here will be based on one of a family of
configurations known as block designs, which we now introduce.

Balanced block designs


Definition A balanced block design consists of a set S of v
elements, called points or varieties, and a collection of b subsets
of S, called blocks, such that, for some fixed k, r and λ,
(1) each block contains exactly k points,
(2) each point lies in exactly r blocks,
(3) each pair of points occurs together in exactly λ blocks.
Such a design is referred to as a (b, v, r, k, λ)-design.
Example 2.19 Take S={1,2,3,4,5,6,7} and consider the
following subsets of S: {1,2,4}, {2,3,5}, {3,4,6}, {4,5, 7},
{5, 6,1}, {6, 7,2}, {7, 1, 3}.
It is easily verified that each pair of elements of S occurs
together in exactly one block. Thus the subsets form the blocks
of a (7, 7, 3, 3, 1)-design.
There is a simple geometrical representation of this design (see
Fig. 2.20). The elements 1,2,...,7 are represented by points
and the blocks by lines (6 straight lines and a circle). This is
known as the seven-point plane, the Fano plane, or the projective
plane of order 2.

Fig. 2.20 The seven-point plane

The elements of the set S of a block design are often called
varieties because such designs were originally used in statistical
experiments, particularly in agriculture. For example, suppose
that we have v varieties of fertilizer to be tested on b crops and
that we are particularly interested in the effects of pairs of
fertilizers acting together on the same crop. By using a balanced
block design, each of the b crops can be tested with a block of k
varieties of fertilizer, in such a way that each pair of varieties is
tested together a constant number λ of times. Thus the design is
balanced so far as comparison between pairs of fertilizers is
concerned.

Example 2.21 If we have 7 varieties of fertilizer (labelled
1, 2, ..., 7) and 7 crops, then, using the (7, 7, 3, 3, 1)-design of
Example 2.19, we could treat the first crop with the block of
varieties {1, 2, 4}, the second crop with {2, 3, 5} and so on. The
schedule can be displayed as follows:

Figure 2.22

                         Blocks
               B1  B2  B3  B4  B5  B6  B7
            1   1   0   0   0   1   0   1
            2   1   1   0   0   0   1   0
            3   0   1   1   0   0   0   1
 Varieties  4   1   0   1   1   0   0   0
            5   0   1   0   1   1   0   0
            6   0   0   1   0   1   1   0
            7   0   0   0   1   0   1   1

The 7 × 7 matrix of 0s and 1s thus obtained is called an incidence
matrix of the design. More formally we have:

Definition The incidence matrix $A = [a_{ij}]$ of a block design is a
v × b matrix in which the rows correspond to the varieties
$x_1, x_2, \ldots, x_v$ and the columns to the blocks $B_1, B_2, \ldots, B_b$, and
whose i, jth entry is defined by
$$a_{ij} = \begin{cases} 1 & \text{if } x_i \in B_j \\ 0 & \text{if } x_i \notin B_j. \end{cases}$$
We now construct our example of a non-trivial perfect code.

Example 2.23 Let A be the incidence matrix of Fig. 2.22 and let
B be the 7 × 7 matrix obtained from A by replacing all 0s by 1s
and all 1s by 0s. Let C be the length 7 code whose 16 codewords
are the rows $a_1, a_2, \ldots, a_7$ of A, the rows $b_1, b_2, \ldots, b_7$ of B and
the additional vectors $\mathbf{0} = 0000000$ and $\mathbf{1} = 1111111$. Thus

  C = 0000000 = 0      0101100 = a5     1001110 = b3
      1111111 = 1      0010110 = a6     0100111 = b4
      1000101 = a1     0001011 = a7     1010011 = b5
      1100010 = a2     0111010 = b1     1101001 = b6
      0110001 = a3     0011101 = b2     1110100 = b7
      1011000 = a4
We will show that the minimum distance of C is 3, i.e. that
$d(x, y) \ge 3$ for any pair of distinct codewords x, y. By the incidence
properties of the (7, 7, 3, 3, 1)-design, each row of A has exactly
3 1s and any two distinct rows of A have exactly one 1 in
common. Hence, by Lemma 2.6,
$$d(a_i, a_j) = 3 + 3 - 2 \cdot 1 = 4 \quad \text{for } i \ne j.$$
Since distances between codewords are unchanged if all 0s are
changed to 1s and all 1s to 0s, we have also that
$$d(b_i, b_j) = 4 \quad \text{for } i \ne j.$$
It is clear that
$$d(\mathbf{0}, y) = 3, 4 \text{ or } 7 \text{ according as } y = a_i, b_j \text{ or } \mathbf{1},$$
$$d(\mathbf{1}, y) = 3, 4 \text{ or } 7 \text{ according as } y = b_i, a_j \text{ or } \mathbf{0},$$
and $d(a_i, b_i) = 7$ for i = 1, 2, ..., 7.

It remains only to consider $d(a_i, b_j)$ for $i \ne j$. But $a_i$ and $b_j$
differ precisely in those places where $a_i$ and $a_j$ agree and so
$$d(a_i, b_j) = 7 - d(a_i, a_j) = 7 - 4 = 3.$$

We have now shown that C is a (7, 16, 3)-code and since
$$16\left\{\binom{7}{0} + \binom{7}{1}\right\} = 2^7$$
we have equality in (2.18) and so the code is perfect.
The existence of a perfect binary (7, 16, 3)-code shows that
$A_2(7, 3) = 16$ and so we have established another of the entries of
Table 2.4.
In leaving the code of Example 2.23 we note that it has the
remarkable property that the sum of any two codewords is also a
codeword! Interestingly, the (5, 4, 3)-code of Example 2.2 has
the same property. Such codes are called linear codes and play a
central role in coding theory. We shall begin to study the theory
of such codes in Chapter 5.
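
The whole of Example 2.23 can be reproduced in a few lines. The Python sketch below is our own illustration: it rebuilds the incidence matrix from the blocks of Example 2.19, forms the 16 codewords as strings, and checks the three properties claimed in the text. The variable names are assumptions of the sketch.

```python
from itertools import combinations

# Blocks B1, ..., B7 of the (7, 7, 3, 3, 1)-design of Example 2.19.
blocks = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7}, {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]

# Row i of A has a 1 in column j exactly when point i lies in block j.
rows_a = ["".join("1" if point in block else "0" for block in blocks)
          for point in range(1, 8)]
# B is obtained from A by interchanging 0s and 1s.
rows_b = ["".join("1" if bit == "0" else "0" for bit in row) for row in rows_a]
code = ["0000000", "1111111"] + rows_a + rows_b

def distance(x, y):
    """Hamming distance."""
    return sum(a != b for a, b in zip(x, y))

def add(x, y):
    """Sum of two binary words, componentwise modulo 2."""
    return "".join(str((int(a) + int(b)) % 2) for a, b in zip(x, y))

min_dist = min(distance(x, y) for x, y in combinations(code, 2))
is_perfect = len(code) * (1 + 7) == 2 ** 7          # equality in (2.18) with t = 1
is_linear = all(add(x, y) in code for x in code for y in code)
print(len(code), min_dist, is_perfect, is_linear)
```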

Concluding remarks on Chapter 2


(1) It is not recommended that the reader spends a lot of
time on the unresolved cases in Table 2.4, for many man-hours
have so far failed to improve on the current best bounds.
However, the manner in which one entry, $A_2(15, 5) = 256$, was
obtained (Nordstrom and Robinson 1967) might give some
encouragement to the amateur. It was previously known only
that $128 \le A_2(15, 5) \le 256$ and this case was chosen by Robinson
as an example of a problem which he posed to high school
students in an introductory talk on coding theory. One of them,
named Alan Nordstrom, accepted the challenge and, by trial
and error, constructed a (15, 256, 5)-code, the now-famous
Nordstrom-Robinson code. A construction of this code will be
given in Exercise 9.9.
It might be felt that all optimal codes of moderate length
should be obtainable by means of exhaustive computer searches.
But an estimate of the time needed to find whether there exists,
say, a binary (10, 73, 3)-code shows how difficult this would be.
In fact, computer-aided searches have so far met with distinctly
limited success; almost all the good codes known have arisen out
of their discoverers’ ingenuity.
(2) For binary codes, the sphere-packing bound turns out to
be reasonably good for cases $n \ge 2d + 1$. Unfortunately, it
becomes very weak for n < 2d, but in such cases there is a much
sharper bound, due to Plotkin (1960), which will be derived in
Exercises 2.20-22. [For some recent analogous results on ternary
codes, see Mackenzie and Seberry (1984). For some bounds on
binary (n, M, d)-codes with n slightly greater than 2d, see
Tietäväinen (1980).]

The reader who wishes to progress quickly to the main stream
of coding theory, which is the theory of linear codes, need not
dwell on the remaining remarks of this chapter for too long and
may also leave Exercises 2.12 to 2.24 for the time being.
(3) The parameters of a (b, v, r, k, λ)-design are not independent,
for they satisfy the following two conditions (see
Exercise 2.13):
$$bk = vr \qquad (2.24)$$
$$r(k - 1) = \lambda(v - 1). \qquad (2.25)$$
However, if five numbers b, v, r, k, λ satisfy (2.24) and (2.25),
there is no guarantee that a (b, v, r, k, λ)-design exists. For
example it is known that there does not exist a (43, 43, 7, 7, 1)-design.
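
Conditions (2.24) and (2.25) can be checked on a concrete design. The Python sketch below is an illustration of ours; the function name `design_parameters` is hypothetical, and it simply tests the three defining properties of a balanced block design directly.

```python
from itertools import combinations

def design_parameters(points, blocks):
    """Return (b, v, r, k, lam) if the blocks form a balanced block design, else None."""
    ks = {len(B) for B in blocks}                                   # block sizes
    rs = {sum(p in B for B in blocks) for p in points}              # blocks through each point
    lams = {sum(p in B and q in B for B in blocks)                  # blocks through each pair
            for p, q in combinations(points, 2)}
    if len(ks) == len(rs) == len(lams) == 1:
        return (len(blocks), len(points), rs.pop(), ks.pop(), lams.pop())
    return None

points = range(1, 8)
blocks = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7}, {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]
params = design_parameters(points, blocks)
b, v, r, k, lam = params
print(params, b * k == v * r, r * (k - 1) == lam * (v - 1))
```

Note that the converse check is not possible in this way: as the text remarks, numbers satisfying both conditions need not correspond to any design.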
(4) A block design is called symmetric if v = b (and so also,
by (2.24), k = r), and is referred to simply as a (v, k, λ)-design.
There are two types of (v, k, λ)-design which will be of particular
interest to us.
(i) A finite projective plane is a symmetric design for which
λ = 1. If we put k = n + 1, then n is called the order of the plane.
By (2.25), we then have $v = n^2 + n + 1$, and so a projective plane
of order n is an $(n^2 + n + 1, n + 1, 1)$-design. Such a design exists
whenever n is a prime power (see Exercise 4.7).
(ii) A (4t − 1, 2t − 1, t − 1)-design is called a Hadamard
design.
We see that the (7, 3, 1)-design of Example 2.19 is both a
projective plane of order 2 and a Hadamard design with t = 2.
(5) Further relations on the five parameters of a
(b, v, r, k, λ)-design have been found by making ingenious use of
the incidence matrix. The best known is the very simple, but by
no means obvious, result that
$$v \le b \qquad (2.26)$$
obtained by the statistician R. A. Fisher in 1940.
For the particular case of symmetric designs, the following
fundamental theorem was proved by Bruck, Ryser and Chowla
in 1950.

Theorem 2.27 If a (v, k, λ)-design exists, then
(i) if v is even, k − λ is a square;
(ii) if v is odd, the equation $z^2 = (k - \lambda)x^2 + (-1)^{(v-1)/2}\lambda y^2$
has a solution in integers x, y, z not all zero.
It is an unsolved problem to determine whether the necessary
condition of Theorem 2.27, together with (2.24) and (2.25), form
a set of sufficient conditions for the existence of a symmetric
design. There are many parameters for which the existence of the
design is undecided, a particularly interesting case being the
projective plane of order 10, with parameters (v, k, λ) =
(111, 11, 1).
For full details of these, and other, results on block designs the
reader is referred to Anderson (1974) or Hall (1980).
(6) A generalization of block designs to so-called ‘t-designs’
will be considered in Chapter 9.
Exercises 2

Questions should not be answered simply by referring to Table
2.4.
2.1 Construct, if possible, binary (n, M, d)-codes with the
following parameters: (6, 2, 6), (3, 8, 1), (4, 8, 2), (5, 3, 4),
(8, 30, 3). (When not possible, show why not possible.)
2.2 Show that if there exists a binary (n, M, d)-code, then
there exists a binary (n − 1, M′, d)-code with $M' \ge M/2$.
Deduce that $A_2(n, d) \le 2A_2(n - 1, d)$.
2.3 Prove that $A_q(3, 2) = q^2$ for any integer $q \ge 2$. [Hint: See
Exercise 1.5.]
2.4 Let $E_n$ denote the set of all vectors in $(F_2)^n$ which have
even weight. Show that $E_n$ is the code obtained by adding
an overall parity check to the code $(F_2)^{n-1}$. Deduce that $E_n$
is an $(n, 2^{n-1}, 2)$-code.
2.5 Consider an entry to a football pool made by selecting 10
matches at random from a total of 50 and ‘choosing any 8
from 10’. Show that if exactly 8 of the 50 matches finish as
draws, the odds against the above entry containing a
winning line are greater than 10 million to 1.
2.6 Show that if there is a binary (n, M, d)-code with d even,
then there exists a binary (n, M, d)-code in which all the
codewords have even weight.
2.7 Show that the number of inequivalent binary codes of
length n and containing just two codewords is n.
2.8 Show that $A_2(8, 5) = 4$ and that, up to equivalence, there
is just one binary (8, 4, 5)-code.
2.9 Show that any q-ary (n, q, n)-code is equivalent to a
repetition code.
2.10 Show that a q-ary (q + 1, M, 3)-code satisfies $M \le q^{q-1}$.
2.11 Show that $A_2(8, 4) = 16$.
2.12 Listed below are the blocks of an (11, 5, 2)-design. Use
this to construct a binary (11, 24, 5)-code.
{1, 3, 4, 5, 9}, {2, 4, 5, 6, 10}, {3, 5, 6, 7, 11},
{1, 4, 6, 7, 8}, {2, 5, 7, 8, 9}, {3, 6, 8, 9, 10},
{4, 7, 9, 10, 11}, {1, 5, 8, 10, 11}, {1, 2, 6, 9, 11},
{1, 2, 3, 7, 10}, {2, 3, 4, 8, 11}.
[Remark: We see from Table 2.4 that $A_2(11, 5) = 24$ and
so the code constructed here is the largest binary double-error-correcting
code of length 11. We shall prove this in
Exercise 2.22(iv).]
Show that the sphere-packing bound for a binary
(11, M, 5)-code gives only $M \le 30$.
2.13 Show that the parameters of a (b, v, r, k, λ)-design satisfy
(i) bk = vr, (ii) r(k − 1) = λ(v − 1). [Hint for (i): Count in
two ways the number of ordered pairs in the set {(x, B): x
is a point, B is a block and $x \in B$}.]
2.14 Show that there do not exist (b, v, r, k, λ)-designs with the
parameters: (i) (12, 8, 6, 4, 3), (ii) (22, 22, 7, 7, 2).
2.15 Show that if there exists a Hadamard (4t − 1, 2t − 1, t − 1)-design,
then $A_2(4t - 1, 2t - 1) \ge 8t$.
2.16 Let C be the binary code consisting of all cyclic shifts of
the vectors 11010000, 11100100 and 10101010, together
with $\mathbf{0}$ and $\mathbf{1}$. (A cyclic shift of $a_1a_2\cdots a_n$ is a vector of the
form $a_i a_{i+1}\cdots a_n a_1 a_2 \cdots a_{i-1}$.) Show that C is an (8, 20, 3)-code.
When showing that d(C) = 3, the cyclic nature of
the code reduces the number of evaluations of d(x, y)
required from $\binom{20}{2}$ to ··· (how many?).
2.17 [The (u | u + v) construction of Plotkin (1960).] Given
$u = u_1\cdots u_m$ and $v = v_1\cdots v_n$, let (u | v) denote the vector
$u_1\cdots u_m v_1\cdots v_n$ of length m + n. Suppose that $C_1$ is a
binary $(n, M_1, d_1)$-code and that $C_2$ is a binary $(n, M_2, d_2)$-code.
Form a new code $C_3$ consisting of all vectors of the
form (u | u + v), where $u \in C_1$, $v \in C_2$. Show that $C_3$ is a
$(2n, M_1M_2, d)$-code with $d = \min\{2d_1, d_2\}$.
2.18 Prove that $A_2(16, 3) \ge 2560$. [Hint: Use Exercises 2.16 and
2.17.]
2.19 Starting from the (4, 8, 2) even-weight code (see Exercise
2.4) and the (4, 2, 4) repetition code, apply Exercise 2.17
three times to show that there exists a binary (32, 64, 16)-code.
[Remark: The $(2^m, 2^{m+1}, 2^{m-1})$-codes, which may be
constructed in this way for each positive integer $m \ge 1$, are
called first-order Reed-Muller codes.]

The aim of the next three exercises is to derive the so-called
Plotkin bound.
2.20 Show that if C is a binary (n, M, d)-code with n < 2d, then
$$M \le \begin{cases} 2d/(2d - n) & \text{if } M \text{ is even} \\ 2d/(2d - n) - 1 & \text{if } M \text{ is odd.} \end{cases}$$
[Hint: let $C = \{x_1, x_2, \ldots, x_M\}$ and let T be the $\binom{M}{2} \times n$
matrix whose rows are the vectors $x_i + x_j$, $1 \le i < j \le M$.
Estimate the number w(T) of non-zero entries of T in two
ways, via rows and via columns.]
2.21 Deduce from Exercise 2.20 that, if n < 2d, then
$$A_2(n, d) \le 2\lfloor d/(2d - n) \rfloor.$$
State the upper bounds this gives on $A_2(9, 5)$ and on
$A_2(10, 6)$. How can the bound on $A_2(9, 5)$ be improved?
[Remark: As for this case, it happens in general that the
above bound is good for d even, but is open to improvement
for d odd; we make that improvement in the next
exercise.]
2.22 Show that
(i) if d is even and n < 2d, then
$$A_2(n, d) \le 2\lfloor d/(2d - n) \rfloor,$$
(ii) if d is odd and n < 2d + 1, then
$$A_2(n, d) \le 2\lfloor (d + 1)/(2d + 1 - n) \rfloor,$$
(iii) if d is even, then $A_2(2d, d) \le 4d$,
(iv) if d is odd, then $A_2(2d + 1, d) \le 4d + 4$.
(i) to (iv) are known collectively as the Plotkin bound.
2.23 Show that the (32, 64, 16)-code of Exercise 2.19 is optimal.
Generalize this result by proving that $A_2(2d, d) = 4d$
whenever d is a power of 2.
2.24 Show that if there exists a Hadamard (4t − 1, 2t − 1, t − 1)-design,
then $A_2(4t, 2t) = 8t$.
3 An introduction to finite fields

To make error-correcting codes easier to use and analyse, it is
necessary to impose some algebraic structure on them. It is
especially useful to have an alphabet in which it is possible to
add, subtract, multiply and divide without restriction. In other
words we wish to give $F_q$ the structure of a field, the formal
definition of which follows.

Definition A field F is a set of elements with two operations +
(called addition) and · (multiplication) satisfying the following
properties.
(i) F is closed under + and ·, i.e. a + b and a · b are in F
whenever a and b are in F.
For all a, b and c in F, the following laws hold.
(ii) Commutative laws: a + b = b + a, a · b = b · a.
(iii) Associative laws: (a + b) + c = a + (b + c), a · (b · c) = (a · b) · c.
(iv) Distributive law: a · (b + c) = a · b + a · c.
Furthermore, identity elements 0 and 1 must exist in F satisfying
(v) a + 0 = a for all a in F.
(vi) a · 1 = a for all a in F.
(vii) For any a in F, there exists an additive inverse element
(−a) in F such that a + (−a) = 0.
(viii) For any a ≠ 0 in F, there exists a multiplicative inverse
element $a^{-1}$ in F such that $a \cdot a^{-1} = 1$.

Notes
(1) From now on we will generally write a · b simply as ab.
(2) We can regard a field F as having the four operations +, −,
· and ÷, where − and ÷ are given by (vii) and (viii)
respectively with the understanding that a − b = a + (−b)
and a ÷ b, or a/b, = $a(b^{-1})$ for b ≠ 0.
(3) The reader who has done any group theory will recognize
that a field can be more concisely defined to be a set of
elements such that
(a) it is an abelian group under +,
(b) the non-zero elements form an abelian group under ·,
(c) the distributive law holds.
(4) The following two further properties of a field are easily
deduced from the definition.

Lemma 3.1 Any field F has the following properties.
(i) a0 = 0 for all a in F.
(ii) ab = 0 ⇒ a = 0 or b = 0. (Thus the product of two non-zero
elements of a field is also non-zero.)

Proof (i) We have a0 = a(0 + 0) = a0 + a0. Adding the additive
inverse of a0 to both sides gives
$$0 = a0 + (-a0) = a0 + a0 + (-a0) = a0 + 0 = a0.$$
Thus a0 = 0.
(ii) Suppose ab = 0. If a ≠ 0, then a has a multiplicative
inverse and so $b = 1 \cdot b = (a^{-1}a)b = a^{-1}(ab) = a^{-1}0 = 0$. Hence
ab = 0 ⇒ a = 0 or b = 0.

Definition A set of elements with + and · satisfying the field
properties (i) to (vii), but not necessarily (viii), is called a ring.

Remark For convenience, we have defined a ‘ring’ to be a
structure which should properly be called a ‘commutative (or
abelian) ring, with an identity’.
Familiar examples of infinite fields are the set of real numbers
and the set of complex numbers. The set Z of integers is a ring
but is not a field because, for example, 2 does not have a
multiplicative inverse in Z. Another example of a ring which is
not a field is the set F[x] of polynomials in x with coefficients
belonging to a field F. This ring will be of importance in Chapter
12.

Definition A finite field is a field which has a finite number of
elements, this number being called the order of the field.
The following fundamental result about finite fields was proved
by Évariste Galois (1811-32), a French mathematician who died
in a duel at the age of 20. Galois is famous also for proving that
the general quintic equation is not solvable by radicals.

Theorem 3.2 There exists a field of order q if and only if q is a
prime power (i.e. $q = p^h$, where p is prime and h is a positive
integer). Furthermore, if q is a prime power, then there is, up to
relabelling, only one field of that order.

A field of order q is often called a Galois field of order q and is
denoted GF(q).
The proof of Theorem 3.2 may be found in one of the more
advanced texts on coding theory or in books on abstract algebra.
While we shall give a partial proof in Exercise 4.6, and shall give
a brief description of fields of order $p^h$, with h > 1, in Chapter
12, it is enough for almost all our purposes to consider only
prime fields, those of order a prime number p. We shall see
shortly that if p is prime, then GF(p) is just the set
{0, 1, ..., p − 1} with arithmetic carried out modulo p. But first
we review modular arithmetic in general.

Definition Let m be a fixed positive integer. Two integers a and
b are said to be congruent (modulo m), symbolized by
$$a \equiv b \pmod{m},$$
if a − b is divisible by m, i.e. if a = km + b for some integer k.
We write $a \not\equiv b \pmod{m}$ if a and b are not congruent (modulo
m).
Every integer, when divided by m, has a unique principal
remainder equal to one of the integers in the set $Z_m = \{0, 1, \ldots, m-1\}$.
It is easily shown that two integers are congruent
(mod m) if and only if they have the same principal remainders
on division by m.

Examples $3 \equiv 24 \pmod 7$, $13 \equiv -2 \pmod 5$, $25 \not\equiv 12 \pmod 7$,
$15 \equiv 0 \pmod 3$, $15 \equiv 0 \pmod 5$, $15 \not\equiv 0 \pmod 2$.

Theorem 3.3 Suppose $a \equiv a' \pmod m$ and $b \equiv b' \pmod m$.
Then
(i) $a + b \equiv a' + b' \pmod m$
(ii) $ab \equiv a'b' \pmod m$.

Proof $a = a' + km$ and $b = b' + lm$ for some integers k and l.
Then (i) $a + b = a' + b' + (k + l)m$ and so $a + b \equiv a' + b' \pmod m$
and (ii) $ab = a'b' + (kb' + a'l + klm)m$ and so
$ab \equiv a'b' \pmod m$.

Theorem 3.3 enables congruences to be calculated without
working with large numbers. Note that if $a \equiv a' \pmod m$, then repeated
use of (ii) shows that, for all positive integers
n, $a^n \equiv (a')^n \pmod m$.

Examples 3.4 (i) What is the principal remainder when 73 - 52


is divided by 7?
(ii) Determine whether (2'°)(14%°) + 1 is divisible by 11.

Solution (i) $73 \equiv 3 \pmod 7$ and $52 \equiv 3 \pmod 7$. Hence, by
Theorem 3.3(ii), $73 \cdot 52 \equiv 3 \cdot 3 = 9 \equiv 2 \pmod 7$. So the principal
remainder is 2. (There is no need actually to multiply 73 by 52
and divide the answer by 7.)
(ii) Note that $2^5 = 32 \equiv -1 \pmod{11}$. Also $14^2 \equiv 3^2 \equiv -2 \pmod{11}$. Hence
$$(2^{15})(14^{40}) \equiv (2^5)^3 (3^2)^{20} \equiv (-1)^3 (-2)^{20} = (-1)(2^{20}) = (-1)(2^5)^4 \equiv (-1)(-1)^4 = -1 \pmod{11}.$$
Thus $(2^{15})(14^{40}) + 1 \equiv 0 \pmod{11}$, i.e. the number is divisible
by 11.
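The two calculations of Examples 3.4 can be checked numerically. The following is a minimal Python sketch, not part of the original text; the names `principal_remainder`, `r` and `s` are illustrative. It relies on the fact that Python's `%` operator returns the principal remainder for a positive modulus, and that the three-argument form of `pow` reduces modulo m at each step, which is Theorem 3.3(ii) applied repeatedly.

```python
def principal_remainder(a: int, m: int) -> int:
    """Principal remainder of a on division by m (m > 0)."""
    return a % m

# (i) Reduce each factor modulo 7 first, then multiply (Theorem 3.3(ii)).
r = (principal_remainder(73, 7) * principal_remainder(52, 7)) % 7

# (ii) (2^15)(14^40) + 1 modulo 11, without forming the large number.
s = (pow(2, 15, 11) * pow(14, 40, 11) + 1) % 11

print(r, s)
```

A result of 0 for `s` means the number in part (ii) is divisible by 11.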

Let us now try to give $Z_m = \{0, 1, \ldots, m-1\}$ the structure of
a field. We define addition and multiplication in $Z_m$ by: $a + b$ (or
$ab$) = the principal remainder when $a + b$ (or $ab$) is divided by
$m$.
For example, in $Z_{12}$ we have
$8 + 4 = 0$, $9 + 11 = 8$, $3 \cdot 4 = 0$, $3 \cdot 9 = 3$.
Theorem 3.3 shows that addition and multiplication in $Z_m$ are
well-defined and it is easily verified that the field properties (i) to
(vii) are satisfied for any $m$ (the additive inverse of $a$ is $m - a$).
Thus, for any integer $m \ge 2$, $Z_m$ is a ring. It is called the ring of
integers modulo $m$. But for which values of $m$ is field property
(viii) satisfied? The following theorem gives the answer.
Theorem 3.5 $Z_m$ is a field if and only if $m$ is a prime number.

Proof First, suppose $m$ is not prime. Then $m = ab$ for some
integers $a$ and $b$, both less than $m$. Thus
$$ab \equiv 0 \pmod m, \quad \text{with } a \not\equiv 0 \pmod m \text{ and } b \not\equiv 0 \pmod m.$$
So, in $Z_m$, the product of the non-zero elements $a$ and $b$ is zero
and so, by Lemma 3.1(ii), $Z_m$ is not a field.
Now suppose that $m$ is prime. By the remarks preceding this
theorem, to show that $Z_m$ is a field it is enough to show that
every non-zero element of $Z_m$ has a multiplicative inverse. Let $a$
be a non-zero element of $Z_m$ and consider the $m - 1$ elements
$1a, 2a, \ldots, (m-1)a$. These elements are non-zero, for $ia$
cannot have the prime $m$ as a divisor if $i$ and $a$ do not. Also the
elements are distinct from one another, for
$ia = ja \Rightarrow (i - j)a \equiv 0 \pmod m$
$\Rightarrow$ $m$ is a divisor of $(i - j)a$
$\Rightarrow$ $m$ is a divisor of $i - j$, since $m$ is prime and does not divide $a$
$\Rightarrow$ $i = j$, since both $i$ and $j \in \{1, 2, \ldots, m-1\}$.
So, in $Z_m$, the $m - 1$ elements $1a, 2a, \ldots, (m-1)a$ must be
equal to the elements $1, 2, \ldots, m-1$, in some order, and one of
them, $ja$ say, must be equal to 1. This $j$ is the desired inverse of
$a$.
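The argument of the proof can be tried out directly: list the multiples $1a, 2a, \ldots, (m-1)a$ in $Z_m$ and look for one equal to 1. The sketch below is an illustration only (the function names are hypothetical, not from the text), and the search is deliberately naive, mirroring the existence proof rather than an efficient method.

```python
def inverse_by_multiples(a: int, m: int):
    """Return j in {1, ..., m-1} with j*a = 1 in Z_m, or None if there is none."""
    for j in range(1, m):
        if (j * a) % m == 1:
            return j
    return None

def is_field(m: int) -> bool:
    """Z_m is a field exactly when every non-zero element has an inverse."""
    return m >= 2 and all(
        inverse_by_multiples(a, m) is not None for a in range(1, m)
    )

# The moduli below 20 for which Z_m is a field are the primes.
print([m for m in range(2, 20) if is_field(m)])

# For prime m the multiples of a are a rearrangement of 1, ..., m-1.
print(sorted((j * 3) % 7 for j in range(1, 7)))
```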

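Examples 3.6 (1) GF(2) = $Z_2 = \{0, 1\}$ with addition and multiplication tables

 + | 0 1        · | 0 1
 --+----        --+----
 0 | 0 1        0 | 0 0
 1 | 1 0        1 | 0 1

(2) GF(3) = $Z_3 = \{0, 1, 2\}$ with tables

 + | 0 1 2      · | 0 1 2
 --+------      --+------
 0 | 0 1 2      0 | 0 0 0
 1 | 1 2 0      1 | 0 1 2
 2 | 2 0 1      2 | 0 2 1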
(3) $Z_4$ is not a field by Theorem 3.5 (examination of the
multiplication table of $Z_4$ shows that 2 does not have an inverse
and so we cannot divide by 2 in $Z_4$). However, while $4 = 2^2$ is not
prime, it is a prime power, and so the field GF(4) does exist, by
Theorem 3.2. It can be defined as GF(4) = $\{0, 1, a, b\}$ with
tables

 + | 0 1 a b      · | 0 1 a b
 --+--------      --+--------
 0 | 0 1 a b      0 | 0 0 0 0
 1 | 1 0 b a      1 | 0 1 a b
 a | a b 0 1      a | 0 a b 1
 b | b a 1 0      b | 0 b 1 a

We shall meet this field in its natural setting in Example 12.2.


(4) $Z_6$ and $Z_{10}$ are not fields, nor is there any field of order 6
or 10.
(5) GF(11) = $Z_{11} = \{0, 1, 2, \ldots, 10\}$ is a field. We can easily
carry out addition, subtraction and multiplication (modulo 11)
without using tables. But what about division? Remember, to
divide $a$ by $b$, we just multiply $a$ by $b^{-1}$. So how do we find $b^{-1}$?
The proof of Theorem 3.5 shows the existence of multiplicative
inverses but not how to find them efficiently. Two methods for a
general prime modulus $m$ are described in Exercises 3.8 and 3.9.
For a modulus as small as $m = 11$ it is easy to construct, by trial
and error, a table of inverses, thus:

 x       | 1  2  3  4  5  6  7  8  9  10
 x^{-1}  | 1  6  4  3  9  2  8  7  5  10

To illustrate the use of this table, we will divide 6 by 8 in the field
GF(11). We have
$$\frac{6}{8} = 6 \cdot 8^{-1} = 6 \cdot 7 = 42 = 9.$$
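The table of inverses and the division above can be reproduced by a short search. This is a sketch in Python under the assumption that elements of GF(11) are represented as the integers 0 to 10; the names `P`, `inverse` and `divide` are illustrative and do not come from the text.

```python
P = 11  # the prime modulus of GF(11)

# Trial and error: for each non-zero x, find the y with x*y = 1 (mod 11).
inverse = {
    x: next(y for y in range(1, P) if (x * y) % P == 1)
    for x in range(1, P)
}

def divide(a: int, b: int) -> int:
    """a / b in GF(11), computed as a * b^{-1}; b must be non-zero."""
    return (a * inverse[b]) % P

print([inverse[x] for x in range(1, P)])
print(divide(6, 8))
```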
We can give an immediate application of the use of modulo 11
arithmetic in an error-detecting code.

The ISBN code

Every recent book should have an International Standard Book


Number (ISBN). This is a 10-digit codeword assigned by the
publisher. For example, a book might have the ISBN
0-19-859617-0
although the hyphens may appear in different places and are in
fact unimportant. The first digit, 0, indicates the language
(English) and the next two digits 19 stand for Oxford University
Press, the publishers. The next six digits 859617 are the book
number assigned by the publisher, and the final digit is chosen to
make the whole 10-digit number $x_1 x_2 \cdots x_{10}$ satisfy
$$\sum_{i=1}^{10} i x_i \equiv 0 \pmod{11}. \tag{3.7}$$
The left-hand side of (3.7) is called the weighted check sum of the
number $x_1 x_2 \cdots x_{10}$. Thus for the 9-digit number $x_1 x_2 \cdots x_9$
already chosen, $x_{10}$ is defined by
$$x_{10} = \sum_{i=1}^{9} i x_i \pmod{11}$$
to get the ISBN (this satisfies (3.7) because $10 \equiv -1 \pmod{11}$).
The publisher is forced to allow a symbol X in the final
position if the check digit $x_{10}$ turns out to be a '10'; e.g.
Chambers Twentieth Century Dictionary has ISBN 0550-10206-X.
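The rule for the check digit can be written out as a few lines of Python. This is a minimal sketch, assuming the first nine digits are supplied as a string in which hyphens are ignored; the function name `isbn10_check_digit` is illustrative.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Check symbol x_10 = sum_{i=1}^{9} i*x_i (mod 11), written as 'X' when it is 10."""
    digits = [int(c) for c in first_nine if c.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected exactly nine digits")
    total = sum(i * x for i, x in enumerate(digits, start=1)) % 11
    return "X" if total == 10 else str(total)

print(isbn10_check_digit("0-19-859617"))  # the book number quoted above
print(isbn10_check_digit("0550-10206"))   # Chambers Twentieth Century Dictionary
```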
The ISBN code is designed to detect (a) any single error and
(b) any double-error created by the transposition of two digits.
The error detection scheme is simply this. For a received vector
$y_1 y_2 \cdots y_{10}$ calculate its weighted check sum $Y = \sum_{i=1}^{10} i y_i$. If
$Y \not\equiv 0 \pmod{11}$, then we have detected error(s). Let us verify that
this works for cases (a) and (b) above. Suppose $\mathbf{x} = x_1 x_2 \cdots x_{10}$ is
the codeword sent.
(a) Suppose the received vector $\mathbf{y} = y_1 y_2 \cdots y_{10}$ is the same as $\mathbf{x}$
except that digit $x_j$ is received as $x_j + a$ with $a \ne 0$. Then
$$Y = \sum_{i=1}^{10} i y_i = \Bigl(\sum_{i=1}^{10} i x_i\Bigr) + ja \equiv ja \not\equiv 0 \pmod{11},$$
since $j$ and $a$ are non-zero.
(b) Suppose $\mathbf{y}$ is the same as $\mathbf{x}$ except that digits $x_j$ and $x_k$ have
been transposed. Then
$$Y = \sum_{i=1}^{10} i y_i = \sum_{i=1}^{10} i x_i + (k - j)x_j + (j - k)x_k \equiv (k - j)(x_j - x_k) \not\equiv 0 \pmod{11},$$
if $k \ne j$ and $x_j \ne x_k$.
Note how crucial use is made of the result (Lemma 3.1(ii)) that
in a field, the product of two non-zero elements is also non-zero.
This does not hold in $Z_{10}$, in which, for example,
$2 \cdot 5 \equiv 0 \pmod{10}$, and this is why we work with modulus 11 rather
than 10. We shall discuss some further codes based on modulo 11
arithmetic in Chapters 7 and 11.
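Claims (a) and (b) can be checked exhaustively for a particular codeword. The sketch below is illustrative only: it takes the ISBN 0-19-859617-0 quoted earlier, treats each symbol as an element of GF(11) (so that 10 stands for X), and confirms that every single error and every transposition of two unequal digits gives a non-zero weighted check sum. The helper names are hypothetical.

```python
def weighted_check_sum(digits):
    """Y = sum of i*y_i for i = 1..10, reduced modulo 11."""
    return sum(i * y for i, y in enumerate(digits, start=1)) % 11

codeword = [0, 1, 9, 8, 5, 9, 6, 1, 7, 0]  # 0-19-859617-0

def single_errors(x):
    """All vectors differing from x in exactly one position (symbols 0..10)."""
    for j in range(len(x)):
        for v in range(11):
            if v != x[j]:
                yield x[:j] + [v] + x[j + 1:]

def transpositions(x):
    """All vectors obtained by swapping two unequal digits of x."""
    for j in range(len(x)):
        for k in range(j + 1, len(x)):
            if x[j] != x[k]:
                y = list(x)
                y[j], y[k] = y[k], y[j]
                yield y

print(weighted_check_sum(codeword))
print(all(weighted_check_sum(y) != 0 for y in single_errors(codeword)))
print(all(weighted_check_sum(y) != 0 for y in transpositions(codeword)))
```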
The ISBN code cannot be used to correct an error unless we
know that just one given digit is in error. This is the basis of the
following party trick.
Ask a friend to choose a book not known to you and to read
out its ISBN, but saying 'x' for one of the digits. After a few
seconds working you announce the value of x. For example, if
the number read out is 0-201-1x-502-7, your working is:
$$1 \cdot 0 + 2 \cdot 2 + 3 \cdot 0 + 4 \cdot 1 + 5 \cdot 1 + 6 \cdot x + 7 \cdot 5 + 8 \cdot 0 + 9 \cdot 2 + 10 \cdot 7 = 0.$$
Hence $6x + 4 = 0$, and so
$$x = -4 \cdot 6^{-1} = 7 \cdot 6^{-1} = 7 \cdot 2 = 14 = 3.$$
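The working of the party trick amounts to solving one linear equation in GF(11). The following Python sketch is an assumption-laden illustration: the unknown digit is marked by a lower-case `x`, an upper-case `X` in the last position stands for 10, and the inverse is taken as a ninth power, using the result of Exercise 3.8 that $a^{-1} = a^{p-2}$ in GF(p). The function name `recover_digit` is hypothetical.

```python
def recover_digit(isbn: str) -> int:
    """Solve for the single unknown symbol 'x' in a 10-symbol ISBN."""
    symbols = [c for c in isbn if c not in "- "]
    if len(symbols) != 10 or symbols.count("x") != 1:
        raise ValueError("expected ten symbols with exactly one 'x'")
    pos = symbols.index("x") + 1  # position j of the unknown, counted from 1
    known = sum(
        i * (10 if c == "X" else int(c))
        for i, c in enumerate(symbols, start=1)
        if c != "x"
    ) % 11
    # pos * x + known = 0 in GF(11), so x = -known * pos^{-1}.
    return (-known * pow(pos, 11 - 2, 11)) % 11

print(recover_digit("0-201-1x-502-7"))
```

A returned value of 10 would correspond to the symbol X, which can only occur in the final position.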

Concluding Remark It is hoped that the reader is beginning to
appreciate the power and versatility of finite fields, which the
author believes to be among the most beautiful structures in
mathematics. One remarkable property of any finite field, not
needed in this book and so not proved here, is that all the
non-zero elements can be expressed as powers of a single
element, which is called a primitive element; i.e. there exists
$g \in GF(q)$ such that the non-zero elements of GF(q) are
precisely $1, g, g^2, \ldots, g^{q-2}$, with $g^{q-1} = 1$. This result is by no
means obvious, even if we restrict our attention to the case of
prime fields. One application of this result is that in a large or
complicated field a table of indices of the non-zero elements,
with respect to a fixed primitive element, can be constructed, and
this can be used, in the same way as logarithms, to carry out
multiplication in the field.
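The idea of an index table can be illustrated for a prime field. The sketch below is restricted to GF(p) with p prime (it does not cover fields of order $p^h$ with $h > 1$), uses GF(13) with $g = 2$ as an example chosen here rather than taken from the text, and its function names are hypothetical.

```python
def is_primitive(g: int, p: int) -> bool:
    """True if the powers g^0, ..., g^(p-2) are all the non-zero elements of GF(p)."""
    return len({pow(g, k, p) for k in range(p - 1)}) == p - 1

def index_table(g: int, p: int) -> dict:
    """Map each non-zero element g^k of GF(p) to its index k."""
    return {pow(g, k, p): k for k in range(p - 1)}

def multiply_via_indices(a: int, b: int, g: int, p: int) -> int:
    """Multiply in GF(p) by adding indices modulo p - 1, as with logarithms."""
    if a == 0 or b == 0:
        return 0
    idx = index_table(g, p)
    return pow(g, (idx[a] + idx[b]) % (p - 1), p)

print(is_primitive(2, 13), is_primitive(3, 13))
print(multiply_via_indices(7, 9, 2, 13))
```

In a real application the index table would be computed once and stored, rather than rebuilt for every product as in this sketch.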
For an encyclopaedic volume on finite fields the reader is
referred to Lidl and Niederreiter (1983).

Exercises 3

3.1 Find the principal remainder when 2” is divided by 7. Find


the units digit of 31°.
3.2 Show that every square integer is congruent (mod 4) to
either 0 or 1. Hence show that there do not exist integers $x$
and $y$ such that $x^2 + y^2 = 1839$.
3.3 Construct a table of multiplicative inverses for (i) GF(7),
(ii) GF(13).
3.4 (i) What is the minimum distance of the ISBN code?
(ii) What proportion of books would you expect to have
an ISBN containing the symbol X?
3.5 Check whether the following are ISBNs.
0-13165332-6
0-1392-4101-4
07-028761-4

3.6 The following ISBNs have been received with smudges.


What are the missing digits?
()- 13-189 139-9
0-02-3 2¢aps0-0

3.7 Consider the code C of all 10-digit numbers over the
10-ary alphabet $\{0, 1, \ldots, 9\}$ which have the property that
the sum of their digits is divisible by 11; that is,
$$C = \Bigl\{ x_1 x_2 \cdots x_{10} : \sum_{i=1}^{10} x_i \equiv 0 \pmod{11} \Bigr\}.$$
Show that C can detect any single error. What would be
the disadvantage of using this code for book numbers
rather than the ISBN code?
3.8 Let $a$ be a non-zero element of GF(p), where $p$ is prime.
By considering the product of the $p - 1$ elements
$1a, 2a, \ldots, (p-1)a$, prove that
$$a^{p-1} \equiv 1 \pmod p \quad \text{(Fermat's theorem).}$$
Deduce that $a^{-1} \equiv a^{p-2} \pmod p$. [Remark: for $p$ large, a
more efficient method of finding $a^{-1}$ is given in the next
exercise.]
3.9 The Euclidean algorithm is a well-known method of
finding the greatest common divisor $d$ of two integers $a$
and $b$. It also enables $d$ to be expressed in the form
$$d = ax + by$$
for some integers $x$ and $y$. Show that the Euclidean
algorithm can therefore be used to find the inverse of an
element $a \ne 0$ in the field GF(p), where $p$ is prime. If you
know the Euclidean algorithm, use it to calculate
$23^{-1} \pmod{31}$.
3.10 Find a primitive element for each of GF(3), GF(7) and
GF(11).
3.11 Suppose F is a finite field. Given that $\alpha \in F$ and $n$ is a
positive integer, let $n\alpha$ denote the element $\alpha + \alpha + \cdots + \alpha$
($n$ terms). Prove that there exists a prime number $p$ such
that $p\alpha = 0$ for all $\alpha \in F$. This prime number $p$ is called
the characteristic of the field F.
3.12 Suppose $p$ is a prime number. Show that $(a + b)^p \equiv a^p + b^p \pmod p$.
[Hint: show that $\binom{p}{i} \equiv 0 \pmod p$ if $1 \le i \le p - 1$.]
Deduce that $a^p \equiv a \pmod p$, for any integer $a$.
(This gives an alternative proof of Fermat's theorem,
Exercise 3.8.)
3.13 In the field GF(q), where q is odd, show that the product
of all the non-zero elements is equal to —1.
3.14 Show that in a finite field of characteristic p,
(i) if p=2, then every element is a square
(ii) if p is odd, then exactly half of the non-zero elements
are squares.
4 Vector spaces over finite fields

In addition to carrying out arithmetical operations within the


alphabet of a code, it is also very useful to be able to perform
certain operations with the codewords themselves. We have
already benefited from this in making use of the ‘sum’ of two
binary vectors to prove Lemma 2.6.
Throughout this chapter we assume that $q$ is a prime power
and we let GF(q) denote the finite field of $q$ elements. The
elements of GF(q) will be called scalars. The set $GF(q)^n$ of all
ordered $n$-tuples over GF(q) will now be denoted by V(n, q)
and its elements will be called vectors.
We define two operations within V(n, q):
(i) addition of vectors: if $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and
$\mathbf{y} = (y_1, y_2, \ldots, y_n) \in V(n, q)$, then
$$\mathbf{x} + \mathbf{y} = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n);$$
(ii) multiplication of a vector by a scalar: if
$\mathbf{x} = (x_1, x_2, \ldots, x_n) \in V(n, q)$ and $a \in GF(q)$,
then $a\mathbf{x} = (ax_1, ax_2, \ldots, ax_n)$.
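The two operations can be sketched in Python for the case of a prime field, where arithmetic in GF(p) is just arithmetic modulo p. This is an illustration only: vectors are represented as tuples of integers in the range 0 to p-1, the function names are hypothetical, and fields of non-prime order such as GF(4) are not covered.

```python
def vec_add(x, y, p):
    """Componentwise sum of two vectors of V(n, p), p prime."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return tuple((a + b) % p for a, b in zip(x, y))

def scalar_mul(a, x, p):
    """Multiply every component of x by the scalar a in GF(p)."""
    return tuple((a * xi) % p for xi in x)

x = (1, 2, 0, 2)
y = (2, 2, 1, 1)
print(vec_add(x, y, 3))     # sum in V(4, 3)
print(scalar_mul(2, x, 3))  # 2x in V(4, 3)
```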


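The reader should have no difficulty in verifying that V(n, q)
satisfies the axioms for a vector space; i.e. that, for all
$\mathbf{u}, \mathbf{v}, \mathbf{w} \in V(n, q)$ and for all $a, b \in GF(q)$,
(i) $\mathbf{u} + \mathbf{v} \in V(n, q)$.
(ii) $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$.
(iii) The all-zero vector $\mathbf{0} = (0, 0, \ldots, 0) \in V(n, q)$ and satisfies
$\mathbf{u} + \mathbf{0} = \mathbf{0} + \mathbf{u} = \mathbf{u}$.
(iv) Given $\mathbf{u} = (u_1, u_2, \ldots, u_n) \in V(n, q)$, the element
$-\mathbf{u} = (-u_1, -u_2, \ldots, -u_n) \in V(n, q)$ and satisfies $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$.
(v) $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$.
(Properties (i)-(v) mean that V(n, q) is an 'abelian
group' under addition.)
(vi) (Closure under scalar multiplication) $a\mathbf{v} \in V(n, q)$.
(vii) (Distributive laws) $a(\mathbf{u} + \mathbf{v}) = a\mathbf{u} + a\mathbf{v}$, $(a + b)\mathbf{u} = a\mathbf{u} + b\mathbf{u}$.
(viii) $(ab)\mathbf{u} = a(b\mathbf{u})$.
(ix) $1\mathbf{u} = \mathbf{u}$, where 1 is the multiplicative identity of GF(q).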
A subset of V(n, q) is called a subspace of V(n, q) if it is itself
a vector space under the same addition and scalar multiplication
as defined for V(n, q).
Trivially, the set {0} and the whole space V(n,q) are
subspaces of V(n,q). A subspace is called non-trivial if it
contains at least one vector other than 0.

Theorem 4.1 A non-empty subset C of V(n, q) is a subspace if


and only if C is closed under addition and scalar multiplication,
i.e. if and only if C satisfies the following two conditions:
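(1) If $\mathbf{x}, \mathbf{y} \in C$, then $\mathbf{x} + \mathbf{y} \in C$.
(2) If $a \in GF(q)$ and $\mathbf{x} \in C$, then $a\mathbf{x} \in C$.

Proof It is readily verified that if C satisfies (1) and (2), then C
satisfies all the axioms (i)-(ix) (with V(n, q) replaced by C) for a
vector space. (To show that $\mathbf{0} \in C$, choose any $\mathbf{x} \in C$; then, by
(2), $\mathbf{0} = 0\mathbf{x} \in C$. Property (2) also shows that if $\mathbf{v} \in C$, then
$-\mathbf{v} \in C$, for $-\mathbf{v} = (-1)\mathbf{v}$.)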

Readers familiar with the theory of vector spaces over infinite


fields, such as the real or complex numbers, will find that
definitions and results generally carry over to the finite case, e.g.
the following.
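A linear combination of $r$ vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ in V(n, q) is a
vector of the form $a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_r\mathbf{v}_r$, where the $a_i$ are
scalars.
It is easily verified that the set of all linear combinations of a
given set of vectors of V(n, q) is a subspace of V(n, q).
A set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is said to be linearly
dependent if there are scalars $a_1, a_2, \ldots, a_r$, not all zero, such
that
$$a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_r\mathbf{v}_r = \mathbf{0}.$$
A set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is called linearly independent if
it is not linearly dependent; i.e. if
$$a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_r\mathbf{v}_r = \mathbf{0} \;\Rightarrow\; a_1 = a_2 = \cdots = a_r = 0.$$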

Let C be a subspace of V(n, q). Then a subset $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$
of C is called a generating set (or spanning set) of C if every
vector in C can be expressed as a linear combination of
$\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$.
A generating set of C which is also linearly independent is
called a basis of C.
For example, the set
{(1,0,0,...,0),(0,1,0,...,0),...,(0,0,...,0,1)}
is a basis of the whole space V(n, q).

Theorem 4.2 Suppose C is a non-trivial subspace of V(n, q).


Then any generating set of C contains a basis of C.

Proof Suppose $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a generating set of C.
If it is linearly dependent, then there are scalars $a_1, a_2, \ldots, a_r$,
not all zero, such that
$$a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_r\mathbf{v}_r = \mathbf{0}.$$
If $a_j$ is non-zero then
$$\mathbf{v}_j = -a_j^{-1} \sum_{i=1,\, i \ne j}^{r} a_i \mathbf{v}_i$$
and so $\mathbf{v}_j$ is a linear combination of the other $\mathbf{v}_i$. Thus $\mathbf{v}_j$ is
redundant as a generator and can be omitted from the set
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ to leave a smaller generating set of C. In this
way we can omit redundant generators, one at a time, until we
reach a linearly independent generating set. The process must
end since we begin with a finite set.
Since any subspace C of V(n,q) contains a finite generating
set (e.g. C itself), it follows from Theorem 4.2 that every
non-trivial subspace has a basis.
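The pruning process in the proof of Theorem 4.2 can be imitated by brute force for small examples over a prime field. In the following Python sketch (an illustration, with hypothetical function names and an example chosen here, not taken from the text) a generator is treated as redundant when removing it leaves the set of all linear combinations unchanged. Enumerating the span takes $p^r$ steps for r generators, so this is only sensible for very small cases.

```python
from itertools import product

def span(vectors, p):
    """Set of all linear combinations of the given vectors over GF(p), p prime."""
    n = len(vectors[0])
    return {
        tuple(sum(a * v[i] for a, v in zip(coeffs, vectors)) % p for i in range(n))
        for coeffs in product(range(p), repeat=len(vectors))
    }

def prune_to_basis(generators, p):
    """Omit redundant generators one at a time until none is redundant."""
    basis = list(generators)
    target = span(basis, p)
    i = 0
    while i < len(basis):
        rest = basis[:i] + basis[i + 1:]
        if rest and span(rest, p) == target:
            basis = rest   # basis[i] was a combination of the others
        else:
            i += 1         # basis[i] is needed; move on
    return basis

gens = [(1, 0, 2), (0, 1, 1), (1, 1, 0), (2, 0, 1)]  # vectors of V(3, 3)
basis = prune_to_basis(gens, 3)
print(basis, len(span(gens, 3)))
```

Here the four generators span a subspace with $3^2 = 9$ vectors, so two of them are redundant.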
A basis can be thought of as a minimal generating set, one
which does not contain any redundant generators.

Theorem 4.3 Suppose $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is a basis of a subspace C
of V(n, q). Then
(i) every vector of C can be expressed uniquely as a linear
combination of the basis vectors.
(ii) C contains exactly $q^k$ vectors.

Proof (i) Suppose a vector $\mathbf{x}$ of C is represented in two ways
as a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$. That is,
$$\mathbf{x} = a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_k\mathbf{v}_k$$
and
$$\mathbf{x} = b_1\mathbf{v}_1 + b_2\mathbf{v}_2 + \cdots + b_k\mathbf{v}_k.$$
Then $(a_1 - b_1)\mathbf{v}_1 + (a_2 - b_2)\mathbf{v}_2 + \cdots + (a_k - b_k)\mathbf{v}_k = \mathbf{0}$. But the set
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is linearly independent and so $a_i - b_i = 0$ for
$i = 1, 2, \ldots, k$; i.e. $a_i = b_i$ for $i = 1, 2, \ldots, k$.
(ii) By (i), the $q^k$ vectors $\sum_{i=1}^{k} a_i \mathbf{v}_i$ ($a_i \in GF(q)$) are precisely
the distinct vectors of C.

It follows from Theorem 4.3 that any two bases of a subspace


C contain the same number $k$ of vectors, where $|C| = q^k$, and this
number k is called the dimension of the subspace C; it is denoted
by dim (C).
We have already exhibited a basis of V(n, q) having n vectors
and so dim (V(n, q)) =n.

Exercises 4

4.1 Show that a non-empty subset C of V(n, q) is a subspace if
and only if $a\mathbf{x} + b\mathbf{y} \in C$ for all $a, b \in GF(q)$ and for all
$\mathbf{x}, \mathbf{y} \in C$.
4.2 Show that the set $E_n$ of all even-weight vectors of V(n, 2)
is a subspace of V(n, 2). What is the dimension of $E_n$?
[Hint: See Exercise 2.4.] Write down a basis for $E_n$.
4.3 Let C be the subspace of V (4, 3) having as generating set
{(0,1,2,1), (4,0,2,2), (1,2,0,1)}. Find a basis of C.
What is dim (C)?
4.4 Let u and v be vectors in V(n, q). Show that the set {u, v}
is linearly independent if and only if u and v are non-zero
and v is not a scalar multiple of u.
4.5 Suppose $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k\}$ is a basis for a subspace C of
V(n, q). Show that we get a basis for the same subspace C
if we either
(a) replace an $\mathbf{x}_i$ by a non-zero scalar multiple of itself,
or
(b) replace an $\mathbf{x}_i$ by $\mathbf{x}_i + a\mathbf{x}_j$, for some scalar $a$, with $j \ne i$.
4.6 Suppose F is a field of characteristic p. Show that F can be
regarded as a vector space over GF(p). Deduce that any
finite field has order equal to a power of some prime
number.
4.7 From the vector space V(3, q), an incidence structure $P_q$ is
defined as follows.
The 'points' of $P_q$ are the one-dimensional subspaces of
V(3, q). The 'lines' of $P_q$ are the two-dimensional subspaces
of V(3, q). The point P 'belongs to' the line L if
and only if P is a subspace of L.
Prove that $P_q$ is a finite projective plane of order $q$. List
the points and lines of $P_2$ and check that it has the same
structure as the seven-point plane defined in Example
2.19.
5 Introduction to linear codes

Throughout this chapter, we assume that the alphabet $F_q$ is the
Galois field GF(q), where $q$ is a prime power, and we regard
$(F_q)^n$ as the vector space V(n, q). A vector $(x_1, x_2, \ldots, x_n)$ will
usually be written simply as $x_1 x_2 \cdots x_n$.
A linear code over GF(q) is just a subspace of V(n, q), for
some positive integer n.
Thus a subset C of V(n, q) is a linear code if and only if
(1) $\mathbf{u} + \mathbf{v} \in C$, for all $\mathbf{u}$ and $\mathbf{v}$ in C, and
(2) $a\mathbf{u} \in C$, for all $\mathbf{u} \in C$, $a \in GF(q)$.
In particular, a binary code is linear if and only if the sum of
any two codewords is a codeword. It is easily checked that the
codes $C_1$, $C_2$ and $C_3$ of Example 1.5, and the code C of Example
2.23, are all linear.
If C is a k-dimensional subspace of V(n,q), then the linear
code C is called an [n,k]-code, or sometimes, if we wish to
specify also the minimum distance d of C, an [n, k, d]-code.

Notes (i) A q-ary [n, k, d]-code is also a q-ary $(n, q^k, d)$-code
(by Theorem 4.3), but, of course, not every $(n, q^k, d)$-code is an
[n, k, d]-code.
(ii) The all-zero vector 0 automatically belongs to a linear
code.
(iii) Some authors have referred to linear codes as ‘group
codes’.
The weight w(x) of a vector x in V(n, q) is defined to be the
number of non-zero entries of x. One of the most useful
properties of a linear code is that its minimum distance is equal
to the smallest of the weights of the non-zero codewords. To
prove this we need a simple lemma.

Lemma 5.1 If $\mathbf{x}$ and $\mathbf{y} \in V(n, q)$, then
$$d(\mathbf{x}, \mathbf{y}) = w(\mathbf{x} - \mathbf{y}).$$

Proof The vector $\mathbf{x} - \mathbf{y}$ has non-zero entries in precisely those
places where $\mathbf{x}$ and $\mathbf{y}$ differ.

Remark For q=2, Lemma 5.1 is the same as Lemma 2.5,


bearing in mind that ‘plus’ is the same as ‘minus’ when working
modulo 2.

Theorem 5.2 Let C be a linear code and let w(C) be the


smallest of the weights of the non-zero codewords of C. Then
d(C) =w(C).

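Proof There exist codewords $\mathbf{x}$ and $\mathbf{y}$ of C such that
$d(C) = d(\mathbf{x}, \mathbf{y})$. Then, by Lemma 5.1,
$$d(C) = w(\mathbf{x} - \mathbf{y}) \ge w(C),$$
since $\mathbf{x} - \mathbf{y}$ is a codeword of the linear code C.
On the other hand, for some codeword $\mathbf{x} \in C$,
$$w(C) = w(\mathbf{x}) = d(\mathbf{x}, \mathbf{0}) \ge d(C),$$
since $\mathbf{0}$ belongs to the linear code C. Hence $d(C) \ge w(C)$ and
$w(C) \ge d(C)$, giving $d(C) = w(C)$.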
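Theorem 5.2 can be checked numerically on a small binary code. The sketch below is an illustration in Python, with hypothetical function names; it uses the generator matrix quoted in Example 5.3(ii) for the [7, 4, 3]-code of Example 2.23, lists the 16 codewords, and computes the minimum distance both by comparing all pairs and by taking the smallest non-zero weight.

```python
from itertools import combinations, product

def codewords(G, p=2):
    """All linear combinations of the rows of G over GF(p), p prime."""
    k, n = len(G), len(G[0])
    return sorted({
        tuple(sum(u[i] * G[i][j] for i in range(k)) % p for j in range(n))
        for u in product(range(p), repeat=k)
    })

def weight(x):
    """Number of non-zero entries of x."""
    return sum(1 for s in x if s != 0)

def distance(x, y):
    """Hamming distance: number of places where x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

G = [[1, 1, 1, 1, 1, 1, 1],
     [1, 0, 0, 0, 1, 0, 1],
     [1, 1, 0, 0, 0, 1, 0],
     [0, 1, 1, 0, 0, 0, 1]]

C = codewords(G)
d_pairs = min(distance(x, y) for x, y in combinations(C, 2))  # 120 comparisons
w_min = min(weight(x) for x in C if any(x))                   # 15 weights
print(len(C), d_pairs, w_min)
```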

We now list some of the advantages and disadvantages of


restricting one’s attention to linear codes.

Advantage 1 For a general code with $M$ codewords, to find the
minimum distance we might have to make $\binom{M}{2} = \frac{1}{2}M(M - 1)$
comparisons (as in Example 2.23). However, Theorem 5.2
enables the minimum distance of a linear code to be found by
examining only the weights of the $M - 1$ non-zero codewords.
Note how much easier it is now to show that the code of
Example 2.23 has minimum distance 3, if we know that it is
linear.

Advantage 2 To specify a non-linear code, we may have to list


all the codewords. We can specify a linear [n, k]-code by simply
giving a basis of k codewords.

Definition A $k \times n$ matrix whose rows form a basis of a linear
[n, k]-code is called a generator matrix of the code.

Examples 5.3 (i) The code $C_2$ of Example 1.5 is a [3, 2, 2]-code
with generator matrix
$$\begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}.$$
(ii) The code C of Example 2.23 is a [7, 4, 3]-code with
generator matrix
$$\begin{bmatrix} 1&1&1&1&1&1&1 \\ 1&0&0&0&1&0&1 \\ 1&1&0&0&0&1&0 \\ 0&1&1&0&0&0&1 \end{bmatrix}.$$
(iii) The q-ary repetition code of length n over GF(q) is an
[n, 1, n]-code with generator matrix
$$[1 \; 1 \; \cdots \; 1].$$
Advantage 3 There are nice procedures for encoding and
decoding a linear code (See Chapters 6 and 7).

Disadvantage 1 Linear q-ary codes are not defined unless $q$ is a
prime power. However, reasonable q-ary codes, for $q$ not a
prime power, can often be obtained from linear codes over a
larger alphabet. For example, we shall see in Chapter 7 how
good decimal (i.e. 10-ary) codes can be obtained from linear
11-ary codes by omitting all codewords containing a given fixed
symbol. This idea has already been illustrated in Chapter 3, for
the ISBN code can be obtained in such a way from the linear
11-ary code
$$\Bigl\{ x_1 x_2 \cdots x_{10} \in V(10, 11) : \sum_{i=1}^{10} i x_i = 0 \Bigr\}.$$

Disadvantage 2 The restriction to linear codes might be a
restriction to weaker codes than desired. However, it turns out
that codes which are optimal in some way are very frequently
linear. For example, for every set of parameters for which it is
known that there exists a non-trivial perfect code, there exists a
perfect linear code with those parameters. Notice also how often
the value of $A_2(n, d)$ in Table 2.4 is a power of 2. It is usually,
though not always, the case that such a value of $A_2(n, d)$ is
achieved by a linear code.

Equivalence of linear codes

The definition of equivalence of codes given in Chapter 2 is


modified for linear codes, by allowing only those permutations of
symbols which are given by multiplication by a non-zero scalar.
Thus two linear codes over GF(q) are called equivalent if one
can be obtained from the other by a combination of operations of
the following types.
(A) permutation of the positions of the code;
(B) multiplication of the symbols appearing in a fixed position
by a non-zero scalar.

Theorem 5.4 Two $k \times n$ matrices generate equivalent linear
[n, k]-codes over GF(q) if one matrix can be obtained from the
other by a sequence of operations of the following types:
(R1) Permutation of the rows.
(R2) Multiplication of a row by a non-zero scalar.
(R3) Addition of a scalar multiple of one row to another.
(C1) Permutation of the columns.
(C2) Multiplication of any column by a non-zero scalar.

Proof The row operations (R1), (R2) and (R3) preserve the
linear independence of the rows of a generator matrix and simply
replace one basis by another of the same code (see Exercise 4.5).
Operations of type (C1) and (C2) convert a generator matrix to
one for an equivalent code.

Theorem 5.5 Let G be a generator matrix of an [n, k]-code.
Then by performing operations of types (R1), (R2), (R3), (C1)
and (C2), G can be transformed to the standard form
$$[I_k \mid A],$$
where $I_k$ is the $k \times k$ identity matrix, and A is a $k \times (n - k)$
matrix.

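Proof During a sequence of transformations of the matrix G,
we denote by $g_{ij}$ the (i, j)th entry of the matrix under consideration
at the time and by $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_k$ and $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n$ the rows
and columns respectively of this matrix.
The following three-step procedure is applied for $j = 1, 2, \ldots, k$
in turn, the jth application transforming column $\mathbf{c}_j$ into
its desired form (with 1 in the jth position and 0s elsewhere),
leaving unchanged the first $j - 1$ columns already suitably transformed.
Suppose then that G has already been transformed to
$$\begin{bmatrix}
1 & 0 & \cdots & 0 & g_{1j} & \cdots & g_{1n} \\
0 & 1 & \cdots & 0 & g_{2j} & \cdots & g_{2n} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & g_{j-1,j} & \cdots & g_{j-1,n} \\
0 & 0 & \cdots & 0 & g_{jj} & \cdots & g_{jn} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & g_{kj} & \cdots & g_{kn}
\end{bmatrix}$$

Step 1 If $g_{jj} \ne 0$, go to Step 2. If $g_{jj} = 0$, and if for some
$i > j$, $g_{ij} \ne 0$, then interchange $\mathbf{r}_j$ and $\mathbf{r}_i$. If $g_{jj} = 0$ and $g_{ij} = 0$ for all
$i > j$, then choose $h$ such that $g_{jh} \ne 0$ and interchange $\mathbf{c}_j$ and $\mathbf{c}_h$.
Step 2 We now have $g_{jj} \ne 0$. Multiply $\mathbf{r}_j$ by $g_{jj}^{-1}$.
Step 3 We now have $g_{jj} = 1$. For each of $i = 1, 2, \ldots, k$, with
$i \ne j$, replace $\mathbf{r}_i$ by $\mathbf{r}_i - g_{ij}\mathbf{r}_j$.
The column $\mathbf{c}_j$ now has the desired form.
After this procedure has been applied for $j = 1, 2, \ldots, k$, the
generator matrix will have standard form.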
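The three-step procedure of the proof translates almost line for line into code. The following Python sketch is an illustration for prime fields only (so GF(4), for instance, is not covered); the function name `standard_form` and its return convention are assumptions made here. It assumes the rows of G are linearly independent, and it takes inverses as $a^{p-2}$, the formula of Exercise 3.8.

```python
def standard_form(G, p):
    """Apply Steps 1-3 for j = 1..k over GF(p), p prime.

    Returns (matrix, swapped), where swapped records whether any column
    interchange was needed, in which case the result generates an
    equivalent code rather than necessarily the same code.
    """
    G = [[g % p for g in row] for row in G]
    k, n = len(G), len(G[0])
    swapped = False
    for j in range(k):
        # Step 1: make the (j, j) entry non-zero.
        if G[j][j] == 0:
            i = next((i for i in range(j + 1, k) if G[i][j] != 0), None)
            if i is not None:
                G[j], G[i] = G[i], G[j]            # interchange rows
            else:
                h = next(h for h in range(j + 1, n) if G[j][h] != 0)
                for row in G:                       # interchange columns
                    row[j], row[h] = row[h], row[j]
                swapped = True
        # Step 2: scale row j so that the (j, j) entry is 1.
        inv = pow(G[j][j], p - 2, p)
        G[j] = [(inv * g) % p for g in G[j]]
        # Step 3: clear the rest of column j.
        for i in range(k):
            if i != j and G[i][j] != 0:
                c = G[i][j]
                G[i] = [(a - c * b) % p for a, b in zip(G[i], G[j])]
    return G, swapped

G = [[1, 1, 1, 1, 1, 1, 1],
     [1, 0, 0, 0, 1, 0, 1],
     [1, 1, 0, 0, 0, 1, 0],
     [0, 1, 1, 0, 0, 0, 1]]
M, used_columns = standard_form(G, 2)
for row in M:
    print("".join(map(str, row)))
```

For this binary matrix only row operations are needed, so the output generates the same code as G.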

Notes (1) If G can be transformed into a standard form matrix
G' by row operations only (this will be the case if and only if the
first k columns of G are linearly independent), then G' will
actually generate the same code as does G. But if operations
(C1) and (C2) are also used, then G' will generate a code which
is equivalent to, though not necessarily the same as, that
generated by G. The procedure described in the preceding proof
is designed to give a standard form generator matrix for the same
code whenever this is possible.
(2) In practice, inspection of the generator matrix G will
often suggest a quicker way to transform to standard form, as in
Example 5.6(iii) below.
(3) The standard form $[I_k \mid A]$ of a generator matrix is not
unique; for example, permutation of the columns of A will give a
generator matrix for an equivalent code.

Examples 5.6 (i) See Example 5.3(i). Interchanging rows
gives the standard form generator matrix
$$\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$$
for the code $C_2$.
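(ii) We will use the procedure of Theorem 5.5 to transform
the generator matrix of Example 5.3(ii) to standard form.
$$\begin{bmatrix} 1&1&1&1&1&1&1 \\ 1&0&0&0&1&0&1 \\ 1&1&0&0&0&1&0 \\ 0&1&1&0&0&0&1 \end{bmatrix}
\xrightarrow{\;\mathbf{r}_2 \to \mathbf{r}_2 - \mathbf{r}_1,\ \mathbf{r}_3 \to \mathbf{r}_3 - \mathbf{r}_1\;}
\begin{bmatrix} 1&1&1&1&1&1&1 \\ 0&1&1&1&0&1&0 \\ 0&0&1&1&1&0&1 \\ 0&1&1&0&0&0&1 \end{bmatrix}$$
$$\xrightarrow{\;\mathbf{r}_1 \to \mathbf{r}_1 - \mathbf{r}_2,\ \mathbf{r}_4 \to \mathbf{r}_4 - \mathbf{r}_2\;}
\begin{bmatrix} 1&0&0&0&1&0&1 \\ 0&1&1&1&0&1&0 \\ 0&0&1&1&1&0&1 \\ 0&0&0&1&0&1&1 \end{bmatrix}$$
$$\xrightarrow{\;\mathbf{r}_2 \to \mathbf{r}_2 - \mathbf{r}_3\;}
\begin{bmatrix} 1&0&0&0&1&0&1 \\ 0&1&0&0&1&1&1 \\ 0&0&1&1&1&0&1 \\ 0&0&0&1&0&1&1 \end{bmatrix}$$
$$\xrightarrow{\;\mathbf{r}_3 \to \mathbf{r}_3 - \mathbf{r}_4\;}
\begin{bmatrix} 1&0&0&0&1&0&1 \\ 0&1&0&0&1&1&1 \\ 0&0&1&0&1&1&0 \\ 0&0&0&1&0&1&1 \end{bmatrix}$$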
(iii) Consider the [6, 3]-code over GF(3) having generator
matrix
$$\begin{bmatrix} 0&0&0&1&1&1 \\ 0&1&1&0&1&2 \\ 1&0&2&0&1&1 \end{bmatrix}.$$
An obvious permutation of the columns gives the standard form
generator matrix
$$\begin{bmatrix} 1&0&0&0&1&1 \\ 0&1&0&1&1&2 \\ 0&0&1&2&1&1 \end{bmatrix}$$
for an equivalent code.

Exercises 5

5.1 Is the binary (11, 24,5)-code of Exercise 2.12 linear?


(There is no need to examine any codewords).
5.2 Exercise 4.2 shows that $E_n$, the code of all even-weight
vectors of V(n, 2), is linear. What are the parameters
[n, k, d] of $E_n$? Write down a generator matrix for $E_n$ in
standard form.
5.3 Let H be an $r \times n$ matrix over GF(q). Prove that the set
$C = \{\mathbf{x} \in V(n, q) \mid \mathbf{x}H^T = \mathbf{0}\}$ is a linear code. [Remark: we
shall show in Chapter 7 that every linear code may be
defined by means of such a matrix H, which is called a
parity-check matrix of the code.]
5.4 (i) Show that if C is a binary linear code, then the code
obtained by adding an overall parity check to C is
also linear.
(ii) Find a generator matrix for a binary [8, 4, 4]-code.
5.5 Prove that, in a binary linear code, either all the code-
words have even weight or exactly half have even weight
and half have odd weight.
5.6 Let C, and C, be binary linear codes having the generator
matrices
/+110 1001101
loot and Go= 101010111.
0010111
List the codewords of C, and C, and hence find the
minimum distance of each code. (Use Theorem 5.2.)
5.7 Let C be the ternary linear code with generator matrix
ew
0112)
List the codewords of C and use Theorem 5.2 to find the
minimum distance of C. Deduce that C is a perfect code.
5.8 Let $B_q(n, d)$ denote the largest value of M for which there
exists a linear q-ary (n, M, d)-code (q is a prime power).
Clearly the value of $B_q(n, d)$ is less than or equal to the
value of $A_q(n, d)$, which was defined in Chapter 2.
Determine the values of $B_2(8, 3)$, $B_2(8, 4)$ and $B_2(8, 5)$. Is it
true that $B_2(n, d) = A_2(n, d)$ for each of these cases?
5.9 Exercise 2.3 shows that $A_q(3, 2) = q^2$ for any integer $q \ge 2$.
Show that, if $q$ is a prime power, then $B_q(3, 2) = q^2$.
5.10 Suppose $[I_k \mid A]$ is a standard form generator matrix for a
linear code C. Show that any permutation of the rows of A
gives a generator matrix for a code which is equivalent to
C.
5.11 Let C be the binary linear code with generator matrix
1110000
1001100
1000011
010101 0.
Find a generator matrix for C in standard form. Is C the
same as the code of Example 5.6(ii)? Is C equivalent to
the code of Example 5.6(ii)?
5.12 Suppose $C_1$ and $C_2$ are binary linear codes. Let $C_3$ be the
code given by the $(\mathbf{u} \mid \mathbf{u} + \mathbf{v})$ construction of Exercise 2.17.
Show that $C_3$ is linear.
Deduce that $B_2(2d, d) = 4d$ when $d$ is a power of 2.
6 Encoding and decoding with a linear code

Encoding with a linear code

Let C be an [n, k]-code over GF(q) with generator matrix G. C
contains $q^k$ codewords and so can be used to communicate any
one of $q^k$ distinct messages. We identify these messages with the
$q^k$ $k$-tuples of V(k, q) and we encode a message vector
$\mathbf{u} = u_1 u_2 \cdots u_k$ simply by multiplying it on the right by G. If the rows
of G are $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_k$, then
$$\mathbf{u}G = \sum_{i=1}^{k} u_i \mathbf{r}_i$$
and so $\mathbf{u}G$ is indeed a codeword of C, being a linear combination
of the rows of the generator matrix. Note that the encoding
function $\mathbf{u} \mapsto \mathbf{u}G$ maps the vector space V(k, q) on to a
k-dimensional subspace (namely the code C) of V(n, q).
The encoding rule is even simpler if G is in standard form.
Suppose $G = [I_k \mid A]$, where $A = [a_{ij}]$ is a $k \times (n - k)$ matrix.
Then the message vector $\mathbf{u}$ is encoded as
$$\mathbf{x} = \mathbf{u}G = x_1 x_2 \cdots x_k x_{k+1} \cdots x_n,$$
where $x_i = u_i$, $1 \le i \le k$, are the message digits and
$$x_{k+i} = \sum_{j=1}^{k} a_{ji} u_j, \quad 1 \le i \le n - k,$$
are the check digits. The check digits represent redundancy which
has been added to the message to give protection against noise.
Example 6.1 Let C be the binary [7, 4]-code of Example 5.3(ii),
for which we found in Example 5.6(ii) the standard form
generator matrix
$$G = \begin{bmatrix} 1&0&0&0&1&0&1 \\ 0&1&0&0&1&1&1 \\ 0&0&1&0&1&1&0 \\ 0&0&0&1&0&1&1 \end{bmatrix}.$$
A message vector $(u_1, u_2, u_3, u_4)$ is encoded as
$$(u_1, u_2, u_3, u_4, u_1 + u_2 + u_3, u_2 + u_3 + u_4, u_1 + u_2 + u_4).$$
For example,
0000 is encoded as 0000000,
1000 is encoded as 1000101,
1110 is encoded as 1110100.
For a general linear code, we summarize the encoding part of the


communication scheme (see Fig. 1.1) in Fig. 6.2.

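[Fig. 6.2: Message source, giving the message vector $\mathbf{u} = u_1 u_2 \cdots u_k$; encoder $\mathbf{u} \mapsto \mathbf{x} = \mathbf{u}G$, giving the codeword $\mathbf{x} = x_1 x_2 \cdots x_n$; channel, subject to noise.]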

Decoding with a linear code

Suppose the codeword $\mathbf{x} = x_1 x_2 \cdots x_n$ is sent through the channel
and that the received vector is $\mathbf{y} = y_1 y_2 \cdots y_n$. We define the error
vector $\mathbf{e}$ to be
$$\mathbf{e} = \mathbf{y} - \mathbf{x} = e_1 e_2 \cdots e_n.$$
The decoder must decide from y which codeword x was
transmitted, or equivalently which error vector e has occurred.
An elegant nearest neighbour decoding scheme for linear codes,
devised by Slepian (1960), uses the fact that a linear code is a
subgroup of the additive group V(n,q). The reader who is not
familiar with elementary group theory should not be deterred as
we shall not be assuming any prior knowledge of the subject
here.

Definition Suppose that C is an [n,k]-code over GF(q) and


that a is any vector in V(n, q). Then the set a+ C defined by
a+ C={a+x|xeC}
is called a coset of C.

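Lemma 6.3 Suppose that $\mathbf{a} + C$ is a coset of C and that
$\mathbf{b} \in \mathbf{a} + C$. Then $\mathbf{b} + C = \mathbf{a} + C$.

Proof Since $\mathbf{b} \in \mathbf{a} + C$, we have $\mathbf{b} = \mathbf{a} + \mathbf{x}$, for some $\mathbf{x} \in C$. Now
if $\mathbf{b} + \mathbf{y} \in \mathbf{b} + C$, then
$$\mathbf{b} + \mathbf{y} = (\mathbf{a} + \mathbf{x}) + \mathbf{y} = \mathbf{a} + (\mathbf{x} + \mathbf{y}) \in \mathbf{a} + C.$$
Hence $\mathbf{b} + C \subseteq \mathbf{a} + C$. On the other hand, if $\mathbf{a} + \mathbf{z} \in \mathbf{a} + C$, then
$$\mathbf{a} + \mathbf{z} = (\mathbf{b} - \mathbf{x}) + \mathbf{z} = \mathbf{b} + (\mathbf{z} - \mathbf{x}) \in \mathbf{b} + C.$$
Hence $\mathbf{a} + C \subseteq \mathbf{b} + C$, and so $\mathbf{b} + C = \mathbf{a} + C$.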
The following theorem is a particular case of Lagrange’s
well-known theorem for subgroups.

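Theorem 6.4 (Lagrange) Suppose C is an [n, k]-code over
GF(q). Then
(i) every vector of V(n, q) is in some coset of C,
(ii) every coset contains exactly $q^k$ vectors,
(iii) two cosets either are disjoint or coincide (partial overlap is
impossible).

Proof (i) If $\mathbf{a} \in V(n, q)$, then $\mathbf{a} = \mathbf{a} + \mathbf{0} \in \mathbf{a} + C$.
(ii) The mapping from C to $\mathbf{a} + C$ defined by
$$\mathbf{x} \mapsto \mathbf{a} + \mathbf{x},$$
for all $\mathbf{x} \in C$, is easily shown to be one-to-one. Hence
$|\mathbf{a} + C| = |C| = q^k$.
(iii) Suppose the cosets $\mathbf{a} + C$ and $\mathbf{b} + C$ overlap. Then for
some vector $\mathbf{v}$, we have $\mathbf{v} \in (\mathbf{a} + C) \cap (\mathbf{b} + C)$. Thus, for some
$\mathbf{x}, \mathbf{y} \in C$,
$$\mathbf{v} = \mathbf{a} + \mathbf{x} = \mathbf{b} + \mathbf{y}.$$
Hence $\mathbf{b} = \mathbf{a} + (\mathbf{x} - \mathbf{y}) \in \mathbf{a} + C$, and so by Lemma 6.3,
$\mathbf{b} + C = \mathbf{a} + C$.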

Example 6.5 Let C be the binary [4, 2]-code with generator
matrix
$$G = \begin{bmatrix} 1&0&1&1 \\ 0&1&0&1 \end{bmatrix},$$
i.e. C = {0000, 1011, 0101, 1110}.
Then the cosets of C are

0000 + C = C itself,
1000 + C = {1000, 0011, 1101, 0110},
0100 + C = {0100, 1111, 0001, 1010},
and 0010 + C = {0010, 1001, 0111, 1100}.
Note that the coset 0001 + C is {0001, 1010, 0100, 1111}, which is
the same as the coset 0100+ C. This could have been predicted
from Lemma 6.3, since $0001 \in 0100 + C$. Similarly we must have,
for example, 0111 + C = 0010 + C.

Definition The vector having minimum weight in a coset is


called the coset leader. (If there is more than one vector with the
minimum weight, we choose one at random and call it the coset
leader. For example, in Example 6.5, 0001 is an alternative coset
leader to 0100 for the coset 0100 + C).

Theorem 6.4 shows that V(n,q) is partitioned into disjoint
cosets of C:
$$V(n, q) = (0 + C) \cup (a_1 + C) \cup \cdots \cup (a_s + C),$$
where $s = q^{n-k} - 1$, and, by Lemma 6.3, we may take
$0, a_1, \ldots, a_s$ to be the coset leaders.
A (Slepian) standard array for an [n, k]-code C is a $q^{n-k} \times q^k$
array of all the vectors in V(n, q) in which the first row consists
of the code C with 0 on the extreme left, and the other rows are
the cosets $a_i + C$, each arranged in corresponding order, with the
coset leader on the left. A standard array may be constructed as
follows:

Step 1 List the codewords of C, starting with 0, as the first row.


Step 2 Choose any vector $a_1$, not in the first row, of minimum
weight. List the coset $a_1 + C$ as the second row by putting $a_1$
under 0 and $a_1 + x$ under x for each $x \in C$.
Step 3 From those vectors not in rows 1 and 2, choose $a_2$ of
minimum weight and list the coset $a_2 + C$ as in Step 2 to get the
third row.
Step 4 Continue in this way until all the cosets are listed and
every vector of V(n, q) appears exactly once.
Example 6.6 A standard array for the code of Example 6.5 is

codewords →    0000   1011   0101   1110
               1000   0011   1101   0110
               0100   1111   0001   1010
               0010   1001   0111   1100
                 ↑
           coset leaders

Note that in a standard array, each entry is the sum of the


codeword at the top of its column and the coset leader at the
extreme left of its row. We now describe how the decoder uses
the standard array.
When y is received (e.g. 1111 in the above example), its
position in the array is found. Then the decoder decides that the
error vector e is the coset leader (0100) found at the extreme left
of y and y is decoded as the codeword x = y — e (1011) at the top
of the column containing y.
Briefly, a received vector is decoded as the codeword at the
top of its column in the standard array.
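The construction and the decoding rule can be sketched in Python as follows. This is only an illustrative sketch for binary codes: the helper names (`add`, `standard_array`, `decode`, `word`) are hypothetical, and the tie-breaking rule among minimum-weight vectors is an arbitrary choice made so that the rows come out as in Example 6.6.

```python
from itertools import product

def add(u, v):
    """Componentwise sum of two binary vectors (mod 2)."""
    return tuple((a + b) % 2 for a, b in zip(u, v))

def standard_array(codewords):
    """Build a standard array; codewords[0] must be the zero word."""
    n = len(codewords[0])
    # Candidate leaders by increasing weight. Ties are broken so that
    # 1000 precedes 0100, etc. (an arbitrary choice matching Example 6.6).
    candidates = sorted(product((0, 1), repeat=n),
                        key=lambda v: (sum(v), [-b for b in v]))
    rows, seen = [], set()
    for a in candidates:
        if a in seen:
            continue              # already listed in an earlier coset
        row = [add(a, x) for x in codewords]
        rows.append(row)
        seen.update(row)
    return rows

def decode(rows, y):
    """Decode y as the codeword at the top of its column."""
    for row in rows:
        if y in row:
            return rows[0][row.index(y)]
    raise ValueError("vector not found in the array")

def word(s):
    """Turn a string such as '1011' into a tuple of ints."""
    return tuple(int(ch) for ch in s)

C = [word(s) for s in ("0000", "1011", "0101", "1110")]
array = standard_array(C)
leaders = ["".join(map(str, row[0])) for row in array]
decoded = "".join(map(str, decode(array, word("1111"))))
print(leaders, decoded)
```

With the code of Example 6.5 the leaders found are 0000, 1000, 0100 and 0010, and the received vector 1111 is decoded as 1011.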
The error vectors which will be corrected are precisely the
coset leaders, irrespective of which codeword is transmitted. By
choosing a minimum weight vector in each coset as coset leader,
we ensure that standard array decoding is a nearest neighbour
decoding scheme.
In Example 6.6, with the given array, a single error will be
corrected if it occurs in any of the first 3 places (e.g. (a) below)
but not if it occurs in the 4th place (e.g. (b) below).

       Message     Codeword     Channel       Received     Decoded     Received
                                + noise       vector       word        message

(a)    01       →  0101      →  (noise)    →  0001      →  0101     →  01
(b)    01       →  0101      →  (noise)    →  0100      →  0000     →  00

Notes (1) In practice, the above decoding scheme is too slow
for large codes and also too costly in terms of storage requirements.
A more sophisticated way of carrying out standard array
decoding, known as ‘syndrome decoding’, will be described in
Chapter 7.
(2) In Example (b) above, the message symbols 01 were
actually unaffected by noise and yet, after decoding, the wrong
message 00 was received. This is an instance of more harm than
good ensuing from the addition of redundancy. But in order to
get a sensible measure of how good a code is, we must calculate
the probability that a received vector will be decoded as the
codeword which was sent. Since the error vectors which will be
corrected by standard array decoding are the same whichever
codeword is sent, this calculation is extremely easy for a linear
code, as we now show.

Probability of error correction

For simplicity, we restrict our attention for the remainder of this
chapter to binary linear codes. We assume that the channel is
binary symmetric with symbol error probability p. We saw in
Chapter 1 that the probability that the error vector is a given
vector of weight i is $p^i (1-p)^{n-i}$ and so the following theorem
follows immediately.

Theorem 6.7 Let C be a binary [n,k]-code, and for i = 0,
1, ..., n let $\alpha_i$ denote the number of coset leaders of weight i.
Then the probability $P_{corr}(C)$ that a received vector decoded by
means of a standard array is the codeword which was sent is
given by
$$P_{corr}(C) = \sum_{i=0}^{n} \alpha_i p^i (1-p)^{n-i}.$$
Example 6.8 For the [4,2]-code of Example 6.6, the coset
leaders are 0000, 1000, 0100 and 0010. Hence $\alpha_0 = 1$, $\alpha_1 = 3$,
$\alpha_2 = \alpha_3 = \alpha_4 = 0$, and so
$$P_{corr}(C) = (1-p)^4 + 3p(1-p)^3 = (1-p)^3 (1 + 2p).$$
If p = 0.01, then $P_{corr}(C) = 0.9897$. The probability that a decoded
word is not the word sent, i.e. the word error rate, is
$$P_{err}(C) = 1 - P_{corr}(C),$$
which, for p = 0.01, is 0.0103.
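The figures in Example 6.8 can be checked numerically. The sketch below evaluates the sum of Theorem 6.7 directly from the weights of the coset leaders; the function name `p_corr` and the variable names are illustrative only.

```python
def p_corr(leader_weights, n, p):
    """Sum of p^w (1-p)^(n-w) over the coset leaders (Theorem 6.7)."""
    return sum(p**w * (1 - p)**(n - w) for w in leader_weights)

p = 0.01
weights = [0, 1, 1, 1]            # leaders 0000, 1000, 0100, 0010
pc = p_corr(weights, 4, p)        # probability of correct decoding
pe = 1 - pc                       # word error rate
uncoded = 1 - (1 - p)**2          # 2-digit message sent without coding
print(round(pc, 4), round(pe, 4), round(uncoded, 4))
```

Rounded to four decimal places this gives 0.9897, 0.0103 and 0.0199, in agreement with the values quoted in the text.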


Without coding, the probability of a 2-digit message being
received incorrectly is $1 - (1-p)^2$ which, for p = 0.01, is 0.0199.
So, for p = 0.01, we have nearly halved the word error rate at the
expense of having to send two check symbols with every 2-digit
message.

Remark 6.9 If d(C) = 2t + 1 or 2t + 2, then C can correct any t
errors. Hence every vector of weight ≤ t is a coset leader and
so $\alpha_i = \binom{n}{i}$ for $0 \le i \le t$. But for i > t, the $\alpha_i$ can be extremely
difficult to calculate and are unknown even for some very
well-known families of codes. One case for which there is no
such difficulty is that of perfect codes; since the error vectors
corrected by a perfect [n, k, 2t+1]-code are precisely those
vectors of weight ≤ t, we have $\alpha_i = \binom{n}{i}$ for $0 \le i \le t$ and $\alpha_i = 0$
for i > t.

A linear [n,k]-code C uses n symbols to send k message


symbols. It is said to have rate R(C)=k/n. Thus the rate of a
code is the ratio of the number of message symbols to the total
number of symbols sent and so a good code will have a high rate.

Example 6.10 Let us return to Example 1.5 and consider how a
route can most accurately be communicated if we impose the
condition that the rate of the code used must be at least 1/2, i.e.
that there is time enough to send only as many check symbols as
there are message symbols. We will assume the channel to be
binary symmetric with p = 0.01.
It might at first appear that we can do no better than to use the
[4, 2]-code of Example 6.6, for which we found in Example 6.8
that $P_{err} = 0.0103$. It is not hard to see that this is the best we can
do if we limit ourselves to using just four codewords, one for
each possible message N, W, E or S. But consider the following
strategy.
We first identify N, W, E and S with the message vectors 00,
01, 10 and 11 and convert the route (e.g. NNWN...) to a long
string of message symbols (00000100...). We then break the
string into blocks of 4 and encode each block into a length 7
codeword by means of the [7, 4]-code C considered in Examples
2.23, 5.6 and 6.1. By Remark 6.9, since C is a perfect
[7, 4, 3]-code, we have $\alpha_0 = 1$, $\alpha_1 = 7$ and $\alpha_i = 0$ for i > 1. (Note
that there is no need to construct a standard array to find the $\alpha_i$
in this case.) Hence
$$P_{err}(C) = 1 - (1-p)^7 - 7p(1-p)^6 \approx 0.002 \quad \text{if } p = 0.01.$$
Thus the number of codewords (and hence messages) received
in error after decoding with this [7, 4]-code is about one-fifth of
the number received in error when using the best [4, 2]-code.
And yet we are sending the information at a more efficient rate,
for R(C) = 4/7 > 1/2.
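As a numerical check of this comparison, the following sketch uses Remark 6.9 to compute the word error rate of a perfect t-error-correcting binary code of length n. The function name `p_err_perfect` is hypothetical, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb

def p_err_perfect(n, t, p):
    """Word error rate of a perfect binary code of length n correcting t errors.

    For a perfect code the coset leaders are exactly the vectors of
    weight <= t, so alpha_i = C(n, i) for i <= t and 0 otherwise.
    """
    p_corr = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(t + 1))
    return 1 - p_corr

p = 0.01
hamming = p_err_perfect(7, 1, p)               # the perfect [7, 4, 3]-code
four_two = 1 - (1 - p)**3 * (1 + 2 * p)        # the [4, 2]-code of Example 6.8
ratio = four_two / hamming
print(round(hamming, 5), round(ratio, 2))
```

The [7, 4]-code gives a word error rate of about 0.00203, roughly one-fifth of the 0.0103 obtained with the [4, 2]-code.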
One lesson to be learned from this example is that if we first
represent our information by a long string of binary digits, we
need not be too restricted in our choice of [n, k]-code, for we can
just encode the message symbols k at a time. We shall see in
Exercise 6.6 that by using a [23, 12]-code, which has rate > 1/2, we
can get the word error rate $P_{err}$ down to approximately 0.000 08.
It is beginning to look as though we can make the word error
rate as small as we wish by using a long enough code (but still
having rate ≥ 1/2). Indeed it is a consequence of the following
remarkable theorem of Shannon (1948) that, for a binary
symmetric channel with symbol error probability p, we can
communicate at a given rate R with as small a word error rate as
we wish, provided R is less than a certain function of p called the
capacity of the channel.

Definition The capacity $\mathcal{C}(p)$ of a binary symmetric channel
with symbol error probability p is
$$\mathcal{C}(p) = 1 + p \log_2 p + (1-p) \log_2 (1-p).$$

Fig. 6.11 [Graph of $\mathcal{C}(p)$ against p, for p from 0 to 1.]

Theorem 6.12 (Shannon’s theorem; proof omitted) Suppose a
channel is binary symmetric with symbol error probability p.
Suppose R is a number satisfying $R < \mathcal{C}(p)$. Then for any $\varepsilon > 0$,
there exists, for sufficiently large n, an [n,k]-code C of rate
$k/n \ge R$ such that $P_{err}(C) < \varepsilon$.
(A similar result holds for non-binary codes, but with a
different definition of capacity.)

The proof of this result may be found in van Lint (1982) or


McEliece (1977). Unfortunately, the theorem has so far been
proved only by probabilistic methods and does not tell us how to
construct such codes. It should be borne in mind also that for
practical purposes we require codes which are easily encoded and
decoded and that this is less likely to be the case for long codes
with many codewords.

Example 6.13 It may be calculated that $\mathcal{C}(0.01) = 0.92$. Thus,


for p = 0.01, even if we insist on transmitting at a rate of 9/10, we
can, in theory, make $P_{err}$ as small as we wish by making n (and k)
sufficiently large.
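The capacity formula is easy to evaluate directly. The sketch below is a straightforward transcription of the definition; the function name `capacity` is illustrative, and the endpoints p = 0 and p = 1 are handled separately on the assumption that x log x is taken to be 0 at x = 0.

```python
from math import log2

def capacity(p):
    """Capacity of a binary symmetric channel with symbol error probability p."""
    if p in (0.0, 1.0):
        return 1.0        # limiting value, since x*log2(x) -> 0 as x -> 0
    return 1 + p * log2(p) + (1 - p) * log2(1 - p)

c = capacity(0.01)
print(round(c, 4))
```

For p = 0.01 the value is approximately 0.9192, which rounds to the 0.92 quoted in Example 6.13; at p = 1/2 the capacity is 0.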

Symbol error rate

Since some of the message symbols may be correct even if the
decoder outputs the wrong codeword, a more useful quantity
might be the symbol error rate $P_{symb}$, the average probability that
a message symbol is in error after decoding. A method for
calculating $P_{symb}$ is given in Exercise 6.7, but it is more difficult to
calculate than $P_{err}$ and is not known for many codes. Note also
that the result of Exercise 6.9 shows that Shannon’s theorem
remains true if we replace $P_{err}$ by $P_{symb}$.

Probability of error detection

Suppose now that a binary linear code is to be used only for error
detection. The decoder will fail to detect errors which have
occurred if and only if the received vector y is a codeword
different from the codeword x which was sent, i.e. if and only if
the error vector e = y - x is itself a non-zero codeword (since C is
linear). Thus the probability $P_{undetec}(C)$ that an incorrect codeword
will be received is independent of the codeword sent and is
given by the following theorem.
Theorem 6.14 Let C be a binary [n, k]-code and let $A_i$ denote
the number of codewords of C of weight i. Then, if C is used for
error detection, the probability of an incorrect message being
received undetected is
$$P_{undetec}(C) = \sum_{i=1}^{n} A_i p^i (1-p)^{n-i}.$$
(Note that, unlike the formula of Theorem 6.7 for $P_{corr}(C)$, the
summation here starts at i = 1.)

Example 6.15 With the code of Example 6.6,
$$P_{undetec} = p^2 (1-p)^2 + 2p^3 (1-p) = p^2 - p^4 = 0.000\,099\,99 \quad \text{if } p = 0.01,$$
and so only one word in about 10 000 will be accepted with errors
undetected.
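Theorem 6.14 can be evaluated by summing over the non-zero codewords rather than first tabulating the weight distribution. The following is a small sketch under that approach; the function name `p_undetec` and the representation of codewords as strings are illustrative choices.

```python
def p_undetec(codewords, p):
    """Sum of p^w (1-p)^(n-w) over the non-zero codewords (Theorem 6.14)."""
    n = len(codewords[0])
    total = 0.0
    for c in codewords:
        w = c.count("1")          # weight of the codeword
        if w > 0:                 # the zero word means "no error"
            total += p**w * (1 - p)**(n - w)
    return total

C = ["0000", "1011", "0101", "1110"]
pu = p_undetec(C, 0.01)
print(pu)
```

For p = 0.01 this agrees with the closed form $p^2 - p^4 = 0.00009999$ of Example 6.15.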

In the early days of coding theory, a popular scheme, when


possible, was detection and retransmission. With only a mod-
erately good code, it is possible to run such a scheme for several
hours with hardly any undetected errors. The difficulty is that
incoming data gets held up by requests for retransmission and
this can cause buffer overflows.
The retransmission probability for an [n, k]-code is given by
$$P_{retrans} = 1 - (1-p)^n - P_{undetec}.$$
For example, with the [4,2]-code of Example 6.6, if p = 0.01,
then $P_{retrans} \approx 0.04$ and so about 4% of messages have to be
retransmitted. This percentage increases for longer codes; e.g. if
we used a [24, 12]-code, then $P_{retrans}$ would be over 20%.
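These retransmission figures can be reproduced with a few lines of Python. The sketch below uses the value of $P_{undetec}$ from Example 6.15 for the [4, 2]-code. For length 24 no particular code is assumed: only the probability that at least one symbol error occurs, $1 - (1-p)^{24}$, is computed, which is an upper bound on $P_{retrans}$ that differs from it by the (small) undetected-error probability. The function name `p_retrans` is illustrative.

```python
def p_retrans(n, p, p_undetec):
    """Probability that errors occur and are detected, forcing a retransmission."""
    return 1 - (1 - p)**n - p_undetec

p = 0.01
r4 = p_retrans(4, p, p**2 - p**4)     # the [4, 2]-code of Example 6.6
any_error_24 = 1 - (1 - p)**24        # upper bound for any code of length 24
print(round(r4, 6), round(any_error_24, 4))
```

The first value is 0.039304, i.e. about 4%, and the second is about 0.214.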
A compromise scheme incorporating both error correction and
detection, called ‘incomplete decoding’, will be described in
Chapter 7.

Concluding remark on Chapter 6

The birth of coding theory was inspired by the classic paper of


Claude Shannon, of Bell Telephone Laboratories, in 1948. In
fact, this single paper gave rise to two whole new subjects. The
first, information theory, is a direct extension of Shannon’s work,
relying mainly on ideas from probability theory, and this will not
be pursued here. The second, coding theory, relies mainly on
ideas from pure mathematics and, while retaining some links
with information theory, has developed largely independently.

Exercises 6

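6.1 Construct standard arrays for codes having each of the
following generator matrices:
$$G_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad G_2 = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}, \quad G_3 = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 \end{bmatrix}.$$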
Using the third array:
(i) decode the received vectors 11111 and 01011,
(ii) give examples of
(a) two errors occurring in a codeword and being
corrected,
(b) two errors occurring in a codeword and not
being corrected.
6.2 If the symbol error probability of a binary symmetric
channel is p, calculate the probability, for each of the
three codes of Exercise 6.1, that any received vector will
be decoded as the codeword which was sent. Evaluate
these probabilities for p = 0.01.
Now suppose each code is used purely for error
detection. Calculate the respective probabilities that the
received vector is a codeword different from that sent (and
evaluate for p = 0.01). Comment on the relative merits of
the three codes.
6.3 We have assumed that, for a binary symmetric channel,
the symbol error probability p is less than 1/2. Can an
error-correcting code be used to reduce the number of
messages received in error if (i) p = 1/2, (ii) p > 1/2?
6.4 Suppose C is a binary [n, k]-code with minimum distance
2t + 1 (or 2t + 2). Given that p is very small, show that an
approximate value of $P_{err}(C)$ is
$$\left( \binom{n}{t+1} - \alpha_{t+1} \right) p^{t+1},$$
where $\alpha_{t+1}$ is the number of coset leaders of C of weight
t + 1.
6.5 Show that if the perfect binary [7, 4]-code is used for error
detection, then if p = 0.01, $P_{undetec} = 0.000\,006\,8$ and about
7% of words have to be retransmitted.
[Hint: The codewords of such a code are listed in
Example 2.23.]
6.6 We shall see in Chapter 9 that there exists a perfect binary
[23, 12, 7]-code, called the binary Golay code. Show that,
if p=0.01, the word error rate for this code is about
0.000 08.
6.7 If standard array decoding is used for a binary [n, k]-code
and the messages are equally likely, show that $P_{symb}$ does
not depend on which codeword was sent and that
$$P_{symb} = \frac{1}{k} \sum_{i=1}^{2^k} F_i P_i,$$
where $F_i$ is the weight of the first k places of the codeword
at the top of the ith column of the standard array, and $P_i$ is
the probability that the error vector is in this ith column.
6.8 Show that if p = 0.01, the code of Example 6.5 has
$P_{symb} = 0.0053$.
6.9 Show that for a binary [n, k]-code,
$$\frac{1}{k} P_{err} \le P_{symb} \le P_{err}.$$
7 The dual code, the parity-check matrix, and syndrome decoding

As well as specifying a linear code by a generator matrix, there is


another important way of specifying it—by a parity-check matrix.
First we need some definitions.
The inner product u · v of vectors $u = u_1 u_2 \cdots u_n$ and
$v = v_1 v_2 \cdots v_n$ in V(n, q) is the scalar (i.e. element of GF(q))
defined by
$$u \cdot v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n.$$
For example, in V(4, 2), (1001) · (1101) = 0,
(1111) · (1110) = 1,
and in V(4, 3), (2011) · (1210) = 0,
(1212) · (2121) = 2.
If u · v = 0, then u and v are called orthogonal.
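The four inner products above can be verified mechanically. The sketch below assumes that q is prime, so that arithmetic in GF(q) is just arithmetic modulo q (this would not be valid for a prime power such as q = 4); the function name `inner` and the use of digit strings are illustrative.

```python
def inner(u, v, q):
    """Inner product of two words over GF(q), q prime, given as digit strings."""
    return sum(int(a) * int(b) for a, b in zip(u, v)) % q

examples = [
    inner("1001", "1101", 2),
    inner("1111", "1110", 2),
    inner("2011", "1210", 3),
    inner("1212", "2121", 3),
]
print(examples)
```

The printed list is [0, 1, 0, 2], so the first and third pairs are orthogonal.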
The proof of the following lemma is left as a straightforward
exercise for the reader.

Lemma 7.1 For any u, v and w in V(n, q) and $\lambda, \mu \in GF(q)$,
(i) $u \cdot v = v \cdot u$,
(ii) $(\lambda u + \mu v) \cdot w = \lambda (u \cdot w) + \mu (v \cdot w)$.
Given a linear [n, k]-code C, the dual code of C, denoted by
$C^\perp$, is defined to be the set of those vectors of V(n, q) which are
orthogonal to every codeword of C, i.e.
$$C^\perp = \{v \in V(n, q) \mid v \cdot u = 0 \text{ for all } u \in C\}.$$
After a preliminary lemma, we shall show that $C^\perp$ is a linear
code of dimension n - k.

Lemma 7.2 Suppose C is an [n,k]-code having a generator
matrix G. Then a vector v of V(n, q) belongs to $C^\perp$ if and only if
v is orthogonal to every row of G; i.e. $v \in C^\perp \Leftrightarrow vG^T = 0$, where
$G^T$ denotes the transpose of G.

Proof The ‘only if’ part is obvious since the rows of G are
codewords. For the ‘if’ part, suppose that the rows of G are
$r_1, r_2, \ldots, r_k$ and that $v \cdot r_i = 0$ for each i. If u is any codeword of
C, then $u = \sum_{i=1}^{k} \lambda_i r_i$ for some scalars $\lambda_i$ and so
$$v \cdot u = \sum_{i=1}^{k} \lambda_i (v \cdot r_i) \quad \text{(by Lemma 7.1(ii))}$$
$$= \sum_{i=1}^{k} \lambda_i 0 = 0.$$
Hence v is orthogonal to every codeword of C and so is in $C^\perp$.

Theorem 7.3 Suppose C is an [n, k]-code over GF(q). Then the
dual code $C^\perp$ of C is a linear [n, n - k]-code.

Proof First we show that $C^\perp$ is a linear code.
Suppose $v_1, v_2 \in C^\perp$ and $\lambda, \mu \in GF(q)$. Then, for all $u \in C$,
$$(\lambda v_1 + \mu v_2) \cdot u = \lambda (v_1 \cdot u) + \mu (v_2 \cdot u) \quad \text{(by Lemma 7.1)}$$
$$= \lambda 0 + \mu 0 = 0.$$
Hence $\lambda v_1 + \mu v_2 \in C^\perp$, and so $C^\perp$ is linear, by Exercise 4.1.
We now show that $C^\perp$ has dimension n - k. Let $G = [g_{ij}]$ be a
generator matrix for C. Then, by Lemma 7.2, the elements of $C^\perp$
are the vectors $v = v_1 v_2 \cdots v_n$ satisfying
$$\sum_{j=1}^{n} g_{ij} v_j = 0 \quad \text{for } i = 1, 2, \ldots, k.$$
This is a system of k independent homogeneous equations in n
unknowns and it is a standard result in linear algebra that the
solution space $C^\perp$ has dimension n - k. For completeness we
show this to be so as follows.
It is clear that if codes $C_1$ and $C_2$ are equivalent, then so also
are $C_1^\perp$ and $C_2^\perp$. Hence it is enough to show that $\dim(C^\perp) = n - k$
in the case where C has a standard form generator matrix
$$G = \begin{bmatrix} 1 & \cdots & 0 & a_{11} & \cdots & a_{1,n-k} \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 1 & a_{k1} & \cdots & a_{k,n-k} \end{bmatrix}.$$
Then
$$C^\perp = \Big\{ (v_1, v_2, \ldots, v_n) \in V(n, q) \ \Big|\ v_i + \sum_{j=1}^{n-k} a_{ij} v_{k+j} = 0, \ i = 1, 2, \ldots, k \Big\}.$$
Clearly for each of the $q^{n-k}$ choices of $(v_{k+1}, \ldots, v_n)$, there is a
unique vector $(v_1, v_2, \ldots, v_n)$ in $C^\perp$. Hence $|C^\perp| = q^{n-k}$ and so
$\dim(C^\perp) = n - k$.

Examples 7.4 It is easily checked that
(i) if C = {0000, 1100, 0011, 1111}, then $C^\perp = C$;
(ii) if C = {000, 110, 011, 101}, then $C^\perp$ = {000, 111}.

Theorem 7.5 For any [n, k]-code C, $(C^\perp)^\perp = C$.

Proof Clearly $C \subseteq (C^\perp)^\perp$ since every vector in C is orthogonal
to every vector in $C^\perp$. But $\dim((C^\perp)^\perp) = n - (n - k) = k = \dim C$,
and so $C = (C^\perp)^\perp$.

Definition A parity-check matrix H for an [n, k]-code C is a
generator matrix of $C^\perp$.
Thus H is an (n - k) × n matrix satisfying $GH^T = 0$, where $H^T$
denotes the transpose of H and 0 is an all-zero matrix. It follows
from Lemma 7.2 and Theorem 7.5 that if H is a parity-check
matrix of C, then
$$C = \{x \in V(n, q) \mid xH^T = 0\}.$$
In this way any linear code is completely specified by a
parity-check matrix.
In Example 7.4(i),
$$\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}$$
is both a generator matrix and a parity-check matrix, while in
(ii), [111] is a parity-check matrix.
The rows of a parity-check matrix are parity checks on the
codewords; they say that certain linear combinations of the
co-ordinates of every codeword are zero. A code is completely
specified by a parity-check matrix; e.g. if
$$H = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix},$$
then C is the code
$$\{(x_1, x_2, x_3, x_4) \in V(4, 2) \mid x_1 + x_2 = 0, \ x_3 + x_4 = 0\}.$$
The equations $x_1 + x_2 = 0$ and $x_3 + x_4 = 0$ are called parity-check
equations.
If H = [111], then C consists of those vectors of V(3, 2) whose
coordinates sum to zero (mod 2). More generally, the even
weight code $E_n$ of Exercise 5.2 can be defined to be the set of all
vectors $x_1 x_2 \cdots x_n$ of V(n, 2) which satisfy the single parity-check
equation
$$x_1 + x_2 + \cdots + x_n = 0.$$
The following theorem gives an easy way of constructing a
parity-check matrix for a linear code with given generator matrix,
or vice versa.

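Theorem 7.6 If $G = [I_k \mid A]$ is the standard form generator
matrix of an [n, k]-code C, then a parity-check matrix for C is
$H = [-A^T \mid I_{n-k}]$.

Proof Suppose
$$G = \begin{bmatrix} 1 & & 0 & a_{11} & \cdots & a_{1,n-k} \\ & \ddots & & \vdots & & \vdots \\ 0 & & 1 & a_{k1} & \cdots & a_{k,n-k} \end{bmatrix}.$$
Let
$$H = \begin{bmatrix} -a_{11} & \cdots & -a_{k1} & 1 & & 0 \\ \vdots & & \vdots & & \ddots & \\ -a_{1,n-k} & \cdots & -a_{k,n-k} & 0 & & 1 \end{bmatrix}.$$
Then H has the size required of a parity-check matrix and its
rows are linearly independent. Hence it is enough to show that
every row of H is orthogonal to every row of G. But the inner
product of the ith row of G with the jth row of H is
$$0 + \cdots + 0 + (-a_{ij}) + 0 + \cdots + 0 + a_{ij} + 0 + \cdots + 0 = 0.$$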

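Example 7.7 The code of Example 5.6(ii) has standard form
generator matrix
$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix}.$$
Hence a parity-check matrix is
$$H = \begin{bmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}.$$
(Note that the minus signs are unnecessary in the binary case.)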
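The recipe of Theorem 7.6 is mechanical and can be sketched as follows. The function name `parity_check_from_standard_G` is hypothetical, the matrix G is the generator matrix of Example 7.7 read as $[I_4 \mid A]$, and q is assumed prime so that negation can be done modulo q.

```python
def parity_check_from_standard_G(G, q=2):
    """Given G = [I_k | A] over GF(q), q prime, return H = [-A^T | I_{n-k}]."""
    k, n = len(G), len(G[0])
    A = [row[k:] for row in G]                     # the k x (n-k) block A
    H = []
    for j in range(n - k):
        left = [(-A[i][j]) % q for i in range(k)]  # j-th row of -A^T
        right = [1 if m == j else 0 for m in range(n - k)]
        H.append(left + right)
    return H

G = [
    [1, 0, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 1, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1, 1],
]
H = parity_check_from_standard_G(G)

# Every row of G should be orthogonal to every row of H, i.e. G H^T = 0.
orthogonal = all(
    sum(g * h for g, h in zip(g_row, h_row)) % 2 == 0
    for g_row in G for h_row in H
)
print(H, orthogonal)
```

The rows of H obtained are 1110100, 0111010 and 1101001, and the orthogonality check succeeds.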

Definition A parity-check matrix H is said to be in standard
form if $H = [B \mid I_{n-k}]$.

The proof of Theorem 7.6 shows that if a code is specified by a
parity-check matrix in standard form $H = [B \mid I_{n-k}]$, then a
generator matrix for the code is $G = [I_k \mid -B^T]$. Many codes, e.g.
the Hamming codes (see Chapter 8), are most easily defined by
specifying a parity-check matrix or, equivalently, a set of
parity-check equations. If a code is given by a parity-check
matrix H which is not in standard form, then H can be reduced
to standard form in the same way as for a generator matrix.

Syndrome decoding

Suppose H is a parity-check matrix of an [n, k]-code C. Then for
any vector $y \in V(n, q)$, the 1 × (n - k) row vector
$$S(y) = yH^T$$
is called the syndrome of y.
Notes (i) If the rows of H are $h_1, h_2, \ldots, h_{n-k}$, then
$S(y) = (y \cdot h_1, y \cdot h_2, \ldots, y \cdot h_{n-k})$.
(ii) $S(y) = 0 \Leftrightarrow y \in C$.
(iii) Some authors define the syndrome of y to be the column
vector $Hy^T$ (i.e. the transpose of S(y) as defined above).

Lemma 7.8 Two vectors u and v are in the same coset of C if
and only if they have the same syndrome.

Proof u and v are in the same coset
$\Leftrightarrow u + C = v + C$
$\Leftrightarrow u - v \in C$
$\Leftrightarrow (u - v)H^T = 0$
$\Leftrightarrow uH^T = vH^T$
$\Leftrightarrow S(u) = S(v)$.

Corollary 7.9 There is a one-to-one correspondence between


cosets and syndromes.

In standard array decoding, if n is small there is no difficulty in


locating the received vector y in the array. But if n is large, we
can save a lot of time by using the syndrome to find out which
coset (i.e. which row of the array) contains y. We do this as
follows.
Calculate the syndrome S(e) for each coset leader e and
extend the standard array by listing the syndromes as an extra
column.

Example 7.10 In Example 6.5,
$$G = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix}$$
and so, by Theorem 7.6, a parity-check matrix is
$$H = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}.$$
Hence the syndromes of the coset leaders (see Example 6.6) are
S(0000) = 00
S(1000) = 11
S(0100) = 01
S(0010) = 10.
The standard array becomes:

coset leaders                           syndromes
    0000   1011   0101   1110              00
    1000   0011   1101   0110              11
    0100   1111   0001   1010              01
    0010   1001   0111   1100              10
The decoding algorithm is now: when a vector y is received,
calculate $S(y) = yH^T$ and locate S(y) in the ‘syndromes’ column
of the array. Locate y in the corresponding row and decode as
the codeword at the top of the column containing y.
For example, if 1111 is received, S(1111) = 01, and so 1111
occurs in the third row of the array.
When programming a computer to do standard array decoding,
we need store only two columns (syndromes and coset
leaders) in the computer memory. This is called a syndrome
look-up table.

Example 7.10 (continued) The syndrome look-up table for this
code is

syndrome z     coset leader f(z)
    00              0000
    11              1000
    01              0100
    10              0010

The decoding procedure is:
Step 1 For a received vector y calculate $S(y) = yH^T$.
Step 2 Let z = S(y), and locate z in the first column of the
look-up table.
Step 3 Decode y as y - f(z).
For example, if y = 1111, then S(y) = 01 and we decode as
1111 - 0100 = 1011.
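The three steps translate almost directly into code. The sketch below stores the look-up table as a Python dictionary from syndromes to coset leaders; the names `syndrome`, `subtract`, `table` and `decode` are illustrative, and vectors are represented as bit strings for readability.

```python
H = ["1010", "1101"]          # rows of the parity-check matrix of Example 7.10

def syndrome(y, H):
    """S(y) = y H^T over GF(2), returned as a bit string."""
    return "".join(
        str(sum(int(a) * int(b) for a, b in zip(y, h)) % 2) for h in H
    )

def subtract(y, e):
    """y - e over GF(2) (subtraction and addition coincide mod 2)."""
    return "".join(str((int(a) - int(b)) % 2) for a, b in zip(y, e))

# Syndrome look-up table: syndrome z -> coset leader f(z).
table = {syndrome(e, H): e for e in ("0000", "1000", "0100", "0010")}

def decode(y):
    z = syndrome(y, H)            # Steps 1 and 2
    return subtract(y, table[z])  # Step 3

print(table, decode("1111"))
```

Only the four table entries are stored, rather than all sixteen vectors of the standard array, and 1111 is decoded as 1011 as in the text.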

Incomplete decoding

This is a blend of error correction and detection, the latter being
used when ‘correction’ is likely to give the wrong codeword.
More precisely, if d(C) = 2t + 1 or 2t + 2, we adopt the following
scheme whereby we guarantee the correction of ≤ t errors in any
codeword and also detect some cases of more than t errors.
We arrange the cosets of the standard array, as usual, in order
of increasing weight of the coset leaders, and divide the array
into a top part comprising those cosets whose leaders have
weights ≤ t and a bottom part comprising the remaining cosets. If
the received vector y is in the top part, we decode it as usual
(thus assuming ≤ t errors); if y is in the bottom part, we conclude
that more than t errors have occurred and ask for re-transmission.

Example 7.11 Let C be the binary code with generator matrix
$$G = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 \end{bmatrix}.$$
A standard array for C is

codewords →   00000   10110   01011   11101  \
              10000   00110   11011   01101   |
              01000   11110   00011   10101   |  top part
              00100   10010   01111   11001   |
              00010   10100   01001   11111   |
              00001   10111   01010   11100  /

              11000   01110   10011   00101  \  bottom part
              10001   00111   11010   01100  /
                ↑
          coset leaders

If 11110 is received, we decode as 10110, but if 10011 is
received, we seek re-transmission. Note that in this example, if a
received vector y found in the bottom part were ‘corrected’, then
owing to the presence of two vectors of weight 2 in each such
coset, we would have less than an ‘evens’ chance of decoding y to
the codeword sent; e.g. if 10011 is received, then, assuming two
errors, the codeword sent could have been 01011 or 10110.
An incomplete decoding scheme is particularly well-suited to a
code with even minimum distance. For if d(C) = 2t+2, then it
will guarantee to correct up to t errors and simultaneously to
detect any t+ 1 errors.
When we carry out incomplete decoding by means of a
syndrome look-up table, we can dispense with the standard array
not only in the decoding scheme but also in the actual construc-
tion of the table. This is because we know precisely what the
coset leaders are in the top part of the array (namely, all those
vectors of weight ≤ t), while those in the bottom part are not used
in decoding and so need not be found. In other words we just
store the ‘top part’ of a syndrome look-up table as we now
illustrate.

Example 7.11 (continued) By Theorem 7.6, a parity-check
matrix is
$$H = \begin{bmatrix} 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 \end{bmatrix}.$$
Calculating syndromes of coset leaders via $S(y) = yH^T$, we get
(the ‘top part’ of) the syndrome look-up table thus (the second
column was written down first):

syndrome z     coset leader f(z)
    000             00000
    110             10000
    011             01000
    100             00100
    010             00010
    001             00001

When a vector y is received, we calculate S(y) and decode if S(y)
appears in the z column. If S(y) does not appear, we seek
re-transmission. For example, (i) if y = 11111, then S(y) = 010
and so we decode as 11111 - 00010 = 11101;
(ii) if y = 10011, then S(y) = 101, which does not appear in
the table and so we conclude that at least 2 errors have occurred.
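Incomplete decoding with only the ‘top part’ of the table can be sketched as below. The table is built without any standard array, simply by listing all vectors of weight at most t; the names `top_part_table`, `TABLE` and `decode`, and the convention of returning `None` to signal a request for re-transmission, are assumptions made for illustration.

```python
from itertools import combinations

H = ["10100", "11010", "01001"]   # parity-check matrix of Example 7.11
N, T = 5, 1                       # length n, and t where d(C) = 2t + 1 = 3

def syndrome(y):
    """S(y) = y H^T over GF(2), returned as a bit string."""
    return "".join(
        str(sum(int(a) * int(b) for a, b in zip(y, h)) % 2) for h in H
    )

def top_part_table(n, t):
    """Map syndrome -> leader for every vector of weight <= t."""
    table = {}
    for w in range(t + 1):
        for ones in combinations(range(n), w):
            e = "".join("1" if i in ones else "0" for i in range(n))
            table[syndrome(e)] = e
    return table

TABLE = top_part_table(N, T)

def decode(y):
    """Return the decoded codeword, or None to request re-transmission."""
    e = TABLE.get(syndrome(y))
    if e is None:
        return None               # syndrome not in the top part
    return "".join(str(int(a) ^ int(b)) for a, b in zip(y, e))

print(decode("11111"), decode("10011"))
```

This reproduces the two cases of the example: 11111 is decoded as 11101, while 10011 has syndrome 101, which is absent from the table, so re-transmission is sought.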

We next consider an interesting non-binary code having a neat


syndrome decoding algorithm which does not even require a
look-up table. This is the decimal code promised in Example 1.4.
Because 10 is not a prime power, the code will be derived from a
linear code over GF(11), as was the ISBN code described in
Chapter 3, but here the codewords satisfy two parity-check
equations instead of just one.

Example 7.12 Consider the linear [10,8]-code over GF(11)
defined to have parity-check matrix
$$H = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \end{bmatrix}.$$
H is deliberately chosen not to have standard form here in order
to get a nice decoding algorithm later.
Let C be the 10-ary code obtained from this 11-ary code by
omitting all those codewords which contain the digit ‘10’. In
other words, C consists of all 10-digit decimal numbers
$x = x_1 x_2 \cdots x_{10}$ satisfying the two parity-check equations
$$\sum_{i=1}^{10} x_i \equiv 0 \pmod{11} \quad \text{and} \quad \sum_{i=1}^{10} i x_i \equiv 0 \pmod{11}.$$
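It can be shown, e.g. via the inclusion-exclusion principle, that C
contains 82644629 codewords, but we omit the proof of this
here. The codewords of C can be listed by finding a generator
matrix in standard form. To do this we first put H into standard
form via elementary row operations (working modulo 11), for example
$$H = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \end{bmatrix}$$
$$\xrightarrow{\ r_2 \to r_2 - 9r_1\ } \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 0 & 1 \end{bmatrix}$$
$$\xrightarrow{\ r_1 \to r_1 - r_2\ } \begin{bmatrix} 9 & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1 & 0 \\ 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 0 & 1 \end{bmatrix}.$$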
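Using Theorem 7.6,
$$G = \left[ \ I_8 \ \middle| \ \begin{matrix} 2 & 8 \\ 3 & 7 \\ 4 & 6 \\ 5 & 5 \\ 6 & 4 \\ 7 & 3 \\ 8 & 2 \\ 9 & 1 \end{matrix} \ \right]$$
and so
$$C = \{(x_1, x_2, \ldots, x_8, \ 2x_1 + 3x_2 + \cdots + 9x_8, \ 8x_1 + 7x_2 + \cdots + x_8)\},$$
where $x_1, x_2, \ldots, x_8$ run over the values 0, 1, 2, ..., 9
and those words are omitted which give the digit ‘10’ in either of
the last two coordinate places.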

We now describe an incomplete syndrome decoding scheme


which will correct any single error and which will simultaneously
detect any double error arising from the transposition of two
digits of a codeword.
Suppose $x = (x_1, x_2, \ldots, x_{10})$ is the codeword transmitted and
$y = (y_1, y_2, \ldots, y_{10})$ is the received vector. The syndrome
$$(A, B) = yH^T = \left( \sum_{i=1}^{10} y_i, \ \sum_{i=1}^{10} i y_i \right)$$
is calculated (modulo 11).
Suppose a single error has occurred, so that for some non-zero
j and k,
$$(y_1, y_2, \ldots, y_{10}) = (x_1, \ldots, x_{j-1}, x_j + k, x_{j+1}, \ldots, x_{10}).$$
Then
$$A = \sum_{i=1}^{10} y_i = \left( \sum_{i=1}^{10} x_i \right) + k \equiv k \pmod{11},$$
$$B = \sum_{i=1}^{10} i y_i = \left( \sum_{i=1}^{10} i x_i \right) + jk \equiv jk \pmod{11}.$$
So the error magnitude k is given by A and the error position j is
given by the value of B/A. (The latter is calculated as $BA^{-1}$ as
described in Chapter 3.) Hence the decoding scheme is, after
calculating (A, B) from y, as follows:
(1) if (A, B) = (0, 0), then y is a codeword and we assume no
errors,
(2) if A ≠ 0 and B ≠ 0, then we assume a single error which is
corrected by subtracting A from the (B/A)th entry of y,
(3) if A = 0 or B = 0 but not both, then we have detected at
least two errors.
Case (3) always arises if two digits of a codeword have been
transposed, for then A = 0 and (as for the ISBN code)
B ≠ 0.
For example, suppose y = 0610271355. We calculate that A = 8
and B = 6. Hence $B/A = 6 \cdot 8^{-1} = 6 \cdot 7 = 42 = 9$, and so the 9th
digit should have been 5 - 8 = -3 = 8.
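The whole scheme fits in a few lines of Python. The sketch below follows cases (1) to (3); the function name `decode_decimal` and the returned status strings are illustrative. One extra check is an assumption of this sketch and is not part of the scheme as described: if the ‘corrected’ digit comes out as 10, the result cannot be a decimal codeword, so the word is reported as having detected errors. The modular inverse uses `pow(A, -1, 11)`, which requires Python 3.8 or later.

```python
def decode_decimal(y):
    """Decode a string of 10 decimal digits; returns (status, word or None)."""
    digits = [int(ch) for ch in y]
    A = sum(digits) % 11
    B = sum(i * d for i, d in enumerate(digits, start=1)) % 11
    if A == 0 and B == 0:
        return "ok", y                        # case (1): y is a codeword
    if A != 0 and B != 0:
        j = (B * pow(A, -1, 11)) % 11         # error position B/A, in 1..10
        corrected = (digits[j - 1] - A) % 11  # subtract the error magnitude
        if corrected == 10:
            return "detected", None           # extra check (assumption)
        digits[j - 1] = corrected
        return "corrected", "".join(map(str, digits))  # case (2)
    return "detected", None                   # case (3): at least two errors

print(decode_decimal("0610271355"))
```

For y = 0610271355 the function finds A = 8, B = 6 and position 9, and returns the codeword 0610271385, in agreement with the worked example. Transposing the first two digits of that codeword gives A = 0 and B ≠ 0, so the error is detected but not corrected.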

Remarks on Example 7.12 (1) Note how much faster is this


decoding scheme than the brute-force scheme of comparing the
received vector with all codewords. There is no need to store a
list of codewords in the memory of the decoder, nor is there even
any need to store a syndrome look-up table.
(2) The fact that we are able to correct any single error gives
an indirect proof that the minimum distance of the code is at
least 3. We will see in Example 8.8 that the minimum distance
could have been deduced directly by inspection of the parity-
check matrix H.
(3) Some further decimal codes will be discussed in Chapter
11.

Exercises 7

7.1. Prove Lemma 7.1.


7.2 Prove that if $E_n$ is the binary even weight code of length n,
then $E_n^\perp$ is the repetition code of length n.
7.3. Give a very simple scheme for error detection with a linear
code, making use of the syndrome.
7.4 For a binary linear code with parity-check matrix H, show
that the transpose of the syndrome of a received vector is
equal to the sum of those columns of H corresponding to
where the errors occurred.
7.5 Construct a syndrome look-up table for the perfect binary
[7, 4, 3]-code which has generator matrix
$$\begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix}.$$
Use your table to decode the following received vectors:
0000011, 1111111, 1100110, 1010101.

7.6 Let C be the ternary linear code with generator matrix

b 11 |
2011)

(a) Find a generator matrix for C in standard form.


(b) Find a parity-check matrix for C in standard form.
(c) Use syndrome decoding to decode the received
vectors 2121, 1201 and 2222.
7.7 Using the code of Example 7.12, decode the received
vector 0617960587.
7.8 Example 7.12 shows that $A_{10}(10, 3) \ge 82644629$. Prove
that $A_{10}(10, 3) \le 10^8$. Prove also that $A_{11}(10, 3) = 11^8$.
7.9 Show that the decimal code
$$\left\{ (x_1, x_2, \ldots, x_{10}) \in (F_{10})^{10} : \sum_{i=1}^{10} x_i \equiv 0 \pmod{10}, \ \sum_{i=1}^{10} i x_i \equiv 0 \pmod{10} \right\}$$
is not a single-error-correcting code.
7.10 Suppose a certain binary channel accepts words of length 7
and that the only kind of error vector ever observed is one
of the eight vectors 0000000, 0000001, 0000011, 0000111,
0001111, 0011111, 0111111, 1111111. Design a binary
linear [7, k]-code which will correct all such errors with as
large a rate as possible.
7.11 Suppose C is a binary code with parity-check matrix H.
Show that the extended code $\hat{C}$, obtained from C by
adding an overall parity-check, has parity-check matrix
$$\begin{bmatrix} & & & 0 \\ & H & & \vdots \\ & & & 0 \\ 1 & \cdots & 1 & 1 \end{bmatrix}.$$
8 The Hamming codes

The Hamming codes are an important family of single-error-


correcting codes which are easy to encode and decode. They are
linear codes and can be defined over any finite field GF(q) but,
for simplicity, we first restrict our attention to the binary case.
A Hamming code is most conveniently defined by specifying its
parity-check matrix:

Definition Let r be a positive integer and let H be an
$r \times (2^r - 1)$ matrix whose columns are the distinct non-zero
vectors of V(r, 2). Then the code having H as its parity-check
matrix is called a binary Hamming code and is denoted by
Ham (r, 2).
(We shall later generalize this to define Ham(r,q) for any
prime power q.)

Notes (i) Ham (r, 2) has length $n = 2^r - 1$ and dimension
$k = n - r$. Thus $r = n - k$ is the number of check symbols in each
codeword and is also known as the redundancy of the code.
(ii) Since the columns of H may be taken in any order, the
code Ham (r, 2) is, for given redundancy r, any one of a number
of equivalent codes.
Examples 8.1 (i) r = 2:
\[ H = \begin{pmatrix} 0&1&1 \\ 1&0&1 \end{pmatrix}. \]
By Theorem 7.6, G = [111], and so Ham (2, 2) is just the binary
triple repetition code.
(ii) r = 3: a parity-check matrix for Ham (3, 2) is
\[ H = \begin{pmatrix} 0&0&0&1&1&1&1 \\ 0&1&1&0&0&1&1 \\ 1&0&1&0&1&0&1 \end{pmatrix}. \]
Here we have taken the columns in the natural order of
increasing binary numbers (from 1 to 7). To get H in standard
form we take the columns in a different order:
\[ H = \begin{pmatrix} 0&1&1&1&1&0&0 \\ 1&0&1&1&0&1&0 \\ 1&1&0&1&0&0&1 \end{pmatrix}. \]
Hence, by Theorem 7.6, a generator matrix for Ham (3, 2) is
\[ G = \begin{pmatrix} 1&0&0&0&0&1&1 \\ 0&1&0&0&1&0&1 \\ 0&0&1&0&1&1&0 \\ 0&0&0&1&1&1&1 \end{pmatrix}. \]

It is easily seen that Ham (3, 2) is equivalent to the perfect
[7, 4, 3]-code of Example 5.6 (and Examples 2.23 and 5.3). We
show next that all the binary Hamming codes are perfect.

Theorem 8.2 The binary Hamming code Ham (r, 2), for $r \ge 2$,
(i) is a $[2^r - 1, 2^r - 1 - r]$-code;
(ii) has minimum distance 3 (hence is single-error-correcting);
(iii) is a perfect code.

Proof (i) By definition, $\mathrm{Ham}(r, 2)^{\perp}$ is a $[2^r - 1, r]$-code and
so Ham (r, 2) is a $[2^r - 1, 2^r - 1 - r]$-code.
(ii) Since Ham (r, 2) is a linear code, it is enough, by
Theorem 5.2, to show that every non-zero codeword has weight
$\ge 3$. We do this by showing that Ham (r, 2) has no codewords of
weight 1 or 2.
Suppose Ham (r, 2) has a codeword x of weight 1, say

$x = 00 \cdots 010 \cdots 0$ (with 1 in the ith place).

Since x is orthogonal to every row of the parity-check matrix H,
the ith entry of every row of H is zero. Hence the ith column of
H is the all-zero vector, contradicting the definition of H.
Now suppose Ham (r, 2) has a codeword x of weight 2, say

$x = 0 \cdots 010 \cdots 010 \cdots 0$ (with 1s in the ith and jth places).

Denoting the sth row of H by $[h_{s1} h_{s2} \cdots h_{sn}]$, we have, since x is
orthogonal to each such row,
\[ h_{si} + h_{sj} \equiv 0 \pmod 2 \quad \text{for } s = 1, 2, \ldots, r; \]
that is
\[ h_{si} \equiv h_{sj} \pmod 2 \quad \text{for } s = 1, 2, \ldots, r. \]
Hence the ith and jth columns of H are identical, again
contradicting the definition of H.
Thus $d(\mathrm{Ham}(r, 2)) \ge 3$. On the other hand, Ham (r, 2) does
contain codewords of weight 3. For example, if the first three
columns of H are
\[ \begin{matrix} 0&0&0 \\ \vdots&\vdots&\vdots \\ 0&0&0 \\ 0&1&1 \\ 1&0&1 \end{matrix} \]
then the vector $11100 \cdots 0$ is orthogonal to every row of H and
so belongs to Ham (r, 2).
(iii) To show Ham (r, 2) is perfect, it is enough to show that
equality holds in the sphere-packing bound (2.18). With $t = 1$,
$n = 2^r - 1$ and $M = 2^{n-r}$, the left-hand side of (2.18) becomes
\[ 2^{n-r}\Bigl(1 + \binom{n}{1}\Bigr) = 2^{n-r}(1 + n) = 2^{n-r}(1 + 2^r - 1) = 2^n, \]
which is equal to the right-hand side of (2.18).

Decoding with a binary Hamming code

Since Ham (r, 2) is a perfect single-error-correcting code, the
coset leaders are precisely the $2^r$ (= n + 1) vectors of V(n, 2) of
weight $\le 1$.
The syndrome of the vector $0 \cdots 010 \cdots 0$ (with 1 in the jth
place) is $(0 \cdots 010 \cdots 0)H^T$, which is just the transpose of the
jth column of H.
Hence, if the columns of H are arranged in order of increasing
binary numbers (i.e. the jth column of H is just the binary
representation of j), then we have the following nice decoding
algorithm.
Step 1 When a vector y is received, calculate its syndrome
$S(y) = yH^T$.
Step 2 If S(y) =0, then assume y was the codeword sent.
Step 3 If $S(y) \ne 0$, then, assuming a single error, S(y) gives the
binary representation of the error position and so the error can
be corrected.
For example, with
\[ H = \begin{pmatrix} 0&0&0&1&1&1&1 \\ 0&1&1&0&0&1&1 \\ 1&0&1&0&1&0&1 \end{pmatrix}, \]
if y = 1101011, then S(y) = 110, indicating an error in the sixth
position and so we decode y as 1101001.
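
The three steps can be sketched in a few lines of Python. This is only an illustration under our own naming (the functions `hamming_parity_check`, `syndrome` and `decode` are not from the text); it assumes the columns of H are in increasing binary order, with the top row holding the most significant bit.

```python
def hamming_parity_check(r):
    """Parity-check matrix of Ham(r, 2); column j (1-based) is j written in binary."""
    n = 2**r - 1
    # Row s holds bit (r - 1 - s) of each column index, so row 0 is the top bit.
    return [[(j >> (r - 1 - s)) & 1 for j in range(1, n + 1)] for s in range(r)]


def syndrome(H, y):
    """S(y) = y H^T over GF(2)."""
    return [sum(h * b for h, b in zip(row, y)) % 2 for row in H]


def decode(H, y):
    """Steps 1 to 3: correct at most one error, reading the syndrome as a position."""
    s = syndrome(H, y)
    pos = int("".join(map(str, s)), 2)  # 0 means "no error assumed"
    corrected = list(y)
    if pos != 0:
        corrected[pos - 1] ^= 1  # flip the bit in position pos (1-based)
    return corrected


H = hamming_parity_check(3)
y = [1, 1, 0, 1, 0, 1, 1]
print(syndrome(H, y))  # [1, 1, 0]
print(decode(H, y))    # [1, 1, 0, 1, 0, 0, 1]
```

The printed values agree with the worked example: syndrome 110 points at the sixth position.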

Extended binary Hamming codes

The extended binary Hamming code $\widehat{\mathrm{Ham}}(r, 2)$ is the code
obtained from Ham (r, 2) by adding an overall parity-check.
As in the proof of Theorem 2.7, the minimum distance is
increased from 3 to 4. Also, by Exercise 5.4, the extended code
is linear and so $\widehat{\mathrm{Ham}}(r, 2)$ is a $[2^r, 2^r - 1 - r, 4]$-code.
We shall see in Exercise 8.4 that the extended code $\widehat{\mathrm{Ham}}(r, 2)$
is no better than Ham (r, 2) when used for complete decoding. In
fact, it is inferior since an extra check digit is required for each
codeword, thus slowing down the rate of transmission of information.
However, having minimum distance 4, $\widehat{\mathrm{Ham}}(r, 2)$ is
ideally suited for incomplete decoding, as described in Chapter 7,
for it can simultaneously correct any single error and detect any
double error.
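Let H be a parity-check matrix for Ham (r, 2). By Exercise
7.11, a parity-check matrix $\bar{H}$ for the extended code may be
obtained from H via
\[ H \to \bar{H} = \left[\begin{array}{c|c} H & \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} \\ \hline 1\ 1\ \cdots\ 1 & 1 \end{array}\right]. \]
The last row gives the overall parity-check equation on codewords,
i.e. $x_1 + x_2 + \cdots + x_{n+1} = 0$.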
If H is taken with columns in increasing order of binary
numbers, there is a neat incomplete decoding algorithm, illustrated
for r = 3 below, to correct any single error and at the same
time detect any double error.

Example 8.3 $\widehat{\mathrm{Ham}}(3, 2)$ has the parity-check matrix
\[ \bar{H} = \begin{pmatrix} 0&0&0&1&1&1&1&0 \\ 0&1&1&0&0&1&1&0 \\ 1&0&1&0&1&0&1&0 \\ 1&1&1&1&1&1&1&1 \end{pmatrix}. \]
The syndrome of the error vector $00 \cdots 010 \cdots 0$ (with 1 in the
jth place) is just the transpose of the jth column of $\bar{H}$. The
incomplete decoding algorithm is as follows. Suppose the received
vector is y. The syndrome $S(y) = y\bar{H}^T$ is calculated.
Suppose $S(y) = (s_1, s_2, s_3, s_4)$. Then
(i) if $s_4 = 0$ and $(s_1, s_2, s_3) = 0$, assume no errors,
(ii) if $s_4 = 0$ and $(s_1, s_2, s_3) \ne 0$, assume at least two errors have
occurred and seek retransmission,
(iii) if $s_4 = 1$ and $(s_1, s_2, s_3) = 0$, assume a single error in the
last place,
(iv) if $s_4 = 1$ and $(s_1, s_2, s_3) \ne 0$, assume a single error in the jth
place, where j is the number whose binary representation is
$(s_1, s_2, s_3)$.
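
The four cases translate directly into code. The following Python sketch is ours, not the book's: the name `decode_extended` and the status strings it returns are illustrative choices, and the matrix is the parity-check matrix of the extended [8, 4, 4]-code of this example.

```python
# Parity-check matrix of the extended Hamming code of length 8.
H_EXT = [
    [0, 0, 0, 1, 1, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]


def decode_extended(y):
    """Incomplete decoding: returns (status, word), with word None on retransmission."""
    s1, s2, s3, s4 = (sum(h * b for h, b in zip(row, y)) % 2 for row in H_EXT)
    j = 4 * s1 + 2 * s2 + s3  # (s1, s2, s3) read as a binary number
    if s4 == 0:
        if j == 0:
            return ("no errors", list(y))   # case (i)
        return ("retransmit", None)         # case (ii): at least two errors
    word = list(y)
    pos = 8 if j == 0 else j                # case (iii) or case (iv)
    word[pos - 1] ^= 1
    return ("corrected", word)


print(decode_extended([0, 0, 1, 0, 0, 0, 0, 0]))  # single error in place 3
print(decode_extended([1, 1, 0, 0, 0, 0, 0, 0]))  # two errors: retransmit
```

The overall parity bit $s_4$ distinguishes an odd number of errors (treated as one) from an even number (treated as two or none), which is exactly why minimum distance 4 suffices here.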

A fundamental theorem

Before defining Hamming codes over an arbitrary field GF(q),
we establish a fundamental relationship between the minimum
distance of a linear code and a linear independence property of
the columns of a parity-check matrix. This result will also be
important in later chapters.

Theorem 8.4 Suppose C is a linear [n, k]-code over GF(q) with
parity-check matrix H. Then the minimum distance of C is d if
and only if any d—1 columns of H are linearly independent but
some d columns are linearly dependent.

Proof By Theorem 5.2, the minimum distance of C is equal to
the smallest of the weights of the non-zero codewords. Let
$x = x_1 x_2 \cdots x_n$ be a vector in V(n, q). Then
\[ x \in C \iff xH^T = 0 \iff x_1 H_1 + x_2 H_2 + \cdots + x_n H_n = 0, \]
where $H_1, H_2, \ldots, H_n$ denote the columns of H.
Thus, corresponding to each codeword x of weight d, there is a
set of d linearly dependent columns of H. On the other hand if
there existed a set of $d - 1$ linearly dependent columns of H, say
$H_{i_1}, H_{i_2}, \ldots, H_{i_{d-1}}$, then there would exist scalars
$x_{i_1}, x_{i_2}, \ldots, x_{i_{d-1}}$, not all zero, such that
\[ x_{i_1} H_{i_1} + x_{i_2} H_{i_2} + \cdots + x_{i_{d-1}} H_{i_{d-1}} = 0. \]
But then the vector $x = (0 \cdots 0\, x_{i_1}\, 0 \cdots 0\, x_{i_2}\, 0 \cdots 0\, x_{i_{d-1}}\, 0 \cdots 0)$,
having $x_{i_j}$ in the $i_j$th position for $j = 1, 2, \ldots, d - 1$, and 0s
elsewhere, would satisfy $xH^T = 0$ and so would be a non-zero
codeword of weight less than d.

Theorem 8.4 not only provides a means of establishing the
minimum distance of a specific linear code when H is given, but
also provides a means of constructing the parity-check matrix to
provide a code of guaranteed minimum distance. We concentrate
here on the case d =3, leaving a discussion of the general case
until Chapter 14.
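
For small matrices the criterion of Theorem 8.4 can be checked by brute force. The sketch below is a hedged illustration rather than a practical algorithm: the function name `min_distance` is ours, the field is assumed to be GF(p) with p prime (so that arithmetic is just arithmetic mod p), and the running time grows exponentially with the number of columns.

```python
from itertools import combinations, product


def min_distance(H, p):
    """Smallest d such that some d columns of H are linearly dependent over GF(p)."""
    r, n = len(H), len(H[0])
    cols = [tuple(H[s][j] for s in range(r)) for j in range(n)]
    for d in range(1, n + 1):
        for subset in combinations(cols, d):
            # Dependences using fewer than d columns were excluded at smaller d,
            # so only combinations with every coefficient non-zero need testing.
            for coeffs in product(range(1, p), repeat=d):
                if all(sum(c * col[s] for c, col in zip(coeffs, subset)) % p == 0
                       for s in range(r)):
                    return d
    return None  # all columns independent: the code consists of 0 alone


H_binary = [[0, 0, 0, 1, 1, 1, 1],
            [0, 1, 1, 0, 0, 1, 1],
            [1, 0, 1, 0, 1, 0, 1]]
H_ternary = [[0, 1, 1, 1],
             [1, 0, 1, 2]]
print(min_distance(H_binary, 2))   # 3
print(min_distance(H_ternary, 3))  # 3
```

Both matrices have no zero column and no two proportional columns, but each has three dependent columns, so both codes have minimum distance 3.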

q-ary Hamming codes

In order that C be a linear code with minimum distance 3, we
require that any two columns of a parity-check matrix H be
linearly independent. Thus the columns of H must be non-zero
and no column must be a scalar multiple of another (cf. Exercise
4.4). For fixed redundancy r, let us try to construct an
$[n, n - r, 3]$-code over GF(q) with n as large as possible by finding as
large a set as possible of non-zero vectors of V(r, q) such that
none is a scalar multiple of another.
Any non-zero vector v in V(r, q) has exactly $q - 1$ non-zero
scalar multiples, forming the set $\{\lambda v \mid \lambda \in GF(q), \lambda \ne 0\}$. In fact,
the $q^r - 1$ non-zero vectors of V(r, q) may be partitioned into
$(q^r - 1)/(q - 1)$ such sets, which we will call classes, such that
two vectors are scalar multiples of each other if and only if they
are in the same class. For example, in V(2, 3), with vectors
written as columns, these classes are
\[ \left\{ \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 2 \end{pmatrix} \right\},\ \left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \end{pmatrix} \right\},\ \left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 2 \end{pmatrix} \right\} \text{ and } \left\{ \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \end{pmatrix} \right\}. \]


By choosing one vector from each class we obtain a set of
$(q^r - 1)/(q - 1)$ vectors, no two of which are linearly dependent.
Hence, by Theorem 8.4, taking these as the columns of H gives a
parity-check matrix for a $[(q^r - 1)/(q - 1),\ (q^r - 1)/(q - 1) - r,\ 3]$-code.
This code is called a q-ary Hamming code and is
denoted by Ham (r, q).
Note that different parity-check matrices may be chosen to
define Ham (r, q) for given r and q, but any such matrix may
clearly be obtained from another one by means of a permutation
of the columns and/or the multiplication of some columns by
non-zero scalars. Thus the Hamming codes are linear codes
which are uniquely defined, up to equivalence, by their
parameters.
An easy way to write down a parity-check matrix for
Ham (r, q) is to list as columns (e.g. in lexicographical order) all
non-zero r-tuples in V(r, q) with first non-zero entry equal to 1.
This must work because within each class of $q - 1$ scalar
multiples there is exactly one vector having 1 as its first non-zero
entry.
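
This recipe is easy to mechanize. The Python sketch below uses names of our own choosing (`hamming_columns`, `parity_check_matrix`) and labels the field elements 0, 1, ..., q − 1; for prime q these labels are the integers mod q, while for a proper prime power they are only labels and field arithmetic would have to be supplied separately.

```python
from itertools import product


def hamming_columns(r, q):
    """All r-tuples over {0, ..., q-1} with first non-zero entry 1, in lexicographic order."""
    cols = []
    for v in product(range(q), repeat=r):  # product() yields tuples lexicographically
        first_nonzero = next((x for x in v if x != 0), None)
        if first_nonzero == 1:
            cols.append(v)
    return cols


def parity_check_matrix(r, q):
    """Parity-check matrix of Ham(r, q) as a list of r rows."""
    cols = hamming_columns(r, q)
    return [[col[s] for col in cols] for s in range(r)]


for row in parity_check_matrix(2, 3):
    print(row)
# [0, 1, 1, 1]
# [1, 0, 1, 2]
print(len(hamming_columns(3, 5)), (5**3 - 1) // (5 - 1))  # 31 31
```

The count of columns always equals $(q^r - 1)/(q - 1)$, since exactly one vector in each class has 1 as its first non-zero entry.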

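Examples 8.5 (i) A parity-check matrix for Ham (2, 3) is
\[ \begin{pmatrix} 0&1&1&1 \\ 1&0&1&2 \end{pmatrix}. \]
(ii) A parity-check matrix for Ham (2, 11) is
\[ \begin{pmatrix} 0&1&1&1&1&1&1&1&1&1&1&1 \\ 1&0&1&2&3&4&5&6&7&8&9&10 \end{pmatrix}. \]
(iii) A parity-check matrix for Ham (3, 3) is
\[ \begin{pmatrix} 0&0&0&0&1&1&1&1&1&1&1&1&1 \\ 0&1&1&1&0&0&0&1&1&1&2&2&2 \\ 1&0&1&2&0&1&2&0&1&2&0&1&2 \end{pmatrix}. \]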
Theorem 8.6 Ham (r, q) is a perfect single-error-correcting
code.

Proof Ham (r, q) was constructed to be an (n, M, 3)-code with
$n = (q^r - 1)/(q - 1)$ and $M = q^{n-r}$. With t = 1, the left-hand side
of the sphere-packing bound (2.17) becomes
\[ q^{n-r}(1 + n(q - 1)) = q^{n-r}(1 + q^r - 1) = q^n, \]
which is the right-hand side of (2.17), and so Ham (r, q) is a
perfect code.

Corollary 8.7 If q is a prime power and if $n = (q^r - 1)/(q - 1)$,
for some integer $r \ge 2$, then
\[ A_q(n, 3) = q^{n-r}. \]

Thus, if q is a prime power and d = 3, then the main coding
theory problem, that of finding $A_q(n, 3)$, is solved for an infinite
sequence of values of n. In particular, we have now established a
further entry of Table 2.4, namely $A_2(15, 3) = 2^{11} = 2048$.

Decoding with a q-ary Hamming code

Since a Hamming code is a perfect single-error-correcting code,
the coset leaders, other than 0, are precisely the vectors of
weight 1. The syndrome of such a coset leader is
\[ S(0 \cdots 0\, b\, 0 \cdots 0) = (0 \cdots 0\, b\, 0 \cdots 0)H^T = bH_j^T \quad (b \text{ in the } j\text{th place}), \]
where $H_j$ denotes the jth column of H.
So the decoding scheme is as follows. Given a received vector
y, calculate $S(y) = yH^T$. If S(y) = 0, assume no errors. If
$S(y) \ne 0$, then $S(y) = bH_j^T$ for some b and j and the assumed
single error is corrected by subtracting b from the jth entry of y.
For example, suppose q = 5 and
\[ H = \begin{pmatrix} 0&1&1&1&1&1 \\ 1&0&1&2&3&4 \end{pmatrix}. \]
Suppose the received vector is y = 203031. Then S(y) = (2, 3) =
2(1, 4) and so we decode y as 203034.
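
A direct, if unoptimized, rendering of this scheme in Python follows. It is a sketch under stated assumptions: q is taken to be prime so that arithmetic mod q is field arithmetic, the name `decode_qary` is ours, and H is taken to be the parity-check matrix of Ham (2, 5) with columns in lexicographic order, which reproduces the syndrome (2, 3) quoted in the example.

```python
def decode_qary(H, y, q):
    """Single-error syndrome decoding for a Hamming code over GF(q), q prime."""
    r, n = len(H), len(y)
    s = [sum(h * x for h, x in zip(row, y)) % q for row in H]
    if not any(s):
        return list(y)  # zero syndrome: assume no errors
    for j in range(n):
        col = [H[i][j] for i in range(r)]
        for b in range(1, q):
            if [(b * c) % q for c in col] == s:
                word = list(y)
                word[j] = (word[j] - b) % q  # subtract b from the jth entry
                return word
    raise ValueError("syndrome is not a multiple of any column of H")


H = [[0, 1, 1, 1, 1, 1],
     [1, 0, 1, 2, 3, 4]]
print(decode_qary(H, [2, 0, 3, 0, 3, 1], 5))  # [2, 0, 3, 0, 3, 4]
```

For a Hamming code the final `raise` is never reached, because every non-zero r-tuple is a scalar multiple of exactly one column of H.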

Shortening a code

Shortening a code can be a useful device if we desire a code of
given length and minimum distance and if we know of a good
code with greater length and the same minimum distance.
Suppose C is a q-ary (n, M, d)-code. Consider a fixed coor-
dinate position, the jth say, and a fixed symbol A of the alphabet.
Then, if we take all the codewords of C having A in the jth
position and then delete this jth coordinate from these code-
words, we will get a code C’ of length $n - 1$ with, in general,
fewer codewords but the same minimum distance. C’ is called a
shortened code of C.
If C is a linear [n, k, d]-code, and if the deleted symbol is 0,
then the shortened code C’ will also be linear; C’ will be an
$[n - 1, k - 1, d’]$-code, where d’ will in general be the same as d
(it may occasionally be greater than d). If C has parity-check
matrix H, then it is easy to see that a parity-check matrix of C’ is
obtained simply by deleting the corresponding column of H.
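
As a small illustration, the following Python sketch shortens Ham (3, 2) in its first coordinate. The helper names (`span`, `shorten`, `min_dist`) are ours, codewords are represented as strings of digits, and the generator matrix is the standard-form one for Ham (3, 2) from Examples 8.1(ii).

```python
from itertools import product

G = ["1000011", "0100101", "0010110", "0001111"]  # generator matrix of Ham(3, 2)


def span(G):
    """All binary linear combinations of the rows of G, as sorted strings."""
    words = set()
    for coeffs in product([0, 1], repeat=len(G)):
        w = [0] * len(G[0])
        for c, row in zip(coeffs, G):
            if c:
                w = [(a + int(b)) % 2 for a, b in zip(w, row)]
        words.add("".join(map(str, w)))
    return sorted(words)


def shorten(code, j, symbol="0"):
    """Keep codewords with `symbol` in position j (0-based), then delete that position."""
    return [w[:j] + w[j + 1:] for w in code if w[j] == symbol]


def min_dist(code):
    """Minimum Hamming distance between distinct codewords."""
    return min(sum(a != b for a, b in zip(u, v))
               for i, u in enumerate(code) for v in code[i + 1:])


C = span(G)
C1 = shorten(C, 0)
print(len(C), min_dist(C))    # 16 3
print(len(C1), min_dist(C1))  # 8 3
```

The shortened code is a [6, 3, 3]-code: half the codewords survive and the minimum distance is unchanged, as described above.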

Example 8.8 Let us have another look at the [10, 8]-code over
GF (11) considered in Example 7.12. This was defined to have
parity-check matrix
\[ H = \begin{pmatrix} 1&1&1&1&1&1&1&1&1&1 \\ 1&2&3&4&5&6&7&8&9&10 \end{pmatrix} \]
and it now follows instantly from Theorem 8.4 that this code has
minimum distance at least 3, for clearly any two columns of H
are linearly independent. In fact, we see that it is a doubly
shortened Hamming code, for H is obtained from the parity-
check matrix of Ham (2,11), as given in Example 8.5(ii), by
deleting the first two columns. This doubly shortened Hamming
code has two practical advantages over Ham (2, 11); first, it has
an even simpler decoding algorithm, as described in Example
7.12, and, secondly, it not only corrects any single error but also
detects any double error created by the transposition of two
digits. On the other hand, Ham (2, 11) has far more codewords
than its doubly shortened version.
The 11-ary [10, 8, 3]-code of Example 8.8 is optimal in that the
number of its codewords is equal to the value of $A_{11}(10, 3)$ (see
Exercise 7.8), a result which is generalized in Exercise 8.10.
While shortening an optimal code will certainly not in general
produce an optimal code, it is interesting to note a recent result
of Best and Brouwer (1977) that the triply shortened binary
Hamming code is optimal; thus
\[ A_2(2^r - s, 3) = 2^{2^r - r - s} \quad \text{for } s = 1, 2, 3, 4. \tag{8.9} \]
For s =1, (8.9) merely states the optimality of Ham (r, 2), of
which we are already aware, while for s =2, 3 and 4, (8.9) tells
us that three successive shortenings of Ham(r,2) are also
optimal. The result was proved by the use of linear programming,
a technique which has been used to great effect recently in
obtaining improved upper bounds on $A_q(n, d)$ for a number of
cases. For a good introduction to the method, see Chapter 17 of
MacWilliams and Sloane (1977).
Taking r = 4 in (8.9) gives the values of $A_2(14, 3)$, $A_2(13, 3)$
and $A_2(12, 3)$ as shown in Table 2.4. However, if Ham (4, 2) is
shortened four times, the resulting (11, 128,3)-code is not
optimal, for we see from Table 2.4 that there exists a binary
(11, 144, 3)-code.

Concluding remarks on Chapter 8

(1) Hamming codes were discovered by Hamming (1950) and
Golay (1949).
(2) For simplicity, we began this chapter by introducing only
the binary Hamming codes. In a sense some of that material was
made redundant by the treatment of q-ary Hamming codes,
which included the case q = 2; for example, Theorem 8.2 is just a
particular case of Theorem 8.6. However, the discussion of the
extended Hamming code is applicable only to the binary case, for
we cannot in general add an overall parity-check to a q-ary code
in such a way as to guarantee an increase in the minimum
distance. This is because Lemma 2.6 and hence Theorem 2.7 do
not have suitable analogues for non-binary codes.
(3) By Theorem 8.4, we can construct the parity-check
matrix of a q-ary linear code of redundancy r and minimum
distance d by finding a set of (column) vectors of V(r, q) such
that any $d - 1$ of them are linearly independent. As we have
seen, it is easy to write down such a set of N vectors for d = 3 of
any size N we wish up to a maximum value of $(q^r - 1)/(q - 1)$.
For $d \ge 4$ also, it is easy enough to construct a set of vectors of
V(r, q), any $d - 1$ of which are linearly independent, simply by
writing down vectors of V(r, q), one at a time, each time making
sure that the new vector is not a linear combination of any $d - 2$
earlier ones. However, this approach is a little naive for $d \ge 4$,
for we are likely to run out of choices for the new vector at a
relatively early stage. In fact, the problem of finding the
maximum possible number of vectors in V(r, q) such that any
$d - 1$ are linearly independent is extremely difficult for $d \ge 4$ and
very little is known except for cases $r \le 4$. The problem is of
much interest in other branches of mathematics, namely in finite
geometries and in the theory of factorial designs in statistics. We
shall return to it in Chapter 14.
We can at least use the above-mentioned naive approach to
get a lower bound on the maximum size of a code for given
length and minimum distance. This is the Gilbert bound (also
called the Gilbert-Varshamov bound), discovered independently
by Gilbert (1952) and Varshamov (1957).

Theorem 8.10 Suppose q is a prime power. Then there exists a
q-ary [n, k]-code with minimum distance at least d provided the
following inequality holds:
\[ \sum_{i=0}^{d-2} (q - 1)^i \binom{n-1}{i} < q^{n-k}. \tag{8.11} \]

Proof Suppose q, n, k and d satisfy (8.11). We shall construct
an $(n - k) \times n$ matrix H over GF(q) with the property that no
$d - 1$ columns are linearly dependent. By Theorem 8.4, this will
establish the theorem. Put $r = n - k$. Choose the first column of
H to be any non-zero r-tuple in V(r, q). Then choose the second
column to be any non-zero r-tuple which is not a scalar multiple
of the first. Continue choosing successive columns so that each
new column is not a linear combination of any $d - 2$ or fewer
previous columns. There are $q - 1$ possible non-zero coefficients
and so when we come to try to choose the ith column, those
r-tuples not available to us will be the
\[ N(i) = 1 + \binom{i-1}{1}(q - 1) + \binom{i-1}{2}(q - 1)^2 + \cdots + \binom{i-1}{d-2}(q - 1)^{d-2} \]
linear combinations of $d - 2$ or fewer columns from the $i - 1$
columns already chosen. Not all of these linear combinations
need be distinct vectors, but even in the worst case, where they
are distinct, provided N(i) is less than the total number $q^r$ of all
r-tuples, then an ith column can be added to the matrix. Thus,
since (8.11) holds, we will reach a matrix H having n columns, as
required.

The following is an immediate consequence of Theorem 8.10.

Corollary 8.12 If q is a prime-power, then
\[ A_q(n, d) \ge q^{k_1}, \]
where $k_1$ is the largest integer k satisfying
\[ q^k < q^n \Big/ \sum_{i=0}^{d-2} (q - 1)^i \binom{n-1}{i}. \]
Corollary 8.12 gives a general lower bound on $A_q(n, d)$ when q is
a prime-power and is the best available for large n (see, e.g.,
Chapter 17, Theorem 30 of MacWilliams and Sloane 1977).
However, for specific values of q, n and d one can usually do
much better by constructing a good code in some other way. For
example, taking q = 2, n = 13 and d = 5, Corollary 8.12 promises
only the existence of a binary (13, M, 5)-code with M = 16,
whereas we see from Table 2.4 that the actual value of $A_2(13, 5)$
is 64. We shall construct such an optimal binary (13, 64, 5)-code
in Exercise 9.10.
For a weaker version of the Gilbert-Varshamov bound, but
one which applies for any size q of alphabet, see Exercise 8.12.
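
The numbers quoted for q = 2, n = 13, d = 5 are easily reproduced. In the Python sketch below the function names are ours; `gv_dimension` returns the integer $k_1$ of Corollary 8.12 and `weak_gilbert_bound` evaluates the bound of Exercise 8.12, rounded up to an integer.

```python
from math import comb


def gv_dimension(q, n, d):
    """Largest k with sum_{i=0}^{d-2} (q-1)^i C(n-1, i) < q^(n-k); None if no k works."""
    total = sum((q - 1) ** i * comb(n - 1, i) for i in range(d - 1))
    best = None
    for k in range(n + 1):
        if total < q ** (n - k):
            best = k
    return best


def weak_gilbert_bound(q, n, d):
    """Ceiling of q^n / sum_{i=0}^{d-1} (q-1)^i C(n, i)."""
    total = sum((q - 1) ** i * comb(n, i) for i in range(d))
    return -(-(q ** n) // total)  # integer ceiling division


k1 = gv_dimension(2, 13, 5)
print(k1, 2 ** k1)                   # 4 16
print(weak_gilbert_bound(2, 13, 5))  # 8
```

Here the sum in (8.11) is 1 + 12 + 66 + 220 = 299, which lies between $2^8$ and $2^9$, so the largest admissible k is 13 − 9 = 4.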
Exercises 8

8.1 Write down a parity-check matrix for the binary [15, 11]-
Hamming code. Explain how the code can be used to
correct any single error in a codeword. What happens if
two or more errors occur in any codeword?
8.2 With the code of Example 8.3, use an incomplete decod-
ing algorithm to decode the following received vectors.

11100000, 01110000, 11000000, 00110011.

8.3 Show that the code of Examples 2.23, 5.3(ii) and 5.6(ii) is
a Hamming code.
8.4 Suppose C is a binary Hamming code of length n and that
$\hat{C}$ is its extended code of length n + 1. For a binary
symmetric channel with symbol error probability p, find
$P_{\mathrm{corr}}(C)$ and $P_{\mathrm{corr}}(\hat{C})$ in terms of p and n, and show that,
surprisingly, $P_{\mathrm{corr}}(\hat{C}) = P_{\mathrm{corr}}(C)$.
8.5 (i) Write down a parity-check matrix for the 7-ary
[8, 6]-Hamming code and use it to decode the re-
ceived vectors 35234106 and 10521360.
(ii) Write down a parity-check matrix for the 5-ary
[31, 28]-Hamming code.
8.6 Use Theorem 8.4 to determine the minimum distance of
the binary code with generator matrix

a 11007
1010
0110
LL 1111
1101
0101
L 1001.

8.7 Let $C_1$ be the code over GF(5) generated by
\[ \begin{pmatrix} 1&2&4&0&3 \\ 0&2&1&4&1 \\ 2&0&3&1&4 \end{pmatrix}. \]
Let $C_2$ be the code over GF(3) generated by
\[ \begin{pmatrix} 1&2&0&2&1&0 \\ 2&0&1&2&0&1 \\ 1&1&1&2&1&2 \end{pmatrix}. \]
Find a parity-check matrix for each code and determine
the minimum distance of each code.
8.8 Use Theorem 8.4 to construct a [6, 3, 4]-code over GF(5).
8.9 Let $R_r$ denote the rate of the binary Hamming code
Ham (r, 2). Determine $\lim_{r \to \infty} R_r$.
8.10 Prove that if q is a prime power and if $3 \le n \le q + 1$, then
$A_q(n, 3) = q^{n-2}$.
8.11 (The ‘football pool problem’) Suppose there are t football
matches and that a bet consists of forecasting the outcome,
home win (1), away win (2) or draw (X), of each of the t
matches. Thus a bet can be regarded as a ternary t-tuple
over the alphabet {1, 2, X}.
The ‘t-match football pool problem’ is the following.
‘What is the least number f(t) of bets required to
guarantee at least a second prize (i.e. a bet having at most
one incorrect forecast)?’
(a) (i) By using Hamming codes over GF(3), find the
value of f(t) for values of t of the form $(3^r - 1)/2$
for some integer $r \ge 2$; i.e. for t = 4, 13, 40,
121, ....
(ii) Enter in the coupon below a minimum number
of bets which will guarantee at least 3 correct
forecasts in some bet.

Arsenal Luton
Coventry Ipswich
Liverpool Chelsea
Watford Everton

(b) Show that $23 \le f(5) \le 27$.
[Remark: It was shown by Kamps and Van Lint
(1967) that f(5) = 27, the proof taking ten pages. The
value of f(t) is unknown for t > 5 except for values
13, 40, 121, etc., covered by part (a). For some
recent work on the bounds for f(6), f(7) and f(8) see
Fernandes and Rechtschaffen (1983), Weber (1983),
and Blokhuis and Lam (1984).]
8.12 (A weaker, but more general, version of the Gilbert-Varshamov
bound). Prove that, for any integer $q \ge 2$,
\[ A_q(n, d) \ge q^n \Big/ \sum_{i=0}^{d-1} (q - 1)^i \binom{n}{i}. \]


[Remark: When q is a prime power, this bound is much
inferior to that of Corollary 8.12. For example, it guar-
antees the existence of a binary (13, M,5)-code having
only M=8, compared with M=16 given by Corollary
8.12 and a largest possible value of M of 64.]
9 Perfect codes

We recall from Chapter 2 that a q-ary t-error-correcting code of
length n is called perfect if the spheres of radius t about
codewords fill the space $(F_q)^n$ with no overlap; thus a q-ary
(n, M, 2t + 1)-code is perfect if and only if the sphere-packing
condition
\[ M\left\{1 + (q - 1)n + (q - 1)^2\binom{n}{2} + \cdots + (q - 1)^t\binom{n}{t}\right\} = q^n \tag{9.1} \]
is satisfied.
Apart from being the best codes for their n and d, perfect
codes are of much interest to mathematicians, largely because of
their associated designs and automorphism groups.
The problem of finding all perfect codes was begun by M.
Golay in 1949 but not completed until 1973 (and then only in the
case of prime-power alphabets) by J. H. van Lint and A.
Tietäväinen. Before giving their final result (Theorem 9.5) we
review the perfect codes we already know of and describe two
new ones.
The trivial perfect codes were defined in Chapter 2 to be
binary repetition codes of odd length, codes consisting of a single
codeword, or the whole of $(F_q)^n$.
In Chapter 8 we defined the perfect q-ary Hamming codes with
parameters
\[ (n, M, d) = ((q^r - 1)/(q - 1),\ q^{n-r},\ 3), \]
for any integer $r \ge 2$ and any prime power q.
Note that the Hamming parameters satisfy (9.1) for any
positive integer q and, while it is conjectured that there do not
exist any codes having these parameters for q not a prime-power,
this is known to be the case only for q = 6 and r = 2 (see
Theorem 9.12).
A natural approach in looking for further perfect codes was
first to seek solutions of (9.1) in integers q, M, n and t; i.e.
to find q, n and t such that $\sum_{i=0}^{t} (q - 1)^i \binom{n}{i}$ is a power of q. A
limited search by Golay (1949) produced only three feasible sets
of parameters (n, M, d) other than the above-mentioned. These
were $(23, 2^{12}, 7)$ and $(90, 2^{78}, 5)$ with q = 2 and $(11, 3^6, 5)$ with
q = 3.
[Remark: A computer search carried out by van Lint in 1967
showed that these are the only further solutions of the sphere-packing
condition with $n \le 1000$, $t \le 1000$ and $q \le 100$.]
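
A search of this kind is simple to repeat on a small scale. The Python sketch below is our own and makes several assumptions: q is prime, so that the sphere size must itself be a power of q; Hamming parameters are excluded by requiring $t \ge 2$; and the trivial cases are excluded by requiring $2t + 1 < n$. The function names are illustrative.

```python
from math import comb


def sphere_size(n, t, q):
    """Number of vectors within distance t of a fixed vector of (F_q)^n."""
    return sum((q - 1) ** i * comb(n, i) for i in range(t + 1))


def is_power_of(q, m):
    """True if m = q^e for some integer e >= 0."""
    while m % q == 0:
        m //= q
    return m == 1


def feasible(q, n_max, t_min=2):
    """Pairs (n, t) with t >= t_min and 2t + 1 < n <= n_max whose sphere size is a power of q."""
    hits = []
    for n in range(3, n_max + 1):
        for t in range(t_min, (n - 1) // 2 + 1):
            if 2 * t + 1 < n and is_power_of(q, sphere_size(n, t, q)):
                hits.append((n, t))
    return hits


print(sphere_size(23, 3, 2) == 2**11, sphere_size(90, 2, 2) == 2**12)  # True True
print(sphere_size(11, 2, 3) == 3**5)                                   # True
print(feasible(2, 100), feasible(3, 50))
```

The three sphere sizes give $M = 2^{23}/2^{11} = 2^{12}$, $M = 2^{90}/2^{12} = 2^{78}$ and $M = 3^{11}/3^5 = 3^6$, the values of M in Golay's three parameter sets.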
In his 1949 paper, Golay was concerned only with linear codes.
He exhibited generator matrices, which he presumably had found
by trial and error, for codes having the parameters $(23, 2^{12}, 7)$
and $(11, 3^6, 5)$, and he also showed that a linear [90, 78, 5]-code
over GF(2) could not exist. Remarkably, he did all this, together
with generalizing the Hamming codes from those over GF(2) to
those over any prime field, in less than one page!
Before describing the two perfect Golay codes, we give a
proof, based on that of Golay, of the non-existence of a linear
code having the third feasible set of parameters.

Theorem 9.2 There does not exist a binary linear [90, 78, 5]-
code.

Proof Suppose H were a parity-check matrix for a binary
[90, 78, 5]-code. Then H is a $12 \times 90$ matrix, whose columns we
denote by $H_1, H_2, \ldots, H_{90}$. By Theorem 8.4, any four columns
of H are linearly independent and so the set
\[ X = \{0,\ H_i,\ H_j + H_k \mid 1 \le i \le 90,\ 1 \le j < k \le 90\} \]
is a set of $1 + 90 + \binom{90}{2}$ distinct column vectors. But
$1 + 90 + \binom{90}{2} = 2^{12}$ and so X is precisely the set V(12, 2) of all binary
12-tuples. Hence the number of vectors of odd weight in X is $2^{11}$
(see e.g. Exercise 2.4 or Exercise 5.5). We now calculate this
number in a different way. Suppose m of the columns of H have
odd weight, so that $90 - m$ of them have even weight. As in
Lemma 2.6, $w(H_i + H_j) = w(H_i) + w(H_j) - 2w(H_i \cap H_j)$, and so
$w(H_i + H_j)$ is odd if and only if exactly one of $w(H_i)$ and $w(H_j)$
is odd. Thus another expression for the number of vectors of odd
weight in X is $m + m(90 - m)$. Hence
\[ m(91 - m) = 2^{11} \]
and so both m and $91 - m$ are powers of 2. This is clearly
impossible for any integer m and so the desired linear code
cannot exist.
Remark The non-existence of a non-linear $(90, 2^{78}, 5)$-code will
be demonstrated in Theorem 9.7.

The binary Golay [23, 12, 7]-code

We present here the binary Golay code, as did Golay in his 1949
paper, by exhibiting a generator matrix. This is a little unsatisfactory
in that it is not clear where the matrix has come from, but
it should at least satisfy the reader that the code exists (it will be
defined in a more natural way, as a cyclic code, in Chapter 12).
Following the treatment of Pless (1982) and MacWilliams and
Sloane (1977), we give a different, though equivalent, generator
matrix from that given by Golay in order to facilitate the
derivation of the code’s properties and particularly its minimum
distance.
By Theorem 2.7 and Exercise 5.4, the existence of a
[23, 12, 7]-code C implies the existence of a [24, 12, 8]-code $\hat{C}$
and vice versa. It turns out to be advantageous to define the
extended Golay code $\hat{C}$ first.
Theorem 9.3 The code $G_{24}$ having generator matrix $G = [I_{12} \mid A]$, where
\[ A = \begin{pmatrix}
0&1&1&1&1&1&1&1&1&1&1&1 \\
1&1&1&0&1&1&1&0&0&0&1&0 \\
1&1&0&1&1&1&0&0&0&1&0&1 \\
1&0&1&1&1&0&0&0&1&0&1&1 \\
1&1&1&1&0&0&0&1&0&1&1&0 \\
1&1&1&0&0&0&1&0&1&1&0&1 \\
1&1&0&0&0&1&0&1&1&0&1&1 \\
1&0&0&0&1&0&1&1&0&1&1&1 \\
1&0&0&1&0&1&1&0&1&1&1&0 \\
1&0&1&0&1&1&0&1&1&1&0&0 \\
1&1&0&1&1&0&1&1&1&0&0&0 \\
1&0&1&1&0&1&1&1&0&0&0&1
\end{pmatrix}, \]
is a [24, 12, 8]-code.
Proof We are required to show that $d(G_{24}) = 8$, and by
Theorem 5.2 it is enough to show that every non-zero codeword has
weight at least 8. The above generator matrix has been chosen so
that this can be done without having to list all $2^{12}$ codewords. We
proceed by a sequence of four lemmas.

Lemma 1 $G_{24}^{\perp} = G_{24}$, i.e. $G_{24}$ is self-dual.

Proof It is readily checked that $u \cdot v = 0$, or equivalently that
$w(u \cap v)$ is even, for every pair of (not necessarily distinct) rows
u and v of G. (The amount of checking involved here can be
much reduced by observing that each of rows 3 to 12 of matrix A
can be obtained from the second row by means of a cyclic shift of
the last 11 coordinates. For, by symmetry arguments, it is then
Proof We write a codeword $x = x_1 x_2 \cdots x_{24}$ as (L | R) where
$L = x_1 \cdots x_{12}$ is the left half of x and $R = x_{13} \cdots x_{24}$ is the right
half of x. Suppose x is a codeword of $G_{24}$ of weight 4. Then one
of the following cases occurs.
Case 1 w(L) = 0, w(R) = 4. This is impossible since we see
from the generator matrix G that 0 is the only codeword
with w(L) = 0.
Case 2 w(L) = 1, w(R) = 3. If w(L) = 1, then x is one of the
rows of G, none of which has w(R) = 3.
Case 3 w(L) = 2, w(R) = 2. If w(L) = 2, then x is the sum of
two rows of G, but it is easily seen that no sum of two
rows of A has weight 2.
Case 4 w(L) = 3, w(R) = 1. It would be tedious to check that
the sum of any three rows of G has w(R) > 1. But by
using Lemma 2 we can avoid this. For if w(R) = 1, then
x must be one of the rows of [A | I], none of which has
weight 4.
Case 5 w(L) = 4, w(R) = 0. Again by looking at the generator
matrix [A | I] we see that 0 is the only codeword having
w(R) = 0.
Theorem 9.3 now follows immediately from Lemmas 3 and 4.
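The binary Golay code $G_{23}$ is obtained from $G_{24}$ simply by
omitting the last coordinate position from all codewords. $G_{23}$ is
thus a $(23, 2^{12}, 7)$-code whose parameters satisfy the sphere-packing
condition
\[ 2^{12}\left\{1 + 23 + \binom{23}{2} + \binom{23}{3}\right\} = 2^{23}, \quad \text{i.e.} \quad 1 + 23 + \binom{23}{2} + \binom{23}{3} = 2^{11}. \]
So $G_{23}$ is a perfect code.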
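
Although the point of the lemmas is to avoid listing all 4096 codewords, a computer does so at once, and this gives an independent check. The sketch below is ours: it assumes that the right-hand half A of the generator matrix is the bordered circulant whose second row is 1 followed by 11011100010, with each later row a left cyclic shift of those last 11 coordinates (the form used by MacWilliams and Sloane (1977)). Codewords are packed into 24-bit integers.

```python
from collections import Counter

# Last 11 coordinates of the second row of A; later rows are left cyclic shifts.
FIRST = [1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]
A = [[0] + [1] * 11] + [[1] + FIRST[i:] + FIRST[:i] for i in range(11)]


def to_int(bits):
    return int("".join(map(str, bits)), 2)


# Rows of G = [I_12 | A] as 24-bit integers (leftmost coordinate = highest bit).
ROWS = [(1 << (23 - i)) | to_int(A[i]) for i in range(12)]


def codewords(rows):
    """All binary linear combinations of the given rows."""
    for mask in range(1 << len(rows)):
        word = 0
        for i, row in enumerate(rows):
            if (mask >> i) & 1:
                word ^= row
        yield word


def weight_distribution(words):
    return dict(sorted(Counter(bin(w).count("1") for w in words).items()))


g24 = list(codewords(ROWS))
g23 = [w >> 1 for w in g24]  # puncture: drop the last coordinate
print(weight_distribution(g24))
print(min(bin(w).count("1") for w in g23 if w))
```

Under the stated assumption about A, the first line should show only the weights 0, 8, 12, 16 and 24, and the second the minimum weight 7 of the punctured code.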

Remark The omission of any other fixed coordinate from $G_{24}$
(this process is called puncturing) would also give a $(23, 2^{12}, 7)$-code
and it happens that any such punctured code is equivalent
to $G_{23}$.

The ternary Golay [11, 6, 5]-code

With just a little trial and error it is not difficult to make use of
Theorem 8.4 and to construct the parity-check matrix of an
[11, 6, 5]-code over GF(3) (see Exercise 9.3).
However, to bring out the similarities of the binary and ternary
Golay codes, we exhibit a generator matrix for a ternary
[12, 6, 6]-code $G_{12}$, which may be punctured to get the perfect
ternary Golay code $G_{11}$ with parameters [11, 6, 5].

Theorem 9.4 The ternary code $G_{12}$ having generator matrix
\[ G = [I_6 \mid A] = \left(\begin{array}{cccccc|cccccc}
1&0&0&0&0&0&0&1&1&1&1&1 \\
0&1&0&0&0&0&1&0&1&2&2&1 \\
0&0&1&0&0&0&1&1&0&1&2&2 \\
0&0&0&1&0&0&1&2&1&0&1&2 \\
0&0&0&0&1&0&1&2&2&1&0&1 \\
0&0&0&0&0&1&1&1&2&2&1&0
\end{array}\right) \]
is a [12, 6, 6]-code.

Proof This is left to Exercise 9.2.
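
A brute-force count is no substitute for the proof requested in Exercise 9.2, but it is a useful sanity check, since there are only $3^6 = 729$ codewords. The Python sketch below is ours; it assumes that A is the bordered circulant with first row 011111 and second row 101221, each later row being obtained by a cyclic shift of the last five coordinates.

```python
from itertools import product

A = [
    [0, 1, 1, 1, 1, 1],
    [1, 0, 1, 2, 2, 1],
    [1, 1, 0, 1, 2, 2],
    [1, 2, 1, 0, 1, 2],
    [1, 2, 2, 1, 0, 1],
    [1, 1, 2, 2, 1, 0],
]
G = [[int(i == j) for j in range(6)] + A[i] for i in range(6)]  # G = [I_6 | A]


def codewords(G, q):
    """All linear combinations of the rows of G over GF(q), q prime."""
    n = len(G[0])
    for coeffs in product(range(q), repeat=len(G)):
        yield tuple(sum(c * row[j] for c, row in zip(coeffs, G)) % q for j in range(n))


def weight(w):
    return sum(x != 0 for x in w)


g12 = list(codewords(G, 3))
g11 = [w[:-1] for w in g12]  # puncture in the last coordinate
print(len(g12))
print(min(weight(w) for w in g12 if any(w)))
print(min(weight(w) for w in g11 if any(w)))
```

Under the stated assumption about A, the three printed numbers should be 729, 6 and 5, in agreement with the parameters [12, 6, 6] and [11, 6, 5].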

Are there any more perfect codes?

It was conjectured for some time that the Hamming codes
Ham (r, q) and the Golay codes $G_{23}$ and $G_{11}$ were the only
non-trivial perfect codes. However, in 1962, J. L. Vasil’ev
constructed a family of non-linear perfect codes with the same
parameters as the binary Hamming codes. Then Schönheim
(1968) and Lindström (1969) gave non-linear codes with the same
parameters as Hamming codes over GF(q) for any prime power
q.
The conjecture was weakened to: ‘any non-trivial perfect code
has the parameters of a Hamming or Golay code’. The proof of
this, for q a prime power, was finally completed by Tietäväinen
(1973) following major contributions by van Lint (see van Lint
(1975)). Thus we have the following result, which was also
proved independently by Zinov’ev and Leont’ev (1973).

Theorem 9.5 (van Lint and Tietäväinen) A non-trivial perfect
q-ary code, where q is a prime power, must have the same
parameters as one of the Hamming or Golay codes.
The proof of Theorem 9.5 is rather complicated and the
details, which may be found in MacWilliams and Sloane (1977),
are omitted here. One important ingredient of the proof is
Lloyd’s theorem, which we also state without proof, which gives
a further necessary condition on the parameters for the existence
of a perfect code. The binomial coefficient $\binom{x}{m}$ in the following
is defined by
\[ \binom{x}{m} = \begin{cases} \dfrac{x(x - 1) \cdots (x - m + 1)}{m!} & \text{if } m \text{ is a positive integer} \\ 1 & \text{if } m = 0. \end{cases} \]

Theorem 9.6 (Lloyd (1957)) If there exists a perfect (n, M, 2t + 1)-code
over GF(q), then the polynomial $L_t(x)$ defined by
\[ L_t(x) = \sum_{j=0}^{t} (-1)^j (q - 1)^{t-j} \binom{x-1}{j} \binom{n-x}{t-j} \]
has t distinct integer roots in the interval $1 \le x \le n$.
Using Lloyd’s theorem, it was shown that an unknown perfect
code over GF(q) must have $t \le 11$, $q \le 8$ and $n \le 485$. However,
by the computer search mentioned earlier, the only parameters
in this range satisfying the sphere-packing condition are those of
trivial, Hamming or Golay codes and also the parameters
$(n, M, d) = (90, 2^{78}, 5)$ with q = 2. [Remark: It has been shown
by H. W. Lenstra and A. M. Odlyzko (unpublished) that the
computer search can be avoided by tightening the inequalities.]
We have already established the non-existence of a linear
$(90, 2^{78}, 5)$-code. The non-existence of a non-linear code with
these parameters follows from Lloyd’s theorem, for with t = 2
and n = 90,

\[
L_2(x) = 0 \quad\text{if and only if}\quad x^2 - 91x + 2048 = 0
\]


and this equation does not have integer solutions in x.
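The root condition of Lloyd’s theorem is easy to test numerically. The following is a minimal sketch, assuming the usual form of the Lloyd polynomial, $L_t(x) = \sum_{j=0}^{t} (-1)^j (q-1)^{t-j} \binom{x-1}{j}\binom{n-x}{t-j}$; the function names `lloyd` and `integer_roots` are illustrative only and do not come from the text.

```python
from math import comb

def lloyd(n, t, q, x):
    """Value of the Lloyd polynomial L_t(x) at an integer x with 1 <= x <= n."""
    return sum((-1) ** j * (q - 1) ** (t - j) * comb(x - 1, j) * comb(n - x, t - j)
               for j in range(t + 1))

def integer_roots(n, t, q):
    """All integers x in [1, n] at which L_t vanishes."""
    return [x for x in range(1, n + 1) if lloyd(n, t, q, x) == 0]

# Parameters (90, 2^78, 5): t = 2, q = 2. Lloyd's theorem would need 2 integer roots.
print(integer_roots(90, 2, 2))
# Binary Golay parameters (23, 2^12, 7): t = 3, q = 2. Three integer roots are needed.
print(integer_roots(23, 3, 2))
```

For n = 90 the polynomial is a constant multiple of $x^2 - 91x + 2048$, whose discriminant 89 is not a perfect square, so the first list is empty; for the Golay parameters the three roots required by the theorem do appear.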
We give below a self-contained proof of this non-existence,
avoiding Lloyd’s theorem, and relying only on a simple counting
argument. We first give a simple definition.
Definition If u and v are binary vectors of the same length,
then we say that u covers v if the 1s in v are a subset of the 1s in
u. In other words,

\[
u \text{ covers } v \quad\text{if and only if}\quad u \cap v = v.
\]


For example 111001 covers 101000.

Theorem 9.7 There does not exist a binary $(90, 2^{78}, 5)$-code.

Proof Suppose, for a contradiction, that C is a $(90, 2^{78}, 5)$-code.
By Lemma 2.3, we may assume that $\mathbf{0} \in C$. Then every non-zero
codeword in C has weight at least 5. Let Y be the set of vectors
in V(90, 2) of weight 3 which begin with two 1s. Clearly |Y| = 88.
Since C is perfect, each vector y of Y lies in a unique sphere
S(x, 2) of radius 2 about some codeword x. Such a codeword x
must have weight 5 and must cover y.
Let X be the set of all codewords of C of weight 5 which begin
with two 1s. We will count in two ways the number of ordered
pairs in the set
\[
D = \{(x, y) \mid x \in X,\ y \in Y,\ x \text{ covers } y\}.
\]
By the previous remarks, each y in Y is covered by a unique x in
X and so
\[
|D| = |Y| = 88.
\]
On the other hand, each x in X (e.g. 1111100···0) covers
exactly three ys in Y (111000···0, 110100···0 and
110010···0), and so
\[
|D| = 3|X|.
\]
Hence 3|X| = 88, giving |X| = 88/3, which is a contradiction,
since |X| must be an integer. Thus a $(90, 2^{78}, 5)$-code cannot
exist.
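The two counts in this proof can be reproduced directly. The sketch below is only an illustration of the double-counting step: vectors are represented by the sets of positions of their 1s (an assumption made for convenience), and `covers` is a hypothetical helper implementing the definition above.

```python
n = 90

# Y: vectors of weight 3 beginning with two 1s, stored as sets of 1-positions.
Y = [frozenset({0, 1, k}) for k in range(2, n)]

def covers(u, v):
    """u covers v when every 1 of v is also a 1 of u."""
    return v <= u

# A typical weight-5 vector beginning with two 1s (1111100...0).
x = frozenset({0, 1, 2, 3, 4})
covered = [y for y in Y if covers(x, y)]

print(len(Y))        # size of Y
print(len(covered))  # members of Y covered by x
print(len(Y) % 3)    # non-zero remainder: 3|X| = 88 has no integer solution
```

Any other weight-5 vector starting with two 1s gives the same count of three, since exactly one of its three remaining 1s is kept.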

t-designs

The counting argument, which will be generalized in Exercise


9.5(b), of the proof of Theorem 9.7 is reminiscent of that used in
proving the relations (2.24) and (2.25) for block designs (see
Exercise 2.13). This is not just a coincidence, for we can
associate with any perfect code a certain design called a t-design.

Definition A t-design consists of a set X of v points, and a
collection of distinct k-subsets of X, called blocks, with the
property that any t-subset of X is contained in exactly λ blocks.
We call this a t-(v, k, λ) design.
Thus 2-designs are the same as balanced block designs, which
were defined in Chapter 2.

Definition A Steiner system is a t-design with λ = 1. A
t-(v, k, 1) design is usually called an S(t, k, v).
For example, the Fano plane of Example 2.19 is an S(2, 3, 7).
The following theorem shows how Steiner systems can be
obtained from perfect codes.
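The defining property of a t-design can be checked by brute force on small examples. In this sketch the Fano plane is generated from the difference set {0, 1, 3} modulo 7; this labelling is an assumption and may differ from the one used in Example 2.19. The function name `is_t_design` is illustrative.

```python
from itertools import combinations

def is_t_design(points, blocks, t, lam):
    """True if every t-subset of points lies in exactly lam of the blocks."""
    return all(sum(1 for b in blocks if set(s) <= b) == lam
               for s in combinations(points, t))

points = range(7)
# Seven blocks obtained by shifting {0, 1, 3} cyclically modulo 7.
fano = [frozenset({i % 7, (i + 1) % 7, (i + 3) % 7}) for i in range(7)]

print(is_t_design(points, fano, 2, 1))  # Steiner system S(2, 3, 7)?
```

The same function can be pointed at the weight-7 codewords of the binary Golay code to test the S(4, 7, 23) property, although that check is far slower.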

Theorem 9.8 (Assmus and Mattson 1967) If there exists a
perfect binary t-error-correcting code of length n, then there
exists a Steiner system S(t + 1, 2t + 1, n).

Proof This is left to Exercises 9.4(b) and 9.5.

Assmus and Mattson (1969) later gave an important sufficient
condition on a code, which is not necessarily perfect, for the
existence of associated t-designs. For the details, see
MacWilliams and Sloane (1977, Chapter 6) or Assmus and
Mattson (1974). Many new 5-designs have been obtained in this
way. [Remark: it was a long-standing conjecture that t-designs
having $t \ge 6$ did not exist; however the discovery of a 6-design
has recently been announced by Magliveras and Leavitt (1983).]

Remaining problems on perfect codes

Theorem 9.5 leaves the following problems unresolved.

Problem 9.9 Find all perfect codes having the parameters of the
Hamming and Golay codes.

It was observed after the definition of the q-ary Hamming
codes in Chapter 8 that any linear code with the Hamming
parameters is equivalent to a Hamming code. But the problem of
finding all non-linear codes with these parameters appears to be
very difficult and is unsolved. It is believed that there are (at
least) several thousand inequivalent perfect binary codes with the
parameters $(15, 2^{11}, 3)$. For supporting evidence see Phelps
(1983).
However, the two perfect Golay codes are unique, i.e. any
code with the parameters of a Golay code must be equivalent to
a Golay code. This was proved by Pless (1968) in the restriction
to linear codes (see also Exercise 9.3 for the ternary case). For
unrestricted codes, the uniqueness of $G_{23}$ was proved by Snover
(1973), while that of both $G_{23}$ and $G_{11}$ was demonstrated by
Delsarte and Goethals (1975).

Problem 9.10 Find all perfect codes over non-prime-power


alphabets.

It is conjectured that there are no non-trivial perfect codes


over non-prime-power alphabets. The best result to date is the
following theorem of Best (1982), the proof of which is too
involved to include here. For an outline, see Best (1983).

Theorem 9.11 For $t \ge 3$ and $t \ne 6$ or 8, the only non-trivial
perfect t-error-correcting code over any alphabet is the binary
Golay code.
It is likely that the cases t = 6 and t = 8 (and possibly even
t = 2) will be settled fairly soon†, but for t = 1, the problem
appears to be extremely difficult. We have already observed that
the parameters

\[
(n, M, d) = \bigl( (q^r - 1)/(q - 1),\ q^{n-r},\ 3 \bigr)
\]

satisfy the sphere-packing condition for integers q and $r \ge 2$. For
q a prime power, these are the parameters of the Hamming
codes, but for q not a prime power, very little is known about the
existence or otherwise of codes having these parameters; only in
the smallest case, q = 6, r = 2, is the problem resolved, as we
now describe.
The possible existence of a 6-ary $(7, 6^5, 3)$-code was first
considered explicitly by Golay (1958) and answered in the
negative by Golomb and Posner (1964), who reduced the
problem to one from recreational mathematics, posed by Euler
in 1782 and solved in 1901, as follows.

† Cases t = 6, 8 have now been settled by Y. Hong (Ph.D. Dissertation,
Ohio State University, 1984).

Theorem 9.12 There does not exist a 6-ary $(7, 6^5, 3)$-code.

Proof Suppose, for a contradiction, that C is a $(7, 6^5, 3)$-code
over the alphabet $F_6 = \{1, 2, 3, 4, 5, 6\}$. Consider the $6^5$ vectors of
length 5 obtained by deleting the last two coordinates of each
codeword of C. These must be precisely the $6^5$ distinct vectors of
$(F_6)^5$, for if two of these 5-tuples were the same, then the
corresponding two codewords in C would be distance at most 2
apart. Hence there are $6^2$ codewords of C beginning with any
fixed triple. If we now take those 36 codewords of C beginning
with 111 and then delete these first three positions, we will have
a $(4, 6^2, 3)$-code, which we denote by D. By the same argument
as above, the 36 ordered pairs given by deleting any two fixed
coordinates from the codewords of D will be precisely the 36
distinct ordered pairs in $(F_6)^2$. Hence, if a codeword (ijkl) of the
code D is identified with an officer whose rank is i and whose
regiment is j and who stands in the kth row and lth column of a
6 × 6 square, we have a solution to the following problem:

Euler’s ‘36 officers problem’ (1782) ‘There are 36 officers, one
from each of 6 ranks from each of 6 regiments. Can these officers
be arranged in a 6 × 6 square so that every row and every column
of the square contains one officer of each rank and one officer of
each regiment?’
It was conjectured by Euler that the answer is ‘no’, and this
was proved to be the case (by exhaustive search) by Tarry
(1901). For a fairly short, self-contained proof, see Stinson
(1984).
Hence a 6-ary $(7, 6^5, 3)$-code cannot exist and Theorem 9.12 is
proved.

Remark The ‘36 officers problem’ is equivalent to a problem
concerning mutually orthogonal Latin squares, a topic whose
connection with codes is the subject of the next chapter, where it
will be seen why the method of proof of Theorem 9.12 cannot be
used to rule out the existence of q-ary $(q + 1, q^{q-1}, 3)$-codes for
values of q other than 6.

Concluding remarks

(1) The Golay codes have been constructed in a number of


different ways, most naturally as cyclic codes (see Chapter 12) or
as quadratic residue codes. A less obvious, but neat elementary
construction is given in van Lint (1982).
(2) A number of special algorithms have been devised for
decoding $G_{23}$ and $G_{24}$, some of them making ingenious use of the
properties of the associated 5-design. Among these are
Berlekamp’s (1972) algorithm, Goethals’ (1971) majority logic
algorithm, and Gibson and Blake’s (1978) method using ‘miracle
octad generators’.
(3) The probability of error correction when using $G_{23}$ was
found in Exercise 6.6. By Exercise 9.1, there is no advantage in
using $G_{24}$ rather than $G_{23}$ for complete decoding.

Exercises 9

9.1 (Generalization of Exercise 8.4) Suppose C is a perfect
binary linear code of length n and that $\hat{C}$ is its extended
code. Prove that, for a binary symmetric channel,
\[
P_{\mathrm{corr}}(\hat{C}) = P_{\mathrm{corr}}(C).
\]
[Hint: Use the Pascal identity for binomial coefficients,
\[
\binom{n}{i} + \binom{n}{i-1} = \binom{n+1}{i} \quad\text{for } n \ge i \ge 1.]
\]


9.2 Prove Theorem 9.4; i.e. show that $d(G_{12}) = 6$. [Hint: Show
that $G_{12}^{\perp} = G_{12}$ so that $[-A^T \mid I]$ is also a generator matrix
for $G_{12}$. Then use the fact that if $w(x) \le 5$, then either
$w(L) \le 2$ or $w(R) \le 2$, where x = (L | R).]
9.3 Use Theorem 8.4 to construct $G_{11}$; i.e. find 11 vectors of
V(5, 3) such that any 4 of them are linearly independent.
Furthermore show that this can be done in essentially only
one way, thus proving the uniqueness of $G_{11}$ as a linear
[11, 6, 5]-code. [Hint: Show first that, up to equivalence,
we may assume that $H = [I_5 \mid A]$, where

111107

pe
x * * GC *

i
* Oo * *

x
pe
b>
lI
* oO * * *

Se
QO * * * *

=
and the asterisks represent non-zero entries.]
9.4 (a) Show that if y is a vector in V(23, 2) of weight 4,
then there exists a unique codeword x of weight 7 in
$G_{23}$ which covers y. Deduce that the number of
codewords of weight 7 in $G_{23}$ is 253.
(b) Let M be a matrix whose columns are the codewords
of weight 7 in $G_{23}$. Show that M is the incidence
matrix of a design which has 23 points, 253 blocks, 7
points in each block, and such that any 4 points lie
together in exactly one block; thus we have constructed
a Steiner system S(4, 7, 23).
9.5 Show that if there exists a perfect binary t-error-correcting
code of length n, then
(a) there exists a Steiner system S(t + 1, 2t + 1, n);
(b) $\binom{n-i}{t+1-i} \Big/ \binom{2t+1-i}{t+1-i}$ is an integer for i = 0, 1, ..., t.
[Remark: Putting n = 90, t = 2 and i = 2 in part (b) is
the case considered in proving Theorem 9.7.]
9.6 Construct a Steiner system S(5, 8, 24) from the extended
binary Golay code $G_{24}$.
9.7 Show that the number of codewords of weight 3 in the
Hamming code Ham (r, 2) is $(2^r - 1)(2^{r-1} - 1)/3$.
9.8 Show that the number of vectors of weight 5 in the ternary
Golay code is 132.
9.9 We shall construct the Nordstrom-Robinson (15, 256, 5)-code
$N_{15}$ in the following steps.
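(i) Show that if the order of the coordinates of the
binary Golay code $G_{24}$ is changed so that one of the
weight 8 codewords is 1111111100···0, then $G_{24}$
has a generator matrix having its first 8 columns as
shown below.

\[
G = \left[\begin{array}{cccccccc|c}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 &
\end{array}\right]
\]

[Hint: Since $G_{24}$ is self-dual, (a) the first seven
columns of G must be linearly independent and (b)
the codeword 1111111100···0 is orthogonal to
every codeword.]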
(ii) Show that the total number of codewords of $G_{24}$
whose first eight coordinates are one of 00000000,
10000001, 01000001, 00100001, 00010001, 00001001,
00000101 or 00000011 is 256.
(iii) Take these 256 codewords and delete the first 8
coordinates of each of them. Show that the resulting
code is a (16, 256, 6)-code. This is the extended
Nordstrom-Robinson code $N_{16}$.
(iv) Puncture $N_{16}$ (e.g. delete the last coordinate) to get
the (15, 256, 5)-code $N_{15}$.
[Remark: $N_{16}$ and $N_{15}$ are non-linear codes. They are
both optimal, cf. Table 2.4.]
9.10 Construct from $N_{15}$ a (12, 32, 5)-code. [This code is called
the Nadler code, having originally been constructed in
another way by Nadler (1962). The Nadler code is both
optimal (see Chapter 17, §4 of MacWilliams and Sloane
1977) and unique (Goethals 1977).]
9.11 (i) Show that there does not exist a binary linear
(13, 64, 5)-code. [Hint: Suppose C is a binary
[13, 6, 5]-code with generator matrix
\[
G = \begin{bmatrix} 1\,1\,1\,1\,1 & 0\,0\,0\,0\,0\,0\,0\,0 \\ G_1 & G_2 \end{bmatrix}.
\]
Show that $G_2$ generates an [8, 5, 3]-code, whose
parameters violate the sphere-packing condition.]
(ii) Deduce that there is no linear code with the
parameters of the Nordstrom-Robinson code.
(iii) Can the non-existence of a [12, 5, 5]-code (i.e. a
linear code with the parameters of the Nadler code)
be proved by the method of (i)?
10 Codes and Latin squares

The main aim of this chapter is to show how codes can be


constructed from certain sets of Latin squares and vice versa. In
particular, we shall completely solve the ‘main coding theory
problem’, over any alphabet, for single-error-correcting codes of
length 4.

Latin squares
Definition A Latin square of order q is a q × q array whose
entries are from a set $F_q$ of q distinct symbols such that each row
and each column of the array contains each symbol exactly once.

Example Let $F_3 = \{1, 2, 3\}$. Then an example of a Latin square
of order 3 is
    1 2 3
    2 3 1
    3 1 2.
Latin squares, like balanced block designs (see Chapter 2), can
be used in statistical experiments.

Example 10.1 Three headache drugs 1, 2, 3 are to be tested on
subjects A, B, C on three successive days M, T, W. One
possible schedule is
       M  T  W
    A  1  2  3
    B  1  2  3
    C  1  2  3.
But in addition to testing the effect of different drugs on the same
subject, we also want to have some measurement of the effects of
the drugs when taken on different days of the three-day period.
So we would like each drug to be used exactly once each day, i.e.
we require a Latin square for the schedule, e.g.

       M  T  W
    A  1  2  3
    B  2  3  1
    C  3  1  2.

Theorem 10.2 There exists a Latin square of order q for any


positive integer q.

Proof We can take 12···q as the first row and cycle this round
once for each subsequent row to get

    1 2 3 ··· q
    2 3 4 ··· q 1
    3 4 5 ··· q 1 2
    ···
    q 1 2 ··· q-1.

Alternatively, the addition table of $Z_q$ is a Latin square of order
q.
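The cyclic construction in this proof translates directly into code. The following is a minimal sketch; the names `cyclic_latin_square` and `is_latin` are illustrative and not taken from the text.

```python
def cyclic_latin_square(q):
    """Square of order q whose first row is 1, 2, ..., q and whose
    later rows are successive cyclic shifts of the first."""
    return [[(i + j) % q + 1 for j in range(q)] for i in range(q)]

def is_latin(square):
    """True if every row and every column contains each symbol exactly once."""
    q = len(square)
    symbols = set(square[0])
    return (len(symbols) == q
            and all(set(row) == symbols for row in square)
            and all({row[j] for row in square} == symbols for j in range(q)))

for row in cyclic_latin_square(3):
    print(row)
print(all(is_latin(cyclic_latin_square(q)) for q in range(1, 30)))
```

With q = 3 this reproduces the square 123 / 231 / 312 given as the first example of the chapter.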

Mutually orthogonal Latin squares


Definition Let A and B be two Latin squares of order q. Let $a_{ij}$
and $b_{ij}$ denote the i, jth entries of A and B respectively. Then A
and B are said to be mutually orthogonal Latin squares (abbreviated
to MOLS) if the $q^2$ ordered pairs $(a_{ij}, b_{ij})$, i, j = 1,
2, ..., q, are all distinct.
In other words, if we superimpose the two squares to form a
new q × q square with ordered pairs as entries, then these $q^2$
ordered pairs are all distinct.

Example 10.3 The Latin squares

        1 2 3             1 2 3
    A = 2 3 1   and   B = 3 1 2
        3 1 2             2 3 1

form a pair of MOLS of order 3, for when superimposed they give
the array
    (1,1) (2,2) (3,3)
    (2,3) (3,1) (1,2)
    (3,2) (1,3) (2,1).

Application Suppose three headache drugs, labelled 1, 2, 3,


and three fever drugs, also labelled 1, 2, 3, are to be tested on
three subjects A, B, C on three successive days M, T, W. As in
Example 10.1, we shall use a Latin square of order 3 for the
headache drug schedule and another one for the fever drug
schedule. Since each subject takes a headache drug and a fever
drug each day we have the opportunity of observing their
combined effect. Can we test each of the 9 combinations of
headache drug/fever drug exactly once? Yes, by using the above
pair of MOLS.
       M      T      W
    A  (1,1)  (2,2)  (3,3)
    B  (2,3)  (3,1)  (1,2)
    C  (3,2)  (1,3)  (2,1)
Here (i, j) denotes (headache drug i, fever drug j).

Example 10.4 There does not exist a pair of MOLS of order 2,
for if $F_2 = \{1, 2\}$, then the only Latin squares of order 2 are

    1 2         2 1
    2 1   and   1 2

and these are not mutually orthogonal.
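Orthogonality is a one-line check once the two squares are superimposed. The sketch below applies it to the squares of Examples 10.3 and 10.4; the function name `is_orthogonal` is an illustrative choice.

```python
def is_orthogonal(A, B):
    """True if the q*q superimposed pairs (a_ij, b_ij) are all distinct."""
    q = len(A)
    pairs = {(A[i][j], B[i][j]) for i in range(q) for j in range(q)}
    return len(pairs) == q * q

# Example 10.3: a pair of MOLS of order 3.
A = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
B = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
print(is_orthogonal(A, B))

# Example 10.4: the only two Latin squares of order 2.
S = [[1, 2], [2, 1]]
T = [[2, 1], [1, 2]]
print(is_orthogonal(S, T), is_orthogonal(S, S), is_orthogonal(T, T))
```

The order-2 check fails for every choice of pair, which is all that Example 10.4 asserts.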

Optimal single-error-correcting codes of length 4

Over an arbitrary alphabet $F_q$, let us consider the ‘main coding
theory problem’ for codes of length 4 and minimum distance 3;
i.e. the problem of finding the value of $A_q(4, 3)$. First we find an
upper bound.

Theorem 10.5 $A_q(4, 3) \le q^2$, for all q.

Proof Suppose C is a q-ary (4, M, 3)-code and let $x = x_1x_2x_3x_4$
and $y = y_1y_2y_3y_4$ be distinct codewords of C. Then
$(x_1, x_2) \ne (y_1, y_2)$, for otherwise x and y could differ only in the last
two places, contradicting d(C) = 3. Thus the M ordered pairs
obtained by deleting the last two coordinates from C are all
distinct vectors of $(F_q)^2$ and so we must have $M \le q^2$.

Example 10.6 For q=3, the bound of Theorem 10.5 is


attained, for the Hamming code Ham (2, 3) is a (4, 9, 3)-code:
0000
0112
0221
1011
1120
1202
2022
2101
2210.
Note that the ordered pairs in any two fixed coordinate positions
are precisely the distinct vectors of $(F_3)^2$. The argument of the
proof of Theorem 10.5 shows that this must be so.
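Both claims about this code, the minimum distance and the behaviour of pairs of coordinates, can be confirmed by exhaustive checking. The following sketch uses the nine codewords exactly as listed in Example 10.6; the helper name `hamming` is illustrative.

```python
from itertools import combinations

code = ["0000", "0112", "0221", "1011", "1120", "1202", "2022", "2101", "2210"]

def hamming(x, y):
    """Number of positions in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

d = min(hamming(x, y) for x, y in combinations(code, 2))

# For every choice of two coordinate positions, collect the ordered pairs seen there.
all_pairs_distinct = all(len({(w[r], w[s]) for w in code}) == 9
                         for r, s in combinations(range(4), 2))

print(d, all_pairs_distinct)
```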

Remark For $q \ge 4$, the bound of Theorem 10.5 is a big
improvement on the sphere-packing bound, which gives only that
$A_q(4, 3) \le q^4/(4q - 3)$.
Our next task is to determine those values of q for which a
q-ary $(4, q^2, 3)$-code exists. Since the $q^2$ ordered pairs starting off
the codewords of such a code are distinct, such a code must have
the form
\[
\{(i, j, a_{ij}, b_{ij}) \mid (i, j) \in (F_q)^2\}.
\]

We now demonstrate the connection between such codes and


pairs of mutually orthogonal Latin squares.

Theorem 10.7 There exists a q-ary $(4, q^2, 3)$-code if and only if
there exists a pair of MOLS of order q.

Proof We will show that a code
\[
C = \{(i, j, a_{ij}, b_{ij}) \mid (i, j) \in (F_q)^2\}
\]
is a $(4, q^2, 3)$-code if and only if $A = [a_{ij}]$ and $B = [b_{ij}]$ form a pair
of MOLS of order q.
As in the proof of Theorem 10.5, the minimum distance of C is
3 if and only if, for each pair of coordinate positions, the ordered
pairs appearing in those positions are distinct. Now the $q^2$ pairs
$(i, a_{ij})$ are distinct and the $q^2$ pairs $(j, a_{ij})$ are distinct if and only if
A is a Latin square. The $q^2$ pairs $(i, b_{ij})$ are distinct and the $q^2$
pairs $(j, b_{ij})$ are distinct if and only if B is a Latin square. Finally
the $q^2$ pairs $(a_{ij}, b_{ij})$ are distinct if and only if A and B are
mutually orthogonal.
Theorem 10.7 shows that $A_q(4, 3) = q^2$ if and only if there
exists a pair of MOLS of order q. We shall show (in Theorem
10.12) that such a pair of MOLS is easily constructed for three
quarters of all cases, or more precisely, whenever $q \equiv 0$, 1, or
3 (mod 4).

Theorem 10.8 If q is a prime power and $q \ne 2$, then there exists
a pair of MOLS of order q.

Proof Let $F_q$ be the field $GF(q) = \{\lambda_0, \lambda_1, \ldots, \lambda_{q-1}\}$, where
$\lambda_0 = 0$ (if q is prime, we may take $\lambda_i = i$ for each i). Let $\mu$ and $\nu$
be two distinct non-zero elements of GF(q). Let $A = [a_{ij}]$ and
$B = [b_{ij}]$ be q × q arrays defined by
\[
a_{ij} = \lambda_i + \mu\lambda_j \quad\text{and}\quad b_{ij} = \lambda_i + \nu\lambda_j.
\]
(The rows and columns of A and B are indexed by 0, 1, ..., q - 1.)
We first verify that A and B are Latin squares. If two
elements in the same row of A are identical, then we have
\[
\lambda_i + \mu\lambda_j = \lambda_i + \mu\lambda_{j'}, \quad\text{i.e.}\quad \mu\lambda_j = \mu\lambda_{j'},
\]
implying that j = j', since $\mu \ne 0$. Similarly, if two elements in the
same column of A are identical, then we have
\[
\lambda_i + \mu\lambda_j = \lambda_{i'} + \mu\lambda_j, \quad\text{i.e.}\quad \lambda_i = \lambda_{i'},
\]
implying that i = i'. Thus A, and similarly B, are Latin squares.
To show that A and B are orthogonal, suppose on the contrary
that $(a_{ij}, b_{ij}) = (a_{i'j'}, b_{i'j'})$, i.e. assume that the same ordered pair
appears twice in the superposition of the squares. Then
\[
\lambda_i + \mu\lambda_j = \lambda_{i'} + \mu\lambda_{j'}
\quad\text{and}\quad
\lambda_i + \nu\lambda_j = \lambda_{i'} + \nu\lambda_{j'},
\]
which on subtraction implies that
\[
(\mu - \nu)\lambda_j = (\mu - \nu)\lambda_{j'}.
\]
Since $\mu \ne \nu$, we have j = j' and, consequently, i = i'.

Remark Notice how the important field property of being able


to cancel non-zero factors was used in the above proof. A similar
construction using $Z_n$, where n is not a prime, would fail to give a
pair of MOLS.

Example 10.9 With GF(3) = {0, 1, 2}, the construction of
Theorem 10.8 gives, taking $\mu = 1$, $\nu = 2$,

        0 1 2             0 2 1
    A = 1 2 0   and   B = 1 0 2
        2 0 1             2 1 0.

The corresponding (4, 9, 3)-code, given by Theorem 10.7, is
precisely the Hamming code as displayed in Example 10.6.
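The chain from Theorem 10.8 to Theorem 10.7 can be followed computationally. This is a minimal sketch restricted to prime q, so that the field arithmetic is just arithmetic modulo q; for proper prime powers a genuine GF(q) implementation would be needed. The names `mols_pair`, `code_from_squares` and `min_distance` are illustrative.

```python
from itertools import combinations

def mols_pair(p, mu, nu):
    """Squares a_ij = i + mu*j and b_ij = i + nu*j, computed modulo p.
    Assumes p is prime and mu, nu are distinct and non-zero modulo p."""
    A = [[(i + mu * j) % p for j in range(p)] for i in range(p)]
    B = [[(i + nu * j) % p for j in range(p)] for i in range(p)]
    return A, B

def code_from_squares(A, B):
    """Codewords (i, j, a_ij, b_ij) as in Theorem 10.7."""
    q = len(A)
    return [(i, j, A[i][j], B[i][j]) for i in range(q) for j in range(q)]

def min_distance(code):
    return min(sum(a != b for a, b in zip(x, y))
               for x, y in combinations(code, 2))

A, B = mols_pair(3, 1, 2)
code = code_from_squares(A, B)
words = ["".join(map(str, w)) for w in code]
print(words)
print(min_distance(code))
```

Calling `mols_pair(4, 1, 3)` instead, which works in $Z_4$ rather than in a field, produces a code of minimum distance below 3, in line with the Remark following Theorem 10.8.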
We next describe a construction which yields pairs of MOLS of
order g for many more values of q.

Theorem 10.10 If there exists a pair of MOLS of order m and


there exists a pair of MOLS of order n, then there exists a pair of
MOLS of order mn.

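Proof Suppose $A_1, A_2$ is a pair of MOLS of order m and $B_1, B_2$
is a pair of MOLS of order n.
Denote the (i, j)th entry of $A_k$ by $a_{ij}^{(k)}$ (k = 1, 2)
and the (i, j)th entry of $B_k$ by $b_{ij}^{(k)}$ (k = 1, 2).
Let $C_1$ and $C_2$ be the mn × mn squares defined by

\[
C_k = \begin{bmatrix}
(a_{11}^{(k)}, B_k) & (a_{12}^{(k)}, B_k) & \cdots & (a_{1m}^{(k)}, B_k) \\
(a_{21}^{(k)}, B_k) & & & \vdots \\
\vdots & & & \\
(a_{m1}^{(k)}, B_k) & \cdots & & (a_{mm}^{(k)}, B_k)
\end{bmatrix}
\]

where $(a_{ij}^{(k)}, B_k)$ denotes an n × n array whose r, sth entry is
$(a_{ij}^{(k)}, b_{rs}^{(k)})$ for r, s = 1, 2, ..., n.
In other words, $C_k$ is obtained from $A_k$ by replacing each entry
a of $A_k$ by the n × n array $(a, B_k)$, where

\[
(a, B_k) = \begin{bmatrix}
(a, b_{11}^{(k)}) & (a, b_{12}^{(k)}) & \cdots & (a, b_{1n}^{(k)}) \\
(a, b_{21}^{(k)}) & & & \vdots \\
\vdots & & & \\
(a, b_{n1}^{(k)}) & \cdots & & (a, b_{nn}^{(k)})
\end{bmatrix}.
\]

It is a straightforward exercise to verify that $C_1$ and $C_2$ are Latin
squares and that they are mutually orthogonal.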
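The ‘straightforward exercise’ can at least be confirmed on an instance. The sketch below implements the product construction and applies it to a pair of MOLS of order 3 and a pair of order 4 (the squares of the example that follows, as far as they can be read from the display); the function names are illustrative.

```python
def product_square(A, B):
    """Replace each entry a of A by the block (a, B); entries are pairs (a, b)."""
    m, n = len(A), len(B)
    return [[(A[i][j], B[r][s]) for j in range(m) for s in range(n)]
            for i in range(m) for r in range(n)]

def is_latin(S):
    q, symbols = len(S), set(S[0])
    return (len(symbols) == q
            and all(set(row) == symbols for row in S)
            and all({row[j] for row in S} == symbols for j in range(q)))

def is_orthogonal(A, B):
    q = len(A)
    return len({(A[i][j], B[i][j]) for i in range(q) for j in range(q)}) == q * q

A1 = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
A2 = [[0, 1, 2], [2, 0, 1], [1, 2, 0]]
B1 = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]
B2 = [[0, 1, 2, 3], [2, 3, 0, 1], [3, 2, 1, 0], [1, 0, 3, 2]]

C1, C2 = product_square(A1, B1), product_square(A2, B2)
print(len(C1), is_latin(C1), is_latin(C2), is_orthogonal(C1, C2))
```

Because the output squares are ordinary lists of rows, `product_square` can be applied again to its own results, which is the repeated application used later in the chapter.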

Example 10.11

          0 1 2               0 1 2
    A_1 = 1 2 0   and   A_2 = 2 0 1     is a pair of MOLS of order 3.
          2 0 1               1 2 0

          0 1 2 3               0 1 2 3
    B_1 = 1 0 3 2   and   B_2 = 2 3 0 1     is a pair of MOLS of order 4.
          2 3 0 1               3 2 1 0
          3 2 1 0               1 0 3 2

The construction of Theorem 10.10 gives the following pair of
MOLS of order 12 in which the entries are ordered pairs from
the Cartesian product $F_3 \times F_4$ = {00, 01, 02, 03, 10, 11, 12, 13, 20,
21, 22, 23}. (We could relabel these elements as the integers
1, 2, ..., 12 if we wished).

          00 01 02 03 | 10 11 12 13 | 20 21 22 23
          01 00 03 02 |             |
          02 03 00 01 |             |
          03 02 01 00 |             |
    C_1 = 10 11 12 13 | 20 ...      | 00 ...
          11 ...      |             |
          ...         |             |
          20 ...      | 00 ...      | 10 ...

          00 01 02 03 | 10 ...      | 20 ...
          02 03 00 01 |             |
          03 02 01 00 |             |
          01 00 03 02 |             |
    C_2 = 20 ...      | 00 ...      | 10 ...
          ...         |             |
          10 ...      | 20 ...      | 00 ...

It should be clear to the reader how to complete the remaining
entries in the above squares.
The construction of Theorem 10.10 can be repeated any
number of times. For example, we can get a pair of MOLS of
order 60 by taking the pair of MOLS of order 12 constructed in
Example 10.11 together with a pair of MOLS of order 5 as given
by Theorem 10.8. The following result tells us precisely for which
values of q a pair of MOLS of order q can be constructed by this
method.

Theorem 10.12 If $q \equiv 0$, 1 or 3 (mod 4), then there exists a pair
of MOLS of order q.

Proof Suppose $q \equiv 0$, 1 or 3 (mod 4). Then q is either odd or is
divisible by 4. Hence, if $q = p_1^{h_1} p_2^{h_2} \cdots p_t^{h_t}$ is the prime factorization
of q, where $p_1, p_2, \ldots, p_t$ are distinct primes and
$h_1, h_2, \ldots, h_t$ are positive integers, then $p_i^{h_i} \ge 3$ for each i. Thus,
by Theorem 10.8, there exists a pair of MOLS of order $p_i^{h_i}$ for
each i. Repeated application of Theorem 10.10 then gives a pair
of MOLS of order $p_1^{h_1} p_2^{h_2} \cdots p_t^{h_t} = q$.
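The arithmetic behind this proof, that every prime-power factor is at least 3 exactly when q is not congruent to 2 modulo 4, can be checked over a range of values. The following is a small sketch; `prime_power_factors` and `constructible` are hypothetical helper names.

```python
def prime_power_factors(q):
    """Prime-power factors p^h of q, e.g. 60 -> [4, 3, 5]."""
    factors, p = [], 2
    while p * p <= q:
        if q % p == 0:
            pk = 1
            while q % p == 0:
                q //= p
                pk *= p
            factors.append(pk)
        p += 1
    if q > 1:
        factors.append(q)
    return factors

def constructible(q):
    """True if Theorems 10.8 and 10.10 together give a pair of MOLS of order q."""
    return all(pk >= 3 for pk in prime_power_factors(q))

print(prime_power_factors(60), constructible(60))
print([q for q in range(2, 31) if not constructible(q)])
```

The second line of output lists exactly the orders 2, 6, 10, ... that the method leaves open.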

Theorem 10.12 leaves cases $q \equiv 2$ (mod 4), i.e. q = 2, 6,
10, 14, ..., unresolved. It was shown in Example 10.4 that there
does not exist a pair of MOLS of order 2. A pair of MOLS of
order 6 is equivalent to a solution of Euler’s ‘36 officers
problem’. As we saw in Chapter 9, Euler’s conjecture that no
such pair exists was proved by Tarry. Euler conjectured further
that there does not exist a pair of MOLS of order q for any
$q \equiv 2$ (mod 4). For such $q \ge 10$, he could not have been further
from the truth, though it was not until 1960 that his conjecture
was finally disposed of in the following result.

Theorem 10.13 (Bose, Shrikhande and Parker (1960)). There
exists a pair of MOLS of order q for all q except q = 2 and q = 6.

The proof of Theorem 10.13 for cases $q \equiv 2$ (mod 4) is rather
complicated and is omitted here.

Corollary 10.14 $A_q(4, 3) = q^2$ for all $q \ne 2, 6$.

Proof This is immediate from Theorems 10.5, 10.7, and 10.13.

Finally we find the values of $A_q(4, 3)$ for q = 2 and q = 6. It is
a very easy exercise to show that $A_2(4, 3) = 2$ (see Exercise 2.1),
while the following gives the value of $A_6(4, 3)$.

Theorem 10.15 $A_6(4, 3) = 34$.

Proof The arrays

        1 2 3 4 5 6             1 2 3 4 5 6
        2 1 4 3 6 5             3 4 5 6 1 2
    A = 3 4 6 5 1 2   and   B = 2 1 4 3 6 5
        4 3 5 6 2 1             6 5 1 2 4 3
        5 6 2 1 4 3             4 3 6 5 2 1
        6 5 1 2 3 4             5 6 2 1 3 4

form a pair of Latin squares which are as close to being
orthogonal as is possible. They fail only in that $(a_{65}, b_{65}) =
(a_{13}, b_{13})$ and $(a_{66}, b_{66}) = (a_{14}, b_{14})$. Thus the code
\[
\{(i, j, a_{ij}, b_{ij}) \mid (i, j) \in (F_6)^2,\ (i, j) \ne (6, 5) \text{ or } (6, 6)\}
\]
is a (4, 34, 3)-code.
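The claim that these two squares miss orthogonality by exactly two cells, and that deleting those cells leaves a code of distance 3, can be checked mechanically. The arrays in this sketch are the two squares as read from the display above (an assumption, since the printed layout is hard to read); the variable names are illustrative.

```python
from itertools import combinations

A = [[1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 5], [3, 4, 6, 5, 1, 2],
     [4, 3, 5, 6, 2, 1], [5, 6, 2, 1, 4, 3], [6, 5, 1, 2, 3, 4]]
B = [[1, 2, 3, 4, 5, 6], [3, 4, 5, 6, 1, 2], [2, 1, 4, 3, 6, 5],
     [6, 5, 1, 2, 4, 3], [4, 3, 6, 5, 2, 1], [5, 6, 2, 1, 3, 4]]

def min_distance(code):
    return min(sum(a != b for a, b in zip(x, y))
               for x, y in combinations(code, 2))

# All 36 words (i, j, a_ij, b_ij), with rows and columns numbered from 1.
full = [(i + 1, j + 1, A[i][j], B[i][j]) for i in range(6) for j in range(6)]
# Drop the two positions (6, 5) and (6, 6) where orthogonality fails.
code = [w for w in full if (w[0], w[1]) not in {(6, 5), (6, 6)}]

distinct_pairs = len({(w[2], w[3]) for w in full})
print(distinct_pairs, len(code), min_distance(code), min_distance(full))
```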


Now if there existed a (4, 35, 3)-code C over $F_6$, then C would
have the form
\[
\{(i, j, a_{ij}, b_{ij}) \mid (i, j) \in (F_6)^2,\ (i, j) \ne (i_0, j_0)\}
\]
for some $(i_0, j_0)$. After a little thought, the reader should be able
to show that the partial 6 × 6 arrays $A = [a_{ij}]$ and $B = [b_{ij}]$, each
having the $(i_0, j_0)$th entry missing, can be completed to Latin
squares which must be mutually orthogonal. This contradicts
Tarry’s non-existence result.
Summarizing our results concerning $A_q(4, 3)$, we have

Theorem 10.16 $A_q(4, 3) = q^2$, for all $q \ne 2, 6$,
$A_2(4, 3) = 2$,
$A_6(4, 3) = 34$.
Remark We now see why the non-existence of a perfect q-ary
$(q + 1, q^{q-1}, 3)$-code cannot be proved by using the method of
proof of Theorem 9.12 except when q = 6.

In the remainder of this chapter, we generalize some of the


earlier results. First we give a generalization of the bound of
Theorem 10.5, due to Singleton (1964).

Theorem 10.17 (The Singleton bound)
\[
A_q(n, d) \le q^{n-d+1}.
\]

Proof Suppose C is a q-ary (n, M, d)-code. As in the proof of
Theorem 10.5, if we delete the last d - 1 coordinates from each
codeword (i.e. puncture C d - 1 times), then the M vectors of
length n - d + 1 so obtained must be distinct and so $M \le q^{n-d+1}$.
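For length 4 and distance 3, the Singleton bound and the sphere-packing bound can be compared directly. The sketch below does so, with the sphere-packing bound taken in its usual form $q^n / \sum_{i \le t} \binom{n}{i}(q-1)^i$ and rounded down; both function names are illustrative.

```python
from math import comb

def singleton_bound(q, n, d):
    """Upper bound q^(n-d+1) on A_q(n, d)."""
    return q ** (n - d + 1)

def sphere_packing_bound(q, n, d):
    """Upper bound on A_q(n, d) from disjoint spheres of radius t = (d-1)//2."""
    t = (d - 1) // 2
    volume = sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))
    return q ** n // volume

for q in range(2, 9):
    print(q, singleton_bound(q, 4, 3), sphere_packing_bound(q, 4, 3))
```

For q = 2 the sphere-packing bound is the smaller of the two, at q = 3 they coincide, and from q = 4 onwards the Singleton bound is the stronger one, which is the observation made in the Remark after Example 10.6.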

Sets of t mutually orthogonal Latin squares


Definition A set $\{A_1, A_2, \ldots, A_t\}$ of Latin squares of order q is
called a set of mutually orthogonal Latin squares (MOLS) if each
pair $\{A_i, A_j\}$ is a pair of MOLS, for $1 \le i < j \le t$.

Theorem 10.18 There are at most q - 1 Latin squares in any set
of MOLS of order q.

Proof Suppose $A_1, A_2, \ldots, A_t$ is a set of t MOLS of order q.
The orthogonality of two Latin squares is not violated if the
elements in any square are relabelled. So we can relabel the
elements of each square so that the first row of each $A_i$ is
12···q. Now consider the t entries appearing in the (2, 1)th
position of the t Latin squares. None of these entries can be a 1,
since 1 already appears in the first column of each $A_i$. Also, no
two of these entries can be the same, because for any two of the
$A_i$, the pairs (1, 1), (2, 2), ..., (q, q) already appear in the first
row of the corresponding superimposed matrix. Hence we must
have $t \le q - 1$.

Definition If a set of q - 1 MOLS of order q exists, it is called a
complete set of MOLS of order q.

Theorem 10.19 If q is a prime power, then there exists a
complete set of q - 1 MOLS of order q.

Proof Consider the field $GF(q) = \{\lambda_0, \lambda_1, \ldots, \lambda_{q-1}\}$ where
$\lambda_0 = 0$. Let $A_1, A_2, \ldots, A_{q-1}$ be q × q arrays, with rows and
columns indexed by 0, 1, ..., q - 1, in which the (i, j)th entry of
$A_k$ is the element of GF(q) defined by
\[
a_{ij}^{(k)} = \lambda_i + \lambda_k \lambda_j.
\]
It follows exactly as in the proof of Theorem 10.8, that
$A_1, A_2, \ldots, A_{q-1}$ form a set of MOLS of order q.
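For prime q the field elements can be taken to be the integers modulo q, and the complete set is then easy to generate and test. The following is a minimal sketch under that assumption (prime q only, not general prime powers); `complete_mols` and the checking helpers are illustrative names.

```python
from itertools import combinations

def complete_mols(p):
    """Squares A_k with (i, j)th entry i + k*j modulo p, for k = 1, ..., p-1.
    Assumes p is prime."""
    return [[[(i + k * j) % p for j in range(p)] for i in range(p)]
            for k in range(1, p)]

def is_latin(S):
    q, symbols = len(S), set(S[0])
    return (len(symbols) == q
            and all(set(row) == symbols for row in S)
            and all({row[j] for row in S} == symbols for j in range(q)))

def is_orthogonal(A, B):
    q = len(A)
    return len({(A[i][j], B[i][j]) for i in range(q) for j in range(q)}) == q * q

squares = complete_mols(5)
ok = (all(is_latin(S) for S in squares)
      and all(is_orthogonal(A, B) for A, B in combinations(squares, 2)))
print(len(squares), ok)
```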

Remark It is not known whether there exist any complete sets


of MOLS of order q when q is not a prime power. Surprisingly, a
complete set of MOLS of order $q \ge 3$ is equivalent to a
projective plane of order q (see e.g. Ryser (1963), p. 92 for a
proof of this). Thus one approach towards finding a projective
plane of order 10 (the lowest-order unsolved case, as mentioned
in Chapter 2) is to try to find a set of 9 MOLS of order 10.
However, no-one has yet succeeded in finding even a set of 3
MOLS of order 10.

Theorem 10.20 A q-ary $(n, q^2, n - 1)$-code is equivalent to a set
of n - 2 MOLS of order q.

Proof As in Theorem 10.7, an $(n, q^2, n - 1)$-code C over $F_q$ has
the form
\[
\{(i, j, a_{ij}^{(1)}, a_{ij}^{(2)}, \ldots, a_{ij}^{(n-2)}) \mid (i, j) \in (F_q)^2\}.
\]
It is left as an exercise for the reader to show that d(C) = n - 1 if
and only if $A_1, A_2, \ldots, A_{n-2}$, where $A_k = [a_{ij}^{(k)}]$, form a set of
MOLS of order q.
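One direction of this equivalence can be illustrated by building the code from a known set of MOLS. The sketch below uses the four squares with entries i + kj modulo 5 (an assumption that relies on 5 being prime) and checks the parameters of the resulting code; `code_from_mols` is an illustrative name.

```python
from itertools import combinations

def code_from_mols(squares):
    """Codewords (i, j, a^(1)_ij, ..., a^(n-2)_ij) as in Theorem 10.20."""
    q = len(squares[0])
    return [(i, j) + tuple(S[i][j] for S in squares)
            for i in range(q) for j in range(q)]

q = 5
# Four squares of order 5: entry (i, j) of the kth square is i + k*j mod 5.
squares = [[[(i + k * j) % q for j in range(q)] for i in range(q)]
           for k in range(1, q)]

code = code_from_mols(squares)
n = len(code[0])
d = min(sum(a != b for a, b in zip(x, y)) for x, y in combinations(code, 2))
print(n, len(code), d)
```

The parameters obtained are those of a $(q + 1, q^2, q)$-code with q = 5, which also meets the Singleton bound with equality.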

Corollary 10.21 $A_q(3, 2) = q^2$ for all q.

Proof A $(3, q^2, 2)$-code is equivalent to a single Latin square of
order q, which exists by Theorem 10.2. The Singleton bound
shows that such a code is optimal.

Corollary 10.22 If q is a prime power and $n \le q + 1$, then
\[
A_q(n, n - 1) = q^2.
\]

Proof This is immediate from Theorems 10.17, 10.19 and 10.20.

For other connections between Latin squares and error-


correcting codes, see Dénes and Keedwell (1974).

Exercises 10

10.1 Construct a pair of orthogonal Latin squares of order 7.


10.2 Use a pair of MOLS of order 3 and the construction of
Theorem 10.10 to construct a pair of MOLS of order 9.
10.3 Using the field GF(4) as defined in Example 3.6(3),
construct a set of three MOLS of order 4.
10.4 Show that the dual of the Hamming code Ham (2, q) is a
$(q + 1, q^2, q)$-code. List the codewords of $(\mathrm{Ham}\,(2, 5))^{\perp}$
and hence construct a set of four MOLS of order 5.
10.5 Define f(q) to be the largest number of Latin squares in a
set of MOLS of order q. On the basis of results stated in
this chapter, write down all the information you can
about the values of f(n) for $3 \le n \le 20$; i.e. give values of
f(n) where known, otherwise give the best upper and
lower bounds you can.
10.6 Show that $A_{20}(5, 4) = 400$.
11 A double-error-correcting decimal code and an introduction to BCH codes

In Chapter 3 we met the ISBN code, which is a single-error-


detecting decimal code of length 10. Then in Example 7.12 we
constructed a single-error-correcting decimal code of length 10.
Our first task in this chapter will be to construct a double-error-
correcting decimal code of length 10 and to determine an efficient
algorithm for decoding it. As before, the code will really be a
linear code defined over GF(11).
We shall then generalize this construction to a family of
t-error-correcting codes defined over finite fields GF(q), where
2t+1<q. These codes are particular examples of BCH codes
(BCH codes were discovered independently by Hocquenghem
(1959) and by Bose and Ray-Chaudhuri (1960)) or Reed-
Solomon codes.
We shall see that the decoding of these codes depends on
solving a certain system of simultaneous non-linear equations, for
which coding theorists have devised some clever methods of
solution. Surprisingly, such a system of equations was first solved
by Ramanujan (1912) in a seemingly little-known paper in the
Journal of the Indian Mathematical Society. We shall present
here a decoding algorithm based on Ramanujan’s method, which
is easy to understand and makes use of the method of partial
fractions which the reader will very likely have met.

Historical Remark In 1970, N. Levinson wrote an expository


article entitled ‘Coding Theory—a counterexample to G. H.
Hardy’s conception of applied mathematics’, in which he showed
how theorems from number theory play a central role in coding
theory, contrary to Hardy’s (1940) view that number theory
could not have any useful application. It is of particular interest,
therefore, to see a result of Hardy’s great protégé, Ramanujan,
also finding an application in coding theory. Incidentally, perhaps
contrary to popular belief, Ramanujan was not completely
unknown before his discovery by Hardy. He had already
published three papers, all in the above-mentioned journal,
before he first wrote to Hardy in January, 1913. It is the third of
these papers, published in 1912, which is of interest to us here. It
was just two pages long and gave neither references nor any
motivation for solving the given system of equations.

Some preliminary results from linear algebra

We shall construct a code of specified minimum distance d by


constructing a parity-check matrix H having the property that
any d—1 columns of H are linearly independent (see Theorem
8.4). The following well-known result concerning the deter-
minant of a Vandermonde matrix enables us to make this
construction in a natural way. The determinant of a matrix A will
be denoted by det A.

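Theorem 11.1 Suppose $a_1, a_2, \ldots, a_r$ are distinct non-zero
elements of a field. Then the so-called Vandermonde matrix

\[
A = \begin{bmatrix}
1 & 1 & \cdots & 1 \\
a_1 & a_2 & \cdots & a_r \\
a_1^2 & a_2^2 & \cdots & a_r^2 \\
\vdots & \vdots & & \vdots \\
a_1^{r-1} & a_2^{r-1} & \cdots & a_r^{r-1}
\end{bmatrix}
\]

has a non-zero determinant.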


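Proof By subtracting $a_1 \times$ row $i$ from row $(i+1)$ for $i = 1$ to
$r-1$, we have
\[
\det A = \det \begin{bmatrix}
1 & 1 & \cdots & 1 \\
0 & a_2 - a_1 & \cdots & a_r - a_1 \\
0 & a_2(a_2 - a_1) & \cdots & a_r(a_r - a_1) \\
0 & a_2^2(a_2 - a_1) & \cdots & a_r^2(a_r - a_1) \\
\vdots & \vdots & & \vdots \\
0 & a_2^{r-2}(a_2 - a_1) & \cdots & a_r^{r-2}(a_r - a_1)
\end{bmatrix}
\]
\[
= (a_2 - a_1)(a_3 - a_1)\cdots(a_r - a_1)
\det \begin{bmatrix}
1 & 1 & \cdots & 1 \\
a_2 & a_3 & \cdots & a_r \\
a_2^2 & a_3^2 & \cdots & a_r^2 \\
\vdots & \vdots & & \vdots \\
a_2^{r-2} & a_3^{r-2} & \cdots & a_r^{r-2}
\end{bmatrix}.
\]
The matrix in this last expression is again of Vandermonde type
and if we similarly subtract $a_2 \times$ row $i$ from row $(i+1)$ for $i = 1$ to
$r-2$, and then take out factors, we get
\[
\det A = (a_2 - a_1)(a_3 - a_1)\cdots(a_r - a_1)\,(a_3 - a_2)\cdots(a_r - a_2)
\times \det \begin{bmatrix}
1 & 1 & \cdots & 1 \\
a_3 & a_4 & \cdots & a_r \\
\vdots & \vdots & & \vdots \\
a_3^{r-3} & a_4^{r-3} & \cdots & a_r^{r-3}
\end{bmatrix}.
\]
Repeating the process until the determinant becomes unity,
\[
\begin{aligned}
\det A &= (a_2 - a_1)(a_3 - a_1)\cdots(a_r - a_1) \\
&\quad \times (a_3 - a_2)\cdots(a_r - a_2) \\
&\quad \;\;\vdots \\
&\quad \times (a_r - a_{r-1}) \times 1 \\
&= \prod_{i > j} (a_i - a_j).
\end{aligned}
\]
Hence det A is non-zero, since the $a_i$ are distinct non-zero
elements of a field. [Remark: the reader who is familiar with the
method of proof by induction should be able to shorten the
length of the above proof.]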
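The product formula obtained in the proof can be checked numerically. The following is a minimal sketch, assuming a prime field GF(p) represented by the integers modulo p; the function names `det_mod_p`, `vandermonde` and `product_formula` are illustrative and do not come from the text.

```python
from itertools import combinations


def det_mod_p(matrix, p):
    """Determinant over GF(p), p prime, by Gaussian elimination."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero entry.
        pivot = next((r for r in range(col, n) if m[r][col] % p), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap changes the sign
        det = det * m[col][col] % p
        inv = pow(m[col][col], -1, p)  # modular inverse (Python 3.8+)
        for r in range(col + 1, n):
            factor = m[r][col] * inv % p
            for c in range(col, n):
                m[r][c] = (m[r][c] - factor * m[col][c]) % p
    return det % p


def vandermonde(elements, p):
    """Row k holds the k-th powers of the elements, k = 0, ..., r-1."""
    r = len(elements)
    return [[pow(a, k, p) for a in elements] for k in range(r)]


def product_formula(elements, p):
    """Product of (a_i - a_j) over all pairs with i > j, modulo p."""
    result = 1
    for j, i in combinations(range(len(elements)), 2):  # here j < i
        result = result * (elements[i] - elements[j]) % p
    return result
```

For example, `det_mod_p(vandermonde([2, 5, 7, 10], 11), 11)` and `product_formula([2, 5, 7, 10], 11)` should agree and be non-zero, as Theorem 11.1 asserts.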
The following is another standard result from linear algebra;
its converse is also true, but will not be needed.

Theorem 11.2 If A is an $r \times r$ matrix having a non-zero
determinant, then the r columns of A are linearly independent.

Proof Suppose A is an $r \times r$ matrix such that $\det A \neq 0$.
Suppose, for a contradiction, that the columns $c_1, c_2, \ldots, c_r$ of A
are linearly dependent. Then some column of A can be expressed
as a linear combination of the other columns, say
\[
c_j = \sum_{\substack{i=1 \\ i \neq j}}^{r} a_i c_i .
\]
Then replacing column $c_j$ by $c_j - \sum_{i \neq j} a_i c_i$ gives a matrix B whose
determinant is equal to that of A and which also has an all-zero
column. Thus det A = det B = 0, giving the desired contradiction.
A double-error-correcting modulus 11 code
We are now ready to construct our double-error-correcting
decimal code. The code will consist of those codewords of the
single-error-correcting code of Example 7.12 which satisfy two
further parity-check equations. A similar code was considered by
Brown (1974).

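Example 11.3 Let C be the linear [10, 6]-code over GF(11)
defined to have parity-check matrix
\[
H = \begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & 2 & 3 & \cdots & 10 \\
1 & 2^2 & 3^2 & \cdots & 10^2 \\
1 & 2^3 & 3^3 & \cdots & 10^3
\end{bmatrix}.
\]
As usual, if we desire a decimal code rather than one over
GF(11), we simply omit those codewords containing the symbol
10 so that our decimal code is
\[
D = \Bigl\{ x_1 x_2 \cdots x_{10} \in (F_{10})^{10} :
\sum_{i=1}^{10} x_i \equiv \sum_{i=1}^{10} i x_i \equiv \sum_{i=1}^{10} i^2 x_i
\equiv \sum_{i=1}^{10} i^3 x_i \equiv 0 \pmod{11} \Bigr\},
\]
where $F_{10} = \{0, 1, 2, \ldots, 9\}$.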


Note that any four columns of H form a Vandermonde matrix
and so, by Theorems 11.1 and 11.2, any four columns of H are
linearly independent. Thus, by Theorem 8.4, the code C (and
hence also D) has minimum distance 5 and so is a double-error-
correcting code.

Remark The 11-ary code C contains $11^6$ codewords and so is
optimal by the Singleton bound (Theorem 10.17). The decimal
code D does not achieve the Singleton bound of $10^6$ but
nevertheless contains over 680 000 codewords.

We next construct a syndrome decoding scheme which will
correct all double (and single) errors in codewords of C.
Suppose $x = x_1 x_2 \cdots x_{10}$ is the transmitted codeword and
$y = y_1 y_2 \cdots y_{10}$ is the received vector. We calculate the syndrome
of y
\[
(S_1, S_2, S_3, S_4) = yH^T
= \Bigl( \sum_{i=1}^{10} y_i,\; \sum_{i=1}^{10} i y_i,\; \sum_{i=1}^{10} i^2 y_i,\; \sum_{i=1}^{10} i^3 y_i \Bigr).
\]
Suppose two errors of magnitudes a and b have occurred in
positions i and j respectively. Then
\begin{align}
a + b &= S_1 \tag{1} \\
ai + bj &= S_2 \tag{2} \\
ai^2 + bj^2 &= S_3 \tag{3} \\
ai^3 + bj^3 &= S_4. \tag{4}
\end{align}
We are required to solve these four equations for the four
unknowns a, b, i, j and at first sight this looks rather difficult as
the equations are non-linear. However, we can eliminate a, b
and j as follows.
\begin{align}
i \times (1) - (2) \text{ gives } \quad b(i - j) &= iS_1 - S_2 \tag{5} \\
i \times (2) - (3) \text{ gives } \quad bj(i - j) &= iS_2 - S_3 \tag{6} \\
i \times (3) - (4) \text{ gives } \quad bj^2(i - j) &= iS_3 - S_4. \tag{7}
\end{align}
Comparing $(6)^2$ with $(5) \times (7)$ now gives
\[
(iS_2 - S_3)^2 = (iS_1 - S_2)(iS_3 - S_4),
\]
which implies that
\[
(S_2^2 - S_1S_3)i^2 + (S_1S_4 - S_2S_3)i + S_3^2 - S_2S_4 = 0. \tag{8}
\]
It is clear that if a, b and i were eliminated from (1) to (4) in
similar fashion, then we would get the same equation (8) with i
replaced by j. Thus the error locations i and j are just the roots of
the quadratic equation (8). Once i and j are found, the values of
a and b are easily obtained from (1) and (2).
Let $P = S_2^2 - S_1S_3$, $Q = S_1S_4 - S_2S_3$ and $R = S_3^2 - S_2S_4$. Note that
if just one error has occurred, say in position i of magnitude a,
then we have
\[
S_1 = a, \quad S_2 = ai, \quad S_3 = ai^2 \quad \text{and} \quad S_4 = ai^3
\]
and so $P = Q = R = 0$.
Thus our decoding algorithm is as follows.
From the received vector y, calculate the syndrome $S(y) = (S_1,
S_2, S_3, S_4)$ and, if this is non-zero, calculate P, Q and R.
(i) If $S(y) = 0$, then y is a codeword and we assume no errors.
(ii) If $S(y) \neq 0$ and $P = Q = R = 0$, then we assume a single
error of magnitude $S_1$ in position $S_2/S_1$.
(iii) If $P \neq 0$ and $R \neq 0$ and if $Q^2 - 4PR$ is a non-zero square in
GF(11), then we assume there are two errors located in
positions i and j of magnitudes a and b respectively, where
\begin{align}
i, j &= \frac{-Q \pm \sqrt{Q^2 - 4PR}}{2P} \tag{9} \\
b &= (iS_1 - S_2)/(i - j) \tag{10} \\
\text{and} \quad a &= S_1 - b. \tag{11}
\end{align}
(iv) If none of (i), (ii) or (iii) applies, then we conclude that at
least three errors have occurred.

Notes (1) It does not matter which way round we take i and j
in (9); we need not insist, for example, that i<j.
(2) As usual, all arithmetic is carried out modulo 11, division
being carried out with the aid of the table of inverses as in
Example 7.12. We need further here a table of square roots
modulo 11. By first calculating the squares of the scalars as
shown below
   x   |  1  2  3  4  5  6  7  8  9  10
   x^2 |  1  4  9  5  3  3  5  9  4   1

we may take the table of square roots to be

   x   |  1  3  4  5  9
   √x  |  1  5  2  4  3

We could equally well use the negative of any of these square
roots; the presence of the ‘±’ in (9) shows that it does not matter
which of the two roots is taken. Note that if, in (9), $Q^2 - 4PR$ is
not a square (i.e. it is one of 2, 6, 7, 8, 10), then at least three
errors must have occurred.
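The decoding algorithm (i) to (iv) can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the names `syndrome` and `decode` are hypothetical, the square-root table is the one given above, and the two guards against a zero divisor or a position 0 in the single-error case are assumptions added here (the text does not discuss those syndromes, which cannot arise from one or two errors).

```python
P_MOD = 11
SQRT = {1: 1, 3: 5, 4: 2, 5: 4, 9: 3}  # square roots modulo 11, as tabulated


def inv(x):
    """Multiplicative inverse modulo 11 (Python 3.8+)."""
    return pow(x % P_MOD, -1, P_MOD)


def syndrome(y):
    """(S1, S2, S3, S4) with S_k = sum of i^(k-1) * y_i, positions i = 1..10."""
    return tuple(
        sum(pow(i, k, P_MOD) * y[i - 1] for i in range(1, 11)) % P_MOD
        for k in range(4)
    )


def decode(y):
    """Return the corrected word, or None if at least three errors are detected."""
    s1, s2, s3, s4 = syndrome(y)
    x = list(y)
    if (s1, s2, s3, s4) == (0, 0, 0, 0):
        return x                                  # case (i): no errors
    p = (s2 * s2 - s1 * s3) % P_MOD
    q = (s1 * s4 - s2 * s3) % P_MOD
    r = (s3 * s3 - s2 * s4) % P_MOD
    if p == q == r == 0:                          # case (ii): single error
        if s1 == 0:
            return None                           # added guard: no valid magnitude
        pos = s2 * inv(s1) % P_MOD
        if pos == 0:
            return None                           # added guard: no position 0
        x[pos - 1] = (x[pos - 1] - s1) % P_MOD
        return x
    disc = (q * q - 4 * p * r) % P_MOD
    if p != 0 and r != 0 and disc in SQRT:        # case (iii): two errors
        root = SQRT[disc]
        i = (-q + root) * inv(2 * p) % P_MOD
        j = (-q - root) * inv(2 * p) % P_MOD
        b = (i * s1 - s2) * inv(i - j) % P_MOD
        a = (s1 - b) % P_MOD
        x[i - 1] = (x[i - 1] - a) % P_MOD
        x[j - 1] = (x[j - 1] - b) % P_MOD
        return x
    return None                                   # case (iv)
```

Since the code is linear, it is enough to try the function on the zero codeword with errors added: errors of magnitude 3 in position 3 and 5 in position 8 give the syndrome (8, 5, 6, 1), so that P = 10, Q = 0, R = 9 and $Q^2 - 4PR = 3$, and (9) returns the positions 3 and 8.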

A class of BCH codes

Let us now consider how the code of Example 11.3 might be
generalized. Generalizing the construction of the code to a
t-error-correcting code of length n over GF(q) is very easy
provided
\[
2t + 1 \le n \le q - 1.
\]
Generalizing the decoding algorithm is less straightforward but
can nevertheless be done in an ingenious way.
The codes defined below belong to the much larger class of
BCH codes. By restricting our attention to these easily defined
codes we can demonstrate in an elementary way the essential
ingredients of the important error-correction procedure for the
more general BCH codes.
We will assume for simplicity that q is a prime number, so that
$GF(q) = \{0, 1, \ldots, q-1\}$, but there is no difficulty whatsoever
in adapting the results to the general prime-power case.
Let C be the code over GF(q) defined to have the parity-check
matrix
\[
H = \begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & 2 & 3 & \cdots & n \\
1 & 2^2 & 3^2 & \cdots & n^2 \\
\vdots & \vdots & \vdots & & \vdots \\
1 & 2^{d-2} & 3^{d-2} & \cdots & n^{d-2}
\end{bmatrix},
\]
where $d \le n \le q - 1$. That is,
\[
C = \Bigl\{ x_1 x_2 \cdots x_n \in V(n, q) : \sum_{i=1}^{n} i^j x_i = 0 \text{ for } j = 0, 1, \ldots, d-2 \Bigr\}.
\]
Any $d-1$ columns of H form a Vandermonde matrix and so
are linearly independent by Theorems 11.1 and 11.2. Hence, by
Theorem 8.4, C has minimum distance d and so is a q-ary
$(n, q^{n-d+1}, d)$-code. Since C meets the Singleton bound
(Theorem 10.17), we have proved

Theorem 11.4 If q is a prime-power and if $d \le n \le q - 1$, then
\[
A_q(n, d) = q^{n-d+1}.
\]

From now on we will assume that d is odd, so that $d = 2t + 1$
and H has 2t rows. Let us try to generalize the decoding
algorithm of Example 11.3.
Suppose the codeword $x = x_1 x_2 \cdots x_n$ is transmitted and that
the vector $y = y_1 y_2 \cdots y_n$ is received in which we assume that at
most t errors have occurred. Suppose the errors have occurred in
positions $X_1, X_2, \ldots, X_t$ with respective magnitudes
$m_1, m_2, \ldots, m_t$ (if $e < t$ errors have occurred, we just assume
that $m_{e+1} = m_{e+2} = \cdots = m_t = 0$). From the received vector y we
calculate the syndrome
\[
(S_1, S_2, \ldots, S_{2t}) = yH^T,
\]
i.e. we calculate
\[
S_j = \sum_{i=1}^{n} y_i\, i^{\,j-1} = \sum_{i=1}^{t} m_i X_i^{\,j-1}
\]
for $j = 1, 2, \ldots, 2t$.
Thus, to find the errors, we must solve for $X_i$ and $m_i$ the
following system of equations
\[
\left.
\begin{aligned}
m_1 + m_2 + \cdots + m_t &= S_1 \\
m_1 X_1 + m_2 X_2 + \cdots + m_t X_t &= S_2 \\
m_1 X_1^2 + m_2 X_2^2 + \cdots + m_t X_t^2 &= S_3 \\
&\;\;\vdots \\
m_1 X_1^{2t-1} + m_2 X_2^{2t-1} + \cdots + m_t X_t^{2t-1} &= S_{2t}
\end{aligned}
\right\} \tag{11.5}
\]
This is precisely the system of equations solved by Ramanujan in
1912 and we follow exactly his method of solution below (for
$t \ge 3$, the equations are too complicated to eliminate $2t - 1$ of the
unknowns as we did for the case t = 2).
Consider the expression
\[
\varphi(\theta) = \frac{m_1}{1 - X_1\theta} + \frac{m_2}{1 - X_2\theta} + \cdots + \frac{m_t}{1 - X_t\theta}. \tag{1}
\]
Now
\[
\frac{m_i}{1 - X_i\theta} = m_i(1 + X_i\theta + X_i^2\theta^2 + \cdots)
\]
and so
\[
\varphi(\theta) = (m_1 + m_2 + \cdots + m_t) + (m_1X_1 + m_2X_2 + \cdots + m_tX_t)\theta
+ (m_1X_1^2 + m_2X_2^2 + \cdots + m_tX_t^2)\theta^2 + \cdots.
\]
By virtue of equations (11.5), we get
\[
\varphi(\theta) = S_1 + S_2\theta + S_3\theta^2 + \cdots + S_{2t}\theta^{2t-1} + \cdots. \tag{2}
\]
Reducing the fractions in (1) to a common denominator, we have
\[
\varphi(\theta) = \frac{A_1 + A_2\theta + A_3\theta^2 + \cdots + A_t\theta^{t-1}}{1 + B_1\theta + B_2\theta^2 + \cdots + B_t\theta^t}. \tag{3}
\]
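Hence
\[
(S_1 + S_2\theta + S_3\theta^2 + \cdots)(1 + B_1\theta + B_2\theta^2 + \cdots + B_t\theta^t)
= A_1 + A_2\theta + A_3\theta^2 + \cdots + A_t\theta^{t-1}.
\]
Equating like powers of $\theta$ we have:
\[
\left.
\begin{aligned}
A_1 &= S_1 \\
A_2 &= S_2 + S_1B_1 \\
A_3 &= S_3 + S_2B_1 + S_1B_2 \\
&\;\;\vdots \\
A_t &= S_t + S_{t-1}B_1 + S_{t-2}B_2 + \cdots + S_1B_{t-1}
\end{aligned}
\right\} \tag{11.6}
\]
\[
\left.
\begin{aligned}
0 &= S_{t+1} + S_tB_1 + S_{t-1}B_2 + \cdots + S_1B_t \\
0 &= S_{t+2} + S_{t+1}B_1 + S_tB_2 + \cdots + S_2B_t \\
&\;\;\vdots \\
0 &= S_{2t} + S_{2t-1}B_1 + S_{2t-2}B_2 + \cdots + S_tB_t.
\end{aligned}
\right\} \tag{11.7}
\]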
Since $S_1, S_2, \ldots, S_{2t}$ are known, the t equations (11.7) enable
us to find $B_1, B_2, \ldots, B_t$, and then $A_1, A_2, \ldots, A_t$ are readily
found from equations (11.6).
Knowing the values of the $A_i$ and $B_i$, we can split the rational
function of (3) into partial fractions to get
\[
\varphi(\theta) = \frac{p_1}{1 - q_1\theta} + \frac{p_2}{1 - q_2\theta} + \cdots + \frac{p_t}{1 - q_t\theta}.
\]
Comparing this with (1), we see that
\[
\begin{aligned}
m_1 &= p_1, & X_1 &= q_1 \\
m_2 &= p_2, & X_2 &= q_2 \\
&\;\;\vdots & &\;\;\vdots \\
m_t &= p_t, & X_t &= q_t
\end{aligned}
\]
and the system (11.5) is solved.

Remark 11.8 The polynomial
\[
\sigma(\theta) = 1 + B_1\theta + B_2\theta^2 + \cdots + B_t\theta^t
= (1 - X_1\theta)(1 - X_2\theta)\cdots(1 - X_t\theta)
\]
is what coding theorists call the error-locator polynomial; its
zeros are the inverses of the error locations $X_1, X_2, \ldots, X_t$. The
polynomial
\[
\omega(\theta) = A_1 + A_2\theta + \cdots + A_t\theta^{t-1}
\]
is what coding theorists call the error-evaluator polynomial.
Once we have found the error locations, we can use the
evaluator polynomial to calculate the error magnitudes.

Let us illustrate the above method by an example.

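Example 11.9 Consider the 3-error-correcting code over
GF(11) with parity-check matrix
\[
H = \begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & 2 & 3 & \cdots & 10 \\
1 & 2^2 & 3^2 & \cdots & 10^2 \\
1 & 2^3 & 3^3 & \cdots & 10^3 \\
1 & 2^4 & 3^4 & \cdots & 10^4 \\
1 & 2^5 & 3^5 & \cdots & 10^5
\end{bmatrix}.
\]
Suppose we have received a vector whose syndrome has been
calculated to be
\[
(S_1, S_2, S_3, S_4, S_5, S_6) = (2, 8, 4, 5, 3, 2).
\]
Assuming at most 3 errors, in positions $X_1, X_2, X_3$ of respective
magnitudes $m_1, m_2, m_3$ we have
\[
\varphi(\theta) = \frac{m_1}{1 - X_1\theta} + \frac{m_2}{1 - X_2\theta} + \frac{m_3}{1 - X_3\theta}
= \frac{A_1 + A_2\theta + A_3\theta^2}{1 + B_1\theta + B_2\theta^2 + B_3\theta^3},
\]
where, by 11.6 and 11.7, the $A_i$ and $B_i$ satisfy
\[
\begin{aligned}
A_1 &= 2 \\
A_2 &= 8 + 2B_1 \\
A_3 &= 4 + 8B_1 + 2B_2 \\
0 &= 5 + 4B_1 + 8B_2 + 2B_3 \\
0 &= 3 + 5B_1 + 4B_2 + 8B_3 \\
0 &= 2 + 3B_1 + 5B_2 + 4B_3.
\end{aligned}
\]
Solving first the last three equations for $B_1$, $B_2$ and $B_3$ gives
$B_1 = 5$, $B_2 = 10$, $B_3 = 8$, $A_1 = 2$, $A_2 = 7$ and $A_3 = 9$. Therefore
\[
\varphi(\theta) = \frac{2 + 7\theta + 9\theta^2}{1 + 5\theta + 10\theta^2 + 8\theta^3}.
\]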

To split this into partial fractions we must factorize the
denominator. Because there is only a finite number of field
elements, the simplest way to find the zeros of the denominator
is by trial and error. In this case we find that the zeros are 4, 5
and 9. The error positions are the inverses of these values, i.e. 3,
9, and 5, and we now have
\[
\varphi(\theta) = \frac{2 + 7\theta + 9\theta^2}{(1 - 3\theta)(1 - 5\theta)(1 - 9\theta)}
= \frac{m_1}{1 - 3\theta} + \frac{m_2}{1 - 5\theta} + \frac{m_3}{1 - 9\theta}. \tag{1}
\]
Now $m_1$ is given by multiplying through by $1 - 3\theta$ and then
putting $3\theta = 1$, i.e. $\theta = 3^{-1} = 4$, to get
\[
m_1 = \frac{2 + 7 \cdot 4 + 9 \cdot 4^2}{(1 - 5 \cdot 4)(1 - 9 \cdot 4)} = 4.
\]
The reader familiar with partial fractions may recognize this
method as a ‘cover-up’ rule. Similarly, $m_2$ is obtained from the
left-hand side of (1) by ‘covering up’ the factor $1 - 5\theta$ and
putting $\theta = 5^{-1} = 9$. This gives $m_2 = 2$ and similarly we get
$m_3 = 7$. Thus the error vector is
0040200070.
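The arithmetic of Example 11.9 is easy to reproduce. The sketch below takes the coefficients $B_i$ and $A_i$ found in the example as given, finds the zeros of the denominator by trial and error, and applies the cover-up rule; the names `poly_eval`, `sigma`, `omega` and `magnitude` are illustrative only.

```python
Q = 11


def poly_eval(coeffs, theta):
    """Evaluate c0 + c1*theta + c2*theta^2 + ... modulo Q."""
    return sum(c * pow(theta, k, Q) for k, c in enumerate(coeffs)) % Q


sigma = [1, 5, 10, 8]  # denominator 1 + 5θ + 10θ² + 8θ³
omega = [2, 7, 9]      # numerator   2 + 7θ + 9θ²

# Trial and error over the non-zero field elements.
zeros = [theta for theta in range(1, Q) if poly_eval(sigma, theta) == 0]
# The error positions are the inverses of the zeros.
locations = [pow(z, -1, Q) for z in zeros]


def magnitude(x_j):
    """Cover-up rule: omit the factor 1 - x_j*θ and put θ = x_j^(-1)."""
    theta = pow(x_j, -1, Q)
    denom = 1
    for x_i in locations:
        if x_i != x_j:
            denom = denom * (1 - x_i * theta) % Q
    return poly_eval(omega, theta) * pow(denom, -1, Q) % Q


error_vector = [0] * 10
for x in locations:
    error_vector[x - 1] = magnitude(x)
```

Running this gives `zeros == [4, 5, 9]` and `error_vector == [0, 0, 4, 0, 2, 0, 0, 0, 7, 0]`, in agreement with the error vector 0040200070 found above.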

Notes (1) If the number of errors which actually have occurred
is e, where $e < t$, then $m_{e+1} = m_{e+2} = \cdots = m_t = 0$ so that
$\varphi(\theta)$ becomes
\[
\frac{A_1 + A_2\theta + \cdots + A_e\theta^{e-1}}{1 + B_1\theta + \cdots + B_e\theta^e}.
\]
We therefore require a solution of equations (11.7) for which
\[
B_{e+1} = B_{e+2} = \cdots = B_t = 0.
\]
It will not be obvious from the received vector, nor from the
syndrome, what the number e of errors is, but if $e < t$, then only
the first e equations of (11.7) will be linearly independent, the
remaining $t - e$ equations being dependent on these. So when
solving the system (11.7) we must find the maximum number e of
linearly independent equations and put $B_{e+1} = B_{e+2} = \cdots = B_t = 0$.
For example, suppose in Example 11.9 that the syndrome has
been found to be (5, 6, 0, 3, 7, 5). Then equations (11.7) become
\[
\begin{aligned}
6B_2 + 5B_3 &= 8 \\
3B_1 + 6B_3 &= 4 \\
7B_1 + 3B_2 &= 6.
\end{aligned}
\]
Eliminating $B_1$ from the last two equations gives
\[
3B_2 + 8B_3 = 4,
\]
which is just a scalar multiple of the first. So we put $B_3 = 0$ and
solve the first two equations for $B_1$ and $B_2$ to get $B_1 = 5$ and
$B_2 = 5$. We then have $A_1 = 5$ and $A_2 = 9$. So
\[
\varphi(\theta) = \frac{5 + 9\theta}{1 + 5\theta + 5\theta^2},
\]
which gives, on splitting into partial fractions,
\[
\varphi(\theta) = \frac{2}{1 - \theta} + \frac{3}{1 - 5\theta}.
\]
Thus we assume that there are just two errors, in position 1 of
magnitude 2, and in position 5 of magnitude 3.
(2) When the error-locator and error-evaluator polynomials
$\sigma(\theta)$ and $\omega(\theta)$ (defined in Remark 11.8) have been found, and
the error locations $X_1, X_2, \ldots, X_e$ determined, then, as we saw in
Example 11.9, the error magnitudes are given by
\[
m_j = \frac{\omega(X_j^{-1})}{\prod_{i \neq j} (1 - X_i X_j^{-1})} \quad \text{for } j = 1, 2, \ldots, e. \tag{11.10}
\]
This is why $\omega(\theta)$ is called the error-evaluator polynomial.
We now summarize the general algorithm.

Outline of the error-correction procedure (assuming $\le t$
errors)
Step 1 Calculate the syndrome $(S_1, S_2, \ldots, S_{2t})$ of the received
vector.
Step 2 Determine the maximum number of equations in system
(11.7) which are linearly independent. This is the number e of
errors which actually occurred.
Step 3 Set $B_{e+1}, B_{e+2}, \ldots, B_t$ all equal to zero and solve the
first e equations of (11.7) for $B_1, B_2, \ldots, B_e$.
Step 4 Find the zeros of the error-locator polynomial
\[
1 + B_1\theta + B_2\theta^2 + \cdots + B_e\theta^e
\]
by substituting each of the non-zero elements of GF(q).
Step 5 Find $A_1, A_2, \ldots, A_e$ from system (11.6) and find each
error magnitude $m_j$ by substituting $X_j^{-1}$ in the error-evaluator
polynomial $A_1 + A_2\theta + \cdots + A_e\theta^{e-1}$ and dividing by the product
of the factors $1 - X_iX_j^{-1}$ for $i = 1, 2, \ldots, e$ with $i \neq j$.
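Steps 2 to 5 can be sketched as follows for a prime q. This is an illustration under stated assumptions rather than the procedure exactly as worded above: instead of counting independent equations in (11.7), it uses the common variant of trying e = t, t − 1, ... until the e × e system $0 = S_{e+k} + S_{e+k-1}B_1 + \cdots + S_kB_e$ ($k = 1, \ldots, e$) has a unique solution. The names `solve_mod`, `poly_eval` and `find_errors` are hypothetical.

```python
def solve_mod(matrix, rhs, q):
    """Solve a square linear system over GF(q), q prime; None if singular."""
    n = len(matrix)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] % q), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, q)
        aug[col] = [v * inv % q for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] % q:
                f = aug[r][col]
                aug[r] = [(v - f * w) % q for v, w in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def poly_eval(coeffs, theta, q):
    """Evaluate c0 + c1*theta + ... modulo q."""
    return sum(c * pow(theta, k, q) for k, c in enumerate(coeffs)) % q


def find_errors(S, q):
    """Map position -> magnitude from S = (S_1, ..., S_2t); None on failure."""
    t = len(S) // 2
    if not any(S):
        return {}
    s = lambda j: S[j - 1]  # 1-based access to S_j
    for e in range(t, 0, -1):
        # Row k: S_{e+k-1} B_1 + ... + S_k B_e = -S_{e+k}, for k = 1..e.
        M = [[s(e + k - l) for l in range(1, e + 1)] for k in range(1, e + 1)]
        rhs = [-s(e + k) % q for k in range(1, e + 1)]
        B = solve_mod(M, rhs, q)
        if B is not None:
            break
    else:
        return None
    sigma = [1] + B
    # (11.6): A_k = S_k + S_{k-1} B_1 + ... + S_1 B_{k-1}.
    omega = [
        (s(k) + sum(s(k - l) * B[l - 1] for l in range(1, k))) % q
        for k in range(1, e + 1)
    ]
    zeros = [th for th in range(1, q) if poly_eval(sigma, th, q) == 0]
    if len(zeros) != e:
        return None  # the locator does not split: more than t errors
    X = [pow(z, -1, q) for z in zeros]
    errors = {}
    for xj in X:
        theta = pow(xj, -1, q)
        denom = 1
        for xi in X:
            if xi != xj:
                denom = denom * (1 - xi * theta) % q
        errors[xj] = poly_eval(omega, theta, q) * pow(denom, -1, q) % q
    return errors
```

With the two syndromes treated in the text, `find_errors((2, 8, 4, 5, 3, 2), 11)` gives errors 4, 2 and 7 in positions 3, 5 and 9, and `find_errors((5, 6, 0, 3, 7, 5), 11)` gives errors 2 and 3 in positions 1 and 5.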

Notes (1) If in Step 3 we solve the system (11.7) by reducing
to upper triangular form, then we can automatically carry out
Step 2 at the same time.
(2) The above procedure is essentially that used by coding
theorists today, although Ramanujan’s consideration of partial
fractions is not used explicitly.
(3) The computations involved in the above scheme may all
be performed very quickly with the exception of Step 3, in which
we are required to solve the matrix equation
\[
\begin{bmatrix}
S_1 & S_2 & S_3 & \cdots & S_e \\
S_2 & S_3 & S_4 & \cdots & S_{e+1} \\
S_3 & & & & \vdots \\
\vdots & & & & \\
S_e & S_{e+1} & S_{e+2} & \cdots & S_{2e-1}
\end{bmatrix}
\begin{bmatrix} B_e \\ B_{e-1} \\ \vdots \\ \\ B_1 \end{bmatrix}
=
\begin{bmatrix} -S_{e+1} \\ -S_{e+2} \\ \vdots \\ \\ -S_{2e} \end{bmatrix}.
\]
For example, if we were to solve the system by inverting the
$e \times e$ matrix, then the number of computations needed would be
proportional to $e^3$. This might be reasonable for small e, but if we
need to correct a large number of errors we require a more
efficient method of solution. Various refinements have been
found which greatly reduce the amount and complexity of
computation.
Note that the $e \times e$ matrix above is not arbitrary in form, but
has the property known as ‘persymmetry’; that is, the entries in
any diagonal perpendicular to the main diagonal are all identical.
Berlekamp (1968) and Massey (1969) were able to use this
additional structure to obtain a method of solving the equations
in a computationally much simpler way. This involved converting
the problem to one involving linear-feedback shift registers;
details may be found in Peterson and Weldon (1972),
MacWilliams and Sloane (1977) or Blahut (1983). An alternative
algorithm (see same references) involves the clever use of the
Euclidean algorithm for polynomials. This algorithm is perhaps
easier to understand than the Berlekamp—Massey algorithm,
though it is thought to be less efficient in practice.
(4) Since we require that $n \le q - 1$ in constructing the above
codes, it may look as though the methods of this chapter have no
applicability to binary codes. However, binary BCH codes
indeed exist and are extremely important. A binary BCH code
may be defined by constructing a certain matrix whose entries
belong to a field of order $2^h$ and then converting this to a
parity-check matrix for a binary code by identifying each element
of $GF(2^h)$ with a binary h-tuple (written as a column vector) in a
natural way. These BCH codes are discussed extensively in
several of the standard texts on coding theory. It is hoped that
for the reader who wishes to study BCH codes further, the above
treatment will facilitate his understanding of the more general
case.

Concluding remarks

(1) Apart from the ISBN code, modulus 11 decimal codes are
now widely used, mainly for error detection rather than correc-
tion. One of the earliest uses was in the allocation of registration
numbers to the entire population of Norway in a scheme devised
by Selmer (cf. 1967). Selmer’s code, defined in Exercise 11.6,
satisfies two parity-check equations and is designed to detect all
single errors and various types of commonly occurring multiple
errors. Before devising his code, in order to ascertain which
psychological errors occurred most frequently, Selmer analysed
the census returns of 1960 for the population of Oslo. In this
census, the public had filled in the date of birth themselves, and
comparison of these entries with those in the public register had
revealed about 8000 inconsistencies, which were on record in
Oslo. Selmer actually received only 7000 of these; the remaining
thousand were people who had also written their name incorrectly
and so belonged to another file!
(2) For a survey of various types of error-detecting decimal
codes, see Verhoeff (1969). This includes, in Chapter 5, the first
example of a pure decimal code which detects all single errors
and all transpositions.
(3) In 1970, Goppa discovered codes which are an important
generalization of BCH codes, and whose decoding can be carried
out in essentially the same way. McEliece (1977) asserts that ‘it is
fairly clear that the deepest and most impressive result in coding
theory is the algebraic decoding of BCH–Goppa codes’. It has
been the aim of this chapter to give the essential flavour of this
result assuming nothing more than standard results from first-
year undergraduate mathematics.

Exercises 11
11.1 Using the code of Example 11.3, decode the received
vector 1204000910.
11.2 Find a generator matrix for the [10, 6]-code of Example
11.3.
11.3. For the code of Example 11.9, find the error vectors
corresponding to the syndromes
(1,7,5,2,3,10) and (9,7,7, 10,8, 3).
11.4 Suppose we wished to give each person in a population of
some 200 000 a personal identity codeword composed of
letters of the English alphabet. Devise a suitable code of
reasonably short length which is double-error-correcting.
11.5 When decoding a BCH code of minimum distance $2t + 1$,
suppose the error locations are found to be
$X_1, X_2, \ldots, X_e$. Show that the error magnitude $m_j$ in
position $X_j$ is given by
\[
m_j = -X_j\,\omega(X_j^{-1})/\sigma'(X_j^{-1}),
\]
where $\omega(\theta)$ is the error-evaluator polynomial and $\sigma'(\theta)$
denotes the derivative of the error-locator polynomial
$\sigma(\theta)$.
11.6 Every person in Norway has an 11-digit decimal registration
number $x_1x_2 \cdots x_{11}$, where $x_1x_2 \cdots x_6$ is the date of
birth, $x_7x_8x_9$ is a personal number and $x_{10}$ and $x_{11}$ are
check digits defined by
\[
x_{10} \equiv -(2x_9 + 5x_8 + 4x_7 + 9x_6 + 8x_5 + x_4 + 6x_3 + 7x_2 + 3x_1) \pmod{11}
\]
and
\[
x_{11} \equiv -(2x_{10} + 3x_9 + 4x_8 + 5x_7 + 6x_6 + 7x_5 + 2x_4 + 3x_3 + 4x_2 + 5x_1) \pmod{11}.
\]
Write down a parity-check matrix for the code (regarded
as a code over GF(11)). If the code is used only for error
detection, will all double errors be detected? If not,
which double errors will fail to be detected?
12 Cyclic codes

Cyclic codes form an important class of codes for several reasons.
From a theoretical point of view they possess a rich algebraic
structure, while practically they can be efficiently implemented
by means of simple devices known as shift registers. Further-
more, many important codes, such as binary Hamming codes,
Golay codes and BCH codes, are equivalent to cyclic codes.

Definition A code C is cyclic if (i) C is a linear code and (ii) any
cyclic shift of a codeword is also a codeword, i.e. whenever
$a_0a_1 \cdots a_{n-1}$ is in C, then so is $a_{n-1}a_0a_1 \cdots a_{n-2}$.

Examples 12.1 (i) The binary code {000, 101, 011, 110} is
cyclic.
(ii) The code of Example 2.23, which we now know as the
Hamming code Ham (3, 2), is cyclic. (Note that each codeword
of the form $a_i$ is the first cyclic shift of its predecessor and so is
each $b_i$.)
(iii) The binary linear code {0000, 1001, 0110, 1111} is not
cyclic, but it is equivalent to a cyclic code; interchanging the third
and fourth coordinates gives the cyclic code {0000, 1010,
0101, 1111}.
(iv) Consider the ternary Hamming code Ham (2, 3) with
generator matrix
\[
\begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 2 \end{bmatrix}.
\]
From the list of codewords found in
Exercise 5.7, we see that the code is not cyclic. But is Ham (2, 3)
equivalent to a cyclic code? The answer will be given in Example
12.13 (see also Exercise 12.22).
When considering cyclic codes we number the coordinate
positions $0, 1, \ldots, n-1$. This is because it is useful to let a
vector $a_0a_1 \cdots a_{n-1}$ in V(n, q) correspond to the polynomial
$a_0 + a_1x + \cdots + a_{n-1}x^{n-1}$.
Polynomials

From now on we will denote the field GF(q) by $F_q$, or simply by
F (with q understood). We denote by F[x] the set of polynomials
in x with coefficients in F. If $f(x) = f_0 + f_1x + \cdots + f_mx^m$ is a
polynomial with $f_m \neq 0$, then m is called the degree of f(x),
denoted deg f(x). (By convention the degree of the zero polynomial
is $-\infty$.) The coefficient $f_m$ is then called the leading
coefficient. A polynomial is called monic if its leading coefficient
is 1.
Polynomials in F[x] can be added, subtracted and multiplied in
the usual way. F[x] is an example of an algebraic structure
known as a ring, for it satisfies the first seven of the eight field
axioms (see Chapter 3). Note that F[x] is not a field since
polynomials of degree greater than zero do not have multiplicative
inverses. Observe also that if $f(x), g(x) \in F[x]$, then
deg (f(x)g(x)) = deg f(x) + deg g(x).

The division algorithm for polynomials

The division algorithm states that, for every pair of polynomials
a(x) and $b(x) \neq 0$ in F[x], there exists a unique pair of polynomials
q(x), the quotient, and r(x), the remainder, such that
\[
a(x) = q(x)b(x) + r(x),
\]
where deg r(x) < deg b(x).
This is analogous to the familiar division algorithm for the ring
Z of integers. The polynomials q(x) and r(x) can be obtained by
ordinary long division of polynomials.
For example, in $F_2[x]$, we can divide $x^3 + x + 1$ by $x^2 + x + 1$ as
follows.

                      x + 1
                  -----------------
    x^2 + x + 1 ) x^3       + x + 1
                  x^3 + x^2 + x
                  -----------------
                        x^2     + 1
                        x^2 + x + 1
                        -----------
                              x

Hence $x^3 + x + 1 = (x + 1)(x^2 + x + 1) + x$ is the desired expression
of $x^3 + x + 1$ as $q(x)(x^2 + x + 1) + r(x)$.
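Long division over a prime field is mechanical enough to sketch in code. The following is a minimal illustration, assuming F = GF(p) with p prime and polynomials stored as coefficient lists with the constant term first; the name `poly_divmod` is hypothetical.

```python
def poly_divmod(a, b, p):
    """Return (quotient, remainder) of a(x) divided by b(x) over GF(p).

    Polynomials are coefficient lists, lowest degree first; the zero
    remainder is returned as the empty list.
    """
    a = [c % p for c in a]
    b = [c % p for c in b]
    while b and b[-1] == 0:  # drop leading zero coefficients of b
        b.pop()
    if not b:
        raise ZeroDivisionError("b(x) must be non-zero")
    inv_lead = pow(b[-1], -1, p)  # inverse of the leading coefficient
    quotient = [0] * max(len(a) - len(b) + 1, 1)
    rem = a[:]
    # Cancel the highest remaining term at each step, as in long division.
    for shift in range(len(a) - len(b), -1, -1):
        factor = rem[shift + len(b) - 1] * inv_lead % p
        quotient[shift] = factor
        for k, c in enumerate(b):
            rem[shift + k] = (rem[shift + k] - factor * c) % p
    while rem and rem[-1] == 0:
        rem.pop()
    return quotient, rem
```

For the division carried out above, `poly_divmod([1, 1, 0, 1], [1, 1, 1], 2)` returns `([1, 1], [0, 1])`, that is, quotient $x + 1$ and remainder $x$.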

The ring of polynomials modulo f(x)

The ring F[x] of polynomials over F is analogous in many ways
to the ring Z of integers. Just as we can consider integers modulo
some fixed integer m to get the ring $Z_m$ (see Chapter 3), we can
consider polynomials in F[x] modulo some fixed polynomial f(x).
Let f(x) be a fixed polynomial in F[x]. Two polynomials g(x)
and h(x) in F[x] are said to be congruent modulo f(x),
symbolized by
\[
g(x) \equiv h(x) \pmod{f(x)},
\]
if g(x) − h(x) is divisible by f(x).
By the division algorithm, any polynomial a(x) in F[x] is
congruent modulo f(x) to a unique polynomial r(x) of degree
less than deg f(x); r(x) is just the principal remainder when a(x)
is divided by f(x).
We denote by F[x]/f(x) the set of polynomials in F[x] of
degree less than deg f(x), with addition and multiplication
carried out modulo f(x) as follows.
Suppose a(x) and b(x) belong to F[x]/f(x). Then the sum
a(x) + b(x) in F[x]/f(x) is the same as the sum in F[x], because
deg (a(x) + b(x)) < deg f(x). The product a(x)b(x) in F[x]/f(x)
is the unique polynomial of degree less than deg f(x) to which
a(x)b(x) (as a product in F[x]) is congruent modulo f(x).
For example, let us calculate $(x + 1)^2$ in $F_2[x]/(x^2 + x + 1)$. We
have
\[
(x + 1)^2 = x^2 + 2x + 1 = x^2 + 1 \equiv x \pmod{x^2 + x + 1}.
\]
Thus $(x + 1)^2 = x$ in $F_2[x]/(x^2 + x + 1)$.
Just as $Z_m$ is a ring, so also is F[x]/f(x); it is called the ring of
polynomials (over F) modulo f(x).
If $f(x) \in F_q[x]$ has degree n, then the ring $F_q[x]/f(x)$ consists of
polynomials of degree $\le n - 1$. Each of the n coefficients of such
a polynomial belongs to $F_q$ and so
\[
|F_q[x]/f(x)| = q^n.
\]

Example 12.2 The addition and multiplication tables for
$F_2[x]/(x^2 + x + 1)$ are easily found to be:

     +   |  0     1     x     1+x
    -----+-------------------------
     0   |  0     1     x     1+x
     1   |  1     0     1+x   x
     x   |  x     1+x   0     1
     1+x |  1+x   x     1     0

     ·   |  0     1     x     1+x
    -----+-------------------------
     0   |  0     0     0     0
     1   |  0     1     x     1+x
     x   |  0     x     1+x   1
     1+x |  0     1+x   1     x

We see that this is more than just a ring. Every non-zero element
has a multiplicative inverse and so $F_2[x]/(x^2 + x + 1)$ is actually a
field. In fact, we have precisely the field of order 4 given in
Example 3.6(3), with x and 1 + x corresponding to a and b
respectively.
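The tables of Example 12.2 can be generated mechanically. In this sketch an element $a_0 + a_1x$ is stored as the pair `(a0, a1)`, and products are reduced using $x^2 = x + 1$; the names `ELEMENTS`, `add`, `mul` and `inverse` are illustrative only.

```python
from itertools import product

# (a0, a1) stands for a0 + a1*x with coefficients in GF(2).
ELEMENTS = list(product(range(2), repeat=2))


def add(u, v):
    """Coefficient-wise addition modulo 2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)


def mul(u, v):
    """Multiply and reduce modulo x^2 + x + 1, using x^2 = x + 1."""
    c0 = u[0] * v[0]
    c1 = u[0] * v[1] + u[1] * v[0]
    c2 = u[1] * v[1]  # coefficient of x^2, folded back into 1 and x
    return ((c0 + c2) % 2, (c1 + c2) % 2)


def inverse(u):
    """Multiplicative inverse of a non-zero element, found by search."""
    return next(v for v in ELEMENTS if mul(u, v) == (1, 0))
```

For instance `mul((0, 1), (1, 1))` is `(1, 0)`, so x and 1 + x are inverses of each other, as the multiplication table shows.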
It is certainly not the case that F[x]/f(x) is a field for any
choice of f(x); consider, for example, the multiplication table of
$F_2[x]/(x^2 + 1)$ (see Exercise 12.2). The special property of f(x)
which makes F[x]/f(x) a field is that of being ‘irreducible’, which
we now define.

Definition A polynomial f(x) in F[x] is said to be reducible if
f(x) = a(x)b(x), where $a(x), b(x) \in F[x]$ and deg a(x) and
deg b(x) are both smaller than deg f(x). If f(x) is not reducible,
it is called irreducible.
Just as any positive integer can be factorized uniquely into a
product of prime numbers, any monic polynomial in F[x] can be
factorized uniquely into a product of irreducible monic
polynomials.
The following simple observations are often useful when
factorizing a polynomial.

Lemma 12.3
(i) A polynomial f(x) has a linear factor x —a if and only if
f(a) =0.
(ii) A polynomial f(x) in F[x] of degree 2 or 3 is irreducible if
and only if $f(a) \neq 0$ for all a in F.
(iii) Over any field, $x^n - 1 = (x - 1)(x^{n-1} + x^{n-2} + \cdots + x + 1)$
(the second factor may well be further reducible).

Proof (i) If f(x) = (x − a)g(x), then certainly f(a) = 0. On the
other hand, suppose f(a) = 0. By the division algorithm, f(x) =
q(x)(x − a) + r(x), where deg r(x) < 1. So r(x) is a constant,
which must be zero since 0 = f(a) = r(a).
(ii) A polynomial of degree 2 or 3 is reducible if and only if it
has at least one linear factor. The result is now immediate from
(i).
(iii) By (i), x − 1 is a factor of $x^n - 1$ and long division of
$x^n - 1$ by x − 1 gives the other factor.

Example 12.4 (i) Factorize $x^3 - 1$ in $F_2[x]$ into irreducible
polynomials.
(ii) Factorize $x^3 - 1$ in $F_3[x]$ into irreducible polynomials.

Solution By 12.3(iii), $x^3 - 1 = (x - 1)(x^2 + x + 1)$ over any field.
(i) By 12.3(ii), $x^2 + x + 1$ is irreducible in $F_2[x]$.
(ii) By 12.3(i), in $F_3[x]$, x − 1 is a factor of $x^2 + x + 1$, and we
get the factorization $x^3 - 1 = (x - 1)^3$.

The finite fields $GF(p^h)$, $h > 1$

The property in F[x] of a polynomial being irreducible corresponds
exactly to the property in Z of a number being prime.
We showed in Theorem 3.5 that the ring $Z_m$ is a field if and only
if m is prime and the following may be proved in exactly the
same way.

Theorem 12.5 The ring F[x]/f(x) is a field if and only if f(x) is
irreducible in F[x].

Proof This is left to Exercise 12.3.

Although we do not show it here, it can be shown that for any
prime number p and for any positive integer h, there exists an
irreducible polynomial over GF(p) of degree h. This result,
together with Theorem 12.5, gives the existence of the fields
$GF(p^h)$ for all integers $h \ge 1$. As we remarked in Theorem 3.2,
these are essentially the only finite fields.

Back to cyclic codes

Returning from our excursion to look at fields of general order,
we now fix $f(x) = x^n - 1$ for the remainder of the chapter, for we
shall soon see that the ring $F[x]/(x^n - 1)$ of polynomials modulo
$x^n - 1$ is the natural one to consider in the context of cyclic
codes. For simplicity we shall write $F[x]/(x^n - 1)$ as $R_n$, where
the field $F = F_q$ will be understood.
Since $x^n \equiv 1 \pmod{x^n - 1}$, we can reduce any polynomial
modulo $x^n - 1$ simply by replacing $x^n$ by 1, $x^{n+1}$ by x, $x^{n+2}$ by $x^2$
and so on. There is no need to write out long divisions by $x^n - 1$.
Let us now identify a vector $a_0a_1 \cdots a_{n-1}$ in V(n, q) with the
polynomial
\[
a(x) = a_0 + a_1x + \cdots + a_{n-1}x^{n-1}
\]
in $R_n$. We shall simultaneously view a code as a subset of V(n, q)
and as a subset of $R_n$. Note that addition of vectors and
multiplication of a vector by a scalar in $R_n$ correspond exactly to
those operations in V(n, q). Now consider what happens when
we multiply the polynomial a(x) by x. In $R_n$, we have
\[
\begin{aligned}
x \cdot a(x) &= a_0x + a_1x^2 + \cdots + a_{n-1}x^n \\
&= a_{n-1} + a_0x + \cdots + a_{n-2}x^{n-1},
\end{aligned}
\]
which is the vector $a_{n-1}a_0 \cdots a_{n-2}$. Thus multiplying by x
corresponds to performing a single cyclic shift. Multiplying by $x^m$
corresponds to a cyclic shift through m positions.
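The correspondence between multiplication by x and a cyclic shift can be seen directly in code. This is a small sketch, assuming q is prime and that an element of $R_n$ is stored as a list of n coefficients, constant term first; the names `mul_in_Rn` and `cyclic_shift` are illustrative.

```python
def mul_in_Rn(a, b, q):
    """Multiply a(x) and b(x) in R_n = F_q[x]/(x^n - 1).

    Both arguments are coefficient lists of length n, lowest degree first.
    Reduction modulo x^n - 1 just means taking exponents modulo n.
    """
    n = len(a)
    result = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = (i + j) % n  # x^(i+j) reduces to x^((i+j) mod n)
            result[k] = (result[k] + ai * bj) % q
    return result


def cyclic_shift(a):
    """a_0 a_1 ... a_{n-1}  ->  a_{n-1} a_0 ... a_{n-2}."""
    return a[-1:] + a[:-1]
```

For example, over GF(3) with n = 5, multiplying `[1, 2, 0, 1, 2]` by the polynomial x, stored as `[0, 1, 0, 0, 0]`, gives `[2, 1, 2, 0, 1]`, which is the single cyclic shift of the original vector.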
The following theorem gives the algebraic characterization of
cyclic codes.

Theorem 12.6 A code C in $R_n$ is a cyclic code if and only if C
satisfies the following two conditions:
(i) $a(x), b(x) \in C \Rightarrow a(x) + b(x) \in C$,
(ii) $a(x) \in C$ and $r(x) \in R_n \Rightarrow r(x)a(x) \in C$.
[Note that (ii) does not just say that C must be closed under
multiplication; it says that C must be closed under multiplication
by any element of $R_n$. The reader who is familiar with ring theory
will recognize that Theorem 12.6 says that cyclic codes are
precisely the ‘ideals’ of the ring $R_n$.]

Proof Suppose C is a cyclic code in $R_n$. Then C is linear and so
(i) holds. Now suppose $a(x) \in C$ and $r(x) = r_0 + r_1x + \cdots +
r_{n-1}x^{n-1} \in R_n$. Since multiplication by x corresponds to a cyclic
shift, we have $x \cdot a(x) \in C$ and then $x \cdot (xa(x)) = x^2a(x) \in C$ and
so on. Hence
\[
r(x)a(x) = r_0a(x) + r_1xa(x) + \cdots + r_{n-1}x^{n-1}a(x)
\]
is also in C since each summand is in C. Thus (ii) also holds.
Now suppose (i) and (ii) hold. Taking r(x) to be a scalar, the
conditions imply that C is linear. Taking r(x) = x in (ii) shows
that C is cyclic.
We now give an easy way of constructing examples of cyclic
codes.
Let f(x) be any polynomial in $R_n$ and let $\langle f(x) \rangle$ denote the
subset of $R_n$ consisting of all multiples of f(x) (reduced modulo
$x^n - 1$), i.e.
\[
\langle f(x) \rangle = \{ r(x)f(x) \mid r(x) \in R_n \}.
\]

Theorem 12.7 For any $f(x) \in R_n$, the set $\langle f(x) \rangle$ is a cyclic code;
it is called the code generated by f(x).

Proof We check conditions (i) and (ii) of Theorem 12.6.
(i) If $a(x)f(x)$ and $b(x)f(x) \in \langle f(x) \rangle$, then
\[
a(x)f(x) + b(x)f(x) = (a(x) + b(x))f(x) \in \langle f(x) \rangle.
\]
(ii) If $a(x)f(x) \in \langle f(x) \rangle$ and $r(x) \in R_n$, then
\[
r(x)(a(x)f(x)) = (r(x)a(x))f(x) \in \langle f(x) \rangle.
\]

Example 12.8 Consider the code $C = \langle 1 + x^2 \rangle$ in $R_3$ (with
F = GF(2)). Multiplying $1 + x^2$ by each of the eight elements of
$R_3$ (and reducing modulo $x^3 - 1$) produces only four distinct
codewords, namely 0, $1 + x$, $1 + x^2$ and $x + x^2$. Thus C is the code
{000, 110, 101, 011} of Example 12.1(i).
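The enumeration in Example 12.8 is small enough to carry out by brute force. The sketch below multiplies f(x) by every element of $R_n$ and collects the distinct results; it assumes q is prime, stores polynomials as coefficient tuples with the constant term first, and uses the illustrative names `mul_in_Rn` and `generated_code`.

```python
from itertools import product


def mul_in_Rn(a, b, q):
    """Multiply two length-n coefficient sequences modulo x^n - 1 over GF(q)."""
    n = len(a)
    result = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = (i + j) % n
            result[k] = (result[k] + ai * bj) % q
    return result


def generated_code(f, q):
    """All multiples r(x) f(x) in R_n, as a set of coefficient tuples."""
    n = len(f)
    return {
        tuple(mul_in_Rn(r, f, q))
        for r in product(range(q), repeat=n)  # every r(x) in R_n
    }
```

Here `generated_code((1, 0, 1), 2)` is the set of four words 000, 110, 101 and 011, and `generated_code((1, 1, 0), 2)` gives the same set, so 1 + x generates the same code.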
We next show that the above easy way of constructing cyclic
codes is essentially the only way, i.e. any cyclic code can be
generated by a polynomial. (In the terminology of ring theory,
this says that every ideal in $R_n$ is a ‘principal ideal’.)

Theorem 12.9 Let C be a non-zero cyclic code in $R_n$. Then
(i) there exists a unique monic polynomial g(x) of smallest
degree in C,
(ii) $C = \langle g(x) \rangle$,
(iii) g(x) is a factor of $x^n - 1$.

Proof (i) Suppose g(x) and h(x) are both monic polynomials
in C of smallest degree. Then $g(x) - h(x) \in C$ and has smaller
degree. This gives a contradiction if $g(x) \neq h(x)$, for then a
suitable scalar multiple of g(x) − h(x) is monic, is in C, and is of
smaller degree than deg g(x).
(ii) Suppose $a(x) \in C$. By the division algorithm for F[x],
a(x) = q(x)g(x) + r(x), where deg r(x) < deg g(x). But r(x) =
a(x) − q(x)g(x) $\in C$, by the properties of a cyclic code given in
Theorem 12.6. By the minimality of deg g(x), we must have
r(x) = 0 and so $a(x) \in \langle g(x) \rangle$.
(iii) By the division algorithm,
\[
x^n - 1 = q(x)g(x) + r(x),
\]
where deg r(x) < deg g(x). But then $r(x) \equiv -q(x)g(x) \pmod{x^n - 1}$,
and so $r(x) \in \langle g(x) \rangle$. By the minimality of deg g(x), we must
have r(x) = 0, which implies that g(x) is a factor of $x^n - 1$.

Definition In a non-zero cyclic code $C$ the monic polynomial of
least degree, given by Theorem 12.9, is called the generator
polynomial of $C$.
Note that a cyclic code $C$ may contain polynomials other than
the generator polynomial which also generate $C$. For example,
the code of Example 12.8 is generated by $1 + x^2$, but its
generator polynomial is $1 + x$.
The third part of Theorem 12.9 gives a recipe for finding all
cyclic codes of given length $n$. All we need is the factorization of
$x^n - 1$ into irreducible monic polynomials.

Example 12.10 We will find all the binary cyclic codes of length
3. By Example 12.4(i), $x^3 - 1 = (x + 1)(x^2 + x + 1)$, where $x + 1$
and $x^2 + x + 1$ are irreducible over $GF(2)$. So, by Theorem 12.9,
the following is a complete list of binary cyclic codes of length 3.

Generator polynomial    Code in $R_3$                         Corresponding code in $V(3, 2)$

$1$                     all of $R_3$                          all of $V(3, 2)$
$x + 1$                 $\{0,\ 1+x,\ x+x^2,\ 1+x^2\}$         {000, 110, 011, 101}
$x^2 + x + 1$           $\{0,\ 1+x+x^2\}$                     {000, 111}
$x^3 - 1 = 0$           $\{0\}$                               {000}
Lemma 12.11 Let $g(x) = g_0 + g_1 x + \cdots + g_r x^r$ be the generator
polynomial of a cyclic code. Then $g_0$ is non-zero.

Proof Suppose $g_0 = 0$. Then $x^{n-1}g(x) = x^{-1}g(x)$ is a codeword
of $C$ of degree $r - 1$, contradicting the minimality of $\deg g(x)$.
By definition, a cyclic code is linear. It would be handy if
immediately from the generator polynomial g(x) we could
deduce the dimension of the code and also write down a
generator matrix. The next theorem shows that we can do both.

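Theorem 12.12 Suppose $C$ is a cyclic code with generator
polynomial
$$g(x) = g_0 + g_1 x + \cdots + g_r x^r$$
of degree $r$. Then $\dim(C) = n - r$ and a generator matrix for $C$ is
$$G = \begin{bmatrix}
g_0 & g_1 & g_2 & \cdots & g_r & 0 & 0 & \cdots & 0 \\
0 & g_0 & g_1 & g_2 & \cdots & g_r & 0 & \cdots & 0 \\
0 & 0 & g_0 & g_1 & g_2 & \cdots & g_r & \cdots & 0 \\
\vdots & & & \ddots & & & & \ddots & \vdots \\
0 & 0 & \cdots & 0 & g_0 & g_1 & g_2 & \cdots & g_r
\end{bmatrix}.$$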

Proof The $n - r$ rows of the above matrix $G$ are certainly
linearly independent because of the echelon of non-zero $g_0$s with
0s below. These $n - r$ rows represent the codewords $g(x)$, $xg(x)$,
$x^2 g(x), \ldots, x^{n-r-1}g(x)$, and it remains only to show that every
codeword in $C$ can be expressed as a linear combination of them.
The proof of Theorem 12.9(ii) shows that if $a(x)$ is a codeword of
$C$, then
$$a(x) = q(x)g(x)$$
for some polynomial $q(x)$, and that this is an equality of
polynomials within $F[x]$, not requiring any reduction modulo
$x^n - 1$. Since $\deg a(x) < n$, it follows that $\deg q(x) < n - r$.
Hence
$$q(x)g(x) = (q_0 + q_1 x + \cdots + q_{n-r-1}x^{n-r-1})g(x)
= q_0 g(x) + q_1 xg(x) + \cdots + q_{n-r-1}x^{n-r-1}g(x),$$
which is the desired linear combination.
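The matrix of Theorem 12.12 is simply the coefficient vector of $g(x)$ written out $n - r$ times, each copy shifted one place to the right. A minimal sketch follows; the function name `generator_matrix` and the two sample polynomials are illustrative choices, not notation from the text.

```python
def generator_matrix(g, n):
    """Rows g(x), x g(x), ..., x^(n-r-1) g(x) as vectors of length n.

    g lists the coefficients g_0, ..., g_r (lowest degree first)."""
    r = len(g) - 1
    return [[0] * i + list(g) + [0] * (n - r - 1 - i) for i in range(n - r)]

G_binary = generator_matrix([1, 1], 3)        # g(x) = 1 + x over GF(2), n = 3
G_ternary = generator_matrix([-1, 0, 1], 4)   # g(x) = x^2 - 1 over GF(3), n = 4
print(G_binary)    # [[1, 1, 0], [0, 1, 1]]
print(G_ternary)   # [[-1, 0, 1, 0], [0, -1, 0, 1]]
```

No reduction modulo $x^n - 1$ is needed here, because the highest shift $x^{n-r-1}g(x)$ still has degree $n - 1$.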
Example 12.13 Find all the ternary cyclic codes of length 4 and
write down a generator matrix for each of them.

Solution Over $GF(3)$, the factorization of $x^4 - 1$ into
irreducible polynomials is
$$x^4 - 1 = (x - 1)(x^3 + x^2 + x + 1) = (x - 1)(x + 1)(x^2 + 1).$$
So there are $2^3 = 8$ divisors of $x^4 - 1$ in $F[x]$, each of which
generates a cyclic code. By Theorem 12.9, these are the only
ternary cyclic codes of length 4. The codes are specified below by
their generator polynomials, and the corresponding generator
matrices are given by Theorem 12.12. Note that neither of the
two-dimensional codes has minimum distance 3 and so the
ternary Hamming [4, 2, 3]-code is not cyclic, thus answering the
question posed in Example 12.1(iv).

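Generator polynomial                          Generator matrix

$1$                                           $[\,I_4\,]$

$x - 1$                                       $\begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{bmatrix}$

$x + 1$                                       $\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}$

$x^2 + 1$                                     $\begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}$

$(x - 1)(x + 1) = x^2 - 1$                    $\begin{bmatrix} -1 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{bmatrix}$

$(x - 1)(x^2 + 1) = x^3 - x^2 + x - 1$        $[\,-1 \;\; 1 \;\; -1 \;\; 1\,]$

$(x + 1)(x^2 + 1) = x^3 + x^2 + x + 1$        $[\,1 \;\; 1 \;\; 1 \;\; 1\,]$

$x^4 - 1 = 0$                                 $[\,0 \;\; 0 \;\; 0 \;\; 0\,]$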
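The remark that neither two-dimensional code has minimum distance 3 can be checked by brute force, since each code has only nine codewords. The sketch below is an illustration only; the helper `min_distance` is a hypothetical name, and it assumes the rows of the matrix are linearly independent over $GF(q)$ with $q$ prime.

```python
from itertools import product

def min_distance(G, q):
    """Minimum non-zero weight over all GF(q)-combinations of the rows of G."""
    n = len(G[0])
    best = n
    for coeffs in product(range(q), repeat=len(G)):
        if any(coeffs):
            word = [sum(c * row[j] for c, row in zip(coeffs, G)) % q
                    for j in range(n)]
            best = min(best, sum(1 for s in word if s))
    return best

d_plus = min_distance([[1, 0, 1, 0], [0, 1, 0, 1]], 3)      # generator x^2 + 1
d_minus = min_distance([[-1, 0, 1, 0], [0, -1, 0, 1]], 3)   # generator x^2 - 1
print(d_plus, d_minus)
```

Both codes contain words of weight 2 (the generator polynomials themselves), so neither can be the [4, 2, 3] Hamming code.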

The check polynomial and the parity-check matrix of a cyclic code

The generator matrix of a cyclic code as given by Theorem 12.12
is not in standard form. Our usual method of writing down a
parity-check matrix from the standard form of $G$ (via Theorem
7.6) is therefore not appropriate for cyclic codes. However, there
is a natural choice of parity-check matrix for a cyclic code. This is
closely related to the so-called 'check polynomial', which we
define first.
Let $C$ be a cyclic $[n, k]$-code with generator polynomial $g(x)$.
By Theorem 12.9, $g(x)$ is a factor of $x^n - 1$ and so
$$x^n - 1 = g(x)h(x),$$
for some polynomial $h(x)$. Since $g(x)$ is monic, so also is $h(x)$.
By Theorem 12.12, $g(x)$ has degree $n - k$ and so $h(x)$ has degree
$k$. This polynomial $h(x)$ is called the check polynomial of $C$. The
reason for this name is apparent from the following theorem.

Theorem 12.14 Suppose $C$ is a cyclic code in $R_n$ with generator
polynomial $g(x)$ and check polynomial $h(x)$. Then an element
$c(x)$ of $R_n$ is a codeword of $C$ if and only if $c(x)h(x) = 0$.

Proof First note that, in $R_n$, $g(x)h(x) = x^n - 1 = 0$.
Hence
$$c(x) \in C \Rightarrow c(x) = a(x)g(x), \text{ for some } a(x) \in R_n,$$
$$\Rightarrow c(x)h(x) = a(x)g(x)h(x) = a(x) \cdot 0 = 0.$$
On the other hand, suppose $c(x)$ satisfies $c(x)h(x) = 0$. By the
division algorithm, $c(x) = q(x)g(x) + r(x)$, where $\deg r(x) < n - k$.
Then $c(x)h(x) = 0$ implies that $r(x)h(x) = 0$, i.e.
$r(x)h(x) \equiv 0 \pmod{x^n - 1}$. But $\deg(r(x)h(x)) < n - k + k = n$,
and so $r(x)h(x) = 0$ in $F[x]$. Hence $r(x) = 0$, and then
$c(x) = q(x)g(x) \in C$.
In view of Theorem 12.14 and the fact that $\dim(\langle h(x) \rangle) =
n - k = \dim(C^\perp)$, we might easily be fooled into thinking that
$h(x)$ generates the dual code $C^\perp$. In general this is not so. The
point is that the product of $c(x)$ and $h(x)$ being zero in $R_n$ is not
the same thing as the corresponding vectors in $V(n, q)$ being
orthogonal. In the next theorem, however, we see that the
condition $c(x)h(x) = 0$ in $R_n$ does imply some useful
orthogonality relations which lead to a natural choice of
parity-check matrix.
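Theorem 12.15 Suppose $C$ is a cyclic $[n, k]$-code with check
polynomial
$$h(x) = h_0 + h_1 x + \cdots + h_k x^k.$$
Then
(i) a parity-check matrix for $C$ is
$$H = \begin{bmatrix}
h_k & h_{k-1} & \cdots & h_0 & 0 & 0 & \cdots & 0 \\
0 & h_k & h_{k-1} & \cdots & h_0 & 0 & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & & \vdots \\
0 & \cdots & 0 & h_k & h_{k-1} & \cdots & & h_0
\end{bmatrix},$$
(ii) $C^\perp$ is a cyclic code generated by the polynomial
$$\bar{h}(x) = h_k + h_{k-1}x + \cdots + h_0 x^k.$$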

Proof (i) By Theorem 12.14, a polynomial $c(x) = c_0 + c_1 x +
\cdots + c_{n-1}x^{n-1}$ is a codeword if and only if $c(x)h(x) = 0$. Now for
$c(x)h(x)$ to be zero, then in particular the coefficients of
$x^k, x^{k+1}, \ldots, x^{n-1}$ must all be zero, i.e.
$$c_0 h_k + c_1 h_{k-1} + \cdots + c_k h_0 = 0$$
$$c_1 h_k + c_2 h_{k-1} + \cdots + c_{k+1} h_0 = 0$$
$$\vdots$$
$$c_{n-k-1} h_k + \cdots + c_{n-1} h_0 = 0.$$
Thus any codeword $c_0 c_1 \cdots c_{n-1}$ of $C$ is orthogonal to the vector
$h_k h_{k-1} \cdots h_0 0 0 \cdots 0$ and to its cyclic shifts. So the rows of the
matrix $H$ given in the statement of the theorem are all codewords
of $C^\perp$. We have already observed that $h(x)$ is monic of degree $k$
and so $h_k = 1$; thus the echelon of 1s with zeros below in $H$
ensures that the rows of $H$ are linearly independent. The number
of rows of $H$ is $n - k$, which is the dimension of $C^\perp$. Hence $H$ is
a generator matrix of $C^\perp$, i.e. a parity-check matrix for $C$.
(ii) If we can show that $\bar{h}(x)$ is a factor of $x^n - 1$, then it will
follow from Theorem 12.12 that $\langle \bar{h}(x) \rangle$ is a cyclic code whose
generator matrix is the above matrix $H$, and hence that
$\langle \bar{h}(x) \rangle = C^\perp$. We observe that $\bar{h}(x) = x^k h(x^{-1})$. Since
$h(x^{-1})g(x^{-1}) = (x^{-1})^n - 1$, we have
$x^k h(x^{-1})\, x^{n-k} g(x^{-1}) = x^n(x^{-n} - 1) = 1 - x^n$,
and so $\bar{h}(x)$ is indeed a factor of $x^n - 1$.
Remarks (i) The polynomial $\bar{h}(x) = x^k h(x^{-1}) = h_k + h_{k-1}x +
\cdots + h_0 x^k$ is called the reciprocal polynomial of $h(x)$; its
coefficients are those of $h(x)$ in reverse order.
(ii) We may regard $\bar{h}(x)$ as the generator polynomial of $C^\perp$,
though strictly speaking, in the non-binary case, one ought to
multiply it by the scalar $h_0^{-1}$ to make it monic.
(iii) The polynomial $h(x^{-1}) = x^{n-k}\bar{h}(x)$ is a member of $C^\perp$.
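As an illustration of Theorem 12.15(i), the sketch below builds $H$ from a check polynomial and confirms that every row of $G$ is orthogonal to every row of $H$. It assumes the binary case $n = 7$, $g(x) = 1 + x + x^3$, for which $h(x) = (x^7 - 1)/g(x) = 1 + x + x^2 + x^4$; the function names are illustrative, not taken from the text.

```python
def generator_matrix(g, n):
    """Shifts of g_0 ... g_r, as in Theorem 12.12."""
    r = len(g) - 1
    return [[0] * i + list(g) + [0] * (n - r - 1 - i) for i in range(n - r)]

def parity_check_matrix(h, n):
    """Shifts of h_k h_(k-1) ... h_0, as in Theorem 12.15(i)."""
    k = len(h) - 1
    rev = list(reversed(h))          # coefficients of the reciprocal polynomial
    return [[0] * i + rev + [0] * (n - k - 1 - i) for i in range(n - k)]

g = [1, 1, 0, 1]         # g(x) = 1 + x + x^3
h = [1, 1, 1, 0, 1]      # h(x) = 1 + x + x^2 + x^4
G = generator_matrix(g, 7)
H = parity_check_matrix(h, 7)

# Inner products of every row of G with every row of H, over GF(2).
products = [sum(a * b for a, b in zip(u, v)) % 2 for u in G for v in H]
print(H)
print(all(p == 0 for p in products))
```

Reversing the coefficients is exactly the passage from $h(x)$ to its reciprocal, which is why the rows of $H$ lie in $C^\perp$ while those of the matrix built from $h(x)$ itself need not.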
We have not yet discussed the minimum distance of cyclic
codes. There are some classes of cyclic codes for which useful
lower bounds on the minimum distance are known. For example,
cyclic BCH codes can be constructed to have ‘designed minimum
distance’ while there are codes called quadratic residue codes
which satisfy a ‘square root bound’. These codes and bounds are
well treated in several of the more advanced texts. We con-
centrate here on finding the minimum distances of two particu-
larly interesting cyclic codes, namely the two Golay codes. Our
methods, while aimed directly at the codes in hand, nevertheless
provide some insights into the more general methods.

The binary Golay code

In Chapter 9, we proved the existence of a perfect binary
[23, 12, 7]-code $G_{23}$ by exhibiting a generator matrix. We now
show that this Golay code can be constructed in a more natural
way as a cyclic code. The only knowledge we shall assume in
advance is the factorization of $x^{23} - 1$ over $GF(2)$. [There is a
clever method of finding the factors of $x^n - 1$ over $GF(q)$ in
general (see, for example, Chapter 7, §5, of MacWilliams and
Sloane (1977)) but we shall not dwell on this here. Alternatively
one may find the factors by consulting tables (see, e.g., the same
reference for a list of factors of $x^n - 1$ over $GF(2)$ for $n \le 63$).]
We begin then with the factorization
$$x^{23} - 1 = (x - 1)(x^{11} + x^{10} + x^6 + x^5 + x^4 + x^2 + 1)
(x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1)$$
$$= (x - 1)g_1(x)g_2(x),$$
say.
Let $C_1$ be the code $\langle g_1(x) \rangle$ and let $C_2$ be the code $\langle g_2(x) \rangle$. By
Theorem 12.12, $C_1$ is a [23, 12]-code. The object of the next few
pages is to show that the minimum distance of $C_1$ is 7.
We observe that the polynomials $g_1(x)$ and $g_2(x)$ are reciprocals
of each other, and so $C_1$ is equivalent to $C_2$. Remarkably, the
knowledge that $x^{23} - 1 = (x - 1)g_1(x)\bar{g}_1(x)$, where $\bar{g}_1(x)$ denotes
the reciprocal of $g_1(x)$, is all we need to show that $d(C_1) = 7$; we
do not actually need to know what $g_1(x)$ is.

Remark 12.16 Although we do not show it here, $x^p - 1$ has a
factorization over $GF(2)$ of the form $(x - 1)g_1(x)g_2(x)$, where
$\langle g_1(x) \rangle$ and $\langle g_2(x) \rangle$ are equivalent codes, whenever $p$ is a prime
number of the form $8m \pm 1$. If $p$ is of the form $8m - 1$ we also
have $g_2(x) = \bar{g}_1(x)$. For example,
$$x^7 - 1 = (x - 1)(x^3 + x + 1)(x^3 + x^2 + 1)$$
and
$$x^{31} - 1 = (x - 1)g(x)\bar{g}(x),$$
where $g(x) = 1 + x^3 + x^8 + x^9 + x^{13} + x^{14} + x^{15}$.
In view of Remark 12.16, we prove the next two lemmas for $p$
equal to a general odd prime number rather than just for $p = 23$.
We will denote the vector $1 + x + x^2 + \cdots + x^{p-1}$ consisting of all
1s by $\mathbf{1}$. Note that if $x^p - 1 = (x - 1)g_1(x)g_2(x)$, then
$g_1(x)g_2(x) = \mathbf{1}$.

Lemma 12.17 Suppose that $x^p - 1 = (x - 1)g_1(x)g_2(x)$ over
$GF(2)$, and that $\langle g_1(x) \rangle$ and $\langle g_2(x) \rangle$ are equivalent codes. Let
$a(x)$ be a codeword of $\langle g_1(x) \rangle$ of odd weight $w$. Then
(i) $w^2 \ge p$,
(ii) if also $g_2(x) = \bar{g}_1(x)$, then $w^2 - w + 1 \ge p$.

Proof (i) Since $\langle g_2(x) \rangle$ is equivalent to $\langle g_1(x) \rangle$, there is some
codeword $b(x)$ in $\langle g_2(x) \rangle$ also of weight $w$. Now $a(x)b(x)$ is a
multiple of $g_1(x)g_2(x) = \mathbf{1}$, and so $a(x)b(x) = 0$ or $\mathbf{1}$. Since $w$ is
odd, we have $a(1)b(1) = w \cdot w \equiv 1 \pmod 2$, and so we must have
$a(x)b(x) = 1 + x + \cdots + x^{p-1}$. But $a(x)b(x)$ has at most $w^2$
non-zero coefficients and so $w^2 \ge p$.
(ii) If $g_2(x) = \bar{g}_1(x)$, then the codewords of $\langle g_2(x) \rangle$ are just
the reciprocals of the codewords of $\langle g_1(x) \rangle$. In particular we may
take $b(x)$ to be $a(x^{-1})$ in the proof of (i) to get
$$a(x)a(x^{-1}) = 1 + x + x^2 + \cdots + x^{p-1}.$$
But $w$ of the $w^2$ terms in the product $a(x)a(x^{-1})$ are 1 and so the
maximum weight of $a(x)a(x^{-1})$ is $w^2 - w + 1$.
Corollary 12.18 If, with the hypotheses of Lemma 12.17, it is
also known that the minimum distance $d$ of $\langle g_1(x) \rangle$ is odd, then
$d$ satisfies the square root bound
$$d \ge \sqrt{p},$$
while if also $g_2(x) = \bar{g}_1(x)$, this can be improved to
$$d^2 - d + 1 \ge p.$$

By Lemma 12.17(ii), our [23, 12]-code $C_1$ has no words of odd
weight less than 7, because $5^2 - 5 + 1 < 23$. There is an ingenious
way of showing that $C_1$, and more generally any so-called
quadratic residue (QR) code (we do not define QR codes here,
but simply remark that $C_1$ is an example of such a code), must
have odd minimum distance and therefore must satisfy the
square root bound. The argument, which involves showing that
an extended QR code has a transitive automorphism group, is
beyond the scope of the present book. As our main aim is merely
to find the minimum distance of the Golay code $C_1$, the following
lemma will suffice.

Lemma 12.19 Suppose $p$ is an odd prime number and that, over
$GF(2)$, $x^p - 1 = (x - 1)g_1(x)\bar{g}_1(x)$. Let $a(x)$ be a codeword of
$\langle g_1(x) \rangle$ of even weight $w$. Then
(i) $w \equiv 0 \pmod 4$,
(ii) $w \ne 4$ unless $p = 7$.

Proof (i) As in the proof of Lemma 12.17, we have
$a(x)a(x^{-1}) = 0$ or $\mathbf{1}$. Since $a(x)$ has even weight, $a(1) = 0$, and so
$a(x)a(x^{-1}) = 0$. Suppose $a(x) = x^{e_1} + x^{e_2} + \cdots + x^{e_w}$. Then
$$a(x)a(x^{-1}) = \sum_{i=1}^{w} \sum_{j=1}^{w} x^{e_i - e_j} = 0$$
in $R_p$. Of the $w^2$ summands, $w$ are equal to 1 (the terms with
$i = j$), and these sum to $0 \pmod 2$. So the remaining $w^2 - w$
terms $x^{e_i - e_j}$ ($i \ne j$) must cancel each other out in pairs. Now if
$x^{e_i - e_j} = x^{e_k - e_l}$ then $x^{e_j - e_i} = x^{e_l - e_k}$, and so the terms must actually
cancel four at a time. Thus
$$w^2 - w \equiv 0 \pmod 4 \quad \text{and so} \quad w \equiv 0 \pmod 4.$$
(ii) Suppose $w = 4$. Without loss of generality (via a suitable
cyclic shift), suppose $a(x) = 1 + x^i + x^j + x^k$, where $i$, $j$, $k$ are
distinct and $1 \le i, j, k < p$. Then
$$(1 + x^i + x^j + x^k)(1 + x^{-i} + x^{-j} + x^{-k}) = 0.$$
Thus the six sets $\{i, -i\}$, $\{j, -j\}$, $\{k, -k\}$, $\{i - j, j - i\}$,
$\{i - k, k - i\}$ and $\{j - k, k - j\}$ must split into three matching
pairs, under congruence modulo $p$. By symmetry there is no loss
in assuming $i$ is congruent to one of $-j$, $j - i$ or $j - k$.
Case 1 Suppose $i \equiv j - k \pmod p$. Then $k \equiv j - i$ gives a second
match and so the third match must be given by $j \equiv \pm(i - k)$. But
$i \equiv j - k$ and $j \equiv i - k$ implies $2k \equiv 0 \pmod p$, which is a
contradiction since $p$ is an odd prime. Likewise, $i \equiv j - k$ and
$j \equiv k - i$ implies $2i \equiv 0 \pmod p$, which is again a contradiction.
Case 2 Suppose $i \equiv -j \pmod p$. Since Case 1 has been ruled
out, we must have $k \equiv i - k$ or $k \equiv j - k$ and as the two
possibilities are essentially the same, we may assume $k \equiv i - k$,
i.e. $i \equiv 2k$. The third match is then given by $i - j \equiv j - k$, which
implies $k \equiv -3i \equiv -6k$. Thus $7k \equiv 0 \pmod p$, which is a
contradiction unless $p = 7$.
Case 3 Suppose $i \equiv j - i \pmod p$. To avoid the cases above, we
may assume the remaining matches are given by $j \equiv k - j$ and
$k \equiv i - k$. But then $k \equiv 2j \equiv 4i \equiv 8k$, again giving $7k \equiv 0 \pmod p$.

Remark We observed in Remark 12.16 that $x^7 - 1$ has the form
$(x - 1)g(x)\bar{g}(x)$, where $g(x) = x^3 + x + 1$. Since $\langle g(x) \rangle$ contains
words of weight 4, the exclusion of case $p = 7$ in Lemma 12.19(ii)
is essential.

We have now reached our goal:

Theorem 12.20 Let $G_{23}$ be the binary cyclic code in $R_{23}$ with
generator polynomial $g(x) = 1 + x^2 + x^4 + x^5 + x^6 + x^{10} + x^{11}$.
Then $G_{23}$ is a perfect [23, 12, 7]-code.

Proof We have already observed that
$$x^{23} - 1 = (x - 1)g(x)\bar{g}(x).$$
By Lemma 12.17, the minimum odd weight $w$ of $G_{23}$ satisfies
$w^2 - w + 1 \ge 23$, which implies that $w \ge 7$. By Lemma 12.19, $G_{23}$
can have no words of even weight $< 8$. As $g(x)$ is a codeword of
weight 7, we have $d(G_{23}) = 7$. Since
$$2^{12}\left(1 + 23 + \binom{23}{2} + \binom{23}{3}\right) = 2^{23},$$
the sphere-packing condition (9.1) is satisfied and so $G_{23}$ is
perfect.
The code $G_{23}$ is called the binary Golay code. It is equivalent
to the Golay code as defined in Chapter 9 (cf. the remarks
following Problem 9.9).
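The conclusion of Theorem 12.20 can also be confirmed by exhaustive search, since the code has only $2^{12} = 4096$ codewords. The following sketch is a numerical check, not a substitute for the proof above; it assumes codewords are stored as Python integers used as bitmasks (bit $i$ standing for $x^i$), a representation chosen here purely for convenience.

```python
# g(x) = 1 + x^2 + x^4 + x^5 + x^6 + x^10 + x^11 as a bitmask (bit i <-> x^i).
g = sum(1 << e for e in (0, 2, 4, 5, 6, 10, 11))

# Rows of the generator matrix of Theorem 12.12: g(x), x g(x), ..., x^11 g(x).
# The largest shift has degree 22, so no reduction modulo x^23 - 1 is needed.
rows = [g << i for i in range(12)]

weights = {}
for m in range(1 << 12):               # every message polynomial of degree < 12
    word = 0
    for i in range(12):
        if (m >> i) & 1:
            word ^= rows[i]            # addition over GF(2) is XOR
    w = bin(word).count("1")
    weights[w] = weights.get(w, 0) + 1

min_weight = min(w for w in weights if w > 0)
sphere = 1 + 23 + 23 * 22 // 2 + 23 * 22 * 21 // 6
print(min_weight, 2 ** 12 * sphere == 2 ** 23)
```

The dictionary `weights` also holds the full weight distribution, should it be wanted.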

The ternary Golay code

We now show that the ternary Golay code $G_{11}$ may also be
constructed as a cyclic code. Our starting point is the factorization
of $x^{11} - 1$ over $GF(3)$:
$$x^{11} - 1 = (x - 1)(x^5 + x^4 - x^3 + x^2 - 1)(x^5 - x^3 + x^2 - x - 1)$$
$$= (x - 1)g_1(x)g_2(x), \text{ say.}$$
Note that $g_2(x) = -x^5 g_1(x^{-1})$ and so $\langle g_1(x) \rangle$ and $\langle g_2(x) \rangle$ are
equivalent [11, 6]-codes. We shall show that the code $\langle g_1(x) \rangle$ has
minimum distance 5.

Theorem 12.21 Let $C$ be the ternary code $\langle g_1(x) \rangle$ in $R_{11}$, where
$g_1(x) = x^5 + x^4 - x^3 + x^2 - 1$. Let $D$ be the subcode of $C$
generated by $(x - 1)g_1(x)$. Let $a(x) = a_0 + a_1 x + \cdots + a_{10}x^{10}$ be a
codeword of $C$ of weight $w$. Then
(i) $a(x) \in D$ if and only if $\sum_{i=0}^{10} a_i = 0$,
(ii) if $a(x) \in D$, then $w \equiv 0 \pmod 3$,
(iii) if $a(x) \notin D$, then $w \equiv 2 \pmod 3$,
(iv) if $a(x) \notin D$, then $w \ge 4$,
(v) $w \ne 3$,
(vi) $d(C) = 5$.

Proof (i) Given that $a(x)$ is in $C$ and so is a multiple of $g_1(x)$,
we have
$$a(x) \in D \Leftrightarrow a(x) \text{ is a multiple of } (x - 1)
\Leftrightarrow a(1) = 0
\Leftrightarrow \sum_{i=0}^{10} a_i = 0.$$
(ii) First observe that, since $a_i^2 \equiv 1 \pmod 3$ for each non-zero
coefficient $a_i$, we have $w \equiv \sum a_i^2 \pmod 3$. By Theorem 12.15(ii),
the dual code $D^\perp$ of $D$ is generated by the reciprocal polynomial
of $g_2(x)$, which happens to be precisely $-g_1(x)$. Thus
$D^\perp = \langle \bar{g}_2(x) \rangle = \langle -g_1(x) \rangle = C$. So $D$ is contained in $D^\perp$, which means
that $D$ is self-orthogonal, i.e. the inner product of any two
vectors of $D$ is zero. In particular, if $a(x) \in D$, then the inner
product of $a(x)$ with itself is zero, i.e. $\sum a_i^2 \equiv 0 \pmod 3$. Thus
$a(x) \in D \Rightarrow w \equiv 0 \pmod 3$.
(iii) By Theorem 12.12, $D$ is a code of dimension 5. Also $D$ is
contained within the 6-dimensional code $C$. Since $\mathbf{1} = 1 + x +
\cdots + x^{10}$ is in $C$ but not in $D$, $C$ is the disjoint union of the three
cosets $D$, $\mathbf{1} + D$ and $-\mathbf{1} + D$. Thus any codeword $a(x)$ of $C$
which is not in $D$ is of the form
$$a(x) = d(x) \pm \mathbf{1},$$
for some codeword $d(x) = d_0 + d_1 x + \cdots + d_{10}x^{10} \in D$.
Hence
$$w(a(x)) \equiv \sum_{i=0}^{10} (d_i \pm 1)^2
= \left(\sum d_i^2\right) + 11 \pm 2\left(\sum d_i\right)
\equiv 11 \equiv 2 \pmod 3 \quad \text{(by (i) and (ii))}.$$

(iv) Suppose $a(x) \notin D$. Now $a(x)a(x^{-1})$ is a multiple of
$g_1(x)g_2(x) = \mathbf{1}$. By (i), $a(1) \ne 0$, and so $a(x)a(x^{-1}) = \pm\mathbf{1}$. Thus
$a(x)a(x^{-1})$ has weight 11. But at most $w^2$ coefficients of
$a(x)a(x^{-1})$ are non-zero and so $w^2 \ge 11$. Hence $w \ge 4$.
(v) Suppose, for a contradiction, that $w = 3$. Then, by a
suitable cyclic shift, and multiplication by $-1$ if necessary, we
may suppose $a(x) = 1 \pm x^i \pm x^j$. By (ii) and (iii), $a(x)$ must be in
$D$ and so, by (i), we must actually have $a(x) = 1 + x^i + x^j$. Also,
$a(x) \in D$ implies that $a(x)a(x^{-1})$ is a multiple of
$$(x - 1)g_1(x)g_2(x) = x^{11} - 1 = 0$$
in $R_{11}$. Thus
$$(1 + x^i + x^j)(1 + x^{-i} + x^{-j}) = 0,$$
giving
$$x^i + x^j + x^{-i} + x^{-j} + x^{i-j} + x^{j-i} = 0.$$
Since $i$ and $j$ are distinct and non-zero we must have
$i \equiv -j \equiv j - i \pmod{11}$, which implies that $3i \equiv 0 \pmod{11}$, which is a
contradiction.
(vi) It follows from (ii)-(v) that $d(C) \ge 5$ and since $g_1(x)$
itself has weight 5, $d(C) = 5$.
The [11, 6, 5]-code $C$ of Theorem 12.21 is called the ternary
Golay code. It is a perfect code because
$$3^6\left(1 + 2 \cdot 11 + 2^2\binom{11}{2}\right) = 3^{11},$$
and it is equivalent to the ternary Golay code defined in Chapter
9.
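As with the binary case, the minimum distance can be confirmed numerically by listing all $3^6 = 729$ codewords. The sketch below is only a check of the stated result; it assumes $g_1(x) = x^5 + x^4 - x^3 + x^2 - 1$ as in Theorem 12.21 and uses a hypothetical helper `weight`.

```python
from itertools import product

g1 = [-1, 0, 1, -1, 1, 1]      # g_1(x) = -1 + x^2 - x^3 + x^4 + x^5, low degree first
n, k = 11, 6

# Generator matrix of Theorem 12.12: the six shifts g_1(x), ..., x^5 g_1(x).
rows = [[0] * i + g1 + [0] * (n - len(g1) - i) for i in range(k)]

def weight(coeffs):
    """Weight of the codeword sum_i coeffs[i] * rows[i] over GF(3)."""
    word = [sum(c * row[j] for c, row in zip(coeffs, rows)) % 3 for j in range(n)]
    return sum(1 for s in word if s)

min_weight = min(weight(c) for c in product(range(3), repeat=k) if any(c))

# Sphere-packing count for radius 2: 1 + 2*11 + 2^2 * C(11, 2).
sphere = 1 + 2 * 11 + 4 * (11 * 10 // 2)
print(min_weight, 3 ** 6 * sphere == 3 ** 11)
```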

Hamming codes as cyclic codes

We will show that the binary Hamming codes discussed in
Chapter 8 are equivalent to cyclic codes. The proof will be
incomplete in the sense that we shall assume results previously
stated, but left unproved, in the text.

Theorem 12.22 The binary Hamming code Ham$(r, 2)$ is
equivalent to a cyclic code.

Proof Let $p(x)$ be an irreducible polynomial of degree $r$ in
$F[x]$. Then, by Theorem 12.5, the ring $F[x]/p(x)$ of polynomials
modulo $p(x)$ is actually a field of order $2^r$. As was mentioned in
Chapter 3, every finite field has a primitive element and so there
exists an element $\alpha$ of $F[x]/p(x)$ such that
$F[x]/p(x) = \{0, 1, \alpha, \alpha^2, \ldots, \alpha^{2^r - 2}\}$.
Let us now identify an element $a_0 + a_1 x + a_2 x^2 +
\cdots + a_{r-1}x^{r-1}$ of $F[x]/p(x)$ with the column vector
$$\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_{r-1} \end{bmatrix}$$
and consider the binary $r \times (2^r - 1)$ matrix
$$H = [\,1 \;\; \alpha \;\; \alpha^2 \;\; \cdots \;\; \alpha^{2^r - 2}\,].$$
Let $C$ be the binary linear code having $H$ as parity-check matrix.
Since the columns of $H$ are precisely the distinct non-zero vectors
of $V(r, 2)$, $C$ is a Hamming code Ham$(r, 2)$. Putting $n = 2^r - 1$
we have
$$C = \{ f_0 f_1 \cdots f_{n-1} \in V(n, 2) \mid f_0 + f_1\alpha + \cdots + f_{n-1}\alpha^{n-1} = 0 \}$$
$$= \{ f(x) \in R_n \mid f(\alpha) = 0 \text{ in } F[x]/p(x) \}. \qquad (12.23)$$
If $f(x) \in C$ and $r(x) \in R_n$, then $r(x)f(x) \in C$ because
$r(\alpha)f(\alpha) = r(\alpha) \cdot 0 = 0$. So, by Theorem 12.6, this version of Ham$(r, 2)$ is
cyclic.

Definition If $p(x)$ is an irreducible polynomial of degree $r$ such
that $x$ is a primitive element of the field $F[x]/p(x)$, then $p(x)$ is
called a primitive polynomial.

Theorem 12.24 If $p(x)$ is a primitive polynomial over $GF(2)$ of
degree $r$, then the cyclic code $\langle p(x) \rangle$ is the Hamming code
Ham$(r, 2)$.

Proof If $p(x)$ is primitive, then (12.23) implies that
$$\text{Ham}(r, 2) = \{ f(x) \in R_n \mid f(x) = 0 \text{ in } F[x]/p(x) \} = \langle p(x) \rangle.$$
Example 12.25 The polynomial $x^3 + x + 1$ is irreducible over
$GF(2)$ and so $F[x]/(x^3 + x + 1)$ is a field of order 8. Also, $x$ is a
primitive element of this field, for
$$F[x]/(x^3 + x + 1) = \{0,\ 1,\ x,\ x^2,\ x^3 = x + 1,\ x^4 = x^2 + x,\ x^5 = x^2 + x + 1,\ x^6 = x^2 + 1\}.$$
Thus a parity-check matrix for a cyclic version of the Hamming
code Ham$(3, 2)$ is
$$H = \begin{bmatrix}
1 & 0 & 0 & 1 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 1 & 1
\end{bmatrix},$$
wherein the columns represent $1, \alpha, \alpha^2, \ldots, \alpha^6$ as described in
the proof of Theorem 12.22, with $\alpha = x$.
Since $x^3 + x + 1$ is a primitive polynomial, it is a generator
polynomial for Ham$(3, 2)$ and so, by Theorem 12.12, a generator
matrix for the code is
$$\begin{bmatrix}
1 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.$$
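The columns of $H$ in Example 12.25 are the successive powers of $x$ reduced modulo $p(x)$, which is easy to generate by repeated multiplication by $x$. A minimal sketch over $GF(2)$ follows; the function name `powers_of_x` is an illustrative assumption.

```python
def powers_of_x(p, count):
    """Return x^0, ..., x^(count-1) reduced modulo p(x) over GF(2).

    p lists p_0, ..., p_r with p_r = 1; each power is a list of r coefficients,
    lowest degree first."""
    r = len(p) - 1
    cur = [1] + [0] * (r - 1)
    out = []
    for _ in range(count):
        out.append(cur)
        carry = cur[-1]                  # coefficient that would land on x^r
        shifted = [0] + cur[:-1]         # multiply by x
        # Replace x^r by p_0 + p_1 x + ... + p_(r-1) x^(r-1) (signs vanish mod 2).
        cur = [(s + carry * p[i]) % 2 for i, s in enumerate(shifted)]
    return out

cols = powers_of_x([1, 1, 0, 1], 7)      # p(x) = 1 + x + x^3
H = [[col[i] for col in cols] for i in range(3)]
for row in H:
    print(row)
```

The seven columns are distinct and non-zero, which is another way of seeing that $x$ is a primitive element of this field.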

Remarks (1) It can be shown that there exists a primitive
polynomial of degree $r$ for any $r$.
(2) We saw in Example 12.13 that the ternary Hamming code
Ham$(2, 3)$ is not equivalent to a cyclic code. However,
Ham$(r, q)$ is equivalent to a cyclic code if $r$ and $q - 1$ are
relatively prime (see, e.g., Blahut (1983), Theorem 5.5.1).

Concluding remarks on Chapter 12

(1) Cyclic codes were first studied by Prange (1957). Interest
was further stimulated by the theorem of Bose and
Ray-Chaudhuri (1960) which gave lower bounds on the minimum
distance for a large class of cyclic codes. It was quickly
discovered that almost every special linear code previously
discovered (e.g. Hamming, Golay, Reed-Muller) could be made
cyclic.
(2) For a comprehensive treatment of the theory of cyclic
codes, see, e.g., MacWilliams and Sloane (1977). For details of
the practical implementation of cyclic codes, including the
associated circuitry, see, e.g., Blahut (1983) or Lin and Costello
(1983).

Exercises 12

12.1 Is each of the following codes (a) cyclic, (b) equivalent to
a cyclic code?
(i) the binary code {0000, 1100, 0110, 0011, 1001}
(ii) the binary code {00000, 10110, 01101, 11011}
(iii) the ternary code {0000, 1122, 2211}
(iv) the $q$-ary repetition code of length $n$
(v) the binary even-weight code $E_n$
(vi) the ternary code $\{x \in V(n, 3) \mid w(x) \equiv 0 \pmod 3\}$
(vii) the ternary code
$$\left\{ x_1 x_2 \cdots x_n \in V(n, 3) \;\middle|\; \sum_{i=1}^{n} x_i \equiv 0 \pmod 3 \right\}$$
12.2 Write out the multiplication table for $F[x]/(x^2 + 1)$.
Explain why $F[x]/(x^2 + 1)$ is not a field.
12.3 Write out a proof of Theorem 12.5.
12.4 Show that an irreducible polynomial over $GF(2)$ of
degree $\ge 2$ has an odd number of non-zero coefficients.
12.5 To verify that a polynomial $p(x)$ is irreducible, why is it
enough to show that $p(x)$ has no irreducible factor of
degree $\le \frac{1}{2} \deg p(x)$?
12.6 List the irreducible polynomials over $GF(2)$ of degrees 1
to 4. Construct a finite field of order 8.
12.7 Suppose $p$ is a prime number.
(i) Factorize $x^p - 1$ into irreducible polynomials over
$GF(p)$.
(ii) Factorize $x^{p-1} - 1$ into irreducible polynomials over
$GF(p)$.
12.8 Factorize $x^5 - 1$ into irreducible polynomials over $GF(2)$
and hence determine all the cyclic binary codes of length
5.
12.9 Let $g(x)$ be the generator polynomial of a binary cyclic
code which contains some codewords of odd weight. Is
the set of codewords in $\langle g(x) \rangle$ of even weight a cyclic
code? If so, what is the generator polynomial of this
subcode?
12.10 Suppose $x^n - 1$ is the product of $t$ distinct irreducible
polynomials over $GF(q)$. How many cyclic codes of
length $n$ over $GF(q)$ are there?
12.11 Given that the factorization of $x^7 - 1$ into irreducible
polynomials over $GF(2)$ is $(x - 1)(x^3 + x + 1)(x^3 + x^2 + 1)$,
determine all the cyclic binary codes of length 7. Give
a name or a concise description of each of these codes.
12.12 Factorize $x^8 - 1$ over $GF(3)$. How many ternary cyclic
codes of length 8 are there?
12.13 Write down a check polynomial and a parity-check matrix
for each of the ternary cyclic codes of length 4 (see
Example 12.13).
12.14 Let $h(x)$ be the check polynomial of a cyclic code $C$. Is
$\langle h(x) \rangle$ equal to $C^\perp$? Is $\langle h(x) \rangle$ equivalent to $C^\perp$?
12.15 Suppose C is a binary cyclic code of odd length. Show
that C contains a codeword of odd weight if and only if 1
is a codeword of C.
12.16 Suppose a generator matrix G of a linear code C has the
property that a cyclic shift of any row of G is also a
codeword. Show that C is a cyclic code.
12.17 Show that 2 is a primitive element of GF(11). Deduce
that the [10,8]- and [10, 6]-codes over GF(11) of Ex-
amples 7.12 and 11.3 respectively are equivalent to cyclic
codes.
12.18 Let $G_{23}$ be the cyclic Golay code defined in the text.
Prove that any two vectors in $G_{23}$ of even weight have
inner product equal to zero. Hence prove that the
extended Golay code $G_{24}$, obtained by adding an overall
parity-check to $G_{23}$, is self-dual.
12.19 Determine which of the irreducible polynomials over
GF (2) of degree 4 (found in Exercise 12.6) are primitive.
Hence write down a generator polynomial for the binary
Hamming code of length 15. Find the check polynomial
for this code. Write down the corresponding parity-check
matrix (using Theorem 12.15) and check that its columns
are precisely the non-zero vectors of V(4, 2).
12.20 Let $g(x)$ be the generator polynomial of a cyclic binary
Hamming code Ham$(r, 2)$, with $r \ge 3$. Show that
$\langle (x - 1)g(x) \rangle$ is a cyclic $[2^r - 1, 2^r - r - 2, 4]$-code.
12.21 An error vector of the form $x^i + x^{i+1}$ in $R_n$ is called a
double-adjacent error. Show that the code $\langle (x - 1)g(x) \rangle$
of Exercise 12.20 is capable of correcting all single errors
and all double-adjacent errors.
12.22 Let $C$ be a $[q + 1, 2, q]$-code over $GF(q)$, where $q$ is odd.
Show that $C$ cannot be cyclic. Deduce that the Hamming
code Ham$(2, q)$ is not equivalent to a cyclic code when $q$
is odd.
13 Weight enumerators

If $C$ is a linear $[n, k]$-code, its weight enumerator is defined to be
the polynomial
$$W_C(z) = \sum_{i=0}^{n} A_i z^i = A_0 + A_1 z + \cdots + A_n z^n,$$
where $A_i$ denotes the number of codewords in $C$ of weight $i$.
Another way of writing $W_C(z)$ is
$$W_C(z) = \sum_{x \in C} z^{w(x)}.$$

Examples 13.1 (i) Let $C$ be the binary even-weight code of
length 3; i.e. $C$ = {000, 011, 101, 110}. Its dual code $C^\perp$ is
{000, 111}. The weight enumerators of $C$ and $C^\perp$ are
$$W_C(z) = 1 + 3z^2,$$
$$W_{C^\perp}(z) = 1 + z^3.$$
(ii) The code $C$ = {00, 11} is self-dual and so
$$W_C(z) = W_{C^\perp}(z) = 1 + z^2.$$
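For a code given as an explicit list of codewords, the coefficients $A_0, \ldots, A_n$ can be tallied directly from the definition. A minimal sketch, with codewords written as strings of digits and a hypothetical function name `weight_enumerator`:

```python
def weight_enumerator(code):
    """Return [A_0, ..., A_n], where A_i counts the codewords of weight i."""
    n = len(next(iter(code)))
    A = [0] * (n + 1)
    for word in code:
        A[sum(1 for s in word if s != "0")] += 1
    return A

W_C = weight_enumerator({"000", "011", "101", "110"})
W_dual = weight_enumerator({"000", "111"})
print(W_C)      # [1, 0, 3, 0]
print(W_dual)   # [1, 0, 0, 1]
```

The two lists are the coefficient vectors of $1 + 3z^2$ and $1 + z^3$ from Examples 13.1(i).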
We have already seen (Theorem 6.14) that knowledge of the
weight enumerator of a code enables us to calculate the
probability of undetected errors when the code is used purely for
error detection.
The main result of this chapter is a remarkable formula of
MacWilliams (1963), which enables the weight enumerator of
any linear code $C$ to be obtained from the weight enumerator of
its dual code $C^\perp$.
For simplicity we shall prove this result, known as the
MacWilliams identity, only for binary codes (Theorem 13.5),
although the general result will be stated afterwards (Theorem
13.6).
The following three lemmas are required only for the proof of
the MacWilliams identity. The less mathematically minded
reader, who is happy to accept the validity of the formula
without proof, may skip these lemmas, and also the proof of
Theorem 13.5, without any great loss; the subsequent examples
and exercises make use only of the formula and not of its proof.

Lemma 13.2 Let $C$ be a binary linear $[n, k]$-code and suppose $y$
is a fixed vector in $V(n, 2)$ which is not in $C^\perp$. Then $x \cdot y$ is equal
to 0 and 1 equally often as $x$ runs over the codewords of $C$.

Proof Let $A = \{x \in C \mid x \cdot y = 0\}$
and $B = \{x \in C \mid x \cdot y = 1\}$.
Let $u$ be a codeword of $C$ such that $u \cdot y = 1$ ($u$ exists since
$y \notin C^\perp$). Let $u + A$ denote the set $\{u + x \mid x \in A\}$. Then
$$u + A \subseteq B,$$
for if $x \in A$, then $(u + x) \cdot y = u \cdot y + x \cdot y = 1 + 0 = 1$.
Similarly
$$u + B \subseteq A.$$
Hence
$$|A| = |u + A| \le |B| = |u + B| \le |A|.$$
Hence $|A| = |B|$ and the lemma is proved.

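Lemma 13.3 Let $C$ be a binary $[n, k]$-code and let $y$ be any
element of $V(n, 2)$. Then
$$\sum_{x \in C} (-1)^{x \cdot y} =
\begin{cases} 2^k & \text{if } y \in C^\perp \\ 0 & \text{if } y \notin C^\perp. \end{cases}$$

Proof If $y \in C^\perp$, then $x \cdot y = 0$ for all $x \in C$, and so
$$\sum_{x \in C} (-1)^{x \cdot y} = |C| \cdot 1 = 2^k.$$
If $y \notin C^\perp$, then by Lemma 13.2, as $x$ runs over the elements of $C$,
$(-1)^{x \cdot y}$ is equal to 1 and $-1$ equally often, giving
$$\sum_{x \in C} (-1)^{x \cdot y} = 0.$$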
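Lemma 13.4 Let $x$ be a fixed vector in $V(n, 2)$ and let $z$ be an
indeterminate. Then the following polynomial identity holds:
$$\sum_{y \in V(n,2)} z^{w(y)} (-1)^{x \cdot y} = (1 - z)^{w(x)} (1 + z)^{n - w(x)}.$$

Proof
$$\sum_{y \in V(n,2)} z^{w(y)} (-1)^{x \cdot y}
= \sum_{y_1 = 0}^{1} \sum_{y_2 = 0}^{1} \cdots \sum_{y_n = 0}^{1}
z^{y_1 + y_2 + \cdots + y_n} (-1)^{x_1 y_1 + \cdots + x_n y_n}$$
$$= \sum_{y_1 = 0}^{1} \cdots \sum_{y_n = 0}^{1} \left( \prod_{i=1}^{n} z^{y_i} (-1)^{x_i y_i} \right)$$
$$= \prod_{i=1}^{n} \left( \sum_{j=0}^{1} z^{j} (-1)^{j x_i} \right)$$
$$= (1 - z)^{w(x)} (1 + z)^{n - w(x)},$$
since
$$\sum_{j=0}^{1} z^{j} (-1)^{j x_i} =
\begin{cases} 1 + z & \text{if } x_i = 0 \\ 1 - z & \text{if } x_i = 1. \end{cases}$$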

Theorem 13.5 (The MacWilliams identity for binary linear
codes) If $C$ is a binary $[n, k]$-code with dual code $C^\perp$, then
$$W_{C^\perp}(z) = \frac{1}{2^k} (1 + z)^n\, W_C\!\left( \frac{1 - z}{1 + z} \right).$$
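Proof We express the polynomial
$$f(z) = \sum_{x \in C} \left( \sum_{y \in V(n,2)} z^{w(y)} (-1)^{x \cdot y} \right)$$
in two ways.
On the one hand, using Lemma 13.4,
$$f(z) = \sum_{x \in C} (1 - z)^{w(x)} (1 + z)^{n - w(x)}
= (1 + z)^n \sum_{x \in C} \left( \frac{1 - z}{1 + z} \right)^{w(x)}
= (1 + z)^n\, W_C\!\left( \frac{1 - z}{1 + z} \right).$$
On the other hand, reversing the order of summation, we have
$$f(z) = \sum_{y \in V(n,2)} z^{w(y)} \left( \sum_{x \in C} (-1)^{x \cdot y} \right)
= \sum_{y \in C^\perp} z^{w(y)}\, 2^k \quad \text{(by Lemma 13.3)}
= 2^k\, W_{C^\perp}(z).$$
Equating the two expressions for $f(z)$ establishes the result.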

The proof of the following more general result is similar to that
of Theorem 13.5, using generalized versions of the preceding
lemmas, but we omit the details.

Theorem 13.6 (The MacWilliams identity for general linear
codes) If $C$ is a linear $[n, k]$-code over $GF(q)$ with dual code
$C^\perp$, then
$$W_{C^\perp}(z) = \frac{1}{q^k} \left[ 1 + (q - 1)z \right]^n W_C\!\left( \frac{1 - z}{1 + (q - 1)z} \right).$$

Remark If $C$ is a binary $[n, k]$-code, then, since the dual code
of $C^\perp$ is just $C$, we can write the MacWilliams identity in the
(often more useful) form:
$$W_C(z) = \frac{1}{2^{n-k}} (1 + z)^n\, W_{C^\perp}\!\left( \frac{1 - z}{1 + z} \right). \qquad (13.7)$$
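Examples 13.8 We apply Theorem 13.5 to the codes of
Examples 13.1.
(i) We have $W_C(z) = 1 + 3z^2$. Hence, by Theorem 13.5,
$$W_{C^\perp}(z) = \tfrac{1}{4}(1 + z)^3\, W_C\!\left( \frac{1 - z}{1 + z} \right)
= \tfrac{1}{4}\left[ (1 + z)^3 + 3(1 - z)^2 (1 + z) \right]
= 1 + z^3,$$
as already found directly from $C^\perp$.
Let us interchange the roles of $C$ and $C^\perp$ in order to check the
formula (13.7). We have
$$\tfrac{1}{2}(1 + z)^3\, W_{C^\perp}\!\left( \frac{1 - z}{1 + z} \right)
= \tfrac{1}{2}\left[ (1 + z)^3 + (1 - z)^3 \right]
= 1 + 3z^2,$$
which is indeed $W_C(z)$.
(ii) We have $W_C(z) = 1 + z^2$. Hence
$$W_{C^\perp}(z) = \tfrac{1}{2}(1 + z)^2\, W_C\!\left( \frac{1 - z}{1 + z} \right)
= \tfrac{1}{2}\left[ (1 + z)^2 + (1 - z)^2 \right]
= 1 + z^2.$$
Thus $W_{C^\perp}(z) = W_C(z)$, as we expect, since $C$ is self-dual.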
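The substitution in Theorem 13.5 amounts to replacing each term $A_i z^i$ by $A_i (1 - z)^i (1 + z)^{n - i}$ and dividing by the number of codewords, which is convenient to carry out with integer coefficient lists. The following is a sketch under that reading of the binary identity; `poly_mul`, `poly_pow` and `macwilliams` are illustrative names, not notation from the text.

```python
def poly_mul(a, b):
    """Product of two integer polynomials given as coefficient lists."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def poly_pow(a, e):
    """a(z) raised to the non-negative integer power e."""
    out = [1]
    for _ in range(e):
        out = poly_mul(out, a)
    return out

def macwilliams(A, size):
    """Weight distribution of the dual of a binary linear code.

    A = [A_0, ..., A_n] for a code with `size` codewords; computes
    (1/size) * sum_i A_i (1 - z)^i (1 + z)^(n - i)."""
    n = len(A) - 1
    total = [0] * (n + 1)
    for i, Ai in enumerate(A):
        term = poly_mul(poly_pow([1, -1], i), poly_pow([1, 1], n - i))
        total = [t + Ai * c for t, c in zip(total, term)]
    return [t // size for t in total]

print(macwilliams([1, 0, 3, 0], 4))   # [1, 0, 0, 1]
print(macwilliams([1, 0, 0, 1], 2))   # [1, 0, 3, 0]
print(macwilliams([1, 0, 1], 2))      # [1, 0, 1]
```

The three calls reproduce the three calculations of Examples 13.8.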

For the very small codes just considered, the use of the
MacWilliams identity is an inefficient way of calculating their
weight enumerators, which can be written down directly from the
lists of codewords. But suppose we are required to calculate the
weight enumerator of an $[n, k]$-code $C$ over $GF(q)$ where $k$ is
large. To enumerate all $q^k$ codewords by weight may be a
formidable task. However, if $k$ is so large that $n - k$ is small,
then the dual code $C^\perp$ may be small enough to find its weight
enumerator, and then the MacWilliams identity can be used to
find the weight enumerator of $C$.
For example, the binary Hamming code Ham$(r, 2)$ has dimension
$2^r - 1 - r$, and so the number of codewords in Ham$(r, 2)$ is
$2^{2^r - 1 - r}$, a large number even for moderately small values of $r$.
But the dual code has only $2^r$ codewords and, as we shall soon
see, it has a particularly simple weight enumerator. From this,
the weight enumerator of Ham$(r, 2)$ itself is easily determined.
First we look at a particular case.

Example 13.9 Let $C$ be the binary [7, 4]-Hamming code. Then
the dual code $C^\perp$ has generator matrix
$$\begin{bmatrix}
0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1
\end{bmatrix}.$$
When we compute $W_{C^\perp}(z)$ directly, by listing the codewords, we
find, surprisingly, that each of the non-zero codewords has
weight 4 (the next theorem shows this to be no isolated
phenomenon, as far as the Hamming codes are concerned). Thus
$$W_{C^\perp}(z) = 1 + 7z^4,$$
and so the weight enumerator of $C$ itself is, by equation (13.7),
$$\tfrac{1}{8}\left[ (1 + z)^7 + 7(1 - z)^4 (1 + z)^3 \right] = 1 + 7z^3 + 7z^4 + z^7.$$

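Theorem 13.10 Let $C$ be the binary Hamming code Ham$(r, 2)$.
Then every non-zero codeword of $C^\perp$ has weight $2^{r-1}$.

Proof Let
$$H = \begin{bmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \vdots \\ \mathbf{h}_r \end{bmatrix}
= \begin{bmatrix}
h_{11} & h_{12} & \cdots & h_{1n} \\
h_{21} & h_{22} & \cdots & h_{2n} \\
\vdots & & & \vdots \\
h_{r1} & h_{r2} & \cdots & h_{rn}
\end{bmatrix}$$
be a parity-check matrix of $C$ where the rows of $H$ are denoted
by $\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_r$. Then a non-zero codeword $\mathbf{c}$ of $C^\perp$ is a vector
of the form $\mathbf{c} = \sum_{i=1}^{r} \lambda_i \mathbf{h}_i$ for some scalars $\lambda_1, \lambda_2, \ldots, \lambda_r$, not all
zero. We will find the weight of $\mathbf{c}$ by finding the number $n_0(\mathbf{c})$ of
zero entries of $\mathbf{c}$ and then subtracting $n_0(\mathbf{c})$ from the length $n$.
Now $\mathbf{c}$ has a zero in its $j$th position if and only if $\sum_{i=1}^{r} \lambda_i h_{ij} = 0$,
i.e. if and only if $\sum_{i=1}^{r} \lambda_i x_i = 0$, where $(x_1 x_2 \cdots x_r)^T$ is the $j$th
column of $H$. Since $C$ is a Hamming code, the columns of $H$ are
precisely the non-zero vectors of $V(r, 2)$ and so $n_0(\mathbf{c})$ is equal to
the number of non-zero vectors in the set
$$X = \left\{ x_1 x_2 \cdots x_r \in V(r, 2) \;\middle|\; \sum_{i=1}^{r} \lambda_i x_i = 0 \right\},$$
i.e. $n_0(\mathbf{c}) = |X| - 1$.
It is easy to see that $X$ is an $(r - 1)$-dimensional subspace of
$V(r, 2)$ (e.g. view $X$ as the dual code of the $[r, 1]$-code which has
generator matrix $[\lambda_1 \lambda_2 \cdots \lambda_r]$, so that $\dim(X) = r - 1$, by
Theorem 7.3). Hence
$$|X| = 2^{r-1} \quad \text{and so} \quad n_0(\mathbf{c}) = 2^{r-1} - 1.$$
(Note that $n_0(\mathbf{c})$ is independent of the choice of non-zero
codeword $\mathbf{c}$ in $C^\perp$.) Thus
$$w(\mathbf{c}) = n - n_0(\mathbf{c}) = 2^r - 1 - (2^{r-1} - 1) = 2^{r-1}.$$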
Corollary 13.11 The weight enumerator of the binary Hamming
code Ham$(r, 2)$, of length $n = 2^r - 1$, is
$$\frac{1}{n+1}\left[(1+z)^n + n(1-z^2)^{(n-1)/2}(1-z)\right].$$

Proof This is a straightforward application of the MacWilliams
identity which is left to Exercise 13.5.
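As an illustration of how the formula in Corollary 13.11 is used, the following Python sketch expands it with integer polynomial arithmetic and reads off the coefficients $A_0, A_1, \ldots, A_n$. It is not from the original text; the helper names `poly_mul`, `poly_pow` and `hamming_weight_distribution` are hypothetical, and polynomials are stored as coefficient lists with the constant term first.

```python
def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_pow(base, exponent):
    """base ** exponent by repeated multiplication."""
    result = [1]
    for _ in range(exponent):
        result = poly_mul(result, base)
    return result

def hamming_weight_distribution(r):
    """Coefficients [A_0, ..., A_n] of the weight enumerator of Ham(r, 2),
    obtained by expanding the formula of Corollary 13.11."""
    n = 2**r - 1
    first = poly_pow([1, 1], n)                        # (1 + z)^n
    second = poly_mul(poly_pow([1, 0, -1], (n - 1) // 2),
                      [1, -1])                         # (1 - z^2)^((n-1)/2) (1 - z)
    total = [f + n * s for f, s in zip(first, second)]
    # Every coefficient must be divisible by n + 1 = 2^r.
    assert all(t % (n + 1) == 0 for t in total)
    return [t // (n + 1) for t in total]
```

For `r = 3` the function returns `[1, 0, 0, 7, 7, 0, 0, 1]`, in agreement with Example 13.9.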

Probability of undetected errors

Suppose we wish to find $P_{\text{undetec}}(C)$ for a binary $[n, k]$-code $C$.
By Theorem 6.14, we have
$$P_{\text{undetec}}(C) = \sum_{i=1}^{n} A_i p^i (1-p)^{n-i} = (1-p)^n \sum_{i=1}^{n} A_i \left(\frac{p}{1-p}\right)^i.$$
Since
$$W_C\!\left(\frac{p}{1-p}\right) = \sum_{i=0}^{n} A_i \left(\frac{p}{1-p}\right)^i$$
and since $A_0 = 1$, we have
$$P_{\text{undetec}}(C) = (1-p)^n \left[ W_C\!\left(\frac{p}{1-p}\right) - 1 \right]. \tag{13.12}$$

If we know $W_C(z)$, then we can find $P_{\text{undetec}}(C)$ by means of
equation (13.12). If we know only $W_{C^\perp}(z)$ to start with, then we
could use the MacWilliams identity (13.7) to calculate $W_C(z)$ and
then use equation (13.12). Alternatively, we could use the
formula derived in Exercise 13.9, which gives $P_{\text{undetec}}(C)$ directly
in terms of $W_{C^\perp}(z)$, and thereby avoid the intermediate calculation
of $W_C(z)$.
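A small numerical check of equation (13.12) may be helpful. The Python sketch below is an illustration only: the function names are hypothetical and the symbol error probability `p = 0.01` is an arbitrary assumed value. It evaluates the probability of an undetected error for the [7, 4]-Hamming code in two ways, directly from the sum of Theorem 6.14 and via the weight enumerator.

```python
from math import isclose

def p_undetected_direct(weights, p):
    """Sum over i >= 1 of A_i p^i (1-p)^(n-i); weights = [A_0, ..., A_n]."""
    n = len(weights) - 1
    return sum(a * p**i * (1 - p)**(n - i)
               for i, a in enumerate(weights) if i > 0)

def p_undetected_via_enumerator(weights, p):
    """Equation (13.12): (1-p)^n [W_C(p/(1-p)) - 1]."""
    n = len(weights) - 1
    z = p / (1 - p)
    w_c = sum(a * z**i for i, a in enumerate(weights))
    return (1 - p)**n * (w_c - 1)

hamming_7_4 = [1, 0, 0, 7, 7, 0, 0, 1]   # from Example 13.9
p = 0.01                                  # assumed symbol error probability
direct = p_undetected_direct(hamming_7_4, p)
via_formula = p_undetected_via_enumerator(hamming_7_4, p)
```

The two values agree up to floating-point rounding. For very small `p` the subtraction in (13.12) loses some precision, so the direct sum is numerically the safer of the two.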
Exercises 13

13.1 Suppose $C$ is a binary linear code of length $n$ which
contains the vector $11 \cdots 1$ consisting of all 1s. Show that
$$A_i = A_{n-i},$$
for $i = 0, 1, \ldots, n$.
13.2 Find the weight enumerator of the code whose generator
matrix is
10011
01001
00101
(a) directly,
(b) by using the MacWilliams identity.
13.3 Let $C$ be the binary [9, 7]-code having the generator
matrix
$$\left[\, I_7 \;\middle|\; \begin{matrix} 0&1\\ 0&1\\ 1&0\\ 1&0\\ 1&1\\ 1&1\\ 1&1 \end{matrix} \,\right].$$
Let $\sum_{i=0}^{9} A_i z^i$ denote the weight enumerator of $C$. Use the
MacWilliams identity to find the values of $A_0$, $A_1$, $A_2$ and
$A_3$. Show that $C$ contains the vector consisting of all 1s
and hence, or otherwise, determine the full weight
enumerator of $C$.
13.4 Using the result of Example 13.9, write down the weight
enumerator of the extended binary Hamming code of
length 8.
13.5 Prove Corollary 13.11.
13.6 Find the number of codewords of each of the weights 0,
1, 2, 3 and 4 in the binary Hamming code of length 15.
13.7 Let $C$ be a binary linear code and let $C_E$ denote the
subcode of $C$ consisting of all codewords of $C$ of even
weight. Show that
$$W_{C_E}(z) = \tfrac{1}{2}\left[W_C(z) + W_C(-z)\right].$$
13.8 Let $C$ be a binary linear code and let $\hat{C}$ be the extended
code obtained from $C$ by adding an overall parity check.
Show that
$$W_{\hat{C}}(z) = \tfrac{1}{2}\left[(1+z)W_C(z) + (1-z)W_C(-z)\right].$$
13.9 Suppose $C$ is a binary $[n, k]$-code. Prove that
$$P_{\text{undetec}}(C) = \frac{1}{2^{n-k}}\, W_{C^\perp}(1-2p) - (1-p)^n.$$

13.10 Let $G_{24}$ be the extended binary Golay code defined in
Theorem 9.3. Notice that the vector consisting of all 1s
belongs to $G_{24}$ (add all the rows of $G$ together). Using
properties of $G_{24}$ found during the proof of Theorem 9.3,
show that
$$W_{G_{24}}(z) = 1 + 759z^8 + 2576z^{12} + 759z^{16} + z^{24}.$$
13.11 Let $G_{23}$ be the cyclic binary code defined in Theorem
12.20, and let $\hat{G}_{23}$ be its extended code. Using results
from Chapter 12, including Exercise 12.18, determine the
weight enumerator of $\hat{G}_{23}$.
13.12 Use either Exercise 13.10 or 13.11, together with Exercise
9.4(a), to determine the weight enumerator of the
binary Golay code $G_{23}$.
13.13 For each of the two constructions given of the [24, 12]-
Golay code (in Chapters 9 and 12), the tricky part of
showing that $d(G_{24}) = 8$ is to show that there are no
codewords of weight 4. Assuming only the easily proven
facts that $G_{24}$ is self-dual, that $G_{24}$ contains $\mathbf{1}$, and that
every codeword has weight divisible by 4, use the
MacWilliams identity to show that $A_4 = 0$.
14 The main linear coding theory problem

In Chapter 2 we discussed the ‘main coding theory problem’.
This was the problem of finding $A_q(n, d)$, the largest value of $M$
for which there exists a $q$-ary $(n, M, d)$-code. In the present
chapter we shall consider the same problem restricted to linear
codes. If $q$ is a prime power, we denote by $B_q(n, d)$ the largest
value of $M$ for which there exists a linear $(n, M, d)$-code over
GF$(q)$. (The function $B_q(n, d)$ was briefly introduced in Exercises
5.8 and 5.9.) Clearly $B_q(n, d)$ is always a power of $q$, and
$B_q(n, d) \le A_q(n, d)$. We shall refer to the problem of finding
$B_q(n, d)$ as the main linear coding theory problem, or MLCT
problem for short.
If we regard the values of $q$ and $d$ as fixed, we may state the
problem as follows.

MLCT problem (Version 1) For given length $n$, find the
maximum dimension $k$ such that there exists an $[n, k, d]$-code
over GF$(q)$. (Then, for this $k$, $B_q(n, d) = q^k$.)

Recall that the redundancy $r$ of an $[n, k, d]$-code is just $n - k$
(the number of check symbols in a codeword). An alternative
version of the MLCT problem is:

MLCT problem (Version 2) For given redundancy $r$, find the
maximum length $n$ such that there exists an $[n, n-r, d]$-code
over GF$(q)$.

Solving Version 1 for all $n$ is equivalent to solving Version 2
for all $r$, because in either case we then know exactly those
values of $n$ and $k$ for which an $[n, k, d]$-code exists. The
equivalence of the two versions will be made explicit in Theorem
14.3.
It turns out that Version 2 provides the more natural approach.
The key to this approach, which was touched upon in
Concluding Remark 3 of Chapter 8, is given in the next theorem.
But first we make some definitions.

Definitions An $(n, s)$-set in $V(r, q)$ is a set of $n$ vectors in
$V(r, q)$ with the property that any $s$ of them are linearly
independent.
We denote by $\max_s(r, q)$ the largest value of $n$ for which there
exists an $(n, s)$-set in $V(r, q)$. An $(n, s)$-set in $V(r, q)$ which has
$n = \max_s(r, q)$ is called optimal. The packing problem for
$V(r, q)$ is that of determining the values of $\max_s(r, q)$ and the
optimal $(n, s)$-sets.
The packing problem was first considered by Bose (1947) for
its statistical interest and later (1961) for its connection with
coding theory, which is given by the following theorem.

Theorem 14.1 There exists an $[n, n-r, d]$-code over GF$(q)$ if
and only if there exists an $(n, d-1)$-set in $V(r, q)$.

Proof Suppose $C$ is an $[n, n-r, d]$-code over GF$(q)$ with
parity-check matrix $H$. Then, by Theorem 8.4, the columns of $H$
form an $(n, d-1)$-set in $V(r, q)$. On the other hand, suppose $K$
is an $(n, d-1)$-set in $V(r, q)$. If we form an $r \times n$ matrix $H$ with
the vectors of $K$ as its columns, then, again by Theorem 8.4, $H$ is
the parity-check matrix of an $[n, n-r]$-code whose minimum
distance is at least $d$.

Corollary 14.2 For given values of $q$, $d$ and $r$, the largest value
of $n$ for which there exists an $[n, n-r, d]$-code over GF$(q)$ is
$\max_{d-1}(r, q)$.

So the MLCT problem (Version 2) is the same as the packing
problem of finding $\max_{d-1}(r, q)$. We now show that the values
of $B_q(n, d)$ are also given by the solutions to this problem.

Theorem 14.3 Suppose $\max_{d-1}(r-1, q) < n \le \max_{d-1}(r, q)$.
Then $B_q(n, d) = q^{n-r}$.

Proof Since $n \le \max_{d-1}(r, q)$, there exists an $[n, n-r, d]$-code
over GF$(q)$, and so $B_q(n, d) \ge q^{n-r}$. If $B_q(n, d)$ were strictly
greater than $q^{n-r}$, then there would exist an $[n, n-r+1, d]$-code,
implying that $n \le \max_{d-1}(r-1, q)$, contrary to hypothesis.
Let us pause to outline our plan of campaign for the remainder
of this and the next chapter. We shall consider the MLCT
problem for increasing values of the minimum distance $d$. The
cases $d = 1$ and $d = 2$ are easily dealt with in Exercise 14.2. We
will therefore consider first the problem for $d = 3$ and will solve it
for all values of $q$ and $r$. We will then consider the case $d = 4$,
solving the MLCT problem for $q = 2$ and giving the known
results for $q > 2$. For cases of $d$ greater than 4, very little is
known in the way of general results, at least not until $d$ reaches
its maximum value for given redundancy $r$, which is $d = r + 1$.
We will consider this very interesting case in Chapter 15.

The MLCT problem for d=3 (or Hamming codes revisited)

Theorem 14.4 For given redundancy $r$, the maximum length $n$
of an $[n, n-r, 3]$-code over GF$(q)$ is $(q^r - 1)/(q - 1)$; i.e.
$\max_2(r, q) = (q^r - 1)/(q - 1)$.

Proof By Corollary 14.2, the required value of $n$ is $\max_2(r, q)$,
the largest size of an $(n, 2)$-set in $V(r, q)$. Now a set $S$ of vectors
in $V(r, q)$ is an $(n, 2)$-set if and only if no vector in $S$ is a scalar
multiple of any other vector in $S$. As we saw in the construction
of $q$-ary Hamming codes in Chapter 8, the $q^r - 1$ non-zero
vectors of $V(r, q)$ are partitioned into $(q^r - 1)/(q - 1)$ classes,
each class consisting of $q - 1$ vectors which are scalar multiples of
each other. Thus an $(n, 2)$-set of largest size is just a set of
$(q^r - 1)/(q - 1)$ vectors, one from each of these classes.

The optimal $[n, n-r, 3]$-codes with $n = (q^r - 1)/(q - 1)$ are
just the Hamming codes Ham$(r, q)$ defined in Chapter 8. The
solution to MLCT problem (Version 1) follows immediately from
Theorems 14.3 and 14.4:

Theorem 14.5 $B_q(n, 3) = q^{n-r}$, where $r$ is the unique integer
such that $(q^{r-1} - 1)/(q - 1) < n \le (q^r - 1)/(q - 1)$.

Remarks (1) It is easy to express $B_q(n, 3)$ as an explicit
function of $q$ and $n$ (see Exercise 14.3).
(2) To construct a linear $(n, M, 3)$-code with $M = B_q(n, 3)$,
one simply finds the least integer $r$ such that $n \le (q^r - 1)/(q - 1)$
and writes down, as a parity-check matrix, $n$ column vectors of
$V(r, q)$ such that no column is a scalar multiple of another. Such
a parity-check matrix can always be obtained by deleting columns
from the parity-check matrix of a Hamming code Ham$(r, q)$.
Thus the best linear single-error-correcting codes of given length
are either Hamming or shortened Hamming codes.
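The recipe in Remark (2) is easily mechanized. The following Python sketch is an illustration rather than anything given in the text: the function names are hypothetical, and the field elements are simply labelled $0, 1, \ldots, q-1$, which is how they are usually written when $q$ is prime. It finds the least $r$ with $n \le (q^r - 1)/(q - 1)$ and then takes, as columns, the first $n$ non-zero vectors of $V(r, q)$ whose left-most non-zero coordinate is 1, so that no column is a scalar multiple of another.

```python
from itertools import product

def least_redundancy_d3(n, q):
    """Least r such that n <= (q^r - 1)/(q - 1)."""
    r = 1
    while (q**r - 1) // (q - 1) < n:
        r += 1
    return r

def shortened_hamming_check_matrix(n, q):
    """An r x n parity-check matrix (list of rows) whose columns are
    n distinct vectors with left-most non-zero coordinate equal to 1."""
    r = least_redundancy_d3(n, q)
    columns = []
    for v in product(range(q), repeat=r):
        leading = next((x for x in v if x != 0), None)
        if leading == 1:          # one representative per scalar-multiple class
            columns.append(v)
        if len(columns) == n:
            break
    return [[col[i] for col in columns] for i in range(r)]
```

By Theorem 14.5 the resulting code has $q^{n-r}$ codewords, where `r = least_redundancy_d3(n, q)`.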

Before proceeding to the case $d = 4$, we remark that it will be
advantageous to view an $(n, s)$-set not only as a set of vectors in
the vector space $V(r, q)$, but also as a set of points in the
associated projective geometry PG$(r-1, q)$, which we now
define.

The projective geometry PG$(r-1, q)$

With the vector space $V(r, q) = \{(a_1, a_2, \ldots, a_r) \mid a_i \in \text{GF}(q)\}$,
we associate a combinatorial structure PG$(r-1, q)$ consisting of
points and lines defined as follows.
The points of PG$(r-1, q)$ are the one-dimensional subspaces
of $V(r, q)$. The lines of PG$(r-1, q)$ are the two-dimensional
subspaces of $V(r, q)$. The point $P$ is said to belong to (or lie on)
the line $L$ if and only if $P$ is a subspace of $L$. PG$(r-1, q)$ is
called the projective geometry of dimension $r - 1$ over GF$(q)$.
Each point $P$ of PG$(r-1, q)$, as a subspace of $V(r, q)$ of
dimension 1, is generated by a single non-zero vector. So, if
$\mathbf{a} = (a_1, a_2, \ldots, a_r) \in P$ is non-zero, then
$$P = \{\lambda \mathbf{a} \mid \lambda \in \text{GF}(q)\}.$$
In practice, we identify the point $P$ with any non-zero vector it
contains. In other words, we take the points of PG$(r-1, q)$ to
be the non-zero vectors of $V(r, q)$ with the rule that if
$\mathbf{a} = (a_1, a_2, \ldots, a_r)$ and $\mathbf{b} = (b_1, b_2, \ldots, b_r)$ are two such vectors,
then
$$\mathbf{a} = \mathbf{b} \text{ in PG}(r-1, q) \text{ if and only if } \mathbf{a} = \lambda \mathbf{b} \text{ in } V(r, q),$$
for some non-zero scalar $\lambda$.
We now list some elementary properties of PG$(r-1, q)$.

Lemma 14.6 In PG$(r-1, q)$,
(i) the number of points is $(q^r - 1)/(q - 1)$,
(ii) any two points lie on exactly one line,
(iii) each line contains exactly $q + 1$ points,
(iv) each point lies on $(q^{r-1} - 1)/(q - 1)$ lines.

Proof (i) Since each of the $q^r - 1$ non-zero vectors in $V(r, q)$
has $q - 1$ non-zero scalar multiples, the number of points of
PG$(r-1, q)$ is $(q^r - 1)/(q - 1)$.
(ii) If $\mathbf{a}$ and $\mathbf{b}$ are distinct points of PG$(r-1, q)$, then the
unique line through them consists of the points $\lambda\mathbf{a} + \mu\mathbf{b}$, where $\lambda$
and $\mu$ are scalars not both zero.
(iii) In (ii), there are $q^2 - 1$ choices for the pair $(\lambda, \mu)$, but
since we are identifying scalar multiples, the number of distinct
points on the line is $(q^2 - 1)/(q - 1) = q + 1$.
(iv) Let $t$ be the number of lines on which a given point $P$
lies. Let $X$ denote the set $\{(Q, L) \mid Q$ is a point $\ne P$, $L$ is a line
containing both $P$ and $Q\}$. We count the members of $X$ in two
ways. For each of the $(q^r - 1)/(q - 1) - 1$ choices for $Q$, there is
a unique line $L$ containing $P$ and $Q$. Thus
$$|X| = (q^r - 1)/(q - 1) - 1 = (q^r - q)/(q - 1).$$
On the other hand, for each of the $t$ lines through $P$, there are, by
part (iii), $q$ points $Q$ other than $P$ lying on $L$. Thus
$$|X| = tq.$$
Equating the two expressions for $|X|$ gives $t = (q^{r-1} - 1)/(q - 1)$.
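The counts in Lemma 14.6 can be confirmed by direct enumeration in small cases. The Python sketch below is illustrative only and assumes that $q$ is prime, so that arithmetic in GF$(q)$ is arithmetic modulo $q$; the function names are hypothetical. Each point is represented by its vector with left-most non-zero coordinate 1, and each line is stored as the set of points it contains.

```python
from itertools import product, combinations

def pg_points(r, q):
    """Points of PG(r-1, q), q prime, each normalised so that its
    left-most non-zero coordinate is 1."""
    return [v for v in product(range(q), repeat=r)
            if next((x for x in v if x != 0), 0) == 1]

def normalise(v, q):
    """Scale a non-zero vector so that its leading non-zero entry is 1."""
    lead = next(x for x in v if x != 0)
    inv = pow(lead, q - 2, q)      # inverse mod a prime q (Fermat)
    return tuple((inv * x) % q for x in v)

def line_through(a, b, q):
    """Set of points lam*a + mu*b with (lam, mu) != (0, 0)."""
    pts = set()
    for lam, mu in product(range(q), repeat=2):
        if (lam, mu) != (0, 0):
            v = tuple((lam * x + mu * y) % q for x, y in zip(a, b))
            pts.add(normalise(v, q))
    return frozenset(pts)

def pg_lines(r, q):
    """All lines of PG(r-1, q), each as a frozenset of points."""
    pts = pg_points(r, q)
    return {line_through(a, b, q) for a, b in combinations(pts, 2)}
```

For example, `pg_points(3, 3)` has 13 elements, every member of `pg_lines(3, 3)` contains 4 points, and each point lies on 4 lines, as parts (i), (iii) and (iv) predict for $r = 3$, $q = 3$.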

Definition The projective geometry PG$(2, q)$ is called the
projective plane over GF$(q)$. It follows from Lemma 14.6 that
PG$(2, q)$ is a symmetric $(q^2 + q + 1, q + 1, 1)$-design, so that it is
a projective plane as defined in Chapter 2.

Examples 14.7 (i) The simplest projective plane is PG$(2, 2)$.
This contains 7 points labelled 001, 010, 100, 011, 101, 110, 111,
and 7 lines as shown in Fig. 14.8. This shows that PG$(2, 2)$ is the
same as the 7-point plane of Example 2.19.

Fig. 14.8. The projective plane PG(2, 2).

(ii) The 6 points of PG$(1, 5)$ are 01, 10, 11, 12, 13 and 14,
and there is just one line consisting of all 6 points. The points
could equally well be labelled 03, 10, 22, 12, 21, and 41, say,
because in PG$(1, 5)$, 01 = 03, 11 = 22, 13 = 21 and 14 = 41.

Remarks (1) The points of PG$(r-1, q)$ can be uniquely
labelled by making the left-most non-zero coordinate equal to 1.
(2) If $q = 2$, the points of PG$(r-1, 2)$ are labelled by the
non-zero vectors of $V(r, 2)$.

Definition A set $K$ of $n$ points in PG$(r-1, q)$ is called an
$(n, s)$-set if the vectors representing the points of $K$ form an
$(n, s)$-set in the underlying vector space $V(r, q)$.

Remarks (1) Two advantages of working in PG$(r-1, q)$ are
that (a) some neat counting arguments may then be used to
obtain upper bounds on $\max_s(r, q)$ and (b) many optimal
$(n, s)$-sets turn out to be natural geometric configurations.
(2) An $(n, 2)$-set in PG$(r-1, q)$ is just a set of $n$ distinct
points of PG$(r-1, q)$. So we may describe a Hamming code
Ham$(r, q)$ as a code having a parity-check matrix $H$ whose
columns are the distinct points of PG$(r-1, q)$. Of course,
different representations of these points as vectors will give rise
to different, but equivalent, codes. For example (cf. Example
14.7(ii)), Ham$(2, 5)$ may be defined to have parity-check matrix
$$H = \begin{pmatrix} 0&1&1&1&1&1\\ 1&0&1&2&3&4 \end{pmatrix}$$
or, equally well,
$$H = \begin{pmatrix} 0&1&2&1&2&4\\ 3&0&2&2&1&1 \end{pmatrix}.$$
The MLCT problem for d=4

The maximum length of an $[n, n-r, 4]$-code, for given $r$, is
equal to the value of $\max_3(r, q)$, the largest size of an $(n, 3)$-set
in $V(r, q)$ (or in PG$(r-1, q)$).
An $(n, 3)$-set in the plane PG$(2, q)$ is usually called an $n$-arc,
while an $(n, 3)$-set in PG$(r-1, q)$, for $r > 3$, is called an $n$-cap.
Since three points of PG$(r-1, q)$ are linearly dependent if
and only if they are collinear (i.e. they lie on the same line), we
may describe an $n$-arc/$n$-cap as a set of $n$ points, no three of
which are collinear.
The problem of determining the values of $\max_3(r, q)$, first
considered by Bose (1947), was quickly solved for $q = 2$, for all $r$,
and for $r \le 4$, for all $q$. But, despite having received much
attention since, the problem has been solved only for the
additional pairs $(r, q) = (5, 3)$ and $(6, 3)$. The known values of
$\max_3(r, q)$ are listed in Fig. 14.9.

$\max_3(r, 2) = 2^{r-1}$ (Bose 1947)
$\max_3(3, q) = q + 1$, $q$ odd (Bose 1947)
$\max_3(3, q) = q + 2$, $q$ even (Bose 1947)
$\max_3(4, q) = q^2 + 1$, $q$ odd (Bose 1947)
$\max_3(4, q) = q^2 + 1$, $q$ even (Qvist 1952)
$\max_3(5, 3) = 20$ (Pellegrino 1970)
$\max_3(6, 3) = 56$ (Hill 1973)

Fig. 14.9. The known values of $\max_3(r, q)$.

We now prove the more straightforward of these results.

The determination of $\max_3(r, 2)$

Here we are concerned with finding optimal binary linear codes
with $d = 4$. The following general theorem shows that we may
obtain such codes from optimal codes of minimum distance 3 by
the simple device of adding an overall parity-check.

Theorem 14.10 Suppose $d$ is odd. Then there exists a binary
$[n, k, d]$-code if and only if there exists a binary $[n+1, k, d+1]$-code.

Proof The proof of Theorem 2.7 is valid in the restriction to
linear codes. This is because an ‘extended’ linear code (i.e. the
code obtained from a linear code by adding an overall parity-check)
is also linear (see Exercise 5.4).

Corollary 14.11 Suppose $d$ is even. Then
(i) $B_2(n, d) = B_2(n-1, d-1)$,
(ii) $\max_{d-1}(r, 2) = \max_{d-2}(r-1, 2) + 1$.

Proof
(i) is immediate from Theorem 14.10.
(ii) $n \le \max_{d-1}(r, 2)$
$\Leftrightarrow$ there exists a binary $[n, n-r, d]$-code
$\Leftrightarrow$ there exists a binary $[n-1, n-r, d-1]$-code
$\Leftrightarrow$ $n - 1 \le \max_{d-2}(r-1, 2)$
$\Leftrightarrow$ $n \le \max_{d-2}(r-1, 2) + 1$.

Corollary 14.12 $\max_3(r, 2) = 2^{r-1}$.

Proof By Theorem 14.4, $\max_2(r, 2) = 2^r - 1$. Hence, by Corollary 14.11(ii),
$$\max_3(r, 2) = (2^{r-1} - 1) + 1 = 2^{r-1}.$$

The optimal binary code with $d = 4$ and redundancy $r$ is the
extended Hamming code $\widehat{\text{Ham}}(r-1, 2)$. As we saw in Chapter
8, a parity-check matrix for this code is
$$\hat{H} = \begin{pmatrix} H & \begin{matrix} 0\\ \vdots\\ 0 \end{matrix}\\ 1 \; 1 \cdots 1 & 1 \end{pmatrix},$$
where $H$ is a parity-check matrix for Ham$(r-1, 2)$, so that the
columns of $H$ are just the points of PG$(r-2, 2)$ (i.e. the
non-zero vectors of $V(r-1, 2)$).
The columns of $\hat{H}$ form an optimal $2^{r-1}$-cap in PG$(r-1, 2)$. It
consists of the points of PG$(r-1, 2)$ not lying in the subspace
$\{(x_1, \ldots, x_r) \mid x_r = 0\}$. Geometrically, it may be described as the
complement of a hyperplane.
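The description of this cap as the complement of the hyperplane $x_r = 0$ is easy to test computationally. In the Python sketch below (an illustration with hypothetical function names, not part of the text), the columns are all vectors of $V(r, 2)$ with last coordinate 1. Over GF(2), three distinct non-zero vectors are linearly dependent exactly when they sum to zero, so that is the only condition that needs checking.

```python
from itertools import product, combinations

def extended_hamming_columns(r):
    """All vectors of V(r, 2) whose last coordinate is 1, i.e. the
    points of PG(r-1, 2) outside the hyperplane x_r = 0."""
    return [v + (1,) for v in product([0, 1], repeat=r - 1)]

def any_three_independent(cols):
    """True if no three of the given distinct non-zero binary
    vectors sum to the zero vector."""
    for a, b, c in combinations(cols, 3):
        if all((x + y + z) % 2 == 0 for x, y, z in zip(a, b, c)):
            return False
    return True

cap = extended_hamming_columns(4)   # the 8-cap in PG(3, 2)
```

Here `cap` has $2^3 = 8$ points and `any_three_independent(cap)` is `True`; indeed the last coordinates of any three columns sum to 1. Adjoining any point of the hyperplane, say `(1, 0, 0, 0)`, destroys the cap property, in line with Corollary 14.12.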
The determination of $\max_3(3, q)$

First we give some examples of good linear codes with $d = 4$ and
redundancy 3. We then prove that these codes are optimal by
showing that there cannot exist such codes of greater length.

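Theorem 14.13 Let $a_1, a_2, \ldots, a_{q-1}$ be the non-zero elements
of GF$(q)$.
(i) The matrix
$$H = \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 & 0\\ a_1 & a_2 & \cdots & a_{q-1} & 0 & 0\\ a_1^2 & a_2^2 & \cdots & a_{q-1}^2 & 0 & 1 \end{pmatrix}$$
is the parity-check matrix of a $[q+1, q-2, 4]$-code.
Equivalently, the columns of $H$ form a $(q+1)$-arc in
PG$(2, q)$.
(ii) If $q$ is even, then the matrix
$$H^* = \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 & 0 & 0\\ a_1 & a_2 & \cdots & a_{q-1} & 0 & 1 & 0\\ a_1^2 & a_2^2 & \cdots & a_{q-1}^2 & 0 & 0 & 1 \end{pmatrix}$$
is the parity-check matrix of a $[q+2, q-1, 4]$-code.
Equivalently, the columns of $H^*$ form a $(q+2)$-arc in
PG$(2, q)$.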

Proof (i) It is enough to show that any three columns of $H$ are
linearly independent. Any three of the first $q - 1$ columns of $H$
form a Vandermonde matrix, and so are linearly independent by
Theorems 11.1 and 11.2. For any three columns which include
one or both of the last two columns, the determinant may be
expanded about these last columns to get again the determinant
of a Vandermonde matrix.
(ii) We have shown in the proof of part (i) that any three
columns of $H^*$ are linearly independent, with the possible
exception of three of the form
$$\begin{pmatrix} 1\\ a_i\\ a_i^2 \end{pmatrix}, \quad \begin{pmatrix} 1\\ a_j\\ a_j^2 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 0\\ 1\\ 0 \end{pmatrix}.$$
The determinant of the matrix $A$ formed by these three columns
is equal to $a_i^2 - a_j^2$. Since $q$ is even, GF$(q)$ has characteristic 2 (cf.
Exercise 4.6). Hence, by Exercise 3.12, $a_i^2 - a_j^2 = (a_i - a_j)^2$. Since
$a_i \ne a_j$, $\det A$ is non-zero.
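Part (i) of Theorem 14.13 can be verified numerically for small prime $q$ by testing every $3 \times 3$ determinant. The Python sketch below is an illustration under the assumption that $q$ is prime (so that GF$(q)$ is the integers modulo $q$); the function names are hypothetical. It also shows why part (ii) needs $q$ even: for odd $q$ the extra column $(0, 1, 0)^T$ is dependent on the columns belonging to $a$ and $-a$.

```python
from itertools import combinations

def det3_mod(c1, c2, c3, q):
    """Determinant (mod q) of the 3 x 3 matrix with columns c1, c2, c3,
    expanded along the first row of that matrix."""
    (a, b, c), (d, e, f), (g, h, i) = c1, c2, c3
    det = a * (e * i - f * h) - d * (b * i - c * h) + g * (b * f - c * e)
    return det % q

def arc_columns(q):
    """Columns of the matrix H of Theorem 14.13(i), for q prime."""
    cols = [(1, a, (a * a) % q) for a in range(1, q)]
    cols += [(1, 0, 0), (0, 0, 1)]
    return cols

def is_arc(cols, q):
    """True if every three columns are linearly independent over GF(q)."""
    return all(det3_mod(x, y, z, q) != 0
               for x, y, z in combinations(cols, 3))
```

For instance `is_arc(arc_columns(5), 5)` holds, giving a 6-arc in PG(2, 5), whereas adjoining `(0, 1, 0)` gives a set that is no longer an arc.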

Corollary 14.14
$$\max_3(3, q) \ge \begin{cases} q + 1, & \text{if } q \text{ is odd}\\ q + 2, & \text{if } q \text{ is even.} \end{cases}$$

Remark The $(q+1)$-arc formed by the columns of $H$ in
Theorem 14.13 is the conic $\{(x, y, z) \in \text{PG}(2, q) \mid yz = x^2\}$.

We now show that the codes/arcs given in Theorem 14.13 are
optimal.

Theorem 14.15
(i) For any prime power $q$, $\max_3(3, q) \le q + 2$.
(ii) If $q$ is odd, then $\max_3(3, q) \le q + 1$.

First proof (i) Let $H$ be a standard form parity-check matrix
for an $[n, n-3, 4]$-code $C$ over GF$(q)$, with $n = \max_3(3, q)$:
$$H = \begin{pmatrix} a_1 & a_2 & \cdots & a_{n-3} & 1 & 0 & 0\\ b_1 & b_2 & \cdots & b_{n-3} & 0 & 1 & 0\\ c_1 & c_2 & \cdots & c_{n-3} & 0 & 0 & 1 \end{pmatrix}.$$
Since any three columns of $H$ are linearly independent, the
determinant formed by any three columns must be non-zero.
From the non-vanishing of the determinant formed by any two of
the last three columns and one of the first $n - 3$ columns, we find
that the $a_i$s, $b_i$s and $c_i$s are all non-zero. Multiplying the $i$th
column by $a_i^{-1}$ for $i = 1, 2, \ldots, n-3$, we have that $C$ is
equivalent to a code in which the $a_i$s are all 1. Thus we may
assume that
$$H = \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 & 0 & 0\\ b_1 & b_2 & \cdots & b_{n-3} & 0 & 1 & 0\\ c_1 & c_2 & \cdots & c_{n-3} & 0 & 0 & 1 \end{pmatrix},$$
where the $b_i$s and $c_i$s are non-zero. As the determinant formed
by the last column and two of the first $n - 3$ columns is non-zero,
the $b_i$s must be distinct non-zero elements of GF$(q)$. Hence
$n - 3 \le q - 1$ and so $n \le q + 2$.
(ii) (Adapted from Fenton and Vamos, 1982.) Now suppose
$q$ is odd. Suppose, for a contradiction, that a $[q+2, q-1, 4]$-code
$C$ exists. Then, as in (i), we may assume that $C$ has a
parity-check matrix
$$H = \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 & 0 & 0\\ b_1 & b_2 & \cdots & b_{q-1} & 0 & 1 & 0\\ c_1 & c_2 & \cdots & c_{q-1} & 0 & 0 & 1 \end{pmatrix},$$
where $b_1, b_2, \ldots, b_{q-1}$ are the distinct non-zero elements of
GF$(q)$ and similarly $c_1, c_2, \ldots, c_{q-1}$ are also the distinct non-zero
elements of GF$(q)$ in some order. The non-vanishing of
determinants of the form
$$\det \begin{pmatrix} 1 & 1 & 1\\ b_i & b_j & 0\\ c_i & c_j & 0 \end{pmatrix}$$
implies that the elements $b_1 c_1^{-1}, b_2 c_2^{-1}, \ldots, b_{q-1} c_{q-1}^{-1}$ are distinct
and so they too are the non-zero elements of GF$(q)$ in some
order. Hence, by Exercise 3.13, all three of the products $\prod_{i=1}^{q-1} b_i$,
$\prod_{i=1}^{q-1} c_i$ and $\prod_{i=1}^{q-1} (b_i c_i^{-1})$ are equal to $-1$. But then
$$\prod_{i=1}^{q-1} (b_i c_i^{-1}) = \left(\prod_{i=1}^{q-1} b_i\right)\left(\prod_{i=1}^{q-1} c_i\right)^{-1} = (-1)(-1)^{-1} = 1.$$
Since $1 \ne -1$ if $q$ is odd, this gives the desired contradiction.

Second proof (geometric) (i) Suppose $K$ is an $n$-arc in
PG$(2, q)$ of maximum size $n = \max_3(3, q)$. Let $P$ be a point of
$K$. By Lemma 14.6(iv), there are $q + 1$ lines through $P$ and every
point of $K$ lies on one or other of them. But on none of these
lines can there be more than one point of $K$ besides $P$ (by
definition of an $n$-arc, no three points of $K$ are collinear). Thus
$$n \le 1 + (q + 1) = q + 2.$$
(ii) Now suppose $q$ is odd. Suppose, for a contradiction, that
$K$ is a $(q+2)$-arc in PG$(2, q)$. Then if $P$ is any point of $K$, each
of the $q + 1$ lines through $P$ must contain exactly one further
point of $K$. This means that every line in PG$(2, q)$ meets $K$ in
either 2 or 0 points (but never in 1). Now let $Q$ be any point of
PG$(2, q)$ lying outside $K$. Through $Q$ there pass $q + 1$ lines and
each point of $K$ lies on one (and only one) of them. So if $t$ of
these lines meet $K$ in two points, then $|K| = 2t$, contradicting
$|K| = q + 2$ being odd.

Remark The author feels that the attractiveness of the above
proofs merits the inclusion of both. The geometric proof has two
important advantages: (1) it generalizes to give upper bounds on
$\max_3(r, q)$ for larger values of $r$; (2) it does not assume specific
properties of the field GF$(q)$, and so gives the same upper bound
on the size of $n$-arcs in any projective plane of order $q$.

Corollary 14.14 and Theorem 14.15 give

Theorem 14.16 (Bose 1947)
$$\max_3(3, q) = \begin{cases} q + 1 & \text{if } q \text{ is odd}\\ q + 2 & \text{if } q \text{ is even.} \end{cases}$$

Remark It has been shown by Segre (1954) that, for $q$ odd,
every $(q+1)$-arc in PG$(2, q)$ is a conic. This implies that the
optimal $[q+1, q-2, 4]$-code is unique, up to equivalence. For $q$
even, optimal $(q+2)$-arcs in PG$(2, q)$ are not in general unique,
and a classification is unknown.

The determination of $\max_3(4, q)$, for $q$ odd

As we shall adopt a geometric approach here, we introduce a
little more terminology concerning the projective geometry
PG$(r-1, q)$. In defining PG$(r-1, q)$ from the vector space
$V(r, q)$, recall that the points and lines in PG$(r-1, q)$ are the 1-
and 2-dimensional subspaces respectively of $V(r, q)$. More generally
we define a $t$-space in PG$(r-1, q)$ to be a $(t+1)$-dimensional
subspace of $V(r, q)$. Thus a 0-space is a point and a
1-space is a line. A 2-space is called a plane and an $(r-2)$-space
in PG$(r-1, q)$ is called a hyperplane. Note that the dimension $t$
of a $t$-space in PG$(r-1, q)$ is always one less than the
corresponding vector space dimension.
We usually identify a $t$-space in PG$(r-1, q)$ with the set of
points it contains. The number of points in a $t$-space is
$(q^{t+1} - 1)/(q - 1)$, since a $(t+1)$-dimensional subspace of $V(r, q)$
contains $q^{t+1} - 1$ non-zero vectors, each of which has $q - 1$
non-zero scalar multiples. A $t$-space is just a copy of PG$(t, q)$ in
so far as the incidence properties of its subspaces are concerned.
In particular, a cap in PG$(r-1, q)$ must meet a $(t-1)$-space in
at most $\max_3(t, q)$ points, bearing in mind that any subset of a
cap is also a cap.
We may now derive an upper bound on $\max_3(4, q)$, for $q$ odd.

Theorem 14.17 If $q$ is odd, then $\max_3(4, q) \le q^2 + 1$.

Proof Suppose $K$ is an $n$-cap in PG$(3, q)$ of maximum size. Let
$P_1$ and $P_2$ be any two points of $K$ and let $L$ be the line on which
$P_1$ and $P_2$ lie. Since no three points of $K$ are collinear, $L$ contains
no other point of $K$. Through the line $L$ there pass $q + 1$ planes
(Exercise 14.4), and each point of $K$, other than $P_1$ and $P_2$, lies
on one and only one of these planes. Since $q$ is odd, it follows
from Theorem 14.15(ii) that no plane can contain more than
$q + 1$ points of $K$. In particular, a plane through $L$ can contain at
most $q - 1$ points of $K$ in addition to $P_1$ and $P_2$. Hence
$$n \le 2 + (q + 1)(q - 1) = q^2 + 1.$$
We next show that $(q^2 + 1)$-caps exist in PG$(3, q)$, when $q$ is
odd.

Theorem 14.18 Suppose $q$ is odd and let $b$ be a non-square in
GF$(q)$. Then the set
$$Q = \{(x, y, z, w) \in \text{PG}(3, q) \mid zw = x^2 - by^2\}$$
is a $(q^2 + 1)$-cap in PG$(3, q)$.

Proof Since $b$ is a non-square, the only point of $Q$ having $z = 0$
is $(0, 0, 0, 1)$. Each of the remaining points may be represented
by a vector having $z = 1$, and so we may write
$$Q = \{(0, 0, 0, 1),\ (x, y, 1, x^2 - by^2) \mid (x, y) \in V(2, q)\}. \tag{14.19}$$
This shows that $|Q| = q^2 + 1$. We must show that no three points
of $Q$ are collinear. Clearly $(0, 0, 0, 1)$ cannot be collinear with
two other points of $Q$ because there is only one point of $Q$ of the
form $(x, y, 1, *)$ for any given pair $(x, y)$. Now let
$\mathbf{a}_1 = (x_1, y_1, 1, x_1^2 - by_1^2)$ and $\mathbf{a}_2 = (x_2, y_2, 1, x_2^2 - by_2^2)$ be any two points of $Q$,
other than $(0, 0, 0, 1)$. Suppose, for a contradiction, that the line
joining $\mathbf{a}_1$ and $\mathbf{a}_2$ contains a third point of $Q$. Then, for some
non-zero scalar $\lambda$, $\mathbf{a}_1 + \lambda\mathbf{a}_2 \in Q$, i.e. the point
$(x, y, z, w) = (x_1 + \lambda x_2,\ y_1 + \lambda y_2,\ 1 + \lambda,\ x_1^2 - by_1^2 + \lambda x_2^2 - \lambda by_2^2)$ satisfies $zw = x^2 - by^2$.
This condition implies, after some cancellation, that
$$\lambda x_1^2 + \lambda x_2^2 - \lambda by_1^2 - \lambda by_2^2 = 2\lambda x_1 x_2 - 2\lambda b y_1 y_2.$$
Since $\lambda \ne 0$, it follows that
$$(x_1 - x_2)^2 = b(y_1 - y_2)^2,$$
which is impossible since $b$ is a non-square (if $y_1 = y_2$ the equation
forces $x_1 = x_2$, so that $\mathbf{a}_1 = \mathbf{a}_2$; otherwise $b$ would be the square
of $(x_1 - x_2)/(y_1 - y_2)$).
Putting Theorems 14.17 and 14.18 together gives

Theorem 14.20 If $q$ is odd, then $\max_3(4, q) = q^2 + 1$.

Example 14.21 Take $q = 3$ and $b = -1$ in Theorem 14.18. By
(14.19), a 10-cap in PG$(3, 3)$ is formed by the columns of the
matrix
$$H = \begin{pmatrix} 0&0&0&0&1&1&1&2&2&2\\ 0&0&1&2&0&1&2&0&1&2\\ 0&1&1&1&1&1&1&1&1&1\\ 1&0&1&1&1&2&2&1&2&2 \end{pmatrix}.$$
Thus $H$ is the parity-check matrix of a ternary [10, 6, 4]-code
which is of greatest length for $d = 4$ and $r = 4$.
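The cap of Example 14.21 can be generated and checked mechanically. The Python sketch below is illustrative and assumes $q$ is an odd prime, so that GF$(q)$ is the integers modulo $q$; the function names are hypothetical. It builds the point set (14.19) and tests every triple of points for linear dependence by trying all non-trivial coefficient triples, which is crude but adequate for small $q$.

```python
from itertools import product, combinations

def elliptic_quadric_cap(q, b):
    """The points (0,0,0,1) and (x, y, 1, x^2 - b*y^2) of (14.19),
    for q an odd prime and b a residue modulo q."""
    pts = [(0, 0, 0, 1)]
    pts += [(x, y, 1, (x * x - b * y * y) % q)
            for x, y in product(range(q), repeat=2)]
    return pts

def dependent(u, v, w, q):
    """True if some non-trivial combination of u, v, w is zero mod q."""
    for coeffs in product(range(q), repeat=3):
        if any(coeffs) and all(
                (coeffs[0] * x + coeffs[1] * y + coeffs[2] * z) % q == 0
                for x, y, z in zip(u, v, w)):
            return True
    return False

def is_cap(points, q):
    """True if no three of the points are collinear."""
    return not any(dependent(u, v, w, q)
                   for u, v, w in combinations(points, 3))

cap = elliptic_quadric_cap(3, 2)    # b = -1 is represented by 2 in GF(3)
rows = ["".join(str(p[i]) for p in cap) for i in range(4)]
```

With these choices `rows` reproduces the four rows of the matrix $H$ above, and `is_cap(cap, 3)` is `True`. If $b$ is replaced by a square (for example `b = 1`), the resulting set contains collinear triples, which shows that the non-square hypothesis in Theorem 14.18 is needed.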

Remarks (1) The set $Q$ of Theorem 14.18 is an example of an
elliptic quadric. For $q$ odd, any elliptic quadric is a $(q^2 + 1)$-cap,
and conversely (Barlotti 1955) any $(q^2 + 1)$-cap is an elliptic
quadric. This implies that the optimal $[q^2 + 1, q^2 - 3, 4]$-code is
unique, up to equivalence.
(2) For $q = 2^h$, with $h > 1$, it is also true that $\max_3(4, q) =
q^2 + 1$, but the proof is a little trickier and is omitted here.

The values of $B_q(n, 4)$, for $n \le q^2 + 1$

By means of Theorem 14.3, we can instantly translate our results
concerning $\max_3(r, q)$ for $r = 3$ and 4 into results about $B_q(n, 4)$.

Theorem 14.22 If $q$ is odd, then
$$B_q(n, 4) = \begin{cases} q^{n-3} & \text{for } 4 \le n \le q + 1\\ q^{n-4} & \text{for } q + 2 \le n \le q^2 + 1. \end{cases}$$
If $q$ is even, then
$$B_q(n, 4) = \begin{cases} q^{n-3} & \text{for } 4 \le n \le q + 2\\ q^{n-4} & \text{for } q + 3 \le n \le q^2 + 1. \end{cases}$$

Remarks on $\max_3(r, q)$ for $r \ge 5$

For $r = 3$ and $r = 4$ the packing problem for caps in PG$(r-1, q)$
was fairly easy to solve because of the existence of natural
geometric configurations (conics in PG$(2, q)$ and elliptic quadrics
in PG$(3, q)$) which are optimal caps. But in PG$(r-1, q)$ for
$r \ge 5$, large caps do not appear to arise in such a natural way and
so the packing problem is much more difficult. As we see from
Fig. 14.9, the only known values of $\max_3(r, q)$ for $q \ne 2$ and
$r \ge 5$ are $\max_3(5, 3) = 20$ and $\max_3(6, 3) = 56$. (For a coding-theoretic
proof of the second result, wherein the uniqueness of
the optimal ternary [56, 50, 4]-code is also demonstrated, see Hill
(1978).)
It is easy to construct 20-caps in PG$(4, 3)$ (Exercise 14.9) but
hard to show that 20 is the largest size possible. By contrast, it is
rather difficult to describe a 56-cap in PG$(5, 3)$, but a short proof
of the maximality of 56 has been given by Bruen and Hirschfeld
(1978) (cf. Exercise 14.11). In the next dimension up for $q = 3$,
the best known bounds are
$$112 \le \max_3(7, 3) \le 163,$$
suggesting that the problem of finding optimal caps in PG$(6, 3)$ is
far from solution.

Concluding remarks on Chapter 14

(1) We have mentioned that the problem of determining
$\max_3(r, q)$ was first considered by Bose (1947). Much of the
subsequent work has been carried out by the Italian school of
geometers led by Segre, Barlotti and Tallini.
For a survey of the known results concerning $\max_3(r, q)$ and
similar functions, see Hirschfeld (1983). For a comprehensive
coverage of the theory of projective geometries over finite fields,
see Hirschfeld (1979 and Volume 2, to appear).
(2) For recent results concerning $\max_s(r, q)$ for $q = 3$ and
$s < r < 15$, see Games (1983).
(3) There seems to be little pattern to results concerning
$\max_{d-1}(r, q)$ for fixed values of $d$ greater than 4. However, when
$d$ takes its maximum value for given $r$, that is $d = r + 1$, an
interesting pattern once again emerges. This case is the subject of
the next chapter.
(4) Another version of the MLCT problem is to find, for
given $q$, $n$ and $k$, the maximum value of $d$ for which there exists
an $[n, k, d]$-code over GF$(q)$. In the case of binary linear codes,
Helgert and Stinaff (1973) give a table of such values (or bounds
when the values are not known) for $k \le n \le 127$. For a comprehensive
update of this table, incorporating many improvements
by various authors, see Verhoeff (1985).

Exercises 14

14.1 Is it true that $B_2(n, d)$ is always equal to the highest
power of 2 less than or equal to $A_2(n, d)$?
14.2 Show that (i) $B_q(n, 1) = q^n$, (ii) $B_q(n, 2) = q^{n-1}$.
14.3 Show that $B_q(n, 3) = q^{\,n - \lceil \log_q (nq - n + 1) \rceil}$.
14.4 Show that in PG$(3, q)$ the number of planes containing a
given line is $q + 1$.
14.5 Which code is the optimal $[n, n-5, 5]$-code having
$n = \max_4(5, 3)$?
14.6 Specify a [26, 22, 4]-code over GF(5).
14.7 Pinpoint where the proofs of Theorems 14.17 and 14.18
fail when $q$ is even.
14.8 Devise a syndrome-decoding algorithm for a $[q^2 + 1, q^2 - 3, 4]$-code
over GF$(q)$ ($q$ odd), which will correct any
single error and detect any double error.
14.9 Given the 10-cap of Example 14.21, construct a 20-cap in
PG$(4, 3)$.
14.10 Show that, in PG$(m, q)$, the number of $(t+1)$-spaces
containing a given $t$-space is $(q^{m-t} - 1)/(q - 1)$. In
PG$(5, 3)$, state (i) how many planes contain a given line,
(ii) how many 3-spaces contain a given plane, (iii) how
many 4-spaces contain a given 3-space.
14.11 Given that $\max_3(5, 3) = 20$, show that $\max_3(6, 3) \le 56$.
[Hint: Use parts (i), (ii) and (iii) of Exercise 14.10.]
14.12 State the values of $B_3(n, 4)$ for $4 \le n \le 112$.
15 MDS codes

In the previous chapter we considered the problem of finding
linear codes of maximum length for given redundancy $r$ and
given minimum distance $d$. Particular attention was paid to the
cases $d \le 4$. In this chapter we consider the problem when $d$ is as
large as possible for given redundancy $r$. The following theorem
shows that this is the case $d = r + 1$.

Theorem 15.1 An $[n, n-r, d]$-code satisfies $d \le r + 1$.

Proof 1 This is just the Singleton bound applied to linear codes.
Theorem 10.17 states that any $q$-ary $(n, M, d)$-code satisfies
$M \le q^{n-d+1}$. So, in particular, an $[n, n-r, d]$-code over GF$(q)$
satisfies $q^{n-r} \le q^{n-d+1}$, whence $d \le r + 1$.

Proof 2 Suppose $C$ is an $[n, n-r, d]$-code and let $G = [I_{n-r} \mid A]$
be a standard form generator matrix of $C$. Since $A$ has $r$
columns, those codewords which are rows of $G$ have weight
$\le r + 1$. The result follows by Theorem 5.2.

Definition An $[n, n-r, r+1]$-code (i.e. a linear code of redundancy
$r$ whose minimum distance is equal to $r + 1$) is called a
maximum distance separable code, or MDS code for short.

By Theorem 14.1, the maximum length of an $[n, n-r, r+1]$-code
over GF$(q)$ is equal to the value of $\max_r(r, q)$, the largest
size of an $(n, r)$-set in $V(r, q)$. We recall that an $(n, r)$-set in
$V(r, q)$ is a set of $n$ vectors such that any $r$ of them are linearly
independent. Equivalently, an $(n, r)$-set in $V(r, q)$ is a set of $n$
vectors such that any $r$ of them form a basis for $V(r, q)$.
MDS codes were first studied explicitly by Singleton (1964),
although the problem of finding $\max_r(r, q)$ had already been
studied as a problem in statistics (Bush 1952) and as a problem in
geometry (Segre 1955, 1961). (In the geometrical context, an
$(n, r)$-set, regarded as a subset of PG$(r-1, q)$, is called an
$n$-arc. This agrees with the usage of the term $n$-arc for an
$(n, 3)$-set in PG$(2, q)$ already met in Chapter 14.)
MacWilliams and Sloane (1977) introduce their chapter on
MDS codes as ‘one of the most fascinating in all of coding
theory’. The problem of determining the values of $\max_r(r, q)$ is
a particularly attractive one for two reasons. Firstly, the problem
is equivalent to a surprising list of combinatorial problems; no
fewer than six different interpretations are given in MacWilliams’
and Sloane’s book, while yet another is given in Fenton and
Vamos (1982). Secondly, although a complete solution to the
problem seems inaccessible at present, the known results suggest
a tantalizingly simply stated conjecture:

Conjecture 15.2 If 2 ≤ r ≤ q, then
\[ \max_r(r, q) = q + 1 \]
(except that $\max_3(3, q) = \max_{q-1}(q - 1, q) = q + 2$ if $q = 2^h$).

Note that the conjecture has already been proved for r = 2
(Theorem 14.4) and for r = 3 (Theorem 14.16). Before considering
the conjecture further let us dispose of the rather uninteresting
cases outside the range to which it applies. For redundancies
0 and 1, MDS codes exist of any length n over any field GF(q)
(for r = 0, V(n, q) is an [n, n, 1]-code, while for r = 1, the matrix
\[ \left[\, I_{n-1} \;\middle|\; \begin{matrix} 1 \\ \vdots \\ 1 \end{matrix} \,\right] \]
generates an [n, n − 1, 2]-code). Cases r > q are covered by the
following theorem.

Theorem 15.3 If r ≥ q, then $\max_r(r, q) = r + 1$. Any MDS code
of redundancy r ≥ q is equivalent to a repetition code of length
r + 1.

Proof The repetition code of length r + 1 is an [r + 1, 1, r + 1]-code
with generator matrix [1 1 ⋯ 1]. Hence
\[ \max_r(r, q) \ge r + 1. \]
Also, it is clear that any [r + 1, 1, r + 1]-code is equivalent to a
repetition code.
Now suppose r ≥ q and suppose for a contradiction that
$\max_r(r, q) \ge r + 2$. Then there exists an [r + 2, 2, r + 1]-code C
over GF(q). This code C must be equivalent to a code having
generator matrix
\[ G = \begin{bmatrix} 1 & 0 & 1 & 1 & \cdots & 1 \\ 0 & 1 & a_1 & a_2 & \cdots & a_r \end{bmatrix}. \]
In order that any non-zero linear combination of the rows of G has
weight at least r + 1, the $a_i$s must be distinct non-zero elements of
GF(q). This implies that r ≤ q − 1, contrary to hypothesis.

Remark It follows from Theorem 15.3 and the preceding
remarks that the only binary MDS codes are V(n, 2), the even
weight codes $E_n$, and repetition codes. So this chapter is really of
interest only for codes over GF(q) with q > 2.

From now on we assume that r lies in the range 2 ≤ r ≤ q and
return to our consideration of Conjecture 15.2. Our first task will
be to show that there exist MDS codes which meet the
conjectured values of $\max_r(r, q)$ in all cases.

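Theorem 15.4 Suppose 2 ≤ r ≤ q. Let $a_1, a_2, \ldots, a_{q-1}$ be the
non-zero elements of GF(q). Then the matrix
\[ H = \begin{bmatrix}
1 & 1 & \cdots & 1 & 1 & 0 \\
a_1 & a_2 & \cdots & a_{q-1} & 0 & 0 \\
a_1^2 & a_2^2 & \cdots & a_{q-1}^2 & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
a_1^{r-1} & a_2^{r-1} & \cdots & a_{q-1}^{r-1} & 0 & 1
\end{bmatrix} \]
is the parity-check matrix of an MDS [q + 1, q + 1 − r, r + 1]-code.
Equivalently, the columns of H form a (q + 1)-arc in
PG(r − 1, q).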

Proof This is exactly the same as the proof of Theorem
14.13(i), for the determinant of a matrix formed by any r
columns of H is equal to the determinant of a Vandermonde
matrix and so is non-zero. Thus any r columns of H are linearly
independent.
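
The construction of Theorem 15.4 can be checked by machine for small parameters. The following is a minimal sketch, assuming q = p is prime so that arithmetic in GF(p) is just arithmetic modulo p; the function names are illustrative only and are not taken from the text. It builds the q + 1 columns of H and tests every choice of r columns for linear independence.

```python
from itertools import combinations


def det_mod_p(rows, p):
    """Determinant of a square matrix over GF(p), p prime, by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    n, det = len(m), 1
    for c in range(n):
        # Find a row at or below the diagonal with a non-zero entry in column c.
        pivot = next((i for i in range(c, n) if m[i][c]), None)
        if pivot is None:
            return 0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det  # a row swap changes the sign
        det = det * m[c][c] % p
        inv = pow(m[c][c], -1, p)  # modular inverse (Python 3.8+)
        for i in range(c + 1, n):
            f = m[i][c] * inv % p
            for j in range(c, n):
                m[i][j] = (m[i][j] - f * m[c][j]) % p
    return det % p


def mds_parity_check_columns(p, r):
    """Columns (1, a, ..., a^(r-1)) for each non-zero a, then (1,0,...,0) and (0,...,0,1)."""
    cols = [[pow(a, i, p) for i in range(r)] for a in range(1, p)]
    cols.append([1] + [0] * (r - 1))
    cols.append([0] * (r - 1) + [1])
    return cols


def any_r_columns_independent(cols, r, p):
    # A matrix and its transpose have the same determinant, so the chosen
    # columns may be used directly as the rows of the matrix being tested.
    return all(det_mod_p(list(sub), p) != 0 for sub in combinations(cols, r))


cols = mds_parity_check_columns(7, 3)
print(len(cols), any_r_columns_independent(cols, 3, 7))
```

For q a proper prime power the same check applies, but the modular arithmetic would have to be replaced by genuine GF(q) arithmetic.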

Corollary 15.5 If 2 ≤ r ≤ q, then $\max_r(r, q) \ge q + 1$.

As we saw in Theorem 14.13(ii), in the case where q is even
and r = 3, we may add the further column
\[ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]
to the matrix H of Theorem 15.4 to get an MDS code of length
q + 2. Such a trick will not work for r > 3. However, we see from
Conjecture 15.2 that the case q even and r = q − 1 also seems to
be special. Indeed there exists an MDS code of length q + 2 in
this case too. This fact will follow from the very useful result that
the dual code of an MDS code is also MDS, thus implying that
the roles of dimension and redundancy are interchangeable in so
far as the existence of MDS codes is concerned. In order to show
this duality, we first reformulate our problem in terms of
matrices having every square submatrix non-singular.

Definitions A square matrix is called non-singular if its columns
are linearly independent, or equivalently, if it has a non-zero
determinant (cf. Theorem 11.2).
Given any matrix A, a t × t square submatrix of A is a t × t
matrix consisting of the entries of A lying in some t rows and
some t columns of A.
For example, if
\[ A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34}
\end{bmatrix}, \]
then
\[ \begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} a_{11} & a_{14} \\ a_{31} & a_{34} \end{bmatrix} \]
are examples of 2 × 2 square submatrices of A.


Theorem 15.6 Suppose C is an [n, n − r]-code with parity-check
matrix $H = [A^T \mid I_r]$. Then C is an MDS code (i.e. d(C) = r + 1)
if and only if every square submatrix of A is non-singular.

Proof By Theorem 8.4, C is an MDS code if and only if any r
columns of H are linearly independent, i.e. if and only if any
r × r submatrix of H is non-singular. Let us interpret this last
condition on H as a condition on $A^T$. Suppose B is an r × r
submatrix of H obtained by choosing some r columns of H.
Suppose t of the chosen columns are from $A^T$ and r − t of them
from $I_r$. If we expand det B about the last r − t columns, we end
up with
\[ \det B = \pm \det B', \]
where B′ is the t × t matrix obtained by taking the r × t matrix
consisting of the t chosen columns of $A^T$ and then deleting the
r − t rows corresponding to where the chosen columns of $I_r$ have
1s. To illustrate this point suppose
\[ H = \begin{bmatrix}
a_{11} & a_{21} & a_{31} & 1 & 0 & 0 & 0 \\
a_{12} & a_{22} & a_{32} & 0 & 1 & 0 & 0 \\
a_{13} & a_{23} & a_{33} & 0 & 0 & 1 & 0 \\
a_{14} & a_{24} & a_{34} & 0 & 0 & 0 & 1
\end{bmatrix}. \]
If B is the 4 × 4 submatrix of H consisting of columns 1, 3, 5 and
6, then
\[ \det B = \det \begin{bmatrix}
a_{11} & a_{31} & 0 & 0 \\
a_{12} & a_{32} & 1 & 0 \\
a_{13} & a_{33} & 0 & 1 \\
a_{14} & a_{34} & 0 & 0
\end{bmatrix}
= \det \begin{bmatrix} a_{11} & a_{31} \\ a_{14} & a_{34} \end{bmatrix}
= \det B'. \]
Returning to the general case, it follows that B is non-singular
if and only if the corresponding square submatrix B′ is
non-singular. It is clear that any t × t square submatrix B′ of $A^T$ (for
any t with 1 ≤ t ≤ r) arises from some r × r submatrix B of H in
this way, and so the result follows.
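
The criterion of Theorem 15.6 is easy to test by brute force on a small matrix. The sketch below is illustrative only: it assumes a prime field GF(p), uses hypothetical function names, and takes as its test matrix the 3 × 5 matrix over GF(7) of Exercise 15.1 as it is read here.

```python
from itertools import combinations


def det_mod_p(m, p):
    """Determinant over GF(p) by cofactor expansion along the first row.

    Adequate for the very small matrices used here.
    """
    if len(m) == 1:
        return m[0][0] % p
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det_mod_p(minor, p)
    return total % p


def every_square_submatrix_nonsingular(A, p):
    """True if every t x t submatrix of A (all t, all row/column choices) is non-singular."""
    rows, cols = len(A), len(A[0])
    for t in range(1, min(rows, cols) + 1):
        for rs in combinations(range(rows), t):
            for cs in combinations(range(cols), t):
                sub = [[A[i][j] for j in cs] for i in rs]
                if det_mod_p(sub, p) == 0:
                    return False
    return True


# Assumed reading of the matrix of Exercise 15.1 (entries in GF(7)).
A = [[1, 6, 2, 5, 1],
     [1, 4, 3, 3, 6],
     [1, 5, 5, 1, 5]]
print(every_square_submatrix_nonsingular(A, 7))
```

A matrix with a repeated row, such as [[1, 1], [1, 1]], fails the test at t = 2, which gives a quick sanity check of the routine.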

Corollary 15.7 The dual code of an MDS code is also MDS.

Proof The code C with parity-check matrix $[A^T \mid I]$ is MDS
⇔ $A^T$ has the property that every square submatrix is
non-singular
⇔ A has the same property (since the determinant of any square
matrix is equal to the determinant of its transpose)
⇔ the code $C^\perp$ with parity-check matrix $[I_{n-r} \mid -A]$ is MDS.

It follows from Corollary 15.7 that generator matrices and
parity-check matrices of MDS [n, k]-codes serve also as
parity-check matrices and generator matrices respectively of MDS
[n, n − k]-codes.

Corollary 15.8 There exists an MDS [n, k]-code over GF(q) if
and only if there exists an MDS [n, n − k]-code over GF(q).

Corollary 15.9 Suppose $q = 2^h$, h > 1. Then there exists a
[q + 2, 3, q]-code over GF(q). Equivalently, there exists a
(q + 2)-arc in PG(q − 2, q).

Proof By Theorem 14.13(ii), there exists a [q + 2, q − 1, 4]-code
over GF(q). By Corollary 15.7, its dual code is a
[q + 2, 3, q]-code.

Combining the results of Corollaries 14.14, 15.5 and 15.9, we
have

Theorem 15.10 If 2 ≤ r ≤ q, then $\max_r(r, q) \ge q + 1$. If also
$q = 2^h$ and r = 3 or q − 1, then $\max_r(r, q) \ge q + 2$.

The known results concerning Conjecture 15.2

Theorem 15.10 shows that the conjectured values of $\max_r(r, q)$
are all lower bounds. The conjecture was shown to be true for
r = 2 and r = 3 in Theorems 14.4 and 14.16. We mention without
proof that, by geometric methods, the conjecture has also been
proved for r = 4 and r = 5, for all q (Segre 1955 and Casse 1969).
Using the duality result of Corollary 15.8, the truth of the
conjecture for r ≤ 5 implies its truth also for r in the range
q − 3 ≤ r ≤ q (see Exercises 15.2 and 15.3). [This last result was
first proved in a different way by Thas (1968), who also showed
(1968, 1969) that the conjecture is true for q odd in the ranges
$q > (4r - 9)^2$ and $q - 3 > r > q - \tfrac{1}{4}\sqrt{q} - \tfrac{5}{4}$.]
Following MacWilliams and Sloane (1977), we show the results
graphically in Fig. 15.11, which neatly illustrates the symmetry
between dimension k and redundancy r.
The broken line n = k + r = q + 1 in Fig. 15.11 is the
conjectured bound above which no MDS code is known to exist. The
heavy line represents an upper bound given by repeated
application of the recursive bound
\[ \max_{r+1}(r + 1, q) \le \max_r(r, q) + 1 \]
(see Exercise 15.5), starting at $\max_5(5, q) = q + 1$ (thus
$\max_6(6, q) \le q + 2$, $\max_7(7, q) \le q + 3$, ...,
$\max_r(r, q) \le q + r - 4$ for r ≥ 6). The region marked with a
question mark is therefore the ‘grey’ area where the existence of
MDS codes is undecided.
Finally we mention that the conjecture has been verified by
exhaustive search for q ≤ 11, for all r (Maneri and Silverman
1966 and Jurick 1968) and so the smallest undecided case is
$\max_6(6, 13) = 14$ or 15.

Fig. 15.11. Values of k, r for which a [k + r, k]-MDS code exists
(horizontal axis k, vertical axis r). ● means MDS code exists for
all q. ○ means MDS code exists if and only if $q = 2^h$.

Concluding remarks on Chapter 15

(1) The Reed-Solomon codes described in Chapter 11 are
MDS codes; they are shortened versions of the codes defined in
Theorem 15.4. Since MDS codes meet the Singleton bound,
Theorem 15.4 enables Theorem 11.4 to be improved to

Theorem 15.12 If q is a prime power and if d ≤ n ≤ q + 1, then
\[ A_q(n, d) = B_q(n, d) = q^{n-d+1}. \]
(2) One remarkable property of an MDS [n, k]-code C over
GF(q) is that its weight enumerator is completely determined by
the values of n, k and q and does not depend on the code C
itself. This fact is a little less surprising when one considers the
MacWilliams identity. Let
\[ W_C(z) = 1 + \sum_{i=n-k+1}^{n} A_i z^i \]
be the weight enumerator of C. Since $C^\perp$ is also MDS and hence
has minimum distance k + 1, the coefficients of $z, z^2, \ldots, z^k$ on
the right-hand side of the MacWilliams identity (Theorem 13.6)
must all be equal to zero, giving k equations in the k unknowns
$A_{n-k+1}, \ldots, A_n$ (we also have the equation
$1 + \sum_{i=n-k+1}^{n} A_i = q^k$). It turns out that these equations
have a unique solution. Exercise 15.6 gives an illustration of this.
In fact it is possible to derive the formulae
\[ A_i = \binom{n}{i}(q - 1)\sum_{j=0}^{i-d}(-1)^j\binom{i-1}{j}q^{i-d-j} \tag{15.13} \]
for the $A_i$s in terms of n, d and q, though this derivation is a little
complicated (see e.g. Chapter 11 of MacWilliams and Sloane,
1977) and is not included here.
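
Formula (15.13) is straightforward to evaluate numerically. The following minimal sketch, with a hypothetical function name, assumes the formula in the form A_i = C(n, i)(q − 1) Σ_{j=0}^{i−d} (−1)^j C(i − 1, j) q^{i−d−j} with d = n − k + 1, and uses the fact that the coefficients must sum to q^k as a consistency check.

```python
from math import comb


def mds_weight_distribution(n, k, q):
    """Return [A_0, ..., A_n] for an MDS [n, k]-code over GF(q), via formula (15.13)."""
    d = n - k + 1  # an MDS code has minimum distance n - k + 1
    A = [0] * (n + 1)
    A[0] = 1  # the zero codeword
    for i in range(d, n + 1):
        A[i] = comb(n, i) * (q - 1) * sum(
            (-1) ** j * comb(i - 1, j) * q ** (i - d - j)
            for j in range(i - d + 1)
        )
    return A


# A [4, 2, 3]-code over GF(3): every non-zero codeword has weight 3.
dist = mds_weight_distribution(4, 2, 3)
print(dist, sum(dist) == 3 ** 2)
```

The same function can be used to check answers to Exercise 15.6 by calling it with n = 8, q = 7 and k = 3 or 5.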
(3) Theorem 15.6 enables the MDS codes existence problem
to be posed in elementary terms, independently of any
terminology from coding theory or geometry. In view of Theorem 15.10,
Conjecture 15.2 may be simply stated as follows.

Conjecture 15.14 Any r × k matrix over GF(q) with 2 ≤ r,
k ≤ q and having the property that any square submatrix is
non-singular satisfies
\[ r + k \le q + 1 \]
(except for the case $q = 2^h$ and r or k = 3).

From an earlier remark, the smallest possible counter-example
is a 6 × 9 or 7 × 8 matrix over GF(13).

Exercises 15

15.1 Consider the matrix
\[ A = \begin{bmatrix}
1 & 6 & 2 & 5 & 1 \\
1 & 4 & 3 & 3 & 6 \\
1 & 5 & 5 & 1 & 5
\end{bmatrix} \]
over GF(7). Check that every square submatrix of A is
non-singular. Hence write down generator matrices for
[8, 3] and [8, 5] MDS codes over GF(7).
15.2 Show that if $\max_r(r, q) = q + 1$, then
$\max_{q+2-r}(q + 2 - r, q) = q + 1$.
15.3 Suppose $q = 2^h$, h > 1. Assuming known results about
$\max_r(r, q)$ for r ≤ 4, show that $\max_{q-1}(q - 1, q) = q + 2$.
15.4 Given that GF(8) = {0, $a_1 = 1$, $a_2$, $a_3$, ..., $a_7$}, write
down a parity-check matrix for an [n, n − 7, 8]-code over
GF(8) with $n = \max_7(7, 8)$.
15.5 Prove that $\max_{r+1}(r + 1, q) \le \max_r(r, q) + 1$.
15.6 Use Theorem 13.6 to find (a) the weight enumerator of
an [8, 3, 6]-code over GF(7) and (b) the weight enumerator
of an [8, 5, 4]-code over GF(7). Check your answers
by using the formulae (15.13).
15.7 For each integer k ≥ 2, specify an [n, k, n − k + 1]-code
over GF(11) having n as large as possible.
16 Concluding remarks, related topics, and further reading

The main aims of this final chapter are to review the progress
made in the earlier chapters and to mention some related topics,
with suggestions for further reading.
The treatment presented in the book has been motivated
mainly by two recurring themes:
(1) the problem of finding codes which are optimal in some
sense;
(2) the problem of decoding such codes efficiently.
This has led to a rich interplay with several well-established
branches of mathematics, notably algebra, combinatorics, and
geometry.
With regard to optimal codes, the main emphasis has been on
finding values of $A_q(n, d)$, the largest size M of an (n, M, d)-code
over an alphabet of q letters. In the case of binary codes,
we gave in Table 2.4 the state of knowledge regarding values of
$A_2(n, d)$ for small n and d. We now consider this table again
(Table 16.1), for d ≤ 5, in order to indicate those places in the
text where results have been proved.

Remarks 16.2 (1) All of the bounds in Table 16.1 have been
proved in the text or exercises with the exceptions of (i) the
upper bounds obtained by linear programming methods and (ii)
the lower bounds for d=3 and n=9, 10, and 11. A rather
complicated construction of an (11, 144,3)-code was given by
Golay (1954). Successive shortenings of this code give codes with
parameters (10, 72, 3), (9, 38, 3) and (8, 20, 3). For a long time it
was believed that the (9,38, 3)-code was optimal, but recently
Best (1980) found a (9, 40, 3)-code (despite a publication of 1959
which claimed that 39 was an upper bound on $A_2(9, 3)$!).
(2) It is conjectured that the Plotkin bound is always attained
in the range d ≤ n ≤ 2d + 1. Indeed it has been shown by
Levenshtein (1964) that there exist codes which meet the Plotkin
Table 16.1
Values of $A_2(n, d)$

n d=3 d=5

3 R 2 P —
4 R* 2 P —
5 SH 4 P R 2 P
6 SH 8 P R* 2 P
7 H 16 S R** 2 P
8 G 20 L, E, 4 P
9 B 40 EB; SD 6 P
10 G 72-79 L SD 12 P
11 G 144-158 E, D 24 P
12 SH 256 L, SNR 32 L,
13 SH 512 E, SNR 64 E,
14 SH 1024 EB, SNR 128 E,
15 H 2048 S NR 256 EB,
16 Ey 2560-3276 L, NR* 256-340 L,

Key to Table 16.1

Lower Bounds
If C is a given code, then:
C*: denotes the code obtained from C by adding an extra zero
    coordinate,
SC: denotes a code obtained by shortening C, possibly more
    than once, i.e. use $E_1$ (below) in the form
    $A_2(n - 1, d) \ge \tfrac{1}{2}A_2(n, d)$.
R:  repetition code (Example 1.11).
H:  Hamming code (Theorem 8.2).
B:  Best (1980).
G:  Golay (1954); for alternative constructions see
    MacWilliams and Sloane (1977, Chapter 2, §7). An
    (8, 20, 3)-code is also constructed in Exercise 2.16.
a (u | u+v)-construction (see Exercise 2.18).
see Exercise 2.8.
D:  constructed from a Hadamard design (Exercise 2.12).
NR: Nordstrom-Robinson code (Exercise 9.9).

Upper Bounds
P:  Plotkin bound (Exercise 2.22).
S:  sphere-packing bound (Theorem 2.16).
L:  linear programming bound ($L_1$: see Best et al. (1978) or
    MacWilliams and Sloane (1977); $L_2$: see Best (1980)).
$E_1$: $A_2(n, d) \le 2A_2(n - 1, d)$ (Exercise 2.2).

bound provided certain Hadamard matrices of order m ≤ n exist,
for m ≡ 0 (mod 4). [A Hadamard matrix of order m is an m × m
matrix H of +1s and −1s such that $HH^T = mI$ (over the field of real
numbers). It is easy to associate a Hadamard design with a
Hadamard matrix and we have already seen how such designs
give rise to optimal codes (see Exercises 2.15 and 2.24).] An
introduction to Hadamard configurations may be found in
Anderson (1974). A proof of Levenshtein’s theorem may be
found in Chapter 2 of MacWilliams and Sloane (1977). It is also a
well-known conjecture that Hadamard matrices of order m exist
for all positive integers m ≡ 0 (mod 4). This conjecture is known
to be true for m ≤ 264 and so the Plotkin bound is indeed tight
for n ≤ 264 (in the range 2d + 1 ≥ n).
(3) Values of $A_2(n, d)$ found in the text but outside the range
of Table 16.1 include:
$A_2(23, 7) = 4096$ (Theorem 11.3 or 12.20)
$A_2(n, 3) = 2^{n-r}$, whenever $n = 2^r - 1$ (Corollary 8.7).

As well as considering optimal binary codes, much attention
has also been given in this text to optimal q-ary codes for general
q. For example: in Chapter 8 we showed that, for a prime power
q, $A_q(n, 3) = q^{n-r}$ for any n of the form $(q^r - 1)/(q - 1)$; in
Chapter 9 we showed that $A_3(11, 5) = 3^6$; in Chapter 10 we found
the values of $A_q(4, 3)$ for general q; and in Chapter 15 we
showed that $A_q(n, d) = q^{n-d+1}$ if q is a prime power and
d ≤ n ≤ q + 1.
Finally, the problem of finding optimal linear codes over
GF(q) was considered in Chapters 14 and 15.
A topic not covered in this text is that of asymptotic bounds,
applicable when n is large. However, much research has been
devoted to closing the gap between the best-known asymptotic
lower and upper bounds, which are currently an asymptotic
version of the Gilbert-Varshamov lower bound (cf. Theorem
8.10) and an upper bound, obtained by linear programming
methods, due to McEliece et al. (1977). Good accounts of this
topic may be found in MacWilliams and Sloane (1977) and van
Lint (1982).
We now give brief descriptions of some types of code not
previously discussed in this text.

Burst-error correcting codes

The codes we have considered to date are designed to correct
random errors (e.g. for a binary symmetric channel). It often
happens that we need a code for a channel which does not have
random errors but which has errors in bursts, i.e. several errors
close together. There are some linear cyclic codes which are well
adapted for burst-error correcting, two important families being
Reed-Solomon codes and Fire codes. An alternative procedure is
to scramble the order in which the digits are transmitted, the
scrambling occurring over a length of several blocks. Then at the
receiving end the order is changed back to the original sequence.
This change-back will break up any bursts of errors, leaving
errors scattered in a pseudo-random way over several blocks, so
that they fall within the capacity of random-error correcting
codes. The interleaving of codes is one way of carrying out this
procedure.
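
The scrambling idea can be sketched in a few lines. The example below is an illustrative block interleaver and not a description of any particular system: `depth` codewords are transmitted symbol by symbol in rotation, so that a burst of at most `depth` consecutive channel errors touches each codeword at most once. The function names are assumptions made for this sketch.

```python
def interleave(blocks):
    """Send the first symbol of every block, then the second symbol of every block, ..."""
    length = len(blocks[0])
    return [blk[i] for i in range(length) for blk in blocks]


def deinterleave(stream, depth):
    """Undo interleave(): rebuild the `depth` original blocks from the stream."""
    return [stream[j::depth] for j in range(depth)]


# Four all-zero codewords of length 7, sent with interleaving depth 4.
blocks = [[0] * 7 for _ in range(4)]
stream = interleave(blocks)

# A burst of four consecutive errors on the channel.
for pos in range(9, 13):
    stream[pos] ^= 1

received = deinterleave(stream, 4)
print([sum(blk) for blk in received])  # number of errors landing in each block
```

After the change-back each block contains a single error, which a single-error-correcting code such as a Hamming code of length 7 could then correct.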
For a good account of burst-error correcting codes, see
Peterson and Weldon (1972) or Dornhoff and Hohn (1978).

Convolutional codes

Convolutional codes are powerful error-correcting codes which
were introduced by Elias in 1955. They are unlike the codes we
have already considered in that message symbols are not broken
up into blocks for encoding. Instead check digits are interleaved
within a long stream of information digits. For example, for rate
½, one might have the information input $x_1x_2x_3\cdots$ encoded as
$x_1x_1'x_2x_2'x_3x_3'\cdots$, where each check digit $x_i'$ is a function of
$x_1, x_2, \ldots, x_i$ which is found by means of a feed-back shift
register. The decoding is done one digit at a time using the
previously received and corrected digits.
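
As a rough illustration of the rate ½ scheme just described, the sketch below interleaves one check digit after each information digit. The particular check rule, the sum modulo 2 of the current digit and the two previous digits held in a two-stage shift register, is an assumption chosen for simplicity and is not taken from the text; practical encoders use carefully chosen rules.

```python
def encode_rate_half(info_bits):
    """Emit x_1 x_1' x_2 x_2' ..., where x_i' = x_i + x_(i-1) + x_(i-2) (mod 2)."""
    out = []
    s1, s2 = 0, 0  # shift register contents: the two previous information digits
    for x in info_bits:
        check = (x + s1 + s2) % 2
        out += [x, check]
        s1, s2 = x, s1  # shift the register
    return out


print(encode_rate_half([1, 0, 1, 1]))
```

The information digits can be read off directly from the even-numbered positions of the output, while the check digits tie each position to its recent past.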
Mathematicians tend to be less interested in convolutional
codes because the mathematical theory is nothing like as well
developed as for block codes. Convolutional codes are also
intrinsically more difficult. Despite this, such codes have been
extensively used in practice. For example, NASA has been using
convolutional codes in deep-space applications since 1977 (from
1969 to 1976, NASA’s Mariner-class spacecraft had used a
Reed-Muller [32, 6]-block code, as mentioned in Chapter 1).
Chapters on convolutional codes are included in the books by
Blahut (1983), McEliece (1977), Peterson and Weldon (1972),
and van Lint (1982).

Cryptographic codes

Cryptographic codes have little in common with error-correcting
codes, for their aim is the concealment of information. The last
decade has seen an explosion of interest in such codes following
the invention of the concept of the public-key cipher system by
Diffie and Hellman (1976). Such a system makes use of a
one-way trapdoor function. This is an encrypting function which
has an inverse decrypting function; but if only the encrypting
function is known, it is computationally infeasible to discover the
decrypting function. This means that a person R can publish his
encrypting algorithm (e.g. in a directory) so that any member of
the public can send messages to R in complete secrecy, for only
R knows his own decrypting algorithm. Such a public-key
system thus overcomes the weakness of a traditional cipher
system which requires the secret delivery of a ‘key’ in advance of
sending a secret message.
Rivest et al. (1978) found an elegant way to implement the
Diffie-Hellman system by using prime numbers and a simple
consequence of Fermat’s theorem (Exercise 16.1). Their method
relies on the facts that
(a) there are computer algorithms for testing primality which
are extremely fast (e.g. a few seconds for a 100-digit number),
while
(b) all known algorithms for factorizing composite numbers
are extremely slow (e.g. if n is a 200-digit number obtained by
multiplying two 100-digit prime numbers, the fastest of today’s
computers, using the best-known algorithm, would take millions
of years to find the prime factors of n).

THE RIVEST-SHAMIR-ADLEMAN (R-S-A) CRYPTOSYSTEM

Let us assume that all messages are encoded as large decimal
numbers (e.g. via A = 01, B = 02, ..., Z = 26). The purpose here
is not to encrypt the message but merely to get it in the numeric
form necessary for encryption.
A subscriber R chooses two large prime numbers p and q, each
about 100 digits long, and calculates n = pq. He then finds two
numbers s and t such that
    st ≡ 1 (mod (p − 1)(q − 1)),
i.e. st = r(p − 1)(q − 1) + 1, for some integer r.
R publishes the numbers n and s but keeps the numbers p, q,
and t secret. He also publishes the encryption algorithm, which is
simply:
    ‘encipher a message number x as $y \equiv x^s \pmod{n}$’.
To decipher the received message y, R simply calculates
$y^t \pmod{n}$. This gives the original message x because, using
Exercise 16.1, we have
\[ y^t \equiv x^{st} = x^{r(p-1)(q-1)+1} \equiv x \pmod{n}. \]


Remarks (i) A long message number must be broken into
blocks, so that each block represents a number smaller than n.
The blocks are then enciphered separately.
(ii) Even if n is an enormous number, say 200 digits, a
message can be enciphered or deciphered very efficiently, using
less than one second of computing time.
(iii) A subscriber R can construct (privately) his key numbers
p, q, n, s and t very quickly with a computer. It takes a few
seconds to generate a pair of random prime numbers p and q,
each having about 100 digits. Then, for a random choice of s, the
Euclidean algorithm provides a very fast method of calculating t
such that st ≡ 1 (mod (p − 1)(q − 1)).
(iv) The deciphering procedure is secret because t is known
only to R. To find t from n and s requires knowledge of p and q.
This in turn requires factorizing n, which we have already
remarked to be computationally infeasible (by known methods).
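
The whole scheme fits in a few lines once the primes are made unrealistically small. The following toy sketch uses p = 61, q = 53 and s = 17, values chosen here purely for illustration and offering no security whatever; the function names are likewise hypothetical. Python's built-in `pow` performs both the modular exponentiation and, from version 3.8, the modular inversion that the Euclidean algorithm provides.

```python
from math import gcd


def make_keys(p, q, s):
    """Return (n, s, t) with s*t = 1 (mod (p-1)(q-1)); p and q are assumed distinct primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(s, phi) != 1:
        raise ValueError("s must be coprime to (p - 1)(q - 1)")
    t = pow(s, -1, phi)  # modular inverse of s, Python 3.8+
    return n, s, t


def encipher(x, s, n):
    """Public operation: y = x^s (mod n)."""
    return pow(x, s, n)


def decipher(y, t, n):
    """Private operation: x = y^t (mod n)."""
    return pow(y, t, n)


n, s, t = make_keys(61, 53, 17)  # n and s are published; p, q and t stay secret
y = encipher(65, s, n)
print(n, t, y, decipher(y, t, n))
```

With 100-digit primes the same three functions would work unchanged, since Python integers have arbitrary precision; only the generation of the primes themselves would need additional code.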

An illustration of an R-S-A cryptosystem in which p and q
are small prime numbers, so that the code may easily be broken,
is given in Exercise 16.2.
Interesting expository articles on cryptographic codes are
Gardner (1977) and Sloane (1981). For a comprehensive treat-
ment of cipher systems in general, Beker and Piper (1982) is
recommended.

Variable-length source codes

In order to illustrate the ideas here, let us consider the problem
of transmitting English text over a binary symmetric channel as
quickly and as reliably as possible. This can be carried out by
applying two codes in series. First a source code encodes the text
into a long string of binary digits. For reliability, this binary data
is then broken into blocks of length k and each block encoded
into a codeword of length n by means of an error-correcting
[n, k]-code. Decoding of the two codes is, of course, done in
reverse order.
In choosing the source code we are not concerned with the
error-correcting aspects. Our main aim is to encode the source
alphabet as economically as possible. If letters in the source
alphabet occur with differing frequencies, we can best do this by
using a variable-length source code.
We now give three examples of source codes for our alphabet
of 27 letters (‘A’ to ‘Z’ and ‘space’).

ASCII CODE (AMERICAN STANDARD CODE FOR INFORMATION INTERCHANGE)

Computers are usually constructed internally to handle only 0s
and 1s. A source code is therefore required to translate each
typed character into a binary vector. A common such code is the
ASCII code. This has $128 = 2^7$ codewords representing letters of
the alphabet (upper and lower case), digits 0 to 9, and assorted
other symbols and instructions. Each codeword is a binary vector
of length 7 together with an overall parity check (so that any
single error may be detected). In other words, the ASCII code
is the binary even-weight code of length 8. Those codewords
representing upper case letters are shown in Fig. 16.3.
For other applications, a fixed-length code such as the ASCII
code may be uneconomical.

Character  Probability  ASCII code  Morse code  Huffman code

space      0.1859       01000001    space       000
A          0.0642       10000010    01          0100
B          0.0127       10000100    1000        011111
C          0.0218       10000111    1010        11111
D          0.0317       10001000    100         01011
E          0.1031       10001011    0           101
F          0.0208       10001101    0010        001100
G          0.0152       10001110    110         011101
H          0.0467       10010000    0000        1110
I          0.0575       10010011    00          1000
J          0.0008       10010101    0111        0111001110
K          0.0049       10010110    101         01110010
L          0.0321       10011001    0100        01010
M          0.0198       10011010    11          001101
N          0.0574       10011100    10          1001
O          0.0632       10011111    111         0110
P          0.0152       10100000    0110        011110
Q          0.0008       10100011    1101        0111001101
R          0.0484       10100101    010         1101
S          0.0514       10100110    000         1100
T          0.0796       10101001    1           0010
U          0.0228       10101010    001         11110
V          0.0083       10101100    0001        0111000
W          0.0175       10101111    011         001110
X          0.0013       10110001    1001        0111001100
Y          0.0164       10110010    1011        001111
Z          0.0005       10110100    1100        0111001111

Fig. 16.3. Codes for the English alphabet.

MORSE CODE

This is a variable-length code which takes advantage of the high
frequency of occurrence of some letters, such as ‘E’, by making
their codewords short, while very infrequent letters, such as ‘Q’,
are represented by longer codewords. The Morse code is given in
Fig. 16.3, where the 0s may be read as dots and the 1s as dashes.
Although the Morse code may appear to be a binary code, it is in
fact a ternary code, having the symbols dot, dash, and space. A
space has to be left between letters (and at least two spaces
between words), for otherwise the code cannot be uniquely
decoded; for example, the message 01000110 can mean either
LEG or RUN unless spaces are inserted between letters. This
drawback means that the Morse code is rarely used nowadays.
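
The ambiguity is easy to exhibit by listing every way of splitting a string of dots and dashes into letters. The sketch below uses the Morse column of Fig. 16.3 (0 for dot, 1 for dash); the function name `parses` is an illustrative choice.

```python
MORSE = {
    "A": "01", "B": "1000", "C": "1010", "D": "100", "E": "0", "F": "0010",
    "G": "110", "H": "0000", "I": "00", "J": "0111", "K": "101", "L": "0100",
    "M": "11", "N": "10", "O": "111", "P": "0110", "Q": "1101", "R": "010",
    "S": "000", "T": "1", "U": "001", "V": "0001", "W": "011", "X": "1001",
    "Y": "1011", "Z": "1100",
}


def parses(bits):
    """Return every letter sequence whose codewords, run together, spell `bits`."""
    if not bits:
        return [""]
    results = []
    for letter, code in MORSE.items():
        if bits.startswith(code):
            results += [letter + rest for rest in parses(bits[len(code):])]
    return results


readings = parses("01000110")
print(len(readings), "LEG" in readings, "RUN" in readings)
```

Without the separating spaces there are many readings besides LEG and RUN, which is exactly why the space must be counted as a third symbol.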

HUFFMAN CODES

Suppose a source alphabet has N letters $a_1, a_2, \ldots, a_N$, and that
the probability of occurrence of $a_i$ is $p_i$. Then if each $a_i$ is
encoded into a word of length $l_i$, the average word-length of the
code is $\sum_{i=1}^{N} p_i l_i$.
Huffman coding is an ingenious way of matching codewords to
source symbols so that
(a) the code is uniquely decodable, i.e. when any string of
source symbols has been encoded into a string of binary
digits, it is always clear where one codeword ends and the
next one begins, and
(b) the average word-length is as small as possible.
While omitting the details of how Huffman codes may be
constructed, we give an example of such a code for the English
alphabet in Fig. 16.3. From the given probabilities, it may be
calculated that the average word length is 4.1195. This gives a
saving of nearly 18% on the best fixed-length code we could have
used, in which all codewords have length 5 (any fixed-length code
is clearly uniquely decodable). The reason why a Huffman code
is uniquely decodable is that no codeword is a prefix of any other
codeword, i.e. if $x_1x_2\cdots x_n$ is any codeword, then there is no
codeword of the form $x_1x_2\cdots x_nx_{n+1}\cdots x_m$ for any m > n.
For a good account of Huffman source coding, the reader is
referred to McEliece (1977), Jones (1979), or Hamming (1980).
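
The claims made above about the Huffman code of Fig. 16.3 can be checked directly. The sketch below copies the probabilities and Huffman codewords from the figure, with the codeword for B taken as the length-6 word 011111 (an assumption: it is the reading consistent with the prefix property and with the stated average of 4.1195). It then computes the average word-length, tests the prefix property, and decodes a string by reading digits until a codeword is completed. The names used are illustrative.

```python
from itertools import permutations

# character: (probability, Huffman codeword), as read from Fig. 16.3
TABLE = {
    " ": (0.1859, "000"), "A": (0.0642, "0100"), "B": (0.0127, "011111"),
    "C": (0.0218, "11111"), "D": (0.0317, "01011"), "E": (0.1031, "101"),
    "F": (0.0208, "001100"), "G": (0.0152, "011101"), "H": (0.0467, "1110"),
    "I": (0.0575, "1000"), "J": (0.0008, "0111001110"), "K": (0.0049, "01110010"),
    "L": (0.0321, "01010"), "M": (0.0198, "001101"), "N": (0.0574, "1001"),
    "O": (0.0632, "0110"), "P": (0.0152, "011110"), "Q": (0.0008, "0111001101"),
    "R": (0.0484, "1101"), "S": (0.0514, "1100"), "T": (0.0796, "0010"),
    "U": (0.0228, "11110"), "V": (0.0083, "0111000"), "W": (0.0175, "001110"),
    "X": (0.0013, "0111001100"), "Y": (0.0164, "001111"), "Z": (0.0005, "0111001111"),
}


def average_word_length():
    return sum(p * len(code) for p, code in TABLE.values())


def is_prefix_free():
    codes = [code for _, code in TABLE.values()]
    return not any(b.startswith(a) for a, b in permutations(codes, 2))


def encode(text):
    return "".join(TABLE[ch][1] for ch in text)


def decode(bits):
    """Read digits until they form a codeword; the prefix property makes this unambiguous."""
    inverse = {code: ch for ch, (_, code) in TABLE.items()}
    out, word = [], ""
    for b in bits:
        word += b
        if word in inverse:
            out.append(inverse[word])
            word = ""
    if word:
        raise ValueError("string ends in the middle of a codeword")
    return "".join(out)


print(round(average_word_length(), 4), is_prefix_free(), decode(encode("CODE")))
```

The saving over a fixed-length code of length 5 is (5 − 4.1195)/5, a little under 18%.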

Exercises 16

16.1 Suppose p and q are distinct prime numbers. Prove that
for any integers x and r,
\[ x^{r(p-1)(q-1)+1} \equiv x \pmod{pq}. \]
[Hint: Use Fermat’s theorem: ‘if $x \not\equiv 0 \pmod{p}$, then
$x^{p-1} \equiv 1 \pmod{p}$’ (cf. Exercise 3.8).]
16.2 Suppose a person’s published encryption algorithm reads:
‘Convert your message to a large decimal number via the
code A=01, B=02,..., Z =26, space = 00. Break this
number into blocks of length 4. Encipher each block x
into the 4-digit block y = x*®° (mod 2813)’.
Find the decryption algorithm for the above code and
hence (with the aid of a pocket calculator) decipher the
following intercepted message:
2385 0593 0736 0209 1671 2595 2026 2418.
16.3 In the R-S-A cryptosystem, explain how messages can
be ‘signed’ to prevent forgeries.
16.4 Consider a source alphabet $a_1, a_2, a_3, a_4$ with probabilities
of occurrence 1/2, 1/4, 1/8, 1/8 respectively. Which of the
following source codes are (a) uniquely decodable, (b)
prefix-free?

Source letter  p_i    Code A  Code B  Code C  Code D

a_1            0.5    0       00      0       0
a_2            0.25   1       01      10      01
a_3            0.125  00      10      110     011
a_4            0.125  11      11      111     0111

For those codes which are uniquely decodable, calculate
the average word-length.
16.5 Use the Huffman code of Fig. 16.3 to decode the message
00101110101000101100101011.
Solutions to exercises

Chapter 1

1.1 [Figure: the reconstructed picture.]

[Remark: Pictures have actually been transmitted from
Earth into outer space in this way. Two large prime
numbers were used so that a much more detailed picture
could be sent. It is reasonable to expect that a civilized
recipient of such a message would be able to work out how
to reconstruct the picture, since factorization of a number
into prime factors is a property independent of language
or notation.]
1.2 If either 00000 or 11111 is sent, then the received vector
will be decoded as the codeword sent if and only if two or
fewer errors occur. So the probability that the received
vector is corrected to the codeword sent is
\[ (1 - p)^5 + 5(1 - p)^4 p + 10(1 - p)^3 p^2 = 1 - 10p^3 + 15p^4 - 6p^5, \]
whence the word error probability is $10p^3 - 15p^4 + 6p^5$.
1.3 Suppose d(C) = 4. If a received vector y has distance ≤ 1
from some codeword, we decode as that codeword. If y
has distance at least 2 from every codeword, we seek
re-transmission. This scheme guarantees the simultaneous
correction of single errors and detection of double errors.
Note that C could also be used either as a single-error-
correcting code or as a triple-error-detecting code, but not
both simultaneously (why not?).
1.4 ⌊(16 − 1)/2⌋ = 7.
1.5 Suppose C is a q-ary (3, M, 2)-code. Then the M ordered
pairs obtained by deleting the third coordinate of each
codeword must be distinct (if two such pairs were identical,
then the corresponding codewords of C would differ
only in the third position, contradicting d(C) = 2). So
$M \le q^2$.
A 3-ary (3, 9, 2)-code is
    000  101  202
    011  112  210
    022  120  221
More generally it is easily shown that
$\{(a, b, a + b) \mid (a, b) \in (F_q)^2\}$, where $F_q = \{0, 1, \ldots, q - 1\}$
and a + b is calculated modulo q, is a q-ary $(3, q^2, 2)$-code.

Chapter 2

2.1 (i) {000000, 111111}. (ii) (4%). (iii) Add overall parity-
check to (F)°. (iv) Not possible. Suppose C were a
(5,3, 4)-code. There is no loss in assuming 00000 is a
codeword. But then the other two codewords each have at
least four 1s, which implies that they differ in at most 2
places. (v) Not possible. A binary (8, M, 3)-code satisfies
the sphere-packing bound, $M(1 + 8) \le 2^8$, which implies
that M ≤ 28.
2.2 Suppose C is a binary (n, M, d)-code. Partition the codewords
of C into two disjoint sets, those ending with a 0
and those ending with a 1. One or other of these sets
contains at least M/2 of the codewords. Take this set and
delete the last coordinate to get an (n − 1, ≥ M/2, d)-code
(this is called a shortened code of C). Taking $M = A_2(n, d)$
gives $A_2(n - 1, d) \ge \tfrac{1}{2}A_2(n, d)$.
2.3 Immediate from Exercise 1.5.
2.4 Let C be the code obtained from $(F_2)^{n-1}$ by adding an
overall parity check. Every codeword of C has even weight
and so $C \subseteq E_n$. Since every vector of $E_n$ may be obtained
from one in $(F_2)^{n-1}$ in this way, we have $C = E_n$. Thus
$|E_n| = |(F_2)^{n-1}| = 2^{n-1}$. $(F_2)^{n-1}$ has minimum distance 1,
and so $E_n$ has minimum distance 2.

(3)/('5) >”
50 10 5
2.5
2.6 Let C be a binary (n, M, d)-code with d even. Delete a
suitable coordinate from all codewords to get an (n —
1, M,d —1)-code and then add an overall parity check (cf.
proof of Theorem 2.7).
2.7 Any such code is equivalent to $\{00\cdots0,\ 11\cdots100\cdots0\}$,
where the number of 1s in the second word is one of
$1, 2, \ldots, n$.
2.8 Suppose C is a binary (8, M, 5)-code, with $M \ge 4$. We may
assume $00000000 \in C$. At most one codeword has
weight $\ge 6$, for two words of weight $\ge 6$ could differ in at
most four places. So C has at least two codewords of
weight 5. Up to equivalence, we may assume these are
11111000 and 11000111. It is now easy to show that the
only further codeword possible is 00111111.
2.9 Let C be an $(n, q, n)$-code over $F_q = \{1, 2, \ldots, q\}$ and let
A be a matrix whose rows are the codewords of C. Since
$d(C) = n$, the q elements of any column of A must be
distinct and so must be precisely the symbols $1, 2, \ldots, q$
in some order. For each column of A a suitable permutation
of the symbols may be performed to give

$$A = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 2 & 2 & \cdots & 2 \\ \vdots & \vdots & & \vdots \\ q & q & \cdots & q \end{bmatrix}.$$

2.10 Apply either the sphere-packing bound or an argument
similar to that of Exercise 1.5 (i.e. the words formed by
deleting the last two coordinates must be distinct).
2.11 By Corollary 2.8 and Example 2.23, we have
$A_2(8, 4) = A_2(7, 3) = 16$.
2.12 Take as codewords the 11 rows of an incidence matrix of
the design, the 11 vectors obtained by interchanging all Os
and 1s, the all-O vector, and the all-1 vector. The minimum
distance may be shown to be 5 by an argument similar to
that used in Example 2.23. A binary (11, M, 5)-code
satisfies
$$M\left[1 + 11 + \binom{11}{2}\right] \le 2^{11},$$
and so $M \le 2^{11}/67$, which implies $M \le 30$.


2.13 (i) Following the hint: for each of the v choices of x
there are r choices of B; for each of the b choices of
B there are k choices of x. So the number of pairs in
the set is $vr = bk$.
(ii) Let y be a fixed point. Count in two ways the number
of ordered pairs in the set
$\{(x, B) : x$ is a point, B is a block, $x \ne y$ and both
x and $y \in B\}$.
2.14 (i) Condition (ii) of the previous exercise is not satisfied.
(ii) Immediate from Theorem 2.27(i).
2.15 Easy generalization of the argument of Example 2.23,
Exercise 2.12.
2.16 Straightforward check (just 34 comparisons of codewords
are required: 11010000 with 19 others, then 11100100 with
11 others, then 10101010 with 3 others, and finally 0 with
1).
2.17 Since $(u_1 \mid u_1 + v_1) = (u_2 \mid u_2 + v_2)$ if and only if $(u_1, v_1) =
(u_2, v_2)$, the number of codewords in $C_3$ is $M_1 M_2$.
Let $a = (u \mid u+v)$ and $b = (u' \mid u'+v')$ be distinct codewords
of $C_3$.
If $v = v'$, then $d(a, b) = 2d(u, u') \ge 2d_1$.
If $v \ne v'$, then
$$\begin{aligned}
d(a, b) &= d(u, u') + d(u+v, u'+v') \\
&= w(u+u') + w(u+v+u'+v') \\
&= d(u+u', 0) + d(u+u', v+v') \\
&\ge d(0, v+v') \quad \text{(by the triangle inequality)} \\
&= d(v, v') \ge d_2.
\end{aligned}$$

2.18 Let $C_1$ be the (8, 128, 2)-code $E_8$ (see Exercise 2.4) and let
$C_2$ be the (8, 20, 3)-code of Exercise 2.16. Apply Exercise
2.17 to get a (16, 2560, 3)-code.
2.19 $C_1$ = (4, 8, 2)-code, $C_2$ = (4, 2, 4)-code $\Rightarrow$
$C_3$ = (8, 16, 4)-code.
$C_1$ = (8, 16, 4)-code, $C_2$ = (8, 2, 8)-code $\Rightarrow$
$C_3$ = (16, 32, 8)-code.
$C_1$ = (16, 32, 8)-code, $C_2$ = (16, 2, 16)-code $\Rightarrow$
$C_3$ = (32, 64, 16)-code.
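The first two steps of this chain can be reproduced by brute force. The sketch below assumes that the (4, 8, 2)-code is the even weight code $E_4$ and that the $(n, 2, n)$-codes are repetition codes; the text gives only the parameters, so these choices and all function names are illustrative.

```python
from itertools import combinations, product

def distance(x, y):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))

def min_distance(code):
    return min(distance(x, y) for x, y in combinations(code, 2))

def u_u_plus_v(c1, c2):
    """All words (u | u+v) with u in c1 and v in c2 (binary codes of equal length)."""
    return [u + tuple((a + b) % 2 for a, b in zip(u, v)) for u in c1 for v in c2]

def even_weight_code(n):
    return [w for w in product((0, 1), repeat=n) if sum(w) % 2 == 0]

def repetition_code(n):
    return [(0,) * n, (1,) * n]

c3 = u_u_plus_v(even_weight_code(4), repetition_code(4))  # expected (8, 16, 4)
c5 = u_u_plus_v(c3, repetition_code(8))                   # expected (16, 32, 8)
print(len(c3), min_distance(c3), len(c5), min_distance(c5))
```

The minimum distances agree with the value min(2d_1, d_2) established in Exercise 2.17.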
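2.20 Since $w(x_i + x_j) = d(x_i, x_j) \ge d$, we have

$$w(T) \ge \tfrac{1}{2}M(M-1)d. \qquad (1)$$

Suppose $\tfrac{1}{2}M - t_j$ codewords have 1 in the jth position, so
that $\tfrac{1}{2}M + t_j$ codewords have 0 in the jth position. Then the
number of 1s in the jth column of T is

$$\left(\tfrac{1}{2}M - t_j\right)\left(\tfrac{1}{2}M + t_j\right) = \left(\tfrac{1}{2}M\right)^2 - t_j^2 \le \begin{cases} \left(\tfrac{1}{2}M\right)^2 & \text{if } M \text{ is even} \\ \left(\tfrac{1}{2}M\right)^2 - \tfrac{1}{4} & \text{if } M \text{ is odd,} \end{cases}$$

since $t_j^2 \ge \left(\tfrac{1}{2}\right)^2$ if M is odd. Hence

$$w(T) \le \begin{cases} \tfrac{1}{4}M^2 n & \text{if } M \text{ is even} \\ \tfrac{1}{4}(M^2 - 1)n & \text{if } M \text{ is odd.} \end{cases} \qquad (2)$$

(1) and (2) give the required result.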
2.21 If $A_2(n, d)$ is even, the result is immediate. If $A_2(n, d)$ is
odd, use $\lfloor 2x \rfloor \le 2\lfloor x \rfloor + 1$.
The result gives $A_2(9, 5) \le 10$ and $A_2(10, 6) \le 6$. The
former bound can be improved via Corollary 2.8 and the
latter bound; thus $A_2(9, 5) = A_2(10, 6) \le 6$.
2.22 (i) was shown in Exercise 2.21. (ii) follows from (i) and
Corollary 2.8. (iii) By (i), $A_2(2d-1, d) \le 2d$. Hence
$A_2(2d, d) \le 4d$ by Exercise 2.2. (iv) follows from (iii) and
Corollary 2.8.
2.23 The (32, 64, 16)-code is optimal by Exercise 2.22(iii). The
generalization follows from the Remark in Exercise 2.19
and Exercise 2.22(iii).
2.24 Immediate from Exercises 2.15 and 2.22(iii).
Chapter 3

3.1 $2^{70} = (2^3)^{23} \cdot 2 \equiv 1^{23} \cdot 2 = 2 \pmod 7$.


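$3^{100} = (3^4)^{25} \equiv 1^{25} = 1 \pmod{10}$.
3.2 $x \equiv 0, 1, 2$ or $3 \pmod 4 \Rightarrow x^2 \equiv 0, 1, 0$ or $1 \pmod 4$
respectively. Hence $x^2 + y^2 \equiv 0, 1$ or $2 \pmod 4$, but
$1839 \equiv 3 \pmod 4$.
3.3 Modulo 7:
x:      1 2 3 4 5 6
x^{-1}: 1 4 5 2 3 6
Modulo 13:
x:      1 2 3 4  5 6  7 8 9 10 11 12
x^{-1}: 1 7 9 10 8 11 2 5 3 4  6  12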
3.4 (i) 2, (ii) 7.
3.5 Yes, No, No.
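3.6 (i) $1\cdot0 + 2\cdot1 + 3\cdot3 + 4\cdot1 + 5x + 6\cdot9 + 7\cdot1 +
8\cdot3 + 9\cdot9 + 10\cdot9 \equiv 0 \pmod{11} \Rightarrow 5x + 7 \equiv 0 \Rightarrow
5x \equiv 4 \Rightarrow x \equiv 4\cdot5^{-1} \equiv 4\cdot9 \equiv 3$.
(ii) The number is 00232xy800, where we see that each
of x, y is 0, 8 or 9. For the number to be an ISBN, we
require $6x + 7y \equiv 7$, i.e. $y \equiv 1 + 7x$. Now $x = 0 \Rightarrow y = 1$;
$x = 8 \Rightarrow y = 2$; $x = 9 \Rightarrow y = 9$. So $x = y = 9$.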
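Both parts can be confirmed with a direct search over the candidate digits. This is a small sketch using the weighted checksum from part (i); the helper names are illustrative only.

```python
def isbn_checksum(digits):
    """Weighted sum 1*x1 + 2*x2 + ... + 10*x10 reduced modulo 11."""
    return sum(i * d for i, d in enumerate(digits, start=1)) % 11

def fill_single_gap(known):
    """known has 10 entries, exactly one of them None; return the value that fits."""
    gap = known.index(None)
    for candidate in range(11):  # the value 10 would stand for the symbol X
        trial = known[:gap] + [candidate] + known[gap + 1:]
        if isbn_checksum(trial) == 0:
            return candidate
    return None

# Part (i): the fifth digit is unknown.
x = fill_single_gap([0, 1, 3, 1, None, 9, 1, 3, 9, 9])

# Part (ii): 00232xy800 with x and y each one of 0, 8, 9.
part_ii = [(a, b) for a in (0, 8, 9) for b in (0, 8, 9)
           if isbn_checksum([0, 0, 2, 3, 2, a, b, 8, 0, 0]) == 0]
print(x, part_ii)
```

Because 5 is invertible modulo 11, the missing digit in part (i) is unique, so returning the first match loses nothing.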
3.7 Suppose $x_1 x_2 \cdots x_{10}$ is the codeword sent and $y_1 y_2 \cdots y_{10}$ the
vector received. If a single error has occurred of magnitude
a, then $\sum_{i=1}^{10} y_i = \left(\sum_{i=1}^{10} x_i\right) + a \equiv a \pmod{11}$. So the
error is detected. Unlike the ISBN code, any transposition
of two digits will go undetected, for then $\sum y_i = \sum x_i \equiv 0$.
3.8 $1a, 2a, \ldots, (p-1)a$ are distinct (mod p), for
$ia \equiv ja \pmod p \Rightarrow i \equiv j \pmod p$ (multiplying both sides by
$a^{-1}$). So $1a, 2a, \ldots, (p-1)a$ are congruent to the elements
$1, 2, \ldots, p-1$ in some order. Hence
$1a \cdot 2a \cdots (p-1)a \equiv 1 \cdot 2 \cdots (p-1) \pmod p$ and so
$(p-1)!\, a^{p-1} \equiv (p-1)! \pmod p$. Multiplying through by
the inverse of $(p-1)!$ gives $a^{p-1} \equiv 1 \pmod p$.
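3.9 $a \ne 0 \Rightarrow \gcd(a, p) = 1$. By the Euclidean algorithm,
$1 = ax + py$ for some integers x and y. Hence
$ax \equiv 1 \pmod p$ and so $x = a^{-1}$. $31 = 1\cdot23 + 8$; $23 = 2\cdot8 +
7$; $8 = 1\cdot7 + 1$. So $1 = 8 - 1\cdot7 = 3\cdot8 - 23 = 3\cdot31 - 4\cdot23$.
Hence $-4\cdot23 \equiv 1 \pmod{31}$ and so $23^{-1} = -4 = 27$.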
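The back-substitution above is the extended Euclidean algorithm. A minimal recursive sketch follows (function names are illustrative); for the pair (31, 23) it reproduces the coefficients 3 and -4 found by hand.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    # b*x + (a % b)*y == g and a % b == a - (a // b)*b, so regroup the terms.
    return g, y, x - (a // b) * y

def inverse_mod(a, p):
    """Multiplicative inverse of a modulo p, when gcd(a, p) = 1."""
    g, x, _ = extended_gcd(a % p, p)
    if g != 1:
        raise ValueError("a is not invertible modulo p")
    return x % p

print(extended_gcd(31, 23), inverse_mod(23, 31))
```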
3.10 2, 3, 2 (other answers possible).
3.11 Let 1 be the multiplicative identity element of F. The field
elements $n1$ for $n = 1, 2, 3, \ldots$ cannot all be distinct, since
F is finite. So $l1 = m1$ for some $0 < m < l$, whence
$(l - m)1 = 0$. This implies that $n1 = 0$ for some positive integer n. Let p
be the smallest positive integer such that $p1 = 0$. Then p is
prime because $p = rs$, with $1 < r, s < p$, $\Rightarrow p1 = (r1)(s1) =
0 \Rightarrow r1 = 0$ or $s1 = 0$ (by Lemma 3.1(ii)), contradicting
the minimality of p. Finally, if $a \in F$, then $pa = a +
a + \cdots + a = a(1 + 1 + \cdots + 1) = a(p1) = a0 = 0$.
3.12 $\binom{p}{i} = p!/i!(p-i)!$. If $i \in \{1, 2, \ldots, p-1\}$, then the
numerator $p!$ is divisible by p, whereas the denominator
$i!(p-i)!$ is not. Hence $\binom{p}{i} \equiv 0 \pmod p$. By the binomial
theorem,
$$(a+b)^p = \sum_{i=0}^{p} \binom{p}{i} a^i b^{p-i} \equiv a^p + b^p \pmod p.$$
For the last part, use induction on a.
3.13 In the product, each element x will cancel with its inverse,
except when $x = x^{-1}$. Now $x = x^{-1} \iff x^2 = 1 \iff (x-1)(x+1) = 0
\iff x = 1$ or $x = -1$.
3.14 (i) $x^2 = y^2 \Rightarrow (x-y)^2 = x^2 - y^2 = 0 \Rightarrow
x - y = 0 \Rightarrow x = y$.
So the squares of the non-zero elements are precisely
the distinct non-zero elements (in some order).
(ii) Hint: show that if $a \ne 0$, then $x^2 = a$ has either 2 or 0
solutions.

Chapter 4

4.1 Show that this single condition holds if and only if both
conditions (1) and (2) of Theorem 4.1 hold.
4.2 Suppose $x, y \in E_n$, so that $w(x)$ and $w(y)$ are even
numbers. By Lemma 2.6, $w(x+y) = w(x) + w(y) -
2w(x \cap y)$ = an even number. So $x + y \in E_n$ and hence $E_n$ is
a subspace. By Exercise 2.4, $|E_n| = 2^{n-1}$ and so $\dim(E_n) =
n - 1$.
The rows of
$$\begin{bmatrix} 1 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 1 & 0 & \cdots & 0 & 1 \\ \vdots & & & & & \vdots \\ 0 & 0 & \cdots & 0 & 1 & 1 \end{bmatrix}$$
form a basis (other answers are possible).
4.3 $(1, 2, 0, 1) = 2(0, 1, 2, 1) + (1, 0, 2, 2)$. So $\{(0, 1, 2, 1),
(1, 0, 2, 2)\}$ is a basis and $\dim(C) = 2$.
4.4 Show that $\{u, v\}$ is linearly dependent if and only if either
u or v is zero or v is a scalar multiple of u.
4.5 In each case show that the new set is still both a spanning
set and a linearly independent set.
4.6 In F, let n denote the element $1 + 1 + \cdots + 1$ (n 1s). Then
the subset $\{0, 1, \ldots, p-1\}$ of F may be regarded as the
field GF(p), since addition and multiplication are carried
out modulo p. It follows at once that F is a vector space
over GF(p), all the axioms following immediately from
the field properties of F and GF(p). If the vector space F
over GF(p) has dimension h, then it follows, as in the
proof of Theorem 4.3, that $|F| = p^h$.
4.7 We omit the proof of the general result here, as it will be
given in Chapter 14. The points of P, are {000, 100},
{000, 010}, {000, 001}, {000, 110}, {000, 101}, {000, 011},
and {000,111}. The lines are {000, 100, 010, 110},
{000, 100, 001, 101}, etc. That this 7-point plane is the
same as that of Example 2.19 may be seen from Fig. 14.8,
wherein a vector x stands for the point {0, x}.

Chapter 5

5.1 No; 24 is not a power of 2.


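5.2 $[n, n-1, 2]$, $G = [\, I_{n-1} \mid \mathbf{1}^T \,]$, i.e. the identity matrix
$I_{n-1}$ followed by a column of 1s.
5.3 We use Theorem 4.1.

$x, y \in C \Rightarrow (x+y)H^T = xH^T + yH^T = 0 + 0 = 0
\Rightarrow x + y \in C$.

$x \in C$ and $a \in GF(q) \Rightarrow (ax)H^T = a(xH^T) = a0 = 0
\Rightarrow ax \in C$.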

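5.4 If $x = (x_1, \ldots, x_n) \in C$, let $\hat{x} = (x_1, \ldots, x_n, \sum_{i=1}^{n} x_i)$, where
$\sum x_i$ is calculated modulo 2. Then $\hat{C} = \{\hat{x} \mid x \in C\}$. Suppose
C is linear, so that $x, y \in C \Rightarrow x + y \in C$. Then

$$\begin{aligned}
\hat{x}, \hat{y} \in \hat{C} \Rightarrow \hat{x} + \hat{y} &= \left(x_1 + y_1, \ldots, x_n + y_n, \sum x_i + \sum y_i\right) \\
&= \left(x_1 + y_1, \ldots, x_n + y_n, \sum (x_i + y_i)\right) \\
&= \widehat{(x+y)} \in \hat{C}.
\end{aligned}$$

So $\hat{C}$ is linear.
Adding an overall parity check to the code of Example
5.6(ii) gives an [8, 4, 4]-code with generator matrix
10001011
01001110
00101101
00010111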
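As a check on the last matrix, the sketch below recovers a length-7 generator by dropping the final column of each row (an assumption about how the matrix arises from Example 5.6(ii), which is not reproduced here), extends every codeword by a parity bit, and compares the result with the span of the 8-column rows. All names are illustrative.

```python
from itertools import product

def span(rows):
    """All binary linear combinations of the given rows."""
    n = len(rows[0])
    return {tuple(sum(c * r[i] for c, r in zip(coeffs, rows)) % 2 for i in range(n))
            for coeffs in product((0, 1), repeat=len(rows))}

def extend(word):
    """Append an overall parity check so that the extended word has even weight."""
    return word + (sum(word) % 2,)

def min_weight(code):
    return min(sum(w) for w in code if any(w))

G8 = ["10001011", "01001110", "00101101", "00010111"]
rows8 = [tuple(int(b) for b in row) for row in G8]
rows7 = [row[:7] for row in rows8]       # assumed generator of the length-7 code

code7 = span(rows7)
code8 = {extend(w) for w in code7}
print(len(code8), min_weight(code7), min_weight(code8))
```

Since extension is a linear map, extending the generator rows and extending every codeword describe the same code, which is what the comparison confirms.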
5.5 Let Ev and Od denote the subsets of C consisting of words
of even and odd weights respectively. Suppose $Ev \ne C$.
Then there exists a codeword, y say, of odd weight. Now
the set $Ev + y = \{x + y \mid x \in Ev\}$ is contained in C (since C
is linear). But all words in $Ev + y$ are odd (via $w(x + y) =
w(x) + w(y) - 2w(x \cap y)$, cf. Lemma 2.6), and so we have
$Ev + y \subseteq Od$. Hence $|Ev| = |Ev + y| \le |Od|$. Also $Od +
y \subseteq Ev$ and so $|Od| \le |Ev|$. Hence $|Ev| = |Od| = \tfrac{1}{2}|C|$.
5.6 $C_1$:
00000
11110
00111
11001
$d(C_1)$ = minimum non-zero weight = 3.

$C_2$:
0000000
1001101 = $x_1$
0101011 = $x_2$
0010111 = $x_3$
1100110 = $x_1 + x_2$
1011010 = $x_1 + x_3$
0111100 = $x_2 + x_3$
1110001 = $x_1 + x_2 + x_3$
$d(C_2) = 4$.
5.7
0000
1011 = $x_1$
0112 = $x_2$
2022 = $2x_1$
0221 = $2x_2$
1120 = $x_1 + x_2$
2210 = $2x_1 + 2x_2$
1202 = $x_1 + 2x_2$
2101 = $2x_1 + x_2$
$d(C)$ = minimum non-zero weight = 3.
Since $9[1 + 2 \cdot 4] = 3^4$, the sphere-packing bound is attained and so C
is perfect.

5.8 By Table 2.4, $A_2(8, 3) = 20$, $A_2(8, 4) = 16$, and $A_2(8, 5) =
4$. By Exercise 5.4(ii), there exists a linear [8, 4, 4]-code
and so $B_2(8, 4) = 16$. There certainly exists also an [8, 4, 3]-
code and so, since $B_2(8, 3)$ is a power of 2 and is $\le 20$, we
have $B_2(8, 3) = 16$. The code constructed in Exercise 2.8 is
linear and so $B_2(8, 5) = 4$.
5.9 $\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$ generates a [3, 2, 2]-code over GF(q).
5.10 First get the required permutation of the rows of A by
permuting the rows of G. The $I_k$ part will have been
disturbed but can be restored by a suitable permutation of
the first k columns.
5.11
1000011
0100101
0010110
0001111
No, Yes (by Exercise 5.10).
5.12 $(u \mid u+v) + (u' \mid u'+v') = (u+u' \mid u+u'+v+v') \in C_3$.
Thus $C_3$ is linear. So $B_2(2d, d) = 4d$ by Exercises 2.19 and
2.23 since, at each step, $C_1$ and $C_2$ are linear.

Chapter 6

6.1 $C_1$:
00 01 10 11

$C_2$:
000 101 011 110
100 001 111 010

$C_3$:
00000 10110 01011 11101
10000 00110 11011 01101
01000 11110 00011 10101
00100 10010 01111 11001
00010 10100 01001 11111
00001 10111 01010 11100
11000 01110 10011 00101
10001 00111 11010 01100
(i) 11101, 01011
(ii) e.g. (a) 00000 received as 11000, (b) 00000 received
as 10100.
6.2 $P_{corr}(C_1) = (1-p)^2 = 0.9801$
$P_{corr}(C_2) = (1-p)^3 + p(1-p)^2 = (1-p)^2 = 0.9801$
$P_{corr}(C_3) = (1-p)^3(1 - 2p^2 + 3p) = 0.9992$
There is no point in using $C_2$ for error correction since
$P_{corr}$ is the same as for $C_1$, while $C_2$ takes 50% longer than
$C_1$ to transmit messages. $C_3$ reduces the word error rate
considerably.
$P_{undetec}(C_1) = 2p(1-p) + p^2 = 0.0199$
$P_{undetec}(C_2) = 3p^2(1-p) = 0.000297$
$P_{undetec}(C_3) = 2p^3(1-p)^2 + (1-p)p^4 = 0.00000197$.
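These figures can be recomputed from the codes themselves, taking p = 0.01 as the numerical values imply. The sketch below finds one minimum-weight leader per coset by scanning vectors in order of weight, then sums the corresponding probabilities; the function names are illustrative.

```python
from itertools import product

def add(x, y):
    return tuple((a + b) % 2 for a, b in zip(x, y))

def coset_leader_weights(code, n):
    """Weight of a minimum-weight vector in each coset of a binary linear code."""
    seen, weights = set(), []
    for v in sorted(product((0, 1), repeat=n), key=sum):
        if v not in seen:
            weights.append(sum(v))                # v is the lightest word of a new coset
            seen.update(add(v, c) for c in code)  # mark the whole coset as used
    return weights

def p_corr(code, n, p):
    """Probability that the error vector is one of the chosen coset leaders."""
    return sum(p**w * (1 - p)**(n - w) for w in coset_leader_weights(code, n))

def p_undetec(code, n, p):
    """Probability that the error vector is a non-zero codeword."""
    return sum(p**sum(c) * (1 - p)**(n - sum(c)) for c in code if any(c))

C2 = [(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)]
C3 = [(0, 0, 0, 0, 0), (1, 0, 1, 1, 0), (0, 1, 0, 1, 1), (1, 1, 1, 0, 1)]
print(p_corr(C2, 3, 0.01), p_corr(C3, 5, 0.01), p_undetec(C3, 5, 0.01))
```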
6.3 (i) No, communication is impossible.
(ii) Yes, interchange all Os and 1s in the received vector
before decoding.
6.4 The coset leaders include all vectors of weight $\le t$ and $\alpha_{t+1}$
vectors of weight $t+1$. So the probability that the error
vector is not a coset leader is

$$\left[\binom{n}{t+1} - \alpha_{t+1}\right] p^{t+1}(1-p)^{n-t-1} + \text{terms involving } p^{t+2}$$

and higher powers. Hence

$$P_{err} \approx \left[\binom{n}{t+1} - \alpha_{t+1}\right] p^{t+1} \quad \text{for small } p.$$

6.5 Straightforward calculation, with $A_3 = A_4 = 7$, $A_7 = 1$.
6.6 Since the code is perfect 3-error-correcting, we have
$$\alpha_0 = 1, \quad \alpha_1 = 23, \quad \alpha_2 = \binom{23}{2}, \quad \alpha_3 = \binom{23}{3}, \quad \alpha_i = 0 \text{ for } i \ge 4.$$

$$P_{corr} = (1-p)^{20}(1540p^3 + 210p^2 + 20p + 1) \approx 0.99992 \quad \text{if } p = 0.01.$$
So $P_{err} \approx 0.00008$. [Remark: A fair approximation is obtained
by using Exercise 6.4; namely $\binom{23}{4} 10^{-8}$.]

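6.7 Suppose $x = x_1 x_2 \cdots x_n$ is sent and that the received vector
is decoded as $x' = x'_1 x'_2 \cdots x'_n$. Then

$$P_{symb} = \frac{1}{k}\sum_{i=1}^{k} \mathrm{Prob}(x'_i \ne x_i) = \frac{1}{k}\sum_{e \in V(n,2)} f(e)\, \mathrm{Prob}(e \text{ is error vector}),$$

where $f(e)$ = number of incorrect information symbols
after decoding if the error vector is e, and so

$$P_{symb} = \frac{1}{k}\sum_{i=1}^{2^k} F_i P_i.$$

6.8 $$\begin{aligned}
P_{symb} &= \tfrac{1}{2}[P_2 + P_3 + 2P_4] \\
&= \tfrac{1}{2}\big[\{2(1-p)^2p^2 + (1-p)p^3 + p^4\} \\
&\qquad + \{(1-p)^3p + (1-p)^2p^2 + 2(1-p)p^3\} \\
&\qquad + 2\{3(1-p)^2p^2 + (1-p)p^3\}\big].
\end{aligned}$$
6.9 Note that $P_{err} = \sum_{i=2}^{2^k} P_i$. Since $F_1 = 0$ and $1 \le F_i \le k$ for all
$i \ge 2$, we have
$$\frac{1}{k}\sum_{i=2}^{2^k} P_i \le \frac{1}{k}\sum_{i=1}^{2^k} F_i P_i \le \sum_{i=2}^{2^k} P_i,$$
and hence the result.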

Chapter 7

7.1
$$u \cdot v = \sum_{i=1}^{n} u_i v_i = \sum_{i=1}^{n} v_i u_i = v \cdot u.$$

$$\begin{aligned}
(\lambda u + \mu v) \cdot w &= \sum_{i=1}^{n} (\lambda u_i + \mu v_i) w_i = \sum_{i=1}^{n} (\lambda u_i w_i + \mu v_i w_i) \\
&= \lambda \sum_{i=1}^{n} u_i w_i + \mu \sum_{i=1}^{n} v_i w_i = \lambda\, u \cdot w + \mu\, v \cdot w.
\end{aligned}$$

7.2 The standard form generator matrix of $E_n$ was found in
Exercise 5.2. It follows from this and Theorem 7.6 that a
generator matrix for $E_n^\perp$ is $[11 \cdots 1]$. So $E_n^\perp = \{00 \cdots 0,\ 11 \cdots 1\}$,
which is the repetition code of length n.
7.3 Find the syndrome $S(y)$ of the received vector y. If
$S(y) = 0$, then y is a codeword. If $S(y) \ne 0$, then y is not a
codeword and we have detected errors.
7.4 Suppose x is the codeword sent and $y = x + e$ is received,
where $e = e_1 e_2 \cdots e_n$ is the error vector. Then $S(y) = (x +
e)H^T = xH^T + eH^T = eH^T$. So $S(y)^T = He^T = \sum_{j=1}^{n} e_j H_j$,
where $H_j$ is the jth column of H.
7.5 Since the code is perfect, the coset leaders are precisely
those vectors of weight $\le 1$. G is in the form $[I_4 \mid A]$ and so
$H = [-A^T \mid I_3] =$
1110100
1101010
1011001
We use this to construct the syndrome look-up table:

Syndrome coset leader

000 0000000
111 1000000
110 0100000
101 0010000
011 0001000
100 0000100
010 0000010
001 0000001
$S(0000011) = 011$; decode as
$0000011 - 0001000 = 0001011$.
The other three vectors are decoded as 1111111, 0100110,
0010101.
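The look-up table and the decoding of 0000011 can be reproduced mechanically from the parity-check matrix H found above. In this sketch, words are held as bit strings and the helper names are illustrative.

```python
H = ["1110100", "1101010", "1011001"]  # rows of the parity-check matrix of Exercise 7.5
n = 7

def syndrome(y):
    """Syndrome y H^T over GF(2), returned as a string of bits."""
    return "".join(str(sum(int(h) * int(b) for h, b in zip(row, y)) % 2) for row in H)

# The code is perfect, so the coset leaders are 0 and the seven vectors of weight 1.
leaders = ["0" * n] + ["0" * i + "1" + "0" * (n - 1 - i) for i in range(n)]
table = {syndrome(e): e for e in leaders}

def decode(y):
    """Subtract (over GF(2), add) the coset leader that has the same syndrome as y."""
    e = table[syndrome(y)]
    return "".join(str((int(a) + int(b)) % 2) for a, b in zip(y, e))

print(syndrome("0000011"), decode("0000011"))
```

The eight syndromes in `table` are distinct, matching the eight rows of the look-up table, and the three other decoded vectors quoted above all have zero syndrome.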
1022 1110
7.6
(a) lonat. >) Lot:
(c) A listing of the codewords reveals that $d(C) = 3$. So
the 9 vectors of weight $\le 1$ are all coset leaders. Since
the total number of coset leaders $= 3^4/3^2 = 9$, the
vectors of weight $\le 1$ are precisely the coset leaders
(in fact the code is perfect). The look-up table is now
easily constructed, and the given vectors decoded as
0121, 1201, 2220.
7.7 0612960587.
7.8 Let C be a q-ary (10, M, 3)-code. Consider the M vectors
of length 8 obtained by deleting the last two coordinates.
These vectors must be distinct (or the corresponding
vectors of C would be distance $\le 2$ apart). So $M \le q^8$ (this
is a particular case of the Singleton bound, Theorem
10.17). In particular, $A_{10}(10, 3) \le 10^8$, $A_{11}(10, 3) \le 11^8$.
[Remark: The sphere-packing bound is not as good in
these cases.] We have $A_{11}(10, 3) = 11^8$ because the linear
[10, 8]-code over GF(11) having

$$H = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & 2 & 3 & \cdots & 10 \end{bmatrix}$$
is an 11-ary $(10, 11^8, 3)$-code.
7.9 For example, 0 and 0505000000 are codewords only
distance 2 apart.
7.10 Let $e_i = 0 \cdots 01 \cdots 1$ (i 1s). We require a code such that
$e_0, e_1, \ldots, e_7$ are all in different cosets (we could then
decode via syndrome decoding with the $e_i$s as coset
leaders). This requires that $2^7/2^k \ge 8$, i.e. $k \le 4$, and so the
rate cannot be greater than $\tfrac{4}{7}$. To achieve rate $\tfrac{4}{7}$ we would
need a $3 \times 7$ parity-check matrix H such that $e_i H^T \ne e_j H^T$
if $i \ne j$, i.e. such that $(e_i - e_j)H^T \ne 0$ for all $i \ne j$. Note that
each $e_i - e_j$ is a vector of the form $0 \cdots 01 \cdots 10 \cdots 0$. A
suitable H is

0001000
0100010
1010101
If e; — e; is orthogonal to the first row of H, then all its 1s
are to the left or to the right of centre. If also e; —e; is
orthogonal to the second row of H, then there can only be
one 1, in one of the Ist, 3rd, 5th or 7th positions. But then
e; —e; is not orthogonal to the third row of H. (Note: a
similar code of maximum possible rate may be constructed
of any given length.)
7.11 If C is an $[n, k]$-code, then $\hat{C}$ is an $[n+1, k]$-code and so a
parity-check matrix of $\hat{C}$ is an $(n+1-k) \times (n+1)$ matrix
whose rows form a linearly independent set of codewords
in $\hat{C}^\perp$. It is easily seen that H is such a matrix.

Chapter 8

8.1 H =
000000011111111
000111100001111
011001100110011
101010101010101
When y is received, calculate $yH^T$; this gives the binary
representation of the assumed error position. If two or
more errors have occurred, then y will be decoded as a
codeword different from that sent.
8.2 11100001, 01111000, at least two errors, 00110011.
8.3 From the standard form generator matrix (see Example
5.6(ii)), write down a parity-check matrix (via Theorem
7.6) and observe that its columns are the non-zero vectors
of V(3, 2).
8.4 For C, $\alpha_0 = 1$ and $\alpha_1 = n$, giving $P_{corr}(C) = (1-p)^{n-1}(1 -
p + np)$. Because every vector in $V(n, 2)$ has distance $\le 1$
from a codeword of C, it follows that every vector in
$V(n+1, 2)$ has distance $\le 2$ from a codeword of $\hat{C}$.
Consequently, the coset leaders for $\hat{C}$ all have weight $\le 2$
and so $\alpha_0 = 1$, $\alpha_1 = n+1$, $\alpha_2 = n$, which leads to
$P_{corr}(\hat{C}) = (1-p)^{n-1}(1 - p + np)$. [Remark: This result
will be generalized to any perfect binary code in Exercise
9.1.]
8.5 (i) H =
01111111
10123456
35234106, 10561360.
(ii) H = (a matrix with 3 rows and 31 columns, shown in two parts)
000000111111111 1111111111111111
011111000001111 1222223333344444
101234012340123 4012340123401234
8.6
8.7 Sood 1110 0 0
3 20 0 1)’ 100 2 1 0
02010 1
[For the code C,, a column operation (e.g. interchange of
columns 3 and 4) is necessary during the reduction of G to
a standard form of G’. So, after applying Theorem 7.6 to
get a parity check matrix H’ corresponding to G’, the
above column operation must be reversed in H’ in order
to get a parity-check matrix for the original code C).]
d(C,) =2, d(C) =3.
(other answers possible)
8.8 For example,
H =
100111
010123
001134
has the property that any three columns form a linearly
independent subset of V(3, 5), and so H is the parity-
check matrix of a [6, 3, 4]-code.
8.9 $$R_r = \frac{k}{n} = \frac{2^r - 1 - r}{2^r - 1} = 1 - \frac{r}{2^r - 1} \to 1 \text{ as } r \to \infty.$$
8.10 As in solution to Exercise 7.8, $A_q(n, 3) \le q^{n-2}$. Now
suppose q is a prime power. Then the bound is achieved
for $n = q+1$ by Ham(2, q) and for $n < q+1$ by shortenings
of Ham(2, q).
8.11 $f(t)$ = least value of M for which there exists a ternary
code of length t with M codewords such that any vector in
$V(t, 3)$ has distance $\le 1$ from at least one codeword. For
such a code the spheres of radius 1 about codewords must
'cover' the whole space $V(t, 3)$ and so a lower bound on M
is given by $M(1 + 2t) \ge 3^t$. (1)
(This is the sphere-packing bound, but with the inequality
reversed.)
(a) (i) If $t = (3^r - 1)/2$ then (1) gives $f(t) \ge 3^{t-r}$. The
bound is achieved by a perfect $[t, t-r, 3]$-
Hamming code over GF(3). So, for $t = (3^r - 1)/2$,
we have $f(t) = 3^{t-r}$.
(ii) Generating Ham (2, 3) by ont and replacing
‘0’ by ‘X’, we get the entry
X 1X2 xX
>
XK

NWR eR
xKx
me
hm

(b) The lower bound $f(5) \ge 23$ is given by (1). A crude
upper bound is $f(5) \le 27$. This is obtained by combining
each of the 9 bets for $t = 4$ with each of the
forecasts 1, 2, X for the 5th match. The surprising
result proved by Kamps and van Lint is that one
cannot do better than this.
8.12 Let C be an $(n, M, d)$-code with $M = A_q(n, d)$. Then there
is no vector in $V(n, q)$ with distance $\ge d$ from all codewords
in C. Thus the spheres of radius $d - 1$ about
codewords cover $V(n, q)$, whence the result. (The proof
shows that a code meeting the lower bound may be
constructed simply by starting with any word and then
successively adding new words which have distance at least
d from the words already chosen).
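The constructive remark in parentheses can be tried out directly. The sketch below scans $V(n, q)$ in lexicographic order (one arbitrary choice of "any word" and of the order in which further words are considered) and keeps every word at distance at least d from those already kept; the resulting size M then satisfies the covering inequality with spheres of radius d - 1. Function names are illustrative.

```python
from itertools import product
from math import comb

def distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def greedy_code(n, d, q=2):
    """Keep each word of V(n, q) that is at distance >= d from all words kept so far."""
    code = []
    for word in product(range(q), repeat=n):
        if all(distance(word, c) >= d for c in code):
            code.append(word)
    return code

def sphere_size(n, radius, q=2):
    """Number of vectors of V(n, q) within the given distance of a fixed vector."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(radius + 1))

code = greedy_code(7, 3)
print(len(code), len(code) * sphere_size(7, 2) >= 2 ** 7)
```

When the scan finishes, no remaining vector is at distance d or more from every kept word, which is exactly the covering property used in the proof.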

Chapter 9

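9.1 Suppose C is a perfect t-error-correcting $[n, k]$-code, so
that
$$\sum_{i=0}^{t} \binom{n}{i} = 2^{n-k}.$$
As in Exercise 8.4, for $\hat{C}$,
$$\alpha_i = \binom{n+1}{i} \quad \text{for } 0 \le i \le t,$$
and
$$\begin{aligned}
\alpha_{t+1} &= 2^{n+1-k} - \sum_{i=0}^{t} \binom{n+1}{i} \\
&= 2\sum_{i=0}^{t} \binom{n}{i} - \sum_{i=0}^{t} \binom{n}{i} - \sum_{i=1}^{t} \binom{n}{i-1} = \binom{n}{t}.
\end{aligned}$$
Hence
$$\begin{aligned}
P_{corr}(\hat{C}) &= \sum_{i=0}^{t} \binom{n+1}{i} p^i (1-p)^{n+1-i} + \binom{n}{t} p^{t+1}(1-p)^{n-t} \\
&= \sum_{i=0}^{t} \binom{n}{i} p^i (1-p)^{n+1-i} + \sum_{i=1}^{t} \binom{n}{i-1} p^i (1-p)^{n+1-i} + \binom{n}{t} p^{t+1}(1-p)^{n-t} \\
&= (1-p) P_{corr}(C) + \left(p\, P_{corr}(C) - \binom{n}{t} p^{t+1}(1-p)^{n-t}\right) + \binom{n}{t} p^{t+1}(1-p)^{n-t} \\
&= P_{corr}(C).
\end{aligned}$$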
9.2 It is easily checked that u- v= 0 for any rows u and v of
G. It follows that Gj,= G,,. Now show that G,, has no
codeword of weight <5 by imitating the proof of Lemma 3
in the proof of Theorem 9.3.
9.3 If H =[I;| A] has no 4 columns linearly dependent, then
each column of A has at most one zero, and no two
columns of A can have a zero entry in common (or their
sum or difference would be a linear combination of two of
the columns of J). The hint now follows easily. It then
follows that in each of the undecided columns of A, two of
the *s are 2s and the other * is a 1. The remaining columns
may now be completed, one at a time, in a unique way (up
to equivalence).
9.4 (a) Suppose y has weight 4. Since $G_{23}$ is perfect, there is
a unique codeword x such that $d(x, y) \le 3$, and so
$1 \le w(x) \le 7$. But every non-zero codeword has weight
$\ge 7$ and so $w(x) = 7$, which implies that x covers
y. The uniqueness of x as a codeword having
distance $\le 3$ from y ensures that x is the only codeword
of weight 7 which covers y. Counting in two
ways the number of pairs in the set $\{(x, y) \mid x$ is a
codeword of weight 7, y is a vector of weight 4, x
covers $y\}$ gives $A_7 \cdot \binom{7}{4} = \binom{23}{4} \cdot 1$, whence $A_7 = 253$.

(b) Let $P_1, \ldots, P_{23}$ be points and $B_1, \ldots, B_{253}$ be blocks,
and define $P_i \in B_j$ if and only if the $(i, j)$th entry of M
is 1.
9.5 (a) Straightforward generalization of the argument of
Exercise 9.4.
(b) Let X be the set of codewords of weight $2t+1$
beginning with i 1s. Let Y be the set of vectors in
$V(n, 2)$ of weight $t+1$ beginning with i 1s. As in the
proof of Theorem 9.7, counting in two ways the
number of pairs in the set $\{(x, y) \mid x \in X, y \in Y, x$
covers $y\}$ gives $\binom{n-i}{t+1-i} = \binom{2t+1-i}{t+1-i} |X|$,
whence the result.
9.6 We must show that an arbitrary vector $y = y_1 y_2 \cdots y_{24}$ of
weight 5 in V(24, 2) is covered by a unique codeword of
weight 8 in $G_{24}$. Certainly there cannot be two such
codewords or their distance apart would be $\le 6$, a contradiction.
If $y_{24} = 0$, then since $G_{23}$ is perfect, $G_{23}$ contains
a codeword x having distance at most 3 from $y_1 y_2 \cdots y_{23}$.
So x has weight 7 or 8; in either case $w(\hat{x}) = 8$ and $\hat{x}$
covers y. If $y_{24} = 1$, then $y_1 \cdots y_{23}$ is covered by a unique
codeword x of weight 7 in $G_{23}$ and then $\hat{x}$ covers y.
- 3 2’-1
9.7 By a now familiar argument, A; - ( = ( )

9.8

9.9
a Ga(e
(i) Assume $1111111100 \cdots 0 \in G_{24}$. Let G be a generator
matrix of $G_{24}$. Since $d(G_{24}) = 8$ and since $G_{24}$ is
self-dual, it follows by Theorem 8.4 that any 7
columns of G are linearly independent. In particular,
the first 7 columns are linearly independent and so
by elementary row operations, G may be transformed
to a matrix having its first 7 columns as
shown. Since $1111111100 \cdots 0$ is orthogonal to every
row of G, the eighth column of G must also be as
shown.
(ii) Let the rows of G be $r_1, r_2, \ldots, r_{12}$. The set of
codewords with one of the given starts is given by
adding to 0, or to one of $r_1, r_2, \ldots, r_7$, all vectors of
the form $\sum_{i=8}^{12} \lambda_i r_i$, $\lambda_i \in GF(2)$. So for each of the 8
starts, there are $2^5$ codewords.
(iii) Immediate, since d(G,,)=8, and any two of the
chosen 256 codewords differ in at most 2 of the first
8 positions.
(iv) Immediate.
9.10 Shorten N,; thrice (cf. Exercise 2.2) to get a (12, 232, 5)-
code.
9.11 (i) Let the rows of G be $r_1, r_2, \ldots, r_6$. To show that $G_1$
generates an [8, 5, 3]-code, it is enough to show that
if x is any non-zero codeword of C generated by
$r_2, r_3, \ldots, r_6$, then x has at least three 1s in the last 8
positions. If x had at most two 1s in the last 8
positions, then either x or $x + r_1$ would be a codeword
of C having weight $\le 4$, a contradiction.
(ii) If there existed a [15, 8, 5]-code, then it could be
twice shortened to give a [13, 6, 5]-code, contrary to
the result of part (i).
(iii) Not immediately, for in this case G, would generate
a [7,4, 3]-code, and a code with these parameters
does exist. However, further considerations do lead
to a contradiction; see, e.g., van Lint (1982), §4.4.
Chapter 10

10.1 Use Theorem 10.8 with uw =1, v=2.


10.2 In Theorem 10.10, take

$A_1 = B_1 =$
012
120
201

$A_2 = B_2 =$
012
201
120

10.3 Using Theorem 10.19, a set of three MOLS of order 4 is

$A_1 =$
01ab
10ba
ab01
ba10

$A_2 =$
0ab1
1ba0
a01b
b10a

$A_3 =$
0b1a
1a0b
a1b0
b0a1

10.4 $\mathrm{Ham}(2, q)^\perp$ has generator matrix
$$\begin{bmatrix} 0 & 1 & 1 & 1 & \cdots & 1 \\ 1 & \lambda_0 & \lambda_1 & \lambda_2 & \cdots & \lambda_{q-1} \end{bmatrix}$$
where $GF(q) = \{\lambda_0, \lambda_1, \ldots, \lambda_{q-1}\}$. Clearly no non-zero
linear combination of these two rows can have more than
one zero and so $\mathrm{Ham}(2, q)^\perp$ has minimum distance q. If
we list the codewords generated by
011111
101234
and then apply Theorem 10.20, we get

$A_1 =$
01234
12340
23401
34012
40123

$A_2 =$
01234
23401
40123
12340
34012

$A_3 =$
01234
34012
12340
40123
23401

$A_4 =$
01234
40123
34012
23401
12340
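The four squares above have (i, j) entry ki + j (mod 5) for k = 1, 2, 3, 4, which makes them easy to regenerate and test. The sketch below checks that each is a Latin square and that every pair is orthogonal; the function names are illustrative.

```python
from itertools import combinations

def square(k, q=5):
    """q x q array with (i, j) entry k*i + j (mod q), rows and columns indexed from 0."""
    return [[(k * i + j) % q for j in range(q)] for i in range(q)]

def is_latin(sq):
    """Every symbol occurs exactly once in each row and in each column."""
    symbols = set(range(len(sq)))
    return (all(set(row) == symbols for row in sq)
            and all(set(col) == symbols for col in zip(*sq)))

def are_orthogonal(a, b):
    """Superimposing a and b yields every ordered pair of symbols exactly once."""
    q = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(q) for j in range(q)}
    return len(pairs) == q * q

squares = [square(k) for k in range(1, 5)]
print(all(is_latin(s) for s in squares),
      all(are_orthogonal(a, b) for a, b in combinations(squares, 2)))
```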
10.5
n:    3  4  5  6  7  8  9  10   11  12
f(n): 2  3  4  1  6  7  8  2-9  10  2-11
n:    13  14    15    16  17  18    19  20
f(n): 12  2-13  2-14  15  16  2-17  18  3*-19
* Take three MOLS of order 4 and three MOLS of
order 5 and generalize the construction of Theorem
10.10 to get 3 MOLS of order 20.
10.6 The existence of 3 MOLS of order 20 (see previous
exercise) gives the existence of a (5, 400, 4)-code, by
Theorem 10.20. Since this code achieves the Singleton
bound, we have $A_{20}(5, 4) = 400$.

Chapter 11

11.1 0204006910.
11.2
i 479 1
1081 2
977 9
C= fs 218 10
197 4
E 767 1
11.3 0000001000, 1005000003.
11.4 Identify the letters A, B, ..., Z with the field elements
0, 1, ..., 25 of GF(29). Let H be the parity-check matrix
$$H = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & 2 & 3 & \cdots & 8 \\ 1 & 2^2 & 3^2 & \cdots & 8^2 \\ 1 & 2^3 & 3^3 & \cdots & 8^3 \end{bmatrix}$$
for an $(8, 29^4, 5)$-code over GF(29). Let C be the 26-ary
code obtained by taking only those codewords consisting
of symbols 0, 1, ..., 25, i.e.

$$C = \left\{ x_1 x_2 \cdots x_8 \;\middle|\; x_i \in \{0, 1, \ldots, 25\},\ \sum_{i=1}^{8} i^j x_i \equiv 0 \pmod{29},\ j = 0, 1, 2, 3 \right\}.$$

A probabilistic estimate for the number of codewords in
C is $29^4 \times \left(\tfrac{26}{29}\right)^8 \approx 295{,}253$ (it happens that this is a
remarkably good estimate).
Alternatively we could base our code on 26 of the
elements of GF(27). This would give us more codewords,
but the arithmetic involved in the decoding would be less
straightforward.

11.5 o(6)=|][(1-X0)> 0'(6)=-> xX, [] 0-X)


i=1 t=1 it

> 0'(Xj1) = —X, i=1[] (-%:x7").


it¢j
The result now follows from equation (11.10).
11.6 H =
3 7 6 1 8 9 4 5 2 1 0
5 4 3 2 7 6 5 4 3 2 1
There exists a codeword $(x_1, \ldots, x_{11})$ of weight 2 with
non-zero entries $x_i$ and $x_j$ if and only if $H_i = -(x_j/x_i)H_j$,
where $H_i$ denotes the ith column of H. In order to
determine which columns of H are scalar multiples of
others, calculate the ratios $h_1/h_2$ for each column
$\binom{h_1}{h_2}$.
They are 5, 10, 2, 6, 9, 7, 3, 4, 8, 6, 0. It follows that a
double-error vector will go undetected if and only if it is
of the form $(0, 0, 0, \lambda, 0, 0, 0, 0, 0, -\lambda, 0)$ for some
$\lambda \in \{1, 2, \ldots, 10\}$.

Chapter 12

12.1 (i) No, No (not linear), (ii) No, No, (iii) No, Yes, (iv)
Yes, provided the alphabet is a field, (v) Yes, (vi) No,
No, (vii) Yes.
12.2
 ·  | 0   1    x    1+x
 0  | 0   0    0    0
 1  | 0   1    x    1+x
 x  | 0   x    1    1+x
1+x | 0   1+x  1+x  0
$1 + x$ has no inverse.
12.3 Just imitate the proof of Theorem 3.5.
12.4 If f(x) had an even number of non-zero coefficients, then
we would have f(1) = 0 and so x — 1 would be a factor of
f(x).
12.5 Because $p(x) = f(x)g(x) \Rightarrow \deg p(x) = \deg f(x) + \deg g(x)
\Rightarrow$ either $\deg f(x) \le \tfrac{1}{2}\deg p(x)$ or
$\deg g(x) \le \tfrac{1}{2}\deg p(x)$.
12.6 $x$, $1+x$, $1+x+x^2$, $1+x+x^3$, $1+x^2+x^3$, $1+x+x^4$,
$1+x^3+x^4$, $1+x+x^2+x^3+x^4$. (Using Lemma 12.3 and
Exercise 12.4, it easily follows that the irreducible
polynomials of degrees 2, 3 and 4 are precisely those with
constant coefficient 1 and with an odd number of
non-zero coefficients, with the exception of $(1+x+
x^2)^2 = 1+x^2+x^4$). For example, $F[x]/(1+x+x^3)$ is a
field of order 8.
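The list in 12.6 can be confirmed by a small sieve. In the sketch below a binary polynomial is stored as an integer whose bit i is the coefficient of $x^i$ (a representation chosen here for convenience, not taken from the text), and every product of two polynomials of degree at least 1 is struck out.

```python
def clmul(a, b):
    """Product of two GF(2) polynomials stored as bit masks (bit i = coefficient of x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition of polynomials over GF(2) is XOR
        a <<= 1
        b >>= 1
    return result

def irreducibles(max_degree):
    """Irreducible polynomials over GF(2) of degree 1..max_degree, as bit masks."""
    limit = 1 << (max_degree + 1)
    reducible = set()
    for a in range(2, limit):          # 2 and above: polynomials of degree >= 1
        for b in range(a, limit):
            prod = clmul(a, b)
            if prod < limit:
                reducible.add(prod)
    return [f for f in range(2, limit) if f not in reducible]

def show(f):
    """Readable form of a bit mask, lowest degree first."""
    terms = ["1" if i == 0 else "x" if i == 1 else f"x^{i}"
             for i in range(f.bit_length()) if (f >> i) & 1]
    return " + ".join(terms)

print([show(f) for f in irreducibles(4)])
```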
12.7 (i) By Exercise 3.12, $(x^p - 1) = (x - 1)^p$.
(ii) From Fermat's theorem (Exercise 3.8) and Lemma
12.3(i), it follows that $x^{p-1} - 1 = (x - 1)(x - 2) \cdots (x -
(p - 1))$.
12.8 By Lemma 12.3(i), $x^5 - 1 = (x - 1)(x^4 + x^3 + x^2 + x + 1)$,
and the second factor is irreducible by Exercise 12.6. So
the only cyclic codes are $\{0\}$, $\langle x - 1 \rangle$ (the even weight
code), $\langle x^4 + x^3 + x^2 + x + 1 \rangle$ (the repetition code), and
the whole of V(5, 2).
12.9 Yes, (x — 1)g(x).
12.10 $2^t$. (In a factor of $x^n - 1$, each of the t distinct irreducible
factors may or may not be present).
12.11 $\langle 1 \rangle$ = whole space
$\langle x - 1 \rangle$ = even weight code $E_7$
$\langle x^3 + x + 1 \rangle$ and $\langle x^3 + x^2 + 1 \rangle$: both are Hamming codes Ham(3, 2)
$\langle (x - 1)(x^3 + x + 1) \rangle$ and $\langle (x - 1)(x^3 + x^2 + 1) \rangle$: both are even
weight subcodes of Ham(3, 2) (alternatively, both are
duals of Ham(3, 2))
$\langle (x^3 + x + 1)(x^3 + x^2 + 1) \rangle$ = repetition code of length 7
$\langle x^7 - 1 \rangle = \{0\}$.
12.12 $x^8 - 1 = (x^4 - 1)(x^4 + 1) = (x - 1)(x + 1)(x^2 + 1)(x^2 + x +
2)(x^2 + 2x + 2)$, 32.
12.13 Straightforward application of Theorem 12.15.
12.14 Not in general; Yes, $C^\perp$ is obtained from $\langle h(x) \rangle$ by
writing the codewords backwards.
12.15 Let $g(x)$ be the generator polynomial of C. Then $g(x)$ is a
divisor of $(x - 1)(x^{n-1} + \cdots + x + 1)$. If $g(x)$ is a multiple
of $x - 1$, then so is every codeword, and so every
codeword has even weight. So if there exists a codeword
of odd weight, then $x^{n-1} + \cdots + x + 1$ must be a multiple
of $g(x)$, i.e. $\mathbf{1} \in C$. The reverse implication is immediate
since $w(\mathbf{1})$ is odd.
12.16 Let g,,...,g, denote the rows of G. Let x denote a
cyclic shift of x. Ifx =). A,g, eC, then x = ) Ag, EC.
12.17 Check that 2°,2',...,2° are precisely the distinct non-
zero elements of GF(11). Hence the code of Example
7.12 is equivalent to the code C with parity-check matrix

111 1
5° 9 2°29}
Now

(29, 29,24)... 28) = 29(29, 21, , 29)


and so C~ is cyclic by Exercise 12.16. Therefore C is
cyclic by Theorem 12.15(ii). The result for Example 11.3
follows similarly.
12.18 The subcode D of G,3 consisting of codewords of even
weight is ((x —1)g,(x)). Thus D* = (g,(x)) = (gi))
and so D< D~. Hence u- v= 0 if u and v are codewords
of even weight. Since 1é€G,3, any codeword of odd
weight is of the form u+ 1 for some codeword u of even
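weight. If $u + \mathbf{1}$, $v + \mathbf{1}$ are codewords of odd weight, then
$(u + \mathbf{1}) \cdot (v + \mathbf{1}) = u \cdot v + \mathbf{1} \cdot v + u \cdot \mathbf{1} + \mathbf{1} \cdot \mathbf{1} = 0 + 0 + 0 + 1
= 1$. Also if $u + \mathbf{1}$ has odd weight and v has even weight,
then $(u + \mathbf{1}) \cdot v = u \cdot v + \mathbf{1} \cdot v = 0 + 0 = 0$. Now let x, y be
any codewords of $G_{23}$ and let $\hat{x}$, $\hat{y}$ be the corresponding
codewords of $G_{24}$. Then $\hat{x} \cdot \hat{y} = x \cdot y + x_{24} y_{24} = 0$, since
$x \cdot y = 1 \iff$ x, y both have odd weight $\iff x_{24} = y_{24} = 1$. So
$G_{24} \subseteq G_{24}^\perp$ and since $\dim(G_{24}) = \dim(G_{24}^\perp) = 12$, it follows
that $G_{24} = G_{24}^\perp$.
12.19 $x^4 + x + 1$ is a generator polynomial for Ham(4, 2).
Dividing $x^{15} - 1$ by $x^4 + x + 1$ (e.g. by long division) gives
$h(x) = x^{11} + x^8 + x^7 + x^5 + x^3 + x^2 + x + 1$.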
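The long division in 12.19 can be carried out mechanically. In this sketch a binary polynomial is stored as an integer bit mask (bit i is the coefficient of $x^i$), an encoding chosen here for convenience; over GF(2), $x^{15} - 1$ is the same polynomial as $x^{15} + 1$.

```python
def gf2_divmod(dividend, divisor):
    """Quotient and remainder of GF(2) polynomials stored as bit masks."""
    quotient = 0
    shift = dividend.bit_length() - divisor.bit_length()
    while shift >= 0:
        quotient |= 1 << shift        # next term of the quotient is x^shift
        dividend ^= divisor << shift  # subtract (XOR) the shifted divisor
        shift = dividend.bit_length() - divisor.bit_length()
    return quotient, dividend

g = 0b10011                                    # x^4 + x + 1
h, remainder = gf2_divmod((1 << 15) | 1, g)    # divide x^15 + 1 by g(x)
exponents = [i for i in range(h.bit_length()) if (h >> i) & 1]
print(exponents, remainder)
```

A zero remainder confirms that $x^4 + x + 1$ divides $x^{15} - 1$, and the exponents listed are those of the non-zero terms of h(x).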
12.20 Ham(r, 2) is a $[2^r - 1, 2^r - r - 1, 3]$-code. By Exercise
12.9, $\langle (x - 1)g(x) \rangle$ is the subcode of codewords of even
weight. This subcode must have dimension $2^r - r - 2$ and
minimum distance 4.
12.21 It is enough to show that no vector of the form
$(x^i + x^{i+1}) + (x^j + x^{j+1}) = (x + 1)(x^i + x^j)$
is a codeword of $\langle (x + 1)g(x) \rangle$ (then all vectors of the
form 0, $x^i$, and $x^i + x^{i+1}$ will be coset leaders). But
$(x + 1)(x^i + x^j)$ is a codeword $\Rightarrow (x + 1)(x^i + x^j)$ is a multiple
of $(x + 1)g(x) \Rightarrow x^i + x^j$ is a multiple of $g(x) \Rightarrow x^i +
x^j \in \langle g(x) \rangle$, contradicting $d(\langle g(x) \rangle) = 3$.
12.22 (van Lint 1982, solution to Exercise 6.11.7). Show that
every non-zero codeword of C has exactly one zero
entry. Show also that there is exactly one codeword
$c = c_0 c_1 \cdots c_q$ such that $c_0 = c_{(q+1)/2} = 1$ [Consider the $q^2$
ordered pairs $(c_0, c_{(q+1)/2})$ as c runs over all codewords of
C]. If C were cyclic, then a cyclic shift of c through
$(q + 1)/2$ positions would yield the same codeword c, but
this is not possible if c contains only one zero entry. Thus
C is not cyclic and so Ham(2, q), being the dual code of
C, is not cyclic by Theorem 12.15(ii).

Chapter 13

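13.1 The mapping $x \mapsto x + \mathbf{1}$ gives a one-to-one correspondence
between the set of codewords of weight i and the
set of codewords of weight $n - i$.
13.2 (b) $C^\perp$ is generated by
10010
11101
and so $C^\perp = \{00000, 10010, 11101, 01111\}$. Hence
$W_{C^\perp}(z) = 1 + z^2 + 2z^4$. So

$$\begin{aligned}
W_C(z) &= \tfrac{1}{4}(1 + z)^5\, W_{C^\perp}\!\left(\frac{1 - z}{1 + z}\right) \\
&= \tfrac{1}{4}\left[(1 + z)^5 + (1 + z)^3(1 - z)^2 + 2(1 + z)(1 - z)^4\right] \\
&= 1 + 3z^2 + 3z^3 + z^5.
\end{aligned}$$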
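The weight enumerator just obtained can be cross-checked in two independent ways: by enumerating C directly as the set of vectors orthogonal to $C^\perp$, and by expanding the MacWilliams identity with integer arithmetic. The sketch below does both; the function names are illustrative.

```python
from itertools import product
from math import comb

def span(rows):
    """All binary linear combinations of the given rows."""
    n = len(rows[0])
    return {tuple(sum(c * r[i] for c, r in zip(coeffs, rows)) % 2 for i in range(n))
            for coeffs in product((0, 1), repeat=len(rows))}

def dual(code, n):
    """All vectors of V(n, 2) orthogonal to every word of the given code."""
    return {v for v in product((0, 1), repeat=n)
            if all(sum(a * b for a, b in zip(v, c)) % 2 == 0 for c in code)}

def weight_distribution(code, n):
    counts = [0] * (n + 1)
    for c in code:
        counts[sum(c)] += 1
    return counts

def macwilliams(dual_counts, n):
    """Coefficients of (1/|D|) * sum_j B_j (1 - z)^j (1 + z)^(n - j), where D has distribution B."""
    total = sum(dual_counts)
    coeffs = [0] * (n + 1)
    for j, b in enumerate(dual_counts):
        for i in range(n + 1):
            # coefficient of z^i in (1 - z)^j (1 + z)^(n - j)
            k = sum((-1) ** s * comb(j, s) * comb(n - j, i - s) for s in range(min(i, j) + 1))
            coeffs[i] += b * k
    return [c // total for c in coeffs]

dual_code = span([(1, 0, 0, 1, 0), (1, 1, 1, 0, 1)])
code = dual(dual_code, 5)
print(weight_distribution(code, 5), macwilliams(weight_distribution(dual_code, 5), 5))
```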

13.3 C* is generated by
SO oortioL
110011101)
Solutions to exercises 237

and so $W_{C^{\perp}}(z) = 1 + 3z^6$. Hence $W_C(z) = \tfrac{1}{4}[(1 + z)^9 + 3(1 + z)^3(1 - z)^6]$,
whence $A_0 = 1$, $A_1 = 0$, $A_2 = \tfrac{1}{4}[36 + 3(3 + 15 - 18)] = 9$, $A_3 = 27$. The sum of all the rows of
the generator matrix of $C$ is $\mathbf{1}$. By Exercise 13.1,
$A_i = A_{9-i}$, and so $2(A_0 + A_1 + A_2 + A_3 + A_4) = 2^7$, which
gives $A_4 = 27$. Hence
$W_C(z) = 1 + 9z^2 + 27z^3 + 27z^4 + 27z^5 + 27z^6 + 9z^7 + z^9$.
13.4 Adding an overall parity check increases each odd weight
by 1 and leaves each even weight unchanged. So
$W_{\hat{C}}(z) = 1 + 14z^4 + z^8$.
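13.5 Let $C$ be Ham (r, 2). Then by Theorem 13.10,
$W_{C^{\perp}}(z) = 1 + (2^r - 1)z^{2^{r-1}} = 1 + nz^{(n+1)/2}$. So
$$W_C(z) = \frac{1}{2^r}\left[(1 + z)^n + n(1 - z)^{(n+1)/2}(1 + z)^{(n-1)/2}\right]$$
$$= \frac{1}{n + 1}\left[(1 + z)^n + n(1 - z^2)^{(n-1)/2}(1 - z)\right].$$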


13.6 $W_C(z) = \tfrac{1}{16}[(1 + z)^{15} + 15(1 - z^2)^7(1 - z)]$. (1)
$A_0 = 1$, $A_1 = A_2 = 0$ (either from (1) or because we know
$d(C) = 3$), $A_3 = 35$, $A_4 = 105$.
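The closed form from 13.5 can be expanded numerically to recover the coefficients quoted in 13.6. The sketch below assumes polynomials are stored as coefficient lists (index = power of z); the helper names are illustrative.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_pow(base, exponent):
    """Raise a polynomial to a non-negative integer power."""
    result = [1]
    for _ in range(exponent):
        result = poly_mul(result, base)
    return result


def hamming_weight_distribution(r):
    """Coefficients A_0..A_n of W_C(z) for the binary Hamming code Ham(r, 2).

    Uses W_C(z) = [(1+z)^n + n (1-z^2)^((n-1)/2) (1-z)] / (n+1), n = 2^r - 1.
    """
    n = 2 ** r - 1
    first = poly_pow([1, 1], n)
    second = poly_mul(poly_pow([1, 0, -1], (n - 1) // 2), [1, -1])
    total = [a + n * b for a, b in zip(first, second)]
    if any(c % (n + 1) for c in total):
        raise ArithmeticError("coefficients are not divisible by n + 1")
    return [c // (n + 1) for c in total]


dist = hamming_weight_distribution(4)
print(dist[:5])
```

For r = 4 the first five coefficients should agree with the values of A_0 to A_4 stated in 13.6, and the coefficients should sum to 2^11, the number of codewords.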
13.7 The coefficient of $z^i$ in the right-hand side is
$\tfrac{1}{2}A_i(1 + (-1)^i) = A_i$ if $i$ is even, 0 if $i$ is odd.
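13.8 If $W_C(z) = \sum A_i z^i$, then
$$W_{\hat{C}}(z) = \sum_{i \text{ even}} (A_i + A_{i-1}) z^i$$
$$= \sum_{i \text{ even}} A_i z^i + z \sum_{j \text{ odd}} A_j z^j$$
$$= \tfrac{1}{2}[W_C(z) + W_C(-z)] + \tfrac{1}{2}z[W_C(z) - W_C(-z)].$$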

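13.9 From equation (13.12),
$$P_{\text{undetec}}(C) = (1 - p)^n \frac{1}{2^{n-k}} \left(1 + \frac{p}{1 - p}\right)^n W_{C^{\perp}}\!\left(\frac{1 - \frac{p}{1-p}}{1 + \frac{p}{1-p}}\right) - (1 - p)^n$$
$$= \frac{1}{2^{n-k}} W_{C^{\perp}}(1 - 2p) - (1 - p)^n.$$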
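The two expressions for the probability of an undetected error can be compared exactly on a small case. This sketch assumes the binary [7, 4] Hamming code, whose weight distribution and that of its dual are the standard values obtained from 13.5 with r = 3; the value p = 1/100 is an arbitrary choice for illustration.

```python
from fractions import Fraction


def p_undetected_direct(A, p):
    """Sum of A_i p^i (1-p)^(n-i) over i >= 1, using the code's own weights."""
    n = len(A) - 1
    return sum(A[i] * p ** i * (1 - p) ** (n - i) for i in range(1, n + 1))


def p_undetected_dual(B, k, p):
    """2^-(n-k) W_dual(1 - 2p) - (1-p)^n, using the dual's weights B."""
    n = len(B) - 1
    w_dual = sum(B[i] * (1 - 2 * p) ** i for i in range(n + 1))
    return w_dual / 2 ** (n - k) - (1 - p) ** n


A = [1, 0, 0, 7, 7, 0, 0, 1]   # Ham(3, 2): 1 + 7z^3 + 7z^4 + z^7
B = [1, 0, 0, 0, 7, 0, 0, 0]   # its dual: 1 + 7z^4
p = Fraction(1, 100)
direct = p_undetected_direct(A, p)
via_dual = p_undetected_dual(B, 4, p)
print(direct == via_dual, float(direct))
```

Using `Fraction` keeps the comparison exact, so agreement is not an artefact of floating-point rounding.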


13.10 By Lemmas 1, 3, and 4 in the proof of Theorem 9.3, $G_{24}$
is self-dual, $A_i \neq 0$ only if $i$ is divisible by 4, and $A_4 = 0$.
Since $\mathbf{1} \in G_{24}$, it follows from Exercise 13.1 that $A_{20} = 0$
and that $A_8 = A_{16}$. So
$W_C(z) = 1 + A_8 z^8 + A_{12} z^{12} + A_8 z^{16} + z^{24}$.
Applying the MacWilliams identity and
equating coefficients of $W_C(z)$ and $W_{C^{\perp}}(z)$ (since $C$ is
self-dual) gives: $2 + 2A_8 + A_{12} = 2^{12}$ (constant
coefficients), $0 = 0$ (coefficients of $z$) and
$138 + 10A_8 - 3A_{12} = 0$ (coefficients of $z^2$). Solving these gives $A_8 = 759$,
$A_{12} = 2576$.
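The two linear equations of 13.10 and the self-duality claim can both be checked numerically. This is a sketch only: the MacWilliams transform is written here in terms of Krawtchouk-type coefficients, which is one standard way to expand $(1-z)^i(1+z)^{n-i}$ and is not necessarily how the text presents it; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def krawtchouk(n, j, i):
    """Coefficient of z^j in (1 - z)^i (1 + z)^(n - i)."""
    return sum((-1) ** s * comb(i, s) * comb(n - i, j - s) for s in range(j + 1))


def macwilliams_transform(A):
    """Weight distribution of the dual of a binary code with distribution A."""
    n = len(A) - 1
    size = sum(A)
    return [
        Fraction(sum(A[i] * krawtchouk(n, j, i) for i in range(n + 1)), size)
        for j in range(n + 1)
    ]


# Solve 2 + 2*A8 + A12 = 2**12 and 138 + 10*A8 - 3*A12 = 0 by substitution:
# A12 = 4094 - 2*A8, hence 16*A8 = 3*4094 - 138.
A8 = Fraction(3 * 4094 - 138, 16)
A12 = 4094 - 2 * A8

A = [0] * 25
A[0] = A[24] = 1
A[8] = A[16] = int(A8)
A[12] = int(A12)
print(A8, A12, macwilliams_transform(A) == A)
```

Because the code is self-dual, applying the transform to the resulting distribution should return the same distribution.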
13.11 $G_{24}$ is self-dual by Exercise 12.18. By Lemma 12.19,
codewords of $G_{23}$ of even weight have weight divisible by
4. Since $\mathbf{1}$ $(= g_1(x)g_2(x)) \in G_{23}$, it follows by Exercise 13.1
that any odd weight of a codeword of $G_{23}$ is congruent to
3 (mod 4). Consequently, all codewords of $G_{24}$ have
weight divisible by 4. Also $A_4 = 0$, since $d(G_{24}) = 8$. The
result now follows exactly as in Exercise 13.10.
13.12 By Exercise 13.10 (or 13.11) the only $A_i$s in $W_{G_{23}}(z)$
which can be non-zero are $A_0$, $A_7$, $A_8$, $A_{11}$, $A_{12}$, $A_{15}$, $A_{16}$
and $A_{23}$. Also $A_7 + A_8 = 759$ and $A_{11} + A_{12} = 2576$. By
Exercise 9.4(a), $A_7 = 253$ and so $A_8 = 506$. Since $\mathbf{1} \in G_{23}$,
we have $A_{11} = A_{12} = 1288$, $A_{15} = 506$, and $A_{16} = 253$. So
$W_{G_{23}}(z) = 1 + 253z^7 + 506z^8 + 1288z^{11} + 1288z^{12} + 506z^{15} + 253z^{16} + z^{23}$.
13.13 Straightforward, though tedious, calculation. In the
MacWilliams identity, equate the coefficients of 1, $z^2$ and
$z^4$ to get three equations in three unknowns $A_4$, $A_8$ and
$A_{12}$, which may be solved to give $A_4 = 0$, $A_8 = 759$,
$A_{12} = 2576$.

Chapter 14

14.1 No; Exercises 9.9 and 9.11 show that $A_2(15, 5) = 256$,
$B_2(15, 5) = 128$.
14.2 (i) $V(n, q)$ is an $[n, n, 1]$-code.
(ii) $C = \{x_1 x_2 \cdots x_n \mid x_1 + x_2 + \cdots + x_n = 0\}$ is an
$[n, n - 1, 2]$-code. Since there cannot exist an
$[n, n, 2]$-code, we have $B_q(n, 2) = q^{n-1}$.

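14.3 By Theorem 14.4, there exists an $[n, n - r, 3]$-code over
GF(q) $\Leftrightarrow n \le (q^r - 1)/(q - 1)$
$\Leftrightarrow r \ge \log_q \{n(q - 1) + 1\}$
$\Leftrightarrow n - r \le n - \log_q \{n(q - 1) + 1\}$.
So $B_q(n, 3) = q^{[n - \log_q \{n(q - 1) + 1\}]}$.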
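The chain of equivalences in 14.3 amounts to finding the least redundancy r with $n \le (q^r - 1)/(q - 1)$. A minimal sketch, assuming the square brackets in the final formula denote the integer part, and using integer arithmetic to avoid rounding problems with logarithms (the function name is illustrative):

```python
def largest_linear_code_size_d3(n, q):
    """Return B_q(n, 3) = q^(n - r), where r is the least integer with
    n <= (q^r - 1) / (q - 1)."""
    if n < 3 or q < 2:
        raise ValueError("need n >= 3 and q >= 2")
    r = 1
    while (q ** r - 1) // (q - 1) < n:
        r += 1
    return q ** (n - r)


for n, q in [(7, 2), (8, 2), (15, 2), (4, 3), (5, 3), (13, 3)]:
    print(n, q, largest_linear_code_size_d3(n, q))
```

The lengths 7 and 15 over GF(2), and 4 and 13 over GF(3), are Hamming-code lengths, so the bound is met by Ham (r, q) itself; one more coordinate forces an extra check digit.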
14.4 Let $t$ be the number of planes in which a given line $L$ lies.
Counting in two ways the number of members of the set
$\{(P, \pi) \mid P$ is a point not on $L$, $\pi$ is a plane containing
both $L$ and $P\}$ gives $q^3 + q^2 + q + 1 - (q + 1) = t[q^2 + q + 1 - (q + 1)]$,
whence $t = q + 1$.
14.5 The Golay code $G_{11}$ is a ternary [11, 6, 5]-code, showing
that $\max_4(5, 3) \ge 11$. If $\max_4(5, 3)$ were $\ge 12$, then there
would exist a ternary [12, 7, 5]-code, contradicting the
sphere-packing bound.
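The appeal to the sphere-packing bound in 14.5 is a short calculation. The sketch below assumes the usual form of the bound, $q^k \sum_{i \le t} \binom{n}{i}(q-1)^i \le q^n$ with $t = \lfloor (d-1)/2 \rfloor$; the function names are illustrative.

```python
from math import comb


def sphere_volume(n, t, q):
    """Number of vectors within Hamming distance t of a fixed vector in V(n, q)."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))


def satisfies_sphere_packing(n, k, d, q):
    """True if an [n, k, d]-code over GF(q) is not excluded by the bound."""
    t = (d - 1) // 2
    return q ** k * sphere_volume(n, t, q) <= q ** n


print(satisfies_sphere_packing(11, 6, 5, 3))  # ternary Golay parameters
print(satisfies_sphere_packing(12, 7, 5, 3))  # hypothetical [12, 7, 5]-code
```

For the Golay parameters the bound holds with equality (the code is perfect), while the [12, 7, 5] parameters violate it.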
14.6 Use Theorem 14.18. Since 2 is a non-square in GF(5),
the $4 \times 26$ matrix whose columns are $(0, 0, 0, 1)^T$ and
$(x, y, 1, x^2 - 2y^2)^T$, for $(x, y) \in V(2, 5)$, is the parity-check
matrix of a [26, 22, 4]-code.
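The matrix described in 14.6 is small enough to verify exhaustively. The following sketch builds the 26 columns over GF(5) and checks that every three are linearly independent (so d is at least 4) and that some four are dependent (so d is exactly 4). The helper `rank_mod_p` is an illustrative Gaussian elimination, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
from itertools import combinations, product

P = 5  # field size
B = 2  # a non-square in GF(5)


def rank_mod_p(vectors, p):
    """Rank over GF(p) of a list of equal-length integer vectors."""
    rows = [list(v) for v in vectors]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank


columns = [(0, 0, 0, 1)] + [
    (x, y, 1, (x * x - B * y * y) % P) for x, y in product(range(P), repeat=2)
]

all_triples_independent = all(
    rank_mod_p(triple, P) == 3 for triple in combinations(columns, 3)
)
some_four_dependent = any(
    rank_mod_p(quad, P) < 4 for quad in combinations(columns, 4)
)
print(len(columns), all_triples_independent, some_four_dependent)
```

The check over all 2600 triples is the cap property in coordinates: no three of the 26 points are collinear.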
14.7 (i) By Theorem 14.16, a plane can contain $q + 2$ points
of a cap.
(ii) By Exercise 3.14, if $q$ is even, then every element of
GF(q) is a square. [Remark: a version of Theorem
14.18 does hold for $q$ even, with an elliptic quadric
specified in a different way.]
14.8 Let $H$ be the parity-check matrix whose columns form the
$(q^2 + 1)$-cap defined in (14.19). Label the column $(0001)^T$
by $\infty$ and each column $(x, y, 1, x^2 - by^2)^T$ by $(x, y)$. A
decoding algorithm is the following. Calculate the syndrome
$\mathbf{s} = \mathbf{y}H^T = s_1 s_2 s_3 s_4$. If $\mathbf{s} = \mathbf{0}$, assume no errors. If
$\mathbf{s} \neq \mathbf{0}$, calculate $\delta = s_3 s_4 - s_1^2 + b s_2^2$. If $\delta = 0$ and $s_3 = 0$,
assume an error of magnitude $s_4$ in position $\infty$. If $\delta = 0$
and $s_3 \neq 0$, assume an error of magnitude $s_3$ in position
$(s_1/s_3, s_2/s_3)$. If $\delta \neq 0$, then there are $\ge 2$ errors.
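The decoding steps of 14.8 translate directly into code. This is a hedged sketch for the particular case q = 5, b = 2 (the code of Exercise 14.6); the column ordering, the label `"inf"` for the position ∞, and the function names are choices made here for illustration.

```python
from itertools import product

P = 5  # field size q
B = 2  # the non-square b

LABELS = ["inf"] + list(product(range(P), repeat=2))


def column(label):
    """Column of H carrying the given label."""
    if label == "inf":
        return (0, 0, 0, 1)
    x, y = label
    return (x, y, 1, (x * x - B * y * y) % P)


H_COLUMNS = [column(label) for label in LABELS]


def syndrome(y):
    """s = y H^T as a 4-tuple over GF(P)."""
    return tuple(
        sum(yi * col[r] for yi, col in zip(y, H_COLUMNS)) % P for r in range(4)
    )


def decode(y):
    """Return (corrected vector, message); the vector is None if >= 2 errors."""
    s1, s2, s3, s4 = syndrome(y)
    if (s1, s2, s3, s4) == (0, 0, 0, 0):
        return list(y), "no errors"
    delta = (s3 * s4 - s1 * s1 + B * s2 * s2) % P
    if delta != 0:
        return None, "at least two errors"
    if s3 == 0:
        label, magnitude = "inf", s4
    else:
        inv = pow(s3, -1, P)  # Python 3.8+
        label, magnitude = ((s1 * inv) % P, (s2 * inv) % P), s3
    position = LABELS.index(label)
    corrected = list(y)
    corrected[position] = (corrected[position] - magnitude) % P
    return corrected, f"error of magnitude {magnitude} at {label}"


received = [0] * 26
received[LABELS.index((2, 4))] = 3  # zero codeword with one error
print(decode(received))
```

The algorithm corrects any single error and flags double errors, as one would expect of a code with minimum distance 4.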
14.9 If $\{x_1, x_2, \ldots, x_{10}\}$ is a 10-cap in PG(3, 3) then the set
$\{(x_1 \mid 0), (x_2 \mid 0), \ldots, (x_{10} \mid 0), (x_1 \mid 1), (x_2 \mid 1), \ldots, (x_{10} \mid 1)\}$
is a 20-cap in PG(4, 3).
14.10 For a given $t$-space, the number of ways of choosing an
extra point of PG(m, q) to generate a $(t + 1)$-space is
$$\frac{q^{m+1} - 1}{q - 1} - \frac{q^{t+1} - 1}{q - 1}.$$
Many of these extra points generate the same $(t + 1)$-space,
and so we must divide by
$$\frac{q^{t+2} - 1}{q - 1} - \frac{q^{t+1} - 1}{q - 1},$$
the number of points in such a $(t + 1)$-space not lying in
the given $t$-space.
(i) 40, (ii) 13, (iii) 4.
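The quotient in 14.10 is easy to evaluate. The sketch below assumes that parts (i) to (iii) refer to PG(5, 3) with t = 1, 2 and 3 respectively; this is an inference from the "40 planes through L" used in Exercise 14.11 rather than something stated in the solution itself. The function names are illustrative.

```python
def points_in_space(dim, q):
    """Number of points of a projective space of dimension dim over GF(q)."""
    return (q ** (dim + 1) - 1) // (q - 1)


def spaces_through(t, m, q):
    """Number of (t+1)-spaces of PG(m, q) containing a given t-space."""
    extra_points = points_in_space(m, q) - points_in_space(t, q)
    points_per_space = points_in_space(t + 1, q) - points_in_space(t, q)
    if extra_points % points_per_space:
        raise ArithmeticError("counts do not divide evenly")
    return extra_points // points_per_space


print([spaces_through(t, 5, 3) for t in (1, 2, 3)])
```

With t = 1 and m = 3 the same function gives q + 1 planes through a line, in agreement with Exercise 14.4.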
14.11 (Bruen and Hirschfeld 1978). Suppose $K$ is a cap in
PG(5, 3). We shall show that $|K| \le 56$. We may assume
some plane $\pi$ meets $K$ in four points, for otherwise
$|K| \le 42$ (two points on some line $L$ plus at most one
further point on each of the 40 planes through $L$).
Similarly, we may assume some 3-space contains at least
8 points of $K$, for otherwise $|K| \le 4 + 3 \cdot 13 = 43$. Finally,
since $\max_3(5, 3) = 20$, we have $|K| \le 8 + 4(20 - 8) = 56$.
14.12 $B_3(n, 4) = 3^{n-4}$ for $5 \le n \le 10$, $3^{n-5}$ for $11 \le n \le 20$, $3^{n-6}$
for $21 \le n \le 56$, $3^{n-7}$ for $57 \le n \le 112$. (It is not known
whether $B_3(113, 4) = 3^{106}$ or $3^{105}$.)

Chapter 15

15.1 [J,| A], [| AZ].


15.2 Suppose, for a contradiction, that $\max_{q+2-r}(q + 2 - r, q) \ge q + 2$.
Then there exists a $[q + 2, r, q + 3 - r]$-code
whose dual is a $[q + 2, q + 2 - r, r + 1]$-code, contradicting
$\max_r(r, q) = q + 1$.
15.3 $\max_{q-1}(q - 1, q) \ge q + 2$ by Corollary 15.9. If there existed
a $[q + 3, 4, q]$-code over GF(q), then its dual would
be a $[q + 3, q - 1, 5]$-code, contrary to $\max_4(4, q) = q + 1$.

15.4 t 1111]
1 a, a3
H= L 1 a; a3}.

: 1 dg ag |
15.5 Let $H = [A \mid I]$ be a standard form parity-check matrix for
an $[n, n - r, r + 1]$-code with $n = \max_r(r, q)$. Deleting
the last row and last column of $H$ gives a matrix whose
columns form an $(n - 1, r - 1)$-set in $V(r - 1, q)$ and so
$n - 1 \le \max_{r-1}(r - 1, q)$.
15.6 Let $C$ be an [8, 3, 6]-code over GF(7). By Corollary 15.7,
$C^{\perp}$ is an [8, 5, 4]-code. Let $W_C(z) = \sum A_i z^i$ and
$W_{C^{\perp}}(z) = \sum B_i z^i$. By Theorem 13.6,
$$7^3\left(1 + \sum_{i=4}^{8} B_i z^i\right) = (1 + 6z)^8 + A_6(1 - z)^6(1 + 6z)^2 + A_7(1 - z)^7(1 + 6z) + A_8(1 - z)^8 \qquad (1)$$
Equating coefficients of 1, $z$ and $z^2$ and solving for $A_6$, $A_7$
and $A_8$, gives $W_C(z) = 1 + 168z^6 + 48z^7 + 126z^8$. $W_{C^{\perp}}(z)$
is now easily obtained directly from (1).
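The three linear equations of 15.6 can be set up and solved mechanically. This is a sketch under the assumption that the left-hand side of (1) has coefficients 343, 0, 0 for 1, z and z^2 (because the dual code has minimum distance 4); the small exact solver and the helper names are illustrative.

```python
from fractions import Fraction


def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_pow(base, exponent):
    result = [1]
    for _ in range(exponent):
        result = poly_mul(result, base)
    return result


def term(i, n=8, q=7):
    """Coefficients of (1 - z)^i (1 + (q - 1) z)^(n - i)."""
    return poly_mul(poly_pow([1, -1], i), poly_pow([1, q - 1], n - i))


def solve_linear(matrix, rhs):
    """Exact Gauss-Jordan elimination for a small non-singular system."""
    size = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(v)] for row, v in zip(matrix, rhs)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


unknown_weights = [6, 7, 8]
base = term(0)
matrix = [[term(i)[j] for i in unknown_weights] for j in range(3)]
rhs = [(343 if j == 0 else 0) - base[j] for j in range(3)]
A6, A7, A8 = solve_linear(matrix, rhs)

A = [1, 0, 0, 0, 0, 0, int(A6), int(A7), int(A8)]
total = [sum(A[i] * term(i)[j] for i in range(9)) for j in range(9)]
B_dist = [c // 343 for c in total]  # weight distribution of the dual code
print(A6, A7, A8, B_dist)
```

Once the A_i are known, the right-hand side of (1) divided by 7^3 gives the B_i directly, which is the final step the solution alludes to.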
15.7 For2s<k 11,

1 1 1 10
1 2 :++ 10 00

1 2%! .-- 10*' 01


generates a [12, k, 13 — k]-code.
For k = 11, 1 | generates a [k + 1, k, 2]-code.
1
Le |:

Chapter 16

16.1 If $x \not\equiv 0 \pmod{p}$, then $x^{(p-1)(q-1)+1} = (x^{p-1})^{q-1} x \equiv x \pmod{p}$
by Fermat's theorem. If $x \equiv 0 \pmod{p}$, then
$x^{(p-1)(q-1)+1} \equiv x \pmod{p}$ holds trivially. So $p$ is a factor
of $x^{(p-1)(q-1)+1} - x$ for any integer $x$. Similarly $q$ is also a
factor for any integer $x$. Since $p$ and $q$ are distinct prime
numbers, $pq$ is a factor of $x^{(p-1)(q-1)+1} - x$ for any $x$.
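The congruence proved in 16.1 can be confirmed exhaustively for small primes with Python's built-in modular exponentiation. A minimal sketch (the function name is illustrative):

```python
def identity_holds(p, q):
    """Check x^((p-1)(q-1)+1) == x (mod pq) for every residue x modulo pq."""
    n = p * q
    exponent = (p - 1) * (q - 1) + 1
    return all(pow(x, exponent, n) == x for x in range(n))


print(identity_holds(3, 5), identity_holds(5, 7), identity_holds(11, 13))
print(identity_holds(3, 3))  # p = q: the argument no longer applies
```

The last line illustrates why the primes must be distinct: with p = q = 3 the residue x = 3 already gives a counterexample.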
16.2 LEAVING TOMORROW.
16.3 When the subscriber R (of the text) has encrypted a
message he is to send to S (using S's encryption
algorithm) he signs it with a further message $z$ which he
sends in the form $z^t \pmod{n}$ (i.e. via R's own decrypting
algorithm). The receiver S verifies the signature by
calculating $(z^t)^s \equiv z \pmod{n}$. Only R could have sent the
message, since only R knows $t$.
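The signature scheme of 16.3 can be illustrated with toy numbers. This sketch assumes R's public key is (n, s) and his private exponent is t with st ≡ 1 modulo (p-1)(q-1); the primes 61 and 53, the exponent 17 and the message 65 are arbitrary small values chosen here, far too small for real use. `pow(s, -1, phi)` requires Python 3.8 or later.

```python
p, q = 61, 53                 # R's secret primes (toy values)
n = p * q                     # public modulus
phi = (p - 1) * (q - 1)
s = 17                        # R's public (encrypting) exponent
t = pow(s, -1, phi)           # R's private (decrypting) exponent


def sign(z):
    """R applies his own decrypting algorithm to the signature message z."""
    return pow(z, t, n)


def verify(signature, expected):
    """S applies R's public encrypting algorithm and compares."""
    return pow(signature, s, n) == expected


z = 65
signature = sign(z)
print(t, verify(signature, z), verify((signature + 1) % n, z))
```

Anyone can run `verify`, since it uses only public information, but producing a valid signature requires t.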
16.4 B, C and D are uniquely decodable. B and C are
prefix-free. Average word-lengths of B, C and D are 2,
1.75 and 1.875, respectively. [Remark: It is a consequence
of Shannon's 'source coding theorem' (see, e.g.,
Jones 1979) that the 'source entropy', $-\sum_{i=1}^{4} p_i \log_2 p_i$
(= 1.75 here), gives the smallest possible average word
length. So the above code C here is best possible.]
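The averages and the entropy quoted in 16.4 can be reproduced under explicit assumptions. The exercise statement is not reproduced in this section, so the source probabilities (1/2, 1/4, 1/8, 1/8) and the codeword lengths below are hypothetical values chosen only because they are consistent with the figures 2, 1.75, 1.875 and the entropy 1.75.

```python
from fractions import Fraction
from math import log2

# Assumed source probabilities (not taken from the text).
probabilities = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]

# Assumed codeword lengths for codes B, C and D (not taken from the text).
word_lengths = {
    "B": [2, 2, 2, 2],
    "C": [1, 2, 3, 3],
    "D": [1, 2, 3, 4],
}


def average_length(lengths, probs):
    """Expected codeword length."""
    return sum(p * length for p, length in zip(probs, lengths))


def entropy_bits(probs):
    """Source entropy -sum p_i log2 p_i, in bits."""
    return -sum(float(p) * log2(float(p)) for p in probs)


averages = {name: average_length(l, probabilities) for name, l in word_lengths.items()}
print({name: float(v) for name, v in averages.items()}, entropy_bits(probabilities))
```

Under these assumptions code C attains the entropy exactly, which is the sense in which it is best possible.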
16.5 THE END.
Bibliography

At the end of each entry the number in square brackets gives the
chapter which refers to this entry.

Anderson, I. (1974). A first course in combinatorial mathematics.
Clarendon Press, Oxford. [2, 16]
Assmus, E. F. and Mattson, H. F. (1967). On tactical configurations and
error-correcting codes. J. Comb. Theory 2, 243-57. [9]
Assmus, E. F. and Mattson, H. F. (1969). New 5-designs. J. Comb. Theory 6, 122-51. [9]
Assmus, E. F. and Mattson, H. F. (1974). Coding and combinatorics. SIAM Review 16, 349-88. [9]
Barlotti, A. (1955). Un'estensione del teorema di Segre-Kustaanheimo.
Boll. Un. Mat. Ital. 10, 96-8. [14]
Beker, H. and Piper, F. (1982). Cipher Systems. Van Nostrand
Reinhold, London. [16]
Berlekamp, E. R. (1968). Algebraic coding theory. McGraw-Hill, New
York. [Pref., 11]
——— (1972). Decoding the Golay code. JPL Technical Report 32-1526,
Vol. IX, 81-5. [9]
Best, M. R. (1980). Binary codes with a minimum distance of four.
IEEE Trans. Info. Theory 26, 738-42. [16]
Best, M. R. (1982). A contribution to the nonexistence of perfect codes. Ph.D.
dissertation, University of Amsterdam. [9]
Best, M. R. (1983). Perfect codes hardly exist. IEEE Trans. Info. Theory 29,
349-51. [9]
—— and Brouwer, A. E. (1977). The triply shortened Hamming code is
optimal. Discrete Math. 17, 235-45. [8]
Best, M. R., Brouwer, A. E., MacWilliams, F. J., Odlyzko, A. M., and
Sloane, N. J. A. (1978). Bounds for binary codes of length less than
25. IEEE Trans. Info. Theory 24, 81-92. [16]
Blahut, R. E. (1983). Theory and practice of error control codes.
Addison-Wesley, Reading, Mass. [Pref., 11, 12, 16]
Blake, I. F. and Mullin, R. C. (1976). An introduction to algebraic and
combinatorial coding theory. Academic Press, New York. [Pref.]
Blokhuis, A. and Lam, C. W. H. (1984). More coverings by rook
domains. J. Comb. Theory, Ser. A 36, 240-4. [8]
Bose, R. C. (1947). Mathematical theory of the symmetrical factorial
design. Sankhya 8, 107-166. [14]
Bose, R. C. and Ray-Chaudhuri, D. K. (1960). On a class of error-correcting
binary group codes. Info. and Control 3, 68-79. [11, 12]
Bose, R. C., Shrikhande, S. S., and Parker, E. T. (1960). Further results on the
construction of mutually orthogonal Latin squares and the falsity of
Euler's conjecture. Canad. J. Math. 12, 189-203. [10]
Brinn, L. W. (1984). Algebraic coding theory in the undergraduate
curriculum. American Math. Monthly, 509-13. [Pref.]
Brown, D. A. H. (1974). Some error correcting codes for certain
transposition and transcription errors in decimal integers. Computer
Journal 17, 9-12. [11]
Bruen, A. A. and Hirschfeld, J. W. P. (1978). Application of line
geometry over finite fields. II. The Hermitian surface. Geom. Dedicata
7, 333-53. [14]
Bush, K. A. (1952). Orthogonal arrays of index unity. Ann. Math. Stat.
23, 426-34. [15]
Cameron, P. J. and van Lint, J. H. (1980). Graphs, codes and designs.
London Math. Soc. Lecture Note Series, Vol. 43. Cambridge Univ.
Press, Cambridge. [Pref.]
Casse, L. R. A. (1969). A solution to Beniamino Segre's 'Problem $I_{r,q}$'
for q even. Atti. Accad. Naz. Lincei Rend. 46, 13-20. [15]
Delsarte, P. and Goethals, J.-M. (1975). Unrestricted codes with the
Golay parameters are unique. Discrete Math. 12, 211-24. [9]
Dénes, J. and Keedwell, A. D. (1974). Latin squares and their
applications. Academic Press, New York. [10]
Diffie, W. and Hellman, M. E. (1976). New directions in cryptography.
IEEE Trans. Info. Theory 22, 644-54. [16]
Dornhoff, L. L. and Hohn, F. E. (1978). Applied Modern Algebra.
Macmillan, New York. [16]
Fenton, N. E. and Vamos, P. (1982). Matroid interpretation of maximal
k-arcs in projective spaces. Rend. Mat. (7) 2, 573-80. [14, 15]
Fernandes, H. and Rechtschaffen, E. (1983). The football pool problem
for 7 and 8 matches. J. Comb. Theory, Series A 35, 109-14. [8]
Games, R. A. (1983). The packing problem for projective geometries
over GF(3) with dimension greater than five. J. Comb. Theory, Series
A 35, 126-44. [14]
Gardner, M. (1977). Mathematical games. Scientific American, August,
120-4. [16]
Gibson, I. B. and Blake, I. F. (1978). Decoding the binary Golay code
with miracle octad generators. IEEE Trans. Info. Theory 24, 261-4.
[9]
Gilbert, E. N. (1952). A comparison of signalling alphabets. Bell Syst.
Tech. J. 31, 504-22. [8]
Goethals, J.-M. (1971). On the Golay perfect binary code. J. Comb.
Theory 11, 178-86. [9]
—— (1977). The extended Nadler code is unique. IEEE Trans. Info.
Theory 23, 132-5. [9]
Golay, M. J. E. (1949). Notes on digital coding. Proc. IEEE 37, 657.
[8, 9]
Golay, M. J. E. (1954). Binary coding. Trans IRE PGIT 4, 23-8. [16]
—— (1958). Notes on the penny-weighing problem, lossless symbol
coding with nonprimes, etc. IEEE Trans. Info. Theory 4, 103-9. [9]
Golomb, S. W. and Posner, E. C. (1964). Rook domains, Latin squares,
affine planes, and error-distribution codes. IEEE Trans. Info. Theory
10, 196-208. [9]
Goppa, V. D. (1970). A new class of linear error-correcting codes.
Problems of Info. Transmission 6 (3), 207-12. [11]
Hall, M. (1980). Combinatorial theory. Wiley, New York. [2]
Hamming, R. W. (1950). Error detecting and error correcting codes.
Bell Syst. Tech. J. 29, 147-60. [8]
— (1980). Coding and information theory. Prentice-Hall, New Jersey.
[16]
Hardy, G. H. (1940). A mathematician’s apology. Cambridge University
Press. [11]
Helgert, H. J. and Stinaff, R. D. (1973). Minimum distance bounds for
binary linear codes. IEEE Trans. Info. Theory 19, 344-56. [14]
Hill, R. (1973). On the largest size of cap in $S_{5,3}$. Atti Accad. Naz.
Lincei Rend. 54, 378-84. [14]
——— (1978). Caps and codes. Discrete Math. 22, 111-37. [14]
Hirschfeld, J. W. P. (1979). Projective geometries over finite fields.
Oxford University Press. [14]
— (1983). Maximum sets in finite projective spaces. In Surveys in
combinatorics, LMS Lecture Note Series 82, edited by E. K. Lloyd.
Cambridge University Press, 55-76. [14]
Hocquenghem, A. (1959). Codes correcteurs d’erreurs. Chiffres (Paris)
2, 147-58. [11]
Jones, D. S. (1979). Elementary information theory. Clarendon Press,
Oxford. [16]
Jurick, R. R. (1968). An algorithm for determining the largest maxi-
mally independent set of vectors from an r-dimensional space over a
Galois field of n elements. Tech. Rep. ASD-TR-68-40, Air Force
Systems Command, Wright-Patterson Air Force Base, Ohio. [15]
Kamps, H. J. L. and van Lint, J. H. (1967). The football pool problem
for 5 matches. J. Comb. Theory 3, 315-25. [8]
Levenshtein, V. I. (1964). The application of Hadamard matrices to a
problem in coding. Problems of Cybernetics 5, 166-84. [16]
Levinson, N. (1970). Coding theory: a counterexample to G. H. Hardy’s
conception of applied mathematics. Amer. Math. Monthly 77, 249-58.
[11]
Lidl, R. and Niederreiter, H. (1983). Finite fields. Addison-Wesley, and
(1984) Cambridge University Press. [3]
Lin, S. and Costello, D. J. (1983). Error control coding: fundamentals
and applications. Prentice-Hall, New Jersey. [Pref., 12]
Lindstrom, B. (1969). On group and nongroup perfect codes in q
symbols. Math. Scand. 25, 149-58. [9]
van Lint, J. H. (1975). A survey of perfect codes. Rocky Mountain J. of
Mathematics 5, 199-224. [9]
van Lint, J. H. (1982). Introduction to coding theory. Springer-Verlag, New York.
[Pref., 6, 9, 16]
Lloyd, S. P. (1957). Binary block coding. Bell Syst. Tech. J. 36, 517-35.
[9]
McEliece, R. J. (1977). The theory of information and coding. Addison-
Wesley, Reading, Mass. [Pref., 6, 16]
McEliece, R. J., Rodemich, E. R., Rumsey, H. and Welch, L. R.
(1977). New upper bounds on the rate of a code via the Delsarte-MacWilliams
inequalities. IEEE Trans. Info. Theory 23, 157-66. [16]
Mackenzie, C. and Seberry, J. (1984). Maximal ternary codes and
Plotkin's bound. Ars Combinatoria 17A, 251-70. [2]
MacWilliams, F. J. (1963). A theorem on the distribution of weights in a
systematic code. Bell Syst. Tech. J. 42, 79-94. [13]
—— and Sloane, N. J. A. (1977). The theory of error-correcting codes.
North-Holland, Amsterdam. [Pref., 2, 8, 9, 11-15, 16]
Magliveras, S. S. and Leavitt, D. W. (1983). Simple six-designs exist.
Congressus Numerantium 40, 195-205. [9]
Maneri, C. and Silverman, R. (1966). A vector space packing problem.
J. of Algebra 4, 321-30. [15]
Massey, J. L. (1969). Shift-register synthesis and BCH decoding. IEEE
Trans. Info. Theory 15, 122-27. [11]
Nadler, M. (1962). A 32-point n = 12, d = 5 code. IEEE Trans. Info.
Theory 8, 58. [9]
Nordstrom, A. W. and Robinson, J. P. (1967). An optimum non-linear
code. Info. and Control 11, 613-16. [2]
Pellegrino, G. (1970). Sul massimo ordine delle calotte in $S_{4,3}$. Matematiche
(Catania) 25, 1-9. [14]
Peterson, W. W. and Weldon, E. J. (1972). Error-correcting codes, 2nd
ed. MIT Press, Cambridge, Mass. [Pref., 11, 16]
Phelps, K. T. (1983). A combinatorial construction of perfect codes.
SIAM J. Alg. Disc. Math. 4, 398-403. [9]
Pless, V. (1968). On the uniqueness of the Golay codes. J. Comb.
Theory 5, 215-28. [9]
—— (1982). Introduction to the theory of error-correcting codes. Wiley,
New York. [Pref., 9]
Plotkin, M. (1960). Binary codes with specified minimum distance.
IEEE Trans. Info. Theory 6, 445-450. [2]
Prange, E. (1957). Cyclic error-correcting codes in two symbols.
Technical Note TN-57-103, Air Force Cambridge Research Labs.,
Bedford, Mass. [12]
Qvist, B. (1952). Some remarks concerning curves of the second degree
in a finite plane. Ann. Acad. Sci. Fenn. Ser. A, no. 134. [14]
Ramanujan, S. (1912). Note on a set of simultaneous equations. J.
Indian Math. Soc. 4, 94-6. [11]
Rivest, R. L., Shamir, A., and Adleman, L. (1978). A method for
obtaining digital signatures and public-key cryptosystems. Comm.
ACM 21, 120-6. [16]
Ryser, H. J. (1963). Combinatorial mathematics. Carus Monograph 14,
Math. Assoc. America. [10]
Schönheim, J. (1968). On linear and non-linear single-error-correcting
q-nary perfect codes. Info. and Control 12, 23-6. [9]
Segre, B. (1954). Sulle ovali nei piani lineari finiti. Atti Accad. Naz.
Lincei Rend. 17, 1-2. [14]
—— (1955). Curve razionali normali e k-archi negli spazi finiti. Ann.
Mat. Pura Appl. 39, 357-79. [15]
—— (1961). Lectures on modern geometry. Cremonese, Rome. [15]
Selmer, E. S. (1967). Registration numbers in Norway: some applied
number theory and psychology. Journal of the Royal Statistical
Society, Ser. A 130, 225-31. [11]
Shannon, C. E. (1948). A mathematical theory of communication. Bell
Syst. Tech. J. 27, 379-423 and 623-56. [6]
Singleton, R. C. (1964). Maximum distance q-nary codes. IEEE Trans.
Info. Theory 10, 116-18. [10, 15]
Slepian, D. (1960). Some further theory of group codes. Bell Syst. Tech.
J. 39, 1219-52. [6]
Sloane, N. J. A. (1981). Error-correcting codes and cryptography. In
Klarner, D. A., The mathematical Gardner, Wadsworth, Belmont,
Calif., pp. 346-382. [16]
—— (1982). Recent bounds for codes, sphere packings and related
problems obtained by linear programming and other methods.
Contemporary Mathematics 9, 153-85. [2]
Snover, S. L. (1973). The uniqueness of the Nordstrom-Robinson and
the Golay binary codes. Ph.D. Thesis, Department of Mathematics,
Michigan State Univ. [9]
Stinson, D. R. (1984). A short proof of the nonexistence of a pair of
orthogonal Latin squares of order six. J. Comb. Theory, Ser. A 36,
373-76. [9]
Tarry, G. (1901). Le problème des 36 officiers. C. R. Acad. Sci. Paris 2,
170-203. [9]
Thas, J. A. (1968). Normal rational curves and k-arcs in Galois spaces.
Rend. Mat. 1, 331-4. [15]
Thas, J. A. (1969). Connection between the Grassmannian $G_{k-1;n}$ and the set
of the k-arcs of the Galois space $S_{n,q}$. Rend. Mat. 2, 121-34. [15]
Tietäväinen, A. (1973). On the nonexistence of perfect codes over finite
fields. SIAM J. Appl. Math. 24, 88-96. [9]
— (1980). Bounds for binary codes just outside the Plotkin range.
Info. and Control 47, 85-93. [2]
Varshamov, R. R. (1957). Estimate of the number of signals in error
correcting codes. Dokl. Akad. Nauk SSSR 117, 739-41. [8]
Vasil’ev, J. L. (1962). On nongroup closepacked codes. Probl. Kibernat.
8, 337-39. (In Russian), translated in Probleme der Kybernetik 8
(1965), 375-78. [9]
Verhoeff, J. (1969). Error detecting decimal codes. Mathematical Centre
Tracts 29, Mathematisch Centrum, Amsterdam. [11]
Verhoeff, T. (1985). Updating a table of bounds on the minimum
distance of binary linear codes. Eindhoven University of Technology
Report 85-WSK-01. [14]
Weber, E. W. (1983). On the football pool problem for 6 matches: a
new upper bound. J. Comb. Theory, Ser. A 35, 106-8. [8]
Zinov’ev, V. A. and Leont’ev, V. K. (1973). The nonexistence of
perfect codes over Galois fields. Problems of Control and Info. Theory
2, 123-32. [9]
Algebraic coding theory is a new and rapidly developing
subject, motivated by immediate practical applications
but also fascinatingly rich in mathematical structure. It is
not surprising that the subject is becoming increasingly
taught in undergraduate courses in mathematics,
engineering, and computer science.

This book provides an elementary, yet rigorous, introduction
to the theory of error-correcting codes. It is
based on courses given by the author over several years
to students ranging from second-year undergraduates to
first-year postgraduates.

The large number of exercises, all with solutions, makes
the text highly suitable for self-learning.
