
EE 387, John Gill, Stanford University Notes #3, October 15, Handout #10

Linear block codes and group codes


Definition: A linear block code over a field F is a linear subspace of F^n, where n is the blocklength of the code.
Definition: A group code is a subgroup of the n-tuples over an additive group.
Facts about linear block codes and group codes.
• In a group code the sum and difference of codewords are codewords.
• In a linear block code, also scalar multiples of codewords are codewords.
• Every linear block code is a group code, but not conversely.
• A binary group code is a linear block code because the only scalars are 0 and 1.
• Parity-check codes are linear block codes over GF(2).
Every PC code is defined by a set of homogeneous binary equations.
• If C is a LBC over GF(Q) of dimension k, then its rate is R = (log_Q Q^k)/n = k/n .


Minimum weight
The Hamming weight w_H(v) is the number of nonzero components of v.
Obvious facts:
• w_H(v) = d_H(0, v)
• d_H(v_1, v_2) = w_H(v_1 − v_2) = w_H(v_2 − v_1)
• w_H(v) = 0 if and only if v = 0
Definition: The minimum (Hamming) weight of a block code is the weight of the nonzero codeword with smallest weight:
w_min = w∗ = min{ w_H(c) : c ∈ C, c ≠ 0 }
Examples of minimum weight:
• Simple parity-check codes: w∗ = 2.
• Repetition codes: w∗ = n.
• (7,4) Hamming code: w∗ = 3. (There are 7 codewords of weight 3.)
Weight enumerator: A(x) = 1 + 7x^3 + 7x^4 + x^7 (verified by the sketch after this list).
• Simple product code: w∗ = 4.
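As a quick check (an illustrative sketch, not part of the original notes), the weight enumerator of the (7,4) Hamming code can be verified by brute force. The generator matrix below is the cyclic one that appears a few pages later; any (7,4) Hamming generator matrix gives the same distribution.

from itertools import product

# Brute-force the weight distribution of the (7,4) Hamming code.
G = [[1, 1, 0, 1, 0, 0, 0],
     [0, 1, 1, 0, 1, 0, 0],
     [1, 1, 1, 0, 0, 1, 0],
     [1, 0, 1, 0, 0, 0, 1]]

dist = [0] * 8                      # dist[w] = number of codewords of weight w
for m in product([0, 1], repeat=4):
    c = [sum(m[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]
    dist[sum(c)] += 1

print(dist)   # [1, 0, 0, 7, 7, 0, 0, 1]  ->  A(x) = 1 + 7x^3 + 7x^4 + x^7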
Minimum distance = minimum weight
Theorem: For every linear block code, d∗ = w∗.
Proof: We show that w∗ ≥ d∗ and w∗ ≤ d∗.
(≥) Let c_0 be a nonzero minimum-weight codeword. The 0 vector is a codeword, so
w∗ = w_H(c_0) = d_H(0, c_0) ≥ d∗ .

(≤) Let c_1 ≠ c_2 be two closest codewords. Then c_1 − c_2 is a nonzero codeword, so

d∗ = d_H(c_1, c_2) = w_H(c_1 − c_2) ≥ w∗ .

Combining these two inequalities, we obtain d∗ = w∗.


It is easier to find minimum weight than minimum distance because the weight
minimization considers only a single parameter.
Computer search: test all vectors of weight 1, 2, 3, . . . until a codeword is found.
It is also easier to determine the weight distribution of a linear code than the
distance distribution of a general code.

The result d∗ = w∗ holds for group codes, since the proof used only subtraction.

Generator matrix
Definition: A generator matrix for a linear block code C of blocklength n and
dimension k is any k × n matrix G whose rows form a basis for C.
Every codeword is a linear combination of the rows of a generator matrix G:
c = mG = m_0 g_0 + m_1 g_1 + · · · + m_{k−1} g_{k−1} ,
where g_0, g_1, . . . , g_{k−1} are the rows of G and m = [ m_0 m_1 · · · m_{k−1} ].


Since G has rank k, the representation of c is unique.
Each component of c is the inner product of m with the corresponding column of G:
c_j = m_0 g_{0,j} + m_1 g_{1,j} + · · · + m_{k−1} g_{k−1,j} .

Both sets of equations can be used for encoding. In either case, each codeword
symbol requires k multiplications by constants and k − 1 additions.
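To make the two encoding forms concrete, here is a minimal sketch (an illustration, not from the notes) over GF(2), using the (7,4) cyclic Hamming generator matrix from the example on the next page.

# Both encoding forms over GF(2): row combination and per-component inner products.
G = [[1, 1, 0, 1, 0, 0, 0],
     [0, 1, 1, 0, 1, 0, 0],
     [1, 1, 1, 0, 0, 1, 0],
     [1, 0, 1, 0, 0, 0, 1]]

def encode_rows(m, G):
    """c = m0*g0 + m1*g1 + ... : combine the rows of G selected by m."""
    c = [0] * len(G[0])
    for mi, gi in zip(m, G):
        c = [(cj + mi * gij) % 2 for cj, gij in zip(c, gi)]
    return c

def encode_cols(m, G):
    """c_j = <m, column j of G>: k multiplications and k-1 additions per symbol."""
    return [sum(m[i] * G[i][j] for i in range(len(m))) % 2
            for j in range(len(G[0]))]

m = [1, 0, 1, 1]
print(encode_rows(m, G) == encode_cols(m, G))   # True; both compute mG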



Parity-check matrix
Definition: The dual code of C is the orthogonal complement C^⊥.
Definition: A parity-check matrix for a linear block code C is any r × n matrix H whose rows span the orthogonal complement C^⊥. Obviously, r ≥ n − k.
Example: G and H for the (5, 4) simple parity-check code:

G = [ 1 1 0 0 0 ]
    [ 0 1 1 0 0 ]   ⇒   H = [ 1 1 1 1 1 ]
    [ 0 0 1 1 0 ]
    [ 0 0 0 1 1 ]

(H is the generator matrix for the (5, 1) repetition code — the dual code.)
Example: G and H for (7, 4) cyclic Hamming code.
G = [ 1 1 0 1 0 0 0 ]        H = [ 1 0 0 1 0 1 1 ]
    [ 0 1 1 0 1 0 0 ]   ⇒       [ 0 1 0 1 1 1 0 ]
    [ 1 1 1 0 0 1 0 ]            [ 0 0 1 0 1 1 1 ]
    [ 1 0 1 0 0 0 1 ]

A cyclic code is a linear block code such that the cyclic shift of every codeword is also a codeword. It is not obvious by
inspection that this property holds for the code generated by G.

Codewords are defined by H


Theorem: If C is an (n, k) linear block code with parity-check matrix H, then an n-tuple c is a codeword if and only if cH^T = 0.
Proof:
(⇒) Suppose c is a codeword.
Each component of cH^T is the inner product of c and a column of H^T, which is a row of H.
Since every row of H is in C^⊥, each row is orthogonal to c.
Thus each component of cH^T is c · h_i = 0.
(⇐) Since the rows of H span C^⊥, any n-tuple satisfying cH^T = 0 belongs to the orthogonal complement of C^⊥.
By the Dimension Theorem (Blahut Theorem 2.5.10), C^⊥⊥ = C.
Therefore if cH^T = 0 then c belongs to C.
(⇔) Thus C consists of exactly the vectors satisfying the check equations cH^T = 0.
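A minimal sketch of the membership test (illustrative code, not from the notes; H is the (7,4) cyclic Hamming parity-check matrix from the earlier example):

# Test codeword membership via c H^T = 0 over GF(2).
H = [[1, 0, 0, 1, 0, 1, 1],
     [0, 1, 0, 1, 1, 1, 0],
     [0, 0, 1, 0, 1, 1, 1]]

def is_codeword(c, H):
    """Return True iff every check equation c . h_i = 0 holds (mod 2)."""
    return all(sum(ci * hi for ci, hi in zip(c, row)) % 2 == 0 for row in H)

print(is_codeword([1, 1, 0, 1, 0, 0, 0], H))   # True: a row of G
print(is_codeword([1, 1, 1, 1, 0, 0, 0], H))   # False: single bit flipped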



Generator vs. parity-check matrices
Usually we choose H to consist of n − k independent rows, so H is (n − k) × n.
Sometimes it is convenient or elegant to use a parity-check matrix with redundant
rows (for example, binary BCH codes, to be discussed later).
Each row of H corresponds to an equation satisfied by all codewords.
Since each row of G is a codeword, for any parity-check matrix H,
G_{k×n} (H_{r×n})^T = 0_{k×r}   (r ≥ n − k)
Each 0 in 0_{k×r} corresponds to one codeword and one equation.
Conversely, if GH^T = 0 and rank H = n − k, then H is a parity-check matrix.
How do we find H from G?
We could find H from G by finding n − k linearly independent solutions of the linear equation GH^T = 0.
The equations GH^T = 0 are easy to solve when G is systematic.


Systematic generator matrices


Definition: A systematic generator matrix is of the form

G = [ P | I_k ] = [ p_{0,0}    · · ·  p_{0,n−k−1}    1 0 · · · 0 ]
                  [ p_{1,0}    · · ·  p_{1,n−k−1}    0 1 · · · 0 ]
                  [    ·                  ·              · · ·   ]
                  [ p_{k−1,0}  · · ·  p_{k−1,n−k−1}  0 0 · · · 1 ]

Advantages of systematic generator matrices:


• Message symbols appear unscrambled in each codeword in the rightmost
positions n − k, . . . , n − 1.
• Encoder complexity is reduced; only check symbols need be computed:
c_j = m_0 g_{0,j} + m_1 g_{1,j} + · · · + m_{k−1} g_{k−1,j}   (j = 0, . . . , n − k − 1)

• Check symbol encoder equations easily yield parity-check equations:


c_j − c_{n−k} g_{0,j} − c_{n−k+1} g_{1,j} − · · · − c_{n−1} g_{k−1,j} = 0   (m_i = c_{n−k+i})

• Systematic parity-check matrix is easy to find: H = [ I_{n−k} | −P^T ] .


Systematic parity-check matrix
Given a k × n systematic generator matrix

G = [ P | I_k ] = [ p_{0,0}    · · ·  p_{0,n−k−1}    1 0 · · · 0 ]
                  [ p_{1,0}    · · ·  p_{1,n−k−1}    0 1 · · · 0 ]
                  [    ·                  ·              · · ·   ]
                  [ p_{k−1,0}  · · ·  p_{k−1,n−k−1}  0 0 · · · 1 ] ,

the corresponding (n − k) × n systematic parity-check matrix is

H = [ I_{n−k} | −P^T ] = [ 1 0 · · · 0   −p_{0,0}      · · ·  −p_{k−1,0}     ]
                         [ 0 1 · · · 0   −p_{0,1}      · · ·  −p_{k−1,1}     ]
                         [   · · ·            ·                    ·         ]
                         [ 0 0 · · · 1   −p_{0,n−k−1}  · · ·  −p_{k−1,n−k−1} ]
(The minus signs are not needed for fields of characteristic 2, i.e., GF(2m).)
Each row of H corresponds to an equation satisfied by all codewords.
These equations simply tell how to compute the check symbols c_0, . . . , c_{n−k−1} in terms of the information symbols c_{n−k}, . . . , c_{n−1} .
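The construction H = [ I_{n−k} | −P^T ] is mechanical. Here is a sketch over GF(q) (illustrative code, not from the notes; the test case is the (6,3) binary code used in the standard-array example later in these notes):

# Build the systematic parity-check matrix H = [ I_{n-k} | -P^T ]
# from a systematic generator matrix G = [ P | I_k ] over GF(q).
def systematic_H(G, q):
    k, n = len(G), len(G[0])
    r = n - k
    P = [row[:r] for row in G]                # left block of G
    H = [[0] * n for _ in range(r)]
    for j in range(r):
        H[j][j] = 1                           # identity block I_{n-k}
        for i in range(k):
            H[j][r + i] = (-P[i][j]) % q      # -P^T (the sign is moot when q = 2)
    return H

G = [[0, 1, 1, 1, 0, 0],
     [1, 0, 1, 0, 1, 0],
     [1, 1, 0, 0, 0, 1]]
for row in systematic_H(G, 2):
    print(row)   # rows [1 0 0 | 0 1 1], [0 1 0 | 1 0 1], [0 0 1 | 1 1 0]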


Minimum weight and columns of H


Since cH^T = 0 for every codeword c = (c_0, c_1, . . . , c_{n−1}), every nonzero codeword determines a linear dependence among a subset of the rows of H^T. Thus
(cH^T)^T = Hc^T = c_0 h_0 + c_1 h_1 + · · · + c_{n−1} h_{n−1} = 0
is a linear dependence among a subset of the columns of H.
Theorem: The minimum weight of a linear block code is the smallest number of
linearly dependent columns of any parity-check matrix.
Proof : Each linearly dependent subset of w columns corresponds to a codeword of
weight w.
A set of columns of H is linearly dependent if one column is a linear combination
of the other columns.
• A LBC has w∗ ≤ 2 iff one column of H is a multiple of another column.
• Binary Hamming codes have w∗ = 3 because the columns of H are nonzero and distinct.

The Big Question: how to find H such that every 4 (or 6 or more) columns are linearly independent, i.e., d∗ ≥ 5 (or 7 or more)?



Computing minimum weight
The rank of H is the maximum number of linearly independent columns.
It can be determined in time O(n^3) using linear operations — Gaussian elimination.
The minimum distance is the smallest number of linearly dependent columns.
Finding the minimum distance is difficult (NP-hard). We might have to look at
large numbers of subsets of columns.
Solution: design codes whose minimum distance can be proven to have desired
lower bounds.
The dimension of the column space of H is n − k. Thus any n − k + 1 columns are linearly dependent. Therefore
d∗ = w∗ ≤ n − k + 1
for any linear block code. This is known as the Singleton bound.
Exercise: Show that the Singleton bound holds for all (n, k) block codes, not just
linear codes.
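Since d∗ = w∗ for linear codes, a brute-force search over the q^k − 1 nonzero codewords computes the minimum distance exactly, though only for small k. An illustrative sketch (not from the notes), using the (7,4) cyclic Hamming generator matrix from an earlier example:

from itertools import product

# Exhaustive minimum-weight search: exponential in k, feasible only for small codes.
G = [[1, 1, 0, 1, 0, 0, 0],
     [0, 1, 1, 0, 1, 0, 0],
     [1, 1, 1, 0, 0, 1, 0],
     [1, 0, 1, 0, 0, 0, 1]]

def min_weight(G):
    k, n = len(G), len(G[0])
    best = n
    for m in product([0, 1], repeat=k):
        if any(m):   # skip the zero codeword
            c = [sum(m[i] * G[i][j] for i in range(k)) % 2 for j in range(n)]
            best = min(best, sum(c))
    return best

print(min_weight(G))   # 3, as expected for a Hamming code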


Maximum distance separable codes


Codes that achieve the Singleton bound are called maximum-distance separable (MDS) codes.
The repetition code satisfies the Singleton bound with equality:
d∗ = n = (n − 1) + 1 = (n − k) + 1
Another class of MDS codes consists of the simple parity-check codes:
d∗ = 2 = 1 + 1 = (n − k) + 1
The best known nonbinary MDS codes are the Reed-Solomon codes over GF(Q).
The RS code parameters are
(n, k, d∗) = (Q − 1, Q − d∗, d∗) ⇒ n − k = d∗ − 1 .

Exercise: Show that the repetition codes and the simple parity-check codes are the
only nontrivial binary MDS codes.



Linear block codes: review
• An (n, k) linear block code is a k-dimensional subspace of F^n, where F is a finite field.
Sums, differences, and scalar multiples of codewords are also codewords.
• A group code over an additive group G is closed under sum and difference.
• An (n, k) LBC over F = GF(q) has M = q^k codewords and rate k/n .
• A linear block code C can be defined by two matrices.
◦ Generator matrix G: rows of G are a basis for C, i.e., C = {mG : m ∈ F^k}
◦ Parity-check matrix H: rows of H span C^⊥, hence C = {c ∈ F^n : cH^T = 0}
• The Hamming weight of an n-tuple is the number of nonzero components.
• The minimum weight w∗ of a block code is the Hamming weight of the nonzero
codeword of minimum weight.
• The minimum distance of every LBC equals the minimum weight: d∗ = w∗.
• The minimum weight of a linear block code is the smallest number of linearly
dependent columns of any parity-check matrix.

Syndrome decoding
Linear block codes are much simpler than general block codes:
• Encoding is vector-matrix multiplication.
(Cyclic codes are even simpler — polynomial multiplication/division is used.)
• Decoding is inherently nonlinear. Fact: linear decoders are very weak.
However, several steps in the decoding process are linear:
◦ syndrome computation
◦ final correction after error pattern and location have been found
◦ extracting the estimated message from the estimated codeword
Definition: The error vector or error pattern e is the difference between the
received n-tuple r and the transmitted codeword c:

e = r−c ⇒ r = c+e

Note: The physical noise model may not be additive noise, and the probability distribution for the error e may depend
on the data c. We assume a channel error model determined by P(e).
Syndrome decoding (2)
Multiply both sides of the equation r = c + e by H^T:

s = rH^T = (c + e)H^T = cH^T + eH^T = 0 + eH^T = eH^T .

The syndrome of the senseword r is defined to be s = rH^T.


The syndrome of r (known to receiver) equals the syndrome of the error pattern e
(not known to receiver but must be estimated).
Decoding consists of finding the most plausible error pattern e such that
eH^T = s = rH^T .
“Plausible” depends on the error characteristics:
• For binary symmetric channel, most plausible means smallest number of bit
errors. Decoder picks error pattern of smallest weight satisfying eH^T = s.
• For bursty channels, error patterns are plausible if the symbol errors are close
together.


Syndrome decoding (3)


Syndrome table decoding consists of these steps:
1. Calculate syndrome s = rH^T of received n-tuple.
2. Find most plausible error pattern e with eH^T = s.
3. Estimate transmitted codeword: ĉ = r − e.
4. Determine message m̂ from the encoding equation ĉ = m̂G.
Only step 2 requires nonlinear operations.

For small values of n − k, lookup tables can be used for step 2.
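A sketch of such a lookup table for the (7,4) cyclic Hamming code from the earlier examples (illustrative code, not from the notes; here n − k = 3, so the table has only 2^3 = 8 entries):

from itertools import product

# Syndrome table decoding: map each syndrome to its minimum-weight error pattern.
H = [[1, 0, 0, 1, 0, 1, 1],
     [0, 1, 0, 1, 1, 1, 0],
     [0, 0, 1, 0, 1, 1, 1]]
n = 7

def syndrome(v, H):
    return tuple(sum(vi * hi for vi, hi in zip(v, row)) % 2 for row in H)

# Scan error patterns in order of increasing weight, so the first pattern
# stored for each syndrome has minimum weight (the most plausible one on a BSC).
table = {}
for e in sorted(product([0, 1], repeat=n), key=sum):
    table.setdefault(syndrome(e, H), e)

def decode(r):
    e = table[syndrome(r, H)]                        # step 2
    return [(ri - ei) % 2 for ri, ei in zip(r, e)]   # step 3: c-hat = r - e

c = [1, 1, 0, 1, 0, 0, 0]     # a codeword (first row of G)
r = c[:]; r[5] ^= 1           # one bit error
print(decode(r) == c)         # True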


For BCH and Reed-Solomon codes, the error locations are the zeroes of certain polynomials over the channel alphabet.
The coefficients of these error-locator polynomials are found by solving linear equations determined by the syndrome.
Challenge: find, then solve, the polynomials.
Step 4 is not needed for systematic encoders, since m̂ = ĉ[n−k : n−1].
Syndrome decoding: example
An (8, 4) binary linear block code C is defined by systematic matrices:

H = [ 1 0 0 0 | 0 1 1 1 ]        G = [ 0 1 1 1 | 1 0 0 0 ]
    [ 0 1 0 0 | 1 0 1 1 ]   ⇒       [ 1 0 1 1 | 0 1 0 0 ]
    [ 0 0 1 0 | 1 1 0 1 ]            [ 1 1 0 1 | 0 0 1 0 ]
    [ 0 0 0 1 | 1 1 1 0 ]            [ 1 1 1 0 | 0 0 0 1 ]
Consider two possible messages:
m1 = [ 0 1 1 0 ] m2 = [ 1 0 1 1 ]
c1 = [ 0 1 1 0 0 1 1 0 ] c2 = [ 0 1 0 0 1 0 1 1 ]
Suppose error pattern e = [ 0 0 0 0 0 1 0 0 ] is added to both codewords.
r1 = [ 0 1 1 0 0 0 1 0 ] r2 = [ 0 1 0 0 1 1 1 1 ]
s1 = [ 1 0 1 1 ] s2 = [ 1 0 1 1 ]
The syndromes are the same and equal column 5 of H (numbering from 0), so the decoder corrects bit 5.

C is an expanded Hamming code with weight enumerator A(x) = 1 + 14x4 + x8 .



Standard array
Syndrome table decoding can also be described using the standard array.
The standard array of a group code C is the coset decomposition of F n with
respect to the subgroup C.
0     c_2         c_3         · · ·   c_M
e_2   c_2 + e_2   c_3 + e_2   · · ·   c_M + e_2
e_3   c_2 + e_3   c_3 + e_3   · · ·   c_M + e_3
 ·        ·           ·       · · ·       ·
e_N   c_2 + e_N   c_3 + e_N   · · ·   c_M + e_N

1. The first row is the code C, with the zero vector in the first column.
2. Every other row is a coset.
3. The n-tuple in the first column of a row is called the coset leader.
We usually choose the coset leader to be the most plausible error pattern, e.g., the error pattern of smallest weight.
Standard array: decoding
An (n, k) LBC over GF(Q) has M = Q^k codewords.
Every n-tuple appears exactly once in the standard array. Therefore the number of rows N satisfies
MN = Q^n ⇒ N = Q^{n−k} .
All vectors in a row of the standard array have the same syndrome.
Thus there is a one-to-one correspondence between the rows of the standard array and the Q^{n−k} syndrome values.
Decoding using the standard array is simple: decode senseword r to the codeword
at the top of the column that contains r.
The decoding region for a codeword is the column headed by that codeword.
The decoder subtracts the coset leader from the received vector to obtain the
estimated codeword.


Standard array and decoding regions


[Figure: the standard array grouped by coset-leader weight. The top row (leader 0) is the set of codewords; below it are rows whose coset leaders form shells of radius 1, 2, . . . , t around 0; the remaining rows have coset leaders of weight greater than t.]



Standard array: example
The systematic generator and parity-check matrices for a (6, 3) LBC are

G = [ 0 1 1 | 1 0 0 ]        H = [ 1 0 0 | 0 1 1 ]
    [ 1 0 1 | 0 1 0 ]   ⇒       [ 0 1 0 | 1 0 1 ]
    [ 1 1 0 | 0 0 1 ]            [ 0 0 1 | 1 1 0 ]

The standard array has six coset leaders of weight 1 and one of weight 2.
000000 001110 010101 011011 100011 101101 110110 111000
000001 001111 010100 011010 100010 101100 110111 111001
000010 001100 010111 011001 100001 101111 110100 111010
000100 001010 010001 011111 100111 101001 110010 111100
001000 000110 011101 010011 101011 100101 111110 110000
010000 011110 000101 001011 110011 111101 100110 101000
100000 101110 110101 111011 000011 001101 010110 011000
001001 000111 011100 010010 101010 100100 111111 110001

See http://www.stanford.edu/class/ee387/src/stdarray.pl for the short Perl script that generates the above standard
array. This code is a shortened Hamming code.
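The referenced script is in Perl; the following Python sketch (an illustrative equivalent, not the original script) builds the same standard array. The row and column orderings may differ cosmetically from the table above.

from itertools import product

# Build the standard array of the (6,3) code: take cosets in order of leader weight.
G = [[0, 1, 1, 1, 0, 0],
     [1, 0, 1, 0, 1, 0],
     [1, 1, 0, 0, 0, 1]]
n, k = 6, 3

code = [tuple(sum(m[i] * G[i][j] for i in range(k)) % 2 for j in range(n))
        for m in product([0, 1], repeat=k)]

seen, rows = set(), []
for leader in sorted(product([0, 1], repeat=n), key=sum):   # by weight
    if leader not in seen:
        coset = [tuple((c + e) % 2 for c, e in zip(cw, leader)) for cw in code]
        rows.append(coset)
        seen.update(coset)

for row in rows:
    print(' '.join(''.join(map(str, v)) for v in row))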

Standard array: summary


The standard array is a conceptual arrangement of all n-tuples.
0     c_2         c_3         · · ·   c_M
e_2   c_2 + e_2   c_3 + e_2   · · ·   c_M + e_2
e_3   c_2 + e_3   c_3 + e_3   · · ·   c_M + e_3
 ·        ·           ·       · · ·       ·
e_N   c_2 + e_N   c_3 + e_N   · · ·   c_M + e_N

• The first row is the code C, with the zero vector in the first column.
• Every other row is a coset.
• The n-tuple in the first column of a row is called the coset leader.
• Senseword r is decoded to codeword at top of column that contains r.
• The decoding region for codeword is column headed by that codeword.
• The decoder subtracts coset leader from r to obtain the estimated codeword.
Syndrome decoding: summary
Syndrome decoding is closely connected to standard array decoding.
1. Calculate syndrome s = rH^T of received n-tuple.
2. Find most plausible error pattern e with eH^T = s.
This error pattern is the coset leader of the coset containing r.
3. Estimate transmitted codeword: ĉ = r − e.
The estimated codeword ĉ is the entry at the top of the column containing r in
the standard array.
4. Determine message m̂ from the encoding equation ĉ = m̂G.
In general, m̂ = ĉR, where R is an n × k pseudoinverse of G. If the code is systematic, then R = [ 0_{(n−k)×k} | I_{k×k} ]^T.
Only step 2 requires nonlinear operations and is conceptually the most difficult.

Surprisingly, most computational effort is spent on syndrome computation.


Bounds on minimum distance


The minimum distance of a block code is a conservative measure of the quality of
an error control code.
• A large minimum distance guarantees reliability against random errors.
• However, a code with small minimum distance may be reliable — provided the
probability of sending codewords with nearby codewords is small.
We use minimum distance as the measure of a code’s reliability because:
• A single number is easier to understand than a weight/distance distribution.
• The guaranteed error detection and correction abilities are
◦ detection: e = d∗ − 1
◦ correction: t = ⌊(d∗ − 1)/2⌋
• Algebraic codes covered in the course are limited by minimum distance — these
codes cannot correct more than t errors even if there is only one closest
codeword.
Hamming (sphere-packing) bound
The Hamming bound for an (n, k) block code over a Q-ary channel alphabet:
• A code corrects t errors iff spheres of radius t around codewords do not overlap.
• Therefore
Q^k = number of codewords ≤ (volume of space) / (volume of sphere of radius t) = Q^n / V(Q, n, t) ,
where V(Q, n, t) is the “volume” (number of elements) of a sphere of radius t in Hamming space of n-tuples over a channel alphabet with Q symbols:
V(Q, n, t) = 1 + C(n, 1)(Q−1) + C(n, 2)(Q−1)^2 + · · · + C(n, t)(Q−1)^t
(C(n, i) denotes the binomial coefficient.)
• Rearranging the inequality gives a lower bound on n − k,
Q^{n−k} ≥ V(Q, n, t) ⇒ n − k ≥ log_Q V(Q, n, t) ,
and thus an upper bound on rate R,
R ≤ 1 − (1/n) log_Q ( 1 + C(n, 1)(Q−1) + C(n, 2)(Q−1)^2 + · · · + C(n, t)(Q−1)^t ) .
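For the binary case the bound is easy to evaluate numerically. An illustrative sketch (not from the notes); it reproduces the first row of the table on the next page and the 78-check-bit figure quoted in the small print of a later slide:

from math import comb

# Smallest n - k allowed by the binary Hamming bound: n - k >= log2 V(2, n, t).
def V(n, t):
    """Number of points in a Hamming sphere of radius t in {0,1}^n."""
    return sum(comb(n, i) for i in range(t + 1))

def min_check_bits(n, t):
    return (V(n, t) - 1).bit_length()   # = ceil(log2 V(n, t)), exact integer math

print(min_check_bits(6249, 10))   # 105: the t = 10 row of the packet table
print(min_check_bits(1000, 10))   # 78: correcting 10 errors in 1000 bits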

Hamming bound: example


A wireless data packet contains 192 audio samples, 16 bits for each of two channels. The number of information bits is 192 · 2 · 16 = 6144.
The communications link is a binary symmetric channel with raw error rate 10−3.
How many check bits are needed for reliable communication?
t    n − k   n      Rate    P{> t errors}
10   105     6249   0.983   5.4×10^−2
12   123     6267   0.980   1.2×10^−2
14   141     6285   0.978   2.2×10^−3
16   158     6302   0.975   3.0×10^−4
18   175     6319   0.972   3.5×10^−5
20   192     6336   0.970   3.3×10^−6
22   208     6352   0.967   2.6×10^−7
24   225     6369   0.965   1.8×10^−8
26   241     6385   0.962   1.1×10^−9
28   257     6401   0.960   5.5×10^−11
30   272     6416   0.958   2.5×10^−12
32   288     6432   0.955   1.0×10^−13
The Hamming bound shows that more than 4% redundancy is needed to achieve a
reasonable bit error rate.
Other bounds on minimum distance
The following bounds show tradeoffs between rate R and minimum distance d∗.
• McEliece-Rodemich-Rumsey-Welch (MRRW) upper bound:
R ≤ H( 1/2 − √(δ(1 − δ)) ) ,
where H is the binary entropy function and δ = d∗/n is the normalized minimum distance.
• Plotkin upper bound for binary linear block codes (homework exercise):
d∗ ≤ n · 2^{k−1} / (2^k − 1) ⇒ δ = d∗/n ≤ 1/2 for large k.
• Varshamov-Gilbert lower bound for binary block codes. If d∗ < n/2 then there exists a code with minimum distance d∗ and rate R satisfying
R ≥ 1 − (1/n) log_2 ( Σ_{i=0}^{d∗−1} C(n, i) ) ≈ 1 − H(d∗/n) = 1 − H(δ) .
For comparison, the Hamming bound is R ≤ 1 − H(δ/2).
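A small numerical comparison of the asymptotic bounds (illustrative sketch, not from the notes; H is the binary entropy function defined above). Note how MRRW is the weaker upper bound at δ = 0.1 but the stronger one at larger δ, consistent with the remark under the plot below.

from math import log2, sqrt

def H(p):
    """Binary entropy function."""
    return 0.0 if p <= 0.0 or p >= 1.0 else -p * log2(p) - (1 - p) * log2(1 - p)

for delta in (0.1, 0.2, 0.3):
    print(f"delta={delta}:",
          f"VG >= {1 - H(delta):.3f},",                        # achievable rate
          f"Hamming <= {1 - H(delta / 2):.3f},",               # upper bound
          f"MRRW <= {H(0.5 - sqrt(delta * (1 - delta))):.3f}") # upper bound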


Plots of rate vs. normalized minimum distance


[Plot: rate R = k/n versus normalized minimum distance δ = d∗/n for the Hamming, Varshamov-Gilbert, MRRW, and Singleton bounds, over 0 ≤ δ ≤ 1.]

The MRRW bound is stronger than the Hamming bound except for high rates.
The Hamming bound is fairly tight for high rates. For example, to correct 10 errors in 1000 bits, the Hamming bound
requires 78 check bits, but there exists a BCH code that has 100 check bits.
Perfect codes
Definition: A block code is called perfect if every senseword is within distance t of
exactly one codeword.
Other definitions of perfect codes:
• decoding spheres pack perfectly
• have complete bounded-distance decoders
• satisfy the Hamming bound with equality
There are only finitely many classes of perfect codes:
• Codes with no redundancy (k = n)
• Repetition codes with odd blocklength: n = 2m + 1, k = 1, t = m
• Binary Hamming codes: n = 2^m − 1, n − k = m
• Nonbinary Hamming codes: n = (q^m − 1)/(q − 1), n − k = m, q > 2
• Binary Golay code: q = 2, n = 23, k = 12, t = 3
• Ternary Golay code: q = 3, n = 11, k = 6, t = 2
Golay discovered the perfect Golay codes in 1949 — a very good year for Golay.
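A quick numerical check (illustrative, not from the notes) that the binary Golay code satisfies the Hamming bound with equality:

from math import comb

# Spheres of radius t = 3 around the 2^12 codewords exactly fill {0,1}^23.
V = sum(comb(23, i) for i in range(4))   # 1 + 23 + 253 + 1771 = 2048 = 2^11
print(2**12 * V == 2**23)                # True: the Golay code is perfect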

Quasi-perfect codes
Definition: A code is quasi-perfect if every n-tuple is
• within distance t of at most one codeword, and is
• within distance t + 1 of at least one codeword.
Equivalently, a code is quasi-perfect if spheres of radius t surrounding codewords
do not overlap, while spheres of radius t + 1 cover the space of n-tuples.
Examples of quasi-perfect codes:
• Repetition codes with even blocklength
• Expanded Hamming and Golay codes with overall parity-check bit
Exercise: Show that expurgated Hamming codes (obtained by adding an overall
parity-check equation) are not quasi-perfect.



Modified linear codes
The design blocklength of a linear block code is determined by algebraic and
combinatorial properties of matrices or polynomials.
The desired blocklength of a linear block code is often different from the design
blocklength.
Example:
• Design blocklength of binary Hamming code is 2^m − 1 (7, 15, 31, . . .)
• Number of information symbols may not be k = 2^m − 1 − m (4, 11, 26, . . .)
There are six ways to modify parameters of a linear block code (n, k, n − k) by
increasing one, decreasing another, and leaving the third unchanged.
The most common modification is to shorten the code by dropping information
symbols.
The other modifications are lengthen, expurgate, augment, puncture, expand.


Shortened codes
Shorten: Fix n − k, decrease k and therefore n.
Information symbols are deleted to obtain a desired blocklength smaller than the
design blocklength.
The missing information symbols are usually imagined to be at the beginning of the
codeword and are considered to be 0.
Example: Ethernet frames are variable-length packets. Maximum packet size is
about 1500 data octets or 12000 bits.
The 32-bit ethernet checksum comes from a Hamming code with design blocklength 2^32 − 1 = 4294967295 bits, or 536870907 octets.
Encoder/decoder cost can be reduced by deleting carefully chosen symbols.



Shortened code example
The systematic parity-check matrix for a (15, 11) binary Hamming code is

H = [ 1 0 0 0 1 0 0 1 1 0 1 0 1 1 1 ]
    [ 0 1 0 0 1 1 0 1 0 1 1 1 1 0 0 ]
    [ 0 0 1 0 0 1 1 0 1 0 1 1 1 1 0 ]
    [ 0 0 0 1 0 0 1 1 0 1 0 1 1 1 1 ] .

We can shorten to a (12, 8) code by deleting maximum-weight columns 12 to 14:

H′ = [ 1 0 0 0 1 0 0 1 1 0 1 1 ]
     [ 0 1 0 0 1 1 0 1 0 1 1 0 ]
     [ 0 0 1 0 0 1 1 0 1 0 1 0 ]
     [ 0 0 0 1 0 0 1 1 0 1 0 1 ] .

The shortened code can correct single bit errors in an 8-bit data byte.
Each check equation is the exclusive-or of 5 or 6 input bits, compared to 8 inputs
in original code.


Lengthened codes
Lengthen: Fix n − k, increase k and therefore n.
New information symbols are introduced and included in check equations.
Usually difficult to do without reducing the minimum distance of the code.
Example: Extended Reed-Solomon codes, obtained by lengthening the (Q − 1, k) R-S codes to (Q + 1, k + 2) by adding two columns at the left of H:

H = [ 1  α    α^2    · · ·  α^{Q−2}    ]
    [ 1  α^2  α^4    · · ·  α^{2(Q−2)} ]
    [ ·   ·    ·     · · ·   ·         ]
    [ 1  α^d  α^{2d} · · ·  α^{d(Q−2)} ]
⇒
H′ = [ 1 0 | 1  α    α^2    · · ·  α^{Q−2}    ]
     [ 0 0 | 1  α^2  α^4    · · ·  α^{2(Q−2)} ]
     [ · · | ·   ·    ·     · · ·   ·         ]
     [ 0 1 | 1  α^d  α^{2d} · · ·  α^{d(Q−2)} ]



Expurgated codes
Expurgate: Fix n, decrease k and increase n − k.
Codewords are deleted by adding check equations, reducing the dimension of the code. The goal is to increase error-protecting ability.
Example: The (7, 3) expurgated Hamming code.
Example: (15,7) double error correcting binary BCH code is obtained from the
(15,11) Hamming code by adding four more rows to H:

H+ = [ 1 0 0 0 1 1 0 0 0 1 1 0 0 0 1 ]
     [ 0 0 0 1 1 0 0 0 1 1 0 0 0 1 1 ]
     [ 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 ]
     [ 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 ]
The combined parity-check matrix is (α is a primitive element in GF(2^4)):
[ H  ]   [ 1  α    α^2  α^3  α^4   · · ·  α^12  α^13  α^14 ]
[ H+ ] = [ 1  α^3  α^6  α^9  α^12  · · ·  α^36  α^39  α^42 ]
We will see that no 4 columns are linearly dependent over GF(2), so d∗ ≥ 5.

Augmented codes
Augment: Fix n, increase k and decrease n − k.
Add codewords by adding new basis vectors — new rows of generator matrix.
This increases rate of code while possibly decreasing the minimum distance.
Example: The generator matrix of the Reed-Muller code R(r, m) is defined by augmentation:

G = [ G_0 ]
    [ G_1 ]
    [  ·  ]
    [ G_r ]

Submatrix G_i has C(m, i) rows and n = 2^m columns. Number of information bits is
k = C(m, 0) + C(m, 1) + · · · + C(m, r)

It can be shown that the minimum weight is 2^{m−r}.


The Reed-Muller codes have a wide range of minimum distances and corresponding rates. The rate 1/2 codes have d∗ = √(2n), which is essentially optimal.
Expanded codes
Expand: Fix k, increase n − k and n.
Add new check symbols and corresponding equations.
Example: The extended Hamming code is obtained by adding an overall
parity-check bit, thereby increasing the minimum distance from 3 to 4.
Fact: When the minimum distance of a binary linear block code is odd, an overall parity-check bit increases the minimum distance by 1 to the next even number.
Example: The binary Golay code is a (23, 12) code with minimum distance 7 — a
perfect three-error correcting code.
Thus an overall parity-check equation increases the minimum distance to 8.
The extended Golay code, with parameters (24, 12, 8), was used for error protection
in the Voyager I and II spacecraft.
Robert Gallager’s tribute: Marcel Golay’s one-page paper, “Notes on Digital Coding” (Proc. IRE, vol. 37, p. 657, 1949)
is surely the most remarkable paper on coding theory ever written. Not only did it present the two perfect “Golay
codes”, the (n = 23, k = 12, d = 7) binary code and the (n = 11, k = 6, d = 5) ternary code, but it also gave the
non-binary generalization of the perfect binary Hamming codes and the first publication of a parity-check matrix.


Punctured codes
Puncture: Fix k, decrease n − k and therefore n.
Deleting check symbols may reduce minimum distance.
However, punctured codes may correct the large majority of errors up to the
minimum distance of the original code.
Puncturing may reduce minimum distance but not significantly reduce reliability.
Punctured codes may be obtained from simple codes that have too much
redundancy.
Example: We can puncture a (9, 4) simple product code with d∗ = 4 to obtain an (8, 4) code with d∗ = 3. If we expand the punctured code by adding an overall parity-check bit, we recover the simple product code.
Soft-decision decoders or error-and-erasure decoders can treat the missing check
symbols as unreliable.



Linear block code modifications: summary
Change any two of the block code parameters n, k, n − k:
• Shorten: delete message symbols:
n − k fixed, k ↓ ⇒ n ↓
• Lengthen: add message symbols:
n − k fixed, k ↑ ⇒ n ↑
• Puncture: delete check symbols:
k fixed, n − k ↓ ⇒ n ↓
• Extend (expand): add check symbols:
k fixed, n − k ↑ ⇒ n ↑
• Expurgate: delete codewords, add check equations:
n fixed, k ↓ ⇒ n − k ↑
• Augment: add codewords, delete check equations:
n fixed, k ↑ ⇒ n − k ↓


Linear block code modifications: picture

[Figure: a codeword drawn as k information symbols followed by n − k check symbols. Shorten and lengthen act on the information-symbol end, puncture and expand on the check end; expurgate and augment trade codewords for check equations at fixed n.]

