Section 4
Error-Control Codes
(Channel Coding)
DISCLAIMER
Whilst every attempt is made to maintain the accuracy and correctness of these notes,
the author, Dr. Roberto Togneri, makes no warranty or guarantee or promise express
or implied concerning the content
4.1 Introduction
In the previous chapter we considered coding of information sources for transmission through a
noiseless channel. In that case efficient, compact codes were our primary concern. In general,
though, the channel is noisy and errors will be introduced at the output. In this chapter we will be
concerned with codes which can detect or correct such errors. We restrict our attention to the class
of binary codes and BSC channels. Our model for a noiseless channel was:
[Block diagram: message → Source Encoder → Modulator → Digital Channel → Demodulator → Source Decoder → message, with no errors]
1. The channel encoder takes the binary stream output of the source encoder and maps each fixed-block message or information word of length L bits to a fixed-block codeword of length N bits. Thus
channel coding uses block codes with N > L. The difference, N-L, represents the check-bits used
for channel coding.
2. We restrict ourselves to binary channels. Hence the source encoding uses a binary code and the
data entering and leaving the channel encoder, channel decoder and channel are binary.
3. With L bits of information we need M = 2^L codewords, each N bits in length. Thus we will receive either one of the M = 2^L codewords or any one of the remaining 2^N − 2^L words, which are non-codewords and hence represent received words with detectable bit errors. It is this mapping operation and the "extra" N − L check bits which can be exploited for error-control.
[Diagram: noisy channel with a feedback channel from the receiver back to the transmitter indicating whether an error was detected]
NOTE
1. With error correction the error is detected and corrected at the receiver. A communication system
which implements this form of channel coding is known as a Forward-Error-Correction (FEC)
error-control system.
2. With error detection the error is detected and the transmitter is required to retransmit the message.
This requires a reliable return feedback channel from receiver to transmitter. The simplest
implementation is a stop-and-wait strategy whereby the receiver acknowledges (ACK) each
correct reception and negatively acknowledges (NACK) each incorrect reception. The transmitter
then retransmits any message that was incorrectly received. A communication system which
implements this form of channel coding is known as an Automatic-Repeat-Request (ARQ) error-
control system.
3. In packet-switched data networks the (possibly source-coded) messages are transmitted in N-bit
codeword blocks called frames, packets or segments (depending on which network layer handles
the data) which incorporate not only the message but a control header and/or trailer. ARQ error
control is used based on systematic error-detecting channel codes with the check-bits included in
the appropriate checksum field of the header or trailer.
It should be obvious that since N > L the rate of data entering the channel is greater than the rate of
data entering the channel encoder. Let:
ns = 1/τs ⇒ fixed source coding bit rate (bps)
nc = 1/τc ⇒ channel bit rate (bps)
Example 4.1
(a)
The source coder rate is 2 bps and the channel bit rate is 4 bps. Say L = 3, then N = L[nc / ns] = 3[4/2] = 6 and the channel coder will generate N − L = 3 check bits for every L = 3-bit message block and hence produce an N = 6-bit codeword. The check bits can be used for error-control (e.g. even parity-check).
(b)
The source coder rate is 3 bps and the channel bit rate is 4 bps. Now N = L(4/3).
If L = 3 ⇒ N = 4 (1 check bit for every 3-bit message block to produce a 4-bit codeword)
If L = 5 ⇒ N = 6 2/3 ≡ 6 (1 check bit for every 5-bit message block to produce a 6-bit codeword)
BUT there is a mismatch between the source code and channel data rates. Express the fraction as the ratio of the two small integers d and n: 2/3 = d/n, thus n = 3 and d = 2. To synchronise the data rates we need to transmit d = 2 dummy bits for every n = 3 codewords / message blocks:
Thus the average rate will be the correct (20/15) = (4/3). Both the check-bits + and dummy bits *
can be used for error-control.
With a noiseless channel we know with probability 1 which input was transmitted upon reception of
the output. However, when the channel is noisy there is usually more than one possible input
symbol that may have been transmitted given the observed output symbol. In the presence of noise
we must use a decision rule.
Decision Rule
Example 4.2
Consider the channel matrix:

                  OUTPUT
                 b1    b2    b3
            a1   0.5   0.3   0.2
   INPUT    a2   0.2   0.3   0.5
            a3   0.3   0.3   0.4

With the second decision rule, if b = b1b1b3b2 then d(b) = d(b1b1b3b2) = a1a1a2a2
The aim of a good decision rule is to minimise the error probability of making a wrong decision.
We choose
d(bj) = a*
where a* is given by:
P(a*/bj) ≥ P(ai/bj)  ∀i
Equation 4.1
Now from Bayes' law we have:
P(bj/a*) P(a*) / P(bj)  ≥  P(bj/ai) P(ai) / P(bj)
Equation 4.2
Since P(bj) is independent of ai in Equation 4.2 we have:
P(bj/a*) P(a*) ≥ P(bj/ai) P(ai)
Equation 4.3
Thus the minimum-error probability decoding requires not only the channel matrix probabilities but
also the a priori probabilities P(ai). In some cases we do not have knowledge of P(ai).
We choose
d(bj) = a*
where a* is given by
P(bj/a*) ≥ P(bj/ai)  ∀i
Equation 4.4
Thus maximum-likelihood decoding depends only on the channel. This is an advantage in situations
where the P(ai) are not known. Although not minimum-error, maximum-likelihood decoding does
result in minimum-error decoding when the inputs are equiprobable.
If we choose the input according to the decision rule d(bj) then the probability of making the correct decision given bj is:
P(a* = d(bj)/bj)
and the conditional probability of error is P(E/bj) = 1 − P(d(bj)/bj). The overall error probability PE is the average of P(E/bj) over all the bj:
PE = Σ_B P(E/bj) P(bj)
Equation 4.6
where Σ_{b∈Bc} is the summation over members of the B alphabet for which d(bj) ≠ ai.
Example 4.3
Refer to the same channel as Example 4.2. Also assume we are given that P(a1) = 0.3, P(a2) = 0.3 and P(a3) = 0.4.
Maximum-likelihood decoding chooses, for each output bj, the input ai that maximises P(bj/ai):

                 b1    b2    b3          d(b1) = a1
P(bj/ai) ≡ a1   0.5   0.3   0.2    ⇒    d(b2) = a1
           a2   0.2   0.3   0.5          d(b3) = a2
           a3   0.3   0.3   0.4

Note that d(b2) = a2 or d(b2) = a3 are also correct (the likelihoods are tied). We calculate the error probability using:
PE = Σ_A P(E/ai) P(ai) = Σ_A P(ai) Σ_{b∈Bc} P(b/ai)
To derive P(E/ai) = Σ_{b∈Bc} P(b/ai) we sum across the ai row excluding the bj column if d(bj) = ai:

                 b1    b2    b3         P(E/ai) × P(ai)
            a1   0.5   0.3   0.2   ⇒   0.2 × 0.3 = 0.06
   INPUT    a2   0.2   0.3   0.5   ⇒   0.5 × 0.3 = 0.15
            a3   0.3   0.3   0.4   ⇒   1.0 × 0.4 = 0.40
                                             PE = 0.61
Thus PE = 0.61. Note that P(E/a3) = 1.0 is an obvious result since the decision rule never decides a3.
For minimum-error decoding the decision rule chooses, for each output bj, the input that maximises P(bj/ai)P(ai), giving d(b1) = a1, d(b2) = a3 and d(b3) = a3. Repeating the calculation:

                 b1    b2    b3         P(E/ai) × P(ai)
            a1   0.5   0.3   0.2   ⇒   0.5 × 0.3 = 0.15
   INPUT    a2   0.2   0.3   0.5   ⇒   1.0 × 0.3 = 0.30
            a3   0.3   0.3   0.4   ⇒   0.3 × 0.4 = 0.12
                                             PE = 0.57
Thus PE = 0.57 which is lower than maximum likelihood decoding.
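The two decision rules are easy to check numerically. The following short Python sketch (my own illustration, not part of the notes; variable names are assumptions) reproduces the PE values of Example 4.3 for both rules.

# Channel matrix P(bj/ai) and a priori probabilities P(ai) from Examples 4.2/4.3
P_b_given_a = {
    'a1': {'b1': 0.5, 'b2': 0.3, 'b3': 0.2},
    'a2': {'b1': 0.2, 'b2': 0.3, 'b3': 0.5},
    'a3': {'b1': 0.3, 'b2': 0.3, 'b3': 0.4},
}
P_a = {'a1': 0.3, 'a2': 0.3, 'a3': 0.4}

def error_prob(decide):
    # PE = sum over inputs ai of P(ai) * sum of P(b/ai) over outputs not decoded to ai
    return sum(p_a * sum(p for b, p in P_b_given_a[a].items() if decide(b) != a)
               for a, p_a in P_a.items())

ml   = lambda b: max(P_a, key=lambda a: P_b_given_a[a][b])            # maximum likelihood
mapr = lambda b: max(P_a, key=lambda a: P_b_given_a[a][b] * P_a[a])   # minimum error
print(error_prob(ml), error_prob(mapr))   # approximately 0.61 and 0.57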
It can be shown that, when the inputs are equiprobable (as in modern digital communication systems), maximum likelihood decoding will yield the same decision rule (and PE) as minimum error decoding. Specifically, the decision rule on a per-symbol basis is simply d(b = 0) = 0 and d(b = 1) = 1 with PE = q.
However the decision rule will be based not on each symbol ai and bj but on which N-bit codeword
a was sent upon receiving the corresponding N-bit word b. We describe the problem as follows:
Channel Decoding Problem
Let a represent the codeword that is transmitted through the channel and let b be the received word.
With an L-bit message block and N-bit codeword this means that we have M = 2^L N-bit valid codewords for a. However there will be a larger set of all possible received words, b. Due to channel noise which can cause bit errors in transmission, any of the 2^N possible N-bit words can be received as b. Since N > L the received word b can either be one of the 2^L N-bit codewords (no errors or sufficient bit errors to convert one codeword to a different codeword) or one of the 2^N − 2^L N-bit non-codewords (bit errors, but not enough to convert one codeword to a different codeword).
The task of the channel decoder is to map b to the most likely a that was sent.
Hamming Distance
Given two words a = a1a2 … an and b = b1b2 … bn their Hamming distance is defined as the
number of positions in which a and b differ. We denote it by d(a,b):
For each alphabet and each number n, the Hamming distance is a metric on the set of all words of
length n. That is, for arbitrary words a, b, c, the Hamming distance obeys the following conditions
1. d(a,a) = 0 and d(a,b) > 0 whenever a ≠ b
2. d(a,b) = d(b,a)
3. d(a,b) + d(b,c) ≥ d(a,c) (triangle inequality)
Example 4.4
Let n = 8 and consider:
a = 1 1 0 1 0 0 0 1
b = 0 0 0 1 0 0 1 0
c = 0 1 0 1 0 0 1 1
Then d(a,b) = 4, d(b,c) = 2, d(a,c) = 2, and d(a,b) + d(b,c) ≥ d(a,c).
As stated, for modern communications systems the maximum likelihood decoding rule will yield
the same result as the minimum error decoding rule. When considering the decision rule for
deciding which N-bit codeword a was sent given the received N-bit word b, then a is chosen to
maximise P(b/a). Say we transmit the N-bit codeword a and observe the N-bit word b at the output
(of the BSC). Assume d(a, b) = D, then this means that a and b differ in exactly D places, thus D
bits are in error (with probability q) and N-D bits are OK (with probability p = 1 - q). Thus:
P(b/a) = (q)^D (p)^(N−D)
Since q < 1/2, P(b/a) is maximised when we choose a such that D = d(a, b) is smallest. This procedure can be stated more precisely as follows:
Hamming distance decoding rule
Let the binary code alphabet A = {0, 1} of block length N be used. Let ai : i ∈ [1..M] represent one of the M codewords which is sent (through the channel). We receive the word b and decode it as the codeword a* which is closest to b in Hamming distance, that is d(a*, b) ≤ d(ai, b) ∀i.
Example 4.5
Consider the following channel code
Message Codeword
(L=2) (N=3)
00 000
01 001
10 011
11 111
Note that R = L/N = 2/3, Cred = 3/2 and Ered = (N−L)/L = 1/2
There are M = 2^L = 2^2 = 4 messages and hence 4 valid codewords a; however, with N = 3, there are 2^N = 8 possible received words, 4 of these will be codewords and 4 of these will be non-codewords.
If the received word b = 000, 001, 011 or 111 then the Hamming distance decoding rule would
mean that we decode directly to the codeword a = 000, 001, 011 or 111 (i.e. a is the same as b and
there is no error). If we receive any of the non codewords then:
b=b1b2b3 Closest codeword Action
010 000 (b2 in error), 011 (b3 in error) error detected
100 000 (b1 in error) single-bit error corrected to 000
101 001 (b1 in error), 111 (b2 in error) error detected
110 111 (b3 in error) single-bit error corrected to 111
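For illustration, here is a short Python sketch (my own, purely illustrative) of the Hamming distance decoding rule applied to this code; it reproduces the corrections and detections listed in the table above.

code = {'00': '000', '01': '001', '10': '011', '11': '111'}   # Example 4.5 code table

def hamming(a, b):
    # number of positions in which the equal-length words a and b differ
    return sum(x != y for x, y in zip(a, b))

def decode(b):
    # return all codewords at minimum Hamming distance from the received word b
    dmin = min(hamming(b, c) for c in code.values())
    return [c for c in code.values() if hamming(b, c) == dmin]

print(decode('100'))   # ['000']        -> single-bit error corrected
print(decode('010'))   # ['000', '011'] -> two equally close codewords: error detected only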
Minimum distance
The minimum distance d(K) of a nontrivial code K is the smallest Hamming distance over all pairs
of distinct code words:
d(K) = min {d(a,b) | a,b are words in K and a ≠ b}
Error Detection
A block code K is said to detect all combinations of t errors provided that for each code word a and
each received word b obtained by corrupting t bits in a, b is not a code word (and hence can be
detected). This property is important in situations where the communication system uses ARQ
error-control coding.
A code detects all t errors if and only if its minimum distance is greater than t:
d(K) > t
Property 4.1
Proof
Let a* be the codeword that is sent and b the received word. Assume there are t bit errors in
transmission then d(a*,b) = t. To detect that b is in error then it is sufficient to ensure that b does not
correspond to any of the ai codewords, i.e. d(b, ai) > 0 for all i. Using the triangle inequality we have for any other codeword ai: d(a*,b) + d(b,ai) ≥ d(a*,ai) ⇒ d(b,ai) ≥ d(a*,ai) − d(a*,b)
To ensure d(b, ai) > 0 we must have: d(a*,ai) − d(a*,b) > 0 ⇒ d(a*,ai) > d(a*,b) and since
d(a*,b) = t we get the final result that:
d(a*,ai) > t ⇒ d(K) > t, and this is the condition for detecting up to all t-bit errors
Error Correction
A block code K is said to correct t errors provided that for each code word a and each word b
obtained by corrupting t bits in a, the maximum likelihood decoding leads uniquely to a. This
property is important in situations where the communication system uses FEC error-control coding.
A code corrects all t errors if and only if its minimum distance is greater than 2t:
d(K) > 2t
Property 4.2
Proof
Let a* be the codeword that is sent and b the received word. Assume there are t bit errors in
transmission then d(a*,b) = t. To detect that b is in error and ensure maximum likelihood decoding
uniquely yields a* (error can be corrected) then it is sufficient that d(a*,b) < d(b, ai) for all i.
Using the triangle inequality we have for any other codeword ai:
d(a*,b) + d(b,ai) ≥ d(a*,ai) ⇒ d(b,ai) ≥ d(a*,ai) − d(a*,b)
To ensure d(b, ai) > d(a*,b) we must have:
d(a*,ai) − d(a*,b) > d(a*,b) ⇒ d(a*,ai) > 2·d(a*,b)
and since d(a*,b) = t we get the final result that:
d(a*,ai) > 2t ⇒ d(K) > 2t, and this is the condition for correcting up to all t-bit errors
Example 4.6
Even-parity check Code
The even-parity check code has N = L + 1 by adding a parity-check bit to the L-bit message. For the
case L = 2 we have:
Message Codeword
00 000
01 011
10 101
11 110
By comparing the Hamming distance between different codewords we see that d(K) = 2 (i.e. no two
codewords are less than 2 distance from each other) and by Property 4.1 d(K) = 2 > t =1 and as
expected this code will be able to detect all single-bit errors.
Repetition Code
The repetition code is defined only for L = 1 and performs error correction by repeating the message
bit (N-1) times (for N odd) and then using a majority vote decision rule (which is the same as the
Hamming distance decoding rule). Consider the N=3 repetition code:
Message Codeword
0 000
1 111
We see that d(K) = 3. From Property 4.1 d(K) = 3 > t = 2 and this code can detect all 2-bit errors.
Furthermore from Property 4.2 d(K) = 3 > 2(t = 1) and this code can correct all 1-bit errors. The
Hamming decoding rule will perform 1-bit error correction. To see this and the majority voting
operation:
Non codeword   Closest codeword   Majority bit   Action
001 000 0 message 0
010 000 0 message 0
011 111 1 message 1
100 000 0 message 0
101 111 1 message 1
110 111 1 message 1
Note: Using majority voting of the received bits is a much simpler implementation for decoding the
final message than using the Hamming distance decoding rule.
Code K6
Code K6 will be used in a later example and is defined as follows
Message Codeword
(L = 3) (N = 6)
000 000000
001 001110
010 010101
011 011011
100 100011
101 101101
110 110110
111 111000
By comparing codewords we have that d(K) = 3 and hence this code can correct all 1-bit errors. For example, if a = 010101 is sent and b = 010111 is received then the closest codeword is 010101 and the error in b5 is corrected. This will always be the case no matter which codeword was sent and which bit is in error. It should be noted that there are only M = 2^L = 2^3 = 8 valid codewords compared to 2^N = 2^6 = 64 possible received words.
The d(K) for a particular block code K specifies the code’s error correction/detection properties. Say
we want to design a code of length N and minimum distance d(K). What is the bound, B(N,d(K)), on M (and hence L), for any given (N, d(K)), i.e. find B(N, d(K)) such that M ≤ B(N, d(K))?
This is sometimes known as the main coding theory problem (of finding good codes).
Some results for B(N,d(K)) are:
B(N, 1) = 2^N
B(N, 2) = 2^(N−1)
B(N, 3) = 2^N / (N + 1)
B(N, 4) = 2^(N−1) / N
In general, we have:
B(N, 2t+1) = B(N+1, 2t+2) for t = 0,1,2, …
Equation 4.7
thus if we know the bound for even d(K) we can calculate the bound for odd d(K).
For even d(K) one bound (of the apparently many) is the Plotkin bound:
B(N, d(K)) = 2d(K) / (2d(K) − N),  provided d(K) is even and 2d(K) > N ≥ d(K)
Equation 4.8
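For illustration, a small Python sketch (my own; the helper name is an assumption) that evaluates the bounds above, treating the non-integer expressions as upper bounds to be rounded down.

def plotkin(N, d):
    # Plotkin bound of Equation 4.8, valid for even d(K) with 2d(K) > N >= d(K)
    assert d % 2 == 0 and 2 * d > N >= d
    return (2 * d) // (2 * d - N)

N = 8
print(2**N, 2**(N - 1), 2**N // (N + 1), 2**(N - 1) // N)  # B(8,1), B(8,2), B(8,3), B(8,4)
print(plotkin(8, 6))                                       # B(8,6) = 12/4 = 3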
Example 4.7
Our channel coder encodes messages of L = 2 bits (M = 4). What is the minimum block length, N,
and possible code for the following requirements:
4.4.4 Issues
The problem with using the Hamming distance for encoding and decoding is that with L-bit information words we generate 2^L code words which we must store, and decoding requires comparing the received word with each stored code word, which is an expensive operation. This represents an exponential increase in
memory and computational power with L. Only in the specific case of the repetition code is an
efficient implementation (using majority voting) possible. Later we will develop a way of
systematically encoding and decoding codes in an efficient manner by considering the important
class of binary linear codes.
Let:
Kn = block code of length n (code with n ≡ N-bit codewords)
Pe(Kn) = block error probability for code Kn
Peb(Kn) = bit error probability for code Kn
R(Kn) = information rate for code Kn
The bit error probability Peb(Kn) indicates the probability of bit errors between the transmitted
message and decoded message. This measures the true errors in using Kn. The block error
probability Pe(Kn) indicates the probability of decoding to the incorrect codeword block. It should
be noted that Pe(Kn) ≥ Peb(Kn) since even with an incorrect codeword block not all of the bits will
be in error.
Example 4.8
Consider a channel coding with L=3 and N=6. The 3-bit information sequence 010 is encoded as
010111 and sent through the channel. However due to bit errors the received word is 011101 and
the channel decoder decodes this as codeword 011111 and message 011. Although the block error
decoding is 100% in error (i.e. Pe(Kn) = 1.0) the first 2 bits are OK so the bit error decoding is only
33% incorrect (i.e. Peb(Kn) = 0.33).
When comparing different codes it is normal to use Pe(Kn) rather than Peb(Kn), since Pe(Kn) is easier
to calculate than Peb(Kn) and Pe(Kn) represents the worst-case performance.
(n = 3) repetition code
Info Code
0 000
1 111
We note that since L = 1 then Peb(Kn) = Pe(Kn). How do we calculate Pe(Kn)? The majority vote and
Hamming distance decoding rule will fail to yield the correct codeword if:
• all n = 3 bits are in error: q^3
• 2 out of n = 3 bits are in error: (3 choose 2) q^2 p = 3q^2 p
Hence:
Pe(Kn) = q^3 + 3q^2 p ≈ 3 × 10^-6, much better than q = 1 × 10^-3, but there is a cost since R(Kn) = 1/3
(n = 5) repetition code
Info Code
0 00000
1 11111
The majority vote and Hamming distance decoding rule will fail to yield the correct codeword if:
• all n = 5 bits are in error: q^5
• 4 out of n = 5 bits are in error: (5 choose 4) q^4 p = 5q^4 p
• 3 out of n = 5 bits are in error: (5 choose 3) q^3 p^2 = 10q^3 p^2
Hence: Pe(Kn) = q^5 + 5q^4 p + 10q^3 p^2 ≈ 10^-8, but now R(Kn) = 1/5!
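These block error probabilities are simply the probability of a majority of bit errors on the BSC; a minimal Python sketch (my own) reproduces the n = 3 and n = 5 figures.

from math import comb

def pe_repetition(n, q):
    # probability that ceil(n/2) or more of the n bits are in error
    p = 1 - q
    return sum(comb(n, i) * q**i * p**(n - i) for i in range((n + 1) // 2, n + 1))

q = 1e-3
print(pe_repetition(3, q))   # ~3e-6  (q^3 + 3q^2 p)
print(pe_repetition(5, q))   # ~1e-8  (q^5 + 5q^4 p + 10q^3 p^2)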
With repetition codes R(Kn) = 1/n and there is an exchange of message rate for message reliability.
But is there a way of reducing Pe(Kn) without a corresponding reduction in R(Kn)? Yes!
Example 4.10
Consider a BSC with q = 0.001 and the restriction that R(Kn) = 0.5. We design two codes K4 and K6
in an attempt to lower Pe(Kn) and still keep R(Kn) = 0.5
Code K4 (N = 4, L = 2)
We transmit a1 once and transmit a2 three times to produce the following code:
Info Code
00 0000
01 0111
10 1000
11 1111
We perform conditional error correction by assuming b1 is correct (a1 = b1) and then do a majority vote on the remaining b2b3b4 bits to obtain a2. This decoding will be correct when:
• all n = 4 bits are correct: p^4
• b1 is correct and there is one error in the remaining 3 bits: (3 choose 1) p^3 q = 3p^3 q
Hence: 1 − Pe(Kn) = p^4 + 3p^3 q ⇒ Pe(Kn) = 1 − p^4 − 3p^3 q ≈ 0.001, no better than the uncoded channel.
But this is indeed better if we consider Peb(Kn). In the following, let "c" mean correct and "i" mean incorrect. We can derive:
Pic = Pr(a1 = "i", a2 = "c") = q(3p^2 q + p^3) ⇒ first bit only is incorrect (only 1 of the 2 bits is wrong)
Pci = Pr(a1 = "c", a2 = "i") = p(q^3 + 3pq^2) ⇒ second bit only is incorrect (only 1 of the 2 bits is wrong)
Pii = Pr(a1 = "i", a2 = "i") = q(q^3 + 3pq^2) ⇒ both bits are incorrect
Pcc = Pr(a1 = "c", a2 = "c") = p(3p^2 q + p^3) ⇒ both bits are correct (decoding is correct)
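A quick numerical check of these four probabilities for q = 0.001 (a sketch of my own, following the conditional decoding rule a1 = b1, a2 = majority(b2 b3 b4)):

q = 0.001
p = 1 - q
a2_ok, a2_bad = p**3 + 3*p**2*q, q**3 + 3*p*q**2   # 0-1 errors vs 2-3 errors in b2 b3 b4
Pic, Pci = q * a2_ok, p * a2_bad                   # exactly one of the two message bits wrong
Pii, Pcc = q * a2_bad, p * a2_ok                   # both wrong / both right
print(Pic, Pci, Pii, Pcc, Pic + Pci + Pii + Pcc)   # the four cases sum to 1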
This represents a reduction in the block error probability over K4 by 2 orders of magnitude without
any reduction in R(Kn)!
4.5.1 Issues
The obvious observation is that by using larger values of n our error probability decreases with the
same information rate (compare this to the equivalent statement for source coding). We are tempted
to ask:
1. Can an encoding be found (for n large enough) so that Pe(Kn) → 0 and R(Kn) = 1/2?
2. Can an arbitrary reliability (i.e. make Pe(Kn) as small as we want) be achieved if R(Kn) = 0.9?
3. Can an arbitrary reliability be achieved if R(Kn) = 0.99?
Now consider:
R = C = 0.5 bits
∴ L/N = 1/2, i.e. N = 2L
Thus we will extract 0.5N = 0.5(2L) = L bits of information which is exactly the number of bits
needed to recover the message (which is of length L bits)! We have overcome the error of the
channel by introducing redundancies. Shannon's Fundamental Theorem will tell us that for large
enough N, if R ≤ C we can encode for error-free transmission:
Shannon's Fundamental Theorem
Every binary symmetric channel of capacity C > 0 can be encoded with an arbitrary reliability and
with information rate, R(Kn) ≤ C, arbitrarily close to C. That is, there exist codes K1, K2, K3 ….
such that Pe(Kn) tends to zero and R(Kn) tends to C with increasing n:
lim_{n→∞} Pe(Kn) = 0,   lim_{n→∞} R(Kn) = C,   i.e. R(Kn) = C − ε1, ε1 > 0 and lim_{n→∞} ε1 = 0
That is, if R(Kn) ≤ C, then Pe(Kn) → 0 for large enough n, and as a bonus R(Kn) → C!
The formal proof of Shannon's Theorem is quite lengthy and involved. We only present the salient
features of the proof here (which is why this is an engineer’s proof # ).
Assume an arbitrarily small number ε1 > 0 and that we have to design a code Kn of length n such that R(Kn) = C − ε1. To do this we need an information length of L = n(C − ε1) and hence:
R(Kn) = n(C − ε1) / n = C − ε1
This means we need M = 2^(n(C−ε1)) codewords.
Shannon's proof makes use of random codes. Given any number n we can pick M out of the 2^n binary words in a random way and we obtain a random code Kn. If M = 2^(n(C−ε1)) then we know that R(Kn) = C − ε1 but Pe(Kn) is a random variable.
Denote:
P̃e = E(Pe(Kn))
as the expected value of Pe(Kn) for a fixed value of n, but a completely random choice of M codewords. The main, and difficult, part of the proof (which we conveniently omit) is to show that:
P̃e → 0 as n → ∞
Thus given arbitrarily small numbers ε1 > 0 and ε2 > 0 we can find n such that P̃e < ε2. This means there must be at least one random code Kn with R(Kn) = C − ε1 and Pe(Kn) < ε2.
NOTE
1. The surprising part of Shannon's proof of the theorem is that we can achieve small Pe(Kn) with a random choice of Kn. However, this feature of the proof flies in the face of common sense: how can you choose a good code at random?
2. No practical coding scheme realizing the parameters promised by Shannon's Theorem has ever been presented. Thus, coding theory tends to ignore the theorem and concentrate on techniques which permit designing codes capable of correcting lots of errors while still retaining a reasonable information rate.
In every binary symmetric channel of capacity C, whenever codes Kn of lengths n have information rates at least C + ε1 (ε1 > 0) then the codes tend to be totally unreliable:
R(Kn) ≥ C + ε1  ⇒  lim_{n→∞} Pe(Kn) = 1
Example 4.11
Consider the BSC from Example 4.10. With q = 0.001 we know that:
C = 1 + p log p + q log q = 0.9886
We can now answer the 4.5.1 Issues questions:
1. Since R = 0.5 < C = 0.9886 we can find a code (for n large enough) such that Pe(Kn) → 0
2. Since R = 0.9 < C = 0.9886 we can find a code (for n large enough) such that Pe(Kn) → 0
3. Since R = 0.99 > C = 0.9886 we cannot find a code such that Pe(Kn) → 0
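For reference, the capacity value used above can be computed directly (a sketch of my own):

from math import log2

def bsc_capacity(q):
    # C = 1 + p log2(p) + q log2(q) for a BSC with crossover probability q
    p = 1 - q
    return 1 + p * log2(p) + q * log2(q)

print(round(bsc_capacity(0.001), 4))   # 0.9886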
From now on we define the operations + (mod-2 addition) and · (mod-2 multiplication) as follows:

+ (Exclusive OR):  0 + 0 = 0,  0 + 1 = 1,  1 + 0 = 1,  1 + 1 = 0
· (AND):           0 · 0 = 0,  0 · 1 = 0,  1 · 0 = 0,  1 · 1 = 1

Since 1 + 1 = 0 ⇒ 1 = −1
Thus binary mod-2 subtraction coincides with binary mod-2 addition.
Example 4.12
Compare binary mod-2 subtraction with mod-2 addition:
1-0=1 1+0=1
1-1=0 1+1=0
0-1=1 0+1=1
0-0=0 0+0=0 They are the same!
NOTE Since binary mod-2 subtraction is the same as binary mod-2 addition then all binary
mod-2 algebraic equations will be (re)written in standard form using mod-2 binary addition.
Change in notation
An important class of binary codes are the binary linear codes which can be described by systems of
linear equations as follows:
Binary linear codes (one definition)
Denote by xi the ith bit of the code word. Assume we have a message or information word of k bits
which is encoded into a codeword of n bits. In a binary linear code all the 2k codewords satisfy n-k
linear equations in xi for i = 1,2, … n (i.e. n-k equations with n unknowns).
Furthermore, we can rearrange the equations to be homogeneous (i.e. the RHS is zero).
Binary linear codes (another definition)
Every homogeneous system of linear equations defines a linear code. We can also show that every linear code can be described by a homogeneous system of linear equations.
NOTE
Since there are n−k equations in n unknowns then we have n − (n−k) = k independent variables and n−k dependent variables. Since the variables are binary this means we have 2^k solutions which is as expected.
Example 4.13
n-length repetition code (n odd)
Info Code:
x1 x2 x3 … xn
0 000…0
1 111…1
With k = 1 then we need n-k = n-1 equations in the n unknowns (x1 x2 x3 … xn) which fully describes
the repetition code:
x2 = x1          x1 + x2 = 0
x3 = x1    ⇒    x1 + x3 = 0
  ⋮                  ⋮
xn = x1          x1 + xn = 0
Given a linear code K described by a system of (n−k) homogeneous equations in (n) unknowns we construct the (n−k) × (n) matrix, H, with the coefficients of the equations. That is, the ith row of H expresses the ith equation. H is known as the parity-check matrix.
The binary matrix H is called a parity check matrix of a binary linear code K of length (n) provided
that the code words K are precisely those binary words xT = [x1 x2 … xn] which fulfil:
H [x1 x2 … xn]^T = [0 0 … 0]^T,   or   Hx = 0,   and H has (n−k) rows and (n) columns
The (n−k) homogeneous equations are known as the parity-check equations
Example 4.14
H for Repetition Code
The (n−1) parity-check equations of the repetition code give an (n−1) × n parity-check matrix:

  x1 + x2 = 0        | 1 1 0 0 0 … 0 0 | | x1   |
  x1 + x3 = 0        | 1 0 1 0 0 … 0 0 | | x2   |
      ⋮         Hx = | 1 0 0 1 0 … 0 0 | | x3   | = 0
  x1 + xn = 0        |  ⋮            ⋮ | |  ⋮   |
                     | 1 0 0 0 0 … 1 0 | | xn−1 |
                     | 1 0 0 0 0 … 0 1 | | xn   |

H for Even-Parity Check Code
The single parity-check equation gives:

  x1 + x2 + … + xn = 0   ⇒   Hx = [1 1 1 … 1 1] [x1 x2 … xn]^T = 0

Hence, H is a (1) × (n) matrix
A binary block code is said to be linear provided that the sum of any arbitrary two code words is
also a code word.
Example 4.15
Repetition Code
Info Code:
x1 x2 x3 … xn
0 000…0
1 111…1
There are only two codewords and 000…0 + 111…1 = 111…1 which is of course a codeword, so
the repetition code is a binary linear code.
As we shall see describing binary linear codes by the parity-check matrix will facilitate both the
analysis and design of codes. However in some cases special codes can be constructed which can be
analysed in a different and more intuitive way. One such class of code is the r x s rectangular code
which is used when transmitting row by row or two dimensional data blocks. This is a binary linear
code of length n = rs, whose codewords are considered as r × s matrices. Each row contains s−1 information bits and each column contains r−1 information bits, the remaining bit in each row and column being a parity-check bit.
3 x 4 rectangular code
r = 3 and s = 4 ⇒ n = 3 × 4 = 12 and k = (3−1) × (4−1) = 6. To analyse this code we represent the codeword (x1, x2, …, x12) in the following matrix form:

  x1   x2   x3   x4
  x5   x6   x7   x8
  x9   x10  x11  x12
Each check-bit is an even-parity check over the corresponding row or column. If one includes the
check-bits themselves then there are r + s rows and columns to check over and hence r + s = 7
parity-check equations:
x1 + x2 + x3 + x4 = 0
x5 + x6 + x7 + x8 = 0 row parity - checks
x9 + x10 + x11 + x12 = 0
x1 + x5 + x9 = 0
x 2 + x6 + x10 = 0
column parity - checks
x3 + x7 + x11 = 0
x 4 + x8 + x12 = 0
We expect n-k = 6 parity-check equations so one of the equations is in fact redundant. However all 7
equations are used to derive the parity-check matrix description:
       | 1 1 1 1 0 0 0 0 0 0 0 0 |   row
       | 0 0 0 0 1 1 1 1 0 0 0 0 |   parity-checks
       | 0 0 0 0 0 0 0 0 1 1 1 1 |
  Hx = | 1 0 0 0 1 0 0 0 1 0 0 0 | [x1 x2 … x12]^T = 0
       | 0 1 0 0 0 1 0 0 0 1 0 0 |   column
       | 0 0 1 0 0 0 1 0 0 0 1 0 |   parity-checks
       | 0 0 0 1 0 0 0 1 0 0 0 1 |
Block codes in which the message or information bits are transmitted in unaltered form are called
systematic codes. Specifically for binary linear codes, consider the (k) length message mT = [m1, m2, …, mk] (mi is the ith bit of the message). The (n) length codeword is represented by:
xT = [mT | bT] = m1, m2, …, mk, b1, b2, …, bn-k where bi is the ith check bit
That is, the codeword is formed by appending the (n-k) check bits to the (k) information bits
NOTE: Unless otherwise stated, all codes we develop will be systematic codes
Example 4.16
The repetition code is obviously systematic.
This is a systematic code since x1x2 = m1m2 and x3 = b1 is the check-bit.
This is a systematic code since x1x2 = m1m2 and x3x4 = b1b2 are the check-bits.
By making m1, m2, …, mk the independent variables and the b1, b2, …, bn-k the dependent variables
the b1, b2, …, bn-k can be expressed as explicit functions of the m1, m2, …, mk variables only. Hence:
Result 4.1
In encoding, the (n-k) check-bits are appended to the (k) length message to form the resultant (n)-
length codeword.
In decoding, after the correct codeword has been chosen (i.e. errors corrected), the (n-k) bits
appended to the (n)-length codeword are stripped to form the (k)-length message.
Define:
b as the (n-k) column vector of check bits
m as the (k) column vector of information bits
c as the (n) column codeword vector = [mT | bT]T ≡ x
P as the (n-k) x (k) coefficient matrix
G as the (n) x (k) generator matrix
H as the (n-k) x (n) parity-check matrix
Since the check-bits are generated from the information bits we represent this operation by using P :
b = Pm
Equation 4.9
and G = [Ik ; P] (the (k) × (k) identity matrix Ik stacked above P) generates the codeword:
c = Gm = [Ik ; P] m = [m ; Pm] = [m ; b]
Equation 4.10
and if H = [ P | In-k ], where In-k is the (n−k) × (n−k) identity matrix, then:
Hc = [P | In-k] [m ; b] = Pm + b = b + b = 0
Exercise: Show that HG = 0
H = [ P | In-k ]
Equation 4.11
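As an illustration of these relations, the following Python sketch (my own; it uses the L = 2 even-parity check code of Example 4.6, for which the single check bit is b1 = m1 + m2, i.e. P = [1 1]) builds G = [Ik ; P] and H = [P | In-k] and verifies that HG = 0 and Hc = 0 (mod 2).

import numpy as np

P = np.array([[1, 1]])                       # (n-k) x k coefficient matrix, here 1 x 2
k, nk = P.shape[1], P.shape[0]
G = np.vstack([np.eye(k, dtype=int), P])     # n x k generator matrix  [Ik ; P]
H = np.hstack([P, np.eye(nk, dtype=int)])    # (n-k) x n parity-check matrix  [P | In-k]

print((H @ G) % 2)                           # [[0 0]]  ->  HG = 0
m = np.array([0, 1])                         # message 01
c = (G @ m) % 2                              # codeword [0 1 1]
print(c, (H @ c) % 2)                        # syndrome 0: c satisfies the parity checks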
NOTE
1. The systematic form for H makes it a trivial operation to extract P and hence generate the check
bits from the information bits.
2. If H is in non-systematic form, the resulting parity-check equations will need to be manipulated,
or H can be directly manipulated by standard row operations into systematic form.
The parity-check matrix systematic structure adopted by S. Haykin is different than that described
here and both the rows and columns of H are reversed. Thus a parity-check matrix, H, from S.
Haykin:
      | a11 a12 … a1n |                                 | amn … am2 am1 |
  H = | a21 a22 … a2n |    is equivalent to our:   H = |  ⋮         ⋮  |
      |  ⋮   ⋮   ⋱  ⋮ |                                 | a2n … a22 a21 |
      | am1 am2 … amn |                                 | a1n … a12 a11 |
Example 4.17
Repetition code
With cT = [m1 b1 b2 … bn−1] and H = [P | In−1], where P is the (n−1) × 1 column of ones:

       | 1 1 0 0 … 0 0 | | m1   |
       | 1 0 1 0 … 0 0 | | b1   |
  Hc = |  ⋮          ⋮ | |  ⋮   | = 0
       | 1 0 0 0 … 1 0 | | bn−2 |
       | 1 0 0 0 … 0 1 | | bn−1 |
         P     In−1

The check bits are generated from the single message bit:

  b = [b1 b2 … bn−1]T = [1 1 … 1]T [m1] = Pm

and the codeword is generated by G = [I1 ; P]:

  c = [m1 b1 … bn−1]T = [1 1 … 1]T [m1] = Gm
Even-Parity Check Code
With cT = [m1 m2 … mn−1 b1] and H = [P | I1] = [1 1 1 … 1 1], where P is the 1 × (n−1) row of ones:

  Hc = [1 1 1 … 1 1] [m1 m2 … mn−1 b1]T = 0

The single check bit is generated from the message bits:

  b = [b1] = [1 1 1 … 1] [m1 m2 … mn−1]T = Pm = m1 + m2 + … + mn−1

and the codeword is generated by G = [In−1 ; P]:

       | m1   |   | 1 0 0 … 0 0 | | m1   |
       | m2   |   | 0 1 0 … 0 0 | | m2   |
  c =  |  ⋮   | = |  ⋮        ⋮ | |  ⋮   | = Gm
       | mn−1 |   | 0 0 0 … 0 1 | | mn−2 |
       | b1   |   | 1 1 1 … 1 1 | | mn−1 |
4.7.8 Issues
Given the 2^k solutions to the system Hx = 0 which we use as the codewords, and the corresponding 2^k information words or messages, we can generate a table of codewords with 2^k entries. Encoding will then involve a table lookup, and decoding will involve searching the same table for the closest matching codeword (Hamming distance decoding rule). Both operations involve an exponential increase of complexity with k. However there is a better way:
• systematic binary linear codes make encoding a simple logic operation on the k-bit message.
• syndrome decoding involves a simple logic operation on the n-bit received word for error detection. Error correction will involve a lookup operation on a table of size 2^(n−k), which becomes a simple logic operation on the n-bit received word for the case of single-bit error correction.
It should be noted that lookup of a 2^(n−k)-entry table is much less expensive than finding the closest match in a 2^k-entry table. Not only is a table lookup operation more efficient to implement (e.g. as an indexing operation) but (n−k) << k < n.
Let c represent a codeword and let r represent the codeword received with bit errors. The vector, e,
defined by:
e=c+r
is termed the error vector (error pattern) since it indicates the bit positions in which c and r differ
(i.e. there has been a bit error):
ei = 0 if ri = ci
ei = 1 if ri ≠ ci
Example 4.18
Consider (n,k) = (5,2) code with the following parity-check matrix:
1 0 0 1 0
H = 0 1 0 0 1
0 1 1 1 0
The code table will be shown to be:
Info Code
00 00000
01 01101
10 10110
11 11011
Say cT = [01101] was transmitted and rT = [01111] was received, that is bit r4 is in error, and thus
eT = [00010] and we note that r = c + e and c = r + e.
4.8.2 Syndrome
Now let:
H = [h1 | h2 | h3 | … | hn−1 | hn]
where hi corresponds to the ith column of H. And let:
e = [e1 e2 … en]T
where ei is the ith bit of e. Since r = c + e and Hc = 0, the syndrome satisfies s = Hr = H(c + e) = He. Consider:
He = [h1 | h2 | h3 | … | hn−1 | hn] [e1 e2 … en]T = e1[h1] + e2[h2] + … + en[hn] = s
Consider a bit error in bits j, k and l of the codeword, then ej = ek = el = 1 and all other ei = 0. Hence it follows that:
s = [hj] + [hk] + [hl]
That is, the syndrome is the sum of the columns in H which correspond to the bits in error. Thus we
get the following result:
Result 4.2
The syndrome, s, only depends on the error pattern, e, and the syndrome is the sum of the columns of H which correspond to the bits in error, that is s = Σ [hi] for i such that ei = 1.
Notice that although s is calculated from r it is related to e. But does knowing s allow us to get e?
Unfortunately there are many error combinations which can yield the same syndrome (i.e. we can't simply compute e = H⁻¹s since H is not invertible). We make the following observations:
1. Consider the two error patterns, ei and ej, which yield the same syndrome, then
Hei + Hej = s + s = 0, hence ci = (ei + ej) is by definition a codeword. Equivalently, if ei yields s
then ei + c also yields the same s for any and all codewords c.
Result 4.3
All error patterns that have the same syndrome differ by a codeword and a non-zero codeword
added to an error pattern will result in a different error pattern with the same syndrome.
2. The collection of all the error patterns that give the same syndrome is called a coset of the code.
Since there are 2^(n−k) different syndromes then there will be 2^(n−k) different cosets for the code. From Result 4.3, since there are 2^k codewords there will be 2^k distinct error patterns in each coset. [Note that with 2^(n−k) cosets and 2^k error patterns per coset we will have 2^(n−k) × 2^k = 2^n error patterns and hence all possible error pattern combinations will have been accounted for]. Thus for a given syndrome there will be 2^k distinct error patterns in the coset. Which error pattern is the right one?
3. Based on the Hamming distance decoding rule in the particular coset associated with the
syndrome calculated, it is reasonable to select the error pattern with the minimum number of bit
errors (i.e. minimum number of bits that are 1), as this is the most likely error pattern to occur.
The error pattern with the minimum number of bits being 1 in a particular coset is known as the
coset leader. Hence we choose the coset leader and add that to r to yield the most likely
codeword that could have been sent.
4. If there is more than one candidate coset leader this implies that for that particular syndrome the
error cannot be corrected since there is more than one error pattern which is equally likely.
If H is in systematic form (see Equation 4.11), then the coefficient matrix, P, is trivially extracted.
If H is not in systematic form, it will need to be converted to systematic form by the appropriate row
operations. Alternatively, the system of equations Hx = 0 can be converted directly to the form
b = Pm by the appropriate algebraic operations.
Encoding the (k) bit information word, m, to the (n)-bit codeword, c, proceeds as follows:
Example 4.19
Consider the encoding process for the code from Example 4.18 with parity-check matrix:
1 0 0 1 0
H = 0 1 0 0 1
0 1 1 1 0
Since H is not in systematic form we perform the following row operations in the order stated
• swap rows 2 and 3
• swap rows 1 and 2
• add row 2 to row 1
and hence:
           | 1 1 1 0 0 |             | 1 1 |
  Hsys  =  | 1 0 0 1 0 |    ⇒    P = | 1 0 |
           | 0 1 0 0 1 |             | 0 1 |

Thus we get the check-bit generation equations:

  b = Pm  ⇒   b1 = m1 + m2   i.e.   x3 = x1 + x2
              b2 = m1               x4 = x1
              b3 = m2               x5 = x2
The code table is now derived by generating the check-bits from the message and appending the
check-bits to the message to form the codeword:
Info Code
x1x2 x1x2x3x4x5
00 00000
01 01101
10 10110
11 11011
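The encoding operation b = Pm can be checked with a short Python sketch (my own) that regenerates the code table above from P.

import numpy as np
from itertools import product

P = np.array([[1, 1], [1, 0], [0, 1]])       # from Hsys = [P | I3] of Example 4.19
for bits in product([0, 1], repeat=2):
    m = np.array(bits)
    b = (P @ m) % 2                          # check bits b1 b2 b3
    c = np.concatenate([m, b])               # systematic codeword x1..x5 = m1 m2 b1 b2 b3
    print(''.join(map(str, m)), ''.join(map(str, c)))
# prints: 00 00000, 01 01101, 10 10110, 11 11011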
Decoding the (n)-bit received word, r, to retrieve the (k)-bit information word, m, proceeds as
follows:
But how do we determine the coset leader? From Result 4.2 and Result 4.3 at least two approaches
suggest themselves:
1. Find the minimum number of columns in H that when added together yield the syndrome. The
locations of the columns specify the bit positions of the coset leader, ec.
2. Find one combination, any combination, of columns in H that when added together yields the
syndrome. Define the error pattern, e, such that ei = 1 if column i was used in the combination
and then form the coset by adding each of the 2^k codewords to e and then locate the coset leader as the
error pattern with the minimum number of bits being 1.
Since the above operations are expensive to perform for each received word, the usual practice is to prime the decoder with the pre-determined coset leader for each of the possible 2^(n−k) syndromes. The coset leader is then found by using s to index (lookup) the 2^(n−k) table of coset leaders.
In the case of single-bit errors, from Result 4.2 if s matches the ith column of H then bit i is in error
(the coset leader has ei = 1, and all other bits are 0) and the codeword, c, is obtained by inverting the
ith bit of r. Thus there is no need to explicitly determine the coset leader.
Example 4.20
Consider the decoding process for the code from Example 4.18.
Say rT = [01111] is received. The syndrome s is calculated:

           | 1 1 1 0 0 | | 0 |     | 0 |
  s = Hr = | 1 0 0 1 0 | | 1 |  =  | 1 |
           | 0 1 0 0 1 | | 1 |     | 0 |
                         | 1 |
                         | 1 |

s = [0 1 0]T matches only the 4th column of H ⇒ bit 4 is in error (i.e. ecT = [00010]) and hence
cT = [01101] which is a valid codeword. The message sent is then mT = [01].
Say rT = [00111] is received. The syndrome s = Hr = [1 1 1]T does not match any column of H.
Looking for the minimum number of columns which when added yield s we see that:

  h2 + h4 = [1 0 1]T + [0 1 0]T = [1 1 1]T = s   and   h1 + h5 = [1 1 0]T + [0 0 1]T = [1 1 1]T = s
Hence there are two candidate coset leaders representing double-bit errors:
ec1T = [01010] and ec2T = [10001] and thus the error cannot be corrected. Indeed we see that
c1T = (ec1 + r)T = [01101] and c2T = (ec2 + r)T = [10110] are equally distant from rT = [00111].
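The complete single-error syndrome decoding procedure for this code can be sketched in a few lines of Python (my own illustration, using Hsys from Example 4.19).

import numpy as np

H = np.array([[1, 1, 1, 0, 0],
              [1, 0, 0, 1, 0],
              [0, 1, 0, 0, 1]])

def decode(r):
    r = np.array(r)
    s = (H @ r) % 2                          # syndrome s = Hr
    if not s.any():
        return r                             # zero syndrome: no error detected
    for i in range(H.shape[1]):              # does s match a single column of H?
        if np.array_equal(s, H[:, i]):
            c = r.copy(); c[i] ^= 1          # coset leader has a single 1 in position i
            return c
    return None                              # more than one bit in error: not corrected here

print(decode([0, 1, 1, 1, 1]))   # [0 1 1 0 1]: bit 4 corrected
print(decode([0, 0, 1, 1, 1]))   # None: double-bit error, cannot be corrected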
Encoding
Generation of the (n-k) check-bits from the m message bits, b = Pm, can easily be implemented in
hardware using XOR logic gates or making use of XOR assembly language instructions for
implementation in software or embedded systems.
[Figure: combinational encoding logic — the k message bits m1 … mk pass straight through to become codeword bits c1 … ck, while an XOR logic gate array generates the n−k check bits b1 … bn−k, which become codeword bits ck+1 … cn]
Decoding
Assuming an FEC system, for each syndrome, s, the corresponding unique coset leader, ec, is
derived and stored. Thus syndrome decoding becomes a table lookup operation using s as the index
into a table of size 2^(n−k). The generation of the syndrome can be implemented using XOR logic gates or special XOR assembly instructions. The table lookup operation can be implemented using an (n−k)-to-2^(n−k) decoder and inverters can be used to correct the required bits in error.
[Figure: syndrome decoding logic — the n received bits r1 … rn feed an XOR logic gate array that produces the n−k syndrome bits s1 … sn−k; an (n−k)-to-2^(n−k) decoder uses the syndrome to enable the inverter (or buffer) logic that corrects the bits in error and outputs the codeword bits c1 … cn via OR gates]
The Hamming weight of a word, w(x), is defined as the number of bits distinct from 0. For each
non-trivial code K, the smallest Hamming weight of a code word distinct from 0T = [000...0] is
called the minimum weight of K, w(K).
Example 4.21
Hamming weight examples:
w(xT = [111]) ≡ w(111) = 3
w(101000) = 2
w(110110) = 4
For the r x s rectangular code, K, it is not necessary to list the codewords in order to find the
minimum weight. All that is needed is to use the row and column parity-check equations and
attempt to find the non-zero codeword solution which uses the least number of bits equal to 1. It can
thus be shown that w(K) = 4 for any r x s rectangular code.
Relationship between Hamming distance and Hamming weight for binary linear codes
For each non-trivial binary linear code the minimum distance, d(K), is equal to the minimum
weight, w(K). That is d(K) = w(K)
Property 4.3
A binary linear code corrects (or detects) all t errors if and only if its minimum weight is larger than
2t (or larger than t, respectively)
Property 4.4
Example 4.22
w(K) = n for the n-length repetition code hence:
" detect all (n-1) bit errors and correct all (n-1)/2 errors
Let x be the codeword with a Hamming weight equal to the minimum weight, w(K), of the code K.
If H is a parity-check matrix for K, then we know Hx = 0. Now by definition x has exactly w(K) bits
which are 1, the remaining bits being 0. Thus the operation Hx = 0 effectively sums w(K) columns
of H to 0. This gives us the following results:
Result 4.4
For a linear code K with parity-check matrix, H, if the minimum number of columns of H that sum
to zero is n, then w(K) = n.
Result 4.5
For a linear code K with parity-check matrix, H, if no combination of n or less columns of H sum to
zero, then w(K) > n.
For single-bit error detection w(K) = 2 or w(K) > 1. From Result 4.5 this means no single column of
H must sum to zero, i.e. no column of H must be a zero column. This means w(K) is at least 2 and
can detect all single-bit errors.
Property of H for single-error detection
A binary linear code K detects single-bit errors if and only if every parity-check matrix of K has
non-zero columns (i.e. no column is the all-zeros column).
For single-bit error correction w(K) = 3 or w(K) > 2. From Result 4.5 this means no single column
(i.e. n = 1) of H must sum to zero and no two columns (i.e. n = 2) of H must sum to zero. The first
condition implies non-zero columns and the second condition implies no two columns of H are
identical (if and only if two columns are identical will they sum to zero), that is, the columns are
pairwise distinct.
Property of H for single-error correction
A binary linear code K corrects single-bit errors if and only if every parity-check matrix of K has
non-zero, pairwise distinct columns (i.e. no column is the all-zeros column and no two columns are
the same).
If a binary matrix with r rows has to have nonzero, pairwise distinct columns this means that the number of columns, c < 2^r. For a parity-check matrix with (n−k) rows and (n) columns this means:
Property of K for single-error correction
n < 2^(n−k)
Example 4.23
(a)
It is required to design the most efficient code for single-bit error correction with k = 2. This implies a code with maximum information rate: max R = k/n = 2/n ⇒ min n. From the property n < 2^(n−2), the smallest permissible value is n = 5, and hence n − k = 3 check bits are needed. Thus H is a 3 × 5 matrix.
For single-bit error correction H must have nonzero, pairwise distinct columns. Furthermore we
want H to be in systematic form, so one possible solution is:
      | 1 0 1 0 0 |         x3 = x1
  H = | 1 1 0 1 0 |    ⇒    x4 = x1 + x2
      | 0 1 0 0 1 |         x5 = x2
(b)
It is now required to design the most efficient code for 2-bit error correction with k = 2. This requires a code with at least w(K) = 5. To determine the smallest n that permits this we use the upper bound expression B(n,5) derived in Example 4.7 for the same case of M = 4, d(K) = 5 and L = 2, which gave N ≡ n = 8; hence n − k = 6 check bits are needed and H is a 6 × 8 matrix.
To design the matrix we note that to be in systematic form, predefines the 6x6 identity matrix, I6.
How do we choose the remaining 2 columns (which define P)? Since w(K) = 5 we must ensure that
no combination of 4 or less columns will add to zero. Here is one solution:
      | 1 1 1 0 0 0 0 0 |         x3 = x1 + x2
      | 0 1 0 1 0 0 0 0 |         x4 = x2
      | 1 1 0 0 1 0 0 0 |         x5 = x1 + x2
  H = | 0 1 0 0 0 1 0 0 |    ⇒    x6 = x2
      | 1 0 0 0 0 0 1 0 |         x7 = x1
      | 1 0 0 0 0 0 0 1 |         x8 = x1
        P   I6
Info Code
x1x2 x1x2x3x4x5x6x7x8
00 00000000
01 01111100
10 10101011
11 11010111
From which we see that the minimum weight of the code is indeed w(K) = 5. Also R = 2/8 = 1/4, which is not that good. We may need to increase k to achieve better efficiency.
A binary linear (n,k)-code requires 2^k codewords of length n. Thus there are 2^n possible words of length n, 2^k of these are codewords and 2^n − 2^k of these are non-codewords. If the code corrects
up to t errors, the non codewords are either:
Type 1: non codewords of distance t or less from precisely one of the codewords, that is, the t or
less bits in error can be corrected.
Type 2: non codewords of distance greater than t from two or more codewords, that is, the errors
can only be detected.
If all the non codewords are of Type 1 then the code is a perfect code.
A linear code of length n is called perfect for t errors provided that for every word r of length n,
there exists precisely one code word of Hamming distance t or less from r.
With only Type 1 non-codewords, r, then s = Hr must be able to correct up to t-bit errors. Thus the 2^(n−k) possible values of s must be the same as the total number of different combinations of 0, 1, 2, …, t−1, t bit errors of an n-bit word. Thus a linear (n,k) code is perfect for t errors if:

  2^(n−k) = Σ_{i=0}^{t} (n choose i)

Equation 4.12
Example 4.24
Consider the code from Example 4.23(a):

  Info   Code
  00     00000
  01     01011
  10     10110
  11     11101

Consider the 5-bit word 00111:
• It is not of distance (t = 1) or less from precisely one codeword
• It is, in fact, of distance 2 from two codewords: 01011 and 10110
⇒ NOT a perfect code
Furthermore:

  2^(n−k) = 2^(5−2) = 2^3 = 8  ≠  Σ_{i=0}^{t=1} (5 choose i) = 1 + 5 = 6
The Hamming codes we now consider are an important class of binary linear codes. Hamming
codes represent the family of perfect codes for single-error correction.
For single-error correction t = 1, and since the code is perfect from Equation 4.12:
  2^(n−k) = Σ_{i=0}^{t=1} (n choose i) = n + 1   ⇒   n = 2^m − 1, where m = n − k and k = n − m = 2^m − m − 1
1. For single-error correction, H must have non-zero, pairwise distinct columns. For m rows we can have, at most, 2^m − 1 non-zero and distinct columns for H (i.e. n ≤ 2^m − 1).
2. Since R = k/n = (n − m)/n, the more columns we have (larger n) per fixed m the better our information rate. Thus the condition for perfect codes, n = 2^m − 1, implies maximum information rate.
A binary linear code is called a Hamming code provided that it has, for some number m, a parity check matrix H of m rows and 2^m − 1 columns such that each non-zero binary word of length m is a column of H.
Since the Hamming codes are perfect codes for single-error correction, all 2^n possible received words, r, are either a codeword or a codeword corrupted by 1 bit. Thus for each possible syndrome,
s, there must be only one coset leader. From the syndrome decoding condition for single-error
correcting codes we have the following decoding procedure for Hamming codes:
1. Calculate the syndrome: s = Hr
2. If s is zero there is no error.
3. If s is non-zero, let i be the column of H which matches s, then correct the ith bit of the received
word to form the codeword, c.
4. Decode to the information word, m, by stripping off the (n-k) check bits appended to c.
Example 4.25
Consider the following (7,4) Hamming code parity-check matrix:
      | 0 0 0 1 1 1 1 |
  H = | 0 1 1 0 0 1 1 |    ⇒    n = 7 and k = 4, m = n − k = 3
      | 1 0 1 0 1 0 1 |
Design the encoding logic, list the codewords and perform syndrome decoding on:
0110010 0110101 1101100 0001111
Code Table
We need to convert H to systematic form:
           | 0 1 1 1 1 0 0 |         x5 = x2 + x3 + x4
  Hsys  =  | 1 0 1 1 0 1 0 |    ⇒    x6 = x1 + x3 + x4
           | 1 1 0 1 0 0 1 |         x7 = x1 + x2 + x4
Encoding Logic
[Figure: shift-register encoding logic — the message bits x4 x3 x2 x1 are shifted straight through (switch UP for the first 4 bits) and an XOR gate array forms the check bits x7, x6, x5 which are then shifted out (switch DOWN for the last 3 bits), giving the codeword x7x6x5x4x3x2x1]
N.B. Bits are serially shifted to the right with the LSB first (right-hand side) and MSB last (left-
hand side), thus the message and codeword bits are “reversed”.
Syndrome Decoding
We calculate s =Hr (using the original non-systematic form of H):
rT sT i cT mT
0110010 111 7 0110011 0110
0110101 011 3 0100101 0100
1101100 010 2 1001100 1001
0001111 000 - 0001111 0001
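A short Python sketch (my own) of this decoding procedure; with the non-systematic H above, the i-th column is the binary representation of i, so the syndrome read as a binary number directly names the bit position in error.

import numpy as np

H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

def correct(word):
    r = np.array([int(b) for b in word])
    s = (H @ r) % 2
    i = 4 * s[0] + 2 * s[1] + s[2]           # syndrome as a binary number = error position
    if i:
        r[i - 1] ^= 1                        # invert the bit in error
    return ''.join(map(str, r))

for word in ['0110010', '0110101', '1101100', '0001111']:
    c = correct(word)
    print(word, c, c[:4])                    # received word, codeword, 4-bit message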
NOTE
2. The information rate for the Hamming codes goes as R = (n − m)/n = 1 − m/(2^m − 1); but although R tends to 1 rapidly for larger m we correct only single errors for increasingly large blocks ⇒ less error protection.
  Pe(m) = 1 − (2^m−1 choose 0) p^(2^m−1) − (2^m−1 choose 1) q p^(2^m−2) = 1 − p^(2^m−1) − (2^m − 1) q p^(2^m−2)
Assume the codeword ci was transmitted and r is received. The following types of errors can be
present when decoding a word:
1. No error (s = 0 and r = ci) ⇒ ideal
2. Detected error (s ≠ 0 and r ≠ any c):
a) correctly corrected (coset leader ⊕ r = ci ⇒ non-fatal)
b) cannot correct (two or more candidate coset leaders ⇒ non-fatal)
c) incorrectly corrected (coset leader ⊕ r = cj ≠ ci ⇒ fatal)
3. Undetected error (s = 0 and r = cj ≠ ci) ⇒ fatal
In FEC error-control both 2(c) and 3 are fatal (and also 2(b) if there is no ARQ capability). For
ARQ error-control only 3 is fatal since the case of detected errors in 2 are dealt with in the same
way (by re-transmission). Since most communication systems use ARQ, it is the undetected errors
that are a universal problem.
An undetected error occurs if the received word, r, is a different codeword, (r = cj), than was transmitted, ci. Thus:
e = ci + r = ci + cj (e is the sum of two codewords ⇒ e is a codeword itself)
The probability of undetected error, Pund(K), is the sum of the probabilities of occurrence of the error patterns which yield an undetected error. Define ei as an error pattern with i bits in error; the probability of occurrence of ei is q^i p^(n−i). Since the ei of interest are those which are the same as the non-zero codewords, we let Ai = number of code words with Hamming weight i, and hence:

  Pund(K) = Σ_{i=1}^{n} Ai q^i p^(n−i)
If the minimum weight of the code is d(K) then A1 = A2 = … = Ad(K)−1 = 0, so we have:

  Pund(K) = Σ_{i=d(K)}^{n} Ai q^i p^(n−i)
Example 4.26
Consider the (7,4) Hamming code table of Example 4.25 from which we see that:
d(K) = 3 ! A3 = 7, A4 = 7, A5 = 0, A6 = 0, A7 = 1
Hence:
  Pund(K) = 7q^3 p^4 + 7q^4 p^3 + q^7
If q = 0.01, then:
  Pund(K) ≈ 7 × 10^-6 ⇒ about 7 words in a million will have undetected errors
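The same calculation follows directly from the weight distribution; a small sketch of my own:

A = {3: 7, 4: 7, 7: 1}                       # Ai for the (7,4) Hamming code
q = 0.01
p = 1 - q
P_und = sum(a * q**i * p**(7 - i) for i, a in A.items())
print(P_und)                                 # approximately 7e-6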
The discussion so far has assumed that bit errors occur randomly (i.e. the channel is memoryless or independent). In practice, however, the condition that causes the bit error may last long enough to affect a sequence of bits (i.e. the channel has memory). If a sequence of ℓ consecutive bits is suddenly subject to interference, which causes a large number of them to be in error, then we have a condition called a burst error of length ℓ. A specific type of burst error is the multiple unidirectional error which forces groups of bits to all go 0 (low) or 1 (high). This can be caused by a common mode of failure in the memory block or bus causing a group of local cells or bus lines to go low (i.e. shorted) or high.
Although codes can be designed to specifically handle ℓ-length burst errors (e.g. the CRC codes discussed in Section 4.15), a larger class of codes (e.g. the Hamming codes) exists which is applicable only to random (or independent channel) errors. One way to convert burst errors to random errors is to interleave bits entering the channel in a deterministic manner. If a sequence of bits experiences a burst error the corresponding deinterleaving operation will spread these errors out and hence "randomise" them.
[Diagram: Input Data → Encoder → Interleaver → Burst-error Channel → Deinterleaver → Decoder → Output Data; the Encoder/Decoder pair effectively sees an independent-error channel]
Example 4.27
Consider the following 16-bit block of data: b1 b2 b3 b4 b5 b6 b7 b8 b9 b10 b11 b12 b13 b14 b15 b16
1:4 Interleaver: b1 b5 b9 b13 b2 b6 b10 b14 b3 b7 b11 b15 b4 b8 b12 b16
A burst-error of length 4 occurs, corrupting 4 consecutive bits of the interleaved stream (for example b13 b2 b6 b10)
1:4 Deinterleaver: b1 b2 b3 b4 b5 b6 b7 b8 b9 b10 b11 b12 b13 b14 b15 b16
And the burst error of length 4 has been converted to a 1-bit random error in each n = 4 code block.
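A minimal sketch (my own) of the 1:4 block interleaver/deinterleaver pair, showing how a length-4 burst is spread into single errors.

def interleave(bits):
    # read out b1 b5 b9 b13 b2 b6 b10 b14 ... (write row-wise, read column-wise)
    return [bits[(j % 4) * 4 + j // 4] for j in range(16)]

def deinterleave(bits):
    out = [None] * 16
    for j, b in enumerate(bits):
        out[(j % 4) * 4 + j // 4] = b
    return out

data = list(range(1, 17))                    # stand-in for b1 ... b16
tx = interleave(data)
tx[4:8] = ['X'] * 4                          # a burst error of length 4 on the channel
print(deinterleave(tx))                      # exactly one corrupted bit in each 4-bit block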
The m-out-of-n codes are formed as the set of n-bit vectors, where a vector is a code word if and only if exactly m of the n bits are 1. The 2-out-of-5 code is the most common example of such a code:
Table 4.1: 2-out-of-5 code
message number codeword
1 00011
2 00101
3 01001
4 10001
5 00110
6 01010
7 10010
8 01100
9 10100
10 11000
Properties
• Non-systematic code (need to do a table lookup)
• Single error detection
• Multiple unidirectional error detection
• Used in encoding control signals
Since the code is non-systematic its use for encoding and decoding communication messages is
limited. However control signals which do not need to be decoded (just interpreted) can be
configured to exist as m-out-of-n codes, and any multiple unidirectional error on the control bus can
be detected. In such a case it is obvious that the number of 1’s will change and the error can be
detected.
Example 4.28
Consider a 2-out-of-5 code and assume message 7 (encoded as 10010) is sent.
Suppose a unidirectional error that causes the first 3 bits to go low occurs (most common error
when adjacent memory cells or bus lines experience a common failure like a short-circuit), then the
received message will be 00010 and this error is detected (since only one bit is a 1).
Suppose that now the first three bits are all tied high, then the received message will be 11110 and
this error is also detected (since four bits are now 1).
Checksum codes are generated by appending an n-bit checksum word to a block of s n-bit data words; the checksum is formed as the sum of the s n-bit words using modulo 2^n addition.
Encoding: use an n-bit arithmetic adder to add the s n-bit data words, with any carry beyond the nth
bit being discarded. The sum is then appended to the data block as the n-bit checksum
Decoding: use the same n-bit arithmetic adder to add the s n-bit data words and XOR the sum with
the appended checksum, if the result is zero then there is no error.
Properties
• Simple and inexpensive, hence large number of uses
• Single error detection at least
• Multiple error detection dependent on s and columns in error: error coverage is highest for the
least significant bits
• Error coverage can be controlled by weighting the bits or using extended checksums
• Long error latency (must wait for end of data block)
• Low diagnostic resolution (error can be anywhere in the s n-bit words)
• Uses: sequential storage devices, block-transfer peripherals (disks, tapes), read-only memory,
serial numbering, etc.
Example 4.29
Let n = 8 and s = 8, then we append an 8-bit checksum, Ck, after every block of 8 8-bit data words (bytes), wi, calculated as follows: Ck = (Σ_{i=1}^{8} wi) mod 2^8. Consider the following data block (in hex):
01 FA 02 32 62 A1 22 E3
The sum (in hex) = 337 and Ck = 337(hex) mod 256 = 37(hex)
(i.e., we simply retain the low-order 8 bits of the sum and discard the rest)
The multiple error coverage can be improved by using an extended checksum of m bits for n-bit data where m > n and the sum is modulo 2^m addition (i.e. we retain the low-order m bits of the sum). As an example, n = 8 and m = 16 and a 2-byte checksum is formed from the sum of the s data bytes in the block.
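A small sketch (my own) of the encoding and decoding checks for this example:

block = [0x01, 0xFA, 0x02, 0x32, 0x62, 0xA1, 0x22, 0xE3]
ck = sum(block) % 256                        # retain only the low-order 8 bits of the sum
print(hex(sum(block)), hex(ck))              # 0x337 -> checksum 0x37

# decoding: re-add the data words and XOR with the appended checksum; zero means no error
print((sum(block) % 256) ^ ck)               # 0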
Check symbol (a10 = X = 10) is derived such that Σ_{i=1}^{10} i·a(11−i) is divisible by 11, where ak is the kth digit from the left in the ISBN code. For the above example we see that:
Sum = (i = 10)·(a1 = 0) + (i = 9)·(a2 = 1) + 8·3 + 7·2 + 6·8 + 5·3 + 4·7 + 3·9 + 2·6 + (i = 1)·(a10 = X = 10) = 187, which is divisible by 11
The last digit of the 7 digit student number, s7, is the check digit derived such that:
Sum = (8*s1 + 4*s2 + 6*s3 + 3*s4 + 5*s5 + 2*s6 + s7) is divisible by 11
4.15.1 Introduction
The problem with checksum codes is that simply summing bits to form a checksum (or check bits)
does not produce a sufficiently complex result. By complex we mean that the input bits should
maximally affect the checksum, so that a wide range of bit errors or group of bit errors will affect
the final checksum and be detected. A more complex “checksumming” operation can be formed by
performing division rather than summation. Furthermore the remainder, not the quotient, is used, as the remainder "gets kicked about quite a lot during the calculation, and if more bytes were added to the message (dividend) its value could change radically again very quickly".
In the previous sections the idea of the parity-check matrix, check-bit generation and syndrome
decoding as a means for efficient design and implementation of linear codes was shown from the
point of view of binary matrix linear algebra. The cyclic codes are a subclass of linear codes which
satisfy the following two properties:
Cyclic codes
1. Linearity property: the sum (modulo 2) of any two codewords in the code is also a codeword.
2. Cyclic property: any cyclic shift of a codeword in the code is also a codeword.
Example 4.30
In a systematic cyclic code the check-bits are generated as the remainder of a division operation and
as indicated this provides a more powerful form of “checksumming”. The cyclic codes are analysed
not by representing words as binary vectors and using matrix binary linear algebra but by
representing words as binary polynomials and using polynomial arithmetic modulo 2. Like linear
codes the cyclic codes also permit efficient implementations of the encoding and decoding
operations, especially fast logic implementations for serial data transmission. The properties of
cyclic codes also allow greater scope for designing codes for specialised uses, especially in handling
burst errors. Not surprisingly all modern channel coding techniques use cyclic codes, especially the
important class of Cyclic Redundancy Check (CRC) codes for error detection.
We are adopting the same notation that the textbook by S. Haykin uses. Compared to the previous
definition in Section 4.7, there are two major differences:
1. The indexing sense has been reversed, and indexing starts at 0 not 1.
2. The codeword representation is now reversed. The bits are now running to the “right”.
Previously the 7-bit codeword 1101110 implied a message of 1101 followed by the check-bit
sequence 110, all running to the “left”. Now the same 7-bit codeword is bit-reversed to
0111011, which implies a message of 1011 followed by the check-bit sequence 011, all
running to the “right”. In both cases the message and check-bits are actually identical.
We denote the column vectors using this reversed notation with a ^, that is:
r T = [1101110] ⇒ rˆ T = [0111011]
For a codeword of length n we define the code polynomial, C(x), as:
C(x) = Σ_{i=0}^{n-1} c_i x^i
We can similarly define the message polynomial, M(x), as:
M(x) = Σ_{i=0}^{k-1} m_i x^i
and the check-bit polynomial, B(x), as:
B(x) = Σ_{i=0}^{n-k-1} b_i x^i
Example 4.31
Consider the codeword: 010 01100 with n = 8 and k = 5.
The left-most bits 010 are the check-bits and hence:
B(x) = 0 + 1x + 0x^2 = x
The right-most bits 01100 are the message bits and hence:
M(x) = 0 + 1x + 1x^2 + 0x^3 + 0x^4 = x + x^2
The codeword polynomial is then:
C(x) = 0 + 1x + 0x^2 + 0x^3 + 1x^4 + 1x^5 + 0x^6 + 0x^7 = x + x^4 + x^5
C(x) = B(x) + x^3 M(x) = x + x^3(x + x^2) = x + x^4 + x^5
Say C(x) ≡ 01001100 is now cyclically right shifted by 3 bits, then we get C^(3)(x) ≡ 10001001, and
hence:
C(x) = x + x^4 + x^5  ⇒  C^(3)(x) = 1 + x^4 + x^7
Now:
x^3 C(x) = x^4 + x^7 + x^8
And dividing x^3 C(x) by (x^8 + 1) leaves:
x^3 C(x) mod (x^8 + 1) = x^7 + x^4 + 1 = C^(3)(x)
Let:
x^3 C(x) mod (x^8 + 1) = R(x)
where R(x) is the remainder, then:
x^3 C(x) = (x^8 + 1)Q(x) + R(x)
where Q(x) is the quotient. From the above results we have that:
Q(x) = 1 and R(x) = x^7 + x^4 + 1
∴ x^3 C(x) = x^8 + x^7 + x^4 = (x^8 + 1)·1 + (x^7 + x^4 + 1)
In general, for a cyclic shift of a length-n codeword by j bits:
x^j C(x) = (x^n + 1)Q(x) + C^(j)(x)          (Equation 4.15)
that is, C^(j)(x) is the remainder left when x^j C(x) is divided by (x^n + 1).
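This polynomial view of a cyclic shift is easy to check numerically. The short Python sketch below (our own
helper, with a polynomial packed into an integer so that bit i holds the coefficient of x^i) reproduces the
3-bit shift of Example 4.31:

# Cyclic shift as x^j C(x) mod (x^n + 1), using integers as GF(2) polynomials (illustrative sketch).

def gf2_mod(dividend, divisor):
    """Remainder of GF(2) polynomial division (bit i = coefficient of x^i)."""
    dl, sl = dividend.bit_length(), divisor.bit_length()
    while dl >= sl and dividend:
        dividend ^= divisor << (dl - sl)
        dl = dividend.bit_length()
    return dividend

C = (1 << 1) | (1 << 4) | (1 << 5)         # C(x) = x + x^4 + x^5, i.e. the codeword 01001100
n, j = 8, 3
shifted = gf2_mod(C << j, (1 << n) | 1)    # x^3 C(x) mod (x^8 + 1)
print(bin(shifted))                        # 0b10010001 -> 1 + x^4 + x^7 = C^(3)(x)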
We choose G(x) to be a factor of x^n + 1 (i.e. x^n + 1 = G(x)H(x), where H(x) is known as the parity-
check polynomial) and form all words of the form C(x) = A(x)G(x) as code polynomials belonging
to the code defined by G(x). The code so defined is obviously linear, but is it cyclic? That is, will a
cyclic shift of C(x) yield a polynomial which can be defined by, say, A_2(x)G(x) (i.e. C^(j)(x) =
A_2(x)G(x) for some A_2(x)) and thus belong to the code? Using Equation 4.15 with the appropriate
substitutions (C(x) = A(x)G(x) and (x^n + 1) = G(x)H(x)), we have:
x^j A(x)G(x) = Q(x)G(x)H(x) + C^(j)(x)
Dividing throughout by G(x) we get:
x^j A(x) = Q(x)H(x) + C^(j)(x)/G(x)
∴ C^(j)(x)/G(x) = x^j A(x) + Q(x)H(x) = A_2(x)
and hence C^(j)(x) = A_2(x)G(x). Thus:
Result 4.6
If the degree n-k polynomial G(x) as defined by Equation 4.17 is a factor of (x^n + 1), then all code
polynomials (i.e. codewords) formed as the product A(x)G(x), where A(x) is any polynomial of degree
k-1 or less, belong to the (n,k) cyclic code defined by G(x).
NOTE
In the same way that the parity-check matrix, H, was used to define a linear code the generator
polynomial, G(x), is used to define a cyclic code.
By making A(x) = M(x) we simply derive C(x) = M(x)G(x), but this will not give us a systematic
(i.e. separable) code. For a systematic code we must have:
C(x) = A(x)G(x) = B(x) + x^{n-k} M(x)
Hence dividing the above through by G(x):
x^{n-k} M(x)/G(x) = A(x) + B(x)/G(x)
Thus B(x) is the remainder left when dividing x^{n-k} M(x) by G(x).
Encoding Systematic Cyclic Codes
Example 4.32
Consider G(x) = 1 + x + x^3 with n=7 and k=4 (we will later see that this G(x) produces a (7,4)
Hamming code). Thus C(x) = x^3 M(x) + B(x), where B(x) is the remainder left when dividing x^3 M(x)
by G(x). Encoding can be carried out algebraically in this way for the messages 1000, 1010 and 1011;
the division shown below is for the message 1011, i.e. M(x) = 1 + x^2 + x^3 and x^3 M(x) = x^3 + x^5 + x^6:
  x^6 + x^5 + x^3
+ x^6 + x^4 + x^3        (x^3 · G(x))
  -----------------
  x^5 + x^4
+ x^5 + x^3 + x^2        (x^2 · G(x))
  -----------------
  x^4 + x^3 + x^2
+ x^4 + x^2 + x          (x · G(x))
  -----------------
  x^3 + x
+ x^3 + x + 1            (1 · G(x))
  -----------------
  1
so Q(x) = x^3 + x^2 + x + 1 and B(x) = 1, giving C(x) = x^3 M(x) + B(x) = 1 + x^3 + x^5 + x^6,
i.e. the codeword 1001011. The messages 1000 and 1010 are encoded in the same way.
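The same check-bit computation can be written as GF(2) polynomial division in a few lines of code. The
Python sketch below (our own function names; polynomials are packed into integers with bit i holding the
coefficient of x^i) reproduces the encoding of message 1011 above, and of message 1010 used in Example 4.33 below:

# Systematic cyclic encoding by GF(2) polynomial division (illustrative sketch).

def gf2_divmod(dividend, divisor):
    """GF(2) polynomial division; polynomials packed as ints (bit i = coefficient of x^i)."""
    q = 0
    dl, sl = dividend.bit_length(), divisor.bit_length()
    while dl >= sl:
        shift = dl - sl
        q |= 1 << shift
        dividend ^= divisor << shift
        dl = dividend.bit_length()
    return q, dividend                      # quotient, remainder

def encode(message_bits, g, n):
    """Return the systematic codeword bits c0..c_{n-1} (check bits, then message bits)."""
    k = len(message_bits)
    M = sum(bit << i for i, bit in enumerate(message_bits))    # M(x)
    _, B = gf2_divmod(M << (n - k), g)                         # B(x) = x^(n-k) M(x) mod G(x)
    C = B | (M << (n - k))                                     # C(x) = B(x) + x^(n-k) M(x)
    return [(C >> i) & 1 for i in range(n)]

G = 0b1011                               # G(x) = 1 + x + x^3 (bit i = coefficient of x^i)
print(encode([1, 0, 1, 1], G, 7))        # [1, 0, 0, 1, 0, 1, 1] -> codeword 1001011 (message 1011)
print(encode([1, 0, 1, 0], G, 7))        # [0, 0, 1, 1, 0, 1, 0] -> codeword 0011010 (message 1010)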
For decoding consider R(x) as the received word polynomial of degree n-1. Let S(x) denote the
remainder of dividing R(x) by G(x). Since C(x) = A(x)G(x) then if there are no errors we expect R(x)
= C(x) and hence no remainder (i.e. S(x) = 0). If S(x) ≠ 0 then we know there is an error. S(x) is
called the syndrome polynomial.
We have:
R(x) = C(x) + E(x)
Now:
Q(x)G(x) + S(x) = A(x)G(x) + E(x)
where Q(x) is the quotient and S(x) is the remainder when R(x) is divided by G(x). Then:
E(x) = (Q(x) + A(x))G(x) + S(x)
Hence the syndrome can be obtained as the remainder of dividing the error polynomial, E(x), by
G(x) (i.e. S(x) = E(x) mod G(x)). Just as the syndrome depended only on the error pattern for
linear codes, the syndrome polynomial depends only on the error polynomial for cyclic codes.
Thus the syndrome polynomial can be used to find the most likely error polynomial.
Example 4.33
Consider the cyclic code from Example 4.32 with G(x) = 1 + x + x^3. Say the codeword
ĉ^T = [0011010] is transmitted, but r̂^T = [0011110] is received (an error in bit r_4). To decode we
form:
R(x) = x^2 + x^3 + x^4 + x^5
and divide R(x) by G(x) to yield the remainder S(x):
  x^5 + x^4 + x^3 + x^2
+ x^5 + x^3 + x^2        (x^2 · G(x))
  -----------------------
  x^4
+ x^4 + x^2 + x          (x · G(x))
  -----------------------
  x^2 + x
so Q(x) = x^2 + x and the remainder is S(x) = x^2 + x.
Thus S(x) = x^2 + x ⇒ ŝ^T = [011], and since S(x) ≠ 0 an error is detected and syndrome decoding
can be used to attempt to correct the error.
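The syndrome is just another GF(2) polynomial remainder, so it can be computed with the same kind of
division routine. A self-contained Python sketch (our own names; bit i of an integer holds the coefficient
of x^i), reproducing Example 4.33:

# Syndrome of a received word for a cyclic code (illustrative sketch).

def gf2_mod(dividend, divisor):
    """Remainder of GF(2) polynomial division (bit i = coefficient of x^i)."""
    dl, sl = dividend.bit_length(), divisor.bit_length()
    while dl >= sl and dividend:
        dividend ^= divisor << (dl - sl)
        dl = dividend.bit_length()
    return dividend

G = 0b1011                                    # G(x) = 1 + x + x^3
received = [0, 0, 1, 1, 1, 1, 0]              # r̂^T of Example 4.33 (entry i = r_i)
R = sum(bit << i for i, bit in enumerate(received))
S = gf2_mod(R, G)
print(bin(S))                                 # 0b110 -> S(x) = x + x^2, i.e. ŝ^T = [011]
print("error detected" if S else "no error")  # an error is detected since S(x) != 0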
NOTE
1. The generator polynomial, G(x), can be shown to be related to the generator matrix, G, and the
parity-check polynomial, H(x), can be shown to be related to the parity-check matrix, H.
2. The Hamming codes are cyclic codes! Specifically a (7,4) Hamming code is designed with
G(x) = 1 + x + x^3. Since G(x)H(x) = x^7 + 1, the corresponding H(x) = 1 + x + x^2 + x^4 can be used to
construct the (7,4) Hamming code parity-check matrix.
Exercise: Verify that the codewords generated with G(x) as defined above indeed produce a (7,4)
Hamming code.
4.15.4 Deriving the Generator matrix and Parity-Check matrix from G(x)
As stated previously we can design the cyclic code codewords by C(x) = M(x)G(x). Although this
does not give us a systematic code it will allow direct derivation of the generator matrix, G. For
compatibility with the G and H matrices discussed in Section 4.7 we need to “undo” the bit reversal
implied by the cyclic code polynomial notation. For example, consider the code polynomial C(x)
coefficient vector:
ĉ = [c_0, c_1, …, c_{n-1}]^T   ⇒   c = [c_{n-1}, c_{n-2}, …, c_0]^T
We want G such that:
c = Gm   ⇒   [c_{n-1}, c_{n-2}, …, c_1, c_0]^T = G [m_{k-1}, m_{k-2}, …, m_1, m_0]^T
Now we have:
C(x) = M(x)G(x)
     = (m_0 + m_1 x + m_2 x^2 + … + m_{k-1} x^{k-1}) G(x)
     = m_0 G(x) + m_1 x G(x) + m_2 x^2 G(x) + … + m_{k-1} x^{k-1} G(x)
or, collecting the coefficients of each power of x (highest power in the top row), each column of the
n x k matrix G is the reversed coefficient vector of G(x) shifted down by one place per column:

  | c_{n-1} |   | g_{n-k}    0          …  0          0        |   | m_{k-1} |
  | c_{n-2} |   | g_{n-k-1}  g_{n-k}    …  0          0        |   | m_{k-2} |
  |    :    |   |    :       g_{n-k-1}  …  :          :        |   |    :    |
  |    :    | = | g_0        :          …  g_{n-k}    0        |   |    :    |
  |    :    |   | 0          g_0        …  g_{n-k-1}  g_{n-k}  |   |    :    |
  | c_1     |   | 0          0          …  g_0        g_1      |   | m_1     |
  | c_0     |   | 0          0          …  0          g_0      |   | m_0     |
                 \____________________________________________/
                                       G
And this gives G! Now by definition HG = 0 which means row operations on H correspond to
column operations on G. So to convert G to systematic form we perform column operations on G.
Once G is systematic we obtain the systematic form for H trivially. In summary:
1. Take the n-element column vector formed from the n-k+1 reversed coefficients of G(x), zero-padded
to length n, and form G as the concatenation of the k successive downward shifts of this column vector
as shown above (a small code sketch of this step follows the list).
2. Perform column operations to convert G to systematic form.
3. Derive the systematic form of H directly from the systematic form of G.
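A minimal Python sketch of step 1 (our own function name; it does not carry out the column reduction of
steps 2 and 3), checked against Example 4.34 below:

# Build the non-systematic generator matrix G from the coefficients of G(x) (illustrative sketch).

def generator_matrix(g_coeffs, n, k):
    """g_coeffs = [g0, g1, ..., g_{n-k}]. Returns G as n rows of k entries, where column j
    (j = 0 .. k-1, left to right) holds the reversed coefficients of G(x) shifted down j places."""
    reversed_g = g_coeffs[::-1]                     # [g_{n-k}, ..., g_1, g_0]
    G = [[0] * k for _ in range(n)]
    for j in range(k):
        for i, coeff in enumerate(reversed_g):
            G[i + j][j] = coeff
    return G

for row in generator_matrix([1, 1, 0, 1], 7, 4):    # G(x) = 1 + x + x^3
    print(row)
# Prints the rows [1,0,0,0], [0,1,0,0], [1,0,1,0], [1,1,0,1], [0,1,1,0], [0,0,1,1], [0,0,0,1],
# i.e. the matrix G formed in Example 4.34 below.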
Example 4.34
Find G and hence H corresponding to G(x) = 1 + x + x^3 and n=7, and verify that this is indeed
a (7,4) Hamming code.
Now G(x) = 1 + x + x^3  ⇒  ĝ^T = [1101]  ⇒  g = [g_3, g_2, g_1, g_0]^T = [1, 0, 1, 1]^T
With n=7 and k=4 we form:

      | 1 0 0 0 |
      | 0 1 0 0 |
      | 1 0 1 0 |
  G = | 1 1 0 1 |
      | 0 1 1 0 |
      | 0 0 1 1 |
      | 0 0 0 1 |

and we need the systematic form G_sys, with the k x k identity I_k on top and P below.
By:
• adding columns 3 and 4 to column 1
• adding column 4 to column 2
we get:
          | 1 0 0 0 |
          | 0 1 0 0 |
          | 0 0 1 0 |                           | 1 1 1 0 1 0 0 |
  G_sys = | 0 0 0 1 |   ⇒   H = [P | I_{n-k}] = | 0 1 1 1 0 1 0 |
          | 1 1 1 0 |                           | 1 1 0 1 0 0 1 |
          | 0 1 1 1 |
          | 1 1 0 1 |

And H is the parity-check matrix of a (7,4) Hamming code.
We can now use H to perform syndrome decoding. Consider ĉ^T = [0011010], r̂^T = [0011110] and
ŝ^T = [011] from Example 4.33. Now s = [1 1 0]^T matches the 3rd column of H, hence the 3rd bit of
r^T = [0111100] is corrected to form c^T = [0101100] ⇒ ĉ^T = [0011010].
The generator polynomial coefficients, g_i, can be used to design a linear shift register circuit for
performing the division operation of x^{n-k} M(x) by G(x). One form of the register circuit is:
[Figure 4.1: shift-register divider circuit, with DECODE IN, ENCODE IN, CONTROL and CLEAR inputs,
feedback taps g_1, g_2, …, g_{n-k-1}, register stages b_0, b_1, b_2, …, b_{n-k-1}, and ENCODE OUT]
Encoding is accomplished by simply clearing the registers, setting CONTROL = 1 and DECODE
IN = 0 and shifting the message word through the shift register via the ENCODE IN. After all
message bits have been shifted into the register, the check bits (i.e. remainder B(x)) can then be
found in the shift register contents b_0 b_1 … b_{n-k-1} and these can be appended to the message bits by
setting CONTROL = 0 and shifting the contents out of the register via ENCODE OUT.
Decoding is accomplished by clearing the shift register, setting CONTROL = 1 and ENCODE IN =
0 and shifting the received word through the register via DECODE IN. The final contents of the
shift register will only be zero if there has been no error (i.e. the received word is a code word),
otherwise it contains the syndrome bits which can be used to perform error correction by syndrome
decoding.
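The behaviour of this register can also be simulated in software. The Python sketch below is a behavioural
model only (our own function names, assuming the standard feedback shift-register form of the divider rather
than a gate-level copy of the figure): shifting the message in via the ENCODE IN path is equivalent to
computing B(x) = x^{n-k} M(x) mod G(x), and shifting a whole received word through the same path leaves a
zero register only when the word is a codeword (the observation used later for CRC decoding).

# Behavioural model of the shift-register divider (illustrative sketch).

def shift_register_divide(bits_high_to_low, g_coeffs):
    """Shift the given bits (highest-degree coefficient first) through an (n-k)-stage register
    with feedback taps g_coeffs = [g0, g1, ..., g_{n-k}]. The final contents b0..b_{n-k-1}
    equal x^(n-k) * P(x) mod G(x) for the input polynomial P(x)."""
    nk = len(g_coeffs) - 1                   # number of register stages, n-k
    b = [0] * nk                             # register b0 .. b_{n-k-1}
    for bit in bits_high_to_low:
        fb = bit ^ b[nk - 1]                 # feedback from the last stage
        b = [fb] + [b[i - 1] ^ (g_coeffs[i] & fb) for i in range(1, nk)]
    return b

g = [1, 1, 0, 1]                             # G(x) = 1 + x + x^3
message = [1, 0, 1, 0]                       # m0..m3 for the message 1010
print(shift_register_divide(message[::-1], g))    # [0, 0, 1] -> B(x) = x^2, check bits 001
codeword = [0, 0, 1, 1, 0, 1, 0]             # c0..c6 = 0011010
print(shift_register_divide(codeword[::-1], g))   # [0, 0, 0] -> a codeword leaves the register clear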
Example 4.35
For G(x) = 1 + x + x^3 we have n-k-1 = 2 and g_1 = 1, g_2 = 0. Hence the coding circuit is:
[Figure: the circuit of Figure 4.1 with three register stages b_0, b_1, b_2 and feedback taps g_1 = 1,
g_2 = 0, annotated with the codeword 0011010, whose bits come partly from ENCODE OUT (the check bits)
and partly from ENCODE IN (the message bits)]
And the result is confirmed from Example 4.32
NOTE:
1. The cyclic code circuitry is in fact simpler than the more cumbersome combinational logic
required for check-bit generation and syndrome generation of general linear codes, especially
when data has to be processed serially.
2. An encoder only circuit will have neither a DECODE IN nor top left XOR. Similarly, a decoder
only circuit will have neither the ENCODE IN, CONTROL, top AND gate, nor top right XOR.
The CRC codes are codes specifically designed for error detection. There are several reasons why
CRC codes are the best codes for any form of practical error detection:
1. The encoder and decoder implementation is fast and practical, especially for communication data
which is processed in serial fashion.
2. The CRC codes handle many combinations of errors, especially combinations of burst errors, as
well as random errors.
Since CRC codes only detect errors there is no syndrome decoding as such. The syndrome is
calculated and checked to see if it is zero or not (the true non-zero value of the syndrome is not
important). Thus the ENCODE IN in Figure 4.1 can be used rather than DECODE IN for decoding.
The decoder circuit is identical to the encoder circuit
[Figure: the register circuit reused for CRC decoding, with a single ENCODE/DECODE IN, CONTROL and
CLEAR inputs, feedback taps g_1, g_2, …, g_{n-k-1}, register stages b_0, b_1, b_2, …, b_{n-k-1},
and ENCODE OUT]
Why does this work? Using ENCODE IN means we are dividing x^{n-k} R(x), rather than R(x),
by G(x). In both cases if R(x) = C(x), the remainder will be zero.
A complete encoder/decoder CRC circuit can be designed using simple combinational logic as
shown in Figure 4.3.
[Figure 4.3: complete CRC encoder/decoder circuit, with SERIAL IN and SERIAL OUT, an ENABLE
FEEDBACK control, CLEAR, register stages b_0, b_1, b_2, …, b_{n-k-1}, and an ERROR FLAG output
formed by OR-ing the register contents]
CRC Encoding
1. CLEAR register.
2. Set ENABLE FEEDBACK = 1 and shift the k message bits via SERIAL IN
(same bits appear at SERIAL OUT as the message part of the code word).
3. Set ENABLE FEEDBACK = 0 and shift the n-k register contents
via SERIAL OUT (check bits immediately follow to complete the code word).
CRC Decoding
1. CLEAR register.
2. Set ENABLE FEEDBACK = 1 and shift the n received word bits via SERIAL IN.
3. If ERROR FLAG = 1 an error has been detected.
We have:
R(x) = C(x) + E(x)
So:
R(x) mod G(x) = C(x) mod G(x) + E(x) mod G(x) = 0 + E(x) mod G(x)
Thus errors will go undetected if E(x) mod G(x) = 0. That is, if E(x) is a multiple of G(x) the errors
are undetected. So we choose G(x) such that its multiples look as little as possible like the kind of
noise (in terms of the error patterns) we expect. That is, we want G(x) not to be a factor of E(x) if
we are to detect the types of errors E(x) represents.
NOTE: Since the CRC logic divides x^{n-k} R(x) by G(x), errors will go undetected if x^{n-k} E(x) mod G(x)
= 0, and the same observation still applies: if E(x) is a multiple of G(x) the errors are undetected.
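This failure mode is easy to demonstrate: add an error pattern that is itself a multiple of G(x) to a
codeword and the remainder stays zero. A short Python sketch (our own names, reusing the integer
representation of GF(2) polynomials from the earlier sketches):

# Undetected errors occur exactly when E(x) is a multiple of G(x) (illustrative sketch).

def gf2_mod(dividend, divisor):
    """Remainder of GF(2) polynomial division (bit i = coefficient of x^i)."""
    dl, sl = dividend.bit_length(), divisor.bit_length()
    while dl >= sl and dividend:
        dividend ^= divisor << (dl - sl)
        dl = dividend.bit_length()
    return dividend

G = 0b1011                           # G(x) = 1 + x + x^3
C = 0b101100                         # C(x) = x^2 + x^3 + x^5, the codeword 0011010
E_detected = 1 << 4                  # E(x) = x^4, a single-bit error (not a multiple of G(x))
E_undetected = G << 1                # E(x) = x G(x) = x + x^2 + x^4, a multiple of G(x)

print(gf2_mod(C, G))                 # 0: a codeword leaves no remainder
print(gf2_mod(C ^ E_detected, G))    # 6 (non-zero): the single-bit error is detected
print(gf2_mod(C ^ E_undetected, G))  # 0: this three-bit error pattern goes undetected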
Think! Textbooks on CRC codes fail to mention that G(x) must be a factor of x^n + 1. In fact the
implication seems to be that n can take on any value, and thus CRC codes are not strictly cyclic
codes.
The BCH codes are one of the most important and powerful classes of linear block codes. The most common
binary BCH codes, known as primitive BCH codes, are characterised as follows:
BCH Codes:
Block length: n = 2^m - 1 for m ≥ 3
Message length: k ≥ n - mt
Min. distance: d(K) ≥ 2t + 1
Error: t-bit correction where t < (2^m - 1)/2
Thus BCH codes are t-error correcting codes. The Hamming single-error-correcting codes can be
described as BCH codes with t = 1.
The Reed-Solomon codes, also called RS codes, are an important class of non-binary BCH codes.
The encoding process maps k symbols to n symbols where each symbol is an m-bit word. Obviously
m=8 is a popular choice since this means the code operates on byte word blocks. The RS codes are
characterised as follows:
RS Codes:
Block length: n = 2^m - 1 symbols
Message length: k symbols
Parity-check length: n-k = 2t symbols
Min. distance: d(K) ≥ 2t + 1 symbols
Error: t-symbol correction
In block coding, a k-bit word is mapped to an n-bit word and processed on a block-by-block basis.
This requires the encoder to shift in a k-bit block and then wait at least (n-k) cycles for the complete
n-bit code block to be generated before the next k-bit block can be shifted in. In most applications
and in data communications in particular, the data message bits arrive continuously. Block coding
would thus require some form of buffering of data blocks, and a coding scheme that codes the data
“on-the-fly” in a continuous fashion would be preferable. Such codes are called convolutional codes.
Properties
• Coding process operates on data continuously rather than waiting for or buffering a block of data
• Non-systematic coding schemes are used, which makes decoding more complicated