Information Theory and Coding NOTES
MODULE I
INTRODUCTION
The purpose of a communication system is to carry information from one place to another over a communication channel. The root of the word information is "inform": information gives structure to creative and interpretative processes; it is an organization of knowledge, facts, ideas, etc. that can be communicated and understood.
Information may be words, numbers, images, audio, video, feelings, etc. The form of information depends on various factors such as (1) what the information is, (2) how it is to be communicated and (3) to whom it is to be communicated. On this basis, information theory deals with the mathematical modelling and analysis of the communication channel.
CONCEPT OF INFORMATION
Consider a source that produces two messages, A and B. Let P(A) represent the probability of occurrence of message A and P(B) the probability of occurrence of message B. Then the amount of information satisfies the following conditions:
i. The information is a function of the probability of occurrence: I(A) = Φ{P(A)}, I(B) = Φ{P(B)}.
ii. The amount of information is non-negative: I(A) ≥ 0.
iii. If we are sure about an event, the amount of information will be zero (no information is conveyed):
   lim I(A) = 0 as P(A) → 1
iv. The less probable a message, the more information it carries: if P(A) < P(B), then I(A) > I(B).
v. If a set of messages originating from the same source are independent, the information is additive:
   Φ{P(A) P(B)} = Φ{P(A)} + Φ{P(B)}
The only function that satisfies these five conditions is the logarithmic function, so we adopt the logarithm as the measure of the amount of information.
Consider a discrete source which produces a set of messages m1, m2, ..., mN with probabilities of occurrence p1, p2, ..., pN, where p1 + p2 + ... + pN = 1.
According to these conditions, the information I_k carried by message m_k must satisfy
1. I_k >= 0 for 0 <= P_k <= 1, and
2. I_k -> 0 as P_k -> 1,
and it is measured as
   I(m_k) = log_b (1/P_k)          (1)
The standard convention of the amount of information will take the base equal to 2. i.e.,
b=2
UNITS OF INFORMATION
With b = 2 the unit of information is the bit, with b = e it is the nat, and with b = 10 it is the decit (Hartley).
PROBLEMS
1. A source produces one of four possible messages during each interval with probabilities P1 = 1/2, P2 = 1/4, P3 = P4 = 1/8. Obtain the information content of each message.
Ans:
   I_K = log2 (1/P_K)
   I_1 = log2 (1/P_1) = log2 2 = 1 bit
   I_2 = log2 (1/P_2) = log2 4 = 2 bits
   I_3 = I_4 = log2 (1/P_3) = log2 8 = 3 bits
2. If there are ‘M’ equally likely independent messages, then prove that the information
carried by each message I = N bits, where M=2N; and N is an integer.
Ans: P_K = 1/M
   I_K = log2 (1/P_K) = log2 M = log2 2^N = N bits
3. Prove that if the receiver knows the message being transmitted, the amount of
information carried will be zero.
Ans: If the receiver knows the message then only one message is transmitted. So the
probability of occurrence will be 1. i.e.
   P_K = 1
   I_K = log2 (1/P_K) = log2 1 = 0 bits
4. If I(m1) is the information carried by the message m1 and I(m2) is the information carried by the message m2, prove that the amount of information carried due to m1 & m2 together is I(m1, m2) = I(m1) + I(m2).
Ans: For independent messages, P(m1, m2) = P(m1) P(m2). Therefore
   I(m1, m2) = log2 [1/(P(m1) P(m2))] = log2 (1/P(m1)) + log2 (1/P(m2)) = I(m1) + I(m2)
5. A message occurs with a probability of 0.8. Determine the information associated with
the message in bits, nats and decits.
Ans: P_K = 0.8
i. bits:    I_K = log2 (1/P_K) = log2 (1/0.8) = 0.322 bits
ii. nats:   I_K = ln (1/P_K) = ln (1/0.8) = 0.223 nats
iii. decits: I_K = log10 (1/P_K) = log10 (1/0.8) = 0.0969 decits
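The same conversion between units can be checked with a short script. The following is a minimal sketch (the function name is illustrative, not from the notes); it simply evaluates I = log_b(1/P) in the three bases used in Problem 5.

    # Minimal sketch: information content of a message with probability P
    # in bits (base 2), nats (base e) and decits (base 10).
    import math

    def information(p):
        return {"bits":   math.log2(1.0 / p),
                "nats":   math.log(1.0 / p),
                "decits": math.log10(1.0 / p)}

    print(information(0.8))   # ~0.322 bits, ~0.223 nats, ~0.097 decits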
ENTROPY
Entropy is the measure of average information, or average uncertainty, per source symbol for the set of messages transmitted by a discrete memoryless (zero-memory) source. A discrete memoryless source is one in which each symbol is emitted independently of all previously emitted symbols; that is, the symbols are picked at random.
Let a discrete information source emit M possible messages m1, m2, ..., mM with probabilities of occurrence P1, P2, ..., PM.
Assume that a long sequence of L messages has been generated. For L very large, we expect the sequence to contain P1·L occurrences of m1, P2·L occurrences of m2, and so on. The total information due to message m1 is
   I_1(total) = P1 L log2 (1/P1)          (A)
and similarly for the other messages. Hence
   H = Total information / Number of messages = I_total / L
     = P1 log2 (1/P1) + P2 log2 (1/P2) + ... + PM log2 (1/PM)
   Entropy, H = Σ_{k=1}^{M} P_k log2 (1/P_k)          (2)
PROPERTIES OF ENTROPY
1. When P_k = 1, H = 1 × log 1 = 1 × 0 = 0.
2. When P_k = 0, H = 0 × log ∞ = 0 (taking the limit).
That is, for an extremely likely and for an extremely unlikely message, the entropy is zero.
3. Entropy is maximum when all the messages are equally likely; then H = log2 M.
So we can conclude that 0 ≤ H ≤ log2 M.
PROOF
Consider the case of two messages with probabilities P and 1 − P. Then
   H = P log (1/P) + (1 − P) log (1/(1 − P)) = −P log P − (1 − P) log (1 − P)
For H to be maximum, dH/dP = 0.
   dH/dP = −log P − P·(1/P) + log (1 − P) + (1 − P)·(1/(1 − P))
         = −log P − 1 + log (1 − P) + 1
         = log (1 − P) − log P
Setting dH/dP = 0 for H_max:
   log (1 − P) = log P
   1 − P = P
   2P = 1
   P = 1/2
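This maximum can also be confirmed numerically. The sketch below (not part of the notes) evaluates the binary entropy over a fine grid and reports where it peaks.

    # Numerical check that H(P) = -P log2 P - (1-P) log2 (1-P) peaks at P = 1/2.
    import math

    def binary_entropy(p):
        if p in (0.0, 1.0):          # limits: 0*log(0) is taken as 0
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    best = max((binary_entropy(k / 1000.0), k / 1000.0) for k in range(1001))
    print(best)   # (1.0, 0.5): maximum entropy of 1 bit at P = 0.5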
PROBLEMS
6. A source emits six messages with probabilities 1/2, 1/4, 1/8, 1/16, 1/32 and 1/32. Find the entropy of the source.
Ans:
   H = Σ_{k=1}^{6} P_K log2 (1/P_K)
     = (1/2) log2 2 + (1/4) log2 4 + (1/8) log2 8 + (1/16) log2 16 + (1/32) log2 32 + (1/32) log2 32
     = 1/2 + 2/4 + 3/8 + 4/16 + 5/32 + 5/32 = 31/16
     = 1.93 bits/symbol
7. Let ‘x’ represents the outcome of a single roll of a fair die. Find the entropy.
Ans:
   P_K = 1/6
   H = Σ_{k=1}^{6} P_K log2 (1/P_K) = 6 × (1/6) log2 6 = 2.58 bits/symbol
8. A source has 5 outputs denoted as m1, m2 , m3 , m4 & m5. Their probabilities are 0.3,
0.25, 0.25, 0.15 & 0.05 respectively. Determine the maximum entropy and the normal
entropy of the source.
Ans:
i. For maximum entropy the messages are equiprobable. Since there are 5 messages, P_K = 1/5.
   H_max = Σ_{k=1}^{5} P_K log2 (1/P_K) = 5 × (1/5) log2 5 = 2.32 bits/symbol
ii. Normal entropy of the source:
   H = Σ_{k=1}^{5} P_K log2 (1/P_K)
     = 0.3 log2 (1/0.3) + 0.25 log2 (1/0.25) + 0.25 log2 (1/0.25) + 0.15 log2 (1/0.15) + 0.05 log2 (1/0.05)
     = 0.3 × 1.737 + 0.25 × 2 + 0.25 × 2 + 0.15 × 2.737 + 0.05 × 4.32
     = 2.1476 bits/symbol
Ans:
i. For RRSTU the entropy is
   H = (1/3) log2 3 + (1/3) log2 3 + (1/3) log2 3 + (1/6) log2 6 + (1/9) log2 9
     = 0.528 + 0.528 + 0.528 + 0.431 + 0.352 = 2.367 bits/symbol
10. A source produces 8 symbols with equal probability. Find the entropy of the sources.
Also determine the entropy if one of the symbols occurs with probability 1/2 while the others occur with equal probability.
Ans:
i. For equal probabilities, P_K = 1/8.
   H = Σ_{k=1}^{8} P_K log2 (1/P_K) = 8 × (1/8) log2 8 = 3 bits/symbol
ii. Since the probability of m1 is 1/2, the other seven messages together have probability 1/2. As they are equally likely, each has probability 1/14.
   H = Σ_{k=1}^{8} P_K log2 (1/P_K) = (1/2) log2 2 + 7 × (1/14) log2 14 = 0.5 + 1.9 = 2.4 bits/symbol
Due to malfunctioning, P(X1) = 0.8 & P(X2) = 0.2.
   H(X) = Σ_k P_K log2 (1/P_K) = 0.8 log2 (1/0.8) + 0.2 log2 (1/0.2) = 0.258 + 0.464 = 0.722 bits/symbol
That is, the entropy is reduced to 72.2 % (compared with 1 bit/symbol for equally likely symbols).
JOINT ENTROPY
Joint entropy is the average information per pairs of transmitted and received symbols in
a communication system. It is denoted as H(X,Y). It is also called system entropy. It is the
entropy of a joint event.
Consider two sources S1 & S2 delivering symbols xi & yi and there may be some
dependence between x‟s and y‟s. Here we can say the joint entropy of sources S1 & S2 is
H(X,Y).
Now take the case of a communication system. We can say H(X,Y) as the average
uncertainty of the whole communication system.
Let the signals transmitted by the source be x1, x2… xi and the signals received by the
receiver be y1 , y2… yj. Let there be N symbols in both the transmitter and the receiver. The
probability of the transmitted signals be P(x1), P(x2) … P(xi) and the probability of the received
signals be P(y1), P(y2), ………… P(yj).
The entropy of the receiver is
   H(Y) = Σ_{j=1}^{N} P(y_j) log2 (1/P(y_j)) = −Σ_{j=1}^{N} P(y_j) log2 P(y_j)
and the joint entropy of the system is
   H(X, Y) = −Σ_{i=1}^{N} Σ_{j=1}^{N} P(x_i, y_j) log2 P(x_i, y_j)          (1)
We have
   P(AB) = P(A) P(B|A)          (2)
or
   P(AB) = P(B) P(A|B)          (3)
Applying (2) in (1), we get
   H(X, Y) = −Σ_i Σ_j P(x_i) P(y_j|x_i) log2 [P(x_i) P(y_j|x_i)]
           = −Σ_i Σ_j P(x_i) P(y_j|x_i) log2 P(x_i) − Σ_i Σ_j P(x_i) P(y_j|x_i) log2 P(y_j|x_i)
           = H(X) + H(Y|X)
Similarly, by applying (3) in (1), we get
   H(X, Y) = H(Y) + H(X|Y)
Thus
   H(X, Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)
Note:
On expanding −Σ_i Σ_j P(x_i) P(y_j|x_i) log2 P(x_i), we get
   −Σ_i [ P(x_i) P(y_1|x_i) log2 P(x_i) + P(x_i) P(y_2|x_i) log2 P(x_i) + ... + P(x_i) P(y_N|x_i) log2 P(x_i) ]
   = −Σ_i P(x_i) log2 P(x_i) [ P(y_1|x_i) + P(y_2|x_i) + ... + P(y_N|x_i) ]
Since Σ_j P(y_j|x_i) = 1, this equals
   −Σ_{i=1}^{N} P(x_i) log2 P(x_i) = H(X)
Likewise, the remaining term −Σ_i Σ_j P(x_i, y_j) log2 P(y_j|x_i) = H(Y|X).
CONDITIONAL ENTROPY
Conditional entropy means the entropy of one subsystem when the state of another subsystem is known, the two subsystems being statistically correlated. In a communication system, H(Y|X) is the entropy of the received symbols when the transmitted symbol is known, and H(X|Y) is the amount of uncertainty remaining about the channel input after the channel output has been observed.
The conditional entropy of X given Y = y_j follows from the basic entropy equation
   H = Σ_{k=1}^{M} P_k log2 (1/P_k)
Thus
   H(X|Y = y_j) = Σ_{i=1}^{M} P(x_i|y_j) log2 (1/P(x_i|y_j))
This quantity is a random variable that takes on the values H(X|Y = y_0), H(X|Y = y_1), ... with probabilities P(y_0), P(y_1), ... respectively. Thus the mean conditional entropy is
   H(X|Y) = Σ_j H(X|Y = y_j) P(y_j)
          = Σ_i Σ_j P(x_i|y_j) P(y_j) log2 (1/P(x_i|y_j))
          = Σ_i Σ_j P(x_i, y_j) log2 (1/P(x_i|y_j))
That is,
   H(X|Y) = −Σ_i Σ_j P(x_i, y_j) log2 P(x_i|y_j)
Similarly we get
   H(Y|X) = −Σ_i Σ_j P(x_i, y_j) log2 P(y_j|x_i)
MARGINAL ENTROPY
The marginal (source) entropy H(X) is the entropy computed from the marginal probabilities P(x_i) of the transmitted symbols.
RECEIVER ENTROPY
The receiver entropy H(Y) is the entropy computed from the probabilities P(y_j) of the received symbols.
Note:
i. Equivocation
Equivocation, H(X|Y), means the loss of information due to the channel.
PROBLEMS
12. Consider a source with alphabet x1, x2, x3 & x4 with probabilities P(X) = {1/2, 1/4, 1/8, 1/8}. Find the entropy of the source and of its second-order extension.
   H(X) = Σ P(X) log2 (1/P(X)) = (1/2) log2 2 + (1/4) log2 4 + (1/8) log2 8 + (1/8) log2 8
        = 1/2 + 1/2 + 3/8 + 3/8 = 1.75 bits/symbol
For the second-order extension, the 16 symbol pairs have probabilities equal to the products of the symbol probabilities:
   H(X²) = Σ P(Y) log2 (1/P(Y))
         = (1/4) log2 4 + (1/8) log2 8 + (1/16) log2 16 + (1/16) log2 16
         + (1/8) log2 8 + (1/16) log2 16 + (1/32) log2 32 + (1/32) log2 32
         + (1/16) log2 16 + (1/32) log2 32 + (1/64) log2 64 + (1/64) log2 64
         + (1/16) log2 16 + (1/32) log2 32 + (1/64) log2 64 + (1/64) log2 64
         = 3.5 bits/symbol = 2 H(X)
13. Consider a source with messages 0 & 1 with probabilities P(0) = 0.25 & P(1) = 0.75. Find the entropy of the third-order extension of the source.
   H(X) = Σ P(X) log2 (1/P(X)) = (1/4) log2 4 + (3/4) log2 (4/3) = 0.5 + 0.311 = 0.811 bits/symbol
For the third-order extension, the 8 blocks have probabilities 1/64, 3/64 (three blocks), 9/64 (three blocks) and 27/64:
   H(X³) = (1/64) log2 64 + 3 × (3/64) log2 (64/3) + 3 × (9/64) log2 (64/9) + (27/64) log2 (64/27)
         = 0.094 + 3 × 0.207 + 3 × 0.398 + 0.525
         = 2.434 bits/symbol = 3 H(X)
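The relation H(S^n) = n H(S) for a memoryless source can be verified with a short script. The sketch below (helper names are illustrative, not from the notes) enumerates the blocks of the n-th extension of the source of Problem 13.

    # Check that H(S^n) = n*H(S) for the memoryless source P(0)=0.25, P(1)=0.75.
    import math
    from itertools import product

    def entropy(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    base = [0.25, 0.75]
    for n in (1, 2, 3):
        # probability of each n-symbol block is the product of symbol probabilities
        ext = [math.prod(block) for block in product(base, repeat=n)]
        print(n, round(entropy(ext), 4), round(n * entropy(base), 4))
    # prints 0.8113, 1.6226, 2.4338 -- equal to n*H(S) in each case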
14. Consider that two sources emit messages x1, x2, x3 and y1 , y2, y3 with joint probability
P(x,y) as shown in the matrix form. Calculate H X , H Y , H X Y and H Y X ?
P(X, Y) =
              y1       y2       y3
     x1      3/40     1/40     1/40
     x2      1/20     3/20     1/20
     x3      1/8      1/8      3/8
Ans:
i. To find H(X)
   P(X1) = 3/40 + 1/40 + 1/40 = 5/40 = 1/8
   P(X2) = 1/20 + 3/20 + 1/20 = 5/20 = 1/4
   P(X3) = 1/8 + 1/8 + 3/8 = 5/8
   H(X) = Σ P(X) log2 (1/P(X)) = (1/8) log2 8 + (1/4) log2 4 + (5/8) log2 (8/5)
        = (1/8) × 3 + (1/4) × 2 + (5/8) × 0.678 = 1.3 bits/symbol
ii. To find H(Y)
   P(Y1) = 3/40 + 1/20 + 1/8 = 0.25
   P(Y2) = 1/40 + 3/20 + 1/8 = 0.3
   P(Y3) = 1/40 + 1/20 + 3/8 = 0.45
   H(Y) = Σ P(Y) log2 (1/P(Y)) = 0.25 log2 (1/0.25) + 0.3 log2 (1/0.3) + 0.45 log2 (1/0.45)
        = 0.5 + 0.52 + 0.52 = 1.54 bits/symbol
iii. To find H(X|Y)
We have P(X|Y) = P(X, Y)/P(Y). Dividing each column of P(X, Y) by the corresponding P(Y):
P(X|Y) =
              y1       y2       y3
     x1      3/10     1/12     1/18
     x2      1/5      1/2      1/9
     x3      1/2      5/12     5/6
   H(X|Y) = Σ_i Σ_j P(x_i, y_j) log2 (1/P(x_i|y_j))
          = (3/40) log2 (10/3) + (1/40) log2 12 + (1/40) log2 18
          + (1/20) log2 5 + (3/20) log2 2 + (1/20) log2 9
          + (1/8) log2 2 + (1/8) log2 (12/5) + (3/8) log2 (6/5)
          = 0.13 + 0.09 + 0.10 + 0.12 + 0.15 + 0.16 + 0.13 + 0.16 + 0.10
          = 1.14 bits/symbol
iv. To find H(Y|X)
We have P(Y|X) = P(X, Y)/P(X). Dividing each row of P(X, Y) by the corresponding P(X):
P(Y|X) =
              y1       y2       y3
     x1      3/5      1/5      1/5
     x2      1/5      3/5      1/5
     x3      1/5      1/5      3/5
   H(Y|X) = Σ_i Σ_j P(x_i, y_j) log2 (1/P(y_j|x_i))
          = (3/40) log2 (5/3) + (1/40) log2 5 + (1/40) log2 5
          + (1/20) log2 5 + (3/20) log2 (5/3) + (1/20) log2 5
          + (1/8) log2 5 + (1/8) log2 5 + (3/8) log2 (5/3)
          = 0.06 + 0.06 + 0.06 + 0.12 + 0.11 + 0.12 + 0.29 + 0.29 + 0.28
          = 1.39 bits/symbol
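The hand-worked values of Problem 14 can be cross-checked numerically. The sketch below (not part of the notes) computes the marginal, joint and conditional entropies and the mutual information directly from the joint matrix.

    # Recompute Problem 14 from P(X,Y): H(X), H(Y), H(X|Y), H(Y|X) and I(X;Y).
    import math

    P = [[3/40, 1/40, 1/40],
         [1/20, 3/20, 1/20],
         [1/8,  1/8,  3/8 ]]

    PX = [sum(row) for row in P]                 # marginal of X (row sums)
    PY = [sum(col) for col in zip(*P)]           # marginal of Y (column sums)

    def H(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    HX, HY = H(PX), H(PY)
    HXY = -sum(p * math.log2(p) for row in P for p in row if p > 0)  # joint entropy
    H_X_given_Y = HXY - HY
    H_Y_given_X = HXY - HX
    print(round(HX, 3), round(HY, 3), round(H_X_given_Y, 3), round(H_Y_given_X, 3))
    # ~1.299, 1.539, 1.130, 1.371 -- close to the hand-worked 1.3, 1.54, 1.14, 1.39
    print(round(HX - H_X_given_Y, 3))            # mutual information I(X;Y) ~ 0.169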
15. Consider a channel with two inputs x1 , x2 and three outputs y1 , y2, y3 and the
transition matrix of the channel is given by
P(Y|X) =
              y1       y2       y3
     x1      3/4      1/4      0
     x2      0        1/2      1/2
Ans:
Taking the inputs to be equally likely, P(X1) = P(X2) = 1/2, and using P(X, Y) = P(X) P(Y|X):
   P(X1, Y1) = (1/2)(3/4) = 3/8;   P(X1, Y2) = (1/2)(1/4) = 1/8;   P(X1, Y3) = 0
   P(X2, Y1) = 0;   P(X2, Y2) = (1/2)(1/2) = 1/4;   P(X2, Y3) = (1/2)(1/2) = 1/4
P(X, Y) =
              y1       y2       y3
     x1      3/8      1/8      0
     x2      0        1/4      1/4
i. To find H(X)
   P(X1) = 3/8 + 1/8 = 1/2
   P(X2) = 1/4 + 1/4 = 1/2
   H(X) = Σ P(X) log2 (1/P(X)) = (1/2) log2 2 + (1/2) log2 2 = 1 bit/symbol
iii. To find H(X|Y)
We have P(X|Y) = P(X, Y)/P(Y), with P(Y1) = 3/8, P(Y2) = 3/8, P(Y3) = 1/4:
P(X|Y) =
              y1       y2       y3
     x1      1        1/3      0
     x2      0        2/3      1
   H(X|Y) = Σ_i Σ_j P(x_i, y_j) log2 (1/P(x_i|y_j))
          = (3/8) log2 1 + (1/8) log2 3 + (1/4) log2 (3/2) + (1/4) log2 1
          = (1/8)(1.585) + (1/4)(0.585) = 0.343 bits/symbol
v. To find H(Y|X)
   H(Y|X) = Σ_i Σ_j P(x_i, y_j) log2 (1/P(y_j|x_i))
          = (3/8) log2 (4/3) + (1/8) log2 4 + (1/4) log2 2 + (1/4) log2 2
          = 0.155 + 0.25 + 0.25 + 0.25 = 0.905 bits/symbol
16. Show that the marginal entropy is greater than or equal to the conditional entropy,
i.e. H X H X Y .
Ans: For this proof, consider the plots given below
The figure shows the plots of a straight line y = x – 1 and a logarithmic function y = ln x
on the same set of co-ordinate axis. Note that, any point on the straight line will always
be above the logarithmic function for any given value of x.
   ln x ≤ x − 1
Replacing x by 1/x,
   ln (1/x) ≤ (1/x) − 1, i.e. ln x ≥ 1 − (1/x)
Now take H(X) − H(X|Y):
   H(X) − H(X|Y) = −Σ_i P(x_i) ln P(x_i) + Σ_i Σ_j P(x_i, y_j) ln P(x_i|y_j)
                 = Σ_i Σ_j P(x_i, y_j) ln [ P(x_i|y_j) / P(x_i) ]
Using ln x ≥ 1 − 1/x with x = P(x_i|y_j)/P(x_i):
   H(X) − H(X|Y) ≥ Σ_i Σ_j P(x_i, y_j) [ 1 − P(x_i)/P(x_i|y_j) ]
                 = Σ_i Σ_j P(x_i, y_j) − Σ_i Σ_j P(x_i) P(y_j)
                 = 1 − 1 = 0
So, H(X) ≥ H(X|Y).
MUTUAL INFORMATION
The initial uncertainty about the transmitted symbol is H(X) and the final uncertainty after reception is H(X|Y). Then I(X, Y) is the mutual information, given by
   I(X, Y) = H(X) − H(X|Y)   bits/symbol
           = Σ_i Σ_j P(x_i, y_j) log2 (1/P(x_i)) − Σ_i Σ_j P(x_i, y_j) log2 (1/P(x_i|y_j))
           = Σ_i Σ_j P(x_i, y_j) [ log2 (1/P(x_i)) − log2 (1/P(x_i|y_j)) ]
           = Σ_i Σ_j P(x_i, y_j) log2 [ P(x_i|y_j) / P(x_i) ]
If we substitute P(X|Y) = P(X, Y)/P(Y),
   I(X, Y) = Σ_i Σ_j P(x_i, y_j) log2 [ P(x_i, y_j) / (P(x_i) P(y_j)) ]
If we substitute P(X, Y)/P(X) = P(Y|X),
   I(X, Y) = Σ_i Σ_j P(x_i, y_j) log2 [ P(y_j|x_i) / P(y_j) ]
           = Σ_i Σ_j P(x_i, y_j) log2 (1/P(y_j)) − Σ_i Σ_j P(x_i, y_j) log2 (1/P(y_j|x_i))
           = H(Y) − H(Y|X)
Therefore
   I(X, Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = I(Y, X)
PROBLEMS
17. Given
P(X, Y) =
              y1       y2       y3       y4
     x1      1/4      0        0        0
     x2      1/10     3/10     0        0
     x3      0        1/20     1/10     0
     x4      0        0        1/20     1/10
     x5      0        0        1/20     0
with probabilities of the source alphabet P(X) = {1/4, 2/5, 3/20, 3/20, 1/20}. Find the values of H(X), H(Y), H(X|Y), H(Y|X) and I(X, Y).
Ans:
i. To find H(X)
   P(X1) = 1/4;   P(X2) = 1/10 + 3/10 = 2/5;   P(X3) = 1/20 + 1/10 = 3/20;
   P(X4) = 1/20 + 1/10 = 3/20;   P(X5) = 1/20
   H(X) = Σ P(X) log2 (1/P(X))
        = (1/4) log2 4 + (2/5) log2 (5/2) + (3/20) log2 (20/3) + (3/20) log2 (20/3) + (1/20) log2 20
        = 0.5 + 0.528 + 0.410 + 0.410 + 0.216 = 2.06 bits/symbol
ii. To find H(Y)
   P(Y1) = 1/4 + 1/10 = 7/20;   P(Y2) = 3/10 + 1/20 = 7/20;
   P(Y3) = 1/10 + 1/20 + 1/20 = 1/5;   P(Y4) = 1/10
   H(Y) = Σ P(Y) log2 (1/P(Y)) = (7/20) log2 (20/7) + (7/20) log2 (20/7) + (1/5) log2 5 + (1/10) log2 10
        = 0.53 + 0.53 + 0.46 + 0.33 = 1.856 bits/symbol
iii. To find H(X|Y)
We have P(X|Y) = P(X, Y)/P(Y). Dividing each column of P(X, Y) by the corresponding P(Y):
P(X|Y) =
              y1       y2       y3       y4
     x1      5/7      0        0        0
     x2      2/7      6/7      0        0
     x3      0        1/7      1/2      0
     x4      0        0        1/4      1
     x5      0        0        1/4      0
   H(X|Y) = Σ_i Σ_j P(x_i, y_j) log2 (1/P(x_i|y_j))
          = (1/4) log2 (7/5) + (1/10) log2 (7/2) + (3/10) log2 (7/6) + (1/20) log2 7
          + (1/10) log2 2 + (1/20) log2 4 + (1/20) log2 4 + (1/10) log2 1
          = 0.807 bits/symbol
iv. To find H(Y|X)
We have P(Y|X) = P(X, Y)/P(X). Dividing each row of P(X, Y) by the corresponding P(X):
P(Y|X) =
              y1       y2       y3       y4
     x1      1        0        0        0
     x2      1/4      3/4      0        0
     x3      0        1/3      2/3      0
     x4      0        0        1/3      2/3
     x5      0        0        1        0
   H(Y|X) = Σ_i Σ_j P(x_i, y_j) log2 (1/P(y_j|x_i))
          = (1/4) log2 1 + (1/10) log2 4 + (3/10) log2 (4/3) + (1/20) log2 3
          + (1/10) log2 (3/2) + (1/20) log2 3 + (1/10) log2 (3/2) + (1/20) log2 1
          = 0 + 0.2 + 0.1245 + 0.079 + 0.058 + 0.079 + 0.058 + 0
          = 0.5995 bits/symbol
v. To find I(X,Y)
   I(X, Y) = H(X) − H(X|Y) = H(Y) − H(Y|X)
           = 2.06 − 0.807 = 1.253 bits/symbol
RATE OF INFORMATION
The rate of information is defined as the average amount of information transferred per second.
Given, the source emits „r‟ number of symbols per second and the average information
associated with the messages is „H‟ bits/symbols, then we can say R=rH bits/second.
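The definition R = rH translates directly into a one-line computation. The sketch below (function name is illustrative, not from the notes) reproduces the figures used in Problem 18.

    # Information rate R = r*H for a source emitting r symbols per second.
    import math

    def information_rate(symbol_rate, probs):
        H = -sum(p * math.log2(p) for p in probs if p > 0)   # bits/symbol
        return symbol_rate * H                                # bits/second

    print(information_rate(2000, [0.5, 0.25, 0.125, 0.125]))  # 3500.0 bits/s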
PROBLEMS
18. A source emits r = 2000 symbols per second selected from an alphabet of size m = 4 with the symbols A, B, C, D. Their probabilities are 1/2, 1/4, 1/8 and 1/8. Find the rate of information; also find the rate of information when the messages have equal probability.
   H = (1/2) log2 2 + (1/4) log2 4 + (1/8) log2 8 + (1/8) log2 8 = 1/2 + 1/2 + 3/8 + 3/8 = 1.75 bits/symbol
   R = rH = 2000 × 1.75 = 3500 bits/second
For equally likely messages, H = log2 4 = 2 bits/symbol and R = 2000 × 2 = 4000 bits/second.
19. An analog signal band-limited to 3 Hz is sampled at the Nyquist rate and the samples are quantized into 4 levels q1, q2, q3 & q4, which occur with probabilities P1 = P4 = 1/8 and P2 = P3 = 3/8. Find the information rate.
   H = Σ P log2 (1/P) = 2 × (1/8) log2 8 + 2 × (3/8) log2 (8/3) = 0.75 + 1.06 = 1.81 bits/symbol
   R = 2B × H = 2 × 3 × 1.81 = 10.86 bits/sec
20. Calculate the rate of information of a telegraph source having two symbols, dot & dash. The dot duration is 0.2 seconds. The dash is twice as long as the dot and half as probable.
Ans:
Let the probability of a dot be x; then the probability of a dash is x/2.
   x + x/2 = 1  ⟹  3x/2 = 1  ⟹  x = 2/3
   P(dot) = 2/3  &  P(dash) = 1/3
   H = Σ P log2 (1/P) = (2/3) log2 (3/2) + (1/3) log2 3 = 0.39 + 0.53 = 0.92 bits/symbol
The average symbol duration is
   T_avg = P(dot) × 0.2 + P(dash) × 0.4 = (2/3)(0.2) + (1/3)(0.4) = 0.267 s
so the symbol rate is r = 1/T_avg = 3.75 symbols/sec and
   R = rH = 3.75 × 0.92 = 3.45 bits/sec
SOURCE CODING
The primary objective of the source coding is to increase the efficiency of transmission of
intelligence over a channel and to reduce the transmission errors. Coding or encoding or
enciphering is a procedure for associating words constructed from a finite alphabet of a language
with given words of another language in a one- to- one manner.
Let the source be characterized by a set of symbols S = {s1, s2, ..., sQ}, called the source alphabet. Consider another set X = {x1, x2, ..., xr}, called the code alphabet. Coding is then a mapping of the source symbols (or sequences of source symbols) into sequences of symbols of X.
Any finite sequence of symbols from the alphabet X forms a code word. The total number of symbols contained in the code word is called the word length.
CODING EFFICIENCY
The coding efficiency is the ratio of the minimum possible average code length to the actual average code length, η = H(S)/(L log2 r), which for binary codes (r = 2) reduces to η = H(S)/L.
REDUNDANCY
Redundancy = 1 − η.
INSTANTANEOUS CODES
The necessary and sufficient condition for a code to be instantaneous is that no complete code word be a prefix of some other code word. When we use these codes, there is no time lag in the process of decoding.
Suppose we want to encode a 5-symbol source S = {S1, S2, S3, S4, S5} into a binary instantaneous code, so the code alphabet is X = {0, 1}.
If we give S1 the code 0, then no other code word may start with 0, so S2 must start with 1; since 1 alone would block all remaining code words, we give S2 the code 10, and similarly:
   S1 = 0;  S2 = 10;  S3 = 110;  S4 = 1110;  S5 = 1111
Here, when we start from a single 0, the length of the 5th code word is 4. But when we start from two-bit code words such as 00, the average code length can be reduced:
   S1 = 00;  S2 = 01;  S3 = 10;  S4 = 110;  S5 = 111
KRAFT’S INEQUALITY
The existence of the instantaneous codes can be checked by using Kraft‟s inequality. It
tells us whether the instantaneous codes will exist or not.
Given a source S = {S1, S2, ……., SQ}. Let the word length of the code be L = {l1, l2,
……., lQ} and let the code alphabet be X = {x1 , x2 , ……., xr}. Then the instantaneous codes exist
if,
   Σ_{k=1}^{Q} r^(−l_k) ≤ 1
Proof:
Let us assume that the word lengths have been arranged in ascending order, l1 ≤ l2 ≤ ... ≤ lQ.
Let n_k denote the number of messages encoded into code words of length k (in the example above, n1 = 0).
With a code alphabet of r symbols, at most r code words of length 1 are possible, so n1 ≤ r. The unused (r − n1) single-symbol prefixes can each be extended by one symbol, so n2 ≤ (r − n1) r = r² − n1 r. Continuing,
   n3 ≤ r³ − n1 r² − n2 r
and in general
   n_k ≤ r^k − n1 r^(k−1) − n2 r^(k−2) − ... − n_(k−1) r          (5)
Dividing (5) by r^k,
   n1 r^(−1) + n2 r^(−2) + ... + n_k r^(−k) ≤ 1          (7)
The left-hand side is simply the sum of r^(−l_k) over all the code words, so
   Σ_{k=1}^{Q} r^(−l_k) ≤ 1
PROBLEMS
21. Examine whether the following binary code satisfies Kraft's inequality and is instantaneous.
Ans:
Symbol Code Length
S1 00 2
S2 01 2
S3 10 2
S4 110 3
S5 1110 4
S6 1111 4
Here r = 2,
   Σ_{k=1}^{Q} r^(−l_k) = 2^(−2) + 2^(−2) + 2^(−2) + 2^(−3) + 2^(−4) + 2^(−4) = 1
Since Σ r^(−l_k) ≤ 1 and no code word is a prefix of another, the given code is instantaneous.
22. Examine whether the following binary code satisfies Kraft's inequality and is instantaneous.
Ans:
Symbol Code Length
S1 10 2
S2 110 3
S3 1110 4
S4 11110 5
S5 1111 4
S6 11111 5
Here r = 2,
   Σ_{k=1}^{Q} r^(−l_k) = 2^(−2) + 2^(−3) + 2^(−4) + 2^(−5) + 2^(−4) + 2^(−5) = 0.5625
Even though Σ r^(−l_k) ≤ 1, the given code is not instantaneous.
It can be easily understood from the table. That is, in this code there are codes 1111 &
11110. According to the instantaneous codes, no code word is a prefix of another code
word. So it is not instantaneous.
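Both checks, Kraft's inequality and the prefix condition, are easy to automate. The sketch below (not part of the notes) applies them to the two codes of the worked examples above.

    # Check Kraft's inequality and the prefix (instantaneous) condition.
    def kraft_sum(lengths, r=2):
        return sum(r ** (-l) for l in lengths)

    def is_prefix_free(codewords):
        return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

    code_a = ["00", "01", "10", "110", "1110", "1111"]
    code_b = ["10", "110", "1110", "11110", "1111", "11111"]
    for code in (code_a, code_b):
        print(kraft_sum([len(c) for c in code]), is_prefix_free(code))
    # code_a: 1.0, True (instantaneous); code_b: 0.5625, False (1111 is a prefix)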
NOISELESS CODING THEOREM
The noiseless coding theorem states that, for a code with an alphabet of r symbols and a source with an alphabet of q symbols, the average length L of the code words per source symbol may be made as close to H(S)/log r as desired by encoding extensions of the source rather than encoding each source symbol individually. That is,
   L ≥ H(S)/log r
Proof:
From Kraft's inequality,
   Σ_{k=1}^{Q} r^(−l_k) ≤ 1          (1)
We know that
   Σ_{k=1}^{Q} P_k = 1          (2)
Choose the word lengths l_k so that
   r^(−l_k) ≤ P_k          (4)
Taking logarithms,
   −l_k log r ≤ log P_k
   l_k log r ≥ log (1/P_k)
   l_k ≥ log (1/P_k) / log r          (5)
Choosing l_k as the smallest integer satisfying (5), equation (5) can be re-written as
   log (1/P_k) / log r ≤ l_k < 1 + log (1/P_k) / log r
Multiplying throughout by P_k and summing for all values of k,
   Σ_k P_k log (1/P_k) / log r ≤ Σ_k P_k l_k < Σ_k P_k log (1/P_k) / log r + Σ_k P_k
   H(S)/log r ≤ L < H(S)/log r + 1
To obtain better efficiency, take the nth extension of the source:
   H(S^n)/log r ≤ L_n < H(S^n)/log r + 1
Since H(S^n) = n H(S),
   H(S)/log r ≤ L_n / n < H(S)/log r + 1/n
and therefore
   lim_{n→∞} L_n / n = H(S)/log r
Here L_n / n is the average number of code alphabet symbols used per single symbol of S when the input to the encoder is an n-symbol message of the extended source S^n; L is the average word length for the source S, and in general L ≥ L_n / n. For successful transmission of the messages through a channel of capacity C we require H(S) ≤ (L_n / n) log r ≤ C bits/message.
The result
   H(S)/log r ≤ L_n / n < H(S)/log r + 1/n
is known as the noiseless coding theorem.
SHANNON-FANO CODING
PROCEDURE
1. Arrange the source symbols in decreasing order of their probabilities.
2. Partition the set into two subsets whose total probabilities are as nearly equal as possible.
3. Assign the bit 0 to one subset and the bit 1 to the other.
4. Repeat steps (2) & (3) on each subgroup until each subgroup contains only one source symbol.
PROBLEMS
23. A source emits 4 symbols with probabilities 0.4, 0.3, 0.2, 0.1. Obtain the Shannon-
Fano code? Find the average word length, entropy, efficiency and redundancy.
Ans:
One Shannon-Fano assignment is S1 = 0, S2 = 10, S3 = 110, S4 = 111, so that
   L = 0.4(1) + 0.3(2) + 0.2(3) + 0.1(3) = 1.9
   Entropy, H = −Σ_i P_i log2 P_i = −(0.4 log2 0.4 + 0.3 log2 0.3 + 0.2 log2 0.2 + 0.1 log2 0.1) = 1.846 bits/symbol
   Efficiency, η = H/L = 1.846/1.9 = 0.9718 = 97.18 %;  Redundancy = 1 − η = 2.82 %
24. A source emits 9 symbols with probabilities 0.49, 0.14, 0.14, 0.07, 0.07, 0.04, 0.02, 0.02,
0.01. Obtain the Shannon-Fano code? Find the average word length, entropy,
efficiency and redundancy.
Ans:
   Entropy, H = −Σ_i P_i log2 P_i = 2.314 bits/symbol
With the Shannon-Fano code obtained, L = 2.33.
   Efficiency, η = H/L = 2.314/2.33 = 0.9927 = 99.27 %;  Redundancy = 1 − η = 0.73 %
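The splitting procedure described above can be written as a short recursive routine. The sketch below (not from the notes; the split point is chosen to balance the two group probabilities) may produce different bit assignments from a hand-worked table, but the code lengths and efficiency agree for Problem 23.

    # Shannon-Fano coding by recursive balanced splitting.
    import math

    def shannon_fano(symbols):                 # symbols: list of (name, probability)
        if len(symbols) == 1:
            return {symbols[0][0]: ""}
        total, run, split, best = sum(p for _, p in symbols), 0.0, 1, float("inf")
        for i in range(1, len(symbols)):       # find the most balanced split point
            run += symbols[i - 1][1]
            if abs(total - 2 * run) < best:
                best, split = abs(total - 2 * run), i
        left = shannon_fano(symbols[:split])
        right = shannon_fano(symbols[split:])
        return {s: "0" + c for s, c in left.items()} | {s: "1" + c for s, c in right.items()}

    probs = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
    codes = shannon_fano(sorted(probs.items(), key=lambda kv: -kv[1]))
    L = sum(probs[s] * len(c) for s, c in codes.items())
    H = -sum(p * math.log2(p) for p in probs.values())
    print(codes, L, round(H / L, 4))           # L = 1.9, efficiency ~0.9718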
HUFFMAN CODING
PROCEDURE
1. Arrange the source symbols in decreasing order of their probabilities.
2. Check if q = r + α (r - 1) is satisfied & find the integer α (finite value). Otherwise add a
dummy symbol with zero probability of occurrence to satisfy the condition.
This step is not required if we use the binary coding method (r = 2). In binary coding we
always get α as an integer.
3. Combine the last „r‟ symbols into a single composite signal whose probability of
occurrence is equal to sum of probabilities of occurrence of the r-symbols involved in the
step.
4. Repeat steps 1 & 3 on the resulting set of symbols until in the final step exactly r
Symbols are left.
5. Assign codes freely to the last r- composite symbols and work backward to the source to
arrive at the optimum code.
PROBLEMS
25. Apply Huffman coding to the messages S0, S1 , S2, S3 & S4 with probabilities 0.4, 0.2,
0.2, 0.1 & 0.1.? Find the average word length, entropy, efficiency and redundancy.
Ans:
   Entropy, H = −Σ_i P_i log2 P_i = 2.122 bits/symbol;  L = 2.2
   Efficiency, η = H/L = 2.122/2.2 = 0.9645 = 96.45 %;  Redundancy = 1 − η = 3.55 %
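Binary Huffman coding can be carried out mechanically with a priority queue. The sketch below (not from the notes) tracks only the code lengths; because of ties it may build a different but equally optimal tree than the hand-worked one, yet L = 2.2 and the efficiency of Problem 25 are reproduced.

    # Binary Huffman coding: merge the two least probable nodes repeatedly.
    import heapq, itertools, math

    def huffman_lengths(probs):
        count = itertools.count()                       # tie-breaker for equal probabilities
        heap = [(p, next(count), {s: 0}) for s, p in probs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, _, d1 = heapq.heappop(heap)
            p2, _, d2 = heapq.heappop(heap)
            merged = {s: l + 1 for s, l in (d1 | d2).items()}   # symbols go one level deeper
            heapq.heappush(heap, (p1 + p2, next(count), merged))
        return heap[0][2]                               # symbol -> code length

    probs = {"S0": 0.4, "S1": 0.2, "S2": 0.2, "S3": 0.1, "S4": 0.1}
    lengths = huffman_lengths(probs)
    L = sum(probs[s] * l for s, l in lengths.items())
    H = -sum(p * math.log2(p) for p in probs.values())
    print(sorted(lengths.values()), L, round(H / L, 4))   # L = 2.2, efficiency ~0.9645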
26. Apply Huffman coding to the messages S0, S1 , S2, S3 & S4 with probabilities 0.4, 0.2,
0.2, 0.1 & 0.1? Find average word length, entropy, efficiency and redundancy. Choose
r = 4.
Ans:
Check q = r + α(r − 1):
   α = (q − r)/(r − 1) = (5 − 4)/(4 − 1) = 1/3
Since α is not an integer, add one dummy symbol S5 with probability 0, so q = 6:
   α = (6 − 4)/(4 − 1) = 2/3
Since α is still not an integer, add another dummy symbol S6 with probability 0, so q = 7:
   α = (7 − 4)/(4 − 1) = 1
Now α is an integer, so quaternary Huffman coding can be applied. The resulting word lengths give
   L = 0.4(1) + 0.2(1) + 0.2(1) + 0.1(2) + 0.1(2) = 0.4 + 0.2 + 0.2 + 0.2 + 0.2 = 1.2
   Entropy, H = −Σ_i P_i log2 P_i = 2.122 bits/symbol
   Efficiency, η = H/(L log2 r) = 2.122/(1.2 × 2) = 0.8842 = 88.42 %
LEMPEL-ZIV CODING
This source coding algorithm does not need the source statistics (source probabilities). Lempel-Ziv coding is a variable-to-fixed length source coding algorithm proposed by Abraham Lempel & Jacob Ziv in 1977. The algorithm is intrinsically adaptive & simpler to implement than the Huffman source coding algorithm.
PROCEDURE
1. The source data stream is parsed sequentially into variable-length blocks.
2. Each new block is chosen as the shortest string that has not occurred earlier in the parsing.
3. The variable-length blocks thus obtained are known as phrases. The phrases are listed in
a dictionary or code book, which stores the existing phrases & their locations.
4. In encoding a new phrase we specify the location of the existing phrase in the code book
& append the new letter.
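The parsing and (index, new letter) encoding step can be sketched in a few lines. The following illustration (not from the notes, LZ78-style) splits a binary string into phrases and emits the location of the longest previously seen prefix together with the new bit.

    # Lempel-Ziv (LZ78-style) parsing into phrases.
    def lz_parse(bits):
        dictionary = {"": 0}            # phrase -> index; index 0 is the empty phrase
        phrase, output = "", []
        for b in bits:
            if phrase + b in dictionary:
                phrase += b             # keep extending while the phrase is known
            else:
                output.append((dictionary[phrase], b))
                dictionary[phrase + b] = len(dictionary)
                phrase = ""
        if phrase:                      # flush a trailing, already-seen phrase
            output.append((dictionary[phrase[:-1]], phrase[-1]))
        return output

    print(lz_parse("101011011010"))
    # [(0, '1'), (0, '0'), (1, '0'), (1, '1'), (2, '1'), (3, '1'), (0, '0')]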
ADVANTAGES
1. This algorithm uses fixed-length codes to represent a variable number of source symbols.
RUN LENGTH ENCODING (RLE)
Run length encoding is used to reduce the size of a repeating string of symbols. The repeating string is called a run. RLE
encodes a run of symbols into two bytes (Count + Symbol). RLE cannot achieve high
compression ratio compared to other compression methods. But it is easy to implement and
quick to execute
Example: a 38-byte input containing long runs can be represented by an 18-byte RLE output, a compression ratio of 18 : 38 ≈ 1 : 2.1.
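The (Count + Symbol) scheme described above is straightforward to implement. The sketch below (not from the notes) encodes and decodes a short string and caps each run count at 255 so it fits one byte.

    # Byte-oriented run-length encoding: each run becomes a (count, symbol) pair.
    def rle_encode(data):
        out, i = [], 0
        while i < len(data):
            j = i
            while j < len(data) and data[j] == data[i] and j - i < 255:
                j += 1                      # extend the current run (max count 255)
            out.append((j - i, data[i]))
            i = j
        return out

    def rle_decode(pairs):
        return "".join(sym * count for count, sym in pairs)

    msg = "AAAABBBCCDAAAAAA"
    enc = rle_encode(msg)
    print(enc, rle_decode(enc) == msg)      # [(4,'A'), (3,'B'), (2,'C'), (1,'D'), (6,'A')] True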
JPEG IMAGE COMPRESSION
The Joint Photographic Experts Group (JPEG) was formed jointly by two standards organizations: the CCITT (the international telephone and telegraph standardization body, now ITU-T) and the International Standards Organization (ISO).
Let us now consider the lossless compression option of the JPEG image compression
standard which is a description of 29 distinct coding systems for compression of images. There
are so many approaches because the needs of different users vary so much with respect to the
quality versus compression and compression versus computation time that the committee
decided to provide a broad selection from which to choose. We shall briefly discuss here two methods that use entropy coding.
The two lossless JPEG compression options discussed here differ only in the form of the
entropy code that is applied to the data. The user can choose either a Huffman Code or an
Arithmetic code. We will not treat the Arithmetic Code concept in much detail here. However,
we will summarize its main features:
Some compression can be achieved if we can predict the next pixel using the previous
pixels. In this way we just have to transmit the prediction coefficients (or difference in the
values) instead of the entire pixel. The predictive process that is used in the lossless JPEG coding
schemes to form the innovations data is also variable. However, in this case the variation is not based upon the user's choice but rather is made, for any image, on a line-by-line basis. The choice is made according to which predictor performs best overall for the entire line.
There are eight prediction methods available in the JPEG coding standards. One of the
eight (which is the no prediction option) is not used for the lossless coding option that we are
examining here. The other seven may be divided into the following categories:
1. Predict the next pixel on the line as having the same value as the last one.
2. Predict the next pixel on the line as having the same value as the pixel in this position
on the previous line (that is, above it).
3. Predict the next pixel on the line as having a value related to a combination of the
previous, above and previous to the above pixel values. One such combination is
simply the combination of the other three.
The differential coding used in the JPEG standard consists of the differences between
the actual image pixel values and the predicted values. As a result of the smoothness and
redundancy present in most pictures, these differences give rise to relatively small positive and
negative numbers that represent the small typical error in the prediction. Hence the probabilities
associated with these values are larger for the small innovation values and quite small for large
ones. This is exactly the kind of data stream that compresses well with an entropy code.
The typical lossless compression for natural images is 2:1. While this is substantial, it
does not in general solve the problem of storing or moving large sequences of images as
encountered in high quality video.
The JPEG standard includes a set of sophisticated lossy compression options developed
after a study of image distortion acceptable to human senses. The JPEG lossy compression
algorithm consists of an image simplification stage, which removes the image complexity at
some loss of fidelity, followed by a lossless compression step based on predictive and Huffman
or Arithmetic coding.
The lossy image simplification step, which we will call the image reduction, is based on
the exploitation of an operation known as the Discrete Cosine Transform (DCT), defined as
follows.
   Y(k, l) = 4 Σ_{i=0}^{N−1} Σ_{j=0}^{M−1} y(i, j) cos( πk(2i + 1) / 2N ) cos( πl(2j + 1) / 2M )
where the input image is N pixels by M pixels, y(i, j) is the intensity of the pixel in the
row i and column j, Y(k, l) is the DCT coefficient in row k and column l of the DCT matrix. All
the DCT multiplications are real. This lowers the number of required multiplications, as
compared to the Discrete Fourier Transform. For most images, much of the signal energy lies at low frequencies, which appear in the upper left corner of the DCT. The lower right values represent higher frequencies, and are often small (usually small enough to be neglected with little visible distortion).
In the JPEG image reduction process, the DCT is applied to 8 by 8 pixel blocks of the
image. Hence, if the image is 256 by 256 pixels in size, we break it into 32 by 32 square blocks
of 8 by 8 pixels and treat each one independently. The 64 pixel values in each block are
transformed by the DCT into a new set of 64 values. These new 64 values, known also as DCT
coefficients, form a whole new way of representing an image. The DCT coefficients represent
the spatial frequency of the image sub-block. The upper left corner of the DCT matrix has low
frequency components and the lower right corners the high frequency components. The top left
coefficient is called the DC coefficient. Its value is proportional to the average value of the 8 by
8 block pixels. The rest are called the AC coefficients.
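A direct implementation of the quoted DCT formula makes the DC/AC behaviour easy to see. The sketch below (not from the notes; it follows the unnormalised form given above, so the usual scaling factors are omitted) transforms a flat 8x8 block and confirms that only the DC coefficient is non-zero.

    # 2-D DCT of one N x M block, using the formula quoted in the notes.
    import math

    def dct2(block):                       # block: N x M list of pixel intensities
        N, M = len(block), len(block[0])
        Y = [[0.0] * M for _ in range(N)]
        for k in range(N):
            for l in range(M):
                s = 0.0
                for i in range(N):
                    for j in range(M):
                        s += (block[i][j]
                              * math.cos(math.pi * k * (2 * i + 1) / (2 * N))
                              * math.cos(math.pi * l * (2 * j + 1) / (2 * M)))
                Y[k][l] = 4 * s
        return Y

    flat = [[128] * 8 for _ in range(8)]   # a perfectly flat (constant) block
    Y = dct2(flat)
    print(round(Y[0][0], 1))               # DC coefficient: 32768.0
    print(all(abs(Y[k][l]) < 1e-6 for k in range(8) for l in range(8) if (k, l) != (0, 0)))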
So far we have not obtained any reduction simply by taking the DCT. However, due to
the nature of most natural images, maximum energy (information) lies in low frequencies as
opposed to high frequency. We can represent the high frequency components coarsely, or drop
them altogether, without strongly affecting the quality of the resulting image reconstruction. This
leads to a lot of compression (lossy). The JPEG lossy compression algorithm does the following
operations:
We next look at the AC coefficients. We first quantize them, which transforms most of
the high frequency coefficients to zero. We then use a zigzag coding. The purpose of zigzag
coding is that we gradually move from the low frequency to high frequency, avoiding abrupt
jump in the values. Zigzag coding will lead to long runs of 0‟s, which are ideal for RLE followed
by Huffman or Arithmetic coding.
The typically quoted performance for JPEG is that photographic-quality images of natural scenes can be preserved with compression ratios of up to about 20:1 or 25:1. Usable quality (that is, for non-critical purposes) can result for compression ratios in the range of 200:1 up to 230:1.
CHANNEL CAPACITY
The channel capacity of a discrete memoryless channel is defined as the maximum of the mutual information I(X, Y) over all possible input probability distributions. The channel capacity is measured in bits per channel use.
C= Max [ I(X,Y) ]
1. LOSSLESS CHANNEL
A channel is said to be lossless if H(X|Y) = 0 for all input distributions. A lossless channel is characterized by the fact that the input is completely determined by the output, and hence no transmission error can occur.
Or it can be represented as
   I(X, Y) = H(X) − H(X|Y)
For a lossless channel H(X|Y) = 0.
   C = Max I(X, Y) = Max [H(X) − 0] = Max H(X)
If the source is transmitting M symbols, the entropy is maximum when all the symbols are equally likely, so
   C = Max H(X) = log2 M
2. DETERMINISTIC CHANNEL
A channel is said to be deterministic if P(y_j|x_i) is either 1 or 0 for every pair of indices, or equivalently H(Y|X) = 0 for all input distributions. In the case of a deterministic channel the noise matrix contains only one non-zero element in each row, and that element is unity.
Or it can be represented as
   I(X, Y) = H(Y) − H(Y|X)
For a deterministic channel H(Y|X) = 0, so
   C = Max H(Y)
If the receiver is receiving N symbols, the entropy is maximum when all the symbols are equally likely; then H = log2 N and
   C = Max H(Y) = log2 N
3. NOISELESS CHANNEL
A channel is said to be noiseless if it is both lossless and deterministic, that is H(Y|X) = H(X|Y) = 0. The noise matrix then contains a single non-zero (unity) entry in each row and each column.
Or it can be represented as
   I(X, Y) = H(Y) − H(Y|X) = H(X) − H(X|Y) = H(X) = H(Y)
   C = Max I(X, Y) = log2 M = log2 N
4. USELESS CHANNEL
A channel is useless if the output gives no information about the input, i.e. H(X|Y) = H(X) for every input distribution. Then I(X, Y) = H(X) − H(X|Y) = 0 and
   C = Max I(X, Y) = Max [0] = 0
5. SYMMETRIC CHANNEL
A channel is symmetric if each row of the noise matrix is a permutation of every other row, and each column is a permutation of every other column.
For the first input symbol,
   H(Y|x1) = P(y1|x1) log (1/P(y1|x1)) + P(y2|x1) log (1/P(y2|x1)) + ... + P(yN|x1) log (1/P(yN|x1))
Similarly,
   H(Y|xM) = P(y1|xM) log (1/P(y1|xM)) + P(y2|xM) log (1/P(y2|xM)) + ... + P(yN|xM) log (1/P(yN|xM))
and
   H(Y|X) = Σ_i P(x_i) H(Y|x_i)
Since the rows of the noise matrix are permutations of one another, every row gives the same value:
   H(Y|x1) = H(Y|x2) = H(Y|x3) = ... = H(Y|xM) = h
Therefore
   H(Y|X) = Σ_i P(x_i) h = h Σ_i P(x_i) = h          (since Σ_i P(x_i) = 1)
   C = Max [H(Y) − H(Y|X)] = Max [H(Y)] − h = log2 N − h
   C = Max [H(X) − H(X|Y)] = Max [H(X)] − h′ = log2 M − h′
where h = H(Y|X = x_i) and h′ = H(X|Y = y_j).
PROBLEMS
27. Determine the channel capacity of a symmetrical channel whose matrix is given as
Ans:
   C = log2 N − h
where N = 4 and h = H(Y|X = x_i).
We know that H(Y|x1) = H(Y|x2) = ... = H(Y|x_i) = h.
6. BINARY SYMMETRIC CHANNEL
A symmetric channel which uses binary data for transmission and reception is called a binary symmetric channel. The binary symmetric channel is used in binary communication systems.
The binary source transmits binary data to the receiver. The comparator compares the
transmitted and the received symbols. The comparator sends the corresponding signal to the
observer. The observer, which has a noiseless channel, sends a „1‟ to the receiver when there
is an error. When there is no error the observer sends a „0‟ to the receiver. Thus the observer
supplies additional information to the receiver thus avoiding the noise in the channel.
Additional information supplied by the observer is exactly equivocation of the source.
To find the capacity of a binary symmetric channel, let us take P(X1) = 1/2 and P(X2) = 1/2, with the noise matrix
   P(Y|X) = [ p  q ;  q  p ],  where p + q = 1
We have P(X, Y) = P(X) P(Y|X), therefore
   P(X, Y) = [ p/2  q/2 ;  q/2  p/2 ]
From P(X, Y) we get P(Y1) = (p + q)/2 and P(Y2) = (p + q)/2.
   P(X|Y) = P(X, Y)/P(Y) = [ p/(p+q)  q/(p+q) ;  q/(p+q)  p/(p+q) ] = [ p  q ;  q  p ]
Therefore
   H(X|Y) = Σ Σ P(X, Y) log2 (1/P(X|Y))
          = (p/2) log (1/p) + (q/2) log (1/q) + (q/2) log (1/q) + (p/2) log (1/p)
          = p log (1/p) + q log (1/q) = −(p log p + q log q)
For a symmetric channel we know that the channel capacity is C = log2 M − h; here M = 2 and
   h = H(Y|X1) = H(Y|X2) = p log (1/p) + q log (1/q) = −(p log p + q log q)
   C = log2 M − h = log2 2 + p log p + q log q
   C = 1 + p log p + q log q
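The closed-form capacity of the binary symmetric channel is easy to tabulate. The sketch below (not from the notes) evaluates C = 1 + p log2 p + q log2 q for a few crossover probabilities, including the values used in Problem 31.

    # Capacity of a binary symmetric channel with crossover probability q = 1 - p.
    import math

    def bsc_capacity(q):
        p = 1.0 - q
        h = -sum(x * math.log2(x) for x in (p, q) if x > 0)   # h = H(Y|X = x_i)
        return 1.0 - h

    for q in (0.0, 0.1, 0.25, 0.5):
        print(q, round(bsc_capacity(q), 4))
    # 0.0 -> 1.0, 0.1 -> 0.531, 0.25 -> 0.1887, 0.5 -> 0.0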
PROBLEMS
28. Determine the channel capacity of a binary symmetrical channel whose matrix is
given as
Ans:
C = 1+ p log p + q log q
7. BINARY ERASURE CHANNEL
In communication systems the received data may sometimes be so corrupted that we cannot judge the received output, and sometimes the data may be totally lost. Such a channel is called a binary erasure channel.
The channel has inputs x1, x2 and outputs y1, y2 (the erasure) and y3, with noise matrix
   P(Y|X) = [ p  1−p  0 ;  0  1−p  p ]
   C = Max I(X, Y) = Max [H(X) − H(X|Y)]          (1)
Let P(x1) = α and P(x2) = 1 − α. Then
   H(X) = α log (1/α) + (1 − α) log (1/(1 − α))          (2)
The output probabilities are
   P(Y1) = αp,  P(Y2) = α(1 − p) + (1 − α)(1 − p) = 1 − p,  P(Y3) = (1 − α)p
and P(X|Y) = P(X, Y)/P(Y) gives
   P(X|Y) = [ 1  α  0 ;  0  1−α  1 ]
   H(X|Y) = Σ Σ P(X_i, Y_j) log2 (1/P(X_i|Y_j))
          = αp log 1 + α(1 − p) log (1/α) + (1 − α)(1 − p) log (1/(1 − α)) + (1 − α)p log 1
          = (1 − p) [ α log (1/α) + (1 − α) log (1/(1 − α)) ] = (1 − p) H(X)          (3)
Substituting (2) & (3) in (1), we get
   C = Max [H(X) − (1 − p) H(X)] = Max [p H(X)] = p Max [H(X)] = p
since Max H(X) = log2 2 = 1 for a binary source.
PROBLEMS
29. Determine the channel capacity of a binary erasure channel whose matrix is given as
   P(Y|X) = [ 3/4  1/4  0 ;  0  1/4  3/4 ]
Ans:
Comparing with the general form [ p  1−p  0 ;  0  1−p  p ] gives p = 3/4, so
   C = p = 3/4
8. UNSYMMETRIC CHANNEL
An unsymmetric channel doesn't have any identical rows or columns. In this case the capacity is found as follows (Muroga's method):
Step 1: The noise matrix is multiplied by the column vector [Q1, Q2, ..., Qn]^T and equated to the column vector whose i-th element is Σ_j P(y_j|x_i) log2 P(y_j|x_i).
Step 2: From step 1 we obtain simultaneous equations in the variables Q1, Q2, ..., Qn, which are solved for the Q's.
Step 3: The values of the Q's are substituted in the general expression for the channel capacity,
   C = log2 ( 2^Q1 + 2^Q2 + ... + 2^Qn )
PROBLEMS
30. Determine the channel capacity of an unsymmetric channel whose matrix is given as
P(Y|X) =
     1/3   2/3   0
     2/3   1/3   0
     0     0     1
Ans:
Step 1:
   [ 1/3  2/3  0 ;  2/3  1/3  0 ;  0  0  1 ] [Q1; Q2; Q3] = [ (1/3) log2 (1/3) + (2/3) log2 (2/3) ;  (2/3) log2 (2/3) + (1/3) log2 (1/3) ;  1 log2 1 ]
Step 2: From these matrices we obtain the simultaneous equations
   (1/3) Q1 + (2/3) Q2 = −0.918
   (2/3) Q1 + (1/3) Q2 = −0.918
   Q3 = 0
i.e. Q1 + 2Q2 = −2.755 and 2Q1 + Q2 = −2.755, whose solution is
   Q1 = Q2 = −0.918,  Q3 = 0
Step 3:
   C = log2 ( 2^(−0.918) + 2^(−0.918) + 2^0 ) = log2 (0.529 + 0.529 + 1) = 1.04 bits/channel use
CASCADED CHANNELS
When two channels are connected in cascade, the overall noise matrix is the product of the individual noise matrices, P(Z|X) = P(Y|X) P(Z|Y). Element-wise, with [π] the matrix of the first channel and [ρ] that of the second,
   P11 = π11 ρ11 + π12 ρ21        P12 = π11 ρ12 + π12 ρ22
   P21 = π21 ρ11 + π22 ρ21        P22 = π21 ρ12 + π22 ρ22
PROBLEMS
31. Two binary symmetric channels are in cascade. Determine the channel capacity of
each channel and the overall system.
Ans:
1st Channel:
   P(Y|X) = [ 0.9  0.1 ;  0.1  0.9 ]
   Channel capacity, C1 = 1 + p log p + q log q = 1 + 0.9 log2 0.9 + 0.1 log2 0.1 = 1 − 0.469 = 0.531
2nd Channel:
   P(Z|Y) = [ 0.75  0.25 ;  0.25  0.75 ]
   Channel capacity, C2 = 1 + p log p + q log q = 1 + 0.75 log2 0.75 + 0.25 log2 0.25 = 1 − 0.81 = 0.19
Overall channel:
   P(Z|X) = P(Y|X) P(Z|Y) = [ P11  P12 ;  P21  P22 ]
   P11 = 0.9 × 0.75 + 0.1 × 0.25 = 0.7
   P12 = 0.9 × 0.25 + 0.1 × 0.75 = 0.3
   P21 = 0.1 × 0.75 + 0.9 × 0.25 = 0.3
   P22 = 0.1 × 0.25 + 0.9 × 0.75 = 0.7
   Channel capacity, C = 1 + p log p + q log q = 1 + 0.7 log2 0.7 + 0.3 log2 0.3 = 1 − 0.88 = 0.12
For finding the capacity of a band-limited Gaussian channel, we have to know about Shannon‟s
theorem & Shannon- Hartley theorem
SHANNON’S THEOREM
If there exist a source that produces „M‟ number of equally likely messages generating
information at the information rate „R‟ with channel capacity „C‟, then if R ≤ C; there exist a
coding technique such that we can transmit message over the channel with minimum probability
of error.
NEGATIVE STATEMENT
If there exist a source that produces „M‟ number of equally likely messages generating
information at the information rate „R‟ with channel capacity „C‟, then if R > C; there exist a
coding technique such that we can transmit message over the channel with probability of error
which is close to one.
SHANNON-HARTLEY THEOREM
   C = B log2 (1 + S/N)
where B is the channel bandwidth,
S is the signal power,
N is the noise power, N = ηB, and
η/2 is the two-sided noise power spectral density.
DERIVATION
For the purpose of transmission over a channel the messages are represented by fixed voltage levels, and the source generates one message after another in sequence. The transmitted signal s(t) can be shown as in the figure.
The M levels are separated by λσ, where σ is the rms noise voltage, so the levels are at ±λσ/2, ±3λσ/2, ..., ±(M − 1)λσ/2. (The outermost level is (M − 1)λσ/2 because, for example, with 8 levels the last level is at 7λσ/2.) Taking the levels to be equally likely, the average signal power is
   S = (2/M) [ (λσ/2)² + (3λσ/2)² + ... + ((M − 1)λσ/2)² ]
     = (2/M) (λ²σ²/4) [ 1² + 3² + ... + (M − 1)² ]
     = (2/M) (λ²σ²/4) [ M(M² − 1)/6 ]
     = λ²σ² (M² − 1)/12
From this,
   M² − 1 = 12S/(λ²σ²)  ⟹  M = [ 1 + 12S/(λ²σ²) ]^(1/2)
The noise power N is the square of the rms noise voltage, i.e. N = σ², so
   M = [ 1 + 12S/(λ²N) ]^(1/2)          (1)
Each message is equally likely, so the entropy is H = log2 M:
   H = log2 [ 1 + 12S/(λ²N) ]^(1/2) = (1/2) log2 [ 1 + 12S/(λ²N) ]          (2)
We know that the rate of information is R = rH          (3)
By Shannon's theorem, if R ≤ C the transmission takes place with a small probability of error; take R ≈ C:
   C = rH = r (1/2) log2 [ 1 + 12S/(λ²N) ]          (4)
By the sampling theorem, the Nyquist symbol rate is r = 2B, where B is the bandwidth:
   C = 2B (1/2) log2 [ 1 + 12S/(λ²N) ] = B log2 [ 1 + 12S/(λ²N) ]
Substituting λ² = 12,
   C = B log2 (1 + S/N)   bits/sec
The Shannon- Hartley theorem specifies the rate at which the information may be
transmitted with small probability of error. Thus Shannon- Hartley theorem contemplates that,
with a sufficiently sophisticated transmission technique, transmission at channel capacity is
possible with arbitrarily small probability of error
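The formula lends itself to quick numerical work. The sketch below (not from the notes) evaluates C = B log2(1 + S/N) for the telephone-channel figures used later in Problem 32 and also inverts the formula to find the minimum SNR for a target rate.

    # Shannon-Hartley capacity and the SNR needed for a given rate.
    import math

    def capacity(bandwidth_hz, snr_linear):
        return bandwidth_hz * math.log2(1.0 + snr_linear)

    def snr_from_db(snr_db):
        return 10.0 ** (snr_db / 10.0)

    B = 3.4e3
    print(round(capacity(B, snr_from_db(30)) / 1e3, 1))   # ~33.9 kbits/s at 30 dB SNR
    # minimum SNR to support 4800 bits/s over the same channel:
    print(round(2 ** (4800 / B) - 1, 2))                   # ~1.66 (about 2.2 dB)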
The Shannon- Hartley theorem gives an idea about Bandwidth - SNR trade off.
The trade off is the exchange of bandwidth with signal to noise power ratio.
   C = B log2 (1 + S/N),  i.e.  1 = (B/C) log2 (1 + S/N)
By plotting B/C against S/N, we get the bandwidth-SNR trade-off curve shown in the figure.
From the figure we see that as the SNR increases the required bandwidth decreases. At SNR > 10, the reduction in bandwidth obtained by further increasing the SNR is poor. The use of larger bandwidth for smaller SNR is
called coding upwards (E.g. FM, PM, PCM) and the use of smaller bandwidth for larger SNR is
called coding downwards (E.g. PAM). This trade off can be seen when we take the capacity of a
channel of infinite bandwidth
By the Shannon-Hartley theorem, C = B log2 (1 + S/N) with N = ηB.
When bandwidth becomes infinite, the capacity of the channel increases. But it doesn‟t
become infinite because with the increase in bandwidth, the noise power also increases (N = ηB).
Thus the capacity approaches an upper limit with increasing bandwidth.
   C = B log2 (1 + S/(ηB)) = (S/η) (ηB/S) log2 (1 + S/(ηB))
Let x = S/(ηB); as B → ∞, x → 0, and
   C_∞ = lim_{B→∞} C = (S/η) lim_{x→0} (1/x) log2 (1 + x) = (S/η) log2 e
   C_∞ = 1.44 S/η
For a set of orthogonal signals S_i(t),
   ∫₀ᵀ S_i(t) S_j(t) dt = ∫₀ᵀ S_i²(t) dt = E_S   if i = j
                        = 0                       if i ≠ j
When S_1(t) is transmitted, the outputs of all the correlators (multiplier + integrator combinations) except the first will be zero, due to the orthogonality property.
SHANNON’S LIMIT
To find Shannon's limit we have to find the minimum value of the bit energy to noise density ratio, E_b/N_0.
In practical channels the noise power spectral density N_0 is generally a constant. If E_b is the transmitted energy per bit, then the average transmitted power is
   S = E_b C          (1)
The two-sided power spectral density of the AWGN channel is N_0/2, so the noise power in a bandwidth B is
   N = ∫_{−B}^{B} (N_0/2) df = N_0 B          (2)
Applying (1) & (2) in the Shannon-Hartley theorem,
   C = B log2 (1 + S/N) = B log2 (1 + E_b C/(N_0 B))
Hence
   2^(C/B) = 1 + E_b C/(N_0 B)
   E_b/N_0 = ( 2^(C/B) − 1 ) / (C/B)
Here C/B is the bandwidth efficiency. If C/B = 1 then E_b/N_0 = 1, i.e. the signal energy per bit equals the noise power spectral density.
Now let B_0 = S/N_0. Then
   C = B log2 (1 + B_0/B)
   C/B_0 = (B/B_0) log2 (1 + B_0/B) = log2 (1 + B_0/B)^(B/B_0)
   C_max = lim_{B→∞} C = B_0 log2 e = 1.44 B_0
Since S = E_b C = N_0 B_0,
   E_b/N_0 = B_0/C
   ( E_b/N_0 )_min = lim_{B→∞} B_0/C = B_0/C_max = B_0/(1.44 B_0) = 0.6944
   ( E_b/N_0 )_min = −1.58 dB
This is known as Shannon's limit.
PROBLEMS
32. A voice grade channel of a telephone network has a bandwidth of 3.4 kHz.
a. Calculate the channel capacity of the telephone channel for a signal to noise ratio
of 30dB.
b. Calculate the minimum signal to noise ratio required to support information
transmitted through the telephone channel at a rate of 4800bits per seconds.
Ans:
a. B = 3.4 kHz, SNR = 30 dB
   10 log10 SNR = 30  ⟹  log10 SNR = 3  ⟹  SNR = 1000
   C = B log2 (1 + SNR) = 3400 × log2 (1001) = 3400 × 9.97 = 33.9 kbits/sec
b. For R = 4800 bits/sec we require C ≥ R:
   4800 = 3400 log2 (1 + SNR)  ⟹  log2 (1 + SNR) = 1.41  ⟹  SNR = 2^1.41 − 1 = 1.66  (≈ 2.2 dB)
33. A communication system employs a continuous source. The channel noise is white and
Gaussian. The bandwidth of the source output is 10MHz and SNR is 100.
a. Determine the channel capacity.
b. If the SNR drops to 10, how much bandwidth is needed to achieve the same
channel capacity?
c. If the bandwidth is decreased to 1MHz, how much SNR is required to maintain
the same channel capacity?
Ans:
a. B = 10 MHz, SNR = 100
   C = B log2 (1 + SNR) = 10^7 log2 (101) = 66.6 Mbps
b. With SNR = 10, the bandwidth required for the same capacity is
   B = C / log2 (1 + 10) = 66.6 × 10^6 / 3.46 = 19.3 MHz
c. With B = 1 MHz, the required SNR satisfies
   log2 (1 + SNR) = C/B = 66.6  ⟹  SNR = 2^66.6 − 1 ≈ 10^20
34. Alphanumeric data are entered into a computer from a remote terminal through a
voice grade telephone channel. The channel has a bandwidth of 3.4 kHz and output
signal to noise power ratio of 20dB. The terminal has a total of 128 symbols which
may be assumed to occur with equal probability and that the successive transmissions
are statistically independent.
d. Calculate the channel capacity.
e. Calculate the maximum symbol rate for which error free transmission over the
channel is possible?
Ans:
a. B = 3.4 kHz, SNR = 20 dB
   10 log10 SNR = 20  ⟹  SNR = 100
   C = B log2 (1 + SNR) = 3400 × log2 (101) = 3400 × 6.66 = 22.6 kbits/sec
b. Each symbol carries log2 128 = 7 bits, so the maximum symbol rate for error-free transmission is
   r_max = C/7 = 22640/7 ≈ 3234 symbols/sec
35. A black and white television picture may be viewed as consisting of approximately
3x105 elements; each one of which may occupy one of 10 distinct brightness levels with
equal probability. Assume the rate of transmission to be 30 picture frames per second,
and the signal to noise ratio is 30dB. Using channel capacity theorem, calculate the
minimum bandwidth required to support the transmission of the resultant video
signal?
Ans:
Number of different pictures possible = 10^(3×10^5)
   Entropy per frame, H = log2 10^(3×10^5) = 3 × 10^5 log2 10 = 3 × 10^5 × 3.32 = 9.97 × 10^5 bits
   Rate of information, R = r × H = 30 × 9.97 × 10^5 = 29.9 × 10^6 bits/sec
For error-free transmission we need C = R, and with SNR = 30 dB = 1000:
   C = B log2 (1 + SNR)
   29.9 × 10^6 = B log2 (1001)
   B = 3 MHz
MODULE II
Codes for error detection & correction - parity check coding - linear block codes -
error detecting and correcting capabilities - generator and parity check matrices -
Standard array and syndrome decoding –Perfect codes, Hamming codes - encoding
and decoding, cyclic codes – polynomial and matrix descriptions- generation of
cyclic codes, decoding of cyclic codes, BCH codes - description & decoding,
Reed-Solomon Codes, Burst error correction.
Rings
A ring is a set R with two composition laws "+" and "·" such that (R, +) is a commutative group, the multiplication "·" is associative with an identity element 1_R, and the distributive laws a(b + c) = ab + ac and (b + c)a = ba + ca hold.
A subring S of a ring R is a subset that contains 1_R and is closed under addition, passage to the negative, and multiplication. It inherits the structure of a ring from that on R.
A commutative ring is said to be an integral domain if 1_R ≠ 0 and the cancellation law holds for multiplication:
   ab = ac, a ≠ 0 implies b = c
An ideal I of R is an additive subgroup such that
   r ∈ R, a ∈ I implies ra ∈ I
Fields
A field is a nonempty set F of elements with two operations '+' (called addition) and '·' (called multiplication) satisfying the following axioms. For all a, b, c ∈ F: F is closed under + and ·; both operations are commutative and associative; and multiplication distributes over addition, a · (b + c) = a · b + a · c.
Furthermore, two distinct identity elements 0 and 1 (called the additive and multiplicative identities, respectively) must exist in F satisfying the following:
1. a + 0 = a for all a ∈ F.
2. a · 1 = a for all a ∈ F.
3. For any a in F, there exists an additive inverse element (−a) in F such that a + (−a) = 0.
4. For any a ≠ 0 in F, there exists a multiplicative inverse element a⁻¹ in F such that a · a⁻¹ = 1.
When we use a source coding technique (Shannon-Fano, Huffman coding, the Lempel-Ziv algorithm or run length coding) we get efficient, often variable-length, codes for the source symbols, but these codes by themselves provide no protection against errors introduced by the channel.
To overcome this problem we go for the error control coding technique, or channel coding technique.
CHANNEL CODING
During the transmission process, the transmitted signal will pass through certain noisy
channel. Due to noise interference some errors are introduced in the received data. These errors
can be independent errors or burst errors.
Independent errors are caused by the Gaussian noise or the thermal noise. Burst errors are
caused mainly by the impulse noise. The errors caused by these noises can be detected and
corrected by using proper coding technique.
In this method, errors are detected and corrected by proper coding techniques at the
receiver side.
The actual message originates from the information source. The amount of information
can be measured in bits or nats or decits. The source encoder transforms this information into a
sequence of binary digits „u‟ by applying different source coding techniques. The source encoder
is designed such that the bit rate is minimized and also the information can be uniquely
reconstructed from the sequence „u‟. Then the channel encoder transforms the sequence „u‟ into
the encoder sequence „v‟ by channel coding techniques. That is, the channel encoder adds some
extra bits called check bits to the encoder message sequence „u‟. This encoded sequence is
transmitted through the channel after modulation process. Demodulation will separate the
individual „v‟ sequence and passes it to the channel decoder. The channel decoder identifies the
extra bits or check bits added by the channel encoder and uses them to detect and correct the
errors in the transmitted message if there are any. The source decoder decodes the encoded
message sequence and will recover the original message. Finally the original message reaches at
the destination.
HAMMING DISTANCE
The hamming distance between two code vectors is equal to the number of elements in
which they differ.
Minimum distance is the minimum hamming distance between the code vectors.
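Both quantities are simple to compute. The sketch below (not from the notes) measures the Hamming distance between two vectors and the minimum distance of the (6, 3) code worked out later in this module.

    # Hamming distance between code vectors and minimum distance of a code.
    from itertools import combinations

    def hamming_distance(a, b):
        return sum(x != y for x, y in zip(a, b))

    def minimum_distance(code):
        return min(hamming_distance(a, b) for a, b in combinations(code, 2))

    code = ["000000", "001110", "010011", "011101",
            "100101", "101011", "110110", "111000"]   # the (6,3) code used later
    print(hamming_distance("101011", "110110"))        # 4
    print(minimum_distance(code))                      # 3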
CODE EFFICIENCY
   Code efficiency = (message bits in a block) / (transmitted bits in a block) = k/n
VECTOR WEIGHT
The number of non zero elements in the code vector is known as vector weight.
E.g. 11110001. Then vector weight = 5
In this method extra bits, called parity check bits, are added to the message at the time of transmission. At the receiver end, the receiver checks these parity bits; errors are detected if the expected pattern of parity bits is not received. Two types of parity checking mechanisms are used: single (character) parity checking and block parity checking.
In single parity checking each character of the message is converted into its ASCII code and one parity bit is appended. For even parity, if the code vector contains an odd number of ones, a one is added to the original message.
E.g.: Let the original message be 1110000; then the transmitted message is 11100001.
If the code vector contains an even number of ones, then we add a zero to the original message.
E.g.: Let the original message be 1111000; then the transmitted message is 11110000.
Advantages
Let the transmitted bits be 11110011, which has even parity, and assume that a one-bit error occurs so that the received bits are 01110011, which has odd parity. The receiver can detect the error because it is expecting even parity, and it sends an ARQ request to the transmitter to re-transmit the message.
Disadvantages
Single parity checking cannot detect an even number of bit errors in a character and cannot locate the position of an error, so errors cannot be corrected.
BLOCK PARITY CHECKING
For example, if we are transmitting the message 'INDIA', we write the ASCII codes of I, N, D, I & A.
The bit b8 is obtained by checking the parity of each column: if a column contains an even number of ones we add 0, and if a column contains an odd number of ones we add 1.
Here, by checking the BCC & b8, the receiver can detect the exact position of an error and thus correct it. When double errors occur in a row, the BCC will not change, but the receiver can still detect the error with the help of b8.
Advantages
LINEAR CODES
A code is said to be linear code if the sum of two code vectors will produce another code
vector. That is if v1 and v2 are any code vectors of length n of the block code then v1+v2 is also a
code word of length n of the block code.
SYSTEMATIC CODES
In a systematic block code the message bits appear at the beginning of the code word, followed by the check bits in the same block.
The various linear block codes used for error detection and correction are hamming
codes, cyclic codes, single parity check codes, BCH codes, Reed-Solomon codes, etc.
If we want to transmit „k‟ number of message bits the channel encoder adds some extra
bits or check bits to this encoded message bits. Let the number of these additional bits be „q‟.
Then the total number of bits at the output of the channel encoder is „k + q‟ and it is represented
as „n‟. That is “n = k + q”. This type of coding method is known as (n, k) linear block codes.
The code vector consists of message (k) bits and check (q) bits. Let the code vector be X.
The code vectors are obtained by multiplying the message vector with a generator matrix
which is used to generate the check bits.
GENERATOR MATRIX
The code vector is obtained by multiplying the message vector by the generator matrix:
   X = [m1 m2 ... mk] G
where G is the k × n generator matrix with elements g11 ... g1n in the first row through gk1 ... gkn in the last row. For a systematic code the generator matrix has the form
   G = [ I_k : P ]          (4)
where I_k is the k × k identity matrix and P is the k × q parity matrix with elements p11 ... p1q through pk1 ... pkq, so that the check bits are
   [c1 c2 ... cq] = [m1 m2 ... mk] P
On solving, we get the parity check matrix
   H = [ P^T : I_q ]          (6)
If the generator matrix is given, then we get this parity check matrix from it.
PROBLEMS
1. The generator matrix for a (6, 3) block code is shown below. Obtain all the code
vectors?
1 0 0 0 1 1
G 0 1 0 1 0 1
0 0 1 1 1 0
Ans:
For getting the code vectors, we have to find the additional bits/ check bits.
We know that the check bits are [c1 c2 c3] = [m1 m2 m3] P, where from G = [I3 : P],
   P = [ 0 1 1
         1 0 1
         1 1 0 ]
On substituting and multiplying (modulo 2) we get
   c1 = m2 ⊕ m3
   c2 = m1 ⊕ m3
   c3 = m1 ⊕ m2
For obtaining all the code vectors, we have to give values for m1 ,m2 and m3. The
number of messages that can be constructed using 3 bits is 23 = 8. That is 000,
001, ….,111.
Another method:
These code vectors can be obtained directly from X = [M][G]:
   Message M      Code vector X = MG
   0 0 0          0 0 0 0 0 0
   0 0 1          0 0 1 1 1 0
   0 1 0          0 1 0 1 0 1
   0 1 1          0 1 1 0 1 1
   1 0 0          1 0 0 0 1 1
   1 0 1          1 0 1 1 0 1
   1 1 0          1 1 0 1 1 0
   1 1 1          1 1 1 0 0 0
The encoding circuit of the (6, 3) linear block code works as follows: first the message bits m1, m2 and m3 to be encoded are shifted into the message register and simultaneously into the channel through the commutator (switch). The commutator then moves to the positions c1, c2 & c3 and the check bits are shifted to the channel. Thus the code vector X = (m1, m2, m3, c1, c2, c3) is transmitted.
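The table of code vectors can be reproduced by carrying out X = MG in modulo-2 arithmetic. The sketch below (not from the notes) does exactly that for the generator matrix of Problem 1.

    # Generate all 2^k code vectors of the (6,3) code by X = M.G over GF(2).
    from itertools import product

    G = [[1, 0, 0, 0, 1, 1],
         [0, 1, 0, 1, 0, 1],
         [0, 0, 1, 1, 1, 0]]

    def encode(m, G):
        # modulo-2 matrix product of the 1 x k message with the k x n generator
        return [sum(m[i] * G[i][j] for i in range(len(m))) % 2 for j in range(len(G[0]))]

    for m in product([0, 1], repeat=3):
        print(m, encode(list(m), G))
    # e.g. (1, 1, 1) -> [1, 1, 1, 0, 0, 0], matching X8 in the table above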
2. For a systematic (6, 3) linear block code the parity matrix P is shown below. Obtain all
the code vectors?
1 0 1
P 0 1 1
1 1 0
Ans:
We know that X = [M][G] with G = [I3 : P]:
   [x1 x2 x3 x4 x5 x6] = [m1 m2 m3] [ 1 0 0 1 0 1
                                      0 1 0 0 1 1
                                      0 0 1 1 1 0 ]
                       = [m1  m2  m3  m1⊕m3  m2⊕m3  m1⊕m2]
For obtaining all the code vectors, we have to give values for m1 ,m2 and m3. The
number of messages that can be constructed using 3 bits is 23 = 8. That is 000,
001, ….,111.
The encoding circuit is similar to that of problem 1: the message bits m1, m2 and m3 are shifted into the message register and simultaneously into the channel through the commutator; the commutator then moves to the positions c1, c2 & c3 so that the check bits are shifted to the channel, and the code vector X = (m1, m2, m3, c1, c2, c3) is transmitted.
Let „X‟ be a valid code vector at the transmitter. At the receiver, the received code vector
is „Y‟.
At transmitter, XHT = 0.
Here X is the transmitted code vector.
And H is the parity check matrix.
When there is no error, we will receive the transmitted code vector itself. Let the received
code vector is „Y‟.
If there is no error, X = Y, and since XH^T = 0 we have YH^T = 0          (1)
If there is an error, YH^T ≠ 0. The non-zero output of the product YH^T is called the syndrome, and this syndrome is used to detect the errors. If E is the error pattern, Y = X ⊕ E, and
   S = YH^T = (X ⊕ E) H^T = XH^T ⊕ EH^T = 0 ⊕ EH^T = EH^T
Each syndrome vector corresponds to a particular error pattern. Each syndrome vector
says which bit is in error.
Block diagram of a syndrome decoder for linear block code to correct errors.
Here the received n – bit vector Y is stored in n – bit register. From this vector the
syndrome is calculated using S = YHT. Thus HT is stored in the syndrome calculator. The q-bit
syndrome vector is then applied to look up table to find out the error pattern. Thus we get
corresponding E. This E is added to vector Y.
   Y ⊕ E = X
Thus we get the corrected message X.
PROBLEMS:
3. For a systematic (6, 3) linear block code the parity matrix P is shown below. The
received code vector is R= [1 1 0 0 1 0]. Detect & correct the single error that has
occurred due to noise?
1 0 1
P 0 1 1
1 1 0
Ans:
Here we have
   P = [ 1 0 1
         0 1 1
         1 1 0 ]
(P is symmetric here, so P^T = P.) We know that
   H = [ P^T : I3 ] = [ 1 0 1 1 0 0
                        0 1 1 0 1 0
                        1 1 0 0 0 1 ]
The syndrome is S = RH^T, with
   H^T = [ 1 0 1
           0 1 1
           1 1 0
           1 0 0
           0 1 0
           0 0 1 ]
   [s1 s2 s3] = [1 1 0 0 1 0] H^T = [1 0 0]
The syndrome is found by modulo-2 addition and multiplication. The syndrome vector S = [1 0 0] matches the 4th row of H^T, hence the 4th bit of the received vector, counting from the left, is in error. The corrected code vector is [1 1 0 1 1 0].
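The whole single-error decoding step of Problem 3 can be automated. The sketch below (not from the notes) computes S = RH^T, locates the matching row of H^T and flips the corresponding bit.

    # Single-error syndrome decoding for the (6,3) code of Problem 3.
    HT = [[1, 0, 1],        # rows of H^T = [P ; I3]
          [0, 1, 1],
          [1, 1, 0],
          [1, 0, 0],
          [0, 1, 0],
          [0, 0, 1]]

    def syndrome(r):
        return [sum(r[i] * HT[i][j] for i in range(6)) % 2 for j in range(3)]

    def correct(r):
        s = syndrome(r)
        if s == [0, 0, 0]:
            return r                      # no detectable error
        pos = HT.index(s)                 # row of H^T equal to the syndrome
        return [bit ^ (1 if i == pos else 0) for i, bit in enumerate(r)]

    R = [1, 1, 0, 0, 1, 0]
    print(syndrome(R), correct(R))        # [1, 0, 0] and [1, 1, 0, 1, 1, 0]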
4. Repetition code represent simplest type of linear block code. The generator matrix of
a (5, 1) repetition code is given by
G 1 1 1 1 1
Ans:
It is given that n = 5 & k = 1 for the (5, 1) repetition code. Since k = 1 there is only one message bit and the remaining four bits are check bits. We know that [G] = [P : I_k]; here, for k = 1, the identity matrix I_k is the 1 × 1 matrix [1] and P = [1 1 1 1], so
   G = [1 1 1 1 1]
   H = [ I4 : P^T ] = [ 1 0 0 0 1
                        0 1 0 0 1
                        0 0 1 0 1
                        0 0 0 1 1 ]
There are five single error patterns, given by [1 0 0 0 0], [0 1 0 0 0], [0 0 1 0 0], [0 0 0 1 0] and [0 0 0 0 1]. To find the syndromes, [S] = [E][H^T]:
S1 = [1 0 0 0 0] H^T = [1 0 0 0]
S2 = [0 1 0 0 0] H^T = [0 1 0 0]
S3 = [0 0 1 0 0] H^T = [0 0 1 0]
S4 = [0 0 0 1 0] H^T = [0 0 0 1]
S5 = [0 0 0 0 1] H^T = [1 1 1 1]
There are ten double error patterns, given by [1 1 0 0 0], [1 0 1 0 0], [1 0 0 1 0], [1 0 0 0 1], [0 1 1 0 0], [0 1 0 1 0], [0 1 0 0 1], [0 0 1 1 0], [0 0 1 0 1] and [0 0 0 1 1]. Again using [S] = [E][H^T]:
S6  = [1 1 0 0 0] H^T = [1 1 0 0]
S7  = [1 0 1 0 0] H^T = [1 0 1 0]
S8  = [1 0 0 1 0] H^T = [1 0 0 1]
S9  = [1 0 0 0 1] H^T = [0 1 1 1]
S10 = [0 1 1 0 0] H^T = [0 1 1 0]
S11 = [0 1 0 1 0] H^T = [0 1 0 1]
S12 = [0 1 0 0 1] H^T = [1 0 1 1]
S13 = [0 0 1 1 0] H^T = [0 0 1 1]
S14 = [0 0 1 0 1] H^T = [1 1 0 1]
S15 = [0 0 0 1 1] H^T = [1 1 1 0]
All fifteen syndromes are distinct and non-zero, so the (5, 1) repetition code can correct every single and double error pattern.
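The syndrome table above can be reproduced with a short sketch (not from the notes); H^T is the matrix reconstructed for this (5, 1) repetition code.

import itertools
import numpy as np

HT = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 1],
               [1, 1, 1, 1]])

patterns = []
for weight in (1, 2):                                # single and double errors
    for pos in itertools.combinations(range(5), weight):
        e = np.zeros(5, dtype=int)
        e[list(pos)] = 1
        patterns.append(e)

syndromes = [tuple(e @ HT % 2) for e in patterns]
for e, s in zip(patterns, syndromes):
    print(e, '->', s)

# 15 distinct non-zero syndromes: every single or double error is correctable
assert len(set(syndromes)) == 15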
STANDARD ARRAY
The standard array is used for the syndrome decoding of linear block codes. Let C1, C2, C3, ….. C_{2^k} denote the 2^k code vectors of an (n, k) linear block code, and let R denote the received vector.
The receiver has the task of partitioning the 2^n possible received vectors into 2^k disjoint subsets D1, D2, D3, …… D_{2^k} in such a way that Di corresponds to the code vector Ci, for 1 ≤ i ≤ 2^k.
The 2^k subsets designed here constitute the standard array of the linear block code. It is constructed as follows:
1. The 2^k code vectors are placed in a row with the all-zero code vector C1 as the left-most element.
2. From the remaining (2^n − 2^k) n-tuples, an error pattern E2 is chosen and placed under C1, and the second row is formed by adding E2 to each of the remaining code vectors in the first row. It is important that the error pattern chosen as the first element of a row has not previously appeared in the standard array.
3. Step 2 is repeated until all the possible error patterns have been accounted for.
The 2^{n-k} rows of the array represent the cosets of the code, and their first elements e2, e3, ….. e_{2^{n-k}} are called coset leaders. Using the standard array, the decoding procedure is as follows:
1. For a received vector r, compute the syndrome S = rH^T.
2. Within the coset characterized by the syndrome S, identify the coset leader (i.e. the error pattern with the largest probability of occurrence). Call this coset leader e0.
3. Compute the code vector C = r ⊕ e0 as the decoded version of the received vector r.
Thus we get the corrected output; in this way the standard array can be used for syndrome decoding.
PROBLEMS
5. Construct a standard array for the (6, 3) linear block code whose parity matrix P is
shown below.
P = [1 0 1
     0 1 1
     1 1 0]
Also decode and correct the error, if any, when the received vector is (a) 100100 and (b) 000011.
Ans:
We know that,
[x1 x2 x3 x4 x5 x6] = [m1 m2 m3] [1 0 0 1 0 1
                                  0 1 0 0 1 1
                                  0 0 1 1 1 0]
                    = [m1  m2  m3  (m1 ⊕ m3)  (m2 ⊕ m3)  (m1 ⊕ m2)]
For obtaining all the code vectors, we have to give values to m1, m2 and m3. The number of messages that can be constructed using 3 bits is 2^3 = 8, that is 000, 001, …, 111.
The code vectors are 000000, 001110, 010011, 011101, 100101, 101011, 110110 & 111000.
The standard array is now constructed using the step-by-step procedure described above; its first row contains the eight code vectors and the coset leaders are the correctable error patterns. The received vectors in (a) and (b) are decoded by identifying the coset (syndrome) they fall in and adding the corresponding coset leader, as in the sketch below.
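As a rough sketch of the construction (not taken from the notes, and assuming single-error patterns as the non-trivial coset leaders), the first row of the standard array and the syndrome/coset-leader table can be built and used to decode the two received vectors:

import itertools
import numpy as np

G = np.array([[1, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 1, 1],
              [0, 0, 1, 1, 1, 0]])
P = G[:, 3:]
H = np.hstack([P.T, np.eye(3, dtype=int)])
HT = H.T

# First row of the standard array: the eight code vectors
codewords = [tuple(np.array(m) @ G % 2)
             for m in itertools.product([0, 1], repeat=3)]
print([''.join(map(str, c)) for c in codewords])

# Coset leaders: the all-zero vector and the six single-error patterns
leaders = {tuple(np.zeros(3, dtype=int)): np.zeros(6, dtype=int)}
for i in range(6):
    e = np.zeros(6, dtype=int)
    e[i] = 1
    leaders[tuple(e @ HT % 2)] = e

def decode(r):
    r = np.array(r)
    s = tuple(r @ HT % 2)
    return (r + leaders[s]) % 2                 # add the coset leader

print(decode([1, 0, 0, 1, 0, 0]))               # (a) -> [1 0 0 1 0 1]
print(decode([0, 0, 0, 0, 1, 1]))               # (b) -> [0 1 0 0 1 1]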
PERFECT CODES
A code for which
t = (dmin − 1) / 2
where t is the number of errors that can be corrected and dmin is the minimum distance, and for which every syndrome corresponds to exactly one correctable error pattern (i.e. the Hamming bound discussed later is met with equality), is called a perfect code. The Hamming codes, which have the parameters n = 2^q − 1, dmin = 3, t = 1, are examples of perfect codes.
HAMMING CODES
Hamming codes are defined as (n, k) linear block codes. These codes satisfy the
following conditions
1. The number of check bits, q ≥ 3.
2. Block length, n = 2^q − 1.
3. Number of message bits, k = n − q.
4. Minimum distance, dmin = 3.
Since dmin = 3, a Hamming code can detect up to two errors and correct a single error.
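As a small illustration (not from the notes), the Hamming-code parameters for a few values of q follow directly from the conditions above:

for q in (3, 4, 5):
    n = 2 ** q - 1
    k = n - q
    print(f"q = {q}: ({n}, {k}) Hamming code, code rate k/n = {k}/{n} = {k / n:.3f}")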
PROBLEMS
Ans:
3. Since it is a Hamming code, it can detect up to 2 errors and correct 1 error.
4. To obtain all the code vectors, we first have to find the check bits. We know that
[c1 c2 …… cq] = [m1 m2 …… mk] [P11 P12 ….. P1q
                                P21 P22 ….. P2q
                                 :    :        :
                                Pk1 Pk2 ….. Pkq]
On substituting, we get
[c1 c2 c3] = [m1 m2 m3 m4] [1 1 1
                            1 1 0
                            1 0 1
                            0 1 1]
On multiplying we get
c1 = m1 ⊕ m2 ⊕ m3
c2 = m1 ⊕ m2 ⊕ m4
c3 = m1 ⊕ m3 ⊕ m4
For obtaining all the code vectors, we have to give values to m1, m2, m3 and m4. The number of messages that can be constructed using 4 bits is 2^4 = 16, that is 0000, 0001, …, 1111 (see the sketch after the encoder description below).
5. Encoder
The switch S is first connected to the message register and all the message bits are transmitted. The switch is then connected to the check-bit register and the check bits are transmitted. This forms the block of 7 bits.
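A minimal sketch (not from the notes) that generates all 16 code vectors of this systematic (7, 4) Hamming code from the check-bit equations derived in step 4:

import itertools

def encode(m1, m2, m3, m4):
    c1 = m1 ^ m2 ^ m3
    c2 = m1 ^ m2 ^ m4
    c3 = m1 ^ m3 ^ m4
    return (m1, m2, m3, m4, c1, c2, c3)

for m in itertools.product([0, 1], repeat=4):
    print(''.join(map(str, m)), '->', ''.join(map(str, encode(*m))))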
It is the same as the syndrome decoding of linear block codes explained earlier, and the error correction using the syndrome vector is also the same. The two main properties of the syndrome vector are:
1. It is a q-bit vector.
2. There are 2^q − 1 non-zero syndromes, one for each correctable error pattern.
PROBLEMS
Ans:
Since q = 3, the syndrome S is a 3-bit vector, and there are 2^3 − 1 = 7 non-zero syndromes, one for each single-error pattern. We know that S = EH^T.
3. The transpose of this matrix gives the parity check matrix H for the code.
PROBLEMS
8. Obtain the (7, 4) non-systematic Hamming code. Also find the syndrome for a one-bit error.
Ans:
Step 1: Write the bit positions 1 to 7 in binary form:
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
Step 2: Writing each of these binary numbers with its least significant bit first gives the rows of H^T:
H^T = [1 0 0
       0 1 0
       1 1 0
       0 0 1
       1 0 1
       0 1 1
       1 1 1]
Step 3: Transpose.
H = [1 0 1 0 1 0 1
     0 1 1 0 0 1 1
     0 0 0 1 1 1 1]
We know that XH^T = 0, where X is the transmitted code vector. If the message bits are m1, m2, m3 and m4, the parity bits are placed in the 2^i positions, i.e. 2^0 = 1st, 2^1 = 2nd and 2^2 = 4th positions. Let the parity bits be p1, p2 and p3.
Therefore,
XH^T = [p1 p2 m1 p3 m2 m3 m4] [1 0 0
                               0 1 0
                               1 1 0
                               0 0 1
                               1 0 1
                               0 1 1
                               1 1 1] = 0
and we get
p1 ⊕ m1 ⊕ m2 ⊕ m4 = 0
p2 ⊕ m1 ⊕ m3 ⊕ m4 = 0
p3 ⊕ m2 ⊕ m3 ⊕ m4 = 0
Since the modulo-2 sum of any bit with itself is zero, adding the message-bit terms to both sides gives
p1 = m1 ⊕ m2 ⊕ m4
p2 = m1 ⊕ m3 ⊕ m4
p3 = m2 ⊕ m3 ⊕ m4
Thus we can construct code vectors corresponding to the messages 0000 to 1111
A special feature of the non-systematic Hamming code is that the syndrome directly gives the position of the bit in error. If there is an error, the syndrome is S = YH^T, where Y is the received vector; we also know that S = EH^T.
For each single-bit error pattern E, the syndrome S = EH^T is simply the corresponding row of H^T:
E = [1 0 0 0 0 0 0]  →  S = [1 0 0]  (bit 1 in error)
E = [0 1 0 0 0 0 0]  →  S = [0 1 0]  (bit 2 in error)
E = [0 0 1 0 0 0 0]  →  S = [1 1 0]  (bit 3 in error)
E = [0 0 0 1 0 0 0]  →  S = [0 0 1]  (bit 4 in error)
E = [0 0 0 0 1 0 0]  →  S = [1 0 1]  (bit 5 in error)
E = [0 0 0 0 0 1 0]  →  S = [0 1 1]  (bit 6 in error)
E = [0 0 0 0 0 0 1]  →  S = [1 1 1]  (bit 7 in error)
Read with the least significant bit first, each syndrome is the binary representation of the position of the bit in error.
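A short sketch (not from the notes) of this decoding rule; the code vector used below is a hypothetical example constructed from the parity equations above:

import numpy as np

H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

def correct(y):
    y = np.array(y)
    s = (H @ y) % 2                              # syndrome [s1, s2, s3]
    pos = int(s[0] + 2 * s[1] + 4 * s[2])        # position of the bit in error (0 = no error)
    if pos:
        y[pos - 1] ^= 1
    return y, pos

x = np.array([0, 1, 1, 0, 0, 1, 1])              # hypothetical code vector (p1 p2 m1 p3 m2 m3 m4 for m = 1011)
y = x.copy(); y[4] ^= 1                          # inject an error in bit 5
print(correct(y))                                # -> (array([0, 1, 1, 0, 0, 1, 1]), 5)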
CYCLIC CODES
Cyclic codes are linear block codes that satisfy two properties:
1. Linearity property: the modulo-2 sum of any two codewords is again a codeword.
2. Cyclic property: any cyclic (lateral) shift of a codeword produces another codeword.
Let X be the n-bit code vector X = (x_{n-1}, x_{n-2}, ………, x1, x0). On shifting we get X' = (x_{n-2}, x_{n-3}, ………, x1, x0, x_{n-1}).
In polynomial form the code vector is written as
X(p) = x_{n-1} p^{n-1} + x_{n-2} p^{n-2} + ……… + x1 p + x0
where p is an arbitrary variable and the power of p represents the position of the codeword bit; p^{n-1} represents the MSB and p^0 the LSB. Then we get
p X(p) = X'(p) + x_{n-1} (p^n + 1) …………………… (1)
The code polynomial is obtained from the message polynomial M(p) and the generator polynomial G(p) as
X(p) = M(p) G(p) …………………… (2)
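Equation (1) can be checked numerically; the sketch below (not from the notes) represents X(p) as an integer bit pattern and reduces p·X(p) modulo p^n + 1, which reproduces the cyclic shift:

def poly_from_bits(bits):
    """bits = (x_{n-1}, ..., x_0): pack the coefficients into an integer."""
    v = 0
    for b in bits:
        v = (v << 1) | b
    return v

n = 7
X = (1, 0, 1, 1, 0, 0, 0)                        # x_{n-1} ... x_0
x_poly = poly_from_bits(X)

shifted = x_poly << 1                            # multiply X(p) by p
if shifted & (1 << n):                           # reduce modulo p^n + 1
    shifted ^= (1 << n) | 1
print(format(shifted, f'0{n}b'))                 # -> 0110001, i.e. X' = (x_{n-2}, ..., x_0, x_{n-1})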
PROBLEMS
9. The generator polynomial of a (7, 4) cyclic code is given as G (p) = p 3 + p + 1. Obtain all
the code vectors for the code in non-systematic and systematic cyclic codes?
Ans:
For the non-systematic code, the code polynomial is X(p) = M(p) G(p). For example, for the message 1101, M(p) = p^3 + p^2 + 1 and
X(p) = (p^3 + p^2 + 1)(p^3 + p + 1) = p^6 + p^5 + p^4 + p^3 + p^2 + p + 1,   i.e. X = [1 1 1 1 1 1 1].
The remaining non-systematic code vectors are obtained in the same way for the other messages.
For the systematic code, the code vector is X = [m1 m2 m3 m4 c1 c2 c3], where the check bits are given by the remainder
C(p) = remainder [ p^3 M(p) / G(p) ]
For example, for the message 0100, M(p) = p^2 and
C(p) = remainder [ p^5 / (p^3 + p + 1) ] = p^2 + p + 1,  so C = [1 1 1] and X = [0 1 0 0 1 1 1].
Carrying out the same division for every message gives the complete systematic code:
Message   C(p)           C      Code vector X
0000      0              000    0000000
0001      p + 1          011    0001011
0010      p^2 + p        110    0010110
0011      p^2 + 1        101    0011101
0100      p^2 + p + 1    111    0100111
0101      p^2            100    0101100
0110      1              001    0110001
0111      p              010    0111010
1000      p^2 + 1        101    1000101
1001      p^2 + p        110    1001110
1010      p + 1          011    1010011
1011      0              000    1011000
1100      p              010    1100010
1101      1              001    1101001
1110      p^2            100    1110100
1111      p^2 + p + 1    111    1111111
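The check-bit computation C(p) = remainder[p^3 M(p)/G(p)] can be sketched with GF(2) polynomial division as follows (a minimal illustration, not from the notes):

def gf2_remainder(dividend, divisor):
    """Polynomials packed as integers (MSB = highest power); returns dividend mod divisor."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend

G = 0b1011                                       # G(p) = p^3 + p + 1

def encode(msg_bits):                            # msg_bits = (m1, m2, m3, m4)
    m = 0
    for b in msg_bits:
        m = (m << 1) | b
    c = gf2_remainder(m << 3, G)                 # remainder of p^3 M(p) / G(p)
    return list(msg_bits) + [(c >> i) & 1 for i in (2, 1, 0)]

print(encode((0, 1, 0, 0)))                      # -> [0, 1, 0, 0, 1, 1, 1]
print(encode((1, 1, 1, 1)))                      # -> [1, 1, 1, 1, 1, 1, 1]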
The generator matrix of a cyclic code has „k‟ rows and „n‟ columns. The generator
polynomial of cyclic codes can be expressed as
G(p) = p^q + g_{q-1} p^{q-1} + ………. + g_1 p + 1
To find the generator matrix, multiply both sides by p^i, where i = (k-1), (k-2), ….. 2, 1, 0. Therefore, the above equation becomes
p^i G(p) = p^{i+q} + g_{q-1} p^{i+q-1} + ………. + g_1 p^{i+1} + p^i
The coefficients of these k polynomials, written as n-bit rows, form the k rows of the generator matrix.
Parity check matrix, H = [P^T : I_q]
PROBLEMS
10. Find the generator matrix corresponding to G (p) = p3 + p2 + 1 for a (7, 4) cyclic code.
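As a sketch of the procedure described above (the matrix printed below follows from the rows p^i G(p) for G(p) = p^3 + p^2 + 1; it is derived here for illustration, not copied from the notes):

n, k = 7, 4
g = [1, 1, 0, 1]                     # coefficients of G(p) = p^3 + p^2 + 0p + 1, MSB first

rows = []
for i in range(k - 1, -1, -1):       # i = 3, 2, 1, 0
    row = [0] * n
    start = n - len(g) - i           # p^i G(p): shift the coefficients i places toward the MSB end
    for j, coeff in enumerate(g):
        row[start + j] = coeff
    rows.append(row)

for r in rows:
    print(r)
# [1, 1, 0, 1, 0, 0, 0]
# [0, 1, 1, 0, 1, 0, 0]
# [0, 0, 1, 1, 0, 1, 0]
# [0, 0, 0, 1, 1, 0, 1]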
Operation:
The feedback switch is first closed and the output switch is connected to the message-bit input. Initially all the shift registers are set to zero. While the switch is closed, the message bits are shifted to the transmitter and also into the shift registers. After the k message bits have been shifted, the registers contain the q check bits. The feedback switch is then opened and the output switch is connected to the check-bit position, so that all the check bits are transmitted.
In cyclic codes, errors may occur during transmission, and syndrome decoding is used to correct these errors. Let X be the transmitted code vector, Y the received code vector and E the error vector, so that E = X ⊕ Y and, equivalently, the received code vector is Y = X ⊕ E.
In polynomial form, Y(p) = X(p) + E(p).
We know that X(p) = M(p) G(p), where M(p) is the message vector polynomial and G(p) is the generator polynomial.
Dividing by G(p),
Y(p)/G(p) = M(p) + E(p)/G(p)
Since M(p) G(p) leaves no remainder on division by G(p), the remainder obtained on dividing Y(p) by G(p) depends only on the error polynomial E(p); this remainder is the syndrome:
Syndrome, S(p) = remainder [ Y(p) / G(p) ] = remainder [ E(p) / G(p) ]
SYNDROME CALCULATOR
Here the syndrome calculation is done with the help of shift registers. Initially the contents of all the shift registers are set to zero.
With gate 2 ON and gate 1 OFF, the received code vector is entered into the shift register. After the entire received vector has been shifted in, the contents of the registers form the syndrome, which can be shifted out of the register by turning gate 1 ON and gate 2 OFF.
Once the syndrome is calculated, the error pattern corresponding to that syndrome is determined using combinational logic circuits.
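A minimal sketch (not from the notes) of the syndrome computation S(p) = remainder[Y(p)/G(p)], assuming the (7, 4) cyclic code with G(p) = p^3 + p + 1 from the earlier problem:

def gf2_remainder(dividend, divisor):
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend

def syndrome(y_bits, g=0b1011):                  # g = p^3 + p + 1
    y = 0
    for b in y_bits:
        y = (y << 1) | b
    return gf2_remainder(y, g)

print(syndrome([0, 1, 0, 0, 1, 1, 1]))           # valid codeword -> 0
print(syndrome([0, 1, 0, 1, 1, 1, 1]))           # one bit flipped -> 3 (non-zero syndrome)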
ADVANTAGES:
1. The error correcting and decoding methods of the cyclic codes are simpler & easy to
implement.
2. The encoders and decoders for cyclic codes are simpler compared to non cyclic codes.
Disadvantages:
BCH CODES
These are one of the most important and powerful classes of error correcting cyclic codes. For any positive integer m ≥ 3, there exists a BCH code with the following parameters:
Block length, n = 2^m − 1
Minimum distance, dmin ≥ 2t + 1, where t is the number of errors that can be corrected.
Advantage:
The first step in decoding is to compute the syndrome from the received vector r(X). For decoding a t-error-correcting primitive BCH code, the syndrome is the 2t-tuple
S = (S1, S2, ………, S2t) = r H^T
where the parity check matrix is
H = [1  α      α^2        α^3        ………  α^(n-1)
     1  α^2    (α^2)^2    (α^2)^3    ………  (α^2)^(n-1)
     1  α^3    (α^3)^2    (α^3)^3    ………  (α^3)^(n-1)
     :  :      :          :                :
     1  α^2t   (α^2t)^2   (α^2t)^3   ………  (α^2t)^(n-1)]
From the equations of S and H, we find the ith component of the syndrome as
Si = r(α^i) = r0 + r1 α^i + r2 α^2i + …………… + r_{n-1} α^{(n-1)i},   for 1 ≤ i ≤ 2t.
Note that the syndrome components are elements of the field GF(2^m). These components can be computed from r(X) as follows: divide r(X) by the minimal polynomial φi(X) of α^i,
r(X) = ai(X) φi(X) + bi(X)
where bi(X) is the remainder, with degree less than that of φi(X). Since φi(α^i) = 0, we have
Si = r(α^i) = bi(α^i)
Thus, the syndrome component Si is obtained by evaluating bi(X) at X = α^i.
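A small sketch (not from the notes) of computing syndrome components Si = r(α^i) directly, assuming GF(2^3) with primitive polynomial p^3 + p + 1 and α represented by the bit pattern 010; the received vectors are hypothetical examples, and the minimal-polynomial shortcut via bi(X) is not used here:

PRIM = 0b1011                                    # primitive polynomial p^3 + p + 1
M = 3

def gf_mul(a, b):
    """Multiply two GF(2^3) elements: shift-and-add with modular reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & (1 << M):
            a ^= PRIM
        b >>= 1
    return result

def gf_pow(a, e):
    out = 1
    for _ in range(e):
        out = gf_mul(out, a)
    return out

ALPHA = 0b010                                    # the primitive element alpha

def syndrome_component(r_bits, i):
    """S_i = r0 + r1*alpha^i + ... + r_{n-1}*alpha^{(n-1)i}; r_bits = (r0, ..., r_{n-1})."""
    s = 0
    for j, rj in enumerate(r_bits):
        if rj:
            s ^= gf_pow(ALPHA, i * j)
    return s

r_ok = (1, 0, 0, 1, 0, 1, 1)                     # 1 + X^3 + X^5 + X^6, divisible by X^3 + X + 1
r_err = (1, 0, 0, 1, 1, 1, 1)                    # same vector with bit r4 flipped
print([syndrome_component(r_ok, i) for i in (1, 2)])    # -> [0, 0]
print([syndrome_component(r_err, i) for i in (1, 2)])   # -> [6, 2]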
REED SOLOMON CODES
Instead of individual information bits, Reed Solomon (RS) codes use 'symbols', where a symbol is a group of m bits.
In other block codes n = k + q, where n is the total number of individual bits in a code word, k is the number of message bits and q is the number of check bits. In an RS code n = k + q as well, but here n is the total number of symbols in a code word, k is the number of information symbols and q is the number of parity or check symbols. The parameters are
Block length, n = 2^m − 1 symbols
Number of check symbols, q = n − k = 2t, where t is the number of symbol errors that can be corrected.
There are occasions when the average bit error rate is small but the error correcting codes are not effective in correcting the errors, because the errors are clustered: in one region a large percentage of the bits are in error. This is called a burst error.
Examples of sources of such bursts are static, such as that produced in radio transmission by lightning.
The primary techniques for overcoming error bursts are 'Block Interleaving' and 'Convolutional Interleaving'.
BLOCK INTERLEAVING
An effective method for dealing with burst-error channels is to interleave the coded data in such a way that the bursty channel is transformed into a channel having independent errors. A burst of errors of length b is defined as a sequence of b bits, the first and last of which are in error.
An (n, k) systematic code, which has (n − k) = q check bits, can correct bursts of length
b ≤ (n − k) / 2
The encoded data is reordered by the interleaver and transmitted over the channel. At the receiver, after demodulation, the deinterleaver puts the data back into the proper sequence and passes it to the decoder. As a result of interleaving/deinterleaving, error bursts are spread out in time so that the errors within a code word appear to be independent.
The block interleaver formats the encoded data in a rectangular array of m rows and n columns. Each row of the array constitutes a code word of length n. The bits are written into the rectangular array row-wise and read out column-wise for transmission over the channel.
At the receiver, the deinterleaver stores the received data in the same rectangular array format, but it is read out row-wise, one codeword at a time. As a result, a burst of errors is broken up into nearly independent errors.
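A minimal sketch (not from the notes) of block interleaving with m = 3 rows and n = 4 columns on toy data:

def interleave(bits, m, n):
    rows = [bits[i * n:(i + 1) * n] for i in range(m)]       # one codeword per row
    return [rows[i][j] for j in range(n) for i in range(m)]  # read out column-wise

def deinterleave(bits, m, n):
    cols = [bits[j * m:(j + 1) * m] for j in range(n)]
    return [cols[j][i] for i in range(m) for j in range(n)]  # read out row-wise

data = list(range(12))                   # three 4-symbol codewords: 0..3, 4..7, 8..11
tx = interleave(data, 3, 4)
print(tx)                                # -> [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]
print(deinterleave(tx, 3, 4) == data)    # -> True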
CONVOLUTIONAL INTERLEAVING
Convolutional interleavers are better matched for use with convolutional codes; they are also called periodic interleavers.
The successive symbols of an (n, k) code word are delayed by {0, b, 2b, ……….., (n−1)b} symbol units respectively. As a result, the symbols of one code word are placed at a distance of b symbol units in the channel stream, and bursts of length b separated by a guard space of about (n−1) × b units affect only one symbol per code word. In the receiver, the codewords are reassembled through complementary delay units and decoded to correct the single errors so generated.
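A rough sketch (not from the notes) of the delay-line structure of a convolutional interleaver; here branch i is a FIFO holding i·b symbols, so the delays are counted in visits to each branch rather than in absolute symbol units:

from collections import deque

class ConvInterleaver:
    def __init__(self, n, b):
        # branch i is a FIFO pre-filled with i*b zeros (the initial register contents)
        self.lines = [deque([0] * (i * b)) for i in range(n)]
        self.branch = 0

    def push(self, symbol):
        line = self.lines[self.branch]
        line.append(symbol)                       # symbol enters the delay line
        out = line.popleft()                      # oldest symbol on this branch leaves
        self.branch = (self.branch + 1) % len(self.lines)
        return out

il = ConvInterleaver(n=3, b=2)
print([il.push(s) for s in range(1, 13)])
# -> [1, 0, 0, 4, 0, 0, 7, 2, 0, 10, 5, 0]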
HAMMING BOUND
We know that for an (n, k) block code there are 2^q − 1 distinct non-zero syndromes. There are nC1 single error patterns, nC2 double error patterns, nC3 triple error patterns, and so on. For the code to correct all error patterns of up to t errors, every such pattern (together with the all-zero pattern) must have its own syndrome. That is,
2^q ≥ Σ (i = 0 to t) nCi
We know that n − k = q, so
2^(n−k) ≥ Σ (i = 0 to t) nCi
Taking logarithms to the base 2 and dividing by n,
k/n ≤ 1 − (1/n) log2 [ Σ (i = 0 to t) nCi ]   …………………… (1)
Since the code rate r = k/n, equation (1) becomes
1 − r ≥ (1/n) log2 [ Σ (i = 0 to t) nCi ]
This relation between the code rate r and t is called the Hamming bound, where t is the number of errors that can be corrected.
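A small check of the bound (not from the notes) for a few of the codes discussed in this module:

from math import comb

def hamming_bound_ok(n, k, t):
    patterns = sum(comb(n, i) for i in range(t + 1))
    return 2 ** (n - k) >= patterns, patterns

print(hamming_bound_ok(7, 4, 1))    # (7, 4) Hamming code: (True, 8), met with equality -> perfect code
print(hamming_bound_ok(5, 1, 2))    # (5, 1) repetition code: (True, 16), also met with equality
print(hamming_bound_ok(6, 3, 2))    # (6, 3) code with t = 2: (False, 22), bound violated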