Information Theory and Coding NOTES

Information Theory & Coding Dept of ECE


MODULE I (QUANTITATIVE APPROACH)

Introduction to Information Theory: Concept of amount of information, units -
entropy, marginal, conditional and joint entropies - relation among entropies -
mutual information, information rate.

Source coding: Instantaneous codes - construction of instantaneous codes - Kraft's
inequality, coding efficiency and redundancy, Noiseless coding theorem -
construction of basic source codes - Shannon-Fano algorithm, Huffman coding.

Channel capacity - redundancy and efficiency of a channel, binary symmetric
channel (BSC), binary erasure channel (BEC) - capacity of band-limited Gaussian
channels, Shannon-Hartley theorem - bandwidth-SNR trade off - capacity of a
channel of infinite bandwidth, Shannon's limit.


INTRODUCTION

The purpose of a communication system is to carry information from one place to
another over a communication channel. The root word of "information" is "inform": information
gives structure to creative and interpretative processes. It is an organization of knowledge, facts,
ideas etc. that can be communicated and understood.

Information may take the form of words, numbers, images, audio, video, feelings etc. The form of
information depends on various factors such as (1) what the information is, (2) how it is to be
communicated and (3) to whom it is to be communicated. On this basis, information
theory deals with the mathematical modelling and analysis of communication channels.

CONCEPT OF INFORMATION

The amount of information I (the measure of information) received at the destination depends
on the probability of occurrence P of that event. A message or event with a higher probability of
occurrence contains less information, and one with a lower probability of occurrence contains
more information.

That is,  I ∝ 1/P

CONCEPT OF INFORMATION THEORY

Consider a source that produces two messages, say A and B. Let P(A) represent the
probability of occurrence of message A and P(B) the probability of occurrence of message
B. Then the measure of information Φ must satisfy:

i. I(A) = Φ{P(A)}, I(B) = Φ{P(B)}

ii. Information cannot be negative: I(A) ≥ 0; Φ{P(A)} ≥ 0 for 0 ≤ P(A) ≤ 1.

iii. If we are sure about an event, the amount of information is zero (no
information is conveyed):

lim_{P(A) → 1} Φ{P(A)} = 0

iv. A higher probability of occurrence of an event corresponds to less information,
and a lower probability of occurrence corresponds to more information:

Φ{P(A)} > Φ{P(B)} if and only if P(A) < P(B)

v. If a set of messages originating from the same source are independent, then

Φ{P(A), P(B)} = Φ{P(A)} + Φ{P(B)}

The only function that satisfies these five conditions is the logarithmic function, so we
adopt the logarithmic function to represent the measure (amount) of information.

LOGARITHMIC MEASURE OF INFORMATION

Consider a discrete source which produces a set of messages m1, m2, ..., mN with
probabilities of occurrence p1, p2, ..., pN, where p1 + p2 + ... + pN = 1.

Let mk represent an arbitrary message, Pk its probability of occurrence and Ik the
amount of information obtained by transmitting message mk.

According to the concept of information theory, Ik must satisfy the following conditions:

1. Ik ≥ 0 (information cannot be negative)

2. Ik → 0 as Pk → 1.

3. A message with a higher probability of occurrence contains less information
than a message with a lower probability of occurrence.

I(mk) = Ik = log_b(1/Pk)             ...(1)

By standard convention the amount of information is measured with base b = 2.


UNITS OF INFORMATION

The unit of information depends on the base of the logarithm used: base 2 gives bits, base e
gives nats and base 10 gives decits (Hartleys).

CONVERSION OF UNITS OF INFORMATION

1 nat = log2 e = 1.443 bits;  1 decit = log2 10 = 3.322 bits.

Note:

i. Extremely likely message

For an extremely likely message the amount of information is zero, i.e. Ik = 0 when Pk = 1.

ii. Extremely unlikely message

For an extremely unlikely message the amount of information is infinite, i.e. Ik = ∞ when Pk = 0.

iii. Equally likely messages

For equally likely messages the probabilities are the same (P1 = P2 = ... = PN) with
P1 + P2 + ... + PN = 1, so each Pk = 1/N and Ik = log2 N.

PROBLEMS

1. A source produces one of four possible messages during each interval with probabilities
P1 = 1/2, P2 = 1/4, P3 = P4 = 1/8. Obtain the information content of each message.

Ans:
I_K = log2(1/P_K)

I_1 = log2(1/P_1) = log2(2) = 1 bit
I_2 = log2(1/P_2) = log2(4) = 2 bits
I_3 = I_4 = log2(1/P_3) = log2(8) = 3 bits

2. If there are 'M' equally likely independent messages, prove that the information
carried by each message is I = N bits, where M = 2^N and N is an integer.

Ans: P_K = 1/M

I_K = log2(1/P_K) = log2(M) = log2(2^N) = N bits

3. Prove that if the receiver knows the message being transmitted, the amount of
information carried will be zero.

Ans: If the receiver already knows the message, then only one message can be transmitted, so its
probability of occurrence is 1, i.e. P_K = 1.

I_K = log2(1/P_K) = log2(1/1) = log2(1) = 0 bits

4. If I(m1) is the information carried by message m1 and I(m2) is the information
carried by message m2, prove that the amount of information carried jointly by m1 and
m2 is I(m1, m2) = I(m1) + I(m2).

Ans: Let the probability of occurrence of message m1 be P1 and that of message m2 be P2.
Since the messages are independent, P(m1, m2) = P1 P2, and

I(m1) = log2(1/P1);  I(m2) = log2(1/P2);  I(m1, m2) = log2(1/(P1 P2))

I(m1, m2) is the total amount of information carried by m1 and m2:

I(m1, m2) = log2(1/(P1 P2)) = log2(1/P1) + log2(1/P2) = I(m1) + I(m2)

5. A message occurs with a probability of 0.8. Determine the information associated with
the message in bits, nats and decits.

Ans: P_K = 0.8

i. Bits:    I_K = log2(1/P_K) = log2(1/0.8) = 0.322 bits
ii. Nats:   I_K = ln(1/P_K) = ln(1/0.8) = 0.223 nats
iii. Decits: I_K = log10(1/P_K) = log10(1/0.8) = 0.0969 decits
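
The following Python sketch (an illustration added for clarity, not part of the original worked solutions) evaluates I = log_b(1/P) in bits, nats and decits and reproduces the numbers of Problems 1 and 5:

import math

def information(p, base=2):
    # Information content of a message with probability p, in units set by the base.
    return math.log(1.0 / p, base)

# Problem 1: P = 1/2, 1/4, 1/8, 1/8
print([information(p) for p in (0.5, 0.25, 0.125, 0.125)])   # [1.0, 2.0, 3.0, 3.0] bits

# Problem 5: P = 0.8 in bits, nats and decits
print(information(0.8, 2), information(0.8, math.e), information(0.8, 10))
# approximately 0.322 bits, 0.223 nats, 0.0969 decits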

ENTROPY

Entropy is the measure of the average information, or average uncertainty, per source symbol
for the set of messages transmitted by a discrete memory-less (zero-memory) source. A
discrete memory-less source is one in which the emission of a symbol does not depend on any
previously emitted symbols; that is, the symbols are picked at random, independently of the past.

Entropy is represented by H. The unit of entropy is bits/message or bits/symbol.

Let a discrete information source emit M possible messages m1, m2, ..., mM with
probabilities of occurrence P1, P2, ..., PM.

The amount of information due to m1 is

I_1 = log2(1/P_1)

Assume that over a long period a sequence of L messages has been transmitted, with L
very large, so that the sequence contains about P1·L occurrences of m1, P2·L occurrences
of m2, and so on.

Since m1 occurs P1·L times, the total information due to m1 is

I_1(total) = P1 · L · log2(1/P1)             ...(A)

Similarly, the total information due to message m2 is

I_2(total) = P2 · L · log2(1/P2)             ...(B)

The total information due to all messages is

I_total = I_1(total) + I_2(total) + I_3(total) + ...
        = P1·L·log2(1/P1) + P2·L·log2(1/P2) + ... + PM·L·log2(1/PM)

The average information per message interval, represented by the symbol H, is

H = Total information / Number of messages = I_total / L
  = P1·log2(1/P1) + P2·log2(1/P2) + ... + PM·log2(1/PM)

Entropy, H = Σ_{k=1}^{M} P_k log2(1/P_k)             ...(2)
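
A short Python sketch of equation (2) (added as an illustration; the function name is arbitrary):

import math

def entropy(probs):
    # Average information in bits/symbol for a discrete memory-less source.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Example: the four-message source of Problem 1
print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits/symbol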

PROPERTIES OF ENTROPY

1. When Pk = 1, the corresponding term is 1 × log2 1 = 1 × 0 = 0.

2. When Pk = 0, the corresponding term Pk log2(1/Pk) → 0 in the limit.
That is, for extremely likely and extremely unlikely messages the contribution to the entropy is zero.

3. Entropy is maximum when all the messages are equally likely; then H = log2 M.
So we can conclude that 0 ≤ H ≤ log2 M.

PROOF

Consider a source with two messages having probabilities P and (1 - P).

Then the entropy of the source is

H = Σ_{k=1}^{M} P_k log2(1/P_k)

That is,  H = P log(1/P) + (1 - P) log(1/(1 - P))
            = -P log P - (1 - P) log(1 - P)

The value of P that maximizes the entropy is found by differentiating H with respect to P
and equating the derivative to zero, dH/dP = 0 for H_max.

dH/dP = -[ P·(1/P) + log P ] - [ (1 - P)·(-1/(1 - P)) + log(1 - P)·(-1) ]
      = -(1 + log P) + (1 + log(1 - P))
      = -log P + log(1 - P)

Setting dH/dP = 0 for H_max:

-log P + log(1 - P) = 0
log P = log(1 - P)

Taking antilogarithms on both sides, we get

P = 1 - P,  i.e.  2P = 1,  so  P = 1/2

Applying P = 1/2 in H = -P log P - (1 - P) log(1 - P), we get

H_max = 1 bit/message
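
A quick numerical check of this result (illustrative only), scanning the binary entropy over a grid of probabilities:

import math

def binary_entropy(p):
    # H(P) = -P log2 P - (1 - P) log2 (1 - P)
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

candidates = [i / 100 for i in range(1, 100)]
best = max(candidates, key=binary_entropy)
print(best, binary_entropy(best))   # 0.5, 1.0 bit/message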

PROBLEMS

6. Consider a source transmitting 6 symbols with probabilities 1/2, 1/4, 1/8, 1/16, 1/32, 1/32.
Find the entropy.

Ans:
H = Σ_{k=1}^{6} P_K log2(1/P_K)
  = (1/2)log2 2 + (1/4)log2 4 + (1/8)log2 8 + (1/16)log2 16 + (1/32)log2 32 + (1/32)log2 32
  = 1/2 + 2/4 + 3/8 + 4/16 + 5/32 + 5/32
  = 1/2 + 1/2 + 3/8 + 2/8 + 5/32 + 5/32 = 31/16
  = 1.94 bits/symbol


7. Let 'x' represent the outcome of a single roll of a fair die. Find the entropy.

Ans: P_K = 1/6 for each of the six outcomes.

H = Σ_{k=1}^{6} P_K log2(1/P_K) = 6 × (1/6) log2 6 = log2 6 = 2.58 bits/symbol

8. A source has 5 outputs denoted m1, m2, m3, m4 and m5 with probabilities 0.3,
0.25, 0.25, 0.15 and 0.05 respectively. Determine the maximum entropy and the actual
entropy of the source.

Ans:
i. For maximum entropy the messages must be equiprobable. Since there are 5
messages, P_K = 1/5.

H_max = Σ P_K log2(1/P_K) = 5 × (1/5) log2 5 = log2 5 = 2.32 bits/symbol

ii. Actual entropy:

H = 0.3 log2(1/0.3) + 0.25 log2(1/0.25) + 0.25 log2(1/0.25) + 0.15 log2(1/0.15) + 0.05 log2(1/0.05)
  = 0.3 × 1.737 + 0.25 × 2 + 0.25 × 2 + 0.15 × 2.737 + 0.05 × 4.32
  = 2.148 bits/symbol

9. A discrete memory-less source produces 5 symbols (R, S, T, U, V) with probabilities
1/3, 1/3, 1/6, 1/9, 1/18 respectively. Find the average amount of information obtained in the
messages RRSTU and STTUV.

Ans:
i. For RRSTU:

H = (1/3)log2 3 + (1/3)log2 3 + (1/3)log2 3 + (1/6)log2 6 + (1/9)log2 9
  = 0.528 + 0.528 + 0.528 + 0.431 + 0.352 = 2.367 bits/symbol

ii. For STTUV:

H = (1/3)log2 3 + (1/6)log2 6 + (1/6)log2 6 + (1/9)log2 9 + (1/18)log2 18
  = 0.528 + 0.431 + 0.431 + 0.352 + 0.232 = 1.974 bits/symbol

10. A source produces 8 symbols with equal probability. Find the entropy of the source.
Also determine the entropy if one of the symbols occurs with probability 1/2 while the
others occur with equal probability.

Ans:
i. For equal probabilities, P_K = 1/8.

H = Σ P_K log2(1/P_K) = 8 × (1/8) log2 8 = 3 bits/symbol

ii. Since the probability of m1 is 1/2, the remaining probability shared by the other
seven messages is 1/2. As these 7 messages are equally probable, each has probability 1/14.

H = (1/2) log2 2 + 7 × (1/14) log2 14 = 0.5 + 1.9 = 2.4 bits/symbol

11. A source produces 2 symbols with equal probability if unbiased. Due to
malfunctioning, the source produces combinations of symbols as follows: symbol 1
followed by symbol 2 with a probability of 0.8, and symbol 2 followed by symbol 1 with
a probability of 0.2. What is the effect on entropy?

Ans: If unbiased, P(X1) = 1/2 and P(X2) = 1/2.

H(X) = 2 × (1/2) log2 2 = 1 bit/symbol

Due to malfunctioning, P(X1X2) = 0.8 and P(X2X1) = 0.2:

H(X) = 0.8 log2(1/0.8) + 0.2 log2(1/0.2) = 0.258 + 0.464 = 0.722 bits/symbol

That is, the entropy is reduced to 72.2% of its original value.

JOINT ENTROPY

Joint entropy is the average information per pair of transmitted and received symbols in
a communication system. It is denoted H(X,Y), is also called the system entropy, and is the
entropy of a joint event.

Consider two sources S1 and S2 delivering symbols xi and yj, where there may be some
dependence between the x's and the y's. The joint entropy of sources S1 and S2 is then
H(X,Y).

Now take the case of a communication system. Here H(X,Y) is the average
uncertainty of the whole communication system.

Let the signals transmitted by the source be x1, x2, ..., xi and the signals received by the
receiver be y1, y2, ..., yj. Let there be N symbols at both the transmitter and the receiver. Let the
probabilities of the transmitted signals be P(x1), P(x2), ..., P(xi) and those of the received
signals be P(y1), P(y2), ..., P(yj).

Then the entropy of the receiver is

H(Y) = Σ_{j=1}^{N} P(y_j) log2(1/P(y_j)) = -Σ_{j=1}^{N} P(y_j) log2 P(y_j)

Similarly, the joint entropy is

H(X,Y) = -Σ_{i=1}^{N} Σ_{j=1}^{N} P(x_i, y_j) log2 P(x_i, y_j)             ...(1)

According to probability theory, if A and B are dependent events, then

P(AB) = P(A) P(B|A)             ...(2)
or
P(AB) = P(B) P(A|B)             ...(3)

Applying (2) in (1), we get

H(X,Y) = -Σ_i Σ_j P(x_i) P(y_j|x_i) log2 [ P(x_i) P(y_j|x_i) ]
       = -Σ_i Σ_j P(x_i) P(y_j|x_i) log2 P(x_i) - Σ_i Σ_j P(x_i) P(y_j|x_i) log2 P(y_j|x_i)
       = -Σ_i Σ_j P(x_i, y_j) log2 P(x_i) - Σ_i Σ_j P(x_i, y_j) log2 P(y_j|x_i)
       = H(X) + H(Y|X)

Similarly, by applying (3) in (1), we get

H(X,Y) = H(Y) + H(X|Y)

Thus
H(X,Y) = H(X) + H(Y|X)
H(X,Y) = H(Y) + H(X|Y)

Note:

On expanding -Σ_i Σ_j P(x_i) P(y_j|x_i) log2 P(x_i), we get

-Σ_i [ Σ_j P(y_j|x_i) ] P(x_i) log2 P(x_i)
  = -Σ_i P(x_i) log2 P(x_i) [ P(y_1|x_i) + P(y_2|x_i) + ... + P(y_N|x_i) ]

Since Σ_j P(y_j|x_i) = 1 for every x_i, this reduces to

  = -[ P(x_1) log2 P(x_1) + P(x_2) log2 P(x_2) + ... + P(x_N) log2 P(x_N) ]
  = -Σ_i P(x_i) log2 P(x_i) = H(X)

Similarly, we can prove that -Σ_i Σ_j P(x_i, y_j) log2 P(y_j|x_i) = H(Y|X).

CONDITIONAL ENTROPY

Conditional entropy is the entropy of one subsystem when the state of another
subsystem is known, where the two subsystems are statistically correlated. In a communication
system we usually denote the conditional entropies as H(X|Y) and H(Y|X).

H(Y|X) is the entropy of the received symbols when the transmitted symbol is known;
that is, it is the amount of uncertainty remaining about the channel output after the channel input
has been sent. H(X|Y) is the entropy of the source when the received symbol is known.

The conditional entropy of X given Y = y_j can be found from the basic definition of entropy:

H = Σ_{k=1}^{M} P_k log2(1/P_k)

Thus,

H(X | Y = y_j) = Σ_{i=1}^{M} P(x_i|y_j) log2 (1/P(x_i|y_j))

This quantity is a random variable that takes on the values H(X|Y = y_1), H(X|Y = y_2), ...
with probabilities P(y_1), P(y_2), ... respectively.

Thus the mean conditional entropy is given by

H(X|Y) = Σ_j H(X|Y = y_j) P(y_j)
       = Σ_i Σ_j P(x_i|y_j) P(y_j) log2 (1/P(x_i|y_j))
       = Σ_i Σ_j P(x_i, y_j) log2 (1/P(x_i|y_j))

That is,  H(X|Y) = -Σ_i Σ_j P(x_i, y_j) log2 P(x_i|y_j)

Similarly,  H(Y|X) = -Σ_i Σ_j P(x_i, y_j) log2 P(y_j|x_i)

MARGINAL ENTROPY

Marginal entropy is the entropy of the source. It is denoted by H(X).

RECEIVER ENTROPY

Receiver entropy is the entropy of the receiver. It is denoted by H(Y).

Note:

i. Equivocation
Equivocation means loss of information due to channel.

ii. Transition Matrix (Noise Matrix or Channel Matrix)

It is the matrix of conditional probabilities P(Y|X).


PROBLEMS

12. Consider a source with alphabet x1, x2, x3, x4 with probabilities P(X) = {1/2, 1/4, 1/8, 1/8}.
Find the entropy of the second-order extension.

Ans:
H(X) = Σ P(X) log2(1/P(X))
     = (1/2)log2 2 + (1/4)log2 4 + (1/8)log2 8 + (1/8)log2 8
     = 1/2 + 1/2 + 3/8 + 3/8 = 1.75 bits/symbol

For the second-order extension the 16 symbol pairs have probabilities equal to the products
of the individual symbol probabilities:

H(X²) = Σ P(Y) log2(1/P(Y))
      = (1/4)log2 4 + (1/8)log2 8 + (1/16)log2 16 + (1/16)log2 16
      + (1/8)log2 8 + (1/16)log2 16 + (1/32)log2 32 + (1/32)log2 32
      + (1/16)log2 16 + (1/32)log2 32 + (1/64)log2 64 + (1/64)log2 64
      + (1/16)log2 16 + (1/32)log2 32 + (1/64)log2 64 + (1/64)log2 64
      = 3.5 bits/symbol

From this, H(X²) = 2 H(X).

13. Consider a source with messages 0 and 1 with probabilities P(0) = 0.25 and P(1) = 0.75.
Find the entropy of the third-order extension of the source.

Ans:
H(X) = Σ P(X) log2(1/P(X)) = (1/4)log2 4 + (3/4)log2(4/3) = 0.5 + 0.311 = 0.811 bits/symbol

For the third-order extension, the eight 3-symbol blocks have probabilities 1/64, 3/64 (three
blocks), 9/64 (three blocks) and 27/64:

H(X³) = (1/64)log2 64 + 3 × (3/64)log2(64/3) + 3 × (9/64)log2(64/9) + (27/64)log2(64/27)
      = 0.094 + 3 × 0.207 + 3 × 0.398 + 0.525 = 2.434 bits/symbol

From this, H(X³) = 3 H(X).

So in general, H(Xⁿ) = n H(X).
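
A small Python check of this relation (illustrative only; block probabilities of a memory-less source are products of the individual symbol probabilities):

import math
from itertools import product

def entropy(probs):
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def extension_entropy(probs, n):
    # Entropy of the n-th extension: one probability per n-symbol block.
    blocks = [math.prod(block) for block in product(probs, repeat=n)]
    return entropy(blocks)

src = [0.25, 0.75]
print(entropy(src))                 # ~0.811 bits/symbol
print(extension_entropy(src, 3))    # ~2.434 = 3 x 0.811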

14. Two sources emit messages x1, x2, x3 and y1, y2, y3 with joint probabilities P(x,y) as
shown in the matrix below. Calculate H(X), H(Y), H(X|Y) and H(Y|X).

                y1     y2     y3
         x1 [ 3/40   1/40   1/40 ]
P(X,Y) = x2 [ 1/20   3/20   1/20 ]
         x3 [ 1/8    1/8    3/8  ]

Ans:
i. To find H(X)

P(x1) = 3/40 + 1/40 + 1/40 = 5/40 = 1/8
P(x2) = 1/20 + 3/20 + 1/20 = 5/20 = 1/4
P(x3) = 1/8 + 1/8 + 3/8 = 5/8

H(X) = Σ P(X) log2(1/P(X)) = (1/8)log2 8 + (1/4)log2 4 + (5/8)log2(8/5)
     = (1/8)(3) + (1/4)(2) + (5/8)(0.678) = 1.3 bits/symbol

ii. To find H(Y)

P(y1) = 3/40 + 1/20 + 1/8 = 0.25
P(y2) = 1/40 + 3/20 + 1/8 = 0.3
P(y3) = 1/40 + 1/20 + 3/8 = 0.45

H(Y) = Σ P(Y) log2(1/P(Y)) = 0.25 log2(1/0.25) + 0.3 log2(1/0.3) + 0.45 log2(1/0.45)
     = 0.5 + 0.52 + 0.52 = 1.54 bits/symbol

iii. To find H(X|Y)

We have P(X|Y) = P(X,Y)/P(Y); dividing each column of P(X,Y) by the corresponding P(yj):

                y1     y2     y3
         x1 [ 3/10   1/12   1/18 ]
P(X|Y) = x2 [ 1/5    1/2    1/9  ]
         x3 [ 1/2    5/12   5/6  ]

H(X|Y) = Σ_i Σ_j P(x_i, y_j) log2 (1/P(x_i|y_j))
       = (3/40)log2(10/3) + (1/40)log2 12 + (1/40)log2 18
       + (1/20)log2 5 + (3/20)log2 2 + (1/20)log2 9
       + (1/8)log2 2 + (1/8)log2(12/5) + (3/8)log2(6/5)
       = 0.13 + 0.09 + 0.10 + 0.12 + 0.15 + 0.16 + 0.13 + 0.16 + 0.10 = 1.14 bits/symbol

iv. To find H(Y|X)

We have P(Y|X) = P(X,Y)/P(X); dividing each row of P(X,Y) by the corresponding P(xi):

                y1    y2    y3
         x1 [ 3/5   1/5   1/5 ]
P(Y|X) = x2 [ 1/5   3/5   1/5 ]
         x3 [ 1/5   1/5   3/5 ]

H(Y|X) = Σ_i Σ_j P(x_i, y_j) log2 (1/P(y_j|x_i))
       = (3/40)log2(5/3) + (1/40)log2 5 + (1/40)log2 5
       + (1/20)log2 5 + (3/20)log2(5/3) + (1/20)log2 5
       + (1/8)log2 5 + (1/8)log2 5 + (3/8)log2(5/3)
       = 0.06 + 0.06 + 0.06 + 0.12 + 0.11 + 0.12 + 0.29 + 0.29 + 0.28 = 1.39 bits/symbol

15. Consider a channel with two inputs x1, x2 and three outputs y1, y2, y3, whose
transition matrix is

                y1    y2    y3
P(Y|X) = x1 [ 3/4   1/4   0   ]
         x2 [ 0     1/2   1/2 ]

Calculate H(X), H(Y), H(X|Y) and H(Y|X), given P(x1) = 1/2 and P(x2) = 1/2.

Ans:
We have P(X,Y) = P(X) P(Y|X):

P(x1,y1) = (1/2)(3/4) = 3/8;  P(x1,y2) = (1/2)(1/4) = 1/8;  P(x1,y3) = 0
P(x2,y1) = 0;  P(x2,y2) = (1/2)(1/2) = 1/4;  P(x2,y3) = (1/2)(1/2) = 1/4

                y1    y2    y3
P(X,Y) = x1 [ 3/8   1/8   0   ]
         x2 [ 0     1/4   1/4 ]

i. To find H(X)

P(x1) = 3/8 + 1/8 = 1/2;  P(x2) = 1/4 + 1/4 = 1/2

H(X) = Σ P(X) log2(1/P(X)) = (1/2)log2 2 + (1/2)log2 2 = 1 bit/symbol

ii. To find H(Y)

P(y1) = 3/8;  P(y2) = 1/8 + 1/4 = 3/8;  P(y3) = 1/4

H(Y) = (3/8)log2(8/3) + (3/8)log2(8/3) + (1/4)log2 4
     = 0.53 + 0.53 + 0.5 = 1.56 bits/symbol

iii. To find H(X|Y)

P(X|Y) = P(X,Y)/P(Y); dividing each column by the corresponding P(yj):

                y1   y2    y3
P(X|Y) = x1 [ 1    1/3   0 ]
         x2 [ 0    2/3   1 ]

H(X|Y) = Σ P(X,Y) log2(1/P(X|Y))
       = (3/8)log2 1 + (1/8)log2 3 + (1/4)log2(3/2) + (1/4)log2 1
       = (1/8)(1.585) + (1/4)(0.585) = 0.34 bits/symbol

iv. To find H(Y|X)

P(Y|X) is the given transition matrix, so

H(Y|X) = Σ P(X,Y) log2(1/P(Y|X))
       = (3/8)log2(4/3) + (1/8)log2 4 + (1/4)log2 2 + (1/4)log2 2
       = 0.155 + 0.25 + 0.25 + 0.25 = 0.905 bits/symbol

16. Show that the marginal entropy is greater than or equal to the conditional entropy,
i.e. H(X) ≥ H(X|Y).

Ans: For this proof, consider the plots of the straight line y = x - 1 and the logarithmic
function y = ln x on the same set of coordinate axes (figure not reproduced here). Any point
on the straight line lies on or above the logarithmic curve for every value of x, so

ln x ≤ x - 1

Multiplying both sides by -1,

-ln x ≥ -(x - 1) = 1 - x
ln(1/x) ≥ 1 - x

Now take H(X) - H(X|Y):

H(X) - H(X|Y) = -Σ_i P(x_i) ln P(x_i) + Σ_i Σ_j P(x_i, y_j) ln P(x_i|y_j)
              = -Σ_i Σ_j P(x_i, y_j) ln P(x_i) + Σ_i Σ_j P(x_i, y_j) ln P(x_i|y_j)
              = Σ_i Σ_j P(x_i, y_j) ln [ P(x_i|y_j) / P(x_i) ]

Using ln(1/x) ≥ 1 - x with x = P(x_i)/P(x_i|y_j), we have

ln [ P(x_i|y_j) / P(x_i) ] ≥ 1 - P(x_i)/P(x_i|y_j)

So,

H(X) - H(X|Y) ≥ Σ_i Σ_j P(x_i, y_j) [ 1 - P(x_i)/P(x_i|y_j) ]
              = Σ_i Σ_j P(x_i, y_j) - Σ_i Σ_j P(x_i, y_j) P(x_i)/P(x_i|y_j)
              = Σ_i Σ_j P(x_i, y_j) - Σ_i Σ_j P(x_i) P(y_j)      [since P(x_i, y_j)/P(x_i|y_j) = P(y_j)]
              = 1 - 1 = 0

So, H(X) ≥ H(X|Y).

MUTUAL INFORMATION

Mutual information is the reduction in uncertainty about the source obtained by observing
the channel output; equivalently, it is the information gained about one variable from knowledge
of the other. If the initial uncertainty about the source is H(X) and the uncertainty remaining
after reception is H(X|Y), then the mutual information I(X,Y) is

I(X,Y) = H(X) - H(X|Y)   bits/symbol

I(X,Y) = Σ_i Σ_j P(x_i, y_j) log2(1/P(x_i)) - Σ_i Σ_j P(x_i, y_j) log2(1/P(x_i|y_j))
       = Σ_i Σ_j P(x_i, y_j) [ log2(1/P(x_i)) - log2(1/P(x_i|y_j)) ]
       = Σ_i Σ_j P(x_i, y_j) log2 [ P(x_i|y_j) / P(x_i) ]

If we substitute P(x_i|y_j) = P(x_i, y_j)/P(y_j),

I(X,Y) = Σ_i Σ_j P(x_i, y_j) log2 [ P(x_i, y_j) / (P(x_i) P(y_j)) ]

If we substitute P(x_i, y_j)/P(x_i) = P(y_j|x_i),

I(X,Y) = Σ_i Σ_j P(x_i, y_j) log2 [ P(y_j|x_i) / P(y_j) ]
       = Σ_i Σ_j P(x_i, y_j) log2(1/P(y_j)) - Σ_i Σ_j P(x_i, y_j) log2(1/P(y_j|x_i))
       = H(Y) - H(Y|X)

Therefore

I(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = I(Y,X)


PROBLEMS

17. Given the joint probability matrix

                y1     y2     y3     y4
         x1 [ 1/4    0      0      0    ]
         x2 [ 1/10   3/10   0      0    ]
P(X,Y) = x3 [ 0      1/20   1/10   0    ]
         x4 [ 0      0      1/20   1/10 ]
         x5 [ 0      0      1/20   0    ]

with source-alphabet probabilities P(X) = {1/4, 2/5, 3/20, 3/20, 1/20}, find the values of
H(X), H(Y), H(X|Y), H(Y|X) and I(X,Y).

Ans:
i. To find H(X)

P(x1) = 1/4;  P(x2) = 1/10 + 3/10 = 2/5;  P(x3) = 1/20 + 1/10 = 3/20;
P(x4) = 1/20 + 1/10 = 3/20;  P(x5) = 1/20

H(X) = Σ P(X) log2(1/P(X))
     = (1/4)log2 4 + (2/5)log2(5/2) + (3/20)log2(20/3) + (3/20)log2(20/3) + (1/20)log2 20
     = 0.5 + 0.528 + 0.410 + 0.410 + 0.216 = 2.06 bits/symbol

ii. To find H(Y)

P(y1) = 1/4 + 1/10 = 7/20;  P(y2) = 3/10 + 1/20 = 7/20;
P(y3) = 1/10 + 1/20 + 1/20 = 1/5;  P(y4) = 1/10

H(Y) = (7/20)log2(20/7) + (7/20)log2(20/7) + (1/5)log2 5 + (1/10)log2 10
     = 0.53 + 0.53 + 0.46 + 0.33 = 1.86 bits/symbol

iii. To find H(X|Y)

P(X|Y) = P(X,Y)/P(Y); dividing each column by the corresponding P(yj):

                y1    y2    y3    y4
         x1 [ 5/7   0     0     0 ]
         x2 [ 2/7   6/7   0     0 ]
P(X|Y) = x3 [ 0     1/7   1/2   0 ]
         x4 [ 0     0     1/4   1 ]
         x5 [ 0     0     1/4   0 ]

H(X|Y) = Σ_i Σ_j P(x_i, y_j) log2(1/P(x_i|y_j))
       = (1/4)log2(7/5) + (1/10)log2(7/2) + (3/10)log2(7/6) + (1/20)log2 7
       + (1/10)log2 2 + (1/20)log2 4 + (1/10)log2 1 + (1/20)log2 4
       = 0.121 + 0.180 + 0.066 + 0.140 + 0.1 + 0.1 + 0 + 0.1 = 0.807 bits/symbol

iv. To find H(Y|X)

P(Y|X) = P(X,Y)/P(X); dividing each row by the corresponding P(xi):

                y1    y2    y3    y4
         x1 [ 1     0     0     0   ]
         x2 [ 1/4   3/4   0     0   ]
P(Y|X) = x3 [ 0     1/3   2/3   0   ]
         x4 [ 0     0     1/3   2/3 ]
         x5 [ 0     0     1     0   ]

H(Y|X) = Σ_i Σ_j P(x_i, y_j) log2(1/P(y_j|x_i))
       = (1/4)log2 1 + (1/10)log2 4 + (3/10)log2(4/3) + (1/20)log2 3 + (1/10)log2(3/2)
       + (1/20)log2 3 + (1/10)log2(3/2) + (1/20)log2 1
       = 0 + 0.2 + 0.1245 + 0.079 + 0.058 + 0.079 + 0.058 + 0 = 0.5995 bits/symbol

v. To find I(X,Y)

I(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)
       = 2.06 - 0.807 = 1.253 bits/symbol

RATE OF INFORMATION

The rate of information is the average number of bits of information produced by the
source per second.

If the source emits r symbols per second and the average information per symbol
is H bits/symbol, then the information rate is R = rH bits/second.

PROBLEMS

18. A source emits r = 2000 symbols per second selected from an alphabet of size m = 4
with symbols A, B, C, D having probabilities 1/2, 1/4, 1/8, 1/8. Find the rate of
information. Also find the rate of information when the messages are equally probable.

Ans: We know that R = rH = 2000 H(X).

H = Σ P log2(1/P) = (1/2)log2 2 + (1/4)log2 4 + (1/8)log2 8 + (1/8)log2 8
  = 1/2 + 1/2 + 3/8 + 3/8 = 1.75 bits/symbol

R = rH = 2000 × 1.75 = 3500 bits/second

For equally probable messages, H_max = log2 4 = 2 bits/symbol.

R = rH_max = 2000 × 2 = 4000 bits/second


19. An analog signal band-limited to 3 Hz is sampled at the Nyquist rate, and the samples are
quantized into 4 levels q1, q2, q3 and q4 occurring with probabilities P1 = P4 = 1/8
and P2 = P3 = 3/8. Find the rate of information of the source.

Ans: We know that R = rH.

At the Nyquist rate (by the sampling theorem) r = 2B, where B is the bandwidth of the signal.

H = Σ P log2(1/P) = 2 × (1/8)log2 8 + 2 × (3/8)log2(8/3)
  = 0.75 + 1.06 = 1.81 bits/symbol

R = 2BH = 2 × 3 × 1.81 = 10.86 bits/sec

20. Calculate the rate of information of a telegraph source having two symbols, dot and
dash. The dot duration is 0.2 seconds. The dash is twice as long as the dot and half as
probable.

Ans:
Let the probability of a dot be x; then the probability of a dash is x/2.

x + x/2 = 1,  so  3x/2 = 1  and  x = 2/3

P(dot) = 2/3 and P(dash) = 1/3

H = Σ P log2(1/P) = (2/3)log2(3/2) + (1/3)log2 3
  = 0.39 + 0.53 = 0.92 bits/symbol

The average symbol duration is T = P(dot) × 0.2 + P(dash) × 0.4 = (2/3)(0.2) + (1/3)(0.4) = 0.267 s,
so the symbol rate is r = 1/T = 3.75 symbols/sec.

R = rH = 3.75 × 0.92 = 3.45 bits/sec


SOURCE CODING

The primary objective of source coding is to increase the efficiency of transmission of
intelligence over a channel and to reduce transmission errors. Coding (encoding or
enciphering) is a procedure for associating words constructed from a finite alphabet of a language
with given words of another language in a one-to-one manner.

Let the source be characterized by a set of symbols S = {s1, s2, ..., sQ}, where S is the
source alphabet. Consider another set X comprising r symbols, X = {x1, x2, ..., xr}, where X is
the code alphabet. Coding is then a mapping of the source symbols into finite sequences of
symbols of X.

Any finite sequence of symbols from the alphabet X forms a code word. The total
number of symbols contained in the word is called the word length.

CODING EFFICIENCY

Coding efficiency is defined as the ratio of the average information to the maximum
information available.

For a message that is not encoded:  η = H(X) / log2 M

For a message that is encoded:  η = H(X) / (L log2 r)

where L is the average length of a code word, L = Σ_{K=1}^{Q} P_K l_K,

and r is the number of symbols used for coding
(for a binary code r = 2; for a ternary code r = 3).

REDUNDANCY

Redundancy is defined as the unnecessary data involved in transmission.


E=1–η


VARIOUS CODES USED

1. UNIQUELY DECODABLE CODES/ UNIQUELY DECIPHERABLE CODES.

A code is uniquely decipherable if every word in a sequence of code words can be uniquely
identified.

E.g.  S1 → 00;  S2 → 01;  S3 → 10;  S4 → 11

2. INSTANTANEOUS CODE/ PREFIX CODES/ IRREDUCIBLE CODES

The necessary and sufficient condition for a code to be instantaneous is that no complete
code word be a prefix of some other code word. When we use these codes, there is no
time lag in the process of decoding.

CONSTRUCTION OF INSTANTANEOUS CODES

Suppose we want to encode a 5-symbol source into a binary instantaneous code, with
S = {S1, S2, S3, S4, S5}. Since we use binary codes, the code alphabet is X = {0, 1}.

Start by assigning 0 to S1:

S1 → 0

Since the code must be instantaneous, no other code word may begin with 0. We cannot simply
assign 1 to S2, because then both single-symbol codes 0 and 1 would be used up and only two
symbols could be encoded. So we assign S2 → 10.

S2 → 10

Now no other code word may begin with 10, so we take S3 as 110, and continuing in the same way:

S3 → 110
S4 → 1110
S5 → 1111

Thus finally we get

S1 → 0;  S2 → 10;  S3 → 110;  S4 → 1110;  S5 → 1111

Here, when we start from 0, the length of the 5th code word is 4. But when we start from '00',
the average code length can be reduced:

S1 → 00;  S2 → 01;  S3 → 10;  S4 → 110;  S5 → 111

In this way instantaneous codes can be constructed.

KRAFT’S INEQUALITY

The existence of an instantaneous code can be checked using Kraft's inequality, which
tells us whether an instantaneous code with given word lengths can exist.

Given a source S = {S1, S2, ..., SQ}, let the word lengths of the code be L = {l1, l2, ..., lQ}
and let the code alphabet be X = {x1, x2, ..., xr}. Then an instantaneous code exists if and only if

Σ_{k=1}^{Q} r^(-l_k) ≤ 1

Proof:
Let us assume that the word lengths have been arranged in ascending order,

l1 ≤ l2 ≤ l3 ≤ ... ≤ lQ             ...(1)

(For example, for the code S1 → 00, S2 → 01, S3 → 10, S4 → 110, S5 → 111 with r = 2, the
numbers of code words of lengths 1, 2 and 3 are n1 = 0, n2 = 3 and n3 = 2.)

Let nk denote the number of messages encoded into code words of length k. A code word of
length 1 is a single code symbol, so

n1 ≤ r             ...(2)

[Here r = 2 because we use 0 and 1 for coding; in the example n1 = 0 ≤ 2.]

A code word of length 2 must begin with one of the (r - n1) symbols not already used as a
complete code word, followed by any code symbol. Thus

n2 ≤ (r - n1) r = r² - n1 r             ...(3)

Similarly,

n3 ≤ (r² - n1 r - n2) r = r³ - n1 r² - n2 r             ...(4)

and in general

nk ≤ r^k - n1 r^(k-1) - ... - n_(k-1) r             ...(5)

Multiplying throughout by r^(-k), we get

nk r^(-k) ≤ 1 - n1 r^(-1) - n2 r^(-2) - ... - n_(k-1) r^(-(k-1))

nk r^(-k) + n_(k-1) r^(-(k-1)) + ... + n2 r^(-2) + n1 r^(-1) ≤ 1             ...(6)

Since each code word of length l contributes one term r^(-l), the left-hand side of (6) is the same
sum as Σ r^(-l_k) taken over all Q code words.             ...(7)

That is,  Σ_{k=1}^{Q} r^(-l_k) ≤ 1


PROBLEMS

21. Check whether the given code is instantaneous or not.

Symbol : Code
S1 : 00
S2 : 01
S3 : 10
S4 : 110
S5 : 1110
S6 : 1111

Ans:
Symbol : Code : Length
S1 : 00 : 2
S2 : 01 : 2
S3 : 10 : 2
S4 : 110 : 3
S5 : 1110 : 4
S6 : 1111 : 4

Here r = 2,

Σ r^(-l_k) = 2^(-2) + 2^(-2) + 2^(-2) + 2^(-3) + 2^(-4) + 2^(-4)
           = 0.25 + 0.25 + 0.25 + 0.125 + 0.0625 + 0.0625 = 1

Since Σ r^(-l_k) ≤ 1 and no code word is a prefix of another, the given code is instantaneous.

22. Check whether the given code is instantaneous or not.

Symbol : Code
S1 : 10
S2 : 110
S3 : 1110
S4 : 11110
S5 : 1111
S6 : 11111

Ans:
Symbol : Code : Length
S1 : 10 : 2
S2 : 110 : 3
S3 : 1110 : 4
S4 : 11110 : 5
S5 : 1111 : 4
S6 : 11111 : 5

Here r = 2,

Σ r^(-l_k) = 2^(-2) + 2^(-3) + 2^(-4) + 2^(-5) + 2^(-4) + 2^(-5)
           = 0.25 + 0.125 + 0.0625 + 0.03125 + 0.0625 + 0.03125 = 0.5625

Even though Σ r^(-l_k) ≤ 1, the given code is not instantaneous.

This can be seen directly from the table: the code contains both 1111 and 11110, and 1111 is a
prefix of 11110. According to the definition of instantaneous codes, no code word may be a prefix
of another code word, so this code is not instantaneous. (Kraft's inequality only guarantees that
an instantaneous code with these word lengths exists; it does not guarantee that a particular
code is instantaneous.)
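
Both checks can be expressed in a few lines of Python (an illustrative sketch, not part of the original notes):

def kraft_sum(lengths, r=2):
    # Left-hand side of Kraft's inequality for the given word lengths.
    return sum(r ** (-l) for l in lengths)

def is_prefix_free(codes):
    # True when no code word is a prefix of another (instantaneous code).
    return not any(a != b and b.startswith(a) for a in codes for b in codes)

code21 = ["00", "01", "10", "110", "1110", "1111"]
code22 = ["10", "110", "1110", "11110", "1111", "11111"]

print(kraft_sum([len(c) for c in code21]), is_prefix_free(code21))   # 1.0 True
print(kraft_sum([len(c) for c in code22]), is_prefix_free(code22))   # 0.5625 False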

NOISELESS CODING THEOREM

The noiseless coding theorem states that, for a code with an alphabet of r symbols and a
source with an alphabet of q symbols, the average length L of the code words per source symbol
may be made as close as desired to H(S)/log r by encoding extensions of the source rather than
encoding each source symbol individually.

That is,  L ≥ H(S)/log r

The noiseless coding theorem is also known as Shannon's first theorem or Shannon's
source coding theorem. The theorem is also valid for sources with finite memory (Markov
sources, for which the transmitted symbol depends on the preceding symbols).


Proof:
From Kraft's inequality,

Σ_{k=1}^{Q} r^(-l_k) ≤ 1             ...(1)

We also know that

Σ_{k=1}^{Q} P_k = 1             ...(2)

Both (1) and (2) can be satisfied by choosing the code-word lengths so that

r^(-l_k) ≤ P_k             ...(3)

Taking logarithms on both sides of (3),

-l_k log r ≤ log P_k
l_k log r ≥ log(1/P_k)

l_k ≥ log(1/P_k) / log r             ...(4)

Choosing l_k as the smallest integer satisfying (4), equation (4) can be re-written as

log(1/P_k)/log r ≤ l_k < 1 + log(1/P_k)/log r             ...(5)

Multiplying (5) throughout by P_k and summing over all values of k, we have

Σ_k P_k log(1/P_k)/log r ≤ Σ_k P_k l_k < Σ_k P_k log(1/P_k)/log r + Σ_k P_k

H(S)/log r ≤ L < H(S)/log r + 1

To obtain better efficiency we can encode the nth extension of the source:

H(Sⁿ)/log r ≤ L_n < H(Sⁿ)/log r + 1

Since H(Sⁿ) = n H(S),

H(S)/log r ≤ L_n/n < H(S)/log r + 1/n

and therefore

lim_{n→∞} L_n/n = H(S)/log r

Here L_n/n is the average number of code-alphabet symbols used per single symbol of S
when the input to the encoder is an n-symbol message of the extended source Sⁿ, and L is the
average word length for the source S; in general L_n/n ≤ L. The code capacity is
(L_n/n) log r = C bits/message of the channel. For successful transmission of messages
through the channel we require H(S) ≤ (L_n/n) log r = C bits/message.

The relation  H(S)/log r ≤ L_n/n < H(S)/log r + 1/n  is known as the "noiseless coding theorem".

Drawbacks of Noiseless Coding Theorem:

1. Increased coding complexity.

2. Increased time required for encoding & transmitting the information.


CONSTRUCTION OF BASIC SOURCE CODES

SHANNON - FANO ALGORITHM


PROCEDURE

1. The messages are written in the order of decreasing probabilities.

2. Partition or divide this into two almost equi-probable subgroups.

3. Assign zero to one group & one to another group.

4. Repeat steps (2) & (3) on each subgroup until the subgroups contains only one source
symbol.

PROBLEMS

23. A source emits 4 symbols with probabilities 0.4, 0.3, 0.2, 0.1. Obtain the Shannon-
Fano code. Find the average word length, entropy, efficiency and redundancy.

Ans:
Symbol : Code : Length
S1 : 0 : 1
S2 : 10 : 2
S3 : 110 : 3
S4 : 111 : 3

Average word length, L = Σ P_i l_i
  = 0.4×1 + 0.3×2 + 0.2×3 + 0.1×3 = 0.4 + 0.6 + 0.6 + 0.3 = 1.9

Entropy, H = -Σ P_i log2 P_i
  = -(0.4 log2 0.4 + 0.3 log2 0.3 + 0.2 log2 0.2 + 0.1 log2 0.1) = 1.846 bits/symbol

Efficiency, η = H/L = 1.846/1.9 = 0.9718 = 97.18%

Redundancy = 1 - η = 2.82%
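
A minimal Shannon-Fano sketch in Python (illustrative only; the split rule below chooses the partition whose two halves have the closest total probabilities):

def shannon_fano(symbols):
    # symbols: list of (symbol, probability) sorted by decreasing probability.
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(p for _, p in symbols)
    running, split, best = 0.0, 1, float("inf")
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        if abs(total - 2 * running) < best:
            best, split = abs(total - 2 * running), i
    codes = {}
    for prefix, group in (("0", symbols[:split]), ("1", symbols[split:])):
        for sym, code in shannon_fano(group).items():
            codes[sym] = prefix + code
    return codes

print(shannon_fano([("S1", 0.4), ("S2", 0.3), ("S3", 0.2), ("S4", 0.1)]))
# {'S1': '0', 'S2': '10', 'S3': '110', 'S4': '111'}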


24. A source emits 9 symbols with probabilities 0.49, 0.14, 0.14, 0.07, 0.07, 0.04, 0.02, 0.02,
0.01. Obtain the Shannon-Fano code. Find the average word length, entropy,
efficiency and redundancy.

Ans:
Symbol : Code : Length
S1 : 0 : 1
S2 : 100 : 3
S3 : 101 : 3
S4 : 1100 : 4
S5 : 1101 : 4
S6 : 1110 : 4
S7 : 11110 : 5
S8 : 111110 : 6
S9 : 111111 : 6

Average word length, L = Σ P_i l_i
  = 0.49×1 + 2×0.14×3 + 2×0.07×4 + 0.04×4 + 0.02×5 + 0.02×6 + 0.01×6
  = 0.49 + 0.84 + 0.56 + 0.16 + 0.10 + 0.12 + 0.06 = 2.33

Entropy, H = -Σ P_i log2 P_i
  = -(0.49 log2 0.49 + 2×0.14 log2 0.14 + 2×0.07 log2 0.07
      + 0.04 log2 0.04 + 2×0.02 log2 0.02 + 0.01 log2 0.01)
  = 2.314 bits/symbol

Efficiency, η = H/L = 2.314/2.33 = 0.993 = 99.3%

Redundancy = 1 - η = 0.7%


HUFFMAN CODING (MINIMUM REDUNDANCY CODING)

PROCEDURE

1. Arrange the source symbols in the decreasing order of probability.

2. Check whether q = r + α(r - 1) is satisfied for some non-negative integer α. If not, add
dummy symbols with zero probability of occurrence until the condition is satisfied.

This step is not required for binary coding (r = 2), since in that case α is always an integer.

3. Combine the last r symbols into a single composite symbol whose probability of
occurrence is equal to the sum of the probabilities of occurrence of the r symbols involved in
this step.

4. Repeat steps 1 and 3 on the resulting set of symbols until exactly r symbols are left in the
final step.

5. Assign codes freely to the last r- composite symbols and work backward to the source to
arrive at the optimum code.

6. Discard the codes of the dummy symbols.

PROBLEMS

25. Apply Huffman coding to the messages S0, S1, S2, S3 and S4 with probabilities 0.4, 0.2,
0.2, 0.1 and 0.1. Find the average word length, entropy, efficiency and redundancy.

Ans: (The Huffman reduction diagram is not reproduced here.)

Symbol : Code : Length
S0 : 00 : 2
S1 : 10 : 2
S2 : 11 : 2
S3 : 010 : 3
S4 : 011 : 3

Average word length, L = Σ P_i l_i
  = 0.4×2 + 0.2×2 + 0.2×2 + 0.1×3 + 0.1×3 = 0.8 + 0.4 + 0.4 + 0.3 + 0.3 = 2.2

Entropy, H = -Σ P_i log2 P_i
  = -(0.4 log2 0.4 + 2×0.2 log2 0.2 + 2×0.1 log2 0.1) = 2.122 bits/symbol

Efficiency, η = H/L = 2.122/2.2 = 0.9645 = 96.45%

Redundancy = 1 - η = 3.55%
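
A compact binary Huffman sketch using a heap (illustrative only; the exact code words may differ from the table above, but the average length is the same 2.2):

import heapq

def huffman(probs):
    # probs: dict symbol -> probability; returns dict symbol -> binary code.
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)          # two least probable nodes
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"S0": 0.4, "S1": 0.2, "S2": 0.2, "S3": 0.1, "S4": 0.1}
codes = huffman(probs)
print(codes)
print(sum(p * len(codes[s]) for s, p in probs.items()))   # 2.2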

26. Apply Huffman coding to the messages S0, S1, S2, S3 and S4 with probabilities 0.4, 0.2,
0.2, 0.1 and 0.1, choosing r = 4. Find the average word length, entropy, efficiency and redundancy.

Ans:
q = r + α(r - 1),  so  α = (q - r)/(r - 1) = (5 - 4)/(4 - 1) = 1/3

Since α is not an integer, add one dummy symbol S5 with probability 0; now q = 6.

α = (6 - 4)/(4 - 1) = 2/3

Since α is still not an integer, add another dummy symbol S6 with probability 0; now q = 7.

α = (7 - 4)/(4 - 1) = 3/3 = 1

(The quaternary Huffman reduction is not reproduced here.)

Symbol : Code : Length
S0 : 0 : 1
S1 : 2 : 1
S2 : 3 : 1
S3 : 10 : 2
S4 : 11 : 2

Average word length, L = Σ P_i l_i
  = 0.4×1 + 0.2×1 + 0.2×1 + 0.1×2 + 0.1×2 = 0.4 + 0.2 + 0.2 + 0.2 + 0.2 = 1.2

Entropy, H = -Σ P_i log2 P_i
  = -(0.4 log2 0.4 + 2×0.2 log2 0.2 + 2×0.1 log2 0.1) = 2.122 bits/symbol

Efficiency, η = H/(L log2 r) = 2.122/(1.2 × 2) = 2.122/2.4 = 0.8842 = 88.42%

Redundancy = 1 - η = 11.58%

LEMPEL - ZIV ALGORITHM

This source coding algorithm does not need the source statistics (source probabilities). Lempel-
Ziv coding is a variable-to-fixed length source coding algorithm proposed by Abraham Lempel and
Jacob Ziv in 1977. The algorithm is intrinsically adaptive and simpler to implement than the
Huffman source coding algorithm.

PROCEDURE

1. The compression of an arbitrary binary sequence is made possible by encoding a series of 1s
and 0s in terms of previously seen strings (prefix strings).

2. New string is formed by the process known as parsing.


Parsing = prefix string + new bits
Thus parsing becomes a potential prefix string for future strings

3. The variable length blocks thus obtained are known as phrases. Then phrases are listed in
a dictionary or code book, which stores the existing phrases & their locations.

4. In encoding a new phrase we specify the location of the existing phrase in the code book
& append the new letter.
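
An LZ78-style parsing sketch in Python, in the spirit of the procedure above (illustrative only; each new phrase is emitted as the dictionary index of its longest previously seen prefix plus one new bit):

def lz_parse(bits):
    dictionary = {"": 0}            # phrase -> index; index 0 is the empty phrase
    output, phrase = [], ""
    for b in bits:
        if phrase + b in dictionary:
            phrase += b             # keep extending the current prefix string
        else:
            output.append((dictionary[phrase], b))
            dictionary[phrase + b] = len(dictionary)
            phrase = ""
    if phrase:
        output.append((dictionary[phrase], ""))
    return output

print(lz_parse("1011010100010"))
# [(0, '1'), (0, '0'), (1, '1'), (2, '1'), (4, '0'), (2, '0'), (1, '0')]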

ADVANTAGES

1. This algorithm uses fixed length codes to represent a variable number of source symbols.

2. This coding method is suitable for synchronous transmission.

3. It is now used as the standard algorithm for file compression.

4. Achieves a compression of 55% [Huffman Coding 43%].

RUN LENGTH ENCODING

Run-length encoding is used to reduce the size of a repeating string of symbols; such a repeating
string is called a run. RLE encodes a run of symbols into two bytes (count + symbol). RLE cannot
achieve the high compression ratios of other compression methods, but it is easy to implement
and quick to execute.

Example:

Consider the bit string 11111111111111100000000000000000001111. Encode this using run-length
encoding.

11111111111111100000000000000000001111 = 38 bits

(Count, Symbol) = (15, 1), (19, 0), (4, 1)
                = (01111, 1), (10011, 0), (00100, 1) = 18 bits

Compression ratio = output bits : input bits = 18 : 38 ≈ 1 : 2.1
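
A run-length encoder matching the (count, symbol) scheme above, as a Python sketch (illustrative only; it assumes each run fits in the 5-bit count field, i.e. runs of at most 31 symbols):

def rle_encode(bits, count_bits=5):
    out, i = [], 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1                              # extend the current run
        out.append((format(j - i, "0{}b".format(count_bits)), bits[i]))
        i = j
    return out

print(rle_encode("11111111111111100000000000000000001111"))
# [('01111', '1'), ('10011', '0'), ('00100', '1')]  -> 3 x 6 = 18 bits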

JPEG STANDARD FOR LOSS LESS IMAGE COMPRESSION

The Joint Photographic Experts Group (JPEG) was formed jointly by two standards
organizations - the CCITT (the International Telegraph and Telephone Consultative Committee,
now ITU-T) and the International Standards Organization (ISO).

Let us now consider the lossless compression option of the JPEG image compression
standard, which describes 29 distinct coding systems for the compression of images. There are so
many approaches because the needs of different users vary so much with respect to quality versus
compression and compression versus computation time that the committee decided to provide a
broad selection from which to choose. We shall briefly discuss here two methods that use entropy
coding.

The two lossless JPEG compression options discussed here differ only in the form of the
entropy code that is applied to the data. The user can choose either a Huffman code or an
arithmetic code. We will not treat the arithmetic code concept in much detail here, but its main
features can be summarized as follows:

The arithmetic code achieves compression in transmission or storage by using the probabilistic
nature of the data to render the information with fewer bits than used in the source data stream.
Its primary advantage over the Huffman code is that it comes closer to the Shannon entropy limit
of compression for data streams that involve a relatively small alphabet. The reason is that
Huffman codes work best (highest compression ratios) when the probabilities of the symbols can
be expressed as negative powers of two. The computation of coding and decoding arithmetic
codes is costlier than that of Huffman codes. Typically a 5 to 10% reduction in file size is seen
with the application of arithmetic codes over that obtained with Huffman coding.

Some compression can be achieved if we can predict the next pixel using the previous
pixels. In this way we only have to transmit the prediction coefficients (or the differences from
the predicted values) instead of the entire pixel. The predictive process that is used in the lossless
JPEG coding schemes to form the innovations data is also variable. In this case, however, the
variation is not based upon the user's choice but is made, for any image, on a line-by-line basis,
the choice being the predictor that performs best overall for the entire line.

There are eight prediction methods available in the JPEG coding standards. One of the
eight (which is the no prediction option) is not used for the lossless coding option that we are
examining here. The other seven may be divided into the following categories:

1. Predict the next pixel on the line as having the same value as the last one.

2. Predict the next pixel on the line as having the same value as the pixel in this position
on the previous line (that is, above it).

3. Predict the next pixel on the line as having a value related to a combination of the
previous, above and above-left (diagonal) pixel values. One such combination is
simply previous + above - above-left.

The differential coding used in the JPEG standard consists of the differences between
the actual image pixel values and the predicted values. As a result of the smoothness and
redundancy present in most pictures, these differences are relatively small positive and
negative numbers that represent the small typical error in the prediction. Hence the probabilities
associated with these values are large for small innovation values and quite small for large
ones. This is exactly the kind of data stream that compresses well with an entropy code.

The typical lossless compression for natural images is 2:1. While this is substantial, it
does not in general solve the problem of storing or moving large sequences of images as
encountered in high quality video.

JPEG STANDARD FOR LOSSY IMAGE COMPRESSION

The JPEG standard includes a set of sophisticated lossy compression options developed
after a study of image distortion acceptable to human senses. The JPEG lossy compression
algorithm consists of an image simplification stage, which removes the image complexity at
some loss of fidelity, followed by a lossless compression step based on predictive and Huffman
or Arithmetic coding.


The lossy image simplification step, which we will call the image reduction, is based on
the exploitation of an operation known as the Discrete Cosine Transform (DCT), defined as
follows.

N 1 M 1
 k   l 
Y k , l     4 yi, j  cos  2N 2i  1 cos  2M 2 j  1
i 0 j 0

where the input image is N pixels by M pixels, y(i, j) is the intensity of the pixel in
row i and column j, and Y(k, l) is the DCT coefficient in row k and column l of the DCT matrix. All
the DCT multiplications are real, which lowers the number of required multiplications
compared to the Discrete Fourier Transform. For most images, much of the signal energy lies at
low frequencies, which appear in the upper left corner of the DCT. The lower right values
represent higher frequencies and are often small (usually small enough to be neglected with
little visible distortion).
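
A direct (unoptimized) Python implementation of the 2-D DCT formula above, applied to an 8 x 8 block (illustrative only):

import math

def dct2(block):
    N, M = len(block), len(block[0])
    out = [[0.0] * M for _ in range(N)]
    for k in range(N):
        for l in range(M):
            s = 0.0
            for i in range(N):
                for j in range(M):
                    s += 4 * block[i][j] \
                         * math.cos(math.pi * k * (2 * i + 1) / (2 * N)) \
                         * math.cos(math.pi * l * (2 * j + 1) / (2 * M))
            out[k][l] = s
    return out

block = [[128] * 8 for _ in range(8)]      # a flat 8x8 block
Y = dct2(block)
print(round(Y[0][0]), round(Y[0][1]))      # large DC coefficient, AC coefficients ~0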

In the JPEG image reduction process, the DCT is applied to 8 by 8 pixel blocks of the
image. Hence, if the image is 256 by 256 pixels in size, we break it into 32 by 32 square blocks
of 8 by 8 pixels and treat each one independently. The 64 pixel values in each block are
transformed by the DCT into a new set of 64 values. These new 64 values, known also as DCT
coefficients, form a whole new way of representing an image. The DCT coefficients represent
the spatial frequency of the image sub-block. The upper left corner of the DCT matrix has low
frequency components and the lower right corners the high frequency components. The top left
coefficient is called the DC coefficient. Its value is proportional to the average value of the 8 by
8 block pixels. The rest are called the AC coefficients.


So far we have not obtained any reduction simply by taking the DCT. However, due to
the nature of most natural images, maximum energy (information) lies in low frequencies as
opposed to high frequency. We can represent the high frequency components coarsely, or drop
them altogether, without strongly affecting the quality of the resulting image reconstruction. This
leads to a lot of compression (lossy). The JPEG lossy compression algorithm does the following
operations:

1. First the lowest weights are trimmed by setting them to zero.


2. The remaining weights are quantized (that is, rounded off to the nearest of some
number of discrete code represented values), some more coarsely than others
according to observed levels of sensitivity of viewers to these degradations.
Now several lossless compression steps are applied to the weight data that results from
the above DCT and quantization process, for all the image blocks. We observe that the DC
coefficient, which represents the average image intensity, tends to vary only slowly from one 8x8
block to the next. Hence prediction of this value from the surrounding blocks works well: we
need only send one DC coefficient and the differences between the DC coefficients of
successive blocks. These differences can also be source coded.

We next look at the AC coefficients. We first quantize them, which transforms most of
the high-frequency coefficients to zero. We then use zigzag coding. The purpose of zigzag
coding is to move gradually from the low frequencies to the high frequencies, avoiding abrupt
jumps in the values. Zigzag coding leads to long runs of 0s, which are ideal for RLE followed
by Huffman or arithmetic coding.


The typically quoted performance for JPEG is that photographic-quality images of natural
scenes can be preserved with compression ratios of up to about 20:1 or 25:1. Usable quality (that
is, for non-critical purposes) can result for compression ratios in the range of 200:1 up to 230:1.


CHANNEL CAPACITY

The channel capacity of a discrete memory-less channel is defined as the maximum of the
mutual information I(X,Y) over all possible input distributions. The channel capacity is measured
in bits per channel use.

C = Max [ I(X,Y) ]

The various channels used in communication systems are described below:

1. LOSSLESS CHANNEL


A channel is said to be lossless if H X
Y
 0 for all distributions. A lossless channel is
determined by the factor that the input is determined by the output and hence no transmission
error can occur.

Or it can be represented as

CHANNEL CAPACITY OF A LOSSLESS CHANNEL

We know that channel capacity = Max [Mutual Information] = Max [I(X,Y)]

I X ,Y   H X   H X  Y


For a lossless channel H X Y  0 
Module I 49 2008 Scheme
Information Theory & Coding Dept of ECE

C  Max I  X , Y 
 Max H  X   0
 Max H  X 

If the source is transmitting „M‟ symbols the entropy will be maximum if all the signals are

equally likely. Then the entropy H  log 2 M

C  Max H  X 
 log 2 M

2. DETERMINISTIC CHANNEL


A channel is said to be deterministic if P(yj|xi) is either 0 or 1 for every input-output pair, or
equivalently H(Y|X) = 0 for all input distributions. In the case of a deterministic channel the
noise matrix contains only one non-zero element in each row, and that element is unity.

Or it can be represented as

CHANNEL CAPACITY OF A DETERMINISTIC CHANNEL

We know that channel capacity = Max [Mutual Information] = Max [I(X,Y)]

I(X,Y) = H(Y) - H(Y|X)

For a deterministic channel, H(Y|X) = 0

C = Max [ H(Y) ]

If the receiver is receiving N symbols, the entropy will be maximum when all the symbols are
equally likely. Then the entropy H = log2 N.

C = Max [ H(Y) ] = log2 N

3. NOISELESS/ IDEAL CHANNEL

A channel is said to be noiseless if the channel is both lossless and deterministic. That is,
H(Y|X) = H(X|Y) = 0, and the noise matrix contains exactly one non-zero entry (equal to unity)
in each row and each column.

Or it can be represented as

CHANNEL CAPACITY OF A NOISELESS CHANNEL

We know that channel capacity = Max [Mutual Information] = Max [I(X,Y)]

I  X , Y   H Y   H Y  X
 H X   H X  Y

Module I 51 2008 Scheme
Information Theory & Coding Dept of ECE

For a noiseless channel H Y X


  H X  Y
 0
C  Max H  X    Max H Y  
C  log 2 M  log 2 N
For an Ideal Channel M N

4. USELESS CHANNEL

A useless channel can be characterized by the conditions H(X) = H(X|Y) and H(Y) = H(Y|X).
The useless channel has zero capacity, i.e. I(X,Y) = 0.

CHANNEL CAPACITY OF A USELESS CHANNEL

We know that channel capacity = Max [Mutual Information] = Max [I(X,Y)]

= Max [ 0 ] = 0

5. SYMMETRIC CHANNEL

A channel is symmetric if each row of the noise matrix is a permutation of every other row, and
likewise each column is a permutation of every other column.

CHANNEL CAPACITY OF A SYMMETRIC CHANNEL

Consider there are M symbols in transmitter and N symbols in receiver

Take the noise matrix of the symmetric channel as P(Y|X).

We know that channel capacity = Max [Mutual Information] = Max [I(X,Y)]

Take the conditional entropy at the points x1, x2, ....., xM:

H(Y|x1) = P(y1|x1) log [1/P(y1|x1)] + P(y2|x1) log [1/P(y2|x1)] + ... + P(yN|x1) log [1/P(yN|x1)]

Similarly,

H(Y|xM) = P(y1|xM) log [1/P(y1|xM)] + P(y2|xM) log [1/P(y2|xM)] + ... + P(yN|xM) log [1/P(yN|xM)]

i.e.  H(Y|X) = Σi P(xi) H(Y|xi)

For a symmetric matrix,

H(Y|x1) = H(Y|x2) = H(Y|x3) = .... = H(Y|xM) = h

H(Y|X) = Σi P(xi) h = h Σi P(xi)

Since Σi P(xi) = 1,

H(Y|X) = h

Channel capacity, C = Max [I(X,Y)]

C = Max [ H(Y) - H(Y|X) ]
  = Max [ H(Y) ] - h
  = log2 N - h

Similarly, by taking H(X|Y), we get

C = Max [ H(X) - H(X|Y) ]
  = Max [ H(X) ] - h'
  = log2 M - h'

Channel capacity, C = log2 N - h  or  C = log2 M - h',

where h = H(Y|X = xi) and h' = H(X|Y = yj).

PROBLEMS

27. Determine the channel capacity of a symmetric channel whose noise matrix P(Y|X) has rows
that are permutations of [1/3  1/3  1/6  1/6].

Ans:
C = log2 N - h

where N = 4 and h = H(Y|X = xi).

For a symmetric channel, H(Y|x1) = H(Y|x2) = ... = H(Y|xi) = h.

h = (1/3) log2 3 + (1/3) log2 3 + (1/6) log2 6 + (1/6) log2 6 = 1.918

C = log2 N - h = 2 - 1.918 = 0.082 bits/channel use
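The figures in Problem 27 can be cross-checked numerically. The Python sketch below is only an
illustration; it assumes one row of the noise matrix is [1/3 1/3 1/6 1/6], as in the problem above.

from math import log2

# One row of the symmetric noise matrix P(Y|X) assumed in Problem 27
row = [1/3, 1/3, 1/6, 1/6]
N = len(row)                                   # number of receiver symbols

h = -sum(p * log2(p) for p in row if p > 0)    # h = H(Y|X = xi)
C = log2(N) - h                                # C = log2 N - h
print(f"h = {h:.3f} bits, C = {C:.3f} bits/channel use")   # h ~ 1.918, C ~ 0.082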


6. BINARY SYMMETRIC CHANNEL

A symmetric channel which uses binary data for transmission and reception is called binary
symmetric channel. Binary symmetric channel is used in binary communication system.

The binary source transmits binary data to the receiver. The comparator compares the
transmitted and the received symbols. The comparator sends the corresponding signal to the
observer. The observer, which has a noiseless channel, sends a „1‟ to the receiver when there
is an error. When there is no error the observer sends a „0‟ to the receiver. Thus the observer
supplies additional information to the receiver, which compensates for the noise in the channel.
The additional information supplied by the observer is exactly the equivocation H(X|Y) of the source.

A binary symmetric channel can be represented as:

To find the capacity of a binary symmetric channel, let us take P(X1) = 1/2 and P(X2) = 1/2.

We have P(X,Y) = P(X) P(Y|X)

Therefore,  P(X,Y) = | p/2  q/2 |
                     | q/2  p/2 |

We have C = Max I(X,Y) = Max [ H(X) - H(X|Y) ]

H(X) = (1/2) log2 2 + (1/2) log2 2 = 1

Now we have to find H(X|Y); for that we require P(X|Y).

P(X,Y) = P(Y) P(X|Y)   =>   P(X|Y) = P(X,Y) / P(Y)

From P(X,Y), we get P(Y1) = (p+q)/2 and P(Y2) = (p+q)/2.

P(X|Y) = | (p/2)/((p+q)/2)   (q/2)/((p+q)/2) |  =  | p/(p+q)   q/(p+q) |
         | (q/2)/((p+q)/2)   (p/2)/((p+q)/2) |     | q/(p+q)   p/(p+q) |

Since p + q is the total probability, its value becomes 1. So

P(X|Y) = | p  q |
         | q  p |

Therefore,

H(X|Y) = Σ P(X,Y) log2 [1/P(X|Y)]
       = (p/2) log (1/p) + (q/2) log (1/q) + (p/2) log (1/p) + (q/2) log (1/q)
       = p log (1/p) + q log (1/q) = -(p log p + q log q)

So the channel capacity C = H(X) - H(X|Y) = 1 + p log p + q log q

For a symmetric channel, we know that the channel capacity C = log2 M - h; here M = 2 and

h = H(Y|X1) = H(Y|X2) = p log (1/p) + q log (1/q) = -(p log p + q log q)

C = log2 M - h = log2 2 + p log p + q log q = 1 + p log p + q log q

PROBLEMS

28. Determine the channel capacity of a binary symmetric channel whose noise matrix is given as

P(Y|X) = | 0.9  0.1 |
         | 0.1  0.9 |

Ans:

From this, p = 0.9 and q = 0.1

C = 1 + p log p + q log q

  = 1 + 0.9 log2 0.9 + 0.1 log2 0.1 = 1 - 0.47 = 0.53 bits/channel use
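As a quick numerical check of C = 1 + p log p + q log q, a short Python sketch (illustrative only;
p = 0.9 is the value assumed in Problem 28):

from math import log2

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with P(correct) = p, P(error) = q = 1 - p."""
    q = 1 - p
    h = -sum(x * log2(x) for x in (p, q) if x > 0)   # entropy of one row of the noise matrix
    return 1 - h                                     # = 1 + p log p + q log q

print(bsc_capacity(0.9))   # ~0.531 bits/channel use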


7. BINARY ERASURE CHANNEL

In communication systems, sometimes the data received may be so corrupted and thus we
may not be able to judge the received output. Sometimes the data may be totally lost. Such a
channel is called a binary erasure channel.

A binary erasure channel can be represented as:

C  Max Mutual Information  Max  I  X , Y   Max H  X   H X   Y  ..............1


 1 
H  X    P X  log 2  
 P X  
Let P (0) = α & P (1) = 1 – α.

H  X    log
1 1
 1   log ................2
 1

 
 
 
H X   PX i , Y j log 2 
Y
1
  Xi 


 P Y j 
i j

 

So we have to find P X , Y  & P Y  X


P X , Y   P X   P Y  X
p  1  p  0 

0 1  p 1    1    p 

PX  Y  PP XY, Y 
PY1    p
PY2    1  p   1  p 1   
    p  1    p   p 1  p
PY3   1    p

Module I 58 2008 Scheme


Information Theory & Coding Dept of ECE

 Y   10
PX

1
0
1

 
 
 
H X   P X i , Y j log 2 
Y
1
  Xi  

 P Y j  
i j

 
 p  log   1  p   log  1  p  1    log  p 1    log
1 1 1 1
1  1 1
 
 1  p    log  1    log   1  p  H  X ........3
1 1
  1   
Substitute 2 & 3 in 1, we get
C  Max H  X   1  p H  X   Max  p H  X  
p
PROBLEMS

29. Determine the channel capacity of a binary erasure channel whose noise matrix is given as

P(Y|X) = | 3/4   1/4   0   |
         | 0     1/4   3/4 |

Ans:

Comparing with the general form

P(Y|X) = | p   1-p   0 |
         | 0   1-p   p |

we get p = 3/4.

C = p = 3/4 bits/channel use
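The result C = p can also be verified by computing I(X,Y) = H(X) - H(X|Y) directly. The Python
sketch below is only an illustration; it assumes the P(Y|X) of Problem 29 and the capacity-achieving
input P(0) = P(1) = 1/2.

from math import log2

P_Y_given_X = [[3/4, 1/4, 0.0],     # rows: inputs 0 and 1; columns: outputs 0, erasure, 1
               [0.0, 1/4, 3/4]]
Px = [0.5, 0.5]

Pxy = [[Px[i] * P_Y_given_X[i][j] for j in range(3)] for i in range(2)]
Py = [sum(Pxy[i][j] for i in range(2)) for j in range(3)]

HX = -sum(p * log2(p) for p in Px if p > 0)
HX_given_Y = -sum(Pxy[i][j] * log2(Pxy[i][j] / Py[j])
                  for i in range(2) for j in range(3) if Pxy[i][j] > 0)

print(HX - HX_given_Y)   # ~0.75 = p, the non-erasure probability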

8. UNSYMMETRIC CHANNEL

An unsymmetric channel does not have rows (or columns) that are permutations of one another.
In this case, maximization of H(X|Y) or H(Y|X) poses a problem, and the maximization is
obtained by using the calculus of variations.

CHANNEL CAPACITY OF UNSYMMETRIC CHANNEL

Step 1: The noise matrix is multiplied by a column vector of auxiliary variables Q1 and Q2, and
the product is equated to the row entropies:

| P11  P12 | | Q1 |   | P11 log P11 + P12 log P12 |
| P21  P22 | | Q2 | = | P21 log P21 + P22 log P22 |

Step 2: From Step 1 we obtain simultaneous equations in the variables Q1 and Q2, which can be
solved to obtain Q1 and Q2.

Step 3: The values of Q1 and Q2 are substituted in the general expression for channel capacity,

C = log2 ( 2^Q1 + 2^Q2 + ... + 2^Qn )
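These three steps can also be carried out numerically. The Python sketch below is only an
illustration (it uses NumPy, assumes the noise matrix of Problem 30 that follows, and presumes the
matrix is square and invertible); it solves for Q and then evaluates C.

import numpy as np

P = np.array([[1/3, 2/3, 0.0],
              [2/3, 1/3, 0.0],
              [0.0, 0.0, 1.0]])      # noise matrix P(Y|X) of Problem 30

# Steps 1 and 2: solve  P . Q = row entropies  sum_j Pij log2 Pij
rhs = np.array([sum(p * np.log2(p) for p in row if p > 0) for row in P])
Q = np.linalg.solve(P, rhs)

# Step 3: C = log2( sum_i 2^Qi )
C = np.log2(np.sum(2.0 ** Q))
print(Q, C)    # Q ~ [-0.918, -0.918, 0], C ~ 1.04 bits/channel use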
PROBLEMS

30. Determine the channel capacity of an unsymmetric channel whose noise matrix is given as

P(Y|X) = | 1/3   2/3   0 |
         | 2/3   1/3   0 |
         | 0     0     1 |

Ans:

Step 1:

| 1/3  2/3  0 | | Q1 |   | (1/3) log2 (1/3) + (2/3) log2 (2/3) + 0 |
| 2/3  1/3  0 | | Q2 | = | (2/3) log2 (2/3) + (1/3) log2 (1/3) + 0 |
| 0    0    1 | | Q3 |   | 1 log2 1                                |

Step 2:
From these matrices we obtain the simultaneous equations

(1/3) Q1 + (2/3) Q2 = -0.92 ;  (2/3) Q1 + (1/3) Q2 = -0.92 ;  Q3 = 0

=>  Q1 + 2 Q2 = -2.75 ;  2 Q1 + Q2 = -2.75

Solving, Q1 = Q2 = -0.917 and Q3 = 0.

Step 3:

C = log2 ( 2^(-0.917) + 2^(-0.917) + 2^0 )
  = 1.04 bits/channel use

9. BINARY CASCADED CHANNEL

Here two channels are connected in cascade. The overall noise matrix is the matrix product of the
individual noise matrices. If the first channel has the noise matrix [α1 α2 ; α3 α4] and the second
channel has the noise matrix [β1 β2 ; β3 β4], then

P11 = α1 β1 + α2 β3
P12 = α1 β2 + α2 β4
P21 = α3 β1 + α4 β3
P22 = α3 β2 + α4 β4

PROBLEMS

31. Two binary symmetric channels are in cascade. Determine the channel capacity of
each channel and the overall system.

Ans:

1st Channel:

P(Y|X) = | 0.9  0.1 |
         | 0.1  0.9 |

Channel capacity, C1 = 1 + p log p + q log q
                     = 1 + 0.9 log2 0.9 + 0.1 log2 0.1 = 1 - 0.47 = 0.53

2nd Channel:

P(Z|Y) = | 0.75  0.25 |
         | 0.25  0.75 |

Channel capacity, C2 = 1 + p log p + q log q = 1 + 0.75 log2 0.75 + 0.25 log2 0.25
                     = 1 - 0.81 = 0.19

Capacity of the overall channel:

P(Z|X) = | P11  P12 |
         | P21  P22 |

P11 = 0.9 x 0.75 + 0.1 x 0.25 = 0.7
P12 = 0.9 x 0.25 + 0.1 x 0.75 = 0.3
P21 = 0.1 x 0.75 + 0.9 x 0.25 = 0.3
P22 = 0.1 x 0.25 + 0.9 x 0.75 = 0.7

Channel capacity, C = 1 + p log p + q log q = 1 + 0.7 log2 0.7 + 0.3 log2 0.3
                    = 1 - 0.88 = 0.12
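The overall noise matrix of a cascade is simply the matrix product of the two individual noise
matrices, so the hand calculation above can be reproduced with a short Python sketch (illustrative
only; the 0.9 and 0.75 channels are the ones assumed in Problem 31).

import numpy as np
from math import log2

def bsc_capacity(p):
    q = 1 - p
    return 1 + p * log2(p) + q * log2(q)

P1 = np.array([[0.9, 0.1], [0.1, 0.9]])      # P(Y|X), first channel
P2 = np.array([[0.75, 0.25], [0.25, 0.75]])  # P(Z|Y), second channel

P = P1 @ P2                                  # P(Z|X) of the cascade
print(P)                                     # [[0.7, 0.3], [0.3, 0.7]]
print(bsc_capacity(0.9), bsc_capacity(0.75), bsc_capacity(P[0, 0]))
# ~0.53, ~0.19 and ~0.12 bits/channel use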

CAPACITY OF A BAND-LIMITED GAUSSIAN CHANNEL

For finding the capacity of a band-limited Gaussian channel, we have to know about Shannon‟s
theorem & Shannon- Hartley theorem

SHANNON’S THEOREM

If there exist a source that produces „M‟ number of equally likely messages generating
information at the information rate „R‟ with channel capacity „C‟, then if R ≤ C; there exist a
coding technique such that we can transmit message over the channel with minimum probability
of error.

NEGATIVE STATEMENT

If there exist a source that produces „M‟ number of equally likely messages generating
information at the information rate „R‟ with channel capacity „C‟, then if R > C; there exist a


coding technique such that we can transmit message over the channel with probability of error
which is close to one.

SHANNON- HARTLEY THEOREM

Shannon- Hartley theorem is applicable to the channel affected by Gaussian noise. It


states that the channel capacity of a white, band-limited Gaussian channel is

C = B log2 (1 + S/N)

Where, B is the channel bandwidth
       S is the signal power
       N is the noise power; N = ηB
       η/2 is the two-sided power spectral density

DERIVATION

For the purpose of transmission over a channel the messages are represented by fixed
voltage levels. The source generates one message after another in sequence. In the transmitted
signal s(t),

σ is the root mean square noise voltage in the signal,
λ is a multiple which is kept large enough to recognize individual levels, and
σλ is the step size.

The levels are located at the voltages ±σλ/2, ±3σλ/2, ±5σλ/2, ...

The average signal power, if there are M signals or M levels, is

S = (V1² + V2² + V3² + ...... + VM²) / M

Here the voltage levels are ±σλ/2, ±3σλ/2, ......, ±(M-1)σλ/2.
(M-1 because, if we take 8 levels, the last level is 7σλ/2.)

S = (2/M) [ (σλ/2)² + (3σλ/2)² + ...... + ((M-1)σλ/2)² ]

  = (2/M) (σλ/2)² [ 1² + 3² + ...... + (M-1)² ]

  = (2/M) (σ²λ²/4) [ M(M² - 1)/6 ]

  = σ²λ² (M² - 1) / 12

From this,

M² - 1 = 12S / (σ²λ²)   =>   M = [ 1 + 12S/(σ²λ²) ]^(1/2)

Noise power N is the square of the root mean square noise voltage, i.e. N = σ².

M = [ 1 + 12S/(λ²N) ]^(1/2)  ...................... (1)

Each message is equally likely. Then we can write the entropy H = log2 M.

H = log2 [ 1 + 12S/(λ²N) ]^(1/2) = (1/2) log2 [ 1 + 12S/(λ²N) ]  ...................... (2)

We know that the rate of information, R = rH ................ (3)


By Shannon's theorem, we know that if R ≤ C the transmission takes place with an arbitrarily small
probability of error. Taking R ≈ C,

C = rH = r (1/2) log2 [ 1 + 12S/(λ²N) ]  ................ (4)

By the sampling theorem, the Nyquist rate r is given by r = 2B, where B is the bandwidth.

C = 2B (1/2) log2 [ 1 + 12S/(λ²N) ] = B log2 [ 1 + 12S/(λ²N) ]

Substituting λ² = 12,

C = B log2 (1 + S/N)  bits/sec

The Shannon- Hartley theorem specifies the rate at which the information may be
transmitted with small probability of error. Thus Shannon- Hartley theorem contemplates that,
with a sufficiently sophisticated transmission technique, transmission at channel capacity is
possible with arbitrarily small probability of error
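For quick numerical work with the theorem, a short Python sketch is handy (illustrative only; the
3.4 kHz bandwidth and 30 dB SNR are the values assumed later in Problem 32).

from math import log2

def capacity(B_hz, snr_linear):
    """Shannon-Hartley capacity C = B log2(1 + S/N), in bits per second."""
    return B_hz * log2(1 + snr_linear)

snr = 10 ** (30 / 10)            # 30 dB -> 1000 (linear)
print(capacity(3400, snr))       # ~33.9e3 bits/sec for a 3.4 kHz telephone channel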

SIGNIFICANCE OF SHANNON- HARTLEY THEOREM

The significance of the Shannon-Hartley theorem is that it is possible to transmit symbols
over an AWGN channel of bandwidth B Hz at a rate of C bits/second with an arbitrarily small
probability of error, provided the signal is encoded in such a manner that the transmitted samples
resemble Gaussian noise. This can be achieved by orthogonal codes.

The Shannon- Hartley theorem gives an idea about Bandwidth - SNR trade off.

BANDWIDTH - SNR TRADE OFF

The trade off is the exchange of bandwidth with signal to noise power ratio.

Suppose S/N = 7 and B = 4 kHz; then C = 12,000 bits/sec.

Suppose S/N = 15 and B = 3 kHz; then C = 12,000 bits/sec.

That is, for a reduction in bandwidth we require an increase in the signal to noise ratio. We
know that

C = B log2 (1 + S/N),  i.e.  B/C = 1 / log2 (1 + S/N)

By plotting B/C against S/N, we get the trade-off curve.

From the figure we see that as the SNR increases the required bandwidth decreases. For SNR > 10,
the reduction in bandwidth obtained by further increasing the SNR is poor. The use of a larger
bandwidth for a smaller SNR is called coding upwards (e.g. FM, PM, PCM), and the use of a smaller
bandwidth for a larger SNR is called coding downwards (e.g. PAM). This trade-off can also be seen
when we take the capacity of a channel of infinite bandwidth.

CAPACITY OF A CHANNEL OF INFINITE BANDWIDTH

By the Shannon-Hartley theorem, C = B log2 (1 + S/N).

When the bandwidth increases, the capacity of the channel increases, but it does not become
infinite, because with the increase in bandwidth the noise power also increases (N = ηB). Thus
the capacity approaches an upper limit with increasing bandwidth.

C = B log2 (1 + S/N)
  = B log2 (1 + S/ηB)
  = (S/η) (ηB/S) log2 (1 + S/ηB)
  = (S/η) log2 (1 + S/ηB)^(ηB/S)

We know that  lim(x→0) (1 + x)^(1/x) = e.

When the bandwidth tends to infinity,

C∞ = lim(B→∞) C = lim(B→∞) (S/η) log2 (1 + S/ηB)^(ηB/S)

As B → ∞, S/ηB → 0, so

C∞ = (S/η) log2 e

C∞ = 1.44 S/η

ORTHOGONAL SIGNAL TRANSMISSION

Shannon has suggested an idealized communication system to achieve transmission at


capacity R ≈ C where different samples of white noise waveforms of average power „S‟ is used
for transmission. The system uses orthogonal signals S1(t), S2 (t),……..SM(t). These orthogonal
signals have the property that

∫0 to T Si(t) Sj(t) dt = ∫0 to T Si²(t) dt = ES,  if i = j
                      = 0,                      if i ≠ j

When S1(t) is transmitted, the output of all the correlators (combinations of multipliers and
integrators) except the first correlator will be zero, due to the orthogonality property.

The output of the 1st correlator is  e1 = ∫0 to T S1(t) S1(t) dt = ∫0 to T S1²(t) dt = ES

When noise is added,  e1 = ∫0 to T [S1(t) + n(t)] S1(t) dt = ES + nk

Provided the signal energy ES is large compared with the noise term nk, the decision circuit has
no difficulty in identifying the transmitted signal. Here ES is the energy of the signal.

SHANNON’S LIMIT

To find Shannon's limit, we have to find the minimum value of the bit energy to noise density
ratio Eb/N0.

In practical channels, the noise power spectral density N0 is generally a constant. If Eb is the
transmitted energy per bit, then the average transmitted power is S = Eb C ............ (1)

The two-sided power spectral density of the AWGN channel is N0/2. Here SN(f) and RN(σ) form
a Fourier transform pair:

RN(σ) = ∫-B to B (N0/2) e^(j2πfσ) df

      = (N0/2) [ e^(j2πfσ) / (j2πσ) ] evaluated from -B to B

      = (N0/2) ( e^(j2πBσ) - e^(-j2πBσ) ) / (j2πσ)

      = N0 sin(2πBσ) / (2πσ) = N0 B sinc(2σB)

That is, the total noise power N = RN(0) = N0 B ............ (2)

Applying equation (1) and equation (2) in the Shannon-Hartley theorem, we get

C = B log2 (1 + S/N) = B log2 (1 + Eb C / (N0 B))

Here,

2^(C/B) = 1 + Eb C / (N0 B)   =>   Eb C / (N0 B) = 2^(C/B) - 1

Eb/N0 = ( 2^(C/B) - 1 ) / (C/B)

Here C/B is the bandwidth efficiency.

If C/B = 1 then Eb/N0 = (2^1 - 1)/1 = 1, which implies Eb = N0; the signal energy per bit equals
the noise power spectral density.

We have  C = B log2 (1 + Eb C / (N0 B))

The channel capacity is maximum when we utilise the maximum possible bandwidth, i.e. when
B → ∞. Using lim(x→0) (1 + x)^(1/x) = e, the maximum capacity is

Cmax = lim(B→∞) B log2 (1 + Eb Cmax / (N0 B)) = (Eb Cmax / N0) log2 e = 1.44 Eb Cmax / N0

Therefore, in this limit,

(Eb/N0)min = lim(B→∞) Eb/N0 = 1/1.44 = 0.6944  (more exactly, ln 2 = 0.693)

(Eb/N0)min = 10 log10 (0.6944) = -1.58 dB  (about -1.6 dB)

This is Shannon's Limit.
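The relation Eb/N0 = (2^(C/B) - 1)/(C/B) and its limiting value can be checked with a short
Python sketch (illustrative only; the chosen values of C/B are arbitrary).

from math import log10

def eb_n0(bw_efficiency):
    """Required Eb/N0 for a given bandwidth efficiency C/B (bits/s/Hz)."""
    r = bw_efficiency
    return (2 ** r - 1) / r

for r in (2.0, 1.0, 0.1, 0.001):
    x = eb_n0(r)
    print(r, x, 10 * log10(x))
# As C/B -> 0, Eb/N0 -> ln 2 = 0.693, i.e. about -1.6 dB (Shannon's limit)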


PROBLEMS

32. A voice grade channel of a telephone network has a bandwidth of 3.4 kHz.
a. Calculate the channel capacity of the telephone channel for a signal to noise ratio
of 30dB.
b. Calculate the minimum signal to noise ratio required to support information
transmitted through the telephone channel at a rate of 4800bits per seconds.

Ans:
a. B = 3.4 kHz, SNR = 30dB
10 x log 10 SNR = 30
log 10 SNR = 3
SNR = 1000
We have

C = B log2 (1 + SNR)
  = 3400 x log2 (1 + 1000)
  ≈ 33.9 kbps

b. B = 3.4 kHz, C = 4.8 kbps

C = B log2 (1 + SNR)
4800 = 3400 x log2 (1 + SNR)
log2 (1 + SNR) = 1.41   =>   1 + SNR = 2.66   =>   SNR = 1.66

In decibels, SNR = 10 log10 (1.66) = 2.2 dB

33. A communication system employs a continuous source. The channel noise is white and
Gaussian. The bandwidth of the source output is 10MHz and SNR is 100.
a. Determine the channel capacity.
b. If the SNR drops to 10, how much bandwidth is needed to achieve the same
channel capacity?
c. If the bandwidth is decreased to 1MHz, how much SNR is required to maintain
the same channel capacity?


Ans:
a. B = 10 MHz, SNR = 100

C = B log2 (1 + SNR) = 10^7 x log2 (1 + 100) = 66.5 Mbps

b. SNR = 10, C = 66.5 Mbps

C = B log2 (1 + SNR)
66.5 x 10^6 = B log2 (1 + 10)
B = 19.22 MHz

c. B = 1 MHz, C = 66.5 Mbps

C = B log2 (1 + SNR)
66.5 x 10^6 = 10^6 x log2 (1 + SNR)
log2 (1 + SNR) = 66.5
SNR = 1.04 x 10^20

In decibels, SNR = 10 log10 (1.04 x 10^20) = 200.2 dB

34. Alphanumeric data are entered into a computer from a remote terminal through a
voice grade telephone channel. The channel has a bandwidth of 3.4 kHz and output
signal to noise power ratio of 20dB. The terminal has a total of 128 symbols which
may be assumed to occur with equal probability and that the successive transmissions
are statistically independent.
a. Calculate the channel capacity.
b. Calculate the maximum symbol rate for which error free transmission over the
channel is possible?

Ans:
a. B = 3.4 kHz, SNR = 20dB
10 x log 10 SNR = 20
SNR = 100


C  B log 1  SNR 
2

 3400  log 2 1  100


 22640 bps

b. Hmax = log 2 128 = 7bps, C = 22640bps


We have R = r x H. That is C = r x H.
C 22640
r   3234bps
H 7

35. A black and white television picture may be viewed as consisting of approximately
3x105 elements; each one of which may occupy one of 10 distinct brightness levels with
equal probability. Assume the rate of transmission to be 30 picture frames per second,
and the signal to noise ratio is 30dB. Using channel capacity theorem, calculate the
minimum bandwidth required to support the transmission of the resultant video
signal?

Ans:

10 310
5
No. of different pictures possible =
310 5
Entropy, H = log 2 10

 3  105 log 2 10
 3  105  3.32
 9.97  105

Capacity, C = R = r x H
= 30 x 9.97 x 105 = 29.9 x 106
We know that C = B log 2 (1+SNR)
29.9 x 106 = B log 2 (1+1000)
B = 3MHz


MODULE II


MODULE II (QUANTITATIVE APPROACH)

Introduction to rings , fields, and Galois fields.

Codes for error detection & correction - parity check coding - linear block codes -
error detecting and correcting capabilities - generator and parity check matrices -
Standard array and syndrome decoding –Perfect codes, Hamming codes - encoding
and decoding, cyclic codes – polynomial and matrix descriptions- generation of
cyclic codes, decoding of cyclic codes, BCH codes - description & decoding,
Reed-Solomon Codes, Burst error correction.


RINGS, FIELDS & GALOIS FIELDS

Rings

A ring is a set R with two composition laws "+" and "·" such that

a. (R, +) is a commutative group;

b. "·" is associative, and there exists an element 1R such that a · 1R = a = 1R · a for
all a ∈ R;

c. the distributive laws hold: for all a, b, c ∈ R,

(a + b) · c = a · c + b · c
a · (b + c) = a · b + a · c

A subring S of a ring R is a subset that contains 1 R and is closed under addition, passage
to the negative, and multiplication. It inherits the structure of a ring from that on R.

A homomorphism of rings φ : R → R' is a map with the properties

φ(a + b) = φ(a) + φ(b),  φ(ab) = φ(a) φ(b),  φ(1R) = 1R',  for all a, b ∈ R

A ring R is said to be commutative if multiplication is commutative:

ab  ba, for all a, b  R

A commutative ring is said to be an integral domain if 1R ≠ 0 and the cancellation law holds for
multiplication:

ab = ac, a ≠ 0, implies b = c

An ideal I in a commutative ring R is a subgroup of (R, +) that is closed under multiplication by
elements of R:

r ∈ R, a ∈ I implies ra ∈ I


Fields
A field is a nonempty set F of elements with two operations „+‟ (called addition) and „·‟
(called multiplication) satisfying the following axioms. For all a, b, c ∈ F:

1. F is closed under + and · ; i.e., a + b and a · b are in F.


2. Commutative laws: a + b = b + a, a · b = b · a.

3. Associative laws: (a + b) + c = a + (b + c), a · (b · c) = (a · b) · c.


4. Distributive law: a · (b + c) = a · b + a · c.

Furthermore, two distinct identity elements 0 and 1 (called the additive and multiplicative
identities, respectively) must exist in F satisfying the following:

1. a + 0 = a for all a ∈ F.

2. a · 1 = a and a · 0 = 0 for all a ∈ F.

3. For any a in F, there exists an additive inverse element (−a) in F such that a + (−a) = 0.

4. For any a ≠ 0 in F, there exists a multiplicative inverse element a⁻¹ in F such that a · a⁻¹ = 1.

We usually write a · b simply as ab, and denote by F∗ the set F\{0}.
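All of the coding arithmetic in the following sections is carried out in the binary field
GF(2) = {0, 1}, in which addition is exclusive-OR and multiplication is AND. A minimal Python
sketch of these two operations (an illustration only, not part of the formal definitions above):

def gf2_add(a, b):
    """Addition in GF(2): exclusive-OR."""
    return (a + b) % 2

def gf2_mul(a, b):
    """Multiplication in GF(2): logical AND."""
    return (a * b) % 2

# The non-zero element 1 is its own additive and multiplicative inverse:
print(gf2_add(1, 1), gf2_mul(1, 1))   # 0 1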


NEED FOR CHANNEL CODING OR ERROR CONTROL CODING

When we use the source coding technique (Shannon Fano, Huffman coding, Lempel Ziv
Algorithm or run length coding) we get variable length codes for the source symbols

Suppose we are transmitting the message sequence S3S4S1 and the corresponding transmitted bit
stream is '1101110'. If a one-bit error occurs and the received sequence becomes '0101110', it
will be decoded as S1S2S4S1, which is entirely different from the transmitted message. This is one of the
disadvantages of using variable length coding. The second disadvantage of variable length
coding is that the output data rate varies randomly.

To avoid these problems we go for the error control coding technique or channel coding
technique.

CHANNEL CODING

During the transmission process, the transmitted signal will pass through certain noisy
channel. Due to noise interference some errors are introduced in the received data. These errors
can be independent errors or burst errors.

Independent errors are caused by the Gaussian noise or the thermal noise. Burst errors are
caused mainly by the impulse noise. The errors caused by these noises can be detected and
corrected by using proper coding technique.


ERROR CONTROL CODING TECHNIQUES

1. Error detecting with re-transmission


2. Forward acting error correction method

1. ERROR DETECTING WITH RE-TRANSMISSION

Here, when an error is detected, an ARQ (Automatic Repeat reQuest) is sent back to the
transmitter. Then the transmitter will retransmit
the entire data.

2. FORWARD ACTING ERROR CORRECTING METHOD

In this method, errors are detected and corrected by proper coding techniques at the
receiver side.

DATA TRANSMISSION SYSTEM

The actual message originates from the information source. The amount of information
can be measured in bits or nats or decits. The source encoder transforms this information into a
sequence of binary digits „u‟ by applying different source coding techniques. The source encoder
is designed such that the bit rate is minimized and also the information can be uniquely

reconstructed from the sequence 'u'. Then the channel encoder transforms the sequence 'u' into
the encoded sequence 'v' by channel coding techniques. That is, the channel encoder adds some
extra bits, called check bits, to the message sequence 'u'. This encoded sequence is
transmitted through the channel after modulation process. Demodulation will separate the
individual „v‟ sequence and passes it to the channel decoder. The channel decoder identifies the
extra bits or check bits added by the channel encoder and uses them to detect and correct the
errors in the transmitted message if there are any. The source decoder decodes the encoded
message sequence and will recover the original message. Finally the original message reaches at
the destination.

HAMMING DISTANCE

The hamming distance between two code vectors is equal to the number of elements in
which they differ.

MINIMUM DISTANCE (dmin)

Minimum distance is the minimum Hamming distance between any two code vectors.


CODE EFFICIENCY
Code Efficiency = (Message Bits In A Block) / (Transmitted Bits For A Block)

VECTOR WEIGHT
The number of non zero elements in the code vector is known as vector weight.
E.g. X = [1 1 1 1 0 0 0 1]. Then vector weight = 5.

CODES FOR ERROR DETECTION AND CORRECTION

1. PARITY CHECK CODING

In this method extra bits or parity check bits are added to the message at the time of
transmission. At the receiver end, the receiver checks these parity bits. Errors are detected if the
expected pattern of parity bits is not received. Two types of parity checking mechanisms are:

Even Parity Checking


In even parity checking, the code vector (including the parity bit) contains an even number of ones.
E.g. 111001.

Odd Parity Checking


In odd parity checking, the code vector (including the parity bit) contains an odd number of ones.
E.g. 111000.

There are two types of parity check coding techniques


i. Vertical redundancy check (VRC)
ii. Longitudinal redundancy check (LRC)

VERTICAL REDUNDANCY CHECK

In this method each character of the message is converted into its ASCII code.

Case 1: Even Parity Checking


If the code vector contains odd number of ones, then we add one to the original message.


E.g.:- Let the original message be 1110000, then the transmitted message is
11100001.
If the code vector contains even number of ones, then we add zero to the original
message.
E.g.:- Let the original message be 1111000, then the transmitted message is
11110000.

Case 2: Odd Parity Checking


If the code vector contains even number of ones, then we add one to the original message.
E.g.:- Let the original message be 1110001, then the transmitted message is
11100011.
If the code vector contains odd number of ones, then we add zero to the original message.
E.g.:- Let the original message be 1110000, then the transmitted message is
11100000.

Advantages

Let the transmitted word be 11110011, which has even parity. Assume a one-bit error occurs and
the received word is 01110011, which has odd parity. The receiver can detect the error because it
is expecting an even parity, and it sends an ARQ request to the transmitter to re-transmit the message.

Disadvantages

1. By this method we cannot detect more than one bit errors.


2. The errors cannot be corrected, because the receiver doesn‟t know which bit is in error.

LONGITUDINAL REDUNDANCY CHECK

The structure of transmitting block of message is as follows:-


For example, if we are transmitting the message 'INDIA', we write the ASCII codes of I, N, D, I
and A.

The bit b8 is obtained by checking the parity of each column. If a column contains even
number of ones, then we add 0. If a column contains odd number of ones, then we add 1.

Here by checking BCC & b8, the receiver can detect the correct position of error. Thus,
the receiver can correct the error. When double errors occur in a row, the BCC will not change.
But the receiver can detect error with the help of b8.

Advantages

1. In LRC method, single error can be detected & corrected.


2. Double & triple errors can be detected.

LINEAR CODES

A code is said to be linear code if the sum of two code vectors will produce another code
vector. That is if v1 and v2 are any code vectors of length n of the block code then v1+v2 is also a
code word of length n of the block code.

SYSTEMATIC CODES

In a systematic block code the message bits appear at the beginning of the code word, followed by
the check bits.

LINEAR BLOCK CODES

The various linear block codes used for error detection and correction are hamming
codes, cyclic codes, Single parity check bits codes, BCH codes, Reed Soloman codes, etc.

If we want to transmit „k‟ number of message bits the channel encoder adds some extra
bits or check bits to this encoded message bits. Let the number of these additional bits be „q‟.
Then the total number of bits at the output of the channel encoder is „k + q‟ and it is represented
as „n‟. That is “n = k + q”. This type of coding method is known as (n, k) linear block codes.

ENCODING OF LINEAR BLOCK CODES

The code vector consists of message (k) bits and check (q) bits. Let the code vector be X.

X  m1 , m2 , m3 ,......, mk , c1 , c2 , c3 , ........., cq ............................. 1


where q  n  k

The code vectors are obtained by multiplying the message vector with a generator matrix
which is used to generate the check bits.
GENERATOR MATRIX

X = [x1 x2 ...... xn] = [m1 m2 ...... mk] G

where G is the k x n generator matrix used to generate the check bits,

    | g11  g12  .....  g1n |
G = | g21  g22  .....  g2n |
    | ....  ....  .....  .... |
    | gk1  gk2  .....  gkn |

i.e.  X = M(1 x k) G(k x n) ................................ (2)

The generator matrix can be constructed as

G = [ I(k x k) : P(k x q) ] ................................ (3)

where I(k x k) is a k x k identity matrix and P(k x q) is the parity matrix,

    | P11  P12  .....  P1q |
P = | P21  P22  .....  P2q |
    | .....  .....  .....  ..... |
    | Pk1  Pk2  .....  Pkq |

The check vector can be obtained as

[c1 c2 ...... cq] = [m1 m2 ...... mk] P ................................ (4)

On solving (with modulo-2 addition) we get

c1 = m1 P11 + m2 P21 + ..................... + mk Pk1
c2 = m1 P12 + m2 P22 + ..................... + mk Pk2
.......................................................................
cq = m1 P1q + m2 P2q + ..................... + mk Pkq ................................ (5)

PARITY CHECK MATRIX (H)

 
H  PT : I q  q ................................6

 P11 P12 ..... P1q 


P P22 ..... P2 q 
where P is Parity matrix, P  
21

 ..... ..... ..... ..... 


 
 pk1 pk 2 ..... pkq 

If the generator matrix is given, then we get this parity check matrix.
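Equations (2) to (6) translate directly into matrix arithmetic over GF(2). The Python sketch below
is only an illustration (it uses NumPy and, as an assumed example, the parity matrix of Problem 1
that follows); it builds G = [I : P] and H = [P^T : I] and lists all 2^k code vectors.

import numpy as np
from itertools import product

P = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])                    # parity matrix of the (6, 3) code in Problem 1
k, q = P.shape

G = np.hstack([np.eye(k, dtype=int), P])     # G = [I_k : P]
H = np.hstack([P.T, np.eye(q, dtype=int)])   # H = [P^T : I_q]

for m in product([0, 1], repeat=k):
    x = np.dot(m, G) % 2                     # X = M G  (modulo-2)
    assert not np.any(np.dot(x, H.T) % 2)    # every code vector satisfies X H^T = 0
    print(m, x)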

PROBLEMS

1. The generator matrix for a (6, 3) block code is shown below. Obtain all the code
vectors?
1 0 0 0 1 1
G  0 1 0 1 0 1
0 0 1 1 1 0

Ans:

Here (6, 3) code means n = 6 & k = 3. Therefore q = n – k = 6 – 3 = 3



We know that the Generator Matrix, G = [ I(k x k) : P(k x q) ]

That is, G = [ I(3 x 3) : P(3 x 3) ]

Thus we get

    | 0  1  1 |
P = | 1  0  1 |
    | 1  1  0 |

For getting the code vectors, we have to find the additional bits / check bits. We know that

[c1 c2 c3] = [m1 m2 m3] P

On multiplying (modulo-2) we get

c1 = m2 + m3
c2 = m1 + m3
c3 = m1 + m2

For obtaining all the code vectors, we give values to m1, m2 and m3. The number of messages
that can be constructed using 3 bits is 2^3 = 8, that is 000, 001, ...., 111.

Another method:

These code vectors can be simply obtained by the equation X = [M] [G]

1 0 0 0 1 1
X 1  0 0 00 1 0 1 0 1  0 0 0 0 0 0
0 0 1 1 1 0

1 0 0 0 1 1
X 2  0 0 10 1 0 1 0 1  0 0 1 1 1 0
0 0 1 1 1 0

1 0 0 0 1 1
X 3  0 1 00 1 0 1 0 1  0 1 0 1 0 1
0 0 1 1 1 0

1 0 0 0 1 1
X 4  0 1 10 1 0 1 0 1  0 1 1 0 1 1
0 0 1 1 1 0

1 0 0 0 1 1
X 5  1 0 00 1 0 1 0 1  1 0 0 0 1 1
0 0 1 1 1 0

1 0 0 0 1 1
X 6  1 0 10 1 0 1 0 1  1 0 1 1 0 1
0 0 1 1 1 0

1 0 0 0 1 1
X 7  1 1 00 1 0 1 0 1  1 1 0 1 1 0
0 0 1 1 1 0

1 0 0 0 1 1
X 8  1 1 10 1 0 1 0 1  1 1 1 0 0 0
0 0 1 1 1 0

The encoding circuit of the (6, 3) linear block code is shown as follows:-


At first the message bits m1, m2 and m3 to be encoded are shifted into the message register and
simultaneously onto the channel through the commutator (switch). The commutator then moves to
the positions c1, c2 and c3, and the check bits are shifted onto the channel. Thus the code vector
X = (m1, m2, m3, c1, c2, c3) is transmitted.

2. For a systematic (6, 3) linear block code the parity matrix P is shown below. Obtain all
the code vectors?
1 0 1
P  0 1 1
1 1 0

Ans:

Here (6, 3) code means n = 6 and k = 3. Therefore q = n - k = 6 - 3 = 3

We know that the Generator Matrix, G = [ I(k x k) : P(k x q) ], that is G = [ I(3 x 3) : P(3 x 3) ]

    | 1  0  0  1  0  1 |
G = | 0  1  0  0  1  1 |
    | 0  0  1  1  1  0 |

We know that

[x1 x2 x3 x4 x5 x6] = [m1 m2 m3] G
                    = [m1  m2  m3  m1+m3  m2+m3  m1+m2]

For obtaining all the code vectors, we give values to m1, m2 and m3. The number of messages
that can be constructed using 3 bits is 2^3 = 8, that is 000, 001, ...., 111.

At first the message bits m1 , m2 and m3 to be encoded is shifted in the message register &
simultaneously into the channel through the commutator or switch. When the commutator comes
to the position c1 , c2 & c3. Thus the check bits are shifted to the channel. Thus the code vector, X
= (m1 , m2, m3, c1 , c2 & c3) is transmitted.


The encoding circuit of the (6, 3) linear block code is shown as follows:-

ERROR DETECTING AND ERROR CORRECTING CAPABILITY OF LINEAR


BLOCK CODES

1. A linear block code can detect up to s errors per word if

   dmin ≥ s + 1

2. It can correct up to t errors per word if

   dmin ≥ 2t + 1,  if dmin is odd
   dmin ≥ 2t + 2,  if dmin is even

Thus if dmin is odd we can correct t = (dmin - 1)/2 errors, and if dmin is even, t = (dmin - 2)/2.
This is called the random error correcting capability of block codes.
For the (6, 3) code shown earlier, dmin is equal to the minimum weight of any non-
zero code vector. Thus we get dmin = 3 and t = 1. That is, a single error can be corrected, so it
is a single error correcting code (SEC). Thus the (6, 3) block code is capable of correcting a
single error in the block of 6 digits.
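Given the list of code vectors, dmin (and hence s and t) can be found by inspecting weights. A
short Python sketch, illustrative only, using the generator matrix of the (6, 3) code of Problem 1
as an assumed example:

from itertools import product
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 1, 1, 0]])

codewords = [tuple(np.dot(m, G) % 2) for m in product([0, 1], repeat=3)]

# For a linear code, dmin equals the smallest weight of a non-zero code vector
d_min = min(sum(c) for c in codewords if any(c))
s = d_min - 1                     # detectable errors
t = (d_min - 1) // 2              # correctable errors
print(d_min, s, t)                # 3, 2, 1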


SYNDROME DECODING OF LINEAR BLOCK CODES

Let „X‟ be a valid code vector at the transmitter. At the receiver, the received code vector
is „Y‟.
At transmitter, XHT = 0.
Here X is the transmitted code vector.
And H is the parity check matrix.
When there is no error, we will receive the transmitted code vector itself. Let the received
code vector is „Y‟.
If there is no error X =Y and since XHT = 0. Therefore YHT = 0--------- (1).

If there is an error, then YHT ≠ 0. The non-zero output of the product YHT is
called the syndrome, and this syndrome is used to detect the errors.

S = YHT ---------- (2)


This syndrome is a q-bit vector.
Now let us consider an n-bit vector E, which represents the positions of the transmission
errors in Y.
E.g. If X = 1 0 1 1 0 be the transmitted message.
Y = 1 0 0 1 0 be the received message with one bit error.
Then E = 0 0 1 0 0 is the error vector. Here 1 represents the position of error
Another property of error vector is that
Y  X  E
..................(3)
X  Y  E
On substituting (3) in (2), we get

S   X  E H T
 XH T  EH T
 0  EH T
 EH T
Each syndrome vector corresponds to a particular error pattern. Each syndrome vector
says which bit is in error.


Block diagram of a syndrome decoder for linear block code to correct errors.

Here the received n – bit vector Y is stored in n – bit register. From this vector the
syndrome is calculated using S = YHT. Thus HT is stored in the syndrome calculator. The q-bit
syndrome vector is then applied to look up table to find out the error pattern. Thus we get
corresponding E. This E is added to vector Y.
Y E  X
Thus we get the corrected message X.
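The look-up-table decoder in the block diagram can be sketched in Python as follows (an
illustration only; it assumes single-error patterns and uses the parity check matrix of the (6, 3)
code of Problem 3 below):

import numpy as np

H = np.array([[1, 0, 1, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 1, 0, 0, 0, 1]])          # parity check matrix of Problem 3
n = H.shape[1]

# Look-up table: syndrome -> single-error pattern
table = {}
for i in range(n):
    e = np.zeros(n, dtype=int); e[i] = 1
    table[tuple(np.dot(e, H.T) % 2)] = e

Y = np.array([1, 1, 0, 0, 1, 0])            # received vector of Problem 3
S = tuple(np.dot(Y, H.T) % 2)
X = (Y + table.get(S, np.zeros(n, dtype=int))) % 2
print(S, X)                                 # (1, 0, 0) and [1 1 0 1 1 0]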

PROBLEMS:

3. For a systematic (6, 3) linear block code the parity matrix P is shown below. The
received code vector is R= [1 1 0 0 1 0]. Detect & correct the single error that has
occurred due to noise?

1 0 1
P  0 1 1
1 1 0


Ans:

Here, we have

    | 1  0  1 |             | 1  0  1 |
P = | 0  1  1 | ;  so P^T =  | 0  1  1 |
    | 1  1  0 |             | 1  1  0 |

We know that

H = [ P^T : I(q x q) ] = | 1  0  1  1  0  0 |
                         | 0  1  1  0  1  0 |
                         | 1  1  0  0  0  1 |

The syndrome S = R H^T:

[s1 s2 s3] = [1 1 0 0 1 0] H^T = [1 0 0]

The syndrome is found by modulo 2 addition and multiplication. The syndrome vector
S = [1 0 0] is present in the 4th row of HT. Hence the 4th bit in the received vector, counting from
the left, is in error. The corrected code vector is therefore [1 1 0 1 1 0].

4. Repetition code represent simplest type of linear block code. The generator matrix of
a (5, 1) repetition code is given by
G  1 1 1 1 1

a. Write the parity check matrix.


b. Evaluate all the errors for single & double error patterns.

Ans:

It is given that n = 5 and k = 1 for the (5, 1) repetition code. Since k = 1 there is only
one message bit and the remaining four bits are check bits. We know that

[G] = [ I : P ]

Here, for k = 1, the identity matrix Ik is the 1 x 1 matrix [Ik] = [I1] = [1].

So the parity matrix is P = [1 1 1 1].

a. The parity check matrix is given by [H] = [ I : P^T ]:

    | 1  0  0  0  1 |
H = | 0  1  0  0  1 |
    | 0  0  1  0  1 |
    | 0  0  0  1  1 |

b. The message vector [M] can be either [0] or [1].

When [M] = [0]:  [X] = [M][G] = [0] [1 1 1 1 1] = [0 0 0 0 0]
When [M] = [1]:  [X] = [M][G] = [1] [1 1 1 1 1] = [1 1 1 1 1]

To find the syndrome of an error pattern we use [S] = [E][H^T], where

      | 1  0  0  0 |
      | 0  1  0  0 |
H^T = | 0  0  1  0 |
      | 0  0  0  1 |
      | 1  1  1  1 |

There are five single error patterns, with syndromes:

E = [1 0 0 0 0]  =>  S = [1 0 0 0]
E = [0 1 0 0 0]  =>  S = [0 1 0 0]
E = [0 0 1 0 0]  =>  S = [0 0 1 0]
E = [0 0 0 1 0]  =>  S = [0 0 0 1]
E = [0 0 0 0 1]  =>  S = [1 1 1 1]

There are ten double error patterns, with syndromes:

E = [1 1 0 0 0]  =>  S = [1 1 0 0]
E = [1 0 1 0 0]  =>  S = [1 0 1 0]
E = [1 0 0 1 0]  =>  S = [1 0 0 1]
E = [1 0 0 0 1]  =>  S = [0 1 1 1]
E = [0 1 1 0 0]  =>  S = [0 1 1 0]
E = [0 1 0 1 0]  =>  S = [0 1 0 1]
E = [0 1 0 0 1]  =>  S = [1 0 1 1]
E = [0 0 1 1 0]  =>  S = [0 0 1 1]
E = [0 0 1 0 1]  =>  S = [1 1 0 1]
E = [0 0 0 1 1]  =>  S = [1 1 1 0]

STANDARD ARRAY

The standard array is used for the syndrome decoding of linear block codes. Let C1, C2,
C3, ....., C(2^k) denote the 2^k code vectors of an (n, k) linear block code. Let R denote the
received vector, which may take any one of the 2^n possible values.

The receiver has the task of partitioning the 2^n possible received vectors into 2^k disjoint
subsets D1, D2, D3, ....., D(2^k) in such a way that Di corresponds to the code vector Ci for
1 ≤ i ≤ 2^k. The received vector r is decoded into Ci if it lies in the i-th subset.

The 2^k subsets designed here constitute the standard array of the linear block code. To
construct the standard array the procedure is as follows:

1. The 2^k code vectors are placed in a row with the all-zero code vector C1 as the left-most
element.

2. From the remaining (2^n - 2^k) n-tuples, an error pattern E2 is chosen and placed under C1,
and the second row is formed by adding E2 to each of the remaining code vectors in the
first row. It is important that the error pattern chosen as the first element of a row has not
previously appeared in the standard array.

3. Step 2 is repeated until all the possible error patterns have been accounted for.

The 2^(n-k) rows of the array represent the cosets of the code, and their first elements e2,
e3, ....., e(2^(n-k)) are called coset leaders. Using the standard array, the decoding procedure is
as follows:

1. For a received vector r, compute the syndrome S = r H^T.

2. Within the coset characterized by the syndrome S, identify the coset leader (i.e. the
error pattern with the largest probability of occurrence). Call this coset leader e0.

3. Compute the code vector C = r + e0 as the decoded version of the received vector r.

Thus we get the corrected output, and the standard array can be used for syndrome decoding.

PROBLEMS
5. Construct a standard array for the (6, 3) linear block code whose parity matrix P is
shown below.
1 0 1
P  0 1 1
1 1 0

Also decode & correct the error if any if the received vector is (a) 100100 & (b) 000011.
Ans:

Here (6, 3) code means n = 6 and k = 3. Therefore q = n - k = 6 - 3 = 3

We know that the Generator Matrix, G = [ I(k x k) : P(k x q) ], that is G = [ I(3 x 3) : P(3 x 3) ]

    | 1  0  0  1  0  1 |
G = | 0  1  0  0  1  1 |
    | 0  0  1  1  1  0 |

We know that

[x1 x2 x3 x4 x5 x6] = [m1 m2 m3] G
                    = [m1  m2  m3  m1+m3  m2+m3  m1+m2]

For obtaining all the code vectors, we give values to m1, m2 and m3. The number of messages
that can be constructed using 3 bits is 2^3 = 8, that is 000, 001, ...., 111.

The code vectors are 000000, 001110, 010011, 011101, 100101, 101011, 110110 & 111000.
Now to find the standard array. The standard array is constructed using the step by step
procedure.

Decoding using standard array.

a. Let the received vector be [100100]


On looking to standard array the error corresponding to the received vector [100100] is
[000001]. So the corrected code vector is Y + E = [100100] + [000001] = [100101].


b. Let the received vector be [000011]


On looking to standard array the error corresponding to the received vector [000011] is
[010000]. So the corrected code vector is Y + E = [000011] + [010000] = [010011].

PERFECT CODES
A code which has the property

t = (dmin - 1) / 2

where t is the number of errors that can be corrected and dmin is the minimum distance, is
called a perfect code. The Hamming codes, which have the parameters n = 2^q - 1, dmin = 3, t = 1,
are examples of perfect codes.

HAMMING CODES

Hamming codes are defined as (n, k) linear block codes. These codes satisfy the
following conditions
1. The number of check bits, q ≥ 3.
2. Block length n = 2 q -1.
3. Number of message bits, k = n – q.
4. Minimum distance, dmin = 3.

ERROR DETECTING AND ERROR CORRECTING CAPABILITY OF HAMMING


CODES

It can be used to detect double errors and can correct single errors, since its d min = 3.
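A compact Python sketch of single-error correction with a (7, 4) Hamming code (illustrative only;
the parity matrix assumed here is the one obtained in Problem 6 below, and the message and error
position are arbitrary choices):

import numpy as np

P = np.array([[1, 1, 1],
              [1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]])                       # parity matrix of the (7, 4) code in Problem 6
G = np.hstack([np.eye(4, dtype=int), P])        # G = [I_4 : P]
H = np.hstack([P.T, np.eye(3, dtype=int)])      # H = [P^T : I_3]

m = np.array([1, 0, 1, 1])
x = np.dot(m, G) % 2                            # transmitted code vector
y = x.copy(); y[2] ^= 1                         # introduce a single error in the 3rd bit

s = np.dot(y, H.T) % 2                          # syndrome = column of H at the error position
pos = next(i for i in range(7) if np.array_equal(H[:, i], s))
y[pos] ^= 1                                     # flip the erroneous bit
print(np.array_equal(y, x))                     # True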

PROBLEMS

6. The parity check matrix of a (7, 4) linear block code is expressed as


1 1 1 0 1 0 0
H  1 1 0 1 0 1 0
1 0 1 1 0 0 1

a. Obtain the Generator Matrix (G)?



b. What is the minimum distance between the code vectors?


c. How many errors can be detected & how many errors can be corrected?
d. List all the code vectors
e. Draw the encoder for this code?

Ans:

1. Here the (7, 4) code means n = 7 and k = 4. Therefore q = n - k = 7 - 4 = 3.
Also n = 2^q - 1, i.e. 7 = 2^3 - 1. Therefore it is a Hamming code.

We know that the Parity Check Matrix, H = [ P^T : I(q x q) ], that is H = [ P^T : I(3 x 3) ].

Thus we get

      | 1  1  1  0 |          | 1  1  1 |
P^T = | 1  1  0  1 |  =>  P = | 1  1  0 |
      | 1  0  1  1 |          | 1  0  1 |
                              | 0  1  1 |

We know that the Generator Matrix, G = [ I(k x k) : P(k x q) ] = [ I(4 x 4) : P(4 x 3) ]

We get

    | 1  0  0  0  1  1  1 |
G = | 0  1  0  0  1  1  0 |
    | 0  0  1  0  1  0  1 |
    | 0  0  0  1  0  1  1 |

2. The minimum distance or dmin of a Hamming code is 3.

3. Since it is a Hamming code, it can detect up to 2 errors and correct 1 error.

4. To obtain all the code vectors we have to find the check bits. We know that

[c1 c2 c3] = [m1 m2 m3 m4] P

On substituting,

                             | 1  1  1 |
[c1 c2 c3] = [m1 m2 m3 m4] x | 1  1  0 |
                             | 1  0  1 |
                             | 0  1  1 |

On multiplying (modulo-2) we get

c1 = m1 + m2 + m3
c2 = m1 + m2 + m4
c3 = m1 + m3 + m4

For obtaining all the code vectors, we give values to m1, m2, m3 and m4. The number of
messages that can be constructed using 4 bits is 2^4 = 16, that is 0000, 0001, ...., 1111.

5. Encoder

The switch S is connected to the message register first and all the message bits are
transmitted. The switch is then connected to the check bit register and the check bits are
transmitted. This forms the block of 7 bits.

SYNDROME DECODING OF HAMMING CODES

It is same as the syndrome decoding of linear block codes which is explained earlier. The
error correction using the syndrome vector is also the same as that of linear block codes. The two
main properties of syndrome vector are that,

1. It is a q-bit vector.
2. There will be 2q – 1 non zero syndromes/ error

PROBLEMS

7. The parity check matrix of a (7, 4) linear block code is expressed as


1 1 1 0 1 0 0
H  0 1 1 1 0 1 0
1 1 0 1 0 0 1

Evaluate the syndrome vector for single bit of errors?

Ans:

Here the (7, 4) code means n = 7 and k = 4. Therefore q = n - k = 7 - 4 = 3.

Since q = 3, the syndrome S is a 3-bit vector, and there are 2^3 - 1 = 7 non-zero
syndromes, one for each single-bit error. We know that S = E H^T, where

      | 1  0  1 |
      | 1  1  1 |
      | 1  1  0 |
H^T = | 0  1  1 |
      | 1  0  0 |
      | 0  1  0 |
      | 0  0  1 |

The seven single-error patterns and their syndromes are:

E = [1 0 0 0 0 0 0]  =>  S = [1 0 1]
E = [0 1 0 0 0 0 0]  =>  S = [1 1 1]
E = [0 0 1 0 0 0 0]  =>  S = [1 1 0]
E = [0 0 0 1 0 0 0]  =>  S = [0 1 1]
E = [0 0 0 0 1 0 0]  =>  S = [1 0 0]
E = [0 0 0 0 0 1 0]  =>  S = [0 1 0]
E = [0 0 0 0 0 0 1]  =>  S = [0 0 1]

NON-SYSTEMATIC HAMMING CODES

The construction of a non-systematic Hamming code is done by the following procedure:

1. Write the binary coded decimal (BCD) representations of length n - k for the decimals 1 to n.

2. Arrange the sequences in bit-reversed order in a matrix.

3. The transpose of this matrix gives the parity check matrix H for the code.

Here the parity bits are placed in the positions 2^i, where i = 0, 1, 2, 3, ...

PROBLEMS

8. Obtain the (7, 4) non-systematic hamming code? Also find the syndrome for one bit
error?

Ans:

Step 1: BCD of length n – k from 1 to n

0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1

Step 2: Bit Reversing.

1 0 0
0 1 0
 
1 1 0
 
H T
 0 0 1
1 0 1
 
0 1 1
1 1
 1 


Step 3: Transpose.

1 0 1 0 1 0 1
H  0 1 1 0 0 1 1
0 0 0 1 1 1 1

We know that XH^T = 0, where X is the transmitted code vector. If the message bits are
m1, m2, m3 and m4, the parity bits are added in the 2^i positions, that is the 1st (2^0), 2nd (2^1)
and 4th (2^2) positions. Let us take the parity bits as p1, p2 and p3.

Therefore, X = [p1 p2 m1 p3 m2 m3 m4]. Take XHT = 0.

Therefore,

                                  | 1  0  0 |
                                  | 0  1  0 |
                                  | 1  1  0 |
X H^T = [p1 p2 m1 p3 m2 m3 m4] x  | 0  0  1 |  =  [0 0 0]
                                  | 1  0  1 |
                                  | 0  1  1 |
                                  | 1  1  1 |

we get

p1 + m1 + m2 + m4 = 0
p2 + m1 + m3 + m4 = 0
p3 + m2 + m3 + m4 = 0

Since the addition is modulo-2, each parity bit equals the sum of the message bits in its equation:

p1 = m1 + m2 + m4
p2 = m1 + m3 + m4
p3 = m2 + m3 + m4

Thus we can construct code vectors corresponding to the messages 0000 to 1111


In the non-systematic Hamming code there is a useful property: the syndrome directly indicates
the position of the error. If there is any error, then the syndrome S = YH^T, where Y is the
received message, and we know that S = EH^T.

The seven single-error patterns and their syndromes (S = E H^T) are:

E = [1 0 0 0 0 0 0]  =>  S = [1 0 0]
E = [0 1 0 0 0 0 0]  =>  S = [0 1 0]
E = [0 0 1 0 0 0 0]  =>  S = [1 1 0]
E = [0 0 0 1 0 0 0]  =>  S = [0 0 1]
E = [0 0 0 0 1 0 0]  =>  S = [1 0 1]
E = [0 0 0 0 0 1 0]  =>  S = [0 1 1]
E = [0 0 0 0 0 0 1]  =>  S = [1 1 1]

Read with the last syndrome bit as the most significant bit, each syndrome is the binary
representation of the error position, which is the special feature of this construction.

CYCLIC CODES

A binary code is said to be a cyclic code if it satisfies two properties:

1. Linearity property: The sum of two codewords is also an existing codeword.

2. Cyclic property: Any cyclic (end-around) shift of a codeword will produce an existing codeword.

ENCODING OF CYCLIC CODES

Let X be the "n"-bit code vector X = (x_(n-1), x_(n-2), ………, x_1, x_0).
On a cyclic shift we get X′ = (x_(n-2), x_(n-3), ………, x_1, x_0, x_(n-1)).

We can define the code vector X as a polynomial:

X(p) = x_(n-1) p^(n-1) + x_(n-2) p^(n-2) + ……… + x_1 p + x_0 .................. (1)

X′(p) = x_(n-2) p^(n-1) + x_(n-3) p^(n-2) + ……… + x_1 p^2 + x_0 p + x_(n-1) .................. (2)

where "p" is an arbitrary variable and the power of p represents the position of the codeword bit; p^(n-1) represents the MSB and p^0 the LSB. Then we get

p X(p) = X′(p) + x_(n-1) (p^n + 1) .................. (3)

X(p) = M(p) G(p) .................. (4)

Here, X(p) is the code vector polynomial,

M(p) is the message vector polynomial and

G(p) is the generating polynomial.

G(p) should be a factor of p^n + 1 .................. (5)
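Since the non-systematic codeword is just the GF(2) product M(p)G(p), the encoding can be sketched with integer bit-arithmetic, where bit i of an integer stands for the coefficient of p^i (the names and the integer representation below are illustrative choices, not part of the notes).

# Sketch of X(p) = M(p) G(p) over GF(2); polynomials are integers, bit i <-> coefficient of p^i.
def gf2_multiply(a, b):
    product = 0
    while b:
        if b & 1:
            product ^= a          # modulo-2 (X-OR) addition of the partial product
        a <<= 1
        b >>= 1
    return product

if __name__ == "__main__":
    g = 0b1011                    # G(p) = p^3 + p + 1, the generator used in Problem 9 below
    for m in range(16):           # the 4-bit message taken directly as M(p)
        print(format(m, "04b"), "->", format(gf2_multiply(m, g), "07b"))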


PROBLEMS

9. The generator polynomial of a (7, 4) cyclic code is given as G(p) = p^3 + p + 1. Obtain all the code vectors of the code in non-systematic and systematic form.

Ans:

1. To obtain the non-systematic cyclic code

Here n = 7 and k = 4, then q = n – k = 7 – 4 = 3

We know that X(p) = M(p) G(p)

Given that G(p) = p^3 + p + 1

For message m = [0 0 0 0], M(p) = 0p^3 + 0p^2 + 0p + 0
X(p) = M(p) G(p) = (p^3 + p + 1)(0p^3 + 0p^2 + 0p + 0)        (X-OR addition)
     = 0p^6 + 0p^5 + 0p^4 + 0p^3 + 0p^2 + 0p + 0   →   X = [0 0 0 0 0 0 0]

For message m = [0 0 0 1], M(p) = 0p^3 + 0p^2 + 0p + 1
X(p) = M(p) G(p) = (p^3 + p + 1)(0p^3 + 0p^2 + 0p + 1)        (X-OR addition)
     = 0p^6 + 0p^5 + 0p^4 + p^3 + 0p^2 + p + 1   →   X = [0 0 0 1 0 1 1]

For message m = [0 0 1 0], M(p) = 0p^3 + 0p^2 + p + 0
X(p) = M(p) G(p) = (p^3 + p + 1)(0p^3 + 0p^2 + p + 0)        (X-OR addition)
     = 0p^6 + 0p^5 + p^4 + 0p^3 + p^2 + p + 0   →   X = [0 0 1 0 1 1 0]

For message m = [0 0 1 1], M(p) = 0p^3 + 0p^2 + p + 1
X(p) = M(p) G(p) = (p^3 + p + 1)(0p^3 + 0p^2 + p + 1)        (X-OR addition)
     = 0p^6 + 0p^5 + p^4 + p^3 + p^2 + 0p + 1   →   X = [0 0 1 1 1 0 1]

For message m = [0 1 0 0], M(p) = 0p^3 + p^2 + 0p + 0
X(p) = M(p) G(p) = (p^3 + p + 1)(0p^3 + p^2 + 0p + 0)        (X-OR addition)
     = 0p^6 + p^5 + 0p^4 + p^3 + p^2 + 0p + 0   →   X = [0 1 0 1 1 0 0]


For message m = [0 1 0 1], M(p) = 0p^3 + p^2 + 0p + 1
X(p) = M(p) G(p) = (p^3 + p + 1)(0p^3 + p^2 + 0p + 1)        (X-OR addition)
     = 0p^6 + p^5 + 0p^4 + 0p^3 + p^2 + p + 1   →   X = [0 1 0 0 1 1 1]

For message m = [0 1 1 0], M(p) = 0p^3 + p^2 + p + 0
X(p) = M(p) G(p) = (p^3 + p + 1)(0p^3 + p^2 + p + 0)        (X-OR addition)
     = 0p^6 + p^5 + p^4 + p^3 + 0p^2 + p + 0   →   X = [0 1 1 1 0 1 0]

For message m = [0 1 1 1], M(p) = 0p^3 + p^2 + p + 1
X(p) = M(p) G(p) = (p^3 + p + 1)(0p^3 + p^2 + p + 1)        (X-OR addition)
     = 0p^6 + p^5 + p^4 + 0p^3 + 0p^2 + 0p + 1   →   X = [0 1 1 0 0 0 1]

For message m = [1 0 0 0], M(p) = p^3 + 0p^2 + 0p + 0
X(p) = M(p) G(p) = (p^3 + p + 1)(p^3 + 0p^2 + 0p + 0)        (X-OR addition)
     = p^6 + 0p^5 + p^4 + p^3 + 0p^2 + 0p + 0   →   X = [1 0 1 1 0 0 0]

For message m = [1 0 0 1], M(p) = p^3 + 0p^2 + 0p + 1
X(p) = M(p) G(p) = (p^3 + p + 1)(p^3 + 0p^2 + 0p + 1)        (X-OR addition)
     = p^6 + 0p^5 + p^4 + 0p^3 + 0p^2 + p + 1   →   X = [1 0 1 0 0 1 1]

For message m = [1 0 1 0], M(p) = p^3 + 0p^2 + p + 0
X(p) = M(p) G(p) = (p^3 + p + 1)(p^3 + 0p^2 + p + 0)        (X-OR addition)
     = p^6 + 0p^5 + 0p^4 + p^3 + p^2 + p + 0   →   X = [1 0 0 1 1 1 0]

For message m = [1 0 1 1], M(p) = p^3 + 0p^2 + p + 1
X(p) = M(p) G(p) = (p^3 + p + 1)(p^3 + 0p^2 + p + 1)        (X-OR addition)
     = p^6 + 0p^5 + 0p^4 + 0p^3 + p^2 + 0p + 1   →   X = [1 0 0 0 1 0 1]

For message m = [1 1 0 0], M(p) = p^3 + p^2 + 0p + 0
X(p) = M(p) G(p) = (p^3 + p + 1)(p^3 + p^2 + 0p + 0)        (X-OR addition)
     = p^6 + p^5 + p^4 + 0p^3 + p^2 + 0p + 0   →   X = [1 1 1 0 1 0 0]



For message m = [1 1 0 1], M(p) = p^3 + p^2 + 0p + 1
X(p) = M(p) G(p) = (p^3 + p + 1)(p^3 + p^2 + 0p + 1)        (X-OR addition)
     = p^6 + p^5 + p^4 + p^3 + p^2 + p + 1   →   X = [1 1 1 1 1 1 1]

For message m = [1 1 1 0], M(p) = p^3 + p^2 + p + 0
X(p) = M(p) G(p) = (p^3 + p + 1)(p^3 + p^2 + p + 0)        (X-OR addition)
     = p^6 + p^5 + 0p^4 + 0p^3 + 0p^2 + p + 0   →   X = [1 1 0 0 0 1 0]

For message m = [1 1 1 1], M(p) = p^3 + p^2 + p + 1
X(p) = M(p) G(p) = (p^3 + p + 1)(p^3 + p^2 + p + 1)        (X-OR addition)
     = p^6 + p^5 + 0p^4 + p^3 + 0p^2 + 0p + 1   →   X = [1 1 0 1 0 0 1]


2. To obtain the systematic cyclic code

For obtaining the systematic code vectors, first we have to find the check bits. The polynomial of check bits C(p) is obtained by

C(p) = remainder [ p^q M(p) / G(p) ]

For message m = [0 0 0 0], M(p) = 0p^3 + 0p^2 + 0p + 0

C(p) = remainder [ p^3 (0p^3 + 0p^2 + 0p + 0) / (p^3 + p + 1) ]
     = remainder [ 0 / (p^3 + p + 1) ] = 0

C(p) = 0p^2 + 0p + 0   →   C = [0 0 0]   →   X = [0 0 0 0 0 0 0]

For message m = [0 0 0 1], M(p) = 0p^3 + 0p^2 + 0p + 1

C(p) = remainder [ p^3 (0p^3 + 0p^2 + 0p + 1) / (p^3 + p + 1) ]
     = remainder [ p^3 / (p^3 + p + 1) ] = p + 1

C(p) = 0p^2 + p + 1   →   C = [0 1 1]   →   X = [0 0 0 1 0 1 1]

For message m = [0 0 1 0], M(p) = 0p^3 + 0p^2 + p + 0

C(p) = remainder [ p^3 (0p^3 + 0p^2 + p + 0) / (p^3 + p + 1) ]
     = remainder [ p^4 / (p^3 + p + 1) ] = p^2 + p

C(p) = p^2 + p + 0   →   C = [1 1 0]   →   X = [0 0 1 0 1 1 0]

For message m = [0 0 1 1], M(p) = 0p^3 + 0p^2 + p + 1

C(p) = remainder [ p^3 (0p^3 + 0p^2 + p + 1) / (p^3 + p + 1) ]
     = remainder [ (p^4 + p^3) / (p^3 + p + 1) ] = p^2 + 1

C(p) = p^2 + 0p + 1   →   C = [1 0 1]   →   X = [0 0 1 1 1 0 1]

For message m = [0 1 0 0], M(p) = 0p^3 + p^2 + 0p + 0

C(p) = remainder [ p^3 (0p^3 + p^2 + 0p + 0) / (p^3 + p + 1) ]
     = remainder [ p^5 / (p^3 + p + 1) ] = p^2 + p + 1

C(p) = p^2 + p + 1   →   C = [1 1 1]   →   X = [0 1 0 0 1 1 1]

For message m = [0 1 0 1], M(p) = 0p^3 + p^2 + 0p + 1

C(p) = remainder [ p^3 (0p^3 + p^2 + 0p + 1) / (p^3 + p + 1) ]
     = remainder [ (p^5 + p^3) / (p^3 + p + 1) ] = p^2

C(p) = p^2 + 0p + 0   →   C = [1 0 0]   →   X = [0 1 0 1 1 0 0]

For message m = [0 1 1 0], M(p) = 0p^3 + p^2 + p + 0

C(p) = remainder [ p^3 (0p^3 + p^2 + p + 0) / (p^3 + p + 1) ]
     = remainder [ (p^5 + p^4) / (p^3 + p + 1) ] = 1

C(p) = 0p^2 + 0p + 1   →   C = [0 0 1]   →   X = [0 1 1 0 0 0 1]

For message m = [0 1 1 1], M(p) = 0p^3 + p^2 + p + 1

C(p) = remainder [ p^3 (0p^3 + p^2 + p + 1) / (p^3 + p + 1) ]
     = remainder [ (p^5 + p^4 + p^3) / (p^3 + p + 1) ] = p

C(p) = 0p^2 + p + 0   →   C = [0 1 0]   →   X = [0 1 1 1 0 1 0]

For message m = [1 0 0 0], M(p) = p^3 + 0p^2 + 0p + 0

C(p) = remainder [ p^3 (p^3 + 0p^2 + 0p + 0) / (p^3 + p + 1) ]
     = remainder [ p^6 / (p^3 + p + 1) ] = p^2 + 1

C(p) = p^2 + 0p + 1   →   C = [1 0 1]   →   X = [1 0 0 0 1 0 1]

For message m = [1 0 0 1], M(p) = p^3 + 0p^2 + 0p + 1

C(p) = remainder [ p^3 (p^3 + 0p^2 + 0p + 1) / (p^3 + p + 1) ]
     = remainder [ (p^6 + p^3) / (p^3 + p + 1) ] = p^2 + p

C(p) = p^2 + p + 0   →   C = [1 1 0]   →   X = [1 0 0 1 1 1 0]

For message m = [1 0 1 0], M(p) = p^3 + 0p^2 + p + 0

C(p) = remainder [ p^3 (p^3 + 0p^2 + p + 0) / (p^3 + p + 1) ]
     = remainder [ (p^6 + p^4) / (p^3 + p + 1) ] = p + 1

C(p) = 0p^2 + p + 1   →   C = [0 1 1]   →   X = [1 0 1 0 0 1 1]

For message m = [1 0 1 1], M(p) = p^3 + 0p^2 + p + 1

C(p) = remainder [ p^3 (p^3 + 0p^2 + p + 1) / (p^3 + p + 1) ]
     = remainder [ (p^6 + p^4 + p^3) / (p^3 + p + 1) ] = 0

C(p) = 0p^2 + 0p + 0   →   C = [0 0 0]   →   X = [1 0 1 1 0 0 0]

For message m = [1 1 0 0], M(p) = p^3 + p^2 + 0p + 0

C(p) = remainder [ p^3 (p^3 + p^2 + 0p + 0) / (p^3 + p + 1) ]
     = remainder [ (p^6 + p^5) / (p^3 + p + 1) ] = p

C(p) = 0p^2 + p + 0   →   C = [0 1 0]   →   X = [1 1 0 0 0 1 0]

For message m = [1 1 0 1], M(p) = p^3 + p^2 + 0p + 1

C(p) = remainder [ p^3 (p^3 + p^2 + 0p + 1) / (p^3 + p + 1) ]
     = remainder [ (p^6 + p^5 + p^3) / (p^3 + p + 1) ] = 1

C(p) = 0p^2 + 0p + 1   →   C = [0 0 1]   →   X = [1 1 0 1 0 0 1]

For message m = [1 1 1 0], M(p) = p^3 + p^2 + p + 0

C(p) = remainder [ p^3 (p^3 + p^2 + p + 0) / (p^3 + p + 1) ]
     = remainder [ (p^6 + p^5 + p^4) / (p^3 + p + 1) ] = p^2

C(p) = p^2 + 0p + 0   →   C = [1 0 0]   →   X = [1 1 1 0 1 0 0]

For message m = [1 1 1 1], M(p) = p^3 + p^2 + p + 1

C(p) = remainder [ p^3 (p^3 + p^2 + p + 1) / (p^3 + p + 1) ]
     = remainder [ (p^6 + p^5 + p^4 + p^3) / (p^3 + p + 1) ] = p^2 + p + 1

C(p) = p^2 + p + 1   →   C = [1 1 1]   →   X = [1 1 1 1 1 1 1]
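All sixteen systematic codewords above can be checked with a short Python division sketch (helper names are illustrative; polynomials are again integers with bit i as the coefficient of p^i).

# Sketch of C(p) = remainder[p^q M(p) / G(p)] and X = [message | check bits] over GF(2).
def gf2_remainder(num, den):
    d_len = den.bit_length()
    while num.bit_length() >= d_len:
        num ^= den << (num.bit_length() - d_len)   # cancel the current leading term
    return num

def systematic_codeword(m, g, n=7, k=4):
    q = n - k
    c = gf2_remainder(m << q, g)                   # check bits C(p)
    return format((m << q) | c, "0{}b".format(n))  # X = message bits followed by check bits

if __name__ == "__main__":
    g = 0b1011                                     # G(p) = p^3 + p + 1
    for m in range(16):
        print(format(m, "04b"), "->", systematic_codeword(m, g))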


GENERATOR AND PARITY CHECK MATRICES OF CYCLIC CODES

The generator matrix of a cyclic code has "k" rows and "n" columns. The generator polynomial of a cyclic code can be expressed as

G(p) = p^q + g_(q-1) p^(q-1) + ……… + g_1 p + 1

To find the generator matrix, multiply both sides by p^i, where i = (k-1), (k-2), ….. 2, 1, 0. The above equation then becomes

p^i G(p) = p^(i+q) + g_(q-1) p^(i+q-1) + ……… + g_1 p^(i+1) + p^i

Parity check matrix, H = [P^T : I]

PROBLEMS

10. Find the generator matrix corresponding to G(p) = p^3 + p^2 + 1 for a (7, 4) cyclic code.

Ans: Here n=7, k=4 and q=3


G(p) = p^3 + p^2 + 1
p^i G(p) = p^(3+i) + p^(2+i) + p^i,   where i = k-1, k-2, … 2, 1, 0;  that is, i = 3, 2, 1, 0

The generator matrix will have "k" rows, i.e. 4 rows, and "n" columns, i.e. 7 columns.
Find p^i G(p) for all rows:

For row 1: i = 3,  p^3 G(p) = p^6 + p^5 + p^3
For row 2: i = 2,  p^2 G(p) = p^5 + p^4 + p^2
For row 3: i = 1,  p^1 G(p) = p^4 + p^3 + p
For row 4: i = 0,  p^0 G(p) = p^3 + p^2 + 1

Thus we get,

          | 1 1 0 1 0 0 0 |
G(4×7) =  | 0 1 1 0 1 0 0 |
          | 0 0 1 1 0 1 0 |
          | 0 0 0 1 1 0 1 |


ENCODER FOR CYCLIC CODES

Operation:

The feedback switch is first closed and the output switch is connected to the message input. Initially all the shift registers are cleared to zero. With the switch closed, the message bits are shifted to the transmitter and simultaneously into the shift registers. After the "k" message bits have been shifted in, the registers contain the "q" check bits. The feedback switch is then opened and the output switch is connected to the check-bit position, so that all the check bits are transmitted.

So the transmitted code vector consists of the "k" message bits followed by the "q" check bits.

SYNDROME DECODING OF CYCLIC CODES:

In cyclic codes, errors may occur during transmission, and syndrome decoding is used to correct these errors. Let "X" be the transmitted code vector, "Y" the received code vector and "E" the error vector, so that E = X ⊕ Y. Equivalently, the received code vector is Y = X ⊕ E.

In polynomial form, Y(p) = X(p) + E(p).


We know that X(p) = M(p) G(p), where M(p) is the message vector polynomial and G(p) is the generator polynomial.

Therefore,  Y(p) = M(p) G(p) + E(p) .................. (1)

Dividing by G(p),

Y(p) / G(p) = M(p) + E(p) / G(p)

The remainder of this division is unaffected by the message term, so the syndrome is

Syndrome, S = remainder [ Y(p) / G(p) ] = remainder [ E(p) / G(p) ]

SYNDROME CALCULATOR

Here the syndrome calculation is done with the help of shift registers. Initially the contents of all the shift registers are set to zero.

With gate 2 ON and gate 1 OFF, the received code vector is entered into the shift register. After the entire received vector has been shifted in, the contents of the registers form the syndrome, which can be shifted out of the register by turning gate 1 ON and gate 2 OFF.

Once the syndrome is calculated, the error pattern corresponding to that syndrome is determined using a combinational logic circuit.

Therefore the corrected code vector is X = Y ⊕ E.
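A software sketch of this procedure for the (7, 4) code with G(p) = p^3 + p + 1 is given below (the helper gf2_remainder and all names are assumptions carried over from the earlier sketch, and only single-bit error patterns are tabulated).

# Sketch of syndrome decoding: S = remainder[Y(p)/G(p)], then X = Y + E (mod 2).
def gf2_remainder(num, den):
    d_len = den.bit_length()
    while num.bit_length() >= d_len:
        num ^= den << (num.bit_length() - d_len)
    return num

def single_error_table(g, n=7):
    # syndrome of each single-bit error pattern E(p) = p^pos
    return {gf2_remainder(1 << pos, g): 1 << pos for pos in range(n)}

def correct(y, g, n=7):
    s = gf2_remainder(y, g)
    if s == 0:
        return y                                   # syndrome zero: accept Y as X
    return y ^ single_error_table(g, n).get(s, 0)  # X = Y + E (mod 2)

if __name__ == "__main__":
    g, x = 0b1011, 0b1001110                       # a valid codeword from Problem 9
    y = x ^ (1 << 4)                               # introduce a single-bit error
    print(format(correct(y, g), "07b"))            # recovers 1001110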



ADVANTAGES & DISADVANTAGES OF CYCLIC CODES:

ADVANTAGES:

1. The encoding and syndrome calculation of cyclic codes are simple and easy to implement using shift registers.

2. The encoders and decoders for cyclic codes are simpler compared to non cyclic codes.

Disadvantages:

1. Error correction is complicated since the combinational logic circuit is complex.

BCH CODES (BOSE-CHAUDHURI-HOCQUENGHEM CODES)

They are among the most important and powerful error-correcting cyclic codes. For any positive integer m ≥ 3, there exists a BCH code with the following parameters:

Block length, n = 2^m – 1

Number of message bits, k ≥ n – mt

Minimum distance, d_min ≥ 2t + 1, where "t" is the number of errors that can be corrected.

Advantage:

1. Flexibility in the choice of block length

DECODING OF BCH CODES

Suppose a code word

v(X) = v0 + v1 X + v2 X^2 + ……… + v(n-1) X^(n-1)

is transmitted and the transmission errors result in the received vector

r(X) = r0 + r1 X + r2 X^2 + ……… + r(n-1) X^(n-1)

Let e(X) be the error pattern. Then

r(X) = v(X) + e(X) .................. (1)


The first step in decoding is to compute the syndrome from the received vector r(X). For a t-error-correcting primitive BCH code, the syndrome is a 2t-tuple,

S = (S1, S2, ……, S2t) = r · H^T

where H is given by

    | 1   α      α^2        α^3        ………  α^(n-1)       |
    | 1   α^2    (α^2)^2    (α^2)^3    ………  (α^2)^(n-1)   |
H = | 1   α^3    (α^3)^2    (α^3)^3    ………  (α^3)^(n-1)   |
    | :   :      :          :          ………  :             |
    | 1   α^2t   (α^2t)^2   (α^2t)^3   ………  (α^2t)^(n-1)  |

From the equations for S and H, the i-th component of the syndrome is

Si = r(α^i) = r0 + r1 α^i + r2 α^(2i) + ……… + r(n-1) α^((n-1)i),    for 1 ≤ i ≤ 2t.

Note that the syndrome components are elements of the field GF(2^m). They can be computed from r(X) as follows:

1. Dividing r(X) by the minimal polynomial φi(X) of α^i, we obtain

r(X) = ai(X) φi(X) + bi(X)

where bi(X) is the remainder, with degree less than that of φi(X). Since φi(α^i) = 0, we have

Si = r(α^i) = bi(α^i)

Thus, the syndrome component Si is obtained by evaluating bi(X) at X = α^i.
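As a concrete illustration, the Python sketch below evaluates syndrome components Si = r(α^i) over GF(2^3), assuming the primitive polynomial X^3 + X + 1 (so α^3 = α + 1); the field construction, the received vector and all names are illustrative assumptions, not taken from the notes.

# Sketch: S_i = r(alpha^i) over GF(2^3) built from the primitive polynomial X^3 + X + 1.
M_BITS, PRIMITIVE = 3, 0b1011
N = 2 ** M_BITS - 1                          # n = 7

EXP = [1]                                    # EXP[i] = alpha^i as a 3-bit field element
for _ in range(N - 1):
    nxt = EXP[-1] << 1                       # multiply the previous power by alpha
    if nxt & (1 << M_BITS):
        nxt ^= PRIMITIVE                     # reduce modulo the primitive polynomial
    EXP.append(nxt)

def syndrome_component(r_bits, i):
    # S_i = r0 + r1*alpha^i + r2*alpha^(2i) + ... + r_(n-1)*alpha^((n-1)i)
    s = 0
    for j, r_j in enumerate(r_bits):
        if r_j:
            s ^= EXP[(i * j) % N]
    return s

if __name__ == "__main__":
    r = [0, 0, 0, 0, 1, 0, 0]                # received r(X) = X^4 (coefficients r0..r6)
    for i in (1, 2):
        print("S%d =" % i, format(syndrome_component(r, i), "03b"))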

REED-SOLOMON CODES (RS CODES)

Instead of individual information bits, Reed-Solomon codes work with "symbols", where a symbol is a group of bits.

In other block codes, n = k + q, where "n" is the total number of individual bits in a code word, "k" is the number of message bits and "q" is the number of check bits.


But in an RS code, n = k + q, where n is the total number of "symbols" in a code word, k is the number of information symbols and q is the number of parity or check symbols.

If a symbol contains m bits,

Block length, n = 2^m – 1

Minimum distance, d_min = 2t + 1

Number of check symbols, q = n – k = 2t, where "t" is the number of symbol errors that can be corrected.

The Reed-Solomon code is also called a maximum distance separable (MDS) code.
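These parameter relations are easy to tabulate; the snippet below is only an arithmetic illustration in Python (the function name is assumed).

# Sketch of the RS parameter relations: n = 2^m - 1, q = 2t, k = n - 2t, d_min = 2t + 1.
def rs_parameters(m, t):
    n = 2 ** m - 1          # block length in symbols
    q = 2 * t               # parity (check) symbols
    k = n - q               # information symbols
    d_min = 2 * t + 1       # minimum distance
    return n, k, q, d_min

print(rs_parameters(8, 16))  # (255, 223, 32, 33): an 8-bit-symbol code correcting 16 symbol errors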

BURST ERROR CORRECTION

There are occasions when the average bit error rate is small, but error-correcting codes are still not effective because the errors are clustered; that is, in one region a large percentage of the bits are in error. This is called a burst error.

A typical source of such bursts is static in radio transmission, for example due to lightning.

The primary techniques for overcoming error bursts are "block interleaving" and "convolutional interleaving".

BLOCK INTERLEAVING

An effective method for dealing with burst-error channels is to interleave the coded data in such a way that the bursty channel is transformed into a channel with independent errors. A burst of errors of length "b" is defined as a sequence of b bits in which the first and the last bits are in error.

An (n, k) systematic code, which has (n – k) = q check bits, can correct bursts of length

b ≤ (n – k) / 2


A block diagram of a system that employs interleaving is shown in figure

The encoded data is reordered by the interleaver and transmitted over the channel. At the
receiver after demodulation, the deinterleaver puts the data in proper sequence and passes it to
the decoder. As a result of interleaving/ deinterleaving, error bursts are spread out in time so that
errors within a code word appear to be independent.

The block interleaver formats the encoded data in a rectangular array of "m" rows and "n" columns, each row of the array constituting a code word of length "n". The bits are written into the rectangular array row-wise and read out column-wise for transmission over the channel.


At the receiver, the deinterleaver stores the data in the same rectangular array format, but it is read out row-wise, one codeword at a time. As a result, a burst of errors is broken up into what appear as independent errors.
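A minimal Python sketch of the write-row-wise / read-column-wise operation and its inverse (the names and the toy data are illustrative):

# Sketch of block interleaving: write into an m x n array row-wise, read out column-wise.
def interleave(symbols, m, n):
    assert len(symbols) == m * n
    return [symbols[row * n + col] for col in range(n) for row in range(m)]

def deinterleave(symbols, m, n):
    assert len(symbols) == m * n
    return [symbols[col * m + row] for row in range(m) for col in range(n)]

if __name__ == "__main__":
    data = list(range(12))                  # three codewords of length 4 (m = 3, n = 4)
    tx = interleave(data, 3, 4)
    print(tx)                               # a burst in tx now spans different codewords
    print(deinterleave(tx, 3, 4) == data)   # True: the receiver restores the original order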

CONVOLUTIONAL INTERLEAVING

Convolutional interleavers are better matched for use with convolutional codes. They are also called periodic interleavers.

The successive symbols of an (n, k) code word are delayed by {0, b, 2b, ………, (n-1)b} symbol units respectively. As a result, the symbols of one code word are placed at a distance of b symbol units in the channel stream, and bursts of length "b", separated by a guard space of about (n-1) × b units, affect only one symbol per code word. In the receiver, the codewords are reassembled through complementary delay units and decoded to correct the single errors so generated.

HAMMING BOUND

We know that for an (n, k) block code there are 2^q – 1 distinct non-zero syndromes. There are nC1 single-error patterns, nC2 double-error patterns, nC3 triple-error patterns, and so on.

Therefore, to correct "t" errors per word,

2^q – 1 ≥ nC1 + nC2 + nC3 + …… + nCt


That is,  2^q ≥ Σ (i = 0 to t) nCi

We know that n – k = q, so

2^(n-k) ≥ Σ (i = 0 to t) nCi

Taking logarithm to base 2 on both sides,

n – k ≥ log2 [ Σ (i = 0 to t) nCi ]

Dividing both sides by n, we get

1 – k/n ≥ (1/n) log2 [ Σ (i = 0 to t) nCi ] .................. (1)

Since the code rate r = k/n, equation (1) becomes

1 – r ≥ (1/n) log2 [ Σ (i = 0 to t) nCi ]

This relation between the code rate "r" and "t" is called the Hamming bound, where t is the number of errors that can be corrected.
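A quick numeric check of the bound in Python (illustrative names; math.comb, available from Python 3.8, gives nCi):

# Sketch: verify 2^(n-k) >= sum_{i=0}^{t} nCi for a given (n, k) code and target t.
from math import comb

def satisfies_hamming_bound(n, k, t):
    return 2 ** (n - k) >= sum(comb(n, i) for i in range(t + 1))

print(satisfies_hamming_bound(7, 4, 1))   # True: 2^3 = 8 = 1 + 7
print(satisfies_hamming_bound(7, 4, 2))   # False: 8 < 1 + 7 + 21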
