
Please read this disclaimer before proceeding:

This document is confidential and intended solely for the educational purpose of
RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only for the respective group /
learning community for which it is meant. If you are not the addressee, you should
not disseminate, distribute or copy this document through e-mail. Please notify the
sender immediately by e-mail if you have received this document by mistake and
delete it from your system. If you are not the intended recipient, you are notified
that disclosing, copying, distributing or taking any action in reliance on the
contents of this information is strictly prohibited.
R.M.K ENGINEERING COLLEGE

20EC501 Digital Communication

Department : ECE
Batch/Year : 2020-2024 / III Year
Created by : Dr. J. Jasmine Hephzipah, AP/ECE, RMKEC
             Ms. S. Roselin, AP/ECE, RMKEC
             Dr. Ganesh, ASP/ECE, RMKEC
Date       : 30.07.2022
Table of Contents
S.No Contents Slide No

1 Course Objectives 7

2 Pre Requisites 8

3 Syllabus 9

4 Course outcomes 11

5 CO- PO/PSO Mapping 12

6 Unit 1 – INFORMATION THEORY 13

6.1 Lecture plan 15

6.2 Activity based learning 17

6.3 Lecture Notes 26

 Video Lecture Links 27

 Unit 1 Learning Material 28

 Introduction 29

 Information 30

 Entropy 32

 Mutual Information 40

 Discrete Memoryless Channels 44

 Types of Channels 45

 Channel Capacity 49

 Channel Coding Theorem 50


Table of Contents
S.No Contents Slide No

 Maximum Entropy for Gaussian Channel 52

 Channel Capacity Theorem 53

 Source Coding Theorem 56

 Coding Techniques 58

 Problems 64

6.4 Assignments 82

6.5 Part A Questions & Answers 87

6.6 Part B Questions 94

6.7 Supportive Online Certification Courses 98

6.8 Real Time Applications in day to day life and to Industry 100

6.9 Content beyond the Syllabus 104

7 Assessment Schedule 108

8 Prescribed Text Books & Reference Books 110

9 Mini Project suggestions 113


COURSE OBJECTIVES

This course aims to enable the student:


To study the limits set by Information Theory
To study the various waveform coding schemes
To learn the various baseband transmission schemes
To understand the various band pass signaling schemes
To know the fundamentals of channel coding
PRE REQUISITES

Subject Name :Signals and Systems


Subject Code : EC8352
Semester :3
Reason : Students should be familiar with classification of signals and Fourier
Transform

Subject Name :Communication Theory


Subject Code : EC8491
Semester :4
Reason : Students should be familiar with fundamentals of communication,
Modulation types, Sampling, Quantization and Pulse Code Modulation

Subject Name : Probability and Random Processes


Subject Code : MA8451
Semester :4
Reason : Students should be familiar with Probability and random processes
Syllabus
20EC501 DIGITAL COMMUNICATION

OBJECTIVES:
• To study the limits set by information theory
• To study the various waveform coding schemes
• To learn the various baseband transmission schemes
• To understand the various band pass signaling schemes
• To know the fundamentals of channel coding
UNIT I INFORMATION THEORY 9
Discrete Memory less source, Information, Entropy, Mutual Information – Discrete
Memory less channels – Binary Symmetric Channel, Channel Capacity - Hartley –
Shannon law – Source coding theorem - Shannon - Fano & Huffman codes.

UNIT II WAVEFORM CODING & REPRESENTATION 9


Prediction filtering and DPCM - Delta Modulation - ADPCM & ADM principles-Linear
Predictive Coding- Properties of Line codes- Power Spectral Density of Unipolar /
Polar RZ & NRZ – Bipolar NRZ - Manchester

UNIT III BASEBAND TRANSMISSION & RECEPTION 9


ISI – Nyquist criterion for distortion less transmission – Pulse shaping – Correlative
coding – Eye pattern – Receiving Filters- Matched Filter, Correlation receiver,
Adaptive Equalization

UNIT IV DIGITAL MODULATION SCHEME 9


Geometric Representation of signals - Generation, detection, PSD & BER of Coherent
BPSK, BFSK & QPSK - QAM – Carrier Synchronization – Timing Synchronization-
Structure of Non-coherent ,Receivers – Principle of DPSK.
Syllabus (cont’d.)
UNIT V ERROR CONTROL CODING 9
Channel coding theorem - Linear Block codes - Hamming codes - Cyclic codes –

Convolutional codes - Viterbi Decoder.

TOTAL:45 PERIODS

TEXT BOOKS:
1. Haykin S, Digital Communications, John Wiley, 2005.
2. Sklar B, Digital Communication Fundamentals and Applications, Pearson
Education, Second Edition, 2009.
REFERENCES:
1. Proakis J.G, Digital Communication, Tata Mc Graw Hill Company, Fifth Edition,
2018.
2. Lathi B. P, Modern Digital and Analog Communication Systems, Oxford University
Press, Third Edition, 2007.
3. Hsu H.P, Schaum’s Outline Series – Analog and Digital Communications, Tata Mc
Graw Hill Company, Third Edition, 2006.
4. Roody D, Coolen J, Electronic Communications, PHI, Fourth Edition, 2006.
5. Wayne Tomasi, Electronic Communication Systems, Pearson Education India,
2008.
COURSE OUTCOMES

After successful completion of the course, the students should be able to

Course Outcome   Description                                                      Level in Bloom's Taxonomy

C301.1           Understand the limits set by information theory                  K2

C301.2           Understand the various waveform coding schemes                   K2

C301.3           Design and implement baseband transmission schemes               K2

C301.4           Design and implement band pass signaling schemes                 K2

C301.5           Analyze the spectral characteristics of band pass signaling
                 schemes and their noise performance                              K3

C301.6           Design error control coding schemes                              K3

CO – PO/PSO MAPPING

Course     Level   Program Outcomes                                          Program Specific Outcomes
Outcomes   of CO   PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12        PSO1 PSO2 PSO3

C301.1     K2       2   1   -   -   -   -   -   -   -   2    -    -           -    1    2
C301.2     K2       2   1   -   -   -   -   -   -   -   2    -    -           -    -    2
C301.3     K2       2   1   -   -   -   -   -   -   -   2    -    -           -    1    2
C301.4     K2       2   1   -   -   -   -   -   -   -   2    -    -           -    -    2
C301.5     K3       3   2   2   2   -   -   -   -   -   -    -    -           -    -    3
C301.6     K3       3   2   2   2   -   -   -   -   -   3    -    -           -    -    3
Unit – I
Information Theory
LECTURE PLAN
UNIT I
6.1 Lecture Plan
UNIT I INFORMATION THEORY

S.No   Topic                                                    Proposed   Actual   COs   Highest      Mode of
                                                                Period     Period          Cognitive    Delivery
                                                                                           Level

1      Discrete Memoryless Source, Information                                       CO1   K3           MD2

2      Entropy, Mutual Information                                                   CO1   K3           MD2

3      Discrete Memoryless Channel, Binary Symmetric Channel                         CO1   K2           MD2

4      Channel Capacity, Hartley Shannon Law                                         CO1   K2           MD2

5      Source Coding Theorem                                                         CO1   K2           MD2

6      Shannon Fano Code                                                             CO1   K2           MD2

7      Huffman Code                                                                  CO1   K2           MD2

8      Problems                                                                      CO1   K2           MD2


Activity Based Learning
Unit 1
6.2 Activity Based Learning
Sl.No   Activity                              Topic

1       Quiz                                  Fundamentals of Information Theory and Coding

2       Poster Presentation                   Source Encoder, Shannon Hartley Law

3       Check Your Knowledge                  Information Theory
        using this fun game:
        https://h5p.org/node/1286566

4       Role Play                             Information and Probability are inversely proportional

5       Connections Game                      Information Theory, Shannon's theorem, information and mutual information
QUIZ (Unit 1)

1.The expected information contained in a message is called

a) Entropy
b) Efficiency
c) Coded signal
d) None of the above

Answer: a) Entropy

2. Advantages of digital communication are

a) Easy multiplexing
b) Easy processing
c) Reliable
d) All of the mentioned

Answer: d) All of the mentioned

3. Analog to digital conversion includes

a) Sampling
b) Quantization
c) Sampling & Quantization
d) None of the mentioned

Answer: c) Sampling & Quantization

4. The signal can be reconstructed

a) At Nyquist rate
b) Above Nyquist rate
c) At & above the Nyquist rate
d) None of the mentioned

Answer: c) At & above the Nyquist rate


QUIZ

5. The output of an information source is

a) Random
b) Deterministic
c) Random & Deterministic
d) None of the mentioned

Answer: a) Random

6. When the base of the logarithm is e, the unit of measure of


information is
a) Bits
b) Bytes
c) Nats
d) None of the mentioned

Answer: c) Nats

7. Which conveys more information?

a) High probability event


b) Low probability event
c) High & Low probability event
d) None of the mentioned

Answer: b) Low probability event

8. Self information should be

a) Positive
b) Negative
c) Positive & Negative
d) None of the mentioned

Answer: a) Positive
QUIZ

9. What are the disadvantages of digital communication?

a) Needs more bandwidth


b) Is more complex
c) Needs more bandwidth & Is more complex
d) None of the mentioned

Answer: c) Needs more bandwidth & Is more complex

10. The unit of average mutual information is

a) Bits
b) Bytes
c) Bits per symbol
d) Bytes per symbol

Answer: a) Bits

11. When X and Y are statistically independent, then I (X,Y) is

a) 1
b) 0
c) Ln 2
d) Cannot be determined

Answer: b) 0

12. Which among the following is used to construct the binary code that
satisfies the prefix condition?
a) Information Rate
b) Entropy
c) Channel Coding Theorem
d) Kraft Inequality

Answer: d) Kraft Inequality


13. Information rate basically gives an idea about the generated
information per _____ by source.
a) Second
b) Minute
c) Hour
d) None of the above

Answer: a) Second

14. Which coding technique/s exhibit/s the usability of fixed length


codes?
a) Huffman Code
b) Shannon Fano Code
c) ASCII Code
d) None of the Above

Answer: c) ASCII Code

15. Huffman coding technique is adopted for constructing the source


code with ________ redundancy.
a) Maximum
b) Constant
c) Minimum
d) Unpredictable

Answer: c) Minimum

16. In discrete memoryless source, the current letter produced by a


source is statistically independent of _____
a) Past output
b) Future output
c) Both a and b
d) None of the above

Answer: c) Both a and b


17. The prefix code is also known as

a) Instantaneous code
b) Block code
c) Convolutional code
d) Parity bit

Answer: a) Instantaneous code

18. The mutual information

a) Is symmetric
b) Always non negative
c) Both a and b are correct
d) None of the above

Answer: c) Both a and b are correct

19. Entropy is

a) Information in a signal
b) Average information per message
c) Amplitude of signal
d) All of the above

Answer: b) Average information per message

20. The memory less source refers to

a) No previous information
b) Emitted message is independent of previous message
c) No message storage
d) None of the above

Answer: b) Emitted message is independent of previous message


Posters
Unit 1
Poster 1
Poster 2
6.3 Lecture Notes
Video Lecture Links

Unit I- Information Theory and Coding


https://nptel.ac.in/courses/117/105/117105077/

https://www.youtube.com/watch?v=VhhNcmPLKgo&list=PLp6ek2hDcoNBtaNoXzFE3BYKNZ2
kUJo1f&index=2

https://www.youtube.com/watch?v=98BAo8L-
o6s&list=PLp6ek2hDcoNBtaNoXzFE3BYKNZ2kUJo1f&index=3

https://www.youtube.com/watch?v=-
pHUIQ0sDqA&list=PLp6ek2hDcoNBtaNoXzFE3BYKNZ2kUJo1f&index=4

https://www.youtube.com/watch?v=82v41gliAJE&list=PLp6ek2hDcoNBtaNoXzFE3BYKNZ2kU
Jo1f&index=5
https://www.youtube.com/watch?v=RePldArdYZ4&list=PLp6ek2hDcoNBtaNoXzFE3BYKNZ2k
UJo1f&index=6
Organization of Topics
1. INFORMATION THEORY (Unit 1)
1.1 Introduction
1.2 Information
1.2.1 Information and Uncertainty
1.2.2 Properties
1.3 Entropy
1.3.1 Properties of Entropy
1.3.2 Entropy of Bernoulli Random Variable
1.3.3 Extension of a Discrete Memoryless Source
1.3.4 Information Rate (R)
1.3.5 Joint and Conditional Entropy
1.4 Mutual Information
1.4.1 Properties of Mutual Information
1.5 Discrete Memoryless Channels
1.6 Types of Channels
1.6.1 Lossless Channel
1.6.2 Deterministic Channel
1.6.3 Noiseless Channel
1.6.4 Binary Symmetric Channel
1.6.5 Binary Erasure Channel
1.7 Channel Capacity
1.8 Channel Coding Theorem
1.9 Maximum Entropy for Gaussian Channel
1.10 Channel (Information) Capacity Theorem (or) Shannon's Theorem
1.11 Source Coding Theorem
1.11.1 Average Code Word Length (L̄)
1.11.2 Coding Efficiency (η)
1.11.3 Statement
1.11.4 Variance (σ²)
1.12 Coding Techniques
1.12.1 Shannon-Fano Coding Procedure
1.12.2 Huffman Coding Procedure
1. INFORMATION THEORY

1.1 INTRODUCTION
The purpose of a communication system is to facilitate the transmission of
signals generated by an information source to the receiver over a
communication channel.
Information theory is a branch of probability theory which may be
applied to the study of communication systems. Information theory allows us to
determine the information content in a message signal leading to different source
coding techniques for efficient transmission of message.
In the context of communication, information theory deals with
mathematical modeling and analysis of communication rather than with physical
sources and physical channels.
In particular, it provides answers to two fundamental questions,
i. What is the irreducible complexity below which a signal cannot be
compressed?
ii. What is the ultimate transmission rate for reliable communication over a
noisy channel?
The answers to these questions lie in the entropy of the source and the
capacity of the channel, respectively.
• Entropy is defined in terms of the probabilistic behavior of a source of
information.
• Channel Capacity is defined as the intrinsic ability of a channel to convey
information; it is naturally related to the noise characteristics of the channel.

1.2 INFORMATION
Information is what a communication system, whether analog or digital,
conveys from a source to a destination. Information is related to the probability of
occurrence of an event. The unit of information is called the bit.
Based on memory, information sources can be classified as follows:
i. A source with memory is one for which a current symbol depends on the
previous symbols.
ii. A memoryless source is one for which each symbol produced is
independent of the previous symbols.
iii. A Discrete Memoryless Source (DMS) can be characterized by the list of
symbols, the probability assignment to these symbols, and the specification
of the rate at which the source generates these symbols.
Consider a probabilistic experiment that involves observing the output
emitted by a discrete source during every unit of time (signaling interval). The
source output is modeled as a discrete random variable S, which takes on
symbols from a fixed finite alphabet
S = {s0, s1, s2, ......, sK-1}
with probabilities
P(S = sk) = pk ,  k = 0, 1, 2, ......, K-1.
The set of probabilities must satisfy the condition
Σ (k=0 to K-1) pk = 1,   pk ≥ 0          (1.1)

We assume that the symbols emitted by the source during successive
signaling intervals are statistically independent. A source having the
above-described properties is called a Discrete Memoryless Source (DMS).
1.2.1 Information and Uncertainty
Information is related to the probability of occurrence of an event: the
greater the uncertainty, the greater the information associated with it.
Consider the event S = sk, describing the emission of symbol sk by the
source with probability pk. Clearly, if the probability pk = 1 and pi = 0 for all i ≠ k,
then there is no surprise and therefore no information when symbol sk is emitted.
If, on the other hand, the source symbols occur with different probabilities
and the probability pk is low, then there is more surprise, and therefore more
information, when symbol sk is emitted by the source than when another symbol
si with higher probability is emitted.
Example:
i. "The Sun rises in the East": the uncertainty is zero and there is no surprise in
the statement, since the probability of occurrence is 1 (pk = 1); no information is conveyed.
ii. "The Sun does not rise in the East": the uncertainty is very high because the
event is practically impossible (pk → 0); if such an event were to occur, the surprise,
and hence the information conveyed, would be maximum.
The amount of information is related to the inverse of the probability of
occurrence of the event S = sk, as shown in Fig. 1.1. The amount of information
gained after observing the event S = sk, which occurs with probability pk, is
defined by the logarithmic function
I(sk) = log(1/pk)          (1.2)
Figure 1.1 Probability of occurrence of the event vs Information


1.2.2 Properties
i. If we are absolutely certain of the outcome of an event, even before it occurs,
no information is gained:
   I(sk) = 0 for pk = 1
ii. The occurrence of an event S = sk either provides some or no information, but
never brings about a loss of information:
   I(sk) ≥ 0 for 0 ≤ pk ≤ 1
iii. The less probable an event is, the more information we gain when it occurs:
   I(sk) > I(si) for pk < pi
iv. The additive property follows from the logarithmic definition of I(sk):
   I(sk, sl) = I(sk) + I(sl)
   if sk and sl are statistically independent.
It is standard practice in information theory to use a logarithm to base 2,
with binary signalling in mind. The resulting unit of information is called the bit,
which is a contraction of the words binary digit:
   I(sk) = log2(1/pk) = -log2(pk)   for k = 0, 1, 2, ....., K-1
When pk = 1/2, we have I(sk) = 1 bit. Hence, one bit is the amount of
information that we gain when one of two possible and equally likely
(i.e. equiprobable) events occurs.
Note that the information I(sk) is positive, because the logarithm of a
number less than one, such as a probability, is negative.

1.3 ENTROPY
The entropy of a discrete random variable, representing the output of a
source of information, is a measure of the average information content per source
symbol.
The amount of information I(sk) produced by the source during an
arbitrary signalling interval depends on the symbol sk emitted by the source at the
time. The self-information I(sk) is a discrete random variable that takes on the
values I(s0), I(s1), …, I(sK – 1) with probabilities p0, p1, ….., pK – 1 respectively.
The expectation of I(sk) over all the values taken by the random
variable S is given by
H(S) = E[I(sk)]
     = Σ (k=0 to K-1) pk I(sk)
     = Σ (k=0 to K-1) pk log2(1/pk)          (1.3)
The quantity H(S) is called the entropy.
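
As a quick numerical illustration of (1.2) and (1.3), the short Python sketch below
(the function names are our own, not part of the prescribed material) evaluates the
self-information of a symbol and the entropy of a discrete memoryless source.

import math

def self_information(p):
    # Self-information I(s) = log2(1/p) in bits for a symbol of probability p
    return math.log2(1.0 / p)

def entropy(probs):
    # Entropy H(S) = sum_k pk log2(1/pk) in bits/symbol; terms with pk = 0 contribute nothing
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(self_information(0.5))                 # 1.0 bit for an equiprobable binary event
print(entropy([0.4, 0.2, 0.1, 0.2, 0.1]))    # about 2.1219 bits/symbol (see worked problem 1)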
1.3.1 Properties of Entropy
Consider a discrete memoryless source whose mathematical model is
defined by
S= { s0,s1,s2,……..,sK-1}
with probabilities,
P(S=sk)= pk , k=0,1,2,…….,K-1.

The entropy H(S) of the discrete random variable S is bounded as follows:
0 ≤ H(S) ≤ log2(K)          (1.4)
where K is the number of symbols in the alphabet 𝒮.
Property 1: H(S) = 0, if, and only if, the probability pk = 1 for some k, and the
remaining probabilities in the set are all zero; this lower bound on entropy
corresponds to no uncertainty.
Proof
We know that
H(S) = Σ (k=0 to K-1) pk log2(1/pk)
Consider pk = 1 for one particular value of k and pk = 0 for all other values of k.
Then the above equation becomes
H(S) = 0 + 0 + .... + 1·log2(1/1) + .... + 0
     = log10(1) / log10(2)
H(S) = 0
Property 2: H(S) = log2 K, if, and only if, pk = 1/K for all k (i.e., all the symbols in
the source alphabet 𝒮 are equiprobable); this upper bound on entropy corresponds
to maximum uncertainty.
Proof:
Consider the probabilities of all K messages to be equal:
p0 = p1 = p2 = .... = pK-1 = 1/K
Then
H(S) = Σ (k=0 to K-1) pk log2(1/pk)
Substituting the values of pk and expanding,
H(S) = p0 log2(1/p0) + p1 log2(1/p1) + ... + pK-1 log2(1/pK-1)
     = (1/K) log2 K + (1/K) log2 K + ... + (1/K) log2 K
Since the term log2 K appears K times,
H(S) = (1/K) · K · log2 K
H(S) = log2 K          (1.5)
To show that log2 K is indeed an upper bound, consider first any two different
probability distributions p0, p1, ....., pK-1 and q0, q1, ....., qK-1 on the alphabet
𝒮 = {s0, s1, ..., sK-1} of a discrete source. We may then define the relative
entropy of these two distributions:
D(p||q) = Σ (k=0 to K-1) pk log2(pk/qk)          (1.6)
Consider the negative of this quantity,
Σ (k=0 to K-1) pk log2(qk/pk)
Changing the base of the logarithm using log2 a = loge a / loge 2 = (log2 e)·ln a,
we may write
Σ (k=0 to K-1) pk log2(qk/pk) = (log2 e) · Σ (k=0 to K-1) pk ln(qk/pk)
According to the property of the natural logarithm, ln x ≤ (x - 1). Applying this to
the right-hand side,
(log2 e) · Σ pk ln(qk/pk) ≤ (log2 e) · Σ pk (qk/pk - 1)
                          = (log2 e) · Σ (qk - pk)
                          = (log2 e) · [ Σ qk - Σ pk ]
                          = 0
since Σ (k=0 to K-1) qk = Σ (k=0 to K-1) pk = 1. Hence,
Σ (k=0 to K-1) pk log2(qk/pk) ≤ 0
Now let qk = 1/K for all values of k. Then
Σ (k=0 to K-1) pk log2(qk/pk) = Σ pk [ log2(1/K) + log2(1/pk) ]
                              = log2(1/K) · Σ pk + Σ pk log2(1/pk)
                              = -log2(K) + H(S) ≤ 0          (since Σ pk = 1)
and therefore
H(S) ≤ log2(K)          (1.7)
Thus, H(S) is always less than or equal to log2 K. The equality holds if, and
only if, the symbols in the alphabet 𝒮 are equiprobable.
1.3.2 Entropy of Bernoulli Random Variable
Consider the Bernoulli random variable for which symbol 0 occurs with
probability p0 and symbol 1 with probability p1 = 1 – p0.
The entropy of this random variable is
H(S) = -p0 log2 p0 - p1 log2 p1
     = -p0 log2 p0 - (1 - p0) log2(1 - p0) bits          (1.8)
From the above equation we observe the following:
i. When p0 = 0, the entropy H(S) = 0; this follows from the fact that x loge x → 0
as x → 0.
ii. When p0 = 1, the entropy H(S) = 0.
iii. The entropy H(S) attains its maximum value Hmax = 1 bit when p1 = p0 = 1/2;
that is, when symbols 1 and 0 are equally probable. In other words, H(S) is
symmetric about p0 = 1/2, as shown in Fig. 1.2. Writing the entropy for this special
case as a function of p0, we define the entropy function
H(p0) = -p0 log2 p0 - (1 - p0) log2(1 - p0)


Figure 1.2 Entropy Function H(p0)
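
The behaviour plotted in Fig. 1.2 can be checked numerically; the following sketch
(illustrative only) tabulates the entropy function H(p0) of (1.8) and shows the
maximum of 1 bit at p0 = 0.5.

import math

def binary_entropy(p0):
    # H(p0) = -p0 log2 p0 - (1 - p0) log2(1 - p0), taken as 0 at p0 = 0 or 1
    if p0 in (0.0, 1.0):
        return 0.0
    return -p0 * math.log2(p0) - (1 - p0) * math.log2(1 - p0)

for p0 in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(p0, round(binary_entropy(p0), 4))   # peaks at 1.0 bit when p0 = 0.5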
1.3.3 Extension of a Discrete Memoryless Source
Let us consider a discrete memoryless source having alphabet S = {s0,
s1, s2, ....., sK-1}. We often find it useful to consider blocks rather than individual
symbols, with each block consisting of n successive source symbols.
We may view each such block as being produced by an extended source with
source alphabet Sⁿ that has Kⁿ distinct blocks, where K is the number of distinct
symbols in the source alphabet S of the original source.
With the source symbols being statistically independent, the probability of a
source symbol in Sⁿ is equal to the product of the probabilities of the n source
symbols in S that constitute it. Hence,
H(Sⁿ) = n H(S)          (1.9)
The above equation shows that H(Sn), the entropy of the extended
source, is equal to n times H(S), the entropy of the original source.
1.3.4 Information Rate (R)
The information rate (R) is the average number of bits of information
generated per second:
R = r H(S) (1.10)
Where, R - information rate ( Information bits / second )
H(S) - the Entropy or average information (bits / symbol) and
r - the rate at which messages are generated (symbols / second).
1.3.5 Joint and Conditional Entropy
Let us define the following entropy functions for a channel with m inputs
and n outputs.
m 1
H ( X )    p (x i ) log2 p (x i )
i 0

n 1
H (Y )   p (y j ) log2 p (y j )
j 0

n 1 m 1
H ( X /Y )    p (x i y j ) log
, 2 p (x i / y j )
j 0 i
0

n 1 m 1
H (Y / X )    p (x i y j ) log
, 2 p (y j / x i )
j 0 i0

n 1 m 1
H (X ,Y )    p (x i y j ) log
, 2 p (x i , y j )
j 0 i0

Where,
H(X) – Average uncertainty of channel input
H(Y) – Average uncertainty of channel output
H(X/Y) and H(Y/X) – Conditional Entropy
H(X,Y) – Joint Entropy
p(xi) – Input probability
p(yj) – Output probability
p(xi/yj), p(yj/xi) – Conditional probabilities and
p(xi, yj) – Joint probability.
Relationship between Conditional and Joint Entropy
1. H(X,Y) = H(X/Y) + H(Y)
2. H(X,Y) = H(Y/X) + H(X)
Proof:
To prove H(X,Y) = H(X/Y) + H(Y), start from the definition
H(X,Y) = -Σ (j=0 to n-1) Σ (i=0 to m-1) p(xi, yj) log2 p(xi, yj)
Using p(xi, yj) = p(xi / yj) p(yj),
H(X,Y) = -Σj Σi p(xi, yj) log2 [ p(xi / yj) p(yj) ]
       = -Σj Σi p(xi, yj) [ log2 p(xi / yj) + log2 p(yj) ]
       = -Σj Σi p(xi, yj) log2 p(xi / yj) - Σj Σi p(xi, yj) log2 p(yj)
In the second term, Σi p(xi, yj) = p(yj), so that
H(X,Y) = H(X/Y) - Σj p(yj) log2 p(yj)
H(X,Y) = H(X/Y) + H(Y)          (1.11)
Similarly, to prove H(X,Y) = H(Y/X) + H(X), use p(xi, yj) = p(yj / xi) p(xi):
H(X,Y) = -Σj Σi p(xi, yj) [ log2 p(yj / xi) + log2 p(xi) ]
       = -Σj Σi p(xi, yj) log2 p(yj / xi) - Σi p(xi) log2 p(xi)
H(X,Y) = H(Y/X) + H(X)          (1.12)
1.4 MUTUAL INFORMATION
The mutual information I(X;Y) is a measure of the uncertainty about the
channel input, which is resolved by observing the channel output.
Mutual information I(xi, yj) of a channel is defined as the amount of
information transferred when xi is transmitted and yj is received:
I(xi, yj) = log [ p(xi / yj) / p(xi) ]  bits          (1.13)
where
I(xi, yj)  – mutual information
p(xi / yj) – conditional probability that xi was transmitted, given that yj was received
p(xi)      – probability of selecting symbol xi for transmission

1.4.1 Properties of Mutual Information
a) Symmetry Property
The mutual information of a channel is symmetric in the sense that
I(X;Y) = I(Y;X)          (1.14)
Proof:
Consider the entropy of the channel input,
H(X) = Σ (j=0 to J-1) p(xj) log2 [ 1/p(xj) ]
Since Σ (k=0 to K-1) p(yk | xj) = 1, this may be written as
H(X) = Σj p(xj) log2 [ 1/p(xj) ] Σk p(yk | xj)
     = Σj Σk p(yk | xj) p(xj) log2 [ 1/p(xj) ]
     = Σj Σk p(xj, yk) log2 [ 1/p(xj) ]
We also know that
H(X|Y) = Σj Σk p(xj, yk) log2 [ 1/p(xj | yk) ]
The conditional entropy H(X|Y) relates the channel output Y to the channel
input X, whereas H(X) is the entropy of the channel input by itself. The mutual
information is defined as
I(X;Y) = H(X) - H(X|Y)          (1.15)
Substituting the two expressions above and combining the logarithms,
I(X;Y) = Σj Σk p(xj, yk) { log2 [ 1/p(xj) ] - log2 [ 1/p(xj | yk) ] }
       = Σj Σk p(xj, yk) log2 [ p(xj | yk) / p(xj) ]
Similarly,
I(Y;X) = Σj Σk p(xj, yk) log2 [ p(yk | xj) / p(yk) ]
From Bayes' rule for conditional probability,
p(xj | yk) / p(xj) = p(yk | xj) / p(yk)
Substituting this relation into the expression for I(X;Y) gives
I(X;Y) = I(Y;X)          (1.16)
Hence the symmetry property of mutual information is proved.
b) Expansion of the Mutual Information Property
The mutual information of a channel is related to the joint entropy of the
channel input and channel output by
I(X;Y) = H(X) + H(Y) - H(X,Y)          (1.17)
where the joint entropy H(X,Y) is defined by
H(X,Y) = Σj Σk p(xj, yk) log2 [ 1/p(xj, yk) ]
Multiplying the numerator and denominator inside the logarithm by p(xj)p(yk),
H(X,Y) = Σj Σk p(xj, yk) log2 [ p(xj)p(yk) / p(xj, yk) ]
         + Σj Σk p(xj, yk) log2 [ 1/(p(xj)p(yk)) ]
Consider the second term:
Σj Σk p(xj, yk) log2 [ 1/(p(xj)p(yk)) ]
     = Σj log2 [ 1/p(xj) ] Σk p(xj, yk) + Σk log2 [ 1/p(yk) ] Σj p(xj, yk)
     = Σj p(xj) log2 [ 1/p(xj) ] + Σk p(yk) log2 [ 1/p(yk) ]
     = H(X) + H(Y)
where Σk p(xj, yk) = p(xj) and Σj p(xj, yk) = p(yk).
Replacing the second term of the H(X,Y) equation with this result, and noting
that p(xj)p(yk)/p(xj, yk) = p(xj)/p(xj | yk),
H(X,Y) = Σj Σk p(xj, yk) log2 [ p(xj) / p(xj | yk) ] + H(X) + H(Y)
       = -Σj Σk p(xj, yk) log2 [ p(xj | yk) / p(xj) ] + H(X) + H(Y)
       = -I(X;Y) + H(X) + H(Y)
Therefore,
I(X;Y) = H(X) + H(Y) - H(X,Y)          (1.18)
Hence the expansion property of mutual information is proved.
c) Non-negativity Property
The mutual information is always non-negative; on average, we cannot lose
information by observing the output of a channel:
I(X;Y) ≥ 0          (1.19)
Proof:
We know that the mutual information is
I(X;Y) = Σj Σk p(xj, yk) log2 [ p(xj | yk) / p(xj) ]
Substituting p(xj | yk) = p(xj, yk) / p(yk) in the above equation,
I(X;Y) = Σj Σk p(xj, yk) log2 [ p(xj, yk) / (p(xj)p(yk)) ]
This double sum has exactly the form of the relative entropy D(p||q) in (1.6),
with p(xj, yk) in the role of pk and p(xj)p(yk) in the role of qk; by the inequality
ln x ≤ (x - 1) used there, it is always non-negative. The mutual information
equals zero if, and only if,
p(xj, yk) = p(xj) p(yk) for all j and k
that is, if and only if the channel input and output are statistically independent.
1.5 DISCRETE MEMORYLESS CHANNELS

X ──→ [ p(yj | xi) ] ──→ Y

Figure 1.3 Discrete Memoryless Channel


A Discrete Memoryless Channel is a statistical model with an input X
and an output Y, which is a noisy version of X; both X and Y are random variables.
The channel is said to be discrete when both X and Y have finite alphabets. It
is said to be memoryless when the current output symbol depends only on the
current input symbol and not on previous ones.

X = {x0, x1, ...., xm}

Y = {y0, y1, ...., yn}

The set of transition probabilities is represented in matrix form:

              | P(y0/x0)  P(y1/x0)  ....  P(yn/x0) |
P = P(Y/X) =  | P(y0/x1)  P(y1/x1)  ....  P(yn/x1) |          (1.20)
              |   ....      ....    ....    ....   |
              | P(y0/xm)  P(y1/xm)  ....  P(yn/xm) |

where P is called the channel matrix.

Each row of the channel matrix P corresponds to a fixed channel input,
whereas each column of the matrix corresponds to a fixed channel output.
A fundamental property of the channel matrix is that each row sums to one:
Σ (j=0 to n) P(yj / xi) = 1   for all i          (1.21)
The channel input X = xi occurs with probability
p(xi) = P(X = xi)   for i = 0, 1, 2, ...., m
The joint probability distribution of the random variables X and Y is
p(xi, yj) = P(X = xi, Y = yj)
          = P(Y = yj | X = xi) P(X = xi)
p(xi, yj) = p(yj | xi) p(xi)          (1.22)
The marginal probability distribution of the output random variable Y is
obtained by averaging out the dependence of p(xi, yj) on xi:
p(yj) = P(Y = yj)
      = Σ (i=0 to m) P(Y = yj | X = xi) P(X = xi)
      = Σ (i=0 to m) p(yj | xi) p(xi)   for j = 0, 1, 2, ..., n          (1.23)
The above equation states that, given the input a priori probabilities p(xi)
and the channel matrix p(yj | xi), we may calculate the probabilities of the
output symbols p(yj).
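
Equation (1.23) is simply the vector-matrix product P(Y) = P(X)·P(Y/X). The short
Python sketch below (illustrative; it reuses the channel of worked problem 6 later in
this unit) computes the output probabilities from the input probabilities and the
channel matrix.

def output_probs(p_x, channel_matrix):
    # p(yj) = sum_i p(xi) * p(yj | xi); channel_matrix[i][j] = p(yj | xi), each row sums to 1
    n_out = len(channel_matrix[0])
    return [sum(p_x[i] * channel_matrix[i][j] for i in range(len(p_x)))
            for j in range(n_out)]

P_Y_given_X = [[0.9, 0.1],
               [0.2, 0.8]]
p_x = [0.5, 0.5]                        # equiprobable inputs
print(output_probs(p_x, P_Y_given_X))   # approximately [0.55, 0.45]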

1.6 TYPES OF CHANNELS


1.6.1 Lossless Channel
A channel described by a channel matrix with only one non-zero element
in each column is called a lossless channel. An example of the channel matrix of
a lossless channel is

           | 3/4  1/4   0    0   0 |
P(Y | X) = |  0    0   1/3  2/3  0 |
           |  0    0    0    0   1 |

Figure 1.4 Lossless Channel

1.6.2 Deterministic Channel
A channel described by a channel matrix with only one non-zero element in
each row is called a deterministic channel. For example, with inputs x1, ...., x5
and outputs y1, y2, y3,

           | 1  0  0 |
           | 1  0  0 |
P(Y | X) = | 0  1  0 |
           | 0  1  0 |
           | 0  0  1 |

Figure 1.5 Deterministic Channel

1.6.3 Noiseless Channel
A channel is called a noiseless channel if it is both lossless and
deterministic. The channel matrix of a noiseless channel is an identity matrix:

           | 1  0  0  0 |
P(Y / X) = | 0  1  0  0 |
           | 0  0  1  0 |
           | 0  0  0  1 |

Figure 1.6 Noiseless Channel


1.6.4 Binary Symmetric Channel
Binary Symmetric Channel has two input symbols x0=0 and x1=1 and two
output symbols y0=0 and y1=1. The channel is symmetric because the probability
of receiving a 1 if a 0 is sent is the same as the probability of receiving a 0 if a 1
is sent.
In other words, the correct bit is transmitted with probability 1 - p and the
wrong bit with probability p; p is called the crossover probability (the conditional
probability of error) of the channel.

Figure 1.7 Binary Symmetric Channel

The entropy H(X) is maximized when the channel input probabilities are
p(x0) = p(x1) = 1/2, where x0 and x1 are each 0 or 1.
The mutual information I(X;Y) is similarly maximized, so that
C = I(X;Y) | p(x0) = p(x1) = 1/2          (1.24)
From the BSC diagram,
p(y0 | x1) = p(y1 | x0) = p          (1.25)
and
p(y0 | x0) = p(y1 | x1) = 1 - p          (1.26)
The capacity of the Binary Symmetric Channel (BSC) is
C = 1 + p log2 p + (1 - p) log2(1 - p)          (1.27)
C = 1 - H(p)          (1.28)

Figure.1.8 Variation of Channel Capacity of a BSC with Transition Probability (p)


i) When channel is Noise free
P=0, the channel capacity C attains its maximum value of one bit per channel
use, which is exactly the information in each channel input.
Entropy attains minimum value of Zero.
ii) When channel is Noisy
P=1/2, the channel capacity C attains its minimum value of Zero.
Entropy attains maximum value of unity and the channel is said to be useless.
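
The capacity expression (1.27) is easily evaluated numerically; the sketch below
(illustrative only) reproduces the behaviour of Fig. 1.8, with C = 1 bit at p = 0 and
C = 0 at p = 1/2.

import math

def bsc_capacity(p):
    # C = 1 - H(p) = 1 + p log2 p + (1 - p) log2(1 - p), in bits per channel use
    if p in (0.0, 1.0):
        return 1.0
    return 1.0 + p * math.log2(p) + (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.25, 0.5):
    print(p, round(bsc_capacity(p), 4))   # 1.0, 0.531, 0.1887, 0.0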
1.6.5 Binary Erasure Channel (BEC)
A Binary erasure channel (BEC) has two inputs (0,1) and three outputs
(0,y,1). The symbol y indicates that, due to noise, no deterministic decision can
be made as to whether the received symbol is a 0 or 1. In other words the symbol
y represents that the output is erased. Hence the name Binary Erasure Channel
(BEC) .
The mutual information and channel capacity are
I(X;Y) = H(X) - H(X|Y)
       = H(X) - (1 - p)H(X)          (1.29)
where H(X|Y) = (1 - p)H(X), so that
I(X;Y) = p H(X)
The channel capacity C is defined as
C = max I(X;Y)
  = max [ p H(X) ]
  = p · max [H(X)]
C = p          (1.30)
since max [H(X)] = 1.

Figure 1.9 Binary Erasure Channel


1.7 CHANNEL CAPACITY
The channel capacity of a discrete memoryless channel is defined as the
maximum average mutual information I(X;Y) in any single use of the channel
(i.e., signaling interval), where the maximization is over all possible input
probability distributions {p(xi)} on X. It is measured in bits per channel use.
Channel capacity: C = max I(X;Y)          (1.31)
The channel capacity C is a function only of the transition probabilities
p(yj|xi), which define the channel. The calculation of C involves maximization of
the average mutual information I(X;Y) over the K input probabilities, subject to
p(xi) ≥ 0 for all i
and
Σ (i=0 to K-1) p(xi) = 1          (1.32)

The transmission efficiency (or channel efficiency) is expressed as

η = actual transmission of information / maximum transmission of information          (1.33)

  = I(X;Y) / max I(X;Y)

η = I(X;Y) / C          (1.34)

The redundancy of the channel is expressed as

R = 1 - η          (1.35)

R = [ C - I(X;Y) ] / C          (1.36)
1.8 CHANNEL CODING THEOREM
The design goal of channel coding is to increase the resistance of a digital
communication system to channel noise. Specifically, channel coding consists of
mapping the incoming data sequence into a channel input sequence and inverse
mapping the channel output sequence into an output data sequence in such a
way that the overall effect of channel noise on the system is minimized.
Mapping operation is performed in the transmitter by a channel encoder,
whereas the inverse mapping operation is performed in the receiver by a channel
decoder, as shown in the block diagram of Fig. 1.10. For simplicity, the source
encoding (before channel encoding) and source decoding (after channel
decoding) processes are not included in this figure.

Discrete Memoryless Source → Channel Encoder → Discrete Memoryless Channel (Noise) → Channel Decoder → Destination
            (Transmitter)                                                                  (Receiver)

Figure 1.10 Block Diagram of a Digital Communication System


Theorem
The channel coding theorem for a discrete memoryless channel is stated
as follows:
i) Let a discrete memoryless source with an alphabet 𝒮 have entropy H(S) for
random variable S and produce symbols once every Ts seconds. Let a discrete
memoryless channel have capacity C and be used once every Tc seconds. Then, if
H(S)/Ts ≤ C/Tc          (1.37)
there exists a coding scheme for which the source output can be transmitted
over the channel and be reconstructed with an arbitrarily small probability of
error. The parameter C/Tc is called the critical rate. When the system is signalling
at the critical rate, the condition holds with equality:
H(S)/Ts = C/Tc          (1.38)
ii) Conversely, if
H(S)/Ts > C/Tc
it is not possible to transmit information over the channel and reconstruct it with
an arbitrarily small probability of error.
The ratio Tc/Ts equals the code rate of the encoder, denoted by r:
r = Tc/Ts          (1.39)

In short, the Channel Coding Theorem states that if a discrete memoryless


channel has capacity C and a source generates information at a rate less than
C, then there exists a coding technique such that the output of the source
may be transmitted over the channel with an arbitrarily low probability of symbol
error.
Conversely, it is not possible to find such a code if the code rate ‘r’ is
greater than the channel capacity C. If r ≤ C, there exists a code capable of
achieving an arbitrarily low probability of error.
Limitations
i. It does not show us how to construct a good code.
ii. The theorem does not give a precise value for the probability of symbol
error after decoding the channel output.

Figure 1.11 Information Rate vs. Probability of Error


1.9 MAXIMUM ENTROPY FOR GAUSSIAN CHANNEL
The probability density function of a Gaussian source is
p(x) = (1/(σ√(2π))) e^(-x²/2σ²)          (1.40)
where σ² is the average power of the source.
The entropy is
H(S) = ∫ p(x) log2 [ 1/p(x) ] dx = -∫ p(x) log2 p(x) dx          (1.41)
Substituting the Gaussian density,
H(S) = -∫ p(x) log2 [ (1/(σ√(2π))) e^(-x²/2σ²) ] dx
     = -∫ p(x) [ log2(1/(σ√(2π))) - (x²/2σ²) log2 e ] dx
     = log2(σ√(2π)) ∫ p(x) dx + (log2 e / 2σ²) ∫ x² p(x) dx
Using ∫ p(x) dx = 1 and ∫ x² p(x) dx = σ²,
H(S) = log2(σ√(2π)) + (1/2) log2 e
     = (1/2) [ log2(2πσ²) + log2 e ]
H(S) = (1/2) log2(2πeσ²)          (1.42)
or, equivalently,
H(S) = log2 √(2πeσ²)          (1.43)
This is the maximum entropy attainable by any source of average power σ²;
it is achieved when p(x) is the zero-mean Gaussian probability density function.

1.10 CHANNEL (INFORMATION) CAPACITY THEOREM
OR SHANNON'S THEOREM
The information capacity C of a continuous channel of bandwidth B hertz,
perturbed by AWGN of total noise power N within the channel bandwidth, is given
by the formula
C = B log2(1 + S/N) bits/sec          (1.44)
where
C – information capacity
B – channel bandwidth
S – signal power (average transmitted power)
N – total noise power within the channel bandwidth B
It is easier to increase the information capacity of a continuous
communication channel by expanding its bandwidth than by increasing the
transmitted power for a prescribed noise variance.
Proof
The channel (information) capacity C is the difference between the maximum
entropy of the channel output and the maximum entropy of the noise within the
channel bandwidth:
C = max H(Y) - max H(N)          (1.45)
Using the maximum-entropy result (1.43) for Gaussian random variables,
C = log2 √(2πe σY²) - log2 √(2πe σN²)
Since σY² = S + N and σN² = N,
C = log2 √(2πe(S + N)) - log2 √(2πeN)
  = (1/2) log2 [ 2πe(S + N) / (2πeN) ]
  = (1/2) log2 [ (S + N)/N ]
  = (1/2) log2 (1 + S/N)   bits per sample
If the signal is band-limited to B hertz, it is sampled at the Nyquist rate of 2B
samples per second; then
C = 2B × (1/2) log2(1 + S/N)
C = B log2(1 + S/N) bits/sec
If we substitute the noise power N = N0B, where N0 is the noise power spectral
density, the channel (information) capacity becomes
C = B log2(1 + S/(N0B)) bits/sec          (1.46)
Trade-off
i. A noiseless channel has infinite capacity. If there is no noise in the channel,
then N = 0 and S/N → ∞, so that
C = B log2(1 + ∞) → ∞
ii. A channel of infinite bandwidth has limited capacity. If B → ∞, the capacity
does not become infinite, because the noise power N = N0B also increases and the
S/N ratio decreases. In fact, as B → ∞ the capacity approaches the finite limit
C∞ = (S/N0) log2 e ≈ 1.44 S/N0.
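
As a numerical sketch of (1.44), the lines below (illustrative only) reproduce the
bandwidth calculation of worked problem 7: an information rate of 256 Mbit/s at an
SNR of 30 dB.

import math

def shannon_capacity(bandwidth_hz, snr_linear):
    # C = B log2(1 + S/N) in bits per second
    return bandwidth_hz * math.log2(1 + snr_linear)

snr = 10 ** (30 / 10)              # 30 dB -> 1000 in linear terms
R = 256e6                          # required information rate, bits/s
B = R / math.log2(1 + snr)         # minimum bandwidth such that C >= R
print(round(B / 1e6, 3), "MHz")    # about 25.684 MHz, matching problem 7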
1.11 SOURCE CODING THEOREM
Efficient representation of data generated by a discrete source is
accomplished by some encoding process. The device that performs the
representation is called a source encoder.
An efficient source encoder must satisfy two functional requirements:
i) The code words produced by the encoder are in binary form.
ii) The source code is uniquely decodable, so that the original source sequence
can be reconstructed perfectly from the encoded binary sequence.

Discrete Memoryless Source ──sk──→ Source Encoder ──bk──→ Binary Sequence

Figure 1.12 Source Encoder


1.11.1 Average Code Word Length (L̄)
L̄ = Σ (k=1 to K) pk Ik          (1.47)
where
L̄  – average number of bits per source symbol
pk – probability of the kth symbol
Ik – number of bits assigned to the kth symbol
1.11.2 Coding Efficiency (η)
The coding efficiency η is always less than or equal to 1. The source encoder
is said to be efficient when η approaches unity.
η = Lmin / L̄          (1.48)
where
L̄ ≥ Lmin          (1.49)
1.11.3 Theorem Statement
Given a discrete memoryless source of entropy H(S) (or H(X)), the average
code word length L̄ for any distortionless source encoding scheme is bounded as
L̄ ≥ H(S)   (or L̄ ≥ H(X))          (1.50)
where the entropy H(S) is the fundamental limit on the average number of
bits per source symbol. Thus, with Lmin = H(X),
η = H(X) / L̄          (1.51)
1.11.4 Variance (σ²)
The variance of the code word length is defined as
σ² = Σ (k) pk (Ik - L̄)²          (1.52)
where
σ² – variance of the code
pk – probability of the kth symbol
Ik – number of bits assigned to the kth symbol
L̄  – average code word length
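
The quantities L̄, η and σ² defined above are easy to compute once a code has
been chosen. The helper below (our own sketch, using the symbol probabilities and
code lengths of worked problem 1) returns all of them together.

import math

def code_statistics(probs, lengths):
    # Returns (average length L_bar, entropy H, efficiency, variance) for a given code
    L_bar = sum(p * l for p, l in zip(probs, lengths))
    H = sum(p * math.log2(1.0 / p) for p in probs if p > 0)
    efficiency = H / L_bar
    variance = sum(p * (l - L_bar) ** 2 for p, l in zip(probs, lengths))
    return L_bar, H, efficiency, variance

probs   = [0.4, 0.2, 0.2, 0.1, 0.1]     # symbol probabilities (worked problem 1)
lengths = [2, 2, 2, 3, 3]               # bits per symbol from the Shannon-Fano table
print(code_statistics(probs, lengths))  # approximately (2.2, 2.1219, 0.9645, 0.16)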
1.12 CODING TECHNIQUES
The two popular coding technique used in Information theory are (i)
Shannon-Fano Coding and (ii) Huffman Coding. The coding procedures are
explained in the following section.
1.12.1 Shannon-Fano Coding Procedure
1. Sort the list of symbols in decreasing order of probability.

2. Partition the set into two subsets that are as close to equiprobable as possible;
assign 0 to the upper subset and 1 to the lower subset.

3. Repeat step 2 for each part, until all the symbols are split into individual
subgroups.

4. Assign the code for each symbol using the binary values obtained in each stage
(a short implementation sketch follows these steps).
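
As mentioned in step 4 above, a minimal recursive sketch of this procedure in
Python is given below (the function name and the way ties are broken when
splitting are our own choices; other equally valid Shannon-Fano codes exist for the
same source).

def shannon_fano(symbols):
    # symbols: list of (symbol, probability) sorted by decreasing probability
    # returns a dict mapping each symbol to its binary code string
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        best_diff, cut = float("inf"), 1
        for i in range(1, len(group)):      # choose the most nearly equiprobable partition
            diff = abs(total - 2 * sum(p for _, p in group[:i]))
            if diff <= best_diff:
                best_diff, cut = diff, i
        upper, lower = group[:cut], group[cut:]
        for s, _ in upper:
            codes[s] += "0"                 # 0 to the upper set
        for s, _ in lower:
            codes[s] += "1"                 # 1 to the lower set
        split(upper)
        split(lower)

    split(list(symbols))
    return codes

# Probabilities of worked problem 1; this split gives lengths 2, 2, 2, 3, 3 (L-bar = 2.2 bits/symbol)
print(shannon_fano([("x1", 0.4), ("x2", 0.2), ("x4", 0.2), ("x3", 0.1), ("x5", 0.1)]))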
1.12.2 Huffman Coding – Procedure
1. The messages are arranged in the order of decreasing probability.

2. The two messages of lowest probability are assigned 0 and 1; i.e., combine the
probabilities of the two symbols having the lowest probabilities and reorder the
resultant probabilities. This step is called reduction 1.

3. The same procedure is repeated until only two ordered probabilities remain.

4. Start encoding with the last reduction, which consists of exactly two ordered
probabilities: (i) assign 0 as the first digit in the code words for all the source
symbols associated with the first probability; (ii) assign 1 to the second
probability.

5. Keep working backward through the reductions in this way until the first column
is reached; a short implementation sketch follows.
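
A compact Huffman coding sketch in Python is shown below (our own illustration
using the standard heapq module; different tie-breaking choices yield the two
equally optimal variants compared in worked problems 3 and 10).

import heapq
import itertools

def huffman(prob_map):
    # prob_map: dict symbol -> probability; returns dict symbol -> binary code string
    counter = itertools.count()             # tie-breaker so heap entries always compare
    heap = [(p, next(counter), {s: ""}) for s, p in prob_map.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap) # two lowest-probability entries
        p2, _, codes2 = heapq.heappop(heap)
        for s in codes1:                    # prepend 0 to one branch and 1 to the other
            codes1[s] = "0" + codes1[s]
        for s in codes2:
            codes2[s] = "1" + codes2[s]
        heapq.heappush(heap, (p1 + p2, next(counter), {**codes1, **codes2}))
    return heap[0][2]

probs = {"m1": 0.4, "m2": 0.2, "m3": 0.2, "m4": 0.1, "m5": 0.1}   # worked problem 3
codes = huffman(probs)
avg_len = sum(probs[s] * len(c) for s, c in codes.items())
print(codes, avg_len)                       # average length 2.2 bits/symbol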


1.13 PROBLEMS
1. A discrete memoryless source has symbols x1, x2, x3, x4, x5 with probabilities of
0.4, 0.2, 0.1, 0.2, 0.1 respectively. Construct a Shannon-Fano code for the
source and calculate code efficiency η.
Solution:
Using Shannon-Fano coding procedure the following table was formed.

i) To calculate Average code word length

L̄ = Σ (k=1 to K) pk Ik
  = (0.4 × 2) + (0.2 × 2) + (0.2 × 2) + (0.1 × 3) + (0.1 × 3)
L̄ = 2.2 bits/symbol

ii) To find Entropy

H(X) = Σ (k=1 to K) pk log2(1/pk)
     = 0.4 log2(1/0.4) + 0.2 log2(1/0.2) + 0.2 log2(1/0.2) + 0.1 log2(1/0.1) + 0.1 log2(1/0.1)
H(X) = 2.12192 bits/symbol

iii) To find Code Efficiency (η)

η = H(X) / L̄ = 2.12192 / 2.2 = 0.9645

iv) To find Code Redundancy (γ)

γ = 1 - η = 1 - 0.9645 = 0.0355

2. For a discrete memoryless source ‘X” with 6 symbols x1, x2,….., x6 the probabilities
are p(x1) =0.3, p(x2) =0.25, p(x3) = 0.2, p(x4) = 0.12, p(x5) = 0.08, p(x6) = 0.05
respectively. Using Shannon-Fano coding, calculate entropy, average length of
code, efficiency and redundancy of the code.
Solution:
Using Shannon-Fano coding procedure the following table was formed.
i) To calculate Average code word length

L̄ = Σ (k=1 to K) pk Ik
  = (0.3 × 2) + (0.25 × 2) + (0.2 × 2) + (0.12 × 3) + (0.08 × 4) + (0.05 × 4)
L̄ = 2.38 bits/symbol

ii) To find Entropy

H(X) = Σ (k=1 to K) pk log2(1/pk)
     = 0.3 log2(1/0.3) + 0.25 log2(1/0.25) + 0.2 log2(1/0.2)
       + 0.12 log2(1/0.12) + 0.08 log2(1/0.08) + 0.05 log2(1/0.05)
H(X) = 2.3568 bits/symbol

iii) To find Code Efficiency (η)

η = H(X) / L̄ = 2.3568 / 2.38 = 0.99

iv) To find Code Redundancy (γ)

γ = 1 - η = 1 - 0.99 = 0.01
3. Using Huffman algorithm find the average code word length and code efficiency.
Consider the 5 source symbols of a DMS with probabilities p(m1)=0.4,
p(m2)=0.2, p(m3)=0.2, p(m4)=0.1 and p(m5)=0.1.
Solution:
Method 1: Using Huffman coding procedure the following table was formed by
placing the combined signal as high as possible.

i) To calculate Average code word length

L̄ = Σ (k=1 to K) pk Ik
  = (0.4 × 2) + (0.2 × 2) + (0.2 × 2) + (0.1 × 3) + (0.1 × 3)
L̄ = 2.2 bits/symbol

ii) To find Entropy

H(X) = Σ (k=1 to K) pk log2(1/pk)
     = 0.4 log2(1/0.4) + 0.2 log2(1/0.2) + 0.2 log2(1/0.2) + 0.1 log2(1/0.1) + 0.1 log2(1/0.1)
H(X) = 2.12192 bits/symbol

iii) To find Code Efficiency (η)

η = H(X) / L̄ = 2.12192 / 2.2 = 0.9645

iv) To find Code Redundancy (γ)

γ = 1 - η = 1 - 0.9645 = 0.0355

Method 2: Using Huffman coding procedure the following table was formed by
placing the combined signal as low as possible.
i) To calculate Average code word length

L̄ = Σ (k=1 to K) pk Ik
  = (0.4 × 1) + (0.2 × 2) + (0.2 × 3) + (0.1 × 4) + (0.1 × 4)
L̄ = 2.2 bits/symbol

ii) To find Entropy

H(X) = 0.4 log2(1/0.4) + 0.2 log2(1/0.2) + 0.2 log2(1/0.2) + 0.1 log2(1/0.1) + 0.1 log2(1/0.1)
H(X) = 2.12192 bits/symbol

iii) To find Code Efficiency (η)

η = H(X) / L̄ = 2.12192 / 2.2 = 0.9645

iv) To find Code Redundancy (γ)

γ = 1 - η = 1 - 0.9645 = 0.0355
4. For a discrete memoryless source ‘X” with 6 symbols x1, x2,….., x6 the probabilities
are p(x1) =0.3, p(x2) =0.25, p(x3) = 0.2, p(x4) = 0.12, p(x5) = 0.08, p(x6) = 0.05
respectively. Using Huffman coding, calculate entropy, average length of code,
efficiency and redundancy of the code.
Solution:
Using Huffman coding procedure the following table was formed.

i) To calculate Average code word length

L̄ = Σ (k=1 to K) pk Ik
  = (0.3 × 2) + (0.25 × 2) + (0.2 × 2) + (0.12 × 3) + (0.08 × 4) + (0.05 × 4)
L̄ = 2.38 bits/symbol

ii) To find Entropy

H(X) = 0.3 log2(1/0.3) + 0.25 log2(1/0.25) + 0.2 log2(1/0.2)
       + 0.12 log2(1/0.12) + 0.08 log2(1/0.08) + 0.05 log2(1/0.05)
H(X) = 2.3568 bits/symbol

iii) To find Code Efficiency (η)

η = H(X) / L̄ = 2.3568 / 2.38 = 0.99

iv) To find Code Redundancy (γ)

γ = 1 - η = 1 - 0.99 = 0.01
5. The Binary Symmetric Channels are connected in cascade as shown below

(i) Find the channel matrix of resultant channel.


(ii) Find P(z1) and P(z2) if P(x1)= 0.6 and P(x2)=0.4.
Solution
(i) To obtain Channel Matrix
To obtain Channel Matrix, we can write two matrices of the channel as follows.

P(y 1 / x 1 ) P(y 2 / x 1 )  0.8 0.2


P (Y / X )    
P(y 1 / x 2 ) P(y 2 / x 2 )  0.2 0.8

P(z1 / y1 ) P(z2 / y1 )  0.7 0.3


P (Z/ Y)    
P(z1 / y 2 ) P(z2 / y 2 )  0.3 0.7 
Here resultant channel matrix is
P (Z/ X)  P (Y/ X).P (Z/ Y)
0.8 0.2 0.7 0.3
  
0.2 0.8 0.3 0.7 
0.62 0.38
 
0.38 0.62
(ii) To obtain P(z1) and P(z2)
The probability of z1 and z2 are given as

P (Z)  P (X).P (Z/ X)

0.62 0.38
P (Z)  P(x1 ) P(x 2 )   
0.38 0.62
0.62 0.38
P (Z)  0.6 0.4   
0.38 0.62

P (Z)  0.524 0.476 


The value of P(z1) and P(z2) are 0.524 and 0.476 respectively.

0.9 0.1
6. The channel transition matrix is given by   .Draw the channel diagram
0.2 0.8
and determines the probabilities associated with outputs assuming equiprobable
Inputs. Also find the mutual information I(X:Y) for the channel.
Solution:
P(y1 / x1 ) P(y 2 / x1 )  0.9 0.1
P   
P(y1 / x 2 ) P(y 2 / x 2 )  0.2 0.8
Inputs are equiprobable i.e., P(x1) and P(x2) are 0.5 respectively
(probabilities of 2 input symbols).
Channel Diagram
P (y 1 | x 1 )  0.9
y1
P (x 1 )  0.5
P (y 1 | x 2 )  0.2

P (y 2 | x 1 )  0.1
P (y 2 | x 2 )  0.8
y2
P (x 2 )  0.5

Output symbol probabilities


 p (y 1 )  P(y 1 / x 1 ) P(y 2 / x 1 ) 
    p (x 1 ) p (x 2 )   
 p (y 2 )  P(y 1 / x 2 ) P(y 2 / x 2 ) 
Substituting the values in the above equation
 p (y 1 )  0.9 0.1
   0.5 0.5  
 p (y 2 )  0.2 0.8

0.55
P(Y)   
0.45
i.e., P(y1) = 0.55 ; P(y2) =0.45
Mutual Information is defined as,
m n p( xi / y j )
I  X : Y    p( xi , y j ) log 2
i 1 j 1 p( xi )

From probability theory we know that


p( xi , y j )  p( xi | y j ) p( y j )
p( xi , y j )  p ( y j | xi ) p ( xi )

 p( xi | y j ) p( y j )  p( y j | xi ) p( xi )
p( xi | y j ) p( y j | xi )
 
p( xi ) p (y j )
Hence mutual information equation becomes
m n p( y j | xi )
I ( X : Y )   p( y j | xi ) p( xi ) log 2
i 1 j 1 p( y j )
p( y1 | x1 ) p( y1 | x2 )
 p( y1 | x1 ) p( x1 ) log 2  p( y1 | x1 ) p( x2 ) log 2
p( y1 ) p ( y2 )
p( y2 | x1 ) p( y2 | x2 )
 p( y2 | x1 ) p(x1 ) log 2  p( y2 | x2 ) p( x2 ) log 2
p ( y2 ) p ( y2 )

Substituting values in above equation


0.9 0.2 0.1
I ( X : Y )  (0.9)(0.5) log 2  (0.2)(0.5) log 2  (0.1)(0.5) log 2  (0.8)(0.5
0.55 0.55 0.45
 0.3197  0.1459  0.1084  0.3320

I ( X : Y )  0.3974 bits / symbol.


7. A black and white TV picture consists of about 2 × 10^6 picture elements with 16
different brightness levels, all equally probable. If pictures are repeated at the
rate of 32 per second, calculate the average rate of information conveyed by this
TV picture source. If the SNR is 30 dB, what is the minimum bandwidth required
to support the transmission of the resultant video signal?
Solution:
Picture elements = 2 × 10^6
Source levels (symbols), M = 16
Picture repetition rate = 32 per second
(S/N) in dB = 30
i) Symbol entropy:
The source emits any one of the 16 brightness levels, all equiprobable, with M = 16, so
H = log2 M = log2 16 = 4 bits/symbol
ii) Symbol rate (r):
Each picture consists of 2 × 10^6 picture elements and 32 pictures are transmitted
per second, so
r = 2 × 10^6 × 32 = 64 × 10^6 symbols/sec
iii) Average information rate (R):
R = r H = 64 × 10^6 × 4 = 256 × 10^6 bits/sec
iv) Required bandwidth for S/N = 30 dB:
(S/N)dB = 10 log10(S/N) = 30, hence S/N = 1000
From the channel capacity theorem, reliable transmission requires R ≤ C, i.e.,
256 × 10^6 ≤ B log2(1 + S/N) = B log2(1001)
B ≥ 256 × 10^6 / log2(1001)
B = 25.684 MHz
Therefore the transmission channel must have a bandwidth of at least 25.684 MHz
to transmit the resultant video signal.
8. A discrete source emits one of five symbols once every milliseconds with
probabilities 1/2, 1/4, 1/8, 1/16 and 1/16 respectively. Determine source entropy
and information rate.
Solution
(i) Source Entropy

H(X) = Σ (k=1 to K) pk log2(1/pk)
     = (1/2) log2 2 + (1/4) log2 4 + (1/8) log2 8 + (1/16) log2 16 + (1/16) log2 16
     = 1/2 + 1/2 + 3/8 + 1/4 + 1/4
H(X) = 1.875 bits/symbol

(ii) Symbol rate (r)
One symbol is emitted every millisecond, so
r = 1/Ts = 1/(1 × 10^-3) = 1000 symbols/sec

(iii) Information Rate (R)

R = r × H(X) = 1000 × 1.875
R = 1875 bits/sec
9. A discrete memory less source (DMS) has five symbols x1, x2, x3, x4, x5 with
p(x1) = 0.4, p(x2) = 0.19, p(x3) = 0.16, p(x4) = 0.15, p(x5) = 0.1 respectively.
i) Construct a Shannon-Fano code for the source and calculate code efficiency η.
ii) Construct Huffman code and compare the results.
Solution:
a) Using Shannon-Fano coding procedure the following table was formed.

i) To calculate Average code word length

L̄ = Σ (k=1 to K) pk Ik
  = (0.4 × 2) + (0.19 × 2) + (0.16 × 2) + (0.15 × 3) + (0.1 × 3)
L̄ = 2.25 bits/symbol

ii) To find Entropy

H(X) = 0.4 log2(1/0.4) + 0.19 log2(1/0.19) + 0.16 log2(1/0.16)
       + 0.15 log2(1/0.15) + 0.1 log2(1/0.1)
H(X) = 2.15 bits/symbol

iii) To find Code Efficiency (η)

η = H(X) / L̄ = 2.15 / 2.25 = 0.956

iv) To find Code Redundancy (γ)

γ = 1 - η = 1 - 0.956 = 0.044

Huffman Code
b) Using Huffman coding procedure the following table was formed.
i) To calculate Average code word length

L̄ = Σ (k=1 to K) pk Ik
  = (0.4 × 1) + (0.19 × 3) + (0.16 × 3) + (0.15 × 3) + (0.1 × 3)
L̄ = 2.2 bits/symbol

ii) To find Entropy

H(X) = 0.4 log2(1/0.4) + 0.19 log2(1/0.19) + 0.16 log2(1/0.16)
       + 0.15 log2(1/0.15) + 0.1 log2(1/0.1)
H(X) = 2.15 bits/symbol

iii) To find Code Efficiency (η)

η = H(X) / L̄ = 2.15 / 2.2 = 0.977

iv) To find Code Redundancy (γ)

γ = 1 - η = 1 - 0.977 = 0.023

Comparing the two codes, the Huffman code has the shorter average length
(2.2 against 2.25 bits/symbol) and hence the higher efficiency.
10. A Discrete Memoryless Source (DMS) emits five symbols with probabilities
p(s0)=0.4, p(s1)=0.2, p(s2)=0.2, p(s3)=0.1 and p(s4)=0.1. Compute two different
codes. Also find the average code length and the variance of the code.
Solution:
Method 1: Using the Huffman coding procedure, the following table was formed by
placing the combined symbol as high as possible (code word lengths: 1, 3, 3, 3, 3).

i) Average code word length
L̄ = Σ_{k=1 to K} pk lk
  = (0.55 × 1) + (0.15 × 3) + (0.15 × 3) + (0.10 × 3) + (0.05 × 3)
L̄ = 1.9 bits/symbol
ii) Variance of the code
σ² = Σ_k pk (lk − L̄)²
σ² = 0.55(1 − 1.9)² + 0.15(3 − 1.9)² + 0.15(3 − 1.9)² + 0.1(3 − 1.9)² + 0.05(3 − 1.9)²
σ² = 0.99
Method 2: Using the Huffman coding procedure, the following table was formed by
placing the combined symbol as low as possible (code word lengths: 1, 2, 3, 4, 4).
i) Average code word length
L̄ = Σ_{k=1 to K} pk lk
  = (0.55 × 1) + (0.15 × 2) + (0.15 × 3) + (0.10 × 4) + (0.05 × 4)
L̄ = 1.9 bits/symbol
ii) Variance of the code
σ² = Σ_k pk (lk − L̄)²
σ² = 0.55(1 − 1.9)² + 0.15(2 − 1.9)² + 0.15(3 − 1.9)² + 0.1(4 − 1.9)² + 0.05(4 − 1.9)²
σ² = 1.29
Both codes have the same average length, but Method 1 (combined symbol placed as high
as possible) gives the smaller variance and is therefore preferred.
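The average length and variance of the two codes can be verified with the following MATLAB sketch (illustrative only; the code lengths are taken from the tables in the solution):

p  = [0.55 0.15 0.15 0.10 0.05];    % symbol probabilities used in the solution
l1 = [1 3 3 3 3];                   % lengths when the combined symbol is placed high
l2 = [1 2 3 4 4];                   % lengths when the combined symbol is placed low
Lbar1 = sum(p .* l1);               % 1.9 bits/symbol
Lbar2 = sum(p .* l2);               % 1.9 bits/symbol
var1  = sum(p .* (l1 - Lbar1).^2);  % 0.99
var2  = sum(p .* (l2 - Lbar2).^2);  % 1.29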
6.4 Assignments
Unit 1
1. Determination of Entropy

Aim:
To find information and entropy of a given source.
Apparatus:
Personal Computer, MATLAB
Algorithm:
1. Enter no. of symbols.
2. Input the probabilities of symbols resp.
3. Calculate the entropy of the source, i.e. H(X), using the formula
   H(X) = Σ pk log2(1/pk) bits/symbol.
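A minimal MATLAB sketch of this experiment is given below (illustrative only; the probability vector is an assumed example, not part of the original experiment):

% Entropy of a discrete memoryless source
p = [0.5 0.25 0.125 0.0625 0.0625];      % assumed symbol probabilities (must sum to 1)
if abs(sum(p) - 1) > 1e-6
    error('Probabilities must add up to 1');
end
I = log2(1 ./ p);                        % information content of each symbol, bits
H = sum(p .* I);                         % entropy H(X), bits/symbol
disp(['H(X) = ' num2str(H) ' bits/symbol']);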
2. Determination of various entropies and mutual
information of the given channel.

Aim:
Write a program for determination of various entropies and mutual
information of a given channel.
Apparatus:
Personal Computer, MATLAB
Algorithm:
1. Input the no. of inputs of a channel.
2. Input the no. of outputs of a channel.
3. Input the channel matrix. Test the condition that sum of all the entries
in each row should be equal to 1.
4. Input the channel input probabilities. i.e. P[X].
5. Calculate the entropy of the channel input. i.e. H(X)
6. Calculate output probability matrix P[Y], by multiplying input probability
matrix by channel matrix.
7. Also calculate entropy of channel output. i.e. H(Y).
8. Convert input probability matrix into diagonal matrix. i.e. P[X]d
9. Calculate the joint probability matrix by multiplying input probability
matrix in diagonal form by channel matrix.
10. Calculate the joint entropy using the formula
    H(X,Y) = − Σi Σj p(xi, yj) log2 p(xi, yj).
11. Calculate the conditional entropies H(Y/X) and H(X/Y).
12. Calculate the mutual information as
    I(X;Y) = H(X) − H(X/Y) or
    I(X;Y) = H(Y) − H(Y/X).
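A minimal MATLAB sketch of steps 4-12 is given below (illustrative only; the 2x2 channel matrix and the input probabilities are assumed examples):

PX  = [0.5 0.5];                    % channel input probabilities P[X]
PYX = [0.9 0.1; 0.2 0.8];           % channel matrix P(Y/X), each row sums to 1
PY  = PX * PYX;                     % output probabilities P[Y]
PXY = diag(PX) * PYX;               % joint probability matrix P(X,Y)
HX  = -sum(PX .* log2(PX));         % H(X)
HY  = -sum(PY .* log2(PY));         % H(Y)
HXY = -sum(PXY(:) .* log2(PXY(:))); % joint entropy H(X,Y)
HY_X = HXY - HX;                    % H(Y/X)
HX_Y = HXY - HY;                    % H(X/Y)
IXY  = HX - HX_Y;                   % mutual information, also equal to HY - HY_X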
3. Determination of various entropies and mutual
information of the given BSC channel

Aim:
Write a program for determination of various entropies and mutual
information of a given channel. (Binary symmetric channel).
Apparatus:
Personal Computer, MATLAB
Algorithm:
1. Input the no. of inputs of a channel.
2. Input the no. of outputs of a channel.
3. Input the channel matrix. Test the condition that sum of all the entries
in each row should be equal to 1.
4. Input the channel input probabilities. i.e. P[X].
5. Calculate the entropy of the channel input. i.e. H(X)
6. Calculate output probability matrix P[Y], by multiplying input probability
matrix by channel matrix.
7. Also calculate entropy of channel output. i.e. H(Y).
8. Convert input probability matrix into diagonal matrix. i.e. P[X]d
9. Calculate the joint probability matrix by multiplying input probability
matrix in diagonal form by channel matrix.
10. Calculate the joint entropy using the formula
    H(X,Y) = − Σi Σj p(xi, yj) log2 p(xi, yj).
11. Calculate the conditional entropies H(Y/X) and H(X/Y).
12. Calculate the mutual information as I(X;Y) = H(X) − H(X/Y) or
    I(X;Y) = H(Y) − H(Y/X).
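For the BSC only the channel matrix changes; steps 5-12 are the same as in the previous experiment. A short sketch (the crossover probability p is an assumed example):

p   = 0.1;                               % crossover probability
PYX = [1-p p; p 1-p];                    % BSC channel matrix P(Y/X)
C   = 1 + p*log2(p) + (1-p)*log2(1-p);   % BSC capacity C = 1 - H(p), attained for equiprobable inputs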
4. Encoding and decoding of Huffman code
(Variable length source coding )

Aim:
Write a program for generation and evaluation of variable length source
coding using Huffman Coding and decoding. Calculate the entropy, average length
and efficiency of Huffman Coding.
Apparatus:
Personal Computer, MATLAB
Algorithm:
1. Start.
2. Input the total number of probabilities.
3. Arrange the messages in decreasing order of probabilities.
4. Add last two probabilities.
5. Assign them ‘0’ and ‘1’.
6. With this sum and the other probabilities, sort the list again in decreasing order.
7. If the sum equals the probability of an existing symbol, place it at the top.
8. Repeat from step 4 until the final sum equals 1.
9. To find the code for a particular symbol, trace the path of its probability and
write the assigned bits in reverse order.
10. Find the entropy, average code word length and efficiency.
11. Stop.
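A possible MATLAB sketch of this experiment is shown below. It assumes the Communications Toolbox functions huffmandict, huffmanenco and huffmandeco are available; the symbol set and probabilities are only an example:

symbols = [1 2 3 4 5];                         % source symbols (assumed example)
prob    = [0.4 0.19 0.16 0.15 0.1];            % their probabilities
[dict, avglen] = huffmandict(symbols, prob);   % build the Huffman dictionary
H   = sum(prob .* log2(1 ./ prob));            % source entropy
eff = H / avglen;                              % coding efficiency
msg = [1 3 2 1 5];                             % an example message
enc = huffmanenco(msg, dict);                  % encode the message
dec = huffmandeco(enc, dict);                  % decode; dec should equal msg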
6.5 Part A
Questions & Answers
(Unit-1)
UNIT-I INFORMATION THEORY
PART – A (Q&A)

Questions Bloom’s CO’S


Level
1. Define entropy and find the entropy of a DMS with K2 CO1
probabilities s1 = 1/2, s2 = 1/4 and s3 = 1/4.
The entropy of a discrete random variable, representing the output of a source of
information, is a measure of the average information content per source symbol. It is
the expectation of I(sk) over all the probable values taken by the random variable S:
H(S) = Σ_{k=0 to K-1} pk log2(1/pk)
Given: S = {s1, s2, s3} with respective probabilities of occurrence P = {1/2, 1/4, 1/4}
H(S) = (1/2) log2 2 + (1/4) log2 4 + (1/4) log2 4
H(S) = 1.5 bits/symbol

2. State Shannon’s Channel capacity theorem/What is K1 CO1


Shannon’s limit?
The information capacity (C) of a continuous channel of bandwidth B hertz,
perturbed by AWGN of total noise power N within the channel bandwidth, is given by
the formula
C = B log2(1 + S/N) bits/sec
where,
C - Information Capacity
B - Channel Bandwidth
S - Signal power (or) Average Transmitted power
N -Total noise power within the channel bandwidth
It is easier to increase the information capacity of a
continuous communication channel by expanding its
bandwidth than by increasing the transmitted power for a
prescribed noise variance.
UNIT-I INFORMATION THEORY
PART – A (Q&A)

Questions Bloom’s CO’S


Level
3. State the properties of entropy. K2 CO1
Property 1: H(S) = 0, if, and only if, the probability pk = 1
for some k, and the remaining probabilities in the set are all
zero; this lower bound on entropy corresponds to no
uncertainty.
Property 2: H(S) = log K, if, and only if, pk = 1/K for all k
(i.e., all the symbols in the source alphabet 𝒮 are
equiprobable); this upper bound on entropy corresponds to
maximum uncertainty.

4. A source generates 3 messages with probabilities of K2 CO1


0.5, 0.25,0.25. Calculate source entropy.
S = {s1, s2, s3}, respective probabilities of occurrence P = {0.5, 0.25, 0.25}
Entropy H(S) = Σ_{k=0 to K-1} pk log2(1/pk)
H(S) = 0.5 log2(1/0.5) + 0.25 log2(1/0.25) + 0.25 log2(1/0.25)
H(S) = 1.5 bits/symbol

5. Define mutual information. K1 CO1


The mutual information I(X;Y) is a measure of the
uncertainty about the channel input, which is resolved by
observing the channel output.
The mutual information I(xi, yj) of a channel is defined as the amount of
information transferred when xi is transmitted and yj is received:
I(xi, yj) = log2 [ p(xi | yj) / p(xi) ] bits
where
I(xi, yj) - mutual information
p(xi | yj) - conditional probability that xi was transmitted given that yj was received
p(xi) - probability of transmission of symbol xi
UNIT-I INFORMATION THEORY
PART – A (Q&A)

Questions Bloom’s CO’S


Level
6. Define Channel Capacity. K1 CO1
The channel capacity of a discrete memoryless channel is defined as the maximum
average mutual information I(X;Y) in any single use of the channel (i.e., signaling
interval), where the maximization is over all possible input probability
distributions {p(xi)} on X. It is measured in bits per channel use.
Channel capacity C = max I(X;Y)
The channel capacity C is a function only of the transition probabilities p(yj|xi),
which define the channel.

7. State Source Coding theorem. K1 CO1


An efficient source encoder must satisfy two functional requirements:
i) The code words produced by the encoder are in binary form.
ii) The source code is uniquely decodable, so that the original source sequence can
be reconstructed perfectly from the encoded binary sequence.
Given a discrete memoryless source of entropy H(S) (or H(X)), the average code word
length for any distortionless source encoding is bounded as
L̄ ≥ H(X)   (or)   L̄ ≥ H(S)
where the entropy H(S) or H(X) is the fundamental limit on the average number of
bits per source symbol. Thus, with L̄min = H(X), the coding efficiency is given by
η = H(X) / L̄
8. Using Shannon law determine the maximum capacity K1 CO1
of 5MHz channel with S/N ratio of 10 dB.
Channel capacity (C) is given by:
C = B log2(1 + S/N) bits/sec
B = 5 MHz, S/N = 10 dB = 10 (dB to decimal conversion)
C = 5 × 10^6 log2(1 + 10) = 5 × 10^6 log2(11) = 17.29 Mbits/sec
UNIT-I INFORMATION THEORY
PART – A (Q&A)

Questions Bloom’s
Level
9. Comment the tradeoff bandwidth and signal to K3 CO1
noise ratio.
Channel capacity is given by:
C = B log2(1 + S/N) bits/sec
Trade-off:
i) A noiseless channel has infinite capacity. If there is no noise in the channel,
then N = 0, hence S/N → ∞ and the capacity C → ∞.
ii) Infinite bandwidth gives only limited capacity. As B increases, the noise power
N also increases, so the S/N ratio decreases. Hence, even as B approaches infinity,
the capacity approaches a finite limit and not infinity.

K2 CO1
10.Mention the properties of mutual information.

Properties of Mutual Information


i)Symmetry Property
The mutual information of a channel is symmetric in
the sense that
I(X;Y) =I(Y;X)

ii) Expansion of the Mutual Information property


The mutual information of a channel is related to the
joint entropy of the channel input and channel output by

I(X;Y) = H(X) + H(Y) − H(X,Y)
iii) Non negativity Property
The mutual information is always non-negative. We
cannot lose information, on the average, by observing the
output of a channel.
I(X;Y) ≥ 0
UNIT-I INFORMATION THEORY
PART – A (Q&A)

Questions Bloom’s
Level
11. Define Discrete Memoryless Source. K1 CO1
A source in which each symbol is independent of the previous symbols. A DMS can be
characterized by the list of symbols, the probability assignment to these symbols,
and the specification of the rate at which the source generates these symbols.

12. Define Entropy. K1 CO1


The entropy H(S) of a discrete random variable,
representing the output of a source of information, is a
measure of the average information content per source
symbol. The expectation of I(sk) over all the probable values
taken by the random variable S is given by
H(S) = E[I(sk)]
H(S) = Σ_{k=0 to K-1} pk log2(1/pk)

13. Define Discrete memoryless channel. K1 CO1


Discrete Memoryless Channel is a statistical
model with an input X and output Y (i.e., noisy version of
X). Both X and Y are random variables.
The channel is said to be discrete when both X and Y have finite alphabet sizes. It
is said to be memoryless when the
current output symbol depends only on the current input
symbol and not on previous ones.

14. List various types of data coding methods. K1 CO1


i) Shannon-Fano Code
ii) Huffman Code
iii) Prefix Code
iv) Lempel-Ziv Code

15. Define Information Rate. K1 CO1


If a source generates messages at a rate of r messages per second, the information
rate R is the average number of bits of information per second:
R = rH (bits/second)
where H is the entropy (bits/message), i.e. the average information, and r is the
rate at which messages are generated (messages/sec).
UNIT-I INFORMATION THEORY
PART – A (Q&A)

Questions Bloom’s
Level
16. Define mutual information I(X;Y) between two K1 CO1
discrete random variables X and Y
The mutual information between two random variables X and
Y can be stated formally as follows: I(X ; Y) = H(X) – H(X | Y)

17. What is the capacity of channel having infinite bandwidth? K1 CO1


As B → ∞, the channel capacity does not become infinite since,
with an increase in bandwidth, the noise power also increases.
If the noise power spectral density is η/2, then the total noise
power is N = ηB, so the Shannon-Hartley law becomes
C = B log2(1 + S/(ηB)) bits/sec
As B → ∞, C approaches the finite limit C∞ = (S/η) log2 e ≈ 1.44 S/η.
This gives the maximum information transmission rate possible for a system of given
power but no bandwidth limitations.
18. Define Prefix code. K1 CO1
Prefix codes are the codes in which no codeword is a prefix
of any other codeword. The idea is usually applied to
variable-length codes. A prefix code has the property that, as
soon as all the symbols of a codeword have been received,
the codeword is recognized as such. Prefix codes are
therefore said to be instantaneously decodable and uniquely
decodable.

19. A voice grade channel of the telephone network has a


bandwidth of 3.4 kHz. Calculate the information capacity
of the telephone channel for a signal-to-noise ratio of 30
dB.
Solution: Given that
B = 3.4 kHz and SNR = 30 dB, i.e. SNR = 1000
C = B log2(1 + SNR)
Substituting the values,
C = 3.4 × 10^3 log2(1 + 1000) = 3400 × 9.9672
C = 33888.57 bits/sec ≈ 33.89 kbps
6.6 Part B
Questions
(Unit 1)
UNIT-I INFORMATION THEORY
PART – B (Questions)

Questions Bloom’s CO’S


Level

1. A DMS has six symbols x1, x2, x3, x4, x5, x6 with probability K3 CO1
of emission 0.2, 0.3,0.11,0.16,0.18,0.05 encode the source
with Huffman and Shannon–Fano codes and compare their
efficiencies.

2. i) Derive the mutual information I(x;y) for a binary K3 CO1


symmetric channel, when the source symbols are equally likely and the channel
transition probability is p = 0.5.

ii) For a source emitting three symbols with probabilities p(X)


={1/8,1/4,5/8} and p(Y/X) as given in the table, where X
and Y represent the set of transmitted and received symbols
respectively, compute H(X), H(X/Y) and H(Y/X).
             y1    y2    y3
P(Y/X) = x1 [ 2/5   2/5   1/5 ]
         x2 [ 1/5   2/5   2/5 ]
         x3 [ 2/5   1/5   2/5 ]
3. i) Consider a binary memoryless source X with two symbols K3 CO1
x1 and x2. Prove that H(X) is maximum when x1 and x2 are
equiprobable.

ii) Given a telegraph source having two symbols dot and


dash. The dot duration is 0.2 sec. The dash duration is 3
times the dot duration. The probability of the dot occurring is
twice that of the dash, and the time between symbols is 0.2
sec. Calculate the information rate of the telegraph source.

4. i) Find the channel capacity of the binary erasure channel as K2 CO1


shown in fig below.
UNIT-I INFORMATION THEORY
PART – B (Questions)

Questions Bloom’s CO’S


Level
4. ii) A source is emitting equiprobable symbols. Construct a K3 CO1
Huffman code for source.

5. State Shannon’s theorems and explain. K2 CO1

6. A discrete memoryless source has five symbols x1,x2,x3,x4 K3 CO1


and x5 with probabilities 0.4, 0.19, 0.16, 0.15 and 0.1
respectively attached to each symbol.
i) Construct a Shannon-Fano code for the source and
calculate code efficiency.
ii) Construct the Huffman code and compare the two source
coding techniques.

7. i) State and prove mutual information and write the K2 CO1


properties of mutual Information. [K2]
(ii) Derive Shannon - Hartley theorem for the channel
capacity of a continuous channel having an average
power limitation and perturbed by an additive band-limited
white Gaussian noise.

8. Consider a discrete memory less source with seven possible K3 CO1


symbols Xi = {1,2,3,4,5,6,7} with associated probabilities Pi
= {0.37, 0.33, 0.16, 0.07, 0.04, 0.02, 0.01}. Construct the Huffman
code and Shannon Fano code and determine the coding
efficiency and redundancy.

9. (i) The two binary random variables X and Y are distributed K3 CO1
according to the joint PMF given by P(X = 0, Y = 1) = 1/4;
P(X = 1, Y = 1) = 1/2; and P(X = 1, Y = 0) = 1/4.
Determine H(X,Y), H(X), H(Y), H(X/Y) and H(Y/X).
(ii) Define entropy and plot the entropy of a binary source.

10. Explain the Huffman coding algorithm with a flowchart and K3 CO1
illustrate it using an example.
UNIT-I INFORMATION THEORY
PART – B (Questions)

Questions Bloom’s CO’S


Level

11. A source emits one of the four symbols A, B, C and D with K3 CO1
probabilities 1/3, 1/6, 1/4 and 1/4 respectively. The emissions
of symbols by the source are statistically independent. Calculate
the coding efficiency if Shannon-Fano coding is used.

12. (i) Discuss about discrete memoryless channels. K2 CO1


(ii) Explain the properties of entropy.

13. Compute two different Huffman codes for the source with K3 CO1
the probabilities 0.4, 0.2, 0.2, 0.1, 0.1 .

14. State Shannon’s theorem and illustrate Shannon Fano K1 CO1


algorithm with example.

K2, K3 CO1
15. i) Derive the channel capacity of band limited Gaussian
channel. [K2]
ii) Calculate the channel capacity of the channel with the
channel matrix shown below: [K3]

0.4 0.4 0.1 0.1


P (Y / X )   
 0.1 0.1 0.4 0.4 

16. i) For the given channel matrix compute the mutual K3 CO1
information I(x,y) with P(x1) = 1/2 and P(x2) = 1/2.

2 1 
3 3 0
P (Y / X )   
0 1 5 
 6 6 

ii) Construct Huffman code for the following message set.


X = [x1, x2, x3, x4, x5, x6, x7, x8] with probabilities P(x) =
[0.07, 0.08, 0.04, 0.26, 0.14, 0.4, 0.005, 0.005]. Compute
the coding efficiency and redundancy.
UNIT-I INFORMATION THEORY
PART – B (Questions)

Questions Bloom’s CO’S


Level

17. Examine the effectiveness of discrete memoryless channels. K2 CO1

18. A DMS has six symbols x1, x2, x3, x4, x5, x6 with probability of K3 CO1
emission 0.2, 0.3, 0.11, 0.16, 0.18, 0.05. Encode the source
with Huffman and Shannon-Fano codes and compare their
efficiencies.

19. i) Derive the mutual information I(X;Y) for a binary K2,K3 CO1
symmetric channel, when the source symbols are equally likely and the channel
transition probability is p = 0.5. [K2]

ii) For a source emitting three symbols with probabilities p(X)
= {2/3, 1/6, 1/6} and p(Y/X) as given in the table, where X
and Y represent the sets of transmitted and received symbols
respectively, compute H(X), H(X/Y) and H(Y/X). [K3]

1 2 2
5 5 5
 
2 2 1
P (Y / X )  
5 5 5
 
2 1 2
 5 5 5 
6.7 Supportive online
Certification Courses
NPTEL REFERENCE VIDEO LINKS
(For Extended Learning)

1. https://nptel.ac.in/courses/117/101/117101051/
Topics covered:
1. Introduction to Digital Communication
2. Sampling
3. Quantization
4. Encoding
5. PCM and Delta Modulation
6. Channels and Models
7. Information Theory
8. Digital Modulation Techniques
9. Source Coding
10. Equalizers
11. Channel Coding

2. https://nptel.ac.in/courses/108/102/108102096/
Topics covered:
All the topics of Unit I to V

3. https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-450-
principles-of-digital-communications-i-fall-2006/video-lectures/
Topics covered:
All the topics of Unit I to V

4. http://www.infocobuild.com/education/audio-video-courses/electronics/modern-
digital-communication-iit-kharagpur.html
Topics covered:
All the topics of Unit I to V
6.8 Real time
Applications in Day to
Day life and to Industry
A. Intelligence uses and secrecy applications

Information theoretic concepts apply to cryptography and cryptanalysis.


Turing's information unit was used in the Ultra project, breaking the German
Enigma machine code and hastening the end of World War II in Europe. Shannon
himself defined an important concept now called the unicity distance. Based on
the redundancy of the plaintext, it attempts to give a minimum amount of
ciphertext necessary to ensure unique decipherability.

Information theory leads us to believe it is much more difficult to keep


secrets than it might first appear. A brute force attack can break systems based
on asymmetric key algorithms or on most commonly used methods of symmetric
key algorithms (sometimes called secret key algorithms), such as block ciphers.
The security of all such methods currently comes from the assumption that no
known attack can break them in a practical amount of time.

Information theoretic security refers to methods such as the one-time pad


that are not vulnerable to such brute force attacks. In such cases, the positive
conditional mutual information between the plaintext and ciphertext (conditioned
on the key) can ensure proper transmission, while the unconditional mutual
information between the plaintext and ciphertext remains zero, resulting in
absolutely secure communications.

In other words, an eavesdropper would not be able to improve his or her


guess of the plaintext by gaining knowledge of the ciphertext but not of the key.
However, as in any other cryptographic system, care must be used to correctly
apply even information-theoretically secure methods; the Venona project was able
to crack the one-time pads of the Soviet Union due to their improper reuse of key
material.

https://www.youtube.com/watch?v=-hfyUwhjK4A
B. Pseudorandom Number Generation

Pseudorandom number generators are widely available in computer


language libraries and application programs. They are, almost universally,
unsuited to cryptographic use as they do not evade the deterministic nature of
modern computer equipment and software. A class of improved random number
generators is termed cryptographically secure pseudorandom number generators,
but even they require random seeds external to the software to work as intended.
These can be obtained via extractors, if done carefully.

The measure of sufficient randomness in extractors is min-entropy, a value


related to Shannon entropy through Rényi entropy; Rényi entropy is also used in
evaluating randomness in cryptographic systems. Although related, the
distinctions among these measures mean that a random variable with high
Shannon entropy is not necessarily satisfactory for use in an extractor and so for
cryptography uses.

https://www.youtube.com/watch?v=GtOt7EBNEwQ

C. Seismic Exploration

One early commercial application of information theory was in the field of


seismic oil exploration. Work in this field made it possible to strip off and separate
the unwanted noise from the desired seismic signal. Information theory and
digital signal processing offer a major improvement of resolution and image clarity
over previous analog methods.

D. Semiotics

Semioticians Doede Nauta and Winfried Nöth both considered Charles


Sanders Peirce as having created a theory of information in his works on
semiotics. Nauta defined semiotic information theory as the study of "the internal
processes of coding, filtering, and information processing."
Concepts from information theory such as redundancy and code control
have been used by semioticians such as Umberto Eco and Ferruccio Rossi-Landi to
explain ideology as a form of message transmission whereby a dominant social
class emits its message by using signs that exhibit a high degree of redundancy
such that only one message is decoded among a selection of competing ones.

https://www.youtube.com/watch?v=IZ1GsP1soC4

E. Miscellaneous applications

Information theory also has applications in gambling, black holes and
bioinformatics.

https://www.youtube.com/watch?v=IZ1GsP1soC4
6.9 Content Beyond
the Syllabus
(Unit 1)
LOSSY COMPRESSION TECHNIQUES

A. Difference between Lossy Compression and Lossless


Compression
Data compression is a technique in which the size of data is reduced without loss
of information. Lossy compression and lossless compression are the two categories of
data compression methods. The main difference between the two techniques is that
lossy compression does not restore the data in its original form after
decompression, whereas lossless compression restores and rebuilds the data in its
original form after decompression.

Comparison of lossy and lossless compression:
1. Lossy compression eliminates data that is not noticeable; lossless compression
does not eliminate any data.
2. With lossy compression a file cannot be restored or rebuilt in its original form;
with lossless compression the file can be restored in its original form.
3. Lossy compression compromises the quality of the data; lossless compression does
not.
4. Lossy compression gives a much greater reduction in data size than lossless
compression.
5. Algorithms used in lossy compression: transform coding, Discrete Cosine
Transform, Discrete Wavelet Transform, fractal compression, etc. Algorithms used in
lossless compression: Run Length Encoding, Lempel-Ziv-Welch, Huffman coding,
arithmetic encoding, etc.
6. Lossy compression is used for images, audio and video; lossless compression is
used for text, images and sound.
7. Lossy-compressed data has more data-holding capacity; lossless-compressed data
has less data-holding capacity than the lossy technique.
8. Lossy compression is also termed irreversible compression; lossless compression
is also termed reversible compression.
B. LOSSY COMPRESSION METHODS

Our eyes and ears cannot distinguish subtle changes. In such cases, we
can use a lossy data compression method. These methods are cheaper—they take
less time and space when it comes to sending millions of bits per second for
images and video. Several methods have been developed using lossy compression
techniques. JPEG (Joint Photographic Experts Group) encoding is used to
compress pictures and graphics, MPEG (Moving Picture Experts Group) encoding
is used to compress video, and MP3 (MPEG audio layer 3) for audio compression.
Image compression – JPEG encoding

An image can be represented by a two-dimensional array (table) of picture


elements (pixels).

A grayscale picture of 307,200 pixels is represented by 2,457,600 bits, and


a color picture is represented by 7,372,800 bits. In JPEG, a grayscale picture is
divided into 8 × 8 pixel blocks to decrease the number of calculations
because, as we will see shortly, the number of mathematical operations for each
picture is the square of the number of units.

Figure JPEG grayscale example, 640 × 480 pixels


The whole idea of JPEG is to change the picture into a linear (vector) set
of numbers that reveals the redundancies. The redundancies (lack of changes)
can then be removed using one of the lossless compression methods we studied
previously. A simplified version of the process is shown in Figure.

https://www.youtube.com/watch?v=Ba89cI9eIg8
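The block-transform idea can be illustrated with a small MATLAB sketch (a toy example under assumptions, not the full JPEG standard; dct2/idct2 require the Image Processing Toolbox and the threshold of 10 is arbitrary):

block = uint8(128 + 20*randn(8));     % an assumed 8 x 8 grayscale block
C = dct2(double(block) - 128);        % level-shift and take the 2-D DCT
C(abs(C) < 10) = 0;                   % crude quantization: discard small coefficients
rec = uint8(idct2(C) + 128);          % reconstructed block, close to but not equal to the original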
7.ASSESSMENT SCHEDULE
Unit Test I : 22.08.2022 TO 27.08.2022

Internal Assessment Test I : 16.09.2022 TO 22.09.2022

Unit Test II : 10.10.2022 TO 15.10.2022

Internal Assessment Test II : 2.11.2022 TO 8.11.2022

Model Examination : 1.12.2022 TO 10.12.2022


8.Prescribed Text Books
and Reference Books
TEXT BOOK
1. S. Haykin, “Digital Communications”, John Wiley, 2005.

REFERENCES
1. B. Sklar, “Digital Communication Fundamentals and Applications”, 2nd Edition,
Pearson Education, 2009.
2. B.P. Lathi, “Modern Digital and Analog Communication Systems”, 3rd Edition,
Oxford University Press, 2007.
3. H.P. Hsu, Schaum's Outline Series – “Analog and Digital Communications”, TMH, 2006.
4. J.G. Proakis, “Digital Communication”, 4th Edition, Tata McGraw Hill, 2001.

ADDITIONAL REFERENCES
1. Dennis Roddy & John Coolen, “Electronic Communication”, 4th Edition, Prentice
Hall of India.
2. Herbert Taub & Donald L Schilling, “Principles of Communication Systems”, 3rd
Edition, Tata McGraw Hill, 2008
3. Simon Haykin, ”Communication Systems”, John Wiley & sons, NY, 4th Edition,
2001.
4. Bruce Carlson, ”Communication Systems”, 3rd Edition, Tata Mc Graw Hill.
5. R.P. Singh and S.D. Sapre, “Communication Systems – Analog and Digital”, Tata
McGraw Hill, 2nd Edition, 2007.
6. J.G. Proakis, M. Salehi, “Fundamentals of Communication Systems”, Pearson
Education, 2006.
7. Couch, L., “Modern Communication Systems”, Pearson, 2001.
8. S. Haykin, “Communication Systems”, John Wiley & Sons.
9. A.B. Carlson, “Communication Systems”, McGraw-Hill.
10. P. Chakrabarti, “Analog Communication Systems”, Dhanpat Rai & Co.
11. Taub, Herbert & Schilling, Donald L., “Communication Systems”, Tata McGraw-Hill.
12. Carlson, A. Bruce, Crilly, Paul B. & Rutledge, Janet C., “Communication Systems:
An Introduction to Signals & Noise in Electrical Communication”, Tata McGraw-Hill.
13. Kennedy, George & Davis, Bernard, “Electronic Communication Systems”, 4th
Edition, Tata McGraw-Hill.
14. Singh, R.P. & Sapre, S.D., “Communication Systems: Analog and Digital”, Tata
McGraw-Hill.
15. Sanjay Sharma, “Communication Systems (Analog and Digital)”, Reprint 2013
Edition, S.K. Kataria & Sons, 2013.
9.MINI PROJECT
Mini Project List
1.Simulate Entropy and Mutual Information for a lossless
channel
MATLAB PROGRAM for Entropy and Mutual Information of lossless
channel - MATLAB Programming (matlabcoding.com)
2. Simulate Entropy and Mutual Information for a Binary
Symmetric channel
MATLAB PROGRAM for Entropy and Mutual Information for Binary
Symmetric Channel - MATLAB Programming (matlabcoding.com)
3.Simulate Shannon Fano Coding
Shannon-Fano Encoding using MATLAB (m-file) - MATLAB
Programming (matlabcoding.com)
4. Simulate Huffman coding using any programming language
nayuki/Reference-Huffman-coding (GitHub)
Mini Project List
5.Implement the Shannon Fano Coding Algorithm for Data
Compression
https://www.geeksforgeeks.org/shannon-fano-algorithm-for-
data-compression/
6. Huffman Encoding and Decoding in MATLAB
https://www.electronicsforu.com/electronics-
projects/software-projects-ideas/huffman-coding-decoding-
matlab
7.Implementation of image compression algorithm
https://in.mathworks.com/videos/accelerate-image-
compression-algorithm-77688.html
Thank you

Disclaimer:

This document is confidential and intended solely for the educational purpose of RMK Group of
Educational Institutions. If you have received this document through email in error, please notify the
system manager. This document contains proprietary information and is intended only to the
respective group / learning community as intended. If you are not the addressee you should not
disseminate, distribute or copy through e-mail. Please notify the sender immediately by e-mail if you
have received this document by mistake and delete this document from your system. If you are not
the intended recipient you are notified that disclosing, copying, distributing or taking any action in
reliance on the contents of this information is strictly prohibited.
