21EC51_DC_Module_4
Information Theory
In this chapter, we attempt to answer two basic questions that arise in the analysis and
design of communication systems:
(1) Given an information source, how do we evaluate the "rate" at which the source is
emitting information?
(2) Given a noisy communication channel, how do we evaluate the maximum "rate" at
which "reliable" information transmission can take place over the channel?
We develop answers to these questions based on probabilistic models for information
sources and communication channels.
Information sources can be classified into two categories: analog (or continuous-valued)
and discrete. Analog information sources, such as a microphone actuated by a voice
signal, emit a continuous-amplitude, continuous-time electrical waveform. The output of a
discrete information source such as a teletype consists of sequences of letters or symbols.
Analog information sources can be transformed into discrete information sources through
the process of sampling and quantizing.
If the units of information are taken to be binary digits or bits, then the average
information rate represents the minimum average number of bits per second needed to
represent the output of the source as a binary sequence.
For a bandlimited channel with bandwidth B, Shannon has shown that the capacity C is
equal to B log2(1 + S/N), where S is the average signal power at the output of the channel
and N is the average power of the bandlimited Gaussian noise that accompanies the
signal.
MEASURE OF INFORMATION
Information Content of a Message
The output of a discrete information source is a message that consists of a sequence of
symbols. The actual message that is emitted by the source during a message interval is
selected at random from a set of possible messages. The communication system is
designed to reproduce at the receiver either exactly or approximately the message emitted
by the source. Some messages produced by an information source contain more
information than other messages. To get the concept of the “amount of information” let us
consider the following example.
E.g.: - Suppose you are planning a trip to Miami, Florida from Minneapolis in the winter
time. To determine the weather in Miami, you telephone the Miami weather bureau and
receive one of the following forecasts:
1. mild and sunny day,
2. cold day,
3. possible snow flurries.
The amount of information received is obviously different for these messages.
1. It contains very little information since the weather in Miami is mild and sunny
most of the time.
2. The forecast of a cold day contains more information since it is not an event that
occurs often.
3. In comparison, the forecast of snow flurries conveys even more information since
the occurrence of snow in Miami is a rare event.
Thus on an intuitive basis the amount of information received from the knowledge of
occurrence of an event is related to the probability or the likelihood of occurrence of the
event. The message associated with an event least likely to occur contains most
information.
4. When two independent messages are received the total information content is the
sum of the information conveyed by each of the two messages.
Mathematically, we can state this requirement by defining the amount of information in the kth message, with probability of occurrence pk, as
I(mk) = log (1/pk) = − log (pk)
The base for the logarithm determines the unit assigned to the information
content.
● If the natural logarithm base is used, then the unit is nat.
● If the base is 10, then the unit is Hartley or decit.
● If the base is 2, then the unit of information is the familiar bit, an
abbreviation for binary digit.
Note: Unless otherwise specified, we will use the base 2 in the definition of information
content.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example 4.1
A source puts out one of five possible messages during each message interval. The
probabilities of these messages are
p1 = 1/2, p2 = 1/4, p3 = 1/8, p4 = 1/16, p5 = 1/16. Find the information content of each of these messages.
Solution: I (m1) = log2 (1/p1) = log2 (2) ⇒ I (m1) = 1 bit
I (m2) = log2 (4) = 2 bits, I (m3) = log2 (8) = 3 bits, I (m4) = I (m5) = log2 (16) = 4 bits
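As a quick numerical check, the short Python sketch below evaluates I(m) = log2(1/p) for the message probabilities listed above (the variable names are chosen here only for illustration).

```python
from math import log2

# Message probabilities from the example above
probabilities = {"m1": 1/2, "m2": 1/4, "m3": 1/8, "m4": 1/16, "m5": 1/16}

for message, p in probabilities.items():
    info = log2(1 / p)                 # information content in bits
    print(f"I({message}) = {info:.0f} bit(s)")
```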
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Average Information Content (Entropy) of Symbols in Long Independent Sequences
Messages produced by information sources consist of sequences of symbols. While the
receiver of a message may interpret the entire message as a single unit, communication
systems often have to deal with individual symbols. For example, if we are sending
messages in the English language using a teletype, the user at the receiving end is
interested mainly in words, phrases, and sentences, whereas the communication system
has to deal with individual letters or symbols. Hence, from the point of view of
communication systems that have to deal with symbols, we need to define the information
content of symbols.
When we attempt to define the information content of symbols, we need to keep the
following two factors in mind:
1. The instantaneous flow of information in a system may fluctuate widely due to the
randomness involved in the symbol selection. Hence we need to talk about
average information content of symbols in a long message.
2. The statistical dependence of symbols in a message sequence will alter the
average information content of symbols. For example, the presence of the letter U
following Q in an English word carries less information than the presence of the
same letter U following the letter T.
We will first define the average information content of symbols assuming the source
selects or emits symbols in a statistically independent sequence, with the probabilities of
occurrence of various symbols being invariant with respect to time.
Suppose we have a source that emits one of M possible symbols s1, s2, ... ,sM in a
statistically independent sequence. That is, the probability of occurrence of a particular
symbol during a symbol interval does not depend on the symbols emitted previously, and the
probabilities p1, p2, ..., pM of the M symbols remain constant with time. In a long message
containing N symbols, the symbol s1 occurs approximately p1N times, s2 occurs p2N times, and so on.
The total information content of the message is then the sum of the contributions due to
each of the M symbols of the source alphabet and is given by
I total = p1N log2 (1/p1) + p2N log2 (1/p2) + p3N log2 (1/p3) + ⋯ + pMN log2 (1/pM) bits
We obtain the average information per symbol by dividing the total information content
of the message by the number of symbols in the message, as
H = H(s) = I total / N = ∑ pi log2 (1/pi) bits/symbol, where the sum runs over the M symbols.
This quantity H is called the entropy of the source. For example, for a source emitting three symbols with probabilities ½, ¼ and ¼,
H = p1 log2 (1/p1) + p2 log2 (1/p2) + p3 log2 (1/p3)
H = (½) log2 (2) + (¼) log2 (4) + (¼) log2 (4)
H = 1.5 bits/symbol
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
In general, for a source with an alphabet of M symbols, the maximum entropy is attained
when the symbol probabilities are equal, that is, when p1 = p2 = ... = pM = l/M, and Hmax is
given by
Hmax = log2 M bits/symbol
It was mentioned earlier that symbols are emitted by the source at a fixed time rate, say rs
symbols/sec. The average source information rate R in bits per second is defined as the product of
the average information content per symbol and the symbol rate rs:
R = rsH bits/sec
The abbreviation BPS is often used to denote bits per second.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example 4.3
A discrete source emits one of five symbols once every millisecond. The symbol
probabilities are ½, ¼, ⅛, 1/16 and 1/16 respectively. Find the source entropy and the information rate.
Solution: H = H(s) = ∑ pi log2 (1/pi)
H = p1 log2 (1/p1) + p2 log2 (1/p2) +p3 log2 (1/p3) + p4 log2 (1/p4) + p5 log2 (1/p5)
H = (½) log2(1/ (½)) + (¼) log2(1/(¼)) +(⅛) log2(1/(⅛)) + (1/16) log2(1/(1/16)) +
(1/16) log2(1/(1/16))
H = (½) log2 (2) + (¼) log2 (4) + (⅛) log2 (8) + (1/16) log2 (16) + (1/16) log2 (16)
H = 0.5 + 0.5 + 0.375 + 0.25 + 0.25
H = 1.875 bits/ sym
Information rate R is
R = rsH = (1000) (1.875) = 1875 bits/ sec
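The entropy and information-rate calculation of this example is easy to reproduce in code; the sketch below is only an illustration of the formulas H = ∑ p log2(1/p) and R = rs H.

```python
from math import log2

def entropy(probs):
    """Average information content H(S) = sum p*log2(1/p), in bits/symbol."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

probs = [1/2, 1/4, 1/8, 1/16, 1/16]    # symbol probabilities of Example 4.3
rs = 1000                              # one symbol every millisecond

H = entropy(probs)
print(f"H = {H:.3f} bits/symbol")           # 1.875 bits/symbol
print(f"R = rs*H = {rs * H:.0f} bits/sec")  # 1875 bits/sec
```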
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Note: 1 Hartley = ln(10) nats ≈ 2.303 nats [since log10 (x) = ln(x) / ln(10)].
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
In a code consisting of dots and dashes, a dot is three times as likely to occur as a dash, so that pdot = 3/4 and pdash = 1/4.
(a) The information content of a dot, I dot = log2 (1/ pdot ) = log2 (4/ 3) = 0.415 bits, and
I dash = log2 (1/ pdash) = log2 (4) = 2 bits
(b) The average information in the dot-dash code
H(s) = pdot log2 (1/pdot) + pdash log2 (1/pdash) = (3/4)(0.415) + (1/4)(2) = 0.811 bits/symbol
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Consider drawing a card at random from a deck of 52 playing cards.
(a) The probability that the card is a spade is pspade = 13/52 = 1/4, so Ispade = log2 (4) = 2 bits.
(b) The probability that the card is an ace is
pace = 4/52 ⇒ pace = 1/13
Iace = log2 (1/ pace ) = log2 (13) ⇒ Iace = 3.7 bits
(c) There is only one "ace of spades" present in a deck of 52 cards.
pace_spade = 1/52
Iace_spade = log2 (1/ pace_spade ) = log2 (52) ⇒ Iace_spade = 5.7 bits
Is the information content of the message "ace of spades" equal to the sum of the information contents of the messages "spade" and "ace"?
Yes. The information content of the message "ace of spades" is the sum of the
information contents of the messages "spade" and "ace".
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example 4.8
A binary source is emitting an independent sequence of 0's and 1's with probabilities p
and (1 - p), respectively. Plot the entropy of this source versus p (0 < p < 1).
Solution: Given p0 = p and p1 = (1- p)
H(s) = p log2 (1/p) + (1 − p) log2 (1/(1 − p))
Let p = 0.1, H(s) = 0.1 log2 (1/0.1) + 0.9 log2 (1/0.9) ⇒ H(s) = 0.469 bits/ sym
Let p = 0.3, H(s) = 0.3 log2 (1/0.3) + 0.7 log2 (1/0.7) ⇒ H(s) = 0.881 bits/ sym
Let p = 0.4, H(s) = 0.4 log2 (1/0.4) + 0.6 log2 (1/0.6) ⇒ H(s) = 0.971 bits/ sym
Let p = 0.5, H(s) = 0.5 log2 (1/0.5) + 0.5 log2 (1/0.5) ⇒ H(s) = 1 bit/ sym
For p = 0 & p= 1, H(s) = 0
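The shape of H(s) versus p can be tabulated (or plotted) with a few lines of code; the sketch below evaluates the binary entropy function at several values of p, including the end points where H(s) = 0.

```python
from math import log2

def binary_entropy(p):
    """Entropy of a binary source with P(0) = p, P(1) = 1 - p, in bits/symbol."""
    if p in (0.0, 1.0):
        return 0.0
    return p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))

for p in (0.1, 0.3, 0.4, 0.5, 0.9, 1.0):
    print(f"p = {p:.1f}  ->  H(S) = {binary_entropy(p):.3f} bits/symbol")
```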
The entropy is maximum when all the symbols are equiprobable, i.e., for p1 = p2 = p3 = ⋯ = pM = 1/M, H(s) = Hmax = log2 M.
The source efficiency is defined as ηs = H(s)/Hmax
and the source redundancy as Rηs = 1 − ηs
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example 4.9
A black and white TV picture consists of 525 lines of picture information. Assume that
each line consists of 525 picture elements (pixels) and that each element can have 256
brightness levels. Pictures are repeated at the rate of 30 frames/sec. Calculate the average
rate of information conveyed by a TV set to a viewer.
Solution: Total number of pixels in one frame = 525 × 525 = 2,75,625 pixels
Each pixel can have 256 different brightness levels.
Total number of different frames possible = (256)^2,75,625
Let us assume that all these frames occur with equal probability; the maximum
information content per frame is then
H(s)max = log2 M = log2 [(256)^2,75,625] = 2,75,625 × log2 (256) = 22.05 × 10^5 bits/frame
Given rs = 30 frames/sec
Information rate is R = rs H(s)max = (30)(22.05 × 10^5) = 661.5 × 10^5 ≈ 66.15 × 10^6 bits/sec
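The same figures follow from a few lines of arithmetic; the sketch below is only a numerical restatement of the calculation above.

```python
from math import log2

pixels_per_frame = 525 * 525          # 2,75,625 pixels
levels = 256                          # brightness levels per pixel
frames_per_sec = 30

# Maximum information per frame, assuming all frames are equally likely
H_frame = pixels_per_frame * log2(levels)      # bits/frame
R = frames_per_sec * H_frame                   # bits/sec

print(f"H_max = {H_frame:.3e} bits/frame")     # about 2.205e6
print(f"R     = {R:.3e} bits/sec")             # about 6.615e7
```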
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example 4.10
A discrete message source s emits two independent symbols x and y with probabilities
0.55 and 0.45 respectively. Calculate the efficiency of the source and its redundancy.
Solution: px = 0.55, py = 0.45
H(s) = ∑ pi log2 (1/pi) = px log2 (1/px) + py log2 (1/py)
H(s) = 0.55 log2 (1/0.55) + 0.45 log2 (1/0.45) = 0.4744 + 0.5184 = 0.9928 bits/ sym
Hmax = log2 M = log2 (2) = 1 bit/ sym
Source efficiency ηs = H(s)/Hmax = 0.9928 ⇒ ηs = 99.28 %
Source redundancy Rηs = 1 − ηs ⇒ Rηs = 0.72 %
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Extension of a Zero-Memory Source:
The nth extension of a zero-memory (memoryless) source S is the source whose symbols are all blocks of n successive symbols of S. Consider a binary source with symbols s1, s2 and probabilities p1, p2. Its 2nd extension has the four symbols s1s1, s1s2, s2s1, s2s2 with probabilities p1², p1p2, p2p1, p2², and its entropy is
H (s2) = p1² log2 (1/p1²) + p1p2 log2 (1/p1p2) + p2p1 log2 (1/p2p1) + p2² log2 (1/p2²)
= 2 p1² log2 (1/p1) + 2 p1 p2 log2 (1/p1) + 2 p1 p2 log2 (1/ p2) + 2 p2² log2 (1/p2)
= 2 [p1 log2 (1/p1) + p2 log2 (1/p2)]   [since p1 + p2 = 1]
H (s2) = 2 H(s)
Similarly, the 3rd extension of the basic source will have 2³ = 8 symbols:
● s1 s1 s1 occurs with probability p1 p1 p1 = p1³
● s1 s1 s2 occurs with probability p1 p1 p2 = p1² p2
● s1 s2 s1 occurs with probability p1 p2 p1 = p1² p2
● s1 s2 s2 occurs with probability p1 p2 p2 = p1 p2²
● s2 s1 s1 occurs with probability p2 p1 p1 = p1² p2
● s2 s1 s2 occurs with probability p2 p1 p2 = p1 p2²
● s2 s2 s1 occurs with probability p2 p2 p1 = p1 p2²
● s2 s2 s2 occurs with probability p2 p2 p2 = p2³
and, proceeding as above, H (s3) = 3 H(s). In general, H (sn) = n H(s).
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A zero-memory source has source alphabet S = {s1, s2, s3} with probabilities P = {½, ¼, ¼}.
Find the entropy of this source. Also determine the entropy of its 2nd extension and
verify that H (s2) = 2 H(s).
Solution: For the source with 3 symbols,
H(s) = (½) log2 (2) + (¼) log2 (4) + (¼) log2 (4) = 1.5 bits/ sym
The 2nd extension has 3² = 9 symbols s1s1, s1s2, s1s3, s2s1, s2s2, s2s3, s3s1, s3s2, s3s3 with probabilities ¼, ⅛, ⅛, ⅛, 1/16, 1/16, ⅛, 1/16, 1/16 respectively.
The sum of all probabilities of the 2nd-extension symbols is equal to 1.
The entropy of the 2nd extended source is given by
H (s2) = ∑ Pi log2 (1/Pi)
H (s2) = (¼) log2 (1/ (¼)) + 4 [ (⅛) log2 (1/ (⅛))] + 4 [(1/16) log2 (1/(1/16))]
H (s2) = 0.5 + 1.5 + 1 = 3 bits/ sym = 2 H(s), which verifies H (s2) = 2 H(s).
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A source emits four messages m1, m2, m3 and m4 with probabilities p1, p2, p3 and p4 respectively. Find the entropy of the source. List all the elements for the
second extension of this source. Hence show H (s2) = 2 H(s).
Solution: For the basic source, the entropy is given by
H(s) = ∑ pi log2 (1/pi), i = 1 to 4
The second extension has 4² = 16 symbols, listed below with their probabilities P1 to P16, where each Pi is the product of the probabilities of the two constituent messages (e.g., P1 = p1p1, P2 = p1p2, …):
m1m1 (P1)   m2m1 (P5)   m3m1 (P9)    m4m1 (P13)
m1m2 (P2)   m2m2 (P6)   m3m2 (P10)   m4m2 (P14)
m1m3 (P3)   m2m3 (P7)   m3m3 (P11)   m4m3 (P15)
m1m4 (P4)   m2m4 (P8)   m3m4 (P12)   m4m4 (P16)
H (s2) = ∑ Pi log2 (1/Pi), i = 1 to 16
Writing each Pi as a product of two message probabilities and expanding the logarithms, as in the previous example, gives H (s2) = 2 H(s).
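The relation H(s²) = 2 H(s) can also be verified numerically. The short Python sketch below builds the second extension of the three-symbol source of the earlier example by forming all ordered pairs and multiplying probabilities.

```python
from math import log2
from itertools import product

def entropy(probs):
    return sum(p * log2(1 / p) for p in probs if p > 0)

p = [1/2, 1/4, 1/4]                      # basic (zero-memory) source

# Second extension: every ordered pair of symbols, probability = product
p2 = [pi * pj for pi, pj in product(p, repeat=2)]

print(f"H(S)  = {entropy(p):.3f} bits/symbol")     # 1.5
print(f"H(S2) = {entropy(p2):.3f} bits/symbol")    # 3.0 = 2*H(S)
```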
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Source Coding:
An important problem in communications is the efficient representation of the data generated by a discrete source; the device that performs this representation is called a source encoder. In particular, if some source symbols are known to be more probable than others, then we
may exploit this feature in the generation of a source code by assigning short code-words
to frequent source symbols, and long code-words to rare source symbols. We refer to such a
source code as a variable-length code. The Morse code is an example of a variable-length
code. In Morse code, the letters of the alphabet and numerals are encoded into a stream of
marks and spaces, denoted as dots '.' and dashes '−' respectively. Since in the English
language the letter E occurs more frequently than the letter Q, for example, the Morse code
encodes E into a single dot, the shortest code-word, and Q into '− − · −', one of the longest code-words.
Figure: Source encoding. A discrete memoryless source emits symbols Sk, which the source encoder converts into a binary sequence bk.
Let the binary code-word assigned to symbol Sk by the encoder have length lk, measured in bits. The average code-word length of the source encoder is defined as
L̄ = ∑ pk lk (summed over k = 0 to K−1)
In physical terms, L̄ represents the average number of bits per source symbol used in the
source encoding process. Let Lmin denote the minimum possible value of L̄. We then
define the coding efficiency of the source encoder as
η = Lmin / L̄
With L̄ ≥ Lmin, we clearly have η ≤ 1. The source encoder is said to be efficient when η
approaches unity.
Shannon’s First Theorem:
The source-coding theorem, which determines the minimum value Lmin, states the following:
"Given a discrete memoryless source of entropy H(s), the average code-word length L̄ for
any source encoding is bounded as L̄ ≥ H(s)."
The entropy H(s) represents a fundamental limit on the average number of bits per source
symbol L̄ necessary to represent a discrete memoryless source, in that L̄ can be made as
small as, but no smaller than, the entropy H(s).
Thus, with Lmin = H(s), the efficiency of a source encoder may be written in terms of the entropy H(s) as
η = H(s) / L̄
Prefix Coding:
To define the prefix condition, let the code-word assigned to source symbol Sk be
denoted by (mk1, mk2, …, mkn), where the individual elements mk1, mk2, …, mkn are 0's and
1's and n is the code-word length.
The initial part of the code-word is represented by the elements mk1, mk2, …, mki for some
i ≤ n. Any sequence made up of the initial part of the code-word is called a prefix of the
code-word. A prefix code is defined as a code in which no code-word is the prefix of any
other code-word.
Source symbol   Probability of occurrence   Code I   Code II   Code III
S0              0.5                         0        0         0
S1              0.25                        1        10        01
S2              0.125                       00       110       011
S3              0.125                       11       111       0111
1) Code I is not a prefix code since the bit 0, the code-word for S0, is a prefix of 00,
the code-word for S2. Likewise, the bit 1, the code-word for S1, is a prefix of 11,
the code-word for S3.
2) Similarly, the code III is not a prefix code.
3) But Code II is a prefix code.
In order to decode a sequence of code-words, generated from a prefix source code, the
source decoder simply starts at the beginning of the sequence and decodes one code-
word at a time.
Figure: Decision tree for Code II. From the initial state, a 0 leads to the terminal state S0 and a 1 leads to a second decision point; from there, 0 leads to S1 and 1 leads to a third decision point, where 110 terminates in S2 and 111 in S3.
The decoder always starts in initial state. The first received bit moves the decoder to the
terminal state S0 if it is 0, or else to a second decision point if it is 1. In the latter case, the
second bit moves the decoder one step further down the tree, either to terminal state S1 if
it is 0, or else to a third decision point if it is 1, and so on.
Once each terminal state emits its symbol, the decoder is reset to its initial state. Note
also that each bit in the received encoder sequence is examined only once. For example,
the encoder sequence 1011111000 is readily decoded as the source sequence
S1S3S2S0S0… [10→ (S1), 111→ (S3), 110→ (S2), 0→ (S0), 0→ (S0)]
A prefix code has the important property that it is always uniquely decodable.
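The bit-by-bit decoding just described is easy to express in code. The following Python sketch (illustrative only, using Code II from the table above) walks through the received bit stream and emits a symbol whenever a complete code-word has been seen.

```python
# Decoding a prefix (instantaneous) code one bit at a time.
# Code II of the table: S0 -> 0, S1 -> 10, S2 -> 110, S3 -> 111.
code = {"0": "S0", "10": "S1", "110": "S2", "111": "S3"}

def decode(bits):
    symbols, word = [], ""
    for b in bits:
        word += b
        if word in code:          # a complete code-word has been seen
            symbols.append(code[word])
            word = ""             # reset to the initial state of the decision tree
    return symbols

print(decode("1011111000"))        # ['S1', 'S3', 'S2', 'S0', 'S0']
```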
If a prefix code has been constructed for a discrete memoryless source with source
alphabet {S0, S1, …, SK−1} and source statistics {P0, P1, …, PK−1}, and the code-word for
symbol Sk has length lk, k = 0, 1, …, K−1, then the code-word lengths of the code satisfy a
certain inequality known as the Kraft-McMillan inequality.
In mathematical terms, we state that
∑ 2^(−lk) ≤ 1 (summed over k = 0 to K−1)
where the factor 2 refers to the radix (number of symbols) of the binary alphabet.
Conversely, we may state that if the code-word length of a code for a discrete memoryless
source satisfies the Kraft-McMillan inequality, then a prefix code with these code-word
lengths can be constructed.
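As a quick check of the inequality, the sketch below sums 2^(−lk) for the code-word lengths of Codes I, II and III from the table above (the function name is chosen here only for illustration).

```python
def kraft_sum(lengths, radix=2):
    """Left-hand side of the Kraft-McMillan inequality: sum of radix**(-l)."""
    return sum(radix ** (-l) for l in lengths)

# Code-word lengths of Codes I, II and III from the table above
for name, lengths in {"Code I": [1, 1, 2, 2],
                      "Code II": [1, 2, 3, 3],
                      "Code III": [1, 2, 3, 4]}.items():
    s = kraft_sum(lengths)
    verdict = "satisfies" if s <= 1 else "violates"
    print(f"{name}: sum 2^-lk = {s:.3f} -> {verdict} the inequality")
```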
Although all prefix codes are uniquely decodable, the converse is not true. For example,
Code III in the table does not satisfy the prefix condition, and yet it is uniquely decodable
since the bit 0 indicates the beginning of each code-word in the code.
Prefix codes distinguish themselves from other uniquely decodable codes by the fact that
the end of a code-word is always recognizable. Hence, the decoding of a prefix code can
be accomplished as soon as the binary sequence representing a source symbol is fully
received. For this reason, prefix codes are also referred to as instantaneous codes.
Given a discrete memoryless source of entropy H(S), a prefix code can be constructed with an average code-word length L̄ that is bounded as follows:
H(S) ≤ L̄ < H(S) + 1      (1)
The left-hand bound is satisfied with equality when the symbol Sk is emitted with probability
pk = 2^(−lk)
where lk is the length of the code-word assigned to Sk; in that case ∑ 2^(−lk) = ∑ pk = 1.
Under this condition, the Kraft-McMillan inequality implies that we can construct a
prefix code, such that the length of the code-word assigned to source symbol Sk is lk. For
such a code, the average code-word length is
L̄ = ∑ lk / 2^lk
and the corresponding entropy is
H(S) = ∑ (1/2^lk) log2 (2^lk) = ∑ lk / 2^lk
Hence, in this special case, the prefix code is matched to the source in that L̄ = H(S).
Now consider the nth extension of the source, for which the source encoder operates on blocks of n source symbols rather than on individual symbols.
From the entropy of the nth extension we have
H(Sn) = n H(S)      (3)
Let L̄n denote the average code-word length of the extended prefix code. For a uniquely
decodable code, L̄n is the smallest possible.
Applying the bound (1) to the extended source, we deduce that
n H(S) ≤ L̄n < n H(S) + 1, or equivalently H(S) ≤ L̄n / n < H(S) + 1/n      (4)
In the limit, as n approaches infinity, the lower and upper bounds in the above equation
converge, so that
lim (n → ∞) L̄n / n = H(S)
We may therefore state that, by making the order n of an extended prefix source encoder
large enough, we can make the code faithfully represent the discrete memoryless source S
as closely as desired.
Encoding of the Source Output:
Source encoding is the process by which the output of an information source is converted
into a binary sequence. The functional block that performs this task in a communication
system is called source encoder.
The input to the source encoder is the symbol sequence emitted by the information
source. The source encoder assigns variable length binary codewords to blocks of
symbols and produces binary sequences as its output.
If the encoder operates on blocks of N symbols, it produces an average output bit rate of GN
bits/symbol. As the block length N is increased, the average output bit rate per symbol GN
approaches the source entropy H.
Thus, with a large block size the output of the information source can be encoded into a
binary sequence with an average bit rate approaching „R‟, the source information rate.
The performance of the encoder is usually measured in terms of coding efficiency that is
defined as the ratio of the source information rate and the average output bit rate of the
encoder.
Shannon’s Encoding Algorithm:
The following steps indicate Shannon's procedure for generating binary codes:
1) List the source symbols in decreasing order of probabilities
Given S= {S1, S2…SM} with P= {P1, P2… PM}; P1 ≥ P2 ≥ P3 ≥…≥ PM
2) Compute the sequence
α1=0
α2=P1= P1+ α1
α3=P2 + P1=P2 + α2
α4= P3 + P2 + P1=P3 + α3
In general, αi+1 = Pi + αi for i = 1, 2, …, M (so that the last value equals 1).
3) Determine li, the smallest integer such that 2^li ≥ 1/Pi.
4) Expand αi in binary form up to li bits.
5) The code-word for the ith symbol is the li-bit binary expansion of αi (the bits following the binary point).
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example: Apply Shannon's encoding algorithm to the five messages m1, m2, m3, m4 and m5 with probabilities 1/8, 1/16, 3/16, 1/4 and 3/8 respectively, and obtain the code-words and the code efficiency.
Solution:
1) The symbols are arranged in decreasing order of probabilities as
m5 6/16 P1
m4 4/16 P2
m3 3/16 P3
m1 2/16 P4
m2 1/16 P5
2) α1= 0
α2 = P1= 6/16 = 0.375
α3= P2 + P1= 4/16 + 6/16 = 0.625
α4= P3 + P2 + P1 = 3/16 + 4/16 + 6/16 = 0.8125
α5 = P4 + P3 + P2 + P1 = 2/16 + 3/16 + 4/16 + 6/16 = 0.9375
α6 = P5 + P4 + P3 + P2 + P1 = 1/16 + 2/16 + 3/16 + 4/16 + 6/16 = 1
3) The word lengths li are the smallest integers satisfying 2^li ≥ 1/Pi:
i     1/Pi    li
1     2.66    2
2     4       2
3     5.33    3
4     8       3
5     16      4
4) The binary expansions of the αi are
α1 = 0 = (0.00)2, α2 = 0.375 = (0.011)2, α3 = 0.625 = (0.101)2, α4 = 0.8125 = (0.1101)2, α5 = 0.9375 = (0.1111)2
5) Taking the first li bits of each expansion gives the code table:
Symbol   Pi      Code in binits   li
m5       3/8     00               2
m4       1/4     01               2
m3       3/16    101              3
m1       1/8     110              3
m2       1/16    1111             4
By inspecting the code, we can conclude that no code-word is a prefix of any other
code-word; hence this is an instantaneous code satisfying the prefix property.
The average length L̄ can be calculated using
L̄ = ∑ Pi li = (6/16)(2) + (4/16)(2) + (3/16)(3) + (2/16)(3) + (1/16)(4) = 39/16
L̄ = 2.4375 binits/ sym
The entropy of the source is
H(s) = ∑ Pi log2 (1/Pi) = (6/16) log2 (16/6) + (4/16) log2 (4) + (3/16) log2 (16/3) + (2/16) log2 (8) + (1/16) log2 (16)
H(s) = 2.1085 bits/ sym
Code efficiency is calculated as
η = H(s)/L̄ = 2.1085/2.4375 = 0.865 ⇒ η = 86.5 %
Code redundancy Rη = 1 − η = 13.5 %
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example: A source emits three symbols S1, S2 and S3 with probabilities 0.5, 0.3 and 0.2 respectively. Construct a Shannon binary code for (i) the basic source and (ii) its second extension, and find the code efficiency and redundancy in each case.
Solution: i)
1) The symbols are already in decreasing order of probability: P1 = 0.5, P2 = 0.3, P3 = 0.2.
2) α1 = 0, α2 = P1 = 0.5, α3 = P2 + P1 = 0.8
3) The word lengths li satisfy 2^li ≥ 1/Pi:
i     1/Pi    li
1     2       1
2     3.33    2
3     5       3
4) The binary expansions are
(i) α1 = 0
(ii) α2 = (0.5)10 = (0.10)2
(iii) α3 = (0.8)10 = (0.1100…)2
5) The code table now can be constructed as shown
Symbol   Pi     Code in binits   li
S1       0.5    0                1
S2       0.3    10               2
S3       0.2    110              3
Average length L̄ = ∑ Pi li = 0.5(1) + 0.3(2) + 0.2(3) = 1.7 binits/ sym
Entropy H(s) = ∑ Pi log2 (1/Pi) = 0.5 log2 (2) + 0.3 log2 (1/0.3) + 0.2 log2 (5) = 1.485 bits/ sym
Code efficiency η = H(s)/L̄ = 1.485/1.7 = 0.874 ⇒ η = 87.4 %
Code redundancy Rη = 1 − η = 12.6 %
ii) The 2nd extension of the basic source has 3² = 9 symbols:
Symbol        S1S1   S1S2    S2S1    S1S3    S3S1    S2S2    S2S3    S3S2    S3S3
Probability   0.25   0.15    0.15    0.10    0.10    0.09    0.06    0.06    0.04
1) Arranged in decreasing order, these are denoted P1 to P9.
2) α1 = 0, α2 = 0.25, α3 = 0.40, α4 = 0.55, α5 = 0.65, α6 = 0.75, α7 = 0.84, α8 = 0.90, α9 = 0.96
3) The word lengths are
i     1/Pi     li
1     4        2
2     6.67     3
3     6.67     3
4     10       4
5     10       4
6     11.11    4
7     16.67    5
8     16.67    5
9     25       5
4) The binary expansions are
i) α1 = 0
ii) α2 = (0.25)10 = (0.01)2
iii) α3 = (0.4)10 = (0.011..)2
iv) α4 = (0.55)10 = (0.1000..)2
v) α5 = (0.65)10 = (0.1010..)2
vi) α6 = (0.75)10 = (0.11)2
vii) α7 = (0.84)10 = (0.11010..)2
viii) α8 = (0.90)10 = (0.11100..)2
ix) α9 = (0.96)10 = (0.11110..)2
5) The code table is
Symbol   Pi      Code in binits   li
S1S1     0.25    00               2
S1S2     0.15    010              3
S2S1     0.15    011              3
S1S3     0.10    1000             4
S3S1     0.10    1010             4
S2S2     0.09    1100             4
S2S3     0.06    11010            5
S3S2     0.06    11100            5
S3S3     0.04    11110            5
Average length L̄2 = ∑ Pi li = 0.25(2) + 2(0.15)(3) + 2(0.10)(4) + 0.09(4) + 2(0.06)(5) + 0.04(5) = 3.36 binits per extended symbol
The entropy of the 2nd extension is
H(S2) = 2 H(S) = 2.971 bits/sym
Code efficiency η = H(S2)/L̄2 = 2.971/3.36 = 0.884 ⇒ η = 88.4 %
Code redundancy Rη = 1 − η = 11.6 %
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example: Apply Shannon's encoding algorithm to a source emitting the five symbols A, B, C, D and E with probabilities 1/4, 1/8, 1/8, 3/16 and 5/16 respectively, and find the code efficiency.
Solution: 1) Arranged in decreasing order of probability:
Symbol        E      A     D      B     C
Probability   5/16   1/4   3/16   1/8   1/8
              P1     P2    P3     P4    P5
2) α1 = 0
α2 = P1 = 0.3125
α3 = P2 + P1 = 0.5625
α4 = P3 + P2 + P1 = 0.75
α5 = P4 + P3 + P2 + P1 = 0.875
α6 = P5 + P4 + P3 + P2 + P1 = 1
3) The word lengths are calculated using 2^li ≥ 1/Pi:
i     1/Pi    li
1     3.2     2
2     4       2
3     5.33    3
4     8       3
5     8       3
4) The binary expansions are
i) α1 = 0
ii) α2 = (0.3125)10 = (0.0101)2
iii) α3 = (0.5625)10 = (0.1001)2
iv) α4 = (0.75)10 = (0.11)2
v) α5 = (0.875)10 = (0.111)2
5) The code table is
Source symbol   Pi      Code in binits   li
E               5/16    00               2
A               1/4     01               2
D               3/16    100              3
B               1/8     110              3
C               1/8     111              3
Average length L̄ = ∑ Pi li = (5/16)(2) + (4/16)(2) + (3/16)(3) + (2/16)(3) + (2/16)(3) = 39/16 = 2.4375 binits/ sym
Entropy H(s) = ∑ Pi log2 (1/Pi) = 2.227 bits/ sym
Code efficiency η = H(s)/L̄ = 2.227/2.4375 = 0.914 ⇒ η = 91.4 %
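The full procedure (ordering, cumulative probabilities, word lengths and binary expansion) can be captured in a few lines of code. The Python sketch below is an illustrative implementation of the steps listed above; given the sorted probabilities of the last example it reproduces the code-words shown in the table.

```python
from math import ceil, log2

def shannon_code(probs):
    """Shannon's binary encoding procedure sketched above.

    probs: symbol probabilities, assumed already sorted in decreasing order.
    Returns the list of code-words.
    """
    alphas = [0.0]
    for p in probs[:-1]:
        alphas.append(alphas[-1] + p)            # cumulative probabilities alpha_i

    codewords = []
    for p, alpha in zip(probs, alphas):
        l = ceil(log2(1 / p))                    # smallest l with 2**l >= 1/p
        bits, frac = "", alpha
        for _ in range(l):                       # binary expansion of alpha to l bits
            frac *= 2
            bits += "1" if frac >= 1 else "0"
            frac -= int(frac)
        codewords.append(bits)
    return codewords

# Symbols E, A, D, B, C sorted by probability, as in the example above
probs = [5/16, 4/16, 3/16, 2/16, 2/16]
print(shannon_code(probs))    # ['00', '01', '100', '110', '111']
```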
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Shannon-Fano Encoding Algorithm:
This is an improvement over Shannon‟s first algorithm on source coding in the sense that
it offers better coding efficiency compared to Shannon‟s algorithm. The algorithm is as
follows:
Step 1: Arrange the probabilities in non-increasing order.
Step 2: Partition the probabilities into exactly two groups such that the sums of the probabilities
in the two groups are as nearly equal as possible. Assign bit '0' to all elements of the first group and bit '1' to
all elements of the second group.
Step 3: Repeat Step 2 by dividing each group in two subgroups till no further division is
possible.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example 4.17
Consider the following sources: S = (A, B, C, D, E, F).
P = (0.10, 0.15, 0.25, 0.35, 0.08, 0.07)
Find the code-word for the source using Shannon-Fano algorithm. Also find the source
efficiency and redundancy.
Applying the algorithm, the first partition places D and C in one group and the remaining symbols in the other; one valid grouping gives D → 00, C → 01, B → 100, A → 101, E → 110, F → 111.
Average length L̄ = ∑ Pi li = 2.4 binits/ sym
Entropy H(s) = ∑ Pi log2 (1/Pi) = 2.333 bits/ sym
Efficiency η = H(s)/L̄ = 2.333/2.4 = 0.972 ⇒ η = 97.2 %
Redundancy Rη = 1 − η = 2.8 %
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example 4.18
Apply Shannon-Fano algorithm to the following set of messages and obtain code
efficiency and redundancy.
Message       m1     m2      m3      m4    m5
Probability   1/8    1/16    3/16    1/4   3/8
Solution: Arranging the messages in decreasing order of probability and applying the algorithm gives
Symbol   Pi       Code-word   li
m5       0.375    00          2
m4       0.25     01          2
m3       0.1875   10          2
m1       0.125    110         3
m2       0.0625   111         3
Average length L̄ = ∑ Pi li = 0.375(2) + 0.25(2) + 0.1875(2) + 0.125(3) + 0.0625(3) = 2.1875 binits/ sym
Entropy H(s) = ∑ Pi log2 (1/Pi) = 2.1085 bits/ sym
Efficiency η = H(s)/L̄ = 2.1085/2.1875 = 0.964 ⇒ η = 96.4 %
Redundancy Rη = 1 − η = 3.6 %
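The grouping in Step 2 can be automated. The recursive Python sketch below is one illustrative implementation; ties between equally balanced splits are broken toward the later split point so that the result matches the table above (other tie-breaking choices give different code-words but the same average length here).

```python
def shannon_fano(symbols):
    """Recursive Shannon-Fano partitioning.

    symbols: list of (name, probability) pairs sorted in non-increasing order.
    Returns a dict mapping each name to its code-word.
    """
    if len(symbols) == 1:
        return {symbols[0][0]: ""}

    # Find the split that makes the two group probabilities most nearly equal
    total = sum(p for _, p in symbols)
    running, best_i, best_diff = 0.0, 1, float("inf")
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(2 * running - total)
        if diff <= best_diff:          # <= breaks ties toward the later split
            best_i, best_diff = i, diff

    code = {}
    for name, word in shannon_fano(symbols[:best_i]).items():
        code[name] = "0" + word        # first group gets bit 0
    for name, word in shannon_fano(symbols[best_i:]).items():
        code[name] = "1" + word        # second group gets bit 1
    return code

# Messages of Example 4.18
msgs = [("m5", 3/8), ("m4", 1/4), ("m3", 3/16), ("m1", 1/8), ("m2", 1/16)]
print(shannon_fano(msgs))
# {'m5': '00', 'm4': '01', 'm3': '10', 'm1': '110', 'm2': '111'}
```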
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Huffman Code (Compact Codes):
Huffman codes achieve the minimum average code-word length among all uniquely decodable
codes for a given source; hence they are called compact codes. Huffman codes are computationally
simple and are widely used in modern communication systems.
The algorithm for encoding the information into an r-ary code-word is as follows:
1) Arrange the symbols in non-increasing order of their probabilities.
2) Check that the number of symbols N satisfies N = r + k(r − 1) for some integer k; if it does not, add dummy symbols of zero probability until it does. (For the binary case r = 2, this condition is always satisfied.)
3) Combine the r least probable symbols into a single composite symbol whose probability is the sum of the r probabilities, and re-order the reduced set.
4) Repeat step 3 until only r symbols remain.
5) Assign the r code digits 0, 1, …, r − 1 to the final r symbols and trace back through the reductions, appending a digit at each stage, to obtain the code-word for every source symbol.
For each resulting code the average length, entropy, efficiency and redundancy are computed as before:
L̄ = ∑ pk lk,  H(s) = ∑ pk log2 (1/pk),  η = H(s)/L̄,  Rη = 1 − η
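For the common binary case (r = 2), the repeated-combination procedure is conveniently implemented with a priority queue. The sketch below is an illustrative implementation applied to the five-message source of the Shannon-Fano example above; the individual code-words it produces depend on how ties are broken, but the average length is always the minimum (Huffman) value.

```python
import heapq
from math import log2

def huffman_code(probs):
    """Binary Huffman procedure: repeatedly combine the two least probable entries."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial code-word})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)            # least probable entry
        p2, _, c2 = heapq.heappop(heap)            # next least probable entry
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"m5": 3/8, "m4": 1/4, "m3": 3/16, "m1": 1/8, "m2": 1/16}
code = huffman_code(probs)
avg = sum(probs[s] * len(w) for s, w in code.items())
H = sum(p * log2(1 / p) for p in probs.values())

print(code)                     # individual code-words depend on tie-breaking
print(f"average length = {avg:.4f} binits/symbol")   # 2.1875
print(f"efficiency     = {H / avg:.3f}")             # about 0.964
```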
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example 4.20
Construct a quaternary Huffman code for the following set of message symbols with the
respective probabilities.
Step 2: For a quaternary code (r = 4) the number of symbols N must satisfy N = r + n(r − 1) = 4 + 3n for integer n. With the given 8 symbols, n = (N − r)/(r − 1) = 4/3 is not an integer; the next minimum value of N for which n is an integer is N = 10. Therefore, add two dummy symbols D1 and D2 with zero probability, for which n = 2.
Symbol   Probability   Code-word
E        0.10          01
F        0.08          02
H        0.02          031
D1       0             032
D2       0             033
The average length L̄ = ∑ pk lk, entropy H(s), efficiency η = H(s)/L̄ and redundancy Rη = 1 − η then follow as in the earlier examples; the dummy symbols, having zero probability, contribute nothing to L̄ or H(s).
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Extended Huffman Coding:
Huffman coding offers better coding efficiency than the other coding techniques discussed above.
However, it can fail to achieve good efficiency under certain circumstances, for example when one symbol probability is much larger than the others.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example 4.21
Consider a source with three symbols S = (A, B, C) with respective probabilities P = (0.8,
0.15, 0.15). Find the efficiency using extended Huffman code.
Solution: Using normal Huffman coding, the codewords can be constructed as shown
Symbol Prob Code-word Length
A 0.8 0 1
B 0.15 10 2
C 0.15 11 2
Average length L̄ = ∑ pk lk
Entropy H(s) = ∑ pk log2 (1/pk)
Efficiency η = H(s) / L̄
Although this efficiency is good as compared to other coding techniques, it could still be
improved if we consider coding the symbols considering „m‟ symbols at a time. More is
the value of „m‟, better will be the efficiency. This can be shown by considering the
second extension of the source.
Average length L̄2 = ∑ P(σi) li, computed over the 9 extended symbols σi
Entropy H(s2) = 2 H(s)
Efficiency η2 = H(s2) / L̄2
⇒ η2 > η,
which is an improvement over the earlier case. Thus the symbols can be represented using
lesser number of bits using Huffman coding if we consider a group of symbols at a time
for encoding. This process of encoding „m‟ symbols at a time is known as extended
Huffman Coding.
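The benefit of extension can be demonstrated numerically. The sketch below assumes an illustrative three-symbol source with probabilities 0.8, 0.15 and 0.05 (values chosen here only for demonstration), codes it symbol by symbol and then in pairs, and compares the two efficiencies.

```python
import heapq
from math import log2
from itertools import product

def huffman_code(probs):
    """Binary Huffman code for a dict {symbol: probability}."""
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

def avg_length(code, probs):
    return sum(probs[s] * len(w) for s, w in code.items())

def entropy(probs):
    return sum(p * log2(1 / p) for p in probs.values() if p > 0)

base = {"A": 0.8, "B": 0.15, "C": 0.05}          # assumed probabilities

# Single-symbol Huffman code
c1 = huffman_code(base)
print(f"eta_1 = {entropy(base) / avg_length(c1, base):.3f}")   # about 0.737

# Second extension: encode pairs of symbols (probabilities multiply)
ext = {a + b: pa * pb for (a, pa), (b, pb) in product(base.items(), repeat=2)}
c2 = huffman_code(ext)
print(f"eta_2 = {entropy(ext) / avg_length(c2, ext):.3f}")     # about 0.947
```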
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Example 4.22
Consider a discrete memoryless source with S = (X, Y, Z) with the state probabilities
P = (0.7, 0.15, 0.15) for its output.
a) Apply Huffman encoding algorithm to find the code-words in binary. Find the source
efficiency and redundancy.
b) Consider the second order extension of the source. Compute the code-word for this
extended source and also find its efficiency.
Solution: a) Step 1: Applying the binary Huffman procedure gives
Symbol   Probability   Code-word   Length
X        0.7           0           1
Y        0.15          10          2
Z        0.15          11          2
Average length L̄ = ∑ pk lk = 0.7(1) + 0.15(2) + 0.15(2) = 1.3 binits/ sym
Entropy H(s) = ∑ pk log2 (1/pk) = 0.7 log2 (1/0.7) + 2 × 0.15 log2 (1/0.15) = 1.181 bits/ sym
Efficiency η = H(s)/L̄ = 1.181/1.3
⇒ η = 0.9087 = 90.87 %
Redundancy Rη = 1 − η ⇒ Rη = 9.13 %
b) Now consider the second order extension of the source. The extension will have 3² = 9
combinations:
Symbol        XX     XY      XZ      YX      YY       YZ       ZX      ZY       ZZ
Probability   0.49   0.105   0.105   0.105   0.0225   0.0225   0.105   0.0225   0.0225
Step 1: Applying the Huffman reduction procedure to the nine extended symbols gives the code table
Symbol   Probability   Code-word   Length
XX       0.49          1           1
XY       0.105         001         3
XZ       0.105         010         3
YX       0.105         011         3
ZX       0.105         0000        4
YY       0.0225        000100      6
ZZ       0.0225        000101      6
YZ       0.0225        000110      6
ZY       0.0225        000111      6
Average length L̄2 = ∑ P(σi) li = 0.49(1) + 3(0.105)(3) + 0.105(4) + 4(0.0225)(6) = 2.395 binits per two symbols, i.e., 1.1975 binits/ sym
Entropy H(S2) = 2 H(S) = 2.3626 bits per two symbols
Efficiency η2 = H(S2)/L̄2 = 2.3626/2.395 = 0.9865 ⇒ η2 = 98.65 %, a clear improvement over the 90.87 % obtained with single-symbol encoding.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Figure 4.4: Error control coding; the channel bit error probability is qc = P{dk ≠ d̂k} and the
message bit error probability is Pe = P{bk ≠ b̂k}.
Figure 4.4 shows a digital communication system transmitting the binary output
{bk} of a source encoder over a noisy channel at a rate of rb bits/sec. Due to channel
noise, the bit stream {b̂k} recovered by the receiver differs from the transmitted sequence
{bk}. It is desired that the probability of error P{bk ≠ b̂k} be less than some prescribed
value.