Notes Shannon
Antonio Bonafonte
\[ H \le R \]
\[ R < H + 1 \]
To prove this, we assume the following two results, which are proved below:
1. Kraft inequality:
\[ \sum_{i=1}^{M} 2^{-l_i} \le 1 \]
2.
\[ H = -\sum_{i=1}^{M} p_i \log_2 p_i \le -\sum_{i=1}^{M} p_i \log_2 q_i \qquad \forall q_i : q_i > 0,\ \sum_{i=1}^{M} q_i = 1 \]
Obviously, this is also true if $\sum_i q_i < 1$.
Equality holds only if $q_i = p_i$.
For any uniquely decodable code with lengths $l_i$, we define $q_i = 2^{-l_i}$. Then $\sum_i q_i \le 1$, so
\[ H \le -\sum_{i=1}^{M} p_i \log_2 q_i \]
But the right-hand term is the rate:
\[ -\sum_{i=1}^{M} p_i \log_2 q_i = -\sum_{i=1}^{M} p_i \log_2 2^{-l_i} = \sum_{i=1}^{M} p_i \cdot l_i = R \]
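The lengths $l_i = -\log_2 p_i + \epsilon_i$, with $\epsilon_i \ge 0$, satisfy the Kraft inequality:
\[ \sum_{i=1}^{M} 2^{-l_i} = \sum_{i=1}^{M} 2^{\log_2 p_i - \epsilon_i} = \sum_{i=1}^{M} p_i \frac{1}{2^{\epsilon_i}} \le 1 \]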
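The bound can be checked numerically. The following is a minimal sketch, assuming that $\epsilon_i$ is chosen as the smallest non-negative value that makes $l_i$ an integer, that is, $l_i = \lceil -\log_2 p_i \rceil$ (this choice is an assumption, not stated explicitly in these notes). The probabilities are those of the five-symbol example used in these notes; the function names are illustrative.

```python
import math

# Probabilities of the five-symbol example in these notes (a, b, c, d, e).
p = [0.03, 0.38, 0.51, 0.07, 0.01]

def shannon_lengths(probs):
    """Assumed choice: l_i = ceil(-log2 p_i), so 0 <= eps_i < 1."""
    return [math.ceil(-math.log2(pi)) for pi in probs]

def entropy(probs):
    """Entropy in bits: H = -sum p_i log2 p_i."""
    return -sum(pi * math.log2(pi) for pi in probs)

lengths = shannon_lengths(p)
kraft_sum = sum(2.0 ** -l for l in lengths)       # must not exceed 1
H = entropy(p)
R = sum(pi * li for pi, li in zip(p, lengths))    # rate: sum p_i * l_i

print("lengths:", lengths)
print("Kraft sum:", kraft_sum)
print("H <= R < H + 1:", H <= R < H + 1)
```

With this choice each $\epsilon_i < 1$, so $R = H + \sum_i p_i \epsilon_i < H + 1$, which is what the last line checks.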
[Figure: binary code tree with leaves a (0000), e (0001), d (001), b (01) and c (1)]
$a_i$   $p_i$   $l_i$   code
a       0.03    4       0000
b       0.38    2       01
c       0.51    1       1
d       0.07    3       001
e       0.01    4       0001

$H = 1.51$, $R = 1.64$
In this case the redundancy decreases to ρ = 0.13.
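The figures in the table can be reproduced with a few lines of code. This is a minimal sketch, assuming the redundancy is defined as $\rho = R - H$ (which is consistent with the values quoted in these notes); the variable names are illustrative.

```python
import math

# Table values from the notes: symbol -> (probability, codeword).
code = {
    "a": (0.03, "0000"),
    "b": (0.38, "01"),
    "c": (0.51, "1"),
    "d": (0.07, "001"),
    "e": (0.01, "0001"),
}

# Entropy H = -sum p_i log2 p_i, rate R = sum p_i l_i, redundancy rho = R - H.
H = -sum(p * math.log2(p) for p, _ in code.values())
R = sum(p * len(word) for p, word in code.values())
rho = R - H

print(f"H = {H:.2f}, R = {R:.2f}, rho = {rho:.2f}")
# prints: H = 1.51, R = 1.64, rho = 0.13
```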
Exercise: exactly the same as the previous one, but with a larger alphabet.
Repeat with the alphabet $A = \{a, b, c, d, e, f, g, h\}$ and probabilities
(0.32, 0.25, 0.15, 0.13, 0.07, 0.05, 0.02, 0.01)
[Figure: binary code tree with leaves a to h, in which some nodes have only one child]
ρ = 0.22
Note that a better prefix code can be obtained simply by removing the nodes with only one
child. In this particular case, it can be shown that the rate is the same as the rate
of an optimal code (Huffman).
[Figure: the same code tree after removing the nodes with only one child]
$a_i$   $p_i$   $l_i$   code
a       0.32    2       11
b       0.25    2       10
c       0.15    3       011
d       0.13    3       010
e       0.07    3       001
f       0.05    4       0001
g       0.02    5       00001
h       0.01    5       00000

$H = 2.48$, $R = 2.54$
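The claim that this rate matches the rate of a Huffman code can be checked numerically. The sketch below uses a textbook Huffman construction built on Python's `heapq` module; it is an illustration written for this check, not code from the notes, and the helper names `huffman_lengths` and `rate` are hypothetical. Only codeword lengths are computed, since the rate depends only on them.

```python
import heapq

# Probabilities of a..h and the lengths from the table in the notes.
p = [0.32, 0.25, 0.15, 0.13, 0.07, 0.05, 0.02, 0.01]
table_lengths = [2, 2, 3, 3, 3, 4, 5, 5]

def huffman_lengths(probs):
    """Return Huffman codeword lengths by repeatedly merging the two lightest subtrees."""
    # Heap entries: (weight, unique tie-breaker, symbol indices in the subtree).
    heap = [(w, i, [i]) for i, w in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, counter, s1 + s2))
        counter += 1
    return lengths

def rate(probs, lengths):
    """Rate R = sum p_i * l_i."""
    return sum(pi * li for pi, li in zip(probs, lengths))

huffman_rate = rate(p, huffman_lengths(p))
table_rate = rate(p, table_lengths)
print(f"Huffman rate = {huffman_rate:.2f}, table rate = {table_rate:.2f}")
```

Because two subtrees have the same weight 0.15 at one step, the Huffman lengths may differ from the table depending on tie-breaking, but the rate is the same in either case.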
Now, let's prove the two equations which were taken for granted. The first one:
\[ H = -\sum_{i=1}^{M} p_i \log_2 p_i \le -\sum_{i=1}^{M} p_i \log_2 q_i \qquad \forall q_i : q_i > 0,\ \sum_{i=1}^{M} q_i = 1 \]
The proof uses the inequality $\ln x \le x - 1$.
Then,
\[ \sum_{i=1}^{M} p_i \ln \frac{q_i}{p_i} \le \sum_{i=1}^{M} p_i \left( \frac{q_i}{p_i} - 1 \right) = \underbrace{\sum_{i=1}^{M} q_i}_{1} - \underbrace{\sum_{i=1}^{M} p_i}_{1} = 0 \]
\[ \sum_{i=1}^{M} p_i \log_2 q_i - \sum_{i=1}^{M} p_i \log_2 p_i \le 0 \quad \Rightarrow \quad H \le -\sum_{i=1}^{M} p_i \log_2 q_i \]
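The inequality can also be checked empirically. The following is a small sketch that draws random distributions $q$ and verifies that $-\sum_i p_i \log_2 q_i$ never falls below $H$. The probabilities $p$ are those of the five-symbol example in these notes; the random $q$ values, the seed and the function names are arbitrary choices made for illustration.

```python
import math
import random

def entropy(p):
    """H = -sum p_i log2 p_i."""
    return -sum(pi * math.log2(pi) for pi in p)

def cross_entropy(p, q):
    """-sum p_i log2 q_i, the right-hand side of the inequality."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q))

def random_distribution(m, rng):
    """Random strictly positive q_i that sum to 1."""
    w = [rng.random() + 1e-9 for _ in range(m)]
    total = sum(w)
    return [x / total for x in w]

rng = random.Random(0)                 # fixed seed so the run is repeatable
p = [0.03, 0.38, 0.51, 0.07, 0.01]
H = entropy(p)

# Gap between the two sides for 1000 random choices of q; it should never be negative.
gaps = [cross_entropy(p, random_distribution(len(p), rng)) - H for _ in range(1000)]
print("smallest gap:", min(gaps))
print("gap for q = p:", cross_entropy(p, p) - H)
```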
One side consequence: given an ergodic source with unknown probabilities $p_i$, let
$q_i$ be an estimate of those probabilities. The entropy of this estimate over the test set
$(x_1 \ldots)$, $H_Q$, is:
\[ H_Q = \lim_{N \to \infty} -\frac{1}{N} \sum_{n=1}^{N} \log_2 q_{x_n} = -\sum_{i=1}^{M} p_i \log_2 q_i \ge H \]
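This limit can be illustrated by simulation. The sketch below assumes a memoryless source (independent draws, a simple special case of an ergodic source) and a hypothetical estimate `q` chosen only for illustration. It compares the average of $-\log_2 q_{x_n}$ over a long test sequence with the closed-form sum.

```python
import math
import random

p = [0.03, 0.38, 0.51, 0.07, 0.01]   # true probabilities (five-symbol example)
q = [0.05, 0.35, 0.45, 0.10, 0.05]   # hypothetical estimate, sums to 1

rng = random.Random(0)               # fixed seed so the run is repeatable
N = 200_000
# Test set x_1 .. x_N drawn independently from the true source.
samples = rng.choices(range(len(p)), weights=p, k=N)

# Empirical value: -(1/N) * sum_n log2 q_{x_n}
H_Q_empirical = -sum(math.log2(q[x]) for x in samples) / N
# Closed form: -sum_i p_i log2 q_i
H_Q_exact = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))
# Source entropy, for comparison.
H = -sum(pi * math.log2(pi) for pi in p)

print(f"empirical = {H_Q_empirical:.3f}, exact = {H_Q_exact:.3f}, H = {H:.3f}")
```

The closer the estimate `q` is to `p`, the closer $H_Q$ gets to $H$, which is why this quantity is useful for comparing estimates on held-out data.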
\[ H(X) \le \log_2 M \]
If we select $q_i = \frac{1}{M}$, then,
\[ H \le -\sum_{i=1}^{M} p_i \log_2 \frac{1}{M} = \log_2 M \]
The second one, the Kraft inequality:
\[ \sum_{i=1}^{M} 2^{-l_i} \le 1 \]
[Figure: binary tree with two leaves, a (0) and b (1)]
In the figure there are two symbols of length 1, and $\sum_{i=1}^{2} 2^{-1} = 1$. It is not possible to select lengths
smaller than that. If we had selected longer codes for any symbol, the sum would be smaller
than 1.
We can add a new symbol if we replace a codeword of length $l_i$, whose contribution to the sum is
$2^{-l_i}$, with two codewords of lengths $l_j, l_k \ge l_i + 1$, whose contribution is
$2^{-l_j} + 2^{-l_k} \le \frac{2^{-l_i}}{2} + \frac{2^{-l_i}}{2} = 2^{-l_i}$.
In fact, it makes no sense to choose $l_j, l_k \ne l_i + 1$, as it increases the length of those codewords
with no benefit.
In the next figure, the lengths $l_i$ are (1, 2, 4, 4, 3) and the sum is 1.
[Figure: binary code tree with leaves a (0), b (10), c (1100), d (1101) and e (111)]
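The leaf-splitting argument can be followed step by step in code. This is a minimal sketch using exact fractions: it starts from the two-leaf tree and repeatedly replaces one leaf of length $l$ with two leaves of length $l + 1$, checking that the Kraft sum stays at 1. The function names and the order of the splits are illustrative choices that happen to reproduce the lengths (1, 2, 4, 4, 3).

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Exact value of sum 2^(-l_i)."""
    return sum(Fraction(1, 2 ** l) for l in lengths)

def split_leaf(lengths, index):
    """Replace the leaf at `index` (length l) by two leaves of length l + 1."""
    l = lengths[index]
    return lengths[:index] + [l + 1, l + 1] + lengths[index + 1:]

lengths = [1, 1]            # the two-leaf tree: a = 0, b = 1
history = [lengths]
for index in (1, 2, 2):     # which leaf to split at each step
    lengths = split_leaf(lengths, index)
    history.append(lengths)

for step in history:
    print(step, kraft_sum(step))
```

Each split removes a contribution $2^{-l}$ and adds $2 \cdot 2^{-(l+1)} = 2^{-l}$, so the sum is unchanged.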
Furthermore, given a set of lengths that satisfies the Kraft inequality, we can easily build
a prefix code.
Example: suppose the alphabet $A = \{a, b, c, d, e\}$ and a uniquely decodable code:

$a_i$   code   $l_i$   $2^{-l_i}$
a       11     2       1/4
b       10     2       1/4
c       01     2       1/4
d       100    3       1/8
e       1000   4       1/16
[Figure: step-by-step construction of the prefix-code tree, adding the leaves a, b, c, d and e in turn]
Prefix code:

$a_i$   $l_i$   code
a       2       00
b       2       01
c       2       10
d       3       110
e       4       1110
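The construction above can be sketched in code. The following is one possible way to do it, assuming the codewords are assigned in order of increasing length by counting in binary (a canonical-code style assignment, which is not necessarily the exact procedure intended in the notes). The function name `prefix_code_from_lengths` is hypothetical.

```python
from fractions import Fraction

def prefix_code_from_lengths(lengths):
    """Build prefix-free codewords for lengths that satisfy the Kraft inequality."""
    if sum(Fraction(1, 2 ** l) for l in lengths) > 1:
        raise ValueError("lengths violate the Kraft inequality")
    # Process symbols from the shortest to the longest codeword.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codes = [None] * len(lengths)
    value = 0
    prev_len = lengths[order[0]]
    for k, i in enumerate(order):
        cur_len = lengths[i]
        if k > 0:
            # Take the next free node, then descend to the required depth.
            value = (value + 1) << (cur_len - prev_len)
        codes[i] = format(value, f"0{cur_len}b")
        prev_len = cur_len
    return codes

symbols = ["a", "b", "c", "d", "e"]
lengths = [2, 2, 2, 3, 4]
codes = prefix_code_from_lengths(lengths)
for s, l, c in zip(symbols, lengths, codes):
    print(s, l, c)
```

For the lengths in the example, this assignment yields the same codewords as the table: 00, 01, 10, 110, 1110.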