
Fundamental Limits of Source Codes and Optimal Codes

EE4740/SC42160 Data Compression: Entropy & Sparsity Perspectives

Delft University of Technology, The Netherlands


Recap and Summary

Last lecture:
● Non-singular, uniquely decodable, and prefix codes
● Kraft inequality and bounds on the optimal code length

Today:
● Constructions of prefix codes: Shannon, Huffman, Arithmetic
● Optimality of Huffman code
Reference: Cover and Thomas, Chapters 5 and 13 (particularly, the parts
presented on the slides)

2 / 38
Recap

● Information produced by a discrete information source is represented using the
alphabet X = {x1, x2, . . . , xm}

● Source code: A source code C for a random variable X is a mapping from X,
the range of X, to D∗, the set of finite-length strings of symbols from a D-ary
alphabet.

● The expected length L(C) of a source code C(x) for a random variable X with
probability mass function p(x) is given by

L(C) = ∑_{x∈X} p(x) l(x),

where l(x) is the length of the codeword associated with x.
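A quick numerical illustration of this definition (the probabilities and code table below are our own toy example, not from the slides):

# Expected length L(C) = sum over x of p(x) * l(x)
p = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}          # hypothetical pmf
code = {"A": "0", "B": "10", "C": "110", "D": "111"}  # a prefix code for it

L = sum(p[x] * len(code[x]) for x in p)
print(L)  # 0.4*1 + 0.3*2 + 0.2*3 + 0.1*3 = 1.9 bits/symbol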

3 / 38
Recap: Types of Codes

A code is nonsingular if every element of X maps into a different string in D∗

A code is called uniquely decodable if its n-th extension is nonsingular for every n

A code is called a prefix code or an instantaneous code if no codeword is a prefix
of any other codeword

4 / 38
Recap: Optimal Prefix Code

Theorem (Kraft inequality)


For any prefix code over an alphabet of size D, the codeword lengths l1 , l2 , . . . , lm
must satisfy the inequality
∑_{i=1}^{m} D^{-li} ≤ 1.

Conversely, given a set of codeword lengths that satisfy this inequality, there exists a
prefix code with these word lengths.

Prefix code with the minimum expected length

min_{l1, l2, ..., lm} ∑_i pi li   such that   ∑_i D^{-li} ≤ 1

⟹ li∗ = − logD pi (non-integer choice)

This non-integer choice yields the expected codeword length HD(X)
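A small sketch (the helper names are ours) that checks the Kraft inequality for a set of lengths and computes the non-integer optimal lengths:

import math

def kraft_sum(lengths, D=2):
    # A prefix code with these codeword lengths exists iff this sum is <= 1.
    return sum(D ** (-l) for l in lengths)

def ideal_lengths(probs, D=2):
    # Non-integer optimal lengths l_i* = -log_D p_i; they achieve L = H_D(X).
    return [-math.log(p, D) for p in probs]

print(kraft_sum([1, 2, 3, 3]))           # 1.0 -> Kraft holds with equality
print(ideal_lengths([0.5, 0.25, 0.25]))  # [1.0, 2.0, 2.0]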

5 / 38
Optimal Code

● If − log pi is not an integer, we use li = ⌈− log pi ⌉, which satisfies the Kraft
inequality
● Because the ceiling adds at most 1 bit of overhead, the optimal prefix code satisfies

HD(X) ≤ L < HD(X) + 1

● McMillan's theorem says that any uniquely decodable code satisfies the Kraft
inequality, implying there is no better choice than a prefix code

Today:
● Constructions of prefix codes: Shannon, Huffman, Arithmetic
● Optimality of Huffman code

6 / 38
Representing Prefix-free Code
Symbol   Codeword c
A        0
B        10
C        110
D        1110
E        1111

● Tree representation: Codewords correspond to the leaf nodes
[Figure: binary code tree; edges labeled 0/1 from the root, with leaves A (0), B (10), C (110), D (1110), E (1111)]
● Interval representation: Codewords correspond to intervals, which are
non-intersecting subsets of [0, 1)
● Given a codeword c = (c1, ..., cl) ∈ {0, 1}^l, the interval is

Ic = [bc , bc + 2^{-l}) where bc = 0.c1 c2 . . . cl ∈ [0, 1)

7 / 38
Interval Representation
Given a codeword c = (c1, ..., cl) ∈ {0, 1}^l, the interval is

Ic = [bc , bc + 2^{-l}) where bc = 0.c1 c2 . . . cl ∈ [0, 1)

Symbol   Codeword c   Number bc   Interval
A        0            0           [0, 0.5)
B        10           0.5         [0.5, 0.75)
C        110          0.75        [0.75, 0.875)
D        1110         0.875       [0.875, 0.9375)
E        1111         0.9375      [0.9375, 1)


● Why non-intersecting?: For any b ∈ Ic , we have

bc ≤ b < bc + 2^{-l} = 0.c1 c2 . . . cl 111 . . . ,

i.e., the binary expansion of b starts with c1 c2 . . . cl .

The interval Ic consists of all b ∈ [0, 1) whose binary expansion starts with bc . Since in
a prefix code no codeword is a prefix of another, the intervals are non-intersecting.
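A minimal sketch of this mapping (the function is ours): each binary codeword is mapped to bc and the dyadic interval [bc , bc + 2^{-l}):

def codeword_interval(c):
    # c is a binary string c1 c2 ... cl; bc = 0.c1 c2 ... cl in binary.
    l = len(c)
    b = int(c, 2) / 2 ** l
    return b, b + 2 ** (-l)

for c in ["0", "10", "110", "1110", "1111"]:
    print(c, codeword_interval(c))
# 0 -> (0.0, 0.5), 10 -> (0.5, 0.75), 110 -> (0.75, 0.875),
# 1110 -> (0.875, 0.9375), 1111 -> (0.9375, 1.0)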

8 / 38
Construction 1: Shannon Code

● Shannon used an interval-based construction, arranging the symbols in order
of decreasing probability and considering the cumulative probability

Fi = ∑_{j=1}^{i−1} pj

Symbol pi Fi li
A 4/10 0 2
B 3/10 4/10 2
C 2/10 7/10 3
D 1/10 9/10 4

● Expand Fi in binary and keep its first li bits, where

− log pi ≤ li < − log pi + 1

● The codeword for Fi differs from all succeeding ones in one or more of its li places,
since the succeeding cumulative probabilities are at least 2^{-li} larger
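A sketch of this construction (the helper is ours; for exact arithmetic one could use fractions.Fraction instead of floats):

import math

def shannon_code(probs):
    # probs: dict symbol -> probability. Returns dict symbol -> binary codeword.
    code, F = {}, 0.0
    # Arrange the symbols in order of decreasing probability.
    for sym, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        l = math.ceil(-math.log2(p))       # -log p_i <= l_i < -log p_i + 1
        bits, frac = "", F                 # first l_i bits of the expansion of F_i
        for _ in range(l):
            frac *= 2
            bits += str(int(frac))
            frac -= int(frac)
        code[sym] = bits
        F += p                             # cumulative probability for the next symbol
    return code

# Same source as in the table above.
print(shannon_code({"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}))
# {'A': '00', 'B': '01', 'C': '101', 'D': '1110'}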

9 / 38
Decimal Fractions to Binary: Algorithm

1 Multiply the fraction by 2, while noting down the resulting integer and fraction
parts of the product

2 Keep multiplying each successive resulting fraction by 2 until you get a resulting
fraction product of zero

3 Now, write all the integer parts of the product in each step

For example, 0.625 is 0.101 in binary:

0.625 × 2 = 1 + 0.25 (bit 1),   0.25 × 2 = 0 + 0.5 (bit 0),   0.5 × 2 = 1 + 0 (bit 1)
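The same procedure as a small function (ours), truncated to a maximum number of bits:

def frac_to_binary(x, max_bits=16):
    # Binary expansion of x in [0, 1) by repeated multiplication by 2.
    bits = ""
    while x > 0 and len(bits) < max_bits:
        x *= 2
        bits += "1" if x >= 1 else "0"
        x -= int(x)
    return bits or "0"

print(frac_to_binary(0.625))  # '101'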

10 / 38
Shannon Code

Symbol   pi     Fi     li   Code
AA       4/10   0      2    00
AB       3/10   4/10   2    01
BA       2/10   7/10   3    101
BB       1/10   9/10   4    1110

H(X) = 1.846
L2 = 2.4

11 / 38
Shannon Code: Tree Representation

● Compute
− log pi ≤ li < − log pi + 1

● Start with a complete binary tree of depth maxi li


● Place the symbol with the shortest length on a node at its corresponding length,
discard its children, move to the next symbol, and repeat this process
sequentially
● Both constructions lead to the same code if we select the available node
with the smallest binary value; a different rule for selecting the next available
node will lead to a different code (with the same lengths).

12 / 38
Shannon Code: Some Observations

● Codeword lengths are chosen as li = ⌈− log pi ⌉

● Shannon code is asymptotically optimal as n → ∞

● Shannon code may be much worse than the optimal code. For example when
p1 = 0.99 and p2 = 0.01 since ⌈− log p2 ⌉ = 7

13 / 38
Example: Shannon Code

Symbol   pi     Fi     ⌈− log p(x)⌉   Code
A        0.55   0      1              0
B        0.25   0.55   2              10
C        0.1    0.8    4              1100
D        0.1    0.9    4              1110

[Figure: code tree with leaves A (0), B (10), C (1100), D (1110)]

What is the optimal (shortest expected length) prefix code? Huffman code!

14 / 38
Huffman Code

● An optimal (shortest expected length) prefix code for a given distribution can be
constructed by a simple algorithm discovered by Huffman

● Huffman coding arranges the symbols in order of decreasing probability and
joins the two least probable symbols together

● The new set of symbols is reordered, and the two least probable symbols are joined
again. This process repeats until only two symbols are left

● Working backward, a 0 or 1 is prepended to the codewords of the two merged groups
(see the sketch below)
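A compact sketch of this procedure using a binary heap (the helper and its tie-breaking are ours; different tie-breaking gives different, but equally optimal, codes):

import heapq
import itertools

def huffman_code(probs):
    # probs: dict symbol -> probability. Returns dict symbol -> binary codeword.
    counter = itertools.count()   # tie-breaker so the heap never compares dicts
    heap = [(p, next(counter), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # the two least probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}        # prepend 0 to one group
        merged.update({s: "1" + w for s, w in c2.items()})  # and 1 to the other
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

print(huffman_code({"A": 0.25, "B": 0.5, "C": 0.125, "D": 0.125}))
# {'B': '0', 'A': '10', 'C': '110', 'D': '111'} with this tie-breaking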

15 / 38
Example: Huffman Code

16 / 38
Some Examples

1 p(x) = {0.25, 0.5, 0.125, 0.125}


● Shannon code = {10, 0, 110, 111}
● Huffman code = {10, 0, 110, 111}

2 p(x) = {0.25, 0.25, 0.2, 0.15, 0.15}


● Shannon code = {00,01,100,101,110}
● Huffman code = {10, 00, 01, 110, 111}

17 / 38
Huffman Code on English Text

Image courtesy: Twitter, Simon Pampena and Stanford data compression course notes

18 / 38
Practical prefix-free coding

● Huffman codes are actually used in practice due to their optimality and easy
construction

● Examples:
● http/2 header compression
● ZLIB, DEFLATE, GZIP, PNG compression
● JPEG compression

● After multiple (domain-dependent) transformations, the data needs to be
losslessly compressed; this is typically where Huffman codes are used

19 / 38
Non-unique Optimal Codes

● Consider a random variable X with a distribution (1/3, 1/3, 1/4, 1/12)

● The Huffman coding procedure results in codeword lengths of

(2, 2, 2, 2) OR (1, 2, 3, 3)

● Both these codes achieve the same expected codeword length of 2

● In the second code, the third symbol has length 3, which is greater than
⌈−log 1/4⌉ = 2

20 / 38
Optimality of Codes

Theorem (Necessary conditions for Optimality)


For any distribution, there exists an optimal prefix code (with minimum expected
length) that satisfies the following properties:
1 The lengths are ordered inversely with the probabilities (i.e., if pj > pk , then
lj ≤ lk ).
2 The two longest codewords have the same length.

● A code C is optimal if for any other uniquely decodable code Ĉ , we have

L(C) ≤ L(Ĉ)

21 / 38
Proof of Condition 1

Theorem (Necessary conditions for Optimality)


For any distribution, there exists an optimal prefix code that satisfies the following
properties:
1 The lengths are ordered inversely with the probabilities (i.e., if pj > pk , then
lj ≤ lk ).
2 The two longest codewords have the same length.

● Let us assume that our code C is optimal but for pk > pj it has lk > lj .

● Now, let us consider another prefix-free code Ĉ where we exchange the codewords
corresponding to j and k.

L(Ĉ) − L(C) = ∑_{i=1}^{m} pi (l̂i − li ) = pk (lj − lk ) + pj (lk − lj ) = (pk − pj )(lj − lk ) < 0

● Hence, this is in contradiction to our assumption that code C is optimal. This
proves Condition 1.

22 / 38
Proof of Condition 2

Theorem (Necessary conditions for Optimality)


For any distribution, there exists an optimal prefix code that satisfies the following
properties:
1 The lengths are ordered inversely with the probabilities (i.e., if pj > pk , then
lj ≤ lk ).
2 The two longest codewords have the same length.

● If the two longest codewords are not of the same length, one can delete the last
bit of the longer one, preserving the prefix property and achieving lower expected
codeword length.

● Hence, the two longest codewords must have the same length.

● For example, let 0110101 be the unique longest codeword. Since there is no other
codeword of length 7, we can safely drop its last bit while keeping the prefix
property, and we get a shorter average code length!

23 / 38
Huffman Code and Optimality Conditions

Theorem (Necessary conditions for Optimality)


For any distribution, there exists an optimal prefix code that satisfies the following
properties:
1 The lengths are ordered inversely with the probabilities (i.e., if pj > pk , then
lj ≤ lk ).
2 The two longest codewords have the same length.

● The symbols with higher probability are merged later, so their code lengths are
shorter, satisfying Condition 1.
● We always choose the two symbols with the smallest probability and combine
them, so Condition 2 is also satisfied.
Verifying that the Huffman code has these desirable properties does not by itself prove
its optimality. See the Cover & Thomas textbook for a rigorous proof.

Theorem (Optimality of Huffman Code)


Huffman coding is optimal.

24 / 38
Huffman Code: Some Observations

● The main property is that after merging the two smallest-probability symbols of the
distribution (p1 , p2 , . . . , pm ), the remaining tree is optimal for the reduced
distribution (p1 , . . . , pm−2 , pm−1 + pm )

● Huffman coding is a “greedy” algorithm in that it coalesces the two least likely
symbols at each stage.

● The typical prefix-free code decoder works by walking through the code tree,
bit by bit, until it reaches a leaf node (see the sketch below)
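A minimal decoder sketch along these lines (ours), storing the tree as a nested dict:

def build_tree(code):
    # Turn a prefix code (symbol -> codeword) into a binary trie of nested dicts.
    root = {}
    for sym, word in code.items():
        node = root
        for bit in word:
            node = node.setdefault(bit, {})
        node["leaf"] = sym
    return root

def decode(bits, code):
    # Walk the trie from the root; every time a leaf is reached, emit its symbol.
    root = build_tree(code)
    node, out = root, []
    for bit in bits:
        node = node[bit]
        if "leaf" in node:
            out.append(node["leaf"])
            node = root
    return "".join(out)

code = {"A": "0", "B": "10", "C": "110", "D": "1110", "E": "1111"}
print(decode("0101110", code))  # 'ABD'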

25 / 38
Issues with Symbol Codes

● Design Huffman code for the distribution p(x) = (4/5, 1/5)


Symbol pi Code
A 4/5 0
B 1/5 1
H(X ) = 0.722
L1 = 1
● The expected length of Huffman code L ≤ H(X ) + 1
● We can try Huffman code with extensions (block codes)
Symbol pi Code
AA 16/25 0
AB 4/25 10
BA 4/25 110
BB 1/25 111
H(X ) = 1.444
L2 = 1.560/2 = 0.780
● For n extensions, H(X ) ≤ Ln < H(X ) + 1/n
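A sketch (ours) that builds the n-th extension of the source above and reports the per-symbol Huffman length, illustrating H(X) ≤ Ln < H(X) + 1/n; it only tracks expected lengths, not codewords:

import heapq
import itertools
import math

def huffman_expected_length(probs):
    # Expected codeword length: each merge adds one bit to every codeword below it,
    # so L equals the sum of the probabilities of all merged (internal) nodes.
    heap = [[p, 0.0] for p in probs]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, l1 = heapq.heappop(heap)
        p2, l2 = heapq.heappop(heap)
        heapq.heappush(heap, [p1 + p2, l1 + l2 + p1 + p2])
    return heap[0][1]

p = [4 / 5, 1 / 5]
H = -sum(q * math.log2(q) for q in p)                  # about 0.722 bits
for n in (1, 2, 3):
    blocks = [math.prod(b) for b in itertools.product(p, repeat=n)]
    print(n, huffman_expected_length(blocks) / n)      # 1.0, 0.78, ... -> approaches H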

26 / 38
A New Optimal Code

● Huffman with block size n achieves compression as close as 1/n bits/symbol to


entropy H(X )

● The issue is that, as we increase the block size n, the codebook size grows
exponentially as |X|^n

● The larger the codebook size the more complicated the encoding/decoding
becomes, the more memory we need, the higher the latency etc.

● Huffman code is not easily extended to longer block lengths (extensions)


without redoing all the calculations

● The idea is therefore impractical for large blocks. Arithmetic coding addresses this
issue.

27 / 38
Arithmetic Coding

● Arithmetic coding encodes entire data as a single block

● Codebook size: Codewords are computed on the fly. No pre-computed codebook
is involved.

● Codeword length: H(X) ≤ LArithmetic ≤ H(X) + (σ + 1)/n,
i.e., σ bits of overhead for the entire sequence, not for each symbol

● Arithmetic coding achieves almost the same compression as Huffman coding with
extensions, but it is practical.

28 / 38
An Example: Encoding

Given a sequence of n = 6 i.i.d. random variables with source alphabet
X = {A, B, C, D} and pmf {0.2, 0.5, 0.2, 0.1}, consider the encoding of the sequence
CBAABD.
1 Find an interval (or a range) [L, H) ⊂ [0, 1) corresponding to the entire
sequence. We start with [L, H) = [0, 1) and then subdivide the interval as we see
each symbol, depending on its probability.
● C → [0.7, 0.9)
● CB → [0.74, 0.84)
● CBA → [0.74, 0.76)
● ⋮
● CBAABD → [0.7426, 0.7428)
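A sketch of this interval refinement (the function is ours; floating point is used for clarity, while practical coders use integer arithmetic):

def sequence_interval(seq, probs):
    # Cumulative intervals: symbol -> [F(x), F(x) + p(x)) inside [0, 1).
    cum, F = {}, 0.0
    for sym, p in probs.items():
        cum[sym] = (F, F + p)
        F += p
    low, high = 0.0, 1.0
    for sym in seq:                      # refine [low, high) symbol by symbol
        width = high - low
        lo, hi = cum[sym]
        low, high = low + width * lo, low + width * hi
        print(sym, round(low, 6), round(high, 6))
    return low, high

sequence_interval("CBAABD", {"A": 0.2, "B": 0.5, "C": 0.2, "D": 0.1})
# C [0.7, 0.9), CB [0.74, 0.84), CBA [0.74, 0.76), ..., CBAABD [0.7426, 0.7428)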

29 / 38
Example: Finding The Interval

30 / 38
Example: Finding The Interval

Given a sequence of n = 6 i.i.d. random variables with source alphabet
X = {A, B, C, D} and pmf {0.2, 0.5, 0.2, 0.1}, consider the encoding of the sequence
(C, B, A, A, B, D).

Find an interval (or a range) [L, H) ⊂ [0, 1) corresponding to the entire sequence.
We start with [L, H) = [0, 1) and then subdivide the interval as we see each symbol,
depending on its probability.

CBAABD → [0.7426, 0.7428)

31 / 38
Decoding
● 0.1011111000100 → v1 = 0.74267578125

● We can start decoding from the first interval I0 (x n ) = [0, 1) by comparing with
the cumulative distribution

0.74267578125 ∈ [0.7, 0.9) ⟹ C

● Shift the interval back to [0, 1):

v2 = (v1 − 0.7)/0.2 = 0.21337890625 ∈ [0.2, 0.7) ⟹ B

● Continue until you decode all the symbols


● When to stop: Inform the number of symbols beforehand OR use a special
message to stop (extra-overhead of σ bits)
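A decoding sketch (ours) following these steps, assuming the number of symbols n is known in advance:

def arithmetic_decode(value, probs, n):
    cum, F = {}, 0.0
    for sym, p in probs.items():         # cumulative intervals, as in the encoder
        cum[sym] = (F, F + p)
        F += p
    out = []
    for _ in range(n):
        for sym, (lo, hi) in cum.items():
            if lo <= value < hi:         # which interval contains the current value?
                out.append(sym)
                value = (value - lo) / (hi - lo)   # shift back to [0, 1)
                break
    return "".join(out)

v = int("1011111000100", 2) / 2 ** 13    # 0.74267578125
print(arithmetic_decode(v, {"A": 0.2, "B": 0.5, "C": 0.2, "D": 0.1}, n=6))  # 'CBAABD'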

32 / 38
Natural Example

● Suppose we have an i.i.d. source described by the following symbol probabilities:

p(A) = 1/2, p(B) = 1/4, p(C) = 1/8, p(D) = 1/8.

Let the sequence be AAB...
● The first A is mapped to [0, 0.1) (binary) ⟹ emit a 0 as the first bit, because all
numbers in this interval must begin with bit 0.
● For the next A, the interval becomes [0, 0.01) ⟹ send another 0.
● For the next B, the interval is [0.001, 0.0011). Any number within this range can
encode AAB, so the total codeword for AAB could be 001, 0010, 001011, etc.
● In this case, the decoder can interpret bits as they arrive: the first 0 maps to A,
and so forth.

33 / 38
Out of Order Example

● Suppose we have an i.i.d. source described by the following symbol probabilities
(in this order):

p(D) = 1/8, p(B) = 1/4, p(A) = 1/2, p(C) = 1/8.

Let the sequence be AAB...
● The first A is mapped to [0.011, 0.111) (binary), so its first bit can be 0 or 1
● So we can not start encoding right away, unlike Huffman coding

34 / 38
Unique Representation

● We can uniquely represent the interval [L, H) by picking the midpoint: (L + H)/2
● The midpoint could have a very long expansion, so we round it off after B bits
to get the codeword w

− log(H − L) ≤ B ≤ − log(H − L) + 1

● For example, if the interval is [0.7426, 0.7428), its length is

In = H − L = 0.0002 ⟹ B = 13

Our choice is 0.7427 → w = 0.1011111000100

● This choice yields a prefix-free code

35 / 38
Optimality of Arithmetic Codes
● What is the size of the interval (H − L) for the input x^n?

H − L = ∏_{i=1}^{n} p(xi) = p(x^n)

● The number of bits required to encode x^n is < σ − log p(x^n) + 1

● Length of the encoded sequence is < (1/n)(σ − log p(x^n) + 1) bits/symbol

● Expected codelength is

LArithmetic < E[(σ − log p(X^n) + 1)/n] = H(X) + (σ + 1)/n

● LArithmetic → H(X) for large n

36 / 38
Huffman Vs Arithmetic Coding

● For both schemes, the expected codelength goes to entropy as n → ∞

● Unlike Huffman coding, arithmetic coding does not have exponentially
growing memory demands

● Arithmetic coding can be applied to non-i.i.d. models (sources with memory)

● A newer family of compression algorithms, Asymmetric Numeral Systems, can
achieve compression performance similar to arithmetic coding, but speeds closer
to that of Huffman coding.
Check out Charles Bloom’s blog:
http://cbloomrants.blogspot.com/2014/01/1-30-14-understanding-ans-1.html

37 / 38
Summary

● Shannon code
● Huffman code
● Optimality of Huffman code
● Arithmetic codes

End of our discussion on entropy-based data compression!

Next lecture: Sparsity perspective!

38 / 38
