List Week5
1) Consider a die with 8 faces labeled with the letters A to H. The probability of
each face is: A (1/2), B (1/4), C (1/8), D (1/16), E (1/32), F (1/64), G (1/128), and
H (1/128).
a) Find the Shannon-Fano and Huffman encodings for the symbols emitted
by this source (a Huffman sketch follows this problem).
b) Compute the entropy of the source and compare it with the average
code-word length obtained in a), determining the coding efficiency of
the Shannon-Fano and Huffman techniques.
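A minimal Huffman sketch in Python for part a), using the probabilities above; heap tie-breaking is an implementation choice, so individual codewords may differ from a hand-drawn tree even though the lengths stay optimal:

    import heapq

    probs = {'A': 1/2, 'B': 1/4, 'C': 1/8, 'D': 1/16,
             'E': 1/32, 'F': 1/64, 'G': 1/128, 'H': 1/128}

    # Heap of (probability, tiebreaker, {symbol: partial codeword}).
    heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        # Merge the two least probable subtrees, prefixing with 0 and 1.
        merged = {s: '0' + w for s, w in c1.items()}
        merged.update({s: '1' + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, tiebreak, merged))
        tiebreak += 1
    code = heap[0][2]

    avg_len = sum(probs[s] * len(w) for s, w in code.items())
    print(code)
    print("average length =", avg_len, "bits/symbol")

Because these probabilities are all powers of 1/2, the average length comes out equal to the entropy, which is what part b) asks you to verify.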
2) An unfair die with 5 faces has probability ___ of giving face A and ___ of
giving face B. The other three faces C, D, and E have probability ___ each.
a) Find the Shannon-Fano coding for the symbols emitted by this source
(a generic sketch follows this problem).
b) Compute the entropy of the source and compare it with the average
code-word length obtained in a), determining the coding efficiency of
the Shannon-Fano and Huffman techniques.
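A generic Shannon-Fano sketch in Python; since problem 2's probabilities are blank above, the demo call reuses problem 1's die as a stand-in distribution:

    def shannon_fano(symbols):
        """symbols: list of (symbol, probability). Returns {symbol: codeword}."""
        symbols = sorted(symbols, key=lambda sp: sp[1], reverse=True)
        code = {s: '' for s, _ in symbols}

        def split(group):
            if len(group) <= 1:
                return
            total, running = sum(p for _, p in group), 0.0
            # Find the split point that makes the two halves' totals closest.
            best_i, best_diff = 1, float('inf')
            for i in range(1, len(group)):
                running += group[i - 1][1]
                diff = abs(2 * running - total)
                if diff < best_diff:
                    best_i, best_diff = i, diff
            for s, _ in group[:best_i]:
                code[s] += '0'
            for s, _ in group[best_i:]:
                code[s] += '1'
            split(group[:best_i])
            split(group[best_i:])

        split(symbols)
        return code

    # Demo with problem 1's distribution as a stand-in.
    dice = [('A', 1/2), ('B', 1/4), ('C', 1/8), ('D', 1/16),
            ('E', 1/32), ('F', 1/64), ('G', 1/128), ('H', 1/128)]
    print(shannon_fano(dice))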
3) Create the Shannon-Fano and Huffman codings for the following set of
symbols, then compare the average code-word length with the entropy H(X),
determining the efficiency of each code:

Symbol              Probability
x1                  0.2
x2                  0.18
x3, x4, x5          0.1 each
x6                  0.061
x7                  0.059
x8, x9, x10, x11    0.04 each
x12                 0.03
x13                 0.01
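The entropy side of the comparison can be checked mechanically; a minimal sketch over the distribution tabulated above (the average length L must come from the codes you build):

    from math import log2

    probs = [0.2, 0.18, 0.1, 0.1, 0.1, 0.061, 0.059,
             0.04, 0.04, 0.04, 0.04, 0.03, 0.01]
    assert abs(sum(probs) - 1.0) < 1e-9   # the table must describe a distribution

    H = sum(p * log2(1 / p) for p in probs)   # source entropy in bits/symbol
    print(f"H(X) = {H:.4f} bits")
    # Efficiency of a code with average length L: eta = H / L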
4) You want to transmit the phrase "this list is very easy" to a receiver,
using the ASCII character set to map each character to a 7-bit sequence.
a) How many bits are needed to encode the phrase above?
b) What would this sequence look like after applying Shannon-Fano coding?
What is its average length?
c) What would this sequence look like after applying Huffman coding? What is
its average length?
d) Compute the entropy of the source and the efficiency of the codings found
in a), b), and c).
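A starting point for items a) to d), a minimal Python sketch that tallies the symbol statistics the codings need (the space counts as a symbol):

    from collections import Counter

    phrase = "this list is very easy"
    freq = Counter(phrase)

    # item a): 7 bits per character with plain ASCII
    print(len(phrase) * 7, "bits")
    # relative frequencies feed the Shannon-Fano/Huffman constructions in b)-c)
    for symbol, count in freq.most_common():
        print(repr(symbol), count, count / len(phrase))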
a) A way to encode this sequence would be to use a fixed-size code, with code
words long enough to encode the 14 different symbols. How many bytes would be
needed to transmit this 44-character phrase using a fixed code size? (See the
sketch after this problem.)
b) Determine the minimum number of bits required to encode the phrase,
assuming that each character is independent of its surrounding characters.
c) What is the theoretical contribution of each of the 14 symbols to the
average information?
d) Build a code dictionary for the 14 symbols using the Huffman algorithm.
e) Encode the phrase using the dictionary of item d):
i. How many bits are needed?
ii. How does this number compare with the number of bits needed when using
the code obtained in a)?
iii. How does this number compare with the information content of the phrase
calculated in item b)?
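For item a), the fixed-size bookkeeping is mechanical; a minimal sketch, with only the 14 symbols and 44 characters taken from the problem statement:

    from math import ceil, log2

    n_symbols, n_chars = 14, 44
    bits_per_symbol = ceil(log2(n_symbols))   # smallest fixed code-word length
    total_bits = n_chars * bits_per_symbol
    print(bits_per_symbol, "bits/symbol ->", total_bits, "bits =",
          ceil(total_bits / 8), "bytes")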
7) Consider a source X with symbols x1, x2, x3, x4 encoded with the following
codes:
9) A source X has four symbols x1, x2, x3, and x4 with p(x1) = 1/2, p(x2) = 1/4, and
p(x3) = p(x4) = 1/8. Build the Shannon-Fano code for X. Show that this code
has 100% efficiency.
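A quick check of the claim: Shannon-Fano assigns lengths 1, 2, 3, 3 to these probabilities, so the average length is (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/8)(3) = 1.75 bits. Because every p(xi) is a power of 1/2, the entropy H(X) = sum of p(xi) log2(1/p(xi)) works out to the same 1.75 bits, so the efficiency H(X)/L is exactly 100%.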
10) Given a source X with m equiprobable symbols xi, i = 1, ..., m, let n be the
length of a code word in a fixed-length encoding. Show that, if n = log2 m,
then the code efficiency is 100%.
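One way to see it: with p_i = 1/m for every symbol, H(X) = sum over i of (1/m) log2 m = log2 m, while a fixed-length code spends L = n bits per symbol; the efficiency H(X)/L = log2(m)/n therefore equals 1 exactly when n = log2 m (which requires m to be a power of 2).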
11) In a DNA string there are 4 kinds of bases: G, T, C, and A. What is the
information content of a DNA string of length 10 in the following cases?
a) All bases are equiprobable.
b) The bases G and T are twice as probable as the bases C and A.
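A minimal sketch that evaluates both cases, assuming the 10 bases are independent so the total information is 10 times the per-base entropy; in b) the stated 2:1 ratio forces p(G) = p(T) = 1/3 and p(C) = p(A) = 1/6:

    from math import log2

    def entropy(ps):
        # Shannon entropy in bits per symbol.
        return sum(p * log2(1 / p) for p in ps)

    # a) four equiprobable bases: 2 bits/base, hence 20 bits in total
    print(10 * entropy([1/4] * 4))
    # b) G and T twice as probable as C and A
    print(10 * entropy([1/3, 1/3, 1/6, 1/6]))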
b) Determine the efficiencies of both Joãozinho's and Huffman's codes for the
unfair die of item a). Is there anything strange about Joãozinho's code? If so,
justify.