exercise6-sol
Information Theory I
Prof. Dr. Heinz Koeppl, M.Sc. Maximilian Gehri
WS 2023/24
Sheet 6
Task 6.1
Consider a convolutional code with the generator sequences g^(0) = [1001] and g^(1) = [1101].
(a) Draw the code tree.
(b) Draw the state diagram.
(c) Encode the message m = [1011010].
Solution:
From the generator sequences, we obtain the following relations between the input m(i) and the outputs c^(0)(i), c^(1)(i):

c^(0)(i) = m(i) ⊕ m(i−3),
c^(1)(i) = m(i) ⊕ m(i−1) ⊕ m(i−3),

where m(i) = 0 for i < 0. We begin by writing down, for arbitrary time i ≥ 0, the table of encoder states (m(i−3), m(i−2), m(i−1)), input m(i), and outputs c^(0)(i), c^(1)(i):
State name | m(i−3) m(i−2) m(i−1) | m(i) | c^(0)(i) c^(1)(i)
a          | 0 0 0               | 0    | 0 0
a          | 0 0 0               | 1    | 1 1
b          | 0 0 1               | 0    | 0 1
b          | 0 0 1               | 1    | 1 0
c          | 0 1 0               | 0    | 0 0
c          | 0 1 0               | 1    | 1 1
d          | 0 1 1               | 0    | 0 1
d          | 0 1 1               | 1    | 1 0
e          | 1 0 0               | 0    | 1 1
e          | 1 0 0               | 1    | 0 0
f          | 1 0 1               | 0    | 1 0
f          | 1 0 1               | 1    | 0 1
g          | 1 1 0               | 0    | 1 1
g          | 1 1 0               | 1    | 0 0
h          | 1 1 1               | 0    | 1 0
h          | 1 1 1               | 1    | 0 1
From this table, it is straightforward to draw both diagrams. In the following, we label transitions with input and output using the notation m | c^(0)c^(1).
(a) The code tree:
[Figure: code tree starting in state a; each branch is labeled m | c^(0)c^(1) and leads to the successor states a-h given by the table above.]

(b) The state diagram:

[Figure: state diagram over the eight states a-h; each state has two outgoing transitions labeled m | c^(0)c^(1), as listed in the table above.]

(c) Following the state diagram from state a (equivalently, applying the relations above directly), the message m = [1011010] is encoded to c = [11 01 11 01 01 00 10]; the sketch below reproduces this computation.
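As a quick cross-check (not part of the original solution), here is a small Python sketch of the feed-forward encoder; it implements the two relations above and, applied to m = [1011010], reproduces the codeword stated in part (c). Function names and the list-based bit representation are arbitrary choices.

def conv_encode(m, g0=(1, 0, 0, 1), g1=(1, 1, 0, 1)):
    """Encode the bit list m with generator sequences g0 and g1 (constraint length 4)."""
    out = []
    for i in range(len(m)):
        c0, c1 = 0, 0
        for k in range(4):                      # c^(j)(i) = XOR_k g_j[k] * m(i-k)
            mk = m[i - k] if i - k >= 0 else 0  # m(i) = 0 for i < 0
            c0 ^= g0[k] & mk
            c1 ^= g1[k] & mk
        out.extend([c0, c1])
    return out

print(conv_encode([1, 0, 1, 1, 0, 1, 0]))
# -> [1,1, 0,1, 1,1, 0,1, 0,1, 0,0, 1,0], i.e. c = [11 01 11 01 01 00 10]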
Task 6.2
In this exercise, we consider a recursive convolutional code with the rational transfer functions

g^(0)(D) = (1 + D)/(1 + D + D^2),    g^(1)(D) = 1.
(a) Draw the state diagram for this code.
(b) Encode the message m = [100101110].
Solution:
The most systematic way to proceed is to write down, for arbitrary time i ≥ 0, the table of all possible encoder inputs m(i), states (m(i−1), c^(0)(i−2), c^(0)(i−1)), and outputs c^(0)(i), c^(1)(i). From the transfer functions, we have the relations

c^(0)(i) = m(i) ⊕ m(i−1) ⊕ c^(0)(i−1) ⊕ c^(0)(i−2),
c^(1)(i) = m(i),

where m(i) = c^(0)(i) = 0 for i < 0. The table then reads
State name | m(i−1) c^(0)(i−2) c^(0)(i−1) | m(i) | c^(0)(i) c^(1)(i) | New state
a          | 0 0 0                        | 0    | 0 0               | a
a          | 0 0 0                        | 1    | 1 1               | d
b          | 1 0 0                        | 0    | 1 0               | c
b          | 1 0 0                        | 1    | 0 1               | b
c          | 0 0 1                        | 0    | 1 0               | g
c          | 0 0 1                        | 1    | 0 1               | f
d          | 1 0 1                        | 0    | 0 0               | e
d          | 1 0 1                        | 1    | 1 1               | h
e          | 0 1 0                        | 0    | 1 0               | c
e          | 0 1 0                        | 1    | 0 1               | b
f          | 1 1 0                        | 0    | 0 0               | a
f          | 1 1 0                        | 1    | 1 1               | d
g          | 0 1 1                        | 0    | 0 0               | e
g          | 0 1 1                        | 1    | 1 1               | h
h          | 1 1 1                        | 0    | 1 0               | g
h          | 1 1 1                        | 1    | 0 1               | f
(a) The state diagram:

[Figure: state diagram over the eight states a-h; each state has two outgoing transitions labeled m | c^(0)c^(1), as listed in the table above.]
(b) Encoding the message by using the state diagram, starting in state a, gives the sequence of states d, e, c, f, a, d, h, f, a
with output sequence [110010010011110100].
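A minimal Python sketch of the recursive encoder (not part of the original sheet), assuming the recursion derived above; applied to m = [100101110] it reproduces the output sequence of part (b).

def rsc_encode(m):
    """Recursive encoder for g^(0)(D) = (1+D)/(1+D+D^2), g^(1)(D) = 1."""
    m_prev, c0_pprev, c0_prev = 0, 0, 0        # state (m(i-1), c0(i-2), c0(i-1))
    out = []
    for mi in m:
        c0 = mi ^ m_prev ^ c0_prev ^ c0_pprev  # c0(i) = m(i)+m(i-1)+c0(i-1)+c0(i-2)
        c1 = mi                                # systematic output
        out.extend([c0, c1])
        m_prev, c0_pprev, c0_prev = mi, c0_prev, c0
    return out

print("".join(map(str, rsc_encode([1, 0, 0, 1, 0, 1, 1, 1, 0]))))
# -> 110010010011110100, as stated in part (b)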
Task 6.3
Find the rate-distortion function R(D) = min_{p(x̂|x): E[d(X,X̂)] ≤ D} I(X; X̂) for X ∼ Bernoulli(1/2) and the distortion measure

d(x, x̂) = 0 if x = x̂,    d(x, x̂) = 1 if x = 1, x̂ = 0,    d(x, x̂) = ∞ if x = 0, x̂ = 1.
Solution:
Since d(0, 1) = ∞, a finite expected distortion E[d(X, X̂)] < ∞ requires p(X̂ = 1 | X = 0) = 0. Because D < ∞, we can eliminate this degree of freedom in the optimization problem by setting p(X̂ = 1 | X = 0) = 0 and p(X̂ = 0 | X = 0) = 1. For brevity we denote p := p(X̂ = 0 | X = 1). Then

E[d(X, X̂)] = P(X = 1, X̂ = 0) = p/2 ≤ D.

We show two approaches to express the mutual information as a function of p.
I(X; X̂) = E[ log( p(X) p(X̂ | X) / ( p(X) Σ_x p(X̂ | x) p(x) ) ) ] = E[ log( p(X̂ | X) / Σ_x p(X̂ | x) p(x) ) ]
         = Σ_{x̂,x} p_X(x) p(x̂ | x) log( p(x̂ | x) / Σ_{x'} p(x̂ | x') p_X(x') )
         = 1 + (1/2) Σ_{x̂,x} p(x̂ | x) log( p(x̂ | x) / Σ_{x'} p(x̂ | x') )
         = 1 − (1/2) log( 1 + p(X̂ = 0 | X = 1) ) + (1/2) p(X̂ = 0 | X = 1) log( p(X̂ = 0 | X = 1) / (1 + p(X̂ = 0 | X = 1)) ),

where we used p_X(x) = 1/2 in the second step and the degree-of-freedom reduction in the last step. Then

I(X; X̂) = 1 − (1/2) log(1 + p) + (1/2) p log( p / (1 + p) ) = 1 − ((1 + p)/2) H_B( p / (1 + p) ).
This expression can be used for a constrained optimization with the constraint 0 ≤ p ≤ min{2D, 1}. Since I(X; X̂) is decreasing in p, the minimum is attained at p = min{2D, 1}, which gives R(D) = 1 − ((1 + min{2D, 1})/2) H_B( min{2D, 1} / (1 + min{2D, 1}) ).
For the second approach, we write

I(X; X̂) = H(X) − H(X | X̂) = H_B(1/2) − p(X̂ = 0) H(X | X̂ = 0) − p(X̂ = 1) H(X | X̂ = 1)
         = 1 − p(X̂ = 0) H_B( p(X = 1 | X̂ = 0) ) − p(X̂ = 1) H_B( p(X = 1 | X̂ = 1) ).
Note that p(X = 0 | X̂ = 1) ∝ p(X̂ = 1 | X = 0) = 0 by Bayes' theorem and therefore H_B( p(X = 1 | X̂ = 1) ) = 0. Also by Bayes' theorem we get

p(X = 1 | X̂ = 0) = p(X̂ = 0 | X = 1) / ( 2 p(X̂ = 0) ),

and with p(X̂ = 0) = (1/2)( p(X̂ = 0 | X = 1) + p(X̂ = 0 | X = 0) ) = (1/2)( p(X̂ = 0 | X = 1) + 1 ) we can express the mutual information as

I(X; X̂) = 1 − ((p + 1)/2) H_B( p / (p + 1) )
with p := p(X̂ = 0 | X = 1) (as above).
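The closed-form expression can be checked numerically. The following sketch (illustrative, not part of the original solution) evaluates I(X; X̂) directly for the admissible test channels parametrized by p and compares it with 1 − ((1 + p)/2) H_B(p/(1 + p)).

import numpy as np

def h_b(q):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    q = np.clip(q, 1e-15, 1 - 1e-15)
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

def mutual_information(p):
    """I(X; Xhat) in bits for X ~ Bernoulli(1/2), p(Xhat=1|X=0) = 0, p(Xhat=0|X=1) = p."""
    px = np.array([0.5, 0.5])
    channel = np.array([[1.0, 0.0],        # x = 0 -> xhat = 0 with probability 1
                        [p, 1.0 - p]])     # x = 1 -> xhat = 0 with probability p
    p_xh = px @ channel                    # marginal of Xhat
    joint = px[:, None] * channel
    mask = joint > 0
    denom = (px[:, None] * p_xh[None, :])[mask]
    return float(np.sum(joint[mask] * np.log2(joint[mask] / denom)))

for p in [0.0, 0.2, 0.5, 1.0]:
    print(p, round(mutual_information(p), 6),
          round(float(1 - 0.5 * (1 + p) * h_b(p / (1 + p))), 6))
# Minimizing over 0 <= p <= min(2D, 1): since I(X; Xhat) decreases in p,
# the minimum (and hence R(D)) is attained at p = min(2D, 1).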
Task 6.4
Let d(x, x̂) be a distortion function. We have a source X ∼ p(x). Let R(D) be the associated rate distortion function.
(a) Find R̃(D) in terms of R(D), where R̃(D) is the rate distortion function associated with the distortion d̃(x, x̂) = d(x, x̂) + a for some constant a > 0. (They are not equal!)
(b) Now suppose that d(x, x̂) ≥ 0 for all x, x̂ and define a new distortion function d∗ (x, x̂) = b d(x, x̂), where b ≥ 0.
Find the associated rate distortion function R∗ (D) in terms of R(D).
Solution:

(a) Since E[d̃(X, X̂)] = E[d(X, X̂)] + a, the constraint E[d̃(X, X̂)] ≤ D is equivalent to E[d(X, X̂)] ≤ D − a. Hence

R̃(D) = min_{p(x̂|x): E[d(X,X̂)] ≤ D−a} I(X; X̂) = R(D − a).

(b) If b > 0, the constraint E[d*(X, X̂)] = b E[d(X, X̂)] ≤ D is equivalent to E[d(X, X̂)] ≤ D/b, so R*(D) = R(D/b). If b = 0, the constraint is satisfied by every conditional distribution p(x̂ | x), and hence R*(D) = 0.
Task 6.5
Let X ∼ N(0, σ²) and let the distortion measure be squared error. Here, we do not allow block descriptions, i.e., we use a (2^{nR}, n)-rate distortion code with n = 1.

Show that the optimal reproduction points for 1-bit quantization (i.e., R = 1) are ±√(2/π) σ and that the expected distortion for 1-bit quantization is ((π − 2)/π) σ². Compare this with the distortion-rate bound D(R) = σ² 2^{−2R} for R = 1.
Solution:
By symmetry of the Gaussian density, the two reproduction points can be written as x̂ = −a for x ≤ 0 and x̂ = a for x > 0. The optimal points minimize the squared-error distortion D = E[(X − X̂)²], for which we find
D = E[(X − X̂)²] = ∫_{−∞}^{0} p(x) (x + a)² dx + ∫_{0}^{∞} p(x) (x − a)² dx
 (1) = 2 ∫_{0}^{∞} p(x) (x − a)² dx
     = 2 [ a² ∫_{0}^{∞} p(x) dx + ∫_{0}^{∞} x² p(x) dx − 2a ∫_{0}^{∞} x p(x) dx ]
     = 2 [ (1/2) a² + (1/2) σ² − 2a (1/√(2πσ²)) ∫_{0}^{∞} x e^{−x²/(2σ²)} dx ]
 (2) = a² + σ² − (4a σ²/√(2πσ²)) ∫_{0}^{∞} e^{−y} dy
     = a² + σ² − 4aσ²/√(2πσ²),
where in (1) we used the symmetry property and in (2) applied integration by substitution with y = x²/(2σ²) ⇒ dy = 2x/(2σ²) dx ⇒ dx = σ² dy/x. We find the optimal reproduction point by differentiating D with respect to a and setting the derivative to zero:

∂D/∂a = 2a − 4σ²/√(2πσ²) = 0  ⇒  a* = √(2/π) σ.
For the second partial derivative we obtain ∂²D/∂a² = 2 > 0, from which we see that a* indeed minimizes the distortion. Hence, the optimal reproduction points are ±√(2/π) σ. Evaluating the distortion at a* = √(2/π) σ gives

D|_{a=a*} = (a*)² + σ² − 4a*σ²/√(2πσ²) = ((π − 2)/π) σ².
The optimal distortion for R = 1 (the distortion-rate function evaluated at R = 1) yields a lower bound on the distortion of any single-bit quantizer for a Gaussian random variable X:

D_opt = D(R = 1) = σ² 2^{−2} = σ²/4 < ((π − 2)/π) σ².

This shows that the scalar single-bit quantizer does not achieve the optimal distortion limit for single-bit quantization. Analogously
to channel coding, the minimum distortion value D(R) for a fixed rate R is usually achieved only asymptotically for
large block length n.
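A short numerical sanity check of these numbers (illustrative only; σ = 1 and the sample size are arbitrary choices): it estimates the distortion of the quantizer with reproduction points ±√(2/π)σ by Monte Carlo and prints the analytic value and the distortion-rate bound for comparison.

import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0                                   # assumed for illustration
x = rng.normal(0.0, sigma, 1_000_000)

a_star = np.sqrt(2.0 / np.pi) * sigma
x_hat = np.where(x > 0, a_star, -a_star)      # 1-bit quantizer with points +-a*
print("Monte Carlo distortion :", np.mean((x - x_hat) ** 2))
print("analytic (pi-2)/pi s^2 :", (np.pi - 2) / np.pi * sigma**2)   # ~0.3634
print("bound D(R=1) = s^2/4   :", sigma**2 / 4)                     # 0.25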
Task 6.6
Consider the ordinary Gaussian channel with two correlated observations of X, i.e., Y = [Y1 , Y2 ]T , where
Y1 = X + Z1, Y2 = X + Z2,

where Z1 and Z2 are jointly Gaussian noise variables with correlation coefficient ρ. Find the channel capacity for ρ = 1 and ρ = −1.
Solution:
For ρ = 1, we have Z1 = Z2 , so that Y1 = Y2 . Clearly, the capacity will then be equal to the capacity for only one
measurement Y1 . For ρ = −1, we have Z1 = −Z2 , so that Y1 + Y2 = 2X. Then X can be determined exactly, so that the
capacity is infinite.
Task 6.7
Yi = Xi + Zi , i = 1, 2, 3,
Solution:
The solution of this problem can be found with the water-filling algorithm. The maximizing distribution for x is
a multivariate zero-mean Gaussian with independent components, where the variances can most easily be found
graphically:
[Figure: water-filling diagrams for sub-tasks (a)-(c); noise levels in units of σ² versus channel number 1, 2, 3.]
(a) P1 = σ 2 /2, P2 = 0, P3 = 0.
(b) P1 = 3σ 2 /2, P2 = σ 2 /2, P3 = 0.
(c) P1 = 5σ 2 , P2 = 4σ 2 , P3 = σ 2 .
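Since the noise variances and power budgets of sub-tasks (a)-(c) are not reproduced above, the following sketch uses a generic water-filling routine; the noise levels N = [σ², 2σ², 5σ²] and the budgets σ²/2, 2σ², 10σ² are assumptions chosen here only so that the three stated allocations come out, not data taken from the original task.

import numpy as np

def water_filling(noise, total_power, tol=1e-12):
    """Power allocation P_i = max(nu - N_i, 0) with sum(P_i) = total_power."""
    noise = np.asarray(noise, dtype=float)
    lo, hi = noise.min(), noise.max() + total_power
    while hi - lo > tol:                       # bisection on the water level nu
        nu = 0.5 * (lo + hi)
        if np.maximum(nu - noise, 0.0).sum() > total_power:
            hi = nu
        else:
            lo = nu
    return np.maximum(lo - noise, 0.0)

sigma2 = 1.0
N = np.array([1.0, 2.0, 5.0]) * sigma2                       # assumed noise variances
for P_total in (0.5 * sigma2, 2.0 * sigma2, 10.0 * sigma2):  # assumed power budgets
    P = water_filling(N, P_total)
    C = 0.5 * np.sum(np.log2(1.0 + P / N))
    print(P_total, P.round(3), "capacity:", round(C, 3), "bit")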
Task 6.8
Yi = Xi + Zi , i = 1, 2, 3,
Solution:
The case of correlated noise in a Gaussian channel can be handled by applying the water-filling algorithm in coordinates
in which the covariance matrix is diagonal.
In this case, the covariance matrix Σ has the eigenvalues λ1 = 1, λ2 = 1 + ρ, and λ3 = 1 − ρ with corresponding normalized eigenvectors [1, 0, 0]^T, [0, 1/√2, 1/√2]^T, and [0, 1/√2, −1/√2]^T. Defining

S = [ 1    0      0
      0  1/√2   1/√2
      0  1/√2  −1/√2 ],      D = [ 1    0     0
                                   0  1+ρ    0
                                   0    0   1−ρ ],
we then have Σ = S D S^T. We then apply the water-filling algorithm to the diagonal matrix D. Let P̃i, i = 1, 2, 3, be the allocated power in the new coordinates. By applying the water-filling algorithm, we find that

P̃1 = 2ρ,  P̃2 = ρ,  P̃3 = 3ρ,

where we assumed that ρ ≥ 0, as otherwise the input power constraint would be meaningless. The allocated power in the original coordinates, i.e., the correlation matrix of the input which achieves the capacity, is then found to be
S^T diag(2ρ, ρ, 3ρ) S = [ 2ρ   0    0
                           0   2ρ   −ρ
                           0   −ρ   2ρ ].
Task 6.9

(a) In Task 4.6 (f) we have shown that MAP decoding minimizes the error probability Pe. State the probabilities of error that are minimized by the sum-product and the max-product message passing algorithms.
(b) Consider both block-wise ML-decoding and block-wise MAP-decoding for any linear binary (n, k)-block code with
P (c) chosen as in the lecture. Over which set do you optimize such that the decoders are equivalent? Explain your
answer.
(c) Discuss why bit-wise MAP decoding does not always yield valid codewords.
(d) Are bit-wise ML-decoding and bit-wise MAP-decoding via the sum-product algorithm equivalent for any linear binary
(n, k)-block code with P (c) chosen as in the lecture? Explain your answer.
Hint: Consider the case that the codebook contains at least one codeword with ci = 1 for all i ∈ {1, . . . , n} and afterwards consider the opposite case. Is P(ci) always uniform for all i?
(e) Discuss how ordinary minimum distance decoding and block-wise MAP decoding via the max-product algorithm
compare to each other.
(f) So far we used the sum-product and the max-product algorithm only for binary symmetric channels. What do we
need to change (in comparison to the BSC) if we want to decode a message sent over a binary erasure channel with
erasure probability ε?
Solution:
(a) The sum-product algorithm is a bit-wise MAP decoder (also called a bit-wise MAP estimator) ĉi = argmaxci p(ci | r)
for all i ∈ {1, . . . , n} and the max-product algorithm is a block-wise MAP decoder ĉ = argmaxc p(c | r). Therefore
the respective decoders minimize Pe = P (gi (r) ̸= ci ) w.r.t. the decoding function gi for all i ∈ {1, . . . , n} and
Pe = P (g(r) ̸= c) w.r.t. the decoding function g.
(b) The block-wise MAP-decoder is equivalent to an ML-decoder if the considered search space is the codebook C and
not the space of length-n sequences. The prior p(c) = 2−k M (c) is uniform on C, but not on {0, 1}n .
Case 1:

ĉ_ML,1 = argmax_{c ∈ C} ∏_{j=1}^{n} p(rj | cj) = argmax_{c ∈ C} M(c) ∏_{j=1}^{n} p(rj | cj) = ĉ_MAP.

Case 2:

ĉ_ML,2 = argmax_{c ∈ {0,1}^n} ∏_{j=1}^{n} p(rj | cj) ≠ argmax_{c ∈ {0,1}^n} M(c) ∏_{j=1}^{n} p(rj | cj) = ĉ_MAP = ĉ_ML,1.
(c) As we have seen in the solution of part (b), the block-wise MAP decoder effectively uses the codebook as its search space. In contrast, the search space of bit-wise MAP decoding is not constrained, such that the decoding function of the entire codeword g(r) = [g1(r), . . . , gn(r)] maps to the whole space {0, 1}^n instead of just mapping to the codebook. The decoding function g processes the bits independently from each other. One can also speak of the bit-wise MAP decoder using an averaged likelihood function:

ĉi = argmax_{ci} Σ_{¬ci} p(c | r) = argmax_{ci} Σ_{¬ci} p(r | c) p(c) = argmax_{ci} E_c[ p(r | c) | ci ] p(ci),

where E_c is the expectation over all codewords (and not over the received words r). This averaging reduces the information about which length-n sequences are codewords to the information of how likely r is produced (on average) by a codeword that has value ci at the i-th position. The remaining information is not enough to enforce that the decoding yields valid codewords.
(d) Let us start with the definitions of the two decoding functions:

ĉ_i,ML = argmax_{ci} p(r | ci),    ĉ_i,MAP = argmax_{ci} p(ci | r) = argmax_{ci} p(r | ci) p(ci),

where p(ci) = Σ_{¬ci} p(c) = 2^{−k} Σ_{¬ci} M(c). We first consider the case mentioned in the hint. If there exists at least
one codeword with ci = 1, then the i-th column gi of the generator matrix G = [g1 , . . . , gn ] contains at least one
non-zero entry. Recall that ci = mgi for some message m ∈ {0, 1}k . Assume that the number of non-zero entries in
this particular column is w ≤ k. We want to calculate the number of codewords with ci = 0 to evaluate P (ci = 0).
To do this, we reorder the rows of G, such that the first w coordinates of gi are equal to one and the remaining
coordinates are all zero. This changes the mapping from message to codeword, but it does not change the set
of codewords C and so p(ci ) is also unchanged. To get a codeword with ci = 0 the number of ones in the first w
coordinates of m must be even. The remaining (k − w) elements can be chosen arbitrarily, since their choice has no
effect on ci . We evaluate the number of codewords with ci = 0 to
Σ_{¬ci} M(c)|_{ci=0} = Σ_{r even} (w choose r) 2^{k−w} = 2^{k−w} 2^{w−1} = 2^{k−1},
where we used the identity Σ_{r even} (w choose r) = 2^{w−1}. This implies p(ci) = 1/2, i.e., the marginal distribution of ci on {0, 1}
is uniform. Now we turn to the case, where there exists one i, such that ci = 0 for all codewords. As an example
consider the (5, 2)-code with generator matrix

G = [ 0 0 1 1 0
      0 1 1 0 1 ].
It is a valid linear code since all rows are linearly independent, but c1 = 0 for all codewords. In this example, we
obviously have P(c1 = 0) = 1, which violates the assumption that c1 is uniformly distributed on {0, 1}. So only under the assumption that for all i ∈ {1, . . . , n} the codebook contains at least one codeword with ci = 1 can we in general conclude that bit-wise MAP decoding via the sum-product algorithm equals the bit-wise ML estimator given above. However, under certain assumptions, e.g., for a binary symmetric channel with bit-flip probability ε < 1/2, the two decoding functions can still be equivalent.
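The statements about the bit marginals can be checked by enumerating all codewords; the sketch below (illustrative, helper names are my own) does this for the (5, 2) example above: position 1 has p(c1 = 0) = 1, while every position whose column of G contains a nonzero entry has p(ci = 0) = 1/2.

import itertools
import numpy as np

def bit_marginals(G):
    """Fraction of codewords with c_i = 0 for each position i (linear code over GF(2))."""
    k = G.shape[0]
    codewords = np.array([(np.array(msg) @ G) % 2
                          for msg in itertools.product([0, 1], repeat=k)])
    return 1.0 - codewords.mean(axis=0)

G = np.array([[0, 0, 1, 1, 0],
              [0, 1, 1, 0, 1]])    # the (5, 2) example from part (d)
print(bit_marginals(G))            # -> [1.  0.5 0.5 0.5 0.5]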
(e) As we have shown in part (b) the block-wise MAP decoder is equivalent to the block-wise ML-decoder if P (c) is
chosen uniform on the space of codewords. Recall that minimum distance decoding is equivalent to block-wise ML
decoding for a binary symmetric channel with bit-flip probability ε < 1/2. So if these two conditions are satisfied, then
the max-product algorithm is equivalent to ordinary minimum distance decoding. Note that in this case minimum
distance decoding minimizes the probability of a block decoding error. In all other cases the two decoding methods
are potentially different and minimum distance decoding may either not minimize the block decoding error or it
may even be inappropriate. For example, minimum distance decoding is not well-defined for the binary erasure
channel (Why?).
(f) The messages at the terminal factor nodes are initialized as

    (p(rj | cj = 0), p(rj | cj = 1)) = (1 − ε, 0)   if rj = 0,
                                       (ε, ε)       if rj = e,
                                       (0, 1 − ε)   if rj = 1.
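A minimal sketch of this initialization (the representation of the received symbols, e.g. 'e' for an erasure, and the function names are my own), with the BSC case shown for comparison:

def init_messages_bec(r, eps):
    """Initial messages (p(r_j | c_j = 0), p(r_j | c_j = 1)) for a binary erasure channel."""
    table = {0: (1.0 - eps, 0.0), 'e': (eps, eps), 1: (0.0, 1.0 - eps)}
    return [table[rj] for rj in r]

def init_messages_bsc(r, delta):
    """Initial messages for a binary symmetric channel with flip probability delta."""
    table = {0: (1.0 - delta, delta), 1: (delta, 1.0 - delta)}
    return [table[rj] for rj in r]

print(init_messages_bec([0, 'e', 1], eps=0.1))
# -> [(0.9, 0.0), (0.1, 0.1), (0.0, 0.9)]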
Task 6.10

(a) For a source X with a finite alphabet X and Hamming distortion, describe a (not necessarily constructive) sequence of rate distortion codes that is asymptotically lossless and achieves the rate R = H(X).
(b) Suppose that X = {1, 2, 3, 4}, X̂ = {1, 2, 3, 4}, p(xi) = 1/4 for i = 1, 2, 3, 4, and X1, X2, . . . are i.i.d. ∼ p(x). The
distortion matrix d(x, x̂) is given by
x \ x̂ | 1 2 3 4
1     | 0 0 1 1
2     | 0 0 1 1
3     | 1 1 0 0
4     | 1 1 0 0
(i) Find R(0), the rate necessary to describe the process with zero distortion.
(ii) Find the rate distortion function R(D).
Hint: There are some irrelevant distinctions in alphabets X and X̂ , which allow the problem to be collapsed.
(iii) Suppose we have a non-uniform distribution p(xi ) = pi for i = 1, 2, 3, 4. What is R(D) ?
Solution:
(a) In the lecture we have seen a quote by Shannon which motivates discarding all non-typical sequences; assigning fixed-length descriptions to all typical sequences can then be used to construct an asymptotically error-free source code with asymptotic rate H(X).
Let ϵ = 1/n, set X̂ = X and consider the typical sequences T_ϵ^(n) ⊆ X^n. Set M = |T_ϵ^(n)| and let g : {1, . . . , M} → T_ϵ^(n) be an arbitrary one-to-one mapping from an index set onto the typical sequences. Since g is invertible, we define the encoding function f such that

f(x) = g^{−1}(x) if x ∈ T_ϵ^(n), and f(x) = 1 otherwise.
We simply map all non-typical sequences to the same index as it is asymptotically irrelevant where we map them. It
is left to show that this sequence of rate distortion codes is actually asymptotically lossless and achieves the rate
R = H(X):
0 ≤ E[d(X^n, X̂^n)] = (1/n) Σ_{i=1}^{n} P(Xi ≠ X̂i) ≤ (1/n) Σ_{i=1}^{n} P(X^n ≠ X̂^n) = P(X^n ≠ X̂^n) = P(X^n ∉ T_ϵ^(n)) < ϵ = 1/n,

where we used that P(X^n ∈ T_ϵ^(n)) > 1 − ϵ for n sufficiently large. So we have lim_{n→∞} E[d(X^n, X̂^n)] = 0.
By construction, the rate of the n-th rate distortion code is R^(n) = (log M)/n = (log |T_ϵ^(n)|)/n. The typical set is bounded as (1 − ϵ) 2^{n(H(X)−ϵ)} ≤ |T_ϵ^(n)| ≤ 2^{n(H(X)+ϵ)}, such that

log(1 − ϵ)/n + H(X) − ϵ ≤ R^(n) ≤ H(X) + ϵ,

and hence lim_{n→∞} R^(n) = H(X).
Comment: We could have used a more constructive ansatz by designing the encoding function as a modified version of a block-Huffman encoder, block-Fano encoder or arithmetic (block-Elias) encoder. Each of these choices asymptotically leads to very long codewords for the non-typical sequences (much longer than length nH(X)) and codewords of length approximately nH(X) for all typical sequences. An encoder can thus be constructed by mapping all sequences whose codeword length is at least n − n(1 − H(X))/2 to the index 1 and all other sequences to the binary integer given by their codeword. Asymptotically we can thus still exploit the properties of the typical set, e.g., we have M ≤ |T_ϵ^(n)| + 2 for large n.
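To illustrate the construction, the following sketch (not part of the original solution) counts the weakly ϵ-typical sequences of an i.i.d. source over a small alphabet by summing over types and checks that log M / n approaches H(X) up to ±ϵ and finite-n effects. The source distribution, ϵ and the block lengths are illustrative choices.

import itertools
import math

p = [0.6, 0.3, 0.1]                       # illustrative source distribution
H = -sum(q * math.log2(q) for q in p)     # source entropy in bits

def typical_set_size(n, eps):
    """Count the weakly eps-typical length-n sequences by summing over types."""
    size = 0
    # counts of the first |X|-1 symbols; the last count is then determined
    for counts in itertools.product(range(n + 1), repeat=len(p) - 1):
        last = n - sum(counts)
        if last < 0:
            continue
        c = list(counts) + [last]
        logp = sum(ci * math.log2(qi) for ci, qi in zip(c, p))  # log2 p(x^n) of this type
        if abs(-logp / n - H) <= eps:
            # multinomial coefficient = number of sequences with these counts
            size += math.factorial(n) // math.prod(math.factorial(ci) for ci in c)
    return size

for n in (10, 50, 200):
    M = typical_set_size(n, eps=0.1)
    print(n, round(math.log2(M) / n, 4), "vs H(X) =", round(H, 4))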
(b) Considering the distortion matrix, we can exploit the fact that the distortion measure does not distinguish between the symbols within the pair (1, 2), nor between those within the pair (3, 4). Let Y be a binary random variable such that

Y = 0 if X ∈ {1, 2},  and  Y = 1 if X ∈ {3, 4}.
(i) We know that Y is a binary source and its probability distribution is given by

p(Y = 0) = p(X = 1) + p(X = 2) = 1/2,
p(Y = 1) = p(X = 3) + p(X = 4) = 1/2.

Describing X with zero distortion is equivalent to describing Y losslessly, so R(0) = H_B(1/2) = 1 bit/symbol.
(ii) After the reduction, we have a binary source Y ∼ Bernoulli(1/2) with Hamming distortion, whose rate distortion function is R(D) = H_B(1/2) − H_B(D) = 1 − H_B(D) for 0 ≤ D ≤ 1/2, and R(D) = 0 for D > 1/2.
(iii) Finally, we generalize the rate distortion function to a non-uniform source distribution p(xi) = pi for i = 1, 2, 3, 4. Since we have reduced the problem to a binary one, the distribution of the binary random variable Y must be expressed in terms of the distribution of X, as in sub-task (i): p(Y = 0) = p1 + p2 and p(Y = 1) = p3 + p4. The same reduction as before then yields R(D) = H_B(p1 + p2) − H_B(D) for 0 ≤ D ≤ min{p1 + p2, p3 + p4}, and R(D) = 0 otherwise.
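As a numerical cross-check of the collapsed-binary result, the sketch below runs the standard Blahut-Arimoto algorithm (not part of the original solution) on the original 4-symbol problem with the uniform source and compares each computed (D, R) point with 1 − H_B(D); the Lagrange slopes and the iteration count are arbitrary choices.

import numpy as np

def blahut_arimoto(p_x, d, s, iters=500):
    """One point of the rate-distortion curve for Lagrange slope s <= 0; returns (D, R) in bits."""
    q = np.full(d.shape[1], 1.0 / d.shape[1])     # reproduction marginal, uniform start
    for _ in range(iters):
        A = q[None, :] * np.exp(s * d)            # unnormalized test channel
        Q = A / A.sum(axis=1, keepdims=True)      # p(xhat | x)
        q = p_x @ Q
    D = np.sum(p_x[:, None] * Q * d)
    R = np.sum(p_x[:, None] * Q * np.log2(np.maximum(Q, 1e-300) / np.maximum(q[None, :], 1e-300)))
    return D, R

def h_b(q):
    q = np.clip(q, 1e-15, 1 - 1e-15)
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

d = np.array([[0, 0, 1, 1],
              [0, 0, 1, 1],
              [1, 1, 0, 0],
              [1, 1, 0, 0]], dtype=float)         # distortion matrix from part (b)
p_x = np.full(4, 0.25)

for s in (-8.0, -4.0, -2.0, -1.0):
    D, R = blahut_arimoto(p_x, d, s)
    print(round(D, 4), round(R, 4), "vs 1 - H_B(D) =", round(float(1 - h_b(D)), 4))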