Final Sol
$= 1 - h_2(p)$.
Final
Page 1 of 7
(d) The capacity with cost constraint is
$$\begin{aligned}
C_d &= \max_{P_{X|S}:\, \mathbb{E}[X] \le 0.25} I(X;Y|S) \\
&= \max_{P_{X|S}:\, \mathbb{E}[X] \le 0.25} H(X \oplus Z) - H(Z) \\
&\le \max_{P_X:\, \mathbb{E}[X] \le 0.25} H(X \oplus Z) - H(Z) \\
&= h_2(\tfrac{1}{4} * p) - h_2(p),
\end{aligned}$$
where $a * p = a(1-p) + (1-a)p$ denotes binary convolution. The equality can be achieved by $X \sim \mathrm{Bern}(\tfrac{1}{4})$ independent of $S$. Therefore,
$$C_d = h_2(\tfrac{1}{4} * p) - h_2(p).$$
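As a quick numeric sanity check of $C_d = h_2(\tfrac{1}{4} * p) - h_2(p)$, the following sketch (assuming $*$ is binary convolution) verifies that $C_d$ is nonnegative and never exceeds the unconstrained capacity $1 - h_2(p)$; all function names are our own.

```python
import math

def h2(q):
    """Binary entropy in bits."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def conv(a, p):
    """Binary convolution a * p = a(1-p) + (1-a)p."""
    return a * (1 - p) + (1 - a) * p

def cd(p):
    """Cost-constrained capacity h2(1/4 * p) - h2(p) from the derivation above."""
    return h2(conv(0.25, p)) - h2(p)

# C_d is nonnegative and bounded by the unconstrained capacity 1 - h2(p).
for p in [0.0, 0.1, 0.25, 0.4, 0.5]:
    assert -1e-12 <= cd(p) <= 1 - h2(p) + 1e-12
```

At $p = 0.5$ the channel is useless and indeed $C_d = h_2(0.5) - h_2(0.5) = 0$.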
$$\min_{\mathbb{E}[d_\ell(X,Q)] \le D} I(X;Q).$$
Hint: use the previous part to show that the constraint set for the minimization can be taken to be $H(X|Q) \le D$ instead of $\mathbb{E}[d_\ell(X,Q)] \le D$.
(e) (8 points) For a given distortion level D, describe a concrete implementable scheme
(not based on a random coding argument) for achieving R(D).
$$\mathbb{E}[d_\ell(X,Q)] \ge H(X|Q).$$
Therefore,
$$H(X) - \mathbb{E}[d_\ell(X,Q)] \le H(X) - H(X|Q) = I(X;Q).$$
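A minimal numeric check of the log-loss bound $\mathbb{E}[d_\ell(X,Q)] \ge H(X|Q)$, assuming $d_\ell(x,q) = \log \frac{1}{q(x)}$ and a reconstruction PMF $q$ that does not depend on a side variable, so that $H(X|Q) = H(X)$; names are illustrative only.

```python
import math

def entropy(p):
    """H(X) in bits for a PMF given as a list."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def log_loss(p, q):
    """E[log 1/q(X)] when X ~ p and the reconstruction is the fixed PMF q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.3]
# The expected log loss dominates the entropy for any reconstruction q ...
for q in ([0.5, 0.5], [0.7, 0.3], [0.9, 0.1]):
    assert log_loss(p, q) >= entropy(p) - 1e-12
# ... with equality exactly when q is the true (conditional) PMF of X.
assert abs(log_loss(p, p) - entropy(p)) < 1e-12
```

This is the usual cross-entropy decomposition $\mathbb{E}[\log \frac{1}{q(X)}] = H(X) + D(p \,\|\, q)$.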
(d) In part (c), we have seen that $\mathbb{E}[d_\ell(X,Q)] \le D$ implies $H(X|Q) \le D$. Therefore,
$$\begin{aligned}
R(D) = \min_{\mathbb{E}[d_\ell(X,Q)] \le D} I(X;Q)
&\ge \min_{H(X|Q) \le D} I(X;Q) \\
&= \min_{H(X|Q) \le D} H(X) - H(X|Q) \\
&= H(X) - D.
\end{aligned}$$
On the other hand, since the reconstruction can be chosen so that $\mathbb{E}[d_\ell(X,Q)] = H(X|Q)$, we have
$$H(X) - D = \min_{H(X|Q) \le D} I(X;Q) \ge \min_{\mathbb{E}[d_\ell(X,Q)] \le D} I(X;Q) = R(D).$$
Therefore, $R(D) = H(X) - D$, which is a straight line between $(0, H(X))$ and $(H(X), 0)$ in the $(D, R)$ plane.
(e) Since the rate-distortion curve is a straight line, we can use a time-sharing argument. First, we have a concrete scheme that losslessly compresses the source at rate $H(X)$ with zero distortion (enumerating all typical sequences). Second, we have a concrete zero-rate scheme that achieves distortion $H(X)$ (the reconstruction is simply the PMF of $X$). By using the first scheme for a $\frac{H(X)-D}{H(X)}$ fraction of the time and the second scheme for a $\frac{D}{H(X)}$ fraction of the time, we achieve distortion $D$ at rate $R = H(X) - D$.
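The time-sharing arithmetic above can be checked directly; $H(X) = 2$ bits and $D = 0.5$ below are assumed example values.

```python
H, D = 2.0, 0.5              # example: H(X) = 2 bits, target distortion D

alpha = (H - D) / H          # fraction of time on the lossless (rate H, distortion 0) scheme
rate = alpha * H + (1 - alpha) * 0.0   # time-shared rate
dist = alpha * 0.0 + (1 - alpha) * H   # time-shared distortion

assert abs(rate - (H - D)) < 1e-12     # achieves R = H(X) - D
assert abs(dist - D) < 1e-12           # achieves distortion D
```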
where, for $Q_{X,Y} \in \mathcal{P}(\mathcal{X} \times \mathcal{Y})$, $I(Q_{X,Y})$ denotes the mutual information between $X$ and $Y$ when distributed according to $Q_{X,Y}$ (and ties in the maximization are broken arbitrarily).
(a) (8 points) Does Valery's decoder have to know the channel statistics $P_{Y|X}$ in order to implement this decoding scheme?
(b) (8 points) Prove that, for any $Q_{X,Y} \in \mathcal{P}(\mathcal{X} \times \mathcal{Y})$,
$$D(Q_{X,Y} \,\|\, P_X P_Y) \ge I(Q_{X,Y}).$$
Hint: Use the fact that $I(Q_{X,Y}) = D(Q_{X,Y} \,\|\, Q_X Q_Y)$.
(c) (8 points) Using the previous part, prove that $f(\gamma) \ge \gamma$, where
$$f(\gamma) = \min_{Q_{X,Y}:\, I(Q_{X,Y}) \ge \gamma} D(Q_{X,Y} \,\|\, P_X P_Y).$$
(d) (8 points) Using the method of types, show that for any $2 \le j \le 2^{nR}$,
$$P\bigl(I(P_{X^n(j),Y^n}) \ge \gamma \,\big|\, J = 1\bigr) \doteq 2^{-n f(\gamma)}.$$
(e) (8 points) What is the supremum of rates $R$ for which
$$P\bigl(\hat{J} \ne J\bigr) \to 0 \text{ as } n \to \infty$$
under Valery's scheme? How does it compare to the supremum of rates for which joint typicality decoding would have achieved reliable communication?
$$\begin{aligned}
P\bigl(\hat{J} \ne J\bigr) &= P\bigl(\hat{J} \ne J \,\big|\, J = 1\bigr) \\
&\le P\bigl(I(P_{X^n(1),Y^n}) < \gamma \,\big|\, J = 1\bigr) + P\bigl(I(P_{X^n(j),Y^n}) \ge \gamma \text{ for some } 2 \le j \le 2^{nR} \,\big|\, J = 1\bigr) \\
&\le P\bigl(I(P_{X^n(1),Y^n}) < \gamma \,\big|\, J = 1\bigr) + 2^{nR}\, P\bigl(I(P_{X^n(2),Y^n}) \ge \gamma \,\big|\, J = 1\bigr).
\end{aligned}$$
Then, with the help of the previous parts, establish for which values of $R$ and $\gamma$ the expression in the last line vanishes.
Solution:
(a) The decoder only needs to know the codewords $X^n(j)$ for all $1 \le j \le 2^{nR}$ and the output $Y^n$. Unlike joint typicality decoding, it does not need to know the channel statistics $P_{Y|X}$.
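To illustrate that the decoder uses only the codebook and the received sequence, here is a minimal sketch of maximum-mutual-information decoding: each candidate codeword is scored by the mutual information of its joint type with $Y^n$, and nothing about $P_{Y|X}$ enters. All helper names are our own.

```python
import math
from collections import Counter

def empirical_mi(x, y):
    """Mutual information (bits) of the joint type of sequences x and y."""
    n = len(x)
    pxy = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    return sum(c / n * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def mmi_decode(codebook, y):
    """Return the index j maximizing I(P_{X^n(j), Y^n}); ties broken by order."""
    return max(range(len(codebook)), key=lambda j: empirical_mi(codebook[j], y))

codebook = [[0, 0, 0, 0, 1, 1, 1, 1],
            [0, 1, 0, 1, 0, 1, 0, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]   # noiseless observation of codeword 0
assert mmi_decode(codebook, y) == 0
```

Here the matching pair has empirical mutual information $1$ bit while the other codeword's joint type with $y$ factorizes (empirical MI $0$), so the decoder picks index $0$.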
(b) By the hint,
$$\begin{aligned}
D(Q_{X,Y} \,\|\, P_X P_Y) - I(Q_{X,Y})
&= \sum_{x,y} Q_{X,Y}(x,y) \log \frac{Q_{X,Y}(x,y)}{P_X(x) P_Y(y)} - \sum_{x,y} Q_{X,Y}(x,y) \log \frac{Q_{X,Y}(x,y)}{Q_X(x) Q_Y(y)} \\
&= \sum_x Q_X(x) \log \frac{Q_X(x)}{P_X(x)} + \sum_y Q_Y(y) \log \frac{Q_Y(y)}{P_Y(y)} \\
&= D(Q_X \,\|\, P_X) + D(Q_Y \,\|\, P_Y) \ge 0.
\end{aligned}$$
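The decomposition $D(Q_{X,Y} \,\|\, P_X P_Y) - I(Q_{X,Y}) = D(Q_X \,\|\, P_X) + D(Q_Y \,\|\, P_Y)$ can be verified numerically; the particular $Q_{X,Y}$, $P_X$, $P_Y$ below are arbitrary illustrative choices.

```python
import math

def kl(p, q):
    """D(p || q) in bits for PMFs given as equal-length lists."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# An arbitrary joint PMF Q_{X,Y} on {0,1} x {0,1} and reference marginals P_X, P_Y.
Q = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
Px, Py = [0.3, 0.7], [0.5, 0.5]

Qx = [Q[(0, 0)] + Q[(0, 1)], Q[(1, 0)] + Q[(1, 1)]]  # marginal of X under Q
Qy = [Q[(0, 0)] + Q[(1, 0)], Q[(0, 1)] + Q[(1, 1)]]  # marginal of Y under Q

q_joint = [Q[(x, y)] for x in range(2) for y in range(2)]
ind_P = [Px[x] * Py[y] for x in range(2) for y in range(2)]  # P_X P_Y
ind_Q = [Qx[x] * Qy[y] for x in range(2) for y in range(2)]  # Q_X Q_Y

lhs = kl(q_joint, ind_P) - kl(q_joint, ind_Q)  # D(Q||P_X P_Y) - I(Q)
rhs = kl(Qx, Px) + kl(Qy, Py)                  # D(Q_X||P_X) + D(Q_Y||P_Y)
assert abs(lhs - rhs) < 1e-9
assert lhs >= 0                                # hence D(Q||P_X P_Y) >= I(Q)
```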
(c) Using part (b),
$$f(\gamma) = \min_{Q_{X,Y}:\, I(Q_{X,Y}) \ge \gamma} D(Q_{X,Y} \,\|\, P_X P_Y) \ge \min_{Q_{X,Y}:\, I(Q_{X,Y}) \ge \gamma} I(Q_{X,Y}) \ge \gamma.$$
(d) Let $E_n = \{Q_{X,Y} \in \mathcal{P}_n(\mathcal{X} \times \mathcal{Y}) : I(Q_{X,Y}) \ge \gamma\}$, where $\mathcal{P}_n$ denotes the set of joint $n$-types. For $j \ge 2$, the joint law of $X^n(j)$ and $Y^n$ is $(P_X P_Y)^n$. Therefore,
$$\begin{aligned}
P\bigl(I(P_{X^n(j),Y^n}) \ge \gamma \,\big|\, J = 1\bigr) &= P\bigl(P_{X^n(j),Y^n} \in E_n \,\big|\, J = 1\bigr) \\
&= \sum_{Q_{X,Y} \in E_n} (P_X P_Y)^n\bigl(T(Q_{X,Y})\bigr) \\
&\doteq 2^{-n \min_{Q_{X,Y} \in E_n} D(Q_{X,Y} \,\|\, P_X P_Y)} \\
&\doteq 2^{-n f(\gamma)}.
\end{aligned}$$
(e) Recall the hint:
$$\begin{aligned}
P\bigl(\hat{J} \ne J\bigr) &\overset{(i)}{=} P\bigl(\hat{J} \ne J \,\big|\, J = 1\bigr) \\
&\overset{(ii)}{\le} P\bigl(I(P_{X^n(1),Y^n}) < \gamma \,\big|\, J = 1\bigr) + P\bigl(I(P_{X^n(j),Y^n}) \ge \gamma \text{ for some } 2 \le j \le 2^{nR} \,\big|\, J = 1\bigr) \\
&\overset{(iii)}{\le} P\bigl(I(P_{X^n(1),Y^n}) < \gamma \,\big|\, J = 1\bigr) + 2^{nR}\, P\bigl(I(P_{X^n(2),Y^n}) \ge \gamma \,\big|\, J = 1\bigr).
\end{aligned}$$
Here,
(i) holds by the symmetry of the random codebook;
(ii) holds because $\hat{J} \ne 1$ implies that either $I(P_{X^n(1),Y^n}) < \gamma$, or $I(P_{X^n(\hat{J}),Y^n}) \ge I(P_{X^n(1),Y^n}) \ge \gamma$ for some $\hat{J} \ne 1$;
(iii) holds by the union bound and symmetry again.
Suppose that the inequalities $R < \gamma < I(X;Y) = I(P_{X,Y})$ hold. Then, by the law of large numbers,
$$\lim_{n \to \infty} I(P_{X^n(1),Y^n}) = I(P_{X,Y}) > \gamma \quad \text{in probability.}$$
This implies that $\lim_{n \to \infty} P\bigl(I(P_{X^n(1),Y^n}) < \gamma \,\big|\, J = 1\bigr) = 0$.
On the other hand, by parts (c) and (d),
$$2^{nR}\, P\bigl(I(P_{X^n(2),Y^n}) \ge \gamma \,\big|\, J = 1\bigr) \doteq 2^{-n(f(\gamma) - R)} \le 2^{-n(\gamma - R)},$$
which vanishes as $n$ grows.
Thus, $P\bigl(\hat{J} \ne J\bigr)$ converges to zero as $n$ grows. Finally, by choosing $\gamma$ between $R$ and $I(X;Y)$, this scheme achieves any rate below $\sup_{P_X} I(X;Y)$, which is the same channel capacity we achieved using joint typicality decoding.