
Tutorial on Compressed Sensing

Exercises

1. Exercise
Let η ≥ 0. Show that the ℓ0 minimization problem

(P_{0,η})   x^# = argmin ‖z‖_0   s.t.   ‖Az − y‖_2 ≤ η

for general m × N matrices A and y ∈ R^m is an NP-hard problem.

Hint: You can use the fact that the exact cover problem is NP-hard.
Exact Cover Problem: The input is a natural number m divisible by 3 and a system {T_j : j = 1, ..., N} of subsets of {1, ..., m} with |T_j| = 3 for all j ∈ [N]. Decide whether there is a subsystem of mutually disjoint sets {T_j : j ∈ J}, J ⊂ [N], such that ∪_{j∈J} T_j = {1, ..., m}.

Solution
We show that any algorithm solving the ℓ0-problem can be transformed in polynomial time into an algorithm solving the exact cover problem.
So let {T_j : j = 1, ..., N} be a system of subsets of {1, ..., m} with |T_j| = 3. We construct a matrix A ∈ R^{m×N} by putting

a_{ij} := 1 if i ∈ T_j,   a_{ij} := 0 if i ∉ T_j,

i.e. the j-th column of A is the indicator vector of T_j, denoted by χ_{T_j}, and

Ax = Σ_{j=1}^N x_j χ_{T_j}.   (1)

This construction can of course be done in polynomial time. Let now x be the solution to the minimization problem

min ‖x‖_0   s.t.   Ax = y := (1, ..., 1)^T ∈ R^m.   (P_0)

By (1) it follows that

m = ‖y‖_0 = ‖Ax‖_0 = ‖Σ_{j=1}^N x_j χ_{T_j}‖_0 ≤ Σ_{j=1}^N ‖x_j χ_{T_j}‖_0 ≤ 3‖x‖_0,   (2)

i.e. ‖x‖_0 ≥ m/3, where the last step holds because every χ_{T_j} has exactly three nonzero entries and ‖x_j χ_{T_j}‖_0 = 0 if x_j = 0.
We show that the exact cover problem has a positive solution if and only if ‖x‖_0 = m/3. Hence, after solving (P_0), we can decide whether the exact cover problem has a positive solution by computing the ℓ0-norm of the solution x.
Let us first assume that the exact cover problem has a positive solution. Then there is a set J ⊂ {1, ..., N} with |J| = m/3 and y = χ_{{1,...,m}} = Σ_{j∈J} χ_{T_j}. Hence y = Ax for x = χ_J with ‖x‖_0 = |J| = m/3, which is indeed the minimum because of (2).
If on the other hand y = Ax and ‖x‖_0 = m/3, then equality holds throughout (2), which forces the m/3 sets T_j with x_j ≠ 0 to be mutually disjoint and to cover {1, ..., m}; hence {T_j : j ∈ supp x} solves the exact cover problem.
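For concreteness, the reduction is straightforward to set up in code. A minimal Matlab sketch, assuming the set system is given as a cell array T{1}, ..., T{N} of index triples and that solve_P0 is a hypothetical oracle for (P_0):

    % Build the m x N matrix whose j-th column is the indicator of T{j}.
    A = zeros(m, N);
    for j = 1:N
        A(T{j}, j) = 1;                 % the three entries indexed by T{j}
    end
    y = ones(m, 1);                     % right-hand side (1, ..., 1)^T
    x = solve_P0(A, y);                 % hypothetical l0-minimization oracle
    hasExactCover = (nnz(x) == m/3);    % decide the exact cover instance via (2)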
2. Exercise
Let q > 1 and let A be an m × N matrix with m < N. Show that there is a 1-sparse vector x that is not a solution of the optimization problem

(P_q)   x* = argmin ‖z‖_q   s.t.   Az = Ax.

Solution
For j = 1, ..., N let e_j ∈ R^N denote the j-th canonical unit vector, which is 1-sparse. Suppose that every e_j were a solution of (P_q), i.e. for all z ∈ R^N with Az = Ae_j we had ‖z‖_q^q ≥ ‖e_j‖_q^q = 1. Since m < N, there is a v ∈ ker(A) \ {0}. For t ≠ 0 with |t| < 1/‖v‖_∞ the vector e_j + tv is feasible, and we obtain

1 ≤ ‖e_j + tv‖_q^q = |1 + tv_j|^q + Σ_{k≠j} |tv_k|^q = |1 + tv_j|^q + |t|^q Σ_{k≠j} |v_k|^q = 1 + qtv_j + o(t)   as t → 0,

where the last step follows from the Taylor (binomial) expansion |1 + tv_j|^q = 1 + qtv_j + o(t) together with |t|^q = o(|t|), since q > 1.
Because the inequality holds for small t of both signs, it forces v_j = 0. But this is true for all j ∈ {1, ..., N}, so it follows that v = 0, contradicting the choice of v ∈ ker(A) \ {0}. Hence some e_j is not a solution of (P_q).
3. Exercise
Let A be an m × N matrix, y ∈ R^m, η > 0, and let ‖·‖ be an arbitrary norm on R^m. Show that the solution of the optimization problem

(P_{1,η})   x* = argmin ‖z‖_1   s.t.   ‖Az − y‖ ≤ η

is m-sparse whenever it is unique.

Hint: Show that the system of columns {a_j : j ∈ supp x*} is linearly independent.

Solution
We show that if x* is the unique solution, with K := supp x*, then the columns {a_j : j ∈ K} have to be linearly independent. Since at most m columns of A can be linearly independent, the statement follows. Suppose {a_j : j ∈ K} is not linearly independent. Then there is a v ∈ ker(A) \ {0} with supp(v) ⊂ K. For every t ∈ R the vector x* + tv is feasible, since A(x* + tv) = Ax*. Hence, by uniqueness, for every t ≠ 0 small enough (in absolute value) that sign(x*_j + tv_j) = sign(x*_j) for all j ∈ K:

‖x*‖_1 < ‖x* + tv‖_1 = Σ_{j∈K} |x*_j + tv_j| = Σ_{j∈K} sign(x*_j + tv_j)(x*_j + tv_j)
= Σ_{j∈K} sign(x*_j + tv_j) x*_j + t Σ_{j∈K} v_j sign(x*_j + tv_j)
= Σ_{j∈K} sign(x*_j) x*_j + t Σ_{j∈K} v_j sign(x*_j)
= ‖x*‖_1 + t Σ_{j∈K} v_j sign(x*_j),

which is a contradiction, since we can choose the sign of t such that t Σ_{j∈K} v_j sign(x*_j) ≤ 0.
4. Exercise
Let A be an m × N matrix and 2s ≤ m. Show that the following statements are equivalent:

i) There is a mapping Λ : R^m → R^N such that Λ(Ax) = x for all x ∈ Σ_s (the set of s-sparse vectors in R^N). We call such a mapping Λ a decoder.

ii) Σ_{2s} ∩ ker(A) = {0}.

iii) For any set T with #T = 2s, the matrix A_T has rank 2s.

iv) For any set T with #T = 2s, the symmetric non-negative matrix A_T^T A_T is invertible, i.e. positive definite.

Solution
The equivalence of ii), iii) and iv) is linear algebra. For example ii) ⇒ iii): If Σ_{2s} ∩ ker(A) = {0}, then for every T ⊂ {1, ..., N} with |T| ≤ 2s we have ker(A_T) = {0}, and therefore A_T has full rank.
i) ⇒ ii): Let x ∈ Σ_{2s} ∩ ker(A). Then we can write x = x_1 − x_0, where both x_1 and x_0 lie in Σ_s. Since x ∈ ker(A) we have Ax = 0 and therefore Ax_1 = Ax_0, which implies by assumption i) that x_1 = Λ(Ax_1) = Λ(Ax_0) = x_0, and therefore x = 0.
ii) ⇒ i): For any y ∈ R^m we define Λ(y) to be an element of smallest support in the set of solutions {x ∈ R^N : Ax = y} (and, say, Λ(y) = 0 if this set is empty). Suppose there is x_1 ∈ Σ_s such that x_0 := Λ(Ax_1) ≠ x_1. By construction Ax_0 = Ax_1 and ‖x_0‖_0 ≤ ‖x_1‖_0 ≤ s, hence x_1 − x_0 ∈ Σ_{2s} ∩ ker(A). By assumption this implies x_1 = x_0, a contradiction.
5. Exercise
[NSP] Given a matrix A ∈ R^{m×N}, every vector x ∈ R^N supported on a set T is the unique solution of (P_1) with y = Ax if and only if A satisfies the null space property relative to T. Here (P_1) denotes the problem min ‖z‖_1 s.t. Az = y.
Reminder: A is said to satisfy the null space property relative to the set T if for all v ∈ ker(A) \ {0} it holds that

‖v_T‖_1 < ‖v_{T^C}‖_1,

where (v_T)_i = v_i if i ∈ T and (v_T)_i = 0 otherwise.

Solution
Fix an index set T and assume that every vector x ∈ R^N supported on T is the unique minimizer. Let v ∈ ker(A) \ {0}. Then v_T is supported on T, so by assumption it is the unique minimizer of (P_1) with y = Av_T. Because A(v_T + v_{T^C}) = Av = 0, we have A(−v_{T^C}) = Av_T, i.e. −v_{T^C} is also feasible, and −v_{T^C} ≠ v_T (otherwise v = 0). Hence, by uniqueness, ‖v_T‖_1 < ‖v_{T^C}‖_1.

Conversely, let us assume that the NSP relative to T holds. Given a vector x supported on T, every z ∈ R^N with Az = Ax and z ≠ x satisfies v := x − z ∈ ker(A) \ {0}. We obtain

‖x‖_1 ≤ ‖x − z_T‖_1 + ‖z_T‖_1 = ‖v_T‖_1 + ‖z_T‖_1 < ‖v_{T^C}‖_1 + ‖z_T‖_1 = ‖z_{T^C}‖_1 + ‖z_T‖_1 = ‖z‖_1,

where the strict inequality is the NSP, and the next-to-last step uses v_{T^C} = −z_{T^C}, since x is supported on T.


6. Exercise
Given a matrix A ∈ R^{m×N}, a vector x ∈ R^N with support T is the unique minimizer of (P_1) if and only if |Σ_{j∈T} sign(x_j) v_j| < ‖v_{T^C}‖_1 for all v ∈ ker(A) \ {0}.

Solution
Let us start by proving that the inequality implies that x ∈ R^N with support T is the unique minimizer of (P_1). For a vector z ∈ R^N, z ≠ x, with Az = Ax we write, with v := x − z ∈ ker(A) \ {0},

‖z‖_1 = ‖z_T‖_1 + ‖z_{T^C}‖_1 = ‖(x − v)_T‖_1 + ‖v_{T^C}‖_1
> |⟨x − v, sign(x)_T⟩| + |⟨v, sign(x)_T⟩| ≥ |⟨x, sign(x)_T⟩| = ‖x‖_1,

where the strict inequality combines ‖(x − v)_T‖_1 ≥ |⟨x − v, sign(x)_T⟩| with the assumption ‖v_{T^C}‖_1 > |⟨v, sign(x)_T⟩|, and the second step is the triangle inequality.
It remains to show that the inequality holds as soon as x, supported on T, is the unique minimizer of (P_1). In this situation, for v ∈ ker(A) \ {0} the vector z = x − v satisfies Az = Ax and ‖x‖_1 < ‖z‖_1. From this we can deduce

⟨x, sign(z)_T⟩ ≤ ‖x‖_1 < ‖z‖_1 = ‖z_T‖_1 + ‖z_{T^C}‖_1 = ⟨z, sign(z)_T⟩ + ‖z_{T^C}‖_1
⇔ ⟨v, sign(z)_T⟩ < ‖z_{T^C}‖_1
⇔ ⟨v, sign(x − v)_T⟩ < ‖v_{T^C}‖_1,

using z_{T^C} = −v_{T^C} in the last step. Since this holds for every v ∈ ker(A) \ {0}, we may replace v by tv for any t > 0:

⟨tv, sign(x − tv)_T⟩ < t‖v_{T^C}‖_1 ⇔ ⟨v, sign(x − tv)_T⟩ < ‖v_{T^C}‖_1.

For t small enough it holds that sign(x_j − tv_j) = sign(x_j) for all j ∈ T (since x_j ≠ 0 there), and therefore

⟨v, sign(x)_T⟩ < ‖v_{T^C}‖_1.

Applying this to −v as well yields |Σ_{j∈T} sign(x_j) v_j| = |⟨v, sign(x)_T⟩| < ‖v_{T^C}‖_1.

7. Exercise
Show that the RIP implies the NSP.
More explicitly: Let A ∈ R^{m×d} satisfy the restricted isometry property (RIP) of order 2s with constant 0 < δ_{2s} < 1/3, i.e.

(1 − δ_{2s})‖x‖_2^2 ≤ ‖Ax‖_2^2 ≤ (1 + δ_{2s})‖x‖_2^2

holds for all 2s-sparse vectors x, i.e. for all

x ∈ Σ_{2s} = {v ∈ R^d | ‖v‖_0 = #{i | v_i ≠ 0} ≤ 2s}.

Show that A satisfies the null space property of order s (NSP), i.e. for any T ⊂ [d] with #T ≤ s and any v ∈ ker(A) \ {0} it holds that

2‖v_T‖_1 < ‖v‖_1,

where (v_T)_i = v_i if i ∈ T and (v_T)_i = 0 otherwise.
Hint:
1. First show that

|⟨Ax, Ay⟩| ≤ δ_{2s} ‖x‖_2 ‖y‖_2

if x, y are s-sparse with disjoint support.
2. For v ∈ ker(A) \ {0} let T_0 ⊂ [d] denote the set of indices corresponding to the s largest entries of v (in magnitude). Further let T_0^c = T_1 ∪ T_2 ∪ ... be a partition of T_0^c such that T_1 contains the indices of the s largest entries of v_{T_0^c}, T_2 the indices of the s largest entries of v_{T_0^c \ T_1}, etc.

Solution
1. step: Let x, y ∈ Σ_s with disjoint support; by homogeneity we may assume ‖x‖_2 = ‖y‖_2 = 1. Then it holds that x ± y ∈ Σ_{2s} and ‖x ± y‖_2^2 = 2. Using the RIP of A of order 2s we obtain

2(1 − δ_{2s}) = (1 − δ_{2s})‖x ± y‖_2^2 ≤ ‖A(x ± y)‖_2^2 ≤ (1 + δ_{2s})‖x ± y‖_2^2 = 2(1 + δ_{2s}).

Now the claim follows from the polarization identity, since

|⟨Ax, Ay⟩| = (1/4) |‖A(x + y)‖_2^2 − ‖A(x − y)‖_2^2| ≤ (1/4) (2(1 + δ_{2s}) − 2(1 − δ_{2s})) = δ_{2s}.
2. step: Let v ∈ ker(A) \ {0} and let T_0 ⊂ [d] = {1, 2, ..., d} denote the set of indices corresponding to the s largest entries of v (in magnitude). Further divide T_0^c = [d] \ T_0 into sets

T_1: the indices of the s largest entries of v_{T_0^c},
T_2: the indices of the s largest entries of v_{T_0^c \ T_1},
...

In total we have split [d] into disjoint sets T_0, T_1, ... such that T_0 contains the indices of the s largest entries of v, T_1 the indices of the next s largest entries, etc., hence

[d] = T_0 ∪ T_1 ∪ T_2 ∪ ...

Since v is an element of the kernel of A we get

0 = Av = A(v_{T_0} + v_{T_1} + v_{T_2} + ...)  ⇒  Av_{T_0} = −A(v_{T_1} + v_{T_2} + ...).

Now we can apply the RIP (since #T_0 ≤ s) to arrive at

(1 − δ_{2s})‖v_{T_0}‖_2^2 ≤ ‖Av_{T_0}‖_2^2 = ⟨Av_{T_0}, Av_{T_0}⟩ = ⟨Av_{T_0}, −A(v_{T_1} + v_{T_2} + ...)⟩ = Σ_{i≥1} ⟨Av_{T_0}, −Av_{T_i}⟩.

Here we can apply our first step (the T_i are pairwise disjoint and #T_i ≤ s) to get

(1 − δ_{2s})‖v_{T_0}‖_2^2 ≤ Σ_{i≥1} ⟨Av_{T_0}, −Av_{T_i}⟩ ≤ δ_{2s} ‖v_{T_0}‖_2 Σ_{i≥1} ‖v_{T_i}‖_2.

Using our construction of the T_i's we further estimate for i ≥ 1

‖v_{T_i}‖_2 = (Σ_{j∈T_i} v_j^2)^{1/2} ≤ (s max_{k∈T_i} |v_k|^2)^{1/2} = √s max_{k∈T_i} |v_k| ≤ √s min_{k∈T_{i−1}} |v_k| ≤ √s (Σ_{j∈T_{i−1}} |v_j|)/s = ‖v_{T_{i−1}}‖_1/√s.

Hence,

(1 − δ_{2s})‖v_{T_0}‖_2^2 ≤ δ_{2s} ‖v_{T_0}‖_2 Σ_{i≥1} ‖v_{T_i}‖_2 ≤ δ_{2s} ‖v_{T_0}‖_2 Σ_{i≥1} ‖v_{T_{i−1}}‖_1/√s = (δ_{2s} ‖v_{T_0}‖_2/√s) ‖v‖_1.

Dividing by ‖v_{T_0}‖_2 and (1 − δ_{2s}), and using that δ_{2s} < 1/3 implies δ_{2s}/(1 − δ_{2s}) < 1/2, we end up with

‖v_{T_0}‖_2 ≤ (δ_{2s}/(1 − δ_{2s})) ‖v‖_1/√s < ‖v‖_1/(2√s).

This yields the claim: by the Cauchy-Schwarz inequality, any s-sparse vector w satisfies ‖w‖_1 = ⟨w, sign(w)⟩ ≤ ‖w‖_2 ‖sign(w)‖_2 ≤ √s ‖w‖_2, and since T_0 collects the s largest entries of v, we obtain for any T with #T ≤ s

‖v_T‖_1 ≤ ‖v_{T_0}‖_1 ≤ √s ‖v_{T_0}‖_2 < ‖v‖_1/2,

i.e. 2‖v_T‖_1 < ‖v‖_1.

8. Exercise
Let s ∈ N, 0 < δ < 1 and let

m ≥ cδ^{−2} s log(ed/s).

Further let A = Ã/√m ∈ R^{m×d} with i.i.d. entries ã_{ij} ∼ N(0, 1). Show that A satisfies the RIP of order s with RIP constant δ_s ≤ δ with probability at least

1 − 2 exp(−Cδ^2 m).

Hint: First use a Bernstein inequality to show that

P(|‖Ax‖_2^2 − ‖x‖_2^2| ≥ t‖x‖_2^2) ≤ 2 exp(−ct^2 m)

holds for all 0 < t ≤ 1 and x ∈ R^d. Then show the desired RIP inequality for a fixed s-dimensional subspace using a covering argument.

Solution
We use the Bernstein inequality:
Let X_1, ..., X_m be independent mean-zero (i.e. EX_i = 0) subexponential random variables, i.e.

P(|X_i| ≥ t) ≤ β exp(−κt)

holds for all t > 0 and constants β, κ > 0. Then it holds that

P(|Σ_{i=1}^m X_i| ≥ t) ≤ 2 exp(−κ^2 t^2 / (4βm + κt)).

1. step: With Bernstein's inequality we first want to show that

P(|‖Ax‖_2^2 − ‖x‖_2^2| ≥ t‖x‖_2^2) ≤ 2 exp(−ct^2 m).

By homogeneity it suffices to consider x ∈ R^d with ‖x‖_2 = 1. Let ã_i denote the i-th row of Ã and consider the random variables

X_i = |⟨ã_i, x⟩|^2 − ‖x‖_2^2 = |⟨ã_i, x⟩|^2 − 1.

Then it holds that

• the X_i are independent, since the ã_i are independent,

• the X_i are subexponential, since the ⟨ã_i, x⟩ are Gaussian (so their squares have subexponential tails),

• the X_i have mean zero, since

EX_i = E|⟨ã_i, x⟩|^2 − ‖x‖_2^2 = ‖x‖_2^2 E|⟨ã_i, x/‖x‖_2⟩|^2 − ‖x‖_2^2 = ‖x‖_2^2 E|g|^2 − ‖x‖_2^2 = 0

with g ∼ N(0, 1), using ⟨ã_i, x/‖x‖_2⟩ ∼ N(0, 1) and E|g|^2 = 1,

• and it holds that

(1/m) Σ_{i=1}^m X_i = Σ_{i=1}^m (|⟨ã_i/√m, x⟩|^2 − ‖x‖_2^2/m) = ‖Ax‖_2^2 − ‖x‖_2^2.
Now we apply Bernstein's inequality to get

P(|‖Ax‖_2^2 − ‖x‖_2^2| ≥ t) = P(|(1/m) Σ_{i=1}^m X_i| ≥ t) = P(|Σ_{i=1}^m X_i| ≥ mt) ≤ 2 exp(−κ^2 t^2 m^2 / (4βm + κtm)) ≤ 2 exp(−ct^2 m)

for 0 < t ≤ 1.
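This concentration is easy to observe numerically. A small Monte Carlo sketch (the parameter values are ours):

    % Empirical tail of | ||Ax||_2^2 - 1 | at level t for a fixed unit vector x.
    m = 100; d = 1000; t = 0.3; trials = 2000;
    x = randn(d, 1); x = x / norm(x);
    hits = 0;
    for k = 1:trials
        A = randn(m, d) / sqrt(m);                    % A = Atilde / sqrt(m)
        hits = hits + (abs(norm(A * x)^2 - 1) >= t);  % deviation exceeded t?
    end
    fprintf('empirical tail probability: %.4f\n', hits / trials);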

2. step: We fix an s-dimensional coordinate subspace. Let T ⊂ [d] with #T = s and let

X_T = {x ∈ Σ_s | supp x ⊂ T}.

We want to show that

(1 − δ)‖x‖_2 ≤ ‖Ax‖_2 ≤ (1 + δ)‖x‖_2

holds for all x ∈ X_T with probability at least

1 − 2 (12/δ)^s exp(−cδ^2 m).

Let Q ⊂ X_T be a δ/4-net of the unit sphere of X_T, i.e.

• ‖q‖_2 = 1 for all q ∈ Q, and

• for any x ∈ X_T with ‖x‖_2 = 1 there exists some q ∈ Q with ‖x − q‖_2 ≤ δ/4.

It is known that we can choose Q with #Q ≤ (12/δ)^s (a proof is given below).


For any q ∈ Q we now use the first step with t = δ/2 to get

P(|‖Aq‖_2^2 − ‖q‖_2^2| ≥ δ/2) ≤ 2 exp(−cδ^2 m),

which is equivalent to

(1 − δ/2)‖q‖_2^2 ≤ ‖Aq‖_2^2 ≤ (1 + δ/2)‖q‖_2^2

with probability at least 1 − 2 exp(−cδ^2 m). Hence, for any (fixed) q ∈ Q it also holds that

(1 − δ/2)‖q‖_2 ≤ ‖Aq‖_2 ≤ (1 + δ/2)‖q‖_2

with probability at least 1 − 2 exp(−cδ^2 m), since √(1 − δ/2) ≥ 1 − δ/2 and √(1 + δ/2) ≤ 1 + δ/2. By a union bound, this inequality holds simultaneously for all q ∈ Q with probability at least

1 − 2 #Q exp(−cδ^2 m) ≥ 1 − 2 (12/δ)^s exp(−cδ^2 m).

Now we want to prove that the desired inequality also holds for all x ∈ X_T. Let δ̂ ≥ 0 be the smallest constant such that

‖Ax‖_2 ≤ (1 + δ̂)‖x‖_2

holds for all x ∈ X_T, and let v ∈ X_T be fixed with ‖v‖_2 = 1. Then there is some q ∈ Q with ‖v − q‖_2 ≤ δ/4, and since v − q ∈ X_T we get

‖Av‖_2 ≤ ‖Aq‖_2 + ‖A(v − q)‖_2 ≤ 1 + δ/2 + (1 + δ̂) δ/4.

Taking the supremum over all such v gives

1 + δ̂ ≤ 1 + δ/2 + (1 + δ̂) δ/4  ⇒  δ̂ ≤ 3δ/(4 − δ) < δ.

For the lower bound, the same net point yields ‖Av‖_2 ≥ ‖Aq‖_2 − ‖A(v − q)‖_2 ≥ 1 − δ/2 − (1 + δ̂) δ/4 > 1 − δ, since (1 + δ̂) δ/4 < δ/2.

3. step: We have already proved the inequality for every fixed support set T. Since there are

(d choose s) ≤ (ed/s)^s

possibilities to choose s indices out of d, a union bound gives a failure probability of at most

2 (ed/s)^s (12/δ)^s exp(−cδ^2 m) ≤ 2 exp(−Cδ^2 m),

where the last step uses m ≥ cδ^{−2} s log(ed/s) (with c large enough) to absorb the combinatorial factors. This proves the claim.


Covering argument: It remains to show the covering bound which we used for the set Q; we prove it in a more general setting.
Let X be an m-dimensional normed space, let ε > 0 and denote B_X = {x ∈ X | ‖x‖ ≤ 1}. Then the covering number

N = min{n ∈ N | ∃ q_1, ..., q_n ∈ B_X : B_X ⊂ ∪_{i=1}^n (q_i + εB_X)}

can be bounded by

N ≤ ((2 + 2ε)/ε)^m.

Indeed, let Q = {q_1, ..., q_k} be (any) maximal set of points in B_X with

‖q_i − q_j‖ > ε   for all i ≠ j.

Then it holds that

• B_X ⊂ ∪_i (q_i + εB_X), since otherwise there would be some z ∈ B_X with ‖z − q_i‖ > ε for all i = 1, ..., k, in contradiction to the maximality of Q. Hence we have N ≤ k.

• The sets q_i + (ε/2)B_X are mutually disjoint: if there were i ≠ j and

z ∈ (q_i + (ε/2)B_X) ∩ (q_j + (ε/2)B_X),

it would follow that ‖q_i − q_j‖ ≤ ‖q_i − z‖ + ‖q_j − z‖ ≤ ε/2 + ε/2 = ε, a contradiction.

We conclude

∪_{i=1}^k (q_i + (ε/2)B_X) ⊂ (1 + ε/2)B_X ⊂ (1 + ε)B_X,

and comparing the volumes we arrive at

vol(∪_{i=1}^k (q_i + (ε/2)B_X)) = k vol((ε/2)B_X) = k (ε/2)^m vol(B_X) ≤ vol((1 + ε)B_X) = (1 + ε)^m vol(B_X),

hence

k (ε/2)^m ≤ (1 + ε)^m  ⇒  N ≤ k ≤ ((2 + 2ε)/ε)^m.

Matlab Exercises

1. Matlab Exercise

1. Implement the basis pursuit

min_{x∈R^d} ‖x‖_1   subject to   Ax = y

in the form of a linear optimization problem.

Hint: The Matlab routine linprog can be useful.
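A minimal sketch of this reformulation, assuming A ∈ R^{m×d} and y are given: the ℓ1 norm is linearized with auxiliary variables t ∈ R^d via −t ≤ x ≤ t, and the stacked variable is [x; t].

    d = size(A, 2);
    f   = [zeros(d, 1); ones(d, 1)];       % minimize sum(t) = ||x||_1 at the optimum
    Ain = [ eye(d), -eye(d);               %  x - t <= 0
           -eye(d), -eye(d)];              % -x - t <= 0
    bin = zeros(2 * d, 1);
    Aeq = [A, zeros(size(A, 1), d)];       % equality constraint A x = y
    beq = y;
    sol = linprog(f, Ain, bin, Aeq, beq);
    x_bp = sol(1:d);                       % basis pursuit solution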

2. Test your program for noisy measurements of the form y = Ax + z, where z ∈ R^m is either deterministic noise (i.e. ‖z‖ is small) or random Gaussian noise (i.e. z_i ∼ N(0, σ^2) with σ > 0 small).

2. Matlab Exercise
Show numerically that the number of measurements m only has to grow logarithmically in the dimension d if we want to recover an s-sparse signal x_0 ∈ R^d from linear measurements y = Ax_0 with A ∈ R^{m×d}.
To show this, calculate the approximation error for increasing values of d and m and plot the resulting matrix.
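One possible setup for the experiment (a sketch; the grid values are ours, and basis_pursuit stands for the routine from Matlab Exercise 1):

    s = 5; ds = 100:100:1000; ms = 5:5:60;
    err = zeros(numel(ms), numel(ds));
    for a = 1:numel(ms)
        for b = 1:numel(ds)
            m = ms(a); d = ds(b);
            A = randn(m, d) / sqrt(m);            % Gaussian measurement matrix
            x0 = zeros(d, 1);
            x0(randperm(d, s)) = randn(s, 1);     % random s-sparse signal
            x = basis_pursuit(A, A * x0);         % decoder from Matlab Exercise 1
            err(a, b) = norm(x0 - x);
        end
    end
    imagesc(ds, ms, err); axis xy; colorbar       % error over the (d, m) grid
    xlabel('dimension d'); ylabel('measurements m');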

3. Matlab Exercise

1. Implement the 1-bit compressed sensing algorithm

max_{x∈R^d} Σ_{i=1}^m y_i ⟨a_i, x⟩   subject to   ‖x‖_1 ≤ R, ‖x‖_2 ≤ 1,

which recovers the true signal x_0 ∈ R^d with ‖x_0‖_1 ≤ R and ‖x_0‖_2 ≤ 1 from measurements y_i = sign⟨a_i, x_0⟩, i = 1, ..., m.
Hint: The Matlab package CVX can be useful.
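A minimal CVX sketch of part 1, assuming the rows a_i are stacked into A ∈ R^{m×d} and y ∈ {−1, 1}^m holds the sign measurements:

    cvx_begin
        variable x(d)
        maximize( y' * (A * x) )          % equals sum_i y_i <a_i, x>
        subject to
            norm(x, 1) <= R;
            norm(x, 2) <= 1;
    cvx_end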
2. Test your algorithm with noisy measurements of the form

y_i = sign⟨a_i, x_0⟩ with probability 1 − p,   y_i = −sign⟨a_i, x_0⟩ with probability p,

for some 0 < p < 1/2.

4. Matlab Exercise
Let f : B^d → R be a ridge function, f(x) = g(⟨a, x⟩), for some (unknown) s-sparse ridge vector a ∈ R^d with ‖a‖_2 = 1 and some differentiable ridge profile g : R → R with g′(0) ≠ 0. The ridge vector a can be recovered by the following algorithm:

• Input: ridge function f(x) = g(⟨a, x⟩), h > 0 small, and m ∈ N

• Take Φ ∈ R^{m×d} a normalized Bernoulli matrix (i.e. with entries ±1, both with probability 1/2)

• Put b̃_j := (f(hφ_j) − f(0))/h, j = 1, ..., m, where φ_j is the j-th row of Φ

• Put ã := Δ_1(b̃) = argmin_{w∈R^d} ‖w‖_1 s.t. Φw = b̃

• Put â := ã/‖ã‖_2

• Output: â

Implement this algorithm and show numerically that it indeed recovers the ridge vector a (see the sketch below).
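A direct transcription of the algorithm (a sketch; the profile g, the s-sparse unit vector a, the parameter values, and the helper basis_pursuit from Matlab Exercise 1 are ours; we normalize the Bernoulli matrix by 1/√m, which is one reading of "normalized"):

    d = 1000; m = 60; h = 1e-3;
    f = @(x) g(a' * x);                        % ridge function oracle f(x) = g(<a,x>)
    Phi = sign(randn(m, d)) / sqrt(m);         % Bernoulli matrix, entries +-1/sqrt(m) (assumption)
    btilde = zeros(m, 1);
    for j = 1:m
        btilde(j) = (f(h * Phi(j, :)') - f(zeros(d, 1))) / h;   % difference quotients
    end
    atilde = basis_pursuit(Phi, btilde);       % l1 decoder Delta_1
    ahat = atilde / norm(atilde);              % normalize to unit l2 norm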

Solution
Matlab implementations are given in the corresponding Matlab files.
[Figure 1 (plots): top left, recovery error ‖x_0 − x‖_2 vs. number of measurements m for the basis pursuit test (s = 5, d = 1000; no noise, Gaussian noise σ = 0.1, deterministic noise r = 0.1); top right, error ‖x_0 − x‖_2 over the (dimension d, measurements m) grid for sparsity s = 5; bottom left, error vs. m for the one-bit test (s = 5, d = 1000; no noise, misclassification probabilities p = 0.1 and p = 0.2); bottom right, ‖a − â‖_2 vs. step size h for the ridge function test (g = @(t) tanh(t − 1), s = 5, d = 1000, m = 60).]

Figure 1: Top: figure generated by the basis pursuit test (left) and by the phase transition experiment (right). Bottom: figure generated by the one-bit test (left) and by the ridge function test (right).
