Deterministic CS
Agenda
observe b = Ax
recover by ℓ1 minimization

x_s = arg min_{‖z‖ℓ0 ≤ s} ‖x − z‖
x_s : s-sparse
the s largest entries of x are the nonzero entries of x_s
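The best s-sparse approximation x_s is obtained by hard thresholding: keep the s largest-magnitude entries of x and zero out the rest. A minimal numpy sketch (the helper name `best_s_sparse` is ours):

```python
import numpy as np

def best_s_sparse(x, s):
    """Hard thresholding: keep the s largest-magnitude entries of x."""
    xs = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-s:]      # indices of the s largest |x_i|
    xs[keep] = x[keep]
    return xs

x = np.array([0.1, -3.0, 0.5, 2.0, -0.2])
print(best_s_sparse(x, 2))                 # only -3.0 and 2.0 survive
```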
General signal recovery

Theorem (Noiseless recovery (C., Romberg and Tao); this version due to C. '08)
If δ2s < √2 − 1 = 0.414…, ℓ1 recovery obeys
‖x̂ − x‖ℓ2 ≲ ‖x − x_s‖ℓ1 / √s
‖x̂ − x‖ℓ1 ≲ ‖x − x_s‖ℓ1

[Figure: recovered signal plot, amplitudes −0.5 to 1 over indices 0–50]
Exact if x is s-sparse.
Otherwise, essentially reconstructs the s largest entries of x
Powerful if s is close to m
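The ℓ1 recovery above can be tried numerically by writing basis pursuit (min ‖z‖ℓ1 subject to Az = b) as a linear program in the split variables (z, t) with −t ≤ z ≤ t. A sketch using scipy's `linprog`; the dimensions, seed, and scaling are illustrative, and at these sizes recovery of an s-sparse x is typically exact:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n, s = 40, 100, 3
A = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian sensing matrix
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
b = A @ x

# Basis pursuit  min ||z||_1  s.t.  Az = b,  as an LP in (z, t):
c = np.concatenate([np.zeros(n), np.ones(n)])
A_eq = np.hstack([A, np.zeros((m, n))])
I = np.eye(n)
A_ub = np.vstack([np.hstack([I, -I]),           # z - t <= 0
                  np.hstack([-I, -I])])          # -z - t <= 0
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n),
              A_eq=A_eq, b_eq=b,
              bounds=[(None, None)] * n + [(0, None)] * n)
x_hat = res.x[:n]
print(np.max(np.abs(x_hat - x)))   # near zero when recovery is exact
```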
General signal recovery from noisy data
Inaccurate measurements: z error term (stochastic or deterministic)
b = Ax + z, with ‖z‖ℓ2 ≤ ε
Interlude: when does sparse recovery make sense?

x is s-sparse: ‖x‖ℓ0 ≤ s
can we recover x from Ax = b?
Perhaps possible if sparse vectors lie away from the null space of A
h ≠ 0 is 2s-sparse with Ah = 0
h = x1 − x2 with x1, x2 both s-sparse
⇒ Ax1 = Ax2
Lower bound guarantees that distinct sparse signals cannot be mapped too
closely (analogy with codes)
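This can be probed numerically: if σmin(A_T) > 0 on a support T of size 2s, no nonzero 2s-sparse vector supported on T lies in the null space of A, so distinct s-sparse signals on those supports map to distinct measurements. A spot check over random supports of a Gaussian matrix (dimensions illustrative; sampling does not certify all supports):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, s = 60, 120, 4
A = rng.standard_normal((m, n)) / np.sqrt(m)

# Smallest singular value of A restricted to random supports of size 2s:
worst = min(
    np.linalg.svd(A[:, rng.choice(n, 2 * s, replace=False)],
                  compute_uv=False)[-1]
    for _ in range(200)
)
print(worst)   # stays well away from 0 on every sampled support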
Formal equivalence
K-planes
1. means entries are iid N(0, 1)
2. means entries are iid Bernoulli, Aij = ±1 w.p. 1/2
3. means rows (frequencies) are selected at random (result due to C. and Tao, Rudelson and Vershynin)
4. means rows iid sampled from F obeying the isotropy condition and with coherence μ
Gaussian matrices and RIP
Silverstein (1985)
σmin(X) → 1 − √c a.s.,  σmax(X) → 1 + √c a.s.
Asymptotic theory of Gaussian matrices
Expectations obey
E σmin(X) ≥ √m − √s,  E σmax(X) ≤ √m + √s
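The Silverstein limits can be checked by simulation: with X an m × s matrix of iid N(0, 1/m) entries and c = s/m, the extreme singular values land near 1 ± √c. A sketch with illustrative dimensions:

```python
import numpy as np

rng = np.random.default_rng(2)
m, s = 2000, 500                    # aspect ratio c = s/m = 1/4
c = s / m
X = rng.standard_normal((m, s)) / np.sqrt(m)
sv = np.linalg.svd(X, compute_uv=False)   # singular values, descending
print(sv[-1], 1 - np.sqrt(c))       # sigma_min near 1 - sqrt(c) = 0.5
print(sv[0], 1 + np.sqrt(c))        # sigma_max near 1 + sqrt(c) = 1.5
```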
Apply Borell's inequality:
P( σmax(X) > √m + √s + t ) ≤ e^{−t²/2}
P( σmin(X) < √m − √s − t ) ≤ e^{−t²/2}
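A Monte Carlo sanity check of the σmax tail bound (here X has iid N(0, 1) entries; dimensions, t, and trial count are illustrative — the empirical exceedance frequency should sit below e^{−t²/2}):

```python
import numpy as np

rng = np.random.default_rng(3)
m, s, t, trials = 200, 20, 1.0, 300
exceed = 0
for _ in range(trials):
    X = rng.standard_normal((m, s))     # unnormalized Gaussian matrix
    if np.linalg.svd(X, compute_uv=False)[0] > np.sqrt(m) + np.sqrt(s) + t:
        exceed += 1
freq = exceed / trials
print(freq, np.exp(-t ** 2 / 2))        # empirical tail vs. the bound
```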
P( sup_{T:|T|≤2s} σmax(A_T) > 1 + √(2s/m) + t ) ≤ #{T : |T| ≤ 2s} · e^{−mt²/2}
≤ (n choose 2s) · e^{−mt²/2}
≈ e^{2s log(n/2s)} · e^{−mt²/2}
Similarly for σmin. Take t small enough so that 1 + √(2s/m) + t ≤ 1.41 and get
m ≳ s log(n/s)
as claimed
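The counting step above is plain arithmetic and can be checked directly: log C(n, 2s) is of order 2s log(n/2s), so making m t²/2 exceed it forces m ≳ s log(n/s). A small sketch with illustrative n, s, t:

```python
import math

n, s, t = 10_000, 50, 0.2
log_supports = math.log(math.comb(n, 2 * s))   # log #{T : |T| = 2s}
approx = 2 * s * math.log(n / (2 * s))         # the e^{2s log(n/2s)} exponent
# Choose m so that the union-bound exponent m t^2 / 2 dominates the count:
m = math.ceil(2 * (log_supports + 10) / t ** 2)
print(log_supports, approx, m)
```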
Geometric intuition for noisy recovery
Parallelogram identity: for unit-norm x, x′ with disjoint supports of sizes s and s′,
|⟨Ax, Ax′⟩| = ¼ |‖Ax + Ax′‖²ℓ2 − ‖Ax − Ax′‖²ℓ2| ≤ δ_{s+s′}
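The equality part of the identity holds for any A, x, x′ and is easy to verify numerically (the restricted-isometry bound on the right is what needs the RIP). A sketch with illustrative random data:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((30, 80)) / np.sqrt(30)
x, xp = rng.standard_normal(80), rng.standard_normal(80)

lhs = np.dot(A @ x, A @ xp)                      # <Ax, Ax'>
rhs = (np.linalg.norm(A @ (x + xp)) ** 2
       - np.linalg.norm(A @ (x - xp)) ** 2) / 4  # parallelogram expression
print(abs(lhs - rhs))   # zero up to floating-point rounding
```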
Proof of noisy recovery result
Since x̂ = x + h minimizes the ℓ1 norm, ‖x + h‖ℓ1 ≤ ‖x‖ℓ1, and thus
‖h_{T0^c}‖ℓ1 ≤ ‖h_{T0}‖ℓ1 + 2‖x_{T0^c}‖ℓ1
Divide T0^c into subsets of size s in decreasing order of magnitude of h_{T0^c}:
T1 : indices of the s largest coefficients of h_{T0^c},
T2 : indices of the next s largest coefficients,
and so on...
For each j ≥ 2, ‖h_{Tj}‖ℓ2 ≤ s^{−1/2} ‖h_{T_{j−1}}‖ℓ1, and thus
Σ_{j≥2} ‖h_{Tj}‖ℓ2 ≤ s^{−1/2} (‖h_{T1}‖ℓ1 + ‖h_{T2}‖ℓ1 + …) ≤ s^{−1/2} ‖h_{T0^c}‖ℓ1
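The chunking inequality can be verified on a random vector: partition the indices outside T0 by decreasing magnitude into blocks of size s and compare the summed ℓ2 norms of the tail blocks against s^{−1/2} times the ℓ1 norm on T0^c. A sketch with illustrative n, s:

```python
import numpy as np

rng = np.random.default_rng(5)
n, s = 60, 5
h = rng.standard_normal(n)

order = np.argsort(-np.abs(h))     # indices in decreasing magnitude
T0, rest = order[:s], order[s:]    # T0^c is chunked below into T1, T2, ...
chunks = [rest[i:i + s] for i in range(0, len(rest), s)]
tail = sum(np.linalg.norm(h[T]) for T in chunks[1:])    # sum_{j>=2} ||h_Tj||_2
bound = np.linalg.norm(h[rest], 1) / np.sqrt(s)         # s^{-1/2} ||h_{T0^c}||_1
print(tail, bound)                 # tail <= bound
```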
The tube constraint and the restricted isometry property then give
|⟨Ah_{T0∪T1}, Ah⟩| ≤ ‖Ah_{T0∪T1}‖ℓ2 ‖Ah‖ℓ2 ≤ 2ε √(1 + δ2s) ‖h_{T0∪T1}‖ℓ2
Also
|⟨Ah_{T0}, Ah_{Tj}⟩| ≤ δ2s ‖h_{T0}‖ℓ2 ‖h_{Tj}‖ℓ2
and likewise for T1 in place of T0
Since ‖h_{T0}‖ℓ2 + ‖h_{T1}‖ℓ2 ≤ √2 ‖h_{T0∪T1}‖ℓ2 (T0 and T1 are disjoint),

⇒ ‖h_{T0∪T1}‖ℓ2 ≤ α ε + ρ s^{−1/2} ‖h_{T0^c}‖ℓ1,  α ≡ 2√(1 + δ2s) / (1 − δ2s),  ρ ≡ √2 δ2s / (1 − δ2s)
We now conclude from this (with e0 ≡ s^{−1/2} ‖x_{T0^c}‖ℓ1) that
‖h_{T0∪T1}‖ℓ2 ≤ αε + ρ‖h_{T0∪T1}‖ℓ2 + 2ρe0 ⇒ ‖h_{T0∪T1}‖ℓ2 ≤ (1 − ρ)^{−1}(αε + 2ρe0)
And finally,
‖h‖ℓ2 ≤ ‖h_{T0∪T1}‖ℓ2 + ‖h_{(T0∪T1)^c}‖ℓ2 ≤ 2‖h_{T0∪T1}‖ℓ2 + 2e0
≤ 2(1 − ρ)^{−1}(αε + (1 + ρ)e0)
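The constants in the final bound are explicit, and δ2s < √2 − 1 is exactly the condition ρ < 1 that keeps them finite. A quick evaluation for an illustrative δ2s:

```python
import math

delta = 0.2   # an illustrative value of delta_2s, below sqrt(2) - 1
alpha = 2 * math.sqrt(1 + delta) / (1 - delta)
rho = math.sqrt(2) * delta / (1 - delta)
# delta < sqrt(2) - 1  <=>  sqrt(2)*delta < 1 - delta  <=>  rho < 1
C_eps = 2 * alpha / (1 - rho)        # constant multiplying epsilon
C_e0 = 2 * (1 + rho) / (1 - rho)     # constant multiplying e0
print(rho, C_eps, C_e0)
```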
Recovery with ℓ1 metric

We also claimed that in the noiseless case, ‖x̂ − x‖ℓ1 ≤ C ‖x − x_s‖ℓ1. Why?

Lemma
Let h be any vector in the nullspace of A and let T0 be any set of cardinality s. Then
‖h_{T0}‖ℓ1 ≤ ρ ‖h_{T0^c}‖ℓ1,  ρ = √2 δ2s (1 − δ2s)^{−1}.  (2)

Recall that ‖h_{T0}‖ℓ1 ≤ s^{1/2} ‖h_{T0}‖ℓ2 ≤ s^{1/2} ‖h_{T0∪T1}‖ℓ2 and
‖h_{T0∪T1}‖ℓ2 ≤ ρ s^{−1/2} ‖h_{T0^c}‖ℓ1 with ε = 0

Therefore,
‖h‖ℓ1 = ‖h_{T0}‖ℓ1 + ‖h_{T0^c}‖ℓ1 ≤ 2(1 + ρ)(1 − ρ)^{−1} ‖x_{T0^c}‖ℓ1