DRLT Correction Note

The document outlines the aims for correction work related to algorithms for bit-flip correction, including the optimality of the algorithm and reconstruction error analysis. It describes three models for bit-flip errors and presents algorithms for correcting these errors using the Single Switch Model, Adjacent Switch Model, and Random Switch Model. Empirical results demonstrate the effectiveness of the correction algorithms in improving reconstruction error after correcting bit-flips.

DRLT-Correction

September 2024

1 Aims for the Correction work


These are the aims/objectives for the correction work:
1. Determine optimality for the Algorithm to create W given in Alg.2 of [2].
2. Set up appropriate correction algorithms for bit-flip correction.
3. Find the reconstruction error of the lasso estimates post-correction.
4. Multiple stages of correction:
(a) Determine whether the number of bit-flips goes to 0 after multiple correction stages.
(b) Determine whether, after each stage of correction, the set of detected measurements is a subset of the set of detected measurements of the stage before.
(c) Find the reconstruction error of the lasso estimates after multiple correction stages.

2 Correction of Bit-flip models using Drlt (similar to permutation in ICASSP)
The hypothesis test for δ ∗ can even be used to correct for bit-flip or permutation
errors. Before presenting algorithms for correction, we define three intuitive
models for bit-flips in the pooling matrix A, followed by a model for permutation
errors. Precise knowledge of these models is not necessary for detection of bit-
flips, but is needed for correction.
1. Single Switch Model (SSM): In this model, a bit-flipped pool (measurement, as described in Eqn. (5) of [2]) contains exactly one bit-flip at a randomly chosen index. Suppose that the ith pool (measurement) contains a bit-flip. Under the SSM scheme, exactly one of the following two can happen: (1) some jth sample that was intended to be in the pool (as defined in A) is excluded, or (2) some jth sample that was not intended to be part of the pool (as defined in A) is included. These two cases lead to the following changes in the ith row of Â, and in both cases the choice of j ∈ [p] is uniformly random: Case 1: Âij = −1 but Aij = 1; Case 2: Âij = 1 but Aij = −1.

2. Adjacent Switch Model (ASM): In ASM, a bit-flipped pool contains bit-flips at two adjacent indices. Suppose the ith pool contains bit-flips. Then under the ASM scheme, either (1) the jth sample that was not intended to be in the pool is included and the j′th sample, where j′ ≜ mod(j + 1, p), that was intended to be in the pool is excluded, or (2) the jth sample that was intended to be in the pool is excluded and the j′th sample, where j′ ≜ mod(j + 1, p), that was not intended to be in the pool is included. This leads to the following changes in the ith row of Â, and in both cases the choice of j is uniformly random: Case 1: Âij′ = −1, Âij = 1 and Aij′ = 1, Aij = −1; Case 2: Âij′ = 1, Âij = −1 and Aij′ = −1, Aij = 1.
3. Random Switch Model (RSM): In RSM, a pool that contains bit-flips will
necessarily contain two bit-flips at random locations. Suppose the ith pool
has bit-flips. Then under the RSM scheme, for two distinct samples k ∈ [p]
and l ∈ [p], either (1) the kth sample that is not intended to be in the pool
is mistakenly included and the lth sample that is intended to be in the
pool is mistakenly excluded, or (2) the kth sample that is intended to be in
the pool is mistakenly excluded and the lth sample that is not intended to
be in the pool is mistakenly included. This leads to the following changes
in the ith row of Â, for l ̸= k ∈ [p], and in both cases the choice of k, l is
uniform random: Case 1: Âik = −1, Âil = 1 and Aik = 1, Ail = −1, Case
2: Âik = 1, Âil = −1 and Aik = −1, Ail = 1. Note that ASM is a special
case of RSM, with the second index fixed to mod(j + 1, p) when the first
index is j.
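For concreteness, the three bit-flip models can be simulated on a Rademacher pooling matrix as in the sketch below. The function names, the RNG seed and the rejection-sampling loops are our own illustrative choices, not part of [1] or [2]; note that in ASM and RSM the two flipped entries must have opposite signs in A (one included sample, one excluded sample).

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 500
A = rng.choice([-1, 1], size=(n, p))        # Rademacher pooling matrix

def corrupt_ssm(A, i):
    """SSM: flip exactly one uniformly chosen entry of row i."""
    A_hat = A.copy()
    j = rng.integers(A.shape[1])
    A_hat[i, j] = -A_hat[i, j]              # covers both Case 1 and Case 2
    return A_hat

def corrupt_asm(A, i):
    """ASM: flip entries j and mod(j+1, p), which must differ in A."""
    A_hat, p = A.copy(), A.shape[1]
    while True:
        j = rng.integers(p)
        if A[i, j] != A[i, (j + 1) % p]:    # one included, one excluded sample
            break
    A_hat[i, j] *= -1
    A_hat[i, (j + 1) % p] *= -1
    return A_hat

def corrupt_rsm(A, i):
    """RSM: flip two distinct uniformly chosen entries of row i that differ in A."""
    A_hat, p = A.copy(), A.shape[1]
    while True:
        k, l = rng.choice(p, size=2, replace=False)
        if A[i, k] != A[i, l]:              # one included, one excluded sample
            break
    A_hat[i, k] *= -1
    A_hat[i, l] *= -1
    return A_hat
```

As in the text, ASM is recovered from RSM by fixing the second index to mod(j + 1, p).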
Correction Algorithms: Let J be the index set of all measurements in y for which the hypothesis test of ODrlt is rejected, i.e. J contains the indices of all measurements with MMEs. Upon obtaining J, we can re-estimate β∗ using the Lasso estimator β̂λ ≜ argminβ (1/(2n′)) ∥yJ c − AJ c β∥₂² + λ∥β∥₁, using only n′ = |J c | = n − |J| measurements, i.e. by discarding the measurements in J.
However, instead of discarding the measurements in J, we provide algorithms to correct the errors in the corresponding rows of the matrix Â, again making use of the key principles of the ODrlt technique, as well as exploiting the particular statistical model for mismatch.
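As a minimal sketch of the discard-and-re-estimate step, the following solves the Lasso on the retained rows J c via ISTA (iterative soft-thresholding). The index set J here is hypothetical, and `lasso_ista` is our own stand-in for whatever solver is used in [2]:

```python
import numpy as np

def lasso_ista(A, y, lam, n_iter=500):
    """Minimise (1/(2n)) * ||y - A beta||_2^2 + lam * ||beta||_1 via ISTA."""
    n, p = A.shape
    beta = np.zeros(p)
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 / n)    # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = -A.T @ (y - A @ beta) / n            # gradient of the smooth part
        z = beta - step * grad
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return beta

# Discard the measurements flagged by the hypothesis test and re-estimate beta.
rng = np.random.default_rng(0)
n, p, s = 200, 100, 5
A = rng.choice([-1.0, 1.0], size=(n, p))
beta_true = np.zeros(p); beta_true[:s] = 1.0
y = A @ beta_true + 0.01 * rng.standard_normal(n)
J = np.array([3, 17, 42])             # hypothetical index set of corrupted measurements
keep = np.setdiff1d(np.arange(n), J)  # J^c: the n' = n - |J| retained measurements
beta_hat = lasso_ista(A[keep], y[keep], lam=0.01)
```

The same call with the corrected matrix Ã in place of A[keep] gives the re-estimate used after correction.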
We first provide an algorithm for the SSM for bit-flips; see Alg. 1. Alg. 1 essentially works as follows: for every measurement i in J, we flip the sign of an element Aij of A where j ∈ [p], and recompute δ̂W using Eqn. (15) of [2]. We check whether the new estimate δ̂W,i satisfies H0,i as per the test described in [2]. If H0,i is accepted, we have been successful in identifying the bit-flip in the ith row at location j with probability 1 − α, where α is the level of significance of the test. Otherwise, the sign of other entries of the ith row needs to be changed until the bit-flip is found. Note that as per SSM, a given row can contain a bit-flip in at most one entry. The procedure for correction of bit-flips in ASM is quite similar to the one in Alg. 1. Here, instead of toggling Aij in Step 2 of Alg. 1, we do as follows: if Ai,j ̸= Ai,j′, then swap the values of Ai,j and Ai,j′, where j′ = mod(j + 1, p). The remaining steps are exactly as in Alg. 1. For RSM, we do the following: in any row of A, we check all pairs of unequal entries Ai,j1 and Ai,j2 with j1 ̸= j2 and swap their values. The rest of the steps are exactly as in Alg. 1. Detailed versions of these modifications for ASM and RSM are given in [1]. The matrix thus obtained from the correction algorithm can then be used to re-estimate β∗ using Lasso.

Algorithm 1 Correction for bit-flips following the Single Switch Model (SSM)
Require: Measurement vector y, pooling matrix A, Lasso estimate β̂λ1 , λ
and the set J of corrupted measurements estimated by ODrlt method
Ensure: Bit-flip corrected matrix Ã
1: for every i ∈ J do
2: for every j ∈ [p] do
3: if β̂λ1 ,j ̸= 0 then
4: Set bf := −1 (bit-flip flag), α := 0.01, Aij := −Aij (induce a bit-flip
at Aij ).
5: Find the solution β̂λ1 , δ̂λ2 to the convex program given in Eqn.(6)
of [2].
6: Compute the weight matrix W as given in Alg.2 of [2].
7: Calculate the debiased Lasso estimate δ̂W given by Eqn.(15) of [2].

8: Set TH,i = [δ̂W ]i /√([Σδ ]ii ), where Σδ is defined in Eqn.(28) of [2].
9: if |TH,i | ≤ τα/2 then
10: set bf := j.
11: end if
12: if bf == −1 then {bit-flip not detected at Aij }
13: Aij = −Aij {reverse induced bit-flip in Aij }
14: end if
15: end if
16: end for
17: end for
18: return à = A.
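The search loop of Alg. 1 can be sketched as follows. Here `passes_test` is a hypothetical callback standing in for steps 5-9 of Alg. 1 (re-solving the convex program, recomputing W and δ̂W, and checking |TH,i| ≤ τα/2), since that machinery lives in [2]. Unlike Alg. 1, the sketch stops at the first accepted toggle, which suffices because SSM permits at most one bit-flip per row:

```python
import numpy as np

def correct_ssm(A, suspect_rows, passes_test):
    """Sketch of Alg. 1: toggle entries of each flagged row one at a time and
    keep the first toggle for which the row passes the (hypothetical) test."""
    A = A.copy()
    for i in suspect_rows:
        for j in range(A.shape[1]):
            A[i, j] = -A[i, j]            # induce a trial bit-flip at (i, j)
            if passes_test(A, i):         # H_{0,i} accepted: flip identified
                break                     # SSM: at most one bit-flip per row
            A[i, j] = -A[i, j]            # revert and try the next column
    return A
```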

2.1 Empirical Results


We performed experiments to correct bit-flips under the SSM model using Alg. 1. We ran the experiments for p = 500, n = 400, fσ = 0.01, fsp = 0.01, fadv = 0.02, using the same experimental setup as given in [2], for 4 different noise runs. In Table 1 we provide the true number of bit-flips (NP ), the number of corrected bit-flips (NP T ), the RRMSE of reconstruction of β∗ using Lasso after discarding the measurements in J (ODrlt-D), and the RRMSE of reconstruction of β∗ using Rl after correction using y, Ã (ODrlt-C).

Run # NP NP T RRMSE-ODrlt-D RRMSE-ODrlt-C
1 8 7 0.0477 0.0322
2 8 6 0.0512 0.0452
3 8 6 0.0498 0.0428
4 8 7 0.0465 0.0366

Table 1: True number of SSM bit-flips (NP ) and number of bit-flips corrected (NP T ), with the RRMSE of reconstruction of β∗ using Lasso after discarding the measurements in J (ODrlt-D) and the RRMSE of reconstruction of β∗ using Rl after correction using y, Ã (ODrlt-C).

In Table 1, we see that in the 4 runs the correction algorithm corrects 6 to 7 out of the 8 bit-flips. Furthermore, the reconstruction error after correction improves on ODrlt-D. It is important to note that the reason behind such a low number of runs is that W is very expensive to compute using Alg. 2 of [2], since in this correction algorithm we re-compute W in each iteration over j in Alg. 1.

3 RSM Correction algorithm


We provide two similar algorithms for correction of bit-flips following the RSM model. In Alg. 2, we calculate the debiased lasso estimate in the correction step using W = A, and in Alg. 3 we calculate the debiased lasso estimate using W = copt A, where copt = max{−µ1 + 1, −µ2 + 1/p, −µ3 √(1 − n/p) + 1} and µ1 , µ2 and µ3 are as defined in Alg. 2 of [2]. We will show in Lemma 1 that W = copt A, with copt as defined above, is a feasible solution to the optimisation problem given in Alg. 2 of [2]. We further show that, among all feasible solutions of the form cA, W = copt A is optimal.
Lemma 1 Given that A is a Rademacher matrix, µ1 ≜ 2√(2 log(p)/n), µ2 ≜ √(2 log(2np)/(np)) + 1/n and µ3 ≜ 2√(2 log(n)/p)/√(1 − n/p), W = copt A is a feasible solution to Alg. 2 of [2], where copt = max{−µ1 + 1, −µ2 + 1/p, −µ3 √(1 − n/p) + 1}. Furthermore, among all feasible solutions of the form W = cA, copt A is the optimal solution.

Proof of Lemma 1: For the solution to be feasible, we need to show that W = copt A satisfies all the constraints of Alg. 2 of [2] with probability 1. If we plug the solution cA into the constraint C0, we get c² a⊤·j a·j /n ≤ 1 for all j ∈ [p]. Since a⊤·j a·j /n = 1, we need c² ≤ 1. Based on the values of µ1 , µ2 and µ3 , copt satisfies this constraint.
Now, if we plug the solution cA into the constraint C1, we get ∥Ip − (c/n) A⊤ A∥∞ ≤ µ1 . For all j ∈ [p], the (j, j)th element of the matrix
Algorithm 2 Correction for bit-flips following the Random Switch Model
(RSM) using W = A
Require: Measurement vector y, pooling matrix A, Lasso estimate β̂λ1 , λ
and the set J of corrupted measurements estimated by ODrlt method
Ensure: Bit-flip corrected matrix Ã
1: for every i ∈ J do
2: Set bf1 := −1, bf2 := −1 (bit-flip flags), max-p value := 0.01.
3: for every j ∈ [p − 1] do
4: for l ∈ [p] do
5: if Aij ̸= Ail then {one entry is +1 and the other is −1}
6: Anew = A.
7: if Aij == 1 then
8: Aij = −1, Ail = 1.
9: else if Aij == −1 then
10: Aij = 1, Ail = −1.
11: end if
12: Find the solution β̂λ1 , δ̂λ2 to the convex program given in Eqn.(6)
of [2].
13: Calculate the debiased Lasso estimate δ̂W given by Eqn.(15) of [2], using W = A.
14: Set pval = 1 − Φ(TH,i ), where TH,i = [δ̂W ]i /√([Σδ ]ii ) and Σδ is defined in Eqn.(28) of [2].
15: if pval ≥ max-p value then
16: Set bf1 := j, bf2 := l, max-p value := pval .
17: end if
18: Aij := −Aij , Ail := −Ail {reverse the induced bit-flip, restoring A}
19: end if
20: end for
21: end for
22: if bf1 ̸= −1 and bf2 ̸= −1 then
23: Ai,bf1 := −Ai,bf1 , Ai,bf2 := −Ai,bf2 {apply the candidate flip with the largest p-value in row i}
24: end if
25: end for
26: return Ã = A.
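The pair search of Alg. 2 (and of Alg. 3, which differs only in the choice of W) can be sketched as below. Here `p_value` is a hypothetical callback returning the p-value of the ODrlt test for row i (steps 11-14 of Alg. 2), and the threshold 0.01 plays the role of the initial max-p value:

```python
import numpy as np

def correct_rsm_row(A, i, p_value, threshold=0.01):
    """Sketch of Alg. 2: try every opposite-signed pair (j, l) in row i, record
    the pair with the largest test p-value, and apply that swap at the end."""
    A = A.copy()
    p = A.shape[1]
    best, best_pval = None, threshold
    for j in range(p - 1):
        for l in range(j + 1, p):
            if A[i, j] == A[i, l]:
                continue                  # RSM swaps one included and one excluded sample
            A[i, j] *= -1
            A[i, l] *= -1                 # induce the trial correction
            pv = p_value(A, i)
            if pv >= best_pval:
                best, best_pval = (j, l), pv
            A[i, j] *= -1
            A[i, l] *= -1                 # restore A before the next trial
    if best is not None:
        j, l = best
        A[i, j] *= -1
        A[i, l] *= -1                     # keep the most plausible correction
    return A
```

Restoring A after every trial and applying only the best pair at the end keeps at most one correction per flagged row, as the RSM model prescribes.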

Algorithm 3 Correction for bit-flips following the Random Switch Model
(RSM) using W = copt A
Require: Measurement vector y, pooling matrix A, Lasso estimate β̂λ1 , λ
and the set J of corrupted measurements estimated by ODrlt method
Ensure: Bit-flip corrected matrix Ã
1: for every i ∈ J do
2: Set bf1 := −1, bf2 := −1 (bit-flip flags), max-p value := 0.01.
3: for every j ∈ [p − 1] do
4: for l ∈ [p] do
5: if Aij ̸= Ail then {one entry is +1 and the other is −1}
6: if Aij == 1 then
7: Aij = −1, Ail = 1.
8: else if Aij == −1 then
9: Aij = 1, Ail = −1.
10: end if
11: Find the solution β̂λ1 , δ̂λ2 to the convex program given in Eqn.(6)
of [2].
12: Calculate the debiased Lasso estimate δ̂W given by Eqn.(15) of [2], using W = copt A.
13: Set pval = 1 − Φ(TH,i ), where TH,i = [δ̂W ]i /√([Σδ ]ii ) and Σδ is defined in Eqn.(28) of [2].
14: if pval ≥ max-p value then
15: Set bf1 := j, bf2 := l, max-p value := pval .
16: end if
17: Aij := −Aij , Ail := −Ail {reverse the induced bit-flip, restoring A}
18: end if
19: end for
20: end for
21: if bf1 ̸= −1 and bf2 ̸= −1 then
22: Ai,bf1 := −Ai,bf1 , Ai,bf2 := −Ai,bf2 {apply the candidate flip with the largest p-value in row i}
23: end if
24: end for
25: return Ã = A.

Ip − (c/n) A⊤ A is 1 − c. This implies |1 − c| ≤ µ1 . Clearly, −µ2 + 1/p ≤ µ1 + 1 and −µ3 √(1 − n/p) + 1 ≤ µ1 + 1. Hence, |1 − copt | ≤ µ1 . In case of the off-diagonal elements, for l ̸= j ∈ [p], the (j, l)th element of the matrix Ip − (c/n) A⊤ A is c a⊤·j a·l /n. Since −1 ≤ a⊤·j a·l /n ≤ 1, we have −µ1 ≤ c ≤ µ1 . Clearly, copt satisfies this condition too. This implies that copt A satisfies constraint C1.
Next, if we plug the solution cA into the constraint C2, we get ∥(1/p)(In − (c/n) AA⊤ )A∥∞ ≤ µ2 . For i ∈ [n] and j ∈ [p], the (i, j)th element of the matrix (1/p)(In − (c/n) AA⊤ )A is given by aij /p − (c/(np)) Σ_{l=1}^p Σ_{k=1}^n ail akl akj , as shown in result (2) of Lemma 5 of [2]. Since A is Rademacher, we have −1 ≤ (1/(np)) Σ_{l=1}^p Σ_{k=1}^n ail akl akj ≤ 1. Hence, −µ2 + 1/p ≤ c ≤ µ2 + 1. Clearly, copt satisfies the constraint C2.
Lastly, plugging the solution cA into the constraint C3, we get ∥In − (c/p) AA⊤ ∥∞ ≤ √(1 − n/p) µ3 . For all i ∈ [n], the (i, i)th element of the matrix In − (c/p) AA⊤ is 1 − c. This implies |1 − c| ≤ √(1 − n/p) µ3 , which copt satisfies by its definition. In case of the off-diagonal elements, for i ̸= k ∈ [n], the (i, k)th element of the matrix In − (c/p) AA⊤ is c ai· a⊤k· /p. Since −1 ≤ ai· a⊤k· /p ≤ 1, we have −√(1 − n/p) µ3 ≤ c ≤ √(1 − n/p) µ3 . Clearly, copt satisfies this condition too. This implies that copt A satisfies constraint C3.
Hence, copt A is a feasible solution to the optimisation problem of Alg. 2 of [2]. Note that the objective function of Alg. 2 of [2] is given by Σ_{j=1}^p w⊤·j w·j . For W = cA, the objective function equals npc². Since we want to minimise npc², the smallest value it can take, subject to the constraints C1, C2 and C3, is attained at c = copt = max{−µ1 + 1, −µ2 + 1/p, −µ3 √(1 − n/p) + 1}. Therefore, among all feasible solutions of the form W = cA, copt A is the optimal solution.
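The feasibility of W = copt A can be checked numerically for one Rademacher draw. The µ formulas below follow the reconstruction in Lemma 1; the exact constants and constraint forms come from Alg. 2 of [2], so treat this as a sanity-check sketch rather than a proof:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 500
A = rng.choice([-1.0, 1.0], size=(n, p))    # Rademacher matrix

mu1 = 2 * np.sqrt(2 * np.log(p) / n)
mu2 = np.sqrt(2 * np.log(2 * n * p) / (n * p)) + 1.0 / n
mu3 = 2 * np.sqrt(2 * np.log(n) / p) / np.sqrt(1 - n / p)
c_opt = max(1 - mu1, 1.0 / p - mu2, 1 - mu3 * np.sqrt(1 - n / p))
W = c_opt * A

# C0: squared column norms of W, scaled by 1/n
c0 = (W ** 2).sum(axis=0).max() / n
# C1: sup-norm of I_p - (1/n) A^T W
c1 = np.abs(np.eye(p) - A.T @ W / n).max()
# C2: sup-norm of (1/p) (I_n - (1/n) W A^T) A
c2 = np.abs(A / p - (W @ A.T @ A) / (n * p)).max()
# C3: sup-norm of I_n - (1/p) W A^T
c3 = np.abs(np.eye(n) - W @ A.T / p).max()
```

For this draw, c0 through c3 fall within the bounds 1, µ1, µ2 and √(1 − n/p) µ3 respectively, with the C3 diagonal sitting exactly on its bound since copt is defined by it.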

3.1 Experimental Results


We performed experiments to correct bit-flips under the RSM model using Alg. 2 and Alg. 3. We ran the experiments for p = 500, n = 400, fσ = 0.01, fsp = 0.1, fer = 0.1, using the same experimental setup as given in [2], for 5 different noise runs. In Table 2 we provide the true number of bit-flips (NP ), the number of bit-flips corrected by Alg. 2 (NP A ) and the number of bit-flips corrected by Alg. 3 (NP cA ).
Multiple stages of correction of RSM: We have noticed that after the first stage of correction in Drlt-C, a small fraction of RSM errors remain uncorrected, and a small number of new RSM errors are falsely created. Both are caused by the small but inevitable Type-I and Type-II errors in the hypothesis test TH . For this set of experiments, we take p = 500, n = 450, s = 5, fσ = 0.01 and r = 8 RSM errors. After the first stage of correction in Drlt-C, we execute the following three steps iteratively: (i) we re-estimate β∗ and the permutation noise vector based on the corrected measurements using Robust Lasso (Rl).
Run # NP NP A NP cA
1 4 2 3
2 2 1 2
3 4 2 3
4 3 2 2
5 5 3 4

Table 2: True number of bit-flips (NP ), the number of bit-flips corrected by Alg. 2 (NP A ) and the number of bit-flips corrected by Alg. 3 (NP cA )

Stage # True bit-flips(B) J NP NP T NP F


1 (11,24,78,126,195,242,311,337) (B,17,112,119,396) 8 6 4
2 (11,112,126) (B,17,119,396) 3 2 3
3 (112) (112,119) 1 1 2
4 ϕ (112,119) 0 0 2

Table 3: Run-1: Set of true bit-flips (B), set of detected bit-flips (J), #RSM errors (NP ) and #RSM errors detected correctly
(NP T ) and incorrectly (NP F ), after each stage of correction in Drlt-C for r = 8
RSM errors.

Stage # True bit-flips(B) J NP NP T NP F


1 (11,24,78,126,195,242,311,337) (B,37,175,192,255,341) 8 5 5
2 (24,78,192,195) (B,37,255,341) 4 2 3
3 (24,195) (B,37,255,341) 2 1 3
4 (24) (24,37,341) 1 1 2
5 ϕ (37,341) 0 0 2

Table 4: Run-2: Set of true bit-flips (B), set of detected bit-flips (J), #RSM errors (NP ) and #RSM errors detected correctly
(NP T ) and incorrectly (NP F ), after each stage of correction in Drlt-C for r = 8
RSM errors.

# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)
1 (2,67,72,219,361,392) (2,13,44,67,72,145,219,276,361) (2,67,145,219,361)
2 (72,145,392) (13,72,145,276,392) (72,392)
3 (145) (13,145) (145)
4 ϕ (13) ϕ

Table 5: First β, first run: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C
for RSM errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 6

# RRMSE Sens Spec p-value simul


Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0517 0.0422 0.83 1 0.987 0.992 1e-8
2 0.0422 0.0389 1 1 0.992 0.995 0.000044
3 0.0389 0.0378 1 1 0.995 0.997 0.0041
4 0.0389 0.0378 1 1 0.995 0.997 0.0271

Table 6: fσ = 0.01: Pre correction and post correction RRMSE, Sensitivity and
Specificity and p-value for simultaneous tests as given in (35) after each stage
of correction in Drlt-C for RSM errors. The parameters are p = 500, n =
400, fσ = 0.01, s = 10, r = 6

(ii) We then perform debiasing of these estimates as shown in Eqns. (14) and (15) of [2]. (iii) Based on the new set of detected corrupted measurements, we perform correction as given by Alg. 3. After each stage of correction, we report the average number of measurements correctly detected to have RSM errors, the average number incorrectly detected to have RSM errors and the average actual number of RSM errors (over 20 noise runs). Representative results for two runs are presented in Tables 3 and 4, which show that after the final stage the test only falsely detects a small number of measurements, and there are no RSM errors left.
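The three-step iteration above can be sketched as a generic loop; `detect`, `correct` and `estimate` are hypothetical callbacks standing in for the ODrlt test, Alg. 3 and Robust Lasso respectively, and the stage budget is our own addition:

```python
import numpy as np

def multistage_correction(y, A, detect, correct, estimate, max_stages=6):
    """Sketch of the iterative scheme: estimate beta, detect corrupted
    measurements, correct the corresponding rows of A, and repeat until the
    test flags nothing (or the stage budget is exhausted)."""
    history = []
    for stage in range(max_stages):
        beta_hat = estimate(y, A)         # Robust Lasso on current (y, A)
        J = detect(y, A, beta_hat)        # indices failing the hypothesis test
        history.append(J)
        if len(J) == 0:
            break                         # no remaining mismatched measurements
        A = correct(y, A, J)              # one stage of RSM correction
    return A, history
```

The returned `history` records the detected sets J per stage, which is what Tables 3 and 4 tabulate.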

3.2 RSM correction for different β and η (Tables 5-20)

In this subsection of experimental results, we run the correction algorithm for RSM for three different β, and for each β we run the correction for three separate runs, i.e., different η. For each (β, η) combination, we provide two separate tables of results. In the first table, we provide the set of true bit-flips (B), the set of detected bit-flips (J) and the set of altered bit-flips (C), after each stage of correction in Drlt-C for RSM errors. In the second table, we provide the pre-correction and post-correction RRMSE, Sensitivity and Specificity, after each stage of correction in Drlt-C for RSM errors. The fixed set of parameters is p = 500, n = 400, fσ = 0.01, s = 10. Note that the true number of bit-flips varies as the RSM errors are induced in a non-adversarial manner with fer = 0.2.
# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)
1 (2,67,72,219,361,392) (2,67,89,192,219,222,307,392) (2,67,219,392)
2 (72,361) (72,89,192,222) (72,192)
3 (192,361) (89,192,222,361) (192,361)
4 ϕ (89,222) ϕ

Table 7: First β, second run: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C
for RSM errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 6

# RRMSE Sens Spec p-value simul


Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0567 0.0485 0.67 0.83 0.987 0.995 1e-6
2 0.0485 0.0452 0.83 1 0.995 0.995 0.00057
3 0.0452 0.0429 1 1 0.995 0.995 0.0022
4 0.0452 0.0429 1 1 0.995 0.995 0.0451

Table 8: First β, second run: Pre correction and post correction RRMSE,
Sensitivity and Specificity, after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 6

# Effective bitflips(B) effective bitflips detected J bitflips altered ( C)


1 (2,67,72,219,361,392) (2,59,67,72,108,119,219,255,292,316,361,377,392) (2,67,72,119,219,361,377,392)
2 (119,392) (59,108,119,292,316,392) (119,292,392)
3 (292) (59,108,292,316) (292)
4 ϕ (59,108) ϕ

Table 9: First β, third run: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C
for RSM errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 6

# RRMSE Sens Spec p-value simul


Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0615 0.0479 1 1 0.982 0.989 1e-14
2 0.0479 0.0477 1 1 0.989 0.995 1e-9
3 0.0477 0.0416 1 1 0.995 0.995 1e-7
4 0.0477 0.0416 1 1 0.995 0.995 0.00542

Table 10: First β, third run: Pre correction and post correction RRMSE, Sensi-
tivity and Specificity, after each stage of correction in Drlt-C for RSM errors.
The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 6

# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)
1 (96,117,272,329,346) (23,96,117,156,213,272,329,336,346) (96,117,213,329,346)
2 (213,272) (23,96,213,276) (213,272)
3 ϕ (23) ϕ

Table 11: Second β, first run: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C
for RSM errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 5

# RRMSE Sens Spec p-value simul


Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0447 0.0372 0.8 1 0.992 0.995 1e-7
2 0.0372 0.0346 1 1 0.995 0.998 0.00098
3 0.0372 0.0346 1 1 0.995 0.998 0.00928

Table 12: Second β, first run: Pre correction and post correction RRMSE,
Sensitivity and Specificity, after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 5

# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)


1 (96,117,272,329,346) (55 ,72,96,142,202,291,329,346) (96,202,329,346)
2 (117,202,272) (55,117,202,272,291) (117,202)
3 (272) (55,272) (272)
4 ϕ (55) ϕ

Table 13: Second β, second run: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C
for RSM errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 5

# RRMSE Sens Spec p-value simul


Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0552 0.0513 0.6 0.8 0.992 0.995 1e-6
2 0.0513 0.0474 0.8 1 0.995 0.997 0.0021
3 0.0474 0.0436 1 1 0.997 0.998 0.0362

Table 14: Second β, second run: Pre correction and post correction RRMSE,
Sensitivity and Specificity, after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 5

# Effective bitflips(B) effective bitflips detected J bitflips altered ( C)


1 (96,117,272,329,346) (82,96,117,172,209,252,329,346,397) (96,117,329,346)
2 (272) (82,172,209,272) (272)
3 ϕ (82,209) ϕ

Table 15: Second β, third run: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C
for RSM errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 5

# RRMSE Sens Sens p-value simul
Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0602 0.0452 0.8 1 0.992 0.995 1e-13
2 0.0452 0.0427 1 1 0.995 0.997 1e-8
3 0.0452 0.0427 1 1 0.995 0.997 0.00044

Table 16: Second β, third run: Pre correction and post correction RRMSE,
Sensitivity and Specificity, after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 5
# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)
1 (23,44,166,192,245,289,356) (23,44,98,166,177,192,245,289,332,396) (23,44,166,192,245,332)
2 (289,332,356) (98,177,289,332,356) (98,289,356)
3 (98,332) (98,177,332) (98,332)
4 ϕ ϕ ϕ

Table 17: Third β, first run: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C
for RSM errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 7
# RRMSE Sens Spec p-value simul
Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0692 0.0577 0.713 0.8571 0.989 0.995 1e-12
2 0.0577 0.0512 0.8571 1 0.995 0.998 1e-7
3 0.0512 0.0478 1 1 0.997 1 0.00034

Table 18: Third β, first run: Pre correction and post correction RRMSE, Sensi-
tivity and Specificity, after each stage of correction in Drlt-C for RSM errors.
The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 7
# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)
1 (23,44,166,192,245,289,356) (44,67,98,166,192,203,245,289,317,334,396) (44,166,192,245,289,334)
2 (23,334,356) (23,67,98,334,356) (23,334,356)
3 ϕ (67,98) ϕ

Table 19: Third β, second run: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C
for RSM errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 7
# RRMSE Sens Spec p-value simul
Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0688 0.0492 0.857 1 0.992 0.995 1e-11
2 0.0492 0.0441 1 1 0.995 0.997 1e-7
3 0.0492 0.0441 1 1 0.995 0.997 0.00093

Table 20: Third β, second run: Pre correction and post correction RRMSE,
Sensitivity and Specificity, after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, fσ = 0.01, s = 10, r = 7

# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)
1 (2,67,72,219,361,392) (2,13,44,67,72,145,219,276,361) (2,67,145,219,361)
2 (72,145,392) (13,72,145,276,392) (72,392)
3 (145) (13,145) (145)
4 ϕ (13) ϕ

Table 21: fσ = 0.01: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, s = 10, r = 6

# RRMSE Sens Spec p-value simul


Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0517 0.0422 0.83 1 0.987 0.992 1e-8
2 0.0422 0.0389 1 1 0.992 0.995 0.00055
3 0.0389 0.0378 1 1 0.995 0.997 0.0072
4 0.0389 0.0378 1 1 0.995 0.997 0.0277

Table 22: fσ = 0.01: Pre correction and post correction RRMSE, Sensitivity and
Specificity and p-value for simultaneous tests as given in (35) after each stage
of correction in Drlt-C for RSM errors. The parameters are p = 500, n =
400, fσ = 0.01, s = 10, r = 6

3.3 RSM correction for varying fσ and s (Tables 21-32)

In this subsection of experimental results, we run the correction algorithm for RSM for three different values of fσ = 0.01, 0.03, 0.05. We also run the correction algorithm for three different values of s = 5, 10, 15. For each fσ and s, we provide two separate tables of results. In the first table, we provide the set of true bit-flips (B), the set of detected bit-flips (J) and the set of altered bit-flips (C), after each stage of correction in Drlt-C for RSM errors. In the second table, we provide the pre-correction and post-correction RRMSE, Sensitivity and Specificity, and the p-value for the simultaneous test given in (35), after each stage of correction in Drlt-C for RSM errors. The fixed set of parameters is p = 500, n = 400, fσ = 0.01, s = 10. Note that the true number of bit-flips varies as the RSM errors are induced in a non-adversarial manner with fer = 0.2.

3.4 Proportion of bit-flips correctly altered by the RSM correction algorithm (Table 33)

In this set of experiments, we present the proportion of times each bit-flip has been correctly altered in the first stage of the RSM correction algorithm. The fixed parameters of the experiments are p = 500, n = 400, s = 10, r = 6. We vary fσ = 0.01, 0.03, 0.05. The proportions are taken over 10 different instances of η for each fσ. In Table 33, the first column represents the indices that have true effective bit-flips. Columns 2, 3 and 4 represent the proportion of times that particular bit-flip has been correctly altered, for fσ = 0.01, 0.03 and 0.05 respectively.
# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)
1 (2,67,72,219,361,392) (2,22,52,67,89,194,198,219.232,271,302,311,361,387) (2,67,89,219,311)
2 (72,89,311,361,392) (22,52,72,89,198,271,302,311,387,392) (72,89,198,311)
3 (198,361,392) (22,198,271,311,361,392) (198,271,392)
4 (271,361) (22,271,311,361) (271)
5 (361) (22,361) (361)
6 ϕ (22) ϕ

Table 23: fσ = 0.03: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, s = 10, r = 6

# RRMSE Sens Spec p-value simul


Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0822 0.0757 0.67 0.8 0.974 0.989 1e-16
2 0.0757 0.0661 0.8 1 0.989 0.992 1e-11
3 0.0661 0.0632 1 1 0.992 0.995 1e-7
4 0.0632 0.0617 1 1 0.995 0.997 0.00049
5 0.0617 0.0611 1 1 0.997 0.998 0.00225
6 0.0617 0.0611 1 1 0.997 0.998 0.0319

Table 24: fσ = 0.03: Pre correction and post correction RRMSE, Sensitivity and
Specificity and p-value for simultaneous tests as given in (35) after each stage
of correction in Drlt-C for RSM errors. The parameters are p = 500, n =
400, fσ = 0.01, s = 10, r = 6

# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)


1 (2,67,72,219,361,392) (2,67,219,361,{19 other indices}) (2,67,142,219,293)
2 (72,142,293,361,392) (42,67,72,141,142,193,225,259,283,317,366,392,397) (72,142,193,317,392)
3 (193,293,317,361) (42,67,225,259,293,317,397) (293,317,397)
4 (193,361,397) (42,225,361,397) (361,397)
5 (193,361) (42,193,225,361) (193)
6 (361) (42,225,361) ϕ

Table 25: fσ = 0.05: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, s = 10, r = 6

# RRMSE Sens Spec p-value simul
Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.1033 0.0898 0.67 0.6 0.951 0.974 1e-16
2 0.0898 0.0852 0.6 0.8 0.974 0.989 1e-12
3 0.0852 0.0831 0.8 0.75 0.989 0.992 1e-9
4 0.0801 0.0774 0.75 1 0.992 0.995 1e-6
5 0.0774 0.0752 1 1 0.995 0.997 0.000252
6 0.0752 0.0737 1 1 0.997 0.997 0.00121

Table 26: fσ = 0.05: Pre correction and post correction RRMSE, Sensitivity and
Specificity and p-value for simultaneous tests as given in (35) after each stage
of correction in Drlt-C for RSM errors. The parameters are p = 500, n =
400, fσ = 0.01, s = 10, r = 6

# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)


1 (4,255) (4,117,255,289,315) (4,255,315)
2 (315) (117,315) (315)
3 ϕ ϕ ϕ

Table 27: s = 5: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, fσ = 0.01, r = 2

# RRMSE Sens Spec p-value simul


Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0402 0.0377 1 1 0.997 0.998 0.00062
2 0.0377 0.0356 1 1 0.998 1 0.0071
3 0.0377 0.0356 1 1 1 1 0.0515

Table 28: s = 5: Pre correction and post correction RRMSE, Sensitivity and
Specificity and p-value for simultaneous tests as given in (35) after each stage
of correction in Drlt-C for RSM errors. The parameters are p = 500, n =
400, fσ = 0.01, r = 2

# True effective bit-flips(B) effective bitflips detected J bitflips altered ( C)


1 (143,184,239,278,282) (69,127,143,177,239,275,278,282,299,375) (69,143,239,278,282)
2 (69,184) (69,177,184,299,375) (69,177,184)
3 (177) (177) (177)
4 ϕ ϕ ϕ

Table 29: s = 10: Set of true bit-flips (B), set of detected bit-flips (J) and set of altered bit-flips (C), after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, fσ = 0.01, r = 5

# RRMSE Sens Spec p-value simul
Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.0598 0.0407 0.8 1 0.987 0.992 1e-6
2 0.0407 0.0392 1 1 0.992 1 0.00091
3 0.0392 0.0379 1 1 1 1 0.0253

Table 30: s = 10: Pre-correction and post-correction RRMSE, Sensitivity,
Specificity, and p-value for the simultaneous tests given in (35) after each stage
of correction in Drlt-C for RSM errors. The parameters are p = 500, n = 400,
fσ = 0.01, s = 10, r = 5.

# True effective bit-flips (B)  Effective bit-flips detected (J)  Bit-flips altered (C)


1 (160,186,255,261,285,334,368) (63,109,124,178,186,220,222,261,325,334,346,368) (124,186,261,325,334
2 (124,160,255,325) (63,82,109,124,220,222,255,325) (82,124,222,255,3
3 (82,160,222) (63,82,109,222) (82,222)
4 (160) (63,160) (160)
5 ϕ (63) ϕ

Table 31: s = 15: Set of true bit-flips (B), set of detected bit-flips (J) and
set of altered bit-flips (C), after each stage of correction in Drlt-C for RSM
errors. The parameters are p = 500, n = 400, fσ = 0.01, r = 7.

# RRMSE Sens Spec p-value simul


Pre-corr Post-corr Pre-corr Post-corr Pre-corr Post-corr
1 0.119 0.0952 0.714 0.75 0.979 0.987 1e-16
2 0.0952 0.0923 0.75 0.67 0.987 0.992 1e-12
3 0.0923 0.0861 0.67 1 0.992 0.995 1e-7
4 0.0861 0.0848 1 1 0.995 0.998 0.00078
5 0.0861 0.0848 1 1 0.995 0.998 0.0045

Table 32: s = 15: Pre-correction and post-correction RRMSE, Sensitivity,
Specificity, and p-value for the simultaneous tests given in (35) after each stage
of correction in Drlt-C for RSM errors. The parameters are p = 500, n = 400,
fσ = 0.01, s = 15, r = 7.

Indices of true effective bit-flips fσ = 0.01 fσ = 0.03 fσ = 0.05
2 1(1) 0.9 (1) 0.7 (0.9)
67 1(1) 1 (1) 0.9 (1)
72 0.8 (0.8) 0.6 (0.8) 0.2 (0.5)
219 0.6 (0.9) 0.5 (0.7) 0.5 (0.6)
361 0.9 (1) 0.8 (0.8) 0.5 (0.7)
392 0.4 (0.5) 0.1 (0.3) 0 (0.1)

Table 33: Effective bit-flips correctly altered: the first column lists the indices
of the true effective bit-flips. Columns 2, 3 and 4 give the proportion of
times that particular bit-flip was correctly altered by the RSM correction
algorithm for fσ = 0.01, fσ = 0.03 and fσ = 0.05 respectively. The fixed
parameters of the experiments are p = 500, n = 400, s = 10, r = 6. In brackets
are the proportions of times that bit-flip was detected in the first stage of RSM
correction.

particular bit-flip has been correctly altered by the RSM correction algorithm
for fσ = 0.01, fσ = 0.03 and fσ = 0.05 respectively. The objective of this
experiment is to observe how the correction of bit-flips worsens as the noise
variance increases.

4 Alternate correction algorithm for RSM


We see from the correction algorithms given in the previous section that
multiple bit-flips remain uncorrected at the end of the first stage of
correction. This requires the correction to be performed multiple times. However,
given y and A, it is not clear up to which stage the multi-stage correction
process should continue. The objective of this section is three-fold.

1. Provide a better correction algorithm that is capable of correcting almost
all the bit-flips that can be corrected.

2. Provide a measure to evaluate the improvement in the correction algorithm.

3. Provide a stopping criterion for the multiple-stage correction algorithm.


The main issue with Alg.3 is that the closeness criterion used in step 13
depends on the debiased lasso estimate δ̂W. This estimate is affected by the
Gaussian noise η and is therefore not ideal as a criterion for closeness to 0. The
alternate algorithm instead uses the lasso estimate δ̂λ2 itself; the lasso estimate
is robust to additive noise and hence works better. The alternate algorithm is
given in Alg.4. Now, in order to measure the improvement in the correction
algorithm, we use the Root Mean Square Error (Rmse) over the set of measurements detected
Run #  NP  NP cA  RMSEJ pre-correction  RMSEJ post-correction
1 4 4 23.25 5.75
2 2 2 11.5 3.5
3 4 3 15.5 5.25
4 3 3 17.67 6.67
5 5 4 25.2 8.4

Table 34: True number of bit-flips (NP), the number of bit-flips corrected by
Alg.4 (NP cA), RMSEJ pre-correction and RMSEJ post-correction.

Stage #  True bit-flips (B)  J  NP  NP T  NP F


1 (11,126,195,242) (11,27,126,195,242,322,387,392) 4 3 4
2 (11,322) (11,27,322,387) 2 2 2
3 ϕ (27,387) 0 0 2

Table 35: Run 3: Set of true bit-flips (B), set of detected bit-flips (J) and
set of corrected bit-flips (C), #RSM errors (NP) and #RSM errors detected
correctly (NP T) and incorrectly (NP F), after each stage of correction in Drlt-C
for r = 8 RSM errors.

to have bit-flips. The measure is given as follows:

        RMSEJ = √( Σi∈J (yi − ai. β̂λ1 − δ̂λ2i)² / nJ ),                      (1)

where nJ is the cardinality of J, the set of measurements detected to have
bit-flips.
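The measure in (1) can be computed directly from the detected set J. A minimal numpy sketch (the function and variable names here, such as rmse_J, are illustrative and not taken from [2]):

```python
import numpy as np

def rmse_J(y, A, beta_hat, delta_hat, J):
    """RMSE over the detected set J, as in Eqn. (1):
    sqrt( sum_{i in J} (y_i - a_i. beta_hat - delta_hat_i)^2 / n_J )."""
    J = np.asarray(J)
    resid = y[J] - A[J] @ beta_hat - delta_hat[J]
    return np.sqrt(np.mean(resid ** 2))

# Tiny hand-checkable example: only measurement 2 is flagged, and its
# residual is y_2 - a_2. beta - delta_2 = 3 - 3 - 1 = -1, so RMSE_J = 1.
y = np.array([1.0, 2.0, 3.0])
A = np.eye(3)
beta_hat = np.array([1.0, 2.0, 3.0])
delta_hat = np.array([0.0, 0.0, 1.0])
print(rmse_J(y, A, beta_hat, delta_hat, [2]))  # 1.0
```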

4.1 Experimental Results


We performed experiments to correct bit-flips under the RSM model using Alg.4.
We ran the experiments for p = 500, n = 400, fσ = 0.01, fsp = 0.1, fer = 0.1,
using the same experimental setup as given in [2], for different noise runs. In
Table 34 we provide the true number of bit-flips (NP) and the number of
bit-flips corrected by Alg.4 (NP cA), together with RMSEJ pre- and post-correction.

5 Optimal solution for Alg.1 of [3]


We will find the optimal solution for the equivalent optimisation problem w.r.t.
W for the optimisation problem given in Alg.1 of [3]. The optimisation
problem w.r.t. M of Alg.1 of [3] is restated in Alg.5.

Algorithm 4 Correction of bit-flips following the Random Switch Model
(RSM) using W = copt A
Require: Measurement vector y, pooling matrix A, lasso estimates β̂λ1, δ̂λ2,
λ and the set J of corrupted measurements estimated by the ODrlt method
Ensure: Bit-flip corrected matrix Ã
1: for every i ∈ J do
2:   Set bf1 := −1, bf2 := −1 (bit-flip flags), cutoff := |δ̂λ2i|.
3:   for every j ∈ [p − 1] do
4:     for l ∈ [p] do
5:       if {Aij == −1 and Ail == 1} or {Aij == 1 and Ail == −1} then
6:         if Aij == 1 then
7:           Aij := −1, Ail := 1.
8:         else if Aij == −1 then
9:           Aij := 1, Ail := −1.
10:        end if
11:        Find the solution β̂λ1, δ̂λ2 to the convex program given in Eqn.(6) of [2].
12:        if |δ̂λ2i| < cutoff then
13:          Set bf1 := j, bf2 := l, cutoff := |δ̂λ2i|.
14:        end if
15:        Aij := −Aij, Ail := −Ail  {reverse the induced switch before trying the next pair}
16:      end if
17:    end for
18:  end for
19:  if bf1 ≠ −1 and bf2 ≠ −1 then  {apply the best switch found for row i}
20:    Ai,bf1 := −Ai,bf1, Ai,bf2 := −Ai,bf2.
21:  end if
22: end for
23: return Ã = A.
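To make the pair-switching search of Alg.4 concrete, the sketch below runs the same combinatorial loop on a small noiseless instance, but replaces the convex program of Eqn.(6) of [2] with a plain least-squares refit and uses the row residual |yi − ai.β̂| as the closeness criterion. This surrogate and the helper name correct_row_rsm are ours, purely for illustration, and do not reproduce the lasso-based criterion of [2]:

```python
import numpy as np

def correct_row_rsm(y, A, i):
    """Try every opposite-sign pair (j, l) in row i, switch it, refit beta by
    least squares, and keep the switch minimising row i's residual.
    Surrogate for Alg.4: the convex program of [2] is replaced by lstsq."""
    best, best_pair = None, None
    p = A.shape[1]
    for j in range(p):
        for l in range(p):
            if A[i, j] == -A[i, l]:                   # opposite signs: candidate switch
                B = A.copy()
                B[i, j], B[i, l] = -B[i, j], -B[i, l]  # induce the switch
                beta = np.linalg.lstsq(B, y, rcond=None)[0]
                crit = abs(y[i] - B[i] @ beta)         # closeness criterion
                if best is None or crit < best:
                    best, best_pair = crit, (j, l)
    if best_pair is not None:                          # apply the best switch found
        j, l = best_pair
        A[i, j], A[i, l] = -A[i, j], -A[i, l]
    return A

rng = np.random.default_rng(0)
n, p = 20, 8
A_true = rng.choice([-1.0, 1.0], size=(n, p))
beta_star = rng.standard_normal(p)
y = A_true @ beta_star                                 # noiseless measurements

# Corrupt one row with an RSM error: switch a (+1, -1) pair of entries.
A = A_true.copy()
i0 = next(i for i in range(n) if (A[i] == 1).any() and (A[i] == -1).any())
j0 = int(np.argmax(A[i0] == 1)); l0 = int(np.argmax(A[i0] == -1))
A[i0, j0], A[i0, l0] = -1.0, 1.0

A = correct_row_rsm(y, A, i0)
print(np.array_equal(A, A_true))                       # the switch is undone
```

In the noiseless case the correct switch makes the overdetermined system exactly consistent, so its residual is numerically zero while every wrong switch leaves an O(1) residual; this is the same separation the |δ̂λ2i| criterion exploits.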

Algorithm 5 Construction of M (Alg.1 of [3])
Require: Measurement vector y, design matrix A, µ
Ensure: M
1: Set Σ̂ = A⊤A/n.
2: Let mi ∈ Rp for i = 1, 2, . . . , p be a solution of:

        minimize   mi⊤ Σ̂ mi
        subject to ∥Σ̂mi − ei∥∞ ≤ µ,                                         (2)

   where ei ∈ Rp is the i-th column of the identity matrix I and µ = √(2 log p / n).
3: Set M = (m1 | . . . |mp)⊤. If any of the above problems is not feasible, then
set M = Ip.

The equivalent separable problem in step 2 of Alg.5 w.r.t. W (where
W = AM ) is given as follows:

        P := minimize   w.j⊤ w.j /n
             subject to ∥(1/n)A⊤w.j − ej∥∞ ≤ µ,                              (3)

where w.j is the j-th column of W and ej is the j-th column of Ip for j ∈ [p].
This is the primal problem (P). We will now derive the dual problem (D) of P
given in (3) in Lemma 2.
q
Lemma 2 Let A be a n × p Rademacher matrix and µ = 2 logn p . Given the
primal optimisation problem (P) in (3), the dual of this problem for j ∈ [p] is
given by:
1
Dj := max − (γ1 − γ2 )⊤ A⊤ A(γ1 − γ2 ) − γ1 ⊤ (ej + µ1) + γ2 ⊤ (ej − µ1) (4)
γ1 ≥0,γ2 ≥0 4n

where ej is the j th column on Ip . Furthermore, for all j ∈ [p] one optimal


solution of the primal problem P is given by w̃.j = (1 − µ)a.j and one optimal
solution to the dual(problem is given by γ̃1 = 0 and
2(1 − µ) ; if j ′ = j
for j ′ ∈ [p] γ̃2j ′ = .
0 ; otherwise
The minimum of the primal problem P and the maximum of the dual problem
D is equal at the aforementioned solutions and is given by (1 − µ)2 .
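Lemma 2's claimed primal optimum can be checked numerically: with w̃.j = (1 − µ)a.j, the objective w̃.j⊤w̃.j/n equals (1 − µ)² exactly (Rademacher columns satisfy a.j⊤a.j = n), and the j-th coordinate of the constraint sits on the boundary, |(1 − µ) − 1| = µ. A small numpy check (the off-diagonal coordinates are feasible only with high probability, so the sketch merely reports them):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 400, 500
A = rng.choice([-1.0, 1.0], size=(n, p))
mu = np.sqrt(2 * np.log(p) / n)

j = 0
w = (1 - mu) * A[:, j]                 # candidate primal solution of Lemma 2

obj = w @ w / n                        # primal objective w.j' w.j / n
c = A.T @ w / n - np.eye(p)[j]         # constraint vector (1/n)A'w.j - e_j
print(obj, (1 - mu) ** 2)              # objective equals (1 - mu)^2
print(abs(c[j]), mu)                   # j-th coordinate is tight: |.| = mu
print(np.abs(np.delete(c, j)).max())   # off-diagonal slack (w.h.p. below mu)
```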
Proof of Lemma 2: Recall that the primal problem P for all j ∈ [p] is given by,

        P := minimize   w.j⊤ w.j /n
             subject to ∥(1/n)A⊤w.j − ej∥∞ ≤ µ.

We can express the constraint ∥(1/n)A⊤w.j − ej∥∞ ≤ µ for j ∈ [p] in the form of
two sets of component-wise inequalities:

        −µ ≤ (1/n)a.l⊤w.j − ejl ;  l ∈ [p],
        (1/n)a.l⊤w.j − ejl ≤ µ ;  l ∈ [p].

The Lagrangian function for γ1 ∈ Rp, γ1 ≥ 0 and γ2 ∈ Rp, γ2 ≥ 0 is given by,
for j ∈ [p],

  L(w.j, γ1, γ2) = w.j⊤w.j /n + Σl=1..p γ1l ((1/n)a.l⊤w.j − ejl − µ)
                   + Σl=1..p γ2l (ejl − (1/n)a.l⊤w.j − µ).                   (5)

The dual function is g(γ1, γ2) = min over w.j ∈ Rn of L(w.j, γ1, γ2). Hence, the
dual optimisation problem D is given by,

        D := max over γ1 ≥ 0, γ2 ≥ 0 of g(γ1, γ2)
           = max over γ1 ≥ 0, γ2 ≥ 0 of min over w.j ∈ Rn of L(w.j, γ1, γ2). (6)

We will now find the exact form of g(γ1, γ2), i.e., the minimum value of
L(w.j, γ1, γ2) w.r.t. w.j for all j ∈ [p]. Setting the first order derivative of
L(w.j, γ1, γ2) w.r.t. w.j to zero, we get,

        ∂L(w.j, γ1, γ2)/∂w.j⊤ = 0
        ⟹ 2w.j⊤/n + Σl=1..p γ1l a.l⊤/n − Σl=1..p γ2l a.l⊤/n = 0
        ⟹ ŵ.j = −(1/2) Σl=1..p (γ1l − γ2l)a.l = −(1/2)A(γ1 − γ2).           (7)

The second order derivative (Hessian) of L(w.j, γ1, γ2) w.r.t. w.j is

        ∂²L(w.j, γ1, γ2)/∂w.j ∂w.j⊤ = (2/n)In > 0.                           (8)

Hence, ŵ.j = −(1/2)A(γ1 − γ2) is the minimiser. Therefore, plugging in the minimiser
from (7), we get,

  g(γ1, γ2) = L(ŵ.j, γ1, γ2)
    = ŵ.j⊤ŵ.j /n + Σl=1..p γ1l ((1/n)a.l⊤ŵ.j − ejl − µ) + Σl=1..p γ2l (ejl − (1/n)a.l⊤ŵ.j − µ)
    = (1/4n)(γ1 − γ2)⊤A⊤A(γ1 − γ2) + γ1⊤(−(1/2n)A⊤A(γ1 − γ2) − ej − µ1)
      + γ2⊤((1/2n)A⊤A(γ1 − γ2) + ej − µ1)
    = (1/4n)(γ1 − γ2)⊤A⊤A(γ1 − γ2) − (1/2n)(γ1 − γ2)⊤A⊤A(γ1 − γ2)
      − γ1⊤(ej + µ1) + γ2⊤(ej − µ1)
    = −(1/4n)(γ1 − γ2)⊤A⊤A(γ1 − γ2) − γ1⊤(ej + µ1) + γ2⊤(ej − µ1).          (9)

Hence, the dual problem is given by

  Dj := max over γ1 ≥ 0, γ2 ≥ 0 of
        −(1/4n)(γ1 − γ2)⊤A⊤A(γ1 − γ2) − γ1⊤(ej + µ1) + γ2⊤(ej − µ1).        (10)
This completes the derivation of the dual problem. Now, we will show that, for all
j ∈ [p], w̃.j = (1 − µ)a.j, γ̃1 = 0 and, for j′ ∈ [p], γ̃2j′ = 2(1 − µ) if j′ = j
and γ̃2j′ = 0 otherwise, are optimal solutions to the primal (P) and dual (Dj)
problems respectively. To show this, we will invoke the Strong Duality Theorem,
i.e., we will show that the objective function values of the primal and dual
problems at the solutions w̃.j, γ̃1 and γ̃2 are equal for all j ∈ [p].
Recall that the objective function of the primal problem is f(w.j) = w.j⊤w.j /n.
Hence, the value of f at w̃.j for all j ∈ [p] is given by,

        f(w̃.j) = f((1 − µ)a.j) = (1 − µ)² a.j⊤a.j /n = (1 − µ)².            (11)

Now, the value of the dual objective function at γ̃1 and γ̃2 is given by,

        g(γ̃1, γ̃2) = −(1/4)·4(1 − µ)² a.j⊤a.j /n + 2(1 − µ)(1 − µ) = (1 − µ)². (12)

Hence, from (11) and (12), both the primal and dual objective function values
equal (1 − µ)² at the solutions w̃.j, γ̃1 and γ̃2 for all j ∈ [p]. Hence, by the
Strong Duality Theorem, they are optimal solutions for the primal and dual
problems. This completes the proof. ■

6 Optimal solution for Alg.2 of [2]
Now, we will provide a theorem which derives the optimal solution of the
optimisation problem in Alg.2 of [2]. We restate this algorithm as Alg.6.

Algorithm 6 Design of W
Require: A, µ1, µ2 and µ3
Ensure: W
1: We solve the following optimisation problem:

        PW ≜ minimize over W of Σj=1..p w.j⊤w.j /n
             subject to C0 : w.j⊤w.j /n ≤ 1  ∀ j ∈ [p],
                        C1 : ∥Ip − (1/n)W⊤A∥∞ ≤ µ1,
                        C2 : ∥(1/p)(In − (1/n)W A⊤)A∥∞ ≤ µ2,
                        C3 : (n/(p√(1 − n/p))) ∥(1/n)W A⊤ − (p/n)In∥∞ ≤ µ3,

   where µ1 ≜ 2√(2 log p / n), µ2 ≜ √(2 log(2np)/(np)) + 1/n and
   µ3 ≜ (2/√(1 − n/p)) √(2 log n / p).
2: If the above problem is not feasible, then set W = A.
The primal optimisation problem of Alg.6 is given by PW .


Theorem 1 Let A be an n × p Rademacher matrix, µ1 = 2√(2 log p / n),
µ2 = √(2 log(2np)/(np)) + 1/n and µ3 = (2/√(1 − n/p)) √(2 log n / p). Given the
primal optimisation problem PW of Alg.6, the dual of this problem is given by:

  DW ≜ max over Λ1 ≥ 0, Λ2 ≥ 0, Λ3 ≥ 0, Λ4 ≥ 0, Λ5 ≥ 0, Λ6 ≥ 0, λ7 ≥ 0 of
       g(Λ1, Λ2, Λ3, Λ4, Λ5, Λ6, λ7),                                       (13)

where g(Λ1, Λ2, Λ3, Λ4, Λ5, Λ6, λ7) is as given in (25). Furthermore, if n < p,
one optimal solution of the primal problem PW is given by W = (1 − µ3√(1 − n/p))A,
and one optimal solution of the dual problem DW is given by Λ̂1 = 0 = Λ̂2,
Λ̂3 = 0 = Λ̂4, Λ̂5 = 0 and λ̂7 = 0 with
[Λ̂6]ij = (2p/n)(1 − µ3√(1 − n/p))(1 + √(1 − n/p)) if i = j = 1, and [Λ̂6]ij = 0
otherwise. The minimum of the primal problem PW and the maximum of the dual
problem DW coincide at the aforementioned solutions and equal (1 − µ3√(1 − n/p))². ■

Proof of Theorem 1: The primal problem PW can be written with element-wise
constraints as follows:

  PW ≜ minimize over W of (1/n) Σl=1..p Σk=1..n w²kl
       subject to
       C0 : (1/n) Σk=1..n w²kl ≤ 1  ∀ l ∈ [p],
       C1 : −µ1 ≤ Ip l1l2 − (1/n) Σk=1..n wkl1 akl2 ≤ µ1  ∀ l1, l2 ∈ [p],
       C2 : −µ2 ≤ ak1l1 /p − (1/np) Σk2=1..n Σl2=1..p wk1l2 ak2l2 ak2l1 ≤ µ2  ∀ l1 ∈ [p], k1 ∈ [n],
       C3 : −µ3√(1 − n/p) ≤ In k1k2 − (1/p) Σl=1..p wk1l ak2l ≤ µ3√(1 − n/p)  ∀ k1, k2 ∈ [n].

Note that there are 7 separate sets of constraints. Hence, we can define the
Lagrangian function with the Lagrange multipliers Λ1 ∈ R+^(p×p), Λ2 ∈ R+^(p×p),
Λ3 ∈ R+^(n×p), Λ4 ∈ R+^(n×p), Λ5 ∈ R+^(n×n), Λ6 ∈ R+^(n×n), λ7 ∈ R+^p as follows:

  L(W, Λ1, Λ2, Λ3, Λ4, Λ5, Λ6, λ7)
    = (1/n) Σl=1..p Σk=1..n w²kl + Σl=1..p λ7l ((1/n) Σk=1..n w²kl − 1)
    + Σl1=1..p Σl2=1..p Λ1 l1l2 (−µ1 − (Ip l1l2 − (1/n) Σk=1..n wkl1 akl2))
    + Σl1=1..p Σl2=1..p Λ2 l1l2 (Ip l1l2 − (1/n) Σk=1..n wkl1 akl2 − µ1)
    + Σk1=1..n Σl1=1..p Λ3 k1l1 (−µ2 − (ak1l1 /p − (1/np) Σk2=1..n Σl2=1..p wk1l2 ak2l2 ak2l1))
    + Σk1=1..n Σl1=1..p Λ4 k1l1 (ak1l1 /p − (1/np) Σk2=1..n Σl2=1..p wk1l2 ak2l2 ak2l1 − µ2)
    + Σk1=1..n Σk2=1..n Λ5 k1k2 (−µ3√(1 − n/p) − (In k1k2 − (1/p) Σl=1..p wk1l ak2l))
    + Σk1=1..n Σk2=1..n Λ6 k1k2 (In k1k2 − (1/p) Σl=1..p wk1l ak2l − µ3√(1 − n/p)). (14)

The dual function of the primal problem PW is given as follows:

  g(Λ1, Λ2, Λ3, Λ4, Λ5, Λ6, λ7) = min over W of L(W, Λ1, Λ2, Λ3, Λ4, Λ5, Λ6, λ7). (15)

The dual optimisation problem DW is given by:

  DW ≜ max over Λ1 ≥ 0, Λ2 ≥ 0, Λ3 ≥ 0, Λ4 ≥ 0, Λ5 ≥ 0, Λ6 ≥ 0, λ7 ≥ 0 of
       g(Λ1, Λ2, Λ3, Λ4, Λ5, Λ6, λ7).                                       (16)

We will now find the exact form of g(·) by minimising L(·) w.r.t. W. Evaluating
the first order derivative of L(·) w.r.t. wij, we get, for all i ∈ [n], j ∈ [p],

  ∂L(W, Λ1, Λ2, Λ3, Λ4, Λ5, Λ6, λ7)/∂wij
    = (2/n)wij + (2λ7j /n)wij − (1/n) Σl2=1..p Λ1 jl2 ail2 − (1/n) Σl2=1..p Λ2 jl2 ail2
      − (1/np) Σk2=1..n Σl1=1..p Λ3 il1 ak2j ak2l1 − (1/np) Σk2=1..n Σl1=1..p Λ4 il1 ak2j ak2l1
      − (1/p) Σk2=1..n Λ5 ik2 ak2j − (1/p) Σk2=1..n Λ6 ik2 ak2j.             (17)

Setting the first order derivative in (17) to 0 to satisfy the stationarity
condition for all i ∈ [n], j ∈ [p], we get

  ∂L(W, Λ1, Λ2, Λ3, Λ4, Λ5, Λ6, λ7)/∂wij = 0
  ⟹ (2(1 + λ7j)/n) ŵij = (1/n)(Λ1 j. + Λ2 j.)⊤ai. + (1/np)(Λ3 i. + Λ4 i.)⊤A⊤a.j
                           + (1/p)(Λ5 i. + Λ6 i.)⊤a.j
  ⟹ ŵij = (1/(2(1 + λ7j))) { (Λ1 j. + Λ2 j.)⊤ai. + (1/p)(Λ3 i. + Λ4 i.)⊤A⊤a.j
                               + (n/p)(Λ5 i. + Λ6 i.)⊤a.j }.                 (18)

The second order derivative of L(·) w.r.t. wij for all i ∈ [n], j ∈ [p] is

        ∂²L(·)/∂w²ij = 2(1 + λ7j)/n > 0,                                     (19)

where the last inequality holds because λ7 ≥ 0. Hence, from (19), the solution
ŵij given in (18) is the minimiser. Now we plug in the value of ŵij for all
i ∈ [n], j ∈ [p] into L(W, Λ1, Λ2, Λ3, Λ4, Λ5, Λ6, λ7) in (14). In order to do so,

we re-write L(W , Λ1 , Λ2 .Λ3 , Λ4 , Λ5 , Λ6 , λ7 ) as follows:

L(W , Λ1 , Λ2 .Λ3 , Λ4 , Λ5 , Λ6 , λ7 )
p p p
p X
1 X X X
= (1 + λ7l1 )w·l1 ⊤ w·l1 − λ7l1 − (Λ1l1 l2 + Λ2l1 l2 )µ1
n
l1 =1 l1 =1 l1 =1 l2 =1
X p
p X p
n X
X
+ (Λ2l1 l2 − Λ1l1 l2 )Ipl1 l2 − (Λ3k1 l1 + Λ4k1 l1 )µ2
l1 =1 l2 =1 k1 =1 l1 =1
n X p n X n
1 X X p
+ (Λ4k1 l1 − Λ3k1 l1 )ak1 l1 − (Λ5k1 k2 + Λ6k1 k2 )µ3 1 − n/p
p
k1 =1 l1 =1 k1 =1 k2 =1
n X n p X p
X 1 X

− (Λ6k1 k2 − Λ5k1 k2 )Ink1 k2 − (Λ1l1 l2 + Λ2l1 l2 )w·l a·l2
n 1
k1 =1 k2 =1 l1 =1 l2 =1
n X p n n
1 X 1 X X
− (Λ3k1 l1 + Λ4k1 l1 )wk⊤1 · A⊤ a·l1 − (Λ5k1 k2 + Λ6k1 k2 )wk⊤1 · a(20)
k2 · .
np p
k1 =1 l1 =1 k1 =1 k2 =1

Note that the last three terms of (20) are the only terms dependent on W . Now,
we will find the minimum value of L(W , Λ1 , Λ2 .Λ3 , Λ4 , Λ5 , Λ6 , λ7 ) which is
given by L(W c , Λ1 , Λ2 .Λ3 , Λ4 , Λ5 , Λ6 , λ7 ). In order to obtain this, we evaluate
the first term and the last three terms of (20) at W c separately. The first term
expanded is given as follows:
p p
(
1 X ⊤ 1 X 1
(1 + λ7l1 )ŵ·l1 ŵ·l1 = (Λ1l1 · + Λ2l1 · )⊤ A⊤ A(Λ1l1 · + Λ2l1 · )
n 4n 1 + λ7l1
l1 =1 l1 =1
1 ⊤
+ a A(Λ3 + Λ4 )⊤ (Λ3 + Λ4 )A⊤ a·l1
p2 ·l1
n2 ⊤
+ a (Λ5 + Λ6 )⊤ (Λ5 + Λ6 )a·l1
p2 ·l1
2
+ (Λ1l1 · + Λ2l1 · )⊤ A⊤ (Λ3 + Λ4 )A⊤ a·l1
p
2n
+ (Λ1l1 · + Λ2l1 · )⊤ A⊤ (Λ5 + Λ6 )a·l1
p
)
2n ⊤
+ a A(Λ3 + Λ4 )⊤ (Λ5 + Λ6 )a·l1 (21)
p2 ·l1

Now, we evaluate the third-last term of (20) at W


c as follows:
p p p p
(
1 XX ⊤ 1 X X (Λ1l1 l2 + Λ2l1 l2 )
− (Λ1l1 l2 + Λ2l1 l2 )ŵ·l a·l2 = − (Λ1l1 · + Λ2l1 · )⊤ A⊤ a·l2
n 1
2n 1 + λ7l1
l1 =1 l2 =1 l1 =1 l2 =1
)
1 ⊤ ⊤ n ⊤ ⊤
+ a A(Λ3 + Λ4 ) a·l2 + a·l1 (Λ5 + Λ6 ) a·l2 .(22)
p ·l1 p

The second last term of (20) at W
c can be written as:

n p
1 X X
− (Λ3k1 l1 + Λ4k1 l1 )wk⊤1 · A⊤ a·l1
np
k1 =1 l1 =1
p
n
(
1 X X
= − (Λ3k1 l1 + Λ4k1 l1 ) a⊤ ⊤ ⊤
k1 · (Λ1 + Λ2 )Λ7 A a·l1
np
k1 =1 l1 =1
)
1 ⊤ ⊤ ⊤ ⊤ n ⊤ ⊤ ⊤
+ (Λ3k1 · + Λ4k1 · ) A AΛ7 A a·l1 + (Λ5k1 · + Λ6k1 · ) AΛ7 A a·l(23)
1 ,
p p

where Λ7 is p × p dimensional diagonal matrix with Λ7jj = 1/(1 + λ7j ) for


j ∈ [p]. The last term of (20) at W
c is as follows:

n n
1 X X
− (Λ5k1 k2 + Λ6k1 k2 )wk⊤1 · ak2 ·
p
k1 =1 k2 =1
n n
(
1 X X
= − (Λ5k1 k2 + Λ6k1 k2 ) a⊤ ⊤
k1 · (Λ1 + Λ2 )Λ7 ak2 ·
p
k1 =1 k2 =1
)
1 ⊤ ⊤ ⊤ n ⊤
+ (Λ3k1 · + Λ4k1 · ) A AΛ7 ak2 · + (Λ5k1 · + Λ6k1 · ) Aak2 · ,(24)
p p

where Λ7 is as defined before. Now we plug in (21),(22),(23) and (24) in (20) in


order to get the dual function. Note that, we write the dual function in terms

of Trace of matrices to make it simpler to write. Hence, we get,

g(Λ1 , Λ2 , Λ3 , Λ4 , Λ5 , Λ6 , λ7 ) = L(W
c , Λ1 , Λ2 .Λ3 , Λ4 , Λ5 , Λ6 , λ7 )
= −1⊤ ⊤ ⊤
p λ7 − µ1 1p (Λ1 + Λ2 )1p + T r[Λ2 − Λ1 ] − µ2 1n (Λ3 + Λ4 )1p
T r[A⊤ (Λ4 − Λ3 )] p
+ − µ3 1 − n/p 1⊤ n (Λ5 + Λ6 )1n + T r[Λ6 − Λ5 ]
p
1 1
+ T r[Λ⊤ ⊤ ⊤
7 (Λ1 + Λ2 ) A A(Λ1 + Λ2 )] + T r[Λ⊤ ⊤ ⊤
7 A A(Λ3 + Λ4 ) (Λ3 + Λ4 )A A]

4n 4np2
n 1
+ 2
T r[Λ⊤ ⊤ ⊤
7 A (Λ5 + Λ6 ) (Λ5 + Λ6 )A] + T r[Λ⊤ ⊤ ⊤
7 (Λ1 + Λ2 ) A (Λ3 + Λ4 )A A]

4p 2np
1 1
+ T r[Λ⊤ ⊤ ⊤
7 (Λ1 + Λ2 ) A (Λ5 + Λ6 )A] + T r[Λ⊤ ⊤ ⊤
7 A A(Λ3 + Λ4 ) (Λ5 + Λ6 )A]
2p 2p2
1 1
− T r[Λ⊤ ⊤ ⊤ ⊤
7 (Λ1 + Λ2 ) (Λ1 + Λ2 ) A A] − T r[Λ⊤ ⊤ ⊤
7 (Λ1 + Λ2 ) A A(Λ3 + Λ4 ) A]

2n 2np
1 1
− T r[Λ⊤ ⊤ ⊤ ⊤
7 (Λ1 + Λ2 ) A (Λ5 + Λ6 ) A] − T r[(Λ3 + Λ4 )⊤ A⊤ (Λ1 + Λ2 )Λ⊤ ⊤
7 A A]
2p 2np
1 1
− 2
T r[(Λ3 + Λ4 )⊤ (Λ3 + Λ4 )A⊤ AΛ⊤ ⊤
7 A A] − T r[(Λ3 + Λ4 )⊤ (Λ5 + Λ6 )AΛ⊤ ⊤
7 A A]
2np 2p2
1 1
− T r[(Λ5 + Λ6 )⊤ A⊤ (Λ1 + Λ2 )Λ⊤ ⊤
7 A )] − T r[(Λ5 + Λ6 )⊤ (Λ3 + Λ4 )A⊤ AΛ⊤ ⊤
7 A ]
2p 2p2
n
− T r[(Λ5 + Λ6 )⊤ (Λ5 + Λ6 )AΛ⊤ ⊤
7 A ], (25)
2p2
where Λ7 is as defined before, 1n is vector of length n of ones. This completes
the derivation of the dual problem. Now, we find the values of the primal
problem and dual problem at the given points of solution
p respectively. We have
given the choice Ŵ = copt A where, copt = 1 − µ3 1 − n/p. Hence, the value
of the primal problem is given as,
p p
1X ⊤ X p
2
ŵ.j ŵ.j = copt a⊤ 2 2
.j a.j /n = copt = (1 − µ3 1 − n/p) . (26)
n j=1 j=1

Now we are given the choice for the dual solutions


( which are Λ̂1 = 0 = Λ̂2 ,Λ̂5 =
2p
p p
n (1 − µ3 1 − n/p)(1 + 1 − n/p); if i = j = 1
0, Λ̂3 = 0 = Λ̂4 and λ̂7 = 0 with [Λ̂6 ]ij = .
0; o.w.
Hence, in the dual solution, Λ7 = Ip . Therefore, for these choices of parameters,
the dual solution of (25) becomes,

g(Λ̂1 , Λ̂2 , Λ̂3 , Λ̂4 , Λ̂5 , Λ̂6 , λ̂7 )


2p p p p p p
= (1 − µ3 1 − n/p)2 (1 + 1 − n/p) − (1 − µ3 1 − n/p)2 (1 + 1 − n/p)2
n p n
p 2 p(1 + 1 − n/p) p
= (1 − µ3 1 − n/p) (2 − 1 − 1 − n/p)
p n
2
= (1 − µ3 1 − n/p) . (27)

Clearly, we see from (26) and (27) that, for the given choices of parameters, the
primal and dual objective values are equal. Hence, by the Strong Duality Theorem,
the stated solutions are optimal for the primal and dual problems respectively,
and the corresponding optimal value is (1 − µ3√(1 − n/p))². This completes the
proof. ■

7 Simultaneous Confidence Intervals for Optimal Drlt's

We will now derive simultaneous hypothesis tests for multiple parameters
of β∗ and δ∗ respectively. Here, we want to test G0 : β∗K = 0 vs. G1 : β∗K ≠ 0,
where K ⊂ [p] is such that ∥K∥0 = k, with k fixed as n, p → ∞. We further
want to test H0 : δ∗L = 0 vs. H1 : δ∗L ≠ 0, where L ⊂ [n] is such that ∥L∥0 = l,
with l fixed as n, p → ∞. In order to perform such tests, let us first recall the
debiased lasso estimates of β∗ and δ∗ respectively.
  √n(β̂W − β∗)K = (1/√n)WK⊤η + √n[(Ip − (1/n)W⊤A)(β∗ − β̂λ1)]K
                  + (1/√n)WK⊤(δ∗ − δ̂λ2),                                    (28)

  (δ̂W − δ∗)L = [(In − (1/n)W A⊤)η]L + [(In − (1/n)W A⊤)A(β∗ − β̂λ1)]L
                − (1/n)WL A⊤(δ∗ − δ̂λ2).                                     (29)

Furthermore, the variance-covariance matrices are given as follows:

  ΣβK ≜ Var((1/√n)WK⊤η) = (σ²/n) WK⊤WK,                                     (30)

  ΣδL ≜ Var([(In − (1/n)W A⊤)η]L) = σ² [In − (1/n)W A⊤]L [In − (1/n)W A⊤]L⊤. (31)

Note that we assume σ² to be known. Now we will provide a theorem that gives
the joint distributions of √n(β̂W − β∗)K of (28) and (δ̂W − δ∗)L of (29),
which will aid us in creating the joint tests and their corresponding confidence
intervals.

Theorem 2 Given A is an n × p dimensional Rademacher matrix and W = copt A
is the optimal solution of Alg.6 with copt = 1 − µ3√(1 − n/p). Let K ⊂ [p] be
such that ∥K∥0 = k and L ⊂ [n] be such that ∥L∥0 = l, with both k and l fixed
as n, p → ∞. Furthermore, let (β̂W − β∗)K and (δ̂W − δ∗)L be as defined in
(28) and (29), and let ΣβK and ΣδL be as defined in (30) and (31) respectively.
If n log n is o(p) and n is ω(((s + r) log p)²), then we have,

  {√n(β̂W − β∗)K}⊤ ΣβK−1 {√n(β̂W − β∗)K} →D χ²k,                             (32)

  {(δ̂W − δ∗)L}⊤ ΣδL−1 {(δ̂W − δ∗)L} →D χ²l.                                 (33)

This theorem provides us with the following simultaneous hypothesis tests.
Simultaneous test for β: Let K ⊂ [p] be such that ∥K∥0 = k. We reject the
null hypothesis G0 : β∗K = 0 vs. the alternative G1 : β∗K ≠ 0 at level of
significance α if:

        {√n(β̂W)K}⊤ ΣβK−1 {√n(β̂W)K} > χ²k,1−α,                              (34)

where χ²k,1−α is the upper-α point of the χ²k distribution.
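Under G0 the statistic in (34) is a quadratic form in the exactly Gaussian vector (1/√n)WK⊤η, so its null distribution here is exactly χ²k. The sketch below simulates the test at level α = 0.05 with k = 2, for which the quantile has the closed form χ²2,0.95 = −2 log 0.05. The setup (copt, σ, µ3) follows the notation above; the simulation itself is our illustration, not an experiment from this note:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, sigma, k = 400, 500, 1.0, 2
A = rng.choice([-1.0, 1.0], size=(n, p))
mu3 = 2.0 / np.sqrt(1 - n / p) * np.sqrt(2 * np.log(n) / p)
W = (1 - mu3 * np.sqrt(1 - n / p)) * A          # W = c_opt A

K = [0, 1]
WK = W[:, K]
Sigma_bK = sigma ** 2 * WK.T @ WK / n           # covariance from (30)
crit = -2 * np.log(0.05)                        # chi^2_{2,0.95} = 5.9915 (k = 2)

reject, trials = 0, 2000
for _ in range(trials):
    eta = sigma * rng.standard_normal(n)
    z = WK.T @ eta / np.sqrt(n)                 # sqrt(n) * (beta_hat_W)_K under G0
    T = z @ np.linalg.solve(Sigma_bK, z)        # statistic of (34)
    reject += T > crit
print(reject / trials)                          # close to the nominal 0.05
```

Because z is exactly Nk(0, ΣβK) here, the empirical rejection rate should match the nominal level up to Monte Carlo error.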


Simultaneous test for δ: Let L ⊂ [n] be such that ∥L∥0 = l. We reject the null
hypothesis H0 : δ∗L = 0 vs. the alternative H1 : δ∗L ≠ 0 at level of significance
α if:

        {(δ̂W)L}⊤ ΣδL−1 {(δ̂W)L} > χ²l,1−α,                                  (35)

where χ²l,1−α is the upper-α point of the χ²l distribution.
Proof of Theorem 2: We will first prove the limiting distribution given in (32).
We expand (32) using the structure given in (28) as follows:
  {√n(β̂W − β∗)K}⊤ ΣβK−1 {√n(β̂W − β∗)K}
    = {(1/√n)WK⊤η}⊤ ΣβK−1 {(1/√n)WK⊤η}
    + {√n[(Ip − (1/n)W⊤A)(β∗ − β̂λ1)]K}⊤ ΣβK−1 {√n[(Ip − (1/n)W⊤A)(β∗ − β̂λ1)]K}
    + {(1/√n)WK⊤(δ∗ − δ̂λ2)}⊤ ΣβK−1 {(1/√n)WK⊤(δ∗ − δ̂λ2)}
    + 2{(1/√n)WK⊤η}⊤ ΣβK−1 {√n[(Ip − (1/n)W⊤A)(β∗ − β̂λ1)]K}
    + 2{(1/√n)WK⊤η}⊤ ΣβK−1 {(1/√n)WK⊤(δ∗ − δ̂λ2)}
    + 2{√n[(Ip − (1/n)W⊤A)(β∗ − β̂λ1)]K}⊤ ΣβK−1 {(1/√n)WK⊤(δ∗ − δ̂λ2)}.       (36)

In Lemma 5, we will show that ΣβK is positive definite, hence ΣβK−1 exists.
Next, in Lemma 3, we will show that the last five terms (all except the first term
on the RHS) of (36) go to 0 in probability. Lastly, the first term on the RHS of
(36) can be written as {ΣβK−1/2 (1/√n)WK⊤η}⊤{ΣβK−1/2 (1/√n)WK⊤η}.
Since η is Gaussian with mean 0 and variance-covariance matrix σ²In, by
linearity of the Gaussian distribution we have ΣβK−1/2 (1/√n)WK⊤η ∼ Nk(0, Ik).
Hence, {ΣβK−1/2 (1/√n)WK⊤η}⊤{ΣβK−1/2 (1/√n)WK⊤η} ∼ χ²k.

Therefore, applying Slutsky's Theorem to (36), we have,

        {√n(β̂W − β∗)K}⊤ ΣβK−1 {√n(β̂W − β∗)K} →D χ²k.                       (37)

This completes the proof of (32). Now, we prove the limiting distribution given
in (33) with the same approach. Let us first expand (33) using the structure
given in (29). We have,
  {(δ̂W − δ∗)L}⊤ ΣδL−1 {(δ̂W − δ∗)L}
    = {[(In − (1/n)W A⊤)η]L}⊤ ΣδL−1 {[(In − (1/n)W A⊤)η]L}
    + {[(In − (1/n)W A⊤)A(β∗ − β̂λ1)]L}⊤ ΣδL−1 {[(In − (1/n)W A⊤)A(β∗ − β̂λ1)]L}
    + {(1/n)WL A⊤(δ∗ − δ̂λ2)}⊤ ΣδL−1 {(1/n)WL A⊤(δ∗ − δ̂λ2)}
    + 2{[(In − (1/n)W A⊤)η]L}⊤ ΣδL−1 {[(In − (1/n)W A⊤)A(β∗ − β̂λ1)]L}
    − 2{[(In − (1/n)W A⊤)η]L}⊤ ΣδL−1 {(1/n)WL A⊤(δ∗ − δ̂λ2)}
    − 2{[(In − (1/n)W A⊤)A(β∗ − β̂λ1)]L}⊤ ΣδL−1 {(1/n)WL A⊤(δ∗ − δ̂λ2)}.      (38)

In Lemma 5, we will show that ΣδL is positive definite, hence ΣδL−1 exists.
Next, in Lemma 4, we will show that the last five terms (all except the first term
on the RHS) of (38) go to 0 in probability. Lastly, the first term on the RHS of
(38) can be written as
{ΣδL−1/2 [(In − (1/n)W A⊤)η]L}⊤{ΣδL−1/2 [(In − (1/n)W A⊤)η]L}.
Since η is Gaussian with mean 0 and variance-covariance matrix σ²In, by
linearity of the Gaussian distribution we have ΣδL−1/2 [(In − (1/n)W A⊤)η]L ∼ Nl(0, Il).
Hence, {ΣδL−1/2 [(In − (1/n)W A⊤)η]L}⊤{ΣδL−1/2 [(In − (1/n)W A⊤)η]L} ∼ χ²l.
Therefore, applying Slutsky's Theorem to (38), we have,

        {(δ̂W − δ∗)L}⊤ ΣδL−1 {(δ̂W − δ∗)L} →D χ²l.                           (39)

This completes the proof of (33). ■

Lemma 3 Given A is an n × p dimensional Rademacher matrix and W = copt A
is the optimal solution of Alg.6 with copt = 1 − µ3√(1 − n/p). Let K ⊂ [p] be
such that ∥K∥0 = k and L ⊂ [n] be such that ∥L∥0 = l, with both k and l fixed
as n, p → ∞. Furthermore, let (β̂W − β∗)K be as defined in (28) and ΣβK be as
defined in (30). If n log n is o(p) and n is ω(((s + r) log p)²), then we have,

1. {√n[(Ip − (1/n)W⊤A)(β∗ − β̂λ1)]K}⊤ ΣβK−1 {√n[(Ip − (1/n)W⊤A)(β∗ − β̂λ1)]K} →P 0.

2. {(1/√n)WK⊤(δ∗ − δ̂λ2)}⊤ ΣβK−1 {(1/√n)WK⊤(δ∗ − δ̂λ2)} →P 0.

3. 2{(1/√n)WK⊤η}⊤ ΣβK−1 {√n[(Ip − (1/n)W⊤A)(β∗ − β̂λ1)]K} →P 0.

4. 2{(1/√n)WK⊤η}⊤ ΣβK−1 {(1/√n)WK⊤(δ∗ − δ̂λ2)} →P 0.

5. 2{√n[(Ip − (1/n)W⊤A)(β∗ − β̂λ1)]K}⊤ ΣβK−1 {(1/√n)WK⊤(δ∗ − δ̂λ2)} →P 0.

Proof of Lemma 3:
Result 1.: We will expand the quadratic form and utilize the individual
probabilistic rates derived in [2] to prove the claim. We have
( )⊤ ( )
√ √
   
1 1 ⊤
n Ip − W ⊤ A ∗
(β − β̂λ1 ) ΣβK −1
n Ip − W A ∗
(β − β̂λ1 )
n K n K
( )( )
√ √
   
XX
−11 ⊤ ∗ 1 ⊤ ∗
= [ΣβK ]lj n Ip − W A (β − β̂λ1 ) n Ip − W A (β − β̂λ1 )
n j n l
j∈K l∈K
( ) ( )
√ √
   
XX 1 1
≤ [ΣβK −1 ]lj n Ip − W ⊤ A (β ∗ − β̂λ1 ) n Ip − W ⊤ A (β ∗ − β̂λ1 )
n j n l
j∈K l∈K
2 X X

 
1
≤ n Ip − W ⊤ A (β ∗ − β̂λ1 ) [ΣβK −1 ]lj (40)
n ∞ j∈K l∈K

We have from Lemma 5 that Σj∈K Σl∈K [ΣβK−1]lj = OP(1). Now, we have from
(23) of Theorem 3 of [2] that, under the given conditions,
∥√n(Ip − (1/n)W⊤A)(β∗ − β̂λ1)∥∞ = oP(1). Since k is fixed as n, p → ∞, the
proof is complete.
Result 2.: We follow the same process as in the proof of Result 1. We have
( )⊤ ( )
1   1  
√ WK ⊤ δ ∗ − δ̂λ2 ΣβK −1
√ WK ⊤ δ ∗ − δ̂λ2
n n
( )( )
XX
−1 1 ⊤


 1 ⊤



= [ΣβK ]lj √ W j δ − δ̂λ2 √ W l δ − δ̂λ2
n n
j∈K l∈K
( ) ( )
XX 1   1  
≤ [ΣβK −1 ]lj √ W⊤ δ ∗ − δ̂λ2 √ W⊤ δ ∗ − δ̂λ2
n j n l
j∈K l∈K

1   XX
≤ √ W ⊤ δ ∗ − δ̂λ2 [ΣβK −1 ]lj . (41)
n ∞ j∈K l∈K

We have from Lemma 5 that Σj∈K Σl∈K [ΣβK−1]lj = OP(1). Now, we have from
(24) of Theorem 3 of [2] that, under the given conditions,
∥(1/√n)W⊤(δ∗ − δ̂λ2)∥∞ = oP(1). Since k is fixed as n, p → ∞, the proof is
complete.
Result 3.: Recall that Z ≜ ΣβK−1/2 {(1/√n)WK⊤η} ∼ Nk(0, Ik). We have

( )⊤ ( )

 
1 1
2 √ WK ⊤ η ΣβK −1 n Ip − W ⊤ A (β ∗ − β̂λ1 )
n n K
( )
−1/2 √
 
⊤ 1 ⊤ ∗
= Z ΣβK n Ip − W A (β − β̂λ1 )
n K

 
XX
−1 1 ⊤
≤ [ΣβK ]lj |Zj | n Ip − W A (β ∗ − β̂λ1 )
n l
j∈K l∈K

 
1 X X
≤ n Ip − W ⊤ A (β ∗ − β̂λ1 ) [ΣβK −1 ]lj |Zj | . (42)
n ∞ j∈K l∈K

We have from (23) of Theorem 3 of [2] that, under the given conditions,
∥√n(Ip − (1/n)W⊤A)(β∗ − β̂λ1)∥∞ = oP(1). Since k is fixed and, from Lemma 5,
Σj∈K Σl∈K [ΣβK−1]lj = OP(1) as n, p → ∞, the proof is complete.

Result 4.: By similar arguments as in (42), we have,
( )⊤ ( )
1 1  
2 √ WK ⊤ η ΣβK −1 √ WK ⊤ δ ∗ − δ̂λ2
n n
( )
⊤ −1/2 1 ⊤



= Z ΣβK √ WK δ − δ̂λ2
n
XX 1  
≤ [ΣβK −1 ]lj |Zj | √ W ⊤ l δ ∗ − δ̂λ2
n
j∈K l∈K

1   XX
≤ √ W ⊤ δ ∗ − δ̂λ2 [ΣβK −1 ]lj |Zj | . (43)
n ∞ j∈K l∈K
 
We have from (24) of Theorem 3 of [2] that, under the given conditions,
∥(1/√n)W⊤(δ∗ − δ̂λ2)∥∞ = oP(1). Since k is fixed and, from Lemma 5,
Σj∈K Σl∈K [ΣβK−1]lj = OP(1) as n, p → ∞, the proof is complete.
Result 5.: Expanding the quadratic form, we have
( )⊤ ( )

 
1 ⊤ ∗ −1 1 ⊤



2 n Ip − W A (β − β̂λ1 ) ΣβK √ WK δ − δ̂λ2
n K n

XX 1   √  1 ⊤

−1 ⊤ ∗
≤ [ΣβK ]lj √ W l δ − δ̂λ2 n Ip − W A (β ∗ − β̂λ1 )
n n j
j∈K l∈K

 
1   1 X X
≤ √ W ⊤ δ ∗ − δ̂λ2 n Ip − W ⊤ A (β ∗ − β̂λ1 ) [ΣβK −1 ]lj .
n ∞ n ∞ j∈K l∈K
 
We have from (23) and (24) of Theorem 3 of [2] that, under the given conditions,
∥(1/√n)W⊤(δ∗ − δ̂λ2)∥∞ = oP(1) and ∥√n(Ip − (1/n)W⊤A)(β∗ − β̂λ1)∥∞ = oP(1).
Since k is fixed and, from Lemma 5, Σj∈K Σl∈K [ΣβK−1]lj = OP(1) as n, p → ∞,
the proof is complete.

Lemma 4 Given A is an n × p dimensional Rademacher matrix and W = copt A
is the optimal solution of Alg.6 with copt = 1 − µ3√(1 − n/p). Let K ⊂ [p] be
such that ∥K∥0 = k and L ⊂ [n] be such that ∥L∥0 = l, with l fixed as n, p → ∞.
Furthermore, let (δ̂W − δ∗)L be as defined in (29) and ΣδL be as defined in
(31). If n log n is o(p) and n is ω(((s + r) log p)²), then we have,

1. {[(In − (1/n)W A⊤)A(β∗ − β̂λ1)]L}⊤ ΣδL−1 {[(In − (1/n)W A⊤)A(β∗ − β̂λ1)]L} →P 0.

2. {(1/n)WL A⊤(δ∗ − δ̂λ2)}⊤ ΣδL−1 {(1/n)WL A⊤(δ∗ − δ̂λ2)} →P 0.

3. 2{[(In − (1/n)W A⊤)η]L}⊤ ΣδL−1 {[(In − (1/n)W A⊤)A(β∗ − β̂λ1)]L} →P 0.

4. 2{[(In − (1/n)W A⊤)η]L}⊤ ΣδL−1 {(1/n)WL A⊤(δ∗ − δ̂λ2)} →P 0.

5. 2{[(In − (1/n)W A⊤)A(β∗ − β̂λ1)]L}⊤ ΣδL−1 {(1/n)WL A⊤(δ∗ − δ̂λ2)} →P 0.


Proof of Lemma 4:
Result 1.: We will expand the quadratic form and utilize the individual
probabilistic rates derived in [2] to prove the claim. We have
(  )⊤ (  )
1 ⊤ ∗ −1 1 ⊤ ∗
In − W A A(β − β̂λ1 ) ΣδL In − W A A(β − β̂λ1 )
n L n L
(  )(   )
XX
−1 1 ⊤ ∗ 1 ⊤ ∗
= [ΣδL ]ik In − W A A(β − β̂λ1 ) In − W A A(β − β̂λ1 )
n i n k
i∈L k∈L
(  ) (  )
XX
−1 1 ⊤ ∗ 1 ⊤ ∗
≤ [ΣδL ]ik In − W A A(β − β̂λ1 ) In − W A A(β − β̂λ1 )
n i n k
i∈L k∈L
  2
n 1 ⊤ 1 XX
≤ p In − W A ∗
A(β − β̂λ1 ) n 2 [ΣδL −1 ]ik . (44)
p 1 − n/p n 2
∞ p (1−n/p) i∈L k∈L
We have from Lemma 5 that (p²(1 − n/p)/n²) Σi∈L Σk∈L [ΣδL−1]ik = OP(1). Now,
we have from (25) of Theorem 3 of [2] that, under the given conditions,
∥(In − (1/n)W A⊤)A(β∗ − β̂λ1)∥∞ = oP(1). Since l is fixed as n, p → ∞, the
proof is complete.
Result 2.: We follow the same process as in the proof of Result 1. We have
 ⊤
( ) ( )
1 ⊤

∗ −1 1 ⊤



WL A δ − δ̂λ2 ΣδL WL A δ − δ̂λ2
n n
( ) ( )
XX
−1 1 ⊤


 1 ⊤



≤ [ΣδL ]ik Wi A δ − δ̂λ2 Wk A δ − δ̂λ2
n n
i∈L k∈L
2
n 1   1 XX
≤ p W A⊤ δ ∗ − δ̂λ2 n2
[ΣδL −1 ]ik (45)
p n/p n ∞ p2 (1−n/p) i∈L k∈L

 
We have from (26) of Theorem 3 of [2] that, under the given conditions,
(n/(p√(1 − n/p))) ∥(1/n)W A⊤(δ∗ − δ̂λ2)∥∞ = oP(1). Since l is fixed and, from
Lemma 5, (p²(1 − n/p)/n²) Σi∈L Σk∈L [ΣδL−1]ik = OP(1) as n, p → ∞, the proof
is complete.
Result 3.: Recall that Z = ΣδL−1/2 {[(In − (1/n)W A⊤)η]L} ∼ Nl(0, Il). We
have
(  )⊤ (  )
1 1
2 In − W A⊤ η ΣδL −1 In − W A⊤ A(β ∗ − β̂λ1 )
n L n L
(  )
⊤ −1 1 ⊤
= Z ΣδL In − W A A(β ∗ − β̂λ1 )
n L
 
XX
−1 1 ⊤
≤ [ΣδL ]ik |Zi | In − W A A(β ∗ − β̂λ1 )
n k
i∈L k∈L
 
1 ⊤
XX
≤ In − W A A(β ∗ − β̂λ1 ) [ΣδL −1 ]ik |Zj |
n ∞ i∈L k∈L
 
n 1 1 XX
≤ p In − W A⊤ A(β ∗ − β̂λ1 ) 3 n 2 [ΣδL −1 ](46)
ik .
p 1 − n/p n p2 (1−n/p) ∞ i∈L k∈L

 
We have from (25) of Theorem 3 of [2] that, under the given conditions,
(n/(p√(1 − n/p))) ∥(In − (1/n)W A⊤)A(β∗ − β̂λ1)∥∞ = oP(1). Since l is fixed
and, from Lemma 5, (p²(1 − n/p)/n²) Σi∈L Σk∈L [ΣδL−1]ik = OP(1) as n, p → ∞,
the proof is complete.
Result 4.: By similar arguments as in (46), we have,
(  )⊤ ( )
1 ⊤ −1 1 ⊤



2 In − W A η ΣδL WL A δ − δ̂λ2
n L n
( )
−1/2 1
 
⊤ ⊤ ∗
= Z ΣδL WL A δ − δ̂λ2
n
XX 1  
≤ [ΣδL −1/2 ]ik |Zi | Wk A⊤ δ ∗ − δ̂λ2
n
i∈L k∈L
np 1   XX
≤ 1 − n/p W A⊤ δ ∗ − δ̂λ2 [ΣδL −1/2 ]ik |Zj |
p n ∞ i∈L k∈L

n 1   1 XX
≤ p W A⊤ δ ∗ − δ̂λ2 3 n2
[ΣδL −1 ]ik (47)
.
p 1 − n/p n p2 (1−n/p)
∞ i∈L k∈L
 
We have from (26) of Theorem 3 of [2] that, under the given conditions,
(n/(p√(1 − n/p))) ∥(1/n)W A⊤(δ∗ − δ̂λ2)∥∞ = oP(1). Since l is fixed and, from
Lemma 5, (p²(1 − n/p)/n²) Σi∈L Σk∈L [ΣδL−1]ik = OP(1) as n, p → ∞, the proof
is complete.
Result 5.: Expanding the quadratic form, we have

2 {(Iₙ − (1/n) W A⊤) A(β* − β̂_λ1)}_L⊤ Σ_δL⁻¹ {(1/n) W_L A⊤(δ* − δ̂_λ2)}
≤ 2 Σ_{i∈L} Σ_{k∈L} [Σ_δL⁻¹]_{ik} |{(Iₙ − (1/n) W A⊤) A(β* − β̂_λ1)}_i| |(1/n) W_k A⊤(δ* − δ̂_λ2)|
≤ 2 (n/(p√(1−n/p))) ∥(Iₙ − (1/n) W A⊤) A(β* − β̂_λ1)∥_∞ · (n/(p√(1−n/p))) ∥(1/n) W A⊤(δ* − δ̂_λ2)∥_∞ · (p²(1−n/p)/n²) Σ_{i∈L} Σ_{k∈L} [Σ_δL⁻¹]_{ik}.

We have from (25) and (26) of Theorem 3 of [2], under the given conditions, (n/(p√(1−n/p))) ∥(Iₙ − (1/n) W A⊤) A(β* − β̂_λ1)∥_∞ = o_P(1) and (n/(p√(1−n/p))) ∥(1/n) W A⊤(δ* − δ̂_λ2)∥_∞ = o_P(1). Since k is fixed and we have from Lemma 5 that (p²(1−n/p)/n²) Σ_{i∈L} Σ_{k∈L} [Σ_δL⁻¹]_{ik} = O_P(1) as n, p → ∞, the proof is complete.

Lemma 5 Let A be an n × p dimensional Rademacher matrix and let W = c_opt A be the optimal solution of Alg. 6 with c_opt = (1 − (µ/3)√(1 − n/p))². Let K ⊂ [p] be such that |K| = k and L ⊂ [n] be such that |L| = l, with both k and l fixed as n, p → ∞. Furthermore, let Σ_βK and Σ_δL be as defined in (30) and (31) respectively. Then, we have:

1. P( Σ_{j∈K} Σ_{l∈K} [Σ_βK⁻¹]_{lj} ≤ c₆² k² / (ψ² c_opt² (1 − √((k−1)/n))²) ) ≥ 1 − (ψ^{n−k+1} + c₅^p). (48)

2. P( (p²(1−n/p)/n²) Σ_{i∈L} Σ_{k∈L} [Σ_δL⁻¹]_{ik} ≤ l²(1 − n/p) / (c_opt ε₁²(1 − √(n/p))² − n/p)² ) ≥ 1 − ((c₆ε₁)^{n−k+1} + c₅^p). (49)

Furthermore, if n is o(p), then Σ_{j∈K} Σ_{l∈K} [Σ_βK⁻¹]_{lj} = O_P(1) and (p²(1−n/p)/n²) Σ_{i∈L} Σ_{k∈L} [Σ_δL⁻¹]_{ik} = O_P(1). ■

Proof of Lemma 5:
Result 1: To prove this, we will exploit the singular value bounds of the Rademacher matrix. First we bound the sum by the maximal element as follows:

Σ_{j∈K} Σ_{l∈K} [Σ_βK⁻¹]_{lj} ≤ k² max_{l,j∈K} |[Σ_βK⁻¹]_{lj}|. (50)

Now we bound the maximal element using the inequality

max_{l,j∈K} |[Σ_βK⁻¹]_{lj}| ≤ ∥Σ_βK⁻¹∥₂. (51)

Note that by definition, we have,

∥Σ_βK⁻¹∥₂ = σ_max(Σ_βK⁻¹) = 1/σ_min(Σ_βK) = 1 / ((c_opt/√n) σ_min(A_K))². (52)

Note that, for an arbitrary ε₁ > 0, using Theorem 1.1 of [4] for the mean-zero sub-Gaussian random matrix A, we have the following:

P( (c_opt/√n) σ_min(A_K) ≤ c_opt ε₁ (1 − √((k−1)/n)) ) ≤ (c₆ε₁)^{n−k+1} + (c₅)^p, (53)

where c₆ > 0 and c₅ ∈ (0, 1) are constants depending on the sub-Gaussian norm of the entries of A. For some small constant ψ ∈ (0, 1), let ε₁c₆ ≜ ψ. We have

P( σ_max(Σ_βK⁻¹) ≤ c₆² / (c_opt² ψ² (1 − √((k−1)/n))²) ) ≥ 1 − (ψ^{n−k+1} + (c₅)^p). (54)

Therefore, using (54), (52), (51) and (50), we get

P( Σ_{j∈K} Σ_{l∈K} [Σ_βK⁻¹]_{lj} ≤ c₆² k² / (ψ² c_opt² (1 − √((k−1)/n))²) ) ≥ 1 − (ψ^{n−k+1} + (c₅)^p). (55)

This completes the proof of Result 1.
Result 2: We approach this proof the same way as the proof of Result 1. Using the same steps as (50), (51) and (52), we have

(p²(1−n/p)/n²) Σ_{i∈L} Σ_{k∈L} [Σ_δL⁻¹]_{ik} ≤ l² / ( (n²/(p²(1−n/p))) σ_min( (W A⊤/n − Iₙ)_L )² ). (56)

For an arbitrary ε₁ > 0, using Theorem 1.1 of [4] for the mean-zero sub-Gaussian random matrix A, we have the following:

P( (c_opt/√p) σ_min(A⊤) ≤ c_opt ε₁ (1 − √(n/p)) ) ≤ (c₆ε₁)^{n−k+1} + (c₅)^p, (57)

where c₆ > 0 and c₅ ∈ (0, 1) are constants depending on the sub-Gaussian norm of the entries of A. Since W A⊤/n = c_opt A A⊤/n, we have

P( σ_min(W A⊤/n) ≤ c_opt ε₁² (√(p/n) − 1)² ) ≤ (c₆ε₁)^{n−k+1} + (c₅)^p. (58)

Further, since σ_min(W A⊤/n − Iₙ) ≥ σ_min(W A⊤/n) − 1 and σ_min((W A⊤/n − Iₙ)_L) ≥ σ_min(W A⊤/n − Iₙ), we have from (58)

P( σ_min((W A⊤/n − Iₙ)_L) ≤ c_opt ε₁² (√(p/n) − 1)² − 1 ) ≤ (c₆ε₁)^{n−k+1} + (c₅)^p. (59)

Then, we have from (56) and (59),

P( (p²(1−n/p)/n²) Σ_{i∈L} Σ_{k∈L} [Σ_δL⁻¹]_{ik} ≤ l² / ( (n²/(p²(1−n/p))) (c_opt ε₁² (√(p/n) − 1)² − 1)² ) ) ≥ 1 − {(c₆ε₁)^{n−k+1} + (c₅)^p}. (60)

This completes the proof. ■

8 Reconstruction error of β post-correction (ongoing)

Recall that for the true bit-flipped matrix Â, we have Â = A + ∆A, where ∆A represents the error matrix due to bit-flips. Note that, after correction, we obtain the corrected matrix Ã. Let the corrected matrix be such that

Ã = A + ∆₁A,

where ∆₁A is the error matrix representing the MMEs remaining post-correction. Hence, from (??), we have,

y = Âβ* + η = Ãβ* + (∆A − ∆₁A)β* + η = Ãβ* + δ̃* + η, (61)

where δ̃* ≜ (∆A − ∆₁A)β* is the post-correction bit-flip vector. Here, δ̃* is also a sparse vector; however, its sparsity level is a random quantity given by r̃ = ∥δ̃*∥₀. Here also, we estimate β* and δ̃* using the Robust Lasso estimator given as follows:

(β̃_λ1, δ̃_λ2) = arg min_{β,δ} (1/2n) ∥y − Ãβ − δ∥₂² + λ₁∥β∥₁ + λ₂∥δ∥₁, (62)

where λ₁, λ₂ are appropriately chosen regularization parameters.


In this subsection, we will attempt to derive an upper bound on the reconstruction error of β̃_λ1 and δ̃_λ2. In order to do that, we will first show that r̃ < r with high probability, where r is the sparsity level of the original MME vector δ*. Next, we will need to show that the corrected matrix Ã satisfies the Extended Restricted Eigenvalue Condition (EREC) as defined in [2].

Let us define the following sets: R = {i ∈ [n] : δ*_i ≠ 0} and R̃ = {i ∈ [n] : δ̃*_i ≠ 0}. Recall that J is the set of all indices detected to have bit-flips by the hypothesis test, and let k be the cardinality of J. In order to show r̃ < r with high probability, we will show that R̃ ⊂ R with high probability, since R̃ ⊂ R implies r̃ < r. Let us first define the power of the test, which we denote by γ_{n,p}. Hence, we have,

γ_{n,p} = P(rejecting a false null hypothesis) = P(|T_{H,i}| > z_{α/2} | δ*_i ≠ 0)
= 1 − P(−z_{α/2} ≤ T_{H,i} ≤ z_{α/2} | δ*_i ≠ 0)
= 1 − P( −z_{α/2} − δ*_i/(σ√Σ_δii) ≤ (δ̂_{W,i} − δ*_i)/(σ√Σ_δii) ≤ z_{α/2} − δ*_i/(σ√Σ_δii) | δ*_i ≠ 0 )
= 1 − { Φ( z_{α/2} − δ*_i/(σ√Σ_δii) ) − Φ( −z_{α/2} − δ*_i/(σ√Σ_δii) ) }
= 1 − { Φ( z_{α/2} − (n/(p√(1−n/p))) δ*_i / ((n/(p√(1−n/p))) σ√Σ_δii) ) − Φ( −z_{α/2} − (n/(p√(1−n/p))) δ*_i / ((n/(p√(1−n/p))) σ√Σ_δii) ) }. (63)

Note that, from result 2 of Theorem 2 of [2], (n/(p√(1−n/p))) σ√Σ_δii → σ as n, p → ∞. Furthermore, z_{α/2} is a constant for a given α. In order to ensure that the power γ_{n,p} goes to 1 as n, p → ∞, we need (n/(p√(1−n/p))) |δ*_i| / ((n/(p√(1−n/p))) σ√Σ_δii) → ∞ as n, p → ∞. In the limit, this means we need (n/(p√(1−n/p))) |δ*_i| / σ → ∞ as n, p → ∞. Hence, under the condition min_{i∈R} |δ*_i| = ω( (p√(1−n/p)/n) σ ), we have that

lim_{n,p→∞} γ_{n,p} = 1 − {Φ(−∞) − Φ(−∞)} = 1. (64)

Hence, we have that the power of the test for δ goes to 1 as n, p → ∞ under the condition min_{i∈R} |δ*_i| = ω( (p√(1−n/p)/n) σ ).

Now, recall that, by construction, δ* = ∆Aβ*. Here ∆A, being the error matrix, is sparse. In fact, based on our assumption that there can be at most one pair of bit-flips per row, min_{i∈R} |δ*_i| = O( min_{j∈S} |β*_j| ). Hence, the assumption required for the power to go to 1 is

min_{j∈S} |β*_j| = ω( (p√(1−n/p)/n) σ ).
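The limiting power expression in (63)–(64) can be evaluated numerically. The sketch below is our illustration, not part of the paper's code: it writes the power in terms of the scaled signal s = (n/(p√(1−n/p)))|δ*_i|/σ, using the limit (n/(p√(1−n/p)))σ√Σ_δii → σ, with α = 0.05; the names `phi` and `power` are ours.

```python
import math

def phi(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power(s):
    # Limiting power of the two-sided test in (63)-(64), as a function of
    # the scaled signal s = (n / (p * sqrt(1 - n/p))) * |delta_i^*| / sigma.
    z = 1.959963984540054  # z_{alpha/2} for alpha = 0.05
    return 1.0 - (phi(z - s) - phi(-z - s))

# At s = 0 the power equals the level alpha; it increases monotonically
# to 1 as s grows, which is the omega(...) condition derived above.
vals = [power(s) for s in (0.0, 1.0, 3.0, 6.0)]
assert abs(vals[0] - 0.05) < 1e-9
assert all(a < b for a, b in zip(vals, vals[1:]))
assert vals[-1] > 0.999
```

The condition min_{j∈S}|β*_j| = ω((p√(1−n/p)/n)σ) is exactly the requirement that this scaled signal diverges.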

8.1 New Correction Algorithm

Algorithm 7 Correction for bit-flips following the Random Switch Model (RSM) using W = A

Require: Measurement vector y, pooling matrix A, Lasso estimate β̂_init, λ and the set J of corrupted measurements estimated by the ODrlt method, where

β̂_init = arg min_β (1/2n) ∥y_{J^c} − A_{J^c} β∥₂² + λ∥β∥₁. (65)

Ensure: Bit-flip corrected matrix Ã

1: for every i ∈ J do
2: Set bf1 := −1, bf2 := −1 (bit-flip flags), minval := ∥y_J − A_J β̂_init∥₂.
3: for every j ∈ [p − 1] do
4: for l ∈ [p] do
5: if {A_ij == −1 and A_il == 1} or {A_ij == 1 and A_il == −1} then
6: A_new = A.
7: if A_ij == 1 then
8: A_ij = −1, A_il = 1.
9: else if A_ij == −1 then
10: A_ij = 1, A_il = −1.
11: end if
12: Find the solution β̂ to the convex program given by:

β̂ = arg min_β (1/2n) ∥y_{J^c} − A_{J^c} β∥₂² + λ∥β∥₁. (66)

13: Calculate the test error given by testval = ∥y_J − A_J β̂∥₂.
14: if minval ≥ testval then
15: set bf1 := j, bf2 := l, minval := testval.
16: end if
17: if bf1 != −1 and bf2 != −1 then {a candidate bit-flip pair (bf1, bf2) has been recorded}
18: A_ij = −A_ij, A_il = −A_il {reverse the trial swap at (A_ij, A_il)}, J = J − {i}
19: Recompute

β̂_init = arg min_β (1/2n) ∥y_{J^c} − A_{J^c} β∥₂² + λ∥β∥₁. (67)

20: end if
21: end for
22: end for
23: end for
24: return Ã = A.
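The inner swap search of Alg. 7 can be illustrated in miniature. The sketch below is a simplification under stated assumptions: it works on a single row, and it replaces the Lasso refit in (66) by a plain residual check against a known β, so that it stays self-contained. The helper `find_swap` and the toy data are hypothetical names and values, not part of the paper.

```python
def find_swap(a_row, y_i, beta):
    # Enumerate opposite-sign pairs (j, l) in one row of a {-1,+1} pooling
    # matrix, as in the inner loops of Alg. 7, and return the pair whose
    # trial swap best explains measurement y_i.  The test error here is the
    # residual |y_i - <trial_row, beta>| for a KNOWN beta -- an assumption
    # replacing the Lasso refit (66) of the actual algorithm.
    p = len(a_row)
    best = (None, None)
    minval = abs(y_i - sum(a * b for a, b in zip(a_row, beta)))
    for j in range(p - 1):
        for l in range(j + 1, p):
            if a_row[j] == -a_row[l]:            # swappable sign pair
                trial = list(a_row)
                trial[j], trial[l] = -trial[j], -trial[l]
                testval = abs(y_i - sum(a * b for a, b in zip(trial, beta)))
                if testval < minval:             # strictly better fit
                    minval, best = testval, (j, l)
    return best

# Toy row with a known sparse beta (hypothetical example values).
beta = [1.0, 0.0, 2.0, 0.0, 0.0, 1.5, 0.0, 0.0]
row = [1, -1, 1, -1, 1, -1, 1, -1]
y_i = sum(a * b for a, b in zip(row, beta))      # clean measurement
corrupted = list(row)
corrupted[0], corrupted[1] = -row[0], -row[1]    # induce an RSM flip at (0, 1)
assert find_swap(corrupted, y_i, beta) == (0, 1)
```

With a single measurement, several swap pairs can be indistinguishable (any pair touching only zero coordinates of β changes nothing); the full algorithm disambiguates by refitting on all measurements outside J.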

8.2 Rough notes
We will now evaluate the probability that the set of indices with effective bit-flips post-correction is a subset of the set of indices with effective bit-flips initially. This event is represented by R̃ ⊆ R. To find the probability of R̃ ⊆ R, we condition on the event that all the effective bit-flips were detected in the detection stage by the marginal tests, i.e., we condition R̃ ⊆ R on R ⊆ J. Hence, using the law of total probability, we have,

P(R̃ ⊆ R) ≥ P(R̃ ⊆ R | R ⊆ J) P(R ⊆ J). (68)

We will now evaluate both probabilities on the R.H.S. of (68) separately. Note that {R ⊆ J} implies that for every index belonging to R, the test statistic |T_{H,i}| is rejected. Hence, the event {R ⊆ J} is equivalent to the event {R ∩ J^c = ∅}, which states that none of the elements of R belongs to J^c. This in turn is equivalent to {∪_{i∈R} {|T_{H,i}| ≤ z_{α/2} | δ*_i ≠ 0}}^c = ∩_{i∈R} {|T_{H,i}| > z_{α/2} | δ*_i ≠ 0}. Hence, we have,

P(R ⊆ J) = P( ∩_{i∈R} {|T_{H,i}| > z_{α/2} | δ*_i ≠ 0} ) = 1 − P( ∪_{i∈R} {|T_{H,i}| ≤ z_{α/2} | δ*_i ≠ 0} )
≥ 1 − Σ_{i∈R} P( |T_{H,i}| ≤ z_{α/2} | δ*_i ≠ 0 ) = 1 − r(1 − γ_{n,p}). (69)

Here, γ_{n,p} is the power of the test for a given n, p. Now, we evaluate P(R̃ ⊆ R | R ⊆ J). Using Lemma 6, we have,

P(R̃ ⊆ R | R ⊆ J) = P(R̃ ∩ R^c ∩ J = ∅). (70)

We will now define some notation. Recall that in the correction algorithm Alg. 3, for each i ∈ J, for all j₁ ∈ [p − 1] and j₂ = j₁ + 1, …, p, we swap the elements a_{ij₁} and a_{ij₂}. Let us denote the new bit-flip error at this location as δ*_i(j₁, j₂). Then we perform Lasso followed by debiasing to obtain the debiased lasso estimate and its corresponding test statistic, denoted by T_{H,i}(j₁, j₂). Note that, here, we assume that a wrong swap does create a non-zero bit-flip error δ*_i(j₁, j₂). Lastly, let us define the set of tuples K = {(j₁, j₂) : j₁ ∈ [p − 1], j₂ ∈ {j₁ + 1, …, p}, a_{ij₁} ≠ a_{ij₂} and (β*_{j₁} ≠ 0 or β*_{j₂} ≠ 0)}.

Note that the event {R̃ ∩ R^c ∩ J = ∅} implies that none of the elements of J \ R are in R̃. Hence, the event {R̃ ∩ R^c ∩ J = ∅} implies that for all i ∈ J \ R, the hypothesis test w.r.t. T_{H,i}(j₁, j₂) is rejected for all (j₁, j₂). Hence,

we have,

P(R̃ ∩ R^c ∩ J = ∅) = P( ∩_{i∈J\R} ∩_{(j₁,j₂)∈K} { |T_{H,i}(j₁, j₂)| > z_{α/2} | δ*_i(j₁, j₂) ≠ 0 } )
= 1 − P( ∪_{i∈J\R} ∪_{(j₁,j₂)∈K} { |T_{H,i}(j₁, j₂)| ≤ z_{α/2} | δ*_i(j₁, j₂) ≠ 0 } )
≥ 1 − Σ_{i∈J\R} Σ_{(j₁,j₂)∈K} P( |T_{H,i}(j₁, j₂)| ≤ z_{α/2} | δ*_i(j₁, j₂) ≠ 0 )
= 1 − Σ_{i∈J\R} Σ_{(j₁,j₂)∈K} (1 − γ_{n,p})
≥ 1 − ((p − 1)(p − 2)/2) (k − r)(1 − γ_{n,p}). (71)

Joining (68), (69) and (71), we get,

P(R̃ ⊆ R) ≥ {1 − r(1 − γ_{n,p})} {1 − ((p − 1)(p − 2)/2)(k − r)(1 − γ_{n,p})}. (72)

Under the condition min_{j∈S}|β*_j| = ω( (p√(1−n/p)/n) σ ), we have γ_{n,p} → 1 as n, p → ∞. Since γ_{n,p} → 1 at an exponential rate, 1 − r(1 − γ_{n,p}) → 1 and 1 − ((p − 1)(p − 2)/2)(k − r)(1 − γ_{n,p}) → 1 as n, p → ∞. Hence, from (72), under this condition, we have P(R̃ ⊆ R) → 1 as n, p → ∞.
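The bound (72) is simple arithmetic once γ_{n,p} is known. The sketch below (ours, with arbitrary example values r = 3, k = 5, p = 100) shows that the bound is vacuous unless 1 − γ_{n,p} is much smaller than 2/((p − 1)(p − 2)(k − r)), in line with the exponential-rate requirement above.

```python
def subset_prob_lower_bound(r, k, p, gamma):
    # Lower bound (72):
    # P(R_tilde subset of R) >= (1 - r(1-gamma)) * (1 - (p-1)(p-2)/2 * (k-r)(1-gamma))
    miss = 1.0 - gamma
    return (1 - r * miss) * (1 - (p - 1) * (p - 2) / 2 * (k - r) * miss)

assert subset_prob_lower_bound(3, 5, 100, 1.0) == 1.0        # perfect power
assert subset_prob_lower_bound(3, 5, 100, 1 - 1e-8) > 0.999  # near-perfect power
assert subset_prob_lower_bound(3, 5, 100, 0.99) < 0          # bound is vacuous
```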

Lemma 6 Let A, B, C be sets. The event (C \ B) ∩ A = ∅ is equivalent to the


conditional statement A ⊆ B | B ⊆ C.

Proof of Lemma 6
If (C \ B) ∩ A = ∅, then A ⊆ B | B ⊆ C.
Assume (C \ B) ∩ A = ∅. This implies that no element of A belongs to C \ B,
i.e., for all x ∈ A:
x∈/ C \ B =⇒ x ∈ B ∪ (U \ C),
where U is the universal set. Now, assume B ⊆ C. Since x ∈ B implies x ∈ C,
the complement U \ C does not contain elements of A. Thus, every x ∈ A must
be in B, implying:
A ⊆ B.
Hence, A ⊆ B | B ⊆ C holds.
If A ⊆ B | B ⊆ C, then (C \ B) ∩ A = ∅. Assume B ⊆ C and A ⊆ B. If
B ⊆ C, then every element of B is in C, so A ⊆ B implies A ⊆ C. Additionally,
since A ⊆ B, no element of A can be outside B. Therefore, no element of A can
belong to C \ B, and we have:

(C \ B) ∩ A = ∅.

This concludes the proof. ■

9 Miscl.

Let z = a + ib = r_z e^{iθ_z} and w = c + id = r_w e^{iθ_w}, where r_z = √(a² + b²) and r_w = √(c² + d²) represent the magnitudes of z and w, and θ_z, θ_w are their respective phases.
The squared magnitude of the difference is given by:

|z − w|² = (a − c)² + (b − d)².

In polar form, expand |z − w|2 as follows:

|z − w|² = |r_z e^{iθ_z} − r_w e^{iθ_w}|².

Expanding, we get,
2 2
 
|z − w|2 = rz eiθz + rw eiθw − 2Re rz eiθz · rw eiθw

2 2
Simplify each term, we have, rz eiθz = rz2 , rw eiθw = rw 2
, The cross term
involves the complex conjugate:
   
Re rz eiθz · rw eiθw = Re rz rw ei(θz −θw ) = rz rw cos(θz − θw ).

Substituting back into the expression, we get,

|z − w|² = r_z² + r_w² − 2 r_z r_w cos(θ_z − θ_w).
Finally, using r_z = √(a² + b²) and r_w = √(c² + d²), we have,

|z − w|² = (a² + b²) + (c² + d²) − 2 √(a² + b²) √(c² + d²) cos(θ_z − θ_w).

This expression shows that the phase difference θz −θw is explicitly captured
in the term cos(θz − θw ).
We now show that

|z − w|² = (a − c)² + (b − d)²,

which is the Cartesian form.


Recall that the squared magnitude of the difference is:

|z − w|² = r_z² + r_w² − 2 r_z r_w cos(θ_z − θ_w),

where r_z = √(a² + b²) and r_w = √(c² + d²), and the phases are θ_z = arctan(b/a) and θ_w = arctan(d/c).

In Cartesian coordinates, the difference z − w is:

z − w = (a − c) + i(b − d).

The squared magnitude is:

|z − w|² = |(a − c) + i(b − d)|².

The magnitude of a complex number u + iv satisfies |u + iv|² = u² + v². Substituting u = a − c and v = b − d, we have,

|z − w|² = (a − c)² + (b − d)².

In polar form, the expression includes the cosine term:

|z − w|² = r_z² + r_w² − 2 r_z r_w cos(θ_z − θ_w).

Expand r_z² = a² + b² and r_w² = c² + d², and use the relationship:

cos(θ_z − θ_w) = Re(z w̄) / (|z||w|) = (ac + bd) / (√(a² + b²) √(c² + d²)).

Substitute these into the polar form and simplify:

|z − w|² = (a² + b²) + (c² + d²) − 2 √(a² + b²) √(c² + d²) · (ac + bd) / (√(a² + b²) √(c² + d²)).

Cancelling the square roots:

|z − w|² = (a² + b²) + (c² + d²) − 2(ac + bd).

Combining terms:

|z − w|² = (a − c)² + (b − d)².
This shows the equivalence between the polar and Cartesian forms.
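As a quick sanity check on this equivalence, the snippet below (ours, not from the paper) compares the Cartesian and polar expressions for |z − w|² on a few sample points.

```python
import cmath
import math

def sq_dist_cartesian(z, w):
    # |z - w|^2 = (a - c)^2 + (b - d)^2
    return (z.real - w.real) ** 2 + (z.imag - w.imag) ** 2

def sq_dist_polar(z, w):
    # |z - w|^2 = r_z^2 + r_w^2 - 2 r_z r_w cos(theta_z - theta_w)
    rz, tz = cmath.polar(z)
    rw, tw = cmath.polar(w)
    return rz ** 2 + rw ** 2 - 2.0 * rz * rw * math.cos(tz - tw)

for z, w in [(3 + 4j, 1 - 2j), (-2 + 1j, 0.5 + 0.5j), (1j, -1j)]:
    assert abs(sq_dist_cartesian(z, w) - sq_dist_polar(z, w)) < 1e-9
    assert abs(sq_dist_cartesian(z, w) - abs(z - w) ** 2) < 1e-9
```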

Lemma 7 The Fenchel dual of the problem

min_{w∈R^m} f(w) + g( (1/n) A⊤w ),

where f(·) and g(·) are convex functions and A ∈ R^{m×d} is a matrix, is given by:

max_{u∈R^d} −f*( Au/n ) − g*(−u),

where f*(·) and g*(·) are the Fenchel conjugates of f(·) and g(·), respectively.

Proof The primal problem is:

min_w f(w) + g( (1/n) A⊤w ).

Introduce an auxiliary variable v = (1/n) A⊤w, transforming the problem into:

min_{w,v} f(w) + g(v)  s.t.  v = (1/n) A⊤w.

The Lagrangian for this problem is:

L(w, v, u) = f(w) + g(v) + u⊤( v − (1/n) A⊤w ),

where u ∈ R^d is the Lagrange multiplier. The dual function is obtained by minimizing L(w, v, u) over w and v.

Step 1: Minimization with respect to w. The terms involving w in the Lagrangian are f(w) − (1/n) u⊤A⊤w. This minimization yields the Fenchel conjugate of f:

inf_w { f(w) − (1/n) u⊤A⊤w } = −f*( Au/n ).

Step 2: Minimization with respect to v. The terms involving v in the Lagrangian are g(v) + u⊤v. This minimization yields the Fenchel conjugate of g:

inf_v { g(v) + u⊤v } = −g*(−u).

Step 3: Dual problem. Combining the results from the two steps, the dual function is:

D(u) = −f*( Au/n ) − g*(−u).

The dual problem is to maximize D(u), i.e.:

max_{u∈R^d} −f*( Au/n ) − g*(−u). ■

10 Optimal solution for W - looser assumption

Theorem 3 Let A be an n × p matrix with d = min_{j∈[p]} { ∥a_{·j}∥₂²/n }. Let the coherence of A be

ν(A) = max_{i≠j} (1/n) |⟨a_{·i}, a_{·j}⟩|.

Consider the optimisation problem given in (3) as follows:

min_w (1/n) ∥w∥₂²  subject to  ∥(1/n) A⊺w − e_j∥_∞ ≤ µ,

and assume that 1 > µ ≥ ν(A)/(d + ν(A)). Then w_j = ( n(1 − µ)/∥a_{·j}∥₂² ) a_{·j} is an optimal solution.

Proof of Theorem 3:

Primal feasibility: From the assumption 1 > µ ≥ ν/(d + ν), we have dµ + µν ≥ ν, which implies (1 − µ)ν/d ≤ µ. Hence, this choice of w_j is primal feasible since

∥ (1/n) A⊺ ( (1 − µ)/(∥a_{·j}∥₂²/n) ) a_{·j} − e_j ∥_∞ ≤ max{ |−µ|, |(1 − µ)ν/d| } = µ.

(The j-th coordinate equals (1 − µ) − 1 = −µ, and every other coordinate is at most (1 − µ)ν/d in absolute value.)

Primal objective function value: The primal objective function value is

(1/n) ∥w_j∥₂² = (1 − µ)² / (∥a_{·j}∥₂²/n).

The Fenchel dual problem: One way to derive the dual is via Fenchel duality (Lemma 7). This tells us that given an optimisation problem of the form

min_w f(w) + g( (1/n) A⊺w ),

where f and g are convex, the Fenchel dual is

max_y −f*( Ay/n ) − g*(−y),

where f* and g* are the convex conjugates of f and g, respectively. In our setting we can take

f(w) = (1/n) ∥w∥₂²  and  g(w) = 0 if ∥w − e_j∥_∞ ≤ µ, and ∞ otherwise.

Then we have

f*(u) = sup_w ⟨u, w⟩ − f(w) = (n/4) ∥u∥₂²

and

g*(u) = sup_w ⟨u, w⟩ − g(w) = sup_{∥w−e_j∥_∞ ≤ µ} ⟨u, w⟩ = u_j + µ∥u∥₁.

This gives a dual problem of the form

sup_y −(1/(4n)) y⊺A⊺Ay + y_j − µ∥y∥₁.

Dual feasibility: The point y = ( 2(1 − µ)/(∥a_{·j}∥₂²/n) ) e_j is (trivially) feasible for the dual.

Dual objective function value: The corresponding dual objective function value is

−(1/(4n)) · ( 4(1 − µ)²/(∥a_{·j}∥₂²/n)² ) ∥a_{·j}∥₂² + 2(1 − µ)/(∥a_{·j}∥₂²/n) − µ · 2(1 − µ)/(∥a_{·j}∥₂²/n)
= −(1 − µ)²/(∥a_{·j}∥₂²/n) + 2(1 − µ)²/(∥a_{·j}∥₂²/n) = (1 − µ)²/(∥a_{·j}∥₂²/n).

Since the primal and the dual objective values are equal, this establishes that an optimal solution for the primal is ( (1 − µ)/(∥a_{·j}∥₂²/n) ) a_{·j} and an optimal solution to the dual is ( 2(1 − µ)/(∥a_{·j}∥₂²/n) ) e_j. This completes the proof. ■
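The proof can be checked numerically on a small example. The sketch below is our illustration (the matrix and the helper name `primal_dual_check` are ours): for a 4 × 3 matrix of ±1 columns with d = 1 and ν(A) = 1/2, it verifies that the candidate w_j is primal feasible for µ = 0.4 ≥ ν/(d + ν) = 1/3, and that the primal and dual objective values coincide at (1 − µ)²/(∥a_{·j}∥₂²/n).

```python
def primal_dual_check(A, j, mu):
    # A: list of columns (each a list of +-1 entries).  Verify Theorem 3's
    # candidate w_j = n(1-mu)/||a_j||^2 * a_j: feasibility of the primal
    # constraint, and equality of primal value (1/n)||w||^2 with the dual
    # value at y = (2(1-mu)/(||a_j||^2/n)) e_j.
    n, p = len(A[0]), len(A)
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    aj = A[j]
    w = [n * (1 - mu) / dot(aj, aj) * x for x in aj]
    # constraint violation ||(1/n) A^T w - e_j||_inf
    resid = max(abs(dot(A[i], w) / n - (1.0 if i == j else 0.0)) for i in range(p))
    primal = dot(w, w) / n
    c = 2 * (1 - mu) / (dot(aj, aj) / n)         # dual point y = c * e_j
    Ay = [c * x for x in aj]                     # A y
    dual = -dot(Ay, Ay) / (4 * n) + c - mu * abs(c)
    return resid, primal, dual

A = [[1, 1, 1, 1], [1, 1, 1, -1], [1, -1, -1, 1]]   # columns; n = 4, d = 1, nu = 0.5
resid, primal, dual = primal_dual_check(A, 0, 0.4)  # mu = 0.4 >= nu/(d+nu) = 1/3
assert resid <= 0.4 + 1e-12                          # primal feasible
assert abs(primal - dual) < 1e-12                    # strong duality holds
assert abs(primal - (1 - 0.4) ** 2 / 1.0) < 1e-12    # value (1-mu)^2/(||a_j||^2/n)
```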

Theorem 4 Let A be an n × p random matrix with each element independently and identically distributed from the standard normal distribution. Define L = min_{j∈[p]} { ∥a_{·j}∥₂²/n }, where a_{·j} is the j-th column of A. For any constant c ∈ (0, 1), if n ≥ (8/(1 − c)²) log p, then

P(L ≥ c) ≥ 1 − 1/p. (73)

Proof of Theorem 4: Given L = min_{j∈[p]} { ∥a_{·j}∥₂²/n }, let us define, for all j ∈ [p], U_j = (1/n) ∥a_{·j}∥₂² = (1/n) Σ_{i=1}^n a²_{ij}. Since a_{ij} ∼ N(0, 1) i.i.d. for all i ∈ [n], j ∈ [p], we have a²_{ij} ∼ χ²₁ i.i.d. Therefore, Y_j ≜ Σ_{i=1}^n a²_{ij} ∼ χ²_n for all j ∈ [p], and U_j = (1/n) Y_j.

Recall that the moment generating function (M.G.F.) of a χ²_n random variable is given by

M_{Y_j}(λ) = (1 − 2λ)^{−n/2}, λ < 1/2. (74)
For λ > 0 and ε ∈ (0, 1), using Markov's inequality, we have for j ∈ [p],

P(U_j ≤ 1 − ε) = P(Y_j ≤ n(1 − ε)) = P(e^{−λY_j} ≥ e^{−λn(1−ε)}) ≤ e^{λn(1−ε)} E(e^{−λY_j}) = e^{λn(1−ε)} M_{Y_j}(−λ) = e^{λn(1−ε)} (1 + 2λ)^{−n/2}. (75)

Now we minimise the upper bound in (75) w.r.t. λ to tighten it. Let h(λ) ≜ e^{λn(1−ε)} (1 + 2λ)^{−n/2} and define g(λ) ≜ ln(h(λ)) = λn(1 − ε) − (n/2) ln(1 + 2λ). Note that, since ln(·) is monotonically increasing, minimising h(λ) w.r.t. λ is equivalent to minimising g(λ) w.r.t. λ. Hence, equating g′(λ) = 0, we have

g′(λ) = 0 ⟹ n(1 − ε) − n/(1 + 2λ) = 0 ⟹ λ̂ = (1/2) ( 1/(1 − ε) − 1 ) = ε/(2(1 − ε)). (76)

Plugging this into the upper bound of (75), we get

P(U_j ≤ 1 − ε) ≤ e^{n(1−ε) ε/(2(1−ε))} (1 − ε)^{n/2} = e^{nε/2 + (n/2) ln(1−ε)}. (77)

Note that the Taylor series expansion of ln(1 − ε) is −ε − ε²/2 − ε³/3 − ⋯. Since ε ∈ (0, 1) is arbitrarily small, we approximate the expansion up to the second term in (77) to get

P(U_j ≤ 1 − ε) ≤ e^{nε/2 + (n/2) ln(1−ε)} ≈ e^{nε/2 − nε/2 − nε²/4} = e^{−nε²/4}. (78)

For some δ ∈ (0, 1), let us put δ = e^{−nε²/4}. This implies ln(1/δ) = nε²/4, i.e. ε = 2√(ln(1/δ)/n). Plugging this into (78), we have

P( U_j ≥ 1 − 2√(ln(1/δ)/n) ) ≥ 1 − δ. (79)

Using the bound in (79), we will obtain a lower bound on L = min_{j∈[p]} { ∥a_{·j}∥₂²/n } = min_{j∈[p]} U_j. Recall that the U_j are independently and identically distributed for all j ∈ [p]. Hence, for some constant l ∈ (0, 1), we have

P(L ≥ l) = P( min_{j∈[p]} U_j ≥ l ) = P(U_1 ≥ l, U_2 ≥ l, …, U_p ≥ l) = ∏_{j=1}^p P(U_j ≥ l) = [P(U_1 ≥ l)]^p. (80)

Taking l = 1 − 2√(ln(1/δ)/n) in (80) and using (79), for arbitrarily small δ, we have

P( L ≥ 1 − 2√(ln(1/δ)/n) ) ≥ (1 − δ)^p ≈ 1 − pδ. (81)

Taking δ = 1/p² in (81), we get the tail bound

P( L ≥ 1 − 2√(2 ln(p)/n) ) ≥ 1 − 1/p. (82)

Now, for some constant c ∈ (0, 1), we have the assumption n ≥ 8 ln(p)/(1 − c)². This assumption implies c ≤ 1 − 2√(2 ln(p)/n). Hence, from (82), we have

P(L ≥ c) ≥ 1 − 1/p. (83)

This completes the proof. ■
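Theorem 4 is easy to probe by simulation. The snippet below is ours (the parameter choices p = 20, c = 0.5 are arbitrary): it draws a Gaussian matrix with n just above the threshold 8 log p/(1 − c)² and checks L ≥ c for that draw. By the theorem this holds with probability at least 1 − 1/p, so a fixed seed is used.

```python
import math
import random

random.seed(1)
p, c = 20, 0.5
n = math.ceil(8 * math.log(p) / (1 - c) ** 2)    # threshold of Theorem 4
# sample A with i.i.d. N(0,1) entries, column by column
cols = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
L = min(sum(x * x for x in col) / n for col in cols)
assert n >= 8 * math.log(p) / (1 - c) ** 2
# Holds with probability >= 1 - 1/p over the draw of A (deterministic here
# because the seed is fixed).
assert L >= c
```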
Lemma 8 Let A be an n × p random matrix with each element independently and identically distributed from the standard normal distribution. Define ν(A) = max_{l≠j∈[p]} |a_{·l}⊤ a_{·j}|/n, where a_{·j} is the j-th column of A. Then for any δ > 0,

P( ν(A) ≥ 4√(2 ln(p/δ)/n) ) ≤ δ². (84)

Proof of Lemma 8: Let us define, for all l ≠ j ∈ [p], U_{lj} = a_{·l}⊤ a_{·j}/n = (1/n) Σ_{i=1}^n a_{il} a_{ij}. Hence, ν(A) = max_{l≠j} |U_{lj}|. We will first find a tail bound on U_{lj}. From Lemma 9, we have that the products a_{il} a_{ij} are sub-exponentially distributed with parameters (2√2, 2). We will now use Bernstein's inequality, given in Eqn. (2.20) of [5], on |U_{lj}| with v* = 2√2 and b* = 2 to get, for some t > 0,

P( |U_{lj}| ≥ t ) ≤ 2 e^{−nt²/16}. (85)

Now, using the union bound of probability on (85), we get

P( max_{l≠j} |U_{lj}| ≥ t ) ≤ p(p − 1) e^{−nt²/16} ≤ p² e^{−nt²/16}. (86)

Taking t = 4√(2 ln(p/δ)/n) for some δ > 0 in (86), we get

P( ν(A) ≥ 4√(2 ln(p/δ)/n) ) ≤ p² e^{ln(δ²/p²)} = δ². (87)

This completes the proof. ■
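A quick simulation of (84), ours and with arbitrary choices of n, p, δ: draw a Gaussian matrix, compute the coherence ν(A), and compare it against the threshold 4√(2 ln(p/δ)/n), which is exceeded with probability at most δ².

```python
import math
import random

random.seed(2)
n, p, delta = 400, 15, 0.1
cols = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
nu = max(
    abs(sum(x * y for x, y in zip(cols[l], cols[j]))) / n
    for l in range(p) for j in range(l + 1, p)
)
bound = 4 * math.sqrt(2 * math.log(p / delta) / n)   # threshold in (84)
# Lemma 8: exceeded with probability <= delta^2 = 0.01 (seed fixed here).
assert nu < bound
```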


Lemma 9 Let X₁, X₂ be two independently distributed standard normal random variables. Define a random variable Z ≜ X₁X₂. Then Z is sub-exponential in nature with parameters (2√2, 2).

Proof of Lemma 9: Recall the definition of sub-exponential random variables as given in [5, Chapter 2]. A random variable X with mean µ = E(X) is sub-exponential if there are non-negative parameters (v, b) such that

E[e^{λ(X−µ)}] ≤ e^{v²λ²/2} for all |λ| < 1/b. (88)

We have X₁, X₂ ∼ N(0, 1) independently and Z = X₁X₂. Clearly, E[Z] = E[X₁]E[X₂] = 0. Hence, for |λ| ≤ 1/2,

E[e^{λZ}] = E[E[e^{λX₁X₂} | X₁]] = E[e^{λ²X₁²/2}] = (1 − λ²)^{−1/2}. (89)

In (89), the second equality comes from the M.G.F. of a standard normal random variable (applied conditionally on X₁), and the third comes from the M.G.F. of a chi-squared random variable, since X₁² ∼ χ²₁ and λ²/2 < 1/2 for |λ| ≤ 1/2.

Note that (1 − λ²)^{−1/2} ≤ e^{4λ²} = e^{(2√2)²λ²/2} for |λ| ≤ 1/2. Hence, we have that Z is sub-exponential with parameters (2√2, 2). This completes the proof. ■
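The only analytic step above that is not immediate is the bound (1 − λ²)^{−1/2} ≤ e^{4λ²} on |λ| ≤ 1/2. The snippet below (ours) checks it on a grid.

```python
import math

# Grid check of the bound used above:
#   (1 - lam^2)^(-1/2) <= exp(4 * lam^2) = exp(v^2 * lam^2 / 2),
# with v = 2*sqrt(2), valid on |lam| <= 1/2 = 1/b.
for i in range(-50, 51):
    lam = i / 100.0                     # lam in [-0.5, 0.5]
    mgf = (1.0 - lam * lam) ** -0.5     # exact E[exp(lam * X1 * X2)] from (89)
    assert mgf <= math.exp(4 * lam * lam) + 1e-15
```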

References

[1] Supplemental material. Uploaded on journal portal.

[2] S. Banerjee, R. Srivastava, J. Saunderson, and A. Rajwade. Robust non-adaptive group testing under errors in group membership specifications. 2024.

[3] A. Javanmard and A. Montanari. Confidence intervals and hypothesis testing for high-dimensional regression. Journal of Machine Learning Research, 2014.

[4] M. Rudelson and R. Vershynin. Smallest singular value of a random rectangular matrix. Communications on Pure and Applied Mathematics, 2009.

[5] M. J. Wainwright. High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press, 2019.
