Recovering Cryptographic Keys From Partial Information, by Example
Gabrielle De Micheli (1) and Nadia Heninger (2)
(1) Université de Lorraine, CNRS, Inria, LORIA, Nancy, France
(2) University of California, San Diego
Abstract
Side-channel attacks targeting cryptography may leak only partial or
indirect information about the secret keys. There are a variety of tech-
niques in the literature for recovering secret keys from partial information.
In this tutorial, we survey several of the main families of partial key recov-
ery algorithms for RSA, (EC)DSA, and (elliptic curve) Diffie-Hellman, the
public-key cryptosystems in common use today. We categorize the known
techniques by the structure of the information that is learned by the at-
tacker, and give simplified examples for each technique to illustrate the
underlying ideas.
Contents
1 Introduction
2 Motivation
3 Mathematical background
4.2.8 Partial recovery of RSA d from most significant bits is not possible
4.2.9 Partial recovery of RSA d from least significant bits
4.3 Non-consecutive bits known with redundancy
4.3.1 Random known bits of p and q
4.3.2 Random known bits of the Chinese remainder coefficients d mod (p − 1) and d mod (q − 1)
4.3.3 Recovering RSA keys from indirect information
4.3.4 Open problem: Random known bits without redundancy
7 Conclusion
8 Acknowledgements
1 Introduction
You are dangling in a rope sling hung from the ceiling of a datacenter
in an undisclosed location, high above the laser tripwires crisscrossing
the floor. You hold an antenna over the target’s computer, watching
the bits of their private key appear one by one on your smartwatch
display. Suddenly you hear a scuffling at the door, the soft beep of
keypad presses. You’d better get out of there! You pull your emer-
gency release cable and retreat back to the safety of the ventilation
duct. Drat! You didn’t have time to get all the bits! Mr. Bond is
going to be very disappointed in you. Whatever are you going to do?
In a side-channel attack, an attacker exploits side effects from computation or
storage to reveal ostensibly secret information. Many side-channel attacks stem
from the fact that a computer is a physical object in the real world, and thus
computations can take different amounts of time [Koc96], cause changing power
consumption [KJJ99], generate electromagnetic radiation [QS01], or produce
sound [GST14], light [FH08], or temperature [HS14] fluctuations. The specific
character of the information that is leaked depends on the high- and low-level
implementation details of the algorithm and often the computer hardware itself:
branch conditions, error conditions, memory cache eviction behavior, or the
specifics of capacitor discharges.
The first work on side-channel attacks in the published literature did not
directly target cryptography [EL85], but since Kocher’s work on timing and
power analysis in the 90s [Koc96, KJJ99], cryptography has become a popular
target for side-channel work. However, it is rare that an attacker will be able to
simply read a full cryptographic secret through a side channel. The information
revealed by many side channel attacks is often indirect or incomplete, or may
contain errors.
Thus in order to fully understand the nature of a given vulnerability, the side-
channel analyst often needs to make use of additional cryptanalytic techniques.
The main goal for the cryptanalyst in this situation is typically: “I have obtained
the following type of incomplete information about the secret key. Does it allow
me to efficiently recover the rest of the key?” Unfortunately there is not a one-
size-fits-all answer: it depends on the specific algorithm used, and on the nature
of the information that has been recovered.
The goal of this work is to collect some of the most useful techniques in this
area together in one place, and provide a reasonably comprehensive classification
of what is known to be efficient for the most commonly encountered scenarios
in practice. That is, this is a non-exhaustive survey and a concrete tutorial with
motivational examples. Many of the algorithmic papers in this area give con-
structions in full generality, which can sometimes obscure the reader’s intuition
about why a method works. Here, we aim to give minimal working examples to
illustrate each algorithm for simple but nontrivial cases. We restrict our focus
to public-key cryptography, and in particular, the algorithms that are currently
in wide use and thus the most popular targets for attack: RSA, (EC)DSA, and
(elliptic curve) Diffie-Hellman.
Throughout this work, we will illustrate the information known for key values
as follows:
[Figure: legend showing the marking used for known bits.]
The organization of this survey is given in Table 1.
2 Motivation
While this tutorial is mostly operating at a higher level of mathematical ab-
straction than the side-channel attacks that we are motivated by, we will give a
few examples of how attackers can learn partial information about secrets.
Modular exponentiation. All of the public-key cryptographic algorithms we
discuss involve modular exponentiation or elliptic curve scalar addition
operating on secret values. For RSA signatures, the victim computes
$s = m^d \bmod N$, where $d$ is the secret exponent. For DSA signatures, the
victim generates a per-signature secret value $k$ and computes the value
$r = g^k \bmod p$, where $g$ and $p$ are public parameters. For Diffie-Hellman
key exchange, the victim generates a secret exponent $a$ and computes the public
key exchange value $A = g^a \bmod p$, where $g$ and $p$ are public parameters.
Naive modular exponentiation algorithms like square-and-multiply operate
bit by bit over the bits of the exponent: each iteration will execute a square
operation, and if that bit of the exponent is a 1, will execute a multiply op-
eration. More sophisticated modular exponentiation algorithms precompute a
digit representation of the exponent using non-adjacent form (NAF), windowed
non-adjacent form (wNAF) [Möl03], sliding windows, or Booth recoding [Boo51]
and then operate on the precomputed digit representation [Gor98].
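To make the leak concrete, the following minimal sketch implements left-to-right square-and-multiply and records the sequence of operations. The function names and the idealized, error-free trace are assumptions made for illustration; a real side channel would typically yield a noisy or partial version of this trace.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right square-and-multiply; also returns the operation trace."""
    result = 1
    trace = []  # 'S' = square, 'M' = multiply: what an idealized side channel sees
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        trace.append("S")
        if bit == "1":
            result = (result * base) % modulus
            trace.append("M")
    return result, "".join(trace)


def exponent_from_trace(trace):
    """Recover the exponent from a perfect S/M trace."""
    bits = []
    i = 0
    while i < len(trace):
        # Every iteration starts with a square; a following 'M' means the bit was 1.
        if i + 1 < len(trace) and trace[i + 1] == "M":
            bits.append("1")
            i += 2
        else:
            bits.append("0")
            i += 1
    return int("".join(bits), 2)
```

With a perfect trace the exponent is recovered completely. The rest of this tutorial is concerned with what can be done when only part of such information is available.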
Cache attacks on modular exponentiation. Cache timing attacks are one
of the most commonly exploited families of side-channel attacks in the academic
literature [Pag02, TTMH02, TSS+ 03, Per05, Ber05, OST06]. There are many
variants of these attacks, but they all share in common that the attacker is able
to execute code on a CPU that is co-located with the victim process and shares a
CPU cache. While the victim code executes, the attacker measures the amount
of time that it takes to load information from locations in the cache, and thus
deduces information about the data that the victim process loaded into those
cache locations during execution. In the context of the modular exponentiation
or scalar addition algorithms discussed above, a cache attack on a vulnerable
implementation might reveal whether a multiply operation was executed at a
particular bit location if the attacker can detect whether the code to execute
the multiply instruction was loaded into the cache. Alternatively, for a pre-
computed digit representation of the number, the attacker may be able to use
a cache attack to observe the digit values that were accessed [ASK07, AS08,
BH09, BvSY14].
Other attacks on modular exponentiation. Other families of side chan-
nels that have been used to exploit vulnerable modular exponentiation imple-
mentations include power analysis and differential power analysis attacks [KJJ99,
KJJR11], electromagnetic radiation [QS01], acoustic emanations [GST14], raw
timing [Koc96], photonic emission [FH08], and temperature [HS14]. These at-
tacks similarly exploit code or circuits whose execution varies based on secrets.
Table 1: Visual table of contents for key recovery methods for public-key
cryptosystems. Columns: Scheme, Secret information, Bits known, Technique,
Section.
Cold boot and memory attacks. An entirely different class of side-channel
attacks that can reveal partial information about keys consists of attacks that
leak the contents of memory. These include cold boot attacks [HSH+08],
DMA (Direct Memory Access) attacks, Heartbleed, and Spectre/Meltdown [LSG+18,
KHF+19]. While these attacks may reveal incomplete information, and thus
serve as theoretical motivation for some of the algorithms we discuss, most of
the vulnerabilities in this family can simply be used to read arbitrary
memory with near-perfect precision, and cryptanalytic algorithms are rarely
necessary.
Length-dependent operations. A final vulnerability class is implementations
whose behavior depends on the length of a secret value, so that variations
in the behavior may leak information about the number of leading zeros in a
secret. Simple examples include copying a secret key to a buffer in such a way
that it reveals the bit length of the key, or iterating a modular exponentiation
algorithm only until the most significant nonzero digit [BT11]. In another
example, the Raccoon attack observes that TLS versions 1.2 and below strip
leading zeros from the Diffie-Hellman shared secret before applying the key
derivation function, resulting in a timing difference depending on the number of
hash input blocks required for the length of the secret [MBA+20].
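As a small illustration of this kind of leak, the sketch below strips leading zero bytes from a fixed-width secret and shows what the resulting length reveals. The helper names are hypothetical and this is not the TLS code path itself, only the underlying observation.

```python
def strip_leading_zero_bytes(secret, length):
    """Encode secret as `length` big-endian bytes, then drop leading zero bytes."""
    return secret.to_bytes(length, "big").lstrip(b"\x00")


def leaked_leading_zero_bits(observed_len, length):
    """Number of leading zero bits implied by the observed encoded length."""
    return 8 * (length - observed_len)
```

An observer who learns only the length of the stripped encoding therefore learns that the corresponding number of most significant bits of the secret are zero.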
3 Mathematical background
Lattices and lattice reduction algorithms. Several of the algorithms we
present make use of lattices and lattice algorithms. We will state a few facts
about lattices, but try to avoid being too formal.
For the purposes of this tutorial, we will specify a lattice by giving a basis
matrix $B$, which is an $n \times n$ matrix of linearly independent row vectors
with rational (but in our applications usually integer) entries. The lattice
generated by $B$, written as $L(B)$, consists of all vectors that are integer
linear combinations of the row vectors of $B$. The determinant of a lattice is
the absolute value of the determinant of a basis matrix:
$\det L(B) = |\det B|$.
Geometrically, a lattice resembles a discrete, possibly skewed, grid of points
in $n$-dimensional space. This discreteness property ensures that there is a
shortest vector in the lattice: there is a non-infinitesimal smallest length of
a vector in the lattice, and there is at least one vector $v_1$ that achieves
this length. For a random lattice, the Euclidean length of this vector is
approximated using the Gaussian heuristic:
$\|v_1\|_2 \approx \sqrt{n/(2\pi e)}\,(\det L)^{1/n}$. We often don’t need this
much precision; for lattices of very small dimension we will often use the quick
and dirty approximation that $\|v_1\|_2 \approx (\det L)^{1/n}$.
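To see how little the two estimates differ in small dimension, the following sketch compares them on a logarithmic scale. The function names are assumptions introduced here; the input is the lattice dimension and the bit length of the determinant.

```python
import math


def gaussian_heuristic_log2(dim, log2_det):
    """log2 of sqrt(n / (2*pi*e)) * det^(1/n)."""
    return 0.5 * math.log2(dim / (2 * math.pi * math.e)) + log2_det / dim


def crude_estimate_log2(dim, log2_det):
    """log2 of det^(1/n), the quick and dirty approximation."""
    return log2_det / dim
```

For a 4-dimensional lattice with a 384-bit determinant, the crude estimate is 96 bits and the Gaussian heuristic is about one bit shorter, which is negligible for the back-of-the-envelope bounds used in this tutorial.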
The shortest vector in an arbitrary lattice is NP-hard to compute exactly,
but the LLL algorithm [LLL82] will compute an exponential approximation
to this shortest vector in polynomial time: in the worst case, it will return
a vector $b_1$ satisfying $\|b_1\|_2 \le 2^{(n-1)/4} (\det L)^{1/n}$. In
practice, for random lattices, the LLL algorithm obtains a better approximation
factor $\|b_1\|_2 \le 1.02^n (\det L)^{1/n}$ [NS06]. In fact, the LLL algorithm
will return an entire basis for the lattice whose vectors are good
approximations for what are called the successive minima for the lattice; for
our purposes the only fact we need is that these vectors will be fairly short,
and for a random lattice they will be close to the same length. Current
implementations of the LLL algorithm can be run fairly straightforwardly on
lattices of a few hundred dimensions.
To compute a closer approximation to the shortest vector than LLL, one can
use the BKZ algorithm [Sch87, SE94]. This algorithm runs in time exponential
in a block size, which is a parameter to the algorithm that determines the quality
of the approximation factor. The theoretical guarantees of this algorithm are
complicated to express; for our purposes we only need to know that for lattices
of dimension below around 100, one can easily compute the shortest vector in
the heuristically random-looking lattices we consider using the BKZ algorithm,
and can often find the shortest vector, or a “good enough” approximation
to it, by using smaller block sizes. Theoretically, the LLL algorithm is equivalent
to using BKZ with block size 2.
Since encryption and signature verification only use the public key, decryp-
tion and signature generation are the operations typically targeted by side-
channel attacks.
RSA-CRT. To speed up decryption, instead of computing $c^d \bmod N$ directly,
implementations often use the Chinese remainder theorem (CRT). RSA-CRT
splits the exponent $d$ into two parts $d_p = d \bmod (p-1)$ and
$d_q = d \bmod (q-1)$. To decrypt using the Chinese remainder theorem, Alice
would compute $m_p = c^{d_p} \bmod p$ and $m_q = c^{d_q} \bmod q$. The message
can be recovered with the help of the pre-computed value
$q_{\mathrm{inv}} = q^{-1} \bmod p$ by computing
$m = m_q + q \cdot (q_{\mathrm{inv}} (m_p - m_q) \bmod p)$.
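A minimal sketch of RSA-CRT decryption follows, using the standard Garner recombination $m = m_q + q \cdot (q_{\mathrm{inv}}(m_p - m_q) \bmod p)$. The function name and the tiny textbook key in the usage note are assumptions for illustration only.

```python
def rsa_crt_decrypt(c, p, q, d):
    """Decrypt c with the CRT exponents d_p, d_q and Garner recombination."""
    dp = d % (p - 1)
    dq = d % (q - 1)
    q_inv = pow(q, -1, p)          # precomputed in practice (Python 3.8+)
    mp = pow(c, dp, p)             # half-size exponentiation modulo p
    mq = pow(c, dq, q)             # half-size exponentiation modulo q
    h = (q_inv * (mp - mq)) % p
    return mq + h * q
```

For the toy key p = 61, q = 53, e = 17, d = 2753, decrypting `pow(65, 17, 61 * 53)` with this function returns the message 65, matching a direct computation of $c^d \bmod N$. The two exponentiations on the secret values $d_p$ and $d_q$ are what side-channel attacks on RSA-CRT typically target.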
$ed \equiv 1 \bmod (p-1)(q-1)$,
that is, $ed = 1 + k(p-1)(q-1)$ for some integer $k$.
We know that $d < (p-1)(q-1)$, so $k < e$. The value of $k$ is not known to the
attacker, but since generally $e \le 65537$ in practice, it is efficient to
brute force over all possible values of $k$.
For attacks against the CRT coefficients $d_p$ and $d_q$, we can obtain similar
relations:
$e d_p = 1 + k_p (p-1)$ and $e d_q = 1 + k_q (q-1)$ (1)
for some integers $k_p < e$ and $k_q < e$. Brute forcing over two independent
16-bit values can be burdensome, but we can relate $k_p$ and $k_q$ as follows.
Rearranging the two relations, we obtain $e d_p - 1 + k_p = k_p p$ and
$e d_q - 1 + k_q = k_q q$. Multiplying these together, we get
$(e d_p - 1 + k_p)(e d_q - 1 + k_q) = k_p k_q N$.
Reducing modulo $e$ gives $(k_p - 1)(k_q - 1) \equiv k_p k_q N \bmod e$.
Thus given a value for $k_p$, we can solve for the unique value of
$k_q \bmod e$, and for applications that require brute forcing values of $k_p$
and $k_q$ we only need to brute force at most $e$ pairs [IGA+15].
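The relation between the two multipliers can be sketched in a few lines. Reducing the product relation modulo $e$ gives $(k_p - 1)(k_q - 1) \equiv k_p k_q N \bmod e$, which rearranges to $k_q (k_p - 1 - k_p N) \equiv k_p - 1 \bmod e$. The function name is an assumption, and the sketch returns `None` in the degenerate case where the coefficient of $k_q$ is not invertible.

```python
import math


def kq_from_kp(kp, N, e):
    """Solve (kp - 1)(kq - 1) = kp*kq*N (mod e) for kq mod e."""
    denom = (kp - 1 - kp * N) % e
    if math.gcd(denom, e) != 1:
        return None                      # no unique solution for this guess
    return ((kp - 1) * pow(denom, -1, e)) % e
```

For the toy key p = 61, q = 53, e = 17, d = 2753 one has $k_p = 15$ and $k_q = 16$, and `kq_from_kp(15, 3233, 17)` returns 16. An attacker therefore only needs to loop over guesses for $k_p$.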
The multiplier k also has a nice relationship to these values. Multiplying
the relations from Equation 1 together, we have
$p = \gcd((e d_p - 1)/k_p + 1, N)$
4.2.1 Warm-up: Lattice attacks on low-exponent RSA with bad
padding.
The main algorithmic technique used for RSA key recovery with contiguous bits
is to formulate the problem as finding a small root of a polynomial modulo an
integer, and then to use lattice basis reduction to solve this problem.
In order to introduce the main tool of using lattice basis reduction to find
roots of polynomials, we will start with an illustrative example for the concrete
application of breaking small-exponent RSA with known padding. In later sec-
tions we will show how to modify the technique to cover different RSA key
recovery scenarios.
The original formulation of this problem is due to Coppersmith [Cop96].
Howgrave-Graham [HG97] gave a dual approach that we find easier to explain
and easier to implement. May’s survey [May10] contains a detailed description
of the Coppersmith/Howgrave-Graham algorithm.
To set up the problem, we have an integer $N$, and a polynomial $f(x)$ of
degree $k$ that has a root $r$ modulo $N$, that is, $f(r) \equiv 0 \bmod N$. We
wish to find $r$. Finding roots of polynomials can be done efficiently modulo
primes [LLL82], so this problem is easy to solve if $N$ is prime or the prime
factorization of $N$ is known. The Coppersmith/Howgrave-Graham methods are
generally of interest when the prime factorization of $N$ is not known: they
give an efficient algorithm for finding all small roots (if they exist) modulo
$N$ of unknown factorization.
Problem setup. For our toy example, we will use the 96-bit RSA modulus
N = 0x98664cf0c9f8bbe76791440d
the padding function
pad(m) = 0x01FFFFFFFFFFFFFFFF00 || m
and the ciphertext
c = 0xeb9a3955a7b18d27adbf3a1
[Figure: pad(m) consists of the known padding a followed by the unknown message m.]
Let
a = 0x01FFFFFFFFFFFFFFFF0000
be the known padding string, offset to the correct byte location. We also know
the length of the message; in this case $m < 2^{16}$. Thus we have that
$c = (a + m)^3 \bmod N$, for unknown small $m$. Let $f(x) = (a + x)^3 - c$; we have set up
the problem so that we wish to find a small root m satisfying f (m) ≡ 0 mod N
for the polynomial
(We have reduced the coefficients modulo N so that they will fit on the page.)
Construct a lattice. Let the coefficients of $f$ be
$f(x) = x^3 + f_2 x^2 + f_1 x + f_0$. Let $M = 2^{16}$ be an upper bound on the
size of the root $m$. We construct the matrix
\[
B = \begin{pmatrix}
M^3 & f_2 M^2 & f_1 M & f_0 \\
0 & N M^2 & 0 & 0 \\
0 & 0 & N M & 0 \\
0 & 0 & 0 & N
\end{pmatrix}
\]
We then apply the LLL lattice basis reduction algorithm to the matrix. The
shortest vector of the reduced basis is
\[
v = (-\texttt{0x66543dd72697}\,M^3,\ -\texttt{0x35c39ac91a11c04}\,M^2,\
\texttt{0x3f86f973d67d25eae138}\,M,\ -\texttt{0x10609161b131fd102bc2a8})
\]
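Extract a polynomial from the lattice and find its roots. We then
construct the polynomial
\[
g(x) = -\texttt{0x66543dd72697}\,x^3 - \texttt{0x35c39ac91a11c04}\,x^2
+ \texttt{0x3f86f973d67d25eae138}\,x - \texttt{0x10609161b131fd102bc2a8}
\]
from the entries of the shortest vector with the column scaling removed.
The polynomial $g$ has one integer root, 0x42, which is the desired solution
for $m$.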
This specific 4 × 4 lattice construction works to find roots up to size
$N^{1/6}$. For the small key size we used in our example, this is only 16 bits,
but since it scales directly with the modulus size, this same lattice
construction would suffice to learn 170 unknown bits of message for a 1024-bit
RSA modulus, or 341 bits of message for a 2048-bit RSA modulus. Lattice
reduction on a 4 × 4 lattice basis is instantaneous.
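More detailed explanation. Why does this work? The rows of this matrix
correspond to the coefficient vectors of the polynomials $f(x)$, $Nx^2$, $Nx$,
and $N$. We know that each of these polynomials evaluated at $x = m$ will be 0
modulo $N$, and thus so will every polynomial corresponding to a vector in the
lattice. Each column is scaled by a power of $M$, so that the $\ell_1$ norm of
any vector in this lattice is an upper bound on the value of the corresponding
(un-scaled) polynomial evaluated at the root $m$. For a vector
$v = (v_3 M^3, v_2 M^2, v_1 M, v_0)$ in the lattice, the corresponding
polynomial $g(x) = v_3 x^3 + v_2 x^2 + v_1 x + v_0$ satisfies
\[
|g(m)| \le |v_3| M^3 + |v_2| M^2 + |v_1| M + |v_0| = \|v\|_1 .
\]
If $\|v\|_1 < N$, then $g(m) \equiv 0 \bmod N$ forces $g(m) = 0$ over the
integers, so $m$ is among the integer roots of $g$. We therefore want the
lattice to contain a vector of length less than $N$.
For our example, we need $(\det B)^{1/\dim L} = (M^6 N^3)^{1/4} < N$. Solving
for $M$, this will be satisfied when $M < N^{1/6}$. In this case, $N$ has 96
bits, and $m$ is 16 bits, so the condition is satisfied.
This can be extended to $N^{1/e}$, where $e$ is the degree of the polynomial
$f$, by using a larger dimension lattice. Howgrave-Graham’s dissertation [HG]
and May’s survey [May10] give detailed explanations of this method and
improvements.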
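The two properties used in this argument can be checked numerically. The sketch below builds the 4 × 4 basis for a toy instance and verifies that an arbitrary integer combination of its rows gives a polynomial that vanishes modulo $N$ at the root and is bounded by the $\ell_1$ norm of the vector. The modulus (a product of two Mersenne primes), the one-byte message and the combination coefficients are all assumptions chosen for illustration; they are not the values of the example above, and no lattice reduction is performed here.

```python
# Toy instance (assumed values, not the ones from the text).
N = (2**61 - 1) * (2**89 - 1)
M = 2**16                                   # bound on the unknown message
a = 0x01FFFFFFFFFFFFFFFF << 16              # known padding, shifted past the message
m = 0x42                                    # the "unknown" root, used only for checking
c = pow(a + m, 3, N)

# f(x) = (a + x)^3 - c = x^3 + f2 x^2 + f1 x + f0 (mod N)
f2, f1, f0 = (3 * a) % N, (3 * a * a) % N, (a**3 - c) % N
B = [[M**3, f2 * M**2, f1 * M, f0],
     [0, N * M**2, 0, 0],
     [0, 0, N * M, 0],
     [0, 0, 0, N]]


def evaluate(vec, x):
    """Evaluate the polynomial whose coefficients are the un-scaled entries of vec."""
    coeffs = [vec[0] // M**3, vec[1] // M**2, vec[2] // M, vec[3]]
    return sum(coef * x ** (3 - i) for i, coef in enumerate(coeffs))


combo = [3, -5, 7, 11]                      # arbitrary integer combination of the rows
vec = [sum(k * row[j] for k, row in zip(combo, B)) for j in range(4)]
value = evaluate(vec, m)
l1_norm = sum(abs(t) for t in vec)
```

Here `value % N == 0` and `abs(value) <= l1_norm` both hold. Lattice reduction is what finds a combination whose norm is below $N$, at which point the congruence becomes an equality over the integers.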
4.2.2 Factorization from consecutive bits of p.
In this section we show how to use lattices to factor the RSA modulus N if a
large portion of contiguous bits of one of the factors (without loss of generality
p) is known.
[Figure: p split into the known chunk $2^\ell b$ and the unknown chunk $r$.]
In our concrete example, the modulus is
N = 0x4d14933399708b4a5276373cb5b756f312f023c43d60b323ba24cee670f5
and the known bits of p are
a = 0x68323401cb3a10959e7bfdc0000000
Cast the problem as finding the roots of a polynomial. Let $f(x) = a + x$.
We know that there is some value $r$ such that $f(r) = p \equiv 0 \bmod p$. We
do not know $p$, but we know that $p$ divides $N$ and we know $N$.
We know that the unknown $r$ is small, and in particular $|r| < R$ for some
bound $R$ that is known. Here, $R = 2^{30}$.
Construct a lattice. We can form the lattice basis
\[
B = \begin{pmatrix}
R^2 & Ra & 0 \\
0 & R & a \\
0 & 0 & N
\end{pmatrix}
\]
We then run the LLL algorithm on our lattice basis $B$. Let
$v = (v_2 R^2, v_1 R, v_0)$ be the shortest vector in the reduced basis. In our
example, we get the vector
\[
v = (-\texttt{0x17213d8bc94}\,R^2,\ -\texttt{0x1d861360160a4f86181}\,R,\
\texttt{0xf9decdc1447c3f3843819a5d})
\]
We then construct the polynomial $g(x) = v_2 x^2 + v_1 x + v_0$ and calculate
its roots. In this example, $g$ has one integer root, r = 0x873209. We can then
reconstruct $a + r$ and verify that $\gcd(a + r, N)$ factors $N$.
This 3 × 3 lattice construction works for any $|r| < p^{1/3}$, and directly
scales as $p$ increases. In our example, we chose $p$ and $q$ so that they have
120 bits, and $r$ has 30 bits. However, this same construction will work to
recover 170 bits from a 512-bit factor of a 1024-bit RSA modulus, or 341 bits
from a 1024-bit factor of a 2048-bit RSA modulus.
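The whole attack fits in a short script. The sketch below is a textbook LLL written with exact rational arithmetic (slow, but adequate for a 3 × 3 basis), followed by the lattice construction for $f(x) = a + x$. The toy modulus is an assumption: it is the product of the Mersenne primes $2^{89} - 1$ and $2^{107} - 1$, with the low 20 bits of $p$ treated as unknown so that the worst-case LLL guarantee already ensures success. All function names are introduced here for illustration.

```python
import math
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(B):
    """Exact Gram-Schmidt: returns the orthogonal vectors and mu coefficients."""
    Bs, mu = [], [[Fraction(0)] * len(B) for _ in B]
    for i, row in enumerate(B):
        v = [Fraction(x) for x in row]
        for j in range(i):
            mu[i][j] = dot(row, Bs[j]) / dot(Bs[j], Bs[j])
            v = [x - mu[i][j] * y for x, y in zip(v, Bs[j])]
        Bs.append(v)
    return Bs, mu


def lll(B, delta=Fraction(3, 4)):
    """Textbook LLL on integer row vectors; recomputes Gram-Schmidt each step."""
    B = [list(row) for row in B]
    k = 1
    while k < len(B):
        for j in range(k - 1, -1, -1):            # size-reduce row k
            q = round(gram_schmidt(B)[1][k][j])
            B[k] = [x - q * y for x, y in zip(B[k], B[j])]
        Bs, mu = gram_schmidt(B)
        if dot(Bs[k], Bs[k]) >= (delta - mu[k][k - 1] ** 2) * dot(Bs[k - 1], Bs[k - 1]):
            k += 1                                # Lovasz condition holds
        else:
            B[k], B[k - 1] = B[k - 1], B[k]
            k = max(k - 1, 1)
    return B


def integer_roots(g2, g1, g0):
    """Integer roots of g2*x^2 + g1*x + g0."""
    if g2 == 0:
        return [-g0 // g1] if g1 and g0 % g1 == 0 else []
    disc = g1 * g1 - 4 * g2 * g0
    if disc < 0 or math.isqrt(disc) ** 2 != disc:
        return []
    s = math.isqrt(disc)
    return [n // (2 * g2) for n in (-g1 + s, -g1 - s) if n % (2 * g2) == 0]


# Toy instance (assumed values): the low 20 bits of p are unknown.
p, q = 2**89 - 1, 2**107 - 1
N, R = p * q, 2**20
a = p - (p % R)                                   # known most significant bits of p

B = [[R**2, R * a, 0],                            # x(x + a)
     [0, R, a],                                   # x + a
     [0, 0, N]]                                   # N
v = lll(B)[0]
roots = integer_roots(v[0] // R**2, v[1] // R, v[2])
factors = {math.gcd(a + r, N) for r in roots}
```

After running this, `factors` contains $p$. A production attack would use an optimized LLL implementation from a computer algebra system rather than this exact-arithmetic version.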
More detailed explanation. The rows of this matrix correspond to the
coefficient vectors of the polynomials $x(x + a)$, $x + a$, and $N$. We know
that each of these polynomials evaluated at $x = r$ will be 0 modulo $p$, and
thus every polynomial corresponding to a vector in the lattice has this
property. As in the previous example, each column is scaled by a power of $R$,
so that the $\ell_1$ norm of any vector in this lattice is an upper bound on the
value of the corresponding (un-scaled) polynomial evaluated at $r$.
If we can find a vector in the lattice of length less than $p$, then it
corresponds to a polynomial $g$ that must satisfy $|g(r)| < p$. Since by
construction $g(r) \equiv 0 \pmod p$, this means that $g(r) = 0$ over the
integers.
We compute the determinant of the lattice to verify that it contains a
sufficiently small vector. For this example, $\det B = R^3 N$. This means we
need $(\det B)^{1/\dim L} = (R^3 N)^{1/3} < p$. Solving for $R$, this gives
$R < p^{1/3}$. For an RSA modulus we have $p \approx N^{1/2}$, or
$R < N^{1/6}$.
This method works up to $R < p^{1/2}$ at the limit by increasing the dimension
of the lattice. This is accomplished by taking higher multiples of $f$ and $N$.
See Howgrave-Graham’s dissertation [HG] and May’s survey [May10] for details on
how to do this.
4.2.3 RSA key recovery from least significant bits of p
It is also straightforward to adapt this method to deal with a contiguous chunk
of unknown bits when the least significant bits of $p$ are known: if the
unknown chunk begins at bit position $\ell$, the input polynomial will have the
form $f(x) = 2^\ell x + a$. This can be multiplied by $2^{-\ell} \bmod N$ and
solved exactly as above.
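The normalization step can be sketched as follows: multiplying $f(x) = 2^\ell x + a$ by $2^{-\ell} \bmod N$ yields the monic polynomial $x + a'$ with $a' = a \cdot 2^{-\ell} \bmod N$, which has the same root modulo $p$. The function name and the toy modulus (a product of two Mersenne primes) are assumptions for illustration.

```python
def monic_offset(a, ell, N):
    """Constant term a' of the monic polynomial x + a' obtained from 2^ell * x + a."""
    return (a * pow(2**ell, -1, N)) % N   # modular inverse needs Python 3.8+


# Toy instance (assumed values): the low 60 bits of p are known.
p, q = 2**89 - 1, 2**107 - 1
N = p * q
ell = 60
a = p % 2**ell          # known least significant bits
r = p >> ell            # unknown most significant bits, used only for checking
a_prime = monic_offset(a, ell, N)
```

Since $2^\ell r + a = p$, we have `(r + a_prime) % p == 0`, so `a_prime` can take the place of $a$ in the lattice basis of the previous section.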
[Figure: p split into unknown most significant bits $r_m$ (shifted by $2^t$),
known middle bits $a$, and unknown least significant bits $r_\ell$.]
N = 0x3ab05d0c0694c6bd8ee9683d15039e2f738558225d7d37f4a601bcb929ccfa564804925679e2f3542b
a = 0xc48c998771f7ca68c9788ec4bff9b40b80000
be the middle bits of one of its factors $p$; there are 16 unknown bits in the
most and least significant bit positions. Thus we know that $R = 2^{16}$ in our
concrete example. We wish to recover $p$.
Cast the problem as finding solutions to a polynomial. In the previous
examples, we only had one variable to solve for. Here, we have two, so we need
to use a bivariate polynomial. We can write down $f(x, y) = x + 2^t y + a$, so
that $f(r_\ell, r_m) = p$.
In our concrete example, $p$ has 164 bits, so we have
$f(x, y) = x + 2^{148} y + a$. We hope to construct two polynomials
$g_1(x, y)$ and $g_2(x, y)$ satisfying $g_1(r_\ell, r_m) = 0$ and
$g_2(r_\ell, r_m) = 0$ over the integers. Then we can solve the system for the
simultaneous roots.
Construct a lattice. As before, we will use our input polynomial f and the
public RSA modulus N to construct a lattice. Unfortunately for the simplicity of
our example, the smallest polynomial that is guaranteed to result in a nontrivial
bound on the solution size for our desired roots has degree 3, and results in a
lattice of dimension 10.
As before, each column corresponds to a monomial that appears in our
polynomials, and each row corresponds to a polynomial that evaluates to 0
mod $p$ at our desired solution. In our example, we will use the polynomials
$f^3, f^2 y, f y^2, y^3 N, f^2, f y, y^2 N, f, yN$, and $N$; the monomials in
the columns are $x^3, x^2 y, x y^2, y^3, x^2, xy, y^2, x, y$, and 1. Each column
is scaled by the appropriate power of $R$.
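\[
B = \begin{pmatrix}
R^3 & 3 \cdot 2^t R^3 & 3 \cdot 2^{2t} R^3 & 2^{3t} R^3 & 3aR^2 & 6 \cdot 2^t a R^2 & 3 \cdot 2^{2t} a R^2 & 3a^2 R & 3 \cdot 2^t a^2 R & a^3 \\
0 & R^3 & 2 \cdot 2^t R^3 & 2^{2t} R^3 & 0 & 2aR^2 & 2 \cdot 2^t a R^2 & 0 & a^2 R & 0 \\
0 & 0 & R^3 & 2^t R^3 & 0 & 0 & aR^2 & 0 & 0 & 0 \\
0 & 0 & 0 & R^3 N & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & R^2 & 2 \cdot 2^t R^2 & 2^{2t} R^2 & 2aR & 2 \cdot 2^t a R & a^2 \\
0 & 0 & 0 & 0 & 0 & R^2 & 2^t R^2 & 0 & aR & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & R^2 N & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & R & 2^t R & a \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & RN & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & N
\end{pmatrix}
\]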
We reduce this matrix using the LLL algorithm, and reconstruct the bivariate
polynomials corresponding to each row of the reduced basis. Unfortunately,
these are too large to fit on a page.
Solve the system of polynomials to find common roots. Heuristically,
we would hope to only need two sufficiently short vectors and then compute the
resultant of the corresponding polynomials or use a Gröbner basis to find the
common roots, but in our example the two shortest vectors are not algebraically
independent. In this case it suffices to use the first three vectors. Concretely, we
construct an ideal over the ring of bivariate polynomials with integer coefficients
whose basis is the polynomials corresponding to the three shortest vectors in
the reduced basis for L(B) above, and then call a Gröbner basis algorithm on it.
For this example, the Gröbner basis is exactly the polynomials (x − 0x339b, y −
0x5a94), which reveals the desired solutions for x = r` and y = rm .
In this example, the nine shortest vectors all vanish at the desired solution,
so we could have constructed our Gröbner basis from other subsets of these
short vectors.
More detailed explanation. The determinant of our lattice is
$\det B = R^{20} N^4$, and the lattice has dimension 10. We hope to find two
vectors $v_1$ and $v_2$ of length approximately $(\det B)^{1/\dim B}$; this is
not guaranteed to be possible, but for random lattices we expect the vectors in
a reduced basis to have close to the same lengths. The $\ell_1$ norms of the
vectors $v_1$ and $v_2$ are upper bounds on the magnitude of the corresponding
polynomials $f_{v_1}(x, y)$, $f_{v_2}(x, y)$ evaluated at the desired roots
$r_\ell, r_m$. In order to guarantee that these vanish, we want the inequality
\[
|f_{v_i}(r_\ell, r_m)| \le \|v_i\|_1 < p \approx \sqrt{N}
\]
to hold.
Thus the desired condition for success is
\[
(\det B)^{1/\dim B} < \sqrt{N}
\]
\[
(R^{20} N^4)^{1/10} < N^{1/2}
\]
\[
R^{20} < N
\]
In our example, $N$ was 326 bits long, and we chose $R$ to have 16 bits.
This attack was applied in [BCC+13] to recover RSA keys generated by
a faulty random number generator that generated primes with predictable
sequences of bits.
4.2.5 RSA key recovery from multiple chunks of bits of p
The above idea can be extended to handle more chunks of p at the cost of
increasing the dimension of the lattice. Each unknown “chunk” of bits intro-
duces a new variable in the linear equation that will be solved for p. At the
limit, the algorithm requires 70% of the bits of p divided into at most log log N
blocks [HM08].
is known about both p and q or other fields of the RSA private key, then the
methods of Section 4.3.1 may be applicable.
attack for all possible values of 1 ≤ kp < 65537. With the correct parameters, we
are guaranteed to find a solution for the correct value of kp . For other incorrect
guesses of kp , in practice the attack is unlikely to result in any solutions found,
but any spurious solutions that arise can be eliminated because they will not
result in a factorization of N .
We can rearrange Equation 4, with $e^{-1}$ computed modulo $N$:
\[
e(a + r) - 1 + k_p \equiv 0 \bmod p
\]
\[
a + r + e^{-1}(k_p - 1) \equiv 0 \bmod p
\]
Let $A = a + e^{-1}(k_p - 1)$. Then we wish to find a small root $r$ of the
polynomial $f(x) = A + x$ modulo $p$, where $|r| < R$.
For our concrete example, we have $R = 2^{30}$ and $k_p = 23592$, so
A = 0x8ffe9143aa4c189787058057a0784576848f3f28d79a83169f72a0550699112
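The rearrangement can be sanity-checked on a toy key. The sketch below computes $A = a + e^{-1}(k_p - 1) \bmod N$ and confirms that $A + r \equiv 0 \bmod p$ when $d_p = a + r$. The key (Mersenne primes $2^{89} - 1$ and $2^{107} - 1$ with $e = 65537$) and the function name are assumptions; a real attacker does not know $k_p$ and would repeat this for each guess.

```python
def shifted_constant(a, kp, e, N):
    """A = a + e^{-1} * (kp - 1) mod N, so that A + r = 0 (mod p)."""
    return (a + pow(e, -1, N) * (kp - 1)) % N


# Toy key (assumed values), used only to check the algebra.
p, q, e = 2**89 - 1, 2**107 - 1, 65537
N = p * q
dp = pow(e, -1, p - 1)                 # d mod (p - 1)
kp = (e * dp - 1) // (p - 1)           # from e*dp = 1 + kp*(p - 1)
R = 2**30
r = dp % R                             # unknown least significant bits of dp
a = dp - r                             # known most significant bits of dp
A = shifted_constant(a, kp, e, N)
```

Here `(A + r) % p == 0`, so the problem has the same shape as factoring with known bits of $p$, with $A$ in place of $a$.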
Construct a lattice. Since the form of the problem is identical to the
previous section, we use the same lattice construction:
\[
B = \begin{pmatrix}
R^2 & RA & 0 \\
0 & R & A \\
0 & 0 & N
\end{pmatrix}
\]
We apply the LLL algorithm to this basis and take the shortest vector in
the reduced basis. For our example, this is
At the limit, this technique can work up to $R < p^{1/2}$ [BM03] by increasing
the dimension of the lattice with higher degree polynomials and higher
multiplicities of the root.
4.2.8 Partial recovery of RSA d from most significant bits is not
possible
Partial recovery for d varies somewhat depending on the bits that are known
and the size of e. Since e is small in practice, we will focus on that case here.
Figure 7: For small exponent e, the most significant bits of d do not allow full
key recovery.
Most significant bits of d. When e is small enough to brute force, the most
significant half of bits of d can be recovered easily with no additional information.
This implies that if full key recovery were possible from only the most significant
half of bits of d, then small public exponent RSA would be completely broken.
Since small public exponent RSA is not known to be insecure in general, this
unfortunately means that no such key recovery method is possible for this case.
Consider the RSA equation
\[
ed \equiv 1 \bmod (p-1)(q-1)
\]
\[
ed = 1 + k(p-1)(q-1)
\]
\[
ed = 1 + k(N - (p+q) + 1)
\]
\[
d = kN/e - (k(p+q-1) - 1)/e
\]
Since $p + q \approx \sqrt{N}$, the second term affects only the least
significant half of the bits of $d$, so the value $kN/e$ shares approximately
the most significant half of its bits in common with $d$.
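The observation can be checked numerically: since $0 < k < e$, the second term satisfies $(k(p + q - 1) - 1)/e < p + q$, so $\lfloor kN/e \rfloor$ differs from $d$ by less than $p + q$. The sketch below lists one approximation per guess of $k$; the function name and the toy key (Mersenne primes $2^{89} - 1$ and $2^{107} - 1$ with $e = 65537$) are assumptions for illustration.

```python
def msb_approximations(N, e):
    """Candidate approximations floor(k*N/e) to d, one per guess of k."""
    return {k: (k * N) // e for k in range(1, e)}


# Toy key (assumed values), used only to check the claim.
p, q, e = 2**89 - 1, 2**107 - 1, 65537
N = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)
k_true = (e * d - 1) // phi
approx = msb_approximations(N, e)[k_true]
```

For this key `abs(d - approx) < p + q`, so the approximation for the correct $k$ agrees with $d$ on roughly its upper half using public information alone.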
On the positive side, this observation allows the attacker to narrow down
possible values for k if the attacker knows any most significant bits of d for
certain. See Boneh, Durfee, and Frankel [Bon98] for more details.
4.2.9 Partial recovery of RSA d from least significant bits
For low-exponent RSA, if an adversary knows the least significant t bits of d,
then this can be transformed into knowledge of the least significant t bits of p,
and then the method of Section 4.2.3 can be applied. This algorithm is due to
Boneh, Durfee, and Frankel [Bon98].
Assume the adversary knows the $t$ least significant bits of $d$; call this
value $d_0$. Then
\[
e d_0 \equiv 1 + k(N - (p+q) + 1) \bmod 2^t
\]
Let $s = p + q$. The adversary tries all possible values of $k$,
$1 \le k < e$, to obtain about $e$ candidate values for the $t$ least
significant bits of $s$.
Then for each candidate $s$, the least significant bits of $p$ are solutions to
the quadratic equation
\[
p^2 - sp + N \equiv 0 \bmod 2^t .
\]
Let $a$ be a candidate solution for the least significant bits of $p$. Putting
this in the context of Section 4.2.3, the attacker wishes to solve
$f(x) = a + 2^t x \equiv 0 \bmod p$. This can be multiplied by
$2^{-t} \bmod N$ and the exact method of Section 4.2.3 can be applied to recover
$p$. Since at the limit the methods of Section 4.2.3 work to recover an unknown
part of $p$ of size up to $N^{1/4}$, this method will work when as few as
$(\log_2 N)/4$ of the least significant bits of $d$ are known.
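The first two steps (guessing $k$, then solving the quadratic modulo $2^t$) can be sketched as follows. For each $k$ the sketch solves $k \cdot u \equiv e d_0 - 1 \bmod 2^t$ for $u = N + 1 - s$, handling even $k$ explicitly, and then lifts solutions of $x^2 - sx + N \equiv 0$ one bit at a time. The function names and the toy key (the primes 1000000007 and 998244353 with $e = 3$) are assumptions; the final lattice step of Section 4.2.3 is not repeated here.

```python
def solve_linear(k, w, t):
    """All u mod 2^t with k*u = w (mod 2^t), for 0 < k < 2^t."""
    j = (k & -k).bit_length() - 1            # number of trailing zero bits of k
    if w % (1 << j):
        return []
    step = 1 << (t - j)
    u0 = ((w >> j) * pow(k >> j, -1, step)) % step
    return [u0 + i * step for i in range(1 << j)]


def lift_quadratic(s, N, t):
    """All x mod 2^t with x^2 - s*x + N = 0 (mod 2^t), found bit by bit."""
    sols = [0]
    for i in range(1, t + 1):
        mod = 1 << i
        sols = [x for c in sols for x in (c, c + (mod >> 1))
                if (x * x - s * x + N) % mod == 0]
    return sols


def candidate_low_bits_of_p(N, e, d0, t):
    """Candidates for p mod 2^t given the t low bits d0 of d."""
    mod = 1 << t
    w = (e * d0 - 1) % mod
    cands = set()
    for k in range(1, e):
        for u in solve_linear(k, w, t):      # u = N + 1 - s (mod 2^t)
            cands.update(lift_quadratic((N + 1 - u) % mod, N, t))
    return cands


# Toy key (assumed values), used only to check the procedure.
p, q, e, t = 1000000007, 998244353, 3, 20
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))
cands = candidate_low_bits_of_p(N, e, d % 2**t, t)
```

The true value `p % 2**t` is among `cands`. Each candidate would then be passed to the lattice method, and wrong candidates are discarded because they do not yield a factor of $N$.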
There are more sophisticated lattice algorithms that involve different trade-
offs, but for very small e, which is typically the case in practice, they require
nearly all of the least significant bits of d to be known [BM03].
4.3 Non-consecutive bits known with redundancy
This section covers key recovery in the case that many non-consecutive bits of
secret values are known or need to be recovered. The lattice methods covered
in the previous section can be adapted to recover multiple chunks of unknown
key bits, but at a high cost: the lattice dimension increases with the number of
chunks, and when a large number of bits is to be recovered, the running time
can be exponential in the number of chunks.
In this section, we explore a different technique that allows a different trade-
off. In this case, the attacker has knowledge of many non-contiguous bits
of secret key values, and knows these for multiple secret values of the key.
The attacker might have learned parts of both p and q, or d mod (p − 1) and
d mod (q − 1), for example.
4.3.1 Random known bits of p and q
We begin by analyzing a case that is less likely to arise in practice, the case of
random erasures of bits of p and q, in order to give the main ideas behind the
algorithm in the simplest setting.
The main technique used for these cases is a branch and prune algorithm.
The idea behind the branch and prune algorithm is to write down an integer
relationship between the elements in the secret key and the public key, and
progressively solve for unknown bits of the secret key, starting at the least
significant bits. This produces a tree of solutions: every branch corresponds to
guesses for one or more unknown bits at a particular solution, and branches are
pruned if the guesses result in incorrect relationships to the public key.
This algorithm is presented and analyzed in [HS09].
Problem setup. Let N = 899. Imagine we have learned some bits of p and
q, in an erasure model: for each bit position, we either know the bit value, or
we know that we do not know it. For example, we have
p = ?11?1,
and
q = ?1?0?,
where ? marks an unknown bit.
Defining an integer relation. The integer relation that we will take advan-
tage of for this example is N = pq.
Iteratively solve for each bit. The main idea of the algorithm is to itera-
tively solve for the bits of the unknowns p and q, starting at the least significant
bits. These can then be checked against the known public value of N .
At the least significant bit, the value is known for $p$ and is unknown for
$q$. There are two options for the value of $q$, but only the bit value 1
satisfies the constraint that $pq \equiv N \bmod 2$. The algorithm then proceeds
to the next step, where the value of the second bit is known for $q$ but not for
$p$. Only the bit value 1 satisfies the constraint $pq \equiv N \bmod 2^2$, so
the algorithm continues down this branch. Since this generates a tree, the tree
can be traversed in depth-first or breadth-first order; depth-first will be more
memory efficient. This is illustrated in Figure 10.
Figure 10: The branch and prune tree for our numeric example. The algorithm
begins at the right-hand node representing the least significant bits, and
iteratively branches and prunes guesses for successive bits moving towards the
most significant bits.
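The bit-by-bit search described above can be sketched in a few lines of Python. This is a minimal breadth-first illustration, not the implementation analyzed in [HS09]; the function name `branch_and_prune` and the use of `?` to mark an erased bit in the pattern strings are our own conventions for this example.

```python
def branch_and_prune(N, p_pattern, q_pattern):
    """Recover (p, q) with p * q == N from erasure patterns.

    Patterns are most-significant-bit-first strings over '0', '1', '?',
    where '?' marks an unknown bit (a notation assumed for this sketch).
    Both patterns must have the same length.
    """
    nbits = len(p_pattern)
    p_known = p_pattern[::-1]  # index i now refers to bit i (LSB = 0)
    q_known = q_pattern[::-1]

    def choices(symbol):
        return (0, 1) if symbol == "?" else (int(symbol),)

    candidates = [(0, 0)]  # partial values of (p, q) below bit i
    for i in range(nbits):
        mod = 1 << (i + 1)
        survivors = []
        for p, q in candidates:
            for pb in choices(p_known[i]):       # branch on bit i of p
                for qb in choices(q_known[i]):   # branch on bit i of q
                    p_new = p | (pb << i)
                    q_new = q | (qb << i)
                    # prune guesses that violate pq = N mod 2^(i+1)
                    if (p_new * q_new - N) % mod == 0:
                        survivors.append((p_new, q_new))
        candidates = survivors
    # once all bits are set, check the relation over the integers
    return [(p, q) for p, q in candidates if p * q == N]


print(branch_and_prune(899, "?11?1", "?1?0?"))
```

On the running example, two partial candidates survive the modular checks at the top bit, and only one of them satisfies pq = N over the integers.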
ith bits of both p and q are known, an incorrect solution has around a 50%
probability of being pruned. Thus the algorithm is expected to be efficient as
long as there are not long runs of simultaneous unknown bits. We assume the
length of p and q is known. Once the algorithm has traversed this many bits,
the final solution pq = N can be checked without modular constraints.
When random bits are known from p and q, the analysis of [HS09] shows
that the tree of generated solutions is expected to have polynomial size when
57% of the bits of p and q are revealed at random. This algorithm can still
be efficient if the distribution of bits known is not random, as long as it allows
efficient pruning of the tree. An example would be learning 3 out of every 5 bits
of p and q, as in [YGH16].
Paterson, Polychroniadou, and Sibborn [PPS12] give an analysis of the re-
quired information for different scenarios, and observe that doing a depth-first
search is more efficient memory-wise than a breadth-first search.
4.3.2 Random known bits of the Chinese remainder coefficients d mod
(p − 1) and d mod (q − 1)
The description in Section 4.3.1 can be extended to recover the Chinese
remainder exponents dp = d mod (p − 1) and dq = d mod (q − 1) with the same
branch and prune technique. This is the most common case encountered
in RSA side channel attacks.
dp = ?0??1, dq = ???0? (writing ? for an unknown bit)
We wish to recover the missing unknown bits of dp and dq , which will allow
us to recover the secret key itself.
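Define integer relations. We know that $e d_p \equiv 1 \bmod (p - 1)$ and
$e d_q \equiv 1 \bmod (q - 1)$. We rewrite these as integer relations
\[ e d_p = 1 + k_p (p - 1), \qquad e d_q = 1 + k_q (q - 1) \]
for some integers $k_p$ and $k_q$.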
We have no information about the values of p and q, but their values are uniquely
determined from a guess for dp or dq .
We also know that
pq = N.
The values kp and kq are unknown, so we must brute force them by running
the algorithm for all possible values. We expect it to fail for incorrect guesses,
and succeed for the unique correct guess. Equation 2 in Section 4.1 shows that
there is a unique value of kq for a given guess for kp . Since kp < e we need to
brute force at most e pairs of values for kp and kq .
In our example, we have kp = 13 and kq = 3, although this won’t be verified
as the correct guesses until the solution is found.
Iteratively solve for each bit. With our integer relations in place, we can
then use them to iteratively solve for each bit of the unknowns dp , dq , p, and q,
starting from the least significant bit. We check guesses for each value against
our three integer relations, and at bit i we prune those that do not satisfy the
relations mod $2^i$. We have three relations and four unknowns, so we generate
at most two new branches at each bit.
\begin{align*}
e d_p - 1 + k_p &\equiv k_p p \bmod 2^i, \\
e d_q - 1 + k_q &\equiv k_q q \bmod 2^i, \\
pq &\equiv N \bmod 2^i.
\end{align*}
Since the values of p and q up to bit i are uniquely determined by our guess
for dp and dq up to bit i, the algorithm prunes solutions based on the relation
$pq \equiv N \bmod 2^i$. The analysis of this case is then identical to the case of learning
bits of p and q at random.
For incorrect guesses for the values of kp and kq , we expect the equations
to act like random constraints, and thus to quickly become unsatisfiable. Once
there are no more possible solutions in a tree, the guess for kp and kq is known
to be incorrect. This is illustrated by Figure 11.
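The following Python sketch illustrates one run of this search for a single fixed guess of (kp, kq). It is a simplified illustration under several assumptions of ours: the function name is hypothetical, `?` marks an unknown bit, kp and kq are assumed odd so that they are invertible mod 2^i, and the public exponent e = 17 is our choice because it is consistent with N = 899, kp = 13, and kq = 3. It needs Python 3.8 or later for `pow(x, -1, m)`.

```python
def recover_crt_exponents(N, e, kp, kq, dp_pattern, dq_pattern):
    """Branch and prune on dp, dq for one fixed guess of (kp, kq).

    Patterns are MSB-first strings over '0', '1', '?'. Assumes kp and kq
    are odd, so p and q mod 2^i follow uniquely from dp and dq mod 2^i.
    """
    nbits = len(dp_pattern)
    dp_known, dq_known = dp_pattern[::-1], dq_pattern[::-1]

    def choices(symbol):
        return (0, 1) if symbol == "?" else (int(symbol),)

    candidates = [(0, 0)]
    for i in range(nbits):
        mod = 1 << (i + 1)
        kp_inv = pow(kp, -1, mod)
        kq_inv = pow(kq, -1, mod)
        survivors = []
        for dp, dq in candidates:
            for a in choices(dp_known[i]):
                for b in choices(dq_known[i]):
                    dp_new = dp | (a << i)
                    dq_new = dq | (b << i)
                    # e*dp - 1 + kp = kp*p  and  e*dq - 1 + kq = kq*q
                    p = (e * dp_new - 1 + kp) * kp_inv % mod
                    q = (e * dq_new - 1 + kq) * kq_inv % mod
                    if (p * q - N) % mod == 0:  # prune on pq = N
                        survivors.append((dp_new, dq_new))
        candidates = survivors

    solutions = []
    for dp, dq in candidates:  # final check over the integers
        if (e * dp - 1) % kp or (e * dq - 1) % kq:
            continue
        p = (e * dp - 1) // kp + 1
        q = (e * dq - 1) // kq + 1
        if p * q == N:
            solutions.append((dp, dq, p, q))
    return solutions


print(recover_crt_exponents(899, 17, 13, 3, "?0??1", "???0?"))
```

A full attack would wrap this in a loop over the candidate values of kp, with kq derived from each guess, and keep the guess whose tree yields a solution.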
4.3.3 Recovering RSA keys from indirect information
For this type of key recovery algorithm, it is not always necessary to have direct
knowledge of bits of the secret key values with certainty. It can still be possi-
ble to apply the branch-and-prune technique to recover secret keys even if only
“implicit” information is known about the secret values, as long as this implicit
information implies a relationship that can be checked to prioritize or prune
candidate key guesses from the least significant bits. Examples in the literature
include [BBG+ 17], which computes partial sliding window square-and-multiply
sequences for candidate guesses and compares them to the ground truth mea-
surements, and [MVH+ 20], which compares the sequence of program branches
in a binary GCD algorithm implementation computed over the cryptographic
secrets to a ground truth measurement.
4.3.4 Open problem: Random known bits without redundancy
As mentioned in Section 4.2.6, it is an open problem to recover an RSA secret
key when many nonconsecutive chunks of bits need to be recovered, and the
bits known are from only one secret key field, with no additional information
from other values. Applying the branch-and-prune methods discussed in this
section to a single secret key value, say a factor p of N, where random bits
are known, would result in a tree with exponentially many solutions unless
additional information were available to prune the tree.
Figure 11: We give a sample branch and prune tree for recovering dp and dq
from known bits, starting from the least significant bits on the right side of the
tree. At each bit location, the value of p up to bit i is uniquely determined by
the guess for dp up to bit i, and the value of q up to bit i is uniquely determined
by the guess for dq up to bit i. The red X marks the branches that are pruned
by verifying the relation $pq \equiv N \bmod 2^i$.
the value r to be the x-coordinate of kg. The implementation then computes
the integer $s = k^{-1}(h + dr) \bmod n$. The signature is the pair of integers (r, s).
5.1.3 Nonce recovery and (EC)DSA security.
The security of (EC)DSA is extremely dependent on the signature nonce k
being securely generated, uniformly distributed, and unique for every signature.
If the nonce for one or more signatures is generated in a vulnerable manner,
then an attacker may be able to efficiently recover the long-term secret signing
key. Because of this property, side channel attacks against (EC)DSA almost
universally target nonce generation.
Key recovery from signature nonce. For a DSA or ECDSA key, if the
nonce k is known for a single signature, it is simple to compute the long-term
private key. Rearranging the expression for s, the secret key d can be recovered
as
\[ d = r^{-1}(ks - h) \bmod n \qquad (5) \]
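As a small sanity check of this rearrangement, the sketch below signs with a known nonce and then recovers d. It only exercises the modular arithmetic: the modulus n = 2^61 − 1 is a toy prime chosen for illustration, and r is an arbitrary stand-in value rather than the x-coordinate of a real curve point. The helper names are hypothetical.

```python
n = 2**61 - 1  # toy prime standing in for the group order (assumption)


def sign_s(h, d, k, r, n):
    """Compute s = k^-1 (h + d r) mod n for given nonce k and value r."""
    return pow(k, -1, n) * (h + d * r) % n


def recover_key(r, s, h, k, n):
    """Recover d = r^-1 (k s - h) mod n from a signature with known nonce."""
    return pow(r, -1, n) * (k * s - h) % n


d = 123456789   # long-term secret key
k = 987654321   # nonce, assumed leaked in full
r = 55555       # stand-in for the x-coordinate of kg
h = 424242      # message hash

s = sign_s(h, d, k, r, n)
assert recover_key(r, s, h, k, n) == d
```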
Figure 12: (EC)DSA key recovery from signatures where most significant bits
of the nonces are known.
The first technique is via lattices. This is generally considered more straight-
forward to implement, and works well when more nonce bits are known, and
information from fewer signatures is available: we would need to know at least
two most significant bits from the nonces of dozens to hundreds of signatures.
We cover this technique below.
The second technique is via Fourier analysis. This technique can deal with
as little as one known most significant bit from signature nonces, but
empirically appears to require an order of magnitude or more signatures than the
lattice approach. Recent works report using $2^{23}$ [ANT+ 20], $2^{35}$ [ANT+ 20], and
$2^{26}$ [TTA18] signatures for record computations. We leave a more detailed
discussion of this technique to a future version of this survey. Nice descriptions of
the algorithm can be found in [DHMP13, TTA18].
5.2.1 Lattice attacks
The main idea behind lattice attacks for (EC)DSA key recovery is to formulate
the (EC)DSA key recovery problem as an instance of the Hidden Number Prob-
lem and then compute the shortest vector of a specially constructed lattice to
reveal the solution.
Below we give a simplified example that shows how to recover the key from a
small number of signatures when many of the most significant bits of the nonce
are zero, and then we will show how to extend the attack to more signatures
with fewer bits known from each nonce, and cover the case of arbitrary bits
known from the nonce.
Problem setup. Let p = 0xffffffffffffd21f be a 64-bit prime, and let
$E : y^2 = x^3 + 3$ be an elliptic curve over $\mathbb{F}_p$. Let g = (1, 2) be our generator
point on E, which has order n = 0xfffffffefa23f437.
We have two ECDSA signatures
and
These signatures both use 32-bit nonces k; that is, we know that their 32
most significant bits are 0.
Cast the problem as a system of equations. Our signatures above satisfy
the equivalencies
\[ s_1 \equiv k_1^{-1}(h_1 + d r_1) \bmod n, \qquad s_2 \equiv k_2^{-1}(h_2 + d r_2) \bmod n. \]
The values k1, k2, and d are unknown; the other values are known.
We can eliminate the variable d and rearrange terms as follows:
\[ k_1 - s_1^{-1} s_2 r_1 r_2^{-1} k_2 + s_1^{-1} r_1 h_2 r_2^{-1} - s_1^{-1} h_1 \equiv 0 \bmod n \]
Let $t = -s_1^{-1} s_2 r_1 r_2^{-1}$ and $u = s_1^{-1} r_1 h_2 r_2^{-1} - s_1^{-1} h_1$. We can then simplify
the above as
\[ k_1 + t k_2 + u \equiv 0 \bmod n \qquad (6) \]
We wish to solve for k1 and k2, and we know that they are both small. Let
$|k_1|, |k_2| < K$. For our example, we have $K = 2^{32}$.
Construct a lattice. We construct the following lattice basis:
\[ B = \begin{bmatrix} n & 0 & 0 \\ t & 1 & 0 \\ u & 0 & K \end{bmatrix} \]
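To see why this basis is useful, the sketch below builds a toy instance and checks that the vector (−k1, k2, K), whose entries are all bounded by K, is an integer combination of the rows of B. This is only a membership check, not a lattice reduction: finding the vector would require LLL or BKZ from an external library. The modulus, key, hashes, and r values are all made-up illustration values, and r is not derived from a curve point.

```python
n = 2**61 - 1            # toy prime modulus (assumption, not the curve order)
K = 2**30                # bound on the nonces
d = 0x0123456789abcdef   # secret key, used only to build the instance


def inv(x):
    return pow(x, -1, n)


def sign(h, k, r):
    """s = k^-1 (h + d r) mod n, with r treated as an arbitrary value."""
    return inv(k) * (h + d * r) % n


h1, k1, r1 = 1111, 0x2468ace1, 0x1f2e3d4c5b6a
h2, k2, r2 = 2222, 0x13579bdf, 0x0a1b2c3d4e5f
s1, s2 = sign(h1, k1, r1), sign(h2, k2, r2)

# Coefficients of the relation k1 + t*k2 + u = 0 mod n (Equation 6)
t = -inv(s1) * s2 * r1 * inv(r2) % n
u = (inv(s1) * r1 * h2 * inv(r2) - inv(s1) * h1) % n
assert (k1 + t * k2 + u) % n == 0

B = [[n, 0, 0],
     [t, 1, 0],
     [u, 0, K]]

# c*row0 + k2*row1 + 1*row2 lands on (-k1, k2, K)
c = -(k1 + t * k2 + u) // n
v = [c * B[0][j] + k2 * B[1][j] + B[2][j] for j in range(3)]
assert v == [-k1, k2, K]
print(v)
```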
Scaling to many signatures to decrease the number of bits known.
To decrease the number of bits required from each signature, we can incor-
porate more signatures into the lattice. If we have access to many signatures
$(r_1, s_1), \ldots, (r_m, s_m)$ on message hashes $h_1, \ldots, h_m$, we use the same method
above to write down equivalencies $s_i \equiv k_i^{-1}(h_i + d r_i) \bmod n$, then as above we
rearrange terms and eliminate the variable d to obtain
\begin{align*}
k_1 + t_1 k_m + u_1 &\equiv 0 \bmod n \\
k_2 + t_2 k_m + u_2 &\equiv 0 \bmod n \\
&\;\;\vdots \qquad\qquad\qquad\qquad (7) \\
k_{m-1} + t_{m-1} k_m + u_{m-1} &\equiv 0 \bmod n
\end{align*}
In order to solve SVP, we must run an algorithm like BKZ with block size
dim L(B) = m + 1. Using BKZ to look for the shortest vector can be done rela-
tively efficiently up to dimension around 100 currently; beyond that it becomes
increasingly expensive. In practice, one can often achieve a faster running time
for fixed parameters by using more samples to construct a larger dimension lat-
tice, and applying BKZ with a smaller block size to find the target vector. This
method can recover a secret key from knowledge of the 4 most significant bits of
nonces from 256-bit ECDSA signatures using about 70 samples, and 3 most sig-
nificant bits using around 95 samples. For fewer bits known, either the Fourier
analysis technique or a more powerful application of these lattice techniques is
required, along with significantly more computational power.
Known nonzero most significant bits. If the most significant bits of the
ki are nonzero and known, we can write ki = ai + bi, where the ai are known,
and the bi are small, so satisfy some bound |bi| < K. Then substituting into
Equation 6, we obtain
\begin{align*}
a_i + b_i + t_i (a_m + b_m) + u_i &\equiv 0 \bmod n \\
b_i + t_i b_m + a_i + t_i a_m + u_i &\equiv 0 \bmod n
\end{align*}
Thus we can let $u_i' = u_i + a_i + t_i a_m$, and use the same lattice construction
as above, with $u_i'$ substituted for $u_i$.
Nonce rebalancing. The signature nonces ki take values in the range 0 <
ki < n, but the lattice construction bounds the absolute value |ki|. Thus if we
know that 0 < ki < K for some bound K, we can achieve a tighter bound by
renormalizing the signatures. Let $k_i' = k_i - K/2$, so that $|k_i'| < K/2$. Then we
can write Equations 7 as
\begin{align*}
k_i + t_i k_m + u_i &\equiv 0 \bmod n \\
(k_i' + K/2) + t_i (k_m' + K/2) + u_i &\equiv 0 \bmod n \\
k_i' + t_i k_m' + (t_i + 1)K/2 + u_i &\equiv 0 \bmod n
\end{align*}
Thus we have an equivalent problem with $t_i' = t_i$, $u_i' = (t_i + 1)K/2 + u_i$, and
$K' = K/2$, and can solve as before. This optimization can make a significant
difference in practice by reducing the number of required samples.
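The recentering can be checked numerically. The sketch below uses made-up values: a toy prime modulus, an arbitrary coefficient t_i, and a constant u_i chosen so that the original relation holds. The helper name `rebalance` is hypothetical.

```python
n = 2**61 - 1       # toy prime modulus (assumption)
K = 2**32           # original bound: 0 < k < K

t_i = 0x1234567     # arbitrary coefficient for illustration
k_i, k_m = 5, K - 7                 # nonces near both ends of the range
u_i = -(k_i + t_i * k_m) % n        # chosen so that the relation holds
assert (k_i + t_i * k_m + u_i) % n == 0


def rebalance(t, u, K, n):
    """Return (t', u', K') for the recentered nonces k' = k - K/2."""
    return t, ((t + 1) * (K // 2) + u) % n, K // 2


t_new, u_new, K_new = rebalance(t_i, u_i, K, n)
ki_new, km_new = k_i - K // 2, k_m - K // 2

# The recentered nonces satisfy the same kind of relation with a halved bound.
assert (ki_new + t_new * km_new + u_new) % n == 0
assert abs(ki_new) < K_new and abs(km_new) < K_new
```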
5.2.2 (EC)DSA key recovery from least significant bits of the nonce
k
The attack described in the previous section works just as well for known least
significant bits of the (EC)DSA nonce.
Figure 13: (EC)DSA key recovery from signatures where least significant bits
of the nonces are known.
\begin{align*}
a_i + 2^\ell b_i + t_i (a_m + 2^\ell b_m) + u_i &\equiv 0 \bmod n \\
2^\ell b_i + 2^\ell t_i b_m + a_i + t_i a_m + u_i &\equiv 0 \bmod n \\
b_i + t_i b_m + 2^{-\ell}(a_i + t_i a_m + u_i) &\equiv 0 \bmod n
\end{align*}
We have an equivalent instance of the problem with $t_i' = t_i$,
$u_i' = 2^{-\ell}(a_i + t_i a_m + u_i)$, and $B' = B$, and solve as above.
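A short numeric check of this transformation, under the assumption that each nonce splits as k = a + 2^ℓ b with a the known low bits and b the unknown high part. All concrete values below are invented for illustration, and n is a toy prime rather than a real group order.

```python
n = 2**61 - 1   # toy prime modulus (assumption)
ell = 8         # number of known least significant bits


def split(k):
    """Return (a, b) with k = a + 2^ell * b, a being the known low bits."""
    return k % 2**ell, k >> ell


k_i, k_m = 0x1122334455, 0x66778899AA
t_i = 0xABCDEF12345
u_i = -(k_i + t_i * k_m) % n        # chosen so that k_i + t_i k_m + u_i = 0 mod n

a_i, b_i = split(k_i)
a_m, b_m = split(k_m)

# u' = 2^-ell (a_i + t_i a_m + u_i) mod n; t is unchanged
u_new = pow(2**ell, -1, n) * (a_i + t_i * a_m + u_i) % n

# The unknown high parts satisfy a relation of the same shape.
assert (b_i + t_i * b_m + u_new) % n == 0
```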
5.2.3 (EC)DSA key recovery from middle bits of the nonce k
Figure 14: (EC)DSA key recovery from signatures where middle bits of the
nonces are known.
Recovering an ECDSA key from middle bits of the nonce k is slightly more com-
plex than the methods discussed above, because we have two unknown “chunks”
of the nonce to recover per signature. Fortunately, we can deal with these by
extending the methods to multiple variables per signature. The method we will
use here is similar to the multivariate extension in Section 4.2.4, but this case
is simpler.
Problem setup. We will use the same elliptic curve group parameters as
above. Let p = 0xffffffffffffd21f be a 64-bit prime, and let $E : y^2 = x^3 + 3$
be an elliptic curve over $\mathbb{F}_p$. Let g = (1, 2) be our generator point on E, which
has order n = 0xfffffffefa23f437.
We have two ECDSA signatures
and
Let
a1 = 0x50e2fd5d8000
be the middle 34 bits of the signature nonce k1 used for the first signature above.
The first and last 15 bits are unknown. Let
a2 = 0x172930ab48000
be the middle 34 bits of the signature nonce k2 used for the second signature
above.
Cast the problem as a system of equations. As above, our two signature
nonces k1 and k2 satisfy the relation
\[ k_1 + t k_2 + u \equiv 0 \bmod n \qquad (8) \]
where $t = -s_1^{-1} s_2 r_1 r_2^{-1}$ and $u = s_1^{-1} r_1 h_2 r_2^{-1} - s_1^{-1} h_1$.
Since we know the middle bits of k1 and k2 are a1 and a2 respectively, we
can write
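\[ k_1 = a_1 + b_1 + 2^\ell c_1 \quad \text{and} \quad k_2 = a_2 + b_2 + 2^\ell c_2 \]
where b1, c1, b2, and c2 are unknown but small, less than some bound K. In
our example, we have $|b_1|, |b_2|, |c_1|, |c_2| \le 2^{15}$ and $\ell = 64 - 15 = 49$.
Substituting and rearranging into Equation 8, we have
\[ b_1 + 2^{49} c_1 + t b_2 + 2^{49} t c_2 + a_1 + t a_2 + u \equiv 0 \bmod n. \]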
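\[ B = \begin{bmatrix}
K & K \cdot 2^{49} & Kt & Kt \cdot 2^{49} & u' \\
  & Kn & & & \\
  & & Kn & & \\
  & & & Kn & \\
  & & & & n
\end{bmatrix} \]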
If we call the BKZ algorithm on B, we obtain a basis that contains the vector
We can do the same for the next three short vectors in the basis, and obtain
four linear polynomials in our four unknowns. Solving the system, we obtain
the solutions
More detailed explanation. The row vectors of the lattice correspond to
the weighted coefficient vectors of the linear polynomial f in Equation 9, nx1 ,
ny1 , nx2 , and ny2 . Each of these linear polynomials vanishes by construction
modulo n when evaluated at the desired solution x1 = b1 , y1 = c1 , x2 = b2 ,
y2 = c2 , and thus so does any linear polynomial corresponding to a vector in
this lattice. If we can find a lattice vector whose $\ell_1$ norm is less than n, then the
corresponding linear equation vanishes over the integers when evaluated at the
desired solution. Since we have four unknowns, if we can find four sufficiently
short lattice vectors corresponding to four linearly independent equations, we
can solve for our desired unknowns.
The determinant of our example lattice is $\det B = K^4 n^4$, and the lattice has
dimension 5. Thus, ignoring approximation factors and constants, we expect
to find a vector of length $(\det B)^{1/\dim B} = (Kn)^{4/5}$. This is less than n when
$K^4 < n$; in our example this is satisfied because we have chosen a 15-bit K and
a 64-bit n.
The determinant bounds guarantee that we will find one short lattice vector,
but do not guarantee that we will find four short lattice vectors. For that, we
rely on the heuristic that the reduced vectors of a random lattice are close to
the same length.
5.2.4 (EC)DSA key recovery from many chunks of nonce bits
The above technique can be extended to an arbitrary number of variables.
Figure: (EC)DSA key recovery from signatures where multiple chunks of the
nonces are known.
The extension is called the Extended Hidden Number problem [HR07] and
can be used to solve for ECDSA keys when many chunks of signature nonces
are known. Each unknown “chunk” of nonce in each signature introduces a
new variable, so the resulting lattice will have dimension one larger than the
total number of unknowns; if there are m signatures and h unknown chunks of
nonce per signature, the lattice will have dimension mh + 1. We expect this
technique to find the solution when the parameters are such that the system of
equations has a unique solution. If the size of each chunk is K, heuristically
this will happen when $K^{mh} < n^{m-1}$. This technique has been used in practice
in [FWC16] and further explored in [DPP20].
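The heuristic condition can be turned into a quick feasibility estimate. The helper below is a hypothetical convenience of ours, not part of [HR07]: it approximates K and n by powers of two, so the condition becomes chunk_bits · m · h < n_bits · (m − 1), and it ignores the approximation factors of lattice reduction.

```python
def ehnp_min_signatures(chunk_bits, h, n_bits, max_m=10_000):
    """Smallest m with K^(m h) < n^(m - 1), taking K = 2^chunk_bits
    and n roughly 2^n_bits. Returns None if no m up to max_m works."""
    for m in range(2, max_m + 1):
        if chunk_bits * m * h < n_bits * (m - 1):
            return m
    return None


# Two unknown 15-bit chunks per nonce and a 64-bit n: two signatures
# suffice, matching the condition K^4 < n from the middle-bits example.
print(ehnp_min_signatures(15, 2, 64))
# Two unknown 32-bit chunks of a 64-bit nonce leave nothing known, so
# no number of signatures satisfies the condition.
print(ehnp_min_signatures(32, 2, 64))
```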
6 Key recovery method for the Diffie-Hellman
Key Exchange
6.1 Finite field and elliptic curve Diffie-Hellman prelimi-
naries
The Diffie-Hellman (DH) key exchange protocol [DH76] allows two parties to
create a common secret in a secure manner. We summarize the protocol in the
context of finite fields and elliptic curves.
Finite field Diffie-Hellman. Finite-field Diffie-Hellman parameters are spec-
ified by a prime p and a group generator g. Common implementation choices
are p a safe prime, i.e., q = (p − 1)/2 is prime, in which case g is often equal to
2, 3 or 4, or p is chosen such that p − 1 has a 160, 224, or 256-bit prime factor
q and g generates a subgroup of $\mathbb{F}_p^*$ of order q. Key exchange is performed as
follows:
6.2 Most significant bits of finite field Diffie-Hellman shared
secret
The Hidden Number Problem approach we used in the previous section to re-
cover ECDSA or DSA keys from information about the nonces can also be used
to recover a Diffie-Hellman shared secret from most significant bits.
A = g a mod p = 0x3526bb85185259cd42b61e5532fe60e0
and
B = g b mod p = 0x564df0b92ea00ea314eb5a246b01ac9c.
We have learned the value of the first 65 bits of s: let
r1 = 0x3330422f6047011b8000000000000000,
r2 = 0x80097373878e37d20000000000000000.
\[ s = r_1 + k_1 \bmod p \qquad st = r_2 + k_2 \bmod p \]
where s, k1, and k2 are unknown, k1 and k2 are small, and r1, r2, and t are known. We
can eliminate the variable s to obtain the linear equation
\[ k_1 - t^{-1} k_2 + r_1 - t^{-1} r_2 \equiv 0 \bmod p. \]
We now have a linear equation in the same form as the Hidden Number
Problem we solved in the previous section.
Construct a lattice. We construct the lattice basis
\[ M = \begin{bmatrix} p & & \\ t^{-1} & 1 & \\ r_1 - t^{-1} r_2 & & K \end{bmatrix} \]
If we call the LLL algorithm on M , we obtain a basis that contains the vector
This corresponds to our desired solution (k1 , k2 , K), although if the Diffie-
Hellman assumption is true we cannot verify its correctness.
More detailed explanation. This method is due to Boneh and Venkate-
san [BV96], and was the original motivation for their formulation of the Hidden
Number Problem. The Raccoon attack recently demonstrated an attack sce-
nario using this technique in the context of TLS [MBA+ 20].
This method can be adapted to multiple samples with the same number of
bits required as the attacks on ECDSA. Knowing the most significant bits of s is
not necessary either; we only need the most significant bits of known multiples
ti of s.
6.3 Discrete log from contiguous bits of Diffie-Hellman
secret exponents
This section addresses the problem of Diffie-Hellman key recovery when the
known partial information is part of one or the other of the secret exponents.
The technique we apply in this section is Pollard’s kangaroo (also known as
lambda) algorithm [Pol78]. Unlike the techniques of the previous sections, which
are generally efficient when the attacker’s knowledge of the key is above a certain
threshold, and either inefficient or infeasible when the attacker’s knowledge of
the key is below this threshold, this algorithm runs in exponential time: square
root of the size of the interval. Thus it provides a significant benefit over brute
force, but in practice is likely limited to 80 bits or fewer of key recovery unless
you have access to an unusually large amount of computational resources.
The Pollard kangaroo algorithm is a generic discrete logarithm algorithm
that is designed to compute discrete logarithms when the discrete logarithm
lies in a small known interval. It applies to both elliptic curve and finite field
discrete logarithms. We will use finite field discrete logarithms for our examples,
but the algorithm is the same in the elliptic curve context.
6.3.1 Known most significant bits of the Diffie-Hellman secret ex-
ponent.
Problem Setup. Using the same notation for finite fields as in Section 6.1, let
A be a Diffie-Hellman public key, p be a prime modulus, and g a generator of
a multiplicative group of order q modulo p. These values are all public, and thus
we assume that they are known. Imagine that we have obtained a consecutive
fraction of the most significant bits of the secret exponent a, and we wish to
recover the unknown bits of a to reconstruct the secret.
Figure 15: Recovering Diffie-Hellman shared secret with most significant bits of
secret exponent.
[Figure: example tame and wild kangaroo walks over the interval from
m = 0x1400 to m + w = 0x1500.]
Compute the discrete log. We know that $s_i = s_j'$ for $s_i$ on the tame
kangaroo's path and $s_j'$ on the wild kangaroo's path. Thus we have
\begin{align*}
s_i &= s_j' \bmod p \\
g^{x_i} &= g^{a + x_j'} \bmod p \\
x_i &= a + x_j' \bmod q \\
x_i - x_j' &= a \bmod q
\end{align*}
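The collision-to-logarithm step above fits into a compact kangaroo search. The sketch below is a simplified variant written under our own assumptions rather than a reproduction of the survey's example: the tame kangaroo starts at the top of the interval, jump sizes are powers of two selected by the current group element, the number of tame steps is an untuned constant multiple of sqrt(w), and the parameters (p = 2^61 − 1, g = 3, exponents taken mod q = p − 1) are toy values. A single run can miss the trap and return None, so the usage example retries with a different jump function.

```python
from math import isqrt


def kangaroo(g, A, p, q, lo, hi, seed=0):
    """Search for a in [lo, hi] with g^a = A mod p; returns a or None."""
    w = hi - lo
    num_jumps = max(1, isqrt(w).bit_length())

    def jump(s):
        # deterministic pseudorandom jump size, shared by both walks
        return 1 << ((s + seed) % num_jumps)

    # Tame kangaroo: known exponent x_t, sets a trap at its final position.
    x_t, s_t = hi, pow(g, hi, p)
    for _ in range(4 * isqrt(w) + 1):
        j = jump(s_t)
        x_t, s_t = x_t + j, s_t * pow(g, j, p) % p

    # Wild kangaroo: starts at A = g^a and tracks only its distance x_w.
    x_w, s_w = 0, A
    while x_w <= x_t - lo:          # beyond this it has passed the trap
        if s_w == s_t:              # collision: g^(a + x_w) = g^(x_t)
            return (x_t - x_w) % q
        j = jump(s_w)
        x_w, s_w = x_w + j, s_w * pow(g, j, p) % p
    return None


p = 2**61 - 1                 # toy prime modulus (assumption)
g, q = 3, p - 1               # exponents reduced mod p - 1
lo, hi = 0x1400, 0x1500       # interval known to contain the exponent
a = 0x1456                    # secret exponent, used only to build A
A = pow(g, a, p)

found = None
for seed in range(16):        # retry with a different jump function
    found = kangaroo(g, A, p, q, lo, hi, seed)
    if found is not None:
        break
print(found)
```

Both walks must use the same jump function, since the method relies on the two paths merging once they land on a common group element.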
6.3.2 Unknown most significant bits of the Diffie-Hellman secret ex-
ponent
Figure 16: Recovering Diffie-Hellman shared secret with least significant bits
7 Conclusion
This work surveyed key recovery methods with partial information for popular
public key cryptographic algorithms. We focused in particular on the most
widely-deployed asymmetric primitives: RSA, (EC)DSA and Diffie-Hellman.
The motivation for these algorithms arises from a variety of side-channel attacks.
While the existence of key recovery algorithms for certain cases may deter-
mine whether a particular vulnerability is exploitable or not, we emphasize that
these thresholds for an efficiently exploitable key recovery attack should not
be used to guide countermeasures. Instead, implementations should strive to
have fully constant-time operations for all cryptographic operations to protect
against side-channel attacks.
8 Acknowledgements
Pierrick Gaudry, Daniel Genkin, and Yuval Yarom made significant contribu-
tions to early versions of this work. We thank Akira Takahashi and Billy Bob
Brumley for clarifications and suggesting additional citations. This work was
funded by the US National Science Foundation under grants no. 1513671 and
1651344.
References
[ANT+ 20] Diego F. Aranha, Felipe Rodrigues Novaes, Akira Takahashi,
Mehdi Tibouchi, and Yuval Yarom. LadderLeak: Breaking
ECDSA with less than one bit of nonce leakage. In Jay Ligatti,
Xinming Ou, Jonathan Katz, and Giovanni Vigna, editors, ACM
CCS 20, pages 225–242. ACM Press, November 2020.
[AS08] Onur Aciiçmez and Werner Schindler. A vulnerability in RSA im-
plementations due to instruction cache analysis and its demonstra-
tion on OpenSSL. In Tal Malkin, editor, CT-RSA 2008, volume
4964 of LNCS, pages 256–273. Springer, Heidelberg, April 2008.
[ASK07] Onur Aciiçmez, Werner Schindler, and Çetin Kaya Koç. Cache
based remote timing attack on the AES. In Masayuki Abe, editor,
CT-RSA 2007, volume 4377 of LNCS, pages 271–286. Springer,
Heidelberg, February 2007.
[BBG+ 17] Daniel J. Bernstein, Joachim Breitner, Daniel Genkin, Leon Groot
Bruinderink, Nadia Heninger, Tanja Lange, Christine van Vre-
dendaal, and Yuval Yarom. Sliding right into disaster: Left-
to-right sliding windows leak. In Wieland Fischer and Naofumi
Homma, editors, CHES 2017, volume 10529 of LNCS, pages 555–
576. Springer, Heidelberg, September 2017.
[BH09] Billy Bob Brumley and Risto M. Hakala. Cache-timing template
attacks. In Mitsuru Matsui, editor, ASIACRYPT 2009, volume
5912 of LNCS, pages 667–684. Springer, Heidelberg, December
2009.
[Ble98] Daniel Bleichenbacher. Chosen ciphertext attacks against proto-
cols based on the RSA encryption standard PKCS #1. In Hugo
Krawczyk, editor, CRYPTO’98, volume 1462 of LNCS, pages 1–
12. Springer, Heidelberg, August 1998.
[BM03] Johannes Blömer and Alexander May. New partial key exposure
attacks on RSA. In Dan Boneh, editor, CRYPTO 2003, volume
2729 of LNCS, pages 27–43. Springer, Heidelberg, August 2003.
[Bon98] Dan Boneh. The decision Diffie-Hellman problem. In Third Al-
gorithmic Number Theory Symposium (ANTS), volume 1423 of
LNCS. Springer, Heidelberg, 1998. Invited paper.
[BT11] Billy Bob Brumley and Nicola Tuveri. Remote timing attacks
are still practical. In Vijay Atluri and Claudia Dı́az, editors, ES-
ORICS 2011, volume 6879 of LNCS, pages 355–371. Springer, Hei-
delberg, 2011.
[BV96] Dan Boneh and Ramarathnam Venkatesan. Hardness of comput-
ing the most significant bits of secret keys in Diffie-Hellman and
related schemes. In Neal Koblitz, editor, CRYPTO’96, volume
1109 of LNCS, pages 129–142. Springer, Heidelberg, August 1996.
[BvSY14] Naomi Benger, Joop van de Pol, Nigel P. Smart, and Yuval Yarom.
“ooh aah... just a little bit”: A small amount of side channel can
go a long way. In Lejla Batina and Matthew Robshaw, editors,
CHES 2014, volume 8731 of LNCS, pages 75–92. Springer, Heidel-
berg, September 2014.
[Cop96] Don Coppersmith. Finding a small root of a bivariate integer
equation; factoring with high bits known. In Ueli M. Maurer,
editor, EUROCRYPT’96, volume 1070 of LNCS, pages 178–189.
Springer, Heidelberg, May 1996.
[DDME+ 18] Fergus Dall, Gabrielle De Micheli, Thomas Eisenbarth, Daniel
Genkin, Nadia Heninger, Ahmad Moghimi, and Yuval Yarom.
CacheQuote: Efficiently recovering long-term secrets of SGX EPID
via cache attacks. IACR Transactions on Cryptographic Hardware
and Embedded Systems, 2018(2):171–191, May 2018.
[DH76] Whitfield Diffie and Martin E. Hellman. New directions in cryp-
tography. IEEE Trans. Information Theory, 22(6):644–654, 1976.
[DHMP13] Elke De Mulder, Michael Hutter, Mark E. Marson, and Peter Pear-
son. Using Bleichenbacher’s solution to the hidden number prob-
lem to attack nonce leaks in 384-bit ECDSA. In Guido Bertoni
and Jean-Sébastien Coron, editors, CHES 2013, volume 8086 of
LNCS, pages 435–452. Springer, Heidelberg, August 2013.
[HG97] Nicholas Howgrave-Graham. Finding small roots of univariate
modular equations revisited. In Michael Darnell, editor, Cryptography
and Coding, pages 131–142, Berlin, Heidelberg, 1997. Springer
Berlin Heidelberg.
[HG01] Nick Howgrave-Graham. Approximate integer common divisors.
pages 51–66, 2001.
[HM08] Mathias Herrmann and Alexander May. Solving linear equations
modulo divisors: On factoring given any bits. In Josef Pieprzyk,
editor, ASIACRYPT 2008, volume 5350 of LNCS, pages 406–424.
Springer, Heidelberg, December 2008.
[HR07] Martin Hlavác and Tomás Rosa. Extended hidden number prob-
lem and its cryptanalytic applications. In Eli Biham and Amr M.
Youssef, editors, SAC 2006, volume 4356 of LNCS, pages 114–133.
Springer, Heidelberg, August 2007.
[HS09] Nadia Heninger and Hovav Shacham. Reconstructing RSA private
keys from random key bits. In Shai Halevi, editor, CRYPTO 2009,
volume 5677 of LNCS, pages 1–17. Springer, Heidelberg, August
2009.
[HS14] Michael Hutter and Jörn-Marc Schmidt. The temperature side
channel and heating fault attacks. Cryptology ePrint Archive,
Report 2014/190, 2014.
[HSH+ 08] J. Alex Halderman, Seth D. Schoen, Nadia Heninger, William
Clarkson, William Paul, Joseph A. Calandrino, Ariel J. Feldman,
Jacob Appelbaum, and Edward W. Felten. Lest we remember:
Cold boot attacks on encryption keys. In Paul C. van Oorschot,
editor, USENIX Security 2008, pages 45–60. USENIX Association,
July / August 2008.
[IGA+ 15] Mehmet Sinan Inci, Berk Gülmezoglu, Gorka Irazoqui Apecechea,
Thomas Eisenbarth, and Berk Sunar. Seriously, get off my cloud!
Cross-VM RSA key recovery in a public cloud. IACR Cryptology
ePrint Archive, 2015:898, 2015.
[KHF+ 19] Paul Kocher, Jann Horn, Anders Fogh, Daniel Genkin, Daniel
Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Man-
gard, Thomas Prescher, Michael Schwarz, and Yuval Yarom. Spec-
tre attacks: Exploiting speculative execution. In 2019 IEEE Sym-
posium on Security and Privacy, pages 1–19. IEEE Computer So-
ciety Press, May 2019.
[KJJ99] Paul C. Kocher, Joshua Jaffe, and Benjamin Jun. Differential
power analysis. In Michael J. Wiener, editor, CRYPTO’99, volume
1666 of LNCS, pages 388–397. Springer, Heidelberg, August 1999.
[KJJR11] Paul Kocher, Joshua Jaffe, Benjamin Jun, and Pankaj Rohatgi. In-
troduction to differential power analysis. Journal of Cryptographic
Engineering, 1(1):5–27, Apr 2011.
[Koc96] Paul C. Kocher. Timing attacks on implementations of Diffie-
Hellman, RSA, DSS, and other systems. In Neal Koblitz, editor,
CRYPTO’96, volume 1109 of LNCS, pages 104–113. Springer, Hei-
delberg, August 1996.
[LLL82] Arjen Klaas Lenstra, Hendrik Willem Lenstra, and László Lovász.
Factoring polynomials with rational coefficients. Mathematische
Annalen, 261(4):515–534, 1982.
[LSG+ 18] Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher,
Werner Haas, Anders Fogh, Jann Horn, Stefan Mangard, Paul
Kocher, Daniel Genkin, Yuval Yarom, and Mike Hamburg. Melt-
down: Reading kernel memory from user space. In William Enck
and Adrienne Porter Felt, editors, USENIX Security 2018, pages
973–990. USENIX Association, August 2018.
[May10] Alexander May. Using LLL-reduction for solving RSA and fac-
torization problems. ISC, pages 315–348. Springer, Heidelberg,
2010.
[MBA+ 20] Robert Merget, Marcus Brinkmann, Nimrod Aviram, Juraj
Somorovsky, Johannes Mittmann, and Jörg Schwenk. Raccoon
attack: Finding and exploiting most-significant-bit-oracles in
TLS-DH(E). 2020.
[Möl03] Bodo Möller. Improved techniques for fast exponentiation. In
Pil Joong Lee and Chae Hoon Lim, editors, ICISC 02, volume
2587 of LNCS, pages 298–312. Springer, Heidelberg, November
2003.
[MVH+ 20] Daniel Moghimi, Jo Van Bulck, Nadia Heninger, Frank Piessens,
and Berk Sunar. CopyCat: Controlled instruction-level attacks
on enclaves. In Srdjan Capkun and Franziska Roesner, editors,
USENIX Security 2020, pages 469–486. USENIX Association, Au-
gust 2020.
[NIS13] National Institute of Standards and Technology. Digital Signature
Standard (DSS), 2013.
[NS02] Phong Q. Nguyen and Igor E. Shparlinski. The insecurity of the
Digital Signature Algorithm with partially known nonces. J. Cryp-
tology, 15(3):151–176, 2002.
[NS03] Phong Q. Nguyen and Igor E. Shparlinski. The insecurity of the
Elliptic Curve Digital Signature Algorithm with partially known
nonces. Des. Codes Cryptography, 30(2):201–217, 2003.
[NS06] Phong Nguyen and Damien Stehlé. LLL on the average. In Pro-
ceedings of the 7th International Conference on Algorithmic Num-
ber Theory, ANTS’06, pages 238–256, Berlin, Heidelberg, 2006.
Springer-Verlag.
[OST06] Dag Arne Osvik, Adi Shamir, and Eran Tromer. Cache attacks
and countermeasures: The case of AES. In David Pointcheval,
editor, CT-RSA 2006, volume 3860 of LNCS, pages 1–20. Springer,
Heidelberg, February 2006.
[OW99] Paul C. Oorschot and Michael J. Wiener. Parallel collision search
with cryptanalytic applications. J. Cryptol., 12(1):1–28, January
1999.
[Pag02] D. Page. Theoretical use of cache memory as a cryptanalytic side-
channel. Cryptology ePrint Archive, Report 2002/169, 2002.
[Per05] Colin Percival. Cache missing for fun and profit. In BSDCon 2005,
Ottawa, CA, 2005.
[Pol78] John M. Pollard. Monte Carlo methods for index computation
mod p. Mathematics of Computation, 32:918–924, 1978.
[PPS12] Kenneth G. Paterson, Antigoni Polychroniadou, and Dale L. Sib-
born. A coding-theoretic approach to recovering noisy RSA keys.
In Xiaoyun Wang and Kazue Sako, editors, ASIACRYPT 2012,
volume 7658 of LNCS, pages 386–403. Springer, Heidelberg, De-
cember 2012.
[QS01] Jean-Jacques Quisquater and David Samyde. Electromagnetic
analysis (EMA): Measures and counter-measures for smart cards. In
Proceedings of the International Conference on Research in Smart
Cards: Smart Card Programming and Security, E-SMART ’01,
pages 200–210, London, UK, 2001. Springer-Verlag.
[Rup10] Raminder Singh Ruprai. Improvements to the Gaudry-Schost Al-
gorithm for Multidimensional discrete logarithm problems and Ap-
plications. PhD thesis, 2010.
[Sch87] Claus-Peter Schnorr. A hierarchy of polynomial time lattice basis
reduction algorithms. Theoretical Computer Science, 53(2-3):201–
224, 1987.
[Sch90] Claus-Peter Schnorr. Efficient identification and signatures for
smart cards. In Gilles Brassard, editor, CRYPTO’89, volume 435
of LNCS, pages 239–252. Springer, Heidelberg, August 1990.
[SE94] Claus-Peter Schnorr and M. Euchner. Lattice basis reduction:
Improved practical algorithms and solving subset sum problems.
Mathematical Programming, 66(2):181–199, 1994.
[Tes00] Edlyn Teske. On random walks for Pollard’s rho method.
Mathematics of Computation, 70:809–825, 2000.
[TSS+ 03] Yukiyasu Tsunoo, Teruo Saito, Tomoyasu Suzaki, Maki Shigeri,
and Hiroshi Miyauchi. Cryptanalysis of DES implemented on
computers with cache. In Colin D. Walter, Çetin Kaya Koç, and
Christof Paar, editors, CHES 2003, volume 2779 of LNCS, pages
62–76. Springer, Heidelberg, September 2003.
[TTA18] Akira Takahashi, Mehdi Tibouchi, and Masayuki Abe. New Ble-
ichenbacher records: Fault attacks on qDSA signatures. IACR
TCHES, 2018(3):331–371, 2018.