Introduction to Side-Channel Attacks
François-Xavier Standaert
Université Catholique de Louvain - UCLouvain
All content following this page was uploaded by François-Xavier Standaert on 20 May 2014.
François-Xavier Standaert⋆
1 Introduction
A cryptographic primitive can be considered from two points of view: on the one
hand, it can be viewed as an abstract mathematical object or black box (i.e. a
transformation, possibly parameterized by a key, turning some input into some
output); on the other hand, this primitive will in fine have to be implemented
in a program that will run on a given processor, in a given environment, and
will therefore present specific characteristics. The first point of view is the one
of classical cryptanalysis; the second one is the one of physical security. Physical
attacks on cryptographic devices take advantage of implementation-specific
characteristics to recover the secret parameters involved in the computation.
They are therefore much less general than classical cryptanalysis, since they are
specific to a given implementation, but often much more powerful, and they are
taken very seriously by manufacturers of cryptographic devices.
Such physical attacks are numerous and can be classified in many ways. The
literature usually sorts them along two orthogonal axes:
1. Invasive vs. non-invasive: invasive attacks require depackaging the chip to
get direct access to its inside components; a typical example of this is the
connection of a wire on a data bus to see the data transfers. A non-invasive
attack only exploits externally available information (the emission of which is,
however, often unintentional) such as running time or power consumption.
2. Active vs. passive: active attacks try to tamper with the device's proper
functioning; for example, fault-induction attacks try to induce errors in
the computation. By contrast, passive attacks simply observe the device's
behavior during its processing, without disturbing it.
⋆ Postdoctoral researcher of the Belgian Fund for Scientific Research (FNRS).
The side-channel attacks we consider in this paper are a class of physical attacks
in which an adversary tries to exploit physical information leakages such as
timing information [10], power consumption [11] or electromagnetic radiation [1].
Since they are non-invasive and passive, and can generally be performed using
relatively cheap equipment, they pose a serious threat to the security of most
cryptographic hardware devices. Such devices range from personal computers
to small embedded devices such as smart cards and RFIDs (Radio Frequency
Identification Devices). Their proliferation in an ever wider spectrum of
applications has turned physical security and side-channel issues into a real,
practical concern that we aim to introduce in this paper.
For this purpose, we start by covering the basics of side-channel attacks. We
discuss the origin of unintended leakages in recent microelectronic technologies
and describe how simple measurement setups can be used to recover and exploit
these physical features. Then, we introduce some classical attacks: Simple Power
Analysis (SPA) and Differential Power Analysis (DPA). In the second part of
the paper, we put forward the different steps of an actual side-channel attack
through two illustrative examples. We take advantage of these examples to stress
a number of practical concerns regarding the implementation of side-channel
attacks and discuss their possible improvements. Finally, we list a number of
countermeasures to reduce the impact of physical information leakages.
[Figure: circuit schematic; recoverable labels: Rmeas, CL, Gnd.]
$$ d\mathbf{B} = \frac{\mu I \, d\mathbf{l} \times \hat{\mathbf{r}}}{4 \pi r^2}, \qquad (2) $$
where µ is the magnetic permeability, I is the current carried on a conductor
of infinitesimal length dl, $\hat{\mathbf{r}}$ is the unit vector pointing from
the current element to the field point and r is the distance from the current
element to the field point. Although such a simple equation does not describe
the exact (complex) radiation of an integrated circuit, it already emphasizes two
important facts: (1) the field is data-dependent, due to the dependence on the
current intensity and (2) the field orientation depends on the current direction.
This data-dependent radiation is again the origin of side-channel information
leakages. In general, any physically observable phenomenon that can be related
to the internal configuration or activity of a cryptographic device can be a source
of useful information to a malicious adversary.
Leakage models. From the previous physical facts, side-channel adversaries
have derived a number of (more or less sophisticated) leakage models. They
can be used both to simulate attacks and to improve an attack's efficiency.
For example, the Hamming distance model assumes that, when a value x0 contained
in a CMOS device switches into a value x1, the actual side-channel
leakages are correlated with the Hamming distance of these values, namely
HD(x0, x1) = HW(x0 ⊕ x1). The Hamming weight model is even simpler and
assumes that, when a value x0 is computed in a device, the actual side-channel
leakages are correlated with the Hamming weight of this value, namely HW(x0).
As will be emphasized in Section 4, good leakage models have a strong impact
on the efficiency of a side-channel attack. Hamming weight and distance models
assume both that there are no differences between 0 → 1 and 1 → 0 events and
that every bit in an implementation contributes identically to the overall power
consumption. Improved models relax these assumptions, e.g. by considering different
leakages for the 0 → 1 and 1 → 0 events [20], by assigning different weights
to the leakage contributions of an implementation's different parts [25] or by
using advanced statistical tools to characterize a device's leakage [6].
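As a minimal sketch of these models, the Python snippet below computes Hamming weights and Hamming distances for integer-encoded values. The function names (`hamming_weight`, `hamming_distance`, `weighted_leakage`) and the per-bit weights are illustrative assumptions, not taken from the cited references; the weighted variant only hints at how the "identical contribution of every bit" assumption could be relaxed.

```python
def hamming_weight(x: int) -> int:
    """Number of bits set to 1 in the non-negative integer x."""
    return bin(x).count("1")


def hamming_distance(x0: int, x1: int) -> int:
    """HD(x0, x1) = HW(x0 XOR x1): number of bit positions that toggle."""
    return hamming_weight(x0 ^ x1)


def weighted_leakage(x: int, weights: list[float]) -> float:
    """Hypothetical refined model: bit j of x contributes weights[j].

    With every weight equal to 1.0 this reduces to the Hamming weight.
    """
    return sum(w for j, w in enumerate(weights) if (x >> j) & 1)


if __name__ == "__main__":
    x0, x1 = 0xB2, 0x66                      # arbitrary 8-bit example values
    print(hamming_weight(x1))                # Hamming weight model
    print(hamming_distance(x0, x1))          # Hamming distance model
    print(weighted_leakage(x1, [1.0] * 8))   # uniform weights: same as HW
```

With uniform weights the third model coincides with the Hamming weight; non-uniform weights would have to be estimated from the target device.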
Beyond the previous classification of physical attacks (i.e. invasive vs. non-
invasive, active vs. passive), the literature also classifies the attacks according
to the statistical treatment applied to the leakage traces. For example, “simple”
and “differential” attacks were introduced in the context of power analysis [11].
Fig. 2: SPA monitoring from a single AES encryption performed by a smart card (axes: voltage versus time).
4. Selection of the device inputs. If allowed, the adversary selects the inputs
that are fed to the target device, e.g. randomly. If not allowed, it is
generally assumed that a side-channel adversary can monitor the plaintexts.
5. Derivation of internal values within the algorithm. This is the core of
the divide-and-conquer strategy. For a number of (known) input plaintexts, the
adversary predicts (key-dependent) internal values within the target device that
are to be computed during the execution of the algorithm. For computational
reasons, only values depending on a small part of the key are useful. For example,
one could predict the 4 bits after the permutation in the first DES round, for
each of the 64 possible key values entering S0, as illustrated in the central table of
Figure 4. As a result of this value derivation phase, the adversary has predicted
internal values of the block cipher implementation for q plaintexts and each key
class candidate s∗ (out of 64 possible ones), stored in vectors $v_{s^*}^q$.
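The derivation step can be sketched in Python as follows. This is an illustration under stated assumptions: the table is the first S-box of the DES standard (named S1 there; the text calls it S0), the 6 known bits are assumed to be already extracted from the expanded right half, and the bit permutation that follows the S-box is omitted because it only repositions the 4 output bits. The names `sbox_output` and `derive_values` are hypothetical.

```python
# First DES S-box, stored as 4 rows of 16 entries.
DES_SBOX = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_output(six_bits: int) -> int:
    """4-bit S-box output for a 6-bit input.

    The two outer bits select the row, the four middle bits the column.
    """
    row = ((six_bits >> 4) & 0b10) | (six_bits & 0b1)
    col = (six_bits >> 1) & 0b1111
    return DES_SBOX[row][col]


def derive_values(known_inputs: list[int]) -> list[list[int]]:
    """Predict the S-box output for every key class candidate.

    known_inputs[i] holds the 6 known bits entering the key addition for
    the i-th plaintext. The result is indexed as values[s][i], where s
    ranges over the 64 candidates for the 6 key bits.
    """
    return [[sbox_output(x ^ s) for x in known_inputs] for s in range(64)]


if __name__ == "__main__":
    values = derive_values([0b000000, 0b101010, 0b111111])
    print(len(values), len(values[0]))   # 64 candidates, 3 predictions each
```

Each row of `values` plays the role of one vector of predicted values for a key class candidate s∗.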
[Figure 4 schematic; recoverable labels: Ri (6 known bits), Expansion, Ki (6 key bits, 6 bits guessed), S0 to S7 (4 bits guessed), Permutation (4 bits guessed).]

Central table (predicted internal values; rows: index 0 to 4, columns: key candidates Key[0…5]):

     |  0   1   2   3
  0  |  5  12   7   2
  1  |  9   0  12   6
  2  | 14   4   1  13
  3  |  7   5   5   8
  4  |  3  10  15   1

Right table (Hamming weights of the values above):

     |  0   1   2   3
  0  |  2   2   3   1
  1  |  2   0   2   2
  2  |  3   1   1   3
  3  |  3   2   2   1
  4  |  2   2   4   1

Fig. 4: Derivation of the internal values and leakage modeling within the DES.
6. Modeling of the leakage. For the same set of key class candidates as during
the derivation of the internal values, the adversary models a part or function
of the actual target device's leakage. For example, assuming that the power consumption
in CMOS devices depends on the switching activity occurring during a
computation, the Hamming weight or distance models can be used to predict the
leakage, as illustrated in the right table of Figure 4. In this context, the models
are directly derived from the internal values, e.g. $M(s^*, v_{s^*}^q) = HW(v_{s^*}^q)$.
8. Selection of the relevant leakage samples. Since the leakage traces obtained
from an acquisition device may contain hundreds of thousands of samples,
actual side-channel adversaries usually reduce the dimensionality of the data.
This may be done using simple techniques such as SPA or by using advanced
statistical processing. In the example of Figure 5, only the maximum value of the
clock cycle corresponding to the DES permutation is extracted from the traces.
As a result of this phase, the adversary obtains a reduced vector: $R(l_q)$.
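A minimal Python sketch of this reduction is given below, assuming each trace is a list of samples and that the adversary has already located (e.g. by SPA) the sample window of the clock cycle of interest. The names `reduce_trace` and `reduce_traces` and the window indices are hypothetical.

```python
def reduce_trace(trace: list[float], start: int, stop: int) -> float:
    """Keep only the maximum sample inside the window [start, stop)."""
    window = trace[start:stop]
    if not window:
        raise ValueError("empty sample window")
    return max(window)


def reduce_traces(traces: list[list[float]], start: int, stop: int) -> list[float]:
    """Apply the same reduction to every trace: one value per trace."""
    return [reduce_trace(trace, start, stop) for trace in traces]


if __name__ == "__main__":
    # Two toy traces of six samples each; the window covers samples 2 to 4.
    traces = [
        [0.1, 0.2, 1.2, 1.6, 0.9, 0.1],
        [0.1, 0.3, 1.1, 1.4, 1.0, 0.2],
    ]
    print(reduce_traces(traces, 2, 5))
```

More elaborate reductions (several samples per trace, or a statistical projection) fit the same interface by returning a short vector instead of a single value.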
[Figure 5: leakage traces (leakage versus time) and the reduced values R(L) extracted for each plaintext P.]

  P | R(L)
  0 | 1.675
  1 | 1.432
  2 | 1.221
  3 | 1.498
  4 | 1.937
9. Statistical comparison. For each of the key class candidates, the adversary
finally applies a statistic to compare the predicted leakages with the transformed
measurements. If the attack is successful, it is expected that the model corre-
sponding to the correct key candidate gives rise to the best comparison result.
For example, in our previous illustrations, the value derivation vectors $v_{s^*}^q$ and
the reduced traces $R(l_i)$ both have q elements. Therefore, if we store the hypothetical
Hamming weight models in a vector $m_{s^*}^q = HW(v_{s^*}^q)$, the empirical correlation
coefficient can be used for comparison [5]:
$$ \mathrm{corr}(s^*) = \frac{\sum_{i=1}^{q} \left(l_i - \hat{E}(R(l_q))\right) \cdot \left(m_{s^*}^i - \hat{E}(m_{s^*}^q)\right)}{\sqrt{\sum_{i=1}^{q} \left(l_i - \hat{E}(R(l_q))\right)^2 \cdot \sum_{i=1}^{q} \left(m_{s^*}^i - \hat{E}(m_{s^*}^q)\right)^2}}, \qquad (3) $$
where Ê(.) denotes the empirical mean. In Figure 6, such a correlation attack
is applied to our leaking DES implementation and the coefficient is computed
for an increasing number of observations. It clearly illustrates that the attack is
successful after approximately 100 measured encryptions.
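The comparison step can be sketched in Python as below. This is a simplified illustration, assuming the reduced leakages and the per-candidate models are plain lists of equal length; the names `empirical_correlation` and `best_candidate` are hypothetical, and selecting the candidate with the highest coefficient is one possible decision rule.

```python
from math import sqrt


def empirical_correlation(reduced: list[float], model: list[float]) -> float:
    """Empirical correlation coefficient between leakages and one model."""
    q = len(reduced)
    if q != len(model) or q < 2:
        raise ValueError("need two vectors of equal length q >= 2")
    mean_l = sum(reduced) / q
    mean_m = sum(model) / q
    num = sum((l - mean_l) * (m - mean_m) for l, m in zip(reduced, model))
    den = sqrt(
        sum((l - mean_l) ** 2 for l in reduced)
        * sum((m - mean_m) ** 2 for m in model)
    )
    # A constant vector has zero variance; report no correlation in that case.
    return num / den if den else 0.0


def best_candidate(reduced: list[float], models: list[list[float]]):
    """Return (index of the best-scoring key class candidate, all scores)."""
    scores = [empirical_correlation(reduced, m) for m in models]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    reduced = [1.0, 2.0, 3.0, 4.0]                       # toy reduced leakages
    models = [[1, 1, 2, 2], [1, 2, 3, 4], [4, 3, 2, 1]]  # toy models, 3 candidates
    best, scores = best_candidate(reduced, models)
    print(best, scores)
```

Recomputing the scores on growing prefixes of the measurements gives the kind of curve described for Figure 6, i.e. correlation as a function of the number of measurement queries.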
[Figure 6: correlation coefficient versus number of measurement queries (0 to 200); the curve of the correct key candidate is labelled.]
[Figure: likelihood versus number of measurement queries (0 to 80); the curve of the correct key candidate is labelled.]
5 Countermeasures
References
[Figure: DES round schematic; recoverable labels: Li, Ri, Ki, Expansion, f, S0 to S7.]
In 1977, the DES algorithm [18] was adopted as a Federal Information Pro-
cessing Standard (FIPS) for unclassified government communication. Although
a new Advanced Encryption Standard was selected in October 2000 [19], DES is
still widely used, particularly in the financial sector. DES encrypts 64-bit blocks
with a 56-bit key and processes data with permutations, substitutions and XOR
operations. The plaintext is first permuted by a fixed permutation IP. Next the
result is split into two 32-bit halves, denoted with L (left) and R (right) to which
a round function is applied 16 times. The ciphertext is calculated by applying
the inverse of the initial permutation IP to the result of the 16th round. The secret
key is expanded by the key schedule algorithm to sixteen 48-bit round keys
Ki and, in each round, a 48-bit round key is XORed to the text. The key schedule
consists of known bit permutations and shift operations. Therefore, recovering
any round key bit directly compromises the corresponding bit of the secret key. The round
function is represented in Figure 8 (a) and is easily described by:
$$ L_{i+1} = R_i $$
$$ R_{i+1} = L_i \oplus f(R_i, K_i) $$
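The two relations above can be sketched in Python as follows, with 32-bit halves stored as integers. The function `toy_f` is a hypothetical stand-in and is NOT the DES f function (which uses the expansion, the S-boxes and a permutation); it is only there to show that the round structure can be inverted regardless of how f is defined.

```python
MASK32 = 0xFFFFFFFF


def toy_f(right: int, round_key: int) -> int:
    """Hypothetical placeholder for f(R_i, K_i); not the real DES f."""
    return ((right ^ round_key) * 0x9E3779B1) & MASK32


def feistel_round(left: int, right: int, round_key: int, f=toy_f):
    """One round: L_{i+1} = R_i and R_{i+1} = L_i XOR f(R_i, K_i)."""
    return right, left ^ f(right, round_key)


def inverse_feistel_round(left_next: int, right_next: int, round_key: int, f=toy_f):
    """Undo one round: R_i = L_{i+1} and L_i = R_{i+1} XOR f(R_i, K_i)."""
    return right_next ^ f(left_next, round_key), left_next


if __name__ == "__main__":
    L0, R0, K0 = 0x01234567, 0x89ABCDEF, 0x0F0F0F0F0F0F  # arbitrary test values
    L1, R1 = feistel_round(L0, R0, K0)
    print(hex(L1), hex(R1))
    print(inverse_feistel_round(L1, R1, K0) == (L0, R0))
```

Because f is only ever evaluated on a value that both sides know (R_i), decryption never needs to invert f itself.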
[Figures: two "Time Waveform" plots of w(t) versus t, with the t axis running from 3.95 to 4.3 (×10^5).]