Revealing Secret Key from Low Success Rate Deep Learning-Based Side Channel Attacks


Van-Phuc Hoang1, Ngoc-Tuan Do1,2*, Trong-Thuc Hoang3, Cong-Kha Pham3
1 Institute of System Integration, Le Quy Don Technical University, Hanoi, Vietnam
2 Telecommunications University, Nha Trang, Vietnam
3 The University of Electro-Communications, Tokyo, Japan
* Correspondence: [email protected]

Abstract—Non-profiled deep learning-based side channel attacks (DLSCA) use deep neural networks to extract sensitive information with high accuracy. These attacks pose a significant threat to the security of cryptographic devices. Unlike profiled attacks, non-profiled attacks do not require prior knowledge of the target device, making them more versatile. Deep learning algorithms enable attackers to learn complex relationships between side channel signals and secret information, enabling the recovery of cryptographic keys even when common SCA countermeasures are deployed. However, non-profiled DLSCA cannot reveal the secret key if the correct key's metric is not clearly distinguished from those of the incorrect candidates. This paper discusses this issue of non-profiled DLSCA. Then, a new metric based on the inversion of exponential rank (IER) is proposed to enhance the performance of these attacks. The experimental results show that the proposed technique can reveal the secret subkey even if the partial success rate is only 10% on the ASCAD dataset. Furthermore, when minimally tuned models and the IER metric are used to attack the CHES-CTF 2018 data, the percentage of correctly revealed bytes rises substantially, from 62.5% to 93.75%.

Index Terms—Deep learning, Side channel attack, Key rank, metric

I. INTRODUCTION

Deep learning (DL) has emerged as a powerful tool in various domains, revolutionizing the field of artificial intelligence. With its ability to automatically learn intricate patterns and extract meaningful representations from vast amounts of data, deep learning has been applied to numerous applications, ranging from image recognition to natural language processing. However, like any powerful technology, deep learning can also be harnessed for malicious purposes. One such application is the field of side-channel attacks (SCA), where the emergence of deep learning has led to a substantial increase in the effectiveness of attacks and in their consequences. In smart healthcare systems, which handle sensitive data, cyber-security issues are especially critical. In our ongoing project “AIPOSH”, funded by the ASEAN IVO program, a comprehensive cyber-security platform will be provided with artificial intelligence powered, hardware-software oriented solutions for Internet-of-Things based smart healthcare systems, toward a technology roadmap for ASEAN countries in the field.

In the literature, a multitude of studies have demonstrated the efficiency of DL-based SCA (DLSCA) in breaking implementations of cryptographic algorithms. The performance of these attacks is commonly evaluated using SCA metrics [1] such as rank (or key rank), success rate (SR), or guessing entropy (GE). While the first metric requires only one attack, the others result from performing the attack multiple times to achieve stable results. Recently, the authors of [2] investigated the performance of DLSCA. They note that DL techniques have multiple sources of randomness (due to the initialization, regularization, and optimization procedures). Consequently, DL algorithms exhibit stochastic behavior, leading to variations in attack results and necessitating the execution of multiple independent attacks to obtain reliable outcomes. However, that work investigated DLSCA exclusively within a profiling context. DL has also been exploited in several other studies in the non-profiled context [3]–[6], and non-profiled DLSCA encounters challenges and limitations similar to those identified in the profiling scenario.

To investigate this issue more deeply, and toward a deep learning-based security evaluation method for cryptographic algorithms in IoT-based smart healthcare systems, this paper explores the negative effect of the randomness sources in non-profiled DL-based SCA techniques. In particular, we introduce a novel metric aiming to yield improved outcomes compared to the conventional metrics used in a non-profiled context. Notably, the proposed metric can be used as a new distinguisher that is capable of unveiling the secret key. Our main contributions are:

• We examine the impact of algorithmic randomness on the performance of non-profiled DLSCA.
• We introduce a novel metric called inversion of exponential rank (IER) to serve as a tool for evaluating and quantifying the stability of non-profiled DL attacks in the absence of knowledge about the secret key.
• We present a novel non-profiled DLSCA approach utilizing the IER metric. The proposed technique uncovers secret keys even when the attack success rate is low.

The rest of this paper is organized as follows. In Section II, DL-based non-profiled techniques are presented. Section III introduces a new SCA metric, and a new distinguisher is
presented. In the next section, the experimental results are shown to demonstrate the efficiency of the proposed techniques. Finally, we conclude the paper in Section V.

II. NON-PROFILED DEEP LEARNING BASED SCA

A. Differential deep learning analysis

Differential deep learning analysis (DDLA) is a deep learning-based technique that exploits side-channel information obtained only from the target device [3]. First, the attacker obtains N power traces $\{t_1, t_2, \ldots, t_N\}$ from cryptographic operations that use a fixed secret key. Following a “divide and conquer” strategy, the attacker targets AES by training a network model to produce outputs from the power traces. The outputs are then compared to the specific intermediate values (labels) $\{y_{k,1}, y_{k,2}, \ldots, y_{k,N}\}$ for each key candidate k. Each label $y_{k,i}$ is computed from the key candidate k and the plaintext i using a power consumption model h, such as Hamming Weight (HW), Hamming Distance (HD) or Least Significant Bit (LSB). The key corresponding to the highest accuracy after training is taken as the correct key. The procedure of the non-profiled DLSCA is shown in Algorithm 1.

Algorithm 1 Differential Deep Learning Analysis (DDLA) [3]
Input: D traces $(t_i)_{1 \le i \le D}$, corresponding plaintexts $(d_i)_{1 \le i \le D}$, and K key hypotheses. A network Net and number of epochs $n_e$
Output: $k_{cr} \in \mathcal{K}$
1: Set training data as $X = (t_i)_{1 \le i \le D}$
2: for $k_j \in \mathcal{K}$ do
3:   Re-initialize trainable parameters of Net
4:   Compute the series of hypothetical values $(h_{k_j,i})_{1 \le i \le D}$
5:   Set training labels as $y_{k_j,i} = (h_{k_j,i})_{1 \le i \le D}$
6:   Perform DL training: DL(Net, X, $y_{k_j,i}$, $n_e$)
7: end for
8: return key $k_{cr}$ which leads to the best DL training metrics

[Figure 1: six panels plotting loss against the number of epochs (0 to 30) for the correct key guess and the incorrect key guesses; the correct key obtains Rank=0, Rank=1, Rank=6, Rank=14, Rank=36 and Rank=72 in the six attacks.]
Fig. 1. The results of different DDLA-SHW based attacks using random (Glorot initialization) weights on the same dataset.

B. Sources of randomness in non-profiled deep learning based SCA

As indicated in [2], several common sources of randomness have a negative impact on the attack results. The authors showed that the random sources are connected with the dataset (dataset randomness) and the machine learning algorithm (algorithmic randomness). However, the source that affects the attack results most is algorithmic randomness. The common sources are the initialization of weights and biases, regularization techniques, and optimization techniques.

Similar to profiled DLSCA, non-profiled DLSCA also faces the randomness sources of the DL model. Furthermore, non-profiled DLSCA techniques like DDLA and MOR [6] determine the secret key based on the training metrics, such as loss and accuracy. Therefore, an attacker could obtain unstable DLSCA results in a non-profiled context. We take the attack results using DDLA-SHW described in [7] as an example to illustrate this assumption. As illustrated in Fig. 1, the results differ even with the same dataset and hyper-parameters. The attacker can reveal the correct key by comparing the loss values of all candidates or by using other SCA metrics like key rank (KR) to determine the best candidate. However, when an un-optimized model is used, it is difficult to locate the correct key in the graph, even if it has the lowest loss value in most epochs.

III. PROPOSED NEW SCA METRIC FOR NON-PROFILED DLSCA

A. SCA metrics

This part briefly introduces the metrics commonly used in the SCA domain, such as score & rank, success rate (SR) and guessing entropy (GE).

1) Score & rank: When attacking an 8-bit S-box output, the set of key candidates k is limited to $\mathcal{K} = \{0, 1, \ldots, 255\}$. The attack produces 256 scores $[score_0, score_1, \ldots, score_{255}]$, where $score_i$ is the attack score of the key candidate i. For example, $score_{17}$ is the attack score achieved by the key candidate k = 17. Finally, we can produce a vector $[rank_0, rank_1, \ldots, rank_{255}]$, where $rank_k$ is the rank of key candidate k and the best possible rank equals 1. For example, if the best score came from k = 17, then $rank_{17} = 1$. By convention, a rank equal to 1 indicates the best key candidate. However, in our scenario we set the rank of the best key candidate to zero.

Taking the attack results using DDLA-SHW as an example, the KR metric is determined for each attack as illustrated in Fig. 1. The correct key usually has a lower loss (low KR) value than the incorrect keys, as shown in Fig. 1.a,b,c. However, detecting the key through the normal method or even the “early stop” technique is not enough.

2) Success rate: The goal of the attacker is to recover the key, while the evaluator's goal is to assess how hard it is to find the key, even if she cannot do so. This divergence motivates the “known-key analysis” and its respective metrics, namely the known-key score, the known-key rank, and the success
rate [1]. With Q traces in the attack phase, an attack outputs a key guessing vector g in decreasing order of probability, where $g_1$ denotes the most likely and $g_{|\mathcal{K}|}$ the least likely key candidate. The success rate of order o is the average empirical probability that the correct key is located within the first o elements of the key guessing vector g:

$$SR_o^i = \begin{cases} 1, & \text{if } rank_{k_c} \le o \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

$$SR_o = \frac{1}{n} \sum_{i=1}^{n} SR_o^i. \tag{2}$$

3) Guessing Entropy: Similar to SR, guessing entropy is a commonly used SCA metric. The guessing entropy metric evaluates the rank of the correct key and is derived directly from the average rank of the correct key $k_c$. For a certain experiment i, $GE^i$ is equal to $\log_2(rank_{k_c})$ [1]. Then, GE is determined by the average over all n experiments, as presented in Equation 3.

$$GE = \frac{1}{n} \sum_{i=1}^{n} GE^i. \tag{3}$$

Although effective, GE does not consider the internal relationships between the correct key and the other candidates.

B. Inversion of Exponential Rank

The aforementioned metrics are primarily employed for known-key analysis, indicating the level of difficulty for an attacker to extract the secret key from a given set of measured traces. Specifically, when the SR percentage is low due to non-zero key ranks, an evaluator can infer that “By employing attack A based on DL model B with N power traces, we were unable to successfully recover the secret key.” This situation commonly occurs when an unoptimized model is utilized for DLSCA. As demonstrated in the previous section, non-profiled DLSCA leverages training metrics to discriminate the secret key from other potential candidates. Consequently, the attacker must fine-tune the DL model's hyperparameters to maximize the distinction between the correct and incorrect keys. Additionally, the DLSCA attack needs to be repeated to obtain reliable outcomes. Hence, hyperparameter tuning in DL is a time-consuming and costly process, and the ability to reveal the secret key from low success rate attacks using minimally tuned models is vital in SCA evaluations.

A notable characteristic of non-profiled DLSCA is that the training metrics associated with the secret key are consistently better (lower loss, higher accuracy) than those of other key hypotheses. When an optimized model is employed, the correct key $k_{ck}$ has the lowest loss (or the highest accuracy). Otherwise, the loss (accuracy) of key $k_{ck}$ is not the lowest (highest). Nevertheless, it still shows lower loss or higher accuracy than numerous other key hypotheses. Importantly, these metrics are consistently achieved across most attacks. Hence, we can assess the consistency of a hypothesis key across n attacks to distinguish it from other keys. To investigate our hypothesis, a metric based on key rank called inversion of exponential rank (IER) is proposed and calculated as follows:

$$IER_j = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{\alpha^{KR_{i,j}}}, \quad (\alpha > 1) \tag{4}$$

where $1 \le i \le n$ and $0 \le j \le 255$.

By using the inversion of exponential rank with $\alpha > 1$, the IER of the correct key reaches 1 when $KR_{i,ck}$ equals zero for all i. In this case, the SR percentage of the n attacks equals 100%. Conversely, a significantly small IER value indicates a higher rank. The values of IER for various ranks, calculated using different values of $\alpha$, are presented in Table I. Bold font highlights the values that exceed 0.1. The analysis reveals that the determination of significant ranks can be influenced by selecting the value of $\alpha$. For instance, with $\alpha = 1.3$, only key guesses within the rank range of 0 to 5 contribute significantly to higher values of IER. The key guesses with higher ranks have a negligible impact on IER. Consequently, the more consistently the DLSCA attacks yield low-ranked keys, the higher the IER value.

From the analysis above, the proposed metric can be used to evaluate the performance of different models under the same attack conditions. Moreover, the novelty of this metric lies in its capability to reveal the secret key without requiring prior knowledge of the correct key. This can be achieved by following the steps outlined below:

• The attack is performed and repeated n times on the same dataset. The ranks KR of all hypothesis keys are calculated for each attack.
• The $IER_j$ of key guess number j is determined following Equation 4.
• The hypothesis key corresponding to the highest IER is taken as the correct key.

Our proposed distinguisher is given in full in Algorithm 2.

Algorithm 2 Proposed non-profiled DLSCA using IER metric
Input: D traces $(t_i)_{1 \le i \le D}$, corresponding plaintexts $(d_i)_{1 \le i \le D}$, and K key hypotheses. A network Net and number of epochs $n_e$
Output: $k_{cr} \in \mathcal{K}$
1: Set training data as $X = (t_i)_{1 \le i \le D}$
2: for i ∈ iterations (1 ≤ i ≤ n) do
3:   for $k_j \in \mathcal{K}$ do
4:     Re-initialize trainable parameters of Net
5:     Compute the series of hypothetical values $(h_{k_j,i})_{1 \le i \le D}$
6:     Set training labels as $y_{k_j,i} = (h_{k_j,i})_{1 \le i \le D}$
7:     DL(Net, X, $y_{k_j,i}$, $n_e$)
8:     Calculate the key rank $KR_{i,j}$
9:   end for
10: end for
11: Calculate the IER for all key guesses using (4)
12: return key $k_{cr}$ which leads to the highest IER
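Equation 4 and the final steps of Algorithm 2 can be illustrated with a small numerical sketch. This is not the authors' implementation: it assumes the per-key training losses of each repeated attack are already available as an array, and the helper names `key_ranks`, `ier` and `success_rate`, as well as the toy loss values for four key hypotheses, are hypothetical. Ranks are zero-based, as in the text, and a lower loss is treated as better.

```python
import numpy as np

def key_ranks(losses):
    """Zero-based rank of every key hypothesis in each attack.

    losses has shape (n_attacks, n_keys); the lowest loss gets rank 0.
    """
    order = np.argsort(losses, axis=1)   # key indices sorted by loss
    return np.argsort(order, axis=1)     # position of each key in that order

def ier(ranks, alpha=1.3):
    """Inversion of exponential rank (Equation 4), one value per key."""
    return np.mean(alpha ** (-ranks.astype(float)), axis=0)

def success_rate(ranks, correct_key):
    """Fraction of attacks in which the known correct key has rank 0."""
    return float(np.mean(ranks[:, correct_key] == 0))

# Toy losses (hypothetical): 3 repeated attacks, 4 key hypotheses.
# Key 2 has the lowest loss only once, but is never far from the top.
losses = np.array([
    [1.00, 1.09, 1.03, 1.05],
    [1.09, 1.00, 1.02, 1.04],
    [1.06, 1.08, 1.01, 1.03],
])

ranks = key_ranks(losses)
scores = ier(ranks, alpha=1.3)
best_guess = int(np.argmax(scores))      # Algorithm 2, step 12

print(ranks)
print(np.round(scores, 3))
print("IER guess:", best_guess, "SR of key 2:", success_rate(ranks, 2))
```

In this toy data, keys 0, 1 and 2 each have the lowest loss in exactly one attack, so a single attack is not conclusive, whereas the IER still singles out key 2 because its rank stays low across all three attacks.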
[Figure 2: two panels, a) and b), plotting IER against key guesses (0 to 255); in both panels the peak is annotated “Correct subkey=224”.]
Fig. 2. Attack results using DDLA-SHW and IER. a) 30 epochs; b) 20 epochs.

TABLE I
IER VALUES OF KEY RANKS USING DIFFERENT α.

             IER by Equation 4
Rank (KR)    α=1.01   α=1.05   α=1.1   α=1.3   α=1.5   α=1.7   α=1.9
0            1.00     1.00     1.00    1.00    1.00    1.00    1.00
1            0.99     0.95     0.91    0.77    0.67    0.59    0.53
2            0.98     0.91     0.83    0.59    0.44    0.35    0.28
3            0.97     0.86     0.75    0.46    0.30    0.20    0.15
4            0.96     0.82     0.68    0.35    0.20    0.12    0.08
5            0.95     0.78     0.62    0.27    0.13    0.07    0.04
10           0.91     0.61     0.39    0.07    0.02    0.00    0.00
20           0.82     0.38     0.15    0.01    0.00    0.00    0.00
40           0.67     0.14     0.02    0.00    0.00    0.00    0.00
80           0.45     0.02     0.00    0.00    0.00    0.00    0.00
160          0.20     0.00     0.00    0.00    0.00    0.00    0.00
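The entries of Table I are the values $\alpha^{-KR}$ rounded to two decimals, that is, the contribution a single attack with key rank KR makes to the sum in Equation 4. A short sketch that regenerates the rows (the helper name `ier_term` is hypothetical):

```python
def ier_term(rank, alpha):
    """Contribution of one attack with key rank `rank` to Equation 4."""
    return alpha ** (-rank)

alphas = [1.01, 1.05, 1.1, 1.3, 1.5, 1.7, 1.9]
ranks = [0, 1, 2, 3, 4, 5, 10, 20, 40, 80, 160]

# One row per rank, one column per alpha, rounded as in Table I.
table = {r: [round(ier_term(r, a), 2) for a in alphas] for r in ranks}

print("rank  " + "  ".join(f"a={a:<4}" for a in alphas))
for r, row in table.items():
    print(f"{r:<5} " + "  ".join(f"{v:<6.2f}" for v in row))
```

A larger α makes the term decay faster with the rank, so fewer ranks contribute noticeably to the IER.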

IV. VALIDATION EXPERIMENTS

A. Data preparation

ASCAD (ANSSI SCA Database): This database aims to provide a benchmarking reference for the SCA community [8]. The purpose is to have something similar to the MNIST database that the Machine Learning community has been using for quite a while to evaluate the performance of classification algorithms. This dataset provides electromagnetic radiation data of an 8-bit ATMega8515 board with a first-order protected software AES implementation. This paper uses the first version of the ASCAD dataset, which is captured at a sampling rate of 2 GSa/s. The length of these power traces is 100,000 samples. This dataset consists of two sets of traces: a profiling set of 50,000 traces to train DL networks and an attack set of 10,000 traces to test the efficiency of the trained models in a profiled context. It is worth noting that 700 samples corresponding to the 3rd S-box processing output during the first round are taken to construct the ASCAD.h5 file. The value of the third key byte is 224. Additional bytes can be generated automatically using the provided Python script. The data and scripts are available on the ASCAD GitHub repository1.

CHES-CTF 2018: This dataset refers to the CHES Capture-the-flag (CTF) AES-128 trace set, released in 2018 for the conference on Cryptographic Hardware and Embedded Systems (CHES) [9]. This database contains 45,000 power traces that record the masked AES-128 encryption on a 32-bit STM microcontroller. This paper considers a pre-processed version of the dataset, which includes a fixed key for all power traces, and each trace consists of 2200 samples. The full secret key of this dataset is presented in Table II. The pre-processed dataset is available at http://aisylabdatasets.ewi.tudelft.nl/.

1 https://github.com/ANSSI-FR/ASCAD/tree/master/ATMEGA_AES_v1

B. DDLA-SHW attack results

We first perform the attacks on ASCAD data using DDLA-SHW as described in [7]. To evaluate the general performance of the attack, we perform multiple training phases (30 times) for the DDLA-SHW model with the weights initialized randomly (Glorot normal). The attack results are summarized in Table III. They show that the DDLA-SHW model provides poor results: the success rate of the attacks is only 23.33% and 10% for 30 epochs and 20 epochs, respectively. Typically, the attacker would conclude that DDLA-SHW failed to attack the ASCAD dataset, especially when the SR is only 10%.

As indicated in the previous part, the loss of the correct key is not the lowest in every attack. However, the loss of the correct key stays low and stable across different trainings. We therefore exploit this phenomenon to find the secret key. By applying Algorithm 2, we calculate the IER metric with α equal to 1.3. Fig. 2 shows the results of DDLA-SHW combined with the IER distinguisher. The correct key can be clearly discriminated from the other candidates. In particular, when training for 20 epochs, the secret key is still clearly distinguished although the SR percentage equals 10%. The obtained results demonstrate the effectiveness of our proposed metric and distinguisher, particularly in scenarios with a low SR percentage. Furthermore, the IER value of the correct key is positively related to the SR percentage: it is higher when the SR is higher. This finding suggests that the DDLA-SHW attack conducted over 30 epochs delivers higher performance than the attack over 20 epochs. Therefore, the IER metric proves to be suitable for evaluating the stability of DLSCA attacks without the prior knowledge of the secret key that SR, GE, or rank require.

C. MOR attack results

To assess the efficiency of the proposed techniques on a different platform, CHES-CTF 2018 is selected. In this case, we investigate the stability of non-profiled DLSCA based on a multi-output approach called MOR, as described in [6]. Firstly, we perform the attack on all bytes of CHES-CTF 2018. The attack is repeated 30 times, each initialized with random weights.

Initially, the first attack is conducted over 15 epochs. Following 30 repetitions, the success rates (SR) of all bytes are displayed in the first row of Table IV. It is observed that six of the 16 bytes (37.5%) have SR percentages lower than 40%. These particular bytes are considered difficult to reveal during the initial attack. Subsequently, an attempt is made to uncover the secret key based on these low SR outcomes. We
apply Algorithm 2 on all bytes of the secret key. The outputs for the six bytes that have less than 40% SR are plotted in Fig. 3 and Fig. 4. Interestingly, 12 correct bytes are recovered, of which two (the 8th and 12th) are revealed from low SR results (36.67% for both). In addition, three other bytes (the 2nd, 3rd, and 15th) are potentially distinguishable since their IER values are considerably higher than those of other candidates. These results highlight the improved performance of the proposed distinguisher, utilizing the IER metric, in enhancing the attack's effectiveness.

In the next step, we perform further experiments using MOR and MOR combined with IER. However, one of the hyper-parameters of the MOR architecture is modified to improve the attack's performance. Our choice is based on the impact of the number of epochs in MOR attacks, as indicated in [6]. Therefore, the number of epochs is set to 20 in this experiment. It is worth noting that the remaining hyper-parameters are kept as in the original MOR architecture. The SR of the attacks on the 6 bytes with 20 epochs is displayed in the third row of Table IV. As expected, the SR of these bytes goes up significantly. However, the SR of byte 2 is still low (approximately 20%), and the SR of byte 16 does not change (0%). Next, the IER distinguisher is applied to the outcomes as illustrated in Fig. 5, and three correct bytes are detected (the circled point at the maximum IER value). Consequently, these attacks significantly increase the percentage of correctly revealed bytes, to 93.75% (15 of 16). Notably, the IER metric proves effective in revealing the secret key even in scenarios with low success rates, similar to its performance observed with the ASCAD dataset. These results provide clear evidence of the efficiency of the IER distinguisher in enhancing the probability that non-profiled DLSCA attacks reveal the secret key.

TABLE II
THE FULL SECRET KEY OF CHES-CTF 2018 DATASET.

Byte      1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
Value    23   92  242  153  122  133  131   65   60  119  223  172  126  108   89  216

TABLE III
THE ATTACK RESULTS OF DDLA-SHW AND DDLA-SHW COMBINED IER ON THE ASCAD (FIXED KEY) DATASET.

Attack                    No. of epochs   Results   Byte 3
DDLA-SHW [7]              30              SR (%)    23.33
DDLA-SHW+IER (α = 1.3)    30                        ✓
DDLA-SHW [7]              20              SR (%)    10
DDLA-SHW+IER (α = 1.3)    20                        ✓
✓: Successful revealing secret key

[Figure 3: six panels (Byte 2, Byte 3, Byte 8, Byte 12, Byte 15, Byte 16) plotting MSE against the number of epochs (0 to 16) for the correct key guess and the incorrect key guesses.]
Fig. 3. Mean square errors (MSE) of the guessed subkey having SR percentage lower than 40%.

[Figure 4: six panels (Byte 2, Byte 3, Byte 8, Byte 12, Byte 15, Byte 16) plotting IER against the key guess (0 to 255), each marking the final guess and the correct key (92, 242, 65, 172, 89 and 216, respectively).]
Fig. 4. The attack results of MOR attack using IER metric on the bytes having low SR percentage (30 attacks, 15 epochs/attack).

V. CONCLUSION

In conclusion, non-profiled DLSCA leveraging deep neural networks has emerged as a formidable threat to the security of cryptographic devices, enabling accurate extraction of sensitive information. These attacks are advantageous over profiled attacks as they do not require prior knowledge of the target device, making them highly versatile. Deep learning algorithms empower attackers to uncover complex relationships between side-channel signals and secret information, bypassing typical side-channel analysis (SCA) countermeasures. However, non-profiled DLSCA encounters challenges when the metric of the correct key is not distinguishable from those of incorrect ones. This paper has addressed this issue and proposed a novel metric, the inversion of exponential rank (IER), to enhance the performance of non-profiled DLSCA attacks. Experimental results demonstrate the effectiveness of the proposed technique, even in scenarios where the partial success rate percentage is as low as 10% on the ASCAD dataset. Moreover, when applied to the CHES-CTF 2018 data, the percentage of correctly revealed bytes increases significantly, from the initial 62.5% to 93.75%. In our future work, we will further explore the proposed metric by combining it with other deep learning techniques, such as
TABLE IV
ATTACK RESULTS OF MOR AND MOR COMBINED IER ON THE CHES-CTF2018 DATASET.

                                       Byte
Attack              Epochs  Results       1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16
MOR [6]             15      SR (%)    96.67   3.33  26.67  93.33    100     60  86.67  36.67  73.33     60     70  36.67     70  73.33     10      0
MOR+IER (α = 1.3)   15                    ✓      ✗      ✗      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✗      ✗
MOR [6]             20      SR (%)        -     20  53.33      -      -      -      -     90      -      -      -     60      -      -     60      0
MOR+IER (α = 1.3)   20                    ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✓      ✗
✓: Successful revealing secret key
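The byte-level percentages quoted for CHES-CTF 2018 can be checked by tallying Table IV. The sketch below copies the 15-epoch SR row and the two MOR+IER rows from the table; the variable names are hypothetical, and counting a byte as recovered by plain MOR when its SR is at least 40% is an assumption based on the 40% threshold used in the text.

```python
# 15-epoch SR (%) of plain MOR for bytes 1..16, copied from Table IV.
sr_15 = [96.67, 3.33, 26.67, 93.33, 100, 60, 86.67, 36.67,
         73.33, 60, 70, 36.67, 70, 73.33, 10, 0]

# MOR+IER outcome per byte (True = correct byte revealed), from Table IV.
ier_15 = [True, False, False, True, True, True, True, True,
          True, True, True, True, True, True, False, False]
ier_20 = [True] * 15 + [False]

n_bytes = len(sr_15)
low_sr = [i + 1 for i, sr in enumerate(sr_15) if sr < 40]

pct_high_sr = 100 * (n_bytes - len(low_sr)) / n_bytes
pct_ier_15 = 100 * sum(ier_15) / n_bytes
pct_ier_20 = 100 * sum(ier_20) / n_bytes

print("bytes with SR < 40%:", low_sr)
print(pct_high_sr, pct_ier_15, pct_ier_20)
```

Under that assumption, 62.5% of the bytes have an SR of at least 40% at 15 epochs, MOR+IER reveals 75% of the bytes at 15 epochs, and 93.75% at 20 epochs.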

the “early stopping” technique. This combination aims to enhance the performance of the SCA attacks and facilitate the comparison of effectiveness among various DLSCA techniques, providing an efficient security evaluation method for cryptographic algorithms in IoT-based smart healthcare systems.

[Figure 5: four panels (Byte 3, Byte 15, Byte 16, Byte 2) plotting IER against the key guess (0 to 255), each marking the final guess and the correct key (242, 89, 216 and 92, respectively).]
Fig. 5. The attack results of MOR attack using IER metric on the bytes having low SR percentage (30 attacks, 20 epochs/attack).

ACKNOWLEDGMENT

This publication is the output of the ASEAN IVO (https://www.nict.go.jp/en/asean_ivo/index.html) project, “Artificial Intelligence Powered Comprehensive Cyber-Security for Smart Healthcare Systems (AIPOSH)”, and financially supported by NICT (https://www.nict.go.jp/en/index.html).

REFERENCES

[1] K. Papagiannopoulos, O. Glamočanin, M. Azouaoui, D. Ros, F. Regazzoni, and M. Stojilović, “The side-channel metrics cheat sheet,” ACM Computing Surveys, vol. 55, no. 10, pp. 1–38, Feb. 2023.
[2] L. Wu, G. Perin, and S. Picek, “On the evaluation of deep learning-based side-channel analysis,” in Constructive Side-Channel Analysis and Secure Design. Springer International Publishing, 2022, pp. 49–71.
[3] B. Timon, “Non-profiled deep learning-based side-channel attacks with sensitivity analysis,” IACR Transactions on Cryptographic Hardware and Embedded Systems, vol. 2019, no. 2, pp. 107–131, Feb. 2019. [Online]. Available: https://tches.iacr.org/index.php/TCHES/article/view/7387
[4] D. Kwon, S. Hong, and H. Kim, “Optimizing implementations of non-profiled deep learning-based side-channel attacks,” IEEE Access, vol. 10, pp. 5957–5967, 2022.
[5] V.-P. Hoang, N.-T. Do, and V. S. Doan, “Efficient non-profiled side channel attack using multi-output classification neural network,” IEEE Embedded Systems Letters, pp. 1–1, 2022.
[6] N.-T. Do, V.-P. Hoang, and V. S. Doan, “A novel non-profiled side channel attack based on multi-output regression neural network,” Journal of Cryptographic Engineering, Mar. 2023.
[7] N.-T. Do, V.-P. Hoang, V. S. Doan, and C.-K. Pham, “On the performance of non-profiled side channel attacks based on deep learning techniques,” IET Information Security, vol. 17, no. 3, pp. 377–393, Dec. 2022.
[8] R. Benadjila, E. Prouff, R. Strullu, E. Cagli, and C. Dumas, “Deep learning for side-channel analysis and introduction to ASCAD database,” Journal of Cryptographic Engineering, vol. 10, no. 2, pp. 163–188, Nov. 2019.
[9] A. Gohr, S. Jacob, and W. Schindler, “CHES 2018 side channel contest CTF - solution of the AES challenges,” Cryptology ePrint Archive, Paper 2019/094, 2019. [Online]. Available: https://eprint.iacr.org/2019/094
