
This article has been accepted for publication in IEEE Transactions on Sustainable Computing. This is the author's version, which has not been fully edited; content may change prior to final publication. Citation information: DOI 10.1109/TSUSC.2024.3492290

IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING

EVADE: Targeted Adversarial False Data Injection Attacks for State Estimation in Smart Grid

Jiwei Tian, Chao Shen, Buhong Wang, Chao Ren, Xiaofang Xia, Runze Dong, and Tianhao Cheng

Abstract—Although conventional false data injection attacks can circumvent the detection of bad data detection (BDD) in sustainable power grid cyber-physical systems, they are easily detected by well-trained deep learning-based detectors. Still, state estimation models with deep learning-based detectors are not secure, owing to the vulnerabilities and fragility of deep learning models. Using the related laws of conventional false data injection attacks and adversarial example attacks, this paper proposes the targEted adVersarial fAlse Data injEction (EVADE) strategy to explore targeted adversarial false data injection attacks for state estimation in Smart Grid. The proposed EVADE attack strategy selects key state variables based on adversarial saliency maps to improve the attack efficiency and perturbs as few state variables as possible to reduce the attack cost. In this way, the EVADE attack strategy can bypass the detection of BDD and neural attack detection (NAD) methods (that is, maintain deep stealthiness) with a high success rate and achieve the attack target simultaneously. Experimental results demonstrate the effectiveness of the proposed strategy, posing serious and pressing concerns for sustainable cyber-physical power system security.

Index Terms—Adversarial example, cyber physical system, deep learning, false data injection, state estimation.

NOMENCLATURE

Acronyms
AFDIA   Adversarial False Data Injection Attack
BDD     Bad Data Detection
EVADE   targEted adVersarial fAlse Data injEction
FDIA    False Data Injection Attack
NAD     Neural Attack Detection
NN      Neural Network
WLS     Weighted Least Squares

Symbols
z       A vector of measurements
H       A Jacobian matrix for state estimation
x       A vector of state variables
e       A vector of measurement errors
x̂       A vector of estimated state variables
R       A covariance matrix of e
σ_i     The standard deviation of the i-th meter
τ       The BDD detection threshold
z_a     A vector of measurements after FDIA
a       FDIA vector
c       A vector of malicious errors
z′      A vector of an adversarial example
z_f     A vector of measurements after AFDIA
f(·)    A function defined by a NN
ζ(·)    A function computed before Softmax of f(·)
ω(·)    Softmax function
y       An output vector of a NN
ŷ_z     A predicted label of z
ϑ       A perturbation vector added to measurements
ϑ_c     A perturbation added to state variables
Π_c     The indicator vector of zero elements of c
µ       A predetermined state perturbation range
ϖ       The change-of-variables for ϑ_c

Subscripts
m       Number of meter measurements
n       Number of state variables

This work was supported by the National Natural Science Foundation of China under Grants 62402520 and 62301600, the China Postdoctoral Science Foundation under Grant 2024M752586, the Young Talent Fund of the Association for Science and Technology in Shaanxi under Grant 20240105, the Shaanxi Provincial Natural Science Foundation under Grant 2024JC-YBQN-0620, and the Shaanxi Province Postdoctoral Research Funding Project under Grant 2023BSHYDZZ20. J. Tian and C. Shen are with the Ministry of Education Key Laboratory for Intelligent Networks and Network Security, School of Cyber Science and Engineering, Xi'an Jiaotong University, Xi'an, China (e-mail: [email protected]; [email protected]). J. Tian is also with the Air Traffic Control and Navigation College, Air Force Engineering University, Xi'an, China (e-mail: [email protected]). B. Wang, R. Dong and T. Cheng are with the Information and Navigation College, Air Force Engineering University, Xi'an, China (e-mail: [email protected]; [email protected]; [email protected]). C. Ren is with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore (e-mail: [email protected]). X. Xia is with the School of Computer Science and Technology, Xidian University, Xi'an, China (e-mail: [email protected]). (Corresponding author: Chao Shen.)

I. INTRODUCTION

In the era of the Energy Internet, the integration of artificial intelligence (AI) into power grid systems is a crucial step towards the development of the smart grid [1]. AI models and algorithms are being utilized more extensively, allowing for improved efficiency, predictive maintenance, and overall management of power grid systems. While AI brings many benefits to the transformation of power systems into the smart grid, it also brings challenges [2], [3]. Understanding these challenges and finding effective countermeasures is crucial to the successful and secure implementation of AI in the smart grid [4]–[7]. However, these advancements come with certain challenges due to the susceptibility of AI systems to adversarial attacks [8]. Adversarial example attacks are carefully crafted inputs designed to confuse machine learning (ML) models, leading them to make erroneous decisions. These attacks represent a significant threat to ML-based models' security and could be

Authorized licensed use limited to: BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE. Downloaded on November 16,2024 at 10:40:27 UTC from IEEE Xplore. Restrictions apply.
© 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See https://www.ieee.org/publications/rights/index.html for more information.

particularly damaging in critical infrastructure. For the smart grid, adversarial attacks could lead to incorrect load forecasting [9]–[11], misclassify the instability operating status [12]–[15], compromise power quality recognition [16], allow for undetected energy theft [17]–[19], or interfere with non-intrusive load monitoring (NILM) [20].

The vulnerabilities of deep learning (DL) models, which are particularly good at detecting patterns in large data sets, have been a subject of many studies. Given the high stakes associated with power grid operations, researchers are paying more attention to possible attacks and potential countermeasures. A comprehensive review of the recent progress in designing attack and defense methods against machine learning methods in the smart grid is provided in [21]. DL-based models used for stability assessment are prime targets for such attacks [12]–[15]. The reliability and accuracy of these models are essential for the smooth operation of a power grid. If these models are compromised, it could result in operational inefficiencies or even catastrophic failures and blackouts [22]–[24]. Similarly, adversarial attacks on models used for load forecasting could severely impact a power grid's operational efficiency [9]–[11]. If load forecasts are inaccurate, it could lead to energy waste, inadequate power supply, or even power outages. A similar security vulnerability also exists in renewable energy forecasting, i.e., the adversarial learning attack (ALA) [25]. The potential for adversarial attacks extends to other aspects of the smart grid as well. Power quality recognition is crucial for ensuring that the electrical power supplied to consumers meets the required standards; the related adversarial attack and defense methods are proposed in [16], [26], [27]. Energy theft detection is essential for preventing unauthorized use of electricity [17]–[19]. NILM is used to determine the type and number of electrical appliances used in a specific location [20], and it is an essential tool for energy conservation. Besides, transferable adversarial attacks with distribution targeting the deep reinforcement learning (DRL)-based dynamic pricing system are proposed in [28].

The energy management system (EMS) is an intelligent system integrating hardware and software to monitor, control and optimize energy flow and energy consumption in energy systems. As the backbone of the EMS, the security of power system state estimation is critical. Therefore, state estimation based on deep learning has also received much attention [29]–[35]. However, physical restrictions and the use of bad data detection (BDD) methods are not taken into account in the adversarial examples or evasion attacks proposed in [29] and [30]. Since BDD is likely to identify the proposed attacks, even though they can avoid deep learning-based neural attack detection (NAD) techniques, they are likely to be useless. The problem is then addressed in [31]–[33], which explore adversarial example attacks that adhere to the inherent physical restrictions of power systems. The original attack target is surely impacted by the added perturbations, despite the fact that these techniques can produce covert attack vectors. Additionally, these techniques produce a high degree of uncertainty regarding attack vectors and are less constrained by the impacts of ultimate attacks (to a certain extent, the ultimate attack is covert if the perturbation is small enough) [33]. Therefore, we proposed a strategy to investigate targeted adversarial false data injection attacks in order to reach the planned attack target [34]. The sparse-state adversarial attacks' success rate is limited, despite the fact that the proposed parallel optimization technique can explore additional sparse-state adversarial attacks [34]. Additionally, if more state variables are attacked, the attack cost will increase, and the parallel approach will produce more combinations, which is inappropriate for systems with limited software and hardware environments. In fact, achieving a higher attack success rate at a lower attack cost is the preferred attack method for attackers, and such an attack method is also an appropriate way to evaluate practical security risks from a defensive perspective, which has not yet been fully studied.

To address the problems above, we propose the EVADE (targEted adVersarial fAlse Data injEction) strategy to evade both BDD and NAD methods and attack power system state estimation. Fig. 1 illustrates how conventional false data injection attacks (FDIA) can easily circumvent the BDD method while being easily caught by a well-trained NAD model. Using the appropriate methods of adversarial examples, attacks that can circumvent both BDD and NAD simultaneously are proposed in [31]–[33]. However, these methods are non-targeted attacks. To avoid impacting the targets of the original attacks and to increase the attack success rate and attack efficiency, we propose the EVADE strategy, which circumvents both BDD and NAD by perturbing key state variables of non-attacked targets.

We summarize the main contributions below:
• We propose the EVADE attack strategy, which can achieve the attack target with less attack cost and higher attack efficiency:
  1) The EVADE attack strategy perturbs non-target state variables to achieve the attack target and perturbs as few state variables as possible to reduce the attack cost.
  2) The EVADE attack strategy selects key state variables based on adversarial saliency maps to improve the attack efficiency.
  3) The EVADE attack strategy can bypass the detection of BDD and NAD methods (that is, maintain deep stealthiness) with a high success rate and achieve the attack target simultaneously.
• We conducted extensive experimental analyses to investigate the proposed EVADE strategy. The comparison results with other related methods demonstrate that the proposed EVADE attack strategy has a higher attack success rate with smaller attack cost and attack perturbation. In addition, targeted adversarial false data injection attacks correlate strongly with the scale of the original attack targets and the state variation range (physical limit): as the scale of the original FDIA attack targets decreases and the state variation range increases, the attack success rate will increase. Furthermore, as the scale of the power grid increases, the potential perturbation search range is larger, and the corresponding attack success rate also increases, posing a serious danger to state estimation in practical large-scale power grids.

The remainder of the paper is arranged as follows. In Section II, the preliminary materials and related works are briefly summarized. In Section III, the proposed EVADE attack


strategy is given and examined. In Section IV, the proposed EVADE attack strategy is evaluated and analyzed for a variety of case studies, followed by a conclusion in Section V.

II. PRELIMINARIES

A. State Estimation and False Data Injection Attacks

It takes a long time to estimate the state of a large-scale power system employing the nonlinear alternating current (AC) power flow model. Additionally, it frequently does not converge to an ideal solution [36]. To approximate the nonlinear AC model, a linearized direct current (DC) model is sometimes employed as an alternative. Though less accurate than the AC power flow model, the DC power flow model is more resilient, faster, and simpler. Therefore, real-time procedures, like the computation of the real-time local marginal price, frequently make use of the DC power flow model. The aforementioned benefits make DC state estimation popular in power grids. The DC power flow model for a power system with n + 1 buses and m meters is as follows¹:

  z = Hx + e,  (1)

where z ∈ R^m, x ∈ R^n and e ∈ R^m stand for a vector of meter measurements, a vector of state variables, and a vector of measurement errors (noise), respectively. The measurement Jacobian matrix is represented by H ∈ R^{m×n}. It is assumed that the measurement errors follow a Gaussian distribution with zero mean and covariance matrix R, where R = diag(σ₁², σ₂², · · · , σ_m²) and R_ii ≜ σ_i² is the variance of the i-th meter.

The weighted least squares (WLS) state estimation, which is often employed, minimizes the following function [37]:

  L(x) = ∑_{i=1}^{m} (z_i − H_i x)² / R_ii.  (2)

Based on (2), one can derive the optimal solution

  x̂ = (Hᵀ R⁻¹ H)⁻¹ Hᵀ R⁻¹ z,  (3)

where Hᵀ denotes the transpose of the matrix H and R⁻¹ denotes the inverse of the matrix R. Erroneous measurements may be introduced for a variety of reasons, including meter failures and cyber attacks. The goal of BDD is to determine whether bad data exist in the measurement vectors, and BDD just utilizes the residual r = z − H x̂ to identify bad data.

It is theoretically demonstrated that L(x̂) follows the χ² distribution by presuming that the meter errors (z_i − H_i x̂)/σ_i follow the normal distribution [38]:

  L(x̂) = ∑_{i=1}^{m} (r_i / σ_i)² = ∑_{i=1}^{m} ((z_i − H_i x̂) / σ_i)² = (z − H x̂)ᵀ R⁻¹ (z − H x̂).  (4)

z is recognized as a bad measurement vector if L(x̂) ≥ τ. If not, z is regarded as a normal data vector. Following a χ² test with the desired level of significance (e.g., 99%), the threshold τ > 0 is predefined.

However, as mentioned in [37], an attacker can implement a stealthy FDIA if the designed attack vector a meets the condition

  a = Hc,  (5)

where c is an n-length non-zero column vector. The rationale is that

  r_a = z_a − H x̂_a = (z + a) − H(x̂ + c) = z − H x̂ = r,  (6)

which means that the residual r_a under the FDIA attack is the same as the original residual r.

Since L(x̂_a) = L(x̂), it follows that if L(x̂) meets the condition L(x̂) ≤ τ, then L(x̂_a), which corresponds to r_a, also meets the condition L(x̂_a) ≤ τ, indicating that the attack is stealthy.

B. Adversarial False Data Injection Attacks

To address the threat of FDIA, as reviewed and summarized in [39], a number of methods utilizing neural networks (NN) have been put forth, all of which exhibit outstanding detection performance. FDIA's neural attack detection (NAD) module is used to detect whether the measurement data is attacked by FDIA, where a normal sample has a class label of 0 and an attacked sample has a class label of 1. The NAD model can be represented by a function f(z : Ψ) = y that receives a measurement vector z ∈ R^m and outputs y ∈ R^κ (κ = 2), where Ψ represents the network parameters. Note that we will not explicitly show how f depends on Ψ in the next sections for the purpose of simplicity. In general, the network's output for classification tasks is calculated using the Softmax function ω(·) [40], and the learned NAD model f classifies the instance by

  ŷ_z = max_{y_i} {y_i | y_i ∈ {y₁, y₂, . . . , y_κ} : y = f(z : Ψ)},  (7)

where y_i stands for the i-th variable of y and indicates the likelihood that z will be assigned to the i-th category. The output from the function ζ(z) is passed to the Softmax function ω(·) by

  f(z : Ψ) = ω(ζ(z)) = y.  (8)

The constrained optimization problem below can be used to generate an adversarial example z′ for a well-trained NAD model f and a measurement vector z:

  min_{z′} φ(z′, z)  (9a)
  s.t. ŷ_{z′} ≠ ŷ_z.  (9b)

ϑ ≜ z′ − z represents the added perturbation. The ℓ₀, ℓ₂, and ℓ∞ norms are used in a variety of works to compute the distance function φ(z′, z). The aforementioned optimization problem (9) minimizes the perturbation while causing models to incorrectly classify the input under specific restrictions. FGSM [41], JSMA [42], DeepFool [43] and C&W [44] are typical adversarial example generation methods in the area of images.

¹Although DC state estimation approaches are used in this work, we think that AC state estimation models are also likely to have similar flaws. Therefore, in subsequent work, we will conduct a thorough investigation of AC state estimation methods.

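As a numeric illustration of Section II-A (a sketch, not code from the paper), the snippet below builds a small hypothetical DC system, runs the WLS estimation of (3), evaluates the χ² BDD statistic of (4), and verifies that an FDIA vector a = Hc as in (5) leaves the residual unchanged, as derived in (6). The system size, Jacobian, and noise levels are invented for illustration.

```python
# Hypothetical DC state estimation with WLS (Eq. (3)), chi-square BDD
# (Eq. (4)), and a stealthy FDIA a = Hc (Eqs. (5)-(6)).
import numpy as np

rng = np.random.default_rng(42)
m, n = 8, 3                                  # meters, state variables
H = rng.standard_normal((m, n))              # hypothetical Jacobian
sigma = rng.uniform(0.1, 0.15, m)            # per-meter noise std deviations
R = np.diag(sigma ** 2)                      # covariance of e
x_true = rng.standard_normal(n)
z = H @ x_true + sigma * rng.standard_normal(m)   # Eq. (1)

Rinv = np.linalg.inv(R)

def wls(z):
    """x_hat = (H^T R^-1 H)^-1 H^T R^-1 z, Eq. (3)."""
    return np.linalg.solve(H.T @ Rinv @ H, H.T @ Rinv @ z)

def bdd_statistic(z):
    """L(x_hat) = (z - H x_hat)^T R^-1 (z - H x_hat), Eq. (4)."""
    r = z - H @ wls(z)
    return r @ Rinv @ r

# 99%-significance threshold tau for chi2 with m - n = 5 degrees of
# freedom (approximately chi2.ppf(0.99, 5)); a normal z passes L < tau
# with 99% probability.
tau = 15.09

c = np.array([0.5, 0.0, 0.0])                # malicious state errors
a = H @ c                                    # stealthy attack vector, Eq. (5)
L0, La = bdd_statistic(z), bdd_statistic(z + a)

# r_a = r (Eq. (6)): the attack leaves the residual, hence BDD, unchanged.
print(np.isclose(L0, La))                    # prints True
```

The key point matches the derivation in (6): because a lies in the column space of H, the WLS estimate absorbs the injected error into the state (x̂_a = x̂ + c) and the residual statistic is bit-for-bit the same up to floating-point error.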
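To make the notation of (7)–(8) concrete, the toy model below implements f(z) = ω(ζ(z)) and the label rule of (7). The two-layer network and its random, untrained weights are purely hypothetical stand-ins, not a real NAD detector.

```python
# Toy stand-in for the NAD model of Eqs. (7)-(8): pre-softmax features
# zeta(z), Softmax omega(.), and the predicted label (the class attaining
# the maximum in Eq. (7)). Weights are random placeholders.
import numpy as np

rng = np.random.default_rng(7)
m, kappa = 8, 2                       # measurement length; classes: 0 normal, 1 attacked
W1 = rng.standard_normal((16, m))     # hypothetical hidden layer
W2 = rng.standard_normal((kappa, 16))

def zeta(z):
    """Features computed before Softmax, zeta(z) in Eq. (8)."""
    return W2 @ np.tanh(W1 @ z)

def omega(v):
    """Softmax function omega(.), Eq. (8)."""
    e = np.exp(v - v.max())           # shift for numerical stability
    return e / e.sum()

def f(z):
    """NAD model f(z) = omega(zeta(z)) = y, Eq. (8)."""
    return omega(zeta(z))

z = rng.standard_normal(m)
y = f(z)                              # class-probability vector
y_hat = int(np.argmax(y))             # predicted label of Eq. (7)
print(round(y.sum(), 6), y_hat in (0, 1))   # prints: 1.0 True
```

Because ω normalizes the logits, y always sums to one, and ŷ_z is simply the index of the larger class probability.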

[Figure 1 here: a 14-bus power system topology with generators (G), synchronous condensers (C), and RTU1–RTUm feeding SCADA measurements z to two detectors — a bad data detector (bad data alarm: yes or no) and an end-to-end neural attack detector (attack data alarm: yes or no) — with possible attack paths on the measurements. A conventional false data injection attack, z_a = z + a_FDI, bypasses BDD but is detected by NAD; the EVADE attack strategy with sparse-state key state variable selection, z_a = z + a_FDI + H(Π_c ⊙ ϑ_c), bypasses both BDD and NAD.]

Fig. 1: The schematic diagram of the EVADE attack strategy bypassing both BDD and NAD in Smart Grid. The EVADE attack method, which is described in Section III, is used to compute the perturbation vector ϑ_c.

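As a toy illustration of the generic adversarial-example problem (9) from Section II-B, the sketch below applies a single FGSM-style step [41] to a hypothetical linear detector (random weights, not a trained NAD model). It only demonstrates the mechanics of perturbing z against the detector's gradient; it is not the EVADE strategy itself.

```python
# Minimal FGSM-style perturbation for the adversarial-example problem (9):
# move z by epsilon times the sign of the gradient of the logit margin,
# in the direction that pushes the prediction toward the other class.
# The linear "detector" logits zeta(z) = W z are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(5)
m = 8
W = rng.standard_normal((2, m))      # placeholder logits: zeta(z) = W z

def margin(z):
    """Logit margin zeta(z)_1 - zeta(z)_0; positive -> classified 'attacked'."""
    v = W @ z
    return v[1] - v[0]

z = rng.standard_normal(m)
label = int(margin(z) > 0)           # current predicted class

# One FGSM step with l_inf budget epsilon; for a linear margin the
# gradient with respect to z is exactly W[1] - W[0].
epsilon = 0.5
grad = W[1] - W[0]
z_adv = z - epsilon * np.sign(grad) * (1 if label == 1 else -1)
theta = z_adv - z                    # the added perturbation, theta = z' - z

print(np.max(np.abs(theta)) <= epsilon + 1e-12)   # perturbation stays in budget
# The margin provably moves toward the opposite class:
print(margin(z_adv) < margin(z) if label == 1 else margin(z_adv) > margin(z))
```

Whether one step suffices to flip ŷ depends on the margin's magnitude relative to ε, which is why iterative or optimization-based methods (DeepFool [43], C&W [44]) are used when a minimal perturbation is wanted.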
III. EVADE: TARGETED ADVERSARIAL FALSE DATA INJECTION ATTACKS

Adversarial false data injection attacks (AFDIA) proposed in [33] attempt to fool NAD into identifying attacked/false instances as legitimate/normal. The key of AFDIA is to generate adversarial examples for FDIA's NAD detection. Since the measurement vector after FDIA (z_a = z + a) will be identified as attacked/false by NAD, a smart attacker must add an appropriate perturbation vector ϑ to z_a so that the ultimate adversarial measurement vector after AFDIA (z_f = z_a + ϑ = z + a + ϑ) can bypass the NAD's detection, where a′ = a + ϑ represents the AFDIA's total attack vector. The above-mentioned procedure can be stated as

  min_ϑ φ(z_f, z_a)  (10a)
  s.t. z_f = z_a + ϑ  (10b)
       ŷ_{z_a} = 1  (10c)
       ŷ_{z_f} = 0  (10d)

where φ(z_f, z_a) = ∥z_f − z_a∥₂. The constraints (10c) and (10d) imply that the measurement vector after FDIA, z_a, is categorized as attacked and the ultimate measurement vector after AFDIA, z_f, is classed as legitimate. It is important to keep in mind that the attack described by the aforementioned problem (10) is not targeted because the original attack objective would be impacted. In the following sections, we formulate and describe the designed EVADE strategy to address the problem.

A. Attack Model and Objective

The threat model considered in this work is as follows:
• Attack Condition: We suppose that the attacker is able to create false data using knowledge of the power grid and NAD parameters. Actually, the relevant information may be acquired either through insider threats [45] or through the use of a variety of tools and techniques by attackers, such as listening in on network traffic and invading database systems [33].
• Attack Objective: The non-zero elements of c represent the initial attack target, indicating the introduction of certain errors to specified state variables. To execute the so-called targeted AFDIA, the attack approach should thus maintain the original attack target's integrity. Note that the attacker does not seek low-sparsity FDIA; therefore, the associated values of the targeted state variables are arbitrary. This is important to keep in mind when considering a more thorough attack strategy².
  ²For low-sparsity FDIA, the attack vector a should include as few non-zero items as possible. In this case, the relationship between the associated targeted state variables must meet specific requirements and is not random.
• Attack Cost: In order to avoid being affected by the NAD approach, we suppose that the attacker has the means to deal with additional sensors and tamper with meter values, thus making sure the original attack target is unaffected. The attacker can only perturb the non-attack target's state variables (the zero-valued elements of c in (5)) because the initial attack target should stay unchanged, which generally means that the ultimate attack cost should be raised. Therefore, the ultimate attack objective seeks to minimize the perturbation-related attack costs while maintaining the original attack target. The increase in the number of non-zero-valued elements of c can be used to indicate the additional perturbation-related attack cost, since using more state variables typically requires dealing with more sensors.
• Attack Efficiency: The attack strategy and algorithm need to maintain high efficiency under the existing software and hardware conditions to implement feasible attacks. Otherwise, if the algorithm's running time is too long, the generated attack vector is likely no longer suitable for the changing environment.
• Attack Success Rate: The attack strategy needs to have a


high probability, that is, bypass the detection of BDD and NAD and achieve the original attack target.

Although the aforementioned conditions and presumptions might not always be applicable in real-world situations, they allow us to investigate the robustness and weaknesses of power system state estimation with deep learning-based detectors.

According to the predetermined attack objective (c), as presented in Fig. 1, the attacker can use formula (5) to calculate the appropriate FDIA attack vector a. A well-trained NAD model, however, is likely to be able to identify the ideal attack vector. The attacker can then implement additional adversarial attacks to evade the NAD's detection by taking advantage of the flaws in the NAD model. Inspired by the "one-pixel attack" [46] and SparseFool [47] in the image domain, the above-mentioned adversarial attacks can be implemented as sparse-state adversarial attacks to lower the attack cost, which will be presented below.

B. Sparse-State Adversarial Attacks

To avoid detection by the NAD approach, the attacker must search for perturbations of the initial FDIA attack vector (a = Hc). Recall that FDIA does not induce residuals (6). Inspired by this, state variables are perturbed to generate perturbations indirectly. Let the target state variable set and the non-target state variable set be denoted by Υ_t = {i | c_i ≠ 0} and Υ_u = {i | c_i = 0} for each c, respectively. To accomplish certain attack objectives, elements in Υ_t should remain fixed, while elements in Υ_u can be altered to circumvent NAD. Let Π_c represent a column vector of the same size as c that is specified by

  (Π_c)_i = 0, if c_i ≠ 0;  (Π_c)_i = 1, if c_i = 0.  (11)

Subsequent adversarial attacks can therefore be represented by

  min_{ϑ_c} φ(z_f, z_a)  (12a)
  s.t. z_f = z_a + H(Π_c ⊙ ϑ_c)  (12b)
       ŷ_{z_a} = 1  (12c)
       ŷ_{z_f} = 0  (12d)
       ∥Π_c ⊙ ϑ_c∥∞ ≤ µ  (12e)

where ⊙ indicates the Hadamard product, ∥·∥∞ represents the ℓ∞ norm, and µ stands for the attacker-defined state variation range.

Proposition 1: If the problem (12) has a feasible solution, then we can obtain a covert attack vector that is stealthy to both BDD and NAD methods.

Proof. Suppose that ϑ′_c is a feasible solution to the problem (12). Then, z_f = z + a + H(Π_c ⊙ ϑ′_c) is computed to derive the ultimate measurement vector. From (12), the ultimate measurement vector z_f can bypass the NAD's detection. In addition, we have

  z_f = z + a + H(Π_c ⊙ ϑ′_c) = z + H(c + Π_c ⊙ ϑ′_c).  (13)

Therefore, the final residual r_f under z_f is equal to r by

  r_f = z_f − H x̂_f = (z + a + H(Π_c ⊙ ϑ′_c)) − H(x̂ + c + Π_c ⊙ ϑ′_c) = z − H x̂ = r.  (14)

From (14), the ultimate measurement vector z_f will not induce residuals, indicating that the final attack vector can bypass the BDD's detection. Therefore, the feasible solution of (12) is stealthy to BDD and NAD methods.

The ultimate measurement vector z_f will not induce residuals because a + H(Π_c ⊙ ϑ_c) = H(c + Π_c ⊙ ϑ_c), which is demonstrated in (6); as a result, the ultimate measurement vector is covert to BDD. The attack against the NAD approach can then be assured to be stealthy according to the feasible solutions of (12). As a result, the attacker is able to produce deeply stealthy false data that is hidden from both BDD and NAD approaches. Using the C&W attacks [44] as our inspiration, we can recast (12) by

  min_{ϑ_c} ψ(ϑ_c) = (ζ(z_f)₁ − ζ(z_f)₀)⁺
  s.t. ∥Π_c ⊙ ϑ_c∥∞ ≤ µ,  (15)

where (e)⁺ denotes max(e, 0), and ζ(x)_i stands for the features of x that have been activated for the i-th category. The attacker can obtain the ultimate attack vector a + H(Π_c ⊙ ϑ_c) and the adversarial measurement vector z_f using the derived ϑ_c. To reduce the cost of the attack, the attacker can use additional sparse-state adversarial attacks, which are detailed subsequently.

Definition 1. N-state Adversarial Attack [34]: Suppose the initial attack target, the original measurement vector and the NAD model are c, z, and f(z : Ψ) = y, respectively. If ŷ_{z+a} = 1, and there are suitable perturbations that make ŷ_{z_f} = 0, where z_f = z + a + H(Π_c ⊙ ϑ_c), then the ultimate measurement vector z_f is deeply stealthy. Besides, the above additional attack is an N-state adversarial attack if the number of non-zero items in Π_c ⊙ ϑ_c is N (i.e., ∥Π_c ⊙ ϑ_c∥₀ = N).

C. Key State Variable Selection Based on Adversarial Saliency Map

Given the specific non-target state variable set Υ_u, we need to find a suitable N-state adversarial attack to circumvent NAD. Obviously, from the perspective of attack cost, the smaller the N, the better. However, we do not know what the smallest N is or which non-target state variables should be attacked. Although we have proposed a parallel optimization technique to investigate additional sparse-state adversarial attacks, the overall attack success rate is limited [34]. In addition, as the number of attacked state variables increases, the number of possible combinations will increase exponentially and exceed the limitations of existing hardware and software environments. Therefore, a more reasonable attack strategy is needed.

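A minimal numeric sketch of the sparse-state attack structure above — the indicator Π_c of (11), the box-constrained perturbation of (12), and a simple projected-gradient search on the hinge loss ψ of (15). The toy grid and the linear stand-in for the NAD logits are hypothetical, and the plain finite-difference solver is only a stand-in for the paper's saliency-guided Algorithm 1 and the change of variables of Section III-D:

```python
# Sketch of Eqs. (11)-(15): build Pi_c, search a box-constrained theta_c on
# non-target states only, and verify z_f leaves the WLS residual unchanged
# (Eq. (14)). Grid matrices and "NAD" logits are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 4
H = rng.standard_normal((m, n))          # hypothetical measurement Jacobian
z = H @ rng.standard_normal(n)           # noiseless measurements for clarity
W = rng.standard_normal((2, m))          # placeholder NAD logits zeta(v) = W v

c = np.array([0.3, 0.0, 0.0, 0.0])       # attack target: only state 0 altered
Pi_c = (c == 0).astype(float)            # Eq. (11): 1 on non-target states
a = H @ c                                # FDIA vector, Eq. (5)
z_a = z + a
mu = 0.5                                 # attacker-defined state variation range

def psi(theta_c):
    """Hinge loss of Eq. (15): positive while the logits prefer class 1."""
    zeta = W @ (z_a + H @ (Pi_c * theta_c))
    return max(zeta[1] - zeta[0], 0.0)

theta_c = np.zeros(n)
lr, eps = 0.05, 1e-6
for _ in range(300):                     # projected gradient descent
    g = np.array([(psi(theta_c + eps * e) - psi(theta_c - eps * e)) / (2 * eps)
                  for e in np.eye(n)])   # finite-difference gradient
    theta_c = np.clip(theta_c - lr * g, -mu, mu)  # Eq. (12e) box projection
    # Note: the target coordinate keeps zero gradient because Pi_c masks it.

z_f = z_a + H @ (Pi_c * theta_c)         # Eq. (12b)

def residual_norm(v):
    """WLS residual norm with R = I: ||v - H x_hat||."""
    x_hat, *_ = np.linalg.lstsq(H, v, rcond=None)
    return np.linalg.norm(v - H @ x_hat)

print(theta_c[0] == 0.0)                                 # target state untouched
print(np.isclose(residual_norm(z_f), residual_norm(z)))  # BDD bypassed, Eq. (14)
```

Whether the search actually drives ψ to zero (i.e., flips the NAD label) depends on the model and on µ; the residual invariance, however, holds for every feasible ϑ_c, exactly as Proposition 1 states.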

Inspired by JSMA [42] and the "key region attack" [48] in the image domain, additional sparse-state adversarial attacks can be explored by key state variable selection based on adversarial saliency maps. The saliency maps previously introduced as visualization tools [49] are extended to construct adversarial saliency maps as in [42].

Let ϑ_c represent a column vector of size c and set its elements to zero. According to Equations (5), (7) and (8),

y = f(z_f) = f(z_a + Hϑ_c) = ω(ζ(z_f)) = ω(ζ(z_a + Hϑ_c)). (16)

Then, we can compute the forward derivative with respect to ϑ_c,

∂ζ(z_f)/∂ϑ_c = [∂ζ(z_f)_j / ∂(ϑ_c)_i]_{i∈1···n, j=0/1}, (17)

where i refers to the state variables in the power system and ranges from 1 to n.

For an "attacked" sample, ζ(z_f)_1 is larger than ζ(z_f)_0. To bypass the detection of NAD, the added perturbations should make ζ(z_f)_0 larger than ζ(z_f)_1. Based on Equation (17), for each state variable we can calculate two forward derivatives, ∂ζ(z_f)_0/∂(ϑ_c)_i and ∂ζ(z_f)_1/∂(ϑ_c)_i. For the case of only two output classes, if the predicted probability of one class increases, the predicted probability of the other class must decrease. Therefore, one of the two forward derivatives must be positive and the other negative. Then, we can define the overall forward derivative for each state variable by

∂ζ(z_f)/∂(ϑ_c)|_i = ∂ζ(z_f)_0/∂(ϑ_c)_i × ∂ζ(z_f)_1/∂(ϑ_c)_i. (18)

The larger |∂ζ(z_f)/∂(ϑ_c)|_i| is (|·| refers to the absolute value), the greater the influence of the perturbation of the corresponding state variable on the predicted class. In this way, we can obtain the forward derivatives for all state variables and sort them in descending order,

list = argsort(−∂ζ(z_f)/∂(ϑ_c)|_1, ···, −∂ζ(z_f)/∂(ϑ_c)|_n). (19)

The above list can be seen as the adversarial saliency map as in [42]: the earlier a state variable appears in the list, the greater the impact its perturbation has on the predicted class change. Therefore, by perturbing the key state variables in the list, we can explore the additional sparse-state adversarial attacks of Section III.B.

Let Π_N be a column vector of size c (Π_N = 0), and only the N items of Π_N corresponding to the top N non-target state variables in the list are set as 1. Then, the attacker can investigate the associated N-state adversarial attack by formulating and solving the following problem:

min_{ϑ_c} ψ(ϑ_c) = (ζ(z_f)_1 − ζ(z_f)_0)^+
s.t. ∥Π_N ⊙ ϑ_c∥_∞ ≤ µ. (20)

D. Change of Variables

Problem (20) is a typical "box constrained" problem. Therefore, we use the change-of-variable technique and optimize over ϖ rather than over ϑ_c by

ϑ_c = µ · tanh(ϖ). (21)

Given that −1 ≤ tanh(ϖ) ≤ 1, we have −µ ≤ ϑ_c ≤ µ, and the result falls inside the predetermined range. Problem (20) can therefore be recast as

min_ϖ ψ(ϖ) = (ζ(z_f)_1 − ζ(z_f)_0)^+
s.t. z_f = z + a + H(Π_N ⊙ (µ · tanh(ϖ))). (22)

With the change-of-variable technique described above, we can apply optimization methods without box limitations. Since it is not known in advance how many state variables need to be perturbed for the attack to succeed, to reduce the attack cost, the EVADE attack strategy is implemented through an iterative process, given in Algorithm 1. At step 1 of Algorithm 1, we compute Υ_u and Υ_t given an attack target c. After the initialization, the overall forward derivatives are computed and the list is created at steps 3 and 4, respectively. Then, appropriate perturbations are found using the outer and inner iterations. In the outer iteration, key state variables are selected. In the inner iteration, the vector ϖ is computed and updated. The corresponding parameters are recorded at steps 12 and 13 if suitable perturbations are found. If not, step 16 applies gradients to update the variables ϖ (the Adam [50] optimizer is actually used in the experiments due to its better performance). If a feasible solution is found, Algorithm 1 stops. In this way, appropriate sparse-state adversarial attack vectors can be created effectively.

IV. EXPERIMENTAL ANALYSES

Through extensive simulations with IEEE test systems such as the 14-bus, 30-bus, and 118-bus systems, we analyze the proposed EVADE attack approach in this section.

A. Experimental Setup

Dataset Generation: Matpower [51] contains information about the test systems, including their topology, bus data, branch data, and meter locations, which can be used to create measurement instances. We assume that the loads on each bus follow the uniform distribution U(80% ∗ baseload, 120% ∗ baseload), as in [52] and [53]. Then, we created 30,000 normal measurement instances under the assumption that each of the test systems is thoroughly measured. 15,000 samples are utilized as attacked/false training instances after well-designed FDIA vectors are added, and 15,000 samples are utilized as legitimate/normal training data. The noise added to the data follows a zero-mean Gaussian distribution, with the standard deviation set at 2% of the mean measurements of the relevant meter. Based on a randomly generated c, the FDIA vectors (a = Hc) were created [33]:

i) The number of targeted state variables follows the uniform distribution U(1, n/2);


Algorithm 1: EVADE (targEted adVersarial fAlse Data injEction) Attack Strategy.

Input:
  z: A normal measurement vector.
  c: The original attack objective.
  ζ: The function calculated before the softmax of the NAD model.
  µ: The specified variation range of state variable perturbations.
  M: The maximum number of iterations.
  λ: The learning rate.
Output:
  z_f: The adversarial measurement vector.
  ϑ_c: The feasible state variable perturbation vector.

 1  Creating Υ_t and Υ_u: Υ_t = {i | c_i ≠ 0}, Υ_u = {i | c_i = 0}, Q = card(Υ_u).
 2  Initialization: z_a = z + Hc, ϑ_c = 0, z_f = z_a + Hϑ_c, stage = 0.
 3  Computing the overall forward derivatives: ∂ζ(z_f)/∂(ϑ_c)|_i = ∂ζ(z_f)_0/∂(ϑ_c)_i × ∂ζ(z_f)_1/∂(ϑ_c)_i.
 4  Creating the list: list = argsort(−∂ζ(z_f)/∂(ϑ_c)|_1, ···, −∂ζ(z_f)/∂(ϑ_c)|_n).
 5  for N = 1 : Q do
 6      Π_N = 0; find the largest N non-target variables in list and set the corresponding elements in Π_N as 1.
 7      z_f = z_a + H(Π_N ⊙ (µ · tanh(ϖ)))
 8      ψ(ϖ) = (ζ(z_f)_1 − ζ(z_f)_0)^+
 9      for m = 1 : M do
10          if ψ == 0 then
11              stage = 1
12              ϑ_c ← µ · tanh(ϖ)
13              z_f ← z_a + H(Π_N ⊙ ϑ_c)
14              break
15          end
16          ϖ ← ϖ − λ · ∂ψ/∂ϖ
17      end
18      if stage == 1 then
19          break
20      end
21      N = N + 1
22  end
23  return z_f and ϑ_c.

ii) The targeted state variables follow the Gaussian distribution N(0, ν²).

Then, according to variance differences, we created 5,000 FDIA vectors at each scale:
(i) at the small scale, ν² = 0.02;
(ii) at the medium scale, ν² = 0.1;
(iii) at the large scale, ν² = 0.5.

The related system parameters and data are presented in Table I. After the above process, we eventually obtained 15,000 normal and 15,000 attacked samples by adding these 15,000 well-designed attack vectors at random to 15,000 of the 30,000 normal instances. The entire dataset was then randomly divided at a 2:1 ratio, with 20,000 samples for training and 10,000 samples for testing.

Target DNNs for NAD: The NAD models for FDIA were created using fully connected neural networks, and the model structures are shown in Table II, which can also be referred to in [33] [34]. PyTorch is used to train all of the NAD models. The test accuracies of the well-trained NAD models for the 14-bus, 30-bus, and 118-bus systems are 99.4%, 99.6%, and 99.0%, respectively [34].

Evaluation of Attacks: 5,000 FDIA instances in the test set were used for our evaluation of the proposed EVADE attack approach. Based on the proposed EVADE attack strategy, we search for perturbations added to each FDIA sample in an effort to prevent the NAD from detecting them, and M = 200 was employed in the tests. The success rates (P_miss³) of the sparse-state adversarial attacks outlined above can vary with the circumstances, so we examined the attack success rate P_miss and its impacting factors in various scenarios. The initial attack objective is attained when both NAD and BDD fail to identify the EVADE attack. If the attacker can find a feasible perturbation to circumvent NAD, the EVADE attack is effective, because the proposed attack strategy searches for perturbations to avoid NAD under the premise that BDD is circumvented. In addition, ∥ϑ_c∥_0, ∥ϑ_c∥_2 and ∥ϑ_c∥_∞ are also recorded to evaluate the perturbations required for successful attacks.

B. Analysis of Key State Variable Selection

We implemented the EVADE attack strategy and recorded the average P_miss, ∥ϑ_c∥_0, ∥ϑ_c∥_2 and ∥ϑ_c∥_∞ (µ = 1 rad = (180/π)°). To explore the effect of the proposed EVADE method, we compare "Forward" (Key State Variable Selection) with "Reverse" (the order of selection of key state variables is the opposite of the Key State Variable Selection approach) and "Random" (the selection of key state variables is random). Fig. 2 shows that P_miss for the 14-bus, 30-bus and 118-bus systems is about 80%, 80%, and 96%, respectively, posing a considerable threat to power systems. In addition, the success rates of the three approaches are similar, as all of them can increase the number of perturbed state variables and add perturbations to bypass detection in an iterative process. Although the success rates are similar, the perturbations and attack costs generated by the different methods differ considerably. Among the three methods, the ∥ϑ_c∥_0, ∥ϑ_c∥_2 and ∥ϑ_c∥_∞ values of the EVADE strategy ("Forward") are the smallest. For ∥ϑ_c∥_0, the smallest average value implies that the EVADE attack strategy can carry out stealthy attacks with minimal attack cost. For example, the average ∥ϑ_c∥_0 values for "Forward", "Random", and "Reverse" in the 118-bus system are about 10.87, 19.14 and 33.26, respectively. For ∥ϑ_c∥_2 and ∥ϑ_c∥_∞, the smallest values mean that the perturbation generated by the EVADE attack strategy is the smallest and, therefore, relatively less likely to be noticed by system operators. In summary, the proposed EVADE attack strategy has a higher success rate with smaller attack cost and

³For a successful case, both NAD and BDD methods miss the detection of the EVADE attack.


TABLE I: The related parameters and data for the IEEE 14-bus, 30-bus and 118-bus systems.

                                         Attacked samples (adding FDIA vectors)
System    No. of buses  No. of lines   Small (ν² = 0.02)  Medium (ν² = 0.1)  Large (ν² = 0.5)   Normal samples
14-bus         14            20              5000               5000               5000              15000
30-bus         30            41              5000               5000               5000              15000
118-bus       118           186              5000               5000               5000              15000

TABLE II: Network architectures of the DNNs for NAD in the experimental analyses for the IEEE 14-bus, 30-bus and 118-bus systems.

NAD model   Input   Layers (width, activation)
14-bus       54     20 ReLU, 20 ReLU, 40 ReLU, 20 ReLU, 2 Softmax
30-bus      112     60 ReLU, 60 ReLU, 40 ReLU, 40 ReLU, 20 ReLU, 20 ReLU, 2 Softmax
118-bus     490     250 ReLU, 120 ReLU, 60 ReLU, 60 ReLU, 40 ReLU, 40 ReLU, 20 ReLU, 20 ReLU, 2 Softmax
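As a concrete reading of Table II, the 14-bus NAD can be sketched as a plain forward pass. This is a numpy stand-in with random, untrained weights; the paper's actual models are trained in PyTorch:

```python
import numpy as np

# 14-bus NAD architecture from Table II: 54 -> 20 -> 20 -> 40 -> 20 -> 2,
# ReLU on hidden layers, softmax on the 2-class output (normal vs. attacked).
rng = np.random.default_rng(3)
widths = [54, 20, 20, 40, 20, 2]
layers = [(rng.normal(scale=0.1, size=(m, n)), np.zeros(m))
          for n, m in zip(widths[:-1], widths[1:])]

def nad_forward(z_f):
    h = z_f
    for k, (W, b) in enumerate(layers):
        h = W @ h + b
        if k < len(layers) - 1:
            h = np.maximum(h, 0.0)   # ReLU on hidden layers
    # h here is the pre-softmax function zeta used throughout the paper
    e = np.exp(h - h.max())
    return e / e.sum()               # class probabilities

p = nad_forward(rng.normal(size=54))
assert p.shape == (2,) and np.isclose(p.sum(), 1.0)
```

The intermediate vector h before the final softmax plays the role of ζ(z_f) in Eqs. (16)-(22).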

attack perturbation. In addition, as shown in Fig. 3, the average execution time of successful cases for the 14-bus, 30-bus, and 118-bus systems is 0.62 s, 0.95 s, and 2.74 s, respectively. Since the sampling period of SCADA (Supervisory Control And Data Acquisition) systems is 2–4 s, the average running time indicates that attackers can effectively investigate feasible perturbations to avoid detection. A successful EVADE attack case for the IEEE 14-bus system is shown in Fig. 4: the attacker tries to induce specific errors in the targeted state variables (c_1 = 0.0478, c_5 = 0.0171, c_8 = −0.0163, c_12 = 0.0189, and c_13 = 0.0374; the corresponding state variables are colored red in Fig. 4 (c)), and the related measurement vector after FDIA is presented in Fig. 4 (d). However, this FDIA attack can be easily detected by well-trained NAD models. To bypass the detection of NAD and BDD, the attacker uses the proposed EVADE strategy to seek perturbations on untargeted state variables (c_3 = 0.1161 and c_4 = 0.1040; the corresponding state variables are colored yellow in Fig. 4 (e)). In this way, the attacker can successfully carry out targeted attacks while maintaining stealth.

Fig. 2: Average P_miss (a), ∥ϑ_c∥_0 (b), ∥ϑ_c∥_2 (c) and ∥ϑ_c∥_∞ (d) with different methods.

Fig. 3: Successful attack cases' average execution time (in seconds).

C. Analysis of Effect of State Variation Range

We also conducted tests for µ = 0.5 (µ = 0.5 rad = (90/π)°) to examine the effects of state variation ranges (the representation of physical boundaries). The results in Fig. 5 (a) demonstrate that as the attack limit decreases, the attack success rate also decreases due to the reduced perturbation search range. The attack success rate did not drop significantly, owing to the limited reduction of the state variation range (µ = 1 to µ = 0.5). This phenomenon can be explained by Fig. 5 (d): most ∥ϑ_c∥_∞ values are less than 0.5. We speculate that if the state variation range is small, such as µ = 0.1, the attack success rate will be significantly reduced. In addition, Fig. 5 (b), (c) and (d) show that in most cases, the ∥ϑ_c∥_0, ∥ϑ_c∥_2 and ∥ϑ_c∥_∞ values for µ = 0.5 are less than those for µ = 1, which implies smaller attack cost and attack perturbation. To sum up, as the state variation range µ increases, the attack success rate increases, but the attack cost and attack perturbation also increase, which means that there may be a trade-off between attack success rate and attack cost/attack perturbation. Therefore, a smart attacker must implement a reasonable attack based on the system information, physical constraints, and attack resources.

D. Analysis of Effect of FDIA Scales

Our experiments show that the attack performance is closely related to the scale of the initial attack targets. Fig. 6 (a) represents the attack success rates with FDIA attacks at multiple scales. The success rates are comparatively high for


Fig. 4: A successful case for the IEEE 14-bus system. (a) Original state variables; (b) original measurement vector; (c) state variables after FDIA; (d) measurement vector after FDIA; (e) state variables after EVADE; (f) measurement vector after EVADE.

Fig. 5: Average P_miss (a), ∥ϑ_c∥_0 (b), ∥ϑ_c∥_2 (c) and ∥ϑ_c∥_∞ (d) with different state variation ranges (µ = 1 vs µ = 0.5).

small-scale original attack objectives, being 100% for the 14-bus and 118-bus systems (and 95.5% for the 30-bus system). The likelihood of discovering suitable perturbations declines as the scale of the original attack objectives increases. However, for large-scale attacks, the success rates for the 14-bus and 30-bus systems can still reach about 50%, and the attack success rate for the 118-bus system is as high as 90%. We speculate that as the scale of power systems increases, the potential perturbation search range is larger, and the corresponding attack success rate likewise rises, which poses a serious risk to real large-scale power systems. Fig. 6 (b), (c) and (d) show that as the attack scale increases, so do ∥ϑ_c∥_0, ∥ϑ_c∥_2 and ∥ϑ_c∥_∞. The reason is that as the scale of FDIA attacks increases, larger attack perturbations are required to avoid detection, thus increasing the norm of the perturbations. Since the perturbations cannot exceed limits such as ∥ϑ_c∥_∞ ≤ 1, the attack success rate is also reduced. In short, the larger the scale of the FDIA attack, the larger the perturbation required to bypass NAD's detection. If the required perturbation exceeds the physical limit, no suitable perturbation can be found, and the attack will fail.

Fig. 6: Average P_miss (a), ∥ϑ_c∥_0 (b), ∥ϑ_c∥_2 (c) and ∥ϑ_c∥_∞ (d) of attacks at multiple scales (µ = 1).
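The three recorded norms summarize attack cost and perceptibility. For a hypothetical perturbation vector (values loosely echoing the Fig. 4 case, purely illustrative), they are computed as:

```python
import numpy as np

# Illustrative state-perturbation vector; only two states are perturbed.
theta_c = np.array([0.0, 0.0, 0.1161, 0.1040, 0.0])

l0 = np.count_nonzero(theta_c)    # ||.||_0: number of perturbed state variables (attack cost)
l2 = np.linalg.norm(theta_c)      # ||.||_2: overall perturbation energy
linf = np.max(np.abs(theta_c))    # ||.||_inf: largest single-state perturbation

assert l0 == 2 and np.isclose(linf, 0.1161)
```

Smaller values on all three metrics mean a cheaper and less conspicuous attack, which is exactly what the "Forward" selection minimizes.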

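To make the loop structure of Algorithm 1 concrete, here is the strategy in miniature: the saliency ranking of Eq. (19), the tanh change of variables of Eq. (21), and the outer/inner iterations. Two simplifications are ours, not the paper's: a hypothetical linear surrogate for ζ (so the gradient is closed-form) and plain gradient descent instead of Adam, on toy numbers:

```python
import numpy as np

# Toy 4-state, 4-meter setup (hypothetical numbers, not a real test system).
H = np.eye(4)                         # measurement "Jacobian"
d = np.array([1.0, -0.5, 0.25, 2.0])  # d = w1 - w0 of a linear surrogate zeta
z_a = np.array([0.5, 0.0, 0.0, 0.0])  # measurement vector after the initial FDIA

g = d @ H                             # gradient of (zeta1 - zeta0) w.r.t. theta_c
saliency = np.argsort(-np.abs(g))     # key-state-variable list, Eq. (19) intent

def evade(mu=1.0, M=200, lr=0.3):
    """Outer loop grows the perturbed set; inner loop minimizes psi, Eq. (22)."""
    for N in range(1, len(g) + 1):
        mask = np.zeros_like(g)
        mask[saliency[:N]] = 1.0      # Pi_N: top-N variables from the list
                                      # (restricted to non-target states in the paper)
        varpi = np.zeros_like(g)
        for _ in range(M):
            theta = mu * np.tanh(varpi) * mask       # change of variables, Eq. (21)
            psi = max(0.0, d @ (z_a + H @ theta))    # (zeta1 - zeta0)^+
            if psi == 0.0:
                return theta                         # feasible perturbation found
            grad = g * mask * mu * (1.0 - np.tanh(varpi) ** 2)
            varpi -= lr * grad                       # gradient step (Adam in the paper)
    return None

theta_c = evade()
assert theta_c is not None and np.count_nonzero(theta_c) == 1
```

On this toy instance the very first outer iteration (N = 1, the most salient state) already drives ψ to zero, illustrating why the saliency ordering keeps ∥ϑ_c∥_0 small.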

E. Convergence and Complexity Analysis

We then performed an experimental analysis of the convergence of the proposed EVADE strategy. In Algorithm 1, appropriate perturbations are found using the outer and inner iterations. In the outer iteration, key state variables are selected. In the inner iteration, the vector ϖ is computed and updated. In other words, the outer loop selects the state variables that have the greatest impact, and the inner loop subsequently searches for appropriate perturbations for the selected state variables. If no suitable perturbation is found, the outer loop continues to iterate by increasing the number of selected state variables. Thus, with too few state variables, the algorithm may fail to find suitable adversarial perturbations and adversarial samples, whereas, as the number of selected state variables increases, the probability of finding a suitable adversarial perturbation increases. For the experimental convergence analysis, the existence of outer loops makes the optimization of the algorithm iterative and repetitive; to better demonstrate the convergence behavior, we choose the last outer loop of the successful samples (i.e., the outer loop that succeeds at the end) to demonstrate the convergence of the inner loop of the algorithm. Fig. 7 shows a convergence analysis of Algorithm 1, where the cost (ψ) decreases during optimization. Note that although we adjusted Algorithm 1 to record 200 iterations in Fig. 7, as long as a feasible perturbation is found, the algorithm is successfully terminated.

Fig. 7: An experimental analysis of convergence of Algorithm 1 for the IEEE 14-bus, 30-bus and 118-bus systems.

Finally, we evaluated the computation cost in terms of iteration complexity. In Algorithm 1, the loop iteration process is between Line 5 and Line 22, which includes an outer loop and an inner loop. In the inner loop, the maximum number of iterations is M. In the outer loop, the maximum number of iterations is Q. Therefore, the maximum number of iterations of Algorithm 1 is M × Q.

F. Comparison With State-of-the-Art Methods

Adversarial false data injection attack methods against state estimation include [29], [30], [33] and so on. In [29], the Fast Gradient Sign Method (FGSM) of [41] is borrowed to design attacks. In [30], the Jacobian-based Saliency Map Attack (JSMA) method of [42] is utilized to design attacks. Therefore, the attack methods in [29] and [30] are represented by FGSM and JSMA, respectively. However, these two attack methods consider neither physical constraints nor the presence of BDD, which makes them easily detectable by BDD. On the contrary, S-AFDIA [33] considers the physical constraints of the system and the BDD, and hence the designed attack method can bypass the detection of BDD. Therefore, we compared the EVADE strategy with the above three methods, mainly from the perspectives of the average probability of successfully executing deep stealth attacks (bypassing both BDD and NAD detection counts as a successful attack) and the average attack cost (represented by the ratio of the perturbed meters to all the meters in the system). The main results are given in Table III. From Table III, both the FGSM and JSMA methods present poor attack success rates because they are detected by the BDD. In comparison, S-AFDIA and EVADE demonstrate better attack success rates because they consider physical constraints and can thus bypass BDD and NAD detection. Although S-AFDIA has a higher attack success rate than EVADE, S-AFDIA requires a much higher attack cost, which is often difficult to meet in reality. In addition, it should be noted that even though the attack cost of JSMA is very low, its attack success rate is also very low, and such an attack is meaningless. By the overall comparison, the proposed EVADE attack strategy achieves a high attack success rate with a low attack cost, which deserves the attention of both attackers and defenders.

TABLE III: Comparison of the FGSM, JSMA, S-AFDIA and EVADE methods for the IEEE 14-bus, 30-bus and 118-bus systems.

Cases     Attack method   Attack success rate   Attack cost
14-bus    FGSM                  2.02%              98.1%
          JSMA                  2.15%              17.3%
          S-AFDIA               100%               99.2%
          EVADE                 79.1%              34.2%
30-bus    FGSM                  0.35%              98.9%
          JSMA                  0.24%              16.1%
          S-AFDIA               100%               99.6%
          EVADE                 83.2%              32.2%
118-bus   FGSM                  0.95%              98.7%
          JSMA                  0.91%              15.3%
          S-AFDIA               100%               99.3%
          EVADE                 96.5%              29.8%

G. Limitations

This study may have some potential shortcomings. First, the premise of a white-box attack may not always be accurate and is overly general. Nevertheless, because of the well-funded, experienced adversary teams behind APTs (Advanced Persistent Threats), white-box attacks against high-value power systems cannot be ignored. In this situation, the white-box attack is used to investigate and evaluate the reliability and weaknesses


of power system state estimation with deep-learning-based detectors. Besides, much cyber security research begins with potential white-box attacks [41] and then moves on to real-world black-box attacks [54], like the adversarial examples in the area of images. Therefore, in our future work, we will investigate black-box targeted AFDIA, considering the laws and characteristics of the relevant research.

Second, the associated effective defense methods have not yet been investigated. There are some possible defense measures, such as adversarial training [55] [56], defensive distillation [57], and adversarial detection. The foregoing defense strategies, however, may not immediately apply to power systems according to our early investigations. We should carefully analyze the features of power systems while adjusting and enhancing the aforementioned defense strategies in order to protect the security and safety of power systems. The shackles of defense techniques in the image domain can, of course, be thrown off, and we can instead directly look into particular defense strategies for power systems.

V. CONCLUSION

In this work, we designed the EVADE strategy to explore targeted AFDIA. The EVADE attack strategy selects key state variables based on adversarial saliency maps to improve the attack efficiency and perturbs as few state variables as possible to reduce the attack cost. In this way, the EVADE attack strategy can bypass the detection of the BDD and NAD methods (that is, maintain deep stealthiness) with a high success rate and achieve the attack target simultaneously, which poses a serious and imperative security risk for practical power systems. Considering more practical attacks, future studies will focus on the issue of black-box attacks. Besides, the corresponding defense measures will also be investigated.

REFERENCES

[1] C. Chen, Y. Wang, M. Cui, J. Zhao, W. Bi, Y. Chen, and X. Zhang, "Data-driven detection of stealthy false data injection attack against power system state estimation," IEEE Transactions on Industrial Informatics, vol. 18, no. 12, pp. 8467–8476, 2022.
[2] M. Anisetti, C. A. Ardagna, A. Balestrucci, N. Bena, E. Damiani, and C. Y. Yeun, "On the robustness of random forest against untargeted data poisoning: An ensemble-based approach," IEEE Transactions on Sustainable Computing, vol. 8, no. 4, pp. 540–554, 2023.
[3] X. Yang, Y. Gong, W. Liu, J. Bailey, D. Tao, and W. Liu, "Semantic-preserving adversarial text attacks," IEEE Transactions on Sustainable Computing, vol. 8, no. 4, pp. 583–595, 2023.
[4] Z. Zhang, K. Zuo, R. Deng, F. Teng, and M. Sun, "Cybersecurity analysis of data-driven power system stability assessment," IEEE Internet of Things Journal, vol. 10, no. 17, pp. 15723–15735, 2023.
[5] P. Asef, R. Taheri, M. Shojafar, I. Mporas, and R. Tafazolli, "SIEMS: A secure intelligent energy management system for industrial IoT applications," IEEE Transactions on Industrial Informatics, vol. 19, no. 1, pp. 1039–1050, 2023.
[6] J. Chen, X. Gao, R. Deng, Y. He, C. Fang, and P. Cheng, "Generating adversarial examples against machine learning-based intrusion detector in industrial control systems," IEEE Transactions on Dependable and Secure Computing, vol. 19, no. 3, pp. 1810–1825, 2022.
[7] M. Wu, R. Roy, P. Serna Torre, and P. Hidalgo-Gonzalez, "Effectiveness of learning algorithms with attack and defense mechanisms for power systems," Electric Power Systems Research, vol. 212, p. 108598, 2022.
[8] J. Tian, B. Wang, R. Guo, Z. Wang, K. Cao, and X. Wang, "Adversarial attacks and defenses for deep-learning-based unmanned aerial vehicles," IEEE Internet of Things Journal, vol. 9, no. 22, pp. 22399–22409, 2022.
[9] Y. Chen, Y. Tan, and B. Zhang, "Exploiting vulnerabilities of load forecasting through adversarial attacks," in e-Energy 2019, 2019, pp. 1–11.
[10] X. Zhou, Y. Li, C. A. Barreto, J. Li, P. Volgyesi, H. Neema, and X. Koutsoukos, "Evaluating resilience of grid load predictions under stealthy adversarial attacks," in 2019 Resilience Week (RWS), vol. 1, 2019, pp. 206–212.
[11] Y. Zhou, Z. Ding, Q. Wen, and Y. Wang, "Robust load forecasting towards adversarial attacks via bayesian learning," IEEE Transactions on Power Systems, vol. 38, no. 2, pp. 1445–1459, 2023.
[12] C. Ren and Y. Xu, "Robustness verification for machine-learning-based power system dynamic security assessment models under adversarial examples," IEEE Transactions on Control of Network Systems, vol. 9, no. 4, pp. 1645–1654, 2022.
[13] C. Ren, X. Du, Y. Xu, Q. Song, Y. Liu, and R. Tan, "Vulnerability analysis, robustness verification, and mitigation strategy for machine learning-based power system stability assessment model under adversarial examples," IEEE Transactions on Smart Grid, vol. 13, no. 2, pp. 1622–1632, 2022.
[14] C. Ren and Y. Xu, "A universal defense strategy for data-driven power system stability assessment models under adversarial examples," IEEE Internet of Things Journal, vol. 10, no. 9, pp. 7568–7576, 2023.
[15] Q. Song, R. Tan, C. Ren, and Y. Xu, "Understanding credibility of adversarial examples against smart grid: A case study for voltage stability assessment," in Proceedings of the Twelfth ACM International Conference on Future Energy Systems, ser. e-Energy '21, 2021, pp. 95–106.
[16] J. Tian, B. Wang, J. Li, and Z. Wang, "Adversarial attacks and defense for CNN based power quality recognition in smart grid," IEEE Transactions on Network Science and Engineering, vol. 9, no. 2, pp. 807–819, 2022.
[17] J. Li, Y. Yang, and J. S. Sun, "SearchFromFree: Adversarial measurements for machine learning-based energy theft detection," in SmartGridComm 2020, 2020, pp. 1–6.
[18] ——, "Exploiting vulnerabilities of deep learning-based energy theft detection in AMI through adversarial attacks," arXiv preprint arXiv:2010.09212, 2020.
[19] A. Takiddin, M. Ismail, and E. Serpedin, "Robust data-driven detection of electricity theft adversarial evasion attacks in smart grids," IEEE Transactions on Smart Grid, vol. 14, no. 1, pp. 663–676, 2023.
[20] J. Wang and P. Srikantha, "Stealthy black-box attacks on deep learning non-intrusive load monitoring models," IEEE Transactions on Smart Grid, vol. 12, no. 4, pp. 3479–3492, 2021.
[21] Z. Zhang, M. Liu, M. Sun, R. Deng, P. Cheng, D. Niyato, M.-Y. Chow, and J. Chen, "Vulnerability of machine learning approaches applied in IoT-based smart grid: A review," IEEE Internet of Things Journal, vol. 11, no. 11, pp. 18951–18975, 2024.
[22] Z. Zhang, M. Sun, R. Deng, C. Kang, and M.-Y. Chow, "Physics-constrained robustness evaluation of intelligent security assessment for power systems," IEEE Transactions on Power Systems, vol. 38, no. 1, pp. 872–884, 2023.
[23] Z. Liu, Q. Wang, Y. Ye, and Y. Tang, "A GAN-based data injection attack method on data-driven strategies in power systems," IEEE Transactions on Smart Grid, vol. 13, no. 4, pp. 3203–3213, 2022.
[24] A. Venzke and S. Chatzivasileiadis, "Verification of neural network behaviour: Formal guarantees for power system applications," IEEE Transactions on Smart Grid, vol. 12, no. 1, pp. 383–397, 2021.
[25] J. Ruan, Q. Wang, S. Chen, H. Lyu, G. Liang, J. Zhao, and Z. Y. Dong, "On vulnerability of renewable energy forecasting: Adversarial learning attacks," IEEE Transactions on Industrial Informatics, vol. 20, no. 3, pp. 3650–3663, 2024.
[26] L. Zhang, C. Jiang, Z. Chai, and Y. He, "Adversarial attack and training for deep neural network based power quality disturbance classification," Engineering Applications of Artificial Intelligence, vol. 127, p. 107245, 2024.
[27] L. Zhang, C. Jiang, A. Pang, and Y. He, "Super-efficient detector and defense method for adversarial attacks in power quality classification," Applied Energy, vol. 361, p. 122872, 2024.
[28] Y. Ren, H. Zhang, W. Yang, M. Li, J. Zhang, and H. Li, "Transferable adversarial attack against deep reinforcement learning-based smart grid dynamic pricing system," IEEE Transactions on Industrial Informatics, vol. 20, no. 6, pp. 9015–9025, 2024.
[29] A. Sayghe, O. M. Anubi, and C. Konstantinou, "Adversarial examples on power systems state estimation," in ISGT 2020, 2020, pp. 1–5.
[30] A. Sayghe, J. Zhao, and C. Konstantinou, "Evasion attacks with adversarial deep learning against power system state estimation," in PESGM 2020, 2020, pp. 1–5.

Authorized licensed use limited to: BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE. Downloaded on November 16,2024 at 10:40:27 UTC from IEEE Xplore. Restrictions apply.
© 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See https://www.ieee.org/publications/rights/index.html for more information.
This article has been accepted for publication in IEEE Transactions on Sustainable Computing. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/TSUSC.2024.3492290

IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING 12

[31] J. Li, Y. Yang, J. S. Sun, K. Tomsovic, and H. Qi, “ConAML: Constrained adversarial machine learning for cyber-physical systems,” in ACM ASIACCS 2021, 2021, pp. 52–66.
[32] ——, “Towards adversarial-resilient deep neural networks for false data injection attack detection in power grids,” in 2023 32nd International Conference on Computer Communications and Networks (ICCCN), 2023, pp. 1–10.
[33] J. Tian, B. Wang, Z. Wang, K. Cao, J. Li, and M. Ozay, “Joint adversarial example and false data injection attacks for state estimation in power systems,” IEEE Transactions on Cybernetics, vol. 52, no. 12, pp. 13699–13713, 2022.
[34] J. Tian, B. Wang, J. Li, Z. Wang, B. Ma, and M. Ozay, “Exploring targeted and stealthy false data injection attacks via adversarial machine learning,” IEEE Internet of Things Journal, vol. 9, no. 15, pp. 14116–14125, 2022.
[35] J. Tian, C. Shen, B. Wang, X. Xia, M. Zhang, C. Lin, and Q. Li, “LESSON: Multi-label adversarial false data injection attack for deep learning locational detection,” IEEE Transactions on Dependable and Secure Computing, vol. 21, no. 5, pp. 4418–4432, 2024.
[36] J. Tian, B. Wang, T. Li, F. Shang, and K. Cao, “Coordinated cyber-physical attacks considering DoS attacks in power systems,” International Journal of Robust and Nonlinear Control, vol. 30, no. 11, pp. 4345–4358, 2020.
[37] Y. Liu, P. Ning, and M. K. Reiter, “False data injection attacks against state estimation in electric power grids,” ACM Transactions on Information and System Security (TISSEC), vol. 14, no. 1, p. 13, 2011.
[38] A. Abur and A. G. Exposito, Power System State Estimation: Theory and Implementation. CRC Press, 2004.
[39] A. S. Musleh, G. Chen, and Z. Y. Dong, “A survey on the detection algorithms for false data injection attacks in smart grids,” IEEE Transactions on Smart Grid, vol. 11, no. 3, pp. 2218–2234, 2020.
[40] B. Gao and L. Pavel, “On the properties of the softmax function with application in game theory and reinforcement learning,” arXiv preprint arXiv:1704.00805, 2017.
[41] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” arXiv preprint arXiv:1412.6572, 2014.
[42] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami, “The limitations of deep learning in adversarial settings,” in EuroS&P 2016, 2016, pp. 372–387.
[43] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard, “DeepFool: A simple and accurate method to fool deep neural networks,” in CVPR 2016, 2016, pp. 2574–2582.
[44] N. Carlini and D. Wagner, “Towards evaluating the robustness of neural networks,” in 2017 IEEE Symposium on Security and Privacy (SP), 2017, pp. 39–57.
[45] N. Baracaldo and J. Joshi, “An adaptive risk management and access control framework to mitigate insider threats,” Computers & Security, vol. 39, pp. 237–254, 2013.
[46] J. Su, D. V. Vargas, and K. Sakurai, “One pixel attack for fooling deep neural networks,” IEEE Transactions on Evolutionary Computation, vol. 23, no. 5, pp. 828–841, 2019.
[47] A. Modas, S.-M. Moosavi-Dezfooli, and P. Frossard, “SparseFool: A few pixels make a big difference,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 9079–9088.
[48] Q. Liao, Y. Li, X. Wang, B. Kong, B. Zhu, S. Lyu, Y. Yin, Q. Song, and X. Wu, “Imperceptible adversarial examples for fake image detection,” in 2021 IEEE International Conference on Image Processing (ICIP), 2021, pp. 3912–3916.
[49] K. Simonyan, A. Vedaldi, and A. Zisserman, “Deep inside convolutional networks: Visualising image classification models and saliency maps,” arXiv preprint arXiv:1312.6034, 2013.
[50] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
[51] R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas, “MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education,” IEEE Transactions on Power Systems, vol. 26, no. 1, pp. 12–19, 2011.
[52] S. Ahmed, Y. Lee, S. Hyun, and I. Koo, “Unsupervised machine learning-based detection of covert data integrity assault in smart grid networks utilizing isolation forest,” IEEE Transactions on Information Forensics and Security, vol. 14, no. 10, pp. 2765–2777, 2019.
[53] M. Esmalifalak, L. Liu, N. Nguyen, R. Zheng, and Z. Han, “Detecting stealthy false data injection using machine learning in smart grid,” IEEE Systems Journal, vol. 11, no. 3, pp. 1644–1652, 2017.
[54] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, “Practical black-box attacks against machine learning,” in ACM ASIACCS 2017, 2017, pp. 506–519.
[55] A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial machine learning at scale,” arXiv preprint arXiv:1611.01236, 2016.
[56] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, and P. McDaniel, “Ensemble adversarial training: Attacks and defenses,” arXiv preprint arXiv:1705.07204, 2017.
[57] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami, “Distillation as a defense to adversarial perturbations against deep neural networks,” in 2016 IEEE Symposium on Security and Privacy (SP), 2016, pp. 582–597.

Jiwei Tian received the Ph.D. degree in Cyberspace Security from Air Force Engineering University, Xi’an, China, in 2021. He is currently an Assistant Professor with the Air Traffic Control and Navigation College, Air Force Engineering University, and the Ministry of Education Key Laboratory for Intelligent Networks and Network Security, School of Cyber Science and Engineering, Xi’an Jiaotong University, Xi’an, China. His current research interests include IoT/CPS security and AI security.

Chao Shen received the B.S. degree in Automation from Xi’an Jiaotong University, China, in 2007, and the Ph.D. degree in Control Theory and Control Engineering from Xi’an Jiaotong University, China, in 2014. He is currently a Professor with the Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, China. His current research interests include AI security, insider/intrusion detection, behavioral biometrics, and measurement and experimental methodology.

Buhong Wang received the M.S. and Ph.D. degrees in Signal and Information Processing from Xidian University, Xi’an, China, in 2000 and 2003, respectively. Since 2012, he has been a Professor with the Information and Navigation College, Air Force Engineering University. His current research interests include cyber security and cyber-physical systems.

Chao Ren received the B.E. degree from the School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China, in July 2017, and the Ph.D. degree from the Interdisciplinary Graduate School, Nanyang Technological University, Singapore, in March 2022. He is currently a Wallenberg-NTU Presidential Postdoctoral Fellow with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests include machine learning, data analytics, and their applications to the smart grid.


Xiaofang Xia received the Ph.D. degree in Control Theory and Control Engineering from the Shenyang Institute of Automation, Chinese Academy of Sciences, China, in 2019. She was a visiting student at the Department of Computer Science, University of Alabama, USA, from August 2016 to February 2018. She is currently an Associate Professor with the School of Computer Science and Technology, Xidian University, China. Her research interests are mainly in cyber-physical systems, database management systems, data security, and anomaly detection.

Runze Dong received the Ph.D. degree in Cyberspace Security from Air Force Engineering University, Xi’an, China, in 2022. He is currently a Lecturer with the School of Information and Navigation, Air Force Engineering University, Xi’an, China. His research interests include physical layer security of unmanned aerial vehicle (UAV) communication networks and the integration of artificial intelligence (AI) and communication networks.

Tianhao Cheng received the Ph.D. degree in Cyberspace Security from Air Force Engineering University, Xi’an, China, in 2023. His research interests include physical layer security in unmanned aerial vehicle communication systems and array signal processing.

