Statistical Framework
fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JSAC.2019.2952181, IEEE Journal
on Selected Areas in Communications
Abstract—Electricity distribution networks have undergone rapid change with the introduction of smart meter technology, which offers advanced sensing and communications capabilities, resulting in improved measurement and control functions. However, the same capabilities have enabled various cyber-attacks. A particular attack focuses on electricity theft, where the attacker alters (increases) the electricity consumption measurements recorded by the smart meters of other users, while reducing her own measurement. Because such attacks maintain the total amount of power consumed at the distribution transformer, they are hard to detect by techniques that monitor mean levels of consumption patterns. To address this data integrity problem, we develop statistical techniques that utilize information on higher order statistics of electricity consumption and thus are capable of detecting such attacks and also identifying the users (attacker and victims) involved. The models work both for independent and correlated electricity consumption streams. The results are illustrated on synthetic data, as well as emulated attacks leveraging real consumption data.

Index Terms—false data injection mechanism, anomaly detection and diagnosis, higher order information, smart grid, inverse problem, thresholded covariance matrix

This work was supported in part by NSF grant DMS 1830175.
J. Tao is with the Department of Statistics, University of Florida, Gainesville, FL 32611, USA email:[email protected]
G. Michailidis is with the Departments of Statistics and Computer Science and the Informatics Institute, University of Florida, Gainesville, FL 32611, USA e-mail: [email protected]
0733-8716 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

Electricity theft has been a major concern worldwide and costs utility companies significant revenue losses [1], [2]. It takes various forms, ranging from physical interventions through illegal connections and meter tampering, to billing irregularities and unpaid bills by customers. The introduction of advanced metering infrastructure has the potential to reduce the risk of electricity theft through its increased-frequency monitoring capabilities. In addition, smart meter technology can lead to effective and accurate load forecasting and on-time troubleshooting for outage remediation and network controllability (see, e.g., [3], [4], [5]). At the same time, it offers new opportunities for tampering with operations of the power grid through cyber-attacks, both locally and remotely, that take the form of false data injections. The consequences range from compromising demand response schemes for selected targeted areas, to endangering the power grid's state estimation process or even inducing power outages [6].

There is a growing literature on false data injection (FDI) attack activities (a brief summary is given in [7]). A lot of attention has been paid to the impact of FDI on the grid's state estimation problem [8] and how coordinated attacks can occur [9], [10]. [11] proposes an adaptive procedure to test whether there is a data attack activity, combined with a multivariate hypothesis testing method, in order to avoid a wrong grid-state estimate. Addressing the problem from a different angle, [12] attempts to prevent the state estimation from being compromised by adopting a graph theoretic method that aims to design an optimal set of meter measurements. [13] considers a setting where multiple simultaneous nefarious data attacks are launched and proposes a game theoretic framework to build a defense system.

Another thrust has focused on the electricity theft problem, and there are two general streams in the literature. One of them focuses on using machine learning and data mining techniques to detect anomalies in the consumption patterns of a household or business, based on smart meters' historical data (see, e.g., [14], [15], [16], [17], [18], [19], [20]), potentially augmented with information about the consumer type [20]. These methods can be further subdivided into supervised ones that leverage labeled (known FDI vs. non-FDI) samples in the training data, and unsupervised ones that try to identify abrupt changes from normal consumption patterns. Supervised methods can be powerful, but availability of labeled FDI samples remains a big challenge. Unsupervised methods are susceptible to the impact of non-malicious factors that alter consumption patterns, e.g., seasonality, change of appliances, change of occupants, and so forth [21].

A different stream in the literature utilizes information about the architecture of a neighborhood area network in the smart grid [22], [23], [24], [25], [26]. Specifically, it assumes that the electricity provider builds a distribution station within every neighborhood that acts as an "electricity router" to distribute power from the substation to all consumers. A master smart meter (known as the collector) measures aggregate power supply from the power provider to all consumers within a certain time interval. Further, smart meters installed at each consumer (households or businesses) record their corresponding energy consumption for the same time interval. [24] proposed a method that utilizes such measurements, together with information about the resistances of lines connecting the consumption points to the distribution transformers, to estimate technical losses due to low voltage power lines, as well as intrinsic inefficiencies in the transformers. [25], [26] employed such measurements and a linear regression framework to identify electricity theft, wherein the dependent variable
corresponds to the aggregate measurement by the collector, and the predictor variables to the household/business smart meter measurements. However, for this approach to work, it is assumed that the predictor variables are uncorrelated, an assumption that is automatically violated when theft occurs, as technically demonstrated in Section II below. Note that this regression framework would work to identify faulty individual smart meters, since their measurements will most likely be random and hence uncorrelated.

In this paper, we adopt the architecture of the neighborhood area network, as previously described. In this setting, electricity theft involves an attacker who attempts to lower her energy bill by injecting false measurements into her own smart meter but, to avoid detection by the utility company, compensates with another false injection into other smart meters within the neighborhood area network. Specifically, consider a set of smart meters in a neighborhood under a common distribution transformer, as depicted in Figure 1. If the attacker alters (increases) the electricity consumption measurements recorded by the smart meters of other users, while reducing her own measurement, the total consumption reported at the collector is not altered. Hence, various machine learning based monitoring schemes that focus on alterations in mean consumption patterns will fail to detect such an attack when considering measurements from the central node, or when used in end user smart meters, especially if the attacker injects small magnitude false data.¹

¹ Note that this attack mechanism violates the assumption of uncorrelated predictors in the regression framework proposed by [25], and thus renders it inapplicable.

Fig. 1. Structure of the neighborhood area network, with a central smart meter node (collector) for the distribution transformer and smart meters at consumption points.

On the other hand, examining correlations between the measurements (and in certain cases, information encapsulated in 3rd moments of the data distribution) proves a powerful strategy, not only to detect an attack, but also to identify the attacking node, as well as the "victim" nodes. Further, the proposed strategy works even when the consumption patterns amongst end users are correlated. The key message of the work is that examining correlation patterns can be a powerful approach for the electricity theft problem.

The remainder of this paper is organized as follows: Section II introduces the modeling framework for the problem at hand. Section III describes the detection and identification/diagnosis strategies, while Section IV discusses implementation issues and evaluates the strategies based on synthetic data, as well as emulated attacks based on real consumption data. Finally, some concluding remarks are drawn in Section V.

II. MODEL DESCRIPTION AND PROBLEM FORMULATION

Let $Y_1, Y_2, \ldots, Y_P$ correspond to the smart meter² variables that measure electricity consumption over a time interval, and further assume that $Y_i = \rho_i W + U_i$, where $E(W) = \mu_w$, $\mathrm{Var}(W) = \sigma_w$, $\rho_i \in (-1, 1)$ and $U_i \overset{\text{i.i.d.}}{\sim} F(u)$. In words, the smart meter measurements are correlated, namely $\mathrm{Corr}(Y_i, Y_j) = \sqrt{\rho_i \rho_j}$, $\forall\, i \neq j$. This is a reasonable assumption, since neighboring households can exhibit a certain degree of similarity in their electricity consumption patterns [27]. The idiosyncratic component $U_i$ of each measurement $Y_i$ (which captures the heterogeneity among electricity consumers) has a distribution $F$, whose mean and variance are denoted by $\mu$ and $\sigma$, respectively. Finally, denote the $P \times P$ covariance matrix of the measurement vector $\mathbf{Y} = (Y_1, \cdots, Y_P)'$ by

$$\Sigma = \mathrm{Var}(\mathbf{Y}) = E\left[ (\mathbf{Y} - E(\mathbf{Y})) (\mathbf{Y} - E(\mathbf{Y}))' \right]. \tag{II.1}$$

Let $Z$ denote the measurement variable at the collector node (e.g., the distribution transformer smart meter) controlled by the power utility company, to which the smart meter measurements are communicated; hence, assuming absence of technical losses due to power distribution and transmission issues (see discussion in [24]), we have by definition that $Z = \sum_{i=1}^{P} Y_i$.³

² End nodes of the distribution network; for example, deployed at households or businesses.
³ In the presence of technical losses, the techniques in [24] can be used to adjust the controller and individual smart meter measurements, so that equality holds.

An electricity theft attacker aims to distort the measurements recorded by the end node smart meters, while not changing their sum. For example, if the attacker can lower the measurement of meter $i$ by an amount $\alpha$ and increase that of meter $j$ by an equal amount, then the attacker can benefit financially. We call the smart meter (end node) whose electricity consumption measurement is decreased the "Attacker Node", and the end node whose electricity consumption measurement is increased the "Victim Node".

Next, we impose a number of assumptions on $\mathbf{Y}$ and $\alpha$ that are used in future technical developments. We start by defining the key variables.

Notation:
$Y_i$: value of the $i$th smart meter measurement;
$W$: the variable of the common component for each smart meter measurement;
$\rho_i$: the relative contribution of $W$ to $Y_i$;
$U_i$: idiosyncratic component for each measurement $Y_i$;
$\alpha$: data attack variable;
$\approx$: approximate to or close to. For example, $a - b \approx 0$ means the value of $a - b$ is close to 0.
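As a rough numerical illustration of the attack mechanism described above, the following Python sketch simulates readings from $Y_i = \rho_i W + U_i$ and applies a sum-preserving pairwise attack. The Gaussian choices for $W$ and $U_i$, the uniform attack magnitude, all parameter values, and the function names are assumptions made for this sketch only; they are not taken from the paper.

```python
import random

def simulate_meters(n, rhos, rng, mu_w=10.0, sd_w=1.0, mu_u=650.0, sd_u=5.0):
    """Draw n vectors from Y_i = rho_i * W + U_i.

    Gaussian W and U_i are an assumption made only for this sketch.
    """
    rows = []
    for _ in range(n):
        w = rng.gauss(mu_w, sd_w)
        rows.append([rho * w + rng.gauss(mu_u, sd_u) for rho in rhos])
    return rows

def pairwise_attack(rows, attacker, victim, rng, lo=10.0, hi=30.0):
    """Lower the attacker's reading by alpha > 0 and raise the victim's by the same alpha."""
    attacked = []
    for row in rows:
        alpha = rng.uniform(lo, hi)  # assumed attack distribution
        new_row = list(row)
        new_row[attacker] -= alpha
        new_row[victim] += alpha
        attacked.append(new_row)
    return attacked

def sample_cov(xs, ys):
    """Sample covariance with the 1/n normalisation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def column(rows, i):
    """Extract the readings of meter i across all time points."""
    return [row[i] for row in rows]

rng = random.Random(0)
clean = simulate_meters(2000, [0.0] * 5, rng)   # independent case: rho_i = 0
attacked = pairwise_attack(clean, attacker=0, victim=1, rng=rng)

# The collector total Z = sum_i Y_i is unchanged up to floating-point error.
max_gap = max(abs(sum(a) - sum(c)) for a, c in zip(attacked, clean))
cov_attack_pair = sample_cov(column(attacked, 0), column(attacked, 1))
cov_other_pair = sample_cov(column(attacked, 2), column(attacked, 3))
print(f"max |Z_attacked - Z_clean| = {max_gap:.2e}")
print(f"Cov(attacker, victim)      = {cov_attack_pair:.2f}")
print(f"Cov(two untouched meters)  = {cov_other_pair:.2f}")
```

Under these assumed settings the collector total is preserved, while the covariance between the two meters involved in the attack should come out clearly negative and that between untouched meters close to zero, which is the signal that mean-level monitoring misses.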
the Attacker and the second as the Victim, the corresponding covariance matrix of the smart meter measurements will have the following form

$$\Sigma = \begin{pmatrix} B_1 & & & & & & \\ & B_2 & & & & \mathbf{0} & \\ & & \ddots & & & & \\ & & & B_s & & & \\ & & & & \sigma & & \\ & \mathbf{0} & & & & \ddots & \\ & & & & & & \sigma \end{pmatrix}_{P \times P}$$

where $B_h$, $h = 1, \ldots, s$, is the sub-covariance block matrix for attack $\alpha_h$, and

$$B_h = \begin{pmatrix} \sigma + \sigma_{\alpha_h} & -\sigma_{\alpha_h} \\ -\sigma_{\alpha_h} & \sigma + \sigma_{\alpha_h} \end{pmatrix}.$$

This explicit pattern dictates the following algorithmic strategy to detect an electricity theft attack and identify the pairs of nodes involved in it.
• If $\mathrm{Cov}(Y_i, Y_j) < 0$, $i \neq j$, we can conclude that end nodes $i$ and $j$ are involved in the same attack; i.e., they belong to the same attack group;
• Similarly, if $\mathrm{Cov}(Y_i, Y_j) = 0$, $i \neq j$, we can conclude that there is no attack involving nodes $i$ and $j$; rather, they belong to different attack groups.
Hence, the above simple strategy detects electricity theft and the nodes involved as Attacker and Victim in it.

Remark: In practice, the $s$ attacks will involve random pairs of nodes and not the first $2s$ ones. Then, one needs to reorder the rows and columns of the covariance matrix to obtain the desired structure previously discussed.

The previously outlined strategy identifies the $s$ attack groups, but not which node in the pair is the Attacker and which is the Victim. To address this issue, information involving the 3rd moment of the smart meter measurements is required, as the following result shows.

Theorem 2.2: Under the pairwise attack scenario, if end nodes $i$ and $j$ are in the same attack group with magnitude $\alpha$, then $E[(Y_i + \alpha)^3] - E[(Y_j - \alpha)^3] > 0$.

Proof: Note that

$$E[(Y_i + \alpha)^3] = E(Y_i^3) + 3E(Y_i^2)E(\alpha) + 3E(Y_i)E(\alpha^2) + E(\alpha^3). \tag{II.2}$$

Similarly,

$$E[(Y_j - \alpha)^3] = E(Y_j^3) - 3E(Y_j^2)E(\alpha) + 3E(Y_j)E(\alpha^2) - E(\alpha^3), \tag{II.3}$$

where nodes $i$ and $j$ are the Victim and Attacker nodes, respectively. Further, since $\alpha > 0$, we obtain

$$E(\alpha) > 0, \quad E(\alpha^3) > 0.$$

Subtracting the last two relationships [i.e., (II.2) − (II.3)] yields

$$E[(Y_i + \alpha)^3] - E[(Y_j - \alpha)^3] = E(Y_i^3) - E(Y_j^3) + 3E(\alpha)\left(E(Y_i^2) + E(Y_j^2)\right) + 3E(\alpha^2)\left(E(Y_i) - E(Y_j)\right) + 2E(\alpha^3).$$

It is easy to check that

$$E(Y_i^3) - E(Y_j^3) = (\rho_i^3 - \rho_j^3)E(W^3) + 3\left(\rho_i^2 E(U_i) - \rho_j^2 E(U_j)\right)E(W^2) + 3\left(\rho_i E(U_i^2) - \rho_j E(U_j^2)\right)E(W) + E(U_i^3) - E(U_j^3)$$

and

$$E(Y_i) - E(Y_j) = (\rho_i - \rho_j)E(W) + E(U_i) - E(U_j).$$

By Assumption 2.3, we have $E(Y_i^3) - E(Y_j^3) \approx 0$ and $E(Y_i) - E(Y_j) \approx 0$. The remaining terms, $3E(\alpha)\left(E(Y_i^2) + E(Y_j^2)\right)$ and $2E(\alpha^3)$, are strictly positive because $E(\alpha) > 0$ and $E(\alpha^3) > 0$; hence

$$E[(Y_i + \alpha)^3] - E[(Y_j - \alpha)^3] > 0.$$

This proves the result.

A direct consequence of Theorem 2.2 is that the 3rd moment of the Victim node is strictly larger than that of the Attacker node within the same attack group. Hence, Theorem 2.1, the block diagonal structure of the covariance matrix, and Theorem 2.2 give rise to the following detection and identification algorithm for the pairwise attack scenario.

Algorithm 1 Pairwise Attack Detection and Identification
  i ← 1
  while i < P do
    j ← i + 1
    while j ≤ P do
      if Cov(y_i, y_j) < 0 then
        ∆ = E(y_i^3) − E(y_j^3)
        if ∆ > 0 then
          label y_i as the Victim node for Attack Group i and label y_j as the Attacker node for the same Group;
        else {∆ < 0}
          label y_i as the Attacker for Attack Group i and label y_j as the Victim for the same Group;
        end if
      end if
      j ← j + 1
    end while
    i ← i + 1
  end while

A Single Attacker-Many Victims Scenario: The pairwise scenario is the simplest one to execute, since only one Victim node is involved. However, if the Attacker node aims to use a large $\alpha$, this action may be flagged either by change point analysis techniques, since a sharp change in the electricity consumption pattern of the Victim node would occur, or by the consumer under attack, who may in turn complain to the utility company about a sharp and unexpected increase in her/his electricity bill. In that case, the Attacker may want to spread the attack among a larger group of nodes, so as not to raise such suspicions. This leads to a more involved setting, where the Attacker node decreases the smart meter measurement at node $i$ by an amount $\alpha$, and increases cumulatively the measurements of the Victim group smart
meters by an equal amount. Specifically, when a magnitude $\alpha$ attack is launched by node $i$, we have $Y_i - \alpha$; further, for the $l$ Victim Nodes we also have $Y_{j_1} + k_1^{\alpha}\alpha,\ Y_{j_2} + k_2^{\alpha}\alpha,\ \ldots,\ Y_{j_l} + k_l^{\alpha}\alpha$, where $\sum_j k_j^{\alpha} = 1$.

Analogously to the pairwise attack scenario, the single Attacker-Many Victims case is undetectable by monitoring discrepancies in the measurements at the controller and the end node smart meters, since by construction the sum of the latter measurements agrees with the former, i.e., $Z$.

A similar analysis to the pairwise scenario shows that the resulting covariance matrix again exhibits a block diagonal pattern; namely,

$$\Sigma = \begin{pmatrix} B_1 & & & & & & \\ & B_2 & & & & \mathbf{0} & \\ & & \ddots & & & & \\ & & & B_s & & & \\ & & & & \sigma & & \\ & \mathbf{0} & & & & \ddots & \\ & & & & & & \sigma \end{pmatrix}_{P \times P}$$

where $B_h$, $h = 1, \ldots, s$, is the block of the covariance matrix corresponding to the $h$-th attack. Each block $B_h$ has the following form:

$$B_h = \Sigma_{(d_h+1)\times(d_h+1)} + \sigma_{\alpha_h} \begin{pmatrix} 1 & -k_1^{\alpha_h} & \cdots & -k_{d_h}^{\alpha_h} \\ -k_1^{\alpha_h} & (k_1^{\alpha_h})^2 & \cdots & k_1^{\alpha_h} k_{d_h}^{\alpha_h} \\ \vdots & \vdots & \ddots & \vdots \\ -k_{d_h}^{\alpha_h} & k_{d_h}^{\alpha_h} k_1^{\alpha_h} & \cdots & (k_{d_h}^{\alpha_h})^2 \end{pmatrix}$$

where $d_h$ is the number of Victims in the $\alpha_h$ attack group, $\Sigma_{(d_h+1)\times(d_h+1)}$ is the original $(d_h+1)\times(d_h+1)$ block of the covariance matrix of the end nodes in the $\alpha_h$ attack group, and $\sum_{h=1}^{s} (d_h + 1) = m$.

Thus, the same broad strategy as in the pairwise attack scenario is applicable. Specifically, under the single Attacker-Many Victims attack mechanism,
• If $\mathrm{Cov}(Y_i, Y_j) \neq 0$, $i \neq j$, we conclude that end nodes $i$ and $j$ belong to the same attack group;
• If $\mathrm{Cov}(Y_i, Y_j) = 0$, $i \neq j$, we conclude that nodes $i$ and $j$ belong to different attack groups.
Hence, a close examination of such patterns in the covariance structure of the smart meter measurements leads to detecting such attacks.

Interestingly, even though in this scenario the attack mechanism is more involved, once an attack group has been identified, it is straightforward to separate the Attacker node from the Victim ones. A close examination of the $B_h$ block shows that under this scenario only the Attacker node will exhibit negative covariance values with all other nodes in the same attack group; i.e., $\mathrm{Cov}(Y_{i'}, Y_j) < 0$, $\forall j \neq i'$. On the other hand, all the Victim nodes in the same attack group will have positive covariance values with each other. Hence, for each attack group, labeling the Attacker and the Victim nodes requires two steps:
1) Identify the node that has only negative covariance values within the $B_h$ sub-block and label it as the Attacker node.
2) Label the remaining nodes in the block as the Victim ones.

The following algorithm summarizes the detection and node identification strategy.

Algorithm 2 Single Attacker-Many Victims Attack Detection
  i ← 1, g ← 1, l_g ← 1
  while i < P do
    j ← i + 1
    while j ≤ P do
      if Cov(y_i, y_j) ≠ 0 then
        denote y_g^1 = y_i and let y_g^1 ∈ Group_g; for each such j, let l_g = l_g + 1, denote y_g^{l_g} = y_j and let y_g^{l_g} ∈ Group_g;
        for Group_g, find the node y_g^a such that Cov(y_g^a, y_g^b) < 0 for all b ≠ a, b ∈ {1, 2, ..., l_g}; label y_g^a as the Attacker for Group_g and label the remaining nodes as Victims for Group_g; then set g = g + 1, l_g = 1.
      end if
      j ← j + 1
    end while
    i ← i + 1
  end while

B. Identifying Attacks: The Dependent Case

Recall that in the general case, the smart meter measurements are generated according to $Y_i = \rho_i W + U_i$. Note that $\mathrm{Cov}(Y_i, Y_j) = \rho_i \rho_j \sigma_w$, which complicates detection and attacker-victim(s) identification strategies. We start by defining $X_{ij} = Y_i - Y_j$ for $i, j = 1, 2, \ldots, P$ and $i \neq j$. By using this new set of $(P-1)^2$ measurement variables, we show next that their covariance exhibits patterns that lead to detection and identification.

Denote by $\mathbf{X} = \left(X_{12}, \ldots, X_{1P}, X_{21}, X_{23}, \ldots, X_{(P-1)P}\right)^T$; we then can obtain the following result.

Theorem 2.3: $\mathrm{Cov}(X_{ij}, X_{kl}) \approx 0$ if $i \neq j \neq k \neq l$.

Proof: Since the $U_i$ are i.i.d., we get

$$\mathrm{Cov}(X_{ij}, X_{kl}) = (\rho_i - \rho_j)(\rho_k - \rho_l)\sigma_w.$$

Since by Assumption 2.2 $|\rho_i - \rho_j| \approx 0$, we get $(\rho_i - \rho_j)(\rho_k - \rho_l) \approx 0$. Therefore, $\mathrm{Cov}(X_{ij}, X_{kl}) \approx 0$.

Note that Theorem 2.3 implies that the differencing transformation of the original set of measurements reduces their correlation to a large extent, which proves key to our detection and identification strategy.

To illustrate, we start with the most general case. Without loss of generality, we assume that there are four different attack variables $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ applied to $Y_i, Y_j, Y_k, Y_l$, respectively, for any $i \neq j \neq k \neq l$. Then, $X'_{ij} = Y_i - Y_j \pm (\alpha_1 + \alpha_2)$ and $X'_{kl} = Y_k - Y_l \pm (\alpha_3 + \alpha_4)$, depending on whether they are attackers or victims.
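The effect of differencing stated in Theorem 2.3 can be checked numerically. The sketch below is a minimal illustration under assumed settings (Gaussian $W$ and $U_i$, nearly equal loadings $\rho_i$, a uniform attack magnitude, and hypothetical variable names); it compares the covariance of raw readings with that of the differences $X_{ij}$, and then shows what happens to the attacked differences when a single attack links one node of each pair.

```python
import random

def sample_cov(xs, ys):
    """Sample covariance with the 1/n normalisation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

rng = random.Random(1)
n = 5000
rhos = [0.80, 0.81, 0.79, 0.80]   # assumed nearly equal loadings on W
Y = [[] for _ in rhos]
for _ in range(n):
    w = rng.gauss(100.0, 10.0)    # Var(W) = 100 in this sketch
    for i, rho in enumerate(rhos):
        Y[i].append(rho * w + rng.gauss(50.0, 2.0))

# Raw readings are strongly correlated through the common component W.
cov_raw = sample_cov(Y[0], Y[2])

# Differences X_01 = Y_0 - Y_1 and X_23 = Y_2 - Y_3 are nearly uncorrelated.
X01 = [a - b for a, b in zip(Y[0], Y[1])]
X23 = [a - b for a, b in zip(Y[2], Y[3])]
cov_diff = sample_cov(X01, X23)

# One attack linking node 1 (attacker) and node 3 (victim).
alphas = [rng.uniform(1.0, 9.0) for _ in range(n)]
X01_att = [x + a for x, a in zip(X01, alphas)]   # Y_0 - (Y_1 - alpha)
X23_att = [x - a for x, a in zip(X23, alphas)]   # Y_2 - (Y_3 + alpha)
cov_diff_att = sample_cov(X01_att, X23_att)

print(f"Cov(Y_0, Y_2)                 = {cov_raw:.2f}")
print(f"Cov(X_01, X_23), no attack    = {cov_diff:.3f}")
print(f"Cov(X'_01, X'_23), one attack = {cov_diff_att:.3f}")
```

With these assumed values the raw covariance is large, the differenced covariance is close to zero, and the attacked differences show a covariance of roughly minus the variance of the attack magnitude.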
Theorem 2.4: $\mathrm{Cov}(X'_{ij}, X'_{kl}) \neq 0$, for $i \neq j \neq k \neq l$, only if attack variables from the same attack group are separately applied, i.e., one is on node $i$ (or $j$, or $i$ and $j$) and the other is on node $k$ (or $l$, or $k$ and $l$).

Proof: First, calculate

$$\begin{aligned} \mathrm{Cov}(X'_{ij}, X'_{kl}) &= \mathrm{Cov}\big((\rho_i - \rho_j)W + U_i - U_j \pm (\alpha_1 + \alpha_2),\ (\rho_k - \rho_l)W + U_k - U_l \pm (\alpha_3 + \alpha_4)\big) \\ &= (\rho_i - \rho_j)(\rho_k - \rho_l)\sigma_w \pm \big(\mathrm{Cov}(\alpha_1, \alpha_3) + \mathrm{Cov}(\alpha_1, \alpha_4) + \mathrm{Cov}(\alpha_2, \alpha_3) + \mathrm{Cov}(\alpha_2, \alpha_4)\big). \end{aligned} \tag{II.5}$$

Since $(\rho_i - \rho_j)(\rho_k - \rho_l) \approx 0$, we have

$$\mathrm{Cov}(X'_{ij}, X'_{kl}) \approx \pm \big(\mathrm{Cov}(\alpha_1, \alpha_3) + \mathrm{Cov}(\alpha_1, \alpha_4) + \mathrm{Cov}(\alpha_2, \alpha_3) + \mathrm{Cov}(\alpha_2, \alpha_4)\big). \tag{II.7}$$

It can then be easily seen that $\mathrm{Cov}(X'_{ij}, X'_{kl}) \neq 0$ only if $\alpha_1$ and $\alpha_3$ (or $\alpha_1$ and $\alpha_4$, or $\alpha_2$ and $\alpha_3$, or $\alpha_2$ and $\alpha_4$) are from the same attack group.

Based on this property, we can develop an easy-to-implement algorithm to identify and group nodes, for both the pairwise attack and the single Attacker-Many Victims scenarios, according to which attack groups they belong to.

Algorithm 3 Detection Algorithm for Dependent Case
  while 1 ≤ i ≠ j ≠ k ≠ l ≤ P do
    if ∆1 = Cov(X'_ij, X'_kl) ≠ 0 then
      while 1 ≤ l' ≤ P and l' ≠ i ≠ j ≠ k ≠ l do
        if ∆2 = Cov(X'_ij, X'_kl') = 0 then
          while 1 ≤ j' ≤ P and j' ≠ l' ≠ i ≠ j ≠ k ≠ l do
            if ∆3 = Cov(X'_ij', X'_kl) = 0 then
              conclude j and l belong to the same attack group
            else
              conclude i and l belong to the same attack group
            end if
          end while
        end if
      end while
    end if
    update i, j, k, l
  end while

Remark: For example, if the output of Algorithm 3 is {(1, 2), (1, 3), (2, 3)}, we will conclude that (1, 2, 3) are within the same attack group. The same logic applies to more complicated outcomes.

III. IMPLEMENTATION ISSUES AND NUMERICAL RESULTS

The previous results discuss detection and identification strategies in the ideal case, when one has full knowledge of population parameters (e.g., the true variances $\sigma_w$, $\sigma$). However, in practice one needs to replace them with their sample counterparts. Thus, the estimate of the covariance matrix $\Sigma$ would be noisy. On the other hand, the previous analysis established that the true covariance matrix is sparse and hence we should aim to sparsify its sample analogue as well. To that end, we employ the Universal Thresholding Method [28] to regularize the sample correlation matrix in order to obtain a sparse estimate of it, before running our detection and identification algorithms.

We provide the necessary details of our estimation strategy next. Let $\mathbf{X} = (X_1, \ldots, X_P)^T$ be a $P$-variate random vector with covariance matrix $\Sigma$. Given an independent and identically distributed random sample $\{\mathbf{X}_1, \ldots, \mathbf{X}_n\}$, we can calculate the covariance estimator as

$$\hat{\sigma}_{ij} = n^{-1} \sum_{t=1}^{n} (x_{it} - \bar{x}_i)(x_{jt} - \bar{x}_j), \quad i, j = 1, \ldots, P, \tag{III.1}$$

where $\bar{x}_i = n^{-1} \sum_{t=1}^{n} x_{it}$.

Then, the sample correlation coefficient of $x_i$ and $x_j$ is given by $\hat{\rho}_{ij} = \hat{\sigma}_{ij} / \sqrt{\hat{\sigma}_{ii}\hat{\sigma}_{jj}}$. Thus, the estimator of $R = (\rho_{ij})$, denoted by $\tilde{R} = (\tilde{\rho}_{ij})$, is given by

$$\tilde{\rho}_{ij} = \hat{\rho}_{ij}\, I\left[\, |\hat{\rho}_{ij}| > n^{-1/2} c_q(P) \,\right], \quad i = 1, 2, \ldots, P-1, \ j = i+1, \ldots, P,$$

where $c_q(P) = \Phi^{-1}\left(1 - \frac{q}{2 f(P)}\right)$ and $q$ is the significance level for this multiple hypothesis testing procedure. We select $f(P) = \frac{P(P-1)}{2}$.

Finally, the estimator of $\Sigma$, denoted by $\tilde{\Sigma}$, is given by

$$\tilde{\Sigma} = \hat{D}^{1/2} \tilde{R} \hat{D}^{1/2}, \tag{III.2}$$

where $\hat{D} = \mathrm{diag}(\hat{\sigma}_{11}, \hat{\sigma}_{22}, \ldots, \hat{\sigma}_{PP})$.

A. Simulation Studies Results

We start by providing the definition of the Variance Ratio, an important quantity in the sequel. Assume there are $l$ victims in the group of an arbitrary attack of magnitude $\alpha$; we then define the Variance Ratio (VR) for that group to be

$$VR = \frac{\mathrm{Var}(\alpha / l)}{\mathrm{Var}(Y)} = \frac{1}{l^2} \frac{\mathrm{Var}(\alpha)}{\mathrm{Var}(Y)}. \tag{III.3}$$

The quantity VR can be thought of as a signal-to-noise measure for the problem at hand.

1) Independent Case: Recall that in this case, the model reduces to $Y_i = U_i$, where $U_i \overset{\text{i.i.d.}}{\sim} F(\mu, \sigma)$.

To illustrate the detection algorithms, we consider $P = 100$ smart meter nodes; $\mathbf{Y} = (Y_1, \ldots, Y_{100})^T$. Further, the significance level in the Universal Thresholding Method is set to be $q = 0.1$. In the first simulation setting, we generate $n = 200$ independent sets of smart meter vectors, $\mathbf{Y}_1, \ldots, \mathbf{Y}_n$, from the following two distributions: (i) Uniform(625, 675) and (ii) Gamma(400, 1.5), respectively. The Uniform distribution
Fig. 3. Heat maps of the correlation matrix and its filtered version in the
independent Gamma, no attack case.
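The filtered correlation matrix referred to in the caption above comes from the thresholding estimator in (III.1) and (III.2). The following is a minimal pure-Python sketch of that estimator, assuming the data arrive as a list of n observation vectors; the function names and the simulated example (Uniform(625, 675) readings with one assumed pairwise attack) are illustrative choices, not the authors' implementation.

```python
import math
import random
from statistics import NormalDist

def sample_cov_matrix(data):
    """Covariance estimator (III.1): data is a list of n rows of P values, 1/n scaling."""
    n, P = len(data), len(data[0])
    means = [sum(row[i] for row in data) / n for i in range(P)]
    cov = [[0.0] * P for _ in range(P)]
    for i in range(P):
        for j in range(i, P):
            s = sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / n
            cov[i][j] = cov[j][i] = s
    return cov

def critical_value(P, q):
    """c_q(P) = Phi^{-1}(1 - q / (2 f(P))) with f(P) = P (P - 1) / 2."""
    f = P * (P - 1) / 2
    return NormalDist().inv_cdf(1 - q / (2 * f))

def thresholded_cov(data, q=0.1):
    """Sparse covariance estimate (III.2): zero out entries whose sample
    correlation does not exceed c_q(P) / sqrt(n) in absolute value."""
    n, P = len(data), len(data[0])
    cov = sample_cov_matrix(data)
    cutoff = critical_value(P, q) / math.sqrt(n)
    out = [[0.0] * P for _ in range(P)]
    for i in range(P):
        out[i][i] = cov[i][i]
        for j in range(i + 1, P):
            r = cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])
            if abs(r) > cutoff:
                # sqrt(s_ii) * r * sqrt(s_jj) equals the sample covariance itself
                out[i][j] = out[j][i] = cov[i][j]
    return out

# Illustrative data: 6 independent meters, meter 0 attacks meter 1 (assumed magnitudes).
rng = random.Random(2)
data = []
for _ in range(200):
    row = [rng.uniform(625, 675) for _ in range(6)]
    alpha = rng.uniform(0, 60)
    row[0] -= alpha
    row[1] += alpha
    data.append(row)

sigma_tilde = thresholded_cov(data, q=0.1)
print(f"cutoff on |correlation|: {critical_value(6, 0.1) / math.sqrt(200):.3f}")
print(f"thresholded Cov(Y_0, Y_1): {sigma_tilde[0][1]:.1f}")
```

Because the kept entries satisfy $\sqrt{\hat{\sigma}_{ii}}\,\hat{\rho}_{ij}\sqrt{\hat{\sigma}_{jj}} = \hat{\sigma}_{ij}$, the sketch copies the sample covariance directly instead of forming $\hat{D}^{1/2}\tilde{R}\hat{D}^{1/2}$ explicitly.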
continue to hold.
Fig. 9. Heat map of the correlation matrix in the independent Gamma “mixed
attacks” case
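For a mixed-attack setting such as the one in the caption above, the labeling rules of Algorithms 1 and 2 can be sketched on a thresholded covariance matrix as follows. This is only an illustration under stated assumptions: grouping nodes through connected components of the nonzero off-diagonal entries is a simplification chosen here, the third moments are assumed to be supplied by the caller, and the function names and toy numbers are hypothetical.

```python
def find_attack_groups(cov, third_moments, tol=0.0):
    """Group nodes linked by nonzero off-diagonal covariances, then label each group."""
    P = len(cov)
    seen = [False] * P
    groups = []
    for start in range(P):
        if seen[start]:
            continue
        # Depth-first search over entries with |cov| > tol.
        stack, members = [start], []
        seen[start] = True
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(P):
                if j != i and not seen[j] and abs(cov[i][j]) > tol:
                    seen[j] = True
                    stack.append(j)
        if len(members) > 1:          # singletons are treated as not attacked
            groups.append(sorted(members))
    return [label_group(g, cov, third_moments) for g in groups]

def label_group(members, cov, third_moments):
    """Pairs: the smaller third moment marks the Attacker (Theorem 2.2).
    Larger groups: the Attacker is the node with only negative covariances."""
    if len(members) == 2:
        i, j = members
        attacker = i if third_moments[i] < third_moments[j] else j
    else:
        attacker = None               # stays None if no node satisfies the rule
        for a in members:
            if all(cov[a][b] < 0 for b in members if b != a):
                attacker = a
                break
    victims = [m for m in members if m != attacker]
    return {"attacker": attacker, "victims": victims}

# Toy thresholded covariance: node 0 attacks nodes 1 and 2; nodes 3 and 4 form a pair;
# node 5 is untouched. All numbers are made up for illustration.
toy_cov = [
    [9.0, -2.0, -3.0, 0.0, 0.0, 0.0],
    [-2.0, 6.0, 1.5, 0.0, 0.0, 0.0],
    [-3.0, 1.5, 7.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 8.0, -4.0, 0.0],
    [0.0, 0.0, 0.0, -4.0, 8.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 5.0],
]
toy_third_moments = [90.0, 110.0, 115.0, 100.0, 120.0, 105.0]
result = find_attack_groups(toy_cov, toy_third_moments)
print(result)
```

In the toy example the function returns one group with node 0 as Attacker and nodes 1 and 2 as Victims, and one pair in which node 3 (the smaller third moment) is the Attacker and node 4 the Victim.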
Fig. 12. Probability of detection for various attack scenarios for Uniformly distributed dependent data.
Fig. 13. Probability of detection for various attack scenarios for Gamma distributed dependent data.
Fig. 14. Electricity consumption of 19 campus buildings (in natural log-scale).
Fig. 15. Boxplots of electricity consumption for the 19 selected buildings.
set (0.05, 0.10, 0.15, 0.20) for each attack group. Figure 17 depicts the relationship between probability of detection rate and VR for different types of attacks, based on 50 generated attack data sets.

TABLE III
AVERAGE NUMBER OF SUCCESSFUL DETECTIONS (WITH STANDARD DEVIATION IN PARENTHESES) FOR 2 PAIRWISE, 1 TWO-VICTIM AND 1 THREE-VICTIMS ATTACKS, BASED ON 50 GENERATED ATTACKS BASED ON THE RESIDENTIAL BUILDINGS DATA.

             Pairwise        2 Victims       3 Victims
VR = 0.05    1.92 (0.274)    0.62 (0.490)    0.70 (0.463)
VR = 0.10    1.94 (0.240)    0.98 (0.141)    0.98 (0.141)
VR = 0.15    1.94 (0.240)    0.98 (0.141)    0.98 (0.141)
VR = 0.20    1.94 (0.240)    0.98 (0.141)    0.98 (0.141)
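To relate the VR levels in Table III to attack magnitudes, the Variance Ratio definition (III.3) can be inverted for the standard deviation of $\alpha$. The short sketch below does this under the assumption of a single common per-meter variance Var(Y); the function name and the value Var(Y) = 100 are hypothetical and are not taken from the building data.

```python
import math

def attack_sd_for_vr(vr, n_victims, var_y):
    """Invert VR = Var(alpha) / (l^2 * Var(Y)) to get the standard deviation of alpha."""
    if vr < 0 or n_victims < 1 or var_y <= 0:
        raise ValueError("need vr >= 0, n_victims >= 1 and var_y > 0")
    return math.sqrt(vr * n_victims ** 2 * var_y)

# Assumed per-meter variance of 100 (illustrative only).
for vr in (0.05, 0.10, 0.15, 0.20):
    sds = [round(attack_sd_for_vr(vr, l, 100.0), 2) for l in (1, 2, 3)]
    print(f"VR = {vr:.2f}: sd(alpha) for 1, 2, 3 victims = {sds}")
```

For a fixed VR, the required spread of $\alpha$ grows linearly with the number of victims, since the attack is split across them.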
block, we let the node with the largest cardinality be the Attacker, and consequently the rest are the victims. In the example below, node a is the Attacker.

$$
B = \begin{array}{c|cccc}
  & a & b & c & d \\ \hline
a & + & - & - & + \\
b & - & + & + & - \\
c & - & + & + & + \\
d & + & - & + & +
\end{array}
$$

3) Some entries of the covariance estimate may also be missing after regularization. Consider the toy example above for the three-victim case: if the covariance estimate B has the following form, with some missing entries in the Victim-Victim positions (blank cells denote missing entries), we still identify nodes a, b, c, and d as belonging to the same attack group.

$$
B = \begin{array}{c|cccc}
  & a & b & c & d \\ \hline
a & + & - & - & - \\
b & - & + & + &   \\
c & - & + & + & + \\
d & - &   & + & +
\end{array}
$$

If the missing values are in the Attacker-Victim positions, we can again use the cardinality rule to address this issue. For example, if the estimate B has the following form, then by applying our rules we still identify a, b, c, and d as belonging to the same attack group, with node a as the Attacker.

$$
B = \begin{array}{c|cccc}
  & a & b & c & d \\ \hline
a & + & - &   & - \\
b & - & + & + &   \\
c &   & + & + & + \\
d & - &   & + & +
\end{array}
$$

By applying this rule, we can resolve this problem when dealing with real, noisy data sets.

IV. CONCLUSION

In this paper, we have primarily focused on the problem of detecting coordinated power theft, considering both independent and dependent smart meter data generating mechanisms. For each case, two scenarios, pairwise and one attacker-many victims, have been thoroughly investigated.

We have separately developed an easy-to-implement detection algorithm to detect attacks and identify attacker and victim nodes. The implementation of the strategy leverages a regularized covariance estimator, followed by close examination of patterns in the resulting matrix. Extensive numerical results based on both synthetic and real data illustrate the superior performance of the proposed methodology.

Note that there is a plethora of machine learning approaches that address the detection problem. However, identifying "attackers" and their corresponding "victims" is a more challenging problem that few of these approaches can address. Hence, this constitutes an important feature of the proposed methodology.

There are some open problems that merit additional investigation, including scenarios involving multiple attackers and multiple victims. However, such coordinated attacks are more difficult to launch, since they require a higher level of sophistication from the attacker's perspective.
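As a supplementary illustration, the sign-pattern rules described before the Conclusion can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the regularized estimate B is encoded only by its sign pattern (+1, -1, and 0 for a missing entry); nodes joined by any non-missing off-diagonal entry are placed in the same attack group; and the "cardinality" of a node is assumed to mean the number of negative entries it has within its group. The names attack_groups, negative_count, identify, and the two matrix constants are hypothetical.

```python
from collections import deque

LABELS = ["a", "b", "c", "d"]

# Sign patterns of the two toy estimates with missing entries (0 = missing).
B_MISSING_VV = [  # missing Victim-Victim entries (b, d)
    [1, -1, -1, -1],
    [-1, 1, 1, 0],
    [-1, 1, 1, 1],
    [-1, 0, 1, 1],
]
B_MISSING_AV = [  # additionally missing Attacker-Victim entries (a, c)
    [1, -1, 0, -1],
    [-1, 1, 1, 0],
    [0, 1, 1, 1],
    [-1, 0, 1, 1],
]


def attack_groups(sign, labels):
    """Assumed grouping rule: connected components of the graph whose
    edges are the non-missing off-diagonal entries of the sign matrix."""
    n = len(labels)
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:
            i = queue.popleft()
            comp.append(i)
            for j in range(n):
                linked = sign[i][j] != 0 or sign[j][i] != 0
                if j != i and j not in seen and linked:
                    seen.add(j)
                    queue.append(j)
        if len(comp) > 1:  # singletons are not treated as attack groups
            groups.append([labels[i] for i in sorted(comp)])
    return groups


def negative_count(sign, labels, node, group):
    """Assumed 'cardinality': number of negative entries in the node's row,
    restricted to the other members of its group."""
    i = labels.index(node)
    return sum(
        1 for other in group
        if other != node and sign[i][labels.index(other)] < 0
    )


def identify(sign, labels):
    """Return one (attacker, victims) pair per attack group."""
    results = []
    for group in attack_groups(sign, labels):
        # max() keeps the first node on ties; an arbitrary choice here.
        attacker = max(
            group, key=lambda node: negative_count(sign, labels, node, group)
        )
        victims = [node for node in group if node != attacker]
        results.append((attacker, victims))
    return results


print(identify(B_MISSING_AV, LABELS))  # [('a', ['b', 'c', 'd'])]
```

For both toy sign patterns with missing entries, the sketch places a, b, c, and d in one group and returns node a as the Attacker. Ties in the negative-entry count are broken by node order, which is a choice of this sketch rather than a rule stated in the paper.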