Distribution System State Estimation: An Overview of Recent Developments

This document provides an overview of recent developments in distribution system state estimation. It discusses nonlinear weighted least-squares and least-absolute-value approaches to state estimation that address the challenges of nonlinear measurement models. It also describes the Cramér-Rao lower bound for benchmarking unbiased estimators, as well as state estimation approaches that are robust to cyber attacks. Finally, it discusses current challenges in distribution system state estimation.

Uploaded by

Shihab Ahmed

Wang et al. / Front Inform Technol Electron Eng 2019 20(1):4-17

Frontiers of Information Technology & Electronic Engineering


www.jzus.zju.edu.cn; engineering.cae.cn; www.springerlink.com
ISSN 2095-9184 (print); ISSN 2095-9230 (online)
E-mail: [email protected]

Review:

Distribution system state estimation: an overview of recent developments∗
Gang WANG¹, Georgios B. GIANNAKIS¹, Jie CHEN²,³, Jian SUN‡²,³
¹Department of Electrical and Computer Engineering and Digital Technology Center,
University of Minnesota, Minneapolis, MN 55455, USA
²School of Automation, Beijing Institute of Technology, Beijing 100081, China
³Key Laboratory of Intelligent Control and Decision of Complex Systems, Beijing 100081, China
E-mail: [email protected]; [email protected]; [email protected]; [email protected]
Received Sept. 23, 2018; Revision accepted Nov. 27, 2018; Crosschecked Jan. 8, 2019

Abstract: In the envisioned smart grid, high penetration of uncertain renewables, unpredictable participation of
(industrial) customers, and purposeful manipulation of smart meter readings, all highlight the need for accurate,
fast, and robust power system state estimation (PSSE). Nonetheless, most real-time data available in the current
and upcoming transmission/distribution systems are nonlinear in power system states (i.e., nodal voltage phasors).
Scalable approaches to dealing with PSSE tasks undergo a paradigm shift toward addressing the unique modeling and
computational challenges associated with those nonlinear measurements. In this study, we provide a contemporary
overview of PSSE and describe the current state of the art in the nonlinear weighted least-squares and least-absolute-
value PSSE. To benchmark the performance of unbiased estimators, the Cramér-Rao lower bound is developed.
Accounting for cyber attacks, new corruption models are introduced, and robust PSSE approaches are outlined as
well. Finally, distribution system state estimation is discussed along with its current challenges. Simulation tests
corroborate the effectiveness of the developed algorithms as well as the practical merits of the theory.

Key words: State estimation; Cramér-Rao bound; Feasible point pursuit; Semidefinite relaxation; Proximal linear
algorithm; Composite optimization; Cyber attack; Bad data detection
https://doi.org/10.1631/FITEE.1800590
CLC number: TP311

‡ Corresponding author
* Wang G and Giannakis GB were supported by the National Natural Science Foundation of China (NSFC) (Nos. 1514056, 1505970, and 1711471). Chen J and Sun J were supported by the NSFC (Nos. 61621063 and 61522303), the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization (No. 61720106011), the Projects of Major International (Regional) Joint Research Program NSFC (No. 61720106011), and the Program for Changjiang Scholars and Innovative Research Team in University (No. IRT1208)
ORCID: Gang WANG, http://orcid.org/0000-0002-7266-2412
© Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature 2019

1 Introduction

The electric power grid, arguably the largest complex system on Earth, is recognized as the greatest engineering achievement of the 20th century (Wulf, 2000): thousands of miles of transmission lines and millions of miles of distribution lines, connecting millions of power generators to millions of factories and homes. To maintain grid efficiency, reliability, and sustainability, power system operators have to constantly monitor the operating conditions of the system (Schweppe et al., 1970; Abur and Gómez-Expósito, 2004; Giannakis et al., 2013). In the early 1960s, system operators tried to compute the voltages at a few selected buses based upon manually collected meter readings from geographically distributed current and potential transformers. Unfortunately, due in part to timing, model uncertainties, and metering errors, the alternating current (AC) power flow equations were never feasible.

With the seminal contributions of Schweppe
et al. (1970), the statistical foundations were laid for (static/dynamic) power system state estimation (PSSE). Ever since their pioneering work, a number of solutions and generalizations of PSSE have been proposed and worked out. Furthermore, many articles, chapters, and books have nicely reviewed the progress in this area (Wood and Wollenberg, 1996; Monticelli, 2000; Baran, 2001; Abur and Gómez-Expósito, 2004; Caro and Conejo, 2012; Huang et al., 2012; Giannakis et al., 2013; Della Giustina et al., 2014; Kekatos et al., 2017; Ahmad et al., 2018). In this study, we provide a contemporary overview of PSSE, linking the relevant physics to the signal processing and (nonconvex) optimization methods and algorithms. Our main goal is to describe some of the recent advances in this area, identify current challenges, and suggest directions for future research. Admittedly, our collection is by no means exhaustive, but it gives an indication of the related research taking place.

The notations are explained as follows: Matrices (column vectors) are denoted by upper- (lower-) case boldface letters; in particular, $\mathbf{1}$ is an all-one vector of suitable dimension. Sets are represented using calligraphic letters. Symbol "$T$" represents the transpose and symbol "$H$" represents the Hermitian transpose. $\overline{(\cdot)}$ denotes complex conjugate, while $\Re(\cdot)$ ($\Im(\cdot)$) takes the real (imaginary) part of a complex number.

2 Grid modeling preliminaries

In this section, we briefly review Kirchhoff's and Ohm's laws as well as the power flow equations. An electric power grid comprising $N$ nodes (i.e., buses) and $L$ edges (i.e., lines) can be modeled as a graph $\mathcal{G} := \{\mathcal{N}, \mathcal{L}\}$, whose nodes $\mathcal{N} := \{1, 2, \ldots, N\}$ correspond to buses and whose edges $\mathcal{L} := \{(n, n')\} \subseteq \mathcal{N} \times \mathcal{N}$ correspond to lines. Since the focus of this study is on AC circuits, steady-state voltages and currents will be represented by their single-phase equivalent phasors per unit.

A transmission line $(n, n') \in \mathcal{L}$ connecting nodes $n$ and $n' \in \mathcal{N}$ can be modeled by its line series admittance $y_{nn'} = g_{nn'} + jb_{nn'}$, and total shunt susceptance $jb^s_{nn'}$. Letting $v_n \in \mathbb{C}$ be the complex voltage at bus $n$, the current $i_{nn'} \in \mathbb{C}$ flowing from node $n$ to node $n'$ across line $(n, n')$ is given by

$$i_{nn'} = (y_{nn'} + jb^s_{nn'}/2)\, v_n - y_{nn'} v_{n'}. \tag{1}$$

The reverse-direction current $i_{n'n}$ flowing from node $n'$ to node $n$ can be expressed symmetrically. According to Eq. (1), unless $b^s_{nn'} = 0$, it holds that $i_{nn'} \neq -i_{n'n}$. According to Kirchhoff's current law, the complex current injected into node $n$ is equal to the sum of currents on the lines incident to node $n$; that is,

$$i_n = \sum_{n' \in \mathcal{N}_n} i_{nn'}, \tag{2}$$

where $\mathcal{N}_n \subseteq \mathcal{N}$ is the set of nodes directly connected to node $n$.

Upon stacking up all nodal voltages (currents) to form the $N \times 1$ vector $\mathbf{v}$ ($\mathbf{i}$), Ohm's law dictates that

$$\mathbf{i} = \mathbf{Y}\mathbf{v}, \tag{3}$$

where $\mathbf{Y}$ is the so-called bus admittance matrix. Substituting Eq. (1) into Eq. (2) shows that its $(n, n')$th entry is $-y_{nn'}$ for $n' \in \mathcal{N}_n$ (and zero for all other $n' \neq n$), and its $(n, n)$th entry is $\sum_{n' \in \mathcal{N}_n} (y_{nn'} + jb^s_{nn'}/2)$. Note that $\mathbf{Y}$ is symmetric and sparse. Similarly, one can collect all line currents in the $2L \times 1$ vector $\mathbf{i}_f$ and write

$$\mathbf{i}_f = \mathbf{Y}_f \mathbf{v}, \tag{4}$$

for a properly defined $2L \times N$ complex matrix $\mathbf{Y}_f$.

Let $s_n := p_n + jq_n$ denote the complex power injected into node $n$. Using the definition $s_n = v_n \overline{i}_n$, the vector collecting all complex power injections $\mathbf{s} = \mathbf{p} + j\mathbf{q}$ can be compactly expressed as

$$\mathbf{s} = \mathrm{diag}(\mathbf{v})\,\overline{\mathbf{i}} = \mathrm{diag}(\mathbf{v})\,\overline{\mathbf{Y}}\,\overline{\mathbf{v}}. \tag{5}$$

The power flowing from node $n$ to node $n'$ over line $(n, n')$ seen from node $n$ is similarly given by $S_{nn'} = v_n \overline{i}_{nn'}$, or in a matrix-vector representation by

$$\mathbf{S} = \mathrm{diag}(\mathbf{v})\,\overline{\mathbf{Y}}_f\,\overline{\mathbf{v}}. \tag{6}$$

Matrix $\mathbf{Y}$ is often given in rectangular coordinates as $\mathbf{Y} = \mathbf{G} + j\mathbf{B}$. Depending on whether the complex voltages $\mathbf{v}$ are expressed in polar or rectangular coordinates, the AC power flow equations admit two options. Specifically, if voltages are $v_n = V_n \exp(j\theta_n)$, the real and imaginary parts of the power flow equations (Eq. (5)) can be written as

$$\begin{cases} p_n = \sum_{n'=1}^{N} V_n V_{n'} \left(G_{nn'} \cos\theta_{nn'} + B_{nn'} \sin\theta_{nn'}\right), \\ q_n = \sum_{n'=1}^{N} V_n V_{n'} \left(G_{nn'} \sin\theta_{nn'} - B_{nn'} \cos\theta_{nn'}\right), \end{cases} \tag{7}$$

where for notational brevity, we use $\theta_{nn'} := \theta_n - \theta_{n'}$ for all $n \in \mathcal{N}$. Observe that both $\{p_n\}$ and $\{q_n\}$
depend on the voltage angles only through the phase differences $\{\theta_{nn'}\}$; the power injections $\{s_n\}$ are invariant if all nodal voltages are shifted by a common angle. This justifies the adoption of a reference bus (i.e., slack bus) in the power system analysis that is assumed to have a zero voltage phase without loss of generality.

If, on the other hand, voltages are given in rectangular coordinates as $v_n = V_{r,n} + jV_{i,n}$, the power injections become quadratically related to voltages:

$$\begin{cases} p_n = V_{r,n} \sum_{n'=1}^{N} \left(V_{r,n'} G_{nn'} - V_{i,n'} B_{nn'}\right) + V_{i,n} \sum_{n'=1}^{N} \left(V_{i,n'} G_{nn'} + V_{r,n'} B_{nn'}\right), \\ q_n = V_{i,n} \sum_{n'=1}^{N} \left(V_{r,n'} G_{nn'} - V_{i,n'} B_{nn'}\right) - V_{r,n} \sum_{n'=1}^{N} \left(V_{i,n'} G_{nn'} + V_{r,n'} B_{nn'}\right). \end{cases} \tag{8}$$

Using the fact that $\overline{s}_n = \overline{v}_n i_n = (\mathbf{v}^H \mathbf{e}_n)(\mathbf{e}_n^T \mathbf{i}) = \mathbf{v}^H \mathbf{e}_n \mathbf{e}_n^T \mathbf{Y} \mathbf{v}$, Eq. (8) can be compactly expressed as

$$\begin{cases} p_n = \mathbf{v}^H \mathbf{H}_n^p \mathbf{v}, & \text{with } \mathbf{H}_n^p := \dfrac{\mathbf{Y}_n^H + \mathbf{Y}_n}{2}, \\ q_n = \mathbf{v}^H \mathbf{H}_n^q \mathbf{v}, & \text{with } \mathbf{H}_n^q := \dfrac{\mathbf{Y}_n^H - \mathbf{Y}_n}{2j}, \end{cases} \tag{9}$$

where $\mathbf{Y}_n := \mathbf{e}_n \mathbf{e}_n^T \mathbf{Y}$ for all $n \in \mathcal{N}$. With regard to power flows, since a line current can also be expressed as $i_{nn'} = \mathbf{e}_{nn'}^T \mathbf{i}_f$, it holds that $\overline{S}_{nn'} = \overline{v}_n i_{nn'} = (\mathbf{v}^H \mathbf{e}_n)(\mathbf{e}_{nn'}^T \mathbf{i}_f) = \mathbf{v}^H \mathbf{e}_n \mathbf{e}_{nn'}^T \mathbf{Y}_f \mathbf{v}$, thus yielding

$$\begin{cases} P_{nn'} = \mathbf{v}^H \mathbf{H}_{nn'}^P \mathbf{v}, & \text{with } \mathbf{H}_{nn'}^P := \dfrac{\mathbf{Y}_{nn'}^H + \mathbf{Y}_{nn'}}{2}, \\ Q_{nn'} = \mathbf{v}^H \mathbf{H}_{nn'}^Q \mathbf{v}, & \text{with } \mathbf{H}_{nn'}^Q := \dfrac{\mathbf{Y}_{nn'}^H - \mathbf{Y}_{nn'}}{2j}, \end{cases} \tag{10}$$

where $\mathbf{Y}_{nn'} := \mathbf{e}_n \mathbf{e}_{nn'}^T \mathbf{Y}_f$ for all lines $(n, n') \in \mathcal{L}$. Similar expressions can be obtained for the voltage magnitude squares; that is,

$$V_n^2 = \mathbf{v}^H \mathbf{H}_n^v \mathbf{v}, \quad \text{with } \mathbf{H}_n^v := \mathbf{e}_n \mathbf{e}_n^T. \tag{11}$$

In a nutshell, Eqs. (9)–(11) imply that power injections, flows, and voltage magnitude squares are quadratic functions of the voltage vector, which can be collectively described by $\{\mathbf{v}^H \mathbf{H}_m \mathbf{v}\}$ for Hermitian matrices $\mathbf{H}_m = \mathbf{H}_m^H \in \mathbb{C}^{N \times N}$ given in Eqs. (9)–(11).

3 Weighted least-squares estimation

As explained in Section 2, given system parameters collected in $\mathbf{Y}$ and $\mathbf{Y}_f$, all power system quantities can be expressed in terms of voltage vector $\mathbf{v}$, thus justifying its term as the system state. In practice, the supervisory control and data acquisition (SCADA) system hosting geographically distributed metering devices measures a subset of electric quantities every few seconds and forwards the readings via remote terminal units (RTUs) to a control center for grid monitoring. The task of PSSE entails recovering the voltage vector given the available measurements and grid parameters. In this section, we start with the (weighted) least-squares ((W)LS) formulation of PSSE, provide the Cramér-Rao bound (CRB) on the variance of any unbiased LS estimator, and outline several WLS-based PSSE solvers.

3.1 Problem formulation

Suppose that we have a total of $M$ SCADA measurements $\{z_m\}_{m=1}^{M}$ collectively denoted by $\mathbf{z} \in \mathbb{R}^M$, which relates to $\mathbf{v}$ via the model:

$$\mathbf{z} = \mathbf{h}(\mathbf{v}) + \boldsymbol{\epsilon}, \tag{12}$$

for some properly defined vector-valued nonlinear function $\mathbf{h}(\cdot)$, where $\boldsymbol{\epsilon} \in \mathbb{R}^M$ captures the model mismatches and measurement noise.

Traditionally, the system state $\mathbf{v}$ is expressed in polar coordinates, which is a $(2N - 1) \times 1$ vector that comprises the magnitudes and phase angles of $\mathbf{v}$ after excluding the phase angle of the reference bus. In that case, function $\mathbf{h}(\mathbf{v})$ maps the real-valued state vector to SCADA measurements through the nonlinear power flow (Eq. (7)). The motivation behind expressing the states in polar coordinates is twofold. First, voltage magnitude measurements are directly related to states. In addition, the Jacobian matrix of $\mathbf{h}(\mathbf{v})$ required in the Gauss-Newton iterations is amenable to approximations. Nonetheless, when iterative optimization solvers are employed, working directly with the $N$-dimensional complex voltage vector has in general lower computational complexity than that in the real case. This is primarily due to the compact quadratic representations of SCADA measurements in the complex voltage state vector, enabling us to exploit the inherent sparsity of quadratic measurement matrices (Eqs. (9)–(11)). Refer to recent computational reformulations of PSSE in complex variables (Wang G et al., 2017, 2018b; Wang Z et al., 2017; Džafić et al., 2018a, 2018b). In a different but related context, where the goal is to reconstruct complex signal vectors from their intensity-only measurements, Candès
et al. (2015) and Wang et al. (2018a, 2018d) carefully justified the computational advantages of optimizing over complex variables rather than over their real expansions.

Upon pre-whitening, vector $\boldsymbol{\epsilon}$ can be assumed to follow a standardized Gaussian distribution. The maximum likelihood estimate (MLE) of $\mathbf{v}$ coincides with the following nonlinear LS estimate (Kay, 1993):

$$\hat{\mathbf{v}} := \arg\min_{\mathbf{v} \in \mathbb{C}^N} \|\mathbf{z} - \mathbf{h}(\mathbf{v})\|_2^2. \tag{13}$$

Due to nonlinear $\mathbf{h}(\mathbf{v})$, the LS at hand is nonconvex, and its general instance is non-deterministic polynomial (NP) hard (Pardalos and Vavasis, 1991). Hence, solving the LS-based PSSE problem is indeed challenging.

3.2 Cramér-Rao bound analysis

In estimation theory (Kay, 1993, Chapter 3), the CRB provides a universal lower bound on the variance of any unbiased estimator. Appreciating its central role as a performance benchmark, we establish the CRB for the LS PSSE. Evidently, from Eq. (13), the CRB analysis of PSSE entails finding derivatives of a real-valued function $\mathbf{h}(\mathbf{v})$ with respect to complex-valued variables in $\mathbf{v}$. Addressing this challenge calls for the so-called Wirtinger's derivatives and calculus for complex analysis, the basics of which are provided in Appendix A. The following result provides a closed-form CRB for the LS PSSE under the additive white Gaussian noise model (12), whose proof can be found in Wang et al. (2018b).

Theorem 1 Consider estimating the unknown vector $\mathbf{v} \in \mathbb{C}^N$ from noisy observations $\{z_m \in \mathbb{R}\}_{m=1}^{M}$ obeying model (12), where the noise $\boldsymbol{\epsilon}$ follows the standardized Gaussian distribution $\mathcal{N}(\mathbf{0}, \mathbf{I})$. Then it holds for any unbiased estimator $\hat{\mathbf{v}}$ that

$$\mathrm{cov}(\hat{\mathbf{v}}) \succeq \left[\mathbf{F}^{\dagger}(\mathbf{v}, \overline{\mathbf{v}})\right]_{1:N,\,1:N}, \tag{14}$$

where $\mathrm{cov}(\cdot)$ denotes the covariance matrix of the argument, "$\dagger$" denotes the pseudo-inverse operator, and the Fisher information matrix (FIM) is given as

$$\mathbf{F}(\mathbf{v}, \overline{\mathbf{v}}) = \sum_{m=1}^{M} \mathbf{F}_m(\mathbf{v}, \overline{\mathbf{v}}), \tag{15}$$

$$\mathbf{F}_m(\mathbf{v}, \overline{\mathbf{v}}) := \begin{bmatrix} (\mathbf{H}_m \mathbf{v})(\mathbf{H}_m \mathbf{v})^H & (\mathbf{H}_m \mathbf{v})(\overline{\mathbf{H}}_m \overline{\mathbf{v}})^H \\ (\overline{\mathbf{H}}_m \overline{\mathbf{v}})(\mathbf{H}_m \mathbf{v})^H & (\overline{\mathbf{H}}_m \overline{\mathbf{v}})(\overline{\mathbf{H}}_m \overline{\mathbf{v}})^H \end{bmatrix}. \tag{16}$$

Furthermore, matrix $\mathbf{F}(\mathbf{v}, \overline{\mathbf{v}})$ has at least rank-one deficiency even when all SCADA quantities are measured.

The proof of Theorem 1 can be found in Wang et al. (2018b). Regarding Theorem 1, a couple of remarks are of interest. The rank deficiency of the FIM stems from the inherent voltage ambiguity; that is, all SCADA observations are invariant even if all nodal voltages are shifted by a uni-modular phase constant. This issue can be fixed by selecting a reference bus and setting its phase to be zero or any constant. Although it is rank-deficient, the pseudo-inverse of $\mathbf{F}(\mathbf{v}, \overline{\mathbf{v}})$ qualifies itself as a valid lower bound on the mean square error (MSE) of any unbiased estimator (Stoica and Marzetta, 2001). In addition, this lower bound is often attainable in practice, and is predictive of the optimal estimator performance (Stoica and Marzetta, 2001). Having derived the CRB for PSSE, we will deal with LS PSSE solvers in the next subsection.

3.3 Gauss-Newton method

Consider the nonlinear LS problem (13) again, for which the Gauss-Newton iterations are widely known to be the "workhorse" solution (Bertsekas, 1999, Chapter 1.5; Abur and Gómez-Expósito, 2004, Chapter 2). Starting with an initial guess or a flat-voltage profile vector (e.g., the all-one vector, the voltage vector obtained by solving the linearized power flow equations), denoted by $\mathbf{v}_0 \in \mathbb{C}^N$, the Gauss-Newton method successively approximates the nonlinear LS fit in Eq. (13) using the linear one of the first-order Taylor expansion of $\mathbf{h}(\mathbf{v}_i)$, and relies on its minimizer to yield the next iteration $\mathbf{v}_{i+1}$. Specifically, according to Wirtinger's calculus in Appendix A, the first-order Taylor expansion of $\mathbf{h}(\mathbf{v})$ around the current iteration $\mathbf{v}_i$ is (refer to Eq. (A6))

$$\mathbf{h}(\mathbf{v}) \approx \mathbf{h}(\mathbf{v}_i) + \nabla_c^H \mathbf{h}(\mathbf{v}_i) \begin{bmatrix} \mathbf{v} - \mathbf{v}_i \\ \overline{\mathbf{v}} - \overline{\mathbf{v}}_i \end{bmatrix}, \tag{17}$$

where the complex Jacobian $\nabla_c \mathbf{h}(\mathbf{v}_i)$ of $\mathbf{h}(\mathbf{v}, \overline{\mathbf{v}})$ with respect to $[\mathbf{v}^T\ \overline{\mathbf{v}}^T]^T$ evaluated using $\mathbf{v}_i$ is given in Eq. (A5). For notational brevity, let $\mathbf{J}_i := \nabla_c \mathbf{h}(\mathbf{v}_i) \in \mathbb{C}^{2N \times M}$, and assume that $\mathbf{J}_i \mathbf{J}_i^H$ is invertible, which is often true when SCADA data are measured across the network and $M \geq N$.

The next iteration $\mathbf{v}_{i+1}$ can then be obtained by
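solving the ensuing LS

$$\min_{\mathbf{v} \in \mathbb{C}^N} \left\| \mathbf{z} - \mathbf{h}(\mathbf{v}_i, \overline{\mathbf{v}}_i) - \mathbf{J}_i^H \begin{bmatrix} \mathbf{v} - \mathbf{v}_i \\ \overline{\mathbf{v}} - \overline{\mathbf{v}}_i \end{bmatrix} \right\|_2^2, \tag{18}$$

or equivalently, solving the linear system

$$\mathbf{J}_i \mathbf{J}_i^H \begin{bmatrix} \mathbf{v} - \mathbf{v}_i \\ \overline{\mathbf{v}} - \overline{\mathbf{v}}_i \end{bmatrix} = \mathbf{J}_i \left[\mathbf{z} - \mathbf{h}(\mathbf{v}_i, \overline{\mathbf{v}}_i)\right], \tag{19}$$

for which celebrated efficient solvers that can exploit the sparsity of $\mathbf{J}_i$ exist (Saad, 2003). Succinctly, the state estimate is iteratively updated using the following until some stopping criteria are met:

$$\begin{bmatrix} \mathbf{v}_{i+1} \\ \overline{\mathbf{v}}_{i+1} \end{bmatrix} = \begin{bmatrix} \mathbf{v}_i \\ \overline{\mathbf{v}}_i \end{bmatrix} + \left(\mathbf{J}_i \mathbf{J}_i^H\right)^{-1} \mathbf{J}_i \left[\mathbf{z} - \mathbf{h}(\mathbf{v}_i)\right]. \tag{20}$$

Refer to some generalizations to cope with complex-valued measurements and phasor measurement units (PMU) data in Džafić et al. (2018b).

On the other hand, if $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})$ for some covariance matrix $\boldsymbol{\Sigma} \succ \mathbf{0}$ (instead of $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$), the maximum likelihood estimation can be expressed as the WLS estimation:

$$\min_{\mathbf{v} \in \mathbb{C}^N} \left\| \boldsymbol{\Sigma}^{-1/2} \left[\mathbf{z} - \mathbf{h}(\mathbf{v})\right] \right\|_2^2. \tag{21}$$

The Gauss-Newton iterations can be similarly derived by treating $\boldsymbol{\Sigma}^{-1/2}\mathbf{z}$ as $\mathbf{z}$ and $\boldsymbol{\Sigma}^{-1/2}\mathbf{h}(\mathbf{v})$ as $\mathbf{h}(\mathbf{v})$ in Eq. (20), yielding

$$\begin{bmatrix} \mathbf{v}_{i+1} \\ \overline{\mathbf{v}}_{i+1} \end{bmatrix} = \begin{bmatrix} \mathbf{v}_i \\ \overline{\mathbf{v}}_i \end{bmatrix} + \left(\mathbf{J}_i \boldsymbol{\Sigma}^{-1} \mathbf{J}_i^H\right)^{-1} \mathbf{J}_i \boldsymbol{\Sigma}^{-1} \left[\mathbf{z} - \mathbf{h}(\mathbf{v}_i)\right]. \tag{22}$$

The Gauss-Newton iterations in Eqs. (20) and (22) are not guaranteed to converge, and their behavior depends heavily on the initialization $\mathbf{v}_0$. To improve convergence and ensure descent of the (W)LS cost function, the Gauss-Newton method is implemented in the modified form (Bertsekas, 1999, Chapter 1.5):

$$\begin{bmatrix} \mathbf{v}_{i+1} \\ \overline{\mathbf{v}}_{i+1} \end{bmatrix} = \begin{bmatrix} \mathbf{v}_i \\ \overline{\mathbf{v}}_i \end{bmatrix} + \mu_i \left(\mathbf{J}_i \boldsymbol{\Sigma}^{-1} \mathbf{J}_i^H\right)^{-1} \mathbf{J}_i \boldsymbol{\Sigma}^{-1} \left[\mathbf{z} - \mathbf{h}(\mathbf{v}_i)\right], \tag{23}$$

where $\mu_i > 0$ is a step size chosen by means of a backtracking line search (Bertsekas, 1999, Chapter 1.2). Furthermore, it is known that the Gauss-Newton procedure for nonconvex optimization can get stuck at local solutions (Bertsekas, 1999, Chapter 1.5). All in all, the main challenge lies in developing PSSE solvers that can attain or approximate the global optimum at affordable computational complexity. Toward this objective, we review a few recent interesting proposals along this line next.

3.4 Semidefinite programming relaxation

As explained in Section 2, the challenge of the LS PSSE Eq. (13) arises from the quadratic functions $\{h_m(\mathbf{v}) = \mathbf{v}^H \mathbf{H}_m \mathbf{v}\}$. A recent line of research to tackle the nonlinear measurements expresses $\{h_m\}$ as linear functions of the outer-product $\mathbf{V} := \mathbf{v}\mathbf{v}^H$ and subsequently solves a semidefinite program (SDP) over $\mathbf{V} \in \mathbb{C}^{N \times N}$ after relaxing the nonconvex rank constraint $\mathrm{rank}(\mathbf{V}) = 1$ (Zhu and Giannakis, 2011, 2014; Kim et al., 2014; Wang et al., 2014). Precisely, problem (13) can be equivalently rewritten as

$$\hat{\mathbf{V}} := \arg\min_{\mathbf{V} \in \mathbb{C}^{N \times N}} \sum_{m=1}^{M} \left[z_m - \mathrm{tr}(\mathbf{H}_m \mathbf{V})\right]^2 \quad \text{s.t. } \mathbf{V} \succeq \mathbf{0},\ \mathrm{rank}(\mathbf{V}) = 1, \tag{24}$$

where the constraints jointly ensure that any solution $\hat{\mathbf{V}}$ can be uniquely expressed as $\hat{\mathbf{V}} = \hat{\mathbf{v}}\hat{\mathbf{v}}^H$ for some $\hat{\mathbf{v}} \in \mathbb{C}^N$. Problem (24) can be readily transformed into a convex SDP after dropping the rank-one constraint, yielding

$$\begin{aligned} \min_{\mathbf{V} \in \mathbb{C}^{N \times N},\ \boldsymbol{\beta} \in \mathbb{R}^M}\ & \mathbf{1}^T \boldsymbol{\beta} \\ \text{s.t. }\ & \mathbf{V} \succeq \mathbf{0}, \\ & \begin{bmatrix} \beta_m & z_m - \mathrm{tr}(\mathbf{H}_m \mathbf{V}) \\ z_m - \mathrm{tr}(\mathbf{H}_m \mathbf{V}) & 1 \end{bmatrix} \succeq \mathbf{0}, \quad \forall\, m = 1, 2, \ldots, M, \end{aligned} \tag{25}$$

for which off-the-shelf convex programming solvers can be employed. An estimate of $\mathbf{v}$ can be recovered as the principal component of the $\mathbf{V}$-solution of problem (25) via eigenvalue decomposition, or through randomization techniques (Zhu and Giannakis, 2014).

In terms of performance, it can be shown that under appropriate assumptions, solving the convex SDP (25) attains the global optimum of the LS PSSE problem (13) when the SCADA data are noise-free; refer to Zhu and Giannakis (2014) for details. In practice, the SDP relaxation approach approximates the global optimum well even in the presence of noise. Nevertheless, solving SDPs often calls for interior-point solvers, whose computational complexity grows at least cubically with the matrix size $N$ (Park and Boyd, 2017). This complexity can be a burden for real-time power system operation, which motivates lightweight alternatives.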
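The lifting step behind the SDP relaxation can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: it verifies that each quadratic measurement v^H H_m v equals the linear function tr(H_m V) of the lifted matrix V = v v^H, and that v is recovered (up to a global phase) as the principal component of V. The matrices in `H_list` are random Hermitian stand-ins for the measurement matrices of Eqs. (9)–(11), and all function names are hypothetical. The convex program (25) itself is not solved here.

```python
import numpy as np

def lift(v):
    """Return the rank-one lifted matrix V = v v^H."""
    return np.outer(v, v.conj())

def quad_measurements(H_list, v):
    """Evaluate v^H H_m v for each m (real-valued for Hermitian H_m)."""
    return np.array([np.real(v.conj() @ H @ v) for H in H_list])

def lifted_measurements(H_list, V):
    """Evaluate tr(H_m V) for each m; this map is linear in V."""
    return np.array([np.real(np.trace(H @ V)) for H in H_list])

def principal_component(V):
    """Recover v, up to a global phase, from the top eigenpair of V."""
    eigvals, eigvecs = np.linalg.eigh(V)  # eigenvalues in ascending order
    return np.sqrt(max(eigvals[-1], 0.0)) * eigvecs[:, -1]

def align_phase(v_hat, v_ref, ref_bus=0):
    """Rotate v_hat so its reference-bus angle matches that of v_ref."""
    shift = np.angle(v_ref[ref_bus]) - np.angle(v_hat[ref_bus])
    return v_hat * np.exp(1j * shift)

# Assumption: random Hermitian matrices stand in for the true H_m.
rng = np.random.default_rng(0)
N, M = 5, 12
v_true = rng.uniform(0.9, 1.1, N) * np.exp(
    1j * rng.uniform(-0.4 * np.pi, 0.4 * np.pi, N))
H_list = []
for _ in range(M):
    A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    H_list.append((A + A.conj().T) / 2)  # Hermitian part of A

V_true = lift(v_true)
z_quad = quad_measurements(H_list, v_true)    # quadratic in v
z_lift = lifted_measurements(H_list, V_true)  # linear in V
v_rec = align_phase(principal_component(V_true), v_true)
```

The phase alignment mirrors the reference-bus convention of Section 2: the outer product V discards any common phase rotation of v, so the recovered vector has to be rotated before it can be compared with the true state.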

3.5 Feasible point pursuit

Casting the LS in problem (13) as a nonconvex quadratically constrained quadratic program (QCQP), the feasible point pursuit (FPP) method investigated in Wang et al. (2018b) offers a computationally affordable alternative for approximating the global optimum of PSSE. The idea of FPP is to solve a sequence of convexified QCQPs obtained by successively forming convex inner-restrictions of the original nonconvex feasibility set, to approximate the feasible solutions of the original nonconvex QCQP (Mehanna et al., 2015; Park and Boyd, 2017).

Toward this end, the FPP-PSSE solver starts with reformulating problem (13) into a QCQP (Wang et al., 2016):

$$\min_{\mathbf{v} \in \mathbb{C}^N,\ \boldsymbol{\chi} \in \mathbb{R}^M} \|\boldsymbol{\chi}\|_2^2 \tag{26a}$$

$$\text{s.t. } \mathbf{v}^H \mathbf{H}_m \mathbf{v} \leq z_m + \chi_m, \tag{26b}$$

$$\phantom{\text{s.t. }} \mathbf{v}^H \mathbf{H}_m \mathbf{v} \geq z_m - \chi_m, \quad \forall\, m = 1, 2, \ldots, M, \tag{26c}$$

where vector $\boldsymbol{\chi} \in \mathbb{R}^M$ consists of $M$ auxiliary variables $\{\chi_m \geq 0\}_{m=1}^{M}$, capturing the residuals of fitting the $M$ SCADA observations.

Evidently, if a certain $\mathbf{H}_m$ is neither positive nor negative semidefinite, constraints (26b) and (26c) are both nonconvex. Yet, if $\mathbf{H}_m$ is (positive or negative) semidefinite, one and only one of constraints (26b) and (26c) is nonconvex. Without loss of generality, consider decomposing every $\mathbf{H}_m$ into its positive and negative semidefinite parts as $\mathbf{H}_m := \mathbf{H}_m^+ + \mathbf{H}_m^-$ using eigenvalue decomposition. Equivalently, constraints (26b) and (26c) can be rewritten as

$$\mathbf{v}^H \mathbf{H}_m^+ \mathbf{v} + \mathbf{v}^H \mathbf{H}_m^- \mathbf{v} \leq z_m + \chi_m, \tag{27a}$$

$$\mathbf{v}^H \mathbf{H}_m^+ \mathbf{v} + \mathbf{v}^H \mathbf{H}_m^- \mathbf{v} \geq z_m - \chi_m. \tag{27b}$$

It is clear that only the negative component $\mathbf{v}^H \mathbf{H}_m^- \mathbf{v}$ in inequality (27a) is nonconvex; it is similar for the positive part in inequality (27b). By definition, we find that the following holds for any point $\mathbf{v}_i \in \mathbb{C}^N$:

$$\mathbf{v}^H \mathbf{H}_m^- \mathbf{v} \leq 2\Re(\mathbf{v}_i^H \mathbf{H}_m^- \mathbf{v}) - \mathbf{v}_i^H \mathbf{H}_m^- \mathbf{v}_i, \tag{28a}$$

$$\mathbf{v}^H \mathbf{H}_m^+ \mathbf{v} \geq 2\Re(\mathbf{v}_i^H \mathbf{H}_m^+ \mathbf{v}) - \mathbf{v}_i^H \mathbf{H}_m^+ \mathbf{v}_i. \tag{28b}$$

Starting with some $\mathbf{v}_0 \in \mathbb{C}^N$, each iteration of FPP first replaces the nonconvex sources in inequalities (27a) and (27b) with the corresponding linear bounds in inequalities (28a) and (28b) evaluated at the current iteration $\mathbf{v}_i$, and subsequently solves the resulting convex QCQP to obtain the next iteration $\mathbf{v}_{i+1}$ as the $\mathbf{v}$-solution of

$$\begin{aligned} \min_{\mathbf{v} \in \mathbb{C}^N,\ \boldsymbol{\chi} \in \mathbb{R}^M}\ & \|\boldsymbol{\chi}\|_2^2 \\ \text{s.t. }\ & \mathbf{v}^H \mathbf{H}_m^+ \mathbf{v} + 2\Re(\mathbf{v}_i^H \mathbf{H}_m^- \mathbf{v}) \leq z_m + \mathbf{v}_i^H \mathbf{H}_m^- \mathbf{v}_i + \chi_m, \\ & \mathbf{v}^H \mathbf{H}_m^- \mathbf{v} + 2\Re(\mathbf{v}_i^H \mathbf{H}_m^+ \mathbf{v}) \geq z_m + \mathbf{v}_i^H \mathbf{H}_m^+ \mathbf{v}_i - \chi_m, \quad \forall\, m = 1, 2, \ldots, M, \end{aligned} \tag{29}$$

for which standard convex programming methods can be used. The FPP procedure has been shown to converge to a stationary point of the LS Eq. (13). Yet, extensive numerical tests demonstrate its capability of attaining (near-)optimal solutions. Generalizations to handle bad data as well as PMU measurements are possible, and related discussion can be found in Wang et al. (2018b).

3.6 Numerical tests

To summarize this section, we provide numerical tests comparing the Gauss-Newton method, semidefinite relaxation (SDR) based, and FPP-based PSSE solvers, using the IEEE 14-bus and 30-bus benchmark systems (Christie, 1999). The true voltage profile was generated with magnitude uniformly sampled from $[0.9, 1.1]$ in per unit system and angles from $[-0.4\pi, 0.4\pi]$. Independent zero-mean Gaussian additive noise with standard deviation 0.02 for voltage meters and 0.05 for power meters was assumed. All reported results below were averaged over 100 independent trials.

The first experiment assessed the mean square estimation error (MSE) performance of the three schemes against the CRB benchmark in Theorem 1 using the IEEE 14-bus system. A varying number of measurements were simulated. At first, all voltage magnitudes and all active power flows at both the sending- and receiving-ends were measured, which correspond to the base case 1 on the x axis of Fig. 1. Each unit increase on the x axis adds one new type of measurements from $\{Q^f_{nn'}, Q^t_{nn'}, P_n, Q_n\}$ at all buses or over all lines. In other words, when the axis value equals five, all measurements were used. In the second experiment on the IEEE 30-bus system, we simulated all nodal voltage magnitudes as well as all the active power flows at both sending- and receiving-ends. Fig. 2 describes the estimated
voltage angles and magnitudes across buses obtained using different schemes. Figs. 1 and 2 corroborate the near-optimal performance and robustness of the FPP-based PSSE solver in the simulated settings.

Fig. 1 MSE and Cramér-Rao bound versus the type of measurements used for the IEEE 14-bus system (x axis: type of measurements; y axis: mean square error (per unit); curves: GN-based SE, SDR-based SE, FPP-based SE, Cramér-Rao bound)

Fig. 2 Angle estimation errors (a) and voltage magnitudes (b) per bus for the IEEE 30-bus system (x axis: bus index; y axes: angle error (degree) in (a), magnitude error (per unit) in (b); curves: GN-based SE, SDR-based SE, FPP-based SE)

4 Robust power system state estimation

With utilities increasingly shifting toward smart grid technology as well as other upgrades with inherent cyber vulnerabilities, the power grid has seen correlative threats from adversarial cyber attacks growing in form and frequency (Fairley, 2016). Traditional PSSE approaches, particularly the (W)LS SE, are being challenged by these new issues concerning data integrity and uninformed model changes (Liu et al., 2011; Zhu and Giannakis, 2012; Kekatos and Giannakis, 2013). These concerns strongly motivate the development of robust PSSE approaches against anomalous (i.e., bad) data and model inaccuracies. In addition to cyber attacks, bad data (i.e., outliers) may be due to communication delays, system parameter uncertainties, and/or meter mis-calibration. In this section, we review some of the recent advances in robust PSSE.

4.1 Attack models

In the presence of bad data, the following corruption model is considered (Duchi and Ruan, 2017a; Wang G et al., 2017): letting $\{a_m \in \mathbb{R}\}_{m=1}^{M}$ model an arbitrary attack sequence, we observe

$$z_m \approx \begin{cases} \mathbf{v}^H \mathbf{H}_m \mathbf{v}, & \text{if } m \in \mathcal{I}^n, \\ a_m, & \text{if } m \in \mathcal{I}^a, \end{cases} \tag{30}$$

for $m = 1, 2, \ldots, M$, where additive measurement noise can be included if "≈" is replaced with "=" and the set $\mathcal{I}^n \subseteq \{1, 2, \ldots, M\}$ ($\mathcal{I}^a$) collects the indices of nominal (outlying) data. Furthermore, elements of $\mathcal{I}^a$ are assumed to be randomly chosen from $\{1, 2, \ldots, M\}$. Depending on whether $\{a_m\}$ is independent of $\{\mathbf{H}_m\}$, we consider the following two models for the attacks:

M1: Attacks $\{a_m\}_{m=1}^{M}$ are independent of $\{\mathbf{H}_m\}_{m=1}^{M}$.
M2: Attacks $\{a_m\}_{m \in \mathcal{I}^a}$ are independent of nominal measurement matrices $\{\mathbf{H}_m\}_{m \in \mathcal{I}^n}$.

Note that M1 requires full independence between the corruption and the measurements. In other words, the attacker may solely corrupt $\{a_m\}$ without any knowledge of $\mathbf{H}_m$ and $\mathbf{v}$. On the other hand, M2 allows completely arbitrary dependence between $a_m$ and $(\mathbf{v}, \mathbf{H}_m)$ for $m \in \mathcal{I}^a$. This is practical since the type of corruption may depend on the individual measurement $\mathbf{v}^H \mathbf{H}_m \mathbf{v}$ being recorded.

4.2 Problem formulation

Having elaborated on the data corruption model along with the system model in Section 2, the problem of robust PSSE is formally stated next.

Expressed simply, the goal of robust PSSE is to recover all bus voltages $\mathbf{v} \in \mathbb{C}^N$ given network parameters $(\mathbf{Y}, \mathbf{Y}_f)$ and the available measurements $\mathbf{z} \in \mathbb{R}^M$, whose entries as shown in Eq. (30) satisfy M1 or M2. The first attempt may be seeking the (W)LS estimate using one of the solvers discussed in Section 3. However, it is well known that the (W)LS criterion is sensitive to bad data and can give rise to very bad solutions even if only a few meters are
compromised. On the other hand, the $\ell_1$ (least absolute value (LAV)) based losses, which yield median-based estimators, have been well documented in statistics and optimization for their ability in handling gross errors in the measurements $\mathbf{z}$ (Huber, 2011). This prompts us to consider minimizing the $\ell_1$ loss of the residuals, which yields the so-called LAV estimate (Kotiuga and Vidyasagar, 1982; Abur and Celik, 1991; Jabr and Pal, 2004):

$$\min_{\mathbf{v} \in \mathbb{C}^N} f(\mathbf{v}) = \frac{1}{M} \sum_{m=1}^{M} \left| \mathbf{v}^H \mathbf{H}_m \mathbf{v} - z_m \right|. \tag{31}$$

It can be easily checked that the loss function $f(\mathbf{v})$ is nonconvex and nonsmooth due to the nonlinear measurements $\{\mathbf{v}^H \mathbf{H}_m \mathbf{v}\}$ and the absolute-value operator $|\cdot|$, respectively. In fact, $f(\mathbf{v})$ is not even locally convex near the global optima $\pm\mathbf{v}^*$ even in the absence of noise. This is evident from the simplification $f(v) = |v^H v - 1|$ of a scalar variable $v \in \mathbb{C}$. Therefore, a local analysis based on smoothness and convexity is nearly impossible, suggesting that the Gauss-Newton method presented in Section 3.3 is not applicable for minimizing Eq. (31). Due to these reasons, tackling problem (31) is challenging. Upon linearizing $\{\mathbf{v}^H \mathbf{H}_m \mathbf{v}\}$ at the most recent iteration, a sequence of linear programs were solved (Kotiuga and Vidyasagar, 1982). Strategies for improving linear programming by leveraging the system's structure (Abur and Celik, 1991) or via iterative reweighting (Jabr and Pal, 2003, 2004) have been discussed. Despite these efforts, LAV estimators have not been widely employed yet in today's power networks mostly due to their computational inefficiency (Göl and Abur, 2014). However, the criterion $f(\mathbf{v})$ exhibits several unique structural properties, which are amenable to developing efficient algorithms as we elaborate on next.

4.3 Deterministic proximal linear algorithm

We start by rewriting the function $f(\mathbf{v})$ in Eq. (31) as a composition of the convex $c(\mathbf{u}) := \|\mathbf{u}\|_1 / M$ and the smooth $\mathbf{s}(\mathbf{v}) : \mathbb{C}^N \to \mathbb{R}^M$ whose $m$th entry is $s_m(\mathbf{v}) := \mathbf{v}^H \mathbf{H}_m \mathbf{v} - z_m$. This compositional structure lends itself favorably to the proximal linear (prox-linear) algorithms (Fletcher and Watson, 1980; Wang G et al., 2017, 2018c). For interested readers, we provide a brief introduction to the so-called composite optimization and the prox-linear algorithm in Appendix B.

Following Eqs. (B2) and (B3), starting with $\mathbf{v}_0 = \mathbf{1}$, the deterministic prox-linear algorithm minimizes Eq. (31) by iteratively solving (Wang G et al., 2017):

$$\mathbf{v}_{i+1} = \arg\min_{\mathbf{v} \in \mathbb{C}^N} \left\| \Re\!\left(\mathbf{B}_i (\mathbf{v} - \mathbf{v}_i)\right) - \mathbf{c}_i \right\|_1 + \frac{1}{2} \left\| \mathbf{v} - \mathbf{v}_i \right\|_2^2, \tag{32}$$

where the coefficients are given by

$$\mathbf{B}_i := \frac{2\mu_i}{M} \left[\mathbf{H}_1 \mathbf{v}_i,\ \mathbf{H}_2 \mathbf{v}_i,\ \ldots,\ \mathbf{H}_M \mathbf{v}_i\right]^H, \tag{33}$$

$$\mathbf{c}_i := \frac{\mu_i}{M} \left[z_1 - \mathbf{v}_i^H \mathbf{H}_1 \mathbf{v}_i,\ z_2 - \mathbf{v}_i^H \mathbf{H}_2 \mathbf{v}_i,\ \ldots,\ z_M - \mathbf{v}_i^H \mathbf{H}_M \mathbf{v}_i\right]^T. \tag{34}$$

Observe that problem (32) is a convex quadratic program, which can be solved efficiently by means of standard convex programming methods, including subgradient-type methods (Ben-Tal and Nemirovski, 2001). Under appropriate conditions, the deterministic prox-linear procedure (32) converges quadratically fast to optimum $\mathbf{v}^*$ or $-\mathbf{v}^*$, meaning that we have to solve only about $\log_2(\log_2(1/\epsilon))$ such quadratic programs to find an $\epsilon$-optimal estimate. This number in practice amounts to 5–8 or so. Alternatively, an alternating direction method of multipliers (ADMM) based solver was developed for iteratively coping with problem (32) (Wang G et al., 2017).

4.4 Stochastic proximal linear alternative

With microgrids becoming increasingly interconnected, seeking the exact minimizer of Eq. (32) per iteration of the deterministic prox-linear scheme may be computationally expensive, or can be intractable. This discourages the applicability of the deterministic prox-linear scheme to robust PSSE of large-scale power networks. In this context, we present an inexpensive stochastic alternative of Eq. (32) for minimizing problem (31) next. Advantages of the stochastic prox-linear approach over its deterministic counterpart Eq. (32) include simple closed-form updates as well as fast convergence to find an (approximately) optimal solution; refer to Duchi and Ruan (2017b) for discussion on stochastic composite optimization.

Instead of relying on all data to obtain the next iteration $\mathbf{v}_{i+1}$ by solving the quadratic subproblem (32), the stochastic prox-linear approach samples a
single datum $m_i \in \{1, 2, \ldots, M\}$ uniformly at random per iteration, and inductively constructs $v_{i+1}$ as the optimizer of (Wang G et al., 2017)

$$\min_{v\in\mathbb{C}^N} \left|c_{m_i,i} - \Re\big(b_{m_i,i}^H(v - v_i)\big)\right| + \frac{1}{2\mu_i}\left\|v - v_i\right\|_2^2, \tag{35}$$

where $|c_{m_i,i} - \Re(b_{m_i,i}^H(v - v_i))|$ can be understood as the "linearization" of the $\ell_1$ loss $|z_{m_i} - v^H H_{m_i} v|$ of the sampled datum $(H_{m_i}, z_{m_i})$ around the current iteration $v_i$, whose coefficients are given by

$$b_{m_i,i} := 2H_{m_i}v_i, \qquad c_{m_i,i} := z_{m_i} - v_i^H H_{m_i} v_i. \tag{36}$$

Again, we obtain a convex quadratic program (35). Fortunately, compared to the quadratic program (32) encountered in the deterministic prox-linear approach, the solution to problem (35) can be provided in simple closed form. To this end, define the projection operator $\mathrm{proj}_{\mu}(x): \mathbb{R} \times \mathbb{R}_+ \to \mathbb{R}$ that returns the real number in the interval $[-\mu, \mu]$ closest to the given number $x \in \mathbb{R}$. Then the solution to problem (35) is given by

$$v_{i+1} = v_i + \mathrm{proj}_{\mu_i}\!\left(\frac{c_{m_i,i} - \Re\big(b_{m_i,i}^H v_i\big)}{\left\|b_{m_i,i}\right\|_2^2}\right) b_{m_i,i}, \tag{37}$$

which is repeated until some convergence criteria are met. Intuitively, measurements with a relatively small residual, i.e., $|c_{m_i,i} - \Re(b_{m_i,i}^H v_i)|/\|b_{m_i,i}\|_2^2 \le \mu_i$, are deemed "nominal," for which $v_i$ is updated along the direction of $b_{m_i,i}$ with a step size of $|c_{m_i,i} - \Re(b_{m_i,i}^H v_i)|/\|b_{m_i,i}\|_2^2$. The measurements of larger residuals obeying $|c_{m_i,i} - \Re(b_{m_i,i}^H v_i)|/\|b_{m_i,i}\|_2^2 \ge \mu_i$, on the other hand, are likely to be outliers, or corrupted by outliers; thus, $v_i$ is updated using a thresholded step size of $\mu_i$.

Regarding the stochastic prox-linear solver (37) for minimizing problem (31), some observations are of interest. In terms of computational complexity, the number of complex scalar operations (e.g., additions and multiplications) can be estimated for Eq. (37). To this end, it is instrumental to note from Eqs. (9)–(11) that the matrices $\{H_m\}_{m=1}^{M}$ are highly sparse, because both $Y$ and $Y_f$ have only a few nonzero entries. Indeed, for voltage magnitude square or power flow measurements, one can verify that the corresponding matrices $\{H_m\}$ have one nonzero entry or three nonzero entries, respectively. As such, evaluating their corresponding coefficients $\{b_m\}$ and $\{c_m\}$ (Eq. (36)) and performing the updates $v_{i+1}$ (Eq. (37)) entail at most, say, 10 scalar operations. In this case, the stochastic prox-linear procedure (37) incurs per-iteration complexity of $O(1)$. Surprisingly, this $O(1)$ complexity holds regardless of the network size $N$. If the power injection measurements are used too, the number of operations per iteration increases to the order of the number of neighbors, which typically remains much smaller than $N$ in real-world networks.

For convergence, a diminishing step size is usually required for a stochastic optimization algorithm. Similar to other stochastic gradient-type methods, the step size sequence $\{\mu_i\}$ of the stochastic prox-linear procedure should be square summable but not summable (Duchi and Ruan, 2017b); that is,

$$\sum_{i=0}^{+\infty}\mu_i = +\infty, \qquad \sum_{i=0}^{+\infty}\mu_i^2 < +\infty. \tag{38}$$

As an example, it suffices to take $\mu_i = \alpha i^{-\gamma}$ with appropriately chosen constants $\alpha > 0$ and $0.5 < \gamma \le 1$. Using the results in Theorem 1 of Duchi and Ruan (2017b), one can establish that the iteration sequence $\{v_i\}$ in Eq. (37) with step sizes satisfying Eq. (38) converges to a stationary point of Eq. (31) almost surely. Fig. 3 illustrates the convergence performance of the Gauss-Newton and deterministic/stochastic prox-linear algorithms on the IEEE 14-bus test system, where the simulated noise-free measurements include all active and reactive power flows, as well as all squared voltage magnitudes.

Fig. 3 Convergence performance of Gauss-Newton, deterministic, and stochastic prox-linear algorithms for the IEEE 14-bus system

Finally, PMU measurements, if available, can be readily incorporated in Eqs. (13) and (31). The presented LS and LAV PSSE solvers (mentioned in
Sections 3 and 4) apply with no or minimal modifications. Refer to Zamzam et al. (2018) and Zhang et al. (2018a, 2018b, 2019) for generalizations to real-time estimation and forecasting based on deep neural networks.

4.5 Bad data identification

An alternative way of coping with bad data is to augment the measurement model (12) entry-wise with an attack variable as (Zhu and Giannakis, 2012)

$$z_m = v^H H_m v + a_m + \varepsilon_m, \quad \forall\, m = 1, 2, \ldots, M, \tag{39}$$

where $a := [a_1, a_2, \ldots, a_M]^T \in \mathbb{R}^M$ is an unknown vector whose $m$th entry is deterministically nonzero only if the $m$th measurement $z_m$ is attacked (Kosut et al., 2011; Liu et al., 2011; Kekatos and Giannakis, 2013). In practice, the attack vector $a$ is very sparse; namely, most of its entries are zero.

Seeking both $v \in \mathbb{C}^N$ and $a \in \mathbb{R}^M$ from only $M$ measurements in Eq. (39) may appear impossible, as the number of unknowns exceeds the number of equations. Recent results have shown that this is possible by leveraging the parsimony of $a$ (Zhu and Giannakis, 2012; Kekatos and Giannakis, 2013; Aghamolki et al., 2018). If the number $1 \le k \ll M$ of bad data is known a priori, one would ideally wish to solve

$$\min_{v\in\mathbb{C}^N,\; a\in\mathbb{R}^M} \frac{1}{2M}\sum_{m=1}^{M}\left(z_m - v^H H_m v - a_m\right)^2 \quad \text{s.t.} \quad \|a\|_0 \le k, \tag{40}$$

where the $\ell_0$ (pseudo-)norm $\|\cdot\|_0$ counts the number of nonzero entries in the argument. Due to the combinatorial nature of the $\ell_0$ norm, problem (40) is NP-hard in general (Nesterov, 2013). For the special case of $k = 1$, it can be efficiently handled by means of the largest normalized residual (LNR) test; refer to details in Section 5.7 of Abur and Gómez-Expósito (2004) and Mili et al. (1994).

An alternative to the constrained formulation (40) is to replace the $\ell_0$ norm with its convex surrogate, the $\ell_1$ norm (Kekatos and Giannakis, 2013):

$$\min_{v\in\mathbb{C}^N,\; a\in\mathbb{R}^M} \frac{1}{2M}\sum_{m=1}^{M}\left(z_m - v^H H_m v - a_m\right)^2 \quad \text{s.t.} \quad \|a\|_1 \le \hat{k}, \tag{41}$$

where $\hat{k}$ is an estimate (or upper bound) of the practically unknown $k$. Equivalently, problem (41) can be given in its Lagrangian form:

$$\min_{v\in\mathbb{C}^N,\; a\in\mathbb{R}^M} \frac{1}{2M}\sum_{m=1}^{M}\left(z_m - v^H H_m v - a_m\right)^2 + \lambda\|a\|_1, \tag{42}$$

for an appropriately selected regularization parameter $\lambda > 0$. Solving problem (42) offers simultaneously the state estimate $\hat{v}$ and the outlier vector $\hat{a}$. In other words, it jointly performs state estimation and bad data identification.

Nonetheless, due to the nonlinear measurements $\{v^H H_m v\}$, problem (42) is still nonconvex. Similar to the LS PSSE in Section 3.4, semidefinite programming relaxation can be invoked for handling problem (42). Specifically, upon introducing $V := vv^H$ and dropping the rank constraint, one can express problem (42) as a convex program over $(V, a)$, giving rise to (Zhu and Giannakis, 2012)

$$\min_{V\in\mathbb{C}^{N\times N},\; a\in\mathbb{R}^M} \frac{1}{2M}\sum_{m=1}^{M}\left(z_m - \mathrm{tr}(H_m V) - a_m\right)^2 + \lambda\|a\|_1 \quad \text{s.t.} \quad V \succeq 0, \tag{43}$$

which can again be efficiently handled using standard convex programming approaches. Upon finding its $V$-optimizer $\hat{V}$, the state estimate $\hat{v}$ can be recovered using eigen-decomposition or randomization (Zhu and Giannakis, 2012, 2014). Refer to Kekatos and Giannakis (2013) for generalizations using PMU measurements as well as decentralized implementations.

5 Distribution system state estimation

Unlike transmission networks where metering devices are installed at almost all buses, low-voltage distribution grids have partial observability due to limited instrumentation, low industrial investment interest, and the sheer scale (Lu et al., 1995; Baran, 2001). Utilities have been implementing distribution automation systems (DAS) at the substation, and a few on the feeders, which provide not only control and monitoring of devices such as switches and capacitor banks, but also measurements including voltages, currents, and power flows (Baran, 2001). Real-time measurements of distribution networks obtained by DAS offer the possibility of performing SE at a distribution level. Due to limited installation of DAS and a scarcity of real-time measurements, the distribution system state estimation
(DSSE) task, which involves estimating the voltage phasors of all buses, now becomes particularly challenging.

We identify the following challenges that must be addressed to design PSSE approaches that are applicable in distribution systems:

1. Unbalanced phases. Distribution systems are multi-phase, consisting mainly of feeders. Feeders are mostly radial, but they have laterals that can be single- or two-phase in general. Furthermore, most loads in residential electricity networks are either single- or two-phase, rather than three-phase. For these reasons, distribution systems are unbalanced in nature.

2. High r/x ratios. Due to low voltage levels as well as relatively short connecting lines, distribution systems possess higher r/x ratios than transmission grids. This situation challenges the convergence of iterative SE algorithms such as Newton-Raphson (Ahmad et al., 2018).

3. Limited availability of real-time data. Real-time data of distribution systems are available only through DAS at very few locations. The most common measurement point is the substation, and possibly a few points on the feeders. The available real-time measurements, however, are presently insufficient to recover the network state, which explains the partial observability of distribution systems (Bhela et al., 2018; Zamzam et al., 2018).

4. Complex measurement functions. The DAS can provide both phasor measurements and real-valued measurements (Džafić et al., 2018b; Zamzam et al., 2018). The former, provided by PMUs or μPMUs, consist of complex voltages or current flows at selected buses or lines, which are linear functions of the voltage state vector. Real-valued measurements, on the other hand, include voltage magnitudes, current flow magnitudes, as well as real and reactive power flows. Similar to the SCADA data in transmission networks, these real-valued data are nonlinear functions of the state vector.

5. Network model uncertainty. In the DSSE framework, network topology and line parameters are typically assumed to be perfectly known. However, due to network infrastructure aging and lack of real-time monitoring of switches (thus uninformed topology changes), the knowledge of a distribution system model has a large uncertainty (Della Giustina et al., 2014; Zhang et al., 2017).

To improve observability in distribution grids, real-time measurements have to be augmented with a high number of pseudo-measurements when performing DSSE (Baran, 2001). Pseudo-measurements comprise predictions of energy consumption or generation, obtained using load and generation forecast procedures based on historical data (Clements, 2011). They are much less accurate than the real-time measurements; thus, the noise variance information should be accounted for in the WLS- and LAV-based DSSE (Sections 3 and 4). In terms of solvers, the algorithmic advances outlined in previous sections can be used for or generalized to DSSE too. Refer to Singh et al. (2009) and Ahmad et al. (2018) for other approaches.

6 Conclusions and future work

In this paper, we have outlined some of the recent advances in PSSE, with a focus on solvers that can efficiently attain (near-)optimal solutions to the nonconvex SE tasks. After developing the Cramér-Rao bound for benchmarking performance of any unbiased estimator, the WLS-based SE has been reviewed. Three efficient solvers have been discussed, namely the Gauss-Newton iterations, semidefinite programming relaxation, and feasible point pursuit, all of which were efficiently implemented in the complex domain. To cope with the cyber attacks in the envisioned smart grid, robust PSSE was enabled using the $\ell_1$-based losses, for which prox-linear algorithms using composite optimization were advocated. Finally, DSSE along with its current challenges was outlined.

The perspective of this overview opens up a number of exciting directions for future research to realize the vision of smarter power grids, including: (1) generalizing the presented nonconvex optimization approaches to enable dynamic state estimation and tracking; (2) exploring more efficient solvers through, e.g., stochastic, online, parallel, and distributed implementations for PSSE of large-scale networks; and (3) leveraging advances in signal processing over networks (graphs) as well as deep learning (e.g., deep neural networks) to provide novel paths to address challenges related to the nonconvexity of PSSE and the partial observability of distribution systems.
References

Abur A, Celik MK, 1991. A fast algorithm for the weighted least-absolute-value state estimation (for power systems). IEEE Trans Power Syst, 6(1):1-8. https://doi.org/10.1109/59.131040
Abur A, Gómez-Expósito A, 2004. Power System State Estimation: Theory and Implementation. Marcel Dekker, New York, USA.
Aghamolki HG, Miao Z, Fan L, 2018. SOCP convex relaxation-based simultaneous state estimation and bad data identification. https://arxiv.org/abs/1804.05130
Ahmad F, Rasool A, Ozsoy E, et al., 2018. Distribution system state estimation-a step towards smart grid. Renew Sust Energ Rev, 81:2659-2671. https://doi.org/10.1016/j.rser.2017.06.071
Baran ME, 2001. Challenges in state estimation on distribution systems. Power Engineering Society Summer Meeting, p.429-433. https://doi.org/10.1109/PESS.2001.970062
Ben-Tal A, Nemirovski A, 2001. Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. SIAM, Philadelphia, USA.
Bertsekas DP, 1999. Nonlinear Programming. Athena Scientific, Belmont, Massachusetts, USA.
Bhela S, Kekatos V, Veeramachaneni S, 2018. Enhancing observability in distribution grids using smart meter data. IEEE Trans Smart Grid, 9(6):5953-5961. https://doi.org/10.1109/TSG.2017.2699939
Burke JV, Ferris MC, 1995. A Gauss-Newton method for convex composite optimization. Math Programm, 71(2):179-194. https://doi.org/10.1007/BF01585997
Candès EJ, Li X, Soltanolkotabi M, 2015. Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans Inform Theory, 61(4):1985-2007. https://doi.org/10.1109/TIT.2015.2399924
Caro E, Conejo A, 2012. State estimation via mathematical programming: a comparison of different estimation algorithms. IET Gener Transm Distrib, 6(6):545-553. https://doi.org/10.1049/iet-gtd.2011.0663
Christie RD, 1999. Power Systems Test Case Archive. University of Washington. https://labs.ece.uw.edu/pstca/
Clements KA, 2011. The impact of pseudo-measurements on state estimator accuracy. IEEE Power and Energy Society General Meeting, p.1-4. https://doi.org/10.1109/PES.2011.6039370
Della Giustina D, Pau M, Pegoraro PA, et al., 2014. Electrical distribution system state estimation: measurement issues and challenges. IEEE Instrum Meas Mag, 17(6):36-42. https://doi.org/10.1109/MIM.2014.6968929
Duchi JC, Ruan F, 2017a. Solving (most) of a set of quadratic equalities: composite optimization for robust phase retrieval. Inform Infer J IMA, iay015. https://doi.org/10.1093/imaiai/iay015
Duchi JC, Ruan F, 2017b. Stochastic methods for composite optimization problems. https://arxiv.org/abs/1703.08570
Džafić I, Jabr RA, Hrnjić T, 2018a. High performance distribution network power flow using Wirtinger calculus. IEEE Trans Smart Grid, in press. https://doi.org/10.1109/TSG.2018.2824018
Džafić I, Jabr RA, Hrnjić T, 2018b. Hybrid state estimation in complex variables. IEEE Trans Power Syst, 33(5):5288-5296. https://doi.org/10.1109/TPWRS.2018.2794401
Fairley P, 2016. Cybersecurity at US utilities due for an upgrade: tech to detect intrusions into industrial control systems will be mandatory. IEEE Spectr, 53(5):11-13. https://doi.org/10.1109/MSPEC.2016.7459104
Fletcher R, Watson GA, 1980. First and second order conditions for a class of nondifferentiable optimization problems. Math Programm, 18(1):291-307. https://doi.org/10.1007/BF01588325
Giannakis GB, Kekatos V, Gatsis N, et al., 2013. Monitoring and optimization for power grids: a signal processing perspective. IEEE Signal Process Mag, 30(5):107-128. https://doi.org/10.1109/MSP.2013.2245726
Göl M, Abur A, 2014. LAV based robust state estimation for systems measured by PMUs. IEEE Trans Smart Grid, 5(4):1808-1814. https://doi.org/10.1109/TSG.2014.2302213
Huang YF, Werner S, Huang J, et al., 2012. State estimation in electric power grids: meeting new challenges presented by the requirements of the future grid. IEEE Signal Process Mag, 29(5):33-43. https://doi.org/10.1109/MSP.2012.2187037
Huber PJ, 2011. Robust Statistics. In: Lovric M (Ed.), International Encyclopedia of Statistical Science. Springer, Berlin, p.1248-1251.
Jabr R, Pal B, 2003. Iteratively re-weighted least-absolute-value method for state estimation. IET Gener Transm Distrib, 150(4):385-391. https://doi.org/10.1049/ip-gtd:20030462
Jabr R, Pal B, 2004. Iteratively reweighted least-squares implementation of the WLAV state-estimation method. IET Gener Transm Distrib, 151(1):103-108. https://doi.org/10.1049/ip-gtd:20040030
Kay SM, 1993. Fundamentals of Statistical Signal Processing, Vol. I: Estimation Theory. Prentice Hall, Englewood Cliffs, USA.
Kekatos V, Giannakis GB, 2013. Distributed robust power system state estimation. IEEE Trans Power Syst, 28(2):1617-1626. https://doi.org/10.1109/TPWRS.2012.2219629
Kekatos V, Wang G, Zhu H, et al., 2017. PSSE redux: convex relaxation, decentralized, robust, and dynamic approaches. https://arxiv.org/abs/1708.03981
Kim SJ, Wang G, Giannakis GB, 2014. Online semidefinite programming for power system state estimation. IEEE Conf on Acoustics, Speech, and Signal Process, p.6024-6027. https://doi.org/10.1109/ICASSP.2014.6854760
Kosut O, Jia L, Thomas J, et al., 2011. Malicious data attacks on the smart grid. IEEE Trans Smart Grid, 2(4):645-658. https://doi.org/10.1109/TSG.2011.2163807
Kotiuga WW, Vidyasagar M, 1982. Bad data rejection properties of weighted least-absolute-value techniques applied to static state estimation. IEEE Trans Power Appar Syst, 101(4):844-853. https://doi.org/10.1109/TPAS.1982.317150
Kreutz-Delgado K, 2009. The complex gradient operator and the CR-calculus. https://arxiv.org/abs/0906.4835
Lewis AS, Wright SJ, 2016. A proximal method for composite minimization. Math Programm, 158(1-2):501-546. https://doi.org/10.1007/s10107-015-0943-9
Liu Y, Ning P, Reiter MK, 2011. False data injection attacks against state estimation in electric power grids. ACM Trans Inform Syst Sec, 14(1):1-33. https://doi.org/10.1145/1952982.1952995
Lu C, Teng J, Liu WH, 1995. Distribution system state estimation. IEEE Trans Power Syst, 10(1):229-240. https://doi.org/10.1109/59.373946
Mehanna O, Huang K, Gopalakrishnan B, et al., 2015. Feasible point pursuit and successive approximation of non-convex QCQPs. IEEE Signal Process Lett, 22(7):804-808. https://doi.org/10.1109/LSP.2014.2370033
Mili L, Cheniae MG, Rousseeuw PJ, 1994. Robust state estimation of electric power systems. IEEE Trans Circ Syst I Fundam Theory Appl, 41(5):349-358. https://doi.org/10.1109/81.296336
Monticelli A, 2000. Electric power system state estimation. Proc IEEE, 88(2):262-282. https://doi.org/10.1109/5.824004
Nesterov Y, 2013. Introductory Lectures on Convex Optimization: a Basic Course. Springer Science & Business Media, Boston, USA.
Pardalos PM, Vavasis SA, 1991. Quadratic programming with one negative eigenvalue is NP-hard. J Glob Optim, 1(1):15-22. https://doi.org/10.1007/BF00120662
Park J, Boyd S, 2017. General heuristics for nonconvex quadratically constrained quadratic programming. https://arxiv.org/abs/1703.07870
Saad Y, 2003. Iterative Methods for Sparse Linear Systems (2nd Ed.). Society for Industrial and Applied Mathematics, Philadelphia, USA.
Schweppe FC, Wildes J, Rom D, 1970. Power system static state estimation: parts I, II, and III. IEEE Trans Power Appar Syst, 89(1):120-135.
Singh R, Pal B, Jabr R, 2009. Choice of estimator for distribution system state estimation. IET Gener Transm Distrib, 3(7):666-678. https://doi.org/10.1049/iet-gtd.2008.0485
Stoica P, Marzetta TL, 2001. Parameter estimation problems with singular information matrices. IEEE Trans Signal Process, 49(1):87-90. https://doi.org/10.1109/78.890346
Wang G, Kim SJ, Giannakis GB, 2014. Moving-horizon dynamic power system state estimation using semidefinite relaxation. IEEE PES General Meeting & Conf Exposition, p.1-5. https://doi.org/10.1109/PESGM.2014.6939925
Wang G, Zamzam AS, Giannakis GB, et al., 2016. Power system state estimation via feasible point pursuit. IEEE Global Conf Signal and Information Process, p.773-777. https://doi.org/10.1109/GlobalSIP.2016.7905947
Wang G, Giannakis GB, Chen J, 2017. Robust and scalable power system state estimation via composite optimization. https://arxiv.org/abs/1708.06013
Wang G, Giannakis GB, Saad Y, et al., 2018a. Phase retrieval via reweighted amplitude flow. IEEE Trans Signal Process, 66(11):2818-2833. https://doi.org/10.1109/TSP.2018.2818077
Wang G, Zamzam AS, Giannakis GB, et al., 2018b. Power system state estimation via feasible point pursuit: algorithms and Cramér-Rao bound. IEEE Trans Signal Process, 66(6):1649-1658. https://doi.org/10.1109/TSP.2018.2791977
Wang G, Zhu H, Giannakis GB, et al., 2018c. Robust power system state estimation from rank-one measurements. IEEE Trans Contr Netw Syst, in press. https://doi.org/10.1109/TCNS.2019.2890954
Wang G, Giannakis GB, Eldar YC, 2018d. Solving systems of random quadratic equations via truncated amplitude flow. IEEE Trans Inform Theory, 64(2):773-794. https://doi.org/10.1109/TIT.2017.2756858
Wang Z, Cui B, Wang J, 2017. A necessary condition for power flow insolvability in power distribution systems with distributed generators. IEEE Trans Power Syst, 32(2):1440-1450. https://doi.org/10.1109/TPWRS.2016.2588341
Wood AJ, Wollenberg BF, 1996. Power Generation, Operation, and Control (2nd Ed.). Wiley & Sons, New York, USA.
Wulf WA, 2000. Great achievements and grand challenges. Nat Acad Eng, 30(1):5-10.
Zamzam AS, Fu X, Sidiropoulos ND, 2018. Data-driven learning-based optimization for distribution system state estimation. https://arxiv.org/abs/1807.01671
Zhang L, Wang G, Giannakis GB, 2017. Going beyond linear dependencies to unveil connectivity of meshed grids. IEEE 7th Workshop on Computational Advances in Multi-sensor Adaptive Processing, p.1-5. https://doi.org/10.1109/CAMSAP.2017.8313078
Zhang L, Wang G, Giannakis GB, 2018a. Real-time power system state estimation via deep unrolled neural networks. IEEE Global Conf on Signal and Information Processing, in press.
Zhang L, Wang G, Giannakis GB, 2018b. Real-time power system state estimation and forecasting via deep neural networks. https://arxiv.org/abs/1811.06146
Zhang L, Wang G, Giannakis GB, 2019. Power system state forecasting via deep recurrent neural networks. IEEE Conf on Acoustics, Speech, and Signal Process, in press.
Zhu H, Giannakis GB, 2011. Estimating the state of AC power systems using semidefinite programming. North American Power Symp, p.1-7. https://doi.org/10.1109/NAPS.2011.6024862
Zhu H, Giannakis GB, 2012. Robust power system state estimation for the nonlinear AC flow model. North American Power Symp, p.1-6. https://doi.org/10.1109/NAPS.2012.6336405
Zhu H, Giannakis GB, 2014. Power system nonlinear state estimation using distributed semidefinite programming. IEEE J Sel Top Signal Process, 8(6):1039-1050. https://doi.org/10.1109/JSTSP.2014.2331033

Appendix A: Wirtinger's calculus

Introducing the complex conjugate coordinates $[v^T\ \bar{v}^T]^T \in \mathbb{C}^{2N}$, one can express $h(v) = h(v, \bar{v}) \in \mathbb{R}^M$. It now becomes evident that $h(v, \bar{v})$ is holomorphic (i.e., complex differentiable) in $v$ for a fixed
$\bar{v}$, and vice versa. Following the convention, (partial) derivatives will be denoted by row vectors, while (sub)gradients are denoted by column vectors. The partial Wirtinger derivatives are given by (Kreutz-Delgado, 2009)

$$\frac{\partial h_m}{\partial v} := \left.\frac{\partial h_m(v, \bar{v})}{\partial v}\right|_{\bar{v}=\text{const.}} = \left[\frac{\partial h_m}{\partial v_1},\; \frac{\partial h_m}{\partial v_2},\; \ldots,\; \frac{\partial h_m}{\partial v_N}\right], \tag{A1}$$

$$\frac{\partial h_m}{\partial \bar{v}} := \left.\frac{\partial h_m(v, \bar{v})}{\partial \bar{v}}\right|_{v=\text{const.}} = \left[\frac{\partial h_m}{\partial \bar{v}_1},\; \frac{\partial h_m}{\partial \bar{v}_2},\; \ldots,\; \frac{\partial h_m}{\partial \bar{v}_N}\right], \tag{A2}$$

for $m = 1, 2, \ldots, M$, where the partial derivative with respect to $v$ ($\bar{v}$) treats $\bar{v}$ ($v$) as a constant in $h_m$. The complex gradient of $h_m(v, \bar{v})$ with respect to $v$ or $\bar{v}$ can be defined by

$$\nabla_v h_m := \left(\frac{\partial h_m}{\partial v}\right)^H, \qquad \nabla_{\bar{v}} h_m := \left(\frac{\partial h_m}{\partial \bar{v}}\right)^H, \tag{A3}$$

which gives rise to the complex gradient of $h_m$ in the conjugate coordinate system:

$$\nabla_c h_m := \left[\nabla_v^T h_m \;\; \nabla_{\bar{v}}^T h_m\right]^T = \left[\frac{\partial h_m}{\partial v} \;\; \frac{\partial h_m}{\partial \bar{v}}\right]^H. \tag{A4}$$

Upon introducing the so-called complex Jacobian matrix:

$$\nabla_c h := \left[\nabla_c h_1,\; \nabla_c h_2,\; \ldots,\; \nabla_c h_M\right] \in \mathbb{C}^{2N \times M}, \tag{A5}$$

the first-order Taylor expansion of $h(v + \Delta v)$ for given vectors $v$ and $\Delta v \in \mathbb{C}^N$ is defined as

$$h(v + \Delta v) \approx h(v) + \nabla_c^H h(v)\begin{bmatrix}\Delta v \\ \overline{\Delta v}\end{bmatrix} = h(v) + 2\Re\big(\nabla_v^H h(v)\,\Delta v\big). \tag{A6}$$

Appendix B: Composite optimization

Consider minimizing functions of the form:

$$f(v) := c(s(v)) \quad \text{s.t.} \quad v \in \mathcal{V}, \tag{B1}$$

where $c(\cdot): \mathbb{R}^M \to \mathbb{R}$ is convex, $s(\cdot): \mathbb{C}^N \to \mathbb{R}^M$ is smooth, and $\mathcal{V}$ is some convex set (or $\mathcal{V} = \mathbb{C}^N$ if there is no constraint on $v$). Since $f(\cdot)$ ($s(\cdot)$) is a real-valued function of complex arguments, real analysis is not applicable. Yet, this is not an issue, as Wirtinger's calculus in Appendix A can be employed. This compositional structure lends itself favorably to the prox-linear algorithm (Fletcher and Watson, 1980; Burke and Ferris, 1995), which we develop next.

Similar to other iterative schemes such as gradient descent, trust-region, and Newton's method, the prox-linear algorithm builds a local model of the loss function and repeatedly minimizes this model. However, thanks to the compositional structure, the local model is obtained by linearizing only $s(\cdot)$. Specifically, we first define close to any given point $w \in \mathbb{C}^N$ a local "linearization" of $f(\cdot)$ as

$$f_w(v) := c\Big(s(w) + 2\Re\big(\nabla_v^H s(w)(v - w)\big)\Big), \tag{B2}$$

where $\nabla_v s(w) \in \mathbb{C}^{N \times M}$ is the complex Jacobian matrix of $s(\cdot)$ evaluated at point $w$ (Eqs. (A5) and (A6) in Appendix A).

In contrast to the originally nonconvex $f(v)$, the linearization $f_w(v)$ in Eq. (B2) is convex, which is indeed the key behind the prox-linear procedure. Starting with some point $v_0 \in \mathbb{C}^N$, the (deterministic) prox-linear algorithm proceeds inductively to obtain iterations $\{v_1, v_2, \ldots\}$ by minimizing the quadratically regularized models (Burke and Ferris, 1995; Lewis and Wright, 2016):

$$v_{i+1} = \arg\min_{v\in\mathcal{V}} \left\{ f_{v_i}(v) + \frac{1}{2\mu_i}\left\|v - v_i\right\|_2^2 \right\}, \tag{B3}$$

where $\mu_i > 0$ is a step size that can be fixed a priori to some constant, or be determined "on-the-fly" by a line search (Burke and Ferris, 1995).

Observe that $f_{v_i}(v)$ is convex in $v$, and so is problem (B3). It has been shown that if $c(\cdot)$ is $L$-Lipschitz and $\nabla s$ is $\kappa$-Lipschitz, choosing any constant step size $0 < \mu < 1/(\kappa L)$ ensures that the algorithm (Eq. (B3)) (Lewis and Wright, 2016): (1) is a descent method (i.e., the iterations $\{v_i\}$ monotonically decrease the function value of $f(v)$); (2) finds an (approximate) stationary point of Eq. (B1). Interested readers can refer to Lewis and Wright (2016) for a contemporary review on composite optimization.
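As an illustration of the linearization (B2) and the regularized step (B3), the following Python sketch treats the simplest composite loss, a single quadratic measurement $|z - v^H H v|$ with Hermitian $H$, in the spirit of the stochastic variant of Section 4.4. It is a minimal sketch under stated assumptions rather than the authors' implementation: the function names (`prox_linear_step`, `model_value`, `stochastic_prox_linear`) are hypothetical, the closed-form increment is derived here directly from the linearized model around the current iterate (residual $r = z - v^H H v$, direction $b = 2Hv$), and the step size $\mu_i = \alpha (i+1)^{-\gamma}$ is shifted by one so that it is defined at $i = 0$.

```python
import numpy as np


def prox_linear_step(H, z, v, mu):
    """One prox-linear step for the single-measurement loss |z - v^H H v|.

    Linearizing around v gives the model |r - Re(b^H d)| + ||d||^2 / (2*mu)
    in the increment d, with r = z - v^H H v and b = 2 H v (H Hermitian).
    Its minimizer is d = clip(r / ||b||^2, -mu, mu) * b.
    """
    b = 2.0 * (H @ v)
    r = z - np.vdot(v, H @ v).real      # np.vdot conjugates its first argument
    nb2 = np.vdot(b, b).real
    if nb2 == 0.0:                      # zero gradient: the model gives no direction
        return v.copy()
    t = np.clip(r / nb2, -mu, mu)       # projection onto [-mu, mu]
    return v + t * b


def model_value(H, z, v, d, mu):
    """Value of the regularized linearized model at increment d (for checking)."""
    b = 2.0 * (H @ v)
    r = z - np.vdot(v, H @ v).real
    return abs(r - np.vdot(b, d).real) + np.vdot(d, d).real / (2.0 * mu)


def stochastic_prox_linear(Hs, zs, v0, alpha=1.0, gamma=0.6, iters=100, seed=0):
    """Sample one measurement per iteration and apply the closed-form step.

    Assumed step-size rule: mu_i = alpha * (i + 1) ** (-gamma), 0.5 < gamma <= 1.
    """
    rng = np.random.default_rng(seed)
    v = np.asarray(v0, dtype=complex).copy()
    for i in range(iters):
        m = rng.integers(len(zs))       # uniform sampling of a single datum
        mu = alpha * (i + 1) ** (-gamma)
        v = prox_linear_step(Hs[m], zs[m], v, mu)
    return v
```

For the scalar toy loss $|\bar{v}v - 1|$ mentioned in Section 4.2 (that is, `H = np.array([[1.0 + 0j]])` and `z = 1.0`), one step from $v = 2$ with $\mu = 1$ gives $1.25$, whereas the smaller step size $\mu = 0.1$ truncates the move and gives $1.6$. The clipping is what limits the influence of a measurement with a large residual.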
