2019-State Estimation in Smart Distribution Systems With Deep Generative Adversary Networks
Kursat Rasim Mestav and Lang Tong are with the School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14850 USA (e-mail: {krm264,lt35}@cornell.edu). This work was supported in part by the National Science Foundation under Awards 1809830 and 1932501.
Abstract: The problem of distribution system state estimation using smart meters and limited SCADA (Supervisory Control and Data Acquisition) measurement units is considered. To overcome the lack of measurements, a Bayesian state estimator using deep learning is proposed. The proposed method consists of two steps. First, a deep generative adversary network is trained to learn the distribution of net power injections at the loads. Then, a deep regression network is trained using the samples generated from the generative network to obtain the minimum mean-squared error (MMSE) estimate of the system state. Our simulation results show that the accuracy and the online computation cost of the proposed method are superior to those of the conventional methods.
Index Terms: Distribution system state estimation, deep learning, generative adversary networks, deep regression network, SCADA, smart meter, and Bayesian inference.
I. INTRODUCTION
The problem of state estimation is considered for distribution systems. A major obstacle to state estimation in distribution systems is that such systems are nominally unobservable [1], [2]. By unobservable, we mean that there is a manifold of uncountably many states that correspond to the same measurement. System unobservability arises when the number of sensors is not sufficiently large, which is typical in distribution systems, or when sensors are not well placed in the network. An observable system may become unobservable when sensors are at fault, sensor data are missing, or data are tampered with by malicious agents [3].
The popular weighted least-squares (WLS) estimator and its variants can no longer be used when the system is unobservable because a small WLS error in model fitting does not imply a small error in estimation; a large estimation error may persist even in the absence of noise. A standard remedy for unobservability is to use so-called pseudo measurements based on interpolated observations or forecasts from historical data. Indeed, the use of pseudo measurements has been a dominant theme in distribution system state estimation. These techniques, however, are ad hoc and do not assure the quality of the estimates.
The advent of smart meters and advanced metering infrastructure provides new sources of measurements. Attempts have been made to incorporate smart meter data for state estimation [4]–[6]. Not intended for state estimation, smart meters measure cumulative consumption. Their measurements often arrive at a much slower timescale, e.g., in 15-minute to hourly intervals, which is incompatible with the more rapid changes of distributed energy resources (DER). Unfortunately, existing techniques rarely address the mismatch of measurement resolution among the slow timescale smart meter data, the fast timescale real-time measurements (e.g., current magnitudes at feeders and substations), and the need for fast timescale state estimation.
State estimation for unobservable systems must incorporate additional properties beyond the measurement model defined by the power flow equations. To this end, we pursue a Bayesian inference approach where the system states (voltage phasors) and measurements are modeled as random variables endowed with (unknown) joint probability distributions. Given the highly stochastic nature of the renewable injections, such a Bayesian model is both natural and appropriate.
The most important benefit of Bayesian inference is that observability is no longer required. A Bayesian estimator exploits probabilistic dependencies of the measurement variables on the system states; it improves the prior distribution of the states using available measurements, even if there are only a few such measurements. Unlike the least squares techniques that minimize the modeling error, a Bayesian estimator directly minimizes the estimation error.
The advantage of Bayesian inference, however, comes with significant implementation issues. First, the underlying joint distribution of the system states and measurements is unknown, and some type of learning is necessary. Second, even if the relevant probability distribution is known or can be estimated, computing the actual state estimate is often analytically intractable and computationally prohibitive.
A. Summary of Results and Contributions
The main contribution of this work is an application of deep learning technology to distribution system state estimation when the system is unobservable by the deployed SCADAs. To this end, we develop a data-driven generative model coupled with a deep neural network that provides SCADA timescale state estimates. As a major departure from the predominantly pseudo-measurement approaches to state estimation for unobservable power systems, the proposed solution takes a Bayesian inference perspective, treating the system states as random quantities. Consequently, the proposed approach is not bound by the observability assumption required by the traditional weighted least-squares solutions.
The Bayesian estimate that minimizes the mean squared error (MSE) of the state estimate is given by the conditional
Authorized licensed use limited to: University of Saskatchewan. Downloaded on October 04,2021 at 07:44:28 UTC from IEEE Xplore. Restrictions apply.
IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)
mean of the system state. Simple as it may appear, the conditional mean can be difficult to compute. Fundamentally, the underlying joint probability distribution of the measurement and the system state is required. Even when such a probability distribution is given in closed form, the computation of the conditional mean is intractable in general. Furthermore, the lack of measurement-state samples makes it impossible to learn the joint distribution directly.
The proposed deep learning approach consists of unsupervised learning of the generative model of the network injection and supervised learning of the conditional mean of the system states. Specifically, given direct or indirect measurements of the net injection, we consider several machine learning techniques for the underlying generative models of network injections. For distribution systems with smart meter measurements, such models can be learned from smart meter measurements or estimates of network injections using parametric or nonparametric techniques, including deep learning techniques such as the generative adversary network (GAN) method.
The unsupervised learning of the injection generative model is followed by supervised learning of the conditional mean of the network state. To this end, we exploit the physical model of the power system by embedding the power flow equation in the generation of training samples. A deep neural network with a prewhitening first layer is proposed. We show that the proposed state estimator achieves several orders of magnitude improvement in accuracy and online computation costs over the classical weighted least-squares (WLS) estimates.
B. Related Work
State estimation based on deterministic state models has been extensively studied. See [1], [2] and references therein. We henceforth highlight only a subset of the literature with techniques suitable for distribution systems.
In some of the earliest contributions [7]–[10], it was well recognized that a critical challenge for distribution system state estimation is the lack of observability. Unlike the Bayesian solution considered in this paper, most existing approaches are two-step solutions that produce pseudo measurements to make the system observable and then apply WLS and other well-established techniques.
From an estimation theoretic perspective, generating pseudo measurements can be viewed as forecasting the real-time measurements based on historical data. Thus the pseudo-measurement techniques are part of the so-called forecasting-aided state estimation [11], [12]. To this end, machine learning techniques that have played significant roles in load forecasting can be tailored to produce pseudo measurements. See, e.g., [13]–[18].
Bayesian approaches to state estimation are far less explored even though the idea was already proposed in the seminal work of Schweppe [19]. Bayesian state estimation generally requires the computation of the conditional statistics of the state variables. An early contribution that explicitly modeled states as random was made in [20], where load distributions were used to compute moments of states, although real-time measurements were used as optimization constraints rather than as conditioning variables in Bayesian inference. One approach to calculating conditional statistics is based on a graphical model of the distribution system from which belief propagation techniques are used to generate state estimates [21]. These techniques require a dependency graph of the system states and explicit forms of probability distributions. Another approach is based on a linear approximation of the AC power flow [22].
The approach presented in this paper belongs to the class of Monte Carlo techniques in which samples are generated and empirical conditional means are computed. In our approach, instead of using Monte Carlo sampling to calculate the conditional mean directly as in [23], [24], Monte Carlo sampling is used to train a neural network that, in real time, computes the MMSE estimate directly from the measurements.
The proposed technique builds on our work on distribution system state estimation [25] with several notable differences. First, the techniques for learning the generative models are different. In that work, it was assumed that the power injections follow a Gaussian mixture distribution. Here we propose a more comprehensive technique using generative adversarial networks (GANs). GANs are not only a more generic method for learning distributions without strong assumptions; they also allow us to learn a distribution when the samples are not directly observable but are observable under an operation. We propose to change the objective function of the GAN so that it learns the distribution of power injections from the aggregated smart meter measurements. Second, the state estimation algorithm using regression learning is improved with a prewhitening technique and additional regularization techniques.
II. SYSTEM MODEL AND BAYESIAN STATE ESTIMATION
The state variable of bus $i$ at time $t$ is defined by $x_t^i = V_t^i \angle \theta_t^i$, where $V_t^i$ is the voltage magnitude and $\theta_t^i$ is the phase angle. The overall system state $x_t = [x_t^1, \cdots, x_t^N]$ is the column vector consisting of voltage phasors at all buses. A SCADA unit measures active and reactive power injections, power flows, and voltage magnitudes. The SCADA measurement vector $y_t$ and the system state $x_t$ are related by
$$ y_t = h(x_t) + w_t, \quad t = 1, 2, \ldots \tag{1} $$
where $t$ is the time index at the SCADA timescale (milliseconds), $h(\cdot)$ is the measurement function, and $w_t$ is the measurement noise.
A. Weighted Least Squares Solutions and Observability:
The WLS estimator is optimal for observable systems in the absence of measurement noise:
$$ \hat{x}_{\mathrm{WLS}}(y_t) = \arg\min_{x_t} \| y_t - h(x_t) \|^2. \tag{2} $$
The goal of least squares is to minimize the modeling error. When there are not enough SCADAs installed, or some SCADAs are faulty, the system becomes unobservable. WLS
methods need pseudo measurements or extra constraints to estimate states when the system is unobservable.
B. Smart Meters in State Estimation:
Suppose that smart meter measurements arrive, say, $T$ times slower than the SCADA measurements. Then the smart meter measurement vector $z[n]$ is aggregated over every $T$ time steps:
$$ z[n] = \sum_{i=nT-T}^{nT-1} g(x_i) + v[n] \tag{3} $$
where $g(\cdot)$ is the power injection function and $v[n]$ is the measurement noise.
SCADAs are not widely deployed, so most distribution systems are unobservable. The common practice is to calculate pseudo measurements using smart meters and solve (2).
... $(x_t, y_t)$, it is difficult even to estimate the generative model. Second, even if we have $f(x_t, y_t)$, computing the conditional mean is often intractable. Several early attempts [21], [26] employed belief propagation on a graphical model to compute the solution efficiently. These methods still require the underlying graphical model; they deserve a new look through the lens of modern machine learning.
The advent of powerful deep learning tools and computation resources such as GPUs and cloud computing makes it possible to overcome the above challenges of Bayesian state estimation. The key idea of our approach is to embed the underlying physical law in the neural network learning process.
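To make the contrast between least squares and Bayesian estimation concrete, the following is a minimal sketch under strong simplifying assumptions that are not part of the paper: a hypothetical linear measurement matrix `H` stands in for the nonlinear function $h(\cdot)$, the state prior is a hypothetical Gaussian, and an affine regressor fitted by least squares stands in for the deep regression network. Training pairs are generated by pushing sampled states through the measurement model, which mirrors the idea of embedding the physical law in the generation of training samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: 4 states but only 2 measurements, so the system is unobservable.
n_states, n_meas = 4, 2
H = rng.normal(size=(n_meas, n_states))  # hypothetical linear stand-in for h(.)

# Assumption: Gaussian prior on the states (mean near 1, correlated entries).
prior_mean = np.ones(n_states)
A = rng.normal(size=(n_states, n_states))
prior_cov = 0.01 * (A @ A.T + np.eye(n_states))

def simulate(n_samples):
    """Draw states from the prior and push them through y = H x + w, as in (1)."""
    x = rng.multivariate_normal(prior_mean, prior_cov, size=n_samples)
    y = x @ H.T + 0.01 * rng.normal(size=(n_samples, n_meas))
    return x, y

x_train, y_train = simulate(5000)
x_test, y_test = simulate(1000)

# Supervised step: fit an affine map y -> x on the Monte Carlo samples.
# It approximates the conditional mean E[x | y] (exactly linear only in the
# Gaussian case); the paper uses a deep neural network here instead.
features = np.hstack([y_train, np.ones((len(y_train), 1))])
weights, *_ = np.linalg.lstsq(features, x_train, rcond=None)
x_hat_bayes = np.hstack([y_test, np.ones((len(y_test), 1))]) @ weights

# Least-squares baseline: minimum-norm solution of (2), which fits the
# measurements well but ignores the prior on the unobserved directions.
x_hat_wls = y_test @ np.linalg.pinv(H).T

mse_bayes = np.mean((x_hat_bayes - x_test) ** 2)
mse_wls = np.mean((x_hat_wls - x_test) ** 2)
print(f"MSE, learned conditional mean: {mse_bayes:.4f}")
print(f"MSE, minimum-norm least squares: {mse_wls:.4f}")
```

In this toy setting the least-squares estimate reproduces the measurements yet leaves a large state error in the unobserved directions, while the learned estimator uses the prior encoded in the training samples.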
As shown in Fig. 3, a GAN consists of a generative network and a discriminative network, which are trained simultaneously. While the generative network learns to generate samples similar to the data, the discriminative network learns how
... is used as a performance measure, where $M$ is the number of Monte Carlo runs, $k$ the index of the Monte Carlo run, $N$ the number of nodes, and $\hat{x}[k]$ and $x[k]$ the estimated and the true state vectors, respectively. The objective function in the deep
learning model is chosen to minimize the mean-squared error in order to approximate the optimal estimator.
c) Distribution Learning via generative deep learning: We used aggregated power injection data from the Pecan Street collection for distribution learning. It is beneficial to select the training samples of the deep learning model from historical data with similar features, such as the season, hour of the day, and weather. We collected the net power injection data for each bus, fixing the hour of the day at 5 pm, from the 1st of May to the 31st of August in 2018.
We assumed that the power injection distribution at each bus is a linear transformation of one common distribution, so the distributions differ only by mean and variance. We normalized all measurements to obtain samples of that distribution. Then, we used these samples to train the GAN. As the data samples are from smart meters, we modified the training objective function as described in (6).
We trained the generative network with 2 hidden layers and 100 neurons in each layer. Batch normalization and dropout with rate 0.2 were used in the layers. The Adam optimization algorithm [30] with mini-batches was used as the optimizer. Leaky-ReLU activation functions were used in the hidden layers and a linear activation function in the final layer. For the discriminative network, we used two hidden layers with 30 neurons, with Leaky-ReLU activation functions in the hidden layers and a sigmoid activation function in the final layer.
After training the GAN, we generated many power injection vectors using the generator network. To verify the results, we aggregated the generated samples to imitate the smart meter measurements and plotted their cumulative distribution function (CDF) together with the empirical CDF of the raw data in Fig. 4. The figure shows that the empirical CDF obtained from the raw samples and the CDF obtained from the learned distribution are similar, which verifies that the training was successful. We also observed that the output of the discriminative network converged to a constant value of 0.5, as expected from [33].
... into training (12000 samples), validation (4000 samples), and test (4000 samples) sets. For each one, we solved the power flow equations with MATPOWER to obtain the power flow values and states. For this network, 32 optimally placed SCADA meters make the system observable. We repeated the experiment with 8, 14, 20, and 26 SCADAs, for which the system is guaranteed to be unobservable. For each case, measurements were imitated by adding measurement noise to the computed power flow values and states. The noisy measurements of the SCADA meters were chosen as the input of the network. Prewhitening was used on the inputs. The states were chosen as the output. Then we trained a deep neural network with 5 to 10 hidden layers to estimate the states on the test set.
The ReLU (Rectified Linear Unit) activation function was used for neurons in the hidden layers and linear activation functions in the output layer. The Adam algorithm was used to train the neural network with mini-batches of 60 samples. Early stopping was applied by monitoring validation errors. He's normal method [36] was used to select an initial point for the optimization. For better regularization, batch normalization and dropout with a 0.3 dropping rate were used in the hidden layers.
e) Comparison of Performances: We implemented the proposed deep learning approach to Bayesian state estimation on the IEEE 118-bus system. We compared the proposed Bayesian state estimation with a deep neural network (herein abbreviated as Bayesian NN) with two WLS-based methods in the literature:
1) WLS with pseudo measurements: referred to as Regularized WLS, this method generates injection pseudo measurements by normalizing the smart meter measurement over the interval.
2) Augmented WLS: uses only SCADA measurements to estimate states. An extra constraint is added to (2).
Fig. 5 presents the performance of the three algorithms in five scenarios. It demonstrates that, for highly unobservable systems, Bayesian methods are advantageous over conventional WLS methods. Note the MSE floor of the augmented WLS method, which is due to the mismatch between SCADA and smart meter measurements.
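The prewhitening applied to the network inputs can be illustrated with a short sketch. The paper does not specify the whitening transform, so the PCA-style whitening below, the helper names `fit_whitener` and `whiten`, and the synthetic data are all assumptions made for illustration. The transform is fitted on training measurements only and then reused on validation and test inputs.

```python
import numpy as np

def fit_whitener(y_train, eps=1e-8):
    """Estimate the mean and a whitening matrix from training measurements."""
    mu = y_train.mean(axis=0)
    cov = np.cov(y_train - mu, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    # Scale each eigenvector (column) by 1/sqrt(eigenvalue); eps guards
    # against division by a near-zero eigenvalue.
    whitening = eigvec / np.sqrt(eigval + eps)
    return mu, whitening

def whiten(y, mu, whitening):
    """Center the inputs and rotate/scale them to unit covariance."""
    return (y - mu) @ whitening

rng = np.random.default_rng(0)
# Hypothetical correlated "measurements" with a nonzero mean.
mix = np.array([[2.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.5]])
y_train = rng.normal(size=(2000, 3)) @ mix + 5.0

mu, whitening = fit_whitener(y_train)
y_white = whiten(y_train, mu, whitening)
print(np.round(np.cov(y_white, rowvar=False), 3))
```

After whitening, the training inputs have zero mean and an approximately identity covariance, which tends to make gradient-based training less sensitive to the scale and correlation of the raw measurements.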