2021-Bayesian Deep Learning For Dynamic Power System State Prediction Considering Renewable Energy Uncertainty
JOURNAL OF MODERN POWER SYSTEMS AND CLEAN ENERGY, VOL. XX, NO. XX, XX XXXX
Abstract: Modern power systems incorporate distributed energy sources to be environmentally friendly and cost-effective. However, due to the uncertainties of a system integrated with renewable energy sources, effective strategies need to be adopted to stabilize the entire power system. Hence, system operators need accurate prediction tools to forecast the dynamic system states effectively. In this paper, we propose a Bayesian deep learning approach to predict the dynamic system state in a general power system. First, the input system dataset with multiple system features goes through a data pre-processing stage. Second, we obtain the dynamic state matrix of a general power system through the Newton-Raphson power flow model. Third, by incorporating the state matrix with the system features, we propose a Bayesian long short-term memory (BLSTM) network to predict the dynamic system state variables accurately. Simulation results show that accurate prediction can be achieved at different scales of power systems through the proposed Bayesian deep learning approach.

Index Terms: Bayesian deep learning, data analytics, Newton-Raphson power flow, renewable energy source, system state.

Manuscript received: December 31, 2020; accepted: May 17, 2021. Date of CrossCheck: May 17, 2021. Date of online publication: XX XX, XXXX.
This work was supported by the General Program of Guangdong Basic and Applied Basic Research Foundation (No. 2019A1515011032) and the Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation (No. 2020B121201001).
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
S. Zhang is with the Academy for Advanced Interdisciplinary Studies, Southern University of Science and Technology, Shenzhen 518055, China (e-mail: sy‐[email protected]).
J. Yu (corresponding author) is with the Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China (e-mail: [email protected]).
DOI: 10.35833/MPCE.2020.000939

I. INTRODUCTION

With the development of technology, the power system has become more diversified than the conventional power system, for example through the increasing utilization of renewable energy sources (RESs). Although RESs help reduce environmental impacts and mitigate global warming, system uncertainties arise from uncertain and intermittent renewable power generation. In practice, the integration of RESs can cause sudden instability phenomena in the entire power system. Meanwhile, the power system state variables may also undergo drastic changes. To tackle such issues, accurate state prediction for the power system is crucial to capture the massive system uncertainties from RESs and to support system operations and services. For instance, a suitable prediction technique can be deployed for system data analytics and can contribute to stable operation and planning in the power system.

Existing solutions for predicting the dynamic system state in the power system mainly deploy traditional prediction tools [1], [2]. For instance, in [1], a robust approach, namely the extended Kalman filter, is developed to monitor system state dynamics in a faster and more reliable manner. Besides, a new robust generalized maximum-likelihood-type unscented Kalman filter is proposed in [2] to predict the state innovation vectors in the system. Although these studies propose statistical techniques for system state prediction, they do not consider the penetration of RESs in the power system. Considering the RESs deployed in the system, some existing studies apply statistical tools to system state prediction [3]-[5]. For example, a Markov model with the Viterbi algorithm is proposed in [3] for state prediction in different power systems with the penetration of RESs. Furthermore, in [4], a statistical Gaussian mixture model is developed to estimate the real-time system measurements. Nonetheless, these approaches are not practical since such models assume that both the system and the models are linear. Lastly, a novel interval state estimation algorithm is devised in [5] to consider different uncertainties of distributed generation outputs, as well as line parameters in unbalanced distribution systems. In practice, complex power systems have typical non-linear features, such as time-varying loads and multiple power generations. Hence, to handle these non-linear features, several studies deploy non-parametric methods for prediction in the system. In [6], a non-parametric kernel estimation method is applied to determine the state condition to characterize the stability issues of the system.

However, there exists a research gap in the state prediction problem of the traditional dynamic system. The above studies consider only linear or only non-linear features in the power system. In practice, a dynamic power system consists of various types of linear and non-linear features. For example, the electrical loads in the system are either linear or non-linear loads according to the network structure. As an emerging technique to handle the problems in complex systems, machine learning approaches can be a good candidate to fully respect both linear and non-linear factors in a general power system. Some studies utilize machine learning tools for system data prediction problems [7]-[9]. For instance, a novel machine learning tool, the support vector machine, can be applied for time series prediction in the power system [7]. Besides, a low-rank tensor learning model can be utilized to predict system measurements such as solar power generation [8]. Last but not least, decision trees can handle a large amount of wide-area information to maintain the stability of the power system [9].

Beyond the aforementioned studies, as a subset of machine learning, deep learning has been widely utilized in the related research fields [10]. In the power system, deep learning techniques can use multiple neural network layers to extract latent system features, which can further assist power system operations in practical scenarios. Some existing research works have solved power system problems through deep learning tools [11]-[16]. In [11], a method based on an artificial neural network (ANN) is proposed to predict the long-term voltage status to ensure the margin of the voltage stability. Additionally, an intelligent system strategy is proposed in [12] to predict the dynamic voltage deviation to observe the short-term instability of the system based on the ensemble learning of neural networks. However, these studies lack multiple degrees of data interpretability in the power system. For example, the one-layer ANN cannot deeply learn the representation of large-scale power system data due to its simple structure and parameter settings. To tackle such issues, in [13], a model-specific deep neural network (DNN) based power system state estimation scheme is proposed to estimate the real-time power system state. Furthermore, the long short-term memory (LSTM) model can be deployed to accurately capture the time-varying dynamic behaviors of active distribution networks [14] and to forecast the solar energy output [15]. Finally, by implementing a surrogate model, a novel LSTM model is applied to capture the time-varying consecutive states [16].

Although the aforementioned studies have demonstrated that deep learning performs well on prediction tasks, they are based on deterministic models and therefore lack the ability to capture uncertainty. In the power system, the uncertainties from various sources may expose potential safety issues and cause huge economic losses [17]. Even though several non-deep-learning approaches can deal with system uncertainties, e.g., [3] and [18], their schemes become complex and time-consuming with the increasing size of power networks. In this paper, a probabilistic deep learning model, i.e., Bayesian deep learning, is adopted to incorporate the power system uncertainties into the state estimation framework, aiming at a more interpretable model with reliable prediction by means of probability theory in an efficient manner. Several research works justify the feasibility of Bayesian deep learning methods in power system studies [19], [20]. For instance, a probabilistic Bayesian deep learning model is proposed for day-ahead load forecasting to capture the model and load uncertainties [19]. In addition, in [20], state estimation is achieved through Bayesian deep learning with consideration of the bad-data circumstance. However, to the best of our knowledge, there is no recent study using a Bayesian deep learning model to consider the dynamic system state of the entire power system.

To bridge the research gaps, we propose a Bayesian deep learning approach to perform state prediction in the general power system considering RES and model uncertainties. The major contributions of this paper are as follows.
1) A Newton-Raphson power flow model is deployed to estimate the dynamic system state, and we devise a new state matrix that aggregates all historical information.
2) By adopting a Bayesian LSTM (BLSTM) network in the system, both the model and data uncertainties can be captured.
3) Accurate predictions can be achieved by means of the proposed BLSTM network for different scales of systems.

The remainder of this paper is organized as follows. Section II presents and illustrates the proposed methodology. In Section III, we formulate a Newton-Raphson power flow model to generate the state matrix of a general power system. The deep learning approach, the BLSTM network, is then proposed in Section IV. In Section V, we conduct the performance evaluation of the proposed model with a general power system. Finally, we conclude this paper in Section VI.

II. PROPOSED METHODOLOGY

In this section, we present the proposed methodology, which is summarized in Fig. 1.

[Fig. 1 flowchart. Data pre-processing stage: power system data split into a training dataset and a testing dataset. Power flow analysis stage: Newton-Raphson power flow model producing the state matrix (training set) and the state matrix (testing set). State prediction stage: BLSTM network. Aggregation stage: predicted state variables.]
Fig. 1. Workflow of proposed methodology.

The proposed methodology consists of four main stages: data pre-processing, power flow analysis, state prediction, and aggregation. The first stage involves the dataset preparation process, where we collect various types of power system data, such as renewable power generation, time-varying loads, and dynamic power state information. Then, the input dataset is segregated into a training dataset and a testing dataset. The size of the training dataset is determined by multiple potential features of the entire dynamic power system. After that, both the training and testing datasets are
ZHANG et al.: BAYESIAN DEEP LEARNING FOR DYNAMIC POWER SYSTEM STATE PREDICTION CONSIDERING RENEWABLE ENERGY UNCERTAINTY
used for the power flow analysis. Once the state matrix is obtained and separated into training and testing sets, we incorporate them into the complete training and testing datasets to fit the proposed BLSTM network. By properly training the learning system, the dynamic system states of general power systems can be obtained via online inference. Finally, the aggregation stage combines all the individual probabilistic predictions through convolution with the previously saved weights to obtain the final probabilistic dynamic state variables.

A. Dynamic State Assessment

The dynamic state assessment is obtained by using the Newton-Raphson power flow model. Note that we consider the alternating-current (AC) power system in the proposed methodology. Hence, we perform AC power flow analysis to estimate the AC power state information. The power system network can be modeled as a graph defined as $(\mathcal{N}, \mathcal{E})$, where $\mathcal{N} = \{1, 2, \ldots, N\}$ is the bus set; $N$ is the total number of buses; and $\mathcal{E} \subseteq \mathcal{N} \times \mathcal{N}$ is the transmission line set. In the power network, we can model a branch $(i, j) \in \mathcal{E}$ with two common buses $i, j \in \mathcal{N}$ as one equivalent π circuit. Hence, the line admittance of this circuit can be expressed as $y_{ij} = g_{ij} + j b_{ij}$, where $g_{ij} \geq 0$ and $b_{ij} \leq 0$ are the branch conductance and branch susceptance, respectively. In addition, the shunt capacitance of branch $(i, j)$ can be denoted as $c_{ij} = c_{ji}$, which is used for voltage and reactive power control. These parameters are the key components to develop the AC power flow model for a general power system.

Through the power flow analysis in the system, the dynamic state variables can be estimated. The collection of these variables can be arranged as the state matrix of the system. Since we need to fit this information into the neural network model, the large amount of historical dynamic state estimations is collected and divided into training and testing sets.

B. Uncertainty Investigation

Data uncertainty in general power systems is typically related to the stochastic uncertainty of the power generation and loads injected as power sources to the system, such as load variability and intermittent power generation. It is more challenging to handle both load and intermittent power generation prediction than either of these individual tasks. Specifically, the stochastic uncertainty of load and intermittent power generation reflects the uncertainty characteristics of different injected sources, resulting from the variability of weather and unpredictable human activities. Besides, most of the existing approaches only predict the upper and lower bounds of the forecast power without detailed information about the power distribution at every time step. Additionally, most probabilistic prediction methods are in fact deterministic approaches, which cannot fully capture the stochastic uncertainty. In this paper, as mentioned above, we consider the integration of time-varying loads and RESs in AC power systems. Hence, the data uncertainty is taken into account in our proposed system model. As existing probabilistic prediction techniques are primarily derived from deterministic models, their capability in capturing the stochastic uncertainty is limited. Hence, most of the existing models cannot represent the data uncertainty due to the stochastic time-varying loads and RESs. It is important to develop a generic probabilistic deep learning model to handle a large number of such uncertainties and to provide confidence bounds for decision-making.

Besides the data uncertainty, we need to investigate the uncertainty of the model related to the output results when dealing with dynamic state prediction, which is defined as model uncertainty. The model uncertainty also plays an important role in the probabilistic prediction task, where it is used to identify the amount of output uncertainty. Intuitively, the model uncertainty refers to the uncertainty of the model parameters and network structure. The challenge of such uncertainty is to know how much the chosen combinations affect the results of dynamic state prediction in different circumstances. Thus, the aim is to investigate the degree of uncertainty of this model corresponding to the outputs. Since we consider a large number of model parameters and numerous variations of structures evaluated for different combinations, it is important to observe how the accuracy of dynamic system state prediction varies under different circumstances, e.g., days, weeks, seasons, and social factors. In addition, to tackle such system conditions, the proposed Bayesian deep learning model is developed by implementing the related parameters inside. The remainder of this paper introduces how our proposed Bayesian deep learning model can effectively handle both the data and model uncertainties in a detailed manner.

III. NEWTON-RAPHSON POWER FLOW MODEL

This section presents the problem formulation of the Newton-Raphson power flow model. As shown in Fig. 1, the AC power flow model can be utilized to obtain the corresponding state matrix for the power system after the data pre-processing stage, which is used for model training and testing in the proposed BLSTM network. The procedure of AC power flow analysis is shown below in a detailed manner [21].

A. AC Power Flow Model

Based on the dynamic state assessment mentioned in Section II, we further develop the power flow model of a general power system. Considering the nodal equations at each bus, the nodal admittance matrix in the system can be denoted as $Y$. Through a one-line diagram of the whole circuit, for each bus, the current and voltage matrices are denoted as $I$ and $V$, respectively. By Ohm's law, the nodal admittance matrix $Y$ of the entire system follows (1).

$$I = \begin{bmatrix} I_1 \\ I_2 \\ \vdots \\ I_N \end{bmatrix} = \begin{bmatrix} y_{11} & y_{12} & \cdots & y_{1N} \\ y_{21} & y_{22} & \cdots & y_{2N} \\ \vdots & \vdots & & \vdots \\ y_{N1} & y_{N2} & \cdots & y_{NN} \end{bmatrix} \begin{bmatrix} V_1 \\ V_2 \\ \vdots \\ V_N \end{bmatrix} = YV \tag{1}$$

For the representations of power injections in the system, we denote $P_i$ and $Q_i$ as the active and reactive power injection at bus $i$, respectively, which are modeled as:
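A minimal numerical sketch of (1) and of the injections that follow from it, assuming the textbook complex-power identity $S_i = V_i I_i^* = P_i + jQ_i$, which is not necessarily the exact form used in this paper. The two-bus network, its admittance and voltage values, and the function names are hypothetical and chosen only for illustration.

```python
def nodal_currents(Y, V):
    """Return I = Y V for an admittance matrix Y given as a list of rows."""
    return [sum(y_ik * v_k for y_ik, v_k in zip(row, V)) for row in Y]


def power_injections(Y, V):
    """Return (P, Q) per bus, assuming S_i = V_i * conj(I_i) = P_i + j Q_i."""
    currents = nodal_currents(Y, V)
    S = [v * i.conjugate() for v, i in zip(V, currents)]
    return [s.real for s in S], [s.imag for s in S]


# Hypothetical two-bus network: one line with series admittance
# y12 = g + jb (g >= 0, b <= 0) and no shunt elements.
y12 = complex(1.0, -5.0)
Y = [[y12, -y12],
     [-y12, y12]]

# Illustrative per-unit bus voltages in rectangular form.
V = [complex(1.0, 0.0), complex(0.98, -0.05)]

P, Q = power_injections(Y, V)
total_loss = sum(P)  # equals g * |V1 - V2|^2 for this lossy line
```

In a Newton-Raphson solver, the mismatch between such computed injections and the specified ones is what the iteration drives to zero.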
candidate for accurate state prediction. Specifically, we collect the input data from December 2015 to December 2016. For solar generation, wind generation, and residential load, the profiles during this period are captured with a time duration of 15 minutes in a cumulative manner. Furthermore, the time-varying loads fluctuate due to practical electricity usage patterns. In this paper, we collect the input power data associated with 37 European countries. Specifically, using the topology of the city power grid, the related power generation and loads are implemented. By sorting out the input data, we can extract multiple different features of the power grid. In addition, since the training and testing datasets are constructed by capturing the periodical changes of the profiles, a cross-validation step is also required in the data pre-processing stage to assess the exact separation of training and testing sets. Besides, in order to improve the robustness of the proposed deep learning model, a sufficient amount of training data is considered. Furthermore, by considering the state matrix, the extracted features can rapidly increase with the number of buses in the power system. Thus, the total number of features from the input data depends on general power network information. Besides, before we fit the input data to the BLSTM network, we first normalize the input dataset to $[0, 1]$ with min-max normalization. For the missing values in the input dataset, we utilize linear interpolation to recover the values.

B. LSTM

The pre-processed input data with different items are fed into the Bayesian deep learning model for the purpose of model training. In this paper, we focus on the BLSTM network, which is inherently a probabilistic model for handling time-series data. The network parameters in the BLSTM network are expressed by conditional probabilities, which is different from the typical LSTM network with fixed parameters. Since the input data cover the features of renewable power generation, it is crucial to capture the characteristics of RESs. Hence, our proposed BLSTM network can capture both the model uncertainty and stochastic uncertainty [10].

To illustrate the basic architecture of the proposed BLSTM network, we first introduce the structure of one LSTM cell shown in Fig. 2.

Let $z_t$ be the input of the neural network, which is formed by:

$$z_t = \mathrm{Concatenate}(R_t, X_t) \tag{15}$$

where $\mathrm{Concatenate}(\cdot)$ is the function to concatenate a list of inputs.

For the time sequence number in the LSTM model, we consider the model input as the observations over the past $L$ time slots. The vector of observations can be denoted as $z_{t-L,t} = [z_{t-L}, z_{t-L+1}, \ldots, z_t]$. The LSTM network topology contains the fully self-connected hidden layers with the memory cells and related gate units installed. LSTM units implement the hidden layers, and all units have direct connections with each other. In Fig. 2, each LSTM cell is formed as a chain structure and covers four interacting layers with individual communication links, which is different from the traditional recurrent neural network (RNN) with a single neural network structure. The key function of each LSTM cell consists of an input gate, a forget gate, and an output gate. First of all, the input gate in the LSTM network at time $t$, $i_t$, can be expressed as:

$$i_t = \sigma(z_t U_i + h_{t-1} W_i) \tag{16}$$

where $U_i$ and $W_i$ are the coefficient and weighted matrices of the input gate, respectively; and $h_{t-1}$ is the hidden state for the previous time slot. The sigmoid function $\sigma(\cdot)$ introduces the non-linearity of the hidden layers. The backup cell state of the LSTM network at time $t$, $\tilde{C}_t$, can be denoted as:

$$\tilde{C}_t = \tanh(z_t U_g + h_{t-1} W_g) \tag{17}$$

where $\tanh(\cdot)$ is the hyperbolic tangent activation function; and $U_g$ and $W_g$ are the coefficient and weighted matrices of the backup cell state, respectively. Then, the forget gate of the LSTM network at time $t$, $f_t$, can be denoted as:

$$f_t = \sigma(z_t U_f + h_{t-1} W_f) \tag{18}$$

where $U_f$ and $W_f$ are the coefficient and weighted matrices of the forget gate, respectively. The function of the memory cell is to activate the forget gate to decide whether to delete the useless information from the previous time slot. The output gate of the LSTM network at time $t$, $o_t$, can be expressed as:

$$o_t = \sigma(z_t U_o + h_{t-1} W_o) \tag{19}$$

where $U_o$ and $W_o$ are the coefficient and weighted matrices of the output gate, respectively. The function of the output gate is to prevent storing long time lag memories. Besides, the hidden state of the LSTM network at time $t$, $h_t$, can be expressed as:

$$h_t = o_t \odot \tanh(C_t) \tag{20}$$

where $\odot$ denotes the Hadamard product operation. The cell state of the LSTM network at time $t$, $C_t$, can be denoted as:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{21}$$

where $C_{t-1}$ denotes the cell state from the previous time slot, which can also be defined as the memory cell state.
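The cell update in (16)-(21) can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' PyTorch implementation: bias terms are omitted because the equations above do not show them, the weights are random, and the helper names (`vec_mat`, `lstm_step`, `random_matrix`) as well as the layer sizes are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def vec_mat(v, M):
    """Row vector v (length n) times matrix M (n x m), returned as a list."""
    return [sum(v[k] * M[k][j] for k in range(len(v))) for j in range(len(M[0]))]


def lstm_step(z_t, h_prev, c_prev, params):
    """One LSTM step; params maps a gate name to its (U, W) pair."""
    def pre_activation(gate):
        U, W = params[gate]
        return [a + b for a, b in zip(vec_mat(z_t, U), vec_mat(h_prev, W))]

    i_t = [sigmoid(x) for x in pre_activation("i")]        # input gate, (16)
    c_tilde = [math.tanh(x) for x in pre_activation("g")]  # backup cell state, (17)
    f_t = [sigmoid(x) for x in pre_activation("f")]        # forget gate, (18)
    o_t = [sigmoid(x) for x in pre_activation("o")]        # output gate, (19)
    # Cell state (21): element-wise (Hadamard) products.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    # Hidden state (20).
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_in, n_hidden, L = 3, 4, 4  # illustrative sizes only
params = {
    gate: (random_matrix(n_in, n_hidden, rng), random_matrix(n_hidden, n_hidden, rng))
    for gate in "igfo"
}

# Feed the observation window z_{t-L}, ..., z_t through the cell.
window = [[rng.random() for _ in range(n_in)] for _ in range(L + 1)]
h, c = [0.0] * n_hidden, [0.0] * n_hidden
for z in window:
    h, c = lstm_step(z, h, c, params)
```

The row-vector convention `z_t U + h_{t-1} W` mirrors the operand order used in the equations; frameworks such as PyTorch store the transposed weights and add bias vectors.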
Fig. 2. Structure of LSTM cell.

C. BLSTM Network

The utilization of LSTM network has demonstrated that the point prediction can capture the long-term dependencies
of the input dataset [23]. However, the standard LSTM network cannot capture the uncertainty. Hence, we propose the BLSTM network for probabilistic prediction tasks. Specifically, the BLSTM network involves structure and parameter uncertainties. On one hand, the structure uncertainty re‐ [...] tainty, a prior distribution in which each parameter is set as a standard normal distribution with zero mean can be placed over the weight parameters. Note that the posterior distribution is used to produce the samples of forecasts after training the BLSTM network. We define $p(W|Z_{train}, Y_{train})$ as the posterior [...] where $z$ and $y$ are the given input and output points, respectively.

For Bayesian deep learning, the network parameters are assumed to follow the posterior probability distributions. Hence, after the training process of the Bayesian neural network, the queries for the unseen data can be predicted by:

$$p(\hat{y}|\hat{z}) = E_{p(W|Z_{train}, Y_{train})}\big(p(\hat{y}|\hat{z}, Z_{train}, Y_{train})\big) = \int p(\hat{y}|\hat{z}, Z_{train}, Y_{train})\, p(W|Z_{train}, Y_{train})\, dW \tag{24}$$

where $E(\cdot)$ is the expectation over the posterior probability distribution $p(W|Z_{train}, Y_{train})$; and the hat symbol $\hat{\ }$ represents the prediction values. In this case, the Bayesian neural network is equivalent to taking the average predictions from an ensemble of neural networks weighted by the posterior probabilities of the parameters $W$.

Note that the true posterior is usually intractable for the LSTM network. Given the complexity of the posterior distribution of the network parameters, (24) is hard to evaluate due to its intractable computation. Therefore, we introduce $q_\delta(W)$ with parameter $\delta$ as an approximating variational distribution, and obtain the optimal distribution by minimizing the Kullback-Leibler (KL) divergence according to [25]. To solve this issue, variational inference can be utilized to obtain the latent parameters $\delta$ of $q_\delta(W)$ as:

$$\begin{aligned} \delta^* &= \arg\min_{\delta} \mathrm{KL}\big(q_\delta(W)\,\|\,p(W|Z_{train}, Y_{train})\big) \\ &= \arg\min_{\delta} \int q_\delta(W) \lg \frac{q_\delta(W)}{p(W)\, p(Z_{train}, Y_{train}|W)}\, dW \\ &= \arg\min_{\delta} \big[\mathrm{KL}\big(q_\delta(W)\,\|\,p(W)\big) - E_{q_\delta(W)}\big(\lg p(Z_{train}, Y_{train}|W)\big)\big] \end{aligned}$$

[...]

[Equation (27), which involves a sum over $t = 1, \ldots, T_s$ of the stochastic forward passes $f^{W_t}(z)$, is not recoverable from the source.]

where $f^{W_t}(z)$ is the stochastic forward pass with its weight in the network. Besides, given that $W_t \sim q^*_\delta(W)$ and $p(y|f^{W_t}(z)) = N(y; f^{W_t}(z), \sigma)$, $\sigma > 0$, we have the following estimator:

[Equation (29) is not recoverable from the source.]

Besides, the loss function in the BLSTM network can be defined as:

$$\mathcal{L}(\delta) = \frac{1}{T_{train}} \sum_{i=1}^{T_{train}} \left[ \frac{1}{2\sigma^2} \| y_i - f(z_i) \|^2 + \frac{1}{2} \lg \sigma^2 \right] \tag{30}$$

To combine the data and model uncertainties, we define a new expression for the output of this model by considering the predictive mean and model precision as $[\hat{z}, \hat{\sigma}^2] = f^{W_t}_{BLSTM}(z)$, where $f^{W_t}_{BLSTM}(z)$ is the proposed BLSTM network that is parameterized by $W_t \sim q^*_\delta(W)$. Thus, the final loss function can be expressed as:

$$\mathcal{L}_{BLSTM}(\delta) = \frac{1}{T_{train}} \sum_{i=1}^{T_{train}} \left[ \frac{1}{2\hat{\sigma}_i^2} \| y_i - \hat{y}_i \|^2 + \frac{1}{2} \lg \hat{\sigma}_i^2 \right] \tag{31}$$

Last but not least, the confidence interval (CI) can be calculated as:

$$\mathrm{CI} = [\hat{\mu} - t_{\beta/2}\hat{\sigma},\ \hat{\mu} + t_{\beta/2}\hat{\sigma}] \tag{32}$$

where $\hat{\mu}$ is the expectation value; and $t_{\beta/2}$ refers to the t-score in the table of the t-distribution.

D. Training Algorithm

In the proposed BLSTM network, we train the network by minimizing the loss function in (30) to adjust the predicted
results. Also, the BLSTM network is capable of capturing the uncertainty of the input data. For the process of training the BLSTM network, we incorporate the Adam optimizer [27] into the training algorithm for the LSTM network. In addition, the LSTM cell in the network is designed so that the updating complexity per time interval and weight does not depend on the size of the neural network, and the storage does not rely on the sequence length of the input data [28]. Besides, the historical data, known as the input data, can be regarded as the prior information for input training. When we apply the BLSTM network to a practical scenario, we should pre-train the network under different power system operations. Thus, the state of the general power system can be predicted through our proposed BLSTM network. Consequently, the training of the BLSTM network can be summarized as follows.
1) Collect the power system data, and separate the data into training and testing datasets.
2) Normalize the input data and prepare the sample data $\zeta$ with normal distribution.
3) Initialize network parameters: $\delta = \mu + \sigma\zeta$.
4) Perform forward and back propagation on the batch.
5) Update $\mu$ and $\sigma$ according to the gradients in the network with respect to $\delta$.
6) Fine-tune the whole network with multiple trials and output the predicted result.

V. PERFORMANCE EVALUATION

This section shows the performance evaluation of the proposed methodology. First of all, we introduce the simulation setup. Then, we present the scenarios for comparison and performance metrics for the simulation. Finally, we present and discuss the simulation results in a detailed manner.

A. Simulation Setup

In the simulation, we evaluate the performance of the proposed model with the historical power system data. We employ the IEEE 57-bus system as the testing power system [29]. Real power system data from [30] are adopted in the subsequent case studies. In particular, 13-month data from December 2015 to December 2016 are obtained, which are divided into a training dataset (the first twelve months) and a testing dataset (the remaining month) for cross-validation. The sampling period of all profiles is aggregated into 15 minutes. The historical data cover multiple entries, including solar generation, wind generation, and time-varying household load. We scale the time-varying household load to fit the particular IEEE 57-bus test system. In addition, the solar and wind generators are installed at bus 13 and 37, respectively. According to the settings of the test system [29], the installed capacities for these two renewable power generators are set to be 150 MW and 70 MW, respectively.

For the purpose of simulation, the measurements of the system data are used to obtain the actual values of voltages, phase angles, and power by the Newton-Raphson power flow model. Then, we design the BLSTM network as follows. We empirically set the sequence length $L$ (in time slots) to be 4. The epoch number is 150 and the batch size is set to be 64. The numbers of the three hidden units for the LSTM network are set to be 64, 128, and 256, respectively. The dropout rate is 0.5. The value of $\beta$ is set to be 5% to investigate the 95% confidence interval for the proposed BLSTM network. The number of samples $T_s$ is set to be 100. All the tested algorithms are implemented with Python and PyTorch [31].

B. Scenarios for Comparison

The evaluation of our proposed BLSTM network is based on the comparison with other typical prediction techniques. In this paper, we introduce six widely-adopted techniques for baseline comparisons, including multiple linear regression (MLR) [32], ANN [11], LSTM [16], unscented Kalman filter (UKF) [2], quantile regression (QR) [33], and quantile random forest (QRF) [34]. Indeed, these tools have already been widely used in power system applications. The former three techniques are point prediction techniques, while the latter three are probabilistic models. Our proposed BLSTM and LSTM network can capture both the data and model uncertainties simultaneously. The baseline techniques are fine-tuned for optimal parameter configurations to produce the best prediction results.

C. Performance Metrics

To assess the prediction accuracy of the proposed BLSTM network, four performance metrics are employed, which are shown as:

$$\mathrm{MSE} = \frac{1}{N_T} \sum_{t=1}^{N_T} (y(t) - \hat{y}(t))^2 \tag{33}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N_T} \sum_{t=1}^{N_T} (y(t) - \hat{y}(t))^2} \tag{34}$$

$$\mathrm{MAE} = \frac{1}{N_T} \sum_{t=1}^{N_T} |y(t) - \hat{y}(t)| \tag{35}$$

$$\mathrm{MAPE} = \frac{1}{N_T} \sum_{t=1}^{N_T} \frac{|y(t) - \hat{y}(t)|}{|y(t)|} \times 100\% \tag{36}$$

where $y(t)$ and $\hat{y}(t)$ are the actual and predicted values at time $t$, respectively; and $N_T$ is the number of sampling periods. These metrics, namely mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), are used to compare the predicted value with the actual value for point prediction. For these metrics, a smaller value indicates a better performance in the prediction.

D. Simulation Result

1) Comparison of Different Prediction Methods

As previously introduced in Section V-B, we evaluate the proposed BLSTM network against the other six baseline methods. Table I presents the four performance metrics for these prediction methods. It is apparent that our proposed BLSTM model outperforms the rest of the prediction methods with the smallest values in MSE, RMSE, MAE, and MAPE. Even though both UKF and QRF have relatively low MSE and RMSE, their MAE and MAPE are both much higher than those of BLSTM, which indicates larger errors. In addition, we can observe that since the BLSTM network considers both the model and data uncertainties, the predicted result can be better even for the point estimation. Owing to the capability of the LSTM network to capture long-term dependencies of time-series input data, the proposed BLSTM network can help further enhance the prediction accuracy of the output results. The time complexity of the proposed BLSTM thus only depends on the number of sampling periods $N_T$ and the number of features $N_F$. Hence, the time complexity of BLSTM is defined as $\mathcal{O}(N_T N_F)$. It is apparent that a lower time complexity reflects a more effective approach.

[...] while, we capture both the phase angle and the voltage magnitude of this bus. The normalized results are presented in Figs. 4 and 5. In Fig. 4, it is obvious that the 95% lower and upper confidence bounds fit the trend of the real values of the phase angle at bus 13 well. Besides, in Fig. 5, most of the predicted values are close to the real ones of the voltage magnitude at bus 13. In addition, similar results can also be demonstrated at bus 37, which is integrated with wind power generation. Furthermore, by observing Figs. 4 and 5, although the predicted values cannot fit well with the sudden fluctuations of the real values, the confidence bounds of the BLSTM network show the effectiveness of capturing such features.
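As a rough illustration of how the point metrics in (33)-(36) and a confidence band of the form (32) could be computed from $T_s$ sampled predictions per time step, consider the sketch below. It does not reproduce the authors' experiment: the data are synthetic Gaussian draws standing in for stochastic forward passes, the function names are hypothetical, and the t-score 1.984 is an assumed table value for a two-sided 95% interval with 99 degrees of freedom.

```python
import math
import random
import statistics


def point_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, and MAPE (in percent) for two equal-length sequences."""
    n = len(y_true)
    errors = [a - p for a, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, y_true)) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}


def confidence_interval(samples, t_score):
    """Interval mean -/+ t_score * std from the samples of one time step."""
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples)
    return mu - t_score * sigma, mu + t_score * sigma


rng = random.Random(0)
T_S = 100        # number of sampled predictions per time step
T_SCORE = 1.984  # assumed t-table value (beta = 5%, 99 degrees of freedom)

# Synthetic normalized "real values" and noisy sampled predictions around them.
y_true = [0.50 + 0.05 * math.sin(0.3 * t) for t in range(48)]
samples = [[y + rng.gauss(0.0, 0.01) for _ in range(T_S)] for y in y_true]

y_pred = [statistics.fmean(s) for s in samples]                # predictive mean
bounds = [confidence_interval(s, T_SCORE) for s in samples]    # lower/upper bounds
metrics = point_metrics(y_true, y_pred)

# Fraction of real values that fall inside the confidence band.
coverage = sum(lo <= y <= hi for y, (lo, hi) in zip(y_true, bounds)) / len(y_true)
```

Note that MAPE is undefined whenever an actual value is zero, so data normalized to $[0, 1]$ may need those points excluded or handled separately.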
[Figure legend: 95% upper confidence bound, 95% lower confidence bound, real values, predicted values.]

TABLE I
COMPARISON OF DIFFERENT PREDICTION TECHNIQUES IN IEEE 57-BUS SYSTEM