Machine Learning Approach For Predictive Maintenance in Hydroelectric Power Plants

The document discusses using machine learning approaches for predictive maintenance in hydroelectric power plants. Specifically, it proposes two deep learning models for anomaly detection at the Peña Blanca hydroelectric power plant: a deep neural network with logistic regression to classify various failure types, and an LSTM neural network with autoencoder to classify flaws. The models aim to enable early failure detection to save time and money compared to traditional preventive maintenance approaches.


Machine Learning Approach for Predictive Maintenance in Hydroelectric Power Plants

Victor Velasquez
Facultad de Ingeniería
Universidad Tecnológica Centroamericana
Tegucigalpa, Honduras

Wilfredo Flores
Facultad de Ingeniería
Universidad Tecnológica Centroamericana
Tegucigalpa, Honduras

2022 IEEE Biennial Congress of Argentina (ARGENCON) | 978-1-6654-8014-7/22/$31.00 ©2022 IEEE | DOI: 10.1109/ARGENCON55245.2022.9939782

Abstract- The future of the hydropower industry has as key elements the optimization of operation and maintenance, cost reduction, and increased reliability. This means greater challenges in the operation of hydroelectric power plants and, therefore, greater demands on maintenance. With technological advances and their role in the industrial sector through the fourth industrial revolution, or Industry 4.0, artificial intelligence and machine learning applications enable the development and modernization of current maintenance techniques in hydropower plants through condition monitoring, fault diagnosis and predictive maintenance; early detection can thus save considerable time and money. In this study, two techniques are proposed to enable predictive maintenance in the Peña Blanca hydroelectric power plant, using two deep learning models for anomaly detection. The first consists of a Deep Neural Network with Logistic Regression to classify various types of failures; the second is a Recurrent Long Short-Term Memory (LSTM) neural network with an Autoencoder, also used to classify various faults. The first model was found to generalize across several types of failures, while the LSTM model fits better when detecting high temperatures on generator bearings, since this failure occurred frequently during the study.

Keywords - predictive maintenance, anomaly detection, machine learning, condition monitoring, hydropower.

I. INTRODUCTION
Maintenance involves a set of techniques and activities applied to equipment to prevent, correct, and reduce failures that affect its optimal operation in performing the tasks for which it was designed. Good maintenance minimizes emergency costs and shutdowns due to failures or repairs, and maximizes the investment in the system by prolonging its useful life. There are three main types of maintenance techniques: Preventive, Corrective and Predictive.

Preventive Maintenance consists of a series of precautionary activities carried out on a periodic basis to ensure continuous and safe operation. It is carried out under normal operating conditions and thus avoids unexpected failures and costly consequences. In contrast, Corrective Maintenance consists of correcting a failure that has occurred in any of the equipment, and seeks to restore operation in the shortest time and at the lowest possible cost. This type of maintenance has the disadvantage that a small failure that could have been avoided with preventive maintenance may, once it occurs, cause further damage to the equipment or machinery, increasing the number of activities to be performed and the repair time, which is critical.

Finally, Predictive Maintenance (PdM) is a series of techniques to diagnose possible future failures through periodic inspections and, mainly, data analysis of machinery and equipment in operation. This type of maintenance requires diagnostic equipment, sensors, and tests performed without interrupting generation to detect anomalies and diagnose potential failures.

In hydropower plants, periodic planned or preventive maintenance has long been the primary, if not the only, maintenance method adopted. Shutting down hydroelectric units at regular intervals based on operating hours, and repairing a component only when damage is detected, results in arbitrary, unplanned, and costly outages. New concepts, methods, and models such as machine learning (ML), cyber-physical systems (CPS) and the internet of things (IoT) are gaining attention in the industrial sector and, with the help of big data analytics, play a greater role thanks to the massive availability of data to be exploited. Digitization and analytics open the door to new ways of addressing many of the current challenges in the energy field. Lukač recognizes the introduction of IoT to industry through a combination of software, sensors and intelligent control units that move towards the improvement of industrial processes [1].

This study focuses on the search for Predictive Maintenance techniques at the Peña Blanca Hydroelectric Power Plant. Two ML models are proposed, based on historical data from the plant, that are fitted to predict a failure and classify its type. Data processing, analysis, and feature engineering were done in Python using NumPy, Matplotlib, Pandas, Plotly and TensorFlow for machine learning.

A. Background
There are few documented works related to the practice of Predictive Maintenance in hydroelectric power plants, and most of the studies reviewed are very recent. Among them is a work done in Italy in which a metric called Key Performance Indicator (KPI) is proposed, using an unsupervised self-organizing map neural network [2]. Implementing this model revealed more than 20 anomalous situations, including cases that the standard control system did not report.

978-1-6654-8014-7/22/$31.00 ©2022 IEEE

Authorized licensed use limited to: Universidad Nacional de Colombia (UNAL). Downloaded on April 04,2024 at 20:48:31 UTC from IEEE Xplore. Restrictions apply.
Another interesting work was done for a 56 MW pumped-storage hydroelectric power plant with Francis turbines in Norway. A long short-term memory (LSTM) neural network was used for anomaly detection with variables such as bearing temperatures and vibration [3]. The neural network predicts the temperature one hour into the future and is used in conjunction with the rate of variation of the bearing temperature.

II. METHOD
The process followed to arrive at a set of Predictive Maintenance techniques for the Peña Blanca plant was as follows:

A. Dataset
The data collected consisted of various time series signals from sensors on the different plant components, together with the corresponding event (fault) list. All data were exported from the plant's SCADA system. Data from 1 July 2021 to 31 December 2021 were used in this research. The objective was to obtain models capable of predicting and classifying a fault or normal behavior based on sensor data. The collected data were therefore preprocessed and feature-engineered to obtain a series of inputs and outputs for classifying failures or normal behavior.

B. Data preprocessing
Data were exported separately: first the faults in the event list, then the data for the Turbine and the Hydraulic Power Unit, and finally the data for the Generator. All the data therefore had to be put together in a single table. For this, the Pandas function "merge_asof" was used, which takes two data frames and joins them on the nearest value of a common key. Since we were dealing with time series, the common key was "time", in year-month-day-hour-minute format. From the data corresponding to the plant components, 18 variables were selected as inputs:

TABLE I
INPUT VARIABLES

Generator Winding Temp W Phase [°C]           Guide Vane Opening [%]
Phase Voltage L1-L2 (Generator) [V]           Hydraulic Power Unit Pressure [bar]
Phase Voltage L2-L3 (Generator) [V]           Pressure before Butterfly Valve [bar]
Phase Voltage L1-L3 (Generator) [V]           Pressure after Butterfly Valve [bar]
Radial Bearing Temperature (Generator) [°C]   Line Current L1 [A]
Axial Bearing Temperature (Generator) [°C]    Line Current L2 [A]
LA Bearing Temperature (Generator) [°C]       Line Current L3 [A]
Generator Winding Temp U Phase [°C]           Frequency [Hz]
Generator Winding Temp V Phase [°C]           Active Power [kW]

To predict and classify the failures contained in the event list, the output variables had to be failures related to the 18 input variables, that is, to the plant components where the sensors were located (turbine, generator, and hydraulic power unit). Data from failures that were not specific to the systems of interest for this research were therefore eliminated. A histogram of the failures of these plant components during the investigation period was made:

Fig. 1 Failures in Peña Blanca power plant for the research components.

The same merging procedure was then carried out again, this time to join the 18 input variables with the data from the Event List as a function of time. The Event List data were categorical variables.

C. Feature Engineering
This is a critical phase prior to training a model, as it produces extra features or variables that are very useful for training. In this phase, four types of modifications were made to the data to train the model optimally.

1) Addition of the Normal Behavior element for the model output: after merging the 18 variables with the Event List (failures), there were missing values because not every minute had a failure; many elements in the event column therefore appeared as unknown values, which Python represents as NaN. If no failure had occurred at a given time, the plant was operating normally. Since faults do not happen often but only at certain times, most of the data corresponded to "Normal Behavior".

2) Adjustment of the representative data for the Axial Bearing Temperature failure: fortunately, the limits for the winding and bearing temperatures in the Generator were known. A comparative analysis between the data labelled as Normal Behavior and the data labelled as High Axial Bearing Temperature failure showed an inconsistency.

Fig. 2 Display of the inconsistency presented under Normal Behavior.
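As an illustration of the merge in Section II-B and of step 1 in Section II-C, the following is a minimal sketch rather than the authors' code. The column names, the sample values and the 30-second tolerance are assumptions made for the example; only `merge_asof`, the "time" key and the "Normal Behavior" label come from the text.

```python
import pandas as pd

# Hypothetical one-minute sensor readings (two of the 18 input variables).
sensors = pd.DataFrame({
    "time": pd.to_datetime([
        "2021-07-01 00:00", "2021-07-01 00:01",
        "2021-07-01 00:02", "2021-07-01 00:03",
    ]),
    "axial_bearing_temp_c": [58.2, 61.0, 65.4, 66.1],
    "active_power_kw": [2400.0, 2410.0, 2395.0, 2380.0],
})

# Hypothetical event list, exported separately: one row per recorded fault.
events = pd.DataFrame({
    "time": pd.to_datetime(["2021-07-01 00:02"]),
    "event": ["Alarm from Generator axial bearing temperature"],
})

# merge_asof requires both frames to be sorted on the key.
sensors = sensors.sort_values("time")
events = events.sort_values("time")

# Backward as-of join: each sensor row takes the latest event at or before
# its timestamp, but only if it is at most 30 s old (assumed tolerance), so
# an event is attached only to the minute in which it was logged.
merged = pd.merge_asof(sensors, events, on="time",
                       tolerance=pd.Timedelta("30s"))

# Minutes without a logged fault come back as NaN; label them as normal.
merged["event"] = merged["event"].fillna("Normal Behavior")
print(merged)
```

Without a tolerance, a backward as-of join would carry the last fault forward to every later minute, so some bound of this kind is needed for the NaN rows described in step 1 to appear.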

i. From Fig. 2 it was noticeable that many values of the Generator Axial Bearing Temperature exceeded the alarm threshold of 65 °C but were recorded in the Events column as normal behavior rather than as a failure.
ii. To solve this problem, the Events column was relabelled as a failure for every record in which the Axial Bearing Temperature input variable exceeded 64.9 °C.

Fig. 3 Graph of the Axial Bearing Temperature representative of a failure.

3) Categorical to binary variable change in the event column: a machine learning model cannot be fed categorical variables directly, so these values had to be converted into binary numbers, i.e. values of 0 or 1.

4) Oversampling and undersampling techniques: since the vast majority of the data corresponded to normal behavior, any machine learning model would tend to predict "Normal Behavior" regardless of its input. In other words, the sample of plant data was not balanced: it was representative of how the plant operates, but not of the problem addressed in this study, which is to predict a failure or anomaly. For undersampling, random selections were taken from the data corresponding to normal behavior and then sorted chronologically. For oversampling, a technique called ADASYN was used, which upsamples classes that have few values by creating synthetic data. Oversampling was applied to the data corresponding to failures, as they represented a minority, which is harder for machine learning models to learn [4].

III. MODELS
The first model is a Deep Feed Forward Neural Network with two hidden layers and multi-class outputs (fault type or normal behavior) obtained by Logistic Regression. The second proposed model is a Recurrent Neural Network with a Long Short-Term Memory (LSTM) topology, also with multi-class outputs (fault type or normal behavior) obtained by Logistic Regression.

A. Deep Feed Forward Neural Network
This is perhaps the most famous and widely used technique in the machine learning arsenal. A schematic diagram such as the one in Fig. 4 is commonly used for neural networks. On the left side, the input vector x enters the network. This is called the input layer, and it has a number of neurons equal to the number of elements in the input vector. On the right side, the output vector y exits the network. This is called the output layer, and it has a number of neurons equal to the number of elements in the output vector; it is the result of the computation. Between them are several hidden layers. The number of hidden layers and the number of neurons in each depend on the user's choice and are called the topology of the neural network [5].

Fig. 4 Deep Neural Network composition.

A neural network with two hidden layers, therefore, looks as follows:

$$y = W_3\, f\big(W_2\, f(W_1 x + b_1) + b_2\big) + b_3 \qquad (1)$$

where $f$ is the activation function and $W_i$, $b_i$ are the weights and biases of layer $i$.

The first proposed model comprises an input layer with the 18 variables, two hidden layers of 54 and 36 nodes, a Dropout layer, which helps to avoid overfitting, and finally an output layer with 17 nodes representing the type of fault or normal behavior. A flowchart of the model process is presented below:

Fig. 5 Deep Neural Network with Logistic Regression.
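To make the layer sizes of the first model concrete, the sketch below runs a forward pass through an 18-54-36-17 feed-forward network in plain NumPy. It is an illustration only, not the authors' TensorFlow model: the ReLU activations, the sigmoid output (one logistic unit per class), the random initial weights and the position of the dropout step after the second hidden layer are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    """Return randomly initialised (weights, biases) for one layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes from the text: 18 inputs, hidden layers of 54 and 36 nodes,
# and 17 outputs (fault types plus normal behavior).
W1, b1 = dense(18, 54)
W2, b2 = dense(54, 36)
W3, b3 = dense(36, 17)

def forward(x, drop_rate=0.0):
    """Forward pass; drop_rate > 0 applies inverted dropout (training only)."""
    h1 = relu(x @ W1 + b1)
    h2 = relu(h1 @ W2 + b2)
    if drop_rate > 0.0:
        mask = rng.random(h2.shape) >= drop_rate
        h2 = h2 * mask / (1.0 - drop_rate)
    # Assumed output: one logistic (sigmoid) unit per class.
    return sigmoid(h2 @ W3 + b3)

batch = rng.normal(size=(4, 18))   # four hypothetical input records
probs = forward(batch)
print(probs.shape)
```

At inference time `drop_rate` stays at zero, so dropout only perturbs the hidden activations during training.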

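Steps 2 to 4 of the Feature Engineering phase (Section II-C) can be sketched with Pandas as follows. This is a hedged illustration with made-up data: the column names, the number of normal rows kept and the order in which the steps are applied are assumptions, and ADASYN oversampling [4] is not shown.

```python
import pandas as pd

ALARM_THRESHOLD_C = 64.9  # threshold quoted in the text
ALARM_LABEL = "Alarm from Generator axial bearing temperature"

# Hypothetical merged table: one row per minute, mostly normal behavior.
df = pd.DataFrame({
    "time": pd.date_range("2021-07-01", periods=8, freq="min"),
    "axial_bearing_temp_c": [58.0, 59.5, 66.2, 65.0, 60.1, 61.3, 57.9, 58.8],
    "event": ["Normal Behavior"] * 7 + ["HPU oil pressure low"],
})

# Step 2: relabel every row whose axial bearing temperature exceeds the
# threshold, even if the event list recorded it as normal.
df.loc[df["axial_bearing_temp_c"] > ALARM_THRESHOLD_C, "event"] = ALARM_LABEL

# Step 4 (undersampling only): keep a random subset of the normal rows,
# then restore chronological order.
normal = df[df["event"] == "Normal Behavior"].sample(n=3, random_state=0)
faults = df[df["event"] != "Normal Behavior"]
balanced = pd.concat([normal, faults]).sort_values("time").reset_index(drop=True)

# Step 3: turn the categorical event column into one 0/1 column per class.
targets = pd.get_dummies(balanced["event"], dtype=int)
print(targets)
```

Sorting after sampling matters because the second model consumes the records as a time series.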
B. LSTM Long Short-Term Memory Recurrent Neural Network
This network is very similar in spirit to the normal, so-called "feed forward", neural network. Instead of having connections that move strictly from left to right in the diagram, it also includes connections that form cycles. These cycles act as a memory by retaining, in some transformed form, the inputs received at previous times. The Long Short-Term Memory (LSTM) network is the protagonist of the current state of the art in recurrent neural networks. This network is made up of cells as follows:

Fig. 6 Diagram of an LSTM (Long Short-Term Memory) recurrent neural network cell.

The inner working of each cell is defined by the following equations:

(2)

The different W and b are the weights and biases of the LSTM cell. The weights and biases of the whole network must be adjusted by the learning algorithm. The evolution of the internal state h(t) is the network's memory, which can encapsulate the temporal dynamics of the data. The network is trained by providing it with a time series x(t) for many successive values of time t, and the model outputs the same time series at a later time t + T, where T is the forecast horizon. This is the essence of how the LSTM can forecast time series [5].

The second proposed model is more complex. Its architecture consists of an input layer with the 18 input variables, three encoder hidden layers of 144, 72 and 36 nodes respectively, an intermediate layer that repeats its input 20 times to obtain the highest possible accuracy, and then three decoder layers of 36, 72 and 144 nodes respectively [6].

Finally, the last layer applies a Logistic Regression to produce the output variables. It is worth mentioning that the output layer has only 10 nodes, as opposed to 17 in the first model, as a result of the undersampling and oversampling techniques described in the Method. In addition, the model contains intermediate Batch Normalization layers ("BatchNormalization"), which normalize each batch entering the model with its own mean and standard deviation; this helps to optimize the learning process and to use fewer epochs. The flowchart of the model process is shown below:

Fig. 7 LSTM model process

Parameters: LSTMs require some specific steps in data preparation. The input to an LSTM is a three-dimensional array created from the time series data. It is three-dimensional because the data are divided into temporal windows, also known as horizons, which allows the model to store memory of each window. The length of a window is user-defined. For the model in this work, a window of 20 entries with 18 variables was defined, as can be seen in Fig. 7, hence the dimension "(None, 20, 18)". The architecture of this neural network was guided by the work in [7] on classifying extremely infrequent events.

IV. RESULTS AND DISCUSSION
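The three-dimensional input described under Parameters in Section III-B can be illustrated with a short NumPy sketch. It assumes overlapping windows that advance one record at a time, which the text does not specify; the function name and the dummy data are hypothetical. The leading `None` in "(None, 20, 18)" corresponds to the number of windows, which depends on the length of the series.

```python
import numpy as np

def make_windows(series, lookback):
    """Stack overlapping windows.

    series: array of shape (n_steps, n_features)
    returns: array of shape (n_steps - lookback + 1, lookback, n_features)
    """
    n_steps = series.shape[0]
    if n_steps < lookback:
        raise ValueError("series is shorter than the lookback window")
    return np.stack(
        [series[i:i + lookback] for i in range(n_steps - lookback + 1)]
    )

# Dummy data: 100 one-minute records of the 18 input variables.
series = np.arange(100 * 18, dtype=float).reshape(100, 18)

# Windows of 20 entries, as in the model described in the text.
windows = make_windows(series, lookback=20)
print(windows.shape)
```

With non-overlapping windows the same series would yield only five samples, so the choice of stride directly affects how much training data the LSTM sees.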

The error between the outputs of the neural network and the true values was first evaluated on the training data, as well as the error with respect to the true outputs on the validation data.

A. Deep Feed Forward Neural Network

Fig. 8 Neural Network error from training and validation to true values

The error decreases as the model passes over the data during the successive training epochs. However, it only decreases to a value of about 1, which is acceptable but not entirely satisfactory. The accuracy is plotted next:

Fig. 9 Accuracy of the Neural Network.

This evaluation parameter delivers an acceptable result: despite the complication of unbalanced data, an accuracy of almost 70% was achieved after the model had observed the data for 200 epochs. The classification metrics were as follows:

                                                     precision    recall  f1-score   support
Turbine blade is rotated                                  0.00      0.00      0.00      2.00
Normal Behavior                                           0.75      0.72      0.73     50.00
Overspeed Limit 1 in Generator                            0.00      0.00      0.00      2.00
Alarm from Generator axial bearing temperature NLA        0.71      0.85      0.77     20.00
Alarm from Generator winding U phase                      0.00      0.00      0.00      3.00
Alarm from Generator winding V phase                      0.00      0.00      0.00      3.00
Alarm from Generator winding W phase                      0.00      0.00      0.00      1.00
HPU Pump motor protection switch triggered                0.33      0.20      0.25      5.00
HPU oil pressure low                                      0.14      0.11      0.12      9.00
Emergency stop pushbutton Protection                      0.53      0.45      0.49     22.00
Asymmetry protection in generator                         0.75      0.75      0.75      8.00
Low excitation voltage protection in generator            0.00      0.00      0.00      4.00
Surge protection in generator U>/U>>                      0.00      0.00      0.00      9.00
Minimum limit 1 penstock pressure turbine                 1.00      0.67      0.80      3.00
Minimum limit 2 penstock pressure turbine                 0.00      0.00      0.00      0.00
By-Pass valve time exceeded                               1.00      0.40      0.57      5.00
Butterfly valve not open                                  0.00      0.00      0.00      3.00

micro avg                                                 0.66      0.50      0.57    149.00
macro avg                                                 0.31      0.24      0.26    149.00
weighted avg                                              0.54      0.50      0.51    149.00
samples avg                                               0.50      0.50    149.00    149.00

Fig. 10 FF Neural Network classification metrics.

The model is most reliable in predicting faults such as Generator Axial Bearing temperature, Generator Asymmetries and Turbine Forced Pressures, all with precision above 70%. It also predicts Normal Behavior with precision above 70%.

B. LSTM Long Short-Term Memory Recurrent Neural Network
The error between the outputs of the neural network and the true values was first evaluated on the training data, as well as the error with respect to the true outputs on the validation data.

Fig. 11 LSTM Recurrent Neural Network error from training and validation compared to true values

Despite showing more fluctuations during training, this model has better metrics: after 50 epochs both errors converge to values below 0.3. The accuracy percentage is plotted in Fig. 12.
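The precision, recall and f1-score columns of the classification tables follow the usual per-class definitions. The sketch below computes them in plain Python for a toy example and checks two f1 values from Fig. 10 against their reported precision and recall; the label names and toy predictions are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def precision_recall_f1(y_true, y_pred, label):
    """Per-class precision, recall and F1 for single-label predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = harmonic_mean(precision, recall)
    return precision, recall, f1

def harmonic_mean(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Toy example with two classes.
y_true = ["normal", "normal", "alarm", "alarm", "alarm", "normal"]
y_pred = ["normal", "alarm", "alarm", "alarm", "normal", "normal"]
p, r, f1 = precision_recall_f1(y_true, y_pred, "alarm")
print(p, r, f1)

# Consistency check against two rows of Fig. 10.
print(round(harmonic_mean(0.75, 0.72), 2))  # Normal Behavior
print(round(harmonic_mean(0.71, 0.85), 2))  # axial bearing temperature alarm
```

Because the macro average weights every class equally, the many fault classes with few or no test samples pull it far below the weighted average in both tables.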

Fig. 12 LSTM Recurrent Neural Network's accuracy

In this case very positive values are obtained: an accuracy of more than 90% is achieved on the training data and about 88% on the validation data. The metrics of the LSTM model were the following:

                                                     precision    recall  f1-score   support
Turbine blade is rotated                                  0.00      0.00      0.00         5
Normal Behavior                                           0.97      0.67      0.79      5260
Alarm from Generator axial bearing temperature NLA        0.99      1.00      0.99      2350
Alarm from Generator winding U phase                      0.00      0.00      0.00         0
Alarm from Generator winding V phase                      0.00      0.00      0.00         0
Alarm from Generator winding W phase                      0.00      0.00      0.00         0
HPU Pump motor protection switch triggered                0.00      0.00      0.00        12
HPU oil pressure low                                      0.00      0.00      0.00        11
Emergency stop pushbutton Protection                      0.00      0.08      0.00        51
Surge protection in generator U>/U>>                      0.00      0.00      0.00        12

micro avg                                                 0.76      0.76      0.76      7709
macro avg                                                 0.20      0.17      0.18      7709
weighted avg                                              0.96      0.76      0.84      7709
samples avg                                               0.76      0.76      0.76      7709

Fig. 13 LSTM model classification metrics

The LSTM model predicts Generator Bearing temperature failures almost perfectly.

V. CONCLUSIONS
Predictive maintenance (PdM) techniques were implemented for early detection of anomalies by means of fault classification or time series behavior, using control variable data from the Peña Blanca hydroelectric power plant. The simple (feed-forward) neural network can predict, with an accuracy higher than 70%, faults such as high temperatures in the Generator Bearing, asymmetries in the Generator and forced pressures in the turbine, as well as Normal Behavior. The Recurrent Neural Network predicts Generator Axial Bearing temperature failures with a precision of 99%. It also shows a promising capacity to predict other types of failures, such as low pressure in the Hydraulic Power Unit, Emergency Shutdowns and forced pressures in the Turbine piping: these failure types were trained on oversampled synthetic data, and a sample containing more real occurrences of them could enable the model to predict them.

For future research, it would be better to define the limits and alarm conditions for other types of faults. This will allow more control over plant operation and give predictive maintenance a better footing, as machine learning models such as LSTMs are suited to detecting failures other than high bearing temperatures.

REFERENCES
[1] Lukač, D. (2015). The fourth ICT-based industrial revolution "Industry 4.0" - HMI and the case of CAE/CAD innovation with EPLAN P8. 2015 23rd Telecommunications Forum Telfor (TELFOR), 835–838. https://doi.org/10.1109/TELFOR.2015.7377595
[2] Betti, A., Crisostomi, E., Paolinelli, G., Piazzi, A., Ruffini, F., & Tucci, M. (2021). Condition monitoring and predictive maintenance methodologies for hydropower plants equipment. Renewable Energy, 171, 246–253. https://doi.org/10.1016/j.renene.2021.02.102
[3] Yuan, J., Wang, Y., & Wang, K. (2019). LSTM Based Prediction and Time-Temperature Varying Rate Fusion for Hydropower Plant Anomaly Detection: A Case Study. In K. Wang, Y. Wang, J. O. Strandhagen, & T. Yu (Eds.), Advanced Manufacturing and Automation VIII (Vol. 484, pp. 86–94). Springer Singapore. https://doi.org/10.1007/978-981-13-2375-1_13
[4] He, H., Bai, Y., Garcia, E., & Li, S. (2008). ADASYN: Adaptive Synthetic Sampling Approach for Imbalanced Learning. 1322–1328. https://doi.org/10.1109/IJCNN.2008.4633969
[5] Bangert, P. (2021). Machine Learning and Data Science in the Power Generation Industry: Best Practices, Tools, and Case Studies. Elsevier. https://www.scribd.com/book/490925349/Machine-Learning-and-Data-Science-in-the-Power-Generation-Industry-Best-Practices-Tools-and-Case-Studies
[6] Abirami, S., & Chitra, P. (2020). Chapter Fourteen - Energy-efficient edge based real-time healthcare support system. In P. Raj & P. Evangeline (Eds.), Advances in Computers (Vol. 117, pp. 339–368). Elsevier. https://doi.org/10.1016/bs.adcom.2019.09.007
[7] Ranjan, C. (2022, February 17). LSTM Autoencoder for Extreme Rare Event Classification in Keras. Medium. https://towardsdatascience.com/lstm-autoencoder-for-extreme-rare-event-classification-in-keras-ce209a224cfb