Dissertation NILM
LOW-FREQUENCY UNSUPERVISED
NON-INTRUSIVE LOAD MONITORING FOR
INDUSTRIAL LOADS
September 2023
Faculty of Sciences and Technology
of the University of Coimbra
Master's dissertation in the scientific area of Electrical and Computer Engineering advised by
Professor Jérôme Mendes and co-advised by Professor Cristiano Premebida and presented to the
Department of Electrical and Computer Engineering of the Faculty of Sciences and Technology
of the University of Coimbra.
Acknowledgements
Abstract
The industrial sector is responsible for a large share of global energy consumption.
Lowering energy consumption in the industrial sector can reduce the rate and severity of
future climate change impacts on people and ecosystems. Non-Intrusive Load Monitoring
(NILM) techniques disaggregate a facility’s power consumption into its individual
loads, that is, into the power consumption of each piece of equipment in the facility. NILM
methods do not require a dedicated sensor for each piece of equipment. These methods provide
information that can be used to define strategies for optimal energy usage in a facility
and lead to a decrease in operating costs in the industrial sector. This dissertation aimed
to develop a NILM algorithm to be part of an intelligent platform for the management of
microalgae production within the scope of the InGestAlgae project (reference: CENTRO-
01-0247-FEDER-046983) developed at the Institute of Systems and Robotics (ISR) of
the University of Coimbra. The requirements called for an unsupervised,
non-event-based method that is compatible with low-frequency samples and can be deployed in
environments with continuously varying equipment. The developed technique is required
to estimate the active power consumption of the equipment in an industrial facility.
The method can access the values from the Supervisory Control and Data Acquisition
(SCADA) system, which include the aggregate active power and the equipment ON/OFF state data. Two
unsupervised low-frequency NILM methods were proposed, implemented and validated.
The first method uses polynomial functions, estimated through a metaheuristic algorithm,
to model the active power consumption of the equipment as a function of the aggregate
active power. The second technique consists of an Unsupervised Neural Network (UNN)
that estimates the active power of the equipment based on the optimization of an objective
function and does not require labelled training data. The UNN algorithm was trained and
tested with two different architectures and sets of inputs. The first UNN uses the aggregate
active power and equipment state samples, and the second network uses the aggregate
active power samples passed through a Fourier feature mapping. The High-resolution
Industrial Production Energy (HIPE) and the Industrial Machines Dataset for Electrical
Load Disaggregation (IMDELD) datasets were preprocessed and used to train and test
the proposed methods. The UNN with the aggregate active power and the equipment
state samples as input achieved the lowest error values on the testing data, as measured by
the Mean Absolute Error (MAE), the Mean Squared Error (MSE) and the
Root Mean Squared Error (RMSE). The UNN method successfully
identified high-consumption equipment.
Keywords: NILM, unsupervised, low-frequency, industrial loads, source separation,
non-event-based, optimization, polynomial function, unsupervised neural network.
Resumo
Contents
List of Figures
1 Introduction
1.1 Context
1.2 Motivation
1.3 Objectives
1.4 Contributions
1.5 Dissertation Outline
2 Background
2.1 NILM Concepts
2.1.1 Energy Disaggregation
2.2 General Concepts
2.2.1 SCADA Systems
2.3 Review of NILM Dataset
2.3.1 NILM Datasets
2.3.2 HIPE Dataset
2.3.3 IMDELD Dataset
2.4 Mathematical and Computational Concepts
2.4.1 Polynomial Functions
2.4.2 Optimization
2.4.2.1 Particle Swarm Optimization
2.4.3 Neural Networks
2.4.4 Fourier Series
4 Methodology
4.1 Dataset Preprocessing
4.1.1 HIPE Dataset
4.1.2 IMDELD Dataset
4.2 Methods
4.2.1 EMUPF Method
5 Results
5.1 Dataset Preprocessing
5.1.1 HIPE Dataset
5.1.2 IMDELD Dataset
5.2 Methods Evaluation
5.2.1 HIPE Dataset Results
5.2.1.1 MAE Values for the EMUPF Method
5.2.1.2 MAE Values for the UNN Method
5.2.1.3 MAE Values for the UNN Method with Fourier Mapping
5.2.2 IMDELD Dataset Results
5.2.2.1 Error Measures of the EMUPF Method
5.2.2.2 Error Measures of the UNN Method
5.2.2.3 Error Measures of the UNN Method with Fourier Mapping
5.3 Descriptive Statistical Analysis
5.3.1 HIPE Dataset Results
5.3.1.1 EMUPF Method Statistical Analysis
5.3.1.2 UNN Method Statistical Analysis
5.3.1.3 UNN Method with Fourier Mapping Statistical Analysis
5.3.2 IMDELD Dataset Results
5.3.2.1 EMUPF Method Statistical Analysis
5.3.2.2 UNN Method Statistical Analysis
5.3.2.3 UNN Method with Fourier Mapping Statistical Analysis
5.4 Chapter Summary
Appendix A Definitions
A.1 Equipment Type
A.2 Event
A.3 Industrial Sector
A.4 Load Classification
A.5 Low-Frequency
A.6 State
A.7 Source Separation
A.8 Unsupervised
Appendix B Results
B.1 Results from the Preprocessing of the HIPE Dataset
B.2 Estimated Equipment Active Power Values
B.2.1 Estimations for the HIPE Dataset
B.2.1.1 Estimations by the EMUPF Method
B.2.1.2 Estimations by the UNN Method
B.2.1.3 Estimations by the UNN Method with Fourier Mapping
B.2.2 Estimation for the IMDELD Dataset
B.2.2.1 Estimations by the EMUPF Method
B.2.2.2 Estimations by the UNN Method
B.2.2.3 Estimations by the UNN Method with Fourier Mapping
B.3 MSE and RMSE Values for the HIPE Dataset
B.3.1 MSE and RMSE Values for the EMUPF Method
B.3.2 MSE and RMSE Values for the UNN Method
B.3.3 MSE and RMSE Values for the UNN Method with Fourier Mapping
B.4 Descriptive Statistical Analysis
B.4.1 Analysis for the HIPE Dataset
B.4.1.1 Maximum, Minimum, Median and Sum Values for the EMUPF Method
B.4.1.2 Maximum, Minimum, Median and Sum Values for the UNN Method
B.4.1.3 Maximum, Minimum, Median and Sum Values for the UNN Method with Fourier Mapping
B.4.2 Analysis for the IMDELD Dataset
B.4.2.1 Maximum, Minimum, Median and Sum Values for the EMUPF Method
B.4.2.2 Maximum, Minimum, Median and Sum Values for the UNN Method
B.4.2.3 Maximum, Minimum, Median and Sum Values for the UNN Method with Fourier Mapping
List of Acronyms
CI Critical Infrastructure.
EA Evolutionary Algorithm.
GA Genetic Algorithm.
MF Matrix Factorization.
ML Machine Learning.
NN Neural Network.
SA Simulated Annealing.
SC Sparse Coding.
SI Swarm Intelligence.
List of Figures
4.2 Diagram representing a single training phase run for the EMUPF method.
4.3 Diagram illustrating the inference phase of the EMUPF method for disaggregating an aggregate active power sample into the estimated active power values for each equipment.
4.4 Diagram illustrating the training phase for the UNN method with the aggregate active power and the equipment state data as input. Equipment state samples are part of the input layer of the network and are used by the objective function.
4.5 Diagram depicting the training phase for the UNN method with Fourier mapping. The objective function uses the equipment state data, but the equipment state samples are not used in the input layer of the network.
4.6 Diagram illustrating the UNN method’s inference phase for a single aggregate active power sample. The method has as input the aggregate active power and the equipment state data. Equipment state samples are part of the input layer of the network and are used by the objective function.
4.7 Diagram representing the inference phase for the UNN method with Fourier mapping for a single aggregate active power sample. The objective function uses the equipment state data, but the samples are not used in the input layer of the network.
5.1 Preprocessed aggregate active power that results from the sum of nine equipment in the HIPE dataset.
5.2 Preprocessed equipment active power data for the HIPE dataset in a single plot.
5.3 Preprocessed equipment active power data for the HIPE dataset, divided into multiple plots.
5.4 Preprocessed equipment states data for the HIPE dataset.
5.5 Preprocessed aggregate active power for the IMDELD dataset.
5.6 Preprocessed equipment active power for the IMDELD dataset in a single plot.
5.7 Preprocessed equipment active power data for the IMDELD dataset in multiple subplots.
5.8 Preprocessed equipment states data for the IMDELD dataset.
B.1 Preprocessed aggregate active power data for the HIPE dataset for the sum of the equipment two and three.
B.2 Preprocessed aggregate active power data for the HIPE dataset for the sum of the equipment with indexes two through four.
B.3 Preprocessed aggregate active power data for the HIPE dataset for the sum of the equipment with indexes two through five.
B.4 Preprocessed aggregate active power data for the HIPE dataset for the sum of the equipment with indexes two through six.
B.5 Preprocessed aggregate active power data for the HIPE dataset for the sum of the equipment with indexes two through seven.
B.6 Preprocessed aggregate active power data for the HIPE dataset for the sum of the equipment with indexes two through eight.
B.7 Preprocessed aggregate active power data for the HIPE dataset for the sum of the equipment with indexes two through nine.
B.8 Expected and estimated equipment active power samples, estimated by the EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the equipment with indexes two and three.
B.9 Expected and estimated equipment active power samples, estimated by the EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the equipment with indexes two through four.
B.10 Expected and estimated equipment active power samples, estimated by the EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the equipment with indexes two through five.
B.11 Expected and estimated equipment active power samples, estimated by the EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the equipment with indexes two through six.
B.12 Expected and estimated equipment active power samples, estimated by the EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the equipment with indexes two through seven.
B.13 Expected and estimated equipment active power samples, estimated by the EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the equipment with indexes two through eight.
B.14 Expected and estimated equipment active power samples, estimated by the EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the equipment with indexes two through nine.
B.15 Expected and estimated equipment active power samples, estimated by the EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the equipment with indexes two through ten.
B.16 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method. The aggregate was calculated as the sum of the equipment with indexes two and three.
B.17 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method. The aggregate was calculated as the sum of the equipment with indexes two through four.
B.18 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method. The aggregate was calculated as the sum of the equipment with indexes two through five.
B.19 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method. The aggregate was calculated as the sum of the equipment with indexes two through six.
B.20 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method. The aggregate was calculated as the sum of the equipment with indexes two through seven.
B.21 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method. The aggregate was calculated as the sum of the equipment with indexes two through eight.
B.22 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method. The aggregate was calculated as the sum of the equipment with indexes two through nine.
B.23 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method. The aggregate was calculated as the sum of the equipment with indexes two through ten.
B.24 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method with Fourier mapping. The aggregate was calculated as the sum of the equipment with indexes two and three.
B.25 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method with Fourier mapping. The aggregate was calculated as the sum of the equipment with indexes two through four.
B.26 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method with Fourier mapping. The aggregate was calculated as the sum of the equipment with indexes two through five.
B.27 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method with Fourier mapping. The aggregate was calculated as the sum of the equipment with indexes two through six.
B.28 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method with Fourier mapping. The aggregate was calculated as the sum of the equipment with indexes two through seven.
B.29 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method with Fourier mapping. The aggregate was calculated as the sum of the equipment with indexes two through eight.
B.30 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method with Fourier mapping. The aggregate was calculated as the sum of the equipment with indexes two through nine.
B.31 Expected and estimated equipment active power samples for the HIPE dataset, estimated by the UNN method with Fourier mapping. The aggregate was calculated as the sum of the equipment with indexes two through ten.
B.32 Expected and estimated equipment active power samples, estimated by the EMUPF method, for the IMDELD dataset.
B.33 Expected and estimated equipment active power samples, estimated by the UNN method, for the IMDELD dataset.
B.34 Expected and estimated equipment active power samples, estimated by the UNN method, with Fourier mapping, for the IMDELD dataset.
List of Tables
2.1 Survey of public NILM datasets. “agg” stands for aggregate and “eq” for equipment.
2.2 Equipment included in the HIPE dataset.
2.3 Equipment present on the IMDELD dataset.
4.1 General information about the HIPE dataset, including the timestamp at which the dates stop being consecutive.
4.2 General information about the IMDELD dataset.
5.1 MAE for the equipment active power samples, estimated by the EMUPF method, for the testing data from the HIPE dataset.
5.2 Error metrics for the aggregate active power samples, estimated by the EMUPF method, for the testing data from the HIPE dataset.
5.3 MAE for the equipment active power samples, estimated by the UNN method, for the testing data from the HIPE dataset.
5.4 Error metrics for the aggregate active power samples, estimated by the UNN method, for the testing data from the HIPE dataset.
5.5 MAE for the equipment active power samples, estimated by the UNN method with Fourier mapping, for the testing data from the HIPE dataset.
5.6 Error metrics for the aggregate active power samples, estimated by the UNN method with Fourier mapping, for the testing data from the HIPE dataset.
5.7 Error metrics for the equipment active power samples, estimated by the EMUPF method, for the testing data from the IMDELD dataset.
5.8 Error metrics for the aggregate active power samples, estimated by the EMUPF method, for the testing data from the IMDELD dataset.
5.9 Error metrics for the equipment active power samples, estimated by the UNN method, for the testing data from the IMDELD dataset.
5.10 Error metrics for the aggregate active power samples, estimated by the UNN method, for the testing data from the IMDELD dataset.
5.11 Error metrics for the equipment active power samples, estimated by the UNN method with Fourier mapping, for the testing data from the IMDELD dataset.
5.12 Error metrics for the aggregate active power samples, estimated by the UNN method with Fourier mapping, for the testing data from the IMDELD dataset.
5.13 Mean active power values for each equipment, for the expected and estimated values calculated by the EMUPF method, for the HIPE dataset. The highlighted yellow cells correspond to the equipment with the highest active power consumption values within the aggregate.
5.14 Mean active power values for each equipment, for the expected and estimated values calculated by the UNN method, for the HIPE dataset. The highlighted yellow cells correspond to the equipment with the highest active power consumption values within the aggregate.
5.15 Mean active power values for each equipment, for the expected and estimated values calculated by the UNN method with Fourier mapping, for the HIPE dataset. The highlighted yellow cells correspond to the equipment with the highest active power consumption values within the aggregate.
5.16 Mean and sum active power values for each equipment, for the expected and estimated values calculated by the EMUPF method, for the IMDELD dataset. The highlighted yellow cells correspond to the equipment with the highest active power consumption values within the aggregate.
5.17 Mean and sum active power values for each equipment, for the expected and estimated values calculated by the UNN method, for the IMDELD dataset. The highlighted yellow cells correspond to the equipment with the highest active power consumption values within the aggregate.
5.18 Mean and sum active power values for each equipment, for the expected and estimated values calculated by the UNN method with Fourier mapping, for the IMDELD dataset. The highlighted yellow cells correspond to the equipment with the highest active power consumption values within the aggregate.
B.1 MSE for the equipment active power samples, estimated by the EMUPF method, for the testing data from the HIPE dataset.
B.2 RMSE for the equipment active power samples, estimated by the EMUPF method, for the testing data from the HIPE dataset.
B.3 MSE for the equipment active power samples, estimated by the UNN method, for the testing data from the HIPE dataset.
B.4 RMSE for the equipment active power samples, estimated by the UNN method, for the testing data from the HIPE dataset.
B.5 MSE for the equipment active power samples, estimated by the UNN method with Fourier mapping, for the testing data from the HIPE dataset.
B.6 RMSE for the equipment active power samples, estimated by the UNN method with Fourier mapping, for the testing data from the HIPE dataset.
B.7 Maximum active power values for each equipment, for the expected and estimated values calculated by the EMUPF method, for the HIPE dataset.
B.8 Minimum active power values for each equipment, for the expected and estimated values calculated by the EMUPF method, for the HIPE dataset.
B.9 Median active power values for each equipment, for the expected and estimated values calculated by the EMUPF method, for the HIPE dataset.
B.10 Sum of the active power values for each equipment, for the expected and estimated values calculated by the EMUPF method, for the HIPE dataset. The highlighted yellow cells correspond to the equipment with the highest active power consumption values within the aggregate.
B.11 Maximum active power values for each equipment, for the expected and estimated values calculated by the UNN method, for the HIPE dataset.
B.12 Minimum active power values for each equipment, for the expected and estimated values calculated by the UNN method, for the HIPE dataset.
B.13 Median active power values for each equipment, for the expected and estimated values calculated by the UNN method, for the HIPE dataset.
B.14 Sum of the active power values for each equipment, for the expected and estimated values calculated by the UNN method, for the HIPE dataset. The highlighted yellow cells correspond to the equipment with the highest active power consumption values within the aggregate.
B.15 Maximum active power values for each equipment, for the expected and estimated values calculated by the UNN method with Fourier mapping, for the HIPE dataset.
B.16 Minimum active power values for each equipment, for the expected and estimated values calculated by the UNN method with Fourier mapping, for the HIPE dataset.
B.17 Median active power values for each equipment, for the expected and estimated values calculated by the UNN method with Fourier mapping, for the HIPE dataset.
B.18 Sum of the active power values for each equipment, for the expected and estimated values calculated by the UNN method with Fourier mapping, for the HIPE dataset. The highlighted yellow cells correspond to the equipment with the highest active power consumption values within the aggregate.
B.19 Maximum, minimum, mean and median active power values for each equipment, for the expected and estimated values calculated by the EMUPF method, for the IMDELD dataset.
B.20 Maximum, minimum and median active power values for each equipment, for the expected and estimated values calculated by the UNN method, for the IMDELD dataset.
B.21 Maximum, minimum and median active power values for each equipment, for the expected and estimated values calculated by the UNN method with Fourier mapping, for the IMDELD dataset.
Chapter 1
Introduction
This chapter comprises five sections. A summary of the background that led to
the development of the dissertation is presented. The problem, the significance of the
work developed and the gaps in the state of the art are introduced. The goals and the
requirements for the methods developed are outlined. The contributions made to the
Non-Intrusive Load Monitoring (NILM) field are listed. The structure of the dissertation
and an overview of each chapter are defined.
1.1 Context
In the industrial sector, it is essential to know the energy consumption of each piece of
equipment in a facility in order to identify high-energy consumers and subsequently act
on them through strategies such as peak shaving or job scheduling. These strategies
reduce the energy demand of the facility and lower both its operating costs
and its greenhouse gas emissions. NILM techniques estimate the electrical consumption of
each piece of equipment in an industrial facility that only has energy meters at the aggregate
level [1].
The Supervisory Control and Data Acquisition (SCADA) system [2] provides informa-
tion, such as active power and equipment state data sampled by energy meters. SCADA
information can be used as input for the NILM algorithm. A diagram that exemplifies
the expected behaviour of a NILM algorithm is shown in Figure 1.1.
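
To make the inputs and outputs of such an algorithm concrete, the following minimal Python sketch may help. It is not one of the methods proposed in this dissertation: the function names `naive_disaggregate` and `mean_absolute_error`, the array shapes and the equal-split rule are all assumptions introduced for illustration only. The sketch takes the aggregate active power and the ON/OFF states, returns one active power estimate per equipment and per sample, and scores the estimates with the MAE.

```python
import numpy as np


def naive_disaggregate(aggregate, states):
    """Hypothetical baseline: split each aggregate sample equally among ON equipment.

    aggregate: shape (T,), aggregate active power for each sample.
    states:    shape (T, N), 1 where equipment n is ON at sample t, else 0.
    Returns an array of shape (T, N) with the estimated active power.
    """
    aggregate = np.asarray(aggregate, dtype=float)
    states = np.asarray(states, dtype=float)
    n_on = states.sum(axis=1)                 # number of ON equipment per sample
    share = np.zeros_like(aggregate)
    mask = n_on > 0                           # avoid dividing by zero when all are OFF
    share[mask] = aggregate[mask] / n_on[mask]
    return share[:, None] * states            # OFF equipment is assigned zero power


def mean_absolute_error(expected, estimated):
    """Mean Absolute Error between expected and estimated active power values."""
    expected = np.asarray(expected, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.mean(np.abs(expected - estimated)))


# Four samples and three pieces of equipment (made-up numbers).
aggregate = [10.0, 6.0, 0.0, 9.0]
states = [[1, 1, 0], [0, 1, 0], [0, 0, 0], [1, 1, 1]]
estimates = naive_disaggregate(aggregate, states)
```

An equal split ignores how much each equipment actually consumes, so it only serves as a reference point for the interface: whenever at least one equipment is ON, the per-equipment estimates of a sample add up to the aggregate value of that sample.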
Three families of techniques are commonly used in the field of NILM. The first
relies on neural networks, which require labelled data on the consumption of each
equipment [3]. The second comprises variations of the Hidden Markov Model (HMM),
which are not suited to estimating the consumption of continuously varying equipment
[4]. The third consists of solutions inspired by Hart’s work [5] that make use of
transient data [6].
Almost all of the studies discussed in the literature were conducted using data from
household environments. A literature review shows a lack of low-frequency, unsupervised
NILM algorithms for industrial loads. The InGestAlgae project (https://ingestalgae-p2020.eu) called for the development
of a NILM method to provide an industrial microalgae production plant with estimates
of the active power consumption of its equipment and to help reduce its energy
demand. New algorithms must therefore be developed to fill the gaps
in the existing literature. The algorithms must be developed using different datasets and
Figure 1.1: Diagram of the expected operation of the developed NILM algorithm, where
the algorithm uses the aggregate active power and the equipment ON/OFF state data to
estimate the active power consumption of each equipment.
1.2 Motivation
The severe impact of climate change on both natural and human systems has led to
growing concern. Warming of the Earth’s climate system negatively affects biodiversity,
ecosystems, economic development, livelihoods, food and human security [7]. This is
mainly due to rising temperatures, droughts, floods, famines, and economic disruption
[8]. Excessive combustion of natural resources leads to high levels of pollutant gas
emissions, which exacerbates climate change. Among the significant greenhouse gases,
CO2 is the main heat-trapping gas. Cumulative CO2 emissions largely determine the
global mean surface temperature [7]. According to the Fifth Assessment Report (AR5) of
the Intergovernmental Panel on Climate Change (IPCC), the influence of humans on the
climate system has grown since the beginning of the industrial revolution [7]. This
is due to the increase in greenhouse gas emissions caused by the growth of global and per
capita energy consumption [9].
In 2021, the consumption of all fossil fuels increased to meet the growth in electricity
demand. CO2 emissions from the electricity and heat production sectors increased by more
than 900 million tonnes (Mt), representing 46% of the global growth in CO2 emissions.
Greenhouse gas emissions from the energy sector reached their highest level ever in 2021 [10].
As shown in Figure 1.2, there has been a near-simultaneous increase in CO2 emissions
and electricity generation from 1990 to 2018. Figure 1.3 also shows an increase in worldwide
CO2 emissions originating from energy combustion and industrial processes between 1900
and 2021.
Figure 1.2: Global CO2 emissions from electricity generation factors, between 1990 and
2019, by the IEA Energy and Carbon Tracker 2020 [11].
Figure 1.3: Global CO2 emissions from energy combustion and industrial processes,
between 1900 and 2021 [10].
According to the International Energy Agency (IEA) 2021 statistics report [12], in
2019, the industrial sector was the most prominent electricity consumer sector in the
world, accounting for 41.9% of the 82 exajoules of electricity consumed, as shown in Figure
1.4.
Figure 1.4: 2019 global electricity consumption broken down by sector [12].
end-use sectors [7]. The expected solutions to reduce global carbon emissions, as described
in the Sustainable Development Scenario (SDS) [13] and in the IEA Global Energy Review
2021 report [10], include the spread of renewable energy sources, the reduction of energy
demand and the improvement of energy efficiency. Energy efficiency improvements can
lead to more than 224 different “non-energy” industrial productivity benefits, including
increased profit, safer working conditions and improvement in quality and output [14, 15].
The industrial sector can improve energy efficiency through management, technology,
or policy/regulation approaches. Energy management involves strategizing to meet energy
demand when and where needed, adjusting and optimizing energy usage [8]. Adopting
energy-efficient behaviours through energy management strategies requires a thorough
understanding of the electrical consumption of each equipment in a facility. In an industrial
setting, only the electrical load data at the aggregate level is available unless expensive
and specialized hardware has been installed per equipment. Buggypower’s micro-algae
production plant, shown in Figure 1.5, does not have individual meters for each equipment.
The total electrical load of Buggypower’s plant can be used to estimate the individual
equipment loads through a computational technique called energy disaggregation or NILM.
NILM techniques perform energy disaggregation and provide feedback that indicates
high consumption sources. NILM methods enable subsequent action on sources of high
consumption, such as peak shaving or job scheduling [8].
Research has shown that active energy feedback to residential consumers can reduce
electricity consumption in homes by 5-20% [16]. The energy savings potential of active
energy feedback in industrial facilities has not been studied. Given these considerations,
developing a NILM method for the industrial sector and studying its application in
real-world scenarios is essential.
1.3 Objectives
The dissertation aims to develop a NILM method that can estimate the energy
consumption of each equipment in an industrial facility using the information provided by
the SCADA system. The final objective is to integrate the algorithm into an industrial
factory setting to improve energy management and optimize energy usage in microalgae
production.
A set of requirements/constraints was defined for the algorithm to meet:
1. Accessibility: The user should be able to access and understand the results easily;
2. Scalability: The algorithm has to scale to accommodate a wide range of scenarios
and equipment;
3. Performance: The online stage of the algorithm should have a short response time,
suitable for real-time systems;
4. Usability: Users should be able to derive value from the algorithm.
1. Analyzing the state of the art in the field of NILM and unsupervised NILM;
2. Surveying and selecting a public NILM dataset;
3. Modelling the problem mathematically;
4. Developing the algorithm;
5. Writing software;
6. Evaluating the results using the selected dataset;
7. Concluding and defining future work.
1.4 Contributions
Applying a low-frequency unsupervised NILM algorithm to the industrial sector is a
largely unexplored subject. The state of the art in this area is limited, as most NILM
algorithms are supervised, requiring disaggregated training data, or do not perform
source separation of continuously varying equipment, or are only applied to domestic
environments.
The work is significant because it fills a clear gap in the state of the art. The main
contributions and developed work of the dissertation are as follows:
(a) A novel method for the modelling of industrial loads by polynomial functions,
with metaheuristic optimization algorithms, developed in C++;
(b) An unsupervised neural network using Python.
No prior method has been found in the literature that uses polynomial functions to model
the power consumption of the equipment as a function of the aggregate active power in an
industrial setting. To the best of the author's knowledge, the proposed method is the first
NILM solution to formulate the objective function using matrices to find the coefficients
of polynomial functions. No previous study has been found that applies an Unsupervised
Neural Network (UNN) to the NILM problem; a network architecture and an objective
function were devised for this purpose.
• Chapter 1 - Introduction: The current chapter introduces the NILM theme and the
developed work, and provides the motivation and goals for developing and implementing
a low-frequency unsupervised NILM method applied to an industrial setting;
• Chapter 2 - Background: An introductory overview of the theoretical foundations is
established;
• Chapter 3 - State of the Art: The State of the Art in NILM techniques is presented;
• Chapter 4 - Methodology: The developed work is described, and the implemented
NILM methods are detailed;
• Chapter 5 - Results: The results and performance metrics of the proposed NILM
methods for the selected industrial datasets are presented;
• Chapter 6 - Discussion and Conclusion: A discussion of the results, a summary, and
final remarks are provided. Steps are mentioned to develop further and enhance the
method.
Chapter 2
Background
This chapter presents an introductory overview of the key concepts and foundations of
the NILM problem and the basic concepts related to the algorithms developed.
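The symbols defined in the following paragraph refer to the additive model of the
aggregate signal. A reconstruction consistent with those definitions is sketched here; the
index range $i = 1, \dots, n$ and the superscript position of the time index are assumptions.

```latex
a^{t} = \sum_{i=1}^{n} p_i^{t} + e^{t}
```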
$i$ is the equipment index, $n$ is the total number of equipment contributing to the
aggregate active power at instant $t$, $p_i^t$ is the active power consumption of equipment
$i$, $e^t$ is noise or error, and $a^t$ is the aggregate active power consumption measured
at the meter. The objective of a NILM technique is to estimate $p_i^t$ from the $a^t$ values.
An initial interpretation of the mathematical formulation may suggest that the NILM problem
can be solved using combinatorial optimization. This approach is unfeasible for a large
number of equipment of different types [5], and type III equipment changes the
problem’s domain from discrete to continuous. NILM presents lower costs than ILM but
inherently introduces uncertainty in the estimated consumption values. The uncertainty
arises from various factors, including noise in the measurements, the complexity of the
load signatures of the equipment and the possible presence of multiple and different types
of equipment. There are four types of equipment, classified according to their power
consumption [20]:
• Type I - ON/OFF equipment: Equipment with only two possible states (ON/OFF);
• Type II - FSM: Equipment’s power consumption passes through state transitions;
• Type III - Continuously varying equipment: Equipment where the power consump-
tion values can vary through time in a continuous domain;
• Type IV - Permanent consumer equipment: Equipment with only one state.
Type III, often called Variable Frequency Drive (VFD) or Continuously Variable Device
(CVD), is the most challenging type of equipment to disaggregate and is ubiquitous in the
industrial sector. Examples include drilling and milling machines, whose power demands
vary based on the engine speed [21].
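For loads of type I only, the combinatorial view of the problem mentioned earlier in this
section can be sketched as an exhaustive search over ON/OFF combinations. This is a minimal
illustration, not the method developed in this work: the function name and the rated
powers are hypothetical, and each equipment is assumed to draw a fixed power when ON.

```python
from itertools import product


def combinatorial_disaggregation(aggregate, rated_powers):
    """Return the ON/OFF tuple whose summed rated power is closest to `aggregate`."""
    best_states, best_error = None, float("inf")
    # Enumerate all 2**n ON/OFF combinations (exponential in the number of loads).
    for states in product((0, 1), repeat=len(rated_powers)):
        estimate = sum(s * p for s, p in zip(states, rated_powers))
        error = abs(aggregate - estimate)
        if error < best_error:
            best_states, best_error = states, error
    return best_states, best_error


rated_powers = [100.0, 250.0, 1200.0]  # hypothetical rated powers, in W
states, error = combinatorial_disaggregation(1305.0, rated_powers)
```

The search space doubles with every added equipment, and a type III load has no fixed
rated power to enumerate, which is why this formulation does not carry over to the
continuous case.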
The formulation and estimations of NILM techniques also rely on the sampling rate at
which the data is collected. Data acquisition systems can be low-frequency (less than 1Hz)
or high-frequency (kHz to MHz). Low-frequency energy acquisition meters are cheaper
than high-frequency but do not provide data with as much detail. Low-frequency meters
only provide information on steady-state data, and high-frequency energy meters can
measure transients and electrical noise [20, 22, 23, 24].
NILM methods can be event-based or non-event-based. Event-based algorithms depend
on events. An event corresponds to a significant variation in the aggregate electrical
signal and suggests a change in the state of one equipment. An event can provide useful
information and is commonly used in the literature by solutions that disaggregate the
aggregate active power composed of type I and II equipment. Non-event-based algorithms
perform disaggregation at every instant without relying on event detection and are suitable
for disaggregating equipment of type III.
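The notion of an event described above can be illustrated with a fixed threshold on the
difference between consecutive aggregate samples. This is a minimal sketch; the function
name, the sample values and the threshold are hypothetical.

```python
import numpy as np


def detect_events(aggregate, threshold):
    """Return the sample indices where the step from the previous sample exceeds `threshold`."""
    aggregate = np.asarray(aggregate, dtype=float)
    delta = np.diff(aggregate)  # change between consecutive samples
    return np.flatnonzero(np.abs(delta) > threshold) + 1


power = [50, 52, 51, 350, 348, 351, 52, 50]  # hypothetical aggregate power, in W
events = detect_events(power, threshold=100.0)
```

A continuously varying load changes the aggregate gradually, so it rarely produces a step
of this kind, which is consistent with the preference for non-event-based methods for type
III equipment.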
NILM algorithms can also be supervised or unsupervised. Supervised NILM methods
use a priori knowledge of equipment consumption data, such as labelled consumption data
or signature loads, while unsupervised algorithms do not have access to equipment data
[20].
NILM techniques can be divided into load classification and source separation. The
load classification process identifies the state of the power consumption of each equipment.
Source separation estimates the power consumption of the equipment. Most of the NILM
literature implements algorithms that follow the same four main steps [23]:
1. Data acquisition and signal preprocessing: In this stage, electrical data is collected
and power normalization, filtering and thresholding may take place;
2. Event/edge detection: Events are identified, corresponding to the change in the
state of equipment, implied by changes in the aggregate data;
3. Feature extraction: Features that identify the equipment are extracted within the
event windows;
4. Learning/inference or classification/load identification: A supervised or unsupervised
approach is performed to identify each equipment’s power consumption or state
based on the extracted features.
Different algorithms in the literature are described as performing NILM, yet their
expected outcomes differ, which results in diverse implementations of the last step of the
traditional NILM method.
The established requirements prevent the adoption of a traditional approach.
10 2.2. General Concepts
In the context of energy disaggregation, SCADA systems can provide valuable informa-
tion. The developed NILM algorithm has access to a unique set of inputs, resulting from
integrating the process data from the SCADA system. The system provides information
acquired at the energy meters, such as active power and the state of operation of the
equipment connected to the meter in the facility.
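To illustrate how the ON/OFF state data can complement the aggregate measurement, the
sketch below fits one constant ON power per equipment by least squares. It is a simple
baseline under a strong assumption (constant power while ON) and is not one of the methods
developed in this work; the function name and the data are hypothetical.

```python
import numpy as np


def mean_power_per_equipment(states, aggregate):
    """Least-squares estimate of a constant ON power per equipment.

    states:    (T, n) array of 0/1 ON/OFF states
    aggregate: (T,) array of aggregate active power samples
    """
    S = np.asarray(states, dtype=float)
    a = np.asarray(aggregate, dtype=float)
    p, *_ = np.linalg.lstsq(S, a, rcond=None)
    return p


# Hypothetical example: three equipment observed over five instants.
states = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]])
true_power = np.array([2.0, 5.0, 1.5])  # kW, used only to build the example
aggregate = states @ true_power
estimate = mean_power_per_equipment(states, aggregate)
```

With type III equipment the constant-power assumption no longer holds, so this baseline
only recovers an average level per equipment.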
1 https://www.energystatusdata.kit.edu/hipe.php
2 https://ieee-dataport.org/open-access/industrial-machines-dataset-electrical-load-disaggregation
11 2.3. Review of NILM Dataset
datasets were selected as they were the only ones that met the requirements. The selected
datasets required preprocessing.
Table 2.1: Survey of public NILM datasets. “agg” stands for aggregate and “eq” for
equipment.
Number | Dataset | Year | Citations (Google Scholar, Jan 2023) | Environment | Frequency
1 ACS-F1 [29] 2013 61 Household 0.1Hz
2 ACS-F2 [30] 2014 53 Household 0.1Hz
3 AMBAL [31] 2017 29 Household - Synthetic 1Hz
4 AMPds / AMPds2 [32] 2013 217 Household 1Hz / 0.0167Hz
5 BERDS [33] 2013 33 Commercial 0.05Hz
6 BLOND [34] 2018 75 Commercial BLOND-50: 50kHz agg and 6.4kHz eq; BLOND-250: 250kHz agg and 50kHz eq
7 BLUED [35] 2012 398 Household 1Hz current and 60Hz active power
8 COMBED [36] 2014 113 Commercial 2Hz
9 COOLL [37] 2016 87 Laboratory 100kHz
10 CU-BEMS [38] 2020 25 Commercial 0.0167Hz and 1Hz
11 Dataport [39] 2012 54 Household 16.67mHz to 1Hz
12 DRED [40] 2015 121 Household 1Hz
13 ECO [41] 2014 335 Household 1Hz
14 EEUD [42] 2017 38 Household 0.0167Hz
15 ENERTALK [43] 2019 40 Household 15Hz
16 ESHL [44] 2016 2 Household 0.5 to 1Hz
17 GREEND [45] 2014 193 Household 1Hz
18 HELD1 [46] 2018 15 Laboratory 4kHz
19 HFED [47] 2014 66 Household + Laboratory 9kHz to 30MHz
20 HIPE [48] 2018 25 Industry 0.2Hz
21 HES [49] 2012 207 Household 8.33mHz
22 HUE [50] 2019 24 Household 1Hz
23 iAWE [51] 2013 186 Household 1Hz
24 IDEAL [52] 2021 14 Household 1Hz
25 IHEPCDS [53] 2013 12 Household 0.016Hz
26 IMDELD [54] 2020 11 Industry 1Hz
27 I-BLEND [55] 2019 34 Commercial 0.0167Hz
28 LIFTED [56] 2020 12 Household 50Hz
29 LILAC [57] 2019 13 Industrial 50Hz
30 OPLD [58] 2016 3 Commercial 1Hz
31 PLAID I [59] 2014 210 Household 30kHz
32 PLAID II [60] 2017 14 Household 30kHz
33 PLAID III [61] 2020 32 Household 30kHz
34 RAE [62] 2018 63 Household 1Hz
35 RBSA [63] 2014 12 Household 0.0011Hz
36 REDD [64] 2011 1527 Household 15kHz, 0.5Hz and 1Hz
37 REFIT [65] 2017 260 Household 0.0167Hz
38 Sample 2012 54 Household 0.0167Hz
39 SHED [66] 2018 34 Commercial - Synthetic 0.033Hz
40 Smart / Smart* [67] 2017 519 Household 1Hz
41 SmartSim [68] 2016 24 Household - Synthetic 1Hz
42 South Korean factories dataset [69] 2022 1 Industry 0.0167Hz
43 SustData [70] 2014 67 Household 50Hz
44 SustDataED [71] 2016 27 Household 12.8kHz agg and 0.5Hz eq
45 SynD [72] 2020 51 Household 5Hz
46 SPAFID [73] 2021 1 Industry - Synthetic 0.0003Hz
47 Tracebase [74] 2012 303 Household 1Hz
48 UK-DALE [75] 2014 741 Household 16kHz agg and 0.17Hz eq
49 WHITED [76] 2016 123 Household + Industry 44.1kHz
produces electronic systems for particle physics, battery systems, and medical applications
in batches of less than 1,000 pieces.
Figure 2.2: Diagram of the factory electrical installation for the HIPE dataset. The
rectangles represent the equipment, and the meter illustrations show the locations where
the data was sampled.
A one-week period, from October 23, 2017, to October 30, 2017, from the original
dataset was used. The aggregate active power, in kW, measured at the main terminal, is
shown in Figure 2.3. The active power, in kW, for each equipment during the one-week
period is shown in Figures 2.4 and 2.5. Figure 2.4 shows that the equipment with indices
three and six have the highest active power consumption values and that the equipment with
index one is always in the OFF state. Figure 2.6 displays the histogram of the equipment
active power samples and suggests that the equipment with indices two, six, eight and nine
are of type III and the equipment with indices four, five, seven and ten are of type II.
Figure 2.3: HIPE dataset’s main terminal’s active power, in kW, for a one-week period.
Figure 2.4: Active power, in kW, for the equipment in the HIPE dataset, in a single plot,
for a one-week period.
Figure 2.5: Active power, in kW, for the equipment in the HIPE dataset, divided into
multiple plots, for a one-week period.
Figure 2.6: Histogram of the active power samples bigger than zero, in kW, for the
equipment in the HIPE dataset, divided into multiple plots, for a one-week period.
days, from December 11, 2017, 18:43:52 UTC to April 1, 2018, 21:33:17 UTC. The milling
machines were only sampled for twelve days. Eleven GreenAnt meters were installed, one
for each equipment in Table 2.3, one per Low-Voltage Distribution Board (LVDB) and
one for the Main Medium Voltage/Low Voltage Transformer (MV/LV). A diagram of the
factory electrical substation is shown in Figure 2.7.
Figure 2.7: Diagram of the factory electrical substation for the IMDELD dataset. The
rectangles represent the equipment, and the meter illustrations show the locations where
the data was sampled.
The LVDB-2 data was selected over LVDB-3 because it includes a larger number of
equipment. LVDB-3 and MI and MII equipment data were discarded. The aggregate
active power measurements on the LVDB-2 meter of the IMDELD dataset are shown in
Figure 2.8. The active power samples of the equipment before preprocessing are shown
in Figures 2.9 and 2.10. Figure 2.10 shows that the equipment with indices seven and eight
have the highest active power consumption values, and Figure 2.8 indicates that these two
equipment have the largest influence on the aggregate active power. Figure 2.11 displays
the histogram of the equipment active power samples greater than zero and
suggests that the dataset is composed of type III equipment.
Figure 2.8: Active power, in W, for the aggregate data, measured at the LVDB-2, in the
IMDELD dataset.
Figure 2.9: Active power, in W, for the equipment in the IMDELD dataset, in a single
plot.
17 2.4. Mathematical and Computational Concepts
Figure 2.10: Active power, in W, for the equipment in the IMDELD dataset, divided into
multiple plots.
Figure 2.11: Histogram of the active power samples bigger than zero, in W, for the
equipment in the IMDELD dataset, divided into multiple plots.
where $i$ is the index of the term, $n$ is the degree of the polynomial function, $c_i$
corresponds to the coefficient, and $x$ is the variable of the function.
Polynomial functions are useful for approximating complex shapes and are commonly
used in curve-fitting problems [77]. In the literature, polynomial functions have been used
to model aggregate-level energy consumption as a function of relevant variables [78, 79].
No previous studies have been found in the literature that use polynomial functions to
model the equipment’s active power consumption, using aggregate active power as the
variable.
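A polynomial of degree $n$ with coefficients $c_i$ has the general form
$f(x) = \sum_{i=0}^{n} c_i x^i$. As a minimal sketch of the idea of using the aggregate
active power as the variable, the example below evaluates such a polynomial with NumPy.
The function name and the coefficient values are hypothetical and are not results from
this work.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def equipment_power(aggregate, coefficients):
    """Evaluate c_0 + c_1 * a + ... + c_n * a**n for each aggregate sample a."""
    return P.polyval(aggregate, coefficients)


coefficients = [0.5, 0.2, 0.01]            # hypothetical c_0, c_1, c_2
aggregate = np.array([0.0, 10.0, 20.0])    # hypothetical aggregate power, in kW
estimate = equipment_power(aggregate, coefficients)
```

`polyval` takes the coefficients in increasing order of degree, which matches the index
$i$ in the sum.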
2.4.2 Optimization
An optimization problem involves finding a given function’s maximum or minimum
value. Numerical and metaheuristic methods are two approaches to solving optimization
problems in continuous domains. Numerical optimization techniques rely on the function’s
gradient to iteratively approximate the minimum or maximum value. Examples of
numerical optimization methods are gradient descent and Newton’s method [80].
However, numerical methods are not suitable when the search space is complex, with
multiple local minima, maxima or saddle points, or when the function is
non-differentiable. Metaheuristic optimization methods can provide a
solution for such cases. Metaheuristic algorithms are computational intelligence techniques
that combine two search schemes: exploration and exploitation [81]. The exploitation
scheme searches for the best solution within a given search space, and the exploration
scheme explores new solution spaces. Although metaheuristic algorithms are flexible and
can be applied to various optimization problems, solutions are not guaranteed to correspond
to the global optimum. Still, they provide good approximations for complex problems.
Metaheuristic techniques can be divided into metaphor-based and non-metaphor-based
approaches. The former includes algorithms such as Simulated Annealing (SA) [82, 83], Ant
Colony Optimization for Continuous Domains (ACOR), Particle Swarm Optimization (PSO) [84]
and the Genetic Algorithm (GA) [85, 86]. ACOR and PSO are examples of algorithms inspired
by biological systems that use Swarm Intelligence (SI). SI algorithms simulate the
behaviour of a group of agents, where candidate solutions are updated by interaction with
the environment and other agents. GAs are Evolutionary Algorithms (EAs) that model the
evolutionary progression of cells in nature by employing mutation, selection, crossover
and reproduction schemes [81].
$$x_i^{k+1} = x_i^k + v_i^{k+1}, \tag{2.4}$$
where $i$ is the particle index, $k$ is the current algorithm iteration, and $w$ is the
inertial constant, which can gradually decrease with each iteration. $v_i$ is the velocity,
which can be limited by a maximum value to prevent swarm explosions. $c_1$ is the cognitive
constant, $c_2$ is the social constant, and $r_1$ and $r_2$ are random numbers drawn from a
uniform distribution between zero and one. $x_i$ is the position of the particle. $P_b$ is
the personal best, which, in a minimization problem, corresponds to the position of the
particle with the lowest fitness value from the first to the current iteration. All
particles have an associated $P_b$ value. $P_g$ corresponds to the best global position,
which is the position with the lowest fitness value across all iterations and all
particles. The fitness function evaluates candidate solutions and is described by
Equation (2.5).
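The update rules can be sketched in a few lines. The sketch assumes the standard PSO
velocity update, $v_i^{k+1} = w v_i^k + c_1 r_1 (P_b - x_i^k) + c_2 r_2 (P_g - x_i^k)$,
followed by the position update of Equation (2.4). The fitness function (a sphere
function), the parameter values and the function name are hypothetical and are not those
used in this work.

```python
import numpy as np


def pso(fitness, dim, n_particles=20, iterations=100, w=0.7, c1=1.5, c2=1.5,
        bounds=(-5.0, 5.0), v_max=1.0, seed=0):
    """Minimize `fitness` with a basic particle swarm; returns (best position, best fitness)."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    x = rng.uniform(low, high, size=(n_particles, dim))  # particle positions
    v = np.zeros_like(x)                                 # particle velocities
    p_best = x.copy()                                    # personal best positions
    p_best_fit = np.array([fitness(p) for p in x])
    g_best = p_best[np.argmin(p_best_fit)].copy()        # global best position
    for _ in range(iterations):
        r1 = rng.uniform(size=x.shape)                   # random factors in [0, 1)
        r2 = rng.uniform(size=x.shape)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        v = np.clip(v, -v_max, v_max)                    # limit velocity to avoid swarm explosion
        x = x + v                                        # position update, Equation (2.4)
        fit = np.array([fitness(p) for p in x])
        improved = fit < p_best_fit
        p_best[improved] = x[improved]
        p_best_fit[improved] = fit[improved]
        g_best = p_best[np.argmin(p_best_fit)].copy()
    return g_best, float(p_best_fit.min())


def sphere(p):
    """Hypothetical fitness function with its minimum at the origin."""
    return float(np.sum(p ** 2))


best_position, best_fitness = pso(sphere, dim=2)
```

Here the inertial constant is kept fixed for brevity; a decreasing schedule for `w` would
shift the search from exploration towards exploitation over the iterations.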
Each neuron has a set of weights, with a weight per input and a bias value. The
neurons of a layer are connected to the neurons of the subsequent layer. Equation (2.6)
describes a neuron’s output.
$$y = f\left(\sum_{i=0}^{n} x_i w_i + b\right), \tag{2.6}$$
where $y$ is the neuron’s output, $f$ is the activation function, $n$ is the number of
inputs, $x_i$ is the $i$-th input, $w_i$ is the weight associated with input $i$, and $b$
is the bias. The neuron’s output is calculated by applying the activation function to the
sum of the products of the neuron’s weights and inputs, plus the bias parameter.
The activation function enables the NN to model non-linear relationships between
the inputs and outputs and allows the network to solve complex problems. Examples of
commonly used activation functions are the hyperbolic tangent, described by Equation
(2.7) and shown in Figure 2.13, the sigmoid and the Rectified Linear Unit (ReLU), shown
by Equation (2.8) and Figure 2.14.
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{2.7}$$
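The neuron output of Equation (2.6) with the two activation functions mentioned above can
be sketched as follows. The input, weight and bias values are hypothetical, and ReLU is
taken in its usual form, $\max(0, x)$.

```python
import numpy as np


def relu(z):
    """Rectified Linear Unit: max(0, z), applied element-wise."""
    return np.maximum(0.0, z)


def neuron_output(x, w, b, activation=np.tanh):
    """Apply the activation function to the weighted sum of the inputs plus the bias."""
    return activation(np.dot(x, w) + b)


x = np.array([1.0, 2.0])      # hypothetical inputs
w = np.array([0.5, -0.25])    # hypothetical weights
b = 0.1                       # hypothetical bias
y_tanh = neuron_output(x, w, b)                    # weighted sum is 0.0, so tanh(0.1)
y_relu = neuron_output(x, w, b, activation=relu)   # positive pre-activation passes through
```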
$$X = \begin{bmatrix} x_0 & \cdots & x_i \end{bmatrix} \tag{2.11}$$
$$W = \begin{bmatrix} w_{00} & \cdots & w_{0n} \\ \vdots & \ddots & \vdots \\ w_{i0} & \cdots & w_{in} \end{bmatrix} \tag{2.12}$$
$$B = \begin{bmatrix} b_0 & \cdots & b_n \end{bmatrix} \tag{2.13}$$
where n is the number of neurons in the layer, and i is the number of inputs.
The weights and bias values control the behaviour of the network and are adjusted
incrementally during the training phase of the NN. The first step of the training phase
involves forward propagation, where the input values pass through the layers, and the
output of the network is calculated. After the forward propagation phase, backpropagation
adjusts the weights and bias parameters based on an optimization algorithm such as the
gradient descent method presented in Equation (2.14).
where $y_i$ is the expected output and $\bar{y}_i$ corresponds to the estimated value.
The NN’s training phase aims to minimize the loss function value, as a lower value
indicates a better model.
The gradient descent algorithm updates the neurons’ parameters by moving down the error
surface of the loss function, updating each weight and bias through the backpropagation
algorithm. The backpropagation algorithm adjusts each weight and bias in proportion to the
change in the network loss caused by a change in that parameter. The updates are
determined by the partial derivatives of the loss function with respect to each weight,
described by Equation (2.16).
$$\frac{\partial L_2}{\partial w_i} = \frac{\partial L_2}{\partial \bar{y}_i} \frac{\partial \bar{y}_i}{\partial w_i} \tag{2.16}$$
In Equation (2.16), $w_i$ is the weight, and the gradient represents the sensitivity of the
loss function to changes in the weight parameter.
The chain rule is applied to update the neurons’ parameters during the backpropagation
algorithm. The calculation can be computationally simplified using matrix form, as shown
in Equations (2.17), (2.18) and (2.19). This approach enables the simultaneous update of
all neurons in a layer.
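$$L_t = \begin{bmatrix} \frac{\partial L_2}{\partial w_0} & \cdots & \frac{\partial L_2}{\partial w_n} \end{bmatrix} \tag{2.17}$$
$$B_{t+1} = B_t + \eta \odot L_t \tag{2.19}$$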
$W_{t+1}$ is the updated weight matrix, $W_t$ is the current weight matrix, $\eta$ is the
step size, $X_t^T$ is the transpose of the neuron’s inputs and $L_t$ represents the loss
matrix. In Equation (2.17), $L_t$ corresponds to the vector of the partial derivatives of
the loss function $L_2$ with respect to each weight, for iteration $t$. $B_{t+1}$ is the
updated bias matrix, and $B_t$ is the value of the bias matrix before it was updated.
The backpropagation method starts by updating the neurons of the output layer and
then proceeds backwards, recalculating the loss vector for each previous layer based
on Equation (2.20).
$$L_{t+1} = L_t \cdot W_{t+1}^{T} \tag{2.20}$$
The loss vector, $L_{t+1}$, is updated by multiplying the previous loss matrix, $L_t$, by
the new values of the transposed weight matrix, $W_{t+1}^{T}$. Equations (2.18) and (2.19)
are then applied with the new value of $L_t$. The process continues for all layers of the
network.
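The matrix-form updates can be sketched for a single layer with an identity activation.
Two assumptions are made so that the additive form of Equation (2.19) is a descent step:
the loss is half the mean squared error, and the loss matrix is taken as the error signal
$(y - \bar{y})$ scaled by the number of samples, that is, the negative of the gradient
with respect to the output. The function name, data and step size are hypothetical.

```python
import numpy as np


def train_layer(X, Y, eta=0.5, epochs=200):
    """Fit Y ~ X @ W + B with the additive matrix-form updates; identity activation."""
    W = np.zeros((X.shape[1], Y.shape[1]))
    B = np.zeros((1, Y.shape[1]))
    history = []
    for _ in range(epochs):
        Y_hat = X @ W + B                            # forward propagation
        history.append(float(np.mean((Y - Y_hat) ** 2)))
        L = (Y - Y_hat) / len(X)                     # error signal: minus the gradient of half the MSE
        W = W + eta * X.T @ L                        # weight update using the transposed inputs
        B = B + eta * L.sum(axis=0, keepdims=True)   # bias update, summed over the samples
    L_prev = L @ W.T                                 # signal passed to the previous layer, as in Equation (2.20)
    return W, B, history, L_prev


X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)        # hypothetical inputs
Y = 2.0 * X + 1.0                                    # hypothetical targets
W, B, history, L_prev = train_layer(X, Y)
```

In a multi-layer network the propagated signal would also be multiplied by the derivative
of each layer's activation function, which is omitted here because the activation is the
identity.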
Training occurs over multiple cycles, called epochs. To improve efficiency during
training, the input data can be divided into mini-batches with a specific batch size. The
new weights and bias values for each mini-batch element are averaged. NN are commonly
used in classification problems, where labelled training data is used to train the network.
The objective of a classification problem is to determine the category to which a given
input data point belongs. ML algorithms, including NN, are employed to solve problems
in high-dimensional spaces that cannot be exhaustively searched.
Bayati et al. proposed a technique for solving constrained continuous optimization
problems using unsupervised NN [91]. The NN uses the objective function of the opti-
mization problem as the loss function of the network, which includes a penalty term to
discourage solutions that violate the constraints.
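The idea of a loss built from an objective plus a constraint penalty can be illustrated
with a quadratic penalty. The quadratic form, the penalty weight and the example problem
are assumptions made for illustration and are not necessarily the formulation used by
Bayati et al.

```python
import numpy as np


def penalized_loss(x, objective, constraint, penalty_weight=100.0):
    """Objective plus a quadratic penalty on violations of constraint(x) <= 0."""
    violation = np.maximum(0.0, constraint(x))   # zero when the constraint is satisfied
    return objective(x) + penalty_weight * float(np.sum(violation ** 2))


def objective(x):
    """Hypothetical objective with its unconstrained minimum at x = 2."""
    return float(np.sum((x - 2.0) ** 2))


def constraint(x):
    """Hypothetical constraint x <= 1, written as x - 1 <= 0."""
    return x - 1.0


feasible = penalized_loss(np.array([1.0]), objective, constraint)     # no penalty applied
infeasible = penalized_loss(np.array([2.0]), objective, constraint)   # penalty dominates
```

Because the penalty grows with the size of the violation, minimizing the combined loss
pushes the solution towards the feasible region without requiring labelled targets.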
where $a_0$ is the constant term, $a_i$ and $b_i$ are the coefficients defining the cosine
and sine amplitudes, and $i$ indicates the frequency index. The variable $x$ ranges from
$-\pi$ to $\pi$.
The Fourier series has multiple applications, including signal analysis and processing,
image compression, and some applications in machine learning. Tancik et al. suggested
that passing the input of a NN through a Fourier feature mapping could improve the
results in a function approximation problem [93].
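A feature mapping of this kind can be sketched by expanding each input into the cosine and
sine terms of a truncated Fourier series. The integer frequencies used here mirror the
series terms and are an assumption; Tancik et al. discuss other frequency choices,
including randomly sampled ones. The function name is hypothetical.

```python
import numpy as np


def fourier_features(x, n_frequencies):
    """Map each scalar input to [cos(kx) for k = 1..n] followed by [sin(kx) for k = 1..n]."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    k = np.arange(1, n_frequencies + 1)          # integer frequency indices
    return np.hstack([np.cos(k * x), np.sin(k * x)])


features = fourier_features([0.0, np.pi / 2], n_frequencies=2)
```

Each input sample becomes a row with `2 * n_frequencies` columns, which can then be fed to
the first layer of the network in place of the raw input.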
Chapter 3
State of the Art
The current chapter reviews key studies and research in the field of NILM, with an
emphasis on unsupervised and low-frequency NILM algorithms. The literature review
followed a semi-systematic methodology [94, 95]. First, review papers were studied,
followed by papers that applied specific techniques. The search criteria included the
keywords “NILM”, “unsupervised”, “low-frequency”, and “industry” in databases such
as Scopus 1 , ScienceDirect 2 , IEEE Xplore 3 , ArXiv 4 and Google Scholar 5 . Articles were
selected based on the number of citations, publication date, abstract section and full text.
1 https://www.scopus.com/
2 https://www.sciencedirect.com/
3 https://ieeexplore.ieee.org/
4 https://arxiv.org/
5 https://scholar.google.com/
25 3.1. Literature Review
Ruano et al. described the approaches used for the first component of the classical
NILM algorithm, that is, event detection. Event detection can be divided into three
approaches: expert heuristics, probabilistic models, and matched filters [96].
Expert heuristics involves defining a set of rules to perform event detection. Fixed
thresholding is an example of a typical implementation. Multiple thresholds based on
different features can be used to improve the results. An example of a multiple thresholding
technique is the multivariate event detection method. An alternative to fixed thresholding
is adaptive thresholding, applied in techniques like envelope-based peak detection [96].
Probabilistic methods require a training process to estimate statistical features for
each equipment. Probabilistic models use techniques such as Generalized Likelihood Ratio
(GLR), chi-squared, Cumulative Sum (CUSUM) and Bayesian information criterion [96].
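As an illustration of one of the probabilistic detectors listed above, a two-sided CUSUM
can be sketched as follows. The reference mean stands in for a statistic estimated during
training; re-referencing it to the new level after each detection is a simplification.
The function name, drift, threshold and signal values are all hypothetical.

```python
def cusum_events(signal, target_mean, drift, threshold):
    """Two-sided CUSUM: return the indices where the cumulative deviation exceeds `threshold`."""
    g_pos = g_neg = 0.0
    events = []
    for t, value in enumerate(signal):
        deviation = value - target_mean
        g_pos = max(0.0, g_pos + deviation - drift)   # accumulates upward shifts
        g_neg = max(0.0, g_neg - deviation - drift)   # accumulates downward shifts
        if g_pos > threshold or g_neg > threshold:
            events.append(t)
            g_pos = g_neg = 0.0
            target_mean = value                       # re-reference to the new level
    return events


signal = [10, 10, 11, 10, 30, 30, 31, 30]             # hypothetical aggregate power, in W
events = cusum_events(signal, target_mean=10.0, drift=1.0, threshold=15.0)
```

The drift term absorbs small fluctuations around the reference level, so only a sustained
or large change accumulates enough to cross the threshold.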
Matched filters correlate signal waveforms to known patterns and require high sampling
rates and prior knowledge of the equipment load signatures. The correlation is calculated
by performing clustering techniques [100, 101].
The feature extraction component [102] of the classical NILM method depends on
the sampling rate, where the most employed features are RMS current, RMS voltage,
active, reactive and apparent power, total harmonic distortion and power factor. A high
sampling rate allows for the capture of harmonics using the Fourier transform. At very
high sampling frequencies, two-dimensional voltage-current trajectories, electrical noise
and Electromagnetic Interference (EMI) signals can be obtained. Nontraditional features,
such as temperature, light sensing and time of day, are also used by some implementations.
Load identification or source separation can be performed using optimization, super-
vised or unsupervised techniques [96, 103]:
1. Optimization techniques aim to disaggregate the power measurement into combina-
tions of the individual equipment power signals:
Barsim et al. [6] proposed a method based on Hart’s method that uses transient
information. The authors used a sliding window to detect events in the logarithmically
transformed active and reactive power signals. The size of the sliding window was set to
contain one transient event and two steady-states. A grid-based clustering scheme similar
to density-based clustering was applied to the values calculated at each event to identify
the equipment. The researchers tested the algorithm using a dataset from a domestic
environment. Wang et al. presented another method inspired by Hart’s solution [104],
to classify individual loads in a household setting. The event detection process was not
explained in detail. The available information indicates that power consumption was
classified into three categories, namely insignificant fluctuations, fast switching and steady
working events. For each event, the start, peak and end times, along with peak values,
which include active and reactive power, were acquired. The mean active and reactive
power and the variance of active power in steady-state values were calculated. Mean-shift
clustering was applied, iteratively, to the acquired values. The clustering results were used
along with a knowledge base in a linear discriminant analysis to classify the equipment.
Both methods perform load classification. Barsim et al. used an unsupervised approach,
whereas Wang et al. used a supervised technique.
Liu et al. reviewed different unsupervised NILM algorithms [4]. These algorithms were
mainly variations of HMM or Graph Signal Processing (GSP). The reviewed algorithms
construct state-equipment models, assuming temporal dependencies and patterns. However,
HMM and GSP cannot perform source separation in the presence of type III equipment; they
can only perform load classification of type I and II equipment.
Bonfigli et al. [105] conducted an overview study of unsupervised NILM algorithms.
The authors divided the unsupervised algorithms into load classification and source
separation techniques. The load classification techniques presented are variations of HMM
and clustering methods. Bonfigli et al. presented the work by Figueiredo et al. [106] as a
supervised source separation method.
Two papers were studied in more detail, the work developed by Kolter et al. [107],
and by Figueiredo et al. [106]. Both methods are supervised and construct equipment
models using aggregate and equipment data. The models perform the separation of the
aggregate signal. Kolter et al. developed a Discriminative Disaggregation via Sparse
Coding (DDSC) method [107]. The DDSC method works by training separate models
for each equipment and then using the models to disaggregate an aggregate signal. The
model consists of a matrix of basis functions called a dictionary and an activation matrix.
The method developed by Kolter et al. can be divided into a training and an estimation
phase. During training, the method creates separate models for each equipment. The
models are calculated by an optimization method that switches between estimating the
dictionary and the activation matrix. The activation and basis matrices are then calculated
with a method proposed by Kolter et al., called the augmented regularized
disaggregation error. The estimation phase requires solving another optimization problem
to calculate a new activation matrix. The gradient descent method and a structured
perceptron-based algorithm are used to solve the optimization problem.
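As a rough illustration of the dictionary and activation model described above, non-negative sparse coding is commonly written as the optimization problems below. The notation is assumed here for illustration and may differ from that of [107]: $X_i$ is the signal matrix of equipment $i$, $B_i$ its dictionary, $A_i$ its activation matrix, $\bar{X}$ the aggregate signal and $\lambda$ a sparsity weight. The discriminative retraining of the bases is not shown.

```latex
\begin{aligned}
\text{training:}\quad
  & \min_{A_i \ge 0,\; B_i \ge 0} \; \tfrac{1}{2}\lVert X_i - B_i A_i \rVert_F^2
    + \lambda \sum_{p,q} (A_i)_{pq} \\
\text{estimation:}\quad
  & \hat{A}_{1:k} = \arg\min_{A_{1:k} \ge 0} \;
    \tfrac{1}{2}\Bigl\lVert \bar{X} - \sum_{i=1}^{k} B_i A_i \Bigr\rVert_F^2
    + \lambda \sum_{i,p,q} (A_i)_{pq},
    \qquad \hat{X}_i = B_i \hat{A}_i
\end{aligned}
```

In this form, the training problem is solved once per equipment, while the estimation problem keeps the dictionaries fixed and recomputes only the activations for the aggregate signal.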
Figueiredo et al. developed a method called Source Separation via Tensor and Matrix
Factorization (STMF) [106], where tensors are composed of three domains: time, day
and individual equipment data. Tensors are first decomposed by the PARAFAC method
and then further decomposed by Nonnegative Matrix Factorization (NMF). According
to Figueiredo et al., the STMF implementation presented better results than the DDSC
method.
Methodology
This chapter outlines the development of the two proposed methods. The chapter
presents the preprocessing of the HIPE and IMDELD datasets, details the Equipment
Modelling Using Polynomial Functions (EMUPF) and Unsupervised Neural Network
(UNN) methods and describes the testing of the models. The testing process comprises
the analysis of the results through error metrics and descriptive statistics. The methodology
followed is depicted in Figure 4.1.
4.1 Dataset Preprocessing
Table 4.1: General information about the HIPE dataset, including the timestamp at which
the dates stop being consecutive.
Equipment index  Number of samples  Number of unique samples  Number of missing samples  First timestamp      Last timestamp       Non-consecutive timestamp
CS               110467             110460                    494340                     23-10-2017 00:00:00  29-10-2017 23:59:56  29-10-2017 02:59:54
HTO              110466             110464                    494336                     23-10-2017 00:00:02  29-10-2017 23:59:58  29-10-2017 02:59:56
PPU              122773             122771                    482029                     23-10-2017 00:00:02  29-10-2017 23:59:57  29-10-2017 02:59:56
SP               122742             122741                    482059                     23-10-2017 00:00:04  29-10-2017 23:59:59  29-10-2017 02:59:58
SO               110492             110490                    494310                     23-10-2017 00:00:02  29-10-2017 23:59:57  29-10-2017 02:59:56
VO               110465             110459                    494341                     23-10-2017 00:00:01  29-10-2017 23:59:57  29-10-2017 02:59:55
VP1              110492             110486                    494314                     23-10-2017 00:00:04  29-10-2017 23:59:57  29-10-2017 02:59:56
VP2              110496             110493                    494307                     23-10-2017 00:00:01  29-10-2017 23:59:55  29-10-2017 02:59:59
WM               110495             110489                    494311                     23-10-2017 00:00:03  29-10-2017 23:59:55  29-10-2017 02:59:59
The equipment and aggregate samples were interpolated to ensure a sampling frequency of
1 Hz. The aim was to study the effect of the number of equipment on the results. The
aggregate data is synthetic: it was generated rather than measured at the main terminal
energy meter. The aggregate active power was calculated as the sum of the active power
samples of each equipment. In total, eight sums were calculated. The first aggregate is
the sum of the first two equipment, the second aggregate is the sum of the first three
equipment, and so on, until the last aggregate, which is the sum of all nine equipment.
The equipment state data was computed by applying a threshold: a sample above the
threshold is assigned to the ON state, and a sample below it to the OFF state. The
threshold was defined as 0 W. Finally, the data was divided into training and testing sets
using an adaptive binning algorithm. The algorithm splits the data into six bins based on
the standard deviation of the samples. Data were selected randomly and in equal numbers
from each bin. The selected data were divided in a 70-30 ratio into training and testing
data. The adaptive binning technique was used to remove outliers from the training and
testing data and to ensure a balanced representation of the data. This step was crucial
since some equipment remained in one state for most of the sampling period.
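The construction of the synthetic aggregates and of the equipment states can be sketched as follows. This is a minimal illustration on toy data, not the code used in this work: the function names and the sample values are hypothetical, and a sample exactly equal to the threshold is assumed to be OFF, a case the text does not specify.

```python
def build_aggregates(equipment_power):
    """Return cumulative aggregates: first two series summed, then first three, and so on."""
    n_samples = len(equipment_power[0])
    running = [0.0] * n_samples
    aggregates = []
    for k, series in enumerate(equipment_power):
        running = [r + p for r, p in zip(running, series)]
        if k >= 1:  # an aggregate needs at least two equipment
            aggregates.append(list(running))
    return aggregates


def on_off_states(power, threshold_w=0.0):
    """1 (ON) when the sample is strictly above the threshold, else 0 (OFF)."""
    return [1 if p > threshold_w else 0 for p in power]


# Toy data: three hypothetical equipment, four samples each (W).
equipment_power = [
    [0.0, 120.0, 118.0, 0.0],
    [50.0, 50.0, 0.0, 0.0],
    [0.0, 0.0, 300.0, 310.0],
]
aggregates = build_aggregates(equipment_power)
states = [on_off_states(series) for series in equipment_power]
```

With nine equipment series as input, `build_aggregates` returns eight aggregates, matching the eight sums described above.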
The equipment state data was calculated with a threshold equal to 5 W. Training and
testing data were obtained in a 70-30 ratio by randomly selecting samples, in equal
numbers, from each bin produced by the adaptive binning technique.
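One possible reading of the adaptive binning step is sketched below. The text only states that six bins are built from the standard deviation of the samples, so several details are assumptions made for illustration: the bins are taken as equal-width intervals spanning the mean plus or minus three standard deviations, samples outside that range are discarded as outliers, and the number of samples drawn per bin is the size of the smallest non-empty bin. The function name `adaptive_bin_split` is hypothetical.

```python
import random
import statistics


def adaptive_bin_split(samples, n_bins=6, train_ratio=0.7, seed=0):
    """Return (train_idx, test_idx) drawn evenly from standard-deviation-based bins."""
    rng = random.Random(seed)
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)
    if std == 0:
        raise ValueError("constant signal: bins cannot be built")
    # Assumption: equal-width bins spanning mean +/- 3 standard deviations.
    low, high = mean - 3 * std, mean + 3 * std
    width = (high - low) / n_bins
    bins = [[] for _ in range(n_bins)]
    for idx, value in enumerate(samples):
        if low <= value < high:  # samples outside the range are treated as outliers
            bins[min(int((value - low) / width), n_bins - 1)].append(idx)
    non_empty = [b for b in bins if b]
    per_bin = min(len(b) for b in non_empty)  # same number of samples from each bin
    selected = [idx for b in non_empty for idx in rng.sample(b, per_bin)]
    rng.shuffle(selected)
    cut = round(train_ratio * len(selected))  # 70-30 split by default
    return selected[:cut], selected[cut:]


samples = [float(i % 10) for i in range(200)]  # toy active power values
train_idx, test_idx = adaptive_bin_split(samples)
```

Drawing the same number of samples from every bin is what keeps a state that dominates the sampling period from dominating the training data.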
4.2 Methods
In the current section, the developed methods are presented, which include the EMUPF
and UNN methods. Both methods create models to estimate the active power values
of each equipment. The methods consist of two phases: an initial offline training phase
and an online inference phase. In the training phase, the models are calculated using the
aggregate active power and the equipment state training data. In the inference phase,
the active power consumption values of the equipment are calculated using the estimated
models from the training phase.
$$ f(a_t, s_{it}) = s_{it} \cdot \left( c_{0i} \cdot a_t^0 + c_{1i} \cdot a_t^1 + c_{2i} \cdot a_t^2 + c_{3i} \cdot a_t^3 \right) \tag{4.1} $$
where $f(a_t, s_{it})$ is the active power value of the equipment with index $i$ for the sample
with index $t$, given the aggregate active power sample $a_t$ and the equipment state $s_{it}$.
The degree of the polynomial function was carefully selected. Generally, a higher
degree results in a more accurate model but also increases the computational complexity
of the optimization problem. It is important to choose a degree that balances
underfitting, overfitting and computational complexity. Degree three was chosen
for its ability to capture the non-linearity of the problem while providing a simplified
representation of the equipment’s behaviour and allowing for the analysis of the feasibility
of the proposed algorithm. Third-order polynomial functions are commonly used in the
literature to solve load forecasting problems [108].
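Equation (4.1) can be evaluated with a few lines of code. The sketch below is illustrative only: the function name and the coefficient values are hypothetical and were not estimated from any dataset.

```python
def equipment_power(aggregate, state, coefficients):
    """Evaluate Eq. (4.1): state * sum_k c_k * aggregate**k for one equipment."""
    polynomial = sum(c * aggregate ** k for k, c in enumerate(coefficients))
    return state * polynomial


coeffs = [0.5, 0.2, 0.0, 0.001]          # hypothetical c_0 .. c_3
on = equipment_power(10.0, 1, coeffs)    # equipment ON: the polynomial is evaluated
off = equipment_power(10.0, 0, coeffs)   # equipment OFF: the estimate is forced to zero
```

The multiplication by the state is what ties the model to the SCADA ON/OFF data: an equipment in the OFF state contributes nothing, whatever the aggregate value.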
a is the aggregate active power sample. S is the equipment state matrix, with the state
samples for each equipment. JACS is the sum of the equipment active power values, that
is, the sum of the polynomial functions given in Equation (4.8). λ is a regularization
parameter that multiplies the penalty value, p, given by Equation (4.9). The inputs of
the training phase are the aggregate active power and the equipment state samples.
Equation (4.3) is minimized for each input, that is, for one aggregate active power sample
and the equipment state samples of all the equipment at the same timestamp t. Once the
objective function has been minimized for all inputs, the estimated coefficients are
averaged.
The objective function calculates the squared difference between the aggregate active
power and the sum of the active power values of the equipment in the ON state while
penalizing negative values for the estimated equipment’s active power.
The matrices J, A, C and S are denoted by Equations (4.4), (4.5), (4.6) and (4.7).
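$$ J = \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} \tag{4.4} $$
$$ A = \begin{bmatrix} a^0 & a^1 & \cdots & a^r \\ \vdots & \vdots & \ddots & \vdots \\ a^0 & a^1 & \cdots & a^r \end{bmatrix} \tag{4.5} $$
$$ C = \begin{bmatrix} c_{00} & c_{10} & \cdots & c_{n0} \\ c_{01} & c_{11} & \cdots & c_{n1} \\ \vdots & \vdots & \ddots & \vdots \\ c_{0r} & c_{1r} & \cdots & c_{nr} \end{bmatrix} \tag{4.6} $$
$$ S = \begin{bmatrix} s_0 \\ \vdots \\ s_n \end{bmatrix} \tag{4.7} $$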
n is the total number of equipment in the aggregate. r is the degree of the polynomial
functions. J is a row vector of ones of size 1 × n. A is the aggregate matrix with size n × r.
The coefficients matrix C has size r × n. The state matrix S has size n × 1.
$$ JACS = \sum_{i=1}^{n} s_i \cdot \left( c_{0i} \cdot a^0 + c_{1i} \cdot a^1 + \cdots + c_{ri} \cdot a^r \right) \tag{4.8} $$
where $i$ is the equipment index, $s_i$ is the state of the equipment with index $i$, $c$
represents the coefficients of the polynomial function and $a$ is the aggregate active power
value. $r$ was defined as three. For the data used from the HIPE dataset, $n$ is nine, and
for the IMDELD dataset, $n$ is six.
The penalty function, described by Equation (4.9), penalizes negative equipment active
power values.
$$ p = \sum_{i=1}^{n} p_i \tag{4.9} $$
where $p_i$ is the penalty value for the equipment with index $i$, shown by Equation (4.10).
$$ p_i = \begin{cases} |\phi \cdot e_i|^{\gamma}, & \text{if } e_i < 0 \\ 0, & \text{otherwise} \end{cases} \tag{4.10} $$
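$e_i$, given by Equation (4.11), corresponds to the estimated active power value of the
equipment $i$, that is, the term of the sum in Equation (4.8) that belongs to equipment $i$.
$\phi$ and $\gamma$ are positive constants. $\phi$ was defined as three and $\gamma$ as two.
$$ e_i = s_i \cdot \left( c_{0i} \cdot a^0 + c_{1i} \cdot a^1 + \cdots + c_{ri} \cdot a^r \right) \tag{4.11} $$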
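The pieces above can be combined into a small sketch of the objective for a single input. It follows the verbal description (squared difference between the aggregate and the sum of the equipment estimates, plus the weighted penalty of Equations (4.9) and (4.10)). The function names, the toy coefficients and the value of the regularization parameter `lam` are assumptions made for illustration; only ϕ = 3 and γ = 2 are taken from the text.

```python
PHI = 3.0    # penalty scale, value given in the text
GAMMA = 2.0  # penalty exponent, value given in the text


def estimates(aggregate, states, coeffs):
    """Estimated active power of each equipment; coeffs[i] holds c_0i .. c_ri."""
    return [
        s * sum(c * aggregate ** k for k, c in enumerate(ci))
        for s, ci in zip(states, coeffs)
    ]


def penalty(values, phi=PHI, gamma=GAMMA):
    """Eqs. (4.9)-(4.10): sum of |phi * e_i|**gamma over the negative estimates."""
    return sum(abs(phi * e) ** gamma for e in values if e < 0)


def objective(aggregate, states, coeffs, lam=1.0):
    """Squared reconstruction error plus the weighted penalty (lam is hypothetical)."""
    e = estimates(aggregate, states, coeffs)
    residual = aggregate - sum(e)  # aggregate minus the sum in Eq. (4.8)
    return residual ** 2 + lam * penalty(e)


# Two toy equipment whose estimates (3.0 and -1.0) add up to the aggregate (2.0).
toy_coeffs = [[0.0, 1.5, 0.0, 0.0], [0.0, -0.5, 0.0, 0.0]]
value = objective(2.0, [1, 1], toy_coeffs)
```

In this toy case the reconstruction error is zero, yet the objective is not: the negative estimate of the second equipment is penalized, which is the behaviour the penalty term is meant to enforce.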
Figure 4.2: Diagram representing a single training phase run for the EMUPF method.
The EMUPF method was run five times and the coefficients of the run with the lowest
mean objective function value were selected. The main steps for the method’s training
phase are presented in Algorithm 1.
Figure 4.3: Diagram illustrating the inference phase of the EMUPF method for
disaggregating an aggregate active power sample into the estimated active power values
for each equipment.
The EMUPF method proposes a new approach to solve the NILM problem by modelling
the active power consumption of the equipment with polynomial functions. No method
that defines an objective function that includes the sum of polynomial functions and uses
optimization algorithms to estimate the coefficients of the functions has been found in the
literature.
used the ReLU activation function except the output layer, which used the hyperbolic
tangent activation function.
where $l_i$ is the loss function value for the output neuron $i$, $a$ is the input aggregate
active power sample, $s_i$ is the state of the equipment with index $i$, $o_i$ is the output
value of the neuron and $p_i$ is the value of the penalty function.
Equation (4.14) shows the derivative of the loss function used by the backpropagation
algorithm to update the weights and bias of the network’s neurons.
$$ \frac{\partial l_i}{\partial o_i} = 2 \cdot \left( a - \sum_{i=0}^{n} s_i \cdot o_i \right) + \frac{\partial p_i}{\partial o_i} \tag{4.14} $$
The penalty value was calculated for each output neuron following Equation (4.15).
$$ p_i = \begin{cases} |\phi \cdot o_i|^{\gamma}, & \text{if } o_i < 0 \\ 0, & \text{otherwise} \end{cases} \tag{4.15} $$
$p_i$ is the penalty value for the output $o_i$. For the current problem, both $\phi$ and
$\gamma$ were defined as three. The derivative of the penalty function is:
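$$ \frac{\partial p_i}{\partial o_i} = \begin{cases} \dfrac{\gamma \left( |o_i| \, |\phi| \right)^{\gamma}}{o_i}, & \text{if } o_i < 0 \\ 0, & \text{otherwise} \end{cases} \tag{4.16} $$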
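The closed-form derivative of the penalty can be checked numerically against a central finite difference. The sketch below is only a sanity check of Equations (4.15) and (4.16) with ϕ = γ = 3 as stated in the text; the function names and the test point are hypothetical.

```python
def penalty(o, phi=3.0, gamma=3.0):
    """Eq. (4.15): |phi * o|**gamma for negative outputs, zero otherwise."""
    return abs(phi * o) ** gamma if o < 0 else 0.0


def penalty_derivative(o, phi=3.0, gamma=3.0):
    """Eq. (4.16): gamma * (|o| * |phi|)**gamma / o for negative outputs."""
    return gamma * (abs(o) * abs(phi)) ** gamma / o if o < 0 else 0.0


def finite_difference(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


o = -0.4  # a negative network output
analytic = penalty_derivative(o)
numeric = finite_difference(penalty, o)
```

For this test point the two values agree closely, and the derivative is negative, so a gradient descent step increases a negative output towards zero.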
The input aggregate values were normalized based on the maximum and minimum
values of the training aggregate active power. The estimates are denormalized at the end.
For the Fourier mapping, the inputs must be normalized between −π and π.
$$ \hat{a}_i = \frac{a_i - \min(a)}{\max(a) - \min(a)} \tag{4.17} $$
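A minimal sketch of the normalization of Equation (4.17) and of its inverse is given below. The rescaling to the interval from −π to π is written as a simple affine map of the normalized value, which is an assumption: the text states the target interval but not the formula used. All function names are hypothetical.

```python
import math


def fit_min_max(train_aggregate):
    """Minimum and maximum of the training aggregate active power."""
    return min(train_aggregate), max(train_aggregate)


def normalize(a, lo, hi):
    """Eq. (4.17): map a from [lo, hi] to [0, 1]."""
    return (a - lo) / (hi - lo)


def denormalize(a_hat, lo, hi):
    """Inverse of Eq. (4.17)."""
    return a_hat * (hi - lo) + lo


def to_fourier_range(a_hat):
    """Assumed affine rescaling of a normalized value from [0, 1] to [-pi, pi]."""
    return (2.0 * a_hat - 1.0) * math.pi


lo, hi = fit_min_max([2.0, 4.0, 10.0])   # toy training aggregate
a_hat = normalize(4.0, lo, hi)
restored = denormalize(a_hat, lo, hi)
```

The minimum and maximum are taken from the training data only, so testing samples outside that range map to values below zero or above one.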
Figure 4.4: Diagram illustrating the training phase for the UNN method with the aggregate
active power and the equipment state data as input. Equipment state samples are part of
the input layer of the network and are used by the objective function.
Figure 4.5: Diagram depicting the training phase for the UNN method with Fourier
mapping. The objective function uses the equipment state data, but the equipment state
samples are not used in the input layer of the network.
In Algorithm 3, the inputs for the first network architecture are the aggregate active
power and the equipment state samples, and the inputs for the second network architecture
are the features from the Fourier mapping of the aggregate active power.
Twenty networks with the same architecture were trained. At the end of the training
phase, the average loss was calculated for all the outputs and the network model with the
lowest mean loss value was selected.
Figure 4.6: Diagram illustrating the UNN method’s inference phase for a single aggregate
active power sample. The method has as input the aggregate active power and the
equipment state data. Equipment state samples are part of the input layer of the network
and are used by the objective function.
Figure 4.7: Diagram representing the inference phase for the UNN method with Fourier
mapping for a single aggregate active power sample. The objective function uses the
equipment state data, but the samples are not used in the input layer of the network.
During the inference phase, the active power consumption values of the equipment
were estimated by forward propagation of the inputs for the trained network model.
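Forward propagation through a small fully connected network can be sketched as follows. The layer sizes, weights and inputs are hypothetical and chosen only to keep the example readable; the sketch assumes ReLU activations in the hidden layers and a hyperbolic tangent in the output layer, as stated for the network, and is not the trained model of this work.

```python
import math


def relu(x):
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), one output per weight row."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Propagate inputs through the layers: ReLU hidden layers, tanh output layer."""
    hidden = inputs
    for weights, biases in layers[:-1]:
        hidden = dense(hidden, weights, biases, relu)
    weights, biases = layers[-1]
    return dense(hidden, weights, biases, math.tanh)


# Toy network: 3 inputs (normalized aggregate and two states), 2 hidden, 2 outputs.
layers = [
    ([[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 0.5]], [0.0, 0.0]),
]
outputs = forward([0.5, 1.0, 0.0], layers)  # one normalized estimate per equipment
```

Each output neuron corresponds to one equipment, and the outputs would still have to be denormalized to obtain active power values.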
In the literature, no work was found that used an unsupervised neural network to
solve a NILM problem. The first Unsupervised Neural Network architecture, network
parameters and objective function for a NILM problem were defined, and their applications
were studied. For the first time, the application of Fourier feature mapping was studied
for a UNN.
39 4.3. Descriptive Statistical Analysis
Results
In the current chapter, the results of the preprocessing stage and the results of the
two proposed methods are presented. The methods were assessed using the MAE, MSE
and RMSE error metrics, described in Equations (5.1), (5.2) and (5.3). The error metrics
calculate the error between the expected and estimated equipment active power values.
The expected values correspond to the equipment active power values from the testing
data from both datasets. The estimated equipment active power values were the result of
the methods using the testing data as input.
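$$ \mathrm{MAE} = \frac{1}{n} \sum_{i=0}^{n} |y_i - \bar{y}_i| \tag{5.1} $$
$$ \mathrm{MSE} = \frac{1}{n} \sum_{i=0}^{n} (y_i - \bar{y}_i)^2 \tag{5.2} $$
$$ \mathrm{RMSE} = \sqrt{\mathrm{MSE}} \tag{5.3} $$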
where $y_i$ is the expected output and $\bar{y}_i$ corresponds to the estimated value. The error
metrics were calculated in kW.
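The three error metrics can be computed as in the sketch below. The function names and the toy values are hypothetical, and the sums are divided by the number of samples in the input lists.

```python
import math


def mae(expected, estimated):
    """Mean absolute error, Eq. (5.1)."""
    return sum(abs(y - y_hat) for y, y_hat in zip(expected, estimated)) / len(expected)


def mse(expected, estimated):
    """Mean squared error, Eq. (5.2)."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(expected, estimated)) / len(expected)


def rmse(expected, estimated):
    """Root mean squared error, Eq. (5.3)."""
    return math.sqrt(mse(expected, estimated))


expected = [1.0, 2.0, 3.0]    # toy expected active power values (kW)
estimated = [1.0, 2.5, 2.0]   # toy estimated active power values (kW)
errors = (mae(expected, estimated), mse(expected, estimated), rmse(expected, estimated))
```

Because the MSE squares each difference, a few large errors raise it far more than they raise the MAE, which is why both are reported.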
5.1 Dataset Preprocessing
Figure 5.1: Preprocessed aggregate active power that results from the sum of nine
equipment in the HIPE dataset.
Figure 5.2: Preprocessed equipment active power data for the HIPE dataset in a single
plot.
Figure 5.3: Preprocessed equipment active power data for the HIPE dataset, divided into
multiple plots.
Figure 5.4: Preprocessed equipment states data for the HIPE dataset.
Figure 5.5: Preprocessed aggregate active power for the IMDELD dataset.
Figure 5.6: Preprocessed equipment active power for the IMDELD dataset in a single
plot.
Figure 5.7: Preprocessed equipment active power data for the IMDELD dataset in multiple
subplots.
Figure 5.8: Preprocessed equipment states data for the IMDELD dataset.
5.2 Methods Evaluation
Table 5.1: MAE for the equipment active power samples, estimated by the EMUPF
method, for the testing data from the HIPE dataset.
The EMUPF method presented large error values, especially for the estimates of active
power for equipment with indexes nine and ten. The large error values indicate the
method’s inability to estimate the equipment’s active power values.
The error metrics for the aggregate active power provide a good overview of the
accuracy of the estimations. The error was calculated between the true aggregate active
power and the sum of the estimated equipment active power values. The error metric
values should be small for an accurate estimation.
The EMUPF method shows high values for the error metrics of the aggregate, as
shown in Table 5.2.
Table 5.2: Error metrics for the aggregate active power samples, estimated by the EMUPF
method, for the testing data from the HIPE dataset.
Metric  Agg. 1  Agg. 2  Agg. 3  Agg. 4  Agg. 5  Agg. 6  Agg. 7      Agg. 8
MAE     0.4587  0.4715  0.347   1.172   1.3498  17.199  419.36      2620.6
MSE     0.4532  0.455   0.2867  3.2745  6.8268  502.12  1.0012e+06  9.1477e+07
RMSE    0.6732  0.6745  0.5354  1.8096  2.6128  22.408  1000.6      9564.4
The error increases for a larger number of equipment. The error is the largest for
the seventh and eighth aggregates, the aggregates that were calculated with the largest
number of equipment. The error values are consistent with the fact that the estimated
active power values for the equipment with indexes nine and ten presented larger error
values.
Table 5.4: Error metrics for the aggregate active power samples, estimated by the UNN
method, for the testing data from the HIPE dataset.
Metric  Agg. 1  Agg. 2  Agg. 3  Agg. 4  Agg. 5  Agg. 6  Agg. 7  Agg. 8
MAE     0.0165  0.0077  0.0142  0.0325  0.0493  0.052   0.0919  0.0523
MSE     0.0012  0.0001  0.0008  0.0039  0.0067  0.0137  0.0239  0.01
RMSE    0.0346  0.01    0.0283  0.0624  0.0819  0.117   0.1546  0.1
5.2.1.3 MAE Values for the UNN Method with Fourier Mapping
The error metrics were computed for the UNN method with Fourier feature mapping.
The method shows considerable error values, displayed in Tables 5.5 and 5.6. The error
metrics for the aggregate values increase with the number of equipment. For every
equipment, the error in the estimated active power is larger than for the UNN method
without Fourier mapping.
Table 5.5: MAE for the equipment active power samples, estimated by the UNN method
with Fourier mapping, for the testing data from the HIPE dataset.
Table 5.6: Error metrics for the aggregate active power samples, estimated by the UNN
method with Fourier mapping, for the testing data from the HIPE dataset.
Metric  Agg. 1  Agg. 2  Agg. 3  Agg. 4  Agg. 5  Agg. 6  Agg. 7  Agg. 8
MAE     0.7292  0.7093  2.1402  4.7807  5.5747  8.8285  11.02   12.493
MSE     0.9288  1.147   7.8018  44.87   56.685  140.7   224.02  258.64
RMSE    0.9637  1.071   2.7932  6.6985  7.529   11.864  14.967  16.082
Table 5.8: Error metrics for the aggregate active power samples, estimated by the EMUPF
method, for the testing data from the IMDELD dataset.
Table 5.10: Error metrics for the aggregate active power samples, estimated by the UNN
method, for the testing data from the IMDELD dataset.
Table 5.11: Error metrics for the equipment active power samples, estimated by the UNN
method with Fourier mapping, for the testing data from the IMDELD dataset.
Metric  Eq. 1       Eq. 2       Eq. 3       Eq. 4       Eq. 7      Eq. 8
MAE     77.9464     73.3989     104.6742    70.5946     68.3065    87.9315
MSE     12952.9571  12348.5452  16654.945   11567.7173  8725.065   11602.1169
RMSE    113.8111    111.1240    129.0540    107.5533    93.4081    107.7131
Table 5.12: Error metrics for the aggregate active power samples, estimated by the UNN
method with Fourier mapping, for the testing data from the IMDELD dataset.
Table 5.13: Mean active power values for each equipment, for the expected and estimated
values calculated by the EMUPF method, for the HIPE dataset. The highlighted yellow
cells correspond to the equipment with the highest active power consumption values within
the aggregate.
Aggregate  Values     Eq. 2       Eq. 3       Eq. 4       Eq. 5       Eq. 6       Eq. 7       Eq. 8       Eq. 9       Eq. 10
Agg. 1     Expected   0.26725     0.53221     -           -           -           -           -           -           -
Agg. 1     Estimated  0.25937     0.081432    -           -           -           -           -           -           -
Agg. 2     Expected   0.23658     0.45366     0.16928     -           -           -           -           -           -
Agg. 2     Estimated  0.29402     0.10823     -0.01426    -           -           -           -           -           -
Agg. 3     Expected   0.24671     0.38866     0.1657      0.11395     -           -           -           -           -
Agg. 3     Estimated  0.31315     0.40977     0.13564     0.027939    -           -           -           -           -
Agg. 4     Expected   0.15195     0.090656    0.13843     0.066545    1.2488      -           -           -           -
Agg. 4     Estimated  0.86575     -0.093604   0.2271      0.10483     -0.024894   -           -           -           -
Agg. 5     Expected   0.15599     0.076292    0.14208     0.075591    1.2465      0.00030822  -           -           -
Agg. 5     Estimated  0.89797     0.55498     -0.20995    0.056242    0.097145    0.0038869   -           -           -
Agg. 6     Expected   0.15423     0.057363    0.12941     0.073186    1.1333      0.00054795  0.54527     -           -
Agg. 6     Estimated  0.48079     1.5985      2.0034      -0.29411    0.090579    -0.0034123  15.233      -           -
Agg. 7     Expected   0.16086     0.11537     0.13829     0.089687    0.93818     0.00047945  0.56716     0.20751     -
Agg. 7     Estimated  0.095219    -0.25764    2.4769      0.21941     0.20933     0.0031821   0.62836     417.67      -
Agg. 8     Expected   0.15544     0.092312    0.13225     0.085704    0.99808     0.00044521  0.55303     0.20765     0.0010288
Agg. 8     Estimated  -0.19081    0.2007      1.208       0.27069     -0.73226    -0.010815   30.296      1.4746      2589.7
in accordance with the error metrics calculated previously. The method failed to identify
high-power-consuming equipment. The mean statistical measures are shown in Table 5.15.
Table 5.15: Mean active power values for each equipment, for the expected and estimated
values calculated by the UNN method with Fourier mapping, for the HIPE dataset.
The highlighted yellow cells correspond to the equipment with the highest active power
consumption values within the aggregate.
Aggregate  Values     Eq. 2       Eq. 3       Eq. 4       Eq. 5       Eq. 6       Eq. 7       Eq. 8       Eq. 9       Eq. 10
Agg. 1     Expected   0.26725     0.53221     -           -           -           -           -           -           -
Agg. 1     Estimated  0.36388     0.29839     -           -           -           -           -           -           -
Agg. 2     Expected   0.23658     0.45366     0.16928     -           -           -           -           -           -
Agg. 2     Estimated  0.87764     0.52853     -0.19188    -           -           -           -           -           -
Agg. 3     Expected   0.24671     0.38866     0.1657      0.11395     -           -           -           -           -
Agg. 3     Estimated  0.88279     0.58662     1.3491      0.19587     -           -           -           -           -
Agg. 4     Expected   0.15195     0.090656    0.13843     0.066545    1.2488      -           -           -           -
Agg. 4     Estimated  0.72788     0.10651     1.4494      0.74717     2.5621      -           -           -           -
Agg. 5     Expected   0.15599     0.076292    0.14208     0.075591    1.2465      0.00030822  -           -           -
Agg. 5     Estimated  1.7111      0.30063     2.0511      0.48857     2.6193      0.0038127   -           -           -
Agg. 6     Expected   0.15423     0.057363    0.12941     0.073186    1.1333      0.00054795  0.54527     -           -
Agg. 6     Estimated  1.578       0.38572     1.0058      1.1092      2.6846      0.13608     2.4479      -           -
Agg. 7     Expected   0.16086     0.11537     0.13829     0.089687    0.93818     0.00047945  0.56716     0.20751     -
Agg. 7     Estimated  2.2877      0.16125     2.7316      1.3792      2.213       -0.040339   2.7806      0.84668     -
Agg. 8     Expected   0.15544     0.092312    0.13225     0.085704    0.99808     0.00044521  0.55303     0.20765     0.0010288
Agg. 8     Estimated  1.8493      0.56248     2.6314      1.3154      2.2591      0.092278    3.3574      1.3317      0.7311
Table 5.16: Mean and sum active power values for each equipment, for the expected
and estimated values calculated by the EMUPF method, for the IMDELD dataset.
The highlighted yellow cells correspond to the equipment with the highest active power
consumption values within the aggregate.
Statistic  Values     Eq. 1        Eq. 2        Eq. 3        Eq. 4        Eq. 7        Eq. 8
mean       Expected   711.88       692.52       2523.1       4211.5       54434        48396
mean       Estimated  1.8229e+14   4.9231e+13   -8.7258e+13  -1.5357e+14  -1.1658e+13  1.1383e+46
sum        Expected   3.1038e+05   3.0194e+05   1.1001e+06   1.8362e+06   2.3733e+07   2.1101e+07
sum        Estimated  7.948e+16    2.1465e+16   -3.8044e+16  -6.6956e+16  -5.0828e+15  4.9628e+48
the equipment with the highest active power consumption, for the data from the IMDELD
dataset.
Table 5.17: Mean and sum active power values for each equipment, for the expected and
estimated values calculated by the UNN method, for the IMDELD dataset. The highlighted
yellow cells correspond to the equipment with the highest active power consumption values
within the aggregate.
Statistic  Values     Eq. 1        Eq. 2        Eq. 3        Eq. 4        Eq. 7        Eq. 8
mean       Expected   711.88       692.52       2523.1       4211.5       54434        48396
mean       Estimated  14886        333.14       7781.9       4864.6       42760        54794
sum        Expected   3.1038e+05   3.0194e+05   1.1001e+06   1.8362e+06   2.3733e+07   2.1101e+07
sum        Estimated  6.4902e+06   1.4525e+05   3.3929e+06   2.121e+06    1.8643e+07   2.389e+07
Table 5.18: Mean and sum active power values for each equipment, for the expected and
estimated values calculated by the UNN method with Fourier mapping, for the IMDELD
dataset. The highlighted yellow cells correspond to the equipment with the highest active
power consumption values within the aggregate.
Statistic  Values     Eq. 1        Eq. 2        Eq. 3        Eq. 4        Eq. 7        Eq. 8
mean       Expected   711.88       692.52       2523.1       4211.5       54434        48396
mean       Estimated  1806.9       -66417       80943        72866        7229.2       1.3325e+05
sum        Expected   844.59       835.01       3227.4       5510         66336        60229
sum        Estimated  7.8783e+05   -2.8958e+07  3.5291e+07   3.177e+07    3.1519e+06   5.8098e+07
The chapter discusses the attained results by analysing the metrics from the proposed
methods. Additionally, the chapter presents relevant considerations and final remarks and
addresses the possible directions to expand on the developed work.
6.1 Discussion
6.1.1 Results
The EMUPF method was ineffective in estimating the active power of the equipment
and had the highest MAE, MSE and RMSE values. The UNN method achieved the lowest
MAE, MSE and RMSE values for the HIPE and IMDELD datasets, compared to the EMUPF
method and the UNN method with Fourier mapping. The UNN method accurately estimated
the active power consumption of the equipment. The UNN method proved to be a viable
unsupervised NILM technique.
6.2 Conclusion
Two novel algorithms that perform low-frequency unsupervised NILM for industrial
loads were developed and studied. The algorithms were trained and validated using
two public datasets. A survey was conducted on public NILM datasets, and the HIPE
and IMDELD datasets were selected. The datasets required preprocessing to clean the
aggregate and equipment active power and to calculate the equipment state data. The
preprocessing included dividing the data into training and testing data. The EMUPF
algorithm was rejected due to the high values of the error metrics of the results and the
fact that it does not scale to a large number of equipment. The UNN method proved to
be a viable solution to the NILM problem: it achieved low error metric values and
accurately identified the equipment with the highest active power consumption. The UNN
algorithm is compatible with type III equipment and is not event-based. The UNN method
requires that calibration be performed for each specific problem. The analysis of the
results for the IMDELD dataset showed that estimating the equipment active power values
from an aggregate can be difficult when there is a significant difference in the active
power values of the equipment. The objectives were achieved by creating the first algorithm that
performs unsupervised low-frequency NILM for industrial loads.
on the results of the ongoing measures. A study should be conducted to analyze the
applications and benefits of the UNN algorithm in a real-world scenario. BuggyPower’s
production plant should use the UNN method, and the algorithm’s performance and
outcomes on the facility’s electricity consumption should be examined.
Bibliography
[1] J. Leiria, R. Salles, J. Mendes, P. Sousa, Soft sensors for industrial applications:
Comparison of variables selection methods and regression models, in: 2023 International
Conference on Control, Automation and Diagnosis (ICCAD), 2023, pp. 1–6.
doi:10.1109/ICCAD57653.2023.10152323.
[9] S. Sorrell, Reducing energy demand: A review of issues, challenges and approaches,
Renewable and Sustainable Energy Reviews 47 (2015). doi:10.1016/j.rser.2015.03.002.
[10] IEA, Global energy review 2021 – analysis - iea, International Energy Agency (2021).
[12] IEA, Statistics report: Key world energy statistics 2021 (2021).
[31] N. Buneeva, A. Reinhardt, Ambal: Realistic load signature generation for load
disaggregation performance evaluation, 2017 IEEE International Conference on
Smart Grid Communications, SmartGridComm 2017 2018-January (2018).
doi:10.1109/SmartGridComm.2017.8340657.
[32] S. Makonin, B. Ellert, I. V. Bajić, F. Popowich, Electricity, water, and natural gas
consumption of a residential house in canada from 2012 to 2014, Scientific Data 3
(2016). doi:10.1038/sdata.2016.37.
[41] C. Beckel, W. Kleiminger, R. Cicchetti, T. Staake, S. Santini, The eco data set
and the performance of non-intrusive load monitoring algorithms, BuildSys 2014 -
Proceedings of the 1st ACM Conference on Embedded Systems for Energy-Efficient
Buildings (2014). doi:10.1145/2674061.2674064.
[43] C. Shin, E. Lee, J. Han, J. Yim, W. Rhee, H. Lee, The enertalk dataset, 15 hz
electricity consumption data from 22 houses in korea, Scientific Data 6 (2019).
doi:10.1038/s41597-019-0212-5.
[44] H. Xu, L. König, D. Cáliz, H. Schmeck, A generic user interface for energy
management in smart homes, Energy Informatics 1 (2018).
doi:10.1186/s42162-018-0060-0.
[47] M. Gulati, S. S. Ram, A. Singh, An in depth study into using emi signatures for
appliance identification, BuildSys 2014 - Proceedings of the 1st ACM Conference on
Embedded Systems for Energy-Efficient Buildings (2014).
doi:10.1145/2674061.2674070.
[79] T. Hong, S. Fan, Probabilistic electric load forecasting: A tutorial review,
International Journal of Forecasting 32 (2016). doi:10.1016/j.ijforecast.2015.11.011.
[86] J. Mendes, R. Seco, R. Araújo, Automatic extraction of the fuzzy control system
for industrial processes, in: ETFA2011, 2011, pp. 1–8.
doi:10.1109/ETFA.2011.6059063.
[109] M. Thakur, A new genetic algorithm for global optimization of multimodal
continuous functions, Journal of Computational Science 5 (2014).
doi:10.1016/j.jocs.2013.05.005.
[110] K. Socha, M. Dorigo, Ant colony optimization for continuous domains, European
Journal of Operational Research 185 (3) (2008) 1155–1173.
doi:10.1016/j.ejor.2006.06.046.
Definitions
A.2 Event
An event corresponds to a significant variation in the aggregate electrical signal and
suggests a change in the state of one equipment.
A.5 Low-Frequency
Low-frequency refers to frequencies equal to or less than 1 Hz.
A.6 State
The equipment state refers to the discrete operating modes of the equipment, which
can be ON or OFF.
A.8 Unsupervised
Supervised NILM algorithms use a priori knowledge of equipment consumption data,
such as labelled consumption data or signature loads, while unsupervised algorithms do
not have access to equipment data.
Appendix B
Results
B.1 Results from the Preprocessing of the HIPE Dataset
Figure B.1: Preprocessed aggregate active power data for the HIPE dataset for the sum
of the equipment with indexes two and three.
Figure B.2: Preprocessed aggregate active power data for the HIPE dataset for the sum
of the equipment with indexes two through four.
Figure B.3: Preprocessed aggregate active power data for the HIPE dataset for the sum
of the equipment with indexes two through five.
Figure B.4: Preprocessed aggregate active power data for the HIPE dataset for the sum
of the equipment with indexes two through six.
Figure B.5: Preprocessed aggregate active power data for the HIPE dataset for the sum
of the equipment with indexes two through seven.
Figure B.6: Preprocessed aggregate active power data for the HIPE dataset for the sum
of the equipment with indexes two through eight.
Figure B.7: Preprocessed aggregate active power data for the HIPE dataset for the sum
of the equipment with indexes two through nine.
B.2 Estimated Equipment Active Power Values
Figure B.8: Expected and estimated equipment active power samples, estimated by the
EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the
equipment with indexes two and three.
Figure B.9: Expected and estimated equipment active power samples, estimated by the
EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the
equipment with indexes two through four.
Figure B.10: Expected and estimated equipment active power samples, estimated by the
EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the
equipment with indexes two through five.
Figure B.11: Expected and estimated equipment active power samples, estimated by the
EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the
equipment with indexes two through six.
Figure B.12: Expected and estimated equipment active power samples, estimated by the
EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the
equipment with indexes two through seven.
Figure B.13: Expected and estimated equipment active power samples, estimated by the
EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the
equipment with indexes two through eight.
Figure B.14: Expected and estimated equipment active power samples, estimated by the
EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the
equipment with indexes two through nine.
Figure B.15: Expected and estimated equipment active power samples, estimated by the
EMUPF method for the HIPE dataset. The aggregate was calculated as the sum of the
equipment with indexes two through ten.
Figure B.16: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method. The aggregate was calculated as the sum of the
equipment with indexes two and three.
Figure B.17: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method. The aggregate was calculated as the sum of the
equipment with indexes two through four.
Figure B.18: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method. The aggregate was calculated as the sum of the
equipment with indexes two through five.
Figure B.19: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method. The aggregate was calculated as the sum of the
equipment with indexes two through six.
Figure B.20: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method. The aggregate was calculated as the sum of the
equipment with indexes two through seven.
Figure B.21: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method. The aggregate was calculated as the sum of the
equipment with indexes two through eight.
Figure B.22: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method. The aggregate was calculated as the sum of the
equipment with indexes two through nine.
Figure B.23: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method. The aggregate was calculated as the sum of the
equipment with indexes two through ten.
Figure B.24: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method with Fourier mapping. The aggregate was
calculated as the sum of the equipment with indexes two and three.
Figure B.25: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method with Fourier mapping. The aggregate was
calculated as the sum of the equipment with indexes two through four.
Figure B.26: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method with Fourier mapping. The aggregate was
calculated as the sum of the equipment with indexes two through five.
Figure B.27: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method with Fourier mapping. The aggregate was
calculated as the sum of the equipment with indexes two through six.
Figure B.28: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method with Fourier mapping. The aggregate was
calculated as the sum of the equipment with indexes two through seven.
Figure B.29: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method with Fourier mapping. The aggregate was
calculated as the sum of the equipment with indexes two through eight.
Figure B.30: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method with Fourier mapping. The aggregate was
calculated as the sum of the equipment with indexes two through nine.
Figure B.31: Expected and estimated equipment active power samples for the HIPE
dataset, estimated by the UNN method with Fourier mapping. The aggregate was
calculated as the sum of the equipment with indexes two through ten.
Figure B.32: Expected and estimated equipment active power samples, estimated by the
EMUPF method, for the IMDELD dataset.
Figure B.33: Expected and estimated equipment active power samples, estimated by the
UNN method, for the IMDELD dataset.
Figure B.34: Expected and estimated equipment active power samples, estimated by the
UNN method, with Fourier mapping, for the IMDELD dataset.
B.3 MSE and RMSE Values for the HIPE Dataset
Table B.1: MSE for the equipment active power samples, estimated by the EMUPF
method, for the testing data from the HIPE dataset.
Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 1 0.1192 0.6264 - - - - - - -
Agg. 2 0.1647 0.4556 0.046 - - - - - -
Agg. 3 0.1681 0.041 0.0136 0.0218 - - - - -
Agg. 4 3.2647 0.8883 0.1096 0.1287 3.6171 - - - -
Agg. 5 3.1022 4.4528 0.333 0.0467 2.8758 0.0008 - - -
Agg. 6 0.7532 43.517 11.685 1.3279 2.857 0.0006 346.4 - -
Agg. 7 0.0225 2.3615 18.645 0.1818 1.7586 0.0003 0.7848 9.9832e+05 -
Agg. 8 0.3615 0.2944 4.161 0.4112 9.0028 0.0057 1404.3 9.089 9.125e+07
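As a reading aid for the error tables in this section, the sketch below computes MSE and RMSE for one equipment from its expected and estimated samples. It assumes the standard definitions (mean of squared differences, and its square root); the function names and the sample values are hypothetical and are not taken from the datasets.

```python
import math


def mse(expected, estimated):
    """Mean squared error between two equally long sample sequences."""
    if len(expected) != len(estimated):
        raise ValueError("expected and estimated must have the same length")
    return sum((e - p) ** 2 for e, p in zip(expected, estimated)) / len(expected)


def rmse(expected, estimated):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(expected, estimated))


# Hypothetical active power samples for a single equipment (illustrative only)
expected = [0.0, 0.5, 1.0, 0.5]
estimated = [0.0, 1.0, 0.5, 0.5]

print(mse(expected, estimated))   # mean of the squared differences
print(rmse(expected, estimated))  # same error, in the units of the samples
```

Because the RMSE is expressed in the same units as the samples, it is usually easier to compare against the expected power levels than the MSE.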
Table B.2: RMSE for the equipment active power samples, estimated by the EMUPF
method, for the testing data from the HIPE dataset.
Table B.3: MSE for the equipment active power samples, estimated by the UNN method,
for the testing data from the HIPE dataset.
Table B.4: RMSE for the equipment active power samples, estimated by the UNN method,
for the testing data from the HIPE dataset.
B.3.3 MSE and RMSE Values for the UNN Method with Fourier
Mapping
Table B.5: MSE for the equipment active power samples, estimated by the UNN method
with Fourier mapping, for the testing data from the HIPE dataset.
Table B.6: RMSE for the equipment active power samples, estimated by the UNN method
with Fourier mapping, for the testing data from the HIPE dataset.
Table B.7: Maximum active power values for each equipment, for the expected and
estimated values calculated by the EMUPF method, for the HIPE dataset.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 1 Expected 0.8248 2.1679 - - - - - - -
Agg. 1 Estimated 1.50915 0.159669 - - - - - - -
Agg. 2 Expected 0.83 2.12 0.3856 - - - - - -
Agg. 2 Estimated 0.86412 0.243791 0 - - - - - -
Agg. 3 Expected 0.7999 2.2154 0.3495 0.3694 - - - - -
Agg. 3 Estimated 2.05681 2.15237 0.518802 0.12601 - - - - -
Agg. 4 Expected 0.79 3.9703 0.36 0.3878 4.2412 - - - -
Agg. 4 Estimated 7.12293 0.29629 1.26898 3.11844 0.0149771 - - - -
Agg. 5 Expected 0.7288 3.9703 0.377 0.35 4.0081 0.02 - - -
Agg. 5 Estimated 6.54564 15.4126 0.0527945 1.52108 0.472843 0.252218 - - -
Agg. 6 Expected 0.8014 3.2136 0.3574 0.3454 4.7107 0.02 1.05 - -
Agg. 6 Estimated 3.41681 46.3239 9.88307 0.146988 0.253395 0 24.9324 - -
Agg. 7 Expected 0.7876 3.9134 0.347 0.3494 5.1471 0.02 1.0215 0.85 -
Agg. 7 Estimated 0.311008 0 14.0516 2.55827 1.08503 0.132737 3.90077 5344.14 -
Agg. 8 Expected 0.8206 3.9765 0.3744 0.3435 5.1028 0.02 1.0259 0.88 0.01
Agg. 8 Estimated 0.0129706 6.05432 6.96494 4.29919 0.00687302 0 50.4064 18.5576 66106.1
Table B.8: Minimum active power values for each equipment, for the expected and
estimated values calculated by the EMUPF method, for the HIPE dataset.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 1 Expected 0 0 - - - - - - -
Agg. 1 Estimated 0 0 - - - - - - -
Agg. 2 Expected 0 0 0 - - - - - -
Agg. 2 Estimated 0 0 -0.0229015 - - - - - -
Agg. 3 Expected 0 0 0 0 - - - - -
Agg. 3 Estimated 0 0 0 0 - - - - -
Agg. 4 Expected 0 0 0 0 0 - - - -
Agg. 4 Estimated 0 -3.62909 -0.00639624 0 -0.193847 - - - -
Agg. 5 Expected 0 0 0 0 0 0 - - -
Agg. 5 Estimated 0 0 -1.4192 -0.0446273 0 0 - - -
Agg. 6 Expected 0 0 0 0 0 0 0 - -
Agg. 6 Estimated 0 0 0 -6.28221 -0.0230436 -0.124548 0 - -
Agg. 7 Expected 0 0 0 0 0 0 0 0 -
Agg. 7 Estimated 0 -5.85447 0 0 0 0 0 0 -
Agg. 8 Expected 0 0 0 0 0 0 0 0 0
Agg. 8 Estimated -1.64326 -0.0497186 0 0 -4.03606 -0.486072 0 0 -0.0933964
87 B.4. Descriptive Statistical Analysis
Table B.9: Median active power values for each equipment, for the expected and estimated
values calculated by the EMUPF method, for the HIPE dataset.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 1 Expected 0.3 0.09 - - - - - - -
Agg. 1 Estimated 0.12825 0.11792 - - - - - - -
Agg. 2 Expected 0.24 0.09 0.211 - - - - - -
Agg. 2 Estimated 0.12843 0.16035 -0.01802 - - - - - -
Agg. 3 Expected 0.25 0 0.21 0 - - - - -
Agg. 3 Estimated 0.14452 0 0.10383 0 - - - - -
Agg. 4 Expected 0 0 0.21 0 0.50465 - - - -
Agg. 4 Estimated 0 0 0.019924 0 0 - - - -
Agg. 5 Expected 0 0 0.21 0 0.40395 0 - - -
Agg. 5 Estimated 0 0 0 0 0.020006 0 - - -
Agg. 6 Expected 0 0 0.21 0 0.15 0 0.86205 - -
Agg. 6 Estimated 0 0 0.54418 0 0.033342 0 23.862 - -
Agg. 7 Expected 0 0 0.21 0 0.15 0 0.86985 0 -
Agg. 7 Estimated 0 0 0.63784 0 0.054635 0 0.10542 0 -
Agg. 8 Expected 0 0 0.21 0 0.14535 0 0.87 0 0
Agg. 8 Estimated 0 0 0.27832 0 -0.034273 0 47.408 0 0
Table B.10: Sum of the active power values for each equipment, for the expected and
estimated values calculated by the EMUPF method, for the HIPE dataset. The highlighted
yellow cells correspond to the equipment with the highest active power consumption values
within the aggregate.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 2 Expected 125.08 249.07 - - - - - - -
Agg. 2 Estimated 121.39 38.11 - - - - - - -
Agg. 3 Expected 122.08 234.09 87.348 - - - - - -
Agg. 3 Estimated 151.71 55.846 -7.3584 - - - - - -
Agg. 4 Expected 126.32 198.99 84.838 58.34 - - - - -
Agg. 4 Estimated 160.33 209.8 69.445 14.305 - - - - -
Agg. 5 Expected 88.739 52.943 80.841 38.863 729.3 - - - -
Agg. 5 Estimated 505.6 -54.665 132.63 61.218 -14.538 - - - -
Agg. 6 Expected 91.099 44.555 82.972 44.145 727.96 0.18 - - -
Agg. 6 Estimated 524.41 324.11 -122.61 32.846 56.733 2.27 - - -
Agg. 7 Expected 90.068 33.5 75.573 42.741 661.82 0.32 318.44 - -
Agg. 7 Estimated 280.78 933.52 1170 -171.76 52.898 -1.9928 8896.3 - -
Agg. 8 Expected 93.94 67.378 80.764 52.377 547.9 0.28 331.22 121.18 -
Agg. 8 Estimated 55.608 -150.46 1446.5 128.13 122.25 1.8583 366.96 2.4392e+05 -
Agg. 9 Expected 90.777 53.91 77.235 50.051 582.88 0.26 322.97 121.27 0.6008
Agg. 9 Estimated -111.43 117.21 705.5 158.08 -427.64 -6.3158 17693 861.16 1.5124e+06
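The four statistics reported in these tables (maximum, minimum, median and sum) can be obtained for any expected or estimated series as sketched below. This is only an illustrative sketch using the Python standard library; the helper name `describe` and the sample values are assumptions, not the code or data used to produce the tables.

```python
from statistics import median


def describe(series):
    """Return the maximum, minimum, median and sum of an active power series."""
    return {
        "max": max(series),
        "min": min(series),
        "median": median(series),
        "sum": sum(series),
    }


# Hypothetical expected and estimated samples for one equipment (illustrative only)
expected = [0.0, 0.25, 0.5, 0.25]
estimated = [-0.5, 0.25, 1.0, 0.5]

expected_stats = describe(expected)
estimated_stats = describe(estimated)

# Print the two series side by side, one statistic per line
for name in ("max", "min", "median", "sum"):
    print(name, expected_stats[name], estimated_stats[name])
```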
B.4.1.2 Maximum, Minimum, Median and Sum Values for the UNN Method
Table B.11: Maximum active power values for each equipment, for the expected and
estimated values calculated by the UNN method, for the HIPE dataset.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 1 Expected 0.8248 2.1679 - - - - - - -
Agg. 1 Estimated 0.7558 1.9776 - - - - - - -
Agg. 2 Expected 0.83 2.12 0.3856 - - - - - -
Agg. 2 Estimated 0.7673 1.7204 0.6161 - - - - - -
Agg. 3 Expected 0.7999 2.2154 0.3495 0.3694 - - - - -
Agg. 3 Estimated 0.8679 2.0062 0.525 0.7967 - - - - -
Agg. 4 Expected 0.79 3.9703 0.36 0.3878 4.2412 - - - -
Agg. 4 Estimated 0.793 2.4464 1.9142 1.8019 3.7458 - - - -
Agg. 5 Expected 0.7288 3.9703 0.377 0.35 4.0081 0.02 - - -
Agg. 5 Estimated 1.0578 3.306 0.9916 1.1228 3.7582 0.0204 - - -
Agg. 6 Expected 0.8014 3.2136 0.3574 0.3454 4.7107 0.02 1.05 - -
Agg. 6 Estimated 1.5964 0.4741 1.02 1.2514 3.8266 0.0371 2.4001 - -
Agg. 7 Expected 0.7876 3.9134 0.347 0.3494 5.1471 0.02 1.0215 0.85 -
Agg. 7 Estimated 2.1605 1.62 1.8879 2.155 4.1953 0.0331 2.0178 1.3772 -
Agg. 8 Expected 0.8206 3.9765 0.3744 0.3435 5.1028 0.02 1.0259 0.88 0.01
Agg. 8 Estimated 1.858 2.1815 1.5995 1.0088 4.1476 0.3753 1.3129 1.2436 1.4803
Table B.12: Minimum active power values for each equipment, for the expected and
estimated values calculated by the UNN method, for the HIPE dataset.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 1 Expected 0 0 - - - - - - -
Agg. 1 Estimated 0 -0.0317 - - - - - - -
Agg. 2 Expected 0 0 0 - - - - - -
Agg. 2 Estimated -0.0226 0 -0.0189 - - - - - -
Agg. 3 Expected 0 0 0 0 - - - - -
Agg. 3 Estimated 0 0 -0.05 -0.3561 - - - - -
Agg. 4 Expected 0 0 0 0 0 - - - -
Agg. 4 Estimated -0.0327 0 0 0 0 - - - -
Agg. 5 Expected 0 0 0 0 0 0 - - -
Agg. 5 Estimated -0.047 0 -0.0434 -0.0425 -0.2037 0 - - -
Agg. 6 Expected 0 0 0 0 0 0 0 - -
Agg. 6 Estimated -0.2906 0 -0.0976 0 -0.185 0 0 - -
Agg. 7 Expected 0 0 0 0 0 0 0 0 -
Agg. 7 Estimated 0 0 -0.0836 -0.0046 -0.036 0 -0.1035 -0.0496 -
Agg. 8 Expected 0 0 0 0 0 0 0 0 0
Agg. 8 Estimated 0 -0.0092 -0.0472 0 -0.08 0 -0.0618 0 -0.0966
Table B.13: Median active power values for each equipment, for the expected and estimated
values calculated by the UNN method, for the HIPE dataset.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 1 Expected 0.3 0.09 - - - - - - -
Agg. 1 Estimated 0.3172 0.3888 - - - - - - -
Agg. 2 Expected 0.24 0.09 0.211 - - - - - -
Agg. 2 Estimated 0.0257 0.3664 0.08345 - - - - - -
Agg. 3 Expected 0.25 0 0.21 0 - - - - -
Agg. 3 Estimated 0.161 0 0.11955 0 - - - - -
Agg. 4 Expected 0 0 0.21 0 0.50465 - - - -
Agg. 4 Estimated 0 0 0.2547 0 0.52675 - - - -
Agg. 5 Expected 0 0 0.21 0 0.40395 0 - - -
Agg. 5 Estimated 0 0 0.0853 0 0.69215 0 - - -
Agg. 6 Expected 0 0 0.21 0 0.15 0 0.86205 - -
Agg. 6 Estimated 0 0 0.0348 0 0.1014 0 0.57475 - -
Agg. 7 Expected 0 0 0.21 0 0.15 0 0.86985 0 -
Agg. 7 Estimated 0 0 0.11135 0 0.10135 0 0.5912 0 -
Agg. 8 Expected 0 0 0.21 0 0.14535 0 0.87 0 0
Agg. 8 Estimated 0 0 0.2311 0 0 0 0.3253 0 0
Table B.14: Sum of the active power values for each equipment, for the expected and
estimated values calculated by the UNN method, for the HIPE dataset. The highlighted
yellow cells correspond to the equipment with the highest active power consumption values
within the aggregate.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 2 Expected 125.08 249.07 - - - - - - -
Agg. 2 Estimated 130.02 242.89 - - - - - - -
Agg. 3 Expected 122.08 234.09 87.348 - - - - - -
Agg. 3 Estimated 128.72 229.71 84.816 - - - - - -
Agg. 4 Expected 126.32 198.99 84.838 58.34 - - - - -
Agg. 4 Estimated 111.34 206.18 92.685 59.328 - - - - -
Agg. 5 Expected 88.739 52.943 80.841 38.863 729.3 - - - -
Agg. 5 Estimated 38.604 55.982 182.73 93.441 613.83 - - - -
Agg. 6 Expected 91.099 44.555 82.972 44.145 727.96 0.18 - - -
Agg. 6 Estimated 119.21 88.975 78.045 53.624 658.38 0.1836 - - -
Agg. 7 Expected 90.068 33.5 75.573 42.741 661.82 0.32 318.44 - -
Agg. 7 Estimated 162.23 25.637 71.248 81.82 579.24 0.5936 309.04 - -
Agg. 8 Expected 93.94 67.378 80.764 52.377 547.9 0.28 331.22 121.18 -
Agg. 8 Estimated 161.79 65.947 98.466 138.06 419.4 0.4634 366.56 79.674 -
Agg. 9 Expected 90.777 53.91 77.235 50.051 582.88 0.26 322.97 121.27 0.6008
Agg. 9 Estimated 149.28 53.595 218.28 115.11 444.13 0.5517 163.13 134.28 23.443
B.4.1.3 Maximum, Minimum, Median and Sum Values for the UNN Method
with Fourier Mapping
Table B.15: Maximum active power values for each equipment, for the expected and
estimated values calculated by the UNN method with Fourier mapping, for the HIPE
dataset.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 1 Expected 0.8248 2.1679 - - - - - - -
Agg. 1 Estimated 1.5747 2.1654 - - - - - - -
Agg. 2 Expected 0.83 2.12 0.3856 - - - - - -
Agg. 2 Estimated 2.3089 2.3328 1.5224 - - - - - -
Agg. 3 Expected 0.7999 2.2154 0.3495 0.3694 - - - - -
Agg. 3 Estimated 2.0322 2.4377 2.4242 2.0109 - - - - -
Agg. 4 Expected 0.79 3.9703 0.36 0.3878 4.2412 - - - -
Agg. 4 Estimated 4.2412 4.2412 4.2412 4.2412 4.2412 - - - -
Agg. 5 Expected 0.7288 3.9703 0.377 0.35 4.0081 0.02 - - -
Agg. 5 Estimated 4.2447 4.2447 4.2447 4.2447 4.2447 0.2474 - - -
Agg. 6 Expected 0.8014 3.2136 0.3574 0.3454 4.7107 0.02 1.05 - -
Agg. 6 Estimated 5.0407 5.0407 5.0407 5.0407 5.0407 4.967 5.0407 - -
Agg. 7 Expected 0.7876 3.9134 0.347 0.3494 5.1471 0.02 1.0215 0.85 -
Agg. 7 Estimated 5.3196 5.3196 5.3196 5.3196 5.3196 0 5.3196 5.3196 -
Agg. 8 Expected 0.8206 3.9765 0.3744 0.3435 5.1028 0.02 1.0259 0.88 0.01
Agg. 8 Estimated 5.3625 5.3625 5.3625 5.3625 5.3625 4.1468 5.3625 5.3625 5.3625
Table B.16: Minimum active power values for each equipment, for the expected and
estimated values calculated by the UNN method with Fourier mapping, for the HIPE
dataset.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 1 Expected 0 0 - - - - - - -
Agg. 1 Estimated -2.115 -2.1368 - - - - - - -
Agg. 2 Expected 0 0 0 - - - - - -
Agg. 2 Estimated -0.6938 -0.0205 -1.8519 - - - - - -
Agg. 3 Expected 0 0 0 0 - - - - -
Agg. 3 Estimated 0 -0.0028 0 -2.3909 - - - - -
Agg. 4 Expected 0 0 0 0 0 - - - -
Agg. 4 Estimated -3.7902 -4.0649 -1.7046 -2.5114 0 - - - -
Agg. 5 Expected 0 0 0 0 0 0 - - -
Agg. 5 Estimated -0.4699 -1.4571 -1.2564 -4.2428 0 0 - - -
Agg. 6 Expected 0 0 0 0 0 0 0 - -
Agg. 6 Estimated -4.6296 -5.0006 -5.0274 0 0 0 -3.6912 - -
Agg. 7 Expected 0 0 0 0 0 0 0 0 -
Agg. 7 Estimated 0 -5.2696 -4.4569 0 -2.6592 -1.6827 -3.1645 -3.7987 -
Agg. 8 Expected 0 0 0 0 0 0 0 0 0
Agg. 8 Estimated -4.8136 -3.3127 -3.768 0 -4.8992 0 0 0 0
Table B.17: Median active power values for each equipment, for the expected and estimated
values calculated by the UNN method with Fourier mapping, for the HIPE dataset.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 1 Expected 0.3 0.09 - - - - - - -
Agg. 1 Estimated 0.274 0 - - - - - - -
Agg. 2 Expected 0.24 0.09 0.211 - - - - - -
Agg. 2 Estimated 0.0864 0.1362 0 - - - - - -
Agg. 3 Expected 0.25 0 0.21 0 - - - - -
Agg. 3 Estimated 1.0812 0 1.6254 0 - - - - -
Agg. 4 Expected 0 0 0.21 0 0.50465 - - - -
Agg. 4 Estimated 0 0 0 0 4.2329 - - - -
Agg. 5 Expected 0 0 0.21 0 0.40395 0 - - -
Agg. 5 Estimated 0 0 2.0369 0 4.1815 0 - - -
Agg. 6 Expected 0 0 0.21 0 0.15 0 0.86205 - -
Agg. 6 Estimated 0 0 0 0 4.9474 0 2.9127 - -
Agg. 7 Expected 0 0 0.21 0 0.15 0 0.86985 0 -
Agg. 7 Estimated 0 0 4.0765 0 0 0 4.5736 0 -
Agg. 8 Expected 0 0 0.21 0 0.14535 0 0.87 0 0
Agg. 8 Estimated 0 0 3.5702 0 0 0 5.2938 0 0
Table B.18: Sum of the active power values for each equipment, for the expected and
estimated values calculated by the UNN method with Fourier mapping, for the HIPE
dataset. The highlighted yellow cells correspond to the equipment with the highest active
power consumption values within the aggregate.
Aggregate Values Eq. 2 Eq. 3 Eq. 4 Eq. 5 Eq. 6 Eq. 7 Eq. 8 Eq. 9 Eq. 10
Agg. 2 Expected 125.08 249.07 - - - - - - -
Agg. 2 Estimated 170.29 139.65 - - - - - - -
Agg. 3 Expected 122.08 234.09 87.348 - - - - - -
Agg. 3 Estimated 452.86 272.72 -99.008 - - - - - -
Agg. 4 Expected 126.32 198.99 84.838 58.34 - - - - -
Agg. 4 Estimated 451.99 300.35 690.72 100.28 - - - - -
Agg. 5 Expected 88.739 52.943 80.841 38.863 729.3 - - - -
Agg. 5 Estimated 425.08 62.205 846.46 436.35 1496.3 - - - -
Agg. 6 Expected 91.099 44.555 82.972 44.145 727.96 0.18 - - -
Agg. 6 Estimated 999.3 175.57 1197.8 285.32 1529.7 2.2266 - - -
Agg. 7 Expected 90.068 33.5 75.573 42.741 661.82 0.32 318.44 - -
Agg. 7 Estimated 921.55 225.26 587.39 647.8 1567.8 79.472 1429.6 - -
Agg. 8 Expected 93.94 67.378 80.764 52.377 547.9 0.28 331.22 121.18 -
Agg. 8 Estimated 1336 94.172 1595.3 805.46 1292.4 -23.558 1623.9 494.46 -
Agg. 9 Expected 90.777 53.91 77.235 50.051 582.88 0.26 322.97 121.27 0.6008
Agg. 9 Estimated 1080 328.49 1536.7 768.21 1319.3 53.891 1960.7 777.73 426.96
Table B.19: Maximum, minimum, mean and median active power values for each
equipment, for the expected and estimated values calculated by the EMUPF method, for the
IMDELD dataset.
Statistic Values Eq. 1 Eq. 2 Eq. 3 Eq. 4 Eq. 7 Eq. 8
max Expected 1421.7 1331.9 5077 6525 87954 86525
max Estimated 7.4793e+14 2.0534e+14 0.018491 0 0 4.6627e+46
min Expected 0 0 0 0 0 0
min Estimated -0.0019108 0 -3.5712e+14 -6.2907e+14 -4.7713e+13 -0.0025909
median Expected 844.59 835.01 3227.4 5510 66336 60229
median Estimated 8.3303e+13 1.6742e+13 -3.9775e+13 -7.0065e+13 -5.314e+12 5.1932e+45
B.4.2.2 Maximum, Minimum, Median and Sum Values for the UNN Method
Table B.20: Maximum, minimum and median active power values for each equipment,
for the expected and estimated values calculated by the UNN method, for the IMDELD
dataset.
B.4.2.3 Maximum, Minimum, Median and Sum Values for the UNN Method
with Fourier Mapping
Table B.21: Maximum, minimum and median active power values for each equipment, for
the expected and estimated values calculated by the UNN method with Fourier mapping,
for the IMDELD dataset.
Statistic Values Eq. 1 Eq. 2 Eq. 3 Eq. 4 Eq. 7 Eq. 8
max Expected 1.9265e+05 1.107e+05 2.653e+05 2.653e+05 1.6841e+05 2.3824e+05
max Estimated 45486 3639.3 24337 15646 1.4122e+05 1.0982e+05
min Expected 0 0 0 0 0 0
min Estimated -2.4798e+05 -2.2089e+05 -73718 -51212 -2.5312e+05 -349.63
median Expected 844.59 835.01 3227.4 5510 66336 60229
median Estimated 0 0 1.2761e+05 45470 0 1.5627e+05