Electronics 07 00235 PDF
Electronics 07 00235 PDF
Article
An Extreme Learning Machine Approach to Effective
Energy Disaggregation
Valerio Mario Salerno * and Graziella Rabbeni
Engineering and Architecture Faculty, Kore University of Enna, 94100 Enna, Italy; [email protected]
* Correspondence: [email protected]
Received: 31 August 2018; Accepted: 1 October 2018; Published: 4 October 2018
Keywords: non-intrusive load monitoring; machine learning; Deep Modeling; extreme learning
machine; Data Driven Approach
1. Introduction
Energy consumption in household spaces increases every year [1], and is becoming a serious
concern in politics across Europe and the U.S. due to limited energy resources and the negative
implications on the environment (e.g., CO2 emissions). Indeed, the residential area in both Europe
and the U.S. represents the third highest in power consumption, consuming almost a quarter of all the
energy used [2,3]. Moreover, devices, electronics, and lighting are ranked as the second largest group,
contributing to 34.6% of residential electricity usage [4]. Unfortunately, up to 39% of the energy used by
the domestic sector can be wasted [5]. Many studies have shown energy savings if feedback on power
consumption information is provided to energy users [6,7]. Governments and public utilities have
reacted to the energy efficiency efforts by supporting smart grids and deploying smart meters. Private
corporations perceive chances in the market potential of power efficiency, and we have thus observed
an increase of commercial home power monitoring products available in the market. Stakeholders
from the public and private sectors envision smart energy management, including the transmission,
as well as the distribution, of power between energy producers and energy consumers, in addition to
seamless data exchange on electricity usage and pricing information between generators and users.
This requires connections between devices and servers. Such a requirement belongs to the broader
technology realm called the Internet of Things (IoT) in the smart home [8].
Addressing the problem of energy consumption monitoring in an efficient and effective way has
attracted a lot of attention among researchers in the field of energy monitoring, as discussed in the
recent literature, for example [9–14]. In the energy consumption literature, the classification of the
electrical loads in a house is usually divided into two groups—intrusive appliance load monitoring
and non-intrusive appliance load monitoring—relating to how much they cost, and to the simplicity to
install the system or fix it when required. Intrusive Appliance Load Monitoring (IALM, or ILM) is
usually classified as the collection of current monitoring techniques where a power meter is connected
to each appliance in the household. Therefore, these methods require entering the house, which is
rated as intrusive; furthermore, ILM systems are usually costly—all of the home devices need at least
one power meter each—and challenging to install and configure. Notwithstanding these adverse
reasons, ILM techniques provide highly trustable and authentic results. Non-intrusive appliance load
monitoring (NIALM, or NILM), on the contrary, is a way of specifying both the household power
usage and the state of operation of every connected appliance, on the evaluation of the entire load
measured by the main power meter in the house. Consequently, NILM leads to a lower cost, since the
number of energy meters can be reduced to just one per house. Nonetheless, NILM has the burden of
finding effective and efficient solutions to disaggregate the total of the power consumption measured
at a single point into individual electrical device power consumption [15–17]. The energy or power
consumption for individual electrical appliances can be determined from disaggregated data [18].
This is in contrast to the deployment of one sensor per device in the standard intrusive-type of energy
monitoring. NILM is worthwhile because it reduces costs, since multiple sensor configurations and
installation complexity linked with intrusive load monitoring are avoided [19–22]. The NILM method
is typically preferred because of financial and functional reasons. Indeed, the research in this field
is entirely focused on applying these systems in the real world. The deployment of many sensors
evidently favors NILM over the intrusive method on the basis of cost. The only benefit of intrusive
methods is the optimal precision in estimating the power consumption of each device, but in many
cases, it can be approximated without experiencing any dangerous consequences.
In this work, we are concerned with NILM, also known as energy disaggregation, which can be
broadly defined as a set of techniques used to obtain estimates of the electrical power consumption of
individual appliances from analyses of voltage and current taken at a limited number of positions in
the power distribution system of a building. The original concept traces its roots back to George Hart’s
seminal work at the Massachusetts Institute of Technology (MIT) [21]. In that work, Hart suggested
the use of variations in real and reactive power consumption, as measured at the utility meter,
to automatically trace the operation of individual appliances in a home. A significant number of studies
have since been put forth to resolve this problem under the NILM framework, and the interested reader
is referred to [16,17,20,21]. NILM approaches initially focused on feature selection and extraction,
with light emphasis on learning and inference techniques [21,22]. Progresses in computer science and
machine learning methods have driven innovations in data prediction and disaggregation techniques.
Factorial Hidden Markov Models (FHMMs) are considered the most successful approach to
power disaggregation [23]. FHMMs have proven to appropriately work for load disaggregation due
to their ability to handle temporal evolution of the appliances state. Nonetheless, the complexity
of the FHMM exponentially grows as the number of target appliances increases, which limits the
applicability of this learning approach. Moreover, if a new device class has to be added, the entire
model needs to be retrained from scratch. Alternative existing machine learning algorithms, such as
the Support Vector Machine (SVM) [24], k-Nearest Neighbor (k-NN) [25], and Artificial Neural
Network (ANN) [26–28], can also have a significant impact on the development of NILM. The principal
advantage of using machine learning is that these approaches can efficiently and effectively solve
very complex classification and regression problems [29,30]. However, much effort is still required to
reduce the error caused by different prediction and disaggregation algorithms to within a satisfactory
range, as explained in [19].
Electronics 2018, 7, 235 3 of 15
Deep neural networks (DNNs) fulfill complicated learning tasks by forming learning machines
with deep architectures, which have been proven to have a stronger recognition ability for highly
nonlinear patterns compared to shallow networks [31]. Deep learning has been widely studied and
applied in various frontier fields, and has become the state-of-the-art in speech recognition [32,33],
handwriting recognition [34], and image classification [35]. These recent advancements and
applications of deep learning have provided new ideas for energy disaggregation [30], and deep
models set the new state-of-the-art in the field. For example, Kelly and Knottenbelt [27] suggested
three different solutions: (i) The first method employs a particular kind of recurrent neural network [36];
(ii) the second method reduces noise with de-noising auto-encoders; and (iii) the third method uses a
regression technique predicting the beginning time, the ending time, and average energy requirement
for every appliance. In [37], the authors proposed a different solution employing deep recurrent
networks with long short-term memory (LSTM) gates, aiming to validate whether this solution
could address key issues observed in previous NILM methods. Those issues regard: (i) Automatic
feature extraction from low-frequency data; (ii) generalization of a solution for different buildings
and previously unseen appliances; (iii) decomposition of another device; and (iv) extensibility of the
method to computational tractability and continuous time. Zhang et al. [38] suggested a single-channel
blind source separation approach based on deep learning for NILM. The authors named that method
"sequence-to-point learning", since it makes use of a single point as the target, and a window of points
as the input. The deep model employed in [38] is a deep convolutional neural network (CNN) [34],
which is also capable of learning the device’s signature in a home.
Although deep neural architectures represent the new state-of-the-art for NILM achieving
excellent results, deep models have two main weaknesses: (1) Deep models consider the multilayer
architecture as a whole that is fine-tuned by several passes of back-propagation (BP) based fine-tuning
in order to obtain reasonable learning capabilities—such a training scheme is cumbersome and
time-consuming. (2) Huge volumes of training data are needed to achieve top performance, which
may limit the deployment of DNN-based solutions in several real-world applications. In this work,
we therefore aim to overcome these issues by introducing an alternative machine learning NILM
framework based on the unique and effective characteristics of the extreme learning machine (ELM)
algorithm [39], namely high-speed training, good generalization, and universal approximation and
classification capability. ELMs can play a pivotal role in many machine learning applications,
for example, traffic sign recognition [40], gesture recognition [41], video tracking [41], object
classification [42], data representation in big data [43], water distribution and wastewater collection [44],
opal grading [45], and adaptive dynamic programming [46]. In [47], the authors have shown that ELMs
are also suitable for a wide range of feature mappings, rather than the classical ones. Moreover, to take
advantage of multi-layer models, we deploy an energy disaggregation algorithm with hierarchical
ELMs (H-ELMs). To test the capability of ELM and H-ELM, we conducted a series of experiments on the
standardized UK Domestic Appliance-Level Electricity (UK-DALE) dataset [48]. Notably, the amount
of training data in this dataset is limited in comparison to that used in speech and image recognition,
for example. This permits us to prove that ELMs are indeed a viable solution to energy disaggregation
when the amount of training data is limited, which hinders the training and generalization capabilities
of state-of-the-art deep models, such as CNN and LSTM.
The key contributions of this work can be summarized as follows: (i) We are the first, to the
best of the authors’ knowledge, to deploy an ELM approach to energy disaggregation. (ii) We
show that ELMs represent the new state-of-the-art on the well-known and common UK-DALE
task, outperforming previous state-of-the-art approaches—namely FHMMs—and back-propagation
based artificial neural network (i.e., DNNs, CNNs, LSTMs) solutions. We provide point-to-point
comparisons against combinatorial optimization (CO), the Factorial Hidden Markov Model (FHMM),
long short-term memory (LSTM) recurrent neural networks, and autoencoder algorithms. This is
a valuable contribution for researchers working on the same problem and approaching it from a
machine learning point-of-view. (iii) We demonstrate that ELMs generalize better than previous
Electronics 2018, 7, 235 4 of 15
techniques for unseen houses—the cunning reader knows that this contribution is extremely valuable,
because back-propagation based models are very brittle and tend to underperform when the training
conditions used to tune the model parameters do not well represent the production condition.
The rest of this paper is organized as follows: Section 2 introduces the energy disaggregation
problem. Section 3 gives a brief survey of related works. Section 4 presents the ELM and H-ELM based
speech enhancement algorithms. Section 5 gives our experimental setup and results. The conclusions
from this study are drawn in Section 6.
where y(n) is the aggregate (total) power at time n, xi (n) is the power of the i-th appliance at time n, and
ε(n) is the unwanted noise at time n. Let Y = (540, 540, 600, 500, 800, 800, 750, 830, 850, 750, 570, 570,
570, 590) be the sequence of power readings in Watts taken every 20 min that have to be disaggregated.
A feasible solution to the energy disaggregation problem is given in Table 1. Interestingly, the solution
is not unique.
A non-intrusive load monitoring (NILM) system collects power consumption data from the
unique pivotal meter of a household, which can estimate the energy drain of each device. There are
always three essential steps in a NILM framework: (i) Data acquisition, related to how to obtain the
energy data, which is mostly thanks to hardware solutions; (ii) the device feature extraction step;
and (iii) inference and learning, such as the mathematical model used to decompose the total power
signal into device level signals. In this paper, we focus on the third step only.
In machine learning approaches to power disaggregation, the problem can be addressed as an
estimation, a classification, or a regression problem. In the next section, a brief overview of related work,
which addresses the energy disaggregation task under the machine learning framework, will be given.
3. Related Work
Physically, power decomposition is time-dependent. NILM researchers have always imagined
models of sequential data and time as potential solutions. Hidden Markov Models (HMMs) have
therefore received increased attention from the research community. The first work that we are
aware of that uses HMMs is [48], where the authors apply a Factorial HMM (FHMM) to the energy
disaggregation task. FHMMs were also explored in [23] and [49]. A fundamental public data set has
been provided by Kolter and Johnson, which is really useful for studying energy disaggregation [50].
Electronics 2018, 7, 235 5 of 15
In [51], the authors apply Additive Factorial HMMs, using an unsupervised technique for selecting
signal pieces where individual devices are isolated. Parson et al. have also employed HMM-based
methods in [52] and [53], where they have fused prior models of general device types (e.g., refrigerators,
clothes dryers, etc.) with HMMs. More recently, [54] introduced a method, referred to as Particle
Filter-Based Load Disaggregation, where devices’ load marks and superimposition of these were
modelled with both HMMs and Factorial HMMs. The inference was carried out by Particle Filtering
(PF). Paradiso et al. [55] proposed a new electrical load disaggregation system, which utilizes FHMMs
and explores context-based features. The user presence and the power patterns represent the context
data. In [56], the authors employed an unsupervised disaggregation approach, by considering
interactions between appliances. The device interactions were shaped using FHMMs, and inference
was implemented using the Viterbi algorithm. A new method which addresses the efficiency problem
of the Viterbi algorithm has been described in [57]. Authors proposed a super-state HMM and a
modified version of the Viterbi search. A super-state was determined as an HMM that defines the
entire power status of a collection of devices. Each appliance could be on or off, but can have a distinct
state while working. Naturally, every combination of the devices’ states represents a unique state of
the home. The fact that the precise inference was feasible in a computationally efficient time—only by
calculating sparse matrices with a large number of super-states—is a central benefit; moreover, the
decomposition could also run in real time.
Although HMM-based strategies have attracted much consideration, as the brief literature review
examined above demonstrates, the main weaknesses of these models have not been overcome. In fact,
the complexity of HMM-based techniques increases as the number of the discrete state space augments,
and this may make the inference intractable. As discussed in [23], if the context window of the model
is increased, the exponential complexity of the HMM state space is unfavorable. In more recent time,
deep neural architectures have demonstrated exceptional results in sequential models, and some NILM
researchers have shown a keen interest in these solutions. In the following, the most critical efforts
using deep neural architectures are briefly described.
Kelly and Knottenbelt [27] concentrated on answering the problem of power decomposition
employing deep networks. The authors submitted three different approaches: (a) A solution applying
a particular type of recurrent neural network using LSTM hidden nodes; (b) a noise reduction solution
employing denoising autoencoders; and (c) an algorithm with regression in order to calculate the
beginning time, the ending time, and the average power request for each appliance connected to the
electrical network. In [37], the authors presented a distinct architecture of a deep recurrent network
with LSTM. This work aims to test if it is possible to overcome the known problems of the previous
NILM methods using this kind of network. This solution requires that each home device has a linked
network; for this reason, since there are three target appliances, the authors used three networks.
In [38], the authors proposed a method named sequence-to-point (seq2point) learning, since it uses
a window as input and a single point as a target. This algorithm was shown to be a solution to the
problem of single-channel blind source separation, with application in Non-intrusive load monitoring
composed of a deep CNN capable of learning the signature of each appliance in a house.
(ELM) approach to energy disaggregation. In fact, ELMs have only one layer (the last one), made of
trainable parameters even in their deep configuration, referred to as Hierarchical-ELMs (H-ELMs).
ELMs were introduced by Huang et al. for single layer feed-forward networks (SLFNs) to overcome
problems with the BP algorithm. ELM gives an adequate and agile learning process that does not need
the heavy fine-tuning of parameters [47–58].
In this paper, we also frame energy disaggregation as a regression task, which is designated as
“denoising” in [27]. The aggregate power requirement y[n] is hence composed of the clean target power
demand signal of the target device x [n], and of the background additive noise signal v[n] provided by
the other appliances:
y[n] = x [n] + v[n] (2)
The goal is to recover an estimate of x[n] from the noisy signal y[n]. We propose to use ELM-based
models to perform the denoising task.
L
f ( xi ) = ∑ β l σ ( w l · x i + bl ) (3)
l =1
where xi = [ xi1 , xi2 , . . . , xiN ] T ∈ R N is the input vector, wl = [wl1 , wl2 , . . . , wlN ] T ∈ R N is the weight
vector connecting the l-th hidden node and the input vector, bl is the bias of the l-th hidden node,
β l = [ β l1 , β l2 , . . . , β lM ] T ∈ R M is the weight vector from the l-th hidden node to the output nodes, L is
the total number of neurons in the ELM hidden layer, and σ (·) is the nonlinear activation function to
approximate the target function to a compact subset. The output function can be formulated as
L
f ( xi ) = ∑ β l hl ( x ) = h( x ) B (4)
l =1
where B is the output weight matrix, and h( x ) = [h1 ( x ), . . . , h L ( x )] is the nonlinear feature mapping.
The relationship above can compactly be described as
HB = Y (5)
where H is the hidden layer output matrix, and Y is the target data matrix.
Electronics 2018, 7, 235 7 of 15
4.3. Hierarchical
4.3. Hierarchical ELM
ELM
Spurred by
Spurred by DNNs,
DNNs, where
where the the features
features are
are extracted
extracted using
using thethe multilayer
multilayer framework
framework withwith
unsupervised initialization, Tang et al. [41] extended extended ELMELM and
and introduced
introduced H-ELM for multilayer
multilayer
perceptrons (MLPs). The entire structure of the H-ELM H-ELM model is presented
presented in Figure
Figure 1.1. The
The H-ELM
H-ELM
framework has hastwo two steps,
steps, that that is, unsupervised
is, unsupervised featurefeature extraction,
extraction, and supervised
and supervised feature
feature regression.
regression.
In In unsupervised
unsupervised featurehigh-level
feature extraction, extraction,features
high-level features are
are extracted extracted
using an ELM using
basedanautoencoder
ELM based
autoencoder
by analyzing by analyzing
every layer asevery layer as an independent
an independent layer.
layer. The input The
data input data isinto
is introduced introduced
the ELM into the
feature
ELM feature
space space before
before feature feature
extraction extraction
to make to make useamongst
use of information of information amongst
the training data. the
Thetraining data.
output of the
The output offeature
unsupervised the unsupervised
extraction stage feature extraction
can then be used stage
as the can
inputthen be supervised
to the used as the ELMinput to the
regression
supervised ELM regression stage [41] for the final result, based on the learning from the two stages.
stage [41] for the final result, based on the learning from the two stages.
Regression
Output weight
ELM based
ELM Layer Supervised stage
regression
Sparse Autoencoder
Hidden
Hidden weight
layers
Unsupervised
Sparse Autoencoder
feature
Input weight representation
Input Data
Figure 1.
Figure 1. Hierarchical extreme learning
Hierarchical extreme learning machine
machine (H-ELM)
(H-ELM) architecture.
architecture.
5.1. Datasets
Our goal is to prove that the proposed solution can achieve comparable, if not better performance,
than recently proposed deep learning methods. Hence, we needed to deploy an experimental setup
that allowed a comparison with that available in the literature. To this end, we followed [27] and
used UK-DALE [29] as our source dataset, so that we could perform a sound quantitative comparison
between the proposed approach and that available in the literature. Each submeter in UK-DALE
sampled once every 6 s. All houses recorded aggregate apparent mains power once every 6 s. Houses
number 1, 2, and 5 also recorded active and reactive mains power once a second. In these houses, we
downsampled 1 s active mains power to 6 s to align with the submetered data, and used this as the real
aggregate data from these houses. We also aligned with the submetered data and used this as the real
aggregate data from these houses by down sampling the 1 s active mains power to 6 s. We used this
approach to make our results comparable with the state-of-the-art techniques. Any gaps in appliance
data shorter than 3 min were assumed to be due to RF issues, and so were filled by forward-filling.
Any gaps longer than 3 min were assumed to be due to the appliance and meter being switched off,
and so were filled with zeros.
We used the five target appliances selected in [27] in all our experiments; namely the fridge,
washing machine, dish washer, kettle, and microwave. These appliances were chosen because each is
present in at least three houses in UK-DALE. This meant that for each appliance, we could train our
nets on at least two houses and test on a different house. These five appliances consume a significant
amount of energy, and represent a range of different power ‘signatures’ from the simple on/off of the
kettle to the intricate pattern shown by the washing machine. In [27], the authors define an “appliance
activation” to be the power drawn by a single appliance over one complete cycle of that appliance.
We adopt the same terminology here.
In [27], the authors pointed out that artificial data had to be generated in order to regularize
the training of the parameters (biases and weights) of their deep neural networks, having between
1 million and 150 million of trainable parameters. Consequently, large training datasets play a crucial
role. Our approach was back-propagation free; therefore, there were fewer parameters to learn, which
allowed us to use only real data during the ELM learning phase. This is a key feature of the proposed
approach. For each house, we reserved the last week of data for testing and used the rest of the data
for training, as in [27]. The specific houses used for training and testing are shown in Table 2.
T
1
MAE =
T ∑ | f (yt ) − xt | (8)
t =1
Ê − E
RET = (9)
max( E, Ê)
where x (t) is the actual power of the appliance, f (yt ) is the estimated power after disaggregation,
and T is the number of examples.
The second category, on the contrary, defines how the disaggregated signal signs are assigned
to each appliance, taking into account precision (P), recall (R), F-measure (F1), total energy correctly
assigned (TECA), and Accuracy (A). These metrics are defined as follows:
TP
P= (10)
TP + FP
TP
R= (11)
TP + FN
2×P×R
F1 = (12)
P+R
∑tT=1 ∑m
M m m
=1 | f ( y t ) − x t |
TECA = 1 − T
(13)
2 ∑ t =1 y t
TP + TN
A= (14)
P+N
where TP is the true positive that the appliance was working, FP is the false positive that the appliance
was working, TN is the true negative, and FN is the false negative. P is the number of positive cases
in ground truth, and N is the number of negative cases in ground truth. Moreover, xtm is the actual
power for the m-th appliance at time t, f (ymt ) is the estimated power for m-th appliance at time t, and
T is the number of samples. The time period used for testing the efficiency of appliance recognition is
one week. The number of appliances that can be simultaneously active, of course varies as time passes.
Therefore, it is not constant during the period of time considered in the testing phase.
operate simultaneously. In this way, the proposed solution could work with all appliances (5 in the
UK-DALE corpus) simultaneously.
Table 4. Disaggregation performance for unseen houses during training. This evaluation is essential to
assess the generalization capability of the pattern recognition technique.
It is also significant to verify the generalization capabilities on unseen houses during training.
From the results reported in Table 4, we see that autoencoder outperforms CO and FHMM for every
appliance on every metric, except the relative error in total energy. Instead, the H-ELM outperformed
all of the other solutions, as expected. We have already demonstrated that H-ELMs are superior in
performance than shallow ELMs, so we do not report ELM results for unseen houses.
Finally, we have demonstrated that ELMs can play a key role in energy disaggregation, attaining
new state-of-the-art results. The importance of avoiding a cumbersome training phase can also be
better appreciated by referring to a recent study about a training-less solution to non-intrusive load
monitoring [61].
6. Conclusions
ELMs are a promising artificial neural network method introduced in [56] that have a high-speed
learning capability, and have been employed in many tasks successfully. For instance, ELMs for
wind speed forecasting were used in [62,63]. In [64], the authors apply ELMs to a state-of-charge
estimation of a battery with success. ELMs were also suggested as an effective solution to guarantee
the continuous flow of current supply in smart grids. In this paper, we have proposed to apply
ELMs to NILM. ELMs, both in their shallow and hierarchical configurations, act as a non-linear signal
enhancement system allowing us to recover the power load of the target appliance from the aggregated
load. In this work, we have proposed to extend ELMs to the energy disaggregation problem, and we
have reported top performance on the UK-DALE dataset. ELMs are much more straightforward
to train than the deep models reported in [27], while attaining superior performance in both seen
and unseen houses, and across appliances. The main advantage of ELMs over LSTMs and denoise
autoencoders, is that there is no need to fine-tune the whole network by the iterative back-propagation
algorithm, and this can quicken the learning speed and strengthen the generalization performance.
Electronics 2018, 7, 235 12 of 15
Finally, it is worth noting that the comparing various NILM approach can be still involved.
However, many improvements have been made in recent years, and there are a lot of novel approaches,
and diverse mathematical tools, that have still not been employed extensively. We believe that ELM
shows great potential for NILM, and further improvement could be achieved through multi-task
learning. We also believe that extreme learning machines could play a key role in the management of
power consumption in industrial wireless sensor networks [65].
Author Contributions: V.M.S. conceived the main idea of the proposed work and lead the organization and
writing of the paper. V.M.S. and G.R. were equally involved with setting up the experimental environment. G.R.
was responsible for the experimental validation of the idea proposed in this work.
Funding: This research received no external funding.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Uteley, J.; Shorrock, L. Domestic Energy Fact File 2008. Available online: http://www.bre.co.uk/page.jsp?
id=879 (accessed on 8 July 2018).
2. EIA Energy Consumption Report. Available online: https://www.eia.gov/totalenergy/data/annual/pdf/
sec2.pdf (accessed on 8 July 2018).
3. EU Energy Consumption. Available online: http://www.eea.europa.eu/data-and-maps/figures/final-
energy-consumption-by-sector-eu-27 (accessed on 8 July 2018).
4. EIA Residential Report. Available online: https://www.eia.gov/consumption/residential/index.php
(accessed on 8 July 2018).
5. Meyers, R.J.; Williams, E.D.; Matthews, H.S. Scoping the potential of monitoring and control technologies to
reduce energy use in homes. Energy Build. 2010, 42, 563–569. [CrossRef]
6. Darby, S. The Effectiveness of Feedback on Energy Consumption: A Review for DEFRA of the Literature on
Metering. 2006. Available online: https://www.eci.ox.ac.uk/research/energy/downloads/smart-metering-
report.pdf (accessed on 8 July 2018).
7. Abrahamse, W.; Steg, L.; Vlek, C.; Rothengatter, T. A review of intervention studies aimed at household
energy conservation. J. Environ. Psychol. 2005, 25, 273–291. [CrossRef]
8. Yu, L.; Li, H.; Feng, X.; Duan, J. Nonintrusive appliance load monitoring for smart homes: recent advances
and future issues. IEEE Instrum. Meas. Mag. 2016, 19, 56–62. [CrossRef]
9. Soetedjo, A.; Nakhoda, Y.I.; Saleh, C. Embedded Fuzzy Logic Controller and Wireless Communication for
Home Energy Management Systems. Electronics 2018, 7, 189. [CrossRef]
10. Issi, F.; Kaplan, O. The Determination of Load Profiles and Power Consumptions of Home Appliances.
Energies 2018, 11, 607. [CrossRef]
11. Manivannan, M.; Najafi, B.; Rinaldi, F. Machine Learning-Based Short-Term Prediction of Air-Conditioning
Load through Smart Meter Analytics. Energies 2017, 10, 1905. [CrossRef]
12. Jain, R.K.; Smith, K.M.; Culligan, P.J.; Taylor, J.E. Forecasting energy consumption of multi-family residential
buildings using support vector regression: Investigating the impact of temporal and spatial monitoring
granularity on performance accuracy. Appl. Energy 2014, 123, 168–178. [CrossRef]
13. Paterson, G.; Mumovic, D.; Das, P.; Kimpian, J. Energy use predictions with machine learning during
architectural concept design. Sci. Technol. Built Environ. 2017, 23, 1036–1048. [CrossRef]
14. Karatasou, S.; Santamouris, M.; Geros, V. Modeling and predicting building’s energy use with artificial
neural networks: Methods and results. Energy Build. 2006, 38, 949–958. [CrossRef]
15. Hart, G.W. Nonintrusive appliance load monitoring. Proc. IEEE 1992, 80, 1870–1891. [CrossRef]
16. Zeifman, M.; Roth, K. Nonintrusive appliance load monitoring: Review and outlook. IEEE Trans. Consum.
Electron. 2011, 57, 76–84. [CrossRef]
17. Zoha, A.; Gluhak, A.; Imran, M.A.; Rajasegarar, S. Non-intrusive load monitoring approaches for
disaggregated energy sensing: A survey. Sensors 2012, 12, 16838–16866. [CrossRef] [PubMed]
18. Chang, H.-H.; Lian, K.-L.; Su, Y.-C.; Lee, W.-J. Power-spectrum-based wavelet transform for nonintrusive
demand monitoring and load identification. IEEE Trans. Ind. Appl. 2014, 50, 2081–2089. [CrossRef]
Electronics 2018, 7, 235 13 of 15
19. Haq, A.U.; Jacobsen, H.-A. Prospects of Appliance-Level Load Monitoring in Off-the-Shelf Energy Monitors:
A Technical Review. Energies 2018, 11, 189. [CrossRef]
20. Meehan, P.; McArdle, C.; Daniels, S. An Efficient, Scalable Time-Frequency Method for Tracking Energy
Usage of Domestic Appliances Using a Two-Step Classification Algorithm. Energies 2014, 7, 7041–7066.
[CrossRef]
21. Kwak, Y.; Hwang, J.; Lee, T. Load Disaggregation via Pattern Recognition: A Feasibility Study of a Novel
Method in Residential Building. Energies 2018, 11, 1008. [CrossRef]
22. Norford, L.K.; Mabey, N. Non-Intrusive Electric Load Monitoring in Commercial Buildings. In Proceedings
of the Eighth Symposium on Improving Building Systems in Hot and Humid Climates, Dallas, TX, USA,
13–14 May 1992; pp. 133–140.
23. Zia, T.; Bruckner, D.; Zaidi, A.A. Hidden Markov Model Based Procedure for Identifying Household Electric
Loads. In Proceedings of the 37th Annual Conference on IEEE Industrial Electronics Society, Melbourne,
Australia, 7–10 November 2011; pp. 3218–3322.
24. Onoda, T.; Rätsch, G.; Müller, K.R. Applying support vector machines and boosting to a non-intrusive
monitoring system for household electric appliances with inverters. In Proceedings of the Second ICSC
Symposium on Neural Computation, Berlin, Germany, 23–26 May 2000.
25. Fomby, T.B.; Barber, T. K-Nearest Neighbors Algorithm: Prediction and Classification; Southern Methodist
University: Dallas, TX, USA, 2008; pp. 1–5.
26. Gomes, L.; Fernandes, F.; Sousa, T.; Silva, M.; Morais, H.; Vale, Z.; Ramos, C. Contextual intelligent load
management with ANN adaptive learning module. In Proceedings of the 16th International Conference on
Intelligent System Application to Power Systems (ISAP 2011), Hersonissos, Greece, 25–28 September 2011.
[CrossRef]
27. Kelly, J.; Knottenbelt, W. Neural NILM: Deep neural networks applied to energy disaggregation.
In Proceedings of the 2nd ACM International Conference on Embedded Systems for Energy-Efficient
Built Environments, Seoul, Korea, 4–5 November 2015; ACM: New York, NY, USA, 2015; pp. 55–64.
28. Ruzzelli, A.G.; Nicolas, C.; Schoofs, A.; O’Hare, G.M. Real-time recognition and profiling of appliances
through a single electricity sensor. In Proceedings of the 7th Annual IEEE Communications Society
Conference on Sensor, Mesh and Ad Hoc Communications and Networks, Boston, MA, USA, 21–25 June
2010. [CrossRef]
29. Nalmpantis, C.; Vrakas, D. Machine learning approaches for non-intrusive load monitoring: from qualitative
to quantitative comparison. Artif. Intell. Rev. 2018. [CrossRef]
30. Zeifman, M. Disaggregation of home energy display data using probabilistic approach. IEEE Trans.
Consum. Electron. 2012, 58, 23–31. [CrossRef]
31. Bengio, Y. Learning deep architectures for AI. Found Trends Mach. Found. Trends Mach. Learn. 2009, 2, 1–127.
[CrossRef]
32. Hinton, G.E.; Deng, L.; Yu, D.; Dahl, G.; Mohamed, A.-R.; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.;
Sainath, T.; et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of
four research groups. IEEE Signal Process Mag. 2012, 29, 82–97. [CrossRef]
33. Siniscalchi, S.M.; Salerno, V.M. Adaptation to new microphones using artificial neural networks with
trainable activation functions. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 1959–1965. [CrossRef]
[PubMed]
34. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition.
Proc. IEEE 1998, 86, 2278–2324. [CrossRef]
35. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks.
In Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe,
NV, USA, 3–6 December 2012; pp. 1097–1105.
36. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [CrossRef]
[PubMed]
37. Mauch, L.; Yang, B. A new approach for supervised power disaggregation by using a deep recurrent LSTM
network. In Proceedings of the 3rd IEEE Global Conference on Signal and Information Processing, Orlando,
FL, USA, 14–16 December 2015; pp. 63–67.
Electronics 2018, 7, 235 14 of 15
38. Zhang, C.; Zhong, M.; Wang, Z.; Goddard, N.; Sutton, C. Sequence-to-point learning with neural networks
for nonintrusive load monitoring. In Proceedings of the AAAI-18: Thirty-Second AAAI Conference on
Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
39. Huang, G.-B. An insight into extreme learning machines: random neurons, random features and kernels.
Cognit. Comput. 2014, 6, 376–390. [CrossRef]
40. Huang, Z.; Yu, Y.; Gu, J.J.; Liu, H. An efficient method for traffic sign recognition based on extreme learning
machine. IEEE Trans. Cybern. 2017, 47, 920–933. [CrossRef] [PubMed]
41. Tang, J.; Deng, C.; Huang, G.-B. Extreme learning machine for multilayer perceptron. IEEE Trans. Neural
Netw. Learn. Syst. 2016, 27, 809–821. [CrossRef] [PubMed]
42. Sun, F.; Liu, C.; Huang, W.; Zhang, J. Object classification and grasp planning using visual and tactile sensing.
IEEE Trans. Syst. Man Cybern. Syst. 2016, 46, 969–979. [CrossRef]
43. Kasun, L.L.C.; Zhou, H.; Huang, G.-B.; Vong, C.M. Representational learning with ELMs for big data.
IEEE Intell. Syst. 2013, 28, 31–34.
44. Zhao, W.; Beach, T.H.; Rezgui, Y. Optimization of potable water distribution and wastewater collection
networks: A systematic review and future research directions. IEEE Trans. Syst. Man Cybern. Syst. 2016, 46,
659–681. [CrossRef]
45. Wang, D.; Bischof, L.; Lagerstrom, R.; Hilsenstein, V.; Hornabrook, A.; Hornabrook, G. Automated Opal
Grading by Imaging and Statistical Learning. IEEE Trans. Syst. Man Cybern. Syst. 2016, 46, 185–201.
[CrossRef]
46. Liu, D.; Wei, Q.; Yan, P. Generalized policy iteration adaptive dynamic programming for discrete time
nonlinear systems. IEEE Trans. Syst. Man Cybern. Syst. 2015, 45, 1577–1591. [CrossRef]
47. Huang, G.-B.; Zhou, H.; Ding, X.; Zhang, R. Extreme learning machine for regression and multiclass
classification. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2012, 42, 513–529. [CrossRef] [PubMed]
48. Kelly, J.; Knottenbelt, W. The UK-DALE dataset, domestic appliance-level electricity demand and
whole-house demand from five UK homes. Sci. Data 2015, 2, 150007. [CrossRef] [PubMed]
49. Kim, H.; Marwah, M.; Arlitt, M.; Lyon, G.; Han, J. Unsupervised disaggregation of low frequency power
measurements. In Proceedings of the 2011 SIAM International Conference on Data Mining, Mesa, AZ, USA,
28–30 April 2011; pp. 747–758.
50. Kolter, J.Z.; Johnson, M.J. Redd: A public data set for energy disaggregation research. In Proceedings of the
Workshop on Data Mining Applications in Sustainability, San Diego, CA, USA, 21 August 2011.
51. Kolter, J.Z.; Jaakkola, T. Approximate inference in additive factorial HMMs with application to energy
disaggregation. In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics
(AISTATS), La Palma, Canary Islands, Spain, 21–23 April 2012; pp. 1472–1482.
52. Parson, O.; Ghosh, S.; Weal, M.; Rogers, A. Using hidden Markov models for iterative non-intrusive appliance
monitoring. In Proceedings of the Neural Information Processing Systems workshop on Machine Learning
for Sustainability, Sierra Nevada, Spain, 16–17 December 2011.
53. Parson, O.; Ghosh, S.; Weal, M. Rogers, Non-intrusive load monitoring using prior models of general
appliance types. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, Toronto,
ON, Canada, 22–26 July 2012; AAAI Press: Palo Alto, CA, USA, 2012; pp. 356–362.
54. Egarter, D.; Bhuvana, V.P.; Elmenreich, W. PALDi: online load disaggregation via particle filtering. IEEE Trans.
Instrum. Meas. 2015, 64, 467–477. [CrossRef]
55. Paradiso, F.; Paganelli, F.; Giuli, D.; Capobianco, S. Context-based energy disaggregation in smart homes.
Future Internet 2016, 8, 4. [CrossRef]
56. Aiad, M.; Lee, P.H. Unsupervised approach for load disaggregation with devices interactions. Energy Build.
2016, 116, 96–103. [CrossRef]
57. Makonin, S.; Popowich, F.; Bajic, I.V.; Gill, B. ; Bartram, L. Exploiting HMM sparsity to perform online real-
time nonintrusive load monitoring. IEEE Trans. Smart Grid 2016, 7, 2575–2585. [CrossRef]
58. Lange, H.; Bergés, M. The neural energy decoder: energy disaggregation by combining binary
subcomponents. In Proceedings of the NILM2016 3rd International Workshop on Non-Intrusive Load
Monitoring, Vancouver Canada, 14–15 May 2016.
59. Beck, A.; Teboulle, M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J.
Imaging Sci. 2009, 2, 183–202. [CrossRef]
Electronics 2018, 7, 235 15 of 15
60. Bonfigli, R.; Squartini, S.; Fagiani, M.; Piazza, F. Unsupervised algorithms for non-intrusive load monitoring:
An up-to-date overview. In Proceedings of the 2015 IEEE 15th International Conference on Environment and
Electrical Engineering (EEEIC), Rome, Italy, 10–13 June 2015. [CrossRef]
61. Zhao, B.; Stankovic, L.; Stankovic, V. On a Training-Less Solution for Non-Intrusive Appliance Load
Monitoring Using Graph Signal Processing. IEEE Access 2016, 4, 1784–1799. [CrossRef]
62. Zhou, J.; Sun, N.; Jia, B.; Peng, T. A Novel Decomposition-Optimization Model for Short-Term Wind Speed
Forecasting. Energies 2018, 11, 1752. [CrossRef]
63. Wang, R.; Li, J.; Wang, J.; Gao, C. Research and Application of a Hybrid Wind Energy Forecasting System
Based on Data Processing and an Optimized Extreme Learning Machine. Energies 2018, 11, 1712. [CrossRef]
64. Chin, C.S.; Gao, Z. State-of-Charge Estimation of Battery Pack under Varying Ambient Temperature Using
an Adaptive Sequential Extreme Learning Machine. Energies 2018, 11, 711. [CrossRef]
65. Collotta, M.; Pau, G.; Salerno, V.M.; Scatà, G. A fuzzy based algorithm to Manage Power Consumption
in Industrial Wireless Sensor Networks. In Proceedings of the 2011 9th IEEE International Conference on
Industrial Informatics, Lisbon, Portugal, 26–29 July 2011. [CrossRef]
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).