Sustainable Cities and Society 48 (2019) 101533

Contents lists available at ScienceDirect

Sustainable Cities and Society

journal homepage: www.elsevier.com/locate/scs

Engineering Advance

Modeling and forecasting building energy consumption: A review of data-driven techniques

Mathieu Bourdeau (a), Xiaoqiang Zhai (a, ⁎), Elyes Nefzaoui (b, c), Xiaofeng Guo (b), Patrice Chatellier (d)

(a) Institute of Refrigeration and Cryogenics, Shanghai Jiao Tong University, Shanghai 200240, PR China
(b) ESIEE Paris, Université Paris-Est, 2 Bd Blaise Pascal, 93162 Noisy-le-Grand Cedex, France
(c) Université Paris-Est, ESYCOM (EA 2552), CNAM, ESIEE Paris, UPEMLV, F-77454, Marne-la-Vallée, France
(d) Université Paris-Est, IFSTTAR, 14-20 Bd Newton, Champs-sur-Marne, Marne-la-Vallée, France

ARTICLE INFO

Keywords: Building energy consumption; Building load forecasting; Data-driven techniques; Machine learning

ABSTRACT

Building energy consumption modeling and forecasting is essential to address buildings energy efficiency problems and take up current challenges of human comfort, urbanization growth and the consequent energy consumption increase. In a context of integrated smart infrastructures, data-driven techniques rely on data analysis and machine learning to provide flexible methods for building energy prediction. The present paper offers a review of studies developing data-driven models for building scale applications. The prevalent methods are introduced with a focus on the input data characteristics and data pre-processing methods, the building typologies considered, the targeted energy end-uses and forecasting horizons, and accuracy assessment. A special attention is also given to different machine learning approaches. Based on the results of this review, the latest technical improvements and research efforts are synthesized. The key role of occupants’ behavior integration in data-driven modeling is discussed. Limitations and research gaps are highlighted. Future research opportunities are also identified.

1. Introduction

Buildings account for a significant part of global energy consumption, 30% on average, and for a third of the associated CO2 emissions (International Energy Agency, 2016). Despite developments to improve building energy efficiency, the International Energy Agency highlighted in 2017 that current investments were not on track for the building sector to achieve the 2 °C scenario targeted by the Paris Climate Agreement (International Energy Agency, 2017). Meanwhile, some of the major global warming contributors and signatories of the Agreement, such as China (2538 megatons of oil equivalent consumed and 8796 megatons of CO2 produced in 2016; “Global Energy Statistical Yearbook, 2017|World Energy Statistics|Enerdata,” 2017), are facing challenges with growing urbanization and an annual increase of their building stock (Tsinghua University Building Energy Research Center, 2016).

Abbreviations: AC, air conditioning; AI, artificial intelligence; ANFIS, adaptive network-based fuzzy inference system; ANN, artificial neural network; AR, autoregressive;
ARMA, autoregressive moving average; ARX, autoregressive exogenous; ARIMA, autoregressive integrated moving average; ARIMAX, autoregressive
integrated moving average exogenous; BECMF, building energy consumption modeling and forecasting; BDT, boosting decision tree; BPNN, back-propagation neural
network; CART, classification and regression tree; CHAID, chi-square automatic interaction detector; CRBM, conditional restricted Boltzmann machine; CV-RMSE,
coefficient of variation of root mean squared error; DBN, deep belief network; DE, differential evolution; DNN, deep neural network; DPT, dew point temperature; DT,
decision tree; ELM, extreme learning machine; EM, expectation maximization; FCRBM, factored conditional restricted Boltzmann machine; FFNN, feed-forward
neural network; GA, genetic algorithm; GP, genetic programming; GBDT, gradient boosting decision tree; HVAC, heating, ventilation and air-conditioning; IAT,
indoor air temperature; k-NN, k nearest neighbors; LEED, leadership in energy and environmental design; LS-SVM, least square support vector machine; MAE, mean
absolute error; MAPE, mean absolute percentage error; MARS, multivariate adaptive regression splines; MLP, multilayer perceptron; MLR, multiple linear regression;
NARX, non-linear autoregressive exogenous; OAT, outdoor air temperature; OLS, ordinary least square (regression); PCA, principal component analysis; PSO, particle
swarm optimization; RBF, radial basis function; RBFNN, radial basis function neural network; RC, resistance capacitance; RF, random forest; RH, relative humidity;
RMSE, root mean square error; RNN, recurrent neural network; SARIMA, seasonal autoregressive integrated moving average; SARIMAX, seasonal autoregressive
integrated moving average exogenous; SARSA, state-action reward state-action; SOM, self-organizing map; SR, solar radiation; SVM, support vector machine; SVR,
support vector regression; WD, wind direction; WS, wind speed
⁎ Corresponding author.
E-mail address: [email protected] (X.q. Zhai).

https://doi.org/10.1016/j.scs.2019.101533
Received 17 November 2018; Received in revised form 20 February 2019; Accepted 2 April 2019
Available online 14 April 2019
2210-6707/ © 2019 Elsevier Ltd. All rights reserved.

To address the stakes of rapidly growing urbanization, the increasing need for human comfort and the consequent increase in energy consumption, solutions emerge in the development of smart sustainable infrastructures (Silva, Khan, & Han, 2018). Smart and low to zero energy buildings play a significant role (Kylili & Fokaides, 2015) in many aspects, including global energy efficiency, energy conservation measures and the integration of renewable energy systems. Hence, building energy consumption modeling and forecasting is a key tool to achieve smart and sustainable designs. Indeed, it can assist in more energy-efficient designs by comparing several strategies for both pre- (Tahmassebi & Gandomi, 2018) and post-occupancy studies (Ruparathna, Hewage, & Sadiq, 2017). It can also guide energy management at local and global scales (Xu, Taylor, Pisello, & Culligan, 2012).

Among the three main approaches in building energy consumption modeling and forecasting (BECMF), namely physics-based, data-driven and hybrid models (Dong, Li, Rahman, & Vega, 2016), data-driven techniques emerge as the most suitable option to ensure the integration of buildings in smart environments. Smart infrastructures rely on sensor networks which generate large amounts of energy-related data (Rathore et al., 2018). For instance, massive smart-meter deployment programs have been launched in Europe, the United States of America and China during the past decade, with ambitious goals to achieve by 2020 (Liu, Marnay, Feng, Zhou, & Karali, 2017; Obey, 2009; “Smart Metering deployment in the European Union|JRC Smart Electricity Systems and Interoperability,” n.d.; U.S. Energy Information Administration (EIA), 2018). As the name suggests, data-driven methods propose modeling and forecasting frameworks based on data analysis schemes rather than on classical physics-based modeling tools (Foucquier, Robert, Suard, Stéphan, & Jay, 2013). Furthermore, these frameworks include algorithms that benefit from the significant recent developments in the field of machine learning (Wang & Srinivasan, 2017), providing flexibility and reliability to modeling and forecasting tools. Consequently, data-driven building energy consumption modeling techniques have recently drawn increasing attention, providing new case studies, algorithms and results, while technical challenges remain (Bourdeau, Guo, & Nefzaoui, 2018).

Thus, we report in the present paper a review of data-driven building energy modeling techniques. It aims to introduce the most prevalent techniques and to provide an up-to-date overview of recent studies and advancements in BECMF, as well as research gaps and promising research directions. The paper is organized as follows: in Section 2 the data-driven forecasting process is described, performance assessment metrics are defined, and the prevalent techniques are presented, namely autoregressive models (AR), statistical regressions, k-nearest neighbors (k-NN), decision trees (DT), support vector machines (SVM) and artificial neural networks (ANN). Section 3 discusses and compares the application of different machine learning approaches for data-driven techniques. Section 4 summarizes the characteristics of the data used and the pre-processing methods in data-driven forecasting processes. Section 5 discusses the challenges in terms of building typologies, energy end-uses, forecasting horizons and the implications of the lack of occupant-related data. It also summarizes the current trends and latest technical achievements in machine learning applications to building energy forecasting studies, while highlighting their limitations and possible solutions.

1.1. Research methodology

The research methodology followed six steps. The first step relied on the analysis of existing review papers to highlight (1) the trends in research and applications of BECMF techniques over the past decade, which has witnessed massive smart-metering deployment programs and an increasing production of energy demand data (Liu et al., 2017; Obey, 2009; U.S. Energy Information Administration (EIA), 2018), and (2) the main existing categories and classes of techniques, in order to (3) build a classification and cross-check the corresponding nomenclature. Indeed, the many recent applications and technical improvements of BECMF methods have introduced numerous names of techniques that may confuse non-experts.

Based on the developed classification, a key-word search was conducted for the broad field of BECMF and more specifically for each type of technique. Google Scholar was used as it provides relevant information. The number of citations helped target reference articles. These articles highlighted more recent citing papers that build on previous research work to introduce novel applications and methods. Authors’ names were also used to search for related and relevant similar studies.

A selection of articles retrieved from the key-word search was performed based on three criteria: (1) a publication date in the past decade, to consider relatively recent research work; (2) a focus on forecasting energy consumption and load demand in buildings (including overall energy, thermal energy with combined and separated cooling and heating loads, and other loads such as lighting or plug load); (3) an application scale limited to building-scale studies (neighborhoods, cities, regions and countries were excluded).

The fourth step aimed to highlight specific information for each selected study, summarized in a table for study comparisons. It included the type(s) of technique(s) implemented, the characteristics of the building case study(ies) (number of case studies, building type(s), location(s)), the characteristics of the input data (data type, granularity, amount) and output data (type(s) of end-use(s), forecasting horizon, accuracy), and the modeling tools or software used.

With the collected information, research articles were selected to serve at least one of the three following purposes: (1) to provide a solid but accessible theoretical and application reference for one specific type of technique; (2) to present an original approach in terms of forecasting technique and/or application case study, input data, end-use(s) or multidisciplinary work; (3) to propose a comparative study with insights on the different methods implemented, their performances, advantages and weaknesses. This step also helped identify relevant approaches not covered by existing review articles.

Finally, a last comparison was conducted with existing state-of-the-art papers to select the most relevant examples of BECMF methods. An effort was made to select original research works that had been little or not yet reported. Moreover, techniques with few new application studies compared to those that have already been covered were excluded. Nevertheless, these techniques were included in the classification of BECMF methods presented in the following section. They are briefly introduced when encountered in the selected articles.

1.2. Classification of methods for building energy consumption modeling and forecasting

Numerous and various techniques have been developed, adapted and used for BECMF. The significant research efforts on the topic over the past twenty years have led to several previous reviews describing the existing methods and using different nomenclatures. To aid the readers’ understanding of the different techniques, and to re-contextualize the scope of the present review, this part of the study compares the different classifications encountered in fifteen reviewed state-of-the-art articles (Ahmad, Chen, Guo, & Wang, 2018; Amasyali & El-Gohary, 2018; ASHRAE, 2009, chap. 19; Chalal, Benachir, White, & Shrahily, 2016; Deb, Zhang, Yang, Lee, & Shah, 2017; Foucquier et al., 2013; Fumo, 2014; Mat Daut et al., 2017; Pedersen, 2007; Swan & Ugursal, 2009; Tardioli, Kerrigan, Oates, O’Donnell, & Finn, 2015; Wang & Srinivasan, 2017; Wei et al., 2018; Yildiz, Bilbao, & Sproul, 2017; Zhao & Magoulès, 2012). A unifying nomenclature is then presented in Fig. 1. The suggested classification is based on the differences in modeling processes, without building type or energy end-use distinctions, and considers building-scale applications.

Fig. 1. Summary classification of building energy consumption modeling and forecasting methods (techniques with white font on black background are the techniques covered in this paper).

The review work has highlighted three main categories of BECMF
models. The first, physics-based models, is also commonly referred to as “white-box” (Tardioli et al., 2015). It uses a transparent process based on solving physics equations to describe the energy behavior of buildings. Physics-based modeling has been introduced under different names: “forward classical approach” and “calibrated simulation approach” (ASHRAE, 2009, chap. 19), “energy simulation programs” (Pedersen, 2007), “engineering methods”/approach (Fumo, 2014; Swan & Ugursal, 2009; Zhao & Magoulès, 2012), “physical modelings” (Foucquier et al., 2013), or “thermal models” (Yildiz et al., 2017). Sub-classifications of these models have been proposed as well, depending on the origin of input data (Swan & Ugursal, 2009), the level of detail implemented in the modeling (Foucquier et al., 2013) and the modeling calibration methods (ASHRAE, 2009, chap. 19; Fumo, 2014).

The second category is data-driven models. They mainly rely on time-series statistical analyses and machine learning algorithms to assess and forecast the building energy consumption. They are also often named “black-box” models (ASHRAE, 2009, chap. 19; Tardioli et al., 2015) to emphasize that, with these techniques, the relationship between inputs and outputs can hardly be transposed to physics-based analysis. “Data-driven” techniques (Ahmad et al., 2018; Amasyali & El-Gohary, 2018; Tardioli et al., 2015; Wei et al., 2018) have also been named “time series […] techniques” (Deb et al., 2017) and “statistical” (Chalal et al., 2016; Swan & Ugursal, 2009; Zhao & Magoulès, 2012). Furthermore, data-driven “statistical analyses”, “regressions-based models” and “auto-regressive models”, regarded as more conventional methods (Mat Daut et al., 2017; Pedersen, 2007; Yildiz et al., 2017; Zhao & Magoulès, 2012), have been differentiated from artificial intelligence models, referred to as “intelligent computer systems”/techniques (Pedersen, 2007), “intelligent techniques” (Fumo, 2014), “AI approach” (Mat Daut et al., 2017) or “machine learning models” (Yildiz et al., 2017). More recently, Wang and Srinivasan (2017) reviewed data-driven models in building energy consumption prediction by opposing “single models” and “ensemble models”. The former use a single algorithm for a straightforward forecasting process, while the latter build a framework managing the strengths and weaknesses of techniques. Within all these classes of data-driven models, several specific popular techniques have been described (Deb et al., 2017; Zhao & Magoulès, 2012). They are organized as follows in the proposed classification for data-driven methods: single models are divided between (1) classical techniques with moving average & exponential smoothing (MA & ES), autoregressive models (AR) and statistical regressions, (2) classification-based techniques applied to forecasting purposes with k-nearest neighbors (k-NN) and decision trees (DT), (3) support vector machines (SVM), (4) artificial neural networks (ANN), (5) genetic algorithms (GA), (6) grey modeling, (7) case-based reasoning and (8) fuzzy models. By contrast, a category named combined models includes both ensemble models and improved models. The latter refers to the combination of single data-driven techniques and optimization methods (Mat Daut et al., 2017).

Finally, the third and last main category of models is hybrid models. It describes the combination of physics-based and data-driven methods. They are also called “gray-box” or “grey-box approach” (Foucquier et al., 2013; Tardioli et al., 2015), as the combination of white-box and black-box methods. Other techniques have also been named “hybrid models” (Chalal et al., 2016; Mat Daut et al., 2017) but referred to the improvement of single data-driven techniques with optimization methods, or to the combination of several machine learning algorithms. In the proposed classification these are called improved models, as described in the previous paragraph.

1.3. Overview of the papers reviewed

The study covers eight classes of data-driven models: autoregressive models (AR), statistical regressions, k-nearest neighbors (k-NN), decision trees (DT), support vector machines (SVM), artificial neural networks (ANN), ensemble and improved techniques. Hybrid models are also addressed in the discussion sections of this review. Based on the research methodology previously described, techniques including moving average, exponential smoothing, genetic algorithms (GA), grey models, case-based reasoning and fuzzy-based models are not described in detail in the present work (black font on white background in Fig. 1).

Table 1
Number of reviewed research papers implementing the different BECMF techniques from 2007 to 2019.
BECMF techniques 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 Total

AR 1 1 2 3 1 8
Statistical regressions Supervised 1 2 3 2 2 3 3 16 17
Unsupervised 1 1
k-NN 2 2 1 5
DT 1 1 1 1 1 2 7
SVM Supervised 1 4 4 3 4 2 18 20
Unsupervised 1 1
Transfer learning 1 1
ANN Supervised 1 1 1 1 4 4 3 2 2 20 22
Unsupervised 1 1
Transfer learning 1 1
DNN Supervised 2 1 1 4 6
Unsupervised 1 1
Reinforcement 1 1
Ensemble Supervised 1 2 1 3 4 11 16
Unsupervised 1 2 3
Improved 2 2 1 1 6
Hybrid 1 1 1 3

They have been investigated during the review process, however,
results of the literature review highlighted that case-based reasoning (Kolodner, 2014), fuzzy-based models (Song & Chissom, 1993) and grey models (Deng, 1989) have had few new applications compared to those already covered in other review papers. Also, for grey modeling, fuzzy-based models, exponential smoothing and moving average, the applications mainly focused on country-scale studies, while the present work limits the applications to building scale. GA (Mitchell, 1998) can be found in the literature but have rarely been applied alone as the prediction technique for building energy consumption forecasting. They are mostly implemented as an optimization tool, as described later in this review. Finally, physics-based models are out of the scope of this paper, which focuses on data-driven methods. Hence, these techniques are only briefly presented when implemented in the reviewed articles.

In the present article, a total of fifty original research papers have been reviewed. Moreover, the number of studies implementing the different reviewed BECMF techniques has been counted. The count is presented in Table 1 with the number of papers per year and the total number of papers over the 2007–2019 period for each specific approach. A distinction is made between more conventional ANN and deep (learning) neural networks (DNN), as well as between supervised, unsupervised, reinforcement and transfer learning approaches.

2. Data-driven techniques

2.1. Data-driven forecasting process: training, validation and testing

Data-driven techniques use statistical and machine learning tools to develop an energy model of a building. Most techniques focus on time series data analysis but also frequently include basic knowledge of the buildings’ characteristics. The data-driven modeling process involves three steps that rely on three different sets of data. These datasets usually result from the division of a main original one and include the same input variables but with different combinations of values and for different periods of time (Bishop, 2006).

The first step is the training of the algorithm. The model is run on the training dataset to produce results. These results are compared to the original training data and, based on the results of the comparison, the different parameters of the algorithm can be adjusted to fit the training dataset (Wahid & Kim, 2016). The second step is the validation. A validation dataset is used to provide an unbiased evaluation of the implemented algorithm, already fit on training data, and to tune its key modeling parameters to enhance the fitting of the model. The validation dataset must be different from the training dataset to prevent over-fitting (Zhang, Deb, Lee, Yang, & Shah, 2016). Otherwise, it would result in a model performing very well with a specific set of data but poorly with other datasets.

Finally, the third step is the testing step, when the developed algorithm is run on the remaining part of the data to provide a final unbiased evaluation of the modeling and forecasting performances. It is commonly accepted that the model parameters and structure should not be modified based on the results of this final step (Fan, Xiao, & Wang, 2014). Several methods have been used to pre-process the datasets and select relevant input data with adapted training–validation–testing ratios. They will be presented along with the review of the different studies in the following sections. Nevertheless, it should be highlighted that in practice training and validation steps are not always explicitly separated.

2.2. Accuracy metrics

Forecasting performances of data-driven algorithms are tested using accuracy metrics. The most common are the mean absolute percentage error (MAPE), the root mean square error (RMSE), the coefficient of variation of RMSE (CV-RMSE) and the mean absolute error (MAE), assessed in 53%, 47%, 38% and 36% of the reviewed studies respectively. The coefficient of determination (R2), the mean square error (MSE), the mean relative error (MRE), the mean bias error (MBE) and the normalized mean bias error (NMBE) can also be found in 27%, 16%, 9%, 2% and 4% of reviewed studies respectively. Finally, some authors defined specific accuracy metrics such as the relative error (Liu, Chen, & Mori, 2015), average error (Neto & Fiorelli, 2008) and accuracy rate (Wahid & Kim, 2016; Yu, Haghighat, Fung, & Yoshino, 2010).

Different metrics provide different information on the forecasting performances and the model behavior for different datasets (Hyndman & Koehler, 2006). Unlike unit-based metrics (e.g. MAE or RMSE), performance evaluations based on error percentages provide normalized information. Thus, they should be preferred for comparisons between different models, studies and building typologies. Since MAPE was the most used in reviewed articles, this metric has been selected as the reference accuracy metric for the present review work (and in Appendix A). When MAPE was not available, other metrics including CV-RMSE, RMSE, MAE and R2 were provided to illustrate the forecasting performances of the different implemented algorithms. The specific definition of each of these metrics can be found in Eqs. (1)–(9).

Mean Absolute Error (MAE):
\[ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| y_{\mathrm{forecasting},i} - y_{\mathrm{observed},i} \right| \tag{1} \]
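The three-step process of Section 2.1 and the MAE of Eq. (1) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the reviewed studies: the 60/20/20 split ratios, the hypothetical `load_kw` values, and the toy moving-average forecaster. That forecaster has no fitted parameters, so the training portion serves only as history and the window length is the single setting tuned on the validation data.

```python
def chronological_split(series, train_frac=0.6, val_frac=0.2):
    """Split a time-ordered series into training, validation and testing parts."""
    n = len(series)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]


def mean_absolute_error(forecast, observed):
    """MAE as in Eq. (1): the average of |forecast - observed|."""
    if len(forecast) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)


def moving_average_forecast(history, window):
    """Toy one-step forecast: mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def walk_forward_mae(history, targets, window):
    """Forecast each target from the data seen so far, then score with MAE."""
    seen = list(history)
    forecasts = []
    for actual in targets:
        forecasts.append(moving_average_forecast(seen, window))
        seen.append(actual)  # the observation becomes available afterwards
    return mean_absolute_error(forecasts, targets)


# Hypothetical hourly loads in kW, for illustration only.
load_kw = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0, 14.0, 16.0]
train, val, test = chronological_split(load_kw)

# Validation step: choose the window on data that was not used as history.
best_window = min((1, 2, 3), key=lambda w: walk_forward_mae(train, val, w))

# Testing step: one final evaluation, with the chosen window left untouched.
test_mae = walk_forward_mae(train + val, test, best_window)
```

The split is chronological rather than random so that the model is always scored on data later than anything it has seen, which matches how a load forecast would be used in practice.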


Mean Absolute Percentage Error (MAPE) (%) (Box, Jenkins, & Reinsel, 2008). ARIMA(p,d,q) is composed of lagged
n
yforecasting, i − yobserved, i terms from the input time series with the autoregressive part AR(p) of
1
= ∑ *100 order p, and of lags of the forecasting error with the moving average
n yobserved, i (2)
i=1 part MA(q) of order q. When the input time series is not stationary it
n can be differentiated: the order d indicates the degree of differentiation.
1
Mean Square Error (MSE) (W or kW) =
n
∑ (yforecasting, i − yobserved,i )2 When the time series already is stationary, and therefore differentiation
i=1 is not necessary, ARIMA can also be noted ARMA(p,q). The output of
(3) ARIMA modeling is a linear equation, combining both the
autoregressive and moving average parts as follows:
Root Mean Square Error (RMSE) (W or kW)
p q
n
1 Yˆt = C + ∑ φXt−i − ∑ θj εt−j
=
n
∑ (yforecasting, i − yobserved, i )2
i=1 j=1 (10)
i=1 (4)

Cofficient of Variation of RMSE (CVRMSE) (%) With t the time-step, Ŷ the predicted value and X the time series values.
1 n φ is the coefficient of the autoregressive model, θ the is coefficient of
n
∑i = 1 (yforecasting, i − yobserved, i )2 the moving average model and C is a constant. ε is the forecasting error.
= *100
y¯observed (5) AR models are relatively simple to implement. Basic autoregressive
models only consider the recent past historical load demand data points
Also called Normalized RMSE (NRMSE) or Root Mean Square
to predict its future states. Therefore, they can only provide short-term
Percentage Error (RMSPE)
forecasting, which limits their application scope and accuracy. Several
Cofficient of Determination (R2) (unitless) technical improvements have been developed to overcome these issues.
n
∑i = 1 (yforecasting, i − yobserved, i )2 SARIMA models (seasonal ARIMA) (Jeong, Koo, & Hong, 2014) append
=1− n additional seasonal terms to a standard ARIMA to account for events
∑i = 1 (yobserved, i − y¯observed )2 (6) and trends happening at a regular pace. They are noted SARIMA(p,d,q)
n (P,D,Q)s with p, d and q related to the non-seasonal part of the data as
1
Mean Bias Error (MBE) (W or kW) =
n
∑ (yforecasting, i − yobserved,i ) presented above and P, D, Q the lagged terms of the seasonal part of the
i=1 (7) data for a lag of S (the period of the events/trends). Also, ARIMAX
models (ARIMA with inclusion of eXogenous variables) (Newsham &
Normalized Mead Bias Error (NMBE) (%) Birt, 2010) consider the impact of parameters other than the past load
1 n
n
∑i = 1 (yforecasting, i − yobserved, i ) demand on the energy consumption such as weather conditions or oc-
= *100
y¯observed (8) cupancy. They are added to the standard ARIMA models as a linear
combination the past b terms of their corresponding time series. AR-
y¯forecasting, i − y¯observed, i IMAX models can then be noted ARIMAX(p,d,q,b). Finally, both
Mean Relative Error (MRE) = SARIMA and ARIMAX can be combined.
y¯observed, i (9)
Indeed, Newsham and Birt (2010) developed a SARIMAX model
With yforecasting,i is the forecasted energy consumption at time point i, with occupancy data from network logins and daily power seasonality,
yobserved,i is the real energy consumption data at time point i, ȳobserved, i is using IBM SPSS Statistics (“IBM SPSS Statistics, ” n.d.). It aimed to
the average of the real energy data consumption over the considered forecast occupancy-related electricity load (lighting, office and lab
timeframe, and n is the total number of data in the dataset considered equipment, plug loads, without chiller power) of an office and research
for performance evaluation. building in Ontario, Canada. The model was compared to a SARIMA
model and results showed a MAPE of 1.24% and 1.22% for SARIMA and
2.3. Single models SARIMAX respectively. Yun, Luck, Mago, & Cho (2012) implemented
four 4th-order ARX models to predict separately ahour-ahead building
Single models are data-driven techniques implementing only one cooling and heating loads. Three models were indexed with time (dif-
predictive algorithm for a forecasting problem. In this paper, single ferent hours of the day) or time periods and temperature levels. Test
models include conventional methods with autoregressive models and cases were benchmark buildings simulated in EnergyPlus (“EnergyPlus,
statistical regressions, classification-based methods with k nearest ” n.d.), a physics-based modeling software. Building typologies in-
neighbors and decision trees, support vector machine and artificial cluded small-office, medium-office, midrise apartment and high-rise
neural networks.

2.3.1. Conventional methods
Conventional methods refer to autoregressive models and statistical regressions, two popular techniques which have been widely implemented for BECMF. They provide a good balance between implementation simplicity and forecasting accuracy. However, they have shown significant limitations with respect to the forecasting horizon and the ability to model nonlinear data patterns.

2.3.1.1. Autoregressive models (AR). Autoregressive modeling is one of the most classical modeling and forecasting techniques and is based on statistical analysis of time-series. It only requires the training set to be stationary: this means that statistical properties of the time-series should be time-invariant, or in other words that the energy consumption at a specific time should be similar to that of the recent past. Common models include AR and auto-regressive integrated moving average (ARIMA) models, also called Box–Jenkins models

apartment buildings. ARX models were compared to a simple AR model, a multi-linear regression (MLR) model and a back-propagation neural network (BPNN). The ARX model indexed with three time periods of the day and five OAT levels performed better for all four building types and for both cooling and heating periods. Dagnely, Ruette, Tourwé, and Tsiporkova (2015) developed a seventh-order AR model to forecast the next 72-h electricity load demand of an office building in Brussels, Belgium. They proposed a comparison with an ordinary least square regression (OLS) using Python Statsmodels (“StatsModels: Statistics in Python — statsmodels 0.9.0 documentation,” n.d.) and a support vector regression (SVR) using Python Scikit-Learn (“Scikit-Learn: machine learning in Python,” n.d.). Various input combinations were considered including day type, occupancy, OAT, SR and previous-week same-day logged energy consumption called “recency”. The AR model gave the best MAE of 2.01 kW. OLS performances ranged between a MAE of 2.05 kW for all variables and a MAE of 3.74 kW with temperature only. SVM performances ranged between a MAE of 1.94 for all variables and “recency”
only, and a MAE of 3.46 kW with temperature only.

2.3.1.2. Statistical regressions. Statistical regressions aim to model a relationship between an output and contributing inputs, also called explanatory variables, in the form of an equation. For BECMF, several types of statistical regressions can be found in the literature. These include multiple linear regressions (MLR), also called conditional demand analysis (CDA) (Parti & Parti, 1980); ordinary least square regressions (OLS) (Dagnely et al., 2015); piecewise linear regressions also called segmented regressions (Zheng, Zhuang, Lian, & Yu, 2017); general linear regressions (Chou & Bui, 2014); elastic net regressions (Fan, Xiao, & Zhao, 2017); Bayesian regressions (Gelman et al., 2013); and Gaussian process regressions (Rasmussen & Williams, 2006). When the modeled pattern is highly non-linear, the literature reports the efficiency of multivariate adaptive regression splines (MARS) (Friedman, 1991).

Statistical regressions have been widely implemented for both pre-occupancy (design phase) and post-occupancy forecasting studies such as energy retrofit impact assessment. Their popularity is mostly related to their simple implementation and relatively explicit formulation to link output energy consumption to input explanatory variables. Moreover, the forecasting performance of statistical regressions is reasonably good for most applications. Nevertheless, while the simplicity of statistical regressions is generally a strong advantage, it also induces one of their major drawbacks. Indeed, most regression techniques are unable to deal with non-linear phenomena which are common in building energy efficiency studies. Moreover, a large amount of data is also required to capture all possible scenarios.

For instance, Amber et al. (2017) implemented a MLR to predict the daily electricity consumption of an administrative building and an academic building in London, England. Input data included daily mean OAT, RH, SR and WS, weekday index and building type, and were narrowed down to daily mean OAT, weekday index and building type after a collinearity study. Of the five years of data collected, four years were used to train the regression model and one year to test it. Results showed a MAPE of 8.58% for the administrative building and a MAPE of 9.76% for the academic building. Pulido-Arcas, Pérez-Fargallo, and Rubio-Bellido (2016) developed MLR forecasting models for office buildings in Chile using a government database. It included building characteristics with the number of stories, floor area, form ratio, wall-to-window ratio (WWR), coefficient of performance (COP), energy efficiency ratio, and heating and cooling emission factors. Models were adapted to nine locations with specific climate datasets. They were used to assess total energy consumption (electricity and natural gas), comparing 77,000 possible office buildings for each climate location. Nine regression models were prepared and energy consumption forecasting results (MAE) ranged between 0.11 kW and 0.41 kW.

For non-linear dynamics, statistical regressions with MARS can compete with more complex data-driven methods presented in detail in the following sections of this review. For cooling and heating load forecasting, Sekhar Roy, Roy, and Balas (2018) compared MARS with a linear regression, a Gaussian process, a simple ANN, a radial basis function neural network (RBFNN), an extreme learning machine (ELM) model and an ensemble model of MARS and ELM. Models were trained and tested with an open database of 768 building samples. For heating load prediction, the ensemble model gave the highest accuracy and MARS ranked second, followed by ANN, Gaussian Process, ELM, Linear regression and RBFNN with a MAE of 0.037 kW, 0.077 kW, 0.085 kW, 0.175 kW, 0.189 kW, 0.196 kW and 0.354 kW, respectively. For cooling load, models also ranked the same with the ensemble model first, then MARS and ELM, with MAE of 0.127 kW, 0.146 kW and 0.238 kW, respectively.

2.3.2. Classification-based methods
Classification-based methods have been successfully implemented for modeling and forecasting purposes. Two of their most popular representatives are k-nearest neighbors (k-NN) and decision trees (DT). Both are intuitive techniques with high forecasting accuracy. However, both are also limited by the need for a comprehensive input dataset. They allow only qualitative analyses of their results.

2.3.2.1. K-nearest neighbors (k-NN). K-nearest neighbors is a popular technique for pattern clustering and classification that was first introduced by Fix and Hodges, Jr (1951). When applied to time-series, it relies on the idea that similar patterns can be identified and classified according to their properties: for instance, energy demand or consumption can be related to occupancy, weather data and other relevant parameters. Thus, given a set of historic observations (energy consumption and other variables), clusters are first created. They are constructed with respect to a user-defined feature: peak load, average, magnitude of daily load variation, daily consumption (integral), etc. These features are calculated for each time series and then used for classification. A less user-dependent process relies on the calculation of distances between each pair of time-series for which a metric is then to be defined. A simple Euclidean distance for example can be used and interpreted as a difference of energy consumption (Toffanin, 2016). Then, new observation data are compared with the clusters by defining a degree of closeness based on two criteria. The first criterion, the parameter k, sets the number of neighbors (the closest data points) to which the target observations should be compared. The second criterion is the metric for comparison and classification. It is usually the same as for the previous clustering step. Once the comparison is made and the new observation data are associated with a specific cluster, energy forecasting can be performed.

K-NN modeling is intuitive, relatively simple to implement and shows good forecasting accuracy. In most reviewed studies, it has been applied for short-term forecasting horizon with hourly time-step. Similarly to other classification methods, it has the advantage of enabling the utilization of categorical variables for energy driver considerations and to create the neighbors groups. However, its forecasting ability relies on the amount of input data available: the accuracy depends on the presence of similar “conditions” resulting in a similar output in the database.

Valgaev and Kupzog (2016) developed a k-NN model for 24-h-ahead overall electricity load forecasting of mixed-use buildings with different sizes and aggregated end-consumers load. An Irish energy database comprising over 6000 low-voltage buildings was used to model daily profiles from smart-metering and day-type was also distinguished. Three building sizes (25, 50 and 100 end-consumers) were then generated, with 70% of residential spaces and 30% of commercial spaces, respectively, and with samples of 100 buildings for each size. Accuracy results were assessed with MRE and showed that a higher number of end-consumers induced less accurate forecasting results, with 0.975, 0.968 and 0.940 for 25, 50 and 100 end-consumers, respectively. Wahid and Kim (2016) implemented k-NN for next-day total electricity consumption forecasting of residential buildings using both MATLAB (“MATLAB – MathWorks – MATLAB Simulink,” n.d.) and Weka (“Weka 3 – Data Mining with Open Source Machine Learning Software in Java,” n.d.) software. Appliance-level hourly energy consumption data were collected for 520 apartments in Seoul, South Korea. Apartments were then divided between low and high-power demand ones, considering the daily profiles generated. Different training-testing ratios were considered, and results showed that the most robust ratio was 60% training–40% testing, with 95.96% of accurately forecasted results. Ma, Song, and Zhang (2017) proposed a method with combined weight selection of similar days, applied on a government office building in Jiangsu Province, China. Based on day type and daily weather type (sunny, cloudy, rainy, and overcast), they extracted hourly OAT, lighting and plug, and air-conditioning loads to create daily electricity load profiles. One reference working day and one reference vacation day were then used with each weather scenario to forecast hourly air conditioning (AC) load. An eQuest (physics-based) model (“eQUEST,”
n.d.) was also implemented for comparison. The relative errors for physics-based modeling and k-NN models, for working day/vacation day ranged between [5.59%; 13.6%]/[5.23%; 17.6%] and [1.31%; 5.05%]/[0.83%; 3.68%], respectively. Lachut, Banerjee, and Rollins (2014) developed 5-NN (k = 5) forecasting models to predict building-level power demand in residential buildings. They used their own dataset recorded on seven different buildings. Data provided electricity loads with 30-s time-step and were aggregated to hourly, 6-h, daily and weekly time-step, depending on the forecasting horizon. k-NN was compared with Bayesian regression, SVM and ARMA(1,1) using past 24-h loads with time-related information (hour of the day, day of the week and quarter of the day). Results showed that k-NN performed better for both appliance and home levels with a one-week forecasting horizon. However, it was the least performing algorithm for 1 h-, 6 h- and one day-ahead forecasting horizons.

2.3.2.2. Decision trees (DT). Decision trees are a popular machine learning method also applied in regression problems for forecasting applications. It follows the simple idea of a tree growing from roots to leaves. Hence, a DT starts with a root node leading to other successive non-leaf nodes. At each node, a test is performed by considering a specific condition on an input variable, either binary or categorical, and the branches keep splitting until leaf-nodes are reached to give a possible value of the predicted output (Fig. 2). There is then a path to follow from the root node to the leaf-nodes through decision-making.

Several types of DT have been developed. The most common for building energy consumption forecasting are classification and regression trees (CART) (Breiman, Friedman, Olshen, & Stone, 1984), chi-squared automatic interaction detector (CHAID) (Kass, 1980), ID3 (J R Quinlan, 1986), C4.5 (John Ross Quinlan, 1993), and C5.0. CART refers to classification trees when the predicted output is the class the data belongs to, and to regression trees when the predicted output is a number (for forecasting applications). CHAID detects interdependency between the different variables of a dataset and therefore allows the study of the influence of explanatory variables on the result. Finally, C4.5 is an entropy measurement-based DT and improved version of ID3. C5.0 is an optimized version of C4.5 in terms of computation speed, memory allocation and tree sizing.

DT are flexible techniques that have been applied for both early design stage and post-occupancy studies. The accuracy of prediction results is comparable to other single data-driven techniques such as artificial neural networks and support vector machines. However, DT have the significant advantage of being easy to understand, with only moderately complex implementation and operation.

Tso and Yau (2007) compared stepwise regression, multi-layer perceptron (MLP) and DT models of residential households in Hong-Kong. 1516 buildings were surveyed during two distinct periods, in winter and summer. They collected appliance power ratings for each end-use and their corresponding half-hour time-step usage patterns. A database was then created and divided into three categories with residential housing types, household characteristics and appliance ownership details. Each of the models was trained with the same database, for both periods. Results for electricity consumption forecasting showed that the DT performed slightly better for summer time with RMSE of 39.36 kWh, compared to the ANN (RMSE of 39.53 kWh) and the regression (RMSE of 39.42 kWh). However, higher accuracy was achieved for winter time with the ANN (44.14 kWh) and with the regression (44.18 kWh) than with DT (44.40 kWh). Chou and Bui (2014) compared several models including CART, CHAID, SVR, ANN, general linear regression and ensemble models with different combinations of these techniques. They were implemented in IBM SPSS Modeler (“IBM SPSS Modeler,” n.d.) to predict separate heating and cooling load of twelve different building types. Input variables were extracted from an open-database of 768 building samples simulated in Ecotect tool (Tsanas & Xifara, 2012). It included relative compactness, surface area, wall area, roof area, overall height, orientation, glazing area and glazing area distribution. For cooling load assessment results (MAPE), methods ranked as follows: SVM with 2.99%, the four different ensemble models between 3.46% and 3.54%, then CART & CHAID both with 4.02%, ANN with 4.40% and regression with 4.96%. For heating load (MAPE), the method ranking was: SVR with 1.13%, the four different ensemble models between 1.56% and 1.61%, CART with 2.10%, ANN with 2.36%, CHAID with 2.41% and regression with 4.59%. Finally, Yu et al. (2010) implemented a C4.5 DT to model building energy use intensity with Weka Software (“Weka 3 – Data Mining with Open Source Machine Learning Software in Java,” n.d.). It was based on an 80-residential-building database from six different Japanese districts. The database included energy uses of the different energy sources in each building and at different time-steps, with outdoor air temperature, building characteristics and other information such as occupant number and energy saving measures. The C4.5 DT assessed the energy use intensity for each building and classified them as “HIGH” or “LOW”. Results showed a 92% success rate for the classification.

2.3.3. Support vector machines (SVM)
Support vector machines (Cortes & Vapnik, 1995) are a popular and efficient technique for non-linear problem solving. They give accurate results even with a relatively limited amount of available data. For forecasting applications, the process is similar to the resolution of a regression problem and is called support vector regression (SVR) (Smola & Schölkopf, 2004). Therefore, as for all regression problems,

Fig. 2. Schematic of a DT (decision tree) with input-variable-based conditional separating into non-leaf nodes until final leaf-nodes are reached.
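
As a minimal sketch of the structure shown in Fig. 2, the following pure-Python regression tree tests one input variable at each non-leaf node and returns a predicted value at each leaf. It is not the implementation used in any of the reviewed studies. The greedy squared-error split rule (in the spirit of CART), the two input variables (outdoor air temperature and a binary working-day flag) and the six load values are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Either a test on one input variable (non-leaf) or a predicted value (leaf)."""
    feature: Optional[int] = None      # index of the input variable tested
    threshold: Optional[float] = None  # test is x[feature] <= threshold
    left: Optional["Node"] = None      # branch followed when the test is true
    right: Optional["Node"] = None     # branch followed when the test is false
    value: Optional[float] = None      # predicted output, set on leaf-nodes only


def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def fit(X, y, max_depth=2, min_samples=2):
    """Grow a regression tree with greedy squared-error splits."""
    if max_depth == 0 or len(y) < min_samples or len(set(y)) == 1:
        return Node(value=sum(y) / len(y))
    best = None
    for f in range(len(X[0])):
        # Candidate thresholds: every observed value except the largest.
        for t in sorted({row[f] for row in X})[:-1]:
            lo = [yi for row, yi in zip(X, y) if row[f] <= t]
            hi = [yi for row, yi in zip(X, y) if row[f] > t]
            score = sse(lo) + sse(hi)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # no variable can separate the samples
        return Node(value=sum(y) / len(y))
    _, f, t = best
    li = [i for i, row in enumerate(X) if row[f] <= t]
    ri = [i for i, row in enumerate(X) if row[f] > t]
    return Node(
        feature=f,
        threshold=t,
        left=fit([X[i] for i in li], [y[i] for i in li], max_depth - 1, min_samples),
        right=fit([X[i] for i in ri], [y[i] for i in ri], max_depth - 1, min_samples),
    )


def predict(node, x):
    """Follow the path from the root node down to a leaf-node."""
    while node.value is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


# Hypothetical samples: [outdoor air temperature in °C, working day (1) or not (0)]
X = [[5, 1], [6, 1], [25, 1], [26, 1], [5, 0], [26, 0]]
y = [40.0, 42.0, 60.0, 62.0, 20.0, 30.0]  # hypothetical loads in kW

tree = fit(X, y, max_depth=2)
warm_working_day = predict(tree, [7, 1])
cold_day_off = predict(tree, [4, 0])
```

With these made-up samples the root node splits on the working-day flag and each branch then splits on temperature, which mirrors the root-to-leaf decision path described for DT in Section 2.3.2.2.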

the goal is to find a best-fitting function which in SVR modeling is developed based on the search of a decision hyperplane splitting a given dataset into two sub-datasets. Moreover, the particularity of SVR is that it tolerates an error in the regression. This error is called largest margin and is defined as the maximum distance between the separation boundary of the hyperplane and the closest data samples, named support vectors (Fig. 3). Finally, the key feature of SVR is the definition of a kernel function. The idea behind the kernel function is to change the representation space of the datasets to a higher dimension where there is probably a linear separation of the two datasets. Indeed, sometimes a dataset cannot be directly and linearly divided into two sub-datasets. Then, the kernel selection has a significant impact on the performance of SVM model. It can be linear, polynomial or a radial basis function (RBF), also called Gaussian kernel. Several sub-types of SVM can also be found in the literature such as a simplification of standard SVM called least-square SVM (LS-SVM) (Li, Lu, Ding, Xu, & Li, 2009), ε-SVR and ν-SVR (Zhang et al., 2016). However, a detailed description of these techniques is beyond the scope of this study.

Fig. 3. Representation of the division of a dataset into two subsets using SVR.

SVR have been implemented with most time-steps, input variables and considering real or simulated data. Nevertheless, despite competitive forecasting performances and application flexibility, SVM present a major drawback: the calibration of their parameters is a difficult but decisive process for prediction accuracy. For instance, the kernel function is challenging to determine accurately and significantly affects the accuracy of the forecasting. Therefore, the optimization of SVM parameters has become a key challenge in building energy studies (Chen & Yang, 2018; Fu, Li, Zhang, & Xu, 2015). Finally, as a black-box model, SVM is completely non-transparent in terms of physics-based interpretation.

Paudel et al. (2017) implemented SVM with LIBSVM library (Chang & Lin, 2013) to a TRNSYS-simulated (“TRNSYS: Transient System Simulation Tool,” n.d.) residential low energy building, in four different French cities and with four different climatic conditions. They aimed to forecast the combined cooling and heating energy demand with input variables including OAT, SR, solar gain through window and on walls, past-time steps of these variables, occupancy profile and day indicator. Two different kernel functions were selected. A linear kernel was used to train a SVM and determine the weight of weather data on the energy consumption. A RBF kernel was used for the prediction of the energy load. Three simulation scenarios were implemented, with different combinations of input variables after a “relevant data” selection using the linear kernel SVM. Results gave median RMSE of 13.1 kW, 4.2 kW and 3.2 kW for the three scenarios, respectively. Also, the relevant data selection was compared to an all-input model for the third scenario and showed that input selection could improve the forecasting accuracy with median RMSE of 3.4 kW against 9.1 kW.

Support vector machines also perform rather well compared to other popular data-driven techniques such as ANN or statistical regressions. For instance, Massana, Pous, Burgas, Melendez, and Colomer (2015) proposed a comparison of the three methods implemented on Weka software (“Weka 3 – Data Mining with Open Source Machine Learning Software in Java,” n.d.) for short-term electric load forecasting of a university office building in Girona, Spain. Seven different scenarios were tested for different combinations of input variables with filtered and non-filtered instances. The SVR model had higher accuracy, followed by a multilayer perceptron (MLP) and a multilinear regression (MLR) model. This study, like Paudel et al. (2017), also highlighted that the modeling accuracy increased with the selection of variables and the filtering of instances. With all variables and all instances, the MAPE was 24.3%, 23.72% and 14.32%, while with filtered instances for occupation and OAT, the MAPE was 5.2%, 1% and 0.06%, for MLR, MLP and SVR, respectively. Li et al. (2009) studied short-term cooling load forecasting of a DeST-simulated (“DeST simulation software,” n.d.) office building in Guangzhou, China. They built a LS-SVM model with Gaussian function kernel using mySVM software kit (“mySVM – TU Dortmund,” n.d.) that was compared to a back propagation neural network (BPNN) implemented with MATLAB (“MATLAB – MathWorks – MATLAB Simulink,” n.d.). Both models used hourly OAT, RH and SR as input variables. Results showed that LS-SVM was more accurate than BPNN with a CV-RMSE of 5.56% and 11.8%, respectively. Fu et al. (2015) proposed an ε-SVR with Gaussian kernel function for a historical record storage and office building in Shanghai, China. They aimed to perform day-ahead prediction of the four major building loads separately and aggregated using day type indicator, OAT, DPT and previous 48 h electricity loads. The ε-SVR was compared with ARIMAX, ANN and reduced-error pruning tree models. The former outperformed all three other techniques and for all five loads. Total load showed a CV-RMSE of 15.2% compared to 22.4%, 27.2% and 22.1% for ARIMAX, reduced-error pruning tree and ANN, respectively. Liu et al. (2015) studied total electricity consumption of a campus building and an office building with energy-saving measures. They implemented a SVR using MATLAB with LIBSVM (Chang & Lin, 2013) and its FarutoUltimate (Li, 2011) toolbox. One month of hourly load data were collected and divided into three weeks for model training and one week for testing. Results showed a higher R² of 0.906 for the first and of 0.921 for the second building, compared to 0.822 and 0.843 for an ARIMA model. Zhang, Zhao, Zhang, Fan, and Li (2017) developed support vector regression (SVR) and multiple linear regression (MLR) in Python Scikit-Learn (“Scikit-Learn: machine learning in Python,” n.d.) environment to model and forecast the cooling load of a virtual large office building. One year of hourly cooling load was simulated with EnergyPlus (“EnergyPlus,” n.d.) under Miami climate, Florida, United States. Inputs of machine algorithms included OAT, RH, WS, WD, SR and cooling loads at previous 1, 2, 3, 4 and 24 h. Input data were first selected for modeling using distance-correlation-based input method. The basic approaches were also improved with a prediction error correction according to the type of day and hour of the day. Results showed that the error correction improved the MLR forecasting accuracy with a MAPE reduction from 7.10% to 5.51%. However, it was less effective for SVR with a MAPE reduction from 5.70% to 5.66%.

2.3.4. Artificial neural networks (ANN)
Artificial neural networks modeling is one of the most applied data-driven BECMF methods. It is a nonlinear machine learning technique inspired from neural networks in the human brain of which it copies the information propagation process in a simplified manner. First, an information coming from a processing element (neuron) is sent with a weight (synapse) through a link (axon) to following processing elements. These combine the information received with other incoming

Fig. 4. Schematic of a classical three-layer ANN.
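
The three-layer structure in Fig. 4 can be sketched as a single forward pass in pure Python. Each neuron combines its weighted inputs (combining function) and passes the result through an activation function. The weights, biases, the tanh hidden activation and the linear output are illustrative assumptions and are not taken from any reviewed study. As in Fig. 4, the error retro-propagation used for training is left out.

```python
import math


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs (combining function), then the activation."""
    combined = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(combined)


def layer(inputs, weight_rows, biases, activation):
    """Apply every neuron of one layer to the same input vector."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer (tanh) -> output layer (linear)."""
    hidden = layer(x, hidden_w, hidden_b, math.tanh)
    return layer(hidden, out_w, out_b, lambda z: z)


# Hypothetical parameters: 2 inputs, 2 hidden neurons, 1 output.
hidden_w = [[0.5, -0.2], [0.1, 0.3]]
hidden_b = [0.0, 0.0]
out_w = [[1.0, 2.0]]
out_b = [0.5]

zero_input = forward([0.0, 0.0], hidden_w, hidden_b, out_w, out_b)
unit_input = forward([1.0, 1.0], hidden_w, hidden_b, out_w, out_b)
```

With all inputs at zero, both hidden neurons output tanh(0) = 0, so the prediction reduces to the output bias. Training would adjust the weights and biases until the error converges or the maximum number of iterations is reached.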

information from other neurons, using a combining function (dendrite). Finally, the combination of weighted information is sent to other receivers, depending on an activation function, also called transfer function (cell body). The process is repeated as many times as the number of layers in the network and until the model accurately fits the data (when the error rate converges) or when the maximum number of iterations is reached (with or without error convergence). The basic form of ANN is composed of three layers: an input layer used either to train the model or to get prediction from input data in the testing phase, an output layer giving the final result(s), and a hidden layer bridging between inputs and outputs (the number and structure can be modified depending on the type of ANN and the needs of the modeling) (Fig. 4). An ANN is usually designed according to three different criteria: (1) the interconnections between the different neurons of different layers – how many neurons from how many layers are communicating in what specific way; (2) the learning method – how the final error is retro-propagated in the network and how it affects the different weights (the error retro-propagation is not illustrated in Fig. 4); (3) the activation functions of each neuron based on the input (Magoulès & Zhao, 2016) (for different layers, the activation functions can be different).

ANN are highly flexible and adaptable models. They enable most forecasting problem solving including with non-linear patterns. They have been applied for short- to long-term forecasting horizon, using any time-step available and any type of input data. However, as for most data-driven techniques, one of the major disadvantages of ANN application is the black-box nature of the model with no transparency in terms of physical interpretation. Moreover, ANN models are subject to overfitting (Chalal et al., 2016; Massana et al., 2015). They tend to perform very well for one specific dataset but poorly if the same model is applied on another dataset (for training and testing steps for instance). The risk of overfitting increases with the degree of complexity of a model that is usually raised when higher accuracy is targeted. Solutions have been suggested to prevent this issue. Particular attention should be paid to modifying the structure of the hidden layers of the ANN (Ahmad, Mourshed, & Rezgui, 2017) during the training phase. Using regularization techniques is also possible, especially with large input dataset (L’Heureux, Grolinger, Elyamany, & Capretz, 2017). For example, a pre-selection on input variables can be performed with respect to their impact on the energy consumption to reduce the amount of data processed and the complexity of the model. Indeed, Massana et al. (2015) showed that a pre-selection of the input parameters significantly affected their forecasting results. In their study, higher accuracy was obtained using OAT, occupancy, type of day and hour of the day with filtered instances, resulting in a reduced MAPE of 0.45% for a MLP compared to all non-filtered inputs with a MAPE of 23.7%.

It should be highlighted that ANN is a generic term. Therefore, different specific types of ANN exist to be used in various situations and with different levels of complexity and accuracy. The review of the literature presented feed forward neural networks (FFNN) (Jovanović, Sretenović, & Živković, 2015) and back propagation neural networks (BPNN) (Wang, Wang, Li, Zhu, & Zhao, 2014), radial basis function neural networks (RBFNN) (Lee & Ko, 2009), a derivative of FFNN called extreme learning machine (ELM) (Sekhar Roy et al., 2018), adaptive network-based fuzzy inference systems (ANFIS) (Ghanbari, Abbasian-Naghneh, & Hadavandi, 2011) and multi-layer perceptron (MLP) (Amasyali & El-Gohary, 2018) which is the premise of deep learning ANN. More advanced ANN sub-classes can also be found such as recurrent neural networks (RNN) (Mocanu, Nguyen, Gibescu, & Kling, 2016) or nonlinear autoregressive model with exogenous inputs (NARX) (Mena, Rodríguez, Castilla, & Arahal, 2014) and probabilistic entropy-based neural networks (PENN) (Kwok & Lee, 2011).

Mena et al. (2014) implemented an ANN with NARX architecture for a bioclimatic university building in Almeria, Spain. One year and a half of electric load data with 1-min time-step were collected. Input parameters included the type of days, hour of the day, OAT, SR, state of several cooling and heating equipment and combined electric power demand. A full-input model was built and compared to a model with limited input (in that case the solar cooling system information was removed). Building energy consumption was forecasted with 1-min-ahead, 1-h-ahead and “infinite” horizons. Kwok and Lee (2011) used a specific type of ANN called probabilistic entropy-based neural network (PENN) to forecast the cooling load of an office building in Hong-Kong. It used hourly weather data including OAT, RH, rainfall, WS, bright sunshine duration and SR. Occupancy was also considered with occupancy rate and internal load. Three models were designed: (1) only weather input parameters, (2) weather inputs and occupancy area and (3) all input parameters. The cooling load was forecasted hourly with a one week-ahead horizon. Results showed that the third model was the most accurate with CV-RMSE (95% lower and upper limits) of 11.41%–17.17% compared to the second and first models with 14.84%–30.09% and 40.38%–52.05%, respectively. Bagnasco, Fresi, Saviozzi, Silvestro, and Vinci (2015) implemented a MLP with MATLAB (“MATLAB – MathWorks – MATLAB Simulink,” n.d.) to forecast total electricity consumption of a hospital complex in Turin, Italy. They used one year of 15-min load data and divided the year of observation into quarters to enhance the predictive power of the model. Input variables were the load of the previous day and of the same-day previous-week, the average of the previous day energy consumption, the type of day, the timestamp and the OAT. MLP was implemented for all four quarters, trained with two and a half months of data and tested with fifteen days of data. Forecasting results showed a mean MAPE of 7%. Finally, Biswas, Robinson, and Fumo (2016) proposed two feed-forward artificial neural networks (FFNN) with optimization of the convergence speed and accuracy and of the initialization of the algorithms, using MATLAB Neural Network Toolbox (“Neural Network Toolbox – MATLAB,” n.d.). These FFNN aimed to predict daily electricity consumption for a research and demonstration residential building located in Texas, USA, using the timestamp, daily mean OAT and daily SR. Data were collected for three months. Models were trained with 70% of the dataset and validated and
tested with the remaining 30%. The FFNN performances were assessed with a coefficient of determination, R², ranging between 0.871 and 0.878.

Compared to other forecasting techniques, ANN rank among the most accurate. Ahmad et al. (2017) compared FFNN implemented with Python NeuroLab (“NeuroLab 0.3.5, Neural Network Library for Python,” n.d.) to random forest decision tree (RF) built with Scikit-Learn (“Scikit-Learn: machine learning in Python,” n.d.) for hourly prediction of combined heating and cooling loads in a hotel building in Madrid, Spain. They showed that the RF actually performed better than the FFNN using all variables (RMSE of 4.66 kWh against 4.72 kWh) while the ANN performed better than the RF with a selection of relevant variables based on their impact on the energy consumption (4.60 kWh against 4.84 kWh). Zhao, Zhong, Zhang, and Su (2016) proposed a comparison of ANN, SVM and ARIMA models to forecast combined heating and cooling energy consumption in Chinese office buildings in Shanghai, Nanjing and Changsha. All models were developed using IBM SPSS Modeler (“IBM SPSS Modeler,” n.d.). Results showed that the ANN performed better than the SVM and that both outperformed the ARIMA with a MAPE of 0.15%–0.11%, 0.21%–0.18% and 0.41%–0.33%, respectively (first–second month testing samples).

With the recent popularization of the technique, deep learning models defined as “computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction” (LeCun, Bengio, & Hinton, 2015) have significantly gained in interest in BECMF studies. Deep learning modeling can be regarded as a technique similar to ANN. However, while standard ANN have only three layers (one input, one hidden and one output layer), deep learning neural networks (DNN) develop more complex algorithm architecture and training schemes. Hence, the number and structure of the hidden layers are adapted depending on their function in the modeling process (Fig. 5). Moreover, the training process is not as straightforward as in conventional ANN. Specific operators are defined to provide more flexibility to the model and achieve higher accuracy.

layer DNN with genetic programming (GP), ANN, SVR and MLR. ANN and DNN algorithms were developed with TensorFlow library (“TensorFlow,” n.d.) and SVR was developed with libSVM (Chang & Lin, 2013). Authors used an administrative university building located in London, England as a case-study. They aimed to predict the daily building electricity consumption per unit of surface using daily mean OAT, RH and WS, and a weekday index. All models were trained using three years of data and were tested with one year of data. Results showed that ANN outperformed other techniques with a MAPE of 6%. The multiple regression gave a MAPE of 8.5%, SVM gave 9% and DNN gave a MAPE of 11.15%. Mocanu, Nguyen, Gibescu, et al. (2016) proposed two DNN implemented on MATLAB (“MATLAB – MathWorks – MATLAB Simulink,” n.d.), namely conditional restricted Boltzmann machine (CRBM) and factored CRBM (FCRBM). They used an open dataset with aggregated and sub-metered active power data from a benchmark single residential housing. Energy consumption forecast was performed with different time-steps (1 min to one week) and for different forecasting horizons (15-min-ahead to one-year-ahead), using past time-steps of electric load demand. The two methods were compared with ANN, SVM and RNN. The FCRBM gave the best accuracy for all time-steps and all forecasting horizons. Fan, Wang, Gang, and Li (2019) compared various approaches and model architectures for forecasting with deep recurrent neural networks developed with R software programming (“R: The R Project for Statistical Computing,” n.d.) and Keras package (“Keras Documentation,” n.d.). They aimed to forecast the cooling load demand of an educational building in Hong-Kong with half-hour time-step at 24-h-ahead prediction horizon. Input variables included OAT, RH and past time-steps of cooling load power demand, collected over one year. 70% of the dataset was used for model training and 30% was used for testing. Performances of the algorithms ranged between a CV-RMSE of 16.0% for the best performing model, a DNN with gated recurrent unit and direct inference approach, and a CV-RMSE of 38% for the least performing method, a DNN with long short-term memory and recursive inference approach. Finally, Shi, Liu, and Wei (2016) proposed the study of a type of RNN called echo state
Thus, ANN with more than three layers can be considered as a DNN. network (ESN) using neuron reservoirs instead of the classical hidden
For instance, MLP as implemented in (Bagnasco et al., 2015; Massana layer. It was implemented for an office building in China to predict
et al., 2015; Tso & Yau, 2007) can contain three or more layers. RNN hourly electricity consumption and using 6 different reservoir topolo-
(Mocanu, Nguyen, Gibescu, et al., 2016) also fall in this class of neural gies. Four years of breakdown hourly load data were collected, and
networks. Besides, other examples can be found in the literature. OAT and building occupancy were used as inputs. Models considered
Marino, Amarasinghe, and Manic (2016) compared a deep recurrent only working days to assess the three main building loads (lights, plugs
neural network called long short-term memory. They aimed to forecast and AC), total rooms load for four types of rooms and total building
the electricity load of a benchmark single residential building. They electricity demand. Errors for the whole building (CV-RMSE) were be-
used the date and time of the targeted prediction together with the tween 3.72% and 4.97% depending on the typology of reservoir.
previous time-step of power demand. Data were collected for four
years: at 1-min time-step for hour-ahead forecasting the DNN gave a
RMSE of 0.667 kW; at 1-h time-step for 60-h ahead forecasting, the 2.4. Combined models
DNN gave a RMSE of 0.625 kW. Amber et al. (2018) compared a two-
Combined models focus on the optimization of forecasting

Fig. 5. Schematic of a Deep Neural Network with n hidden layers.


techniques to improve the prediction accuracy. The combined modeling framework either mixes several single algorithms together (ensemble models) or couples them with optimization methods (improved models).

2.4.1. Ensemble models

Ensemble models are data-driven algorithms designed for forecasting applications. They use a specific framework focusing on the improvement of prediction performance and on the tradeoff of the strengths and weaknesses of predictive algorithms. The ensemble modeling framework comprises two main steps (Fan et al., 2014): (1) several sub-models called “base learners” (for homogeneous ensemble models) or “base models” (for heterogeneous ensemble models) are obtained; (2) their respective forecasting results are compared, weighted depending on their accuracy and combined to generate the optimal output of the ensemble model.

The general ensemble modeling process can further adopt different approaches (Wang & Srinivasan, 2017). A first strategy is called homogeneous modeling. It creates sub-samples from the original dataset which are processed through one specific single data-driven technique. The results obtained for each sub-sample, the base learners, are weighted based on their respective prediction performances and are combined into the ensemble model. Two additional specific paths can also be used for homogeneous modeling (Alobaidi, Chebana, & Meguid, 2018): sequential or in-series learning, whose classical examples are boosting algorithms (Schapire & Singer, 1999), generates base learners sequentially to exploit their interdependence (Fig. 6d); parallel learning, which can refer to the bagging method (Breiman, 1996), also called bootstrap aggregation, generates base learners in parallel to exploit their independence (Fig. 6c). Bagging more specifically aims to reduce the variance of the estimates of each base learner, while boosting targets bias reduction. Finally, the second strategy for ensemble modeling is called heterogeneous modeling and can refer to stacking techniques (Wolpert, 1992) (Fig. 6b). It uses several different single forecasting algorithms trained on the same dataset (Step 1). The forecasting results from each base model are weighted to give the ensemble model (Step 2). For comparison with the different ensemble modeling processes, Fig. 6a illustrates the single-technique modeling process, in which a unique dataset is processed by a unique algorithm to give forecasting results.

Fig. 6. Comparison between (a) single, (b) heterogeneous ensemble, (c) parallel homogeneous ensemble and (d) sequential homogeneous ensemble models.

Ensemble models have gained in interest for BECMF in the past few years. They provide better prediction accuracy than regular single models and they have been applied to various case-studies with different time-steps and types of data. However, the increase in prediction accuracy comes at the cost of complexity. Indeed, the framework of ensemble learning is particularly challenging to implement and requires advanced expertise in machine learning. Moreover, it is entirely a black-box modeling process, and the prediction horizon in the reviewed studies has been limited to short-term forecasting.

Typical examples of homogeneous ensemble learning models are improved decision trees such as random forest (RF) and boosting decision trees (BDT). Their main advantage is to correct the


tendency of standard DT to overfit their training set. RF consists of a group of several decision trees whose results are aggregated into one final result (Breiman, 2001). They usually use a two-level randomization strategy. First, each tree is trained with a random subset of observations, and then each tree node is divided by considering a random subset of variables. BDT, and their modified version called gradient boosted decision trees (GBDT) (Ho, 1995), are based on classification and regression tree (CART) methodology. They use boosting techniques: in the modeling process a sequence of simple decision trees is developed, with each successive tree modeling the residuals of the preceding one. The final model is a weighted additive binary tree model (Elith, Leathwick, & Hastie, 2008). Tsanas and Xifara (2012) compared an iterative reweighted least square regression method to a RF model to predict the heating and cooling load of residential buildings simulated with the Ecotect tool. They created an open database including eight passive systems variables (Xifara & Tsanas, n.d.). Results showed that RF outperformed the regression, with a MAE of 0.51 kW/1.42 kW and 2.14 kW/2.21 kW, for heating/cooling loads and for both models, respectively. Wang, Wang, Zeng, Srinivasan, and Ahrentzen (2018) compared a RF to a regression tree and a SVR for the forecasting of electricity consumption in two institutional buildings, including a LEED building, in Florida, USA. Input data were OAT, DPR, RH, pressure, precipitation, WS, SR, estimated occupancy from daily operation and class schedule, time of the day, workday type and day type. Data were collected over a typical year of operation: 80% of the dataset was used for model training and 20% for testing. RF outperformed both the regression tree and SVR for both buildings. For the LEED building, RF had a MAPE of 7.75% compared to 8.04% for SVR and 8.90% for the regression tree. For the second building, RF had a MAPE of 11.93%, compared to 12.21% for SVR and 14.50% for the regression tree. Papadopoulos, Azar, Woon, and Kontokosta (2017) proposed a comparison of three different ensemble DT-based models implemented with Python Scikit-Learn (“Scikit-Learn: machine learning in Python,” n.d.). They developed a RF, an extra randomized trees model and a GBDT to forecast combined heating and cooling loads. Using an open database (Xifara & Tsanas, n.d.), they compared their forecasting method with results from Tsanas and Xifara (2012) (regression and RF models), Chou and Bui (2014) (SVM and ensemble ANN-SVM models) and Castelli, Trujillo, Vanneschi, and Popovič (2015) (genetic programming). The GBDT improved forecasting performance by 8% to 68% for the heating load and by 51% to 63% for the cooling load compared to the three other studies. Wang, Wang, and Srinivasan (2018) developed a homogeneous ensemble BDT using MATLAB (“MATLAB – MathWorks – MATLAB Simulink,” n.d.). It was applied on a LEED institutional building in the University of Florida for short-term electricity demand prediction. The model used one year of time-series weather and occupancy data, together with time of the day and day type. To provide higher forecasting performance and because of the different usage periods of the building over the year, the dataset was partitioned in three sets for summer, fall and spring seasons. Moreover, a method called “compact” modeling was proposed and applied to BDT to measure the influence of the different features and select the most relevant ones for forecasting. Results showed that BDT and compact BDT performed better than CART, with 2.97%/4.62%/4.40%, 2.92%/4.40%/4.48% and 3.08%/5.05%/5.06%, for periods of summer/spring/fall and for the three models respectively. Nevertheless, it was also highlighted that feature selection did not significantly improve forecasting accuracy.

Ensemble models have also been implemented with techniques other than decision trees. For instance, C. Fan et al. (2014) compared a MLR, an ARIMA, a SVR with Gaussian kernel, a RF, a MLP, a BDT, a MARS, a k-NN and a heterogeneous ensemble model combining all eight single models. They were used for daily energy consumption forecasting of a mixed-use (commercial center, offices and hotel) high-rise building in Hong Kong. Data collection included one year of 15-min time-step building electricity data and one year of daily weather data. The forecasting process also included an outlier detection and elimination method. The ensemble model performed better with a MAPE of 2.32%. For comparison, the SVR was the second-best performing model with a MAPE of 3.11% and the ARIMA was the least performing technique with a MAPE of 5.45%. Alobaidi et al. (2018) developed a homogeneous MLP-FFNN-based ensemble model to forecast day-ahead mean daily household electricity use. It was implemented with a smart energy system in a French household, considering two years and nine months of electricity consumption and OAT datasets. The method aimed to improve the dataset resampling in the homogeneous ensemble modeling process with a two-step strategy to promote diversity in the model and to better capture the different trends in the dataset. It also targeted the improvement of ensemble model generation, using MLR to combine base learners. The ensemble model was compared to a single ANN and an ANN-based boosting ensemble model, all applied to each day of the week separately. The homogeneous ensemble model outperformed the two latter techniques with a mean weekly MAPE of 14.4%, 18.3% and 15.2%, respectively.

2.4.2. Improved models

In the present study improved models are defined as the combination of single models and optimization techniques. These techniques have also been identified by Mat Daut et al. (2017), who referred to them as “hybrid methods”. Optimization methods can include swarm intelligence algorithms such as particle swarm optimization (PSO) (Kennedy & Eberhart, 1995), genetic algorithms (GA) (Mitchell, 1998), a popular sub-class of evolutionary algorithms, or differential evolution (DE) (Storn & Price, 1997).

Zhang et al. (2016) compared the efficiency of swarm intelligence and evolutionary algorithms by applying three techniques: a GA, a differential evolution (DE) algorithm and PSO for the optimization of SVR parameters. All three optimization techniques were implemented with the R programming tool (“R: The R Project for Statistical Computing,” n.d.) on ε-SVR and ν-SVR models separately. Single models were compared to a weighted combination of both types of SVR optimized with DE. The different techniques were tested on an institutional building in Singapore for half-hour (10-day dataset, 8:2 training-testing ratio) and daily energy forecasting (260-day dataset, 8:2 training-testing ratio). Inputs only included the past energy demand of the building. The combination of both types of SVR and DE showed the highest forecasting accuracy for both half-hour and daily electricity consumption forecasting, with a MAPE of 3.77% and 5.84%. For ε-SVR with GA, PSO and DE optimization, the MAPE was 6.67%, 5.44% and 5.44% for half-hourly time-step and 5.93%, 5.95% and 5.95%, respectively, for daily time-step. For ν-SVR with GA, PSO and DE optimization, the MAPE was 3.77% for all three techniques with half-hourly time-step and 6.37%, 6.36% and 6.36%, respectively, for daily time-step. Besides, for energy consumption prediction of residential buildings, Castelli et al. (2015) used genetic programming (GP) with symbolic regression to develop an improved model. Using an open database of benchmark residential building characteristics (Xifara & Tsanas, n.d.), it performed with a MAE of 0.51 kW for heating load and of 1.18 kW for cooling load forecasting.

K. Li et al. (2015) compared a PSO-ANN to simple ANN and GA-ANN (GA was used for the same purpose as PSO). Two databases were used to predict hourly electricity consumption for: 1) a research building from the ASHRAE dataset located in the USA, with four-month data collection including WS, SR, RH and OAT; 2) a campus library located in East China, with 100-day data collection including estimated occupancy and daily OAT. For the former, 70% of the data were used for training and 30% for testing. For the latter, 93% of the data were used for training and the remaining 7% for testing. Principal component analysis (PCA) (Jolliffe, 2002) was applied with both case studies for relevant input data selection. PSO-ANN gave better forecasting results than GA-ANN and ANN, with MAPE of 1.6%,


1.9% and 2.2%, respectively, for the ASHRAE database. PSO-ANN also outperformed GA-ANN and ANN when applied on the Chinese library building, with a MAPE of 5.9%, 7.1% and 8.0%. In another study, K. Li et al. (2018) developed an optimization strategy called teaching-learning based optimization. It used evolutionary algorithms combined with BPNN to improve convergence speed and forecasting performances. A basic combined model was proposed, together with five improved versions. Models were also coupled with PCA for relevant input variables selection. Algorithms were compared with the PSO-ANN and GA-ANN from the previous study (K. Li et al., 2015). The improved teaching-learning based optimization ANN performed slightly better than the two latter methods for both buildings.

3. Machine learning approaches for data-driven techniques

The goal of data-driven techniques in BECMF studies is to model the relationships between a combination of inputs and outputs under a specific process. The model is then used to forecast building energy consumption or power load demand. Model outputs are always known since they are the target of the whole study. However, there exist different strategies regarding the utilization of input variables and the extraction of features from an input dataset to train data-driven models. Two main approaches have been used for input variable selection in BECMF: supervised and unsupervised learning. Moreover, other tasks exist, such as reinforcement and transfer learning. This section discusses all four machine learning tasks and their application in BECMF studies.

3.1. Supervised and unsupervised learning

Supervised learning can be understood as having input variables of a model that are all labeled, indicating they have been clearly identified before the modeling process. For building energy studies, it would mean that inputs used to assess the energy consumption of a building have been related to their tangible physical meaning (Fan et al., 2017). For instance, the first variable would be outdoor air temperature, the second would be occupancy and so on. The identified variables can then be cleaned and pre-processed to select those with the most impact on energy consumption, or directly used as such for model training. Therefore, most data-driven applications presented in the previous section use a supervised learning approach.

Unsupervised learning is the second main task of machine learning. It has been widely implemented to make full use of big data collection in buildings, and its applications include data analytics, optimization, control, identification of occupant behavior and anomaly detection (Fan, Xiao, Li, & Wang, 2018; Miller, Nagy, & Schlueter, 2018). Contrary to supervised learning, unsupervised learning uses unlabeled data to discover relevant relationships within a dataset. Thus, a predefinition of specific data types which could influence the modeling process is not implemented. More precisely, for BECMF applications, feature extraction within an input variable dataset is independent from the physical meaning of the variables. Popular techniques for unsupervised feature extraction and classification include k-means (Jain, 2008), self-organizing maps (SOM) (Kohonen, 1997), hierarchical clustering algorithms (Rokach & Maimon, 2005) and expectation maximization algorithms (EM) (Dempster, Laird, & Rubin, 1977). The detailed description of these classification methods is out of the scope of this review, but interested readers can refer to Wei et al. (2018) for further details. Using an unsupervised approach, Tang, Kusiak, and Wei (2014) investigated the impact of input data clustering on the prediction accuracy of commercial combined heating and cooling demand with short-term forecasting horizon. They first compared several single (SVR, MLP) and ensemble models (RF, boosting tree and MLP-ensemble) to highlight that a MLP-ensemble performed best. Then they further prepared four input data pre-treatment scenarios applied on the initial supervised MLP-ensemble model, on a season-based model and on two cluster-based models. Results showed an improvement of the forecasting accuracy using clusters, with a MAPE of 3.62%, 3.64%, 3.32% and 3.22% for the single, seasonal-based, first cluster-based and second cluster-based models respectively. Nilashi et al. (2017) implemented both EM and principal component analysis (PCA) with an adaptive network-based fuzzy inference system (ANFIS) for cooling and heating load forecasting of residential buildings. Comparisons were performed between seven forecasting techniques implemented with MATLAB (“MATLAB – MathWorks – MATLAB Simulink,” n.d.): SVR, ANFIS, ANN, CART, MLR, PCA-ANFIS and an improved model of MARS with artificial bee colony algorithm (Karaboga & Basturk, 2007). Results showed that the prediction scheme with EM + PCA + ANFIS performed better for both heating and cooling loads, with a MAPE of 1.39% and 2.45% respectively. Finally, comparing the efficiency of both supervised and unsupervised approaches, C. Fan et al. (2017) implemented seven forecasting techniques including MLR, elastic net regression, RF, gradient boosting machine, SVR, extreme gradient boosting decision tree (GBDT) and DNN, using one year of half-hour data. Five input variable datasets were prepared, based on supervised learning techniques. The first dataset included 1) seven variables (OAT, RH and five time indicators). The other four datasets added 2) the past 24-h cooling load, OAT and RH; 3) the previous time-step of cooling load, OAT and RH; 4) the previous 24-h minimum, maximum, mean and standard deviation of the three variables; 5) the four most dominant frequencies resulting from a discrete Fourier transform performed on the previous 24 h for each of the three time series. A sixth dataset was also prepared using an unsupervised deep auto-encoder, a DNN, considering the four previous feature extraction methods. The smallest forecasting error was obtained using extreme GBDT with the unsupervised dataset (CV-RMSE of 17.8%). In contrast, the supervised learning approach did not show evident advantages for building cooling load prediction.

3.2. Reinforcement and transfer learning

Reinforcement learning (Busoniu, Ernst, De Schutter, & Babuska, 2011) is another approach in the field of machine learning and differs from supervised and unsupervised learning. The process is not based on feature extraction in input datasets or on data labeling. It is inspired by psychology and follows a concept of learning through rewarding. In reinforcement learning, an artificial agent representing a decision-maker is set up in a determined environment with a specific goal to achieve. The agent performs self-decided actions to reach the predefined goal. For each action, it moves within the environment and receives feedback in the form of a reward that indicates how far its position is from the final target (Fig. 7). Moreover, each performed action is memorized by the agent to assess its efficiency based on the reward received. Indeed, it aims to maximize the sum of rewards over time to achieve the final state. Hence, the agent must be able to learn and decide on a strategy to automatically select a next action without any intervention from a programmer. Therefore, reinforcement learning is not supervised, since it relies also on the results of agent-based actions and not only on labeled input data. Reinforcement learning is not unsupervised either, as the nature of the reward is already known. Some recent applications for building energy and forecasting studies can be found in the literature. Mocanu, Nguyen, Kling, and Gibescu (2016) implemented two reinforcement algorithms, namely Q-learning and state-action-reward-state-action (SARSA) algorithms, with an unsupervised deep belief network (DBN, a type of DNN) in MATLAB (“MATLAB – MathWorks – MATLAB Simulink,” n.d.). They used seven years of hourly data to forecast energy consumption in a smart-grid context. The database was divided between five different building types and five scenarios were implemented for hour-ahead, day-ahead, week-ahead and month-ahead forecasting with hourly time-step, and year-ahead with weekly time-step. Among the tested models, the Q-learning with DBN obtained the highest accuracy for every scenario and


for both transfer learning strategies. It was followed by the DBN + SARSA algorithm, while reinforcement learning algorithms alone performed less accurately.

Fig. 7. Reinforcement learning modeling process with reward-based decision making from an artificial agent.

Furthermore, in this study the particularity of the training method based on reinforcement learning algorithms lay in training models with a dataset from a specific building type to forecast energy consumption for another building with different characteristics. This method is called transfer learning (Pan & Yang, 2010). In the specific case of BECMF, it aims to use and adapt data from specific buildings to train forecasting models implemented for energy demand prediction in other, different buildings. For instance, Mocanu, Nguyen, Kling, et al. (2016) used commercial building data as a training set to predict the energy consumption of residential buildings. They also used a similar process to train a model using data for residential buildings without electric heating to predict energy consumption of residential buildings with electric heating. In another study, Ribeiro, Grolinger, ElYamany, Higashino, and Capretz (2018) developed a specific approach for cross-building (transfer learning) energy forecasting using seasonal and trend adjustment. They selected a case study of four different schools with relatively similar but different energy behavior and climate locations. They proposed two modeling schemes based on their transfer learning method: 1) a training set of one month of data from the target building coupled with twelve months of data collected on the other three buildings; 2) a training set of twelve months of data from the target building coupled with twelve months of data collected on the other three buildings. These schemes were compared with a classical supervised machine learning approach with 3) one month and 4) twelve months of data from one building to forecast the next month same-building energy demand; 5) a training set combining one month from the target building and twelve months of data from the other three. All five schemes were tested with both a SVR and a MLP. Results highlighted the efficiency of the proposed transfer learning method over classical supervised learning for both data-driven techniques. Nevertheless, it should be noted that the fourth training scheme also produced good forecasting performance, almost equivalent to that of the transfer learning method.

4. Input data for data-driven techniques

4.1. Data characteristics

Input data are the driver of all approaches and techniques in the reviewed studies. Input datasets have different characteristics with a direct impact on the modeling and forecasting accuracy. First is the origin of data, classified in three main categories with real, simulated and benchmark data accounting for 64%, 20% and 16% of the studies considered in the present review work, respectively. Real data are directly collected from bills, energy meters, environment sensors and onsite surveys. Simulated data are extracted from physics-based models of existing or non-existing buildings, using tools such as EnergyPlus (“EnergyPlus,” n.d.), TRNSYS (“TRNSYS: Transient System Simulation Tool,” n.d.), DeST (“DeST simulation software,” n.d.), Ecotect (“Ecotect Analysis | Autodesk Knowledge Network,” n.d.) or eQuest (“eQUEST,” n.d.). Benchmark data come from publicly available datasets provided for researchers to compare the performance of forecasting algorithms. Benchmark databases used in reviewed studies have been summarized in Table 2.

The features in the dataset can be divided into six main groups: 1) weather data grouping all data related to outdoor conditions; 2) indoor environment to characterize the building indoor conditions; 3) occupancy and occupant behavior; 4) time indicators that deliver information on the building operation and its energy behavior; 5) past time-steps that account for the potential impact of past events on the current and predicted states of the building energy; 6) building characteristics with information on the building passive and active systems. A more detailed list of the different types of input variables found in reviewed studies and falling under these six main categories is provided in Table 3. The number of studies referring to these data and the corresponding techniques implemented are also indicated. The analysis of this table reveals the predominant use of specific categories of data. Outdoor air temperature (OAT), outdoor relative humidity (RH) and solar radiation (SR) are considered in thirty-two, nineteen and eighteen different studies respectively. Indeed, these parameters are easily accessible through various open-access or paid weather database platforms (“Iowa Environmental Mesonet,” n.d.; “Meteonorm: Irradiation data for every place on Earth,” n.d.). Moreover, their impact on building energy behavior is well known. Building and equipment characteristics information is crucial as well to accurately model building energy consumption. It can be accessed through onsite surveys, design-related documents or energy standards. Conversely, some well-identified energy drivers were less considered, such as building occupancy: real occupancy data have been reported in only seven reviewed studies. As a matter of fact, the challenges of occupancy measurement (Yang, Santamouris, & Lee, 2016) often lead authors to prefer assumed occupancy schedules (used in seven reviewed studies). Thus, because of data availability issues, time-related parameters such as the type of day, day of the week indexes and the time of the day are considered to replace other time-dependent measurements such as weather information, occupancy and usages or equipment triggering. Similarly, past load demand data points have been used in twenty-one reviewed articles. Indeed, past load demand provides information on past energy behavior related to time periods, building operation conditions and events similar to the future states of the building energy demand, but for which specific data are unavailable. Finally, it should be mentioned that other parameters were much less used because of their limited impact on building energy consumption, such as barometric pressure, cloud coverage or evaporation. However, some of these features (indoor environment measurements or CO2 levels for instance) could be relevant when considering building comfort, which impacts building energy consumption (Allab, Pellegrino, Guo, Nefzaoui, & Kindinis, 2017).

The third input dataset characteristic is the granularity of the time-series. Different time-steps may firstly relate to the needs of the studies. Indeed, using a very small time-step such as 1-min provides information on very short and specific events in building energy demand patterns. However, such precise information induces a very high variability of the energy demand time series and therefore makes accurate forecasting more challenging and complex (Mena et al., 2014). Conversely, a larger granularity, such as weekly or monthly reporting, provides information on building design features (Tsanas & Xifara, 2012) and socio-economic related aspects (Yu et al., 2010). However, large


Table 2
Description of benchmark databases used in reviewed studies.

| Database name | Building type(s) | Number of buildings/appliances | Type of data | End-use(s) | Data collection scale | Other | Time-step | Timeframe | Referring studies | Link to the database |
|---|---|---|---|---|---|---|---|---|---|---|
| BGE: Baltimore Gas and Electricity company | Residential & commercial | 5 buildings | Load profiles | Electricity | Building | N/S | 1 h | 7 years | Mocanu, Nguyen, Kling, et al. (2016) | https://supplier.bge.com/ ; https://supplier.bge.com/documents/index.asp#electric |
| Energy Efficiency Dataset | Residential | 768 simulated buildings | Benchmark passive system data | Heating & cooling | Building | N/S | N/S | N/S | Tsanas and Xifara (2012) | https://archive.ics.uci.edu/ml/datasets/energy+efficiency |
| ICER: Irish Commission for Energy Regulation dataset | Residential & commercial | > 6000 buildings | Time series | Electricity, natural gas, water | Building | N/S | 1 day | 19 months | Valgaev and Kupzog (2016); Valgaev, Kupzog, and Schmeck (2017) | http://www.ucd.ie/issda/data/commissionforenergyregulationcer/ ; https://github.com/wwzjustin/CER-Smart-Meter-Project-by-Irish-Social-Science-Data-Archive |
| iHEPCDS: individual Household Electric Power Consumption Data Set | Residential | 1 building | Time series | Electricity | Building & sub-meters | N/S | 1 min | 4 years | Marino et al. (2016); Mocanu, Nguyen, Gibescu, et al. (2016) | http://archive.ics.uci.edu/ml/datasets/Individual+household+electric+power+consumption |
| The Great Building Energy Predictor Shootout I | Commercial | 1 building | Time series | Electricity, hot water and cold water | Building | OAT, DBT, SR, HR, WS | 1 h | 6 months | Fan et al. (2019); Li et al. (2015) | N/S |
Sustainable Cities and Society 48 (2019) 101533

Table 3
Description of the different types of input data in the reviewed studies.
Main input data type   Specific input data   Techniques and references using the data   Number of studies

Weather/outdoor environment Outdoor air temperature (OAT) AR: (Fu et al., 2015; Yun et al., 2012) 32
Regression: (Amber et al., 2017, 2018; Dagnely et al., 2015; Dong et al., 2016; Fan
et al., 2014, 2017; Massa Gray & Schmidt, 2018; Massana et al., 2015; Yun et al.,
2012; Zhang et al., 2017)
k-NN: (Fan et al., 2014; Ma et al., 2017)
DT: (Fu et al., 2015; Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al.,
2018; Yu et al., 2010)
SVM: (Amber et al., 2018; Dagnely et al., 2015; Dong et al., 2016; Fan et al., 2014,
2017; Fu et al., 2015; Li et al., 2009; Massana et al., 2015; Paudel et al., 2017;
Ribeiro et al., 2018; Tang et al., 2014; Wang, Wang, Zeng, et al., 2018; Zhao et al.,
2016)
ANN: (Ahmad et al., 2017; Alobaidi et al., 2018; Amber et al., 2018; Bagnasco et al.,
2015; Biswas et al., 2016; Dong et al., 2016; Fan et al., 2014; Fu et al., 2015; Kwok
& Lee, 2011; Li et al., 2009, 2015; Massana et al., 2015; Mena et al., 2014; Neto &
Fiorelli, 2008; Ribeiro et al., 2018; Tang et al., 2014; Yun et al., 2012; Zhao et al.,
2016)
DNN: (Amber et al., 2018; Fan et al., 2019, 2017; Shi et al., 2016)
Ensemble: (Ahmad et al., 2017; Alobaidi et al., 2018; Fan et al., 2014, 2017; Tang
et al., 2014; Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al., 2018)
Improved: (Dong et al., 2016; Li et al., 2015, 2018)
Hybrid: (Collinge et al., 2016; Dong et al., 2016; Massa Gray & Schmidt, 2018)
Dew point temperature (DPT) Regression: (Fan et al., 2014) 7
k-NN: (Fan et al., 2014)
DT: (Fu et al., 2015; Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al.,
2018)
SVM: (Fan et al., 2018; Fu et al., 2015; Ribeiro et al., 2018; Wang, Wang, Zeng,
et al., 2018)
ANN: (Ahmad et al., 2017; Fan et al., 2014; Fu et al., 2015; Ribeiro et al., 2018)
Ensemble: (Ahmad et al., 2017; Fan et al., 2014; Wang, Wang, & Srinivasan, 2018;
Wang, Wang, Zeng, et al., 2018)
Outdoor relative humidity Regression: (Amber et al., 2017, 2018; Fan et al., 2014, 2017; Massa Gray & 19
(RH) Schmidt, 2018; Massana et al., 2015; Yun et al., 2012; Zhang et al., 2017)
k-NN: (Fan et al., 2014)
DT: (Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al., 2018)
SVM: (Amber et al., 2018; Fan et al., 2014, 2017; Li et al., 2009; Massana et al.,
2015; Ribeiro et al., 2018; Tang et al., 2014; Wang, Wang, Zeng, et al., 2018)
ANN: (Ahmad et al., 2017; Amber et al., 2018; Fan et al., 2014; Kwok & Lee, 2011;
Li et al., 2015; Li et al., 2009; Massana et al., 2015; Neto & Fiorelli, 2008; Ribeiro
et al., 2018; Tang et al., 2014; Yun et al., 2012)
DNN: (Amber et al., 2018; Fan et al., 2019, 2017)
Ensemble: (Ahmad et al., 2017; Fan et al., 2014, 2017; Tang et al., 2014; Wang,
Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al., 2018)
Improved: (Li et al., 2015, 2018)
Wind speed (WS) Regression: (Fan et al., 2014; Yun et al., 2012; Zhang et al., 2017) 11
k-NN: (Fan et al., 2014)
DT: (Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al., 2018)
SVM: (Amber et al., 2018; Fan et al., 2014; Tang et al., 2014; Wang, Wang, Zeng,
et al., 2018)
ANN: (Ahmad et al., 2017; Amber et al., 2018; Fan et al., 2014; Kwok & Lee, 2011;
Li et al., 2015; Tang et al., 2014; Yun et al., 2012)
Ensemble: (Ahmad et al., 2017; Fan et al., 2014; Tang et al., 2014; Wang, Wang, &
Srinivasan, 2018; Wang, Wang, Zeng, et al., 2018)
Improved: (Li et al., 2015, 2018)
Wind direction (WD) Regression: (Zhang et al., 2017) 4
k-NN: (Fan et al., 2014)
DT: (Wang, Wang, & Srinivasan, 2018)
SVM: (Tang et al., 2014)
ANN: (Tang et al., 2014)
Ensemble: (Tang et al., 2014; Wang, Wang, Zeng, et al., 2018)
Rain level/rainfalls Regression: (Fan et al., 2014) 5
k-NN: (Fan et al., 2014)
DT: (Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al., 2018)
SVM: (Fan et al., 2014; Wang, Wang, Zeng, et al., 2018)
ANN: (Fan et al., 2014; Kwok & Lee, 2011)
Ensemble: (Fan et al., 2014; Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng,
et al., 2018)
Solar radiation (SR) Regression: (Amber et al., 2017; Dagnely et al., 2015; Dong et al., 2016; Fan et al., 18
2014; Massana et al., 2015; Yun et al., 2012; Zhang et al., 2017)
k-NN: (Fan et al., 2014)
DT: (Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al., 2018)
SVM: (Dagnely et al., 2015; Dong et al., 2016; Fan et al., 2014; Li et al., 2009;
Massana et al., 2015; Paudel et al., 2017; Tang et al., 2014; Wang, Wang, Zeng,
et al., 2018)
ANN: (Biswas et al., 2016; Dong et al., 2016; Fan et al., 2014; Kwok & Lee, 2011; Li
et al., 2009, 2015; Massana et al., 2015; Mena et al., 2014; Neto & Fiorelli, 2008;
Tang et al., 2014; Yun et al., 2012)
Ensemble: (Fan et al., 2014; Tang et al., 2014; Wang, Wang, & Srinivasan, 2018;
Wang, Wang, Zeng, et al., 2018)
Improved: (Dong et al., 2016; Li et al., 2018)
Hybrid: (Dong et al., 2016)
Solar gains SVM: (Paudel et al., 2017) 1
Bright sunshine duration ANN: (Kwok & Lee, 2011) 1
Cloud coverage Regression: (Fan et al., 2014) 1
k-NN: (Fan et al., 2014)
SVM: (Fan et al., 2014)
ANN: (Fan et al., 2014)
Ensemble: (Fan et al., 2014)
Evaporation Regression: (Fan et al., 2014) 1
k-NN: (Fan et al., 2014)
SVM: (Fan et al., 2014)
ANN: (Fan et al., 2014)
Ensemble: (Fan et al., 2014)
CO2 SVM: (Tang et al., 2014) 1
ANN: (Tang et al., 2014)
Ensemble: (Tang et al., 2014)
Barometric pressure Regression: (Fan et al., 2014) 5
k-NN: (Fan et al., 2014)
DT: (Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al., 2018)
SVM: (Fan et al., 2014)
ANN: (Fan et al., 2014; Tang et al., 2014)
Ensemble: (Fan et al., 2014; Tang et al., 2014; Wang, Wang, & Srinivasan, 2018;
Wang, Wang, Zeng, et al., 2018)
Weather type/category k-NN: (Ma et al., 2017) 1
Statistical data Physics-based: (Ma et al., 2017; Massa Gray & Schmidt, 2018; Neto & Fiorelli, 2008) 4
Hybrid: (Siddharth et al., 2011)
Indoor environment Indoor air temperature (IAT) Regression: (Massana et al., 2015) 1
SVM: (Massana et al., 2015)
ANN: (Massana et al., 2015)
Indoor relative humidity Regression: (Massana et al., 2015) 1
SVM: (Massana et al., 2015)
ANN: (Massana et al., 2015)
Indoor luminosity level Regression: (Massana et al., 2015) 1
SVM: (Massana et al., 2015)
ANN: (Massana et al., 2015)
Occupancy Occupants number/counting Regression: (Yun et al., 2012) 7
(real data) DT: (Wang, Wang, & Srinivasan, 2018; Yu et al., 2010)
SVM: (Paudel et al., 2017)
ANN: (Ahmad et al., 2017; Kwok & Lee, 2011; Yun et al., 2012)
DNN: (Shi et al., 2016)
Ensemble: (Ahmad et al., 2017; Wang, Wang, & Srinivasan, 2018)
Occupancy design data/ AR: (Newsham & Birt, 2010) 7
estimated data Regression: (Massana et al., 2015)
DT: (Wang, Wang, Zeng, et al., 2018)
SVM: (Massana et al., 2015; Wang, Wang, Zeng, et al., 2018)
ANN: (Ahmad et al., 2017; Li et al., 2015; Massana et al., 2015)
Ensemble: (Ahmad et al., 2017; Wang, Wang, Zeng, et al., 2018)
Improved: (Li et al., 2015, 2018)
Occupancy status Regression: (Dagnely et al., 2015) 2
SVM: (Dagnely et al., 2015)
Time-related indicators Time periods AR: (Yun et al., 2012) 2
Regression: (Lachut et al., 2014)
k-NN: (Lachut et al., 2014)
SVM: (Lachut et al., 2014)
Timestamp ANN: (Biswas et al., 2016) 1
Time of the day AR: (Yun et al., 2012) 12
Regression: (Fan et al., 2014; Lachut et al., 2014; Massa Gray & Schmidt, 2018;
Massana et al., 2015)
k-NN: (Fan et al., 2014; Lachut et al., 2014)
DT: (Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al., 2018)
SVM: (Fan et al., 2014, 2017; Lachut et al., 2014; Massana et al., 2015; Wang,
Wang, Zeng, et al., 2018; Zhao et al., 2016)
ANN: (Ahmad et al., 2017; Bagnasco et al., 2015; Fan et al., 2014; Massana et al.,
2015; Zhao et al., 2016)
DNN: (Fan et al., 2017; Marino et al., 2016)
Ensemble: (Ahmad et al., 2017; Fan et al., 2014, 2017; Wang, Wang, & Srinivasan,
2018; Wang, Wang, Zeng, et al., 2018)
Hybrid: (Massa Gray & Schmidt, 2018)
Day of the week Regression: (Amber et al., 2018; Amber et al., 2017; Fan et al., 2014; Lachut et al., 13
2014; Massa Gray & Schmidt, 2018; Massana et al., 2015)
k-NN: (Fan et al., 2014; Lachut et al., 2014)
DT: (Wang, Wang, & Srinivasan, 2018)
SVM: (Amber et al., 2018; Fan et al., 2014; Lachut et al., 2014; Massana et al., 2015;
Ribeiro et al., 2018; Wang, Wang, Zeng, et al., 2018)
ANN: (Ahmad et al., 2017; Fan et al., 2014; Lachut et al., 2014; Massana et al.,
2015; Ribeiro et al., 2018; Wang, Wang, Zeng, et al., 2018)
DNN: (Amber et al., 2018; Fan et al., 2017; Marino et al., 2016)
Ensemble: (Fan et al., 2014, 2017; Wang, Wang, Zeng, et al., 2018)
Hybrid: (Massa Gray & Schmidt, 2018)
Type of day Regression: (Dagnely et al., 2015; Fan et al., 2014, 2017; Massana et al., 2015) 13
k-NN: (Fan et al., 2014; Ma et al., 2017)
DT: (Fu et al., 2015; Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al.,
2018)
SVM: (Dagnely et al., 2015; Fan et al., 2014, 2017; Fu et al., 2015; Massana et al.,
2015; Paudel et al., 2017; Wang, Wang, Zeng, et al., 2018; Zhao et al., 2016)
ANN: (Bagnasco et al., 2015; Fan et al., 2014; Fu et al., 2015; Massana et al., 2015;
Mena et al., 2014; Neto & Fiorelli, 2008; Zhao et al., 2016)
Ensemble: (Fan et al., 2014, 2017; Wang, Wang, & Srinivasan, 2018; Wang, Wang,
Zeng, et al., 2018)
Specific day indicator Regression: (Fan et al., 2014, 2017; Massana et al., 2015) 2
Month of the year k-NN: (Fan et al., 2014) 8
SVM: (Fan et al., 2014, 2017; Massana et al., 2015; Ribeiro et al., 2018)
ANN: (Ahmad et al., 2017; Fan et al., 2014; Massana et al., 2015; Ribeiro et al.,
2018)
DNN: (Fan et al., 2017)
Ensemble: (Ahmad et al., 2017; Fan et al., 2014, 2017; Wang, Wang, & Srinivasan,
2018; Wang, Wang, Zeng, et al., 2018)
Year SVM: (Ribeiro et al., 2018) 1
ANN: (Ribeiro et al., 2018)
Past time-steps/data points Previous power demand/ AR: (Dagnely et al., 2015; Fan et al., 2014; Fu et al., 2015; Lachut et al., 2014; Liu 21
energy consumption et al., 2015; Newsham & Birt, 2010; Yun et al., 2012; Zhao et al., 2016)
Regression: (Dong et al., 2016; Fan et al., 2014; Lachut et al., 2014)
k-NN: (Lachut et al., 2014; Ma et al., 2017; Valgaev & Kupzog, 2016; Wahid & Kim,
2016)
SVM: (Dong et al., 2016; Fan et al., 2017; Lachut et al., 2014; Liu et al., 2015;
Mocanu, Nguyen, Gibescu, et al., 2016; Chaobo Zhang et al., 2017)
ANN: (Alobaidi et al., 2018; Bagnasco et al., 2015; Dong et al., 2016; Kwok & Lee,
2011; Mena et al., 2014; Mocanu, Nguyen, Gibescu, et al., 2016; Yun et al., 2012)
DNN: (Fan et al., 2019; Fan et al., 2017; Mocanu, Nguyen, Gibescu, et al., 2016;
Mocanu, Nguyen, Kling, et al., 2016)
Ensemble: (Alobaidi et al., 2018; Fan et al., 2017; Zhang et al., 2016)
Improved: (Dong et al., 2016; Zhang et al., 2016)
Hybrid: (Dong et al., 2016)
Previous OAT Regression: (Fan et al., 2017) 2
SVM: (Fan et al., 2017; Paudel et al., 2017)
DNN: (Fan et al., 2017)
Ensemble: (Fan et al., 2017)
Previous RH Regression: (Fan et al., 2017) 1
SVM: (Fan et al., 2017)
DNN: (Fan et al., 2017)
Ensemble: (Fan et al., 2017)
Previous SR SVM: (Paudel et al., 2017) 1
Previous solar gains SVM: (Paudel et al., 2017) 1
Mathematical characteristics Minimum, maximum and/or Regression: (Fan et al., 2017) 2
mean of time series SVM: (Fan et al., 2017; Ribeiro et al., 2018)
ANN: (Ribeiro et al., 2018)
DNN: (Fan et al., 2017)
Ensemble: (Fan et al., 2017)
Fourier transform Regression: (Fan et al., 2017) 1
SVM: (Fan et al., 2017)
DNN: (Fan et al., 2017)
Ensemble: (Fan et al., 2017)
Deep learning-based time series Regression: (Fan et al., 2017) 1
SVM: (Fan et al., 2017)
DNN: (Fan et al., 2017)
Ensemble: (Fan et al., 2017)
Building characteristics and operation Passive system Regression: (Amber et al., 2017; Chou & Bui, 2014; Dong et al., 2016; Massa Gray & 14
information Schmidt, 2018; Nilashi et al., 2017; Pulido-Arcas et al., 2016; Sekhar Roy et al.,
2018; Tsanas & Xifara, 2012; Tso & Yau, 2007)
DT: (Chou & Bui, 2014; Nilashi et al., 2017; Tso & Yau, 2007; Yu et al., 2010)
SVM: (Chou & Bui, 2014; Dong et al., 2016; Nilashi et al., 2017)
ANN: (Chou & Bui, 2014; Dong et al., 2016; Nilashi et al., 2017; Sekhar Roy et al.,
2018; Tso & Yau, 2007)

Ensemble: (Chou & Bui, 2014; Papadopoulos et al., 2017; Tsanas & Xifara, 2012)
Improved: (Castelli et al., 2015; Dong et al., 2016; Nilashi et al., 2017)
Physics-based: (Ma et al., 2017; Massa Gray & Schmidt, 2018; Neto & Fiorelli, 2008)
Hybrid: (Dong et al., 2016; Massa Gray & Schmidt, 2018; Siddharth et al., 2011)
Active systems Regression: (Pulido-Arcas et al., 2016) 9
SVM: (Tang et al., 2014)
ANN: (Mena et al., 2014; Tang et al., 2014)
Ensemble: (Tang et al., 2014)
Physics-based: (Ma et al., 2017; Massa Gray & Schmidt, 2018; Neto & Fiorelli, 2008)
Hybrid: (Collinge et al., 2016; Dong et al., 2016; Massa Gray & Schmidt, 2018;
Siddharth et al., 2011)

time-steps are not suitable for forecasting applications related to day-to-day building energy management. In the reviewed papers, the time-step ranged from 1 min to annual with the following distribution: (1) three studies with a 1-min time-step, (2) one study with a 5-min time-step, (3) three studies with a 15-min time-step, (4) three studies with a half-hour time-step, (5) twenty-eight studies with an hourly time-step, (6) five studies with a daily time-step, (7) three studies with a weekly time-step and (8) nine studies with an annual time-step. The details of the studies with the corresponding time-steps and other characteristics are available in Appendix A.
Finally, the fourth characteristic of the input dataset is the amount of data used for training, validation and testing of the forecasting algorithms. Among the reviewed studies, most of the databases contained between one and six months of data (45%). 7% (3 studies) used less than a month of data, 7% of the databases used between 6 months and 1 year of data, 24% used between 1 year and 2 years and 17% used more than 2 years of data. Moreover, it should be noted that specific studies such as Yu et al. (2010) or those referring to Tsanas and Xifara (2012)’s database (Xifara & Tsanas, n.d.) did not use time series as inputs of their models. They referred to the amount of data they used as test cases or sets of data but without timeframe indication. Training, validation and testing ratios were also investigated. Most datasets in reviewed studies, with a share of 65%, used between 50% and 90% of their data for training or training and validation combined (therefore between 10% and 50% for testing). Then 20% of the datasets were split with more than 90% dedicated to training. Only in (Massa Gray & Schmidt, 2018) were more data used for testing than for training (10% of the total number of datasets), and only in (Ribeiro et al., 2018; D. Zhao et al., 2016) were the data divided with a 50%–50% training–testing ratio. Frequently used ratios were 70%–30% in six studies and 80%–20% and 75%–25% in five studies. Finally, when the validation step was dissociated from the training step, most data were dedicated to algorithm training, with a training–validation–testing ratio of 70%–15%–15% in (Fan et al., 2017), 62%–17%–21% in (Mena et al., 2014) and 80%–10%–10% in (Chaobo Zhang et al., 2017). This highlights that although the validation step has a different purpose from the training step and is supposed to use a different dataset, it is not always clearly dissociated or even mentioned at all.

4.2. Data pre-processing

As part of the modeling process, input data pre-processing is a very important step. It involves a verification of the input data quality and possibly an optimization of the types of inputs, time frames and time-steps selected. Hence many reviewed studies used common methods or developed specific ones for data pre-processing, as it directly impacts the forecasting results, their accuracy and reliability.
Data pre-processing relies on two sub-processes: data cleaning and input data selection. Data cleaning is a mandatory step to remove all poor-quality information such as missing data, monitoring issues and outliers that depict very unusual energy behaviors. It can be done manually or automatically. For instance, Fan et al. (2014) developed an automated outlier detection method.
Input data selection is not mandatory. It aims to select specific combinations of inputs to retrieve the most influential energy drivers in order to enhance forecasting performance and reduce computational complexity. Several approaches have been highlighted during the review work. Input data selection can first relate to the selection of an adapted forecasting time-step (Mocanu, Nguyen, Gibescu, et al., 2016; Mocanu, Nguyen, Kling, et al., 2016) or an adapted training–validation–testing ratio (Massa Gray & Schmidt, 2018; Wahid & Kim, 2016). Also, a common method is the manual selection of different combinations of inputs (Dagnely et al., 2015; Fan et al., 2017; Neto & Fiorelli, 2008; Yun et al., 2012) and the comparison of the forecasting results. Indeed, some parameters may not have any direct impact on building energy consumption. Using them for model training can then result in lower forecasting accuracy and in overfitting the models. However, it should be highlighted that relevant data pre-selection is not always effective (Massa Gray & Schmidt, 2018; Wang, Wang, & Srinivasan, 2018). Input datasets can also be pre-processed by manually dividing the original dataset into different periods such as weekdays/weekends (Newsham & Birt, 2010), days with specific types of weather (Ma et al., 2017; Mena et al., 2014) or seasons (Tang et al., 2014). By doing so, specific building operating conditions are isolated and sub-models can be developed for each or specific periods depending on the focus of the studies. The same idea can be achieved with clustering algorithms that automatically identify the trends in the building energy behavior. Extracted trends can then be associated with different usages or types of days (Tang et al., 2014). These algorithms can be either supervised when the clustering is based on user-defined features, or unsupervised if the features are extracted by using mathematical operators and metrics (Toffanin, 2016). This then relates to the machine learning tasks described in the previous section. Finally, a pre-selection of features can be implemented, using sensitivity analyses (Kristensen & Petersen, 2016), principal component analysis (PCA) (K. Li et al., 2018; Nilashi et al., 2017) or other specific methods (Deb & Lee, 2018; Massana et al., 2015; Paudel et al., 2017; Wang, Wang, Zeng, et al., 2018; Chaobo Zhang et al., 2017).

5. Discussions

5.1. Building energy modeling and forecasting targets

5.1.1. Building typologies
Reviewed studies have focused on a variety of building typologies. These typologies have been classified into three main categories: commercial buildings including educational buildings, residential buildings and mixed-use buildings, representing 66%, 30% and 4% of the buildings in reviewed studies, respectively. Table 4 provides further description of the different building typologies.
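
Looking back at the input data characteristics and pre-processing steps of Sections 4.1 and 4.2, the following is a minimal illustrative sketch, not a method taken from any reviewed study. It shows three generic operations: coarsening the time-step of a load series, flagging outliers, and splitting the series chronologically. The function names are hypothetical, the outlier rule (a modified z-score based on the median absolute deviation) is an assumption chosen for simplicity and is not the automated method of Fan et al. (2014), and the default 70%–15%–15% split simply mirrors one of the ratios reported above.

```python
from statistics import median


def aggregate(values, factor):
    """Sum consecutive groups of `factor` readings (e.g. 4 x 15-min -> 1 h).

    An incomplete trailing group is dropped.
    """
    usable = len(values) - len(values) % factor
    return [sum(values[i:i + factor]) for i in range(0, usable, factor)]


def flag_outliers(values, threshold=3.5):
    """Return one boolean per reading: True if it looks like an outlier.

    Assumption: a modified z-score rule based on the median absolute
    deviation (MAD); the threshold of 3.5 is a common rule of thumb.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def chronological_split(values, train_pct=70, validation_pct=15):
    """Split a time-ordered series without shuffling.

    Percentages are integers; the remainder after training and
    validation becomes the test set.
    """
    n = len(values)
    i = n * train_pct // 100
    j = n * (train_pct + validation_pct) // 100
    return values[:i], values[i:j], values[j:]


if __name__ == "__main__":
    quarter_hourly = [1, 1, 1, 1, 2, 2, 2, 2, 3]   # made-up readings
    print(aggregate(quarter_hourly, 4))
    print(flag_outliers([10, 11, 9, 10, 12, 10, 95]))
    train, validation, test = chronological_split(list(range(20)))
    print(len(train), len(validation), len(test))
```

Flagging rather than deleting outliers keeps the series regularly spaced, so the choice between removal and imputation stays with the modeler. The split is chronological because shuffling a time series would let future observations leak into the training set.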


Table 4
Summary of the different building types in the reviewed studies.

| Main building typology | Specific building type | Number of studies considering the specific building types | Share of buildings |
| --- | --- | --- | --- |
| Commercial | Hotel | 1 | 40% |
| | Hospital | 1 | |
| | Office (real) | 10 | |
| | Office (simulated) | 5 | |
| | Research | 2 | |
| | Library | 2 | |
| | N/S | 2 | |
| Educational | Academic (classrooms & laboratories) | 7 | 26% |
| | Administrative | 3 | |
| | Institutional (classrooms, offices and laboratories) | 1 | |
| | Research center (offices and laboratories) | 4 | |
| Residential | Single family housings (real) | 5 | 30% |
| | Single family housings (simulated) | 1 | |
| | Multifamily building (real) | 2 | |
| | Multifamily building (simulated) | 7 | |
| | Research/demonstration | 1 | |
| | N/S | 1 | |
| Mixed-use | Residential multifamily building and commercial office (simulated) | 1 | 4% |
| | Mixed-use (commercial center, office, hotel) | 1 | |

Therefore, there is clearly a lack of studies on residential buildings and mixed-use buildings. The over-representation of educational buildings is probably due to data availability, since campus buildings can be more easily instrumented and monitored for research purposes. Conversely, sensor-based data for residential buildings are more difficult to obtain. Indeed, half of the residential buildings considered were either simulated buildings using a benchmark dataset (Xifara & Tsanas, n.d.) or unoccupied residential buildings used as research demonstration (Biswas et al., 2016). This clearly highlights an insufficient number of monitored residential buildings and a lack of data for this building typology. Another reason could also be the complexity of residential energy demand forecasting. Because of the smaller size of residential buildings, the relatively small number of energy-consuming appliances and the difficulty of accounting for occupants’ behavior in energy models, individual dwelling energy demand is more difficult to assess than that of commercial buildings. Occupants’ behavior has a significant impact on building energy consumption (Pisello & Asdrubali, 2014) and a higher variability in residential buildings than in commercial or large office buildings (Xu et al., 2012). Furthermore, advanced occupant behavior modeling has been lacking in BECMF studies as reported in Table 3 and real occupancy data are hardly accessible. Thus, they are replaced by predefined occupancy scenarios, resulting in even larger uncertainties in building energy forecasting (Azar & Menassa, 2012). Despite obvious challenges, accurate residential and mixed-use energy modeling and forecasting is needed. Indeed, residential energy consumption represented 25.7% of the European final energy consumption in 2016 against 13.5% for commercial buildings (“European Environment Agency – Final energy consumption by sector and fuel,” n.d.). Therefore, it holds a large share of energy consumption with large potential energy savings.
Regarding the lack of studies on mixed-use buildings, problems similar to those faced for residential energy exist. Furthermore, combining different building types induces a larger diversity of appliances, behaviors and demand profiles, which increases the modeling complexity (Choi, Cho, & Kim, 2012). Consequently, addressing such cases requires an even larger amount of data, which are not easily available. When unavailable, data are replaced by assumptions at the expense of realistic case studies (Valgaev & Kupzog, 2016). Nevertheless, mixed-use building energy studies are essential as well since this building typology is gaining ground in some energy-consuming and urbanized countries (Choi et al., 2012; Woo and Cho, 2018). Hence, future studies should 1) provide residential building case studies with integration of retrofit impact assessment together with enhanced capture of human behavior and 2) focus on providing a better understanding of mixed-use buildings.

5.1.2. Energy end-uses
Most of the reviewed studies (52%) focus on overall energy forecasting while a smaller part (46%) focus on cooling and heating load demand prediction (separated or combined) (details on studies targeting heating, cooling or combined cooling and heating loads are provided in Table 5). Only 4% of the reviewed studies targeted other loads. Newsham and Birt (2010) assessed “occupancy-related” loads with combined lighting and plug electricity demand. Shi et al. (2016) proposed forecasting models for lighting, AC and plug loads separately and combined at the building scale. Therefore, there is a lack of studies on loads other than thermal loads and total building energy demand. However, these loads, such as lighting and plug loads, represent more than 19% of residential energy consumption in Europe (“Energy consumption in households – Statistics Explained,” n.d.). Consequently, they hold a significant share of energy demand and of consequent potential energy savings (Ghadi, Rasul, & Khan, 2017). This is all the more true as both lighting and plugged-in equipment are a significant internal heat source in buildings and directly impact cooling load demand (Dong et al., 2016).
The main reason for this gap could be related to the “occupancy-based” nature of lighting and plug energy consumption (Newsham & Birt, 2010). Indeed, energy standards require specific amounts of lighting for optimal operating conditions in offices and for activities in residential buildings (ASHRAE, 2013). Thus, for obvious energy conservation measures, lighting might be automated to detect occupancy (Kandasamy, Karunagaran, Spanos, Tseng, & Soong, 2018) or at least for non-zero occupancy. Similarly, most equipment usage, such as office, electronic and cooking equipment usage, is also occupancy related. Moreover, as highlighted in Section 4.1, few occupancy and behavior-related data have been used in the reviewed studies, which induces a lack of information for forecasting other building loads. Hence, further studies should focus on lighting and plug energy demand with a better accounting for occupancy and occupants’ behavior information.

5.1.3. Forecasting horizon
Forecasting horizons can be divided between short-term, medium-term and long-term (Mocanu, Nguyen, Kling, et al., 2016; Yalcinoz and Eminoglu, 2005). They serve different purposes for energy management and savings. Short-term horizon can be defined as forecasting from the next minute to the next week. It is essential for real-time management of building energy systems (Fan et al., 2019) such as HVAC systems or to manage local energy generation, storage and provision (Bouzerdoum, Mellit, & Massi Pavan, 2013). It represents 41% of the simulations in studies covered by the present review. Medium-term


Table 5
Description of the different end-uses targeted in the reviewed studies.
End-use Techniques implemented Number of studies

Overall energy AR: (Dagnely et al., 2015; Fan et al., 2014; Fu et al., 2015; Lachut et al., 2014; Liu et al., 2015) 28
Regression: (Amber et al., 2017, 2018; Dagnely et al., 2015; Dong et al., 2016; Fan et al., 2014; Lachut et al., 2014;
Massana et al., 2015; Pulido-Arcas et al., 2016)
k-NN: (Fan et al., 2014; Lachut et al., 2014; Valgaev & Kupzog, 2016; Wahid & Kim, 2016)
DT: (Fu et al., 2015; Tso & Yau, 2007; Wang, Wang, & Srinivasan, 2018; Wang, Wang, Zeng, et al., 2018)
SVM: (Amber et al., 2018; Dagnely et al., 2015; Dong et al., 2016; Fan et al., 2014; Fu et al., 2015; Lachut et al., 2014;
Liu et al., 2015; Massana et al., 2015; Mocanu, Nguyen, Gibescu, et al., 2016; Ribeiro et al., 2018)
ANN: (Alobaidi et al., 2018; Amber et al., 2018; Bagnasco et al., 2015; Biswas et al., 2016; Dong et al., 2016; Fan et al.,
2014; Fu et al., 2015; Li et al., 2015; Massana et al., 2015; Mena et al., 2014; Mocanu, Nguyen, Gibescu, et al., 2016;
Neto & Fiorelli, 2008; Ribeiro et al., 2018; Tso & Yau, 2007)
DNN: (Amber et al., 2018; Marino et al., 2016; Mocanu, Nguyen, Gibescu, et al., 2016; Shi et al., 2016)
Ensemble: (Alobaidi et al., 2018; Fan et al., 2014; Wang et al., 2018a; Wang, Wang, Zeng, et al., 2018)
Improved: (Dong et al., 2016; Li et al., 2015, 2018; Zhang et al., 2016)
Hybrid: (Dong et al., 2016; Siddharth et al., 2011)
Cooling load AR: (Yun et al., 2012) 12
Regression: (Chou & Bui, 2014; Fan et al., 2017; Nilashi et al., 2017; Sekhar Roy et al., 2018; Tsanas & Xifara, 2012; Yun
et al., 2012; Chaobo Zhang et al., 2017)
DT: (Chou & Bui, 2014; Nilashi et al., 2017)
SVM: (Chou & Bui, 2014; Fan et al., 2017; Li et al., 2009; Nilashi et al., 2017; Zhang et al., 2017)
ANN: (Chou & Bui, 2014; Kwok & Lee, 2011; Li et al., 2009; Nilashi et al., 2017; Sekhar Roy et al., 2018; Yun et al.,
2012)
DNN: (Fan et al., 2019; Fan et al., 2017)
Ensemble: (Chou & Bui, 2014; Fan et al., 2017; Papadopoulos et al., 2017; Sekhar Roy et al., 2018; Tsanas & Xifara,
2012)
Improved: (Castelli et al., 2015; Nilashi et al., 2017)
Heating load AR: (Yun et al., 2012) 7
Regression: (Chou & Bui, 2014; Nilashi et al., 2017; Sekhar Roy et al., 2018; Tsanas & Xifara, 2012; Yun et al., 2012)
DT: (Chou & Bui, 2014; Nilashi et al., 2017)
SVM: (Chou & Bui, 2014; Nilashi et al., 2017)
ANN: (Chou & Bui, 2014; Nilashi et al., 2017; Sekhar Roy et al., 2018; Yun et al., 2012)
Ensemble: (Chou & Bui, 2014; Papadopoulos et al., 2017; Tsanas & Xifara, 2012)
Improved: (Castelli et al., 2015; Nilashi et al., 2017)
Combined heating and cooling AR: (Zhao et al., 2016) 7
loads Regression: (Massa Gray & Schmidt, 2018)
k-NN: (Ma et al., 2017)
SVM: (Paudel et al., 2017; Tang et al., 2014; Zhao et al., 2016)
ANN: (Ahmad et al., 2017; Tang et al., 2014; Zhao et al., 2016)
Ensemble: (Ahmad et al., 2017; Tang et al., 2014)
Hybrid: (Collinge et al., 2016; Massa Gray & Schmidt, 2018)
Other loads AR: (Newsham & Birt, 2010) 2
DNN: (Shi et al., 2016)

prediction of energy consumption, from one week to several months ahead, focuses on energy storage systems management and maintenance planning of building equipment (Rahman, Srikumar, & Smith, 2018). This horizon is considered in 35% of reported studies.
Finally, the long-term horizon provides information on the next year and beyond, which is used for design and planning tasks (Rahman et al., 2018). Thus, as much as short- and medium-term predictions, it is essential to serve long-term sustainability strategies in the built environment. However, long-term forecasting is targeted by less than 25% of reviewed studies. This is probably due to data availability problems. Indeed, the longer the forecasting horizon, the larger the diversity of demand patterns (Mocanu, Nguyen, Gibescu, et al., 2016). Accounting for a larger diversity requires more data, collected over longer periods of time. However, as explained in Section 4.1, a majority (59%) of studies relied on measurement campaigns of less than one year, while only 17% used data collected for more than two years and thus rely on relevant training sets for long-term prediction. In addition, building energy systems can exhibit nonlinear behaviors (Li & Wen, 2015). While nonlinearities can be handled by most forecasting algorithms on a short-term basis, this becomes much harder for long-term forecasting horizons. Finally, the forecasting time-step also has a non-negligible impact (Mena et al., 2014). Among the long-term forecasting studies, 53% used an annual time-step, compared to 12% for weekly and daily time-steps, 18% for an hourly time-step and 6% for a 1-min time-step. Thus, the smaller the time-step, the more challenging the long-term forecasting.

5.2. Building energy modeling and forecasting data-driven methods

In the present review, different building energy modeling and forecasting methods have been presented and described, focusing on data-driven techniques. These algorithms, even for basic methods, can achieve relatively high forecasting accuracy while requiring less expertise regarding the various building energy behavior characteristics than the traditional physics-based modeling process (Ma et al., 2017; Neto & Fiorelli, 2008). Thus, they are currently the main research focus in BECMF (Ahmad et al., 2018; Amasyali & El-Gohary, 2018; Deb et al., 2017; Mat Daut et al., 2017; Wang & Srinivasan, 2017; Wei et al., 2018; Yildiz et al., 2017).
Among the techniques described, classical approaches with autoregressive and regression models are quite popular because of their relative implementation simplicity and good forecasting performance. They are often used as a comparison basis for the implementation of more advanced algorithms (Fan et al., 2014, 2017). Classification-based methods with DT and k-NN are intuitive and offer significant predictive power (Chou & Bui, 2014; Ma et al., 2017; Wahid & Kim, 2016). SVM and ANN are among the best performing and the most implemented data-driven single techniques for building energy forecasting studies, as highlighted in Table 1. They can be used as a support tool for more advanced modeling processes such as ensemble (Alobaidi et al., 2018) and improved models (F. Zhang et al., 2016). Furthermore, recent machine learning developments have been implemented for BECMF with deep neural networks (Amber et al., 2018), unsupervised learning


Fig. 8. Illustration of data-driven methods covered in the review and the range of machine learning improvements available.
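As a minimal sketch of how a classical autoregressive model can serve as a comparison basis for another data-driven technique, the following example fits an AR(1) baseline and a k-NN predictor on lagged values. Everything in it is an illustrative assumption rather than material from the reviewed studies: the synthetic hourly load, the 24-lag window, the value of k and all function names are hypothetical.

```python
import math
import random


def make_hourly_load(n_hours, seed=0):
    """Synthetic hourly load (kW): daily sinusoidal cycle plus Gaussian noise."""
    rng = random.Random(seed)
    return [
        50.0 + 20.0 * math.sin(2.0 * math.pi * (h % 24) / 24.0) + rng.gauss(0.0, 2.0)
        for h in range(n_hours)
    ]


def fit_ar1(series):
    """Least-squares fit of y[t] = a + b * y[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def knn_predict(train, query_lags, k=5):
    """Average the values that followed the k most similar lag windows in train."""
    p = len(query_lags)
    candidates = []
    for t in range(p, len(train)):
        dist = sum((w - q) ** 2 for w, q in zip(train[t - p:t], query_lags))
        candidates.append((dist, train[t]))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(v for _, v in nearest) / len(nearest)


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


load = make_hourly_load(24 * 28)
split = 24 * 21  # chronological split: three weeks for training, one for testing
train, test = load[:split], load[split:]

# One-step-ahead forecasts with the autoregressive baseline.
a, b = fit_ar1(train)
ar_pred = [a + b * load[t - 1] for t in range(split, len(load))]

# One-step-ahead forecasts with k-NN on the previous 24 hourly values.
N_LAGS = 24
knn_pred = [knn_predict(train, load[t - N_LAGS:t]) for t in range(split, len(load))]

rmse_ar = rmse(test, ar_pred)
rmse_knn = rmse(test, knn_pred)
print(f"AR(1) baseline RMSE: {rmse_ar:.2f} kW")
print(f"k-NN (24 lags) RMSE: {rmse_knn:.2f} kW")
```

Both models are evaluated on the same held-out week with the same error metric, which is the point of keeping a simple baseline. The ranking obtained on this synthetic series says nothing about real buildings.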

(Fan et al., 2017), reinforcement learning (Mocanu, Nguyen, Kling, et al., 2016) and transfer learning (Ribeiro et al., 2018) (Fig. 8). Such improvements in machine learning based forecasting algorithms lead the way to less operator-dependent and more versatile algorithms in terms of data usage, with much higher prediction accuracy.
Also, machine learning techniques benefit from widely available modeling tools, libraries and packages that encompass various pre-embedded functions, which makes implementation easier. Some of the most used are Python (“Python,” n.d.), R programming (“R: The R Project for Statistical Computing,” n.d.), MATLAB (“MATLAB – MathWorks – MATLAB Simulink,” n.d.), IBM-SPSS Modeler (“IBM SPSS Modeler,” n.d.) and Statistics (“IBM SPSS Statistics,” n.d.), Weka (“Weka 3 – Data Mining with Open Source Machine Learning Software in Java,” n.d.) and mySVM software (“mySVM – TU Dortmund,” n.d.). Details on the different packages used for machine learning techniques are provided in Table 6.

5.3. Limitations of data-driven techniques: toward grey-box modeling

Despite great flexibility and good forecasting performances, data-driven algorithms show several limitations. First, they rely on large quantities of data that must be representative of the different operating conditions of the building. Otherwise, they would only capture specific patterns and lack generality. This is a common problem in machine learning techniques, known as overfitting (Chalal et al., 2016), and the reason why particular attention is to be paid to the independence of training, validation and testing data samples. Nevertheless, this requirement is often hampered by manual data pre-processing, as presented in Section 4.1, and by the lack of information and data availability on important energy drivers (Section 4.2). The first problem can be tackled through more advanced or different machine learning techniques with unsupervised, reinforcement and transfer learning (Sections 3.1 and 3.2). The latter can be counterbalanced by optimizing input data, using time-related parameters for example instead of physical variables.
Nevertheless, data-driven approaches remain completely black-box methods. Contrary to physics-based models, also called white-box models, they do not provide transparency on the link between inputs and the final forecasted building energy consumption. However, physics-based modeling is also a very complex process, requiring advanced knowledge and information on the building to be modeled, with high uncertainties among key energy-drivers. Therefore, hybrid techniques combining both white- and black-box models have been the focus of recent studies.
Hybrid modeling presents two main orientations. In the first, a data-driven method is used to optimize specific parameters of a white-box model. For instance, Siddharth, Ramakrishna, Geetha, and Sivasubramaniam (2011) used a genetic algorithm to quickly and realistically create sets of specific input parameters identified as key energy-drivers for a white-box model. They aimed to assess hourly total building energy consumption over a year. Then, for satisfactory results, a non-linear regression model was implemented between the selected system variables and the annual energy consumption. It showed very satisfactory coefficients of determination. In the case of Massa Gray and Schmidt (2018), a Gaussian process was combined with an RC-lumped model (Resistance Capacitance) to predict and adjust the error of the physics-based model. It showed higher forecasting performances than the Gaussian process or the RC-model alone. Another way to combine data-driven and physics-based models consists in replacing parts of the physics-based model with machine learning algorithms, for energy equipment load demand simulation for instance. This is the case for Collinge, DeBlois, Landis, Schaefer, and Bilec (2016), who used sequential linear regression to assess cooling and heating loads of an HVAC system set in an EnergyPlus (“EnergyPlus,” n.d.) physics-based environment. Similarly, Dong et al. (2016) proposed a hybrid strategy to predict the total electricity consumption of residential buildings by dividing the electric loads between AC and non-AC consumption. Non-AC electricity consumption was forecasted using an LS-SVM model, from which internal heat gain variations were deduced. Heat gains, together with weather information, were input in a 2R-1C lumped model to calculate the temperature of the different building zones. Zone temperature results were input in an AC regression model to then estimate the AC cooling power consumption. Finally, both data-driven-based non-AC electricity consumption and hybrid-based AC electricity consumption were summed up to forecast the total building electricity consumption. A comparison between the grey-box model and data-driven algorithms such as FFNN, SVR, LS-SVM, Gaussian mixture model and Gaussian process regression showed a significant improvement of forecasting performances with the proposed hybrid model.

5.4. Occupants’ behavior impact on building energy efficiency

In spite of the significant progress in data-driven modeling in recent


Table 6
Summary of the software and packages used to develop data-driven models in the reviewed studies.
Columns: Software; Package; Referring studies (by method).

IBM SPSS (“IBM SPSS Modeler,” n.d.; “IBM SPSS Statistics,” n.d.)
    AR: (Newsham & Birt, 2010; Zhao et al., 2016)
    Regression: (Amber et al., 2017; Chou & Bui, 2014)
    DT: (Chou & Bui, 2014)
    SVM: (Chou & Bui, 2014; Zhao et al., 2016)
    ANN: (Chou & Bui, 2014; Zhao et al., 2016)
    Ensemble: (Chou & Bui, 2014)
    Improved: (Li et al., 2018)

MATLAB (“MATLAB – MathWorks – MATLAB Simulink,” n.d.)
    Package: N/S
        Regression: (Nilashi et al., 2017)
        k-NN: (Wahid & Kim, 2016)
        DT: (Nilashi et al., 2017; Wang, Wang, & Srinivasan, 2018)
        SVM: (Nilashi et al., 2017; Wang, Wang, & Srinivasan, 2018)
        ANN: (Bagnasco et al., 2015; Li et al., 2009; Nilashi et al., 2017)
        DNN: (Mocanu, Nguyen, Kling, et al., 2016)
        Ensemble: (Wang, Wang, & Srinivasan, 2018)
        Improved: (Nilashi et al., 2017)
        Hybrid: (Massa Gray & Schmidt, 2018)
    Package: LibSVM (Chang & Lin, 2013)
        SVM: (Amber et al., 2018; Dong et al., 2016; Mocanu, Nguyen, Gibescu, et al., 2016; Paudel et al., 2017)
    Package: LibSVM + FarutoUltimate (Li, 2011)
        SVM: (Liu et al., 2015)
    Package: Neural Network Toolbox (“Neural Network Toolbox – MATLAB,” n.d.)
        ANN: (Biswas et al., 2016)

mySVM software (“mySVM – TU Dortmund,” n.d.)
    SVM: (Li et al., 2009)

Python programming
    Package: Scikit-Learn (“Scikit-Learn: machine learning in Python,” n.d.)
        Regression: (Zhang et al., 2017)
        SVM: (Dagnely et al., 2015; Chaobo Zhang et al., 2017)
        Ensemble: (Ahmad et al., 2017)
    Package: NeuroLab (“NeuroLab 0.3.5, Neural Network Library for Python,” n.d.)
        ANN: (Ahmad et al., 2017)
    Package: StatsModels (“StatsModels: Statistics in Python — statsmodels 0.9.0 documentation,” n.d.)
        Regression: (Dagnely et al., 2015)
    Package: TensorFlow (“TensorFlow,” n.d.)
        Regression: (Amber et al., 2018)
        SVM: (Amber et al., 2018)
        ANN: (Amber et al., 2018)
        DNN: (Amber et al., 2018)

R programming (“R: The R Project for Statistical Computing,” n.d.)
    Package: N/S
        Ensemble: (Zhang et al., 2016)
        Improved: (Zhang et al., 2016)
    Package: Keras (“Keras Documentation,” n.d.)
        DNN: (Fan et al., 2019)

Weka software (“Weka 3 – Data Mining with Open Source Machine Learning Software in Java,” n.d.)
    Regression: (Massana et al., 2015)
    k-NN: (Wahid & Kim, 2016)
    DT: (Yu et al., 2010)
    SVR: (Massana et al., 2015)
    ANN: (Massana et al., 2015)
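The grey-box idea discussed in Section 5.3, where a data-driven model predicts and adjusts the error of a physics-based model, can be sketched in a few lines. This is only an illustration under stated assumptions and not the method of any cited study: the white-box part is a hypothetical steady-state heat-loss formula rather than an RC-lumped model, the data-driven part is a mean residual per hour of day rather than a Gaussian process, and the "measurements" are synthetic. The chronological split reflects the requirement that training, validation and testing samples stay independent, and the hour of day stands in for the time-related input parameters mentioned in the same section.

```python
import math
import random

UA_ASSUMED = 0.9  # kW/K, hypothetical heat-loss coefficient of the white-box model
T_SET = 20.0      # degC, hypothetical indoor set-point


def physics_model(t_out):
    """Steady-state white-box estimate of the heating load (kW)."""
    return max(0.0, UA_ASSUMED * (T_SET - t_out))


def synthetic_building(n_hours, seed=1):
    """Hypothetical measurements: the real building deviates from the white-box
    model (other heat-loss coefficient, internal gains during working hours)."""
    rng = random.Random(seed)
    rows = []
    for h in range(n_hours):
        hour = h % 24
        t_out = 5.0 + 4.0 * math.sin(2.0 * math.pi * (hour - 9) / 24.0) + rng.gauss(0.0, 0.5)
        gains = 3.0 if 8 <= hour < 18 else 0.0  # occupants and equipment
        load = max(0.0, 1.0 * (T_SET - t_out) - gains + rng.gauss(0.0, 0.3))
        rows.append((hour, t_out, load))
    return rows


def chronological_split(rows, train_frac=0.6, val_frac=0.2):
    """Split without shuffling so that later samples never leak into training."""
    n_train = int(len(rows) * train_frac)
    n_val = int(len(rows) * val_frac)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


def fit_residual_by_hour(rows):
    """Data-driven part: mean error of the physics model for each hour of day."""
    sums, counts = [0.0] * 24, [0] * 24
    for hour, t_out, load in rows:
        sums[hour] += load - physics_model(t_out)
        counts[hour] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def hybrid_model(hour, t_out, correction):
    """Grey-box prediction: physics estimate plus learned hourly correction."""
    return physics_model(t_out) + correction[hour]


def rmse(pairs):
    """Root mean squared error over (actual, predicted) pairs."""
    return math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))


data = synthetic_building(24 * 50)
train, val, test = chronological_split(data)
correction = fit_residual_by_hour(train)  # fitted on training data only

rmse_val = rmse([(load, hybrid_model(h, t, correction)) for h, t, load in val])
rmse_physics = rmse([(load, physics_model(t)) for _, t, load in test])
rmse_hybrid = rmse([(load, hybrid_model(h, t, correction)) for h, t, load in test])
print(f"validation RMSE (hybrid): {rmse_val:.2f} kW")
print(f"test RMSE, physics only: {rmse_physics:.2f} kW, hybrid: {rmse_hybrid:.2f} kW")
```

The validation subset is where a real study would tune the corrector, for instance when choosing between hourly means and a richer regression model, while the test subset is kept untouched for the final comparison.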

years, a large gap remains when trying to accurately account for occupancy and human behavior impact. Indeed, the review work has highlighted a lack of real occupancy data used in BECMF studies, often replaced by theoretical occupancy scenarios and resulting in large modeling uncertainties (Azar & Menassa, 2012). Furthermore, even when available, most occupant-related data only considered occupancy schedules (Table 3). This problem is partly due to the scarcity of case studies on residential and mixed-use buildings. Indeed, since occupant behavior is one of the “key factor influencing energy consumption in buildings” (Pisello & Asdrubali, 2014), accuracy can only be achieved by accessing detailed occupant-related data such as occupancy but also socio-economic data (Sütterlin, Brunner, & Siegrist, 2011; Tso & Yau, 2007), behavior understanding, equipment usages and social interactions (Peschiera & Taylor, 2012).
Thus, complex occupants’ behavior modeling has been integrated in some building energy forecasting studies. Simple approaches have been implemented to assess the general behavior of occupants (Zhang, Cao, & Romagnoli, 2018) and its negative impact on building thermal loads (Ferracuti et al., 2017). Similarly, methods have been developed for the evaluation of consumers’ energy efficiency and energy-savings behavior, to assess their impact on cooling load forecasting (Spandagos & Ng, 2018). The impact of peer networks (Peschiera & Taylor, 2012) and behavioral modifications on the energy consumption has been investigated as well (Xu et al., 2012). It highlighted specific incentives on energy efficient behaviors and that social networking within buildings and communities could have a significant impact on energy savings, comparable to typical retrofit actions (Pisello & Asdrubali, 2014).

6. Conclusions

We identified in this paper the main building energy consumption modeling and forecasting techniques and specifically reviewed data-driven methods. We covered approaches from the most conventional to the most recent research efforts on the topic. Six single techniques have been introduced with autoregressive models, statistical regressions, k-nearest neighbors, decision trees, support vector machines, artificial neural networks, and two combined approaches: ensemble and improved models. Furthermore, we examined different machine learning approaches commonly used in the field including supervised, unsupervised, reinforcement and transfer learning. We presented the basic concepts and illustrated them through different recent studies. Particular attention was given to input data characteristics (i.e. origin, inputs types, time-series time-step, amount of data and the training–validation–testing ratio) and pre-processing methods. Finally, research gaps and future research directions are identified. Although data-driven methods offer a very wide range of tools to model and forecast buildings energy consumption that can adapt to many different situations, depending on the types of buildings, available data, modeling purpose,


required accuracy and forecasting horizons, a universal protocol that can tackle the variety of problems faced is still lacking and a tradeoff, accounting for each problem's constraints, often has to be made. In addition, several specific points still require particular attention, such as long-term energy consumption forecasting, the enhancement of black-box data-driven techniques by hybridization with physical models, the accurate and realistic accounting for occupancy and occupants’ behavior, as well as real use cases of residential or mixed-use buildings.

Declaration of interest

None.

Acknowledgment

The study has been supported by the National Key R&D Program of China (Grant No. 2017YFC0704200).

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.scs.2019.101533.

References

Ahmad, M. W., Mourshed, M., & Rezgui, Y. (2017). Trees vs neurons: Comparison between random forest and ANN for high-resolution prediction of building energy consumption. Energy and Buildings, 147, 77–89. https://doi.org/10.1016/j.enbuild.2017.04.038.
Ahmad, T., Chen, H., Guo, Y., & Wang, J. (2018). A comprehensive overview on the data driven and large scale based approaches for forecasting of building energy demand: A review. Energy and Buildings, 165, 301–320. https://doi.org/10.1016/J.ENBUILD.2018.01.017.
Allab, Y., Pellegrino, M., Guo, X., Nefzaoui, E., & Kindinis, A. (2017). Energy and comfort assessment in educational building: Case study in a French university campus. Energy and Buildings, 143, 202–219. https://doi.org/10.1016/J.ENBUILD.2016.11.028.
Alobaidi, M. H., Chebana, F., & Meguid, M. A. (2018). Robust ensemble learning framework for day-ahead forecasting of household based energy consumption. Applied Energy, 212, 997–1012. https://doi.org/10.1016/J.APENERGY.2017.12.054.
Amasyali, K., & El-Gohary, N. M. (2018). A review of data-driven building energy consumption prediction studies. Renewable and Sustainable Energy Reviews, 81, 1192–1205. https://doi.org/10.1016/J.RSER.2017.04.095.
Amber, K. P., Ahmad, R., Aslam, M. W., Kousar, A., Usman, M., & Khan, M. S. (2018). Intelligent techniques for forecasting electricity consumption of buildings. Energy, 157, 886–893. https://doi.org/10.1016/j.energy.2018.05.155.
Amber, K. P., Aslam, M. W., Mahmood, A., Kousar, A., Younis, M. Y., Akbar, B., & … Hussain, S. K. (2017). Energy consumption forecasting for university sector buildings. Energies, 10(10), 1–18. https://doi.org/10.3390/en10101579.
ASHRAE (2009). 2009 Ashrae handbook fundamentals I-P edition – Energy estimating and modeling methods. Atlanta: American Society of Heating, Refrigerating and Air-Conditioning Engineers.
ASHRAE (2013). Standard 90. 1-2013 – Energy standard for buildings except low-rise residential buildings (SI edition). Atlanta: American Society of Heating, Refrigerating and Air-Conditioning Engineers.
Azar, E., & Menassa, C. C. (2012). Agent-based modeling of occupants and their impact on energy use in commercial buildings. Journal of Computing in Civil Engineering, 26(4), 506–518. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000158.
Bagnasco, A., Fresi, F., Saviozzi, M., Silvestro, F., & Vinci, A. (2015). Electrical consumption forecasting in hospital facilities: An application case. Energy and Buildings, 103, 261–270.
Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.
Biswas, M. A. R., Robinson, M. D., & Fumo, N. (2016). Prediction of residential building energy consumption: A neural network approach. Energy, 117, 84–92. https://doi.org/10.1016/j.energy.2016.10.066.
Bourdeau, M., Guo, X., & Nefzaoui, E. (2018). Buildings energy consumption generation gap: A post-occupancy assessment in a case study of three higher education buildings. Energy and Buildings, 159. https://doi.org/10.1016/j.enbuild.2017.11.062.
Bouzerdoum, M., Mellit, A., & Massi Pavan, A. (2013). A hybrid model (SARIMA–SVM) for short-term power forecasting of a small-scale grid-connected photovoltaic plant. Solar Energy, 98, 226–235. https://doi.org/10.1016/J.SOLENER.2013.10.002.
Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (2008). Time series analysis: Forecasting and control. John Wiley.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140. https://doi.org/10.1023/A:1018054314350.
Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Chapman & Hall/CRC.
Busoniu, L., Ernst, D., De Schutter, B., & Babuska, R. (2011). Approximate reinforcement learning: An overview. 2011 IEEE symposium on adaptive dynamic programming and reinforcement learning (ADPRL), 1–8. https://doi.org/10.1109/ADPRL.2011.5967353 IEEE.
Castelli, M., Trujillo, L., Vanneschi, L., & Popovič, A. (2015). Prediction of energy performance of residential buildings: A genetic programming approach. Energy and Buildings, 102, 67–74. https://doi.org/10.1016/j.enbuild.2015.05.013.
Chalal, M. L., Benachir, M., White, M., & Shrahily, R. (2016). Energy planning and forecasting approaches for supporting physical improvement strategies in the building sector: A review. Renewable and Sustainable Energy Reviews, 64, 761–776. https://doi.org/10.1016/j.rser.2016.06.040.
Chang, C., & Lin, C. (2013). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2, 1–39. https://doi.org/10.1145/1961189.1961199.
Chen, X., & Yang, H. (2018). Integrated energy performance optimization of a passively designed high-rise residential building in different climatic zones of China. Applied Energy, 215, 145–158. https://doi.org/10.1016/J.APENERGY.2018.01.099.
Choi, I. Y., Cho, S. H., & Kim, J. T. (2012). Energy consumption characteristics of high-rise apartment buildings according to building shape and mixed-use development. Energy and Buildings, 46, 123–131. https://doi.org/10.1016/J.ENBUILD.2011.10.038.
Chou, J.-S., & Bui, D.-K. (2014). Modeling heating and cooling loads by artificial intelligence for energy-efficient building design. Energy and Buildings, 82, 437–446. https://doi.org/10.1016/J.ENBUILD.2014.07.036.
Collinge, W. O., DeBlois, J. C., Landis, A. E., Schaefer, L. A., & Bilec, M. M. (2016). Hybrid dynamic-empirical building energy modeling approach for an existing campus building. Journal of Architectural Engineering, 22(1), 04015010. https://doi.org/10.1061/(ASCE)AE.1943-5568.0000183.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273–297.
Dagnely, P., Ruette, T., Tourwé, T., & Tsiporkova, E. (2015). Predicting hourly energy consumption. Can you beat an autoregressive model? Proceeding of the 24th annual machine learning conference of Belgium and the Netherlands.
Deb, C., & Lee, S. E. (2018). Determining key variables influencing energy consumption in office buildings through cluster analysis of pre- and post-retrofit building data. Energy and Buildings, 159, 228–245. https://doi.org/10.1016/J.ENBUILD.2017.11.007.
Deb, C., Zhang, F., Yang, J., Lee, S. E., & Shah, K. W. (2017). A review on time series forecasting techniques for building energy consumption. Renewable and Sustainable Energy Reviews, 74, 902–924. https://doi.org/10.1016/j.rser.2017.02.085.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1), 1–38.
Deng, J. (1989). Introduction to grey system theory. The Journal of Grey System, 1, 1–24. Retrieved from http://www.researchinformation.co.uk/grey/IntroGreySysTheory.pdf.
DeST simulation software. (n.d.). Retrieved from http://dest.tsinghua.edu.cn/chinese/default1.asp?accessdenied=%2Fchinese%2Fdefault.asp.
Dong, B., Li, Z., Rahman, S. M. M., & Vega, R. (2016). A hybrid model approach for forecasting future residential electricity consumption. Energy and Buildings, 117, 341–351. https://doi.org/10.1016/j.enbuild.2015.09.033.
Ecotect Analysis|Autodesk Knowledge Network. (n.d.). Retrieved from https://knowledge.autodesk.com/support/ecotect-analysis?sort=score.
Elith, J., Leathwick, J. R., & Hastie, T. (2008). A working guide to boosted regression trees. Journal of Animal Ecology, 77(4), 802–813. https://doi.org/10.1111/j.1365-2656.2008.01390.x.
Energy consumption in households – Statistics explained. (n.d.). Retrieved from https://ec.europa.eu/eurostat/statistics-explained/index.php/Energy_consumption_in_households.
EnergyPlus. (n.d.). Retrieved from https://energyplus.net/.
eQUEST. (n.d.). Retrieved from http://www.doe2.com/equest/.
European Environment Agency – final energy consumption by sector and fuel. (n.d.). Retrieved from https://www.eea.europa.eu/data-and-maps/indicators/final-energy-consumption-by-sector-9/assessment-4.
Fan, C., Wang, J., Gang, W., & Li, S. (2019). Assessment of deep recurrent neural network-based strategies for short-term building energy predictions. Applied Energy, 236(2018, July), 700–710. https://doi.org/10.1016/j.apenergy.2018.12.004.
Fan, C., Xiao, F., Li, Z., & Wang, J. (2018). Unsupervised data analytics in mining big building operational data for energy efficiency enhancement: A review. Energy and Buildings, 159, 296–308. https://doi.org/10.1016/J.ENBUILD.2017.11.008.
Fan, C., Xiao, F., & Wang, S. (2014). Development of prediction models for next-day building energy consumption and peak power demand using data mining techniques. Applied Energy, 127, 1–10. https://doi.org/10.1016/J.APENERGY.2014.04.016.
Fan, C., Xiao, F., & Zhao, Y. (2017). A short-term building cooling load prediction method using deep learning algorithms. Applied Energy, 195, 222–233. https://doi.org/10.1016/J.APENERGY.2017.03.064.
Ferracuti, F., Fonti, A., Ciabattoni, L., Pizzuti, S., Arteconi, A., Helsen, L., & Comodi, G. (2017). Data-driven models for short-term thermal behaviour prediction in real buildings. Applied Energy, 204, 1375–1387. https://doi.org/10.1016/J.APENERGY.2017.05.015.
Fix, E., & Hodges, J. L., Jr. (1951). Discriminatory analysis – Nonparametric discrimination: Consistency properties.
Foucquier, A., Robert, S., Suard, F., Stéphan, L., & Jay, A. (2013). State of the art in building modelling and energy performances prediction: A review. Renewable and Sustainable Energy Reviews, 23, 272–288. https://doi.org/10.1016/J.RSER.2013.03.004.
Friedman, J. H. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19(1), 1–67.
Fu, Y., Li, Z., Zhang, H., & Xu, P. (2015). Using support vector machine to predict next day electricity load of public buildings with sub-metering devices. Procedia Engineering,


121, 1016–1022. https://doi.org/10.1016/J.PROENG.2015.09.097. Li, X., & Wen, J. (2015). Building energy forecasting using system identification based on
Fumo, N. (2014). A review on the basics of building energy estimation. Renewable and system characteristics test. 2015 workshop on modeling and simulation of cyber-physical
Sustainable Energy Reviews, 31, 53–60. https://doi.org/10.1016/J.RSER.2013.11.040. energy systems (MSCPES), 1–6. https://doi.org/10.1109/MSCPES.2015.7115401.
Gelman, A., Carlin, J., Stern, H., Dunson, D., Vehtari, A., & Rubin, D. (2013). Bayesian Li, Y. (2011). LIBSVM-FarutoUltimate: A toolbox with implements for support vector machines
data analysis. New York: Chapman and Hall/CRChttps://doi.org/10.1201/b16018. based on libsvm. Retrieved from http://www.matlabsky.com/.
Ghadi, Y. Y., Rasul, M. G., & Khan, M. M. K. (2017). Energy savings by fuzzy base control Liu, D., Chen, Q., & Mori, K. (2015). Time series forecasting method of building energy
of occupancy concentration in institutional buildings. Energy Procedia, 105, consumption using support vector regression. 2015 ieee international conference on
2850–2858. https://doi.org/10.1016/J.EGYPRO.2017.03.628. information and automation, 1628–1632. https://doi.org/10.1109/ICInfA.2015.
Ghanbari, A., Abbasian-Naghneh, S., & Hadavandi, E. (2011). An intelligent load fore- 7279546 IEEE.
casting expert system by integration of ant colony optimization, genetic algorithms Liu, X., Marnay, C., Feng, W., Zhou, N., & Karali, N. (2017). A review of the American
and fuzzy logic. 2011 IEEE symposium on computational intelligence and data mining Recovery and Reinvestment Act Smart Grid Projects and their implications for China.
(CIDM), 246–251. https://doi.org/10.1109/CIDM.2011.5949432 IEEE. Ma, Z., Song, J., & Zhang, J. (2017). Energy consumption prediction of air-conditioning
Global Energy Statistical Yearbook, 2017|World Energy Statistics|Enerdata. (2017). systems in buildings by selecting similar days based on combined weights. Energy and
Retrieved from https://yearbook.enerdata.net/. Buildings, 151, 157–166. https://doi.org/10.1016/j.enbuild.2017.06.053.
Ho, T. K. (1995). Random decision forests. Proceedings of 3rd international conference on Magoulès, F., & Zhao, H.-X. (2016). Data mining and machine learning in building energy
document analysis and recognition, Vol. 1, 278–282. https://doi.org/10.1109/ICDAR. analysis. Wiley-ISTE.
1995.598994 IEEE Comput. Soc. Press. Marino, D. L., Amarasinghe, K., & Manic, M. (2016). Building energy load forecasting
Hyndman, R. J., & Koehler, A. B. (2006). Another look at measures of forecast accuracy. using Deep Neural Networks. IECON 2016 – 42nd annual conference of the IEEE
International Journal of Forecasting, 22(4), 679–688. https://doi.org/10.1016/J. Industrial Electronics Society, 7046–7051. https://doi.org/10.1109/IECON.2016.
IJFORECAST.2006.03.001. 7793413 IEEE.
IBM SPSS Modeler. (n.d.). Retrieved from https://www.ibm.com/products/spss-modeler. Massa Gray, F., & Schmidt, M. (2018). A hybrid approach to thermal building modelling
IBM SPSS Statistics. (n.d.). Retrieved from https://www.ibm.com/products/spss- using a combination of Gaussian processes and grey-box models. Energy and Buildings,
statistics. 165, 56–63. https://doi.org/10.1016/J.ENBUILD.2018.01.039.
International Energy Agency (2016). Tracking clean energy progress 2016. Retrieved from Massana, J., Pous, C., Burgas, L., Melendez, J., & Colomer, J. (2015). Short-term load
http://www.iea.org/publications/freepublications/publication/ forecasting in a non-residential building contrasting models and attributes. Energy
TrackingCleanEnergyProgress2016.pdf. and Buildings, 92, 322–330. https://doi.org/10.1016/J.ENBUILD.2015.02.007.
International Energy Agency (2017). Tracking clean energy progress 2017 informing energy Mat Daut, M. A., Hassan, M. Y., Abdullah, H., Rahman, H. A., Abdullah, M. P., & Hussin,
sector transformations. Retrieved from http://www.iea.org/publications/ F. (2017). Building electrical energy consumption forecasting analysis using con-
freepublications/publication/TrackingCleanEnergyProgress2017.pdf. ventional and artificial intelligence methods: A review. Renewable and Sustainable
Energy Reviews, 70, 1108–1118. https://doi.org/10.1016/J.RSER.2016.12.015.
Iowa Environmental Mesonet. (n.d.). Retrieved from https://mesonet.agron.iastate.edu/.
Jain, A. K. (2008). Data clustering: 50 years beyond k-means. 19th international conference on pattern recognition (ICPR) (pp. 1–33).
Jeong, K., Koo, C., & Hong, T. (2014). An estimation model for determining the annual energy cost budget in educational facilities using SARIMA (seasonal autoregressive integrated moving average) and ANN (artificial neural network). Energy, 71, 71–79. https://doi.org/10.1016/j.energy.2014.04.027.
Jolliffe, I. T. (2002). Principal component analysis (2nd ed.). Springer, Ed.
Jovanović, R.Ž., Sretenović, A. A., & Živković, B. D. (2015). Ensemble of various neural networks for prediction of heating energy consumption. Energy and Buildings, 94, 189–199.
Kandasamy, N. K., Karunagaran, G., Spanos, C., Tseng, K. J., & Soong, B. H. (2018). Smart lighting system using ANN-IMC for personalized lighting control and daylight harvesting. Building and Environment, 139(April), 170–180. https://doi.org/10.1016/j.buildenv.2018.05.005.
Karaboga, D., & Basturk, B. (2007). A powerful and efficient algorithm for numerical function optimization: Artificial bee colony (ABC) algorithm. Journal of Global Optimization, 39(3), 459–471. https://doi.org/10.1007/s10898-007-9149-x.
Kass, G. V. (1980). An exploratory technique for investigating large quantities of categorical data. Applied Statistics, 29(2), 119–127.
Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of ICNN’95 – International conference on neural networks, Vol. 4, 1942–1948. https://doi.org/10.1109/ICNN.1995.488968 IEEE.
Keras Documentation. (n.d.). Retrieved from https://keras.io/.
Kohonen, T. (1997). Exploration of very large databases by self-organizing maps. Proceedings of international conference on neural networks (ICNN’97), Vol. 1, 1–6. https://doi.org/10.1109/ICNN.1997.611622 IEEE.
Kolodner, J. (2014). Case-based reasoning. Elsevier Science.
Kristensen, M. H., & Petersen, S. (2016). Choosing the appropriate sensitivity analysis method for building energy model-based investigations. Energy and Buildings, 130, 166–176. https://doi.org/10.1016/j.enbuild.2016.08.038.
Kwok, S. S. K., & Lee, E. W. M. (2011). A study of the importance of occupancy to building cooling load in prediction by intelligent approach. Energy Conversion and Management, 52(7), 2555–2564.
Kylili, A., & Fokaides, P. A. (2015). European smart cities: The role of zero energy buildings. Sustainable Cities and Society, 15, 86–95. https://doi.org/10.1016/J.SCS.2014.12.003.
L’Heureux, A., Grolinger, K., Elyamany, H. F., & Capretz, M. A. M. (2017). Machine learning with big data: Challenges and approaches. IEEE Access, 5, 7776–7797. https://doi.org/10.1109/ACCESS.2017.2696365.
Lachut, D., Banerjee, N., & Rollins, S. (2014). Predictability of energy use in homes. International green computing conference, 1–10. https://doi.org/10.1109/IGCC.2014.7039146 IEEE.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539.
Lee, C.-M., & Ko, C.-N. (2009). Time series prediction using RBF neural networks with a nonlinear time-varying evolution PSO algorithm. Neurocomputing, 73(1–3), 449–460. https://doi.org/10.1016/j.neucom.2009.07.005.
Li, K., Hu, C., Liu, G., & Xue, W. (2015). Building's electricity consumption prediction using optimized artificial neural networks and principal component analysis. Energy and Buildings, 108, 106–113. https://doi.org/10.1016/J.ENBUILD.2015.09.002.
Li, K., Xie, X., Xue, W., Dai, X., Chen, X., & Yang, X. (2018). A hybrid teaching-learning artificial neural network for building electrical energy consumption prediction. Energy and Buildings, 174, 323–334. https://doi.org/10.1016/j.enbuild.2018.06.017.
Li, X., Lu, J., Ding, L., Xu, G., & Li, J. (2009). Building cooling load forecasting model based on LS-SVM. 2009 Asia-Pacific conference on information processing, 55–58. https://doi.org/10.1109/APCIP.2009.22 IEEE.
MATLAB – MathWorks – MATLAB Simulink. (n.d.). Retrieved from https://www.mathworks.com/products/matlab.html.
Mena, R., Rodríguez, F., Castilla, M., & Arahal, M. R. (2014). A prediction model based on neural networks for the energy consumption of a bioclimatic building. Energy and Buildings, 82, 142–155.
Meteonorm: Irradiation data for every place on Earth. (n.d.). Retrieved from https://www.meteonorm.com/.
Miller, C., Nagy, Z., & Schlueter, A. (2018). A review of unsupervised statistical learning and visual analytics techniques applied to performance analysis of non-residential buildings. Renewable and Sustainable Energy Reviews, 81(2016, December), 1365–1377. https://doi.org/10.1016/j.rser.2017.05.124.
Mitchell, M. (1998). An introduction to genetic algorithms. Bradford Book – The MIT Press.
Mocanu, E., Nguyen, P. H., Gibescu, M., & Kling, W. L. (2016a). Deep learning for estimating building energy consumption. Sustainable Energy, Grids and Networks, 6, 91–99. https://doi.org/10.1016/j.segan.2016.02.005.
Mocanu, E., Nguyen, P. H., Kling, W. L., & Gibescu, M. (2016b). Unsupervised energy prediction in a Smart Grid context using reinforcement cross-building transfer learning. Energy and Buildings, 116, 646–655. https://doi.org/10.1016/J.ENBUILD.2016.01.030.
mySVM – TU Dortmund. (n.d.). Retrieved from https://www-ai.cs.uni-dortmund.de/SOFTWARE/MYSVM/index.html.
Neto, A. H., & Fiorelli, F. A. S. (2008). Comparison between detailed model simulation and artificial neural network for forecasting building energy consumption. Energy and Buildings, 40(12), 2169–2176. https://doi.org/10.1016/J.ENBUILD.2008.06.013.
Neural Network Toolbox – MATLAB. (n.d.). Retrieved from https://www.mathworks.com/products/neural-network.html.
NeuroLab 0.3.5, Neural Network Library for Python. (n.d.). Retrieved from https://pythonhosted.org/neurolab/.
Newsham, G. R., & Birt, B. J. (2010). Building-level occupancy data to improve ARIMA-based electricity use forecasts. Proceedings of the 2nd ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Building – BuildSys’10, 13. https://doi.org/10.1145/1878431.1878435 ACM Press New York, New York, USA.
Nilashi, M., Dalvi-Esfahani, M., Ibrahim, O., Bagherifard, K., Mardani, A., & Zakuan, N. (2017). A soft computing method for the prediction of energy performance of residential buildings. Measurement, 109, 268–280. https://doi.org/10.1016/J.MEASUREMENT.2017.05.048.
Obey, D. R. (2009). Text – H.R.1 – 111th Congress (2009–2010): American Recovery and Reinvestment Act of 2009. Retrieved from https://www.congress.gov/bill/111th-congress/house-bill/1/text.
Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345–1359. https://doi.org/10.1109/TKDE.2009.191.
Papadopoulos, S., Azar, E., Woon, W.-L., & Kontokosta, C. E. (2017). Evaluation of tree-based ensemble learning algorithms for building energy performance estimation. Journal of Building Performance Simulation, 1–11. https://doi.org/10.1080/19401493.2017.1354919.
Parti, M., & Parti, C. (1980). The total and appliance-specific conditional demand for electricity in the household sector. The Bell Journal of Economics, 11(1), 309. https://doi.org/10.2307/3003415.
Paudel, S., Elmitri, M., Couturier, S., Nguyen, P. H., Kamphuis, R., Lacarrière, B., & Le Corre, O. (2017). A relevant data selection method for energy consumption prediction of low energy building based on support vector machine. Energy and Buildings, 138, 240–256. https://doi.org/10.1016/J.ENBUILD.2016.11.009.
Pedersen, L. (2007). Use of different methodologies for thermal load and energy estimations in buildings including meteorological and sociological input parameters. Renewable and Sustainable Energy Reviews, 11(5), 998–1007. https://doi.org/10.1016/J.RSER.2005.08.005.
Peschiera, G., & Taylor, J. E. (2012). The impact of peer network position on electricity consumption in building occupant networks utilizing energy feedback systems. Energy and Buildings, 49, 584–590. https://doi.org/10.1016/J.ENBUILD.2012.03.011.
Pisello, A. L., & Asdrubali, F. (2014). Human-based energy retrofits in residential buildings: A cost-effective alternative to traditional physical strategies. Applied Energy, 133, 224–235. https://doi.org/10.1016/J.APENERGY.2014.07.049.
Pulido-Arcas, J. A., Pérez-Fargallo, A., & Rubio-Bellido, C. (2016). Multivariable regression analysis to assess energy consumption and CO2 emissions in the early stages of offices design in Chile. Energy and Buildings, 133, 738–753. https://doi.org/10.1016/J.ENBUILD.2016.10.031.
Python. (n.d.). Retrieved from https://www.python.org/.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81–106.
Quinlan, J. R. (1993). C4.5: Programs for machine learning. Morgan Kaufmann Publishers.
R: The R Project for Statistical Computing. (n.d.). Retrieved from https://www.r-project.org/.
Rahman, A., Srikumar, V., & Smith, A. D. (2018). Predicting electricity consumption for commercial and residential buildings using deep recurrent neural networks. Applied Energy, 212(October 2017), 372–385. https://doi.org/10.1016/j.apenergy.2017.12.051.
Rasmussen, C. E., & Williams, C. K. I. (2006). Gaussian processes for machine learning. MIT Press Retrieved from http://f3.tiera.ru/2/Cs_Computer science/CsLn_Lecture notes/Advanced Lectures on Machine Learning 2003(LNCS3176, Springer, 2004)(ISBN 3540231226)(248s).pdf#page=70.
Rathore, M. M., Paul, A., Hong, W.-H., Seo, H., Awan, I., & Saeed, S. (2018). Exploiting IoT and big data analytics: Defining Smart Digital City using real-time urban data. Sustainable Cities and Society, 40, 600–610. https://doi.org/10.1016/J.SCS.2017.12.022.
Ribeiro, M., Grolinger, K., ElYamany, H. F., Higashino, W. A., & Capretz, M. A. M. (2018). Transfer learning with seasonal and trend adjustment for cross-building energy forecasting. Energy and Buildings, 165, 352–363. https://doi.org/10.1016/J.ENBUILD.2018.01.034.
Rokach, L., & Maimon, O. (2005). Clustering methods. Data mining and knowledge discovery handbook, 321–352. https://doi.org/10.1007/0-387-25465-X_15 Springer-Verlag New York.
Ruparathna, R., Hewage, K., & Sadiq, R. (2017). Economic evaluation of building energy retrofits: A fuzzy based approach. Energy and Buildings, 139, 395–406. https://doi.org/10.1016/J.ENBUILD.2017.01.031.
Schapire, R. E., & Singer, Y. (1999). Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37(3), 297–336. https://doi.org/10.1023/A:1007614523901.
Scikit-Learn: Machine learning in Python. (n.d.). Retrieved from https://scikit-learn.org/stable/.
Sekhar Roy, S., Roy, R., & Balas, V. E. (2018). Estimating heating load in buildings using multivariate adaptive regression splines, extreme learning machine, a hybrid model of MARS and ELM. Renewable and Sustainable Energy Reviews, 82, 4256–4268. https://doi.org/10.1016/J.RSER.2017.05.249.
Shi, G., Liu, D., & Wei, Q. (2016). Energy consumption prediction of office buildings based on echo state networks. Neurocomputing, 216, 478–488.
Siddharth, V., Ramakrishna, P. V., Geetha, T., & Sivasubramaniam, A. (2011). Automatic generation of energy conservation measures in buildings using genetic algorithms. Energy and Buildings, 43(10), 2718–2726. https://doi.org/10.1016/J.ENBUILD.2011.06.028.
Silva, B. N., Khan, M., & Han, K. (2018). Towards sustainable smart cities: A review of trends, architectures, components, and open challenges in smart cities. Sustainable Cities and Society, 38(August 2017), 697–713. https://doi.org/10.1016/j.scs.2018.01.053.
Smart Metering deployment in the European Union | JRC Smart Electricity Systems and Interoperability. (n.d.). Retrieved from https://ses.jrc.ec.europa.eu/smart-metering-deployment-european-union.
Smola, A. J., & Schölkopf, B. (2004). A tutorial on support vector regression. Statistics and Computing, 14, 199–222.
Song, Q., & Chissom, B. S. (1993). Fuzzy time series and its models. Fuzzy Sets and Systems, 54(3), 269–277. https://doi.org/10.1016/0165-0114(93)90372-O.
Spandagos, C., & Ng, T. L. (2018). Fuzzy model of residential energy decision-making considering behavioral economic concepts. Applied Energy, 213, 611–625. https://doi.org/10.1016/J.APENERGY.2017.10.112.
StatsModels: Statistics in Python – statsmodels 0.9.0 documentation. (n.d.). Retrieved from https://www.statsmodels.org/stable/index.html.
Storn, R., & Price, K. (1997). Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11(4), 341–359. https://doi.org/10.1023/A:1008202821328.
Sütterlin, B., Brunner, T. A., & Siegrist, M. (2011). Who puts the most energy into energy conservation? A segmentation of energy consumers based on energy-related behavioral characteristics. Energy Policy, 39, 8137–8152.
Swan, L. G., & Ugursal, V. I. (2009). Modeling of end-use energy consumption in the residential sector: A review of modeling techniques. Renewable and Sustainable Energy Reviews, 13(8), 1819–1835. https://doi.org/10.1016/J.RSER.2008.09.033.
Tahmassebi, A., & Gandomi, A. H. (2018). Building energy consumption forecast using multi-objective genetic programming. Measurement, 118, 164–171. https://doi.org/10.1016/J.MEASUREMENT.2018.01.032.
Tang, F., Kusiak, A., & Wei, X. (2014). Modeling and short-term prediction of HVAC system with a clustering algorithm. Energy and Buildings, 82, 310–321. https://doi.org/10.1016/j.enbuild.2014.07.037.
Tardioli, G., Kerrigan, R., Oates, M., O'Donnell, J., & Finn, D. (2015). Data driven approaches for prediction of building energy consumption at urban level. Energy Procedia, 78, 3378–3383. https://doi.org/10.1016/J.EGYPRO.2015.11.754.
TensorFlow. (n.d.). Retrieved from https://www.tensorflow.org/.
Toffanin, D. (2016). Generation of customer load profiles based on smart-metering time series, building-level data and aggregated measurements. Swiss Federal Institute of Technology (ETH).
TRNSYS: Transient System Simulation Tool. (n.d.). Retrieved from http://www.trnsys.com/.
Tsanas, A., & Xifara, A. (2012). Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy and Buildings, 49, 560–567. https://doi.org/10.1016/J.ENBUILD.2012.03.003.
Tsinghua University Building Energy Research Center (2016). China building energy use 2016.
Tso, G. K. F., & Yau, K. K. W. (2007). Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks. Energy, 32(9), 1761–1768. https://doi.org/10.1016/J.ENERGY.2006.11.010.
U.S. Energy Information Administration (EIA) (2018). How many smart meters are installed in the United States, and who has them? Retrieved from https://www.eia.gov/tools/faqs/faq.php?id=108&t=3.
Valgaev, O., & Kupzog, F. (2016). Building power demand forecasting using K-nearest neighbors model – initial approach. 2016 IEEE PES Asia-Pacific power and energy engineering conference (APPEEC), 1055–1060. https://doi.org/10.1109/APPEEC.2016.7779700 IEEE.
Valgaev, O., Kupzog, F., & Schmeck, H. (2017). Building power demand forecasting using K-nearest neighbours model – practical application in Smart City Demo Aspern project. CIRED – Open Access Proceedings Journal, 2017(1), 1601–1604. https://doi.org/10.1049/oap-cired.2017.0419.
Wahid, F., & Kim, D. (2016). A prediction approach for demand analysis of energy consumption using k-nearest neighbor in residential buildings. International Journal of Smart Home, 10(2), 97–108. https://doi.org/10.14257/ijsh.2016.10.2.10.
Wang, J., Wang, J., Li, Y., Zhu, S., & Zhao, J. (2014). Techniques of applying wavelet denoising into a combined model for short-term load forecasting. International Journal of Electrical Power and Energy Systems, 62, 816–824. https://doi.org/10.1016/j.ijepes.2014.05.038.
Wang, Z., & Srinivasan, R. S. (2017). A review of artificial intelligence based building energy use prediction: Contrasting the capabilities of single and ensemble prediction models. Renewable and Sustainable Energy Reviews, 75, 796–808. https://doi.org/10.1016/J.RSER.2016.10.079.
Wang, Z., Wang, Y., & Srinivasan, R. S. (2018a). A novel ensemble learning approach to support building energy use prediction. Energy and Buildings, 159, 109–122. https://doi.org/10.1016/J.ENBUILD.2017.10.085.
Wang, Z., Wang, Y., Zeng, R., Srinivasan, R. S., & Ahrentzen, S. (2018b). Random Forest based hourly building energy prediction. Energy and Buildings, 171, 11–25. https://doi.org/10.1016/j.enbuild.2018.04.008.
Wei, Y., Zhang, X., Shi, Y., Xia, L., Pan, S., Wu, J., & … Zhao, X. (2018). A review of data-driven approaches for prediction and classification of building energy consumption. Renewable and Sustainable Energy Reviews, 82, 1027–1047. https://doi.org/10.1016/J.RSER.2017.09.108.
Weka 3 – Data Mining with Open Source Machine Learning Software in Java. (n.d.). Retrieved from https://www.cs.waikato.ac.nz/ml/weka/index.html.
Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241–259. https://doi.org/10.1016/S0893-6080(05)80023-1.
Woo, Y. E., & Cho, G. H. (2018). Impact of the surrounding built environment on energy consumption in mixed-use building. Sustainability (Switzerland), 10(3), https://doi.org/10.3390/su10030832.
Xifara, A., & Tsanas, A. (n.d.). UCI machine learning repository: Energy efficiency Data Set. Retrieved from https://archive.ics.uci.edu/ml/datasets/energy+efficiency.
Xu, X., Taylor, J. E., Pisello, A. L., & Culligan, P. J. (2012). The impact of place-based affiliation networks on energy conservation: An holistic model that integrates the influence of buildings, residents and the neighborhood context. Energy and Buildings, 55, 637–646.
Yalcinoz, T., & Eminoglu, U. (2005). Short term and medium term power distribution load forecasting by neural networks. Energy Conversion and Management, 46(9–10), 1393–1405. https://doi.org/10.1016/J.ENCONMAN.2004.07.005.
Yang, J., Santamouris, M., & Lee, S. E. (2016, January). Review of occupancy sensing systems and occupancy modeling methodologies for the application in institutional buildings. Energy and Buildings, 121, 344–349. https://doi.org/10.1016/j.enbuild.2015.12.019.
Yildiz, B., Bilbao, J. I., & Sproul, A. B. (2017). A review and analysis of regression and machine learning models on commercial building electricity load forecasting. Renewable and Sustainable Energy Reviews, 73, 1104–1122. https://doi.org/10.1016/J.RSER.2017.02.023.
Yu, Z., Haghighat, F., Fung, B. C. M., & Yoshino, H. (2010). A decision tree method for building energy demand modeling. Energy and Buildings, 42(10), 1637–1646. https://doi.org/10.1016/J.ENBUILD.2010.04.006.
Yun, K., Luck, R., Mago, P. J., & Cho, H. (2012). Building hourly thermal load prediction using an indexed ARX model. Energy and Buildings, 54, 225–233. https://doi.org/10.1016/J.ENBUILD.2012.08.007.
Zhang, C., Cao, L., & Romagnoli, A. (2018). On the feature engineering of building energy data mining. Sustainable Cities and Society. https://doi.org/10.1016/J.SCS.2018.02.016.
Zhang, C., Zhao, Y., Zhang, X., Fan, C., & Li, T. (2017). An improved cooling load prediction method for buildings with the estimation of prediction intervals. Procedia Engineering, 205, 2422–2428. https://doi.org/10.1016/J.PROENG.2017.09.967.
Zhang, F., Deb, C., Lee, S. E., Yang, J., & Shah, K. W. (2016). Time series forecasting for building energy consumption using weighted Support Vector Regression with differential evolution optimization technique. Energy and Buildings, 126, 94–103. https://doi.org/10.1016/J.ENBUILD.2016.05.028.
Zhao, D., Zhong, M., Zhang, X., & Su, X. (2016). Energy consumption predicting model of VRV (Variable refrigerant volume) system in office buildings based on data mining. Energy, 102, 660–668.
Zhao, H., & Magoulès, F. (2012). A review on the prediction of building energy consumption. Renewable and Sustainable Energy Reviews, 16(6), 3586–3592. https://doi.org/10.1016/J.RSER.2012.02.049.
Zheng, Z., Zhuang, Z., Lian, Z., & Yu, Y. (2017). Study on building energy load prediction based on monitoring data. Procedia Engineering, 205, 716–723. https://doi.org/10.1016/j.proeng.2017.09.894.

