Building Energy Performance Forecasting: a Multiple Linear Regression Approach
All content following this page was uploaded by Antonino D'Amico on 11 September 2020.
ABSTRACT
Different ways to evaluate the building energy balance can be found in literature, including
comprehensive techniques, statistical and machine-learning methods and hybrid approaches. The
identification of the most suitable approach is important to accelerate the preliminary energy
assessment. In the first category, several numerical methods have been developed and implemented
in specialised software using different mathematical languages. However, these tools require an
expert user and a model calibration. The authors, in order to overcome these limitations, have
developed an alternative, reliable linear regression model to determine building energy needs.
Starting from a detailed and calibrated dynamic model, it was possible to implement a parametric
simulation that solves the energy performance of 195 scenarios. The lack of general results led the
authors to investigate a statistical method also capable of supporting an unskilled user in the
estimation of the building energy demand. To guarantee high reliability and ease of use, a selection
of the most suitable variables was conducted by careful sensitivity analysis using the Pearson
coefficient. The Multiple Linear Regression method allowed the development of some simple
relationships to determine the thermal heating or cooling energy demand of a generic building as a
function of only a few, well-known parameters. Deep statistical analysis of the main error indices
underlined the high reliability of the results. This approach is not targeted at replacing a dynamic
simulation model, but it represents a simple decision support tool for the preliminary assessment of
the energy demand related to any building and any weather condition.
Keywords
Building Energy Demand; Sensitivity Analysis; Forecast Method; Dynamic Simulation; Black Box
Method; Multiple Linear Regression.
MLR parameters
yi i-th dependent variable
xi i-th explanatory variable
b0 intercept of the linear regression
bi i-th regression coefficient
e error of the linear regression
Other parameters
CDD Cooling Degree Days [K day]
HDD Heating Degree Days [K day]
QG Internal gains [kWh/year]
Sop Opaque surface [m2]
Sw Surface of the glazed component [m2]
S/V Shape factor [m-1]
U Thermal transmittance [W/(m2·K)]
1. Introduction
In Europe, the building sector is considered to be the largest energy consumer, being responsible for
up to 40% of total energy use and 36% of total CO2 emissions [1]; more specifically, the
non-residential sector represents about 40% of total energy consumption in the building sector [2,3].
Knowledge of the energy performance of existing building stocks and the forecasting of the energy
behaviour of newly designed buildings are fundamental to achieving the targets of the EPBD (Energy
Performance of Buildings Directive) established by the European Union [1].
It is well-known that building energy assessment is quite complex owing to the influence of many
factors, such as weather conditions, the building construction and its shape, the thermophysical
properties of the materials used, the intended use, the occupancy and behaviour of the users, the
lighting, the ventilation, and the heating/cooling systems along with their performance and operating
schedules [4]. Furthermore, for new buildings, design choices must target high energy
performance so that the energy and environmental targets are reliably achieved.
In general, the evaluation of the energy performance of an existing building and the design of new
buildings integrating several energy-efficiency measures are solved via software programs with the
aim of predicting improvements that could be made considering different design management. For
careful energy planning, new methods have to be explored in order to support engineers and architects
in their efforts to improve design, reduce computational time and increase energy performance.
Due to the complexity of the problem, the prediction of building energy consumption is quite difficult
and has become one of the main objectives of several research studies. In recent years, a large number
of both elaborate and simplified forecast approaches have been proposed and applied to several
problems. Several of these cases are available, some based on knowledge of the building thermal
balance and on the resolution of physical equations, and others on building data collection and on the
implementation of forecast models developed by means of machine-learning techniques [5]. In the
literature, it is possible to distinguish three main methods: “white box” or physical techniques, “black
box” or statistical and/or learning approaches and “grey box” or hybrid approaches.
The “white box” approaches are used to model building thermal behaviour for several applications
on different scales. These techniques, known also as engineering methods, are based on the use of
physical principles to solve the equations describing the physical behaviour of heat transfer. In this
category, it is possible to distinguish between simplified and detailed comprehensive methods.
Among the simplified methods, the degree day method is one of the most used; several research
studies affirm that meteorological data provide an effective tool for determining the energy demand
and for calculating heating or cooling building requirements [6–8]. Another simplified method is
based on the temperature frequency, which can be used to model large buildings where internal gains
(QG) dominate [9]. For example, White et al. [10] attempted to use average monthly temperatures to
predict monthly building energy consumption and Westphal et al. [11] forecasted the annual heating
and cooling loads of non-residential buildings based on certain weather variables.
On the other hand, the detailed comprehensive methods use very elaborate physical functions to
evaluate, step-by-step, the energy consumption of a building linked to its construction, operation of
the systems, utility rate schedule of the equipment, external climate conditions and solar irradiance.
To solve such physical problems, a large number of numerical software programs are available and
these have been compared [12,13]. Users can choose to select the mechanisms and the associated
equations representing the system, but sometimes many software tools are badly adapted to taking
into account moisture influences, and generally the effects of latent heat are neglected [13,14]. In the
literature, three main thermal building models can currently be found: Computational Fluid Dynamics
(CFD), zonal methods and the multi-zone technique. CFD is a branch of fluid mechanics that uses
numerical analysis to solve problems that involve fluid flows. Nowadays, a large number
of CFD software programs are available, such as FLUENT [15], COMSOL Multiphysics [16], MIT-CFD,
PHOENICS-CFD [17], and so on.
The zonal method is the first degree of simplification of the CFD technique; it involves dividing each
building zone into several cells detailing the indoor environment and estimating a thermal comfort
zone [18,19]. Specifically, this technique is efficient at describing the flow profiles
within the building. The multi-zone technique, or nodal method, is based on the assumption that each
building zone is a homogeneous volume characterised by uniform state variables. The solution is
based on the application of two main methods: transfer functions or the finite difference method. In
the field of energy efficiency and sustainability in buildings, and based on this last technique, several
software tools have been developed, such as EnergyPlus [20], ESP-r [21], TRNSYS, IDA-ICE [22],
Clim2000 [23,24], BSim [25,26] and BUILDOPT-VIE [27].
Although these simulation tools are effective and accurate, there are some practical difficulties in
implementing a reliable model. Indeed, these tools require details of the building and environmental
parameters which are not always simple to find and collect, and the lack of precise input can lead to
a low-accuracy simulation; furthermore, these tools normally require an expert user and
a careful calibration of the model.
The “black box” approaches are mainly used to deduce a prediction model from a relevant database
(for example, to forecast energy consumption or heating/cooling load in a given building). These
models do not require any information about physical phenomena; they are based on a function
deduced only from sample data that describe the behaviour of
a specific system. The black box methods mainly employed in the field of building energy forecasting
are: Multiple Linear Regression (MLR) or statistical regression model, Genetic Algorithm (GA),
Artificial Neural Network (ANN) and Support Vector Machine (SVM) [4,5]; an overview of these
methods is given in Li et al. [28].
MLR methods correlate the building energy consumption or energy indices with the influencing
variables in a simple way. These empirical models are developed based on energy performance data
collected previously. In certain simplified models, linear regression is used to correlate the energy
consumption with climatic variables [29–31]; for example, Ansari et al. [32] calculated the total
cooling load by adding up the cooling load of each building envelope component, while Dhar et al.
[33,34] modelled heating and cooling loads using the outdoor dry bulb temperature as the only
weather variable. Parti et al. [35] were the first to propose a new method using linear regression for
the prediction of building energy consumption.
Kialashaki et al. [36] applied the regression and ANN models to evaluate the energy requirements of
the residential sector.
The main advantage of this method is its ease of use; indeed, no specific expertise is required. As
indicated in Aydinalp-Koksal et al. [37], regression models are easier to use than the engineering
methods. However, MLR has a major limitation in that it is unable to treat non-linear
problems.
GA is a stochastic optimisation technique based on Darwin’s theory of evolution. In building
simulation, GA is used to find a prediction model deducing a simple equation which can fit the
problem. An important advantage of GA is that it is a powerful optimisation method
which can be applied to a wide range of problems and give several final solutions to a complex problem [5].
Among artificial intelligence models, ANNs are the most widely used in the forecasting of building
energy and are capable of solving both non-linear and complex problems [38,39]. The main advantage
of ANN is its ability to determine non-linear relationships among different variables without any
assumptions or any postulate of a model overcoming the discretisation problem. However, ANNs
need to have a relevant database in order to obtain reliable solutions. In fact, it is really important to
train an ANN with an exhaustive learning database with representative and complete samples [4].
Among artificial intelligence techniques, SVM, introduced by Vapnik et al. [40], is usually used to
solve classification and regression problems. These are highly effective models even with small
quantities of training data. Many studies [41,42] of these models were conducted on building energy
analysis and demonstrate that SVMs can perform well in predicting hourly and monthly building
energy consumption. When a problem cannot be completely solved by applying one of the methods
previously described, it is possible to use a “grey box” method. These methods can overcome the
limitations of each individual technique by coupling them so that the advantages of one method
counteract the drawbacks of the other [5].
In the field of building energy planning, it would be more convenient to identify the best method that
describes the investigated problem for the development of a generic decision support tool,
characterised by low calculation time, a non-complex data collection phase, high reliability and a
simple calculation language that can be used even by a non-expert user.
1.1 Contribution of the work
In this paper, the authors have tried to identify a simple method capable of solving the traditional
building energy balance, which will represent a decision support tool useful in the preliminary phases
of energy planning, when the user is not an expert or when it is necessary to speed up the
decision-making phases.
A comprehensive analysis of the energy performance of a specific building, although correctly
interpreting the energy balance problem, requires an expert user with a knowledge of the physical
problems associated and who is capable of constructing a model, collecting and implementing a large
number of parameters, performing careful calibrations and explaining the results well. All of these
steps require high computational time and do not always provide an immediately correct evaluation;
with an incorrect assessment, the procedure must be restarted. Moreover, although a parametric
simulation allows the simultaneous analysis of the energy needs of several case studies, the results
cannot be generalised: a dynamic simulation of each individual case corresponds to a specific result.
To try to overcome these limits and to accelerate the preliminary assessment phase, the authors
investigated the reliability of an alternative method using the multiple linear regression to solve the
building energy balance. With the implementation of a detailed, reliable energy database,
representative of high energy performance non-residential building stocks, it was possible to apply
the black box method and to compare the obtained results with the previous comprehensive analysis.
Obviously, the method validity is linked to the reliability of the database used to identify the linear
correlation. To ensure this high reliability, the authors based their work on a carefully calibrated
TRNSYS model, implementing a parametric simulation which allowed the investigation of 195
scenarios representing different possible building combinations built with high energy performance
and simulated in several climatic conditions for different thermophysical characteristics and different
shape factors (S/V). Furthermore, a careful sensitivity analysis through examination of the Pearson
coefficient permitted the identification of the most suitable variables that influence the building
energy balance during the heating/cooling period. The application of the MLR method to the energy
database resulted in the definition of some simple correlations that identified heating (Hd), cooling
(Cd) or comprehensive (Ed) energy demand with a high degree of reliability and these are valid for a
representative building stock. These correlations, validated thanks to deep statistical error analysis,
solve the building energy balance knowing only a few well-known parameters and without any
computational time or physical knowledge. For these reasons, the solutions obtained from the
application of the MLR method can be considered generic and applicable to any condition. The
literature reports black box methods being applied to forecast the energy needs of a single building
or a district level, yet in this work a methodology is proposed to allow the identification of a more
flexible tool that can assess the energy requirements of an entire panorama of non-residential
buildings. Indeed, once the correlations valid at a general level have been determined, it will be
possible to provide an easy-to-use tool that can help identify the needs of a building without the
user knowing the physical problem or all the variables that come into play, simply by solving a linear
equation. Furthermore, the added value of this paper lies in the generality of the results obtained
thanks to the availability of an accurate database built on a high number of models that were simulated
with parametric simulations. The high degree of reliability achieved from the results guarantees that
this methodology can be replicated in any other climatic and building context, representing a
forecasting tool to support decisions.
The simple form of the correlations could be used as a supplementary evaluation criterion/tool to
support standards and/or laws in the field of building energy performance. Although it is possible to
develop AI-based models (SVM, ANN, GA, and so on), which in some cases present more accurate
solutions, these tools require extensive knowledge of the physical, numerical and mathematical principles
of the analysed sector. Moreover, as for the MLR application, such tools require an accurate database
for their implementation [5,43]. Another strength of the model presented in
this work is that the application of the MLR method does not require, during the use phase, any
calculation tool such as a personal computer or software program; it only requires the
resolution of a simple linear equation.
2. Method
The aim of this paper is to provide an improved method that allows the evaluation of building energy
performance immediately and simply in any situation and boundary conditions. In this section, the
main steps and the procedures used to achieve the objective of the work are illustrated. In the flow
chart, displayed in Fig. 1, the entire procedure followed by the authors is represented.
As indicated in the flow chart, the idea is to develop a generic decision support tool that, without an
expert user and with a high degree of reliability, immediately solves the traditional building energy
balance in any case and in any conditions. In order to achieve this goal, it is necessary to develop a
generic solution of a representative building stock which includes all possible building topologies and
environmental conditions. For this reason, the authors decided to investigate, as representative, non-
residential buildings designed with high energy performance according to the energy efficiency laws
and standards in Italy (section 3).
As explained in the introduction, several methods can be chosen to solve the building energy balance;
the two methods applied in this work are reported in the flow chart. First of all, to determine
the building thermal energy demand a comprehensive method (section 4) was applied. A detailed
TRNSYS model of a non-residential building was developed allowing, after a careful calibration with
actual conditions, the determination of the heating and cooling energy requirements (section 4.1). To
collect the results that describe the energy balance of the representative building stock, a parametric
simulation was developed. Based on the calibrated model and developing the parametric analysis, the
authors simulated 195 scenarios of a non-residential building [44], which represent the possible
combinations in 5 climatic zones and 15 cities, with 13 shape factors and different thermo-physical
parameters, for a total of 1560 simulations (section 4.2).
The identification of a series of restrictions such as data collection, knowledge of the physical
problems, the tool language, the computational time and the lack of generality of results, because
these are single answers to a specific condition (section 4.3), prompted the authors to investigate other
alternative methods that overcome these limits. As previously indicated, a good alternative for
resolving this problem is represented by the black box methods (section 5). Although they do not take
into account the physics of the problem, they are able to identify a correlation or a dependence
between the input and output data. The strong correlation or dependence between the data is
guaranteed by the identification of the main parameters that characterise the building energy balance.
In this case, for a generic solution, all main parameters that describe the building thermal energy
balance and all thermal energy results obtained from the parametric simulations were collected in a
matrix of 197 rows and 20 columns. This dataset was used to explore the MLR method (section 6),
which allows the modelling of a linear relationship between two or more explanatory variables (input
of the model) and a response variable (building energy performance) through a fitting procedure.
The identification of the best solutions for calculating the building energy demand with high
reliability is guaranteed by a preliminary sensitivity analysis (section 6.1), which allowed the
identification of the parameters most correlated with the heating and cooling demand, and thus the optimal
input data for forecasting the building energy requirements. The performance analysis of this
alternative method is illustrated by means of an error metric analysis (section 6.2), which provides
the most used error indices. This statistical analysis was applied for all correlation forms proposed:
for the heating (section 6.2.1), cooling (section 6.2.2) and comprehensive energy demand assessment
(section 6.2.3). Owing to the reliability and flexibility of the energy database, this method was
investigated with good results. The generic database, which identifies a solution that simultaneously
responds to changes in climate and shape factor, gives generic solutions that can explain any possible
building topology in any condition (section 7).
3. Case study
As previously indicated, the authors proposed a method for the assessment of building energy needs
that can be extended to any context, for any building and boundary conditions. In order to obtain a
generic tool with these characteristics, it was necessary to investigate a representative building stock
that includes all possible building types and environmental conditions. It is known in the energy
efficiency field that every European country legislates autonomously and that the standards and laws
require different transmittance limit values for the building envelope and different efficiency systems.
In this case, the authors decided to analyse a representative building stock designed with high energy
performance and non-residential use located in the Italian context, which had already been developed
in a previous work [44].
Based on the Heating Degree Days (HDD) index, the Italian peninsula can be divided into six
different climatic zones [45], where zone A represents the hottest and zone F represents the coldest.
For each zone, the daily hours of heating system activity and the consequent yearly heating period
are indicated (Table 1) and the transmittance limit values for the design of high-performance
buildings are imposed (Table 2) [46].
Table 1
Italian Climatic Zones.
Climatic Zone   HDD From   HDD To   Heating season From   Heating season To   Daily hours
A               0          600      1st December          15th March          6
B               601        900      1st December          31st March          8
C               901        1400     15th November         31st March          10
D               1401       2100     1st November          15th April          12
E               2101       3000     15th October          15th April          14
F               3001       ∞        No limit
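The zone boundaries in Table 1 can be read as a simple lookup. The following is a minimal sketch of that lookup, assuming only the HDD ranges and daily hours listed in the table; the names `ZONE_UPPER_BOUNDS`, `DAILY_HOURS` and `climatic_zone` are hypothetical and are not part of the original work.

```python
# HDD upper bounds for climatic zones A to E, as listed in Table 1.
# Any value above 3000 falls in zone F. Names are illustrative only.
ZONE_UPPER_BOUNDS = [("A", 600), ("B", 900), ("C", 1400), ("D", 2100), ("E", 3000)]

# Daily heating hours per zone from Table 1 (None means no limit).
DAILY_HOURS = {"A": 6, "B": 8, "C": 10, "D": 12, "E": 14, "F": None}


def climatic_zone(hdd):
    """Return the climatic zone letter for a Heating Degree Days value."""
    if hdd < 0:
        raise ValueError("HDD must be non-negative")
    for zone, upper in ZONE_UPPER_BOUNDS:
        if hdd <= upper:
            return zone
    return "F"


# Example with arbitrary HDD values chosen around the zone boundaries.
for hdd in (550, 1400, 1401, 3200):
    zone = climatic_zone(hdd)
    print(hdd, zone, DAILY_HOURS[zone])
```

The list of upper bounds keeps the boundaries in one place, so a change in the ranges would only require editing the table data and not the function.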
Table 2
Limit thermal transmittance values for each climatic zone.
As for Cooling Degree Days (CDD), the current Italian standards indicate values without changing
the cooling periods for all Italian cities, and without making any distinction between the climate zones
[47]. In order to represent the full range of climate conditions, 3 cities, characterised by the maximum,
minimum, and mean HDD value, were selected for each zone [48]; the 15 selected cities, according
to current Italian laws and standards, are collected in Table 3.
Table 3
Selected Italian cities HDD and CDD values.
Table 4
Geometric features of the investigated building models [44].
On the basis of the real geometric constructions of a non-residential building with high energy
performance, the authors have tried to investigate the greatest number of combinations, varying the
S/V from 0.2 to 0.9 and respectively identifying the geometric dimensions.
A knowledge of the energy demand of each building, varying simultaneously the weather conditions,
the shape factor and the thermal transmittance of the envelope allowed us to obtain an energy database
of non-residential building stock designed with high performance representative of the Italian context.
Table 5
Envelope thermal features of the ideal building model.
Component       Layer   Material             Conductivity   Density    Thermal capacity   Thickness
                                             [W/mK]         [kg/m3]    [kJ/kgK]           [m]
External Wall   1       External coating     1.00           1800       0.84               0.02
External Wall   2       Lime cement          0.90           1800       0.96               0.015
External Wall   3       Tuff block           0.63           1500       0.70               0.30
External Wall   4       Internal Plaster     0.70           850        0.96               0.02
Floor           1       Cement Brick         2.00           2500       0.88               0.02
Floor           2       Cement Screed        1.40           2000       1.20               0.06
Floor           4       Concrete slab        1.91           1400       1.00               0.25
Floor           5       Internal Plaster 2   0.70           800        0.837              0.02
Roof            1       External tiles       1.10           2100       0.84               0.01
Roof            2       Bitumen              0.17           1200       1.40               0.02
Roof            3       Lime cement          1.40           2000       1.20               0.06
Roof            5       Concrete slab        1.91           1400       1.00               0.25
Roof            6       Internal Plaster 2   0.70           800        0.84               0.02
The windows are made of aluminium and equipped with insulating thermoacoustic glass with plastic
blinds. To solve the energy balance of this building, TRNSYS software (Fig. 2) was used and in order
to simulate the thermal behaviour, the following were considered:
• detailed weekly and daily schedules regarding the utilisation of equipment, lighting systems,
and presence of office users;
• actual monitored data recorded from 2000 to 2009 (TMY2) generated by Meteonorm software
[50];
• infiltration losses according to Appendix C of [51];
• a heat gain of 230 W per piece of equipment (one piece for each office worker and one piece
per 50 meeting people); and
• the estimation of the presence of office workers with sedentary activity (1 met).
Furthermore, based on the heating and cooling periods, a heating period was set from 1st December
to 31st March and a cooling period from 1st June to 30th September, excluding weekends and
holidays, for 8 hours per day based on the office occupancy rate.
Fig. 2. TRNSYS schema.
The results obtained from the dynamic simulations were validated through a model calibration. For
the model validation, data recorded by two-channel Hobo-U10 Temp/RH sensors
positioned in some office rooms of the building were used.
For a period from 25th February to 17th May 2006, the indoor air temperature and the indoor air
average relative humidity trends were monitored. For example, the data relating to an area for office
use located on the second floor is reported. This office was unoccupied for the entire period and,
therefore, characterised by a low air turnover and negligible temperature changes.
[Plot: indoor air temperature [°C], axis range 0 to 25, against time from 22/2/06 to 23/5/06; series: Measured and Simulated.]
Fig. 3. Comparison between the measured and simulated indoor air temperature trend.
In Fig. 3, the comparison between the hourly measured and simulated indoor air temperature is
illustrated. As reported in Mustafaraj et al. [52] and Royapoor et al. [53], according to the main
standards or guidelines (ASHRAE Guideline 14 [54], Measurement and Verification of Federal
Energy Projects (FEMP) [55] and International Performance Measurement and Verification Protocol
(IPMVP) [56]), the authors could validate the “Base-Case” model. In particular, two error indices
were calculated: the Normalized Mean Bias Error (NMBE) and the Coefficient of Variation of the
Root Mean Square Error (CV-RMSE):
• the NMBE (Eq. 1) is a normalisation of the Mean Bias Error and provides the global bias
between the expected and predicted data. Positive values of this index mean that the model
provides an underestimated value with respect to the expected data. Negative values mean
that the model provides an overestimated output data [57,58].
$$\mathrm{NMBE} = \frac{1}{\bar{x}} \cdot \frac{\sum_{i=1}^{N} (x_i - y_i)}{N} \times 100 \tag{1}$$
• the CV-RMSE (Eq. 2), providing a measure of the variability of the error between the
expected and predicted data, is one of the most important measurements for evaluating the
goodness-of-fit of the forecast model [59]. It provides a clear indication of the forecast ability
of the model in the field of building energy evaluation [58,60]
$$\mathrm{CV\text{-}RMSE} = \frac{1}{\bar{x}} \sqrt{\frac{\sum_{i=1}^{N} (x_i - y_i)^2}{N}} \times 100 \tag{2}$$
In Table 6, the limit values and ranges of applicability of all criteria for both indices for hourly
calibration are reported; furthermore, in the last column the NMBE and the CV-RMSE calculated for
the “Base-Case” model are indicated.
Table 6
Criteria and error indices for the model calibration.
For all criteria the “Base-Case” model was calibrated; the NMBE is within the applicability ranges
required and CV-RMSE is lower than the specified limit values.
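As a minimal sketch of how the two calibration indices of Eqs. 1 and 2 could be computed from paired hourly series, the following functions take the expected (measured) values x and the predicted (simulated) values y. The function names and the sample temperatures are illustrative assumptions, not data from the monitored building.

```python
import math


def nmbe(expected, predicted):
    """Normalized Mean Bias Error in percent (Eq. 1)."""
    n = len(expected)
    mean_x = sum(expected) / n
    bias = sum(x - y for x, y in zip(expected, predicted)) / n
    return 100.0 * bias / mean_x


def cv_rmse(expected, predicted):
    """Coefficient of Variation of the RMSE in percent (Eq. 2)."""
    n = len(expected)
    mean_x = sum(expected) / n
    mse = sum((x - y) ** 2 for x, y in zip(expected, predicted)) / n
    return 100.0 * math.sqrt(mse) / mean_x


# Synthetic indoor air temperatures in degrees Celsius, for illustration only.
measured = [20.0, 20.0, 20.0, 20.0]
simulated = [19.0, 21.0, 19.0, 21.0]
print(nmbe(measured, simulated), cv_rmse(measured, simulated))
```

In this synthetic case the errors cancel in the NMBE but not in the CV-RMSE, which illustrates why the two indices are reported together.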
Table 7
Thermal transmittance values used in the TRNSYS models.
Umodel [W/(m2·K)]   Climatic zone
                    A-B     C       D       E       F
Uwall               0.444   0.379   0.336   0.297   0.276
Uroof               0.377   0.353   0.303   0.249   0.234
Ufloor              0.445   0.385   0.307   0.287   0.268
Uwindow             2.760   2.260   1.760   1.760   1.40
Each model was simulated for 13 geometrical configurations in 5 climatic zones represented by 3
different cities (Table 3). Moreover, because the building orientation and the wall azimuth influence
the solar radiation received on the façade, each model was simulated eight times, varying the
orientation by 45° each time, and averaging the results (for a total of 1560 simulations). In D’Amico
et al. [44], the results obtained from the parametric dynamic simulation are collected.
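The simulation counts quoted above follow from the size of the parametric grid. A minimal sketch of the enumeration, using only the counts stated in the text (13 geometrical configurations, 5 zones with 3 cities each, orientation steps of 45°), is given below; the variable names are hypothetical and the indices stand in for the actual shape factors and cities.

```python
from itertools import product

N_SHAPE_FACTORS = 13        # geometrical configurations
N_ZONES = 5                 # climatic zones considered
CITIES_PER_ZONE = 3         # maximum, minimum and mean HDD city per zone
ORIENTATION_STEP_DEG = 45   # rotation applied between successive runs

# Orientations 0, 45, ..., 315 degrees.
orientations = list(range(0, 360, ORIENTATION_STEP_DEG))
n_cities = N_ZONES * CITIES_PER_ZONE

# One scenario per (shape factor, city) pair; one run per scenario and orientation.
scenarios = list(product(range(N_SHAPE_FACTORS), range(n_cities)))
runs = list(product(scenarios, orientations))

print(len(orientations), len(scenarios), len(runs))  # prints: 8 195 1560
```

Each of the 195 scenarios therefore corresponds to the average of 8 orientation runs.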
• the Mean Absolute Error (MAE) represents the direct deviation between expected and
predicted output values (Eq. 3) [60]:
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| x_i - y_i \right| \tag{3}$$
• the Mean Square Error (MSE) calculates the variance between the target of a model and what
is going to be forecasted (Eq. 4) [64]:
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2 \tag{4}$$
• the Root Mean Square Error (RMSE) represents the square root of the quadratic mean of the
differences between predicted and expected values (Eq. 5).
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2} \tag{5}$$
• the Mean Absolute Percentage Error (MAPE) evaluates the absolute percentage deviation
between the predicted and expected values. It indicates the percentage error size that could be
used as a measure of the quality of a model’s output (Eq. 6) [65]:
$$\mathrm{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{x_i - y_i}{x_i} \right| \tag{6}$$
• the determination coefficient (R2) evaluates the manner in which a model approximates the
real data points, which is a measure of the predictability degree of the model [66]; the higher
R2, the more efficient the developed model (Eq. 7) [67]:
$$R^2 = 1 - \frac{\sum_{i=1}^{N} (x_i - y_i)^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \tag{7}$$
where
xi is the i-th expected output;
yi is the i-th predicted output;
x̄ is the average of all the expected outputs; and
N is the number of the identification set samples.
The MAE, MSE, RMSE and MAPE allow a comparison of the deviation between the predicted and
expected values of the building energy demand [66,68]. However, because the first three are based
on absolute errors, it is not possible to identify a specific criterion to find an optimal value for each
of them, but smaller values correspond to more precise models. Instead, the MAPE, being
independent of the scale, is more significant [68].
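The error indices of Eqs. 3 to 7 can be sketched in a single routine. This is an illustrative implementation under the definitions given above (x expected, y predicted), not the authors' tool; the function name and the sample energy demands are assumptions made for the example.

```python
import math


def error_indices(expected, predicted):
    """Return MAE, MSE, RMSE, MAPE (percent) and R2 as defined in Eqs. 3 to 7."""
    n = len(expected)
    residuals = [x - y for x, y in zip(expected, predicted)]
    mean_x = sum(expected) / n

    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    # MAPE divides by each expected value, so expected values must be non-zero.
    mape = 100.0 / n * sum(abs(r / x) for r, x in zip(residuals, expected))
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((x - mean_x) ** 2 for x in expected)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Synthetic annual energy demands, for illustration only.
expected = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
print(error_indices(expected, predicted))
```

With a constant absolute error of 10 in this synthetic example, MAE and RMSE coincide, while the MAPE weights the same error more heavily for the smaller expected values, which is the scale independence noted above.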
6. MLR Model
The multiple linear regression model allows an immediate assessment of building energy
requirements. As discussed before, the MLR model belongs to the black box category and is one of the
easiest and most intuitive prediction approaches. This method, although it ignores the
physical phenomena, still allows the stated objective to be reached without excessive
computational cost. Nonetheless, a large survey database on which the model can be
constructed is necessary. Therefore, compared to the physical model, MLR models have the
advantage of minimising the amount of input data, avoiding tedious work and the necessity for
powerful computing equipment [63]. The aim of this method is to explain the relationship between
the dependent variable (annual heating, cooling or comprehensive energy demand) and two or more
explanatory variables or regressors (climate and thermophysical parameters) using linear
combinations of the latter [69]. The MLR models were developed according to the most general
equation form (Eq. 8) [70]:
$$y_i = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p + e \tag{8}$$
where
$y_i$ is the i-th observation of the dependent variable (output);
$x_j$ is the j-th explanatory variable (input), with j = 1, ..., p;
$b_0$ is the intercept of the relationship;
$b_j$ is the j-th regression coefficient, i.e. the weight that the equation assigns to the j-th
explanatory variable when estimating the output; and
e is the error related to the i-th observation.
The MLR model is constructed with the least squares method, whose goal is to minimise the sum of
the squared errors between the expected and predicted outputs, as illustrated
in the following equation (Eq. 9) [69]:
$$\min \sum_{i=1}^{n}\left(y_i - \sum_{j=1}^{p} x_{ij}\, b_j - b_0\right)^2 \tag{9}$$
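A minimal sketch of the least squares fit of Eq. 9 is given below, assuming nothing about the software actually used by the authors. The function name `fit_mlr` and the noise-free synthetic data are hypothetical. The coefficients are obtained by solving the normal equations with Gauss-Jordan elimination, which keeps the example free of external libraries; a library routine based on QR or SVD would normally be preferred for ill-conditioned data.

```python
def fit_mlr(rows, targets):
    """Least squares fit of y = b0 + b1*x1 + ... + bp*xp (Eq. 8, Eq. 9).

    rows: list of [x1, ..., xp]; targets: list of y.
    Returns [b0, b1, ..., bp] from the normal equations (A^T A) b = A^T y.
    """
    design = [[1.0] + list(r) for r in rows]   # leading 1 carries the intercept b0
    m = len(design[0])
    # Augmented matrix [A^T A | A^T y]
    aug = [[sum(d[i] * d[j] for d in design) for j in range(m)]
           + [sum(d[i] * y for d, y in zip(design, targets))]
           for i in range(m)]
    for col in range(m):
        # Partial pivoting: bring the largest remaining entry onto the diagonal
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]

# Synthetic, noise-free data generated from y = 2 + 0.005*x1 + 10*x2
rows = [[1000, 0.2], [2000, 0.4], [3000, 0.3], [1500, 0.8], [2500, 0.6]]
targets = [2 + 0.005 * x1 + 10 * x2 for x1, x2 in rows]
b0, b1, b2 = fit_mlr(rows, targets)
print(b0, b1, b2)
```

On noise-free data the fit recovers the generating coefficients up to rounding, which is a convenient sanity check before applying the same routine to simulated demands.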
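$$r = \frac{\sigma_{xy}}{\sigma_x\,\sigma_y} \tag{10}$$
where $\sigma_{xy}$ is the covariance between the x and y variables and is calculated as:
$$\sigma_{xy} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right) \tag{11}$$
and $\sigma_x$ and $\sigma_y$ are the standard deviations of each variable and are calculated as:
$$\sigma_x = \sqrt{\frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}{n}} \tag{12}$$
$$\sigma_y = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}{n}} \tag{13}$$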
The r coefficient measures the linear correlation between the two analysed variables and may
assume values in the range -1 ≤ r ≤ 1; the value 1 represents a total positive linear correlation, the
value -1 indicates a total negative linear correlation and 0 means that there is no linear correlation.
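The Pearson coefficient can be computed directly from its definition. The sketch below follows Eq. 10 to Eq. 13 term by term; the function name `pearson_r` and the sample series are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n  # Eq. 11
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)            # Eq. 12
    sigma_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)            # Eq. 13
    return cov_xy / (sigma_x * sigma_y)                                    # Eq. 10

# Hypothetical series: y grows (or falls) exactly linearly with x
xs = [1.0, 2.0, 3.0, 4.0]
r_positive = pearson_r(xs, [2.0, 4.0, 6.0, 8.0])
r_negative = pearson_r(xs, [8.0, 6.0, 4.0, 2.0])
print(r_positive, r_negative)
```

The two series reproduce the limiting cases described above: a total positive and a total negative linear correlation.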
The authors calculated the r coefficient for each parameter representative of the energy database
constructed in section 4.2 and then applied a sensitivity analysis to identify the parameters that
most affect the building thermal balance and that can be used in the MLR model. The following
graphs (Fig. 4 to Fig. 9) illustrate the linear regressions of the heating and cooling energy demand
of the “ideal building” model against the main variables affecting its dynamic behaviour: HDD, CDD,
external temperature (T), S/V, glazed surface (Sw), opaque surface (Sop) and internal gains (QG). For
each trend, the determination coefficient (R2) and the r coefficient are also displayed.
[Figure: (a) Hd [kWh/m2 year] versus HDD [K day], R² = 0.8444, r = 0.919; (b) Cd [kWh/m2 year]
versus CDD [K day], R² = 0.41, r = 0.640.]
Fig. 4. Linear regression analysis between the Hd and HDD (a) and between the Cd and CDD (b).
[Figure: (a) Hd [kWh/m2 year] versus T [°C], R² = 0.8714, r = -0.933; (b) Cd [kWh/m2 year]
versus T [°C], R² = 0.740, r = 0.860.]
Fig. 5. Linear regression analysis between the Hd and T (a) and between the Cd and T (b).
[Figure: (a) Hd [kWh/m2 year] versus S/V [m-1], R² = 0.6461, r = 0.804; (b) Cd [kWh/m2 year]
versus S/V [m-1], R² = 0.121, r = -0.348.]
Fig. 6. Linear regression analysis between the Hd and S/V (a) and between the Cd and S/V (b).
[Figure: (a) Hd [kWh/m2 year] versus Sw [m2], R² = 0.5401, r = -0.735; (b) Cd [kWh/m2 year]
versus Sw [m2], R² = 0.014, r = 0.118.]
Fig. 7. Linear regression analysis between the Hd and Sw (a) and between the Cd and Sw (b).
[Figure: (a) Hd [kWh/m2 year] versus Sop [m2], R² = 0.0864, r = 0.294; (b) Cd [kWh/m2 year]
versus Sop [m2], R² = 0.626, r = -0.791.]
Fig. 8. Linear regression analysis between the Hd and Sop (a) and between the Cd and Sop (b).
[Figure: (a) Hd [kWh/m2 year] versus QG [kWh/year], R² = 0.1473, r = -0.384; (b) Cd [kWh/m2 year]
versus QG [kWh/year], R² = 0.009, r = 0.095.]
Fig. 9. Linear regression analysis between the Hd and QG (a) and between the Cd and QG (b).
A first criterion for identifying the variables that are significant for the studied phenomenon is to
sort the variables in descending order of the absolute value of the coefficient r and to select those
whose r is significantly different from zero. Another identification criterion is an empirical rule
that (for high values of n) selects the variables for which the absolute value of r is greater than
2/√n [77]. Fig. 10 displays the sensitivity analysis between the input variables and the
heating/cooling energy demand based on the r coefficient.
[Figure: horizontal bar chart titled “Sensitivity analysis”; explanatory variables HDD, CDD, T,
R.U., S/V, Sw, Sop and QG on the vertical axis, Pearson coefficient from -1.0 to 1.0 on the
horizontal axis, with separate bars for Cd and Hd.]
Fig. 10. Pearson correlation coefficients of input variables for Hd and Cd.
To calculate the r correlation coefficient, the values assumed in each selected city (for the climatic
parameters) and the values assumed in each “ideal building” model (for the thermophysical and
geometric parameters) were considered. Applying the empirical selection criterion
previously described, only the variables whose r correlation coefficient exceeds 0.55 in absolute
value can be considered for the implementation of the regression model. Based on these
considerations and on the sensitivity analysis shown in Fig. 10, HDD, T, S/V and Sw should be
selected for the heating energy demand forecast and CDD, T and Sop for the cooling energy demand
evaluation. However, based on the previous results obtained in Ciulla et al. [48,78] and in D'Amico
et al. [44], the authors considered the S/V parameter indispensable for the building cooling load
evaluation as well. Conversely, because the two climatic indices are a function of the external
temperatures, the temperature was excluded from the input variables of the linear regression model
as redundant. Further, HDD and CDD data, often tabulated in laws and standards, are easier to
obtain than the average monthly temperatures. Regarding the high linear correlation of the glazed
and opaque surfaces with the heating and cooling energy demand respectively, it is possible to
affirm that, all other conditions being fixed, a larger glazed surface increases the solar gain, so
that Hd decreases and Cd increases.
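The selection step can be checked with a short sketch. It applies the 0.55 threshold to the absolute value of the r coefficients reported in Fig. 4 to Fig. 9; the function name `select_variables` is hypothetical and the dictionaries contain only the values shown in those figures.

```python
# Pearson coefficients as reported in Fig. 4 to Fig. 9
r_heating = {"HDD": 0.919, "T": -0.933, "S/V": 0.804,
             "Sw": -0.735, "Sop": 0.294, "QG": -0.384}
r_cooling = {"CDD": 0.640, "T": 0.860, "S/V": -0.348,
             "Sw": 0.118, "Sop": -0.791, "QG": 0.095}

def select_variables(r_values, threshold=0.55):
    """Keep the variables with |r| above the threshold, strongest first."""
    kept = [(name, r) for name, r in r_values.items() if abs(r) > threshold]
    kept.sort(key=lambda item: abs(item[1]), reverse=True)
    return [name for name, _ in kept]

heating_vars = select_variables(r_heating)
cooling_vars = select_variables(r_cooling)
print(heating_vars)  # ['T', 'HDD', 'S/V', 'Sw']
print(cooling_vars)  # ['T', 'Sop', 'CDD']
```

The result matches the candidate sets listed above. The subsequent choices (dropping T as redundant and retaining S/V for cooling) are engineering decisions made by the authors and are not reproduced by the threshold alone.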
$$H_d = k + \alpha_1\,\mathrm{HDD} + \alpha_2\,\frac{S}{V} \tag{14}$$
where
$\alpha_1$ is the first regression coefficient [kWh/(m2 year K day)];
$\alpha_2$ is the second regression coefficient [kWh/(m year)]; and
k is the intercept [kWh/m2 year].
Fig. 11. Scatter plot and regression plane for the Hd.
The second form, which expresses Hd as a function of HDD, S/V and Sw, is given by Eq. 15:
$$H_d^{*} = k + \alpha_1^{*}\,\mathrm{HDD} + \alpha_2^{*}\,\frac{S}{V} + \alpha_3^{*}\,S_w \tag{15}$$
where
$\alpha_1^{*}$ is the first regression coefficient [kWh/(m2 year K day)];
$\alpha_2^{*}$ is the second regression coefficient [kWh/(m year)];
$\alpha_3^{*}$ is the third regression coefficient [kWh/(m4 year)]; and
k is the intercept [kWh/m2 year].
The values of all regression coefficients, intercepts and R2 for both equations, obtained from the
application of the least square method between the expected and predicted outputs for 85% of the
database values are collected in Table 8; in both cases R2 is close to 0.9.
Table 8
Partial regression coefficients and R2 for the Hd.
| Correlation | k | α1 | α2 | α3 | R2 |
|---|---|---|---|---|---|
| Hd | -7.3203 | 0.0053781 | 19.4008 | - | 0.898 |
| Hd* | -2.3015 | 0.0053839 | 14.4288 | -0.0056909 | 0.900 |
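To illustrate how the two heating correlations are used, the sketch below evaluates Eq. 14 and Eq. 15 with the Table 8 coefficients. The function names and the input values (HDD = 2000 K day, S/V = 0.5 m-1, Sw = 300 m2) are hypothetical and are not taken from the energy database.

```python
def heating_demand(hdd, s_over_v):
    """Eq. 14 with the Table 8 coefficients; result in kWh/m2 year."""
    return -7.3203 + 0.0053781 * hdd + 19.4008 * s_over_v

def heating_demand_star(hdd, s_over_v, s_w):
    """Eq. 15 with the Table 8 coefficients; result in kWh/m2 year."""
    return -2.3015 + 0.0053839 * hdd + 14.4288 * s_over_v - 0.0056909 * s_w

# Hypothetical building: HDD in K day, S/V in m-1, Sw in m2
hd = heating_demand(2000.0, 0.5)
hd_star = heating_demand_star(2000.0, 0.5, 300.0)
print(round(hd, 2), round(hd_star, 2))  # 13.14 13.97
```

For these inputs the two forms differ by less than 1 kWh/m2 year, in line with their nearly identical R2 values.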
$$C_d = k + \alpha_1\,\mathrm{CDD} + \alpha_2\,\frac{S}{V} \tag{16}$$
where
$\alpha_1$ is the first regression coefficient [kWh/(m2 year K day)];
$\alpha_2$ is the second regression coefficient [kWh/(m year)]; and
k is the intercept [kWh/m2 year].
Fig. 12. Scatter plot and regression plane for the Cd.
The second form is represented by Eq. 17:
$$C_d^{*} = k + \alpha_1^{*}\,\mathrm{CDD} + \alpha_2^{*}\,\frac{S}{V} + \alpha_3^{*}\,S_{op} \tag{17}$$
where
$\alpha_1^{*}$ is the first regression coefficient [kWh/(m2 year K day)];
$\alpha_2^{*}$ is the second regression coefficient [kWh/(m year)];
$\alpha_3^{*}$ is the third regression coefficient [kWh/(m4 year)]; and
k is the intercept [kWh/m2 year].
Also in this case, the values of all regression coefficients, intercepts and R2 for both equations were
obtained from the application of the least square method for 85% of the data.
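The source does not state how the 85% identification share was drawn from the database. A minimal sketch, assuming a seeded random split, is given below; the function name `split_identification_validation` and the placeholder scenario identifiers are hypothetical.

```python
import random

def split_identification_validation(records, identification_share=0.85, seed=0):
    """Shuffle a copy of the records and cut it into identification and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # seeded, so the split is reproducible
    cut = round(identification_share * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

scenarios = list(range(100))  # placeholder scenario identifiers
identification, validation = split_identification_validation(scenarios)
print(len(identification), len(validation))  # 85 15
```

Fixing the seed keeps the identification and validation sets identical across runs, so that error indices computed on the validation set remain comparable between correlations.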
It should be noted that some of the sample data were discarded because of an inconsistency between
the CDD value and the cooling demand calculated with the TRNSYS models. More
specifically, for the cities where the CDD value was not initially provided, the parameter was
calculated by the authors; for Cortina and Sestriere in particular, the CDD was assessed
as zero (section 3). A zero CDD would imply that the cooling system is never switched on, but since
the current standard establishes a standard cooling period valid for all Italian cities regardless of
area, the TRNSYS simulation returned an unjustified cooling requirement. Therefore, for the
determination of the cooling energy demand, the values linked to the models of the cities of Cortina
and Sestriere were eliminated (26 fewer scenarios). Table 9 collects all parameters of Eq. 16 and
Eq. 17; the R2 values are higher than 0.9.
Table 9
Partial regression coefficients and R2 for the Cd.
| Correlation | k | α1 | α2 | α3 | R2 |
|---|---|---|---|---|---|
| Cd | 30.5767 | 0.0064923 | -11.0297 | - | 0.906 |
| Cd* | 41.4031 | 0.0041604 | -13.0856 | -0.0020440 | 0.962 |
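As a worked check of Eq. 16 and Eq. 17 with the Table 9 coefficients, consider a purely hypothetical building with CDD = 200 K day, S/V = 0.5 m-1 and Sop = 3000 m2 (illustrative inputs, not taken from the database):

```latex
\begin{aligned}
C_d &= 30.5767 + 0.0064923 \cdot 200 - 11.0297 \cdot 0.5
     \approx 26.36\ \mathrm{kWh/(m^2\,year)} \\
C_d^{*} &= 41.4031 + 0.0041604 \cdot 200 - 13.0856 \cdot 0.5 - 0.0020440 \cdot 3000
     \approx 29.56\ \mathrm{kWh/(m^2\,year)}
\end{aligned}
```

The negative S/V coefficient means that, in this fit, more compact buildings (lower S/V) show a higher specific cooling demand.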
$$E_d = k + \alpha_1\,(\mathrm{HDD} + \mathrm{CDD}) + \alpha_2\,\frac{S}{V} \tag{18}$$
where
$\alpha_1$ is the first regression coefficient [kWh/(m2 year K day)];
$\alpha_2$ is the second regression coefficient [kWh/(m year)]; and
k is the intercept [kWh/m2 year].
Fig. 13. Scatter plot and regression plane for the Ed.
The second correlation form, instead, considers the two weather indices in two different explanatory
variables (Eq. 19):
$$E_{d1} = k + \alpha_1\,\mathrm{HDD} + \alpha_2\,\mathrm{CDD} + \alpha_3\,\frac{S}{V} \tag{19}$$
where
$\alpha_1$, $\alpha_2$ are the first and second regression coefficients [kWh/(m2 year K day)];
$\alpha_3$ is the third regression coefficient [kWh/(m year)]; and
k is the intercept [kWh/m2 year].
Finally, to consider the strong correlation among the energy demand and the Sw and Sop parameters,
a more complicated correlation is proposed in which the value of Ed is a function of five parameters
(Eq. 20):
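$$E_d^{*} = k + \alpha_1^{*}\,\mathrm{HDD} + \alpha_2^{*}\,\mathrm{CDD} + \alpha_3^{*}\,\frac{S}{V} + \alpha_4^{*}\,S_w + \alpha_5^{*}\,S_{op} \tag{20}$$
where
$\alpha_1^{*}$, $\alpha_2^{*}$ are the first and second regression coefficients [kWh/(m2 year K day)];
$\alpha_3^{*}$ is the third regression coefficient [kWh/(m year)];
$\alpha_4^{*}$, $\alpha_5^{*}$ are the fourth and fifth regression coefficients [kWh/(m4 year)]; and
k is the intercept [kWh/m2 year].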
The collection of the regression coefficients and the intercept values for each correlation, and the
comparison of the determination coefficients is reported in Table 10.
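Table 10
Partial regression coefficients and R2 for the Ed.
| Correlation | k | α1 | α2 | α3 | α4 | α5 | R2 |
|---|---|---|---|---|---|---|---|
| Ed | 32.5597 | -0.0006188 | 10.7855 | - | - | - | 0.950 |
| Ed1 | 33.6326 | -0.0008445 | -0.0041735 | 10.8133 | - | - | 0.950 |
| Ed* | 49.342 | -0.0008874 | -0.0058240 | -1.35286 | -0.0131923 | -0.0007279 | 0.959 |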
The results confirm that using HDD and CDD as a single explanatory variable or as two distinct
variables makes no practical difference: the determination coefficient is the same (0.950), and in
all three cases it is at least 0.95.
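The near equivalence of the first two forms can also be seen on a single evaluation. The sketch below codes Eq. 18, Eq. 19 and Eq. 20 with the Table 10 coefficients; the function names and the input values (HDD = 2000 K day, CDD = 200 K day, S/V = 0.5 m-1, Sw = 300 m2, Sop = 3000 m2) are hypothetical.

```python
def ed(hdd, cdd, s_over_v):
    """Eq. 18: HDD and CDD merged into a single regressor (Table 10, row Ed)."""
    return 32.5597 - 0.0006188 * (hdd + cdd) + 10.7855 * s_over_v

def ed1(hdd, cdd, s_over_v):
    """Eq. 19: HDD and CDD as two distinct regressors (Table 10, row Ed1)."""
    return 33.6326 - 0.0008445 * hdd - 0.0041735 * cdd + 10.8133 * s_over_v

def ed_star(hdd, cdd, s_over_v, s_w, s_op):
    """Eq. 20: five regressors (Table 10, row Ed*)."""
    return (49.342 - 0.0008874 * hdd - 0.0058240 * cdd - 1.35286 * s_over_v
            - 0.0131923 * s_w - 0.0007279 * s_op)

# Hypothetical building; all results in kWh/m2 year
e_single = ed(2000.0, 200.0, 0.5)
e_split = ed1(2000.0, 200.0, 0.5)
e_full = ed_star(2000.0, 200.0, 0.5, 300.0, 3000.0)
print(round(e_single, 2), round(e_split, 2), round(e_full, 2))  # 36.59 36.52 39.58
```

For these inputs Ed and Ed1 differ by less than 0.1 kWh/m2 year, while the five-parameter form shifts the estimate because it also accounts for the glazed and opaque surfaces.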
[Figure: residuals plotted against predicted Hd* [kWh/m2year], one panel for the identification set
and one for the validation set.]
Fig. 14. Residual trend of Hd* correlation for identification and validation set.
[Figure: residuals plotted against predicted Cd* [kWh/m2year], one panel for the identification set
and one for the validation set.]
Fig. 15. Residual trend of Cd* correlation for identification and validation set.
[Figure: residual Ed* [kWh/m2year] plotted against predicted Ed* [kWh/m2year], with one panel
titled “Identification set” and one titled “Validation set”.]
Fig. 16. Residual trend of Ed* correlation for identification and validation set.
Put simply, a residual is the difference between an expected value and the corresponding predicted
value; in these cases it lies within ± 20%, both in the identification and validation sets.
Fig. 14 plots the residual trends of the Hd* correlation for all data from the identification and
validation sets, and Fig. 15 those of the Cd* correlation. In this second case, as
explained in section 6.2.2, the data sample contains fewer cases because there
are no model results related to the cities of Cortina and Sestriere. Fig. 16 plots the residual trends
of the Ed* correlation.
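A residual check of this kind can be sketched as follows. The sign convention (expected minus predicted) and the normalisation of the percentage residual by the expected value are assumptions made for the example, as are the function names and the sample values.

```python
def residuals(expected, predicted):
    """Residual per sample, taken here as expected minus predicted."""
    return [x - y for x, y in zip(expected, predicted)]

def percent_residuals(expected, predicted):
    """Residual expressed as a percentage of the expected value."""
    return [100.0 * (x - y) / x for x, y in zip(expected, predicted)]

def within_band(expected, predicted, band=20.0):
    """True when every percentage residual lies inside +/- band."""
    return all(abs(p) <= band for p in percent_residuals(expected, predicted))

# Hypothetical expected and predicted demands in kWh/m2 year
expected = [10.0, 20.0, 30.0, 40.0]
predicted = [11.5, 18.0, 33.0, 39.0]
res = residuals(expected, predicted)
ok_20 = within_band(expected, predicted)             # all residuals within 20%
ok_12 = within_band(expected, predicted, band=12.0)  # first sample is off by 15%
print(res, ok_20, ok_12)
```

Plotting the residuals against the predicted values, as in the figures, additionally shows whether the error grows with the magnitude of the demand, which a single band check cannot reveal.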
In addition to the calculation of R2 values, to validate the reliability of the MLR models, four other
statistical errors were calculated: the MSE, MAE, RMSE and MAPE. In Fig. 17, the statistical
analysis of the error based on the validation dataset is represented.
[Figure: radar charts comparing the error indices of the Hd, Hd*, Cd, Cd*, Ed, Ed1 and Ed*
correlations on the validation set; one chart reports percentage values.]
Fig. 17. Statistical analysis of the Validation set for the MLR models.
The MSE distribution shows that the best performance belongs to the heating energy demand
correlations (Eq. 14 and Eq. 15), while the worst belongs to the cooling energy demand correlation
Cd (Eq. 16). Among the comprehensive energy correlations, the best is Ed* (Eq. 20). The same
considerations are valid for MAE and RMSE. As for MAPE, the best results are given by Ed*, while
the heating energy demand correlations are less efficient. In general, Ed*, Hd* and Cd* are the
best correlations for solving the thermal energy balance of a building; these results are also
confirmed by the high R2 values determined in section 6.2. Table 11 collects all correlations and
the respective statistical errors.
Table 11
MLR correlations and respective statistical errors.
As explained previously, the more complicated correlations are characterised by better quality and
reliability; in general, the high value of R2 and the low values of MAE and RMSE justify the use of
the MLR methodology as a good alternative for determining the building energy performance. The
MLR method represents a simple and immediate tool which can solve a complex problem, such as
the building energy balance, and can accelerate and help some aspects of energy planning.
8. Conclusion
In this work, the authors explain that selecting the most suitable method for solving a
given problem is important because it makes it possible to overcome certain limits, to identify a
generic solution able to interpret any condition and to accelerate the resolution with high reliability.
After a review of the main methods for solving the building energy balance that are widespread in
the literature, the authors investigated two of these: a comprehensive analysis with TRNSYS software
and the Multiple Linear Regression method.
As explained in the paper, the first method, belonging to the white box category, allows the
determination of the building energy performance with a high degree of reliability if the model is
correctly developed and calibrated. Indeed, high reliability is a function of a detailed data collection
phase (representative of the model), careful calibration, and the presence of an expert user who knows
the software tool language and the studied physical phenomena. These conditions permit the
development of an accurate model which represents the actual conditions well. Based on this first result,
in order to obtain a generic solution, a parametric simulation was developed that solves the building
energy balance, simultaneously changing the weather conditions, the shape factor and the
thermophysical characteristics of the building. In this way, 1560 simulations of a representative
building stock were obtained for non-residential buildings designed with high energy performance
located in the Italian peninsula.
However, although the parametric simulation solves several scenarios, simultaneously obtaining 1560
results, it is not able to give a generic indication because each single simulation gives a single specific
response for a model under certain boundary conditions and characterised by specific thermophysical
choices. Indeed, to generalise the results, it is necessary to analyse all of the thermal energy results
obtained from the parametric simulation. Careful sensitivity analysis on the 1560 simulation results,
based on the identification of the Pearson coefficient, allowed the identification of the main
parameters that influence the building thermal balance during the heating, cooling and entire
climatisation periods. Thanks to this analysis and the use of all simulation data, the authors decided
to explore the Multiple Linear Regression technique belonging to the black box methods. This method
allowed a linear relationship to be modelled between two or more explanatory variables, which
represent the inputs of the model and a response variable through a fitting procedure. As a result,
some simple correlations were developed knowing only a few groups of well-known parameters, and
identifying the heating, cooling and comprehensive energy needs of a building with a high degree of
reliability. Indeed, these correlations are characterised by optimal statistical error values; for example,
the determination coefficients are higher than 0.9 and the Mean Absolute Error and Root Mean Square
error are lower than 10 kWh/m2year. The reliability and flexibility of the energy database allowed the
identification of solutions that simultaneously respond to changes in climate and building shape
factor, obtaining generic solutions which can explain any possible building typology in any
conditions.
The promising results justify the use of Multiple Linear Regression as an alternative method that
provides a simple and immediate tool able to solve a complex problem like the building energy
balance, thereby accelerating and supporting some evaluation phases in energy planning and offering
a valid criterion that could be indicated in standards and laws in the field of building energy
performance.
References
[1] European Parliament and Council. Directive 2010/31/EU on the energy performance of
buildings. Off J Eur Union 2010. doi:10.3000/17252555.L_2010.153.eng.
[2] Poel B, van Cruchten G, Balaras CA. Energy performance assessment of existing dwellings.
Energy Build 2007;39:393–407. doi:10.1016/j.enbuild.2006.08.008.
[3] Balaras CA, Gaglia AG, Georgopoulou E, Sarafidis Y, Lalas D, Mirasgedis S. European
residential buildings and empirical assessment of the Hellenic building stock, energy
consumption, emissions and potential energy savings. Build Environ 2007;42:1298–314.
doi:10.1016/j.buildenv.2005.11.001.
[4] Zhao H-X, Magoulès F. A review on the prediction of building energy consumption. Renew
Sustain Energy Rev 2012;16:3586–92. doi:10.1016/j.rser.2012.02.049.
[5] Foucquier A, Robert S, Suard F, Stéphan L, Jay A. State of the art in building modelling and
energy performances prediction: A review. Renew Sustain Energy Rev 2013;23:272–88.
doi:10.1016/j.rser.2013.03.004.
[6] Scafetta N, Fortelli A, Mazzarella A. Meteo-climatic characterization of Naples and its
heating-cooling degree day areal distribution. Int J Heat Technol 2017;35.
doi:10.18280/ijht.35sp0119.
[7] Atalla T, Gualdi S, Lanza A. A global degree days database for energy-related applications.
Energy 2018;143:1048–55. doi:10.1016/j.energy.2017.10.134.
[8] Gi K, Sano F, Hayashi A, Tomoda T, Akimoto K. A global analysis of residential heating and
cooling service demand and cost-effective energy consumption under different climate change
scenarios up to 2050. Mitig Adapt Strateg Glob Chang 2018;23:51–79.
doi:10.1007/s11027-016-9728-6.
[9] Al-Homoud MS. Computer-aided building energy analysis techniques. Build Environ
2001;36:421–33. doi:10.1016/S0360-1323(00)00026-3.
[10] White JA, Reichmuth R. Simplified method for predicting building energy consumption using
average monthly temperatures. IECEC 96. Proc. 31st Intersoc. energy Convers. Eng. Conf.
IEEE, 1996, p. 1834–9. doi:10.1109/iecec.1996.553381.
[11] Westphal FS, Lamberts R. The use of simplified weather data to estimate thermal loads of non-
residential buildings. Energy Build 2004;36:847–54. doi:10.1016/j.enbuild.2004.01.007.
[12] Crawley DB, Hand JW, Kummert M, Griffith BT. Contrasting the capabilities of building
energy performance simulation programs. Build Environ 2008;43:661–73.
doi:10.1016/j.buildenv.2006.10.027.
[13] Brun A, Spitz C, Wurtz E, Mora L. Behavioural comparison of some predictive tools used in
a low-energy building. 11th Int. IBPSA Conf. Build. Simul. 2009, 2009, p. 1185–90.
[14] Woloszyn M, Rode C. Tools for performance simulation of heat, air and moisture conditions
of whole buildings. Build Simul 2008;1:5–24. doi:10.1007/s12273-008-8106-z.
[15] ANSYS. ANSYS Fluent Software | CFD Simulation 2012.
[16] COMSOL. Simulation Software COMSOL Multiphysics®. 1998 n.d.
[17] CHAM. CHAM | PHOENICS 2005.
[18] Wurtz E, Mora L, Inard C. An equation-based simulation environment to investigate fast
building simulation. Build Environ 2006;41:1571–83. doi:10.1016/j.buildenv.2005.06.027.
[19] Haghighat F, Li Y, Megri AC. Development and validation of a zonal model - POMA. Build
Environ 2001;36:1039–47. doi:10.1016/S0360-1323(00)00073-1.
[20] EnergyPlusTM n.d.
[21] ESP-r n.d.
[22] Building Performance - Simulation Software | EQUA n.d.
[23] Bonneau D, Rongere FX, Covalet D, Gautier B. Clim2000: Modular software for energy
simulation in buildings. Proc IBPSA 1993;93.
[24] Woloszyn M, Rusaouen G, Covalet D. Whole building simulation tools: Clim2000. IEA Annex
2004;41.
[25] Rode C, Grau K. Whole building hygrothermal simulation model. ASHRAE Trans., 2003, p.
572–82.
[26] Rode C, Grau K. Integrated calculation of hygrothermal conditions of buildings. Proc. 6th
Symp. Build. Phys. Nord. Ctries., vol. 1, 2002, p. 23–30.
[27] BuildOpt-VIE n.d.
[28] Li Z, Han Y, Xu P. Methods for benchmarking building energy consumption against its past
or intended performance: An overview. Appl Energy 2014;124:325–34.
[29] Bauer M, Scartezzini J-L. A simplified correlation method accounting for heating and cooling
loads in energy-efficient buildings. Energy Build 1998;27:147–54.
doi:10.1016/S0378-7788(97)00035-2.
[30] Westergren K-E, Högberg H, Norlén U. Monitoring energy consumption in single-family
houses. Energy Build 1999;29:247–57. doi:10.1016/S0378-7788(98)00065-6.
[31] Pfafferott J, Herkel S, Wapler J. Thermal building behaviour in summer: long-term data
evaluation using simplified models. Energy Build 2005;37:844–52.
doi:10.1016/J.ENBUILD.2004.11.007.
[32] Ansari FA, Mokhtar AS, Abbas KA, Adam NM. A Simple Approach for Building Cooling
Load Estimation. Am J Environ Sci 2005;1:209–12.
[33] Dhar A, Reddy TA, Claridge DE. A Fourier series model to predict hourly heating and cooling
energy use in commercial buildings with outdoor temperature as the only weather variable. J
Sol Energy Eng 1999;121:47–53.
[34] Dhar A, Reddy TA, Claridge DE. Modeling hourly energy use in commercial buildings with
Fourier series functional forms. J Sol Energy Eng 1998;120:217–23.
[35] Parti M, Parti C. The total and appliance-specific conditional demand for electricity in the
household sector. Bell J Econ 1980:309–21.
[36] Kialashaki A, Reisel JR. Modeling of the energy demand of the residential sector in the United
States using regression models and artificial neural networks. Appl Energy 2013;108:271–80.
[37] Aydinalp-Koksal M, Ugursal VI. Comparison of neural network, conditional demand analysis,
and engineering approaches for modeling end-use energy consumption in the residential sector.
Appl Energy 2008;85:271–96. doi:10.1016/j.apenergy.2006.09.012.
[38] Olofsson T, Andersson S, Östin R. A method for predicting the annual building heating
demand based on limited performance data. Energy Build 1998;28:101–8.
doi:10.1016/S0378-7788(98)00004-8.
[39] Ekici BB, Aksoy UT. Prediction of building energy consumption by using artificial neural
networks. Adv Eng Softw 2009;40:356–62. doi:10.1016/j.advengsoft.2008.05.003.
[40] Cortes C, Vapnik V. Support-vector networks. Mach Learn 1995;20:273–97.
doi:10.1007/BF00994018.
[41] Dong B, Cao C, Lee SE. Applying support vector machines to predict building energy
consumption in tropical region. Energy Build 2005;37:545–53.
doi:10.1016/J.ENBUILD.2004.09.009.
[42] Lai F, Magoulès F, Lherminier F. Vapnik’s learning theory applied to energy consumption
forecasts in residential buildings. Int J Comput Math 2008;85:1563–88.
doi:10.1080/00207160802033582.
[43] Ciulla G, D’Amico A, Lo Brano V, Traverso M. Application of optimized artificial intelligence
algorithm to evaluate the heating energy demand of non-residential buildings at European
level. Energy 2019. doi:10.1016/J.ENERGY.2019.03.168.
[44] D’Amico A, Ciulla G, Panno D, Ferrari S. Building energy demand assessment through heating
degree days: The importance of a climatic dataset. Appl Energy 2019;242:1285–306.
doi:10.1016/J.APENERGY.2019.03.167.
[45] Il Presidente della Repubblica. Regolamento recante norme per la progettazione,
l’installazione, l’esercizio e la manutenzione degli impianti termici degli edifici ai fini del
contenimento dei consumi di energia, in attuazione dell’art. 4, comma 4, della legge 9 gennaio
1991, n. 10. Gazz Uff Della Repubb Ital SO 1993.
[46] Decreto 26 giugno 2015. Applicazione delle metodologie di calcolo delle prestazioni
energetiche e definizione delle prescrizioni e dei requisiti minimi degli edifici; Adeguamento
del decreto del Ministro dello sviluppo economico, 26 giugno 2009 - Linee guida nazionali per
la cer 2015.
[47] Ente Nazionale Italiano di Normazione. UNI 10349:2016 “Riscaldamento e raffrescamento
degli edifici - Dati climatici. Ente Naz Ital Di Normaz 2016.
[48] Ciulla G, Lo Brano V, D’Amico A. Modelling relationship among energy demand, climate and
office building features: A cluster analysis at European level. Appl Energy 2016.
doi:10.1016/j.apenergy.2016.09.046.
[49] Zhao HX, Magoulès F. A review on the prediction of building energy consumption. Renew
Sustain Energy Rev 2012;16:3586–92. doi:10.1016/j.rser.2012.02.049.
[50] Meteonorm- Global Meteorological Database- Version7. Software and data for engineers,
planners and education n.d.
[51] ISO, EN. EN ISO 13790: 2008. Energy performance of buildings-Calculation of energy use
for space heating and cooling. Eur Comm Stand (CEN), Brussels 2008.
[52] Mustafaraj G, Marini D, Costa A, Keane M. Model calibration for building energy efficiency
simulation. Appl Energy 2014;130:72–85. doi:10.1016/j.apenergy.2014.05.019.
[53] Royapoor M, Roskilly T. Building model calibration using energy and environmental data.
Energy Build 2015;94:109–20. doi:10.1016/j.enbuild.2015.02.050.
[54] ANSI/ASHRAE. ASHRAE Guideline 14: Measurement of Energy and Demand Savings.
2014.
[55] DOE US. M&V guidelines: measurement and verification for performance-based contracts -
version 4.0. Fed Energy Manag Progr 2015. doi:10.1039/c8ew00545a.
[56] Efficiency Valuation Organization. International Performance Measurement and Verification
Protocol: Concepts and Options for Determining Energy and Water Savings, Volume I. Energy
Proj Financ Resour … 2012. doi:DOE/GO-102002-1554.
[57] Yun K, Luck R, Mago PJ, Cho H. Building hourly thermal load prediction using an indexed
ARX model. Energy Build 2012;54:225–33. doi:10.1016/j.enbuild.2012.08.007.
[58] Ruiz GR, Bandera CF. Validation of calibrated energy models: Common errors. Energies
2017. doi:10.3390/en10101587.
[59] Chae YT, Horesh R, Hwang Y, Lee YM. Artificial neural network model for forecasting sub-
hourly electricity usage in commercial buildings. Energy Build 2016;111:184–94.
doi:10.1016/j.enbuild.2015.11.045.
[60] Yezioro A, Dong B, Leite F. An applied artificial intelligence approach towards assessing
building performance simulation tools. Energy Build 2008;40:612–20.
doi:10.1016/j.enbuild.2007.04.014.
[61] Catalina T, Virgone J, Blanco E. Development and validation of regression models to predict
monthly heating demand for residential buildings n.d. doi:10.1016/j.enbuild.2008.04.001.
[62] Ciulla G, D’Amico A, Lo Brano V, Beccali M. ANN decision support tool for the prediction
of the thermal energy performance of European top rated energy efficient non-residential
buildings. Conf Proc 12th SDEWES Held Dubrovnik, 4 to 8 Oct 2017 2017.
[63] Catalina T, Iordache V, Caracaleanu B. Multiple regression model for fast prediction of the
heating energy demand. Energy Build 2013. doi:10.1016/j.enbuild.2012.11.010.
[64] Ahmad T, Chen H. Short and medium-term forecasting of cooling and heating load demand in
building environment with data-mining based approaches. Energy Build 2018;166:460–76.
doi:10.1016/j.enbuild.2018.01.066.
[65] Xuan Z, Xuehui Z, Liequan L, Zubing F, Junwei Y, Dongmei P. Forecasting performance
comparison of two hybrid machine learning models for cooling load of a large-scale
commercial building. J Build Eng 2019;21:64–73. doi:10.1016/j.jobe.2018.10.006.
[66] Fu G. Deep belief network based ensemble approach for cooling load forecasting of air-
conditioning system. Energy 2018;148:269–82. doi:10.1016/j.energy.2018.01.180.
[67] Elhami B, Khanali M, Akram A. Combined application of Artificial Neural Networks and life
cycle assessment in lentil farming in Iran. Inf Process Agric 2017;4:18–32.
doi:10.1016/j.inpa.2016.10.004.
[68] Son H, Kim C. Short-term forecasting of electricity demand for the residential sector using
weather and social variables. Resour Conserv Recycl 2017;123:200–7.
doi:10.1016/j.resconrec.2016.01.016.
[69] Deng H, Fannon D, Eckelman MJ. Predictive modeling for US commercial building energy
use: A comparison of existing statistical and machine learning algorithms using CBECS
microdata. Energy Build 2018. doi:10.1016/j.enbuild.2017.12.031.
[70] Darlington RB, Hayes AF. Regression Analysis and Linear Models: Concepts, Application
and Implementation. 2016. doi:10.1016/0141-1187(83)90072-X.
[71] Abdipour M, Younessi-Hmazekhanlu M, Ramazani SHR, Omidi AH. Artificial neural
networks and multiple linear regression as potential methods for modeling seed yield of
safflower (Carthamus tinctorius L.). Ind Crops Prod 2019;127:185–94.
doi:10.1016/j.indcrop.2018.10.050.
[72] Lahouar A, Ben Hadj Slama J. Day-ahead load forecast using random forest and expert input
selection. Energy Convers Manag 2015. doi:10.1016/j.enconman.2015.07.041.
[73] Gunay B, Shen W, Newsham G. Inverse blackbox modeling of the heating and cooling load in
office buildings. Energy Build 2017. doi:10.1016/j.enbuild.2017.02.064.
[74] Kapetanakis DS, Mangina E, Finn DP. Input variable selection for thermal load predictive
models of commercial buildings. Energy Build 2017. doi:10.1016/j.enbuild.2016.12.016.
[75] Ding Y, Zhang Q, Yuan T, Yang F. Effect of input variables on cooling load prediction
accuracy of an office building. Appl Therm Eng 2018.
doi:10.1016/j.applthermaleng.2017.09.007.
[76] Hsu D. Comparison of integrated clustering methods for accurate and stable prediction of
building energy consumption data. Appl Energy 2015. doi:10.1016/j.apenergy.2015.08.126.
[77] May R, Dandy G, Maier H. Review of input variable selection methods for artificial neural
networks. Artif. neural networks-methodological Adv. Biomed. Appl., InTech; 2011.
[78] Ciulla G, D’Amico A, Lo Brano V. Evaluation of building heating loads with dimensional
analysis: Application of the Buckingham π theorem. Energy Build 2017;154:479–90.
doi:10.1016/J.ENBUILD.2017.08.043.