test-DSAD004020
This work was supported by the Jenny and Antti Wihuri Foundation, Finland.
followed by a fuzzy logic system in the second stage enhanced by a genetic algorithm to optimize the number of rules and parameters. This model was applied on simulation data generated by another fuzzy logic system. A similar approach was used in [3] but with a radial basis function (RBF) network in the first stage forecaster and an adaptive neuro-fuzzy inference system (ANFIS) in the second stage. Another approach to forecast the price-responsive loads is presented as a stochastic regression, where the daily load curve is represented by a set of periodic smoothing-spline basis functions. This approach was described in [4] and applied to data from the OlyPen project [5]; the model parameters were estimated from observational time series using maximum-likelihood methods. Another approach consists of modeling the response to the electricity price as an inverse optimization problem with a set of marginal utility curves and consumption limits, as in [6], where the authors introduced a solution for this nonconvex mathematical program. The model considers time and weather variables. The same authors presented earlier in [7] a data-driven bidding model to determine the optimal market bid based on inverse optimization and bi-level programming. This model was applied to the same data from the OlyPen project. Other works have approached this subject differently. For example, the authors in [8] proposed a hybrid forecasting method for electricity price and demand. This approach focuses on the bidirectional price-demand relationships when forecasting electricity market price and demand. The proposed model is composed of three main blocks. The first block uses a multi-input multi-output forecasting engine to generate initial demand and price forecasts. Then, using historical market data, interdependencies between price and demand are captured in the second block and presented in the form of IF-THEN rules. In the third block, these rules are applied to the initially generated forecasts, which are modified accordingly. Another approach is to formulate the problem as a linear regression problem and to consider the aggregated changes in consumption over the distribution network as a weighted sum of all individual changes in consumption. This approach was described in [9], where the authors proposed a hierarchical dynamic linear model and an algorithm for learning the future price elasticity of consumers based on their responses to previous pricing updates. In [10], the authors studied the customer price elasticity of demand using an agent-based model. The model was used to demonstrate and quantify the economic impact of price elasticity of demand in electricity markets.

These models were built to approach the problem of electricity price sensitivity of a pool of price-sensitive customers in a residential area. In this paper we are interested in the individual consumption patterns, as they can give valuable information about each customer's behavior and their potential response to price signals. They can also help in offering targeted electricity demand-response programs and optimal differential pricing for smart grid retail. However, the literature in this context is not very extensive. The authors in [11] presented a model for individual customers. This model can identify valuable information about different behaviors and usage patterns between different customers in response to the price and temperature signals. The proposed model is based on a probabilistic Bayesian behavior model to learn the energy usage patterns of shiftable appliances and a price-demand model to predict the hourly energy consumption of air-conditioning systems. The authors also proposed a distributed pricing optimization based on a genetic algorithm for the utility company to maximize its profit based on the learning results.

To the best of the authors' knowledge, no previous work has tried to apply neural networks and machine learning methods to detect the dependencies between electricity price and demand in a price-responsive environment at the individual customer level. Given the importance of individual customers in low-voltage network control and customer energy management, we intend to investigate the potential of artificial neural networks (ANN) and machine learning methods for learning the price elasticity of individual customers. The main contributions of this paper are as follows:
• A new modeling approach for individual customers rather than aggregate load forecasting. The model considers the shiftable and curtailable loads in a single household. The learning is based on historical data of interactions between the customer and the given prices.
• An ANN to learn the daily electricity consumption of shiftable devices given a set of daily prices. The ANN takes the 24 hourly prices of a day as input and gives the 24 hourly shiftable loads as output.
• An LSTM model to learn the consumption patterns of the household's heating and air-conditioning devices and overcome the uncertainty of indoor temperature and thermal insulation.
• A genetic algorithm optimizer to fine-tune the structures and parameters of the neural network and the LSTM network.

The rest of this paper is organized as follows: the problem statement is presented in Section 2. Section 3 presents a theoretical framework for learning models and LSTM networks. The data generation method for price-sensitive customers is presented in Section 4. Numerical results are presented and discussed in Section 5. Section 6 is a conclusion.

2. PROBLEM STATEMENT
In this paper, we consider a smart grid equipped with a two-way communication system (smart meters) that announces the electricity prices for the next 24 hours. Based on these prices, consumers can schedule their devices either manually or automatically using energy management systems. We want to learn the users' responses to a set of prices proposed by the retailer. The information about the users' preferences, devices, indoor temperatures and thermal insulation is unknown. The idea of this paper is to overcome this uncertainty by extracting an abstract representation of the user's behavior in response to a set of prices and temperatures using historical data of prices, outdoor temperatures and corresponding power consumption. The main goal is to find a mapping between electricity prices, weather conditions and power consumption in a household using its historical data.
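As a minimal sketch of how such historical data might be arranged for the shiftable-load model (24 hourly prices in, 24 hourly loads out), the hourly series can be cut into daily pairs. The function name `daily_pairs` and the plain-list representation are assumptions made for illustration; they are not taken from the original work.

```python
def daily_pairs(hourly_prices, hourly_loads, hours_per_day=24):
    """Split two aligned hourly series into (prices, loads) pairs, one per day.

    Hypothetical helper: each pair holds the 24 prices of a day and the
    24 loads observed on that same day. Incomplete trailing days are dropped.
    """
    if len(hourly_prices) != len(hourly_loads):
        raise ValueError("price and load series must have the same length")

    n_days = len(hourly_prices) // hours_per_day
    pairs = []
    for day in range(n_days):
        start = day * hours_per_day
        end = start + hours_per_day
        pairs.append((hourly_prices[start:end], hourly_loads[start:end]))
    return pairs


if __name__ == "__main__":
    # Two days of made-up data: prices 0..47, constant 2.0 load.
    prices = [float(h) for h in range(48)]
    loads = [2.0] * 48
    for day_prices, day_loads in daily_pairs(prices, loads):
        print(len(day_prices), len(day_loads))
```

Each pair can then serve as one training example: the first element as the input vector and the second as the target vector.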
To collect the historical data related to customers' responses to electricity prices and weather conditions, we need to implement this DR program for different customers and register their behavior for a long period.
We assume that the users have three types of devices:
• Shiftable devices: all devices that have a tolerance period throughout the day. For example, a washing machine or a drying machine.
• Non-shiftable devices: all devices that can never be turned off or need to be run at a specific time. For example, a refrigerator.
• Curtailable devices: all devices with adaptable energy consumption levels; mostly heating and air-conditioning devices. The consumption of these devices is usually dependent on the temperature and weather conditions.
For shiftable devices, depending on the user, the scheduling may prioritize minimizing their electricity bill or maximizing their own comfort. We assume that the users are given the day-ahead electricity price timetables $(P_1, P_2, \ldots, P_{24})$, on which their hourly loads of the whole day $(U_1, U_2, \ldots, U_{24})$ will depend. For these devices, we implement a neural network model that takes the whole set of electricity prices during a day $(P_1, P_2, \ldots, P_{24})$ and outputs the set of corresponding hourly loads $(U_1, U_2, \ldots, U_{24})$. For curtailable devices (namely air-conditioning systems), on the other hand, the power consumption depends on both electricity price and temperature. If the temperature is too hot or too cold, the electricity consumption is expected to be high, and even higher if the electricity price is low. Conversely, consumption should be low if the temperature is normal and prices are high. In principle, therefore, we only need the price and temperature in a given hour to predict the consumption of these devices. However, we only have the information about the outdoor temperatures, whereas the air conditioners are reacting to the indoor temperature of the house. This information is therefore necessary to predict how much power is going to be used by these devices. We can obtain an approximate estimate of this temperature based on the outdoor temperatures in previous timesteps, but only if we have information about the house's thermal insulation. Since this feature can vary from one house to another, it is not possible to have a generic model that can estimate the indoor temperature based on outdoor temperatures. Another way to approach this problem is to use abstract representations of the hidden features. We will use an LSTM-based recurrent neural network to learn the consumption patterns of air conditioners using the past outdoor temperatures and power consumption and the current electricity price. In the next section, we explain the use of neural networks and LSTM networks and we present details of the two models.

3. LEARNING MODELS BASED ON ARTIFICIAL NEURAL NETWORKS
The use of artificial neural networks for this work is justified by their ability to learn the mapping function between the input and output without any prior information about the problem. ANNs are based on the combination of multiple processing layers to learn representations of data with multiple levels of abstraction. They are obtained by composing simple but non-linear modules. Each of these modules transforms the representation at one level into a representation at a higher and more abstract level. The combination of such transformations in a model can enable it to learn very complex functions [12]. A multilayer NN is composed of an input layer having the same dimension as the input vector, an output layer with the dimension of the output vector, and hidden layers composed of several neurons. An LSTM network, in contrast, is a recurrent neural network with memory blocks capable of learning long-term dependencies. It consists of a set of recurrently connected subnets playing the role of memory chips. These memory cells give the network an ability to learn the contextual information needed to predict the next sequence in a time series [13].
In the first learning model (NN1), we use a multi-layer perceptron architecture with hidden layers; the input and output layers both have 24 dimensions, representing the 24 hourly prices and the 24 hourly loads respectively. The number of hidden layers, hidden activation functions and dropout will be tuned using a genetic algorithm. This choice is justified by the nature of shiftable appliances' loads. The running patterns of shiftable devices have a 24-hour period and their total amount of energy is assumed to be nearly constant.
In the second model, we use an LSTM network that takes as input the outdoor temperatures and power consumption at the previous timesteps and the electricity price at previous and next timesteps, and outputs the power consumption at the next timestep. This process can be repeated to recurrently predict a long sequence of loads using temperature forecasts and projected electricity prices. The choice of this architecture is justified by the following assumptions:
• The air-conditioning systems are reacting to indoor temperatures in the house and to electricity prices.
• The indoor temperature in a house depends on the outdoor temperatures in the previous timesteps, the building insulation and the amount of energy spent in heating/cooling in previous timesteps.
• The amount of power needed for air-conditioning can be predicted using the information of current price and previous values of loads and outdoor temperatures.
An illustration of an LSTM memory block with a single cell is provided in Fig.1.
These memory blocks replace the summation units in the hidden layers of a standard recurrent neural network. The input vector is concatenated to the hidden state vector and passed through the forget gate to determine how much of the cell state components can be kept. The same vector is passed through the input gate to determine how much of the new state candidate $\tilde{C}_t$ can pass to the new cell state. Finally, the output gate decides how much of the transformed cell state vector can be passed to the next hidden state vector $h_t$. The cell state vector $C_t$ is given by:
$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$
where the forget vector $f_t$ is given as a function of the forget
gate's weight matrix $W_f$ and bias vector $b_f$ as follows:
$f_t = f(W_f \cdot [h_{t-1}, x_t] + b_f)$
Using the same notations, the input vector $i_t$, the output vector $o_t$ and the cell state candidate $\tilde{C}_t$ are given by:
$i_t = f(W_i \cdot [h_{t-1}, x_t] + b_i)$
$o_t = f(W_o \cdot [h_{t-1}, x_t] + b_o)$
$\tilde{C}_t = g(W_c \cdot [h_{t-1}, x_t] + b_c)$
Finally, the next hidden state is given by:
$h_t = o_t * g(C_t)$

Fig.1: LSTM memory block with one cell. An LSTM cell consists of a cell state vector C, a hidden state vector h and three gates: input gate, output gate and forget gate.

The proposed LSTM network consists of several layers of LSTM cells followed by a fully connected layer. The number of LSTM layers, size of LSTM cells, activation functions and other parameters will be determined by a genetic algorithm to find the optimal structure and hyperparameters.
In the case of our model, the input vector $x_t$ is composed of the current electricity price $P_t$, the temperature at the previous timestep $T_{t-1}$ and the power consumption at the previous timestep $U_{t-1}$. Fig.2 illustrates the architecture of the proposed LSTM network.

Fig.2: LSTM network for air-conditioning load prediction. The model uses the information about temperatures, loads and price in the previous timesteps to predict the load U(t). Since this is a regression problem, the fully connected layer uses a linear activation function.

4. SIMULATION DATA
Considering the difficulty of obtaining real data related to customers' responses to electricity price signals, we generate the data needed to test our models using electricity prices and temperatures in Finland. We simulate a customer's response to a set of prices and temperatures using an optimization model from the literature for shiftable devices to train NN1, and a fuzzy logic system for air-conditioning systems to train the LSTM network.

4.1. Shiftable appliances simulation model
For shiftable appliances, we use a device scheduling optimization model from the literature presented in [14]. The idea is to simulate the response of a customer to a set of prices. We suppose that the customer is responding to price signals by shifting the usage of certain devices to a certain period of the day to reduce their electricity bill. The customer is also supposed to have preferences for running devices at certain times. Therefore, the model should consider the customer's comfort function. We assume that a household has 6 shiftable devices, each of which has a time window in which it can operate, defined by an earliest starting time (EST) and a latest stopping time (LST), a preferred starting time (PST), a running period (D) and a load profile. In addition, the household has a basic load that is running all day and cannot be stopped or shifted. The user prefers to start his devices at the PST but, having information on the whole day's electricity prices, he also wants to minimize his bill. Therefore, he has an interest in shifting their starting times within the time window of each device to minimize the cost function, but not far from the PST, to minimize the discomfort function defined by how much the actual starting time is shifted from the PST. Additionally, the user cannot move too much load to one time slot because we assume that an extra cost, called penalty cost, would be applied if a load threshold (L) were exceeded in a time slot. This extra cost is calculated as:
$PC = \alpha * (Load_t - L_t)$
where α is a constant called the penalty factor. This cost (if positive) is added to the cost function for every hour. This is a multi-objective optimization problem that we assume (for data generation purposes) the users are solving to schedule their appliances, either manually or using a device scheduling system. The values of EST, LST, PST and running periods used for the 6 shiftable devices are presented in Table 1 and their required loads (in kW) are shown in Table 2. We tried to make the values as realistic as possible for the different kinds of devices. For simplicity, the basic load is 2 kW for the whole day.

Table 1: Devices' usage data
Device                  EST     LST     PST     D (hours)
Electric range          5:00    9:00    7:00    1
Oven                    10:00   21:00   19:00   1
Electric water heater   9:00    23:00   10:00   3
Dishwasher              13:00   23:00   19:00   2
Clothes Washer          6:00    22:00   18:00   2
Clothes Dryer           8:00    21:00   19:00   2

Table 2: Devices' loads (kW)
Device                  First hour   2nd hour   3rd hour
Electric range          1.6          0.0        0.0
Oven                    2.8          0.0        0.0
Electric water heater   2.0          2.0        2.0
Dishwasher              1.3          2.1        0.0
Clothes Washer          1.4          2.0        0.0
Clothes Dryer           4.0          3.5        0.0

The load threshold L for one hour is chosen to be 5 kW and the penalty factor α is 1.0.
To solve this multi-objective optimization problem, the authors in [14] used the Non-dominated Sorting Genetic Algorithm II (NSGA-II). For each set of 24 hourly prices, the algorithm gives different Pareto optimal solutions [15] to the multi-objective optimization problem with respect to the cost and discomfort functions. For simplicity, we only consider the best-cost Pareto optimal solution. This solution models the behavior of a customer who gives first priority to cost optimization and second priority to his own comfort.
To generate the most diversified data for the model, we use Elspot day-ahead electricity prices in Finland for the period between 1 January 2017 and 7 September 2018 [16]. Fig.3 shows a boxplot representation of electricity prices given in €cents/kWh. The simulation algorithm outputs a dataset of the expected loads for each day. A boxplot representation of the generated loads is shown in Fig.4. The boxplot distribution reflects the consumption constraints defined in Table 1 and Table 2. The hourly load is always higher than or equal to 2 kW, which is the basic load defined previously. During the night hours (23:00 to 04:00), according to Table 1, no shiftable devices are operating. Consequently, the consumption in this period is equal to the basic load. The boxplot also shows that at 07:00 the consumption is equal to 3.6 kW most of the time. This can be seen in Table 1 and Table 2 as the load required for the electric range, which has a preferred starting time at 07:00, plus the basic load. The load at 22:00 is equal to 0 most of the time, which can be explained by the fact that only the electric water heater and the dishwasher can be run at that time and neither of them has a preferred running time close to 22:00. The rest of the loads are dependent on the electricity price. The loads did not surpass 6 kW, which can be explained by the penalty constraint. This data set will serve our proposed model NN1 for training and testing.

Fig.3: Boxplot representation of daily prices distribution.
Fig.4: Boxplot representation of generated daily loads distribution.

4.2. Curtailable appliances simulation model
For curtailable devices, we use a fuzzy logic system to model the consumption of air-conditioning systems operating during the day to maintain a comfortable temperature of the space while taking into consideration the electricity price in a given hour. Fuzzy logic is well suited to this kind of problem as it can model non-quantitative concepts like "hot temperature" or "low price". We use fixed rules for our fuzzy logic system, presented as follows:
1) If (P is low) and (T is low) then (U is much-high).
2) If (P is low) and (T is average) then (U is little-high).
3) If (P is low) and (T is high) then (U is much-high).
4) If (P is average) and (T is low) then (U is little-high).
5) If (P is average) and (T is average) then (U is little-low).
6) If (P is average) and (T is high) then (U is little-high).
7) If (P is high) and (T is low) then (U is average).
8) If (P is high) and (T is average) then (U is much-low).
9) If (P is high) and (T is high) then (U is average).
where P is the electricity price, T is the indoor temperature and U is the expected usage or consumption of the air conditioner.
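The nine rules above can be read as a lookup table from a (price, temperature) pair to a usage level. The following is a minimal sketch, not the authors' implementation: the rule table is transcribed from the list, while the membership shapes (triangular, on inputs normalized to [0, 1]), the numeric value attached to each usage label, and the weighted-average defuzzification are all assumptions made for illustration only.

```python
# Rule base transcribed from the nine rules:
# (price level, temperature level) -> usage level.
RULES = {
    ("low", "low"): "much-high",
    ("low", "average"): "little-high",
    ("low", "high"): "much-high",
    ("average", "low"): "little-high",
    ("average", "average"): "little-low",
    ("average", "high"): "little-high",
    ("high", "low"): "average",
    ("high", "average"): "much-low",
    ("high", "high"): "average",
}

# ASSUMPTION: a crisp usage value in [0, 1] for each output label.
USAGE_LEVEL = {
    "much-low": 0.0,
    "little-low": 0.25,
    "average": 0.5,
    "little-high": 0.75,
    "much-high": 1.0,
}


def memberships(x):
    """Degrees of 'low', 'average' and 'high' for x normalized to [0, 1].

    ASSUMPTION: simple triangular shapes peaking at 0, 0.5 and 1.
    """
    x = min(max(x, 0.0), 1.0)
    return {
        "low": max(0.0, 1.0 - 2.0 * x),
        "average": max(0.0, 1.0 - abs(2.0 * x - 1.0)),
        "high": max(0.0, 2.0 * x - 1.0),
    }


def infer_usage(price, temperature):
    """Weighted average of the rule outputs, using min as the AND operator."""
    mu_p = memberships(price)
    mu_t = memberships(temperature)
    weighted_sum = 0.0
    total_weight = 0.0
    for (p_label, t_label), u_label in RULES.items():
        strength = min(mu_p[p_label], mu_t[t_label])  # rule firing strength
        weighted_sum += strength * USAGE_LEVEL[u_label]
        total_weight += strength
    return weighted_sum / total_weight


if __name__ == "__main__":
    print(infer_usage(0.0, 0.0))   # low price, low temperature
    print(infer_usage(1.0, 0.5))   # high price, average temperature
```

With these assumed shapes, a fully low price and fully low temperature fire only rule 1, so the result is the value assigned to much-high; intermediate inputs blend neighbouring rules.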
Fig.5. Fuzzy membership functions for air conditioner dynamics (fuzzy system 1).
Fig.6. Fuzzy membership functions for indoor temperature dynamics (fuzzy system 2).
6. CONCLUSION
Artificial neural networks are revolutionary tools that have reached a level of maturity that makes them useful in many fields of science and industry. A potential application of these techniques is therefore the energy distribution system. With the development of smart metering infrastructure and smart grids, these models will have access to the data necessary for their training. In future smart grids, customers are expected to contribute to the grid's flexibility by participating in DR programs and reacting to electricity price signals. Consequently, electricity retailers and utility companies will have a strong interest in information about the responsiveness of the customer to certain electricity prices.

REFERENCES
[1] P. Valtonen, “Operating and market environment,” in Distributed energy resources in an electricity retailer’s short-term profit maximization, Lappeenranta, Finland, Lappeenrannan teknillinen yliopisto Yliopistopaino 2015.
[2] A. Khotanzad, E. Zhou, and H. Elragal, “A neuro-fuzzy approach to short-term load forecasting in a price-sensitive environment,” IEEE Transactions on Power Systems, vol. 17, no. 4, pp. 1273–1282, 2002.
[3] Z. Yun, Z. Quan, S. Caixin, L. Shaolan, L. Yuming, and S. Yang, “RBF neural network and ANFIS-based short-term load forecasting approach in real-time price environment,” IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 853–858, 2008.
[4] J. Hosking, R. Natarajan, S. Ghosh, S. Subramanian, and X. Zhang, “Short-term forecasting of the daily load curve for residential electricity usage in the smart grid,” Applied Stochastic Models in Business and Industry, vol. 29, no. 6, pp. 604–620, 2013.