Energy Management For Microgrids Using A Hierarchical Game-Machine Learning Algorithm
Abstract—This paper presents an energy management strategy for microgrids using a multiagent game-learning algorithm. The microgrid is powered by photovoltaic (PV) systems equipped with batteries and is intended to operate in islanded mode. The proposed energy management strategy is applied to wireless communication networks by addressing the tradeoff between the communication signal's quality of service (QoS) and energy availability. A two-layer algorithm combining a multiagent game and reinforcement learning (RL) is designed for base stations (BSs) in order to accomplish this goal. The proposed method improves the microgrid's performance and converges faster than a direct RL approach. The designed energy management algorithm was tested in multiple case studies.

Keywords—microgrid, reinforcement learning, game theory, distributed control, multi-agent system, optimization, communication system, load shaping

This work is supported and funded by the Hillman Foundation.

I. INTRODUCTION

A microgrid is a local, independently controlled electric system with distributed energy resources (DER), combined with storage devices and flexible loads. Such a system can be operated in a non-autonomous or an autonomous way, depending on whether it is connected to the main grid [1]. Thanks to its independent control and local DERs, a microgrid can power its local area when the main grid is interrupted during a natural disaster [2]. This feature has drawn much attention from designers seeking high system resiliency, such as those in the communications industry. Considering the crucial role played by communication networks, a few companies have already explored using renewable energy sources and microgrids to power communication facilities consisting of a group of base stations (BSs). A practical question arises as to how to utilize the stored energy in such a microgrid considering renewable resources' partially stochastic characteristics. Past research has focused on searching for an energy management strategy that solves this problem. One of the approaches is switching BSs on/off to minimize the total load demand, as discussed in [3, 4] and [5]. Other methodologies considering green energy availability and delay performance include GALA [6], IDEA [7], and TEA [8]. These approaches rely on a central controller to search for and broadcast the energy management strategy. The alternative is to distribute the decision-making process to the individual BS controllers. Such an approach formulates the original energy management problem as a multiagent optimization problem and aims at solving for an equilibrium. In [9], energy management is modeled as a multi-player game. However, the computation cost for a large-scale system might be too high and impractical [10]. A reinforcement learning algorithm was introduced in [11] so that the computation time is manageable. This reinforcement learning (RL) algorithm enables the BS to search for an operating point via trial and error but requires a few days of "training."

In this paper, we propose a hierarchical load response algorithm combining a virtual two-player game and an RL process. Applying this algorithm, the controllers in a microgrid solve the immediate load response as a two-player game and adjust the user's load model using an RL algorithm. Because a two-player game can be solved in polynomial time, the controllers obtain a reasonable load plan quickly in real time and gain the capability of adapting to an unknown environment with the aid of RL. Additionally, compared to a direct RL approach, the two-player game simplifies the action searching space; thus, the converging speed is higher. As the study results will show, this approach obtains an energy management strategy with performance comparable to an exhaustive heuristic search and a shorter training period than a direct RL algorithm.

II. MICROGRID ENERGY MANAGEMENT

An example of the proposed microgrid for communication networks is shown in Fig. 1. The microgrid consists of a photovoltaic (PV) power generator, three communication BSs, and battery units. The batteries are responsible for absorbing excess generated power or powering the load when the generated power is insufficient. The BSs in this microgrid utilize communication traffic shaping (CTS) to adjust their energy consumption [12]. In a general form, the load at each BS can be expressed as
$P_{BS}^i = r_t^i \left( P_b + P_c\,\delta(t) \right) = r_t^i P_L$    (1)

where $r_t^i$ is the ratio of the BS's load to the total microgrid load, $P_b$ is the base BS power, $P_c$ is the controllable power, a linear function of the traffic shaping factor (TSF) $\delta$, and $P_L$ is the total microgrid load demand. Generally, the higher $\delta$ is, the better the communication quality is. A traffic shaper controls ("shapes") the actual throughput (equivalent to the total volume of traffic) at the output of a communication system. Since shaping traffic entails a reduction of bit rate, it leads to an increased delay for data traffic or a higher compression ratio for interactive video or speech traffic. In an LTE base station, a radio frame is divided into minimum units of transmit resources called "resource blocks" (RB). Without traffic shaping, all ongoing calls require $RB_T$ resource blocks.

Fig. 1. Communication microgrid scheme.

$P_B = G_R - P_L$    (6)

Hence the battery energy output is

$\Delta E_{bat}(t) = E_R - E_L = \int_{t_0}^{t} P_B(x)\,dx$    (7)
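To make the power and energy balance of (1), (6), and (7) concrete, the following is a minimal numerical sketch, not the authors' simulation code. The one-hour time step, the placeholder PV profile, the aggregation of the three BSs into a single total load, and the clipping of the stored energy are illustrative assumptions; only the 24 kWh battery capacity and the 0.7 initial SoC come from the parameters reported later in Table I.

```python
# Sketch of the per-BS load split (1), battery power (6), and battery energy (7).
import numpy as np

def bs_load(r_i, p_base, p_ctrl, delta):
    """Per-BS load of (1): r_i * (P_b + P_c * delta) = r_i * P_L."""
    return r_i * (p_base + p_ctrl * delta)

def battery_energy(pv_kw, load_kw, dt_h=1.0, e0_kwh=0.7 * 24.0, cap_kwh=24.0):
    """Battery power (6) P_B = G_R - P_L, integrated as in (7) over hourly steps."""
    p_batt = np.asarray(pv_kw) - np.asarray(load_kw)      # (6)
    energy = e0_kwh + np.cumsum(p_batt) * dt_h            # (7)
    return np.clip(energy, 0.0, cap_kwh)                  # clipping is an added assumption

# Example: the whole fleet (r = 1) with three BSs at TSF delta = 0.6 over six hours.
pv = [0.0, 0.5, 1.2, 1.5, 0.8, 0.1]                       # assumed PV output (kW)
total_load = [bs_load(1.0, 0.2 * 3, 0.8 * 3, 0.6)] * 6    # base/controllable power from Table I
soc = battery_energy(pv, total_load) / 24.0
print("battery SoC trajectory:", np.round(soc, 3))
```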
cause the microgrid to shut down. Therefore, it is good practice to keep the microgrid batteries' SoC at a relatively high level in island mode. In this project, the performance of the microgrid is evaluated by an objective function measuring the weighted sum of communication quality and battery SoC distribution:

$obj(t) = w_{com}\, f_{com}(P_{BS}, t) + w_{SoC}\, P\big(SoC(t_d) > SoC_{goal}\big)$    (12)

where $f_{com}(P_{BS}, t)$ calculates the average normalized peak signal-to-noise ratio (PSNR) of the communication network computed by (3). The BS's goal is to search for a TSF strategy $\delta(t)$ that maximizes the objective function (12). Details on how $f_{com}(P_{BS}, t)$ is calculated can be found in [9].

The proposed algorithm consists of two parts: the immediate TSF decision game and the BS load-ratio learning. The scheme of the algorithm is shown in Fig. 2. In this section, the details of the two parts are explained.

Fig. 2. Game-learning algorithm scheme.

A. Two-player TSF game

Initially, the load demand (1) can be computed using the load and weather forecast information and modeled as a virtual two-player game similar to [9]. For a user with a load ratio $r_t^i$, it is assumed that a virtual user takes all the rest of the load, $(1 - r_t^i)(P_b + P_c\,\delta(t))$. Therefore, the objective function (12), which depends on the TSF choices of both the actual and virtual user, is

$obj\big(t, \delta_i(t), \delta_{-i}(t)\big) = w_{com}\, f_{com}\big(P_{BS}, t, \delta_i(t), \delta_{-i}(t)\big) + w_{SoC}\, P\big(SoC(t_d) > SoC_{goal} \mid \delta_i(t), \delta_{-i}(t)\big)$    (13)

where $\delta_{-i}$ indicates the virtual user's TSF action. Each user's objective is to maximize (13) considering the virtual player's possible action.

If a natural disaster hits the microgrid, the communications between BSs might be cut off, and components could be damaged. In this condition, the behavior modes of the BSs could differ. Some BSs may experience a surge in communication load demand, while other BSs may reduce their load consumption in order to conserve more stored energy. Therefore, the BSs in this microgrid do not necessarily share a common goal. Instead, it is safer for each BS to assume the worst scenario, in which its virtual player is trying to minimize its objective. With this assumption, the objective of the BSs becomes

$\max_{\delta_i(t)} \min_{\delta_{-i}(t)} obj\big(t, \delta_i(t), \delta_{-i}(t)\big)$    (14)

which makes it a zero-sum game. The solution of (14) is the same as the solution of a corresponding linear programming problem

Maximize    $z$
Subject to  $A^{T} x \ge z\,e$
            $e^{T} x - 1 = 0$    (15)
            $x \ge 0$

where $x$ is player I's strategy vector indicating the probabilities of playing its TSF actions, $e$ is a vector of ones with the same length as $x$, $A$ is the payoff table computed by deducing the players' choices of TSF as shown in Fig. 3, $z$ is the BS's expected payoff, and $x_i$ is its probability of choosing the $i$th TSF. This correspondence between the zero-sum game and linear programming is based on the connection between the Minmax Theorem and the Duality Theorem [16]. The linear programming problem can be solved by applying a simplex or interior-point algorithm [17].

Fig. 3. Example payoff table of a user.

                           Player II
                           P21: δ21               P22: δ22
Player I   P11: δ11        obj(t, δ11, δ21)       obj(t, δ11, δ22)
           P12: δ12        obj(t, δ12, δ21)       obj(t, δ12, δ22)
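As a concrete illustration of (15), the sketch below, which is not from the paper, builds the row player's linear program and solves it with SciPy's linprog. The function name solve_tsf_game is ours, and the payoff matrix is a random placeholder standing in for the obj values of Fig. 3, which the paper computes from (13).

```python
# Sketch: solving the two-player zero-sum TSF game (14)-(15) as a linear program.
import numpy as np
from scipy.optimize import linprog

def solve_tsf_game(A):
    """Return the BS's mixed TSF strategy x and its guaranteed payoff z."""
    m, n = A.shape                                # m BS actions, n virtual-player actions
    # Decision variables v = [x_1 ... x_m, z]; maximizing z <=> minimizing -z.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # A^T x >= z e  <=>  z - A^T x <= 0, one row per virtual-player action.
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # e^T x = 1 so that x is a probability vector.
    A_eq = np.append(np.ones(m), 0.0).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * m + [(None, None)]    # x >= 0, z free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Example with the TSF grid used in Section IV and an arbitrary payoff table.
tsf = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
A = np.random.default_rng(0).uniform(0.0, 1.0, size=(5, 5))  # placeholder obj values
x, z = solve_tsf_game(A)
print("mixed strategy over TSFs:", np.round(x, 3), "expected payoff:", round(z, 3))
```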
B. Load-ratio learning

During the learning process, the BSs search for their optimal load-ratio policies through interactions with their environment and adapt their decision-making process in a trial-and-error manner.
At first, each BS is given a load-ratio list

$L_r = [p_1, p_2, \ldots, p_M]$    (16)

where $p_i$ indicates the probability that the BS's load takes $i/M$ of the microgrid's total load. At time $t$, each BS randomly picks its load ratio according to its load-ratio list:

$r_t^i(t) = i/M$ with probability $p_i$    (17)

After making the load-ratio choice, the BS conducts the two-player game solving and observes the resulting system status change. Then, a reward is computed and given to each agent to update its load-ratio policy, and the above process repeats. This process is similar to that of a policy iteration, but the updating law and its converging objective are different due to the multi-agent arrangement [18]. The learning sequence of the agent is shown below:

1. At time $t$, the BS chooses a load ratio according to its load-ratio policy $L_r^i$. Suppose the load ratio taken is $r_t^i$.

2. After conducting the two-player game and solving for its solution, each BS applies the obtained TSF strategy vector $x$.

3. At the next LCI decision time $t+1$, the BSs collect the system status (SoC and PSNR information) and compute their payoffs. Suppose the reward of player $i$ is $r_i(t)$.

4. The BS updates its load-ratio policy according to the rule

$L_r^i(t+1) = L_r^i(t) + \beta\, r_i(t)\,\big(e_{a_i} - L_r^i(t)\big)$    (18)

where $0 < \beta < 1$ is a learning rate parameter and $e_{a_i}$ is a unit vector with its $r_t^i$th component equal to one. This algorithm is known as the linear reward-inaction algorithm ($L_{R-I}$). One critical feature of this learning algorithm is that convergence of the agents' strategies to a pure NE is guaranteed if the learning rate is sufficiently small, regardless of the number of players; that is, all probability vectors $L_r^i$ converge to unit vectors. Such convergence is in terms of probability, and mixed-strategy NEs are not proven to be stable. However, compared to other multi-agent approaches such as mixed games and multi-agent Q-learning, this algorithm is one of the few that guarantees a form of convergence [19]. A full description of the $L_{R-I}$ algorithm can be found in [20]. A modification made to this algorithm in this study is a set of limits on the load-ratio policy such that

$p_i \le p_{max} = 0.8 \quad \forall i$    (19)

$p_i \ge p_{min} = \dfrac{1 - 0.8}{M - 1} \quad \forall i$    (20)

These limits are set so that the agents are not trapped in local optima as the environment changes.

C. Design of reward function

The reward function has a critical influence on the BS's strategy obtained through RL. For example, if one action always leads to the highest payoff, the BS would develop a strategy of playing that action only. In this study, two sets of reward functions are given to the BSs depending on their available information.

1) Reward function with communication

If the communication network in the microgrid is functioning normally and no component is lost, the BSs can exchange their choices of TSFs and battery SoC status with each other. The original objective function (12) is then computable at every BS; thus, (12) works as the reward function:

$r_i = w_{com}\, f_{com}\big(P_{BS}, t, \delta_1(t), \ldots, \delta_N(t)\big) + w_{SoC}\, P\big(SoC(t_d) > SoC_{goal}\big)$    (21)

A penalty is given to the BS when the probability of reaching the SoC goal is too small, as shown in (22). This setting encourages the BS to pick a lower TSF so that more energy is saved when the energy situation is critical.

$r_i = 1 - \delta_i(t), \quad P\big(SoC(t_{end}) \ge SoC_{goal}\big) < 0.5$    (22)

2) Reward function with local information

If the communication link in the microgrid is damaged or malfunctioning, it is possible that some of the BSs are unable to observe the other BSs' moves and power-load profiles. In this situation, the disconnected BSs are limited to evaluating their performance with local information using the function shown in (23)

$r_i = w_{load}\,\dfrac{1}{1 + e^{-\delta_i(t)\,SoC(t)}} + w_{SoC}\,\dfrac{1}{1 + e^{-SoC(t)/\delta_i(t)}}$    (23)

which is an approximation of the original objective function, where $\delta_i(t)$ and $SoC(t)$ are the user's local TSF and battery SoC level. The first term, $\frac{1}{1 + e^{-\delta_i(t)\,SoC(t)}}$, is a normalized communication quality index, which gives the agent a high reward when both $SoC(t)$ and $\delta_i(t)$ are large. The second term, $\frac{1}{1 + e^{-SoC(t)/\delta_i(t)}}$, is an approximated energy availability index. Similar to the standard operation reward, a penalty is given when the SoC level is critical

$r_i = 1 - \delta_i(t), \quad SoC(t) \le SoC_{min}$    (24)
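The sketch below, which is not the authors' code, puts one load-ratio learning step together: the local-information reward of (23) with the penalty of (24), and the $L_{R-I}$ update (18) with the limits (19) and (20). The helper names, the SoC penalty threshold of 0.2, and the renormalization after clipping are illustrative assumptions; the weights and the initial policy come from Tables I and II.

```python
# Sketch: local reward (23)-(24) and one L_{R-I} load-ratio policy update (18)-(20).
import numpy as np

def local_reward(delta_i, soc, w_load=0.5, w_soc=0.5, soc_min=0.2):
    """Approximate objective (23); penalty reward (24) when the SoC is critical."""
    if soc <= soc_min:
        return 1.0 - delta_i                          # (24)
    comm = 1.0 / (1.0 + np.exp(-delta_i * soc))       # communication quality index
    avail = 1.0 / (1.0 + np.exp(-soc / delta_i))      # energy availability index
    return w_load * comm + w_soc * avail

def lri_update(policy, action_idx, reward, beta=0.1, p_max=0.8):
    """One L_{R-I} step: shift probability mass toward the chosen load ratio."""
    M = len(policy)
    p_min = (1.0 - p_max) / (M - 1)                   # lower limit as in (20)
    e = np.zeros(M)
    e[action_idx] = 1.0                               # unit vector e_{a_i} of (18)
    new_policy = policy + beta * reward * (e - policy)
    # Apply the limits (19)-(20); renormalizing afterwards is an added assumption.
    new_policy = np.clip(new_policy, p_min, p_max)
    return new_policy / new_policy.sum()

# One learning step starting from the initial policy of Table II.
policy = np.array([0.1, 0.6, 0.1, 0.1, 0.1])
idx = np.random.default_rng(1).choice(len(policy), p=policy)  # sample a load ratio, per (17)
r = local_reward(delta_i=0.6, soc=0.7)
policy = lri_update(policy, idx, r)
print("updated load-ratio policy:", np.round(policy, 3))
```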
IV. ANALYSIS VERIFICATION

A case study applying this game-learning algorithm was conducted in MATLAB. The scheme of the simulated system coincides with the one in Fig. 1, which consists of three BSs. It is assumed that the users share the same fundamental parameters, listed in Table I. The users are asked to pick TSFs and update their load-ratio policies every hour. The initial load-ratio policy of a BS is shown in Table II, and the available TSFs of a BS are $\delta(t) \in \{0.2, 0.4, 0.6, 0.8, 1.0\}$.

TABLE I. EVALUATION PARAMETER VALUES

Symbol    Parameter                                  Value
w_com     Communication quality weight               0.5
w_SoC     Energy availability weight                 0.5
E_bat     Battery fully charged energy               24 kWh
          Solar power expectation                    1 kW
          BS base load expectation                   0.2 kW
          BS traffic-dependent load expectation      0.8 kW
          Solar power variance                       4,000
          BS traffic-dependent load variance         4,000
          Desired battery SoC level                  0.8
          Initial battery SoC level                  0.7
BW        BS total bandwidth                         10 MHz
a         PSNR-bit rate curve parameter              10.4
          PSNR-bit rate curve parameter              -23.8
r         Nominal transmit bit rate                  2 Mbps

TABLE II. INITIAL LOAD-RATIO POLICY OF A USER

r_t^i     0.2    0.4    0.6    0.8    1
p_i       0.1    0.6    0.1    0.1    0.1

The PSNR and battery SoC of the overall system are demonstrated, and the system performance is compared to that of a direct RL algorithm and a heuristic algorithm with complete information. The exhaustive heuristic algorithm maximizes the objective function at each TSF decision time:

Maximize $obj\big(t, \delta(t)\big) \quad \forall t$    (25)

The day-averaged cumulated objective function evaluates the performance of a microgrid,

$eva = \dfrac{\sum_{t=1}^{end} obj(t)}{day}$    (26)

because such a value reflects the system's long-term behavior in terms of the objective function. This value is called the system performance index in the following context.
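For illustration, the following sketch, not from the paper, shows how the exhaustive heuristic baseline (25) and the performance index (26) can be evaluated. The objective function here is a placeholder lambda standing in for the full weighted objective of (12), and the hourly decision grid and 24-hour day follow the simulation setup described above.

```python
# Sketch: exhaustive heuristic TSF selection (25) and the performance index (26).
import numpy as np

TSF_GRID = [0.2, 0.4, 0.6, 0.8, 1.0]   # available TSFs from Section IV

def heuristic_tsf(objective, t):
    """Pick the TSF that maximizes the objective at decision time t, as in (25)."""
    return max(TSF_GRID, key=lambda delta: objective(t, delta))

def performance_index(obj_values, hours_per_day=24):
    """Day-averaged cumulated objective, the eva of (26)."""
    obj_values = np.asarray(obj_values)
    return obj_values.sum() / (len(obj_values) / hours_per_day)

# Example with a placeholder objective that simply prefers mid-range TSFs.
objective = lambda t, d: 1.0 - abs(d - 0.6)
schedule = [heuristic_tsf(objective, t) for t in range(48)]   # two-day horizon
eva = performance_index([objective(t, d) for t, d in enumerate(schedule)])
print("heuristic TSF schedule (first 5 h):", schedule[:5], "eva:", round(eva, 3))
```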
In the simulation, the system is operating normally and the BSs are free to communicate with each other. The system PSNR and battery SoC during a two-day simulation are shown in Fig. 4. As the result shows, the obtained system PSNR is highly related to the system SoC level, which is similar to the one obtained by the heuristic algorithm (25), as shown in Fig. 5. Both algorithms keep the system SoC above the desired goal (0.8), but the game-learning algorithm has a higher minimal PSNR. According to [14], a moderately good target for video stream quality is 37 dB PSNR, whereas 32 dB PSNR is considered acceptable. Also, the overall system performance of the game-learning algorithm computed by (26) is eva_gl = 18.5058, compared to eva_heu = 18.9941 for the heuristic one. Therefore, in the sense of the overall objective function, the game-learning algorithm has a performance similar to the exhaustive heuristic algorithm.

A comparison between the game-learning and a direct RL approach is also conducted under the normal condition. If the TSF strategy is found through a direct RL approach linking SoC states with the TSF actions, such as the one in [11], the system PSNR and SoC have the relationship shown in Fig. 6, with a system performance eva_rl = 10.8693, lower than that of the game-learning algorithm. This performance loss is mainly due to the exclusion of the time dimension from its searching space. Additionally, the direct RL approach demands a longer training period. With a learning rate of 0.1, the direct RL requires approximately 15 days until the system performance reaches a stable level, which can be observed from the training curve shown in Fig. 7. In comparison, when the same system is operated applying the game-learning
method, the obtained learning curve is shown in Fig. 8, which shows that the system starts with a higher performance index as well as a faster converging speed. The performance improvement comes from the game-solving process, which takes the power and load predictions into consideration, and the increased learning speed is likely caused by the smaller searching space: an agent applying direct RL has a two-dimensional SoC-TSF searching space, while an agent in the game-learning algorithm only needs to explore the one-dimensional load-ratio policy.

Fig. 6. LSR and system SoC applying direct RL, normal condition.
Fig. 7. Learning curve of a microgrid applying the direct RL algorithm.
Fig. 8. Learning curve of a microgrid applying the game-learning algorithm.

V. CONCLUSION

This paper presented a two-layer game-machine learning energy management mechanism for microgrids to optimize their energy usage. In this particular case, the analysis focused on a system applicable to wireless communication networks, but the same approach can be used in other applications with a partially controllable load. The simulation results show that the energy management strategy obtained by the game-learning algorithm performs better than applying reinforcement learning alone and is close to that of an exhaustive heuristic search algorithm. Benefiting from the reduced searching space, the converging speed of the proposed algorithm is higher than that of a direct reinforcement learning algorithm. Additionally, the algorithm showed strong resilience against system damage such as partial power loss and communication network failure. In the future, the performance of the adapting feature of the algorithm under a dynamic environment will be evaluated.

REFERENCES
[1] N. Hatziargyriou, Microgrids: Architectures and Control. John Wiley & Sons, 2013.
[2] M. Hanna, "When disasters strike distributed systems," vol. 15. Newton: King Content Company, 1995, p. 54.
[3] E. Oh, K. Son, and B. Krishnamachari, "Dynamic Base Station Switching-On/Off Strategies for Green Cellular Networks," IEEE Transactions on Wireless Communications, vol. 12, no. 5, pp. 2126-2136, 2013.
[4] N. Yu, Y. Miao, L. Mu, H. Du, H. Huang, and X. Jia, "Minimizing Energy Cost by Dynamic Switching ON/OFF Base Stations in Cellular Networks," IEEE Transactions on Wireless Communications, vol. 15, no. 11, pp. 7457-7469, 2016.
[5] A. Stavridis, S. Narayanan, M. D. Renzo, L. Alonso, H. Haas, and C. Verikoukis, "A base station switching on-off algorithm using traditional MIMO and spatial modulation," in 2013 IEEE 18th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD), 2013, pp. 68-72.
[6] T. Han and N. Ansari, "Green-energy Aware and Latency Aware user associations in heterogeneous cellular networks," pp. 4946-4951: IEEE.
[7] D. Liu, Y. Chen, K. K. Chai, and T. Zhang, "Distributed delay-energy aware user association in 3-tier HetNets with hybrid energy sources," pp. 1109-1114: IEEE.
[8] V. Chamola, B. Krishnamachari, and B. Sikdar, "Green Energy and Delay Aware Downlink Power Control and User Association for Off-Grid Solar-Powered Base Stations," IEEE Systems Journal, vol. 12, no. 3, pp. 2622-2633, 2018.
[9] R. Hu, A. Kwasinski, and A. Kwasinski, "Mixed strategy load management strategy for wireless communication network micro grid," pp. 1-8: IEEE.
[10] C. Daskalakis and C. H. Papadimitriou, "Three-player games are hard," 2005.
[11] R. Hu and A. Kwasinski, "Energy management for microgrids using a reinforcement learning algorithm," in 2018 IEEE Green Energy and Smart Systems Conference (IGESSC), 2018.
[12] "Microgrids for disaster preparedness and recovery with electricity continuity plans and systems," Premium Official News, Plus Media Solutions, 2015.
[13] A. Kwasinski and A. Kwasinski, "Integrating cross-layer LTE resources and energy management for increased powering of base stations from renewable energy," pp. 498-505: IFIP.
[14] A. Kwasinski and A. Kwasinski, "The role of multimedia source codecs in green cellular networks," pp. 1-6: IEEE, 2016.
[15] S. Vandael, B. Claessens, D. Ernst, T. Holvoet, and G. Deconinck, "Reinforcement Learning of Heuristic EV Fleet Charging in a Day-Ahead Electricity Market," IEEE Transactions on Smart Grid, vol. 6, no. 4, pp. 1795-1805, 2015.
[16] S. Homer and A. L. Selman, Computability and Complexity Theory, 2nd ed. New York: Springer, 2011.
[17] H. Karloff, Linear Programming (Progress in Theoretical Computer Science). Boston: Birkhäuser, 1991.
[18] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
[19] L. Buşoniu, R. Babuška, and B. De Schutter, "Multi-agent Reinforcement Learning: An Overview," vol. 310. Berlin, Heidelberg: Springer, 2010, pp. 183-221.
[20] P. S. Sastry, V. V. Phansalkar, and M. A. L. Thathachar, "Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information," IEEE Transactions on Systems, Man, and Cybernetics, vol. 24, no. 5, pp. 769-777, 1994.