
Stochastic_Maintenance_Schedules_of_Active_Distribution_Networks_Based_on_Monte-Carlo_Tree_Search

This paper presents a stochastic maintenance scheduling model for active distribution networks (DNs) that incorporates uncertainties from distributed energy resources (DERs) using a Monte-Carlo tree search (MCTS) approach. The model aims to minimize maintenance costs while adhering to reliability constraints over a multistage planning horizon. Numerical tests demonstrate the effectiveness of the proposed method compared to traditional optimization techniques, highlighting its potential for improving maintenance strategies in power systems.


3940 IEEE TRANSACTIONS ON POWER SYSTEMS, VOL. 35, NO. 5, SEPTEMBER 2020

Stochastic Maintenance Schedules of Active Distribution Networks Based on Monte-Carlo Tree Search

Yuwei Shang, Wenchuan Wu, Senior Member, IEEE, Jiawei Liao, Jianbo Guo, Senior Member, IEEE, Jian Su, Wei Liu, and Yu Huang

Abstract—The integration of volatile distributed energy resources (DERs) brings new challenges for the active distribution network maintenance scheduling (DN-MS). Conventionally, the DN-MS is formulated as a deterministic optimization model without considering the uncertainties of DERs. In this paper, the DN-MS is formulated as a multistage stochastic optimization problem, which is cast as a stochastic mixed-integer nonlinear programming model. It aims to reduce the total maintenance cost constrained by the reliability indices. To capture the operational characteristics of active distribution networks, the uncertainties of DERs and post-outage operation strategies of switching devices are incorporated into the model. In general, this type of model is intractable and mainly solved by heuristic search methods with low efficiency. Recently, Monte-Carlo tree search (MCTS) is emerging as a scalable and promising reinforcement learning approach. We propose a stochastic MCTS solution to this problem. In the tree search procedure, a sample average approximation technique is developed to estimate multistage maintenance costs considering uncertainties. To speed up the MCTS, the complicated constraints of the original problem are transformed to penalty or heuristics functions. This approach can asymptotically approximate the optimum with promising computation efficiency. Numerical test results demonstrate the superiority of the proposed method over benchmark methods.

Index Terms—Distribution network, distributed generation, electric vehicles, maintenance, Monte-Carlo tree search.

Manuscript received August 28, 2019; revised January 2, 2020; accepted February 9, 2020. Date of publication February 13, 2020; date of current version August 24, 2020. This work was supported in part by the National Science Foundation of China under Grant 51725703. (Corresponding author: Wenchuan Wu.)
Yuwei Shang and Wenchuan Wu are with Tsinghua University, Beijing 100084, China (e-mail: [email protected]; [email protected]).
Jianbo Guo, Jian Su, and Wei Liu are with China Electric Power Research Institute, Beijing 100192, China (e-mail: [email protected]; sujian@epri.sgcc.com.cn; [email protected]).
Jiawei Liao and Yu Huang are with Peking University, Beijing 100871, China (e-mail: [email protected]; [email protected]).
Color versions of one or more of the figures in this article are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TPWRS.2020.2973761

NOMENCLATURE

Sets
$\Omega_l^f$, $\Omega_{tr}^f$    Set of line sections or distribution transformers in feeder f
$\Omega_{nn}^f$    Set of load points in feeder f
$\Omega^{SWI}$    Set of switching devices
$\Omega^F$    Set of feeders
$\Omega^P$    Set of scenarios for stochastic variables

Constants and Parameters
$T$    Maintenance planning horizon
$b_j^f$    Constant of maintenance budget for equipment j in feeder f
$PM_j^f$    Preventive maintenance budget of equipment j in feeder f
$CM_j^f$    Corrective maintenance budget of equipment j in feeder f
$MB_t$    Stage-t maintenance budget
$C_j^{P,Mat}$, $C_j^{C,Mat}$    Cost of materials required for the repair of the equipment j in preventive or corrective maintenance
$C_j^{P,Wor}$, $C_j^{C,Wor}$    Cost per working hour necessary for the repair of equipment j in preventive or corrective maintenance
$H_j^{P,Wor}$, $H_j^{C,Wor}$    Number of working hours for the repair of equipment j in preventive or corrective maintenance
$L^f$, $TR^f$    Length of line segments or number of transformers in feeder f
$NC$    Number of customers
$\rho_i^P$, $\rho_i^U$    Prices of planned or unplanned unsupplied energy at load point i (¥/kWh)
$\lambda_j^{base}$    Base failure probability of equipment j
$\lambda_j^{incre}$    Incremental failure probability of equipment j
$r_{i,j}^{P,Mai}$, $r_{i,j}^{C,Mai}$    Restoration time experienced by load point i due to preventive or corrective maintenance of equipment j
$\delta_{SAIDI}$, $\delta_{SAIFI}$    Thresholds of SAIDI or SAIFI
$t_{AS}^f$, $t_{MS}^f$    Operation time of the automatic or manual switching devices in feeder f

Variables
$LT_{i,j}$    Binary variable that is 1 if load point i can be transferred to other feeders by operating switching devices due to the failure of equipment j, and 0 otherwise
$NC_j^{P,f}$, $NC_j^{U,f}$    Number of customers experiencing planned or unplanned outages due to maintenance of equipment j

0885-8950 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.


$\tilde{P}_{i,j}$    Unsupplied power for node i during the maintenance of equipment j
SAIDI    System Average Interruption Duration Index
SAIFI    System Average Interruption Frequency Index
$\lambda_j^f$    Failure probability of equipment j in feeder f
$C^{PM}$, $C^{CM}$    Labor and material cost of preventive or corrective maintenance
$C^{UE}$    Cost of unsupplied energy
$C^{P,UE}$, $C^{U,UE}$    Cost of unsupplied energy in planned or unplanned outage
$a_j^f$    Binary decision variable that is 1 if preventive maintenance is allocated for equipment j in feeder f, otherwise 0

In the Markov decision process and Monte-Carlo tree search
$A$, $S$    Set of all actions or states
$a$, $s$    Action or state
$v$, $\tilde{v}$    Node or chance node in the search tree
$Q(s, a)$    Reward function for taking action a in state s
$\gamma$    Discount rate

I. INTRODUCTION

A. Background

DISTRIBUTION networks are one of the most maintenance-intensive parts of power systems. Distribution network companies devote considerable effort to allocating their limited resources (e.g., labor and materials) to maintain the power assets (e.g., lines and transformers) [1]. In general, maintenance strategies can be classified into corrective maintenance (CM) and preventive maintenance (PM). The CM is performed after equipment failure occurs, while the PM is performed before equipment failure occurs. The goal of distribution network maintenance scheduling (DN-MS) is to enhance system reliability while reducing total costs over a multistage planning horizon. The integration of large-scale distributed energy resources (DERs), such as distributed generators (DGs) and electric vehicles (EVs), distinguishes DN-MS from existing models in terms of the cost formulation and reliability estimation:

1) In general, expenditure due to PM and CM mainly involves the costs of labor and materials, and the cost of unsupplied energy (also known as the expected energy not supplied, EENS). Most existing works formulate the EENS as an expected loss function of energy consumption. However, in active DNs, the generation losses of DGs should be considered in EENS, as well as their uncertainties.

2) During DN-MS formulation, the reliability indices are incorporated as constraints; reliability estimation is indispensable. For reliability estimation, post-outage operation of switching devices, such as sectionalizers and tie switches, is necessary for transferring the outage load to adjacent feeders, which is constrained by the feeders' available capacity, in turn determined by the power flow. The DER uncertainties are always ignored and the load demand is assumed to be constant, i.e., only one deterministic scenario is adopted for evaluation. This over-simplified model may not be applicable to active DNs.

To describe DER uncertainties and their influence, the DN-MS is formulated as a multistage stochastic programming model for active DNs. The solutions involve the maintenance budgeting and time allocation that minimize the expected costs while satisfying the reliability constraints of the distribution network. If only one operational scenario (i.e., one realization of DER uncertainties) is considered, the solution space to this problem contains approximately $k^{N_e \cdot T}$ candidate solutions, where $k$ is the number of maintenance decisions for each piece of equipment, $N_e$ is the number of pieces of equipment, and $T$ is the maintenance planning horizon. The problem is computationally challenging. The Monte-Carlo tree search (MCTS) method may provide a viable solution to the problem, as an artificial intelligence approach that has achieved superior performance in several computationally intensive computer games. We develop MCTS-based solutions to solve the DN-MS.

B. Related Work

Many works have focused on maintenance scheduling in DNs. Reference [2] established a quantitative model to analyze the impact of different PM strategies on system reliability and costs. References [3] and [4] presented a practical framework for implementing a reliability-centered maintenance procedure in DNs. Reference [5] used the network reliability and maintenance cost as evaluation criteria, and the fuzzy analytical hierarchical process (AHP) was adopted to assign their weights. Reference [6] formulated a mixed-integer linear programming (MILP) method and employed an efficient decision-tree solution for PM. They used a linearized cost model to describe the EENS and maintenance costs, etc. Reference [7] presented a MILP formulation in conjunction with time-dependent deterioration failure probabilities of equipment. The aim was to identify a cost-effective PM strategy while satisfying system reliability requirements. Reference [8] presented a two-step approach to optimize the maintenance strategy for feeders equipped with normally open switches. If the budget is limited and does not allow performance of the required maintenance, the loading of the worst-performing equipment is reduced. In [9] a decision tree was used to express state transitions and dynamic programming (DP) was employed to optimize the maintenance strategy. However, DP for DN-MS may suffer from the curse of dimensionality [6]. This issue can be addressed by the approximate DP, as evidenced by references [10] and [11] for solving maintenance scheduling problems in related fields. Some recent works, i.e., [12] and [13], incorporated the operation strategies of automatic switching devices, including sectionalizers and tie switches, into the DN-MS. They formulated mixed-integer non-linear programming (MINP) models that were solved by a heuristic search algorithm.

To date, no study has used the stochastic optimization model in the context of DN-MS. However, some recently published works, i.e., [14], [15] and [16], incorporated different types of uncertainties into the maintenance scheduling problem for power plants and transmission networks, and demonstrated remarkable performance improvements compared with


deterministic models. In [14], a robust computational framework was presented to incorporate uncertainties in the availability of backup generators during maintenance interventions. The framework was numerically tested in a hydroelectric power plant with the IEEE 24-bus reliability test system (IEEE-24 RTS). In [15], a two-stage SP framework for the maintenance schedule of generators was presented, with explicit consideration of unexpected failures. However, multistage SP models are more difficult to solve than two-stage SP models [17]. In [16], an SP model was formulated for the operation and maintenance schedule of a transmission system. A reinforcement learning solution was used but verification was done in a small-scale power system with four pieces of equipment requiring repair.

Recently, the MCTS has attracted the attention of the machine learning community due to its scalability with respect to computationally intensive problems. Essentially, the MCTS is a method for discrete optimization of an expensive-to-compute problem by taking random samples in the solution space and adaptively building a search tree according to the results. Reference [18] introduced the basic ideas of MCTS, in which two policies, i.e., the tree policy and the default policy, are used together to asymptotically direct randomized simulations towards trajectories that promise high returns. Different MCTS methods were reviewed in [19]. In [20], the MCTS-based approach was a key component in solving the computationally intensive game "GO". Reference [21] extended the MCTS method to game-playing from the perspectives of perfect and imperfect information. However, these methods were tied to their specific contexts and are not directly applicable to our problem for two reasons. First, they ignored intermediate rewards during implementation of the default policy, which leads to biased estimation of candidate maintenance schedules. Second, directly enumerating the feasible state and action sets constrained by the nonlinear constraints in our problem (e.g., reliability constraints, operational constraints of switches for network reconfiguration) may drastically decrease the computation efficiency of the MCTS.

C. Contributions

We formulate a stochastic MINP model (SMINP) for optimizing the maintenance schedules in active DNs. It aims to reduce the total maintenance and operation cost, constrained by the reliability indices over a multistage planning horizon. As far as we know, this is the first model that analytically embeds the uncertainties of DERs and operation strategies of switching devices (i.e., network reconfiguration or load restoration strategies).

We then propose an MCTS-based solution for solving the DN-MS. The basic version of the MCTS (B-MCTS) offers a concise computation framework by recursively using a tree policy to expand the search tree towards high-reward nodes, and a default policy to perform the simulations for updating the estimated rewards and other statistics. To cope with the uncertainties of DERs, we extend the B-MCTS to yield the stochastic MCTS (S-MCTS) through two novel customizations. First, we develop a sample average approximation (SAA) technique for estimating the multistage maintenance costs under uncertainties of DERs. The estimated results are used to help the tree policy to asymptotically identify the best-performing actions, i.e., those that approximate the optimal maintenance strategy. Second, to speed up the simulation, we transform the complicated constraints into penalty functions or heuristics. This transformation represents a cheap and viable computation strategy that avoids the need for analytic derivation of the feasible state and action sets during the recursive simulations.

Numerical tests on a distribution network with 112 pieces of equipment (i.e., the single-stage search space contains $2^{112}$ possible maintenance schedules) show the superiority of S-MCTS over B-MCTS and the genetic algorithm (GA). As compared to B-MCTS, the developed SAA technique enhances the optimality of solutions, and the constraints transformation strategy enhances the feasibility of solutions. Evaluations on S-MCTS and GA show that S-MCTS achieves a 7.5% greater cost reduction than GA with only 8.6% of the GA's CPU time.

The remainder of this paper is organized as follows. We formulate the problem in Section II. In Section III, we introduce the proposed solution. We report the results of case studies in Section IV and conclude the paper in Section V.

II. MODEL FORMULATION

Fig. 1. Single-line diagram of a two-feeder distribution network (DN).

We illustrate the problem by using a simplified DN, as shown in Fig. 1. SS1 and SS2 denote two substations. They supply feeders F1 and F2, respectively. In F1, there are seven line sections {l1, …, l7}, along with their distribution transformers {tr1, tr2, tr3} in the PM list to be scheduled. These transformers are connected to a DG, an EV charge station and an ordinary load, respectively. The switching devices include two circuit breakers {CB1, CB2}, three sectionalizers {S1, S2, S3} and a tie switch {T1}. These switches can be operated automatically or manually; CBs are operated to trip the fault, sectionalizers are operated to isolate faults or restore power to outage areas, and tie switches can provide alternative power supply paths from other feeders. These switches are used for different purposes (e.g., network reconfiguration, load restoration) under different circumstances. When the PM is scheduled, the network reconfiguration is commonly involved to provide alternative power supply paths for the unsupplied loads. However, when an equipment failure occurs, not only does the CM need to be performed on the equipment, but restoration is also involved through the post-fault network reconfiguration of the un-faulted out-of-service area. For instance, when equipment l1 is undergoing PM or CM, all load points experience a power supply cutoff from F1. A load-flow analysis will be conducted to determine whether the load points can be transferred from F1 to F2. If the available capacity of F2 is sufficient, then T1 is closed to provide an



alternative power supply for all load points in F1. If there is insufficient capacity and only LD can be transferred to F2, then T1 and S3 will be closed while S1 and S2 will be opened.

A. Objective Function

The objective of the problem is to minimize the expected maintenance costs through optimizing the maintenance budgeting and time allocation for the correlated equipment. Let $T$ be the length of the maintenance scheduling horizon, and $t$ the time index. Let $C(a_t)$ be the stage-t costs, i.e., the cost incurred in the period $[t, t+1)$,

$$ C(a_t) = C_t^{PM} + C_t^{CM} + C_t^{UE} \tag{1} $$

where $C_t^{PM}$ and $C_t^{CM}$ are the labor and material costs of PM and CM, respectively. $C_t^{UE}$ is the cost of the EENS in both planned and unplanned outages. $a_t$ denotes the maintenance decisions for all equipment at stage t, i.e., $a_t = \{a_{j,t}^f \mid j \in \Omega_l \cup \Omega_{tr}, f \in \Omega^F\}$.

$C_t^{PM}$ can be obtained based on the scheduled PM budget for the distribution equipment by the distribution utility,

$$ C_t^{PM} = \sum_{f \in \Omega^F} \sum_{j \in \{\Omega_l \cup \Omega_{tr}\}} a_{j,t}^f \cdot PM_{j,t}^f \tag{2} $$

$C_t^{CM}$ is obtained according to the number of working hours and the amount of materials needed for CM activities,

$$ C_t^{CM} = \sum_{f \in \Omega^F} \sum_{j \in \{\Omega_l \cup \Omega_{tr}\}} C_{j,t}^{CM,f} \tag{3} $$

$$ C_{j,t}^{CM,f} = \lambda_{j,t}^f \cdot (C_j^{wor} \cdot H_j^{wor} + C_j^{mat}), \quad \forall j \in \{\Omega_l^f \cup \Omega_{tr}^f\}, \ \forall f \in \Omega^F \tag{4} $$

where $\lambda_{j,t}^f$ is the failure probability of equipment j.

In general, equipment failures can be classified as random or deteriorating failures (i.e., ageing). Random failures are commonly modeled by exponential probability distributions, while deteriorating failures are commonly modeled by the Weibull or normal probability distributions, whose failure probability may increase over time. PM can reduce the probability of deteriorating failures [9], and the failure probability of the equipment could be decreased in proportion to their allocated maintenance budget [12]. Hence the failure probability in a time interval is given as,

$$ \lambda_{j,t+1}^f = a_{j,t}^f \cdot \lambda_{j,t}^f \cdot e^{-(b_j^f \cdot PM_{j,t}^f)} + (1 - a_{j,t}^f) \cdot \lambda_{j,t}^f + \lambda_{j,t}^{f,incre}, \quad \forall j \in \{\Omega_l^f \cup \Omega_{tr}^f\}, \ \forall f \in \Omega^F \tag{5} $$

where $b_j^f$ and $\lambda_{j,t}^{f,incre}$ can be determined based on the historical maintenance data [12]. Note that (5) is a general model to describe the deteriorating failure probability. In short-term maintenance scheduling problems (e.g., monthly, weekly), $\lambda_{j,t}^{f,incre}$ can be simply set to 0. Besides, as more sensors and smart meters are deployed, condition-based information can be used to replace (5) for capturing the time-varying failure probability more accurately [22].

$C_t^{UE}$ is estimated based on the interrupted energy during outage. We use a stochastic variable denoted as $\tilde{C}_t^{UE}$ to incorporate uncertainties of DERs. Here the tilde superscript is used to distinguish stochastic variables from deterministic variables,

$$ \tilde{C}_t^{UE} = \sum_{f \in \Omega^F} \sum_{j \in \{\Omega_l^f \cup \Omega_{tr}^f\}} \sum_{i \in \Omega_{nn}^f} \left( \tilde{C}_{i,j,t}^{P,UE,f} + \tilde{C}_{i,j,t}^{U,UE,f} \right) \tag{6} $$

where $\tilde{C}_{i,j,t}^{P,UE,f}$ and $\tilde{C}_{i,j,t}^{U,UE,f}$ are the planned and unplanned unsupplied energy costs for load point i due to stage-t maintenance of equipment j.

$\tilde{C}_{i,j,t}^{P,UE,f}$ is given by

$$ \tilde{C}_{i,j,t}^{P,UE,f} = a_{j,t}^f \cdot \tilde{P}_{i,j,t} \cdot r_{i,j,t}^P \cdot \rho_{i,j,t}^P, \quad \forall i \in \Omega_{nn}^f, \ \forall j \in \{\Omega_l^f \cup \Omega_{tr}^f\}, \ \forall f \in \Omega^F \tag{7} $$

where $\tilde{P}_{i,j,t}$ is the stochastic power and $r_{i,j,t}^P$ is the restoration time of load point i during stage-t maintenance of equipment j, where,

$$ r_{i,j,t}^P = LT_{i,j,t} \cdot \left( p_{AS}\, t_{AS}^f + (1 - p_{AS})\, t_{MS}^f \right) + (1 - LT_{i,j,t})\, r_{i,j}^{P,Mai}, \quad \forall i \in \Omega_{LD}^f, \ \forall j \in \{\Omega_l^f \cup \Omega_{tr}^f\} \tag{8} $$

$LT_{i,j,t}$ is the binary variable denoting whether the load can be transferred to alternative feeders. If load point i can be transferred to a backup feeder, $LT_{i,j,t}$ equals 1; otherwise, it is 0. As a result, if load point i cannot be transferred, it will experience a restoration time equal to the maintenance duration of equipment j, which usually lasts from several hours to several days. If load point i is transferable, we further consider the possibility of automatic switch malfunctions in practice. Let $p_{AS}$ be the probability of correct operation of automatic switches; if switches always operate correctly in automatic mode, $p_{AS} = 1$ and load point i has a restoration time that is equal to the automatic switching time. In contrast, if the switches do not operate correctly in automatic mode, $p_{AS} = 0$, which implies that load point i experiences the manual switching time [13].

Similar to $\tilde{C}_{i,j,t}^{P,UE,f}$, $\tilde{C}_{i,j,t}^{U,UE,f}$ is calculated by

$$ \tilde{C}_{i,j,t}^{U,UE,f} = \lambda_{j,t}^f \cdot \tilde{P}_{i,j,t} \cdot r_{i,j,t}^U \cdot \rho_{i,j,t}^U, \quad \forall i \in \Omega_{nn}^f, \ \forall j \in \{\Omega_l^f \cup \Omega_{tr}^f\}, \ \forall f \in \Omega^F \tag{9} $$

where,

$$ r_{i,j,t}^U = LT_{i,j,t} \cdot \left( p_{AS}\, t_{AS}^f + (1 - p_{AS})\, t_{MS}^f \right) + (1 - LT_{i,j,t})\, r_{i,j}^{C,Mai}, \quad \forall i \in \Omega_{LD}^f, \ \forall j \in \{\Omega_l^f \cup \Omega_{tr}^f\} \tag{10} $$

Let Pr denote the probability distribution of the stochastic power $\tilde{P}$ on a common and complete probability space $\mathcal{P}$, and let the vector $\tilde{\omega}_t \equiv \{\tilde{P}_{i,t} : i \in \Omega_{nn}\} \in \Omega_t^P$ denote the stage-t stochastic power for all load points. Assuming the initial stage is stage 0, we refer to a scenario $\omega_t^T \in \Omega_t^{T,P} \equiv \Omega_t^P \times \cdots \times \Omega_{T-1}^P$ as a realization (or sampling trajectory) of the stochastic process $\{\tilde{\omega}_t : 0 \le t \le T-1\}$ (see [17] for details of the sampling procedure). For the DN-MS problem over the multistage planning horizon, its objective can be finally formulated as a nested


multistage expectation minimization function:

$$ z = \min_{a_0} C(a_0) + \gamma E_{\tilde{\omega}_1} \Big[ \min_{a_1} C(a_1) + \gamma E_{\tilde{\omega}_2 | \tilde{\omega}_1} \Big[ \min_{a_2} C(a_2) + \cdots + \gamma E_{\tilde{\omega}_{T-1} | \tilde{\omega}_{T-2}} \big[ \min_{a_{T-1}} C(a_{T-1}) \big] \cdots \Big] \Big] \tag{11} $$

where $E_{\tilde{\omega}_{t+1} | \tilde{\omega}_t}$ is the expectation following the conditional probability $\Pr(\tilde{\omega}_{t+1} | \tilde{\omega}_t)$. $C(\cdot)$ denotes the cost formulated in (1). $\gamma$ is the discount factor for future maintenance costs. For short-term maintenance scheduling, $\gamma$ can be set to 1.

B. Constraints

The maintenance decision is subject to constraints (12)–(16).

1) The allocated stage-t maintenance budget should be less than the maximum maintenance budget,

$$ C_t^{PM} + C_t^{CM} \le MB_t \tag{12} $$

2) The reliability indices System Average Interruption Frequency Index (SAIFI) and System Average Interruption Duration Index (SAIDI) are used to represent the degree of customer satisfaction. They should be less than the allowable thresholds [6],

$$ \sum_{f \in \Omega^F} \sum_{j \in \{\Omega_l \cup \Omega_{tr}\}} \left( \lambda_{j,t}^f \cdot \frac{NC_{j,t}^{U,f}}{NC} + a_{j,t}^f \cdot \frac{NC_{j,t}^{P,f}}{NC} \right) \le \delta_{SAIFI} \tag{13} $$

$$ \sum_{f \in \Omega^F} \sum_{j \in \{\Omega_l \cup \Omega_{tr}\}} \left( \lambda_{j,t}^f \cdot \frac{\sum_{i \in \Omega_{nn}^f} r_{i,j,t}^U}{NC} + a_{j,t}^f \cdot \frac{\sum_{i \in \Omega_{nn}^f} r_{i,j,t}^P}{NC} \right) \le \delta_{SAIDI} \tag{14} $$

where the reliability indices are estimated by incorporating the post-outage events and network reconfiguration through switches [23].

3) The failure probability of the equipment should not be lower than the minimum failure probability in its useful life period [10],

$$ \lambda_{j,t}^f \ge \lambda_{j,t}^{base}, \quad \forall j \in \{\Omega_l^f \cup \Omega_{tr}^f\}, \ \forall f \in \Omega^F \tag{15} $$

where $\lambda_j^{base}$ is the minimum failure probability of equipment j during its service life. For simplicity, $\lambda_j^{base}$ can be set to 0.

4) The thermal capacity of the equipment restricts the maximum current that can be carried through,

$$ I_{j,t}^f(\tilde{\omega}_t) \le I_{j,t}^{f,max}, \quad \forall j \in \{\Omega_l^f \cup \Omega_{tr}^f\}, \ \forall f \in \Omega^F \tag{16} $$

which determines the operation strategies of switching devices for supply restoration or network reconfiguration according to the power flow distribution.

The DN-MS problem is formulated as a multi-stage SP:

$$ \min \ (11) \quad \text{s.t.} \ (12)\text{–}(16) \tag{17} $$

which is different from the existing two-stage SP models, which provide only a single recourse opportunity under uncertainty.

The decision variables of (17) are denoted by $\{a_0, \ldots, a_{T-1}\}$. Taking $a_0$ as an example, it is a set of binary variables representing the maintenance decisions of all equipment at stage 0, i.e., $a_0 = \{a_{j,0}^f \mid j \in \Omega_l \cup \Omega_{tr}, f \in \Omega^F\}$. In Section III we describe the MCTS-based approach for obtaining high-quality approximate solutions to this problem.

III. PROPOSED METHOD

A. Preliminaries of the Computation Framework

The problems solved by the MCTS are commonly formalized by the Markov decision process (MDP), in which $S$ and $A$ denote the state space and action space, respectively, $F : S \times A \to S$ denotes the transition function from a state-action pair to the next state, $Q(s)$ denotes the reward of a state $s \in S$, and $A_f(s)$ denotes the set of available actions in state s. In the DN-MS problem, we use $Q$ as the negative of the maintenance cost $C$ formulated in (1), and the transition function $F$ is defined by (5). The state and action are

$$ s = \{\lambda_j^f \mid j \in \Omega_l \cup \Omega_{tr}, f \in \Omega^F\} \tag{18} $$

$$ a = \{a_j^f \mid j \in \Omega_l \cup \Omega_{tr}, f \in \Omega^F\} \tag{19} $$

where s and a are vectors comprising failure probabilities and maintenance decisions for all equipment, respectively. Note that $\lambda_j^f \in [0, 1]$ is a real-valued variable, and $a_j^f \in \{0, 1\}$ is a binary variable that equals 1 when PM is scheduled for equipment j and 0 otherwise. We use $t = 0$ to represent the initial stage and $\lambda_{j,0}^f$ to represent the stage-0 failure probability of equipment j; the corresponding state vector is $s_0 = \{\lambda_{j,0}^f \mid j \in \Omega_l \cup \Omega_{tr}, f \in \Omega^F\}$. If the stage-0 maintenance strategy is $a_0 = \{a_{j,0}^f \mid j \in \Omega_l \cup \Omega_{tr}, f \in \Omega^F\}$, we can derive the corresponding state vector $s_1$ by (5). Similarly, the state vector $s_t$ can be derived based on $\{s_{t-1}, a_{t-1}\}$, where $t \in \{1, \ldots, T\}$.

A Monte-Carlo search tree consists of nodes and edges. Let $v$ be a tree node corresponding to a state s. The directed tree edge connecting a parent node to its child node represents an action a leading to state transitions. Each node v contains the following statistics: state $s(v)$, selected action $a(v)$, expected reward $Q(v)$ and visit count $N(v)$. For example, in our problem, $v_0$ is the root node of the tree with statistics $s(v_0)$ denoting the state vector $s_0$ defined in (18), and $a(v_0)$ denoting $a_0$ defined in (19). $v_1$ is one of the child nodes of $v_0$, whose statistics $s(v_1)$ and $a(v_1)$ represent stage-1 state and action vectors $s_1$ and $a_1$, respectively. If the total number of pieces of equipment to be repaired is $N_e$, then the number of child nodes for $v_0$ is $2^{N_e}$. The $Q(v)$ and $N(v)$ of each node v are updated during the implementation of the MCTS. When the iterative computation budget of the MCTS is reached (e.g., constraint of time, iteration or memory), we can identify a complete path of the search tree denoted $\{v_0, v_1, \ldots, v_T\}$, with the tree edges representing the set of optimal maintenance strategies $\{a_0, a_1, \ldots, a_{T-1}\}$ and the resulting failure probability vectors $\{s_1, s_2, \ldots, s_T\}$.

The above process offers the general computation framework of the MCTS-based methods for solving the DN-MS. In subsection B, we briefly introduce the computation procedure of the


basic MCTS and its gaps for solving (17). To bridge the gaps, in
subsection C we propose the S-MCTS, which is customized
from the B-MCTS by introducing two novel computation
strategies.
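
As an illustration of the MDP elements introduced in subsection A, the following minimal Python sketch implements the state transition of (5) and a simplified stage reward. It is a sketch under stated assumptions, not the authors' implementation: the function names `transition` and `stage_reward`, the flat per-equipment lists, and the numerical values are hypothetical, the feeder index f is dropped, and the unsupplied-energy cost of (6) is omitted so that the reward is only the negative of the PM and CM costs in (2)–(4).

```python
import math
from typing import List, Sequence


def transition(lam: Sequence[float], a: Sequence[int], b: Sequence[float],
               pm: Sequence[float], lam_incre: Sequence[float]) -> List[float]:
    """Next-stage failure probabilities following Eq. (5), one entry per item.

    lam       -- current failure probabilities (the state s)
    a         -- binary PM decisions (the action a)
    b, pm     -- budget constant and PM budget of each item
    lam_incre -- incremental failure probability of each item
    """
    nxt = []
    for lam_j, a_j, b_j, pm_j, inc_j in zip(lam, a, b, pm, lam_incre):
        maintained = a_j * lam_j * math.exp(-b_j * pm_j)  # term active when PM is scheduled
        untouched = (1 - a_j) * lam_j                     # term active when no PM is scheduled
        nxt.append(maintained + untouched + inc_j)
    return nxt


def stage_reward(lam: Sequence[float], a: Sequence[int], pm: Sequence[float],
                 cm_cost: Sequence[float]) -> float:
    """Reward Q = -(C^PM + C^CM); the C^UE term of Eq. (1) is omitted (assumption).

    cm_cost[j] stands for C_j^wor * H_j^wor + C_j^mat in Eq. (4).
    """
    c_pm = sum(a_j * pm_j for a_j, pm_j in zip(a, pm))           # Eq. (2)
    c_cm = sum(lam_j * c_j for lam_j, c_j in zip(lam, cm_cost))  # Eqs. (3)-(4)
    return -(c_pm + c_cm)


if __name__ == "__main__":
    s0 = [0.2, 0.1]   # hypothetical stage-0 failure probabilities
    a0 = [1, 0]       # PM on the first item only
    s1 = transition(s0, a0, b=[0.5, 0.5], pm=[2.0, 2.0], lam_incre=[0.0, 0.0])
    print(s1, stage_reward(s0, a0, pm=[2.0, 2.0], cm_cost=[10.0, 20.0]))
```

In this sketch the maintained item's failure probability shrinks by the factor $e^{-b \cdot PM}$ while the other item is unchanged; the lower bound of constraint (15) is not enforced here and would have to be added separately.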

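The search-tree bookkeeping described in subsection A (a node holding a state, an action, an accumulated reward Q(v) and a visit count N(v), with $2^{N_e}$ candidate actions per stage) could be represented as in the sketch below. The class and function names are hypothetical. Two choices are assumptions of this sketch rather than statements of the paper: each node stores the action on the edge leading into it, and the final path is read off by following the most-visited child, a common MCTS convention that the text above does not specify.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Optional, Tuple


@dataclass
class Node:
    state: Tuple[float, ...]                   # s(v): failure probabilities
    action: Optional[Tuple[int, ...]] = None   # action on the edge into this node (assumption)
    q: float = 0.0                             # Q(v): accumulated reward
    n: int = 0                                 # N(v): visit count
    children: List["Node"] = field(default_factory=list)


def candidate_actions(n_e: int) -> List[Tuple[int, ...]]:
    """All binary PM decision vectors for n_e items; there are 2**n_e of them."""
    return list(product((0, 1), repeat=n_e))


def most_visited_path(root: Node) -> List[Node]:
    """Follow the most-visited child at every level (assumed selection rule)."""
    path = [root]
    node = root
    while node.children:
        node = max(node.children, key=lambda child: child.n)
        path.append(node)
    return path


if __name__ == "__main__":
    root = Node(state=(0.2, 0.1))
    root.children.append(Node(state=(0.07, 0.1), action=(1, 0), n=5))
    root.children.append(Node(state=(0.2, 0.1), action=(0, 0), n=2))
    print(len(candidate_actions(3)))
    print([node.action for node in most_visited_path(root)])
```

Enumerating all $2^{N_e}$ actions is only practical for a handful of items; for the 112-item network mentioned in the Introduction, actions would have to be sampled rather than listed.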
B. Basic MCTS Approach


The B-MCTS is an iterative computation process that consists of four steps in each iteration (see [19] for implementation details):
a) Selection: Each iteration starts from the root node $v_0$. A child node $v_c$ of its parent $v_p$ is recursively selected following the criterion that maximizes

$$\mathrm{UCT}_{\mathrm{basic}} = \frac{Q(v_c)}{N(v_c)} + C_p \sqrt{\frac{\ln N(v_p)}{N(v_c)}} \qquad (20)$$

where $N(v_p)$ is the number of times $v_p$ has been visited, $N(v_c)$ is the number of times $v_c$ has been visited, and $C_p > 0$ is a constant. The selected child node is required to represent a nonterminal state and to have unvisited child nodes.
b) Expansion: Assume in step a that the nodes $\{v_0, v_1, \ldots, v_{t-1}\}$ have been selected; then, an unvisited action $a_{t-1}$ from the stage-$(t-1)$ feasible action set is selected for $v_{t-1}$ and a new leaf node $v_t$ joins the tree. Steps a and b complete the implementation of the tree policy, which addresses the exploration-exploitation dilemma by allowing random selection of child nodes while asymptotically approximating the optimum [21].
c) Simulation: A default policy is applied to perform a simulation from the newly expanded node $v_t$ to the terminal node $v_T$. Commonly, the default policy is a uniformly random search strategy aiming to produce a reward only at the terminal node, i.e., $Q(v_T)$, through fast simulations. This policy largely ensures the efficiency of the recursive computations in MCTS.
d) Backup: The simulation results during one iteration are backpropagated for all visited nodes from $v_t$ to $v_0$, with the following statistics updated for these nodes,

$$N(v) \leftarrow N(v) + 1 \qquad (21)$$

$$Q(v) \leftarrow Q(v) + Q(v_T) \qquad (22)$$

whose aim is to inform the tree policy decision in the next iteration.
Despite the conciseness of the B-MCTS, it is not directly applicable to our problem because it ignores the intermediate rewards $\{Q(v_t), \ldots, Q(v_{T-1})\}$ and their estimations under different possible scenarios during implementation of the default policy. Moreover, the analytic derivation of the feasible state and action sets from (12)–(16) in each iteration of the tree search involves a heavy computational burden.

C. Stochastic MCTS Method

In this subsection, we provide the customizations that we made to extend B-MCTS into S-MCTS. The implementation details of S-MCTS are presented in Appendix A. To explain the enhanced computation strategies, Fig. 2 is used to highlight the differences in iteration procedures between the B-MCTS and the S-MCTS.

Fig. 2. Differences between the B-MCTS and S-MCTS in iteration procedures.

In the tree configuration, we replace a node $v$ in B-MCTS by a chance node $\langle v \rangle$ in S-MCTS during the simulation step. A chance node $\langle v \rangle$ is defined to incorporate the uncertainties for the reward estimation of the node. Different settings of the search tree lead to different uncertainty incorporation strategies. As mentioned in Section III-A, the tree nodes corresponding to state variables are set as a deterministic function of a parent node and an action. To incorporate uncertainties of DERs under this setting, the reward estimation for a node in one stage is given by

$$Q(\langle v \rangle) = \mathbb{E}_{\omega \in \Omega^{P}}\left(Q(\omega)\right) \qquad (23)$$

where $Q(\omega)$ denotes the reward obtained in one possible scenario conditional on vector $\omega$, and $Q(\langle v \rangle)$ incorporates all possible scenarios $\omega \in \Omega^{P}$ in one stage. Many statistical models are available for describing the stochastic power $P$. We use the normal distribution [25], [26], i.e., $P \sim N(\mu_P, \sigma_P^2)$, where $\mu_P$ and $\sigma_P^2$ are the sample mean and variance of $P$.
Note that as long as $\langle v \rangle$ is not a terminal node, multistage uncertainties should be incorporated to provide an unbiased estimation of the expected reward. In the stochastic optimization community the SAA method has been well studied [17], [24]: a random sample $\omega^1, \ldots, \omega^M$ of $M$ realizations of the random vector $\omega$ is generated, and the true distribution of the original problem is replaced by the empirical distribution constructed from the $M$ scenarios, each taken with probability $1/M$. In this work, we develop an SAA-based reward estimation strategy, i.e., given a stage-$t$ node $\langle v_t \rangle$, its expected reward from stage $t$ onwards is formulated as $\sum_{t}^{T} \mathbb{E}_{\omega_t \in \Omega_t^{P}}\left(Q(\omega_t)\right)$ and approximated by $M$ realizations of the random process $\omega_t$. A detailed example is provided in Appendix B to show how the multistage uncertainty is incorporated for node reward estimation.
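The SAA-based reward estimation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy stage reward, and the numeric values are hypothetical, and the only elements taken from the text are the normal model for the stochastic power and the equal weighting 1/M of the M sampled scenarios.

```python
import random
from typing import Callable, List, Sequence


def sample_scenarios(mu: Sequence[float], sigma: Sequence[float], m: int,
                     rng: random.Random) -> List[List[float]]:
    """Draw m scenarios, each holding one DER power value per remaining stage.

    sigma holds standard deviations (the square roots of the variances).
    """
    return [[rng.gauss(mu_t, sigma_t) for mu_t, sigma_t in zip(mu, sigma)]
            for _ in range(m)]


def saa_reward(stage_reward: Callable[[int, float], float],
               scenarios: Sequence[Sequence[float]]) -> float:
    """Approximate the expected multistage reward of a chance node.

    Each scenario contributes the sum of its stage rewards, and all
    scenarios are weighted equally with probability 1/M.
    """
    total = 0.0
    for scenario in scenarios:
        total += sum(stage_reward(t, power) for t, power in enumerate(scenario))
    return total / len(scenarios)


def toy_stage_reward(stage: int, power: float) -> float:
    """Hypothetical reward: minus the cost of demand not covered by the DER."""
    demand = 1.0           # assumed demand, arbitrary units
    shortage_price = 10.0  # assumed cost per unit of unsupplied demand
    return -shortage_price * max(0.0, demand - power)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
scenarios = sample_scenarios(mu=[0.8, 0.9, 1.0], sigma=[0.1, 0.1, 0.1],
                             m=100, rng=rng)
estimate = saa_reward(toy_stage_reward, scenarios)
```

Increasing m trades computation time for a less noisy estimate, which is the trade-off examined later with 1, 20 and 100 scenarios.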


During the recursive computations, rather than analytically deriving the feasible action/state space from constraints (12)–(16), we transform them into penalty functions or heuristics denoted by $K = \{k_l(s, a)\}_{l=1}^{5}$, where $k_l \in K$ is the $l$th penalty/heuristic function corresponding to a specific constraint. For instance, $k_1$ corresponds to (12) and $k_5$ corresponds to (16). Constraints (12)–(14) are in the form of $f(a) \le \delta_a$, where $f(a)$ and $\delta_a$ are the left- and right-hand sides of the constraint, respectively. They are transformed into the penalty functions,

$$k_l(s, a): \ \mathrm{penalty}_l = \sigma_l \left(f(a) - \delta_a\right), \quad l \in \{1, 2, 3\} \qquad (24)$$

where $\sigma_l > 0$ is a penalty coefficient. Because the reliability of DN varies by power outage area and the scheduled maintenance strategy, we use (13)–(14) to estimate the SAIDI and SAIFI by aggregating the estimated results of all load points under all scenarios [23]. If the finally estimated reliability indices exceed their predefined thresholds, a penalty value greater than 0 is then produced by (24).
Constraint (15) is transformed into a heuristic (if-then rule),

$$k_4(s, a): \ \text{if (15) is unsatisfied, } \lambda_{j}^{f} \leftarrow \lambda_{j,t}^{f,\mathrm{base}} \qquad (25)$$

Constraint (16) confines the operation strategies of switching devices. Because performing online searches for the on/off state of these switches together with all the maintenance decisions is computationally expensive, we obtain the operation strategies of these switching devices by solving the following heuristic in advance, rather than during the tree search process,

$$k_5(s, a): \ \left\{ \Omega_{nn}^{f,\mathrm{trans.}} : \sum_{i \in \Omega_{nn}^{f,\mathrm{trans.}}} I_i \le I^{f_b,\max} - I^{f_b} \right\} \qquad (26)$$

where $\Omega_{nn}^{f,\mathrm{trans.}}$ is the set of maximum load points that can be transferred from feeder $f$, $I_i$ is the demand of load point $i$ and $I^{f_b,\max} - I^{f_b}$ is the spare capacity of backup feeder $f_b$.
During the backup step shown in Fig. 2, different from the B-MCTS that only uses the reward $Q(v_T)$ for reward updating, in the S-MCTS the expected rewards of each chance node are memorized and accumulated for updating the reward of all visited nodes. We thereby modify (20) into,

$$\mathrm{UCT}_{\mathrm{modified}} = \sum_{t}^{T} \mathbb{E}_{\omega_t \in \Omega_t^{P}}\left(Q(\omega_t)\right) + C_p \sqrt{\frac{\ln N(v_p)}{N(v_c)}} - \sum_{l=1}^{3} \max(0, \mathrm{penalty}_l) \qquad (27)$$

which asymptotically guides the tree search process towards the optimal solution of (17) as follows. Initially, the term $C_p \sqrt{\ln N(v_p) / N(v_c)}$ encourages the exploration of unvisited nodes in different stages; it then gradually decreases toward 0 and thereby encourages exploitation of the nodes with promising rewards. Once constraints (12)–(16) are violated, the term $-\sum_{l=1}^{3} \max(0, \mathrm{penalty}_l)$ biases the rewards of the corresponding nodes, which in turn reduces the chances of selecting these nodes in future iterations. As the number of iterations increases, the estimated reward improves owing to the term $\sum_{t}^{T} \mathbb{E}_{\omega_t \in \Omega_t^{P}}\left(Q(\omega_t)\right)$; when the computation budget is reached, the trajectory that obtains the maximum value can be identified, i.e., a set of maintenance actions $\{a_0, a_1, \ldots, a_{T-1}\}$ is formed as the solution to the original problem (17), since the maximum approximated rewards are obtained without any constraint violations.

IV. CASE STUDY

In this section, we describe the case studies that we used to test the proposed algorithm, which is implemented in Python 3.6.5. To demonstrate the performance of the proposed method, two case studies are provided. In subsection A, a small-scale test system is used to exhibit the merits of the DN-MS model and the feasibility of the S-MCTS. In subsection B, a full-scale test system is presented to show the scalability of the S-MCTS.

A. Small-Scale Case Study

Fig. 3 shows the configuration of the small-scale test system. Only one feeder is involved, which contains 8 line sections and 4 transformers with 4 load points. A circuit breaker, three sectionalizers and a tie switch are installed on the feeder for network reconfiguration or load restoration. The uncertainty of DG is considered for maintenance scheduling. The key maintenance parameters (e.g., cost of labor and number of maintenance hours required for PM and CM, equipment failure parameters) are listed in Appendix C. In practice, the DN-MS of many utilities is mainly divided into annual, monthly, and weekly maintenance scheduling. The annual scheduling determines the list of equipment for maintenance in a year, and provides a preliminary division of maintenance tasks into different months. The monthly scheduling allocates the PM tasks into different workdays, which are specified by the distribution network operator. The weekly scheduling determines the maintenance activities in each given day. So, the annual schedules provide very rough maintenance strategies, while the weekly schedules only slightly change the monthly schedules considering temporary events. In this work, we study the monthly maintenance scheduling problem as an example, although the proposed method is applicable to other maintenance optimization problems. For simplicity, it is assumed that there are three workdays for PM scheduling (i.e., the planning horizon $T$ equals 3), and all line sections and transformers in Fig. 3 are in the PM scheduling list. So, in the MCTS, the branching factor of the root node is $2^{12}$.

Fig. 3. Configuration of the small-scale DN.

To estimate the computation performance of the S-MCTS, several variants of MCTS algorithms are investigated, the characteristics of which are listed in Table I. They are differentiated by whether they incorporate the DER uncertainties, the network
reconfiguration strategy and the nonlinear constraints during tree search. For instance, when formulating maintenance decisions, B-MCTS$_{0,1}$ is used as the baseline MCTS that considers only 1 scenario with network reconfiguration, yet the constraints transformation strategy is not incorporated in the algorithm; in contrast, S-MCTS$_{2,100}$ considers 100 scenarios of DER uncertainty with network reconfiguration and the constraints transformation strategy. Specifically, in the penalty functions, the thresholds for the maintenance budget, SAIFI and SAIDI, in one stage, are set to ¥$8 \times 10^4$, 2 and 3, and their penalty coefficients are 5, $10^5$ and $10^5$, respectively. In performance comparisons, these algorithms are first used to generate the maintenance schedules. Then, the obtained maintenance schedules are evaluated by 1,000 randomly generated scenarios.

TABLE I
CHARACTERISTICS OF COMPARED ALGORITHMS

Table II lists a detailed PM plan obtained using the S-MCTS$_{2,100}$, which minimizes the expected cost subject to the system reliability level and maintenance resources. Specifically, the thresholds for SAIFI and SAIDI are 2 and 3, respectively; the maximum maintenance capacity in one day is 6 line sections and 2 transformers. In Day 1, a planned outage for the PM of 4 line sections and two transformers is performed, since the load points LD1 and LD2 are of low demand; the DG1 and LD3 are transferred to the backup feeder, which reduces the cost of EENS. After maintenance the DG1 and LD3 are transferred back to the feeder. In Day 2, the planned outage is performed on LD3 owing to its low demand, while the power supply of other load points is not affected. In Day 3, the DG is interrupted due to its low production level, while other load points are restored through network reconfiguration. Note that $l_6$ is not maintained, because its failure probability is low and over-maintenance is avoided; under-maintenance is also avoided, as other equipment has an increased failure probability that may induce high costs of CM and EENS. Overall, the schedule properly avoids repetitive interruption of load points. For instance, the line section and transformer belonging to the same lateral are maintained together, which improves the reliability level and reduces the cost of EENS. For the main feeder, $l_1$ and $l_7$ are scheduled on different days, thus a power supply path is always available that improves the reliability level for relevant load points.

TABLE II
PREVENTIVE MAINTENANCE SCHEDULES

The merits of transforming the complicated constraints into penalty functions/heuristics are investigated by two pairs of comparisons. From Table III, we observe that those methods employing the constraints transformation strategy ensure the reliability level. However, both the methods without the constraints transformation strategy (B-MCTS$_{0,1}$ and S-MCTS$_{0,20}$) obtain unqualified reliability indices. It can be concluded that the constraints transformation strategy enhances the feasibility of the MCTS's solutions.

TABLE III
PERFORMANCE COMPARISON OF INCORPORATING COMPLICATED CONSTRAINTS

Table IV shows the results of different S-MCTSs with respect to the incorporation of network reconfiguration strategy and DER uncertainties. The strength of incorporating the network reconfiguration is analyzed by comparing S-MCTS$_{1,1}$ to S-MCTS$_{2,1}$, S-MCTS$_{1,20}$ to S-MCTS$_{2,20}$, and S-MCTS$_{1,100}$ to S-MCTS$_{2,100}$, respectively. Take S-MCTS$_{1,1}$ and S-MCTS$_{2,1}$ as an example: in S-MCTS$_{1,1}$ the Day-1 SAIDI exceeds its threshold value, while in S-MCTS$_{2,1}$ all reliability indices are within the thresholds. In addition, the total maintenance cost of S-MCTS$_{2,1}$ is lower than that of S-MCTS$_{1,1}$; similar results are observed in the other two comparisons: when network reconfiguration is not considered both the reliability indices and costs increase. We thereby conclude that incorporating the network reconfiguration into the model can increase system reliability and decrease maintenance costs.

TABLE IV
PERFORMANCE COMPARISON OF INCORPORATING NETWORK RECONFIGURATION STRATEGY AND DER UNCERTAINTIES

TABLE V
COMPUTATION TIME OF DIFFERENT ALGORITHMS

TABLE VI
COMPARISON OF ALGORITHMS IN THE FULL-SCALE TEST SYSTEM
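To make the penalty settings of this case study concrete, the following Python sketch evaluates the penalty functions (24) and the selection scores (20) and (27). It is an illustration under stated assumptions rather than the authors' code: the function names and the example inputs are hypothetical, the exploration term is assumed to take the usual square-root form, and only the thresholds (¥8 × 10^4, 2, 3) and penalty coefficients (5, 10^5, 10^5) are taken from the text.

```python
import math
from typing import List, Sequence

# One-stage thresholds and coefficients quoted in the case study:
# (maintenance budget in yuan, SAIFI, SAIDI).
THRESHOLDS = (8e4, 2.0, 3.0)
COEFFICIENTS = (5.0, 1e5, 1e5)


def penalties(values: Sequence[float],
              thresholds: Sequence[float] = THRESHOLDS,
              coefficients: Sequence[float] = COEFFICIENTS) -> List[float]:
    """Eq. (24): penalty_l = sigma_l * (f(a) - delta_a) for l = 1, 2, 3."""
    return [c * (v - d) for v, d, c in zip(values, thresholds, coefficients)]


def exploration_bonus(n_parent: int, n_child: int, c_p: float) -> float:
    """Exploration term shared by (20) and (27); square root is assumed."""
    return c_p * math.sqrt(math.log(n_parent) / n_child)


def uct_basic(q_child: float, n_child: int, n_parent: int, c_p: float) -> float:
    """Eq. (20): average reward of the child plus the exploration bonus."""
    return q_child / n_child + exploration_bonus(n_parent, n_child, c_p)


def uct_modified(expected_reward: float, n_child: int, n_parent: int,
                 c_p: float, penalty_values: Sequence[float]) -> float:
    """Eq. (27): expected reward plus bonus minus positive penalties only."""
    violation = sum(max(0.0, p) for p in penalty_values)
    return expected_reward + exploration_bonus(n_parent, n_child, c_p) - violation


# Hypothetical stage outcomes: (maintenance cost, SAIFI, SAIDI).
within_limits = penalties((7e4, 1.5, 2.5))   # every constraint satisfied
saifi_violated = penalties((7e4, 2.5, 2.5))  # SAIFI above its threshold of 2

score_ok = uct_modified(-6e4, n_child=10, n_parent=100, c_p=1.0,
                        penalty_values=within_limits)
score_bad = uct_modified(-6e4, n_child=10, n_parent=100, c_p=1.0,
                         penalty_values=saifi_violated)
```

Because of the max(0, ·) clipping, a schedule that satisfies all three constraints is not rewarded for its slack, whereas a violating schedule with the same expected reward receives a strictly lower score and is therefore selected less often.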
Moreover, the results by further considering the DER uncertainties are investigated. By comparing S-MCTS$_{2,1}$, S-MCTS$_{2,20}$ and S-MCTS$_{2,100}$, we observe that S-MCTS$_{2,1}$ results in the highest maintenance cost, whereas S-MCTS$_{2,100}$ achieves the minimum costs and S-MCTS$_{2,20}$ the second lowest costs. The reliability indices of these algorithms satisfy the thresholds. This implies that explicitly considering both DER uncertainties and network reconfiguration strategy via SMINP rather than deterministic MILP can lead to substantial performance increases. In addition, the quality of the obtained solutions in terms of representing the underlying uncertainty can be analyzed by comparing S-MCTS$_{1,1}$, S-MCTS$_{1,20}$ and S-MCTS$_{1,100}$, or S-MCTS$_{2,1}$, S-MCTS$_{2,20}$ and S-MCTS$_{2,100}$, i.e., the obtained results are improved as more scenarios are involved. This indicates that the S-MCTS is able to capture the underlying statistical properties and provide high-quality solutions to (17).
Table V lists the computation times when running these algorithms. As the considered scenarios increase from 1 to 20 and 100, the CPU time increases sub-linearly from S-MCTS$_{1,1}$ to S-MCTS$_{1,20}$ and S-MCTS$_{1,100}$ owing to the computation strategy of transforming constraints into penalty functions. When computing the heuristic of coordinating switching devices, a small amount of extra computation time is added, as evidenced by comparing S-MCTS$_{1,20}$ to S-MCTS$_{2,20}$, or S-MCTS$_{1,100}$ to S-MCTS$_{2,100}$. Tables IV and V also show that increasing the number of investigated scenarios in S-MCTS improves the solution quality at the price of computation time. However, the overall computation efficiency is promising considering the search space of the original problem. It is also worth noting that when extending the length of the maintenance planning horizon, the scalability of S-MCTS can be ensured mainly for two reasons. First, the implementation of the default policy consumes most of the computation time in S-MCTS, and it only involves random action sampling and rewards accumulation. So, its efficiency is mainly affected by the number of sampled scenarios when the planning horizon is extended. As the above tests imply, increasing the number of scenarios leads to moderate (sub-linear) growth of CPU time. Therefore, the computation burden may be generally affordable for practical application. Second, there are available acceleration techniques making this method more applicable to large-scale distribution networks. For example, the independent nature of each simulation in S-MCTS suggests that the algorithm can be considerably accelerated by parallelization, i.e., more simulations can be performed in a given amount of time and the wide availability of multi-core processors can be exploited. Different parallelization mechanisms for MCTS are reported with promising results, e.g., tree parallelization, leaf parallelization and root parallelization [19], [20]. We leave this for future investigation.

B. Full-Scale Case Study

The test system is modified from [13] and shown in Appendix C. There are 7 feeders in total. Two feeders, F1 and F7, are independent feeders (i.e., without backup). Feeders F2–F6 are equipped with 18 sectionalizers and 4 tie switches for network reconfiguration and load transfer. The DN contains 40 load points. Two DGs and two EV charging stations, which are connected to the 10 kV DN, are assumed to be subject to uncertainties. In this test case, we compute and compare the expected reliability indices and total costs of the S-MCTS and the standard GA implemented in [27], considering the popularity of GA for solving maintenance scheduling problems [28]–[32]. It is assumed that 72 line sections and 40 distribution transformers are in the PM scheduling list. The operation of switching devices is considered in these algorithms. So, the combinatorial search space includes $2^{134 \times 3}$ possible maintenance solutions. In S-MCTS, 20 possible scenarios are incorporated for DERs. In performance comparisons, these algorithms are first used to generate the maintenance schedules. Then, the obtained maintenance schedules are evaluated by 1,000 randomly generated scenarios.
Table VI lists the computation results of the two algorithms. Their stopping criteria (no significant improvement in objective function, maximum iterations) are tightened so that better results may be obtained at the price of more computation time. From the perspective of cost savings, S-MCTS saves ¥$9.35 \times 10^4$ compared with GA. For the reliability indices, both SAIFI and SAIDI in all stages are satisfied by S-MCTS and GA. For the computation time, S-MCTS takes 612 s whereas GA takes 7140 s to obtain the listed result. Considering the amount of distribution equipment in distribution utilities, the maintenance cost and computation time savings offered by S-MCTS can be remarkable.
A further comparison is made between S-MCTS and GA, focusing on the stability of their solutions in different runs, as they involve stochastic search in a complex search space. For a fair comparison, they are given the same computation budget in each run. Fig. 4 shows the obtained costs of the compared methods in 10 runs in the form of box plots (the reliability levels are all satisfied, hence are omitted here). Descriptive statistics, e.g., the first and third quartiles, minimum, median and maximum, are listed. As can be seen, the GA results in premature convergence to randomized, non-optimal solutions. In contrast, the solutions generated by S-MCTS in different runs are more consistent
and less expensive than those of GA. These results show the superiority of S-MCTS over GA for generating high-quality solutions in practice.

Fig. 4. Stability performance comparison between S-MCTS and GA.

V. CONCLUSION

In this paper, we formulate an SMINP for optimizing multistage maintenance schedules in DNs. The model is developed to minimize the expected maintenance costs constrained by the reliability indices, taking into account the uncertainties of DERs and operation strategies of switching devices. Then, we provide an S-MCTS solution to this problem. Unlike the B-MCTS, we introduce chance nodes in the search tree and develop the SAA method to estimate the multistage maintenance costs under uncertainties. To incorporate the nonlinear constraints without reducing the computation efficiency, we transform them into penalty functions or heuristics during a tree search. This solution can asymptotically converge to the optimum with promising computational efficiency. The results of our numerical tests show that the proposed method outperforms the B-MCTS and the genetic algorithm.

APPENDIX

A. Realization of S-MCTS

In this subsection, we describe the realization of the S-MCTS in detail. Algorithm 1 summarizes the pseudocode of the S-MCTS for maintenance scheduling. In the main function MCTSSEARCH(s0), the subfunctions TREEPOLICY, DEFAULTPOLICY and BACKUP are repeatedly executed. Note that in TREEPOLICY, the unvisited nodes are assigned higher priority for node expansion than the visited node selected by the function BESTCHILD. In line 21, we add 1 to both N(v) and N(v′) to prevent these two variables from becoming zero. In subfunctions f and cost, we realize the constraints and cost objectives of (17). In f(s, a), we apply heuristic (25) to dynamically modify the failure probability once it exceeds the baseline failure probability. In cost(s(v), a(v)), we apply the heuristic (26) and penalty function (24) repeatedly during the realization of different scenarios. In line 39, we use M as the number of sampled scenarios; in line 42 we use C_n^UE to denote the unsupplied energy cost obtained in scenario n, and in line 44 we obtain the cost C achieved by (1) and penalized by (24). C is used to inform the tree policy, to avoid selecting a child node that violates the associated constraints.

Algorithm 1: S-MCTS Approach.
 1: function MCTSSEARCH(s0)
 2:   create root node v0 with state s0
 3:   while within computational budget do
 4:     vt ← TREEPOLICY(v0)
 5:     C ← DEFAULTPOLICY(s(vt))
 6:     BACKUP(vt, C)
 7:   return a(BESTCHILD(v0))
 8: function TREEPOLICY(v)
 9:   while v is nonterminal do
10:     if v not fully expanded then
11:       return EXPAND(v)
12:     else v ← BESTCHILD(v)
13:   return v
14: function EXPAND(v)
15:   choose a ∈ untried actions from A(s(v))
16:   add a new child v′ to v
17:   Initialize Q(v′) = −∞
18:   s(v′) = f(s(v), a(v)) and a(v′) = a
19:   return v′
20: function BESTCHILD(v)
21:   return arg max over v′ ∈ children of v of Q(v′) + Cp √( ln(N(v) + 1) / (1 + N(v′)) )
22: function DEFAULTPOLICY(s)
23:   Initialize (C, i) = (0, 0)
24:   while s is non-terminal do
25:     choose a ∈ A(s) uniformly at random
26:     s ← f(s, a), C ← C + γ^i ∗ cost(s, a) and i ← i + 1
27:   return C for state s
28: function BACKUP(v, C)
29:   while v is not null do
30:     N(v) ← N(v) + 1, Q(v) ← max(Q(v), −C)
31:     C ← γ ∗ C + cost(s(v), a(v))
32:     v ← parent of v
33: function f(s, a)
34:   Calculate s by (5)
35:   Apply heuristic (25)
36:   return s
37: function cost(s(v), a(v))
38:   Calculate C^PM and C^CM for (s(v), a(v)) by (2)–(4)
39:   for n = 1 to M
40:     Sample P_n in scenario n
41:     Apply heuristic (26)
42:     Calculate the cost C_n^UE
43:     Apply penalty function (24)
44:   obtain C = C^PM + C^CM + (1/M) Σ_{n=1}^{M} C_n^UE + Σ_{l=1}^{3} max(0, penalty_l)
45:   return C

B. Example of Incorporating Uncertainty Into S-MCTS

In this subsection, an example is given to show the incorporation of multistage uncertainty into the decision process of S-MCTS. For simplicity, it is assumed that the maintenance
schedule involves only one piece of equipment over two stages (i.e., stage 0 and stage 1).
Figure 5 depicts two trees, i.e., the S-MCTS and a scenario tree. The S-MCTS originates from the root node $\langle v_0 \rangle$ and ends at a terminal node (e.g., $\langle v_{2,2} \rangle$). $\langle v_0 \rangle$ corresponds to the equipment failure probability $\lambda_0$. The tree branches represent different PM decisions. Taking $a_0$ as an example, $a_0 = 1$ leads to the state transition from $\langle v_0 \rangle$ to $\langle v_{1,1} \rangle$, while $a_0 = 0$ leads to $\langle v_{1,2} \rangle$. In total, there are four possible maintenance schedules that yield four state transition trajectories. The scenario tree provides possible realizations of uncertainties. We assume 6 scenarios over the two stages, i.e.,

$$\{(\omega_{1,1}, \omega_{2,1}), (\omega_{1,1}, \omega_{2,2}), (\omega_{1,2}, \omega_{2,3}), (\omega_{1,2}, \omega_{2,4}), (\omega_{1,3}, \omega_{2,5}), (\omega_{1,3}, \omega_{2,6})\},$$

and, for simplicity, we use the notation $W_M := \{\omega^1, \ldots, \omega^M\}$ to represent the sampled scenarios. In this case $M$ equals 6.

Fig. 5. Illustration of incorporating uncertainty into the S-MCTS.

The S-MCTS progresses according to Algorithm 1. It explores different maintenance strategies (denoted by $a$) by iteratively performing the functions TREEPOLICY, DEFAULTPOLICY and BACKUP. In each iteration, during the implementation of the backup policy the visited nodes estimate their reward based on the explored actions and incorporated scenarios starting from their stage until the terminal stage. Assume that in one iteration the DEFAULTPOLICY starts from node $\langle v_{1,1} \rangle$ and action $a_1$ is randomly set to 0, which yields the terminal node $\langle v_{2,2} \rangle$ and concludes this trajectory. Then, the expected rewards of expanded nodes (in this case, node $\langle v_{1,1} \rangle$) are estimated based on the SAA in a backward direction. So, for $\langle v_{1,1} \rangle$ its reward is approximated as minus the average cost of taking action $a = \{1, 0\}$ under the 6 scenarios, i.e., $\frac{1}{6} \sum_{n=1}^{6} Q(\omega^n; a)$. Note that according to (27), the term $C_p \sqrt{\ln N(v_p) / N(v_c)}$ is added into the reward value for addressing the exploration-exploitation dilemma. If some constraints are violated, the estimated value is further biased by the term $-\sum_{l=1}^{3} \max(0, \mathrm{penalty}_l)$. After that, the newly estimated reward value is compared to the value stored in the previous iteration, and the larger of the two is saved.

C. Configuration and Key Parameters of the Test Systems

See Tables VII and VIII.

Fig. 6. Configuration of the full-scale test system.

TABLE VII
KEY PARAMETERS


TABLE VIII
BUDGETS AND COSTS PER CM/PM ACTIVITY

REFERENCES

[1] J. Zhong, W. Li, C. Wang, J. Yu, and R. Xu, “Determining optimal inspection intervals in maintenance considering equipment aging failures,” IEEE Trans. Power Syst., vol. 32, no. 2, pp. 1474–1482, Mar. 2017.
[2] L. Bertling, R. Allan, and R. Eriksson, “A reliability-centered asset maintenance method for assessing the impact of maintenance in power distribution systems,” IEEE Trans. Power Syst., vol. 20, no. 1, pp. 75–82, Feb. 2005.
[3] P. Dehghanian, M. Firuzabad, F. Aminifar, and R. Billinton, “A comprehensive scheme for reliability centered maintenance in power distribution systems—Part I: Methodology,” IEEE Trans. Power Del., vol. 28, no. 2, pp. 761–770, Apr. 2013.
[4] P. Dehghanian, M. Firuzabad, F. Aminifar, and R. Billinton, “A comprehensive scheme for reliability-centered maintenance in power distribution systems—Part II: Numerical analysis,” IEEE Trans. Power Del., vol. 28, no. 2, pp. 771–778, Apr. 2013.
[5] M. Khodaei Tehrani, A. Fereidunian, and H. Lesani, “Financial planning for the preventive maintenance of power distribution systems via fuzzy AHP,” Complexity, vol. 21, pp. 36–46, Oct. 2014.
[6] H. M. Shourkaei, A. A. Jahromi, and M. F. Firuzabad, “Incorporating service quality regulation in distribution system maintenance strategy,” IEEE Trans. Power Del., vol. 26, no. 4, pp. 2495–2504, Oct. 2011.
[7] A. Abiri-Jahromi, M. Fotuhi-Firuzabad, and E. Abbasi, “An efficient mixed-integer linear formulation for long-term overhead lines maintenance scheduling in power distribution systems,” IEEE Trans. Power Del., vol. 24, no. 4, pp. 2043–2053, Oct. 2009.
[8] V. Aravinthan and W. Jewell, “Optimized maintenance scheduling for budget-constrained distribution utility,” IEEE Trans. Smart Grid, vol. 4, no. 4, pp. 2328–2338, Dec. 2013.
[9] A. D. Janjic and D. S. Popovic, “Selective maintenance schedule of distribution networks based on risk management approach,” IEEE Trans. Power Syst., vol. 22, no. 2, pp. 597–604, May 2007.
[10] K. Ahadi and K. M. Sullivan, “Approximate dynamic programming for selective maintenance in series–parallel systems,” IEEE Trans. Rel., to be published, 2019, doi: 10.1109/TR.2019.2916898.
[11] S. K. Abeygunawardane, P. Jirutitijaroen, and H. Xu, “Power system maintenance planning using value function approximation,” in Proc. Int. Conf. Probabilistic Methods Appl. Power Syst., 2014, pp. 1–7.
[12] H. Mirsaeedi, A. Fereidunian, S. M. Hosseininejad, and H. Lesani, “Electricity distribution system maintenance budgeting: A reliability-centered approach,” IEEE Trans. Power Del., vol. 33, no. 4, pp. 1599–1610, Aug. 2018.
[13] H. Mirsaeedi, A. Fereidunian, S. M. Hosseininejad, P. Dehghanian, and H. Lesani, “Long-term maintenance scheduling and budgeting in electricity distribution systems equipped with automatic switches,” IEEE Trans. Ind. Informat., vol. 14, no. 5, pp. 1909–1919, May 2018.
[14] H. G. Williams and E. Patelli, “Maintenance strategy optimization for complex power systems susceptible to maintenance delays and operational dynamics,” IEEE Trans. Rel., vol. 66, no. 4, pp. 1309–1330, Dec. 2017.
[15] B. Basciftci, S. Ahmed, N. Z. Gebraeel, and M. Yildirim, “Stochastic optimization of maintenance and operations schedules under unexpected failures,” IEEE Trans. Power Syst., vol. 33, no. 6, pp. 6755–6765, Nov. 2018.
[16] R. Rocchetta et al., “A reinforcement learning framework for optimal operation and maintenance of power grids,” Appl. Energy, vol. 241, pp. 291–301, 2019.
[17] A. Bhattacharya, J. P. Kharoufeh, and B. Zeng, “Managing energy storage in microgrids: A multistage stochastic programming approach,” IEEE Trans. Smart Grid, vol. 9, no. 1, pp. 483–496, Jan. 2018.
[18] R. Coulom, “Efficient selectivity and backup operators in Monte-Carlo tree search,” in Proc. 5th Int. Conf. Comput. Games, Turin, Italy, 2006, pp. 72–83.
[19] C. Browne et al., “A survey of Monte Carlo tree search methods,” IEEE Trans. Comput. Intell. AI Games, vol. 4, no. 1, pp. 1–49, Mar. 2012.
[20] D. Silver et al., “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, 2016.
[21] D. Whitehouse, E. Powley, and P. Cowling, “Determinization and information set Monte Carlo tree search for the card game Dou Di Zhu,” in Proc. IEEE Conf. Comput. Intell. Games, Seoul, South Korea, 2011, pp. 87–94.
[22] M. Yildirim, X. A. Sun, and N. Z. Gebraeel, “Sensor-driven condition based generator maintenance scheduling—Part I: Maintenance problem,” IEEE Trans. Power Syst., vol. 31, no. 6, pp. 4253–4262, Nov. 2016.
[23] C. Chen, W. Wu, B. Zhang, and C. Singh, “An analytical adequacy evaluation method for distribution networks considering protection strategies and distributed generators,” IEEE Trans. Power Del., vol. 30, no. 3, pp. 1392–1400, Jun. 2015.
[24] A. Shapiro, “Analysis of stochastic dual dynamic programming method,” Eur. J. Oper. Res., vol. 209, pp. 63–72, 2011.
[25] E. Arriagada, E. López, C. Roa, M. López, and J. Vannier, “A stochastic economic dispatch model with renewable energies considering demand and generation uncertainties,” in Proc. IEEE Grenoble Conf., Grenoble, 2013, pp. 1–6.
[26] N. Abdel-Karim, E. J. Nethercutt, J. N. Moura, T. Burgess, and T. C. Ly, “Effect of load forecasting uncertainties on the reliability of North American bulk power system,” in Proc. IEEE PES General Meeting | Conference & Exposition, National Harbor, MD, 2014, pp. 1–5.
[27] V. K. Koumousis and C. P. Katsaras, “A saw-tooth genetic algorithm combining the effects of variable population size and reinitialization to enhance performance,” IEEE Trans. Evol. Comput., vol. 10, no. 1, pp. 19–28, Feb. 2006.
[28] J. C. Stacchini, M. B. Filho, and M. L. R. Roberto, “A genetic-based methodology for evaluating requested outages of power network elements,” IEEE Trans. Power Syst., vol. 26, no. 4, pp. 2442–2449, Nov. 2011.
[29] E. K. Burke and A. J. Smith, “Hybrid evolutionary techniques for the maintenance scheduling problem,” IEEE Trans. Power Syst., vol. 15, no. 1, pp. 122–128, Feb. 2000.
[30] J. Zhu, P. Xuan, P. Xie, C. Hong, and W. Yan, “Generation and transmission equipment maintenance scheduling with load transfer,” in Proc. IEEE Power Energy Soc. General Meeting, Chicago, IL, 2017, pp. 1–5.
[31] A. M. Eldurssi and R. M. O’Connell, “A fast nondominated sorting guided genetic algorithm for multi-objective power distribution system reconfiguration problem,” IEEE Trans. Power Syst., vol. 30, no. 2, pp. 593–601, Mar. 2015.
[32] J. D. Foster, A. M. Berry, N. Boland, and H. Waterer, “Comparison of mixed-integer programming and genetic algorithm methods for distributed generation planning,” IEEE Trans. Power Syst., vol. 29, no. 2, pp. 833–843, Mar. 2014.

Yuwei Shang received the B.E. degree in electrical engineering from Zhejiang University, Zhejiang, China, in 2013 and the M.E. degree in electrical engineering in 2015 from Tsinghua University, Beijing, China, where he is currently working toward the Ph.D. degree. His research interests include active distribution system analysis & control, power distribution asset management, machine learning and its application in energy system.

Wenchuan Wu (Senior Member, IEEE) received the B.S., M.S., and Ph.D. degrees from Tsinghua University, Beijing, China. He is currently a Professor and the Director of Electric Power System Research Institute, Department of Electrical Engineering, Tsinghua University. His research interests include energy management system, active distribution system operation and control, machine learning and its application in energy system. He is an IET Fellow, and an Associate Editor for IET Generation, Transmission & Distribution and IET Energy Systems Integration.


Jiawei Liao received the B.S. degree in Internet of Things from Northeastern University, Shenyang, China, in 2013. He is currently working toward the master’s degree in software engineering with Peking University, Beijing, China. His research interests include object detection and deep reinforcement learning.

Jianbo Guo (Senior Member, IEEE) received the B.S. degree from the Huazhong University of Science and Technology, Wuhan, China, in 1982 and the M.S. degree from China Electric Power Research Institute, Beijing, China, in 1984. From 2010 to 2019, he was the Chairman of China Electric Power Research Institute. Since 2013, he has been an elected Academician of the Chinese Academy of Engineering, Beijing, China. He is currently the Deputy Chief Engineer of State Grid Corporation of China, the Vice Chairman of China Electrical Engineering Society, and the honorary Chairman of China Electric Power Research Institute. He has been dedicated to the clean use of energy and the development of environmentally friendly power grid. He has long been engaged in power system analysis & control. He has made remarkable achievements in power grid planning, improving security, reliability and transfer capability of power grid, and security of wind power integration. As the Principal Investigator, he participated in Chinese National Programs for Three Gorges Power Transmission Project and many other important grid planning studies. He presided over the development plan for Chinese National Grid Interconnection Project (from 2020 to 2050). He successfully developed Thyristor Controlled Series Compensation (TCSC) and UHV series compensator (1000 kV) with proprietary IPR. He won the first and second prizes of the National Science and Technology Progress Award in 2008 and 2015, the Ho Leung Ho Lee Foundation Science and Technology Progress Award in 2011, and FEIAP Engineer of the Year Award 2018, in 2019.

Jian Su received the M.S. degree in power electric system and automation from China Agricultural University, Beijing, China, in 1996. He is currently a Senior Engineer with China Electric Power Research Institute, Beijing, China. He is the Convener of IEC SC 8B WG4, and the Vice Chairman of CIGRE C6 National Committee of China. His research interests include planning and automation of power distribution system, integration of distributed renewable generations.

Wei Liu received the Ph.D. degree in electrical engineering from the Harbin Institute of Technology, Harbin, China, in 2003. He is currently a Senior Engineer of China Electric Power Research Institute, Beijing, China. His main research interests include optimal planning and operation analysis of distribution system.

Yu Huang received the Ph.D. degree in computer science from Peking University, Beijing, China, in 2007. He is currently an Associate Professor with National Engineering Research Center for Software Engineering, Peking University. His research interests include software engineering, big data analysis, and domain application.
