Integration of Markov Chains
ARTICLE INFO

Keywords: Complex system; Failure; Bayesian network; Simulation; Markov chain

ABSTRACT

Development of failure analysis techniques for complex engineering systems is evolving rapidly. Complexity in these systems refers to the complex interrelations among system components, variables, factors, and parameters as well as the large number of components to include in the study. It is not an easy task to include all interrelationships of a complex system into one representation. New dynamic and uncertain factors affecting engineering systems, like climate change, new technologies, and new uses, make it clear that the water reservoir systems operations and performance are under probabilistic inputs from many different factors. This means that failure of such systems should be assessed using multidisciplinary probabilistic uncertainty measures. Bayesian Networks (BNs) provide a flexible way of representing such complex systems and their interrelating components probabilistically and in a single unified representation. Compared to other techniques such as fault tree and event tree analyses methods, BN is useful in representing complex networks that have multiple events and different types of variables in one representation, with the ability to predict the effects, or diagnose the causes leading to a certain effect. In this paper, two proposed methodologies are developed to support BNs in dealing with the failure analysis of complex engineering systems, i.e. Simulation Supported Bayesian Networks (SSBNs), and Markov Chain Simulation Supported Bayesian Networks (MCSSBNs). For complex networks, whose failures are affected by a large number of uncertain interconnected variables, these proposed methods are used for efficiently predicting failure probabilities. Compared to exhaustive simulation, the new tools have the distinction of decomposing the complex system into many sub-systems, which makes it easier for understanding the network and faster for simulating the entire network while taking multiple operation scenarios into consideration. The efficiency of these techniques is demonstrated through their application to a pilot system of two dam reservoirs, where the results of SSBNs and MCSSBNs are compared with those of the simulation of entire system operations.
1. Risk, reliability, and uncertainty

Risk is the product of the probability of an undesired outcome (failure) and the consequences of that outcome [1–10]. This means that risk is the expression of a probability measure (frequency or likelihood) of an event and the consequences (impact or effect) of that event, with the potential to influence the achievement of an organization’s objectives. The development of risk estimates or the determination of risks in a given context is called Risk Analysis, while Risk Assessment is the process of evaluating the risks and determining the best course of action, [2]. Uncertainty of outcomes is the common concept in all definitions of risk, which can be described by the uncertainty surrounding future events and outcomes. Thus, uncertainty is an intrinsic property of risk and is present in all aspects of risk analysis including risk assessment, risk management, and risk communication. For the purpose of this paper, only the probabilistic uncertainty is of interest. Generally, risk analysis is a systematic tool that facilitates the identification of the weak elements of a complex system and the hazards that mainly contribute to the risk.

According to [9–12], availability is the ability of a component or system to function at a specified interval of time. This is closely related to what is called “Reliability”, which describes the ability of a system or component to function under stated conditions for a specified period of time. Reliability engineering is a sub-discipline of systems engineering that emphasizes dependability in the lifecycle management of a product. In reliability engineering programs - where reliability plays a key role in
* Corresponding author.
E-mail addresses: [email protected], [email protected] (A. El-Awady), [email protected] (K. Ponnambalam).
https://doi.org/10.1016/j.ress.2021.107511
Received 30 July 2020; Received in revised form 24 December 2020; Accepted 29 January 2021
Available online 2 February 2021
0951-8320/© 2021 Elsevier Ltd. All rights reserved.
A. El-Awady and K. Ponnambalam Reliability Engineering and System Safety 211 (2021) 107511
the cost effectiveness of systems - testability, maintainability, and maintenance are parts of these programs. In reliability engineering, estimation, prevention, and management of high levels of lifetime engineering uncertainty and risks of failure are common areas to be dealt with. Theoretically, reliability is defined as the probability of success (Probability of success = 1 - Probability of failure). Sometimes, probabilistic stability analysis is referred to as “reliability analysis”. During failure probability estimation, reliability analysis cannot be used on its own, and the results of such analysis must be moderated using engineering judgment and appropriate models as useful tools in estimating conditional probabilities.

Generally, according to [3,10,13,14], uncertainty - which is a common concept for expressing inaccuracies - means that a number of different values can exist for a quantity, while risk means the possibility of loss as a result of uncertainties. Accordingly, any uncertain variable, which can take various values over a range, should be provided with an uncertainty analysis that is used to assess output uncertainty and to identify the most efficient ways to reduce that uncertainty according to the contributing variables. Hence, in terms of statistical concepts, uncertainty can be thought of as a statistical variable and can be calculated using well verified statistical procedures. In a broad sense, the value reported for a measurement describes the central tendency (mean), while the uncertainty describes the standard deviation (deviation from the mean). Ideally, this measure of uncertainty is calculated from repeated trials, or is taken from estimates in whole or part in many engineering tests or research experiments. Thus, risk analysis forces the engineer to confront uncertainties directly and to use best estimates and predictions, especially while taking decisions regarding the safety of large technological (complex) systems. Increasingly, such decisions are being based on the results of probabilistic risk assessments (PRAs), which must be associated with adequate quantification of the uncertainties. Uncertain parameters can be treated as random variables with appropriate probability distributions. Such distributions are assigned on the basis of available data (which is often scarce), combined with logic inference and the judgement of experts (which can vary widely), adding another element of uncertainty into the uncertainty analysis itself, [15]. This means that there might be different sources of uncertainty due to data available, limited knowledge, and subjective judgement, and uncertainty here is assumed to be available in probabilistic terms either from data or from expert judgement or logical inference.

According to [16], in 2009, the American Society of Civil Engineers (ASCE) issued a report titled “Guiding Principles for the Nation’s Critical Infrastructure.” Risk management of critical infrastructure depends on four interrelated guiding principles, identified as follows:

1 To quantify and communicate risk,
2 To employ an integrated systems approach,
3 To exercise leadership, management, and stewardship in decision-making processes,
4 To adapt critical infrastructure in response to dynamic conditions and practices.

This paper focuses mainly on the first two guiding principles, that is, on presenting a method that can represent all interrelated and multidisciplinary system components in a combined representation (integrated systems approach), while enhancing the ability to quantify this kind of system representation in order to better predict the failures for many purposes (risk management, risk reduction, etc.).

For any given system including inputs and sub-systems, probabilistic failure analysis depends on finding the probability of not getting the required or estimated output of that system. The required output may be the effect that is produced from the system causes (as in prediction reasoning), or the determination of causes responsible for certain results and effects (as in diagnostic reasoning). Thus, determining the cause-effect relation is an important first step in the probabilistic failure analysis, which allows for better understanding to enhance the system reliability, and to take decisions for mitigating the negative effects, or better enhancing the causes. A complex system can be defined as a system that is composed of a huge number of interacting components, and may be represented as a network where the nodes represent system components and the edges (links) are their interactions. Given any complex system that includes inputs, outputs, sub-systems, and boundaries, it is reasonable to assume that all of these system components are interacting either directly with one another, or indirectly. In order to estimate the probability of failure for such a system, the interactions should be represented mathematically including any probability measures. A full representation of the system facilitates its analysis from the failure point of view. The main obstacle in failure analysis of complex systems is how to represent the system components and their basic and conditional probabilities. Bayesian Networks (BNs) are found to solve this problem. A Bayesian Network provides a representation (provided also in graphic forms) of any system using basic probabilities, for system inputs, and conditional probabilities, for sub-systems and their interactions. One of the main advantages of using BNs is the ability to integrate all types of data (social, environmental, technical, etc.) seamlessly in one representation where the main information is provided in probabilistic terms.

In this research, the main contributions are:

• Using Bayesian Networks to represent complex systems using system decompositions,
• Integrating simulation with Bayesian Networks for better quantification of any complex system’s Bayesian Network,
• Integrating simulation, Markov Chains, and Bayesian Networks to represent the system network’s feedbacks while facilitating the quantification process of the system probabilities.

An introduction to Bayesian Networks is provided in the next section.

2. Bayesian Networks (BNs)

BNs, or belief networks, are probabilistic graphical models used to represent knowledge about an uncertain domain using a combination of principles from graph theory, probability theory, computer science, and statistics. In the graph, nodes (vertices) represent probabilities of random variables, and the edges (arcs) represent the interrelationships (conditional probabilistic dependencies) among these variables. A BN is a directed acyclic graph (DAG), meaning that a set of directed edges is used to connect the set of nodes, where these edges represent direct statistical dependencies among variables, with the constraint of not having any directed cycles (one cannot return to any node by following directed arcs). Thus, the definition of parent nodes and child nodes becomes obvious. The directed edge is often directed from a parent node to a child node, which means that any child node depends on its parent node(s). Refs [17–32] are some works applying BN in various problems. A BN, by being a DAG, reduces the huge explosion in the state space encountered in more general networks such as those based on Markov chains and hence is suitable for large complex systems. BNs are mathematically rigorous, understandable, and efficient in computing the joint probability distribution over a set of random variables. In BNs, there are two main types of reasoning (inference support): 1- predictive reasoning (top-bottom or forward reasoning), in which evidence nodes are connected through parent nodes (cause to effect), and 2- diagnostic reasoning (bottom-top or backward reasoning), in which evidence nodes are connected through child nodes (effect to cause). Firstly, the topology of the BN should be specified (structuring of graphical causality model), then, the interrelationships among connected nodes should be quantified, i.e. conditional probability distributions in the form of conditional probability tables (CPTs). Also, the basic probabilities of basic (evidence) nodes should be determined in the form of basic probability tables (BPTs). As the number of parent nodes and/or their states increases,
the CPTs get very large. The prior basic probability tables, for the root nodes, and the conditional probability tables, for the parent and child relationships, may be obtained from the historical database currently available, which can be updated when new data or information becomes available. Generally, quantifying BNs depends on four sources of data: statistical and historical data, judgment based on experience (expert judgement), existing physical models (or empirical models), and logic inference. Where no such sufficient data exists, either subjective probabilities from experts or detailed simulation models can be used to estimate conditional probabilities (using a reduced number of system components only, so that it is practical, unlike full system simulations), which will be discussed in detail later in this paper.

The joint probability distribution function of random variables in a BN can be expressed as follows in Eq. (1):

$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P[x_i \mid Pa(x_i)] \tag{1}$$

where $P(x_1, \ldots, x_n)$ is the joint probability of variables $x_1, x_2, x_3, \ldots, x_n$, and $Pa(x_i)$ is the parent set of $x_i$. If $x_i$ has no parents, then the function reduces to the unconditional probability $P(x_i)$.

One of the features that a BN allows is entering evidence as input, resulting in updating probabilities in the network when new information is available. This information will propagate through the network and the posterior probabilities can be calculated according to Eq. (2).

$$\text{Posterior Probability} = \frac{\text{Likelihood} \times \text{Prior Probability}}{\text{Evidence}} \tag{2}$$

The concept of posterior probability allows for identifying the events which have higher contributing impacts on the undesired/failure event, helping decision makers to identify these important factors, [15] and [33].

In BNs, the only concern is the cause-effect relationships to derive causal inferences from a combination of diverse assumptions. Generally, BNs help modelers and decision makers answer queries even when no experimental data is available. When different reasoning processes and useful evidence data (basic probabilities) are available, BN provides a basis to perform uncertainty propagation of query variables, [19].

In ref [18], advantages and disadvantages of Bayesian Network (BN), Event Tree Analysis (ETA), and Fault Tree Analysis (FTA) are illustrated. All these methods can be used in representing systems. However, when the BN is used to represent the system, the network is found to be more readable and understandable while overcoming the disadvantages of other representation methods. This facilitates the analysis part of the system, which leads to easier system quantification.

3. Simulation Supported Bayesian Network (SSBN)

Complex systems may be affected by different kinds of interacting factors and components. Such complexity is expected to affect systems failures. One of the aims of this paper is to simplify the representation of such complex systems, while enhancing the ability of probabilistic quantification for system components and their interactions. To simplify the representation of complex systems and their components, Bayesian Networks (BNs) are expected to be efficient in defining the interrelationships among system components depending on evidences of basic probabilities and conditional probabilities among system components (nodes). BNs were found distinctive in representing information in a probability form that is helpful for studying failures. This paper aims to develop Bayesian network approaches that are used to predict failure probabilities of complex systems, and to show how to update the probabilistic data of complex systems seamlessly in order to give more accurate updates and results for decision makers while avoiding the time and effort required by other analysis techniques such as exhaustive simulation.

3.1. Simulation

In [34], simulation is defined as “the process of designing a model of a real-world process or system and conducting experiments on this model for the purpose either of understanding the behavior of the system or of evaluating various strategies (within the limits imposed by a criterion or set of criteria, e.g. time) for the operation of the system”. Any real-life process studied by simulation techniques is viewed as a system, which is, in general, a collection of entities that are logically related and are of interest to a particular application. While investigating a real-world system, a detailed simulation model should include the entire system. This may be computationally expensive, especially in systems having a large number of variables. During simulation, system variables are sorted into two groups: 1- uncontrollable variables, which are considered as givens, and 2- controllable variables, which can be manipulated to find a solution, [35]. In general, simulation enables the study of internal interaction of subsystems within a complex system. A simulation model helps to gain knowledge about improvement of a system. Through simulation, a model may be implemented by a large number of variations (especially in uncertainty quantification studies called Monte-Carlo simulations), producing complex scenarios. However, simulation may take a large amount of computing time to provide complete results and may become unsuitable as complexity increases.

3.2. BN–simulation integration in uncertain complex systems

In evaluating complex systems’ failure using BNs, quantifying the nodal probabilities and conditional probabilities, represented by arcs, of the BNs, is a challenge. As mentioned earlier, characterization and quantification of a BN depends on four sources of information: statistical and historical data, judgment based on experience (expert judgement), existing physical models or empirical models, and logic inference. Accordingly, the main challenge is whether or not the required data or information, from these sources, is available. For some blueprint projects (systems yet to be built), there might not be enough historical operational data and their statistical features. For this kind of system, decision makers may rely on expert judgement, logic inferencing, and doing simulations by appropriate models. On the contrary, there are pre-existing complex systems that have been operating for decades such as constructed dams and reservoirs, waste water sewage systems, and water supply distribution systems. In such systems, operational, historical, and statistical data and information can be used to estimate the required probabilistic information to be used in BN-based probabilistic representation of these systems. Therefore, an important question is: can model-based simulations (in this paper the word simulation is used interchangeably with Monte-Carlo simulations, which are required for estimating probabilities), especially with the goal of reducing computational complexity, be adapted for complex systems represented by BNs?

In [36], reassessment of dam safety events using BNs is illustrated. The BNs were constructed based on an event tree analysis, and were supplemented by Monte Carlo simulations to obtain the statistics of failure probability, i.e. mean, and variance using probability bounds set by the experts. They proposed the following steps for the reassessment purpose:

1- Results of Event Tree Analysis (ETA) are validated using BN analyses.
2- BNs are supplemented (enhanced) by Monte Carlo Simulations (MCSs) using BN to update the statistics of failure probability using Beta distribution as the probability density function (PDF).

However, the simulation in the research reported in [36] uses deterministic mathematical system dynamic models of sub-systems for use in Monte-Carlo simulations to generate some of the required probabilities and hence is a more general method. If an entire system can be simulated in this manner then there is no real need for BN. However, BN is used to decompose a large system so that when simulation is needed it
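As a small illustration of the quantities that Section 3.2 proposes to obtain from simulation, the sketch below evaluates Eq. (1) and Eq. (2) for a three-node network and shows how a CPT could be estimated from simulated records by relative frequency. The network (two root nodes X1 and X2 with one child X3), all probability values, and the function names are hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical basic probability tables (BPTs) for two root nodes.
P_X1 = {True: 0.3, False: 0.7}
P_X2 = {True: 0.2, False: 0.8}

# Hypothetical conditional probability table (CPT): P(X3=True | X1, X2).
P_X3_GIVEN = {
    (True, True): 0.9,
    (True, False): 0.4,
    (False, True): 0.3,
    (False, False): 0.0,
}


def joint(x1, x2, x3):
    """Eq. (1): product of each node's probability given its parents."""
    p3 = P_X3_GIVEN[(x1, x2)]
    return P_X1[x1] * P_X2[x2] * (p3 if x3 else 1.0 - p3)


def marginal_x3(x3=True):
    """Predictive reasoning: sum the joint over all parent states."""
    return sum(joint(x1, x2, x3)
               for x1, x2 in product([True, False], repeat=2))


def posterior_x1_given_x3(x1=True, x3=True):
    """Eq. (2): posterior = likelihood * prior / evidence."""
    evidence = marginal_x3(x3)
    likelihood_times_prior = sum(joint(x1, x2, x3) for x2 in [True, False])
    return likelihood_times_prior / evidence


def estimate_cpt(samples):
    """Relative-frequency estimate of P(X3=True | X1, X2) from simulated
    (x1, x2, x3) records; parent combinations never observed are omitted."""
    counts, hits = {}, {}
    for x1, x2, x3 in samples:
        counts[(x1, x2)] = counts.get((x1, x2), 0) + 1
        hits[(x1, x2)] = hits.get((x1, x2), 0) + (1 if x3 else 0)
    return {key: hits[key] / counts[key] for key in counts}


if __name__ == "__main__":
    print("P(X3=True) =", marginal_x3(True))
    print("P(X1=True | X3=True) =", posterior_x1_given_x3(True, True))
```

With these hypothetical tables the predictive query gives P(X3=True) = 0.192 and the diagnostic query gives P(X1=True | X3=True) = 0.78125. In an SSBN-style workflow the hand-written CPT would be replaced by the output of something like `estimate_cpt` applied to Monte Carlo records.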
provides the worst-case design [37], and allows for rare events to have a higher probability of occurrences allowing for comparisons with smaller number of samples and for the numerical precision commonly available.
• The dam is assumed to have a spillway gate that is to be opened or closed. If there is a requirement of spill at any time, the gate should be opened, and water will be spilled from the dam that can go to the same channel of the outflow, or to be diverted to any other channel. If there is a requirement to spill and the gate is closed (failed to open/operate), the dam is assumed to fail. The state of the gate (0 or 1) is also a uniformly generated random value.
• The aim is to calculate the probability of failure of each dam (the probability that all the above events happen at the same time) using a large number of years of simulated inflows.
• Four different system configurations were considered as explained in Table 1 (i.e. series having dependent inflows, series with independent inflows, parallel having dependent inflows, and parallel with independent inflows).
• The system of two dams is assumed to fail if any of the dams fail at any time (in the case of series connection), and if both dams fail at the same time (in the case of parallel connection).
• A reservoir operation simulation model (mass balance and governing equations) is used for simulating each dam reservoir operation/management as follows:

$$U(m) = \operatorname{mean}[I(t, m)] \tag{3}$$

$$S(t + 1, m) = S(t, m) + I(t, m) - U(m) \tag{4}$$

Such that: $S_{min} \le S(t + 1, m) \le S_{max}$

$$\text{Water\_Available}(t, m) = S(t, m) + I(t, m) \tag{5}$$

Where:

t: season of the year {1,2,3,4}. The unit time in this simulation is one season.
m: year {1,2,3,4,...,1000}
I: randomly generated inflow of the dam according to the lower and upper bounds known from historical data (uniformly distributed) [units of water volume/time].
I(t,m): inflow of the dam at a certain year (m) and season (t) [units of water volume/time].
U, U(m): designed outflow from the dam. It is assumed that the designed outflow equals the mean of the dam inflow throughout one year (m) [units of water volume/time].
S: storage of the reservoir at the beginning of a unit time [units of water volume]
S(t,m): storage of the reservoir at the beginning of a unit time in the season (t) of the year (m) [units of water volume]
Smin: minimum storage limit of the dam reservoir (dead storage) [units of water volume]
Smax: maximum storage limit of the dam reservoir [units of water volume]
Water_Available: the water available at the reservoir during a unit time. This is the inflow at a unit time plus the stored amount of water at the reservoir [units of water volume].
Water_Available (t,m): the water available at the reservoir at a unit time in season (t) of the year (m).
Controlled_Release: the actual release from the dam with gates control/management, to keep the storage levels of the dam reservoir above the minimum value (Smin). Controlled release should be less than or equal to the designed value of the outflow (U) [units of water volume/time].
Controlled_Release (t,m): the controlled release at every season (t) and year (m).

Table 1
Simulation results for a two reservoir system of different configurations.

Simulation Results | Dam 1 Probability of Failure | Dam 2 Probability of Failure | Probability of System Failure
Spill: the amount of water that exceeds the maximum storage level of the reservoir (Smax) at a unit time (one season) after releasing all the required release (controlled release) [units of water volume].
Spill (t,m): the spill amount at a certain season (t) and year (m).
Spill_Release: if the spillway gates are opened, the spill release will be equal to the amount of spill over the time (per unit time or season). If the gates are closed (failed to open), an overtopping failure takes place [units of water volume/time].

The simulation results for different configurations are shown in Table 1.

Different results may be obtained with different inflows, different initial conditions of the reservoirs (initial water level in the reservoirs), and different spillway gates’ operation/management policies.

Figs. 2–5 show BN representations for different connection topologies of the two dam reservoir system used in the simulation with dams in series having dependent inflows, in series having independent inflows, in parallel having dependent inflows, and in parallel having independent inflows, respectively.

In these BNs, each dam is represented using five nodes/variables that include different states. These nodes and their states are explained as follows:

• Flow node: the inflow of the reservoir. It includes two states, High Inflow, or Low Inflow. The inflow, along with the reservoir level, affects the spill event, which is about having more water than the reservoir storage capacity. The probability values of each state (i.e. High and Low inflows) are defined and obtained from the simulation.
• Reservoir Level node: the level of water in the reservoir in every time step. It contains two states, High level, or Low level. Definition of High state or Low state depends on the system analysis, and the probability can then be obtained from simulation (or from available data in real world systems, if available).
• Spill node: affected by the combined states of both inflow and the reservoir level nodes. There would be two states of Spill or No Spill. The Spill node shows the probability of having water (i.e. inflow plus stored water that is a function of the reservoir level) more than the reservoir capacity, so it needs to be released through the spillway in order not to result in overtopping failure. Given the state of both the inflow and the reservoir level in every time step, the spill state will depend on both parent nodes and the conditional probabilities that can be determined from the simulation process.
• Spillway Gate node: the spillway gates are supposed to open during the spill event in order to release the spill amount from behind the dam and to prevent overtopping failure from taking place. If these gates fail to open for any reason during the spill event, a failure is expected. So, this node includes two states, Open, or Failed to Open. According to spillway gates maintenance schedules, there should be an estimation for the percentage of time during the year that the gates tend to fail or not operate. In simulation, using that estimation, randomly generated variables for the states of gates (opened or closed) at every time step will help in determining the probability of dam failure during spill events.
• Dam Failure: the spill events, with the spillway gates failed to open, will result in an overtopping dam failure. The two different states of this node are Failure or No Failure.

Another node called “Dam System Failure” is added to the BN to account for the probability of the entire system failure depending on the topology of the system and whether the dams are in series or in parallel. In a series configuration the whole system fails if at least one of the dams fails, whereas in the parallel configuration the system fails if both dams fail. Of course, other definitions for failure of the system can be assumed.

When probabilistic results of events responsible for failures are obtained from simulation (i.e. both marginal (for each node) and conditional (for each link)) and fed into the BNs represented in Figs. 2–5, the probabilities of failure can then be calculated using the BN equations, for the same initial and operating conditions, as reported in Table 2. This is a BN-simulation integration that is defined in this paper as Simulation Supported Bayesian Network (SSBN). See [38] for more details.

For the two dam reservoir system and while using the SSBN method, there is no need to decompose the system of two dam reservoirs, as it is already a simple system that includes only a small number of variables/nodes.

It can be concluded from Tables 1 and 2 that, running simulation gives close probabilistic results to those calculated from BNs when supported by simulation (i.e. SSBN) that gives the nodal and link
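The reservoir simulation and the series/parallel failure rules described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: the release and spill rules stand in for governing equations that are not reproduced in this text, a year is counted as failed if any of its four seasons has a spill with the gate failed to open, both dams receive independent inflows, and every parameter value (inflow bounds, storage limits, initial storage, gate failure probability) is hypothetical. The helper `bn_system_failure` additionally assumes independent dams when combining the two dam-failure probabilities.

```python
import random


def simulate_dam(years, rng, i_low=10.0, i_high=50.0, s_min=20.0,
                 s_max=100.0, s0=85.0, p_gate_fail=0.05):
    """Return, for each simulated year, the set of seasons with a failure."""
    storage = s0
    failures = []
    for _ in range(years):
        inflows = [rng.uniform(i_low, i_high) for _ in range(4)]
        u = sum(inflows) / 4.0                    # Eq. (3): designed outflow
        failed_seasons = set()
        for season, inflow in enumerate(inflows):
            available = storage + inflow          # Eq. (5)
            # Assumed rule: release at most U while keeping storage >= Smin.
            release = min(u, max(0.0, available - s_min))
            storage = available - release         # Eq. (4) with actual release
            # Assumed rule: water above Smax must be spilled.
            spill = max(0.0, storage - s_max)
            storage = min(storage, s_max)
            gate_failed = rng.random() < p_gate_fail
            if spill > 0.0 and gate_failed:
                failed_seasons.add(season)        # overtopping failure
        failures.append(failed_seasons)
    return failures


def failure_probabilities(years=20000, seed=1):
    """Estimate per-year failure probabilities for two dams and the system."""
    rng = random.Random(seed)
    dam1 = simulate_dam(years, rng)
    dam2 = simulate_dam(years, rng)
    return {
        "dam1": sum(bool(f) for f in dam1) / years,
        "dam2": sum(bool(f) for f in dam2) / years,
        # Series: the system fails if either dam fails during the year.
        "series": sum(bool(a or b) for a, b in zip(dam1, dam2)) / years,
        # Parallel: the system fails if both dams fail in the same season.
        "parallel": sum(bool(a & b) for a, b in zip(dam1, dam2)) / years,
    }


def bn_system_failure(p_dam1, p_dam2):
    """System-failure node for two dams assumed independent."""
    return {
        "series": 1.0 - (1.0 - p_dam1) * (1.0 - p_dam2),
        "parallel": p_dam1 * p_dam2,
    }


if __name__ == "__main__":
    estimates = failure_probabilities()
    print("simulation:", estimates)
    print("BN-style combination:",
          bn_system_failure(estimates["dam1"], estimates["dam2"]))
```

Comparing the two printed results mirrors, in a much reduced form, the comparison between direct simulation and a BN fed with simulated probabilities. The numbers depend entirely on the hypothetical parameters and should not be read against Table 1.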
Fig. 6. Probabilistic data and results of the BN of two dam reservoirs in series having dependent inflows.
Table 3 Table 4
Effect of increased number of states on the SSBN results for a system of two Predicting failure probabilities for future time periods from SSBN steady state
dams. estimates.
Probability of Probability of Probability of Probability Probability of Probability of Probability of
System Failure System Failure System Failure of System System System System
(Simulation) (SSBN, 2 states) (SSBN, 3-4 states) Failure (S.S) Failure (20 Failure (50 Failure (100
years) years) years)
Series Connection, 0.027 0.024 0.025
Independent Series 0.011 0.1985 0.425 0.6692
Inflows Connection,
Parallel 0.00025 0.0001 0.00014 Dependent
Connection, Inflows
Independent Series 0.024 0.385 0.703 0.912
Inflows Connection,
Independent
Inflows
illustrated in Table 3. Accordingly, to have more accurate estimates from Parallel 0.001 0.0198 0.0488 0.0952
the SSBN, more number of states for every variable should be defined. Connection,
Dependent
In a complex system, the simple two dam reservoir system can be Inflows
viewed as one of many sub-systems, and the simulation for the two dam Parallel 0.0001 0.001998 0.004988 0.00995
reservoir system is used to quantify the probability tables of the entire connection,
network. Coupling all sub-networks of the entire system depends on the Independent
Inflows
conditional probabilities among system variables and among sub-systems.
These conditional probabilities may be estimated from the simulation steady
state results, or from historical data, logic inference, and/or expert
judgement [15].

For more complex networks, if the failure probability is expected to be in
the range of $10^{-4}$, then randomly generated samples in the range of
$10^{6}$ are needed to perform the exhaustive simulation for the entire
network. This makes the exhaustive simulation computationally expensive, and
infeasible for systems with a very large number of components. If the system
is decomposed into smaller sub-systems, like the two dam reservoir system,
fewer samples are needed to obtain the steady state results from simulation
(i.e. randomly generated samples in the range of $10^{3}$ to $10^{4}$). This
was also tested on the two dam reservoir example for thousands of samples
(i.e. the $10^{3}$ to $10^{4}$ range) and hundreds of thousands of samples
(i.e. the $10^{5}$ to $10^{6}$ range). The steady state estimates of the
failure probability of the system did not differ significantly as the number
of samples increased for this system. A BN analysis that allows for such a
decomposition is efficient for complex systems although, as discussed in
other parts, the discretization also needs to be increased to match the
required accuracy, at the cost of some additional CPU time.

It must be noted that the probabilistic estimates from simulation or SSBN (in
Tables 1 and 2) are steady state estimates that can be used in predicting
failure probabilities in future time periods (e.g. 20, 50, 100, or 200
years). The following Eq. (9), which depends on the binomial distribution,
can be used for that purpose:

$$P(\text{failure})_{\text{in } n \text{ years}} = 1 - \left[ \left( 1 - P(\text{failure})_{\text{in steady-state}} \right)^{n} \right] \qquad (9)$$

Table 4 shows the results of using the above equation to predict the
probabilities of system failure, in different future time periods, from the
steady state probability estimates of SSBN in Table 2.

In summary, the SSBN concept starts with decomposing the complex
system/network into smaller, less complex networks, like the one proposed in
the two dam reservoir system. Once all the smaller networks are simulated and
represented probabilistically, they are ready to be re-
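
Eq. (9) can be checked numerically against the values listed in Table 4. The following is a minimal Python sketch rather than the authors' implementation. It assumes that the steady state probability is a per-year failure probability and that years are independent, which is what the binomial form of Eq. (9) implies. The function and variable names are illustrative only.

```python
def failure_probability_in_n_years(p_steady_state: float, n_years: int) -> float:
    """Eq. (9): probability of at least one failure within n_years.

    Assumption of this sketch: p_steady_state is a per-year probability
    and the years are independent of each other.
    """
    if not 0.0 <= p_steady_state <= 1.0:
        raise ValueError("p_steady_state must lie in [0, 1]")
    if n_years < 0:
        raise ValueError("n_years must be non-negative")
    # (1 - p)^n is the probability of surviving n consecutive years.
    return 1.0 - (1.0 - p_steady_state) ** n_years


# Steady state (S.S) failure probabilities as listed in Table 4.
STEADY_STATE = {
    "Series, dependent inflows": 0.011,
    "Series, independent inflows": 0.024,
    "Parallel, dependent inflows": 0.001,
    "Parallel, independent inflows": 0.0001,
}

HORIZONS = (20, 50, 100)  # years, matching the columns of Table 4

projections = {
    name: [failure_probability_in_n_years(p, n) for n in HORIZONS]
    for name, p in STEADY_STATE.items()
}

for name, values in projections.items():
    print(name, [f"{v:.4g}" for v in values])
```

Up to rounding, the printed values should agree with the 20, 50, and 100 year columns of Table 4.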
A. El-Awady and K. Ponnambalam Reliability Engineering and System Safety 211 (2021) 107511
The Markov process is a random process in which changes occur
continuously over a period of time, where the future depends only on the
present state and is independent of the past history. Markov Analysis
(MA) is a technique that provides probabilistic information about the
impacts of decisions, which may help the decision maker but does not
provide a recommended decision. It can be used to model system
performance, dependability, availability, reliability, and safety. MA may
also be viewed as a mathematical abstraction to model simple or complex
concepts in a computable form. It is a tool for modeling complex system
designs involving timing, sequencing, repair, redundancy, and fault
tolerance, along with determining the system availability in order to
identify the flow of the system and enumerate the failure rate (forward),
repair rate (backward), and the probability of failure of the different
components. Any Markov model can be graphically represented using a
Markov diagram, which consists of the states and transitions of the
model. The transition probabilities and transition rates among the
different states within the system diagram are of great importance in
Markov Analysis. In reliability analysis, MA has the following
advantages:

• The Markov graphical representation helps in understanding the system
behavior,
• Modelling systems with their state diagrams, and in terms of the
interdependencies of states, is more accurate in specific situations,
• For observation purposes, MA allows for specifying different types of
states and state groups.

A Markov chain is a sequence of random variables $X_1, \ldots, X_n$ such
that, given the present state, the future and past are independent. It is
formally written as follows in Eq. (10):

$$\mathrm{Prob}(X_{n+1} = x \mid X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = \mathrm{Prob}(X_{n+1} = x \mid X_n = x_n) \qquad (10)$$

In other words, the conditional distribution of the future state $X_{n+1}$
depends only upon the present state $X_n$. Usually, the chain is defined by
specifying the probabilities of transitioning from one state to another. The
state space may be continuous or discrete. For a continuous state space where
a probability density can be defined, the transition probability can be
written as $P(x, y) = \mathrm{Prob}(X_{n+1} = y \mid X_n = x)$. For a discrete
state space, the transition probability is a matrix and is written as
$P_{xy}$ [39].

So, if the chain is currently in state $s_i$, it moves to state $s_j$ at the
next step with a probability denoted by $p_{ij}$, and this probability does
not depend upon which states the chain was in before the current state.
The transitions among different states in the Markov Chain are represented
by the Transition Probability Matrix (TPM). This matrix is also
called the matrix of transition probabilities, or the transition matrix. For
a Markov Chain of three states, the TPM can be represented as follows (see
Fig. 7):

$$\mathrm{TPM} = \begin{pmatrix} P_{11} & P_{12} & P_{13} \\ P_{21} & P_{22} & P_{23} \\ P_{31} & P_{32} & P_{33} \end{pmatrix}$$

Fig. 7. Markov Chain of three states S1, S2, S3.

where, for example, $P_{22}$ is the probability that the variable is
currently in the second state and will remain in the second state in the next
step, while $P_{32}$ is the probability that the variable is currently in the
third state and will move to the second state in the next step (i.e. it moves
to the second state given that it was in the third state).

In the next section, the advantages of Markov Chain Analysis are
used to overcome the limitations of the BN and the SSBN
representations.

6. Markov Chain simulation supported Bayesian Network (MCSSBN)

The decompositional approach conducted in [28] and [36] allowed
the researchers in this paper to conceptualize and formulate a new
concept and methodology in dealing with BNs of complex systems. The
concept of Simulation Supported Bayesian Networks (SSBNs), presented
in this paper, is expected to be an efficient method of applying
decompositions to complex networks. Moreover, the Markov-Bayes
(Cyclic-Acyclic) combination, proposed in [28], represents a way of
supplementing Markov Chains with additional low-level features taken
from multiple sources, which are efficiently combined using Bayesian
Networks. Since quantification of BNs depends on basic and conditional
probabilities, and Markov Chains are represented by transition (i.e.
conditional) probabilities among different states, the Markov-Bayesian
combination is a probabilistically quantified representation. Adding
Markov Chains, which may represent different scenarios, states, or
events in a cyclic representation, to the BNs, whose acyclic nature
may not be suitable for all complex structures, will result in a more
generalized approach that fits most large complex system structures.
Such structures, with their complex interrelations among system
components, are usually slow to process when failure probabilities are
calculated using only the Markov structure. The Markov-Bayes combination
is expected to reduce this problem in such systems. Now, with
the decompositional approach, represented by SSBNs, the question is
how to incorporate Markovian analysis to determine different scenarios,
states, or events, which may take place at different times in the system
network. In this section, the concept of the Markov Chain Simulation
Supported Bayesian Network (MCSSBN), a cyclic-acyclic approach, is
introduced.

Hidden Markov Chains and Markov Chain Monte Carlo (MCMC)
models are not new concepts/methods. Hidden Markov Chains are used
in applications to introduce unobservable hidden states and can be also
In this approach, the system is represented with low level BNs, high
level Markov Chains for different scenarios, and a higher level BN.
The motivation here is to add some cyclic behaviour to the BN in order to
accept any feedback for the sub-systems of the network, and to make it
easier to estimate the required probabilistic information for the entire
network with this feedback. The SSBN concept is the cornerstone.
Decomposing the BN into lower level BN sub-networks, and running
simulations for them according to the available data, is the first step of
this approach. Running the simulation for every sub-network makes it
obvious that it may experience more than one scenario. A scenario is
defined here as a combination of states for all the nodes included in
a sub-network, where different combinations mean different
scenarios.

Simulation results for every BN sub-network are used to define the
different scenarios that the sub-network may experience. Then, given
the scenarios and simulation probabilistic results for all the
sub-networks, the sub-networks can be re-combined/aggregated back into the
higher level network. For the Bayesian inference at the higher level, the
larger scale BN can be used to estimate the required probabilistic
information (for example, the probability of failure). As the transition
probabilities among different scenarios in every sub-network are
estimated from simulation, the most probable scenario to take place in the
next time step can be predicted for every sub-network. A Markov Chain
should then be constructed for every BN sub-network. Then, the
probability of failure of the system can be predicted in the next time step.
Moreover, the scenarios that contribute more to failures can be
obtained, and the scenarios of sub-networks which result in a higher failure
probability can also be identified.

A simple example is given to illustrate this approach. An 11 node BN
is introduced in Fig. 8. The eleven nodes represent different variables,
factors, and system components, where every node includes different
states (i.e. at least two states). This BN can be decomposed into 3 BN
sub-networks (i.e. red, blue, and green BN sub-networks) as shown in Fig. 9.
Every low level sub-network is simulated according to the available data
for the variables and components included in the sub-network, or
depending on randomly sampled data. The results of simulation are
probabilistic information (i.e. basic and conditional probabilities) of the
states of the nodes and the BN sub-networks and their interrelationships.
Simulation is also used to identify different scenarios (i.e. combinations
of states) for every sub-network, and the steady state probability
distribution of the scenarios.

Fig. 9. Low level BN sub-networks.

At steady state, every scenario is represented as a Markov state, and
in order to obtain the transition probabilities among the different
scenarios, the relation between the steady state probability distribution of
the Markovian states and the steady state transition probability matrix is
given by the following Eq. (11), and the normalization equation (i.e. the
sum of probabilities must equal 1):

$$\Pi \, \mathrm{TPM}_{s.s} = \Pi \qquad (11)$$

where $\mathrm{TPM}_{s.s}$ is the steady state transition matrix and $\Pi$ is
the row vector of the steady state probability distribution of the Markovian
states. By estimating the steady state probability distribution of the
Markovian states from simulation, and by using the eigenvectors and
eigenvalues for Eq. (11), the transition probabilities among the different
scenarios of every BN sub-network can be estimated. Some of these scenarios
are of interest to the decision maker. So, a high level Markov chain for the
scenarios of interest can be constructed as shown in Fig. 10, where three
scenarios of interest of the first (i.e. red) sub-network are chosen by the
decision maker and their transition probabilities are estimated as
explained above.

The BN sub-networks are then re-combined as one higher level system
BN (i.e. Fig. 8), and the probability of system failure can be
predicted using Bayesian inference. Then, by using the transition
probabilities from one scenario to another, for every sub-network, the
probability of failure can be predicted according to the new scenarios,
and linked probabilistically to the initial scenarios through transition
probabilities. With any evidence in the BN of the entire system, the
posterior Bayesian inference facilitates determining the main
contributing scenarios/states to the evidence (i.e. failure in this
research). This makes it easier to determine which BN sub-network has
contributed more to failures, and which of its scenarios, represented by
Markov Chains, is contributing more to failures.

Markov Chain that controls the states at every time step within the node.
This means that every state variable takes its state from its own lower
level Markov Chain. Running simulation is the starting point. Decomposing
the network into smaller BN sub-networks is used to facilitate the
simulation process. Simulation is used to estimate the steady state
probability distribution of the Markov states included in every
node/variable and their transition probabilities. The probability of
interest (i.e. failure probability in this research) can be predicted by
combining/aggregating all the states and sub-networks as one high level BN.
Then, at every time step, the state of every node in the BN will change
depending on the transition probabilities of the lower level Markov Chains.
So, the high level BN is supposed to experience a different scenario (i.e.
combination of states) at every time step. Accordingly, a higher level
Markov Chain can be constructed for the entire network to estimate the
failure probability for the entire system under different scenarios, and to
be linked
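
As a numerical companion to Eq. (11), the sketch below takes a small transition probability matrix and recovers the row vector Π that satisfies Π TPM = Π together with the normalization condition. It covers only this forward direction and does not reproduce how the paper obtains the transition probabilities from simulation. The 3 × 3 matrix is hypothetical, the function name is illustrative, and power iteration is used here in place of an explicit eigen-decomposition (it converges to the left eigenvector for eigenvalue 1 when the chain is irreducible and aperiodic, which is assumed).

```python
from typing import List

Matrix = List[List[float]]


def stationary_distribution(tpm: Matrix, tol: float = 1e-12,
                            max_iter: int = 10_000) -> List[float]:
    """Return the row vector pi with pi * TPM = pi and sum(pi) = 1.

    Assumption of this sketch: the chain is irreducible and aperiodic,
    so repeated multiplication by the TPM converges.
    """
    n = len(tpm)
    for row in tpm:
        if len(row) != n or min(row) < 0.0 or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("tpm must be square with non-negative rows summing to 1")
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: row vector times matrix.
        nxt = [sum(pi[i] * tpm[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Hypothetical steady state TPM for three scenarios of one sub-network.
TPM_SS = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.3, 0.5],
]

pi = stationary_distribution(TPM_SS)

# Largest violation of Eq. (11), component by component.
residual = max(
    abs(sum(pi[i] * TPM_SS[i][j] for i in range(3)) - pi[j]) for j in range(3)
)
print(pi, sum(pi), residual)
```

For this particular matrix the exact solution is Π = (19/41, 13/41, 9/41), which can be confirmed by substituting it into Eq. (11).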
Fig. 10. High level Markov Chain for the scenarios of interest of a BN sub-network
Fig. 14. An example of a higher level Markov Chain for three scenarios of the entire BN.
using SSBN and MCSSBN for the two dam reservoirs connected in series
and having independent inflows.

The distinction of the MCSSBN over the SSBN is the ability to
predict the system dynamics under different operation scenarios using
transition probability matrices.

When the MCSSBN second approach is applied to the two dam
reservoirs connected in series and having independent inflows, the
system can be decomposed into three sub-systems: the first reservoir
sub-system, the second reservoir sub-system, and the system failure
sub-system (see Fig. 15). While simulating each dam, it is obvious that the
dam failure may happen at different states of inflows and reservoir levels
(i.e. state variables). As stated earlier, in simulation, the inflow of the
first dam is supposed to have three different states (low, intermediate, and
high), and the reservoir level of the first dam also has three different
states (low, intermediate, and high). For the second dam, the inflow is
supposed to have four states (low, intermediate, high, and very high),
and its reservoir level has three states (low, intermediate, and high). In
this approach, instead of having combinations of states (i.e. scenarios)
for each dam as in the first approach, the node of every state variable will
have its own lower level Markovian states and chain. The state variables
in this case are the inflow rate of the first dam, the reservoir level of the
first dam, the inflow rate of the second dam, and the reservoir level of
the second dam. For every state variable, a lower level Markov Chain
should be constructed as shown in Fig. 20, where the steady state
probability distributions of the Markov states and the transition
probability matrices can be estimated from the simulation stage.

The steady state probability distributions of the different states of the
four state variables are estimated from simulation in order to be fed into
the BN representation of the two reservoir system. The Markov transition
probability matrices may also be obtained for the Markov states of
all four state variables. At this point, the simulation results of both dams
and the steady state probability distributions of the four state variables
in the system are ready to be fed into the high level BN of the system,
shown in Fig. 20, to start predicting the system failure.

At every time step, the high level BN is used to predict the system
failure during different combinations of states (i.e. scenarios) for the
entire network, not for every dam (i.e. sub-network) as in the first
approach, depending on the transition probabilities among the Markov
states of the state variables that determine the most probable states to
happen in the next time step.

For example, with the evidence that the first dam has an intermediate
inflow while its reservoir has a high level of storage, and that this
happens when the second dam has a very high inflow while its reservoir
level is at low storage, the posterior probability of system failure, given
this combination of evidence/events, can be estimated. In the next time
step, there might be different states that make a new combination (i.e.
scenario) for the system failure and its probability, and so forth.

Thus, the entire system can be represented using a higher level
Markov Chain that shows the dynamic system analysis for different
combinations of states for the entire system network in different time
Fig. 15. BN of two series reservoirs of independent inflows decomposed to three sub-networks.
Fig. 16. An example of a Markov Chain for a five scenario BN sub-network of the first reservoir.
steps. Many combinations of states for the four state variables can be
defined. These combinations will result in different states of excess
water over reservoir capacity (spill or no spill) for both dams, along with
the randomly generated states for spillway gates. Thus, the system
failure, depending on Dam1 failure and Dam2 failure, will have different
probabilities for different combinations of states. This means that
dynamic scenarios for the whole system network can be used to construct a
new higher level Markov Chain for different scenarios at different time
periods according to the operation of all system components. Fig. 21
shows a higher level scenario (i.e. combination of states) for the entire
network, and also shows an example of a higher level Markov Chain of
different scenarios for the entire network.

If any new data becomes available and the system needs to be
updated, only the sub-network of the dam that was affected by the change
is re-simulated, not the entire network. This allows for estimating
updated steady state probability distributions and transition probability
matrices for the lower level Markov states of the state variables that have
updated data, along with getting all updated probabilistic results from
simulation. Then, the new probabilistic data is fed into the high level BN
for an updated prediction of system failure. It is up to the decision makers,
according to their expertise, to re-simulate the system sub-networks
every season, every year, or even every month, to have more updated
Fig. 17. An example of a Markov Chain for a seven scenario BN sub-network of the second reservoir.
Fig. 18. Higher level BN for two reservoir system with three sub-networks.
data and more accurate and reliable prediction results.

To compare the MCSSBN and SSBN results while using probability
estimates from exact simulations, the steady state probability
distributions of the lower level Markov states of the four state variables
are estimated from the simulation, and used to quantify the BN probability
tables. It is found that the system failure probability under the same
operational conditions is about 2.5%, which is close to what
was obtained in the SSBN stage. Note that the state discretization used
in MCSSBN includes more states for the inflow and the reservoir level
nodes for both dams, which makes the results more accurate and
closer to the simulation results. Table 6 adds the probability of
system failure after using the MCSSBN 2nd approach to what is illustrated in
Table 5 for the two dam reservoirs connected in series and having
independent inflows. The distinction of the MCSSBN over the SSBN is
the ability to predict the system dynamics under different operation
scenarios using transition probability matrices.

This means that increasing the number of discretized states for every
BN node and increasing the number of scenarios of interest is expected
to result in probability estimates from the BN (when supported by
simulation and Markov Chains) that converge to the probability
estimates of system failure obtained using exhaustive simulation. On the
other hand, decomposing the BN with a reduced number of discretized
states makes the system faster to analyze without exploding the
number of states of the system during the failure analysis.

In summary, MCSSBN has two approaches:

The first approach (low level BNs, high level Markov Chains for
different scenarios, and a higher level BN) is used to add some
cyclic behaviour to the BN in order to accept any feedback for the
sub-systems of the network, and to make it easier to estimate the
required probabilistic information for the entire network with this
feedback. It helps the decision makers when certain possible
scenarios of the network sub-systems are of interest. Experts, according
to their experience, may choose to analyze the system with some
scenarios of interest that are most probable to happen.
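
The first approach summarised above can be illustrated with a small Python sketch for a single sub-network. It propagates the current scenario one time step through a scenario-level transition matrix, reports the most probable next scenario and the predicted failure probability, and then applies Bayes' rule to see which scenario contributes most given that a failure occurs. All numbers, scenario labels, and function names are hypothetical. Collapsing the higher level BN into one conditional failure probability per scenario is a simplifying assumption of this sketch, not the procedure described in the paper.

```python
from typing import List

# Hypothetical scenarios of interest for one BN sub-network.
SCENARIOS = ["scenario A", "scenario B", "scenario C"]

# Hypothetical scenario-level transition probability matrix (rows sum to 1).
TPM = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.3, 0.5],
]

# Hypothetical P(failure | scenario), standing in for the higher level BN.
P_FAIL_GIVEN_SCENARIO = [0.001, 0.01, 0.05]


def next_distribution(current: List[float], tpm: List[List[float]]) -> List[float]:
    """Scenario distribution after one time step (row vector times TPM)."""
    n = len(tpm)
    return [sum(current[i] * tpm[i][j] for i in range(n)) for j in range(n)]


def failure_probability(dist: List[float], p_fail: List[float]) -> float:
    """Total probability of failure over the scenario distribution."""
    return sum(p * f for p, f in zip(dist, p_fail))


def posterior_given_failure(dist: List[float], p_fail: List[float]) -> List[float]:
    """Bayes' rule: P(scenario | failure) for every scenario."""
    total = failure_probability(dist, p_fail)
    if total == 0.0:
        raise ValueError("failure has zero probability under this distribution")
    return [p * f / total for p, f in zip(dist, p_fail)]


def argmax(values: List[float]) -> int:
    return max(range(len(values)), key=values.__getitem__)


current = [1.0, 0.0, 0.0]  # evidence: the sub-network is in scenario A now

predicted = next_distribution(current, TPM)
most_probable = SCENARIOS[argmax(predicted)]
p_fail_next = failure_probability(predicted, P_FAIL_GIVEN_SCENARIO)
posterior = posterior_given_failure(predicted, P_FAIL_GIVEN_SCENARIO)
main_contributor = SCENARIOS[argmax(posterior)]

print(most_probable, p_fail_next, main_contributor)
```

With these made-up numbers the most probable next scenario and the main contributor to failure differ, which is the kind of distinction the scenario-level analysis is meant to expose.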
9. Conclusion
Fig. 20. Low level Markov Chains of the state variables of the high level system BN.
Fig. 21. Higher level scenario (i.e. combination of states) for the entire network, and an example of higher level Markov Chain showing dynamic scenarios for the
entire network.
that uses SSBN and Markov Chain integration is proposed to use Markov
transition probabilities to reflect any feedback from different situations,
scenarios, or operational conditions to the system failure analysis that is
required to be considered. From another point of view, in SSBN and
MCSSBN, the BN is supported by simulation, which inherits the
feedback among system components. The simulation is the main source for
probability estimates in SSBN and MCSSBN. So, when the probabilities
are estimated from simulation, they already carry the effect of feedback
when fed into the BN representation. This means that the BN
encapsulates most of the features of the simulation, including feedback, in
the case of the SSBN and MCSSBN methods. In addition, the BN facilitates
representing system scenarios according to different evidence.
Table 6
Failure probability of the system of two dam reservoirs connected in
series and having independent inflows using the MCSSBN second
approach (compare to Table 5).

Method                  System Failure Prob.
MCSSBN 2nd approach     0.025

The methods introduced in this paper can be used to probabilistically
model and quantify complex systems at the national and global levels in
order to avoid their large-scale failures, hazards, risks, and their
consequences [48–50]. Some researchers are also working on finding
optimal Bayesian Network structures for different applications [51].

Author statement

I confirm that I have responded to all reviewers' comments, and
added/corrected what should be corrected or modified.

Declaration of Competing Interest

The authors confirm that there is no conflict of interest with any
party due to the research methods associated with the submitted paper.

Acknowledgement

This work was supported by Ontario Power Generation (OPG),
Ontario, Canada. Tony Bennett and Andy Zielinski of OPG have been
helpful in problem formulation and in providing comments. Slobodan
Simonovic was diligent in reviewing the first author's Ph.D. thesis on which
this paper is based.

References

[1] Berg Heinz-Peter. Risk management: procedures. Methods Exp 2010;1(17):79–95. RT&A #2 June.
[2] Häring I. Introduction to risk analysis and risk management processes. Risk analysis and management: engineering resilience, Singapore. Springer Science+Business Media; 2015. p. 9–26. https://doi.org/10.1007/978-981-10-0015-7_2.
[3] Rodger C, Petch J. Uncertainty & risk analysis, a practical guide from business dynamics. Pricewaterhouse Coopers, MCS, business dynamics, United Kingdom, April 1999.
[4] Office of the gene technology regulator, department of health and aging, Australian government. Risk analysis framework. January 2005.
[5] Sivakumar Babu GL, Srivastava A. Risk and reliability analysis of stability of earthen dams. Guntur, India: IGC; 2009.
[6] FAO (Food and Agriculture Organization). Introduction to risk analysis – basic principles of risk assessment. Yerevan, Armenia: Risk Management and Risk Communication; 2010. October.
[7] U.S. Army Corps of Engineers & U.S. Department of the Interior. Bureau of reclamation. Best practices in dam and levee safety risk analysis. 2015. July.
[8] U.S. Army Corps of Engineers & U.S. Department of the Interior. Bureau of reclamation. Best practices in dam and levee safety risk analysis. 2015. 26 February.
[9] King L. Reliability of flow-control systems. BC Hydro Gen 2014. 12 May.
[10] Hartford DND, Baecher GB. Risk and uncertainty in dam safety. Thomas Telford Ltd.; 2004.
[11] Bernardi S, et al. Dependability analysis techniques. Model-driven dependability assessment of software systems. Berlin Heidelberg: Springer-Verlag; 2013. p. 73–90. DOI: 10.1007/978-3-642-39512-3_6.
[12] U.S. Army Corps of Engineers & U.S. Department of the Interior. Bureau of reclamation. Probabilistic stability analysis (Reliability Analysis). 2015. March.
[13] Kline SJ. The purposes of uncertainty analysis. J Fluids Eng, ASME 1985;107:153–60. June.
[14] Cox DC, Baybutt P. Methods for uncertainty analysis: a comparative survey. Soc Risk Anal 1981;1(4):251–8.
[15] El-Awady A, Ponnambalam K, Bennett T, Zielinski A, Verzobio A. Bayesian Network approach for failure prediction of Mountain Chute dam and generating station. In: ICOLD 87th annual meeting and symposium; 2019. June.
[16] Bea RG, Johnson T. Root causes analyses of the Oroville dam gated spillway failures and other developments. University of California, Berkeley, Center for Catastrophic Risk Management; July 2017.
[17] Peng M, Zhang LM. Analysis of human risks due to dam-break floods, part 1: a new model based on Bayesian networks. Nat Hazards 2012;64:903–33. Springer.
[18] Zhang LM, Xu Y, Jia JS, Zhao C. Diagnosis of embankment dam distresses using Bayesian networks. Part I. Global-level characteristics based on a dam distress database. Can. Geotech. 2011;48:1630–44. NRC Research Press.
[19] Lee C-J, Lee KJ. Application of Bayesian network to the probabilistic risk assessment of nuclear waste disposal. Reliab Eng Syst Saf 2005;91:515–32. Elsevier Ltd.
[20] Ben-Gal I. Bayesian Networks. Encyclopedia of statistics in quality & reliability. Wiley & Sons; 2007. p. 1–6.
[21] Korb KB, Nicholson AE. Introducing Bayesian Networks. Bayesian artificial intelligence. Chapman & Hall/CRC Press LLC; 2004. p. 29–54. Second Edition.
[22] Hosseini S, Barker K. Modeling infrastructure resilience using Bayesian networks: a case study of inland waterway ports. Comput Ind Eng 2016;93:252–66. Elsevier Ltd.
[23] Nadim F, Liu ZQ. Quantitative risk assessment for earthquake-triggered landslides using Bayesian network. In: Proceedings of the 18th International conference on soil mechanics and geotechnical engineering; 2013.
[24] Li P, Liang C. Risk analysis for cascade reservoirs collapse based on Bayesian Networks under the combined action of flood and landslide surge. Math Probl Eng 2016:1–13. Hindawi Publishing Corporation.
[25] Smith M. Dam risk analysis using Bayesian Networks. In: Engineering conferences international; 2006.
[26] Daemi T, Ebrahimi A, Fotuhi-Firuzabad M. Constructing the Bayesian Network for components reliability importance ranking in composite power systems. Electr Power Energy Syst 2012;43(1):474–80. Elsevier Ltd.
[27] Lakehal A, Laouacheria F. Reliability based rehabilitation of water distribution networks by means of Bayesian networks. J Water Land Dev 2017;(34):163–72.
[28] Carter N, Young D, Ferryman J. A combined Bayesian Markovian approach for behaviour recognition. In: Proceedings of the 18th international conference on pattern recognition (ICPR'06). IEEE Computer Society; 2006.
[29] Verzobio A, El-Awady A, Ponnambalam K, Quigley J, Zonta D. Elicitation process to populate Bayesian Networks with application to dam safety. In: The 8th International conference on water resources and environmental research (ICWRER). Hohai University; 2019. June.
[30] Ponnambalam K, El-Awady A, Mousavi SJ, Seifi A. Simulation supported Bayesian Network for estimating failure probabilities of dams. In: ICOLD 87th annual meeting and symposium; 2019. June.
[31] El-Awady A, Ponnambalam K. A decomposition approach using Bayesian Networks and Markov Chains for probabilistic failure analysis of dams. In: The 8th international conference on water resources and environment research (ICWRER). Hohai University; 2019. June.
[32] El-Awady A, Ponnambalam K. Bayesian Network (BN) approach for failure prediction of deep geological nuclear waste repository. In: CNS conference on nuclear waste management, decommissioning, and environmental restoration (NWMDER); 2019. September.
[33] Zheng X, Wei Y, Xu KL, An HM. Risk assessment of tailings dam break due to overtopping. EJGE 2016;21(7):1641–9.
[34] Ingalls RG. Introduction to simulation. In: Proceedings of the 2008 winter simulation conference. IEEE; 2008. p. 17–26.
[35] Perros H. Computer simulation techniques: the definitive introduction!. Raleigh, NC: NC State University; 2009.
[36] Liu ZQ, Nadim F, Eidsvig UK, Lacasse S. Reassessment of dam safety using Bayesian Network. Geo-Risk, ASCE 2017:168–77.
[37] Bai E-W, Tempo R, Fu M. Worst-case properties of the uniform distribution and randomized algorithms for robustness analysis. Math Control, Signals, Syst 1998;11:183–96. Springer-Verlag London Limited.
[38] El-Awady A. Probabilistic failure analysis of complex systems with case studies in nuclear and hydropower industries. Ph.D. Thesis. UWSpace, University of Waterloo; 2019.
[39] Sharma S. Markov Chain Monte Carlo methods for Bayesian data analysis in astronomy. Annu Rev Astron Astrophys 2017;55:213–59.
[40] Tulupyev AL, Nikolenko SI. Directed cycles in Bayesian belief networks: probabilistic semantics and consistency checking complexity. In: 4th Mexican international conference on artificial intelligence, MICAI 2005, advances in artificial intelligence; 2005.
[41] Ghahramani Z. An introduction to hidden Markov Models and Bayesian Networks. J Pattern Recogn Artif Intell 2001;15(1):9–42.
[42] Meeden G, Vardeman S. A simple hidden Markov Model for Bayesian modeling with time dependent data. Commun Stat - Theory Methods 2000;29(8):1801–26.
[43] Bouillaut L, Francois O, Dubois S. A Bayesian network to evaluate underground rails maintenance strategies in an automation context. J Risk Reliab, Inst Mech Eng 2013;227(4):411–24.
[44] Wang F, Zhang Q-L. Systemic estimation of dam overtopping probability: Bayesian Networks approach. J Infrastruct Syst, ASCE 2016;23(2):1–12.
[45] Jong C-G, Leu S-S. Bayesian-network-based hydro-power fault diagnosis system development by fault tree transformation. J Mar Sci Technol 2013;21(4):367–79.
[46] Jones TB, Darling MC, Groth KM, Denman MR, Luger GF. A dynamic Bayesian Network for diagnosing nuclear power plant accidents. In: Proceedings of the twenty-ninth international Florida artificial intelligence research society conference; 2016.
[47] Delgado-Hernández D-J, Morales-Nápoles O, De-León-Escobedo D, Arteaga-Arcos J-C. A continuous Bayesian network for earth dams' risk assessment: an application. Struct Infrastruct Eng 2014;10(2):225–38.
[48] Aven Terje. How to determine the largest global and national risks: review and discussion. Reliab Eng Syst Saf 2020;199. Elsevier.
[49] Agrawal N. Defining natural hazards – large scale hazards. Natural disasters and risk management in Canada. Advances in natural and technological hazards research, 49. Dordrecht: Springer; 2018. https://doi.org/10.1007/978-94-024-1283-3_1.
[50] Langseth H, Portinale L. Bayesian networks in reliability. Reliab Eng Syst Saf 2007;92(1):92–108. Elsevier.
[51] Tan X, Gao X, Wang Z, He C. Bidirectional heuristic search to find the optimal bayesian network structure. Neurocomput 2020. Elsevier.