University of Mines and Technology, Tarkwa Faculty of Integrated Management Studies Department of Engineering Management
BY
MODULE INSTRUCTOR:
TARKWA, GHANA
MAY, 2024
ABSTRACT
Fault Tree Analysis (FTA) is a systematic and deductive approach used to analyse and visualise
potential failure modes within a complex system. This method is widely employed in various
industries, including engineering, manufacturing, aerospace, and nuclear power, to identify
potential causes of system failures and prioritise risk mitigation strategies.
By constructing a fault tree diagram, analysts can identify critical paths or scenarios that could
result in the identified failure and evaluate the probability and consequences of each path. This
analysis helps in identifying weaknesses in the system design, operational procedures, or
external factors that could lead to failures.
Since the 1960s, Fault Tree Analysis has served as a powerful tool for risk assessment, decision-making,
and designing effective preventive and corrective measures to enhance system reliability, safety,
and performance. It provides a structured framework for understanding complex systems,
assessing potential failure modes, and proactively managing risks to prevent catastrophic
events.
Overall, Fault Tree Analysis plays a crucial role in ensuring the safety, reliability, and efficiency
of systems and processes across various industries by systematically analysing potential failure
scenarios and developing targeted risk mitigation strategies.
Fault tree analysis (FTA) is a prominent method for analysing the risks related to safety-critical
and economically critical assets, such as power plants, airplanes, data centers and web shops.
Fault tree analysis has been a useful analytic tool for the reliability and safety of complex systems.
FTA methods comprise a wide variety of modeling and analysis techniques, supported by a
wide range of software tools.
TABLE OF CONTENTS
ABSTRACT
LIST OF TABLES
5.2 The Fault Tree Analysis for the Case Study (for Top Event ‘Tank Overfills’)
6.2 Summary
REFERENCES
LIST OF TABLES
1.1 Overview
Fault Tree Analysis (FTA) is an analytical, graphical, and deductive tool that is widely used for
system design, analysis, maintenance, and safety assessment in various industries in the world.
It is also an internationally standardised approach (for example, IEC 61025) for root cause analysis and system reliability evaluation.
It provides graphical visualisation of complex failure mechanisms and makes even complex
systems readily understandable (A. Jimenez-Roa et al., 2023).
A fault tree starts by identifying the top event of the system that is to be analysed. The top event
represents the undesired event which should be avoided. The top event may result from one or
more basic events or from the failure of other related systems. In the fault tree, the accidents,
errors and failures are represented as events, and logic gates describe how these events combine
(Clerissi et al., 2023).
Simultaneously, Fault Tree Analysis illustrates the interplay of factors, represented as nodes,
influencing one another's effects. Additionally, it aids in evaluating the repercussions of each
issue, showcasing causal relationships within the model. This method proves effective for
optimising resources. By amalgamating faults, it streamlines problem-solving efforts in
industries, making a consolidated group more manageable than tackling individual issues
(Catelani et al., 2021).
The use of fault tree analysis (FTA) began to spread to other commercial sectors in the 1960s.
In the 1970s, FTA emerged as a distinct domain of academic research, as evidenced by
publications in various academic journals. In 1981, the first edition of the Fault Tree Handbook
(NUREG-0492) appeared and set the stage for the implementation of fault trees, and even
more widespread use and operationalisation of the method began with the release of
subsequent editions of the standard. Fault tree analysis (FTA) has since been employed as a
formalised approach in many occupational branches such as the oil industry (FRAM/SNA),
aviation, healthcare, and IT. Notably, however, FTA can offer less than perfect results,
particularly for tasks whose complexity is underpinned by high levels of interdependence.
Nevertheless, FTA has also been employed in applications where system
interdependencies are an especially salient feature. For example, the method has been used as
a quantified approach to model cascades of risk in a 5G network.
1.3 Definition and Purpose
Fault Tree Analysis (FTA) is an approach used to model, evaluate, and prioritise failure
events in the safety and reliability domain. It measures the dependability of safety- or
mission-critical nuclear, chemical, transportation, aviation, or power systems.
The fault tree is a symbolic representation of the sequence of events leading to system
malfunction. The sequence includes basic events (sub-component failures, or system or
environment conditions) and their logic gates (OR and AND) (Barrère & Hankin, 2020).
The applications of FTA also extend to sectors such as the energy, aircraft, and nuclear industries
(Silva et al., 2012). In the FTA model, the Event Based Elements (EBE) are logically linked
via the Logical Gates (LG). After constructing a fault tree, a systematic analysis is
performed to identify the causes of the top event in the form of Minimal Cut Sets (MCSs) (Liu et al., 2021).
When discussing FTA, the focus is on unwanted events or – generally speaking – failure modes.
An unwanted event for the system under evaluation is identified in every FTA: it represents an
important loss of availability, or safety, or reliability in the system and – consequently – it is
also associated with costs. The analysis of this unwanted event is pursued through the building
of a fault tree. The fault tree is a graphical diagram that shows the logical relation among the
events that can lead to the unwanted event or its consequences, although FTA can also be used
for “coherent” success trees (tree diagrams that are using “OR” and “AND” gates that are
oriented toward the opposite direction) (Liu & Jones, 2024).
In building the fault tree, all the possible causes are identified: the model is called “complete”
if, under normal circumstances and with no further investigations, the identification of causes
can be considered exhaustive. The causes appear inside the diagram as logically connected lines:
the output event appears at the top, the gates follow (the OR gate is generally the main
focus of FTA), and the input events appear at the bottom.
CHAPTER 2
BASIC COMPONENTS OF FAULT TREE ANALYSIS
2.1 Overview
The basic components of a fault tree analysis (FTA) are laid out in a top-down manner
(and can be identified with the assistance of a Reliability Block Diagram (RBD)); they are the
essential components on which the analytical performance of the FTA depends. The main elements
in constructing a fault tree are the top event, the inputs to the system (basic events), logic symbols,
gates (AND, OR, K-out-of-N, etc.), transfers, and the final development of the analysis. The
observed output events or situations entered at the top level are identified by a trigger, and
this trigger or characteristic is termed the top event (JM Sollid et al., 2010).
The logical relations between the top event and the basic events are expressed through logic
symbols and are represented by the respective paths of the tree. The tree starts from the
top-level event and finishes with the lowest level of events. The symbols used in a fault tree
are relationship symbols and input (event) symbols. The relationships between events are their
logical connections, which are represented by gates (AND and OR gates). The causative events
sit directly under the event they cause and symbolise its immediate causes. This representation
allows a reviewer not only to trace the root causes of the top event easily but also to
understand how faults propagate through the tree.
Fault tree analysis (FTA) is a powerful tool for reliability analysis, safety assessment, and risk
evaluation of complicated engineering systems. In this technique, the probable causes of system
failure are first investigated, and then feasible safety measures are applied to each
identified failure path. The technique and its calculations then allow the various ways expected
to lead to failure to be enumerated. Hence, to conduct FTA properly, the hardware system under
consideration must be reviewed systematically in order to identify all equipment, components
and key operating situations that can induce system failure, and these potential causes must be
drawn as a fault tree diagram. To produce the FTA model, one must specify its components.
FTA diagrams are made up of events and gates. An event is a state or condition that affects the
system. The leaf nodes of the FTA model are referred to as basic events. A gate is a logical
operator that combines two or more events. The root of the tree represents the top event
(Silva et al., 2012).
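The structure described above (basic events as leaves, gates as logical operators, and the top event at the root) can be sketched in a few lines of Python. This is a minimal illustration only: the class names, the `occurs` function and the example tree TOP = A OR (B AND C) are assumptions made here and are not taken from the cited sources.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class BasicEvent:
    """Leaf node: a failure event that is not developed any further."""
    name: str


@dataclass
class Gate:
    """Logical operator that combines two or more input events."""
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]] = field(default_factory=list)


Node = Union[Gate, BasicEvent]


def occurs(node: Node, state: Dict[str, bool]) -> bool:
    """Return True if the event at `node` occurs for the given basic-event states."""
    if isinstance(node, BasicEvent):
        return state[node.name]
    results = [occurs(child, state) for child in node.inputs]
    if node.kind == "AND":
        return all(results)
    if node.kind == "OR":
        return any(results)
    raise ValueError(f"unsupported gate type: {node.kind}")


# Hypothetical example tree (not from the report): TOP = A OR (B AND C)
top = Gate("TOP", "OR", [
    BasicEvent("A"),
    Gate("G1", "AND", [BasicEvent("B"), BasicEvent("C")]),
])

print(occurs(top, {"A": False, "B": True, "C": True}))   # True
print(occurs(top, {"A": False, "B": True, "C": False}))  # False
```

The root gate plays the role of the top event, and evaluating it recursively mirrors the way a reviewer traces fault propagation from the basic events upward.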
When events are connected by AND gates, the output event can only be realised when all of the
input events connected to that gate occur at the same time. Suitable symbols have been
allocated to define the basic or complex events with the help of a MATLAB toolbox.
DFAs introduced by System Safety Concepts (SySCON) were utilised to deduce the objective
function in terms of the basic event probability values. The system reliability has been studied
with respect to the deduced objective function. Further, the correlation finance function is
used to express the impact of the sudden appearance of an event (basic or intermediate)
or gate failure on the system performance. According to the calculation results, a feasible
measure capable of reducing the system failure probability has then been considered
(Wang et al., 2021).
1  AND Gate           The output event occurs when all the input events occur.
3  Priority AND Gate  The output event occurs if all the input events occur in a specific sequence.
Fault Tree Analysis (FTA) is a graphical tool for modeling the relationships among the events
leading to a fault or an accident.
The elements involved comprise the starting point of the analysis (the top event), which is an
event singled out as significant for a specific reason, and a graphical structure of basic events.
Among the events there can be more specific events, called intermediate events (IEs), which
result from the combination of at least two basic events.
The basic events are modeled as random variables, each with its characteristic probability of
occurrence. In addition, logical operators represent AND logic, OR logic, NOT logic, inhibit
logic, feedback logic and other concepts. All of these elements are combined in the fault tree to
obtain a credible result, identify the main cause of the fault, and ascertain the reliability and
safety of the planned system.
The most common gate types used in FTA are AND, OR, and K/N. In the K/N gate, K is the
minimum number of the N input events that must occur for an output event to propagate: if the
number of input events that occur is equal to or greater than K, then the output event occurs.
According to Chen et al. (2017), “sand” and “pand” are the simultaneous-AND and priority-AND
temporal gates, respectively. Temporal “events” derived from event trees or fault trees are not
considered in this paper (J. Schilling, 2015).
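The voting (K-out-of-N) gate can be illustrated with a short Python sketch. The function names are hypothetical, and the probability calculation assumes that the input events are statistically independent, which the text does not state; it simply enumerates every combination of at least K occurring inputs.

```python
from itertools import combinations
from math import prod
from typing import Sequence


def k_of_n_occurs(k: int, inputs: Sequence[bool]) -> bool:
    """Boolean K/N gate: the output occurs if at least k inputs occur."""
    return sum(inputs) >= k


def k_of_n_probability(k: int, probs: Sequence[float]) -> float:
    """Probability that at least k of the inputs occur (independence assumed)."""
    n = len(probs)
    total = 0.0
    for m in range(k, n + 1):
        # Every way of choosing exactly m occurring inputs out of n.
        for occurring in combinations(range(n), m):
            chosen = set(occurring)
            total += prod(
                p if i in chosen else 1.0 - p for i, p in enumerate(probs)
            )
    return total


# Illustrative 2-out-of-3 gate with an assumed probability of 0.1 per input.
print(k_of_n_occurs(2, [True, False, True]))
print(k_of_n_probability(2, [0.1, 0.1, 0.1]))
```

With K = 1 the gate behaves as an OR gate and with K = N as an AND gate, which is why these three types are usually treated together.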
CHAPTER 3
CONSTRUCTING A FAULT TREE
3.1 Overview
Using the information about the logical relationships between failure scenarios for output
components, based on the structure generated for the scenarios using all the logic model
information, a fault tree can be constructed by systematically building the logical relationships
between the basic events using the available quantitative and qualitative information. Although
a fault tree might be designed by a reliability engineer and a systems engineer from scratch,
most often, the information on the associated system and component failures, correlated with
basic event failures, is already systematically stored. This information is drawn from formal
failure analysis activities (such as those resulting from hazard and operability studies, FMECA,
RBD diagrams, etc.) and taken from databases and existing documentation (e.g. as stored in the
literature on similar equipment, operating experience, etc.) (Luis Fuentes-Bargues et al., 2017).
According to Catelani et al. (2021), the process of constructing a fault tree generally involves the
following steps:
Regardless of the actual approach that is employed, the Gate Type (GT), which provides nodes
for basic events (BEs), should instead serve as a fault tree template that is populated with
conditional probabilities to obtain a corresponding FT. It appears to be common practice to
populate the GT node nearest to the BE nodes in an FT where the BE nodes represent the basic
events and the primary events represented at the AND nodes (or immediate children of the OR
node) appear as a line of negative (¬) BE nodes leading from the non-OR GT child to the target
BE node. This means that, when incorporating the conditional probabilities from the GT to the
FT in a systematic manner, a basic event number (BN) is introduced that replaces each OR GT
child (or its equivalent in the FT) with an OR node whose child nodes are the BE nodes
stemming (ultimately) from the negative BE nodes in the GT (Barrère & Hankin, 2020).
According to STAMP models for risk assessment, a task process is initiated by a human or a
system. A trigger mechanism mediates the relationship between the previous event and
the next task. An adverse event happens when normal operation deviates from the design
intent. A direct cause is a basic event that contributes directly to system failure.
An intermediate cause is the mechanism (or trigger) that bridges a direct
cause and an adverse event.
In this study, the accident step can be seen as the top event pointing to the bottom gate; the
bottom gate shows all possible events causing the top event, and every branch depicts the direct
relationship between two or more events, divided in order by “and” or “or” gates. The fault
tree from the crew’s perception can be seen in Fig. 3. The accident can be decomposed into three
direct causes: visibility defect, the pilot’s incorrect decision, and engine failure. Visibility
defect can be further expanded into flight instrument defects and fog or rainfall. Fog can cause
poor visibility, and rainfall can not only leave water spots on the front glass but also cause poor
visibility.
Then, 14 intermediate events are explicitly depicted. To establish the direct relationship
between intermediate events and the top event, each intermediate event must have a clear
explanation of how it leads to the top event. This part takes the visibility obstacle as an example
to give a detailed explanation. Visibility obstacle is the result of three aspects: topographic
factors, visibility defects, and the pilot’s incorrect decisions. Therefore, to establish the
primary-level relationship, we must connect with the pilot’s incorrect decisions through the “or”
gate and then further connect with visibility defects, which are decomposed into poor weather and
instrument failure, and then establish the relationship with the top event. For foggy weather, the
weakening of visibility leads to visibility obstacles, which can cause accidents
(JM Sollid et al., 2010; Wang & Hu, 2023; Silva et al., 2012).
3.2 Gate Selection and Configuration
To represent the different combinations of service failures, various logic gates are used, such as
Inhibits OR, Correlates AND, Standard AND, Standard OR, Simple Transfer, Non-inhibits
Transfer, and, up to a vertex H, K-out-of-N (Ahmed & Hasan, 2016).
All these logic gates are analysed mathematically to evaluate the heavy user side of the
stakeholders instead of using an intuition-based approach. Afterwards, the logic gates are used
using a unitless function called normalising values. The normalising values are calculated at
every vertex using Non-faulty Sensors, Different Faulty Sensor’s Combinations, Frequency
Rate, and Problematic Sensor’s Submission before and after the gated operators. The logical
trees are then converted using the simple recall-based Normalising Parameters into Probabilistic
Boolean Logic Gate trees. Each basic event is assigned a probability based on the failure tree
that is used to solve the LVIA entailed as the establishment complex using Boolean Algebra
through the inbuilt LabVIEW gates (Barrère & Hankin, 2020).
The fault tree model is then analysed to obtain the desired LVIA indices for the WSN under
consideration, following the reliability and availability evaluation of wireless sensor networks
for industrial applications (Silva et al., 2012). Based on the fault tree constructed, different
combinations of functional and nonfunctional logic gates are used. The corresponding number of
sensor node failures and the cumulative probability of sub-service availability are obtained at
different times. The mean time to critical failure is obtained from the cumulative probability of
sub-service availability.
CHAPTER 4
ANALYSING A FAULT TREE
4.1 Overview
A fault tree represents how the top event is related to its basic events. A combination of basic
events that causes the top event is called a cut set, and a cut set from which no event can be
removed without losing this property is called a “Minimal Cut Set” (MCS).
Determining the top event probability can be done by modeling the fault tree with Boolean or
vector-based formulations and solving the problem using algebraic, numerical, or search
algorithms. Finding the MCS with the maximum probability has long
been an important research problem (Wang & Hu, 2023). A fault tree is a combination of
basic events (BEs) and a top event (TE) that are related through AND, OR, and NOT gates. The
state of the top event is affected only by the states of the basic events that are connected to it
through the gates. The analysis of a fault tree is classified into qualitative and quantitative
analysis.
Qualitative analysis in fault tree analysis involves a descriptive assessment of system failures
based on logical relationships between events and conditions. It focuses on understanding the
critical paths, dependencies, and factors that can lead to the top event or system failure without
assigning specific numerical values. Qualitative analysis helps in identifying potential failure
modes, determining critical factors contributing to failures, and developing an understanding of
the risk landscape. Techniques such as expert judgment, experience, and knowledge are used to
qualitatively assess events, gates, and their relationships within the fault tree. The output of
qualitative analysis includes identifying weak points in a system, prioritising critical events,
and establishing a conceptual understanding of failure scenarios.
4.3 Quantitative Analysis
Quantitative analysis in fault tree analysis involves assigning numerical values, probabilities,
and metrics to events, conditions, and relationships within the fault tree. It focuses on
quantifying the likelihood of different events, estimating failure probabilities, and assessing the
overall risk and reliability of the system. Quantitative analysis helps in making informed
decisions based on numerical data, evaluating the impact of failures, and optimising risk
mitigation strategies. Probability theory, statistical analysis, reliability data, and mathematical
modeling are used to calculate probabilities, failure rates, and other quantitative metrics within
the fault tree. The output of quantitative analysis includes quantified risk assessments,
probabilities of system failures, critical paths analysis, and data-driven optimisation of system
reliability and safety.
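As a minimal sketch of the quantitative step, the top event probability of a small tree can be computed gate by gate. The sketch assumes that the basic events are independent and that each basic event appears only once in the tree; when events are repeated, the calculation is normally carried out on the minimal cut sets instead. The tuple encoding, the function name and the probabilities are illustrative assumptions.

```python
from math import prod
from typing import Dict, Tuple, Union

# A node is either a basic-event name or a tuple ("AND" | "OR", child, child, ...).
Tree = Union[str, Tuple]


def probability(node: Tree, p: Dict[str, float]) -> float:
    """Probability of the event at `node`, assuming independent, non-repeated basic events."""
    if isinstance(node, str):
        return p[node]
    kind, *children = node
    child_probs = [probability(child, p) for child in children]
    if kind == "AND":
        # All inputs must occur: multiply the probabilities.
        return prod(child_probs)
    if kind == "OR":
        # At least one input occurs: complement of "no input occurs".
        return 1.0 - prod(1.0 - q for q in child_probs)
    raise ValueError(f"unsupported gate type: {kind}")


# Hypothetical tree and probabilities: TOP = A OR (B AND C)
tree = ("OR", "A", ("AND", "B", "C"))
p = {"A": 0.01, "B": 0.1, "C": 0.2}
print(probability(tree, p))  # approximately 0.0298
```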
Minimal Cut Sets (MCS) represent the smallest combinations of basic events in a fault tree that,
if they occur together, would lead to the top event or system failure. MCS consist of a set of
basic events that are both necessary and sufficient to cause the top event. By identifying and
analysing minimal cut sets, analysts can pinpoint the most critical combinations of events that
could lead to system failure. MCS help in prioritising risk mitigation strategies, focusing on the
most essential factors that need to be addressed to prevent the occurrence of the top event.
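The following Python sketch shows one simple way of obtaining minimal cut sets from a tree of AND and OR gates: an OR gate passes on the cut sets of all its inputs, an AND gate merges one cut set from each input, and any set that contains another cut set is discarded. It is an illustrative sketch and not a production algorithm; the tuple encoding and the function names are assumptions made here.

```python
from itertools import product
from typing import FrozenSet, Set, Tuple, Union

# A node is either a basic-event name or a tuple ("AND" | "OR", child, child, ...).
Tree = Union[str, Tuple]
CutSets = Set[FrozenSet[str]]


def minimise(sets: CutSets) -> CutSets:
    """Drop every set that strictly contains another set in the collection."""
    return {s for s in sets if not any(other < s for other in sets)}


def cut_sets(node: Tree) -> CutSets:
    """Return the minimal cut sets of the event at `node`."""
    if isinstance(node, str):
        return {frozenset([node])}
    kind, *children = node
    child_sets = [cut_sets(child) for child in children]
    if kind == "OR":
        # Any cut set of any input is a cut set of the output.
        combined = set().union(*child_sets)
    elif kind == "AND":
        # One cut set from every input must occur together.
        combined = {frozenset().union(*combo) for combo in product(*child_sets)}
    else:
        raise ValueError(f"unsupported gate type: {kind}")
    return minimise(combined)


# Hypothetical tree with a repeated event: TOP = (A OR B) AND (A OR C)
tree = ("AND", ("OR", "A", "B"), ("OR", "A", "C"))
for mcs in sorted(cut_sets(tree), key=lambda s: (len(s), sorted(s))):
    print(sorted(mcs))
# ['A']
# ['B', 'C']
```

In this example the single event A forms a first-order cut set, while B and C only cause the top event together, so they form a second-order cut set.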
Path Sets are the counterpart of cut sets: a path set is a combination of basic events whose
non-occurrence guarantees that the top event does not occur, and a minimal path set is a path
set from which no event can be removed without losing this property. Where cut sets describe
the ways in which the system can fail, path sets describe the ways in which it can keep working.
By examining path sets, analysts can see which components must remain functional to keep the
system operating and where added protection is most effective. Path sets thus assist in risk
assessment, decision-making, and prioritising interventions, complementing the failure-oriented
view given by the minimal cut sets.
4.5 Component Importance Measures
In Fault Tree Analysis (FTA), Component Importance measures are crucial in understanding
the contribution of individual components or events to the overall system failure. These
measures help identify which components are most critical to preventing system failure and
guide improvements in system reliability. Criticality Importance measures the importance of a
component in preventing system failure. It is based on the increase in system failure probability
when the component fails. Components with higher criticality importance values are more
crucial to system reliability. Analysing these Component Importance measures in Fault Tree
Analysis helps engineers and analysts identify critical components that require attention and
resources to improve system reliability and safety. Using this information, stakeholders can
prioritise maintenance, upgrades, or redundancy measures to enhance system performance and
reduce the risk of failure.
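To make the idea of an importance measure concrete, the sketch below computes three measures from a list of minimal cut sets, using common textbook definitions that are assumed here rather than taken from the report: the Birnbaum measure (change in top event probability between the component being failed and being working), the criticality measure (Birnbaum measure weighted by the component failure probability and divided by the top event probability) and the Fussell-Vesely measure (probability that at least one cut set containing the component occurs, divided by the top event probability). The cut sets and probabilities are hypothetical, the events are assumed independent, and the exact probability is found by brute-force enumeration, which is only practical for small examples.

```python
from itertools import product
from typing import Dict, FrozenSet, List

CutSets = List[FrozenSet[str]]


def top_probability(mcs: CutSets, q: Dict[str, float]) -> float:
    """Exact top-event probability by enumerating all basic-event states."""
    names = sorted(q)
    total = 0.0
    for states in product([False, True], repeat=len(names)):
        failed = {n for n, s in zip(names, states) if s}
        # The top event occurs if every event of at least one cut set has failed.
        if any(cut <= failed for cut in mcs):
            weight = 1.0
            for n, s in zip(names, states):
                weight *= q[n] if s else 1.0 - q[n]
            total += weight
    return total


def birnbaum(mcs: CutSets, q: Dict[str, float], comp: str) -> float:
    failed = top_probability(mcs, {**q, comp: 1.0})
    working = top_probability(mcs, {**q, comp: 0.0})
    return failed - working


def criticality(mcs: CutSets, q: Dict[str, float], comp: str) -> float:
    return birnbaum(mcs, q, comp) * q[comp] / top_probability(mcs, q)


def fussell_vesely(mcs: CutSets, q: Dict[str, float], comp: str) -> float:
    containing = [cut for cut in mcs if comp in cut]
    return top_probability(containing, q) / top_probability(mcs, q)


# Hypothetical cut sets and failure probabilities (not the case-study data).
mcs = [frozenset({"A"}), frozenset({"B", "C"})]
q = {"A": 0.01, "B": 0.1, "C": 0.2}
for comp in sorted(q):
    print(comp, birnbaum(mcs, q, comp), criticality(mcs, q, comp),
          fussell_vesely(mcs, q, comp))
```

Ranking the components by any of these values gives a priority list for maintenance, upgrades or added redundancy.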
CHAPTER 5
CASE STUDIES AND APPLICATIONS OF FAULT TREE ANALYSIS
As a safety feature, the second level sensor, L2, is connected to switch SW2. When the fluid
level is unacceptably high, SW2 opens, which de-energises relay R1. The R1 contacts then open to
break the control circuit. This results in R2 de-energising; its contacts open and remove power
from the pump. A manual start-up of the circuit is then required. For the system failure mode
‘Tank overfills’, the relevant component failure modes, along with the failure rate and repair
time data, are shown in Table 5.1. Some of the failure modes will be revealed, such as relay R2
contacts stuck closed: this component condition means that the pump keeps running, and
the problem is revealed by the tank overfilling. Others, such as relay R1 contacts failing closed,
will be unrevealed, as this is the normal operating state for that component. All of the component
failure modes associated with the safety systems (L2, SW2, R1 and PB) will be unrevealed, as
for this class of events the failure will only be revealed when the component is tested or
inspected, or when a demand for the component to work occurs. For these component failure
events an inspection interval is also specified, which enables the probability of the event to be
calculated. For this example, an inspection interval of 4380 hours is assumed.
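The sketch below shows how a failure rate, a repair time and an inspection interval can be turned into a component failure probability. The two formulas are commonly used approximations and are an assumption here, since the report does not state which expressions were applied: for a revealed failure the steady-state unavailability is taken as λτ / (1 + λτ), and for an unrevealed failure the average unavailability is taken as λ(θ/2 + τ), where λ is the failure rate, τ the mean time to repair and θ the inspection interval. The input values are those of Table 5.1 and the 4380-hour interval; the resulting numbers are illustrative and are not presented as the report's own results.

```python
from typing import Dict, Tuple

INSPECTION_INTERVAL_H = 4380.0  # inspection interval assumed in the case study

# code: (failure rate per hour, mean time to repair in hours), from Table 5.1
DATA: Dict[str, Tuple[float, float]] = {
    "PB": (5e-5, 2.0),
    "R1": (6e-5, 10.0),
    "R2": (6e-5, 10.0),
    "SW1": (5e-5, 10.0),
    "SW2": (5e-5, 10.0),
    "L1": (2e-6, 5.0),
    "L2": (2e-6, 5.0),
}

# Safety-system components whose failures are only found on test or demand.
UNREVEALED = {"L2", "SW2", "R1", "PB"}


def revealed_unavailability(rate: float, mttr: float) -> float:
    """Steady-state unavailability of a repairable component (assumed formula)."""
    return rate * mttr / (1.0 + rate * mttr)


def unrevealed_unavailability(rate: float, mttr: float, interval: float) -> float:
    """Average unavailability of a periodically inspected component (assumed approximation)."""
    return rate * (interval / 2.0 + mttr)


q = {
    code: (
        unrevealed_unavailability(rate, mttr, INSPECTION_INTERVAL_H)
        if code in UNREVEALED
        else revealed_unavailability(rate, mttr)
    )
    for code, (rate, mttr) in DATA.items()
}

for code in sorted(q):
    print(f"{code}: {q[code]:.3e}")
```

Under these assumptions the unrevealed failures have much larger probabilities than the revealed ones, because a failed safety component can remain undetected for up to a full inspection interval.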
Figure 5.1 Simple Tank Level Control System
5.2 The Fault Tree Analysis for the Case Study (for Top Event ‘Tank Overfills’)
Table 5.1: Component Failure Modes and Data
Component       Failure Mode                  Code      Failure Rate (per hour)   Mean Time to Repair (hours)
Push Button     Stuck Closed                  PB        5 × 10^-5                 2
Relay Contact   Stuck Closed                  R1/R2     6 × 10^-5                 10
Switch          Stuck Closed                  SW1/SW2   5 × 10^-5                 10
Level Sensors   Fail to indicate high level   L1/L2     2 × 10^-6                 5
The fault tree for the undesired top event ‘tank overfills’ is developed with reference to the
system described above. The text boxes specify exactly what each gate output event in the fault
tree represents. Each branch is developed downward using AND and OR gates until basic events
(component failure events) are encountered and the failure causality development is terminated.
The final fault tree structure, showing how the basic events combine to cause the system-level
failure event, is illustrated in Figure 6.2.
4 SW1 SW2
5 SW1 L2
6 L1 PB
7 L1 R1
8 L1 SW2
9 L1 R1
For the tank level control system fault tree, the complete list of minimal cut sets is given in
Table 6. As can be seen, there are nine failure combinations in total: one is first order (a single
event causes system failure) and eight are of order two.
Using the component failure data in Table 5.1, the system failure parameters can be calculated:
If the system failure predictions indicate an unacceptable performance the weaknesses can be
identified using component importance measures.
The Fussell-Vesely measure is indicated in Table 5.3. This shows that component L1 provides
the biggest contribution to system failure.
Table 5.3: Component Importance Measures
Rank   Component   Fussell-Vesely
1      L1          0.4148
2      R2          0.3777
3      L2          0.3155
4      SW1         0.2075
5      R1          0.1139
6      SW2         0.0966
7      PB          0.0963
The disadvantage of this method is that it is not related to the notion of expectations for the
cause of the faulty function. Another limitation is that in case of uncertain data, probabilistic
results obtained in public disasters are not reliable. (Silva et al., 2012)
The use of FTA is very often found in safety-critical domains, particularly in airplane and
nuclear reactor safety. Beyond these domains, FTA is used in the context of system
dependability. This technique is widely used to perform the reliability analysis of hardware and
software systems. In the context of the military domain, FTA can be used to perform the security
and integrity verification of a communication subsystem. (Ahmed & Hasan, 2015)
6.2 Summary
A fault tree represents the causes of a specified system failure mode in terms of the failure
modes of the system components. A summary of the features of fault tree analysis is:
REFERENCES
Catelani, M., Ciani, L., Bartolini, A., Del Rio, C., Guidi, G., & Patrizi, G. (2021), Reliability
Analysis of Wireless Sensor Network for Smart Farming Applications. ncbi.nlm.nih.gov
A. Jimenez-Roa, L., Heskes, T., & Stoelinga, M. (2023), Fault Trees, Decision Trees, And
Binary Decision Diagrams: A Systematic Comparison.
Barrère, M. & Hankin, C. (2020), MaxSAT Evaluation 2020 Benchmark: Identifying
Maximum Probability Minimal Cut Sets in Fault Trees.
Clerissi, D., Di Rocco, J., Di Ruscio, D., Di Sipio, C., Ihirwe, F., Mariani, L., Micucci, D.,
Teresa Rossi, M., & Rubei, R. (2023), Supporting Early-Safety Analysis of IoT Systems by
Exploiting Testing Techniques.
Liu, S. & Jones, E. (2024), Clinical implementation of failure modes and effects analysis for
gynecological high-dose-rate brachytherapy. ncbi.nlm.nih.gov
Silva, I., Affonso Guedes, L., Portugal, P., & Vasques, F. (2012), Reliability and Availability
Evaluation of Wireless Sensor Networks for Industrial Applications. ncbi.nlm.nih.gov
Wang, X. & Hu, X. (2023), Quantitative risk assessment of college campus considering risk
interactions. ncbi.nlm.nih.gov
Dickerson, C., Roslan, R., & Ji, S. (2018), A Formal Transformation Method for Automated
Fault Tree Generation from a UML Activity Model.
JM Sollid, S., Morten Lossius, H., R Nakstad, A., Aven, T., & Søreide, E. (2010), Risk
assessment of pre-hospital trauma airway management by anaesthesiologists using the
predictive Bayesian approach. ncbi.nlm.nih.gov
Wang, L., Fan, M., & Zhang, F. (2021), Applying Fuzzy Fault Tree Method to Evaluate the
Reliability of College Classroom Teaching. ncbi.nlm.nih.gov
Chen, E., Bao, H., Shorthill, T., Elks, C., & Dinh, N. (2022), Failure Mechanism Traceability
and Application in Human System Interface of Nuclear Power Plants using RESHA.