Fault Tree Analysis
CIVE
It is one of the principal methods of probabilistic safety (or risk) analysis (PRA).
It was developed by Bell Telephone Laboratories in 1962 for the U.S. Air Force
Minuteman system and was later adopted and used extensively by the Boeing Company.
Fault tree diagrams are used most often as a system-level risk assessment
technique that can model the possible combinations of equipment failures,
human errors, and external conditions that can lead to a specific type of
accident.
A fault tree follows a top-down structure and represents a graphical model of the
pathways within a system between basic events that can lead to a foreseeable loss
event (or a failure), referred to as the top event.
The contributory events and conditions are interconnected using standard logic
symbols (AND, OR, etc.), also referred to as gates.
Events that must coexist to cause the top event are described using the
AND relationship; alternative events that can individually cause the top event are
described using the OR relationship.
The occurrence of the top event may not lead to a serious or adverse consequence.
The relative likelihood of a number of potential consequences will depend on the
conditions or subsequent events that follow. These potential consequences can be
systematically identified using an event tree.
Basic Events
Advantages:
It allows the use of reliable information on component failure and other basic
events to estimate the overall risk associated with new system designs for which
no historical data exists.
It is simple to understand and easy to implement.
Qualitative descriptions of potential problems and combinations of events
causing specific problems of interest.
Quantitative estimates of failure frequencies and likelihoods, and relative
importance of various failure sequences and contributing events.
Lists of recommendations for reducing risks.
Quantitative evaluations of recommendation effectiveness.
Limitations:
It is difficult to conceive all possible scenarios leading to the top event.
Construction of fault trees for large systems can be tedious.
Correlations between basic events (e.g. failure of components belonging to the
same batch) are difficult to model and exact solutions to correlated events do not
exist.
Subjective decisions regarding the level of detail and completeness are often
necessary.
Notation
Name: Description
Primary Event Symbols
Circle (Basic Event): A basic initiating fault requiring no further development.
Gate Symbols
OR Gate: The union of events, i.e. the output event occurs if at least one of the
inputs occurs.
AND Gate: The intersection of events, i.e. the output event occurs if and only if
all the inputs occur.
INHIBIT Gate: The output event occurs if the (single) input event occurs in the
presence of an enabling condition (a Conditioning Event, drawn as an oval to the
right of the gate).
Transfer Symbols
Triangle-in: Indicates that the tree is developed further somewhere else (e.g. on
another page).
Triangle-out: Indicates that this portion of the tree is a sub-tree connected to the
corresponding Triangle-in.
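The gate definitions above can be read as Boolean functions. The following is a minimal sketch of that reading, not a standard FTA library: the function names and the pump example are hypothetical and serve only to illustrate the logic.

```python
# Boolean reading of the three gate types (illustrative names, not a library API).

def or_gate(*inputs: bool) -> bool:
    """Output occurs if at least one input occurs (union)."""
    return any(inputs)


def and_gate(*inputs: bool) -> bool:
    """Output occurs only if all inputs occur (intersection)."""
    return all(inputs)


def inhibit_gate(input_event: bool, condition: bool) -> bool:
    """Output occurs if the single input occurs while the enabling condition holds."""
    return input_event and condition


# Hypothetical example: flow is lost if both pumps fail, or if power is lost.
pump_a_fails = True
pump_b_fails = False
power_lost = False

no_flow = or_gate(and_gate(pump_a_fails, pump_b_fails), power_lost)
print("Top event (no flow) occurs:", no_flow)
```

In this assumed example one pump failure alone does not trigger the top event, because the two pump failures are joined by an AND gate.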
Rules of Fault Tree Construction
A fault tree should only be constructed once the functioning of the entire system is
fully understood.
The objective is to identify all the component failures, or combinations thereof, that
could lead to the top event (Steps 2 - 4 above).
Rule 1. State the fault event as a fault, including the description and timing of the
fault condition at some particular time. It includes:
a) What the fault state of that system or component is.
b) When that system or component is in a faulty state.
Test the fault event by asking:
a) Is it a fault?
b) Is the what-and-when portion included in the fault statement?
Rule 2. There are two basic types of fault statements: state-of-system and
state-of-component. To continue the tree:
a) For a state-of-system fault statement, use Rule 3.
b) For a state-of-component fault statement, use Rule 4.
Rule 3. A state-of-system fault may use an AND, OR, or INHIBIT gate, or no gate at all.
To determine which gate to use, the input faults must be:
a) The minimum necessary and sufficient fault events,
b) Immediate fault events.
Rule 9. An INHIBIT gate describes a causal relationship between one fault and
another, but the indicated condition must be present. The fault is the direct and sole
cause of the output when that specified condition is present. Inhibit conditions
may be faults or situations, which is why AND and INHIBIT gates differ.
How to Perform Fault Tree Analysis (FTA)
As previously mentioned, the FTA is a logical breakdown from the Top-level undesired
event, cascaded down to the Base-level events (root causes). Each path has a probability
assigned. The paths with the highest severity / highest probability combinations
are identified and will require mitigation. A path traced from the Base-level events (at
the bottom of the FTA) up to the undesirable Top-level event is called a Cut Set. More
precisely, a cut set is a combination of Base-level events whose joint occurrence causes
the Top-level event. There are many cut sets within the FTA, and each has an individual
probability assigned to it. The Base-level events are often color-coded to identify the
risk level indicated.
Step 1. Identify the Hazard: Knowing the consequence of the failure is useful in
defining the top-level event of the fault tree. The top-level event, or hazard, should be
defined as precisely as possible:
How much?
How long (duration)?
What is the safety impact?
What is the environmental impact?
What is the regulatory impact?
List the potential causes of the hazard at the next level down. This is similar to
the '5 Whys' process, except that the development of a fault tree should be
focused on a single level before progressing to the next.
a) Include system design engineers, who have full knowledge of the
system and its functions, in the higher levels of the Fault Tree
Analysis. This knowledge is very important for cause selection.
b) Include Reliability Engineers who can assist in developing the
relationships of causes to a failure or fault.
Estimate the probability of the causes at the Base-level event.
Label all causes with codes.
Prioritize or sequence causes in the order of occurrence or probability.
Step 4. Identify the Cut Sets:
Risk is estimated for each event:
a) When available, the failure rate data can be used to calculate the risk of
a single chain or many chains.
b) If there is no data, an estimate is established based on subjective
guidelines similar to those used in FMEA development.
The Cut Sets with risk greater than the system can tolerate (i.e. safety or
inoperative conditions) are selected for mitigation.
Actions are required for Critical (red) and High (orange) risks.
Action logs and revision records are kept for follow-up and closure of each
undesirable risk. Any risk not mitigated to an acceptable level is a candidate
for mistake-proofing or quality control, which protects the consumer from the risk.
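The selection of cut sets for mitigation can be sketched as follows. This is a minimal illustration, assuming independent basic events so that a cut set's probability is the product of its event probabilities. The event codes, probabilities, and tolerance threshold are hypothetical and not taken from any real system.

```python
from math import prod

# Hypothetical basic-event probabilities, keyed by event code (assumed values).
basic_event_prob = {"E1": 0.01, "E2": 0.05, "E3": 0.002}

# Hypothetical cut sets: each set of events jointly causes the top event.
cut_sets = [{"E1", "E2"}, {"E3"}, {"E1", "E3"}]

TOLERABLE = 1e-3  # assumed tolerance threshold for a single cut set


def cut_set_probability(cut_set, probs):
    """Probability that all events in the cut set occur, assuming independence."""
    return prod(probs[event] for event in cut_set)


# Rank cut sets from most to least likely.
ranked = sorted(
    ((sorted(cs), cut_set_probability(cs, basic_event_prob)) for cs in cut_sets),
    key=lambda item: item[1],
    reverse=True,
)

# Cut sets whose probability exceeds the tolerance are selected for mitigation.
needs_mitigation = [events for events, p in ranked if p > TOLERABLE]

for events, p in ranked:
    print(events, f"{p:.2e}")
print("Selected for mitigation:", needs_mitigation)
```

With these assumed numbers only the single-event cut set exceeds the threshold, which reflects the general point that short cut sets tend to dominate risk.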
[Figure: example fault tree. Top event: Fall from scaffolding. Other events shown:
Height and ground conditions; Safety belt broken; Forget to wear; Upholder broken.]
When to Use Fault Tree Analysis
Types of Fault Tree Analysis
The standard Fault Tree Analysis is not the only option. For specific industries and use
cases, FTA has been extended. These extensions can express features that are
difficult to express with standard fault trees. These are some of them:
Dynamic FTA (Dynamic Fault Trees, DFT): Extends standard fault trees to model
complex system behaviours and interactions.
Repairable FTA (Repairable Fault Trees): Enhances the FTA model with the
ability to describe complex dependent repairs of system components.
Extended FTA: Takes into account multi-state components as well as random
probabilities.
Fuzzy FTA: Uses fuzzy set theory to take into account uncertain variables that are
hard to predict (like wind and weather).
State-event FTA (SEFT): Used to analyse dynamic behaviour that standard fault trees
can't model.
Qualitative fault tree analysis is used to understand the fault tree structure and analyse
the system's vulnerabilities. It can be conducted in several ways:
Minimal Cut Sets (MCS): Help to identify vulnerabilities in a system. A system
whose minimal cut sets contain few components, or contain elements with high
failure probabilities, would be deemed unreliable. These elements are identified
from the minimal cut sets of the fault tree. You can improve reliability by reducing
the likelihood of failure or adding redundancies.
Minimal Path Sets (MPS): Help to determine the system's strengths. A minimal
path set is a smallest set of components whose functioning is enough to keep the
system functioning. Once you have identified these elements, you can work to
reduce their failure rate. This improves the system's reliability.
Common Cause Errors (CCE): Determine whether a single cause can make multiple
elements fail. CCE identifies critical components. Your team must ensure that these
components are regularly inspected and replaced if necessary. A computerized
maintenance management system (CMMS) can plan and schedule maintenance
for these critical components.
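Minimal cut sets can be derived mechanically from the gate structure. The sketch below is one simple way to do this for trees containing only AND and OR gates: expand the tree top-down, then discard any cut set that contains a smaller one. The nested-tuple tree encoding and the event names A, B, C are assumptions made for illustration.

```python
from itertools import product

# A node is either a basic-event name (str) or a tuple (gate, [children]),
# where gate is "AND" or "OR". This encoding is an illustrative assumption.


def cut_sets(node):
    """Return all cut sets of a node as a set of frozensets of basic events."""
    if isinstance(node, str):  # basic event: it is its own cut set
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":  # any single child's cut set causes the output
        return set().union(*child_sets)
    if gate == "AND":  # one cut set from every child is needed together
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate}")


def minimal_cut_sets(node):
    """Keep only cut sets that do not strictly contain another cut set."""
    all_sets = cut_sets(node)
    return {s for s in all_sets if not any(other < s for other in all_sets)}


# Hypothetical tree: TOP = (A OR B) AND (A OR C)
tree = ("AND", [("OR", ["A", "B"]), ("OR", ["A", "C"])])

mcs = minimal_cut_sets(tree)
print(sorted(sorted(s) for s in mcs))
```

For this assumed tree the expansion yields {A}, {A, B}, {A, C} and {B, C}; the two sets containing A alongside another event are absorbed by {A}, leaving {A} and {B, C}. The single-event cut set {A} marks a single point of failure.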
The quantitative FTA method can be used to calculate the probability of failure in
the analysis. This allows you to better understand your risks and prioritize them.
Quantitative FTA may produce stochastic measures or importance measures:
Stochastic measures are used to determine the likelihood of the system failing.
Importance measures indicate the importance of a cut set, path or system to
reliability.
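As an illustration of an importance measure, the sketch below computes the Birnbaum importance of each basic event: the top-event probability with the event certain to occur, minus the top-event probability with the event certain not to occur. Birnbaum importance is only one of several importance measures. The tree TOP = (A AND B) OR C and the probabilities are hypothetical, and the basic events are assumed to be independent.

```python
def top_probability(p):
    """Top-event probability for the hypothetical tree TOP = (A AND B) OR C,
    assuming independent basic events."""
    a_and_b = p["A"] * p["B"]
    return 1.0 - (1.0 - a_and_b) * (1.0 - p["C"])


def birnbaum(event, p):
    """P(top | event occurs) - P(top | event does not occur)."""
    occurs = top_probability({**p, event: 1.0})
    absent = top_probability({**p, event: 0.0})
    return occurs - absent


# Hypothetical basic-event probabilities (assumed values).
probs = {"A": 0.1, "B": 0.2, "C": 0.05}

importance = {event: birnbaum(event, probs) for event in probs}
for event, value in sorted(importance.items(), key=lambda kv: kv[1], reverse=True):
    print(event, round(value, 3))
```

In this assumed tree, event C ranks highest because it can cause the top event on its own, whereas A and B must occur together.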
Once you have a good idea of the probability of your basic events you can
calculate the probabilities for your intermediate events using the gates
connecting them. OR gates and AND gates are the most common gates. Here is an
example.
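The gate arithmetic can be sketched as follows, assuming the input events are statistically independent. Under that assumption an AND gate multiplies the input probabilities, and an OR gate gives one minus the product of the complements. The function names and numeric values are hypothetical.

```python
from math import prod


def p_and(probs):
    """AND gate: all inputs occur. P = p1 * p2 * ... (independent inputs)."""
    return prod(probs)


def p_or(probs):
    """OR gate: at least one input occurs. P = 1 - (1 - p1)(1 - p2)..."""
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical basic-event probabilities (assumed values).
p_a, p_b, p_c = 0.1, 0.2, 0.05

# Intermediate event: A AND B
p_intermediate = p_and([p_a, p_b])

# Top event: (A AND B) OR C
p_top = p_or([p_intermediate, p_c])

print(f"P(A AND B) = {p_intermediate:.3f}")
print(f"P(top)     = {p_top:.3f}")
```

With these assumed values the intermediate event has probability 0.1 x 0.2 = 0.02, and the top event has probability 1 - (0.98 x 0.95) = 0.069. If the basic events are correlated, for example components from the same batch, these formulas no longer hold exactly.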
Benefits:
It provides a visual record of a system that shows the logical relationships
between the causes and events that lead to system failure.
The analysis helps others to understand the results and pinpoint the
weaknesses quickly.
At the same time, it also lays the foundation for any further evaluation and
analysis.
When upgrades or changes are made to the system, Fault Tree Analysis
already provides the steps for assessing possible changes and their effects.
A Fault Tree diagram can also be used to develop maintenance procedures and
design quality tests.
Limitations:
The undesired event must first be foreseen and evaluated. Thus, all factors
contributing to its cause have to be anticipated.
The effort may sometimes prove to be very expensive and time-consuming
unless you have the skills to perform the FTA yourself.
If the reader does not possess the necessary skills, Fault Tree Analysis may not
be a suitable method to use.
Conclusion:
Fault Tree Analysis is not an easy process, even though the examples might suggest
otherwise. With the right team, you will be better able to anticipate and predict
failures. You will be able to plan fault repair into scheduled maintenance
downtime, and your team will work proactively rather than reactively.