TapRooT
WASAC
M. Tamer 1
Root Cause Analysis (RCA)
TapRooT
Objectives:
By the end of this workshop, participants shall be able to:
Classify accidents
What Is an Accident/Incident?
An unplanned,
unexpected event that
interferes with or
interrupts normal activity
& potentially leads to
personal injury or loss
(equipment damage).
Injury
- A traumatic wound or other condition of the body caused by external forces, including stress or strain.
- The injury is identifiable to time and place of occurrence and to the member or function of the body affected, and is caused by a specific event or incident, or a series of events or incidents, within a single day or work shift.
Occupational Illness
Non-traumatic physiological harm or loss of capacity produced by:
- Systemic infection;
- Continued or repeated stress or strain;
- Exposure to toxins, poisons, fumes, etc.; or
- Other continued and repeated exposures to conditions of the work environment over a long period of time.
A condition that does not meet the definition of an injury.
Definition of Safety
• “Safety is noted more in its absence than its presence.”
Incident/Accident Analysis: Fact Finding
•The investigation of incidents identifies
the specific root causes and causal
factors (contributing factors) for
incidents.
•There is less emphasis on identifying the
specific individuals responsible.
•Disciplinary actions are rare but likely if
there is a history of repeated
occurrences.
•There is usually a greater amount of
explanatory detail in the incident report.
•There is greater tendency in a fact-
finding organization to report a near
miss as well as minor incident events.
Incident
Investigation
Incident investigation is the
process of identifying the
underlying causes of incidents and
implementing steps to prevent
similar events from occurring.
Objective of Incident
Investigation
The primary objective of incident investigation is to prevent recurrence by applying the recommendations from the investigation results, together with our knowledge and experience.
Heinrich’s Theories
Unsafe/At-Risk
Behaviors
88%
Sequence of Events Model Heinrich’s Theories
MISTAKES OF PEOPLE
Multiple Causation Theory
Factors combined in random
fashion to cause accidents.
The Barrier Analysis Process
Hard Barriers (Engineered):
• Machine guards
• PPE
• Fall protection
• Interlocks
• Electrical systems
• Safety valves
Soft Barriers (Administrative):
• Procedures
• Training
• Supervision
• Communication
• Work planning
• Standards and regulations
Barrier Categories
1. Barriers that failed: The barrier was in place and operational at the time of the accident, but it failed to prevent the accident.
3. Barriers that did not exist: The barrier did not exist at the time of the accident.
General Characteristics of Barriers
• Effectiveness – how well it meets its intended purpose
• Availability – assurance the barrier will function when needed
• Assessment – how easy it is to determine whether the barrier will work as intended
• Interpretation – extent to which the barrier depends on interpretation by humans to achieve its purpose
“Work-as-Done” Varies from “Work-as-Planned” at Employee Level
Preventing System Accidents
Safeguard/Barrier Analysis
Ranked from strongest to weakest:
1. Remove or substantially reduce the hazard (strongest)
2. Remove the target
3. Guard the target with an effective safeguard
4. Improve human performance with good human factors design
5. Improve human performance with rules, procedures, and signs (weakest)
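The strongest-to-weakest ranking can be used to sort proposed corrective actions so that stronger safeguards are considered first. The following is a minimal Python sketch of that idea. The `SafeguardLevel` names and the three example actions are illustrative assumptions and are not taken from the TapRooT® materials.

```python
from enum import IntEnum


class SafeguardLevel(IntEnum):
    """Lower value = stronger safeguard, following the five-level ranking."""
    REMOVE_HAZARD = 1
    REMOVE_TARGET = 2
    GUARD_TARGET = 3
    HUMAN_FACTORS_DESIGN = 4
    RULES_PROCEDURES_SIGNS = 5


def rank_actions(actions):
    """Sort (description, level) pairs from strongest to weakest safeguard."""
    return sorted(actions, key=lambda item: item[1])


# Hypothetical proposals, listed in the order someone happened to suggest them.
proposed = [
    ("Post a warning sign near the hazard", SafeguardLevel.RULES_PROCEDURES_SIGNS),
    ("Eliminate the hazard at its source", SafeguardLevel.REMOVE_HAZARD),
    ("Install a physical guard", SafeguardLevel.GUARD_TARGET),
]

ranked = rank_actions(proposed)
for description, level in ranked:
    print(f"{level.value}. {level.name}: {description}")
```

Sorting does not decide which action to take; it only makes the relative strength of each proposal visible to the team.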
Human Error
Is not a cause of failure. It is the effect, or symptom, of deeper trouble. It is
not random, it is systematically connected to features of people’s tools,
tasks and operating environment.
It is not the conclusion of an investigation. It is the starting point!
Old View
Human error is a cause of accidents
To explain failure you must seek failure
You must find people’s inaccurate assessments, wrong decisions, bad
judgements
New View
Human error is a symptom of trouble deeper inside a system
To explain failure, do not try to find where people went wrong
Instead, find how people’s assessments and actions made sense at the time, given the circumstances that surrounded them.
Organizational Factors
The Organizational Factors layer (slice) represents the defenses put
into place by top management. This level of system defenses might
include a company culture that puts safety first, and management
decisions that reinforce safety by providing well-trained employees
and well-designed equipment to do the job.
Resource Management Organizational Climate Organizational Process
Preconditions for Unsafe Acts
Certain substandard preconditions may foster a climate where incidents
can occur. Those preconditions may be related to human factors,
practices, or interface with work conditions or the environment.
Adverse or Inadequate Practices of Operators
Three Basic Causes of Accident
Three Basic Causes
1. Direct Cause
• The direct cause of an accident is the immediate events or conditions that caused the accident.
• The direct cause should be stated in one sentence.
Ex.:
The direct cause of the accident was the inadvertent activation of electrical circuits that initiated the release of CO2 in an occupied space.
Direct & Indirect/Unrelated
Failure Type: Lapse (Omission)
Characteristics: Short-term memory lapse; omit to perform a required action.
Examples:
• forget to indicate at a road junction
• medical implement left in patient after surgery
• miss crucial step, or lose place, in a safety-critical procedure
• drive road tanker off before delivery complete (hose still connected)
Examples
A simple, frequently-performed physical action goes wrong:
• flash headlights instead of operating windscreen wash/wipe function
• move a switch up rather than down (wrong action on right object)
• take reading from wrong instrument (right action on wrong object)
• transpose digits during data input into a process control interface
Short-term memory lapse; omit to perform a required action:
• forget to indicate at a road junction
• medical implement left in patient after surgery
• miss crucial step, or lose place, in a safety-critical procedure
• drive road tanker off before delivery complete (hose still connected)
Typical Control Measures
• Human-centred design (consistency e.g., up always means off; intuitive layout of controls and instrumentation; level of automation etc.)
• checklists and reminders; procedures with ‘place markers’ (tick off each step)
• removal of distractions and interruptions
• sufficient time available to complete task
• warnings and alarms to help detect errors
• often made by experienced, highly-trained, well-motivated staff: additional training not valid
Compliance/Noncompliance Technique
CONDUCTING THE INVESTIGATION
Taking Control of the Accident Scene
Before arriving at the site, ensure that the scene and evidence are properly secured, preserved, and documented and that preliminary witness information has been gathered.
At the accident scene, the Chairperson should:
• Obtain briefings from all persons involved in managing the accident response.
• Obtain all information and evidence gathered by the team.
Accident Investigation Team
The Chairperson is responsible for ensuring that all Investigation Team members work as a team and share a common approach to the investigation, and briefs them on:
• The scope of the investigation
• The schedule and plan for completing the investigation
• Information control and release protocols
• Recording and tracking incoming and outgoing correspondence
The Investigation Team will clarify and confirm priorities on handling the following items:
• Handle serious injuries, a fatality, or serious off-site effects.
• Secure the area in preparation for the investigation.
• Participate in an initial orientation tour.
• Provide for the initial photographic or videotape coverage of the directly affected areas and the surrounding areas.
• Do not concentrate on the worst damage and fail to photograph the fringe of the undisturbed surroundings.
• Have a size reference in the picture, such as a pencil.
Conducting the Investigation
• Develop a timeline for the events leading up to the incident, when this type of information is appropriate.
• Plan for coordination and communication with other functions.
• Seek to obtain facts, not place blame.
Determining
Facts
Immediately following any accident,
much of the available information may
be conflicting and erroneous.
The principal challenge is to
distinguish between accurate and
erroneous information. This can be
accomplished by:
Understanding the activity that
was being performed at the time of
the accident or event.
Personally conducting a walk-through of the accident scene or work location.
Determining Facts
Physical Barriers – Ex.
Facts related to physical barriers on the day of the accident are as
follows:
There were no general barriers, warning lines, or signs to alert
personnel on top of the construction materials to the fall hazards
in the area.
The platform was intended to catch falling tools or parts, but it
was also used as a work platform for personnel with 100 percent
fall protection.
There were no static lines or designated (i.e., engineered)
anchor points for personnel to connect fall protection equipment
in the vicinity of the platform.
Lighting in the area of the platform was measured at 2 foot-candles.
Collecting
Evidence
Three key types of evidence are
collected during the investigation:
1. Human or testamentary
evidence includes witness
statements and observations;
2. Physical evidence is matter
related to the accident (e.g.,
equipment, parts, debris,
hardware, and other physical
items); and
3. Documentary evidence includes
paper and electronic
information, such as records,
reports, procedures, and
documentation.
Collecting Evidence
The process of pursuing evidentiary material involves:
Collecting human evidence
(locating and interviewing
witnesses);
Collecting physical evidence
(identifying, documenting, inspecting,
and preserving relevant matter);
Collecting documentary evidence;
Examining organizational
concerns, management systems,
and line management oversight;
and
Preserving and controlling
Collect and Catalog Physical
Evidence
Equipment
Tools
Materials
Hardware
Operation facilities
Pre- and post-accident positions of accident-related elements
Scattered debris
Patterns, parts, and properties of physical items associated with the
accident.
Collect and Catalog Physical Evidence
Less obvious but potentially important physical evidence includes fluids (liquids and gases).
Many facilities use a multitude of
fluids, including chemicals, fuels,
hydraulic control or actuating fluids,
and lubricants. Analyzing such
evidence can reveal much about the
operability of equipment and other
potentially relevant conditions or
causal factors.
In addition to pathogens, any
evidence may create a hazard for
persons handling it. This aspect of any
evidence should be considered and
addressed before handling it.
Sketch and Map Physical
Evidence
Sketch and map the position of debris, equipment, tools,
and injured persons.
Position maps convey a visual representation of the
scene immediately after an accident.
Evidence may be inadvertently moved, removed, or
destroyed, especially if the accident scene can only be partially
secured. Therefore, sketching and mapping should be
conducted immediately after recording initial witness
statements.
Precise scale plotting of the position of elements can
subsequently be examined to develop and test accident causal
theories.
Photograph and Video Physical
Evidence
Photos or videos can identify, record, or preserve
physical evidence that cannot be effectively conveyed by
words or collected by any other means.
Photographic coverage should be detailed and complete,
including standard references to help establish distance and
perspective.
Video should cover the overall accident scene, as well as
specific locations or items of significance.
Inspect Physical
Evidence
Following initial mapping and photographic recording, a systematic
inspection of physical evidence can begin. The inspection involves:
Surveying the involved equipment, structures, etc., to
ascertain whether there is any indication that component parts
were missing or out of place before the accident;
Noting the absence of any parts of guards, controls, or
operating indicators (instruments, position indicators, etc.)
among the damaged or remaining parts at the scene;
Identifying as soon as possible, any equipment or parts that
must be cleaned prior to examination or testing and transferring
them to a laboratory or to the care of an expert experienced in
appropriate testing methodologies;
Remove Physical
Evidence
Following the initial inspection of the scene,
investigators may need to remove items of physical
evidence. To ensure the integrity of evidence for later
examination, the extraction of parts must be controlled
and methodical.
The process may involve simply picking up components
or pieces of damaged equipment, removing bolts and
fittings, cutting through major structures, or even
recovering evidence from beneath piles of debris.
Before evidence is removed from the accident scene,
it should be carefully packaged and clearly identified.
The investigator’s kit can provide general-purpose cardboard tags or adhesive labels for this purpose.
When preparing to remove physical
evidence, these guidelines should be
followed:
Extraction and removal or movement of
parts should not start until:
witnesses have been interviewed,
since visual reference to the accident
site can stimulate one’s memory.
position records (measurements
for maps, photographs and video)
have been made.
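The two preconditions for moving evidence can be kept as flags on each catalog entry, so that nothing is removed before witnesses are interviewed and position records are made. This is a minimal Python sketch under stated assumptions: the `EvidenceItem` class, its field names, and the `PE-001` tag format are hypothetical and do not come from any investigation standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EvidenceItem:
    tag: str                              # hypothetical label format
    description: str
    witnesses_interviewed: bool = False   # interviews done while the scene is intact
    position_recorded: bool = False       # measurements, photographs and video made

    def blockers(self) -> List[str]:
        """Return the reasons this item must not be moved yet."""
        reasons = []
        if not self.witnesses_interviewed:
            reasons.append("witnesses have not been interviewed")
        if not self.position_recorded:
            reasons.append("position records have not been made")
        return reasons

    def can_be_removed(self) -> bool:
        return not self.blockers()


item = EvidenceItem(tag="PE-001", description="Damaged component")
print(item.can_be_removed(), item.blockers())

item.witnesses_interviewed = True
item.position_recorded = True
print(item.can_be_removed(), item.blockers())
```

Returning the list of blockers, rather than only a yes/no answer, tells the team what still has to be done before extraction starts.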
Be aware that the accident
site.
Care during extraction
and preliminary
examination is necessary to
avoid defacing or distorting
impact marks and fracture
surfaces.
The team lead and
investigators should concur
when the parts extraction
work can begin, in order to
assure.
Collect and Catalog Documentary
Evidence
Documentary evidence can provide important data (i.e.,
proof of “work-as-done”) and should be preserved and
secured as thoroughly as physical evidence.
This information might be in the form of documents, photos,
video, or other electronic media, either at the site or in files at
other locations (this information should not be confused with
procedures and such).
Some work/process/system records are retained only for
the workday or the week. Once an event has occurred, the
team must work quickly to collect and preserve these records
so they can be examined and considered in the analysis.
Electronic
Files
Work orders, logbooks, training records
(certifications/qualifications), forms, time
sheets
Problem evaluation reports
Occurrence reports
Nonconformance reports
Closeout of corrective actions from
similar events
Process metrics
Previous lessons learned
External reviews or assessments
Internal assessments (management and self-assessments)
Collecting Human
Evidence
Human evidence is often the most insightful and also the most
fragile. Witness recollection declines rapidly in the first 24 hours
following an accident or traumatic event.
Therefore, witnesses should be located and
interviewed immediately and with high priority.
As physical and documentary evidence is gathered and analyzed
throughout the investigation, this new information will often
prompt additional lines of questioning and the need for follow up
interviews with persons previously not interviewed.
Witnesses & Interviews
Witnesses
Principal witnesses and eyewitnesses
are identified and interviewed as soon as
possible.
Principal witnesses are persons who
were actually involved in the accident.
Eyewitnesses are persons who directly
observed the accident or the conditions
immediately preceding or following the
accident.
General witnesses are those with
knowledge about the activities prior to or
immediately after the accident (the
previous shift supervisor or work
controller, for example).
Interviews
• A partial list of those who should be interviewed includes:
14-Steps Cognitive Interviewing
Layers of Incident
Causes
Accident investigation is like peeling an onion.
Beneath one layer of causes there are other layers.
The outer layers deal with the immediate causes
while the inner layers are concerned with the
underlying causes such as weakness in the
management system.
Isolate Fact
From Fiction
Establish the norms
Use NORMS-based
analysis of information
•Not an interpretation
•Observable
•Reliable
•Measurable
•Specific
If an item meets all five of the above, it is a fact.
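The five-criteria test can be written as a simple check: a statement counts as a fact only if every NORMS criterion holds. The Python sketch below assumes the investigator has already judged each criterion; the `NormsCheck` class and function names are hypothetical. The first example uses the lighting measurement from the physical barriers slide, and the second is an invented judgmental statement.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class NormsCheck:
    """One investigator judgment per NORMS criterion."""
    not_an_interpretation: bool
    observable: bool
    reliable: bool
    measurable: bool
    specific: bool


def failed_criteria(check: NormsCheck) -> List[str]:
    """Names of the NORMS criteria the statement does not meet."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


def is_fact(check: NormsCheck) -> bool:
    """A statement is a fact only if it meets all five criteria."""
    return not failed_criteria(check)


# "Lighting in the area of the platform was measured at 2 foot-candles."
lighting = NormsCheck(True, True, True, True, True)

# Invented example: "The operator was careless."
careless = NormsCheck(False, False, False, False, False)

print(is_fact(lighting), failed_criteria(lighting))
print(is_fact(careless), failed_criteria(careless))
```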
NORMS of Objectivity
Each objective criterion is paired with its subjective counterpart:
• Not an interpretation (objective): based on a factual description.
  Interpretation (subjective): based on personal interpretations/biases.
• Observable (objective): based on what is seen or heard.
  Non-observable (subjective): based on events not directly observed.
• Reliable (objective): two or more people independently agree on what they observed.
  Unreliable (subjective): two or more people don’t agree on what they observed.
• Measurable (objective): a number is used to describe the behavior or situation.
  Non-measurable (subjective): a number isn’t used.
• Specific (objective): based on detailed definitions of what happened.
  General (subjective): based on non-detailed descriptions.
INVESTIGATION TECHNIQUES
Investigation Techniques
• Causal Factor Tree
• Change Analysis
• Event and Cause Analysis
• Time Sequence Model
• Fault Tree Analysis (FTA)
• TapRooT
• Others
UNDERSTANDING TapRooT
What is a Root Cause?
A Root Cause is the most basic cause (or causes) that can reasonably be identified that management has control to fix and, when fixed, will prevent (or significantly reduce the likelihood of) the problem’s recurrence.
Model of “Accident Causation”
Traditional Accident/Incident Investigations
You can’t solve all human performance problems with discipline,
training, and procedures.
If you look at most industrial accident/incident investigations, you
find three standard corrective actions:
1. Discipline. Which starts with the common corrective action:
“Counsel the employee to be more careful when …”.
2. Training. This may be the most used (and misused) corrective
action of all.
3. Procedures. If you don’t have one, write one. If you already
have one, make it longer.
Often, people can’t see
effective corrective actions
even if they can find the root
causes.
Before You Start
2. Prepare for evidence preservation.
Ensure that evidence is preserved by shift personnel and first responders by having them use the SAy ESPN Technique.
S: Safely arrive at the scene
A: Assess and take control of the scene (evaluate for hazards, avoid creating additional injuries, establish incident command roles and responsibilities)
E: Emergency services – care for the injured; protect the environment
S: Secure the scene
P: Preserve evidence
N: Notify appropriate company personnel and regulatory
3. Develop investigation team requirements for certain classes of problems.
4. Specify training requirements for the team and get them trained.
5. Build an investigation kit.
6. Establish contracts with expert consultants and analysis laboratories for broken equipment and oil analysis, or others.
7. Establish consulting contracts with expert RC facilitators or human performance investigation experts to help with investigations that are beyond the skills of in-house investigators.
8. Consult with your corporate counsel about legal requirements.
What is a Root Cause Analysis?
How Root Causes Cause Accidents
In TapRooT®, each error is called a “causal factor”.
Each causal factor has at least one Specific Root Cause and may have one or more Generic Causes.
Most Specific Causes are related to some type of human error.
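The relationship between causal factors, specific root causes, and generic causes can be captured in a small record that rejects a causal factor with no specific root cause. This Python sketch is illustrative only: the `CausalFactor` class is hypothetical, the causal factor description is invented, and the root cause and generic cause strings are borrowed from the pump storage example given later in the workshop.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CausalFactor:
    description: str
    specific_root_causes: List[str]
    generic_causes: List[str] = field(default_factory=list)  # optional

    def __post_init__(self):
        # Each causal factor needs at least one specific root cause.
        if not self.specific_root_causes:
            raise ValueError(
                f"Causal factor {self.description!r} has no specific root cause"
            )


# Invented causal factor description, for illustration only.
pump = CausalFactor(
    description="Pump not stored properly",
    specific_root_causes=["Equipment Difficulty – Storage"],
    generic_causes=[
        "Inventory control system does not specify proper storage of pumps"
    ],
)
print(pump.description, len(pump.specific_root_causes), len(pump.generic_causes))
```

Making the generic causes list optional mirrors the rule: a specific root cause is mandatory, while a generic cause is recorded only when one applies.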
TapRooT APPLICATION
Plan the
Investigation
What? SnapCharT:
• A map of events &
related
conditions for
an incident
• Planning Tool
• Data Collection
Tool
• The basis for the
rest of your
analysis
SnapCharT®
A SnapCharT® is the most fundamental tool for performing an investigation. It will be used throughout the investigation.
In Step 1, the Spring SnapCharT® is used to:
Develop an initial picture of what happened
Decide what information is readily available and what needs to
be collected
Establish a list of potential witnesses to interview
Highlight conflicts that exist in the preliminary information
Plan the next steps in the investigation
TapRooT® 7-Step
Process
How SnapCharT®s Help
In the TapRooT® System, the first tool an investigator uses is a
SnapCharT®.
[Figure: SnapCharT® layout showing the Incident, Conditions, and assumptions]
Draw Events Next
To decide what goes into an Event, ask
“what action happened next?”
Events are actions – active verbs, one action per box, with the date/time.
How Events Flow on Your SnapCharT
a) Build START to FINISH
OR
b) Build FINISH to START
OR
c) Fill in the BLANKS
Now Add Conditions
What do I know about each Event?
• Was there anything about this Event that was different than desired?
Clarifying facts/data:
• Positive or negative
• Quantified if possible
Ex.: Doctor’s unauthorized removal of the cataract from the left eye.
SnapCharT®
On a preliminary Chart, recognizing what you do NOT
know may be as important as what you do know.
SnapCharT®
At this point, you are ready to review the information on the chart.
• Does any part of the story not make sense?
• Is there any conflicting information?
• What problems do you need to investigate further?
• What information do you need from interviews?
• What evidence do you need to collect?
The preliminary SnapCharT® is used to plan the investigation.
SnapCharT Guidelines Summary
Event:
• “Who does What?” OR “What does What”
• One Action per Event
• Include dates/times (e.g., 10/03/2023)
Conditions:
• Tell what we know about an Event
• Some examples: How/What/Where/Why/To What Extent/Under What Conditions
• What required actions were not done?
• How did equipment fail?
• What was different from design?
ALL items on the Chart should be:
• Factual
• Non-judgmental
• Precise/Quantified
If assumption/Unverified Facts:
• Dashed box
• Dashed oval
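The guidelines map naturally onto a small data structure: Events in sequence, Conditions attached to Events, and a flag for anything still unverified (the items drawn with a dashed outline). The Python sketch below is a hypothetical model and not the TapRooT® software. It uses events from the sprained ankle exercise; which Condition belongs to which Event, and which items are unverified, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Condition:
    text: str
    verified: bool = True   # False = assumption, drawn as a dashed oval


@dataclass
class Event:
    text: str               # "Who does What?", one action per Event
    is_incident: bool = False
    verified: bool = True   # False = assumption, drawn as a dashed box
    conditions: List[Condition] = field(default_factory=list)


def unverified_items(chart: List[Event]) -> List[str]:
    """List everything on the chart that still needs evidence."""
    items = []
    for event in chart:
        if not event.verified:
            items.append(event.text)
        items.extend(c.text for c in event.conditions if not c.verified)
    return items


chart = [
    Event("Employee walks to car", conditions=[Condition("After dark")]),
    Event(
        "Employee steps in pothole",
        conditions=[Condition("No work order submitted", verified=False)],
    ),
    Event("Employee sprains ankle", is_incident=True),
]

print(unverified_items(chart))
```

Listing the unverified items gives the investigator a ready-made set of questions for interviews and evidence collection.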
Exercise
At the end of their shift, two employees leave work and head to their cars in the parking lot. One employee steps in a pothole and sprains his ankle. The other employee notifies security, who arrange for the employee's transportation to the ER, where he is treated and released.
During the investigation, the following were noted:
Step 1: Plan Investigation – Get Started!
Draw Initial SnapCharT
1. Event: Employee walks to car
2. Event: Employee steps in pothole
3. Incident: Employee sprains ankle
4. Event: Another employee notifies security
5. Event: Employee transported to ER, treated and released
Add Conditions to Related Events
Conditions (these are proven facts and questions, NOT opinion):
• After dark
• No action taken
• No work order submitted
• Work order submitted
Step 3: Define Causal Factors
Review the information
on the SnapChart and
identify Causal Factors
Causal Factors
Ask the following questions:
• What error allowed a Hazard to exist or allowed it to grow too large?
• What error allowed a Safeguard to fail?
• What error allowed a Safeguard to be missing?
• What error allowed a Target to get too close to the Hazard?
• What error allowed an incident (or its consequences) to become worse after the Hazard contacted the Target?
Four Steps Method
Starting with the first problem in each group’s “So What?” chain,
follow the chain until you come to the first Event or the incident,
then stop.
The causal factor will be either that first Event or one or two “So
Whats?” back from the Event/Incident.
Causal Factors
Step 2: Group related conditions near the Event that they impact.
Over Grouping
The investigator revises the Events into parallel paths to better reflect the logic of what happened and moves the conditions near the related Events.
Revised SnapCharT® of Sprained Ankle Incident with Conditions Moved
SnapCharT® of Sprained Ankle Incident with Grouped Problems
Causal Factors
Step 3: Use the “So What resulted because of this problem?” method to arrange each group logically.
Select a problem on the SnapCharT and ask: “So What resulted because of this problem?”
Ex. Construction supervisor didn’t plan or supervise deliveries.
Interviews identified that the supervisor knew which roads on that site were rated for heavy loads. He thought the driver knew too.
Causal Factors
So What?
Dump truck driver delivers heavy loads to the const. site by driving
across the parking lot that was not rated for heavy loads.
So What resulted from the dump truck driver delivering heavy
loads to the const. site by driving across the parking lot that was
not rated for heavy loads?
Heavy loads on the parking lot caused the ground beneath the asphalt to shift and created the pothole.
So What?
The existence of the pothole allowed the employee to step in the
pothole.
From details to the BIG picture
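The “So What?” questioning can be sketched as following links from a problem to its result until an Event or the Incident is reached. The Python below is a minimal illustration using the delivery and pothole example; the function name `follow_so_what` and the shortened wording of each step are assumptions made for this sketch.

```python
from typing import Dict, List, Set


def follow_so_what(start: str, results: Dict[str, str], events: Set[str]) -> List[str]:
    """Follow 'So What?' links from a problem until an Event/Incident is reached."""
    chain = [start]
    current = start
    while current not in events and current in results:
        current = results[current]
        if current in chain:
            raise ValueError("'So What?' chain loops back on itself")
        chain.append(current)
    return chain


SUPERVISOR = "Construction supervisor didn't plan or supervise deliveries"
DRIVER = "Dump truck driver drives heavy loads across parking lot not rated for them"
POTHOLE = "Heavy loads on the parking lot create the pothole"
STEP = "Employee steps in pothole"  # an Event on the SnapCharT

so_what = {SUPERVISOR: DRIVER, DRIVER: POTHOLE, POTHOLE: STEP}

chain = follow_so_what(SUPERVISOR, so_what, events={STEP})
for depth, link in enumerate(chain):
    print("  " * depth + link)
```

The chain stops at the first Event it reaches, which matches the instruction to follow the chain to the first Event or the incident and then stop.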
”So-What” Chains
Four Steps Method
Step 4:
Identify the Causal Factor for each group and mark it with a
triangle. The causal factor will be either that first Event or one
or two “So Whats?” back from the Event/Incident.
Level 1 – Top of the Tree
Find Root Causes
This troubleshooting guide helps the investigator identify which of
the seven human performance related Basic Cause Categories to
investigate further.
Analyzing a Causal
Factor
When the investigator identified a Human Performance Difficulty,
they were guided to a set of 15 questions (part of the tree's
embedded intelligence) called the Human Performance
Troubleshooting Guide.
The first of the 15 questions of the guide is shown in Figure
below.
Human Performance Difficulty
Step 5 – Generic Causes
Generic Causes
Root Cause:
Equipment Difficulty – Storage
Generic Cause:
Our inventory control system does not specify proper storage of pumps.
Example Analysis Using
TapRooT®
Initial Incident Description
During a normal night shift at a process plant, fish were killed
when a temporary (temp) water treatment unit overheated and
released hot, low pH water to one of the plant’s outfalls.
An investigation that included a contractor representative
(contract personnel were operating the temporary water
treatment unit) was conducted using the TapRooT® System.
The investigation found the sequence of events shown on the SnapCharT® in the next figure.
Results of Additional Investigation
After considerable investigation including:
interviews with all contract operators and
their supervisor,
discussions with the temporary water
treatment unit vendor's engineers,
interviews with plant personnel at the
process plant unit,
interviews with procurement
personnel, and
interviews with operations
management,
Complete SnapCharT
Analyzing a Causal
Factor
To analyze the causal factor (contract operator falls asleep), the investigator started at the top of the TapRooT® Root Cause Tree® and worked down the tree through a process of selection and elimination.
Analyzing a Causal
Factor
When the investigator identified a Human Performance Difficulty,
they were guided to a set of 15 questions (part of the tree's
embedded intelligence) called the Human Performance
Troubleshooting Guide.
The first of the 15 questions of the guide is shown in Figure
below.
Analyzing a Causal
Factor
The completed analysis of one of these categories (Human
Engineering)
Analyzing a Causal
Factor
When this causal factor was analyzed using the rest of the
applicable Basic Cause Categories (not shown here - Work
Direction, Procedures, Management System) the following root
causes and generic causes were identified:
Developing Corrective
Actions
Once the causes for all of the causal factors were identified, the investigator used the Corrective Action Helper® module of the TapRooT® Software to help develop the corrective actions for each of the root causes.
This module of the software helps investigators:
1. Verify that they are addressing the real causes of the incident.
2. Develop corrective actions to fix the specific cause of the problem.
3. Develop corrective actions for the generic (or systemic) cause (if applicable) for the problem.
4. Develop additional implementing actions needed to make the corrective actions successful.
5. Find references to study the problem in detail and learn more about potential strategies to eliminate the problem.
Corrective Actions
Check:
You have decided that the problem was related to loss of
performance over time while monitoring. (The job was too
boring.)
Corrective Actions
Ideas:
1.You should consider recommending the following
options: (Order does not indicate preference.)
a.Provide an alarm to alert the worker and relieve the
boredom of monitoring.
b.Provide an automated monitoring and response system to
replace human monitoring and response. NOTE: this will probably
leave the worker in supervisory control. You will need to consider
ways to keep the worker informed as to what the automation is
doing and to clearly indicate why it is doing it.
You should also consider ways to keep the workers involved in the
process so that they maintain their situational awareness and
maintain their manual control proficiency.
e. Provide false signals to keep the worker involved. However, you
should also consider that people may ignore real signals if they
become accustomed to receiving only false signals.
f. Consult the workers to see if they have ideas that would make
the task more interesting without conflicting with the monitoring
requirements.
2. Fatigue can also combine with
monitoring alertness problems.
Consider training supervisors to
understand that fatigued personnel
should not be assigned to tasks that
require a high degree of monitoring
alertness.
Correcti
ve
3. Also, consider testing individuals for
Actio their alertness before assigning them to
ns a monitoring task.
1.Replace the old fire hose with a new, tested fire hose.
(Causal Factor 1)
2.Develop policy on testing and use of equipment in
temporary situations. (CF 1)
Corrective Actions
2. Remove the jumpers and place the automatic trip feature back
in service. (CF 2, 3, & 4)
Corrective Actions
However, these root causes will be trended in the facility's
database and if these types of problems repeat - even during
proactive audits, additional corrective action may be justified.
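The trending idea above can be sketched as a simple count over recorded root causes. This is only an illustration: the record format, the category names, and the repeat threshold of 2 are assumptions, not part of TapRooT® or of any particular facility database.

```python
from collections import Counter

# Hypothetical investigation/audit records: (record id, root cause category).
# The category names are illustrative and are not taken from the TapRooT® tree.
records = [
    ("INC-001", "monitoring alertness"),
    ("INC-002", "equipment testing policy"),
    ("AUD-003", "monitoring alertness"),
]

def repeated_root_causes(records, threshold=2):
    """Return root causes seen at least `threshold` times, with their counts."""
    counts = Counter(cause for _, cause in records)
    return {cause: n for cause, n in counts.items() if n >= threshold}

# Root causes that repeat are candidates for additional corrective action.
print(repeated_root_causes(records))
```

Whether a repeat actually justifies further corrective action remains a judgment for the investigation team; the count only flags where to look.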
Corrective Actions
Also, the corrective actions were
reviewed to ensure they were specific,
measurable, that someone was
accountable (no responsible people
were listed here), reasonable, timely (no
due dates were listed here and no
interim corrective actions were
provided for long term projects),
effective, and reviewed for
unanticipated consequences.
Also, as time passed and data was accumulated, data from the
root causes would be reviewed to detect potential areas for
generic improvements and also reviewed to detect negative
trends or verify that improvement has occurred.
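Part of the review described above can be sketched as a record check. This is a minimal sketch under stated assumptions: the `CorrectiveAction` fields and the `review_gaps` helper are hypothetical names, and only the criteria that can be read from a record (measurable, accountable, timely) are checked. Specific, reasonable, effective, and unanticipated consequences still need a human review.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CorrectiveAction:
    description: str
    measure: Optional[str] = None   # how completion will be verified (measurable)
    owner: Optional[str] = None     # accountable person or role
    due_date: Optional[str] = None  # target date (timely)

def review_gaps(action: CorrectiveAction) -> List[str]:
    """List the review criteria that cannot be confirmed from the record alone."""
    gaps = []
    if not action.measure:
        gaps.append("measurable")
    if not action.owner:
        gaps.append("accountable")
    if not action.due_date:
        gaps.append("timely")
    return gaps

# As in the example, no responsible person and no due date are recorded.
action = CorrectiveAction("Replace the old fire hose with a new, tested fire hose.")
print(review_gaps(action))
```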
Comparison of Results
Corrective Actions
However, what factors were missed and left uncorrected that
could contribute to future incidents?
Corrective Actions
2. No effective corrective actions were taken to improve
monitoring alertness. At best, only a temporary improvement in
alertness was achieved.
In fact, the results of spot audits could be non-representative
because operators may be "covering" for each other to ensure
that no one else gets fired.
Moving the diesel so that the operator hears the alarm
and fixing the auto shut-off feature make the sleeping
problem doubtful.
Corrective Actions
3. After a contract operator is fired, other operators will view
future investigations with suspicion and will be less likely to be
fully cooperative.
For example,
Would an operator admit that they had nodded off?
Would another operator "tell" on a fellow operator if he found
the other operator sleeping? Or would they just "handle it
on-shift" and not tell anyone?
Would covering up mistakes get in the way of effective learning
from mistakes?
Corrective Actions
Even though:
- Root cause analysis using TapRooT® and developing corrective
actions is more difficult than blaming those involved, and
- TapRooT® suggests more thorough and potentially more difficult
to implement corrective actions than the easy "fire the
contractor" answer,
Defining Causal Factors for an
Incident with an
Equipment Failure
When an incident involves an equipment failure, the investigator
may recognize that the equipment-related groups are
“over-grouped” or “under-grouped” because the investigator
doesn’t have the information to understand why the equipment failure
happened. More information is needed.
Defining Causal Factors for
an Incident with an
Equipment Failure
Starting with the first step of the Four Step method, we will find
Causal Factors:
STEP 1: Identify all problems on the SnapChart.
Using the Equifactor Troubleshooting Table leads to the discovery of
additional Events and Conditions to add to the SnapChart.
Example: the first symptom may be excessive vibration from a failed
bearing, but the failed bearing is only a symptom.
Defining Causal Factors for an
Incident with an
Equipment Failure
Using the centrifugal pump troubleshooting table under the
“vibration and noise” symptom, we see that excessive vibration
can have many causes including:
Suction problem – pump is cavitating
Suction problem – insufficient immersion of suction pipe or bell
Hydraulic system – total system head higher than design head of pump
Mechanical system – unbalanced pump
Mechanical system – misalignment
Mechanical system – casing distorted from excessive pipe strain
Defining Causal Factors for an
Incident with an
Equipment Failure
Mechanical system – inadequate grouting of base
Mechanical system – bent shaft
Mechanical system – obstruction in lines or pump housing
Mechanical system – mechanical defects – worn, rusted, defective bearings
Mechanical system – unbalanced driver
Mechanical system – motor troubles
Defining Causal Factors for an
Incident with an
Equipment Failure
Some of these are equipment oriented, some are design oriented,
some are installation oriented, and others are related to the
operation of the pump.
Before you can develop a Causal Factor, additional investigation is
needed to understand the Events that led to the pump’s failure
by analyzing potential causes and using a process of elimination.
Based on expert knowledge you can eliminate causes that do not
apply. Maintenance can disassemble the equipment while
collecting information to systematically eliminate items from the list until
the cause of failure is identified.
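The process of elimination described above can be sketched as filtering a list of candidate causes. This is an illustration only: the list is abbreviated from the causes given earlier for the “vibration and noise” symptom, the `eliminate` helper is a hypothetical name, and the ruled-out items are made-up findings, not results from the case.

```python
# Candidate causes for the "vibration and noise" symptom (abbreviated).
candidates = [
    "pump is cavitating",
    "misalignment",
    "bent shaft",
    "obstruction in lines or pump housing",
    "worn, rusted, defective bearings",
]

def eliminate(candidates, ruled_out):
    """Return the causes still open after removing those ruled out by evidence."""
    ruled_out = set(ruled_out)
    return [c for c in candidates if c not in ruled_out]

# Hypothetical findings from disassembly; illustrative only.
remaining = eliminate(candidates, ["misalignment", "bent shaft"])
print(remaining)
```

Each item should be removed only when collected evidence supports it, so the list of what was ruled out, and why, is worth keeping alongside the remaining causes.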
Defining Causal Factors for an
Incident with an
Equipment Failure
Example:
Investigator discovers that the pump had been run with a clogged
suction strainer until vibration and overheating had caused the
bearings to fail.
Where did the debris come from?
The debris looks like it came from a previous repair of a check
valve upstream from the strainer. Thus the Equifactor
Troubleshooting Tables helped the investigator add information to
the SnapChart, including the new Events (see the figure below).
Defining Causal Factors for an
Incident with an
Equipment Failure
Check valve replacement in progress
Debris collects in pipe during work
Suction strainer clogs with debris
And the new Conditions:
Cleanliness control steps in work order not followed
Didn’t remove debris from pipe after work was complete
Didn’t detect clogged strainer
No remote indication, alarm, or regular checks on strainer differential pressure
Didn’t detect hot, vibrating pump until after bearing failure
Supervisor did not inspect piping prior to close-out
No close-out inspection requirement
Defining Causal Factors for an
Incident with an
Equipment Failure
Thus, new problems that were found using the Equifactor are:
Debris collects in pipe during work
Cleanliness control steps in work order not followed
Debris was not removed after work was complete
Suction strainer clogs with debris
Operations didn’t detect clogged suction strainer
Operations didn’t detect hot, vibrating pump until after bearing failure
Supervisor did not inspect piping prior to close-out
No close-out inspection requirement
No remote indication, alarm, or regular checks on strainer differential pressure
Defining Causal Factors for an
Incident with an
Equipment Failure
STEP 2: Group related Conditions
near the event that they impact.
Three problem groups for the Pump
Fails Incident are shown in the next figure.
Organize the information in a cause chain
using the “So What?” method – the
third step in the Four Step Method for
Defining Causal Factors.
Defining Causal
Factors for an
Incident with an
Equipment
Failure
STEP 3: Use the “So What?”
method to arrange each group
logically
Visually display the “So What?”
logic directly on the SnapChart in
fig. below
Defining Causal Factors for an Incident with
an Equipment Failure
STEP 4: Identify the Causal Factor for each group and mark it
with a triangle.
Start with the first problem in each group’s “So What?” chain,
following the chain until you come to the first Event or the
Incident, then stop.
The Causal Factor will be either the Event or one or two “So
What?” back from the Event/Incident. The Causal Factor will be:
Who did what wrong or what was done wrong?
What equipment failed or did not work as intended?
We find four “So What?” Causal Factors as shown in the figure below.
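The chain-walking rule in Step 4 can be sketched as follows. This is a rough sketch under stated assumptions: a chain is modelled as an ordered list of (kind, text) pairs, the `causal_factor_candidates` helper is a hypothetical name, and the ordering of the example chain is assumed for illustration because the SnapChart figure is not reproduced here. The investigator still chooses the Causal Factor from the returned candidates.

```python
from typing import List, Tuple

Node = Tuple[str, str]  # (kind, text); kind is "condition", "event", or "incident"

def causal_factor_candidates(chain: List[Node], max_back: int = 2) -> List[Node]:
    """Follow the chain to the first Event or Incident, then return that node
    preceded by up to `max_back` "So What?" links before it."""
    for i, (kind, _) in enumerate(chain):
        if kind in ("event", "incident"):
            return chain[max(0, i - max_back): i + 1]
    return []  # no Event or Incident reached

# Assumed ordering of one problem group, from first problem toward the Event.
chain = [
    ("condition", "No close-out inspection requirement"),
    ("condition", "Supervisor did not inspect piping prior to close-out"),
    ("condition", "Debris was not removed after work was complete"),
    ("event", "Suction strainer clogs with debris"),
]
for kind, text in causal_factor_candidates(chain):
    print(kind, "-", text)
```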
Prepare A Report
•Be objective!
•State facts.
•Assign cause(s), not blame.
•If referring to an individual’s actions, don’t use names in the recommendation.
•Good: All
The Incident
Report
The incident report is designed to communicate the
investigation results to a wide audience.
The goal of the investigation is to prevent a similar
incident.
An exceptional investigation report will fully
explain the technical elements and issues
associated with the incident.
It will describe the management systems that
should have prevented the event and will detail the
system root causes associated with human errors
and other deficiencies involved in the incident.
Report Requirements
The report shall, at a minimum, include:
•Date and time of the incident.
•Date and time that the investigation began.
•A list of the investigation team members including members' job titles.
•A description of the incident including a detailed chronological sequence of events.
•An Emergency Responder Report of the tactical operations if appropriate and useful.
•The factors that contributed to the incident.
•Recommendations resulting from the investigation.
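The minimum content listed above can be treated as a checklist for a draft report. The sketch below is only an illustration: the key names and the `missing_sections` helper are hypothetical, and the Emergency Responder Report is left out of the required list because it applies only where appropriate and useful.

```python
# Hypothetical keys for the minimum report content; names are illustrative.
REQUIRED_SECTIONS = [
    "incident_datetime",
    "investigation_start_datetime",
    "team_members",          # names with job titles
    "incident_description",  # including chronological sequence of events
    "contributing_factors",
    "recommendations",
]

def missing_sections(report: dict) -> list:
    """Return required sections that are absent or empty in a draft report."""
    return [key for key in REQUIRED_SECTIONS if not report.get(key)]

# Placeholder draft with three sections filled in.
draft = {
    "incident_datetime": "[date and time]",
    "team_members": ["[name], [job title]"],
    "incident_description": "[description]",
}
print(missing_sections(draft))
```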
Executive Summary
The purpose of the executive summary is to convey to the reader a
reasonable understanding of the accident, its causes, and the
actions necessary to prevent recurrence.
Typical executive summaries are one to five pages, depending on
the complexity of the accident.
The executive summary should include a brief account of:
Essential facts pertaining to the occurrence and major
consequences (what happened)
Conclusions that identify the causal factors, including
organizational, management systems, and line management
oversight deficiencies, that allowed the accident to
happen (why it happened).
Executive Summary
The executive summary should not include a laundry list
of all the facts, conclusions, and recommendations.
Rather, to be effective, it should summarize the
important facts; causal factors; conclusions; and
recommendations.
In other words, if this was the only part of the report
that was read, what are the three or four most
important things you want the reader to come away
with?
Executive Summary
Introduction
A fatality was investigated in which a construction
subcontractor fell from a temporary platform in the [Facility]
at the [Site]. In conducting its investigation, the Accident
Investigation Team used the TapRooT technique.
The Team inspected and videotaped the accident site,
reviewed events surrounding the accident, conducted
extensive interviews and document reviews
Direct And Root Causes
The direct cause of the accident was the fall from an
unprotected platform.
The contributing causes of the accident were: (1) ---
Conclusions And Recommendations
summarized in Table 1
Developing Recommendations
Ineffective recommendations may only serve to
transfer the hazard or even create a new hazard that
was not present before the initial incident.
Accident Investigation Startup
Activities List
Name of Designated Lead
Description of Activity
HQ Site Other
GOOD
LUCK