Contents
Don’t keep safety a secret – an introductory guide to functional safety
Summary
Getting safety wrong in high hazard industries is not an option. Nothing is more important than
safety because incidents:
• Cost money
• Disrupt reliable operations
• Impact the environment
• Can result in massive damage
• May cause loss of life
• Threaten an organization’s reputation and very existence.
And yet, a worrying trend is happening. Safety incidents are rising despite a decrease in the total
number of working hours¹.
After 15 years working in high hazard industries delivering safety system projects, I am now
considered a ‘mid-career professional’. Where I was once the ‘newbie’, there is now a new
generation of ‘early-career professionals’ on the same journey, finding their way and asking many of
the same questions I asked when I started.
While a lot has changed in 15 years (fantastic technology advances, the rise of digital software
tools and applications, cyber security concerns), the functional safety basics remain the same.
¹ Source: International Association of Oil & Gas Producers (IOGP), ‘Safety performance indicators – 2022 data’.
I would like to share some of my learnings and experiences with a whole new
generation. While the functional safety domain is “an inch wide and a mile deep”, and the devil is
in the detail, I would encourage anyone to add comments, share experiences, and help others to
learn for the greater good.
In the next chapter, we will look at the evolution of the functional safety standards including
IEC61508 and IEC61511.
Functional safety is fundamental to enabling the safe, reliable, and compliant operation of today’s
process plants. A key component is the complex technology used for safety-related automation
systems. More important still is the implementation of a safety lifecycle approach that captures all
aspects of the design, implementation, operation, and maintenance of those plants.
The following chapter introduces a very complex subject and is intended as an overview of
the various practices, standards, concepts, and implementation considerations involved in
functional safety management.
We will examine:
• the evolution of today’s safety standards, the differences among them, and their requirements
for compliance.
• the elements that make up a contemporary Safety Instrumented System and how it differs
from a basic process control system.
• some common examples that incorporate Safety Instrumented Systems into safe
process design.
Like engineers in every domain, we love nothing more than Three Letter Acronyms (TLAs), so don’t
be surprised if you come across a few throughout the series.
Nothing is more important than safety to the process control industry. Standards continue
to evolve as the industry continues to learn and improve.
A key milestone was reached in 1996 when the International Society of Automation, or ISA,
introduced ISA SP84 - Safety Lifecycle, Quantitative Approach, a standard that documented the
steps necessary to properly specify, design and maintain a safety system.
IEC 61508 - Safety Lifecycle, Quantitative and Qualitative Approach, is an umbrella standard
released by the International Electrotechnical Commission in 1998. It covers the functional
safety of electrical, electronic and programmable electronic systems across all industry sectors.
Operators and developers of safety equipment use this document to design, manufacture and
successfully implement safety throughout the entire lifecycle of a system.
IEC 61511 is a sector-specific standard for functional safety implementation, developed
specifically to address functional safety in the process industry. Released in 2003, it applies to all
components making up a SIS, including field sensors, logic solvers and final elements such as valves,
pumps and motors. In 2004, the standard was wholly adopted in the USA as ISA/ANSI S84.01 (with
the exception of the grandfather clause). Since then, other industry-specific standards have been
introduced, including those for the nuclear, rail, machinery, and manufacturing industries.
Both IEC 61508 and IEC 61511 continue to evolve and adapt to industry needs based on
practical experience gained from global implementation. IEC 61508 Edition 2 was released in 2010
with several significant changes and improvements, removing ambiguity and offering guidance
on applicability. There are many working groups focused on potential areas of improvement, so
expect an updated version soon.
It is important to note that these standards are not prescriptive, but performance based.
They present guidelines for best practices, but they do not identify procedures for specific
implementation.
In the next chapter, we will look at why we need functional safety and its origins.
Following the Piper Alpha disaster in 1988, the Health and Safety Executive (HSE) in the UK analysed
industrial incidents that caused injury or death to employees. The survey revealed that most
safety problems were caused by human error. Let’s take a closer look.
Failures caused by the design and implementation of Safety Instrumented Systems
accounted for only 15% of all failures analysed. This raises the question: if the safety systems
account for only 15% of the failures analysed, where are the rest coming from?
Significantly, the largest source of failures was inadequate specification of the control
system (44%). This may have been due either to poor hazard analysis of the equipment under
control (EUC), or to inadequate assessment of the impact of failure modes of the control system on
the specification. Whatever the cause, situations that should have been identified are often missed
because a systematic approach has not been used.
Next, changes made to these systems after they were commissioned, changes that were
not analysed or documented, accounted for 20% of these failures. Perhaps these changes were
expedient for getting production underway, but they defeated the safety function by invalidating the
design and implementation.
Improper maintenance and operating procedures were responsible for another 15% of all failures.
If operators don’t service their equipment or do preventive maintenance, they can’t rely on their
equipment to protect them when it is needed.
And finally, installation and commissioning errors, for example, devices that were incorrectly
placed so they were unable to sense hazardous conditions, accounted for 6% of all failures.
In total, 85% of all failures had nothing to do with automation and control. They had to do with
the specification, installation, and operation of the equipment. They were caused by how the
equipment was used, not by the equipment itself.
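The percentages quoted above can be tallied in a few lines. This is only a minimal sketch in Python; the dictionary keys are shorthand labels I have chosen for illustration, not official categories from the HSE analysis.

```python
# Share of failures by primary cause (percent), as quoted in the text above.
# The key names are illustrative labels, not official categories.
failure_causes = {
    "specification": 44,
    "changes after commissioning": 20,
    "maintenance and operation": 15,
    "design and implementation": 15,
    "installation and commissioning": 6,
}

total = sum(failure_causes.values())

# Everything except the design and implementation of the safety system itself.
not_design_related = total - failure_causes["design and implementation"]

print(f"Total accounted for: {total}%")
print(f"Not caused by design and implementation: {not_design_related}%")
```

The five categories account for every failure analysed, and the four that are not about design and implementation add up to the 85% figure.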
This realisation led to the concept of the functional safety lifecycle, where safety is considered from
project conception to plant decommissioning.
In the next chapter, we will look at how following a systematic lifecycle approach is important to
successfully implementing functional safety.
Well, if you have a set of company standards or philosophies, go check if IEC61508 or IEC61511
are listed. If they are, then you need to understand what they mean – to YOU!
Let’s look at the IEC61508 safety lifecycle. This general approach to functional safety specifies 16
distinct parts for all activities required to manage safety throughout the entire Safety Instrumented
System lifecycle:
Phase 1 – the analysis phase, when you analyse and identify the risk:
Phase 2 – the realization phase, which encompasses the risk reduction system (e.g. the safety
instrumented systems), including the system design, build, test, documentation, installation,
commissioning and site testing.
For example, suppose the safety requirement calls for proof testing of an element of the Safety
Instrumented Function every 12 months, but plant maintenance only tests it every 36 months.
The assumed safety integrity of that element is then invalid. Not only does this expose you to
increased safety risk, but it will also expose you to compliance issues should an incident occur.
But IEC 61511 is specifically tailored for those who design, integrate and use SIS. It focuses
attention on one type of automated safety system used within the process sector, the safety-
instrumented system.
IEC 61511 covers the design and management requirements for a Safety-Instrumented System
from cradle to grave. There are also three basic phases in this lifecycle: analysis, realization and
operation. Their scope includes 11 sections ranging from initial concept, design, implementation,
operation and maintenance, through to decommissioning.
Throughout the safety supply chain, demonstrated compliance to these standards is a must-have
for best practice process safety management.
In the next chapter, we will look at how we can put the standards into practice, independent
protection layers and building defence in depth.
As discussed in chapter 3, the safety lifecycle can be categorized into three broad areas.
The first is the analysis phase. Once you have a conceptual process design in hand, the first
step is to analyse the process’s hazards and risks to people, equipment, and the environment.
An assessment of these hazards and risks will determine what critical safety-instrumented
functions you need in relation to the corporate tolerable risk levels. These, in turn, will determine
what safety integrity levels to apply to the safety-critical functions. The conceptual design stage
includes applying the safety integrity levels to the safety functions and verifying that the design will
meet the safety needs.
What can go wrong? What are the consequences? What is the impact? What do we need
to do to reduce the likelihood? What systems do we need to put in place? How do we ensure
that the systems we rely on are operating as designed?
The second phase is realization, which focuses on design, specification, fabrication, procurement,
construction, installation, and commissioning.
The final phase is operation, which covers start-up, operation, maintenance, modification, and
eventual decommissioning.
The operation phase includes a very important ingredient – the management of change. Anytime
you decide to make a modification in this phase, you need to assess the impact of this change on
your system by analysing the effect throughout the entire safety lifecycle.
You want to be sure that your change will not affect any safety-critical loops or compromise the
safety integrity that you selected for that loop. When applied correctly, this feedback mechanism
ensures that you do not introduce human error or unforeseen hazards by making changes at the
end of the process.
For example, the above image shows a plant that is being run with a basic process control system.
It could be anything from simple single-loop controllers to a very large-scale, complex system. Its job is
to keep the process under control and operating within the safe operating limits. For example, the
process could be a pressure vessel where chemicals are mixed to produce some sort of product.
The control system would govern how much product goes in, how much is converted, and how much
comes out.
If something upsets that process, for example, bad feedstock, a pressure spike, or something else
extraneous – the first layer of protection is the alarm. When the alarm is generated, an operator
can intervene in the process to get it back under control and back in production. The process
alarm and operator response are considered a layer of protection, an independent brick wall that
prevents a catastrophic situation.
The next layer of protection is generally a Safety Instrumented System or shutdown system that
senses that a process is getting so out of control that it is headed for a dangerous situation. With
no human intervention, it shuts down the process before it gets to a hazardous state.
If these systems don’t work or don’t work fast enough, the next layer of protection is physical
devices such as pressure relief valves or flare systems.
However, if the hazard remains uncontained, that is, the hazardous event has occurred, the next
layer of protection seeks to mitigate the result (as opposed to preventing it in the first place).
For example, in the event of a fire, a sprinkler system could mitigate the consequences.
Each protection layer is independent from the source of the hazard and from other protection
layers. Any one layer will perform its function, regardless of the action or failure of any other
protection layer.
The left-hand axis is divided into Prevention and Mitigation. Plant design is a protection layer in
itself: if you design your plant to be inherently safe (often referred to as inherently safe design), that’s
a layer of protection.
Next, the Basic Process Control System (BPCS), normally a Distributed Control System (DCS)
which is responsible for the normal operation of the plant, is used as a layer of protection against
unsafe conditions in many instances. Normally, if the BPCS fails to maintain control, alarms notify
operations that human intervention is needed to re-establish control of the process within the
specified limits.
Should these protection layers fail, further layers of protection continue to try to minimise the
consequences of the hazard, such as relief valves, dikes, emergency response crews, and plant
and community evacuation procedures.
The key point here is that a Safety Instrumented System is an independent layer of protection and an
important part of hazard prevention. Until recently, some plants tried to incorporate this safety
function inside the Basic Process Control System. The problem is this: if a failure of the Basic
Process Control System is the cause of the hazard, the Basic Process Control System can’t possibly
execute the safety action required to bring the plant to a safe state.
A failure of one places a demand on the other. The Safety Instrumented System must be independent from
the Basic Process Control System, especially if you are taking credit for the risk reduction in
your calculations.
In the next chapter, we will look at what makes up a Safety Instrumented System, how it is different
from the control system, the difference between safety and non-safety PLCs and what you would
use an SIS for.
Formal definition: SIS – “instrumented system used to implement one or more Safety Instrumented
Functions (SIF). A SIS is composed of any combination of sensor(s), logic solver(s), and final
element(s)” (IEC 61511 / ISA 84.01)
Informal definition: “an instrumented Control System designed specifically for safety applications that
detects “out of control” process conditions and automatically returns the process to a safe state”.
The informal definition more easily conveys intent: “a Safety Instrumented System is the last line of
defence before a hazard occurs. It is a layer of protection designed to achieve or maintain a safe
state of the process when unacceptable process conditions are detected”.
Again, the Safety Instrumented System is different from the basic process control system that
controls the plant and needs to be treated differently.
A Safety Instrumented System looks at the integrity of a safety loop from pipe to pipe. It has three
major elements: 1) sensors, 2) logic solvers and 3) final elements. The sensors look for the initiating
event that could cause the hazard. The logic solver decides how to deal with the hazard and then
sends a signal to the final element(s). When triggered, the final element brings the process to a
safe state. Each of these elements, and how they relate, needs to be considered when assessing
the risk.
A Safety Instrumented System may implement one or more Safety Instrumented Functions (SIF),
each of which is designed and implemented to address a specific process hazard or hazardous event.
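To make the sensor, logic solver and final element chain concrete, here is a minimal sketch in Python of a single high-pressure trip. Everything in it is hypothetical: the class name, the 12 bar trip point, the assumption that the safe state is a closed valve, and the choice to latch the trip until it is reset. A real SIF runs on certified hardware, not in a script like this.

```python
from dataclasses import dataclass


@dataclass
class HighPressureSIF:
    """Hypothetical SIF: close a shutdown valve when vessel pressure is too high."""

    trip_point_bar: float
    tripped: bool = False  # latched: stays True until deliberately reset

    def sense(self, transmitter_reading_bar: float) -> float:
        # Sensor: stands in for a dedicated safety pressure transmitter.
        return transmitter_reading_bar

    def solve(self, pressure_bar: float) -> bool:
        # Logic solver: compare against the trip point and latch the trip.
        if pressure_bar >= self.trip_point_bar:
            self.tripped = True
        return self.tripped

    def final_element_command(self) -> str:
        # Final element: drive the valve to its (assumed) safe position.
        return "CLOSE" if self.tripped else "OPEN"

    def scan(self, transmitter_reading_bar: float) -> str:
        # One pass through the loop: sense, decide, act.
        self.solve(self.sense(transmitter_reading_bar))
        return self.final_element_command()


sif = HighPressureSIF(trip_point_bar=12.0)
commands = [sif.scan(pressure) for pressure in (8.0, 11.5, 12.3, 9.0)]
print(commands)
```

Note that the valve stays closed on the last scan even though the pressure has dropped back to 9 bar: in this sketch the trip is latched, so the process stays in its safe state until someone resets it.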
A Basic Process Control System is connected to the sensors and actuators and uses set point
control to control the flow of material through the process. For example, consider a set point control loop
consisting of a pressure transmitter, controller, and control valve. The pressure in this vessel must
be controlled. This is done by sensing the pressure with a pressure transmitter (PT101).
Pressure measurements are sent to the controller (PIC101). When the pressure reaches a certain
point, the controller instructs the valve (PV101) to open or close until the pressure reaches the
desired set point.
If any of these elements were to fail – the pressure transmitter, controller, or control valve, or for
example, the controller was put into manual or left in manual control, there would be no way to
control the pressure in the vessel.
The Safety Instrumented System shall be completely independent from the BPCS.
It shall use different instruments, different controllers, and different end devices.
Its only role is safety and protection.
Often an additional protection layer is added, such as a mechanical pressure relief valve,
adding yet more safety and reducing either the likelihood of something bad happening or the
consequence of the hazardous event.
The SIS must be completely independent from the BPCS. It must use different instruments,
different controllers, and different end devices so that a failure of BPCS devices cannot
result in both a demand on the SIS and a dangerous failure of the SIS. No process control is
performed in the SIS; its only role is safety and protection.
Safety PLCs are designed specifically for safety applications with very specific and
quantifiable known failure modes.
The fundamental difference is a standard PLC can fail in ways that are not predictable, or at least,
not predictable with any certainty that can be designed around. Usually, no-one knows that there
is a problem until the problem occurs, by which time, it may be too late.
Safety programmable logic controllers, on the other hand, are operating with very specific and
quantifiable known failure modes. They are designed in such a way that when they fail, they do
so with a level of certainty and within a specific probability that corresponds to safety integrity
levels. This is achieved in a variety of ways, including a high degree of internal
diagnostics and automatic testing, hardware backup and redundancy, and voting of
independent signals within the PLC. In short, they are designed so that you can expect the
equipment to work when it is needed to work.
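Voting of independent signals is easier to picture with a small example. The sketch below, in Python, shows a generic "M out of N" vote: the function name `vote` and the three-channel setup are my own illustration, not any particular vendor's implementation.

```python
def vote(trip_requests, required):
    """Return True when at least `required` of the channels request a trip."""
    return sum(1 for request in trip_requests if request) >= required


# Three hypothetical channels, each reporting whether it wants to trip.
one_channel_faulty = [True, False, False]   # a single spurious trip request
two_channels_agree = [True, True, False]    # a genuine demand seen by two channels

# Two-out-of-three (2oo3) voting: one bad channel cannot trip the plant on its own.
print(vote(one_channel_faulty, required=2))
print(vote(two_channels_agree, required=2))
```

With two-out-of-three voting, a single faulty channel neither trips the plant spuriously nor prevents a genuine trip, which is one reason the arrangement is popular.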
Also, safety PLCs are certified by a third party to international standards such as IEC 61508,
IEC 61511 and ISA SP84, using rigorous test procedures to confirm that they have been designed
and manufactured in a way that reduces the effect of systematic failures during this process.
The most recognised third-party certification organization for safety PLCs is the German TÜV,
although there are others in the market that also certify safety-related equipment. And when you
are controlling the destiny and health of your employees, your property, and
the environment, you want to be sure that your equipment has been tested and verified by
someone other than the person or vendor who sold the equipment to you!
While most people are familiar with emergency shutdown systems (ESD) and fire and gas (F&G)
detection and suppression systems, many don’t appreciate that burner management (BMS),
turbomachinery protection (TMC) or high integrity pressure protection (HIPPS) systems are
considered SIS.
• Emergency shutdown system - detect potentially hazardous conditions, shut down the plant and
help prevent unsafe incidents
• Fire and gas detection - detect abnormal situations such as fire or combustible/toxic gas and
provide early alerts and mitigation
• Burner management system - enable safe operation and protection of burners, boilers, furnaces,
and heaters.
• High Integrity Pressure Protection - help prevent over pressurization of a plant or pipeline
• Turbo Machinery Control and Protection - make your rotating machines safe and efficient
In the next chapter, we will look at safety instrumented functions, safety integrity levels, and
probability of failure on demand.
Informally, a Safety Instrumented Function (SIF) is a function that doesn’t require operator intervention.
It is one loop within the SIS consisting of sensors, logic solver, and final control elements that act
together to detect a hazard and bring the process to a safe state. In simple terms, it is the safety
equivalent of a control loop!
A SIF senses a potential hazard, applies logic, and triggers an action to bring the plant to a safe
state. It is event driven. A Safety Instrumented System utilizes several SIFs working together as a
whole to protect the entire plant.
None of the above include an action, associated with a final element, that automatically brings the
plant to a safe state, hence they are not considered SIFs. They may be part of the overall risk reduction,
for example, partial credit may be taken for the operator alarm, but this is not a SIF.
These questions are answered through the assignment of target Safety Integrity Levels, or SIL.
A SIL is “the Safety Integrity Level (SIL) of a specific Safety Instrumented Function (SIF)
which is being implemented by a Safety Instrumented System (SIS).”
Or put simply, Safety Integrity Levels are measures of the amount of risk reduction that the SIF should
provide against a specific hazardous event in relation to the corporate tolerable risk levels.
Safety Integrity Levels are measures of the safety risk given to a specific process.
The higher the Safety Integrity Level, the greater the impact or
consequence of a failure, and the lower the failure rate that
is acceptable.
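$$\mathrm{PFD} = \frac{\lambda_{du} \cdot TI}{2}$$
where PFD is the Probability of Failure on Demand, λdu is the dangerous undetected failure rate, and TI is the proof test interval.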
The test interval is a significant contributor to the overall PFD. It is the period during which you
have some uncertainty about whether your Safety Instrumented System is as good as it was on the first
day that you tested it.
If you think about a SIL as a probability, then on day one it’s very safe, but over time, the sensing
elements could degrade, pipes could corrode, or the valve might start to wear. The longer the time
between tests of the SIF, the more uncertain you are about its ability to perform correctly when it’s
vital that it operates on demand.
Without regular testing, you cannot verify that your Safety Instrumented System still works the way
you designed it to work. Every time you perform a proof test, it in theory brings the PFD back to
where it was on day one.
Note: Some regulators do not allow you to consider a proof test to take you back to day one in the
true sense of the word, so this is simplified for explanation purposes.
This is illustrated in the saw tooth curve – the average probability of failure on demand is the
average of the saw tooth curve. You need to adjust the test interval to ensure that the dotted line
stays inside the desired safety integrity target or band.
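The saw tooth can be reproduced numerically. The sketch below, in Python, makes two assumptions that are mine rather than the guide's: the failure rate is constant, so the PFD grows as 1 − exp(−λt) between tests, and every proof test is perfect, returning the PFD to zero. The failure rate of 2e-6 per hour is a made-up value for illustration.

```python
import math


def pfd_at(t_hours, lambda_du, test_interval_hours):
    """Instantaneous PFD, assuming each proof test restores the SIF to as-new."""
    time_since_last_test = t_hours % test_interval_hours
    return 1.0 - math.exp(-lambda_du * time_since_last_test)


def pfd_avg_simplified(lambda_du, test_interval_hours):
    """The simplified formula: PFD = lambda_du * TI / 2."""
    return lambda_du * test_interval_hours / 2.0


def pfd_avg_numeric(lambda_du, test_interval_hours, steps=10_000):
    """Average height of one tooth of the saw tooth, using the midpoint rule."""
    dt = test_interval_hours / steps
    total = sum(
        pfd_at((i + 0.5) * dt, lambda_du, test_interval_hours) for i in range(steps)
    )
    return total / steps


LAMBDA_DU = 2e-6        # hypothetical dangerous undetected failure rate, per hour
TEST_INTERVAL = 8760.0  # one year between proof tests, in hours

print(f"Simplified formula: {pfd_avg_simplified(LAMBDA_DU, TEST_INTERVAL):.6f}")
print(f"Numeric average:    {pfd_avg_numeric(LAMBDA_DU, TEST_INTERVAL):.6f}")
```

For small values of λ × TI the two results agree closely, with the simplified formula coming out slightly on the high, and therefore conservative, side.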
If the test interval is extended, the performance is significantly affected and the average PFD
increases. Likewise, if you shorten the test interval, you are testing more often and your PFD
decreases; however, your maintenance costs also go up, and the probability of human error
(systematic failures) could increase.
The point is that safety integrity levels are connected to specific proof testing intervals which form
part of your SIF design. Varying the proof test interval will change PFD.
For example, a piece of equipment can achieve Safety Integrity Level 3 (SIL 3) operation with
a proof test interval of 12 months. If, however, it is proof tested only every 48 months due to
operational restrictions, it is no longer operating as designed and, if you run the calculation,
may not be providing a PFD that meets the SIL 3 target.
A Safety Integrity Level is a way to indicate the tolerable failure rate of a particular
safety function
Let’s take a closer look at the order of magnitude between the different safety integrity levels
(1 being the lowest, 4 being the highest) for SIF low demand mode of operation:
A Safety Integrity Level indicates the probability of failure on demand: not how often the function will
fail, but how likely it is to fail at the time it needs to operate.
This can be expressed in terms of safety availability, or how much of the time the system will work
when you need it to work. For example, at the lowest Safety Integrity Level, if the system works
when you need it to work between 90% and 99% of the time, that’s good enough. Every step up is
an order of magnitude of safety.
One way to think of Safety Integrity Levels is to consider how much risk reduction you are
applying to the Safety Instrumented Function. Say your inherent risk is “x” and in order to get to
the acceptable risk, you need to reduce your risk somewhere between 10 and 100 times. Safety
Integrity Level 1 is all you need to apply.
What the Safety Integrity Level concept asks you to do is to assess your risk for a specific Safety
Instrumented Function, then determine the appropriate level of risk mitigation you require, which is
equivalent to the required Safety Integrity Level of the Safety Instrumented Function.
In the next chapter we will look at risk, what is tolerable, and how to reduce it to an acceptable
level.
One of the simple ways to think about risk is “What is the likelihood of an undesired event occurring
within a specific period or given circumstances?”
One of the simplest ways that we can do this is to use a risk matrix that shows the correlation
between the likelihood of something happening and the consequence if it happens.
Note: while the concept of the risk matrix is the same, I have seen many different sizes of risk matrix
used, such as 3x3, 4x4, 5x5, 5x6, 6x6, even 6x8. I have seen the risk matrix inverted, so the
highest risks are in the bottom left-hand corner, not the top right. I have seen risk matrices used for
people safety on the plant, people safety off the plant, commercial risk, public image as well as
environmental risk.
But the basic principle remains the same. If you have a low likelihood of an undesired event
occurring and the consequence is minor, that’s a relatively low risk.
However, if you have a high likelihood of an undesired event occurring and the consequence
is still minor, it might still be considered high risk…because if it happens all the time,
it’s interrupting production.
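A risk matrix is really just a lookup table. The sketch below, in Python, uses a hypothetical 3x3 matrix; the category names and the rating in each cell are invented for illustration, and every operator will define their own.

```python
# Hypothetical 3x3 risk matrix: (likelihood, consequence) -> risk rating.
# The ratings are illustrative only; real matrices are set by each operator.
RISK_MATRIX = {
    ("low", "minor"): "low",
    ("low", "serious"): "medium",
    ("low", "major"): "high",
    ("medium", "minor"): "medium",
    ("medium", "serious"): "high",
    ("medium", "major"): "high",
    ("high", "minor"): "high",    # frequent minor events still disrupt production
    ("high", "serious"): "high",
    ("high", "major"): "high",
}


def risk_rating(likelihood, consequence):
    """Look up the risk rating for a likelihood and consequence pair."""
    return RISK_MATRIX[(likelihood, consequence)]


print(risk_rating("low", "minor"))
print(risk_rating("high", "minor"))
```

The two lookups mirror the two cases described above: a rare minor event rates low, while a frequent minor event can still rate high in this example matrix.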
How do we achieve the right balance of too little or too much risk?
Many operators immediately associate risk with injury or death to personnel. But when analysing
risk, these operators may not have thought about some of the ongoing or consequential damaging
effects of taking too much risk, including:
For example, the sustainability impact on the environment and the consequential clean-up cost is a
prevalent consideration among regulatory authorities.
Consider the loss of containment at the Macondo oil well, owned by BP, in the Gulf of Mexico in April
2010, and the consequential environmental damage. The operator’s revenue was interrupted.
And the incident certainly carried a huge legal liability. Their corporate image took a staggering hit,
the stock price was significantly impacted, and the final settlement bill ran into billions of dollars.
These are all factors that need to be considered when identifying risk, but which are not always
easy to quantify.
So, when considering the consequence, don’t just think about the safety consequence; consider
the people, commercial and environmental consequences.
The point is, you must think about the big picture when considering risk. All these factors can
weight the impact of the consequence you are trying to mitigate.
Some countries have mandated tolerable risk limits. But in many regions, for example North
America, the liability is on the operator to determine what their tolerable risk is, provided they meet
OSHA requirements as a minimum.
A common concept when determining tolerable risk is ALARP or “As Low as Reasonably
Practicable”. ALARP is one of the fundamental principles of risk management. We neither need nor
want to manage risk to the point where we eliminate it, because beyond the point where the costs
exceed the benefits, doing so is simply not a good use of resources.
Assessing tolerable risk will be different based on the operators’ and/or shareholders’
expectations, as well as where the plant is built and operated. Building a plant in the middle of the
desert is very different from building the same plant in the middle of a city surrounded by a dense
population. Same plant, same process risk, same process. But the tolerable risk to the operator is
completely different.
There are multiple layers of protection being deployed to reduce the overall risk to an acceptable
level as no one layer on its own provides enough risk reduction to get to the target acceptable level:
A key point to note and remember is that a Safety Instrumented System does not reduce the
consequence; it can only reduce the likelihood of an incident or event happening.
The difference between the inherent risk (the unmitigated risk) and the tolerable risk is how much
risk reduction you apply and the corresponding safety integrity levels that are assigned.
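As a rough illustration of that gap, the sketch below in Python divides an unmitigated event frequency by a tolerable frequency, then credits the risk reduction already claimed for other protection layers. All of the numbers are hypothetical, the function name is my own, and a real assessment would follow your company's own risk analysis procedure.

```python
import math


def required_sif_rrf(unmitigated_per_year, tolerable_per_year, other_layer_rrfs):
    """Risk reduction factor left for the SIF after crediting the other layers."""
    total_rrf_needed = unmitigated_per_year / tolerable_per_year
    rrf_already_credited = math.prod(other_layer_rrfs)  # 1 when the list is empty
    return total_rrf_needed / rrf_already_credited


# Hypothetical figures, for illustration only.
unmitigated = 0.05   # hazardous events per year with no protection layers
tolerable = 1e-4     # corporate tolerable frequency, events per year
other_layers = [10]  # e.g. credit claimed for one other independent layer

sif_rrf = required_sif_rrf(unmitigated, tolerable, other_layers)
print(f"Risk reduction required from the SIF: about {sif_rrf:.0f}x")
```

With these made-up numbers the SIF needs to reduce the likelihood by a factor of about 50, which falls inside the 10 to 100 range described earlier for Safety Integrity Level 1.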
A Safety Instrumented System reduces the likelihood of an inherent process risk, and the
standards obligate us to go through this systematic process for every identified hazard that we feel
we can address with passive or active risk reduction methods.
Turn the page for the next chapter, where we will look at some of the considerations when selecting
the safety instrumented system including architecture, voting and testing.
Omissions in the design of a Safety Instrumented Function could remain undiscovered until an incident
occurs. This possibility makes the conceptual design of a Safety Instrumented System especially
challenging.
The conceptual design comprises the selection of appropriate technology (sensors, SIS, and final
elements). There has never been more technology to choose from, so it is extremely important to be
clear on what you are asking for, and why:
a. Remember that proof test interval discussed earlier? The technology may be lower CAPEX, but
if you must test it every year to keep it at SIL3, then the OPEX costs escalate significantly. What
appears at first to be a cost-effective choice, may have a total lifecycle expenditure (TOTEX) that
far exceeds anticipated costs!
As always, the devil is in the detail. Something may have a certificate and be certified as SIL3, but
it’s important to understand what is behind the certificate. Every reputable SIS manufacturer will
have a safety manual to accompany the system. If you read nothing else, read this document, and
become familiar with what you are buying, what you are getting or, more importantly, not getting, and what
functionality is built into the system (e.g., diagnostics) versus what you, the owner /
operator, will have to do.
Specifying just the Safety Integrity Level or Probability of Failure on Demand values is not enough;
understanding how the equipment is going to be used, applied, tested, operated and
maintained in real life is crucial.
Proper application of today’s smart technologies can significantly benefit the operator by reducing
interventions in both duration and frequency.
It is necessary to think not only about bringing the process to a safe state, but also about a cost-effective
architecture with field devices and logic solvers that can tolerate a non-demand failure of one of its
elements without losing production.
You should also consider redundancy, and levels of fault tolerance, to maximize production uptime,
avoid spurious trips, and ease maintenance activities, not just for field instruments, but also for the SIS
itself. The more fault tolerant the SIS, the less likely you are to have a costly nuisance or spurious
trip. Investing a little more in a highly fault tolerant SIS pays long-term dividends! The small
additional cost is far outweighed by the cost of an hour of lost production revenue.
You need to identify any common cause failures that could defeat the safety function even if you’ve
applied redundancy. For example, suppose the sensing element for a particular safety
protection loop is a pressure transmitter in the field. You may decide to use two pressure transmitters
so that if one of them fails spuriously, that is, without a demand, you won’t trip your plant.
The problem is that if they are both connected to the same impulse line tubing, a blockage in that tube
could prevent both transmitters from sensing the hazard.
This is known as a common cause failure, and it has the potential to either trip your plant spuriously or,
worse, completely defeat the safety function.
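One common way to put a number on this effect is the beta-factor model. The sketch below uses widely quoted simplified approximations for the average Probability of Failure on Demand (PFDavg). The formulas, the failure rate, and the beta values are assumptions for illustration only, not figures from this guide or from any safety manual.

```python
HOURS_PER_YEAR = 8760


def pfd_avg_1oo1(lambda_du, ti_hours):
    """Simplified PFDavg for a single device.

    lambda_du: dangerous undetected failure rate (per hour)
    ti_hours:  proof test interval (hours)
    """
    return lambda_du * ti_hours / 2


def pfd_avg_1oo2(lambda_du, ti_hours, beta):
    """Simplified PFDavg for two redundant devices (1oo2) with common cause.

    beta is the assumed fraction of failures that hit both devices at once,
    for example a blockage in a shared impulse line.
    """
    independent = ((1 - beta) * lambda_du * ti_hours) ** 2 / 3
    common_cause = beta * lambda_du * ti_hours / 2
    return independent + common_cause


lambda_du = 2e-6          # hypothetical failure rate, per hour
ti = 1 * HOURS_PER_YEAR   # hypothetical one-year proof test interval

for beta in (0.0, 0.02, 0.10):
    print(f"beta={beta:.2f}  PFDavg(1oo2)={pfd_avg_1oo2(lambda_du, ti, beta):.2e}")
print(f"single device PFDavg(1oo1)={pfd_avg_1oo1(lambda_du, ti):.2e}")
```

Under these assumptions, even a modest beta means the common cause term dominates the result, which is why separate process connections for redundant transmitters matter so much.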
Many plants have scheduled maintenance turnarounds every 5-7 years. But some systems need to be
tested more often than that, outside of the turnaround window, to meet their Safety Integrity Levels.
How are these tests going to be conducted without interrupting operations? Are you going to do a full
test or use some internal diagnostics to report on system integrity? Procedures and resources to inspect,
monitor, audit and report on system integrity enable you to prove the suitability of the safety system at any
point in its operational life.
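To see why the proof test interval matters so much, here is a rough sketch that uses the common single-device approximation PFDavg ≈ λDU × TI / 2 together with the low-demand SIL bands. The failure rate is hypothetical, and a real verification covers the whole SIF and also has to consider architectural constraints and systematic capability, so treat this as an illustration only.

```python
HOURS_PER_YEAR = 8760

# Low-demand mode PFDavg bands: (SIL, lower bound inclusive, upper bound exclusive)
SIL_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def pfd_avg_1oo1(lambda_du_per_hour, proof_test_interval_years):
    """Simplified PFDavg of a single device: lambda_DU * TI / 2."""
    ti_hours = proof_test_interval_years * HOURS_PER_YEAR
    return lambda_du_per_hour * ti_hours / 2


def sil_band(pfd_avg):
    """Return the SIL band a PFDavg falls into, or None if outside all bands."""
    for sil, low, high in SIL_BANDS:
        if low <= pfd_avg < high:
            return sil
    return None


lambda_du = 2e-7  # hypothetical dangerous undetected failure rate, per hour

for interval_years in (1, 5, 7):
    pfd = pfd_avg_1oo1(lambda_du, interval_years)
    print(f"TI = {interval_years} yr  PFDavg = {pfd:.2e}  band = SIL{sil_band(pfd)}")
```

With this assumed failure rate, the device sits in the SIL3 band only when tested yearly. Stretch the test out to a 5 or 7 year turnaround and it drops into the SIL2 band.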
In summary, specifying just the Safety Integrity Level or Probability of Failure on Demand values is not
enough; understanding how the equipment is going to be used, applied, tested, operated, and maintained in real
life is crucial. You must be very specific to ensure that you get what you need for the specific asset.
In the next chapter, we will look at practical examples of SIL1 and SIL2 safety instrumented functions and
the mean time to fail spurious (MTTFS).
This is a pressure vessel (V101). The instruments or devices in blue are elements of the basic
process control system that is maintaining the level in the vessel. If one of these elements fails,
it may not be possible to control the level of the vessel.
For example, if the level transmitter (LT101) failed dangerously, that is, it stopped responding to the
vessel level and reported the same level all the time, the controller (LIC101) would continue to think
that the level was constant, even if it was rising. If the level rises too much, product could come
out of the top of the vessel (known as a loss of containment), which could ignite and cause
an explosion.
On the other hand, the Safety Instrumented System (shown in red) measures the same level
in the vessel through an independent level transmitter (LT102), and acts through an independent
logic solver and independent valve (XV101).
Thus, if one of the basic process control system elements fails, the Safety Instrumented System
can still sense a high level and protect this vessel from getting into a hazardous state. If the Safety
Integrity Level for this protective function was determined as Safety Integrity Level 1 (SIL1), then
this design is sufficient.
[Note: Mean Time to Fail Spurious = The mean time until a failure of the system causes a spurious process trip]
But what if LT102 is not the most reliable in the world? Say it fails over time due to environmental
conditions, corrosive process fluids, etc. In the previous example, Safety Integrity Level 1 is still
met because the safety function will still act if LT102 fails spuriously.
But that failure will trigger the valve (XV101) and shut the process down, potentially costing millions of
dollars in lost production just because this $1,000 device failed in a spurious way.
One way to increase the availability of this protection loop, but not compromise safety, is to use
two measuring devices to look at the same level in the vessel.
If one of them fails spuriously, that is, fails when there is no demand, you still receive the
healthy signal from the second device. Both must agree at the same time (two out of two voting,
or 2oo2) that there is a process condition issue before they trigger the protection loop. Adding
the additional device has reduced the potential for this unit to be shut down by a spurious trip,
and so increased its availability.
Note that the safety integrity target can still be met. The SIF can still achieve Safety Integrity
Level 1, but with increased process availability, because a spurious failure of just one of the level
transmitters will not trip the loop.
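As a rough feel for the size of the availability gain, the sketch below compares the Mean Time to Fail Spurious of a single transmitter (1oo1) with that of a 2oo2 pair. It uses simplified approximations that ignore common cause failures and diagnostics. The spurious failure rate and repair time are hypothetical values of my own, not data for any real device.

```python
HOURS_PER_YEAR = 8760


def mttfs_1oo1_years(lambda_s):
    """MTTFS for one device: any spurious failure trips the loop.

    lambda_s: spurious (safe) failure rate, per hour
    """
    return 1 / lambda_s / HOURS_PER_YEAR


def mttfs_2oo2_years(lambda_s, mttr_hours):
    """Approximate MTTFS for 2oo2 voting.

    A trip needs the second device to fail spuriously while the first
    is still awaiting repair, so the trip rate is about
    2 * lambda_s**2 * MTTR (simplified, no common cause).
    """
    spurious_trip_rate = 2 * lambda_s ** 2 * mttr_hours
    return 1 / spurious_trip_rate / HOURS_PER_YEAR


lambda_s = 1e-5   # hypothetical spurious failure rate, per hour
mttr = 8          # hypothetical mean time to repair, hours

print(f"1oo1 MTTFS: {mttfs_1oo1_years(lambda_s):.1f} years")
print(f"2oo2 MTTFS: {mttfs_2oo2_years(lambda_s, mttr):.0f} years")
```

The 2oo2 figure only holds if a failed transmitter is detected and repaired promptly. Leave it failed and the loop is effectively back to a single device.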
Important: the simplified examples do not negate the need for following the systematic approach
to identify hazards, risks, consequences, likelihood, target levels, risk reduction methods etc.
This architecture uses two level devices in the safety loop to measure the level in the vessel
(LT102 and LT103). Only one of them must detect a hazardous condition (one out of two, or 1oo2)
to shut down the system using valves XV101 and XV102.
Once again, because only one of these devices needs to vote for a hazardous situation, a single
device can trip the production unit and take it to a safe state. This design is much safer than the
previous design, but still, if a $1,000 device goes bad, it will stop production.
To increase the overall availability of this production unit and maintain safety, you could typically
use three devices measuring the same level (LT102, LT103, LT104).
During normal operation, all three level transmitters monitor the level, and are voted such that
two out of three (2oo3) must agree that there is a demand before the safety function acts.
This allows one of these devices to go bad without tripping the entire plant, and still provides the
required level of safety.
Another benefit is that this allows online maintenance and testing of the level transmitters without
the need to put the loop in manual control and without any loss of protection, because the voting
logic continues to operate on the remaining transmitters.
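The voting arrangements described above can be compared side by side with a few lines of Python. This is only a sketch of the voting logic itself, with function names of my own choosing. It looks at how each architecture reacts to one failed transmitter and says nothing about failure rates or SIL verification.

```python
def vote(trip_signals, m):
    """MooN voting: trip when at least m transmitters demand a trip."""
    return sum(trip_signals) >= m


# (M, N) for each architecture discussed in the text
ARCHITECTURES = {"1oo1": (1, 1), "2oo2": (2, 2), "1oo2": (1, 2), "2oo3": (2, 3)}


def single_failure_behaviour(m, n):
    """Show how an MooN loop reacts to one failed transmitter."""
    # Process is healthy, but one transmitter fails spuriously (reads "trip").
    spurious = [True] + [False] * (n - 1)
    # Genuine high level, but one transmitter fails dangerously (stuck at "no trip").
    dangerous = [False] + [True] * (n - 1)
    return {
        "spurious_trip": vote(spurious, m),
        "trips_on_demand": vote(dangerous, m),
    }


for name, (m, n) in ARCHITECTURES.items():
    result = single_failure_behaviour(m, n)
    print(f"{name}: spurious trip on one bad device = {result['spurious_trip']}, "
          f"still trips on demand with one dead device = {result['trips_on_demand']}")
```

Only 2oo3 tolerates both kinds of single failure. 1oo2 tolerates a dangerous failure but not a spurious one, and 2oo2 is the reverse, which is why the choice of architecture depends on both the safety target and the availability target.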
The point is, when designing a system, you need to consider both the SAFETY of the plant and
the AVAILABILITY of the plant. After all, the safest plant is one that never starts up and operates!
In summary:
• Device hardware failures are not the primary cause of previous major industry accidents.
• IEC61511 / ISA S84.01 are standards for the process industries.
• They are performance-based standards and address the entire safety lifecycle.
• They are not always mandatory but are considered best engineering practice by industry and regulators.
• A Safety Instrumented System is an independent layer of protection and must be independent and
separate from the Basic Process Control System. They should not normally “share” the same field devices.
• SIS PLCs are different from traditional PLCs and independently certified for use in safety related
applications.
• A Safety Instrumented Function is one loop (a safety loop) executed within a Safety Instrumented
System (SIS). It includes sensors, logic solver and final control elements that work to detect a hazard
and bring the process to a safe state.
• A Safety Integrity Level (SIL) indicates the tolerable failure rate of a particular Safety Instrumented Function.
• Risk is a product of likelihood and consequence. A Safety Instrumented System is designed to reduce
the likelihood of an incident or event; it does not reduce the consequence.
• And finally, the device testing, voting architecture and plant availability targets must all be considered
in designing a Safety Instrumented System. Is it safe? Is it safe with enough availability to allow for
continuous production?
Like everything, the devil is always in the detail. The good news is that there are plenty of excellent
resources for more information, including the following sources:
Kenny Chua
Triconex Safety and
Critical Control
Schneider Electric
50 Kallang Avenue,
Singapore 339505
©2024 Schneider Electric. All Rights Reserved.
Schneider Electric | Life Is On is a trademark and the property of Schneider Electric SE, its subsidiaries, and affiliated companies.
998-23302905