
Don’t keep safety a secret

An introductory guide to functional safety

se.com
Contents
Don’t keep safety a secret – an introductory guide to functional safety

Functional safety fundamentals

Why we need functional safety

Following a systematic lifecycle approach is important

Putting the standards into practice

So what exactly is a Safety Instrumented System?

What is the difference between a Safety Instrumented Function (SIF) and a Safety Integrity Level (SIL)?

What is risk? What is tolerable? What is acceptable?

What do I need to consider when selecting the SIS?

Simplified SIL 1 and 2 practical examples

Summary

About the author

2 Don’t keep safety a secret – an introductory guide to functional safety


Don’t keep safety a secret
– an introductory guide
to functional safety
Managing operational risk is one of the most demanding aspects of achieving safe,
reliable, and profitable operations for any company working in high-hazard industries.

Getting safety wrong in high hazard industries is not an option. Nothing is more important than
safety because incidents:

• Cost money
• Disrupt reliable operations
• Impact the environment
• Can result in massive damage
• May cause loss of life
• Threaten an organization’s reputation and very existence.

And yet, a worrying trend is happening. Safety incidents are rising despite a decrease in the total
number of working hours¹.

After 15 years working in high hazard industries delivering safety system projects, I am now
considered a ‘mid-career professional’. Where I was once the ‘newbie’, there is now a new
generation of ‘early-career professionals’ on the same journey, finding their way and asking many of
the same questions I asked when I started.

While a lot has changed in 15 years (fantastic technology advances, the rise of digital software
tools and applications, cyber security concerns), the functional safety basics remain the same.

¹ Source: International Association of Oil & Gas Producers (IOGP), ‘Safety performance indicators – 2022 data’.

Life is On | Schneider Electric 3


When I started my career, I was lucky. There were plenty of experienced experts to
learn from. They would take the time to sit down with me and patiently explain why, what, how. But
most of those experienced people have retired, and a new generation is now looking to me for
my expertise, asking me the same questions that I asked when I first started.

What I thought I would do was share some of my learnings and experiences for a whole new
generation. While the functional safety domain is “an inch wide and a mile deep”, and the devil is
in the detail, I would encourage anyone to add comments, share experiences, and help others to
learn for the greater good.

In this document we will explore topics such as:

• Functional safety fundamentals, standards and why they are important
• Why functional safety is needed
• How following a systematic lifecycle approach is vital
• How we can put the safety standards into practice to build defence in depth through
independent protection layers
• What a safety instrumented system is and how it differs from the control system, how a safety
PLC is different from a traditional PLC, and what safety PLCs are used for
• The difference between SIS, SIF and SIL
• How we balance risk and achieve a tolerable risk level
• Considerations for selecting the safety system, including field instruments, system architecture,
voting, proof tests and testing
• Some simplified SIL 1 and SIL 2 architectures

In the next chapter, we will look at the evolution of the functional safety standards including
IEC61508 and IEC61511.



Functional safety fundamentals
To keep high hazard operations safe for an entire lifetime, operators rely heavily on functional
safety management and practices to ensure compliance to regulatory standards, meet legal
requirements, protect the environment, protect people, and protect their capital assets.

Functional safety is fundamental to enabling the safe, reliable, and compliant operations of today’s
process plants. A key component is the complex technology used for safety related automation
systems. More important still is the implementation of a safety lifecycle approach that captures all
aspects of the design, implementation, operation, and maintenance of those plants.

The following chapter introduces a very complex subject and is intended to be an overview for
the various practices, standards, concepts, and implementation considerations involved with
functional safety management.

We will examine:

• the evolution of today’s safety standards, the differences among them, and their requirements
for compliance.
• the elements that make up a contemporary Safety Instrumented System and how that differs
from a basic process control system.
• some common examples that incorporate Safety Instrumented Systems into safe
process design.

Who doesn’t love an acronym or two?

Like all engineering domains, we love nothing more than Three Letter Acronyms (TLAs), so don’t
be surprised if you come across a few throughout the series, including:

• SIS – Safety Instrumented System
• SIF – Safety Instrumented Function
• SIL – Safety Integrity Level
• PFD – Probability of Failure on Demand
• PHA – Process Hazard Analysis
• LOPA – Layer of Protection Analysis
• SRS – Safety Requirement Specification
• PES – Programmable Electronic System
• BPCS – Basic Process Control System
• PLC – Programmable Logic Controller



Why are standards important?
Standards and guidelines about evaluation and certification of control instrumentation used in
safety-instrumented systems, or SIS, were first introduced in 1984.

Nothing is more important than safety to the process control industry. Standards continue
to evolve as the industry continues to learn and improve.

A key milestone was reached in 1996 when the International Society of Automation, or ISA,
introduced ISA SP84 - Safety Lifecycle, Quantitative Approach, a standard that documented the
steps necessary to properly specify, design and maintain a safety system.

IEC 61508 - Safety Lifecycle, Quantitative and Qualitative Approach, is an umbrella standard
released by the International Electrotechnical Commission in 1998. It covers the functional
safety of electrical, electronic and programmable electronic systems across all industry sectors.
Operators and developers of safety equipment use this document to design, manufacture and
successfully implement safety throughout the entire lifecycle of a system.

IEC 61511 is a sector-specific standard developed to address functional safety in the process
industry. Released in 2003, it applies to all components making up a SIS, including field sensors,
logic solvers and final elements such as valves, pumps and motors. In 2004, the standard was wholly
adopted in the USA as ISA/ANSI S84.01 (the one exception being an added grandfather clause for
existing systems). Since then, other industry-specific standards have been introduced, including
those for the nuclear, rail, machinery, and manufacturing industries.

Both IEC 61508 and IEC 61511 are continuing to evolve and adapt to industry needs based on
practical experience gained from global implementation. IEC 61508 Edition 2 was released in 2010
with several significant changes and improvements, removing ambiguity and offering guidance
for applicability. There are many working groups focused on potential areas of improvement, so
expect an updated version soon.

It is important to note that these standards are not prescriptive, but performance based.
They present guidelines for best practices, but they do not identify procedures for specific
implementation.

In the next chapter, we will look at why we need functional safety and its origins.



Why we need functional safety
In the last chapter I very briefly looked at the two major international safety standards, IEC 61508
and IEC 61511. So why were these standards created? Why was the concept of functional
safety developed?

Following the Piper Alpha disaster in 1988, the Health and Safety Executive (HSE) in the UK analysed
industrial incidents that caused injury or death to employees. The survey revealed that most
safety problems were caused by human error. Let’s take a closer look.

Failures caused by the design and implementation of Safety Instrumented Systems
accounted for only 15% of all failures analysed. So, this raises the question: if the safety systems
account for only 15% of the failures analysed, where are the rest coming from?

Significantly, the major source of failures was inadequate specification of the control
system (44%). This may have been due either to poor hazard analysis of the equipment under
control (EUC), or to inadequate assessment of the impact of failure modes of the control system on
the specification. Whatever the cause, situations that should have been identified are often missed
because a systematic approach has not been used.

Next, changes made to these systems after they were commissioned, changes that were
not analysed or documented, accounted for 20% of these failures. Perhaps these changes were
expedient in getting production underway, but they defeated the safety function by invalidating the
design and implementation.

Improper maintenance and operating procedures were responsible for another 15% of all failures.
If operators don’t service their equipment or do preventive maintenance, they can’t rely on their
equipment to protect them when it is needed.

And finally, installation and commissioning errors, for example, devices that were incorrectly
placed so they were unable to sense hazardous conditions, accounted for 6% of all failures.

In total, 85% of all failures had nothing to do with automation and control. They had to do with
the specification, installation, and operation of the equipment. They were caused by how the
equipment was used, not by the equipment itself.
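
As a quick check on the arithmetic, the shares quoted above can be tallied. The short Python sketch below uses only the percentages given in the text; the category labels and variable names are shorthand chosen for illustration.

```python
# Shares of failures by root cause, as quoted in the text above (percent).
failure_causes = {
    "specification": 44,
    "changes after commissioning": 20,
    "operation and maintenance": 15,
    "design and implementation": 15,
    "installation and commissioning": 6,
}

total = sum(failure_causes.values())

# Everything that is not down to the design and implementation of the system itself.
outside_design = total - failure_causes["design and implementation"]

print(f"All causes: {total}%")
print(f"Outside design and implementation: {outside_design}%")
```

The categories add up to 100%, and everything other than design and implementation accounts for the 85% figure.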

This realisation led to the concept of the functional safety lifecycle, where safety is considered from
project conception to plant decommissioning.

In the next chapter, we will look at how following a systematic lifecycle approach is important to
successfully implementing functional safety.



Following a systematic lifecycle
approach is important
Now that we understand some of the applicable standards, and the need for functional safety,
what do we do next?

Well, if you have a set of company standards or philosophies, go check if IEC61508 or IEC61511
are listed. If they are, then you need to understand what they mean – to YOU!

Let’s look at the IEC 61508 safety lifecycle. This general approach to functional safety specifies 16
distinct steps covering all activities required to manage safety throughout the entire Safety
Instrumented System lifecycle. They can be grouped into three broad phases:

Phase 1 – the analysis phase, when you analyse and identify the risk:

• What can go wrong?
• What is the likelihood?
• What is the consequence?
• What do I need to do to manage and reduce the risk to an acceptable level? (safety needs, target
levels, safety requirements specification, etc.)

Phase 2 – the realization phase, which encompasses the risk reduction system (e.g. the Safety
Instrumented System), including the system design, build, test, documentation, installation,
commissioning and site testing.



Phase 3 – the operation phase, during which you operate and maintain the systems and their safety
performance throughout the life of the plant, and make sure that all the assumptions used as the
basis for design in the analysis phase remain valid.

For example, suppose the safety requirement calls for proof testing of an element of the Safety
Instrumented Function every 12 months, but plant maintenance only tests it every 36 months.
The assumed safety integrity of that element is then invalid. Not only does this expose you to
increased safety risk, it also exposes you to compliance issues should an incident occur.

IEC 61511 and ISA 84.01 lifecycle


As we just mentioned, IEC 61508 is a generic standard intended to address all industries.
For example, manufacturers and suppliers of safety devices certify to standard IEC61508
because their products can be used in many different industries and applications.

But IEC 61511 is specifically tailored for those who design, integrate and use SIS. It focuses
attention on one type of automated safety system used within the process sector, the
safety-instrumented system.

IEC 61511 covers the design and management requirements for a Safety Instrumented System
from cradle to grave. There are also three basic phases in this lifecycle: analysis, realization and
operation. Their scope includes 11 sections ranging from initial concept, design, implementation,
operation and maintenance, through to decommissioning.

Throughout the safety supply chain, demonstrated compliance to these standards is a must-have
for best practice process safety management.

In the next chapter, we will look at how we can put the standards into practice, independent
protection layers and building defence in depth.



Putting the standards
into practice
Understanding how safety is quantified in IEC 61508 or IEC 61511 can be difficult for anyone new
to the concept.


As discussed in chapter 3, the safety lifecycle can be categorized into three broad areas.

The first is the analysis phase. Once you have a conceptual process design in hand, the first
step is to analyse the process’s hazards and risks to people, equipment, and the environment.
An assessment of these hazards and risks will determine what critical safety-instrumented
functions you need in relation to the corporate tolerable risk levels. These, in turn, will determine
what safety integrity levels to apply to the safety-critical functions. The conceptual design stage
includes applying the safety integrity levels to the safety functions and verifying that the design will
meet the safety needs.

What can go wrong? What are the consequences? What is the impact? What do we need
to do to reduce the likelihood? What systems do we need to put in place? How do we ensure
that the systems we rely on are operating as designed?

The second phase is realization, which focuses on design, specification, fabrication, procurement,
construction, installation, and commissioning.

The final phase is operation, which covers start-up, operation, maintenance, modification, and
eventual decommissioning.

The operation phase includes a very important ingredient – the management of change. Anytime
you decide to make a modification in this phase, you need to assess the impact of this change on
your system by analysing the effect throughout the entire safety lifecycle.

You want to be sure that your change will not affect any safety-critical loops or compromise the
safety integrity that you selected for that loop. When applied correctly, this feedback mechanism
ensures that you do not introduce human error or unforeseen hazards by making changes at the
end of the process.



Building defence in depth – safety and layers of protection
The next concept to introduce is layers of protection. Process designers use a variety of
protection layers, or safeguards, to create “defence in depth” against catastrophic accidents.
They are devices, systems, or actions that can prevent a hazard from transitioning to an
undesirable consequence.

Process designers use a variety of protection layers – or safeguards – to defend in depth against catastrophic accidents.

For example, the image above shows a plant that is being run with a basic process control system.
It could be anything from simple single-loop controllers to a very large-scale, complex system. Its
job is to keep the process under control and operating within the safe operating limits. For example,
the process could be a pressure vessel where chemicals are mixed to produce some sort of product.
The control system would govern how much product goes in, how much is converted, and how much
comes out.

If something upsets that process, for example, bad feedstock, a pressure spike, or something else
extraneous – the first layer of protection is the alarm. When the alarm is generated, an operator
can intervene in the process to get it back under control and back in production. The process
alarm and operator response are considered a layer of protection, an independent brick wall that
prevents a catastrophic situation.

The next layer of protection is generally a Safety Instrumented System or shutdown system that
senses that a process is getting so out of control that it is headed for a dangerous situation. With
no human intervention, it shuts down the process before it gets to a hazardous state.

If these systems don’t work or don’t work fast enough, the next layer of protection is physical
devices such as pressure relief valves or flare systems.

However, if the hazard remains uncontained, that is, the hazardous event has occurred, the next
layer of protection seeks to mitigate the result (as opposed to preventing it in the first place).
For example, in the event of a fire, a sprinkler system could mitigate the consequences.

Each protection layer is independent from the source of the hazard and from other protection
layers. Any one layer will perform its function, regardless of the action or failure of any other
protection layer.



Independent Protection Layers
Let’s examine layers of protection more closely. Where does a Safety Instrumented System fit
among these layers of protection?

The left-hand axis is divided into Prevention and Mitigation. Plant design is a protection layer in
itself: if you design your plant to be inherently safe (often referred to as inherently safe design),
that’s a layer of protection.

Next, the Basic Process Control System (BPCS), normally a Distributed Control System (DCS)
which is responsible for the normal operation of the plant, is used as a layer of protection against
unsafe conditions in many instances. Normally, if the BPCS fails to maintain control, alarms notify
operations that human intervention is needed to re-establish control of the process within the
specified limits.

If the operator is unsuccessful, having an independent Safety Instrumented System in
place can bring the process to a safe state and mitigate any hazards. All these protection layers
are designed to prevent hazards in the first place.

Should these protection layers fail, different layers of protection continue to try to minimise the
consequences of the hazard, such as relief valves, dikes, emergency response crews, and plant
and community evacuation procedures.

The key point here is that a Safety Instrumented System is an independent layer of protection and an
important part of hazard prevention. Up until recently, some plants tried to incorporate this safety
function inside the Basic Process Control System. The problem is this: if a failure of the Basic
Process Control System is the cause of the hazard, the Basic Process Control System can’t possibly
execute the safety action required to bring the plant to a safe state.

One is a demand on the other. The Safety Instrumented System must be independent from
the Basic Process Control System, especially if you are taking credit for the risk reduction in
your calculations.
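
The point about taking credit for risk reduction can be illustrated with a rough calculation in the style of a layer of protection analysis: the frequency of the hazardous event is the initiating event frequency multiplied by the probability of failure on demand (PFD) of each independent layer. The Python sketch below is illustrative only. Every number in it is a hypothetical placeholder, and the function and variable names are assumptions made for this example.

```python
from math import prod


def mitigated_frequency(initiating_per_year, layer_pfds):
    """Frequency of the hazardous event if every independent layer fails on demand."""
    return initiating_per_year * prod(layer_pfds)


# Hypothetical values for illustration only, not taken from the text.
initiating_per_year = 0.1  # e.g. a BPCS control loop failure once in 10 years
layers = {
    "operator response to alarm": 0.1,
    "safety instrumented system": 0.01,
    "pressure relief valve": 0.01,
}

independent = mitigated_frequency(initiating_per_year, layers.values())

# If the safety function sits inside the BPCS whose failure is the initiating
# event, it is not independent, so no credit can be taken for it (PFD = 1).
shared = {**layers, "safety instrumented system": 1.0}
not_independent = mitigated_frequency(initiating_per_year, shared.values())

print(f"With an independent SIS: {independent:.1e} per year")
print(f"Safety function inside the BPCS: {not_independent:.1e} per year")
```

With these placeholder numbers, losing the independence of the safety layer makes the hazardous event about 100 times more frequent, which is the whole of the risk reduction that layer was assumed to provide.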

In the next chapter, we will look at what makes up a Safety Instrumented System, how it is different
from the control system, the difference between safety and non-safety PLCs and what you would
use an SIS for.



So what exactly is a Safety
Instrumented System?
We’ve examined why we have standards and the independent layers of protection, so now let’s look at
what a Safety Instrumented System (SIS) is.

Formal definition: SIS – “instrumented system used to implement one or more Safety Instrumented
Functions (SIF). A SIS is composed of any combination of sensor(s), logic solver(s), and final
element(s)” (IEC 61511 / ISA 84.01)

Informal definition: “an instrumented Control System designed specifically for safety applications that
detects “out of control” process conditions and automatically returns the process to a safe state”.

The informal definition more easily conveys intent: “a Safety Instrumented System is the last line of
defence before a hazard occurs. It is a layer of protection designed to achieve or maintain a safe
state of the process when unacceptable process conditions are detected”.

Again, the Safety Instrumented System is different from the basic process control system that
controls the plant and needs to be treated differently.

A Safety Instrumented System looks at the integrity of a safety loop from pipe to pipe. It has three
major elements: 1) sensors, 2) logic solvers, and 3) final elements. The sensors look for the initiating
event that could cause the hazard. The logic solver decides how to deal with the hazard and then
sends a signal to the final element(s). When triggered, the final element brings the process to a
safe state. Each of these elements and how they relate needs to be considered when assessing
the risk.

A Safety Instrumented System may implement one or more Safety Instrumented Functions (SIF),
which are designed and implemented to address a specific process hazard or hazardous events.



How is the SIS different from the BPCS?
A Safety Instrumented System is different from the Basic Process Control System. But how? Let’s
look at this simple piping and instrumentation diagram (P&ID).

A Basic Process Control System is connected to the sensors and actuators and uses set point
control to control the flow of material through the process. Take, for example, a set point control
loop consisting of a pressure transmitter, controller, and control valve. The pressure in this vessel
must be controlled. This is done by sensing the pressure with a pressure transmitter (PT101).

Pressure measurements are sent to the controller (PIC101). When the pressure reaches a certain
point, the controller instructs the valve (PV101) to open or close until the pressure reaches the
desired set point.

If any of these elements were to fail (the pressure transmitter, controller, or control valve), or if,
for example, the controller was put into or left in manual control, there would be no way to
control the pressure in the vessel.

The Safety Instrumented System shall be completely independent from the BPCS.
It shall use different instruments, different controllers, and different end devices.
Its only role is safety and protection.

A Safety Instrumented System uses an independent pressure transmitter (PT102), an independent
isolation valve (UV102), and an independent controller (USC102) to manage the pressure. If the
pressure within the vessel exceeds the set point, the independent pressure transmitter sends the
message to the controller. The controller in turn sends the signal to the isolation valve, which will
stop the flow of product into the vessel. Even if all BPCS elements fail, a hazard is prevented by
not allowing any more product to enter the vessel, which should keep the pressure in the vessel
below the allowed limits.
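
The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how a real logic solver is programmed. The tag names PT102 and UV102 come from the example, while the trip point value, the class and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IsolationValve:
    """Final element of the safety function, e.g. UV102 in the example above."""
    tag: str
    is_open: bool = True

    def close(self) -> None:
        self.is_open = False


def sis_logic(pt102_bar: float, trip_point_bar: float, valve: IsolationValve) -> bool:
    """Close the isolation valve if the SIS transmitter reads above the trip point.

    Only the SIS transmitter (PT102) is read; nothing from the BPCS loop
    (PT101, PIC101, PV101) is used. Returns True if the function tripped.
    """
    if pt102_bar > trip_point_bar:
        valve.close()
        return True
    return False


TRIP_POINT_BAR = 10.0  # hypothetical trip point, not from the original text

uv102 = IsolationValve(tag="UV102")

tripped_normal = sis_logic(8.0, TRIP_POINT_BAR, uv102)  # normal operating pressure
open_after_normal = uv102.is_open

tripped_high = sis_logic(12.5, TRIP_POINT_BAR, uv102)  # overpressure
open_after_high = uv102.is_open

print(tripped_normal, open_after_normal)
print(tripped_high, open_after_high)
```

Note that the sketch has no inputs from, or outputs to, the BPCS loop. That separation is the design choice the text is making: the safety function still works if PT101, PIC101 or PV101 has failed.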

Often an additional protection layer would be added, such as a mechanical pressure relief valve,
adding yet more safety and reducing either the likelihood of something bad happening or the
consequence of the hazardous event.

The SIS must be completely independent from the BPCS. It must use different instruments,
different controllers, and different end devices so that a failure of BPCS devices cannot
result in both a demand on the SIS and a dangerous failure of the SIS. No process control is
performed in the SIS; its only role is safety and protection.



What is the difference between a Safety PLC and a Standard PLC?
A commonly asked question is “Why can’t I use a standard PLC to protect my plant? Why do I
need to pay more money for a safety PLC?”

Safety PLCs are designed specifically for safety applications with very specific and
quantifiable known failure modes.

The fundamental difference is a standard PLC can fail in ways that are not predictable, or at least,
not predictable with any certainty that can be designed around. Usually, no-one knows that there
is a problem until the problem occurs, by which time, it may be too late.

Safety programmable logic controllers, on the other hand, operate with very specific and
quantifiable known failure modes. They are designed in such a way that when they fail, they do
so with a level of certainty and within a specific probability that corresponds to safety integrity
levels. This is achieved in a variety of ways, including a high degree of internal
diagnostics and automatic testing, hardware backup and redundancy, and voting of
independent signals within the PLC. In short, they are designed so that you can expect the
equipment to work when it is needed to work.

Also, safety PLCs are certified by a third party to international standards such as IEC 61508,
IEC 61511 and ISA SP84 using rigorous test procedures to confirm that they’ve been designed
and manufactured in a way that reduces the effect of systematic failures.
The most recognised third-party certification organization for safety PLCs is the German TÜV,
although there are others in the market that also certify safety-related equipment. And when you
are controlling the destiny and health of your employees, your property, and the environment, you
want to be sure that your equipment has been tested and verified by
someone other than the person or vendor who sold the equipment to you!



What would I use a Safety Instrumented System for?
Safety systems are applied over a much wider area of your plant than you may realise.

While most people are familiar with emergency shutdown systems (ESD) and fire and gas (F&G)
detection and suppression systems, many don’t appreciate that burner management (BMS),
turbomachinery control and protection (TMC) or high integrity pressure protection (HIPPS) systems
are also considered SIS.

• Emergency shutdown system - detect potentially hazardous conditions, shut down the plant and
help prevent unsafe incidents
• Fire and gas detection - detect abnormal situations such as fire or combustible/toxic gas and
provide early alerts and mitigation
• Burner management system - enable safe operation and protection of burners, boilers, furnaces,
and heaters.
• High Integrity Pressure Protection - help prevent over pressurization of a plant or pipeline
• Turbo Machinery Control and Protection - make your rotating machines safe and efficient

In the next chapter, we will look at safety instrumented functions, safety integrity levels, and
probability of failure on demand.



What is the difference between
a Safety Instrumented
Function (SIF) and a Safety
Integrity Level (SIL)?
As per the standard, a SIF is “a function to be implemented by a SIS which is intended to
automatically achieve or maintain a safe state for the process with respect to a specific
hazardous event.”

Informally, it is a function that doesn’t require operator intervention. It is one loop within the SIS
consisting of sensors, logic solver, and final control elements that act together to detect a hazard
and bring the process to a safe state. In simple terms, it is the safety equivalent to a control loop!

A SIF is one loop within the SIS consisting of sensors, logic solver and final control
elements that act together to detect a hazard and bring the process to a safe state.

A SIF senses a potential hazard, applies logic, and triggers an action to bring the plant to a safe
state. It is event driven. A Safety Instrumented System utilizes several SIFs working together as a
whole to protect the entire plant.

Common misconceptions about a SIF include:

• Generating an operator alarm indication is a SIF
• Detecting an over temperature on the burner exhaust is a SIF
• Detecting a flammable gas cloud is a SIF
• Detecting smoke or fire is a SIF

None of the above include an action, associated with a final element, that automatically brings the
plant to a safe state, hence they are not considered SIFs. They may be part of the overall risk
reduction (for example, partial credit may be taken for the operator alarm), but they are not SIFs.



An example of a SIF could be the following: detecting over temperature in the burner exhaust
causes the main fuel isolation valve to trip and remove the fuel from the burner, thereby bringing
the burner to a safe state. It contains a sensing element and a logic element, and takes an
automatic action that prevents a specific hazard.
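
The distinction between detection alone and a complete SIF can be written as a deliberately simplistic check. In the Python sketch below, the class, field and function names are hypothetical and exist only for illustration. The one rule it encodes is the one stated above: without an automatic action on a final element, it is not a SIF.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateFunction:
    description: str
    sensor: str
    # Automatic action on a final element; None when nothing acts on the process.
    final_element_action: Optional[str]


def is_sif(candidate: CandidateFunction) -> bool:
    """A SIF needs an automatic action on a final element, not just detection."""
    return candidate.final_element_action is not None


candidates = [
    CandidateFunction("Operator alarm indication", "pressure transmitter", None),
    CandidateFunction("Burner exhaust over temperature detection", "temperature sensor", None),
    CandidateFunction(
        "Burner exhaust over temperature trip",
        "temperature sensor",
        "close main fuel isolation valve",
    ),
]

results = {c.description: is_sif(c) for c in candidates}

for description, verdict in results.items():
    print(f"{description}: {'SIF' if verdict else 'not a SIF'}")
```

Only the third candidate, which closes the main fuel isolation valve automatically, qualifies.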

What is Safety Integrity Level (SIL)?


To what extent can a process be expected to perform safely? And in the event of a failure, to what
extent can the process be expected to fail safely?

These questions are answered through the assignment of target Safety Integrity Levels, or SIL.

A SIL is “the Safety Integrity Level (SIL) of a specific Safety Instrumented Function (SIF)
which is being implemented by a Safety Instrumented System (SIS).”

Or put simply, Safety Integrity Levels are measures of the amount of risk reduction that the SIF should
provide against a specific hazardous event in relation to the corporate tolerable risk levels.

Safety Integrity Levels are measures of the safety risk given to a specific process.

In simple terms, a Safety Integrity Level is a measurement of the risk
reduction performance required for a Safety Instrumented Function.

Safety Integrity Levels are defined in 4 discrete levels of safety, 1 through 4.

Each level represents an order of magnitude of risk reduction.

The higher the Safety Integrity Level, the greater the impact or
consequence of a failure, and the lower the failure rate that
is acceptable.

Safety Integrity Level (SIL) expressed as a Probability of Failure on Demand (PFD)
The effectiveness of a Safety Instrumented Function can be described in terms of the probability
that it will fail to perform its required function when it is called upon to do so. This is known as the
Probability of Failure on Demand, or PFD.

$$PFD = \frac{\lambda_{du} \cdot TI}{2}$$

where:

PFD: Probability of Failure on Demand
λdu: rate of dangerous undetected failures
TI: test interval (proof test)
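
The factor of 2 in this simplified equation comes from averaging over the test interval. The derivation below is a sketch under two assumptions that are not stated in the original: the dangerous undetected failure rate is constant, and the product of failure rate and test interval is much smaller than 1. The time variable t and the subscript "avg" are notation introduced here.

```latex
\begin{align}
PFD(t) &= 1 - e^{-\lambda_{du} t} \approx \lambda_{du}\, t
  && \text{for } \lambda_{du}\, t \ll 1 \\
PFD_{avg} &= \frac{1}{TI} \int_{0}^{TI} \lambda_{du}\, t \,\mathrm{d}t
  = \frac{\lambda_{du} \cdot TI}{2}
\end{align}
```

In other words, the PFD in the simplified equation is the average value over one test interval, which is half of the value reached just before the next proof test.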



Over time, the probability of a failure on demand increases. To counter this, we periodically test
the SIF to check that it functions correctly (note: the SIF includes the sensors, the logic solver and
the final elements), which is why the simplified equation includes TI, the test interval period.

This is a significant contributor to the overall PFD. The test interval is the time between tests,
during which you have some uncertainty about whether your Safety Instrumented System is still as
good as it was on the first day that you tested it.

If you think about a SIL as a probability, then on day one it’s very safe, but over time the sensing
elements could degrade, pipes could corrode, or the valve might start to wear. The longer the time
between tests of the SIF, the more uncertain you are about its ability to perform correctly when it’s
vital that it operates on demand.

Without regular testing, you cannot verify that your Safety Instrumented System still works the way
you designed it to work. Every time you perform a proof test, in theory it brings the PFD back to
where it was on day one.

Note: Some regulators do not allow you to consider a proof test to take you back to day one in the
true sense of the word, so this is simplified for explanation purposes.

This is illustrated in the saw tooth curve – the average probability of failure on demand is the
average of the saw tooth curve. You need to adjust the test interval to ensure that the dotted line
stays inside the desired safety integrity target or band.
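
As a sketch of where the TI/2 in the simplified equation comes from, assume the dangerous undetected failure rate λdu is constant, that λdu multiplied by TI is much smaller than 1, and that each proof test restores the SIF to its day-one condition (the same simplification noted above). Under those assumptions the PFD rises roughly linearly between tests, and averaging one tooth of the saw gives:

```latex
\begin{align}
PFD(t) &\approx \lambda_{du}\, t, \qquad 0 \le t \le TI \\
PFD_{avg} &= \frac{1}{TI}\int_{0}^{TI} \lambda_{du}\, t \,\mathrm{d}t = \frac{\lambda_{du} \cdot TI}{2}
\end{align}
```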

If the test interval is extended, the performance is significantly affected and the average PFD
increases. Likewise, if you shorten the test interval, you are testing more often and your PFD
decreases. However, your maintenance costs also go up, and the probability of human error
(systematic failures) could increase.

The point is that safety integrity levels are connected to specific proof testing intervals which form
part of your SIF design. Varying the proof test interval will change PFD.

For example, a piece of equipment can achieve Safety Integrity Level 3 (SIL 3) operation with
a proof test interval of 12 months. If, however, it is proof tested only every 48 months due to
operational restrictions, it is no longer operating as designed and, if you run the calculation,
may no longer provide a PFD that meets the SIL 3 target.
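
To make the 12-month versus 48-month comparison concrete, here is a minimal sketch that applies the simplified equation (PFD = λdu × TI / 2) to both intervals. The failure rate `LAMBDA_DU_PER_HOUR` is an assumed, illustrative value and not data for any real device; the function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year divided by 12

# Low demand mode PFD bands: (SIL, lower bound inclusive, upper bound exclusive)
SIL_BANDS = [(4, 1e-5, 1e-4), (3, 1e-4, 1e-3), (2, 1e-3, 1e-2), (1, 1e-2, 1e-1)]

# Assumption for illustration only, not data for any real device.
LAMBDA_DU_PER_HOUR = 1e-7


def pfd_avg(lambda_du_per_hour, test_interval_months):
    """Simplified average PFD = lambda_du * TI / 2, with TI converted to hours."""
    ti_hours = test_interval_months * HOURS_PER_MONTH
    return lambda_du_per_hour * ti_hours / 2


def sil_for_pfd(pfd):
    """Return the SIL whose PFD band contains pfd, or None if outside SIL 1 to 4."""
    for sil, low, high in SIL_BANDS:
        if low <= pfd < high:
            return sil
    return None


for months in (12, 48):
    pfd = pfd_avg(LAMBDA_DU_PER_HOUR, months)
    print(f"TI = {months} months: PFDavg = {pfd:.2e}, SIL band = {sil_for_pfd(pfd)}")
```

With this assumed failure rate, the 12-month interval falls inside the SIL 3 band, while the 48-month interval gives a PFD four times larger and drops into the SIL 2 band.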



What are the different SIL levels?
A Safety Integrity Level is a way to indicate the tolerable failure rate of a particular safety function.
The Safety Integrity Level assignment is based on the amount of risk reduction that is necessary to
maintain risk at an acceptable level.


Let’s take a closer look at the order of magnitude between the different safety integrity levels
(1 being the lowest, 4 being the highest) for SIF low demand mode of operation:

Safety Integrity Level   Safety availability   Probability of Failure on Demand   Risk Reduction Factor
SIL4                     > 99.99%              0.001% to 0.01%                    100,000 to 10,000
SIL3                     99.9% to 99.99%       0.01% to 0.1%                      10,000 to 1,000
SIL2                     99% to 99.9%          0.1% to 1%                         1,000 to 100
SIL1                     90% to 99%            1% to 10%                          100 to 10

A Safety Integrity Level indicates the probability of failure on demand, not how often it will fail but
how often it will fail at the time it needs to operate.

This can be expressed in terms of safety availability, or how much of the time the system will work
when you need it to work. For example, at the lowest Safety Integrity Level, if the system works
when you need it to work between 90% and 99% of the time, that’s good enough. Every step up is
an order of magnitude of safety.
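
The three ways of expressing a Safety Integrity Level (PFD, safety availability, and risk reduction factor) are different views of the same number. A minimal sketch of the conversions, using hypothetical function names and the upper PFD limit of each level as input:

```python
def safety_availability(pfd):
    """Fraction of demands on which the function works: 1 - PFD."""
    return 1.0 - pfd


def risk_reduction_factor(pfd):
    """How many times the likelihood of the hazardous event is reduced: 1 / PFD."""
    return 1.0 / pfd


# Upper PFD limit of each level in low demand mode.
for sil, pfd in [(1, 0.1), (2, 0.01), (3, 0.001), (4, 0.0001)]:
    print(
        f"SIL{sil}: PFD {pfd:.4%} -> "
        f"availability {safety_availability(pfd):.2%}, "
        f"RRF {risk_reduction_factor(pfd):,.0f}"
    )
```

Each step up in level divides the PFD by 10 and therefore multiplies the risk reduction factor by 10.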

One way to think of Safety Integrity Levels is to consider how much risk reduction you are
applying to the Safety Instrumented Function. Say your inherent risk is “x” and in order to get to
the acceptable risk, you need to reduce your risk somewhere between 10 and 100 times. Safety
Integrity Level 1 is all you need to apply.

What the Safety Integrity Level concept asks you to do is to assess your risk for a specific Safety
Instrumented Function, then determine the appropriate level of risk mitigation you require, which is
equivalent to the required Safety Integrity Level of the Safety Instrumented Function.
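
The assessment described above can be sketched as a small calculation: divide the unmitigated event frequency by the tolerable frequency to get the required risk reduction factor, then read off the level. The frequencies below are invented for illustration, the function names are hypothetical, and the treatment of values exactly on a band boundary is a simplifying choice in this sketch rather than something the guide defines.

```python
def required_rrf(unmitigated_per_year, tolerable_per_year):
    """Risk reduction factor needed to get from inherent to tolerable risk."""
    return unmitigated_per_year / tolerable_per_year


def required_sil(rrf):
    """Map a required risk reduction factor to a SIL (0 means below SIL 1)."""
    if rrf < 10:
        return 0
    for sil in (1, 2, 3, 4):
        if rrf <= 10 ** (sil + 1):
            return sil
    raise ValueError("Required risk reduction exceeds SIL 4; redesign the process")


# Assumed example: event expected once in 100 years, tolerable once in 5,000 years.
rrf = required_rrf(unmitigated_per_year=1e-2, tolerable_per_year=2e-4)
print(f"Required RRF = {rrf:.0f}, required SIL = {required_sil(rrf)}")
```

A required reduction of about 50 times sits between 10 and 100, so Safety Integrity Level 1 is sufficient in this assumed case.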

In the next chapter we will look at risk, what is tolerable, and how to reduce it to an acceptable
level.



What is risk? What is tolerable? What is acceptable?
Safety standards exist to manage and reduce risk, which leads to questions such as “How do we
define risk? How do we measure it? How do we understand what it means?”

Risk = Likelihood x Consequence


The likelihood of a specified or undesired event occurring within a specified period
or in specified circumstances.

One of the simple ways to think about risk is “What is the likelihood of an undesired event occurring
within a specific period or given circumstances?”

One of the simplest ways that we can do this is to use a risk matrix that shows the correlation
between the likelihood of something happening and the consequence if it happens.

A risk matrix is:

• a matrix used to define the level of risk


• by considering the category of probability or likelihood
• against the category of consequence or severity.

This is a simple mechanism to:

• increase visibility of risks


• assist management decision making.

Note: while the concept of the risk matrix is the same, I have seen many different size risk matrices
used such as 3x3, 4x4, 5x5, 5x6, 6x6, even 6x8. I have seen the risk matrix inverted, so the
highest risks are in the bottom left-hand corner, not the top right. I have seen risk matrices used for
people safety on the plant, people safety off the plant, commercial risk, public image as well as
environmental risk.

But the basic principle remains the same. If you have a low likelihood of an undesired event
occurring and the consequence is minor, that’s a relatively low risk.

However, if you have a high likelihood of an undesired event occurring and the consequence
is still minor, it might still be considered high risk…because if it happens all the time,
it’s interrupting production.
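
A risk matrix is, in effect, a lookup table. The sketch below uses a hypothetical 3x3 matrix with invented category names and ratings; a real matrix comes from your own corporate risk criteria. It is deliberately filled in so that a high likelihood with a minor consequence still rates as high risk, matching the point above.

```python
# Hypothetical 3x3 matrix: RISK_MATRIX[likelihood][consequence] -> risk level.
RISK_MATRIX = {
    "low":    {"minor": "low",    "serious": "medium", "major": "high"},
    "medium": {"minor": "medium", "serious": "high",   "major": "high"},
    "high":   {"minor": "high",   "serious": "high",   "major": "high"},
}


def risk_level(likelihood, consequence):
    """Look up the risk level for a likelihood and consequence category."""
    try:
        return RISK_MATRIX[likelihood][consequence]
    except KeyError:
        raise ValueError(f"Unknown category: {likelihood!r} / {consequence!r}")


print(risk_level("low", "minor"))   # rare and minor
print(risk_level("high", "minor"))  # frequent but minor
print(risk_level("low", "major"))   # rare but severe
```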



Some of these things can be quantified. The likelihood of a hazardous event happening may be
directly related to the reliability of a valve, a motor, or a pump. The prior-use fault history of a
device can give you an idea of how likely it is that its failure will cause some sort of event.

How do we achieve the right balance of too little or too much risk?
Many operators immediately associate risk with injury or death to personnel. But when analysing
risk, these operators may not have thought about some of the ongoing or consequential damaging
effects of taking too much risk, including:

• Injury or death to personnel


• Environmental damage and associated clean-up costs
• Damage and loss of equipment / property
• Business interruption and associated losses
• Company reputation and image
• Market share loss
• Legal liability, litigation and “duty of care” defence

For example, the sustainability impact on the environment and the consequential clean-up cost is a
prevalent consideration among regulatory authorities.

Consider the loss of containment at the Macondo oil well, owned by BP, in the Gulf of Mexico in April
2010, and the consequential environmental damages. The operator’s revenue was interrupted.
And the incident certainly carried a huge legal liability. Their corporate image took a staggering hit,
the stock price was significantly impacted, and the final settlement bill ran into billions of dollars.
These are all factors that need to be considered when identifying risk, but which are not always
easy to quantify.

So, when considering the consequence, don’t just think about the safety consequence; consider
the people, commercial, and environmental consequences.

The point is, you must think about the big picture when considering risk. All these factors can
weight the impact of the consequence you are trying to mitigate.



Tolerable risk and reducing it to the lowest acceptable limit
The goal of eliminating risk and bringing about a state of absolute safety is not attainable.
The foundation for any modern safety system is therefore to reduce risk to an acceptable or
tolerable level.

When considering tolerable risk, there are three basic considerations:

• Legal and statutory obligations


• Moral obligation
• Financial obligations to investors and shareholders

Some countries have mandated tolerable risk limits. But in many regions, for example North
America, the liability is on the operator to determine what their tolerable risk is, provided they meet
OSHA requirements as a minimum.

A common concept when determining tolerable risk is ALARP or “As Low as Reasonably
Practicable”. ALARP is one of the fundamental principles of risk management. We neither need nor
want to manage risk to the point where we eliminate it, because beyond the point where the costs
exceed the benefits, doing so is simply not a good use of resources.

Assessing tolerable risk will be different based on the operators’, and/or shareholders’
expectations, as well as where the plant is built and operated. Building a plant in the middle of the
desert is very different from building the same plant in the middle of a city surrounded by a dense
population. Same plant, same process risk, same process. But the tolerable risk to the operator is
completely different.



Let’s take a practical example, and consider a hazardous event (HE), a tank overfill that may lead
to tank rupture, an explosive vapour release and potential fire:

There are multiple layers of protection being deployed to reduce the overall risk to an acceptable
level as no one layer on its own provides enough risk reduction to get to the target acceptable level:

• a pressure relief device (PRD)


• a basic process control system (BPCS) controlling the tank level (LT101, LIC101, LV101)
• an operator alarm (LAH101)
• the SIS system (LT102, XV101)

A key point to note and remember is that a Safety Instrumented System does not reduce the
consequence; it can only reduce the likelihood of an incident or event happening.

The difference between the inherent risk (the unmitigated risk) and the tolerable risk is how much
risk reduction you apply and the corresponding safety integrity levels that are assigned.
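
The gap between inherent and tolerable risk can be sketched numerically for the tank overfill example. Every number below (the unmitigated frequency, the tolerable frequency, and the PFD credited to each layer) is an assumption made up for illustration, and the layers are assumed to be fully independent of each other; a real study would take these values from the plant's own risk assessment.

```python
import math

UNMITIGATED_FREQ_PER_YEAR = 1.0   # assumed: one overfill challenge per year
TOLERABLE_FREQ_PER_YEAR = 2e-6    # assumed corporate tolerable frequency

# Assumed PFD credited to each non-SIS layer of protection.
LAYER_PFDS = {
    "BPCS level control (LT101, LIC101, LV101)": 0.1,
    "Operator response to alarm (LAH101)": 0.1,
    "Pressure relief device (PRD)": 0.01,
}


def frequency_after_layers(initial_freq, pfds):
    """Event frequency once every independent layer has had its chance to act."""
    return initial_freq * math.prod(pfds)


freq_without_sis = frequency_after_layers(UNMITIGATED_FREQ_PER_YEAR, LAYER_PFDS.values())
required_sis_pfd = TOLERABLE_FREQ_PER_YEAR / freq_without_sis
required_sis_rrf = 1 / required_sis_pfd

print(f"Frequency without the SIS: {freq_without_sis:.1e} per year")
print(f"SIS must provide PFD <= {required_sis_pfd:.3f} (RRF >= {required_sis_rrf:.0f})")
```

With these assumed figures the other layers leave a gap of about 50 times, which the SIF on LT102 and XV101 would need to close. That required risk reduction falls in the Safety Integrity Level 1 range.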

A Safety Instrumented System reduces the likelihood of an inherent process risk, and the
standards obligate us to go through this systematic process for every identified hazard that we feel
we can address with passive or active risk reduction methods.

Turn the page for the next chapter, where we will look at some of the considerations when selecting
the safety instrumented system including architecture, voting and testing.



What do I need to consider when selecting the SIS?
Approximately 60% of failures are built into systems before commissioning, and 85% of those are
engineering related.

Omissions in the design of a Safety Instrumented Function could remain undiscovered until an incident
occurs. This possibility makes the conceptual design of a Safety Instrumented System especially
challenging.

The conceptual design comprises the selection of appropriate technology (sensors, SIS, and final
elements). There has never been more technology to choose from, so it is extremely important to be
clear on what you are asking for, and why:

1. What does it take to make it SIL3?

2. What does it take to keep it at SIL3?

a. Remember that proof test interval discussed earlier? The technology may be lower CAPEX, but
if you must test it every year to keep it at SIL3, then the OPEX costs escalate significantly. What
appears at first to be a cost-effective choice, may have a total lifecycle expenditure (TOTEX) that
far exceeds anticipated costs!

As always, the devil is in the detail. Something may have a certificate and be certified as SIL3, but
it’s important to understand what is behind the certificate. Every reputable SIS manufacturer will
have a safety manual to accompany the system. If you read nothing else, read this document, and
become familiar with what you are buying, what you are getting or, more importantly, not getting, and
what functionality is built into the system (e.g., diagnostics) versus what you, the owner / operator,
will be required to do.

Specifying just the Safety Integrity Level or Probability of Failure on Demand values is not
enough. Understanding how the equipment is going to be used, applied, tested, operated and
maintained in real life is crucial.

Selecting field devices

Selection of appropriate field devices and their
architecture is critical to meeting the performance targets.
When selecting the technology, you must understand its
failure rates, verify that it is certified for use in the desired
safety application, and understand any equipment
restrictions about how it is to be applied in safety
applications.

Proper application of today’s smart technologies can significantly benefit the operator by reducing
interventions in both duration and frequency.



Select architecture / voting / fault tolerance
An understanding of how you are going to apply your design to address safety, specifically safety
within your plant uptime / availability targets is essential. You might have a safe plant, but if it’s shut
down for 60% of the time because you have spurious trips or failures, then that is not acceptable.
Furthermore, most major accidents worldwide occur during plant startup or restart.

It is necessary to think not only about bringing the process to a safe state, but also about a cost-effective
architecture with field devices and logic solvers that can tolerate a non-demand failure of one of its
elements without losing production.

You should also consider redundancy and levels of fault tolerance to maximize production uptime,
avoid spurious trips, and ease maintenance activities, not just for field instruments but also for the SIS
itself. The more fault tolerant the SIS, the less likely you are to have a costly nuisance or spurious
trip. Investing a little more in a highly fault-tolerant SIS pays long-term dividends! The small
additional cost is far outweighed by even an hour of lost production revenue.

Some of the considerations for architecture / voting selection include:


• Select degree of fault tolerance required for safety
• Select degree of fault tolerance for plant availability
• Apply required redundancy to both field devices and logic solver
• Identify potential common-cause failures that could defeat redundant architecture

You need to identify any common cause failures that could defeat the safety function even if you’ve
applied it redundantly. For example, suppose the sensing element for a particular safety
protection loop is a pressure transmitter in the field. You may decide to use two pressure transmitters
so that if one of them fails spuriously, without a demand, you won’t trip your plant.

But the problem is that if they are both connected to the same impulse line tubing, a blockage in that tube
could cause both transmitters not to sense the hazard.

This is known as a common cause failure and has the potential to either trip your plant spuriously or,
worse, completely defeat the safety function.

Proof tests and testing


The Safety Instrumented System must be carefully monitored for performance against the assumptions
made during design. Because a Safety Integrity Level is essentially a probability or certainty that
something is going to work when it needs to work, you need to consider how often you are going to proof
test those devices. This should be part of the initial risk assessments and SIL verification, and be detailed
in the Safety Requirements Specification (SRS).

Many plants have scheduled maintenance turnarounds every 5-7 years. But some systems need to be
tested more often, outside of the turnaround window, to meet their Safety Integrity Levels.
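
One way to check whether a turnaround schedule is enough is to rearrange the simplified equation (PFD = λdu × TI / 2) for the longest test interval that still meets the target. The sketch below uses hypothetical function names, an assumed failure rate, and the upper PFD limit of SIL 3 as the target; none of these are values for a real device.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year divided by 12


def max_test_interval_months(lambda_du_per_hour, pfd_target):
    """Rearranged simplified equation: TI = 2 * PFD / lambda_du, in months."""
    return 2 * pfd_target / lambda_du_per_hour / HOURS_PER_MONTH


def needs_test_between_turnarounds(lambda_du_per_hour, pfd_target, turnaround_months):
    """True if waiting for the next turnaround would exceed the PFD target."""
    return max_test_interval_months(lambda_du_per_hour, pfd_target) < turnaround_months


# Assumed values: lambda_du = 1e-7 per hour, SIL 3 upper limit, 5-year turnaround.
limit = max_test_interval_months(1e-7, 1e-3)
print(f"Longest allowable proof test interval: {limit:.1f} months")
print("Test needed between turnarounds:", needs_test_between_turnarounds(1e-7, 1e-3, 60))
```

In this assumed case the device would need proof testing roughly every two years, so a test outside the 5-year turnaround has to be planned into the design.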

When selecting the SIS, consider:


• Proof tests – check for frequency, online or during shutdown, full functional test, or partial test
• Diagnostic testing – check if programming is required, frequency, response time to a detected fault

How are these tests going to be conducted without interrupting operations? Are you going to do a full
test or use some internal diagnostics to report on system integrity? Procedures and resources to inspect,
monitor, audit and report on system integrity enable you to prove the suitability of the safety system at any
point in its operational life.

In summary, specifying just the Safety Integrity Level or Probability of Failure on Demand values is not
enough. Understanding how the equipment is going to be used, applied, tested, operated, and maintained
in real life is crucial. You must be very specific to ensure that you get what you need for the specific asset.

In the next chapter, we will look at practical examples of SIL1 and SIL2 safety instrumented functions and
the mean time to failure spurious (MTTFS).



Simplified SIL 1 and 2 practical examples
Let’s put everything previously discussed together in practical examples. As the majority of safety
functions in practice are SIL1 and SIL2, and very few are SIL3, we will just focus on SIL1 and SIL2.

Let’s start with a couple of typical Safety Integrity Level 1 designs.

This is a pressure vessel (V101). The instruments or devices in blue are elements of the basic
process control system that is maintaining the level in the vessel. If one of these elements fails,
it may not be possible to control the level of the vessel.

For example, if the level transmitter (LT101) failed dangerously, that is, it stopped reacting to the
tank level and reported the same level all the time, the controller (LIC101) would continue to think
that the level was constant, even if it was rising. If the level rises too much, product could come
out of the top of the vessel (known as a loss of containment), which could ignite and then cause
an explosion.

On the other hand, the Safety Instrumented System (shown in red) measures the same level
in the vessel through an independent level transmitter (LT102), independent logic solver and
independent valve (XV101).

Thus, if one of the basic process control system elements fails, the Safety Instrumented System
can still sense a high level and protect this vessel from getting into a hazardous state. If the Safety
Integrity Level for this protective function was determined as Safety Integrity Level 1 (SIL1), then
this design is sufficient.



Typical SIL 1 design – safety and higher MTTFS (Mean Time To Failure Spurious)

[Note: Mean Time To Failure Spurious = the mean time until a failure of the system causes a spurious process trip]

But what if LT102 is not the most reliable in the world? Say it fails over time due to environmental
conditions, corrosive process fluids etc. In the previous example, Safety Integrity Level 1 is still
met because the safety function will still operate if LT102 fails spuriously.

But it will trigger the valve (XV101) and shut the process down, potentially costing millions of
dollars in lost production just because this $1,000 device failed in a spurious way.

One way to increase the availability of this protection loop, but not compromise safety, is to use
two measuring devices to look at the same level in the vessel.

If one of them fails spuriously, that is, fails when there is no demand, you still receive the
healthy signal from the second device. Both must agree at the same time (Two out of Two voting
or 2oo2) that there is a process condition issue before they trigger the protection loop. The
potential for this unit to be shut down by a spurious trip has been reduced, and its availability
increased, by adding an additional device.

Note, the safety integrity has not been affected. The SIF can still achieve Safety Integrity Level 1,
but with an increased process availability because the spurious trip of just one of the level
transmitters will not trip the loop.
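
The difference between the single-transmitter design and the 2oo2 design can be sketched as two small voting functions. The guide does not name the second transmitter, so the `lt102a` and `lt102b` parameter names are hypothetical; `True` means a transmitter is reporting a high level.

```python
def trip_1oo1(lt102_high):
    """Single transmitter: any high-level signal trips XV101."""
    return lt102_high


def trip_2oo2(lt102a_high, lt102b_high):
    """Two transmitters: both must report a high level before XV101 trips."""
    return lt102a_high and lt102b_high


# Case 1: one transmitter fails spuriously (reports high with no real demand).
print("1oo1, spurious signal:    trip =", trip_1oo1(True))
print("2oo2, one spurious signal: trip =", trip_2oo2(True, False))

# Case 2: a genuine high level seen by both healthy transmitters.
print("2oo2, real demand:         trip =", trip_2oo2(True, True))
```

The same sketch also shows the other side of 2oo2 voting: a transmitter that is stuck low during a real demand blocks the trip, which is one reason the proof testing discussed earlier matters for this architecture.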

Important: the simplified examples do not negate the need for following the systematic approach
to identify hazards, risks, consequences, likelihood, target levels, risk reduction methods etc.



Typical SIL 2 design
Let’s look at a higher integrity loop. This is a Safety Integrity Level 2 design. It still has the same
process control, but assumes the associated risk, consequence, and likelihood is an order of
magnitude higher. Therefore, it needs an order of magnitude more risk reduction to reach a
tolerable level.

This architecture uses two level devices in the safety loop to measure the level in the vessel
(LT102 and LT103). Only one of them must detect a hazardous condition (one out of two or 1oo2)
to shut down the system using valves XV101 and XV102.

Once again, only one of these devices needs to vote for a hazardous situation to trip the
production unit and take it to a safe state. This design is much safer than the
previous design, but still, if a $1,000 device goes bad, it will stop production.
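
To give a feel for how much safer 1oo2 voting is for the level transmitters alone, the sketch below compares the simplified PFD of a single transmitter with a commonly quoted simplified approximation for a 1oo2 pair that includes a common cause term. The failure rate, the test interval, and the common cause fraction `beta` are all assumptions for illustration, not vendor data, and the valves and logic solver are left out.

```python
def pfd_1oo1(lambda_du, ti_hours):
    """Simplified average PFD for a single device: lambda_du * TI / 2."""
    return lambda_du * ti_hours / 2


def pfd_1oo2(lambda_du, ti_hours, beta):
    """Simplified 1oo2 approximation with a common cause fraction beta.

    The first term covers both devices failing independently, the second
    covers a shared cause (such as a blocked common tapping) defeating both.
    """
    independent = ((1 - beta) * lambda_du * ti_hours) ** 2 / 3
    common_cause = beta * lambda_du * ti_hours / 2
    return independent + common_cause


LAMBDA_DU = 5e-6   # assumed dangerous undetected failure rate, per hour
TI_HOURS = 8760    # assumed 12-month proof test interval
BETA = 0.1         # assumed: 10% of failures are common cause

single = pfd_1oo1(LAMBDA_DU, TI_HOURS)
pair = pfd_1oo2(LAMBDA_DU, TI_HOURS, BETA)
print(f"1oo1 sensor PFD: {single:.4f}")
print(f"1oo2 sensor PFD: {pair:.4f}")
print(f"Improvement:     {single / pair:.1f} times")
```

With these assumed values the single transmitter sits in the SIL 1 range and the 1oo2 pair in the SIL 2 range. The common cause term dominates the 1oo2 result, which is why shared failure causes deserve so much attention in redundant designs.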

Typical SIL 2 design – higher MTTFS (Mean Time To Failure Spurious)

To increase the overall availability of this production unit and maintain safety, you could typically
use three devices measuring the same level (LT102, LT103, LT104).

During normal operation, all three level transmitters monitor the level, and are voted such that
two out of three (2oo3) need to agree to demand the safety function before it can take place.
This allows one of these devices to go bad without tripping the entire plant, and still provides the
required safety levels.

Another benefit is that this allows online maintenance and testing of the level transmitters without
the need to put the loop in manual control or any loss of protection as the voting takes it out of
the equation.
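
All of the voting arrangements in these examples are cases of "M out of N" voting, which can be sketched with one hypothetical function. Each entry in `trip_signals` is `True` when that transmitter is demanding a trip.

```python
def vote(m, trip_signals):
    """Trip when at least m of the transmitters demand it (MooN voting)."""
    return sum(trip_signals) >= m


# 1oo2 (LT102, LT103): a single spurious signal is enough to trip the plant.
print("1oo2, one spurious signal:", vote(1, [True, False]))

# 2oo3 (LT102, LT103, LT104): one spurious signal is outvoted...
print("2oo3, one spurious signal:", vote(2, [True, False, False]))

# ...and a real demand still trips even if one transmitter is stuck low.
print("2oo3, real demand, one transmitter stuck:", vote(2, [True, True, False]))
```

The 2oo3 arrangement is the only one of the three that tolerates both kinds of single failure: one false high reading does not stop production, and one stuck reading does not block a real trip.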

The point is, when designing a system, you need to consider both the SAFETY of the plant and
the AVAILABILITY of the plant. After all, the safest plant is one that never starts up and operates!



Summary
So that’s it! In this book we have tried to keep it simple and introduce the importance of functional
safety and some idea of what is involved.

In summary:
• Device hardware failures are not the primary causes of previous industry major accidents.
• IEC 61511 / ISA S84.01 are standards for the process industries.
• They are performance-based standards and address the entire safety lifecycle.
• They are not always mandatory but are considered best engineering practice by industry and regulators.
• A Safety Instrumented System is an independent layer of protection and must be independent and
separate from the Basic Process Control System. They should not normally “share” the same field devices.
• SIS PLCs are different from traditional PLCs and independently certified for use in safety related
applications.
• A Safety Instrumented Function is one loop (a safety loop) executed within a Safety Instrumented
System (SIS). It includes sensors, logic solver and final control elements that work to detect a hazard
and bring the process to a safe state.
• A Safety Integrity Level (SIL) indicates the tolerable failure rate of a particular Safety Instrumented Function.
• Risk is a product of likelihood and consequence. A Safety Instrumented System is designed to reduce
the likelihood of an incident or event, it does not reduce the consequence.
• And finally, the device testing, voting architecture and plant availability targets must all be considered
in designing a Safety Instrumented System. Is it safe? Is it safe with enough availability to allow for
continuous production?

Like everything, the devil is always in the detail. The good news is that there are plenty of excellent
resources for more information, including the following sources:

• American Petroleum Institute


• Center for Chemical Process Safety
• Chemical Safety Board
• Energy Institute
• Health and Safety Executive
• International Association of Oil and Gas Producers (OGP)
• National Skills Academy
• The Organization for Economic Co-Operation and Development (OECD)
• Occupational Safety and Health Administration (OSHA)



About the author
Kenny Chua is an offer manager for the Triconex Safety and Critical
Control business. He holds a Master of Business Administration (MBA)
and a Bachelor of Science (BSc) in Electrical Engineering, and is
a certified Project Management Professional (PMP) and
a TÜV certified Functional Safety Engineer (FS Eng).

He has more than 15 years’ experience in the process safety
industry, including roles in project engineering, project
management, and offer management.

Kenny Chua
Triconex Safety and
critical Control



se.com

Schneider Electric
50 Kallang Avenue,
Singapore 339505
©2024 Schneider Electric. All Rights Reserved.
Schneider Electric | Life Is On is a trademark and the property of Schneider Electric SE, its subsidiaries, and affiliated companies.
998-23302905
