
Uptime Elements Passport: Gineer

uptime elements

Uploaded by

Brian Careel

Uptime® Elements™ Passport

Reliability Engineering for Maintenance (REM)

IN PREPARATION FOR
Part of the Certified Reliability Leader
Body of Knowledge

criticality analysis (Ca) • reliability strategy development (Rsd)
reliability engineering (Re) • root cause analysis (Rca)
capital project management (Cp) • reliability centered design (Rcd)

ISBN 978-194872-54-3
HF012017

© 2017 Netexpress USA, Inc. d/b/a Reliabilityweb.com (“Reliabilityweb.com”)
Printed in the United States of America.
All rights reserved.

This book, or any parts thereof, may not be reproduced,
stored in a retrieval system, or transmitted in any form
without the permission of the Publisher.

Publisher: Reliabilityweb.com
Designer: Jocelyn Brown

For information: Reliabilityweb.com
www.reliabilityweb.com
8991 Daniels Center Drive, Suite 105, Ft. Myers, FL 33912
Toll Free: 888-575-1245 | Phone: 239-333-2500
E-mail: [email protected]

Uptime®, Reliabilityweb.com® and Uptime® Elements™ are the trademarks or
registered trademarks of NetexpressUSA Inc. d/b/a Reliabilityweb.com and
its affiliates in the USA and in several other countries.

10 9 8 7 6 5 4 3 2 1
REM Contents

Ca criticality analysis
   Introduction
   Key Terms and Definitions
   Criticality Analysis Development
   Analysis Process Methodology
   Benefits of Criticality Analysis
   What Every Reliability Leader Should Know
   Summary

Rsd reliability strategy development
   Introduction
   Key Terms and Definitions
   Purpose of Reliability Strategy Development
   Reliability-Centered Maintenance Principles and Standards
   Reliability Strategy Development Tools
   Benefits of Reliability Strategy Development
   What Every Reliability Leader Should Know
   Summary
   References

Re reliability engineering
   Introduction
   Key Terms and Definitions
   Purpose of Reliability Engineering
   Role of Reliability
   Measuring Reliability
   Software Reliability
   Benefits of Reliability
   What Every Reliability Leader Should Know
   Summary
   References

Rca root cause analysis
   Introduction
   Key Terms and Definitions
   Purpose of Root Cause Analysis
   Root Cause Analysis Process
   Root Cause Analysis Tools
   Benefits
   What Every Reliability Leader Should Know
   Summary
   References

Cp capital project management
   Introduction
   Key Terms and Definitions
   Developing Capital Project Management
   Installation of New Assets
   Commissioning New Assets
   Optimizing Capital Project Management
   What Every Reliability Leader Should Know
   Summary

Rcd reliability centered design
   Introduction
   Key Terms and Definitions
   Principles of Reliability Centered Design
   The 10X Rule
   Designing for RAMS²
   Practices and Tools for Reliability Centered Design
   What Every Reliability Leader Should Know
   Summary
   References

Acknowledgment

The Uptime Elements is a holistic system
based approach to reliability
that includes: Technical Elements,
Cultural Elements, Leadership Elements

Figure: Reliability Engineering for Maintenance (REM) elements: criticality
analysis (Ca), reliability strategy development (Rsd), reliability engineering
(Re), root cause analysis (Rca), capital project management (Cp), reliability
centered design (Rcd).

Figure: Uptime® Elements™, A Reliability Framework and Asset Management System™

Technical Activities
  Reliability Engineering for Maintenance (REM): Ca criticality analysis,
    Rsd reliability strategy development, Re reliability engineering,
    Rca root cause analysis, Cp capital project management,
    Rcd reliability centered design
  Asset Condition Management (ACM): Aci asset condition information,
    Vib vibration analysis, Fa fluid analysis, Ut ultrasound testing,
    Ir infrared thermal imaging, Mt motor testing, Ab alignment and balancing,
    Ndt non destructive testing, Lu machinery lubrication
  Work Execution Management (WEM): Pm preventive maintenance,
    Ps planning and scheduling, Odr operator driven reliability,
    Mro mro-spares management, De defect elimination,
    Cmms computerized maintenance management system
Leadership
  Leadership for Reliability (LER): Es executive sponsorship,
    Opx operational excellence, Hcm human capital management,
    Cbl competency based learning, Int integrity, Rj reliability journey
Business Processes
  Asset Management (AM): Sp strategy and plans, Cr corporate responsibility,
    Samp strategic asset management plan, Ri risk management,
    Ak asset knowledge, Alm asset lifecycle management, Dm decision making,
    Pi performance indicators, Ci continuous improvement

Reliabilityweb.com’s Asset Management Timeline (Asset Lifecycle):
Business Needs Analysis, Design, Create/Acquire, Operate, Maintain,
Modify/Upgrade, Dispose/Renew, Residual Liabilities

Reprinted with permission from NetexpressUSA Inc. d/b/a Reliabilityweb.com. Copyright © 2016-2017. All rights reserved. No part of this graphic may be reproduced or transmitted in any form or by any means without the prior express written consent of NetexpressUSA Inc. Uptime®,
Reliability®, Certified Reliability Leader™, Reliabilityweb.com® , A Reliability Framework and Asset Management System™ and Uptime® Elements™ are trademarks and registered trademarks of NetexpressUSA Inc. in the U.S. and several other countries.

reliabilityweb.com • maintenance.org • reliabilityleadership.com


Ca
criticality analysis

Introduction
Criticality analysis (CA) is a key element in the
Reliability Engineering for Maintenance (REM) domain of
Uptime Elements and is fundamental to asset manage-
ment. CA is used to evaluate how asset failures impact
organizational performance and to systematically rank
plant assets for the purpose of workflow prioritization,
preventive maintenance and condition monitoring
development, maintenance reliability initiatives, etc.
It provides the basis for determining the value and
impact a specific asset has on the production/operations
process, as well as the level of attention the asset requires
with regard to reliability strategy development (RSD) or
strategies and plans (SP) for asset management.
A failure mode and effects analysis (FMEA) is used
to determine different failure modes and their effects on
the asset, while a criticality analysis classifies and prior-
itizes the level of importance of a failure on operations.
This ranking is based on several factors, such as the pro-
jected failure rate of the asset, the severity of the effect
(i.e., consequences) of the failure and the likelihood of
the failure being detected before it occurs.
Asset criticality is sometimes called asset risk profile.
It uses a risk formula to determine the financial impact
if an asset failure were to happen. Simply stated, it is a
risk rating indicator, with asset criticality directly
proportional to:

(Failure Frequency / Period) × (Cost Consequences ($)) = Risk ($/period)

The cost consequence is not just the cost of lost
production and the cost of repair, but also includes costs
related to safety, the environment, quality, the organiza-
tion’s reputation, etc. The cost consequence is the total
business impact of that asset’s failure. The failure fre-
quency is an estimated number, a probability based on
history or industry norm for similar situations.
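
The risk formula above can be illustrated with a few lines of Python. This is only a sketch: the function name `annual_risk`, the cost categories and all dollar figures are hypothetical values chosen for illustration, not data from this book.

```python
def annual_risk(failures_per_year: float, cost_per_failure: float) -> float:
    """Risk ($/year) = failure frequency (failures/year) x cost consequence ($/failure)."""
    if failures_per_year < 0 or cost_per_failure < 0:
        raise ValueError("frequency and cost must be non-negative")
    return failures_per_year * cost_per_failure


# Hypothetical cost consequence of one failure, split into the kinds of
# cost the text mentions (production, repair, safety/environment/quality).
cost_consequence = {
    "lost_production": 40_000.0,
    "repair": 8_000.0,
    "safety_environment_quality": 2_000.0,
}
total_cost = sum(cost_consequence.values())

# Assumed frequency: one failure every four years, estimated from history
# or industry norms for similar situations.
risk = annual_risk(failures_per_year=0.25, cost_per_failure=total_cost)
print(f"Risk: ${risk:,.0f} per year")
```

Keeping the cost consequence as a sum of named categories makes it harder to forget the non-production costs when the estimate is reviewed.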
An analysis of asset criticality rankings performed
at numerous organizations shows that data and usage
histories are rarely as good as claimed. Also,
different areas of a plant or division utilize the com-
puterized maintenance management system (CMMS)
differently regarding work order creation, work record-
ing and parts usage. Planning and scheduling activities
of maintenance work orders can be guided by asset
criticality rankings to determine work execution. In other
words, the work orders with the highest criticality ranking
would be executed first, and then each lower ranking in
turn, until all backlogged work orders are completed.
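
As a rough illustration of criticality-driven scheduling, the sketch below sorts a backlog so that work orders on the most critical assets come first. The `WorkOrder` fields and the sample data are hypothetical, and breaking ties by the age of the work order is an added assumption; the text only specifies ordering by criticality.

```python
from dataclasses import dataclass


@dataclass
class WorkOrder:
    number: str
    asset: str
    criticality: float  # asset criticality rating on a 0-100 scale
    days_open: int      # age of the work order in days


# Hypothetical backlog for illustration only.
backlog = [
    WorkOrder("WO-101", "Conveyor system", 27.5, 12),
    WorkOrder("WO-102", "Assembly machine", 57.5, 3),
    WorkOrder("WO-103", "Hydraulic power unit", 42.5, 7),
    WorkOrder("WO-104", "Assembly machine", 57.5, 9),
]

# Highest criticality first; among equal criticality, oldest order first
# (the tie-break is an assumption, not something the text prescribes).
schedule = sorted(backlog, key=lambda wo: (-wo.criticality, -wo.days_open))
order = [wo.number for wo in schedule]
print(order)
```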
Criticality analysis is an important tool that provides
valuable information for decisions about work priority,
developing reliability strategies, justifying resources to
conduct root cause analysis (RCA), FMEA, etc. CA
helps ensure that resources are being spent on the right
assets to get more value.

Key Terms and Definitions


Asset – A thing, entity, or item that has actual or poten-
tial value to an organization.
Asset (equipment) capacity – The ability of equipment
to produce a product or provide a service at a given per-
formance rate over a specified time period.
Asset management – An organizational process to
maximize value from an asset during its life; the manage-
ment of the life of an asset to achieve the lowest lifecycle
cost with the maximum availability, performance effi-
ciency and highest quality. Also known as Physical Asset
Management.
Computerized maintenance management system
(CMMS) – A software system that keeps records and
tracks all maintenance activities. Synonymous with
Enterprise Asset Management (EAM).
Criticality analysis – A methodology used to evaluate
how asset failures impact organizational performance to
systematically rank plant/facility assets for the purpose
of work prioritization, preventive maintenance (PM)/
predictive maintenance (PdM) development and opti-
mization, material classification, capital improvement
projects, etc.
Data collection – Obtaining asset and facility informa-
tion to develop and support performance improvement
efforts.
Failure – Inability of an asset or component to perform
its designed function. The asset need not be inoperable;
running at reduced speed or not meeting operational or
quality requirements also counts as a failure.
Failure mode and effects analysis (FMEA) – A
technique to examine an asset, process, or design to
determine potential ways it can fail and its potential
effects on required functions, and to identify appro-
priate mitigation tasks for highest priority risks. Also
known as Failure Mode, Effects and Criticality Analysis
(FMECA).

Lifecycle costing – A technique that examines all
costs associated with assets/items during their lifecycle,
including design, development, construction, operation,
maintenance and disposal.
Maintenance program – A comprehensive set of main-
tenance activities, their intervals and required activities,
along with accurate documentation of these activities.
Maintenance strategies – A long-term plan covering
all aspects of maintenance management that sets the
direction on how assets will be maintained and contains
action plans for achieving the desired future state.
Mean time between failures (MTBF) – A basic measure
of asset reliability calculated by dividing total operating
time of the asset by the number of failures over a period;
it is the inverse of the failure rate (λ) and is generally
used for repairable systems.
Mean time to repair (MTTR) – A basic measure of
maintainability, it represents the average time needed
to restore an asset to its full operational condition after
a failure; calculated by dividing total repair time of the
asset by the number of failures over a period of time.
Predictive maintenance (PdM) – An advanced maintenance
technique focused on using technology, such
as oil analysis, vibration, or ultrasound, to determine
condition of assets and then taking appropriate actions
to avoid failures. Synonymous with Condition-Based
Maintenance (CBM) and On-Condition Maintenance.
Preventive maintenance (PM) – A maintenance strategy
based on inspection, component replacement and
overhauling at a fixed interval, regardless of condition
at the time; usually performed to assess the condition of
an asset; replacing service items (e.g., filters, oils, belts
and lubricating parts) is a typical PM task; a PM
inspection may require another work order to repair
other discrepancies found during the PM.
Risk priority number (RPN) – A technique used for
analyzing the risk associated with potential problems
identified during an FMEA; expresses the degree of risk
in terms of severity, probability of occurrence and
likelihood of detection; usually calculated before and
after the improvement; mathematically,
RPN = Severity × Occurrence × Detection.
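
The MTBF, MTTR and RPN definitions above reduce to simple arithmetic. The following sketch shows one way to compute them; the function names and the sample figures are hypothetical, and the 1-10 scoring scale for RPN is a common convention assumed here rather than something stated in the definitions.

```python
def mtbf(operating_hours: float, failures: int) -> float:
    """Mean time between failures: total operating time / number of failures."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to compute MTBF")
    return operating_hours / failures


def mttr(repair_hours: float, failures: int) -> float:
    """Mean time to repair: total repair time / number of failures."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to compute MTTR")
    return repair_hours / failures


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk priority number = Severity x Occurrence x Detection."""
    for score in (severity, occurrence, detection):
        if not 1 <= score <= 10:  # 1-10 scale is an assumption
            raise ValueError("each score is assumed to be on a 1-10 scale")
    return severity * occurrence * detection


# Hypothetical pump: 8,000 operating hours, 4 failures, 24 repair hours.
pump_mtbf = mtbf(operating_hours=8000, failures=4)
pump_mttr = mttr(repair_hours=24, failures=4)
failure_rate = 1 / pump_mtbf  # lambda, failures per operating hour

# RPN before and after a hypothetical improvement that lowers occurrence.
rpn_before = rpn(severity=8, occurrence=5, detection=3)
rpn_after = rpn(severity=8, occurrence=2, detection=3)
print(pump_mtbf, pump_mttr, failure_rate, rpn_before, rpn_after)
```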

Criticality Analysis Development


Why should an organization invest in a criticality
analysis process rather than conduct an FMEA? Properly
conducting an FMEA is a time-consuming and
resource-intensive activity. If an organization were to
attempt an FMEA on all its existing plant or facility
assets, it would consume almost all of its highly skilled
and specialized engineering resources and take so long
that the benefits the company might achieve would be
deferred.
As a rule, the largest percentage of an organization’s
total risk for its equipment and plant assets is concen-
trated on a small proportion of these items. These are the
equipment and plant asset items that should be involved
in the FMEA process. Therefore, the emphasis should be
on those items that are critical for sustaining continuous
operation of the equipment and plant assets. This must
be the focus of the criticality analysis.
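
One way to find the small proportion of assets that carries most of the risk is a Pareto-style cut on the risk figures. The sketch below is illustrative only: the function name, the asset names, the dollar values and the 80% cut-off are all assumptions, not values taken from this book.

```python
def top_risk_assets(risk_by_asset: dict[str, float], share: float = 0.8) -> list[str]:
    """Return the highest-risk assets that together reach `share` of total risk."""
    total = sum(risk_by_asset.values())
    selected: list[str] = []
    running = 0.0
    ranked = sorted(risk_by_asset.items(), key=lambda kv: kv[1], reverse=True)
    for asset, risk in ranked:
        if running >= share * total:
            break  # the selected assets already cover the requested share
        selected.append(asset)
        running += risk
    return selected


# Hypothetical annual risk ($/year) per asset.
risks = {
    "Compressor": 520_000,
    "Boiler feed pump": 310_000,
    "Conveyor": 80_000,
    "Cooling fan": 50_000,
    "Sump pump": 25_000,
    "Lighting panel": 15_000,
}

fmea_candidates = top_risk_assets(risks, share=0.8)
print(fmea_candidates)
```

In this made-up data set, two of the six assets account for more than 80% of the total risk, so they would be the first candidates for an FMEA.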
So, where is the starting point for a criticality analysis?
To understand a standardized approach of consequences
and severity of a failure, it’s best to review the follow-
ing chart (Figure 1) from the ISO14224 standard. The
categories on the chart define the type of failure based
on whether it’s catastrophic, severe, moderate, or minor.
Criteria for these categories must be determined by
each organization. For example, a failure that results in
death would be catastrophic; similarly, complete system
failure or production shutdown also could be viewed as
catastrophic. In the severe category, any injury or illness
that results from a failure would be considered severe.
However, the classification of system damage in the range
of $1 million may vary from company to company. So,
the dollar threshold for the severe, moderate and minor
categories becomes organization dependent.

Figure 1: The ISO14224 failure consequence block diagram

The operational consequences, which include
expenses, also introduce a subjective factor. For exam-
ple, what may be a very high maintenance cost to one
organization might not be as dramatic to another. As
such, setting the dollar amount in each column cate-
gory becomes dependent on the company. However, the
ISO14224 block diagram is an excellent starting point
for developing criticality analysis criteria. As a company
fills out this chart, it can determine which failures are
unacceptable and must be prevented at any cost, which
warrant a corrective measure at a reasonable cost, and
which represent an acceptable risk that can be managed
with a run-to-failure strategy.
The ISO14224 failure consequence diagram is also a
logical starting point for the severity of the failure. An
alternative approach utilized by some organizations is to
use a quantitative number determined by criteria such as
hours of downtime, cost of repair, asset cost, etc. Whether
an organization chooses to use qualitative or quantitative
measures to determine severity, a clearly defined approach
to the failure consequence ranking system becomes necessary.
Gathering input from production/operations, maintenance,
engineering, quality, materials management and
environmental, health and safety (EH&S) representatives
can replace individual perceptions of criticality
with agreement and a better understanding. As the
cross-functional team identifies factors, also known
as characteristics significant to the business, everyone
learns from others’ points of view. Examples of factors
that could be used to analyze assets include:

• Operations / Mission impact;
• Customer impact;
• Environmental, health, and safety impact;
• Product quality impact;
• Ability to isolate/recover from single point failures;
• Ability to detect failure before it occurs – early warning
capability;
• Maintenance cost impact;
• MTBF or reliability;
• MRO spares lead time;
• Asset replacement value;
• Asset utilization rate.

The team, based on their collective knowledge, can
choose the most appropriate factors.

Analysis Process Methodology
The suggested steps to conduct a criticality analysis are:
1. Select team members from cross-functional areas to
   perform the analysis;
2. Get the list of assets from the CMMS based on an
   established hierarchy scheme:
   a. Use ISO14224 as a guideline, if needed, to
      improve hierarchy and taxonomy;
3. Establish appropriate criteria and weighting factors
   for criticality analysis;
4. Apply criteria and develop a criticality ranking number
   for each asset, or assign Low (L), Medium (M), or
   High (H) criticality based on collective team knowledge
   and data available:
   a. Numerical results can be scaled and grouped,
      making it possible to classify asset groups by their
      functional importance to the business;
   b. Functional grouping can be classified into three
      types of assets:
      i. nonessential to operations, can be classified as
         (L) assets;
      ii. essential to operations, can be classified as (M)
         assets;
      iii. critical to operations, can be classified as (H)
         assets.

An example of a criticality analysis is shown in Table 1.
This example has five assets, with eight criticality criteria
factors. Each asset is rated on each factor on a scale of
1 to 5, with 5 being the highest impact. The next column
shows a cumulative raw score of all factors. In this example,
the weighting factor is assumed to be the same for all
factors. In the next two columns, the raw score is converted
to a scale of 100 (the raw score divided by the maximum
possible score of 40) and to a Low (L), Medium (M), or
High (H) rating.

Table 1: Example criticality analysis

Criticality criteria (numerical rating scale = 1-5, 5 being high impact):
(1) Mission - Operations Impact
(2) Customer Impact
(3) Safety - HSE Impact
(4) Regulatory Impact
(5) Single Point Failure
(6) Asset Replacement Cost
(7) Maintenance Cost
(8) Spare Lead Time

No.  Asset (ID, type, description)           (1) (2) (3) (4) (5) (6) (7) (8)  Raw Score  Weighted Rating (100)  L-M-H
1    A-001 Assembly machine                   4   4   2   1   3   3   3   3      23             57.5              M
2    Conveyor system                          2   1   2   1   2   1   1   1      11             27.5              L
3    Hydraulic Power unit                     2   1   3   3   2   2   2   2      17             42.5              M
4    Crane - OH 10 Ton                        3   1   2   2   1   2   1   2      14             35                L
5    Transformer Area Transformer unit -PT1   5   3   2   1   4   3   2   3      23             57.5              M

Criticality rating: Low (L) = 0-40, Medium (M) = 41-70, High (H) = 71-100
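
The scoring in Table 1 can be reproduced with a short script. This is a minimal sketch: the function and variable names are hypothetical, the ratings are copied from Table 1, and assigning scores that fall between the published bands (for example 40.5) to the higher band is an assumption.

```python
MAX_RATING = 5  # each criterion is rated 1-5, 5 being high impact

# Ratings for the eight criteria of Table 1, in column order (1) to (8).
assets = {
    "Assembly machine": [4, 4, 2, 1, 3, 3, 3, 3],
    "Conveyor system": [2, 1, 2, 1, 2, 1, 1, 1],
    "Hydraulic Power unit": [2, 1, 3, 3, 2, 2, 2, 2],
    "Crane - OH 10 Ton": [3, 1, 2, 2, 1, 2, 1, 2],
    "Transformer unit -PT1": [5, 3, 2, 1, 4, 3, 2, 3],
}


def criticality(ratings, weights=None):
    """Return (raw score, rating on a 100 scale, L/M/H band) for one asset."""
    if weights is None:
        weights = [1.0] * len(ratings)  # equal weighting, as in Table 1
    raw = sum(r * w for r, w in zip(ratings, weights))
    max_raw = sum(MAX_RATING * w for w in weights)
    score = 100 * raw / max_raw
    if score <= 40:
        band = "L"
    elif score <= 70:
        band = "M"
    else:
        band = "H"
    return raw, score, band


results = {name: criticality(ratings) for name, ratings in assets.items()}
for name, (raw, score, band) in results.items():
    print(f"{name:24s} raw={raw:4.0f}  rating={score:5.1f}  {band}")
```

Passing a `weights` list of eight numbers would let a team emphasize some criteria, such as safety, over others while keeping the result on the same 0-100 scale.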

Benefits of Criticality Analysis


Asset intensive businesses should embrace the asset crit-
icality ranking process and all the discovery that comes
with it. Understanding the ranking process and the
implications for work order execution are but a few of
the overall benefits. Removing most areas of subjectivity
from the work order process, capital investments and the
supply chain takes advantage of day-to-day maintenance
routines and supports the goals of an effective and
reliable asset management process.
By identifying the factors that make each asset crit-
ical, the analysis also provides valuable information for
deciding which actions will reduce the risk for all plant
assets.

What Every Reliability Leader Should Know
• Selecting the appropriate team members from
cross-functional areas is important for conducting a
good analysis;
• It is essential to establish good criteria and weighting
factors;
• Have an adequate number of criteria factors: two are
too few, 10 are too many;
• Knowledge of ISO14224 can be a helpful resource;
• Asset criticality (ranking) is one of the best ways to
develop an effective maintenance reliability improve-
ment plan.


Summary
Asset criticality is fundamental to asset management.
Organizations must define which of their assets are
critical and focus their maintenance reliability efforts on
those assets first. Criticality prioritizes which assets are
important to monitor, maintain and improve. There-
fore, performing a criticality analysis, identifying critical
assets and building a reliability, maintenance, or asset
management plan is a good strategy.
The ranking process requires the selection of team
members from cross-functional areas, such as produc-
tion/operations, engineering, maintenance, quality,
health, safety and environment, etc., to perform the anal-
ysis. The ranking process defines the relative importance
of asset failure consequences to the overall business. This
is accomplished by evaluating asset failure consequences
and the probability of failure against weighted criteria
within several business impact factors. Typically, the
business impact factors of mission/customer, safety,
quality, regulatory, throughput and cost impact are used
for an evaluation.
The next step is to establish appropriate criteria and
weighting factors for criticality. Knowledge of ISO14224
could be very helpful with this task. Then, apply the criteria
to each asset and either assign a numerical score or just
Low (L), Medium (M), or High (H) criticality based on
collective team knowledge and the data available.
To rank results, create a criticality list based on the
numerical criticality score for each asset, which then can
be put to use in a variety of ways, from daily workflow
management to reliability improvement to capital proj-
ect funding decisions.
Criticality analysis is an important tool that provides
valuable information for making decisions about work
priority, developing reliability strategies and justifying
resources to conduct RCAs, FMEAs, etc. Criticality
analysis helps ensure that resources are being spent on
the right assets to get more value for stakeholders.

Rsd
reliability strategy development

Introduction
Reliability strategy development (RSD) is based on
three main techniques:

• Reliability-centered maintenance (RCM);
• Preventive maintenance optimization (PMO);
• Failure mode and effects analysis (FMEA).

These three techniques serve as the proven foundations
of any successful reliability strategy. They all focus
on creating time directed (TD), condition directed (CD)
or failure finding (FF) tasks that make up a preventive
maintenance program. These tasks seek to minimize
system and component degradation, thus ensuring the
assets continue to do what their users require in their
present operating context. Each technique is a differ-
ently structured process to develop efficient and effective
maintenance plans for an asset to minimize its proba-
bility of failure.
Successful reliability strategies rely on the correct
combination and application of these techniques to
deliver value to organizations in a safe, cost-effective way.
Regardless of which technique is applied, successful
outcomes are more likely if the four phases of
strategic change are understood and applied.

Figure 1: Four phases of strategic change (Source: RCM
Project Managers' Guide. Reliabilityweb.com)

RSD relies on two areas of competence:

1. Understanding the differences between RCM, PMO
and FMEA.
2. Identifying when and where each technique should
be applied.

RCM
RCM is generally used to achieve improvements in all
aspects of asset management, such as the establishment
of a safe, minimum, or optimized level of maintenance,
changes in operating procedures and establishment of
an effective maintenance plan for the most critical systems.
Successful implementation of RCM promotes cost-
effectiveness, asset uptime and a better understanding of
the level of risk the organization is currently managing.
It has been demonstrated that the greatest benefit of
applying RCM is realized during the design and development
phases of the asset lifecycle by eliminating or
mitigating the effects of its failure modes. However, RCM
can be successfully applied at any time during an asset’s
lifecycle.
RCM development has been an evolutionary process.
More than 40 years have passed since its inception in
the 1970s, during which RCM has become a mature
process. However, industry has yet to fully embrace the
RCM methodology in spite of its proven track record.

PMO
A preventive/planned maintenance optimization pro-
cess focuses on evaluating each PM task and eliminating
unnecessary tasks or wasteful activities, thus improving
the plant’s overall performance. This allows a
resource-constrained maintenance organization to refocus
on effective failure prevention activities.

FMEA
Failure mode and effects analysis (FMEA), also some-
times called failure mode, effects and criticality analysis
(FMECA), is a step-by-step approach for identifying
all possible failures in design and operations (e.g., the
manufacturing process of a product or service).
Developed in the 1940s by the U.S. military, the
FMEA process was further developed and enhanced
by the aerospace and automotive industries. Now, it’s
being applied to eliminate or minimize all operational
failures (i.e., defects) in industrial and non-industrial
applications.
The ISO/TS16949 quality management systems
standard requires suppliers to conduct product/design
and process FMEAs in an effort to prevent failures
before they happen.

Key Terms and Definitions


Asset – A thing, entity, or item that has actual or poten-
tial value to an organization.
Condition-Directed (CD) Tasks – Tasks directly aimed
at detecting the onset of a failure or failure symptom.
Critical Asset – An asset that has been evaluated and clas-
sified as critical due to its potential impact on safety, the
environment, quality, production/operations and mainte-
nance if it fails.
Failure – The inability of an asset/component to per-
form its designed function.
Failure Finding (FF) Tasks – Scheduled tasks that seek
to determine if a hidden failure has occurred or is about
to occur.

Failure Mode – One of the different ways in which an asset
or component can fail to perform as intended.
Failure Mode and Effects Analysis (FMEA) – A
technique to examine an asset, process, or design to
determine potential ways it can fail and its potential
effects on required functions, and to identify appropriate
mitigation tasks for highest priority risks.
Hidden Failure – A failure mode that is not evident to
a person or operating crew under normal circumstances.
Operating Context – The environment in which an
asset is expected to be used.
Preventive Maintenance Optimization (PMO) – A
methodology focusing on improving maintenance
effectiveness and efficiency by reviewing an existing
maintenance program and, in most cases, adding main-
tenance tasks to account for failure modes not addressed
by the existing program.
Reliability-Centered Maintenance (RCM) – A system-
atic, disciplined process for establishing the appropriate
maintenance plan for an asset/system to minimize the
probability of failures. The process ensures safety, system
function and mission compliance.

Run to Failure (RTF) – A maintenance strategy or
policy for assets where cost and impact of failure are less
than the cost of preventive actions; a deliberate decision
based on economic effectiveness.
Time-Directed (TD) Tasks – Tasks directly aimed at
failure prevention and performed based on time, such as
calendar time or run time.
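
The economic side of a run-to-failure decision, as defined above, can be sketched as a simple cost comparison. This is a deliberately simplified illustration: the function name and the figures are hypothetical, and it assumes the preventive action would avoid the failures entirely and that the failure has no safety or environmental consequence.

```python
def prefer_run_to_failure(failures_per_year: float,
                          cost_per_failure: float,
                          preventive_cost_per_year: float) -> bool:
    """True when the expected yearly cost of failure is below the preventive cost."""
    expected_failure_cost = failures_per_year * cost_per_failure
    return expected_failure_cost < preventive_cost_per_year


# Hypothetical light fixture: cheap to replace, costly to inspect routinely.
light_fixture = prefer_run_to_failure(0.5, 200.0, 400.0)

# Hypothetical process pump: an expensive failure, inexpensive PM.
process_pump = prefer_run_to_failure(0.2, 50_000.0, 1_500.0)

print(light_fixture, process_pump)
```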

Purpose of Reliability Strategy Development
Reliability strategy development (RSD) is a systematic
approach for developing new maintenance requirements
where they do not exist or optimizing an existing main-
tenance program. In both cases, the end result of the
strategy application is a maintenance program com-
posed of tasks that represent a technically correct and
cost-effective approach to maintaining asset/compo-
nent operability. This operability, in turn, lends itself to
improved system reliability and plant availability.
Another important result of an RSD program is a
documented, technical basis for every maintenance pro-
gram decision. Linking each maintenance action to a
failure mode is key to the successful application of any
reliability strategy.

Reliability-Centered Maintenance Principles and Standards
There are four principles that define and characterize
RCM and set it apart from any other preventive main-
tenance planning process.

Principle 1: The primary objective of RCM is to preserve
system function.
Principle 2: Identify failure modes that can defeat the
functions.
Principle 3: Prioritize function needs (i.e., failure modes).
Principle 4: Select applicable and effective tasks.

In addition, RCM recognizes:

• Design Limitations – The objective of RCM is to


maintain the inherent reliability of system func-
tion. A maintenance program can only maintain the
level of reliability inherent in the system design; no
amount of maintenance can overcome poor design.
This makes it imperative that maintenance knowl-
edge be fed back to designers to improve the next
design of the system.

• Safety First, Then Economics – Safety must be
maintained at any cost; it always comes first in any
maintenance task. Hence, the cost of maintaining safe
working conditions is not calculated as a cost of RCM.

Effective is one of the key words in the RCM process.
It means you are sure the task will be useful and
are willing to spend resources to do it. Simply applying
a task just because it is possible to do or is applicable is
not sufficient justification.

RCM Standards
The SAE JA1011 standard describes the minimum criteria
with which a process must comply to be called RCM.
A highly simplified RCM decision framework is
shown in Figure 2.

Selecting Applicable and Effective Tasks


Time-based, intrusive preventive maintenance tasks
generally apply to:

• Bathtub curve, wear out and fatigue failure patterns;
• Single piece and simple items that frequently demonstrate
a direct relationship between reliability and age.
This is particularly true where factors, such as metal
fatigue or mechanical wear, are present or where the
items are designed as consumables (i.e., short or
predictable life spans).

In these cases, an age limit based on operating time
or stress cycles may be effective in improving the overall
reliability of the complex item of which they are a part.

[Figure 2: Simplified RCM decision framework (flowchart). Decision questions: Will the failure have a direct and adverse effect on environment, health, security, safety? Will the failure have a direct and adverse effect on mission (quantity or quality)? Will the failure result in other economic loss (high cost damage to machines or systems)? Is there an effective CM technology or approach? Is there an effective interval-based task? Outcomes: develop and schedule a CM task to monitor condition, then perform the condition-based task; develop and schedule an interval-based task; candidate for run-to-fail; redesign system, accept the failure risk, or install redundancy.]
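The decision framework in Figure 2 can be sketched in code. The sketch below is only an illustration: the branch routing is a reading of the flattened figure (any "Yes" on the three consequence questions leads to the condition monitoring question, and three "No" answers lead to run-to-fail), and the class, field and function names are hypothetical rather than part of any standard.

```python
from dataclasses import dataclass


@dataclass
class FailureAssessment:
    """Answers to the Figure 2 questions for one failure (hypothetical field names)."""
    affects_safety_or_environment: bool   # environment, health, security, safety
    affects_mission: bool                 # quantity or quality
    causes_economic_loss: bool            # high cost damage to machines or systems
    effective_cm_exists: bool             # effective CM technology or approach
    effective_interval_task_exists: bool  # effective interval-based task


def select_strategy(a: FailureAssessment) -> str:
    """Return a strategy label using the assumed branch routing described above."""
    consequential = (
        a.affects_safety_or_environment
        or a.affects_mission
        or a.causes_economic_loss
    )
    if not consequential:
        # No safety, mission or economic consequence was identified.
        return "candidate for run-to-fail"
    if a.effective_cm_exists:
        # Condition monitoring is evaluated before interval-based tasks.
        return "condition-based task"
    if a.effective_interval_task_exists:
        return "interval-based task"
    # No proactive task is effective, so fall back to the default actions.
    return "redesign, accept risk, or install redundancy"
```

For instance, a failure with a mission impact and an effective condition monitoring technology maps to a condition-based task, while a failure with no consequences at all is a candidate for run-to-fail.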

[Figure 3: Failure patterns (Source: Reliabilityweb.com). Six panels plot probability of failure against time. Age related panels: bathtub, wear out, fatigue. Random panels recovered: infant mortality, random. Random failures account for 77-92% of total failures and age related failure characteristics for the remaining 8-23%. Failure pattern percentage sources: RCM by Nowlan and Heap, US Navy, Bromberg. Reprinted with permission from NetexpressUSA Inc. d/b/a Reliabilityweb.com. Copyright © 2016.]

Condition directed tasks (i.e., condition monitoring)
generally apply to:

• Initial break-in, random and infant mortality failure
patterns;
• Complex items that frequently demonstrate some
infant mortality, after which their failure probability
increases gradually or remains constant, and a marked
wear out age is not common. In many cases, scheduled
overhaul increases the overall failure rate by introducing
a high infant mortality rate into an otherwise
stable system.

Failure characteristics (i.e., patterns) were first noted
in a 1978 report by Nowlan and Heap titled Reliability-Centered Maintenance.
Other studies in Sweden in 1973 and by the U.S. Navy
in 1983 produced similar results. In these three studies,
random failures accounted for 77 to 92 percent of total
failures and age related failure characteristics for the
remaining 8 to 23 percent.

NOTE 1: Only condition directed tasks can address
random failure. The applicability of these tasks is limited
by the amount of time associated with the P-F interval
for each failure mode.

NOTE 2: In a typical decision framework, condition
directed tasks are evaluated first because these types
of tasks are generally less intrusive, cheaper, quicker to
complete and enable the organization to plan and sched-
ule remedial work in advance of actual failure.

Failure finding tasks apply where a loss of function
does not become evident to the operator (i.e., a hidden
failure). A failure finding task can reduce the risk of
multiple failures to an acceptable level.

RCM Questions and Why They Matter


A process that answers the following seven essential ques-
tions can be termed reliability-centered maintenance.

1. What are the asset’s functions and desired standards
of performance in its present operating
context?
PURPOSE:
This question forms the foundation for effective deci-
sion-making. The asset is there to support the full
mission of the plant. Efforts to support the mission add
value; efforts that don’t impact it are wasted.

VALUE:
When all parties involved in plant success agree on
asset function (including the risks to avoid), they share an
understanding of what is important and why it adds value.
RISK OF NOT APPLYING THIS STEP:
A lack of understanding or agreement regarding asset
functions causes a lack of clarity regarding the right
thing to do. This leads to:
• Differing priorities;
• Inability to measure performance;
• Excess costs (i.e., not enough of the right thing or too
much of the wrong thing).
2. In what ways can the asset fail to fulfill its functions
(i.e., functional failures)?
PURPOSE:
This question focuses decisions on relevant functional
problems and on the degree to which these problems can
manifest themselves, from minor to severe.
VALUE:
Provides a logical connection between equipment failure
and the consequence of that failure to the component,
the system and the plant.

RISK OF NOT APPLYING THIS STEP:
Actions will waste resources preserving equipment,
while falling short of protecting the desired function or
missing failure effects between interdependent systems,
sometimes catastrophically.
3. What causes each functional failure (i.e., failure
mode)?
PURPOSE:
This question identifies what could go wrong: the component
failure mode that the decision will prevent or mitigate,
detect the onset of, or discover if hidden.
VALUE:
The decision is specific to a failure event that can be
managed optimally. A significant number of failure modes,
if not most, are identified.
RISK OF NOT APPLYING THIS STEP:
Decisions and resulting actions cannot be clearly linked
to the resulting performance of the system. (Hope is not
a strategy.)
4. What happens when each failure occurs (i.e., failure
effects)?
PURPOSE:
This question identifies how component failure impacts
other components, systems, the plant, surroundings, or
the ability to detect failures.
VALUE:
Detailed knowledge about adverse impacts, if any,
improves the quality of decisions made to manage them.
RISK OF NOT APPLYING THIS STEP:
Not understanding the effects of failure guarantees that
the consequences of failure are also unknown.
5. In what ways does each failure matter (i.e., failure
consequences)?
PURPOSE:
This question identifies how important the failure is to
control, prevent, or mitigate in terms of safety, opera-
tions, the environment and economics.
VALUE:
With infinite resources, you would address every poten-
tial problem equally. This question helps you identify
where you must actively manage failure and the extent
to which you must do so over other priorities.

RISK OF NOT APPLYING THIS STEP:
Not understanding the consequences of a failure means
you are depending on luck to prioritize the right
actions and might be allocating resources to something
unimportant.
6. What should be done to predict or prevent each
failure (i.e., proactive tasks and task intervals)?
PURPOSE:
This question compares action alternatives that could
potentially manage failure.
VALUE:
The best available action to manage failure while mini-
mizing costs is chosen.
RISK OF NOT APPLYING THIS STEP:
Actions chosen to manage failure may not be as appli-
cable, effective, or economical as other options.

7. What should be done if a suitable proactive task
cannot be found (i.e., default actions)?
PURPOSE:
This question manages risks that maintenance tasks
cannot address.

VALUE:
Helps an organization eliminate risk, rather than
live with it. Documentation from all seven questions
will ensure the risk is given the appropriate level of
consideration.
RISK OF NOT APPLYING THIS STEP:
The failure and its consequences are not under the con-
trol of the organization.
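
One way to keep the documented technical basis that the seven questions produce is a simple record per failure mode. The sketch below is illustrative only: the class and field names are hypothetical, and the structure is an assumption about how the answers might be stored, not a format prescribed by SAE JA1011.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FailureModeRecord:
    """One analysis row; each field holds the answer to one RCM question."""
    function: str                          # Question 1: function and performance standard
    functional_failure: str                # Question 2: how the function is lost
    failure_mode: str                      # Question 3: what causes the functional failure
    failure_effect: str                    # Question 4: what happens when it occurs
    consequence: str                       # Question 5: why the failure matters
    proactive_task: Optional[str] = None   # Question 6: task to predict or prevent
    task_interval: Optional[str] = None    # Question 6: how often the task is done
    default_action: Optional[str] = None   # Question 7: action if no task is suitable

    def is_resolved(self) -> bool:
        """A failure mode is resolved once it has a proactive task or a default action."""
        return self.proactive_task is not None or self.default_action is not None


def unresolved(records: List[FailureModeRecord]) -> List[FailureModeRecord]:
    """Return the failure modes that still have no documented decision."""
    return [r for r in records if not r.is_resolved()]
```

Because every task is stored against a failure mode, a review can list the failure modes that still lack a decision, which supports linking each maintenance action to a failure mode.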

Reliability Strategy Development Tools


Reliability-centered maintenance is a tool to ensure that
assets continue to do what their users require in their
present operating context. Its best application is during
development/design of a new asset; however, it is also
used to improve maintenance plans of existing assets.
FMEA is a primary tool used within RCM anal-
ysis to ensure you are accounting for all the failure
modes. Successful implementation of RCM leads to
increased cost-effectiveness, asset uptime and a greater
understanding of the level of risk the organization is
managing.
PMO is a process to optimize maintenance plans or
PMs of existing assets. It also uses FMEA as one of its
tools. The goal of both RCM and PMO is to establish a
cost-effective maintenance plan that ensures improved
asset availability and reliability.
PMO can be thought of as the reverse of RCM.
PMO starts with the task and works back to the failure
mode to ensure it is applicable and effective. RCM
starts with functions and functional failures, and then
assigns tasks to failure modes where those tasks are applicable
and effective.
Although RCM, PMO and FMEA have a great deal
of variation in their application, most procedures include
some or all of the following nine steps:

1. System selection and information collection;
2. System boundary definition;
3. System description and functional block diagram;
4. System functions and functional failures;
5. Failure mode and effects analysis (FMEA);
6. Logic (decision) tree analysis (LTA);
7. Selection of maintenance tasks;
8. Task packaging and implementation;
9. Making the program a living one through continuous
improvement.

RSD Tools Derivatives


There are many derivatives of RCM, PMO and FMEA.
All of these derivatives help perform analyses cost-effectively.
Most of them take some shortcuts, such as cutting
out some steps, considering only a limited number of
failure modes, or automating the process using software
to reduce the time taken to complete the analysis.
It is important for users of these tools and tech-
niques to understand the limitations imposed by these
shortcuts. This enables users to apply RSD with confi-
dence by knowing the right tool is selected at the right
time and driven by the criticality of the equipment/
systems.

Benefits of Reliability Strategy Development
Some of the benefits of implementing a reliability strat-
egy include:
• Enhanced Reliability – The primary goal is to
improve asset reliability and availability in a cost-effective
manner. This improvement comes through
constant reappraisal of the existing maintenance program
and eliminating or minimizing potential failure
modes during the lifecycle of an asset.
• Cost Reduction – Due to the initial investment
required to obtain the technological tools, training and
equipment condition baselines, a reliability program
sometimes results in a short-term increase in main-
tenance costs. This increase is relatively short-lived.
The cost of reactive maintenance decreases as failures
are prevented and preventive maintenance tasks are
replaced by condition monitoring. The net effect is a
reduction in reactive maintenance and a decrease in
total maintenance costs.
• Documentation – Reliability analysis utilizing RCM,
PMO, or FMEA allows for the understanding and
documentation of operations and maintenance key
features, failure modes, justification of PM tasks,
related drawings and manuals, etc. This documenta-
tion can be good training material for new operations
and maintenance personnel.
• Effective Equipment/Parts Replacement – With the
use of robust analysis tools, equipment/component
replacement is more likely to be based on equipment
condition, not on the calendar. This condition-based
approach to maintenance extends the life of the facility
and its equipment.
• Efficiency/Productivity – Safety and environmental
risks are the primary concerns. The second most
important concern is cost-effectiveness, which takes
into consideration the priority or mission criticality
and then matches a level of cost appropriate to
that priority. The flexibility of the RSD approach to
maintenance ensures the proper type of maintenance
is performed when it is needed. Maintenance that is
not cost effective is identified and not performed.

Benefits of PMO
If one were to conduct a survey among maintenance pro-
fessionals to ascertain how their PMs came about or the
basis of their program, the responses would probably fail
to provide definitive and meaningful information. Most
existing PM programs cannot be traced to their origins.
For those that can, most are unlikely to make sense.
The following reasons are usually the ones given for
a PM program:

• Original equipment manufacturer (OEM)
recommendations;
• Experience-based;
• Failure prevention;
• Brute force;
• Regulations.

Over time, you keep adding more and more tasks to
PMs without thinking about the cost and value of each
task. Eventually, PM tasks become ineffective. Perform-
ing too much PM, or an ineffective PM, can be costly.
Since PM optimization is a structured process used
to quickly improve the performance of existing assets
by eliminating unnecessary, redundant and ineffective
PMs, costs are reduced, maintenance is more effective
and asset performance is increased. From a financial per-
spective, reactive maintenance to fix failures typically
costs two to four times more than planned maintenance
due to its inherent inefficiencies.
There are similarities between some RCM and PMO
decision-making frameworks. However, they are not
identical.

PM optimization can lead to:

• Increased business revenue through increased asset
availability;
• Lower risk of specific asset failures;
• Improved preventive/predictive procedures, as well as
improved safety and environmental performance by
reducing safety and environmental risks;
• Motivated people focused on improving asset
reliability;
• A structured approach to reliability improvement.

[Figure 4: Evaluation of failures (Source: Nexus Global)]

What Every Reliability Leader Should Know
Reliability Strategy Development
RSD relies on two main areas of competence:
1. Understanding the differences between RCM, PMO
and FMEA;
2. Identifying when and where each technique should
be applied.

Reliability-Centered Maintenance

• RCM is performed to ensure assets continue to meet
performance requirements in their present operating
context.
• It is a rigorous, structured process to develop an efficient
and effective maintenance plan to minimize failures.
• It is used to establish a safe and optimum level of
maintenance and changes in the operating procedures.
• Best results are achieved when it is done as a multi-dis-
ciplined team effort.

PM Optimization
PM optimization is a best practice that is achieved by:
• Removing or enhancing all maintenance tasks that are
vague, don’t add any value, or are not cost-effective;

• Replacing calendar-based tasks with run-based,
condition-based, or run-to-failure tasks where feasible and cost
justified;
• Eliminating duplicate PMs, where different people
or groups are performing the same PMs on the same
assets;
• Assigning tasks appropriately between maintenance
and operations;
• Making PMO a living program, updating as needed.

Failure Mode and Effects Analysis


FMEA helps designers and engineers improve the
reliability of assets and systems to produce quality
products.
Although the purpose, terminology and other details
can vary according to the FMEA type, the basic methodology
is similar for all types. The typical sequence of
steps considers the following set of questions:

1. What are the components and the functions they
provide?
2. What can go wrong?
3. What are the causes?
4. What are the effects?
5. How bad are the effects?
6. How often can they fail?
7. How can this be prevented?
8. Can this be detected?
9. What can be done; what design, process, or procedural
changes can be made?

[Table 1: Typical modes of bearing failure]
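
Questions 5, 6 and 8 (how bad, how often, can it be detected) are often turned into a ranking. A common convention, which this text does not prescribe, is the risk priority number (RPN): the product of severity, occurrence and detection scores. The sketch below assumes 1 to 10 scales and uses hypothetical class and function names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FmeaRow:
    """One FMEA line item scored on assumed 1-10 scales."""
    failure_mode: str
    severity: int    # Question 5: how bad are the effects?
    occurrence: int  # Question 6: how often can it fail?
    detection: int   # Question 8: higher score means harder to detect

    def rpn(self) -> int:
        """Risk priority number = severity x occurrence x detection."""
        for score in (self.severity, self.occurrence, self.detection):
            if not 1 <= score <= 10:
                raise ValueError("scores must be between 1 and 10")
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(rows: List[FmeaRow]) -> List[FmeaRow]:
    """Sort failure modes so the highest-priority risks come first."""
    return sorted(rows, key=lambda row: row.rpn(), reverse=True)
```

The scores themselves come from team judgment, so the ranking is a prioritization aid rather than a measurement. Many practitioners also review high-severity items regardless of their RPN.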

FMEA helps to incorporate reliability and
maintainability features in the asset design to eliminate
or reduce failures, thereby reducing overall lifecycle costs.
Properly performed, FMEA provides several benefits:

• Early identification and elimination of potential
asset/process failure modes;
• Prioritization of asset/process deficiencies;
• Documentation of risk and actions taken to reduce
risk;
• Minimization of late changes and associated costs;
• Improved asset (i.e., product) and process reliability and
quality;
• Reduction of lifecycle costs;
• Catalyst for teamwork among design, operations and
maintenance.

Summary
Reliability-centered maintenance (RCM) is a process
to ensure assets continue to do what their users require
in their present operating context. The RCM process is
defined by the technical standard SAE JA1011, which sets
the minimum criteria that any process should meet before
it can be called RCM.
RCM is generally used to achieve improvements in
asset/plant operations, such as the establishment of safe
minimum levels of maintenance, including changes to
operating procedures. Successful implementation of
RCM leads to increased cost-effectiveness, asset uptime
and a greater understanding of the level of risk the organization
is managing.
The analysis of an asset, system and/or plant in
accordance with RCM methodology provides a set of
actionable tasks and improves the understanding of how
assets and systems operate and interact. It analyzes all
potential failure modes of an asset/system and develops
appropriate and cost-effective strategies, both mainte-
nance and operational tasks, to minimize failures. RCM
also determines a series of actions that ensure high asset/
system availability and provides documentation to sup-
port training personnel.
RCM emphasizes the use of predictive maintenance
techniques in addition to traditional preventive measures.
These types of preventive actions are aimed
at avoiding failures and increasing availability. They
include:

• Maintenance tasks, which are grouped into the maintenance
plan of an asset, system, or facility;
• Operating procedures for both production and
maintenance;
• Modifications or possible improvements;
• A defined series of training activities that are useful and
profitable for the company;
• Determination of important spare parts to keep in
stock at the facility.

RCM must be considered throughout the lifecycle
of an asset if it is to achieve maximum effectiveness.
According to many studies, about 80 percent or more
of an asset’s lifecycle cost is fixed during the planning,
design and build phases. The subsequent phases set the
remaining 20 percent or so of the lifecycle cost. Thus,
the decision to institute RCM for an asset, including
condition monitoring, will have a major impact on the
lifecycle cost of the asset. This decision is best made
during the planning and design phase.
FMEA helps designers and engineers improve the
reliability of assets and systems to produce quality prod-
ucts. Although the purpose, terminology and other
details can vary according to the FMEA type, the basic
methodology is similar for all types.
PMO can address most existing PM programs that
cannot be traced to their origins. For those that can,
most are unlikely to make sense. The following reasons
are usually the ones given for a PM program:

• OEM recommendations;
• Experience-based;
• Failure prevention;
• Brute force;
• Regulations.

PMO is a structured process that enables organizations
to create maintenance tasks. The main objective
is to maintain assets and facilities in satisfactory oper-
ating condition by providing for systematic inspection,
detection and correction of incipient failures either
before they occur or before they develop into a major
failure.

References
Society of Automotive Engineers. SAE JA1011, Evaluation
Criteria for Reliability-Centered Maintenance (RCM)
Processes, 1998.
http://standards.sae.org/ja1011_200908/
Society of Automotive Engineers. SAE JA1012, A Guide to
the Reliability-Centered Maintenance (RCM) Standard, 2002.
http://standards.sae.org/ja1012_200201/
Smith, Anthony M. and Hinchcliffe, Glenn R. RCM –
Gateway to World Class Maintenance. Waltham: Elsevier, 2004.
Moubray, John. Reliability-Centered Maintenance. New York:
Industrial Press, 1997.
Smith, Anthony M. Reliability-Centered Maintenance.
New York: McGraw Hill, 1993.
Gulati, Ramesh. Maintenance and Reliability Best Practices.
New York: Industrial Press, 2009/2012.
Nowlan, Stanley F. and Heap, Howard F. Reliability-Centered
Maintenance. U.S. Department of Defense:
Report Number AD-A066579 (pdf), 1978.
NASA. Reliability-Centered Maintenance Guide for Facilities
and Collateral Equipment (pdf). NASA: February 2000.
Paske, Sam. Developer of The 7 questions of RCM, 2013
RCM Project Managers' Guide, www.reliabilityweb.com

Re: Reliability Engineering

Introduction
Reliability engineering (RE) is a field that deals with the
study, evaluation and lifecycle management of reliability
for an asset or product. Reliability engineering is consid-
ered a sub-discipline of systems engineering.
Reliability engineering plays a significant role in
cost-effective operations and maintenance of an asset,
machine, or system by ensuring it consistently performs
its intended or required function or mission on demand
and without degradation or failure.
Many times, the terms reliability, availability and
maintainability (RAM) or reliability, availability, main-
tainability and safety or sustainability (RAMS) are used
in reliability engineering analysis.

Key Terms and Definitions


Asset – A thing, entity, or item that has actual or poten-
tial value to an organization.
Availability – The probability that an asset is capable
of performing its intended function satisfactorily, when
needed, in a stated environment. Availability is a func-
tion of reliability and maintainability.
Critical Asset – An asset that has been evaluated and
classified as critical due to its potential impact on safety,
environment, quality, production/operations and maintenance
if it fails.
Failure – The inability of an asset to perform its designed
function.
Failure Mode and Effects Analysis (FMEA) – A
technique to examine an asset, process, or design to
determine potential ways it can fail and the potential
effects of those failures on required functions, and to identify appropriate
mitigation tasks for highest priority risks.
Failure Rate – The number of failures of an asset over a
period of time. Failure rate is considered constant over
the useful life of an asset. It is normally expressed as the
number of failures per unit time. Denoted by Lambda
(λ), failure rate is the inverse of mean time between
failures.
Hidden Failure – A failure mode that will not become
evident to a person or the operating crew under normal
circumstances.
Maintainability – The ease and speed with which a
maintenance activity can be carried out on an asset. A
function of equipment design that is usually measured
by mean time to repair.

Mean Time Between Failures (MTBF) – A basic
measure of asset reliability calculated by dividing total
operating time of the asset by the number of failures over
a period of time. MTBF is the inverse of failure rate (λ)
and is generally used for repairable systems.
Mean Time to Repair (MTTR) – A basic measure of
maintainability, it represents the average time needed
to restore an asset to its full operational condition after
a failure.

Operating Context – The environment in which an
asset is expected to be used.
Reliability – The probability that an asset, item, or
system will perform its required functions satisfactorily
under specific conditions within a certain time period.
Reliability Centered Maintenance (RCM) – A system-
atic, disciplined process for establishing the appropriate
maintenance plan (requirements) for an asset/system to
minimize the probability of failures. The process ensures
safety, system function and mission compliance under
present operating context.
Run to Failure (RTF) – A maintenance strategy or
policy for assets where the cost and impact of failure are
less than the cost of preventive actions. It is a deliberate
decision based on economic effectiveness.
Uptime – The time during which an asset or system
is either fully operational or is ready to perform its
intended function.

Purpose of Reliability Engineering


The goal of reliability engineering is to evaluate the
inherent reliability of an asset or process and pinpoint
potential areas for reliability improvement. Realistically,
not all failures can be eliminated from a design, so
another goal of reliability engineering is to identify the
most likely failures and then identify appropriate actions
to mitigate the effects of those failures.
The reliability evaluation of an asset can include a
number of different analyses. Depending on the phase
of the asset lifecycle, certain types of analysis are more
appropriate than others. The different reliability analyses
are interrelated and help to examine the reliability of the
asset from different perspectives in order to determine
possible problems, find solutions and make improvements.
The reliability engineering activity is an ongoing
process starting at the conceptual phase of an asset or
product design and continuing through all phases of its
lifecycle. The goal is always to identify potential problems
as early as possible in the lifecycle and improve
reliability.

Role of Reliability
The primary role of a reliability professional/engineer
(RP/E) is to identify and manage the reliability risks
of an asset that could adversely affect plant or business
operations. This broad primary role can be divided into

three key areas:

• Loss (production) reduction or elimination – One of
the essential roles of the RP/E is to track operations/
production losses, identify assets with abnormally
high maintenance costs and then find ways to reduce
those losses or high costs. These losses are prioritized
to focus efforts on the largest and most critical oppor-
tunities. The RP/E, in partnership with the operations
team, develops a plan to eliminate or reduce the losses
through root cause analysis, obtains approval of the
plan and facilitates the implementation.
• Risk management – Another role of the RP/E is to
manage risk for the achievement of an organization’s
strategic objectives in the areas of asset capability (to
ensure fewer failures), quality, safety and health, and
operations/production. Some tools used by a reliability
engineer to identify and reduce risk include: cause and
effects analysis; criticality analysis; FMEA; fault tree
analysis; Pareto analysis; RAMS analysis; root cause
analysis; and safety hazards analysis.
• Asset lifecycle management – Studies have shown
that 80 percent or more of the total cost of ownership
or lifecycle cost of an asset is determined before it is
put into use. This reveals the need for the reliability
engineer to be involved in the asset requirements, and
the design/development and installation stages of proj-
ects for new assets and modification of existing assets.

Some responsibilities and duties commonly found in
the job description of a reliability professional/engineer
include, but are not limited to:

• Interfacing with capital project management/engineering
to ensure the reliability, maintainability, safety
and sustainability of new and modified assets.
• Participating in the development of design and instal-
lation specifications, commissioning plans, criteria for
and evaluation of asset and technical MRO suppliers
and technical maintenance service providers, and sup-
porting acceptance tests and inspection criteria.

• Participating in the final check of new installations,
including factory and site acceptance testing that will
assure adherence to functional specifications.
• Ensuring reliability, maintainability, safety/sustain-
ability of assets, processes, utilities, facilities, controls
and safety/security systems throughout their entire
lifecycle.
• Providing support to define, design, develop, monitor
and refine an asset maintenance plan that includes

RCM-based preventive maintenance tasks and effec-
tive utilization of predictive and other non-destructive
testing methodologies to identify and isolate inherent
reliability problems.
• Providing input to a risk management plan that will
anticipate reliability-related and non-reliability-related
risks that could adversely impact plant operations.
• Providing support in finding engineering solutions to
repetitive failures and all other problems that adversely
affect plant operations, such as capacity, quality, cost,
or regulatory compliance issues, by applying data
analysis techniques that can include statistical process
control; reliability modeling and prediction; fault tree
analysis; Weibull analysis; Six Sigma methodology;
and root cause failure analysis.

• Working with production to perform analyses of
assets, including asset utilization, overall equipment
effectiveness, remaining useful life and other param-
eters that define operating condition, reliability and
costs of assets.

Measuring Reliability
Reliability, maintainability and availability are three key
terms in reliability engineering. Although we say asset
reliability improvement, many times what we really mean
is availability. Availability (A) is a function or product of
reliability and maintainability of the asset. It is measured
by the degree to which an item or asset is in an operable
and committed state at the start of the mission when the
mission is called at an unspecified (random) time.
In simple terms, availability may be stated as the
probability that an asset will be in operating condition
when needed. Mathematically, availability is defined as follows,
where MTBF is mean time between failures and MTTR is
mean time to repair:

\[
\text{Availability } (A) = \frac{\text{Uptime}}{\text{Uptime} + \text{Downtime}} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}
\]

Reliability (R) is defined as the probability that an
item/asset will perform its intended function for a
specific interval under stated conditions. Reliability is
usually measured by MTBF and calculated by dividing
operating time by the number of failures. For example,
suppose an asset was in operation for 2000 hours (or for
12 months) and during this period there were 10 failures.
The MTBF for this asset is:

MTBF = 2000 hours ÷ 10 failures = 200 hours per failure
or
MTBF = 12 months ÷ 10 failures = 1.2 months per failure

A larger MTBF generally indicates a more reliable
asset or component.
Maintainability (M) is the measure of an asset’s abil-
ity to be retained or restored to a specified condition
when maintenance is performed by personnel having
specified skill levels and using prescribed procedures
and resources at each stage of maintenance and repair.
Maintainability is usually expressed in hours by MTTR,
or sometimes by mean downtime (MDT). MTTR is
the average time to repair assets. It is pure repair time
(called wrench time by some). In contrast, MDT is the
total time the asset is down, which includes repair time
plus additional waiting delays.
In simple terms, maintainability usually refers to
those features of assets, components, or total systems
that contribute to the ease of maintenance and repair.
A lower MTTR generally indicates easier maintenance
and repair.
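
As a minimal sketch of how these measures relate, the following Python functions compute MTBF and availability. The function names and the repair figures (5 hours of MTTR, 8 hours of MDT) are assumptions made for illustration; only the 2000 operating hours and 10 failures come from the example above. Substituting MDT for MTTR in the availability formula is also an assumption, used here to show how waiting delays lower the result.

```python
def mtbf(operating_hours: float, failures: int) -> float:
    """Mean time between failures: operating time divided by failure count."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to compute MTBF")
    return operating_hours / failures


def availability(mtbf_hours: float, restore_hours: float) -> float:
    """Availability A = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + restore_hours)


# Figures from the text: 2000 operating hours and 10 failures.
asset_mtbf = mtbf(2000, 10)  # 200 hours per failure

# Hypothetical repair figures, not taken from the text.
assumed_mttr = 5.0  # pure repair (wrench) time, in hours
assumed_mdt = 8.0   # repair time plus waiting delays, in hours

availability_repair_only = availability(asset_mtbf, assumed_mttr)
availability_with_delays = availability(asset_mtbf, assumed_mdt)

print(f"MTBF: {asset_mtbf:.0f} hours")
print(f"Availability using MTTR: {availability_repair_only:.3f}")
print(f"Availability using MDT:  {availability_with_delays:.3f}")
```

Under these assumed figures, the availability computed with MDT is lower than the one computed with MTTR, which is why it matters which downtime measure a reported availability is based on.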

Software Reliability
Software reliability is a special aspect of reliability engi-
neering. Asset/system reliability, by definition, includes
all parts of the system, including hardware, software,
supporting infrastructure (including critical external
interfaces), operators and procedures. Traditionally, reli-
ability engineering focuses on critical hardware parts of
the system. Since the widespread use of digital integrated
circuit technology, software has become an increasingly
critical part of nearly all present day assets/systems.
As with hardware, software reliability depends on
good requirements, design and implementation. Software
reliability engineering relies heavily on a disciplined
software engineering process to anticipate and design
against unintended consequences. A common reliability
metric is the number of software faults, usually expressed
as faults per thousand lines of code. This metric, along
with software execution time, is a key to most software
reliability models and estimates.
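
The fault metric described above can be sketched in a few lines of Python. The function name and the sample figures (30 faults in 60,000 lines of code) are hypothetical and serve only to show the arithmetic.

```python
def fault_density(fault_count: int, lines_of_code: int) -> float:
    """Return software faults per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return fault_count / (lines_of_code / 1000)


# Hypothetical figures, not taken from the text.
density = fault_density(fault_count=30, lines_of_code=60_000)
print(f"{density} faults per thousand lines of code")
```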

Benefits of Reliability
Asset reliability is an important attribute for several rea-
sons, including:

• Improves Customer Satisfaction. Reliable assets will
perform to meet customer needs on time, every time.
• Increases Repeat Business. Customer satisfaction
will bring repeat business and have a positive impact
on future business.
• Enhances Reputation. The more reliable plant assets
are, the more likely the organization will have a favor-
able reputation.
• Reduces Operations and Maintenance Costs. Poor
asset performance costs more to operate and maintain.
• Improves Competitive Advantage. With greater
emphasis on a plant reliability improvement program,
companies gain an advantage over their competition.

What Every Reliability Leader Should Know
• Reliability engineering is a field that deals with
the study, evaluation, and lifecycle management of
reliability.
• The goal of reliability engineering (RE) is to evaluate
the inherent reliability of an asset or process and to
identify potential areas for reliability improvement.
• The role of a reliability professional is:
  • Reduction / elimination of loss (production)
  • Risk management
  • Asset lifecycle management
• MTBF and Uptime (Availability) are two key perfor-
mance measures.

Summary
Reliability engineering is a relatively new discipline. Its
growth and importance have been the result of several
factors, including the increased complexity and sophis-
tication of assets/systems, regulatory and community
requirements to meet reliability, maintainability, safety
and sustainability performance specifications, and an
organization’s profit concerns resulting from the high
cost of failures and their repairs.
Reliability engineering should be an ongoing process
that starts at the conceptual phase of an asset design and
continues throughout all phases of its lifecycle. The goal
always needs to be to identify potential reliability prob-
lems as early as possible in the asset/product lifecycle.
While it may never be too late to improve the reliability
of an asset, changes to a design are less expensive in the
early part of a design phase than once the asset is built
and put into service.

Reliability, along with availability, maintainability,
safety and sustainability, is not only an important part
of the engineering design process, but also a necessary
function of asset lifecycle management. Reliability engineering
provides support in reducing the total cost of
asset ownership by providing cost-benefit analysis, operational
capability loss-risk studies/analysis, repair and
facility resourcing optimization, replacement decisions,
spare parts and inventory optimization, establishment of
an optimum maintenance or PM program, etc.

Rca
root cause
analysis

Introduction
Root cause analysis (RCA) is a method of problem
solving that tries to identify the root causes of faults or
problems that cause failure events.
RCA can help transform a reactive culture into a
forward-looking culture that solves problems before
they occur or escalate. More importantly, it reduces the
frequency of problems occurring over time within the
environment. Having unreliable asset performance can
be a threat in many cultures and environments. Old
measures that pit production against maintenance may
have to be removed from the system. Empowering defect
elimination and cross-training teams may be required to
overcome cultural resistance.
Root cause analysis, or root cause failure analysis
(RCFA) as it is sometimes called, is a step-by-step
methodology that leads to the discovery of the prime
cause (or the root cause) of the failure. If the root cause
of a failure is not addressed in a timely fashion, the fail-
ure will repeat itself, usually causing unnecessary loss
of production and increasing the cost of maintenance.
RCA is a structured way to arrive at the root cause,
thus facilitating elimination of the cause and not just
the symptoms associated with it.

Key Terms and Definitions


Asset – A thing, entity, or item that has actual or poten-
tial value to an organization.
Checklist – A structured, preprepared form for collect-
ing, recording and analyzing data as work progresses.
The generic tool can be developed for a wide variety
of purposes, such as an operator’s start-up checklist, a
preventive maintenance checklist, and a maintainability
checklist used by designers.
Failure – The inability of an asset to perform its designed
function.
Problem – A perceived gap between the existing state
and a desired state, or a deviation from a norm, standard,
or status quo.
Problem Chain – The series of symptoms that initiated
a problem.
Root Cause – Failure or fault from which a chain of
effects or failures originates.
Root Cause Analysis (RCA) – Identification and evaluation
of the reason for an undesirable condition or
nonconformance; a methodology that leads to the discovery
of the cause of a problem or root cause.
Root Cause Failure Analysis (RCFA) – Investigative
technique applied to the determination of factors leading
to an initiating or original failure.
Symptom – A condition that is produced by a problem,
not the actual problem.

Purpose of Root Cause Analysis


Assets, components and processes can fail for a number
of reasons. But usually there is a definite progression of
actions (problem chain and consequences) that leads to a
failure. The RCA investigation traces the cause and effect
trail from the failure back to the root cause. The primary
purpose of performing an RCA is to analyze problems or
events to identify what happened, how it happened and
why it happened so actions for preventing reoccurrence
can be developed.
To be effective, RCA must be performed systemati-
cally, usually as part of an investigation, with conclusions
and root causes that are identified and backed up by
documented evidence. Usually, a team effort is required.
There may be more than one root cause for an event or
problem; the difficult part is demonstrating persistence
and sustaining the effort required to determine them.
The purpose of identifying all solutions to a problem is
to prevent reoccurrences at lowest cost in the simplest
way. If there are alternatives that are equally effective,
then the simplest or lowest cost approach is preferred.
Identifying root causes depends on the way in which
the problem or event is defined. Effective problem
statements and event descriptions are helpful, or even
required. To be effective, the analysis should establish
a sequence of events or a timeline to understand the
relationships between contributory (causal) factors, root
cause(s) and the defined problem or event to be prevented in
the future.

Root Cause Analysis Process


When we have a problem, how do we approach it for a
solution? Do we jump in and start treating the symp-
toms? If we only fix the symptoms, based on what we see
on the surface, the problem will almost certainly happen
again. Then we will keep fixing the problem, again and
again, without ever solving it.
The practice of RCA is predicated on the belief that
problems are best solved by attempting to correct or
eliminate root causes, as opposed to merely addressing
the immediately obvious symptoms. By directing corrective
measures at root causes, the likelihood of problem
reoccurrence will be minimized. In many cases, complete
prevention of reoccurrence through a single intervention
is unlikely. Therefore, RCA is often considered an
iterative process; it is frequently viewed as part of a continuous
improvement toolbox.
Root cause analysis is not a single, defined method-
ology; there are several types or philosophies of RCA in
existence. Most of these can be classified into four, very
broadly defined categories based on their field of appli-
cation: safety-based, production-based, process-based
and asset failure-based.

1. Safety-based RCA is performed to find causes of
accidents related to occupational safety, health and
environment.
2. Product or production-based RCA is performed to
identify causes of poor quality, production and other
problems in manufacturing related to the product.
3. Process-based RCA is performed to identify causes
of problems related to processes, including business
systems.
4. Asset failure-based RCA is performed for failure
analysis of assets or systems in engineering and the
maintenance area.

Despite the seeming disparity in purpose and definition
among the various types of root cause analysis,
there are some general principles that can be considered
universal. The RCA process involves six steps:

• Define the problem (the failure).
• Collect data/evidence about issues that contributed
to the problem.
• Identify possible causal factors.
• Develop solutions and recommendations.
• Implement the recommendations.
• Track the recommended solutions to ensure
effectiveness.

Root Cause Analysis Tools


The nature of RCA is to identify multiple contributing
factors to a problem or event. This is most effectively
accomplished through an analysis method. Here are
some methods used in RCA.

• 5 Whys Analysis – A problem-solving technique for
discovering the root cause of a problem. This technique
helps users to get to the root of the problem quickly by
simply asking “why” a number of times until the root cause
becomes evident.
• Barrier Analysis – An investigation or design
method that involves the tracing of pathways by
which a target is adversely affected by a hazard,
including the identification of any failed or missing
countermeasures that could or should have prevented
the undesired effect(s).
• Causal Factor Tree Analysis – An investigation and
analysis technique used to record and display, in a logical,
tree-structured hierarchy, all the actions and conditions
that were necessary and sufficient for a given conse-
quence to have occurred.
• Cause Mapping® – A simple, but effective method of
analyzing, documenting, communicating and solving a
problem to show how individual cause and effect relationships
are interconnected.
• Cause and Effects Analysis – Also called Ishikawa or
fishbone diagram, it identifies many possible causes for
an effect or problem and then sorts ideas into useful
categories to help in developing appropriate corrective
actions. The design of the diagram looks like the skeleton
of a fish, hence the designation “fishbone” diagram.
• Change Analysis – Looks systematically for possible
risk impacts and appropriate risk management strategies
in situations where change is occurring. This includes
situations in which system configurations are changed,
operating practices or policies are revised, new or different
activities will be performed, etc.
• Failure Mode and Effects Analysis – A technique to
examine an asset, process, or design to determine poten-
tial ways it can fail and its potential effects on required
functions, and subsequently identify appropriate mitiga-
tion tasks for highest priority risks.
• Fault Tree Analysis – This analysis tool is constructed
starting with the final failure or event and progressively
tracing each cause that led to the previous cause. This
continues until the trail can be traced back no further.
Once the fault tree is completed and checked for logical
flow, it can be determined which changes would prevent
the sequence of causes or events with marked conse-
quences from occurring again.
• Pareto Analysis (80/20) – A statistical technique in
decision making used to select the limited number of
causes or tasks that produce the most significant overall
effect. The premise is that roughly 80 percent of problems
are produced by a vital few causes (about 20 percent).
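
To illustrate the Pareto premise, here is a minimal Python sketch that ranks causes by failure count and returns the "vital few" that account for a chosen share of all failures. The cause names, the counts and the 80 percent threshold are hypothetical; a real analysis would draw the counts from failure records.

```python
def pareto_table(failure_counts):
    """Rank causes by count and attach the cumulative share of all failures."""
    total = sum(failure_counts.values())
    if total == 0:
        return []
    ranked = sorted(failure_counts.items(), key=lambda item: item[1], reverse=True)
    table = []
    running = 0
    for cause, count in ranked:
        running += count
        table.append((cause, count, running / total))
    return table


def vital_few(failure_counts, threshold=0.8):
    """Return the top-ranked causes needed to reach the cumulative threshold."""
    selected = []
    for cause, _count, cumulative in pareto_table(failure_counts):
        selected.append(cause)
        if cumulative >= threshold:
            break
    return selected


# Hypothetical failure log: number of failure events attributed to each cause.
counts = {
    "bearing wear": 42,
    "misalignment": 30,
    "lubrication": 12,
    "seal leak": 8,
    "operator error": 5,
    "electrical": 3,
}

for cause, count, cumulative in pareto_table(counts):
    print(f"{cause:15s} {count:3d}  {cumulative:.0%}")
print("Vital few:", vital_few(counts))
```

With these assumed counts, three of the six causes account for more than 80 percent of the failures, so corrective effort would be directed at those first.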


Benefits
RCA solves problems at their root, rather than just fixing
the obvious. It is often equated to a Kaizen improvement
process, and rightly so, as it often digs into possible orga-
nizational change, rather than localized optimizations.
The benefits of RCA include uncovering relationships
between causes and symptoms of problems, working to
solve issues at the root itself and providing tangible evi-
dence of cause and effect and solutions. RCA can:

• Identify barriers and causes of problems so permanent
solutions can be found.
• Identify true root causes.
• Eliminate repeated failures.
• Identify major, long-term opportunities for
improvement.
• Reduce costs and increase revenue.
• Enable organizations to expand findings to multiple
sites.

What Every Reliability Leader Should Know
• RCA is a problem-solving method.
• It is a step-by-step methodology that leads to the
discovery of the prime cause (or the root cause) of
a failure.
• The primary purpose of performing an RCA is to analyze
problems or events to identify:
  • What happened;
  • How it happened;
  • Why it happened…so that actions for preventing
    reoccurrence are developed.
• It can help to transform a reactive culture into a
forward-looking culture that solves problems before they
occur or escalate.
• To be effective, RCA must be performed systematically;
usually a team effort is required.

Summary
When do most organizations conduct an RCA? Typically
when someone is injured, when there is catastrophic
damage, when there has been an “incident” or when
there has been an environmental release, violation, etc.
For most of these high-visibility occurrences, some federal
or state regulatory agency requires us to perform an
analysis. As a result, we often conduct RCAs only in an
effort to comply with regulatory requirements. RCA need
not be performed for compliance alone; its real
benefit is its disciplined and comprehensive methodology
for eliminating the root problem at the source.
Root cause analysis is not a one-size-fits-all meth-
odology. There are many different tools, processes and
philosophies for accomplishing RCA. In fact, it was born
out of a need to analyze various enterprise activities,
such as:

• Accident analysis and occupational safety and health.


• Quality issues.
• Efficient business processes.
• Engineering and maintenance failure analysis.

• Various systems-based processes, including change
management and risk management.
Organizations must continually improve processes,
reduce costs and cut waste to remain competitive. To
improve any process or resolve any failure or problem,
including potential failures, it needs to be analyzed using
tools and techniques for developing and implementing
corrective actions. A variety of methods, techniques
and tools are available, ranging from a simple checklist
to sophisticated modeling software. They can be used
effectively to lead us to appropriate corrective actions.
Applying continuous improvement tools can optimize
work processes and help any organization improve
its results, regardless of the size or type of business
environment.
RCA is a process that introduces lasting organizational
improvements in many situations and, most importantly,
provides a learning process to follow for a
thorough understanding of relationships, causes and
effects, and solutions. By practicing RCA, we avoid
taking action on merely possible causes and delay a response
until the last responsible moment, when the actual root cause
of an effect is identified.

Cp
capital project
management
Introduction
Capital project management (CP) is the management of
all capital asset purchases, from the investment requirements
definition to commissioning. Capital project
management focuses on managing the capital expenditure
for an asset from the time business’ needs determine
the design of the asset to the approximate capital expenditure
required. CP also determines the scope of the
project (required capacity, size of asset, financial justification,
etc.), the supplier evaluation and selection,
and the execution of the project, which is typically the
installation and commissioning phases when the asset
is turned over to operations and maintenance. This flow
is depicted in the Uptime Elements Asset Management
Timeline.

[Figure: Uptime Elements Asset Management Timeline, showing the asset lifecycle beginning with Business Needs Analysis, Design and Create/Acquire. Reprinted with permission from NetexpressUSA Inc. d/b/a Reliabilityweb.com. Copyright © 2016. All rights reserved.]

Key Terms and Definitions
Acceptance Criteria – Requirements a project or system
must meet before a customer can accept delivery.

Acceptance Test – A test conducted under specified
conditions using delivered items to determine compliance
with specified requirements.
Acquisition – Obtaining equipment or assets for use by
an organization in its business.
Asset Lifecycle – Stages or phases involved in the man-
agement of an asset during its life. These phases include
concept, design and development, build, install and com-
mission, operations, maintenance, decommissioning and
disposal.
Asset Performance Management – A set of work pro-
cesses used to maximize asset performance, mitigate risk
and maximize return on investment for a business.
Capacity – The maximum sustainable output rate that
can be achieved for a current product utilizing existing
worker effort, equipment and facilities.
Capital Asset – A physical asset that is held by an orga-
nization for its production potential.
Capital Project – Projects that include new construc-
tion, major repairs, or improvement, where the cost is
capitalized rather than expensed.
Failure – The inability of an asset to perform its designed
function.
Lifecycle – The stages involved in the management of
an asset.
Lifecycle Cost – The total cost of ownership during
the life of the asset, including design/development,
fabrication, installation and commissioning, operation,
maintenance and disposal.
Project – A temporary undertaking to create a product
or improve asset condition with a defined start and end
point and specific objectives that, when attained, signify
completion.
Project Management – The application of specific
knowledge, skills, tools and techniques to activities
during a project to complete the project on time, on
budget and meet project requirements.
Reliability – The probability that an asset, item, or
system will perform its required functions satisfactorily
under specific conditions within a certain time period.
System Design – The translation of customer require-
ments into a comprehensive, detailed, functional
performance or design specification that is then used to
construct a specific asset.
Systems Engineering – A discipline applying technical
and administrative direction and surveillance to identify
and document the functional and physical characteristics
of an asset/system called a configuration item and to
control changes to those characteristics, and record and
report those changes.

Developing Capital Project Management


The business needs analysis to determine when new
assets are required is the starting point for capital project
management. In some organizations, this is referred
to as investment planning; however, it is really the needs
and feasibility assessment for the existing asset portfolio.
There are at least four reasons for beginning investment
planning:

• The discovery of a new product or service that the
company needs to produce.
• A greater demand for an existing product or service
that requires additional assets.
• A requirement for the company to build a plant in a
specific geographical location to meet a customer
requirement.
• Increased regulatory requirements for existing
assets.

The investment planning portion of the asset’s lifecycle
is dictated by the organization’s strategic plan. The
strategic plan may be directing the company to diversify
or expand into new markets. Strategic planning may
also dictate that the company’s direction is to expand
its share of an existing market. Investment planning
quantifies the financial benefits and risks. If it is to be
effective, the strategic plan should involve a thorough
understanding of existing customers’ needs. Custom-
ers may be demanding modifications or enhancements
to existing products or services that require new assets.
The strategic plan also should be sensitive to increased
regulatory requirements. There may be new regulatory
requirements that require extensive modifications or new
assets to keep existing buildings, facilities, processes and
equipment in compliance with new regulations. Whatever
the reason, the strategic plan must be linked to
investment planning for the return on investment for
the new assets to be properly managed. Capital project
management is a business requirement.
Once the strategic plan identifies the need for addi-
tional assets, a study should be done as part of the
investment planning process to examine the utilization
of existing assets. It is quite common in many companies
today to find that existing assets are underutilized. If
these underutilized assets were more fully utilized, many
capital expenditures could be avoided. This area should
always be closely examined before the decision is made
to purchase any new assets.
In the project definition phase of the asset’s lifecycle,
the scope and specifications of the asset are defined.
It is necessary for the asset to meet the demand
identified in the investment planning phase. This means
the asset will have to meet certain requirements. There
are certain reliability, maintainability, projected life
and total cost of ownership requirements that all assets
need to meet to support business requirements. Some
additional concerns include: What is the production
volume that must be achieved to meet the business
need? Will the asset be required to perform in a 24 x
7 operation or will it be a 24 x 5 schedule? Reliability
and maintainability are critical to the decision on the
design capacity of the asset and the profitability of the
new product or service.
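
One way to see why reliability and maintainability bear on design capacity is to adjust the required production rate for expected availability. The Python sketch below is an illustration under stated assumptions, not a sizing method from the text: the function names, the demand of 1,000 units per scheduled hour and the expected availability of 0.8 are all hypothetical.

```python
def required_design_rate(demand_per_hour: float, expected_availability: float) -> float:
    """Design rate needed so that average output still meets demand."""
    if not 0 < expected_availability <= 1:
        raise ValueError("expected_availability must be in (0, 1]")
    return demand_per_hour / expected_availability


def weekly_output(design_rate: float, hours_per_day: float,
                  days_per_week: float, expected_availability: float) -> float:
    """Expected weekly output for a given operating schedule."""
    return design_rate * hours_per_day * days_per_week * expected_availability


# Hypothetical figures, not taken from the text.
demand = 1000.0             # units needed per scheduled hour
assumed_availability = 0.8  # fraction of scheduled time the asset is running

design_rate = required_design_rate(demand, assumed_availability)
output_24x7 = weekly_output(design_rate, 24, 7, assumed_availability)
output_24x5 = weekly_output(design_rate, 24, 5, assumed_availability)

print(f"Required design rate: {design_rate:.0f} units per hour")
print(f"Weekly output on a 24 x 7 schedule: {output_24x7:.0f} units")
print(f"Weekly output on a 24 x 5 schedule: {output_24x5:.0f} units")
```

Under these assumptions, a less available asset must be designed for a higher rate to meet the same demand, and the choice of operating schedule changes the weekly volume it can deliver.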
Once the scope and specifications are finalized, the
next step is a cost-benefit analysis. Will the company
specify a facility building that is designed for 500 people
when the business plan requires a hundred employees?

When considering production assets, if, for example, the
asset needs to produce 1,000 bottles of beer per hour,
will the company design a line that produces 10,000
bottles of beer an hour? Or will it design a line that is
only capable of producing 500 bottles of beer per hour?
Any mistakes in designing assets where the design is not
based on meeting the company’s long-range strategic
plan will result in extreme financial penalties for the
company.
It must be kept in mind that the asset is still only
a document, drawing, or blueprint at this phase of its
lifecycle. There have been no major costs incurred to
this point. In fact, dozens of books written on lifecycle
costing show that up to 90 percent of lifecycle costs are
specified by the asset design engineer. However, the same
90 percent of asset lifecycle costs are not incurred until
the asset is in its operational and maintenance phases
of the lifecycle. Historically, the majority of companies
overlook this fact and fail to achieve the profitability
required by the projections in the strategic plan.
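
The lifecycle cost idea can be sketched as a simple sum over the cost elements named in the lifecycle cost definition. In the Python sketch below, the class name, field names and all dollar figures are hypothetical; they are chosen only to show how operation and maintenance can dominate the total even though they are incurred after the design is fixed.

```python
from dataclasses import dataclass


@dataclass
class LifecycleCost:
    """Cost elements over the life of an asset (all values hypothetical)."""
    design_development: float
    fabrication: float
    installation_commissioning: float
    operation: float
    maintenance: float
    disposal: float

    def total(self) -> float:
        """Total cost of ownership: the sum of every cost element."""
        return (self.design_development + self.fabrication
                + self.installation_commissioning + self.operation
                + self.maintenance + self.disposal)

    def operation_and_maintenance_share(self) -> float:
        """Fraction of the total incurred in operation and maintenance."""
        return (self.operation + self.maintenance) / self.total()


# Hypothetical figures in dollars, not taken from the text.
asset_cost = LifecycleCost(
    design_development=50_000,
    fabrication=300_000,
    installation_commissioning=100_000,
    operation=2_500_000,
    maintenance=1_500_000,
    disposal=50_000,
)

print(f"Total lifecycle cost: ${asset_cost.total():,.0f}")
print(f"Operation and maintenance share: {asset_cost.operation_and_maintenance_share():.0%}")
```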
Additional considerations at this lifecycle phase
would be operability and maintainability. The design
engineer must solicit input from the operations personnel
as to how the new equipment should operate. Will
the equipment be so sophisticated that retraining of all
operations personnel is necessary? Or is the equipment
so similar to existing equipment that very little training
is required? The design engineer must also solicit input
from maintenance personnel. For example, will the new
equipment be so sophisticated that retraining of main-
tenance personnel is necessary? Or is the equipment so
similar to existing equipment that very little training of
maintenance personnel is required? When considering
spare parts, are parts on the new equipment interchange-
able with parts on existing equipment? Or will an entire
new generation of spare parts be required? The answers
to these questions can drive the operation and mainte-
nance costs to such a high level that the asset will not
produce a return on investment.

Installation of New Assets


In this phase of the asset’s lifecycle, the asset is actually
created, produced, or acquired. The initial construction/
acquisition cost is also incurred in this phase. If the asset
is constructed internally, all the design documents, capac-
ity studies, reliability and maintainability specifications,
regulatory requirements, etc., are utilized to construct an
asset that will provide the company with the maximum
return on assets or return on capital employed for its
shareholders.
If the asset is to be purchased, all the same design
documents, capacity studies, reliability and maintain-
ability specifications and regulatory requirements are
provided by the vendor constructing or providing the
new asset. The company will audit the deliverable asset
against the specifications to ensure the proper asset has
been supplied by the vendor.
If the existing assets are to be redesigned or modified
to meet the business plan, then all the same specifica-
tions that would have been developed for a new asset
are used during the modification of existing assets.
At the end of the redesign or modification, the assets
should be capable of delivering their design capacity
at the specified cost.

Commissioning New Assets
In this phase of the asset’s lifecycle, the asset, whether
it is built, purchased, or retrofitted, is installed in the
plant or built. This is the construction or installation
phase of the project. There is some divergence based
on the philosophical leaning of the engineers, but the
project phase involves the installation of the equipment.

This phase is important since poor installation or construction
practices can diminish the design reliability
and maintainability of the asset. For example, poor
foundations under the equipment can make it virtually
impossible to achieve its reliability and maintainability
design specifications.
During this project phase, commissioning also occurs.
The final inspection and walk-down of the equipment
occur before the asset ownership transfers from the
supplier to the company purchasing the asset. All of the
asset’s capacities and functions are tested to ensure they
meet the design specification. This is also important for
internal projects, since the handoff will be from the
engineering department to the operations and mainte-
nance departments. The same rigor should be observed
during this exchange as well.
Once commissioning is complete, ownership of the
asset passes from the supplier to the
company. All documents, manuals, drawings, training
programs, etc., are transferred to the company. In many
cases, all documentation is provided to the company
electronically. This may also include the requirement for
the supplier, whether internal or external, to enter all
data for the new asset into the company’s computerized
maintenance management system (CMMS) or enterprise
asset management (EAM) system at this time as
the asset moves into the maintenance and operations
phase of its lifecycle.

Optimizing Capital Project Management


While capital project management appears to be a very
straightforward process, there is much that can be done
to optimize it within any organization. Three main areas
to focus on are:

a. Data,
b. Resources,
c. Quality.

When considering data in capital project management,
it is necessary to understand that all documentation,
from the financial justification to purchasing the asset
and right through to the commissioning phase, must be
collected and collated. All this data must remain available
for reference once the asset begins operating, so the
organization can verify that design capacities are achieved
and that the asset delivers the return on investment
projected by the strategic plan. Most of this data should
be collected and
stored in the organization’s CMMS or EAM system.

When considering the resources necessary to optimize
capital project management, it is essential that sufficient
resources are allocated to ensure proper data is collected
and utilized. In many organizations, the reduction in
clerical staff has hampered the organization’s ability to
collect and utilize the equipment/asset data. This impacts
the organization’s ability to document whether the
equipment/assets ever achieve the capacities the strategic plan
projected they would need to achieve. This, in turn, prevents
the organization from documenting whether the equipment/assets
are achieving the projected return on investment.
The quality of the capital project management process is
important. If proper processes and procedures are not
followed, the design lifecycle cost is neither specified nor
achieved. The design lifecycle cost is achieved mainly in
the operational and maintenance phase of the asset’s life.
If the processes are not followed, many of the initial costs
that should have been incurred in the capital project
management phase of the asset’s life are now pushed
into the operational and maintenance phase. This inflates
operational and maintenance costs and severely reduces
the return on investment that should have been achieved.
It also causes the maintenance department to work in
more of a reactive mode due to insufficient budgets.

What Every Reliability Leader Should Know
• A capital project is a long-term investment to acquire,
develop, improve, and/or maintain a capital asset such
as plant equipment, buildings, roads, etc.
• Project management is the discipline of planning,
organizing, securing, managing, leading, and con-
trolling resources to achieve specific goals.
• Reliability, maintainability, availability, safety, and
sustainability are design attributes and should be
addressed during capital project execution.

Summary
Capital project management is extremely important
to a company’s ability to achieve the design return on
investment. Most companies will never achieve true
value realization from their assets. ISO 55000 defines an
asset as something that delivers value. The value the asset
realizes is in the operational and maintenance phase of
its life. If capital project management is not properly
utilized, the asset realizes a reduced value through its
lifecycle. Capital project management can be a com-
petitive weapon for companies that properly utilize it.

Rcd
reliability centered design

Introduction
Many industry experts report that the majority of failures
(i.e., defects) during an asset’s operational phase are the
result of poor or inadequate design. Many times, design
omissions are caused by insufficient funds or budget
constraints imposed due to a lack of understanding of
the consequences on the lifecycle costs of the asset. The
capital project manager’s and designer’s performance is
judged on how they met budget and schedule targets, not
on long-term asset performance, including lifecycle costs.
A well-designed, built and installed asset should have
fewer failures and a much lower total cost of ownership
during the entire life of the asset.
Leading and highly reliable organizations integrate
reliability-centered design (RCD) principles into all
aspects of their capital projects process, including asset
concept, design/development, build and the install phase.

Key Terms and Definitions


Asset – A thing, entity, or item that has actual or poten-
tial value to an organization.
Asset design specifications – Translation of customer
requirements into a comprehensive, detailed, functional
performance or design specification used to build a
particular asset.
Asset lifecycle – Stages or phases involved in the man-
agement of an asset during its life. These phases include
concept, design and development, build, install and com-
mission, operations, maintenance, decommissioning and
disposal.
Asset lifecycle cost – The total costs incurred during
an asset’s life, including design and development, build,
installation and commissioning, operations and main-
tenance, and disposal costs.
Availability – The probability an asset is capable of
performing its intended function satisfactorily, when
needed, in a stated environment; a function of reliability
and maintainability.
Capital project – Projects that include new construc-
tion, major repairs, or improvement where the cost is
capitalized rather than expensed.
Failure – The inability of an asset to perform its designed
function.
Maintainability – The ease and speed with which a maintenance
activity can be carried out on an asset; a function of
asset design measured by mean time to repair (MTTR).

Reliability – The probability that an asset, item, or
system will perform its required functions satisfactorily
under specific conditions within a certain time period.
Reliability block diagram (RBD) – A diagram show-
ing logical connections among a system’s components/
parts (assets). The system is usually made of several
components/parts which may be in a series, parallel, or
a combination configuration to provide the designed
(inherent) reliability.
Sustainability – The ability to maintain a certain status
or process in existing systems; in general, the property
of being sustainable; the capacity to endure.

Principles of Reliability Centered Design


Major causes of asset failure are rooted in inadequate or
improper design, lack of maintenance, or improper
usage. Human errors involving the skills of the users in
operating and maintaining the asset also play a key part.
Many failures caused by human error can be minimized
by a better design.
Inadequate design often involves the use of unreliable
components when building the asset, resulting in high
failure rates that lead to higher operations and maintenance
costs and reduced useful life. A properly designed
asset is made with reliable components, ensuring reduced
failures, increased asset useful life, safe and sustainable
operations, and reduced total cost of ownership.

The 10X Rule


Design errors or omissions create a higher number
of asset failures that require extensive repairs. The earlier
these errors are caught during the design, build, or
installation phase, the lower the cost of corrective
action. As a rule of thumb, corrective action
costs increase by a factor of 10 in each successive stage.

Asset Phase                        Corrective Cost Factor
Design and component selection     X1
Asset build, subassembly phase     X10
Asset build, assembled             X100
Installed and operating            X1,000 - 10,000

Therefore, it is much more cost-effective to find errors
or defects during the design phase, which will result
in fewer failures later in the operational phase. You
could use a tool, such as failure mode and effects analysis
(FMEA), during design to identify these potential
failures and correct them either by redesign or the use
of reliable and quality components.
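
To see how quickly the corrective cost factors compound, the short Python sketch below applies them to a design-stage correction cost. It is a minimal illustration only: the base cost of 500 and the shortened phase labels are hypothetical, and the low end (1,000) of the installed-and-operating range is assumed.

```python
# Corrective cost factors per asset phase, taken from the 10X rule table.
# The dictionary keys are shortened labels chosen for this sketch.
COST_FACTOR = {
    "design": 1,
    "subassembly": 10,
    "assembled": 100,
    "operating": 1_000,  # assumption: low end of the 1,000 - 10,000 range
}


def corrective_cost(base_cost: float, phase: str) -> float:
    """Estimate the cost of correcting a defect caught in the given phase.

    base_cost is the (hypothetical) cost of correcting it during design.
    """
    return base_cost * COST_FACTOR[phase]


base = 500.0  # hypothetical cost of a design-stage correction
for phase in COST_FACTOR:
    print(f"{phase:12s} {corrective_cost(base, phase):>12,.0f}")
```

The same defect that costs 500 to correct on the drawing board would cost on the order of 500,000 once the asset is installed and operating, under these assumptions.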

Designing for RAMS2


Many design errors, along with commissioning, opera-
tions and maintenance errors, cause failures early in the
asset’s operating life. These are characterized as “infant
mortality” failures. Other defects and errors that do not
appear during asset infancy will eventually surface and
cause failures later during its operating life.
The preferred term for these errors is defects, because a
defect is the consequence of a mistake: an early inaction
or wrong action leaves a defect in the asset. Most of the
time, most things go right; failure is not a normal
occurrence. The problem with failures is not the failure
itself, but the consequences that result from it. When
these consequences are severe, you want to do everything
possible to never let them happen again, or to find ways
to reduce their likelihood and mitigate their impact. The
best way to eliminate or minimize these defects is to
design them out at the source, and to design them out
right. During design, you should be thinking
about all aspects of reliability, availability, maintainability,
safety and sustainability (RAMS2). There is a real possibility
that a right design or a design done well will cost only
a little more, but will reduce the total cost of ownership
during the lifecycle of the asset.
You want asset(s) to be reliable, that is, dependable
and available when you need them to meet your cus-
tomers’ needs. Truly, you want your assets to be designed
for high availability because availability is a function of
reliability and maintainability. Availability is defined as:
Availability = MTBF / (MTBF + MTTR)
             = Uptime / (Uptime + Downtime)
where MTBF (mean time between failures) is a measure of
reliability and MTTR (mean time to repair) is a measure of
maintainability.
So, to have higher availability, you need to design
assets with highly reliable components (high MTBF/
low failure rate) and low MTTR (low repair time).
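
The availability formula above can be sketched in a few lines of Python. The function names and the MTBF/MTTR figures are hypothetical and chosen only to show that raising MTBF and lowering MTTR are two routes to the same availability gain.

```python
def availability_from_mtbf(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def availability_from_uptime(uptime_hours: float, downtime_hours: float) -> float:
    """Availability = Uptime / (Uptime + Downtime)."""
    return uptime_hours / (uptime_hours + downtime_hours)


# Hypothetical design options for the same asset.
baseline = availability_from_mtbf(500, 10)                # MTBF 500 h, MTTR 10 h
better_reliability = availability_from_mtbf(1000, 10)     # double the MTBF
better_maintainability = availability_from_mtbf(500, 5)   # halve the MTTR

print(f"baseline:               {baseline:.4f}")
print(f"double MTBF:            {better_reliability:.4f}")
print(f"halve MTTR:             {better_maintainability:.4f}")
```

In this example, doubling MTBF and halving MTTR yield the same availability, because only the ratio of MTBF to MTTR matters in the formula. Which route is cheaper is a design decision.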
Reliability and maintainability are design attributes.
This means they are best achieved when they are
designed in from the start to deliver higher availability. Also,
to support safety and sustainability, the design should select
components that are energy efficient, use fewer environmentally
hazardous materials and are safe to operate.

Your design should incorporate:

• Highly reliable components and parts (with higher
MTBF);
• Use redundancy where needed to achieve desired
reliability;
• Ease of operations to minimize repair time:
    • Design in condition monitoring and diagnostics
      to facilitate repairs;
    • Minimize use of special tools;
• Use total productive maintenance (TPM) / operator
driven reliability (ODR) and 5S principles to
optimize design:
    • Ease of adjustment to belts and chains, and oil
      filling and lubrication;
    • Labeling of piping, hoses, devices, etc., for efficient
      operation;
• Required availability by balancing reliability and
maintainability requirements;
• Safe and ergonomic design features to eliminate or
minimize accidents and injuries to personnel and the
asset itself;
• Environmentally clean and energy efficient components
and material;
• Extensive use of standard components, including control
devices, such as programmable logic controllers
(PLC);
• Data needed to measure asset performance and the
process design of how it will be collected;
• Use a standardized methodology for asset and
component hierarchy and taxonomy (i.e., naming
structure).
Another area in RAMS2 to consider is mechanical
integrity. Mechanical integrity (MI), also known as asset
integrity management (AIM), refers to the management
of all processing equipment in an organization to ensure
it is sound and operating within safe limits.
Equipment, such as tanks, pressure vessels, piping, etc.,
are key assets in the process industry and need to be
fit for service all the time since they operate contin-
uously 24x7. Any failure, such as leaks, over pressure,
or corrosion, in these systems can be very dangerous
and costly. They need to be designed and maintained
with special care, meeting all applicable standards of
the Occupational Safety and Health Administration
(OSHA 29 CFR 1910.119), American Petroleum Institute (API
580, 581), etc.


Practices and Tools for Rcd


Here are some examples of good practices and tools to
support RAMS2 based designs.

Voice of Customer (House of Quality)


Voice of customer (VOC), also called the house of quality,
is a management approach to basic design based
on quality function deployment (QFD). The house of
quality has been used successfully by Japanese manufacturers,
and by producers worldwide, of consumer electronics,
home appliances, clothing, integrated circuits, rubber,
construction equipment and automobiles. Although this
design approach is best known for consumer products,
it can also be used for the design of industrial
products and assets.
The foundation of the house of quality is the belief
that products should be designed to reflect customers’
desires and tastes, so marketing people, design engineers
and manufacturing staff must work closely together from
the time a product is conceived.
House of quality is a diagram, resembling a house,
used for defining the relationship between customer
desires and the product or asset capabilities. It utilizes
a planning matrix to relate what the customer needs
to how the product is going to meet those needs. It
looks like a house with a correlation matrix as its roof,
customer wants versus product features as the main part,
competitor evaluation as the porch, etc.
House of quality is a very powerful tool as it incor-
porates customer needs into design parameters so the
final product or asset will be better designed to meet the
customer’s or owner’s expectations.
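
The main body of the house of quality, the matrix of customer wants versus product features, can be sketched as a weighted scoring table. The following Python example is a simplified illustration under stated assumptions: the wants, importance weights and features are hypothetical, and the 9 / 3 / 1 / 0 relationship scale is a common QFD convention rather than something prescribed in this passport. The roof (feature correlations) and the competitor evaluation are not modeled.

```python
# Customer wants with importance weights (1 = low, 5 = high); hypothetical.
customer_wants = {
    "high uptime": 5,
    "easy to maintain": 4,
    "low energy use": 2,
}

# Candidate design features (the "hows"); hypothetical.
features = ["component MTBF", "access panels", "motor efficiency"]

# Relationship strength between each want and each feature, using the
# conventional 9 / 3 / 1 / 0 scale (strong / medium / weak / none).
relationships = {
    "high uptime":      [9, 3, 0],
    "easy to maintain": [1, 9, 0],
    "low energy use":   [0, 0, 9],
}


def feature_scores(wants, feats, rel):
    """Weighted column sums: importance x relationship strength per feature."""
    return {
        feat: sum(wants[w] * rel[w][j] for w in wants)
        for j, feat in enumerate(feats)
    }


scores = feature_scores(customer_wants, features, relationships)
for feat, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{feat:18s} {score}")
```

The feature with the highest score is the one that contributes most to what the customer or owner values, so it is a natural place to focus design effort.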

Design FMEA to Mitigate Failures


Design failure mode and effects analysis (DFMEA)
is a method for evaluating a design for reliability and
robustness against potential failures. It’s a specific failure
mode and effects analysis (FMEA) method for iden-
tifying possible failures during the design phase of a
product, asset, or service.
Failure mode means the ways or modes in which
something might fail. Failures are any errors or defects,
potential or actual, especially ones that affect asset
performance.
Effects analysis refers to studying the consequences
of those failures.
Failures are prioritized according to how serious their
consequences are, how frequently they occur and how
easily they can be detected. The purpose of FMEA is to
take actions to eliminate or reduce failures, starting with
the highest priority ones.
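
One widely used way to combine the three criteria above is the risk priority number (RPN), the product of severity, occurrence and detection ratings. The text does not name RPN, so treat the Python sketch below as one possible prioritization scheme; the 1 to 10 rating scales and the failure modes listed are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1 (minor consequence) .. 10 (severe consequence)
    occurrence: int  # 1 (rare) .. 10 (frequent)
    detection: int   # 1 (easily detected) .. 10 (hard to detect)

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def prioritize(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes for a pump skid.
modes = [
    FailureMode("bearing seizure", severity=8, occurrence=4, detection=6),
    FailureMode("seal leak", severity=5, occurrence=6, detection=3),
    FailureMode("coupling misalignment", severity=6, occurrence=5, detection=7),
]

ranked = prioritize(modes)
for m in ranked:
    print(f"{m.rpn:4d}  {m.description}")
```

Corrective actions would then start at the top of the ranked list. Many teams also review any failure mode with a very high severity rating regardless of its RPN.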
Failure mode and effects analysis also documents cur-
rent knowledge and actions about the risks of failures for
use in continuous improvement. FMEA is used during
design to prevent failures. Later, it’s used for control,
before and during ongoing operation of the asset or
process. Ideally, FMEA begins during the earliest con-
ceptual stages of design and continues throughout the
life of an asset or service.

The DFMEA process is normally employed when:


• An asset, service, or process is being designed or
redesigned, possibly after QFD;
• An existing asset, service, or process is being applied
in a new way;
• Control plans are being developed for a new or modified
asset or process;
• Improvement goals are planned for an existing asset
or service;
• Failures of an existing asset or service are being analyzed;
• The design is reviewed periodically throughout the life
of an asset or service.


Design for Manufacturing and Assembly (DFMA)


Design for manufacturing (DFM) and design for assembly
(DFA) have some common attributes. Nowadays,
DFM and DFA are commonly combined into a single process
called design for manufacturing and assembly
(DFMA). The goal is to design the asset so it is easily
and economically manufactured and assembled. The
importance of designing for manufacturing is underlined
by the fact that about 70 percent of manufacturing
costs of an asset (i.e., cost of materials, processing and
assembly) are determined by design decisions, with production
decisions, such as process planning or machine
tool selection, responsible for only 20 percent, as reported
in literature such as Computer-Aided Manufacturing,
Second Edition, by Tien-Chien Chang, Richard A. Wysk
and Hsu-Pin Wang.

The following are key guidelines for a good DFMA:

1. Minimize the number of components;
2. Use standard, commercially available components;
3. Use modular design;
4. Design parts with tolerances that are within current
process capability;
5. Design for ease of part fabrication;
6. Design for ease of assembly;
7. Minimize use of flexible components;
8. Eliminate or reduce the adjustments required;
9. Design for ease of handling and shipping.

Designing for Reliability – Reliability Allocation Methodology
The reliability allocation methodology establishes a hierarchy
of design requirements derived from reliability goals. The
purpose is to distribute the operational reliability goals
from a top system level to subsystems, subassemblies
and components, and then design or select components
accordingly.
Allocation starts with the asset system goal. For
example, an assembly machine might have to be designed
and built to meet these reliability
requirements:
• Operating hours = 300 hours per month;
• Reliability = minimum 90 percent.

These requirements are translated to MTBF/failure
rates and are assigned to subassembly and component
levels. Then, each component is designed or selected to
meet those requirements.
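
As a rough illustration of how the example requirements might be translated and allocated, the Python sketch below makes three assumptions that are not stated in the text: the 90 percent reliability target applies over one month of operation (300 hours), failures follow a constant failure rate (exponential) model so that R(t) = exp(-t / MTBF), and the system is made of three subsystems in series that each receive an equal share of the reliability goal.

```python
import math


def required_mtbf(reliability: float, mission_hours: float) -> float:
    """MTBF needed so that R(t) = exp(-t / MTBF) meets the reliability target."""
    return -mission_hours / math.log(reliability)


def equal_allocation(system_reliability: float, n_subsystems: int) -> float:
    """Reliability each of n series subsystems needs under equal apportionment."""
    return system_reliability ** (1.0 / n_subsystems)


MISSION_HOURS = 300.0  # operating hours per month, from the example
TARGET = 0.90          # minimum system reliability, from the example
N_SUBSYSTEMS = 3       # hypothetical number of series subsystems

system_mtbf = required_mtbf(TARGET, MISSION_HOURS)
sub_r = equal_allocation(TARGET, N_SUBSYSTEMS)
sub_mtbf = required_mtbf(sub_r, MISSION_HOURS)

print(f"System MTBF required:        {system_mtbf:,.0f} h")
print(f"Reliability per subsystem:   {sub_r:.4f}")
print(f"MTBF required per subsystem: {sub_mtbf:,.0f} h")
```

Under these assumptions the machine needs a system MTBF of roughly 2,850 hours, and each of the three subsystems needs about three times that figure. Equal apportionment is only the simplest scheme; in practice, allocation is often weighted by subsystem complexity or criticality.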

Reliability/Availability Modeling – RBD


A reliability block diagram (RBD) is another tool that can
be used to find gaps and optimize design from a reliability
and availability perspective. An RBD is a pictorial representation
of the logical interdependencies (also called
component configurations), with either parallel or series
paths, required for the asset/machine under analysis to
function correctly. Estimated MTBF and MTTR
data, based on selected components, are then entered into
the model to get the predicted reliability/availability.
Several software packages are available that can perform
RBD analysis with minimal training.
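
For simple configurations, the series and parallel calculations behind an RBD can be done by hand or in a few lines of code. The Python sketch below assumes independent component failures; the block layout (feeder, two redundant pumps, controller) and the reliability values are hypothetical.

```python
from math import prod


def series(*reliabilities: float) -> float:
    """All blocks must work: the product of the block reliabilities."""
    return prod(reliabilities)


def parallel(*reliabilities: float) -> float:
    """At least one block must work: 1 minus the product of unreliabilities."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical asset: feeder -> two redundant pumps -> controller.
feeder, pump, controller = 0.98, 0.90, 0.99

no_redundancy = series(feeder, pump, controller)
pumps = parallel(pump, pump)
with_redundancy = series(feeder, pumps, controller)

print(f"Single pump:     {no_redundancy:.4f}")
print(f"Redundant pumps: {with_redundancy:.4f}")
```

Comparing the two results shows the gap that the weakest series block creates and how much redundancy at that block would recover.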

What Every Reliability Leader Should Know
• Reliability, maintainability, availability, safety, and
sustainability are design attributes and should be
addressed during capital project execution.
• Applying practices and tools, such as design FMEA,
DFMA, RBD, etc., will optimize design.
• A well-designed, built and installed asset will have
fewer failures and a much lower total cost of ownership
(TCO) during the entire life of the asset.

Summary
Things, products, or assets fail in service. Everyone has
witnessed failures of various products in their daily
life. To be reliable, assets must be robust and adequately
designed to avoid failure modes, even in the presence
of a broad range of conditions, including harsh envi-
ronments, changing operational demands and internal
deterioration due to wear and fatigue.
Designers and engineers should use a combination of
practices and tools to eliminate or minimize failures to
enhance design, which will result in reduced TCO. Some
examples of these practices and tools are:
• Voice of customer: consult stakeholders, specifically
operators, maintainers, etc., to understand the requirements
and issues;
• DFMEA/FMEA types of tools to identify failure
modes and mitigate their consequences;
• Design based on RAMS2 principles:
    • Use of reliable components based on reliability
      analysis, etc.;
    • Use of energy efficient and environmentally safe
      components;
    • Use of modular and standardized components;
    • Making the design easy to operate and maintain,
      and ergonomically safe;
• Considering the use of condition monitoring, diagnostic
devices and display data/dashboard to support
operations and maintenance (O&M) in design;
• Manufacturing and assembly of design:
    • Ensure design is easy, economical and safe to
      manufacture, assemble and ship.
Finally, the design should not be based on the
lowest cost, but on an optimum cost that reduces the total
cost of ownership.

Acknowledgment
The Uptime® Elements™ were originally created by Terrence
O’Hanlon, CEO and Publisher of Uptime® magazine and
Reliabilityweb.com®, in consultation and close cooperation
with Reliabilityweb.com co-founder Kelly Rigg O’Hanlon.
Early versions were reviewed by Erin Corin O’Hanlon and
Ian Jaymes O’Hanlon. The initial idea was inspired during a
parent-teacher meeting with science teacher Mark Summit
at Canterbury School in Fort Myers, Florida.
Development of this concept could not have happened
without the mentoring by true masters in the reliability
and asset management communities, including Terry Wire-
man; Paul Barringer; Dr. Robert Abernathy; Jack Nicholas
Jr.; Anthony “Mac” Smith; Ron Moore; Bob DiStefano;
Steve Turner; Joel Levitt; Ramesh Gulati; Winston Ledet;
June Ledet; Michelle Ledet Henley; Heinz Bloch; Christer
Idhammar; Ralph Buscarello; Edmea Adell; Celso De Azevedo;
John Woodhouse; the entire AEDC/Jacobs/ATA team
led by Bart Jones; and many more people who have been kind
and generous in sharing their expertise.
Early-stage evolution, definition and development by
Steve Thomas, Ramesh Gulati, Jeff Smith, Grahame Fogel,
John Schultz and the Allied Reliability Group team, and PJ
Vlok proved invaluable to its current state. Early presentation
of these elements resulted in valuable feedback from members
of the Oklahoma Predictive Maintenance Users Group
(OPMUG), Fort Myers Institute of Technology (formerly
High Tech Central), and attendees of CBM-2013 Condi-
tion Monitoring Conference and other learning events held
at the Reliability Leadership Institute in Fort Myers, Florida.
The Uptime Elements revision team includes contributions
from Sandra DiMatteo, Scotty McLean, Anne-Marie
Walters, David Armstrong and Greg Bentley of Bentley
Systems, Derek Burley of Blue Sky Reliability, Jack Poley of
CMI, Allan Rienstra of SDT, Dan Ambre of Full Spectrum
Diagnostics, Jim Hall of The Ultrasound Institute, Ramesh
Gulati of Jacobs and Christo Roux of Outotec Oyj. A huge
effort was made by Rhys Davies, Paul Scott, Danielle Hum-
phries and Claire Gowson of eAsset Management on the
new Asset Management passports.
There was a very strong effort to move thinking around
reliability strategy development and the updated RCM Proj-
ect Managers’ Guide that came from Derek Burley, Sam Paske,
Nick Jize, Tim Allen, Doug Plucknette and John Fortin.
The entire Reliability Leadership Institute Community of
Practice drove the revisions with many lessons and special
contributions from Randy Rhine and Rylan Eades of Honda
NA, Eric Newhard of Medtronic, Rob Bishop and Waldemar
Rivera of BMS, and George Williams of B. Braun.
The Reliabilityweb.com and Uptime Magazine team led
by Jenny Brunson and including Jocelyn Brown, Melody
McNeill, Dave Reiber, Joel Levitt, Maura Abad and Heather
Clark, made further refinements.
The biggest contributions have come from the existing
Certified Reliability Leaders who helped us reach our initial
goal of 1,000 CRLs within the first 26 months. Your active
participation and your leadership by example have inspired us
to continue to refine Uptime Elements to engage, empower
and align would-be reliability leaders who can positively
impact their organizations, their communities and the world.
We hope you will join us in our new CRL-2020 goal of
10,000 Certified Reliability Leaders by the year 2020 and
one in outer space!
Associations, such as the Association of Asset Manage-
ment Professionals, the Association for Facilities Engineer-
ing, the Vibration Institute, the Operational Excellence So-
ciety, the American Society of Civil Engineers, MIMOSA,
Fiatech, The Asset Leadership Network, the National Prop-
erty Management Association, the American Society for
Testing and Materials and The American Society of Non-
destructive Testing, have also created a foundation for this
work through their efforts to create guidance, metrics and
an ever expanding body of knowledge around maintenance,
reliability and asset management practices.

CRL Body of Knowledge
The Association of Asset Management Professionals (AMP)
has developed an exam and certification based on the
Uptime Elements and its Reliability Leadership system. It
is designed to create leaders who focus on delivering value to
the triple bottom line of:
• Economic prosperity,
• Environmental sustainability,
• Social responsibility.
The body of knowledge that creates the foundation for the
exam and certification includes:
1. The Uptime® Elements™ Passport series
2. The Journey by Stephen Thomas
3. Don’t Just Fix it, Improve It! by Winston P. Ledet,
Winston J. Ledet and Sherri M. Abshire
4. Uptime® Elements™ Dictionary for the Reliability Leader
and Asset Manager by Ramesh Gulati

All books are available at www.mro-zone.com and Amazon.com