2 - Reliability Theory

Reliability and maintenance management involves understanding how engineered objects can fail over time due to design, manufacturing, and operational factors. A multi-level systems approach is used to decompose objects into interacting elements. Failures occur when a component can no longer perform its intended function and can range from intermittent to complete. Understanding failure modes, causes, and severity helps improve design and maintenance practices to increase reliability.

Uploaded by

Masood Ahmed

Reliability and Maintenance Management
2 - Basics of Reliability Theory
Introduction

• Every engineered object (product, plant, or infrastructure) is unreliable in the sense that it degrades and eventually fails
• The reliability of the object is determined by decisions made during the design and building (manufacturing) of the object, and is affected by factors such as operating environment, usage mode, and usage intensity
• Maintenance actions are needed to counteract the unreliability of the object

Decomposition of Engineered Objects

Even the simplest engineered product comprises several interacting elements and can be viewed as a multilevel system. The number of levels needed depends on the system under consideration.

Functions, Failures and Faults

• Essential function
– The intended or primary function. Example: a power plant provides electrical power on demand to the consumers who are part of the network
• Auxiliary functions
– Support the primary function and are usually less obvious than essential functions. “Preserving fluid integrity” is an auxiliary function of a pump, and its failure may cause a critical safety hazard if the fluid is toxic or corrosive
• Protective functions
– Protect people from injury and the environment from damage. Examples: relays that protect against current surges, and scrubbers on smokestacks that remove particulate matter
• Information functions
– Condition monitoring, gauges, alarms, and so on. In a power plant, the main control panel displays information about the different subsystems: voltage and current output of the generators, pressure and temperature of steam in the various parts of the plant, and so on

Functions, Failures and Faults

Failure is the termination of the ability of an item to perform a required function

System failure occurs due to the failure of one or more of its components

Classification of Failures*
– Primary failure: Due to natural causes (for example, ageing). An action (for example, repair or replacement by a working unit) is needed to make the item operational
– Secondary failure: Due to one or more of the following causes: (i) the (primary) failure of some other component(s) in the system, (ii) environmental factors, and/or (iii) actions of the user
– Command failure: When a component is in a non-working (rather than a failed) state because of improper control signals or noise (for example, a faulty action of a logic controller switching off a pump)
*Henley and Kumamoto (1981)

Functions, Failures and Faults

A fault is the state of an item characterised by its inability to perform its required function due to failure

Failure Modes
A failure mode (also known as a fault mode) is a description of a fault
Classification of Failure Modes*
• Intermittent failures: Failures that last for a short time. Example: a software fault that occurs only under certain intermittent conditions
• Extended failures: Failures that continue until some corrective action rectifies the failure. They can be:
– Complete failures: Result in a total loss of function
– Partial failures: Result in a partial loss of function
Both can be sudden (without any warning) or gradual (with or after a warning signal)
Catastrophic failure: complete and sudden
Degraded failure: gradual and partial
* Blache and Shrivastava (1994)
Example – Hydraulic Valves

Hydraulic valves are used in refineries to control the flow of liquids. If a valve does not shut properly, the flow is not reduced to zero and this can be viewed as a partial failure. If a valve fails to operate (due, for example, to the spring not functioning properly), then the failure is a complete failure. A valve usually wears out with usage and this corresponds to a gradual failure.
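The two attributes of an extended failure (how much function is lost, and how the loss sets in) can be captured in a few lines of code. The following is a minimal sketch only: the enum names and the `classify_extended_failure` helper are hypothetical and do not come from the slides. It returns "catastrophic" for a complete and sudden failure, "degraded" for a gradual and partial one, and a plain descriptive label otherwise.

```python
from enum import Enum


class Extent(Enum):
    COMPLETE = "complete"   # total loss of function
    PARTIAL = "partial"     # partial loss of function


class Onset(Enum):
    SUDDEN = "sudden"       # without any warning
    GRADUAL = "gradual"     # with or after a warning signal


def classify_extended_failure(extent: Extent, onset: Onset) -> str:
    """Return a label for an extended failure (hypothetical helper)."""
    if extent is Extent.COMPLETE and onset is Onset.SUDDEN:
        return "catastrophic"
    if extent is Extent.PARTIAL and onset is Onset.GRADUAL:
        return "degraded"
    # Remaining combinations have no special name in the classification.
    return f"{onset.value} {extent.value} failure"


# A valve that slowly stops shutting fully: partial loss, gradual onset.
print(classify_extended_failure(Extent.PARTIAL, Onset.GRADUAL))
# A valve that abruptly fails to operate: complete loss, sudden onset.
print(classify_extended_failure(Extent.COMPLETE, Onset.SUDDEN))
```

The onset assigned to each valve scenario in the usage lines is an assumption made for the illustration; the example text itself only states the extent of each failure.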

Failure Causes and Severity
A failure cause is the set of circumstances during design, manufacture, or use that led to a failure. Identifying it helps prevent recurrence

Classification of Failure Causes
– Design failure: Due to inadequate design
– Weakness failure: Due to weakness (inherent or induced) in the system so that the system cannot withstand the stress it encounters in its normal environment
– Manufacturing failure: Due to non-conformity during manufacturing
– Ageing failure: Due to the effects of age and/or usage
– Misuse failure: Due to misuse of the system (for example, operating in unintended environments)
– Mishandling failure: Due to incorrect handling and/or lack of care and maintenance

Failure Severity Ranking
MIL-STD-882D, 2000

Catastrophic
– Failures that result in death or total system loss
Critical
– Failures that result in severe injury or major system damage
Marginal
– Failures that result in minor injury or minor system damage
Negligible
– Failures that result in less than minor injury or system damage

Characterisation of Degradation
• Binary-state. Examples: bulb, kettle element
• Multi-state (finite). Example: little, moderate, or extensive leakage
• Multi-state (infinite). Examples: tyres, cutting tools, and shaft diameters wearing with age

Concept of Reliability
Reliability of an item is its ability to perform a
required function, under given environmental and
operational conditions and for a stated period of
time (ISO 8402, 1986)

If the operational conditions are the same as the nominal conditions assumed in the design of the object, then we are referring to the design reliability
However, when the object is put into operation, the operational conditions will differ from the nominal design conditions and, as such, the reliability (called the field reliability) will differ from the design reliability
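To make the design versus field distinction concrete, reliability is commonly quantified as the probability that the item survives beyond time t, R(t) = P(T > t). The sketch below is illustrative only. It assumes a constant failure rate (an exponential lifetime), which the slides do not state, and the two failure-rate values are made-up numbers chosen to show a field environment that is harsher than the nominal one.

```python
import math


def reliability(t_hours: float, failure_rate_per_hour: float) -> float:
    """R(t) = exp(-lambda * t), valid under the assumed constant failure rate."""
    if t_hours < 0 or failure_rate_per_hour < 0:
        raise ValueError("time and failure rate must be non-negative")
    return math.exp(-failure_rate_per_hour * t_hours)


# Hypothetical failure rates (illustrative values, not from the source).
DESIGN_RATE = 1e-4   # nominal conditions assumed at design time
FIELD_RATE = 2e-4    # conditions actually met in operation (assumed harsher)

mission_time = 1000.0  # the "stated period of time", in hours
design_reliability = reliability(mission_time, DESIGN_RATE)
field_reliability = reliability(mission_time, FIELD_RATE)

print(f"design reliability: {design_reliability:.3f}")  # 0.905
print(f"field reliability:  {field_reliability:.3f}")   # 0.819
```

Other lifetime models (for example, Weibull) would change the formula inside `reliability` but not the comparison itself.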
Concept of Reliability
The reliability of an item conveys the concept of dependability, successful operation or
performance, and the absence of failures. Unreliability (or lack of reliability) conveys the opposite

The reliability of an object depends on a complex interaction of the laws of physics, engineering design, manufacturing processes, management decisions, random events, and usage
Usage can be continuous (for example, refineries) or intermittent (for example, an aircraft with flying and non-flying periods), and the item can be used for short periods followed by long idle periods (for example, a coffee grinder or aircraft landing gear)

The concept is of great interest to consumers and businesses


Linking System and Component Failures
‣ A system is a collection of interconnected components. The failure of a system
is due to the failure of one or more of its components
‣ Linking of component failures to system failures can be done using two
different approaches:
- Forward (bottom-up): Starts with failure events at the component level and
then proceeds forward to the system level to evaluate the consequences of
such failures on system performance. Failure modes and effects analysis
(FMEA) uses this approach
- Backward (top-down): Starts at the system level and then proceeds
downward to the part level to link system performance to failures at the part
level. Fault tree analysis (FTA) uses this approach
Failure Modes & Effects Analysis (FMEA)
‣ A structured, logical, & systematic approach; involves reviewing a system in
terms of its subsystems, assemblies,… down to the component level, to
identify failure modes/ causes & effects of such failures on a system’s function
‣ Assists the design engineer/analyst in developing a deeper understanding of
the relationships among the system components
‣ Analyst can then use this knowledge to suggest changes to the system that can
eliminate or mitigate the undesirable consequences of a failure
‣ Used to assess system safety and to identify design modifications
‣ Also used in planning system maintenance activities

Some Objectives of FMEA
IEEE Standard 352

‣ Ensure that all conceivable failure modes and their effects on operational
success of the system have been considered
‣ List potential failures and identify the magnitude of their effects
‣ Provide historical documentation for future reference to aid in the analysis of
field failures and consideration of design changes
‣ Provide a basis for establishing corrective action priorities
‣ Assist in the objective evaluation of design requirements related to
redundancy, failure detection systems, fail-safe characteristics, and automatic
and manual override

FMEA Procedure
1. Determining the item functions
2. Identifying all item failure modes
3. Determining the effect of the failure for each failure mode, both on the
component and on the overall system being analyzed
4. Classifying the failure by its effects on the system operation and mission
5. Determining the failure’s probability of occurrence
6. Identifying how the failure mode can be detected (this is especially important
for fault-tolerant configurations)
7. Identifying any design changes to eliminate the failure mode, or if that is not
possible, mitigate or compensate for its effects
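The seven steps above map naturally onto one worksheet row per failure mode. The following is a minimal sketch, not a standard worksheet layout: the `FmeaRow` class, its field names, and the `missing_steps` helper are all hypothetical. It records the outcome of each step and reports which steps are still open for a given row, using the hydraulic valve as sample content.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FmeaRow:
    """One worksheet row per failure mode; field names are hypothetical."""
    item_function: Optional[str] = None             # step 1
    failure_mode: Optional[str] = None              # step 2
    component_effect: Optional[str] = None          # step 3 (on the component)
    system_effect: Optional[str] = None             # step 3 (on the system)
    effect_class: Optional[str] = None              # step 4
    occurrence_probability: Optional[float] = None  # step 5
    detection_method: Optional[str] = None          # step 6
    design_change: Optional[str] = None             # step 7


def missing_steps(row: FmeaRow) -> List[str]:
    """Return the names of worksheet fields that are still empty."""
    return [f.name for f in fields(row) if getattr(row, f.name) is None]


row = FmeaRow(
    item_function="Control the flow of liquid",
    failure_mode="Valve does not shut properly",
    component_effect="Flow is not reduced to zero",
)
print(missing_steps(row))
```

Keeping every step as an explicit field makes an incomplete analysis visible, which supports the objective of considering all conceivable failure modes and their effects.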
FMEA Example

FMEA & FMECA

The terms FMEA and FMECA (failure modes, effects, and criticality analysis) are often used interchangeably
Some authors consider that FMECA extends FMEA by including a ranking of the failure modes
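A ranking of failure modes can be sketched as a sort on a criticality score. The example below is one simple convention among several, not the method of any particular standard: it assumes a score equal to a severity weight times the probability of occurrence. The numeric weights attached to the four severity categories, the failure modes, and their probabilities are all made up for illustration.

```python
# Hypothetical numeric weights for the four severity categories;
# the values are an assumption made for this illustration.
SEVERITY_WEIGHT = {"catastrophic": 4, "critical": 3, "marginal": 2, "negligible": 1}


def criticality(severity: str, probability: float) -> float:
    """Severity weight times probability of occurrence (assumed convention)."""
    return SEVERITY_WEIGHT[severity] * probability


def rank_failure_modes(modes):
    """Sort (name, severity, probability) tuples by descending criticality."""
    return sorted(modes, key=lambda m: criticality(m[1], m[2]), reverse=True)


# Made-up failure modes and probabilities, for illustration only.
modes = [
    ("valve fails to operate", "critical", 0.01),
    ("valve does not shut properly", "marginal", 0.05),
    ("gauge reads slightly off", "negligible", 0.08),
]

for name, severity, probability in rank_failure_modes(modes):
    print(f"{name}: {criticality(severity, probability):.2f}")
```

With these numbers a frequent marginal failure outranks a rare critical one, which shows why the choice of weights deserves scrutiny before the ranking is used to set corrective action priorities.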

Fault Tree Analysis
‣ FTA is concerned with the identification and analysis of conditions and factors
that cause, or may potentially cause or contribute to, the occurrence of a
defined top event (such as failure of a system)
‣ A fault tree is an organized graphical representation of the conditions or other
factors causing or contributing to the occurrence of the top event
‣ FTA can be used for analysis of systems with complex interactions between the
components, including software–hardware interactions
‣ FTA may be quantitative or qualitative, and may be used
independently or in conjunction with other reliability analyses
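For the quantitative case, the probability of the top event can be computed by combining basic-event probabilities through the gates of the tree. The sketch below assumes that the basic events are statistically independent, which the slides do not state and which often fails in practice (for example, with common-cause failures). The tree and its probabilities are hypothetical.

```python
from math import prod


def and_gate(probabilities):
    """Output event occurs only if all (independent) input events occur."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if at least one (independent) input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical tree: the top event "no flow" occurs if the pump fails,
# OR if both redundant valves are stuck. Probabilities are made up.
p_pump_fails = 0.01
p_valve_a_stuck = 0.1
p_valve_b_stuck = 0.2

p_both_valves_stuck = and_gate([p_valve_a_stuck, p_valve_b_stuck])
p_top_event = or_gate([p_pump_fails, p_both_valves_stuck])

print(f"P(top event) = {p_top_event:.4f}")  # 0.0298
```

The AND gate captures redundancy (both valves must fail), while the OR gate captures a single point of failure (the pump alone is enough).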

Objectives of Fault Tree Analysis
‣ Identify causes or combinations of causes leading to the top event
‣ Determine whether a system reliability measure meets a stated requirement
‣ Determine which potential failure mode(s) or factor(s) would be the highest contributor to the system’s probability of failure or unavailability
‣ Analyse and compare various design alternatives to improve system reliability

It is important to understand that a fault tree is tailored to its top event. Therefore, the fault tree includes only those faults that contribute to this particular top event
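The first objective, identifying combinations of causes that lead to the top event, is usually approached through minimal cut sets: the smallest sets of basic events whose joint occurrence triggers the top event. The term and the algorithm are standard in FTA but are not named in the slides. The following is a small sketch with a hypothetical tree, encoded as nested `(gate, children)` tuples with strings as basic events.

```python
from itertools import product


def cut_sets(node):
    """Return the cut sets of a fault tree as a list of frozensets."""
    if isinstance(node, str):            # basic event
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":                     # any child's cut set suffices
        return [cs for sets in child_sets for cs in sets]
    if gate == "AND":                    # need one cut set from every child
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"unknown gate: {gate}")


def minimal_cut_sets(node):
    """Drop duplicates and any cut set that strictly contains another."""
    unique = set(cut_sets(node))
    minimal = [cs for cs in unique if not any(other < cs for other in unique)]
    return sorted(minimal, key=lambda cs: (len(cs), sorted(cs)))


# Hypothetical tree for illustration only.
tree = ("OR", [
    "pump fails",
    ("AND", ["valve A stuck", "valve B stuck"]),
    ("AND", ["pump fails", "valve A stuck"]),   # redundant: contains "pump fails"
])

for cs in minimal_cut_sets(tree):
    print(sorted(cs))
```

Single-event cut sets point to single points of failure, so they are natural candidates when comparing design alternatives.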

FTA - Example

Reliability Theory
‣ Reliability theory deals with the interdisciplinary use of probability, statistics,
and stochastic modelling, combined with engineering insights into the design
and the scientific understanding of the failure mechanisms, to study the various
aspects of reliability
‣ It encompasses issues such as:
- Reliability modelling
- Reliability analysis and optimization
- Reliability engineering
- Reliability science
- Reliability technology
- Reliability management
Reliability Theory
‣ Reliability modelling deals with model building to obtain solutions to
problems in predicting, estimating, and optimising the survival or performance
of an unreliable system, the impact of the unreliability, and actions to mitigate
this impact

‣ Reliability analysis can be divided into two broad categories:
- Qualitative analysis: Intended to verify the various failure modes and causes that contribute to the unreliability of a product or system
- Quantitative analysis: Uses real failure data in conjunction with suitable mathematical models to produce quantitative estimates of product or system reliability
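As a small illustration of quantitative analysis, the sketch below fits a mathematical model to failure data and derives reliability estimates from it. It assumes an exponential lifetime model and complete data (every unit was observed until it failed), neither of which comes from the slides, and the failure times are made up. Under those assumptions the maximum-likelihood estimate of the failure rate is the number of failures divided by the total time on test.

```python
import math


def estimate_failure_rate(times_to_failure):
    """Maximum-likelihood failure rate for complete (uncensored) data,
    under the assumed exponential lifetime model."""
    if not times_to_failure or any(t <= 0 for t in times_to_failure):
        raise ValueError("need a non-empty list of positive times to failure")
    return len(times_to_failure) / sum(times_to_failure)


# Made-up times to failure in hours, for illustration only.
observed_hours = [120.0, 340.0, 560.0, 980.0]

rate = estimate_failure_rate(observed_hours)   # failures per hour
mttf = 1.0 / rate                              # mean time to failure
reliability_at_100h = math.exp(-rate * 100.0)  # estimated R(100 h)

print(f"failure rate: {rate:.4f} per hour")    # 0.0020
print(f"MTTF: {mttf:.0f} hours")               # 500
print(f"R(100 h): {reliability_at_100h:.3f}")  # 0.819
```

Real data sets usually include units that have not yet failed (censored observations), which this simple estimator does not handle.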
Reliability Theory
‣ Reliability engineering deals with the design and construction of systems and
products, taking into account the unreliability of their parts and components. It
also includes testing and programs to improve reliability. Good engineering
results in a more reliable end product

‣ Reliability science is concerned with the properties of materials and the causes for deterioration leading to part and component failures. It also deals with the effect of manufacturing processes (for example, casting, annealing) on the reliability of the part or component produced

Reliability Theory
‣ Reliability management deals with the various management issues in the
context of managing the design, manufacture, and/or operation of reliable
products and systems. Here, the emphasis is on the business viewpoint, since
unreliability has consequences in terms of cost, time wasted and, in certain
cases, the welfare of an individual or even the security of a nation

Questions?
