2 - Reliability Theory
Management
Basics of Reliability Theory
Introduction
Decomposition of Engineered Objects
Even the simplest engineered product comprises several interacting elements and can be viewed as a multilevel
system. The number of levels needed depends on the system under consideration.
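A multilevel decomposition can be sketched as a simple tree. The following is a minimal illustration only: the `Element` class and the pump-set breakdown are hypothetical and do not come from the slides.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """One node in a multilevel decomposition (system, subsystem, component, ...)."""
    name: str
    children: list[Element] = field(default_factory=list)

    def depth(self) -> int:
        """Number of levels at or below this element."""
        return 1 + max((c.depth() for c in self.children), default=0)

    def leaves(self) -> list[str]:
        """Names of the lowest-level elements (no further breakdown)."""
        if not self.children:
            return [self.name]
        return [name for c in self.children for name in c.leaves()]


# Hypothetical three-level breakdown, for illustration only.
pump_set = Element("pump set", [
    Element("pump", [Element("impeller"), Element("seal"), Element("casing")]),
    Element("motor", [Element("stator"), Element("rotor")]),
])

print(pump_set.depth())   # number of levels in this breakdown
print(pump_set.leaves())  # lowest-level elements
```

How far the tree is expanded is a modelling choice: a coarser study might stop at "pump" and "motor", giving two levels instead of three.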
Functions, Failures and Faults
• Essential function
– The intended or primary function. Example: a power plant provides electrical power on demand to the consumers
who are part of the network
• Auxiliary functions
– Support the primary function and are usually less obvious than essential functions. “Preserving fluid integrity” is an
auxiliary function of a pump, and its failure may cause a critical safety hazard if the fluid is toxic or corrosive
• Protective functions
– Protect people from injury and protect the environment from damage. Examples: relays that protect against
current surges, and scrubbers on smokestacks that remove particulate matter
• Information functions
– Condition monitoring, gauges, alarms, and so on. In a power plant, the main control panel displays information
about the different subsystems, such as the voltage and current output of the generators and the pressure and
temperature of steam in various parts of the plant
Failure Modes
A failure mode is a description of a fault (also called a fault mode)
Classification of Failure Modes*
• Intermittent failures: Failures that last for a short time. Example: a software fault that occurs only under certain
intermittent conditions
• Extended failures: Failures that continue until some corrective action rectifies the failure. They can be:
– Complete failures: Result in a total loss of function
– Partial failures: Result in a partial loss of function
Both can be sudden (without any warning) or gradual (with or after a warning signal)
• Catastrophic failure: complete and sudden
• Degraded failure: gradual and partial
* Blache and Shrivastava (1994)
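The two named combinations in this classification (catastrophic = complete and sudden, degraded = partial and gradual) can be expressed as a small lookup. This is a sketch only: the enum and function names are hypothetical, and the slides name no label for the other two combinations, so those are returned as plain descriptions.

```python
from enum import Enum


class Extent(Enum):
    COMPLETE = "complete"  # total loss of function
    PARTIAL = "partial"    # partial loss of function


class Onset(Enum):
    SUDDEN = "sudden"      # without any warning
    GRADUAL = "gradual"    # with or after a warning signal


def label_extended_failure(extent: Extent, onset: Onset) -> str:
    """Return the named class for an extended failure, if the classification has one."""
    if extent is Extent.COMPLETE and onset is Onset.SUDDEN:
        return "catastrophic"
    if extent is Extent.PARTIAL and onset is Onset.GRADUAL:
        return "degraded"
    # No dedicated name is given for the remaining combinations.
    return f"{extent.value}, {onset.value}"


print(label_extended_failure(Extent.COMPLETE, Onset.SUDDEN))
print(label_extended_failure(Extent.PARTIAL, Onset.GRADUAL))
```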
Example – Hydraulic Valves
Failure Causes and Severity
A failure cause is the set of circumstances during design, manufacture, or use that
led to a failure. Identifying the cause helps prevent recurrence.
Failure Severity Ranking
MIL-STD-882D, 2000
Catastrophic
– Failures that result in death or total system loss
Critical
– Failures that result in severe injury or major system damage
Marginal
– Failures that result in minor injury or minor system damage
Negligible
– Failures that result in less than minor injury or system damage
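One way to apply this ranking is to take the worse of the injury outcome and the system-damage outcome. The sketch below makes several assumptions: the outcome keywords are hypothetical, and the integer values only encode ordering (they are not the category numbers used in the standard).

```python
from enum import IntEnum


class Severity(IntEnum):
    # Integer values only encode "worse than"; they are an assumption.
    NEGLIGIBLE = 1
    MARGINAL = 2
    CRITICAL = 3
    CATASTROPHIC = 4


# Hypothetical outcome keywords mapped to the four categories above.
INJURY = {
    "none": Severity.NEGLIGIBLE,
    "minor": Severity.MARGINAL,
    "severe": Severity.CRITICAL,
    "death": Severity.CATASTROPHIC,
}
DAMAGE = {
    "none": Severity.NEGLIGIBLE,
    "minor": Severity.MARGINAL,
    "major": Severity.CRITICAL,
    "total loss": Severity.CATASTROPHIC,
}


def rank(injury: str, damage: str) -> Severity:
    """Severity is driven by the worse of the two consequences."""
    return max(INJURY[injury], DAMAGE[damage])


print(rank("minor", "major").name)
```

Taking the maximum reflects the "or" in each definition: either consequence alone is enough to place a failure in that category.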
Characterisation of Degradation
• Binary-state
• Multi-state (finite)
• Multi-state (infinite)
Concept of Reliability
Reliability of an item is its ability to perform a
required function, under given environmental and
operational conditions and for a stated period of
time (ISO 8402, 1986)
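Quantitatively, this ability is usually expressed as the probability R(t) that the item survives the stated period t. The sketch below assumes a constant failure rate (an exponential life distribution); that assumption is not part of the definition above, and the function name and numbers are illustrative.

```python
import math


def reliability_exponential(t_hours: float, failure_rate_per_hour: float) -> float:
    """R(t) = exp(-lambda * t), valid only under the constant-failure-rate assumption."""
    if t_hours < 0 or failure_rate_per_hour < 0:
        raise ValueError("time and failure rate must be non-negative")
    return math.exp(-failure_rate_per_hour * t_hours)


# Illustrative numbers: failure rate of 0.001 per hour over a 1000-hour period.
r = reliability_exponential(1000.0, 0.001)
print(round(r, 3))  # exp(-1), about 0.368
```

The "given environmental and operational conditions" in the definition enter through the failure rate: the same item operated under harsher conditions would have a different rate and hence a different R(t).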
Some Objectives of FMEA
IEEE Standard 352
‣ Ensure that all conceivable failure modes and their effects on operational
success of the system have been considered
‣ List potential failures and identify the magnitude of their effects
‣ Provide historical documentation for future reference to aid in the analysis of
field failures and consideration of design changes
‣ Provide a basis for establishing corrective action priorities
‣ Assist in the objective evaluation of design requirements related to
redundancy, failure detection systems, fail-safe characteristics, and automatic
and manual override
FMEA Procedure
1. Determining the item functions
2. Identifying all item failure modes
3. Determining the effect of the failure for each failure mode, both on the
component and on the overall system being analyzed
4. Classifying the failure by its effects on the system operation and mission
5. Determining the failure’s probability of occurrence
6. Identifying how the failure mode can be detected (this is especially important
for fault-tolerant configurations)
7. Identifying any design changes to eliminate the failure mode, or if that is not
possible, mitigate or compensate for its effects
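The seven steps map naturally onto the columns of an FMEA worksheet. The following is a minimal sketch: the `FmeaRow` fields, the 1 to 4 severity scale, the ordering rule in `prioritise`, and all pump data are assumptions made for illustration, not values from the slides or from a standard.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    item: str
    function: str           # step 1: item function
    failure_mode: str       # step 2: failure mode
    local_effect: str       # step 3: effect on the component
    system_effect: str      # step 3: effect on the overall system
    severity: int           # step 4: assumed scale, 1 (negligible) to 4 (catastrophic)
    occurrence_prob: float  # step 5: probability of occurrence
    detection: str          # step 6: how the failure mode can be detected
    mitigation: str         # step 7: design change or compensating measure


def prioritise(rows: list[FmeaRow]) -> list[FmeaRow]:
    """Order rows by severity first, then by probability (one possible rule)."""
    return sorted(rows, key=lambda r: (r.severity, r.occurrence_prob), reverse=True)


# Hypothetical worksheet entries for a pump.
rows = [
    FmeaRow("bearing", "support shaft", "wear", "vibration", "reduced flow",
            2, 0.05, "vibration monitoring", "scheduled lubrication"),
    FmeaRow("seal", "contain fluid", "leak", "fluid loss at shaft", "toxic release",
            4, 0.01, "leak detector", "double mechanical seal"),
    FmeaRow("impeller", "move fluid", "erosion", "lower head", "reduced flow",
            2, 0.02, "flow gauge", "harder alloy"),
]

for row in prioritise(rows):
    print(row.item, row.failure_mode, row.severity, row.occurrence_prob)
```

Sorting on severity before probability is only one way to set corrective action priorities; other schemes combine the two into a single criticality figure.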
FMEA Example
FMEA & FMECA
Fault Tree Analysis
‣ FTA is concerned with the identification and analysis of conditions and factors
that cause, or may potentially cause or contribute to, the occurrence of a
defined top event (such as failure of a system)
‣ A fault tree is an organized graphical representation of the conditions or other
factors causing or contributing to the occurrence of the top event
‣ FTA can be used for analysis of systems with complex interactions between the
components, including software–hardware interactions
‣ The analysis may be quantitative or qualitative, and FTA may be used
independently or in conjunction with other reliability analyses
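For the quantitative case, a small fault tree can be evaluated by combining basic-event probabilities through AND and OR gates. This sketch assumes the basic events are statistically independent; the gate helpers, the tree, and the probabilities are hypothetical.

```python
from math import prod


def and_gate(probs: list[float]) -> float:
    """Output event occurs only if all input events occur (independence assumed)."""
    return prod(probs)


def or_gate(probs: list[float]) -> float:
    """Output event occurs if at least one input event occurs (independence assumed)."""
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical top event: "no cooling flow".
# It occurs if both redundant pumps fail, or if the power supply fails.
p_pump_a = 0.1
p_pump_b = 0.1
p_power = 0.02

p_both_pumps = and_gate([p_pump_a, p_pump_b])  # 0.01
p_top = or_gate([p_both_pumps, p_power])       # 1 - 0.99 * 0.98

print(round(p_top, 4))  # 0.0298
```

If basic events share a common cause, the independence assumption fails and these simple gate formulas no longer apply.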
Objectives of Fault Tree Analysis
‣ Identify causes or combinations of causes leading to the top event
‣ Determine whether a system reliability measure meets a stated requirement
‣ Determine which potential failure mode(s) or factor(s) contribute most to
the system’s probability of failure or unavailability
‣ Analyse and compare design alternatives to improve system
reliability
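Two of these objectives, checking a requirement and finding the highest contributor, can be illustrated with a simple sensitivity calculation: remove each basic event in turn and see how much the top-event probability drops. The tree structure, probabilities, and the 0.05 requirement are hypothetical, and independent basic events are assumed.

```python
def top_event_probability(p: dict[str, float]) -> float:
    """Hypothetical tree: top = (pump_a AND pump_b) OR power, events independent."""
    both_pumps_fail = p["pump_a"] * p["pump_b"]
    return 1.0 - (1.0 - both_pumps_fail) * (1.0 - p["power"])


def contributions(p: dict[str, float]) -> dict[str, float]:
    """Reduction in top-event probability if each basic event could never occur."""
    base = top_event_probability(p)
    result = {}
    for name in p:
        improved = dict(p)
        improved[name] = 0.0  # pretend this basic event is eliminated
        result[name] = base - top_event_probability(improved)
    return result


basic_events = {"pump_a": 0.1, "pump_b": 0.1, "power": 0.02}
requirement = 0.05  # assumed maximum acceptable top-event probability

p_top = top_event_probability(basic_events)
ranking = contributions(basic_events)

print(p_top <= requirement)              # does the design meet the requirement?
print(max(ranking, key=ranking.get))     # basic event with the largest contribution
```

In this illustration the power supply dominates because the pumps are redundant; comparing design alternatives amounts to rerunning the same calculation with a changed tree or changed probabilities.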
FTA - Example
Reliability Theory
‣ Reliability theory deals with the interdisciplinary use of probability, statistics,
and stochastic modelling, combined with engineering insights into the design
and the scientific understanding of the failure mechanisms, to study the various
aspects of reliability
‣ It encompasses issues such as:
- Reliability modelling
- Reliability analysis and optimization
- Reliability engineering
- Reliability science
- Reliability technology
- Reliability management
Reliability Theory
‣ Reliability modelling deals with model building to obtain solutions to
problems in predicting, estimating, and optimising the survival or performance
of an unreliable system, the impact of the unreliability, and actions to mitigate
this impact
Reliability Theory
‣ Reliability management deals with the various management issues in the
context of managing the design, manufacture, and/or operation of reliable
products and systems. Here, the emphasis is on the business viewpoint, since
unreliability has consequences in terms of cost, time wasted and, in certain
cases, the welfare of an individual or even the security of a nation
Questions?