Reliability Methodologies Standards and Tools
a survey by
Andrey Morozov
Dresden, 30.01.2013
Wednesday, January 30, 2013
Reliability: Methodologies, Standards, and Tools
+ Software Reliability
Dependability
Attributes: Availability, Reliability, Safety, Confidentiality, Integrity, Maintainability
Means: Fault Prevention, Fault Tolerance, Fault Removal, Fault Forecasting
Threats: Faults, Errors, Failures
Reliability definition
Reliability:
‣ the ability of an item to perform a required function, under stated conditions for a stated period of time [IEEE, IEC]
‣ the ability of an item to perform a required function, under given environmental and operational conditions and for a stated period of time [ISO]
‣ the duration or probability of failure-free performance under stated conditions [MIL-STD]
Safety: the absence of catastrophic consequences on the user and the environment
unSafety = (1 - Reliability) x Hazard level
Reliability critical industrial domains
Aviation
Telecom
Automotive
Space
Railway
Petroleum
Theory behind
1812 Laplace - Bayesian probability
1880 Markov - Markov chains
...
1939 Petri - Petri nets
1939 Weibull - Weibull distribution (life length of materials)
1941 Feller - Renewal theory
1949 US MIL-P-1629 - Failure Mode and Effect Analysis (FMEA)
1952 Gates - Boolean algebra for redundant systems reliability analysis
1953 Epstein and Sobel - Exponential distribution for reliability analysis of electronic components
1954 Kleinerman - Reliability Block Diagrams (RBD)
1954 Watson - Fault Tree Analysis (FTA)
1956 Moore and Shannon - Network reliability
1958 Toyoda - Root cause analysis (RCA)
1958 Birnbaum and Saunders - Statistical model for life lengths of structures under dynamic loading
1959 Barlow and Hunter - Markov model for system reliability
1961 Mosteller - Use of Bayes' theorem in reliability
1961 Barlow et al - Increasing Failure Rate (IFR)
1963 Kletz - Hazard and operability study (HAZOP)
...
1992 Dugan - Dynamic fault tree analysis (DIFtree)
1998 Papazoglou - Event tree analysis (ETA)
2003 Trivedi and Fricks - Markov reward model (MRM) for component importance measures
Source: Alice Rueda and Mirek Pawlak, "Pioneers of the Reliability Theories of the Past 50 Years."
Primary standards developing and setting organizations
World Standards Cooperation (based in Geneva, Switzerland):
International Organization for Standardization (ISO), 1947
International Electrotechnical Commission (IEC), 1906
International Telecommunication Union (ITU), 1865
http://www.enre.umd.edu/tools/ftap.htm
http://www.ntnu.no/ross/info/softvendors.php
Rough classification of reliability methodologies
System-level models: describe the reliability aspects of the product on a system level.
Mathematical models: are used for numerical (probabilistic) reliability analysis.
System-level models use mathematical models.
Reliability metrics
Simple failure model:
Mean time to failure (MTTF) - the average time to failure of a component (no repairs).
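[Figure: two timelines alternating between Operation and Down states; the first is annotated with MTTF, the second with MTBF (mean time between failures) and MTTR (mean time to repair).]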
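The three metrics can be related with a few lines of arithmetic. The sketch below is illustrative only: the numbers are hypothetical, the convention MTBF = MTTF + MTTR is one common reading of the Operation/Down timeline, and the exponential survival function assumes a constant failure rate, which the slides do not state.

```python
import math


def availability(mttf: float, mttr: float) -> float:
    """Steady-state availability: share of time spent in Operation."""
    return mttf / (mttf + mttr)


def reliability(t: float, mttf: float) -> float:
    """Probability of surviving until time t, assuming a constant
    failure rate of 1/MTTF (exponential model, an assumption here)."""
    return math.exp(-t / mttf)


# Hypothetical figures for a repairable component, in hours.
mttf = 1000.0
mttr = 10.0
mtbf = mttf + mttr  # assumed convention: one full Operation + Down cycle

print(f"MTBF = {mtbf:.0f} h")
print(f"Availability = {availability(mttf, mttr):.4f}")
print(f"R(100 h) = {reliability(100.0, mttf):.4f}")
```

Note that some sources define MTBF as the uptime between failures only; the choice matters little when MTTR is small compared with MTTF.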
http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
Reliability data sources
‣ MIL-HDBK-217F - Reliability Prediction of Electronic Equipment
‣ EPRD - Electronic Parts Reliability Data (RIAC)
‣ NPRD-95 - Non-electronic Parts Reliability Data (RIAC)
‣ FMD-97 - Failure Mode/Mechanism Distributions (RIAC)
‣ SPIDR - System and Part Integrated Data Report (System Reliability Center)
‣ SR-332 - Reliability Prediction for Electronic Equipment (Telcordia Technologies)
‣ FIDES (mainly electronic components)
‣ EiReDA - European Industry Reliability Data: mainly components in nuclear power plants
‣ OREDA - Offshore Reliability Data: topside and subsea equipment for offshore oil and gas production
‣ MechRel - Handbook of Reliability Prediction for Mechanical Equipment: mechanical equipment, military applications
‣ T-Book - Reliability Data of Components in Nordic Nuclear Power Plants (ISBN 91-631-0426-1)
‣ Reliability Data for Control and Safety Systems - PDS Data Handbook: sensors, detectors, valves & control logic
‣ Safety Equipment Reliability Handbook (exida): safety equipment (sensors, logic units, actuators)
‣ WellMaster (ExproSoft): components in oil wells
‣ SubseaMaster (ExproSoft): components in subsea oil/gas production systems
‣ PERD - Process Equipment Reliability Data (AIChE): process equipment
‣ GIDEP (Government-Industry Data Exchange Program)
‣ CCPS Guidelines for Process Equipment Reliability Data, AIChE, 1989: process equipment
‣ FARADIP: electronic, electrical, mechanical, pneumatic equipment
‣ IEEE Std. 500-1984: IEEE Guide to the Collection and Presentation of Electrical, Electronic, Sensing Component, and Mechanical Equipment Reliability Data for Nuclear Power Generating Stations
‣ FASIT (Feil og avbrudd i kraftsystemer): failures in the electric power supply system (in Norwegian)
‣ PROMISE Data Repository - software failure data and metrics
‣ NASA IV&V Facility Metrics Data Program - software failure data
‣ CSIAC - The Software Reliability Dataset
http://www.ntnu.edu/ross/info/data
Reliability Block Diagrams (RBD)
M. M. Kleinerman and G. H. Weiss, "On the reliability of networks," 1954.
E. Blanton, "Reliability-Prediction Technique for Use in Design of Complex Systems," 1957.
[Figure: example RBD of a system built from four blocks, Component 1 to Component 4.]
An RBD is a graphical depiction of the system's components and connectors that can be used to determine the overall system reliability.
Blocks represent system components. Lines describe the connections between components.
If any path through the system is successful, then the system succeeds; otherwise it fails.
[http://www.win.tue.nl/~mchaudro/sa2007/Reliability%20Block%20Diagrams.pdf]
Series Configuration
Parallel Configuration
k-out-of-n Parallel Configuration (figure: Component 1, Component 2, and Component 3 in a 2/3 arrangement)
In addition:
Combined Configuration
Complex Configuration
Inheritance
Multi-blocks
Mirrored blocks
Etc.
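The three basic configurations reduce to short formulas when block failures are statistically independent. The following sketch assumes that independence (the slides do not state it), and the component reliability of 0.9 is a hypothetical value.

```python
from math import comb, prod


def series(reliabilities):
    """All blocks must work: R = product of R_i."""
    return prod(reliabilities)


def parallel(reliabilities):
    """At least one block must work: R = 1 - product of (1 - R_i)."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


def k_out_of_n(k, n, r):
    """At least k of n identical blocks, each with reliability r, must work."""
    return sum(comb(n, i) * r**i * (1.0 - r) ** (n - i) for i in range(k, n + 1))


r = 0.9  # hypothetical reliability of every component
print("series of 2:  ", series([r, r]))
print("parallel of 2:", parallel([r, r]))
print("2-out-of-3:   ", k_out_of_n(2, 3, r))
```

Combined configurations can be evaluated by nesting these functions, for example `series([parallel([r, r]), r])`; complex (non series-parallel) structures need other techniques such as path enumeration.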
Reliability Block Diagrams (RBD), standards
ANSI/ASQ/IEC D61078-1997
Analysis Techniques for Dependability - Reliability
Block Diagram Method
American Society for Quality/International Electrotechnical
Commission / 16-Sep-1997 / 33 pages
DID DI-RELI-81496
RELIABILITY BLOCK DIAGRAMS AND MATHEMATICAL
MODELS REPORT (SUPERSEDING DI-R-7094)
Data Item Description / 30-Oct-1995 / 5 pages
http://www.techstreet.com/
Reliability Block Diagrams (RBD), tools
BlockSim (ReliaSoft)
http://www.ntnu.no/ross/info/software.php
Fault Tree Analysis (FTA)
Developed in 1962 at Bell Laboratories by H.A. Watson, under a U.S. Air Force Ballistics Systems Division contract to
evaluate the Minuteman I Intercontinental Ballistic Missile (ICBM) Launch Control System.
[Figure: example fault tree in which AND and OR gates combine the basic events 1-5.]
Fault tree diagrams represent the logical relationship between sub-system and component failures and how they combine to cause system failures.
The TOP event of a fault tree represents a system event of interest and is connected by logical gates to component failures known as basic events.
After creating the diagram, failure and repair data are assigned to the system components.
The analysis is then performed to calculate reliability and availability parameters for the system and identify critical components.
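For independent basic events, the probability of the TOP event follows directly from the gate logic. The sketch below is a minimal illustration: the tree structure and the failure probabilities are assumptions made for the example, not data from the slides, and real tools additionally handle repeated events and minimal cut sets.

```python
from math import prod


def and_gate(probabilities):
    """Output event occurs only if all input events occur."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if at least one input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical failure probabilities of the basic events 1..5.
p = {1: 0.1, 2: 0.2, 3: 0.01, 4: 0.02, 5: 0.03}

# Assumed structure: TOP = OR(AND(1, 2), 3, 4, 5)
top = or_gate([and_gate([p[1], p[2]]), p[3], p[4], p[5]])
print(f"P(TOP) = {top:.6f}")
```

Ranking the basic events by how much P(TOP) drops when each probability is set to zero is one simple way to identify critical components.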
[Figure: a reliability block diagram of a system with blocks 1-5 next to the equivalent fault tree with a TOP event and an OR gate; the two representations are interconvertible.]
Fault Tree Analysis (FTA), standards
IEC 61025 Ed. 2.0 b:2006
Fault tree analysis (FTA)
Edition: 2.0
International Electrotechnical Commission / 13-Dec-2006 / 103 pages
Describes fault tree analysis and provides guidance on its application to perform an analysis, identifies appropriate assumptions, events and failure modes, and provides identification rules and symbols.
UNE-EN 61025:2011
Fault tree analysis (FTA)
UNE-EN / 19-Jan-2011 / 54 pages
AS IEC 61025-2008
Fault tree analysis (FTA)
Standards Australia / 01-Jan-2008
BS EN 61025:2007
Fault tree analysis (FTA)
British-Adopted European Standard / 28-Sep-2007 / 56 pages
ISBN: 9780580540691
DIN EN 61025
Fault tree analysis (FTA) (IEC 61025:2006); German version EN 61025:2007
DIN-adopted European Standard / 01-Aug-2007 / 50 pages
UNE 21925:1994
FAULT TREE ANALYSIS (FTA).
UNE / 05-Dec-1994 / 18 pages
MIL MIL-HDBK-338B
ELECTRONIC RELIABILITY DESIGN (SUPERSEDING MIL-HDBK-338A)
Military Specifications and Standards / 01-Oct-1998 / 1045 pages
SAE ARP 4761
Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment
SAE International / 01-Dec-1996
http://www.techstreet.com/
Fault Tree Analysis (FTA), tools
Relex Fault Tree (Relex)
FaultTree+ (Isograph)
FTA Module (Module of Item Toolkit) (Item UK) (Item US)
BlockSim (ReliaSoft)
LOGAN (RM Consultants) by Reliass
FTA (Module of RAM Commander) (A.L.D.) also sold by Reliass
RAPTOR (ARINC) also sold by Reliass
Cabtree (CAB Innov)
CARE FTA (BQR)
CARA FaultTree (Sydvest)
SAPHIRE (NRC)
RiskSpectrum FT Professional (Relcon Scandpower)
FTAnalyzer Lite (SoHaR)
CAFTA (SAIC)
TDC FTA (TDC)
http://www.ntnu.no/ross/info/software.php
Event Tree Analysis (ETA)
[Figure: event tree. An Initiating Event is followed by Event 1, Event 2, and Event 3; each event branches into OK or FAIL, and the branches end in Outcome 1 to Outcome 7.]
Event tree diagrams provide a logical representation of the possible outcomes following a hazardous event.
Event tree analysis provides an inductive approach to reliability and risk assessment; event trees are constructed using forward logic.
The event tree model may be linked to the fault tree model by using fault tree gate results as the source of event tree probabilities.
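The forward logic of an event tree can be sketched as an enumeration of OK/FAIL sequences. This example is a simplification under stated assumptions: the three failure probabilities are hypothetical, the events are treated as independent, and the full binary tree with 8 outcomes is enumerated, whereas the tree in the figure is pruned to 7 outcomes.

```python
from itertools import product


def outcome_probabilities(p_fail):
    """Map every OK/FAIL sequence to its probability, given one failure
    probability per event and assuming the events are independent."""
    outcomes = {}
    for sequence in product(("OK", "FAIL"), repeat=len(p_fail)):
        probability = 1.0
        for state, pf in zip(sequence, p_fail):
            probability *= pf if state == "FAIL" else 1.0 - pf
        outcomes[sequence] = probability
    return outcomes


# Hypothetical failure probabilities of Event 1, Event 2, Event 3.
outcomes = outcome_probabilities([0.1, 0.2, 0.5])
for sequence, probability in outcomes.items():
    print(" -> ".join(sequence), f"{probability:.3f}")
```

When the tree is linked to fault trees, each entry of `p_fail` would be the TOP-event probability of the corresponding fault tree rather than a fixed number.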
[Figure: a fault tree with the top event "Brake Fails" and an OR gate, shown next to the corresponding event tree.]
The example has been adapted from "Safety Critical Systems Analysis" by Robert Slater.
Event trees function similarly to fault trees, but in the opposite direction.
An event tree attempts to enumerate a list of components and determine the result of their operation or non-operation.
In this way, all possible sequences of events involving those components are covered.
Storey, Neil. "Safety-Critical Computer Systems" Addison Wesley, 1996.
Event Tree Analysis (ETA), standards
IEC 62502 Ed. 1.0 b:2010
Analysis techniques for dependability - Event tree analysis (ETA)
Edition: 1.0
International Electrotechnical Commission / 27-Oct-2010 / 87 pages
IEC 62502 specifies the consolidated basic principles of Event Tree Analysis (ETA) and provides guidance on modeling the consequences of an initiating event as well as analyzing these consequences qualitatively and quantitatively in the context of dependability and risk related measures.
AS IEC 62502-2011
Analysis techniques for dependability - Event tree analysis (ETA)
Standards Australia / 01-Jan-2011
BS EN 62502:2011
Analysis techniques for dependability. Event tree analysis (ETA)
British-Adopted European Standard / 30-Jun-2011 / 48 pages
BS 09/30169892 DC
BS EN 62502. Analysis techniques for dependability. Event tree analysis
British Standards Institution / 15-Jun-2009 / 46 pages
DIN 25419
Event tree analysis; method, graphical symbols and evaluation
Deutsches Institut Fur Normung E.V. (German National Standard) / 01-Nov-1985 / 5 pages
http://www.techstreet.com/
Event Tree Analysis (ETA), tools
ETA (SAIC)
http://www.ntnu.no/ross/info/software.php
Markov chains
[Figures: three example Markov models with transition probabilities on the arcs.
General example of a renewal failure model: state labels include Correct, Incorrect, Broken, Parts on order, and Under repair.
Control flow aspects: Module 1 and Module 2 with transitions for case 1, case 2, and case 3.
2-module redundancy: numbered states and an OK state, with transition probabilities expressed through the failure probabilities f1 and f3.]
[Figure: Markov chain of a two-unit system with the states "Unit 1 OK, Unit 2 OK", "Unit 1 FAIL, Unit 2 OK", "Unit 1 OK, Unit 2 FAIL", and "System Failure"; the arcs are labelled with the probabilities 0,8, 0,1, and 1.]
The Markov model assumes that the future is independent of the past given the present.
Markov chains are usually used in conjunction with other reliability methodologies, e.g. FTA.
http://src.alionscience.com/pdf/MARKOV.pdf
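A discrete-time Markov chain of this kind can be evaluated by repeatedly multiplying the state distribution by the transition matrix. The matrix below is an assumption for illustration: it uses the probabilities 0.8, 0.1, and 1 that label the arcs in the figure, but which arc carries which value (in particular, repair back to the all-OK state versus a second failure) is guessed rather than read from the slide.

```python
STATES = ("both OK", "unit 1 failed", "unit 2 failed", "system failure")

# Assumed one-step transition probabilities; every row sums to 1.
P = [
    [0.8, 0.1, 0.1, 0.0],  # both OK: stay, unit 1 fails, unit 2 fails
    [0.1, 0.8, 0.0, 0.1],  # unit 1 failed: repaired, stay, unit 2 also fails
    [0.1, 0.0, 0.8, 0.1],  # unit 2 failed: repaired, stay, unit 1 also fails
    [0.0, 0.0, 0.0, 1.0],  # system failure is absorbing
]


def step(distribution, matrix):
    """One step of the chain: new[j] = sum over i of old[i] * P[i][j]."""
    n = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def distribution_after(distribution, matrix, steps):
    """State distribution after the given number of steps."""
    for _ in range(steps):
        distribution = step(distribution, matrix)
    return distribution


start = [1.0, 0.0, 0.0, 0.0]  # the system starts with both units OK
for n in (1, 2, 10):
    dist = distribution_after(start, P, n)
    print(n, {s: round(p, 4) for s, p in zip(STATES, dist)})
```

Because the next distribution depends only on the current one, the code needs no history, which is exactly the Markov property stated above.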
Markov chains, standards
IEC 61165 Ed. 2.0 b:2006
Application of Markov techniques
Edition: 2.0
International Electrotechnical Commission / 18-May-2006 / 67 pages
This International Standard provides guidance on the application of Markov techniques to model and analyze a system and estimate reliability, availability, maintainability and safety measures. This standard is applicable to all industries where systems, which exhibit state-dependent behavior, have to be analyzed. The Markov techniques covered by this standard assume constant time-independent state transition rates. Such techniques are often called homogeneous Markov techniques.
AS IEC 61165-2008
Application of Markov techniques
Standards Australia / 01-Jan-2008
BS EN 61165:2006
Application of Markov techniques
British-Adopted European Standard / 29-Feb-2008 / 40 pages
BS 05/30101071 DC
IEC 61165. Application of Markov techniques
British Standards Institution / 10-Feb-2005 / 35 pages
DIN EN 61165
Application of Markov techniques (IEC 61165:2006); German version EN 61165:2006
DIN-adopted European Standard / 01-Feb-2007 / 35 pages
UNE 21406:1997
APPLICATION OF MARKOV TECHNIQUES.
UNE / 21-Apr-1997 / 24 pages
ANSI/ASQ/IEC D61165-1997
Application of Markov Techniques
American Society for Quality/International Electrotechnical Commission / 16-Sep-1997 / 29 pages
IEC 61508-SER Ed. 2.0 b:2010
Functional safety of electrical/electronic/programmable electronic safety-related systems - ALL PARTS
International Electrotechnical Commission / 01-Apr-2010
IEC 61508 requires Markov techniques for estimation of the probability of failure
ISA TR84.00.02-2002 - Part 4
Safety Instrumented Functions (SIF) - Safety Integrity Level (SIL) Evaluation Techniques Part 4: Determining the SIL of a SIF via Markov Analysis
The International Society of Automation / 17-Jun-2002 / 58 pages
Markov chains, tools
Markov Analysis Module (Module of Item Toolkit) (Item UK) (Item US)
CARE-RBD-Markov (BQR)
+ Markov analysis tools that do not come from the reliability software vendors
http://www.ntnu.no/ross/info/software.php
Petri nets
[Figure: Petri net of a two-unit system. Places: P1 (Unit 1 OK), P2 (Unit 2 OK), P3 (Unit 1 FAIL), P4 (Unit 2 FAIL), P5 (System Failure). Transitions: T1 to T5. Tokens mark the current state.]
[Figure: the same net next to its reachability graph, whose nodes are the markings 11000, 01010, 10010, 00110, and 00001 and whose arcs are labelled with the transitions T1 to T5.]
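The reachability graph of such a net can be generated mechanically by firing every enabled transition from every marking found so far. In the sketch below the arc structure is an assumption read off the place and transition names (T1/T3 as failures, T2/T4 as repairs, T5 as system failure), and markings are written in the order P1..P5, so the printed strings may not match the labels in the figure one-to-one.

```python
from collections import deque

PLACES = ("P1", "P2", "P3", "P4", "P5")

# Assumed arcs: transition -> (input places, output places).
TRANSITIONS = {
    "T1": (("P1",), ("P3",)),       # unit 1 fails
    "T2": (("P3",), ("P1",)),       # unit 1 is repaired
    "T3": (("P2",), ("P4",)),       # unit 2 fails
    "T4": (("P4",), ("P2",)),       # unit 2 is repaired
    "T5": (("P3", "P4"), ("P5",)),  # both units down: system failure
}


def fire(marking, inputs, outputs):
    """Return the marking after firing, or None if the transition is not enabled."""
    tokens = dict(zip(PLACES, marking))
    if any(tokens[p] < 1 for p in inputs):
        return None
    for p in inputs:
        tokens[p] -= 1
    for p in outputs:
        tokens[p] += 1
    return tuple(tokens[p] for p in PLACES)


def reachable(initial):
    """Breadth-first exploration of the reachability graph."""
    seen = {initial}
    edges = []
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for name, (inputs, outputs) in TRANSITIONS.items():
            successor = fire(marking, inputs, outputs)
            if successor is None:
                continue
            edges.append((marking, name, successor))
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen, edges


markings, edges = reachable((1, 1, 0, 0, 0))
for source, name, target in edges:
    print("".join(map(str, source)), f"--{name}-->", "".join(map(str, target)))
```

Attaching firing rates to the transitions would turn the reachability graph into a Markov chain, which is how stochastic Petri net tools typically compute reliability figures.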
Petri nets, standards
ISO/IEC 15909-1:2004
Software and system engineering - High-level Petri nets - Part 1: Concepts, definitions and graphical notation
International Organization for Standardization/International Electrotechnical Commission / 01-Dec-2004 / 38 pages
ISO/IEC 15909-2:2011
Systems and software engineering - High-level Petri nets - Part 2: Transfer format
International Organization for Standardization/International Electrotechnical Commission / 01-Apr-2011
http://www.techstreet.com/
Petri nets, tools
GreatSPN
Petrinetz-Tool Netlab
ExSpect
CPN-AMI
http://www.informatik.uni-hamburg.de/TGI/PetriNets/tools/quick.html
Root Cause Analysis (RCA)
Root cause analysis is an approach for identifying the underlying causes of why an incident (a failure) occurred. It is sometimes also called RCFA (Root Cause Failure Analysis).
Three basic questions:
1) What's the problem?
2) Why did it happen?
3) What will be done to prevent it?
5 Whys method, invented by Sakichi Toyoda, 1958
Example
Problem: The computer monitor is not working.
Why? The monitor's light signal is not on.
Why? The monitor's power cord is not functioning.
Why? The cord is damaged.
Why? It was placed under a heavy load.
Why? I didn't place the cords properly when the monitor was plugged in, which caused damage.
http://www.brighthubpm.com/risk-management/123244-how-has-the-root-cause-analysis-evolved-since-inception/
Why-because analysis and why-because graph (WBG)
[Figure: why-because graph of an aircraft failure with the nodes Aircraft failure, Loss of control, Loss of power of engines 3 and 4, and Deformed wing.]
IEEE 1415-2006
IEEE Guide for Induction Machinery Maintenance Testing and Failure Analysis
Institute of Electrical and Electronics Engineers / 30-Apr-2007
ISBN: 9780738155647
This guide describes field test methods that assure that current transformers are connected properly, are of marked ratio and polarity, and after having been in service for a period of time: failure analysis, induction machinery, induction motor, maintenance tests, maintenance testing, root cause analysis
SAE J 2816
Guide for Reliability Analysis Using the Physics-of-Failure Process
SAE International / 03-Dec-2009
http://www.techstreet.com/
Tools:
RealityCharting® Software
http://www.rvs.uni-bielefeld.de/Bieleschweig/first/WBA-and-Concorde.pdf
Failure Mode and Effect Analysis (FMEA)
Failure Mode and Effect Analysis (FMEA) is a procedure for identifying potential failure modes in a system and classifying them according to their severity.
FMEA was developed in 1949 to study problems that might arise from malfunctions of military systems.
First steps of the procedure:
‣ Define the system to be analyzed and sub-systems
‣ Identify and assign failure modes to the sub-systems
‣ Identify and assign effects of the failure modes
[http://www.isograph-software.com/rwboverfme.htm] http://elsmar.com/FMEA/
Consider the case of starting a car.
1. Define the system to be analyzed and sub-systems:
‣ electrical power to turn the engine
‣ fuel for the engine
‣ operation of the ignition system
‣ mechanical operation of the engine
...
2. Identify and assign failure modes to the sub-systems:
Electric power to turn engine:
‣ battery dead
‣ lights left on (human failure)
‣ old battery (mechanical failure)
‣ faulty battery (mechanical failure)
‣ battery connector corroded
‣ cable broken or damaged
‣ battery stolen
Fuel for engine:
‣ gas tank empty
‣ fuel pump broken
...
3. Identify and assign effects of the failure modes:
‣ dead battery, engine will not turn over
‣ battery connector corroded, engine will not turn over
…
‣ gas tank empty, engine will turn over but not start
…
4. Determine and assign severity (S) rating of the effects:
1 - No effect
2 - Very minor (only noticed by discriminating customers)
3 - Minor (affects very little of the system, noticed by average customer)
4/5/6 - Moderate (most customers are annoyed)
7/8 - High (causes a loss of primary function; customers are dissatisfied)
9/10 - Very high and hazardous (product becomes inoperative; customers angered; the failure may result in unsafe operation and possible injury)
5. Identify potential causes and assign occurrence (O) rating:
1 - No known occurrences on similar products or processes
2/3 - Low (relatively few failures)
4/5/6 - Moderate (occasional failures)
7/8 - High (repeated failures)
9/10 - Very high (failure is almost inevitable)
6. Determine current controls and assign detection (D) rating:
1 - Certain - fault will be caught on test
2 - Almost certain
3 - High
4/5/6 - Moderate
7/8 - Low
9/10 - Fault will be passed to customer undetected
7. Define risk priority number: RPN = S x O x D
8. Highlight critical failures and recommend solutions
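The last steps of the procedure, computing RPN = S x O x D and highlighting the critical failures, can be sketched as a small ranking script. The failure modes are taken from the car example, but all S, O, and D ratings below are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # S, 1..10
    occurrence: int  # O, 1..10
    detection: int   # D, 1..10

    @property
    def rpn(self) -> int:
        """Risk priority number: RPN = S x O x D."""
        return self.severity * self.occurrence * self.detection


# Failure modes from the car example; all ratings are hypothetical.
modes = [
    FailureMode("battery dead", 8, 4, 2),
    FailureMode("battery connector corroded", 8, 2, 5),
    FailureMode("gas tank empty", 7, 3, 1),
]

# Highest RPN first: these are the failures to address first.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.rpn:4d}  {mode.name}")
```

Since the three ratings are ordinal scales, the RPN is best read as a ranking aid; many practitioners additionally review every failure mode with a high severity regardless of its RPN.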
Failure Mode and Effect Analysis (FMEA), standards
AIAG FMEA-4
Potential Failure Mode and Effect Analysis (FMEA), 4th Edition
Edition: 4th
Automotive Industry Action Group / 28-Jun-2008 / 151 pages
BS EN 60812:2006
Analysis techniques for system reliability. Procedure for failure mode and effects analysis (FMEA)
British-Adopted European Standard / 30-Jun-2006 / 50 pages
AIAG APD-36M
Supplier APQP/PPAP/FMEA Forms CD 36+ Users, Version 4 - Site License
Automotive Industry Action Group / 01-Jan-2006
UNE-EN 60812:2008
Analysis techniques for system reliability - Procedure for failure mode and effects analysis (FMEA)
UNE-EN / 30-Dec-2008 / 50 pages
JEDEC JEP131B
Potential Failure Mode and Effects Analysis (FMEA)
JEDEC Solid State Technology Association / 01-Apr-2012 / 26 pages
and more ...
http://www.techstreet.com/
FMEA in Standards
• Automotive, QS 9000 paragraph 4.2, cited in the AIAG APQP Manual
• Process Safety Management Act (PSM), CFR 1910.119, lists the process FMEA as one of about 6 methods to evaluate hazards
Example: ICI Explosives - Hazardous Operability Studies
• FDA - GMPs, one of several methods that should be used to verify a new design (21 CFR Part 820). Inspectors' checklist questions cover use of the Design FMEA.
• ISO 9001/2, requires Preventative Actions. The utilization of FMEAs is one continuous improvement tool which can satisfy the requirement (ISO 9001, Section 4.14)
• ISO 14000, FMEA can be used to evaluate potential hazards and their accompanying risks.
• SAE ARP 4761, FMEA is among the covered methods for safety assessment.
http://elsmar.com/FMEA/
HAzard and OPerability studies (HAZOP)
The Hazard and Operability Study (or HAZOP Study) is a standard hazard analysis technique used in the preliminary safety assessment of new systems or modifications to existing ones.
Steps shown in the flowchart:
‣ Define the system to be analyzed and sub-systems
‣ Identify parameters of the sub-systems
‣ Associate consequences
‣ Apply risk ranking
The effects of such behavior (operation outside the normal design mode) are then assessed and noted down on study forms. The categories of information entered on these forms can vary from industry to industry and from company to company.
http://www.techstreet.com/
HAZOPTOOL (VTT)
PHAWorks (PrimaTech)
Software reliability
[Figure: hardware and software compared side by side.]
There are fundamental differences in the nature of hardware and software faults.
Software reliability methodologies
Dependability
Attributes: Availability, Reliability, Safety, Confidentiality, Integrity, Maintainability
Means: Fault Prevention, Fault Tolerance, Fault Removal, Fault Forecasting
Threats: Faults, Errors, Failures
Fault Tolerance: aimed at giving a controlled response for those uncovered faults. These techniques are used in safety-critical software.
‣ Single version techniques
‣ N-version techniques
‣ Software implemented hardware fault tolerance
Fault Removal: aimed at detecting and fixing faults once the code has been developed. These techniques focus on the product obtained rather than on the process.
Halstead
group of object-oriented metrics
...
General standards:
IEEE 1633-2008
IEEE Recommended Practice on Software Reliability
Edition: 1st
Institute of Electrical and Electronics Engineers / 27-Jun-2008 / 90 pages
SAE JA 1002
Software Reliability Program Standard ( Reaffirmed: May 2012 )
SAE International / 08-May-2012
SAE JA 1003
Software Reliability Program Implementation Guide ( Reaffirmed:
May 2012 )
SAE International / 08-May-2012
http://www.techstreet.com/
http://webstore.iec.ch/preview/info_isoiec15942%7Bed1.0%7Den.pdf
Software reliability standards
ISO 26262-6:2011
Road vehicles - Functional safety - Part 6: Product development at the software level
International Organization for Standardization / 01-Jan-2011
IEC 62304 Ed. 1.0 b:2006
Medical device software - Software life cycle processes
Edition: 1.0
International Electrotechnical Commission / 09-May-2006 / 155 pages
SAE J 2640
General Automotive Embedded Software Design Requirements
SAE International / 13-Oct-2008
ANSI/AAMI/IEC TIR80002-1:2009
Medical device software - Part 1: Guidance on the application of ISO 14971 to medical device software
Association for the Advancement of Medical Instrumentation/International Electrotechnical Commission / 03-Sep-2009 / 80 pages
MISRA C++:2008
Guidelines for the use of the C++ language in critical systems
MISRA / 2008
http://www.techstreet.com/
http://webstore.iec.ch/preview/info_isoiec15942%7Bed1.0%7Den.pdf
Software reliability tools
C++ Test
C++ Test is a commercial tool from Parasoft that scans C or C++ code to detect violations. It is an advanced source code analysis tool that implements over 500 C/C++ coding guidelines to automatically identify dangerous coding constructs that compilers do not detect. The page offers white papers on the product, presentations, demos, and a free, downloadable evaluation version of C++ Test.
http://www.parasoft.com/jsp/products/home.jsp?product=Wizard&itemId=68

Test Coverage Tools
Maintained by Semantic Designs Inc., this page discusses test coverage tools in general, provides pointers to tools for standard languages, and discusses how such tools can be constructed easily for nonstandard languages or environments.
http://www.semdesigns.com/Products/TestCoverage/

COQUALMO
COnstructive QUALity MOdel (COQUALMO), formerly called CODEFMO, is an estimation model that can be used for predicting the number of residual defects/KSLOC (Thousands of Source Lines of Code) or defects/FP (Function Point) in a software product.
http://csse.usc.edu/csse/research/COQUALMO/

Critical Software
Commercial fault injection engine which allows you to inject faults into virtually any of the system or application processes, any of the processors or memory, and at pin level using a combination of plug-ins and templates.
http://www.xception.org/

Cosmic Software MISRA CHECKER
The Cosmic Software MISRA Checker is a standalone software utility that aids in the production of well structured and portable C language code using guidelines prescribed by the Motor Industry Software Reliability Association (MISRA). The Cosmic MISRA Checker is designed to provide comprehensive static MISRA compliance checking that executes fast enough to be used on every compile.
http://www.cosmic-software.com/misra.php

SoftRel - Software Reliability Prediction
SoftRel develops predictive models, namely numerical and classification models, via data mining, knowledge discovery, and knowledge extraction.
http://www.softrel.com/

https://sw.thecsiac.com/databases/url/key/2/2455#.UP_mZRwU66Y
Thank You
These slides contain a survey of the existing reliability methodologies, standards, and tools.
These slides contain links to all sources of used textual and graphical information.
The author has no intention of promoting reliability software developer companies, resellers,
or standards organizations.
All the logos of the software vendors and standards organizations belong to their owners and
will not be used for commercial purposes.
Andrey Morozov