
Estimating Failure Function For Sure

This document discusses estimating failure functions from field data. It begins by describing common failure function distributions like the negative exponential, normal, log-normal, and Weibull distributions. It then discusses how to estimate the parameters of these distributions based on failure data samples, including issues with incomplete or censored samples. The document provides an example of estimating the parameters of a Weibull distribution from field data and discusses best practices for collecting reliable field data.

MODULE 9 – Marine Systems

Performance & Maintenance

DISTANCE LEARNING MATERIAL –


6. ESTIMATING FAILURE FUNCTIONS

Module Leader: Hugo Grimmelius MSc BSc

Delft University of Technology


Marine Technology
Estimating failure functions

Content
1 FAILURE FUNCTION DISTRIBUTIONS
1.1 INTRODUCTION
1.2 NEGATIVE EXPONENTIAL DISTRIBUTION
1.3 NORMAL OR GAUSSIAN DISTRIBUTION
1.4 LOG-NORMAL DISTRIBUTION
1.5 WEIBULL DISTRIBUTION
2 ESTIMATION OF DISTRIBUTION
2.1 INTRODUCTION
2.2 SAMPLES OF FAILURE DATA
2.3 INCOMPLETE OR CENSORED SAMPLES
2.4 SAMPLE WITH ONLY A FEW COMPONENTS
2.5 ESTIMATING A WEIBULL DISTRIBUTION FROM A SAMPLE
2.6 EXAMPLE OF WEIBULL DISTRIBUTION ESTIMATION
2.7 PROBLEMS WITH FIELD DATA COLLECTION
2.8 BEST PRACTICE AND RECOMMENDATION FOR FIELD DATA COLLECTION
3 REFERENCES

APPENDIX I: WEIBULL GRAPH PAPER

Page: 2  H.T. Grimmelius Date: 30-03-2003



1 FAILURE FUNCTION DISTRIBUTIONS

1.1 Introduction

In this chapter some distributions are described, mainly to illustrate their possible
applications. Therefore the theory

1.2 Negative exponential distribution

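If a constant failure rate is assumed, the so-called negative exponential distribution is
found (Figure 1):

$$
\begin{aligned}
R(t) &= e^{-\lambda t} && \left(\text{for } t \ll \tfrac{1}{\lambda}:\ R(t) \approx 1 - \lambda t\right) \\
F(t) &= 1 - e^{-\lambda t} && \left(\text{for } t \ll \tfrac{1}{\lambda}:\ F(t) \approx \lambda t\right) \\
f(t) &= \lambda\, e^{-\lambda t} \\
\lambda(t) &= \lambda
\end{aligned}
\tag*{[1]}
$$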

The failure function, failure probability and failure rate are shown in Figure 1.

This function, though not very well suited to mechanical components, is the one used most
often in reliability and availability studies, because the calculations are relatively simple.
It is, however, often well suited to electronic components.

Figure 1: Failure function F(t), failure probability f(t) and failure rate λ(t) of the negative
exponential distribution
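
As a numerical illustration of equation [1], the following minimal Python sketch evaluates the four functions and compares F(t) with its first-order approximation λt. The failure rate and the time are assumed values chosen only for illustration; they do not come from the text.

```python
import math

def exponential_functions(t, lam):
    """Return R(t), F(t), f(t) and the failure rate for a constant rate lam."""
    R = math.exp(-lam * t)   # reliability (survival probability)
    F = 1.0 - R              # failure function
    f = lam * R              # failure probability density
    return R, F, f, lam

lam = 1.0e-4                 # assumed failure rate, per hour
t = 100.0                    # assumed time in hours, so lam * t = 0.01
R, F, f, rate = exponential_functions(t, lam)
F_linear = lam * t           # first-order approximation F(t) ≈ λt
print(f"F = {F:.6f}, linear approximation = {F_linear:.6f}")
```

With λt = 0.01 the two values differ only in the fifth decimal place, which is why the linear form is convenient for times well below 1/λ.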


1.3 Normal or Gaussian distribution

This very well known distribution is described by:

1 t 1 x - μ 
2

σ 2 π ∫0
F(t) = - 
e  
2 σ  .dx

[2]
1 1 2
f(t) = e 2( )
- t-μ

σ 2π
With: -∞ < t < ∞
σ > 0: standard deviation
μ > 0: the average life expectancy
For -∞ < t < 0 the distribution has no
meaning for reliability or availability (a
negative life expectancy is of course
impossible). For μ > 3σ this negative
part is negligible.

The failure function, failure probability and failure rate are shown in Figure 2, for
μ ≈ 3σ and t > 0.

This function is suitable for wear, corrosion and other age-related failures. Values of the
function can be found in tables; it is not very well suited to analytical operations.

Figure 2: Failure function, failure probability and failure rate λ(t) of the normal distribution
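
Instead of a table lookup, the normal distribution can be evaluated with the error function. The sketch below is only an illustration under assumed values of μ and σ. It uses the standard cumulative normal distribution, which integrates from -∞, so that the probability mass at t < 0 can be shown explicitly for μ = 3σ.

```python
import math

def normal_cdf(t, mu, sigma):
    """Cumulative normal distribution evaluated with the error function."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))

def normal_pdf(t, mu, sigma):
    """Failure probability density f(t) of equation [2]."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_failure_rate(t, mu, sigma):
    """Failure rate f(t) / R(t), with R(t) = 1 - F(t)."""
    return normal_pdf(t, mu, sigma) / (1.0 - normal_cdf(t, mu, sigma))

sigma = 1000.0          # assumed standard deviation, hours
mu = 3.0 * sigma        # assumed mean life, hours
negative_part = normal_cdf(0.0, mu, sigma)   # probability mass at t < 0
print(f"mass below t = 0: {negative_part:.5f}")
print(normal_failure_rate(2000.0, mu, sigma), normal_failure_rate(4000.0, mu, sigma))
```

For μ = 3σ the mass below zero is about 0.1 %, which supports treating the negative part as negligible. The second line of output shows the failure rate rising with age, as expected for wear-related failures.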


1.4 Log-normal distribution

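The log-normal distribution is given by:

$$
\begin{aligned}
F(t) &= \frac{1}{\sigma\sqrt{2\pi}} \int_{0}^{t} \frac{1}{x}\, e^{-\frac{1}{2}\left(\frac{\ln x-\mu}{\sigma}\right)^{2}}\, dx \\
f(t) &= \frac{1}{\sigma\, t\, \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln t-\mu}{\sigma}\right)^{2}}
\end{aligned}
\tag*{[3]}
$$

With t, μ and σ > 0.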

This function results in a normal distribution of ln(t). The function (Figure 3) is not very
well suited to general failure functions, but is often applied to repair times and fatigue
failures.

With repair times there is an increasing probability of short to average repair times, but
there is always the possibility of very long repair times.

For fatigue failures its use is based partly on physics and partly on mathematical
convenience. For a short description see [Høland, 1994], section 2.11, example 2.6.

Figure 3: Failure function, failure probability and failure rate of the lognormal distribution
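
The long tail of repair times can be made concrete with a small sketch. Because ln(t) is normally distributed, F(t) follows from the error function. The median repair time of 4 hours and σ = 1 are assumed values, chosen only to illustrate the shape.

```python
import math

def lognormal_cdf(t, mu, sigma):
    """F(t) for a log-normal distribution: ln(t) is normal with mean mu, std sigma."""
    if t <= 0.0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(t) - mu) / (sigma * math.sqrt(2.0))))

mu = math.log(4.0)     # assumed: median repair time of 4 hours
sigma = 1.0            # assumed spread of ln(t)

median = math.exp(mu)
p_long = 1.0 - lognormal_cdf(24.0, mu, sigma)   # repairs longer than 24 hours
print(f"F(median) = {lognormal_cdf(median, mu, sigma):.2f}")
print(f"P(repair > 24 h) = {p_long:.3f}")
```

Half of the repairs are finished within the median time, yet a few percent still take more than six times as long under these assumptions.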


1.5 Weibull distribution

Second to the negative exponential distribution the Weibull distribution is probably the most
used distribution in reliability and availability research. The distribution is versatile and can be
used to describe various failure functions.

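The distribution is given by:

$$
\begin{aligned}
F(t) &= 1 - e^{-\left(\frac{t-\gamma}{\eta}\right)^{\beta}} \\
f(t) &= \beta\, \frac{(t-\gamma)^{\beta-1}}{\eta^{\beta}}\; e^{-\left(\frac{t-\gamma}{\eta}\right)^{\beta}} \\
\lambda(t) &= \beta\, \frac{(t-\gamma)^{\beta-1}}{\eta^{\beta}}
\end{aligned}
\tag*{[4]}
$$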

The parameters used are:

γ : Threshold parameter or minimum life. For t < γ no failures occur.
η : Scale parameter or characteristic life. For t = γ + η (t = η when γ = 0):
    F(t) = 0.632 and R(t) = 0.368.
β : Shape parameter. For β = 1 and γ = 0 the Weibull distribution is equal to the negative
    exponential distribution, with λ = 1/η.
    If 0 < β < 1 the failure rate is decreasing.
    If β > 1 the failure rate is increasing.

All three phases of the bath-tub curve can be modelled with the Weibull distribution.
Figure 4 shows the distribution for several values of the shape parameter.

Figure 4: Failure function, failure probability and failure rate of the Weibull distribution
(for β = 0.5, 1, 2, and 4)
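
The role of the shape parameter can be checked numerically. The sketch below implements F(t) and λ(t) from equation [4]. The characteristic life of 1000 hours is an assumed value, and the function names are chosen here for illustration only.

```python
import math

def weibull_F(t, beta, eta, gamma=0.0):
    """Failure function F(t); zero up to the minimum life gamma."""
    if t <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((t - gamma) / eta) ** beta))

def weibull_rate(t, beta, eta, gamma=0.0):
    """Failure rate beta * (t - gamma)^(beta - 1) / eta^beta.

    Returned as zero for t <= gamma (for beta < 1 the rate actually
    diverges as t approaches gamma from above).
    """
    if t <= gamma:
        return 0.0
    return beta * (t - gamma) ** (beta - 1.0) / eta ** beta

eta = 1000.0   # assumed characteristic life, hours
for beta in (0.5, 1.0, 2.0, 4.0):
    rates = [weibull_rate(t, beta, eta) for t in (500.0, 1000.0, 1500.0)]
    print(beta, round(weibull_F(eta, beta, eta), 3), rates)
```

Every row shows F(η) = 0.632 regardless of β, while the three failure rates fall for β = 0.5, stay constant at 1/η for β = 1 and rise for β = 2 and β = 4.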



2 ESTIMATION OF DISTRIBUTION

2.1 Introduction

The failure function of a component depends on the load and on the capability of the
component to cope with that load. The load can be mechanical, thermal, chemical and/or
corrosive. The component's capability to cope with the loads depends on the design, the
production quality and the maintenance.

Most knowledge of failure functions is the result of field data collection. There are,
however, many problems in collecting failure data. A very strict definition of failure is
needed.

There are many sources of failure function data: literature, commercial databases and field
data, either self-obtained or from a manufacturer. Data in the literature are often not very
well specified (conditions of use, exact failure mode) and often provide only a failure rate.
Commercial databases are much more detailed, but a translation often still has to be made
between the conditions and failure modes described for the data found and the actual
conditions and failure modes. In general the most reliable data are those acquired either
through own experience or directly from the manufacturer.

2.2 Samples of failure data

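Assume a sample of N_0 identical components. At time t, N_F(t) components have failed and
N_NF(t) components have not yet failed. Of course: N_0 = N_F + N_NF. The failure function
can be estimated through:

$$
\begin{aligned}
\hat{F}(t) &= \frac{N_F(t)}{N_0} \\
\hat{R}(t) &= \frac{N_{NF}(t)}{N_0} = 1 - \frac{N_F(t)}{N_0} \\
\hat{f}(t) &= \frac{N_F(t+\Delta t) - N_F(t)}{N_0 \cdot \Delta t} \\
\hat{\lambda}(t) &= \frac{N_F(t+\Delta t) - N_F(t)}{N_{NF}(t) \cdot \Delta t}
\end{aligned}
\tag*{[5]}
$$

(^ = estimate)

In the estimate of λ(t) the number of failures in the interval is divided by the number of
components still working at time t; the values in Table I are calculated this way.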

If a large sample is available it is practical to create classes. For a first estimate about
10 classes often suffice, Table I gives an example. Here N0 = 100, and 11 classes have
been defined. With this approach a small underestimate of the failure probability is
found for the lowest class (where no failure has occurred yet in the sample) and
simultaneously an overestimation of the failure probability is found for the highest
class (where no samples have been found).


Table I: Example of failure sample data, grey columns are actual acquired data.

class  tf [hour]     NF in   failures  not failed  F̂(t)   R̂(t)   f̂(t)        λ̂(t)
nr.                  class   NF        NNF                         [10^-3/h]   [10^-3/h]
1      0-2000        0       0         100         0      1.00   0           0
2      2000-3000     2       2         98          0.02   0.98   0.02        0.02
3      3000-4000     13      15        85          0.15   0.85   0.13        0.13
4      4000-5000     30      45        55          0.45   0.55   0.30        0.35
5      5000-6000     29      74        26          0.74   0.26   0.29        0.53
6      6000-7000     14      88        12          0.88   0.12   0.14        0.54
7      7000-8000     8       96        4           0.96   0.04   0.08        0.67
8      8000-9000     3       99        1           0.99   0.01   0.03        0.75
9      9000-10000    0       99        1           0.99   0.01   0           0
10     10000-11000   1       100       0           1.00   0      0.01        1.00
11     > 11000       -       100       0           1.00   0      0           0
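
The columns of Table I can be recomputed from the class boundaries and the number of failures per class using equation [5]. The sketch below is one possible way to do this. It assumes that the failure rate estimate divides by the number of components still working at the start of each class, which is the reading that reproduces the tabulated values.

```python
# Class boundaries [hours] and failures per class, taken from Table I.
edges = [0, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 11000]
failures_in_class = [0, 2, 13, 30, 29, 14, 8, 3, 0, 1]

N0 = 100                      # sample size
rows = []
failed = 0                    # cumulative failures N_F at the start of the class
for k, n in enumerate(failures_in_class):
    dt = edges[k + 1] - edges[k]
    surviving = N0 - failed   # N_NF at the start of the class
    f_hat = n / (N0 * dt)
    rate_hat = n / (surviving * dt) if surviving > 0 else 0.0
    failed += n
    F_hat = failed / N0       # cumulative estimate at the end of the class
    rows.append((edges[k], edges[k + 1], failed, N0 - failed,
                 F_hat, 1.0 - F_hat, f_hat, rate_hat))

for lo, hi, nf, nnf, F, R, f, lam in rows:
    print(f"{lo:>5}-{hi:<5} {nf:>3} {nnf:>3} {F:.2f} {R:.2f} "
          f"{f * 1e3:.2f}e-3 {lam * 1e3:.2f}e-3")
```

For class 4, for example, 30 failures among the 85 components still working over 1000 hours give 30 / (85 · 1000) ≈ 0.35·10^-3 per hour.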

2.3 Incomplete or censored samples

Often the available data is truncated, or censored. Four types of censoring are commonly
encountered:
Type I: The time the components are monitored is limited; not all components will
have failed at the end of the monitored period.
Type II: The number of components that are allowed to fail is limited, also to limit
the duration of the monitoring period.
Type III: Combination of I and II: monitoring is stopped either after a certain time or
after a certain number of components has failed, whichever comes first.
Type IV: The start-up moment of the components is not the same, but a random
variable.
There are various ways to compensate for censored data sets. See for instance Section
9.3 in [Høland, 1994].


2.4 Sample with only a few components

A different approach is also needed for small samples. Consider a sample of three times to
failure:
#  tf
1  36
2  64
3  124
Using the method described in section 2.2 this gives:
#  tf   F̂(t)
1  36   0.33
2  64   0.67
3  124  1
This implies that for all components failure within 124 running hours is a certainty! This
is not very plausible, so let us assume four classes:
I: 0 - 36
II: 36 - 64
III: 64 - 124
IV: ≥ 124
If we now assume that new failures have an equal chance of occurring in any of these
classes, the following failure function is found:
#  tf   F̂(t)
1  36   0.25
2  64   0.50
3  124  0.75
Generalized, this estimate results in:

$$ \hat{F}(t_i) = \frac{i}{N_0 + 1} \tag*{[6]} $$

An assumption underlying this approximation is that the distribution function is symmetrical.
Often, however, a skewed distribution is found (e.g. the log-normal distribution). An often
used approximation, instead of equation [6], is:

$$ \hat{F}(t_i) = \frac{i - 0.3}{N_0 + 0.4} \tag*{[7]} $$

(also known as Benard's approximation)

This results in:

#  tf   F̂(t)
1  36   0.21
2  64   0.50
3  124  0.79
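
The three estimates for the small sample can be placed side by side with a few lines of Python. The sketch below evaluates the simple ratio i/N_0 from section 2.2, equation [6] and equation [7] for the three times to failure. The function names are chosen here for illustration only.

```python
def mean_rank(i, n):
    """Equation [6]: i / (n + 1)."""
    return i / (n + 1)

def median_rank(i, n):
    """Equation [7]: (i - 0.3) / (n + 0.4)."""
    return (i - 0.3) / (n + 0.4)

times_to_failure = sorted([36, 64, 124])   # the three-component sample
n = len(times_to_failure)
for i, t in enumerate(times_to_failure, start=1):
    print(t, round(i / n, 2), round(mean_rank(i, n), 2),
          round(median_rank(i, n), 2))
```

Both rank formulas keep the estimate for the last failure below 1, so the failure probability at the largest observed time to failure is no longer a certainty.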


2.5 Estimating a Weibull distribution from a sample

Being able to calculate the failure probability and other quantities at any given time is one
of the main advantages of estimating a distribution function rather than using
tabulated sample results.

The approach in general is:

• Assume a distribution function, based on experience and/or literature
  concerning similar components and failure modes.
• Estimate the parameters for this function.
• Test the validity of the resulting function.
• If no proper fit between function and data can be established, assume a
  different distribution function and start over again.

Because of its flexibility an estimate based on the Weibull distribution is often used.
Here a graphical method will be described, using the Weibull graph paper shown in
Figure 5 (blank version available in Appendix I).

In section 1.5 the Weibull distribution was introduced:

$$ F(t) = 1 - e^{-\left(\frac{t-\gamma}{\eta}\right)^{\beta}} \tag*{ref [4]} $$

If a minimum life of zero is assumed (γ = 0) this becomes:

$$ F(t) = 1 - e^{-\left(\frac{t}{\eta}\right)^{\beta}} \tag*{[8]} $$

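Equation [8] is linearised by applying the natural logarithm (ln) twice:

$$
\begin{aligned}
1 - F(t) &= e^{-\left(\frac{t}{\eta}\right)^{\beta}} \;\Leftrightarrow \\
\ln\bigl(1 - F(t)\bigr) &= \ln e^{-\left(\frac{t}{\eta}\right)^{\beta}} = -\left(\frac{t}{\eta}\right)^{\beta} \;\Leftrightarrow \\
\ln\left(\frac{1}{1 - F(t)}\right) &= \left(\frac{t}{\eta}\right)^{\beta} \;\Leftrightarrow \\
\ln \ln\left(\frac{1}{1 - F(t)}\right) &= \beta \ln\left(\frac{t}{\eta}\right) = \beta \ln t - \beta \ln \eta
\end{aligned}
\tag*{[9]}
$$

With y = ln ln(1 / (1 - F(t))), x = ln t, and c = β ln η, this gives:

$$ y = \beta \cdot x - c \tag*{[10]} $$

This is the equation of a straight line! The axes of the Weibull paper are transformed to
represent this, Figure 5.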
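
Because equation [10] is a straight line, the graphical fit can also be imitated numerically. The sketch below is not the graph-paper method of this text: it swaps in an ordinary least-squares line fit on the same transformed coordinates, under the assumption γ = 0. The data points are synthetic, taken exactly from a Weibull distribution with assumed parameters, so the fit should return those parameters.

```python
import math

def weibull_line_fit(times, F_values):
    """Least-squares line y = beta * x - c on Weibull-paper coordinates.

    x = ln(t), y = ln ln(1 / (1 - F)); returns (beta, eta).
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(math.log(1.0 / (1.0 - F))) for F in F_values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx                    # slope of the line
    c = beta * x_mean - y_mean          # from y = beta * x - c
    eta = math.exp(c / beta)            # because c = beta * ln(eta)
    return beta, eta

# Synthetic check: points taken exactly from a Weibull with beta = 1.5, eta = 800.
true_beta, true_eta = 1.5, 800.0
times = [200.0, 400.0, 800.0, 1200.0, 1600.0]
F_values = [1.0 - math.exp(-((t / true_eta) ** true_beta)) for t in times]
beta, eta = weibull_line_fit(times, F_values)
print(round(beta, 3), round(eta, 1))
```

With real field data the points scatter around the line, and a least-squares fit weights them differently from a line drawn by eye, so the two approaches need not give identical parameters.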


Figure 5: Weibull graph paper with example data

The sample data is now plotted with the time to failure on the horizontal axis and the
approximated failure function on the vertical axis. A proper fit is found if it is possible
to draw a straight line through the plotted points.

The characteristic life η is found at F̂(t) = 0.632:

$$ F(\eta) = 1 - e^{-\left(\frac{\eta}{\eta}\right)^{\beta}} = 1 - e^{-1} = 0.632 $$

The shape parameter β is the slope of the fitted line. For convenience, a graphic
aid is included on the Weibull graph: in the upper left corner a scale for β is given. By
moving the fitted line in parallel until it crosses the open dot near F̂(t) = 0.632, β is
found on the scale at the top.

If the result is not a straight line, it is possible that the minimum life γ > 0. Subtracting
the minimum life from the times to failure may then yield a straight line again. The
appropriate minimum life can only be determined iteratively.

For a reasonably accurate estimate a minimum of about 15 data points is required.
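
The iterative search for the minimum life can be sketched as a simple scan over candidate values of γ. The criterion used below, the correlation coefficient of the transformed points, is an assumption introduced here as a numerical stand-in for judging straightness by eye; it is not part of the graphical method. The data are synthetic, generated from assumed parameters.

```python
import math

def paper_coordinates(times, F_values, gamma):
    """Transform (t, F) to Weibull-paper coordinates after subtracting gamma."""
    xs = [math.log(t - gamma) for t in times]
    ys = [math.log(math.log(1.0 / (1.0 - F))) for F in F_values]
    return xs, ys

def correlation(xs, ys):
    """Pearson correlation coefficient; 1.0 means the points lie on a rising line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def best_minimum_life(times, F_values, candidates):
    """Return the candidate gamma that gives the straightest line."""
    return max(candidates,
               key=lambda g: correlation(*paper_coordinates(times, F_values, g)))

# Synthetic data: Weibull with gamma = 500, eta = 1000, beta = 2 (all assumed).
times = [700.0, 900.0, 1200.0, 1500.0, 2000.0, 2500.0]
F_values = [1.0 - math.exp(-(((t - 500.0) / 1000.0) ** 2.0)) for t in times]
candidates = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
gamma = best_minimum_life(times, F_values, candidates)
print(gamma)
```

Every candidate must be smaller than the shortest observed time to failure, otherwise ln(t - γ) is undefined.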

2.6 Example of Weibull distribution estimation

Using the data already presented in Table I, a Weibull distribution estimate is
calculated. First, the times to failure and the failure function estimates presented in
Table I are plotted on the Weibull graph. The resulting points can hardly be approximated
by a straight line. Next, a minimum life of 2000 hours is assumed: the estimated failure
function is plotted against the time to failure minus 2000 hours. This result is also shown
in Figure 5. The resulting points are very well approximated by a straight line.

So, the parameters found for the Weibull function are:

γ = 2000 hours
η = 3200 hours
β = 2.2 (increasing failure rate)

As a verification, Table II gives the failure function and failure rate for both the first
approximation (Table I) and the Weibull distribution found here. The Weibull distribution
found is:

$$
\begin{aligned}
F(t) &= 1 - e^{-\left(\frac{t-2000}{3200}\right)^{2.2}} \\
f(t) &= \frac{2.2\,(t-2000)^{1.2}}{3200^{2.2}}\; e^{-\left(\frac{t-2000}{3200}\right)^{2.2}} \\
\lambda(t) &= \frac{2.2\,(t-2000)^{1.2}}{3200^{2.2}}
\end{aligned}
\tag*{[11]}
$$

Table II: Comparing estimate from Table I with Weibull distribution

                   Estimate Table I      Weibull function
Class  tf [hour]   F̂(t)    λ̂(t)          F̂(t)     λ̂(t)
nr.                        [10^-3/h]              [10^-3/h]
1      <2000       0       0             0        0
2      2500        0.02    0.02          0.016    0.07
3      3500        0.15    0.13          0.17     0.28
4      4500        0.45    0.35          0.44     0.50
5      5500        0.74    0.53          0.70     0.77
6      6500        0.88    0.54          0.88     1.03
7      7500        0.96    0.67          0.96     1.32
8      8500        0.99    0.75          0.99     1.60
9      9500        0.99    -             0.998    1.91
10     10500       1.00    1.00          0.9998   2.22
11     > 11000     1.00                  0.9999   2.38
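
The Weibull columns of Table II follow directly from equation [11]. The sketch below evaluates F(t) and λ(t) at the tabulated times. It assumes that the last row (> 11000) is evaluated at t = 11000 hours, which is the value consistent with the tabulated failure rate.

```python
import math

GAMMA, ETA, BETA = 2000.0, 3200.0, 2.2   # parameters found in section 2.6

def F(t):
    """Failure function of equation [11]."""
    if t <= GAMMA:
        return 0.0
    return 1.0 - math.exp(-(((t - GAMMA) / ETA) ** BETA))

def rate(t):
    """Failure rate of equation [11], per hour."""
    if t <= GAMMA:
        return 0.0
    return BETA * (t - GAMMA) ** (BETA - 1.0) / ETA ** BETA

for t in (2500, 3500, 4500, 5500, 6500, 7500, 8500, 9500, 10500, 11000):
    print(f"{t:>6} {F(t):.4f} {rate(t) * 1e3:.2f}e-3")
```

Small differences in the last digit with respect to Table II can occur because of rounding.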


2.7 Problems with field data collection ¹

The main problems associated with failure recording are:

Inventories: Whilst failure reports identify the numbers and types of failure they rarely
provide a source of information as to the total numbers of the item in question and
their installation dates and running times.

Motivation: If the field service engineer can see no purpose in recording information it
is likely that items will be either omitted or incorrectly recorded. The purpose of fault
reporting and the ways in which it can be used to simplify the task need to be
explained. If the engineer is frustrated by unrealistic time standards, poor working
conditions and inadequate instructions, then the failure report is the first task which
will be skimped or omitted. A regular circulation of field data summaries to the field
engineer is the best (possibly the only) way of encouraging feedback. It will help him
to see the overall field picture and advice on diagnosing the more awkward faults will
be appreciated.

Verification: Once the failure report has left the person who completes it the possibility
of subsequent checking is remote. If repair times or diagnoses are suspect then it is
likely that they will go undetected or be unverified. Where failure data are obtained
from customers' staff, the possibility of challenging information becomes even more
remote.

Cost: Failure reporting is costly in terms of both the time to complete failure-report
forms and the hours of interpretation of the information. For this reason, both supplier
and customer are often reluctant to agree to a comprehensive reporting system. If the
information is correctly interpreted and design or manufacturing action taken to
remove failure sources, then the cost of the activity is likely to be offset by the savings
and the idea must be ’sold’ on this basis.

Recording non-failures: The situation arises where a failure is recorded although none
exists. This can occur in two ways. First, there is the habit of locating faults by
replacing suspect but not necessarily failed components. When the fault disappears the
first (wrongly removed) component is not replaced and is hence recorded as a failure.
Failure rate data are therefore artificially inflated and spares depleted. Second, there is
the interpretation of secondary failures as primary failures. A failed component may
cause stress conditions upon another which may, as a result, fail. Diagnosis may reveal
both failures but not always which one occurred first. Again, failure rates become
wrongly inflated. More complex maintenance instructions and the use of higher-grade
personnel will help reduce these problems at a cost.

Times to failure: These are necessary in order to establish wear-out. See the next section.

¹ This section is based on section 13.2 of “Reliability, Maintainability and Risk” by D.J. Smith,
Butterworth-Heinemann, ISBN 0-7506-5168-7, 2001.


2.8 Best practice and recommendation for field data collection ²

The following list summarizes the best practice together with recommended
enhancements for both manual and computer based field failure recording. Recorded
field information is frequently inadequate and it is necessary to emphasize that failure
data must contain sufficient information to enable precise failures to be identified and
failure distributions to be identified. They must, therefore, include:

Adequate information about the symptoms and causes of failure. This is important
because predictions are only meaningful when a system level failure is precisely
defined. Thus component failures which contribute to a defined system failure can only
be identified if the failure modes are accurately recorded. There needs to be a
distinction between failures (which cause loss of system function) and defects (which
may only cause degradation of function).

Detailed and accurate equipment inventories enabling each component item to be
separately identified. This is essential in providing cumulative operating times for the
calculation of assumed constant failure rates and also for obtaining individual calendar
times (or operating times or cycles) to each mode of failure and for each component
item. These individual times to failure are necessary if failure distributions are to be
analysed by the Weibull method.

Identification of common cause failures by requiring the inspection of redundant units
to ascertain if failures have occurred in both (or all) units. In order to achieve this it is
necessary to be able to identify that two or more failures are related to specific field
items in a redundant configuration. It is therefore important that each recorded failure
also identifies which specific item (i.e. tag number) it refers to.

Intervals between common cause failures. Because common cause failures do not
necessarily occur at precisely the same instant it is desirable to be able to identify the
time elapsed between them.

The effect that a ’component part’ level failure has on failure at the system level. This
will vary according to the type of system, the level of redundancy (which may
postpone system level failure) etc.

Costs of failure such as the penalty cost of system outage (e.g. loss of production) and
the cost of corrective repair effort and associated spares and other maintenance costs.

The consequences in the case of safety-related failures (e.g. death, injury,
environmental damage), which are not so easily quantified.

Consideration of whether a failure is intrinsic to the item in question or was caused by
an external factor. External factors might include:
• process operator error induced failure
• maintenance error induced failure
• failure caused by a diagnostic replacement attempt
• modification induced failure

² This section is based on section 13.5 of “Reliability, Maintainability and Risk” by D.J. Smith,
Butterworth-Heinemann, ISBN 0-7506-5168-7, 2001.


Effective data screening to identify and correct errors and to ensure consistency. There
is a cost issue here in that effective data screening requires significant man-hours to
study the field failure returns. In the author’s experience an average of as much as
one hour per field return can be needed to enquire into the nature of a given failure
and to discuss and establish the underlying cause. Both codification and narrative are
helpful to the analyst and, whilst each has its own merits, a combination is required in
practice. Modern computerized maintenance management systems offer possibilities
for classification and codification of failure modes and causes. However, this relies on
motivated and trained field technicians to input accurate and complete data. The
option to add narrative should always be available.

Adequate information about the environment (e.g. weather in the case of unprotected
equipment) and operating conditions (e.g. unusual production throughput loadings).


3 REFERENCES
[Høland, 1994]
A. Høyland and M. Rausand: “System Reliability Theory, Models and Statistical
Methods”, ISBN 0-471-59397-4, John Wiley & Sons Inc., 1994.

[Smith, 2001]
D.J. Smith: “Reliability, Maintainability and Risk”, ISBN 0-7506-5168-7, Butterworth-
Heinemann, 2001.


APPENDIX I: WEIBULL GRAPH PAPER
