M Rel PDF
Methods in Reliability
Larry Leemis
Department of Mathematics
College of William and Mary
Williamsburg, VA 23187-8795
[email protected] 757-221-2034
Outline
1. Introduction
2. Coherent Systems Analysis
3. Lifetime Distributions
4. Parametric Lifetime Models
5. Specialized Models
6. Repairable Systems
7. Lifetime Data Analysis
8. Fitting Parametric Models to Data
9. Parametric Estimation for Models with Covariates
10. Nonparametric Methods
11. Assessing Model Adequacy
1. Introduction
Motivation
• Space Shuttle Challenger accident
• Chernobyl and Three Mile Island accidents
• product liability
• customer goodwill
• corporate reputation
Terminology
The event at the end of a lifetime is called
• a failure by reliability engineers
• death by actuaries and biostatisticians
• an epoch by point process researchers
The object of a study is called
• a system, component or item by reliability
engineers
• an individual by actuaries
• an organism by biostatisticians
To avoid switching terms, failure of an item will be
used here.
1.1 A definition of reliability
Probability
• Range
all reliabilities must be between 0 and 1
inclusive
• Spinoffs from the probability axioms
statistical independence
Adequate performance
• Must be stated unambiguously
• Standards
Example: a ball bearing has failed when its
diameter falls outside of 3 ± 0.05 mm
• Binary models
the item is in either the functioning or failed
state (e.g., a fuse)
Purpose
• Intended use
Example: a drill may have one grade for a
handyman and another for a contractor
Time
• Units
must be specified (e.g., hours, years)
• Notation
many lifetime models use the random
variable T
• Time need not be taken literally
consider an automobile tire, light switch
• Time duration must be specified
Example: 1000 hour reliability is 0.8
• Continuous operation vs. on/off cycling
time alone may not be the only consideration
(e.g., motors, computers)
Environmental conditions
• Factors
temperature, humidity, and turning speed all
affect the lifetime of a machine tool
• Preventive maintenance
usually effective in prolonging the lifetime of
the item and hence increasing the reliability
Subsystems
• orbiter
• external liquid-fuel tank
• two solid rocket motors
O-rings
• 37.5 feet in diameter
• 0.28 inches thick
• all six O-rings must operate to avoid having the
propellant escape causing potential failure, so
the O-rings form a six-component series system
Figure 1.3 Launch temperature versus
number of field joint failures (horizontal axis: temperature;
vertical axis: number of failures).
Figure 2.1 A series system block diagram (components 1, 2, 3, ..., n in series).
Figure 2.2 A parallel system block diagram (components 1, 2, 3, ..., n in parallel).
Applications
• kidneys
• brake system on an automobile with two brake
fluid reservoirs
Series and parallel systems are special cases of k-out-
of-n systems, where the system functions if k or more
of the n components function.
Applications
• suspension bridge (components: cables)
• an automobile engine (components: cylinders)
• a bicycle wheel (components: spokes)
Example 2.3 The structure function for a k-out-of-n system is
\[
\phi(\mathbf{x}) =
\begin{cases}
0 & \text{if } \sum_{i=1}^{n} x_i < k \\
1 & \text{if } \sum_{i=1}^{n} x_i \ge k.
\end{cases}
\]
The block diagram for a k-out-of-n system is
difficult to draw in general.
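Figure 2.3 A 2-out-of-3 system
block diagram (three parallel paths, each a series pair:
components 1 and 2, components 1 and 3, components 2 and 3).
\[
\phi(\mathbf{x}) = 1 - (1 - x_1 x_2)(1 - x_1 x_3)(1 - x_2 x_3)
\]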
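The two forms of the structure function can be compared by enumeration. The following is a minimal sketch; the function names `phi_k_out_of_n` and `phi_2_out_of_3` are illustrative choices, not notation from the slides.

```python
from itertools import product


def phi_k_out_of_n(x, k):
    """Structure function of a k-out-of-n system: 1 if at least k components function."""
    return 1 if sum(x) >= k else 0


def phi_2_out_of_3(x):
    """2-out-of-3 structure function written in the product form shown above."""
    x1, x2, x3 = x
    return 1 - (1 - x1 * x2) * (1 - x1 * x3) * (1 - x2 * x3)


# Compare the two forms on all 2^3 binary component state vectors.
agreement = all(
    phi_k_out_of_n(x, 2) == phi_2_out_of_3(x)
    for x in product((0, 1), repeat=3)
)
print(agreement)
```

Setting k = n gives the series system and k = 1 gives the parallel system, matching the special cases noted earlier.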
2.3 Reliability functions
Assumptions
• the binary model applies to components and
systems
• the n components must be nonrepairable
• the components are independent
Definition 2.9 The random variable denoting
the state of component i, X_i, is
\[
X_i =
\begin{cases}
0 & \text{if component } i \text{ has failed} \\
1 & \text{if component } i \text{ is functioning}
\end{cases}
\]
for i = 1, 2, . . . , n.
Random component states
• these n values can be written as a random system
state vector X
• p_i = P[X_i = 1] is the reliability of the ith
component, i = 1, 2, . . . , n
• reliability vector p = (p_1, p_2, . . . , p_n)
• must specify the time to which the reliability
applies (e.g., 5000-hour reliability is 0.83)
• the system reliability, r, is defined by
r(p) = P[φ (X) = 1]
• r( p) used when all component reliabilities are
equal
Technique 1: Definition of r(p)
Example 2.12 Series system of n independent
components
\[
r(\mathbf{p}) = P[\phi(\mathbf{X}) = 1]
= P\left[ \prod_{i=1}^{n} X_i = 1 \right]
= \prod_{i=1}^{n} P[X_i = 1]
= \prod_{i=1}^{n} p_i
\]
"Weakest link" for series systems
• system reliability is no greater than the smallest
component reliability
• improvement of weakest component most
effective
Special case: identical components r(p) = p^n
Figure 2.12 Reliability of n-component series systems
(r(p) versus p for n = 1, 2, 5, 10).
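The product formula and the weakest-link bound can be checked numerically. This is only a sketch: the function name `series_reliability` and the component reliabilities are assumptions chosen for illustration, not values from the slides.

```python
from math import prod


def series_reliability(p):
    """Reliability of a series system of independent components: product of the p_i."""
    return prod(p)


# Hypothetical component reliabilities at some fixed time.
p = [0.95, 0.99, 0.90]
r = series_reliability(p)
print(r, min(p))  # the system reliability does not exceed the weakest component

# Identical components: r(p) = p^n falls quickly as n grows.
for n in (1, 2, 5, 10):
    print(n, series_reliability([0.9] * n))
```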
Technique 2: Expected value of φ (X)
P[φ (X) = 1] = E[φ (X)]
since φ (X) is a Bernoulli random variable.
Example 2.13 Parallel system of n independent
components.
\[
r(\mathbf{p}) = E[\phi(\mathbf{X})]
= E\left[ 1 - \prod_{i=1}^{n} (1 - X_i) \right]
= 1 - \prod_{i=1}^{n} E[1 - X_i]
= 1 - \prod_{i=1}^{n} (1 - p_i)
\]
Figure 2.13 Reliability of n-component parallel systems
(r(p) versus p for n = 1, 2, 5, 10).
"Law of diminishing returns" for parallel systems
• marginal gain in reliability decreases
dramatically as more components are added
• improvement of the strongest component is the
most effective
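The diminishing marginal gain can be seen by adding identical components one at a time. A minimal sketch, assuming a hypothetical common component reliability of 0.6; the function name `parallel_reliability` is illustrative.

```python
from math import prod


def parallel_reliability(p):
    """Reliability of a parallel system of independent components."""
    return 1 - prod(1 - pi for pi in p)


# Hypothetical common component reliability.
p = 0.6
previous = 0.0
gains = []
for n in range(1, 6):
    r = parallel_reliability([p] * n)
    gains.append(r - previous)  # marginal gain from the nth component
    previous = r
    print(n, r, gains[-1])
```

Each added component contributes less than the one before it, which is the pattern the bullet describes.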
Notes on parallel systems
• standby system
• shared-parallel system
3. Lifetime Distributions
Motivation
Up to this point, reliability has only been considered
at one particular instant of time.
Outline
• lifetime distribution representations
• discrete distributions
• moments and fractiles
• system lifetime distributions
• distribution classes
Figure 3.1 Two survivor functions, S1(t) and S2(t).
Figure 3.3 The relationship between survivor and
cumulative distribution functions (the areas under f(t) to the
left and right of t0 are F(t0) and S(t0)).
Hazard function (failure rate, force of mortality)
\[
h(t) = \frac{f(t)}{S(t)} \qquad t \ge 0
\]
\[
h(t)\,\Delta t \approx P[t \le T \le t + \Delta t \mid T \ge t]
\]
for small ∆t values. Units: failures per unit time.
Interpretations
• h(t) is the amount of risk an item is under at t
• h(t) is a special case of the intensity function for
a nonhomogeneous Poisson process
All hazard functions must satisfy
\[
\int_0^{\infty} h(t)\,dt = \infty \qquad h(t) \ge 0 \text{ for all } t \ge 0
\]
Figure 3.5 Common hazard function shapes (IFR: increasing
failure rate; DFR: decreasing failure rate; BT: bathtub-shaped).
Cumulative hazard function (integrated
hazard function and the renewal function)
\[
H(t) = \int_0^{t} h(\tau)\,d\tau \qquad t \ge 0
\]
All cumulative hazard functions satisfy
\[
H(0) = 0 \qquad \lim_{t \to \infty} H(t) = \infty \qquad H(t) \text{ is nondecreasing}
\]
Applications
• variate generation in Monte Carlo simulation
• implementing certain procedures in statistical
inference
• defining certain distribution classes
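The variate generation application rests on the fact that H(T) has a unit exponential distribution, so a lifetime can be generated as T = H^{-1}(-log(1 - U)) for U uniform on [0, 1). The sketch below assumes a Weibull cumulative hazard H(t) = (λt)^κ as the example; the function names and parameter values are illustrative.

```python
import math
import random


def cumulative_hazard(t, lam, kappa):
    """Assumed Weibull cumulative hazard H(t) = (lam * t) ** kappa."""
    return (lam * t) ** kappa


def inverse_cumulative_hazard(y, lam, kappa):
    """Inverse of the assumed cumulative hazard: the t with H(t) = y."""
    return y ** (1.0 / kappa) / lam


def generate_lifetime(rng, lam, kappa):
    """Generate one lifetime by inverting the cumulative hazard."""
    u = rng.random()  # uniform on [0, 1), so 1 - u lies in (0, 1]
    return inverse_cumulative_hazard(-math.log(1.0 - u), lam, kappa)


rng = random.Random(12345)  # fixed seed so the sketch is repeatable
lam, kappa = 0.01, 2.0      # hypothetical parameter values
sample = [generate_lifetime(rng, lam, kappa) for _ in range(5)]
print(sample)
```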
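\[
f(t) = \lambda e^{-\lambda t} \qquad t \ge 0
\]
The mean residual life function is
\[
L(t) = e^{\lambda t} \int_t^{\infty} \tau \lambda e^{-\lambda \tau}\,d\tau - t = \frac{1}{\lambda} \qquad t \ge 0
\]
by using integration by parts.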
4.1 Parameters
Three types of parameters:
• location
• scale
• shape
Figure 4.2 A mixed discrete-continuous
survivor function.
4.2 The exponential distribution
Motivation
The exponential distribution plays a central role in
reliability modeling since it is the only continuous
distribution with a constant hazard function.
[Figure: f(t), S(t), h(t), H(t), and L(t) for the exponential
distribution with λ = 1 and λ = 2.]
Figure 4.4 The memoryless property
of the exponential distribution.
Property 4.2 The exponential distribution is
the only continuous distribution with the
memoryless property.
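The memoryless property states that P[T ≥ t + s | T ≥ s] = P[T ≥ t]. A minimal numerical check for the exponential survivor function S(t) = e^{-λt} follows; the function names and the values of λ, t, and s are assumptions for illustration.

```python
import math


def survivor(t, lam):
    """Exponential survivor function S(t) = exp(-lam * t)."""
    return math.exp(-lam * t)


def conditional_survivor(t, s, lam):
    """P[T >= t + s | T >= s] for an exponential lifetime."""
    return survivor(t + s, lam) / survivor(s, lam)


lam = 2.0  # hypothetical failure rate
for t, s in [(0.5, 0.25), (1.0, 1.5)]:
    # The two printed values agree: a used item is as good as a new one.
    print(conditional_survivor(t, s, lam), survivor(t, lam))
```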
[Figure: hazard functions h_T(t), h_X1(t), and h_X2(t) plotted against t.]
\[
f(t) = \sum_{l=1}^{m} p_l \left[ \sum_{j=1}^{k_l} h_{lj}(t) \right]
\exp\left( -\int_0^{t} \sum_{j=1}^{k_l} h_{lj}(\tau)\,d\tau \right)
\]
where m is the number of populations, \sum_{l=1}^{m} p_l = 1,
k_l is the number of risks acting within the lth
population, and h_{lj}(t) is the hazard function for the jth
risk within the lth population.
Example Biostatistics
T : patient survival time
z 1 : age
z 2 : gender
z 3 : cholesterol level
Example Recidivism
T : time to return to prison
z 1 : age
z 2 : time served
z 3 : number of previous convictions
Notation
z = (z 1 , z 2 , . . . , z q )′ covariates
β = ( β 1 , β 2 , . . . , β q )′ regression coefficients
ψ (z) link function
S 0 (t), f 0 (t), h0 (t), H 0 (t) baseline functions
[Figure: baseline hazard function h_0(t).]
6. Repairable Systems
Motivation
So far, only nonrepairable systems of components
have been considered. Most systems are
repairable.
Outline
• Introduction
• Point processes
• Availability
• Birth-death processes
6.1 Introduction
A repairable item may be returned to an operating
condition after failure to perform a required
function by any method other than replacement of
the entire item.
Replacement models
• used when a nonrepairable item is replaced
with another item upon failure
• "socket models"
• unlimited spares
• redundancy allocation problem (optimal
number of spares)
• replacement policies
[Figures: point process realizations on a time axis; hazard
functions h(t); intensity functions λ(t); cumulative intensity
function Λ(t) and counting process N(t), with X1, ..., X4 and
T1, ..., T4 marked.]
Figure 6.7 Independent increments.
• stationarity: the distribution of the number of
failures in any time interval depends only on
the length of the time interval
Homogeneous Poisson process (HPP)
Definition 6.1 A counting process is a
Poisson process with parameter λ > 0 if
• N (0) = 0
• the process has independent increments
• the number of failures in any interval of
length t has the Poisson distribution
with parameter λ t.
Implications
• the distribution of the number of events in
(t 1 , t 2 ] has the Poisson distribution with
parameter λ (t 2 − t 1 ).
• \( P[N(t_2) - N(t_1) = x] = \dfrac{[\lambda (t_2 - t_1)]^{x} e^{-\lambda (t_2 - t_1)}}{x!} \)
for x = 0, 1, 2, . . .
• N (t) has the Poisson distribution with mean
Λ(t) = E[N (t)] = λ t, where λ is often called
the rate of occurrence of failures
• the intensity function is λ (t) = Λ′(t) = λ
• if X 1 , X 2 , . . . are independent and identically
distributed exponential random variables,
then N (t) corresponds to a Poisson process
• this model is sometimes called just a Poisson
process
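Two of these implications translate directly into code: the Poisson probability for the number of failures in an interval, and simulation of the process from independent exponential times between failures. This is a sketch under assumed parameter values; the function names are illustrative.

```python
import math
import random


def poisson_pmf(x, mean):
    """P[N = x] for a Poisson random variable with the given mean."""
    return mean ** x * math.exp(-mean) / math.factorial(x)


def prob_failures(x, lam, t1, t2):
    """P[N(t2) - N(t1) = x] for an HPP with rate lam."""
    return poisson_pmf(x, lam * (t2 - t1))


def simulate_hpp(rng, lam, horizon):
    """Failure times on (0, horizon] built from exponential interfailure times."""
    times = []
    t = rng.expovariate(lam)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(lam)
    return times


lam = 2.0  # hypothetical rate of occurrence of failures
print(prob_failures(0, lam, 0.5, 1.5))  # probability of no failures in (0.5, 1.5]
failure_times = simulate_hpp(random.Random(2024), lam, 10.0)
print(len(failure_times))               # E[N(10)] = lam * 10 = 20
```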
Nonhomogeneous Poisson process (NHPP)
Four reasons to consider an NHPP
• the HPP is a special case of an NHPP
(stationarity assumption relaxed)
• the probabilistic model for an NHPP is
mathematically tractable
• the statistical methods for an NHPP are also
mathematically tractable
• the NHPP is capable of modeling improving
and deteriorating systems
Intensity function: λ(t)
Cumulative intensity function:
\[
\Lambda(t) = \int_0^{t} \lambda(\tau)\,d\tau
\]
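As an illustration of the intensity and cumulative intensity functions, the sketch below assumes a power-law intensity λ(t) = κ λ^κ t^(κ-1), so that Λ(t) = (λt)^κ. The power-law form is an assumption made for this example, not a model named in this section. With this form, κ > 1 gives an increasing intensity (a deteriorating system) and κ < 1 a decreasing one (an improving system). Event times are generated by passing the event times of a unit-rate HPP through the inverse of Λ.

```python
import random


def intensity(t, lam, kappa):
    """Assumed power-law intensity function lambda(t)."""
    return kappa * lam ** kappa * t ** (kappa - 1)


def cumulative_intensity(t, lam, kappa):
    """Closed-form Lambda(t) for the assumed power-law intensity."""
    return (lam * t) ** kappa


def midpoint_integral(f, a, b, steps=1000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))


def simulate_nhpp(rng, lam, kappa, horizon):
    """Event times on (0, horizon] obtained by inverting Lambda."""
    times = []
    unit_time = rng.expovariate(1.0)  # event time of a unit-rate HPP
    while True:
        t = unit_time ** (1.0 / kappa) / lam  # inverse of Lambda
        if t > horizon:
            break
        times.append(t)
        unit_time += rng.expovariate(1.0)
    return times


lam, kappa, horizon = 1.5, 2.0, 2.0  # hypothetical values
numeric = midpoint_integral(lambda t: intensity(t, lam, kappa), 0.0, horizon)
closed_form = cumulative_intensity(horizon, lam, kappa)
events = simulate_nhpp(random.Random(7), lam, kappa, horizon)
print(numeric, closed_form, len(events))
```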
6.3 Availability
Notation
• X i denotes the i th time to failure, i = 1, 2, . . .
• Ri denotes the i th time to repair, i = 1, 2, . . .
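[Figure: alternating failure (X) and repair completion (O) points on a
time axis, with times to failure X1, X2, X3 and times to repair
R1, R2, R3.]
[Figure: f(t) versus t, with four points marked X on the t axis.]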
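\[
\mathrm{Cov}(U_i(\theta), U_j(\theta)) =
E\left[ \frac{-\partial^2 \log L(t, \theta)}{\partial \theta_i \, \partial \theta_j} \right]
\]
for i = 1, 2, . . . , p and j = 1, 2, . . . , p.
• the observed information matrix O(θ̂) has
components
\[
\left. \frac{-\partial^2 \log L(t, \theta)}{\partial \theta_i \, \partial \theta_j} \right|_{\theta = \hat{\theta}}
\qquad i = 1, 2, \ldots, p; \quad j = 1, 2, \ldots, p
\]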
Example 7.7 Collect t 1 , t 2 , . . . , t n from an
exponential population with a single
parameter θ
\[
f(t; \theta) = \frac{1}{\theta} e^{-t/\theta} \qquad t > 0
\]
The likelihood function is
\[
L(t, \theta) = \prod_{i=1}^{n} f(t_i, \theta)
= \prod_{i=1}^{n} \frac{1}{\theta} e^{-t_i/\theta}
= \theta^{-n} e^{-\sum_{i=1}^{n} t_i / \theta}
\]
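Since log L(t, θ) = −n log θ − \sum_{i=1}^{n} t_i / θ, the score is
\[
U(\theta) = \frac{\partial \log L(t, \theta)}{\partial \theta}
= -\frac{n}{\theta} + \frac{\sum_{i=1}^{n} t_i}{\theta^2}
\]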
The maximum likelihood estimator is
\[
\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} t_i
\]
The derivative of the score vector is
\[
\frac{\partial^2 \log L(t, \theta)}{\partial \theta^2}
= \frac{n}{\theta^2} - \frac{2 \sum_{i=1}^{n} t_i}{\theta^3}
\]
The information matrix is
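\[
I(\theta) = E\left[ \frac{-\partial^2 \log L(t, \theta)}{\partial \theta^2} \right]
= E\left[ -\frac{n}{\theta^2} + \frac{2 \sum_{i=1}^{n} t_i}{\theta^3} \right]
= -\frac{n}{\theta^2} + \frac{2}{\theta^3} E\left[ \sum_{i=1}^{n} t_i \right]
= \frac{n}{\theta^2}
\]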
The observed information matrix is
\[
O(\hat{\theta}) = \left. \frac{-\partial^2 \log L(t, \theta)}{\partial \theta^2} \right|_{\theta = \hat{\theta}}
= \frac{n}{\hat{\theta}^2}
\]
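The quantities in Example 7.7 can be computed directly. The sketch below uses a small hypothetical data set (not from the slides) and checks the closed-form observed information against a finite-difference second derivative of the log likelihood; all function names are illustrative.

```python
import math


def log_likelihood(times, theta):
    """log L(t, theta) = -n log(theta) - sum(t_i) / theta."""
    return -len(times) * math.log(theta) - sum(times) / theta


def score(times, theta):
    """U(theta) = -n / theta + sum(t_i) / theta^2."""
    return -len(times) / theta + sum(times) / theta ** 2


def mle(times):
    """Maximum likelihood estimator: the sample mean."""
    return sum(times) / len(times)


def observed_information(times):
    """O(theta_hat) = n / theta_hat^2."""
    return len(times) / mle(times) ** 2


times = [12.0, 25.0, 31.0, 44.0, 58.0]  # hypothetical complete data set
theta_hat = mle(times)
print(theta_hat, score(times, theta_hat), observed_information(times))

# Finite-difference check of the negative second derivative at theta_hat.
h = 1e-2
numeric_information = -(
    log_likelihood(times, theta_hat + h)
    - 2.0 * log_likelihood(times, theta_hat)
    + log_likelihood(times, theta_hat - h)
) / h ** 2
print(numeric_information)
```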
7.5 Censoring
A censored observation occurs when only a bound
is known on the time of failure.
Notation
• n: number of items on test
• r: number of observed failures
• c: censoring time
A data set where all failure times are known is
called a complete data set.
\[
\frac{r}{\kappa} + r \log \lambda + \sum_{i \in U} \log x_i
- \sum_{i=1}^{n} (\lambda x_i)^{\kappa} \log (\lambda x_i) = 0
\]
The first equation can be solved for λ in terms of κ
\[
\lambda = \left( \frac{r}{\sum_{i=1}^{n} x_i^{\kappa}} \right)^{1/\kappa}
\]
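Substituting this expression for λ into the remaining score equation leaves a single equation in κ that can be solved numerically. The sketch below uses bisection, which is one possible choice and not necessarily the method used in the slides. The data, the censoring indicators, and the bracketing interval are hypothetical.

```python
import math


def lambda_given_kappa(x, r, kappa):
    """lambda = (r / sum(x_i^kappa))^(1/kappa)."""
    return (r / sum(xi ** kappa for xi in x)) ** (1.0 / kappa)


def kappa_score(x, failed, lam, kappa):
    """Left-hand side of the score equation for kappa."""
    r = sum(failed)
    return (
        r / kappa
        + r * math.log(lam)
        + sum(math.log(xi) for xi, d in zip(x, failed) if d)
        - sum((lam * xi) ** kappa * math.log(lam * xi) for xi in x)
    )


def profile_score(x, failed, kappa):
    """Score for kappa with lambda replaced by its expression in kappa."""
    lam = lambda_given_kappa(x, sum(failed), kappa)
    return kappa_score(x, failed, lam, kappa)


def solve_kappa(x, failed, lo=0.1, hi=20.0, tol=1e-12):
    """Bisection; assumes the profile score changes sign on [lo, hi]."""
    f_lo = profile_score(x, failed, lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = profile_score(x, failed, mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical observed times; False marks a right-censored value.
x = [12.0, 25.0, 31.0, 44.0, 58.0, 70.0]
failed = [True, True, True, False, True, True]
kappa_hat = solve_kappa(x, failed)
lambda_hat = lambda_given_kappa(x, sum(failed), kappa_hat)
print(lambda_hat, kappa_hat)
```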
The 2 × 2 information matrices are based on
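\[
\frac{-\partial^2 \log L(\lambda, \kappa)}{\partial \lambda^2}
= \frac{\kappa r}{\lambda^2} + \kappa (\kappa - 1) \lambda^{\kappa - 2} \sum_{i=1}^{n} x_i^{\kappa}
\]
\[
\frac{-\partial^2 \log L(\lambda, \kappa)}{\partial \kappa^2}
= \frac{r}{\kappa^2} + \sum_{i=1}^{n} (\lambda x_i)^{\kappa} (\log \lambda x_i)^2
\]
\[
\frac{-\partial^2 \log L(\lambda, \kappa)}{\partial \lambda \, \partial \kappa}
= -\frac{r}{\lambda} + \lambda^{\kappa - 1}
\left[ \kappa \sum_{i=1}^{n} x_i^{\kappa} \log x_i + (1 + \kappa \log \lambda) \sum_{i=1}^{n} x_i^{\kappa} \right]
\]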
Information matrices
• the expected values of these quantities are not
tractable
• use λˆ and κ̂ to obtain the observed information
matrix
Example 8.15 Ball bearing data set. The
fitted Weibull distribution: λ̂ = 0.0122 and
κ̂ = 2.10.
[Figure: survivor functions S(t) for the Weibull and exponential fits.]
Figure 8.9 Confidence region for
λ and κ (α = 0.05).
Inverse of the observed information matrix:
\[
O^{-1}(\hat{\lambda}, \hat{\kappa}) =
\begin{bmatrix}
0.00000165 & -0.000139 \\
-0.000139 & 0.108
\end{bmatrix}
\]
The standard errors of the parameter
estimators are the square roots of the
diagonal elements
\[
\hat{\sigma}_{\hat{\lambda}} = 0.00128 \qquad \hat{\sigma}_{\hat{\kappa}} = 0.329
\]
An asymptotic 95% confidence interval for κ
is
\[
2.10 - (1.96)(0.329) < \kappa < 2.10 + (1.96)(0.329)
\]
or 1.46 < κ < 2.74.
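The arithmetic in this example can be reproduced from the matrix entries quoted above. A minimal sketch; the variable names are illustrative.

```python
import math

# Inverse of the observed information matrix, as quoted in the example.
inverse_information = [
    [0.00000165, -0.000139],
    [-0.000139, 0.108],
]
kappa_hat = 2.10
z = 1.96  # standard normal fractile for a 95% interval

# Standard errors are the square roots of the diagonal elements.
se_lambda = math.sqrt(inverse_information[0][0])
se_kappa = math.sqrt(inverse_information[1][1])

lower = kappa_hat - z * se_kappa
upper = kappa_hat + z * se_kappa
print(se_lambda, se_kappa, lower, upper)
```

Because the interval excludes κ = 1, the exponential special case of the Weibull distribution is not supported by this interval.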
\[
L(\theta, \beta) = \prod_{i \in U} f(x_i, z_i, \theta, \beta) \prod_{i \in C} S(x_i, z_i, \theta, \beta)
\]
The log likelihood function is
\[
\log L(\theta, \beta) = \sum_{i \in U} \log f(x_i, z_i, \theta, \beta) + \sum_{i \in C} \log S(x_i, z_i, \theta, \beta)
\]
or
\[
\log L(\theta, \beta) = \sum_{i \in U} \log h(x_i, z_i, \theta, \beta) - \sum_{i=1}^{n} H(x_i, z_i, \theta, \beta)
\]
Notes
• the maximum likelihood estimators for θ and
β cannot be expressed in closed form
• the number of unique covariate vectors and n
determine whether to use regression models
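The two forms of the log likelihood agree because log f = log h − H and log S = −H. The sketch below checks this numerically under assumptions that go beyond this section: a constant baseline hazard θ, a proportional hazards model, and a log-linear link ψ(z) = exp(β′z). The data triples (x_i, δ_i, z_i) are hypothetical, with δ_i = 1 for an observed failure and δ_i = 0 for a censored value.

```python
import math


def link(z, beta):
    """Assumed log-linear link function psi(z) = exp(beta'z)."""
    return math.exp(sum(b * zj for b, zj in zip(beta, z)))


def cumulative_hazard(x, z, theta, beta):
    """H(x) for an assumed constant baseline hazard theta scaled by psi(z)."""
    return theta * link(z, beta) * x


def log_hazard(z, theta, beta):
    """log h for the same assumed model (constant in x)."""
    return math.log(theta * link(z, beta))


def log_likelihood_f_and_S(data, theta, beta):
    """Sum of log f over uncensored items plus log S over censored items."""
    total = 0.0
    for x, delta, z in data:
        log_S = -cumulative_hazard(x, z, theta, beta)
        if delta == 1:
            total += log_hazard(z, theta, beta) + log_S  # log f = log h + log S
        else:
            total += log_S
    return total


def log_likelihood_h_and_H(data, theta, beta):
    """Sum of log h over uncensored items minus H over all items."""
    uncensored = sum(log_hazard(z, theta, beta) for x, delta, z in data if delta == 1)
    all_items = sum(cumulative_hazard(x, z, theta, beta) for x, delta, z in data)
    return uncensored - all_items


# Hypothetical data: (observed time, failure indicator, covariate vector).
data = [(80.0, 1, (1.0,)), (20.0, 1, (1.0,)), (50.0, 0, (0.0,))]
theta, beta = 0.02, (0.5,)
first_form = log_likelihood_f_and_S(data, theta, beta)
second_form = log_likelihood_h_and_H(data, theta, beta)
print(first_form, second_form)
```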
9.3 Proportional hazards
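Figure 9.2 Proportional hazards
parameter estimation notation. Three items are shown: item 1
(100 w, z_11 = 1) fails at t_1 = 80, item 2 (100 w, z_21 = 1)
fails at t_2 = 20, and item 3 (60 w, z_31 = 0) fails at t_3 = 50,
so the ordered failure times are t_(1) = 20, t_(2) = 50, t_(3) = 80.
\[
x = \begin{bmatrix} 80 \\ 20 \\ 50 \end{bmatrix} \qquad
\delta = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \qquad
Z = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
\]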
Example 9.8 North Carolina collected
recidivism data on n = 1540 prisoners
in 1978 (Schmidt and Witte, 1988). T
is the time from release until return
to prison. The purpose of the
study is to assess the impact of the
q = 15 covariates.
Table 9.2 North Carolina recidivism model.
z_i    Covar.    β̂       √V̂[β̂]   β̂/√V̂[β̂]   p-value
z2     AGE      -3.34    0.52     -6.43      0.000
z3     PRIORS    0.84    0.14      6.10      0.000
z1     TSERV     1.17    0.20      5.96      0.000
z6     WHITE    -0.44    0.09     -5.07      0.000
z8     ALCHY     0.43    0.10      4.11      0.000
z13    FELON    -0.58    0.16     -3.54      0.000
z9     JUNKY     0.28    0.10      2.91      0.002
z7     MALE      0.67    0.24      2.78      0.003
z15    PROPTY    0.39    0.16      2.47      0.007
z4     RULE      3.08    1.69      1.82      0.034
z10    MARRY    -0.15    0.11     -1.42      0.077
z5     SCHOOL   -0.25    0.19     -1.30      0.097
z12    WORK      0.09    0.09      0.96      0.169
z14    PERSON    0.07    0.24      0.30      0.381
z11    SUPER    -0.01    0.10     -0.09      0.464
[Figure: population survivor function, exponential fit, and
nonparametric estimator of S(t).]
\[
\hat{S}(t) \pm z_{\alpha/2} \sqrt{\frac{\hat{S}(t)\,(1 - \hat{S}(t))}{n}}
\]
Example 10.1 For the ball bearing data
set, find a nonparametric survivor
function estimator and a 95%
confidence interval for the probability
that a ball bearing will last 50,000,000
cycles.
or
\[
\hat{S}(50) \pm 1.96 \sqrt{\frac{\hat{S}(50)\,(1 - \hat{S}(50))}{23}}
\]
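The interval can be computed once the fraction of items surviving past the time of interest is known. In the sketch below, the count of 16 survivors out of 23 is a hypothetical value used only to exercise the formula; it is not taken from the slides, and the function names are illustrative.

```python
import math


def survivor_estimate(lifetimes, t):
    """Fraction of a complete data set with lifetime greater than t."""
    return sum(1 for x in lifetimes if x > t) / len(lifetimes)


def survivor_confidence_interval(s_hat, n, z=1.96):
    """Asymptotic interval s_hat +/- z * sqrt(s_hat * (1 - s_hat) / n)."""
    half_width = z * math.sqrt(s_hat * (1.0 - s_hat) / n)
    return s_hat - half_width, s_hat + half_width


n = 23
survivors_past_50 = 16  # hypothetical count, for illustration only
s_hat = survivors_past_50 / n
lower, upper = survivor_confidence_interval(s_hat, n)
print(s_hat, lower, upper)
```

The bounds are not truncated to [0, 1], so an interval near either end of the survivor function may need to be clipped.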
[Figures: nonparametric estimator and exponential fit, shown as
survivor functions S(t) and as cumulative distribution functions
F(t), with the statistic D23 marked.]
                  1 − α
 n     0.80    0.90    0.95    0.99
 1     0.900   0.950   0.975   0.995
 2     0.683   0.775   0.841   0.929
 3     0.565   0.636   0.708   0.829
 4     0.493   0.565   0.624   0.734
 5     0.447   0.510   0.564   0.668
 6     0.410   0.468   0.519   0.615
 7     0.381   0.435   0.483   0.576
 8     0.358   0.410   0.455   0.543
 9     0.339   0.387   0.430   0.513
10     0.323   0.369   0.409   0.490
15     0.266   0.304   0.338   0.405
20     0.231   0.264   0.294   0.352
23     0.216   0.248   0.275   0.330
25     0.208   0.237   0.264   0.317
30     0.190   0.217   0.242   0.290
40     0.166   0.189   0.210   0.252
50     0.148   0.170   0.188   0.226
Table 11.1 gives estimates of the 1 − α
fractiles of the distribution of D n under H 0
determined by Monte Carlo simulation
(500,000 replications).
Example 11.1 Use the K-S test to
determine whether the ball bearing data
set was drawn from a Weibull
population with λ = 0.01 and κ = 2.
Run the test at α = 0.10.
The goodness-of-fit test is
\[
H_0 : F(t) = 1 - e^{-(0.01t)^2}
\]
\[
H_1 : F(t) \ne 1 - e^{-(0.01t)^2}
\]
[Figure: nonparametric estimator of F(t) and the hypothesized
Weibull distribution, with the statistic D23 marked.]
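The K-S statistic is the largest vertical distance between the empirical distribution function and the hypothesized F(t), checked just before and at each ordered observation. The sketch below applies it to the hypothesized Weibull distribution from Example 11.1, using six hypothetical observations rather than the ball bearing data; the critical value 0.468 is the n = 6, 1 − α = 0.90 entry of the table above. Function names are illustrative.

```python
import math


def weibull_cdf(t, lam, kappa):
    """F(t) = 1 - exp(-(lam * t) ** kappa)."""
    return 1.0 - math.exp(-((lam * t) ** kappa))


def ks_statistic(data, cdf):
    """D_n: largest gap between the empirical step function and cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare cdf with the empirical function at and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


data = [22.0, 41.0, 55.0, 68.0, 93.0, 120.0]  # hypothetical observations
d_n = ks_statistic(data, lambda t: weibull_cdf(t, 0.01, 2.0))
critical_value = 0.468  # table entry for n = 6 and 1 - alpha = 0.90
reject_h0 = d_n > critical_value
print(d_n, reject_h0)
```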