RELIABILITY
ENGINEERING
Probabilistic Models and
Maintenance Methods
Second Edition
JOEL A. NACHLAS
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information stor-
age or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222
Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that pro-
vides licenses and registration for a variety of users. For organizations that have been granted a photo-
copy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Dedicated to the memory of Betty Nachlas
Preface
The motivation for the preparation of a second edition was my wish to expand
the treatment of several topics while maintaining an integrated introductory
resource for the study of reliability evaluation and maintenance planning.
The focus across all of the topics treated is the use of analytical methods to
support the design of dependable and efficient equipment and the planning
for the servicing of that equipment. The orientation of the topical develop-
ment is that probability models provide an effective vehicle for portraying
and evaluating the variability that is inherent in the performance and lon-
gevity of equipment.
The book is intended to support either an introductory graduate course
in reliability theory and preventive maintenance planning or a sequence of
courses that address these topics. A fairly comprehensive coverage of the
basic models and of various methods of analysis is provided. An under-
standing of the topics discussed should permit the reader to comprehend
the literature describing new and advanced models and methods.
Notwithstanding the emphasis upon initial study, the text should also
serve well as a resource for practicing engineers. Engineers who are involved
in the design process should find a coherent explanation of the reliability
and maintenance issues that will influence the success of the devices they
create. Similarly, engineers responsible for the analysis and verification of
product reliability or for the planning of maintenance support of fielded
equipment should find the material presented here to be relevant and easy
to access and use.
In preparing this second edition, the treatment of statistical methods for
evaluating reliability has been expanded substantially. Several methods for
constructing confidence intervals as part of the parametric estimation effort
are described and methods for treating data derived from operating repair-
able devices have also been added. In addition, the analysis of nonstation-
ary models of repairable equipment maintenance has been updated and
expanded. These expansions along with numerous other minor improve-
ments to the text should make this book an even more useful resource for
both students and practitioners.
The background required of the reader is a sound understanding of prob-
ability. This subsumes capability with calculus. More specifically, the reader
should have an understanding of distribution theory, Laplace transforms,
convolutions, stochastic processes, and Markov processes. It is also worth
mentioning that the use of the methods discussed in this book often involves
substantial computational effort, so facility with numerical methods and
access to efficient mathematical software is desirable.
One caveat concerning the coverage here is that the treatment is strictly
limited to hardware. Reliability and maintenance models have been devel-
oped for applications to software, humans, and services systems. No criti-
cism of those efforts is intended but the focus here is simply hardware.
The organization of the text is reasonably straightforward. The elemen-
tary concepts of reliability theory are presented sequentially in Chapters 1
through 6. Following this, the commonly used statistical methods for eval-
uating component reliability are described in Chapters 7 and 8. Chapters
9 through 13 treat repairable systems and maintenance planning models.
Here again the presentation is sequential in that simple failure models pre-
cede those that include preventive actions and the renewal cases are treated
before the more realistic nonrenewal cases. In the final chapter, four inter-
esting special topics, including warranties, are discussed. It is worth noting
that four appendices that address aspects of numerical computation are pro-
vided. These should be quite useful to the reader.
Naturally, many people have contributed to the preparation of this text.
The principal factor in the completion of this book was the support and
encouragement of my wife Beverley. An important practical component of
my success was the support of Virginia Tech, especially during sabbaticals
when progress with writing is so much easier.
I acknowledge the significant computational capability provided to me by
the Mathematica software. Many of the analyses included in this text would
have been much more taxing or even impossible without the strength and
efficiency the Wolfram software provides.
I also wish to extend my thanks directly to three of my students, each of
whom contributed to my efforts. Edvin Beqari stimulated my increased inter-
est in and analysis of the diffusion models of degradation. He also directed
much of my analysis of that topic. Elliott Mitchell-Colgan helped to expand
the sets of exercises included at the end of the chapters. Paul D’Agostino
invested very many hours in verifying a majority of the complicated numeri-
cal analyses used for examples or for exercise solutions.
I express my profound gratitude to all of my graduate students who have
taught me so much about these topics over the years. May we all continue to
learn and grow and to enjoy the study of this important subject.
Author
Joel A. Nachlas received his BES from Johns Hopkins University in 1970, his
MS in 1972 and his PhD in 1976, both from the University of Pittsburgh. He
served on the faculty of the Grado Department of Industrial and Systems
Engineering at Virginia Tech for 41 years and retired in March 2016. His
research interests are in the applications of probability and statistics to prob-
lems in reliability and quality control. In addition to his normal teaching
activities during his time at Virginia Tech, he served as the coordinator for
the department’s graduate program in operations research and for their
dual master’s degree that is operated with École des Mines de Nantes in
France. From 1992 to 2011, he regularly taught reliability theory at the École
Polytechnique de l’Université Nice Sophia Antipolis. He is the coauthor of
more than 50 refereed articles, has served in numerous editorial and referee
capacities, and has lectured on reliability and maintenance topics through-
out North America and Europe.
1
Introduction
Although we rarely think of it, reliability and maintenance are part of our
everyday lives. The equipment, manufactured products, and fabricated
infrastructure that contribute substantively to the quality of our lives have
finite longevity. Most of us recognize this fact, but we do not always fully
perceive the implications of finite system life for our efficiency and safety.
Many, but not all, of us also appreciate the fact that our automobiles require
regular service, but we do not generally think about the fact that roads and
bridges, smoke alarms, electricity generation and transmission devices, and
many other machines and facilities we use also require regular maintenance.
We are fortunate to live at a time in which advances in the understanding
of materials and energy have resulted in the creation of an enormous variety
of sophisticated products and systems, many of which (1) were inconceiv-
able 100 or 200 or even 20 years ago; (2) contribute regularly to our comfort,
health, happiness, efficiency, or success; (3) are relatively inexpensive; and
(4) require little or no special training on our part. Naturally, our reliance
on these devices and systems is continually increasing and we rarely think
about failure and the consequences of failure.
Occasionally, we observe a catastrophic failure. Fatigue failures of the fuse-
lage of aircraft [1], the loss of an engine by a commercial jet [1], the Three Mile
Island [1] and Chernobyl [1] nuclear reactor accidents, and the Challenger [2]
and Columbia [3] space shuttle accidents are all widely known examples of
catastrophic equipment failures. The relay circuit failure at the Ohio power
plant that precipitated the August 2003 power blackout in the northeastern
United States and in eastern Canada [4] is an example of a system failure
that directly affected millions of people. When these events occur, we are
reminded dramatically of the fallibility of the physical systems on which
we depend.
Nearly everyone has experienced less dramatic product failures such as
that of a home appliance, the wear out of a battery, and the failure of a light
bulb. Many of us have also experienced potentially dangerous examples of
product failures such as the blowout of an automobile tire.
Reliability engineering is the study of the longevity and failure of equip-
ment. Principles of science and mathematics are applied to the investiga-
tion of how devices age and fail. The intent is that a better understanding
of device failure will aid in identifying ways in which product designs can
be improved to increase life length and limit the adverse consequences of
failure. The key point here is that the focus is upon design. New product and
system designs must be shown to be safe and reliable prior to their fabrica-
tion and use. A dramatic example of a design for which the reliability was
not properly evaluated is the well-known case of the Tacoma Narrows Bridge,
which collapsed into the Puget Sound in November 1940, a few months after
its completion [1].
A more recent example of a design fault with significant consequences is
the 2013 lithium-ion battery fire that occurred on a new Boeing 787 aircraft
while it was parked at the Boston airport [5]. Fortunately, the plane was
empty, so no one was injured, but the fire and two subsequent fires of the
same type resulted in all 787s being grounded until a modification to the
battery containment was made. The cost to the airlines using the planes was
estimated to be $1.1 million per day.
The study of the reliability of an equipment design also has important eco-
nomic implications for most products. As Blanchard [6] states, 90% of the life
cycle costs associated with the use of a product are fixed during the design
phase of a product’s life.
Similarly, an ability to anticipate failure can often imply the opportunity
to plan for an efficient repair of equipment when it fails or even better to per-
form preventive maintenance in order to reduce failure frequency.
There are many examples of products for which system reliability is far
better today than it was previously. One familiar example is the television
set, which historically experienced frequent failures and which, at present,
usually operates without failure beyond its age of obsolescence. Improved
television reliability is certainly due largely to advances in circuit technol-
ogy. However, the ability to evaluate the reliability of new material sys-
tems and new circuit designs has also contributed to the gains we have
experienced.
Perhaps the most well-recognized system for which preventive mainte-
nance is used to maintain product reliability is the commercial airplane.
Regular inspection, testing, repair, and even overhaul are part of the nor-
mal operating life of every commercial aircraft. Clearly, the reason for such
intense concern for the regular maintenance of aircraft is an appreciation of
the influence of maintenance on failure probabilities and thus on safety.
On a personal level, the products for which we are most frequently respon-
sible for maintenance are our automobiles. We are all aware of the inconve-
nience associated with an in-service failure of our cars and we are all aware
of the relatively modest level of effort required to obtain the reduced failure
probability that results from regular preventive maintenance.
It would be difficult to overstate the importance of maintenance and espe-
cially preventive maintenance. It is also difficult to overstate the extent to
which maintenance is undervalued or even disliked. Historically, repair and
especially preventive maintenance have often been viewed as inconvenient
overhead activities that are costly and unproductive. Very rarely have the
significant productivity benefits of preventive maintenance been recognized
and appreciated. Recently, there have been reports [7–9] that suggest that
The point of departure for the study of reliability and maintenance planning
is the elementary definition of the term reliability. As mentioned in Chapter
1, the technical definition of reliability is similar to the colloquial definition
but is more precise. Formally, the definition is as follows:
Definition 2.1
Observe that there are four specific attributes of this definition of reliability.
The four attributes are (1) probability, (2) proper performance, (3) qualifica-
tion with respect to environment, and (4) time. All four are important. Over
this and the next several chapters, we explore a series of algebraic models
that are used to represent equipment reliability. We develop the models suc-
cessively by sequentially including in the models each of the four attributes
identified in the previous definition. To start, consider the representation of
equipment performance to which we refer as function.
Note that this representation is intentionally binary. We assume here that the
status of the equipment of interest is either satisfactory or failed. There are
many types of equipment for which one or more derated states are possible.
Discussion of this possibility is postponed until the end of this chapter.
We presume that most equipment is comprised of components and that
the status of the device is determined by the status of the components.
Accordingly, let n be the number of components that make up the device and
define the component status variables, $x_i$, as
$$x_i = \begin{cases} 1 & \text{if component } i \text{ is functioning} \\ 0 & \text{if component } i \text{ is failed} \end{cases}$$
$$x = \{x_1, x_2, \ldots, x_n\}$$
$$\phi = \phi(x) \tag{2.1}$$
and the specific form for the function is determined by the way in which the
components interact to determine system function. In the discussions that
follow, ϕ(x) is referred to as a “system structure function” or as a “system sta-
tus function” or simply as a “structure.” In all cases, the intent is to reflect the
dependence of the system state upon the states of the components that com-
prise the system. A parenthetical point is that the terms “device” and “sys-
tem” are used here in a generic sense and may be interpreted as appropriate.
An observation concerning the component status vector is that it is defined
here as a vector of binary elements so that an n component system has $2^n$ possible
component status vectors. For example, a three-component system has
$2^3 = 8$ component status vectors. They are
{1, 1, 1} {1, 0, 1}
{1, 1, 0} {1, 0, 0}
{0, 1, 1} {0, 0, 1}
{0, 1, 0} {0, 0, 0}
Each component status vector yields a corresponding value for the system
status function, ϕ.
Definition 2.2
A coherent system is one for which the system structure function is nonde-
creasing in each of its arguments.
This means that for each element of the component status vector, $x_i$, there
exists a realization of the vector for which
$$\phi(x_1, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n) < \phi(x_1, \ldots, x_{i-1}, 1, x_{i+1}, \ldots, x_n) \tag{2.2}$$
Definition 2.3
FIGURE 2.1
Reliability block diagram for a series system (components 1, 2, and 3 connected in series).
$$\phi(x) = \prod_{i=1}^{n} x_i \tag{2.4}$$
$x_1 = x_2 = 1$, $x_3 = 0$, and $\phi(x) = 0$
$x_1 = 1$, $x_2 = x_3 = 0$, and $\phi(x) = 0$
$x_1 = x_2 = x_3 = 1$ and $\phi(x) = 1$
Definition 2.4
A parallel system is one in which the proper function of any one component
implies system function.
$$\phi(x) = \coprod_{i=1}^{n} x_i \tag{2.6}$$
FIGURE 2.2
Reliability block diagram for a three-component parallel system.
$$\coprod_{i=1}^{n} x_i = 1 - \prod_{i=1}^{n} (1 - x_i) \tag{2.7}$$
Once mastered, this shorthand is very convenient. Example cases for the
three-component parallel system are
$x_1 = x_2 = 1$, $x_3 = 0$, and $\phi(x) = 1$
$x_1 = 1$, $x_2 = x_3 = 0$, and $\phi(x) = 1$
$x_1 = x_2 = x_3 = 0$, and $\phi(x) = 0$
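The series and parallel structure functions can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names `series` and `parallel` are illustrative choices, and the parallel form uses the expansion of Equation 2.7.

```python
from math import prod

def series(x):
    """Series structure function (Equation 2.4): the product of the x_i."""
    return prod(x)

def parallel(x):
    """Parallel structure function (Equations 2.6 and 2.7): 1 - prod(1 - x_i)."""
    return 1 - prod(1 - xi for xi in x)

# Component status vectors used as example cases in the text.
for x in [(1, 1, 0), (1, 0, 0), (1, 1, 1), (0, 0, 0)]:
    print(x, "series:", series(x), "parallel:", parallel(x))
```

Because the status variables are binary, the series form equals the minimum of the component states and the parallel form equals the maximum.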
Definition 2.5
A k-out-of-n system is one in which the proper function of any k of the n com-
ponents that comprise the system implies proper system function.
The usual approach to constructing the reliability block diagram for the
k-out-of-n system is to show a parallel diagram and to provide an additional
indication that the system is k out of n.
An example of a k-out-of-n system is the rear axle of a large tractor trailer
on which the functioning of any three out of the four wheels is sufficient to
assure mobility. Another example is the fact that some (1 – k) electronic mem-
ory arrays are configured so that the operation of any 126 of the 128 memory
addresses corresponds to satisfactory operation.
The algebraic representation of the structure function for a k-out-of-n sys-
tem is not as compact as those for series and parallel systems. Given the defi-
nition of the relationship between component and system status, the most
compact algebraic form for the structure function is
$$\phi(x) = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} x_i \ge k \\ 0 & \text{otherwise} \end{cases} \tag{2.8}$$
$x_1 = x_2 = x_3 = 1$, $x_4 = 0$, and $\phi(x) = 1$
$x_1 = x_2 = 1$, $x_3 = x_4 = 0$, and $\phi(x) = 0$
$x_1 = x_2 = x_3 = 0$, $x_4 = 1$, and $\phi(x) = 0$
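Equation 2.8 translates directly into code. The sketch below is illustrative only; the function name `k_out_of_n` is hypothetical, and the choice k = 3, n = 4 is an assumption inferred from the three example cases above.

```python
def k_out_of_n(x, k):
    """Structure function of Equation 2.8: 1 if at least k components function."""
    return 1 if sum(x) >= k else 0

# Example cases, assuming a 3-out-of-4 system.
for x in [(1, 1, 1, 0), (1, 1, 0, 0), (0, 0, 0, 1)]:
    print(x, k_out_of_n(x, k=3))
```

With k = n the function reduces to the series structure, and with k = 1 it reduces to the parallel structure.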
Definition 2.6
FIGURE 2.3
Reliability block diagram for a Wheatstone bridge.
TABLE 2.1
System Status Values for the Bridge Structure
x ϕ(x) x ϕ(x)
{1, 1, 1, 1, 1} 1 {0, 1, 1, 1, 1} 1
{1, 1, 1, 1, 0} 1 {0, 1, 1, 1, 0} 1
{1, 1, 1, 0, 1} 1 {0, 1, 1, 0, 1} 1
{1, 1, 1, 0, 0} 0 {0, 1, 1, 0, 0} 0
{1, 1, 0, 1, 1} 1 {0, 1, 0, 1, 1} 1
{1, 1, 0, 1, 0} 1 {0, 1, 0, 1, 0} 0
{1, 1, 0, 0, 1} 1 {0, 1, 0, 0, 1} 1
{1, 1, 0, 0, 0} 0 {0, 1, 0, 0, 0} 0
{1, 0, 1, 1, 1} 1 {0, 0, 1, 1, 1} 0
{1, 0, 1, 1, 0} 1 {0, 0, 1, 1, 0} 0
{1, 0, 1, 0, 1} 1 {0, 0, 1, 0, 1} 0
{1, 0, 1, 0, 0} 0 {0, 0, 1, 0, 0} 0
{1, 0, 0, 1, 1} 1 {0, 0, 0, 1, 1} 0
{1, 0, 0, 1, 0} 1 {0, 0, 0, 1, 0} 0
{1, 0, 0, 0, 1} 0 {0, 0, 0, 0, 1} 0
{1, 0, 0, 0, 0} 0 {0, 0, 0, 0, 0} 0
Definition 2.7
A minimum path vector, x, is a path vector for which any vector y < x has a cor-
responding system status function with a value of 0.
Definition 2.8
A minimum path set, Pj, is the set of indices of a minimum path vector for
which the component status variable has a value of 1.
{1, 0, 0, 1, 0} P1 = {1, 4}
{0, 1, 0, 0, 1} P2 = {2, 5}
{1, 0, 1, 0, 1} P3 = {1, 3, 5}
{0, 1, 1, 1, 0} P4 = {2, 3, 4}
Next, consider the elements of a minimum path and define a status func-
tion for each minimum path. That is, represent the functional status of each
path using the functions ρ(x). Since all of the components in a minimum
path must function in order for the path to represent proper function, the
components in a minimum path may be viewed as a series system. Hence,
in general,
$$\rho_j(x) = \prod_{i \in P_j} x_i \tag{2.9}$$
$$\rho_1(x) = \prod_{i \in P_1} x_i = x_1 x_4$$
$$\rho_2(x) = \prod_{i \in P_2} x_i = x_2 x_5$$
$$\rho_3(x) = \prod_{i \in P_3} x_i = x_1 x_3 x_5$$
$$\rho_4(x) = \prod_{i \in P_4} x_i = x_2 x_3 x_4$$
Now, observe that the original system will function if any of the mini-
mum paths is functioning. Therefore, we may view the system as a parallel
arrangement of the minimum paths. Algebraically, this means
$$\phi(x) = \coprod_{j} \rho_j(x) = \coprod_{j} \prod_{i \in P_j} x_i \tag{2.10}$$
The most important point here is that for any component status vector,
Equation 2.10 will always give the same system status value as in Table 2.1.
That is, the parallel arrangement of the minimum paths of a system with the
components of the respective minimum paths arranged in series constitutes
a system that is equivalent to the original system. The graphical realization
of this equivalence for the bridge structure is presented in Figure 2.4.
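The equivalence between Equation 2.10 and Table 2.1 can be confirmed by enumerating all 32 component status vectors. The sketch below is one way to do this, with assumptions stated: `TABLE_2_1` is a transcription of the ϕ(x) column of Table 2.1, and the function names are hypothetical.

```python
from itertools import product
from math import prod

# Minimum path sets of the bridge, as listed in the text (1-based indices).
MIN_PATHS = [(1, 4), (2, 5), (1, 3, 5), (2, 3, 4)]

# phi(x) column of Table 2.1, read in the order {1,1,1,1,1}, {1,1,1,1,0}, ...,
# {0,0,0,0,0}: left half of the table first, then the right half.
TABLE_2_1 = "1110111011101100" + "1110101000000000"

def rho(x, path):
    """Equation 2.9: a minimum path behaves as a series system of its components."""
    return prod(x[i - 1] for i in path)

def phi_min_paths(x):
    """Equation 2.10: the minimum paths are arranged in parallel."""
    return 1 - prod(1 - rho(x, p) for p in MIN_PATHS)

# product([1, 0], ...) yields the vectors in the same order as the table.
vectors = list(product([1, 0], repeat=5))
mismatches = [x for x, t in zip(vectors, TABLE_2_1) if phi_min_paths(x) != int(t)]
print("vectors that disagree with Table 2.1:", mismatches)
```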
It is appropriate to emphasize here the fact that the equivalent struc-
ture has exactly the same status function value as the original structure
FIGURE 2.4
Minimum path equivalent structure for the Wheatstone bridge: four parallel branches with series components {1, 4}, {2, 5}, {1, 3, 5}, and {2, 3, 4}.
Definition 2.9
A cut vector, x, is a component status vector for which the corresponding sys-
tem status function has a value of 0.
Definition 2.10
A minimum cut vector, x, is a cut vector for which any vector y > x has a cor-
responding system status function with a value of 1.
Definition 2.11
A minimum cut set, Ck, is the set of indices of a minimum cut vector for which
the component status variable has a value of 0.
system function. For the bridge structure, the minimum cut vectors and
minimum cut sets are
{0, 0, 1, 1, 1} C1 = {1, 2}
{1, 1, 1, 0, 0} C2 = {4, 5}
{0, 1, 0, 1, 0} C3 = {1, 3, 5}
{1, 0, 0, 0, 1} C4 = {2, 3, 4}
Based on the definition of the minimum cuts, we see that a minimum cut is
inactive, and so does not by itself cause system failure, if any of its elements
functions. Hence, we may define a structure function for the minimum cuts as
$$\kappa_k(x) = \coprod_{i \in C_k} x_i \tag{2.11}$$
in general and for the specific case of the bridge structure, we have
$$\kappa_1(x) = \coprod_{i \in C_1} x_i = 1 - (1 - x_1)(1 - x_2)$$
$$\kappa_2(x) = \coprod_{i \in C_2} x_i = 1 - (1 - x_4)(1 - x_5)$$
$$\kappa_3(x) = \coprod_{i \in C_3} x_i = 1 - (1 - x_1)(1 - x_3)(1 - x_5)$$
$$\kappa_4(x) = \coprod_{i \in C_4} x_i = 1 - (1 - x_2)(1 - x_3)(1 - x_4)$$
We observe further that the system will function only if all of the minimum
cuts are inactive—if all are functioning. If any minimum cut is active, the
system is failed, so the minimum cuts act as a series system with respect to
system operation. Therefore, the minimum cut equivalent structure has the
min cuts arranged in series with the elements of each cut in parallel. This is
illustrated in Figure 2.5.
Here again, it is appropriate to emphasize the fact that the equivalent struc-
ture and the original structure have the same status function value for each
component status vector. Thus, the system status may be calculated using
only the simple series and parallel forms.
One further observation concerning the equivalent structures is that one
may use either the minimum cut or the minimum path method. Both yield
equivalent expressions for the system status so we may use the one that
appears easier or preferable for some other reason.
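That the two equivalent structures agree can be confirmed by direct enumeration. The following sketch is illustrative, with hypothetical function names; it builds the bridge status function once from the minimum paths (parallel arrangement of series paths) and once from the minimum cuts (series arrangement of parallel cuts) and compares them on every component status vector.

```python
from itertools import product
from math import prod

# Minimum path and minimum cut sets of the bridge, as listed in the text.
MIN_PATHS = [(1, 4), (2, 5), (1, 3, 5), (2, 3, 4)]
MIN_CUTS = [(1, 2), (4, 5), (1, 3, 5), (2, 3, 4)]

def phi_min_paths(x):
    """Parallel arrangement of the minimum paths, each path in series."""
    return 1 - prod(1 - prod(x[i - 1] for i in p) for p in MIN_PATHS)

def kappa(x, cut):
    """Equation 2.11: the elements of a minimum cut act in parallel."""
    return 1 - prod(1 - x[i - 1] for i in cut)

def phi_min_cuts(x):
    """Series arrangement of the minimum cuts."""
    return prod(kappa(x, c) for c in MIN_CUTS)

all_agree = all(
    phi_min_paths(x) == phi_min_cuts(x) for x in product([0, 1], repeat=5)
)
print("minimum path and minimum cut forms agree:", all_agree)
```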
FIGURE 2.5
Minimum cut equivalent structure for the Wheatstone bridge: the parallel groups {1, 2}, {4, 5}, {1, 3, 5}, and {2, 3, 4} arranged in series.
is a vector of binary module status values and the system status is defined as
$$\phi(x) = \phi(\psi(x)) \tag{2.12}$$
In the same manner, the state of a parallel system is the maximum of the
component state values as stated in Equation 2.5:
$$\phi(x) = \max_i \{x_i\} \tag{2.14}$$
Then for more general structures, we use the min paths or min cuts to define
$$\phi(x) = \max_j \left\{ \min_{i \in P_j} \{x_i\} \right\} = \min_k \left\{ \max_{i \in C_k} \{x_i\} \right\} \tag{2.15}$$
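Equation 2.15 can be tried on the bridge structure with more than two component states. The sketch below is only an illustration: the three state levels {0, 1, 2} are an assumed example, not taken from the text, and the function names are hypothetical. It checks that the max-min form over the minimum paths equals the min-max form over the minimum cuts for every state vector.

```python
from itertools import product

# Minimum path and minimum cut sets of the bridge (1-based indices).
MIN_PATHS = [(1, 4), (2, 5), (1, 3, 5), (2, 3, 4)]
MIN_CUTS = [(1, 2), (4, 5), (1, 3, 5), (2, 3, 4)]

def phi_max_min(x):
    """Left-hand form of Equation 2.15: best path, limited by its weakest component."""
    return max(min(x[i - 1] for i in p) for p in MIN_PATHS)

def phi_min_max(x):
    """Right-hand form of Equation 2.15: worst cut, rescued by its best component."""
    return min(max(x[i - 1] for i in c) for c in MIN_CUTS)

STATES = (0, 1, 2)  # assumed state levels, for illustration only
forms_agree = all(
    phi_max_min(x) == phi_min_max(x) for x in product(STATES, repeat=5)
)
print("max-min and min-max forms agree:", forms_agree)
```

Restricting `STATES` to (0, 1) recovers the binary structure function of the bridge.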
Exercises
2.1 Construct the system structure function for the following system:
[Figure: reliability block diagram for Exercise 2.1]
2.2 Construct the minimum path and minimum cut equivalent structures
for a 2-out-of-3 system.
2.3 Construct the minimum path and minimum cut equivalent structures
for a three-component series system.
2.4 Construct the minimum path and minimum cut equivalent structures
for a three-component parallel system.
2.5 Show that the status function value of any system structure is
bounded below by the state of the series system comprised of the same
components and is bounded above by the state of the parallel system
comprised of the same components. That is, show that
$$\prod_{i=1}^{n} x_i \le \phi(x) \le \coprod_{i=1}^{n} x_i$$
2.6 Construct the minimum path and minimum cut equivalent structures
for the following system:
[Figure: reliability block diagram for Exercise 2.6]
2.7 Construct the minimum path and minimum cut equivalent structures
for the following system:
[Figure: reliability block diagram for Exercise 2.7]
2.8 Construct the minimum path and minimum cut equivalent structures
for the following system:
[Figure: reliability block diagram for Exercise 2.8]
2.9 Construct and compare the minimum path and minimum cut equiva-
lent structures for the following systems:
[Figure: reliability block diagrams for the two systems of Exercise 2.9]
2.10 Consider a structure having status function $\phi(x)$ and define the parallel
and series composition operations as component by component. That
is, let $x \coprod y = (x_1 \coprod y_1, \ldots, x_n \coprod y_n)$ and $x \prod y = (x_1 y_1, \ldots, x_n y_n)$. Show
4
3
5
2
Identify the min paths, the min paths equivalent system and the struc-
ture function for the equivalent system.
Then, identify the min cuts, the min cut equivalent system and the
structure function for the equivalent system.
2.12 Suggest a system structure model for a bicycle. Indicate which
component failures could occur without putting the rider in danger.
2.13 For the system structure in Problem 2.6, suggest a structural represen-
tation using modules.
2.14 Identify the primary modules of an automobile.
2.15 Identify three systems that are appropriately viewed as multistate
systems because the systems or their components are sometimes
operated at reduced levels.
3
Reliability of System Structures
$$R_s = \Pr[\phi = 1] \tag{3.1}$$
Observe that an artifact of the binary definition of the system state is that
the system reliability is also the expected value of the system state variable:
$$r_i = \Pr[x_i = 1] \tag{3.3}$$
where, because of the fact that the $x_i$ are binary, it is again the case that the
reliability and expected value correspond. For a system comprised of n components,
we take
$$r = \{r_1, r_2, \ldots, r_n\}$$
$$R_s(r) = \Pr[\phi(x) = 1] \tag{3.4}$$
$$R_s = \Pr[\phi(x) = 1] = \Pr\left[\prod_{i=1}^{n} x_i = 1\right] \tag{3.5}$$
Now, in general,
$$\Pr\left[\prod_{i=1}^{n} x_i = 1\right] \ge \prod_{i=1}^{n} \Pr[x_i = 1]$$
and equality holds only when the components are mutually independent.
Thus, for a series system comprised of independent components, the system
reliability may be stated as
$$R_s = \prod_{i=1}^{n} r_i \tag{3.6}$$
We should note that regardless of whether or not the components are inde-
pendent, the system reliability function is an increasing function of the
component reliability values and is a decreasing function of the number of
components.
$$R_s = \Pr[\phi(x) = 1] = \Pr\left[\coprod_{i=1}^{n} x_i = 1\right] = \Pr\left[\prod_{i=1}^{n} (1 - x_i) = 0\right] \tag{3.7}$$
$$\Pr\left[\coprod_{i=1}^{n} x_i = 1\right] \le \coprod_{i=1}^{n} \Pr[x_i = 1]$$
$$R_s = \coprod_{i=1}^{n} r_i \tag{3.8}$$
Examination of this function indicates that the system reliability function for
a parallel system is increasing both in the component reliability values and
in the number of components.
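The opposite effects of adding components to series and parallel systems are easy to see numerically. This is a minimal sketch assuming independent components; the function names and the component reliability of 0.9 are illustrative choices, and the parallel form expands the shorthand of Equation 3.8 using Equation 2.7.

```python
from math import prod

def series_reliability(r):
    """Equation 3.6: product of the component reliabilities (independent components)."""
    return prod(r)

def parallel_reliability(r):
    """Equation 3.8 written out: 1 - prod(1 - r_i) (independent components)."""
    return 1 - prod(1 - ri for ri in r)

# Identical components with an assumed reliability of 0.9.
for n in (1, 2, 3, 5):
    r = [0.9] * n
    print(n, round(series_reliability(r), 5), round(parallel_reliability(r), 5))
```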
For system structures other than series and parallel, the computation of the
system reliability from the component reliabilities is not as straightforward.
$$R_s = \Pr\left[\sum_{i=1}^{n} x_i \ge k\right] \tag{3.9}$$
and even when the components are independent, there is no convenient form
for this function. The single exception occurs when the n components are
independent and identical (have the same reliability). In that case, the system
$$R_s = \sum_{j=k}^{n} \binom{n}{j} r^j (1 - r)^{n-j} \tag{3.10}$$
For most other k-out-of-n systems, the use of the minimum path– and mini-
mum cut–based methods of the next section provides the most effective
approach to evaluating system reliability.
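For independent and identical components, Equation 3.10 is a binomial tail sum. The sketch below, with hypothetical function names, evaluates it and cross-checks the result by summing the probabilities of all component status vectors that satisfy Equation 2.8.

```python
from itertools import product
from math import comb

def k_out_of_n_reliability(k, n, r):
    """Equation 3.10: probability that at least k of n identical components function."""
    return sum(comb(n, j) * r**j * (1 - r) ** (n - j) for j in range(k, n + 1))

def by_enumeration(k, n, r):
    """Cross-check: add the probability of every status vector with at least k ones."""
    total = 0.0
    for x in product([0, 1], repeat=n):
        ones = sum(x)
        if ones >= k:
            total += r**ones * (1 - r) ** (n - ones)
    return total

print(k_out_of_n_reliability(2, 3, 0.9), by_enumeration(2, 3, 0.9))
```

Setting k = n gives the series result $r^n$, and k = 1 gives the parallel result $1 - (1 - r)^n$.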
For the special case of the consecutive k-out-of-n:F systems described in
Chapter 2, calculation of the system reliability is quite complicated. Lambiris
and Papastavridis [16] obtained an exact formula for the system reliability
using binomial probabilities as
$$R_s = \sum_{j=0}^{n} \binom{n - jk}{j} (-1)^j \left( r(1 - r)^k \right)^j - (1 - r)^k \sum_{j=0}^{n} \binom{n - jk - k}{j} (-1)^j \left( r(1 - r)^k \right)^j \tag{3.11}$$
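Equation 3.11 can be checked against brute-force enumeration for small systems. The sketch below rests on two assumptions that are not stated in this passage: the usual definition that a consecutive k-out-of-n:F system fails when at least k consecutive components are failed, and the convention that a binomial coefficient is zero when its upper index is negative or smaller than its lower index. The function names are hypothetical.

```python
from itertools import product
from math import comb

def binom(a, b):
    """Binomial coefficient, taken as 0 outside the range 0 <= b <= a (assumed convention)."""
    return comb(a, b) if 0 <= b <= a else 0

def consecutive_k_out_of_n_F(n, k, r):
    """Equation 3.11 for n identical, independent components of reliability r."""
    q_k = (1 - r) ** k
    t = r * q_k
    first = sum(binom(n - j * k, j) * (-1) ** j * t**j for j in range(n + 1))
    second = sum(binom(n - j * k - k, j) * (-1) ** j * t**j for j in range(n + 1))
    return first - q_k * second

def by_enumeration(n, k, r):
    """Sum the probabilities of all status vectors with no run of k failed components."""
    total = 0.0
    for x in product([0, 1], repeat=n):
        if "0" * k in "".join(map(str, x)):
            continue  # k consecutive failures: the system is failed
        ones = sum(x)
        total += r**ones * (1 - r) ** (n - ones)
    return total

print(consecutive_k_out_of_n_F(5, 2, 0.9), by_enumeration(5, 2, 0.9))
```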
$$+\; r_1 r_2 (1 - r_3)(1 - r_4) r_5 + r_1 (1 - r_2) r_3 r_4 r_5 + r_1 (1 - r_2) r_3 r_4 (1 - r_5) + r_1 (1 - r_2) r_3 (1 - r_4) r_5$$
$$+\; r_1 (1 - r_2)(1 - r_3) r_4 r_5 + r_1 (1 - r_2)(1 - r_3) r_4 (1 - r_5) + (1 - r_1) r_2 r_3 r_4 r_5 + (1 - r_1) r_2 r_3 r_4 (1 - r_5)$$
TABLE 3.1
Reliability Values for Paths of the Wheatstone Bridge
x ϕ(x) Pr[f( x ) = 1] x ϕ(x) Pr[f( x ) = 1]
{1, 1, 1, 1, 1} 1 r1r2r3r4r5 {0, 1, 1, 1, 1} 1 (1 − r1)r2r3r4r5
{1, 1, 1, 1, 0} 1 r1r2r3r4(1 − r5) {0, 1, 1, 1, 0} 1 (1 − r1)r2r3r4(1 − r5)
{1, 1, 1, 0, 1} 1 r1r2r3(1 − r4)r5 {0, 1, 1, 0, 1} 1 (1 − r1)r2r3(1 − r4)r5
{1, 1, 1, 0, 0} 0 r1r2r3r(1 − r4)(1 − r5) {0, 1, 1, 0, 0} 0 (1 − r1)r2r3r(1 − r4)(1 − r5)
{1, 1, 0, 1, 1} 1 r1r2(1 − r3)r4r5 {0, 1, 0, 1, 1} 1 (1 − r1)r2(1 − r3)r4r5
{1, 1, 0, 1, 0} 1 r1r2(1 − r3)r4(1 − r5) {0, 1, 0, 1, 0} 0 (1 − r1)r2(1 − r3)r4(1 − r5)
{1, 1, 0, 0, 1} 1 r1r2(1 − r3)(1 − r4)r5 {0, 1, 0, 0, 1} 1 (1 − r1)r2(1 − r3)(1 − r4)r5
{1, 1, 0, 0, 0} 0 r1r2(1 − r3)(1 − r4) {0, 1, 0, 0, 0} 0 (1 − r1)r2(1 − r3)(1 − r4)
(1 − r5) (1 − r5)
{1, 0, 1, 1, 1} 1 r1(1 − r2)r3r4r5 {0, 0, 1, 1, 1} 0 (1 − r1)(1 − r2)r3r4r5
{1, 0, 1, 1, 0} 1 r1(1 − r2)r3r4(1 − r5) {0, 0, 1, 1, 0} 0 (1 − r1)(1 − r2)r3r4(1 − r5)
{1, 0, 1, 0, 1} 1 r1(1 − r2)r3(1 − r4)r5 {0, 0, 1, 0, 1} 0 (1 − r1)(1 − r2)r3(1 − r4)r5
{1, 0, 1, 0, 0} 0 r1(1 − r2)r3(1 − r4) {0, 0, 1, 0, 0} 0 (1 − r1)(1 − r2)r3(1 − r4)
(1 − r5) (1 − r5)
{1, 0, 0, 1, 1} 1 r1(1 − r2)(1 − r3)r4r5 {0, 0, 0, 1, 1} 0 (1 − r1)(1 − r2)(1 − r3)r4r5
{1, 0, 0, 1, 0} 1 r1(1 − r2)(1 − r3) {0, 0, 0, 1, 0} 0 (1 − r1)(1 − r2)(1 − r3)
r4(1 − r5) r4(1 − r5)
{1, 0, 0, 0, 1} 0 r1(1 − r2)(1 − r3) {0, 0, 0, 0, 1} 0 (1 − r1)(1 − r2)(1 − r3)
(1 − r4)r5 (1 − r4)r5
{1, 0, 0, 0, 0} 0 r1(1 − r2)(1 − r3) {0, 0, 0, 0, 0} 0 (1 − r1)(1 − r2)(1 − r3)
(1 − r4)(1 − r5) (1 − r4)(1 − r5)
$$R_s = r_1 r_4 + r_2 r_5 + r_1 r_3 r_5 + r_2 r_3 r_4 - r_1 r_2 r_3 r_4 - r_1 r_2 r_3 r_5 - r_1 r_2 r_4 r_5 - r_1 r_3 r_4 r_5 - r_2 r_3 r_4 r_5 + 2 r_1 r_2 r_3 r_4 r_5 \tag{3.13}$$
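Equation 3.13 can be verified by summing the probabilities of the status vectors for which the bridge functions, as tabulated in Table 3.1. The sketch below assumes independent components; the function names and the sample reliability values are illustrative only.

```python
from itertools import product
from math import prod

# Minimum path sets of the bridge (1-based indices), from Chapter 2.
MIN_PATHS = [(1, 4), (2, 5), (1, 3, 5), (2, 3, 4)]

def phi(x):
    """Bridge status: 1 if every component of at least one minimum path functions."""
    return int(any(all(x[i - 1] for i in p) for p in MIN_PATHS))

def reliability_by_enumeration(r):
    """Sum of Pr[x] over all status vectors with phi(x) = 1 (independent components)."""
    total = 0.0
    for x in product([0, 1], repeat=5):
        if phi(x):
            total += prod(ri if xi else 1 - ri for xi, ri in zip(x, r))
    return total

def reliability_eq_3_13(r):
    """The closed form of Equation 3.13."""
    r1, r2, r3, r4, r5 = r
    return (r1*r4 + r2*r5 + r1*r3*r5 + r2*r3*r4
            - r1*r2*r3*r4 - r1*r2*r3*r5 - r1*r2*r4*r5
            - r1*r3*r4*r5 - r2*r3*r4*r5 + 2*r1*r2*r3*r4*r5)

r = (0.9, 0.8, 0.7, 0.6, 0.5)  # assumed values, for illustration
print(reliability_by_enumeration(r), reliability_eq_3_13(r))
```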
In addition, for a structure such as the bridge, it would be reasonable for all
five components to be identical and to have the same reliability. Then,
$$r_i = r \quad \forall i$$
The first set of bounds is reasonably obvious and is generally not very tight.
These are the series and parallel bounds. To compute these bounds, we sim-
ply treat the system components as if the system configuration were a series
structure of independent components and calculate a lower bound and we
then calculate an upper bound assuming a parallel configuration. Thus,
\[
b_s = \prod_{i=1}^{n} r_i \le R_s \le \coprod_{i=1}^{n} r_i = 1 - \prod_{i=1}^{n} (1 - r_i) = b_p
\tag{3.15}
\]
For a system such as the bridge structure, assuming all of the components
are identical implies that these bounds reduce to
\[ b_s = r^n \le R_s \le 1 - (1 - r)^n = b_p \]
\[
\Pr\left[ \prod_{i=1}^{n} x_i = 1 \right] \ge \prod_{i=1}^{n} \Pr[x_i = 1]
\]
Applying this inequality to the minimum cut structures yields the mini-
mum cut lower bound on system reliability:
\[
b_{mcl} = \prod_{k} \Pr[\kappa_k(x) = 1] \le \Pr\left[ \prod_{k} \kappa_k(x) = 1 \right] = \Pr[\phi(x) = 1] = R_s
\tag{3.16}
\]
TABLE 3.2
Computed Values of the System Reliability and the Bounds on System Reliability
r      b_s    b_mcl  b_mml  R_s    b_mmu  b_mpu  b_p
0.99   0.951  0.999  0.980  0.999  1.000  1.000  1.000
0.95   0.774  0.995  0.903  0.995  0.998  1.000  1.000
0.90   0.591  0.978  0.810  0.979  0.990  0.997  1.000
0.75   0.273  0.852  0.563  0.861  0.938  0.936  0.999
0.60   0.078  0.618  0.360  0.660  0.840  0.748  0.990
0.50   0.031  0.431  0.250  0.500  0.750  0.569  0.969
0.25   0.001  0.064  0.063  0.139  0.438  0.148  0.763
0.10   0.000  0.003  0.010  0.022  0.190  0.022  0.410
The same reasoning can be applied to the minimum paths. The minimum
path equivalent structure has the minimum paths arranged in a parallel con-
figuration. Therefore, applying the inequality
\[
\Pr\left[ \coprod_{i=1}^{n} x_i = 1 \right] \le \coprod_{i=1}^{n} \Pr[x_i = 1]
\]
to the minimum paths yields the minimum path upper bound on system
reliability:
\[
b_{mpu} = \coprod_{j} \Pr[\rho_j(x) = 1] \ge \Pr\left[ \coprod_{j} \rho_j(x) = 1 \right] = \Pr[\phi(x) = 1] = R_s
\tag{3.17}
\]
For the example bridge structure having five identical components, these
bounds are computed as follows:
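\[
\begin{aligned}
\kappa_1(x) &= \coprod_{i \in C_1} x_i = 1 - (1 - x_1)(1 - x_2) && \text{so} \quad \Pr[\kappa_1(x) = 1] = \coprod_{i \in C_1} r_i = 1 - (1 - r)^2 \\
\kappa_2(x) &= \coprod_{i \in C_2} x_i = 1 - (1 - x_4)(1 - x_5) && \text{so} \quad \Pr[\kappa_2(x) = 1] = \coprod_{i \in C_2} r_i = 1 - (1 - r)^2 \\
\kappa_3(x) &= \coprod_{i \in C_3} x_i = 1 - (1 - x_1)(1 - x_3)(1 - x_5) && \text{so} \quad \Pr[\kappa_3(x) = 1] = \coprod_{i \in C_3} r_i = 1 - (1 - r)^3 \\
\kappa_4(x) &= \coprod_{i \in C_4} x_i = 1 - (1 - x_2)(1 - x_3)(1 - x_4) && \text{so} \quad \Pr[\kappa_4(x) = 1] = \coprod_{i \in C_4} r_i = 1 - (1 - r)^3 \\
\rho_1(x) &= \prod_{i \in P_1} x_i = x_1 x_4 && \text{so} \quad \Pr[\rho_1(x) = 1] = \prod_{i \in P_1} r_i = r^2 \\
\rho_2(x) &= \prod_{i \in P_2} x_i = x_2 x_5 && \text{so} \quad \Pr[\rho_2(x) = 1] = \prod_{i \in P_2} r_i = r^2 \\
\rho_3(x) &= \prod_{i \in P_3} x_i = x_1 x_3 x_5 && \text{so} \quad \Pr[\rho_3(x) = 1] = \prod_{i \in P_3} r_i = r^3 \\
\rho_4(x) &= \prod_{i \in P_4} x_i = x_2 x_3 x_4 && \text{so} \quad \Pr[\rho_4(x) = 1] = \prod_{i \in P_4} r_i = r^3
\end{aligned}
\]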
Then,
\[
b_{mcl} = \left(1 - (1 - r)^2\right)^2 \left(1 - (1 - r)^3\right)^2 \le R_s \le 1 - (1 - r^2)^2 (1 - r^3)^2 = b_{mpu}
\]
The minimum paths and minimum cuts may be used to define a third
set of bounds. These are known as the minimax bounds. Starting with
the minimum paths, recall that the structure function for a parallel sys-
tem may also be expressed in terms of a maximum. That is, as stated in
Equation 2.5,
\[ \phi(x) = \max_i \{x_i\} \]
so
\[ \Pr[\phi(x) = 1] = \Pr\left[\max_j \{\rho_j(x)\} = 1\right] \]
Applying the same logic to the minimum cuts, we have the following from
Equation 2.3:
\[ \phi(x) = \min_i \{x_i\} \]
so
\[ \Pr[\phi(x) = 1] = \Pr\left[\min_k \{\kappa_k(x)\} = 1\right] \]
In general,
\[ b_{mml} = \max_j \{\Pr[\rho_j(x) = 1]\} \le R_s \le \min_k \{\Pr[\kappa_k(x) = 1]\} = b_{mmu} \]
For the bridge with identical components, the computation of the minimax
bounds proceeds as follows:
\[ b_{mml} = \max_j \{\Pr[\rho_j(x) = 1]\} = \max\{r^2, r^2, r^3, r^3\} = r^2 \]
and
\[
b_{mmu} = \min_k \{\Pr[\kappa_k(x) = 1]\} = \min\{1 - (1 - r)^2,\ 1 - (1 - r)^2,\ 1 - (1 - r)^3,\ 1 - (1 - r)^3\} = 1 - (1 - r)^2
\]
so
\[ r^2 \le R_s \le 1 - (1 - r)^2 \]
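The three pairs of bounds for the bridge with identical components can be evaluated together. The sketch below uses the path and cut probabilities listed above and the identical-component form of Equation 3.13, R_s = 2r^2 + 2r^3 - 5r^4 + 2r^5. The function name and the dictionary keys are illustrative choices.

```python
from math import prod


def bridge_bounds(r, n=5):
    """Bounds for the bridge with n = 5 identical components of reliability r."""
    # Probabilities that each minimum cut structure / minimum path functions.
    cut_probs = [1 - (1 - r)**2, 1 - (1 - r)**2, 1 - (1 - r)**3, 1 - (1 - r)**3]
    path_probs = [r**2, r**2, r**3, r**3]
    return {
        "b_s": r**n,                                   # series lower bound
        "b_mcl": prod(cut_probs),                      # minimum cut lower bound
        "b_mml": max(path_probs),                      # minimax lower bound
        "R_s": 2*r**2 + 2*r**3 - 5*r**4 + 2*r**5,      # Equation 3.13 with r_i = r
        "b_mmu": min(cut_probs),                       # minimax upper bound
        "b_mpu": 1 - prod(1 - p for p in path_probs),  # minimum path upper bound
        "b_p": 1 - (1 - r)**n,                         # parallel upper bound
    }


for r in (0.99, 0.95, 0.90, 0.75, 0.60, 0.50, 0.25, 0.10):
    values = bridge_bounds(r)
    print(f"{r:.2f}  " + "  ".join(f"{v:.3f}" for v in values.values()))
```

The printed columns follow the column order of Table 3.2, so the output can be compared with the table row by row.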
3.3 Modules
The idea that the components that comprise a system may sometimes be
partitioned into modules may be extended to the calculation of system reli-
ability bounds. There are three key ways in which this may be pursued.
Recall that the algebraic representation of system state using modules is
\[ \phi(x) = \phi(\psi(x)) \]
where
\[ \psi(x) = \{\psi_1(x), \psi_2(x), \ldots, \psi_m(x)\} \]
One possible approach to the use of the partition is to calculate the reliability
of each module and to use the resulting values as component values in the
bounds defined in the previous section. If the modules contain relatively
few components or are configured in either a series or parallel structure, this
method will be fairly straightforward.
If, on the other hand, some of the modules are themselves rather compli-
cated but the system is designed with the modules in series, then the series
computation applied to each of the upper and the lower bounds provides a
pair of bounds on system reliability. In this case, the bounds on module reli-
ability are obtained using the methods of the previous section.
The third possibility is that one or more of the modules are complicated
and the system configuration of the modules is not a simple one. In this case,
a lower bound on system reliability can be computed by applying the mini-
mum cut lower bound calculation at the system level to the minimum cut
lower bounds for the modules.
Definition 3.1
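\[ I_R(i) = \frac{\partial}{\partial r_i} R_s \tag{3.21} \]
\[
I_R(j) = \frac{\partial}{\partial r_j} R_s = \frac{\partial}{\partial r_j} \prod_{i=1}^{n} r_i = \prod_{\substack{i=1 \\ i \ne j}}^{n} r_i
\]
\[ r_1 \le r_2 \le r_3 \le \cdots \le r_n \]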
so that the weakest component has the greatest importance. For a series sys-
tem, this seems intuitively reasonable.
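A short numerical sketch of the series result: the importance of component j is the product of the other component reliabilities, so the least reliable component has the largest value. The component reliabilities and function names below are hypothetical, and a central difference is included only as an independent check on the derivative.

```python
from math import prod


def series_reliability(r):
    """Reliability of a series system of independent components."""
    return prod(r)


def birnbaum_series(r, j):
    """Analytic importance for a series system: product of r_i over i != j."""
    return prod(ri for i, ri in enumerate(r) if i != j)


def birnbaum_numeric(rel_fn, r, j, h=1e-6):
    """Central-difference estimate of the derivative of R_s with respect to r_j."""
    up, down = list(r), list(r)
    up[j] += h
    down[j] -= h
    return (rel_fn(up) - rel_fn(down)) / (2 * h)


# Hypothetical reliabilities ordered so that r1 <= r2 <= r3.
r_series = [0.90, 0.95, 0.99]
importances = [birnbaum_series(r_series, j) for j in range(len(r_series))]
print(importances)
```

With the components ordered from weakest to strongest, the computed importances decrease along the list.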
For a parallel system comprised of independent components, the corre-
sponding analysis is
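\[
I_R(j) = \frac{\partial}{\partial r_j} R_s = \frac{\partial}{\partial r_j} \coprod_{i=1}^{n} r_i
= \frac{\partial}{\partial r_j} \left( 1 - \prod_{i=1}^{n} (1 - r_i) \right)
= \prod_{\substack{i=1 \\ i \ne j}}^{n} (1 - r_i)
\]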
where
pi is the reliability of component i
qi = 1 - pi is the probability that component i is failed
Now, given that the system is failed, the chance that the failure was caused
by component 1 is the probability that at least one of the cut sets containing
component 1 is failed, divided by the probability of system failure. This ratio
is the Fussell–Vesely importance of component 1, I_FV(1). Thus, the denominator
is 1 − R_s and the numerator is
\[
\begin{aligned}
\Pr[(\kappa_1 = 0) \cup (\kappa_2 = 0)] &= \Pr[\kappa_1 = 0] + \Pr[\kappa_2 = 0] - \Pr[(\kappa_1 = 0) \cap (\kappa_2 = 0)] \\
&= (1 - p_1)(1 - p_2) + (1 - p_1)(1 - p_3) - (1 - p_1)(1 - p_2)(1 - p_3) \\
&= q_1 q_2 + q_1 q_3 - q_1 q_2 q_3
\end{aligned}
\]
Therefore,
\[ I_{FV}(1) = \frac{0.0118}{0.026} = 0.452 \]
\[ I_{FV}(2) = \frac{0.0193}{0.026} = 0.740 \]
\[ I_{FV}(3) = \frac{0.0218}{0.026} = 0.837 \]
\[
\begin{aligned}
\text{Minimize} \quad & \sum_{i=1}^{n} c_i m_i \\
\text{subject to:} \quad & R_s \ge R_{\text{target}} \\
& m_i \ge 1 \quad \forall i \\
& m_i \ \text{integer}
\end{aligned}
\]
Here, we minimize the cost to obtain a target system reliability level. The
alternate problem is to
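\[
\begin{aligned}
\text{Maximize} \quad & R_s \\
\text{subject to:} \quad & \sum_{i=1}^{n} c_i m_i \le C_{\text{budget}} \\
& m_i \ \text{integer}
\end{aligned}
\]
\[ 1 - (1 - r_i)^{m_i} \]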
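For small problems, the cost-minimization form can be explored by exhaustive search. The sketch below rests on assumptions that are not stated in the text: the stages are arranged in series, stage i holds m_i identical parallel copies with stage reliability 1 - (1 - r_i)^{m_i}, and the costs, reliabilities, target, and upper limit m_max are hypothetical values.

```python
from itertools import product
from math import prod


def system_reliability(r, m):
    """Series arrangement of stages (an assumption); stage i has m[i] parallel copies."""
    return prod(1 - (1 - ri)**mi for ri, mi in zip(r, m))


def min_cost_allocation(r, c, r_target, m_max=6):
    """Exhaustive search for the cheapest integer allocation meeting the target."""
    best = None
    for m in product(range(1, m_max + 1), repeat=len(r)):
        if system_reliability(r, m) >= r_target:
            cost = sum(ci * mi for ci, mi in zip(c, m))
            if best is None or cost < best[0]:
                best = (cost, m)
    return best  # None if no allocation within m_max reaches the target


# Hypothetical two-stage example.
best = min_cost_allocation(r=(0.90, 0.80), c=(3, 2), r_target=0.95)
print(best)
```

Exhaustive search grows as m_max^n, so it is only a way to check small cases and is not a general solution method.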
3.6 Conclusion
The analyses considered to this point provide a means for relating system
reliability to component reliability for many types of equipment designs.
Several exceptions have been noted and some of these will be addressed
later in the text. For the system configurations that are based on binary com-
ponent states and independent components, the models and analyses treated
so far are sufficient to permit a reductionist approach to reliability analysis.
That is, for these simplest of systems, reliability may be studied at the com-
ponent level because the dependence of system reliability on component reli-
ability is well defined. For very many system designs, the ability to focus
independently on individual component reliability performance is essential
to achieving the high levels of reliability we now enjoy.
Exercises
3.1 Compute the reliability of a three-independent-component parallel
system in which all components have a reliability of 0.75.
3.2 Compute the reliability of a 3-out-of-4 system for which all components
have a reliability of 0.85 and the components are independent.
3.3 Compute the system reliability for the following system assuming the
components are independent and r = (0.96, 0.88, 0.92).
3.9 For the systems of Problem 2.8, assume that r1 = r2 = 0.90 and r3 = r4 = 0.80.
Compute the three types of reliability bounds and the actual system
reliability.
3.10 Suggest an alternative reliability importance measure that is different
from the Birnbaum and the Fussell–Vesely measures. Illustrate your
measure by application to the system of Problem 2.6.
3.11 Show that for a series system of n components, the Fussell–Vesely
importance measure is the greatest for the weakest component.
3.12 For a parallel system of n independent components, which component
has the greatest value of the Fussell–Vesely importance measure?
4
Reliability over Time
These basic definitions raise two points concerning notation. First, note
that it is reasonable to use either R(t) or the survivor function \(\bar{F}_T(t)\) to
represent reliability. As these forms are truly synonymous, the survivor function
form will be used here whenever reasonable. Second, note that the subscript
denoting the component or system has been dropped. Much of the discus-
sion to follow is general in that it applies equally well to the system or to any
component. As no specific component is being identified, the subscript is
excluded unless needed for clarification. In addition, as noted at the end
of Chapter 3, in many designs we can reduce our focus from the system to the
component, which suggests that we can often examine reliability independent
of component identity.
Returning to the inclusion of time in our models, note that the distribution
function on life length is the basis for four equivalent algebraic descriptors
of longevity. These four descriptors are the distribution function, the sur-
vivor (reliability) function, the density function, and the hazard function.
Reiterating the previous definitions, we have
Density function:
\[ f_T(t) = \frac{d}{dt} F_T(t) \tag{4.3} \]
\[
\Pr[T \le t + \Delta t \mid T > t] = \frac{F_T(t + \Delta t) - F_T(t)}{\bar{F}_T(t)}
\]
As indicated by the algebraic form, the hazard function is the rate at which
surviving units fail. For this reason, it is often called the “failure rate.”
However, because it can apply to other failure phenomena, the terminology
“failure rate” can be misleading, so in this text, the function zT(t) is called the
hazard function.
Knowledge of any one of the four reliability measures implies knowl-
edge of all of them. They are all functionally related and actually comprise
\[ F_T(t) = \int_0^t f_T(u)\,du \]
\[ \bar{F}_T(t) = 1 - F_T(t) \]
\[ z_T(t) = \frac{f_T(t)}{\bar{F}_T(t)} \]
\[ f_T(t) = -\frac{d}{dt} \bar{F}_T(t) = z_T(t)\,\bar{F}_T(t) \]
so
\[ \frac{d}{dt} \bar{F}_T(t) + z_T(t)\,\bar{F}_T(t) = 0 \]
\[ \bar{F}_T(t) = e^{-\int_0^t z_T(u)\,du} \]
or alternately
\[ \bar{F}_T(t) = e^{-Z_T(t)} \]
where
\[ Z_T(t) = \int_0^t z_T(u)\,du \tag{4.5} \]
is called the cumulative hazard function. Later in this text, there are several
topics in which the cumulative hazard is a useful part of the analysis.
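The relation between the hazard and the survivor function can be exercised numerically. The sketch below integrates a hazard function with the trapezoidal rule to approximate the cumulative hazard of Equation 4.5 and then exponentiates. The linear hazard 2t/θ^2 and the value θ = 100 are hypothetical choices made so that a closed form is available for comparison.

```python
import math


def cumulative_hazard(z, t, steps=10_000):
    """Trapezoidal approximation of Z_T(t), the integral of z_T(u) from 0 to t."""
    h = t / steps
    total = 0.5 * (z(0.0) + z(t))
    for k in range(1, steps):
        total += z(k * h)
    return total * h


def survivor_from_hazard(z, t):
    """Survivor function exp(-Z_T(t)) obtained from the hazard function z."""
    return math.exp(-cumulative_hazard(z, t))


theta = 100.0                         # hypothetical scale parameter


def linear_hazard(t):
    return 2.0 * t / theta**2         # hazard with cumulative hazard (t/theta)^2


t_eval = 80.0
numeric = survivor_from_hazard(linear_hazard, t_eval)
closed_form = math.exp(-(t_eval / theta)**2)
print(numeric, closed_form)
```

The same two functions accept any hazard that is finite at t = 0, which makes them a convenient starting point when only the hazard is specified.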
As the four reliability measures are all faces of the same description of
the failure behavior of a device, we could use any of them as a basis for
distinguishing failure patterns. The hazard function is commonly used
by reliability analysts to describe the failure behavior of a device. The
use of the hazard function started with the concept that a population of
devices displays a “bathtub”-shaped hazard over the lives of the members
of the population. The “bathtub curve” is shown in Figure 4.1. The shape is
intended to illustrate the view that aging in a device population proceeds
through phases. Early in the lives of the devices, failures occur at a rela-
tively high rate. This “infant mortality period” is often attributed to the
failure of members of the population that are “weak” as a result of material
flaws, manufacturing defects, or other physical anomalies. Following the
“early life” or “infant mortality” period, the device population proceeds
through the “functional life period” during which the hazard function is
relatively low and reasonably stable. Finally, toward the end of the lives of
the population members, survivors fail with an increasing rate as a conse-
quence of “wear out.”
Actuarial curves for humans and other biological populations often display
the bathtub shape, so the analogy to human mortality is often informative.
It is also interesting that early life
failure behavior has been observed so extensively that most durable goods
manufacturers include some sort of run-in as part of their product testing
activities. In addition, it has long been common for government and mili-
tary procurement policies to mandate run-in efforts as a condition of sale for
equipment suppliers.
The concept of the bathtub curve has been discussed and debated widely
by reliability analysts. Some authors such as Wong and Lindstrom [22] argue
that device populations actually comprise numerous subpopulations, each
FIGURE 4.1
Example of a bathtub curve (hazard function zT(t) versus time).
Definition 4.1
\[ \frac{d}{dt} z_T(t) \ge 0, \qquad 0 \le t < \infty \tag{4.6} \]
\[ \bar{F}_T(t + \tau \mid \tau) = \frac{\bar{F}_T(t + \tau)}{\bar{F}_T(\tau)} \]
be nonincreasing in τ for all t ≥ 0. Note that in the earlier definition and the
ones that follow, the terminology “failure rate” is used. This is done to show
the correspondence to the abbreviations, which were defined by Barlow
et al. [23] at a time when the ambiguity in the terminology had not yet been
recognized.
In any case, the previous expression says that if the conditional survival
probability is a nonincreasing function of age, then the rate at which failures
occur is increasing and the life distribution is IFR. By similar reasoning, we
obtain the following.
Definition 4.2
\[ \frac{d}{dt} z_T(t) \le 0, \qquad 0 \le t < \infty \tag{4.7} \]
The alternate condition for the DFR classification is that \(\bar{F}_T(t + \tau \mid \tau)\) be
nondecreasing in τ for all t ≥ 0. The following is the third possible form for the
hazard function:
Definition 4.3
\[ \bar{F}_T(t + \tau \mid \tau) = \bar{F}_T(t) \]
for all t ≥ 0. This is an interesting special case that is examined further in the
next section.
There are situations in which the conditions for designation as IFR or DFR
are only partially met. For these cases, we consider the following situations:
Definition 4.4
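\[ F_T(t) = 1 - e^{-\lambda t} \tag{4.11} \]
\[ f_T(t) = \lambda e^{-\lambda t} \tag{4.12} \]
\[ z_T(t) = \frac{f_T(t)}{\bar{F}_T(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda \tag{4.13} \]
\[ \frac{d}{dt} F_T(t) = f_T(t) = \lambda \bar{F}_T(t) \]
\[
\Pr[T > t + \tau \mid T > \tau] = \bar{F}_T(t + \tau \mid \tau) = \frac{\bar{F}_T(t + \tau)}{\bar{F}_T(\tau)}
= \frac{e^{-\lambda (t + \tau)}}{e^{-\lambda \tau}} = e^{-\lambda t} = \bar{F}_T(t)
\]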
The interpretation of this result is that a used device has the same reli-
ability as a new one. Clearly, this is quite contrary to intuition and is
unlikely to be true of most devices. The lack of memory feature of the
exponential model is therefore a weakness in its representation of real
equipment.
One final observation concerning the exponential model is the fact that
the life distribution of a series system comprising independent components,
each of which has an exponential life distribution, is exponential. That is,
\[
R_s(t) = \prod_{i=1}^{n} \bar{F}_{T_i}(t) = \prod_{i=1}^{n} e^{-\lambda_i t}
= e^{-\sum_{i=1}^{n} \lambda_i t} = e^{-\left( \sum_{i=1}^{n} \lambda_i \right) t}
\]
Note further that this expression confirms the fact that the system-level haz-
ard function for a series system of independent components is computed as
the sum of the component hazards.
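A brief numerical illustration of this closure property follows. The three hazard rates and the evaluation time are hypothetical values, and the function name is an illustrative choice.

```python
import math


def series_survivor(lams, t):
    """Product of the exponential component survivor functions at time t."""
    return math.prod(math.exp(-lam * t) for lam in lams)


lams = [0.001, 0.002, 0.0005]        # hypothetical component hazard rates
t_eval = 200.0

product_form = series_survivor(lams, t_eval)
single_exponential = math.exp(-sum(lams) * t_eval)  # exponential with rate sum(lams)
print(product_form, single_exponential)
```

The two values agree to floating-point precision, which is the statement that the system behaves as a single exponential component with hazard equal to the sum of the component hazards.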
\[
F_T(t) = 1 - e^{-\left( \frac{t - \delta}{\theta - \delta} \right)^{\beta}}
\tag{4.14}
\]
For this three-parameter form of the distribution function, the parameter δ is
a minimum life parameter that is often assumed to have a value of zero.
The interpretation of the parameter δ is that it is the time before which no
failures occur. When expressed in this manner, it seems reasonable to set
δ = 0. On the other hand, if the “time variable” is actually cycles to failure
or applied force in the case of a mechanical component, δ > 0 may be an
appropriate feature of the failure model. For example, a tensile specimen
made of steel will not fail when subjected to forces of 10–20 newtons. To
model the dispersion in failure strength of such specimens, a minimum
applied force of perhaps δ = 100 newtons might be appropriate. When the
minimum life parameter is nonzero, the distribution function appears as is
shown in Figure 4.2.
FIGURE 4.2
Three-parameter Weibull life distribution (FT(t) versus t, with the curve starting at t = δ).
\[ F_T(t) = 1 - e^{-\left( \frac{t}{\theta} \right)^{\beta}} \tag{4.15} \]
\[ F_T(t = \theta) = 1 - e^{-1} = 0.632 \]
\[ z_T(t) = \frac{\beta t^{\beta - 1}}{\theta^{\beta}} \tag{4.16} \]
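The Weibull expressions are simple to evaluate directly. The sketch below implements the distribution function, with the minimum life parameter δ defaulting to zero, and the two-parameter hazard of Equation 4.16. The parameter values used in the loop are hypothetical.

```python
import math


def weibull_cdf(t, theta, beta, delta=0.0):
    """Weibull distribution function; delta is the minimum life parameter."""
    if t <= delta:
        return 0.0                    # no failures occur before delta
    return 1.0 - math.exp(-((t - delta) / (theta - delta))**beta)


def weibull_hazard(t, theta, beta):
    """Two-parameter Weibull hazard function of Equation 4.16 (t > 0)."""
    return beta * t**(beta - 1) / theta**beta


# Regardless of the shape parameter, F_T(theta) = 1 - exp(-1).
for beta in (0.5, 1.0, 3.0):
    print(beta, weibull_cdf(100.0, 100.0, beta), weibull_hazard(50.0, 100.0, beta))
```

Evaluating the hazard at a few increasing times for each shape value shows the decreasing, constant, and increasing behavior for β < 1, β = 1, and β > 1.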
FIGURE 4.3
Weibull hazard functions (zT(t) versus time for β > 1, β = 1, and β < 1).
distribution can only have one of the three forms and the forms corre-
spond to the cases of negative, unconstrained in sign, and positive random
variables. The form for nonnegative random variables has the Weibull as a
representative case. In addition, the variable Y = ln T, where T has a Weibull
distribution, has an extreme distribution of the form
\[ F_Y(y) = 1 - e^{-e^{\beta (y - \ln \theta)}} \tag{4.17} \]
\[ z_T(t) = \frac{2t}{\theta^2} \]
\[ f_T(t) = \frac{e^{-(t - \mu)^2 / 2\sigma^2}}{\sqrt{2 \pi \sigma^2}} \tag{4.18} \]
prove that fact as follows. Start with the definition of the hazard function as
given in Equation 4.4 and take the derivative of the hazard function to obtain
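\[
\frac{d}{dt} z_T(t) = \frac{f_T'(t)\,\bar{F}_T(t) - f_T(t)\{-f_T(t)\}}{\{\bar{F}_T(t)\}^2}
= \frac{\bar{F}_T(t)\,f_T'(t) + f_T^2(t)}{\bar{F}_T^2(t)}
\]
For the normal density, \( f_T'(t) = -\frac{(t - \mu)}{\sigma^2} f_T(t) \). Note that
\( \bar{F}_T^2(t) > 0 \) so \( \frac{d}{dt} z_T(t) \ge 0 \) if
\( -\frac{(t - \mu)}{\sigma^2} f_T(t)\,\bar{F}_T(t) + f_T^2(t) \ge 0 \). When t < μ,
\[
-\frac{(t - \mu)}{\sigma^2} > 0, \qquad f_T(t) \ge 0, \qquad \bar{F}_T(t) > 0, \qquad f_T^2(t) \ge 0
\]
so
\[ \frac{d}{dt} z_T(t) \ge 0 \]
When t ≥ μ,
\[
-\frac{(t - \mu)}{\sigma^2} f_T(t)\,\bar{F}_T(t) + f_T^2(t)
= \left( -\frac{(t - \mu)}{\sigma^2} \bar{F}_T(t) + f_T(t) \right) f_T(t)
\]
and
\[ f_T(t) \ge 0 \]
so the question is whether or not \( -\frac{(t - \mu)}{\sigma^2} \bar{F}_T(t) + f_T(t) \ge 0 \).
To decide this, we recall that
\[ \bar{F}_T(t) = \int_t^{\infty} f_T(x)\,dx \]
and that
\[ t \int_t^{\infty} f_T(x)\,dx < \int_t^{\infty} x f_T(x)\,dx \]
Therefore,
\[
\begin{aligned}
\frac{(t - \mu)}{\sigma^2} \bar{F}_T(t) &= \frac{(t - \mu)}{\sigma^2} \int_t^{\infty} f_T(x)\,dx
< \int_t^{\infty} \frac{(x - \mu)}{\sigma^2} f_T(x)\,dx \\
&= \int_t^{\infty} \frac{(x - \mu)}{\sigma^2} \frac{e^{-(x - \mu)^2 / 2\sigma^2}}{\sqrt{2 \pi \sigma^2}}\,dx
= -f_T(x) \Big|_t^{\infty} = f_T(t)
\end{aligned}
\]
and hence,
\[ \frac{(t - \mu)}{\sigma^2} \bar{F}_T(t) < f_T(t) \]
This inequality implies that \( -\frac{(t - \mu)}{\sigma^2} \bar{F}_T(t) + f_T(t) > 0 \) so the derivative of the
normal hazard function is nonnegative and the hazard function is increasing.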
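The conclusion can be spot-checked numerically. The sketch below evaluates the normal hazard on a grid spanning four standard deviations on each side of the mean and also evaluates the key inequality of the proof for t ≥ μ. The parameter values μ = 2000 and σ = 300 are borrowed from Exercise 4.13 purely for illustration, and the survivor function is computed with the complementary error function.

```python
import math


def normal_pdf(t, mu, sigma):
    return math.exp(-(t - mu)**2 / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)


def normal_survivor(t, mu, sigma):
    """Survivor function of the normal distribution via erfc."""
    return 0.5 * math.erfc((t - mu) / (sigma * math.sqrt(2)))


def normal_hazard(t, mu, sigma):
    return normal_pdf(t, mu, sigma) / normal_survivor(t, mu, sigma)


mu, sigma = 2000.0, 300.0            # illustrative values only
grid = [mu + 0.25 * k * sigma for k in range(-16, 17)]
hazards = [normal_hazard(t, mu, sigma) for t in grid]

# Key inequality of the proof, checked on the part of the grid with t >= mu.
inequality_holds = all(
    (t - mu) / sigma**2 * normal_survivor(t, mu, sigma) < normal_pdf(t, mu, sigma)
    for t in grid if t >= mu
)
print(all(a < b for a, b in zip(hazards, hazards[1:])), inequality_holds)
```

A grid check of this kind does not replace the proof; it only confirms that the computed hazard values behave as the argument predicts over the range examined.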
As is discussed in Chapter 5, the normal distribution is often consid-
ered a very appropriate model for the reliability of structural components.
On the other hand, many reliability analysts have resisted the use of the
normal distribution as a life length model for time-indexed ages because
of the fact that the normal is defined over the entire real line, negative as
well as positive.
Note the appearance of the time variable in the denominator of this function.
It also appears that the lognormal is an appealing distribution indepen-
dent of the issues related to positive random variables. The life lengths of
quite a few microelectronic components have been found to be well modeled
by the lognormal distribution.
\[ f_T(t) = \frac{\lambda^{\beta}}{\Gamma(\beta)} t^{\beta - 1} e^{-\lambda t} \tag{4.20} \]
The distribution function can be stated in closed form only if the shape
parameter, β, is an integer. In this case, the distribution function is
\[
F_T(t) = \sum_{k=\beta}^{\infty} \frac{(\lambda t)^k}{k!} e^{-\lambda t}
= 1 - \sum_{k=0}^{\beta - 1} \frac{(\lambda t)^k}{k!} e^{-\lambda t}
\tag{4.21}
\]
Note that when the shape parameter is an integer, the gamma distribution is
usually called the Erlang (or Erlang-β) distribution.
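For integer shape parameters, Equation 4.21 can be evaluated with a finite sum. The sketch below implements that sum and the density of Equation 4.20. The values β = 3 and λ = 0.02 are taken from Exercise 4.4 for illustration, and the evaluation time is a hypothetical choice.

```python
import math


def gamma_pdf(t, lam, beta):
    """Gamma density of Equation 4.20."""
    return lam**beta / math.gamma(beta) * t**(beta - 1) * math.exp(-lam * t)


def erlang_cdf(t, lam, beta):
    """Equation 4.21; valid only when beta is a positive integer."""
    partial = sum((lam * t)**k / math.factorial(k) for k in range(beta))
    return 1.0 - partial * math.exp(-lam * t)


lam, beta = 0.02, 3                  # values from Exercise 4.4, used as an example
t_eval = 100.0                       # hypothetical evaluation time
print(erlang_cdf(t_eval, lam, beta))
print(gamma_pdf(t_eval, lam, beta) / (1.0 - erlang_cdf(t_eval, lam, beta)))  # hazard
```

Setting beta = 1 in `erlang_cdf` reproduces the exponential distribution function, which is a quick check on the implementation.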
As in the case of the Weibull, the gamma distribution model displays increas-
ing hazard when β > 1, decreasing hazard when β < 1, and constant hazard when
β = 1. Thus, setting β = 1 collapses the gamma distribution to the exponential.
The gamma has the disadvantage of being rather difficult algebraically,
but it has the advantage that it arises naturally as the convolution of identi-
cal exponential distributions. It therefore has considerable practical appeal.
Strategies for the numerical evaluation of gamma functions and gamma dis-
tributions are provided in Appendix A.
\[
F_T(t) = 1 - e^{-\left[ \alpha t + \frac{\beta}{\gamma} \left( e^{\gamma t} - 1 \right) \right]}
\tag{4.22}
\]
\[ z_T(t) = \alpha + \beta e^{\gamma t} \tag{4.23} \]
so there are three parameters that can be selected to provide whatever type
of model is desired. More importantly, the parameters can be selected so that
the function matches failure data quite well. For this reason, the Makeham
distribution [29] is widely used in actuarial studies. Note that it also corre-
sponds to an extreme value type of distribution.
Still another model that is based on actuarial data analysis is the Gompertz
distribution [29] for which the hazard function is
lbt (4.24)
While this form is rather intricate, the rationale for its construction is that the
reciprocal of the hazard function should be decreasing. That is,
\[ \frac{d}{dt} \left( \frac{1}{z_T(t)} \right) < 0 \]
which implies increasing hazard. For those working with actuarial data, this
seems a reasonable way to treat such behavior and the distribution has been
adopted by some reliability specialists for the same reason.
One final model that is particularly worth mentioning is the one suggested
by Hjorth [30].
He calls it the IDB (Increasing, Decreasing, Bathtub) distribution because
depending on the choice of parameter values, it can have increasing, decreas-
ing, or bathtub-shaped hazard function. The general statement of the IDB
distribution is
\[
F_T(t) = 1 - \frac{e^{-\delta t^2 / 2}}{(1 + \beta t)^{\theta / \beta}}
\tag{4.26}
\]
\[ z_T(t) = \delta t + \frac{\theta}{1 + \beta t} \tag{4.27} \]
In summary, we might note that there are very many types of equipment
for which life distributions provide a meaningful model of life duration. The
possible choices are comparably wide. The distributions described earlier are
the principal but not the only distributions used to model life length. Each
has advantages and each has shortcomings. The key is to select one that is
appropriate for its application.
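\[
\bar{F}_T(t) = \coprod_{i=1}^{3} \bar{F}_{T_i}(t) = 1 - \prod_{i=1}^{3} \left( 1 - \bar{F}_{T_i}(t) \right)
= 1 - \prod_{i=1}^{3} \left( 1 - e^{-\lambda_i t} \right)
\]
\[
\begin{aligned}
f_T(t) ={}& \lambda_1 e^{-\lambda_1 t} + \lambda_2 e^{-\lambda_2 t} + \lambda_3 e^{-\lambda_3 t}
- (\lambda_1 + \lambda_2) e^{-(\lambda_1 + \lambda_2) t} - (\lambda_1 + \lambda_3) e^{-(\lambda_1 + \lambda_3) t} \\
& - (\lambda_2 + \lambda_3) e^{-(\lambda_2 + \lambda_3) t} + (\lambda_1 + \lambda_2 + \lambda_3) e^{-(\lambda_1 + \lambda_2 + \lambda_3) t}
\end{aligned}
\]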
FIGURE 4.4
Hazard function for a three (exponential)-component parallel system (zT(t) versus time).
and the hazard function is the ratio of these two functions. A plot of the haz-
ard function for the case in which the parameters are normalized to sum to
one (λ1 = 0.6, λ2 = 0.3, λ 3 = 0.1) is shown in Figure 4.4.
Naturally, the specific behavior of the hazard function depends upon the
values of the parameters of the life distributions of the components and upon
the specific system structure. Nevertheless, we may conclude that IFR behav-
ior is not preserved when IFR components are combined to form a system.
The same is true for DFR components.
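The loss of monotonicity can be seen by evaluating the hazard of the three-component parallel system at a few times. The sketch below uses the parameter values quoted in the text (λ1 = 0.6, λ2 = 0.3, λ3 = 0.1) and obtains the density with the product rule, which is algebraically equivalent to the expanded sum of exponential terms. The function names and the evaluation times are illustrative choices.

```python
import math

LAMS = (0.6, 0.3, 0.1)               # parameter values quoted in the text


def survivor(t):
    """Parallel system: one minus the product of component failure probabilities."""
    return 1.0 - math.prod(1.0 - math.exp(-lam * t) for lam in LAMS)


def density(t):
    """Derivative of prod(1 - exp(-lam_i t)) obtained with the product rule."""
    total = 0.0
    for i, li in enumerate(LAMS):
        term = li * math.exp(-li * t)
        for j, lj in enumerate(LAMS):
            if j != i:
                term *= 1.0 - math.exp(-lj * t)
        total += term
    return total


def hazard(t):
    return density(t) / survivor(t)


for t in (1.0, 10.0, 30.0):
    print(t, hazard(t))
```

The hazard is small early in life, rises above the smallest component rate, and then settles back toward that rate, so it is not monotone even though each component has a constant hazard.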
In view of the previous results concerning the aggregation of IFR compo-
nents, it is clear that for any system, the behavior of the system-level hazard
function and the system-level reliability function should be examined care-
fully. The approach most likely to lead to a successful investigation of the
hazard function at the system level is to form the system reliability function
and to differentiate it, either algebraically or numerically, and to then evalu-
ate the expression
\[ z_T(t) = \frac{-\frac{d}{dt} \bar{F}_T(t)}{\bar{F}_T(t)} \tag{4.29} \]
\[
\begin{aligned}
\bar{F}_T(t) &= e^{-Z_{T_1}(t)} \left( e^{-Z_{T_2}(t)} + e^{-Z_{T_3}(t)} - e^{-Z_{T_2}(t) - Z_{T_3}(t)} \right) \\
&= e^{-Z_{T_1}(t) - Z_{T_2}(t)} + e^{-Z_{T_1}(t) - Z_{T_3}(t)} - e^{-Z_{T_1}(t) - Z_{T_2}(t) - Z_{T_3}(t)}
\end{aligned}
\]
FIGURE 4.5
System example.
\[
\begin{aligned}
f_T(t) ={}& (z_{T_1}(t) + z_{T_2}(t)) e^{-Z_{T_1}(t) - Z_{T_2}(t)} + (z_{T_1}(t) + z_{T_3}(t)) e^{-Z_{T_1}(t) - Z_{T_3}(t)} \\
& - (z_{T_1}(t) + z_{T_2}(t) + z_{T_3}(t)) e^{-Z_{T_1}(t) - Z_{T_2}(t) - Z_{T_3}(t)}
\end{aligned}
\]
and the hazard function is the ratio of those two expressions. If all three
components have Weibull life distributions with θ1 = 10, β1 = 1.25, θ 2 = θ 3 = 8,
and β2 = β 3 = 2.25, the system-level hazard function is shown in Figure 4.6.
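The numerical route suggested by Equation 4.29 can be sketched for this example. The code below builds the survivor function for component 1 in series with the parallel pair of components 2 and 3, using the Weibull parameters quoted above, differentiates it with a central difference, and compares the result with the ratio of the analytic density to the survivor function. The step size and function names are illustrative assumptions.

```python
import math

# (theta, beta) for components 1, 2, 3, as quoted in the text.
PARAMS = {1: (10.0, 1.25), 2: (8.0, 2.25), 3: (8.0, 2.25)}


def Z(i, t):
    """Weibull cumulative hazard of component i."""
    theta, beta = PARAMS[i]
    return (t / theta)**beta


def z(i, t):
    """Weibull hazard of component i."""
    theta, beta = PARAMS[i]
    return beta * t**(beta - 1) / theta**beta


def survivor(t):
    return math.exp(-Z(1, t)) * (
        math.exp(-Z(2, t)) + math.exp(-Z(3, t)) - math.exp(-Z(2, t) - Z(3, t)))


def hazard_numeric(t, h=1e-5):
    """Equation 4.29 with a central-difference derivative."""
    return -(survivor(t + h) - survivor(t - h)) / (2 * h) / survivor(t)


def hazard_analytic(t):
    density = ((z(1, t) + z(2, t)) * math.exp(-Z(1, t) - Z(2, t))
               + (z(1, t) + z(3, t)) * math.exp(-Z(1, t) - Z(3, t))
               - (z(1, t) + z(2, t) + z(3, t)) * math.exp(-Z(1, t) - Z(2, t) - Z(3, t)))
    return density / survivor(t)


for t in (1.0, 2.0, 5.0, 8.0, 10.0):
    print(t, hazard_numeric(t), hazard_analytic(t))
```

Agreement between the two columns gives some confidence in both the algebra and the numerical differentiation before the approach is applied to structures for which no analytic density is available.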
Finally, before leaving the discussion of life distributions, let us con-
sider the standby redundant component configuration. As mentioned in
Chapter 2, this is a system structure that does not fall within the set of
structural forms enumerated and is one that is used frequently. The system
FIGURE 4.6
System-level hazard function (zT(t) versus time).
FIGURE 4.7
Standby redundant system.
\[
F_{T_S}(t) = (1 - p) F_{T_1}(t) + p \int_0^t f_{T_1}(u) F_{T_2}(t - u)\,du
\tag{4.30}
\]
The rationale for this construction is that the system survives as long as the
first component if the switch fails and as long as the sum of the failure times
of the two components if the switch functions properly. As a special case of
this model, setting p = 1 represents the case of the perfect switch. One may
also observe that for this model, if FT1 (t) and FT2 (t) are both IFR, then the sys-
tem life distribution is also IFR.
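Equation 4.30 can be evaluated numerically for any pair of component distributions. The sketch below applies the trapezoidal rule to the convolution integral and uses two identical exponential components with λ = 0.001, the value given in Exercise 4.21, so that the result can be compared with a closed form. The function names and step count are illustrative assumptions.

```python
import math

LAM = 0.001                          # value from Exercise 4.21, used as an example


def exp_pdf(u):
    return LAM * math.exp(-LAM * u)


def exp_cdf(t):
    return 1.0 - math.exp(-LAM * t)


def standby_cdf(t, p, f1, F1, F2, steps=2000):
    """Equation 4.30 with the convolution evaluated by the trapezoidal rule."""
    h = t / steps
    conv = 0.5 * (f1(0.0) * F2(t) + f1(t) * F2(0.0))
    for k in range(1, steps):
        u = k * h
        conv += f1(u) * F2(t - u)
    conv *= h
    return (1.0 - p) * F1(t) + p * conv


for p in (0.50, 0.80, 1.0):
    print(p, standby_cdf(1000.0, p, exp_pdf, exp_cdf, exp_cdf))
```

For identical exponential components the convolution term is the Erlang distribution with shape parameter 2, namely 1 - e^{-λt}(1 + λt), which provides an independent check on the numerical integral.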
To conclude this chapter, it is appropriate to note that there are many
other models that have been defined for particular types of equipment with
specific types of operating profiles. The models discussed in this chapter
represent the majority but not all of the useful models.
Exercises
4.1 Show that an IFR distribution can be characterized as one for which
\(-\ln \bar{F}_T(t)\) is convex and a DFR distribution can be characterized as one
for which \(-\ln \bar{F}_T(t)\) is concave.
4.2 Construct the life distribution for a component having hazard function
zT(t) = 1.5t.
4.3 Construct an algebraic expression for the hazard function for a gamma
distribution having shape parameter β = 3.0.
4.4 Plot the hazard function for a gamma distribution having β = 3.0 and
λ = 0.02.
4.5 Show that an exponential life distribution has \(\bar{F}_T(t_1 + t_2) = \bar{F}_T(t_1)\,\bar{F}_T(t_2)\).
4.6 Compute the values of the reliability function and the hazard function at
t = 2500 hours for a Weibull distribution having β = 1.8 and θ = 4000 hours.
4.7 Plot the density function for a Weibull distribution having β = 3.2 and
θ = 500 hours.
4.8 Determine the expression for the mean of the Weibull distribution.
4.9 Determine the expression for the mean of the gamma distribution.
4.10 Compute the values of the reliability function and the hazard function
at t = 6000 hours for a gamma distribution having β = 3.0 and λ = 0.004
per hour.
4.11 Use the numerical approximation of Appendix A to compute Γ(7.45).
Compare the value you obtain to Γ(7) = 6! and Γ(8) = 7!.
4.12 Compute the value of the reliability function at t = 6000 hours for a
gamma distribution having β = 2.75 and λ = 0.004 per hour.
4.13 Compute the value of the reliability function at 1500 hours for a device
having a normal life distribution with μ = 2000 hours and σ = 300 hours.
4.14 Consider a device for which the life length is well modeled by a log-
normal distribution having a mean life of 150 hours and a standard
deviation in life length of 4.5 hours. Compute the reliability for this
device at 20.0 hours. Then, plot the hazard function for the device.
Finally, characterize the point at which the hazard function changes
from increasing to decreasing.
4.15 Use the approximations in Appendix A to compute the reliability at
4500 hours of a component for which the life distribution is normal
with μ = 4000.0 and σ = 250.0.
4.16 Construct an algebraic expression for the mean life of a parallel system
of two independent components, each of which has an exponential life
distribution.
4.17 Construct an algebraic expression for the reliability function and the
system hazard function for a two-out-of-three system comprising
identical components each having an exponential life distribution. Plot
the hazard function for the case in which λ = 0.05.
4.18 Construct the algebraic expression for the system-level hazard func-
tions for the structure:
1 2
4.19 Construct algebraic expressions for the reliability and hazard functions
for the following two systems. Compare several of the values of these
functions for the case in which all components have Weibull life distri-
butions and in both structures β1 = β2 = 2.5, θ1 = θ2 = 100, β3 = β4 = 1.5,
and θ3 = θ4 = 200.
1 3 1 3
2 4 2 4
4.20 In general, for a distribution function FT(t), tγ is the γth quantile of the
distribution if FT(tγ) = γ. Determine the 0.90 quantile and the 0.99 quan-
tile for the Weibull distribution having β = 1.5 and θ = 20,000 and for the
exponential distribution having λ = 0.001.
4.21 Analyze the standby redundant system for two identical exponential
components having λ = 0.001 for the cases of p = 0.50, 0.80, and 1.0. Plot
the resulting distribution function and its hazard function.
4.22 Consider a set of n independent spot lamps that are installed to illu-
minate a parking lot. Suppose the life length of each lamp is well
modeled by a Weibull distribution having β = 1.8 and θ = 450 hours.
How many lamps must be installed in order to have a probability of
at least 0.99 that at least one lamp is still functioning after 600 hours
of operation?
4.23 Consider a parallel system of two independent but unequal compo-
nents, each of which has an exponential life distribution. Construct
expressions for the reliability function and the hazard function for this
system. If the components have λ1 = 0.005 and λ2 = 0.004, compute the
system reliability at 250 hours.
4.24 Consider the two-unit standby system shown in Figure 4.7. Suppose
the switch is perfect but the two components are independent and
unequal. If the two components have exponential life distributions
with λ1 = 0.008 and λ2 = 0.003, compute the system reliability at 400 and
at 750 hours.
4.25 Consider a three-independent-component standby system for which
the switch is perfect. Construct a general model for the life length of
this system. Suppose the three components are identical and have
exponential life distributions with λ = 0.0005. Compute the system reli-
ability at 1500 and 8000 hours.
4.26 Consider a three-independent-component standby system for which
the switch is imperfect and has a probability of p of proper function
each time it is activated. Construct a general model for the life length
of this system. Suppose the three components are identical and have
exponential life distributions with λ = 0.0004 and p = 0.98. Compute the
system reliability at 2500 hours.
4.27 Suppose the previous system was designed simply as a three-unit par-
allel system. Compute the system reliability at 2500 hours. For what
values of the switch reliability, p, is the parallel system more reliable at
2500 hours than the standby system?
4.28 (Extension) Consider a shared load system of two identical components.
As long as both components are functioning, each has an exponential
life distribution with λ = 0.0025. When one of the components fails, the
remaining component must support the load initially carried by both
so its hazard increases to λ = 0.0075. Construct a model for the life dis-
tribution of this system and compute its reliability at 500 hours.
5
Failure Processes
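\[
\bar{F} = \Pr[X > Y] = \int_{-\infty}^{\infty} \int_{y}^{\infty} h_Y(y)\,g_X(x)\,dx\,dy
= \int_{-\infty}^{\infty} \int_{-\infty}^{x} h_Y(y)\,g_X(x)\,dy\,dx
\tag{5.1}
\]
\[
\bar{F} = \Pr[X > Y] = \int_{-\infty}^{\infty} h_Y(y) \left( 1 - G_X(y) \right) dy
= \int_{-\infty}^{\infty} H_Y(x)\,g_X(x)\,dx
\tag{5.2}
\]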
\[
= \int_{-\infty}^{\infty} \Phi\!\left( \frac{x - \mu_y}{\sigma_y} \right)
\frac{1}{\sigma_x(t)}\,\phi\!\left( \frac{x - \mu_x(t)}{\sigma_x(t)} \right) dx
\tag{5.3}
\]
where
μy and σy are the constant parameters of the normal stress distribution
μx(t) and σx(t) are the time-dependent parameters of the strength distribution
ϕ denotes the standard normal density
Φ represents the cumulative distribution for the standard normal
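At a fixed time, the interference integral can be evaluated by simple quadrature. The sketch below writes the strength density as the standard normal density ϕ((x - μx)/σx) divided by σx, integrates its product with the stress distribution function by the trapezoidal rule, and compares the result with the known closed form for independent normal variables, Φ((μx - μy)/sqrt(σx^2 + σy^2)). All four parameter values are hypothetical, because the parameters of the text's numerical example are not reproduced here.

```python
import math


def std_normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(u):
    return 0.5 * math.erfc(-u / math.sqrt(2.0))


def interference_reliability(mu_x, sigma_x, mu_y, sigma_y, steps=4000, width=8.0):
    """Pr[strength X > stress Y] by trapezoidal integration over x."""
    lo = mu_x - width * sigma_x
    hi = mu_x + width * sigma_x
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        strength_density = std_normal_pdf((x - mu_x) / sigma_x) / sigma_x
        total += weight * std_normal_cdf((x - mu_y) / sigma_y) * strength_density
    return total * h


# Hypothetical strength and stress parameters at one instant in time.
numeric = interference_reliability(mu_x=100.0, sigma_x=4.0, mu_y=90.0, sigma_y=3.0)
closed_form = std_normal_cdf((100.0 - 90.0) / math.sqrt(4.0**2 + 3.0**2))
print(numeric, closed_form)
```

To trace reliability over time, the same function could be called repeatedly with time-dependent values of the strength mean and standard deviation.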
FIGURE 5.1
Basic stress–strength interference model.
FIGURE 5.2
Time evolution of the strength distribution (probability density versus strength and time).
Note that the distribution on stress lies below that on strength so that, in general,
the strength values will probably exceed the stress values.
Next, suppose that the “decay” parameters are α = 0.005 and β = 0.002.
With these values, the gradual deterioration of the device strength is rep-
resented by a gradual change in the center and the width of the strength
distribution. This is illustrated in Figure 5.2. At each point in time, the cor-
responding “slice” of the distribution corresponds to the strength distribu-
tion at that time.
Consider the interference at time values of t = 250 and 500. The corresponding
plots are shown in Figure 5.3. Clearly, as the mean of the
strength distribution declines, the probability that strength exceeds stress
also decreases. Because both the strength and the stress distributions are
assumed to be normal, the distribution values and the reliability must be
evaluated numerically. Table 5.1 lists values of the reliability function of
Equation 5.3 for this numerical example.
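When both distributions are normal, the integral in Equation 5.3 reduces to the standard closed form Φ((μx(t) − μy)/√(σx(t)² + σy²)), which gives a convenient check on any numerical integration. The following Python sketch evaluates both. The linear decay forms μx(t) = μx(0) − αt and σx(t) = σx(0) + βt and the values μy = 100, σy = 1, and σx(0) = 1 are illustrative assumptions, so the output is not expected to reproduce Table 5.1.

```python
import math

def std_normal_cdf(z):
    """Standard normal cumulative distribution, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def std_normal_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Assumed (hypothetical) parameter values, for illustration only.
MU_Y, SIGMA_Y = 100.0, 1.0       # stress distribution
MU_X0, SIGMA_X0 = 102.5, 1.0     # strength distribution at t = 0
ALPHA, BETA = 0.005, 0.002       # "decay" parameters

def strength_params(t):
    """Assumed linear drift of the strength mean and standard deviation."""
    return MU_X0 - ALPHA * t, SIGMA_X0 + BETA * t

def reliability_closed_form(t):
    """Pr[strength > stress] for two independent normal variables."""
    mu_x, sigma_x = strength_params(t)
    return std_normal_cdf((mu_x - MU_Y) / math.hypot(sigma_x, SIGMA_Y))

def reliability_numeric(t, n=4000, width=10.0):
    """Trapezoidal evaluation of the interference integral."""
    mu_x, sigma_x = strength_params(t)
    lo, hi = mu_x - width * sigma_x, mu_x + width * sigma_x
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        stress_below_x = std_normal_cdf((x - MU_Y) / SIGMA_Y)
        strength_density = std_normal_pdf((x - mu_x) / sigma_x) / sigma_x
        total += weight * stress_below_x * strength_density
    return total * h
```

Calling `reliability_closed_form(t)` and `reliability_numeric(t)` for the same `t` should agree to several decimal places, which is a quick sanity check before trusting the numerical routine with non-normal distributions.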
A plot of the reliability function over the same range is shown in Figure 5.4.
Once the numerical analysis of the model has been performed, one can fit
a distribution to the reliability function as desired. For example, the function
represented in Figure 5.4 (Table 5.1) is well represented by a Weibull distribu-
tion with parameters β = 1.083 and θ = 1204.
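As a rough check on the quoted fit, the sketch below evaluates the Weibull survivor function exp(−(t/θ)^β) with β = 1.083 and θ = 1204 at the times of Table 5.1 and lists it beside the tabulated values. The helper names are illustrative, and how well the two columns agree is left for the reader to judge.

```python
import math

# Values transcribed from Table 5.1 (time: reliability).
TABLE_5_1 = {100: 0.933, 200: 0.890, 500: 0.702, 750: 0.531, 1000: 0.391,
             1500: 0.217, 2000: 0.132, 2500: 0.088, 3000: 0.064,
             4000: 0.039, 5000: 0.028}

BETA, THETA = 1.083, 1204.0  # fitted Weibull parameters quoted in the text

def weibull_reliability(t, beta=BETA, theta=THETA):
    """Weibull survivor function."""
    return math.exp(-((t / theta) ** beta))

def compare():
    """Return (time, tabulated value, fitted value) triples."""
    return [(t, r, weibull_reliability(t)) for t, r in sorted(TABLE_5_1.items())]

for t, tabulated, fitted in compare():
    print(f"{t:5d}  {tabulated:.3f}  {fitted:.3f}")
```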
FIGURE 5.3
(a) Interference at t = 250. (b) Interference at t = 500.
TABLE 5.1
Example of Reliability Function Values
Time 100 200 500 750 1000 1500 2000 2500 3000 4000 5000
Reliability 0.933 0.890 0.702 0.531 0.391 0.217 0.132 0.088 0.064 0.039 0.028
FIGURE 5.4
Example reliability function for the interference model.

is offered by Gertsbakh and Kordonskiy [33], who was one of the pioneers in
the development of reliability theory in Europe. Gertsbakh suggests that
we consider a sequence of equipment actuations (or events), each of which
imposes a stress on a component of interest. As long as the stress is below a
threshold, the component does not fail; the first time the stress exceeds
the threshold, the component fails. As an example, he considers that each
time an airplane lands, the landing imposes a gravitational force on the
communications radio and that variations in weather conditions and pilot skill
imply substantial variation in the loads experienced. When a sufficiently
severe load occurs, the radio is damaged and fails. The life of the radio may
thus be measured in the number of landings, and if γ is the probability that
failure occurs on any landing, the distribution on life length is geometric, so
$$F_K(k) = \gamma \sum_{n=1}^{k} (1 - \gamma)^{n-1} = 1 - (1 - \gamma)^{k} \qquad (5.4)$$
gives the cumulative probability of failure on or before the kth landing. One
may then argue that for any individual landing, γ is quite small, so
$$(1 - \gamma)^{k} \approx e^{-k\gamma}, \qquad F_K(k) \approx 1 - e^{-\gamma k} \qquad (5.5)$$
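The quality of the approximation in Equation 5.5 is easy to examine numerically. The sketch below compares the geometric distribution of Equation 5.4 with its exponential approximation; the per-landing failure probability γ = 0.001 is a hypothetical value chosen only for illustration.

```python
import math

def geometric_cdf(k, gamma):
    """Closed form of Equation 5.4: failure on or before the kth landing."""
    return 1.0 - (1.0 - gamma) ** k

def geometric_cdf_by_sum(k, gamma):
    """The same quantity computed from the explicit sum in Equation 5.4."""
    return gamma * sum((1.0 - gamma) ** (n - 1) for n in range(1, k + 1))

def exponential_approx(k, gamma):
    """Equation 5.5: the small-gamma approximation."""
    return 1.0 - math.exp(-gamma * k)

GAMMA = 0.001  # hypothetical probability of failure on any one landing

for k in (100, 500, 1000, 5000):
    exact = geometric_cdf(k, GAMMA)
    approx = exponential_approx(k, GAMMA)
    print(f"k={k:5d}  exact={exact:.6f}  approx={approx:.6f}")
```

For small γ the two columns differ only in the fourth decimal place or beyond, which is why the exponential form is often used in place of the geometric one.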
$$\bar{F}_T(t) = \sum_{k=0}^{\infty} \frac{(\lambda t)^{k}}{k!}\,e^{-\lambda t}\,H_Y^{(k)}(L) \qquad (5.6)$$
Note that the sum is taken over all possible numbers of shocks and the notation
$H_Y^{(k)}(y)$ represents the k-fold convolution of $H_Y(y)$, that is, the
distribution of the sum of k damage magnitudes. Thus, Equation 5.6 represents
the probability that k shocks occur and their sum does not exceed the
strength/damage threshold L, summed over all possible values of k. By
convention, $H_Y^{(0)}(y) = 1$ for all values of y ≥ 0 and, of course,
$H_Y^{(1)}(y) = H_Y(y)$. Parenthetically, independent of the choice of
distribution $H_Y(y)$ for modeling shock magnitudes, Equation 5.6 will
correspond to an increasing failure rate on average (IFRA) distribution.
Consider an example. Assume that a device has an endurance threshold
L = μx(0) = 102.5 N/cm² as in the case of the stress–strength interference
model. Let $H_Y(y)$ be a normal distribution with μY = 50 N/cm² and σY =
1 N/cm² and assume λ = 0.004/h. Evaluating Equation 5.6 as a function of time
yields the reliability curve shown in Figure 5.5.
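For normally distributed shock magnitudes, the k-fold convolution is again normal, with mean kμY and standard deviation σY√k, so Equation 5.6 can be evaluated term by term. The following Python sketch does this for the parameters of the example; the truncation point `k_max` and the function names are assumptions made for illustration.

```python
import math

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def damage_cdf_k_fold(k, level, mu, sigma):
    """k-fold convolution of Normal(mu, sigma^2) evaluated at `level`.

    By convention the 0-fold convolution equals 1.
    """
    if k == 0:
        return 1.0
    return std_normal_cdf((level - k * mu) / (sigma * math.sqrt(k)))

def reliability(t, level=102.5, mu=50.0, sigma=1.0, rate=0.004, k_max=100):
    """Equation 5.6 with the Poisson sum truncated at k_max shocks."""
    lam_t = rate * t
    poisson_term = math.exp(-lam_t)          # Pr[N(t) = 0]
    total = 0.0
    for k in range(k_max + 1):
        total += poisson_term * damage_cdf_k_fold(k, level, mu, sigma)
        poisson_term *= lam_t / (k + 1)      # Pr[N(t) = k + 1]
    return total

for hours in (0, 100, 250, 500, 1000):
    print(hours, round(reliability(hours), 4))
```

With these parameters the threshold is crossed almost surely by the third shock, so the result is dominated by the probability of at most two shocks in (0, t].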
Note the conceptual duality of the cumulative damage model and the
stress–strength interference model. In the stress–strength interference
FIGURE 5.5
Reliability function for the cumulative damage model.
for a component in state j. The state variable represents the degree of deg-
radation of the device. For a failure threshold of state k and a random initial
state, say, i, the time to failure is described by
æ ö
æ k -1
ö k -1 ç 1 k -1 ÷ n m
FT (t|i , k ) = 1 - ç
ç
è
Õ å Õ
l =i
n
l ÷
÷
ø r =i
ç n
çr l =i
( l - r ) ÷ e - r at
n n
÷
(5.8)
ç l¹r ÷
è ø
This model is further enhanced by taking a Poisson mixture over the distance
(k − i) between the initial and final state.
Subsequent to its initial development, the shock model has been studied
extensively and enhanced in several ways. One of the first extensions is to
permit successive shocks to cause increasing amounts of damage. That is,
we continue to assume that the shocks occur according to a Poisson process
and that the damages caused by the shocks are independent, but successive
shocks are increasingly harmful, so the damage associated with the ith
shock, $Y_i$, has distribution $H_{Y_i}(y)$ with the values of the $H_{Y_i}(y)$ decreasing in i.
With this assumption, the survival function becomes
$$\bar{F}_T(t) = \sum_{k=0}^{\infty} \frac{(\lambda t)^{k}}{k!}\,e^{-\lambda t}\; H_{Y_0} * H_{Y_1} * \cdots * H_{Y_k}(L) \qquad (5.9)$$
where $H_{Y_0} * H_{Y_1} * \cdots * H_{Y_k}(L)$ denotes the convolution of the distributions.
Further extensions include (1) eliminating the assumption of indepen-
dence of the successive shocks, (2) allowing attenuation of damage (partial
healing) between shock events, and (3) treating the failure threshold as a
decreasing function of time or damage. In each case, the basic model format
is the same and the resulting life distribution is IFRA.
The cumulative damage models have been viewed as very useful and
have been applied to some very important problems. Perhaps the most
noteworthy application was the study by Birnbaum and Saunders [28] of
fatigue failures in aircraft fuselage. Their model is specific to fatigue failure
but has been used to define reinforcement designs for aircraft structures.
The original model for fatigue failure was defined by Miner [36] and is
known as Miner’s rule. The model is based on the idea that stress is applied
cyclically and that the magnitude of the imposed stress in each cycle falls
within one of a set of K possible intervals. Miner’s rule states that fatigue
failure will occur when
$$\sum_{k=1}^{K} \frac{n_k}{N_k} = 1 \qquad (5.10)$$
where
nk is the number of cycles in which the stress falls in the kth interval
Nk is the number of cycles that would cause failure if all stresses occurred
in the kth interval. Miner’s rule is deterministic
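The damage sum of Equation 5.10 is simple to compute. The sketch below uses a hypothetical load spectrum with three stress intervals; the cycle counts are invented for illustration and do not come from the text.

```python
def miner_damage(cycles_applied, cycles_to_failure):
    """Left-hand side of Equation 5.10: the sum of n_k / N_k over intervals."""
    if len(cycles_applied) != len(cycles_to_failure):
        raise ValueError("one N_k is required for every n_k")
    return sum(n / N for n, N in zip(cycles_applied, cycles_to_failure))

# Hypothetical spectrum: cycles experienced (n_k) and cycles to failure (N_k).
n_k = [20_000, 5_000, 1_000]
N_k = [100_000, 20_000, 2_000]

damage = miner_damage(n_k, N_k)        # 0.20 + 0.25 + 0.50
print(f"accumulated damage fraction: {damage:.2f}")
print("failure predicted" if damage >= 1.0 else "no failure predicted yet")
```

Under the rule, failure is predicted once the accumulated fraction reaches 1, regardless of the order in which the cycles were applied.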
occurs when Yn exceeds the failure threshold. The probability on the number
of cycles to failure is
$$\Pr[N \le n] = \Pr[Y_n \ge L] = 1 - \Pr\left[\sum_{i=1}^{n} X_i \le L\right] = 1 - \Pr\left[\sum_{i=1}^{n} \frac{X_i - \mu}{\sigma\sqrt{n}} \le \frac{L - n\mu}{\sigma\sqrt{n}}\right] \qquad (5.11)$$
and for large values of the number of cycles, application of the central limit
theorem implies that the distribution of $Y_n$ is approximately normal with
mean value nμ and variance nσ² so
$$\Pr[N \le n] \approx 1 - \Phi\!\left(\frac{L - n\mu}{\sigma\sqrt{n}}\right) \qquad (5.12)$$
FIGURE 5.6
(a) Density functions for the Birnbaum–Saunders distribution having a scale parameter 2.5.
(b) Hazard functions for the Birnbaum–Saunders distribution having a scale parameter 2.5.
(Curves shown for α = 0.75 and α = 1.25; horizontal axis: Time.)

$$F_T(t) = \Phi\!\left(\frac{1}{\alpha}\left(\sqrt{\lambda t} - \frac{1}{\sqrt{\lambda t}}\right)\right) \qquad (5.13)$$
For this distribution, α is the shape parameter and λ is the scale parameter.
Although the distribution looks cumbersome, it is easily analyzed as it is
defined as a standard normal probability. Figure 5.6a shows examples of the
life density and Figure 5.6b shows the corresponding examples of the hazard
function.
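Because Equation 5.13 is a standard normal probability, it needs only the error function to evaluate. The sketch below also checks the usual correspondence with the central limit form of Equation 5.12, taking α = σ/√(μL) and λ = μ/L and treating the cycle count as continuous time. That mapping is stated here as an assumption about how the two equations connect, and the values L = 75, μ = 1.5, σ = 2.5 are borrowed from Exercise 5.6 purely for illustration.

```python
import math

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_cdf(t, alpha, lam):
    """Birnbaum-Saunders distribution function, Equation 5.13."""
    if t <= 0:
        return 0.0
    root = math.sqrt(lam * t)
    return std_normal_cdf((root - 1.0 / root) / alpha)

def clt_cdf(n, level, mu, sigma):
    """Central limit form: probability that n cycles exhaust the threshold."""
    return 1.0 - std_normal_cdf((level - n * mu) / (sigma * math.sqrt(n)))

LEVEL, MU, SIGMA = 75.0, 1.5, 2.5          # illustrative values
ALPHA = SIGMA / math.sqrt(MU * LEVEL)      # assumed shape mapping
LAM = MU / LEVEL                           # assumed scale mapping

for t in (40, 50, 60):
    print(t, round(1.0 - bs_cdf(t, ALPHA, LAM), 4))   # reliability at t
```

The median of the distribution falls at t = 1/λ, where the argument of Φ is zero.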
$$\rho = \eta\,e^{-E_a/KT} \qquad (5.14)$$
where
η is an electron frequency factor
K is Boltzmann’s constant (8.623 × 10⁻⁵ eV/K)
T is the temperature in kelvins
Ea is the Gibbs free energy of activation
For reliability modeling, the product of the reaction rate, ρ, and time yields
the extent of the progress of the deterioration reaction and is therefore
considered to correspond to the cumulative hazard at any time. For example, for
the Weibull life distribution,
$$Z_T(t) = \left(\frac{\rho t}{\theta}\right)^{\beta} \qquad (5.15)$$
and it may be further noted that alternate rate functions may be (and some-
times are) used but are applied in the same manner. Until recently, the
scientific and algebraic links between rate functions such as Equation 5.14
and life distributions and hazard functions have not been widely studied.
Nevertheless, the implied models such as Equation 5.15 are used extensively
and have proven to be quite accurate.
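A minimal sketch of how Equations 5.14 and 5.15 combine into a reliability calculation follows. The parameter values are those stated in Exercise 5.7, used here only to illustrate the mechanics; the conversion 55°C = 328.15 K and the function names are assumptions.

```python
import math

K_BOLTZMANN = 8.623e-5  # eV/K, the value quoted with Equation 5.14

def arrhenius_rate(temp_kelvin, e_a, eta):
    """Reaction rate of Equation 5.14."""
    return eta * math.exp(-e_a / (K_BOLTZMANN * temp_kelvin))

def weibull_reliability(t, rho, beta, theta):
    """Reliability implied by the cumulative hazard of Equation 5.15."""
    cumulative_hazard = (rho * t / theta) ** beta
    return math.exp(-cumulative_hazard)

TEMP = 55.0 + 273.15            # kelvins
E_A, ETA = 0.80, 1.5e12         # activation energy (eV), frequency factor
BETA, THETA = 0.75, 40_000.0    # Weibull shape and scale (hours)

rho = arrhenius_rate(TEMP, E_A, ETA)
for hours in (10_000, 25_000, 50_000):
    print(hours, round(weibull_reliability(hours, rho, BETA, THETA), 4))
```

Changing `TEMP` or `E_A` rescales time through ρ while leaving the Weibull shape parameter untouched.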
where γ1 and γ2 also represent activation energies. Note that in this model,
Krausz and Eyring allow for the possibility that, in addition to their direct
effects, temperature and voltage have a synergetic (combination) effect on
reaction rate. For this model, we again incorporate the rate function in the
cumulative hazard function to obtain a life distribution.
and that these two flaw types both contribute to device failure. Their basic
premise is that the flaws correspond to material impurities or crystal lattice
anomalies that serve as reactive sites for degradation reactions such as metal
migration and oxidation. Based on this basic premise, they show that pub-
lished microelectronic failure data are reasonably well described by either
the Weibull or the IDB life distributions depending upon the relative magni-
tudes of the initial concentrations of the two defect types.
where
α and β are constants based on the material constituents of the device
γ(t) represents the gamma process in damage increments
where
λ(t) is the Poisson rate function
σ is a constant
W(t) is a standard Wiener process
More generally, one may let p(x,t) represent the density on the state variable
and use a Markov process to obtain
$$\frac{\partial}{\partial t}p(x,t) = -\frac{\partial}{\partial x}\bigl(\mu(x,t)\,p(x,t)\bigr) + \frac{\partial^{2}}{\partial x^{2}}\bigl(D(x,t)\,p(x,t)\bigr) \qquad (5.20)$$
in which the first term represents drift and the second represents diffu-
sion. Irrespective of the definition of the constituent terms in this equa-
tion, it reduces to the Fokker–Planck equation, which is also known as the
Chapman–Kolmogorov backward differential equation [41]:
where the first term represents gradual damage growth, the second term
represents the traumatic increments that occur as a result of the shocks, and
the third term captures the random variation in the damage growth.
The best representation of all of the earlier models is provided in terms of
the survival function by Lemoine and Wenocur [44] as
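$$\frac{\partial \bar{F}_T(x,t)}{\partial t} = -k(x)\,\bar{F}_T(x,t) + \mu(t)\,\frac{\partial \bar{F}_T(x,t)}{\partial x} + \frac{\sigma^{2}(t)}{2}\,\frac{\partial^{2} \bar{F}_T(x,t)}{\partial x^{2}} \qquad (5.23)$$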
As it wears, the tire has reduced tread or strength and thus less resistance
to randomly occurring road hazards, some (but not all) of which are fatal
to the tire. A conceptually equivalent process is the gradual deterioration
of an incandescent light bulb that will also fail suddenly if subjected to a
voltage surge.
In Equation 5.23, the first term represents the “killing process” associated
with shocks, the second captures the drift in the diffusion process, and the
third term represents the randomness in the damage accumulation diffusion
process. The randomness may result from material variations, ambient oper-
ating conditions, intensity of use, or simple entropy in energy application to
the device.
The general model that Lemoine and Wenocur define can be quite dif-
ficult to analyze or evaluate computationally. However, this rather compli-
cated model is actually quite rich. As they point out, the practically useful
model realizations are generally tractable. In fact, appropriate choices of
initial conditions and of functions μ(t) and σ²(t) lead to a wide variety of
useful and conceptually reasonable specific models. For example, assum-
ing σ(t) = 0 and that the wear tolerance is infinite implies that k(x) cor-
responds to the Weibull hazard function. For that model, setting β = 1 to
obtain an exponential distribution corresponds to the case in which the
wear rate is zero. On the other hand, taking k(x) = x and X(t) = a + bt yields
the Rayleigh distribution. Other choices yield normal, gamma, inverse nor-
mal, and extreme value life distributions. Thus, as Lemoine and Wenocur
point out, selecting the mean value and diffusion functions and the killing
function to represent specific types of degradation processes leads to quite
reasonable reliability models.
As indicated, analysis of Equation 5.23 can be quite challenging. In
fact, the solution of stochastic differential equations is a topic of ongoing
study. Nevertheless, for the simplest diffusion processes, an algorithm for
obtaining numerical solutions has been defined. For Equation 5.23, identify
the discretized approximations to the derivative terms as
where α1 and α2 are placeholders for the drift and diffusion coefficients. Then,
grouping all of the like survivor function terms, rewrite the expression as
$$\bar{F}(i, j+1) = \left(-\frac{\alpha_1 \Delta t}{2\Delta x} + \frac{\alpha_2 \Delta t}{(\Delta x)^{2}}\right)\bar{F}(i-1, j) + \left(1 - k(j) - \frac{2\alpha_2 \Delta t}{(\Delta x)^{2}}\right)\bar{F}(i, j) + \left(\frac{\alpha_1 \Delta t}{2\Delta x} + \frac{\alpha_2 \Delta t}{(\Delta x)^{2}}\right)\bar{F}(i+1, j)$$
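The recursion can be implemented as an explicit (forward time, central space) finite-difference scheme. The sketch below is one possible implementation under stated assumptions: constant coefficients α1 and α2, a killing rate supplied as a function of the state and multiplied by Δt, fixed boundary values, and an invented linear initial condition. None of the numerical values come from the text.

```python
def solve_survival(alpha1, alpha2, kill, x_min, x_max, nx, dt, steps,
                   left_value, right_value, initial):
    """March the survivor function forward in time on a uniform x grid.

    alpha1, alpha2 : assumed constant drift and diffusion coefficients
    kill           : function giving the killing rate at state x
    left_value, right_value : fixed boundary values of the survivor function
    initial        : function giving the survivor function at time 0
    """
    dx = (x_max - x_min) / nx
    xs = [x_min + i * dx for i in range(nx + 1)]
    drift = alpha1 * dt / (2.0 * dx)
    diffusion = alpha2 * dt / (dx * dx)
    if diffusion > 0.5:
        raise ValueError("explicit scheme needs alpha2*dt/dx^2 <= 1/2")
    f = [initial(x) for x in xs]
    f[0], f[-1] = left_value, right_value
    for _ in range(steps):
        new = f[:]                       # boundary entries carry over
        for i in range(1, nx):
            new[i] = ((-drift + diffusion) * f[i - 1]
                      + (1.0 - kill(xs[i]) * dt - 2.0 * diffusion) * f[i]
                      + (drift + diffusion) * f[i + 1])
        f = new
    return xs, f

# Hypothetical example with boundary values 1 at x = 0 and 0 at x = 2.
xs, f = solve_survival(alpha1=0.1, alpha2=0.5, kill=lambda x: 0.1 * x,
                       x_min=0.0, x_max=2.0, nx=20, dt=0.005, steps=400,
                       left_value=1.0, right_value=0.0,
                       initial=lambda x: 1.0 - x / 2.0)
```

The stability check reflects the usual restriction on explicit diffusion schemes; an implicit method would relax it at the cost of solving a linear system at each step.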
FIGURE 5.7
(a) Example of a solution of the diffusion model for the state boundary conditions $\bar{F}_T(a, t) = 0$
and $\bar{F}_T(b, t) = 0$. (b) Example of a solution of the diffusion model for the state boundary
conditions $\bar{F}_T(a, t) = 1$ and $\bar{F}_T(b, t) = 0$.

FIGURE 5.8
Example of a solution of the diffusion model when the state boundary condition applies to a
minimum value. (Axes: Time, X, and $\bar{F}(x,t)$.)

FIGURE 5.9
Example of a solution of the diffusion model when the state boundary condition applies to a
maximum value. (Axes: Time, X, and $\bar{F}(x,t)$.)
The interpretation of the equation is that the device has a base hazard function,
z0(t), that represents the core dispersion in life length, and this hazard is
increased by a function, ψ, of the vector of variables, x, that describe the
specific operating environment in which the device is used. The two commonly
used forms for the function ψ(x) are a linear function
$$\psi(x) = \beta_0 + \sum_{i=1}^{I} \beta_i x_i$$
and an exponential form
$$\psi(x) = e^{\beta_0 + \sum_{i=1}^{I} \beta_i x_i}$$
where the second of these is often preferred.
There are actually two approaches to using the proportional haz-
ards model. One is to identify the vector x as a set of covariates or explan-
atory variables and to view the model as a statistical basis for fitting a life
distribution to observed failure data. An alternate approach is to view the
vector x as a description of the operating environment and to treat the func-
tion ψ as a description of the effect of operating conditions on the failure
frequency. Practically, the two views are essentially the same as the function
ψ(x) is usually determined by regression analysis on accumulated failure
data. On the other hand, when the failure process for a device is understood,
the proportional hazards model in which the function ψ represents that fail-
ure process is very appropriate.
Observe that under the basic proportional hazards model, the shape of the
hazard function is preserved. The environmental effects on the hazard are
essentially additive (directly or as logarithms) and the cumulative hazard
function has the following comparable form:
$$Z_T(t) = \int_0^t \psi(x)\,z_0(u)\,du = \psi(x)\int_0^t z_0(u)\,du = \psi(x)\,Z_0(t) \qquad (5.26)$$

$$\bar{F}_T(t) = e^{-\psi(x) Z_0(t)} = \left(e^{-Z_0(t)}\right)^{\psi(x)} \qquad (5.27)$$
The advantages of this model are that it applies to any assumed base life
distribution, appears to be equally appropriate for mechanical and elec-
tronic devices, and allows for the very specific representation of a general
and arbitrary set of environmental effects. Conceptually, the chief drawback
of the model is that it may not directly portray the mechanisms of failure.
Practically, the drawback of the model is that substantial volumes of data are
required to obtain satisfactory regression models for ψ(x). Enhancements to
the model have been based on allowing some of the environmental variables
to be time dependent.
As an example of the proportional hazards model, consider a satellite-based
electronic device for which the gamma distribution with β = 2.0 and
λ = 0.00025 provides a reasonable life length model. Suppose that the damage
effects of radiation on the device are modeled by $\psi(x) = e^{0.024 + 0.25x_1 + 0.40x_2}$ where
the variables x represent the (normalized) particle spectral density and peak
energy intensity in the ambient particle streams. For these circuits, the base
reliability at 5000 hours is $\bar{F}_0(5000) = 0.644$, so the base cumulative
hazard is Z0(5000) = −ln(0.644) = 0.439. When x = (0.80, 0.45), ψ(x) = 1.498, so
the cumulative hazard achieved while in service in orbit is ZT(5000) = 0.658
with a corresponding reliability of $\bar{F}_T(5000) = 0.518$. Installation of a
radiation shield that reduces the operating environment to x = (0.30, 0.25)
gives ψ(x) = 1.220 and improves the 5000 hour in-orbit reliability to
$\bar{F}_T(5000) = 0.585$.
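The arithmetic of this example can be traced with a few lines of Python. The sketch assumes that β = 2.0 is the gamma shape parameter and λ the rate, so that the base survivor function is e^(−λt)(1 + λt), takes the base cumulative hazard as the negative logarithm of that survivor function, and then applies Equation 5.27. The function names are illustrative.

```python
import math

def gamma2_survival(t, lam):
    """Survivor function of a gamma distribution with shape 2 and rate lam."""
    return math.exp(-lam * t) * (1.0 + lam * t)

def psi(x1, x2):
    """Exponential environment function from the example."""
    return math.exp(0.024 + 0.25 * x1 + 0.40 * x2)

def ph_reliability(t, lam, x1, x2):
    """Equation 5.27: base survivor function raised to the power psi(x)."""
    base_cumulative_hazard = -math.log(gamma2_survival(t, lam))
    return math.exp(-psi(x1, x2) * base_cumulative_hazard)

LAM, T = 0.00025, 5000.0
base = gamma2_survival(T, LAM)
unshielded = ph_reliability(T, LAM, 0.80, 0.45)
shielded = ph_reliability(T, LAM, 0.30, 0.25)
print(f"base {base:.3f}  unshielded {unshielded:.3f}  shielded {shielded:.3f}")
```

Because ψ(x) exceeds 1 for both environments, both in-orbit reliabilities fall below the base value, and the shield recovers part of the difference.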
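$$T = \min_{k}\{t_k\}$$

$$= \int_{t_1}^{\infty} \cdots \int_{t_K}^{\infty} f_{T_1,T_2,\ldots,T_K}(t_1, t_2, \ldots, t_K)\,dt_1 \cdots dt_K \qquad (5.28)$$

$$\bar{F}_{T_1,T_2,\ldots,T_K}(t_1, t_2, \ldots, t_K) = \Pr[T_1 > t_1, T_2 > t_2, \ldots, T_K > t_K] \qquad (5.31)$$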
The survivor function for the crude life distribution for each risk is
$$= \int_{t}^{\infty}\left(\int_{t_j}^{\infty} \cdots \int_{t_j}^{\infty} f_{T_1,T_2,\ldots,T_K}(t_1, t_2, \ldots, t_K)\,dt_1\,dt_2 \cdots dt_{j-1}\,dt_{j+1} \cdots dt_K\right)dt_j \qquad (5.33)$$

$$\bar{F}_T(t) = \Pr[T > t] = \sum_{j=1}^{K} \Pr[T > t, J = j] \qquad (5.34)$$

$$\bar{F}_{T|j}(t) = \Pr[T \ge t \mid J = j] = \frac{\Pr[T \ge t, J = j]}{\pi_j} \qquad (5.35)$$
Observe that the described analysis applies regardless of whether or not the
various risks are independent. Of course, the analysis depends upon having
an initial multivariate model of life length as a function of the various risks.
This presents a problem as it is difficult to justify any particular multivariate
model prior to observing failures and when failures are observed, they
correspond to observations of crude failures rather than net failures.
On the other hand, if the risks are independent, then the joint life
distribution is the product of the marginal life distributions and the model
analysis is quite manageable. For example, suppose a device is subject to
two failure risks for which the marginal survival probability functions are
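$$\bar{F}_{T_1}(t) = e^{-2.4t} \quad \text{and} \quad \bar{F}_{T_2}(t) = e^{-(t/0.25)^{1.4}}$$
Then $\bar{F}_{T_1,T_2}(t_1, t_2) = e^{-2.4t_1 - (t_2/0.25)^{1.4}}$ so
$$\pi_1 = -\int_0^{\infty} \left.\frac{\partial}{\partial t_1}\bar{F}_{T_1,T_2}(t_1, t_2)\right|_{t_1 = t_2 = u} du = \int_0^{\infty} 2.4\,e^{-2.4u - (u/0.25)^{1.4}}\,du = 0.3808$$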
Necessarily, π2 = 0.6192.
Next, to compute survival probabilities conditioned on the cause of failure,
use the following equation:
$$\Pr[T > t, J = 1] = \int_{t}^{\infty}\left(\int_{t_1}^{\infty} f_{T_1,T_2}(t_1, t_2)\,dt_2\right)dt_1$$
to obtain Pr[T > 0.025, J = 1] = 0.323 and Pr[T > 0.025, J = 2] = 0.582 so that the
survival functions for the crude life distributions have
$$\bar{F}_{T|1}(0.025) = \Pr[T \ge 0.025 \mid J = 1] = \frac{\Pr[T \ge 0.025, J = 1]}{\pi_1} = 0.848$$
$$\bar{F}_{T|2}(0.025) = \Pr[T \ge 0.025 \mid J = 2] = \frac{\Pr[T \ge 0.025, J = 2]}{\pi_2} = 0.940$$
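The integrals in this example have no convenient closed form, but a simple quadrature reproduces the quoted figures to about three decimal places. The sketch below uses composite Simpson integration with an upper limit of 5 standing in for infinity; both the rule and the cutoff are choices made for illustration.

```python
import math

def risk1_integrand(u):
    """-d/dt1 of the joint survivor function, evaluated at t1 = t2 = u."""
    return 2.4 * math.exp(-2.4 * u - (u / 0.25) ** 1.4)

def overall_survival(t):
    """Pr[T > t] = joint survivor function on the diagonal t1 = t2 = t."""
    return math.exp(-2.4 * t - (t / 0.25) ** 1.4)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

UPPER = 5.0  # the integrand is negligible beyond this point

pi_1 = simpson(risk1_integrand, 0.0, UPPER)
pi_2 = 1.0 - pi_1

t0 = 0.025
joint_1 = simpson(risk1_integrand, t0, UPPER)   # Pr[T > t0, J = 1]
joint_2 = overall_survival(t0) - joint_1        # Pr[T > t0, J = 2]

print(f"pi_1 = {pi_1:.4f}, pi_2 = {pi_2:.4f}")
print(f"crude survival at {t0}: {joint_1 / pi_1:.3f}, {joint_2 / pi_2:.3f}")
```

The second joint probability is obtained by subtraction from the overall survival probability, which avoids a second quadrature and follows directly from Equation 5.34.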
Finally, note that when the risks are independent, the life distributions on
the net life lengths can be computed using
The appeal of the competing risk model is that many types of equipment
really are subject to multiple competing risks. A further advantage of the
model is that the combination of risks, even when each risk has a monotone
hazard function, often yields a “crude” failure distribution that has a bath-
tub shape. Thus, the competing risk model provides a means to combine
failure distributions to yield conceptually appealing model behavior in the
overall hazard function.
Note finally that like the proportional hazards model, the competing
risk model applies to both mechanical and electronic devices and may be
employed with any of the life distribution models. Thus, it is a model form
with wide applicability.
Exercises
5.1 Assume that a component has a Weibull strength distribution with
β = 2.0 and θ = 1267 kN/cm² and that it is subject to a normal stress
distribution having μ = 105.5 kN/cm² and σ = 1.76 kN/cm² where in both
cases the distributions are time invariant. Compute the reliability of
the component.
5.2 Suppose that the component of Problem 5.1 has an increasing stress
distribution in which mean and variance change according to the equa-
tions μY(t) = 105.5 + 0.014t and σY(t) = 1.76 + 0.005t. Compute the compo-
nent reliability at 2,000, 4,000, 6,000, and 10,000 hours.
5.3 Suppose a component has a normal strength distribution that declines
with use so that the mean strength is μX(t) = 400−0.01t and the standard
deviation in strength increases according to σX(t) = 15 + 0.004t. If this
component is subject to normally distributed stress with μY = 375 and
σY = 16, compute the device reliability at 1500 hours.
5.4 Suppose the stress distribution for the component of Problem 5.3 is
increasing with μY(t) = 375 + 0.008t and σY(t) = 16 + 0.002t. Compute the
device reliability at 1500 hours.
5.5 Solve the cumulative damage model of Equation 5.6 for the 400 hour
reliability for a component having a strength threshold L = 60 and sub-
jected to shocks that impart a normally distributed damage with mean
μ = 17.5 and standard deviation σ = 1.5. Assume shocks occur according
to a Poisson process with rate λ = 0.01/h.
5.6 For the Birnbaum–Saunders life distribution of Equation 5.13, compute
the reliability function value at 40 and 60 hours for the case in which
L = 75.0, μ = 1.5, and σ = 2.5. Then plot the reliability function over the
range [0,100].
5.7 Using the Arrhenius reaction rate model of Equation 5.14 and assuming
a Weibull life distribution, compute the component reliability at 10,000,
25,000, and 50,000 hours when T = 55°C, Ea = 0.80, η = 1.5 × 10¹², β = 0.75,
and θ = 40,000 hours. Then plot the reliability at 25,000 hours versus the
activation energy for 0.5 ≤ Ea ≤ 1.2.
5.8 Using the proportional hazards model, plot the baseline and overall
hazard functions and the reliability function for a component having a
Weibull life distribution with β = 1.75 and θ = 2500 hours and ψ(x) = 1.4.
5.9 Consider a competing risk model in which the number of risks is 2 and
the joint density on life length is
Determine the probabilities of failure due to each cause, the net failure
probability functions, and the survival functions for each of the crude
life length distributions.
6
Age Acceleration
The solutions obtained indicate that for many types of electronic devices,
stress screening is a worthwhile strategy for managing device reliability.
Here again, there is an interesting contrast between mechanical and elec-
tronic devices. Experience has shown that many electrical and electronic
devices display high and decreasing hazard during early life, so stress
screening can be economically sensible. On the other hand, most mechanical
components display increasing hazard from the start of life. Consequently,
stress screening is rarely used for mechanical components and models of its
use have not been constructed.
FIGURE 6.1
The (a) density functions and the (b) distribution functions with and without age acceleration.

For both applications, the operating environment of a device is modified
within the limits that the device should be able to tolerate. That is, the
environment is manipulated in such a manner that the aging process remains
unchanged except that it occurs more rapidly. It is important to maintain the
nature of the failure processes so that the observed aging behavior is
consistent with actual use. When the age acceleration is done properly, it is
usually believed that a proportional hazards type of hazard rate enhancement
is obtained, so the shape of the distribution is maintained. Thus, age
acceleration yields only timescale compression and
$$F_{T_a}(t) = F_T(at) \qquad (6.1)$$
where a is the acceleration factor, so Equation 6.1 states that the life distribu-
tion under enhanced stress is the same as the life distribution under nominal
operating conditions but evaluated for a compressed timescale.
Consider the example of a component having a Weibull life distribution
with β = 1.5 and θ = 1000 hours. If the application of a stress to the component
population results in an age acceleration of a = 10, the density and distribution
functions are compressed as shown in Figure 6.1. Representative values of the
distribution functions are $F_{T_a}(50) = F_T(500) = 0.298$, $F_{T_a}(100) = F_T(1000) = 0.632$,
and $F_{T_a}(200) = F_T(2000) = 0.941$. Note the compression in the range of
dispersion. A key point is that the shape of the distribution is unchanged
and this reflects the fact that the basic failure process is unchanged.
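The representative values quoted above can be checked directly. This short sketch evaluates the Weibull distribution function on the compressed timescale; the function names are illustrative.

```python
import math

def weibull_cdf(t, beta=1.5, theta=1000.0):
    """Life distribution under nominal operating conditions."""
    return 1.0 - math.exp(-((t / theta) ** beta))

def accelerated_cdf(t, a=10.0):
    """Life distribution under stress: nominal distribution at time a*t."""
    return weibull_cdf(a * t)

for t in (50, 100, 200):
    print(t, round(accelerated_cdf(t), 3))
```

The accelerated distribution is itself Weibull with the same shape parameter β = 1.5 and a scale parameter of θ/a = 100 hours, which is the sense in which only the timescale changes.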
$$a = \frac{\rho(T_a)}{\rho(T_0)} = e^{\frac{E_a}{K}\left(\frac{1}{T_0} - \frac{1}{T_a}\right)} \qquad (6.2)$$
of Ea = 0.8 and is tested at Ta = 95°C = 368 K, it will age at a rate a = 21.6 times
faster than under normal operating conditions. That is, for each hour of
operation at 95°C, the device gains 21.6 hours of age.
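Equation 6.2 is a one-line computation. The sketch below assumes a nominal operating temperature of T0 = 55°C = 328 K, the value that appears elsewhere in the chapter, and uses the Boltzmann constant as quoted in the text; with those inputs it returns a factor close to the 21.6 cited above.

```python
import math

K_BOLTZMANN = 8.623e-5  # eV/K, as quoted in the text

def acceleration_factor(e_a, t0_kelvin, ta_kelvin):
    """Arrhenius age acceleration factor of Equation 6.2."""
    return math.exp((e_a / K_BOLTZMANN) * (1.0 / t0_kelvin - 1.0 / ta_kelvin))

# Assumed nominal temperature of 328 K; test temperature of 368 K.
a = acceleration_factor(e_a=0.8, t0_kelvin=328.0, ta_kelvin=368.0)
print(f"acceleration factor: {a:.1f}")
```

Because the activation energy sits in the exponent, modest changes in Ea or in the test temperature produce large changes in the factor.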
An interesting extension to this model was studied by Nachlas [46] in
response to evidence presented by Wong [47] that for CMOS devices, the activa-
tion energy is often temperature dependent. Recognizing that performing ther-
mal age acceleration implies heating and cooling intervals, Nachlas defined a
model in which the net age acceleration is obtained as a function of the heating
and cooling rate and of the soak temperature. The basic form of that model is
$$a = \frac{e^{E_a(T_0)/KT_0}}{D}\int_0^{D} e^{-E_a(T(t))/KT(t)}\,dt \qquad (6.3)$$
where D is the test duration including heating and cooling intervals, the
function T(t) describes the temperature profile over time, and the function
Ea(T) depends upon the device identity and may be selected on the basis
of the findings of Jensen and Wong. A representative temperature profile is
shown in Figure 6.2.
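Equation 6.3 generally requires numerical integration. The following sketch applies the trapezoidal rule to a cycle with 200 minute ramps between 294 K and 368 K and an 800 minute soak. The constant activation energy of 0.8 eV and the nominal temperature T0 = 328 K are simplifying assumptions made for illustration; a temperature-dependent function can be passed in the same way.

```python
import math

K_BOLTZMANN = 8.623e-5  # eV/K, as quoted in the text

def temperature(t):
    """Temperature (K) at minute t of a 1200 minute thermal cycle."""
    if t <= 200.0:
        return 294.0 + 74.0 * (t / 200.0) ** 2.5
    if t <= 1000.0:
        return 368.0
    return 368.0 - 74.0 * ((t - 1000.0) / 200.0) ** 2.5

def net_acceleration(e_a_of_temp, t0_kelvin, duration=1200.0, n=12000):
    """Trapezoidal evaluation of Equation 6.3 over one cycle."""
    reference = math.exp(e_a_of_temp(t0_kelvin) / (K_BOLTZMANN * t0_kelvin))
    h = duration / n
    total = 0.0
    for i in range(n + 1):
        temp = temperature(i * h)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-e_a_of_temp(temp) / (K_BOLTZMANN * temp))
    return reference * total * h / duration

# Assumption: constant activation energy of 0.8 eV, nominal temperature 328 K.
a_cycle = net_acceleration(lambda temp: 0.8, t0_kelvin=328.0)
print(f"net acceleration over the cycle: {a_cycle:.2f}")
```

The net factor is lower than the soak-temperature factor alone because the ramps spend time at temperatures where aging is slow.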
FIGURE 6.2
Representative thermal cycle for age acceleration. (Vertical axis: Temp.)

The cycle represented shows heating and cooling intervals of 200 minutes
each, with a dwell (or soak) period of 800 minutes. Representing the
temperature profile by
$$T_a(t) = \begin{cases} 294 + 74\left(\dfrac{t}{200}\right)^{2.5} & 0 \le t \le 200 \\[1ex] 368 & 200 \le t \le 1000 \\[1ex] 368 - 74\left(\dfrac{t - 1000}{200}\right)^{2.5} & 1000 \le t \le 1200 \end{cases}$$
and, on the basis of the Jensen and Wong paper, taking the activation energy
function as
permits the modeling of the revised operating profile. More useful models
are obtained if the strength threshold parameter, L, is expressed as a func-
tion of the quantities that define operating conditions.
In a similar manner, both the stress–strength and proportional hazards
models offer the potential to represent age acceleration. In the case of the
stress–strength models, representing the parameters of the strength distribu-
tion in terms of operating conditions may provide a useful model. Normally,
the proportional hazards models are based on variables (covariates) that repre-
sent the operating conditions, so the representation of age acceleration should
be direct.
Recently, Cassady and Nachlas [49] suggested a general model that can be
used to capture the effects of service loads and device vintage as well as those
of the ambient operating environment. They assume that a device is subjected
to operating conditions that can be captured using two sets of variables: n
measures of intensity of use (e.g., speed, vibration) and m measures of ambient
operating conditions (e.g., temperature, relative humidity). They let Xi(t) denote
the value of intensity of use measure i at time t and X(t) = {X1(t), X2(t), …, Xn(t)}.
Also, X0i denotes the nominal value of intensity of use measure i. Then, they let
Yj(t) denote the value of ambient operating conditions measure j at time t with
Y(t) = {Y1(t), Y2(t), …, Ym(t)} where Y0j denotes the nominal value of ambient
operating conditions measure j. These quantities are used to obtain a measure
of the “equivalent age” of the device over time.
The unit’s operating conditions govern its aging. Let αk(t) denote the equiv-
alent age of a unit of vintage k at time t. The instantaneous rate of aging at
time t may be represented by
$$\alpha'_k(t) = \frac{d}{dt}\alpha_k(t) = (\alpha_{k0})^{\theta_k(t)} \qquad (6.6)$$
The functions {δk1, δk2, …, δkn} and {γk1, γk2, …, γkm} are vintage dependent and
defined so that δki(X0i) = 0, i = 1, …, n and γkj(Y0j) = 0, j = 1, …, m.
Note that ek is a vintage-dependent parameter. In addition, note that they
assume the effects of individual measures of intensity of use and ambient
operating conditions on equipment aging are independent. Therefore,
$$\alpha_k(t) = \int_0^{t} \alpha'_k(\tau)\,d\tau \qquad (6.8)$$
The complexity of this function depends entirely upon the complexity of the
functions {δk1, δk2, …, δkn} and {γk1, γk2, …, γkm}. If Θ denotes the equivalent
age of the unit at failure, then the cumulative distribution function on
failure age is $F_\Theta(\theta) = \Pr(\Theta \le \theta)$, and the transformation $T_k = \alpha_k^{-1}(\Theta)$ yields the
clock time to failure for items of vintage k. This time value may then be used
for the life distribution and reliability functions. The appeal of this model
is that it captures acceleration due to operating intensity as well as
environment and permits a distinction between operating time and clock time.
Finally, note that the diffusion process models have a “killing function”
that represents the process of deterioration. Here again, if the model that
comprises the “killing function” is defined (as would be expected) in terms
of the forces that drive failure, the representation of age acceleration should
again be direct.
$$a = \frac{80\,e^{\frac{0.7}{0.00008623}\left(\frac{1}{328} - \frac{1}{368}\right)} + 40\,e^{\frac{0.7}{0.00008623}\left(\frac{1}{328} - \frac{1}{383}\right)}}{120} = \frac{80(14.73) + 40(34.96)}{120} = 21.47$$
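The expression above is a time-weighted average of Arrhenius acceleration factors over the steps of the test. The sketch below reproduces the arithmetic using the numbers visible in the expression: Ea = 0.7, T0 = 328 K, and steps of 80 and 40 time units at 368 K and 383 K. The function names and the list-of-steps representation are illustrative assumptions.

```python
import math

K_BOLTZMANN = 8.623e-5  # eV/K, as quoted in the text

def arrhenius_factor(e_a, t0_kelvin, ta_kelvin):
    """Acceleration factor for a single constant-temperature step."""
    return math.exp((e_a / K_BOLTZMANN) * (1.0 / t0_kelvin - 1.0 / ta_kelvin))

def time_weighted_acceleration(e_a, t0_kelvin, steps):
    """Average acceleration over (duration, temperature) steps."""
    total_time = sum(duration for duration, _ in steps)
    weighted = sum(duration * arrhenius_factor(e_a, t0_kelvin, temp)
                   for duration, temp in steps)
    return weighted / total_time

steps = [(80.0, 368.0), (40.0, 383.0)]
a_step = time_weighted_acceleration(0.7, 328.0, steps)
print(f"net acceleration for the step-stress test: {a_step:.2f}")
```

Adding a further step is a matter of appending another (duration, temperature) pair to the list.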
Of course, the failure data obtained during the test should provide consider-
able information about the failure characteristics of the device population.
One further example of a step stress test is provided by the paper clips.
Suppose the stress regimen is to bend each member of a sample set of clips
20 cycles of 45° displacement followed by 20 cycles of 90° displacement and
further followed by 20 cycles of 180° displacement. Clearly, the successive
stress levels are increasingly aggressive and should produce increasing
numbers of failures.
Exercises
6.1 Consider an electronic component having activation energy Ea = 0.6
and normal operating temperature To = 55°C. Compute and plot the
age acceleration factor for the Arrhenius model that is realized for the
range of temperatures 70°C–110°C.
6.2 Assuming a normal operating temperature of 55°C and an accelerated
temperature of 95°C, compute and plot the acceleration factor defined
by the Arrhenius model for Ea in the range of 0.4–0.9.
6.3 Suppose copies of a device having activation energy of 0.82 eV are
subjected to a thermal cycling regime in which the dwell time at 90°C
is 4 hours, heating is linear over 30 minutes, cooling is linear over
90 minutes, and normal operating temperature is 55°C. Compute the
net acceleration per cycle.
This expression should be read as follows: xj is the jth smallest failure time
observed, and it corresponds to the failure time of the device copy indexed [j].
For example, in the data set of Table 7.1, x1 = t[1] = t19 = 0.006,
x2 = t[2] = t47 = 0.019, and x50 = t[50] = t3 = 5.292. The completely reordered
data set is shown in Table 7.2.
TABLE 7.1
Example of a Failure Data Set
i ti i ti i ti i ti i ti
1 0.883 11 1.555 21 0.129 31 0.829 41 0.894
2 0.875 12 3.503 22 0.455 32 0.548 42 0.336
3 5.292 13 1.541 23 2.008 33 1.016 43 0.129
4 0.038 14 1.218 24 0.783 34 0.223 44 1.373
5 4.631 15 1.285 25 1.803 35 3.354 45 0.613
6 1.690 16 2.190 26 2.505 36 1.559 46 1.272
7 0.615 17 0.720 27 0.465 37 3.785 47 0.019
8 2.877 18 0.056 28 1.494 38 0.599 48 0.068
9 1.943 19 0.006 29 0.795 39 0.090 49 0.658
10 3.106 20 0.279 30 0.299 40 0.026 50 3.085
Nonparametric Statistical Methods 97
TABLE 7.2
Reordered Example of Failure Data
j xj j xj j xj j xj j xj
1 0.006 11 0.279 21 0.720 31 1.285 41 2.190
2 0.019 12 0.299 22 0.783 32 1.373 42 2.505
3 0.026 13 0.336 23 0.795 33 1.494 43 2.887
4 0.038 14 0.455 24 0.829 34 1.541 44 3.085
5 0.056 15 0.465 25 0.875 35 1.555 45 3.106
6 0.068 16 0.548 26 0.883 36 1.559 46 3.354
7 0.090 17 0.599 27 0.894 37 1.690 47 3.503
8 0.129 18 0.613 28 1.016 38 1.803 48 3.785
9 0.129 19 0.615 29 1.218 39 1.943 49 4.631
10 0.223 20 0.658 30 1.272 40 2.008 50 5.292
Now, the statistical methods that we consider will usually be applied to the
values xj.
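The reordering that takes Table 7.1 to Table 7.2 can be sketched in a few lines of Python. This is an illustration only: the names `t`, `order`, and `x` are chosen here rather than taken from the text, and only the first ten entries of Table 7.1 are used to keep the listing short.

```python
# Failure times t_i for device copies i = 1..10 (first ten entries of Table 7.1).
t = [0.883, 0.875, 5.292, 0.038, 4.631, 1.690, 0.615, 2.877, 1.943, 3.106]

# order[j-1] is the original index [j] of the j-th smallest failure time.
order = sorted(range(1, len(t) + 1), key=lambda i: t[i - 1])

# x[j-1] = x_j = t_[j], the j-th smallest failure time.
x = [t[i - 1] for i in order]

for j, (i, xj) in enumerate(zip(order, x), start=1):
    print(f"x_{j} = t_[{j}] = t_{i} = {xj}")
```

Keeping the index list alongside the sorted values preserves the link between each ordered time and the device copy that produced it.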
With our notation defined, we next consider the question of whether or
not a test is run to completion. It is easy to imagine a case in which the time
required to test a complete sample of copies of a component until all have
failed can be excessive. Even with age acceleration, the time required for a
complete test can be unmanageable or infeasible. In order to perform a reli-
ability test within a reasonable length of time, tests are often truncated early.
A decision to use a truncated test should be made before the test is per-
formed. When the test is truncated, we say that the test data obtained are
censored.
There are two basic approaches to test truncation—Type I and Type II.
A test may be truncated at a preselected point in time or after a predeter-
mined number of item failures. If the test is to be terminated after a fixed
time interval, the test duration is known (and therefore limited) in advance,
but the number of device failures that will be observed is a random vari-
able. This is Type I test truncation and the data set that results is said to be a
Type I censored data set. On the other hand, truncation after a fixed number
of failures yields a Type II censored data set in which the number of data
values is known in advance, but the test duration is random. In many cases,
the distinction is not crucial, but for some statistical estimation methods, the
difference between the two types of censoring is important. When a life test
is not truncated, it is said to be a complete test and the data set obtained is
said to be a complete data set.
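The difference between the two truncation rules can be illustrated with a short Python sketch. The function names and the sample values are assumptions made for this illustration; each function returns the observed failure times together with the number of surviving (censored) units.

```python
def type_i_censor(times, t_end):
    """Type I: stop at a fixed time; the number of observed failures is random."""
    observed = sorted(x for x in times if x <= t_end)
    return observed, len(times) - len(observed)


def type_ii_censor(times, r):
    """Type II: stop at the r-th failure; the test duration is random."""
    observed = sorted(times)[:r]
    return observed, len(times) - r


# First ten failure times of Table 7.1, treated here as a small complete data set.
times = [0.883, 0.875, 5.292, 0.038, 4.631, 1.690, 0.615, 2.877, 1.943, 3.106]

obs_i, survivors_i = type_i_censor(times, t_end=1.0)   # duration fixed at 1.0
obs_ii, survivors_ii = type_ii_censor(times, r=3)      # duration is obs_ii[-1]
```

Under Type I censoring the length of `obs_i` varies from sample to sample, while under Type II censoring it is the final entry of `obs_ii` (the test duration) that varies.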
Now that our notation is established and the basic classes of data sets
that might be used are defined, we may examine the methods of analysis.
To start with, we consider only those cases in which a complete data set is
available.
$$\hat{F}_T(t) = \frac{k}{n}, \qquad \hat{\bar{F}}_T(t) = \frac{n-k}{n}, \qquad k = \{\, j \mid x_j \le t,\ x_{j+1} > t \,\} \tag{7.2}$$
where the caret “^” above a quantity indicates an estimate. Now, while these
expressions seem intuitively logical, they can be shown to be based on statis-
tical reasoning and to be one of the two reasonable forms.
Assume an arbitrary time interval, say, (t, t + ΔT), and let pj denote the
probability that the jth device failure observed occurs during that interval.
In principle, any of the failures could occur during that interval so the prob-
ability that it is the jth failure is a probability on the index of the failure time
that happens to be the one in the interval of interest. Next, we ask how it
could occur that the jth failure would fall in the selected interval and the
answer is as follows:
$$p_j = \frac{n!}{(j-1)!\,(1)!\,(n-j)!}\,\bigl(F_T(t)\bigr)^{j-1}\,\bigl(f_T(t)\,dt\bigr)\,\bigl(\bar{F}_T(t)\bigr)^{n-j} \tag{7.3}$$
This is the probability on the number of the failure that occurs during an
arbitrary time interval and, in the limit as the length of the interval is reduced,
it is the probability that it is the jth failure that has occurred by time t.
Thus, it is the distribution on the “order statistic,” which is the count of
the number of failures. Properties of distributions on order statistics have
been studied extensively. Among the results that are known and have been
found useful are the facts that the mean of the distribution on the jth order
statistic implies that, as stated in Equation 7.2, an appropriate estimate for
the fraction failed is
$$\hat{F}_T(t) = \frac{j}{n} \tag{7.4}$$
while the median of the distribution on the jth order statistic implies that an
appropriate estimate for the fraction failed is
$$\hat{F}_T(t) = \frac{j - 0.3}{n + 0.4} \tag{7.5}$$
The estimator based on the median is not unbiased, but it has the advantage
that it does not assign a value of 1.0 to the estimate associated with the nth
failure time. This is often considered important in that one does not expect
that the greatest failure time observed during a test really corresponds to
the maximum achievable life length. In general, either of the estimators may
be used and each is treated at various points in the discussion of statistical
methods presented here.
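The two point estimates of Equations 7.4 and 7.5 are easy to compare numerically. The following Python sketch uses function names chosen here for illustration.

```python
def mean_rank(j, n):
    """Fraction-failed estimate of Equation 7.4."""
    return j / n


def median_rank(j, n):
    """Fraction-failed estimate of Equation 7.5."""
    return (j - 0.3) / (n + 0.4)


n = 50
for j in (1, 15, 50):
    print(j, round(mean_rank(j, n), 4), round(median_rank(j, n), 4))
```

For j = n the first estimator returns exactly 1.0 while the second stays below 1.0, which is the property noted above.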
one may base the computation of the estimate of the reliability (or failure
probability) on the number of survivors (or observed failures) or on the pro-
portion of the test items that have survived (or failed). Consider first the use
of the number of survivors.
The number of survivors at the end of a fixed time interval is a random
variable for which the dispersion is best modeled using the binomial dis-
tribution. That is, survival of each copy of the component may be viewed as
a Bernoulli trial so that the quantity n − j has a binomial distribution with
success probability $\bar{F}_T(t)$. In this case,
$$E\left[\frac{n-j}{n}\right] = \bar{F}_T(t) \quad\text{and}\quad \mathrm{Var}\left[\frac{n-j}{n}\right] = \frac{\bar{F}_T(t)\bigl(1-\bar{F}_T(t)\bigr)}{n} \tag{7.7}$$
In general, when the sample size is large (or at least not too small) and the
success probability is not too close to 0 or to 1, a confidence interval for the
binomial parameter may be computed using the standard normal distribu-
tion. For an arbitrarily selected confidence level of 100(1 − α)%, a confidence
interval of
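$$\hat{\bar{F}}_T(t) + z_{\alpha/2}\sqrt{\frac{\hat{\bar{F}}_T(t)\bigl(1-\hat{\bar{F}}_T(t)\bigr)}{n}} \;\le\; \bar{F}_T(t) \;\le\; \hat{\bar{F}}_T(t) + z_{1-\alpha/2}\sqrt{\frac{\hat{\bar{F}}_T(t)\bigl(1-\hat{\bar{F}}_T(t)\bigr)}{n}} \tag{7.8}$$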
$$\hat{F}_T(0.5) = \frac{j-0.3}{n+0.4} = \frac{14.7}{50.4} = 0.292 \qquad \hat{\bar{F}}_T(0.5) = 1 - \hat{F}_T(0.5) = 0.708$$
Taking α = 0.05 so that $z_{\alpha/2} = -1.96$, the first of the estimates provides the
confidence interval:
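The normal-approximation interval of Equation 7.8 can be evaluated with a few lines of Python. This is a minimal sketch: the function name is chosen here, the default multiplier 1.96 corresponds to α = 0.05, and the input 0.708 is the median-based reliability estimate quoted above with n = 50.

```python
import math


def normal_interval(estimate, n, z=1.96):
    """Two-sided normal-approximation interval for a binomial proportion."""
    half_width = z * math.sqrt(estimate * (1.0 - estimate) / n)
    return estimate - half_width, estimate + half_width


lower, upper = normal_interval(0.708, 50)
print(f"{lower:.3f} <= reliability at t = 0.5 <= {upper:.3f}")
```

The sketch does not clip the limits to the interval (0, 1), so for estimates close to 0 or 1 the approximation should be treated with the caution noted above.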
In fact, this beta distribution form also follows directly from the distribution
on the order statistics stated in Equation 7.3. With the definition u = FT(t) and
the parameter identities of η = j − 1 and δ = n − j, replacing $f_T(t)\,dt$ by du yields
Equation 7.9. Keep in mind that the proportion u must be in the interval (0, 1).
The point estimate of Equation 7.4 is an appropriate estimate for the frac-
tion failed by time t so a confidence bound on that estimate is obtained using
the quantiles of the beta distribution. For integer values of the parameters,
the distribution function for the beta distribution can be obtained by succes-
sive integration by parts and is
$$F_U(u, \eta, \delta) = 1 - \sum_{k=0}^{\eta} \frac{\Gamma(\eta+\delta+2)}{\Gamma(k+1)\,\Gamma(\eta+\delta-k+2)}\, u^{k} (1-u)^{\eta+\delta-k+1} \tag{7.10}$$
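Equation 7.10 can be evaluated directly, and its quantiles found by bisection, as the following Python sketch suggests. The function names are chosen here for illustration, and the quantile routine is a generic root search rather than a method prescribed by the text.

```python
import math


def beta_cdf(u, eta, delta):
    """F_U(u, eta, delta) of Equation 7.10 for nonnegative integer eta and delta."""
    total = 0.0
    for k in range(eta + 1):
        coeff = math.gamma(eta + delta + 2) / (
            math.gamma(k + 1) * math.gamma(eta + delta - k + 2)
        )
        total += coeff * u**k * (1.0 - u) ** (eta + delta - k + 1)
    return 1.0 - total


def beta_quantile(q, eta, delta, tol=1e-10):
    """Solve F_U(u, eta, delta) = q for u by bisection; F_U is increasing in u."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, eta, delta) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Median of the distribution for j = 15 and n = 50 (eta = j - 1, delta = n - j).
median_15 = beta_quantile(0.5, 14, 35)
```

For j = 15 and n = 50 the computed median lies close to the median-based point estimate of Equation 7.5, which is the connection between the order-statistic distribution and that estimator.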
Then, a 100(1 − α)% confidence interval on the failure probability at the time
of the jth observed failure is
Clearly, these confidence limits must be computed numerically, but the effort
required to do this is not great. For the same example case as the one treated
earlier in which n = 50 and j = 15 at t = 0.5, the point estimate for FT(0.5) is still
$$\hat{F}_T(0.5) = \frac{j}{n} = \frac{15}{50} = 0.30$$
$$\hat{p}_i = \frac{n_i - w_i}{n_i} \tag{7.15}$$
The Kaplan–Meier estimator for the device survivor function at any time is
defined as
$$\hat{\bar{F}}_T(t) = \prod_{i \mid t_i \le t} \frac{n-i+1-w_i}{n-i+1} \tag{7.16}$$
$$\hat{\bar{F}}_T(t) = \prod_{i \mid t_i \le t} \hat{p}_i = \prod_{i \mid t_i \le t} \left(\frac{n-i}{n-i+1}\right)^{w_i} \tag{7.17}$$
$$\widehat{\mathrm{Var}}\bigl(\hat{\bar{F}}_T(x_i)\bigr) = \bigl(\hat{\bar{F}}_T(x_i)\bigr)^{2} \sum_{j=1}^{i} \frac{w_j}{n_j (n_j - w_j)} \tag{7.18}$$
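$$\hat{\bar{F}}_T(x_i) + z_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}\bigl(\hat{\bar{F}}_T(x_i)\bigr)} \;\le\; \bar{F}_T(x_i) \;\le\; \hat{\bar{F}}_T(x_i) + z_{1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}\bigl(\hat{\bar{F}}_T(x_i)\bigr)} \tag{7.19}$$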
TABLE 7.3
Data for Kaplan–Meier Estimation Example
1.20 10.93 16.49* 20.47 43.21
3.27 11.73 18.13 25.35 47.10
3.46 13.75* 18.16 31.43 54.95
3.57* 14.14 18.74* 31.66 60.43*
7.19 16.09 18.99 33.54* 80.40
Note: * Data loss time.
TABLE 7.4
Calculation of Kaplan–Meier Estimates using
Example Data for the Survivor Function
i xi wi p̂i $\hat{\bar{F}}_T(x_i)$
0 1.0 1.000
1 1.20 1 0.960 0.960
2 3.27 1 0.958 0.920
3 3.46 1 0.957 0.880
4 3.57 0 1 0.880
5 7.19 1 0.952 0.838
6 10.93 1 0.950 0.796
7 11.73 1 0.947 0.754
8 13.75 0 1 0.754
9 14.14 1 0.941 0.710
10 16.09 1 0.938 0.666
11 16.49 0 1 0.666
12 18.13 1 0.929 0.618
13 18.16 1 0.923 0.570
14 18.74 0 1 0.570
15 18.99 1 0.909 0.518
16 20.47 1 0.900 0.467
17 25.35 1 0.889 0.415
18 31.43 1 0.875 0.363
19 31.66 1 0.857 0.311
20 33.54 0 1 0.311
21 43.21 1 0.800 0.249
22 47.10 1 0.750 0.187
23 54.95 1 0.667 0.124
24 60.43 0 1 0.124
25 80.40 1 0 0
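The columns of Table 7.4 can be regenerated from the data of Table 7.3 with a short Python sketch of Equations 7.15 and 7.16. The function name and the tuple layout are choices made here for illustration; w = 1 marks an observed failure and w = 0 a data loss.

```python
# Table 7.3 data as (time, w): w = 1 for a failure, w = 0 for a data loss (*).
data = [
    (1.20, 1), (3.27, 1), (3.46, 1), (3.57, 0), (7.19, 1),
    (10.93, 1), (11.73, 1), (13.75, 0), (14.14, 1), (16.09, 1),
    (16.49, 0), (18.13, 1), (18.16, 1), (18.74, 0), (18.99, 1),
    (20.47, 1), (25.35, 1), (31.43, 1), (31.66, 1), (33.54, 0),
    (43.21, 1), (47.10, 1), (54.95, 1), (60.43, 0), (80.40, 1),
]


def kaplan_meier(data):
    """Return rows (i, x_i, w_i, p_i, survivor estimate) in time order."""
    n = len(data)
    survivor = 1.0
    rows = []
    for i, (x, w) in enumerate(sorted(data), start=1):
        at_risk = n - i + 1              # units still on test just before x_i
        p = (at_risk - w) / at_risk      # conditional survival probability
        survivor *= p
        rows.append((i, x, w, p, survivor))
    return rows


km_rows = kaplan_meier(data)
for i, x, w, p, s in km_rows:
    print(f"{i:2d} {x:6.2f} {w} {p:.3f} {s:.3f}")
```

Censored observations leave the estimate unchanged (their conditional survival probability is 1) but still reduce the number at risk for later failures.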
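Applying this expression to the estimate $\hat{\bar{F}}_T(x_7 = 11.73)$ yields
$$\begin{aligned}
\widehat{\mathrm{Var}}\bigl(\hat{\bar{F}}_T(11.73)\bigr) &= \bigl(\hat{\bar{F}}_T(11.73)\bigr)^{2} \sum_{j=1}^{7} \frac{w_j}{n_j (n_j - w_j)} \\
&= (0.754)^{2} \left( \frac{1}{25(24)} + \frac{1}{24(23)} + \frac{1}{23(22)} + \frac{1}{21(20)} + \frac{1}{20(19)} + \frac{1}{19(18)} \right) \\
&= 0.0076
\end{aligned}$$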
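The arithmetic of this variance calculation can be checked with a few lines of Python. The sketch below uses only the first seven rows of Table 7.4 and n = 25; the variable names are chosen here for illustration.

```python
n = 25
w = [1, 1, 1, 0, 1, 1, 1]   # w_1..w_7 from Table 7.4 (the fourth is a data loss)

survivor = 1.0
greenwood_sum = 0.0
for i, w_i in enumerate(w, start=1):
    n_i = n - i + 1                           # number at risk at the i-th time
    survivor *= (n_i - w_i) / n_i             # Kaplan-Meier product term
    greenwood_sum += w_i / (n_i * (n_i - w_i))

variance = survivor**2 * greenwood_sum
print(round(survivor, 3), round(variance, 4))
```

The censored observation at i = 4 contributes nothing to the sum, which is why only six fractions appear in the worked expression.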
While it is true that the bounds are relatively wide, they are based on a very
general censoring pattern and thus permit estimation using even very lim-
ited data sets.
Using these definitions, the number of units tested during the ith interval is
$$n_i = n - \sum_{j=1}^{i-1} (d_j + m_j) \tag{7.20}$$
and the conditional probability of failure during the ith interval may be
estimated by
$$\hat{p}_i = \frac{d_i}{n_i} = \frac{d_i}{n - \sum_{j=1}^{i-1} (d_j + m_j)} \tag{7.21}$$
$$\hat{\bar{F}}_T(t_i) = \prod_{j=1}^{i} (1 - \hat{p}_j) = \prod_{j=1}^{i} \left( 1 - \frac{d_j}{n - \sum_{l=1}^{j-1} (d_l + m_l)} \right) \tag{7.22}$$
The variance of this survivor function estimate may also be estimated using
Greenwood’s formula [51]:
$$\widehat{\mathrm{Var}}\bigl(\hat{\bar{F}}_T(t_i)\bigr) = \bigl(\hat{\bar{F}}_T(t_i)\bigr)^{2} \sum_{j=1}^{i} \frac{\hat{p}_j}{n_j (1 - \hat{p}_j)} \tag{7.23}$$
Once we have estimates of the reliability and the variance of the estimates,
the normal distribution confidence intervals described in Section 7.3 may be
applied directly.
Consider an example. Meeker and Escobar [52] present heat exchanger
tube failure data in which n = 300, m = (99, 95), and d = (4, 5, 2). The three
intervals are each 1 year long. Using these data, n1 = 300, n2 = 197, and n3 = 97.
Therefore,
$$\hat{p}_1 = \frac{4}{300} = 0.0133 \qquad \hat{p}_2 = \frac{5}{197} = 0.0254 \qquad \hat{p}_3 = \frac{2}{97} = 0.0206$$
Then,
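$$\hat{\bar{F}}_T(t_1) = 1 - \hat{p}_1 = 0.9867 \quad\text{and}\quad \widehat{\mathrm{Var}}\bigl(\hat{\bar{F}}_T(t_1)\bigr) = 4.385 \times 10^{-5}$$
$$\hat{\bar{F}}_T(t_2) = (1 - \hat{p}_1)(1 - \hat{p}_2) = 0.9616 \quad\text{and}\quad \widehat{\mathrm{Var}}\bigl(\hat{\bar{F}}_T(t_2)\bigr) = 1.578 \times 10^{-4}$$
$$\hat{\bar{F}}_T(t_3) = (1 - \hat{p}_1)(1 - \hat{p}_2)(1 - \hat{p}_3) = 0.9418 \quad\text{and}\quad \widehat{\mathrm{Var}}\bigl(\hat{\bar{F}}_T(t_3)\bigr) = 3.438 \times 10^{-4}$$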
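The interval calculations of Equations 7.20 through 7.23 can be sketched in Python as follows. The function name and return layout are assumptions made for this illustration; removals after the last listed interval are taken to be zero because the text gives only two removal counts.

```python
def interval_estimates(n, d, m):
    """Rows (n_i, p_i, survivor estimate, variance estimate) per interval."""
    rows = []
    at_risk, survivor, var_sum = n, 1.0, 0.0
    for i, d_i in enumerate(d):
        p = d_i / at_risk                           # Equation 7.21
        survivor *= 1.0 - p                         # Equation 7.22
        var_sum += p / (at_risk * (1.0 - p))        # sum in Equation 7.23
        rows.append((at_risk, p, survivor, survivor**2 * var_sum))
        removed = m[i] if i < len(m) else 0         # assumed 0 when not listed
        at_risk -= d_i + removed                    # Equation 7.20
    return rows


# Heat exchanger tube data quoted in the text.
rows = interval_estimates(n=300, d=[4, 5, 2], m=[99, 95])
for n_i, p, s, v in rows:
    print(n_i, round(p, 4), round(s, 4), f"{v:.3e}")
```

The first three columns can be compared with the values of n_i, the conditional failure probabilities, and the survivor estimates worked out above.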
Finally, the 95% confidence intervals for the survivor function values are
$$F_T(\tau_\gamma) = \gamma \tag{7.24}$$
$$u = F_T(x_j)$$
$$\Pr[\tau_\gamma \ge x_j] \ge 1 - \alpha$$
in which case, we will have 100(1 − α)% confidence that the reliability at xj is at
least 1 – γ. Using our established definitions, this means that the cumulative
probability of failure at xj is smaller than that at τγ, so
Now, examining the beta distribution on a proportion u for the specific case
of our test results, we find that
$$F_U(u, j-1, n-j) = 1 - \sum_{k=0}^{j-1} \binom{n}{k} u^{k} (1-u)^{n-k} = 1 - B(j-1, n, u) \tag{7.26}$$
$$B(j-1, n, \gamma) \le \alpha \tag{7.28}$$
$$F_U(u, 0, n-1) = 1 - (1-u)^{n} = 1 - (1-\gamma)^{n} \ge 1 - \alpha$$
so we find that for the example data set of Table 7.2, using α = 0.05 and n = 50,
we compute γ = 0.058 and we say that we have 95% confidence that the reliabil-
ity of the component population is at least 1 − γ = 0.942 at a time of x1 = 0.006.
An alternate use of Equation 7.28 is to ask how many items should we
have tested in order for x1 to correspond to a Type A design allowable.
To answer this question, we set α = 0.05 and γ = 0.10 and we compute n to
be 29. For the corresponding case of the Type B design allowable, γ = 0.01
and n = 298.
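For the first failure (j = 1), Equation 7.28 reduces to (1 − γ)^n ≤ α, which can be solved in closed form for either γ or n. The Python sketch below uses function names chosen here for illustration.

```python
import math


def gamma_at_first_failure(n, alpha):
    """Smallest gamma satisfying (1 - gamma)**n <= alpha."""
    return 1.0 - alpha ** (1.0 / n)


def sample_size_solution(gamma, alpha):
    """Real-valued n solving (1 - gamma)**n = alpha."""
    return math.log(alpha) / math.log(1.0 - gamma)


print(round(gamma_at_first_failure(50, 0.05), 3))
print(round(sample_size_solution(0.10, 0.05), 2))
print(round(sample_size_solution(0.01, 0.05), 2))
```

The real-valued solutions are roughly 28.4 for γ = 0.10 and 298.07 for γ = 0.01, in line with the sample sizes of 29 and 298 quoted above; how the second value is rounded is a matter of convention.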
Next, for other values of j, we might compute the reliability at the time
of the 15th failure for which we have 95% confidence. To do this we solve
Equation 7.28 for the smallest value of γ for which the expression holds.
That value is γ = 0.403 so the reliability value we seek is 0.597 and we can
say that
Thus, using Equation 7.28, we can calculate nearly any sort of tolerance
bound we need and we can also calculate the sample size required to obtain
any specific level of confidence for any desired level of reliability.
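For general j, Equation 7.28 has no closed-form solution, but the smallest admissible γ can be found numerically. The following Python sketch uses bisection on the cumulative binomial probability; the function names and the tolerance are choices made here, not part of the text.

```python
import math


def binom_cdf(m, n, p):
    """B(m, n, p): probability of at most m successes in n Bernoulli trials."""
    return sum(math.comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(m + 1))


def tolerance_gamma(j, n, alpha, tol=1e-10):
    """Smallest gamma with B(j - 1, n, gamma) <= alpha, located by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binom_cdf(j - 1, n, mid) > alpha:   # B decreases as gamma increases
            lo = mid
        else:
            hi = mid
    return hi


gamma_1 = tolerance_gamma(1, 50, 0.05)
gamma_15 = tolerance_gamma(15, 50, 0.05)
print(round(gamma_1, 3), round(gamma_15, 3))
```

The same routine can be turned around to search over n for a required γ, which gives the sample-size calculation mentioned above.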
The various methods described up to this point in this chapter treat esti-
mation of failure probabilities or reliability values. An alternate idea is to
develop nonparametric methods that address the hazard function. This
topic is treated next.
Definition 7.1
The total time on test transform of a life distribution having finite mean, μ, is
denoted by $H_{F_T^{-1}}(u)$ and is expressed as
$$H_{F_T^{-1}}(u) = \int_0^{F_T^{-1}(u)} \bar{F}_T(w)\,dw \tag{7.30}$$
$$H_{F_T^{-1}}(u = 0.467) = \int_0^{F_T^{-1}(u = 0.467)} \bar{F}_T(w)\,dw = \int_0^{0.75} e^{-\lambda w}\,dw = \frac{1 - e^{-0.75\lambda}}{\lambda} = \frac{0.467}{\lambda} = 0.556$$
The TTT transform has two useful and important properties that will lead us
to a method for characterizing the behavior of a life distribution. The first is
that it yields the mean when evaluated at u = 1.0. That is,
$$H_{F_T^{-1}}(u = 1.0) = \int_0^{F_T^{-1}(u = 1.0)} \bar{F}_T(w)\,dw = \int_0^{\infty} \bar{F}_T(w)\,dw = \mu \tag{7.31}$$
To clarify, note that the time at which the life distribution has value 1.0 is
infinity (or some very large maximum value) and the integral of the survivor
function over the full range of any random variable yields the mean value of
that random variable.
The second useful property of the transform is that its derivative, when
evaluated at any value of the cumulative failure probability, equals the recip-
rocal of the corresponding value of the hazard function. This is shown as
follows:
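$$\begin{aligned}
\frac{d}{du} H_{F_T^{-1}}(u) &= \frac{d}{du} \int_0^{F_T^{-1}(u)} \bar{F}_T(w)\,dw \\
&= \left( \frac{d}{du} F_T^{-1}(u) \right) \bar{F}_T\bigl(F_T^{-1}(u)\bigr) - \left( \frac{d}{du} 0 \right) \bar{F}_T(0) + \int_0^{F_T^{-1}(u)} \frac{d}{du} \bar{F}_T(w)\,dw
\end{aligned}$$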
Now, clearly the second and third terms of this derivative equal zero. For the
first term of the derivative, we observe that in the expression
$$\bar{F}_T\bigl(F_T^{-1}(u)\bigr)$$
$F_T^{-1}(u)$ is the time for which the cumulative failure probability equals u so
evaluating the survivor function at that value yields 1 − u. That is,
$$\bar{F}_T\bigl(F_T^{-1}(u)\bigr) = 1 - u$$
$$t = F_T^{-1}(u) \quad\text{so}\quad \frac{dt}{du} = \frac{d}{du} F_T^{-1}(u)$$
Then,
$$\frac{dt}{du} = \frac{1}{du/dt} = \frac{1}{\frac{d}{dt} F_T(t)} = \frac{1}{f(t)}$$
and therefore,
$$\frac{d}{du} F_T^{-1}(u) = \frac{1}{f\bigl(F_T^{-1}(u)\bigr)}$$
Combining our two results and evaluating the derivative at the cumulative
failure probability associated with any time yields
$$\frac{d}{du} H_{F_T^{-1}}(u) = \left. \frac{1-u}{f\bigl(F_T^{-1}(u)\bigr)} \right|_{u = F_T(t)} = \frac{\bar{F}_T(t)}{f(t)} = \frac{1}{z_T(t)} \tag{7.32}$$
Definition 7.2
The scaled total time on test transform of a life distribution having finite mean,
μ, is denoted by $\Theta_{F_T^{-1}}(u)$ and is expressed as
$$\Theta_{F_T^{-1}}(u) = \frac{H_{F_T^{-1}}(u)}{H_{F_T^{-1}}(1)} = \frac{H_{F_T^{-1}}(u)}{\mu} \tag{7.33}$$
Now, clearly the derivative of the scaled transform equals the derivative of
the transform divided by the scaling constant, μ. Thus,
$$\frac{d}{du} \Theta_{F_T^{-1}}(u) = \frac{\frac{d}{du} H_{F_T^{-1}}(u)}{\mu} = \frac{1/z_T\bigl(F_T^{-1}(u)\bigr)}{\mu} \tag{7.34}$$
Consider what the derivative of the scaled transform tells us. Suppose the
life distribution happens to be exponential. In that case, zT(t) = λ and μ = 1/λ.
Thus, at all values of u, the derivative of the scaled transform is
$$\frac{d}{du} \Theta_{F_T^{-1}}(u) = \frac{1/\lambda}{1/\lambda} = 1$$
When the life distribution has constant hazard, the scaled TTT transform
has a slope equal to one. In that case, the scaled transform plots as a straight
line. Suppose the hazard is not constant. For the Weibull distribution, the
mean is θΓ(1 + 1/β) and the hazard function is
$$z_T(t) = \frac{\beta}{\theta} \left( \frac{t}{\theta} \right)^{\beta - 1}$$
For β > 1, the hazard function is increasing in time and the scaled TTT trans-
form displays the form shown in Figure 7.1a. Similarly, for β < 1, the scaled
transform has the form shown in Figure 7.1b as this is indicative of a distri-
bution with decreasing hazard function. Thus, we have the very powerful
result that the scaled transform is concave for increasing hazard, a straight
line for constant hazard, and convex for decreasing hazard distributions.
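The concave, straight-line, and convex shapes can be checked numerically for the Weibull distribution. The Python sketch below evaluates the scaled transform of Equation 7.33 by trapezoidal integration of the survivor function; the function name, the step count, and the choice θ = 1 are assumptions made for this illustration.

```python
import math


def scaled_ttt_weibull(u, beta, theta=1.0, steps=20000):
    """Scaled TTT transform of a Weibull(beta, theta) distribution at 0 <= u < 1."""

    def survivor(w):
        return math.exp(-((w / theta) ** beta))

    t_u = theta * (-math.log(1.0 - u)) ** (1.0 / beta)   # F^{-1}(u)
    h = t_u / steps
    # Trapezoidal rule for the integral of the survivor function over [0, t_u].
    area = 0.5 * (survivor(0.0) + survivor(t_u))
    area += sum(survivor(k * h) for k in range(1, steps))
    area *= h
    mean = theta * math.gamma(1.0 + 1.0 / beta)          # Weibull mean
    return area / mean


for beta in (0.5, 1.0, 2.0):
    print(beta, round(scaled_ttt_weibull(0.5, beta), 4))
```

At u = 0.5 the value lies above 0.5 for β = 2 (increasing hazard), equals 0.5 for β = 1 (constant hazard), and lies below 0.5 for β = 0.5 (decreasing hazard).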
FIGURE 7.1
(a) Total time on test transform for increasing hazard functions. Vertical axis: TTT.
$$\tau(x_j) = \sum_{k=1}^{j-1} x_k + (n-j+1)\,x_j \tag{7.35}$$
where the xj are our ordered failure times so τ(xj) is the total amount of test-
ing time that is accumulated by the time of the jth failure. This may be seen
as follows:
$$\tau(x_j) = \sum_{k=1}^{j} (n-k+1)(x_k - x_{k-1}), \qquad x_0 = 0$$
$$F_{X_n}(t) = \begin{cases}
0 & 0 \le t < x_1 \\
1/n & x_1 \le t < x_2 \\
\ \vdots & \\
j/n & x_j \le t < x_{j+1} \\
\ \vdots & \\
1 & x_n \le t < \infty
\end{cases} \tag{7.36}$$
$$F_{X_n}(t) = \begin{cases}
0 & 0 \le t < 0.006 \\
0.02 & 0.006 \le t < 0.019 \\
0.04 & 0.019 \le t < 0.026 \\
0.06 & 0.026 \le t < 0.038 \\
\ \vdots & \\
0.98 & 4.631 \le t < 5.292 \\
1.00 & 5.292 \le t < \infty
\end{cases}$$
A few values of the corresponding inverse function are $F_{X_n}^{-1}(0.04) = 0.019$,
$F_{X_n}^{-1}(0.30) = 0.465$, and $F_{X_n}^{-1}(0.99) = 4.631$. For the same data set, note that
τ(x1) = 50(0.006) = 0.30, τ(x2) = 50(0.006) + 49(0.013) = 0.937, and τ(x3) = 50(0.006) +
49(0.013) + 48(0.007) = 1.273.
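The two equivalent forms of the total time on test can be compared with a short Python sketch. The function names are chosen here for illustration; only the three smallest failure times of Table 7.2 are needed, together with n = 50.

```python
def ttt_direct(x, j, n):
    """tau(x_j) as the sum of the first j - 1 failure times plus (n - j + 1) x_j."""
    return sum(x[: j - 1]) + (n - j + 1) * x[j - 1]


def ttt_increments(x, j, n):
    """tau(x_j) accumulated interval by interval, with x_0 = 0."""
    total, previous = 0.0, 0.0
    for k in range(1, j + 1):
        total += (n - k + 1) * (x[k - 1] - previous)
        previous = x[k - 1]
    return total


x = [0.006, 0.019, 0.026]   # x_1, x_2, x_3 from Table 7.2
n = 50
for j in (1, 2, 3):
    print(j, round(ttt_direct(x, j, n), 3), round(ttt_increments(x, j, n), 3))
```

Both functions give the same totals, since each failed unit contributes its own life length and each surviving unit contributes the current test time.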
The use of the TTT transform to analyze test data proceeds by applying the
transform to FXn(t) in the same manner as for FT(t). That is,
$$H_{F_{X_n}^{-1}}(u) = \int_0^{F_{X_n}^{-1}(u)} \bar{F}_{X_n}(w)\,dw \tag{7.38}$$
However, recognizing that FXn (t) is a step function, the integral may be
expressed as a sum. Consider the transform evaluated at u = j/n:
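$$\begin{aligned}
H_{F_{X_n}^{-1}}\!\left(\frac{j}{n}\right) &= \int_0^{F_{X_n}^{-1}(j/n)} \bar{F}_{X_n}(w)\,dw \\
&= \int_0^{F_{X_n}^{-1}(1/n)} \bar{F}_{X_n}(w)\,dw + \int_{F_{X_n}^{-1}(1/n)}^{F_{X_n}^{-1}(2/n)} \bar{F}_{X_n}(w)\,dw + \cdots + \int_{F_{X_n}^{-1}((j-1)/n)}^{F_{X_n}^{-1}(j/n)} \bar{F}_{X_n}(w)\,dw \\
&= \int_0^{x_1} \frac{n}{n}\,dw + \int_{x_1}^{x_2} \frac{n-1}{n}\,dw + \cdots + \int_{x_{j-1}}^{x_j} \frac{n-j+1}{n}\,dw \\
&= \frac{1}{n} \bigl( n x_1 + (n-1)(x_2 - x_1) + (n-2)(x_3 - x_2) + \cdots + (n-j+1)(x_j - x_{j-1}) \bigr) \\
&= \frac{1}{n}\,\tau(x_j)
\end{aligned}$$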
Thus, the TTT transform applied to the test data reduces to the sum of test
times—the total testing time—defined in Equation 7.35. In addition, in the
limit, the empirical transform corresponds to the theoretical transform for
the underlying life distribution. That is,
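$$\lim_{\substack{n \to \infty \\ j/n \to s}} H_{F_{X_n}^{-1}}\!\left(\frac{j}{n}\right) = \lim_{\substack{n \to \infty \\ j/n \to s}} \int_0^{F_{X_n}^{-1}(j/n)} \bar{F}_{X_n}(w)\,dw = \int_0^{F_T^{-1}(s)} \bar{F}_T(w)\,dw = H_{F_T^{-1}}(s)$$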
$$\Theta_{F_{X_n}^{-1}}\!\left(\frac{j}{n}\right) = \frac{H_{F_{X_n}^{-1}}(j/n)}{H_{F_{X_n}^{-1}}(1)} = \frac{\frac{1}{n}\,\tau(x_j)}{\frac{1}{n}\,\tau(x_n)} = \frac{\tau(x_j)}{\tau(x_n)} \tag{7.39}$$
The application to sample data implied by the earlier discussion and results
is that the quantities τ(xj) and τ(xn) are computed and their ratio is plotted
against j/n. If the result is approximately a 45° line, one concludes that the
life distribution is constant hazard. On the other hand, if the plot is concave
and lies mostly above the 45° line, one concludes that the life distribution is
increasing failure rate (IFR), and if the plot is convex and lies below the 45°
line, one concludes that the life distribution is decreasing failure rate.
Consider the example provided by the data of Table 7.2. The computed val-
ues of the scaled transform are shown in Table 7.5. Figure 7.2 shows the plot
of the values in the table. Note that the points appear to resemble a 45° line.
In fact, the plot seems to cross the 45° line several times and to generally lie
close to it. An examination of the values in the table confirms this. It seems
reasonable to conclude that the test data correspond to a constant hazard life
distribution.
TABLE 7.5
Empirical Values of the Scaled Total Time on Test Transform
xj j/n Θ(x j) xj j/n Θ(x j) xj j/n Θ(x j)
0.006 0.02 0.0046 0.613 0.36 0.3665 1.555 0.70 0.6951
0.019 0.04 0.0143 0.615 0.38 0.3674 1.559 0.72 0.6960
0.026 0.06 0.0194 0.658 0.40 0.3878 1.690 0.74 0.7240
0.038 0.08 0.0281 0.720 0.42 0.4162 1.803 0.76 0.7464
0.056 0.10 0.0407 0.783 0.44 0.4441 1.943 0.78 0.7721
0.068 0.12 0.0489 0.795 0.46 0.4492 2.008 0.80 0.7830
0.090 0.14 0.0637 0.829 0.48 0.4633 2.190 0.82 0.8108
0.129 0.16 0.0893 0.875 0.50 0.4815 2.505 0.84 0.8541
0.129 0.18 0.0893 0.883 0.52 0.4846 2.887 0.86 0.8947
0.223 0.20 0.1482 0.894 0.54 0.4886 3.085 0.88 0.9212
0.279 0.22 0.1824 1.016 0.56 0.5315 3.106 0.90 0.9231
0.299 0.24 0.1943 1.218 0.58 0.5993 3.354 0.92 0.9420
0.336 0.26 0.2158 1.272 0.60 0.6167 3.503 0.94 0.9511
0.455 0.28 0.2831 1.285 0.62 0.6206 3.785 0.96 0.9641
0.465 0.30 0.2886 1.373 0.64 0.6462 4.631 0.98 0.9899
0.548 0.32 0.3329 1.494 0.66 0.6794 5.292 1.00 1.00
0.599 0.34 0.3594 1.541 0.68 0.6916
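The construction of the empirical scaled transform can be sketched in Python using the ordered data of Table 7.2. The function name and the summary statistic at the end are choices made here for illustration; the computed values should be close to, though not necessarily identical with, the entries of Table 7.5.

```python
# Ordered failure times x_1..x_50 from Table 7.2.
x = [
    0.006, 0.019, 0.026, 0.038, 0.056, 0.068, 0.090, 0.129, 0.129, 0.223,
    0.279, 0.299, 0.336, 0.455, 0.465, 0.548, 0.599, 0.613, 0.615, 0.658,
    0.720, 0.783, 0.795, 0.829, 0.875, 0.883, 0.894, 1.016, 1.218, 1.272,
    1.285, 1.373, 1.494, 1.541, 1.555, 1.559, 1.690, 1.803, 1.943, 2.008,
    2.190, 2.505, 2.887, 3.085, 3.106, 3.354, 3.503, 3.785, 4.631, 5.292,
]


def scaled_ttt(x):
    """Return the points (j/n, tau(x_j)/tau(x_n)) for ordered failure times x."""
    n = len(x)
    tau, running = [], 0.0
    for j, x_j in enumerate(x, start=1):
        tau.append(running + (n - j + 1) * x_j)   # tau(x_j)
        running += x_j
    return [(j / n, t / tau[-1]) for j, t in enumerate(tau, start=1)]


points = scaled_ttt(x)
# Largest vertical distance from the 45-degree line, as a rough summary.
max_gap = max(abs(theta - u) for u, theta in points)
print(points[0], points[-1], round(max_gap, 3))
```

Plotting the second coordinate against the first gives the kind of graph described above, and a small `max_gap` is one informal indication that the points stay near the 45° line.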
FIGURE 7.2
Plot of the scaled total time on test transform. Vertical axis: TTT.
TABLE 7.6
Another Example Life Test Data Set
j xj j xj j xj j xj j xj
1 0.023 11 0.303 21 0.511 31 0.754 41 1.224
2 0.025 12 0.370 22 0.522 32 0.767 42 1.252
3 0.081 13 0.371 23 0.532 33 0.795 43 1.344
4 0.110 14 0.373 24 0.571 34 0.802 44 1.378
5 0.185 15 0.394 25 0.579 35 0.873 45 1.562
6 0.226 16 0.400 26 0.596 36 0.884 46 1.580
7 0.230 17 0.412 27 0.605 37 0.936 47 1.653
8 0.278 18 0.435 28 0.627 38 0.993 48 1.659
9 0.278 19 0.449 29 0.673 39 1.001 49 1.764
10 0.287 20 0.494 30 0.753 40 1.087 50 2.520
TABLE 7.7
Empirical Values of the Scaled Total Time on Test Transform
xj j/n Θ(x j) xj j/n Θ(x j) xj j/n Θ(x j)
0.023 0.02 0.0315 0.435 0.36 0.5121 0.873 0.70 0.7880
0.025 0.04 0.0342 0.449 0.38 0.5243 0.884 0.72 0.7925
0.081 0.06 0.1078 0.494 0.40 0.5625 0.936 0.74 0.8125
0.110 0.08 0.1451 0.511 0.42 0.5765 0.993 0.76 0.8328
0.185 0.10 0.2396 0.522 0.44 0.5852 1.001 0.78 0.8354
0.226 0.12 0.2901 0.532 0.46 0.5929 1.087 0.80 0.8613
0.230 0.14 0.2949 0.571 0.48 0.6217 1.224 0.82 0.8988
0.278 0.16 0.3514 0.579 0.50 0.6274 1.252 0.84 0.9057
0.278 0.18 0.3514 0.596 0.52 0.6391 1.344 0.86 0.9259
0.287 0.20 0.3615 0.605 0.54 0.6450 1.378 0.88 0.9324
0.303 0.22 0.3790 0.627 0.56 0.6588 1.562 0.90 0.9626
0.370 0.24 0.4506 0.673 0.58 0.6865 1.580 0.92 0.9651
0.371 0.26 0.4516 0.753 0.60 0.7325 1.653 0.94 0.9731
0.373 0.28 0.4537 0.754 0.62 0.7331 1.659 0.96 0.9735
0.394 0.30 0.4744 0.767 0.64 0.7398 1.764 0.98 0.9793
0.400 0.32 0.4801 0.795 0.66 0.7536 2.520 1.00 1.00
0.412 0.34 0.4913 0.802 0.68 0.7569
FIGURE 7.3
Plot of the scaled total time on test transform for an increasing failure rate distribution. Vertical axis: TTT.
and n − r had not yet failed. For purposes of illustration, assume that only
the first r data values of Table 7.2 had been recorded. In that case, the total
observed test time is
$$\tau(x_r) = \sum_{k=1}^{r-1} x_k + (n-r+1)\,x_r \tag{7.40}$$
For the censored data set, this quantity is used in place of τ(xn) as the scaling
constant and the scaled transform is defined as
$$\Theta_{F_{X_r}^{-1}}\!\left(\frac{j}{r}\right) = \frac{\frac{1}{n}\,\tau(x_j)}{\frac{1}{n}\,\tau(x_r)} = \frac{\tau(x_j)}{\tau(x_r)} \tag{7.41}$$
and this quantity is plotted versus j/r. For the example data set of Table 7.2,
the computed values for r = 12 and for r = 20 are listed in Table 7.8. The cor-
responding plots of the scaled transform are shown in Figure 7.4a and b,
TABLE 7.8
Empirical Values of the Scaled Total Time
on Test Transform for Censored Data Sets
r = 12 r = 20
xj j/r Θ(x j) j/r Θ(x j)
0.006 0.083 0.0236 0.05 0.0118
0.019 0.167 0.0736 0.10 0.0369
0.026 0.250 0.1000 0.15 0.0501
0.038 0.333 0.1444 0.20 0.0723
0.056 0.417 0.2094 0.25 0.1050
0.068 0.500 0.2519 0.30 0.1262
0.090 0.583 0.3280 0.35 0.1643
0.129 0.667 0.4598 0.40 0.2304
0.129 0.750 0.4598 0.45 0.2304
0.223 0.833 0.7627 0.50 0.3822
0.279 0.917 0.9387 0.55 0.4704
0.299 1.00 1.00 0.60 0.5011
0.336 0.65 0.5565
0.455 0.70 0.7299
0.465 0.75 0.7441
0.548 0.80 0.8585
0.599 0.85 0.9268
0.613 0.90 0.9450
0.615 0.95 0.9475
0.658 1.00 1.00
FIGURE 7.4
(a) Plot of the scaled total time on test transform for censored data with r = 12. (b) Plot of the
scaled total time on test transform for censored data with r = 20. Vertical axis: TTT.
respectively. These plots serve to illustrate the facts that the transform may
be applied to censored data and that our ability to interpret the plots is sig-
nificantly influenced by the degree of censoring. For the case in which r = 20,
we can be reasonably confident that the hazard is constant. For the plot cor-
responding to the data censored after r = 12 observations, our conclusion of
constant hazard is rather more tenuous.
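The censored version of the calculation differs from the complete-data case only in the scaling constant and the plotting positions. The Python sketch below uses the 20 smallest failure times of Table 7.2 with n = 50; the function name is an assumption made for this illustration.

```python
# The 20 smallest failure times from Table 7.2; the full sample size is n = 50.
x = [
    0.006, 0.019, 0.026, 0.038, 0.056, 0.068, 0.090, 0.129, 0.129, 0.223,
    0.279, 0.299, 0.336, 0.455, 0.465, 0.548, 0.599, 0.613, 0.615, 0.658,
]


def scaled_ttt_censored(x, n, r):
    """Points (j/r, tau(x_j)/tau(x_r)) when only the first r failures are seen."""
    tau, running = [], 0.0
    for j in range(1, r + 1):
        x_j = x[j - 1]
        tau.append(running + (n - j + 1) * x_j)   # tau(x_j) with n units on test
        running += x_j
    return [(j / r, t / tau[-1]) for j, t in enumerate(tau, start=1)]


points_12 = scaled_ttt_censored(x, n=50, r=12)
points_20 = scaled_ttt_censored(x, n=50, r=20)
print(points_12[0], points_20[0])
```

With r = 12 there are only twelve plotted points, which is one way to see why conclusions drawn from the more heavily censored plot are less secure.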
Once again, to provide a contrast to the constant hazard case, consider
the data from an IFR distribution that is listed in Table 7.6. If that data had
been generated in a test with censoring at either r = 12 or r = 20, the cor-
responding data values would have been those shown in Table 7.9 and the
corresponding plots of the scaled transforms would have been those shown
TABLE 7.9
Empirical Values of the Scaled Total Time on
Test Transform for Censored Data from an
Increasing Failure Rate Distribution
r = 12 r = 20
xj j/r Θ(x j) j/r Θ(x j)
0.023 0.083 0.0699 0.05 0.0560
0.025 0.167 0.0759 0.10 0.0608
0.081 0.250 0.2392 0.15 0.1916
0.110 0.333 0.3220 0.20 0.2580
0.185 0.417 0.5317 0.25 0.4260
0.226 0.500 0.6438 0.30 0.5157
0.230 0.583 0.6545 0.35 0.5243
0.278 0.667 0.7798 0.40 0.6247
0.278 0.750 0.7798 0.45 0.6247
0.287 0.833 0.8023 0.50 0.6427
0.303 0.917 0.8411 0.55 0.6738
0.370 1.00 1.00 0.60 0.8011
0.371 0.65 0.8028
0.373 0.70 0.8066
0.394 0.75 0.8434
0.400 0.80 0.8535
0.412 0.85 0.8734
0.435 0.90 0.9104
0.449 0.95 0.9321
0.494 1.00 1.00
FIGURE 7.5
(a) Plot of the scaled total time on test transform for censored data with r = 12. Vertical axis: TTT.
in Figure 7.5a and b. Here again, it is clear that the degree of censoring affects
the confidence we have in our interpretations of the plots.
In closing this discussion, it should now be reasonably clear that the TTT
transform provides a method that is very simple to perform for characterizing
the hazard behavior of a device population.
$$\bar{F}_T(t) = e^{-\int_0^{t} z_T(x)\,dx} = e^{-Z_T(t)}$$
Nelson [56] suggests that since this expression may be solved to obtain the
cumulative hazard as a function of the reliability, estimates of the cumula-
tive hazard are implied by the Kaplan–Meier estimates of reliability. That is,
cumulative hazard may be estimated by
$$\hat{Z}_T(t) = -\ln\bigl(\hat{\bar{F}}_T(t)\bigr) = -\ln\Bigl( \prod_{i \mid x_i \le t} \hat{p}_i \Bigr) = -\sum_{i \mid x_i \le t} \ln(\hat{p}_i) \tag{7.42}$$
TABLE 7.10
Computed Estimates for the
Cumulative Hazard Function
i xi wi p̂i $\hat{Z}_T(x_i)$
0 0 0
1 1.20 1 0.960 0.041
2 3.27 1 0.958 0.083
3 3.46 1 0.957 0.128
4 3.57 0 1 0.128
5 7.19 1 0.952 0.177
6 10.93 1 0.950 0.228
7 11.73 1 0.947 0.282
8 13.75 0 1 0.282
9 14.14 1 0.941 0.343
10 16.09 1 0.938 0.407
$$\hat{Z}_T(t) = -\sum_{i \mid x_i \le t} \ln(\hat{p}_i) = -\sum_{i \mid x_i \le t} \ln\left( \frac{n_i - c_i - d_i}{n_i - c_i} \right) = -\sum_{i \mid x_i \le t} \ln\left( 1 - \frac{d_i}{n_i - c_i} \right) \tag{7.43}$$
$$\hat{\bar{F}}_T(t) = e^{-\hat{Z}_T(t)}$$
For example, using $\hat{Z}_T(16.09) = 0.407$ yields $\hat{\bar{F}}_T(16.09) = 0.665$, which may be
compared to the Kaplan–Meier estimate of 0.666.
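The cumulative hazard estimates of Equation 7.42 follow directly from the Kaplan–Meier factors. The Python sketch below covers the first ten observations of the example data with n = 25; the variable names are chosen here, and a data loss is given a conditional survival probability of 1 so that it adds nothing to the sum.

```python
import math

n = 25
# (x_i, w_i) for the first ten observations; w = 0 marks a data loss.
observations = [
    (1.20, 1), (3.27, 1), (3.46, 1), (3.57, 0), (7.19, 1),
    (10.93, 1), (11.73, 1), (13.75, 0), (14.14, 1), (16.09, 1),
]

cumulative = 0.0
cum_hazard = []
for i, (x, w) in enumerate(observations, start=1):
    p = (n - i + 1 - w) / (n - i + 1)    # Kaplan-Meier factor; 1 when w = 0
    cumulative -= math.log(p)            # add -ln(p_i) to the running sum
    cum_hazard.append((x, cumulative))

reliability = math.exp(-cumulative)       # survivor estimate implied at x = 16.09
print(round(cumulative, 3), round(reliability, 3))
```

Any small difference in the last digit relative to the figures quoted in the text comes from rounding intermediate values to three decimals.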
In closing, it should now be apparent that nonparametric methods can pro-
vide us with substantial information concerning the reliability of a device
design without requiring us to assume an underlying distribution model of
the dispersion in failure times.
Exercises
7.1 The following data set was obtained from a life test of n = 50 copies of
a component.
i ti i ti i ti i ti i ti
1 551.881 11 297.883 21 122.750 31 539.933 41 141.582
2 964.448 12 78.966 22 119.677 32 175.578 42 329.841
3 687.943 13 526.061 23 568.533 33 465.506 43 570.971
4 206.215 14 558.106 24 453.852 34 208.198 44 929.433
5 844.059 15 484.969 25 267.140 35 326.713 45 67.964
6 439.283 16 282.293 26 128.874 36 154.290 46 294.060
7 170.110 17 589.303 27 675.259 37 703.458 47 23.774
8 273.522 18 1032.227 28 347.812 38 327.022 48 295.930
9 475.883 19 726.202 29 283.398 39 511.423 49 514.202
10 255.646 20 573.447 30 357.552 40 560.902 50 251.874
For these data, identify the values of x1, x5, x10, and x50.
7.2 For the data set of Problem 7.1, compute the point estimates of the reli-
ability at x1, x5, x10, and x50 with both the mean- and the median-based
estimation equations. How do these estimates compare?
7.3 Using the data in Problem 7.1, compute 95% confidence intervals for the
failure probability at each of x1, x5, x10, and x50 using each of Equations 7.8
and 7.11. How do these intervals compare?
7.4 Suppose a sample of 40 copies of a device were placed on test and that
the observed failure times are:
For this data set, use the median rank–based statistics to compute $\hat{F}_T(x_4)$
and the beta distribution–based limits to construct a 95% confidence
interval for that value.
7.5 Test data for a prototype battery [52] were accumulated over 550 hours
with an initial test sample of n = 68 specimens. The batteries were only
inspected at 50 hour intervals so τ = (50, 100, 150, 200, 250, 300, 350, 400,
450, 500, 550). The associated observed failures and specimen removals
were x = (1, 0, 1, 4, 1, 1, 1, 4, 4, 2, 2) and m = (5, 6, 1, 6, 2, 1, 2, 2, 3, 1, 0).
Compute reliability estimates for each of the inspection times and con-
fidence intervals for the estimates at 200 and 400 hours.
7.6 Using the data of Problem 7.1, compute a 95% tolerance bound on the
reliability at x10 and x15. Then compute the level of confidence the data
provide that the reliability at x20 exceeds 0.55.
7.7 The following ordered data set was obtained during a life test of n = 20
copies of a component.
j tj wj j tj wj
1 2663.42 1 15 1391.43 1
2 1031.09 1 16 1020.26 1
3 463.70 1 17 1494.93 1
4 413.09 1 18 1847.83 0
5 866.53 0 19 256.93 1
6 1827.96 1 20 1307.72 0
7 1788.34 1 21 1033.41 0
8 1282.64 1 22 1994.37 1
9 640.61 0 23 1286.57 1
10 1151.34 1 24 692.62 1
11 852.12 1 25 847.15 1
12 659.83 1 26 1562.79 1
13 1429.42 1 27 706.13 1
14 1474.34 1 28 541.41 1
7.10 Use the data set of Problem 7.7 to compute and plot the TTT transform.
Indicate what type of hazard behavior the data display.
7.11 For the following data set obtained from a life test, order the data and
then plot the TTT transform. Indicate what form of the hazard function
is suggested by the plot.
7.12 For the following data set obtained from a life test, order the data and
then plot the TTT transform. Indicate what form of the hazard function
is suggested by the plot.
7.13 For the following data set obtained from a life test, order the data and
then plot the TTT transform. Indicate what form of the hazard function
is suggested by the plot.
i ti i ti i ti i ti i ti
1 635.655 11 456.731 21 335.464 31 282.015 41 170.998
2 369.012 12 459.482 22 100.790 32 172.954 42 465.023
3 312.489 13 420.944 23 453.539 33 216.222 43 254.582
4 196.092 14 306.064 24 82.843 34 204.064 44 319.789
5 72.393 15 216.330 25 356.053 35 228.195 45 285.048
6 22.150 16 180.638 26 255.021 36 528.971 46 307.34
7 302.257 17 137.704 27 302.217 37 270.25 47 318.541
8 114.434 18 159.855 28 181.568 38 117.524 48 242.783
9 68.381 19 231.442 29 93.694 39 70.280 49 458.005
10 200.899 20 203.094 30 314.594 40 93.151 50 130.900
7.14 For the data set of Problem 7.13, assume that because of test termina-
tion, only the earliest 16 data values are available. Construct the plot of
the TTT transform for the resulting censored data set.
7.15 For the data set of Problem 7.13, assume that because of test termina-
tion, only the earliest 25 data values are available. Construct the plot of
the TTT transform for the resulting censored data set.
7.16 Use the complete set of the following life test data to construct a plot of
the TTT transform and indicate what type of behavior the hazard func-
tion appears to have.
i ti i ti i ti i ti
1 29.835 11 1048.13 21 126.097 31 154.884
2 1262.860 12 641.953 22 434.761 32 103.444
3 804.623 13 762.882 23 170.046 33 176.225
4 691.363 14 206.062 24 1880.470 34 252.424
5 654.951 15 593.040 25 1058.727 35 333.961
6 409.087 16 224.793 26 957.271 36 1989.75
7 1615.690 17 203.809 27 2.970 37 1646.63
8 470.408 18 309.879 28 75.239 38 344.135
9 918.823 19 3094.740 29 346.605 39 48.831
10 68.348 20 55.854 30 801.645 40 131.215
7.17 For the data set of Problem 7.16, assume that because of test termina-
tion, only the earliest 12 data values are available. Construct the plot of
the TTT transform for the resulting censored data set.
7.18 For the data set of Problem 7.16, assume that because of test termina-
tion, only the earliest 20 data values are available. Construct the plot of
the TTT transform for the resulting censored data set.
7.19 For the data set in Problem 7.4, use the complete data set to compute
and plot the TTT transform and to characterize the hazard function.
7.20 Use the data in Problem 7.7 to compute Nelson cumulative hazard
estimates x5, x10, x14, x18, and x22 and then plot those estimates.
8
Parametric Statistical Methods
$$\bar F_T(t) = e^{-Z_T(t)} \tag{8.1}$$
so consequently,
$$Z_T(t) = -\ln \bar F_T(t) \tag{8.2}$$
If test data are used to compute estimated values of the reliability and these
estimates are plotted as a function of time, the resulting graph will pro-
vide estimated values for the distribution parameters that form the hazard
function.
$$-\ln \bar F_T(t) = \ln\frac{1}{\bar F_T(t)} = \lambda t \tag{8.3}$$
which is the equation for a line. Hence, if we represent our successive reli-
ability estimates by
$$y_j = \ln\frac{1}{\hat{\bar F}_T(t)} = \ln\frac{1}{(n-j)/n} = \ln\frac{n}{n-j} \tag{8.4}$$
$$\hat\lambda = \frac{n\sum_{j=1}^{n} x_j y_j - \sum_{j=1}^{n} x_j \sum_{j=1}^{n} y_j}{n\sum_{j=1}^{n} x_j^2 - \left(\sum_{j=1}^{n} x_j\right)^2} \tag{8.5}$$
TABLE 8.1
Example Ordered Failure Data
j x_j (n−j)/n y_j j x_j (n−j)/n y_j
1 4.740 0.950 0.051 11 128.756 0.450 0.799
2 12.636 0.900 0.105 12 150.393 0.400 0.916
3 17.358 0.850 0.163 13 168.101 0.350 1.050
4 22.099 0.800 0.223 14 194.277 0.300 1.204
5 29.085 0.750 0.288 15 238.897 0.250 1.386
6 32.732 0.700 0.357 16 303.383 0.200 1.609
7 41.725 0.650 0.431 17 340.621 0.150 1.897
8 57.518 0.600 0.511 18 382.142 0.100 2.303
9 62.864 0.550 0.598 19 492.023 0.050 2.996
10 65.288 0.500 0.693 20 544.017 0.0
FIGURE 8.1
Example failure data plot (y versus x).
$$\ln\left(-\ln \bar F_T(t)\right) = \beta \ln t - \beta \ln\theta \tag{8.6}$$
which is again the equation for a line. In this case, the intercept is nonzero.
We again represent the dependent variable, which is the estimated reliability
at each failure time, by yj and the result is a data set such as the example set
TABLE 8.2
Example Weibull Failure Data
j x_j ln(x_j) $\hat{\bar F}_T(t)$ y_j j x_j ln(x_j) $\hat{\bar F}_T(t)$ y_j
1 390.896 5.968 0.972 −3.577 14 932.309 6.838 0.461 −0.255
2 509.925 6.234 0.933 −2.670 15 957.288 6.864 0.421 −0.146
3 540.671 6.293 0.894 −2.186 16 984.191 6.892 0.382 −0.038
4 594.520 6.388 0.854 −1.849 17 1003.160 6.911 0.343 0.069
5 621.604 6.432 0.815 −1.587 18 1018.753 6.926 0.303 0.177
6 626.117 6.440 0.776 −1.370 19 1030.576 6.938 0.264 0.287
7 679.096 6.521 0.736 −1.183 20 1082.845 6.987 0.224 0.402
8 664.210 6.499 0.697 −1.018 21 1222.792 7.109 0.185 0.523
9 710.355 6.566 0.657 −0.869 22 1279.176 7.154 0.146 0.656
10 714.938 6.572 0.618 −0.732 23 1285.361 7.159 0.106 0.807
11 746.485 6.615 0.579 −0.603 24 1392.606 7.239 0.067 0.995
12 763.342 6.638 0.539 −0.482 25 1577.441 7.364 0.028 1.279
13 775.172 6.653 0.500 −0.367
shown in Table 8.2. Note that the values of the yj listed in the table are com-
puted using Equation 7.5:
$$\hat F_T(t) = \frac{j - 0.3}{n + 0.4}$$
so all of the data may be included in our analysis. The independent variable
in this case is the logarithm of the failure time as specified in Equation 8.6.
The plot of the data is shown in Figure 8.2. Clearly, using the plot to obtain
parameter estimates would be quite difficult for this case. Instead of trying
to judge the behavior of the plot, we use the regression analysis equations to
calculate our estimates.
FIGURE 8.2
Plot of logarithms of Weibull failure data.
The slope and intercept of a line fit to this type of data are given by
Equations 8.7 and 8.8. For the data listed in Table 8.2, the computed value
of the slope is 3.344 and that for the intercept is −23.05. Now, inverting
Equation 8.6, these numerical values correspond to
$\hat\beta = 3.344$
$$\text{Slope} = \frac{n\sum_{j=1}^{n} y_j \ln(x_j) - \left(\sum_{j=1}^{n} y_j\right)\left(\sum_{j=1}^{n} \ln(x_j)\right)}{n\sum_{j=1}^{n} \left(\ln(x_j)\right)^2 - \left(\sum_{j=1}^{n} \ln(x_j)\right)^2} \tag{8.7}$$
$$\text{Intercept} = \frac{\left(\sum_{j=1}^{n} \left(\ln(x_j)\right)^2\right)\left(\sum_{j=1}^{n} y_j\right) - \left(\sum_{j=1}^{n} \ln(x_j)\right)\left(\sum_{j=1}^{n} y_j \ln(x_j)\right)}{n\sum_{j=1}^{n} \left(\ln(x_j)\right)^2 - \left(\sum_{j=1}^{n} \ln(x_j)\right)^2} \tag{8.8}$$
Equations 8.7 and 8.8 are the standard linear regression forms. They have
a corresponding matrix form that is actually easier to understand and use.
For a general linear fit, the data pairs, say, $(u_j, v_j)$, that correspond to a model
$$v = a + bu$$
are arranged in the matrices
$$U = \begin{bmatrix} 1 & u_1 \\ 1 & u_2 \\ \vdots & \vdots \\ 1 & u_n \end{bmatrix} \quad\text{and}\quad V = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$$
so that the model is
$$V = UM$$
with the two coordinates of the matrix M being the intercept “a” and the
slope “b.” The regression solution of this model for a set of data is
$$M = [U'U]^{-1}U'V \tag{8.9}$$
For the Weibull model of Equation 8.6, the solution is
$$M = \begin{bmatrix} -\hat\beta \ln\hat\theta \\ \hat\beta \end{bmatrix}$$
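The regression step can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names `fit_line` and `weibull_from_line` are hypothetical, and the 2 × 2 system $U'U$ is inverted by hand so that the result coincides term by term with Equations 8.7 and 8.8.

```python
import math


def fit_line(u, v):
    """Least-squares intercept a and slope b for v = a + b*u.

    Solves M = (U'U)^{-1} U'V (Equation 8.9) for the 2 x 2 case, where
    U'U = [[n, sum(u)], [sum(u), sum(u^2)]] and U'V = [sum(v), sum(u*v)].
    """
    n = len(u)
    su = sum(u)
    suu = sum(ui * ui for ui in u)
    sv = sum(v)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    det = n * suu - su * su            # determinant of U'U
    a = (suu * sv - su * suv) / det    # intercept, as in Equation 8.8
    b = (n * suv - su * sv) / det      # slope, as in Equation 8.7
    return a, b


def weibull_from_line(intercept, slope):
    """Invert Equation 8.6: slope = beta and intercept = -beta * ln(theta)."""
    beta = slope
    theta = math.exp(-intercept / slope)
    return beta, theta
```

For a Weibull fit, `u` would hold the values $\ln(x_j)$ and `v` the values $y_j$. Applied to the slope 3.344 and intercept −23.05 quoted for Table 8.2, `weibull_from_line` returns a scale estimate of roughly 985.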
There are three final points related to the graphical method. The first is
that the method can be applied to other distributions. However, because
most other distributions used to model life length do not have a closed-
form representation of the cumulative distribution function, estimation of
the parameters for those distributions is usually performed using methods
other than the graphical ones.
Second, if a plot of Weibull data displays curvature, particularly near the
ends, this is evidence of the existence of a third parameter, the minimum life
parameter. This was the parameter δ in Equation 4.14. The best
approach to this situation is to use a search method to identify the value of
the minimum life parameter for which the regression fit is best.
Consider the example data set of Table 8.3, for which the initial plot is
shown in Figure 8.3. Notice the decided nonlinearity of the initial data plot
and the curvature at the ends of the plot. By successive trials using a search
strategy, we find that a value of δ = 400.0 results in the plot shown in
Figure 8.4. Then applying the regression analysis to the data adjusted by δ,
we obtain estimates of $\hat\beta = 2.85$ and $\hat\theta = 606.771$.
Finally, it is important to note that the graphical method (and its regression
equivalent) applies directly to censored data. If, for example, a sample of
n = 40 items is placed on test and the test is terminated after only 18 failures,
the resulting failure times can be plotted against the corresponding estimated
failure probabilities and the parameter estimation equations are the
slope and intercept expressions. As a further example, suppose only the first
FIGURE 8.3
Initial plot of logarithms of Weibull failure data.
TABLE 8.3
Example Weibull Failure Data With Minimum Life
j x_j ln(x_j) $\hat{\bar F}_T(t)$ y_j j x_j ln(x_j) $\hat{\bar F}_T(t)$ y_j
1 589.614 6.379 0.966 −3.355 11 1003.984 6.912 0.475 −0.297
2 640.936 6.463 0.917 −2.442 12 1008.317 6.916 0.426 −0.160
3 670.372 6.508 0.868 −1.952 13 1030.989 6.938 0.377 −0.026
4 731.327 6.595 0.819 −1.609 14 1040.955 6.948 0.328 0.107
5 828.633 6.720 0.770 −1.340 15 1052.332 6.959 0.279 0.243
6 859.266 6.756 0.721 −1.116 16 1062.327 6.968 0.230 0.384
7 870.480 6.769 0.672 −0.921 17 1103.345 7.006 0.181 0.535
8 881.452 6.782 0.623 −0.747 18 1103.629 7.006 0.132 0.704
9 894.234 6.796 0.574 −0.587 19 1192.351 7.084 0.083 0.910
10 934.200 6.840 0.525 −0.438 20 1230.343 7.115 0.034 1.216
FIGURE 8.4
Revised plot of logarithms of Weibull failure data using δ = 400.0.
12 of the data points in Table 8.1 had been observed. This means that 60% of
the copies of the device on test had failed, but the total test time would then
have been around 150.393 units rather than the full 544.017 indicated in the
table. This is a significant savings in test time and the resulting parameter
estimate, $\hat\lambda = 0.0058$, is identical to the one obtained using the full data set.
parameter estimates. The most common realization of this idea is the use of
a sample mean to estimate a population parameter such as its mean.
$$E[T] = \frac{1}{\lambda} \tag{8.10}$$
As the sample mean is a reasonable estimate for the population mean, one
may use $\hat E[T] = \bar x$ so
$$\hat\lambda = \frac{1}{\bar x} \tag{8.11}$$
For the example data in Table 8.1, we obtain $\bar x = 164.43$, which yields the
estimate $\hat\lambda = 0.0061$.
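The exponential method of moments estimate is simple enough to check directly. The short sketch below re-enters the failure times of Table 8.1 and applies Equation 8.11; the variable names are illustrative only.

```python
# Ordered failure times x_j transcribed from Table 8.1 (n = 20).
table_8_1 = [
    4.740, 12.636, 17.358, 22.099, 29.085, 32.732, 41.725, 57.518,
    62.864, 65.288, 128.756, 150.393, 168.101, 194.277, 238.897,
    303.383, 340.621, 382.142, 492.023, 544.017,
]

# Sample mean, used in place of the population mean E[T] = 1/lambda.
x_bar = sum(table_8_1) / len(table_8_1)

# Equation 8.11: the estimate of the hazard rate is the reciprocal of the mean.
lam_hat = 1.0 / x_bar

print(round(x_bar, 2), round(lam_hat, 4))
```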
For the Weibull distribution, two equations are needed to estimate
two parameters. In principle, we could use the mean and the variance
expressions along with the sample mean and the sample variance. However,
there is a slightly simpler approach. In general, the moments of the Weibull
distribution may be determined to be
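$$E[T^k] = \theta^k\,\Gamma\left(1 + \frac{k}{\beta}\right) \tag{8.12}$$
$$E[T] = \theta\,\Gamma\left(1 + \frac{1}{\beta}\right) \tag{8.13}$$
$$\mathrm{Var}[T] = E[T^2] - \left(E[T]\right)^2 = \theta^2\,\Gamma\left(1 + \frac{2}{\beta}\right) - \left(\theta\,\Gamma\left(1 + \frac{1}{\beta}\right)\right)^2 = \theta^2\left(\Gamma\left(1 + \frac{2}{\beta}\right) - \Gamma^2\left(1 + \frac{1}{\beta}\right)\right) \tag{8.14}$$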
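Now, rather than equate the sample mean and sample variance to the distribution
mean and variance, we take the squared coefficient of variation
$$c = \frac{\mathrm{Var}[T]}{E^2[T]} = \frac{\theta^2\left(\Gamma\left(1 + \frac{2}{\beta}\right) - \Gamma^2\left(1 + \frac{1}{\beta}\right)\right)}{\theta^2\,\Gamma^2\left(1 + \frac{1}{\beta}\right)} = \frac{\Gamma\left(1 + \frac{2}{\beta}\right)}{\Gamma^2\left(1 + \frac{1}{\beta}\right)} - 1 \tag{8.15}$$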
which contains only one of the parameters. Given a set of failure data, we
solve this expression numerically for the estimate of β and we then use that
estimate in the expression for the mean to compute an estimate for θ.
Consider an example. The failure data in Table 8.2 have a sample mean value
of 884.153 and a sample variance of 91,548 (standard deviation of 302.569).
Thus, the sample value of c, the squared coefficient of variation, is
91,548/(884.153)² = 0.1171. A numerical
search for the value of β in Equation 8.15 that most closely matches this value
yields an estimate of $\hat\beta = 3.208$. The sequence of functional evaluations used
to find this estimate is shown in Table 8.4. Note that only eleven trials were
needed to obtain the parameter estimate. Once the estimate of β is obtained,
we compute the estimate of the scale parameter using the sample mean in
place of the population mean in Equation 8.13:
$$\hat\theta = \frac{\bar x}{\Gamma\left(1 + \frac{1}{\hat\beta}\right)} = \frac{884.153}{\Gamma(1.3117)} = 987.04$$
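The numerical search summarized in Table 8.4 can be reproduced with a simple bisection, since the right side of Equation 8.15 decreases as β increases. The sketch below is one possible search, not necessarily the strategy used to build the table; the function names and the bracket [0.5, 10] are assumptions made for illustration.

```python
import math


def cv_squared(beta):
    """Right side of Equation 8.15: Gamma(1 + 2/beta) / Gamma(1 + 1/beta)^2 - 1."""
    g1 = math.gamma(1.0 + 1.0 / beta)
    g2 = math.gamma(1.0 + 2.0 / beta)
    return g2 / (g1 * g1) - 1.0


def solve_beta(c_target, lo=0.5, hi=10.0, tol=1e-8):
    """Bisection for the beta at which cv_squared(beta) equals c_target.

    Assumes the root lies between lo and hi; cv_squared is decreasing in beta.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cv_squared(mid) > c_target:
            lo = mid   # c is still too large, so beta must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Sample moments quoted in the text for the data of Table 8.2.
x_bar = 884.153
s2 = 91548.0

c_sample = s2 / x_bar ** 2
beta_hat = solve_beta(c_sample)
theta_hat = x_bar / math.gamma(1.0 + 1.0 / beta_hat)   # Equation 8.13 inverted

print(round(c_sample, 4), round(beta_hat, 3), round(theta_hat, 2))
```

The result should fall close to the estimates reported above; small differences in the last digit can arise from rounding of the sample moments.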
TABLE 8.4
Numerical Search Values
β c
1.0 1.0
5.0 0.0525
1.5 0.4610
3.5 0.1001
3.0 0.1321
3.3 0.1113
3.2 0.1177
3.21 0.1170
3.205 0.1173
3.208 0.11712
3.209 0.11706
In the case of the normal distribution, the estimation process using the
method of moments is direct as the moments of the distribution are the mean
and variance. Thus, for a sample mean of $\bar x$ and sample variance of $s^2$, the
estimation equations are
$$\hat\mu = \bar x, \qquad \hat\sigma^2 = s^2 \tag{8.16}$$
The gamma distribution may also be analyzed using the method of moments.
The approach used for the Weibull distribution is the most efficient one for
the gamma. In general, the moments of the gamma distribution stated in
Equation 4.21 are
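$$E[T^k] = \frac{1}{\lambda^k}\prod_{i=0}^{k-1}(\beta + i) \tag{8.17}$$
$$E[T] = \frac{\beta}{\lambda} \tag{8.18}$$
$$\mathrm{Var}[T] = E[T^2] - \left(E[T]\right)^2 = \frac{\beta(\beta + 1)}{\lambda^2} - \frac{\beta^2}{\lambda^2} = \frac{\beta}{\lambda^2} \tag{8.19}$$
$$c = \frac{\mathrm{Var}[T]}{E^2[T]} = \frac{\beta/\lambda^2}{\beta^2/\lambda^2} = \frac{1}{\beta} \tag{8.20}$$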
It should now be apparent that the method of moments is usually quite easy
to apply. It has intuitive appeal. The disadvantage of the method is that one
does not expect to be able to use it with censored data. An approach for using
censored data has been suggested and is discussed in the final section of this
chapter.
TABLE 8.5
Gamma Distributed Life Test Data
j x_j j x_j
1 692 11 2623
2 995 12 2881
3 1239 13 2972
4 1314 14 3271
5 1530 15 3618
6 1740 16 3889
7 1949 17 4493
8 2056 18 4973
9 2199 19 6214
10 2348 20 7979
$$\frac{2n\bar x}{\chi^2_{2n,\,1-\alpha/2}} \le \theta \le \frac{2n\bar x}{\chi^2_{2n,\,\alpha/2}} \tag{8.21}$$
Referring to the data in Table 8.1, the computed value of the sample mean is
$\bar x = 164.43$, so
$$110.84 = \frac{2(20)(164.43)}{59.34} \le \theta \le \frac{2(20)(164.43)}{24.43} = 269.23$$
Then, using θ = 1/λ the confidence interval for the distribution parameter is
$$0.0037 = \frac{1}{269.23} \le \lambda \le \frac{1}{110.84} = 0.0090$$
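$$\bar x\, e^{-z_{1-\alpha/2}/\sqrt{n}} \le \theta \le \bar x\, e^{-z_{\alpha/2}/\sqrt{n}} \tag{8.22}$$
$$106.08 = 164.43\, e^{-1.96/\sqrt{20}} \le \theta \le 164.43\, e^{-(-1.96)/\sqrt{20}} = 254.87$$
$$0.0039 = \frac{1}{254.87} \le \lambda \le \frac{1}{106.08} = 0.0094$$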
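Both sets of limits for the exponential parameter reduce to a few lines of arithmetic. The sketch below repeats the calculation using the quantile values quoted in the text (59.34, 24.43 and 1.96) rather than computing them, so it needs nothing beyond the standard library; the variable names are illustrative.

```python
import math

n = 20
x_bar = 164.43          # sample mean of the Table 8.1 data

# Chi-square interval (Equation 8.21) with the quantiles quoted in the text.
chi2_upper = 59.34      # chi-square quantile with 2n = 40 degrees of freedom, 1 - alpha/2
chi2_lower = 24.43      # chi-square quantile with 2n = 40 degrees of freedom, alpha/2
theta_lo = 2 * n * x_bar / chi2_upper
theta_hi = 2 * n * x_bar / chi2_lower

# Limits on lambda follow from theta = 1/lambda, which reverses the order.
lam_lo = 1.0 / theta_hi
lam_hi = 1.0 / theta_lo

# Normal approximation (Equation 8.22) with z = 1.96.
z = 1.96
theta_lo_approx = x_bar * math.exp(-z / math.sqrt(n))
theta_hi_approx = x_bar * math.exp(z / math.sqrt(n))

print(round(theta_lo, 2), round(theta_hi, 2))
print(round(lam_lo, 4), round(lam_hi, 4))
print(round(theta_lo_approx, 2), round(theta_hi_approx, 2))
```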
For example, using the approximate confidence limits 0.0039 ≤ λ ≤ 0.0094 for
a time of 100 hours leads to
$$e^{-0.0094(100)} = 0.391 \le \bar F_T(100) \le e^{-0.0039(100)} = 0.677$$
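$$\frac{\pi}{s_w\sqrt{6}}\, e^{1.049\left(z_{\alpha/2}/\sqrt{n}\right)} \le \beta \le \frac{\pi}{s_w\sqrt{6}}\, e^{1.049\left(z_{1-\alpha/2}/\sqrt{n}\right)} \tag{8.24}$$
$$e^{\bar w + \frac{s_w\sqrt{6}}{\pi}\left(\gamma + 1.081\, z_{\alpha/2}/\sqrt{n}\right)} \le \theta \le e^{\bar w + \frac{s_w\sqrt{6}}{\pi}\left(\gamma + 1.081\, z_{1-\alpha/2}/\sqrt{n}\right)} \tag{8.25}$$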
It is acknowledged that these intervals are quite wide, but they represent
the most direct and simplest intervals available for the Weibull distribution
parameters when using the method of moments.
As in the case of the exponential distribution, the confidence intervals on
the parameters may be used to define confidence intervals on the system
reliability. However, the probability content of the intervals is no longer equal
to that of the parameters and is not generally known. It may be conjectured
to be approximately $(1 - \alpha)^2$. For the Weibull, the intervals for the reliability
function are
$$e^{-\left(t/\theta_{\text{lower}}\right)^{\beta_{\text{upper}}}} \le \bar F_T(t) \le e^{-\left(t/\theta_{\text{upper}}\right)^{\beta_{\text{lower}}}} \tag{8.26}$$
Applying this expression to the estimates computed for the data in Table 8.2
yields
$$\bar x + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar x + t_{1-\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \tag{8.27}$$
$$s\left(\frac{n-1}{\chi^2_{1-\alpha/2,\,n-1}}\right)^{1/2} \le \sigma \le s\left(\frac{n-1}{\chi^2_{\alpha/2,\,n-1}}\right)^{1/2} \tag{8.28}$$
As an example, suppose that the data in Table 8.2 were assumed to represent
the behavior of a device for which the normal distribution is an appropriate
model. The sample mean and sample standard deviation for that data are
884.153 and 302.569, respectively. Applying Equations 8.27 and 8.28 to those
values yields
$$759.25 = 884.153 - 2.064\,\frac{302.569}{\sqrt{25}} \le \mu \le 884.153 + 2.064\,\frac{302.569}{\sqrt{25}} = 1009.05$$
$$236.27 = 302.569\left(\frac{24}{39.36}\right)^{1/2} \le \sigma \le 302.569\left(\frac{24}{12.40}\right)^{1/2} = 420.94$$
Note that the same approach as is suggested in Equation 8.26 for the Weibull
distribution applies to the normal distribution reliability values.
The sampling distributions for the method of moments parameter esti-
mates for the gamma distribution have not been determined. Consequently,
confidence intervals for the parameter estimates cannot be computed.
However, when the shape parameter is known, it is possible to define a
confidence interval on the scale parameter. The key to constructing that
interval is the fact that the sum of gamma random variables has a gamma
distribution. Thus, if a sample is taken from a population having a gamma
life distribution with parameters β and λ, the sum of the sample obser-
vations, say, ξ, will be gamma distributed with parameters nβ and λ.
Furthermore, the quantity λξ will be gamma distributed with parameters
nβ and 1. Using these facts, a 1 − α probability confidence interval defined
on that ratio is
éc c ù
Pr [ cl £ lx £ cu ] = Pr ê l £ l £ u ú = 1 - a
ëx xû
where cl and cu are the α/2 and 1 – α/2 quantiles of the gamma distribution
having parameters nβ and 1. Therefore, the corresponding confidence inter-
val for the scale parameter is
$$\frac{c_l}{\xi} \le \lambda \le \frac{c_u}{\xi} \tag{8.29}$$
Clearly, this construction has no rigorous basis, but it may provide some
practical guidance in judging device life length. For example, the analysis of the
data in Table 8.5 yielded the parameter estimates of $\hat\beta = 2.554$ and $\hat\lambda = 0.00086$.
For 1 – α = 0.95, the quantiles of the gamma distribution with parameters
nβ = 20(2.554) = 51.08 and 1 are $c_l = 38.042$ and $c_u = 66.010$, so Equations 8.29
and 8.30 yield
$$0.00065 = \frac{38.042}{58{,}975} \le \lambda \le \frac{66.010}{58{,}975} = 0.00112$$
$$1.902 = \frac{38.042}{20} \le \beta \le \frac{66.010}{20} = 3.300$$
$$\max_{p}\{b(16, 40, p)\} = \max_{p}\left\{\binom{40}{16} p^{16}(1 - p)^{24}\right\} = 0.128$$
occurs at p = 0.40. This is illustrated in Figure 8.5 in which the binomial prob-
ability of 16 heads in 40 trials is plotted against the value of p. Note that one
usually plots event probabilities against the events but that this plot provides
an alternate view of the probabilities—one focused on the parameter of the
distribution.
FIGURE 8.5
Likelihood function for a binomial sample (probability versus p).
$$L(x, n, p) = b(x, n, p) = \binom{n}{x} p^x (1 - p)^{n-x}$$
and to find the maximum, we take the derivative with respect to p and set it
equal to zero:
$$\frac{d}{dp}L(x, n, p) = \frac{d}{dp}\binom{n}{x} p^x (1 - p)^{n-x} = \binom{n}{x}\left(x p^{x-1}(1 - p)^{n-x} - (n - x)p^x (1 - p)^{n-x-1}\right) = 0$$
$$x(1 - p) - (n - x)p = x - px - np + px = x - np = 0$$
$$\hat p = \frac{x}{n}$$
For our example, this means that $\hat p = 0.40$ is the maximum likelihood
estimate for the probability of heads.
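The binomial example can be checked numerically by evaluating the likelihood over a grid of values of p and comparing the location of the largest value with the closed-form estimate x/n. The grid spacing of 0.001 and the function name below are choices made only for this illustration.

```python
import math


def binom_likelihood(x, n, p):
    """Binomial probability of x successes in n trials, viewed as a function of p."""
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)


x, n = 16, 40

# Evaluate the likelihood on a grid of p values and pick the largest.
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=lambda p: binom_likelihood(x, n, p))

# Closed-form maximum likelihood estimate.
p_hat = x / n

print(p_grid, p_hat, round(binom_likelihood(x, n, p_hat), 3))
```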
The application of the principle of maximum likelihood to life test data
is direct. We form the likelihood function as the joint distribution of the
$$L(x, \theta) = f_T(x, \theta) = f_T(x_1, x_2, \ldots, x_n, \theta) \tag{8.31}$$
We next note the fact that the individual failure times are mutually indepen-
dent so the joint distribution may be expressed as the product of the mar-
ginal distributions:
$$f_T(x_1, x_2, \ldots, x_n, \theta) = \prod_{j=1}^{n} f_T(x_j, \theta) \tag{8.32}$$
Then we observe further that the values of the parameters that maximize the
likelihood function are the same as the ones that maximize the logarithm of
the likelihood function, so in those cases in which this equivalence is useful,
we can exploit it.
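Consider the exponential distribution for which there is only one parameter
so $\theta = \{\lambda\}$ and
$$f_T(x_j, \theta) = f_T(x_j, \lambda) = \lambda e^{-\lambda x_j}$$
so the likelihood function is
$$L(x, \lambda) = \prod_{j=1}^{n} \lambda e^{-\lambda x_j} = \lambda^n e^{-\lambda\sum_{j=1}^{n} x_j} \tag{8.33}$$
and its logarithm is
$$\ln\left(L(x, \lambda)\right) = n\ln\lambda - \lambda\sum_{j=1}^{n} x_j \tag{8.34}$$
To maximize this function we take the derivative and set it equal to zero
$$\frac{d}{d\lambda}\ln\left(L(x, \lambda)\right) = \frac{n}{\lambda} - \sum_{j=1}^{n} x_j = 0$$
$$\hat\lambda = \frac{1}{\frac{1}{n}\sum_{j=1}^{n} x_j} = \frac{1}{\bar x} \tag{8.35}$$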
A check of the second derivative condition indicates that the second deriva-
tive is clearly negative, so the solution we have found is a maximum. In this
particular case, but not in general, the maximum likelihood and method
of moments estimates are the same. As noted earlier, the example data in
Table 8.1 display a mean value of $\bar x = 164.43$ so our estimate of the
parameter is $\hat\lambda = 0.0061$. In addition, since the estimation expressions are the same
for the methods of moments and for the maximum likelihood method, the
confidence intervals for the estimates are also the same for the two methods.
Thus, when computing maximum likelihood parameter estimates for expo-
nential distributions, Equations 8.21 through 8.23 may be used to construct
the associated confidence intervals.
In the case of the Weibull distribution, the concept is the same but there are
two parameters. Thus, we form the likelihood function
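$$L(x, \beta, \theta) = \prod_{j=1}^{n} f_T(x_j, \beta, \theta) = \prod_{j=1}^{n} \frac{\beta x_j^{\beta-1}}{\theta^\beta}\, e^{-\left(x_j/\theta\right)^\beta} = \frac{\beta^n}{\theta^{n\beta}}\left(\prod_{j=1}^{n} x_j^{\beta-1}\right) e^{-\sum_{j=1}^{n}\left(x_j/\theta\right)^\beta} \tag{8.36}$$
The logarithm of the likelihood function is
$$\ln\left(L(x, \beta, \theta)\right) = n\ln\beta - n\beta\ln\theta + (\beta - 1)\sum_{j=1}^{n}\ln x_j - \frac{1}{\theta^\beta}\sum_{j=1}^{n} x_j^\beta \tag{8.37}$$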
and to maximize this function, we set the partial derivatives equal to zero.
We solve the resulting expressions for our parameter estimates. First,
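$$\frac{\partial}{\partial\theta}\ln\left(L(x, \beta, \theta)\right) = -\frac{n\beta}{\theta} + \frac{\beta}{\theta^{\beta+1}}\sum_{j=1}^{n} x_j^\beta = 0$$
$$\hat\theta^{\hat\beta} = \frac{1}{n}\sum_{j=1}^{n} x_j^{\hat\beta}$$
$$\hat\theta = \left(\frac{1}{n}\sum_{j=1}^{n} x_j^{\hat\beta}\right)^{1/\hat\beta} \tag{8.38}$$
and then
$$\frac{\partial}{\partial\beta}\ln\left(L(x, \beta, \theta)\right) = \frac{n}{\beta} - n\ln\theta + \sum_{j=1}^{n}\ln x_j + \frac{\ln\theta}{\theta^\beta}\sum_{j=1}^{n} x_j^\beta - \frac{1}{\theta^\beta}\sum_{j=1}^{n} x_j^\beta\ln x_j = 0$$
$$\frac{1}{\hat\beta} + \frac{1}{n}\sum_{j=1}^{n}\ln x_j - \frac{\sum_{j=1}^{n} x_j^{\hat\beta}\ln x_j}{\sum_{j=1}^{n} x_j^{\hat\beta}} = 0 \tag{8.39}$$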
This final expression must be solved numerically for the value of the
estimate for β and that value is then used to compute the estimated
value of θ.
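The numerical solution of Equation 8.39 can be sketched with a bisection search, because the left-hand side falls from positive to negative values as β increases. The code below is an illustrative sketch rather than the search actually used in the text: the function names and the bracket [0.5, 10] are assumptions, and the data are the failure times of Table 8.2 re-entered in table order.

```python
import math

# Failure times x_j transcribed from Table 8.2 (n = 25), in table order.
table_8_2 = [
    390.896, 509.925, 540.671, 594.520, 621.604, 626.117, 679.096,
    664.210, 710.355, 714.938, 746.485, 763.342, 775.172, 932.309,
    957.288, 984.191, 1003.160, 1018.753, 1030.576, 1082.845,
    1222.792, 1279.176, 1285.361, 1392.606, 1577.441,
]


def mle_residual(beta, x):
    """Left-hand side of Equation 8.39 for a trial value of beta."""
    n = len(x)
    s0 = sum(v ** beta for v in x)
    s1 = sum(v ** beta * math.log(v) for v in x)
    return 1.0 / beta + sum(math.log(v) for v in x) / n - s1 / s0


def weibull_mle(x, lo=0.5, hi=10.0, tol=1e-8):
    """Bisection on Equation 8.39, then Equation 8.38 for theta.

    Assumes the root lies in [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mle_residual(mid, x) > 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    theta = (sum(v ** beta for v in x) / len(x)) ** (1.0 / beta)
    return beta, theta


beta_hat, theta_hat = weibull_mle(table_8_2)
print(round(beta_hat, 3), round(theta_hat, 3))
```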
For maximum likelihood estimation for distributions that are expressed
using more than one parameter, the second derivatives are important for
several reasons. First, they may confirm that maxima have been identified.
In the case of the Weibull distribution, the second partial derivatives of the
likelihood function are
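$$\frac{\partial^2}{\partial\theta^2}\ln\left(L(x, \beta, \theta)\right) = \frac{n\beta}{\theta^2} - \frac{\beta(\beta + 1)}{\theta^{\beta+2}}\sum_{j=1}^{n} x_j^\beta \tag{8.40}$$
$$\frac{\partial^2}{\partial\beta\,\partial\theta}\ln\left(L(x, \beta, \theta)\right) = -\frac{n}{\theta} + \frac{1 - \beta\ln\theta}{\theta^{\beta+1}}\sum_{j=1}^{n} x_j^\beta + \frac{\beta}{\theta^{\beta+1}}\sum_{j=1}^{n} x_j^\beta\ln x_j \tag{8.41}$$
$$\frac{\partial^2}{\partial\beta^2}\ln\left(L(x, \beta, \theta)\right) = -\frac{n}{\beta^2} - \frac{(\ln\theta)^2}{\theta^\beta}\sum_{j=1}^{n} x_j^\beta + \frac{2\ln\theta}{\theta^\beta}\sum_{j=1}^{n} x_j^\beta\ln x_j - \frac{1}{\theta^\beta}\sum_{j=1}^{n} x_j^\beta(\ln x_j)^2 \tag{8.42}$$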
Arranging the second partial derivatives in matrix form yields the Hessian
matrix, which, when multiplied by −1 in a statistical context, is known as the
Fisher information matrix. In an optimization context, a sufficient condition
for a stationary point to be a maximum is that the Hessian matrix be negative
definite. In a statistical context, the inverse of the Fisher information
matrix is the variance–covariance matrix for the sampling distribution on
the estimators. Thus, for the maximum likelihood estimators of the Weibull
parameters, the Hessian matrix is
$$H = \begin{bmatrix} \dfrac{\partial^2\ln L}{\partial\beta^2} & \dfrac{\partial^2\ln L}{\partial\beta\,\partial\theta} \\ \dfrac{\partial^2\ln L}{\partial\beta\,\partial\theta} & \dfrac{\partial^2\ln L}{\partial\theta^2} \end{bmatrix} \tag{8.43}$$
TABLE 8.6
Trial Values of the Parameter Estimate
β̂ l.h.s. β̂ l.h.s.
2.0 0.288 3.20 0.002
4.0 −0.113 3.22 −0.001
3.0 0.038 3.21 0.0004
3.5 −0.046 3.215 −0.0005
3.25 −0.006 3.212 0
3.15 0.011
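$$F = \begin{bmatrix} -\dfrac{\partial^2\ln L}{\partial\beta^2} & -\dfrac{\partial^2\ln L}{\partial\beta\,\partial\theta} \\ -\dfrac{\partial^2\ln L}{\partial\beta\,\partial\theta} & -\dfrac{\partial^2\ln L}{\partial\theta^2} \end{bmatrix} \tag{8.44}$$
$$F^{-1} = \begin{bmatrix} \mathrm{Var}(\hat\beta) & \mathrm{Cov}(\hat\beta, \hat\theta) \\ \mathrm{Cov}(\hat\beta, \hat\theta) & \mathrm{Var}(\hat\theta) \end{bmatrix} \tag{8.45}$$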
Consider the example of the failure data in Table 8.2. Using a numerical
search strategy to solve Equation 8.39, the values of the estimate for β that are
evaluated are shown in Table 8.6.
Observe that only eleven values were needed to obtain substantial precision
in the estimate of $\hat\beta = 3.212$. Substituting that value into Equation 8.38
yields $\hat\theta = 988.257$. These parameter estimates and the data are included in
Equations 8.40 through 8.42 to obtain
$$H = \begin{bmatrix} \dfrac{\partial^2\ln L}{\partial\beta^2} = -4.725 & \dfrac{\partial^2\ln L}{\partial\beta\,\partial\theta} = 1.164\times 10^{-2} \\ \dfrac{\partial^2\ln L}{\partial\beta\,\partial\theta} = 1.164\times 10^{-2} & \dfrac{\partial^2\ln L}{\partial\theta^2} = -2.64\times 10^{-4} \end{bmatrix}$$
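Given the numerical Hessian, the variance and covariance estimates of Equation 8.45 follow from inverting the 2 × 2 matrix F = −H. The sketch below does this by hand using the rounded entries quoted above, with rows and columns ordered (β, θ) as in Equation 8.43. Because those entries are rounded, the results are approximate; the helper name `inverse_2x2` is an assumption for illustration.

```python
# Hessian entries quoted in the text, ordered (beta, theta).
H = [[-4.725, 1.164e-2],
     [1.164e-2, -2.64e-4]]

# Fisher information matrix: F = -H (Equation 8.44).
F = [[-h for h in row] for row in H]


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]


cov = inverse_2x2(F)           # Equation 8.45
var_beta = cov[0][0]
cov_beta_theta = cov[0][1]
var_theta = cov[1][1]

print(round(var_beta, 4), round(cov_beta_theta, 2), round(var_theta, 1))
```

As a consistency check, dividing the resulting variance of the scale estimate by the square of the estimate 988.257 gives about 0.0044, which is the approximate variance of the logarithm of the scale estimate.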
Given the values of the variances of the sampling distributions on the parame-
ter estimates, it is possible to construct confidence intervals on those estimates.
To obtain the intervals, Nelson [57] and Lawless [58] show that the most efficient
approach is to transform the Weibull data and estimates into their extreme
value forms, construct the intervals, and then invert the transformations.
The transformation from the Weibull is to let the random variable of the
distribution, say, w, be defined by w = ln t. With this identity, the distribution
is reparameterized by δ = 1/β and η = ln θ so that the distribution becomes
$$F_W(w) = 1 - e^{-e^{(w - \eta)/\delta}} \tag{8.46}$$
While it may not be apparent, the normal distribution may also be transformed
to this form, and consequently, the sampling distribution on the
parameter estimate set $(\hat\delta, \hat\eta)$ is nearly normal. As a result, the confidence
intervals on the extreme value distribution parameter estimates may be used
to define the confidence intervals on the Weibull parameter estimates. This is
done by inverting the transformation, and when the process is implemented,
the approximate 1 – α probability confidence intervals are
$$\hat\theta\, e^{z_{\alpha/2}\sqrt{\mathrm{Var}(\hat\theta)}/\hat\theta} \le \theta \le \hat\theta\, e^{z_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat\theta)}/\hat\theta} \tag{8.48}$$
Further use of the methods defined by Lawless [58] permits the construction
of confidence limits on quantiles of the distribution and on reliability
estimates. The transformation of the Weibull to the extreme value distribution
that is defined by w = ln t implies that $w_\gamma = \ln t_\gamma$ where, as defined in
Chapter 7, $t_\gamma$ is the time at which $F_T(t) = \gamma$. Therefore, the corresponding value
of the extreme value random variable is $w_\gamma = \eta + \delta\,\bar F_W^{-1}(1 - \gamma)$. Recognizing that
$F_W(w_\gamma) = \gamma$ implies that $\bar F_W(w_\gamma) = 1 - \gamma$ and $\bar F_W^{-1}(1 - \gamma) = \ln(-\ln(1 - \gamma))$, the
expression for $w_\gamma$ is analyzed by replacing η and δ with their estimates, which are
$\hat\delta = 1/\hat\beta$ and $\hat\eta = \ln\hat\theta$, to obtain the interval
$$\hat w_\gamma + z_{\alpha/2}\sqrt{\mathrm{Var}(\hat w_\gamma)} \le w_\gamma \le \hat w_\gamma + z_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat w_\gamma)}$$
The variance and covariance values are obtained from the inverse of the
Fisher information matrix for the extreme value distribution where the like-
lihood function is defined by replacing f T(t) by f W(w) in Equation 8.36. Thus,
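$$F = \begin{bmatrix} -\dfrac{\partial^2\ln L}{\partial\eta^2} & -\dfrac{\partial^2\ln L}{\partial\delta\,\partial\eta} \\ -\dfrac{\partial^2\ln L}{\partial\delta\,\partial\eta} & -\dfrac{\partial^2\ln L}{\partial\delta^2} \end{bmatrix}$$
$$\frac{\partial\ln L(w, \eta, \delta)}{\partial\eta} = -\frac{n}{\delta} + \frac{1}{\delta}\sum_{i=1}^{n} e^{(w_i - \eta)/\delta}$$
$$\frac{\partial\ln L(w, \eta, \delta)}{\partial\delta} = -\frac{n}{\delta} - \sum_{i=1}^{n}\frac{w_i - \eta}{\delta^2} + \sum_{i=1}^{n}\frac{w_i - \eta}{\delta^2}\, e^{(w_i - \eta)/\delta}$$
$$\frac{\partial^2\ln L(w, \eta, \delta)}{\partial\eta^2} = -\frac{1}{\delta^2}\sum_{i=1}^{n} e^{(w_i - \eta)/\delta}$$
$$\frac{\partial^2\ln L(w, \eta, \delta)}{\partial\delta^2} = \frac{n}{\delta^2} + 2\sum_{i=1}^{n}\frac{w_i - \eta}{\delta^3} - 2\sum_{i=1}^{n}\frac{w_i - \eta}{\delta^3}\, e^{(w_i - \eta)/\delta} - \sum_{i=1}^{n}\frac{(w_i - \eta)^2}{\delta^4}\, e^{(w_i - \eta)/\delta}$$
$$\frac{\partial^2\ln L(w, \eta, \delta)}{\partial\delta\,\partial\eta} = \frac{n}{\delta^2} - \frac{1}{\delta^2}\sum_{i=1}^{n} e^{(w_i - \eta)/\delta} - \sum_{i=1}^{n}\frac{w_i - \eta}{\delta^3}\, e^{(w_i - \eta)/\delta}$$
$$F^{-1} = \begin{bmatrix} 0.00435 & -0.00103 \\ -0.00103 & 0.00223 \end{bmatrix}$$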
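the estimated value of any quantile is calculated using $\hat w_\gamma = \hat\eta + \hat\delta\ln(-\ln(1 - \gamma))$.
Taking $\hat\delta = 1/\hat\beta$ and $\hat\eta = \ln\hat\theta$ yields
and the estimator for u is approximately normal with a mean of $E[\hat u] = (E[w] - \eta)/\delta$
and variance of
$$\mathrm{Var}(\hat u) = \left(\frac{1}{\hat\delta^2}\right)\left(\mathrm{Var}(\hat\eta) + 2\hat u\,\mathrm{Cov}(\hat\eta, \hat\delta) + \hat u^2\,\mathrm{Var}(\hat\delta)\right)$$
Using these definitions,
$$e^{-e^{\hat u + z_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat u)}}} \le \bar F_W(w) = \bar F_T\left(e^w\right) \le e^{-e^{\hat u + z_{\alpha/2}\sqrt{\mathrm{Var}(\hat u)}}} \tag{8.50}$$
For the example data of Table 8.2, take t = 900 so $\hat u = \dfrac{\ln(900) - \hat\eta}{\hat\delta} = -0.3005$
and $\mathrm{Var}(\hat u) = 0.0533$. Using these values, the 95% confidence interval for the
reliability at 900 hours is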
We proceed in the same manner as we did with the Weibull. We take the
partial derivatives
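$$\frac{\partial}{\partial\lambda}\ln\left(L(x, \beta, \lambda)\right) = \frac{n\beta}{\lambda} - \sum_{j=1}^{n} x_j = 0$$
$$\hat\lambda = \frac{\hat\beta}{\frac{1}{n}\sum_{j=1}^{n} x_j} = \frac{\hat\beta}{\bar x} \tag{8.54}$$
$$\frac{\partial}{\partial\beta}\ln\left(L(x, \beta, \lambda)\right) = n\ln\lambda - n\psi(\beta) + \sum_{j=1}^{n}\ln x_j = 0$$
$$\ln\hat\beta - \ln\left(\frac{1}{n}\sum_{j=1}^{n} x_j\right) - \psi(\hat\beta) + \frac{1}{n}\sum_{j=1}^{n}\ln x_j = 0 \tag{8.55}$$
where the psi function is the derivative of the logarithm of the gamma
function
$$\psi(\beta) = \frac{d}{d\beta}\ln\Gamma(\beta) = \frac{\Gamma'(\beta)}{\Gamma(\beta)}$$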
TABLE 8.7
Trial Values of the Parameter Estimate
β̂ l.h.s. β̂ l.h.s.
2.0 0.093 2.9 0.005
4.0 −0.048 2.98 −0.006
3.0 −0.002 2.97 0
This function is relatively well behaved. There are both tables of the function
and numerical strategies for computing it. One such algorithm is included in
Appendix A.
The final form of Equation 8.55 is obtained by substituting Equation 8.54
into the partial derivative with respect to β. Once again, we solve numerically
for the estimate for β and then use that value to compute $\hat\lambda$ using
Equation 8.54. For the example data of Table 8.5, Table 8.7 shows that only six
function evaluations are needed to obtain $\hat\beta = 2.97$, and using this value, we
obtain $\hat\lambda = 0.00101$.
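A sketch of the gamma calculation follows. The Python standard library has no psi (digamma) function, so the code below uses a common approximation that is an assumption of this sketch and is not the algorithm of Appendix A: the recurrence ψ(x) = ψ(x + 1) − 1/x is applied until the argument is at least 6, and a truncated asymptotic series is used from there. Equation 8.55 is then solved by bisection on a bracket chosen for illustration.

```python
import math

# Life test data x_j transcribed from Table 8.5 (n = 20).
table_8_5 = [692, 995, 1239, 1314, 1530, 1740, 1949, 2056, 2199, 2348,
             2623, 2881, 2972, 3271, 3618, 3889, 4493, 4973, 6214, 7979]


def digamma(x):
    """Approximate psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x      # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def gamma_mle(x, lo=0.1, hi=50.0, tol=1e-9):
    """Solve Equation 8.55 for beta by bisection, then Equation 8.54 for lambda."""
    n = len(x)
    x_bar = sum(x) / n
    # Equation 8.55 rearranged: ln(beta) - psi(beta) = ln(x_bar) - mean(ln x)
    s = math.log(x_bar) - sum(math.log(v) for v in x) / n
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.log(mid) - digamma(mid) > s:
            lo = mid           # left side decreases as beta grows
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    return beta, beta / x_bar


beta_hat, lam_hat = gamma_mle(table_8_5)
print(round(beta_hat, 2), round(lam_hat, 5))
```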
In principle, confidence intervals for the parameter estimates, quantiles,
and reliability values for the gamma distribution cannot be defined. A key
reason for this is that there is no closed-form expression for the derivative of
the psi function. Nevertheless, a heuristic algorithm for pursuing the con-
fidence interval analysis like the one defined for the Weibull distribution
earlier can be constructed. However, it is heuristic and has no demonstrable
statistical properties.
To start, note that the second partial derivatives of the logarithm of the
gamma likelihood function are
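$$\frac{\partial^2}{\partial\lambda^2}\ln\left(L(x, \beta, \lambda)\right) = -\frac{n\beta}{\lambda^2} \tag{8.56}$$
$$\frac{\partial^2}{\partial\beta\,\partial\lambda}\ln\left(L(x, \beta, \lambda)\right) = \frac{n}{\lambda} \tag{8.57}$$
$$\frac{\partial^2}{\partial\beta^2}\ln\left(L(x, \beta, \lambda)\right) = -n\frac{\partial}{\partial\beta}\psi(\beta) \tag{8.58}$$
$$F = \begin{bmatrix} -\dfrac{\partial^2\ln L}{\partial\beta^2} = n\dfrac{\partial}{\partial\beta}\psi(\beta) & -\dfrac{\partial^2\ln L}{\partial\beta\,\partial\lambda} = -\dfrac{n}{\lambda} \\ -\dfrac{\partial^2\ln L}{\partial\beta\,\partial\lambda} = -\dfrac{n}{\lambda} & -\dfrac{\partial^2\ln L}{\partial\lambda^2} = \dfrac{n\beta}{\lambda^2} \end{bmatrix} \tag{8.59}$$
$$F^{-1} = \begin{bmatrix} \mathrm{Var}(\hat\beta) & \mathrm{Cov}(\hat\beta, \hat\lambda) \\ \mathrm{Cov}(\hat\beta, \hat\lambda) & \mathrm{Var}(\hat\lambda) \end{bmatrix} \tag{8.60}$$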
Clearly, given estimates for the parameters of the distribution, all of the
terms in the information matrix are well defined except $\dfrac{\partial}{\partial\beta}\psi(\beta)$.
There are several equivalent formulas for the psi function and one that
appears helpful is [59]
$$\psi(x + 1) = -\gamma + \sum_{j=1}^{\infty}\frac{x}{j(j + x)} \tag{8.61}$$
where γ is Euler’s constant. Taking the derivative of this form of the psi func-
tion yields
$$\frac{d\psi(x + 1)}{dx} = \sum_{j=1}^{\infty}\frac{1}{(j + x)^2} \tag{8.62}$$
$$F = \begin{bmatrix} 7.992 & -19856.9 \\ -19856.9 & 5.855\times 10^{7} \end{bmatrix}$$
where the derivative of the psi function was truncated at 0.39 after 100 terms
of the sum were included. For this matrix,
$$F^{-1} = \begin{bmatrix} 0.795 & 0.00027 \\ 0.00027 & 1.085\times 10^{-7} \end{bmatrix}$$
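The truncated sum for the derivative of the psi function is easy to reproduce. The sketch below evaluates the partial sum of Equation 8.62 with x = β − 1, so that the result approximates the derivative at β; the function name `trigamma_partial` is an assumption for illustration.

```python
def trigamma_partial(beta, terms):
    """Partial sum of Equation 8.62 with x = beta - 1, approximating d psi / d beta."""
    x = beta - 1.0
    return sum(1.0 / (j + x) ** 2 for j in range(1, terms + 1))


beta_hat = 2.97

# The 100-term truncation described in the text.
truncated = trigamma_partial(beta_hat, 100)

# A much longer sum, to indicate how far the truncated value is from convergence.
longer = trigamma_partial(beta_hat, 100000)

print(round(truncated, 3), round(longer, 4))
```

The series converges slowly, since the neglected tail after m terms is roughly 1/m. The 100-term value is therefore about 0.01 below the longer sum, a difference that carries through to the leading entry of F when multiplied by n = 20.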
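Using the same variance estimates, take
$$\hat t_\gamma = \hat E[T] + z_\gamma\hat\sigma_T = \frac{\hat\beta}{\hat\lambda} + z_\gamma\sqrt{\frac{\hat\beta}{\hat\lambda^2}}$$
and compute
$$\mathrm{Var}(\hat t_\gamma) = \mathrm{Var}(\hat\beta) + 2\hat t_\gamma\,\mathrm{Cov}(\hat\beta, \hat\lambda) + \hat t_\gamma^2\,\mathrm{Var}(\hat\lambda)$$
For this example, the estimate of the 10% quantile is $\hat t_{0.10} = 755.199$ and the
associated confidence interval is
$$752.995 \le t_{0.10} \le 757.403$$
$$\hat z = \frac{t - \hat E[T]}{\hat\sigma_T} = \frac{t - \hat\beta/\hat\lambda}{\sqrt{\hat\beta/\hat\lambda^2}}$$
Next, take
$$\mathrm{Var}(\hat z) = \mathrm{Var}(\hat\beta) + 2\hat z\,\mathrm{Cov}(\hat\beta, \hat\lambda) + \hat z^2\,\mathrm{Var}(\hat\lambda)$$
and compute the confidence intervals as
$$\Phi\left(\hat z + z_{\alpha/2}\sqrt{\mathrm{Var}(\hat z)}\right) \le F_T(t) \le \Phi\left(\hat z + z_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat z)}\right)$$
For the example, t = 1000.0, $\hat z = -1.139$ and $\mathrm{Var}(\hat z) = 0.794$ so
$$0.002 \le F_T(1000) \le 0.728$$
and
$$0.272 \le \bar F_T(1000) \le 0.998$$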
One final case that is quite intuitive (and more precise) is the normal dis-
tribution. Taking the density function for the normal as stated in Chapter 4,
the likelihood function is
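$$L(x, \mu, \sigma^2) = \prod_{j=1}^{n} f_T(x_j, \mu, \sigma^2) = \prod_{j=1}^{n}\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x_j - \mu)^2/2\sigma^2} = \frac{1}{(2\pi)^{n/2}\sigma^n}\, e^{-\frac{1}{2\sigma^2}\sum_{j=1}^{n}(x_j - \mu)^2} \tag{8.63}$$
so the logarithm of the likelihood function is
$$\ln L(x, \mu, \sigma^2) = -\frac{n}{2}\ln(2\pi) - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{j=1}^{n}(x_j - \mu)^2 \tag{8.64}$$
Setting the partial derivatives equal to zero,
$$\frac{\partial}{\partial\sigma}\ln L(x, \mu, \sigma^2) = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{j=1}^{n}(x_j - \mu)^2 = 0$$
$$\frac{\partial}{\partial\mu}\ln L(x, \mu, \sigma^2) = \frac{1}{\sigma^2}\sum_{j=1}^{n}(x_j - \mu) = 0$$
so that
$$\hat\mu = \frac{1}{n}\sum_{j=1}^{n} x_j = \bar x \tag{8.65}$$
$$\hat\sigma^2 = \frac{1}{n}\sum_{j=1}^{n}(x_j - \hat\mu)^2 \tag{8.66}$$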
TABLE 8.8
Normally Distributed Life Test Data
2029.37 2018.04 2051.76
2006.77 1990.25 1968.18
1971.75 2041.18 1913.2
1994.76 2002.67 1968.34
1982.65 1945.68 1983.37
1972.79 1985.5 1975.8
2017.15 1993.63 1994.95
1967.66 1974.04 2025.12
1976.01
matrix for the parameter estimates. For the normal distribution, the second
partial derivatives are
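$$\frac{\partial^2}{\partial\sigma^2}\ln L(x, \mu, \sigma^2) = \frac{n}{\sigma^2} - \frac{3}{\sigma^4}\sum_{j=1}^{n}(x_j - \mu)^2 \tag{8.67}$$
$$\frac{\partial^2}{\partial\mu\,\partial\sigma}\ln L(x, \mu, \sigma^2) = -\frac{2}{\sigma^3}\sum_{j=1}^{n}(x_j - \mu) \tag{8.68}$$
$$\frac{\partial^2}{\partial\mu^2}\ln L(x, \mu, \sigma^2) = -\frac{n}{\sigma^2} \tag{8.69}$$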
For the data in Table 8.8, following the same pattern as in the analyses of the
Weibull and gamma distributions yields
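$$F = \begin{bmatrix} -\dfrac{\partial^2\ln L}{\partial\hat\mu^2} = 0.0285 & -\dfrac{\partial^2\ln L}{\partial\hat\mu\,\partial\hat\sigma} = 3.844\times 10^{-7} \\ -\dfrac{\partial^2\ln L}{\partial\hat\mu\,\partial\hat\sigma} = 3.844\times 10^{-7} & -\dfrac{\partial^2\ln L}{\partial\hat\sigma^2} = 0.0570 \end{bmatrix}$$
so
$$F^{-1} = \begin{bmatrix} 35.088 & -2.366\times 10^{-4} \\ -2.366\times 10^{-4} & 17.544 \end{bmatrix}$$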
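$$\sqrt{\frac{(n - 1)\hat\sigma^2}{\chi^2_{1-\alpha/2,\,n-1}}} \le \sigma \le \sqrt{\frac{(n - 1)\hat\sigma^2}{\chi^2_{\alpha/2,\,n-1}}} \tag{8.71}$$
$$1977.80 \le \mu \le 2002.25$$
$$23.125 = \sqrt{\frac{24(29.63)^2}{39.4}} \le \sigma \le \sqrt{\frac{24(29.63)^2}{12.4}} = 41.222$$
to obtain
For the example data of Table 8.8, $\hat t_{0.10} = 1952.04$ and $\mathrm{Var}(\hat t_\gamma) = 52.63$ so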
Finally, for the confidence intervals on reliability values, use of the standard
normal distribution leads to
$$\hat F_T(t) = \Phi\left(\frac{t - \hat\mu}{\hat\sigma}\right) \tag{8.72}$$
and
$$\hat F_T(t) + z_{\alpha/2}\sqrt{\mathrm{Var}(\hat F_T(t))} \le F_T(t) \le \hat F_T(t) + z_{1-\alpha/2}\sqrt{\mathrm{Var}(\hat F_T(t))} \tag{8.73}$$
where
$$\mathrm{Var}(\hat F_T(t)) = \frac{\hat F_T(t)\left(1 - \hat F_T(t)\right)}{n}$$
For the example data in Table 8.8, take t = 1975 so $\hat z = -0.507$ and $\hat F_T(t) = 0.306$
so
$$0.125 \le F_T(1975) \le 0.487$$
and
$$0.513 \le \bar F_T(1975) \le 0.875$$
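The normal distribution example can be traced end to end with the standard library. The sketch below re-enters the data of Table 8.8, computes the maximum likelihood estimates of Equations 8.65 and 8.66, and then applies Equations 8.72 and 8.73 at t = 1975. The standard normal distribution function is written in terms of `math.erf`, and z = 1.96 is assumed for a 95% interval.

```python
import math

# Life test data transcribed from Table 8.8 (n = 25).
table_8_8 = [
    2029.37, 2018.04, 2051.76, 2006.77, 1990.25, 1968.18, 1971.75,
    2041.18, 1913.2, 1994.76, 2002.67, 1968.34, 1982.65, 1945.68,
    1983.37, 1972.79, 1985.5, 1975.8, 2017.15, 1993.63, 1994.95,
    1967.66, 1974.04, 2025.12, 1976.01,
]

n = len(table_8_8)
mu_hat = sum(table_8_8) / n                                             # Equation 8.65
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in table_8_8) / n)    # Equation 8.66


def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


t = 1975.0
z_hat = (t - mu_hat) / sigma_hat
F_hat = std_normal_cdf(z_hat)                                           # Equation 8.72

# Equation 8.73 with a binomial-type variance for the estimated probability.
half_width = 1.96 * math.sqrt(F_hat * (1.0 - F_hat) / n)
lower, upper = F_hat - half_width, F_hat + half_width

print(round(mu_hat, 2), round(sigma_hat, 2), round(z_hat, 3), round(F_hat, 3))
print(round(lower, 3), round(upper, 3))
```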
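$$L(x, \theta) = \left(\prod_{j=1}^{r} f_T(x_j, \theta)\right)\left(\bar F_T(x_r)\right)^{n-r} \tag{8.74}$$
$$L(x, \beta, \theta) = \left(\prod_{j=1}^{r}\frac{\beta x_j^{\beta-1}}{\theta^\beta}\, e^{-\left(x_j/\theta\right)^\beta}\right)\left(e^{-\left(x_r/\theta\right)^\beta}\right)^{n-r} = \frac{\beta^r}{\theta^{r\beta}}\left(\prod_{j=1}^{r} x_j^{\beta-1}\right) e^{-\sum_{j=1}^{r}\left(x_j/\theta\right)^\beta - (n-r)\left(x_r/\theta\right)^\beta} \tag{8.75}$$
$$\frac{\partial}{\partial\theta}\ln\left(L(x, \beta, \theta)\right) = -\frac{r\beta}{\theta} + \frac{\beta}{\theta^{\beta+1}}\left(\sum_{j=1}^{r} x_j^\beta + (n - r)x_r^\beta\right)$$
$$\frac{\partial}{\partial\beta}\ln\left(L(x, \beta, \theta)\right) = \frac{r}{\beta} - r\ln\theta + \sum_{j=1}^{r}\ln x_j + \frac{\ln\theta}{\theta^\beta}\left(\sum_{j=1}^{r} x_j^\beta + (n - r)x_r^\beta\right) - \frac{1}{\theta^\beta}\left(\sum_{j=1}^{r} x_j^\beta\ln x_j + (n - r)x_r^\beta\ln x_r\right)$$
$$\hat\theta = \left(\frac{1}{r}\left(\sum_{j=1}^{r} x_j^{\hat\beta} + (n - r)x_r^{\hat\beta}\right)\right)^{1/\hat\beta} \tag{8.77}$$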
and substituting this expression into the second of the partial derivative
equations, we have
$$\frac{1}{\hat\beta} + \frac{1}{r}\sum_{j=1}^{r}\ln x_j - \frac{\sum_{j=1}^{r} x_j^{\hat\beta}\ln x_j + (n - r)x_r^{\hat\beta}\ln x_r}{\sum_{j=1}^{r} x_j^{\hat\beta} + (n - r)x_r^{\hat\beta}} = 0 \tag{8.78}$$
Once again, we solve (8.78) numerically for $\hat\beta$ and then compute $\hat\theta$ using (8.77).
Note that the forms of Equations 8.77 and 8.78 are consistent with those
obtained for complete data sets. In fact, if we take r = n, these equations
reduce to those obtained for complete data sets.
As an example, suppose only the first r = 15 of the failure times of Table 8.2
had been observed. Using the equations developed earlier, we obtain the
parameter estimates β̂ = 3.513 and θ̂ = 965.571. A check of the Hessian matrix
evaluated at these values confirms that they constitute a maximum.
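The numerical solution of Equation 8.78 followed by Equation 8.77 can be sketched as follows. The sketch assumes a simple bisection search, which is only one of several root-finding options, and it uses hypothetical failure times because Table 8.2 is not reproduced here; the function names are likewise illustrative.

```python
import math

def score(beta, x, n):
    """Left-hand side of Equation 8.78 for the r observed times x out of n."""
    r, x_r = len(x), max(x)
    # Scale every time by x_r so the powers stay bounded; the ratio is unchanged.
    w = [(v / x_r) ** beta for v in x]
    denom = sum(w) + (n - r)
    numer = sum(wi * math.log(v) for wi, v in zip(w, x)) + (n - r) * math.log(x_r)
    return 1.0 / beta + sum(math.log(v) for v in x) / r - numer / denom

def weibull_mle_censored(x, n, lo=0.01, hi=50.0, iterations=200):
    """Bisection on Eq. 8.78 for beta, then Eq. 8.77 for theta."""
    if score(lo, x, n) * score(hi, x, n) > 0:
        raise ValueError("no sign change on the search interval")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(lo, x, n) * score(mid, x, n) <= 0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    r, x_r = len(x), max(x)
    scaled = sum((v / x_r) ** beta for v in x) + (n - r)
    theta = x_r * (scaled / r) ** (1.0 / beta)
    return beta, theta

# Hypothetical data: the 10 earliest failure times (hours) among n = 15 units.
sample = [150.0, 310.0, 420.0, 515.0, 600.0, 690.0, 770.0, 850.0, 930.0, 1010.0]
beta_hat, theta_hat = weibull_mle_censored(sample, 15)
print(beta_hat, theta_hat)
```

Setting n equal to the number of observed times removes the censoring terms, which mirrors the remark that the equations reduce to the complete-data forms when r = n.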
For the method of maximum likelihood, even with censored data, the sec-
ond derivatives provide an estimate of the Fisher information matrix and
hence of the variance–covariance matrix for the estimators. This means that
the same methods that were used for complete data sets can be applied to the
censored data to obtain confidence intervals for the parameter estimates, for
estimates of distribution quantiles, and for reliability estimates.
To complete this discussion, it is appropriate to note that the method of
maximum likelihood is very appealing because of its intuitive foundation.
It is widely used and the fact that it can sometimes be applied to censored
data is an additional positive feature of the method. The main disadvantage
of the method is that the estimators obtained are not always unbiased. In
particular, the estimators for the Weibull distribution are not unbiased. Even
for the normal distribution, the estimator for the mean is unbiased, but the
one for the standard deviation is not.
\[
\bar{x} = \frac{1}{n} \left( \sum_{j=1}^{r} x_j + (n-r) \left( x_r + E[Y] \right) \right) \tag{8.79}
\]
where Y is the remaining life of the components that have not yet failed. The
general expression for the expected value of Y has been shown by Cox [61]
to be
\[
E[Y] = \frac{\mu^2 + \sigma^2}{2\mu} \tag{8.80}
\]
where
μ is the mean of the life distribution
σ² is the variance of the life distribution
For the exponential distribution, this expected value is 1/λ so Equation 8.79
becomes
\[
\bar{x} = \frac{1}{n} \left( \sum_{j=1}^{r} x_j + (n-r) \left( x_r + \frac{1}{\hat{\lambda}} \right) \right)
\]
and we equate this to the population mean to solve for the estimate of the
distribution parameter:
\[
\frac{1}{\hat{\lambda}} = \frac{1}{n} \left( \sum_{j=1}^{r} x_j + (n-r) \left( x_r + \frac{1}{\hat{\lambda}} \right) \right)
\]
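Solving this relation for the estimate takes one line of algebra. The step below is a worked rearrangement of the preceding expression, offered as a reading aid rather than as an additional result:

```latex
\frac{n}{\hat{\lambda}} = \sum_{j=1}^{r} x_j + (n-r)x_r + \frac{n-r}{\hat{\lambda}}
\;\Longrightarrow\;
\frac{r}{\hat{\lambda}} = \sum_{j=1}^{r} x_j + (n-r)x_r
\;\Longrightarrow\;
\hat{\lambda} = \frac{r}{\sum_{j=1}^{r} x_j + (n-r)x_r}
```

That is, the estimate is the number of observed failures divided by the total test time accumulated by all n units.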
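\[
E[Y] = \frac{\theta \Gamma(1 + 2/\beta)}{2 \Gamma(1 + 1/\beta)}
\]
\[
\bar{x} = \frac{1}{n} \left( \sum_{j=1}^{r} x_j + (n-r) \left( x_r + \frac{\theta \Gamma(1 + 2/\hat{\beta})}{2 \Gamma(1 + 1/\hat{\beta})} \right) \right)
= \frac{1}{n} \left( \sum_{j=1}^{r} x_j + (n-r) x_r \right) + \frac{n-r}{n} \left( \frac{\theta \Gamma(1 + 2/\hat{\beta})}{2 \Gamma(1 + 1/\hat{\beta})} \right)
\]
\[
s^2 = \frac{1}{n-1} \left( \sum_{j=1}^{r} x_j^2 + (n-r) \left( x_r + \frac{\theta \Gamma(1 + 2/\hat{\beta})}{2 \Gamma(1 + 1/\hat{\beta})} \right)^2 - n \bar{x}^2 \right)
\]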
\[
\hat{\theta} = \frac{2 \Gamma(1 + 1/\hat{\beta}) \left( \sum_{j=1}^{r} x_j + (n-r) x_r \right)}{2n \Gamma^2(1 + 1/\hat{\beta}) - (n-r) \Gamma(1 + 2/\hat{\beta})}
\]
This expression is used in place of θ in the expressions for both the sample
mean and the sample variance that are included in the coefficient of varia-
tion as in the case of uncensored data. A numerical search yields an estimate
for β, which is then used in the previous equation to compute an estimate
for θ. As an example, the first r = 15 data values from Table 8.2 have been used
to obtain parameter estimates of β̂ = 2.391 and θ̂ = 1175.1.
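The role of the expression for the scale estimate can be illustrated with a short sketch. It covers only the inner step of the procedure: for a trial shape value, it computes the scale value for which the adjusted sample mean of Equation 8.79 equals the Weibull mean. The outer numerical search on β is not shown, the data are hypothetical (Table 8.2 is not reproduced here), and the function names are illustrative.

```python
import math

def weibull_theta_given_beta(beta, x, n):
    """Scale value implied by equating the adjusted sample mean to the
    Weibull mean, for the r observed failure times x out of n units."""
    r, x_r = len(x), max(x)
    g1 = math.gamma(1.0 + 1.0 / beta)
    g2 = math.gamma(1.0 + 2.0 / beta)
    total = sum(x) + (n - r) * x_r
    return 2.0 * g1 * total / (2.0 * n * g1 ** 2 - (n - r) * g2)

def adjusted_sample_mean(beta, theta, x, n):
    """Equation 8.79 with Cox's E[Y] for the Weibull distribution."""
    r, x_r = len(x), max(x)
    e_y = theta * math.gamma(1.0 + 2.0 / beta) / (2.0 * math.gamma(1.0 + 1.0 / beta))
    return (sum(x) + (n - r) * (x_r + e_y)) / n

# Hypothetical data and a trial shape value, chosen only for illustration.
sample = [150.0, 310.0, 420.0, 515.0, 600.0, 690.0, 770.0, 850.0, 930.0, 1010.0]
beta_trial = 2.0
theta_trial = weibull_theta_given_beta(beta_trial, sample, 15)
mean_check = adjusted_sample_mean(beta_trial, theta_trial, sample, 15)
# The two printed values should coincide: adjusted mean and Weibull mean.
print(mean_check, theta_trial * math.gamma(1.0 + 1.0 / beta_trial))
```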
In the case of the gamma distribution, Cox’s result implies
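\[
E[Y] = \frac{\beta + 1}{2\lambda}
\]
\[
\bar{x} = \frac{1}{n} \left( \sum_{j=1}^{r} x_j + (n-r) \left( x_r + \frac{\beta + 1}{2\lambda} \right) \right)
= \frac{1}{n} \left( \sum_{j=1}^{r} x_j + (n-r) x_r \right) + \frac{n-r}{n} \left( \frac{\beta + 1}{2\lambda} \right)
\]
\[
s^2 = \frac{1}{n-1} \left( \sum_{j=1}^{r} x_j^2 + (n-r) \left( x_r + \frac{\beta + 1}{2\lambda} \right)^2 - n \bar{x}^2 \right)
\]
\[
\hat{\lambda} = \frac{2n\hat{\beta} - (n-r)(\hat{\beta} + 1)}{2 \left( \sum_{j=1}^{r} x_j + (n-r) x_r \right)}
\]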
This expression is used in place of λ in the expressions for both the sam-
ple mean and the sample variance, which are included in the coefficient
of variation as in the case of uncensored data. A numerical search yields
an estimate for β, which is then used in the previous equation to compute an
estimate for λ. As an example, the first r = 12 data values from Table 8.5 have
been used to obtain parameter estimates of β̂ = 3.540 and λ̂ = 0.00118.
As should now be evident, it is possible to use right censored data to con-
struct estimators for distribution parameters using the method of moments.
The process is fairly involved, but it does yield reasonable estimator values.
However, the sampling distributions for the estimator values have not been
obtained so confidence intervals cannot be constructed.
FIGURE 8.6
Representation of a three-level step stress regimen. [Plot of stress level
against time: levels s1, s2, s3 applied in successive intervals ending at
τ1, τ2, τ3.]
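\[
x_j = \begin{cases}
a_1 t_j, & 0 < t_j \le \tau_1 \\
a_1 \tau_1 + a_2 (t_j - \tau_1), & \tau_1 < t_j \le \tau_2 \\
a_1 \tau_1 + a_2 (\tau_2 - \tau_1) + a_3 (t_j - \tau_2), & \tau_2 < t_j \le \tau_3 \\
\quad \vdots \\
a_1 \tau_1 + a_2 (\tau_2 - \tau_1) + \cdots + a_m (t_j - \tau_{m-1}), & \tau_{m-1} < t_j \le \tau_m
\end{cases} \tag{8.82}
\]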
Then, this data set may be analyzed using any of the parametric (or nonpara-
metric) methods described previously.
Consider an example. Suppose n = 25 copies of a device for which life
length is temperature dependent and the nominal operating temperature is
55°C are subjected to a step stress test with the regimen {(85°C, 40 h), (95°C,
100 h), (105°C, 60 h)}. Suppose further that the device has an activation energy
of 0.80 eV and that the failure times recorded during the test are those
shown in Table 8.9. The corresponding equivalent failure times are also
shown in the table. The equivalent failure times are the life lengths to which
the accelerated ages correspond. Subjecting the equivalent failure times to
the method of moments estimation procedure for a Weibull distribution
yields the parameter estimates of β̂ = 0.835 and θ̂ = 886.427.
TABLE 8.9
Example Data for a Step Stress Test
j tj xj j tj xj
1 1.316 14.056 14 52.155 689.435
2 3.411 36.427 15 56.620 785.784
3 6.031 64.396 16 61.629 893.898
4 9.122 97.403 17 67.305 1016.375
5 12.675 135.346 18 73.811 1156.797
6 16.702 178.343 19 81.387 1320.296
7 21.228 226.667 20 90.390 1514.574
8 26.290 280.722 21 101.389 1751.944
9 31.940 341.051 22 143.300 2723.871
10 38.242 408.348 23 160.333 3439.661
11 41.236 453.784 24 191.578 4752.684
12 44.524 524.749 25 No failure
13 48.146 602.917
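The conversion of Equation 8.82 can be sketched as follows. The sketch assumes that the acceleration factors follow the Arrhenius model relative to the 55°C nominal temperature, with a Boltzmann constant of 8.617333 × 10^-5 eV/K; that model and the constant are assumptions here, since this section does not restate them, so the results should match Table 8.9 only approximately. Function names are illustrative.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant, eV/K (assumed value)

def arrhenius_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Acceleration factor of a stress temperature relative to the use temperature."""
    t_use = use_temp_c + 273.15
    t_stress = stress_temp_c + 273.15
    return math.exp(activation_energy_ev / BOLTZMANN_EV_PER_K
                    * (1.0 / t_use - 1.0 / t_stress))

def equivalent_age(t, factors, durations):
    """Equation 8.82: accumulate a_i times the time spent at each stress level."""
    age, start = 0.0, 0.0
    for a, d in zip(factors, durations):
        spent = min(max(t - start, 0.0), d)  # time spent at this level before t
        age += a * spent
        start += d
    return age

# Regimen of the example: 40 h at 85 C, 100 h at 95 C, 60 h at 105 C.
factors = [arrhenius_factor(0.80, 55.0, temp) for temp in (85.0, 95.0, 105.0)]
durations = [40.0, 100.0, 60.0]
for t in (1.316, 41.236, 143.300):
    print(t, round(equivalent_age(t, factors, durations), 1))
```

Small differences from the tabulated equivalent times would be expected if the table was built with slightly different physical constants.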
This is to say that if 85°C were the normal operating temperature and if the
Weibull distribution having parameters β̂ = 0.834 and θ̂ = 87.632 were an
accurate model of device life length, then the device population should have a
reliability value of 0.50 at a time near te,13 = 56.469.
The shape invariance of the life distribution implies that, following an
initial operating period of length τ, the observed and equivalent ages under
age acceleration are related by
\[
a(t_j - \tau) = t_{e,j} - \tau \tag{8.83}
\]
For the example time values earlier, the realization of this expression is
Note that the choice of the 13th failure time was arbitrary. If we had used
the 16th recorded failure time, the extrapolated failure time would have
been
\[
\hat{\bar{F}}_T(t_{e,16}) = \frac{9.7}{25.4} = 0.382 = e^{-(t_{e,16}/87.632)^{0.834}} \;\Rightarrow\; t_{e,16} = 83.719
\]
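The extrapolation step can be checked with a short sketch. It assumes that the reliability estimate at the jth failure is the median-rank plotting position 1 - (j - 0.3)/(n + 0.4), which is inferred from the ratio 9.7/25.4 above rather than stated in this passage; the function names are illustrative.

```python
import math

def median_rank_survival(j, n):
    """Assumed survivor-function plotting position 1 - (j - 0.3)/(n + 0.4)."""
    return 1.0 - (j - 0.3) / (n + 0.4)

def weibull_time_at_reliability(reliability, beta, theta):
    """Invert exp(-(t/theta)**beta) = reliability for t."""
    return theta * (-math.log(reliability)) ** (1.0 / beta)

# Parameter values quoted in the text for operation at 85 C.
for j in (13, 16):
    r_j = median_rank_survival(j, 25)
    print(j, round(r_j, 3), round(weibull_time_at_reliability(r_j, 0.834, 87.632), 3))
```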
Exercises
8.1 Assume the following data have been generated in the life test of a
component that is believed to have exponential life distribution. Use
the complete data set to obtain a graphical estimate of the distribution
parameter.
8.2 Assume that only the earliest 20 data values for Problem 8.1 are avail-
able and use them to construct the graphical estimate for λ.
8.3 Use the data set of Problem 8.1 to compute the method of moments
estimate for λ. Then compute a 95% confidence interval for the estimate
and for the reliability at 150 hours.
8.4 Assume the following data have been generated in the life test of a
component that is believed to have exponential life distribution. Use
the complete data set to obtain a graphical estimate of the distribution
parameter.
8.5 Assume that only the earliest 20 data values for Problem 8.4 are avail-
able and use them to construct the graphical estimate for λ.
8.6 Use the data set of Problem 8.4 to compute the method of moments
estimate for λ. Then compute a 95% confidence interval for the estimate
and for the reliability at 500 hours.
8.7 Assume the following data have been generated in the life test of a com-
ponent that is believed to have Weibull life distribution. Use the complete
data set to obtain graphical estimates of the distribution parameters.
8.8 Assume that only the earliest 20 data values for Problem 8.7 are avail-
able and use them to construct the graphical estimate for the distribu-
tion parameters.
8.9 Use the complete data set of Problem 8.7 to compute the method of
moments estimate for the distribution parameters.
8.10 Using the definitions of the matrix U and the vector V of Section 8.1,
derive Equations 8.7 and 8.8.
8.11 Assume the following data have been generated in the life test
of a component that is believed to have Weibull life distribution.
Compute the method of moments estimates of the distribution
parameters. Also compute the 95% confidence bounds on the param-
eter estimates.
8.12 Suppose a sample of 40 copies of a device were placed on test and that
the observed failure times are
Assume that the appropriate choice of life distribution model for the
devices is the Weibull distribution and use the graphical/regression
analysis method to compute estimates of the distribution parameters.
8.13 For the data of Problem 8.11, compute the maximum likelihood esti-
mates of the distribution parameters and the associated 95% confidence
intervals for those estimates.
8.14 Use the solutions of Problem 8.13 to compute confidence bounds on t0.10
and on FT (250).
8.15 Suppose the life test of Problem 8.11 had been censored so that only the
first 24 data values were observed. Compute the maximum likelihood
estimates for the distribution parameters.
8.16 Assume the following data have been generated in the life test of a
component that is believed to have gamma life distribution. Use the
complete data set to obtain the method of moments estimates of the
distribution parameters.
8.21 For the data of Problem 8.18, compute 95% confidence intervals on
the maximum likelihood parameter estimates and also on t0.10 and on
FT ( 425).
8.22 Suppose 42 copies of a device for which temperature-dependent life
length is believed to be well represented by a Weibull distribution were
subjected to a step stress accelerated life test with a stress regimen of
{(85°C, 40 h), (95°C, 80 h), (105°C, 40 h)} and that the data in the follow-
ing table were obtained. Compute estimates for the distribution param-
eters at the normal operating temperature of 85°C and for the activation
energy.
j tj j tj
1 0.3799 22 40.5757
2 0.3894 23 44.3383
3 0.5062 24 48.9265
4 1.3523 25 49.5716
5 2.9877 26 51.2090
6 3.0075 27 51.5310
7 3.1151 28 53.6824
8 3.4273 29 54.7999
9 4.3191 30 63.3631
10 6.2674 31 64.6060
11 6.6553 32 69.4093
12 8.3147 33 69.6571
13 13.8439 34 75.79811
14 15.7537 35 101.3423
15 16.2473 36 110.6855
16 17.7215 37 117.9570
17 22.3923 38 125.9469
18 23.1186 39 129.8984
19 24.5316 40 130.6095
20 26.7895 41 140.1335
21 36.8783 42 No failure
9
Repairable Systems I: Renewal
and Instantaneous Repair
Definition 9.1
Definition 9.2
and we say that Sk is the duration of the interval over which k copies of the
device are operated to failure. As each of the Ti has the same probability dis-
tribution function, FT(t), the probability distribution on Sk is the k-fold convo-
lution of FT(t). We represent this as
where
\[
F_T^{(k)}(t) = \int_0^t F_T^{(k-1)}(t - u) f_T(u)\,du \tag{9.3}
\]
FIGURE 9.1
Illustration of a renewal process. [Time line starting at 0 with successive
operating intervals T1, T2, T3 ending at the renewal times S1, S2, S3.]
There are two natural questions that one can address using this basic
model. These are as follows: (1) How long until the kth failure? (2) How many
failures will occur over a fixed time interval? That is, how soon and how
many? It is easy to see that the answers to these questions form the basis for
planning the volume of spare parts that are purchased, the extent of invest-
ment in repair facilities and equipment, the levels of staffing that are estab-
lished and perhaps the extent to which substitute systems are acquired.
The general expression of these questions is that we wish to examine the
time until an arbitrary number of renewals have occurred and the probabili-
ties on the number of renewals over a defined time interval. The probabilities
for these two measures are related by the important fundamental relation
\[
N_t \ge k \iff S_k \le t \tag{9.4}
\]
where Nt represents the number of renewals that occur during the interval
[0, t]. This expression is a bit subtle and is worth pondering. An example
realization is that the number of renewals during 1000 hours can equal or
exceed 4 only if the time of the fourth renewal is equal to or earlier than
t = 1000 hours.
We exploit this relationship to address the questions of how soon and
how many. Note also that we are really considering a measure in the “time
domain” and a measure in the “frequency domain” and that Equation 9.4
is the bridge between the two. One further point is that once we have used
Equation 9.4 to address the questions of time until the kth renewal and the
number of renewals over time, we can use the results of that analysis to deter-
mine (1) the expected number of renewals during an interval, (2) the identity
of the renewal density, (3) higher moments of the distribution on renewals,
(4) the distribution of backward recurrence times, and (5) the distribution on
forward recurrence times. The forward recurrence time is the time from an
arbitrary point in time until the next event (failure) and the backward recur-
rence time is the time that has elapsed since the last event.
To exploit Equation 9.4, we assume that we know the distribution FT(t) on
the length of the individual operating intervals. In principle, this means we
also know the distribution on Sk as we presume that we can construct the
convolution of FT(t). Then, to determine the probability distribution on the
number of renewals, we use
\[
\Pr[N_t = n] = \Pr[N_t \ge n] - \Pr[N_t \ge n+1] = F_T^{(n)}(t) - F_T^{(n+1)}(t)
\]
Now, to proceed with the identification of the specific forms of the distribu-
tions on Nt and Sk, we must specify FT(t). The most well-known construction
of this type, and one of the few that is tractable at all, starts with the selection
of the exponential distribution for FT(t). In this special case, the point process
is called a Poisson process.
If the lengths of the individual intervals are exponential in distribution, then
the variable Sk has a gamma distribution. This result is relatively well known,
but its construction is repeated here to remind the reader of the method of
analysis. We start with the statement of the exponential distribution as
\[
F_T(t) = 1 - e^{-\lambda t}
\]
and we state the corresponding Laplace transform for its density function as
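\[
f_T^*(s) = \mathcal{L}_{f_T}(s) = \mathcal{L}\left( f_T(t) \right) = \int_0^\infty e^{-st} f_T(t)\,dt = \int_0^\infty \lambda e^{-st - \lambda t}\,dt = \frac{\lambda}{s + \lambda}
\]
\[
F_T^*(s) = \mathcal{L}\left( F_T(t) \right) = \frac{\lambda}{s(s + \lambda)}
\]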
Often, constructing the transform for the density is easier than doing so for
the distribution function.
The transform for the convolution is the product of the transforms for the
distributions included in the convolution. This means that the transform for
the distribution on Sk is the product of k identical terms, each of which is the
transform on the exponential distribution. Thus,
\[
f_{S_k}^*(s) = \left( f_T^*(s) \right)^k = \left( \frac{\lambda}{s + \lambda} \right)^k
\]
\[
f_{S_k}(t) = \frac{\lambda^k}{\Gamma(k)} t^{k-1} e^{-\lambda t}
\]
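The link between the gamma density for Sk and the question of "how many" can be checked numerically. The sketch below integrates the density above with Simpson's rule and compares the result with the Poisson-sum form of the gamma distribution function for integer k. The rate value is hypothetical, k = 4 and t = 1000 hours echo the earlier illustration, and the function names are illustrative.

```python
import math

def erlang_density(t, k, lam):
    """Density of S_k when the interval lengths are exponential with rate lam."""
    return lam ** k * t ** (k - 1) * math.exp(-lam * t) / math.gamma(k)

def prob_at_least_k_renewals(t, k, lam):
    """Pr[N_t >= k] = Pr[S_k <= t], via the Poisson form of the gamma CDF."""
    poisson_terms = sum((lam * t) ** j / math.factorial(j) for j in range(k))
    return 1.0 - math.exp(-lam * t) * poisson_terms

def simpson(func, a, b, panels=2000):
    """Composite Simpson's rule on [a, b] with an even number of panels."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

lam, k, t = 0.001, 4, 1000.0  # hypothetical failure rate per hour
closed_form = prob_at_least_k_renewals(t, k, lam)
numeric = simpson(lambda u: erlang_density(u, k, lam), 0.0, t)
print(closed_form, numeric)
```

The two printed values should agree closely, which is the numerical counterpart of Equation 9.4 for the Poisson process.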
\[
M_{F_T}(t) = F_T(t) + \sum_{j=1}^{\infty} \int_0^t F_T^{(j)}(t - u) f_T(u)\,du = F_T(t) + \int_0^t \sum_{j=1}^{\infty} F_T^{(j)}(t - u) f_T(u)\,du
\]
\[
M_{F_T}(t) = F_T(t) + \int_0^t M_{F_T}(t - u) f_T(u)\,du \tag{9.9}
\]
Equation 9.9 is the very well-known fundamental form called the “key
renewal theorem.” Very often, this form serves as the basis for the analysis of
a renewal process. The main reason for its extensive use is that the associated
Laplace transform yields a direct relationship between the renewal function
and the underlying distribution of the process FT(t). That relationship is
\[
M_{F_T}^*(s) = F_T^*(s) + \mathcal{L}\left( \int_0^t M_{F_T}(t - u) f_T(u)\,du \right) = F_T^*(s) + M_{F_T}^*(s) f_T^*(s) \tag{9.10}
\]
\[
M_{F_T}^*(s) = \frac{F_T^*(s)}{1 - f_T^*(s)} = \frac{F_T^*(s)}{1 - s F_T^*(s)} \tag{9.11}
\]
and equivalently,
Of course, the utility of these results depends upon our ability to invert the
transforms in any particular application.
Note that in Equation 9.12, the lowercase m represents the transform of the
renewal density. The renewal density is the first derivative of the renewal
function. As a derivative, the renewal density necessarily represents a rate. It
is the rate at which renewals occur. Thus, it is the rate at which the number of
failures (renewals) increases. It is the “rate of failure.” The quotation marks
are intended to signal the fact that there is considerable opportunity for mis-
understanding here.
Some people refer to the hazard function for a life distribution as the failure
rate while others use the term failure rate to mean the renewal density. These
two entities are not the same conceptually, and except for the case of the
exponential distribution, they are not the same algebraically. As explained
in Chapter 4, the hazard function is the conditional probability of failure for
members of a population given survival to any time. As shown previously, the
renewal density is the unconditional probability of another event in a sequence
of events. In an effort to clearly distinguish between the two entities, the term
failure rate is not used here. The hazard function is called the hazard func-
tion or hazard rate and the renewal density is called the renewal density, the
intensity function, or the failure intensity. In fact, the most appropriate label
for the renewal density (aside from renewal density) is the failure intensity.
The renewal density really is the intensity with which new renewals occur.
Algebraically, we can represent the derivative as
\[
m_{F_T}(t) = \frac{d}{dt} M_{F_T}(t) \tag{9.13}
\]
and we note that the derivative extends to both Equation 9.8 and the key
renewal theorem. Hence, we have
\[
m_{F_T}(t) = \sum_{n=1}^{\infty} f_T^{(n)}(t) \tag{9.14}
\]
\[
m_{F_T}(t) = f_T(t) + \int_0^t m_{F_T}(t - u) f_T(u)\,du \tag{9.15}
\]
and
Clearly, depending upon their relative complexity, one may work with either
the renewal function or the renewal density.
The single case in which the analysis is relatively straightforward is the
exponential case. As indicated earlier, the forms for the Laplace transforms
for the exponential are
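\[
f_T^*(s) = \frac{\lambda}{s + \lambda} \quad \text{and} \quad F_T^*(s) = \frac{\lambda}{s(s + \lambda)}
\]
so
\[
m_{F_T}^*(s) = \frac{\lambda}{s} \quad \text{and} \quad M_{F_T}^*(s) = \frac{\lambda}{s^2}
\]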
Note that these results imply that when the individual interval lengths are
exponentially distributed, the expected number of renewals (failures) over
an interval of length t is λt and the intensity with which new failures occur
is λ. Thus, for the Poisson process, the failure intensity is constant.
For many other life distributions, the analysis of the renewal function is
rather difficult. In those cases, there are some basic results that can be useful.
A few of those results are
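\[
\begin{aligned}
M_{F_T}(t) &\ge \frac{t}{E_F[T]} - 1 \\
\lim_{t \to \infty} \frac{N_t}{t} &= \frac{1}{E_F[T]} \\
\lim_{t \to \infty} \frac{M_{F_T}(t)}{t} &= \frac{1}{E_F[T]} \\
\lim_{t \to \infty} \left( M_{F_T}(t + x) - M_{F_T}(t) \right) &= \frac{x}{E_F[T]}
\end{aligned} \tag{9.18}
\]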
where the notation EF[T] is used to represent the mean of the distribution on
the lengths of the individual intervals and it is assumed that mean is finite.
As an example, applying these expressions to the exponential distribution
for which the expected value is 1/λ yields
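\[
\begin{aligned}
M_{F_T}(t) &\ge \lambda t - 1 \\
\lim_{t \to \infty} \frac{N_t}{t} &= \lambda \\
\lim_{t \to \infty} \frac{M_{F_T}(t)}{t} &= \lambda \\
\lim_{t \to \infty} \left( M_{F_T}(t + x) - M_{F_T}(t) \right) &= \lambda x
\end{aligned}
\]
\[
\begin{aligned}
M_{F_T}(t) &\ge \frac{t}{2256.86} - 1 \\
\lim_{t \to \infty} \frac{N_t}{t} &= \frac{1}{2256.86} = 4.43 \times 10^{-4} \\
\lim_{t \to \infty} \frac{M_{F_T}(t)}{t} &= 4.43 \times 10^{-4} \\
\lim_{t \to \infty} \left( M_{F_T}(t + x) - M_{F_T}(t) \right) &= \frac{x}{2256.86}
\end{aligned}
\]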
Note particularly the fact that Equation 9.18 applies to nearly any choice of
distribution so the previous results can be very useful. The last of the four
relationships is known as Blackwell’s theorem. It is well treated in Feller [64].
In effect, that last relationship states that as time advances, a renewal pro-
cess “settles down” and experiences renewals about once per EF[T] so the
expected number of renewals during any interval is the length of the inter-
val divided by the mean of the underlying distribution.
To close this discussion, note that a sequence of intervals over which a
series of copies of a component are used in a machine can reasonably be
represented using the renewal process model provided each interval has
the same stochastic characteristics. When one is examining components
and even some modules, this is often the case. For other levels of equip-
ment aggregation and other types of operational profiles, we will modify the
renewal models later in this text. Note also that several of the most popular
choices of distribution for representing operating durations yield renewal
functions that are impossible or quite taxing to analyze. However, modern
computing power has made these distributions much more manageable. The
example of the Weibull is treated at the end of the next section on the basis of
a numerical strategy that is described in Appendix B.
represented by one of the life distributions with which we are already famil-
iar. We now describe those distributions in terms that are meaningful for a
series of intervals. To start, we compare life lengths of new and used devices.
Definition 9.3
Definition 9.4
A life distribution, FT(t), is said to be new better than used in expectation (NBUE) if
¥
1. If FT(t) is IFR and has finite mean EF[T], then for 0 ≤ t < EF[T],
2. If FT(t) is IFR and has finite mean EF[T], then for 0 ≤ t < EF[T],
\[
F_T^{(n)}(t) \le 1 - \sum_{j=0}^{n-1} \frac{\left( t/E_F[T] \right)^j}{j!} e^{-t/E_F[T]} \tag{9.26}
\]
3. If FT(t) is IFR and has finite mean EF[T], then for 0 ≤ t < EF[T],
\[
\Pr[N_t \ge n] \le \sum_{j=n}^{\infty} \frac{\left( t/E_F[T] \right)^j}{j!} e^{-t/E_F[T]} \tag{9.27}
\]
4. If FT(t) is NBU and has cumulative hazard function ZT(t), then for t ≥ 0,
\[
\Pr[N_t < n] \ge \sum_{j=0}^{n-1} \frac{\left( Z_T(t) \right)^j}{j!} e^{-Z_T(t)} \tag{9.28}
\]
5. If FT(t) is IFR and has cumulative hazard function ZT(t), then for t ≥ 0,
\[
\Pr[N_t < n] \le \sum_{j=0}^{n-1} \frac{\left( n Z_T(t/n) \right)^j}{j!} e^{-n Z_T(t/n)} \tag{9.29}
\]
6. If FT(t) is NWU and has cumulative hazard function ZT(t), then for t ≥ 0,
\[
\Pr[N_t < n] \le \sum_{j=0}^{n-1} \frac{\left( Z_T(t) \right)^j}{j!} e^{-Z_T(t)} \tag{9.30}
\]
7. If FT(t) is DFR and has cumulative hazard function ZT(t), then for t ≥ 0,
\[
\Pr[N_t < n] \ge \sum_{j=0}^{n-1} \frac{\left( n Z_T(t/n) \right)^j}{j!} e^{-n Z_T(t/n)} \tag{9.31}
\]
so
In this case, we can compute the reliability at 1000 hours to be 0.978. Perhaps
more interesting is the fact that the probability that the third component life
length is completed by 2000 hours is bounded by Equation 9.26 as
\[
F_T^{(3)}(2000) \le 1 - \sum_{j=0}^{2} \frac{\left( 2000/3559.43 \right)^j}{j!} e^{-2000/3559.43} = 0.0195
\]
On the other hand, Equation 9.27 indicates that the probability of 3 or more
renewals in 2000 hours is bounded by the same quantity. That is,
For the same device population, Equations 9.28 and 9.29 indicate that
\[
0.718 = \sum_{j=0}^{2} \frac{\left( Z_T(5000) \right)^j}{j!} e^{-Z_T(5000)} \le \Pr[N_{5000} < 3]
\le \sum_{j=0}^{2} \frac{\left( 3 Z_T(5000/3) \right)^j}{j!} e^{-3 Z_T(5000/3)} = 0.997
\]
so
\[
M_{F_T}(5000) \le \frac{5000}{3559.43} = 1.405
\]
Note further that Equation 9.32 can be combined with the first of the
expressions of (9.18) to yield
\[
\frac{t}{E_F[T]} - 1 \le M_{F_T}(t) \le \frac{t}{E_F[T]}
\]
so
\[
0.405 \le M_{F_T}(5000) \le 1.405
\]
Naturally, we can perform similar example computations for Equations 9.30,
9.31, and 9.33. If our Weibull population had a shape parameter of β = 0.75
rather than 2.75, the mean life would be 4762.56 hours, ZT(2000) = 0.595, and
ZT(6000) = 1.355 so Equation 9.30 indicates that
These examples illustrate the information one can obtain in cases in which
the convolutions or renewal equations are computationally difficult.
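The example bounds for the Weibull population with β = 2.75 can be reproduced with a short sketch. It assumes the scale parameter θ = 4000 hours, which is consistent with the quoted mean of 3559.43 hours; the helper names are illustrative.

```python
import math

THETA, BETA = 4000.0, 2.75  # Weibull scale (hours, assumed) and shape

def cumulative_hazard(t):
    """Z_T(t) for the Weibull distribution."""
    return (t / THETA) ** BETA

def poisson_cdf(last, mean):
    """Sum of the Poisson probabilities for 0..last with the given mean."""
    return sum(mean ** j / math.factorial(j) for j in range(last + 1)) * math.exp(-mean)

mean_life = THETA * math.gamma(1.0 + 1.0 / BETA)
reliability_1000 = math.exp(-cumulative_hazard(1000.0))
bound_9_26 = 1.0 - poisson_cdf(2, 2000.0 / mean_life)               # upper bound on F^(3)(2000)
lower_9_28 = poisson_cdf(2, cumulative_hazard(5000.0))              # lower bound on Pr[N_5000 < 3]
upper_9_29 = poisson_cdf(2, 3.0 * cumulative_hazard(5000.0 / 3.0))  # upper bound on Pr[N_5000 < 3]
renewal_bounds = (5000.0 / mean_life - 1.0, 5000.0 / mean_life)     # bounds on M(5000)

print(round(mean_life, 2), round(reliability_1000, 3), round(bound_9_26, 4))
print(round(lower_9_28, 3), round(upper_9_29, 3))
print(tuple(round(b, 3) for b in renewal_bounds))
```

Note that the bound of Equation 9.26 is evaluated at 2000 hours because that result requires t to be smaller than the mean life.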
As noted earlier, modern computing power has made previously tax-
ing computations much more manageable. For example, the renewal func-
tion for the Weibull distribution cannot be expressed in closed form at all.
However, Lomnicki [65] defined an equivalent infinite series expansion
for the Weibull renewal function. The series is exact until it is truncated
to finitely many terms in which case it provides an approximation that is
often quite accurate. The series form is provided in Appendix B. When
the series is truncated at 15 terms and the value of the shape parameter is
β = 2.75, we obtain MFT (1000) = 0.022, MFT (2000) = 0.140, MFT (6000) = 1.272, and
MFT (8000) = 1.821. For the case in which β = 0.75, we obtain MFT (1000) = 0.370,
MFT (2000) = 0.641, MFT (6000) = 1.585, and MFT (8000) = 2.029.
In closing this discussion, it is appropriate to note that the computational
effort involved in using Lomnicki’s method depends on the time over which
the renewal function is evaluated. Truncating the infinite series after rela-
tively few terms yields quite accurate values for MFT (t) when t is smaller than
around 2EF [T ]. As the time interval is increased, more terms must be included
in the truncated sum and the computational effort grows exponentially. For
the values provided here, 15 terms were more than adequate for all of the val-
ues other than MFT (8000), and for that value, 18 terms were required.
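As a cross-check that does not use the series of Appendix B, the renewal equation (9.9) can also be solved on a time grid. The sketch below is not Lomnicki's method. It is a simple discretization, assumed here for illustration, in which the renewal function inside each grid cell is replaced by the average of its two end-point values. The step count and function names are arbitrary choices, and θ = 4000 hours is assumed for the Weibull case.

```python
import math

def renewal_function(cdf, t, steps=400):
    """Approximate M(t) from M(t) = F(t) + integral of M(t - u) dF(u).

    Within each grid cell, M(t - u) is replaced by the average of its two
    end-point values. This makes the scheme implicit in the newest value
    M[i], which is solved for directly.
    """
    h = t / steps
    F = [cdf(i * h) for i in range(steps + 1)]
    dF = [F[j] - F[j - 1] for j in range(1, steps + 1)]  # dF[j-1] covers cell j
    M = [0.0] * (steps + 1)
    for i in range(1, steps + 1):
        acc = F[i] + 0.5 * M[i - 1] * dF[0]
        for j in range(2, i + 1):
            acc += 0.5 * (M[i - j] + M[i - j + 1]) * dF[j - 1]
        M[i] = acc / (1.0 - 0.5 * dF[0])
    return M[steps]

weibull_cdf = lambda t: 1.0 - math.exp(-((t / 4000.0) ** 2.75))
exponential_cdf = lambda t: 1.0 - math.exp(-0.001 * t)  # hypothetical rate

print(renewal_function(exponential_cdf, 2000.0))  # compare with lambda * t
print(renewal_function(weibull_cdf, 1000.0), renewal_function(weibull_cdf, 2000.0))
```

For the exponential case the result can be compared with λt, and for the Weibull case with β = 2.75 the values at 1000 and 2000 hours can be compared with the series values quoted above. Accuracy depends on the grid, and shape parameters below 1 would need a finer grid near zero.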
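\[
\Pr[U(\tau) \le u] = F_{U(\tau)}(u) = \frac{F_T(u + \tau) - F_T(\tau)}{\bar{F}_T(\tau)} = \frac{\bar{F}_T(\tau) - \bar{F}_T(u + \tau)}{\bar{F}_T(\tau)} = 1 - \frac{\bar{F}_T(u + \tau)}{\bar{F}_T(\tau)} \tag{9.34}
\]
\[
\Pr[U(\tau) > u] = \bar{F}_{U(\tau)}(u) = \frac{\bar{F}_T(u + \tau)}{\bar{F}_T(\tau)} \tag{9.35}
\]
\[
F_{U(\tau)}(u) = 1 - \frac{\bar{F}_T(u + \tau)}{\bar{F}_T(\tau)} = 1 - \frac{e^{-\lambda(u + \tau)}}{e^{-\lambda \tau}} = 1 - e^{-\lambda u}
\]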
so FU(τ = 500)(1000) = 0.208 and FU(τ = 500)(2500) = 0.806. The corresponding reli-
ability values are the complements of the failure probabilities.
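A minimal sketch of the residual life computation, under the assumption of a Weibull life distribution, is shown below. With β = 1 the distribution is exponential and the age has no effect; with β greater than 1 the failure probability over a fixed horizon grows with age. The parameter values and function names are chosen only for illustration.

```python
import math

def weibull_survivor(t, beta, theta):
    """Survivor function of the Weibull distribution."""
    return math.exp(-((t / theta) ** beta))

def residual_life_cdf(u, age, survivor):
    """Pr[U(age) <= u] = 1 - S(u + age) / S(age), the form of Equation 9.34."""
    return 1.0 - survivor(u + age) / survivor(age)

# Exponential case (Weibull with beta = 1): age should not matter.
expo = lambda t: weibull_survivor(t, 1.0, 2000.0)  # hypothetical mean of 2000 hours
print(residual_life_cdf(1000.0, 0.0, expo), residual_life_cdf(1000.0, 500.0, expo))

# Increasing hazard (beta = 2.75, theta = 4000): older units fail sooner.
ifr = lambda t: weibull_survivor(t, 2.75, 4000.0)
print(residual_life_cdf(1000.0, 500.0, ifr), residual_life_cdf(1000.0, 2000.0, ifr))
```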
Here is another way to look at the residual life distribution. If we consider
the progress of a renewal process over the time domain, the age, say, τ, of a
functioning device at any point in time, say, t, is a random variable. The value
of the random variable device age may be represented by
\[
\tau = t - S_{N_t} \tag{9.36}
\]
Given these definitions, we can represent the probability that at any point
in time, t, the residual life exceeds a specific value, u, as the probability that
the first device in the process survives beyond t + u plus the probability that a
renewal occurred at some point in time prior to t and the device started at
that time survives longer than t + u. That is,
\[
\bar{F}_{U(t)}(u) = \Pr[U(t) > u] = \bar{F}_T(t + u) + \int_0^t \bar{F}_T(t + u - x)\, m_{F_T}(x)\,dx \tag{9.38}
\]
The approach to the analysis of Equation 9.38 is the same as that for the
key renewal theorem and can be just as taxing. Nevertheless, there are some
useful results we can obtain. First, the limiting form of the residual life dis-
tribution at any point in time is
\[
F_U(u) = \frac{1}{E_F[T]} \int_0^u \bar{F}_T(x)\,dx \tag{9.39}
\]
Note that this form does not depend upon the age of the operating device. As
an example, consider again the Weibull distributions having scale parameter
θ = 4000 hours and shape parameters β = 2.75 and β = 0.75. The corresponding
limiting residual life distributions of Equation 9.39 are shown in Figure 9.2a,
and the underlying life distributions are shown in Figure 9.2b. The contrast in
the behaviors of the residual life distributions and the life distributions is
striking. In addition, observe that the limiting forms seem to fit our intuition
concerning their individual and their comparative behaviors, as the upper curve
corresponds to the larger value of β.
FIGURE 9.2
(a) Limiting forms of residual life distributions. (b) The corresponding
underlying life distributions.
As indicated in Chapter 8, Cox [61] has shown that the mean of this distri-
bution will be
\[
E_U[U] = \frac{E_F^2[T] + \mathrm{Var}_F[T]}{2 E_F[T]} \tag{9.40}
\]
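Equations 9.39 and 9.40 are straightforward to evaluate numerically. The sketch below assumes the Weibull distribution with θ = 4000 hours and β = 2.75 and uses Simpson's rule for the integral. The integration rule, panel count, and function names are illustrative choices.

```python
import math

def simpson(func, a, b, panels=2000):
    """Composite Simpson's rule on [a, b] with an even number of panels."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def limiting_residual_cdf(u, survivor, mean_life):
    """Equation 9.39: (1 / E[T]) times the integral of the survivor function on [0, u]."""
    return simpson(survivor, 0.0, u) / mean_life

def cox_mean_residual(mean_life, variance):
    """Equation 9.40: mean of the limiting residual life distribution."""
    return (mean_life ** 2 + variance) / (2.0 * mean_life)

theta, beta = 4000.0, 2.75
survivor = lambda t: math.exp(-((t / theta) ** beta))
mean_life = theta * math.gamma(1.0 + 1.0 / beta)
variance = theta ** 2 * math.gamma(1.0 + 2.0 / beta) - mean_life ** 2

print(limiting_residual_cdf(2000.0, survivor, mean_life))
print(cox_mean_residual(mean_life, variance))
```

For an exponential life distribution the same functions return the exponential distribution itself and a mean residual life equal to the mean life, which is a convenient sanity check.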
9.4 Conclusion
The models presented in this chapter serve to highlight the questions one
should consider in the study of repairable systems. For some systems, it
is the periods of operation that are the greatest concern and the duration
of repair is either negligible or unimportant. For those systems, treating
repair as instantaneous is appropriate. Similarly, the study of individual
components and the analysis of some systems may reasonably be based
on renewal processes. Finally, the operating performance of many systems
is improved by the use of preventive maintenance, while for some other
systems, preventive maintenance is unproductive or impossible. As we
study specific equipment items, we should tailor our models in terms of
these operating features. The models presented in this chapter apply to the
instantaneous repair case with renewal. More important, they establish the
basic approaches to model formulation and analysis and emphasize the
choices we must make.
Exercises
9.1 Construct the Laplace transform for the Poisson distribution. Use the
transform to identify the distribution on a sum of two identically dis-
tributed Poisson random variables.
9.2 Construct the Laplace transform for the normal probability density
function. Use the transform to show that the sum of independent nor-
mal random variables has a normal distribution.
9.3 Using the transform obtained in Problem 9.2, state the transform for
the renewal density and the renewal function for the normal renewal
process.
9.4 If the length of the operating intervals in a renewal process is well
modeled by an exponential distribution having parameter λ = 0.004,
what is the probability that the number of failures during 1250 hours
will exceed 7?
9.5 Consider a renewal process in which times between failures have a
normal distribution with a mean of 400 hours and a standard devia-
tion of 50 hours. Construct the functional form for FSk (t) = FT( k ) (t) and the
specific realization of this function when k = 3. Compute the mean and
standard deviation for the distribution on S3.
9.6 For the distribution described in Problem 9.5, construct the bounds
of Equation 9.18 using t(or x) = 1000 hours. Then compute the bounds
of Equations 9.25, 9.26, 9.27, and 9.32 for t = 100 hours.
9.7 Consider a Weibull distribution for which β = 0.60 and θ = 1000 hours.
For this distribution, compute the bounds at 2000 hours defined by
Equations 9.18, 9.30, 9.31 (for n = 3), and 9.33.
9.8 Let FT(t) be a Weibull distribution with β = 1.75 and θ = 800 hours.
For this distribution, compute the bounds at 1500 hours defined by
Equation 9.18.
9.9 For the Weibull distribution of Problem 9.8, compute the bounds of
Equations 9.25 through 9.29 and 9.32 at t = 500 hours and n = 3.
9.10 Prove that an IFR distribution is NBU.
9.11 Consider a Weibull distribution for which β = 0.60 and θ = 1000 hours.
Use Lomnicki’s method to obtain approximate values for the renewal
function at 1000, 2000, and 5000 hours.
9.12 For the distribution of Problem 9.8, use Lomnicki’s method to obtain
approximate values for the renewal function at 2000, 4000, and 5000
hours.
9.13 Let FT(t) be a Weibull distribution with β = 1.75 and θ = 800 hours and
let u(τ) be the residual life at age τ of a component having this life distri-
bution. Compute the reliability at u = 100, u = 1000, and u = 1500 hours
for devices that have achieved ages of τ = 500 hours, τ = 1000 hours, and
τ = 2000 hours.
9.14 Let FT(t) be an NBU distribution and let u(τ) be the residual life at age τ
of a component having life distribution FT(t). Show that EU[U(τ)] ≤ EF[T]
where EF[T] is the mean of FT(t).
9.15 Consider a population of devices for which the gamma distribution
having β = 3 and λ = 0.005 is used to model life length. For the elements
of that population that have survived for 200 hours, compute the value
of the survivor function for the residual life distribution at t = 475 hours
(τ = 200, t + τ = 475). If it were decided that a more accurate parameter
set for the gamma distribution model is β = 3.6 and λ = 0.006, what
would be the corrected value of GT (t = 475|t = 200)?
10
Repairable Systems II: Nonrenewal
and Instantaneous Repair
As indicated in the previous chapter, there are many types of devices for
which repair implies renewal. For those cases in which we are studying
individual components in a specific equipment “slot,” treating component
repair as a component renewal point is clearly appropriate. For some other
systems, repair corresponds either exactly or approximately to system
renewal, so the models described in the preceding chapter provide a rea-
sonable portrayal of operating behavior.
On the other hand, there are many types of devices for which a repair does
not return the unit to a new condition. There are also large complex systems,
such as automobiles, for which the replacement of a few of its many compo-
nents does not appreciably change the “age” of the system. For equipment of
this sort, the unit is not as good as new following repair, so unit age follow-
ing repair may not be taken to be zero. Clearly, the state of a device following
its repair determines whether or not a sequence of operating periods is well
modeled by a common distribution. When system age following repair is
nonzero, successive operating periods do not have a common distribution
and the renewal model does not apply.
Several models based on nonstationary processes have been suggested for
those devices that are not renewed by repair. We shall explore some of them
here. While doing so, we will continue to assume that repair is instantaneous
or, equivalently, that the operating periods are our key concern and repair
intervals are negligible or unimportant.
The idea of a nonstationary process is that we have a sequence of operating
intervals, each of which ends with unit failure. As in the case of the
renewal process, we denote the lengths of the intervals by T1, T2, T3, …, but for
the nonstationary process, each interval has a distinct distribution. Several
useful models have been developed to treat the nonstationary sequence of
operating intervals. Three of them are treated here.
represent its expected value. We call Λ(t) the “cumulative rate function” for
the process and we also define the failure intensity function, λ(t), so that
$$\Lambda(t) = \int_0^t \lambda(u)\,du \tag{10.2}$$
In general, for an NHPP, the integrand of Equation 10.2 may be any function,
but for reliability applications, we will take it to be the hazard function of the
underlying life distribution for the device of interest.
The definition of the accumulated operating time, Sk, is the same as that
given in Equation 9.1, and the basic relationship between the variables in the
time domain and those in the frequency domain (Equation 9.4) still applies.
Therefore, we can say that in the frequency domain,
$$\Pr[N_{t+x} - N_t = k] = \frac{\left(\Lambda(t+x) - \Lambda(t)\right)^k}{k!}\, e^{-\left(\Lambda(t+x) - \Lambda(t)\right)} \tag{10.3}$$

$$\Pr[T_{k+1} - T_k > t] = e^{-\left(\Lambda(T_k + t) - \Lambda(T_k)\right)} \tag{10.4}$$
( L(t))
n
( L(t))
¥ ¥ j
so
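In addition, since the expectation of any nonnegative random variable, say, Y, may be
computed using either

$$E[Y] = \int_{Y} y\, f_Y(y)\, dy \quad \text{or} \quad E[Y] = \int_{Y} \bar{F}_Y(y)\, dy,$$

the expected value of Sk may be stated as

$$E[S_k] = \int_0^\infty t\, f_{S_k}(t)\, dt = \int_0^\infty t\, z_T(t)\, \frac{(\Lambda(t))^{k-1}}{(k-1)!}\, e^{-\Lambda(t)}\, dt \tag{10.8}$$

or as

$$E[S_k] = \int_0^\infty \bar{F}_{S_k}(t)\, dt = \int_0^\infty \sum_{j=0}^{k-1} \frac{(\Lambda(t))^j}{j!}\, e^{-\Lambda(t)}\, dt = \sum_{j=0}^{k-1} \int_0^\infty \frac{(\Lambda(t))^j}{j!}\, e^{-\Lambda(t)}\, dt \tag{10.9}$$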
Two further points here are (1) that the realization of FSk(t) for k = 1 implies
that E[Nt] = Λ(t) and (2) that since E[X + Y] = E[X] + E[Y] regardless of whether
the variables are independent,
so
for k > j. Finally, observe that just as with the renewal models, Equation 10.9
can be used to define the calculation of E[Tk] as E[Sk] − E[Sk−1].
Consider the example of a device for which Λ(t) = (t/θ)^β with θ = 4000
hours and β = 2.75. Then, Λ(2000) = 0.149, Λ(4000) = 1.0, Λ(6000) = 3.049, and
Λ(8000) = 6.727. Also, using Equation 10.3, we obtain results such as

$$\Pr[N_{4000} - N_{2000} = 0] = \frac{\left(\Lambda(4000) - \Lambda(2000)\right)^0}{0!}\, e^{-\left(\Lambda(4000) - \Lambda(2000)\right)} = 0.4268$$

$$\Pr[N_{6000} - N_{4000} = 0] = \frac{\left(\Lambda(6000) - \Lambda(4000)\right)^0}{0!}\, e^{-\left(\Lambda(6000) - \Lambda(4000)\right)} = 0.1288$$
and
( L(8000) - L(6000))
2 k
so
and
$$\Pr[T_1 > 2000] = e^{-\Lambda(2000)} = 0.8619$$
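The Weibull-based probabilities above can be checked with a few lines of code. The following is a minimal sketch, assuming Λ(t) = (t/θ)^β with the example values θ = 4000 and β = 2.75; the function names are illustrative and are not taken from the text.

```python
import math

def weibull_cum_intensity(t, theta=4000.0, beta=2.75):
    """Cumulative intensity Lambda(t) = (t / theta) ** beta (example values as defaults)."""
    return (t / theta) ** beta

def prob_k_failures(t, x, k, cum_intensity=weibull_cum_intensity):
    """Pr[N(t + x) - N(t) = k] for an NHPP, following Equation 10.3."""
    m = cum_intensity(t + x) - cum_intensity(t)  # expected failures in (t, t + x]
    return m ** k * math.exp(-m) / math.factorial(k)

if __name__ == "__main__":
    # Probability of no failure in three successive 2000-hour windows.
    for start in (0.0, 2000.0, 4000.0):
        print(start, round(prob_k_failures(start, 2000.0, 0), 4))
```

Setting k = 0 gives the survival probability for a window of length x that starts at time t, which is the quantity used in Equation 10.4.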
Note the implied decline in the 2000 hour reliability with increasing refer-
ence time. This is a feature identified as indicative of increasing hazard in
Chapter 4. Finally, observe that using either Equation 10.8 or 10.9, we may
compute values of E[Sk] such as
so
E[S3 - S2 ] = E[S3 ] - E[S2 ] = 882.510.
and then using Equation 10.5, we obtain a plot of the cumulative intensity
function as shown in Figure 10.1.
It is also possible to compute values such as Λ(100) = 2.082. Thus, we can
compute the probabilities for the possible number of failures over the first
100 hours of device operation. These are shown in Table 10.1. Finally, appli-
cation of either Equation 10.8 or Equation 10.9 to the gamma distribution–
based example yields
FIGURE 10.1
Cumulative intensity function for a nonhomogeneous Poisson process based on a gamma
distribution. (Vertical axis: cumulative intensity.)
TABLE 10.1
Frequency Probabilities for a Nonhomogeneous
Poisson Process Based on the Equivalent HPP
N100 Pr
0 0.125
1 0.260
2 0.270
3 0.186
4 0.098
5 0.041
6 0.014
7 0.004
8 0.001
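Frequency probabilities of this kind follow directly from the Poisson form of Equation 10.3 once the cumulative intensity is known. The sketch below assumes only the value Λ(100) = 2.082 quoted in the text; the function name is hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Pr[N = k] when the count over the interval is Poisson with the given mean."""
    return mean ** k * math.exp(-mean) / math.factorial(k)

if __name__ == "__main__":
    cum_intensity_100 = 2.082  # Lambda(100) as quoted in the text
    for k in range(9):
        print(k, round(poisson_pmf(k, cum_intensity_100), 3))
```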
so
The NHPP also has the feature that it can be transformed into an equivalent
homogeneous Poisson process (HPP). Conceptually, the transformation
is a revision of the timescale. Formally, for a nonhomogeneous process with
cumulative intensity function Λ(t) = E[Nt], as stated in Equation 10.1, the
inversion is defined by
To treat this in a practical sense, let the random variable Uk = Λ(Tk) represent
the cumulative rate function at the time of the kth failure event. It has been
shown that Uk has a gamma distribution for which the density function is
$$g_{U_k}(u) = f_{S_k}(t)\, \frac{dt}{du} = \frac{u^{k-1}}{(k-1)!}\, e^{-u} \tag{10.11}$$

so

$$\bar{G}_{U_n}(u) = 1 - G_{U_n}(u) = \sum_{k=0}^{n-1} \frac{u^k}{k!}\, e^{-u}$$
and these expressions do not depend on the specific NHPP that we wish to
analyze. We can plot GUk (u) for any value(s) of k that are interesting. Figure 10.2
shows the function for k = 2, 3, and 4, respectively. Some of the values rep-
resented by the graph are GU2 (3) = 0.8008 , GU2 ( 4) = 0.9084 , GU3 (3) = 0.5768 ,
GU3 ( 4) = 0.7619, GU4 (3) = 0.3528 , and GU4 ( 4) = 0.5665.
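The quoted values of the distribution function can be reproduced from the finite sum for a gamma distribution with integer shape k and unit rate. This is a minimal sketch; the function name is an assumption made for illustration.

```python
import math

def cum_intensity_cdf(u, k):
    """G_{U_k}(u): distribution function of U_k, a gamma variable with shape k and rate 1."""
    survivor = sum(u ** j * math.exp(-u) / math.factorial(j) for j in range(k))
    return 1.0 - survivor

if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, [round(cum_intensity_cdf(u, k), 4) for u in (3.0, 4.0)])
```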
Recall that a gamma distribution is the convolution of identically distributed
exponential random variables, so Uk may be viewed as the sum of k
exponentials, each of which has λ = 1. Thus, if we take Mu to be the counting
process associated with those exponentials, then setting t = Λ^{−1}(u) and
Mu = Nt defines the mapping from the NHPP to an HPP having the rate
function λ = 1.
For the example of the Weibull distribution-based NHPP, we compute
t = Λ^{−1}(2) = 5146.66, t = Λ^{−1}(3) = 5964.29, and t = Λ^{−1}(4) = 6622.03 and we
may then say that E[N(5146.66)] = E[M2] = 2, E[N(5964.29)] = E[M3] = 3, and
FIGURE 10.2
Distribution functions for the nonhomogeneous Poisson process cumulative intensity function
at a sequence of failure times.
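The timescale transformation t = Λ^{−1}(u) is easy to evaluate when the cumulative intensity has the Weibull form. The sketch below assumes Λ(t) = (t/θ)^β with θ = 4000 and β = 2.75, so that Λ^{−1}(u) = θu^{1/β}; the function names are illustrative.

```python
def weibull_cum_intensity(t, theta=4000.0, beta=2.75):
    """Lambda(t) = (t / theta) ** beta."""
    return (t / theta) ** beta

def weibull_cum_intensity_inverse(u, theta=4000.0, beta=2.75):
    """Lambda^{-1}(u) = theta * u ** (1 / beta): the NHPP time at which E[N_t] equals u."""
    return theta * u ** (1.0 / beta)

if __name__ == "__main__":
    for u in (2.0, 3.0, 4.0):
        t = weibull_cum_intensity_inverse(u)
        # The round trip Lambda(Lambda^{-1}(u)) should return u.
        print(u, round(t, 2), round(weibull_cum_intensity(t), 6))
```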
$$F_{T|p}(t) = 1 - \left(\bar{F}_T(t)\right)^p = 1 - e^{-pZ_T(t)} \tag{10.14}$$
This is a very useful and very general result. It is useful first because it applies
to all choices of underlying life distribution. In addition, Equation 10.12
and those that follow it show that the distribution on the time between
perfect repairs will be of the same class as the life distribution—increasing
failure rate (IFR), increasing failure rate on average (IFRA), decreasing fail-
ure rate (DFR), and so on. Within the intervals between perfect repairs,
Equations 10.3 and 10.4 describe the probabilities on the frequency of
minimal repairs and the times between them. Finally, all of the results for
renewal processes that we have examined apply to the intervals between
perfect repairs.
Consider a simple example. Suppose that failures of a particular device
are well modeled by a Weibull distribution with parameters θ = 4000 hours
and β = 2.75. Suppose further that the probability of perfect repair is p = 0.25.
Using these values, we compute the life distribution as usual,
and the time between renewals has the distribution

$$F_{T|p}(t) = 1 - \left(\bar{F}_T(t)\right)^p = 1 - e^{-p\,(t/\theta)^{\beta}} = 1 - e^{-0.25\,(t/4000)^{2.75}}$$
The results of these calculations are plotted in Figure 10.3. Note that the
distribution on individual life lengths is stochastically smaller than the dis-
tribution on renewal times. Naturally, as p is increased, the times between
FIGURE 10.3
Example of life and renewal time distributions under imperfect repair. (Vertical axis: Pr.)
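The comparison between the life distribution and the distribution on renewal times can be tabulated directly from Equation 10.14. This is a minimal sketch using the example values θ = 4000, β = 2.75, and p = 0.25; the function names are assumptions.

```python
import math

def life_cdf(t, theta=4000.0, beta=2.75):
    """Weibull life distribution F_T(t)."""
    return 1.0 - math.exp(-((t / theta) ** beta))

def renewal_time_cdf(t, p=0.25, theta=4000.0, beta=2.75):
    """F_{T|p}(t) = 1 - exp(-p * Z_T(t)), the time between perfect repairs (Equation 10.14)."""
    return 1.0 - math.exp(-p * (t / theta) ** beta)

if __name__ == "__main__":
    for t in (2000.0, 4000.0, 6000.0, 8000.0):
        print(t, round(life_cdf(t), 4), round(renewal_time_cdf(t), 4))
```

For any p below 1, the renewal time distribution lies below the life distribution at every t, which is the stochastic ordering noted in the text.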
$$z_{T|p}(t) = p(t)\, z_T(t) = \frac{p(t)\, f_T(t)}{\bar{F}_T(t)} \tag{10.15}$$
As in the simpler case, the times of perfect repairs form a renewal process
and it is again the case that we can compute device behavior measures using
the methods discussed earlier. It is also often but not always the case that
the distribution on times between renewals has the same behavior as the
life distribution. Behavior is preserved for IFR, IFRA, new better than used,
DFR, decreasing failure rate on average, and new worse than used hazards
but not for new better than used in expectation.
Consider an example device for which failures are well modeled by a
Weibull distribution with parameters θ = 4000 hours and β = 2.75. Suppose
further that the probability function for perfect repair is p(t) = 1 − e^{−ρt} where
ρ = 0.0001. Using these parameter values, we find that the distribution on
the time between perfect repairs (renewals) is the one shown in Figure 10.4.
In addition, as noted earlier, during the intervals between renewals, we use
the cumulative intensity function (Equation 10.5) of
FIGURE 10.4
Example of renewal time distributions under age-dependent imperfect repair.
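A distribution of this kind can be evaluated numerically. The sketch below assumes that Equation 10.15 is the hazard function of the time between renewals, so that the distribution is 1 − exp(−∫ p(u) z_T(u) du) over (0, t); it uses the example values θ = 4000, β = 2.75, and ρ = 0.0001, and a plain Simpson rule written for illustration. All function names are hypothetical.

```python
import math

THETA, BETA, RHO = 4000.0, 2.75, 0.0001  # example values from the text

def weibull_hazard(u, theta=THETA, beta=BETA):
    """z_T(u) for the Weibull life distribution."""
    return (beta / theta) * (u / theta) ** (beta - 1.0)

def perfect_repair_prob(u, rho=RHO):
    """p(u) = 1 - exp(-rho * u), the age-dependent probability of perfect repair."""
    return 1.0 - math.exp(-rho * u)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number of panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def renewal_time_cdf(t, p=perfect_repair_prob, z=weibull_hazard):
    """Assumed form: 1 - exp(-integral_0^t p(u) z(u) du), built on Equation 10.15."""
    if t <= 0.0:
        return 0.0
    return 1.0 - math.exp(-simpson(lambda u: p(u) * z(u), 0.0, t))

if __name__ == "__main__":
    for t in (2000.0, 4000.0, 8000.0, 12000.0):
        print(t, round(renewal_time_cdf(t), 4))
```

Passing a constant function for p recovers the fixed-probability case of Equation 10.14, which gives a quick check on the integration.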
equipment during only the most recent operating interval and not any of the
damage or aging that was incurred during earlier intervals. Thus, following
the nth repair, the virtual (or equivalent) age of the device, An, is defined to be
$$A_n = A_{n-1} + \pi_n T_n \tag{10.18}$$

for Model I and

$$A_n = \pi_n \left(A_{n-1} + T_n\right) \tag{10.19}$$

for Model II.
The selection of the values πn determines the form of these models. In both
cases, 0 ≤ πn ≤ 1 for all n. If we take πn = 0 for all n, both Model I and Model
II reduce to the renewal model. On the other hand, if we set πn = 1 for all n in
Model I, we obtain the minimal repair model. If we take πn to be a Bernoulli
random variable for each n in Model II, the result is the imperfect repair
model. Because the various choices of coefficients lead to the previous mod-
els, Kijima called his models general repair models. He also specified that in
the most general case, the coefficients πn should be taken to be random vari-
ables with any arbitrary and not necessarily identical distributions.
For all choices of the coefficients, the analysis of the Kijima models is com-
plicated. In general, for any set of coefficients, the distribution on the dura-
tion of any operating interval can be defined for both Model I and Model II
using the residual life distribution. The general form is
$$\Pr[T_n \le t \mid A_{n-1} = u] = F_{T_n(u)}(t) = \frac{F_T(t+u) - F_T(u)}{\bar{F}_T(u)} \tag{10.20}$$

so that

$$\Pr[T_n > t \mid A_{n-1} = u] = \frac{\bar{F}_T(t+u)}{\bar{F}_T(u)} \tag{10.21}$$
Now, we would like to be able to identify the probabilities and the expecta-
tions for the number of repairs over time and the time until a given number
of repairs have been made. If we let Π = {π1, π2, …} represent the sequence of
repair effectiveness factors, then we can identify algebraic representations
for these measures. Specifically, we note that the sequence of random vari-
ables {An} represents the “virtual age” stochastic process and that the ran-
dom variable
$$S_n = \sum_{j=1}^{n} T_j \tag{10.22}$$
is the real age (or elapsed time) for the sequence of operating intervals. We
can denote the distribution on the real age by
Then, the usual relationship between the time and frequency domains
implies that
$$E[N_t] = \sum_{n=1}^{\infty} F_{S_n}(t) \tag{10.26}$$

and

$$E[S_n] = \int_0^\infty \bar{F}_{S_n}(u)\, du \tag{10.27}$$
Unfortunately, for the most general definitions of the vector Π, these measures
of interest are extremely difficult to determine. As shown in the
following discussion, even for quite simple choices of Π, the exact analysis of
Equations 10.26 and 10.27 requires intricate successive numerical integrations.
Rather than attempt the difficult numerical analysis, we usually
compute an upper bound on E[Sn] and use that as our key measure of
system behavior.
To appreciate the utility of the bounds, we examine the analysis of the
basic models. Assume that the factors πn are independently and identically
distributed (i.i.d.) random variables having expected value E[π]. Start with
Model I.
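Suppose that πn+1 = π for π ∈ [0, 1] and assume that π ≠ 0 so that the application
of Equation 10.21 yields

$$\begin{aligned}
\bar{F}_{A_{n+1}}(t) = \Pr[A_{n+1} > t] &= \Pr[A_n > t] + \int_0^t \Pr[\pi T_{n+1} > t - u \mid A_n = u]\, \Pr[A_n = u]\, du \\
&= \bar{F}_{A_n}(t) + \int_0^t \frac{\bar{F}_T\!\left(u + \frac{t-u}{\pi}\right)}{\bar{F}_T(u)}\, dF_{A_n}(u)
= \bar{F}_{A_n}(t) + \int_0^t \frac{\bar{F}_T\!\left(u + \frac{t-u}{\pi}\right)}{\bar{F}_T(u)}\, f_{A_n}(u)\, du
\end{aligned} \tag{10.28}$$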
and the density function is
$$f_{A_{n+1}}(t) = \frac{1}{\pi} \int_0^t \frac{f_T\!\left(u + \frac{t-u}{\pi}\right)}{\bar{F}_T(u)}\, f_{A_n}(u)\, du \tag{10.29}$$

with

$$f_{A_1}(t) = \frac{1}{\pi}\, f_T(t/\pi) \tag{10.30}$$
Now that the density on An is defined, we can construct the expected value of
each operating interval using Equations 10.21 and 10.29. Specifically,
$$E[T_n \mid A_{n-1} = u] = \int_0^\infty \Pr[T_n > t \mid A_{n-1} = u]\, dt = \int_0^\infty \frac{\bar{F}_T(t+u)}{\bar{F}_T(u)}\, dt \tag{10.31}$$

and

$$E[T_n] = \int_0^\infty E[T_n \mid A_{n-1} = u]\, f_{A_{n-1}}(u)\, du \tag{10.32}$$
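$$\bar{F}_{A_{n+1}}(t) = \int_0^\infty g_t(u)\, f_{A_n}(u)\, du \tag{10.33}$$

where

$$g_t(u) = \begin{cases} \bar{F}_{U(u)}\!\left(\dfrac{t-u}{\pi}\right) & u \le t \\[1ex] 1 & u > t \end{cases} \tag{10.34}$$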
He then notes that when the elements of the vector of repair effectiveness
factors are i.i.d. random variables having distribution function FΠ(π), the sto-
chastic process, {An} having A0 = 0, is a Markov process with the following
transition probability function:
$$\Pr[A_{n+1} \le t \mid A_n = u] = \int_0^1 \left(1 - g_t(u)\right) f_\Pi(\pi)\, d\pi \tag{10.35}$$
Using this result and the assumption that the underlying life distribution is
DMRL, Kijima shows that bounds for E[Sn] can be obtained. A lower bound
is obtained when it is assumed that πn = π for all n and π is simply a constant.
In that case, successive evaluation of the integrals stated in Equations 10.31
and 10.32 is demanding but possible. For an upper bound for E[Sn], we use
the corresponding quantity for the imperfect repair case. That is, taking the
πn to be Bernoulli with E[πn] = p, we have the imperfect repair case. For that
model, we can compute E[Sn|p] using the following recursion:
$$v(m, n) = \mu_m + (1 - p)\, v(m, n-1) + p\, v(m+1, n-1) \tag{10.36}$$
where v(m, n) is the expected value of Sn for a device that is subject to an imperfect
repair regime and has already had m − 1 minimal repairs. The quantity μm
is the mean length of the mth interval when only minimal repair is used and p is the
probability that repair is minimal (π = 1), consistent with E[πn] = p for Bernoulli πn.
The boundary conditions for the recursion
are v(m, 1) = μm and the values v(1, n) = E[Sn|p] provide the upper bounds.
The interpretation of Equation 10.36 is that the expected value of the sum
of the lengths of the next n operating intervals for a device that has had m
minimal repairs so far is the expected value length of the next operating
interval, μm, plus the expected value of the sum of the lengths of the following
n − 1 intervals, for which the number of minimal repairs will be m + 1 with
probability p and m with probability 1 − p. The upper bound value that we
obtain using Equation 10.36 applies to the “general repair case” in which the
distribution on the πn has expected value E[πn] = p.
Based on our understanding of the minimal repair case, we can compute
the value of the mean length of any of the operating intervals under minimal
repair using Equation 10.9 as
$$\mu_m = \frac{1}{\Gamma(m)} \int_0^\infty \left(Z_T(u)\right)^{m-1} e^{-Z_T(u)}\, du \tag{10.37}$$
That is, μm is really just E[Tm]. This quantity forms the basis for the numerical
computations for the recursion of Equation 10.36.
Consider an example. Suppose the underlying life distribution for a device
is Weibull with β = 2.75 and θ = 4000 hours. In this case, we calculate
$$Z_T(t) = \left(\frac{t}{\theta}\right)^{\beta} = \left(\frac{t}{4000}\right)^{2.75}$$
TABLE 10.2
Values of the Mean Residual Life
under Minimal Repair
n μn
1 3559.43
2 1294.34
3 882.50
4 695.31
5 584.69
6 510.27
7 456.15
8 414.69
9 381.70
10 354.71
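For the Weibull case, the integral in Equation 10.37 has a closed form. Substituting x = (u/θ)^β gives μm = (θ/β)Γ(m − 1 + 1/β)/Γ(m); this reduction is a derivation added here for illustration and is not stated in the text. The sketch below evaluates it with the example values θ = 4000 and β = 2.75; the function name is hypothetical.

```python
import math

def minimal_repair_mean_interval(m, theta=4000.0, beta=2.75):
    """mu_m from Equation 10.37 for Z_T(u) = (u / theta) ** beta.

    Closed form (assumed derivation): (theta / beta) * Gamma(m - 1 + 1/beta) / Gamma(m).
    """
    return (theta / beta) * math.gamma(m - 1.0 + 1.0 / beta) / math.gamma(m)

if __name__ == "__main__":
    for m in range(1, 11):
        print(m, round(minimal_repair_mean_interval(m), 2))
```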
TABLE 10.3
Values of E[Sn|p] for Minimal Repair as Upper Bounds
on E[Sn] for General Repair
n E[πn] = p = 0.90 E[πn] = p = 0.75 E[πn] = p = 0.50
1 3559.43 3559.43 3559.43
2 5080.27 5420.04 5986.31
3 6063.38 6624.28 7743.96
4 6823.51 7527.55 9092.12
5 7457.78 8266.61 10179.47
6 8009.35 8902.29 11093.31
7 8501.38 9465.85 11886.39
8 8948.05 9975.54 12591.67
9 9358.77 10443.04 13230.49
10 9740.17 10876.37 13817.18
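The recursion of Equation 10.36 is straightforward to evaluate with memoization. The following is a minimal sketch for the Weibull example (θ = 4000, β = 2.75); it computes the μm values from a closed form of Equation 10.37 obtained by the substitution x = (u/θ)^β, which is an assumption of this sketch, and the function names are illustrative.

```python
import math
from functools import lru_cache

def mean_interval(m, theta=4000.0, beta=2.75):
    """mu_m for the Weibull case: (theta / beta) * Gamma(m - 1 + 1/beta) / Gamma(m)."""
    return (theta / beta) * math.gamma(m - 1.0 + 1.0 / beta) / math.gamma(m)

def upper_bound_sn(n, p, theta=4000.0, beta=2.75):
    """v(1, n) from Equation 10.36 with boundary condition v(m, 1) = mu_m."""

    @lru_cache(maxsize=None)
    def v(m, k):
        mu_m = mean_interval(m, theta, beta)
        if k == 1:
            return mu_m
        return mu_m + (1.0 - p) * v(m, k - 1) + p * v(m + 1, k - 1)

    return v(1, n)

if __name__ == "__main__":
    for n in range(1, 11):
        print(n, [round(upper_bound_sn(n, p), 2) for p in (0.90, 0.75, 0.50)])
```

Memoizing on (m, k) keeps the work quadratic in n, since the recursion only visits pairs with m + k ≤ n + 1.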
with
$$f_{A_1}(t) = \frac{1}{\pi}\, f_T\!\left(\frac{t}{\pi}\right) \tag{10.40}$$
where the values v(1,n) = E[Sn|p] are the upper bounds on E[Sn] for the general
repair case when the πn are i.i.d. and the distribution on the πn has expected
value E[πn] = p.
For Model II, we also note that repeated substitution within the recursion
Equation 10.41 leads to the relation that under imperfect repair, when n ≥ 2,
$$E[T_n] = E[S_n] - E[S_{n-1}] = (1 - p) \sum_{j=1}^{n-1} p^{j-1} \mu_j + p^{n-1} \mu_n \tag{10.42}$$
The quantities μn in each of these expressions are the same as for Model I.
They are the successive mean residual life lengths for a device subjected to a
minimal repair regime.
TABLE 10.4
Values of E[Sn|p] for Minimal Repair as Upper
Bounds on E[Sn] for General Repair
n E[πn] = p = 0.90 E[πn] = p = 0.75 E[πn] = p = 0.50
1 3559.43 3559.43 3559.43
2 5080.27 5420.04 5986.31
3 6267.53 7048.99 8310.23
4 7318.33 8598.97 10610.76
5 8296.54 10113.95 12904.37
6 9230.82 11611.27 15195.65
7 10136.33 13098.95 17486.09
8 11022.02 14581.11 19776.21
9 11893.50 16059.96 22066.19
10 12754.52 17536.78 24356.13
Consider the same example as the previous one. When the life distribution
is Weibull with β = 2.75 and θ = 4000.0, the mean residual life values are those
listed in Table 10.2 and the upper bounds on E[Sn] defined in Equation 10.41
are shown in Table 10.4. Here again, the computational effort associated with
the computation of the lower bounds is excessive.
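The Model II bounds can be accumulated from Equation 10.42. The sketch below takes the μn values of Table 10.2 as input and sums the interval means, using E[T1] = μ1 (the n = 1 case of the same expression, with an empty sum); the function names are hypothetical.

```python
# Mean residual life values under minimal repair, copied from Table 10.2.
MU = [3559.43, 1294.34, 882.50, 695.31, 584.69, 510.27, 456.15, 414.69, 381.70, 354.71]

def expected_interval_model2(n, p, mu=MU):
    """E[T_n] from Equation 10.42 (mu[j - 1] holds mu_j)."""
    partial = sum(p ** (j - 1) * mu[j - 1] for j in range(1, n))
    return (1.0 - p) * partial + p ** (n - 1) * mu[n - 1]

def upper_bound_sn_model2(n, p, mu=MU):
    """E[S_n | p] as the sum of the first n interval means."""
    return sum(expected_interval_model2(k, p, mu) for k in range(1, n + 1))

if __name__ == "__main__":
    for n in range(1, 11):
        print(n, [round(upper_bound_sn_model2(n, p), 2) for p in (0.90, 0.75, 0.50)])
```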
In summary, we may observe that the Kijima models are very appealing as
they provide a significantly more realistic image of the state of equipment follow-
ing repair. Unfortunately, the models are correspondingly difficult to analyze.
On the other hand, it is not difficult to simulate the models and to use the simu-
lation output to describe system behavior. This approach is now widely used.
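A simulation of the two Kijima models can be sketched in a few lines. The version below assumes a Weibull life distribution (θ = 4000, β = 2.75) and a constant repair effectiveness factor π; each operating interval is drawn by inverting the residual life survivor function of Equation 10.21, and the virtual age is then updated with Equation 10.18 or 10.19. The function names and the replication count are illustrative choices.

```python
import math
import random

def sample_interval(age, rng, theta=4000.0, beta=2.75):
    """Draw T from Pr[T > t | A = age] = exp((age/theta)**beta - ((age + t)/theta)**beta)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
    return theta * ((age / theta) ** beta - math.log(u)) ** (1.0 / beta) - age

def simulate_kijima(n_intervals, pi, model, rng, theta=4000.0, beta=2.75):
    """Return the operating intervals T_1..T_n for Kijima Model I (model=1) or II (model=2)."""
    age, intervals = 0.0, []
    for _ in range(n_intervals):
        t = sample_interval(age, rng, theta, beta)
        intervals.append(t)
        if model == 1:
            age = age + pi * t        # Equation 10.18
        elif model == 2:
            age = pi * (age + t)      # Equation 10.19
        else:
            raise ValueError("model must be 1 or 2")
    return intervals

if __name__ == "__main__":
    rng = random.Random(2024)
    reps = 5000
    for pi in (0.0, 0.5, 1.0):
        totals = [sum(simulate_kijima(5, pi, 1, rng)) for _ in range(reps)]
        print(pi, round(sum(totals) / reps, 1))  # Monte Carlo estimate of E[S_5]
```

With π = 0 the simulation reduces to a renewal process and with π = 1 to minimal repair, which gives two cases that can be checked against the analytical results.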
$$T_n = \alpha^{n-1} X_n \tag{10.43}$$
where α (>0) is a constant that alters the scale of the distribution and the Xn
are i.i.d. random variables. Under this definition, the sequence {Tn} is said
to form a quasi-renewal process. Actually, Lam [73,74] and Finkelstein [75]
studied the same process earlier and called it a geometric process. Clearly, if
α = 1, the sequence is a renewal process. An interesting feature of this model
is that a choice of α < 1 implies that the operating intervals are decreasing in
magnitude as might occur with aging and deterioration. On the other hand,
a choice of α > 1 might be used to represent an ongoing improvement in the
FIGURE 10.5
Normal density functions on Tn for n = 4, 8, and 12. (Vertical axis: α^{1−n} f(tα^{1−n}); horizontal axis: t.)
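$$F_{T_n}(t) = F_X\!\left(\frac{t}{\alpha^{n-1}}\right)$$

$$f_{T_n}(t) = \frac{1}{\alpha^{n-1}}\, f_X\!\left(\frac{t}{\alpha^{n-1}}\right) \tag{10.44}$$

$$E[T_n] = \alpha^{n-1} E[X]$$

$$E[S_n] = \sum_{j=1}^{n} E[T_j] = E[X] \sum_{j=1}^{n} \alpha^{j-1} = E[X]\, \frac{1 - \alpha^n}{1 - \alpha} \tag{10.45}$$

$$\Pr[N_t \ge n] = \Pr[S_n \le t]$$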
FIGURE 10.6
Normal distribution functions on Sn at 8000 hours as n increases. (Vertical axis: F(8000); horizontal axis: n.)
and the same logic that we used for the renewal process to construct the
quasi-renewal function
$$Q_{F_T}(t) = E[N_t] = \sum_{n=1}^{\infty} F_{S_n}(t) \tag{10.46}$$
Of course, the distributions FSn (t) are all distinct as they are convolutions of
distinct distributions from a common class. However, since the process {Tn}
is constructed so regularly, the Laplace transforms for the distinct distribu-
tions and hence for the quasi-renewal function are readily constructed as
and
$$Q^*_{F_T}(s) = \sum_{n=1}^{\infty} F^*_{S_n}(s) = \sum_{n=1}^{\infty} \prod_{j=1}^{n} F^*_X\!\left(\alpha^{j-1} s\right) \tag{10.48}$$
Also, as with the renewal process, the derivative of the quasi-renewal func-
tion is the quasi-renewal intensity function
$$q^*_{F_T}(s) = \sum_{n=1}^{\infty} f^*_{S_n}(s) = \sum_{n=1}^{\infty} \prod_{j=1}^{n} f^*_X\!\left(\alpha^{j-1} s\right) \tag{10.49}$$
$$f_{T_n}(t) = \frac{1}{\alpha^{n-1}}\, f_X\!\left(\frac{t}{\alpha^{n-1}}\right) = \frac{1}{\alpha^{n-1}\sqrt{2\pi\sigma^2}}\, e^{-\left(t/\alpha^{n-1} - \mu\right)^2 / 2\sigma^2} = \frac{1}{\sqrt{2\pi\left(\alpha^{n-1}\sigma\right)^2}}\, e^{-\left(t - \mu\alpha^{n-1}\right)^2 / 2\left(\alpha^{n-1}\sigma\right)^2}$$

but with a mean value of α^{n−1}μ and a standard deviation of α^{n−1}σ. We can thus
construct the convolutions of the distributions on the Tn to obtain the distributions
on the Sn. We do this most easily using the Laplace transforms. For
the normal life distribution,
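$$f^*_X(s) = e^{-s\mu + \frac{s^2\sigma^2}{2}}$$

so

$$f^*_{T_n}(s) = e^{-s\alpha^{n-1}\mu + \frac{s^2\left(\alpha^{n-1}\sigma\right)^2}{2}}$$

and

$$f^*_{S_n}(s) = \prod_{j=1}^{n} f^*_{T_j}(s) = \prod_{j=1}^{n} e^{-s\alpha^{j-1}\mu + \frac{s^2\left(\alpha^{j-1}\sigma\right)^2}{2}} = e^{-s\mu \sum_{j=1}^{n} \alpha^{j-1} + \frac{s^2\sigma^2}{2} \sum_{j=1}^{n} \left(\alpha^{j-1}\right)^2} = e^{-s\mu \left(\frac{1-\alpha^n}{1-\alpha}\right) + \frac{s^2\sigma^2}{2} \left(\frac{1-\alpha^{2n}}{1-\alpha^2}\right)}$$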
and we can conclude that the distributions on the Sn are normal with mean
μ(1 − α^n)/(1 − α) and standard deviation σ((1 − α^{2n})/(1 − α²))^{1/2}.
Then, we observe that as n increases, both the mean and the standard deviation
values converge (for α < 1), often quite rapidly. In addition, for any time
interval, t, the sum of the quantities FSn(t) will also converge. Algebraically,
it is difficult to identify the convergent form, but we can easily compute
approximate values of the limit numerically. As a general observation, the
closer the value of α is to 1.0, the more quickly the values FSn(t) decline and,
therefore, the more quickly the numerical convergence of the sum.
Suppose the underlying life distribution has μ = 4000.0 and σ = 1000.0.
The corresponding (approximate) limits for several values of α are shown
in Table 10.5. As the value of α is decreased, the number of terms that must
be computed increases. For example, for α = 0.95 and a time of 5,000 hours,
Q(5,000) is determined using only 2 terms FSn(t), and Q(10,000) requires
6 terms. On the other hand, for α = 0.75, computing Q(5,000) requires 6 terms
and computing Q(10,000) requires 200 terms.
TABLE 10.5
Values of Q(t) for the Quasi-Renewal Model
with Normal Life Distribution
t α = 0.95 α = 0.85 α = 0.75
1,000 0.001 0.001 0.001
2,500 0.067 0.067 0.067
5,000 0.863 0.875 0.897
8,000 1.577 1.741 1.992
10,000 2.145 2.447 3.084
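The truncated sum for the quasi-renewal function is simple to evaluate in the normal case, because each Sn is normal with the mean and standard deviation given earlier. The following is a minimal sketch assuming α ≠ 1 and a caller-chosen truncation level; the function names and the parameter n_terms are illustrative and not part of the text.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def quasi_renewal_normal(t, mu, sigma, alpha, n_terms):
    """Sum of F_{S_n}(t) for n = 1..n_terms (Equation 10.46, truncated), alpha != 1."""
    total = 0.0
    for n in range(1, n_terms + 1):
        mean_n = mu * (1.0 - alpha ** n) / (1.0 - alpha)
        sd_n = sigma * math.sqrt((1.0 - alpha ** (2 * n)) / (1.0 - alpha ** 2))
        total += normal_cdf((t - mean_n) / sd_n)
    return total

if __name__ == "__main__":
    for t in (1000.0, 2500.0, 5000.0, 8000.0):
        print(t, round(quasi_renewal_normal(t, 4000.0, 1000.0, 0.95, 6), 3))
```

Because the result depends on the truncation level, it is worth repeating a calculation with a larger n_terms to see whether the value has stabilized to the precision required.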
10.4 Conclusion
As in the previous chapter, the models presented earlier serve to highlight the
questions one should consider in the study of repairable systems. The mod-
els in this chapter focus on the more realistic representation of the postrepair
equipment state. By treating repair as instantaneous, we obtain the simplest
model forms possible and develop the methods best suited to their analysis.
With these methods now defined, we are ready to move on to the investiga-
tion of equipment performance when repair is noninstantaneous.
Exercises
10.1 Suppose that the minimal repair model is to be applied to a device
for which the underlying life distribution is a gamma distribution.
For each of the parameter pairs obtained by crossing β = 2, 3, 5, and 8
with λ = 0.05 and 0.005, compute E[T] and then compute Λ(0.5E[T]),
Λ(1.5E[T]), and Λ(2.5E[T]).
10.2 Assume the operation of a device is to be represented using the mini-
mal repair model with a Weibull life distribution having β = 1.5 and
θ = 5000.0 hours. Compute Pr[T4 − T3 > 1200] for T3 values of 6,000,
9,000, and 12,000 hours. Also compute Pr[Nt = 4] for times of 6,000,
9,000, and 12,000 hours. Finally, compute Pr[Nt = 4], Pr[N4000 − N1000 > 2],
Pr[N6000 − N3000 > 2], and Pr[N8000 − N5000 > 2].
10.3 Plot the cumulative intensity function for the minimal repair model
for the case in which the underlying life distribution is gamma with
β = 2 and λ = 0.20.
10.4 Compute and plot the distribution on the time between device renew-
als for the imperfect repair model having a Weibull life distribution
with β = 1.5 and θ = 5000.0 hours and p = 0.4. Then compute compari-
son values of FT|p(t) and FT(t) for t = 2,000, 5,000, 10,000, and 20,000.
10.5 Resolve Problem 10.4 using the time-dependent form of the renewal
probability p(t) = 1 − e^{−ρt} with ρ = 0.0005. (Hint: The function p(t) is not a
CDF and should be used as stated.)
10.6 Compute E[T4−T3] and E[T5−T4] for the gamma distribution–based
NHPP where β = 2 and λ = 0.008.
10.7 Compute E[T4−T3] and E[T5−T4] for the gamma distribution–based
NHPP where β = 4 and λ = 0.005.
10.8 Return to your analysis of the Weibull renewal process and use your
coefficients to obtain a plot of the approximation to the convolutions
F_T^{(k)}(t) for the distribution having β = 2.75 and θ = 4000. Do this for
k = 1, 3, and 5, and in each case, carry the plot to 8,000 and 12,000
hours. Comment on your results.
10.9 Prove that in general, f_{cX}(y) = (1/c) f_X(y/c) and construct the realization
of this relationship for the Weibull distribution.
10.10 Prove that an IFR distribution is DMRL.
10.11 Replicate the bounds on E[Sn] for Kijima Model I stated in Table 10.3 by
computing the values of E[Sn|p] using Equation 10.32 for the case in
which p = 0.90.
10.12 Suppose the Weibull life distribution for the device subjected to mini-
mal repair has shape parameter β = 0.75. Compute the first 10 values
of the mean residual life and the resulting upper bounds for E[Sn] for
general repair when θ = 4000 and p = 0.90. How do these values com-
pare to those of Problem 10.11? Why?
10.13 Assume that the base distribution for a quasi-renewal process, FX(x),
is normal with μ = 2500 and σ = 600. Construct the algebraic form of
the quasi-renewal function and evaluate it using α = 0.98 at t = 2600,
t = 5000, and t = 7500.
10.14 For the quasi-renewal model having α = 0.90, assume FX(x) is a gamma
distribution having β = 2.0 and λ = 0.05. Construct general expressions
for fTn (t) and FTn (t) and plot these functions for n = 2, n = 4, and n = 6.
Next, construct the Laplace transform for fSn (t) and use it as a basis for
constructing FSn (t). Finally, use your expressions to compute values of
the quasi-renewal function at t = 25 and t = 100 using 2 or 3 truncation
levels. Comment on your results.
11
Availability Analysis
The models and results described in Chapters 9 and 10 are constructed with-
out considering the duration of the repair activity. In many situations, the
focus of our analysis is upon questions for which the answers do not depend
upon the duration of the repair process. This is especially true when the
time a system spends down is relatively less important than the fact that a
failure has occurred and when the duration of repair is small or negligible
in comparison to the device life length. In contrast to such devices, there are
components and systems for which the duration of the repair activity has an
impact on the meaningful device performance measures. For items of this
sort, we must include repair times in our models.
Naturally, when we include repair time in our models, we may represent
the possible repair durations in whatever manner seems most representa-
tive of actual experience. In some cases, repair time is taken to be a constant,
while in most cases, repair time is treated as a random variable and a specific
distribution is selected to portray the dispersion in repair times. We will
consider both of these possibilities.
We may consider that a typical sample path for a device is one in which
periods of operation are terminated by device failure and that therefore a
repair period follows each failure. Upon completion of the repair, the device
is placed in operation again. A representative sample path is shown in
Figure 11.1. Note that we will modify the labels soon but for now, we indicate
that the periods of repair have durations Rj while the operating periods are
labeled as Tj. Observe also that the device’s state is denoted by Xi and that the
value of the state variable is shown to be 1 when the device is operating and
0 when it is being repaired. Note further that each pair of periods, operating
and repair, is shown to have total duration Vj where
Vj = Tj + Rj (11.1)
FIGURE 11.1
Representative sample path. (State, 1 or 0, versus time; operating periods T1, T2 alternate
with repair periods R1, R2, and V1, V2 mark the ends of the cycles.)
majority of this chapter, the cycles are assumed to form a renewal process
and the types of analysis that can be performed are discussed. Toward the
end of the next chapter, we examine the relatively few results that have been
constructed for the nonrenewal case.
To study the cycles in the renewal case, we first modify the definition of the
quantities Vj. In general, our analyses are based on an interest in the times of
device failure rather than in times of device restart. Therefore, we modify the
labels on the sample paths so that the Vj correspond to failure times. This
is illustrated in Figure 11.2. The difference is quite subtle. The change is that
the values of the Vj now correspond to failure times. That is,
$$\begin{aligned}
V_1 &= T_1 \\
V_2 &= R_1 + T_2 \\
&\;\;\vdots \\
V_j &= R_{j-1} + T_j
\end{aligned} \tag{11.2}$$
FIGURE 11.2
Representative sample path with revised labeling of the Vj. (State, 1 or 0, versus time;
V1, V2 mark the failure times.)
As a matter of convention, let the distributions FTj (t) represent the distribu-
tions on the lengths of the operating intervals (which could be just the life
distribution), and let GRj (t) represent the distributions on the lengths of the
repair intervals. In that case, the distributions on the durations of the cycles
are constructed as the convolution of FTj (t) and GRj (t) and are denoted by
HVj (t). Thus, in general,
$$H_{V_j}(t) = F_{T_j}(t) * G_{R_{j-1}}(t) = \int_0^t G_{R_{j-1}}(t - u)\, dF_{T_j}(u) \tag{11.3}$$
When the distributions are common for all cycles, the subscript “j” is
dropped.
An interesting consequence of the relabeling of the sample path is that
the renewal functions for the expected number of failures and the expected
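number of repairs are different. We will exploit that difference in our analysis
later. For now, note that the Laplace transform for the expected number of
repair completions during the interval (0, t) is given by

$$M^*_{G_R}(s) = \frac{f^*_T(s)\, g^*_R(s)}{s \left(1 - f^*_T(s)\, g^*_R(s)\right)} \tag{11.4}$$

while that for the expected number of failures is

$$M^*_{H_V}(s) = \frac{f^*_T(s)}{s \left(1 - f^*_T(s)\, g^*_R(s)\right)} \tag{11.5}$$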
The expected number of failures and the expected number of points Vj are
the same. These arise from the applicable forms of the key renewal theorem
that are as follows:
$$M_{G_R}(t) = H_V(t) + \int_0^t M_{G_R}(t - u)\, h_V(u)\, du \tag{11.6}$$

$$M_{H_V}(t) = F_T(t) + \int_0^t M_{H_V}(t - u)\, h_V(u)\, du \tag{11.7}$$
Definition 11.1
The (point) availability at time t, A(t), for a device is the probability that it is
functioning (X(t) = 1) at the time. Thus, A(t) = Pr[X(t) = 1].
Definition 11.2
The limiting availability for a device is the limit of the point availability function.
That is, A∞ = lim A(t) as t → ∞.
The limiting availability can be very useful. As we will see, many devices
experience an interval of transience before they “settle down” into a consis-
tent pattern of operation. At that point, the devices often display availability
behavior that is stable and similar (or equal) to the limiting form. In addi-
tion, there are many analytical cases in which the point availability is very
difficult (or impossible) to compute but the limiting availability measure is
manageable.
It may also be useful at certain times to compute averages, so we have the
following definition:
Definition 11.3
Definition 11.4
The limiting average availability for a device is the limit of the average avail-
ability over (0, t). That is,
Each of the four measures has utility in specific cases. In general, the point
availability is the most informative measure but it is usually the most difficult
to obtain. The most commonly used of the measures is the limiting avail-
ability. The primary reason for its popularity is that in the case in which the
operating and repair cycles form a renewal process, it is easily computed as
$$A_\infty = \frac{E_F[T]}{E_F[T] + E_G[T]} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}} \tag{11.12}$$
where
MTTF is the “mean time to failure”
MTTR is the “mean time to repair”
each cycle has the same probability distribution on its length and the cycles
form a renewal process. Therefore, the availability function may be defined
as the sum of two probabilities:
1. The probability that the device has never failed and is thus still
functioning
2. The probability that a new cycle was started at some recent point in
time and no failure has occurred since then
Algebraically, this is
$$A(t) = \bar{F}_T(t) + \int_0^t \bar{F}_T(t-u)\, m_{H_V}(u)\, du \tag{11.13}$$
$$A^*(s) = \left(\frac{1}{s} - F_T^*(s)\right) + \left(\frac{1}{s} - F_T^*(s)\right) m^*_{H_V}(s) = \frac{1}{s}\left(1 - f_T^*(s)\right)\left(1 + m^*_{H_V}(s)\right) \tag{11.14}$$
and the inverse transform gives the availability function. Note that
Equation 9.16 of Chapter 9 implies that we may use
$$A^*(s) = \frac{1}{s}\left(1 - f_T^*(s)\right) \frac{1}{1 - f_T^*(s)\, g_R^*(s)} \tag{11.15}$$
$$A^*(s) = M^*_{G_R}(s) - M^*_{H_V}(s) + \frac{1}{s} \tag{11.16}$$
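$$h_V^*(s) = f_T^*(s)\, g_R^*(s) = \frac{\lambda_f}{s + \lambda_f} \cdot \frac{\lambda_r}{s + \lambda_r} = \frac{\lambda_f \lambda_r}{s^2 + (\lambda_f + \lambda_r)s + \lambda_f \lambda_r}$$
$$\frac{1}{s^2 + (\lambda_f + \lambda_r)s + \lambda_f \lambda_r} = \frac{A}{s + \lambda_f} + \frac{B}{s + \lambda_r}$$
$$A = \frac{1}{\lambda_r - \lambda_f} \quad \text{and} \quad B = \frac{-1}{\lambda_r - \lambda_f}$$
$$h_V(t) = \frac{\lambda_f \lambda_r}{\lambda_r - \lambda_f}\left(e^{-\lambda_f t} - e^{-\lambda_r t}\right)$$
with the corresponding distribution function
$$H_V(t) = 1 - \frac{\lambda_r e^{-\lambda_f t} - \lambda_f e^{-\lambda_r t}}{\lambda_r - \lambda_f}$$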
Observe that this is neither an exponential nor a gamma distribution but that
it is a proper distribution function and it provides an unambiguous model of
the dispersion in the lengths of the cycles of operation and repair. Naturally,
it also correctly yields
$$E_{H_V}[T] = E_F[T] + E_R[T]$$
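The route from the transform to the time-domain availability can be sketched as a worked step. It assumes exponential failure and repair densities with rates λf and λr, as in the expressions above, substituted into Equation 11.15. The constants A and B here belong to this new expansion, not to the one used for the cycle-length density. Expanding in partial fractions,

$$A^*(s) = \frac{1}{s} \cdot \frac{s}{s + \lambda_f} \cdot \frac{(s + \lambda_f)(s + \lambda_r)}{(s + \lambda_f)(s + \lambda_r) - \lambda_f \lambda_r} = \frac{s + \lambda_r}{s\,(s + \lambda_f + \lambda_r)} = \frac{A}{s} + \frac{B}{s + \lambda_f + \lambda_r}$$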
we obtain
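$$A = \frac{\lambda_r}{\lambda_r + \lambda_f} \quad \text{and} \quad B = \frac{\lambda_f}{\lambda_r + \lambda_f}$$
$$A(t) = \frac{\lambda_r}{\lambda_f + \lambda_r} + \frac{\lambda_f}{\lambda_f + \lambda_r}\, e^{-(\lambda_f + \lambda_r)t}$$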
This result is quite useful for examining the behavior of the availability
measures for the renewal case. Note first that the availability function is
comprised of a constant term and a term that diminishes over time so the
function displays an initial transience and then settles down to a stable and
essentially constant value. This is illustrated in Figure 11.3 for λf = 0.01 and
λr = 0.1. The limiting value is
$$A_\infty = \frac{\lambda_r}{\lambda_f + \lambda_r} = 0.909$$
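The transient and the limit can be checked numerically. The following is a minimal sketch that evaluates the exponential-case availability function for λf = 0.01 and λr = 0.1; the function name `point_availability` and the chosen time points are illustrative assumptions, not part of the text.

```python
import math

def point_availability(t, lam_f, lam_r):
    """A(t) for exponential life (rate lam_f) and exponential repair (rate lam_r)."""
    total = lam_f + lam_r
    # Constant (limiting) term plus a transient that decays at rate lam_f + lam_r.
    return lam_r / total + (lam_f / total) * math.exp(-total * t)

lam_f, lam_r = 0.01, 0.1           # values used in the text's example
limit = lam_r / (lam_f + lam_r)    # limiting availability

for t in (0, 10, 20, 50, 100):     # illustrative time points
    print(f"t = {t:3d}  A(t) = {point_availability(t, lam_f, lam_r):.4f}")
print(f"limiting availability = {limit:.3f}")
```

The printed values start at 1 and decay toward the limiting value, which is the "settling down" behavior described above.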
FIGURE 11.3
Availability function example. (Axes: Availability versus Time.)
$$A_\infty = \frac{\lambda_r}{\lambda_f + \lambda_r} = \frac{1/E_G[T]}{1/E_F[T] + 1/E_G[T]} = \frac{E_F[T]}{E_F[T] + E_G[T]}$$
$$A_\infty = \frac{E_F[T]}{E_F[T] + E_G[T]} = \frac{177.245}{177.245 + 10.0} = 0.94659$$
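The numerical value can be reproduced directly from the distribution parameters. This sketch assumes the Weibull life distribution with β = 2.0 and θ = 200 hours and the exponential repair rate 0.10 that the chapter uses for this device; the helper name `weibull_mean` is an illustrative assumption.

```python
import math

def weibull_mean(beta, theta):
    """Mean of a Weibull distribution with shape beta and scale theta."""
    return theta * math.gamma(1.0 + 1.0 / beta)

mttf = weibull_mean(2.0, 200.0)   # mean time to failure
mttr = 1.0 / 0.10                 # mean of the exponential repair time
a_inf = mttf / (mttf + mttr)      # limiting availability, Equation 11.12

print(f"MTTF = {mttf:.3f}, MTTR = {mttr:.1f}, limiting availability = {a_inf:.5f}")
```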
$$m_{H_V}(t) = \sum_{k=1}^{\infty} \int_0^t f_F^{(k)}(u)\, g_R^{(k)}(t-u)\, du \tag{11.18}$$
and then replace fT( n ) (t) with its approximate form as defined in Appendix B.
The resulting values of the function are fully accurate until we introduce
the numerical error associated with the integration and the error resulting
from the finite truncation of the infinite sums. The actual numerical effort
involved in this analysis is quite taxing. However, it is manageable. Several
of the values of the point availability obtained for the given distributions
are shown in Table 11.1. The values show a transient interval and a relatively
rapid convergence to the limiting value. In view of the fact that both the
Weibull life distribution and the exponential repair time distribution cen-
ter around exponential terms, the rapid convergence is exactly the type of
behavior one expects to see.
TABLE 11.1
Computed Values of the Availability Function for Weibull Failures and Exponential Repairs
Time    A(t)
50      0.9807
100     0.9614
150     0.9498
200     0.9454
250     0.9451
300     0.9466
350     0.9466
400     0.9466
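Values of this kind can be approximated with a short time-stepping calculation. The sketch below is not the series-based procedure described above; it is a simpler scheme, offered as an assumption-laden alternative, that exploits the exponential repair time: the probability of being down grows with the density of failures and decays at the repair rate μ. The step size `dt` and all function names are illustrative choices.

```python
import math

def weibull_pdf(t, beta, theta):
    """Weibull density f_T(t) with shape beta and scale theta."""
    if t <= 0.0:
        return 0.0
    z = (t / theta) ** beta
    return (beta / t) * z * math.exp(-z)

def point_availability(t_end, beta=2.0, theta=200.0, mu=0.10, dt=0.25):
    """Approximate A(t_end) by forward Euler steps on P(down at t)."""
    steps = int(round(t_end / dt))
    f = [weibull_pdf(k * dt, beta, theta) for k in range(steps + 1)]
    restart = [0.0] * (steps + 1)   # density of repair completions (restarts)
    down = 0.0                      # probability the device is down
    for i in range(steps):
        restart[i] = mu * down
        # Failure density: first failure of the original device, plus failures
        # of devices restarted at earlier grid times (rectangle-rule convolution).
        failure_density = f[i] + dt * sum(restart[j] * f[i - j] for j in range(i))
        down += dt * (failure_density - mu * down)
    return 1.0 - down

for t in (50, 100, 200, 400):
    print(t, round(point_availability(t), 4))
```

Under these assumptions the results should land close to the tabulated values, with small differences attributable to the discretization.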
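$$A_j(t) = \frac{\lambda_{jr}}{\lambda_{jf} + \lambda_{jr}} + \frac{\lambda_{jf}}{\lambda_{jf} + \lambda_{jr}}\, e^{-(\lambda_{jf} + \lambda_{jr})t}$$
and
$$A_{j\infty} = \frac{\lambda_{jr}}{\lambda_{jf} + \lambda_{jr}}$$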
and the point availability function is the one shown in Figure 11.4. We can see
that the product form of the structure function implies a fairly rapid decline
in the value of the exponential term and the system thus “settles down” into
its limiting behavior in a short time.
FIGURE 11.4
Point availability function for a series system example.
As an example of the more complicated analysis we might encounter, con-
sider the bridge structure of Chapter 2 and assume each of the components
has a Weibull life distribution with parameters β = 2.0 and θ = 200 hours and
an exponential repair time distribution with parameter λ = 0.10. As we have
already computed the availability for a single device with these character-
istics, we know that Aj(50) = 0.9807 and Aj(200) = 0.9454. We may also recall
that the series–parallel bounds, the min path upper–min cut lower bounds,
and the minimax bounds on reliability for the bridge structure with identical
components are
$$r^5 \le R_s \le 1 - (1 - r)^5$$
$$\left(1 - (1 - r)^2\right)^2 \left(1 - (1 - r)^3\right)^2 \le R_s \le 1 - \left(1 - r^2\right)^2 \left(1 - r^3\right)^2$$
and
$$r^2 \le R_s \le 1 - (1 - r)^2$$
and
and
$$A_{j\infty} = \frac{E[T_j]}{E[R_j] + E[T_j]} = \frac{177.245}{10.0 + 177.245} = 0.947$$
$$0.760 \le A_{s\infty} \le 1$$
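The three pairs of bounds are easy to evaluate once a component availability is substituted for r. The sketch below assumes, as the limiting-availability bound above suggests, that the component limiting availability is used in place of the component reliability; the function name and dictionary keys are illustrative.

```python
def bridge_bounds(r):
    """Bounds for a bridge structure with five identical components of availability r."""
    return {
        "series-parallel": (r ** 5, 1 - (1 - r) ** 5),
        "min cut / min path": (
            (1 - (1 - r) ** 2) ** 2 * (1 - (1 - r) ** 3) ** 2,
            1 - (1 - r ** 2) ** 2 * (1 - r ** 3) ** 2,
        ),
        "minimax": (r ** 2, 1 - (1 - r) ** 2),
    }

r = 177.245 / (10.0 + 177.245)   # component limiting availability
bounds = bridge_bounds(r)
for name, (lower, upper) in bounds.items():
    print(f"{name:20s} {lower:.4f} <= A_s <= {upper:.4f}")
```

The series-parallel pair is the loosest of the three, which is why its upper bound rounds to 1.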
and
Thus, we can see that for the assumed operating scenario, the results we devel-
oped earlier using the system structure can be very useful and informative.
A second model for which we can obtain results is that in which we have
an m component series system. We assume that the failure of any component
stops the operation of the system and thus of the other components. The
other components do not age while the failed component is being replaced.
Thus, when system operation is resumed, the replaced component is new
and the other components have the age they had when the replaced compo-
nent failed. Several key structural results for this model were developed by
Barlow and Proschan [12]. These results are general in that they apply to any
choice of continuous life and repair time distributions.
The sample path shown in Figure 11.1 at the start of this chapter is rep-
resentative of the operating experience for a series system. The individual
failure and repair times may (usually do) correspond to the failures and
replacements of different components. Thus, let Tji represent the life length
of the ith copy of component j used in the system, and let Rji represent the ith
replacement time for the jth component. Next, let U(t) represent the system
operating time (uptime) during the real time interval (0, t). Note that U(t) is
not an availability measure but that for any realization of the system sample
path, U(t)/t is the proportion of the time the system functioned and hence is
the average availability for that sample path. Now, Barlow and Proschan [12]
prove that
$$A_{S\infty} = \lim_{t \to \infty} \frac{U(t)}{t} = \left(1 + \sum_{j=1}^{m} \frac{E_{G_j}[R]}{E_{F_j}[T]}\right)^{-1} \tag{11.21}$$
That is, we take the ratio of the expected repair time to the expected life
length for each component and sum across components. Add this to one and
take the reciprocal and the result is the limiting average system availability.
$$\lim_{t \to \infty} \left\{ E\left[\frac{N_j(U(t))}{U(t)}\right] \right\} = \frac{1}{E_{F_j}[T]} \tag{11.22}$$
and
$$N_j(t) = \sup\{n \mid S_{jn} \le t\}$$
and
$$\lim_{t \to \infty} \left\{\frac{N_j(t)}{t}\right\} \approx \frac{A_{S\infty}}{E_{F_j}[T]} \tag{11.24}$$
Over time, the uptime is approximated by the product of time and limit-
ing average system availability, and the number of replacements for any
component is approximately the limiting average availability divided by
the average life length. Note that dividing both sides of Equation 11.24 gives
(approximately) Equation 11.22.
At the system level, we can state that the average system uptime per cycle
converges in the limit to
$$U = \left(\sum_{j=1}^{m} \frac{1}{E_{F_j}[T]}\right)^{-1} \tag{11.25}$$
and the corresponding limit for the average time down per cycle is
$$D = U \left(\sum_{j=1}^{m} \frac{E_{G_j}[R]}{E_{F_j}[T]}\right) \tag{11.26}$$
Clearly, these imply the average cycle length and they are consistent with the
average availability expressions.
Consider again the example of a system comprised of m = 5 components in
series each of which has an exponential life distribution with the parameters
λf1 = 1.20, λf2 = 0.800, λf3 = 0.400, λf4 = 0.90, and λf5 = 5.00 and an exponential
repair time distribution with parameters λr1 = 19.01, λr2 = 25.00, λr3 = 10.00,
λr4 = 40.00, and λr5 = 20.00. In this case, we find
$$A_{S\infty} = \left(1 + \frac{0.0625}{0.833} + \frac{0.0640}{1.250} + \frac{0.100}{2.500} + \frac{0.025}{1.111} + \frac{0.050}{0.200}\right)^{-1} = 0.695$$
$$U = \left(\sum_{j=1}^{m} \frac{1}{E_{F_j}[T]}\right)^{-1} = (0.833 + 1.25 + 2.50 + 1.111 + 0.20)^{-1} = 0.11668$$
$$D = U(0.4387) = 0.0732$$
$$A_{S\infty} = \frac{U}{U + D}$$
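The limiting average availability of Equation 11.21 can be evaluated in a few lines. This sketch takes the mean life lengths from the stated failure rates and, as an explicit assumption, takes the mean repair times exactly as they appear in the numerators of the worked expression above (0.0625, 0.0640, 0.100, 0.025, 0.050); the variable names are illustrative.

```python
failure_rates = [1.20, 0.800, 0.400, 0.90, 5.00]
mean_life = [1.0 / lam for lam in failure_rates]       # E_Fj[T]

# Assumption: mean repair times as printed in the worked expression.
mean_repair = [0.0625, 0.0640, 0.100, 0.025, 0.050]    # E_Gj[R]

# Sum of the ratios of expected repair time to expected life length.
ratio_sum = sum(r / t for r, t in zip(mean_repair, mean_life))
a_s_inf = 1.0 / (1.0 + ratio_sum)                      # Equation 11.21

print(f"sum of ratios = {ratio_sum:.4f}")
print(f"limiting average system availability = {a_s_inf:.3f}")
```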
$$A_{S\infty} = \left(1 + \frac{E_G[R]}{E_F[T]}\right)^{-1} = \frac{E_F[T]}{E_F[T] + E_G[R]}$$
the probability analysis for availability. Some of these cost models are dis-
cussed in the next chapter. In the meantime, we can construct some single
unit availability results for the imperfect repair model and for the quasi-
renewal model.
$$F_p(t) = 1 - \left(\bar{F}_T(t)\right)^p = 1 - e^{-pZ_T(t)} \tag{11.27}$$
This expression applies when the minimal repair actions are instantaneous.
However, it also represents the distribution on total operating time for the
case in which minimal repairs have nonzero durations. In addition, it is use-
ful to note that the average time of operation per renewal cycle may be deter-
mined as
$$E_p[T] = \int_0^{\infty} \bar{F}_p(u)\, du \tag{11.28}$$
Now, if the minimal repair times have distribution, Gm(t) and if it is the nth
device failure that produces the first perfect repair, then the total time spent
in minimal repair will have the distribution, Gm( n -1) (t), the n − 1 fold convolu-
tion of Gm(t). For this conceptual model, Iyer [77] has shown that the distri-
bution on the total time devoted to minimal repairs in any renewal cycle is
$$G(t) = e^{-\Lambda_m(t)} \sum_{n=1}^{\infty} \frac{\left(\Lambda_m(t)\right)^{n-1}}{(n-1)!}\, G_m^{(n-1)}(t) \tag{11.29}$$
where
$$\Lambda_m(s) = \int_0^s q(u)\, z_T(u)\, du \tag{11.30}$$
$$F_p^*(t) = \int_0^t G(t-u)\, f_p(u)\, du \tag{11.31}$$
Similarly, the distribution on the total duration of the renewal cycle is the
convolution of this distribution with the one on perfect repair time, say, Gp(t).
We may also determine the limiting device (average) availability using the
expectations for the durations already defined. To do this, note first that in
general, each renewal cycle will include, say, N periods of device operation
terminated by a device failure. The first N − 1 of the operating periods is
followed by periods of minimal repair and the last failure is followed by a
period of perfect repair. Now Iyer [77] has shown that
$$E[N] = 1 + \int_0^{\infty} \Lambda_m(u)\, f_p(u)\, du \tag{11.32}$$
$$A_\infty = \frac{E_{F_p}[T]}{E_{F_p}[T] + E[N-1]\, E_{G_m}[T] + E_{G_p}[T]} \tag{11.33}$$
That is, the average uptime divided by the average cycle length gives us the
average availability. The average cycle length is comprised of the average
uptime plus the average number of minimal repair intervals times their
average duration plus the average duration of the perfect repair interval.
Notice that we can also state that the average fraction of the time the device
is undergoing minimal repair is
$$\frac{E[N-1]\, E_{G_m}[T]}{E_{F_p}[T] + E[N-1]\, E_{G_m}[T] + E_{G_p}[T]}$$
and the average fraction of the time the device is undergoing perfect repair is
$$\frac{E_{G_p}[T]}{E_{F_p}[T] + E[N-1]\, E_{G_m}[T] + E_{G_p}[T]}$$
Finally, we note that for the simpler case in which p(t) = p, the probability
of perfect repair is age independent so the expected number of failures to a
perfect repair is geometric and
$$E[N] = \frac{1}{p} \tag{11.34}$$
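For a constant probability of perfect repair, Equation 11.33 reduces to a closed-form calculation when the life distribution is Weibull. The sketch below assumes Z_T(t) = (t/θ)^β, so that the time to the first perfect repair is again Weibull with scale θ p^(−1/β). The numerical values of p and of the two mean repair times are hypothetical, chosen only to illustrate the calculation.

```python
import math

def limiting_availability_imperfect(beta, theta, p, mean_minimal, mean_perfect):
    """Equation 11.33 for Weibull life and constant perfect-repair probability p."""
    # Mean operating time per renewal cycle: integral of exp(-p (t/theta)^beta).
    mean_up = theta * p ** (-1.0 / beta) * math.gamma(1.0 + 1.0 / beta)
    expected_minimal_repairs = 1.0 / p - 1.0          # E[N - 1] with E[N] = 1/p
    cycle = mean_up + expected_minimal_repairs * mean_minimal + mean_perfect
    return mean_up / cycle

# Hypothetical inputs: p = 0.5, mean minimal repair 5 h, mean perfect repair 10 h.
a_inf = limiting_availability_imperfect(2.0, 200.0, 0.5, 5.0, 10.0)
print(f"limiting availability = {a_inf:.4f}")
```

Setting p = 1 removes the minimal repairs and recovers the ordinary renewal result MTTF/(MTTF + MTTR).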
$$G(t) = \sum_{n=1}^{\infty} \frac{(1 - q)\, q^{n-1}}{(1 - q^n)}\, G_m^{(n-1)}(t) \tag{11.35}$$
$$F_{T_n}(t) = F_{X_1}\!\left(\frac{t}{\alpha^{n-1}}\right)$$
$$G_{Y_i}(t) = G_{Y_1}\!\left(\frac{1}{\beta^{i-1}}\, t\right)$$
define $V_i = T_i + R_i$ so that $H_{V_i}(t)$ is the convolution of $F_{T_i}(t)$ and $G_{Y_i}(t)$, and
$Q_n = \sum_{i=1}^{n} V_i$ has distribution $H_{Q_n}(t)$ so that the quasi-renewal function for
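$$A(t) = \bar{F}_{X_1}(t) + \int_0^t \bar{F}_{X_2}(t-u)\, h_{Q_1}(u)\, du + \int_0^t \bar{F}_{X_3}(t-u)\, h_{Q_2}(u)\, du + \cdots + \int_0^t \bar{F}_{X_{n+1}}(t-u)\, h_{Q_n}(u)\, du + \cdots$$
$$= \bar{F}_{X_1}(t) + \sum_{n=1}^{\infty} \int_0^t \bar{F}_{X_{n+1}}(t-u)\, h_{Q_n}(u)\, du$$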
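$$\begin{aligned}
A^*(s) &= \frac{1}{s}\left(1 - f^*_{X_1}(s)\right) + \sum_{n=1}^{\infty} H^*_{Q_n}(s) - \sum_{n=1}^{\infty} F^*_{X_{n+1}}(s)\, h^*_{Q_n}(s) \\
&= \frac{1}{s}\left(\left(1 - f^*_{X_1}(s)\right) + \sum_{n=1}^{\infty} h^*_{Q_n}(s) - \sum_{n=1}^{\infty} f^*_{X_{n+1}}(s)\, h^*_{Q_n}(s)\right) \\
&= \frac{1}{s}\left(\left(1 - f^*_{X_1}(s)\right) + \sum_{n=1}^{\infty} \left(1 - f^*_{X_{n+1}}(s)\right) h^*_{Q_n}(s)\right)
\end{aligned} \tag{11.37}$$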
Device availability is also equal to one minus the probability that the device
is down for repair. This event can be represented as
$$A(t) = 1 - \left(\int_0^t \bar{G}_{Y_1}(t-u)\, f_{X_1}(u)\, du + \sum_{n=1}^{\infty} \int_0^t \int_0^{t-w} \bar{G}_{Y_{n+1}}(t-w-u)\, f_{X_{n+1}}(u)\, h_{Q_n}(w)\, du\, dw\right) \tag{11.38}$$
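$$\begin{aligned}
A^*(s) &= \frac{1}{s}\left\{1 - f^*_{X_1}(s)\left[1 - g^*_{Y_1}(s)\right] - \sum_{n=2}^{\infty} \left[1 - g^*_{Y_n}(s)\right]\left(f^*_{X_1}(s) \cdots f^*_{X_n}(s)\, g^*_{Y_1}(s)\, g^*_{Y_2}(s) \cdots g^*_{Y_{n-1}}(s)\right)\right\} \\
&= \frac{1}{s}\left\{1 - \sum_{n=1}^{\infty} \left(\frac{1}{g^*_{Y_n}(s)} - 1\right) h^*_{Q_n}(s)\right\}
\end{aligned} \tag{11.39}$$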
This second form of the availability function is called the downtime-based form.
As stated, the two forms are equal but both the uptime- and the downtime-based
functions include infinite sums. As it is usually impossible to obtain closed-form
expressions for the inverse transforms of the availability expressions, the trans-
forms are inverted numerically with truncation of the infinite sums. The result-
ing solutions are approximate and because the two expressions are based on
complementary concepts, they form bounds on the actual availability.
Truncating the infinite sum in the transform of the uptime-based function
after “c” terms yields
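$$A_l^{*(c)}(s) = \frac{1}{s}\left(\left(1 - f^*_{X_1}(s)\right) + \sum_{n=1}^{c} \left(1 - f^*_{X_{n+1}}(s)\right) h^*_{Q_n}(s)\right) \tag{11.40}$$
and truncating the sum in the transform of the downtime-based function in the same way yields
$$A_u^{*(c)}(s) = \frac{1}{s}\left\{1 - \sum_{n=1}^{c} \left(\frac{1}{g^*_{Y_n}(s)} - 1\right) h^*_{Q_n}(s)\right\}$$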
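$$A^*(s) = \frac{1}{\lambda + s} + \sum_{n=1}^{c} \left(\frac{(\alpha\mu\lambda)^n}{\lambda + \alpha^n s}\right) \prod_{j=1}^{n} \frac{1}{(\lambda + \alpha^{j-1}s)(\mu + \beta^{j-1}s)} \tag{11.42}$$
$$A^*(s) = \frac{1}{s} - \frac{1}{\mu} \sum_{n=1}^{c} \left(\beta^{n-1}(\mu\lambda)^n\right) \prod_{j=1}^{n} \frac{1}{(\lambda + \alpha^{j-1}s)(\mu + \beta^{j-1}s)} \tag{11.43}$$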
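The bracketing behavior of the two truncated transforms can be checked at a single real value of s. The sketch below evaluates Equations 11.42 and 11.43 and, as a sanity check under the assumption α = β = 1 (an ordinary renewal process), compares them with the exact transform (s + μ)/(s(s + λ + μ)). The rates λ = 0.01 and μ = 0.05 and the evaluation point s = 0.1 are illustrative choices.

```python
def uptime_transform(s, lam, mu, alpha, beta, c):
    """Truncated uptime-based transform, Equation 11.42."""
    total = 1.0 / (lam + s)
    prod = 1.0
    for n in range(1, c + 1):
        # Running product over j = 1..n of 1 / ((lam + alpha^(j-1) s)(mu + beta^(j-1) s)).
        prod /= (lam + alpha ** (n - 1) * s) * (mu + beta ** (n - 1) * s)
        total += (alpha * mu * lam) ** n / (lam + alpha ** n * s) * prod
    return total

def downtime_transform(s, lam, mu, alpha, beta, c):
    """Truncated downtime-based transform, Equation 11.43."""
    total = 1.0 / s
    prod = 1.0
    for n in range(1, c + 1):
        prod /= (lam + alpha ** (n - 1) * s) * (mu + beta ** (n - 1) * s)
        total -= beta ** (n - 1) * (mu * lam) ** n / mu * prod
    return total

s, lam, mu = 0.1, 0.01, 0.05
exact = (s + mu) / (s * (s + lam + mu))   # exact transform when alpha = beta = 1
for c in (1, 2, 5, 30):
    lower = uptime_transform(s, lam, mu, 1.0, 1.0, c)
    upper = downtime_transform(s, lam, mu, 1.0, 1.0, c)
    print(f"c = {c:2d}  lower = {lower:.8f}  exact = {exact:.8f}  upper = {upper:.8f}")
```

In this special case the uptime form approaches the exact value from below and the downtime form from above as c grows.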
If the repair intervals are i.i.d. rather than quasi-renewal, these two equa-
tions reduce to
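$$A^*(s) = \frac{1}{\lambda + s} + \sum_{n=1}^{c} \left(\frac{(\alpha\mu\lambda)^n}{(\lambda + \alpha^n s)(\mu + s)^n}\right) \prod_{j=1}^{n} \frac{1}{(\lambda + \alpha^{j-1}s)} \tag{11.44}$$
and
$$A^*(s) = \frac{1}{s} - \frac{1}{\mu} \sum_{n=1}^{c} \left(\frac{\mu\lambda}{\mu + s}\right)^n \prod_{j=1}^{n} \frac{1}{(\lambda + \alpha^{j-1}s)} \tag{11.45}$$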
Rehmert and Nachlas actually obtain closed-form expressions for the avail-
ability measures for the case in which operating periods and repair inter-
vals are normally distributed. The analysis of the functions using various
other distributions yields a common structure for the availability measures.
They show that the contributions of successive terms in the summations
behave as shown in Figure 11.5a and b for the uptime and downtime models,
respectively.
They show further that the approximations may be made arbitrarily close
by the choice of c. This is illustrated for an exponential example using c = 30
in Figure 11.6. Note that as time is increased, the number of terms needed
to keep the approximations close together increases. By selecting the value
of c, one may attain any preferred level of accuracy. In addition, analysis of
a large number of terms in the downtime expression can be used on its own
to obtain a tight upper bound on availability, and this may be adequate for
device management decisions. In both of these analyses, there will be a bal-
ance between computational effort and accuracy.
FIGURE 11.5
(a) Contributions of the individual terms in the lower bound on point availability. (b) Contributions of the individual terms in the upper bound on point availability. (Vertical axes: A(t) contribution.)
FIGURE 11.6
Curves of A H(t) for the exponential case for values of c = {1, 2, …, 13} (λT = 0.01 and λR = 0.05).
FIGURE 11.7
Two-state transition diagram. (States X = 1 and X = 0; transition intensities λ(t) and μ(t).)
state (state 1). Under this format, the device operation is represented as a
continuous time Markov process with the Chapman–Kolmogorov forward
differential equations [80] being
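$$\begin{aligned}
\frac{d}{dt}p_{1,1}(t) &= -\lambda(t)\, p_{1,1}(t) + \mu(t)\, p_{1,2}(t) \\
\frac{d}{dt}p_{1,2}(t) &= -\mu(t)\, p_{1,2}(t) + \lambda(t)\, p_{1,1}(t) \\
\frac{d}{dt}p_{2,1}(t) &= -\lambda(t)\, p_{2,1}(t) + \mu(t)\, p_{2,2}(t) \\
\frac{d}{dt}p_{2,2}(t) &= -\mu(t)\, p_{2,2}(t) + \lambda(t)\, p_{2,1}(t)
\end{aligned} \tag{11.46}$$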
where pi,j(t) is the probability that the process passes from state i to state j
during an interval of length t. For the case in which the intensity functions
are constant, the fact that p1,1(t) + p1,2(t) = 1 implies that the first of
Equations 11.46 becomes
$$\frac{d}{dt}p_{1,1}(t) = -(\lambda + \mu)\, p_{1,1}(t) + \mu$$
so
$$p^*_{1,1}(s) = \frac{s + \mu}{s(s + \lambda + \mu)}$$
which is the form given in Section 11.2.1. Clearly, the result may be con-
structed using either approach.
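The agreement between the two approaches can be illustrated numerically. This sketch integrates the constant-intensity differential equation with simple forward Euler steps and compares the result with the closed-form availability obtained from the renewal analysis; the rates λ = 0.01 and μ = 0.1, the step size, and the function names are illustrative assumptions.

```python
import math

def p11_closed_form(t, lam, mu):
    """Inverse transform of (s + mu) / (s (s + lam + mu))."""
    return mu / (lam + mu) + lam / (lam + mu) * math.exp(-(lam + mu) * t)

def p11_euler(t_end, lam, mu, dt=0.001):
    """Forward Euler integration of dp/dt = -(lam + mu) p + mu with p(0) = 1."""
    p = 1.0
    for _ in range(int(round(t_end / dt))):
        p += dt * (-(lam + mu) * p + mu)
    return p

lam, mu = 0.01, 0.1
for t in (5.0, 20.0, 60.0):
    print(t, round(p11_euler(t, lam, mu), 5), round(p11_closed_form(t, lam, mu), 5))
```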
For some of the more complicated cases, the use of the Markov process
model is easier than the direct analysis of the renewal process model. As a
case in point, consider a repairable parallel system of two components.
Assume that two identical components are arranged in a parallel configu-
ration and that the system is operated so that one of the components is
functioning and has constant failure hazard λ, while the second component
is in reserve and is therefore subject to a constant but reduced failure haz-
ard, say, λr. Suppose that whenever a component fails, repair is undertaken
immediately. The repair time is exponential with intensity μ, and if the
repair is completed before a second component failure, the system contin-
ues to function with the repaired component in the reserve role. System
failure occurs when a component failure precedes the completion of a com-
ponent repair. This seemingly simple system is actually quite difficult to
analyze. The state transition diagram for the representative Markov pro-
cess is shown in Figure 11.8. The state space E = {0, 1, 2} represents the
number of failed components.
FIGURE 11.8
State transition diagram for the repairable parallel system. (Failure transition rates λ + λr and λ; repair transition rates μ and μ.)
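$$\begin{aligned}
\frac{d}{dt}p_{0,0}(t) &= -(\lambda + \lambda_r)\, p_{0,0}(t) + \mu\, p_{0,1}(t) \\
\frac{d}{dt}p_{0,1}(t) &= -(\lambda + \mu)\, p_{0,1}(t) + \mu\, p_{0,2}(t) + (\lambda + \lambda_r)\, p_{0,0}(t) \\
\frac{d}{dt}p_{0,2}(t) &= -\mu\, p_{0,2}(t) + \lambda\, p_{0,1}(t) \\
\frac{d}{dt}p_{1,0}(t) &= -(\lambda + \lambda_r)\, p_{1,0}(t) + \mu\, p_{1,1}(t) \\
\frac{d}{dt}p_{1,1}(t) &= -(\lambda + \mu)\, p_{1,1}(t) + \mu\, p_{1,2}(t) + (\lambda + \lambda_r)\, p_{1,0}(t) \\
\frac{d}{dt}p_{1,2}(t) &= -\mu\, p_{1,2}(t) + \lambda\, p_{1,1}(t) \\
\frac{d}{dt}p_{2,0}(t) &= -(\lambda + \lambda_r)\, p_{2,0}(t) + \mu\, p_{2,1}(t) \\
\frac{d}{dt}p_{2,1}(t) &= -(\lambda + \mu)\, p_{2,1}(t) + \mu\, p_{2,2}(t) + (\lambda + \lambda_r)\, p_{2,0}(t) \\
\frac{d}{dt}p_{2,2}(t) &= -\mu\, p_{2,2}(t) + \lambda\, p_{2,1}(t)
\end{aligned} \tag{11.47}$$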
The solution of these two equations requires considerable effort but finally
one finds the solutions to be
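$$p_{0,0}(t) = \frac{\mu^2 + \left\{ \dfrac{(\lambda + \lambda_r)(\lambda + \mu)}{2} \left( e^{\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} + e^{-\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} \right) + \dfrac{(\lambda + \lambda_r)(2\lambda\mu - \lambda_r\mu - \lambda\lambda_r)}{2\sqrt{4\lambda\mu + \lambda_r^2}} \left( e^{\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} - e^{-\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} \right) \right\} e^{-\frac{t}{2}(2\lambda + 2\mu + \lambda_r)}}{(\lambda + \lambda_r)(\lambda + \mu) + \mu^2}$$
$$p_{0,1}(t) = \frac{(\lambda + \lambda_r)\mu + \left\{ -\dfrac{(\lambda + \lambda_r)\mu}{2} \left( e^{\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} + e^{-\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} \right) + \dfrac{(\lambda + \lambda_r)(2\lambda^2 + 2\lambda\lambda_r + \lambda_r\mu)}{2\sqrt{4\lambda\mu + \lambda_r^2}} \left( e^{\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} - e^{-\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} \right) \right\} e^{-\frac{t}{2}(2\lambda + 2\mu + \lambda_r)}}{(\lambda + \lambda_r)(\lambda + \mu) + \mu^2}$$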
Assuming the system starts with both components being new, the sum of
these two probabilities is the availability function:
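$$A(t) = \frac{\mu(\lambda + \lambda_r + \mu) + \lambda(\lambda + \lambda_r) \left\{ \dfrac{1}{2} \left( e^{\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} + e^{-\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} \right) + \dfrac{2\lambda + 2\mu + \lambda_r}{2\sqrt{4\lambda\mu + \lambda_r^2}} \left( e^{\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} - e^{-\frac{t}{2}\sqrt{4\lambda\mu + \lambda_r^2}} \right) \right\} e^{-\frac{t}{2}(2\lambda + 2\mu + \lambda_r)}}{(\lambda + \lambda_r)(\lambda + \mu) + \mu^2}$$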
and the complement is the solution for p0,2(t). Clearly, the limiting availability is
$$A_\infty = \frac{\mu(\lambda + \lambda_r + \mu)}{(\lambda + \lambda_r)(\lambda + \mu) + \mu^2} \tag{11.50}$$
Note that if we set λr = 0, the model and its results represent the standby
redundant case in which the second component is not activated until the first
one fails. For that case, the limiting availability reduces to
$$A_\infty = \frac{\mu(\lambda + \mu)}{\lambda(\lambda + \mu) + \mu^2}$$
$$A_\infty = \frac{\mu(2\lambda + \mu)}{2\lambda(\lambda + \mu) + \mu^2}$$
For both of these special cases, the corresponding forms for the point avail-
ability function apply.
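The limiting availability of Equation 11.50 can be cross-checked by integrating the state equations forward in time. The sketch below uses forward Euler steps on the three probabilities for a system that starts with both components new; the rates λ = 0.01, λr = 0.002, and μ = 0.1, the step size, and the function names are hypothetical choices made for illustration.

```python
def limiting_availability(lam, lam_r, mu):
    """Equation 11.50 for the repairable two-component parallel system."""
    return mu * (lam + lam_r + mu) / ((lam + lam_r) * (lam + mu) + mu ** 2)

def availability_by_integration(t_end, lam, lam_r, mu, dt=0.01):
    """Forward Euler integration of the state equations starting in state 0."""
    p0, p1, p2 = 1.0, 0.0, 0.0      # number of failed components: 0, 1, 2
    for _ in range(int(round(t_end / dt))):
        d0 = -(lam + lam_r) * p0 + mu * p1
        d1 = (lam + lam_r) * p0 - (lam + mu) * p1 + mu * p2
        d2 = lam * p1 - mu * p2
        p0, p1, p2 = p0 + dt * d0, p1 + dt * d1, p2 + dt * d2
    return p0 + p1                  # system is up with 0 or 1 failed components

lam, lam_r, mu = 0.01, 0.002, 0.1
print(round(availability_by_integration(400.0, lam, lam_r, mu), 6))
print(round(limiting_availability(lam, lam_r, mu), 6))
```

Calling `limiting_availability` with `lam_r = 0` or `lam_r = lam` gives the two special cases shown above.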
The Markov process models are used widely to represent system behavior.
It is appropriate to emphasize that the earlier case is the simplest one of its
class. A more general case is to assume a k out of n structure for which com-
ponent failures are repaired as they occur by a single repair person. System
failure occurs when the number of functioning components is reduced by
k + 1 to n − k − 1. The problem may again be represented as a Markov pro-
cess. In fact, since only adjacent states are accessible in single transitions, it
is a birth–death process [80]. This makes analysis possible. However, it is
exceedingly complicated to construct the time-dependent transition func-
tions. Birolini [79] gives the expected first passage time to the system down
state and the limiting availability. To state his solution, let Ei represent the
state that i components are failed and let δi denote the failure intensity for
state Ei. Then,
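$$\delta_i = k\lambda + (n - k - 1)\lambda_r$$
$$E[T] = \frac{\delta_0 + \delta_1 + \mu}{\delta_0 \delta_1}$$
and
$$A_\infty = \frac{\mu(\delta_0 + \mu)}{\delta_0 \delta_1 + \mu\delta_0 + \mu^2}$$
and
$$A_\infty = \frac{\mu(\delta_0 \delta_1 + \delta_0 \mu + \mu^2)}{\delta_0 \delta_1 \delta_2 + \mu\delta_0 \delta_1 + \delta_0 \mu^2 + \mu^3}$$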
Finally, Birolini provides the general solution for the limiting availability for
any value of n − k:
$$A_\infty = \frac{1 + \sum_{i=1}^{n-k} \left(\prod_{j=0}^{i-1} \delta_j\right) / \mu^i}{1 + \sum_{i=1}^{n-k+1} \left(\prod_{j=0}^{i-1} \delta_j\right) / \mu^i} \tag{11.51}$$
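Equation 11.51 is straightforward to evaluate once the state failure intensities are known. The sketch below takes the intensities δ0, …, δ(n−k) as a list and checks the general expression against the two special cases given above; the numerical intensities and the function name are hypothetical.

```python
def limiting_availability_k_out_of_n(deltas, mu):
    """Equation 11.51; deltas = [delta_0, ..., delta_(n-k)], mu = repair rate."""
    terms = []
    prod = 1.0
    for d in deltas:
        prod *= d / mu              # (delta_0 ... delta_(i-1)) / mu^i for i = 1, 2, ...
        terms.append(prod)
    numerator = 1.0 + sum(terms[:-1])    # sum over i = 1 .. n-k
    denominator = 1.0 + sum(terms)       # sum over i = 1 .. n-k+1
    return numerator / denominator

mu = 0.1
d0, d1, d2 = 0.03, 0.02, 0.01           # hypothetical failure intensities
print(round(limiting_availability_k_out_of_n([d0, d1], mu), 5))
print(round(limiting_availability_k_out_of_n([d0, d1, d2], mu), 5))
```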
Clearly, more complicated system behavior or even less regular life and
repair time distributions may make the analysis of system behavior quite
intricate. For the more complicated systems, the general results defined in
Section 11.3 provide one approach and a simulation of the general Markov
process is often a worthwhile approach.
Exercises
11.1 Assume a device has an exponential life distribution with parameter
λf = 0.005 and an exponential repair time distribution with parameter
λr = 0.08. Plot the point availability function for the device and plot also
the Laplace transform of the availability function, A* (s).
11.2 For the device of the previous problem, compute the limiting availabil-
ity and also the average availability over the interval (40, 80) hours.
11.3 Reconstruct the numerical results shown in Table 11.1 for the Weibull
life distribution and exponential repair time distribution for 50 hours
and for 100 hours.
11.4 Assume the bridge structure is comprised of 5 identical copies of the
device described in Problem 11.1. Plot the point availability function
for the system.
11.5 Suppose a series system is comprised of four components, each of
which has an exponential life distribution and an exponential repair
time distribution. Assume the parameters for these distributions are
λ1f = 0.001, λ1r = 0.02, λ2f = 0.003, λ2r = 0.07, λ3f = 0.004, λ3r = 0.03, λ4f = 0.002,
and λ4r = 0.08. Plot the point availability function for the system and
compute the limiting system availability.
11.6 Consider the following 6-component system that was treated in an
earlier chapter. Recall that components 1 and 6 are identical and com-
ponents 2 and 3 are identical.
Suppose the life distributions for components 1 and 6 are Weibull with
β = 2.25 and θ = 750.0 and the other components have exponential life
distributions with λ2 = λ3 = 0.0016, λ4 = 0.0025, and λ5 = 0.001. Assume
further that the repair times for the components are exponential with
μ1 = 0.05, μ2 = 0.08, μ3 = 0.08, μ4 = 0.06, μ5 = 0.02, and μ6 = 0.05. Use the
min paths and min cuts to compute bounds on the system availability
at 50 hours and to compute bounds on the limiting system availability.
11.7 Suppose a device has a Weibull life distribution with parameters β = 2.0
and θ = 200 hours and is subject to imperfect repair with p(t) = p = 0.25.
The duration of perfect repair has an exponential distribution with
parameter λp = 0.08 and the duration of minimal repair has an expo-
nential distribution with parameter λr = 0.02. Compute the limiting
availability for the device.
11.8 Construct an algebraic statement of the uptime and the downtime
availability functions for the quasi-renewal model for the case in
which both failure times and repair times have normal distributions
and both failure and repair processes are quasi-renewal.
11.9 Suppose that for the previous problem, μo = 250, σo = 60 and μr = 150,
σr = 25 and that α = 0.975, β = 1.01. Compute the values of the point
availability functions for truncation values of c = 2, 3, and 4 for times
of 200, 400, 900, and 1200 hours. Comment on the comparison of the
uptime- and downtime-based values. Then, plot the uptime-based
availability function and comment on its behavior.
12
Preventive Maintenance
Na(t, τa) be the number of device failures during an interval (0, t) when the
device is operated under an age replacement policy with policy age τa
Nb(t, τb) be the number of device failures during an interval (0, t) when
the device is operated under a block replacement policy with policy
time τb
N(t) be the number of device failures during an interval (0, t) when the
device is operated to failure with no PM
Ña(t, τa) be the number of device replacements (failures and PM) during
an interval (0, t) when the device is operated under an age replace-
ment policy with policy age τa
Ñb(t, τb) be the number of device replacements (failures and PM) during
an interval (0, t) when the device is operated under a block replace-
ment policy with policy time τb
Definition 12.1
$$N(t) \overset{st}{\ge} N_a(t, \tau_a) \tag{12.2}$$
1. For all t ≥ 0 and τb ≥ 0, $N(t) \overset{st}{\ge} N_b(t, \tau_b)$ if and only if FT(t) is NBU.
2. For all t ≥ 0 and τa ≥ 0, Na(t, τa) is stochastically increasing in τa if and
only if FT(t) is NBU.
3. For all t ≥ 0 and τb ≥ 0, Nb(t, τb) is stochastically increasing in τb if and
only if FT(t) is NBU.
With regard to the comparisons of the policies, the relationships are quite inter-
esting. Specifically, for all t ≥ 0 and for any value assigned to the policy param-
eters, τ
$$\tilde{N}_a(t, \tau) \overset{st}{\le} \tilde{N}_b(t, \tau)$$
$$N_a(t, \tau) \overset{st}{\ge} N_b(t, \tau) \tag{12.3}$$
where the first of these relations holds in general and the second applies
when the life distribution is increasing failure rate (IFR). Basically, these two
inequalities state that block replacement tends to yield more device remov-
als, while age replacement tends to yield more device failures. This seems to
conform to intuition as one has the sense that the block replacement policy
involves removal of relatively young copies of the device.
Now, the cost models for these two basic PM strategies are usually for-
mulated without considering the durations of the maintenance tasks. It is
considered that the costs adequately capture the implications of failure and
of planned replacement so the total cost per unit time is an informative
measure of device performance. It is also considered that the use of PM
will reduce the frequency of “field failures” (failures while in operation)
and presumably, this implies a cost savings. Start with the block replace-
ment policy.
Suppose it is possible to identify the costs of a planned replacement and of
field failures and these quantities are represented by c1 and c2, respectively.
Then, a model for the total cost per unit time associated with a block replace-
ment PM strategy is
$$E[\text{Cost} \mid \tau_b] = \frac{c_1 + c_2 M_{F_T}(\tau_b)}{\tau_b} \tag{12.4}$$
The interpretation of this model is that there will be one planned replace-
ment per period at a cost of c1, and the expected number of failures with
corrective replacements per period is given by the renewal function. Each
corrective replacement has a cost of c2. Dividing by the length of the period
gives the expected cost per unit time.
Once the cost model is defined, we can use it to determine an optimal
choice of the policy time, τb, by conventional optimization methods. Taking
the derivative,
$$\frac{d}{d\tau_b} E[\text{Cost} \mid \tau_b] = \frac{c_2 \tau_b \dfrac{d}{d\tau_b} M_{F_T}(\tau_b) - c_1 - c_2 M_{F_T}(\tau_b)}{\tau_b^2}$$
and equating it to zero, we find that the optimal choice of the policy time is
the value for which
$$\tau_b \frac{d}{d\tau_b} M_{F_T}(\tau_b) - M_{F_T}(\tau_b) = \frac{c_1}{c_2} \tag{12.5}$$
Intuitively, it is appealing that the policy time should depend upon the ratio
of preventive to corrective replacement costs. Before analyzing the derivative
equation further, observe that the second derivative condition becomes
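$$\frac{d^2}{d\tau_b^2} E[\text{Cost} \mid \tau_b] = \frac{c_2 \tau_b^3 \dfrac{d^2}{d\tau_b^2} M_{F_T}(\tau_b) - 2 c_2 \tau_b^2 \dfrac{d}{d\tau_b} M_{F_T}(\tau_b) + 2 c_1 \tau_b + 2 c_2 \tau_b M_{F_T}(\tau_b)}{\tau_b^4} = \frac{c_2}{\tau_b} \frac{d^2}{d\tau_b^2} M_{F_T}(\tau_b)$$
where the final equality holds at the value of τb that satisfies Equation 12.5.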
Since c2 and τb are positive, the sign of the second derivative is determined by
the life distribution. Recall that a distribution that is NBU has
so, the slope of the renewal density is positive. Thus, if the device life dis-
tribution is NBU, the value of τb computed using Equation 12.5 will corre-
spond to a minimum of the expected cost function. Logically, if replacement
improves reliability, it is worthwhile and if replacement does not improve
reliability, PM is not appropriate.
Consider an example. Suppose a device has a Weibull life distribution with
parameters β = 2.00 and θ = 2000 hours. Suppose further that the ratio of
preventive replacement cost to corrective replacement cost is ρ. Recall that
the derivative of the renewal function is the renewal density, so we wish to
determine the value of τb for which
$$\tau_b\, m_{F_T}(\tau_b) - M_{F_T}(\tau_b) = \frac{c_1}{c_2} = \rho$$
TABLE 12.1
Optimal Block Replacement Intervals
as a Function of Cost Ratio
ρ τb
0.05 350
0.20 730
0.40 1102
0.50 1282
0.75 1808
and we must distribute this cost over the expected cycle length, which is
$$\tau_a \bar{F}_T(\tau_a) + \int_0^{\tau_a} t f_T(t)\, dt = \int_0^{\tau_a} \bar{F}_T(t)\, dt$$
Once again, the solution has the intuitive appeal that the length of the
replacement interval depends directly upon a ratio of the replacement costs.
Consider the same example as was used for block replacement. A plot of the
cost function is shown in Figure 12.1. Solution of Equation 12.7 indicates that
the optimal replacement age is τa = 460.86 hours.
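The optimal replacement age can be located numerically. The sketch below rests on two assumptions that should be checked against the cost model: that the expected cost per unit time is (c1 F̄T(τa) + c2 FT(τa)) divided by the expected cycle length, and that the cost ratio is ρ = c1/c2 = 0.05. Under those assumptions the first-order condition is z(τ)∫F̄T − FT(τ) = ρ/(1 − ρ), which is solved here by bisection for the Weibull case β = 2, where the integral has a closed form in terms of the error function. All function names are illustrative.

```python
import math

def age_condition(tau, theta, rho):
    """First-order condition for age replacement, Weibull life with beta = 2."""
    x = tau / theta
    hazard = 2.0 * tau / theta ** 2                                # z_T(tau)
    mean_uptime = theta * math.sqrt(math.pi) / 2.0 * math.erf(x)   # integral of the survivor function
    cdf = 1.0 - math.exp(-x * x)                                   # F_T(tau)
    return hazard * mean_uptime - cdf - rho / (1.0 - rho)

def optimal_age(theta, rho, tol=1e-8):
    """Bisection on the first-order condition, which is increasing in tau here."""
    lo, hi = 1e-9, 5.0 * theta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if age_condition(mid, theta, rho) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

tau_a = optimal_age(2000.0, 0.05)
print(f"optimal replacement age = {tau_a:.2f} hours")
```

With these assumed inputs the root falls very close to the replacement age quoted above.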
The block replacement and age replacement policies and models provide
an informative starting point for our examination of PM. Once the initial
models were defined, numerous extensions and improvements were defined
and then more intricate and more efficient policies were defined. We will
examine a variety of these extensions in the pages that follow.
FIGURE 12.1
Expected cost function for the age replacement. (Vertical axis: Cost.)
FIGURE 12.2
(a) Operation to failure with repair (intervals T1 and R1). (b) Operation to preventive maintenance replacement (intervals τa and PM).
failure occurs with probability FT(τa) and the one that involves PM occurs
with probability $\bar{F}_T(\tau_a)$, so the overall renewal process is a mixture of the
two processes created by the operation/repair renewals and the operation/
PM renewals.
Now, the length of the renewal intervals for the case of operation and
repair has distribution Ha,c(t) where the distribution is the convolution of the
repair time distribution with the truncated life distribution:
$$H_{a,c}(t) = \int_0^t G_c(t-u)\, \frac{f_T(u)}{F_T(\tau_a)}\, du \tag{12.8}$$
$$f^*_{T,c}(s, \tau_a) = \int_0^{\tau_a} e^{-st} f_T(t)\, dt \tag{12.9}$$
Then, the transform for the density on the length of the interval is
$$h^*_{a,c}(s) = \frac{1}{F_T(\tau_a)}\, f^*_{T,c}(s, \tau_a)\, g_c^*(s) \tag{12.10}$$
The renewal function for the process associated with operation and repair
intervals is denoted by MHa ,c (t).
In the case of the renewal intervals that involve PM, the distribution on the
duration of the intervals is
$$H_{a,p}(t) = G_p(t - \tau_a) \tag{12.11}$$
The renewal function associated with this process is denoted by MHa ,p (t),
and the density on the duration of the renewal intervals is the mixture of the
densities for the lengths of the two types of possible intervals:
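$$h_a^*(s) = \bar{F}_T(\tau_a)\, h^*_{a,p}(s) + F_T(\tau_a)\, h^*_{a,c}(s) = \bar{F}_T(\tau_a)\, e^{-s\tau_a} g_p^*(s) + f^*_{T,c}(s, \tau_a)\, g_c^*(s) \tag{12.13}$$
$$A(t) = \begin{cases} \bar{F}_T(t) + \displaystyle\int_0^t \bar{F}_T(t-u)\, m_{H_{a,c}}(u)\, du & 0 < t \le \tau_a \\[2ex] \displaystyle\int_{t-\tau_a}^{t} \bar{F}_T(t-u)\, m_{H_a}(u)\, du & t > \tau_a \end{cases} \tag{12.14}$$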
Observe that in this expression, for times before τa, there will be no PM,
so all intervals are failure intervals, and therefore, the renewal density on
restart times is mHa ,c (t). As with the availability functions of Chapter 11, there
are two reasonable approaches to constructing the actual point availability
function for a particular choice of life and service time distributions. Direct
numerical integration using appropriate numerical approximations can be
very effective. Murdock and Nachlas [81] constructed the Laplace transform
for Equation 12.14 as
$$A^*(s) = \frac{F^*_{T,a}(s, \tau_a)}{1 - h_a^*(s)} \tag{12.15}$$
They then used numerical inversion algorithms to obtain values for the
time-dependent function for several combinations of life and service time
distributions. An example of their results for the Weibull life distribution
and two different service time distributions is shown in Figure 12.3.
FIGURE 12.3
Availability function for an age replacement policy example.
(PM + T1), (R1 + T2), …, then the failure times form a delayed renewal process.
The availability function is thus
$$A(t) = \int_0^t g_p(u)\,\bar F_T(t-u)\,du + \int_0^t \bar F_T(t-u)\,m_{H_b}(u)\,du \tag{12.16}$$
In this expression, the renewal density for the delayed renewal process is
constructed by first forming the convolutions:
$$h_{b,c}(t) = \int_0^t g_c(t-u)\,\frac{f_T(u)}{F_T(\tau_b - u)}\,du \tag{12.17}$$
$$h_{b,p}(t) = \int_0^t g_p(t-u)\,\frac{f_T(u)}{F_T(\tau_b - u)}\,du \tag{12.18}$$
where the terms in the denominators reflect the truncation of the life dis-
tribution at the end of the block replacement interval less the time already
consumed by replacement. Then, the renewal function is the convolution of
hb,p(t) and the renewal density for the ordinary renewal process based on
hb,c(t). That is,
FIGURE 12.4
Availability function for a block replacement policy example.
A point here related to the evaluation of this model is that the Laplace trans-
form for the availability function, Equation 12.16, is likely to be completely
unmanageable so that direct numerical integration appears to be the most
efficient approach to the analysis. The numerical analysis is not too difficult.
In addition, for most plausible cases, the first of the two terms of the avail-
ability function dominates the other in magnitude. Consider an example.
Suppose we have a device with a Weibull life distribution with parameters
β = 2.0 and θ = 2000 and exponential corrective and preventive replacement
times with λc = 0.005 and λp = 0.025. If we take τb = 730 hours, the availability
function will be as shown in Figure 12.4. Finally, note that Equation 12.16
applies only for t in the interval [0, τb]; that is, it is defined relative
to the length of the block replacement interval.
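To make the numerical integration concrete, the sketch below evaluates only the first term of Equation 12.16 for the example values quoted above (Weibull life with β = 2.0 and θ = 2000, exponential preventive replacement with λp = 0.025). The second term requires the renewal density of the delayed process and is not attempted here; the function names are hypothetical.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def first_term(t, beta, theta, lam_p):
    """First term of Eq. 12.16: integral over [0, t] of g_p(u) * survivor(t - u)."""
    def integrand(u):
        g_p = lam_p * math.exp(-lam_p * u)             # exponential PM time density
        surv = math.exp(-(((t - u) / theta) ** beta))  # Weibull survivor function
        return g_p * surv
    return simpson(integrand, 0.0, t)

for t in (100.0, 400.0, 730.0):
    print(t, round(first_term(t, 2.0, 2000.0, 0.025), 4))

value_at_tau_b = first_term(730.0, 2.0, 2000.0, 0.025)
```

Because the interval starts with a preventive replacement, this term is small for very small t and grows as the replacement is likely to have been completed.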
A model for this policy is rather intricate and requires considerable notation.
Let τai represent the age replacement policy age for component i and ωai
represent the opportunistic replacement policy age for component i. Necessarily,
ωai < τai. Also, let FTi(t) represent the life distribution function for component
i and GTi(t) represent the distribution on the time to replace component i.
These quantities are used to define a nested renewal process that represents
the operation of the components and thus of the system.
An illustration of the nested renewal process is shown in Figure 12.5.
Notice that the renewal points for “component 1,” points t1, t2, t4, and t5, are
“nested” within those for “component 2,” points t3 and t5. To describe oper-
ating profiles that correspond to nested renewal processes, we define two
classes of operating intervals, major intervals and minor intervals and we
further distinguish between initial and general minor intervals. A “major
interval” is a period of system operation and repair that starts with both
components being new and ends with the completion of a service period
in which both components are replaced (renewed). Thus, a major interval
is a system renewal period and a sequence of major intervals forms a
renewal process. In Figure 12.5, the interval (0, t5) is a major interval.
FIGURE 12.5
Illustration of a nested renewal process for opportunistic replacement. (Reprinted with permission
from Degbotse, A.T. and Nachlas, J.A., Use of nested renewals to model availability under
opportunistic maintenance policies, Proceedings of the Annual Reliability and Maintainability
Symposium, pp. 344–350. Copyright 2003 IEEE.)
Preventive Maintenance 259
Denote the density functions as hI(t, k) and hG(t, k). The functions HI and HG
are the key building blocks of the model of system behavior. They are con-
structed using the life and the residual life distributions. That construction
is postponed for now so that the general model structure can be defined
without treating special cases.
Since a minor interval that ends without system renewal includes k compo-
nent 1 replacements and one component 2 replacement, the density function
on the total repair time during a minor interval is
$$g_T(t,k) = \int_0^t g_1^{(k)}(x)\,g_2(t-x)\,dx \tag{12.20}$$
where $g_1^{(k)}(t)$ is the k-fold convolution of the replacement time density.
$$Q_I(t) = \int_0^t q_I(x)\,dx \tag{12.23}$$
$$Q_G(t) = \int_0^t q_G(x)\,dx \tag{12.24}$$
Corresponding relationships are defined for the minor intervals that end
with system renewal. Let
Naturally,
$$Z_I(t) = \sum_{k=0}^{\infty} H_I(t,k) + \sum_{k=0}^{\infty} V_{I1}(t,k) + \sum_{k=0}^{\infty} V_{I2}(t,k) \tag{12.25}$$
and
$$Z_G(t) = \sum_{k=0}^{\infty} H_G(t,k) + \sum_{k=0}^{\infty} V_{G1}(t,k) + \sum_{k=0}^{\infty} V_{G2}(t,k) \tag{12.26}$$
$$m_Q(t) = \sum_{n=0}^{\infty} \int_0^t q_I(x)\,q_G^{(n)}(t-x)\,dx \tag{12.27}$$
This represents the probability that a minor interval starts at any time. For
the sequence of minor intervals, the system availability is the probability
that the system is operating at any point in time and is thus given by
$$A(t) = \bar Z_I(t) + \int_0^t \bar Z_G(t-x)\,m_Q(x)\,dx \tag{12.28}$$
where $\bar Z_I(t) = 1 - Z_I(t)$ represents the probability that the length of the operating
period in an initial minor interval exceeds t.
The system-level availability starts with the distribution function for the
lengths of the major intervals, Φ(t), and the corresponding renewal density,
mΦ(t). The distribution on the lengths of the major intervals is
$$\Phi(t) = U_I(t) + \int_0^t m_Q(x)\,U_G(t-x)\,dx \tag{12.29}$$
as either an initial or a general minor interval may be the last minor interval
in a major interval. Given this definition, the system-level availability func-
tion is
$$A_S(t) = A(t) + \int_0^t A(x)\,m_\Phi(t-x)\,dx \tag{12.30}$$
This form of the system-level availability function reflects the nesting of the
minor intervals within the major intervals.
Next, observe that we can apply the general model structure to four problem
classes and that the stepwise application over the four classes will promote
understanding of the model structure and will allow us to connect our avail-
ability results to the few others that are available.
Recall that the two components each have an age replacement policy age τai
and an opportunistic replacement age ωai. The four cases are defined by the
assumed magnitudes of those policy parameters. Specifically, we consider (1)
τa1 = τa2 = ωa1 = ωa2 = ∞; (2) τa1 = τa2 = ∞ with ωa1 and ωa2 finite; (3) τa1 = ∞ with
τa2, ωa1, and ωa2 finite; and (4) all policy ages finite. For each case, the specification
of the policy ages leads to the definition of the functions HI and HG. The
first case is a pure failure model.
and
where it should be particularly noted that $\bar{\tilde F}_1(t)$ denotes the survivor function
for the residual life distribution, $\tilde F_1(t)$. The residual life distribution is used
because the general minor intervals begin with component 1 being used.
Note further that the convolution, $f_1^{(k)}(t)$, includes a first operating period
during which component 1 is used followed by k − 1 periods in which component 1
is new at the start.
In general, it is difficult to specify the residual life distribution for a pro-
cess such as the one studied here. We use the approximation to the residual
life distribution defined by Cox [61]:
$$\tilde F_T(t) = \frac{1}{\mu_T} \int_0^t \bar F_T(x)\,dx \tag{12.35}$$
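The approximation of Equation 12.35 is straightforward to evaluate numerically. The sketch below does so with a simple Simpson rule; the function names are hypothetical and the Weibull parameters (β = 2, θ = 1.5) are chosen only for illustration. For an exponential life distribution the result should reproduce the original distribution function, which gives a convenient check.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def residual_life_cdf(t, survivor, mean_life, n=2000):
    """Eq. 12.35: (1 / mu_T) times the integral of the survivor function over [0, t]."""
    return simpson(survivor, 0.0, t, n) / mean_life

# Exponential check: the residual life distribution equals the life distribution.
rate = 0.5
exp_val = residual_life_cdf(2.0, lambda x: math.exp(-rate * x), 1.0 / rate)

# Weibull illustration (beta = 2, theta = 1.5 are assumed values).
beta, theta = 2.0, 1.5
mu = theta * math.gamma(1.0 + 1.0 / beta)
weibull_sf = lambda x: math.exp(-((x / theta) ** beta))
weib_val = residual_life_cdf(1.0, weibull_sf, mu)
print(exp_val, weib_val)
```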
$$q_I(t) = \int_0^t f_2(u)\,\bar F_1(u)\,g_2(t-u)\,du + \sum_{k=1}^{\infty} \int_0^t \left( f_2(x) \int_0^x \bar F_1(x-u)\,f_1^{(k)}(u)\,du \right) \left( \int_0^{t-x} g_1^{(k)}(w)\,g_2(t-x-w)\,dw \right) dx \tag{12.36}$$
and
$$q_G(t) = \int_0^t f_2(u)\,\bar{\tilde F}_1(u)\,g_2(t-u)\,du + \sum_{k=1}^{\infty} \int_0^t \left( f_2(x) \int_0^x \bar F_1(x-u)\,f_1^{(k)}(u)\,du \right) \left( \int_0^{t-x} g_1^{(k)}(w)\,g_2(t-x-w)\,dw \right) dx \tag{12.37}$$
These forms are substituted into Equation 12.27 to obtain the renewal den-
sity for minor intervals. The distributions on total operating time during
minor intervals are
$$Z_I(t) = \sum_{k=0}^{\infty} H_I(t,k) = \sum_{k=0}^{\infty} \int_0^t \int_0^x f_2(x)\,\bar F_1(x-u)\,f_1^{(k)}(u)\,du\,dx = F_2(t) \tag{12.38}$$
and
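$$Z_G(t) = \sum_{k=0}^{\infty} H_G(t,k) = F_2(t) \tag{12.39}$$
so that Equation 12.28 becomes
$$A(t) = \bar F_2(t) + \int_0^t \bar F_2(t-x)\,m_Q(x)\,dx \tag{12.40}$$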
As usual, depending upon the life and repair time distributions, eval-
uation of the availability function can be rather intricate. As noted
previously, Barlow and Proschan [12] present the limiting availability for
the failure model as
$$A_\infty = \left( 1 + \frac{\nu_1}{\mu_1} + \frac{\nu_2}{\mu_2} \right)^{-1} \tag{12.41}$$
where
νi is the mean of the repair time distribution Gi(t)
μi is the mean of the life distribution for component i
Using the derivatives of the Laplace transforms of Equation 12.40 yields the
same result.
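As a quick illustration of Equation 12.41, the following sketch computes the limiting availability from the mean lives and mean repair times. The numerical values are hypothetical and are not taken from the text.

```python
def limiting_availability(mu1, mu2, nu1, nu2):
    """Eq. 12.41: limiting availability of the two-component failure model.

    mu1, mu2 are mean lives; nu1, nu2 are mean repair times.
    """
    return 1.0 / (1.0 + nu1 / mu1 + nu2 / mu2)

# Hypothetical means: lives of 100 and 200 hours, repairs of 5 and 10 hours.
a_inf = limiting_availability(100.0, 200.0, 5.0, 10.0)
print(round(a_inf, 4))
```

With these values each component contributes a repair-to-life ratio of 0.05, so the result is 1/1.1.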
$$h_I(t,0) = f_2(t)\,\bar F_1(t) \tag{12.42}$$
provided t ≤ ωa1. If k ≥ 1, the cases that imply no system renewal are as follows:
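$$h_I(t,k) = \begin{cases} f_{T2}(t) \displaystyle\int_0^{t} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & 0 \le t < \omega_{a1} \\[1ex] f_{T2}(t) \displaystyle\int_{t-\omega_{a1}}^{t} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a1} \le t < \omega_{a2} \\[1ex] f_{T2}(t) \displaystyle\int_{t-\omega_{a1}}^{\omega_{a2}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a2} \le t < \omega_{a1} + \omega_{a2} \end{cases} \tag{12.43}$$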
The expressions in Equations 12.42 and 12.43 are used in Equation 12.21 to
obtain the density on the lengths of the initial minor intervals.
Next, we consider the general minor intervals that end without system
renewal. The reasoning and hence construction for the general minor
intervals are identical to that for the initial minor intervals except that com-
ponent 1 is used and is subject to a residual life distribution at the start of
a general minor interval. To reflect the age of component 1 at the start of a
general minor interval, we take the component 1 age to be the average back-
ward recurrence time based on the life distribution, FT1(t). Denote that age as
a1 and note that Cox [61] shows that age to be
$$a_1 = \frac{\mu_{T1}^2 + \sigma_{T1}^2}{2\mu_{T1}} \tag{12.44}$$
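For a Weibull life distribution the moments in Equation 12.44 have closed forms, so the average starting age is easy to compute. The sketch below uses β = 2 and θ = 1.5 as illustrative values; the function names are hypothetical.

```python
import math

def weibull_moments(beta, theta):
    """Mean and variance of a Weibull distribution with shape beta and scale theta."""
    mean = theta * math.gamma(1.0 + 1.0 / beta)
    second_moment = theta ** 2 * math.gamma(1.0 + 2.0 / beta)
    return mean, second_moment - mean ** 2

def backward_recurrence_age(mean, variance):
    """Eq. 12.44: average backward recurrence time (mu^2 + sigma^2) / (2 mu)."""
    return (mean ** 2 + variance) / (2.0 * mean)

mean, var = weibull_moments(2.0, 1.5)
a1 = backward_recurrence_age(mean, var)
print(round(mean, 4), round(var, 4), round(a1, 4))
```

Since the numerator is simply the second moment of the life length, the computation reduces to E[T²]/(2E[T]).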
Given the average starting age a1,
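$$h_G(t,k) = \begin{cases} f_{T2}(t) \displaystyle\int_0^{t} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & 0 \le t < \omega_{a1} \\[1ex] f_{T2}(t) \displaystyle\int_{t-\omega_{a1}}^{t} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a1} \le t < \omega_{a2} \\[1ex] f_{T2}(t) \displaystyle\int_{t-\omega_{a1}}^{\omega_{a2}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a2} \le t < \omega_{a1} + \omega_{a2} \end{cases} \tag{12.46}$$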
Again, these expressions are combined with the total repair time to deter-
mine the density on the general minor interval length.
Observe that the conditions under which renewal does not occur corre-
spond to a specific set of sample paths. The set of possible sample paths that
is the complement of the set described earlier defines the cases in which sys-
tem renewal does occur. Considering initial minor intervals, we note that
system renewal will occur as a result of a component 2 replacement if k = 0
and component 2 fails at time t where ωa1 ≤ t. That is,
$$v_{I2}(t,0) = f_{T2}(t)\,\bar F_{T1}(t), \qquad \omega_{a1} \le t \tag{12.47}$$
and a component 2 failure will precipitate system renewal for the sample
paths having a component 2 failure time of t where
1. ωa1 ≤ t < ωa2 and the kth component 1 failure time occurred at x where
0 ≤ x ≤ t − ωa1
2. ωa2 ≤ t < ωa1 + ωa2, the kth component 1 failure occurred at x where
0 ≤ x ≤ t − ωa1, and it is also the case that x < ωa2
3. ωa1 + ωa2 < t and the kth component 1 failure occurred at x where x < ωa2
while it will be the kth component 1 failure that precipitates the system
renewal at time t if that failure occurs at time t where ωa2 ≤ t < ωa1 + ωa2 and
the k − 1st failure occurred at time x where x < ωa2 or if that failure occurs at
time t > ωa1 + ωa2 and the k − 1st failure occurred at time x < ωa2. These cases
exhaust the complement of the set of sample paths that do not yield system
renewal at the end of an initial minor interval. Algebraically,
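$$v_{I2}(t,k) = \begin{cases} f_{T2}(t) \displaystyle\int_0^{t-\omega_{a1}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a1} \le t < \omega_{a2} \\[1ex] f_{T2}(t) \displaystyle\int_0^{t-\omega_{a1}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a2} \le t < \omega_{a1} + \omega_{a2} \\[1ex] f_{T2}(t) \displaystyle\int_0^{\omega_{a2}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a1} + \omega_{a2} \le t \end{cases} \tag{12.48}$$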
and
$$v_{I1}(t,k) = \bar F_{T2}(t) \int_0^{\omega_{a2}} f_{T1}(t-u)\,f_{T1}^{(k-1)}(u)\,du, \qquad \omega_{a2} \le t \tag{12.49}$$
To combine the operating and repair times, we observe that when a compo-
nent 2 failure precipitates system renewal, the accumulated repair time is
gT(t, k + 1) and in contrast, when a component 1 failure precipitates system
renewal, the accumulated repair time is gT(t, k). Thus, the density function
for the duration of an initial minor interval that ends with system renewal is
$$u_I(t) = \sum_{k=0}^{\infty} \int_0^t v_{I1}(x,k)\,g_T(t-x,k)\,dx + \sum_{k=0}^{\infty} \int_0^t v_{I2}(x,k)\,g_T(t-x,k+1)\,dx \tag{12.50}$$
For the general minor intervals that end with system renewal, we again con-
sider the sample paths that form the complement of the set of paths that do
not yield renewal so,
and
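$$v_{G2}(t,k) = \begin{cases} f_{T2}(t) \displaystyle\int_0^{t-\omega_{a1}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a1} \le t < \omega_{a1} + \omega_{a2} \\[1ex] f_{T2}(t) \displaystyle\int_0^{\omega_{a2}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a1} + \omega_{a2} \le t \end{cases} \tag{12.52}$$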
and
$$v_{G1}(t,k) = \bar F_{T2}(t) \int_0^{\omega_{a2}} f_{T1}(t-u)\,f_{T1}^{(k-1)}(u)\,du, \qquad \omega_{a2} \le t \tag{12.53}$$
Then, the density on the length of the general minor intervals that end with
system renewal is
$$u_G(t) = \sum_{k=0}^{\infty} \int_0^t v_{G1}(x,k)\,g_T(t-x,k)\,dx + \sum_{k=0}^{\infty} \int_0^t v_{G2}(x,k)\,g_T(t-x,k+1)\,dx \tag{12.54}$$
Having enumerated and represented all of the sample paths for the minor
intervals, the system availability is determined by the successive application
of Equations 12.25 through 12.30. That is, the specification of any particular
realization of the general model structure hinges upon the careful construc-
tion of the quantities qI(t) and qG(t) using hI(t, k) and hG(t, k) and the quantities
uI(t) and uG(t) using vI1(t, k), vI2(t, k), vG1(t, k), and vG2(t, k). Once these prob-
abilities have been obtained, Equations 12.25 through 12.30 accumulate their
content to provide the system availability measure.
To complete this model, we observe that there is a finite probability of
system renewal when the system is subjected to an opportunistic failure
replacement policy and this is confirmed for the stated expressions in that
QI(∞) < 1 and QG(∞) < 1. Also, one can obtain a partial validation of the
stated expressions by noting that taking ωa1 = ωa2 = ∞ yields the equations for
the failure replacement model. Finally, note that it is possible to obtain the
limiting system availability by taking derivatives of the pertinent Laplace
transforms. Doing this yields
$$A_S(\infty) = \frac{\mu_{ZI}\,(1 - Q_G(\infty)) + \mu_{ZG}\,Q_I(\infty)}{(\mu_{UI} + \mu_{QI} + (\nu_1 + \nu_2))(1 - Q_G(\infty)) + (\mu_{UG} + \mu_{QG})\,Q_I(\infty)} \tag{12.55}$$
where the mean values are identified by their subscripts that correspond to
the distributions to which they apply. Note that this result conforms to that
of Barlow and Proschan [12].
(if there is one) and let x denote the time of the kth component 1 failure (if there
is one). The sample paths for which there is no system renewal are
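$$S_{k=0} = \{(t \mid 0 \le t < \omega_{a1})\} \quad \text{and} \quad S_{k \ge 1} = \left\{ t, x \;\middle|\; \left\{ \begin{array}{l} 0 \le t < \omega_{a1},\ 0 \le x < t \\ \omega_{a1} \le t < \omega_{a2},\ t - \omega_{a1} \le x < t \\ \omega_{a2} \le t < \tau_{a2},\ t - \omega_{a1} \le x < \omega_{a2} \\ t = \tau_{a2},\ \tau_{a2} - \omega_{a1} \le x < \omega_{a2} \end{array} \right\} \right\}$$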
These sets imply that
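$$h_I(t,k) = \begin{cases} f_{T2}(t) \displaystyle\int_0^{t} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & 0 \le t < \omega_{a1} \\[1ex] f_{T2}(t) \displaystyle\int_{t-\omega_{a1}}^{t} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a1} \le t < \omega_{a2} \\[1ex] f_{T2}(t) \displaystyle\int_{t-\omega_{a1}}^{\omega_{a2}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a2} \le t < \tau_{a2} \\[1ex] \bar F_{T2}(t) \displaystyle\int_{\tau_{a2}-\omega_{a1}}^{\omega_{a2}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & t = \tau_{a2} \end{cases} \tag{12.57}$$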
The same sample paths apply to the general minor interval except that the
initial age must be considered when k = 0. Thus,
$$h_G(t,0) = f_{T2}(t)\,\bar F_{T1}(t), \qquad 0 \le t < \omega_{a1} - a_1 \tag{12.58}$$
and
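$$h_G(t,k) = \begin{cases} f_{T2}(t) \displaystyle\int_0^{t} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & 0 \le t < \omega_{a1} \\[1ex] f_{T2}(t) \displaystyle\int_{t-\omega_{a1}}^{t} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a1} \le t < \omega_{a2} \\[1ex] f_{T2}(t) \displaystyle\int_{t-\omega_{a1}}^{\omega_{a2}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a2} \le t < \tau_{a2} \\[1ex] \bar F_{T2}(t) \displaystyle\int_{\tau_{a2}-\omega_{a1}}^{\omega_{a2}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & t = \tau_{a2} \end{cases} \tag{12.59}$$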
In order to identify the probability functions for the system renewal cases,
we take the complements of the sets Sk = 0 and Sk ≥ 1. Clearly,
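$$\bar S_{k=0} = \{(t \mid \omega_{a1} \le t)\} \quad \text{and} \quad \bar S_{k \ge 1} = \left\{ t, x \;\middle|\; \left\{ \begin{array}{l} \omega_{a1} \le t < \omega_{a2},\ x \le t - \omega_{a1} \\ \omega_{a2} \le t < \tau_{a2},\ x \le t - \omega_{a1} \\ \omega_{a2} \le t < \tau_{a2},\ x \ge \omega_{a2} \\ t = \tau_{a2},\ x \le \tau_{a2} - \omega_{a1} \end{array} \right\} \right\}$$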
so,
$$v_{I2}(t,0) = f_{T2}(t)\,\bar F_{T1}(t), \qquad \omega_{a1} \le t \tag{12.60}$$
and
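$$v_{I2}(t,k) = \begin{cases} f_{T2}(t) \displaystyle\int_0^{t-\omega_{a1}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a1} \le t < \tau_{a2} \\[1ex] \bar F_{T2}(t) \displaystyle\int_0^{\tau_{a2}-\omega_{a1}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & t = \tau_{a2} \end{cases} \tag{12.61}$$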
and
$$v_{I1}(t,k) = \bar F_{T2}(t) \int_0^{\omega_{a2}} f_{T1}(t-u)\,f_{T1}^{(k-1)}(u)\,du, \qquad \omega_{a2} \le t < \tau_{a2} \tag{12.62}$$
As with the nonrenewal case, the general minor intervals are similar to the
initial minor intervals, so,
and
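$$v_{G2}(t,k) = \begin{cases} f_{T2}(t) \displaystyle\int_0^{t-\omega_{a1}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & \omega_{a1} \le t < \tau_{a2} \\[1ex] \bar F_{T2}(t) \displaystyle\int_0^{\tau_{a2}-\omega_{a1}} \bar F_{T1}(t-u)\,f_{T1}^{(k)}(u)\,du, & t = \tau_{a2} \end{cases} \tag{12.64}$$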
and
$$v_{G1}(t,k) = \bar F_{T2}(t) \int_0^{\omega_{a2}} f_{T1}(t-u)\,f_{T1}^{(k-1)}(u)\,du, \qquad \omega_{a2} \le t < \tau_{a2} \tag{12.65}$$
Once the basic densities on interval duration have been defined relative to
the cases that do and do not yield system renewal, the calculation of avail-
ability measures follows the previously defined format.
The densities uI(t) and uG(t) are computed as given in Equations 12.50 and
12.54, respectively. These are then used in Equations 12.25 through 12.30 to
obtain the system availability measure. Observe that QI(∞) < 1 and QG(∞) <
1, so system renewals do occur for this model and Equation 12.55 again gives
the limiting system availability. Also, setting τa2 = ∞ causes the model to
reduce to the opportunistic failure replacement model as it should.
It is not necessary to show all of the available results here but a few key
results will help to illustrate the power of the nested renewal concept. To
start, we note that
$$q_I^*(s) = \sum_{k=0}^{\infty} h_I^*(s,k)\,g_T^*(s,k) \tag{12.66}$$
and the same applies to QG(∞). Substituting into Equation 12.27 yields
$$m_Q^*(s) = \frac{q_I^*(s)}{1 - q_G^*(s)} \tag{12.68}$$
Then, using the derivatives of the Laplace transforms for Equation 12.26,
$$\mu_\Phi = -\frac{d}{ds}\Phi^*(s)\Big|_{s=0} = -\frac{d}{ds}\left[ u_I^*(s) + m_Q^*(s)\,u_G^*(s) \right]_{s=0} = \mu_{UI} + \mu_{QI} + (\mu_{UG} + \mu_{QG})\,\frac{Q_I(\infty)}{1 - Q_G(\infty)} \tag{12.69}$$
and this leads directly to the limiting availability Equation 12.55. The inter-
mediate steps are to take
and
along with the standard definition of a renewal function for $m_\Phi^*(s)$ to obtain
Note that all of these results apply regardless of the choices of life and repair
time distributions.
respectively, and that the life distributions are Weibull and are
$$F_{Tj}(t) = 1 - e^{-(t/\theta_j)^{\beta_j}}$$
For the purposes of illustration and recognizing that scale is arbitrary, take
β1 = β2 = 2.0 with θ1 = 1.5, θ2 = 1.0, λ2 = 0.4, and λ1 = 0.667. Then, take the age
replacement policy ages to be integer multiples of the mean of the life distri-
butions so that
$$\tau_{aj} = m_j\,\mu_j$$
and take the opportunistic age replacement policy ages to be fractions of the
age replacement policy ages:
$$\omega_{aj} = \gamma\,\tau_{aj}$$
Using this construction, the limiting availability of Equation 12.55 for several
cases is shown in Table 12.2.
Similarly, the time-dependent availability function can be obtained using
numerical inversion of the Laplace transforms. These functions are shown
for the failure replacement, opportunistic failure replacement, and the par-
tial opportunistic age replacement models in Figure 12.6 for the specific case
in which ωa1 = 1.50, ωa2 = 2.13, and τa2 = 2.51 and the life and replacement time
distributions are the same as that given earlier. The figure serves to illustrate
the fact that numerical results are possible. The results also verify the inter-
nal consistency of the models.
TABLE 12.2
Representative Limiting Availability Values
γ      m1 = 1, m2 = 1    m1 = 1, m2 = 2    m1 = 2, m2 = 1    m1 = 2, m2 = 2
0.2    0.47              0.51              0.48              0.54
0.4    0.49              0.52              0.64              0.66
0.6    0.54              0.53              0.62              0.65
0.8    0.63              0.53              0.55              0.61
FIGURE 12.6
Example of availability functions, A(t) versus t, for the opportunistic age (Opp. age),
opportunistic failure (Opp. fail), and failure replacement models. (Reprinted with permission
from Degbotse, A.T. and Nachlas, J.A., Use of nested renewals to model availability under
opportunistic maintenance policies, Proceedings of the Annual Reliability and Maintainability
Symposium, pp. 344–350. Copyright 2003 IEEE.)
$$C(\tau_b, N) = \frac{c_2 \displaystyle\sum_{k=1}^{N} \int_0^{\tau_b} z_k(t)\,dt + (N-1)c_1 + c_3}{N\tau_b} \tag{12.73}$$
where c3 is the cost of system replacement. Note that the assumption of even-
tual replacement was needed to obtain a tractable model. Note further that
N is a decision variable so the model analysis includes selection of optimal
values for both τb and N. In fact, Nakagawa shows that the optimal value for
τb may be obtained by differentiation to be the one for which
$$\sum_{k=1}^{N} \left( \tau_b\,z_k(\tau_b) - \int_0^{\tau_b} z_k(t)\,dt \right) = \frac{(N-1)c_1 + c_3}{c_2} \tag{12.74}$$
and the optimal value of N may be determined using the difference expressions
$$C(\tau_b, N) < C(\tau_b, N-1), \qquad C(\tau_b, N) \le C(\tau_b, N+1) \tag{12.75}$$
Clearly, the appeal of this model is the fact that the hazard function behavior
may be specified arbitrarily (as long as zk(t) < zk + 1(t)). The assumption of mini-
mal repair at failure may or may not be viewed as a drawback of the model.
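The selection of N using the difference conditions of Equation 12.75 can be sketched as a simple scan over candidate values. The hazard family used below is a hypothetical choice made only for illustration: a Weibull hazard whose scale parameter shrinks by a factor `rho` after each PM, so that the hazard increases with k as the model requires. All cost and parameter values are likewise assumed.

```python
def cumulative_hazard(k, tau, beta, theta, rho):
    """Integral of z_k over [0, tau] for an assumed Weibull hazard with scale theta * rho**(k-1)."""
    theta_k = theta * rho ** (k - 1)
    return (tau / theta_k) ** beta

def cost_rate(tau_b, n, c1, c2, c3, beta, theta, rho):
    """Eq. 12.73 with the assumed hazard family."""
    repairs = sum(cumulative_hazard(k, tau_b, beta, theta, rho) for k in range(1, n + 1))
    return (c2 * repairs + (n - 1) * c1 + c3) / (n * tau_b)

def best_n(tau_b, c1, c2, c3, beta, theta, rho, n_max=50):
    """Scan N = 1..n_max and return the first minimizer together with all cost rates."""
    costs = {n: cost_rate(tau_b, n, c1, c2, c3, beta, theta, rho) for n in range(1, n_max + 1)}
    return min(costs, key=costs.get), costs

# Hypothetical inputs: PM cost 1, minimal repair cost 5, replacement cost 20.
n_star, costs = best_n(tau_b=300.0, c1=1.0, c2=5.0, c3=20.0, beta=2.0, theta=1000.0, rho=0.8)
print(n_star, round(costs[n_star], 5))
```

For a fixed τb the scan returns the N that satisfies both inequalities of Equation 12.75; a joint optimization would repeat the scan while solving Equation 12.74 for τb.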
$$C(\tau_b, p) = \frac{c_2\,p \displaystyle\sum_{j=1}^{\infty} (1-p)^{j-1} F_T(j\tau_b) + c_1 \displaystyle\sum_{j=1}^{\infty} (1-p)^{j-1} \bar F_T(j\tau_b)}{\displaystyle\sum_{j=1}^{\infty} (1-p)^{j-1} \int_{(j-1)\tau_b}^{j\tau_b} \bar F_T(u)\,du} \tag{12.76}$$
because the probability that it is a failure that precipitates renewal is
$$\sum_{j=1}^{\infty} (1-p)^{j-1} \int_{(j-1)\tau_b}^{j\tau_b} f_T(u)\,du = p \sum_{j=1}^{\infty} (1-p)^{j-1} F_T(j\tau_b)$$
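The identity for the probability that a failure precipitates renewal can be checked numerically by truncating both sums. The sketch below does this for a Weibull life distribution using the parameter values of Exercise 12.8 and an arbitrarily chosen p = 0.3; the function names are hypothetical.

```python
import math

def weibull_cdf(t, beta, theta):
    """Weibull distribution function with shape beta and scale theta."""
    return 1.0 - math.exp(-((t / theta) ** beta)) if t > 0.0 else 0.0

def failure_renewal_lhs(p, tau_b, cdf, terms=400):
    """Left side: sum of (1-p)^(j-1) times the probability of failure in ((j-1)tau_b, j tau_b]."""
    return sum((1.0 - p) ** (j - 1) * (cdf(j * tau_b) - cdf((j - 1) * tau_b))
               for j in range(1, terms + 1))

def failure_renewal_rhs(p, tau_b, cdf, terms=400):
    """Right side: p times the sum of (1-p)^(j-1) F_T(j tau_b)."""
    return p * sum((1.0 - p) ** (j - 1) * cdf(j * tau_b) for j in range(1, terms + 1))

cdf = lambda t: weibull_cdf(t, 2.25, 2000.0)
lhs = failure_renewal_lhs(0.3, 1500.0, cdf)
rhs = failure_renewal_rhs(0.3, 1500.0, cdf)
print(lhs, rhs)
```

The two truncated sums differ only by a term of order (1 − p) raised to the number of terms, which is negligible here.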
$$C(\tau_b, p) = \frac{1}{\tau_b} \left( c_2\,p^2 \sum_{j=1}^{\infty} (1-p)^{j-1} \int_0^{j\tau_b} z_T(u)\,du + c_1 \right) \tag{12.77}$$
c0k (t) is the cost for the kth minimal repair t time units after the most
recent device replacement.
Then, recalling that E[Nt] is the expected number of events during (0, t) in a
nonhomogeneous Poisson process that has mean value
$$\int_0^t q(u)\,z_T(u)\,du$$
the cost of minimal repairs during an interval (0, t) is c0E[ Nt ]+1(t) and the result-
ing cost per unit time is
ta x
One appealing feature of this model is that it generalizes several other age
replacement imperfect repair models. Specifically, if the minimal repair cost
is a constant c0, the cost model reduces to
$$C(\tau_a, p) = \frac{(c_2 - c_0)\,F_p(\tau_a) + c_1\,\bar F_p(\tau_a) + c_0 \displaystyle\int_0^{\tau_a} \bar F_p(x)\,z_T(x)\,dx}{\displaystyle\int_0^{\tau_a} \bar F_p(x)\,dx} \tag{12.79}$$
and if the probability of minimal repair is constant rather than age depen-
dent and the minimal repair cost equals the failure replacement cost, the
model becomes
$$C(\tau_a, p) = \frac{\dfrac{c_2}{p}\,F_p(\tau_a) + c_1\,\bar F_p(\tau_a)}{\displaystyle\int_0^{\tau_a} \bar F_p(x)\,dx} \tag{12.80}$$
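$$T_k = \alpha T_{k-1} = \alpha^{k-1} T_1$$
and
$$f_{T_k}(t) = \frac{1}{\alpha^{k-1}}\,f_{T_1}\!\left( \frac{t}{\alpha^{k-1}} \right), \qquad F_{T_k}(t) = F_{T_1}\!\left( \frac{t}{\alpha^{k-1}} \right)$$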
where $F_{T_1}(t)$ is the underlying life distribution for the device. Recall further
that the quasi-renewal function, $Q_{F_T}(t)$, is defined in the same manner as a
renewal function:
$$Q_{F_T}(t) = E[N_t] = \sum_{n=1}^{\infty} F_{S_n}(t)$$
Given these definitions, a block replacement policy (perfect PM) with quasi-
renewals upon failure has cost rate:
$$C(\tau_b) = \frac{c_2\,Q_{F_T}(\tau_b) + c_1}{\tau_b} \tag{12.81}$$
This corresponds directly to the expression for the renewal case and has the
corresponding optimality condition.
If the PM that is performed at the scheduled times is imperfect and failures
are quasi-renewal, we obtain a cost rate model of
$$C(\tau_b) = \frac{c_2\,p^2 \displaystyle\sum_{j=1}^{\infty} (1-p)^{j-1}\,Q_{F_T}(j\tau_b) + c_1}{\tau_b} \tag{12.82}$$
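$$C(\tau_b, K, p) = \frac{K c_{2f} + \dfrac{K(K-1)}{2}\,c_{2v} + \dfrac{c_1}{p} + p\,c_{2f} \displaystyle\sum_{j=1}^{\infty} (1-p)^{j-1} \int_0^{j\tau_b} z_T\!\left( \frac{u}{\alpha^K} \right) du}{\dfrac{1-\alpha^K}{1-\alpha}\,E[T_1] + \dfrac{1-\beta^K}{1-\beta}\,E[R_1] + \dfrac{\tau_b}{p} + E[R_0]} \tag{12.83}$$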
In this expression, the minimal repair cost is taken to be c2f, the time to per-
form minimal repair is taken to be zero, and the time to perform a perfect
repair (renewal) is denoted by R0. For this policy and model, the correspond-
ing average availability is
$$\frac{\dfrac{1-\alpha^K}{1-\alpha}\,E[T_1] + \dfrac{\tau_b}{p}}{\dfrac{1-\alpha^K}{1-\alpha}\,E[T_1] + \dfrac{1-\beta^K}{1-\beta}\,E[R_1] + \dfrac{\tau_b}{p} + E[R_0]} \tag{12.84}$$
times within each cycle that are t ≤ Ta and t > Ta. Consequently, within each
operating interval, the life distribution is
$$F_{T_i,p}(t) = \begin{cases} \dfrac{F_{T_i}(t)}{F_{T_i}(T_a)}, & t \le T_a \\[1ex] 1, & t > T_a \end{cases} \tag{12.85}$$
If a period of device operation ends before Ta as a result of a failure, the repair
time distribution is
$$G_{R_i}(t) = G_{R_1}\!\left( \frac{t}{\beta_R^{\,i-1}} \right) \tag{12.86}$$
and if the operating intervals extend until Ta, the service time distribution is
$$G_{P_i}(t) = G_{P_1}\!\left( \frac{t}{\beta_P^{\,i-1}} \right) \tag{12.87}$$
These functional forms are folded into the construction of the convolutions
that comprise the availability models.
As in Rehmert and Nachlas [78], Vi = Ti + Ri but two realizations of this
definition are possible. For a repair interval, Vi corresponds to a repair com-
pletion, so
t
With these definitions, the length of the ith cycle is a mixture of the two pos-
sible forms. Thus,
$$H_{\cdot,V_i}(t) = F_{T_i}(T_a)\,H_{R,V_i}(t) + \bar F_{T_i}(T_a)\,H_{P,V_i}(t) \tag{12.90}$$
However, for a series of cycles that ends before Ta, all intervals must be repair
intervals, so
$$Q_{H_{\cdot,Q}}(t) = \begin{cases} \displaystyle\sum_{n=1}^{\infty} H_{R,Q_n}(t), & t \le T_a \\[1ex] \displaystyle\sum_{n=1}^{\infty} H_{\cdot,Q_n}(t), & t > T_a \end{cases} \tag{12.93}$$
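$$A(t) = \begin{cases} \bar F_{T_1}(t) + \displaystyle\sum_{n=1}^{\infty} \int_0^t h_{R,Q_n}(u)\,\bar F_{T_{n+1}}(t-u)\,du, & t \le T_a \\[1ex] \displaystyle\sum_{n=1}^{\infty} \int_{t-T_a}^{t} h_{\cdot,Q_n}(u)\,\bar F_{T_{n+1}}(t-u)\,du, & t > T_a \end{cases} \tag{12.94}$$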
Making the substitutions for the life distribution and service time distribu-
tions and taking the Laplace transform yields
ì 1é æ öù
( ) å
¥ n
ï ê 1 - fT*1 (s) +
ïï s ê
ç 1 - fT*1 (a n s)
ç
n =1 è
Õ fT*1 (a i -1s) g R*1 (biR-1s) ÷ ú
÷ú
øû
t £ Ta
A* (s) = í ë i =1
å ( )Õ( )
¥ n
ï 1
ï 1 - fT*1 (a n s) fT*1 (a i -1s) g R*1 (s) + FT1 (Ta )e - sTa g P*1 (s) t > Ta
ïî n =1 s i =1
(12.95)
The downtime-based availability function is
ì t
ò
ï1 - fT1 (u)GR1 (t - u)du
ï
ï 0
ï ¥ t u
ï
ï - åòò hR , Qn (w) fTn+1 (u - w)GR , n +1(t - u)dwdu t £ Ta
ï n = 1 0 0
ï Ta
ò
ï1 - F (T ) f (u)G (t - u)du - F (T )G (t - T )
A(t) = í T1 a T1 , P R1 T1 a P1 a (12.96)
ï 0
ï t u
ï ¥
ï -
ï n =1
å FTn+1 (Ta )
òò
0 u - Ta
h· , Qn (w) fT1 , P (u - w)GRn+1 (t - u)dwdu t > Ta
ï
ï ¥ t - Ta
ï -
ï å FTn+1 (T a )
ò h· ,Qn (u)GPn+1 (t - Ta - u)du
î n = 1 0
ì é ¥ æ ö n ù
ï1 ê1 -
å Õ
1
ç - 1÷ fT*1 (a i -1s) g R*1 (biR-1s)ú t £ Ta
ïs ê ç * n -1 ÷ úû
n =1 è g R1 (b R s) ø i =1
ï ë
ï1
( ) (
A* (s) = í é1 - fT*1 (s) 1 - g R*1 (s) - FT1 (Ta )e - sTa 1 - g P*1 (s) ù
ê
ïs ë
) ûú
ï ¥ t > Ta
ï-
ï å
î n =1
( ( ) (
h·*, Qn (s) fT*n+1 (s) 1 - g R*1 (bnR s) + FTn+1 (Ta )e - sTa 1 - g P*1 (bnp s) ))
(12.97)
Just as in the cases studied by Rehmert and Nachlas [78], truncating the infinite
sums and numerically inverting the two transforms provides arbitrarily
tight bounds on the availability function. Intiyot demonstrates this for
several cases that are defined by the choice of life and service time
distributions. The solutions display the same behaviors as those illustrated
in Figures 11.5a,b and 11.6, with the single difference that the initial transient
behavior extends over a slightly longer time. Thus, the quasi-renewal model
permits the investigation of the benefits of age-based PM policies for devices
that are well represented by this model.
$$A_n = \theta S_n \tag{12.98}$$
where
$S_n = \sum_{j=1}^{n} T_j$ is the actual device age at the time of the nth failure
$A_n$ is the resulting virtual age just after the completion of the nth repair
$$F_{T_{n+1}}(t) = \Pr[T_{n+1} \le t \mid A_n = u] = \frac{F_T(t+u) - F_T(u)}{\bar F_T(u)}$$
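The conditional distribution of the next life length given the virtual age can be evaluated directly. In the sketch below the names `next_life_cdf` and `repair_factor` (standing for θ in Equation 12.98) are hypothetical, and the Weibull parameters are assumed for illustration. Two checks are natural: an exponential life distribution is memoryless, so the virtual age should have no effect, while an IFR Weibull distribution should give a larger failure probability at a positive virtual age than for a new device.

```python
import math

def weibull_cdf(t, beta, theta):
    """Weibull distribution function with shape beta and scale theta."""
    return 1.0 - math.exp(-((t / theta) ** beta)) if t > 0.0 else 0.0

def next_life_cdf(t, virtual_age, cdf):
    """Pr[T_{n+1} <= t | A_n = u] = (F(t + u) - F(u)) / (1 - F(u))."""
    survivor_at_age = 1.0 - cdf(virtual_age)
    return (cdf(t + virtual_age) - cdf(virtual_age)) / survivor_at_age

# Assumed values: repair factor 0.5 and accumulated age 1000 give a virtual age of 500.
repair_factor, actual_age = 0.5, 1000.0
virtual_age = repair_factor * actual_age

weib = lambda t: weibull_cdf(t, 2.25, 2000.0)
expo = lambda t: 1.0 - math.exp(-t / 1000.0)

aged = next_life_cdf(500.0, virtual_age, weib)   # device with virtual age 500
fresh = weib(500.0)                              # new device
memoryless = next_life_cdf(100.0, virtual_age, expo)
print(round(aged, 4), round(fresh, 4), round(memoryless, 4))
```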
Using this notation, Kijima [69] defines the generalized renewal density as
$$m_{F_A}(t) = f_T(t) + \int_0^t m_{F_A}(x)\,\frac{f_T(t - (1-\theta)x)}{\bar F_T(\theta x)}\,dx \tag{12.102}$$
$$C(\tau_a) = \frac{c_1 + \displaystyle\int_0^{\tau_a} c_2(u)\,m_{F_A}(u)\,du}{E_{F_T}[T] + \displaystyle\int_0^{\tau_a} E_{F_A}[T_{n+1} \mid A_n = \theta u]\,m_{F_A}(u)\,du} \tag{12.103}$$
where the denominator is the expected length of the replacement cycle. Note
that the form of the time-dependent repair cost may be selected as appropriate
to an application. Makis and Jardine also show that as long as the
repair cost is bounded ($c_2(t) \le K$) and $E_{F_A}[T_{n+1} \mid A_n = y] \ge \varepsilon > 0$, the cost
function of Equation 12.103 has a unique minimum that can be computed using
the derivative.
One final model is the one defined by Kijima et al. [70] for a block replace-
ment policy with repairs that improve the device age following failure. Since
it is assumed that the PM activity involves replacement that is perfect, the
model for this policy appears only slightly different from the basic block
replacement model. The model is
$$C(\tau_b) = \frac{c_2\,M_{F_A}(\tau_b) + c_1}{\tau_b} \tag{12.104}$$
Clearly, for appropriate choices of the cost parameters and life distribution,
this cost function will have a unique minimum. The degree of difficulty for
the calculation of the minimum depends mostly on the choice of life distri-
bution. However, Kijima et al. show that using the now familiar form of the
residual life distribution
$$F_A(t \mid x) = \frac{F_T(t + \theta x) - F_T(\theta x)}{\bar F_T(\theta x)}$$
MFA (t) »
t
-
ò F (u)du
0
T
EFT [T ] EFT [T ]
æ ì t
üì t-x
üö
ç ïï
t
ò FA (u|x)du ï ï
ïï ò FA (u|t - x)du ï ÷
ï
ò
+ ç1- í 0
EFT [T ]
ýí
0
¥ ý÷
ç ï
0 ç
è ïî
ïï
ïþ ïî ò 0
FA (u|t - x)du ï ÷÷
ïþ ø
æ x
ö
ç
´ ç fT ( x) + x
0 ò
FT ( x) f A ( x - u|u)du ÷
÷ dx (12.105)
ç
ç
è 0 ò
FA ( x - u|u)du ÷÷
ø
12.3 Conclusion
PM policies may be defined in several ways and may ultimately have very
many operational realizations. The models discussed in this chapter serve
to illustrate the many forms the models may take and the two principal
approaches to analysis—availability and cost. PM is an essential ingredient
in any productivity assurance strategy and for many devices, it is critical
to system safety. Given the significance of PM, methods for selecting effi-
cient and effective PM policies are important and are worth the (sometimes
taxing) effort they require. It should now be clear that the state of a unit fol-
lowing repair or replacement is the aspect of the equipment behavior that
guides the analysis of maintenance policies. It should also be appreciated
that numerical approximations to complicated functions will often yield
policy solutions that are quite satisfactory.
Exercises
12.1 Assume a population of devices displays life lengths that are well
modeled by a gamma distribution having parameters β = 3.0 and
λ = 0.005. Compute and compare the quantities E[N(t, τa)], E[N(t, τb)],
and E[N(t)] for t = 100, 200, and 400 hours and τa and τb values of
0.20E[T], 0.40E[T], and 0.75E[T].
12.2 Establish the stochastic ordering between a Weibull distribution and
an exponential distribution having the same mean value for the cases
of IFR and DFR.
12.3 Compute an optimal block replacement policy (under the model in
Equation 12.4) for a system having a normal life distribution with
parameters μ = 400 and σ = 50 and assuming the repair cost is 12
times the PM cost.
12.4 For the system of the previous problem, assume operating and repair
periods are quasi-renewal and compute the optimal block replace-
ment policy time (under the model in Equation 12.81) for the case in
which repairs are also normal quasi-renewal with EG[T] = 20, σG = 4,
and α = 0.975, β = 1.02.
12.5 Compute an optimal age replacement policy for a system having a
Weibull life distribution with β = 2.75 and θ = 5000 hours and a
repair cost that is 20 times the PM cost.
12.6 Assume a population of devices displays life lengths that are well
modeled by a gamma distribution having parameters β = 3.0 and
λ = 0.002. Assume further that the devices are managed using an
age replacement policy with τa = 750 hours, Gc(t) is exponential with
parameter λc = 0.02, and Gp(t) is exponential with parameter λp = 0.05.
Use Equation 12.15 to compute A(t) for t = 900, 1200, and 1500 hours.
12.7 Assume a population of devices displays life lengths that are well
modeled by a Weibull distribution having parameters β = 2.25 and
θ = 2000 hours. Assume further that the devices are managed using
an age replacement policy with τa = 750 hours, Gc(t) is exponential
with parameter λc = 0.02, and Gp(t) is exponential with parameter
λp = 0.05. Use Equation 12.14 to compute A(t) for t = 100, 400, and 500
hours and then for t = 800, 1000, 1500, and 2000 hours.
12.8 Suppose that PM is imperfect in that a device is renewed with prob-
ability p and the hazard is unchanged by PM with probability 1 − p.
Assume also that a device failure implies device renewal so that the
cost rate function is Equation 12.76. Analyze this model as a function
of p for a Weibull life distribution having parameters β = 2.25 and θ =
2000 hours, a block replacement policy time of τb = 1500 hours, and c2
= 20c1. Plot the cost rate divided by c1 as a function of p.
12.9 Assume a population of devices displays life lengths that are well
modeled by a Weibull distribution having parameters β = 2.25 and
θ = 2000 hours. Use the approximation for the generalized renewal
function of Equation 12.105 with π = 0.90 to compute the values of
MFA (t) for t = 500, 1000, 2000, 3000, 4000, and 5000. Then, compute
the value of cost rate function of the Kijima et al. model of Equation
12.104 for each of these times when c2 = 20c1.
12.10 Assume a population of devices displays life lengths that are well
modeled by a gamma distribution having parameters β = 3.0 and
λ = 0.002. Use the approximation for the generalized renewal func-
tion of Equation 12.105 with π = 0.90 to compute the values of MFA (t)
for t = 500, 1000, 2000, 3000, 4000, and 5000. Then, recompute the
values for π = 0.75 and π = 0.50.
13
Predictive Maintenance
\[ f_{X(t_2)-X(t_1)}(x) = \frac{\lambda^{\beta(t_2-t_1)}}{\Gamma(\beta(t_2-t_1))}\, x^{\beta(t_2-t_1)-1} e^{-\lambda x} \tag{13.1} \]
Using this model, the average deterioration during the interval is β(t2 − t1)/λ
and the variance is β(t2 − t1)/λ². Thus, there is considerable flexibility in the
model so many types of deterioration can be represented this way. Van
Noortwijk et al. [87] argue that this model provides an accurate representa-
tion of cumulative damage processes, erosion and corrosion, defect-based
degradation, and most other evolutionary deterioration behaviors. A graphi-
cal representation of this type of process is shown in Figure 13.1. Note that
the assumption of independent increments implies that the process is semi-
regenerative and the times T1, T2, and T3 are semiregeneration points.
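As an illustration of the independent gamma increments in Equation 13.1, the following is a minimal sketch that simulates one sample path observed at equally spaced inspections. The function name, the inspection spacing, and the parameter values are assumptions chosen for illustration, not values taken from the text.

```python
import random

def simulate_gamma_path(beta, lam, dt, n_steps, seed=None):
    """Return inspection times and cumulative deterioration X(t)."""
    rng = random.Random(seed)
    times, levels = [0.0], [0.0]
    for _ in range(n_steps):
        # The increment over an interval of length dt is gamma distributed
        # with shape beta*dt and rate lam (gammavariate takes scale = 1/lam).
        increment = rng.gammavariate(beta * dt, 1.0 / lam)
        times.append(times[-1] + dt)
        levels.append(levels[-1] + increment)
    return times, levels

# Hypothetical parameters: beta = 3.0, lam = 0.5, inspections every 4 time units.
times, levels = simulate_gamma_path(beta=3.0, lam=0.5, dt=4.0, n_steps=10, seed=1)
```

Because every increment is nonnegative, the simulated path is nondecreasing, and the mean increment per inspection interval is β·dt/λ.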
FIGURE 13.1
Representation of generalized gamma deterioration process. [Plot of X(t) against time, with the points T1, T2, and T3 marked on the time axis.]
\[ \frac{\theta}{\beta\tau/\lambda} = \frac{\theta\lambda}{\beta\tau} \tag{13.2} \]
and the number of inspections required to discover that the threshold has
been passed is one greater than that number. Assuming an inspection cost of
cI and PM and corrective replacement costs of c1 and c2 as in Chapter 12, the
cost per unit time of an inspection policy τ is
\[ C(\tau) = \frac{c_I\left(\theta\lambda/\beta\tau + 1\right) + c_1 F_X(X_f - \theta) + c_2 \bar{F}_X(X_f - \theta)}{\left(\theta\lambda/\beta\tau + 1\right)\tau} \tag{13.3} \]
FIGURE 13.2
Inspection intervals with PM criterion. [Plot of X(t) against time, with inspections at τ, 2τ, 3τ, and 4τ and the failure threshold Xf marked on the vertical axis.]
where FX(Xf − θ) is the probability that the deterioration in the final inspec-
tion interval does not exceed the difference between the failure threshold
and the PM threshold.
Clearly, the simple model of Equation 13.3 is very similar to the age replace-
ment model. One important difference is that analysis of the model permits
optimization of both the inspection interval and the PM criterion. A short-
coming of the model is the restriction to equally spaced inspection intervals.
A key feature of condition-based maintenance is sensitivity to device state
and a corresponding capability to adjust inspection frequency.
\[ E[C_\infty] = \lim_{t\to\infty}\left[\frac{E[C(t)]}{t}\right] \]
where
D(t) is the quantity of device downtime over (0, t)
cd is the unit cost of downtime
NI(t) is the number of inspections
NP(t) is the number of PM replacements
NC(t) is the number of corrective (failure precipitated) replacements
over (0, t)
\[ \tau_{n+1} = \tau_n + 1 + \max\left\{0,\ \left(\tau_0 - \frac{\tau_0}{\theta - \varepsilon}X(\tau_n)\right)\right\} \tag{13.6} \]
Using this form, the minimum inspection interval is one time unit, and since
X(τ0) = 0, the maximum interval is 1 + τ0 time units. The minimum interval
is used whenever the device state is within ε units of the PM threshold, and
over time, inspection interval length declines from 1 + τ0 to 1.
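The scheduling rule of Equation 13.6 can be sketched as a small function that returns the time until the next inspection for an observed state. The function name and the numerical values in the usage lines are hypothetical.

```python
def next_inspection_interval(x, tau0, theta, eps):
    """Time until the next inspection (Equation 13.6) for observed state x."""
    # The interval shrinks linearly in x and is floored at one time unit.
    return 1.0 + max(0.0, tau0 - tau0 / (theta - eps) * x)

# Hypothetical values: tau0 = 5, PM threshold theta = 40, eps = 2.
interval_new = next_inspection_interval(0.0, tau0=5.0, theta=40.0, eps=2.0)
interval_mid = next_inspection_interval(19.0, tau0=5.0, theta=40.0, eps=2.0)
interval_near = next_inspection_interval(39.0, tau0=5.0, theta=40.0, eps=2.0)
```

A new device (x = 0) receives the longest interval, 1 + τ0, and any state within ε of the PM threshold receives the minimum interval of one time unit.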
As a matter of notational convenience, let In(X(τn)) represent the time inter-
val until the next inspection that is selected using Equation 13.6. That is,
\[ I_n(X(\tau_n)) = \tau_{n+1} - \tau_n \]

\[ f_{X(I)}(x) = f_{X(\tau_{n+1})-X(\tau_n)}(x) = \frac{\lambda^{\beta I_n(X(\tau_n))}}{\Gamma(\beta I_n(X(\tau_n)))}\, x^{\beta I_n(X(\tau_n))-1} e^{-\lambda x} \tag{13.7} \]
which is to say that the shape parameter of the distribution depends upon
the selected length of the inspection interval.
\[ Y_n = X(\tau_n) \]
and takes real values in [0, θ). The evolution of the variable Yn is a continuous
state space Markov chain. If we let Π(Y) represent the stationary distribution
on the state variable Yn, then the limits in Equation 13.5 can be shown [80]
to equal the corresponding first interval expectations with respect to Π(Y).
That is,
\[ E[C_\infty] = c_I\left[\frac{E_\Pi[N_I(\tau_1)]}{E_\Pi[\tau_1]}\right] + c_P\left[\frac{E_\Pi[N_P(\tau_1)]}{E_\Pi[\tau_1]}\right] + c_C\left[\frac{E_\Pi[N_C(\tau_1)]}{E_\Pi[\tau_1]}\right] + c_d\left[\frac{E_\Pi[D(\tau_1)]}{E_\Pi[\tau_1]}\right] \tag{13.8} \]
Thus, the first step in analyzing the PM strategy is the construction of Π(Y).
For the gamma deterioration process described by the distribution of (13.7),
the stationary distribution is obtained as the solution to
\[ \Pi(y) = \int_0^{\theta}\left(\bar{F}_{X(I)}(\theta - x)\,\delta_0(y) + f_{X(I)}(y - x)\right)\Pi(x)\,dx \tag{13.9} \]
in which δ 0(y) is the Dirac mass function. The solution of this equation takes
the form of the convex combination:
\[ \Pi(y) = a\,\delta_0(y) + (1 - a)\,g(y) \tag{13.10} \]
\[ f^{(k)}_{X(I)}(y) = \int_0^{y} f^{(k-1)}_{X(\tau_{k-1})-X(\tau_0)}(y - x)\, f_{X(\tau_k)-X(\tau_{k-1})}(x)\,dx \tag{13.11} \]
Once the value of a is determined, the function g(x) in Equation 13.10 is com-
puted as
\[ g(x) = \frac{a}{1-a}\,\psi(x) \tag{13.14} \]
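\[ E_\Pi[N_P(\tau_1)] = \int_0^{\theta}\left(\bar{F}_{X(I)}(\theta - x) - \bar{F}_{X(I)}(X_f - x)\right)\Pi(x)\,dx \tag{13.15} \]

\[ E_\Pi[N_C(\tau_1)] = \int_0^{\theta}\bar{F}_{X(I)}(X_f - x)\,\Pi(x)\,dx \tag{13.16} \]

\[ E_\Pi[D(\tau_1)] = \int_0^{\theta}\left(\int_0^{I(x)}\bar{F}_{X(I(u))}(X_f - u)\,du\right)\Pi(x)\,dx \tag{13.17} \]

\[ E_\Pi[\tau_1] = \int_0^{\theta} I(x)\,\Pi(x)\,dx \tag{13.18} \]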
\[ f_T(t) = \sqrt{\frac{\lambda}{2\pi t^3}}\; e^{-\frac{\lambda(t-\mu)^2}{2t\mu^2}} \tag{13.19} \]
\[ F_T(t) = \Phi\left(\sqrt{\frac{\lambda}{t}}\left(\frac{t}{\mu} - 1\right)\right) + e^{2\lambda/\mu}\,\Phi\left(-\sqrt{\frac{\lambda}{t}}\left(\frac{t}{\mu} + 1\right)\right) \tag{13.20} \]
where
μ = ω/η
λ = ω²/δ²
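The inverse Gaussian density and distribution function can be evaluated with the standard library alone. The sketch below uses the standard closed form of the distribution function, in which the second normal term has argument −√(λ/t)(t/μ + 1) and is multiplied by exp(2λ/μ). The function names and the parameter values μ = 1000 and λ = 4000 are assumptions for illustration.

```python
import math

def std_normal_cdf(z):
    # erfc keeps precision in the lower tail, where 1 + erf(.) would cancel.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def ig_pdf(t, mu, lam):
    """Inverse Gaussian density with mean mu and shape lam."""
    return math.sqrt(lam / (2.0 * math.pi * t ** 3)) * math.exp(
        -lam * (t - mu) ** 2 / (2.0 * t * mu ** 2))

def ig_cdf(t, mu, lam):
    """Inverse Gaussian distribution function (standard closed form)."""
    a = math.sqrt(lam / t)
    return (std_normal_cdf(a * (t / mu - 1.0))
            + math.exp(2.0 * lam / mu) * std_normal_cdf(-a * (t / mu + 1.0)))

# Hypothetical parameters; a trapezoidal integral of the density is compared
# with the closed-form distribution function as a consistency check.
mu, lam = 1000.0, 4000.0
grid = [float(k) for k in range(1, 1001)]
area = sum(0.5 * (ig_pdf(a, mu, lam) + ig_pdf(b, mu, lam)) * (b - a)
           for a, b in zip(grid, grid[1:]))
closed_form = ig_cdf(1000.0, mu, lam)
```

For very large λ/μ the factor exp(2λ/μ) can overflow in floating point, so a production implementation would combine the two terms on a log scale.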
For a complete data set from a reliability test, the observed failure times
(t1,t2,…,tn) can be used to construct the likelihood function:
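\[ L(t, \lambda, \mu) = \prod_{i=1}^{n} f_T(t_i) = \prod_{i=1}^{n}\left(\sqrt{\frac{\lambda}{2\pi t_i^3}}\; e^{-\frac{\lambda(t_i-\mu)^2}{2t_i\mu^2}}\right) = \left(\frac{\lambda}{2\pi}\right)^{n/2}\left(\prod_{i=1}^{n} t_i^{-3/2}\right) e^{-\sum_{i=1}^{n}\frac{\lambda(t_i-\mu)^2}{2t_i\mu^2}} \tag{13.21} \]

\[ \ln L(t, \lambda, \mu) = \frac{n}{2}\left(\ln\lambda - \ln 2 - \ln\pi\right) - \frac{3}{2}\sum_{i=1}^{n}\ln t_i - \frac{\lambda}{2\mu^2}\sum_{i=1}^{n}\frac{(t_i-\mu)^2}{t_i} \tag{13.22} \]

\[ \hat{\lambda} = \frac{\hat{\mu}^2}{\dfrac{1}{n}\displaystyle\sum_{i=1}^{n}\frac{(t_i-\hat{\mu})^2}{t_i}} \tag{13.23} \]
and
\[ \sum_{i=1}^{n}\frac{(t_i-\hat{\mu})}{t_i} + \frac{1}{\hat{\mu}}\sum_{i=1}^{n}\frac{(t_i-\hat{\mu})^2}{t_i} = 0 \tag{13.24} \]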
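As a worked illustration, the estimation equations can be applied to the 17 times of Table 13.1. Expanding the second sum in Equation 13.24 shows that the equation reduces to (1/μ̂)Σt_i − n = 0, so the estimate of μ is the sample mean; Equation 13.23 then gives the estimate of λ directly. The sketch below is one way to compute both; the function name is hypothetical.

```python
# Failure times from Table 13.1 (days).
times = [1460, 1193, 797, 1169, 1385, 1006, 1125, 1450, 660,
         869, 775, 728, 999, 1035, 1424, 670, 841]

def ig_mle(ts):
    """Maximum likelihood estimates (mu, lambda) for inverse Gaussian data."""
    n = len(ts)
    mu_hat = sum(ts) / n                                  # root of Equation 13.24
    spread = sum((t - mu_hat) ** 2 / t for t in ts) / n
    lam_hat = mu_hat ** 2 / spread                        # Equation 13.23
    return mu_hat, lam_hat

mu_hat, lam_hat = ig_mle(times)

# Left-hand side of Equation 13.24 evaluated at the estimate (should be ~0).
score = (sum((t - mu_hat) / t for t in times)
         + sum((t - mu_hat) ** 2 / t for t in times) / mu_hat)
```

The same value of λ̂ follows from the equivalent form 1/λ̂ = (1/n)Σ(1/t_i − 1/μ̂), which can serve as a numerical cross-check.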
Once the historical data have been used to obtain estimates for the life
distribution parameters, there are two reasonable approaches to using device
monitoring measurements to evaluate failure risk.
\[ F_{T|\tau}(t|\tau) = 1 - \frac{\bar{F}_T(t + \tau)}{\bar{F}_T(\tau)} \tag{13.25} \]
so a decision rule for initiating maintenance should have the form that PM
is done when FT|τ(t|τ) exceeds a selected value for an appropriately chosen
value of time. The choice of the probability threshold and the time interval
can be based on cost analyses or any other meaningful criteria.
An alternate approach that may be more precise is to take the degradation
measure as the starting point for the diffusion process and to consider that
the distance to the threshold is revised to be ω – X(τ). Consequently, the dis-
tribution on time to failure is adjusted to be
\[ F_{T|\tau}(t) = \Phi\left(\sqrt{\frac{\lambda'}{t}}\left(\frac{t}{\mu'} - 1\right)\right) + e^{2\lambda'/\mu'}\,\Phi\left(-\sqrt{\frac{\lambda'}{t}}\left(\frac{t}{\mu'} + 1\right)\right) \tag{13.26} \]
where
μ′ = (ω – X(τ))/η
λ′ = (ω – X(τ))²/δ²
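The two approaches can be compared numerically. The sketch below computes (a) the conditional failure probability for a device that has survived to age τ, using only the age, and (b) the failure probability when the inverse Gaussian parameters are recomputed from the remaining distance ω − X(τ). All names and the values ω = 100, η = 0.1, and δ = 1 are hypothetical, and the distribution function is the standard inverse Gaussian closed form.

```python
import math

def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def ig_cdf(t, mu, lam):
    a = math.sqrt(lam / t)
    return (std_normal_cdf(a * (t / mu - 1.0))
            + math.exp(2.0 * lam / mu) * std_normal_cdf(-a * (t / mu + 1.0)))

def conditional_failure_prob(t, tau, mu, lam):
    """Probability of failure within t more time units given survival to tau."""
    return 1.0 - (1.0 - ig_cdf(t + tau, mu, lam)) / (1.0 - ig_cdf(tau, mu, lam))

def adjusted_failure_prob(t, x_tau, omega, eta, delta):
    """Failure probability within t when the process restarts at state x_tau."""
    distance = omega - x_tau
    return ig_cdf(t, distance / eta, distance ** 2 / delta ** 2)

# Hypothetical threshold, drift, and diffusion parameters.
omega, eta, delta = 100.0, 0.1, 1.0
p_age_only = conditional_failure_prob(300.0, 200.0, omega / eta, omega ** 2 / delta ** 2)
p_low_state = adjusted_failure_prob(300.0, 20.0, omega, eta, delta)
p_high_state = adjusted_failure_prob(300.0, 60.0, omega, eta, delta)
```

A device observed closer to the threshold receives a larger failure probability over the same horizon, which is the information that the age-only calculation cannot use.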
TABLE 13.1
Times to Reach an Example Failure Threshold (in Days)
1460 1193 797 1169
1385 1006 1125 1450
660 869 775 728
999 1035 1424 670
841
FIGURE 13.3
Residual reliability over time for three state variable measures. [Vertical axis: residual reliability, scaled from 0.2 to 1.0.]
\[ \sum_{t=1}\left(\hat{y}_{t+1} - y_{t+1}\right)^2 = \sum_{t=1}\left(s_t + b_t - y_{t+1}\right)^2 \]
TABLE 13.2
Sequence of Observed Fraction Defective in Sampled Output
0.0027 0.0159 0.0306
0.0061 0.0197 0.0320
0.0088 0.0216 0.0348
0.0125 0.0252 0.0359
0.0144 0.0272
TABLE 13.3
Computed Time Series Values
t bt st
1 0.0024 0.0052
2 0.0025 0.0079
3 0.0026 0.0108
4 0.0040 0.0135
5 0.0039 0.0117
6 0.0038 0.0209
7 0.0037 0.0242
8 0.0036 0.0274
9 0.0034 0.0303
10 0.0032 0.0332
11 0.0030 0.0356
12 0.0037 0.0380
13 0.0034 0.0406
\[ g_1(y|x) = \Pr[Y(t) = y \mid X(t) = x, S > T] = \phi\left(\frac{y - \left(\mu_y + \dfrac{\rho\sigma_y}{\sigma_x}(x - \mu_x)\right)}{\sigma_y\sqrt{1 - \rho^2}}\right) \tag{13.30} \]
where S represents the failure time. For devices that fail during the test, the
conditional density on Y(t) is
They further state that the density on the value of the state variable for sur-
viving units is
\[ h_X(x) = \Pr[X(T) = x, S > T] = \frac{e^{-\left(\frac{(x-\mu_x T)^2}{2\sigma_x^2 T}\right)}}{\sqrt{2\pi\sigma_x^2 T}}\left(1 - e^{-\left(\frac{2\omega(\omega - x)}{\sigma_x^2 T}\right)}\right) \tag{13.32} \]
where necessarily x < ω. Next, the failure time for the Wiener process is the
inverse Gaussian so
\[ h_S(s) = \Pr[S = s] = \frac{\omega\, e^{-\left(\frac{(\omega-\mu_x s)^2}{2\sigma_x^2 s}\right)}}{\sqrt{2\pi\sigma_x^2 s^3}} \tag{13.33} \]
Using these definitions, Whitmore et al. [93] give the joint density on the
marker variable and the chance of survival as
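\[
\begin{aligned}
h_{Y,\mathrm{survivor}}(y) &= \int_{-\infty}^{\omega} g_1(y|x)\,h_X(x)\,dx \\
&= \Phi\left(\frac{\omega - \left(\mu_x T + \dfrac{\sigma^2_{xy}}{\sigma_y^2}(y - \mu_y)\right)}{\sqrt{\sigma_x^2(1-\rho^2)}}\right)\phi\left(\frac{y - \mu_y T}{\sigma_y\sqrt{T}}\right) \\
&\quad - e^{2\omega\mu_x/\sigma_x^2}\,\Phi\left(\frac{\omega - \left(\mu_x T + \dfrac{\sigma^2_{xy}}{\sigma_y^2}(y - \mu_y) + 2\omega(1-\rho^2)\right)}{\sqrt{\sigma_x^2(1-\rho^2)}}\right)\phi\left(\frac{y - \left(\mu_y T + 2\omega\left(\dfrac{\sigma^2_{xy}}{\sigma_x^2}\right)\right)}{\sigma_y\sqrt{T}}\right)
\end{aligned}
\tag{13.34}
\]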
and the complementary joint density on the marker variable and failure
time as
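\[ h_{Y,\mathrm{failure},S}(y, s) = \Pr[Y(s) = y, S = s] = g_2(y|x)\,h_S(s) = \frac{\omega\, e^{-\frac{1}{2}M'\Sigma^{-1}M}}{2\pi\,|\Sigma|^{1/2}\, s^2} \tag{13.35} \]

\[ M' = \left[\frac{y - \mu_y s}{\sqrt{s}},\ \frac{\omega - \mu_x s}{\sqrt{s}}\right] \]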
Using the defined probability functions, the failure and survival probabili-
ties over time are
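\[ F_T(t) = \int_0^{t} h_S(s)\,ds = \int_0^{t}\int_{-\infty}^{\infty} h_{Y,\mathrm{failure},S}(y, s)\,dy\,ds \tag{13.36} \]

\[ \bar{F}_T(t) = \int_{-\infty}^{\omega} h_X(x)\,dx = \int_{-\infty}^{\infty} h_{Y,\mathrm{survivor}}(y)\,dy \tag{13.37} \]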
of the units fail by the termination of the test at time T, then the data observa-
tions will be (yi, si) for the items that fail and (yi, T) for those that survive. The
general form of the logarithm of the likelihood function is
\[ \ln L(\mu, \Sigma) = \sum_{i=1}^{m}\ln\left(h_{Y,\mathrm{failure},S}(y_i, s_i)\right) + \sum_{i=m+1}^{n}\ln\left(h_{Y,\mathrm{survivor}}(y_i)\right) \tag{13.38} \]
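\[ \hat{\mu}_y = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} s_i} \qquad \hat{\mu}_x = \frac{n\omega}{\sum_{i=1}^{n} s_i} \tag{13.39} \]
and
\[ \hat{\sigma}_x^2 = \frac{1}{n}\sum_{i=1}^{n}\frac{(\omega - \hat{\mu}_x s_i)^2}{s_i} \qquad \hat{\sigma}_y^2 = \frac{1}{n}\sum_{i=1}^{n}\frac{(y_i - \hat{\mu}_y s_i)^2}{s_i} \qquad \hat{\sigma}^2_{xy} = \frac{1}{n}\sum_{i=1}^{n}\frac{(y_i - \hat{\mu}_y s_i)(\omega - \hat{\mu}_x s_i)}{s_i} \tag{13.40} \]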
where the densities are evaluated at τ rather than T and the time from τ to
failure for any given value of the degradation is the inverse Gaussian density
for s – τ when the distance to the threshold is ω – X(τ). Therefore, the density on
remaining time to failure when the marker variable value y(τ) is observed is
\[ f_{S|Y}(s|y) = \int_{-\infty}^{\omega}\left(\sqrt{\frac{\hat{\lambda}}{2\pi s^3}}\; e^{-\frac{\hat{\lambda}(s-\hat{\mu})^2}{2s\hat{\mu}^2}}\right) f_{X|Y}(x|y)\,dx \tag{13.42} \]
TABLE 13.4
Simulated Failure Times and Marker Variable Values
si yi si yi si yi
652 52.155 671 53.216 666 53.056
658 52.228 657 53.056 660 53.335
663 52.459 674 53.221 656 52.561
654 52.730 671 53.060 664 52.027
657 51.927 671 52.628 656 52.986
656 53.108 671 53.014 665 52.706
660 52.749 658 53.021 651 52.066
675 53.090 652 52.729 643 53.152
648 52.526
TABLE 13.5
Simulated Marker Variable Values and Censored Failure Times
si yi si yi si yi
657 52.244 657 52.910 649 52.369
Censored 52.345 Censored 51.711 656 53.230
654 52.510 Censored 52.617 659 52.679
646 51.944 656 52.372 Censored 52.820
657 53.089 Censored 52.459 660 52.787
654 52.764 659 51.784 647 52.267
647 51.667 Censored 53.548 Censored 51.955
660 51.925 656 52.445 632 51.543
655 51.067
FIGURE 13.4
Residual life density for the simulated marker variable. [Plot of f(s|y) against time, for times from 700 to 1300.]
13.5 Conclusion
The concept of predictive maintenance is very appealing. Predictive
maintenance holds the promise of significant maintenance cost reductions
and associated increases in productivity. The definition of alternate policy
formats and the formulation and analysis of models for selecting predictive
maintenance policies are embryonic and represent one of the important new
frontiers in maintenance planning research.
Exercises
13.1 Assume that the gamma process of Equation 13.1 has parameters β = 3.0
and λ = 0.5 and simulate a sample path of the process over time with
inspections every 4 time units. Plot the observed values of X(t) relative
to a PM threshold of θ = 100.
13.2 Assume a gamma deterioration process has parameters β = 2.0 and
λ = 0.4 and cost parameters of c1 = 1 and c2 = 22 with failure threshold
Xf = 75.0 and PM criterion θ = 70.0. Solve the model of Equation 13.3 to
determine the optimal inspection schedule.
13.3 Assume a device has a gamma deterioration process that has param-
eters β = 2.0 and λ = 1.2; failure and PM thresholds of Xf = 45 and θ = 40,
respectively; and an inspection schedule that is defined by τ0 = 5 and
τn+1 = τ0 – 0.1X(τn). Use Equations 13.9 through 13.14 to determine the
stationary density on the state of the device.
13.4 For the device described in Problem 13.3, assume cost parameters of
cI = 1, cP = 10, cC = 125, and cd = 250 and calculate the limiting expected
cost rate for predictive maintenance of the device.
13.5 Suppose a sample of 40 copies of a device having failure threshold
ω = 20.0 had previously been subjected to a life test and that the failure
time data obtained are
13.6 Suppose that a performance (or marker) variable for a device has
been monitored at regular time intervals of one week over a period of
24 weeks and that the observed data are
si yi si yi si yi
449 53.486 446 53.256 453 54.533
444 51.894 442 53.39 455 53.613
451 53.537 443 52.47 452 52.836
442 53.626 453 52.926 453 53.3
437 53.313 442 52.821 443 53.231
454 53.043 439 52.312 439 52.761
441 52.464 455 54.051 438 52.902
441 52.914 447 55.008 449 54.444
444 53.352
14
Special Topics
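\[
\lambda(t) =
\begin{cases}
\dfrac{N_1}{\tau_1} & 0 \le t < \tau_1 \\[2ex]
\dfrac{N_2}{(\tau_2 - \tau_1)} & \tau_1 \le t < \tau_2 \\
\quad\vdots & \\
\dfrac{N_m}{(\tau_m - \tau_{m-1})} & \tau_{m-1} \le t < \tau_m
\end{cases}
\tag{14.1}
\]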
where the m intervals defined by the values τi are not necessarily of equal
length but together define the partition of the observation interval.
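A minimal sketch of the piecewise-constant estimate in Equation 14.1 follows. It counts the failures in each interval of the partition and divides by the interval length. The function name and the equally spaced partition are assumptions; the failure times are those of Table 14.1.

```python
def piecewise_intensity(failure_times, boundaries):
    """Failures per unit time on each interval [tau_{i-1}, tau_i), with tau_0 = 0."""
    rates = []
    start = 0.0
    for end in boundaries:
        count = sum(1 for t in failure_times if start <= t < end)
        rates.append(count / (end - start))
        start = end
    return rates

# Failure times from Table 14.1; the partition below is a hypothetical choice.
failures = [676.4103, 814.0804, 1194.773, 1790.756, 2289.604, 3015.355,
            3142.368, 3580.177, 4014.987, 4222.475, 4691.219]
rates = piecewise_intensity(failures, [1000.0, 2000.0, 3000.0, 4000.0, 5000.0])
```

The estimate is sensitive to the choice of partition, so it is best read as a descriptive summary rather than as a fitted model.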
If the times τi are selected because they are failure times, then the Laplace
test [52] may be used to determine whether or not the failure intensity func-
tion appears to be homogeneous. If the intensity function is homogeneous,
then the quantity
\[ L = \frac{\dfrac{\sum_{i=1}^{n}\tau_i}{n} - \dfrac{T}{2}}{T/\sqrt{12n}} \]
TABLE 14.1
Failure Times for a Nonhomogeneous Failure Process
Example
676.4103 2289.604 4014.987
814.0804 3015.355 4222.475
1194.773 3142.368 4691.219
1790.756 3580.177
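The Laplace statistic is simple to evaluate for the failure times of Table 14.1. In the sketch below, the function name is hypothetical and the observation period T = 5000 is an assumption, since the length of the observation interval for this data set is not restated here.

```python
import math

def laplace_statistic(failure_times, T):
    """Laplace trend statistic for failures observed over (0, T)."""
    n = len(failure_times)
    return (sum(failure_times) / n - T / 2.0) / (T / math.sqrt(12.0 * n))

# Failure times from Table 14.1; T = 5000.0 is an assumed observation period.
failures = [676.4103, 814.0804, 1194.773, 1790.756, 2289.604, 3015.355,
            3142.368, 3580.177, 4014.987, 4222.475, 4691.219]
statistic = laplace_statistic(failures, 5000.0)
```

Under a homogeneous process the statistic is approximately standard normal, so values far from zero suggest a trend: positive values indicate failures concentrated late in the interval and negative values indicate failures concentrated early.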
\[ \frac{\chi^2_{2n,\,\alpha/2}}{2nt} \le \lambda \le \frac{\chi^2_{2n,\,1-\alpha/2}}{2nt} \tag{14.3} \]
\[
G(t) =
\begin{cases}
0 & t < 0 \\
\Lambda(t)/\Lambda(T) & 0 \le t \le T \\
1 & t > T
\end{cases}
\tag{14.5}
\]
For the intensity function of Equation 14.4, the distribution on order statistics
simplifies to
\[ G(t) = \Lambda(t)/\Lambda(T) = \frac{(t/\theta)^{\beta}}{(T/\theta)^{\beta}} = (t/T)^{\beta} \tag{14.6} \]
The likelihood function that applies to a set of observed failure times τi and
the associated random variable N is the joint density on these quantities:
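\[ L(\theta, \beta \mid n, \tau_1, \tau_2, \ldots, \tau_n) = f_N(n) \times n!\prod_{i=1}^{n} g(\tau_i) = \frac{(T/\theta)^{n\beta} e^{-(T/\theta)^{\beta}}}{n!}\; n!\prod_{i=1}^{n}\beta\left(\frac{\tau_i^{\beta-1}}{T^{\beta}}\right) = \left(\frac{\beta^{n}}{\theta^{n\beta}}\right)\left(\prod_{i=1}^{n}\tau_i\right)^{\beta-1} e^{-(T/\theta)^{\beta}} \tag{14.7} \]

\[ \ln L = n\ln\beta - n\beta\ln\theta + (\beta - 1)\sum_{i=1}^{n}\ln\tau_i - \left(\frac{T}{\theta}\right)^{\beta} \tag{14.8} \]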
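\[ \hat{\beta} = \frac{n}{\sum_{i=1}^{n}\ln(T/\tau_i)} = \frac{n}{n\ln T - \sum_{i=1}^{n}\ln\tau_i} \quad\text{and}\quad \hat{\theta} = \frac{T}{n^{1/\hat{\beta}}} \tag{14.9} \]

\[ \frac{\hat{\beta}\,\chi^2_{2n,\,\alpha/2}}{2n} \le \beta \le \frac{\hat{\beta}\,\chi^2_{2n,\,1-\alpha/2}}{2n} \tag{14.10} \]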
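The point estimates of Equation 14.9 require only the failure times and the length of the observation interval. The sketch below applies them to the data of Table 14.1; the function name is hypothetical and T = 5000 is an assumed observation period.

```python
import math

def power_law_mle(failure_times, T):
    """Estimates (beta, theta) for a power-law intensity observed over (0, T)."""
    n = len(failure_times)
    beta_hat = n / sum(math.log(T / t) for t in failure_times)
    theta_hat = T / n ** (1.0 / beta_hat)
    return beta_hat, theta_hat

# Failure times from Table 14.1; T = 5000.0 is an assumed observation period.
failures = [676.4103, 814.0804, 1194.773, 1790.756, 2289.604, 3015.355,
            3142.368, 3580.177, 4014.987, 4222.475, 4691.219]
beta_hat, theta_hat = power_law_mle(failures, 5000.0)

# At the estimates, the fitted cumulative intensity at T equals the count n.
fitted_count = (5000.0 / theta_hat) ** beta_hat
```

A value of β̂ above 1 indicates an increasing failure intensity and a value below 1 indicates a decreasing one; the interval of Equation 14.10 shows whether the difference from 1 is meaningful.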
that the observed times are τji for i = 1,…,nj. Here again, the numbers of fail-
ures and the times of failures are random variables.
If the failure process is homogeneous, then the failure intensity function is
defined by a single parameter and the likelihood function for the observed
failure data is
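\[ L(\lambda \mid n, \tau_1, \tau_2, \ldots, \tau_J) = \prod_{j=1}^{J}\frac{(\lambda T_j)^{n_j}}{n_j!}\, e^{-\lambda T_j} = \lambda^{\sum_{j=1}^{J} n_j}\, e^{-\lambda\sum_{j=1}^{J} T_j}\prod_{j=1}^{J}\frac{T_j^{n_j}}{n_j!} \tag{14.11} \]

\[ \ln L = -\lambda\sum_{j=1}^{J} T_j + \left(\sum_{j=1}^{J} n_j\right)\ln\lambda + \sum_{j=1}^{J}\left(n_j\ln T_j - \ln n_j!\right) \tag{14.12} \]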
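Setting the derivative of the logarithm of the likelihood function equal to
zero yields the estimation equation
\[ \hat{\lambda} = \frac{\sum_{j=1}^{J} n_j}{\sum_{j=1}^{J} T_j} \tag{14.13} \]

\[ \frac{\chi^2_{n,\,\alpha/2}}{2\sum_{j=1}^{J} T_j} \le \lambda \le \frac{\chi^2_{n,\,1-\alpha/2}}{2\sum_{j=1}^{J} T_j} \tag{14.14} \]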
TABLE 14.2
Example of Homogeneous Failure Times for a Fleet of 5 Identical
Systems
n1 = 6     n2 = 3     n3 = 0   n4 = 3     n5 = 4
234.076    150.168    -        378.700    110.829
480.5505   329.935    -        413.786    209.701
941.3557   501.633    -        572.794    1400.070
1505.591   -          -        -          1668.630
1792.566   -          -        -          -
1984.426   -          -        -          -
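For the fleet of Table 14.2, the estimate of Equation 14.13 depends only on the failure counts and the observation periods. In the sketch below, the counts are those of the table, while the common observation period of 2000 time units per system is an assumption. The log likelihood of Equation 14.12 is included so that the estimate can be checked against nearby values.

```python
import math

def fleet_rate(counts, durations):
    """Pooled failure intensity estimate (Equation 14.13)."""
    return sum(counts) / sum(durations)

def fleet_log_likelihood(lam, counts, durations):
    """Log likelihood of Equation 14.12 for a homogeneous fleet."""
    constant = sum(n * math.log(T) - math.lgamma(n + 1)   # lgamma(n + 1) = ln n!
                   for n, T in zip(counts, durations))
    return -lam * sum(durations) + sum(counts) * math.log(lam) + constant

counts = [6, 3, 0, 3, 4]             # n1..n5 from Table 14.2
durations = [2000.0] * 5             # assumed observation period per system
lam_hat = fleet_rate(counts, durations)
```

A system with no failures, such as the third one here, still contributes its observation time to the denominator and therefore lowers the estimate.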
times. The failure times for the separate systems should be independent so
the likelihood function for the failure intensity function parameters is
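\[ L(\theta, \beta \mid n, \tau_1, \tau_2, \ldots, \tau_J) = \prod_{j=1}^{J}\left(\prod_{i=1}^{n_j}\lambda(\tau_{ji})\right) e^{-\Lambda(T_j)} = \prod_{j=1}^{J}\left(\prod_{i=1}^{n_j}\frac{\beta\tau_{ji}^{\beta-1}}{\theta^{\beta}}\right) e^{-(T_j/\theta)^{\beta}} = \frac{\beta^{\sum_{j=1}^{J} n_j}}{\theta^{\beta\sum_{j=1}^{J} n_j}}\left(\prod_{j=1}^{J}\prod_{i=1}^{n_j}\tau_{ji}\right)^{\beta-1} e^{-\sum_{j=1}^{J}(T_j/\theta)^{\beta}} \tag{14.15} \]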
The logarithm of this function is not complicated and taking its derivative
leads to the estimation equations
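\[ \hat{\theta} = \left(\frac{\sum_{j=1}^{J} T_j^{\hat{\beta}}}{\sum_{j=1}^{J} n_j}\right)^{1/\hat{\beta}} \tag{14.16} \]
and
\[ \hat{\beta} = \frac{\sum_{j=1}^{J} n_j}{\dfrac{\left(\sum_{j=1}^{J} n_j\right)\left(\sum_{j=1}^{J} T_j^{\hat{\beta}}\ln T_j\right)}{\sum_{j=1}^{J} T_j^{\hat{\beta}}} - \sum_{j=1}^{J}\sum_{i=1}^{n_j}\ln\tau_{ji}} \tag{14.17} \]

\[ \frac{1}{\hat{\beta}} + \frac{\sum_{j=1}^{J}\sum_{i=1}^{n_j}\ln\tau_{ji}}{\sum_{j=1}^{J} n_j} - \frac{\sum_{j=1}^{J}\left(T_j^{\hat{\beta}}\ln T_j\right)}{\sum_{j=1}^{J} T_j^{\hat{\beta}}} = 0 \tag{14.18} \]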
TABLE 14.3
Example of Nonhomogeneous Failure Times for a Fleet
of 4 Identical Systems
n1 = 11 n 2 = 13 n3 = 9 n4 = 9
676.410 399.084 303.763 570.445
814.080 440.728 461.063 841.766
1194.773 722.429 695.678 1201.825
1790.756 1316.966 1325.921 1508.394
2289.604 2011.369 1442.667 1653.020
3015.355 2587.809 1845.714 2189.773
3142.368 3139.724 2376.580 2591.334
3580.177 3490.071 2657.079 2863.896
4014.987 4057.239 2896.699 3621.946
4222.475 4190.488
4691.219 4970.179
5344.368
5858.649
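Equation 14.18 has no closed-form solution when the observation periods differ, but its left-hand side is strictly decreasing in β, so a bisection search is sufficient. The sketch below is one possible implementation applied to systems 3 and 4 of Table 14.3. The function name, the search bracket, and the observation periods of 3000 and 4000 time units are assumptions.

```python
import math

def fleet_power_law_mle(systems, durations, lo=0.01, hi=20.0, tol=1e-10):
    """Solve Equation 14.18 for beta by bisection, then apply Equation 14.16."""
    n_total = sum(len(ts) for ts in systems)
    sum_log = sum(math.log(t) for ts in systems for t in ts)

    def score(beta):
        weights = [T ** beta for T in durations]
        weighted_log = sum(w * math.log(T) for w, T in zip(weights, durations)) / sum(weights)
        return 1.0 / beta + sum_log / n_total - weighted_log

    # score is positive for small beta and negative for large beta.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    beta_hat = 0.5 * (lo + hi)
    theta_hat = (sum(T ** beta_hat for T in durations) / n_total) ** (1.0 / beta_hat)
    return beta_hat, theta_hat

# Systems 3 and 4 of Table 14.3; the observation periods are assumed values.
system3 = [303.763, 461.063, 695.678, 1325.921, 1442.667,
           1845.714, 2376.580, 2657.079, 2896.699]
system4 = [570.445, 841.766, 1201.825, 1508.394, 1653.020,
           2189.773, 2591.334, 2863.896, 3621.946]
beta_hat, theta_hat = fleet_power_law_mle([system3, system4], [3000.0, 4000.0])
```

For a single system the score equation reduces to Equation 14.9, which gives a convenient check on the search.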
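\[ n = 2\sum_{j=1}^{J} n_j \]
and
\[ \frac{\hat{\beta}\,\chi^2_{n,\,\alpha/2}}{2\sum_{j=1}^{J} n_j} \le \beta \le \frac{\hat{\beta}\,\chi^2_{n,\,1-\alpha/2}}{2\sum_{j=1}^{J} n_j} \tag{14.19} \]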
so 0.770 ≤ β ≤ 1.414.
In summary, the statistical methods that were developed for estimating
the parameters of life distributions have been modified slightly. The revised
methods permit us to compute parameter estimates for useful models of the
behavior of the stochastic processes that represent the operation of repairable
equipment.
14.2 Warranties
A warranty is a guarantee by a producer that a product will display a defined
level of reliability. It is increasingly common for manufacturers to provide
warranties for their products. The chief reason that manufacturers pro-
vide warranties is that customers often consider warranties to be appropri-
ate and warranties are thus an important ingredient in successful marketing.
Independent of this point, it is generally recognized that complex and especially
expensive products should be guaranteed to function properly. In 1975,
the U.S. Congress enacted the Magnuson–Moss Act, which formally defined
how the warranties are to be structured and specified the responsibilities of
manufacturers in meeting warranty commitments.
There are essentially two types of warranties. These are full replacement
warranties and pro rata warranties. Pro rata warranties are usually offered
on automobile components such as tires and batteries.
Common characteristics of products that carry pro rata warranties are that
(1) product use implies wear or at least accumulating deterioration, (2) repair
is either physically impractical or economically inefficient, and (3) evaluation
of product age is reasonably straightforward. Under a pro rata warranty, the
manufacturer returns a proportion of the original price of the product to the
customer in the event of product failure. The proportion of the price that is
returned is computed on the basis of an estimate of how much of the product
life the customer has lost due to failure of the product. In the case of an auto-
mobile tire, a blowout renders the tire unusable. If one occurs, it is common
to measure the depth of the tire tread that remains and to use the ratio of
the remaining tread depth to the original tread depth to determine the tire
life lost due to the failure. The formula for converting the lost life into a cash
settlement is usually specific to the manufacturer.
The full replacement warranty is a guarantee to repair or replace a failed
product (or component of a product) in order that the product be as good as
new. A key feature of the full replacement warranty is the definition of the
time limit on its applicability. For some products, that warranty is in force for
a fixed period of time from the date of product purchase. A 3-year warranty
on an automobile and a 1-year warranty on a stereo are examples. In the case
of the auto, the frequency of repair does not change the termination date of
the warranty.
For some products, the full replacement warranty is renewed when the
product is replaced under the warranty agreement. For example, a por-
table stereo that fails prior to the completion of its 1-year warranty period
might be replaced with another copy of the product that is also warranted
for 1 year. Most durable consumer goods carry one of the two types of
warranties.
The choice of which type of warranty to offer is made by the manufac-
turer and is highly influenced by the nature of the product and its reliability.
\[ C = \gamma P \]
Next, assume that the likelihood of product sales is enhanced by the
warranty and that this can be modeled using
\[ E[\text{demand} \mid T_w] = d(T_w) = \left(u_1 + u_2 T_w^{r}\right) \]
where 0 ≤ r < 1. The expected profit realized from product sales adjusted by
the cost of warranty support is
where
\[ E[N(T_w)] = \frac{1}{\bar{F}_T(T_w)} - 1 \tag{14.21} \]
the number of copies of the product used in order to achieve a life duration
in excess of the warranty interval is geometric with parameter FT (Tw ), the
survivor function for the product life distribution. The expected value of
the number of replacements, N(Tw), is one less than the number of copies
used. Substituting the demand and cost expressions along with E[N(Tw)] into
Equation 14.20 yields the following expected profit model:
\[ E[\text{profit}] = \left((P - \gamma P) - \left(\frac{1}{\bar{F}_T(T_w)} - 1\right)(\gamma P + \delta P)\right)\left(u_1 + u_2 T_w^{r}\right) \tag{14.22} \]
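\[
\begin{aligned}
\frac{dE[\text{profit}]}{dT_w} &= \left((1-\gamma)P - \left(\frac{1}{\bar{F}_T(T_w)} - 1\right)(\gamma+\delta)P\right) r u_2 T_w^{r-1} - \left(\left(\frac{f_T(T_w)}{\left(\bar{F}_T(T_w)\right)^2}\right)(\gamma+\delta)P\right)\left(u_1 + u_2 T_w^{r}\right) \\
&= \left((1-\gamma) - \left(\frac{1}{\bar{F}_T(T_w)} - 1\right)(\gamma+\delta)\right) P r u_2 T_w^{r-1} - \left(\left(\frac{z_T(T_w)}{\bar{F}_T(T_w)}\right)(\gamma+\delta)\right) P\left(u_1 + u_2 T_w^{r}\right)
\end{aligned}
\]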
Setting this derivative equal to zero allows us to divide out the price, P, and
then applying the hazard function identity, we obtain
\[ \left((1-\gamma) - \left(\frac{F_T(T_w)}{\bar{F}_T(T_w)}\right)(\gamma+\delta)\right) r u_2 T_w^{r-1} - \left(\left(\frac{z_T(T_w)}{\bar{F}_T(T_w)}\right)(\gamma+\delta)\right)\left(u_1 + u_2 T_w^{r}\right) = 0 \tag{14.23} \]
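Rather than solving the first-order condition directly, the expected profit for the renewing full replacement warranty can be evaluated on a grid of warranty lengths. The sketch below assumes a Weibull life distribution and uses entirely hypothetical parameter values, including the production cost fraction (written gamma), the administrative cost fraction (written delta), and the demand parameters u1, u2, and r; none of them are the values behind the example in the text.

```python
import math

def expected_profit(tw, price, gamma, delta, u1, u2, r, shape, scale):
    """Expected profit for a renewing full replacement warranty of length tw."""
    survivor = math.exp(-((tw / scale) ** shape))      # Weibull survivor function
    replacements = 1.0 / survivor - 1.0                # expected replacements
    margin = (price - gamma * price) - replacements * (gamma * price + delta * price)
    demand = u1 + u2 * tw ** r
    return margin * demand

# Hypothetical inputs: price, cost fractions, demand curve, Weibull life.
params = dict(price=100.0, gamma=0.6, delta=0.05, u1=1.0, u2=0.2, r=0.5,
              shape=2.0, scale=200.0)
grid = [0.5 * k for k in range(1, 301)]                # warranty lengths 0.5..150
best_tw = max(grid, key=lambda tw: expected_profit(tw, **params))
best_profit = expected_profit(best_tw, **params)
```

A grid search is crude but robust; once the region of the maximum is located, the first-order condition can be solved there with a root finder if more precision is needed.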
FIGURE 14.1
Expected profit function for a full replacement warranty example. [Vertical axis: expected profit, from 22.95 to 23.25; horizontal axis: warranty duration, from 45 to 75.]
Next, consider the situation in which the warranty has a fixed duration
during which replacement (repair) service is provided but the duration of
the warranty is not extended. For this case, the cost of meeting the warranty
commitment for each unit of product is defined by the renewal function
based on the product life distribution function. If it is again assumed that
demand depends on the duration of the warranty as expressed by d(Tw), the
expected profit function is
\[ E[\text{profit}] = \left((P - \gamma P) - M_F(T_w)(\gamma P + \delta P)\right)\left(u_1 + u_2 T_w^{r}\right) \tag{14.24} \]
To identify the optimal warranty policy time, we again use the derivative
\[ \frac{dE[\text{profit}]}{dT_w} = \left((P - \gamma P) - M_F(T_w)(\gamma P + \delta P)\right) r u_2 T_w^{r-1} + \left(-m_F(T_w)(\gamma P + \delta P)\right)\left(u_1 + u_2 T_w^{r}\right) \tag{14.25} \]
which also is set equal to zero and analyzed numerically. For the same
parameter values as those used for the full replacement warranty example,
the optimal policy time is Tw = 57.5 weeks.
Now, under a pro rata warranty, the residual value of a failed unit may be
represented as
\[ 1 - \left(\frac{t}{T_w}\right)^{a} \]
which reduces to (Tw − t)/Tw when a = 1,
and this value is returned to any customer that decides to repurchase the
product. This form permits either linear or nonlinear reduction in warranty
value and also allows that the maximum warranty value be any fraction of
the purchase price. Using this expression, the repurchase price is
\[ P - \left(1 - \left(\frac{t}{T_w}\right)^{a}\right)P \]
so the consequence of servicing the warranty is that the revenue from the
repurchase is
\[ P - \left(1 - \left(\frac{t}{T_w}\right)^{a}\right)P - C - \delta P \tag{14.27} \]
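\[
\begin{aligned}
E[\text{profit}] &= \left((P - C) + \int_0^{T_w}\left(P\left(\frac{t}{T_w}\right)^{a} - C - \delta P\right)\eta\left(\frac{t}{T_w}\right)^{s} f_T(t)\,dt\right)\left(u_1 + u_2 T_w^{r}\right) \\
&= P\left((1-\gamma) + \int_0^{T_w}\left(\left(\frac{t}{T_w}\right)^{a} - (\gamma+\delta)\right)\eta\left(\frac{t}{T_w}\right)^{s} f_T(t)\,dt\right)\left(u_1 + u_2 T_w^{r}\right)
\end{aligned}
\tag{14.28}
\]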
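\[
\begin{aligned}
\frac{d}{dT_w}E[\text{profit}] ={}& P\Bigg[\Bigg(\big(1 - (\delta+\gamma)\big)\eta f_T(T_w) \\
& - \int_0^{T_w}\left(\left(\frac{a t^{a}}{T_w^{a+1}}\right)\eta\left(\frac{t}{T_w}\right)^{s} + \left((t/T_w)^{a} - (\gamma+\delta)\right)\eta\left(\frac{s t^{s}}{T_w^{s+1}}\right)\right) f_T(t)\,dt\Bigg)\left(u_1 + u_2 T_w^{r}\right) \\
& + r u_2 T_w^{r-1}\left((1-\gamma) + \int_0^{T_w}\left(\left(\frac{t}{T_w}\right)^{a} - (\gamma+\delta)\right)\eta\left(\frac{t}{T_w}\right)^{s} f_T(t)\,dt\right)\Bigg] = 0
\end{aligned}
\tag{14.29}
\]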
For the numerical values of the previous examples, we add s = 0.2, a = 0.1, and
η = 0.8. The model solution is a warranty policy time of 51.5 weeks.
Naturally, there are many possible variations to the models described
here. One extension that is often considered worthwhile is to apply a dis-
count factor to future cash flows. Depending upon the application, this and
other modifications may be reasonable. The modeling format should be the
same and the result should be an effective approach to the selection of a
warranty policy.
\[ m = \alpha T^{\beta} \tag{14.30} \]
where
T is the total accumulated test duration
α is a scale parameter that depends on the initial failure intensity
β is the rate of growth
\[ \lambda_j = \lambda_u + \sum_{i=1}^{j-1}\sum_{k=1}^{K}(1 - d_{ik})\lambda_{ck} \tag{14.32} \]
where
λu is the hazard rate associated with the uncorrectable failure modes
λ ck is the hazard rate for the kth correctable failure mode
dik is the effectiveness in modifying the kth failure mode of the engineering
change implemented at the end of cycle i
K is the number of correctable failure modes
TABLE 14.4
Test Data from Table XIV of Mil. HDBK 189C
Type B Mode    First Observed Failure Time    Ni    di
1 15.04 2 0.67
2 25.26 3 0.72
3 47.46 2 0.77
4 53.96 2 0.77
5 56.42 3 0.87
6 99.57 2 0.92
7 100.31 1 0.50
8 111.99 3 0.85
9 125.48 3 0.89
10 133.43 4 0.74
11 192.66 1 0.70
12 249.15 2 0.63
13 285.01 1 0.64
14 379.43 1 0.72
15 388.97 1 0.69
16 395.25 1 0.46
Totals 32 11.54
are shown in Table 14.4. The data shown are based on a simulation of a
TAAF process that includes 10 failures due to Type A failure modes and
32 failures caused by 16 out of 100 possible Type B failure modes. The total
test duration is 400 hours. Note that there are multiple failures for some
of the failure modes and that failure mode correction effectiveness factors
are listed.
Assuming that the expected value of the failure intensity function follow-
ing the TAAF process is
\[ E[\lambda(T)] = \lambda_u + \sum_{i=1}^{K}(1 - d_i)\lambda_i + \bar{d}\,\mu(T) \tag{14.33} \]
the maximum likelihood estimates (which are not unbiased) for the failure
intensity model of Equation 14.33 are
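\[ \hat{\beta} = \frac{m}{\sum_{i=1}^{m}\ln\left(\dfrac{T}{t_i}\right)} = \frac{m}{m\ln T - \sum_{i=1}^{m}\ln t_i} \qquad \hat{\lambda}_i = \frac{N_i}{T} \qquad \hat{\lambda}_u = \frac{N_A}{T} \tag{14.34} \]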
where
m is the number of Type B failure modes that are observed
d̄ is the average failure mode correction factor
μ(T) is the failure intensity for new Type B failure modes that arise during
testing
T is the total test duration
μ(T) is estimated by μ̂(T) = mβ̂/T and the estimate for the projected failure
intensity function is
\[ \hat{E}[\lambda(T)] = \frac{1}{T}\left(N_A + \sum_{i=1}^{m}(1 - d_i)N_i + \hat{\beta}\sum_{i=1}^{m} d_i\right) \tag{14.35} \]
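The projection can be computed directly from the data of Table 14.4 together with the test duration of 400 hours and the 10 Type A failures mentioned above. The following sketch evaluates the estimate of β from the first observed failure times and then the projected failure intensity of Equation 14.35; the variable names are hypothetical.

```python
import math

T = 400.0     # total test duration (hours)
n_a = 10      # failures due to Type A (uncorrected) failure modes

# Table 14.4: first observed failure time, failure count, and correction
# effectiveness for each of the 16 observed Type B failure modes.
first_times = [15.04, 25.26, 47.46, 53.96, 56.42, 99.57, 100.31, 111.99,
               125.48, 133.43, 192.66, 249.15, 285.01, 379.43, 388.97, 395.25]
counts = [2, 3, 2, 2, 3, 2, 1, 3, 3, 4, 1, 2, 1, 1, 1, 1]
effectiveness = [0.67, 0.72, 0.77, 0.77, 0.87, 0.92, 0.50, 0.85,
                 0.89, 0.74, 0.70, 0.63, 0.64, 0.72, 0.69, 0.46]

m = len(first_times)
beta_hat = m / sum(math.log(T / t) for t in first_times)
mu_hat = m * beta_hat / T     # estimated intensity of new Type B modes

residual = sum((1.0 - d) * n for d, n in zip(effectiveness, counts))
projected = (n_a + residual + beta_hat * sum(effectiveness)) / T
```

The three terms inside the parentheses correspond to failures that are never corrected, the residual failures of corrected modes, and the contribution of Type B modes that have not yet been seen.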
where
Ts is the overall system life length
T1 is the time at which the first component failure occurs
T2 is the residual life length of the system following the first component
failure
If this model is applied to a shared load system, one often assumes that the
individual component life lengths are independent and it is only the sharing
of the load that produces the dependence at the system level. In that case,
where y(T1) describes the effect of load sharing on the life distributions.
Similarly,
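\[
\begin{aligned}
\int_0^{t} f_{T_1}(u)\,\bar{F}_{T_2}(t-u)\,du &= \int_0^{t}\left(f_{X_1}(u)\bar{F}_{X_2}(u)\frac{\bar{F}_{X_2'}(t-u)}{\bar{F}_{X_2}(u)} + f_{X_2}(u)\bar{F}_{X_1}(u)\frac{\bar{F}_{X_1'}(t-u)}{\bar{F}_{X_1}(u)}\right)du \\
&= \int_0^{t}\left(f_{X_1}(u)\bar{F}_{X_2'}(t-u) + f_{X_2}(u)\bar{F}_{X_1'}(t-u)\right)du
\end{aligned}
\tag{14.38}
\]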
where the notation Xi¢ is intended to signify the fact that the hazard function
for the surviving component is different than when both components are
functioning.
An alternate general model for component dependence was proposed by
Barlow and Proschan [12]. They argued that a reasonable interpretation of cer-
tain shock models leads directly to a bivariate exponential system life distri-
bution. Their construction proceeds as follows. Suppose a system composed
of two components (in arbitrary structure) is subject to shocks from three
sources. The shocks from each source occur according to Poisson processes
such that Ni(t) is the number of shocks from source i occurring during (0, t)
and λi is the parameter for shock process i. Suppose further that a shock from
source 1 causes failure of component 1 with probability θ1, while a shock
from source 2 causes failure of component 2 with probability θ2. In the case
of shocks from source 3, each shock causes failure of both components with
probability θ11, causes failure of component 1 only with probability θ10, causes
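\[
\begin{aligned}
\Pr[T_1 > t_1, T_2 > t_2] ={}& \left(\sum_{k=0}^{\infty} e^{-\lambda_1 t_1}\frac{(\lambda_1 t_1)^k}{k!}(1-\theta_1)^k\right)\left(\sum_{k=0}^{\infty} e^{-\lambda_2 t_2}\frac{(\lambda_2 t_2)^k}{k!}(1-\theta_2)^k\right) \\
& \times\left(\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\left\{e^{-\lambda_3 t_1}\frac{(\lambda_3 t_1)^j}{j!}\theta_{00}^{\,j}\right\}\left\{e^{-\lambda_3(t_2-t_1)}\frac{(\lambda_3(t_2-t_1))^k}{k!}(\theta_{00}+\theta_{10})^k\right\}\right)
\end{aligned}
\tag{14.39}
\]
where
\[ \lambda_3^{*} = \lambda_3\theta_{11} \]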
Device life is a resource that may be best represented and for which the
consumption may best be measured using a two- (or higher-) dimensional
vector and the quantities that comprise the vector are specific to the equip-
ment. Years of usage and mileage are not the only two quantities that might
describe device longevity. We use the terms age and use here, but these
terms are generic and may represent quite different measures than dura-
tion of ownership and distance traveled. In the example of an automobile
tire, age might correspond to accumulated mileage and usage might be
measured as tread loss. Even more complicated measures such as current
flow and thermal history may be appropriate for some integrated circuits.
There are basically two ways to approach the definition of a bivariate reli-
ability model. The traditional approach has been to define the second vari-
able, usage, as a function of time so that the bivariate model can be collapsed
into a univariate model. Models of this type are discussed first. In fact, we
have already discussed some, since the cumulative damage models portray
equipment reliability in terms of deterministically defined deterioration
occurring at random points in time and the proportional hazards models
treat age as a deterministic function of use covariates. Clearly, the nondeter-
ministic cases are more interesting.
A second approach that is more recent and was first treated by Singpurwalla
and Wilson [100] and subsequently by Yang and Nachlas [101] is to develop
reliability functions that are truly bivariate. These are presented subsequently.
where f X(t)(x) is the density function for the distribution on the transition in
state over time.
$$z_{T,X}(t, x) = z_T(t) + \eta x \tag{14.44}$$
then the Laplace transform of Equation 14.43 with respect to the state vari-
able X is
$$\bar{F}^*_{T,X}(t, s) = \exp\left(\frac{\lambda}{\eta}\int_s^{s+\eta t} f^*_{X(t)}(v)\,dv - \lambda t - \int_0^t z_T(v)\,dv\right) \tag{14.45}$$
Then, taking the state transition process to be a gamma process, the col-
lapsed time-dependent life distribution is obtained as the Laplace transform
evaluated at s = 0:
$$\bar{F}_T(t) = \bar{F}^*_{T,X}(t, 0) = \exp\left(\frac{\lambda}{\eta}\int_0^{\eta t} f^*_{X(t)}(v)\,dv - \lambda t - \int_0^t z_T(v)\,dv\right) \tag{14.46}$$
Nearly all other reasonable choices of constituents of this model are more
difficult to evaluate, but most that are practically interesting can be managed
numerically.
longevity variables is portrayed. Assume that the time and use to failure are
related by the function u = B(t) and that the stochastic nature of this relation-
ship can be represented by treating one or more of the parameters of B(t) as
random variables.
The interpretation of the function B(t) is that across a population of devices,
the accumulated usage by age t is B(t). This is equivalent to saying that the
mileage traveled by 2-year-old cars is B(t = 2). Of course, we impose a prob-
ability distribution on B(t) to model its dispersion. To illustrate this construc-
tion, we consider the four example forms here:
1. B(t) = αt + β
2. B(t) = αt^2 + βt + γ
3. B(t) = αt^n
4. B(t) = (e^{αt} − 1)/(e^{αt} + β)
where the fourth form is the logistic model analyzed by Eliashberg et al.
[102]. In each case, we introduce randomness into the function by treating
the parameter α as a random variable having distribution πα(·). This imposes
random variation on the extent of use experienced by any age. Consequently,
both age and usage at failure are random variables. Certainly, there may be
many other functional forms that may be defined and that may be repre-
sentative of observed behavior. The analytical methods described here may
apply to those other forms as well.
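As a rough illustration of this construction, the following sketch evaluates the four forms of B(t) and samples usage at a fixed age by drawing α at random. The function names, the gamma choice for πα(·), and all numerical values are assumptions made for illustration only; they are not taken from the text.

```python
import math
import random

def usage(form, t, alpha, beta=0.0, gamma=0.0, n=2):
    """Accumulated usage B(t) for the four example forms."""
    if form == 1:
        return alpha * t + beta
    if form == 2:
        return alpha * t**2 + beta * t + gamma
    if form == 3:
        return alpha * t**n
    if form == 4:
        return (math.exp(alpha * t) - 1.0) / (math.exp(alpha * t) + beta)
    raise ValueError("form must be 1, 2, 3, or 4")

def sample_usage(form, t, draw_alpha, size, rng, **params):
    """Draw `size` usage values at age t by sampling alpha from pi_alpha."""
    return [usage(form, t, draw_alpha(rng), **params) for _ in range(size)]

def draw_alpha(rng):
    # Hypothetical pi_alpha: gamma distribution with shape 4 and scale 3 (mean 12)
    return rng.gammavariate(4.0, 3.0)

rng = random.Random(1)
# Usage of 2-year-old devices under form (1) with beta = 0.5 (illustrative values)
usage_at_2 = sample_usage(1, 2.0, draw_alpha, 20000, rng, beta=0.5)
mean_usage = sum(usage_at_2) / len(usage_at_2)   # should be near 12*2 + 0.5
```

Because only α is random, the dispersion of usage at a given age comes entirely from πα(·); a different choice of πα(·) changes only `draw_alpha`.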
The use of the distribution πα(·) to construct the marginal probability distribu-
tion on usage is accomplished using well-known methods. In general, as indi-
cated by Eliashberg et al. [102], we may construct the marginal density on use as
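$$f_U(u) = f_{B(t)}(u) = \pi_\alpha(\alpha(u))\,\frac{d\alpha(u)}{du} \tag{14.47}$$
For the first form, solving u = αt + β for α gives
$$\alpha(u) = \frac{u-\beta}{t} \quad\text{and}\quad \frac{d\alpha(u)}{du} = \frac{1}{t}$$
so
$$f_U(u) = \frac{1}{t}\,\pi_\alpha\!\left(\frac{u-\beta}{t}\right) \tag{14.48}$$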
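The change of variable in Equation 14.48 can be checked by simulation. The sketch below assumes a gamma density for πα(·) and illustrative parameter values (none of which come from the text), samples U = αt + β, and compares the empirical probability of a narrow usage band with the density of Equation 14.48 times the band width.

```python
import math
import random

def gamma_pdf(a, shape, scale):
    """Gamma density, used here as a hypothetical pi_alpha."""
    if a <= 0.0:
        return 0.0
    return a**(shape - 1) * math.exp(-a / scale) / (math.gamma(shape) * scale**shape)

def usage_density(u, t, beta, shape, scale):
    """Equation 14.48: f_U(u) = (1/t) * pi_alpha((u - beta) / t)."""
    return gamma_pdf((u - beta) / t, shape, scale) / t

t, beta, shape, scale = 2.0, 0.5, 4.0, 3.0     # illustrative values
rng = random.Random(7)
draws = [rng.gammavariate(shape, scale) * t + beta for _ in range(200000)]

# Empirical probability of a narrow band versus density * width
band_lo, band_hi = 20.0, 21.0
empirical = sum(band_lo <= u < band_hi for u in draws) / len(draws)
predicted = usage_density(0.5 * (band_lo + band_hi), t, beta, shape, scale) * (band_hi - band_lo)
```

The two numbers agree only up to sampling error and the midpoint approximation over the band, so a narrower band needs more draws.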
$$f_{T|U}(t) = z_{T|U}(t)\exp\left\{-\int_0^t z_{T|U}(x)\,dx\right\} = z_{T|B(t)}(t)\exp\left\{-\int_0^t z_{T|B(t)}(x)\,dx\right\} \tag{14.50}$$
We use this form specifically so that we can focus upon the hazard function
in the definition of the failure model. We assume that the bivariate device
failure hazard function may be stated as
so that the definitions of the functions λ(t), η(u), and B(t) determine the hazard
and ultimately the bivariate life distribution. Here, we treat the simplest con-
ceivable form of the hazard function. More intricate, and perhaps realistic,
forms should be studied. Thus, we assume that λ(t) and η(u) are simple linear
functions. That is, we use λ(t) = λt and η(u) = ηu.
Under this modeling format, the bivariate life distribution corresponding
to the form (1) earlier is obtained by constructing
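$$z_{T|U}(x) = z_{T|B(t)}(x) = \lambda x + \eta\left(\left(\frac{u-\beta}{t}\right)x + \beta\right) \tag{14.52}$$
$$f_{T,U}(t, u) = \frac{\lambda t + \eta u}{t}\exp\left\{-\frac{\eta(u+\beta)}{2}\,t - \frac{\lambda}{2}\,t^2\right\}\pi_\alpha\!\left(\frac{u-\beta}{t}\right) \tag{14.53}$$
$$f_{T,U}(t, u) = \frac{\lambda t + \eta u}{t^2}\exp\left\{-\frac{\eta(u+2\gamma)}{3}\,t\right\}\exp\left\{-\frac{3\lambda+\eta\beta}{6}\,t^2\right\}\pi_\alpha\!\left(\frac{u-\beta t-\gamma}{t^2}\right) \tag{14.54}$$
$$f_{T,U}(t, u) = \frac{\lambda t + \eta u}{t^n}\exp\left\{-\frac{\eta u}{n+1}\,t - \frac{\lambda}{2}\,t^2\right\}\pi_\alpha\!\left(\frac{u}{t^n}\right) \tag{14.55}$$
and
$$f_{T,U}(t, u) = \frac{(1+\beta)(\lambda t + \eta u)}{t(1-u)(1+\beta u)}\,\pi_\alpha\!\left(\frac{1}{t}\ln\frac{1+\beta u}{1-u}\right)\exp\left\{\frac{\eta}{\beta}\,t - \frac{\lambda}{2}\,t^2 - \eta\,\frac{\beta+1}{\beta}\,\frac{\ln\dfrac{1}{1-u}}{\ln\dfrac{1+\beta u}{1-u}}\,t\right\} \tag{14.56}$$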
for cases (2), (3), and (4), respectively. Note that in the case of (4), the definition
of the use function limits the variable U to [0, 1] so the functions may require
rescaling for some applications. Also, in cases (1) and (2), the forms of the
functions B(t) allow a nonzero minimum value for usage.
Finally, observe further that all four models are well defined and require
only the specification of the density πα(·) to be complete bivariate life distribu-
tions. On the other hand, for each of them, it is unlikely that a closed-form
expression can be obtained for the marginal distribution on age at failure.
In any case, the previous examples illustrate the construction of a model in
which the usage variable is a stochastic function of the age variable.
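The exponents in Equations 14.53 and 14.54 are the integrals of the conditional hazard λx + ηB(x) over [0, t], with α replaced by its value implied by (t, u). The following sketch compares a midpoint-rule integral with those closed forms for forms (1) and (2). The parameter values and function names are illustrative assumptions, not values from the text.

```python
def cumulative_hazard_numeric(t, lam, eta, usage_path, steps=20000):
    """Midpoint-rule integral of z(x) = lam*x + eta*B(x) over [0, t]."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += lam * x + eta * usage_path(x)
    return total * h

lam, eta, beta, gam = 0.3, 0.2, 0.5, 0.1    # illustrative parameters
t, u = 2.0, 5.0                             # longevity vector at failure

# Form (1): B(x) = alpha*x + beta with alpha = (u - beta)/t
a1 = (u - beta) / t
numeric1 = cumulative_hazard_numeric(t, lam, eta, lambda x: a1 * x + beta)
closed1 = eta * (u + beta) * t / 2 + lam * t**2 / 2                        # exponent in 14.53

# Form (2): B(x) = alpha*x^2 + beta*x + gam with alpha = (u - beta*t - gam)/t^2
a2 = (u - beta * t - gam) / t**2
numeric2 = cumulative_hazard_numeric(t, lam, eta, lambda x: a2 * x**2 + beta * x + gam)
closed2 = eta * (u + 2 * gam) * t / 3 + (3 * lam + eta * beta) * t**2 / 6  # exponents in 14.54
```

The same numerical integral can stand in for the closed form when B(t), λ(t), or η(u) is chosen so that the integral is not available analytically.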
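$$\bar{F}_{T,U}(t, u) = e^{-(\lambda t + \eta u)}\left(1 + \rho\,(1 - e^{-\lambda t})(1 - e^{-\eta u})\right) \tag{14.57}$$
$$f_{T,U}(t, u) = \lambda\eta\, e^{-(\lambda t + \eta u)}\left(1 + \rho\left\{1 - 2e^{-\lambda t} - 2e^{-\eta u} + 4e^{-(\lambda t + \eta u)}\right\}\right) \tag{14.58}$$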
and the marginal densities are the constituent exponentials regardless of the
value of ρ.
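The claim about the marginal densities can be checked numerically. This sketch integrates the joint density of Equation 14.58 over usage and compares the result with the exponential density λe^(−λt) for several values of ρ. The parameter values, integration limit, and function names are assumptions chosen for illustration.

```python
import math

def gumbel_density(t, u, lam, eta, rho):
    """Joint density of Equation 14.58."""
    et, eu = math.exp(-lam * t), math.exp(-eta * u)
    return lam * eta * et * eu * (1.0 + rho * (1 - 2 * et - 2 * eu + 4 * et * eu))

def marginal_age_density(t, lam, eta, rho, upper=60.0, steps=60000):
    """Integrate the joint density over usage on [0, upper] with the midpoint rule."""
    h = upper / steps
    return h * sum(gumbel_density(t, (i + 0.5) * h, lam, eta, rho) for i in range(steps))

lam, eta, t = 0.5, 1.2, 1.7                 # illustrative values
target = lam * math.exp(-lam * t)           # exponential marginal density at t
marginals = {rho: marginal_age_density(t, lam, eta, rho) for rho in (-1.0, 0.0, 0.6, 1.0)}
```

Each entry of `marginals` should match `target` to within the integration error, whatever the value of ρ.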
A second model that is an obvious choice is the bivariate normal. The den-
sity function for this model is well known so it is not restated here. As is also
well known, the marginal densities are normal.
One final model that we wish to consider here is the one stated by Hunter
[104] in a queueing context but also consistent with reliability interpretations:
$$f_{T,U}(t, u) = \frac{\lambda\eta}{1-\rho}\, I_0\!\left(\frac{2\sqrt{\rho\lambda\eta t u}}{1-\rho}\right)\exp\left\{-\frac{\lambda t + \eta u}{1-\rho}\right\} \tag{14.59}$$
where In(·) is the modified Bessel function of order n. The marginal densities
for this model are not obvious.
To summarize the construction to this point, we have defined examples of
two classes of models that might be used to portray the dispersion in equip-
ment longevity as defined using two variables. We next examine the general
probability concepts commonly associated with reliability analysis and use
some of the suggested example forms to illustrate the concepts discussed.
Subsequently, we construct reliability and maintenance models using the
probability concepts and some of the example model forms.
$$F_{T,U}(t, u) = \Pr[T \le t,\ U \le u] \tag{14.60}$$
One may say that this probability corresponds to the proportion of the popu-
lation of devices that have longevity vector values at failure that do not exceed
(t, u) in either vector component. We emphasize this definition because of the
fact that for a bivariate distribution, probability is generally computed over
rectangles such as [t1 ≤ T ≤ t2, u1 ≤ U ≤ u2]. Consequently, for any specific lon-
gevity vector, (t, u), the range of age and use values implies that there are four
rectangles in the (T, U) plane for which probabilities may be meaningfully
calculated. Referring to Figure 14.2, observe that in addition to the rectangle
used in Equation 14.60, there are the rectangles Pr[T ≤ t, U ≥ u], Pr[T ≥ t,
U ≤ u], and Pr[T ≥ t, U ≥ u]. It is not obvious, but relative to the cumulative
probability FT,U(t, u), the probabilities
$$\Pr[T \le t,\ U > u] = \int_0^t\!\int_u^\infty f_{T,U}(s, v)\,dv\,ds \tag{14.61}$$
and
$$\Pr[T > t,\ U \le u] = \int_t^\infty\!\int_0^u f_{T,U}(s, v)\,dv\,ds \tag{14.62}$$
[Figure 14.2: the (T, U) plane partitioned at the point (t, u) into four regions labeled Pr[T ≤ t, U ≤ u], Pr[T ≤ t, U > u], Pr[T > t, U ≤ u], and Pr[T > t, U > u].]
FIGURE 14.2
Bivariate probability distributions. (Reprinted with permission from Yang, S.C. and Nachlas,
J.A., Bivariate reliability and maintenance planning models, IEEE Trans. Reliab., 50(1), 26–35.
Copyright 2001 IEEE.)
failure ages exceed t or because their failure usages exceed u. We do not have
informative names for the probabilities represented by Equations 14.61 and
14.62 but have considered names such as marginal survival probabilities.
A further point that is rather subtle is the fact that the reliability at (t, u)
does not include the probabilities represented by Equations 14.61 and 14.62.
The reliability at longevity vector value (t, u) corresponds to the proportion
of the population for which failure age exceeds t and failure usage exceeds u.
Therefore, the reliability function corresponding to FT,U(t, u) is
$$\bar{F}_{T,U}(t, u) = \Pr[T \ge t,\ U \ge u] = \int_t^\infty\!\int_u^\infty f_{T,U}(s, v)\,dv\,ds \tag{14.63}$$
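To make the distinction between the four rectangles concrete, the sketch below computes all four probabilities for the bivariate exponential reliability function of Equation 14.57, using its survival marginals Pr[T > t] and Pr[U > u]. The parameter values and function names are illustrative assumptions.

```python
import math

def reliability(t, u, lam, eta, rho):
    """Equation 14.57: Pr[T > t, U > u]."""
    return math.exp(-(lam * t + eta * u)) * (
        1.0 + rho * (1 - math.exp(-lam * t)) * (1 - math.exp(-eta * u)))

def rectangle_probabilities(t, u, lam, eta, rho):
    """Probabilities of the four rectangles defined by the point (t, u)."""
    both_survive = reliability(t, u, lam, eta, rho)
    age_survives = reliability(t, 0.0, lam, eta, rho)   # Pr[T > t]
    use_survives = reliability(0.0, u, lam, eta, rho)   # Pr[U > u]
    return {
        "T<=t,U<=u": 1.0 - age_survives - use_survives + both_survive,  # Equation 14.60
        "T<=t,U>u": use_survives - both_survive,                        # Equation 14.61
        "T>t,U<=u": age_survives - both_survive,                        # Equation 14.62
        "T>t,U>u": both_survive,                                        # Equation 14.63
    }

probs = rectangle_probabilities(1.0, 2.0, lam=0.4, eta=0.3, rho=0.5)
total = sum(probs.values())
```

With these values the reliability `probs["T>t,U>u"]` is strictly smaller than `1 - probs["T<=t,U<=u"]`; the difference is the sum of the two probabilities of Equations 14.61 and 14.62.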
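$$
\begin{aligned}
z_{T,U}(t, u) &= \lim_{\substack{\Delta t \to 0 \\ \Delta u \to 0}} \frac{\Pr[t \le T \le t + \Delta t,\ u \le U \le u + \Delta u]}{\Delta u\,\Delta t\,\Pr[T > t,\ U > u]} \\
&= \frac{1}{\bar{F}_{T,U}(t, u)}\left[\lim_{\substack{\Delta t \to 0 \\ \Delta u \to 0}} \frac{F_{T,U}(t + \Delta t, u + \Delta u)}{\Delta u\,\Delta t} - \lim_{\substack{\Delta t \to 0 \\ \Delta u \to 0}} \frac{F_{T,U}(t + \Delta t, u)}{\Delta u\,\Delta t}\right. \\
&\qquad\left. {} - \lim_{\substack{\Delta t \to 0 \\ \Delta u \to 0}} \frac{F_{T,U}(t, u + \Delta u)}{\Delta u\,\Delta t} + \lim_{\substack{\Delta t \to 0 \\ \Delta u \to 0}} \frac{F_{T,U}(t, u)}{\Delta u\,\Delta t}\right] \\
&= \frac{f_{T,U}(t, u)}{\bar{F}_{T,U}(t, u)}
\end{aligned}
\tag{14.66}
$$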
rate (MIFR), and the application of that definition to the bivariate life distri-
butions is that a distribution is MIFR if and only if
$$\frac{\bar{F}_{T,U}(t + s, u + v)}{\bar{F}_{T,U}(t, u)} \tag{14.67}$$
$$\frac{\bar{F}_T(t + s)}{\bar{F}_T(t)} \quad\text{and}\quad \frac{\bar{F}_U(u + v)}{\bar{F}_U(u)} \tag{14.68}$$
$$F_T(t) = 1 - e^{-\int_0^t z_T(x)\,dx}$$
$$f_T(t) = \frac{dF_T(t)}{dt} = z_T(t)\left(1 - F_T(t)\right)$$
$$f_{T,U}(t, u) = \frac{\partial^2 F_{T,U}(t, u)}{\partial t\,\partial u} = z_{T,U}(t, u)\,\bar{F}_{T,U}(t, u) \tag{14.69}$$
The solution for this equation has not yet been found so one may not
build a bivariate reliability model from the hazard as is done in the uni-
variate case.
A further question is how one computes the mean and other descriptive
measures for a bivariate longevity distribution. The answer is that as with
univariate distributions, one begins by constructing the moment-generating
function (or Laplace transform) and then obtains moments as successive
$$(X_n, Y_n) = \left(\sum_{i=1}^{n} T_i,\ \sum_{i=1}^{n} U_i\right) \tag{14.71}$$
Of course, by convention, (X0, Y0) = (0, 0). The renewal vector is illustrated
in Figure 14.3. Referring to the figure, we note that the number of renewals
at any coordinate point in the plane, say, (t, u), corresponds to the largest
value of n for which the nth renewal occurs on or before time t and usage u.
Therefore, it follows that the number of renewals by (t, u) is given by
$$N_{T,U}(t, u) = \sup\{n : n \ge 0,\ X_n \le t,\ Y_n \le u\} \tag{14.72}$$
$$P[N_{T,U}(t, u) = 0] = 1 - F_{T,U}(t, u) \tag{14.74}$$
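A small simulation may help fix the definition of the bivariate renewal count. The sketch below accumulates age and usage increments, counts renewals according to Equation 14.72, and compares the simulated P[N = 0] with 1 − F(t, u) from Equation 14.74. Independent exponential increments are assumed purely for convenience, so that F(t, u) factors; the rates and function names are hypothetical.

```python
import math
import random

def count_renewals(t, u, draw_pair, rng):
    """N(t, u) of Equation 14.72: number of renewals with X_n <= t and Y_n <= u."""
    x = y = 0.0
    n = 0
    while True:
        dt, du = draw_pair(rng)
        x, y = x + dt, y + du
        if x > t or y > u:      # the partial sums only increase, so we can stop here
            return n
        n += 1

lam, eta = 1.0, 0.5             # hypothetical rates

def draw_pair(rng):
    # Independent exponential age and usage increments (an assumption of this sketch)
    return rng.expovariate(lam), rng.expovariate(eta)

rng = random.Random(3)
t, u, runs = 1.5, 2.0, 50000
counts = [count_renewals(t, u, draw_pair, rng) for _ in range(runs)]
p_zero = counts.count(0) / runs
expected_p_zero = 1.0 - (1 - math.exp(-lam * t)) * (1 - math.exp(-eta * u))
mean_renewals = sum(counts) / runs   # Monte Carlo estimate of the renewal function at (t, u)
```

Replacing `draw_pair` with a sampler for a dependent bivariate model gives an estimate of the renewal function when no closed form is available.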
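[Figure 14.3: renewal points (X1, Y1), (X2, Y2), and (X3, Y3) in the age-usage plane, with age increments T1, T2, T3 along the horizontal axis and usage increments U1, U2, U3 along the vertical axis.]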
FIGURE 14.3
A bivariate renewal process. (Reprinted with permission from Yang, S.C. and Nachlas, J.A.,
Bivariate reliability and maintenance planning models, IEEE Trans. Reliab., 50(1), 26–35.
Copyright 2001 IEEE.)
which corresponds to the univariate form. As in the case of the univariate func-
tion, the recursive statement of Equation 14.75 is the key integral renewal equation:
$$M_F(t, u) = F_{T,U}(t, u) + \int_0^t\!\int_0^u M_F(t - x, u - y)\,dF_{T,U}(x, y) \tag{14.76}$$
and this function is the basis for analysis of the renewal process. As a final
point, note that assuming FT,U(t, u) is absolutely continuous implies that the
renewal density exists and is
$$m_F(t, u) = \frac{\partial^2}{\partial t\,\partial u} M_F(t, u) = \sum_{n=1}^{\infty} f^{(n)}_{T,U}(t, u) = f_{T,U}(t, u) + \int_0^t\!\int_0^u m_F(t - x, u - y)\, f_{T,U}(x, y)\,dx\,dy \tag{14.77}$$
and
$$F^*_{T,U}(s, v) = \frac{1}{sv}\, f^*_{T,U}(s, v) \tag{14.79}$$
Using these forms in the analysis of the key renewal equation leads to
and
and the Laplace transform of the renewal function for the bivariate normal
distribution is
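$$M^*_F(s, v) = \frac{\exp\left[-s\mu_t - v\mu_u + \tfrac{1}{2}\left(s^2\sigma_t^2 + 2\rho s v\sigma_t\sigma_u + v^2\sigma_u^2\right)\right]}{sv\left[1 - \exp\left[-s\mu_t - v\mu_u + \tfrac{1}{2}\left(s^2\sigma_t^2 + 2\rho s v\sigma_t\sigma_u + v^2\sigma_u^2\right)\right]\right]} \tag{14.84}$$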
Unfortunately, the inverse transforms for these expressions cannot be con-
structed in closed form. For Equation 14.84, working with the transform of
the associated renewal density permits its inversion [104] to
$$m_f(t, u) = \frac{\lambda\eta}{1-\rho}\, I_0\!\left(\frac{2\sqrt{\lambda\eta t u}}{1-\rho}\right)\exp\left\{-\frac{\lambda t + \eta u}{1-\rho}\right\} \tag{14.85}$$
It is rare that the transform is invertible in closed form. For most of the bivar-
iate models, closed-form transform inversions are not available.
Next, consider the cases in which repair is no longer instantaneous.
Assume instead that repair effort extends over a bivariate interval that is
random. Assume further that the distribution function on the magnitude of
the repair effort is denoted by Gr(t, u) and is of the same family as the fail-
ure distribution. There is no justification for the assumption that the failure
and repair distributions are of the same family. The reason for using this
assumption here is the fact that it sometimes makes the analysis easier. In
addition, we observe that for cases in which it is appropriate, the repair dis-
tribution can easily be collapsed into a univariate form.
To construct renewal models for failure with noninstantaneous repair, we
obtain the convolution of the operating and repair intervals and then the
renewal function based on that convolution. As availability is the quantity of
interest for these cases, the renewal function is used to obtain the
availability function. That is, considering a longevity cycle to be the sum of
an operating interval and a repair interval, the cycle has distribution Hr(t, u)
that is obtained as the inverse transform of
Then using the same reasoning as for univariate models, we observe that a
device is available at coordinate point (t, u) if it experiences no failures prior
to that point, or else it is renewed at some earlier coordinate point and expe-
riences no further failures before (t, u). That is,
$$A(t, u) = \bar{F}_{T,U}(t, u) + \int_0^t\!\int_0^u \bar{F}_{T,U}(t - x, u - y)\,dM_H(x, y) \tag{14.87}$$
where MH(t, u) is obtained using Equation 14.61. Also comparable to the uni-
variate case is the fact that the availability function is ultimately obtained as
the inverse transform of
$$A^*(s, v) = \bar{F}^*_{T,U}(s, v)\left(1 + m^*_H(s, v)\right) = \frac{\bar{F}^*_{T,U}(s, v)}{1 - h^*(s, v)} \tag{14.88}$$
$$g_{T,U}(t, u) = \lambda_r\eta_r\, e^{-(\lambda_r t + \eta_r u)}\left(1 + \rho_r\left\{1 - 2e^{-\lambda_r t} - 2e^{-\eta_r u} + 4e^{-(\lambda_r t + \eta_r u)}\right\}\right) \tag{14.89}$$
and
so that
$$A^*(s, v) = \frac{a_1}{a_2 - a_3} \tag{14.91}$$
with
and
$$h^*_r(s, v) = \left(\frac{\lambda\eta}{(s + \lambda)(v + \eta) - sv\rho}\right)\left(\frac{\lambda_r\eta_r}{(s + \lambda_r)(v + \eta_r) - sv\rho_r}\right) \tag{14.92}$$
Then, using the probability identity of Equation 14.66 simplifies slightly the
construction of
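$$A^*(s, v) = \frac{\dfrac{1}{sv} - \dfrac{s + \lambda - 1}{s(s + \lambda)} - \dfrac{v + \eta - 1}{v(v + \eta)} + \dfrac{\lambda\eta}{(s + \lambda)(v + \eta) - sv\rho}}{1 - \left(\dfrac{\lambda\eta}{(s + \lambda)(v + \eta) - sv\rho}\right)\left(\dfrac{\lambda_r\eta_r}{(s + \lambda_r)(v + \eta_r) - sv\rho_r}\right)} \tag{14.93}$$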
Finally, for the bivariate normal models, we again use Equation 14.65 with
the result that
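$$f^*_{T,U}(s, v) = \exp\left[-s\mu_t - v\mu_u + \tfrac{1}{2}\left(s^2\sigma_t^2 + 2\rho s v\sigma_t\sigma_u + v^2\sigma_u^2\right)\right] \tag{14.95}$$
$$f^*_T(s) = \exp\left[-\mu_t s + \tfrac{1}{2}s^2\sigma_t^2\right] \quad\text{and}\quad f^*_U(v) = \exp\left[-\mu_u v + \tfrac{1}{2}v^2\sigma_u^2\right] \tag{14.96}$$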
and
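$$\cdots + \tfrac{1}{2}\left(s^2\left(\sigma_t^2 + \sigma_{t_r}^2\right) + 2sv\rho\left(\sigma_t\sigma_u + \sigma_{t_r}\sigma_{u_r}\right) + v^2\left(\sigma_u^2 + \sigma_{u_r}^2\right)\right)\Big] \tag{14.97}$$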
Exercises
14.1 A system has been monitored over 4000 hours and the following
sequence of failure times has been observed. Apply the Laplace test to
these data to decide whether or not it is homogeneous. Then estimate
the parameter(s) for the failure process model.
14.2 A system has been monitored over 6000 hours and the following
sequence of failure times has been observed. Apply the Laplace test to
these data to decide whether or not it is homogeneous. Then estimate
the parameter(s) for the failure process model.
14.3 A set of 3 identical systems has been monitored over T1 = 5000, T2 = 5000,
and T3 = 4000 hours and the sequence of observed failure times is shown
in the following table.
System 1 (n1 = 14)    System 2 (n2 = 12)    System 3 (n3 = 14)
      272.5649              350.5998              314.9211
      683.209               662.1392              594.3912
     1056.697              1145.495               924.9023
     1520.834              1463.103              1180.4
     2170.058              1806.611              1461.752
     2495.64               2353.335              1604.111
     2774.353              2759.114              1909.145
     3113.331              3320.393              2239.001
     3471.008              3714.615              2579.923
     3742.009              4052.621              2888.052
     3930.438              4285.35               3120.781
     4109.861              4707.776              3338.468
     4337.591                                    3635.817
     4854.601                                    3850.161
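As a hedged illustration of the kind of calculation these exercises call for, the sketch below computes a Laplace trend statistic for the first system in the table of Exercise 14.3. It uses the common time-truncated form of the statistic, which compares the mean failure time with the midpoint of the observation interval; this form and the 5% two-sided critical value are assumptions of the sketch and may differ in detail from the statement of the test earlier in the chapter.

```python
import math

def laplace_statistic(failure_times, observation_end):
    """Laplace trend statistic for failures observed over (0, observation_end]."""
    n = len(failure_times)
    mean_time = sum(failure_times) / n
    return (mean_time - observation_end / 2.0) / (
        observation_end * math.sqrt(1.0 / (12.0 * n)))

# Failure times of the first system in Exercise 14.3, observed over 5000 hours
system_1 = [272.5649, 683.209, 1056.697, 1520.834, 2170.058, 2495.64, 2774.353,
            3113.331, 3471.008, 3742.009, 3930.438, 4109.861, 4337.591, 4854.601]

u1 = laplace_statistic(system_1, 5000.0)
reject_homogeneity = abs(u1) > 1.96     # two-sided comparison at the 5% level
```

A statistic near zero is consistent with a homogeneous process; large positive values suggest failures concentrated late in the interval and large negative values suggest failures concentrated early.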
Appendix A: Numerical Approximations
and the calculation yields a result with |ε(z)| < 1.5 × 10^-7. This algorithm was used
to generate the table of cumulative normal probabilities that is in Appendix D.
There is also an algorithm for computing quantiles in the tail of the distri-
bution. Specifically, for γ in the range 0 ≤ γ ≤ 0.05, Abramowitz and Stegun
[59] indicate that
$$z_{1-\gamma} \approx t - \frac{c_0 + c_1 t + c_2 t^2}{1 + e_1 t + e_2 t^2 + e_3 t^3} + \varepsilon(\gamma) \tag{A.5}$$
where
c0 = 2.515517
c1 = 0.802853
c2 = 0.010328
e1 = 1.432788
e2 = 0.189269
e3 = 0.001308
and
$$t = \sqrt{\ln\left(\frac{1}{\gamma^2}\right)} \tag{A.6}$$
This approximation yields results for which |ε(γ)| < 4.5 × 10^-4.
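A minimal sketch of Equations A.5 and A.6 follows, with the error term ε(γ) dropped. The function name and the range check are choices made for this sketch; the coefficients are those listed above.

```python
import math

C = (2.515517, 0.802853, 0.010328)   # c0, c1, c2
E = (1.432788, 0.189269, 0.001308)   # e1, e2, e3

def upper_tail_quantile(gamma):
    """Approximate z_{1-gamma} using Equations A.5 and A.6 (error term omitted)."""
    if not 0.0 < gamma <= 0.05:
        raise ValueError("this sketch covers the tail range 0 < gamma <= 0.05")
    t = math.sqrt(math.log(1.0 / gamma**2))                    # Equation A.6
    numerator = C[0] + C[1] * t + C[2] * t**2
    denominator = 1.0 + E[0] * t + E[1] * t**2 + E[2] * t**3
    return t - numerator / denominator                         # Equation A.5

z_975 = upper_tail_quantile(0.025)   # compare with the familiar value 1.960
z_950 = upper_tail_quantile(0.05)    # compare with the familiar value 1.645
```

Both results should agree with tabled normal quantiles to within the stated bound of 4.5 × 10^-4.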
$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt \tag{A.7}$$
$$\Gamma(x) = (x - 1)! \tag{A.8}$$
For cases in which x is not an integer, Abramowitz and Stegun [59] provide
the approximation
$$\psi(x) = \frac{d}{dx}\ln\Gamma(x) = \frac{\dfrac{d}{dx}\Gamma(x)}{\Gamma(x)} \tag{A.10}$$
Tables of the psi function are available [59]. On the other hand, approximate
numerical evaluation of the psi function is reasonably straightforward except
for the fact that precise values may require that a large number of terms be
included in the calculation.
The general series expansion for the psi function is
$$\psi(x + 1) = -\gamma + \sum_{n=2}^{\infty} (-1)^n \zeta(n)\, x^{n-1}, \qquad |x| < 1 \tag{A.11}$$
where
$$\zeta(n) = \sum_{k=1}^{\infty} k^{-n} \tag{A.12}$$
is the Riemann zeta function and γ is Euler's constant for which the value is
γ = 0.57721(56649). For values of the argument of the function that exceed 1,
we use the recursion
$$\psi(x + 1) = \psi(x) + \frac{1}{x} \tag{A.13}$$
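The series of Equation A.11 and the recursion of Equation A.13 can be combined as in the following sketch. The tail estimate added to the zeta sum is a refinement introduced here (it is not part of the text) so that ζ(2) converges with a modest number of terms; the function names and term counts are likewise assumptions.

```python
import math

EULER_GAMMA = 0.5772156649

def zeta(n, terms=1000):
    """Riemann zeta function (A.12): direct sum plus an estimate of the omitted tail."""
    partial = sum(k ** -n for k in range(1, terms + 1))
    # Tail estimate (added refinement): integral of x^-n beyond `terms`, less half the last term
    tail = terms ** (1 - n) / (n - 1) - 0.5 * terms ** -n
    return partial + tail

SERIES_TERMS = 60
ZETA = {n: zeta(n) for n in range(2, SERIES_TERMS + 2)}

def psi(x):
    """Psi function for x > 0 from the series A.11 and the recursion A.13."""
    if x <= 0.0:
        raise ValueError("this sketch handles x > 0 only")
    shift = 0.0
    # Apply psi(x + 1) = psi(x) + 1/x until the argument lies in [0.5, 1.5]
    while x > 1.5:
        x -= 1.0
        shift += 1.0 / x
    while x < 0.5:
        shift -= 1.0 / x
        x += 1.0
    y = x - 1.0                                  # psi(1 + y) with |y| <= 0.5
    series = sum((-1) ** n * ZETA[n] * y ** (n - 1) for n in range(2, SERIES_TERMS + 2))
    return -EULER_GAMMA + series + shift

psi_one = psi(1.0)        # equals -gamma
psi_three_halves = psi(1.5)
```

Keeping the series argument within [−0.5, 0.5] is a design choice of this sketch: the terms then shrink at least geometrically, which limits the number of terms needed.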
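$$t_{\nu,\gamma} = z_\gamma + \frac{g_1(z_\gamma)}{\nu} + \frac{g_2(z_\gamma)}{\nu^2} + \frac{g_3(z_\gamma)}{\nu^3} + \frac{g_4(z_\gamma)}{\nu^4} \tag{A.14}$$
$$
\begin{aligned}
g_1(x) &= \frac{x^3 + x}{4} \\
g_2(x) &= \frac{5x^5 + 16x^3 + 3x}{96} \\
g_3(x) &= \frac{3x^7 + 19x^5 + 17x^3 - 15x}{384} \\
g_4(x) &= \frac{79x^9 + 776x^7 + 1{,}482x^5 - 1{,}920x^3 - 945x}{92{,}160}
\end{aligned}
\tag{A.15}
$$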
If necessary, the value of zγ may be computed using Equation A.5. This method
was used to compute the values included in the t table in Appendix D.
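The expansion of Equations A.14 and A.15 is sketched below. Two assumptions should be noted: the normal quantile is taken from Python's `statistics.NormalDist` rather than from Equation A.5, and the x^3 coefficient in g3 is taken as 17, the value in the Abramowitz and Stegun expansion. The function name is hypothetical.

```python
from statistics import NormalDist

def t_quantile(nu, gamma):
    """Approximate Student's t quantile with nu degrees of freedom at cumulative probability gamma."""
    x = NormalDist().inv_cdf(gamma)          # normal quantile z_gamma (stand-in for A.5)
    g1 = (x**3 + x) / 4
    g2 = (5 * x**5 + 16 * x**3 + 3 * x) / 96
    g3 = (3 * x**7 + 19 * x**5 + 17 * x**3 - 15 * x) / 384
    g4 = (79 * x**9 + 776 * x**7 + 1482 * x**5 - 1920 * x**3 - 945 * x) / 92160
    return x + g1 / nu + g2 / nu**2 + g3 / nu**3 + g4 / nu**4

t_10_975 = t_quantile(10, 0.975)   # Table D.2 lists 2.228
t_5_950 = t_quantile(5, 0.95)      # Table D.2 lists 2.015
```

The expansion is in powers of 1/ν, so its accuracy degrades for very small degrees of freedom.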
Appendix B: Numerical Evaluation
of the Weibull Renewal Functions
The Weibull distribution provides a very useful model of life lengths. The
utility of the distribution results from its tractability when treating simple
probabilities for life lengths. When exploring renewal behavior, it is less
manageable. In fact, there are no closed-form expressions for the convolu-
tions or the renewal function for the Weibull distribution. Lomnicki [65]
shows that the Weibull renewal function has an equivalent representation
as the sum of terms of a Maclaurin series and that the same is true for con-
volutions of a Weibull distribution. Specifically, suppose we represent the
Weibull distribution function as
$$F_T(t) = 1 - e^{-(t/\theta)^\beta} \tag{B.1}$$
Then, define the “Poissonian function”
$$P_k\!\left(\left(\frac{t}{\theta}\right)^{\beta}\right) = \frac{\left((t/\theta)^\beta\right)^k e^{-(t/\theta)^\beta}}{k!} \tag{B.2}$$
with the corresponding remainder accumulation
$$D_k\!\left(\left(\frac{t}{\theta}\right)^{\beta}\right) = \sum_{j=k}^{\infty} P_j\!\left(\left(\frac{t}{\theta}\right)^{\beta}\right) \tag{B.3}$$
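The Poissonian function and its remainder accumulation are simple to evaluate. The sketch below computes D_k as one minus the first k Poissonian terms, which is equivalent to the infinite sum in Equation B.3 because the P_j sum to one over all j ≥ 0. The function names and numerical values are assumptions for illustration.

```python
import math

def poissonian(k, x):
    """P_k(x) of Equation B.2, where x = (t/theta)**beta."""
    return x**k * math.exp(-x) / math.factorial(k)

def remainder(k, x):
    """D_k(x) of Equation B.3, computed as 1 minus the terms P_0, ..., P_{k-1}."""
    return 1.0 - sum(poissonian(j, x) for j in range(k))

t, theta, beta = 0.8, 1.0, 1.5        # illustrative values
x = (t / theta) ** beta
weibull_cdf = 1.0 - math.exp(-x)      # Equation B.1
d0, d1, d2 = remainder(0, x), remainder(1, x), remainder(2, x)
```

Here `d0` equals 1 and `d1` equals the Weibull distribution function of Equation B.1. For large k the subtraction can lose precision, in which case summing the terms P_j directly from j = k may be preferable.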
for which the coefficients, c(k), are computed as indicated by Lomnicki and
explained in the following. Observe that the corresponding representation of
the k-fold convolution of the Weibull distribution is
$$F_T^{(k)}(t) = \sum_{j=k}^{\infty} \phi_j(k)\, D_j\!\left(\left(\frac{t}{\theta}\right)^{\beta}\right) \tag{B.5}$$
in which the coefficients ϕj(k) are also defined by Lomnicki and shown as
follows. Before showing the calculation of the coefficients, we note that the
renewal density and the convolution of the probability density may also be
obtained by taking the derivatives of the functions in Equations B.4 and B.5,
respectively. This is shown here.
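$$f_T^{(k)}(t) = \frac{d}{dt}F_T^{(k)}(t) = \frac{d}{dt}\sum_{j=k}^{\infty}\phi_j(k)\,D_j\!\left(\left(\frac{t}{\theta}\right)^{\beta}\right) = \sum_{j=k}^{\infty}\phi_j(k)\,\frac{d}{dt}D_j\!\left(\left(\frac{t}{\theta}\right)^{\beta}\right) \tag{B.7}$$
$$
\begin{aligned}
\frac{d}{dt}D_k\!\left(\left(\frac{t}{\theta}\right)^{\beta}\right) &= \frac{d}{dt}\sum_{j=k}^{\infty}P_j\!\left(\left(\frac{t}{\theta}\right)^{\beta}\right) = \sum_{j=k}^{\infty}\frac{d}{dt}P_j\!\left(\left(\frac{t}{\theta}\right)^{\beta}\right) = \sum_{j=k}^{\infty}\frac{d}{dt}\left(\frac{\left((t/\theta)^\beta\right)^j e^{-(t/\theta)^\beta}}{j!}\right) \\
&= \sum_{j=k}^{\infty}\frac{d}{dt}\left(\frac{(t/\theta)^{j\beta}}{j!}\,e^{-(t/\theta)^\beta}\right) = \sum_{j=k}^{\infty}\left(\frac{(j\beta/t)(t/\theta)^{j\beta}}{j!} - \frac{(\beta/t)(t/\theta)^{\beta}(t/\theta)^{j\beta}}{j!}\right)e^{-(t/\theta)^\beta} \\
&= \frac{\beta}{t}\sum_{j=k}^{\infty}\left(\frac{(t/\theta)^{j\beta}}{(j-1)!} - \frac{(t/\theta)^{(j+1)\beta}}{j!}\right)e^{-(t/\theta)^\beta} = \frac{\beta}{t}\,\frac{(t/\theta)^{k\beta}}{(k-1)!}\,e^{-(t/\theta)^\beta}
\end{aligned}
\tag{B.8}
$$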
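$$m_{F_T}(t) = \sum_{k=1}^{\infty} c(k)\,\frac{\beta}{t}\,\frac{(t/\theta)^{k\beta}}{(k-1)!}\,e^{-(t/\theta)^\beta} \tag{B.9}$$
and
$$f_T^{(k)}(t) = \sum_{j=k}^{\infty} \phi_j(k)\,\frac{\beta}{t}\,\frac{(t/\theta)^{j\beta}}{(j-1)!}\,e^{-(t/\theta)^\beta} \tag{B.10}$$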
Thus, all of the interesting probability measures can be computed using the
coefficients defined by Lomnicki.
To obtain those coefficients, we start by computing a ratio of gamma
functions:
$$\gamma(k) = \frac{\Gamma(k\beta + 1)}{\Gamma(k + 1)} \tag{B.11}$$
$$
\begin{aligned}
b_0(k) &= \gamma(k) \\
b_{j+1}(k) &= \sum_{i=j}^{k-1} b_j(i)\,\gamma(k - i)
\end{aligned}
\tag{B.12}
$$
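The first two steps of the coefficient calculation, Equations B.11 and B.12, can be sketched as follows. The function names and the dictionary layout are choices made for this sketch; with β = 1.5 the results can be compared against the tabulated values of γ(k) and b_j(k) for that case.

```python
import math

def gamma_ratio(k, beta):
    """Equation B.11: Gamma(k*beta + 1) / Gamma(k + 1)."""
    return math.gamma(k * beta + 1.0) / math.gamma(k + 1.0)

def b_table(beta, kmax):
    """Return (g, b) with g[k] = gamma(k) and b[j][k] = b_j(k) for 0 <= j <= k <= kmax."""
    g = [gamma_ratio(k, beta) for k in range(kmax + 1)]
    b = [dict() for _ in range(kmax + 1)]
    for k in range(kmax + 1):
        b[0][k] = g[k]                                   # b_0(k) = gamma(k)
    for j in range(kmax):
        for k in range(j + 1, kmax + 1):
            # Equation B.12: b_{j+1}(k) = sum over i = j, ..., k-1 of b_j(i) * gamma(k - i)
            b[j + 1][k] = sum(b[j][i] * g[k - i] for i in range(j, k))
    return g, b

g, b = b_table(beta=1.5, kmax=15)
```

For example, `g[1]` is about 1.3293, `b[1][2]` about 4.767, and `b[2][3]` about 10.325. The remaining coefficients follow from the later equations of this appendix and are not computed here.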
and then,
$$
\begin{aligned}
\phi_j(j) &= a_j(j) \\
\phi_j(k) &= \sum_{i=j}^{k} a_i(k) - \sum_{i=j}^{k-1} a_i(k - 1)
\end{aligned}
\tag{B.14}
$$
where again k ≥ j + 1. These are the coefficients we use to compute the con-
volutions on the distribution and density functions. Finally, we obtain the
coefficients for the renewal function and the renewal density:
$$c(k) = \sum_{j=1}^{k} \phi_j(k) \tag{B.15}$$
To obtain a sense of the computations involved and the likely accuracy of the
finite sum of terms, consider a set of example calculations. Suppose β = 1.5
and since the scale is arbitrary, take θ = 1. Using these values, the first 16 of
the gamma function–based quantities are (Table B.1)
TABLE B.1
Values of γ(k) for β = 1.5
k γ k γ
0 1.0000 8 11880.0000
1 1.3293 9 63636.2377
2 3.0000 10 360360.0000
3 8.7238 11 2.1453 × 10^6
4 30.0000 12 13.3661 × 10^6
5 116.9534 13 86.8191 × 10^6
6 504.0000 14 586.0512 × 10^6
7 2360.9966 15 4099.8766 × 10^6
Then, using the quantities, γ(k), we use Equation B.12 to obtain the bj(k)
(Table B.2):
TABLE B.2
Values of the Quantities bj(k) for β = 1.5
k\j 0 1 2 3 4
0 1.0000
1 1.3293 1.3293
2 3.0000 4.7672 1.7671
3 8.7238 16.6998 10.3251 2.3491
4 30.0000 62.1938 48.0981 19.0271 3.1228
5 116.9534 249.0566 214.2441 110.3305 32.3409
6 504.0000 1071.0464 961.8336 572.1867 224.2416
7 2360.9966 4926.1199 4442.0375 2857.3636 1328.0853
8 11880.0000 24121.6916 21294.4744 14200.6517 7323.0231
9 63636.2377 125198.1354 106397.8593 71453.2323 39160.3981
10 360360.0000 685940.2457 555031.7309 367598.2560 207719.7852
11 2.1453 × 10^6 3.9521 × 10^6 3.0421 × 10^6 1.9452 × 10^6 1.1080 × 10^6
12 13.3661 × 10^6 23.8614 × 10^6 17.2026 × 10^6 10.6275 × 10^6 5.9966 × 10^6
13 86.8191 × 10^6 150.4915 × 10^6 102.0784 × 10^6 60.0923 × 10^6 33.1335 × 10^6
14 586.0512 × 10^6 988.5588 × 10^6 631.0661 × 10^6 352.1645 × 10^6 187.7382 × 10^6
15 4099.8766 × 10^6 6745.3736 × 10^6 4058.6121 × 10^6 2140.5447 × 10^6 1094.3582 × 10^6
k\j 5 6 7 8 9
5 4.1513
6 52.3605 5.5185
7 422.3589 82.0588 7.3359
8 2814.0220 754.7551 125.6394 9.7519
9 17010.7346 5589.1903 1297.6446 189.0252 12.9636
10 97696.0360 36796.0159 10575.6191 2165.9264 280.5346
11 547237.1877 226339.1912 75373.4950 19267.6607 3531.4026
12 3.0392 × 10^6 1.3390 × 10^6 495049.8362 147871.4150 34052.6332
13 16.9205 × 10^6 7.7593 × 10^6 3.0903 × 10^6 1.0338 × 10^6 280080.8014
14 95.1867 × 10^6 44.6066 × 10^6 18.7034 × 10^6 6.8004 × 10^6 2.0780 × 10^6
15 544.2424 × 10^6 256.7160 × 10^6 111.2747 × 10^6 42.9887 × 10^6 14.3811 × 10^6
k\j 10 11 12 13 14 15
10 17.2330
11 411.8167 22.9086
12 5649.1317 599.1438 30.4533
13 58697.9832 8895.4067 865.1917 40.4828
14 515220.0330 99086.5916 13822.3045 1241.4942 53.8154
15 4.0449 × 10^6 924648.5878 164320.0939 21235.7913 1771.8168 71.5390
Next, the analysis of Equation B.13 yields the following data (Tables B.3
and B.4):
TABLE B.3
Values of the Quantities aj(k) for β = 1.5
k\j 0 1 2 3 4 5 6 7 8
0 1.0000
1 0.0000 1.0000
2 0.0000 0.4110 0.5890
3 0.0000 0.1471 0.5836 0.2693
4 0.0000 0.0497 0.4033 0.4429 0.1041
5 0.0000 0.0163 0.2393 0.4650 0.2439 0.0355
6 0.0000 0.0052 0.1306 0.3970 0.3472 0.1091 0.0109
7 0.0000 0.0017 0.0677 0.3005 0.3881 0.1971 0.0419 0.0031
8 0.0000 0.0005 0.0339 0.2102 0.3752 0.2731 0.0921 0.0143 0.0008
9 0.0000 1.60 × 10^-4 0.0165 0.1392 0.3288 0.3211 0.1525 0.0371 0.0044
10 0.0000 4.96 × 10^-5 0.0079 0.0885 0.2686 0.3375 0.2114 0.0715 0.0132
11 0.0000 1.52 × 10^-5 0.0037 0.0546 0.2081 0.3267 0.2588 0.1142 0.0292
12 0.0000 4.66 × 10^-6 0.0017 0.0328 0.1547 0.2971 0.2892 0.1597 0.0528
13 0.0000 1.42 × 10^-6 0.0008 0.0194 0.1113 0.2570 0.3011 0.2021 0.0830
14 0.0000 4.32 × 10^-7 0.0004 0.0112 0.0779 0.2137 0.2963 0.2367 0.1173
15 0.0000 1.31 × 10^-7 0.0002 0.0064 0.0534 0.1719 0.2785 0.2605 0.1525
k\j 9 10 11 12 13 14 15
9 0.0002
10 0.0013 4.78 × 10^-5
11 0.0043 0.0003 1.07 × 10^-5
12 0.0106 0.0013 8.33 × 10^-5 2.28 × 10^-6
13 0.0215 0.0035 0.0004 1.97 × 10^-5 4.66 × 10^-7
14 0.0375 0.0079 0.0011 9.14 × 10^-5 4.41 × 10^-6 9.18 × 10^-8
15 0.0586 0.0151 0.0026 0.0003 2.23 × 10^-5 9.45 × 10^-7 1.74 × 10^-8
TABLE B.4
Values of the Quantities ϕj(k) for β = 1.5
k\j 1 2 3 4 5 6 7 8 9
0
1 1.0000
2 0.0000 0.5890
3 0.0000 0.2638 0.2693
4 0.0000 0.0974 0.2776 0.1041
5 0.0000 0.0334 0.1974 0.1753 0.0355
6 0.0000 0.0111 0.1197 0.1878 0.0845 0.0109
7 0.0000 0.0036 0.0665 0.1630 0.1220 0.0340 0.0031
8 0.0000 0.0011 0.0350 0.1252 0.1382 0.0622 0.0120 0.0008
9 0.0000 0.0003 0.0177 0.0887 0.1351 0.0870 0.0266 0.0038 0.0002
10 0.0000 0.0001 0.0087 0.0594 0.1196 0.1033 0.0444 0.0099 0.0011
11 0.0000 3.43 × 10^-5 0.0042 0.0382 0.0987 0.1095 0.0620 0.0193 0.0033
12 0.0000 1.06 × 10^-5 0.0020 0.0237 0.0771 0.1068 0.0764 0.0310 0.0074
13 0.0000 3.24 × 10^-6 0.0009 0.0144 0.0578 0.0978 0.0860 0.0436 0.0134
14 0.0000 9.89 × 10^-7 0.0004 0.0086 0.0419 0.0853 0.0901 0.0555 0.0212
15 0.0000 3.01 × 10^-7 0.0002 0.0050 0.0296 0.0714 0.0891 0.0653 0.0300
k\j 10 11 12 13 14 15
10 4.78 × 10^-5
11 0.0002 1.07 × 10^-5
12 0.0010 7.49 × 10^-5 2.28 × 10^-6
13 0.0025 0.0003 1.78 × 10^-5 4.66 × 10^-7
14 0.0051 0.0008 7.57 × 10^-5 4.04 × 10^-6 9.18 × 10^-8
15 0.0090 0.0018 0.0002 1.88 × 10^-5 8.71 × 10^-7 1.74 × 10^-8
Finally, we obtain the values for the coefficients c(k) (Table B.5):
TABLE B.5
Values of c(k) for β = 1.5
k c(k) k c(k)
1 1.0000 9 0.3595
2 0.5890 10 0.3466
3 0.5331 11 0.3354
4 0.4792 12 0.3256
5 0.4417 13 0.3168
6 0.4140 14 0.3089
7 0.3923 15 0.3017
8 0.3744
Appendix C: Laplace Transform
for the Key Renewal Theorem
In Chapter 9, the key renewal theorem is presented and its Laplace transform
is stated to be
$$M^*_{F_T}(s) = F^*_T(s) + \mathcal{L}\left(\int_0^t M_{F_T}(t - u)\, f_T(u)\,du\right) = F^*_T(s) + M^*_{F_T}(s)\, f^*_T(s) \tag{C.1}$$
To see that this is correct, start with the general definition of the transform:
$$\mathcal{L}\left(F_T(t)\right) = \int_0^\infty e^{-st} F_T(t)\,dt = F^*_T(s) \tag{C.2}$$
Then, we apply the definition to the renewal function and the transform of
the integral term is
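$$
\begin{aligned}
\mathcal{L}\left(\int_0^t M_{F_T}(t - u)\, f_T(u)\,du\right) &= \int_0^\infty e^{-st}\int_0^t M_{F_T}(t - u)\, f_T(u)\,du\,dt \\
&= \int_0^\infty\!\int_0^t e^{-s(t-u)}\sum_{n=1}^{\infty} F_T^{(n)}(t - u)\, e^{-su} f_T(u)\,du\,dt \\
&= \int_0^\infty\!\int_0^t \sum_{n=1}^{\infty} e^{-s(t-u)} F_T^{(n)}(t - u)\, e^{-su} f_T(u)\,du\,dt \\
&= \int_0^\infty\!\int_u^\infty \sum_{n=1}^{\infty} e^{-s(t-u)} F_T^{(n)}(t - u)\, e^{-su} f_T(u)\,dt\,du \\
&= \sum_{n=1}^{\infty}\int_0^\infty\left(\int_u^\infty e^{-s(t-u)} F_T^{(n)}(t - u)\,dt\right) e^{-su} f_T(u)\,du \\
&= \sum_{n=1}^{\infty}\int_0^\infty F_T^{(n)*}(s)\, e^{-su} f_T(u)\,du = \sum_{n=1}^{\infty} F_T^{(n)*}(s)\int_0^\infty e^{-su} f_T(u)\,du \\
&= \sum_{n=1}^{\infty} F_T^{(n)*}(s)\, f^*_T(s) = f^*_T(s)\sum_{n=1}^{\infty} F_T^{(n)*}(s)
\end{aligned}
$$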
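The identity in Equation C.1 can be spot-checked for the exponential distribution, for which the renewal function is M(t) = λt and all of the transforms are available in closed form. The sketch below is only such a check under that assumed distribution; the function name and the test values of λ and s are arbitrary.

```python
def renewal_transform_sides(lam, s):
    """Return both sides of Equation C.1 for an exponential life distribution."""
    F_star = lam / (s * (s + lam))    # transform of F_T(t) = 1 - exp(-lam * t)
    f_star = lam / (s + lam)          # transform of f_T(t) = lam * exp(-lam * t)
    M_star = lam / s**2               # transform of M(t) = lam * t
    return M_star, F_star + M_star * f_star

checks = [renewal_transform_sides(lam, s) for lam in (0.2, 1.0, 3.5) for s in (0.1, 1.0, 7.0)]
max_gap = max(abs(left - right) for left, right in checks)
```

The two sides agree up to rounding error for every pair of values tried.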
Appendix D: Probability Tables
TABLE D.1
Standard Normal Cumulative Probabilities
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8079 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8728 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9648 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9712 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9773 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9983 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
TABLE D.2
Student’s t Distribution
ν\γ 0.550 0.600 0.650 0.700 0.750 0.800 0.850 0.900 0.950 0.975 0.990 0.995 0.999
1 0.158 0.325 0.51 0.727 1.000 1.376 1.963 3.078 6.314 12.71 31.82 63.66 636.6
2 0.142 0.289 0.445 0.617 0.816 1.061 1.386 1.886 2.920 4.303 6.965 9.925 31.6
3 0.137 0.277 0.424 0.584 0.765 0.978 1.250 1.638 2.353 3.182 4.541 5.841 12.92
4 0.134 0.271 0.414 0.569 0.741 0.941 1.190 1.533 2.132 2.776 3.747 4.604 8.610
5 0.132 0.267 0.408 0.559 0.727 0.920 1.156 1.476 2.015 2.571 3.365 4.032 6.869
6 0.131 0.265 0.404 0.553 0.718 0.906 1.134 1.440 1.943 2.447 3.143 3.707 5.959
7 0.130 0.263 0.402 0.549 0.711 0.896 1.119 1.415 1.895 2.365 2.998 3.499 5.408
8 0.130 0.262 0.399 0.546 0.706 0.889 1.108 1.397 1.860 2.306 2.896 3.355 5.041
9 0.129 0.261 0.398 0.543 0.703 0.883 1.100 1.383 1.833 2.262 2.821 3.250 4.781
10 0.129 0.260 0.397 0.542 0.700 0.879 1.093 1.372 1.812 2.228 2.764 3.169 4.587
11 0.129 0.260 0.396 0.540 0.697 0.876 1.088 1.363 1.796 2.201 2.718 3.106 4.437
12 0.128 0.259 0.395 0.539 0.695 0.873 1.083 1.356 1.782 2.179 2.681 3.055 4.318
13 0.128 0.259 0.394 0.538 0.694 0.870 1.079 1.350 1.771 2.160 2.650 3.012 4.221
14 0.128 0.258 0.393 0.537 0.692 0.868 1.076 1.345 1.761 2.145 2.624 2.977 4.140
15 0.128 0.258 0.393 0.536 0.691 0.866 1.074 1.341 1.753 2.131 2.602 2.947 4.073
16 0.128 0.258 0.392 0.535 0.690 0.865 1.071 1.337 1.746 2.120 2.583 2.921 4.015
17 0.128 0.257 0.392 0.534 0.689 0.863 1.069 1.333 1.740 2.110 2.567 2.898 3.965
18 0.127 0.257 0.392 0.534 0.688 0.862 1.067 1.330 1.734 2.101 2.552 2.878 3.922
19 0.127 0.257 0.391 0.533 0.688 0.861 1.066 1.328 1.729 2.093 2.539 2.861 3.883
20 0.127 0.257 0.391 0.533 0.687 0.860 1.064 1.325 1.725 2.086 2.528 2.845 3.850
21 0.127 0.257 0.391 0.532 0.686 0.859 1.063 1.323 1.721 2.080 2.518 2.831 3.819
22 0.127 0.256 0.390 0.532 0.686 0.858 1.061 1.321 1.717 2.074 2.508 2.819 3.792
23 0.127 0.256 0.390 0.532 0.685 0.858 1.060 1.319 1.714 2.069 2.500 2.807 3.768
24 0.127 0.256 0.390 0.531 0.685 0.857 1.059 1.318 1.711 2.064 2.492 2.797 3.745
25 0.127 0.256 0.390 0.531 0.684 0.856 1.058 1.316 1.708 2.060 2.485 2.787 3.725
26 0.127 0.256 0.390 0.531 0.684 0.856 1.058 1.315 1.706 2.056 2.479 2.779 3.707
27 0.127 0.256 0.389 0.531 0.684 0.855 1.057 1.314 1.703 2.052 2.473 2.771 3.689
28 0.127 0.256 0.389 0.530 0.683 0.855 1.056 1.313 1.701 2.048 2.467 2.763 3.674
29 0.127 0.256 0.389 0.530 0.683 0.854 1.055 1.311 1.699 2.045 2.462 2.756 3.66
30 0.127 0.256 0.389 0.530 0.683 0.854 1.055 1.310 1.697 2.042 2.457 2.750 3.646
TABLE D.3
Table of Chi-Square Coordinates
ν\γ 0.005 0.010 0.025 0.050 0.100 0.900 0.950 0.975 0.990 0.995
1 0.00+ 0.00+ 0.001 0.004 0.016 2.706 3.841 5.024 6.635 7.879
2 0.010 0.020 0.051 0.103 0.211 4.605 5.991 7.378 9.210 10.597
3 0.072 0.115 0.216 0.352 0.584 6.251 7.815 9.348 11.345 12.838
4 0.207 0.297 0.484 0.711 1.064 7.779 9.488 11.143 13.277 14.860
5 0.412 0.554 0.831 1.145 1.610 9.236 11.070 12.833 15.086 16.750
6 0.676 0.872 1.237 1.635 2.204 10.645 12.592 14.449 16.812 18.548
7 0.989 1.239 1.690 2.167 2.833 12.017 14.067 16.013 18.475 20.278
8 1.344 1.646 2.180 2.733 3.490 13.362 15.507 17.535 20.090 21.955
9 1.735 2.088 2.700 3.325 4.168 14.684 16.919 19.023 21.666 23.589
10 2.156 2.558 3.247 3.940 4.865 15.987 18.307 20.483 23.209 25.188
11 2.603 3.053 3.816 4.575 5.578 17.275 19.675 21.920 24.725 26.757
12 3.074 3.571 4.404 5.226 6.304 18.549 21.026 23.337 26.217 28.300
13 3.565 4.107 5.009 5.892 7.042 19.812 22.362 24.736 27.688 29.819
14 4.075 4.660 5.629 6.571 7.790 21.064 23.685 26.119 29.141 31.319
15 4.601 5.229 6.262 7.261 8.547 22.307 24.996 27.488 30.578 32.801
16 5.142 5.812 6.908 7.962 9.312 23.542 26.296 28.845 32.000 34.267
17 5.697 6.408 7.564 8.672 10.085 24.769 27.587 30.191 33.409 35.718
18 6.265 7.015 8.231 9.390 10.865 25.989 28.869 31.526 34.805 37.156
19 6.844 7.633 8.907 10.117 11.651 27.204 30.144 32.852 36.191 38.582
20 7.434 8.260 9.591 10.851 12.443 28.412 31.410 34.170 37.566 39.997
21 8.034 8.897 10.283 11.591 13.240 29.615 32.671 35.479 38.932 41.401
22 8.643 9.542 10.982 12.338 14.041 30.813 33.924 36.781 40.289 42.796
23 9.260 10.196 11.689 13.091 14.848 32.007 35.172 38.076 41.638 44.181
24 9.886 10.856 12.401 13.848 15.659 33.196 36.415 39.364 42.980 45.559
25 10.520 11.524 13.120 14.611 16.473 34.382 37.652 40.646 44.314 46.928
26 11.160 12.198 13.844 15.379 17.292 35.563 38.885 41.923 45.642 48.290
27 11.808 12.879 14.573 16.151 18.114 36.741 40.113 43.195 46.963 49.645
28 12.461 13.565 15.308 16.928 18.939 37.916 41.337 44.461 48.278 50.993
29 13.121 14.256 16.047 17.708 19.768 39.087 42.557 45.722 49.588 52.336
30 13.787 14.953 16.791 18.493 20.599 40.256 43.773 46.979 50.892 53.672
50 27.991 29.707 32.357 34.764 37.689 63.167 67.505 71.420 76.154 79.490
80 51.172 53.540 57.153 60.391 64.278 96.578 101.879 106.629 112.329 116.321
100 67.328 70.065 74.222 77.929 82.358 118.498 124.342 129.561 135.807 140.169
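The coordinates in Table D.3 can be reproduced numerically. The following Python sketch is an illustration and not part of the original text. It assumes that γ denotes cumulative probability, which is consistent with the tabulated values increasing from left to right, and the function names chi2_cdf and chi2_quantile are hypothetical. The distribution function is evaluated with the series expansion of the regularized lower incomplete gamma function P(ν/2, x/2), and the coordinate is found by bisection.

```python
import math

def chi2_cdf(x, nu):
    """P(X <= x) for a chi-square variable with nu degrees of freedom."""
    if x <= 0.0:
        return 0.0
    a, z = nu / 2.0, x / 2.0
    # Series: gamma(a, z) = z**a * exp(-z) * sum_n z**n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= z / (a + n)
        total += term
    # Divide by Gamma(a) on the log scale to limit overflow.
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_quantile(prob, nu, tol=1e-7):
    """Value x such that P(X <= x) = prob, found by bisection."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    lo, hi = 0.0, float(nu)
    while chi2_cdf(hi, nu) < prob:      # expand the bracket until it contains prob
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, nu) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(chi2_quantile(0.95, 10), 3))  # approximately 18.307
```

The series converges for every positive argument but needs more terms as x grows, so the sketch is best suited to the range of degrees of freedom covered by the table.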
References
1. Schlager, N. (1994). When Technology Fails. Detroit, MI: Gale Research, Inc.
2. NASA. (1986). Report of the presidential commission on the space shuttle
Challenger accident. NASA Report No. S677. Washington, DC: US Government
Printing Office.
3. NASA. (2003). Columbia accident investigation board report. Washington, DC:
US Government Printing Office.
4. Power System Outage Task Force, North American Electric Reliability Council.
(2004). Final report on the August 14, 2003 blackout in the US and Canada:
Causes and recommendations. Princeton, NJ: NERC.
5. Ostrower, J., Pasztor, A. (2013). Boeing tries to defuse fears about Dreamliner.
The Wall Street Journal, January 9, 2013.
6. Blanchard, B. S. (1992). Logistics Engineering and Management. Englewood Cliffs,
NJ: Prentice Hall.
7. Tompkins, J. A. (1989). Winning Manufacturing. New York: McGraw Hill.
8. Shetty, Y. K., Buehler, V. M. (eds.) (1988). Competing through Productivity and
Quality. Cambridge, MA: Productivity Press.
9. Nachlas, J. A., Cassady, C. R. (1999). Preventive maintenance study: A key
component in engineering education to enhance industrial productivity and
competitiveness. European Journal of Engineering Education, 24:299–309.
10. Dougherty, Jr., E. M., Fragola, J. R. (1988). Human Reliability Analysis: A Systems
Engineering Approach with Nuclear Power Plant Applications. New York: John
Wiley & Sons.
11. Musa, J. D., Iannino, A., Okumoto, K. (1987). Software Reliability: Measurement,
Prediction, Application. New York: McGraw Hill.
12. Barlow, R. E., Proschan, F. (1975). Statistical Theory of Reliability and Life Testing:
Probability Models. New York: Holt, Rinehart and Winston.
13. Chiang, D. T., Niu, S. (1981). Reliability of consecutive k-out-of-n:F systems.
IEEE Transactions on Reliability, 30:65–67.
14. Natvig, B. (1982). Two suggestions of how to define a multistate coherent sys-
tem. Advances in Applied Probability, 14:434–455.
15. Barlow, R. E., Wu, A. S. (1978). Coherent systems with multistate components.
Mathematics of Operations Research, 3:275–281.
16. Lambiris, M., Papastavridis, S. (1985). Exact reliability formulas for linear
and circular consecutive k-out-of-n:F related systems. IEEE Transactions on
Reliability, 34:124–126.
17. Birnbaum, Z. W. (1969). On the importance of different components in a multi-
component system. In Multivariate Analysis, P. R. Krishnaiah (ed.). San Diego,
CA: Academic, pp. 581–592.
18. Fussell, J. B. (1975). How to hand calculate system safety and reliability charac-
teristics. IEEE Transactions on Reliability, 24:169–174.
19. Majety, S. R., Dawande, M., Rajgopal, J. (1999). Optimal reliability allocation with
discrete cost-reliability data for components. Operations Research, 47:899–906.
20. Rice, W. F., Cassady, C. R., Wise, T. R. (1999). Simplifying the solution of redundancy
allocation problems. Proceedings of the Annual Reliability and Maintainability
Symposium, pp. 190–194. Piscataway, NJ: Institute of Electrical and Electronics
Engineers.
21. Lloyd, D. K., Lipow, M. (1962). Reliability: Management, Methods and Mathematics.
Englewood Cliffs, NJ: Prentice Hall.
22. Wong, K. L., Lindstrom, D. L. (1988). Off the bathtub onto the rollercoaster curve.
Proceedings of the Annual Reliability and Maintainability Symposium, pp. 356–363.
Piscataway, NJ: Institute of Electrical and Electronics Engineers.
23. Barlow, R. E., Marshall, A. W., Proschan, F. (1963). Properties of probability
distributions with monotone hazard rate. Annals of Mathematical Statistics,
34:375–389.
24. Weibull, W. (1951). A statistical distribution function of wide applicability.
Journal of Applied Mechanics, 18:293–297.
25. Gumbel, E. J. (1958). Statistics of Extremes. New York: Columbia University Press.
26. Fisher, R. A., Tippett, L. H. C. (1928). Limiting forms of the frequency distribution
of the largest or smallest member of a sample. Proceedings of the Cambridge
Philosophical Society, XXIV:180–190.
27. Birnbaum, Z. W. (1958). A statistical model for life length of materials. Journal of
the American Statistical Association, 53:151–160.
28. Birnbaum, Z. W., Saunders, S. C. (1969). A new family of life distributions.
Journal of Applied Probability, 6:319–327.
29. Jordan, C. W. (1967). Life Contingencies. Chicago, IL: Society of Actuaries.
30. Hjorth, U. (1980). A reliability distribution with increasing, decreasing and
bathtub shaped failure rates. Technometrics, 22:99–107.
31. Nachlas, J. A., Beqari, E. (2016). Toward a unified failure theory with implications
for condition based maintenance. Proceedings of the Annual Reliability and
Maintainability Symposium, pp. 374–377. Piscataway, NJ: Institute of Electrical
and Electronics Engineers.
32. Kapur, K. C., Lamberson, L. R. (1977). Reliability in Engineering Design. New York:
John Wiley & Sons.
33. Gertsbakh, I. H., Kordonskiy, K. H. (1969). Models of Failure. Berlin, Germany:
Springer-Verlag.
34. Karlin, S. (1964). Total positivity, absorption probabilities and applications.
Transactions of the American Mathematical Society, 111:33–107.
35. Nachlas, J. A. (1988). A stochastic model of component failure. Proceedings of the
Sixth International Conference on Reliability and Maintainability, pp. 487–491. Paris,
France: Centre National d’Etudes Spatiales.
36. Miner, M. A. (1945). Cumulative damage in fatigue. Journal of Applied Mechanics,
12:A159–A164.
37. Arrhenius, S. Z. (1889). Physical Chemistry, 4:120.
38. Krausz, A. S., Eyring, H. B. (1975). Deformation Kinetics. New York: John Wiley &
Sons.
39. Stevenson, J. L., Nachlas, J. A. (1990). Microelectronics reliability predictions
derived from component defect densities. Proceedings of the Annual Reliability and
Maintainability Symposium, pp. 366–371. Piscataway, NJ: Institute of Electrical
and Electronics Engineers.
40. Kumar, D., Klefsjo, B. (1994). Proportional hazards model: A review. Reliability
Engineering and System Safety, 44:177–188.
41. Cox, D. R., Miller, H. D. (1965). The Theory of Stochastic Processes. London, U.K.:
Chapman & Hall.
42. Kharoufeh, J. P., Cox, S. M. (2005). Stochastic models for degradation based reli-
ability. IIE Transactions, 37(6):533–542.
43. Esary, J. D., Marshall, A. W., Proschan, F. (1973). Shock models and wear pro-
cesses. Annals of Probability, 1:627–649.
44. Lemoine, A. J., Wenocur, M. L. (1985). On failure modeling. Naval Research
Logistics Quarterly, 32:497–508.
45. Cox, D. R., Oakes, D. (1984). Analysis of Survival Data. New York: Chapman & Hall.
46. Nachlas, J. A. (1986). A general model for age acceleration during thermal
cycling. Quality and Reliability Engineering International, 2:3–6.
47. Wong, K. (1989). Off the bathtub curve onto the roller coaster curve. Quality and
Reliability Engineering International, 5(1):29–36.
48. Vassiliou, P., Mettas, A. (2004). Understanding accelerated life testing analysis.
Topics in Reliability: Tutorial Notes. Annual Reliability and Maintainability
Symposium. Piscataway, NJ: Institute of Electrical and Electronics Engineers.
49. Cassady, C. R., Nachlas, J. A. (2010). A general framework for modeling equipment
aging. Proceedings of the Annual Reliability and Maintainability Symposium,
pp. 374–377. Piscataway, NJ: Institute of Electrical and Electronics Engineers.
50. Kaplan, E. L., Meier, P. (1958). Nonparametric estimation from incomplete
observations. Journal of the American Statistical Association, 53:457–481.
51. Greenwood, M. (1926). The natural duration of cancer. Reports on Public Health
and Medical Subjects. His Majesty’s Stationery Office, London, U.K., pp. 1–26.
52. Meeker, W. Q., Escobar, L. A. (1998). Statistical Methods for Reliability Data.
New York: John Wiley & Sons.
53. Military Handbook 781. (1996). Reliability Test Methods, Plans and Environments
for Engineering, Development Qualification and Production. Washington, DC: U.S.
Department of Defense.
54. Barlow, R. E., Campo, R. (1975). Total time on test processes and applications to
failure data analysis. In Reliability and Fault Tree Analysis, R. E. Barlow, J. Fussell,
N. D. Singpurwalla, eds. Philadelphia, PA: SIAM, pp. 451–481.
55. Klefsjo, B. (1982). On aging properties and total time on test transforms.
Scandinavian Journal of Statistics, 9:37–41.
56. Nelson, W. (1972). Theory and applications of hazard plotting for censored fail-
ure data. Technometrics, 14:945–966.
57. Arrhenius, S. A. (1889). Über die Dissociationswärme und den Einfluss
der Temperatur auf den Dissociationsgrad der Elektrolyte. Zeitschrift für
Physikalische Chemie, 4:96–116.
58. Lawless, J. F. (2003). Statistical Models and Methods for Lifetime Data, 2nd edn.
Hoboken, NJ: John Wiley & Sons.
59. Abramowitz, M., Stegun, I. A. (1965). Handbook of Mathematical Functions.
New York: Dover Publications.
60. Nachlas, J. A., Kumar, A. (1993). Reliability estimation using doubly censored
field data. IEEE Transactions on Reliability, 42:268–279.
61. Cox, D. R. (1962). Renewal Theory. London, U.K.: Methuen.
62. Nelson, W. (1980). Accelerated life testing: Step-stress models and data analy-
sis. IEEE Transactions on Reliability, 29:103–108.
63. Nelson, W. (1990). Accelerated Testing: Statistical Models, Test Plans and Data
Analysis. New York: John Wiley & Sons.
64. Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. II.
New York: John Wiley & Sons.
65. Lomnicki, Z. A. (1966). A note on the Weibull renewal process. Biometrika, 53:375–381.
66. Barlow, R. E., Hunter, L. C. (1960). Optimum preventive maintenance policies.
Operations Research, 8:90–100.
67. Brown, M., Proschan, F. (1983). Imperfect repair. Journal of Applied Probability,
20:851–859.
68. Block, H. W., Borges, W. S., Savits, T. H. (1985). Age dependent minimal repair.
Journal of Applied Probability, 22:370–385.
69. Kijima, M. (1989). Some results for repairable systems with general repair.
Journal of Applied Probability, 26:89–102.
70. Kijima, M., Morimura, H., Suzuki, Y. (1988). Periodical replacement problem
without assuming minimal repair. European Journal of Operational Research,
37:194–203.
71. Wang, H., Pham, H. (1996). Optimal maintenance policies for several imperfect
repair models. International Journal of Systems Science, 27:543–549.
72. Wang, H., Pham, H. (1996). A quasi renewal process and its applications in
imperfect maintenance. International Journal of Systems Science, 27:1055–1062.
73. Lam, Y. (1988). Geometric process and replacement problem. Acta Mathematica
Applicata Sinica, 4:366–377.
74. Lam, Y. (1988). A note on the optimal replacement problem. Advances in Applied
Probability, 20:479–482.
75. Finkelstein, M. S. (1993). A scale model of general repair. Microelectronics
Reliability, 33:41–44.
76. Cassady, C. R., Nachlas, J. A. (1994). The frequency distribution of availability.
Proceedings of the Annual Reliability and Maintainability Symposium, pp. 278–282.
Piscataway, NJ: Institute of Electrical and Electronics Engineers.
77. Iyer, S. (1992). Availability results for imperfect repair. Sankhya: The Indian
Journal of Statistics, 54(B):249–256.
78. Rehmert, I. J., Nachlas, J. A. (2009). Availability analysis for the quasi-renewal
process. IEEE Transactions on Systems, Man, and Cybernetics, Part A, 39(1):272–280.
79. Birolini, A. (1985). On the Use of Stochastic Processes in Modeling Reliability
Problems. Lecture Notes in Economics and Mathematical Systems, 252. Berlin,
Germany: Springer-Verlag.
80. Ross, S. M. (1988). A First Course in Probability. New York: Macmillan.
81. Murdock, W. P., Nachlas, J. A. (2007). Availability under age replacement with
distinct service time distributions. Proceedings of the Annual Reliability and
Maintainability Symposium, pp. 333–338. Piscataway, NJ: Institute of Electrical
and Electronics Engineers.
82. Degbotse, A., Nachlas, J. A. (2003). Use of nested renewals to model availability
under opportunistic maintenance policies. Proceedings of the Annual
Reliability and Maintainability Symposium, pp. 344–350. Piscataway, NJ: Institute
of Electrical and Electronics Engineers.
83. Nakagawa, T. (1986). Periodic and sequential preventive maintenance policies.
Journal of Applied Probability, 23:536–542.
84. Block, H. W., Borges, W. S., Savits, T. H. (1986). Preventive maintenance policies.
In Reliability and Quality Control, A. P. Basu, ed. Amsterdam, the Netherlands:
Elsevier Science Publishers, pp. 101–106.
85. Intiyot, B. (2007). Availability analysis for the quasi-renewal process with an age
dependent preventive maintenance policy. Doctoral Dissertation, Department
of Industrial & Systems Engineering, Virginia Tech, Blacksburg, VA.
86. Makis, V., Jardine, A. K. S. (1993). A note on optimal replacement policy under
general repair. European Journal of Operational Research, 69:75–82.
87. van Noortwijk, J. M., Cooke, R., Kok, M. (1995). A Bayesian failure model
based on isotropic deterioration. European Journal of Operational Research,
82:270–282.
88. Grall, A., Dieulle, L., Berenguer, C., Roussignol, M. (2002). Continuous time pre-
dictive maintenance scheduling for a deteriorating system. IEEE Transactions on
Reliability, 51:141–150.
89. Chhikara, R. S., Folks, J. L. (1989). The Inverse Gaussian Distribution: Theory,
Methodology and Applications. New York: Marcel Dekker.
90. Doksum, K. A., Hoyland, A. (1992). Models for variable-stress accelerated life
testing experiments based on Wiener processes and the inverse Gaussian
distribution. Technometrics, 34:74–82.
91. Lu, C. J., Meeker, W. Q. (1993). Using degradation measures to estimate a time to
failure distribution. Technometrics, 35:161–174.
92. Lu, H., Kolarik, W. J., Lu, S. S. (2001). Real time performance reliability predic-
tion. IEEE Transactions on Reliability, 50:353–357.
93. Whitmore, G. A., Crowder, M. J., Lawless, J. F. (1998). Failure inference from a marker
process based on a bivariate Wiener model. Lifetime Data Analysis, 4:229–251.
94. Ascher, H., Feingold, H. (1984). Repairable Systems Reliability: Modeling, Inference,
Misconceptions and Their Causes. New York: Marcel Dekker.
95. Rigdon, S. E., Basu, A. P. (2000). Statistical Methods for the Reliability of Repairable
Systems. New York: John Wiley & Sons.
96. Duane, J. T. (1964). Learning curve approach to reliability monitoring. IEEE
Transactions on Aerospace, 2:563–566.
97. Crow, L. H. (1984). Methods for assessing reliability growth potential.
Proceedings of the Annual Reliability and Maintainability Symposium, pp. 484–489.
Piscataway, NJ: Institute of Electrical and Electronics Engineers.
98. Military Handbook 189C. (2011). Reliability Growth Management. Washington,
DC: U.S. Department of Defense.
99. Benton, A. W., Crow, L. H. (1989). Integrated reliability growth testing.
Proceedings of the Annual Reliability and Maintainability Symposium, pp. 160–166.
Piscataway, NJ: Institute of Electrical and Electronics Engineers.
100. Singpurwalla, N. D., Wilson, S. P. (1998). Failure models indexed by two scales.
Advances in Applied Probability, 30:1058–1072.
101. Yang, S. C., Nachlas, J. A. (2001). Bivariate reliability and maintenance planning
models. IEEE Transactions on Reliability, 50:26–35.
102. Eliashberg, J., Singpurwalla, N. D., Wilson, S. P. (1997). Calculating the reserve
for a time and usage indexed warranty. Management Science, 43:966–975.
103. Baggs, G. E., Nagaraja, H. N. (1996). Reliability properties of order statistics from
bivariate exponential distributions. Communications in Statistics, 12:611–631.
104. Hunter, J. (1974). Renewal theory in two dimensions: Basic results. Advances in
Applied Probability, 6:376–391.
105. Nachlas, J. A. (1992). Convenient construction of t-tests for quality evaluation.
Quality and Reliability Engineering International, 7:1–4.
Index