16 Dijoux
Isha Dewan
Indian Statistical Institute
7, S. J. S. Sansanwal Marg, New Delhi, India
Yann Dijoux
Université de Technologie de Troyes
10010, Troyes, France
Abstract
Proceedings of the 38th ESReDA Seminar, Pécs, May 4-5, 2010
system as good as new. Imperfect maintenance models have since been proposed. The
most popular of them are the virtual age models, defined by Brown and Proschan [1] and
Kijima et al. [7]. These models are suited to wearing-out systems, which are described
by reliability models with an increasing initial failure intensity. However, for systems
presenting a burn-in period, the risk of observing a failure soon after the startup of
the system is high. It is not reasonable to model these systems simply by a continuous
degradation.
In this paper, we propose to identify the two causes of failure separately.
To that end, we use the competing risks framework to describe one risk due to the
degradation and another due to burn-in defects. Preventive maintenances are assumed to
be deterministic, so the PM times correspond to censoring times of the CM process.
It seems realistic that the actions performed and their efficiency differ depending on the
kind of maintenance. We therefore consider different virtual age assumptions for each risk,
which corresponds to using asymmetric virtual ages as defined in [6]. It seems reasonable
to assume a minimal repair for the risk of burn-in failure and a classical virtual age
model for the risk related to the degradation and preventive maintenances. If
the kind of maintenance is recorded, we can easily derive the classical results of the
competing risks models associated with imperfect maintenance. Unfortunately, the kind
of maintenance performed is not always observed or recorded and, in practice, very few
datasets provide this information. We therefore present an adaptation of the E-M algorithm
to estimate the parameters of the model. Finally, simulations and numerical
estimations are presented.
1.2 Notations
The PM-CM process is the sequence of PM times and CM times. Maintenance durations
are assumed to be negligible or not taken into account. Then, we introduce the following
notations.
\[
U_k = \begin{cases}
1 & \text{if the $k$th maintenance is preventive (PM)} \\
0 & \text{if the $k$th maintenance is corrective, due to the degradation} \\
-1 & \text{if the $k$th maintenance is corrective, due to a burn-in defect}
\end{cases}
\tag{1}
\]
where Ht− is the past of the maintenance process at time t, i.e. the set of all events that
have occurred before t. Without external variable, it usually corresponds to the times
Advanced Maintenance Modelling
In the classical competing risks framework, it is assumed that the variables Yi (respec-
tively Zi and Ri ) are independent and identically distributed. It means that the effect
of each maintenance is to restore the system as good as new (AGAN). In the perfect
maintenance case, it is only necessary to define the joint distribution for a new system.
For example, we can express the three-dimensional survival function:
In many situations, the initial risks are assumed to be independent, which implies
that the survival function is the product of the three marginal survival functions $S_Y$, $S_Z$, $S_R$,
that is, the survival functions of Y, Z and R, respectively. But if we consider
condition-based preventive maintenance, the PM and CM processes cannot be assumed
independent. This leads to several competing risks models, such as the random sign model
and the alert-delay model.
Another very limiting assumption is the iid hypothesis. Doyen and Gaudoin [6]
proposed a general framework called Generalized Competing Risks, in which the triplets
$\{(Y_k, Z_k, R_k)\}_{k \geq 1}$ are not assumed to be iid. The couples $\{(W_k, U_k)\}_{k \geq 1}$ are therefore also
not iid. Thus, the effect of every PM and CM can be imperfect. The usual competing
risks objects are naturally generalized by introducing a conditioning on the past of
the PM-CM process. For example, we can introduce the CM-PM conditional generalized
survival function as:
A virtual age [7] model is characterized by a sequence {Ai }i≥1 of positive random variables,
called effective ages, such that A0 = 0 and the conditional distributions of interfailure
times are given by:
where Y is a random variable with the same distribution as the first failure time W1 . This
means that after the ith CM, the system behaves as a new one having survived until Ai .
Then, it is easy to prove that the failure intensity is [5]: $\lambda_t = \lambda(t - C_{K_{t^-}} + A_{K_{t^-}})$.
A virtual age model can be defined by a particular value of the effective ages Ai and
by an initial intensity. For instance, we have:
• ABAO: Ai = Ci .
• AGAN: Ai = 0.
• The Brown-Proschan (BP, [1]) model: each maintenance is AGAN with probability
p and ABAO with probability 1 − p. It corresponds to a virtual age model for which
\[ A_i = \sum_{j=1}^{i} \Big[ \prod_{k=j}^{i} (1 - B_k) \Big] W_j , \]
where $B_k$ is a random variable independent of $W_k$ and of
$\{W_j, B_j\}_{1 \leq j < k}$, and its distribution is Bernoulli with parameter p. The $B_k$ represent
the efficiencies of successive maintenances.
• The Arithmetic Reduction of Age model with memory ∞ ($ARA_\infty$, [5]): the effect of
the ith maintenance is to reduce the virtual age of the system by an amount proportional
to its age just before the failure. Then, the effective age is $A_i = (1 - \rho)(A_{i-1} + W_i)$,
and we prove that the failure intensity is:
\[ \lambda_t = \lambda\Big(t - \rho \sum_{j=0}^{K_{t^-}-1} (1 - \rho)^j \, C_{K_{t^-}-j}\Big) \]
• The Arithmetic Reduction of Age model with memory 1 ($ARA_1$, [5]): the effect of
the ith maintenance is to reduce the virtual age just before maintenance, $A_{i-1} + W_i$, by
a quantity $\rho W_i$ proportional to the time elapsed since the last maintenance. Then, the
effective age is $A_i = A_{i-1} + (1 - \rho) W_i = (1 - \rho) C_i$, and the failure intensity is:
\[ \lambda_t = \lambda(t - \rho \, C_{K_{t^-}}) \]
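The effective ages listed above can be compared numerically. The following is a minimal sketch, assuming the inter-maintenance times $W_i$ are given as a list; the function name `effective_ages` and the model labels are hypothetical and only serve the illustration. The BP model is omitted because its ages are random.

```python
from itertools import accumulate

def effective_ages(w, model, rho=0.0):
    """Return [A_1, ..., A_n] for inter-maintenance times w.

    model is one of "ABAO", "AGAN", "ARA1", "ARAinf" (labels chosen for this sketch).
    """
    c = list(accumulate(w))            # maintenance times C_i = W_1 + ... + W_i
    if model == "ABAO":
        return c                       # A_i = C_i
    if model == "AGAN":
        return [0.0] * len(w)          # A_i = 0
    if model == "ARA1":
        return [(1 - rho) * ci for ci in c]   # A_i = (1 - rho) C_i
    if model == "ARAinf":
        ages, a = [], 0.0
        for wi in w:
            a = (1 - rho) * (a + wi)   # A_i = (1 - rho)(A_{i-1} + W_i)
            ages.append(a)
        return ages
    raise ValueError(f"unknown model: {model}")

w = [2.0, 1.0, 3.0]
print(effective_ages(w, "ARA1", rho=0.5))    # [1.0, 1.5, 3.0]
print(effective_ages(w, "ARAinf", rho=0.5))  # [1.0, 1.0, 2.0]
```

With the same ρ, the ARA∞ ages stay below the ARA1 ages because each maintenance also reduces the age accumulated before the previous maintenance.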
\[ P(R_{k+1} > r, Z_{k+1} > z \mid A_k, w_1, \ldots, u_k) = P(R > A_k + r, Z > A_k + z \mid R > A_k, Z > A_k) \tag{4} \]
\[ P(R_{k+1} > r, Z_{k+1} > z \mid A_{R,k}, A_{Z,k}, w_1, \ldots, u_k) = P(R > A_{R,k} + r, Z > A_{Z,k} + z \mid R > A_{R,k}, Z > A_{Z,k}) \tag{5} \]
One of the major issues in the competing risks framework is the identifiability of the
model. There is an identifiability problem in the classical approach and in the generalized
virtual age (GVA) approach. However, considering asymmetrical virtual ages (AVA) makes it
possible, under certain conditions, to obtain complete identifiability of the model. The reason is that in
the classical or GVA approach, the ages related to the two risks are always identical, so
the joint survival function is only ever computed on the diagonal. It is then impossible
to identify the joint survival function on the whole two-dimensional space. The AVA models
are more flexible and require the expression of the joint survival function
at every point.
2 The model
2.1 Motivations
The majority of the studies considering systems presenting a burn-in period are based
on a bathtub-shaped intensity. This classic shape accounts for a burn-in
period when the function is decreasing, a potential useful life when the function is constant
and a wear-out period when the intensity is finally increasing. However, it seems relevant to
consider a modelling based on the competing risks framework for multiple reasons:
• The bathtub shaped intensity is the addition of two very different processes, one
linked to the degradation of the system and the other linked to failures due to design
defects. It seems then relevant to differentiate these two causes of failures.
• This model is flexible in the sense that other failure modes or other kinds of preven-
tive maintenance can be added with a minimal impact on the general modelling.
• The PM times are deterministic and can be considered as a censoring of the com-
peting risks process.
• The risks $Z_1$ and $R_1$ for a new system are independent. This assumption seems
realistic as the natures of the two risks are very different.
\[
U_n = 0 \;\Rightarrow\;
\begin{cases}
A_{R,n} = C_n \\
A_{Z,n} = (1 - \rho_0)(A_{Z,n-1} + W_n)
\end{cases}
\]
It is possible to obtain an explicit expression of the effective ages $A_{R,n}$ and $A_{Z,n}$. To
simplify the notations, let us introduce the parameter $\rho_{-1} = 0$, which amounts to assuming
that the CM after a burn-in failure is of $ARA_\infty$ type with parameter $\rho_{-1} = 0$, which is
equivalent to a minimal maintenance. Naturally, as maintenance after a burn-in failure is
minimal, $A_{R,n}$ is always equal to the real age of the system. The explicit expression of the
effective ages $A_{Z,n}$ follows from straightforward calculations. Finally we obtain
the following expressions:
\[ \forall n > 0, \quad A_{R,n} = C_n \]
\[ \forall n > 0, \quad A_{Z,n} = \sum_{j=1}^{n} \Big[ \prod_{k=j}^{n} (1 - \rho_{u_k}) \Big] W_j \]
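The closed form for $A_{Z,n}$ can be checked against the recursion. The sketch below assumes that every maintenance of type $u_k$ acts on the risk Z as an $ARA_\infty$ maintenance with parameter $\rho_{u_k}$ (with $\rho_{-1} = 0$), which is what the closed form suggests; the function names and the dictionary `rho` are hypothetical.

```python
from math import isclose

def asymmetric_ages(w, u, rho):
    """Recursive effective ages [(A_R,n, A_Z,n)] after each maintenance.

    rho maps a maintenance type u_k in {1, 0, -1} to its efficiency (rho[-1] = 0).
    """
    a_r, a_z, out = 0.0, 0.0, []
    for wk, uk in zip(w, u):
        a_r += wk                          # risk R: always minimal repair, A_R,n = C_n
        a_z = (1 - rho[uk]) * (a_z + wk)   # risk Z: ARA_inf step with parameter rho_{u_k}
        out.append((a_r, a_z))
    return out

def a_z_closed_form(w, u, rho, n):
    """A_Z,n = sum_{j=1}^{n} [prod_{k=j}^{n} (1 - rho_{u_k})] W_j (1-based n)."""
    total = 0.0
    for j in range(1, n + 1):
        prod = 1.0
        for k in range(j, n + 1):
            prod *= 1 - rho[u[k - 1]]
        total += prod * w[j - 1]
    return total

w = [1.0, 2.0, 3.0]
u = [-1, 0, 1]                       # burn-in CM, degradation CM, PM
rho = {-1: 0.0, 0: 0.5, 1: 0.8}      # illustrative values only
ages = asymmetric_ages(w, u, rho)
for n in range(1, len(w) + 1):
    assert isclose(ages[n - 1][1], a_z_closed_form(w, u, rho, n))
```

The recursion costs O(n) while the direct evaluation of the double sum-product costs O(n²) per age, so the recursion is the natural choice inside a likelihood evaluation.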
the maintenance intensities and the likelihood function associated to the observation of
couples {(Wj , Uj )}1≤j≤n . The general expression of the likelihood function is as follows:
\[
L(w_1, u_1, \ldots, w_n, u_n) = S_R(c_n) \prod_{i=1}^{n} [\lambda_R(c_i)]^{\mathbf{1}_{\{u_i = -1\}}} \, [\lambda_Z(a_{i-1} + w_i)]^{\mathbf{1}_{\{u_i = 0\}}} \, \frac{S_Z(a_{i-1} + w_i)}{S_Z(a_{i-1})}
\]
with $a_i = A_{Z,i} = \sum_{j=1}^{i} [\prod_{k=j}^{i} (1 - \rho_{u_k})] W_j$. The expression remains correct for other kinds
of virtual ages for $A_{Z,i}$. Maximum likelihood estimation yields explicit
estimators for some parameters ($\alpha_R$, $\beta_R$, $\alpha_Z$), and the MLEs of the other parameters can be derived by simple numerical
methods. In particular, we obtain:
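\[ \hat{\alpha}_R = \frac{\sum_{i=1}^{n} u_i (u_i - 1)}{2 \, c_n^{\hat{\beta}_R}} \]
\[ \hat{\beta}_R = \frac{\sum_{i=1}^{n} u_i (u_i - 1)}{\sum_{i=1}^{n} u_i (u_i - 1) \log\left(\frac{c_n}{c_i}\right)} \]
\[ \hat{\alpha}_Z = \frac{2n - \sum_{i=1}^{n} u_i (u_i - 1)}{2 \sum_{i=1}^{n} \left[ (\hat{a}_{i-1} + w_i)^{\hat{\beta}_Z} - (\hat{a}_{i-1})^{\hat{\beta}_Z} \right]} \]
\[
(\hat{\beta}_Z, \hat{\rho}_0, \hat{\rho}_1) = \arg\max_{\beta_Z, \rho_0, \rho_1}
\Big( n - \sum_{i=1}^{n} u_i (u_i - 1)/2 \Big)
\Big( -\log\Big( \sum_{i=1}^{n} \left[ (a_{i-1} + w_i)^{\beta_Z} - (a_{i-1})^{\beta_Z} \right] \Big) + \log(\beta_Z) \Big)
+ \beta_Z \sum_{i=1}^{n} u_i (u_i - 1)/2 \, \log(a_{i-1} + w_i)
\]
with $a_i = a_i(\rho_0, \rho_1) = \sum_{j=1}^{i} [\prod_{k=j}^{i} (1 - \rho_{u_k})] W_j$.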
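The two closed-form estimators for the burn-in risk can be sketched in a few lines. This sketch assumes a Weibull (power-law) intensity $\lambda_R(t) = \alpha_R \beta_R t^{\beta_R - 1}$, a parametrization that is consistent with the estimators above but is an assumption here; the function names are hypothetical. Note that $u_i(u_i - 1)/2$ equals 1 for a burn-in failure ($u_i = -1$) and 0 otherwise.

```python
from math import log

def burn_in_loglik(alpha, beta, c, u):
    """Part of the log-likelihood that depends on (alpha_R, beta_R) only."""
    ll = -alpha * c[-1] ** beta                      # log S_R(c_n)
    for ci, ui in zip(c, u):
        if ui == -1:                                 # burn-in failures contribute log lambda_R(c_i)
            ll += log(alpha) + log(beta) + (beta - 1) * log(ci)
    return ll

def burn_in_mle(c, u):
    """Closed-form (alpha_R, beta_R) from maintenance times c and types u."""
    c_n = c[-1]
    n_r = sum(ui * (ui - 1) for ui in u) / 2         # number of burn-in failures
    beta = n_r / sum(log(c_n / ci) for ci, ui in zip(c, u) if ui == -1)
    alpha = n_r / c_n ** beta
    return alpha, beta

# Small made-up data set: two burn-in failures followed by one degradation failure.
c = [1.0, 2.0, 4.0]
u = [-1, -1, 0]
alpha_hat, beta_hat = burn_in_mle(c, u)
print(alpha_hat, beta_hat)
```

The estimator of $\beta_R$ is undefined when no burn-in failure occurs strictly before $c_n$, since the denominator is then zero; a practical implementation would need to handle that case.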
• In practice, the manufacturing defects occur during the burn-in period. By definition,
this period takes place at the startup of the system. A reasonable compromise
is then to use an expert opinion to judge the end of the burn-in period. We
can then compute the conditioning for the first p maintenances and assume that the other
maintenances are due to the degradation. The cost of the conditioning is then $O(2^p)$,
with hopefully $p \ll n$. An even simpler possibility would be to assume that all
the first maintenances are due to burn-in defects and the others to the degradation.
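The $O(2^p)$ cost can be made concrete by enumerating the candidate label vectors. This is only an illustrative sketch, assuming that the first p corrective maintenances may each be of either type and that the remaining ones are attributed to the degradation; the function name `candidate_labelings` is hypothetical.

```python
from itertools import product

def candidate_labelings(p, n):
    """Yield every label vector whose first p entries are free in {-1, 0}
    and whose remaining n - p entries are fixed to 0 (degradation)."""
    for head in product((-1, 0), repeat=p):
        yield list(head) + [0] * (n - p)

labelings = list(candidate_labelings(3, 5))
print(len(labelings))   # 8, i.e. 2**3 configurations to condition on
```

Each configuration requires one evaluation of the complete likelihood, which is why a small p matters.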
\[ p_i = P(u_i = -1 \mid w_1, \ldots, w_i, u_1, \ldots, u_{i-1}) = \frac{\lambda_R(c_i)}{\lambda_R(c_i) + \lambda_Z(\hat{a}_{i-1} + w_i)} \]
– If the previous probability is greater than 0.5, we consider that the maintenance
is due to a burn-in failure, otherwise to a degradation failure:
\[ \hat{u}_i = \begin{cases} -1 & \text{if } p_i > 0.5 \\ 0 & \text{otherwise} \end{cases} \]
• M-step: Maximize the complete likelihood function $L(w_1, \hat{u}_1, \ldots, w_k, \hat{u}_k)$ as presented
in the previous section. Then, we obtain a new set of estimators
\[ \theta^{n+1} = (\alpha_R^{n+1}, \beta_R^{n+1}, \alpha_Z^{n+1}, \beta_Z^{n+1}, \rho_0^{n+1}, \rho_1^{n+1}). \]
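The classification step described above can be sketched as follows. The sketch assumes Weibull intensities $\lambda(t) = \alpha \beta t^{\beta - 1}$ for both risks, corrective maintenances only (no PM), and an $ARA_\infty$ effect with parameter $\rho_0$ after a degradation failure; the dictionary keys in `theta` and the function names are hypothetical.

```python
def weibull_intensity(t, alpha, beta):
    """Power-law intensity alpha * beta * t**(beta - 1) (assumed form)."""
    return alpha * beta * t ** (beta - 1)

def e_step(w, theta):
    """Classify each CM as burn-in (-1) or degradation (0) for given parameters.

    Returns the labels and the probabilities p_i of a burn-in failure.
    """
    c, a, labels, probs = 0.0, 0.0, [], []
    for wi in w:
        c += wi                                            # real age c_i
        lam_r = weibull_intensity(c, theta["alpha_R"], theta["beta_R"])
        lam_z = weibull_intensity(a + wi, theta["alpha_Z"], theta["beta_Z"])
        p = lam_r / (lam_r + lam_z)                        # p_i
        ui = -1 if p > 0.5 else 0                          # hard assignment
        rho = 0.0 if ui == -1 else theta["rho_0"]          # rho_{-1} = 0: minimal repair
        a = (1 - rho) * (a + wi)                           # update the age of risk Z
        labels.append(ui)
        probs.append(p)
    return labels, probs

# Illustrative parameter values only (decreasing burn-in intensity, increasing wear-out).
theta = {"alpha_R": 1.0, "beta_R": 0.5, "alpha_Z": 0.01, "beta_Z": 2.5, "rho_0": 0.5}
labels, probs = e_step([0.5, 0.5, 50.0, 20.0], theta)
print(labels)   # [-1, -1, 0, 0]
```

Because each label changes the virtual age used for the next observation, the labels have to be computed sequentially rather than independently.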
\[ P(R \text{ caused the failure} \mid \text{time to failure} = w) = \frac{\lambda_R(w)}{\lambda_R(w) + \lambda_Z(w)} \]
After the (i-1)th failure, the system R is restored as bad as old and the system Z
is restored with an age ai−1 . The two components are still functioning indepen-
dently, the only differences from a new system are the new failure rates of R and Z,
respectively λR (ci−1 + t) and λZ (ai−1 + t). Then we finally obtain:
\[ P(R \text{ caused the } i\text{th failure} \mid i\text{th inter-failure time} = w_i) = \frac{\lambda_R(c_{i-1} + w_i)}{\lambda_R(c_{i-1} + w_i) + \lambda_Z(a_{i-1} + w_i)} \]
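One way to see where this ratio comes from is sketched below, assuming independent risks with continuous lifetimes. The residual lifetimes $T_R$, $T_Z$ and the conditional survival functions $\bar{S}_R$, $\bar{S}_Z$ are notations introduced only for this sketch.

```latex
% T_R, T_Z: residual lifetimes of the risks R and Z after the (i-1)th failure,
% assumed independent, with conditional survival functions
%   \bar S_R(t) = S_R(c_{i-1}+t)/S_R(c_{i-1}),  \bar S_Z(t) = S_Z(a_{i-1}+t)/S_Z(a_{i-1}).
\begin{align*}
P(T_R \in [t, t+dt],\; T_Z > t)
  &= \lambda_R(c_{i-1}+t)\, \bar S_R(t)\, \bar S_Z(t)\, dt, \\
P(\min(T_R, T_Z) \in [t, t+dt])
  &= \big[\lambda_R(c_{i-1}+t) + \lambda_Z(a_{i-1}+t)\big]\, \bar S_R(t)\, \bar S_Z(t)\, dt.
\end{align*}
% Dividing the first line by the second and setting t = w_i, the survival terms cancel:
\[
P(R \text{ caused the } i\text{th failure} \mid W_i = w_i)
  = \frac{\lambda_R(c_{i-1}+w_i)}{\lambda_R(c_{i-1}+w_i) + \lambda_Z(a_{i-1}+w_i)} .
\]
```

Since $c_i = c_{i-1} + w_i$, this is the same quantity as the probability $p_i$ used in the E-step, up to the replacement of $a_{i-1}$ by its current estimate.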
• The performance of the E-step is extremely good. In a few iterations, the types of CM
are identified by the E-step. Table 1 presents one example of estimation. The general
validity of the estimation of the kinds of maintenance, based on simulations, will be
presented during the oral talk.
    Ci        Wi       Ui    Ûi
    0.84      0.84     -1    -1
    1.72      0.87     -1    -1
    3.63      1.91     -1    -1
    7.27      3.63     -1    -1
   15.02      7.74      0    -1
   30.73     15.70     -1    -1
   51.49     20.76      0     0
   89.47     37.97      0     0
  105.89     16.42      0     0
  131.03     25.14      0     0
  154.59     23.55      0     0
  170.90     16.30      0     0
• Logically, when the burn-in period represents the major part of the observations,
the estimations of $\alpha_R$ and $\beta_R$ are better. On the other hand, when the major part
of the observations consists of the wear-out period, the estimations of $\alpha_Z$, $\beta_Z$ and
$\rho_0$ tend to be better.
• The period of time during which both kinds of CM are observed alternately is relatively
short. This means that a sequence of $U_i$ such as 0, 0, −1, 0, −1, 0 never exceeds four
or five observations. This may be a limitation of the Weibull intensities. Nevertheless, it
gives a quite precise idea of the end of the burn-in period. This information
can be very useful for the maintenance team.
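A CM process of the kind discussed in these remarks can be simulated by inversion. The sketch below is not the simulation code used for the results; it assumes Weibull survival functions $S(t) = \exp(-\alpha t^{\beta})$ for both risks, no preventive maintenance, minimal repair after a burn-in failure and an $ARA_\infty$ effect with parameter $\rho_0$ after a degradation failure. The names `next_failure`, `simulate_cm` and the keys of `theta` are hypothetical.

```python
import random
from math import log

def next_failure(age, alpha, beta, rng):
    """Sample the residual life of a risk with S(t) = exp(-alpha * t**beta) at a given age.

    Solves S(age + t) / S(age) = U, i.e. (age + t)**beta = age**beta + E / alpha, E ~ Exp(1).
    """
    e = -log(1.0 - rng.random())       # Exp(1) variate; 1 - U lies in (0, 1]
    return (age ** beta + e / alpha) ** (1.0 / beta) - age

def simulate_cm(n, theta, seed=0):
    """Return n rows (C_i, W_i, U_i) of a simulated CM process."""
    rng = random.Random(seed)
    c, a, rows = 0.0, 0.0, []
    for _ in range(n):
        t_r = next_failure(c, theta["alpha_R"], theta["beta_R"], rng)  # burn-in risk at real age
        t_z = next_failure(a, theta["alpha_Z"], theta["beta_Z"], rng)  # degradation risk at virtual age
        w, u = (t_r, -1) if t_r < t_z else (t_z, 0)                    # first risk to fail wins
        c += w
        a = a + w if u == -1 else (1 - theta["rho_0"]) * (a + w)
        rows.append((c, w, u))
    return rows

# Illustrative parameter values only.
theta = {"alpha_R": 1.0, "beta_R": 0.5, "alpha_Z": 0.01, "beta_Z": 2.5, "rho_0": 0.5}
for c_i, w_i, u_i in simulate_cm(12, theta, seed=1):
    print(f"{c_i:8.2f} {w_i:8.2f} {u_i:3d}")
```

Feeding the simulated inter-failure times to an E-step and comparing the estimated labels with the simulated $U_i$ gives a table of the same shape as Table 1.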
The first estimation results are very positive regarding the efficiency of the method. A further
and more precise study needs to be carried out to validate the estimation method. The
next step will then be to apply this methodology to a real dataset. Both points will be
developed during the oral presentation.
Conclusion
In this paper, we have proposed a general modelling of imperfect maintenance for
systems presenting a burn-in period and subject to corrective and preventive maintenance.
The model makes it possible to dissociate the causes and effects of both failure modes by using the
competing risks framework and the virtual age models. Considering the two failure modes
involved, it seems realistic not to make the same assumption on the virtual ages. The first
contribution of the paper is to propose asymmetrical virtual ages. The complexity of
the model increases when we assume that the causes of failure may not be recorded. To
overcome this issue, we have proposed different methods. The second contribution of
the paper is to detail an estimation method using the EM algorithm. The first simulation
results are very promising. A validation of the model needs to be carried out to judge the
robustness of the algorithm, but we observe that very relevant information can be obtained
from the estimation, such as the end of the burn-in period.
References
[1] Brown, M., Proschan, F. (1983) Imperfect repair, Journal of Applied Probability, 20,
pp 851–859.
[2] Cooke, R.M., Paulsen, J. (1997) Concepts for measuring maintenance performance and
methods for analysing competing failure modes, Reliability Engineering and System
Safety, 55, pp 135–141.
[3] Dijoux, Y. (2009) A virtual age model based on a bathtub shaped initial intensity,
Reliability Engineering and System Safety, 94(5), pp 982–989.
[4] Dijoux, Y., Doyen, L., Gaudoin, O. (2008) Conditionally independent generalized
competing risks for maintenance analysis, in Advances in Mathematical Modeling
for Reliability, T. Bedford, J. Quigley, L. Walls, B. Alkali, A. Daneshkhah and G.
Hardman eds., IOS Press, Amsterdam, pp 88–95.
[5] Doyen, L., Gaudoin, O. (2004) Classes of imperfect repair models based on reduction
of failure intensity or virtual age, Reliability Engineering and System Safety, 84(1),
pp 45–56.
[6] Doyen, L., Gaudoin, O. (2006) Imperfect maintenance in a generalized competing risks
framework, Journal of Applied Probability, 43(3), pp 825–839.
[7] Kijima, M., Morimura, H., Suzuki, Y. (1988) Periodical replacement problem
without assuming minimal repair, European Journal of Operational Research, 37,
pp 194–203.
[8] Langseth, H., Lindqvist, B.H. (2003) A maintenance model for components exposed to
several failure mechanisms and imperfect repair, in Doksum, K. and Lindqvist, B.H.,
editors, Mathematical and Statistical Methods in Reliability, Quality, Reliability and
Engineering Statistics, World Scientific Publishing Co., pp 355–370.
[9] Lindqvist, B.H. (2006) On the Statistical Modeling and Analysis of Repairable Systems,
Statistical Science, 21(4), pp 532–551.