CBM Modelling
Wenbin Wang
5.1 Introduction
The use of condition monitoring techniques within industry to direct maintenance
actions has increased rapidly over recent years to the extent that it has marked the
beginning of what is likely to prove a new generation in production and
maintenance management practice. There are both economic and technological
reasons for this development driven by tight profit margins, high outage costs and
an increase in plant complexity and automation. Technical advances in condition
monitoring techniques have provided a means to achieve high availability and to
reduce scheduled and unscheduled production shutdowns. In all cases, the
measured condition information does, in addition to potentially improving decision
making, have a value added role for a manager in that there is now a more
objective means of explaining actions if challenged.
In November 1979, the consultants Michael Neal & Associates Ltd published
‘A Guide to Condition Monitoring of Machinery’ for the UK Department of Trade
and Industry, Neal et al (1979). This groundbreaking report illustrated the
difference in maintenance strategies (e.g., breakdown, planned, etc.) and suggested
that condition based maintenance, using a range of techniques, would offer
significant benefits to industry. By the late 1990s condition based maintenance
had become widely accepted as one of the drivers to reduce maintenance costs and
increase plant availability. With the advent of e-procurement, Business to Business
(B2B), Customer to Business (C2B), Business to Customer (B2C) etc., industry is
fast moving towards enterprise wide information systems associated with the
internet. Today, plant asset management is the integration of computerised
maintenance management systems and condition monitoring in order to fulfill the
business objectives. This enables significant production benefits through objective
maintenance prediction and scheduling. This positions the manufacturer to remain
competitive in a dynamic market.
Today there exists a large and growing variety of condition monitoring
techniques for machine condition monitoring and fault diagnosis. A particularly
popular one for rotating and reciprocating machinery is vibration analysis.
However, irrespective of the particular condition monitoring technique used, the
working principle of condition monitoring is the same: condition data
become available, which need to be interpreted so that appropriate actions can be
taken. There are generally two stages in condition based maintenance. The
first stage is related to condition monitoring data acquisition and its technical
interpretations. There have been numerous papers contributing to this stage, as
evidenced by the proceedings of COMADEM over recent years. This stage is
characterised by engineering skill, knowledge and experience. Much effort of the
study at this stage has gone into determining the appropriate variables to monitor,
Chen et al (1994), the design of systems for condition monitoring data acquisition,
Drake et al (1995), signal processing, Wong et al (2006), Samanta et al (2006),
Harrison (1995), Li and Li (1995), and how to implement computerised condition
monitoring, Meher-Homji et al (1994). These are just a few examples, and in none of
them is modelling explicitly entered into the maintenance decision process based upon
the results of condition monitoring. For detailed technical aspects of condition
monitoring and fault diagnosis, see Collacott (1977). The second stage is
maintenance decision making, namely what to do now given that condition
information data and its interpretations are available. The decision at this stage can
be complicated and entails consideration of cost, downtime, production demand,
preventive maintenance shutdown windows, and most importantly, the likely
survival time of the item monitored. Compared with the extensive literature on
condition monitoring techniques and their applications, relatively little attention
has been paid to the important problem of modelling appropriate decision making
in condition based maintenance.
This chapter focuses on the second stage of condition monitoring, namely
condition based maintenance modelling as an aid to effective decision making. In
particular, we will highlight a modelling technique used recently in condition based
maintenance, namely residual life modelling via stochastic filtering, Wang and
Christer (2000). This is a key element in modelling the decision making aspect of
condition based maintenance. The chapter is organised as follows. Section 5.2
gives a brief introduction to condition monitoring techniques. Section 5.3 focuses
on condition based maintenance modelling and discusses the various modelling
techniques used. Section 5.4 presents the modelling of the residual life conditional
on observed monitoring information using stochastic filtering. Section 5.5
concludes the chapter with a discussion of topics for future research.
\[ V_{rms} = \sqrt{\frac{1}{T} \int_0^T (V(t))^2 \, dt}, \]
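As a minimal sketch of how the rms level might be computed in practice, the integral can be approximated from equally spaced samples of the signal. The function name `rms` and the sine test signal are assumptions made for illustration only; they are not taken from the chapter.

```python
import math

def rms(samples):
    """Discrete approximation of V_rms = sqrt((1/T) * integral of V(t)^2 dt)
    for equally spaced samples of the signal V(t)."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(v * v for v in samples) / len(samples))

# Hypothetical test signal: a unit-amplitude sine sampled over whole periods.
n = 1000
signal = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
print(rms(signal))
```

For a unit-amplitude sine sampled over whole periods the result should be close to the square root of one half, which gives a quick check of the implementation.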
into play when establishing a vibration based maintenance model is the causal
relationship between the measured signals and the state of the plant. It is the defect
which causes the abnormal signals, but not vice versa, Wang (2002). This factor
plays an important role when selecting an appropriate model for describing such a
relationship.
difference when modeling the state of the plant in oil based monitoring compared
to vibration based.
The long term expected cost per unit time, $C(t)$, given that a preventive
replacement is scheduled at time $t > t_i$, is given by, Wang (2003),
\[ C(t) = \frac{(c_f - c_p) P(t - t_i \mid \Im_i) + c_p + i c_m}{t_i + (t - t_i)(1 - P(t - t_i \mid \Im_i)) + \int_0^{t - t_i} x_i \, p_i(x_i \mid \Im_i) \, dx_i} \tag{5.1} \]
where
\[ P(t - t_i \mid \Im_i) = P(X_i < t - t_i \mid \Im_i) = \int_0^{t - t_i} p_i(x_i \mid \Im_i) \, dx_i, \]
which is the probability of a failure before $t$ conditional on $\Im_i$. The right hand side of (5.1) is
the expected cost per unit time formulated as a renewal reward function, even though the
life times are independent but not identically distributed.
The time point $t$ is usually bounded within the period from the current
monitoring point to the next, since a new decision will be made once a new monitoring
reading becomes available at time $t_{i+1}$.
In general, if a minimum of $C(t)$ in terms of $t$ is found within the interval to the next
monitoring point, then this $t$ should be the optimal replacement time. If no
minimum is found, then the recommendation would be to continue to use the plant
and evaluate (5.1) at the next monitoring point when new information becomes
available. For a graphical illustration of the above principle see Fig. 5.1.
[Fig. 5.1: plot of $C(t)$, annotated "No replacement is recommended"]
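The decision rule based on (5.1) can be sketched numerically as follows. This is only an illustration under stated assumptions: the residual life density is a hypothetical Weibull density standing in for $p_i(x_i \mid \Im_i)$, the cost figures and the values of $t_i$ and $i$ are made up, and the function names `expected_cost_rate` and `best_replacement` are ours.

```python
import math

def weibull_pdf(x, shape, scale):
    """Hypothetical residual life density, used only for illustration."""
    if x < 0.0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))

def expected_cost_rate(u, t_i, i, density, c_f, c_p, c_m, n_steps=400):
    """Evaluate (5.1) at t = t_i + u, where density(x) plays the role of p_i(x | history)."""
    h = u / n_steps
    prob_fail = 0.0      # P(X_i < u | history), trapezoid rule
    partial_mean = 0.0   # integral of x * p_i(x) over [0, u], trapezoid rule
    for k in range(n_steps + 1):
        x = k * h
        w = 0.5 if k in (0, n_steps) else 1.0
        p = density(x)
        prob_fail += w * p * h
        partial_mean += w * x * p * h
    numerator = (c_f - c_p) * prob_fail + c_p + i * c_m
    denominator = t_i + u * (1.0 - prob_fail) + partial_mean
    return numerator / denominator

def best_replacement(t_i, i, density, interval, c_f, c_p, c_m, n_grid=60):
    """Search C(t_i + u) on a grid over (0, interval].

    Returns (u, cost) at the lowest grid cost, or None when that lowest cost
    sits at the end of the interval, i.e. no minimum is found before the next
    monitoring point and no replacement is recommended yet."""
    us = [interval * (k + 1) / n_grid for k in range(n_grid)]
    costs = [expected_cost_rate(u, t_i, i, density, c_f, c_p, c_m) for u in us]
    k_min = min(range(n_grid), key=costs.__getitem__)
    if k_min == n_grid - 1:
        return None
    return us[k_min], costs[k_min]

# Hypothetical inputs: fifth check at age 100, next check due in 30 time units.
short_life = lambda x: weibull_pdf(x, 3.0, 40.0)
long_life = lambda x: weibull_pdf(x, 3.0, 4000.0)
print(best_replacement(100.0, 5, short_life, 30.0, 6000.0, 2000.0, 30.0))
print(best_replacement(100.0, 5, long_life, 30.0, 6000.0, 2000.0, 30.0))
```

With the short-lived density a minimum is found inside the interval, whereas with the long-lived density the cost is still falling at the end of the interval and the sketch returns `None`, which corresponds to continuing to use the plant.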
5.3.2 Modelling $p_i(x_i \mid \Im_i)$
\[ h(t) = \alpha \beta t^{\beta - 1}. \]
There are two problems with proportional hazards modeling or accelerated life
models in condition based maintenance. The first is that the current hazard is
determined partially by the current monitoring measurements and the full
monitoring history is not used. The second is the assumption that the hazard or the
life is a function of the observed monitoring data which acts directly on the hazard
via a covariate function. Both problems relate to the modelling assumption rather
than the technique. The first one can be overcome if some sort of transformation of
the observed data is used. The second problem remains unless the nature of the
monitoring supports such an assumption. It is noted however that for most condition monitoring
techniques, the observed monitoring measurements are concomitant types of
information which are a function of the underlying plant state. A typical example is
in vibration monitoring where a high level of vibration is usually caused by a
hidden defect but not vice versa, as we have discussed earlier. In this case the
observed vibration signals may be regarded as concomitant variables which are
caused by the plant state. Note that in oil based monitoring things are different as
the metal particles and other contaminants observed in the oil can be regarded both
as concomitant variables and covariates as we discussed earlier. In this case a
model that considers both types of variable might be appropriate.
The last decade has seen an increased use of stochastic filtering and Hidden
Markov Models (HMM) for modelling $p_i(x_i \mid \Im_i)$ in condition based
maintenance, Hontelez et al (1996), Christer et al (1997), Wang and Christer
(2000), Bunks et al (2000), Dong and He (2004), Lin and Makis (2003, 2004),
Baruah and Chinnam (2005), Wang (2006a). These techniques overcome both
problems of PHM and provide a flexible way to model the relationship between the
observed signals and unobserved plant state. HMM can be seen as a specific type
of stochastic filtering model that is usually used for discrete state and
observation variables. If the noise factors in the model are not Gaussian, then a
closed form for $p_i(x_i \mid \Im_i)$ is generally not available and one has to resort to
numerical approximations. A comparison study using both filtering, Wang (2002),
and PHM, Makis and Jardine (1991), based on vibration data revealed that the
filtering based model produced a better result in terms of prediction accuracy,
Matthew and Wang (2006).
It should also be noted that if the monitored variables also influence the state to
some extent, then both HMM and PHM should be used to tackle the problem.
Alternatively an interactive HMM can also be formulated where a bilateral
relationship is assumed between the observed and the unobserved variables. In the next section,
we shall discuss in detail a specific filtering model used for the derivation of
$p_i(x_i \mid \Im_i)$. This model is simple to use and is analytically tractable.
Defect → short residual life → higher than normal signal may be observed.
If the severity of the defect is represented by the length of the residual life, the
relationship between the residual life and observed condition related variables
follows.
monitoring. If this is the case, we can simply set the threshold level to be zero. Fig.
5.2 shows a typical condition monitoring practice.
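[Fig. 5.2: readings $y_1$, $y_2$, $y_3$ observed at monitoring times $t_1$, $t_2$, $t_3$ against a threshold level, with the residual lives $x_1$, $x_2$, $x_3$ measured from each monitoring time to failure]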
It is noted from Fig. 5.2 that the condition information obtained before $t_1$ is not
used since it is irrelevant to the decision making process. It is noted however,
that the time to $t_1$ is an important source of information to be used in
determining the condition monitoring interval, Wang (2003).
Since the residual life at $t_i$ is the residual life at $t_{i-1}$ minus the interval
between $t_i$ and $t_{i-1}$, provided the item has survived to $t_i$ and no maintenance
action has been taken, it follows that
\[ X_i = \begin{cases} X_{i-1} - (t_i - t_{i-1}) & \text{if } X_{i-1} > t_i - t_{i-1} \\ \text{not defined} & \text{otherwise} \end{cases}. \tag{5.2} \]
\[ p_i(x_i \mid \Im_i) = p(x_i \mid y_i, \Im_{i-1}) = \frac{p(x_i, y_i \mid \Im_{i-1})}{p(y_i \mid \Im_{i-1})} \tag{5.3} \]
\[ p(x_i, y_i \mid \Im_{i-1}) = p(y_i \mid x_i, \Im_{i-1}) \, p(x_i \mid \Im_{i-1}) \tag{5.4} \]
\[ p(x_i, y_i \mid \Im_{i-1}) = p(y_i \mid x_i, \Im_{i-1}) \, p(x_i \mid \Im_{i-1}) = p(y_i \mid x_i) \, p(x_i \mid \Im_{i-1}) \tag{5.5} \]
\[ p(y_i \mid \Im_{i-1}) = \int_0^\infty p(x_i, y_i \mid \Im_{i-1}) \, dx_i = \int_0^\infty p(y_i \mid x_i) \, p(x_i \mid \Im_{i-1}) \, dx_i \tag{5.6} \]
\[ p(x_i \mid \Im_{i-1}) = p_{i-1}(g(x_i) \mid \Im_{i-1}, X_{i-1} > t_i - t_{i-1}) \, \frac{dg(x_i)}{dx_i} \tag{5.7} \]
Since $\frac{dg(x_i)}{dx_i} = 1$ and
\[ p_{i-1}(g(x_i) \mid \Im_{i-1}, X_{i-1} > t_i - t_{i-1}) = \frac{p_{i-1}(g(x_i) \mid \Im_{i-1})}{\int_{t_i - t_{i-1}}^\infty p_{i-1}(x_{i-1} \mid \Im_{i-1}) \, dx_{i-1}} \tag{5.8} \]
we finally have
\[ p(x_i \mid \Im_{i-1}) = \frac{p_{i-1}(x_i + t_i - t_{i-1} \mid \Im_{i-1})}{\int_{t_i - t_{i-1}}^\infty p_{i-1}(x_{i-1} \mid \Im_{i-1}) \, dx_{i-1}} \tag{5.9} \]
\[ p_i(x_i \mid \Im_i) = \frac{p(y_i \mid x_i) \, p_{i-1}(x_i + t_i - t_{i-1} \mid \Im_{i-1})}{\int_0^\infty p(y_i \mid x_i) \, p_{i-1}(x_i + t_i - t_{i-1} \mid \Im_{i-1}) \, dx_i} \tag{5.10} \]
\[ p_1(x_1 \mid \Im_1) = \frac{p(y_1 \mid x_1) \, p_0(x_1 + t_1 - t_0 \mid \Im_0)}{\int_0^\infty p(y_1 \mid x_1) \, p_0(x_1 + t_1 - t_0 \mid \Im_0) \, dx_1} \tag{5.11} \]
$p_0(x_0)$ is just the delay time distribution over the second stage of the plant life.
Here we use the Weibull distribution as an example in this context. In practice,
the density function $p_0(x_0)$ should be the one
which best fits the data, or be chosen from some known theory.
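To make the role of $p_0(x_0)$ concrete, the prediction step (5.9) for $i = 1$ can be sketched in closed form when $p_0$ is Weibull. The sketch assumes the parametrisation $p_0(x_0) = \alpha\beta(\alpha x_0)^{\beta-1} e^{-(\alpha x_0)^\beta}$ used later in the chapter with $\beta > 1$; the parameter values, the monitoring time and the function names are hypothetical.

```python
import math

def weibull_pdf(x, alpha, beta):
    """p_0(x) = alpha*beta*(alpha*x)^(beta-1) * exp(-(alpha*x)^beta); beta > 1 assumed."""
    if x <= 0.0:
        return 0.0
    return alpha * beta * (alpha * x) ** (beta - 1.0) * math.exp(-((alpha * x) ** beta))

def weibull_survival(x, alpha, beta):
    """P(X_0 > x) = exp(-(alpha*x)^beta)."""
    return math.exp(-((alpha * max(x, 0.0)) ** beta))

def predicted_density(x1, t1, alpha, beta, t0=0.0):
    """Equation (5.9) for i = 1: density of X_1 given survival to t_1."""
    gap = t1 - t0
    return weibull_pdf(x1 + gap, alpha, beta) / weibull_survival(gap, alpha, beta)

# Hypothetical values: check by the trapezoid rule that the density integrates to one.
alpha, beta, t1 = 0.01, 2.0, 50.0
h = 0.1
values = [predicted_density(k * h, t1, alpha, beta) for k in range(10001)]
total = h * (sum(values) - 0.5 * (values[0] + values[-1]))
print(total)
```

The denominator in (5.9) is simply the Weibull survival function evaluated at $t_1 - t_0$, which is why no numerical integration is needed for this first step.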
The set-up of the $p(y_i \mid x_i)$ term requires more attention. Here we follow the
one used in Wang (2002), where $y_i \mid x_i$ is assumed to follow a Weibull
distribution with the scale parameter being equal to the inverse of $A + Be^{-cx_i}$. In
this way we establish a negative correlation between $y_i$ and $x_i$ as expected, that
is, $E(Y_i \mid X_i = x_i) \propto A + Be^{-cx_i}$. The pdf is given below
\[ p(y_i \mid x_i) = \frac{\eta}{A + Be^{-cx_i}} \left( \frac{y_i}{A + Be^{-cx_i}} \right)^{\eta - 1} e^{-\left( \frac{y_i}{A + Be^{-cx_i}} \right)^{\eta}}. \tag{5.12} \]
This is a concept called the floating scale parameter, which is particularly useful in this
case, Wang (2002). There are other choices for modelling the relationship between $y_i$
and $x_i$; these are not discussed here but can be found in Wang (2006a).
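One update of the filter (5.10) with the floating scale likelihood (5.12) can be sketched on a uniform grid of residual life values. This is a numerical illustration rather than the implementation used in the cited work: all parameter values and readings are hypothetical, the grid is truncated at a finite residual life, and the monitoring interval is assumed to be a multiple of the grid step so that the shift $x_i + t_i - t_{i-1}$ becomes an index shift.

```python
import math

def weibull_pdf(x, alpha, beta):
    """Prior p_0(x) = alpha*beta*(alpha*x)^(beta-1)*exp(-(alpha*x)^beta); beta > 1 assumed."""
    if x <= 0.0:
        return 0.0
    return alpha * beta * (alpha * x) ** (beta - 1.0) * math.exp(-((alpha * x) ** beta))

def likelihood(y, x, A, B, c, eta):
    """Equation (5.12): Weibull pdf of the reading y given residual life x,
    with the scale term A + B*exp(-c*x) and shape eta."""
    scale = A + B * math.exp(-c * x)
    z = y / scale
    return (eta / scale) * z ** (eta - 1.0) * math.exp(-(z ** eta))

def trapezoid(values, h):
    """Trapezoid rule on a uniform grid with step h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def filter_update(prev, dt, y, h, obs_params):
    """One step of (5.10) on the grid x_k = k*h.

    prev holds p_{i-1}(x_k | history); dt = t_i - t_{i-1} must be a multiple of h."""
    shift = int(round(dt / h))
    shifted = prev[shift:] + [0.0] * shift  # p_{i-1}(x_k + dt), zero past the grid end
    joint = [likelihood(y, k * h, *obs_params) * p for k, p in enumerate(shifted)]
    norm = trapezoid(joint, h)
    return [j / norm for j in joint]

def mean(values, h):
    """Mean residual life implied by a density stored on the grid."""
    return trapezoid([k * h * v for k, v in enumerate(values)], h)

# Hypothetical set-up: residual lives 0..1000 in steps of 0.5.
h = 0.5
prior = [weibull_pdf(k * h, 0.01, 2.0) for k in range(2001)]
obs = (7.0, 27.0, 0.05, 4.5)          # hypothetical A, B, c, eta
post_low = filter_update(prior, 10.0, 7.0, h, obs)    # low reading
post_high = filter_update(prior, 10.0, 30.0, h, obs)  # high reading
print(mean(post_low, h), mean(post_high, h))
```

Because the scale term decreases with $x_i$, a high reading pulls the posterior towards short residual lives, so the second mean printed should be smaller than the first.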
\[ P(X_{i-1} > t_i - t_{i-1} \mid \Im_{i-1}) = \int_{t_i - t_{i-1}}^\infty p_{i-1}(x_{i-1} \mid \Im_{i-1}) \, dx_{i-1} \tag{5.13} \]
If the item monitored failed at time $t_f$ after the last monitoring at time $t_n$, the
complete likelihood function is then given by
\[ L(\Theta) = \left( \prod_{i=1}^{n} p(y_i \mid \Im_{i-1}) \int_{t_i - t_{i-1}}^\infty p_{i-1}(x_{i-1} \mid \Im_{i-1}) \, dx_{i-1} \right) p_n(t_f - t_n \mid \Im_n) \tag{5.14} \]
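A sketch of how the logarithm of (5.14) might be evaluated for a single failed item is given below. It relies on the identity that the product of $p(y_i \mid \Im_{i-1})$ and the survival term (5.13) equals $\int_0^\infty p(y_i \mid x_i) \, p_{i-1}(x_i + t_i - t_{i-1} \mid \Im_{i-1}) \, dx_i$, which follows from (5.6) and (5.9). The grid, the parameter vector and the monitoring history are hypothetical, $t_0 = 0$ is assumed, and all times are assumed to be multiples of the grid step.

```python
import math

def weibull_pdf(x, alpha, beta):
    """p_0(x) with beta > 1 assumed, so that the density is zero at x = 0."""
    if x <= 0.0:
        return 0.0
    return alpha * beta * (alpha * x) ** (beta - 1.0) * math.exp(-((alpha * x) ** beta))

def likelihood(y, x, A, B, c, eta):
    """Equation (5.12)."""
    scale = A + B * math.exp(-c * x)
    z = y / scale
    return (eta / scale) * z ** (eta - 1.0) * math.exp(-(z ** eta))

def trapezoid(values, h):
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def log_likelihood(theta, times, readings, t_fail, h=0.5, x_max=1500.0):
    """Log of (5.14) for one item monitored at `times` that failed at `t_fail`."""
    alpha, beta, A, B, c, eta = theta
    n = int(round(x_max / h)) + 1
    grid = [k * h for k in range(n)]
    dens = [weibull_pdf(x, alpha, beta) for x in grid]  # p_0 on the grid
    loglik, t_prev = 0.0, 0.0
    for t, y in zip(times, readings):
        shift = int(round((t - t_prev) / h))
        shifted = dens[shift:] + [0.0] * shift  # p_{i-1}(x + dt), zero past the grid end
        joint = [likelihood(y, x, A, B, c, eta) * p for x, p in zip(grid, shifted)]
        # The integral of `joint` equals p(y_i | history) times the survival term (5.13).
        mass = trapezoid(joint, h)
        loglik += math.log(mass)
        dens = [j / mass for j in joint]        # posterior p_i on the grid, as in (5.10)
        t_prev = t
    k_fail = int(round((t_fail - t_prev) / h))
    return loglik + math.log(dens[k_fail])      # p_n(t_f - t_n | history)

# Hypothetical parameter vector (alpha, beta, A, B, c, eta) and history.
theta = (0.01, 2.0, 7.0, 27.0, 0.05, 4.5)
ll = log_likelihood(theta, [10.0, 20.0, 30.0], [8.0, 10.0, 15.0], 40.0)
print(ll)
```

In an estimation exercise the log-likelihoods of all items would be summed and the total maximised over the parameter vector with a numerical optimiser; that step is not shown here.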
Fig. 5.3 shows the data of overall vibration level in rms of six bearings, which is
from a fatigue experiment, Wang (2002). It can be seen from Fig. 5.3 that the
bearing lives vary from around 100 hours to over 1000 hours, which shows a
typical stochastic nature of the life distribution. The monitored vibration signals
also indicate an increasing trend with bearing ages in all cases, but with different
paths. An important observation is the pattern of vibration signals which stays
relatively flat in the early stage of the bearing life and then increases rapidly (a
defect may have been initiated). This indicates the existence of the two stage
failure process as defined earlier.
The initial point of the second stage in these bearings is identified using a control
chart called the Shewhart average level chart, and the threshold levels of the
bearings are shown in Table 5.1, Zhang (2004).
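The idea of locating the start of the second stage with a control chart can be illustrated as below. The details of the chart used in Zhang (2004) are not given here, so the sketch assumes a simple upper control limit set at the baseline mean plus three sample standard deviations; the baseline length, the multiplier, the readings and the function name are all assumptions.

```python
import statistics

def second_stage_start(readings, n_baseline=10, k_sigma=3.0):
    """Return (index, upper_limit), where index is the first reading after the
    baseline that exceeds the upper control limit, or None if none does."""
    baseline = readings[:n_baseline]
    centre = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)      # sample standard deviation
    upper = centre + k_sigma * sigma
    for idx in range(n_baseline, len(readings)):
        if readings[idx] > upper:
            return idx, upper
    return None, upper

# Hypothetical rms readings: flat at first, then rising.
readings = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.1, 5.0, 4.9, 5.2,
            5.1, 5.0, 5.3, 6.5, 7.4, 9.0]
print(second_stage_start(readings))
```

The upper limit returned by the sketch plays the role of the threshold level, and the index of the first exceedance marks the assumed start of the second stage.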
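\[ p(x_0) = \alpha \beta (\alpha x_0)^{\beta - 1} e^{-(\alpha x_0)^{\beta}} \]
and
\[ p(y_i \mid x_i) = \frac{\eta}{A + Be^{-cx_i}} \left( \frac{y_i}{A + Be^{-cx_i}} \right)^{\eta - 1} e^{-\left( \frac{y_i}{A + Be^{-cx_i}} \right)^{\eta}} \]
\[ p_i(x_i \mid \Im_i) = \frac{(x_i + t_i)^{\beta - 1} e^{-(\alpha (x_i + t_i))^{\beta}} \prod_{k=1}^{i} \psi_k(x_i, t_i)}{\int_0^\infty (z + t_i)^{\beta - 1} e^{-(\alpha (z + t_i))^{\beta}} \prod_{k=1}^{i} \psi_k(z, t_i) \, dz} \tag{5.15} \]
where
\[ \psi_k(z, t_i) = \frac{e^{-\left( y_k \left( A + Be^{-C(z + t_i - t_k)} \right)^{-1} \right)^{\eta}}}{A + Be^{-C(z + t_i - t_k)}}. \]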
α̂        β̂        Â        B̂         Ĉ        η̂
0.011    1.873    7.069    27.089    0.053    4.559
Based on the estimated parameter values in Table 5.2 and (5.15), the predicted
residual life at some monitoring points given the history information of bearing 6
in Fig. 5.3 is plotted in Fig. 5.4.
In Fig. 5.4 the actual residual lives at those checking points are also plotted with
the symbol *. It can be seen that the actual residual lives are well within the predicted
residual life distribution as expected.
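A prediction of this kind can be sketched by applying the recursion (5.10) from $p_0$ onwards, with $t_0 = 0$, and writing the result as a single product over the monitoring history, which follows the product structure of (5.15). The sketch uses the parameter estimates listed above together with the full density (5.12) for each reading, and normalises numerically on a truncated grid. The bearing data are not reproduced in this chapter, so the monitoring times and readings below are hypothetical.

```python
import math

# Parameter estimates as listed in the chapter (alpha, beta, A, B, C, eta).
ALPHA, BETA = 0.011, 1.873
A, B, C, ETA = 7.069, 27.089, 0.053, 4.559

def prior_pdf(x):
    """Weibull p_0(x); zero at x <= 0 since BETA > 1."""
    if x <= 0.0:
        return 0.0
    return ALPHA * BETA * (ALPHA * x) ** (BETA - 1.0) * math.exp(-((ALPHA * x) ** BETA))

def reading_pdf(y, x):
    """Equation (5.12) with scale term A + B*exp(-C*x)."""
    scale = A + B * math.exp(-C * x)
    z = y / scale
    return (ETA / scale) * z ** (ETA - 1.0) * math.exp(-(z ** ETA))

def residual_life_posterior(times, readings, h=0.5, x_max=1500.0):
    """Posterior density of the residual life at times[-1] on the grid x_k = k*h."""
    t_i = times[-1]
    n = int(round(x_max / h)) + 1
    unnorm = []
    for k in range(n):
        x = k * h
        value = prior_pdf(x + t_i)
        for t_k, y_k in zip(times, readings):
            # Residual life at the earlier check t_k was x + t_i - t_k.
            value *= reading_pdf(y_k, x + t_i - t_k)
        unnorm.append(value)
    norm = h * (sum(unnorm) - 0.5 * (unnorm[0] + unnorm[-1]))
    return [u / norm for u in unnorm]

def mean_residual_life(post, h=0.5):
    weighted = [k * h * p for k, p in enumerate(post)]
    return h * (sum(weighted) - 0.5 * (weighted[0] + weighted[-1]))

# Hypothetical history: three checks, with a low or a high final reading.
times = [80.5, 92.5, 104.0]
m_low = mean_residual_life(residual_life_posterior(times, [7.5, 8.0, 9.0]))
m_high = mean_residual_life(residual_life_posterior(times, [7.5, 8.0, 20.0]))
print(m_low, m_high)
```

Percentiles of the same gridded density could be read off in a similar way to give a predicted interval for the residual life at each checking point.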
Given the estimated values for parameters and associated costs such as
$c_f = 6000$, $c_p = 2000$ and $c_m = 30$, Wang and Jia (2001), we have the
expected cost per unit time for one of the bearings at various checking times $t$,
shown in Fig. 5.5.
[Figure: expected cost per unit time (vertical axis) against planned replacement time (horizontal axis, 0 to 30), with one curve for each of t = 80.5, 92.5, 104, 116.5 and 129 hrs]
Fig. 5.5. Expected cost per unit time v planned replacement time in hours from the current
time t
It can be seen from Fig. 5.5 that at t = 116.5 and 129 hours planned
replacements are recommended within the next 30 hours.
To illustrate an alternative decision chart in terms of the actual condition
monitoring reading, we transformed the cost related decision into actual reading in
Fig. 5.6 where the dark grey area indicates that if the reading falls within this area a
preventive replacement is required within the planning period of consideration.
The advantage of Fig. 5.6 is that it can not only tell us whether a preventive
replacement is needed but also show us how far the reading is from the area of
preventive replacement so that appropriate preparation can be done before the
actual replacement.
[Fig. 5.6: observed CM reading against the time (age in hours) at which the CM reading was taken (80.5, 92.5, 104, 116.5 and 129), divided into a preventive replacement area and a no preventive replacement area]
With the delay time concept, see chapter 14, system life is assumed to be classified
into two stages. The first is the normal working stage where no abnormal condition
parameters are to be expected. The second starts when a hidden defect is first
initiated with possible abnormal signals. The identification of the initial point in
the evolution of such a defect is important and has a direct impact on the
subsequent prediction model. Most research on fault diagnosis focuses on the
location of the fault, the possible cause of the fault, and of course, the type of fault.
This serves for the engineering purpose of deciding what to repair, but does not aid
the decision of when to do the task. The identification of this initial point has
received very little attention in the prognosis literature. Wang (2006b) addressed this
problem to some extent using a combination of the delay time concept and the
HMM. Much work still remains. It is possible that a multi-stage (>2) failure
process could be used, which might be more appropriate to some cases.
The definition of the underlying state and the relationship between the observed
monitoring parameters and the state of the system are issues which still need
attention. In the model presented in this chapter, the state of the system is defined
as the residual life, which is assumed to influence the observed signal parameters.
Whilst the modelling output appears to make sense, there are a few potential
problems with the approach. The first is the issue that the life of the plant is fixed
at birth (installation) but unknown. This is termed playing God. Secondly,
the residual life is not the direct cause of the observed abnormal signals. These are
more likely caused by some hidden defects which are linked to the residual life in
this chapter. To correct the first problem we can introduce another equation
describing the relationship between $X_i$ and $X_{i-1}$ deterministically or randomly.
This will allow $X_i$ to change during use, which is more appropriate. If the
relationship is deterministic, then a closed form of (5.3) is still available, but if it is
random, HMM must be used and no closed form of (5.3) exists unless the noises
associated are normally distributed. The second problem can be overcome if we
adopt a discrete or continuous state hidden Markov chain to describe the system
deterioration process where the state space of the chain represents the system state
under question.
5.6 References
Aghjagan, H.N., 1989, Lubeoil analysis expert system, Canadian Maintenance Engineering
Conference, Toronto.
Aven T, 1996, Condition based replacement policies - a counting process approach, Rel.
Eng. & Sys. Safety, 51(3), 275-281.
Banjevic D., Jardine A.K.S., Makis V., Ennis M., 2001, A control-limit policy and software
for condition based maintenance optimization, INFOR 39(1), 32-50.
Baruah P., and Chinnam R.B., 2005, HMM for diagnostics and prognostics in machining
processes, I. J. Prod. Res., 43(6), 1275-1293.
Black M., Brint, A.T., and Brailsford J.R., 2005, A semi-Markov approach for modelling
asset deterioration, J. Opl. Res. Soc. 56(11), 1241-1249.
Bunks C., McCarthy D., and Al-Ani T., 2000, Condition based maintenance of machines using
hidden Markov models, Mech. Sys. & Sig. Proc., 14(4), 597-612.
Reeves C.W., 1998, The vibration monitoring handbook, Coxmoor Publishing
Company, Oxford.
Chen, W., Meher-Homji, C.B. and Mistree, F., 1994, COMPROMISE: an effective
approach for condition-based maintenance management of gas turbines. Engineering
Optimization, 22, 185-201.
Chen D. and Trivedi K.S., 2005, Optimization for condition based maintenance with semi-
Markov decision process, Rel. Eng. & Sys. Safety, 90(1), 25-29.
Christer A.H and Wang W., 1995, A simple condition monitoring model for a direct
monitoring process, E. J. Opl. Res., 82, 258-269.
Christer A.H. and Wang W., 1992, A model of condition monitoring inspection of
production plant, I. J. Prod. Res., 30, 2199-2211.
Collacott, R.A., 1977, Mechanical fault diagnosis and condition monitoring, Chapman and
Hall Ltd., London.
Dong M., and He D., 2004, Hidden semi-Markov models for machinery health diagnosis
and prognosis, Trans. North Amer. Manu. Res. Ins. of SME, 32, 199-206.
Drake, P.R., Jennings, A.D., Grosvenor, R.I. and Whittleton, D., 1995, acquisition system
for machine tool condition monitoring. Quality and Reliability Engineering
International 11, 15-26.
Freund J.E., 2004, Mathematical statistics with applications, Pearson Prentice Hall,
London.
Harrison, N., 1995, Oil condition monitoring for the railway business. Insight 37, 278-283.
Hontelez J.A.M., Burger H.H. and Wijnmalen D.J.D., 1996, Optimum condition based
maintenance policies for deteriorating systems with partial information, Rel. Eng. & Sys.
Safety, 51(3), 267-274.
Hussin B, and Wang, W., 2006, Conditional residual time modelling using oil analysis: a
mixed condition information using accumulated metal concentration and lubricant
measurements, to appear in Proc. 1st Main. Eng. Conf, Chengdu, China.
Jardine A.K.S., Makis V., Banjevic D., Braticevic D., and Ennis M., 1998, A decision
optimization model for condition based maintenance, J. Qua. Main. Eng., 4(2), 115-121.
Jensen U., 1992, Optimal replacement rules based on different information level, Naval Res.
Log. 39, 937-955.
Kalbfleisch, J.D. & Prentice, R.L., 1980, The Statistical Analysis of Failure Time Data.
Wiley, New York.
Kumar, D., and Westberg U., 1997, Maintenance scheduling under age replacement policy
using proportional hazard modelling and total-time-on-test plotting, Euro. J. Opl. Res.,
99, 507-515.
Li, C.J. & Li, S.Y., 1995, Acoustic emission analysis for bearing condition monitoring.
Wear 185, 67-74.
Lin D., and Makis V., 2003, Recursive filters for a partially observable system subject to
random failures, Adv. Appl. Prob., 35(1), 207-227.
Lin D., and Makis V., 2004, Filters and parameter estimation for a partially observable
system subject to random failures with continuous-range observations, Adv. Appl.
Prob., 36(4), 1212-1230.
Love, C.E. & Guo, R., 1991, Using proportional hazard modelling in plant maintenance.
Quality and Reliability Engineering International, 7, 7-17.
Makis V. and Jardine A.K.S., 1991, Computation of optimal policies in replacement models,
IMA J. Maths. Appl. Business & Industry, 3, 169-176.
Matthew C., and Wang W., 2006, A comparison study of proportional hazard and stochastic
filtering when applied to vibration based condition monitoring, submitted to Int. Tran
OR.
Meher-Homji, C.B., Mistree, F. and Karandikar, S., 1994, An approach for the integration of
condition monitoring and multi-objective optimization for gas turbine maintenance
management. International Journal of Turbo and Jet Engines, 11, 43-51.
Neal M., and Associates, 1979, Guide to the condition monitoring of machinery, DTI,
London.
Samanta, B., Al-Balushi, K.R., Al-Araimi, S.A., 2006, Artificial neural networks and
genetic algorithm for bearing fault detection, Soft Computing, 10(3), 264-271.
Wang W., 2002, A model to predict the residual life of rolling element bearings given
monitored condition monitoring information to date, IMA. J. Management Mathematics,
13, 3-16.
Wang W., 2003, Modelling condition monitoring intervals: A hybrid of simulation and
analytical approaches, J. Opl. Res Soc, 54, 273-282.
Wang W., 2006a, A prognosis model for wear prediction based on oil based monitoring, to
appear in J. Opl. Res. Soc.
Wang W., 2006b, Modelling the probability assessment of the system state using available
condition information, to appear in IMA. J. Management Mathematics.
Wang W. and Christer A.H., 2000, Towards a general condition based maintenance model
for a stochastic dynamic system, J. Opl. Res. Soc., 51, 145-155.
Wang W., and Jia, Y., 2001, A multiple condition information sources based maintenance
model and associated prototype software development, proceedings of COMADEM
2001, Eds. A. Starr and Raj B.K.N. Rao, Elsevier, 889-898.
Wang W., and Zhang W., 2005, A model to predict the residual life of aircraft engines based
on oil analysis data, Naval Research Logistics, 52, 276-284.
Wong, M.L.D., Jack, L.B., Nandi, A.K., 2006, Modified self-organising map for
automated novelty detection applied to vibration signal monitoring, Mech. Sys.
& Sig. Proc., 20(3), 593-610.
Zhan Y., Makis V., and Jardine A.K.S., 2006, Adaptive state detection of gearboxes under
varying load conditions based on parametric modeling, Mech. Sys. & Sig. Proc., 20(1),
188-221.
Zhang W., 2004, Stochastic modeling and applications in condition based maintenance,
PhD thesis, University of Salford, UK.
Zhang Z.G. and Love C.E., 2000, A simple recursive Markov chain model to determine the
optimal replacement policies under general repairs, Com. and OR, 27(4), 321-333.