A Reliability Centered Approach To Remote Condition Monitoring. A Railway Points Case Study
www.elsevier.com/locate/ress
Abstract
Railway turnouts, consisting of switches and a crossing, are complex electro-mechanical devices which are exposed to severe
environmental influences and which are essential for the operation of any railway bar horizontal lifts. Their safe and reliable operation must
be assured if the rail mode of transport is to flourish. Conventionally, the continuous availability of turnout mechanisms has been assured by
high levels of routine maintenance, to some extent tailored to the criticality of a particular point location. However, traffic increases and
shortened maintenance windows require better approaches to turnout maintenance. The authors of the present paper undertook the
development of algorithms to detect gradual failure in railway turnouts, which should allow a move to an RCM2 approach to the management
of switch and crossing maintenance. They demonstrate the approach using data from tests on a commonly found point mechanism and
include a discussion of the benefits of adopting a Kalman Filter for pre-processing the data collected during tests.
© 2003 Elsevier Science Ltd. All rights reserved.
Keywords: Remote condition monitoring; Reliability centred maintenance; Kalman Filter; Point mechanism; Failure mode and effect analysis
PII: S0951-8320(02)00166-7
F.P. García Márquez et al. / Reliability Engineering and System Safety 80 (2003) 33–40
(straight through) or reverse. The switches move from normal to reverse or reverse to normal direction. Turnouts are perhaps the most important infrastructure elements of the railway system and affect its safety greatly. The Potters Bar accident of 10th May, 2002 in England was caused by a faulty turnout while the consequences of the Eschede accident in Germany were aggravated by a point moving underneath the train.

Of railway infrastructure component failures on high speed lines, 55% are due to signalling equipment and turnouts [10], where ‘signalling equipment’ covers signals, track circuits, interlockings, automatic train protection (ATP) or LZB (track loop-based ATP), and the traffic control centre. From another point of view, the annual cost of maintaining points is high compared to other infrastructure elements, about UKP (United Kingdom Pound) 3.4 million per year for about 1000 km of railway. TC-TCR track circuits, for example, cost UKP 2.1 million per year for the same area. Of the points expenditure, UKP 1.2 million is for clamp lock (hydraulic) turnouts and UKP 1.4 million for electrically operated turnouts (data provided by a British asset manager). Turnouts can also be used to implement flank protection for a train route allocated to another train. This is achieved by positioning the blades of the turnout in such a way that a train driving through the turnout is not directed into a track segment belonging to the route of the other train.

The two safe positions of the moving parts of a turnout, normal and reverse, are generally detected using switches operated by the blades or their operating mechanisms. In order to ensure high availability and reliable and safe operation, points require regular inspection and maintenance. Currently, such maintenance is carried out on a time basis with allowance being made for the operational criticality of a particular point. A better and more cost-effective approach, though, is reliability centred maintenance, which is being adopted by a number of railway undertakings for point mechanism maintenance.

2.2. Turnout failures and maintenance

Reliability centred maintenance (RCM1) is a process used to decide what must be done to ensure that any physical asset, system or process continues to do whatever its users want it to do [4,8,12]. Therefore, RCM1 provides powerful rules for deciding whether a failure management policy is technically appropriate, providing precise criteria for deciding how often routine tasks should be carried out. RCM1 identifies ways in which the system can fail to live up to these expectations. This must generally be followed by a failure mode and effects analysis (FMEA) which allows an assessment of the consequences of failure. A substantial number of research projects concerning railway infrastructure have been carried out or are still in progress, for example, REMAIN (reliability and maintainability in European Rail Transport), ROMAIN (Railway Open Maintenance Tool), INFRACOST (The Cost of Railway Infrastructure), RAIL (reliability centred maintenance approach to infrastructure and logistic of railway operation), all of which justify the contribution of RCM1 for point mechanisms in improving the safety and reliability of railways.

FMEA is a systematic analysis of the potential failure modes of a component of a system [9]. It includes the identification of possible failure modes, determination of the potential causes and consequences and an analysis of the associated risk. It also includes a record of corrective actions or controls implemented, resulting in a detailed control plan. FMEAs can be performed on both the product and the process. Typically, an FMEA is performed at the component level, starting with potential failures and then tracing their effects up to the ultimate consequences. The FMEA allows the identification of the most critical components and the likely failure mechanisms, thus leading to the specification of system parameters to be monitored.

Primary performance parameters of complex mechanisms, such as railway points, are speed of movement, vibration, supply voltage, power, throwing time, temperature, current, force, etc. Based on these performance parameters, RCM1 can be used to define terms such as risk, quality, control, comfort, economy, containment, ergonomics, etc. In practice, condition-based maintenance decisions are based substantially upon assessments of the condition of the system obtained at discrete monitoring time intervals [15]. This type of condition monitoring is called indirect condition monitoring in contrast to direct monitoring, which measures the actual condition. The latter employs advanced electronics, sensors and transducers, computing and communications technology. Their measurements (vibration, supply voltage, power) can be embodied in remote condition monitoring systems (RCM2) [2–4,11]. RCM2 leads to improved reliability and can pay for itself in terms of cost-effectiveness since staff do not have to visit installations as frequently. The integration of the two types of RCMi (i = 1, 2) is called RCM², with the overall aim of using advanced electronics, control, computing and communication technologies to address the multiple objectives of cost effectiveness, improved reliability and services.

In addition to the data collection required for RCM2, it is also necessary to process the large amount of information to provide a warning when the device moves out of tolerance or adjustment. Algorithm design for the detection of trends and failure patterns has been undertaken by many researchers but only a few papers dealing with the dynamics of railway turnouts have been found in the literature [1,4,13,14]. The present paper describes a simple approach to RCM² as applied to railway turnout mechanisms in a case study.

3. Description of turnout used in experiment

Turnouts are assembled from switches and a crossing where the moving parts are often described as the ‘points’. Most standard point machines contain a switch and lock
5. Model criteria
and the model must detect faults in both directions of the turnout mechanism movement. This was the reason for the authors to choose a reference dynamic system for their analysis.
as follows: [equation not recoverable from the extracted text; surviving fragments: "n = t", "m = T", "tmax", "Xmax", "Xj"]
Fig. 3. Criteria employed for detecting faults in a point mechanism. (a) First criterion, (b) second criterion and (c) third criterion.
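As a rough illustration of margin-based criteria of this kind, the following Python sketch checks a newly recorded force curve against a reference curve. It is not the authors' algorithm: the two checks shown (a time margin on the position of the force maximum and a force margin on the largest absolute deviation) are assumptions loosely based on the margins mentioned in the results, and every function name, parameter and numerical value is hypothetical.

```python
from typing import Sequence


def peak_time(times: Sequence[float], force: Sequence[float]) -> float:
    """Return the time (s) at which the force curve (N) reaches its maximum."""
    peak_index = max(range(len(force)), key=lambda i: force[i])
    return times[peak_index]


def is_faulty(
    times: Sequence[float],
    reference: Sequence[float],
    new_curve: Sequence[float],
    time_margin_s: float,
    force_margin_n: float,
) -> bool:
    """Flag a movement whose force curve leaves the assumed margins."""
    # Hypothetical check 1: has the force maximum shifted too far in time?
    peak_shift = abs(peak_time(times, new_curve) - peak_time(times, reference))
    # Hypothetical check 2: does the curve deviate too far from the reference?
    max_deviation = max(abs(n - r) for r, n in zip(reference, new_curve))
    return peak_shift > time_margin_s or max_deviation > force_margin_n


# Made-up sample data: force (N) sampled every 0.5 s during one movement.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
reference = [0.0, 100.0, 300.0, 150.0, 0.0]
healthy = [0.0, 110.0, 290.0, 160.0, 0.0]
obstructed = [0.0, 100.0, 200.0, 450.0, 0.0]

healthy_flag = is_faulty(times, reference, healthy, 0.3, 50.0)
obstructed_flag = is_faulty(times, reference, obstructed, 0.3, 50.0)
```

Because the turnout must be monitored in both directions, a sketch like this would need one reference curve and one set of margins per direction of movement.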
Fig. 4. Difference between the actual reference curve and the new curve in absolute values with and without Kalman Filter. (a) Normal to reverse direction and
(b) reverse to normal direction.
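The comparison shown in Fig. 4 can be sketched in a few lines of Python. This is a minimal sketch, assuming a scalar random-walk Kalman Filter applied to the measured force signal; the state model actually used by the authors is not given in this excerpt, so the function names and the variances `process_var` and `meas_var` are hypothetical choices for illustration only.

```python
from typing import List, Sequence


def kalman_smooth(
    measurements: Sequence[float],
    process_var: float = 1.0,   # assumed variance of the random-walk state
    meas_var: float = 25.0,     # assumed variance of the sensor noise
) -> List[float]:
    """Filter a noisy signal with a one-dimensional Kalman Filter."""
    if not measurements:
        return []
    estimate = measurements[0]  # start from the first measurement
    error_var = meas_var        # initial uncertainty of that estimate
    filtered = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        error_var += process_var
        # Update: feed the new measurement back into the estimate.
        gain = error_var / (error_var + meas_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered


def abs_difference(reference: Sequence[float], curve: Sequence[float]) -> List[float]:
    """Absolute difference between a reference curve and a new curve."""
    return [abs(c - r) for r, c in zip(reference, curve)]


# Made-up force samples (N) for one movement, with and without filtering.
reference = [0.0, 100.0, 300.0, 150.0, 0.0]
measured = [4.0, 93.0, 308.0, 144.0, 5.0]
raw_diff = abs_difference(reference, measured)
filtered_diff = abs_difference(reference, kalman_smooth(measured))
```

The predict and update steps correspond to the feedback description of the filter: the state is estimated first and then corrected by the next measurement. How much the filter helps depends on the chosen variances, which would have to be tuned against real data.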
of severity (Appendix A). A total of 151 experiments were carried out, 79 in the reverse to normal direction and 72 in the normal to reverse direction.

The most important results are as follows. With a Kalman Filter, we could detect 100% of faults in the reverse to normal direction in the 79 experiments. Without the Kalman Filter this drops to 97.33%. In this direction, the margin employed for detecting the maximum position is 0.3 s less when using the Kalman Filter, and the margin considered for detecting irregularities in curves is reduced to 91.3%. In the other direction, we can currently detect only 97.1% of faults when using the Kalman Filter and without it only 94.2%. The margin for detecting irregularities is 85.71% better, and the margin in maximum position is 0.3 s worse.

8. Conclusions

Turnouts are probably the most important infrastructure elements of the railway system and affect its safety greatly. The standard railway turnout is a complex electro-mechanical device with many potential failure modes. In order to ensure high availability and reliable and safe operation, points require regular inspection and maintenance.

Reliability-centred maintenance (RCM1) provides powerful rules for deciding whether a failure management policy is technically appropriate, providing precise criteria for deciding how often routine tasks should be carried out. The technique employs advanced electronics, sensors and transducers, computing and communications technology embodied in remote condition monitoring systems (RCM2). RCM2 leads to improved reliability and can pay for itself in terms of cost-effectiveness since staff do not have to visit installations as frequently. The integration of the two types of RCMi (i = 1, 2) is called RCM². The authors of the present paper have described a simple approach to RCM² as applied to railway turnout mechanisms, based on a case study.

Faults in turnout mechanisms must be detected quickly and reliably if the information is to be useful. A turnout mechanism is a discrete dynamic system, where data must be processed on line. Any detection method to be adopted must use a simple model for detecting faults quickly by analysing data in real time. The model for detecting faults must adapt to external conditions and must detect faults in both directions of the turnout mechanism movement. This was the reason for the authors to choose a reference dynamic system for their analysis.

The RCM2 implementation was developed as part of a research project. The data collection and algorithm developments were carried out during the year 2000 with a total of 151 experiments at a test site. The data collected refer to force (N) versus time (s). If we analyse the difference between the actual data and the reference data in the form of absolute values, we can detect the majority of faults as they develop. The objective for using Kalman filtering in this study was to increase the reliability of the model presented to the rule-based decision mechanism. The Kalman Filter estimates a process by using a form of feedback control: the filter estimates the process state at some time and then obtains feedback in the form of measurements.

With a Kalman Filter, the authors can currently detect 100% of faults in the reverse to normal direction, and without the Kalman Filter this drops to 97.33%. In the other direction, we can currently detect only 97.1% of faults with the Kalman Filter and without it only 94.2%. In general, employing the Kalman Filter has improved the margins of the criteria in both directions.

The authors cannot explain why detection in the normal to reverse direction is not as successful as for reverse to normal. In their opinion they have achieved a minimum rejection rate, but they continue to improve their methods.

Appendix A. Sample list of faults

• 15 mm obstruction at second bearer on normal side of points;
• 13 mm obstruction at eighth bearer on reverse side of points;
• 12 mm obstruction at toe on normal side of points;
• Back drive overdriving at heel on normal side with dry slide chairs;
• Back drive slack end off at toe end;
• Back drive slack end off at toe end (LHS side drive basket slack end off);
• Diode snubbing block disconnected;
• Dry slide chairs;
• Low tension on motor brush;
• Operational contact in original position;
• Tight lock on reverse side;
• Tight lock on reverse side (sand on all bearers on both sides).

References

[1] Andersson C, Dahlberg T. Wheel/rail impacts at a railway turnout crossing. J Rail Rapid Transit 1998;212(2):123–34.
[2] Christer AH, Wang W. A simple condition monitoring model for a direct monitoring process. Eur J Oper Res 1995;82:258–69.
[3] Christer AH, Wang W, Sharp JM. A state space condition monitoring model for furnace erosion prediction and replacement. Eur J Oper Res 1997;101:1–14.
[4] García Márquez FP, Schmid F, Conde J. Mantenimiento Centrado en la Fiabilidad y Monitorización Remota Basada en la Condición, RCM2: Un caso de Estudio. Gestión de Activos Industriales 2002; in press.
[5] Jacobs OLR. Introduction to control theory, 2nd ed. New York: Oxford University Press; 1993.
[6] Kalman RE. A new approach to linear filtering and prediction problems. Trans ASME J Basic Engng 1960;35–45.
[7] Kennedy A. Risk management and assessment for rolling stock safety cases. Proc Inst Mech Engrs: Part F 1997;211:67–72.
[8] Moubray J. Reliability-centred maintenance. Oxford, UK: Butterworth/Heinemann; 1997.
[9] Price CJ, Pragh DR, Wilson MS, Snooke N. The flame system: automating electrical failure mode and effects analysis (FMEA). Proc Reliab Maintain Symp 1995;90–5.
[10] REMAIN Consortium. Modular system for reliability and maintainability management in European rail transport. Final Report, IITB; 1998.
[11] Roberts C, Fararooy S. Remote condition monitoring into the next millennium. Proc Comput Aid Des Manuf Oper Railway 1998.
[12] SAE Standard JA1011, Evaluation criteria for reliability-centered maintenance (RCM) process. Commonwealth Drive Warrendale, USA: International Society of Automotive Engineers Department; 1999.
[13] Shimonae T, Kawakami T, Miki H, Matsuda O, Tekeuchi H. Development of a monitoring system for electric point machines. IRSE Aspect Int Conf 1991;395–401.
[14] Stott PF. Automatic open level crossing a review of safety. London, UK: Her Majesty’s Stationery Office; 1987.
[15] Wang W. A model to determine the optimal critical level and the monitoring in condition based maintenance. Technical Report CMS-99-07; 1999.