SMRP Case Study
Ron Moore
The RM Group, Inc.
Knoxville, TN
and…
Ron Rath
Diesel Technology Company
Grand Rapids, MI
A large automotive parts plant has recently begun the process of introducing a
number of methodologies to improve manufacturing performance. Among the
tools being applied at this plant is TPM, with a particular focus on operator care
and minor PM, or as some people would say, TLC -- Tender Loving Care in the
form of actions such as Tightening, Lubricating, and Cleaning. Other tools are
also being introduced and applied in an integrated way, including continuous
improvement teams, improved cell design, pull systems, process mapping, etc.,
but the focus of this case history is how TPM was integrated with RCM.
The application of TPM at this plant, and in the author’s experience at other
plants, has focused most of its attention on operator PM and basic care, operator
"condition monitoring", etc. These practices are essential for assuring
manufacturing excellence, but used alone are not likely to be sufficient. In this
case, a focus on TPM was not considered to give adequate consideration to
other, perhaps equally valid or even more advanced methodologies, such as
reliability centered maintenance, predictive maintenance, root cause analysis,
maintenance planning, etc. This view was confirmed by the maintenance
manager for the business, who felt that while TPM was an effective tool for
assuring basic care for the equipment, for detecting the onset of failures, and
often for preventing failures in the first place, it frequently overlooked other
maintenance tools and requirements. This, in turn, often resulted in equipment
breakdowns, and in frequent reactive maintenance, not to mention the largest
loss -- reduced production capacity. As a result we embarked upon an effort to
combine the best of TPM and RCM in order to provide the most effective
processes for both maintenance and production. In the process we expected to
be able to provide manufacturing excellence -- maximum uptime, minimum unit
cost of production, maximum equipment reliability. Each methodology is
discussed individually below and then the process for combining them is
provided.
TPM was developed in Japan by Seiichi Nakajima [1]. With its Japanese origins,
the strategy places a high value on teamwork, consensus building, and
continuous improvement; and tends to be more structured in its cultural style --
everyone understands their role and generally acts according to an understood
protocol. Teamwork is a highly prized virtue; whereas individualism may be
frowned upon. This basic underlying genesis of the Japanese TPM strategy is a
significant issue to be understood when applying TPM to a given manufacturing
plant. This may be especially true in a US manufacturing plant, since US culture
tends to value individualism more, to value people who are good at crisis
management, who rise to the occasion, who will take on seemingly insurmountable
challenges, and prevail. We tend to reward those who respond quickly to crises
and solve them. We tend to ignore those who simply "do a good job". They are
not particularly visible -- no squeaking, and therefore no grease. This "hero
worship" and individualism may be an inherent part of our culture, and may make
the implementation of TPM more difficult.
This is not to say that TPM faces overly serious impediments, or is an ineffective
tool in non-Japanese manufacturing plants. On the contrary, when an
organization’s leadership has made it clear that the success of the organization is
more important than the individual, while still recognizing individual contribution,
a team-oriented corporate culture can develop which transcends the tendency
toward individualism, and success is more likely. Many plants worldwide
have used TPM effectively, and most plants have tremendous need for improved
communication and teamwork which could be facilitated using the TPM
methodology. Most would be better off with fewer heroes, and more reliable
production capacity. Considerable progress has been made through programs
and strategies like TPM, but it is evident that strong barriers to
communication and teamwork still exist in many plants.
How often have you heard from operations people, "If only maintenance would fix
the equipment right, then we could make more product!", and from maintenance
people, "If operations wouldn't run the equipment into the ground, and would give
us the time to do the job right, we could make more product!", or from engineering
people, "If they'd just operate and maintain the equipment properly, the
equipment is designed properly (he thinks) to make more product!", and so on?
The truth lies in "the middle", with "the middle" being a condition of teamwork,
combined with individual contribution and responsibility, and effective
communication. A case history for effecting this "middle" is provided below.
Total Productive Maintenance -- the name itself implies that all maintenance
activities must be productive, and that they should yield gains in productivity.
Reliability Centered Maintenance -- its name implies that the maintenance
function must be focused on assuring reliability in equipment and systems. As
we’ll see, RCM also calls for an analysis for determining maintenance needs.
Properly combined, the two work well together.
The basic pillars of TPM and some thoughts on their relationship to an RCM
strategy are:
3. TPM calls for improving maintenance efficiency and effectiveness. This is also
a hallmark of RCM. Many plants make extensive use of preventive maintenance
or so-called PM's. However, while inspection and minor PM’s are appropriate,
intrusive PM's for equipment overhaul may not be, unless validated by equipment
condition review, since according to RCM studies little equipment is truly
average. RCM helps determine which PM is most effective, which should be
done by operators, which should be done by maintenance, and which deserve
attention from design and procurement. PM’s become more effective since they
are based on sound analysis, using appropriate methods.
4. TPM calls for training people to improve their job skills. RCM will help in
identifying the failure modes which are driven by poorly qualified staff, and hence
identify the areas for additional training. In some cases it may actually eliminate
the failure mode entirely, thus potentially eliminating the need for training in that
area. RCM is highly supportive of TPM, since training needs can be more
effectively and specifically identified and performed.
6. TPM calls for the effective use of preventive and predictive maintenance
technology. RCM methods will help identify when and how to most effectively use
preventive and predictive maintenance through a failure modes analysis to
determine the most appropriate method to detect onset of failure, e.g., using
operators as "condition monitors", or using a more traditional approach in
predictive tools.
Reliability Centered Maintenance Principles -- RCM
2. Identify the failure modes which can result in any loss of system function.
In doing the analysis, equipment histories are needed, and teamwork is also
necessary in order to gather the appropriate information for applying the above
steps. However, not having equipment histories in a database should not negate
the ability to do an RCM analysis. As is demonstrated below, equipment histories
can be found in the minds of the operators and technicians. Moreover, operators
can help detect the onset of failure, and take action to avoid these failures.
Like TPM, RCM describes maintenance in four categories: preventive, predictive,
failure finding, and run to failure. At times the difference between these can be
elusive.
RCM analysis as traditionally practiced can require lots of paperwork -- it’s very
systematic and can be document intensive. It has been shown to be very
successful in a number of industries, but particularly in the airline and nuclear
industries which have an inherent requirement for high reliability, and a very low
tolerance for the risk of functional failure.
There are however some potential RCM pitfalls. Most of these are addressed by
the better practitioners, but let’s discuss them for completeness. For example,
RCM implies that if backup equipment exists, run to failure is acceptable, since it
has no effect on system function. However, this could be risky in that run to
failure may result in ancillary damage; or the backup, if not cared for, may not
operate, or may not operate for long; or it may reinforce a historical culture of
run to failure and reactive maintenance, which typically costs much more.
Moreover, its traditional or historical focus may tend to be primarily on PM activities,
versus a more proactive, integrated approach, which includes the effects of
product mix, production practices, procurement practices, installation practices,
commissioning practices, stores practices, etc. The more advanced application
by most current practitioners does include these effects, and caution should be
exercised to assure that these issues are included.
At the automotive plant, the first step was to bring together a cross functional
team of people to review a critical production line with the goal of identifying
those failures which were resulting in a loss of function -- production line capacity.
This team generally consisted of production supervisors, operators,
maintenance supervisors, engineers, mechanics, technicians, electricians, etc. --
people who knew the production process and the equipment. We also, as
necessary, brought in support staff such as purchasing and stores people to help
in defining and eliminating failures.
From the outset, however, we defined a functional failure in the system (the
production line) as anything which resulted in loss of production output, or
resulted in incurring extraordinary costs. That is to say, we did not restrict a
functional failure to the equipment, but rather defined a functional failure very
broadly. We also looked at the frequency of these failures and their effects,
principally their financial effect as measured by the value of lost uptime or extra
costs.
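As a rough illustration of weighing frequency against financial effect, the following Python sketch ranks functional failures by an estimated annual loss. It is only a sketch under stated assumptions: the class, field names, the loss formula (events per year times the value of lost uptime plus extra cost per event), and every number in the example are hypothetical placeholders, not data from the plant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionalFailure:
    """One functional failure of the production line (hypothetical record layout)."""
    description: str
    events_per_year: float
    downtime_hours_per_event: float
    extra_cost_per_event: float = 0.0

    def annual_loss(self, value_per_uptime_hour: float) -> float:
        # Assumed formula: frequency x (value of lost uptime + extra costs).
        lost_uptime_value = self.downtime_hours_per_event * value_per_uptime_hour
        return self.events_per_year * (lost_uptime_value + self.extra_cost_per_event)


def rank_by_loss(failures, value_per_uptime_hour):
    """Return the failures sorted from largest to smallest estimated annual loss."""
    return sorted(
        failures,
        key=lambda f: f.annual_loss(value_per_uptime_hour),
        reverse=True,
    )


# Placeholder numbers, for illustration only.
failures = [
    FunctionalFailure("broken drill bit", events_per_year=250, downtime_hours_per_event=0.5),
    FunctionalFailure("poor raw material lot", events_per_year=12,
                      downtime_hours_per_event=8, extra_cost_per_event=1000),
]
ranked = rank_by_loss(failures, value_per_uptime_hour=100)
for f in ranked:
    print(f.description, f.annual_loss(100))
```

Because the definition of functional failure is deliberately broad, the same record can describe equipment, raw material, utility, or staffing problems, so they can all be compared on one financial scale.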
Initially we focused on the first production step, say step A, but once we finished
with identifying all the major functional failures in step A, we looked downstream
and asked questions: Are failures in step B causing any failures in step A? Are
failures in utilities causing any failures in step A? Are any failures in purchasing
or personnel causing failures in step A? And so on. We walked through each
step in the production line looking for areas where actions (or failures to act)
were resulting in production losses or major costs (functional failures). We also
made sure that all the support functions were encouraged to communicate with
the team regarding how the team could help the support functions more
effectively perform their job through better communication.
1. Raw material quality was a contributor to functional failures, e.g., lost uptime,
lost quality, poor process yields, etc. Operations and maintenance have little
control to correct this problem, but, they can advise others of the need to correct
it.
6. Inherent design features (or lack thereof) made maintenance a difficult and
time consuming effort, e.g., insufficient pull and lay down space, etc. Lowest
installed cost was the principal criterion for capital projects, versus a more
proactive lowest life cycle cost.
7. Poor power quality was resulting in potential electronic problems, and was
believed to be causing reduced electrical equipment life. Power quality hadn’t
been considered by the engineers as a factor in equipment and process
reliability.
10. And last, and perhaps most importantly from an equipment reliability
standpoint, precision alignment of the machine tools was sorely lacking and if
implemented should dramatically improve machinery reliability, and hence
reduce system failures.
Beyond the general findings above about functional failures in the system, we
also found that three separate sets of production equipment were key to
improving the overall system (production line) function. Functional failures in this
equipment were resulting in the bottlenecking of the production line. It varied from
day to day as to which equipment in the production process was the "bottleneck",
depending on what equipment was down. Therefore all three steps and
equipment were put through an RCM analysis to develop the next level of detail.
At this next level, a method was established for assessing the criticality of the
equipment by creating a scoring system associated with problem 1) severity, 2)
frequency, and 3) detectability. This is shown in Table 1.
Table 1. Criticality Ranking M
The score for a given problem was the product of the three factors, Severity x
Frequency x Detectability. If a given problem was given a score of more than 4
by the group, then it was considered one which required additional attention.
Scores of less than 4 were considered to be those which operators could
routinely handle, and/or were of lesser consequence. For example, suppose a
functional failure was detectable by the operator, e.g., broken drill bit. Further,
suppose it was occurring daily, and could be repaired in one shift. Then it would
receive a score of 1x2x3 = 6, a fairly serious problem. Very few problems
occurred weekly, and required more than one shift to correct, and were not
detectable by an operator. Finally, this scoring system could obviously be further
refined to provide greater definition on a given system or set of problems.
Readers are encouraged to develop their own models which adequately address
their particular situation; or to use models already existing in their organizations
for product or process FMEA’s.
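The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function and constant names are hypothetical, the rank values themselves come from Table 1 and are not reproduced here, and the text does not say how a score of exactly 4 was treated, so this sketch flags only scores strictly greater than 4.

```python
ATTENTION_THRESHOLD = 4  # from the text: scores of more than 4 need additional attention


def criticality_score(severity: int, frequency: int, detectability: int) -> int:
    """Product of the three ranking factors (Severity x Frequency x Detectability)."""
    for rank in (severity, frequency, detectability):
        if rank < 1:
            raise ValueError("ranks are assumed to be positive integers")
    return severity * frequency * detectability


def needs_additional_attention(score: int) -> bool:
    """True for scores above the threshold; lower scores are left to routine operator handling.

    A score of exactly 4 is not addressed in the text; treating it as routine is an assumption.
    """
    return score > ATTENTION_THRESHOLD


# The broken drill bit example from the text scores 1x2x3. Which factor carries
# which rank does not change the product.
drill_bit_score = criticality_score(1, 2, 3)
print(drill_bit_score, needs_additional_attention(drill_bit_score))
```

A multiplicative score means a low rank on one factor cannot be offset by addition alone; a problem needs at least moderate ranks on several factors to exceed the threshold.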
One finding of this review process was that a considerable amount of equipment
really needed a complete overhaul. That is, it needed to be restored to "like new"
condition (a TPM principle), but it was found using RCM methods as we looked
at the failure modes and effects associated with the system which defeated
function. This equipment subsequently went through a "resurrection" phase
wherein a team of people -- operators, electricians, electronic technicians,
mechanics, and engineers -- thoroughly examined the equipment and determined
the key requirements for an overhaul, including the key steps for verifying that
the overhaul had been successful. Less intensive, but equally valuable, (and
summarizing) we found the following model to be effective:
This model was then used to analyze the equipment and assign a criticality rating
which then dictated the priority of action required for resolution of the problems
being experienced.
We also found that it was critical to our success to begin to develop better
equipment histories, to plan and schedule maintenance, to be far more proactive
in eliminating defects from the operation, regardless of whether they were rooted in
process, equipment, or people issues. This was all done with a view of not
seeking to place blame, but seeking to eliminate defects. All problems were
viewed as opportunities for improvement, not searches for the guilty. Using this
approach, it was much easier to develop a sense of teamwork for problem
resolution.
Function: Drill three holes in a part to a given diameter and depth, within a given cycle time
In this example, the severity code of the broken drill bit is 9+. When added to
other issues such as quantity of downtime, bottlenecking, etc., this became a
critical problem throughout the plant, and was addressed as rapidly as possible,
using root cause failure analysis and putting in place practices to eliminate the
problem.
Improved Design
Summary
The first step in combining TPM and RCM is to perform a streamlined RCM
analysis of a given production line as the system. A functional failure of the
"system" is defined as anything which causes loss of production capacity, or
results in extraordinary costs. It is focused on failure modes, frequencies and
effects, and is extended to identify those failure modes which would be readily
detected and prevented by proper operator action, as well as detailing those
failure modes and effects which require more advanced methodology and
techniques such as predictive maintenance, better specifications, better repair
and overhaul practices, better installation procedures, etc., so as to prevent
defects from being introduced in the first place. The next step is to apply TPM
principles related to restoring equipment to like new condition, having operators
provide basic care (TLC) in tightening, lubricating and cleaning, and applying
preventive and predictive techniques more effectively. Operators represent the
best in basic care and condition monitoring, but very often they need the support
of more sophisticated problem detection and problem solving techniques. These
are facilitated by integration of TPM and RCM methods.
Results
Process. The results thus far have been very encouraging. The cross-functional
teams have identified areas wherein operators through their actions can avoid,
minimize, or detect developing failures early on such that maintenance
requirements are minimized and such that equipment and process reliability are
improved. Moreover, maintenance resources are now being applied more
effectively, to assure that they are involved in those areas which truly
require strong mechanical, electrical, or other expertise in getting to the more
serious and difficult issues and problems. The application of these principles is in
fact much the same as the way we behave with our cars; that is, we as operators of
our cars do routine monitoring and observation, and detect developing problems long
before they become serious. As we detect problems developing, we make
changes in the way we operate, and/or we have a discussion with a mechanic.
As necessary we bring our car in to the mechanic, describing the symptoms for a
more in-depth diagnosis and resolution of the problem using their superior skills.
Similarly, we as operators of our cars can preclude failures and extend
equipment life by applying basic care such as routine filter and oil changes which
don’t require much mechanical expertise, leaving the mechanic to do the more
serious and complex jobs, such as replacing the rings, seals, transmission, etc.
Equipment. Results for the machines to which the methodology has been applied
have been very encouraging. For example, before this method was applied to one
bottleneck production area it was common that 6 out of 16 machines would be
unavailable for production, with only one of those typically being down for
planned maintenance. After the process was applied, 15 of 16 machines are now
routinely available, with one machine still typically down for planned
maintenance. This represents an increase of 50% in equipment availability. In
another area maintenance was routinely called in for unplanned downtime. After
application of this method, production staff were trained in routine operational
practices which essentially eliminated the need for emergency repairs and for
maintenance to "come to the rescue". This eliminated many unnecessary work
orders, improving equipment availability, and reducing costs. The methodology
continues to be applied in the plant with continuing improvement.
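The 50% figure above is a relative increase in the number of available machines, which a short calculation makes explicit. The sketch below uses only the machine counts stated in the text; the function and variable names are hypothetical.

```python
def availability(machines_available: int, machines_total: int) -> float:
    """Fraction of machines available for production."""
    if machines_total <= 0:
        raise ValueError("machines_total must be positive")
    return machines_available / machines_total


TOTAL_MACHINES = 16

# Before: 6 of 16 machines were commonly unavailable, so 10 were available.
before = availability(TOTAL_MACHINES - 6, TOTAL_MACHINES)
# After: 15 of 16 machines are routinely available.
after = availability(15, TOTAL_MACHINES)

relative_increase = (after - before) / before  # change relative to the starting level
percentage_point_gain = after - before         # absolute change in availability

print(f"before: {before:.1%}, after: {after:.1%}")
print(f"relative increase: {relative_increase:.0%}")
print(f"gain in percentage points: {percentage_point_gain * 100:.2f}")
```

Going from 10 to 15 available machines is a 50% increase relative to the starting level; expressed as availability, that is a rise from 62.5% to 93.75%, or 31.25 percentage points.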
In closing, it must be said that methodologies such as TPM, RCM, TQM, RCFA,
etc. all work when consistently applied. However, as a practical matter, each
methodology appears to have its focus or strength. For example, TPM tends to
focus on maintenance prevention and operator care. RCM focuses on failure
modes and assuring system function. Both are good methodologies. Both work.
However, in this instance, we’ve found that combining the two actually led to a
better process and to improvements in teamwork and cooperation at the
production level, leading to improved performance and output, and lower
operating costs.
References:
1. Total Productive Maintenance, Seiichi Nakajima, Productivity Press, Portland,
OR, 1993.
3. Reliability Centered Maintenance, A.M. Smith, McGraw Hill, New York, NY,
1993.