Maintainability RAMS
Using the asset life cycle to proactively consider maintenance can vastly extend the
life and reduce the life cycle cost of assets.
Maintainability and operability should be at the forefront of every life cycle stage of
an asset.
WHAT IS MAINTAINABILITY
Maintainability in engineering design is “the relative ease and economy of
time and resources with which an engineered installation can be retained
in, or restored to, a specified condition when maintenance is performed by
personnel having specified skill levels, using prescribed procedures and
resources, at each prescribed level of maintenance and repair”.
We can distinguish between two general types of maintenance: reactive and proactive
maintenance. Reactive maintenance is performed in response to unplanned or unscheduled
downtime of the unit, usually as a result of a failure, whether the cause of failure is internal
(inherent) or external (for example, operator-induced). Proactive maintenance may be
either preventive maintenance or predictive maintenance. Preventive maintenance is
scheduled downtime, usually periodic, in which a well-defined set of tasks, such as
inspection and repair, replacement, cleaning, lubrication, adjustment, and alignment, are
performed. Predictive maintenance estimates, through diagnostic tools and measurements,
when a part is near failure and should be repaired or replaced, thereby eliminating a
presumably more costly unscheduled maintenance action. Proactive maintenance should be
performed only when and to the extent it is cost-effective. It must reduce the number of
unscheduled failures or extend the life of the item or both. It is generally assumed that a
proactive maintenance action is less costly than a reactive one.
ANALYSIS OF DOWNTIME
Failures are restored through the repair process. The repair process can be decomposed into
a number of different subtasks and delay times, as shown in the figure below.
Supply delay consists of the total delay time in obtaining necessary spare parts or
components, including administrative lead times, procurement lead times, and
transportation times. The supply delay time will not necessarily occur at the beginning of the
repair cycle but may occur after the diagnosis subtask has identified the failed component to
be replaced.
Maintenance delay time is the time spent waiting for maintenance resources or facilities,
including administrative (notification) time and travel time. Resources may be personnel, test
equipment, support equipment, tools, manuals or other technical data. Facilities may be a repair
dock such as an aircraft hangar, a service bay in an automotive repair shop, or a fixed test stand.
Because supply delay times and maintenance delay times are influenced by external parameters
(such as resource levels) that are not part of the system itself, they are not considered part of the
inherent repair time of the item. The inherent repair time of the item is defined as the sum of the
durations of the following subtasks: access, diagnosis, repair or replacement, validation, and
alignment. Access time is the amount of time required to gain access to the failed component.
Diagnosis, or troubleshooting, time is the amount of time required to determine the cause of the
failure. It is also referred to as fault isolation time. The repair or replacement time includes only
the hands-on time to complete the restoration process once the problem has been identified and
access to the failed component is obtained. Following restoration, some failures may require
a validation or alignment check to ensure that the unit has been returned to an
operational condition. If this check is required, it is considered part of the repair time. Whenever
we talk about repair time, we mean the inherent repair time unless stated otherwise. The repair
time is considered to be an inherent design feature of the item and reflects its maintainability.
Measures of Maintainability
Like reliability, maintainability is defined to be a probability and will be
characterized by specifying a repair-time probability distribution. There
are several measures of maintainability. The most popular, and the one
we will focus on, is the mean time to repair (MTTR). Other possible
measures include the median time to repair, the mode, or the most
likely repair time, the time in which a specified percentage of the
failures must be repaired, the mean repair time plus the mean
preventive maintenance downtime, and the number of maintenance
hours per operating hour.
THE REPAIR TIME DISTRIBUTION
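To quantify maintainability, the repair-time distribution is defined on T, the
continuous random variable representing the time to repair (TTR) a failed unit,
having a probability density function h(t). The cumulative distribution
function H(t) is then the probability that a repair will be accomplished within time t:
$H(t) = \Pr\{T \le t\} = \int_0^t h(t')\,dt'$
Mean time to repair: $\mathrm{MTTR} = \int_0^\infty t\,h(t)\,dt = \int_0^\infty [1 - H(t)]\,dt$
Variance ($\sigma^2$): $\sigma^2 = \int_0^\infty (t - \mathrm{MTTR})^2\,h(t)\,dt$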
EXAMPLE: The time to repair a failed electronic component has the
following probability density function:
h(t) = 0.08333t for 1 ≤ t ≤ 5 hr
Calculate the probability of completing a repair in less than 3 hr and
the mean repair time.
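The example can be checked numerically. The sketch below integrates the given density with a simple midpoint rule; the helper names (`h`, `integrate`) are illustrative choices, not part of the original material.

```python
def h(t: float) -> float:
    """Repair-time density from the example: 0.08333*t on [1, 5] hr, 0 elsewhere."""
    return 0.08333 * t if 1.0 <= t <= 5.0 else 0.0


def integrate(f, a: float, b: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    step = (b - a) / n
    return sum(f(a + (i + 0.5) * step) for i in range(n)) * step


# Sanity check: a density must integrate to (approximately) 1.
total_probability = integrate(h, 1.0, 5.0)

# H(3): probability that a repair is completed in less than 3 hr.
prob_within_3hr = integrate(h, 1.0, 3.0)

# Mean repair time: integral of t * h(t) over the support.
mttr = integrate(lambda t: t * h(t), 1.0, 5.0)

print(f"total probability = {total_probability:.4f}")
print(f"H(3) = {prob_within_3hr:.4f}")
print(f"MTTR = {mttr:.3f} hr")
```

Under these assumptions the probability of finishing within 3 hr comes out at about 0.333 and the mean repair time at about 3.44 hr.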
Exponential Repair Times
If the repair distribution is exponential, then
$h(t) = \frac{1}{\mathrm{MTTR}}\, e^{-t/\mathrm{MTTR}}, \qquad H(t) = 1 - e^{-t/\mathrm{MTTR}}$
where the parameter of the distribution is the MTTR. For this distribution,
r = 1/MTTR is the rate of repair (number of repairs per unit of time). The
repair rate is constant only for the exponential distribution.
EXAMPLE: A component can be repaired at the constant rate
of 10 per 8-hr day. What is the probability of a single repair
exceeding 1 hr?
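A minimal sketch of this calculation, assuming the exponential form H(t) = 1 - exp(-t/MTTR) with repair rate r = 1/MTTR. The function name `prob_repair_exceeds` is an illustrative choice.

```python
import math


def prob_repair_exceeds(t: float, mttr: float) -> float:
    """P(T > t) = 1 - H(t) = exp(-t / MTTR) for exponential repair times."""
    return math.exp(-t / mttr)


repairs_per_day = 10.0
hours_per_day = 8.0

repair_rate = repairs_per_day / hours_per_day   # r, in repairs per hour
mttr = 1.0 / repair_rate                        # mean time to repair, in hours

p_exceeds_1hr = prob_repair_exceeds(1.0, mttr)
print(f"r = {repair_rate} per hr, MTTR = {mttr} hr")
print(f"P(T > 1 hr) = {p_exceeds_1hr:.4f}")
```

With r = 1.25 repairs per hour (MTTR = 0.8 hr), the probability of a repair exceeding 1 hr is roughly 0.29.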
Lognormal Repair Times
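The lognormal distribution is often used to represent the repair distribution. For the lognormal
distribution, the probability density function of the repair time is
$h(t) = \frac{1}{\sqrt{2\pi}\, s\, t} \exp\left[-\frac{1}{2s^2}\left(\ln\frac{t}{t_{med}}\right)^2\right], \quad t > 0$
The lognormal distribution is a two-parameter distribution such that tmed is the median time to
repair and s is a shape parameter. The probability of a repair being completed in time t is H(t):
$H(t) = \Phi\left(\frac{1}{s}\ln\frac{t}{t_{med}}\right)$
where $\Phi$ is the standard normal cumulative distribution function.
The lognormal distribution is asymmetric and skewed right: most repair times are
distributed around the centre (mode) of the distribution, with relatively few long repair times in
the right-hand tail. The mean time to repair is
$\mathrm{MTTR} = t_{med}\, e^{s^2/2}$
and the mode (the most likely repair time) is $t_{mode} = t_{med} / e^{s^2}$.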
EXAMPLE: It is required that the electronic ‘subsystem – X’ of an Airbus
be repaired (or replaced) within 3 hr, 90 per cent of the time. If the
repair distribution is lognormal with s = 0.45, what MTTR should be
achieved to meet this goal? Estimate the most likely repair time of X. If
the repair distribution of the ‘subsystem – X’ were exponential with the
same mean, what would be the probability of completing a repair within
3 hr?
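One way to work this example is sketched below. It assumes the standard lognormal relations H(t) = Φ(ln(t/tmed)/s), MTTR = tmed·exp(s²/2) and mode = tmed/exp(s²), and uses `statistics.NormalDist` from the Python standard library for the normal quantile. The variable names are illustrative.

```python
import math
from statistics import NormalDist

s = 0.45            # lognormal shape parameter
t_required = 3.0    # repair must be done within this many hours...
p_required = 0.90   # ...this fraction of the time

# Solve H(t_required) = p_required for the median:
#   (1/s) * ln(t_required / t_med) = z_p  =>  t_med = t_required / exp(s * z_p)
z_p = NormalDist().inv_cdf(p_required)
t_med = t_required / math.exp(s * z_p)

mttr = t_med * math.exp(s ** 2 / 2)   # mean of the lognormal
t_mode = t_med / math.exp(s ** 2)     # most likely repair time

# Exponential repair distribution with the same mean.
p_exp_within_3hr = 1.0 - math.exp(-t_required / mttr)

print(f"median = {t_med:.3f} hr, MTTR = {mttr:.3f} hr, mode = {t_mode:.3f} hr")
print(f"exponential P(T <= 3 hr) = {p_exp_within_3hr:.3f}")
```

Under these assumptions the required MTTR is about 1.86 hr (median about 1.69 hr), the most likely repair time is about 1.38 hr, and an exponential distribution with the same mean would complete only about 80 per cent of repairs within 3 hr.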
SYSTEM REPAIR TIME
System repair time is a function of the repair times of the components and average
(mean) system repair time is computed from knowledge of the mean subsystem or
component repair times. For example, the mean time to repair of an aircraft depends
on the repair distribution of each of the subsystems, such as the electrical, hydraulic,
and environmental subsystems. The system MTTR may be computed as a weighted
average of the subsystem MTTRs in which the weights are based on the relative
number of failures.
Let MTTRi be the mean time to repair the ith unique subsystem, fi be the expected
number of failures of the ith unique subsystem over the system design life, and qi be
the number of identical subsystems of type i. Then the system mean time to repair is
$\mathrm{MTTR}_s = \frac{\sum_i q_i f_i\, \mathrm{MTTR}_i}{\sum_i q_i f_i}$
When all of the components have constant failure rates and the same number of
operating hours, fi can be replaced by λi.
EXAMPLE: An electronic system has the following three subsystems: Power
supply, Amplifier and Tuner. The failure rate (constant) and mean time to
repair of these subsystems are as follows. Calculate the MTTR of this
electronic system.
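The failure-weighted average can be sketched as a small function. The failure rates and repair times below are illustrative placeholders only, not the data of the example, and the function name `system_mttr` is an assumption of this sketch.

```python
def system_mttr(subsystems):
    """Failure-weighted system MTTR.

    subsystems: iterable of (q_i, f_i, mttr_i) tuples, where q_i is the number
    of identical subsystems of type i, f_i is the expected number of failures
    (or the constant failure rate) of one such subsystem, and mttr_i is its
    mean time to repair.
    """
    subsystems = list(subsystems)
    total_weight = sum(q * f for q, f, _ in subsystems)
    if total_weight <= 0:
        raise ValueError("total expected number of failures must be positive")
    return sum(q * f * m for q, f, m in subsystems) / total_weight


# Placeholder values (hypothetical): (quantity, failure rate per hr, MTTR in hr)
placeholder_subsystems = [
    (1, 0.002, 1.0),   # e.g. power supply
    (1, 0.001, 2.0),   # e.g. amplifier
    (1, 0.001, 4.0),   # e.g. tuner
]

placeholder_result = system_mttr(placeholder_subsystems)
print(f"system MTTR = {placeholder_result:.2f} hr")
```

Subsystems that fail more often pull the system MTTR towards their own repair time, which is why the weights are the expected numbers of failures rather than equal shares.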
QUANTIFIABLE MEASURES OF MAINTAINABILITY
The mean time to repair (MTTR): As an average, this measure has the
disadvantage of attempting to summarize the repair distribution with a single
value. Two distributions having the same mean can provide a considerably different
range of repair times. An improvement would be to specify an upper bound on the
variance (or equivalently, the standard deviation) of the repair times along with the
mean. A small variance will ensure more consistent repair times that are closer to
the MTTR.
Median time to repair: The median is also an average and has the same
disadvantage as the MTTR. It is preferred over the MTTR if the repair times are
highly skewed. For example, a few very large repair times would influence the
MTTR more than the median. In addition to the 50th percentile (that is, the median
time), the other quartiles could be specified as well. That is, the repair time at
which 25, 75, and 100 percent of the repairs would be accomplished would be part
of the specification.
Maximum time, $t_p$, in which a certain percentage p of the failures must be repaired:
Generally, this measure is preferred over the MTTR and median time since it identifies an
acceptable (maximum) repair time for most failures. Mathematically,
$\Pr\{T \le t_p\} = H(t_p) = p$
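As a sketch, the condition H(tp) = p can be inverted in closed form for the two repair distributions used in this material, assuming H(t) = 1 - exp(-t/MTTR) for the exponential case and H(t) = Φ(ln(t/tmed)/s) for the lognormal case. The function names and the sample inputs are illustrative.

```python
import math
from statistics import NormalDist


def tp_exponential(p: float, mttr: float) -> float:
    """Time within which a fraction p of repairs finish, exponential case."""
    return -mttr * math.log(1.0 - p)


def tp_lognormal(p: float, t_med: float, s: float) -> float:
    """Time within which a fraction p of repairs finish, lognormal case."""
    return t_med * math.exp(s * NormalDist().inv_cdf(p))


# Hypothetical inputs: 95th-percentile repair time for a 2 hr MTTR
# (exponential) and for a 2 hr median with shape 0.5 (lognormal).
tp_exp = tp_exponential(0.95, mttr=2.0)
tp_logn = tp_lognormal(0.95, t_med=2.0, s=0.5)
print(f"exponential t_0.95 = {tp_exp:.2f} hr")
print(f"lognormal   t_0.95 = {tp_logn:.2f} hr")
```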
Mean system downtime ($\bar{M}$): Mean system downtime is the average downtime, including
scheduled maintenance, but excluding supply or maintenance delay times. Since the
requirement for scheduled maintenance is partly driven by design, it is appropriate to
include scheduled downtime as part of the design criteria. Mathematically,
$\bar{M} = \frac{m(t_d)\,\mathrm{MTTR} + (t_d / T_{pm})\,\mathrm{MPMT}}{m(t_d) + t_d / T_{pm}}$
where
Tpm = the (mean) time between preventive maintenance actions
td = the system design life
MPMT = the mean preventive maintenance time
m(td) = the expected number of failures in the interval (0, td)
For a constant failure rate, m(td) = λtd. Once failure data have been collected, m(t) is the number of failures observed over
the time t.
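A minimal sketch of the mean system downtime calculation, assuming it is the average over all maintenance actions: m(td) repairs of mean length MTTR and td/Tpm preventive actions of mean length MPMT. The function name and the input values are hypothetical.

```python
def mean_system_downtime(expected_failures: float, mttr: float,
                         design_life: float, t_pm: float, mpmt: float) -> float:
    """Average downtime per maintenance action (corrective and preventive)."""
    pm_actions = design_life / t_pm   # number of preventive actions over the life
    total_downtime = expected_failures * mttr + pm_actions * mpmt
    return total_downtime / (expected_failures + pm_actions)


# Hypothetical inputs: constant failure rate, so m(td) = lambda * td.
failure_rate = 0.001      # failures per operating hour
design_life = 10_000.0    # hours
downtime = mean_system_downtime(
    expected_failures=failure_rate * design_life,  # 10 expected failures
    mttr=4.0,                                      # hours per repair
    design_life=design_life,
    t_pm=1_000.0,                                  # hours between PM actions
    mpmt=2.0,                                      # hours per PM action
)
print(f"mean system downtime = {downtime:.2f} hr")
```

With these placeholder values there are 10 repairs of 4 hr and 10 preventive actions of 2 hr, giving a mean downtime of 3 hr per maintenance action.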
Mean time to restore (MTR)
Mean time to restore is the average unscheduled system downtime
including delays for maintenance and supply resources.
MTR = MTTR + MDT + SDT
where MDT is the mean delay for maintenance, and SDT is the mean
delay for supply resources.
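Maintenance work hours per operating hour (MH/OH)
The number of maintenance work hours per operating hour combines
reliability and repair time with the number of maintenance personnel (crew
size) necessary to complete repair. It is a measure of the maintenance work
generated. Mathematically,
$\mathrm{MH/OH} = \frac{m(t) \cdot \mathrm{MTTR} \cdot \mathrm{CREW}}{t}$
where m(t) is the expected number of failures in the interval (0, t) and CREW is
the average crew size. For a constant failure rate, m(t) = λt and
$\mathrm{MH/OH} = \lambda \cdot \mathrm{MTTR} \cdot \mathrm{CREW}$
If mean preventive maintenance time (MPMT) is to be included in
the work-hour calculation, then, assuming the same average crew size for preventive work,
$\mathrm{MH/OH} = \frac{\left[m(t) \cdot \mathrm{MTTR} + (t / T_{pm}) \cdot \mathrm{MPMT}\right] \cdot \mathrm{CREW}}{t}$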
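The work-hour measure can be sketched for the constant-failure-rate case as follows. The sketch assumes MH/OH = λ · MTTR · CREW for corrective work and adds MPMT · CREW / Tpm for preventive work, taking the same crew size for both, which is an assumption of this sketch. All names and numbers are illustrative.

```python
from typing import Optional


def mh_per_oh(failure_rate: float, mttr: float, crew: float,
              mpmt: float = 0.0, t_pm: Optional[float] = None) -> float:
    """Maintenance work hours per operating hour for a constant failure rate.

    failure_rate: lambda, failures per operating hour
    mttr:         mean time to repair, hours
    crew:         average crew size (assumed equal for corrective and PM work)
    mpmt, t_pm:   mean preventive maintenance time and interval; omit t_pm to
                  count corrective work only
    """
    corrective = failure_rate * mttr                 # repair hours per operating hour
    preventive = mpmt / t_pm if t_pm else 0.0        # PM hours per operating hour
    return (corrective + preventive) * crew


corrective_only = mh_per_oh(failure_rate=0.002, mttr=3.0, crew=2.0)
with_pm = mh_per_oh(failure_rate=0.002, mttr=3.0, crew=2.0, mpmt=1.5, t_pm=500.0)
print(f"MH/OH (corrective only) = {corrective_only:.4f}")
print(f"MH/OH (with PM)         = {with_pm:.4f}")
```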
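The total cost over the interval (0, t), with a fixed cost $C_1$, a cost rate $C_2$ per unit of time, and a cost $C_3$ per failure, is
$C(t) = C_1 + C_2\,t + C_3 \int_0^t f(t')\,dt'$
and the cost per unit of time is
$C = \frac{C(t)}{t} = \frac{C_1}{t} + C_2 + \frac{C_3}{t} \int_0^t f(t')\,dt'$
For the increasing intensity of failure, we can assume f(t) = ,
then
To minimize the cost per time unit, set $dC/dt = 0$ and solve for t: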
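A sketch of the minimization under explicitly hypothetical assumptions: the cost terms are named C1 (fixed), C2 (per unit time) and C3 (per failure), and the failure intensity is taken to be the power law f(t) = a·b·t^(b-1) with b > 1. This form is an assumption of the sketch, not something stated in the text. Then the cost per unit time is C = C1/t + C2 + C3·a·t^(b-1), and setting dC/dt = 0 gives t* = [C1 / (C3·a·(b-1))]^(1/b).

```python
def cost_rate(t: float, c1: float, c2: float, c3: float, a: float, b: float) -> float:
    """Cost per unit time, assuming the power-law intensity f(t) = a*b*t**(b-1)."""
    expected_failures = a * t ** b            # integral of f over (0, t)
    return c1 / t + c2 + c3 * expected_failures / t


def optimal_interval(c1: float, c3: float, a: float, b: float) -> float:
    """Root of dC/dt = 0 under the assumed power-law intensity (needs b > 1)."""
    if b <= 1:
        raise ValueError("an interior minimum requires an increasing intensity (b > 1)")
    return (c1 / (c3 * a * (b - 1))) ** (1.0 / b)


# Hypothetical cost and intensity parameters.
params = dict(c1=1000.0, c2=5.0, c3=200.0, a=0.001, b=2.0)
t_star = optimal_interval(params["c1"], params["c3"], params["a"], params["b"])
print(f"t* = {t_star:.2f}, cost rate at t* = {cost_rate(t_star, **params):.3f}")
```

The constant cost rate C2 does not affect the location of the minimum; it only shifts the cost curve up or down.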
Key concepts of design activity from the maintainability aspect
Standardization
Interchangeability
Modularization
Accessibility
Maintenance frequency
Simplicity
Visibility
Testability
Standardization
Standardization reduces the range of parts that must be maintained
and stocked to a minimum. The amount of training and skill required to
perform maintenance may, therefore, also be reduced. It simplifies the
coding and labelling of parts and allows for fewer tools, test
equipment, and technical manuals.
Interchangeability
Interchangeability refers to the ability and ease with which a component can be
replaced with a similar component without excessive time or undue retrofit or
recalibration. This flexibility in design reduces the number of maintenance
procedures and, consequently, reduces maintenance costs. Interchangeability also
allows for system expansion with minimal associated costs, due to the use of
standard or common end-items. Interchangeability requires both functional and
physical substitution. Physical substitution requires standardization in mountings,
connectors, pins, and so on, as well as compatibility in size and available space.
For example, personal computers can accept a variety of hard drives and expansion
boards due to industry standards in cables, system board slots, and connectors.
Functional substitution may be supported by the software. Parts standardization
also affects reliability design since there are fewer unique components.
Modularization
The packaging of components in self-contained functional units, or
modularization, facilitates maintenance. Typically, a failure can be identified
by the failure of a specific function. Under modularization, this isolates the
problem to a physical unit. Diagnostics can then be applied to isolate the
fault further. Modularization also allows for the removal and replacement of
the failed unit, provided a spare module is available, with minimum
downtime. Therefore, system availability may be dramatically improved.
Modularization permits packaging against known environmental hazards,
thus decreasing the chance of failure due to environmental stress.
Accessibility
Accessibility is the ease with which an item can be reached during maintenance. If it is not
inherent in the design, it can greatly increase maintenance times, especially on
systems where in-process maintenance is required. With complex integration of
systems, the design of a system must avoid removing another system’s equipment to
gain access to a failed item. Access requirements may range from complete removal
of the unit to reaching part of it with certain tools for adjustment or servicing. Size, weight,
and clearances must be considered in the physical design of any removable unit.
Ideally, removing any failed unit should be possible without requiring the removal of
a unit that has not failed. Furthermore, the design should permit the use of standard
hand tools throughout. When accessibility is poor, other failures
are often caused by the isolation, disconnection, removal, and installation of other items
that hamper access, causing rework. Accessibility of all replaceable,
maintainable items will provide time and energy savings.
Maintenance frequency, Simplicity and Visibility
Maintenance frequency is the frequency with which each maintenance action
must be conducted and is central to a system's preventive, scheduled, or corrective
maintenance requirements.
Simplicity is the simplification of maintenance tasks associated with the system.
System simplification helps to reduce the costs of spares and improves the
effectiveness of maintenance troubleshooting.
Visibility measures how readily the system component requiring maintenance can
be seen. Visibility is an element of maintainability design that allows the
maintenance function visual access to assemblies and components for ease of
maintenance action. Even short-duration tasks can increase downtime if the
component is blocked from view. Designing for visibility greatly reduces
maintenance times.
Testability
Testability measures the ability to detect system faults and isolate them to the lowest
replaceable component. Diagnosis of a failure by identifying the fault is a major
task in the repair process. The speed with which faults are diagnosed can greatly
influence downtime and maintenance costs. As technology advances continue to
increase the capability and complexity of systems, the use of automatic diagnostics
as a means of fault detection, isolation, and recovery (FDIR) substantially reduces
the need for highly trained maintenance personnel and can decrease maintenance
costs by reducing the need to replace components. FDIR systems include both
internal diagnostic systems, referred to as built-in-test (BIT) or built-in-test-
equipment (BITE), and external diagnostic systems, referred to as automatic test
equipment (ATE), or offline test equipment. This equipment is part of a reduced
support system, all of which will minimize downtime and cost over the operational
life cycle.
EXAMPLE: Tests performed on a self-diagnostic module for a complex
electronic system resulted in correct diagnostics of a known fault 98
percent of the time with only a 1 percent false reading when it was
known there were no faults present. The probability of a failure (fault)
occurring over the test period is 0.005. How reliable is the self-
diagnostic module?
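One reading of "how reliable" is the probability that the module's indication is correct, which follows from Bayes' theorem. The sketch below takes that reading and computes both the probability that a fault is really present when one is indicated and the probability that no fault is present when none is indicated. The variable names are illustrative.

```python
p_fault = 0.005                  # prior probability of a fault over the test period
p_detect_given_fault = 0.98      # correct diagnosis when a fault is present
p_alarm_given_no_fault = 0.01    # false reading when no fault is present

p_no_fault = 1.0 - p_fault

# Total probability that the module indicates a fault.
p_alarm = p_detect_given_fault * p_fault + p_alarm_given_no_fault * p_no_fault

# Bayes' theorem: probability a fault is really present given an indication.
p_fault_given_alarm = p_detect_given_fault * p_fault / p_alarm

# Probability that the system is really fault-free given no indication.
p_no_alarm = 1.0 - p_alarm
p_no_fault_given_no_alarm = (1.0 - p_alarm_given_no_fault) * p_no_fault / p_no_alarm

print(f"P(fault | fault indicated)       = {p_fault_given_alarm:.4f}")
print(f"P(no fault | no fault indicated) = {p_no_fault_given_no_alarm:.6f}")
```

Because faults are rare (0.005), only about one in three fault indications corresponds to a real fault under these numbers, even though the module detects 98 percent of true faults. A "no fault" indication, by contrast, is correct almost every time.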
MAINTAINABILITY IN THE PRODUCT LIFE CYCLE
An important element in achieving an efficient and effective design is
serious consideration of the maintainability issues that arise
throughout the product life cycle. An effective maintainability program
incorporates a dialogue between the user and manufacturer during the
total life cycle of the product. This dialogue concerns the user's
maintenance requirements and other requirements for the product
and the manufacturer's response to those requirements. The product
life cycle is composed of four phases: the concept development
phase, the validation phase, the production phase, and the operation
phase. Specific maintainability functions are associated with each of
these phases.
Concept Development Phase
During this phase, the operational needs of the product are translated into a set of operational
requirements, and high-risk areas are identified. In other words, the objective of the concept
development phase is to develop and choose the most appropriate method of meeting the identified
operational needs. The method must be proven viable from technical, schedule, and cost
standpoints. A product development plan, implementation plans for the recommended method,
advanced development objectives, and any other necessary plans or objectives should also be
prepared.
The primary maintainability task during this phase is to determine product effectiveness
requirements and to determine, from the purpose and intended operation of the product, the field
support policies and other provisions required. Product effectiveness can be defined as the
probability that the product can successfully satisfy an operational demand within a defined interval
when used according to design specifications. In order to establish product maintainability
requirements, it is necessary to determine product utilization rates, mission time factors, and
product life cycle duration, including product use and out-of-service conditions. It is also necessary to
describe the mission and performance expectations, product operating modes, and overall logistic
support objectives and concepts.
Validation Phase
In this phase, the operational requirements developed during the
concept development phase are refined further in terms of
product design requirements. The main objective of the
validation phase is to ensure that no full-scale development takes
place until associated costs, schedules, and performance and
support objectives have been prepared and evaluated with
utmost care.
Because the most crucial part of the maintainability effort occurs
during the concept development and validation phases, effective
maintainability management is especially important at these stages.
In particular, some of the functions that must be completed before the
production phase are updating the maintainability program plan to
meet final specification requirements for the project; monitoring the
maintainability efforts of subcontractors; issuing detailed program
schedules, milestones, work orders, and budgets, and periodically
reviewing and updating them; predicting and addressing
maintainability requirements in quantitative terms, down to the
lowest-level product component; preparing specific maintainability test
and demonstration plans; and monitoring the maintainability effort,
following the maintainability program plan and management policies
and procedures.
Production Phase
During this phase the product is manufactured, tested, delivered, and in
some cases installed in accordance with the technical data developed in
the earlier phases. Even though at this stage the maintainability design
effort will be largely completed, design should be reviewed and updated as
engineering changes, initial field experience, and logistic support
modifications require. The production phase maintainability effort includes
production process monitoring; examining production test trends from the
standpoint of adverse effects on maintainability and maintenance
requirements; evaluating all proposals for change with respect to their
impact on maintainability; ensuring the eradication of all discrepancies
that may diminish maintainability; and taking part in the development of
controls for process variations, errors, and other problems that may affect
maintainability.
Operation Phase
During this phase, the user puts the product into operation,
logistically supports it, and modifies it as appropriate. It is in
this phase that the supply, maintenance, training, overhaul,
and material readiness requirements and characteristics of the
product become clear. Therefore, although there are no
specific maintainability requirements at this time, the phase is
probably the most significant because the product's true cost-
effectiveness and logistic support are now demonstrated and
maintainability data can be collected from the experience for
use in future applications.