4 key reliability centered maintenance (RCM) and preventive maintenance optimisation (PMO) concepts

Sandy Dunn 18 Jun 2014

While a range of approaches can be used to develop and optimise Preventive Maintenance programs, each with its own strengths and weaknesses, four core concepts apply to the development of effective and efficient Preventive Maintenance programs no matter which approach you use.

1. We don’t do maintenance for maintenance’s sake. We do it to avoid the consequences of failure.

Clearly, routine maintenance that does not add value to our organisation is to be avoided. Why spend thousands of dollars per year preventing failures on a particular equipment item if those failures, on average, cost only hundreds of dollars per year?

The implications of this are profound. We can only determine the optimal preventive maintenance program for each item of equipment if we first understand what the consequences of failure of that item are. While there are many different ways of categorising and quantifying the consequences of equipment failure, essentially the consequences fall into two broad categories:

Financial consequences (cost of lost production, repair costs, costs of consequential damage etc.)
Risk consequences (potential for a safety or environmental incident, breach of statutory or licence requirements etc.)

For failures that have financial consequences, the decision regarding whether preventive maintenance is “value-adding” is based on financial evaluation; we need to compare the overall costs of preventing the failure from occurring with those associated with an in-service failure. Depending on the results of that analysis, the proposed preventive maintenance tasks may or may not be worth doing.
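
As a rough illustration of this comparison, the sketch below totals the expected annual cost of running an item to failure and the expected annual cost with a preventive task in place. The class, the field names, and all dollar figures and failure rates are hypothetical, chosen only to show the arithmetic; a real evaluation would use your own cost and failure data.

```python
from dataclasses import dataclass


@dataclass
class FailureCase:
    """Hypothetical inputs for one failure mode on one equipment item."""
    name: str
    cost_per_failure: float           # repair + lost production + consequential damage ($)
    failures_per_year_no_pm: float    # expected failure rate if the item is run to failure
    failures_per_year_with_pm: float  # expected residual failure rate with the task in place
    pm_cost_per_year: float           # labour, parts and downtime for the task ($/year)


def annual_cost_without_pm(case: FailureCase) -> float:
    """Expected yearly cost of simply letting the failure happen."""
    return case.cost_per_failure * case.failures_per_year_no_pm


def annual_cost_with_pm(case: FailureCase) -> float:
    """Yearly cost of the task plus the failures it does not prevent."""
    return case.pm_cost_per_year + case.cost_per_failure * case.failures_per_year_with_pm


def pm_is_value_adding(case: FailureCase) -> bool:
    """The task is worth doing only if it lowers the total expected cost."""
    return annual_cost_with_pm(case) < annual_cost_without_pm(case)


# Illustrative numbers only (assumptions, not data from any real site).
sump_pump = FailureCase("sump pump seal leak", 800.0, 0.5, 0.1, 3000.0)
gearbox = FailureCase("conveyor gearbox bearing", 120000.0, 0.2, 0.02, 6000.0)

for case in (sump_pump, gearbox):
    print(
        case.name,
        annual_cost_without_pm(case),
        annual_cost_with_pm(case),
        pm_is_value_adding(case),
    )
```

With these assumed numbers, the task on the sump pump costs far more than the failures it prevents, while the task on the gearbox pays for itself.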

For failures that have risk consequences, the decision regarding whether preventive maintenance is “value-adding” is based on an assessment regarding whether putting the proposed preventive maintenance task in place reduces the risks associated with in-service equipment failure to tolerable levels.
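
A risk-based decision can be sketched in a similar way. The example below assumes a simple scoring scheme in which risk is the annual likelihood of the failure multiplied by a severity score, and compares the result with a tolerable limit. The scoring scheme, the limit and the numbers are all assumptions for illustration; in practice your organisation's own risk matrix and tolerability criteria would apply.

```python
def risk_score(likelihood_per_year: float, severity: float) -> float:
    """Assumed scoring scheme: annual likelihood multiplied by a severity score."""
    return likelihood_per_year * severity


def assess_task(likelihood_no_pm: float, likelihood_with_pm: float,
                severity: float, tolerable_score: float) -> dict:
    """Report whether a task is needed and whether it brings risk to a tolerable level."""
    before = risk_score(likelihood_no_pm, severity)
    after = risk_score(likelihood_with_pm, severity)
    return {
        "risk_before": before,
        "risk_after": after,
        "task_needed": before > tolerable_score,
        "task_sufficient": after <= tolerable_score,
    }


# Hypothetical failure mode with a safety consequence (all values are assumptions).
effective_task = assess_task(0.05, 0.005, severity=100.0, tolerable_score=1.0)
weak_task = assess_task(0.05, 0.02, severity=100.0, tolerable_score=1.0)

print(effective_task)
print(weak_task)
```

In the second case the task reduces the risk but not to a tolerable level, so a different task, or a redesign, would need to be considered.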

The key here, however, is to recognise that these decisions are specific to the equipment's operating context. Depending on what the equipment is intended to do for us, the consequences of failure may be quite different, even if the equipment is identical. The optimum preventive maintenance program may therefore be quite different for identical equipment items if their operating context (what the equipment is expected to do) is different. In other words, the optimum preventive maintenance program depends more on what the equipment does than on what it is. A fairly common trap that organisations fall into is putting identical preventive maintenance programs in place for identical equipment items, without recognising that the consequences of failure may vary significantly from application to application. This is also one of the weaknesses of generic preventive maintenance programs sourced from equipment manufacturers, who are generally unaware of the application in which the equipment is to be used.

2. To be effective, the preventive maintenance program must be aimed at addressing failure modes, the causes of equipment failure.

Generic preventive maintenance tasks such as “Inspect pump” are fraught with danger for two reasons:

First, we cannot ensure that the inspection will be performed to the same level of rigour and detail by everyone assigned to it, because we have not detailed the specific items to be inspected or what constitutes an “acceptable” condition. Depending on the skill levels of your maintenance workforce, a greater or lesser level of consistency and rigour will result, which is not conducive to consistent equipment reliability. If we cannot ensure that a preventive maintenance task is done consistently to the required quality standard, it has the potential to generate waste: either rework (having to redo the inspection) or additional work (when the equipment suffers a failure that could have been prevented had the task been done properly).
Second, and more importantly, unless we understand why equipment fails, we cannot put in place effective maintenance tasks to prevent those equipment failures.

This does not necessarily mean, however, that we need to perform detailed Root Cause Analysis on every equipment failure that occurs. But we do need to understand the physical causes of failure down to an appropriate level of detail that permits us to identify the most appropriate preventive maintenance tasks to put in place to avoid those failures. As consultants, we frequently come across preventive maintenance tasks that are intended to address failure causes that will never happen, or which have been put in place as a “knee-jerk” reaction to a past failure, but which do not adequately address any likely failure modes.

Understanding the causes of equipment failure allows you to put in more closely targeted, optimal preventive maintenance actions.

3. To be applicable, preventive maintenance tasks must take into account the technical characteristics of the failure that they are intended to prevent.

One of the more common improvement opportunities that we come across is to ensure that a technically appropriate task is put in place. This necessarily involves understanding the technical characteristics of the failure – and more specifically the failure distribution associated with each failure mode.

As most people who have encountered Reliability Centered Maintenance principles will know, Nowlan and Heap identified six different failure distributions associated with failures in the civil aviation industry. These distributions, and the proportion of failures associated with each distribution in Nowlan and Heap’s study are illustrated below.

[Figure: Reliability Centered Maintenance failure patterns]

The key implications of Nowlan and Heap’s work are that:

Not all things “wear out”. In fact, in Nowlan and Heap’s study only 11% of items showed an increasing conditional probability of failure with age (Failure Distributions A, B and C above).
Fixed interval replacement or overhaul of items (performed without considering the condition of the item) is only appropriate for items that display wear-out characteristics, and the number of items that do so may be smaller than we think.
At best, performing fixed interval overhauls or replacements of items that do not wear out is a complete waste of time and money (if they demonstrate Failure Distribution E); at worst, it may actually reduce equipment reliability rather than improve it (if they demonstrate Failure Distribution F).
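
One common way to model conditional probability of failure against age is the Weibull hazard function, h(t) = (beta/eta) * (t/eta)^(beta - 1). The sketch below is an illustration only: a single Weibull curve does not reproduce all six patterns, and the shape (beta) and scale (eta) values used are hypothetical. A shape above 1 gives a rising hazard (wear-out), a shape of 1 a constant hazard (random failure), and a shape below 1 a falling hazard (early-life failures).

```python
def weibull_hazard(age: float, beta: float, eta: float) -> float:
    """Conditional failure rate at a given age for shape beta and scale eta."""
    if age <= 0 or beta <= 0 or eta <= 0:
        raise ValueError("age, beta and eta must all be positive")
    return (beta / eta) * (age / eta) ** (beta - 1)


def hazard_trend(beta: float) -> str:
    """Describe how the conditional probability of failure changes with age."""
    if beta > 1:
        return "increasing (wear-out)"
    if beta < 1:
        return "decreasing (early-life failures)"
    return "constant (random)"


def fixed_interval_replacement_is_candidate(beta: float) -> bool:
    """Age-based replacement only makes sense when the hazard rises with age."""
    return beta > 1


# Hypothetical shape values; eta = 1000 operating hours is also an assumption.
for beta in (3.0, 1.0, 0.5):
    early = weibull_hazard(100.0, beta, 1000.0)
    late = weibull_hazard(900.0, beta, 1000.0)
    print(beta, hazard_trend(beta), early, late,
          fixed_interval_replacement_is_candidate(beta))
```

Only the first case is a candidate for fixed interval replacement. For the other two, replacing the item on age alone brings no reliability benefit, and in the falling-hazard case it returns the item to its period of highest failure probability.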

So what should we do if we have a failure mode that would have serious consequences if it occurred, but which does not demonstrate Failure Distribution A, B or C? The key here is to recognise that the primary purpose of Preventive Maintenance is not to prevent failures themselves, but to minimise or avoid the consequences of failure. If we can predict when a failure will occur, then we can schedule a repair or replacement of that item at a time when the consequences are minimal. This concept is at the heart of Predictive Maintenance. Performing vibration analysis on a bearing does not avoid the need to replace the bearing, but it does mean that we can schedule the replacement for a time when the equipment is not required for operational purposes.

However, for a Predictive Maintenance inspection to be applicable, several conditions must apply. First and foremost amongst these is that the failure mode (cause) must give us a detectable warning that failure is about to occur. Most, but not all, failure modes do give us this warning, including those with wear-out characteristics, so we can frequently apply predictive maintenance inspections to those failure modes as well. As an example, consider the maintenance strategy that you apply to the tyres on your car: you choose to replace them when the tread depth is getting low (a predictive maintenance strategy), even though the tyres clearly wear out over time.
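
The tyre example can be sketched as a simple trend calculation: take tread depth readings, estimate the wear rate, and project how much further the tyre can run before it reaches the replacement depth. The function name, the readings and the 3.0 mm replacement depth are assumptions for illustration, and straight-line wear is a simplification.

```python
from typing import List, Optional, Tuple


def remaining_distance_km(readings: List[Tuple[float, float]],
                          replace_at_mm: float) -> Optional[float]:
    """Project the distance left before tread depth reaches the replacement depth.

    readings: (odometer_km, tread_depth_mm) pairs, oldest first.
    Returns None when no wear trend can be estimated from the readings.
    """
    if len(readings) < 2:
        return None
    (km_first, depth_first), (km_last, depth_last) = readings[0], readings[-1]
    if km_last <= km_first:
        return None
    wear_rate = (depth_first - depth_last) / (km_last - km_first)  # mm per km
    if wear_rate <= 0:
        return None  # no measurable wear, so nothing to extrapolate
    return max(0.0, (depth_last - replace_at_mm) / wear_rate)


# Hypothetical inspection history for one tyre.
history = [(0.0, 8.0), (10000.0, 6.5), (20000.0, 5.0)]
print(remaining_distance_km(history, replace_at_mm=3.0))
```

The value of the inspection lies in the warning it gives: the projected distance tells us roughly how long we have to schedule the replacement at a convenient time.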

Finally, note the implications of failure modes that demonstrate failure distribution F. In this case, performing routine maintenance leads to increased failures and lower reliability. Most of us have probably, at some time in our career, experienced the situation where an item of equipment was operating reliably until we did our preventive maintenance on it – and it then subsequently failed shortly after being brought back into service. Preventive Maintenance tasks need to be selected carefully, and quality of performance of those tasks monitored closely, to ensure that this does not occur.

4. There must be single-point accountability for the prevention of each failure mode.

One of the greatest opportunities for waste reduction in Preventive Maintenance programs is the elimination of multiple tasks performed by multiple work groups, all aimed at preventing the same failure mode. For example, at one of our clients (a nickel refinery), the following tasks were all in place to detect incipient bearing failures in electric motors:

Twice per shift area walk around inspections by production operators (“Listen to motors for abnormal noises”)
Once per week area walk around inspections by mechanical fitters (“Listen to rotating equipment and report any abnormal noises”)
Once per month vibration analysis by external Condition Monitoring contractor
Once per quarter area walk around inspections by electrical fitters (“Report any noisy bearings on electrical motors”)

Not only is this a complete waste of most people’s time and energy, it also dilutes accountability for preventing this failure. If a motor bearing were to seize in service, who would we hold accountable for failing to prevent it? With accountability spread this thinly, there is a high probability that one or more of the groups doing these inspections will “assume” that someone else has already noticed and reported the noisy bearing, and that they therefore do not need to. There is therefore a significant risk that the incipient failure is not reported and addressed until it is too late.

Having multiple tasks performed by multiple people “just to be sure” is almost always a bad idea. An effective and efficient Preventive Maintenance program will, for each failure mode, have selected the best task, will have assigned this to the most suitable workgroup, and the organisation will have ensured that the members of that workgroup have the skills to perform the task effectively. The organisation will also need to ensure that the task is being performed on time, and in full – but that is a subject for further discussion in the next article in this series – “Doing the Work Right”.
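
A simple check for this kind of duplication is to group the tasks in a Preventive Maintenance program by the failure mode they address, and to list any failure mode that is covered by more than one workgroup. The sketch below uses the four bearing inspection tasks from the refinery example; the data layout, the function name and the fifth task are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Task(NamedTuple):
    failure_mode: str
    workgroup: str
    frequency: str


def find_shared_accountability(tasks: List[Task]) -> Dict[str, List[str]]:
    """Return failure modes addressed by more than one workgroup."""
    owners = defaultdict(set)
    for task in tasks:
        owners[task.failure_mode].add(task.workgroup)
    return {
        mode: sorted(groups)
        for mode, groups in owners.items()
        if len(groups) > 1
    }


tasks = [
    Task("motor bearing failure", "production operators", "twice per shift"),
    Task("motor bearing failure", "mechanical fitters", "weekly"),
    Task("motor bearing failure", "condition monitoring contractor", "monthly"),
    Task("motor bearing failure", "electrical fitters", "quarterly"),
    # Hypothetical task with a single owner, included for contrast.
    Task("gearbox oil contamination", "mechanical fitters", "quarterly"),
]

for mode, groups in find_shared_accountability(tasks).items():
    print(mode, "is covered by", len(groups), "workgroups:", ", ".join(groups))
```

Each failure mode flagged by a check like this is a prompt to select the single best task and assign it to the most suitable workgroup.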

These key concepts apply regardless of what approach you use for preventive maintenance development and optimisation. Some of these alternative approaches are discussed in our article “Alternative approaches for developing and optimising Preventive Maintenance”.

