The Joint Commission (TJC) has been encouraging healthcare facilities to take advantage of the option provided by the Centers for Medicare & Medicaid Services (CMS) in its 2013 regulatory clarification memo and accompanying Appendix A document¹ by implementing an alternate equipment management (AEM) program. However, although the AEM option is more efficient and less labor intensive, few facilities have chosen to take advantage of it. This could be because the default option (continuing to perform all equipment maintenance strictly according to manufacturer recommendations) may appear to be simpler. It is, however, much more burdensome and extremely inefficient.
The 2013 revision of CMS's interpretive guidelines for hospitals included a section on AEM programs, stating in part that such programs must “minimize risks to patients and others in the hospital associated with the use of … medical equipment.”¹ This requirement is presumably intended to ensure that the alternate maintenance strategies achieve the same level of safety as the default option, implying that these strategies can be used only if the hospital can provide reasonable assurance that they will not result in a higher level of risk to patients or the medical staff. This implied requirement to quantify the resulting levels of safety is probably the primary factor deterring more widespread adoption. Although ensuring adequate levels of patient safety should always be the primary concern, safety, like risk, is hard to measure and difficult to demonstrate. The current article addresses this issue.
According to the CMS memo, the equipment items chosen for inclusion in the AEM program must be identified through a formal risk assessment conducted by qualified personnel acting on behalf of the facility. Part 1 of this two-part series of articles provides a blueprint for a simple, basic AEM program that every healthcare facility should be able to put in place immediately. Part 2 will describe a more comprehensive program that will make an even larger number of devices eligible for the AEM program. However, the approach described in part 2 will require collaboration on the part of the healthcare technology management (HTM) community to create a new database of standardized, real-world maintenance findings.
Maintenance Practices Task Force
In October 2015, AAMI announced its support for a new Maintenance Practices Task Force (MPTF) with a project to “begin exploring whether an approach known as reliability-centered maintenance (RCM) should be adopted on a wider scale throughout the field of healthcare technology management.”² The MPTF devised several objectives, one of which was to develop a scientifically sound, RCM-based methodology for determining which types of medical devices are made safer by periodic planned maintenance (PM). Details on this project, as well as a number of helpful explanatory articles and references, database tables, model documents, and model procedures, are available at the MPTF website (www.HTMCommunitydB.org). The current article makes reference to several of these articles, database tables, and other materials.
Modern reliability engineering, including the analytical method known as failure modes and effects analysis (FMEA), has its roots in RCM. Prior to RCM, the traditional approach to PM used for many years for relatively simple machines (e.g., steam locomotives) was an invasive form of “scheduled overhauls.” However, this traditional approach would have been much too costly to use for the new jumbo jets when they were first introduced by the airlines. In the mid-1960s, a task force convened by the Federal Aviation Administration investigated how more complex machines, such as modern aircraft, actually fail, and this led to a more systematic and less costly way of maintaining these more complex devices, now known as RCM. RCM has since become widely used by NASA, the military, and many civilian industries. Although RCM has been identified elsewhere, including in the maintenance strategy section of Appendix A in the 2013 CMS memo,¹ as simply another maintenance strategy, it is much more than that. RCM represents a completely different approach to maintenance, focusing on preserving a device's function rather than its physical integrity. For more on the well-established analytical methods incorporated into RCM, see HTM ComDoc 13 on page 3 of the MPTF website (www.HTMCommunitydB.org). Additional details on the original development of RCM can be found in HTM ComDoc 14 and other reference documents listed on page 4 of the MPTF website.
Risk Assessment Methodology
In the CMS memo, the term “critical equipment” is defined as devices “for which there is a risk of serious injury or death to a patient or staff person should the equipment fail.”¹ However, when considering the many ways in which medical equipment can fail and the subject matter of the memo, this definition is insufficiently precise (see section below). This is an important issue because it has a direct impact on the nature of the risk assessment that the regulation requires before an AEM program can be implemented.
Defining ‘Critical Equipment’
Medical equipment failures, which have a variety of causes, can be grouped into three general categories (also see ComDoc 1 on the MPTF website [www.HTMCommunitydB.org]):
Failures attributable to the device's inherent reliability, including random failures of its components, as well as poor design or poor construction of the device itself
Process-related failures attributable to causes such as use errors, physical damage, environmental stress, accessory problems, tampering, and connected network issues
Maintenance-related failures attributable to causes such as inadequately restored nondurable parts or delayed recalibrations, as well as intrusive maintenance and poor initial set up of the device itself
There are only four ways in which failures can become hazardous:
When a device on which a patient's well-being is dependent experiences a complete loss of function
When a device on which a patient's well-being is dependent develops a failure that reduces its performance or safety to an unacceptable level and the failure is not likely to be obvious to the device user (often called a “hidden” failure)
When a device is damaged in such a way that it presents some type of direct physical threat to the safety of patients or staff (e.g., the damage leaves an exposed sharp edge)
When the device is used improperly
Of these four ways, only two can be prevented by timely PM:
Failures that are caused by the premature deterioration of a part that usually is restored during periodic PM
“Hidden” failures that reduce the device's performance or safety below a critical level and that would be detected by performance and/or safety verification testing during periodic PM as recommended by the manufacturer
Of note, not all of these failures will necessarily lead to outcomes that could result in serious injury or death. The intent in the CMS memo is to disqualify from immediate inclusion in an AEM program any item of equipment that could present “a risk of serious injury or death to a patient should the equipment fail,” unless the facility can provide specific justification.¹ Further, CMS requires that the facility conduct a formal risk analysis to identify (and label as critical equipment) devices whose failure could result in such a serious risk. The CMS surveyors are instructed in the memo to pay particular attention to any items that the facility has labeled as critical equipment but that are being maintained other than per manufacturer recommendations. This implies that when such items are found, the facility might be asked to explain why not following the manufacturer's PM recommendations for the items is not reducing the overall level of patient safety.
A practical problem emerges: No one has created a list of devices, or device types, for which there is a consensus that all of the items meet the CMS definition of “critical equipment.” To the best of the authors' knowledge, no general agreement exists on which particular device types should be considered critical equipment. The same is true for the similar term, “high-risk medical equipment,” which is used by TJC. The only examples of critical equipment provided in the regulation are ventilators, defibrillators, and robotic surgery devices.
The objective of the AEM option is to permit changes to the default PM program mandated by CMS (which is following the manufacturer's PM recommendations for every single item of medical equipment) that will not increase the risk to patients or the attending staff. Therefore, a logical place to start would be to determine which devices could result in a serious, life-threatening outcome if they should fail—from any cause, not just from a PM-preventable cause. According to the current language in the revised CMS regulation, all such devices would meet the current CMS definition of critical equipment.
However, uncertainty exists about the extent to which items that 1) demonstrate a relatively high level of inherent unreliability, 2) are relatively easy to misuse in a way that makes them hazardous, or 3) are unusually prone to becoming damaged in a way that can become dangerous were intended to be classified as critical equipment and thus excluded from the AEM program. The MPTF has taken the position that this ambiguity in the definition in the revised CMS regulation was not intended, and for this particular purpose, such items should not be considered critical equipment.
To make this interpretation absolutely clear, we have chosen to use more specific terms, such as “PM critical” or “potentially high PM risk,” to label those devices with the potential to experience critical failures that can be prevented by competent and timely PM.
Recommended Inclusion Criteria
After careful analysis of the CMS memo, the MPTF has taken the position that the agency intends to allow to be included in an AEM program only those devices that either 1) are highly unlikely to result in a serious injury or death to a patient or staff person if they should fail from a failure mode (way of failing) that can be prevented by performing appropriate PM or 2) are highly unlikely to fail from a PM-preventable failure mode. If this position is acceptable to CMS, then healthcare facilities will be able to create a completely comprehensive CMS-compliant AEM program using a relatively simple risk analysis that considers only the device's PM-preventable failure modes.
However, it also is important to note that an even simpler analysis can be performed using just one (either one) of these two criteria. Choosing the first criterion, which the MPTF calls the “severity of PM-related harm” criterion, provides every healthcare facility with an opportunity to create, virtually immediately, a highly efficient AEM program. The opportunity can be seized quickly because the MPTF has already documented on its website (see HTM ComDoc 16 [www.HTMCommunitydB.org]) a credible risk analysis using just the first criterion.
This article provides a complete blueprint for implementing both a phase 1 AEM program (using the first criterion only) and, eventually, a phase 2 AEM program (by adding consideration of the second criterion).
To make this added qualification (of limiting the analysis to considering only failures from PM-preventable causes) absolutely clear, the MPTF is calling the analysis using this particular criterion a “PM-focused risk analysis.” The option to use this more limited analysis is not specifically called out in the CMS regulation; however, as the sole topic of the CMS memo is PM, it seems reasonable to assume that the risk analysis being called for should consider only PM-preventable failures.
Changing the way that a facility performs PM on its medical equipment by implementing an AEM program based on a PM-focused risk assessment will not do anything, intentionally, to reduce failures resulting from other, non–PM-preventable causes (e.g., physical damage, use errors). These are process-related failures, and making changes to the PM program only will not prevent such failures. Although most PM procedures include a physical inspection of the device, a long period could pass before the next PM, and during that time, patients and staff could be exposed to potentially hazardous physical damage.
The MPTF labels the second AEM program inclusion criterion as “likelihood of PM-preventable failures.” This criterion will be used in phase 2 to identify specific manufacturer-model versions of the various device types that have demonstrated having an acceptably low likelihood of failing from a PM-preventable cause.
This second criterion is logically consistent with a key statement in Appendix A of the CMS memo: “Multiple factors must be considered, since different types of equipment present different combinations of severity of potential harm and likelihood of failure.”¹ This recognition by CMS that the level of risk associated with a device failure consists of a combination of the severity of potential harm resulting from the failure and the likelihood of the failure occurring is extremely important. It is a key element of the FMEA method, which was first introduced as a part of the RCM approach pioneered in the 1960s.
The second criterion potentially is helpful because it will allow even more devices to be included in the facility's AEM program even though they could cause serious harm if they do fail. Devices meeting this criterion will have demonstrated an acceptably low likelihood of failing from a PM-preventable cause (i.e., a high level of PM-related reliability).
Although the threshold of acceptable PM-related reliability has not been set, the second phase of the MPTF's project, which is currently underway, will provide information on the levels of PM-related reliability that are achieved when devices are maintained strictly according to manufacturer recommendations. This information will provide a rational basis for determining acceptable levels of PM-related reliability.
Based on these logical arguments, devices meeting one or both of the MPTF's recommended inclusion criteria can be moved into an appropriately documented AEM program—a program that the MPTF believes will be in full compliance with the CMS regulation.
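The either/or logic of the two recommended inclusion criteria can be sketched in a few lines of Python. This is only an illustrative sketch, not an MPTF tool: the `PmRiskFinding` record, its field names, and the `meets_inclusion_criteria` function are hypothetical, and the sketch assumes that the second criterion remains unknown until phase 2 reliability data become available.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PmRiskFinding:
    """Hypothetical record of a PM-focused risk analysis for one device."""
    device: str
    # Criterion 1 input: could a PM-preventable failure lead to
    # serious injury or death?
    serious_harm_possible: bool
    # Criterion 2 input: has the device shown an acceptably low likelihood
    # of PM-preventable failure? None means "not yet known" (assumed to be
    # the case until phase 2 data exist).
    low_failure_likelihood: Optional[bool] = None


def meets_inclusion_criteria(finding: PmRiskFinding) -> bool:
    """Return True if either recommended inclusion criterion is met."""
    criterion_1 = not finding.serious_harm_possible
    criterion_2 = finding.low_failure_likelihood is True
    return criterion_1 or criterion_2
```

Under these assumptions, a device with no potential for serious PM-related harm qualifies on the first criterion alone, whereas a device with that potential qualifies only once its PM-related reliability has been demonstrated.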
Phase 1 Risk Analysis
During the first phase of its project, the MPTF undertook a formal two-step risk analysis that examined 76 device types that are considered most likely to be “potentially high PM-risk devices.” This initial group of candidate device types is listed in Table 1 on page 2 (“The database Tables”) of the MPTF website (www.HTMCommunitydB.org). Of note, this is a tentative list and subject to modification based on feedback from the HTM community.
Because PM procedures are typically made up of two different kinds of tasks (see “The Basics of PM” sidebar), devices can fail as a result of inadequate PM in two ways:
Premature deterioration of a component that the manufacturer has indicated as needing periodic restoration during the working lifetime of the device
Some type of imperceptible deterioration (not obvious to the user) that is causing the device to no longer meet its critical performance and safety specifications
Because both these factors need to be considered, the risk analysis has to be performed in two steps. First, any device restoration tasks in the manufacturer's recommended PM procedure (or the equivalent generic version of the procedure) are examined and a judgment made about the level of severity (LOS) of the worst-case adverse outcome that could result if, because of this deterioration, the device stopped working while in use. However, these potentially harmful adverse outcomes can cover a wide range of severity, from trivial to life threatening, and for the purposes of this analysis, the MPTF has chosen to separate this continuum of outcome severity into four categories (sidebar).
PM procedures generally consist of two types of tasks:
1. Device restoration tasks that restore to full functionality any components of the device that the manufacturer has specified as needing periodic attention to prevent the device from failing. The MPTF has termed these components “nondurable parts.” They include moving parts that are subject to wear, as well as nonmoving parts that are subject to some other kind of progressive deterioration.
2. Safety verification tasks that confirm the device is still meeting any critical performance or safety specifications.
The inclusion of one or more device restoration tasks in the manufacturer's PM procedure is taken as a recommendation from the manufacturer that the targeted component needs periodic restoration. However, the deterioration of the part may or may not cause the device to fail completely, resulting in outcomes that could have different LOSs (see sidebar below). The manufacturer's PM procedure may also specify a performance or safety verification task that is not critical in the sense that failing one or more of these tests will not result in a serious, life-threatening outcome.
LOS 3: Serious, life-threatening injury. The patient (or user) could lose his or her life.
LOS 2: Less serious, non–life-threatening injury. The patient (or user) could sustain a direct or indirect injury that ranges in severity but is less than life threatening.
LOS 1: No injury, but possible disruption of care. The failure could cause a temporary disruption of care, such as requiring one or more patients to be rescheduled, delaying treatment or delaying the acquisition of diagnostic information.
LOS 0: Negligible impact. The adverse effect of the device failure is considered insignificant.
The results of the first part of the phase 1 risk analysis are documented in Table 2 on page 2 of the MPTF website (www.HTMCommunitydB.org). The judgments made in the analysis take into account the potential scope of the adverse outcome, as well as any mitigating factors. Row 6 of the table illustrates the process using the example of the battery in a defibrillator that could fail prematurely, producing a worst-case outcome that could result in a serious, life-threatening injury to the patient—even though the users have supposedly subjected the device to regular checks in order to ensure a high level of reliability.
The second part of the analysis involves a similar evaluation of the performance and safety verification tasks in each device's recommended PM procedure. The results of this second part of the analysis are documented in Table 3 on the MPTF website. Examples in this case are undetected (hidden), potentially life-threatening component failures that could cause critical controls or indicators to malfunction or read incorrectly on devices such as anesthesia machines and apnea monitors (rows 2 and 3).
The findings documented in Tables 2 and 3 then are integrated into a single measure of the device's combined LOS and its potential level of PM-related risk (high, moderate, low, and negligible). These combined results are shown in Table 4 on the MPTF website and partially reproduced in Table 1 in the current article.
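As a rough illustration of how the two sets of findings might be integrated, the sketch below takes the worse of the two LOS judgments as the combined LOS and maps it to one of the four PM-related risk levels. Both the worst-case rule and the one-to-one mapping from LOS to risk level are assumptions made for this example; the actual integration is the one documented in Table 4 on the MPTF website. The names `LOS`, `RISK_LABEL`, `combined_los`, and `pm_risk_level` are hypothetical.

```python
from enum import IntEnum


class LOS(IntEnum):
    """Level of severity of the worst-case adverse outcome."""
    NEGLIGIBLE = 0
    DISRUPTION_OF_CARE = 1
    NON_LIFE_THREATENING_INJURY = 2
    LIFE_THREATENING_INJURY = 3


# Assumed one-to-one mapping from combined LOS to PM-related risk level.
RISK_LABEL = {
    LOS.LIFE_THREATENING_INJURY: "potentially high PM risk",
    LOS.NON_LIFE_THREATENING_INJURY: "potentially moderate PM risk",
    LOS.DISRUPTION_OF_CARE: "potentially low PM risk",
    LOS.NEGLIGIBLE: "negligible PM risk",
}


def combined_los(restoration_los: LOS, verification_los: LOS) -> LOS:
    """Assumed rule: the worse of the two judgments governs."""
    return max(restoration_los, verification_los)


def pm_risk_level(restoration_los: LOS, verification_los: LOS) -> str:
    """Label a device type from its two LOS judgments."""
    return RISK_LABEL[combined_los(restoration_los, verification_los)]
```

With this rule, a device type whose restoration tasks are judged LOS 3 is labeled potentially high PM risk regardless of the judgment made for its verification tasks.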
Table 4 on the MPTF website identifies 20 device types that fail to meet the MPTF's first recommended AEM program inclusion criterion (i.e., devices that could pose a risk of serious injury or death to a patient or staff member if they should fail from a PM-preventable cause). These are considered potentially high PM-risk devices, and in a phase 1 AEM program, these devices should continue to be maintained according to manufacturer recommendations.
Of note, these are considered to be only potentially high PM-risk devices because it may turn out that in phase 2 of the MPTF's project, certain manufacturer-model versions of these particular device types have demonstrated an acceptable level of PM-related reliability (i.e., they have an acceptably low likelihood of failing from a PM-preventable cause). Therefore, they then will meet the MPTF's second inclusion criterion and can be added to the AEM program.
With the exception of a number of device types that CMS has stated cannot be included in any kind of AEM program (elsewhere referred to as “taboo” devices [see sidebar]) and the 20 potentially high PM-risk devices, all other medical devices can be included in the phase 1 AEM program and immediately switched to alternate maintenance methods or completely different maintenance strategies.
A considerable number of devices do not have any failure modes (ways of failing) that are PM preventable. These are referenced in the box in the lower right corner of Figure 1. This could be the result of the recommended PM procedure having no critical device restoration or safety verification tasks or because the manufacturer does not offer a recommended PM procedure. Many of the devices in this subgroup are frequently inventoried devices, such as beds, exam tables, gurneys, wheelchairs, cast cutters, and chart recorders.
Similarly, a fairly large number of device types have PM-preventable failure modes that are judged to potentially create only an insignificant adverse effect (LOS 0) (e.g., otoscopes, ergometers, lab sample shakers, lab stirrers). Devices that are judged to have only the potential to result in a non-life-threatening injury (LOS 2) or to disrupt care (LOS 1) were identified during the phase 1 risk assessment. They are classified in Figure 1 as potentially moderate PM-risk and potentially low PM-risk devices, respectively.
As can be seen from the numerical distribution in Figure 1, this relatively simple phase 1 AEM program will reduce the facility's PM workload to a much lower level than that required by the default option of maintaining every single medical device according to the manufacturer recommendations. The logical implication in the CMS memo is that except for the “taboo” devices and the “potentially high PM-risk devices,” all other devices can be included in the facility's AEM program. Any additional risk from no longer performing PM on these devices according to the manufacturer's recommendations apparently is considered insignificant.
According to the CMS regulation, there are two device types (“medical laser devices” and “imaging and radiologic equipment”) that, at the moment (and for reasons not provided by CMS), cannot be subjected to alternate maintenance methods. Another two groups are excluded for other reasons. For more on this, see Table 2 in Baretich.³
Although this simple phase 1 AEM program will not identify every single device that will eventually prove to be eligible for alternate maintenance methods, it will identify a large number that can be included. After these devices are formally incorporated into a documented AEM program, they can be immediately transitioned to less stringent PM methods, including the highly efficient “run-to-failure” strategy mentioned in Appendix A of the CMS memo. Alternatively, at the very least, the manufacturer-recommended procedures can be modified, for example, by omitting electrical safety checks that the facility has found to be nonproductive or by changing the PM intervals to coincide more conveniently or more efficiently with other routines.
Extending the AEM program by incorporating the second recommended inclusion criterion (i.e., the “likelihood of PM-preventable failure”) into the risk analysis will result in more devices becoming eligible. However, this additional step will require credible estimates of the level of PM-related reliability of all the various manufacturer-model combinations of the device types that are excluded because they do not meet the first criterion. Because the MPTF believes that estimates of the PM-related reliability of these “PM-critical” devices, based solely on theoretical analyses and judgments (or guesses) by individuals claiming to be experts, will be a challenge, it has chosen to document the actual levels of PM-related reliability being demonstrated in the field. This will require the creation of a substantial database of real-world maintenance findings. That work is currently underway and will be described in part 2 of this two-part series of articles.
Implementing a Simple RCM-based AEM Program
On its website (www.HTMCommunitydB.org), the MPTF has posted an explanatory article titled “HTM ComDoc 16: Implementing a Simple RCM-based Alternate Equipment Management (AEM) program.” The following seven-step implementation plan is based on the more detailed version appearing in section 16.9 of that article. Links to some model policy documents are available in the article.
Step 1. Create a modified version of the facility's medical equipment inventory (titled “AEM-eligible medical equipment inventory”) by removing the four kinds of taboo devices that CMS specifically states cannot be included in an AEM program.
Step 2. Use the results of the MPTF's phase 1 risk assessment (documented in Table 4 on the MPTF website), or a version that you have modified and consider more acceptable, to identify which of the facility's devices are potentially high PM-risk devices. These are the devices that the facility should continue to maintain according to manufacturer recommendations.
Step 3. Create an AEM program policy that contains a section describing how the decision regarding which items are considered potentially high PM-risk devices was reached. A reference to this article and/or the tables on the MPTF's website may prove sufficient. Include a section describing the alternate maintenance strategies that might be used on the devices included in the AEM program (see HTM ComDoc 10 on the MPTF website).
Step 4. Decide which, if any, of the devices in the group considered eligible for inclusion in the AEM program will continue to be maintained according to manufacturer recommendations, rather than using some type of alternate maintenance strategy.
Step 5. Create a description for inclusion in the AEM program policy of the process that will be used to confirm that the levels of PM-related safety achieved by the alternate maintenance strategies being used in the AEM program are considered acceptable (described further in the following section). This should also describe the corrective actions that will be taken if the apparent level of PM-related safety becomes unacceptable.
Step 6. Create materials for training relevant hospital staff on how to explain the decisions that were used to place the various devices in the AEM program and the evidence used to confirm that the alternate maintenance strategies are not reducing overall levels of PM-related patient safety.
Step 7. Create a section for inclusion in the AEM policy describing the qualifications of the personnel managing the AEM program and those performing the AEM maintenance activities.
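Steps 1 and 2 of the plan above amount to sorting the inventory into three groups. The sketch below shows one way this could be done, assuming the inventory is available as a simple list of device-type names. The function name, the group labels, and the example sets used in the usage note are hypothetical placeholders; a facility would substitute the CMS exclusions and the list from Table 4 on the MPTF website (or its own modified version).

```python
from typing import Dict, Iterable, List, Set


def partition_inventory(
    inventory: Iterable[str],
    excluded_types: Set[str],
    high_pm_risk_types: Set[str],
) -> Dict[str, List[str]]:
    """Sort device types into the three groups implied by steps 1 and 2."""
    groups: Dict[str, List[str]] = {
        "excluded_by_cms": [],   # step 1: removed from the AEM-eligible inventory
        "manufacturer_pm": [],   # step 2: potentially high PM-risk devices
        "aem_eligible": [],      # everything else
    }
    for device_type in inventory:
        if device_type in excluded_types:
            groups["excluded_by_cms"].append(device_type)
        elif device_type in high_pm_risk_types:
            groups["manufacturer_pm"].append(device_type)
        else:
            groups["aem_eligible"].append(device_type)
    return groups
```

For example, calling `partition_inventory(["medical laser", "defibrillator", "otoscope"], {"medical laser"}, {"defibrillator"})` would place only the otoscope in the `aem_eligible` group. Step 4 then remains a local decision about which of the eligible devices will nevertheless stay on manufacturer-recommended PM.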
The three online tables making up this analysis are part of a wiki site (www.HTMCommunitydB.org). Members of the HTM community are invited to review the professional judgments reflected in this analysis and submit any comments, particularly suggested changes, to the MPTF at email@example.com.
The intent of the invitation is to encourage participation and, ultimately, the development of a legitimate consensus on the part of the HTM community.
Monitoring Levels of PM-Related Patient/Staff Safety
One of the better ways to generate confidence that a PM program is meeting its safety objective is to routinely code all corrective maintenance (repair) calls that are judged to be “PM preventable” as such and periodically report on how many of these involved “potentially high PM-risk” devices. A relatively low count would indicate that the devices in question are acceptably safe with respect to PM-preventable failures.
A good supplement to this would be to report on how frequently the potentially high PM-risk devices failed one or more of their PM inspections, either because a performance or safety verification test was failed or a critical nondurable part was found to be well past the time that it should have been restored. A low count here would similarly indicate that the devices in question are demonstrating high levels of PM-related reliability, with very few PM-preventable hidden failures.
For further discussion of these techniques and suggested ways of coding maintenance calls, see HTM ComDoc 15, section 15.3, on the MPTF website (www.HTMCommunitydB.org). Of course, any routine reporting such as this should include statistics on any device-related patient incidents in which harm was attributed to a PM-preventable device failure.
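The first of these monitoring measures can be sketched as a simple tally, assuming each repair call has already been coded as PM preventable or not. The `RepairCall` record and the `pm_preventable_counts` function are hypothetical illustrations and do not reproduce the coding scheme suggested in HTM ComDoc 15.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class RepairCall:
    """Hypothetical coded corrective maintenance (repair) call."""
    device_type: str
    pm_preventable: bool


def pm_preventable_counts(
    calls: Iterable[RepairCall], high_pm_risk_types: Set[str]
) -> Counter:
    """Count PM-preventable repair calls per potentially high PM-risk device type."""
    return Counter(
        call.device_type
        for call in calls
        if call.pm_preventable and call.device_type in high_pm_risk_types
    )
```

A periodic report could list these counts by device type. What constitutes an acceptably low count is not defined here and would need to be set in the facility's AEM program policy.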
Likelihood of PM-Preventable Failures
If the analysis described here is extended to include consideration of the second risk criterion—devices with an acceptably low likelihood of failing from a PM-preventable cause—it should prove possible to identify at least some manufacturer-model versions of these potentially high PM-risk device types that would be eligible for inclusion in an AEM program. These devices would, in fact, be classified as lower PM-risk devices because they are sufficiently reliable, even when maintained using PM strategies less stringent than those recommended by the manufacturer.
Incorporating data on the second criterion will result in an orderly ranking of all medical devices that have PM-preventable failure modes, ranging from those that are most critically dependent on timely, competent PM to keep them safe (from PM-preventable failures) to those that are the least dependent on timely PM. The MPTF calls the most PM-dependent devices “PM priority 1 devices.” These are devices that have potentially the most severe (serious, life-threatening) adverse outcomes and that are also found to be “quite likely” to fail from a PM-preventable cause. We will describe this ranking process, as well as the tasks involved in phase 2 of this project, in more detail in part 2 of this article series.
Malcolm Ridgway, PhD, CCE, FAIMBE, is a retired clinical engineer who has been a leader in several initiatives aimed at elevating the healthcare technology management field and advancing the works of its professionals. Email: firstname.lastname@example.org
Matthew F. Baretich, PE, PhD, is president of Baretich Engineering, Inc., based in Fort Collins, CO. Email: email@example.com
Matthew Clark, MBA, CHTM, is a clinical engineer at Advocate Health in Downers Grove, IL. Email: firstname.lastname@example.org
Stephen Grimes, FACCE, FHIMSS, FAIMBE, is managing partner at Strategic Health Care Technology Associates, LLC in Swampscott, MA, and a member of the BI&T Editorial Board. Email: email@example.com
Bhaskar Iduri, MS, CCE, CHTM, is director of clinical engineering and quality assurance at Renovo Solutions, LLC in Irvine, CA. Email: firstname.lastname@example.org
Michael W. Lane, CHTM, is director of Technical Services Partnership at the University of Vermont in Burlington, VT. Email: email@example.com
Alan Lipschultz, CCE, PE, CSP, CPPS, is president of HealthCare Technology Consulting, LLC in North Bethesda, MD, and vice president of the American College of Clinical Engineering. Email: firstname.lastname@example.org
Nancy Lum, MHSc, is a clinical engineer project manager at Massachusetts General Hospital in Boston. Email: email@example.com
Editor's note: This article is part 1 in a two-part series. Part 2, which will appear in the September/October issue of BI&T, will describe a more comprehensive alternate equipment management (AEM) program.