Recurring failures usually point to a flawed process. A well-crafted process failure mode and effects analysis (PFMEA) identifies workflow vulnerabilities before they create unplanned downtime. By contrast, a poorly designed process leads to assets whose health declines for no apparent reason and to incomplete reporting.

Understanding how to conduct a PFMEA is critical to building a reliable process. Treat each phase of the analysis as a checklist of concrete steps the next time you evaluate a process. More importantly, work through those steps in sequence to avoid the gaps where most analyses stall.

Book a Demo to see how LLumin CMMS+ supports every stage of your failure analysis workflow.

How to Conduct a PFMEA: A 10-Step Guide

Step 1: Evaluate Your Maintenance Workflow

Walk through the specific maintenance procedure. Map every task in sequence, from the initial trigger to job completion. Finally, confirm that every technician involved understands the operational flow before any vulnerability analysis begins.

Breaking the workflow into individual components ensures nothing gets overlooked. Failures tend to hide in seemingly routine steps, particularly when different technicians perform them differently.

Step 2: Identify Where Breakdowns Happen

Go through each workflow step and define exactly how it could fail. A single component can fail in multiple ways. For instance, it might be omitted, executed incorrectly, completed out of sequence, or delayed long enough to affect downstream work.

Don’t skip or combine failure modes during this phase. Reviewing historical work orders gives you a concrete record of failures that have already occurred, reducing the risk of overlooking issues your team has encountered before.
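One way to keep Steps 1 and 2 honest is to record the workflow and its failure modes as structured data, then check for steps that have no failure mode listed yet. The sketch below is illustrative only: the step names, the `WorkflowStep` class, and the `steps_missing_failure_modes` helper are hypothetical and not part of any CMMS API.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowStep:
    """One task in the maintenance procedure, stored in execution order."""
    name: str
    failure_modes: list[str] = field(default_factory=list)


def steps_missing_failure_modes(workflow: list[WorkflowStep]) -> list[str]:
    """Return the names of steps that have no failure mode documented yet."""
    return [step.name for step in workflow if not step.failure_modes]


# Hypothetical procedure: step names and failure modes are made up for illustration.
workflow = [
    WorkflowStep("Lock out equipment", ["omitted", "completed out of sequence"]),
    WorkflowStep("Inspect drive belt", ["executed incorrectly"]),
    WorkflowStep("Record readings in work order"),  # nothing documented yet
]

print(steps_missing_failure_modes(workflow))
```

A step that shows up in this list is not necessarily safe; it may simply not have been analyzed yet.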

Step 3: Define the Operational Impact

For each failure mode you’ve identified, describe exactly what happens when it occurs. Document specific consequences, such as unexpected downtime, safety hazards, OEE losses, downstream bottlenecks, or customer-facing defects.

Keep these descriptions precise. Vague effect statements make it difficult to assign accurate severity scores in the next step, and they reduce the usefulness of your PFMEA maintenance checklist as a reference document when failures recur.

Step 4: Assign a Severity Score

Score the potential damage of each failure effect on a scale from 1 to 10. A score of 10 indicates a failure that creates a sudden safety hazard without warning. By contrast, a score of 1 reflects minimal disruption to downstream operations or the end customer.

PFMEA Severity Score Breakdown

Score | Explanation | Example
1 | No discernible effect on operations or output | Cosmetic surface variation within spec
2 | Perceptible only to trained observers | Marginal sensor reading variance within acceptable range
3 | Perceptible to most technicians | Audible increase in bearing noise without performance loss
4 | Minor impact on output quality | Surface finish variation outside spec on a small percentage of parts
5 | Reduced secondary function | Intermittent sensor misreads causing minor scheduling delays
6 | Loss of secondary function | Cooling system partially offline, but equipment still operational
7 | Significant reduction in primary function | Pump running at 40% capacity due to impeller wear
8 | Complete loss of primary function | CNC spindle failure halting production entirely
9 | Safety hazard with advance warning | Hydraulic line failure preceded by pressure drop alert
10 | Sudden safety hazard without warning | Pressure vessel rupture with no prior indicators

Note that severity assesses the consequence of the failure, not how often it happens. Keep the two separate at this stage.

Step 5: Assign an Occurrence Score

Use your maintenance records to determine how frequently the root cause of each failure actually occurs. Estimating this number without data introduces bias: failures your team remembers most vividly tend to be overemphasized, while intermittent failures get underestimated. Before assigning a score, pull the documented breakdown frequency from your CMMS rather than relying on team recall.

If the root cause isn’t immediately clear, use the 5-Whys technique before scoring. Start with the observable failure and ask “why” repeatedly until you reach the underlying cause rather than a symptom. For example:

  1. Why did the conveyor stop? The motor overheated.
  2. Why did the motor overheat? The cooling fan wasn’t running.
  3. Why wasn’t the cooling fan running? The fan belt had snapped.
  4. Why did the fan belt snap? It hadn’t been replaced on schedule.
  5. Why wasn’t it replaced on schedule? No PM trigger existed for that component.

The root cause is the missing PM trigger, not the motor, the fan, or the belt. Assigning an occurrence score to the motor overheating would have targeted the wrong thing. Always score the root cause identified by your 5-Whys analysis, not the first symptom your team observed. Once the root cause is confirmed, assign a score using the scale below:

PFMEA Occurrence Score Breakdown

Score | Frequency | Example
1 | Highly unlikely | Failure has never occurred or is considered near-impossible
2 | Remote | Failure is unlikely; it is only theoretical or anecdotal
3 | Very low | Failure is uncommon; a few documented instances
4 | Low | Failure occurs rarely; isolated incidents on record
5 | Low-moderate | Failure occurs infrequently, but is not unusual
6 | Moderate | Occasional failure; occurs with some regularity
7 | Moderately high | Failure occurs often, but not on every cycle
8 | High | Failure occurs frequently and regularly
9 | Very high | Almost certain to occur; failure is near-inevitable
10 | Constant | Failure occurs repeatedly under normal operating conditions

A score of 10 indicates that failure occurs constantly under your operating conditions. A score of 1 means it is highly unlikely based on your maintenance history. When documented data is unavailable for a specific root cause, treat the lack of records as grounds to investigate further.
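As a rough illustration of scoring from records rather than recall, the sketch below counts documented occurrences per root cause and flags any root cause with no records at all. The work order tuples, root cause labels, and dates are invented for the example; a real CMMS export will have its own fields. The sketch deliberately stops at counting, because mapping a count to a 1-10 score depends on thresholds your team has to set.

```python
from collections import Counter

# Hypothetical work order export: (root_cause, date_closed). Not real data.
work_orders = [
    ("missing PM trigger", "2024-01-14"),
    ("missing PM trigger", "2024-03-02"),
    ("operator skipped lubrication", "2024-02-20"),
    ("missing PM trigger", "2024-05-11"),
]

# Root causes from the PFMEA worksheet that still need an occurrence score.
root_causes_to_score = [
    "missing PM trigger",
    "operator skipped lubrication",
    "incorrect torque spec",
]

# Count documented occurrences per root cause.
counts = Counter(cause for cause, _ in work_orders)

for cause in root_causes_to_score:
    n = counts.get(cause, 0)
    if n == 0:
        # No records is a reason to investigate, not evidence of a low score.
        print(f"{cause}: no records found, investigate before scoring")
    else:
        print(f"{cause}: {n} documented occurrence(s)")
```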

Step 6: Assign a Detection Score

Evaluate how effectively your existing controls (e.g., sensors, inspection tasks, preventive maintenance procedures) would catch this failure before it causes a full breakdown. Like the severity and occurrence scores, detection scores use a scale of 1 to 10. The scale is inverted, however: a lower score means your controls are more likely to catch the failure.

PFMEA Detection Score Breakdown

Score | Detection Level | Example
1 | Almost certain | Existing controls will virtually always detect the failure before it causes impact
2 | Very high | Controls are highly reliable; detection failure is rare
3 | High | Controls consistently catch the fault with only occasional misses
4 | Moderately high | Controls usually detect the failure but miss it under some conditions
5 | Moderate | Controls catch the failure roughly half the time; misses are not uncommon
6 | Low-moderate | Controls may detect the failure, but cannot be relied upon consistently
7 | Low | Controls occasionally detect the failure but miss it more often than not
8 | Very low | Controls rarely catch the fault; detection depends on chance or manual observation
9 | Remote | Controls exist in theory, but are unlikely to detect the failure in practice
10 | No detection | No controls exist; the failure would reach operations or the end customer without warning

Remember, this score reflects your current detection capability, not what you’re aiming to achieve. Use this as an opportunity to identify the gaps in your detection processes.

Step 7: Calculate Your Risk Priority Number

Multiply severity, occurrence, and detection together to produce the Risk Priority Number (RPN). Because each factor uses a 10-point scale, your RPN will always fall between 1 and 1,000. Consider the example below:

A food processing facility is evaluating a conveyor belt system. The team identifies three failure modes and scores each one:

Failure Mode | Severity (S) | Occurrence (O) | Detection (D) | RPN
Belt slippage under load | 5 | 7 | 3 | 105
Drive motor overheating | 8 | 4 | 6 | 192
Bearing seizure without warning | 9 | 3 | 8 | 216

Even though belt slippage occurs most often and drive motor overheating is serious, bearing seizure carries the highest RPN. Its severity score of 9 indicates a safety hazard, and its detection score of 8 means current controls rarely catch it in time. Together, those two scores outweigh its low occurrence score of 3. As a result, the team flags it for immediate attention: a vibration sensor is installed, and the bearing is added to a condition-based replacement schedule.

Use RPN rankings to prioritize which failure modes to address first. Note, however, that failure modes with a severity score of 9 or 10 warrant immediate attention regardless of their overall RPN. High-consequence failures are time-sensitive even when they occur infrequently.
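The calculation and ranking can be sketched in a few lines of Python using the conveyor example from this step. The `FailureMode` class and `prioritize` function are hypothetical names chosen for illustration. Sorting severity 9-10 items ahead of everything else is one possible reading of the "regardless of RPN" rule, not a standard algorithm.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1-10
    occurrence: int  # 1-10
    detection: int   # 1-10

    @property
    def rpn(self) -> int:
        """Risk Priority Number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def prioritize(modes):
    """Put severity 9-10 items first, then sort by descending RPN."""
    return sorted(modes, key=lambda m: (m.severity < 9, -m.rpn))


# Scores taken from the conveyor example in this step.
modes = [
    FailureMode("Belt slippage under load", 5, 7, 3),
    FailureMode("Drive motor overheating", 8, 4, 6),
    FailureMode("Bearing seizure without warning", 9, 3, 8),
]

for m in prioritize(modes):
    flag = " (severity 9-10: act regardless of RPN)" if m.severity >= 9 else ""
    print(f"{m.name}: RPN {m.rpn}{flag}")
```

In this example the severity rule and the RPN ranking agree, since bearing seizure leads on both. The two can diverge when a severity 9 or 10 failure mode has low occurrence and strong detection.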

Step 8: Build a Targeted Improvement Strategy

For each high-priority failure mode, define specific corrective actions to reduce the RPN. The right approach depends on which score you are trying to lower:

  • Detection: Add condition-based maintenance triggers so failures are caught earlier
  • Occurrence: Update standard operating procedures to reduce how often the failure happens
  • Severity: Implement fail-safe controls to limit the consequences

Note that lowering the severity score typically requires physically modifying equipment or layout, making it the hardest lever to move. Whichever lever you choose, assign a responsible party and a firm completion deadline to every action before moving to execution.
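A simple readiness check can enforce the owner-and-deadline rule before execution starts. This is a minimal sketch: the `CorrectiveAction` fields, the owner title, and the date are all hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CorrectiveAction:
    failure_mode: str
    lever: str  # "detection", "occurrence", or "severity"
    description: str
    owner: Optional[str] = None
    due: Optional[date] = None


def not_ready(actions):
    """Return actions that still lack an owner or a completion deadline."""
    return [a for a in actions if a.owner is None or a.due is None]


# Hypothetical action plan; the owner and date are placeholders.
actions = [
    CorrectiveAction(
        "Bearing seizure without warning", "detection",
        "Install vibration sensor",
        owner="Reliability lead", due=date(2025, 3, 1),
    ),
    CorrectiveAction(
        "Belt slippage under load", "occurrence",
        "Update belt tensioning procedure",
    ),
]

for a in not_ready(actions):
    print(f"Not ready for execution: {a.description} ({a.failure_mode})")
```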

Step 9: Execute Your New Maintenance Controls

Put the plan into action by creating the necessary work orders and updating relevant procedures. Then confirm that every technician on the affected workflow understands the changes. Larger corrective actions may require phased rollouts or a temporary monitoring period before the updated procedure is fully standardized.

CMMS software makes this execution trackable. Work orders tied to each corrective action give your team a clear record of what was done, when, and by whom. This is especially important when you reassess the risk level in Step 10.

Step 10: Reassess the Risk Level

After your new controls are in place, run the scoring process again. Reassign severity, occurrence, and detection ratings per the updated process, and recalculate the RPN.

A reduced RPN confirms the corrective action worked. If the recalculated score remains above your acceptable threshold, your team needs a new action plan and another implementation cycle. Treat your PFMEA as a living document that evolves with your processes, equipment, and operating conditions.
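The reassessment can be expressed as a before-and-after comparison. In the sketch below, the threshold of 100 and the "after" scores are assumptions made for illustration. It assumes the vibration sensor from the Step 7 example improved the bearing's detection score from 8 to 3; your own acceptance threshold and rescored values will differ.

```python
RPN_THRESHOLD = 100  # hypothetical acceptance threshold; set your own


def rpn(severity, occurrence, detection):
    """Risk Priority Number: severity x occurrence x detection."""
    return severity * occurrence * detection


def reassess(before, after, threshold=RPN_THRESHOLD):
    """Compare (S, O, D) scores before and after a corrective action."""
    rpn_before, rpn_after = rpn(*before), rpn(*after)
    return {
        "rpn_before": rpn_before,
        "rpn_after": rpn_after,
        "reduced": rpn_after < rpn_before,
        "needs_new_action_plan": rpn_after > threshold,
    }


# "before" uses the bearing seizure scores from Step 7;
# "after" is a hypothetical rescoring with improved detection.
result = reassess(before=(9, 3, 8), after=(9, 3, 3))
print(result)
```

Note that severity is still 9 in this example, so the failure mode stays on the watch list even though its RPN has dropped below the assumed threshold.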

Make Your PFMEA More Effective With LLumin CMMS+

Knowing how to conduct a PFMEA is one thing. Having the data to do it accurately is another. Paper checklists and spreadsheets make it difficult to retrieve reliable historical failure data. When your occurrence scores are built on memory rather than records, the entire analysis becomes less trustworthy.

LLumin’s asset management software provides asset histories, work order records, and corresponding cost data in one place. That data is key to ensuring that every rating in your PFMEA is as precise as possible. Once corrective actions are identified, LLumin connects them directly to scheduled work orders. This ensures your team consistently executes the new preventive measures rather than leaving them in a document.

Book a Demo to see how LLumin CMMS+ supports your failure analysis program from assessment through corrective action.

Frequently Asked Questions

What Are The 10 Steps Of A PFMEA?

The 10 steps are: (1) evaluate your maintenance workflow, (2) identify where breakdowns happen, (3) define the operational impact, (4) assign a severity score, (5) assign an occurrence score, (6) assign a detection score, (7) calculate your Risk Priority Number, (8) build a targeted improvement strategy, (9) execute your new maintenance controls, and (10) reassess the risk level after changes are in place.

How Do You Calculate A Risk Priority Number (RPN)?

RPN = Severity × Occurrence × Detection, with each factor rated on a 1–10 scale. The result falls between 1 and 1,000. Higher scores indicate higher-priority failure modes. Always treat high-severity failure modes as urgent, regardless of overall RPN, since a low-frequency failure with catastrophic consequences still warrants immediate action.

When Should A Maintenance Team Perform A PFMEA?

Perform a PFMEA:

  • Before implementing a new maintenance process
  • After a significant equipment change
  • When the same failure mode recurs despite previous corrective actions

Revisit your PFMEA maintenance checklist when staffing, available equipment, or operating environments could affect process reliability.

What Is The Difference Between FMEA And PFMEA?

FMEA in its design-focused form (often called DFMEA) evaluates potential failures in a product or system design, asking whether the equipment itself could fail. By contrast, PFMEA evaluates potential failures in the process used to maintain or operate that equipment, asking whether the procedure could fail. Both methods use RPNs, but design FMEA results typically lead to design changes, while PFMEA results typically lead to revised process control plans.

How Does CMMS Software Improve PFMEA?

A CMMS provides the historical maintenance data that makes PFMEA scoring accurate. Occurrence scores, for example, become more reliable when they’re based on documented breakdown frequency rather than technician recall. Similarly, severity scores better reflect real consequences when repair costs and downtime records are attached to specific failure modes. After the analysis, a CMMS ensures corrective actions are converted into scheduled work orders rather than remaining items on a checklist.

VP of Operations at LLumin CMMS+

With over 15 years of experience, Ann Porten stands as a seasoned leader in asset management, ERP Solutions, and B2B Sales. Her extensive background in manufacturing has equipped her with unique insights, enabling her to navigate complex software solutions with precision and drive results. Currently, as the Director of Business Development for LLumin, Ann has led organizations in various industries, including Manufacturing, Construction, Pharmaceuticals, Food & Beverage, and Oil & Gas, to identify their business opportunities and challenges and to implement profitable solutions. Her reputation as a trusted advisor and industry leader stems from her dedication to delivering economic success and satisfaction to the customers she serves.
