How to Conduct a PFMEA: A Step-by-Step Guide for Maintenance Techs
Recurring failures typically indicate a flawed process. A well-crafted process failure mode and effects analysis (PFMEA) identifies workflow vulnerabilities before they create unplanned downtime. By contrast, a poorly designed process leads to unexplained declines in asset health and incomplete reporting.
Understanding how to conduct a PFMEA is critical to ensuring a sound process. The next time you evaluate a process, treat each phase of the analysis as a checklist of concrete steps. More importantly, work through those steps in sequence to avoid the gaps where most analyses stall.
How to Conduct a PFMEA: A 10-Step Guide
Step 1: Evaluate Your Maintenance Workflow
Walk through the specific maintenance procedure. Map every task in sequence, from the initial trigger to job completion. Finally, confirm that every technician involved understands the operational flow before any vulnerability analysis begins.
Breaking the workflow into individual components ensures nothing gets overlooked. Failures tend to hide in seemingly routine steps, particularly when different technicians perform them differently.
Step 2: Identify Where Breakdowns Happen
Go through each workflow step and define exactly how it could fail. A single component can fail in multiple ways. For instance, it might be omitted, executed incorrectly, completed out of sequence, or delayed long enough to affect downstream work.
Don’t skip or combine failure modes during this phase. Reviewing historical work orders gives you a concrete record of failures that have already occurred, reducing the risk of overlooking issues your team has encountered before.
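One way to make sure no failure mode gets skipped is to pair every workflow step with each of the generic failure patterns named above. The sketch below is illustrative only: the function name `candidate_failure_modes` and the sample workflow steps are assumptions, not part of any particular CMMS.

```python
from itertools import product

# The four generic ways a step can fail, as described in Step 2.
FAILURE_PATTERNS = (
    "omitted",
    "executed incorrectly",
    "completed out of sequence",
    "delayed",
)

def candidate_failure_modes(workflow_steps):
    """Pair every workflow step with every generic failure pattern."""
    return [
        {"step": step, "failure_mode": pattern}
        for step, pattern in product(workflow_steps, FAILURE_PATTERNS)
    ]

# Hypothetical workflow, for illustration only.
steps = ["Lock out conveyor", "Inspect fan belt", "Log work order"]
modes = candidate_failure_modes(steps)

for mode in modes:
    print(f"{mode['step']}: {mode['failure_mode']}")
```

The resulting list is a starting point for team review. Some pairings will not apply to a given step, and historical work orders may reveal failure modes that fit none of the four patterns.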
Step 3: Define the Operational Impact
For each failure mode you’ve identified, describe exactly what happens when it occurs. Document specific consequences, such as unexpected downtime, safety hazards, OEE losses, downstream bottlenecks, or customer-facing defects.
Keep these descriptions precise. Vague effect statements make it difficult to assign accurate severity scores in the next step, and they reduce the usefulness of your PFMEA maintenance checklist as a reference document when failures recur.
Step 4: Assign a Severity Score
Score the potential damage of each failure effect on a scale from 1 to 10. A score of 10 indicates a dangerously severe failure that creates a sudden safety hazard without warning. By contrast, a score of 1 reflects minimal disruption to downstream operations or the end customer.
PFMEA Severity Score Breakdown
| Score | Explanation | Example |
|---|---|---|
| 1 | No discernible effect on operations or output | Cosmetic surface variation within spec |
| 2 | Perceptible only to trained observers | Marginal sensor reading variance within acceptable range |
| 3 | Perceptible to most technicians | Audible increase in bearing noise without performance loss |
| 4 | Minor impact on output quality | Surface finish variation outside spec on a small percentage of parts |
| 5 | Reduced secondary function | Intermittent sensor misreads causing minor scheduling delays |
| 6 | Loss of secondary function | Cooling system partially offline, but equipment still operational |
| 7 | Significant reduction in primary function | Pump running at 40% capacity due to impeller wear |
| 8 | Complete loss of primary function | CNC spindle failure halting production entirely |
| 9 | Safety hazard with advanced warning | Hydraulic line failure preceded by pressure drop alert |
| 10 | Sudden safety hazard without warning | Pressure vessel rupture with no prior indicators |
Note that severity assesses the consequence of the failure, not how often it happens. Keep the two separate at this stage.
Step 5: Assign an Occurrence Score
Use your maintenance records to determine how frequently the root cause of each failure actually occurs. Estimating this number without data introduces bias: failures your team remembers most vividly tend to be overemphasized, while intermittent failures get underestimated. Before assigning a score, pull documented breakdown frequency from your CMMS rather than relying on team recall.
If the root cause isn’t immediately clear, use the 5-Whys technique before scoring. Start with the observable failure and ask “why” repeatedly until you reach the underlying cause rather than a symptom. For example:
- Why did the conveyor stop? The motor overheated.
- Why did the motor overheat? The cooling fan wasn’t running.
- Why wasn’t the cooling fan running? The fan belt had snapped.
- Why did the fan belt snap? It hadn’t been replaced on schedule.
- Why wasn’t it replaced on schedule? No PM trigger existed for that component.
The root cause is the missing PM trigger, not the motor, the fan, or the belt. Assigning an occurrence score to the motor overheating would have targeted the wrong thing. Always score the root cause identified by your 5-Whys analysis, not the first symptom your team observed. Once the root cause is confirmed, assign a score using the scale below:
PFMEA Occurrence Score Breakdown
| Score | Frequency | Example |
|---|---|---|
| 1 | Highly unlikely | Failure has never occurred or is considered near-impossible |
| 2 | Remote | Failure is unlikely; it is only theoretical or anecdotal |
| 3 | Very low | Failure is uncommon; a few documented instances |
| 4 | Low | Failure occurs rarely; isolated incidents on record |
| 5 | Low-moderate | Failure occurs infrequently, but is not unusual |
| 6 | Moderate | Occasional failure; occurs with some regularity |
| 7 | Moderately high | Failure occurs often, but not on every cycle |
| 8 | High | Failure occurs frequently and regularly |
| 9 | Very high | Almost certain to occur; failure is near-inevitable |
| 10 | Constant | Failure occurs repeatedly under normal operating conditions |
A score of 10 indicates that failure occurs constantly under your operating conditions. A score of 1 means it is highly unlikely based on your maintenance history. When documented data is unavailable for a specific root cause, treat the lack of records as grounds to investigate further.
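As a rough illustration of scoring from records rather than recall, the sketch below counts documented breakdowns per root cause and flags any root cause with no records. The record layout, the `root_cause` field name, and the sample data are all assumptions; a real CMMS export will look different.

```python
from collections import Counter

def breakdown_counts(work_orders, root_causes):
    """Count documented breakdowns per root cause.

    Root causes with no matching records come back with a count of 0
    so they can be flagged for investigation instead of scored from memory.
    """
    counts = Counter(wo["root_cause"] for wo in work_orders)
    return {cause: counts[cause] for cause in root_causes}

# Hypothetical CMMS export; field names and values are illustrative only.
work_orders = [
    {"id": 101, "asset": "Conveyor 3", "root_cause": "missing PM trigger"},
    {"id": 102, "asset": "Conveyor 3", "root_cause": "missing PM trigger"},
    {"id": 103, "asset": "Conveyor 3", "root_cause": "incorrect belt tension"},
]
root_causes = ["missing PM trigger", "incorrect belt tension", "lubrication skipped"]

counts = breakdown_counts(work_orders, root_causes)
needs_investigation = [cause for cause, n in counts.items() if n == 0]

print(counts)
print("Investigate further:", needs_investigation)
```

The counts feed the team's judgment about where each root cause sits on the 1 to 10 scale. The sketch deliberately does not convert counts to scores, because that mapping depends on your own operating cycles and history.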
Step 6: Assign a Detection Score
Evaluate how effectively your existing controls (e.g., sensors, inspection tasks, preventive maintenance procedures) would catch this failure before it causes a full breakdown. Similar to the severity and occurrence scores, detection scores are measured on a scale of 1 to 10.
PFMEA Detection Score Breakdown
| Score | Detection Level | Example |
|---|---|---|
| 1 | Almost certain | Existing controls will virtually always detect the failure before it causes impact |
| 2 | Very high | Controls are highly reliable; detection failure is rare |
| 3 | High | Controls consistently catch the fault with only occasional misses |
| 4 | Moderately high | Controls usually detect the failure but miss it under some conditions |
| 5 | Moderate | Controls catch the failure roughly half the time; misses are not uncommon |
| 6 | Low-moderate | Controls may detect the failure, but cannot be relied upon consistently |
| 7 | Low | Controls occasionally detect the failure but miss it more often than not |
| 8 | Very low | Controls rarely catch the fault; detection depends on chance or manual observation |
| 9 | Remote | Controls exist in theory, but are unlikely to detect the failure in practice |
| 10 | No detection | No controls exist; the failure would reach operations or the end customer without warning |
Remember, this score reflects your current detection capability, not what you’re aiming to achieve. Use this as an opportunity to identify the gaps in your detection processes.
Step 7: Calculate Your Risk Priority Number
Multiply severity, occurrence, and detection together to produce the Risk Priority Number (RPN). Because each factor uses a 10-point scale, your RPN will always fall between 1 and 1,000. Consider the example below:
A food processing facility is evaluating a conveyor belt system. The team identifies three failure modes and scores each one:
| Failure Mode | Severity (S) | Occurrence (O) | Detection (D) | RPN |
|---|---|---|---|---|
| Belt slippage under load | 5 | 7 | 3 | 105 |
| Drive motor overheating | 8 | 4 | 6 | 192 |
| Bearing seizure without warning | 9 | 3 | 8 | 216 |
Even though belt slippage occurs often and drive motor overheating is serious, bearing seizure carries the highest RPN. That’s because it combines a severity score of 9, which indicates a safety hazard, with a detection score of 8, which means existing controls rarely catch it. As a result, the team flags it for immediate attention: a vibration sensor is installed, and the bearing is added to a condition-based replacement schedule.
Use RPN rankings to prioritize which failure modes to address first. Note, however, that failure modes with a severity score of 9 or 10 warrant immediate attention regardless of their overall RPN. High-consequence failures are time-sensitive even when they occur infrequently.
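The conveyor example can be reproduced in a few lines. This is a minimal sketch, not a prescribed implementation: the `FailureMode` class and `prioritize` function are hypothetical names, and sorting high-severity modes ahead of everything else is one possible reading of the severity 9 or 10 rule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int
    occurrence: int
    detection: int

    def __post_init__(self):
        # Each factor must sit on the 1-10 scale used throughout the analysis.
        for label in ("severity", "occurrence", "detection"):
            value = getattr(self, label)
            if not 1 <= value <= 10:
                raise ValueError(f"{label} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

    @property
    def high_severity(self) -> bool:
        # Severity 9 or 10 warrants attention regardless of RPN.
        return self.severity >= 9

def prioritize(modes):
    """Sort high-severity modes first, then by descending RPN."""
    return sorted(modes, key=lambda m: (not m.high_severity, -m.rpn))

modes = [
    FailureMode("Belt slippage under load", 5, 7, 3),
    FailureMode("Drive motor overheating", 8, 4, 6),
    FailureMode("Bearing seizure without warning", 9, 3, 8),
]

for m in prioritize(modes):
    flag = " (severity 9-10: act regardless of RPN)" if m.high_severity else ""
    print(f"{m.name}: RPN {m.rpn}{flag}")
```

In this example the severity rule and the RPN ranking agree, since bearing seizure tops both. The two-part sort key matters when they disagree, for instance a severity-10 failure with a low RPN because it is rare and easily detected.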
Step 8: Build a Targeted Improvement Strategy
For each high-priority failure mode, define specific corrective actions to reduce the RPN. Common approaches depend on the primary analysis concern:
- Detection: Add condition-based maintenance triggers to improve detection
- Occurrence: Update standard operating procedures to reduce instances
- Severity: Implement fail-safe controls
Note that lowering the severity score typically requires physically modifying equipment or layout, making it the hardest lever to move. Whichever lever you choose, assign a responsible party and a firm completion deadline to every action before moving to execution.
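A simple check can catch corrective actions that lack an owner or a deadline before execution starts. The sketch below assumes a hypothetical `CorrectiveAction` record; the field names, owner, and date are illustrative and do not come from any specific CMMS.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class CorrectiveAction:
    failure_mode: str
    lever: str                    # "detection", "occurrence", or "severity"
    description: str
    owner: Optional[str] = None   # responsible party
    due: Optional[date] = None    # firm completion deadline

def unready_actions(actions):
    """Return actions that are missing an owner or a deadline."""
    return [a for a in actions if not a.owner or a.due is None]

# Hypothetical plan for the bearing seizure failure mode.
plan = [
    CorrectiveAction(
        failure_mode="Bearing seizure without warning",
        lever="detection",
        description="Install vibration sensor",
        owner="Reliability engineer",
        due=date(2025, 1, 31),
    ),
    CorrectiveAction(
        failure_mode="Bearing seizure without warning",
        lever="occurrence",
        description="Add bearing to condition-based replacement schedule",
    ),
]

for action in unready_actions(plan):
    print(f"Not ready for execution: {action.description}")
```

Here the second action would be held back until someone owns it and a date is set.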
Step 9: Execute Your New Maintenance Controls
Put the plan into action by creating the necessary work orders and updating relevant procedures. Then confirm that every technician on the affected workflow understands the changes. Larger corrective actions may require phased rollouts or a temporary monitoring period before the updated procedure is fully standardized.
CMMS software makes this execution trackable. Work orders tied to each corrective action give your team a clear record of what was done, when, and by whom. This is especially important when you reassess the risk level in Step 10.
Step 10: Reassess the Risk Level
After your new controls are in place, run the scoring process again. Reassign severity, occurrence, and detection ratings per the updated process, and recalculate the RPN.
A reduced RPN confirms the corrective action worked. If the recalculated score remains above your acceptable threshold, your team needs a new action plan and another implementation cycle. Treat your PFMEA as a living document that evolves with your processes, equipment, and operating conditions.
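The reassessment reduces to a before-and-after comparison against your threshold. In the sketch below, the rescored values and the threshold of 100 are assumptions chosen for illustration. The source example does not state post-action scores, and acceptable thresholds vary by team.

```python
def rpn(severity, occurrence, detection):
    """Risk Priority Number: the product of the three 1-10 scores."""
    return severity * occurrence * detection

def reassess(before, after, threshold):
    """Compare RPN before and after a corrective action.

    `before` and `after` are (severity, occurrence, detection) tuples.
    `threshold` is whatever RPN your team treats as acceptable.
    """
    rpn_before, rpn_after = rpn(*before), rpn(*after)
    return {
        "rpn_before": rpn_before,
        "rpn_after": rpn_after,
        "reduced": rpn_after < rpn_before,
        "needs_new_action_plan": rpn_after > threshold,
    }

# Hypothetical rescoring: a vibration sensor improves detection from 8 to 3.
result = reassess(before=(9, 3, 8), after=(9, 3, 3), threshold=100)
print(result)
```

Note that severity stays at 9 in this hypothetical, so the failure mode would still deserve close monitoring under the severity rule from Step 7, even with the lower RPN.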
Make Your PFMEA More Effective With LLumin CMMS+
Knowing how to conduct a PFMEA is one thing. Having the data to do it accurately is another. Paper checklists and spreadsheets make it difficult to retrieve reliable historical failure data. When your occurrence scores are built on memory rather than records, the entire analysis becomes less trustworthy.
LLumin’s asset management software provides asset histories, work order records, and corresponding cost data in one place. That data is key to ensuring that every rating in your PFMEA is as precise as possible. Additionally, once corrective actions are identified, LLumin connects them directly to scheduled work orders. This ensures your team consistently executes the new preventive measures rather than leaving them in a document.
Book a Demo to see how LLumin CMMS+ supports your failure analysis program from assessment through corrective action.
Frequently Asked Questions
What Are The 10 Steps Of A PFMEA?
The 10 steps are: (1) evaluate your maintenance workflow, (2) identify where breakdowns happen, (3) define the operational impact, (4) assign a severity score, (5) assign an occurrence score, (6) assign a detection score, (7) calculate your Risk Priority Number, (8) build a targeted improvement strategy, (9) execute your new maintenance controls, and (10) reassess the risk level after changes are in place.
How Do You Calculate A Risk Priority Number (RPN)?
RPN = Severity × Occurrence × Detection, with each factor rated on a 1–10 scale. The result falls between 1 and 1,000. Higher scores indicate higher-priority failure modes. Always treat high-severity failure modes as urgent, regardless of overall RPN, since a low-frequency failure with catastrophic consequences still warrants immediate action.
When Should A Maintenance Team Perform A PFMEA?
Perform a PFMEA:
- Before implementing a new maintenance process
- After a significant equipment change
- When the same failure mode recurs despite previous corrective actions
Revisit your PFMEA maintenance checklist when staffing, available equipment, or operating environments could affect process reliability.
What Is The Difference Between FMEA And PFMEA?
FMEA, in the design-focused sense used here (often called DFMEA), evaluates potential failures in a product or system design, asking whether the equipment itself could fail. By contrast, PFMEA evaluates potential failures in the process used to maintain or operate that equipment, asking whether the procedure could fail. Both methods use RPNs, but design-focused RPNs typically lead to design changes, while PFMEA RPNs lead to revised process control plans.
How Does CMMS Software Improve PFMEA?
A CMMS provides the historical maintenance data that makes PFMEA scoring accurate. Occurrence scores, for example, become more reliable when they’re based on documented breakdown frequency rather than technician recall. Similarly, severity scores better reflect real consequences when repair costs and downtime records are attached to specific failure modes. After the analysis, a CMMS ensures corrective actions are converted into scheduled work orders rather than remaining items on a checklist.
With over 15 years of experience, Ann Porten stands as a seasoned leader in asset management, ERP Solutions, and B2B Sales. Her extensive background in manufacturing has equipped her with unique insights, enabling her to navigate complex software solutions with precision and drive results. Currently, as the Director of Business Development for LLumin, Ann has led various industries, including Manufacturing, Construction, Pharmaceuticals, Food & Beverage, and Oil & Gas, in identifying their business opportunities and challenges and implementing profitable solutions. Her reputation as a trusted advisor and industry leader stems from her dedication to delivering economic success and satisfaction to the customers she serves.
