Master downtime analysis with this practical step-by-step framework. Learn how to identify bad actor assets, run root cause analysis, and build a repeatable process that reduces unplanned stoppages.

If you run a production line, you feel downtime in your bones.
Lost output. Overtime to catch up. Operators standing around. Angry emails about missed orders.
Most plants collect downtime data somewhere – in a CMMS, SCADA, MES, or even in spreadsheets – but very few teams have a simple, repeatable downtime analysis process they run every week.
This article lays out a step-by-step framework for downtime analysis that any maintenance team can use. We will start from the data in your CMMS and finish with a focused action plan for reducing unplanned stoppages.
Downtime analysis is the process of taking all the small and large stops on your equipment, organising them, and identifying where the lost minutes are concentrated and why.
Done well, downtime analysis gives you a short list of bad actor assets, the main causes behind them, and a focused plan for fixing them.
Done badly, it becomes just another report that nobody reads.
The goal of this framework is to make downtime analysis practical, fast, and repeatable, not perfect.
Before you touch the data, define the scope. Otherwise, you get lost in noise.
Decide on:
Time window – e.g. the last 30 or 90 days.
Area / line / asset group – e.g. one critical production line.
Downtime types – e.g. unplanned stoppages only, excluding planned maintenance.
A good starting point:
Analyse unplanned downtime on one critical line for the last 90 days.
That gives you enough data for patterns without becoming overwhelming.
Next, you need a data set you can actually work with.
From your CMMS or downtime logging system, export at least the following fields: asset name, start time, end time, duration (if available), and a cause code or free-text comment.
CSV or Excel is fine – the key is consistency.
If your system does not calculate duration, you can derive it as:
Duration (minutes) = End time – Start time
Make sure your export covers the time window and assets you chose in Step 1.
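If your system does not provide a duration column, it only takes a few lines of pandas to derive it. A minimal sketch – the column names and sample rows here are assumptions, so adjust them to match your own export:

```python
import io
import pandas as pd

# Hypothetical CSV export with start/end timestamps but no duration column
csv_data = io.StringIO(
    "asset,start_time,end_time\n"
    "Filler 1,2024-01-05 08:00,2024-01-05 08:25\n"
    "Filler 1,2024-01-06 14:10,2024-01-06 14:52\n"
)

df = pd.read_csv(csv_data, parse_dates=["start_time", "end_time"])

# Duration (minutes) = End time - Start time
df["duration_min"] = (df["end_time"] - df["start_time"]).dt.total_seconds() / 60

print(df[["asset", "duration_min"]])
```

The same calculation works in Excel, but doing it in code makes the weekly refresh repeatable.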
If you need more guidance on working with CMMS exports and data quality, see our Ultimate Guide to CMMS Data Analysis.
Raw downtime data is messy. If you try to analyse it without cleaning, you will get junk.
Focus on a few simple cleaning rules:
Remove obvious errors – zero-length events, negative durations, and duplicate entries.
Standardise timestamps – one date format and one timezone.
Check asset names – merge variants such as "Line 1" and "Line-01".
Filter to the scope – keep only the time window and assets you chose in Step 1.
You do not need perfect data. You just need it clean enough that your numbers are credible to your team.
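The cleaning rules above can be sketched in a few lines of pandas. The column names and asset-name variants here are illustrative assumptions, not a prescription:

```python
import pandas as pd

# Hypothetical messy export (column names and values are illustrative)
raw = pd.DataFrame({
    "asset": ["Line 1", "line-1", "Line 2", "Line 1"],
    "start_time": pd.to_datetime(
        ["2024-01-05 08:00", "2024-01-06 09:00",
         "2024-01-07 10:00", "2024-01-05 08:00"]),
    "duration_min": [25.0, -5.0, 40.0, 25.0],
})

# Rule 1: remove obvious errors (zero or negative durations)
clean = raw[raw["duration_min"] > 0]

# Rule 2: standardise asset names so variants group together
clean = clean.assign(
    asset=clean["asset"].str.lower()
          .str.replace(r"[\s\-]+", " ", regex=True).str.strip()
)

# Rule 3: drop exact duplicate events
clean = clean.drop_duplicates()

print(clean)
```

Filtering to the scope from Step 1 is then one more boolean filter on `start_time` and `asset`.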
To make downtime analysis useful, you need a way of grouping events into meaningful buckets.
If you already have good cause codes in your CMMS, use them. If not, build a simple classification from what you have:
High-level categories – e.g. mechanical, electrical, process, operational.
Sub-categories where useful – e.g. "mechanical: bearing failure".
You can build this classification by reviewing the comment fields, grouping similar descriptions together, and agreeing a short list of codes with the team.
Start simple and improve it over time. The aim is to be able to say:
"Most of our unplanned downtime comes from this category on these few assets."
With your cleaned and classified data, you can now calculate a few key metrics.
For each asset and each downtime category, calculate:
Total downtime (minutes) – Sum of all durations.
Number of events – How many times the downtime occurred.
Average duration per event – Total downtime ÷ number of events.
Mean Time Between Failures (MTBF) – Time in operation ÷ number of failures. Even a rough estimate (e.g. hours run per week) is useful.
Mean Time To Repair (MTTR) – Total repair time ÷ number of failures. Approximated by the average downtime duration if you don't have detailed labour time.
You do not need to turn this into a complicated reliability study. The practical questions are simple: which assets lose the most minutes, how often do they stop, and how long does each stop last?
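All of these metrics fall out of a single groupby once the events are in a table. A sketch with made-up numbers – the operating-hours figure used for the rough MTBF is an assumption you would replace with your own:

```python
import pandas as pd

# Hypothetical cleaned downtime events (column names are assumptions)
events = pd.DataFrame({
    "asset": ["Filler 1", "Filler 1", "Filler 1", "Capper", "Capper"],
    "duration_min": [30.0, 20.0, 40.0, 10.0, 20.0],
})

metrics = events.groupby("asset")["duration_min"].agg(
    total_downtime_min="sum",
    n_events="count",
    avg_duration_min="mean",  # approximates MTTR without detailed labour time
)

# Rough MTBF: assumed operating hours over the window / number of failures
OPERATING_HOURS = 90 * 16  # e.g. 90 days x 16 running hours/day (assumption)
metrics["mtbf_hours"] = OPERATING_HOURS / metrics["n_events"]

print(metrics)
```

Even this rough table is enough to rank assets and spot whether the problem is many short stops or a few long ones.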
Now we apply the classic 80/20 principle.
Create Pareto charts (bar charts ordered from highest to lowest) for:
Downtime by asset
Downtime by category / cause code
You will almost always see the same pattern: a small number of assets and causes account for most of the lost minutes.
Those are your "vital few" bad actors.
For example, you might see two assets accounting for well over half of your unplanned minutes, driven by one or two recurring causes.
This immediately focuses your improvement effort.
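The numbers behind a Pareto chart are just a sorted list with a cumulative percentage. A sketch with illustrative figures – the chart itself is then simply a bar chart of this table:

```python
import pandas as pd

# Hypothetical total downtime per asset (values are illustrative)
totals = pd.Series(
    {"Filler 1": 420, "Capper": 180, "Labeller": 90,
     "Palletiser": 60, "Wrapper": 50},
    name="downtime_min",
)

# Pareto ordering: largest first, with cumulative share of total downtime
pareto = totals.sort_values(ascending=False).to_frame()
pareto["cum_pct"] = pareto["downtime_min"].cumsum() / pareto["downtime_min"].sum() * 100

print(pareto)
```

In this made-up example, the top two assets already account for 75% of the downtime – the "vital few" in a single glance.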
Modern AI tools can automatically detect these patterns in your data. If you're interested in how AI is transforming downtime pattern detection, see our guide on AI & Machine Learning in Maintenance.
Numbers tell you where the pain is. People tell you why it is happening.
Take your top 3–5 downtime causes or assets and run a quick root cause analysis session with the operators, technicians, and supervisors who see the failures first-hand.
Use simple tools: 5 Whys, a quick fishbone diagram, or a timeline of a typical event.
Ground the discussion in real data: event counts, durations, and recent examples from your export.
Aim to finish with an agreed root cause, or a short list of likely causes, for each item.
Insight is useless without action.
For each major downtime cause or bad actor asset, define:
Specific action – E.g. redesign a problematic chute, introduce a new inspection step, improve operator training, adjust PM frequency.
Owner – One person accountable, not a committee.
Due date – Realistic but tight enough to maintain momentum.
Expected impact – Rough estimate of downtime reduction (e.g. "aim to cut these failures by 50%").
Follow-up check – When will you review whether the action worked?
Capture these in a simple table or tracker. This becomes your downtime reduction backlog.
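The tracker can be as simple as a CSV with one row per action. A minimal sketch – the actions, owners, and dates here are entirely hypothetical:

```python
import csv
import io

# A minimal action tracker - fields mirror the list above.
# All actions, owners, and dates are hypothetical examples.
actions = [
    {"action": "Redesign infeed chute", "owner": "J. Smith",
     "due": "2024-03-01", "expected_impact": "cut jams by 50%",
     "follow_up": "2024-04-01"},
    {"action": "Add weekly sensor inspection", "owner": "A. Lee",
     "due": "2024-02-15", "expected_impact": "halve sensor trips",
     "follow_up": "2024-03-15"},
]

# Write the backlog to CSV so it can live alongside the downtime export
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=actions[0].keys())
writer.writeheader()
writer.writerows(actions)
print(buf.getvalue())
```

A shared spreadsheet with the same columns works just as well; the point is one owner and one date per row.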
One-off analyses are better than nothing, but the real gains come from a cadence.
Set up a regular rhythm such as:
Weekly downtime review – 30–45 minutes
Monthly deep dive – 60–90 minutes
Keep the visuals simple: one Pareto chart, one trend over time, and the action tracker.
The measure of success is not the report itself; it is fewer unplanned stops and more output.
Pitfall: Exporting everything and building huge spreadsheets that nobody wants to touch.
Fix: Limit the scope. One line. One quarter. Unplanned downtime only. Build from there.
Pitfall: Operators and techs select "Other" for half of all causes, or pick the first item in the list.
Fix: Keep the cause code list short and clear, involve operators in designing it, and review the "Other" bucket regularly so you can add the codes people actually need.
Pitfall: Spending time on rare but dramatic failures instead of the everyday small losses that quietly erode output.
Fix: Let the Pareto chart decide. Go where the minutes are, not where the noise is.
Pitfall: Doing a big analysis once, implementing a couple of fixes, and then going back to business as usual.
Fix: Build downtime review into your weekly routine. Short, consistent reviews beat big, infrequent projects.
You can use this checklist as a quick guide every time you run downtime analysis.
Define scope – line, area, time window, downtime types.
Export data – asset, start/end times, duration, cause code, comments.
Clean data – remove errors, standardise names and timestamps.
Classify events – high-level categories and sub-categories.
Analyse – totals, event counts, Pareto charts, rough MTBF/MTTR.
Decide actions – owner, due date, expected impact, follow-up check.
Review – weekly short review, monthly deep dive.
Everything in this framework is achievable with Excel – but it is slow and fragile.
Most maintenance teams do not have spare hours every week to export, clean, classify, and chart downtime data by hand.
LeanReport was built to make this process almost automatic: upload your CMMS export and get cleaned data, key metrics, and Pareto-style charts without the spreadsheet work.
Instead of spending your week preparing reports, you can spend it reducing downtime.
If you want to see what this looks like with your own data, upload a sample CSV or visit our How It Works page to learn more. Ready to start? Check out our pricing and begin your free trial today.
The main goal of downtime analysis is to identify the small number of assets and failure modes that cause most of your lost production time, so you can focus your maintenance and improvement effort where it will have the biggest impact.
A good rhythm is weekly for a short review of the last week's performance and monthly for a deeper look at trends. The key is consistency – small, regular reviews beat big, infrequent projects.
No. You need data that is clean enough to be trusted, not perfect. Start with what you have, fix the biggest data quality issues, and improve your downtime coding as you go.
At minimum, you need a CMMS or logging system that records downtime events, a way to export to CSV or Excel, and a tool (Excel, BI tool, or LeanReport) to aggregate and chart the data. Dedicated tools like LeanReport can save significant time by automating the cleaning and analysis steps.
Keep the process simple and explain why it matters. Use a short, clear list of cause codes, involve operators in designing them, and feed back the results in weekly meetings so people can see the impact of the data they enter.
Pareto analysis is a technique based on the 80/20 principle – it reveals that a small number of causes (typically 20%) account for the majority of the impact (80%). In downtime analysis, Pareto charts help you identify the "vital few" assets or failure modes that deserve immediate attention, rather than spreading effort thinly across all problems.

Founder - LeanReport.io
Rhys is the founder of LeanReport.io with a unique background spanning marine engineering (10 years with the Royal New Zealand Navy), mechanical engineering in process and manufacturing in Auckland, New Zealand, and now software engineering as a full stack developer. He specializes in helping maintenance teams leverage AI and machine learning to transform their CMMS data into actionable insights.