Downtime, outages and failures: Understand their true costs
This content is brought to you by Evolven. Evolven Change Analytics is a unique AIOps solution that tracks and analyzes all actual changes made in the enterprise cloud environment. Evolven helps leading companies reduce the number of incidents, reduce problem resolution time and eliminate unauthorized changes.
When it comes to mission-critical applications or data center performance quality, companies are willing to invest heavily. Unfortunately, these investments do not always fully pay off.
Coping with system downtime
Despite the efforts that have been invested in infrastructure resilience, many IT organizations continue to deal with database, hardware and software downtime lasting from a few minutes to several days, which renders systems completely inoperable for the business and causes huge losses.
The world of IT outages can sometimes seem uncomfortable.
Despite the variety of advanced solutions and the growing amount of data being collected by major enterprise software vendors and IT departments - from ERP to CRM and more - outages remain a valid and terrible threat to the industry.
On the other hand, IT outages have somehow become an inherently accepted, even expected, part of business life.
IT downtime review
While IT professionals face downtime from time to time and then focus their efforts on overcoming it, the business organization as a whole suffers the impact of "financial pain" that is usually quite significant.
In the past, we've delved deeply into the many ways IT downtime can impact bottom line operations (you can read more about this here: Cost and scope of unplanned outages). In doing so, we considered different aspects, from direct sales losses and damage to reputation to indirect effects such as reduced productivity.
Now I want to review the problem and explore how organizations should address and assess threats to their IT operations, including systems, applications and data, by looking at solid (and established) benchmarks that represent potential costs of downtime and disruption.
Measuring the failures of big brands
When should the industry start measuring the financial impact of major brand failures like the one that recently occurred at Facebook, the one that affected hundreds of thousands of Lloyds Bank customers, or the Jetstar failure that caused hundreds of flight delays?
In other words, at what point is an outage "significant enough" that a cost analysis becomes valuable for the industry to learn from and predict the impact of future outage incidents?
Well, apparently at some point the disruption creates an impact that can't be ignored PR-wise. This is the point of no return, followed by estimates of the financial impact.
The cost of downtime varies significantly between industries. The size of the affected company is of course a critical but not the only important factor. The role of IT systems in the company is also crucial.
In order to give an IT outage a numeric value, its impact on multiple business and organizational aspects must be predefined so that the entire industry can learn and optimize accordingly.
A failure of a critical application can result in two different types of losses:
- Application service outage: The impact of downtime varies by application and organization;
- Data Loss: The potential loss of data due to a system failure can have significant legal and financial implications.
Well, I'm sure you'll agree that today's data centers should never go down; applications need to be available 24/7, and internal (let alone external) end users around the world need to be able to rely on data center availability (for critical data and application availability) at all times.
Well, reality bites. This is not the case in the back office (i.e. within the data center). No organization enjoys 100% uptime. Should you try to achieve 100%? Sure. However, you also need to develop a deep understanding of the impact of downtime and ways to minimize it.
The worst outage nightmare ever? Probably the one that happened to you...
Some past outages have turned into PR disasters, like the legendary Virgin Blue debacle of 2010 or the most recent one that hit Facebook.
Why? The massive impact probably had something to do with it.
As a reminder, Virgin Blue's outage prevented passengers from boarding flights for 11 days (!!), resulting in negative press, reputational damage and millions of dollars in losses.
More specifically, Virgin Blue's reservations management company, Navitaire, eventually compensated Virgin Blue with more than $20 million (Navitaire's booking error nets Virgin $20m in compo).
There are many other incidents that still attract media attention. Here's just one recent example: a USA Today article on the Wells Fargo power outage, which prevented customers from accessing their accounts for many hours.
I can safely say that anyone in IT would agree that failures or downtime are VERY bad for business. They are undesirable, very damaging financially and must be combated with all available means.
Misconfigurations are key
The IT Process Institute's Visible Operations Handbook has reported in the past that "80% of unplanned outages are due to poorly planned changes made by administrators ("operations staff") or developers" (visible operations).
Enterprise Management Associates reported that 60% of availability and performance failures are due to misconfigurations.
How much does it cost?
Downtime can cost organizations $5,600 per minute, which adds up to more than $300,000 per hour of web application downtime (according to a 2014 Gartner analysis).
Figure: The average hourly cost of enterprise server downtime worldwide, 2017-2018.
Application maintenance costs are increasing at 20% annually. But that can't solve all your problems. A previous industry survey found that at least a quarter of the downtime surveyed was caused by configuration errors. (How much will you spend on app downtime this year?).
How common is downtime or disruption?
Ok, so downtime can be a financial nightmare. That part is clear. However, if you want to properly assess the potential risk of disruption in your organization, the immediate question should be, "How likely is it to happen?"
Source: Data Center Knowledge
Ok, failures are too common to think "I probably won't have a major failure". Now the question arises as to how you can calculate your specific risk for your company.
Production and application downtime costs clarified
Unplanned outages depend on the IT department to fix them. However, as I mentioned earlier, these outages ultimately impact the entire organization.
An important part of a complete outage risk assessment process is estimating how much money you will lose per hour (or minute, or whatever time interval you choose) as a result of the outage.
For organizations that rely solely on the ability of data centers to provide IT and network services to customers, such as telecom service providers or e-commerce companies, downtime can be particularly costly, with the cost of a single event exceeding $1 million (more than $11,000 per minute) according to expert estimates.
In a USA Today survey of 200 data center managers, more than 80% said their downtime costs exceeded $50,000 per hour. More than 25% reported downtime costs of more than $500,000 per hour (!!).
According to another survey, while companies cannot achieve zero downtime, one in ten companies say their availability needs to be greater than 99.999%.
To gain a thorough understanding of the impact of production and release downtime, let's take a look at how the consequences of downtime manifest themselves.
Downtime costs: per year or per incident?
A 2017 study found that 46% of 400 IT decision makers experienced more than four hours of IT-related downtime over a 12-month period; 23% said they incur costs ranging from $12,000 to over $1 million per hour.
More than 35% admitted they are unsure of the cost of a business interruption.
If you ask Delta Airlines, which had to cancel 280 flights in 2017 because of a power outage, the losses from a single power outage can reach more than 150 million dollars.
A few years ago, Dun & Bradstreet reported that 59% of Fortune 500 companies experience at least 1.6 hours of downtime per week.
If you take an average Fortune 500 company (or a company with at least 10,000 employees) and assume that they pay IT team members an average of $56 an hour, then (assuming all of IT is busy resolving the downtime) the labor component of downtime costs alone for a company this size would reach $896,000 per week (10,000 x $56 x 1.6 hours), which works out to over $46 million per year (Assessing the financial impact of downtime).
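The labor estimate above is plain multiplication, and a minimal Python sketch makes the assumptions explicit. The function name and the 52-weeks-per-year factor are illustrative assumptions rather than part of the cited study; the inputs are the figures quoted above.

```python
def weekly_labor_cost(staff: int, hourly_rate: float, downtime_hours_per_week: float) -> float:
    """Labor cost of downtime per week, assuming all counted staff are tied up by it."""
    return staff * hourly_rate * downtime_hours_per_week


# Figures quoted in the text: 10,000 people, $56 per hour, 1.6 hours of downtime per week.
weekly = weekly_labor_cost(staff=10_000, hourly_rate=56.0, downtime_hours_per_week=1.6)
annual = weekly * 52  # assumption: the same downtime recurs in all 52 weeks

print(f"Weekly labor cost: ${weekly:,.0f}")
print(f"Annual labor cost: ${annual:,.0f}")
```

Changing the head count or hourly rate to match your own organization gives a first, rough labor-only figure; it leaves out lost revenue and reputational damage.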
Of course, the reality is more complicated, since you have to take into account many parameters, such as the time of the event (weekday or weekend? day or night?) and more. However, understanding the cost of downtime greatly assists in assessing your potential risk and the return on investment of tools that can help minimize the impact of downtime.
Has the industry been able to learn from the past and minimize collateral damage during an outage?
How have things changed since the past?
So we already know that there is still downtime and outages that the industry has yet to successfully eliminate. But how have costs changed over time? Are these incidents less harmful today?
In 2010, an investigation by Coleman Parkes found that IT downtime costs companies a total of more than 127 million hours per year in employee productivity, an average of 545 hours per company.
In 2009, the average cost of downtime varied significantly by industry, from about $90,000 per hour in the media sector to about $6.48 million per hour for large online brokerages (How to quantify downtime).
According to a survey of IT managers conducted during these years, companies are becoming increasingly aware of the direct financial costs of computer failures. The survey found that one in five companies is losing $12,000 an hour due to system outages (How to quantify downtime).
As mentioned above, a subsequent analysis by Gartner in 2014 found average costs of $5,600 per minute and more than $300,000 per hour.
As early as 2004, a conservative estimate by Gartner put the hourly cost of computer network downtime at $42,000. As a result, a company that suffers downtime longer than the average 175 hours per year can lose more than $7 million per year. However, the cost of each disruption affects every business differently, so it's important to know how to calculate the exact financial impact (How to quantify downtime).
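The per-minute, per-hour and per-year figures quoted in this section are related by simple unit conversions. The following Python sketch (function names are illustrative assumptions) reproduces two of them: the 2014 Gartner per-minute figure converted to an hourly rate, and the 2004 hourly estimate scaled to 175 hours of downtime per year.

```python
MINUTES_PER_HOUR = 60


def cost_per_hour(cost_per_minute: int) -> int:
    """Convert a per-minute downtime cost to a per-hour cost."""
    return cost_per_minute * MINUTES_PER_HOUR


def annual_cost(hourly_cost: int, downtime_hours_per_year: int) -> int:
    """Scale an hourly downtime cost to a yearly total."""
    return hourly_cost * downtime_hours_per_year


print(cost_per_hour(5_600))        # 336000, i.e. "more than $300,000 per hour"
print(annual_cost(42_000, 175))    # 7350000, i.e. "more than $7 million per year"
```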
It makes sense to think that the cost of disruption will only increase over time (since we all rely more on data systems these days). Here's how to understand why past data can be multiplied by a significant number to reflect current reality...
Every minute counts
More than a decade ago, the average cost of data center downtime across all industries was estimated at approximately $5,600 per minute (Unplanned IT outages cost more than $5,000 per minute), a number that, according to Gartner, remained the same until 2014. The previous Ponemon Institute study referenced above calculated the minimum, median, mean, and maximum cost per minute of unplanned outages based on information from 41 data centers. The largest cost of an unplanned outage was found to exceed $11,000 per minute.
On average, the cost of an unplanned outage is likely to be over $5,000 per minute.
It only makes more sense
A 2013 study saw an increase of more than 41% over the previous averages described above, to an average cost of more than $7,900 per minute.
A 2015 ITIC survey clearly showed that the cost per hour (compared to 2008 data) increased by 25-30%.
Impact of downtime per year
A previous Gartner analysis estimated that downtime can average up to 87 hours per year. Obviously this is the sum of many interruptions, from a few minutes to several hours (The average large enterprise experiences 87 hours of network downtime per year).
How have things changed?
A 2011 study revealed that while the industry has been successful in combating the downtime epidemic and reducing its incidence, we are still seeing significant downtime and huge revenue losses. One outage reportedly resulted in over 3 million users (apparently WhatsApp users) migrating to Telegram.
The impact on reputation and loyalty
How much is your company's reputation worth? This can be extremely difficult to assess, as can the long-term impact of a damaged reputation and its impact on sales and profitability.
In this case, the cost of failure includes lost customers (both short- and long-term) and other tangible items that reflect the cost of reputational damage, such as stock price declines, marketing time (crisis management and brand recovery) and the media budget required to rebuild and polish the organization's profile.
What parameters should affect its calculation?
When attempting to estimate the cost of downtime, there are obvious direct costs (e.g., lost business during downtime). However, many indirect costs must also be calculated, such as employee overhead or the reputation problems mentioned above.
Personnel costs derive from the cost of “war room” firefighting tasks focused on getting IT systems up and running again, the cost of falling behind on all other scheduled tasks, the cost of staff overtime (if applicable) and more. Add to this the value of data loss, emergency maintenance fees (especially if the outage occurs outside of business hours), and additional repair costs that can persist long after service is restored.
It goes without saying that you should consider these costs when calculating the impact of downtime, as they are often very high. But even a rough estimate can be extremely helpful in understanding the risks and deciding what level of technology to rely on to combat them.
There's also the impact of lost sales. To get an accurate estimate of total lost sales, the impact percentage needs to be increased to reflect the true lifetime value of customers who permanently switch to a competitor. Take, for example, the Facebook (and WhatsApp) outage mentioned above (Unconscious Costs: Denying the true cost of network downtime): how much revenue is lost if these users generate fewer billable ad impressions?
Stocks fell 25%
Although it is difficult to quantify so many parameters, they are real and significant. For example, when Amazon.com went offline for several hours in its early days, its stock fell 25% in a single day (Unconscious Costs: Denying the true cost of network downtime)!
In this Amazon cloud outage, for example, the company struggled to bring its cloud services back online. As a result, many customers questioned the reliability of its cloud and Amazon's communications surrounding the outage. Other customers felt they should be compensated for downtime as part of their SLA.
I know you're curious: In terms of SLA, Amazon's EC2 SLA was not breached despite the nearly four-day outage (Seven lessons from the Amazon outage).
The cost of downtime: Calculate it yourself
How much will you lose from unexpected server or business application downtime?
According to multiple sources, the easiest way to calculate potential lost revenue during an outage is to use this equation:
|LOST REVENUE||=||(GR/TH) x I x H|
|GR||=||annual gross revenue|
|TH||=||total annual business hours|
|I||=||percentage impact|
|H||=||number of hours of downtime|
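The equation above translates directly into a few lines of Python. This is a minimal sketch, not a definitive implementation: the function and parameter names are illustrative, I is treated as the fraction of revenue affected during the outage (an assumption about how the percentage impact is expressed), and the sample figures are hypothetical.

```python
def lost_revenue(gross_annual_revenue: float,
                 total_annual_hours: float,
                 impact_fraction: float,
                 outage_hours: float) -> float:
    """Lost revenue = (GR / TH) x I x H."""
    if total_annual_hours <= 0:
        raise ValueError("total_annual_hours must be positive")
    if not 0.0 <= impact_fraction <= 1.0:
        raise ValueError("impact_fraction must be between 0 and 1")
    hourly_revenue = gross_annual_revenue / total_annual_hours
    return hourly_revenue * impact_fraction * outage_hours


# Hypothetical company: $50M annual revenue, 2,000 business hours per year,
# half of revenue affected, 4-hour outage.
loss = lost_revenue(50_000_000, 2_000, 0.5, 4)
print(f"Estimated lost revenue: ${loss:,.0f}")
```

Keeping the impact as an explicit parameter makes it easy to rerun the estimate with a higher value that reflects customers who leave for good.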
How to minimize the risk of disruptions and downtime?
Downtime and failures are catastrophic, but they don't have to be overly shocking. By using solutions that focus on getting to the root of the problem, failures can be prevented before they happen.
Evolven Change Analytics has developed a unique AIOps solution that focuses on changes that are the root cause of performance issues. Evolven helps enterprise IT and cloud operations teams prevent and remediate incidents before they start.
Contact us to see how we are helping leading companies drastically reduce the number of incidents and MTTR.
Quick downtime calculator
To get a quick estimate of your company's probable downtime costs, use the following formula, based on the size of your business and the number of minutes your most recent incident lasted: Downtime cost = minutes of downtime x cost-per-minute. For small business, use $427 as cost-per-minute.
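As a rough illustration of the quick formula above, here is a small Python sketch. The function name is an assumption, the default rate is the $427 small-business figure quoted in the text, and the 90-minute incident is a hypothetical example.

```python
SMALL_BUSINESS_COST_PER_MINUTE = 427  # figure quoted in the text


def quick_downtime_cost(minutes_of_downtime: int,
                        cost_per_minute: int = SMALL_BUSINESS_COST_PER_MINUTE) -> int:
    """Downtime cost = minutes of downtime x cost per minute."""
    if minutes_of_downtime < 0:
        raise ValueError("minutes_of_downtime cannot be negative")
    return minutes_of_downtime * cost_per_minute


print(quick_downtime_cost(90))  # 38430 for a hypothetical 90-minute incident
```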
Downtime cost is defined as any profit that a company loses when its equipment or network stops functioning. The cost of downtime implies not only direct financial loss but can have an impact on your company in at least the other 4 ways.
What are some costs of downtime?
Relatively small businesses' cost of downtime falls into a range of $137 to $427 per minute, whereas for larger businesses, the downtime can cost over $16,000 per minute ($1 million per hour) for just a short outage.
How much does an hour of downtime cost?
How Much Does Downtime Cost a Company? The average cost of downtime is significant. Each minute costs an average of $9,000, according to the Ponemon Institute, bringing the downtime cost per hour to over $500,000.
What is true downtime cost analysis?
TDC is a methodology of analyzing all cost factors associated with downtime, and using this information for cost justification and day to day management decisions. Most likely, this data is already being collected in your facility, and need only be consolidated and organized according to the TDC guidelines.
What are the three types of downtime?
Common categories of downtime include excessive tool changeover, excessive job changeover, lack of operator, and unplanned machine maintenance.
What is the difference between downtime and outage?
Downtime occurs when a system can't complete its primary function. It can be broken up into two types: IT outages and brownouts. IT brownouts occur when a system is slowed or partially available. This might mean customers can access your site, but pages load slowly or dynamic features like "add to cart" don't function.
What are some examples of downtime?
Downtime has many causes, including shutdowns for maintenance (known as scheduled downtime), human errors, software or hardware malfunctions, and environmental disasters such as power outages, fires, flooding or major temperature changes.
What are the main causes of downtime?
This can be due to several reasons including hardware or software failure, human error, malicious attacks or natural disasters. Since unplanned downtime is unexpected and occurs without a warning, preventing it can be a challenge.
How do you explain downtime?
a time during a regular working period when an employee is not actively productive. an interval during which a machine is not productive, as during repair, malfunction, maintenance.
- Track Downtime. Before jumping into the steps of reducing downtime, it is critical to track it. ...
- Monitor Production. Having a system to monitor production can also help reduce downtime. ...
- Create a Preventative Maintenance Schedule. ...
- Provide Operator Decision Support. ...
- Perform DMAIC Analysis.
Downtime falls into two categories: planned and unplanned. Planned downtime is notable because it offers advanced warning and gives users a chance to prepare. Planned downtime is usually done for upgrades or maintenance to the network infrastructure.
What is average downtime?
Average downtime is usually built into the price of goods produced to recover its costs through the sales revenue. Opposite of "uptime." Also called "waiting time."
What is the industry standard for downtime?
World Class Standards For Downtime
Aim for unscheduled downtime to be 10% or less.
True Cost – From Costs to Benefits in Food and Farming
True Cost Accounting (TCA) is a new way of identifying the real costs of a specific product or service. TCA calculates not only the direct costs like raw materials and labour, but also the effects on the natural and social environment in which a company operates.
Calculating Downtime Cost
The duration of the downtime and the cost incurred per minute you're offline are the two variables that most affect the financial impact of an outage.
1. Divide your total revenue by the planned operating time to get your daily revenue.
2. Assess by how much your daily revenue goes down if the chosen piece of equipment stops working for 1 hour.
What is downtime also called as?
DOWNTIME stands for Defect, Overproduction, Waiting, Non-Utilized Talent, Transportation, Inventory, Motion, and Extra Processing.
What is Level 3 downtime?
Downtime Level 3 - Operations are defined as localized, scheduled or unscheduled problem involving the loss of multiple functions, applications, or systems, not anticipated to exceed 24 hours of unavailability. For a level 3 the problem can be resolved using all available resources.
What is a major outage?
Major Outage means any Power Outage that lasts for at least ten (10) consecutive minutes and/or any Temperature Irregularity, in each case causing inoperability of Customer's Equipment.
A period when a service or an application is not available or when equipment is not operational.
What are the consequences of downtime?
Consequences of unplanned downtime
Lost productivity and revenue: Every minute of downtime can result in lost productivity and revenue, affecting a business's bottom line. Decreased customer satisfaction: Unplanned downtime can lead to delayed deliveries, canceled orders, and frustrated customers.
What is downtime at work? It is a period during which an equipment or machine is not functional or cannot work. It may be due to technical failure, machine adjustment, maintenance, or non-availability of inputs such as materials, labor, power.
How do you use downtime wisely?
- Upgrade Nonwork-Related Skills. Work on developing or upgrading skills that aren't directly related to your job or business. ...
- Read Biographies. ...
- Join Activities Outside Your Network. ...
- Be A Curious Customer. ...
- Give Back To Your Community. ...
- Practice Mindfulness And Meditation. ...
- Improve Your Physical Health. ...
- Plan Out Your Week.
Planned downtime is scheduled time when production equipment is limited or shut down to allow for planned maintenance, repairs, upgrades or testing.
What is managing downtime?
Downtime management enables you to exclude periods of time from being calculated for events, alerts, or views that can skew CI data. To access. Administration > Service Health > Downtime Management. Alternatively, click Downtime Management.
What are the benefits of downtime?
Downtime gives us time and space to enjoy our personal lives and get personal tasks done. It grants us time with family, friends, and our hobbies. On a brain level, it allows us to reach homeostasis and is a necessary break from the aroused state, Dr. Hanson says.
What is healthy downtime?
A little downtime is important for your brain health. Research has found that taking breaks can improve your mood, boost your performance and increase your ability to concentrate and pay attention. When you don't give your mind a chance to pause and refresh, it doesn't work as efficiently.
What is 5 nines availability downtime?
Availability is normally expressed in 9's. For example, “5 nines uptime” means that a system is fully operational 99.999% of the time, an average of less than 6 minutes downtime per year. The chart shows what impact various availability levels have on your server downtime.
What are downtime metrics?
The most well-known downtime metric is Mean Time to Repair (MTTR). The MTTR metric reflects the average time it takes to troubleshoot and repair a failed piece of equipment.
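Both the "5 nines" downtime budget and MTTR mentioned above come down to short calculations. The Python sketch below is illustrative only: the function names are assumptions, a 365-day year is assumed, and the repair durations are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def mean_time_to_repair(repair_minutes: list[float]) -> float:
    """MTTR: the average time taken to repair across a set of incidents."""
    if not repair_minutes:
        raise ValueError("at least one incident is required")
    return sum(repair_minutes) / len(repair_minutes)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% availability -> {downtime_budget_minutes(target):.2f} minutes per year")

# Hypothetical repair durations, in minutes, for three incidents.
print(f"MTTR: {mean_time_to_repair([30, 45, 75])} minutes")
```

For 99.999% the budget comes out at a little over five minutes per year, consistent with the "less than 6 minutes" figure quoted above.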
Definition(s): The amount of time mission/business process can be disrupted without causing significant harm to the organization's mission.
How much does downtime cost the auto industry?
For example, in the auto industry, downtime can cost up to $50,000 per minute. That's $3 million per hour. The true downtime cost includes a variety of wasted business support costs and lost business opportunity costs because resources were needed to resolve a downtime incident that probably didn't need to happen.
Is database downtime costly?
Database outages can have a significant impact on top line revenue. In fact, according to a survey conducted by ITIC, 98% of organizations say a single hour of downtime costs over $100,000, while 81% report that it costs over $300,000. And that's just for a single hour!
What is the average cost of downtime in a data center?
According to Gartner, downtime costs $5,600 per minute on average. This results in average costs between $140,000 and $540,000 per hour depending on the organization. Some factors that contribute to the costs associated with downtime include: Lost sales.
Is the auto shortage getting better?
The Auto Chip Shortage Remains, But It May Be Improving
However, if Fiorani's estimate holds true, it would mark a significant improvement for the industry. More than 10.5 million vehicles were cut from production in 2021, according to Auto News.
The worldwide semiconductor shortage that began in 2021 has continued to be one of the biggest stories in the automotive industry. Automakers have faced slashed production schedules and staggering revenue losses since the shortage of computer chips began.
How much will the chip shortage cost automakers?
The chip shortage will cost the global auto industry in 2021 $210 billion in revenues and lost production of 7.7 million vehicles, consultant AlixPartners estimated in September.
How does downtime affect a business?
Repeated downtime events can result in unhappy customers, which can quickly translate into bad customer reviews and tarnished brand image. Data Loss: Downtime affects not only your business but your clients as well. Downtime due to cyberattacks, server or network outage can result in corrupt, damaged or stolen data.
How does downtime affect production?
All manufacturing downtime reduces overall output by stopping production. Unplanned downtime can cost 15 times more than planned downtime. The loss of revenue during any type of asset maintenance can be as high as $3 million per incident.
How do you measure data downtime?
- Labor Cost: ([Number of Engineers] X [Annual Salary of Engineer]) X 30%
- Compliance Risk: [4% of Your Revenue in 2019]
- Opportunity Cost: [Revenue you could have generated if you moved faster, releasing X new products, and acquired Y new customers]
- = $ Annual Cost of Data Downtime.
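The three components listed above can be added up in a short Python sketch. The function and parameter names are illustrative assumptions, the 30% and 4% shares are the ones given in the list, and the sample inputs are hypothetical; the opportunity cost has to be estimated separately and is simply passed in.

```python
def annual_data_downtime_cost(num_engineers: int,
                              annual_salary: float,
                              annual_revenue: float,
                              opportunity_cost: float,
                              labor_share: float = 0.30,
                              compliance_share: float = 0.04) -> float:
    """Sum of labor cost, compliance risk and opportunity cost for data downtime."""
    labor_cost = num_engineers * annual_salary * labor_share
    compliance_risk = annual_revenue * compliance_share
    return labor_cost + compliance_risk + opportunity_cost


# Hypothetical inputs: 10 engineers at $100,000, $5M revenue, $250,000 opportunity cost.
total = annual_data_downtime_cost(10, 100_000, 5_000_000, 250_000)
print(f"Annual cost of data downtime: ${total:,.0f}")
```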
In other words, data downtime is the periods of time when data quality is bad or the data is unavailable. You can't do anything without data or even with bad data. For example, let's say you're forecasting stocks using Twitter as your data source. If Twitter is down, you won't have any data to use for forecasting.
How is maintenance downtime calculated?
1. Divide your total revenue by the planned operating time to get your daily revenue.
2. Assess by how much your daily revenue goes down if the chosen piece of equipment stops working for 1 hour.