1,700 DevOps Monitoring Experts Agree: Too Many Alerts from Too Many Tools Put Customers at Risk
May 18, 2016

Dan Turchin
BigPanda

We're all technology companies. Every second of downtime hurts. Monitoring at scale is hard. And that's just the beginning of what you shared in our recent survey.

We invited you to tell us about the state of monitoring. Tales of woe and glory from more than 1,700 ops experts provided the most articulate, profound, comprehensive summary of IT Ops life ever assembled.

We thought you'd all benefit from what you shared, so we published the results. You represent five continents, large and small companies (modal reply: more than 10,000 employees), large and small teams (modal reply: fewer than 10 members), and both traditional IT and DevOps organizations.

Here's what fascinated me...

You rely on many tools to monitor your infrastructure.

■ Each team member is responsible for triaging between 10 and 50 alerts per day.

■ In an eight-hour shift, that means each of you is working about 10 issues simultaneously, assuming you don't inherit orphans from previous shifts (which you do!).

■ Translation: there are fire-swallowing, tightrope-walking lion tamers working the E. coli route for Carnival Cruise Line with easier jobs than yours.

The more you've invested in agility and velocity, the more effective you are at reducing downtime.

■ Self-described "DevOps" organizations are more than twice as likely to deploy code and/or infrastructure changes at least a few times per day (31% for DevOps orgs vs. 15% overall).

■ They're also more than twice as likely to have cloud-based infrastructure (32% of DevOps orgs vs. 13% overall).
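
For anyone who wants to check the "more than twice as likely" arithmetic, here is a minimal sketch using only the four percentages quoted above. The dictionary keys and the function name are illustrative labels, not anything from the survey itself.

```python
# Survey percentages quoted in the two bullets above:
# (DevOps orgs, overall sample). The labels are illustrative.
SURVEY = {
    "deploys at least a few times per day": (31, 15),
    "cloud-based infrastructure": (32, 13),
}


def likelihood_ratio(devops_pct, overall_pct):
    """Return how many times more likely DevOps orgs are than the overall sample."""
    return devops_pct / overall_pct


for label, (devops_pct, overall_pct) in SURVEY.items():
    ratio = likelihood_ratio(devops_pct, overall_pct)
    print(f"{label}: {ratio:.2f}x")
```

Both ratios come out above 2. Note that the overall figure presumably includes the DevOps respondents too, so the gap against non-DevOps organizations alone would likely be wider.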

You're dissatisfied with the current reliability of your monitoring and incident management process.

■ Nearly 80% of you say the most challenging part of your job is suppressing alert noise.

■ The problem's not going away: more than 55% of you are dissatisfied with your current monitoring strategy. Your comments also indicate the problem won't improve in the next 12 months without a better way to manage the growing workload.

It's a bleak picture, perhaps best summarized by Carlos from a Midwestern credit union, who says that if he could change one thing about his organization's current monitoring strategy, it would be "to focus on the only thing that matters: reducing noise." Carlos, you're right. Human beings alone can't fix a problem created by machines. We've been in this position before … before there was client-server, TCP, DNS, virtualization, cloud.

We've approached each challenge with the same tenacity, the same passion, the same commitment to solving problems with technology. We'll do it again. This time, with better automation and collaboration. Soon, machines and people will speak a common language. And when they do, we'll be the first to share how great technology plus your ingenuity makes life better for everyone.

Dan Turchin is VP Product at BigPanda.
