In the last year, we’ve seen a dramatic shift in expectations regarding how to measure the success of IT monitoring. As more people believe monitoring is critical to a company’s success, the ability to quickly remediate the cause of alerts has become less critical. IT decision-making about monitoring and agility is bound to be affected.
For the last few years, pundits have been telling us that IT monitoring is no longer just about uptime. As we previously reported, there are two new expectations for monitoring: 1) tools should help improve performance (both IT and business) and 2) data from multiple systems should be integrated to provide a holistic picture.
Correlation between monitoring satisfaction and MTTR has decreased dramatically.
Monitoring software company BigPanda recently published its second annual State of Monitoring report, which provides data and a few answers. It is based on responses from over 1,500 IT pros. When comparing the 2017 and 2016 reports, we found that many things have not changed. The top IT concerns are about security and downtime. The top IT monitoring challenges are quick remediation of service disruptions, getting money to buy monitoring tools and reducing the number of unimportant alerts being generated. Interestingly, even the top key performance indicators (KPIs) are the same, with customer satisfaction cited by 73 percent, followed by service level agreement (SLA) compliance, incident volume and mean time to repair (MTTR).
What has changed in the survey findings is what makes people satisfied with their monitoring strategies and how they define agility.
More people believe a strategic monitoring process is important to their organization, going from 80 percent in 2016 to 85 percent in 2017. Yet, the percentage that are “very” satisfied with their process remains relatively unchanged at 13 percent. If “somewhat” satisfied responses are included, 54 percent are happy.
The report asserts that “there is a clear correlation between monitoring strategy satisfaction and ability to remediate.” We can’t deny that is true, but the correlation has weakened dramatically since last year. Like last year, only about three in ten companies that couldn’t resolve a majority of alerts within 24 hours are satisfied with their organization’s strategy.
Obviously, if you can’t adequately address problems, then there is something wrong with your approach to IT operations. However, dissatisfaction among those that are resolving their alerts within a day has grown. The percentage satisfied with their monitoring strategy dropped from 48 percent in 2016 to 30 percent in 2017 among those that resolved 50-75 percent of their alerts within 24 hours. Among those that can resolve more than 75 percent of alerts within a day, the percentage satisfied declined from 63 percent to 51 percent.
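To put those declines in perspective, here is a minimal sketch of the arithmetic, using only the four percentages quoted above. The function names and cohort labels are our own illustrative choices and do not come from the BigPanda report. It separates the drop in percentage points from the relative decline, which are easy to confuse.

```python
# Satisfaction figures quoted in the article: (2016 percent, 2017 percent).
# The cohort labels are illustrative names, not terms from the report.
SATISFACTION = {
    "resolved 50-75% of alerts within 24 hours": (48, 30),
    "resolved more than 75% of alerts within 24 hours": (63, 51),
}


def point_drop(before, after):
    """Decline measured in percentage points."""
    return before - after


def relative_drop(before, after):
    """Decline as a percentage of the starting value."""
    return (before - after) / before * 100


for cohort, (pct_2016, pct_2017) in SATISFACTION.items():
    points = point_drop(pct_2016, pct_2017)
    relative = relative_drop(pct_2016, pct_2017)
    print(f"{cohort}: down {points} points ({relative:.1f}% relative decline)")
```

By this calculation, the 50-75 percent group fell 18 points and the top group fell 12 points, so the sharper loss of satisfaction was in the middle cohort.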
So, although monitoring is more strategic than ever, the simple ability to address service disruptions does not mean the monitoring process is successful. Perhaps this is because the stated goal of being agile is now becoming harder to achieve. According to the survey, the number of alerts continues to increase. While reducing alert noise is a challenge, the increase is due to a non-trivial phenomenon. The pace of software development has increased. Code deployments are more likely to happen on a daily or weekly basis. Infrastructure changes are less likely to be a rare event.
Although fixing problems quickly is less strategic than it used to be, the number of events needing monitoring has increased. Consequently, many IT pros no longer believe their organization is agile. Among respondents who say their organization has a strategic monitoring process, the share who believe they are agile declined from 89 percent in 2016 to 65 percent in 2017. It is likely that with the increasing speed of development, the expectations for agility have increased. Not only do the developers need to be agile, but so do the IT ops teams as well as business units.
Many, many monitoring vendors say they can make you more agile. They claim that innovations using machine learning and artificial intelligence can reduce alert noise and speed up issue resolution. Integrating analytics into dashboards can make executives more agile.
These are worthy endeavors. As IT pros move beyond custom-rigged monitoring solutions, these products are being evaluated and bought. Whether their adoption becomes widespread depends on whether the products actually make organizations more agile. Looking at the large list of sponsors for the Monitorama open source monitoring software conference, to be held this May in Portland, we look forward to hearing their opinions on this subject.
Feature image by Joaquim Alves Gaspar via Wikipedia.