New Relic’s Ambitious Plan to Apply AI and ML to Incident Response

16 Apr 2020 1:37pm, by

Application performance management company New Relic has begun to apply machine learning (ML) and artificial intelligence (AI) to automate incident response, management and remediation. If successful, the new features could mitigate a major source of lost IT productivity for organizations that manage very different environments, including multicloud and on-premises infrastructures.

New Relic AI offers a wide sweep of AIOps capabilities to help reduce “noise” and other distractions when managing workflows. The idea is to solve a common pain point: having to devote IT resources to responding to an often overwhelming number of telemetry alerts, many of which turn out to be false positives. An automated system could thus potentially address a number of these issues by filling in for IT workers who would otherwise have to provide manual fixes.

DevOps and site reliability engineering (SRE) teams have told New Relic that there is a great need to more effectively monitor and interpret large amounts of operational data that can be spread across various multicloud and on-premises networks, Michael Olson, director of product marketing at New Relic, said.

“It’s very difficult to be able to quickly detect, diagnose and ultimately resolve production incidents. On-call teams have shared with us that they have challenges finding signals from the flood of alerts they’re receiving that create noise, as well as prioritizing and determining how to take action upon incidents. This way, they can solve problems before they impact customer experience or service level objectives to the business. And so what we believe is that as this complexity of software systems grows, DevOps and SRE teams really need faster and easier ways to resolve incidents.”

Olson said New Relic AI’s AIOps capabilities include:

  • Proactive anomaly detection.
  • Incident Intelligence to reduce alert noise by correlating related incidents and providing them with context and guidance, to help on-call teams diagnose and respond faster.
  • Integration into customers’ existing incident management workflows.
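
To make the idea of reducing alert noise by correlating related incidents concrete, here is a minimal sketch in Python. It is not New Relic's implementation: the `Alert` and `Incident` types, the `correlate_alerts` function and the five-minute window are all hypothetical, and the only correlation rule assumed is "same service, close together in time."

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Alert:
    timestamp: float  # seconds since an arbitrary start time (hypothetical)
    service: str      # name of the service that raised the alert
    message: str


@dataclass
class Incident:
    service: str
    alerts: List[Alert] = field(default_factory=list)


def correlate_alerts(alerts: List[Alert], window_seconds: float = 300.0) -> List[Incident]:
    """Group alerts into incidents.

    Assumed rule (illustrative only): an alert joins the most recent
    incident for its service if it arrives within `window_seconds` of
    that incident's latest alert; otherwise it opens a new incident.
    """
    incidents: List[Incident] = []
    latest_by_service: Dict[str, Incident] = {}

    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = latest_by_service.get(alert.service)
        if current is not None and alert.timestamp - current.alerts[-1].timestamp <= window_seconds:
            current.alerts.append(alert)
        else:
            current = Incident(service=alert.service, alerts=[alert])
            incidents.append(current)
            latest_by_service[alert.service] = current

    return incidents


if __name__ == "__main__":
    raw_alerts = [
        Alert(0, "checkout", "HTTP 5xx rate high"),
        Alert(60, "checkout", "latency above threshold"),
        Alert(90, "search", "CPU saturation"),
        Alert(1000, "checkout", "HTTP 5xx rate high"),
    ]
    for incident in correlate_alerts(raw_alerts):
        print(incident.service, len(incident.alerts))
```

In this toy example, four raw alerts collapse into three incidents, so an on-call engineer would be paged about fewer, richer items. A production system would presumably use far more signals than service name and timing.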

Meanwhile, it will be the end results that finally count. “The key here is to drill into actual ML/AI improvements, what metrics they use to measure their effects and how many clients have already benefited from this new AI/ML,” Torsten Volk, an analyst for Enterprise Management Associates (EMA), said. “Differentiation is a big challenge in this regard, as nobody has actually truly quantified the value of their AI magic.”

The ML and AI aspects of New Relic’s platform are also one component that can help organizations’ DevOps teams free up resources to keep up with the often brutal cadence of releasing code updates and applications at an ever-faster pace.

“As enterprises are forced to release software more often without sacrificing quality or adding cost on the side of developers and operators, they need to give up on the traditional siloed approach toward monitoring where infrastructure, applications, user experience and business services are typically monitored and managed through separate platforms,” Volk said. “Therefore, engineering and DevOps executives need to look at platforms such as New Relic One, Dynatrace, DataDog and Splunk/SignalFX to figure out how they can create reinforcement learning loops across the entire enterprise to be able to leverage advanced analytics, machine learning and deep learning to issue earlier and more accurate alerts, automate root cause analysis and ultimately take action in an autonomous manner.”

New Relic is certainly focused on “all of the right release themes, such as integration, alert noise and proactivity, but now the company needs to demonstrate the value of its own AIOps solution,” Volk said.

The end goal is for New Relic AI to complement New Relic’s APM infrastructure and trace capabilities, Guy Fighel, group vice president and product general manager, said. “The benefits of a true observability platform that customers can use right out of the box for incident response detection and remediation is something that New Relic has been offering for quite some time,” Fighel said. “This includes gathering event data and clearing it with other types of incidents and alerts that are firing up either from our platform, or from other systems.”

New Relic is a sponsor of The New Stack.
