
New Relic, SignifAI and the Shifting View of Monitoring Technologies

12 Feb 2019 9:21am, by Alex Williams

New Relic’s recent acquisition of SignifAI shows how the system monitoring landscape is changing. Technology tool providers are combining machine intelligence and other tools into a single integrated service that can better inform DevOps technologists and site reliability engineers (SREs) about the state of their systems.

“We anticipate a declining demand for stand-alone event analytics and incident management tools as monitoring providers expand horizontally, collecting a growing portion of operations data within a single tool,” writes Nancy Gohring, an analyst with 451 Research, in a report about the acquisition. “With that data, monitoring specialists can run analytics that deliver root-cause analysis and other benefits that the stand-alone event analytics tools offer.”

The SignifAI acquisition also points to the emergence of “AIOps,” or the use of machine intelligence to better monitor and troubleshoot complex IT systems. Machine learning (ML) can detect correlations across distributed systems, ushering in observability and, soon, machine intelligence and the greater realm of self-learning systems.

SignifAI is a superset of technologies: it integrates with Prometheus, OpenShift and about 60 monitoring tools. It offers:

  • Chewie, an API to manage monitoring platforms with the stated intent of reducing noise in a team’s incident management report.
  • Integration with Prometheus and OpenShift for monitoring visibility and correlation of alerts and metrics to relevant logs and events.
  • SignifAI Decisions — a correlation engine for SRE and DevOps teams that uses correlations to provide insights into production systems.

SignifAI analyzes both event information and metrics. The correlations become more finely tuned over time as teams continually feed more feedback into the system. Companies that build on their data can reduce the hours spent on time-consuming tasks, such as combing through the event information they collect and act on to discover root causes.
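To make the idea of correlating events concrete, here is a minimal sketch of one simple approach: grouping alerts that concern the same service and arrive close together in time, so a team sees one incident instead of many notifications. This is not SignifAI's algorithm, which the article does not describe. The `Alert` fields, the `correlate_alerts` function and the five-minute window are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Alert:
    # All field names are hypothetical; real tools expose different schemas.
    source: str       # tool that raised the alert, e.g. "prometheus"
    service: str      # service the alert refers to
    timestamp: float  # seconds since some reference point
    message: str


def correlate_alerts(alerts: List[Alert], window_seconds: float = 300.0) -> List[List[Alert]]:
    """Group alerts for the same service when each one arrives within
    window_seconds of the previous alert in that service's open group."""
    groups: List[List[Alert]] = []
    open_group_by_service: Dict[str, List[Alert]] = {}

    for alert in sorted(alerts, key=lambda a: a.timestamp):
        group = open_group_by_service.get(alert.service)
        if group is not None and alert.timestamp - group[-1].timestamp <= window_seconds:
            # Close enough in time to the last alert for this service: same incident.
            group.append(alert)
        else:
            # First alert for this service, or too late to join the old group.
            group = [alert]
            groups.append(group)
            open_group_by_service[alert.service] = group
    return groups


if __name__ == "__main__":
    sample = [
        Alert("prometheus", "checkout", 0.0, "error rate high"),
        Alert("logs", "checkout", 120.0, "timeouts in log stream"),
        Alert("prometheus", "search", 130.0, "latency high"),
        Alert("prometheus", "checkout", 1000.0, "error rate high"),
    ]
    for group in correlate_alerts(sample):
        print(group[0].service, [a.message for a in group])
```

A production correlation engine would likely weigh many more signals, such as topology, deploy events and user feedback. The time-and-service rule here only shows how several raw alerts can collapse into fewer incidents.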

It can take a full day for teams from multiple parts of the enterprise to determine the root cause of a spike in error rates. In the end, after combing through all the data, it may turn out to be a single issue and not a broader problem affecting customers. Go through that exercise time and again, and alert fatigue becomes a real problem while opportunities get delayed or not pursued at all.

New Relic now ingests 2.1 billion data points per minute for all its products, said Aaron Johnson, senior vice president of product management at New Relic. Tools like New Relic APM use distributed tracing to find issues across distributed architectures. SignifAI helps customers sift through the data that monitoring tools such as New Relic are creating.

Distributed Tracing

In an interview, Johnson said distributed tracing is a front-and-center issue and explained how it relates to developing trends.

Observability is a huge trend for New Relic, Johnson said. Distributed tracing, metrics, logs and events are changing how people think about monitoring. Observability is now of utmost importance when engineers build their applications. It helps with the finger pointing between teams that can often happen when there is no clear view of the source of the problem.
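As a rough illustration of how a trace can point to the source of a problem, the sketch below walks a set of spans from a single request and returns the errored spans that have no errored children, which are the most likely origin of a failure that then bubbled up to callers. It is a simplified sketch under stated assumptions, not New Relic's implementation. The `Span` fields and the `deepest_error_spans` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    # Hypothetical, simplified span record; real tracing formats carry
    # timing, attributes and trace IDs as well.
    span_id: str
    parent_id: Optional[str]  # None for the root span of the trace
    service: str
    error: bool


def deepest_error_spans(spans: List[Span]) -> List[Span]:
    """Return errored spans that have no errored child span.

    An error usually propagates from the failing service up through every
    caller, so the errored span with no errored children is a reasonable
    first place to look.
    """
    parents_of_errored = {
        s.parent_id for s in spans if s.error and s.parent_id is not None
    }
    return [s for s in spans if s.error and s.span_id not in parents_of_errored]


if __name__ == "__main__":
    trace = [
        Span("1", None, "frontend", True),
        Span("2", "1", "checkout", True),
        Span("3", "2", "payments", True),
        Span("4", "1", "search", False),
    ]
    for span in deepest_error_spans(trace):
        print("likely origin:", span.service)
```

In this made-up trace the frontend, checkout and payments spans all report errors, but only payments has no failing child, so it is flagged first rather than the teams that own the calling services.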

In a post last July, SignifAI Product Manager Annika Garbers explained why automatic correlation helps optimize an SRE’s operational efficiency. The correlations come from the ability to use its integrations to reduce the mean time it takes to detect, understand, and respond to an issue. Each has a separate key performance indicator (KPI). Hiring more engineers, adding additional configurations or further educating a team may only solve part of the problem and in some cases make it worse.
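The three mean-time measures can be computed from incident timestamps. The sketch below assumes each incident records when the problem started, when it was detected, when the cause was understood and when it was resolved. The `Incident` record and the `mean_time_kpis` function are hypothetical, and teams may draw these boundaries differently.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Incident:
    # Hypothetical timestamps, in seconds since a common reference point.
    started: float     # when the problem actually began
    detected: float    # when monitoring or a person first noticed it
    understood: float  # when the cause was identified
    resolved: float    # when the response was complete


def mean_time_kpis(incidents: List[Incident]) -> Dict[str, float]:
    """Compute the mean time to detect, understand and respond, in seconds."""
    if not incidents:
        raise ValueError("at least one incident is required")
    return {
        "mean_time_to_detect": mean([i.detected - i.started for i in incidents]),
        "mean_time_to_understand": mean([i.understood - i.detected for i in incidents]),
        "mean_time_to_respond": mean([i.resolved - i.understood for i in incidents]),
    }


if __name__ == "__main__":
    history = [
        Incident(started=0, detected=60, understood=360, resolved=960),
        Incident(started=0, detected=120, understood=720, resolved=1320),
    ]
    for name, seconds in mean_time_kpis(history).items():
        print(f"{name}: {seconds / 60:.1f} minutes")
```

Tracking the three numbers separately shows where time is going. In the made-up history above, the time spent understanding and responding far outweighs the time spent detecting.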

Improving those KPIs is harder without automated ways to see the relationships in the data.

SignifAI is New Relic’s second purchase in the past four months. The company acquired the assets from CoScale in October of 2018. According to Gohring, the acquisition of the Belgian company helped round out New Relic’s monitoring capabilities for containers and Kubernetes.

New Relic is a sponsor of The New Stack.

Author Alex Williams is the publisher of The New Stack.

