
Why Dynatrace Says AI Is the Answer for ‘Single Panel’ Observability for Cloud Native

13 Feb 2020 12:21pm

DevOps team members these days spend much of their time chasing down and prioritizing a steady stream of bugs and security holes. On top of sifting through false positives while scrambling to decide what to fix first, they often have to work with separate application performance management (APM), observability and monitoring interfaces, which can add to the confusion.

Detection is only the first step, and remediation still has to follow. As a result, many DevOps team members can spend most of their time detecting and remediating bugs and security holes.

One potential fix for this DevOps dilemma is an artificial intelligence (AI)-controlled system, developed by Dynatrace, that serves as a single panel for monitoring and observability and also automates application-management processes.

During its Perform 2020 conference last week, Dynatrace announced that it has extended support for these new capabilities to deployments on Kubernetes. Dynatrace’s AI engine now automatically ingests Kubernetes events and metrics, allowing it to detect and process performance issues and anomalies across Kubernetes clusters, containers and workloads, the company said.

Such features could be important for developers, especially in CI/CD scenarios, since developers spend so much time instrumenting their code for monitoring and tracing, Torsten Volk, an analyst for Enterprise Management Associates (EMA), said. “Dynatrace automatically injects the needed instrumentation for mobile applications and for Kubernetes so that developers no longer have to worry about making and managing code changes for monitoring,” Volk said.

Additionally, the new automatic mobile app instrumentation helps developers “debug code a lot easier,” Volk said. “Developers can now see the context of the failure — maybe the failure only occurs in a crowded train at high speed on LTE or maybe it only manifests on a certain version of Android or iOS,” Volk said. “This ability of accessing individual failures or groups of failures within their context and receiving a full stack trace with it makes debugging and enhancing apps based on actual real-life use cases much easier.”

Dynatrace has been developing its platform with a strong emphasis on AI technology for observability, monitoring and application management, particularly for microservices and container environments. The platform, for example, covers auto-instrumentation and full-stack monitoring, while leveraging AI to automatically resolve operations and DevOps issues, Volk said. “Dynatrace has basically rewritten their whole portfolio about a year ago,” Volk said.

The single-panel model is a key component of the platform. “Having a centralized dashboard that lets you trace the impact and root cause of your latency issue will let you make a rational decision based on full information, whether or not you should spend your scarce resources on fixing the issue,” Volk said. “For example, the dashboard shows you if the application is used right now and you can often even see exactly who is using it and decide if this user group deserves immediate attention because they are abandoning high-value shopping carts out of frustration over the extra wait or if they can wait as they are not your target customers.”

Sometimes, DevOps finds the application latency might have been caused by a bad query that was part of “last night’s software release,” so with Dynatrace’s platform, “you can simply roll that back and let the development team optimize their query code,” Volk said.

“Maybe the red lights originate from a temporary slowdown or outage of an external microservice, for example, a payment gateway, where all you can do is make an angry phone call and wait for your payment provider to fix their issues, but there definitely is no reason at all to wake up developers in the middle of the night to help figure out the issue. Or maybe the slowdown is due to a legit infrastructure issue, like disk performance or CPU utilization or low memory,” Volk said. “Wouldn’t it be great to know if you should buy more hardware or if the hardware is just fine, but the problem originated from the new developer team that is hammering an API with a huge number of temporary requests for their load testing? You could simply tell them to do their testing on a different schedule and all your lights would be green again.”

Dynatrace is a sponsor of The New Stack.

Feature image via Pixabay.
