
Rookout’s Agile Flame Graphs Give Quick, Visual Insight into Production Code

23 Mar 2021 7:00am

Developers looking to debug problems in production code are often stuck between a rock and a hard place, argues Rookout chief technology officer and co-founder Liran Haimovitch. This week the company, which says it helps engineers solve customer issues faster by making debugging accessible in any environment, released Agile Flame Graphs, a step toward that goal that turns code debugging into a visual medium.

“Traditionally, profiling tools fell into one of two categories: the kind that would collect anything, everything, and the whole kitchen sink too, and those were just incredibly slow. They gave you a very accurate view of what’s going on, but they were so slow you couldn’t run them in any real-world scenario. The other kind of option is the application performance monitoring (APMs), which are designed to operate in the real world, but those provide you with a very unclear view, with just a handful of data points,” said Haimovitch.

Rookout wants to bridge that gap with Agile Flame Graphs, a feature of the company’s debugging platform that lets you select, with the click of a button, the points in your code between which you want to measure elapsed time.

Rookout integrates into a developer’s code through a software development kit (SDK), a step that Haimovitch says takes just minutes. The SDK is compatible with the JDK runtime, the Docker runtime, Python, Node.js, and Ruby, among others.

From there, the SDK lets developers set what the company calls a “non-breaking breakpoint”: a point in the code that can be observed without ever interrupting the code’s operation. By comparison, a standard breakpoint halts execution in order to return results, which makes it incompatible with debugging a live, running application. Rookout does this via bytecode manipulation. Combined with Agile Flame Graphs, it means developers can tell from a glance at a graph where code might be dragging down software performance in production.
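To make the idea of a non-breaking breakpoint concrete, here is a minimal Python sketch. It is not Rookout's implementation: the article says Rookout uses bytecode manipulation, while this example swaps in Python's standard `sys.settrace` hook, which is simpler to show but slower. The helper `observe_line` and the sample function `compute_total` are hypothetical names made up for illustration. The point is only that local variables are copied when a chosen line is reached and the program keeps running.

```python
import sys


def observe_line(func, line_offset, snapshots):
    """Build a trace function that copies locals when `func` reaches a line.

    `line_offset` counts from the `def` line of `func` (0 is the def line).
    Hypothetical helper for illustration; not part of any Rookout API.
    """
    code = func.__code__
    target_line = code.co_firstlineno + line_offset

    def local_tracer(frame, event, arg):
        # A "line" event fires just before the line executes.
        if event == "line" and frame.f_lineno == target_line:
            snapshots.append(dict(frame.f_locals))  # copy state, never pause
        return local_tracer

    def global_tracer(frame, event, arg):
        # Only trace frames that run the target function's code.
        return local_tracer if frame.f_code is code else None

    return global_tracer


def compute_total(prices, shipping):
    subtotal = sum(prices)       # offset 1
    total = subtotal + shipping  # offset 2: the observed line
    return total                 # offset 3


snapshots = []
sys.settrace(observe_line(compute_total, 2, snapshots))
try:
    result = compute_total([10, 20], 5)  # runs to completion, uninterrupted
finally:
    sys.settrace(None)

print(result, snapshots)
```

The function returns its normal result, and `snapshots` holds a copy of the local variables as they were just before the observed line ran.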

This isn’t Rookout’s first foray into visual debugging, either: the company introduced a live debugging heatmap last year.

With Rookout, developers can set these non-breaking breakpoints on two different lines of code, even lines right next to each other, and the Agile Flame Graph will show the latency between those two points.
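As a rough illustration of measuring latency between two chosen points and turning it into flame graph input, consider the Python sketch below. It assumes nothing about Rookout's internals or data format. The `SpanRecorder` class and the span names are hypothetical, and the output uses the semicolon-separated "folded stack" text format that common open source flame graph tools accept. Each span marks a start point and an end point; time spent in nested spans is subtracted so that every line reports only its own share.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SpanRecorder:
    """Hypothetical recorder: times nested spans and emits folded stacks."""

    def __init__(self, clock=time.perf_counter_ns):
        self._clock = clock
        self._stack = []                # entries: [name, start_ns, child_ns]
        self.totals = defaultdict(int)  # "outer;inner" -> self time in ns

    @contextmanager
    def span(self, name):
        frame = [name, self._clock(), 0]
        self._stack.append(frame)
        path = ";".join(entry[0] for entry in self._stack)
        try:
            yield
        finally:
            elapsed = self._clock() - frame[1]
            self._stack.pop()
            # Self time excludes time already attributed to nested spans.
            self.totals[path] += elapsed - frame[2]
            if self._stack:
                self._stack[-1][2] += elapsed

    def folded(self):
        """Return 'stack;path value' lines, one per distinct stack."""
        return [f"{path} {ns}" for path, ns in sorted(self.totals.items())]


recorder = SpanRecorder()
with recorder.span("handle_request"):
    with recorder.span("load_user"):
        sum(range(10_000))
    with recorder.span("render"):
        sum(range(50_000))

for line in recorder.folded():
    print(line)
```

The clock is injectable so the arithmetic can be checked with fixed timestamps; in normal use the default monotonic clock is enough.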

In an email, Rookout CEO Shahar Fogel further explained how Rookout differs from the competition: it lets developers debug code in complex production environments, where other tools may fail.

“The ‘old ways’ of debugging are about either reproducing locally and debugging step-by-step, or about adding log lines and hoping that the issue will happen again. In cloud native and distributed environments, and in production environments, these methods are not effective and are sometimes outright impossible to use,” wrote Fogel. “Complex environments are hard or impossible to reproduce locally, yet debugging step-by-step means stopping your app or pod, which is something you can’t do in a production. And adding log lines means waiting anywhere between hours or sometimes days, even for the most agile and automation-driven organizations.”

Currently, latency is the primary metric offered with Agile Flame Graphs, but the company plans to add new data points in the near future, including CPU usage, memory allocation, and garbage collection pressure. Haimovitch explained that garbage collection pressure, for example, can slow down performance and could be monitored in a similar way.

“Whenever you allocate something, and when you de-allocate it, this has to be cleaned up. This cleanup effort takes time. In Node.js, for instance, when the system is under a lot of pressure it doesn’t have time to clean up, especially if you’re creating a lot of objects and then deleting them, which is a lot of time work for the runtime,” said Haimovitch. “If you’re arbitrarily allocating lots and lots of objects, that can cause the code to become inefficient. You’re essentially pressuring the garbage collection operation to constantly be running. The application is going to collect the garbage you’re generating instead of running your code to serve your customers.”
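Haimovitch's example is Node.js, but the same effect can be observed in other garbage-collected runtimes. The following Python sketch is an illustration only, not a Rookout feature: it uses the standard library's `gc.callbacks` hook to count collector runs and the time spent inside them while a workload churns through short-lived objects. The `GcPauseMeter` class and the `churn` function are hypothetical names.

```python
import gc
import time


class GcPauseMeter:
    """Hypothetical meter: counts GC runs and time spent inside them."""

    def __init__(self):
        self.collections = 0
        self.total_pause_ns = 0
        self._started_at = None

    def _on_gc(self, phase, info):
        # CPython calls this with phase "start" before and "stop" after a run.
        if phase == "start":
            self._started_at = time.perf_counter_ns()
        elif phase == "stop" and self._started_at is not None:
            self.total_pause_ns += time.perf_counter_ns() - self._started_at
            self.collections += 1
            self._started_at = None

    def __enter__(self):
        gc.callbacks.append(self._on_gc)
        return self

    def __exit__(self, *exc_info):
        gc.callbacks.remove(self._on_gc)
        return False


def churn(n):
    """Create and drop many reference cycles so the collector has work."""
    for _ in range(n):
        a, b = [], []
        a.append(b)
        b.append(a)


with GcPauseMeter() as meter:
    churn(50_000)
    gc.collect()  # force one final run so it is counted

print(f"GC runs: {meter.collections}, time in GC: {meter.total_pause_ns} ns")
```

Comparing the time spent in the collector with the total run time of the workload gives a rough sense of how much of the application's effort goes to cleaning up rather than serving requests.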
