Grafana Shows New Observability Projects at ObservabilityCON
Open source Grafana has become more than just the predominant way to visualize observability data for developers and operations teams. Its panels are used to view anything from oven temperatures to fuel flow in SpaceX’s spacecraft. In today’s software-driven world, the developer’s role at large companies has evolved to become a force to be reckoned with.
This helps to explain, at least partially, why Grafana has increasingly been incorporated into corporate infrastructures based on what the developers, as well as operations folks, want. That is a change from the not-so-far-gone days when the CIO or other C-suite executives might have dictated the platforms (usually proprietary) the engineers had to use. Today, the tables have been turned somewhat and the engineers have more of a stake in what infrastructure is adopted.
With Grafana having reached 10 million users, its popularity and grassroots-driven adoption by large organizations were apparent during ObservabilityCON 2022, held in New York earlier this month, where Grafana users from large banks and other corporate entities were present in numbers. Grafana also hosted several reasonably priced workshops on the eve of the conference at one of Google’s locations in downtown Manhattan, where snacks and fruit were in abundance.
Latency and other performance issues as your needs scale aren't fun when your dealing with terabytes and terabytes of log data. @grafana is seeking to help with those headaches with soon-to-be available #Loki 2.7, as @tom_wilkie described at #ObservabilityCON 2022 in NYC today. pic.twitter.com/cLCH11FllP
— BC Gain (@bcamerongain) November 2, 2022
During the conference, speakers from Grafana and its customers described how the company plans to offer a much wider observability experience beyond its initial panel for data visualization. To that end, Grafana announced two new open source projects: Grafana Phlare for continuous profiling and Grafana Faro for frontend application observability. Users should see both as new additions the next time they access their Grafana panel. These open source projects build on Grafana’s existing open source projects Mimir (metrics), Loki (logs) and Tempo (traces). (More about this below.)
New features for Grafana Loki for logs, Grafana k6 for performance testing, Grafana Tempo integration and Grafana Mimir for metrics were also discussed and demoed.
@grafana CEO Raj Dutt described what Grafana is trying to accomplish for observability, during the #keynote here in NYC today at #ObservabilityCON 2022. Those Grafana nice-to-look at panels are cool and fun to play around with, but there is a lot more to it, of course. @nopzor pic.twitter.com/gWW7H3cpYX
— BC Gain (@bcamerongain) November 2, 2022
If anything, Grafana does want to maintain its connection to the developer. Its “LGTM” label refers to the “looks good to me” shorthand used when pull requests are approved on GitHub, and it also stands for Loki, Grafana, Tempo and Mimir.
“We have spent the last four years developing that initial core stack. So what’s next for us is we’re going earlier in the software development cycle, with more testing, including chaos testing, and reliability engineering with tools like k6,” Raj Dutt, CEO and co-founder of Grafana Labs, told The New Stack in an interview during ObservabilityCON. “We are going earlier in the software lifecycle with the pre-prod testing and continue to invest in observability.”
But, of course, you do not have to opt for a Grafana data source for observability when visualizing data with a Grafana panel. As Dutt noted during his keynote, Grafana is designed to prioritize interoperability. This is because most organizations work with many observability vendors and solutions, as well as different databases. “We’re pretty unique, and I think differentiated as a vendor, because we don’t tell you, we don’t require you to consolidate all your data into a single platform. We allow you to keep your data where you want it and use the technologies that make sense for your company,” Dutt said. “We want you to use the vendors that make sense and ultimately own your own observability strategy and we think that changes the relationship that we can enjoy with our customers and users, because it makes us more of a true partner rather than a vendor that’s going to consolidate all your data and spend.”
'Continuous profiling' is being called by some as the 'fourth pillar of observability.' @Grafana's new #Phlare open source project offers storage and querying of profiling data, Phlare creator Cyril Tovena of @Grafana described today. @Kuqd #ObservabilityCON 2022. pic.twitter.com/QgiysgW0Qv
— BC Gain (@bcamerongain) November 2, 2022
Helping developers with their observability struggles largely served as the impetus behind perhaps Grafana’s most ambitious effort as of late: the launch of Grafana Phlare. What does continuous profiling mean and why should this new open source alternative be of interest to developers, as well as operations folks? To begin with, let’s look at the definition of what profiling means. According to Cyril Tovena, a Grafana principal software engineer who is behind much of the creation of the latest versions of Loki and the lead developer of Phlare, profiling is an analysis of code execution by collecting stack trace samples, a set of function calls that lead to resource consumption. “Profiling, in my opinion, is a superpower because it tells you down to the actual line of code what the problem is… and allows you to pinpoint a performance problem,” Tovena said.
The idea behind Phlare was to fill a gap left by existing open source projects for storing and querying continuous profiling data, which do not meet the scaling, reliability and overall performance requirements Grafana users expect, Tovena said. It also caters to those already familiar with Grafana and how it works since it shares the same underlying architecture as Loki, Tempo and Mimir.
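To make the idea of "collecting stack trace samples" concrete, here is a minimal sketch in Python of how sampled stacks can be aggregated to find the function consuming the most time. It is an illustration only and not how Phlare itself works: the function aggregate_samples and the sample data are hypothetical, and a real profiler would capture the stacks from a running program rather than from a hard-coded list.

```python
from collections import Counter


def aggregate_samples(samples):
    """Aggregate sampled call stacks into per-function counts.

    Each sample is a tuple of function names ordered from the outermost
    caller to the function that was running when the sample was taken.
    Returns (self_counts, total_counts):
      self_counts  - samples in which the function was the one running
      total_counts - samples in which the function appeared anywhere
    """
    self_counts = Counter()
    total_counts = Counter()
    for stack in samples:
        if not stack:
            continue
        # The last frame is where the resource was actually being consumed.
        self_counts[stack[-1]] += 1
        # Use a set so a recursive function is counted once per sample.
        for function_name in set(stack):
            total_counts[function_name] += 1
    return self_counts, total_counts


# Hypothetical samples, as if taken at regular intervals by a profiler.
samples = [
    ("main", "handle_request", "parse_json"),
    ("main", "handle_request", "parse_json"),
    ("main", "handle_request", "query_db"),
    ("main", "idle"),
]

self_counts, total_counts = aggregate_samples(samples)
hottest, hits = self_counts.most_common(1)[0]
print(f"hottest function: {hottest} ({hits} of {len(samples)} samples)")
```

Continuous profiling repeats this kind of collection at regular intervals and stores the results so they can be queried later, which is the storage and querying role the article attributes to Phlare.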
Phlare is considered by Grafana to be “another pillar of observability,” Tom Wilkie, vice president of product at Grafana Labs, who is also a Prometheus maintainer and a Loki and Mimir co-creator, said during a keynote. So, Phlare’s continuous-profiling capability “is the regular collection of memory and CPU profiles. It helps with things like program optimization, it helps with identifying and reducing tail latencies and optimizing your TCO,” he said. “Bringing it to Grafana means you can see it alongside your metrics, traces and logs and you can correlate between the two.”
What End Users See
Grafana Faro targets frontend application observability, with the business use case of observing what end-user customers see on their screens, Myrle Krantz, director of engineering at Grafana Labs, said during her keynote. “You still need to know what’s actually going into customers. The frontend is what your customers see and what they touch and where they access the value that your company has to offer. And it’s also harder to monitor than other things because what happens on the frontend doesn’t happen on a computer that’s under your control; it happens on your clients’ computers.”
According to its GitHub documentation, Faro is launching with the Grafana Faro Web SDK, which Grafana describes as a highly configurable web SDK for real-user monitoring (RUM) that instruments browser frontend applications to capture observability signals. Frontend telemetry can then be correlated with backend and infrastructure data for full-stack observability.
Find Those Traces
The upcoming Tempo 2.0 release represents a potential turning point in the project’s evolution, featuring TraceQL, a query language for tracing. With it, it is easier to find and delineate the traces you are looking for. Previously, users needed to rely on searching through logs and exemplars, or attributes, like service name, to pinpoint traces.
The capability was designed to allow users to interactively extract and search through their traces. Future phases of the language will allow users to find and analyze traces based on their structure, Wilkie said. They will be able to search for traces that have a master span of a particular attribute and then combine that with other attributes, “so you match the structure of the trace,” he said.
“I think the thing that makes me the most excited about it is that we kind of don’t know what people are going to do with this,” Wilkie said. “What applications will this capability enable? In the future I think we’re going to see some exciting apps built into Tempo based on TraceQL commands.”
Prometheus users and other users already familiar with PromQL will notice how Tempo 2.0’s TraceQL is modeled on PromQL as well as on LogQL, Wilkie said. “If you’re familiar with any of those projects, you should be able to jump right into Tempo,” he said.
It Shouldn’t Be Hard
Grassroots interest among the developer and operations community largely accounts for Grafana’s surge in users, who are seeking metrics from Prometheus or other observability data sources. That surge can also be attributed to how Grafana panels make observability easy to use, or at the very least, easy to get started with. In other words, Grafana is trying to replace reliance on the resident Prometheus or observability expert with a setup in which everyone can figure out how to use observability in a short period of time.
“The vision here is that these should really help you get started in five minutes instead of days. It should be as simple as clicking through a few steps, deploying the agent and you’re there. One of the things that I’m particularly pleased with is more recently, we found that these integrations really helped you deploy those best practices throughout your organization,” Wilkie said. “So, instead of just focusing on getting started, this is actually now about getting that bottled-up expertise throughout a big organization: There have been about [100,000 uses in Grafana Cloud of about 50 integrations] after we started with just one integration (which was kind of embarrassing)” at the time.