OpenTelemetry Consolidates Data for Observability

20 Jul 2020 11:37am, by

Dave McAllister
Currently providing technical evangelism for Splunk, Dave is working with DevOps, developers and architects to understand the advantages of modern microservice architectures and orchestration needed in distributed systems, especially for today's fast-moving cycles. Dave has been a champion for open systems and open source from the early days of Linux, through open distributed file systems like XFS, GFS, and GlusterFS to today's world of clouds and containers. He often covers topics such as the real-world issues associated with emerging software architectures and practices, open source software and creating new technology companies. A self-described open source geek and standards wonk, he shares his opinions on Twitter via @dwmcallister.

The world of applications is a complex place. We have public and private clouds, on-premises systems, containers, orchestration, microservices, elastic response and more. We evolve constantly, deploy constantly and scale aggressively. Yet amid all this change, much like the saying that “in this world nothing can be said to be certain, except death and taxes,” one thing is certain in the DevOps and developer space: the amount of available and useful data will keep increasing. So how do we get that data into our monitoring and analytics tools using open, standard technology, and avoid a patchwork of different projects?

Enter observability. Observability, for applications, is the design and delivery of data from myriad signals (telemetry) so that we can infer and discover how the applications (and subsequently, their infrastructure) are behaving. Observability is often split into three pillars: metrics, (distributed) traces and logs. Unlike the older “one system, one application” model, observability allows our monitoring and analytics tools to show things we didn’t even know we needed to know, to discover and resolve the “unknown unknowns.” However, with three separate types of data comes the need to cross-correlate that data. In our modern complex environments, no single piece of information is likely to be sufficient to resolve an issue, whether that issue is an application failure or a drop in some proxy for user happiness. And that can cause headaches in how we collect the data.

One of the biggest headaches with our observability data is that we have been restricted to bringing the data into our tools via single-source collectors. In other words, we needed to bring traces in with one collector, metrics with another, and logs with yet another. Quite often these were proprietary, leaving us locked into a data model that lagged the changes in our application architecture and environment. We were faced with collectors that delivered in batches, imposed heavy processing requirements at data ingest and offered no timeliness at all, giving us no clear understanding of application performance in a world that expects immediate answers.

That’s where projects like OpenTelemetry (a project within the Cloud Native Computing Foundation) can help out. OpenTelemetry was created from the merger of OpenTracing and OpenCensus, and it focuses on getting us the information we need to deal with today’s modern applications. OpenTelemetry takes the best concepts of both projects, merges and extends them, and also provides backward compatibility, which keeps migration pain minimal and allows a planned rollover to the new APIs for new features and functionality. And it’s well designed for use in observability.

So, OpenTelemetry is becoming that much-needed open, standard way of collecting and transmitting data to our monitoring and analytics tools. Previously, the choice of a particular tool often dictated the use of that tool’s own data acquisition agent, which caused problems if it ever became necessary to change monitoring tools. Both the applications and the infrastructure would need to be re-instrumented, a scenario that few teams would choose to face without significant reason. With the OpenTelemetry model and its surrounding ecosystem, the data acquisition remains intact, leaving us free to choose the tools that advance our insight.

OpenTelemetry in its current state supports both traces and metrics. A trace is defined by its spans and can be viewed as a waterfall plot or, more formally, as a directed acyclic graph (DAG) of spans in which the edges are parent/child relationships.

Each Span encapsulates the following:

  • An operation name
  • A start and end timestamp
  • A set of zero or more key:value Attributes
  • A set of zero or more Events, each of which is itself a key:value map paired with a timestamp
  • The parent Span’s identifier
  • Links to zero or more causally-related Spans
  • A SpanContext that identifies the Span

Through this, OpenTelemetry can build traces for all trace data, including causally linked spans.
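
To make the span fields listed above concrete, here is a minimal sketch in plain Python. It is not the OpenTelemetry API: the class names, field names and the `children_by_parent` helper are hypothetical, chosen only to mirror the list, and the example spans are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SpanContext:
    """Identifies a span: which trace it belongs to and its own id."""
    trace_id: str
    span_id: str


@dataclass
class Event:
    """A timestamped key:value map attached to a span."""
    timestamp: float
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Span:
    """Illustrative span record; field names are assumptions, not the real API."""
    name: str                                  # operation name
    context: SpanContext                       # identification of this span
    start_time: float
    end_time: float
    parent_span_id: Optional[str] = None       # None marks a root span
    attributes: Dict[str, str] = field(default_factory=dict)
    events: List[Event] = field(default_factory=list)
    links: List[SpanContext] = field(default_factory=list)  # causally related spans


def children_by_parent(spans: List[Span]) -> Dict[Optional[str], List[str]]:
    """Group span names by parent span id; the None key holds root spans."""
    tree: Dict[Optional[str], List[str]] = {}
    for span in spans:
        tree.setdefault(span.parent_span_id, []).append(span.name)
    return tree


# Three invented spans from one trace: a request and two child operations.
spans = [
    Span("GET /checkout", SpanContext("t1", "a"), 0.0, 0.9),
    Span("charge-card", SpanContext("t1", "b"), 0.1, 0.6, parent_span_id="a"),
    Span("write-order", SpanContext("t1", "c"), 0.6, 0.8, parent_span_id="a"),
]
tree = children_by_parent(spans)
```

Here `tree[None]` holds the root span and `tree["a"]` holds its two children, which is the parent/child edge structure a waterfall view is drawn from.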

Metrics are recorded either as raw measurements or with a predefined aggregation and a set of labels. Recording raw measurements lets us defer the choice of aggregation algorithm to the visualization and monitoring stage. By supporting both predefined aggregation and raw measurements, OpenTelemetry gives us consolidated data with flexible back-end use.
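
The difference between the two recording styles can be sketched in a few lines of plain Python. This is an illustration only, not the OpenTelemetry metrics API: `record_request`, the two dictionaries and the sample latencies are all hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# A label set is stored as a sorted tuple of (key, value) pairs so it can be a dict key.
Labels = Tuple[Tuple[str, str], ...]

# Raw measurements: every value is kept, so aggregation can be chosen later.
raw_latency_ms: Dict[Labels, List[float]] = defaultdict(list)

# Predefined aggregation: only a running count per label set is kept.
request_count: Dict[Labels, int] = defaultdict(int)


def record_request(latency_ms: float, **labels: str) -> None:
    """Record one request both ways (hypothetical helper for illustration)."""
    key = tuple(sorted(labels.items()))
    raw_latency_ms[key].append(latency_ms)
    request_count[key] += 1


for latency in (12.0, 20.0, 250.0):
    record_request(latency, route="/checkout")

key = (("route", "/checkout"),)
# With raw data, the back end can still pick any aggregation after the fact.
average = mean(raw_latency_ms[key])
worst = max(raw_latency_ms[key])
```

The counter can only ever answer "how many requests," while the raw list can still produce an average, a maximum or a percentile after the fact, at the cost of carrying more data.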

OpenTelemetry supports metrics and tracing today, and full unification for observability is moving closer. The OpenTelemetry Log Data Model has arrived to consolidate log information with application metrics and tracing data. It is still evolving, but it is crucial for completing the observability picture.
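
One way to picture what consolidating logs with tracing data buys us: if a log record carries the identifiers of the trace and span that were active when it was written, finding the logs for a slow request becomes a simple lookup. The sketch below assumes that kind of record; the `LogRecord` class, its field names and the `logs_for_trace` helper are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogRecord:
    """Illustrative log record; the field names are assumptions."""
    timestamp: float
    body: str
    trace_id: Optional[str] = None   # set when the log was emitted inside a trace
    span_id: Optional[str] = None


def logs_for_trace(logs: List[LogRecord], trace_id: str) -> List[LogRecord]:
    """Return only the log records that belong to the given trace."""
    return [record for record in logs if record.trace_id == trace_id]


# Invented records: two belong to trace "t1", one has no trace context.
logs = [
    LogRecord(0.10, "charging card", trace_id="t1", span_id="b"),
    LogRecord(0.20, "cache warmed"),
    LogRecord(0.55, "card declined", trace_id="t1", span_id="b"),
]
related = logs_for_trace(logs, "t1")
bodies = [record.body for record in related]
```

The uncorrelated "cache warmed" record is filtered out, leaving only the two messages written while trace `t1` was in flight.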

Our signals are unified through an open collector, and our monitoring can now correlate across all the pillars of information. The work is still in progress, but the roadmap to unification is clear. With this, we can talk about telemetry data as a whole, without having to spell out which data is not available. Of course, this doesn’t mean every tool will make use of all three signals (metrics, traces and logs). But the single collector means our applications are ready to deliver the data to our tools, and it removes the need to run and maintain multiple agents tied to specific back-end tools. And as mentioned, it protects us from having to re-instrument to change tools.

OpenTelemetry also provides unification via the exporter model. An exporter translates telemetry data into another format, most often one that a specific tool requires. As an example, you might want to use Prometheus as your monitoring and alerting system. By using the OpenTelemetry Prometheus exporter, you can feed telemetry into Prometheus. Additionally, since you can chain exporters and use several at the same time, you can send the data wherever you need it to go. The ecosystem of exporters, along with the OpenTelemetry Collector, provides the ability to feed data as necessary in the right format. OpenTelemetry brings unified observability closer within reach.
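
The exporter idea, one stream of telemetry translated into whatever each tool expects, can be sketched as follows. These classes are hypothetical stand-ins and not the real OpenTelemetry exporters; the first one only imitates the general shape of a Prometheus text-format sample line, and the metric name and labels are invented.

```python
import json
from typing import Dict, List, Protocol


class Exporter(Protocol):
    """Anything that can translate one measurement into a tool-specific string."""
    def export(self, name: str, value: float, labels: Dict[str, str]) -> str: ...


class PrometheusTextExporter:
    """Hypothetical exporter producing a Prometheus-style sample line."""
    def export(self, name: str, value: float, labels: Dict[str, str]) -> str:
        label_text = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        return f"{name}{{{label_text}}} {value}"


class JsonExporter:
    """Hypothetical exporter producing a JSON document for some other back end."""
    def export(self, name: str, value: float, labels: Dict[str, str]) -> str:
        return json.dumps({"name": name, "value": value, "labels": labels}, sort_keys=True)


def export_all(exporters: List[Exporter], name: str, value: float,
               labels: Dict[str, str]) -> List[str]:
    """Fan the same measurement out to every configured exporter."""
    return [exporter.export(name, value, labels) for exporter in exporters]


outputs = export_all(
    [PrometheusTextExporter(), JsonExporter()],
    "http_requests_total", 3, {"route": "/checkout"},
)
```

The application records the measurement once. Adding or swapping a back end means changing the list of exporters, not the instrumentation.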

It’s time for you to consider OpenTelemetry as your data collection mode for observability. The advantage to you is protection from instrumentation changes in your applications due to new tools or new functionality. The ability to bring all of your data in an open, standard approach and still deliver it to your tools of choice means you can start now and grow with it in the future.

The Cloud Native Computing Foundation is a sponsor of The New Stack.

