Why the Latest Advances in OpenTelemetry Are Significant
One of the hottest projects at the Cloud Native Computing Foundation (CNCF) this year was OpenTelemetry, along with its OpenTelemetry Collector. The project is an exciting movement in the observability space: a cross-industry collaboration to agree on a standardized data format for observability and telemetry.
That alone is significant because data collected once can be used by multiple observability tools, whereas previously teams were forced to transform data multiple times if they wanted a single picture of an event. With all the hype around AI/ML in observability, it’s more likely than ever that companies will benefit from storing and viewing data in one system and training ML models in another.
What’s great is that this project continues to advance thanks to industry vendors and individuals collaborating on the OpenTelemetry Collector, a standardized agent and telemetry collector built for high-throughput collection.
Below, I share some of the project’s new features and why they’re significant for the community.
1. New Transformation Language
I’ve found the syntax of many agents makes it very difficult to do meaningful transformations without some wacky YAML or TOML. OpenTelemetry Collector is still configured in YAML, but its new transformation language (the OpenTelemetry Transformation Language, or OTTL) uses function-based statements that execute quickly and keep complexity manageable. Check out some examples of the syntax.
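To make the idea of function-based statements concrete, here is a minimal Python sketch of the pattern: each statement pairs an action function with an optional condition, and statements run in order against a record. This is a toy model and not the Collector’s actual transformation language; the names `Statement`, `set_field`, `delete_key` and `apply` are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Statement:
    """One transformation: run `action` on a record when `where` holds."""
    action: Callable[[Record], None]
    where: Callable[[Record], bool] = lambda record: True


def set_field(key: str, value: Any) -> Callable[[Record], None]:
    """Build an action that assigns a constant value to a field."""
    def action(record: Record) -> None:
        record[key] = value
    return action


def delete_key(key: str) -> Callable[[Record], None]:
    """Build an action that removes a field if it is present."""
    def action(record: Record) -> None:
        record.pop(key, None)
    return action


def apply(statements: List[Statement], record: Record) -> Record:
    """Apply each statement in order to a copy of the record."""
    out = dict(record)
    for statement in statements:
        if statement.where(out):
            statement.action(out)
    return out


# Hypothetical rules: flag server errors, and always drop a sensitive field.
statements = [
    Statement(set_field("severity", "ERROR"),
              where=lambda r: r.get("status", 0) >= 500),
    Statement(delete_key("password")),
]

result = apply(statements, {"status": 503, "password": "hunter2", "path": "/cart"})
print(result)  # {'status': 503, 'path': '/cart', 'severity': 'ERROR'}
```

The appeal of this shape is that each rule reads as one line (an action plus a condition), so a long list of transformations stays reviewable.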
2. Logging Went GA
After just about a year of development, log collection is now GA. The implementation offers a couple of ways to collect logs:
- First, the Collector can run as a stand-alone agent and collect logs from the file system. The logs can be sent directly to the final destination or forwarded to an OpenTelemetry Collector running in collector mode, where log metrics can be calculated on the fly.
- Second, a number of logging SDKs can be embedded directly in an application to send logs to a central collector or straight to the final destination, which helps minimize the impact on disk I/O.
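As a rough illustration of the second path, and of calculating log metrics on the fly, the sketch below uses only Python’s standard `logging` module. It is not the OpenTelemetry logging SDK: `CountingForwarder` and its `export` callable are hypothetical stand-ins for an SDK handler and a network sender, and the `sent` list stands in for a collector endpoint.

```python
import logging
from collections import Counter


class CountingForwarder(logging.Handler):
    """Forward each record to `export` and tally records per severity."""

    def __init__(self, export):
        super().__init__()
        self.export = export      # hypothetical sender, e.g. a network client
        self.counts = Counter()   # log-derived metric: records per level

    def emit(self, record: logging.LogRecord) -> None:
        self.counts[record.levelname] += 1
        # Records go straight to the exporter; nothing is written to disk.
        self.export({"severity": record.levelname, "body": record.getMessage()})


sent = []  # stands in for a collector endpoint
handler = CountingForwarder(sent.append)

logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo from echoing through other handlers
logger.addHandler(handler)

logger.info("order placed")
logger.error("payment failed for order %s", 42)
logger.error("payment failed for order %s", 43)

print(dict(handler.counts))  # {'INFO': 1, 'ERROR': 2}
```

Because the handler sees every record in memory, counting by severity costs almost nothing extra, which is the same reason a collector can derive metrics from logs while forwarding them.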
3. Auto Instrumentation Maturity
Auto instrumentation is the ability to automatically wire up an application to emit traces and metrics with minimal to no code changes. Java and .NET are fully supported, and other languages are in various stages of development and release. Proprietary solutions have showcased this functionality as a differentiator because it reduces rollout complexity by minimizing developer time; the OpenTelemetry ecosystem now offers functionality that is just as powerful.
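The core trick behind auto instrumentation can be sketched in a few lines of Python: replace a function at runtime with a wrapper that records timing and outcome, so the application code itself never changes. This is only a toy under stated assumptions. The `instrument` helper and the `spans` list are invented here, and real agents use far more robust mechanisms than this simple attribute swap.

```python
import functools
import time
import types

spans = []  # finished spans; a real SDK would export these instead


def instrument(namespace, name):
    """Replace namespace.<name> with a wrapper that records a span per call."""
    original = getattr(namespace, name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return original(*args, **kwargs)
        except Exception:
            status = "error"
            raise
        finally:
            # Runs on both success and failure, so every call yields a span.
            spans.append({
                "name": name,
                "status": status,
                "duration_s": time.perf_counter() - start,
            })

    setattr(namespace, name, wrapper)


# Application code, written with no knowledge of tracing.
def handle_request(path):
    return f"200 OK {path}"


app = types.SimpleNamespace(handle_request=handle_request)

# The "agent" wires up tracing from the outside.
instrument(app, "handle_request")

print(app.handle_request("/cart"))           # 200 OK /cart
print(spans[0]["name"], spans[0]["status"])  # handle_request ok
```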
4. Semantic Conventions
This one is huge and is benefiting from Elastic donating ECS (Elastic Common Schema) to the OpenTelemetry project. Normalizing telemetry structure is challenging because it seems like just about everyone produces telemetry data in a slightly different format; yet to analyze the data, create alerts and present it in a human-friendly way, all the telemetry fields need to be mapped somehow.
If everyone and every system is just a little different, it is hard to build reusable dashboards and components. With shared conventions, software vendors can take responsibility for creating dashboards with reasonable confidence that the data will be in the right format on multiple platforms.
Meanwhile, those of us who manage large amounts of telemetry data can improve ingest and query efficiency, and can provide more advanced functionality with fewer compute resources and less memory overhead, if the majority of the content customers send relies on well-known field names.
The full schema is still a ways off from being finalized, but piece-by-piece the conventions are being ratified. For example, at KubeCon they announced the finalization of the HTTP schema.
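To show why shared field names matter, here is a small Python sketch that maps differently named fields onto one set of attribute names. The target names (`http.request.method`, `http.response.status_code`, `server.address`) follow the OpenTelemetry HTTP semantic conventions as I understand them, so check the current specification before relying on them. The alias names on the left and the `normalize` helper are assumptions made up for this example.

```python
# Source-specific spellings (hypothetical or older names) mapped to the
# attribute names assumed to match the HTTP semantic conventions.
SEMCONV_ALIASES = {
    "verb": "http.request.method",
    "http.method": "http.request.method",
    "response_code": "http.response.status_code",
    "http.status_code": "http.response.status_code",
    "host": "server.address",
}


def normalize(event: dict) -> dict:
    """Rename known aliases; pass unknown fields through unchanged."""
    return {SEMCONV_ALIASES.get(key, key): value for key, value in event.items()}


# Two sources describing the same request with different field names.
from_proxy = {"verb": "GET", "response_code": 502, "host": "shop.example"}
from_app = {"http.method": "GET", "http.status_code": 502, "host": "shop.example"}

# After normalization, one dashboard query or alert rule covers both sources.
print(normalize(from_proxy) == normalize(from_app))  # True
print(normalize(from_proxy))
```

When producers emit the conventional names in the first place, this mapping step disappears entirely, which is where the ingest and query savings come from.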
5. Plugin Framework and Ecosystem
The ecosystem is growing in maturity. The extensibility framework allows customization of any stage of an ingest pipeline. There is a growing number of receivers for a variety of systems, processors with increasingly advanced functionality, and exporters for a range of destinations. I’m particularly excited by the new version of the OpenSearch extension that sends log data prepackaged in the Simplified Schema For Observability or ECS formats.
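The receiver, processor and destination stages can be pictured as a chain of small, swappable pieces. The Python sketch below is a conceptual model only, not the Collector’s plugin API; `line_receiver`, `drop_debug`, `add_attribute`, `MemoryExporter` and `run_pipeline` are all hypothetical names.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
Processor = Callable[[Iterable[Record]], Iterable[Record]]


def line_receiver(lines: Iterable[str]) -> Iterator[Record]:
    """Receiver: turn raw text lines into records."""
    for line in lines:
        yield {"body": line.rstrip("\n")}


def drop_debug(records: Iterable[Record]) -> Iterator[Record]:
    """Processor: filter out records whose body starts with DEBUG."""
    return (r for r in records if not str(r["body"]).startswith("DEBUG"))


def add_attribute(key: str, value: object) -> Processor:
    """Processor factory: stamp every record with a fixed attribute."""
    def process(records: Iterable[Record]) -> Iterator[Record]:
        for record in records:
            yield {**record, key: value}
    return process


class MemoryExporter:
    """Destination stand-in: collects records instead of sending them."""

    def __init__(self) -> None:
        self.sent: List[Record] = []

    def export(self, records: Iterable[Record]) -> None:
        self.sent.extend(records)


def run_pipeline(source: Iterable[Record],
                 processors: List[Processor],
                 exporter: MemoryExporter) -> None:
    """Chain the processors lazily, then hand the stream to the exporter."""
    stream = source
    for processor in processors:
        stream = processor(stream)
    exporter.export(stream)


exporter = MemoryExporter()
run_pipeline(
    line_receiver(["INFO started\n", "DEBUG cache miss\n", "ERROR timeout\n"]),
    [drop_debug, add_attribute("service.name", "checkout")],
    exporter,
)
print(exporter.sent)
```

Because every stage shares the same record shape, any stage can be replaced without touching the others, which is what makes a plugin ecosystem workable.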
From a developer’s mindset, I find the structure of the schema and the internal “p” message schema, which is built on protobuf, to be extremely well thought out. It strikes a good balance between functional freedom and minimal complexity.
6. Community Collaboration
This one isn’t so new to the CNCF community, but the speed and impact of this project exemplify the spirit of the CNCF community philosophy. Competing companies are working together to make a piece of computing far better and easier for the rest of us. Some may fear that removing vendor lock-in will cause customers to leave, or that sharing the code may give away proprietary IP.
However, in the telemetry space, the core architecture of agents and collectors is generally a solved problem. So why not make something that leans into conventions and works across platforms so that companies no longer have to maintain agent code, 80% of which is repetitive? This frees companies to work on shared plugins for interoperability and proprietary processors where innovation can be delivered through this framework.
The benefits extend to all operators and software vendors, too. With the standardized OpenTelemetry Collector SDK, vendors can create a single integration to instrument telemetry in their applications. That dramatically simplifies collection compared with trying to get every major observability provider to implement support for their application.
Operators benefit from the “collect anywhere” and “ship anywhere” mentality, too. Setup is simplified by standard config file formats, and the complexity of onboarding new systems is minimized. I also suspect that many operators of log systems will see their field-mapping cardinality problems greatly diminish thanks to the project’s semantic conventions for observability data.
A huge “Thank you!” is due to all of the project contributors and community members. There are too many to list here, but you can track them down on the OpenTelemetry project on GitHub.
OpenTelemetry and OpenTelemetry Collector are moving extremely fast, and this past year the project was the second most contributed-to in the CNCF portfolio, behind only Kubernetes. With this many contributors staying organized and working together, the project’s maturity is going to keep accelerating. This will hopefully unlock innovation in the observability space by increasing interoperability and making it easier to instrument systems for telemetry collection.
(Editor’s note: This post has been revised for clarity around the role of OpenTelemetry).