How the OpenTelemetry Protocol Works with Lightstep and Prometheus
We’ve been hard at work making metrics telemetry an integral part of Lightstep. As part of this effort, we’ve been looking at the problem of data ingest. Unlike distributed tracing, which is a relatively recent addition to many observability stacks, metrics have been the backbone of monitoring since the time of the dinosaurs.
We’re very interested in the problem of “instrument once, send your data anywhere.” But what if you’ve already instrumented and want to send the data to Lightstep?
The reality is that modern systems are heterogeneous. It isn’t always easy to transition from one metrics system to another, and legacy systems may produce metrics in a format that you no longer wish to use. This can result in a major headache for operators, who end up managing a mishmash of telemetry systems.
While Lightstep supports other common formats, we believe our customers, and the industry as a whole, benefit when we can agree on standard protocols and APIs. This is why Lightstep is a core contributor to the OpenTelemetry project: we believe that telemetry can be simplified. It should be possible to have a standard system that elegantly handles all types of observability data, regardless of where and how the data is produced.
Because earlier systems tended to only handle a single type of data, we needed a new protocol. So OpenTelemetry created the OpenTelemetry Protocol (OTLP), which supports traces, metrics and logs in a single data stream. Lightstep supports OTLP natively and is currently working with the rest of the community to finalize the metrics portion of OTLP.
This means that if you can translate your data into OTLP, you can send it to Lightstep.
The next step is writing the translator, which converts existing formats into OTLP. In OpenTelemetry, this translation occurs in the Collector. The Collector is a stand-alone service that handles ingestion, processing and exporting of all major observability formats. A translation and processing service like this can provide some much-needed glue in a modern system.
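To make the ingest, process, export flow concrete, here is a toy sketch of a pipeline in Python. It is not the Collector's actual implementation or API. Every name in it (run_pipeline, fake_prometheus_receiver, add_service_name, collect_exporter, and the record layout) is hypothetical and exists only to illustrate the shape of the idea.

```python
from typing import Callable, Dict, Iterable, List

# A record is a plain dict here; the real Collector uses its own data model.
Record = Dict[str, object]

def run_pipeline(
    receivers: List[Callable[[], Iterable[Record]]],
    processors: List[Callable[[Record], Record]],
    exporters: List[Callable[[Record], None]],
) -> int:
    """Pull records from every receiver, run each record through the
    processors in order, then hand it to every exporter.
    Returns the number of records handled."""
    handled = 0
    for receive in receivers:
        for record in receive():
            for process in processors:
                record = process(record)
            for export in exporters:
                export(record)
            handled += 1
    return handled

# Hypothetical components, standing in for real receivers and exporters.
def fake_prometheus_receiver() -> Iterable[Record]:
    yield {"name": "http_requests_total", "value": 1027.0, "attributes": {}}
    yield {"name": "memory_bytes", "value": 5.5, "attributes": {}}

def add_service_name(record: Record) -> Record:
    # Return a copy with one extra attribute instead of mutating the input.
    attributes = dict(record["attributes"])
    attributes["service.name"] = "checkout"
    return {**record, "attributes": attributes}

sent: List[Record] = []

def collect_exporter(record: Record) -> None:
    # A real exporter would send the record over the network.
    sent.append(record)

if __name__ == "__main__":
    count = run_pipeline(
        [fake_prometheus_receiver], [add_service_name], [collect_exporter]
    )
    print(count, sent)
```

Because every stage is just a function in a list, adding another input format or another destination means appending to a list, not rewriting the service. That is the kind of glue the Collector is meant to provide.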
Currently, we are hard at work with the rest of the OTel community to make the Collector fluent in all of the common metrics protocols, starting with Prometheus.
Here’s how Prometheus support works today and what the roadmap looks like going forward.
To translate Prometheus data into OTLP, Lightstep designed the OpenTelemetry Prometheus Sidecar, which runs alongside a Prometheus server. The sidecar reads the Prometheus WAL (Write Ahead Log) and converts it into OTLP, adding OTLP-supported metadata along the way, such as whether an instrument is a counter or a gauge.
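As a rough illustration of what "adding metadata" means, here is a minimal Python sketch that tags each series as a counter or a gauge. It makes several assumptions. It reads the Prometheus text exposition format, not the WAL the sidecar actually consumes. The OtlpLikeMetric class is a simplified, hypothetical stand-in for the real OTLP protobuf messages. The label parser ignores timestamps, escaping, and commas inside label values. The one mapping it does borrow from OTLP is that a Prometheus counter becomes a monotonic sum, while a gauge stays a gauge.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class OtlpLikeMetric:
    """Hypothetical, simplified stand-in for an OTLP metric."""
    name: str
    kind: str        # "sum" for counters, "gauge" otherwise
    monotonic: bool  # True only for counters
    points: List[Tuple[Dict[str, str], float]] = field(default_factory=list)

def parse_labels(text: str) -> Dict[str, str]:
    # Simplified: assumes no commas or escaped quotes inside label values.
    labels: Dict[str, str] = {}
    for pair in text.split(","):
        if pair:
            key, _, value = pair.partition("=")
            labels[key.strip()] = value.strip().strip('"')
    return labels

def translate(exposition: str) -> List[OtlpLikeMetric]:
    types: Dict[str, str] = {}
    metrics: Dict[str, OtlpLikeMetric] = {}
    for raw in exposition.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# TYPE"):
            # "# TYPE <metric name> <prometheus type>"
            _, _, name, prom_type = line.split(None, 3)
            types[name] = prom_type
            continue
        if line.startswith("#"):
            continue  # HELP lines and other comments
        series, _, value = line.rpartition(" ")
        name, _, rest = series.partition("{")
        labels = parse_labels(rest.rstrip("}"))
        is_counter = types.get(name, "untyped") == "counter"
        metric = metrics.setdefault(
            name,
            OtlpLikeMetric(name, "sum" if is_counter else "gauge", is_counter),
        )
        metric.points.append((labels, float(value)))
    return list(metrics.values())

SAMPLE = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="200"} 3
# TYPE memory_bytes gauge
memory_bytes 5.5
"""

if __name__ == "__main__":
    for metric in translate(SAMPLE):
        print(metric.name, metric.kind, metric.monotonic, metric.points)
```

The type information matters downstream: a backend that knows a series is a monotonic sum can compute rates and handle resets correctly, which it cannot safely do for an untyped stream of numbers.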
So, sending data to Lightstep from an existing Prometheus deployment simply involves adding this sidecar to your Prometheus server deployment. This is helpful, but it still involves running an extra service.
The next step is to move this translation work into the Collector. We are in the process of donating the Prometheus Sidecar to OpenTelemetry and converting it into a Collector component.
By running a Collector as a sidecar, your Prometheus server pool can double as a Collector pool — gaining all of the features that the Collector provides.
After Prometheus WAL support is added, the Collector will move on to supporting OpenMetrics ingestion, as well as Prometheus Remote Write. This support will allow operators to replace their Prometheus servers entirely, reducing their telemetry system down to a single type of service. This will save on overhead and complexity, while keeping the flexibility the Collector provides.
This work is expected to be completed at the end of May. Want to follow along? All of this work is happening in the OpenTelemetry Prometheus SIG, as a collaborative effort between the two projects. If you’d like to get started with Prometheus and Lightstep today, you can find the setup instructions here.