
Getting Started with OpenTelemetry for Java

1 Jul 2020 8:42am, by

New Relic sponsored this post.

Charles Humble
Charles is a remote engineering team leader with experience of both software delivery and content.

OpenTelemetry is an open source telemetry framework created through the merger of OpenTracing and OpenCensus. Aiming to be robust, portable and easy to implement across many languages, it provides a single set of APIs, libraries, agents and collector services to capture distributed traces and metrics from your application. It is also backward compatible with OpenTracing and OpenCensus, meaning you can migrate from either of those projects to OpenTelemetry without any breaking changes.

In effect, OpenTelemetry standardizes what telemetry data looks like, but it doesn’t standardize the analysis tooling. The net result is that different vendors are able to innovate on the analysis component, while development teams can easily shop around and try out different open source and proprietary offerings to find the one that best meets their needs. Longer-term, the hope is that OpenTelemetry instrumentation will also get built into more and more libraries and frameworks, reducing the amount of manual instrumentation developers need to do.

At the end of March, the OpenTelemetry team announced the first beta release and the expectation is that there will be a General Availability release in the second half of 2020. The first beta release includes a specification and SDKs to instrument applications written in Erlang, Go, Java, JavaScript, and Python. Each SDK released by OpenTelemetry contains examples of common use cases to help you get started. These examples provide working code that illustrates how to instrument HTTP/gRPC servers and clients, database connectors, and more.

Given that it remains in beta, it is obviously not yet suitable for production use. But supporting vendors — including Amazon (AWS X-Ray), Dynatrace, Google Cloud Monitoring + Trace, Honeycomb, Lightstep, Microsoft (Azure Monitor), New Relic and Splunk — have been quick to provide open source exporters for some of these SDKs. This means that developers can begin to explore the capabilities that OpenTelemetry provides through those vendors’ tools, as well as by using tools such as Prometheus and Jaeger.

In this article, we’ll look at the background to the project, introduce some key terminology, and go through the basics of both manually and automatically instrumenting a Java application. We’ll be sending the generated data to New Relic One.

Background

Modern internet services are often implemented as complex, large-scale distributed systems that may be constructed from multiple microservices, perhaps even developed by different teams and written in different languages. Often these architectures have hidden dependencies: as Leslie Lamport famously said, “A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.”

In order to understand such a system, a mechanism for distributed tracing — allowing an engineer to follow a single request as it moves across service boundaries — is invaluable. With distributed tracing you can discover the latency within a request, and identify bottlenecks and failures. Along with events, logs, and metrics, it is one of the four key components of observability.

OpenTelemetry includes a standard collector, which can receive data in a variety of wire formats including Jaeger, Zipkin and OpenCensus. The collector can be configured to output in any of these formats, plus commercial observability tools like New Relic One. It can also fan out to multiple destinations.

Much recent work on distributed tracing has been influenced by Dapper, the distributed tracing system originally developed and used by Google. In particular, a great deal of the terminology and corresponding mental models used in OpenTelemetry can be traced back to that project.

As with any technology, in order to get to grips with OpenTelemetry there is a small amount of terminology that it is useful to know:

  • Trace: a record of activity for a request through a distributed system. A trace is a Directed Acyclic Graph of spans.
  • Span: a named, timed operation representing a single unit of work within a trace. Spans can be nested to form a trace tree. Each trace contains a root span, which typically describes the end-to-end latency, and (optionally) one or more sub-spans for its sub-operations.
  • Metric: a raw measurement about a service, captured at runtime. OpenTelemetry defines three metric instruments: counter, measure and observer. An observer supports an asynchronous API that collects metric data on demand, once per collection interval.
  • Context: a span contains a span context, a set of globally unique identifiers representing the unique request that the span is part of. It carries the data required to move trace information across service boundaries. OpenTelemetry also supports the correlation context, which can carry any user-defined properties. Correlation context is not required, and components may choose not to carry or store this information.
  • Context Propagation: the means by which context is bundled and transferred between services, typically via HTTP headers. Context propagation is a key part of the OpenTelemetry system, and has some interesting use cases beyond tracing — for example when doing A/B testing. Note that OpenTelemetry supports multiple protocols for context propagation and to avoid issues, it is important that you use a single method throughout your application. So for example, if you use the W3C specification in one service, you need to use it everywhere in your system. These are the currently supported options:

At the time of writing, the W3C specifications are in the process of being standardized and would be a logical choice for a new project.
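To make context propagation more concrete, the following is a minimal plain-Java sketch of how a span context could be injected into, and extracted from, HTTP headers using the W3C Trace Context `traceparent` header format (`version-traceid-spanid-flags`). It is not the OpenTelemetry API: the class `TraceParentSketch` and its methods are hypothetical names chosen for illustration, and a plain `Map` stands in for real HTTP headers.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

/** Plain-Java sketch of W3C Trace Context propagation; not the OpenTelemetry API. */
final class TraceParentSketch {
    static final String HEADER = "traceparent";
    private static final SecureRandom RANDOM = new SecureRandom();

    final String traceId;  // 32 lowercase hex characters, shared by every span in the trace
    final String spanId;   // 16 lowercase hex characters, identifies the calling span
    final boolean sampled; // lowest bit of the trace-flags field

    TraceParentSketch(String traceId, String spanId, boolean sampled) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.sampled = sampled;
    }

    private static String randomHex(int byteCount) {
        byte[] bytes = new byte[byteCount];
        RANDOM.nextBytes(bytes);
        StringBuilder hex = new StringBuilder(byteCount * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Starts a new trace: fresh trace ID and fresh span ID. */
    static TraceParentSketch newRoot() {
        return new TraceParentSketch(randomHex(16), randomHex(8), true);
    }

    /** Context for a downstream operation: same trace ID, fresh span ID. */
    TraceParentSketch child() {
        return new TraceParentSketch(traceId, randomHex(8), sampled);
    }

    /** Writes the context into outgoing headers as version-traceid-spanid-flags. */
    void inject(Map<String, String> headers) {
        headers.put(HEADER, "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00"));
    }

    /** Reads the context from incoming headers; returns null if absent or malformed. */
    static TraceParentSketch extract(Map<String, String> headers) {
        String value = headers.get(HEADER);
        if (value == null || !value.matches("00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}")) {
            return null;
        }
        String[] parts = value.split("-");
        boolean sampled = (Integer.parseInt(parts[3], 16) & 1) == 1;
        return new TraceParentSketch(parts[1], parts[2], sampled);
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        TraceParentSketch.newRoot().inject(headers);   // calling service
        TraceParentSketch incoming = extract(headers); // receiving service
        TraceParentSketch next = incoming.child();     // span started downstream
        System.out.println(headers.get(HEADER));
        System.out.println("same trace: " + next.traceId.equals(incoming.traceId));
    }
}
```

The point of the sketch is that the trace ID survives the hop between services while each span gets its own ID. This is also why mixing propagation formats breaks a trace: a service that only reads a different header never sees the incoming trace ID and starts a new trace instead.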

Manual Instrumentation

If you have very quick and repeatable services, building custom manual instrumentation may not be necessary. But for longer running services and more complex systems, it might be appropriate. OpenTelemetry offers a tracer to enable custom instrumentation throughout your application, and it is straightforward to use.

Broadly, there are four steps you need to start working with OpenTelemetry: install OpenTelemetry; install instrumentation adaptors; configure the SDK; and decorate your application code.

The following is a simple example that shows how to create a tracer, add a root and two child spans with some attributes, and export that data to New Relic. This example uses Maven, but you can also use Gradle. Please note that the APIs are still under active development and will likely change in future versions.

To run the example you will need a New Relic One account. If you don’t already have one you can sign up for a free 30-day trial account. You will need an Insights Insert API key, which you can get in New Relic One via Account settings… API keys… Insights API keys. Add your key to String apiKey. If you are in the EU datacentre for New Relic you will also need to manually indicate this via a URI override as shown in the code sample. The New Relic documentation includes a complete list of EU API endpoints.

Maven Dependencies

After you run the example, to find your spans go to https://one.newrelic.com/ and select “Distributed tracing” from the home page.

If you click on the root span, labeled “getCustomerOrder”, you will see an expanded view which allows you to see the child spans and other data.
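As a rough illustration of the data behind that expanded view, here is a plain-Java model of a root span with two child spans and some attributes. It deliberately does not use the OpenTelemetry SDK, whose APIs are still changing. The class `SpanSketch`, the child span names (`lookupCustomer`, `fetchOrders`) and the attribute keys are all hypothetical, and timestamps are passed in explicitly to keep the sketch deterministic, where real instrumentation would read a clock.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a trace tree; illustrative only, not the OpenTelemetry SDK. */
final class SpanSketch {
    final String name;
    final SpanSketch parent; // null for the root span
    final List<SpanSketch> children = new ArrayList<>();
    final Map<String, String> attributes = new LinkedHashMap<>();
    final long startMillis;
    long endMillis = -1;     // -1 until end() is called

    private SpanSketch(String name, SpanSketch parent, long startMillis) {
        this.name = name;
        this.parent = parent;
        this.startMillis = startMillis;
        if (parent != null) {
            parent.children.add(this); // nesting is what turns spans into a tree
        }
    }

    static SpanSketch root(String name, long startMillis) {
        return new SpanSketch(name, null, startMillis);
    }

    SpanSketch child(String name, long startMillis) {
        return new SpanSketch(name, this, startMillis);
    }

    SpanSketch attribute(String key, String value) {
        attributes.put(key, value);
        return this;
    }

    void end(long endMillis) {
        this.endMillis = endMillis;
    }

    long durationMillis() {
        if (endMillis < 0) {
            throw new IllegalStateException("span not ended: " + name);
        }
        return endMillis - startMillis;
    }

    /** The direct child that took longest, or null if there are no children. */
    SpanSketch slowestChild() {
        SpanSketch slowest = null;
        for (SpanSketch c : children) {
            if (slowest == null || c.durationMillis() > slowest.durationMillis()) {
                slowest = c;
            }
        }
        return slowest;
    }

    /** Renders the tree with one indented line per span. */
    String render() {
        StringBuilder sb = new StringBuilder();
        render(sb, 0);
        return sb.toString();
    }

    private void render(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(name).append(' ').append(durationMillis()).append("ms ")
          .append(attributes).append('\n');
        for (SpanSketch c : children) {
            c.render(sb, depth + 1);
        }
    }

    public static void main(String[] args) {
        SpanSketch root = SpanSketch.root("getCustomerOrder", 0).attribute("customer.id", "42");
        SpanSketch lookup = root.child("lookupCustomer", 5);
        lookup.end(25);
        SpanSketch orders = root.child("fetchOrders", 30).attribute("db.system", "postgresql");
        orders.end(110);
        root.end(120);
        System.out.print(root.render());
        System.out.println("slowest child: " + root.slowestChild().name);
    }
}
```

The root span's duration is the end-to-end latency of the request, and comparing the child durations is the simplest form of the bottleneck hunting that a tracing UI does for you.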

Auto Instrumentation

As well as supporting manual instrumentation, the OpenTelemetry project includes a Java agent JAR that can be attached to any Java 7+ application. The agent dynamically injects bytecode to capture telemetry from a number of popular libraries and frameworks, allowing developers to gather telemetry data without having to manually instrument their application or, indeed, make any code changes at all.

While the granularity here is less than you would get from manually instrumenting your own code, it does provide a good starting point, and also overcomes the issue of having to manually instrument what is happening in third-party libraries. Moreover, although the OpenTelemetry project is new, the range of supported libraries is already quite extensive, as it builds on the existing work done for OpenCensus and OpenTracing.
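For readers unfamiliar with how a Java agent hooks into an application, the sketch below shows the standard JVM mechanism that agents of this kind rely on: a `premain` method that registers a `ClassFileTransformer` through `java.lang.instrument`. This is not the OpenTelemetry agent's code. The class `AgentSketch` and the package prefixes it matches are hypothetical, and the transformer only records which classes it would instrument, where a real agent would return rewritten bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Sketch of a Java agent entry point; illustrative only, not the OpenTelemetry agent. */
final class AgentSketch {
    // Hypothetical targets, written as JVM internal names ('/' instead of '.').
    static final List<String> TARGET_PREFIXES =
            Arrays.asList("org/springframework/", "java/sql/");

    // Classes the transformer would have instrumented, in load order.
    static final List<String> matched = new CopyOnWriteArrayList<>();

    static boolean shouldInstrument(String internalName) {
        if (internalName == null) { // the JVM may pass null for some generated classes
            return false;
        }
        for (String prefix : TARGET_PREFIXES) {
            if (internalName.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Records matching classes; returning null tells the JVM to keep the bytecode as is. */
    static byte[] observe(String internalName, byte[] classfileBuffer) {
        if (shouldInstrument(internalName)) {
            matched.add(internalName);
            // A real agent would return modified bytecode here, typically
            // produced with a bytecode manipulation library.
        }
        return null;
    }

    static ClassFileTransformer transformer() {
        return new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                return observe(className, classfileBuffer);
            }
        };
    }

    /** Called by the JVM before the application's main method when -javaagent is used. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(transformer());
    }

    public static void main(String[] args) {
        // Simulates two class loads without attaching an agent.
        observe("org/springframework/jdbc/core/JdbcTemplate", new byte[0]);
        observe("com/example/Unrelated", new byte[0]);
        System.out.println("would instrument: " + matched);
    }
}
```

To run as a real agent, a class like this would be packaged in a JAR whose manifest names it in the `Premain-Class` attribute, and the JAR would be passed to the JVM with `-javaagent:`. Because every class passes through the transformer as it loads, the application itself needs no code changes.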

OpenTelemetry includes a simple logging exporter, which provides an easy mechanism to verify that spans are being created by viewing the data in your console. As such it is immensely helpful for debugging.

To try this out for yourself:

Download the latest release and the logging exporter.

Find a suitable app. You can use your own app, or grab something like the Spring Pet Clinic.

Start the application as follows:

You should see telemetry logging information appearing in the console window:

If you scan your console you can see version information:

You can also see a number of spans being created as the Spring Pet Clinic app starts doing database work. For example:

To send this data to New Relic grab the New Relic exporter.

You can build it using:

As with the manual instrumentation example above, you will need a New Relic One account and an Insights Insert API key. The details of how to get these are covered under Manual Instrumentation.

Now, from the command line, run the following using your Insights Insert API key. If your New Relic account is in the EU, you’ll need to include the manual URI override. Otherwise, you can leave that argument out:

If you encounter an issue with this, you can turn on debug logging for the exporter running in the auto-instrumentation agent, using the following system property:

And, if you wish to enable audit logging for the exporter running in the auto-instrumentation agent, use this system property:

To find your spans in New Relic One: go to https://one.newrelic.com/ and select “Distributed tracing” from the home page.

Next Steps

The OpenTelemetry Java QuickStart provides an example of how to work with the tracer.

The New Relic GitHub project has a file, BasicExample.java, which provides example code for how to set up custom telemetry for an application and send it to New Relic. That example includes some topics we’ve not covered here, including metrics. The GitHub project also provides a written example.

Finally, the OpenTelemetry project has an active Gitter where you can engage with the community and find out more.


