Debugging Microservices in the Cloud

Despite the benefits of microservices, they come with unique challenges when it comes to debugging.
Dec 8th, 2020 12:03pm by
Thundra sponsored this post.

Emrah Samdan
Emrah is VP of Product at Thundra. He is enthusiastic about serverless, observability and chaos engineering.

Microservices have quickly grown in popularity as a software architecture for designing applications as suites of independently deployable services.

Unlike traditional monolithic systems, in which an error in one module could cause an entire application to fail, microservices systems use isolated modules and services that function independently. This gives developers more flexibility to edit code without fear of impacting separate modules. If a bug is accidentally introduced, its impact is typically confined to that specific microservice.

Despite the benefits of microservices, they come with unique challenges when it comes to debugging. In this article, we’ll explore methods your developers can use to tackle those challenges. If you’d like to read our deeper exploration on the topic, you can download our free guide here.

The Challenges of Debugging Microservices

The nature of microservices makes debugging tricky: the complexity of the environment, opacity of the infrastructure, and transitioning and scaling from development to production can all introduce problems.

Recreating State in a Complex Environment

Because the state of the application isn’t consolidated in one monolithic system, but is spread across many different microservices, it can be difficult to get a clear view of the system’s state. This is compounded by the fact that microservice systems tend to grow in complexity. As more components are added, a complex web develops: each module runs on its own and might fail without affecting the other modules. Additionally, each service can be written in a different language, log in its own format, and be completely independent of other services.
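One common way to soften the "every service logs differently" problem is to have each service emit logs in a shared, structured shape. The following is a minimal Python sketch of that idea using the standard `logging` module. The field names (`service`, `request_id`) and the `make_logger` helper are assumptions made for illustration, not part of any particular tool.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render every log record as one JSON object with a shared set of keys."""

    def __init__(self, service_name):
        super().__init__()
        self.service_name = service_name

    def format(self, record):
        entry = {
            "service": self.service_name,
            "level": record.levelname,
            "message": record.getMessage(),
            # request_id is attached via `extra=`; it is None when not supplied.
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(entry)


def make_logger(service_name):
    """Hypothetical helper: one logger per service, all sharing the same format."""
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter(service_name))
        logger.addHandler(handler)
    return logger


orders_log = make_logger("orders")
orders_log.info("order created", extra={"request_id": "req-123"})
```

If every service writes the same keys, a log pipeline can filter and join entries across services without a per-service parser.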

Lack of Observability and Tracing

As microservices and serverless models take off, infrastructure is quickly becoming more difficult to understand. Each module, each serverless call, and each cloud component further conceals what is going on beneath the hood. This makes observability and tracing difficult, since developers often can’t infer a microservice system’s internal state from its outputs. Because microservices run independently, it’s hard to track user requests between asynchronous modules and reproduce errors. And because each request might take a different path, it’s also hard to detect which services interact. All these factors mean developers struggle to pinpoint the root cause of an error.
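A widely used technique for following one user request across independent services is to attach an identifier at the edge and forward it on every downstream call. Here is a simplified Python sketch of that pattern. The services are plain functions standing in for network calls, and the header name `X-Request-ID` and the helper `ensure_request_id` are assumptions chosen for illustration.

```python
import uuid

TRACE_HEADER = "X-Request-ID"  # assumed header name for this sketch


def ensure_request_id(headers):
    """Return a copy of headers that is guaranteed to carry a request ID."""
    out = dict(headers)
    out.setdefault(TRACE_HEADER, uuid.uuid4().hex)
    return out


def inventory_service(headers, events):
    headers = ensure_request_id(headers)
    events.append(("inventory", headers[TRACE_HEADER]))
    return {"in_stock": True}


def checkout_service(headers, events):
    headers = ensure_request_id(headers)
    events.append(("checkout", headers[TRACE_HEADER]))
    # Forward the same ID downstream so both services record a matching value.
    return inventory_service({TRACE_HEADER: headers[TRACE_HEADER]}, events)


events = []
checkout_service({}, events)
for service, request_id in events:
    print(service, request_id)
```

Both entries carry the same ID, so searching logs for that value reconstructs which services the request touched.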

Inconsistency from Development to Production

Finally, when moving code from development to production, state and performance issues can be unpredictable and difficult to replicate. Even after unit and integration testing, it’s hard to predict what code will do when processing millions of requests on distributed servers. If the code scales poorly or the underlying database can’t handle the number of requests in production, numerous potential failures could arise — making it difficult to tell which piece of code or database is causing the problem.

Traditional Debugging Methods

Numerous debugging techniques have been developed for monolithic code, where it’s relatively easy to trace and track requests. These include:

APM Solutions

Application performance management (APM) solutions help development teams diagnose performance problems and detect exactly where a problem is occurring. They can also often parse web server logs to calculate the number of requests coming in, monitor error rates, identify slowdowns, and track key application dependencies so your team knows where to look when errors arise.
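To make the log-parsing part concrete, here is a small Python sketch that derives request counts, error rates, and slow-request counts per endpoint. The four-column log format (`METHOD PATH STATUS DURATION_MS`), the sample data, and the 1,000 ms threshold are all assumptions for illustration. Real APM products work with richer formats and far more signals.

```python
from collections import defaultdict

# Assumed simplified access-log format: METHOD PATH STATUS DURATION_MS
SAMPLE_LOG = """\
GET /orders 200 35
GET /orders 500 1200
POST /checkout 200 80
GET /orders 200 45
"""


def summarize(log_text, slow_ms=1000):
    """Count requests, server errors, and slow requests per path."""
    stats = defaultdict(lambda: {"requests": 0, "errors": 0, "slow": 0})
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip lines that do not match the assumed format
        _method, path, status, duration = parts
        entry = stats[path]
        entry["requests"] += 1
        if int(status) >= 500:
            entry["errors"] += 1
        if int(duration) >= slow_ms:
            entry["slow"] += 1
    return {
        path: {**s, "error_rate": s["errors"] / s["requests"]}
        for path, s in stats.items()
    }


report = summarize(SAMPLE_LOG)
for path, summary in report.items():
    print(path, summary)
```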

Log Analyzers

Third-party log analyzers save time by breaking down and processing logs, detecting where bugs or other issues are occurring, providing a performance dashboard, and filing bugs directly into Git repositories. They are helpful in monolithic systems where a request interacts with a single system. However, with microservices, requests are much more difficult to manage and track through logs.

Debugging Dumpsters

Before log analyzers and modern error message handling, developers often relied on memory dumps of an entire system where an error had occurred. This process required reading through messages to pinpoint the source of the error. However, since dumps showed only the memory state at the time of the error, this wasted a lot of time without always revealing the root cause.

Breakpoints for Stopping Code

During development, breakpoints let developers view changes in their application’s state and track what is going wrong. However, it’s infeasible to stop live production code with breakpoints just to examine variables. This makes the method too invasive for production, and it often fails to give developers the information they need to get to the root of the problem.

Microservices Debugging Methods

While the monolithic debugging methods above don’t adapt well to microservices, a few microservices-specific methods have been developed, including:

Non-Breaking, Non-Intrusive Options

Third-party tools let your team set breakpoints that don’t actually pause or halt execution like a standard debugger. They can then view stack traces and global variables during execution, along with individual watch variables. This lets developers non-intrusively test hypotheses about where issues are occurring, without halting code or redeploying/restarting their codebase.
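The general idea of a non-breaking breakpoint can be illustrated in a few lines of Python: copy a function's local variables while it runs, without pausing it. The sketch below uses the standard `sys.settrace` hook. It is only an illustration of the concept, not how Thundra or any other product implements it, and the names `snapshot_on_return` and `apply_discount` are hypothetical. Tracing hooks like this add overhead, so production tools rely on more careful instrumentation.

```python
import sys


def snapshot_on_return(target_name, snapshots):
    """Build a trace function that copies a function's locals when it returns."""

    def tracer(frame, event, arg):
        if frame.f_code.co_name != target_name:
            return None  # do not trace inside unrelated functions
        if event == "return":
            # Copy the values; execution continues without stopping.
            snapshots.append({"locals": dict(frame.f_locals), "returned": arg})
        return tracer

    return tracer


def apply_discount(price, percent):
    discount = price * percent / 100
    total = price - discount
    return total


snapshots = []
sys.settrace(snapshot_on_return("apply_discount", snapshots))
try:
    result = apply_discount(200, 10)
finally:
    sys.settrace(None)  # always remove the hook again

print(result)
print(snapshots[0]["locals"])
```

The function returns its normal result, and the snapshot holds the intermediate values (`discount`, `total`) a developer would otherwise inspect at a paused breakpoint.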

Observability into Code

Tracking requests passing through increasingly complex systems is a major problem with microservices systems, and creating a customized observability platform would consume valuable development time and resources. Fortunately, modern third-party tools can track requests and provide code observability for microservices, along with serverless and distributed computing. For example, using Thundra, you can observe user requests moving through your complex infrastructure in production. By letting you observe what is happening in your coding environment at a request level, these tools help developers pinpoint bugs quickly and easily.
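Request-level tracing tools generally record a "span" for each unit of work and link spans through parent IDs. As a rough illustration of what that data makes possible, the Python sketch below rebuilds the path of one request from a flat list of spans. The span fields (`span_id`, `parent_id`, `service`, `ms`) and the sample values are made up for this example and do not reflect any specific vendor's format.

```python
from collections import defaultdict

# Hypothetical spans recorded for a single request.
spans = [
    {"span_id": "a", "parent_id": None, "service": "gateway", "ms": 120},
    {"span_id": "b", "parent_id": "a", "service": "checkout", "ms": 95},
    {"span_id": "c", "parent_id": "b", "service": "inventory", "ms": 20},
    {"span_id": "d", "parent_id": "b", "service": "payments", "ms": 60},
]


def render_trace(spans):
    """Return indented lines showing which service called which."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            lines.append(f"{'  ' * depth}{span['service']} ({span['ms']} ms)")
            walk(span["span_id"], depth + 1)

    walk(None, 0)  # root spans have no parent
    return lines


for line in render_trace(spans):
    print(line)
```

Laid out this way, it becomes visible that most of the checkout time in this sample is spent in the payments call.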

Autonomous Exception Capture

Part of the battle with debugging in production is realizing there is a problem in the first place. Systems that automatically capture and track exceptions as they occur can identify patterns or repetitive bad behaviors that occur infrequently, such as an error due to a particular browser version, a strange stack overflow every 30 clicks, or even a daylight saving time or leap year error.

Capturing errors isn’t enough. A system that also tracks logs and variables to demonstrate when the error occurred helps your team replicate it easily. Whether or not you are using microservices, the ability to automatically track and identify patterns can drastically simplify debugging in production.
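One simple way to spot recurring errors is to group exceptions by a "fingerprint" and keep the context that accompanied each occurrence. The Python sketch below shows the idea. The fingerprint scheme (exception type plus the function and line where it was raised) and the `ExceptionTracker` class are assumptions for illustration. Real error-tracking systems use more robust grouping.

```python
import traceback
from collections import Counter


def fingerprint(exc):
    """Identify an exception by its type and the location where it was raised."""
    frames = traceback.extract_tb(exc.__traceback__)
    last = frames[-1] if frames else None
    location = f"{last.name}:{last.lineno}" if last else "unknown"
    return f"{type(exc).__name__}@{location}"


class ExceptionTracker:
    """Hypothetical in-memory tracker that counts errors and stores context."""

    def __init__(self):
        self.counts = Counter()
        self.context = {}

    def capture(self, exc, **context):
        key = fingerprint(exc)
        self.counts[key] += 1
        self.context.setdefault(key, []).append(context)
        return key


def parse_quantity(raw):
    return int(raw)


tracker = ExceptionTracker()
requests = [("3", "firefox"), ("x", "safari-12"), ("7", "chrome"), ("", "safari-12")]
for raw, browser in requests:
    try:
        parse_quantity(raw)
    except ValueError as exc:
        tracker.capture(exc, raw=raw, browser=browser)

for key, count in tracker.counts.most_common():
    print(key, count, tracker.context[key])
```

In this made-up data, both failures share one fingerprint and one browser value, which is the kind of pattern that is hard to notice when errors are reviewed one at a time.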

Debugging with Thundra

Debugging microservices doesn’t have to be hard. Thundra was designed to help you debug code easily in serverless and microservices architectures. With live debugging, developers can pause and play the Lambda invocation from their own IDE, while offline debugging (“time-travel” debugging) lets developers step through code minutes, hours, and days after execution finishes, to better understand its flow and detect unexpected behavior.

For better observability, Thundra’s debugger correlates all three pillars of observability: traces, metrics and logs. For example, Thundra offers traces that provide end-to-end visibility for a request throughout a microservice system, to help detect bugs, bottlenecks and other exceptions.

Thundra even lets you drill down into a specific request to access variable values, find out which microservices were involved, and correlate similar information about traces and logs, logs and metrics — or all three. Your team can then work backward to quickly and non-invasively identify a bug’s root cause in production.

Thundra’s debugging functionality is currently being developed to help debug any type of application, at any stage. Developers will soon be able to use Thundra to set tracepoints on the code and take snapshots that are integrated with the traces produced — reducing debugging time from hours to minutes, before and after pushing changes to production. You can sign up with Thundra Sidekick now and start debugging microservices right away.

Debugging Cloud Apps Doesn’t Have to Be Hard

Debugging has always been a complex art, and this is even more true today with modern microservices. Tracing requests and predicting how code will scale is not simple. Add in the challenges of production debugging and the task becomes even more daunting — but not impossible, with modern tools to make the job easier.

Third-party tools like Thundra help track requests and increase observability. They also provide in-production breakpoints and correlate logs with metrics and tracing.

Microservice architectures let you deploy more quickly and pivot with ease. And with the right tools, the challenges of debugging shouldn’t outweigh these advantages.

Feature image via Pixabay.
