Thundra sponsored this post.
While the software development process has evolved to include several techniques that help spot defects early, it’s still impossible to predict when and how a system will fail. The question is no longer whether defects will occur, but when, and how to find and fix them quickly enough to limit their impact.
Bugs are unavoidable, no matter how skilled you are. You may follow all the best software development practices, define and call all functions correctly, and write tests. Still, something will eventually go wrong and you’ll need to find out why.
When building an application or working on a software project, you can be creative about fancy algorithms, application infrastructure, or user interface design. But you must thoroughly debug your code to ensure that it works as expected. In this post, you will learn why debugging in a local environment is not enough to troubleshoot issues, and why you should add remote debugging to your toolset.
Debugging in a Local Development Environment
Historically, developers have debugged code in their local development environment. Local debugging is effective during development, but it falls short when troubleshooting issues in the cloud because of the disparity between the two environments.
Debugging means different things to different people. For developers, it’s a way of removing bugs from code. When you debug, you’re examining the cause of the deviation between your code’s actual and expected behavior. You may do this by reviewing your code, setting breakpoints, and evaluating your app’s methods and variables to determine the reason for the deviation. Debugging matters because it enables you to build better applications and reduces the number of bugs that reach end users.
Because debugging is closely tied to development, IDEs and code editors have built various mechanisms to allow you to write code and debug it locally. Most programming languages also have debuggers that integrate with various IDEs to accomplish the same task. For example, Python has pdb and Ruby has Byebug and Pry.
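As a small illustration of this kind of local debugging, the sketch below uses Python’s built-in `breakpoint()` hook, which opens pdb by default. The function `average_latency` and the `DEBUG_LATENCY` environment variable are hypothetical names invented for this example.

```python
import os


def average_latency(samples_ms):
    """Return the mean of a list of latency samples, in milliseconds."""
    if os.environ.get("DEBUG_LATENCY") == "1":
        # Pauses here and opens the pdb prompt (unless PYTHONBREAKPOINT
        # points elsewhere), so samples_ms can be inspected interactively.
        breakpoint()
    if not samples_ms:
        return 0.0
    return sum(samples_ms) / len(samples_ms)


print(average_latency([120, 80, 100]))
```

Running the script with `DEBUG_LATENCY=1` stops inside the function. At the pdb prompt, `p samples_ms` prints the argument, `n` steps to the next line, and `c` resumes execution.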
As modern applications become increasingly complex, debugging solely in a local environment has proven to be inadequate.
Why Debugging Applications in a Local Environment Is Not Sufficient
When an application is under development, the test scenarios are clear and the environment is known. An app in production is a different ball game. The unpredictable nature of production systems makes it difficult to reproduce an error locally and simulate the exact conditions that triggered it. Because developer environments aren’t the same as production environments, you might find it challenging to capture the actual state that led to a bug, including the variable values, the specific threads involved, and everything else your code was doing at the time.
In addition, some bugs only appear once code is running in the cloud environment and cannot be reproduced in development environments. This makes it hard for developers to predict every error that may occur when code is pushed to a cloud service like AWS.
Modern applications often use third-party APIs that are either mocked or skipped during local development, even though the application needs them to function as a whole. Because these APIs are under the control of a third party, they come with challenges: you cannot control the code that comprises their logic, the server that hosts them, or how they handle the data transferred between your application and the API.
Bottom line: debugging applications locally is essential, but not enough to find and troubleshoot all issues. This is especially true in microservices or serverless architectures, where several services run and communicate among themselves. To drastically reduce a bug’s time to resolution in cloud native applications, developers need to embrace a new debugging mindset.
Why Debugging Apps in the Cloud Is Hard
Debugging code in development environments can be a lot easier than in the production environment. In development environments, you have the comfort of your IDE and debugger. You can also set breakpoints, step into your code, and ship fixes.
But when the app goes to the cloud, all sorts of wild things can happen. Many unexpected factors — like high concurrent usage, high scalability, unpredicted system behaviors, and memory leaks — can make your code unstable. Some issues may occur intermittently, while others occur only when you restart an application. This unpredictable nature makes it difficult to reproduce and debug cloud environment bugs.
Debugging is a crucial part of an application’s lifecycle, but because the real habitat of modern applications is unpredictable and hard to reproduce, it can be challenging to prepare for issues. Whenever end users report that a production system isn’t working as expected, it’s often difficult to understand the error’s root cause. Because bugs are inevitable and business uptime is critical, a bug in production means a loss of money, time, or (even worse) an outage of critical services. A single hour of downtime is estimated to cost $68,000 in revenue. To avoid this kind of loss, you need to resolve production bugs and resume normal business operations as quickly as possible.
Developers often find it hard to debug modern apps in the cloud because the information needed to detect and fix bugs in such applications is spread across multiple sources, such as log files. Sifting through such files is like looking for a needle in a haystack.
The Problem with Sifting Through Logs
Historically, logs have been crucial for debugging production issues, and the emergence of microservices architecture makes them more critical than ever. A microservices architecture offers many benefits, like the ability to accelerate feature delivery, adopt different technology stacks, scale up and down, adopt agile development, and deploy each node independently. But it has also brought new challenges in troubleshooting problems and understanding an application’s behavior.
In a microservices architecture, the information needed to debug errors may be distributed across many disparate sources, like local variables, call stacks, database queries, event queues, and log files. Because each microservice implements its own event-logging methods, you eventually end up with an avalanche of log data that is difficult to sift through.
As the number of microservices grows, logs become increasingly difficult to manage and expensive to store. If you don’t properly structure and aggregate the log entries from each service instance, you might not be able to identify what errors occur in your application or why. For instance, you could discover a severe event in your logs, but how would you know where the entry originated and which service produced it? Without proper context, tracing through log files offers little value.
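One common way to give log entries that context is to emit them as structured JSON carrying the service name and a per-request identifier. The following is a minimal sketch using Python’s standard `logging` module. The names `JsonFormatter`, `build_logger`, `checkout-service`, and the `request_id` field are assumptions made for illustration, not a prescribed schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object that carries service context."""

    def __init__(self, service_name):
        super().__init__()
        self.service_name = service_name

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "service": self.service_name,
            # request_id is attached per call through the `extra` argument.
            "request_id": getattr(record, "request_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def build_logger(service_name, stream=None):
    """Return a logger whose only handler writes JSON lines (stderr by default)."""
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service_name))
    logger.handlers.clear()
    logger.addHandler(handler)
    return logger


logger = build_logger("checkout-service")
logger.error("payment declined", extra={"request_id": "req-42"})
```

Each line now answers the two questions above on its own: the `service` field says which service produced the event, and `request_id` ties it to one operation.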
Many things can go wrong in a microservices architecture. With so many components working together, it becomes impractical to reproduce a bug and trace it to the root cause, especially when you need to track a single operation across multiple logs or across systems that communicate with each other. In summary, sifting through logs is not enough to pinpoint some issues in production systems, because the logs may lack the contextual data that reveals the root cause of a bug.
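To make the idea of tracking a single operation across multiple logs concrete, here is a small sketch that groups JSON log lines from several services by a shared request identifier. It assumes every service writes a `request_id` and an ISO 8601 UTC `timestamp` in the same format, which real systems often do not guarantee. The function name and the sample entries are hypothetical.

```python
import json
from collections import defaultdict


def group_by_request(log_lines):
    """Group JSON log lines by request_id, sorted by timestamp within each group."""
    traces = defaultdict(list)
    for line in log_lines:
        entry = json.loads(line)
        traces[entry["request_id"]].append(entry)
    for entries in traces.values():
        # ISO 8601 UTC timestamps in one format sort correctly as plain strings.
        entries.sort(key=lambda e: e["timestamp"])
    return dict(traces)


# Hypothetical entries, as if collected from three separate services' logs.
collected = [
    json.dumps({"timestamp": "2024-01-01T10:00:02Z", "service": "payment",
                "request_id": "req-42", "message": "card declined"}),
    json.dumps({"timestamp": "2024-01-01T10:00:00Z", "service": "gateway",
                "request_id": "req-42", "message": "POST /checkout"}),
    json.dumps({"timestamp": "2024-01-01T10:00:01Z", "service": "checkout",
                "request_id": "req-42", "message": "charging card"}),
    json.dumps({"timestamp": "2024-01-01T10:00:01Z", "service": "gateway",
                "request_id": "req-43", "message": "GET /health"}),
]

for entry in group_by_request(collected)["req-42"]:
    print(entry["timestamp"], entry["service"], entry["message"])
```

Even with this grouping, the result only shows what each service chose to log. It does not show variable values or the call stack at the moment of failure, which is the gap described above.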
A Better Way to Debug Applications in Cloud Environments
Although there’s no perfect tool that will magically tell you what’s gone wrong in your code, application development has evolved. There is a better approach to debugging your Lambda functions or your code running remotely in EKS, EC2, Fargate, or elsewhere in the cloud.
Remote debugging means debugging an application that is running outside your local environment. It’s usually done by connecting the remote app to your local environment with a debugger, provided by your IDE or your language’s tooling, that supports remote debugging.
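As one possible illustration of that connection step, the sketch below uses `debugpy`, a third-party Python package (installed with `pip install debugpy`) that lets an IDE attach to a running process over the network. The environment variables `REMOTE_DEBUG_PORT` and `REMOTE_DEBUG_WAIT` and the function name are assumptions made for this example.

```python
import os


def enable_remote_debugging():
    """Start a debug server if REMOTE_DEBUG_PORT is set. Return True if started."""
    port = os.environ.get("REMOTE_DEBUG_PORT")
    if not port:
        return False
    try:
        import debugpy  # third-party dependency, imported only when needed
    except ImportError:
        return False
    # Listen on all interfaces so a debugger on another machine can attach.
    debugpy.listen(("0.0.0.0", int(port)))
    if os.environ.get("REMOTE_DEBUG_WAIT") == "1":
        debugpy.wait_for_client()  # block until a debugger attaches
    return True


enable_remote_debugging()
```

An open debug port gives whoever can reach it control over the process, so in practice it should only be reachable through a restricted network path such as an SSH tunnel. Note also that breakpoints set this way pause the process while it is being inspected.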
Remote debugging techniques help you save both money and time by allowing you to directly inspect the code of an application as it runs in a different server environment or on a remote machine. They let you debug efficiently, especially in scenarios where regular debugging is impossible, and without digging through huge log files or replicating the cloud environment locally.
With remote debugging, you don’t need to install expensive hardware or fly to your server location to debug an application. You can do everything remotely without unnecessary complications.
By allowing you to set non-breaking breakpoints in applications on the cloud, remote debugging provides all the benefits of logging, while eradicating the weaknesses associated with the logging approach. Remote debugging enables you to:
- Access the application state you need to debug, including the full stack trace and the global and local variables.
- Fix errors in a cloud application remotely and quickly, without restarting or redeploying your application.
- Collaborate in real time with your team members to resolve bugs faster.
- Get data from your code and debug issues without logging extensive data that slows down or disrupts your app’s performance.
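The idea behind a non-breaking breakpoint can be sketched in a few lines: capture the local variables and call stack at a chosen point, then let execution continue instead of pausing. The code below is a simplified illustration of that idea only and is not how any particular product implements it. `snapshot`, `SNAPSHOTS`, and `apply_discount` are hypothetical names.

```python
import inspect
import traceback

SNAPSHOTS = []  # hypothetical in-memory sink; a real tool would ship these elsewhere


def snapshot(label):
    """Record the caller's local variables and call stack without pausing."""
    caller = inspect.currentframe().f_back
    SNAPSHOTS.append({
        "label": label,
        "function": caller.f_code.co_name,
        "line": caller.f_lineno,
        # Store repr() strings so no live object references are kept.
        "locals": {name: repr(value) for name, value in caller.f_locals.items()},
        "stack": traceback.format_stack(caller),
    })


def apply_discount(price, percent):
    discounted = price - price * percent / 100
    snapshot("after-discount")  # execution continues immediately
    return round(discounted, 2)


apply_discount(80.0, 25)
print(SNAPSHOTS[-1]["locals"])
```

Unlike the hand-placed call here, a remote debugging tool would typically let you add and remove such capture points while the application is running, without editing or redeploying the code.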
If you’re a developer, site reliability engineer, or engineering manager, you know that debugging is essential. You also understand that while local debugging is useful, remote debugging, when done right, will save you time, money, and headaches. Given the difficulty of sifting through many logs and the inefficiency of debugging production bugs in a local development environment, there’s no better time to adopt remote debugging than now.
Thundra will soon support debugging applications in production through non-breaking tracepoints. Our approach will integrate remote debugging with our strongest muscle, distributed tracing. Claim your early access today.
Amazon Web Services is a sponsor of The New Stack.
Feature image via Pixabay.