The rise of microservices and containers, cloud-hosted CI/CD systems and serverless platforms gives developers more opportunities to build rich and powerful services. But debugging those services is becoming increasingly complicated, and as deployment frequency goes up, the mean time to recovery from problems tends to go up too. Part of that is the complexity of the environment and the time it takes to set up a developer environment that accurately mimics production, including all the other microservices, containers and external dependencies, whether that’s third-party code or just code that particular developer doesn’t work on.
But it’s also the difficulty of debugging against live code: being able to set breakpoints and quickly deploy code right from the development environment. Containers offer the promise of having the same environment for development and production, but setting that up can be a lot of work; connecting the two for debugging is even more complex. One approach is a service like Azure Dev Spaces, which creates sandboxed environments for developers in Azure Kubernetes Service (AKS) and uses service mesh routing to sync local code with the production environment so they can debug against the live application.
There are a lot of other environments beyond AKS where developers want a better debugging experience, which is why OzCode is bringing its popular debugging extension for Visual Studio to the Azure DevOps service (previously known as VSTS), which integrates multiple cloud and third-party developer services. OzCode’s “Debugging-as-a-Service” toolset integrates with Azure Pipelines to help developers share a debugging session, including iterating through code hosted in the cloud service and trying out different code fixes before committing and deploying them.
“About 50 percent of the average developer’s time is spent debugging,” OzCode Chief Technology Officer Omer Raviv told The New Stack. “That’s always held true but it’s becoming even more frustrating because those lines of code are no longer running on the developer’s personal machine. They’re running in the cloud and the developer needs to understand where things went wrong in a different system. For a cloud native developer, when a user presses a button in their app that single button press sends them through different microservices and serverless functions. The world of software is increasingly distributed and complicated, the complexity of debugging is increasing and the problem will only be exacerbated with more enterprises implementing microservices.”
Like the Visual Studio extension, the Azure DevOps OzCode tools aim to help developers doing root cause analysis, only against code that’s running in Azure DevOps rather than on their own computer. They can still step through the code and inspect memory in a rich visual debugger with scoping and color coding, looking at the contents of variables as they go. That includes all the previous values of a variable, which can quickly show which iteration through the code caused the problem. This kind of time travel debugging (which Visual Studio and Visual Studio Code users will be familiar with for local code) is a big help for finding the exact point where things started going wrong.
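The idea of keeping every previous value of a variable can be illustrated with a small Python sketch. This is not OzCode's implementation, which targets .NET and whose internals are not described here; the function names `run_with_history` and `first_bad_iteration` and the sample data are hypothetical, chosen only to show how a recorded history points to the iteration where a value first goes wrong.

```python
from typing import Callable, List, Optional, Tuple

History = List[Tuple[int, int]]


def run_with_history(values: List[int]) -> History:
    """Sum the values, recording (iteration, running_total) after each step."""
    history: History = []
    total = 0
    for iteration, value in enumerate(values):
        total += value
        history.append((iteration, total))
    return history


def first_bad_iteration(
    history: History, is_valid: Callable[[int], bool]
) -> Optional[int]:
    """Return the first iteration whose recorded value fails the check."""
    for iteration, recorded in history:
        if not is_valid(recorded):
            return iteration
    return None


# Hypothetical data: the running total should never drop below zero.
history = run_with_history([5, 3, -20, 4])
bad = first_bad_iteration(history, lambda total: total >= 0)
print(bad)  # 2
```

With the full history in hand, the developer does not need to re-run the loop with a breakpoint on each pass; the failing iteration falls out of a single scan.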
“We can pinpoint the exact moment in time in a complicated, distributed cloud execution where things started going awry,” Raviv claims. In some cases, the tool will be able to find that point on its own. “If there was an error, and the software threw an exception, we can do a bunch of heuristics to figure out the right moment and give the developer enough information to understand the problem. If it’s a more subtle failure where there isn’t an exception, we let the developer put their own instrumentation in to find that moment in time.”
The cloud debugger also lets developers go forward in time to try out their proposed fix live, which Raviv calls “pre-bugging.” “We’re capturing a snapshot of the moment before the code failure happens. We can take a given moment in the program execution and extrapolate from that what is going to happen next.” After capturing the execution with OzCode, the developer can experiment with different approaches by changing the code and stepping through it to see what happens. “You can attempt to fix the bug in the same snapshot with the same memory state that reproduces what happens in production. That eliminates the lengthy process of adding more logging and waiting so long between each iteration to get more information about the root causes.”
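As a rough illustration of trying a fix against the same captured state, the Python sketch below replays two versions of a function against one snapshot. It is a toy model under stated assumptions, not how OzCode captures or replays production memory; `replay`, `buggy_average`, `fixed_average` and the snapshot contents are all hypothetical.

```python
import copy
from typing import Any, Callable, Dict, Tuple

Snapshot = Dict[str, Any]


def buggy_average(state: Snapshot) -> float:
    # Fails when no items have been counted yet.
    return state["total"] / state["count"]


def fixed_average(state: Snapshot) -> float:
    # Candidate fix: treat an empty count as an average of zero.
    return state["total"] / state["count"] if state["count"] else 0.0


def replay(snapshot: Snapshot, step: Callable[[Snapshot], Any]) -> Tuple[str, Any]:
    """Run a step against a private copy so the snapshot stays reusable."""
    try:
        return ("ok", step(copy.deepcopy(snapshot)))
    except Exception as exc:
        return ("error", type(exc).__name__)


# Hypothetical state captured just before the failure.
snapshot = {"total": 0, "count": 0}

before = replay(snapshot, buggy_average)
after = replay(snapshot, fixed_average)
print(before)  # ('error', 'ZeroDivisionError')
print(after)   # ('ok', 0.0)
```

Because each replay works on a copy, the same snapshot can be reused for as many candidate fixes as needed without waiting for the failure to recur.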
Those point-in-time snapshots of code execution, both of the live code and the proposed fix running in the same environment, can be shared with colleagues to make it easier to collaborate on a problem by pointing them to exactly where the problem occurs. The chat panel at the side of the cloud debugger lets developers do the usual things like adding comments and mentioning colleagues. They can also post links to a specific execution snapshot so the other developers can review it, and suggest the code changes they think will fix the bug; a suggestion shows both a side-by-side diff of the old and new code with changes highlighted and the execution of that code showing the change in action. Code owners can then choose to approve the code fix and merge it into the master branch for deployment.
“It’s a pull request model for developers working together to understand the background,” Raviv suggests, “but it’s a pull request making changes not to static source but to the dynamic nature of the execution that happens in the cloud.”
That collaboration can include as many developers as necessary; because they’re looking at a snapshot, it’s not necessary to give everyone access to the production environment to collaborate on debugging. The downside is that everyone involved is able to see the code and the potentially sensitive personal information that’s flowing through that live production environment; crash dumps often contain information like customer email addresses and phone numbers.
The Azure DevOps integration lets you restrict access to projects and source code through the same mechanisms you’re already using for the CI/CD workflow. OzCode also allows admins to create policies for automatically redacting the contents of variables that hold personal information like names and addresses in the cloud debugger, and provides an audit trail for who has seen the code and content.
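A redaction policy combined with an audit trail might look something like the following Python sketch. The article does not describe how OzCode defines or applies its policies, so everything here is an assumption for illustration: the name-matching rule `SENSITIVE_NAME`, the `<redacted>` placeholder, and the `view_snapshot` helper are hypothetical.

```python
import re
from typing import Any, Dict, List

# Assumed policy: redact any variable whose name hints at personal data.
SENSITIVE_NAME = re.compile(r"(name|email|phone|address)", re.IGNORECASE)

audit_log: List[Dict[str, str]] = []


def redact(variables: Dict[str, Any]) -> Dict[str, Any]:
    """Replace the values of sensitive-looking variables with a placeholder."""
    return {
        key: "<redacted>" if SENSITIVE_NAME.search(key) else value
        for key, value in variables.items()
    }


def view_snapshot(user: str, variables: Dict[str, Any]) -> Dict[str, Any]:
    """Record who looked at which variables, then return the redacted view."""
    audit_log.append({"user": user, "variables": ",".join(sorted(variables))})
    return redact(variables)


# Hypothetical snapshot contents.
shown = view_snapshot(
    "alice", {"customer_email": "a@example.com", "retry_count": 3}
)
print(shown)  # {'customer_email': '<redacted>', 'retry_count': 3}
```

Matching on variable names is a deliberately crude rule: it would also redact a harmless variable such as `filename`, so a real policy would likely need more precise criteria.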
The Azure DevOps integration also lets developers and admins correlate bugs and debugging sessions to work items. If there are regression bugs that creep back into the code after they’ve been resolved, they can go back and look at what the problem was and how it got fixed last time rather than having to start debugging again from scratch or digging through code comments and old PRs.
Integration with Microsoft’s live coding collaboration tool Visual Studio Live Share is also on the roadmap. The OzCode collaboration complements this rather than competing with it because it adds asynchronous collaboration, Raviv says. “We’re finding that when two people are trying to debug a problem together it’s cumbersome; one person’s notion of where to look and what to explore can be very different from the other person’s and we see people struggle for the wheel. We’re trying to create a mode where people can be looking at the same problem and exploring different ways to solve it using the notion of pre-bugging. They could go into different moments in time and experiment with different things and still have a meaningful way to collaborate and share their hypothesis for why the code is not doing what they expect it to do.”
Feature image: OzCode’s collaboration environment combines chat, code review and interactive debugging of a snapshot of the live production environment.