How to Keep Your Developers from Idling in Tech Debt
Developers want to work on meaningful projects. They also want autonomy and work-life balance, along with using the best languages, frameworks and tools. Without these basics, you can’t attract the quality of talent you need on your team.
Hiring great developers without proper tools is like having a pro race car driver on a tricycle. However, developer productivity is more than just giving them the fastest possible cars to drive. It’s also about the roads you’re asking them to drive on. Technical debt is like potholes in your development process: the ugliest parts and unfinished business of all of your codebases and the various obstacles that even your smartest, most dedicated developers loathe encountering. According to a recent poll by Stepsize, more than half (51%) of engineers have left a company or considered quitting due to a large amount of technical debt.
The good news is that as software reliability practices mature, new conventions of service-level objectives (SLOs) and error budgets are giving organizations better signals for recognizing when it’s time to pay down that technical debt. These are modern optimization strategies you should be taking advantage of to provide better visibility, context and urgency to your tech debt.
In this article, I’ll explore two common types of technical debt, “drag” debt and “future-scale” debt, and discuss how to apply modern reliability techniques to recognize when to pause feature work and focus on paying off debts, ideally before your team starts looking for new jobs.
Drag Debt: Recognizing When to Tear Up the Road and Repave It
When many people work in the same codebase for long enough, its surface fills with patches and potholes and loses its smoothness. Developers get held back by this. There are more bugs, and fixing them carries a high risk of regression. There is spaghetti code: too many conditionals, functions longer than a screen, and too many different developers touching too many lines of code.
Drag debt happens to the most traveled roads: your most frequently changed components, which no one wants to drive on because they are in terrible shape. It has a genuinely negative impact on developer morale.
Developers tend to know what the culprit is when the pager goes off: It’s that problem component again, it’s that noisy one. When there’s an accident at a particular intersection, ambulances know what they are about to respond to because they’ve been at that intersection recently.
Management can’t hear the wheels squeaking because there are too many wheels. It’s not just one road or one intersection; it’s all the roads leading into the entire city. So how can you tell when it’s time to pay the high price of closing the road, tearing it up, repaving and resurfacing it?
Error budgets will allow you to see when components in your stack or whole services in your architecture start decaying. They get janky and start misbehaving operationally as well as in development.
Error budgets are a good way to quantify the impact of drag debt. Of everyone who tried to travel each road, how many succeeded? For a service, you can measure some straightforward targets, such as total uptime, the ratio of successful to unsuccessful requests, request latency, and whether APIs are missing their targets. Service-level objectives (SLOs) let you set a floor for the expected performance of each service and API, and the error budget is the amount of failure that floor still tolerates. When drag debt is burning through that budget, it’s time to pay down the debt.
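To make the arithmetic concrete, here is a minimal sketch of how an error budget can be derived from an SLO target and request counts. The `SloWindow` class, its field names and the 99.9 percent target are illustrative assumptions, not something prescribed by any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    """Request counts for one service over one SLO window (hypothetical shape)."""
    total_requests: int
    failed_requests: int
    slo_target: float  # e.g. 0.999 means 99.9% of requests should succeed


def error_budget_remaining(window: SloWindow) -> float:
    """Return the fraction of the error budget still unspent.

    1.0 means nothing has been spent, 0.0 means the budget is exhausted,
    and a negative value means the SLO has been violated.
    """
    # The budget is the number of failures the SLO still tolerates.
    allowed_failures = window.total_requests * (1.0 - window.slo_target)
    if allowed_failures == 0:
        # A 100% target (or no traffic) leaves no room for any failure.
        return 1.0 if window.failed_requests == 0 else float("-inf")
    return 1.0 - window.failed_requests / allowed_failures


def time_to_pay_down_debt(window: SloWindow, floor: float = 0.0) -> bool:
    """Signal that feature work should pause once the budget drops to the floor."""
    return error_budget_remaining(window) <= floor
```

With one million requests and a 99.9 percent target, roughly 1,000 failures are tolerated, so 250 failures leave about three quarters of the budget. The `floor` parameter is also an assumption: a team might choose to start the conversation well before the budget reaches zero.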
Future-Scale Debt: Give Yourself Signals to Know When It’s Time to Pay
Some forms of tech debt are intentional deferment. The SSL cert expiration is an excellent example of this. Your cert is good for 12 months. There’s nothing you can do to pay it off early; this has to wait until the time of renewal. If you don’t pay it off at that time, your encryption and authentication will be at risk. There’s nothing subjective about this type of tech debt. It’s tied directly to a potential outage in the future if you don’t pay it on time.
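A debt with a fixed due date is the easiest kind to track. The sketch below assumes you already know the certificate’s expiry date and simply reports whether renewal falls inside a planning window. The function names and the 30-day lead time are hypothetical choices for illustration.

```python
from datetime import datetime, timezone


def days_until_expiry(not_after: datetime, now: datetime) -> int:
    """Whole days left before the certificate expires (negative if expired)."""
    return (not_after - now).days


def renewal_due(not_after: datetime, now: datetime, lead_days: int = 30) -> bool:
    """True once the expiry date is within the planning lead time.

    lead_days is an assumed planning window, not a standard value.
    """
    return days_until_expiry(not_after, now) <= lead_days


# Example: a certificate that expires on 1 June 2025, checked on 12 May 2025.
expiry = datetime(2025, 6, 1, tzinfo=timezone.utc)
checked_on = datetime(2025, 5, 12, tzinfo=timezone.utc)
print(days_until_expiry(expiry, checked_on), renewal_due(expiry, checked_on))
```

A check like this belongs in a planning report or dashboard rather than on a pager, since the due date is known long in advance.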
Peer into the tech debt backlog at the average company and you’ll find many items tied to future scale, with far more ambiguous timetables for when they should be addressed.
When you launch a net-new SaaS offering, your first waves of product development will emphasize features that attract your first thousands of customers. At some point, you realize that you’ll need to scale the service and, say, run in multiple geographic regions so it stays fast for users everywhere.
You’re anticipating a future need, but how do you know when it’s time to build out those regions in Europe, Asia-Pacific or the Middle East-North Africa? You have to insert triggers that remind you that you need to do the work, that it’s time to pay down this tech debt.
One way to do this is to set up metrics that tell you when more than, say, 5 percent of your traffic is coming from these regions, alongside the latency those users experience. Crossing that threshold triggers your planning cycles, so you can pull your product team aside and say, “Remember a year ago we decided to bring our product to market quickly and deferred the technical debt to scale the service geographically? Now it’s time to pay that debt.”
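As a rough sketch of such a trigger, the function below takes the region of origin for each request in a reporting period and returns the regions outside your home region whose share of traffic exceeds a threshold. The region labels, the function name and the input shape are assumptions for illustration; the 5 percent default mirrors the example figure above.

```python
from collections import Counter
from typing import Dict, Iterable


def regions_over_threshold(
    request_regions: Iterable[str],
    home_region: str,
    threshold: float = 0.05,
) -> Dict[str, float]:
    """Return each non-home region whose traffic share exceeds the threshold."""
    counts = Counter(request_regions)
    total = sum(counts.values())
    if total == 0:
        # No traffic in this period, so there is nothing to flag.
        return {}
    return {
        region: count / total
        for region, count in counts.items()
        if region != home_region and count / total > threshold
    }


# Example: 90 requests from the home region, 7 from Europe, 3 from Asia-Pacific.
sample = ["us"] * 90 + ["eu"] * 7 + ["apac"] * 3
print(regions_over_threshold(sample, home_region="us"))
```

In this sample only the European traffic crosses the line, which would be the cue to put the regional build-out on the planning agenda rather than to page anyone.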
We might like to believe that tech debt results from poor engineering decisions. In reality, the engineering team is like the bank in this metaphor. To ship now, they issue tech debt. The product team benefits from shipping sooner, but there’s a loan attached.
The way you pay is by deferring features while tech debt gets paid off. Today, SLOs and error budgets are giving developers the means to broker this discussion with their business stakeholders.
No one is awakened in the middle of the night to discuss tech debt. Don’t attach emergency alerting to tech debt indicators, but do track the loans you’ve made so you can get ahead of these payments before your next sprint.