Deep Work: A Better Way to Measure Developer Velocity
Estimating developer work is more art than science. Devs typically use story points or hours for their estimations, or sometimes they’ll use a “planning poker” exercise to determine what they can accomplish in a given sprint. But these estimations are not an accurate measurement of dev team velocity.
Take story points, for example. How many can you complete in a given sprint? You decide on a number, plug it into a burndown chart, and that becomes your velocity. The problem is, story points can mean something different to each person. Consider two developers working on the same project. One may estimate three story points for a task while the other estimates five. How do you compare the two?
It’s difficult to compare not only individuals but also teams. As soon as you hire a new developer, you go back to the storming and norming stages of team development, and your previous view of team velocity goes down the drain. Story points are just too subjective to give you a clear directional picture. Even when the work is estimated in hours or with planning poker, the velocity estimates remain subjective.
I’ve given a lot of thought to how dev teams measure velocity, and it’s clear to me that we need a better way. To that end, I started experimenting with new definitions of velocity — something that would make more sense to me and dev teams in general. Here’s what I came up with.
A New Way to Measure Product Velocity
I recently started a new position as Chief Technology Officer of Uplevel, an engineering insights organization that helps empower dev teams and leaders through data. But before joining Uplevel, I was a customer of the company. That’s when I first started using its Deep Work insights to measure how much uninterrupted work time my teams were getting each day. I was already familiar with the concept of Deep Work, but it resonated with me even more in a development context.
Next, I looked at their insights around pull requests. While these insights weren’t new to me, the data had previously just floated around without really being corralled into an actionable format. Together, these Deep Work and PR insights are what ultimately led me to a new definition of velocity. I now think of velocity as having two main components:
- The first is capacity, which is where the Deep Work insights come in. The more Deep Work hours you can free up for your team, the more capacity they will have to design, write code, and deliver new products.
- The second component is effectiveness: how well the team uses that expanded Deep Work capacity. Effectiveness is all about working on the right things in the right way. To measure it, we first look at cycle time insights for pull requests, meaning how long it takes to complete a request. The other insight we look at is PR complexity, which is itself an indicator of cycle time.
Here’s what that looks like in action. A developer sits down to write a small logical unit containing up to several dozen lines of changed code, with few files being touched, behind a feature or environment flag. Because it’s a smaller change, the code review will be faster. It’s a different story when developing an entire feature and submitting PRs with 50 touched files and hundreds of changed lines of code. In this case, the code review process will take longer and involve a lot more back and forth.
By the time you complete a large PR and are ready to merge it back into the main branch, the main branch itself will have progressed. Other people are working on the same area of code and making progress, so you’ll run into merge conflicts that you need to resolve. To be as effective as possible, it’s best to work in small logical units of code, using production-safe flags to keep unfinished code from running in production. This lets you move code into the main branch faster than you could with larger changes.
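To make the cycle time and PR complexity comparison concrete, here is a minimal Python sketch. It is not how Uplevel computes its insights. The `PullRequest` fields, the size thresholds in `is_small`, and the three sample PRs are all invented for illustration, and cycle time is assumed to run from when a PR is opened to when it is merged.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class PullRequest:
    # Hypothetical record shape; real PR data will have different fields.
    opened_at: datetime
    merged_at: datetime
    files_changed: int
    lines_changed: int


def cycle_time_hours(pr):
    """Hours from opening the PR to merging it (assumed definition)."""
    return (pr.merged_at - pr.opened_at).total_seconds() / 3600


def is_small(pr, max_files=5, max_lines=50):
    """Rough complexity check. The thresholds are illustrative assumptions."""
    return pr.files_changed <= max_files and pr.lines_changed <= max_lines


def median_cycle_time_by_size(prs):
    """Group PRs into small and large, then take the median cycle time of each."""
    groups = {"small": [], "large": []}
    for pr in prs:
        groups["small" if is_small(pr) else "large"].append(cycle_time_hours(pr))
    return {size: median(times) for size, times in groups.items() if times}


# Made-up sample data: two small flagged changes and one whole-feature PR.
sample_prs = [
    PullRequest(datetime(2024, 1, 8, 9), datetime(2024, 1, 8, 13), 2, 30),
    PullRequest(datetime(2024, 1, 9, 10), datetime(2024, 1, 9, 16), 3, 45),
    PullRequest(datetime(2024, 1, 8, 9), datetime(2024, 1, 11, 9), 50, 800),
]

summary = median_cycle_time_by_size(sample_prs)
print(summary)
```

Tracking the two groups separately, rather than one blended average, keeps a handful of very large PRs from hiding how quickly the small ones move.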
In summary, I believe that velocity is best measured by looking at how much uninterrupted focus time your teams have (capacity) and how effective they are within those periods, with a focus on cycle time and PR complexity.
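The capacity half of that definition can be sketched in a few lines of Python as well. This is only an illustration under stated assumptions: a block counts as Deep Work when it is at least two hours of uninterrupted time, interruptions are taken from a list of meeting start and end times, and the function name `deep_work_hours` and the sample calendar are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: a gap only counts as Deep Work if it lasts at least this long.
MIN_BLOCK = timedelta(hours=2)


def deep_work_hours(day_start, day_end, interruptions, min_block=MIN_BLOCK):
    """Sum the uninterrupted gaps of at least min_block within the working day."""
    total = timedelta()
    cursor = day_start
    for start, end in sorted(interruptions):
        # Clamp each interruption to the working day.
        start = min(max(start, day_start), day_end)
        end = min(max(end, day_start), day_end)
        if start - cursor >= min_block:
            total += start - cursor
        cursor = max(cursor, end)
    if day_end - cursor >= min_block:
        total += day_end - cursor
    return total.total_seconds() / 3600


# Made-up calendar for one developer on one day.
DAY_START = datetime(2024, 1, 8, 9, 0)
DAY_END = datetime(2024, 1, 8, 17, 0)
meetings = [
    (datetime(2024, 1, 8, 10, 0), datetime(2024, 1, 8, 10, 30)),
    (datetime(2024, 1, 8, 13, 0), datetime(2024, 1, 8, 14, 0)),
]
standup = (datetime(2024, 1, 8, 11, 30), datetime(2024, 1, 8, 11, 45))

print(deep_work_hours(DAY_START, DAY_END, meetings))              # 5.5
print(deep_work_hours(DAY_START, DAY_END, meetings + [standup]))  # 3.0
```

In this invented calendar, a single 15-minute meeting placed in the middle of the longest free block cuts the day's Deep Work from 5.5 hours to 3, which is why freeing up contiguous time matters more than the total number of meeting minutes.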
You Can’t Predict Future Sprints
While I consider the approach above to be a better way of measuring velocity, it does neglect one of the key promises of a story points approach: predicting future sprints.
When using story points to measure velocity, you are predicting how future sprints will look based on previous trends. For example, if you completed 10 story points in each of the last 10 sprints, you can expect to complete the same number next sprint. In reality, story points are too subjective to be a reliable predictor of future sprints.
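The arithmetic behind that kind of forecast is simple, which is part of its appeal. The sketch below assumes the forecast is just the average of past sprints. The function name `forecast_next_sprint` is hypothetical, and the second half reuses the three-point and five-point estimates from the earlier two-developer example, applied to five identical tasks.

```python
from statistics import mean


def forecast_next_sprint(completed_points):
    """Forecast the next sprint as the average of points completed in past sprints."""
    return mean(completed_points)


# Ten past sprints with 10 story points completed in each.
history = [10] * 10
forecast = forecast_next_sprint(history)
print(forecast)  # 10

# The same five tasks, sized by two developers who estimate differently.
dev_a_estimates = [3] * 5
dev_b_estimates = [5] * 5
print(sum(dev_a_estimates), sum(dev_b_estimates))  # 15 25
```

The forecast is only as stable as the estimates feeding it: the same work adds up to 15 points under one developer's sizing and 25 under the other's.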
So while my approach to measuring velocity can’t predict future sprints, I’d argue that you never really had a way to predict them anyway, so you lose nothing by focusing on creating more capacity and effectiveness for your developers. That doesn’t mean you have to give up story points. They can still be helpful for planning, just not as a measure of velocity.
I’m not claiming this alternative measure of velocity as my own, as other dev teams may already be taking a capacity/effectiveness approach. But I do see it as a more objective way to measure velocity compared to subjectively assigning story points to a task.