
Continuous Improvement Metrics for Scaling Engineering Teams

Being VP of Engineering is perhaps even harder than being CEO. Here are the actual metrics you should start to measure that actually help dev teams.
Jul 8th, 2021 11:00am by Ori Keren
Feature image via Pixabay.

Ori Keren
Ori Keren began his engineering journey in fifth grade, programming in BASIC after his parents bought him a Sinclair ZX Spectrum with 48K of RAM in 1987. His desire to build started there, and it never went away. Ori got his first developer job in 2000 and eventually became a VP of engineering. Then, two years ago, he founded LinearB with friend and colleague Dan Lines. As CEO he doesn't get to write as much code as he used to, but he loves helping engineering teams get better every day with LinearB continuous improvement automation.

The challenges of scaling an engineering team are hard to understand until you experience them. As an engineer-turned-CEO of LinearB, a startup focused on software delivery intelligence, I have been afforded a unique perspective on how to communicate engineering metrics to leadership and foster the successful scaling of engineering teams.

I started as a software developer, and I thought that's what I'd do my entire life, until I started looking for ways to have more impact. I was promoted to team lead, and then director of engineering, where I learned first-hand about leading, growing and then scaling teams. I went on to hold two consecutive roles as vice president of engineering (the first company was acquired by AT&T, the second by Cisco), and I have now taken on the role of CEO of an early-stage startup. This experience has enabled me to truly understand how to optimize and scale engineering organizations: by understanding what to look at, and then how to remediate bottlenecks in the long term.

One thing I learned in this journey? Being vice president of engineering is perhaps even harder than being CEO. Another is that one of the key areas that can provide a significant impact in software delivery and velocity is the way we measure and optimize engineering processes across our organization.

Let’s dive in a little on where this journey has taken us.

The Engineering Metrics Journey

During the first decade of the millennium, we had little to no visibility into how to measure our engineering delivery capabilities. The second decade was a bit better from a visibility perspective, although many of the initiatives were focused at the CTO level — what was and wasn't working, uptime and downtime — with still very little understanding of the entire engineering organization.

This is beginning to change. The past couple of years have been extremely exciting (if you care about engineering metrics, of course), and the term "software delivery intelligence" was coined to represent the area where research and academia intersect with great products to help you drive continuous improvement in software delivery.

Early on, output metrics were the primary measure of engineering organizations, and they were largely centered on individuals: number of commits, lines of code, story points. The industry has since come a long way in understanding that these are simply the wrong things to measure.

There have been founders and engineering managers who, in the heyday of output metrics, would run a script at the end of the week to aggregate this data and then have conversations with engineers who didn't produce "enough" lines of code. Today, there's a much better understanding that output is not a good measure of an engineer's value — in fact, shorter code is often better — and that shifting your focus to teams and processes will deliver much greater insight.

The next generation of metrics came from excellent academic research, such as Dr. Nicole Forsgren's book Accelerate, which identifies four key metrics that characterize a high-velocity engineering team:

  • Deployment Frequency
  • Lead Time for Changes
  • Mean Time to Recovery (MTTR)
  • Change Failure Rate

These metrics, coupled with newer research called SPACE — which combines them with well-being indicators in the form of human factors such as burnout, focus time and interruptions, and how those impact delivery — are much better practices for measuring engineering teams. This research laid the academic foundation for companies to build products that help deploy these key metrics, an important phase in this evolution.

This is a similar story to the Agile Manifesto (2001), after which tools like Rally, and today Jira, helped teams apply agile principles in practice. What we quickly learned is that metrics are great and will give you visibility. Going from no visibility at all — complete darkness — to having KPIs and metrics that help you scale can already provide improvements as great as 70% in cycle time (we'll get to this soon). That said, metrics alone won't provide the improvement your dev teams need.

Now, after years of researching and implementing engineering metrics, we've gained a unique perspective on the right approach to delivering continuous improvement for dev teams, particularly when you start to feel the pressure to scale.

How to Measure Your Engineering Organization

So first and foremost, the most important individual unit to measure is not a single engineer, it’s likely a squad or team. You need to remove the friction created by measuring individuals; engineers don’t want to be stack ranked. Measuring teammates against one another’s performance is a quick ticket to a toxic team culture.

Since development is a team sport, team culture is critical to its success, and oftentimes is the secret sauce to great engineering processes. This is especially true on teams where one person is really good at cranking out the code, and another is better at managing the deployments, while a third is doing a lot of the code reviews. Much of what makes development teams successful is actually this chemistry and glue, more than individual performance.

The Pillars of Successful Metrics Programs

We’ve learned that a successful metrics program rests on three pillars:

  • Visibility in the form of metrics and reports for CTOs and VPEs
  • Context identifying the projects and bottlenecks for teams
  • Workflows through automation and downstream visibility for developers

A good way to explain what this means in practice is the scenario of a new developer who joins the team. We always want to onboard developers as rapidly as possible and ensure that they become productive quickly. We do this through pairing programs and mentoring. But many times, what happens is that when the new developer pushes their pull request, it goes unhandled.

In this context, pickup time is a very good metric to look at to understand that something isn’t right in the process. So visibility is certainly the first step — we see that instead of being picked up in a few hours, this PR is festering for days. But without context, it’s hard to understand why this is happening.
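As a rough sketch of the metric (the timestamps and function here are illustrative, not any particular tool's API), pickup time is simply the gap between a PR being opened and its first review activity:

```python
from datetime import datetime, timedelta

def pickup_time(pr_opened: datetime, first_review: datetime) -> float:
    """Hours a PR waited before anyone picked it up for review."""
    return (first_review - pr_opened) / timedelta(hours=1)

# Hypothetical event timestamps for the new developer's PR.
opened = datetime(2021, 7, 1, 9, 0)
reviewed = datetime(2021, 7, 3, 15, 0)  # first review, two days later

hours = pickup_time(opened, reviewed)
print(f"pickup time: {hours:.0f}h")  # 54h — far past a healthy few-hour target
```

Tracking this number per PR, rather than per developer, is what makes the outlier visible without stack-ranking anyone.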

What tends to happen is that PRs are reviewed in clusters, and sometimes a lone PR can go overlooked, as it’s not in the context of a larger project. This helps us understand that the new developer still needs to be immersed and integrated into the processes better.

Last, once you’ve identified the outlier, and have taken steps to assist the new developer, adding workflows and automation can boost future improvement significantly. You can implement rules, guidelines and guardrails, to prevent this from happening the next time around.

If a PR is waiting for two hours, let’s alert the people that should be alerted about it. Once you have that kind of dynamic, you are able to not just see the problem, and understand the reason, but also prevent it and remediate it. A metrics program that comprises all three of these pillars, will be able to provide tangible value and improvement to your engineering organization over time.
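A minimal sketch of such a guardrail, assuming a snapshot of open PRs with hypothetical `opened` and `first_review` fields (the field names and the two-hour threshold are taken from the example above, not from any real product's API):

```python
from datetime import datetime, timedelta

PICKUP_SLA = timedelta(hours=2)  # assumed threshold from the scenario above

def stale_prs(open_prs: list[dict], now: datetime) -> list[dict]:
    """Return PRs with no review activity that have waited past the SLA."""
    return [
        pr for pr in open_prs
        if pr["first_review"] is None and now - pr["opened"] > PICKUP_SLA
    ]

# Hypothetical snapshot of the team's open PRs.
now = datetime(2021, 7, 8, 12, 0)
prs = [
    {"id": 101, "opened": now - timedelta(hours=5), "first_review": None},
    {"id": 102, "opened": now - timedelta(minutes=30), "first_review": None},
    {"id": 103, "opened": now - timedelta(hours=4),
     "first_review": now - timedelta(hours=3)},
]

for pr in stale_prs(prs, now):
    # In a real workflow this would ping a team channel or the assigned reviewer.
    print(f"PR #{pr['id']} has been waiting more than {PICKUP_SLA} for review")
```

Run on a schedule (or on PR events), a rule like this turns the metric into the "prevent and remediate" step rather than a report you read after the fact.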

Business Alignment at Scale

So what are the actual metrics you should start to measure that actually help dev teams?

The Executive Metrics

While not necessarily the most important persona to focus on, we’ll start top-down at the CTO level. One important metric to provide your CTO or VP Engineering is resource allocation, including investment profiles and project cadence.

When I was a VP of engineering, before every product offsite the CEO would come to me and say he'd like to understand what the team was working on. It would take me days to compile and normalize the data into a view of what the team was actually doing that my executive peers could understand. Luckily, today we can get this view from our tooling and gain a good grasp of where our team is spending its time.

Is this project solving more bugs or providing new value to our users? This kind of information is extremely important for aligning the engineering organization with the business, which is what you’ll need to translate to management to get the buy-in for more resources.
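A resource-allocation view can be as simple as bucketing tracked work by type. This sketch assumes issue-tracker items tagged with a hypothetical `type` field; the categories are illustrative:

```python
from collections import Counter

def investment_profile(work_items: list[dict]) -> dict:
    """Share of engineering effort by work type (assumed issue schema)."""
    counts = Counter(item["type"] for item in work_items)
    total = sum(counts.values())
    return {kind: round(n / total, 2) for kind, n in counts.items()}

# Illustrative sprint data pulled from an issue tracker.
items = [{"type": "feature"}] * 6 + [{"type": "bug"}] * 3 + [{"type": "tech-debt"}] * 1
print(investment_profile(items))  # {'feature': 0.6, 'bug': 0.3, 'tech-debt': 0.1}
```

A profile like "60% features, 30% bugs" is exactly the kind of business-aligned answer an executive peer can act on.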

Team Metrics

Let’s go a little lower in the chain and talk about team metrics. One of the most critical metrics to measure from a team perspective is cycle time (if you haven’t already, you can read our CTO’s blog post that takes a deep dive into this specific metric). While cycle time has many variants, it essentially reflects the amount of time it takes from when code is written until it reaches production and brings value to the company. If teams focus on improving this metric, it will have an incredible impact on how fast you can move through engineering cycles and how rapidly value is delivered to the company.

Cycle time can be broken down into three primary metrics that encapsulate it:

  • PR size
  • Review and pickup time
  • Deployment frequency
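One common way to operationalize this breakdown is to split cycle time into phases between pipeline timestamps. The phase names and timestamps below are a sketch under that assumption, not a standard definition:

```python
from datetime import datetime, timedelta

def cycle_time_phases(first_commit, pr_opened, first_review, merged, deployed):
    """Break cycle time (first commit -> production) into its phases."""
    return {
        "coding": pr_opened - first_commit,     # writing the change
        "pickup": first_review - pr_opened,     # waiting for a reviewer
        "review": merged - first_review,        # review back-and-forth
        "deploy": deployed - merged,            # merge to production
        "total": deployed - first_commit,
    }

# Hypothetical timestamps for a single change.
t0 = datetime(2021, 7, 5, 9, 0)
phases = cycle_time_phases(
    first_commit=t0,
    pr_opened=t0 + timedelta(days=1),
    first_review=t0 + timedelta(days=3),        # two days of pickup: the bottleneck
    merged=t0 + timedelta(days=3, hours=4),
    deployed=t0 + timedelta(days=4),
)
for phase, span in phases.items():
    print(f"{phase}: {span}")
```

Seeing the phases side by side is what tells you whether to shrink PRs, speed up pickup, or deploy more often.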

Organizations that focus on optimizing this metric have seen deployment velocity improve by orders of magnitude — from 15 days to one day, or even just a few hours, from first commit until the code is actually in production.

This can also help prevent catastrophes and explosions, because the increments are smaller: the smaller the PR, the easier the rollback if there are issues, and, on the flip side, the quicker it can be deployed. And as the statistician and quality pioneer W. Edwards Deming taught, many small improvements in aggregate ultimately translate to big gains.

Another metric to focus teams on is quality. The first thing that always comes to mind in the context of quality is bugs. It’s important to track how many bugs exist, and how quickly they’re fixed, but there are other leading indicators of quality.

One metric that can provide a good indication of the quality of your process is the depth of review when performing code reviews. Do we just quickly scan and say “LGTM” or do we take the time to properly review the code, with helpful and thorough comments?

The other half of that equation is code churn — code that is rewritten or replaced shortly after it’s merged. Churn happens with greater frequency if you sacrifice quality during review. It can provide a very good indication of code maturity and quality, or even of something as fundamental as misalignment between the initial requirements and definitions from product and the ultimate implementation by engineering.
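As a sketch, both signals reduce to simple ratios over PR data. The field names here (`review_comments`, `lines_rewritten_soon_after`) are hypothetical, and the churn window is an assumption — tools differ on how soon "soon after" is:

```python
def review_depth(prs: list[dict]) -> float:
    """Average review comments per pull request (assumed fields)."""
    return sum(pr["review_comments"] for pr in prs) / len(prs)

def churn_rate(prs: list[dict]) -> float:
    """Share of merged lines rewritten shortly after merge (assumed window)."""
    changed = sum(pr["lines_changed"] for pr in prs)
    rewritten = sum(pr["lines_rewritten_soon_after"] for pr in prs)
    return rewritten / changed

# Illustrative PR data: an "LGTM" rubber stamp vs. a real review.
prs = [
    {"review_comments": 0, "lines_changed": 400, "lines_rewritten_soon_after": 120},
    {"review_comments": 6, "lines_changed": 150, "lines_rewritten_soon_after": 10},
]
print(f"review depth: {review_depth(prs):.1f} comments/PR")  # 3.0
print(f"churn: {churn_rate(prs):.0%} of merged lines")       # 24%
```

In this toy data, the shallow review is also where most of the churn comes from — the pattern the metric is meant to surface.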

The Hidden Engineering Metric

If we want to focus on the real scale factor that will help your engineering organization improve, the hidden engineering metric can be measured in developer buy-in. Even if the CTO has all the visibility and metrics, if the developer doesn’t buy into the process, engineering organizations won’t be able to improve. Eventually, the power and autonomy are in the hands of the developers to actually execute the vision and metrics program.

Applying Metrics for Scale — Automating All the Things

Shifting the process of dev productivity left is ultimately not about metrics for developers; it’s about automating workflows. If we want to improve dev productivity, we need to think like a developer. One of the things I often do is think about what bothered me as an engineer. Perhaps once a quarter I wanted to see some kind of summary of KPIs and other metrics. But what I wanted to see every single day when I came in was everything I was working on and where it sat in the pipeline: what is the next thing blocking it from progressing? I wanted to see my projects pushed further downstream in the dev pipeline, all the way to production, to know the impact I made.

So, eventually, we want an automated workflow that surfaces this to developers: that someone reviewed their pull request (PR) and left comments, that a build just failed, that their code is about to be released and they should probably prepare to monitor the service (many organizations still aren’t at the maturity stage of “you build it, you ship it, you own it”). This real-time visibility is critical for engineers and helps teams optimize the micro feedback loops that happen hundreds of times a week.

This takes visibility to its next level of value: define the process, build the pillars and automate the workflows, so the program works for you and not vice versa. This is what drives the hidden metric of developer buy-in to productivity.

Where visibility alone can provide a 70% jump in productivity, coupling it with intelligent automation and workflows can boost that number to 90% and higher. Once you automate processes and workflows combined with relevant metrics, you will see your organization’s capabilities scale bottom-up through developer-led continuous improvement. And this is when the magic happens.
