
4 Ways to Measure Your Software Delivery Performance

Visualizing and tracking metrics is critical to the delivery of high-quality software when speed and scale are essential.
Apr 16th, 2021 8:58am by
Featured image via Pixabay.
Pratik Kale, Praveen Ahuja and Sanmeet Shikh also contributed to this post.

Aravind Kannan
Aravind Kannan is an Engineering Manager on the Delivery Engineering team at eBay. Aravind enjoys building continuous delivery tools and frameworks for Mobile, Web and Service-based applications along with his team.

When you have hundreds of teams deploying code to production for thousands of services and applications via thousands of Continuous Integration and Continuous Delivery (CI/CD) pipelines, measuring these applications’ velocity and stability becomes critical to ensure the delivery of high-quality software with speed.

This article focuses on how eBay measures and visualizes the velocity and stability of our applications across various organizations using the four software delivery performance metrics that are identified and explained in detail in Dr. Nicole Forsgren’s book, Accelerate.

The DevOps Research and Assessment (DORA) program, through six years of extensive research, has collected survey data from over 31,000 professionals worldwide via the yearly State of DevOps reports. The reports show how elite and high-performing organizations compare with medium and low-performing organizations on the four key metrics that predict engineering and business outcomes.

Software Delivery Performance Metrics

Below are the four software delivery performance metrics:

  1. Deployment Frequency – How often does your organization deploy code to production?
  2. Lead Time for Changes – How long does it take to go from code committed to code successfully running in production?
  3. Change Failure Rate – How often does a deployment require a rollback, hotfix, etc.?
  4. Mean Time to Restore – How long does it take to restore service when there is an issue?

We built a system that can track and visualize all of these metrics in near real-time — with the ability to drill down into any organization at eBay and see the breakdown of these metrics across all teams within that organization. A team can look at these metrics for all of their applications. Additionally, the system allows access to historical trends to identify any improvements or degradation in the four metrics at each level. Teams can also apply filters to view their metrics for a specific period of time.
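As a rough illustration of the drill-down and time filter described above, the sketch below counts deployments at the organization, team and application level for a chosen period. The record shape, the function name and the sample organization, team and application names are assumptions made for this example; they are not taken from eBay's system.

```python
from collections import Counter
from datetime import datetime


def deployment_rollup(records, start, end):
    """Count deployments in [start, end) per organization, team and application.

    `records` is an iterable of (organization, team, application, deployed_at)
    tuples. This shape is hypothetical and chosen only for illustration.
    """
    by_org, by_team, by_app = Counter(), Counter(), Counter()
    for org, team, app, deployed_at in records:
        if start <= deployed_at < end:  # the "specific period of time" filter
            by_org[org] += 1
            by_team[(org, team)] += 1
            by_app[(org, team, app)] += 1
    return by_org, by_team, by_app


# Made-up sample data.
records = [
    ("payments", "checkout", "cart-svc", datetime(2021, 4, 1, 9)),
    ("payments", "checkout", "pay-svc", datetime(2021, 4, 2, 9)),
    ("payments", "wallet", "wallet-svc", datetime(2021, 4, 3, 9)),
    ("search", "ranking", "rank-svc", datetime(2021, 5, 1, 9)),  # outside the window
]
by_org, by_team, by_app = deployment_rollup(
    records, datetime(2021, 4, 1), datetime(2021, 5, 1)
)
```

The same grouping keys could be reused for any of the four metrics; deployment counts are used here only because they are the simplest to show.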

Deployment Frequency

To track this metric, we look for deployment events from our build and deployment systems. The deployment counter is incremented the moment a new version of an application makes it to an instance that is serving production traffic and is tracked until it reaches all the active instances that serve traffic. This includes both good and bad application versions.
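The counting rule above can be sketched as follows. This is a minimal example under stated assumptions: the event fields, including a `deployment_id` that ties together all instance-level events of one rollout, are hypothetical and do not describe eBay's actual event schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class InstanceDeployEvent:
    # Hypothetical event shape; all field names are illustrative assumptions.
    deployment_id: str     # one rollout of one version, across all instances
    application: str
    instance: str
    serving_traffic: bool  # True if the instance serves production traffic
    at: datetime


def deployments_per_day(events):
    """Count each rollout once, on the day it first reached a traffic-serving instance."""
    first_seen = {}
    for e in events:
        if not e.serving_traffic:
            continue  # instances without production traffic do not count
        prev = first_seen.get(e.deployment_id)
        if prev is None or e.at < prev[1]:
            first_seen[e.deployment_id] = (e.application, e.at)
    return Counter((app, at.date()) for app, at in first_seen.values())


# Made-up sample data.
events = [
    InstanceDeployEvent("d1", "checkout", "i-1", True, datetime(2021, 4, 1, 10, 0)),
    InstanceDeployEvent("d1", "checkout", "i-2", True, datetime(2021, 4, 1, 10, 5)),
    InstanceDeployEvent("d2", "checkout", "i-1", True, datetime(2021, 4, 1, 15, 0)),
    InstanceDeployEvent("d3", "checkout", "i-9", False, datetime(2021, 4, 1, 16, 0)),
]
counts = deployments_per_day(events)
```

In this sample, rollout `d1` reaches two instances but is counted once, and `d3` is ignored because its only instance is not serving traffic.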

Lead Time for Changes

To calculate the lead time for changes, we use the SHAs of all the commits that are part of a given production deployment. The lead time for a commit is the time elapsed from the creation of the commit until its deployment to the very first traffic-serving instance(s).

A single build that was deployed to production can contain more than one commit, resulting in multiple lead times being calculated. To compute a single overall metric, we take the median of these lead times and use that as the “lead time for changes” for that deployment. When this data is viewed for an organization, team or application, the mean of the per-deployment “lead time for changes” values across all releases for that organization, team or application is displayed as “lead time for changes.”
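The two-step calculation (median within a deployment, then mean across deployments) can be sketched as below. The function names and the sample timestamps are assumptions made for this example.

```python
from datetime import datetime
from statistics import mean, median


def deployment_lead_time(commit_times, first_instance_deployed_at):
    """Median lead time, in seconds, over all commits in one deployment."""
    lead_times = [
        (first_instance_deployed_at - created_at).total_seconds()
        for created_at in commit_times
    ]
    return median(lead_times)


def aggregate_lead_time(deployments):
    """Mean of the per-deployment medians, in seconds.

    `deployments` is a list of (commit_times, first_instance_deployed_at) pairs.
    """
    return mean(deployment_lead_time(c, d) for c, d in deployments)


# Made-up sample: two deployments for one application.
release_a = (
    [datetime(2021, 4, 1, 11), datetime(2021, 4, 1, 10), datetime(2021, 4, 1, 6)],
    datetime(2021, 4, 1, 12),  # commit lead times: 1h, 2h, 6h, so the median is 2h
)
release_b = ([datetime(2021, 4, 2, 8)], datetime(2021, 4, 2, 12))  # single commit: 4h

lead_a = deployment_lead_time(*release_a)
overall = aggregate_lead_time([release_a, release_b])
```

Using the median within a deployment keeps one very old commit (the 6-hour one above) from dominating that deployment's value.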

Change Failure Rate

Change failure rate represents the percentage of all deployments to production that result in user-impacting defects requiring remediation, such as a hotfix, rollback, fix forward or patch.

We currently measure change failure rate as the percentage of rollbacks out of all deployments for the given period of time. Even rollbacks from a single instance are counted as failures if the instance was serving traffic.
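As a minimal sketch of this calculation, the example below assumes each deployment record carries a `rolled_back` flag that is set if the version was rolled back from at least one traffic-serving instance. The record shape and the flag name are hypothetical, and how the rollback deployment itself is counted is not specified in this article.

```python
def change_failure_rate(deployments):
    """Percentage of deployments in the given period that were rolled back.

    `deployments` is a list of dicts with a boolean 'rolled_back' key
    (a hypothetical record shape used only for illustration).
    """
    if not deployments:
        return 0.0  # assumption: report 0% when there were no deployments
    rollbacks = sum(1 for d in deployments if d["rolled_back"])
    return 100 * rollbacks / len(deployments)


# Made-up sample: 20 deployments in the period, 3 of them rolled back.
period = [{"rolled_back": i < 3} for i in range(20)]
rate = change_failure_rate(period)
```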

Mean Time to Restore (MTTR)

Mean Time to Restore (MTTR) shows how long it takes to recover when a failure occurs.

We measure MTTR as the time taken to roll back a change/release after it has reached production. For example, if build N (a bad build) was rolled out to production, and the team then finds an issue that requires them to roll back N by deploying N-1 (the last known good build), the Time to Restore (TTR) is the time between the deployment of N to production and the deployment of N-1 to production. The mean of all TTRs is the MTTR.
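The TTR and MTTR definitions above translate into a short calculation. The function names and the sample timestamps below are assumptions made for this example.

```python
from datetime import datetime, timedelta


def time_to_restore(bad_build_deployed_at, good_build_redeployed_at):
    """TTR: time from deploying build N until N-1 is deployed again."""
    return good_build_redeployed_at - bad_build_deployed_at


def mean_time_to_restore(rollbacks):
    """MTTR: mean of all TTRs.

    `rollbacks` is a non-empty list of
    (bad_build_deployed_at, good_build_redeployed_at) pairs.
    """
    ttrs = [time_to_restore(bad, good) for bad, good in rollbacks]
    return sum(ttrs, timedelta()) / len(ttrs)


# Made-up sample: one rollback took 30 minutes, another took 90 minutes.
rollbacks = [
    (datetime(2021, 4, 1, 10, 0), datetime(2021, 4, 1, 10, 30)),
    (datetime(2021, 4, 5, 14, 0), datetime(2021, 4, 5, 15, 30)),
]
mttr = mean_time_to_restore(rollbacks)
```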

Conclusion: Improving Through Visibility

As part of a company-wide “Velocity Initiative,” various platform and product engineering teams at eBay are working together towards a common goal: to improve the ability of all eBay teams to ship high-quality software faster.

Measuring and visualizing these metrics helps eBay’s teams to see where they stand in terms of delivery velocity and stability, and has also helped to identify areas of improvement in the software delivery process across teams.

With this visibility into our metrics, we hope to continue to improve.

Interested in pursuing a career at eBay? We are hiring! To see our current job openings, please visit: http://ebay.to/Careers
