
Google on the DevOps ‘Elite’ and Everyone Else

23 Sep 2021 12:16pm

After two waves of DevOps adoption, companies that have fully embraced DevOps best practices are achieving better software delivery and operational performance metrics than their peers. Most other companies are hitting a performance plateau. That’s the glass-half-empty way to read Google’s recently published “Accelerate State of DevOps 2021” report.

Researchers at Google used a statistical technique called cluster analysis to segment the survey’s 1,200 respondents. This year, 26% are classified as “elite,” up from 20% in 2019 and 7% in 2018.

The median elite organization is definitely high performing. It delivers software on-demand, deploying code to production or releasing it to users multiple times a day.

This capability may be underutilized, but the median elite organization can also go from a commit to code running in production in less than an hour. That’s an improvement from 2019, when the median elite DevOps performer needed a day’s lead time for changes.

In addition, 15% or fewer of these changes fail, meaning that a hotfix, rollback, patch or some other type of remediation is required. Finally, it takes less than an hour for a median elite organization to restore service after an incident or outage, or perhaps after a serious security vulnerability is identified.
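The four figures quoted above can be collected into a quick self-check. The following is a minimal sketch, not the report’s method: the researchers assign groups through cluster analysis rather than fixed cutoffs, and the names DeliveryMetrics and matches_elite_profile, as well as the reading of “multiple times a day” as more than one deployment per day, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeliveryMetrics:
    """Hypothetical container for the four delivery metrics discussed above."""
    deploys_per_day: float      # average production deployments per day
    lead_time_hours: float      # commit to running in production
    change_failure_rate: float  # fraction of changes needing remediation (0..1)
    restore_hours: float        # time to restore service after an incident


def matches_elite_profile(m: DeliveryMetrics) -> bool:
    """Compare metrics with the median elite figures quoted in the article.

    Assumption: "multiple times a day" is read as more than one deploy per day.
    This is an illustrative threshold check, not the report's cluster analysis.
    """
    return (
        m.deploys_per_day > 1
        and m.lead_time_hours < 1
        and m.change_failure_rate <= 0.15
        and m.restore_hours < 1
    )


# Example: a team deploying weekly with a one-day lead time falls short.
weekly_team = DeliveryMetrics(
    deploys_per_day=1 / 7,
    lead_time_hours=24,
    change_failure_rate=0.10,
    restore_hours=4,
)
print(matches_elite_profile(weekly_team))
```

A team that clears all four thresholds would only resemble the median elite respondent; it would not necessarily land in the elite cluster in the study itself.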

That’s great news for the “elite,” but what about the 68% of mainstream organizations that fit into the “high” and “medium” categories in the study? Just looking at the median respondent, these groups have regressed since 2019. Previously, even the non-elite regularly deployed at least weekly, but now that has dropped to between weekly and monthly. What’s changed?

Let’s go back to the cluster analysis for the “high” group. In 2018, 48% of organizations fit a similar profile. That dropped to 23% in 2019 and rose to 40% this year.

As we already told you, that does not correspond with actual improvements in performance. Instead, it relates to peer and industry expectations. By 2018 a majority of study participants had already adopted DevOps culture, but only to a limited extent.

The 2019 version of the Accelerate report, as well as the “2019 State of DevOps Report,” revealed some ugly truths. Many companies that architected their software delivery processes for new CI/CD software found, with newfound visibility into those processes, that their software delivery and performance metrics were worse than previously thought.

Furthermore, tensions between developers and security teams came to the forefront. As we previously wrote, once the DevSecOps problem was solved, organizations achieved a clear path to success.

Our review of recent surveys conducted for Sonatype’s State of the Software Supply Chain reports makes it clear that there is an elite group of companies that do deploy software faster and on average have better outcomes than the worst performers. But, especially in terms of how long it takes to address security issues, companies have yet to see significant differences in results after adopting many leading-edge practices.

Think of the cluster analysis as identifying the segments of Geoffrey Moore’s classic technology adoption curve. We “crossed the chasm” in 2019, and a chunk of the “elite” group are now Early Adopters. The report’s “high” cluster maps to the Early Majority, the “medium” to the Late Majority and the “low” to the Laggards.

It is hard to know if the outcomes included in the report represent meaningful differences between the early majority and the late majority; combined they continue to represent two-thirds of the study (67% in 2019 and 68% today). However, the study’s model definitely considers factors such as company culture, cloud, security and SRE practices.

It appears that there is a big difference in terms of reliability. For example, elite performers who successfully meet their reliability targets are more than four times as likely as the average organization to have solutions that incorporate observability into overall system health.

Benchmarking yourself against the elite sounds great and generates cool statistics. According to the report, elite performers recover from incidents 6,570 times faster than low-performing ones, but without proper context that is a ridiculous, almost worthless figure, and so are a lot of the other results in this report. What it means is that on average, the elite restore service from, let’s say, a major security vulnerability in an hour. In contrast, the 7% of low-performing companies take on average three quarters of a year (6,570 hours).
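The arithmetic behind that multiple is easy to reproduce. Here is a rough sketch, assuming the elite baseline is exactly one hour and a 365-day year; both are simplifications, and the variable names are ours, not the report’s.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: leap years ignored

elite_restore_hours = 1  # assumption: "in an hour" taken as exactly one hour
reported_multiple = 6570  # "6,570 times faster" as quoted from the report

# Implied time to restore service for the low-performing group.
low_restore_hours = elite_restore_hours * reported_multiple
low_restore_days = low_restore_hours / HOURS_PER_DAY
low_restore_years = low_restore_days / DAYS_PER_YEAR

print(f"{low_restore_hours} hours = {low_restore_days} days = {low_restore_years} years")
```

Under those assumptions, 6,570 hours works out to 273.75 days, or three quarters of a year, which is why the headline multiple says more about the extreme low-end estimate than about how the elite operate.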

Do you really think that it takes that long, and what about all the results in between? In our opinion, the “low” group should be disregarded as outliers in future analysis.

The normal tech adoption curve has a symmetric, bell-shaped distribution, but that doesn’t appear to be what’s happening now. Will there be a new digital divide? Will a quarter of companies be high-performing, leaving everyone else behind? Let us know what you think.

Source: An adaptation by Craig Chelius of the technology adoption lifecycle as described in Geoffrey Moore’s book Crossing the Chasm.
