It’s been fun watching real-world programmers react to a new study that challenges the idea of vast differences in the productivity of computer programmers. The study tries to suggest better ways for managers to assess and improve the performance of their developers. The data-backed work comes from Bill Nichols, a senior member of the technical staff at Carnegie Mellon’s Software Engineering Institute. Nichols announced his surprising results in a recent blog post, entitled “The End to the Myth of Individual Programmer Productivity.”
Though discussions about developers often include the concept of a ten-times-more-productive “10X programmer,” Nichols’ blog post explores “the veracity and relevance” of that concept. His research suggested instead that half of the difference in “program-development effort” can be attributed to variations in each individual programmer’s day-to-day performance, and that “most of the differences resulted from a few very low performances, rather than exceptional high performance…”
“[P]rogrammers differ from themselves as much as they differ from other members of the group. The many studies that seem to show an x10 performance range conflate differences between programmers with normal programmer day-to-day variation.”
The Man Behind the Data
For 14 years, Nichols had developed and maintained nuclear engineering and scientific software after earning his physics doctorate at Carnegie Mellon University. Then in 2006 he’d joined the technical staff at the university’s Software Engineering Institute. Founded in 1984, it’s a federally-funded research center sponsored by the U.S. Department of Defense “to support the nation’s defense” by advancing the practices and technologies around software systems. Much of the group’s technical staff are CMU faculty.
Nichols’ role as an instructor there had left him in the unique position of having years of data from actual programmers…
Since the 1980s the Software Engineering Institute has been developing the Capability Maturity Model (which attempts to measure the extent to which development processes have been formalized and optimized). There’s also a related process for structured software development that’s known as PSP (for “Personal Software Process”), which is geared toward individual developers, encouraging them to formally measure their own performance and establish an ongoing cycle of improvement. As part of their classes on PSP, Nichols and his colleagues had their students collect data on the time they’d spent completing their programming assignments (as well as the lines of code and the number of bugs). Nichols decided this data could be used to test our assumptions about variations in the skills of programmers.
Starting with a dataset from 3,800 students (who took the class between 2000 and 2006), he’d narrowed it down to just the 494 students who had completed all 10 programming exercises, and who’d all used C as their programming language. Half the students had less than a year of experience as computer programmers, while the average experience level was 3.7 years (and a few had at least 36 years of experience).
So what did the data show? “When we consider the entire body of work, not just the outliers, the evidence for super programmers looks weak. When looking at the 25th-75th percentile range, we can see notable uniformity in student productivity… [A]n average programmer typically finished everywhere between top to the bottom quartile while a top programmer or bottom programmer was sometimes average…”
“[W]hile some programmers are better or faster than others, the scale and usefulness of this difference has been greatly exaggerated.”
Nichols’ data showed that on routine tasks there’s just not that much difference between programmers. “Of the 494 students, 482 had at least one program assignment finished in less than the average time, and 415 had at least one program assignment finished in more than the average time.”
In summary, these statistics show that program-assignment completion time is driven as much by seemingly random and unknown factors as by true programmer-productivity differences. These wide performance ranges suggest that even experienced programmers vary widely in performance from one task to the next.
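To make that reasoning concrete, here is a minimal Python sketch of the two kinds of measurement described above. It is not Nichols’ actual analysis: the function names, the tiny `sample` dataset, and the assumption that each time has already been normalized per task (1.0 meaning a typical finish for that assignment) are all invented for illustration.

```python
from statistics import mean


def within_share(times):
    """Share of the total squared deviation that comes from each
    programmer's own task-to-task spread (0.0 to 1.0).

    `times` maps a programmer to a list of task times, assumed to be
    normalized per task already (hypothetical input format).
    """
    all_times = [t for ts in times.values() for t in ts]
    grand = mean(all_times)
    # Spread of programmers' personal averages around the overall average.
    between = sum(len(ts) * (mean(ts) - grand) ** 2 for ts in times.values())
    # Spread of each programmer's tasks around their own average.
    within = sum((t - mean(ts)) ** 2 for ts in times.values() for t in ts)
    return within / (between + within)


def count_faster_and_slower(times):
    """Count programmers with at least one task finished faster than that
    task's average, and programmers with at least one finished slower."""
    people = list(times)
    n_tasks = len(times[people[0]])
    task_avg = [mean(times[p][i] for p in people) for i in range(n_tasks)]
    faster = sum(
        any(times[p][i] < task_avg[i] for i in range(n_tasks)) for p in people
    )
    slower = sum(
        any(times[p][i] > task_avg[i] for i in range(n_tasks)) for p in people
    )
    return faster, slower


# Made-up data: two programmers, two tasks each.
sample = {"ann": [0.5, 1.5], "bob": [1.5, 1.0]}

print(within_share(sample))
print(count_faster_and_slower(sample))
```

In this made-up sample, both programmers beat the average on one task and miss it on the other, and most of the spread sits inside each person’s own results rather than between the two people. That is the pattern the statistics above describe, though on real data the split would depend heavily on how the times are normalized.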
In the real world, this means if you’re a manager trying to assess improvements in your processes, “In the short run, normal performance variation swamps performance… More importantly, we notice instances and extremes, not long-term trends.”
Since it’s so hard to evaluate the skills of individual programmers, Nichols advises managers to focus instead on ways to improve the programming environment for all of their programmers, and to help those programmers get better.
👍👍 for "Assure that developers have a quiet work environment in which they can focus on the task at hand without interruption." https://t.co/TnS4ivhIHO (doi: 10.1109/MS.2019.2908576)
— Karen Dalton (@kilodalton) February 7, 2020
Nichols’ suggestions include careful workflow planning, like keeping assignments small and padded with “adequate margins” of extra time. (“Start critical work early since almost half the time it will take longer than expected, sometimes much longer.”) Nichols also recommends automating routine tasks like deployment and regression testing. Careful design can keep projects from becoming too large and complex. (Nichols also specifically recommends design training, as well as training in testing and review, and frequent peer reviews.)
“Since quality can be taught and benefits apply to the total lifecycle cost, emphasize quality rather than speed.”
It’s a refreshingly inclusive approach to optimizing performance. “Rather than try to label programmers with simplistic terms such as ‘best’ and ‘worst,’ the most motivating and humane way to improve average performance is to find ways to improve everyone’s performance,” Nichols writes. He compares it to the practice of assembling a winning baseball team described in the book “Moneyball: The Art of Winning an Unfair Game”: recognizing that conventional wisdom had led to the systematic undervaluing of certain players.
In baseball, researchers who challenged widely held but erroneous notions were able to exploit market inefficiencies, a development described in #Moneyball. Similarly, #Software managers can benefit by challenging commonly accepted wisdom – https://t.co/Mq3MtYQqdY pic.twitter.com/kfvnekMQnW
— Software Engineering Institute (@SEI_CMU) January 29, 2020
In conclusion, Nichols advises managers not to rank programmers according to each one’s productivity, “because the measurements are mostly noise. Instead, it is far more useful to explore the sources of variance in programmer performance within each task.”
Nichols’ blog post provoked several thoughtful responses around the web. When it turned up in Reddit’s programming forum, the link attracted more than 60 comments. “On our best days, we’re probably 10x better/more productive than on our worst days,” agreed one poster. “I imagine sleep quality, diet, mood, and various impossible to control variables play a role.”
But another commenter attributed their productivity to a simpler secret. “My manager once asked me how I get so much done. I told her, I skip 90% of meetings.”
And Nichols’ blog post also drew another 121 comments from the geeky readers of Hacker News. “I find it amusing that we recognize grandmasters in chess and elite performers in other fields, but don’t wish to acknowledge that such people exist in our fields,” wrote one backend developer.
“There are 10x and 100x developer but you only notice them when the problem is very complex and you give them the opportunity to lead,” argued another programmer, who said they’d been programming for 15 years.
But just to imagine a 10X programmer is to create a platonic ideal — and in some comments, it seemed more like a legend, used for articulating feelings about workplaces. “Once a cowboy coder delivers results and gets noticed for fixing stuff quickly, it’s all over. That person is hailed as a hero and he’ll be the first pick for leading the ‘A-team’ of devs to make 2.0…”
Another comment about 10X programmers quickly segued into a critique of modern programming practices. “I think putting people into Scrum situations precludes most ’10x’ developers from being able to shine.” And another commenter seemed even more cynical. “If you want to reduce output, add more programmers to help.”
But any discussion of a 10X programmer inevitably leads to some introspection about how we code. “The difference between 1x and 10x devs is not (just) their skill,” posted Dortmund, Germany-based developer Sebastian Werhausen. “It’s mostly that they’re just more motivated and do their stuff with high urgency, passion and focus on detail opposed to just doing barely anything at all. In the end, it’s about dopamine, like so many things in life.”
Inevitably the discussion invited some armchair philosophy about the state of the workplace today. “Most businesses don’t have access to top talent and they know it. Most businesses are specifically trying to create an environment where devs are interchangeable, and optimizing their processes for the 50th percentile dev (if that).”
But at the end of the day, Nichols’ study firmly suggests something new to consider when optimizing performance, and his blog post lays down a bold challenge to what’s often a basic assumption:
“Our study suggests that hiring ‘the best programmers’ will not be as effective or simple as we might think.”
What does 10x programmer mean? Maybe not what quite what we thought https://t.co/N4LIsnTcbK the next question is what do we do about it.
— William Nichols (@wrn55) February 3, 2020
- A baseball-loving geek painstakingly visualizes the data for 8,200 pitches thrown to Houston Astros batters (who secretly already knew which pitches were coming).
- An Ivory Coast designer created hundreds of emojis to better represent West African culture.
- A cat owner connects their Raspberry Pi to a laser pointer with a motion sensor.
- TechDirt contest challenges geeks to make online games out of 1924 works finally entering the public domain.
- A Paris Museum puts 60,000 historic photos online — copyright free.
- Dice survey calculates which cities, skills, and occupations pay the most.
- A new book shares life advice from Elon Musk’s mother.
- A voice-enabled bot uses “powerful AI” to play Dungeons and Dragons on Twitch.