All about Zombie, an Open Source Effort to Mask Metadata
Bias is a part of human nature. These biases can threaten the quality and security of code, and they can exclude the people who produce it. At the scale of participation and adoption of open source, those biases can easily be amplified.
The New Stack previously wrote about how masking metadata during the initial consideration of an open source pull request can help eliminate bias with regard to the identity of the person who seeks to contribute to an open source project. Now we write about a technical solution that already exists to achieve this.
If a similar feature were widely adopted across the open source ecosystem, maintainers could stay focused on evaluating code quality and preventing technical debt, without missing signals that a favorite contributor's account has been compromised and a malicious injection is being slipped past them.
Why Create a Blind Zombie?
Back in 2018, Emma Humphries, then a program manager at Mozilla, was looking for a solution to randomly assign active contributors to help triage bugs. Around the same time, they were also a part of a six-week workshop in partnership with Airbnb aimed at improving diversity in open source.
“It’s sometimes the case that a proposed change to a project will actually work, and it solves the case of the person who proposed it,” Don Marti, now vice president of ecosystem innovation at CafeMedia, told The New Stack. “But then it means that there’s a little bit of extra complexity in that open source project going forward.”
That can be fine on a smaller scale, Marti said, but “as you accumulate all these little changes that fix one problem for one person, you keep making the project slower and slower to be able to be changed for future problems.”
Working in collaboration with Larissa Brown Shapiro, then Mozilla’s head of global diversity and inclusion, the trio decided to try double-blinding, or masking, pull requests, aiming to support open source quality and productivity as well as diversity, equity and inclusion (DEI) efforts.
“What Emma and I saw was that there’s a significant overlap between the kinds of problems that come up with open source quality, and the kind of problems that come up in DEI discussions in large organizations,” Marti said. “There’s a real kind of ‘looks good to me,’ [habit shared] by us in a lot of development teams. And so, can you address both quality issues and DEI issues with the same kinds of approaches?”
It’s not just an open source problem, he contends — this confirmation bias exists in all kinds of software development teams.
“We have this sort of exaggerated idea of software developers as kind of meritocratic brains in boxes, when that’s not at all the case,” he said. “Software development teams have more in common with any primate social group — like a monkey troop — than they do with some idealized vision of an open source quality and metrics-driven organization.”
He offered the example of when venture capitalist Paul Graham said he funded a company just because the founder looked like Facebook’s Mark Zuckerberg. The result? A disaster.
“That’s not just at the scale of evaluating founders or high-level people for jobs, it also kind of happens at the micro-scale, where it’s easier for some people to get a bad change through than it is for other people to get a good change through,” Marti said.
Back in 2016, a study by researchers from Cal Poly and North Carolina State University found that pull requests from contributors identifiable as women were less likely to be accepted. When women masked their gender, their changes were more likely to be approved than men’s, and overall were less likely to need refactoring.
The Right Information at the Right Time
Marti looks at this metadata — name, photo, and handle — as a distraction. He mentioned a study of physicians evaluating people suspected to be having a heart attack.
“It turned out that looking at the best information was better than looking at a lot of information,” he said. “They could get, on average, better decisions from their cardiologists if the cardiologists were just presented with [only] the important information.”
Hiding the metadata during the first PR review, he says, similarly helps maintainers focus only on the most important information, such as code quality. By masking unnecessary metadata, maintainers can concentrate on the consequences and operability for the entire project, and avoid increasing technical debt by approving small changes that can have a big impact on project longevity.
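To make the idea concrete, here is a minimal, hypothetical sketch of what masking might look like. The field names and data structure are illustrative assumptions, not Zombie's actual implementation:

```python
# Hypothetical sketch: hide identity metadata from a pull request payload
# before the first review. Field names are illustrative assumptions,
# not Zombie's actual data model.

IDENTITY_FIELDS = {"author", "handle", "avatar_url", "email"}

def mask_pull_request(pr: dict) -> dict:
    """Return a copy of the PR payload with identity fields masked."""
    return {
        key: ("[masked]" if key in IDENTITY_FIELDS else value)
        for key, value in pr.items()
    }

pr = {
    "title": "Fix off-by-one error in pagination",
    "diff": "--- a/pager.py\n+++ b/pager.py\n...",
    "author": "rebecca",
    "avatar_url": "https://example.com/rebecca.png",
}
masked = mask_pull_request(pr)
print(masked["author"])  # prints "[masked]"
print(masked["title"])   # title and diff are left untouched
```

Once the initial review is done, the reviewer would simply be shown the original, unmasked payload again.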
“Getting people to think about that kind of neutrally, based on what the code actually does, rather than having their person-recognition circuit in their brain kick in too early, I think has a positive effect,” Marti said.
Investing in New Project Maintainers
The Zombie web extension was originally turned on all the time, but feedback from users both inside and outside Mozilla was that always-on masking wasn’t well suited to open source engagement.
“We had developers pointing out that it was useful to them, and we did get some feedback on how to kind of balance blind code reviews with open source social conventions,” Marti said.
Such conventions rely on identifying your community members in order to welcome new contributors, and encourage and incentivize their repeated participation.
“A path that invests in new maintainers is the backbone of creating a welcoming, diverse, and sustainable open source community,” wrote Abby Cabunoc Mayes, now senior program manager at GitHub, back when she worked at Mozilla.
Cabunoc Mayes further referenced other lessons she learned from the Mozilla Open Leaders project that ultimately influenced some changes in this masking feature. Namely, a sustainable community needs a way for contributors to level up within it.
You’re much more likely to get continued participation from an open source contributor, for example, if you thank them, she wrote. An open source community thrives on identifying and maximizing what VM (Vicky) Brasseur, an open source business strategist, refers to as “drive-thru contributions.”
This understanding led Marti and Humphries to make it very easy for the reviewer to turn the masking feature off once the initial review was done.
Does Community Size Matter?
“Double-blind code reviews are not going to be a panacea,” Humphries told The New Stack. However, “there’s going to be an area where they’re going to be effective.”
They didn’t think metadata masking would be applicable to their current team of about a hundred Bandcamp contributors.
“It does not work in a small organization or in a small number of contributors or a place where you will easily discern a contributor’s code hand. It’s just going to be too easy to go ‘Oh no, that’s Rebecca’s code,’” Humphries said. (“Code hand” is another name for the cognitive signature of developers — from emojis in comments all the way through to commits, a developer’s personality often shines through.)
“It has to work in a place where you have a very large project like the Mozilla projects, [with] a very large code base, and it’s going to work for smaller pull requests,” Humphries said. “It’s not going to work for larger pull requests,” which, they said, require conversations.
“When you get into those larger pull requests, I’m not sure you can keep that anonymity present.”
However, this masking can be valuable even to six-person projects, argued Sal Kimmich, open source developer advocate at Sonatype, because it helps prevent malicious code injections: serious cybersecurity threats in which an attacker takes over the account of an individual who is recognized in a community.
Pull requests from these regular contributors or maintainers are more easily merged, they told The New Stack, because maintainers are more likely to see a recognized name and not rigorously assess the quality of the code. More sophisticated attackers adopt the cognitive signature and style of frequent project contributors, making malicious code easy to overlook.
This isn’t a negligible threat. Over the last two years, there’s been a 650% increase in this kind of attack, according to Sonatype’s “2021 State of the Software Supply Chain” report. “They’ll put malicious code into something that’s wrapped around something that looks like a typical contribution,” Kimmich said.
“A lot of maintainers aren’t security trained, or even really properly trained in best practices,” they continued, and just notice the valued contribution, not the risk it’s hiding.
By masking metadata, you remove the halo effect and validate the code based on quality, before worrying about who wrote it.
Before any masking can take effect, it’s crucial to make sure community basics like psychological safety are in place. A process change like the masking of initial code reviews can’t be imposed on a non-diverse culture, Humphries warned. This includes a code of conduct that outlines the consequences of aggressive or abusive behavior, and a potential pathway to return to the community.
Data Masker Seeks Maintainer
In the end, the Zombie project was destined to only become a minimum viable product, losing out to other competing priorities across Mozilla. It was open sourced on GitHub, but is not currently maintained.
“We didn’t get a chance to run a full qualitative study with the code we had,” Marti said, as other features became greater priorities. This project was built in addition to the maintainers’ regular work, running up to the time of the first major layoffs at Mozilla, in early 2020.
“We didn’t really get the time to explore [it] to its fullness,” Humphries said. But the project did offer some lessons.
First and foremost, they realized that this kind of feature shouldn’t be built as a web extension — it has to be built into the project itself, ideally a feature built within GitHub that can be turned on and off.
Humphries would also like to see some sort of pre-processing of code built in, like a tidier or formatter, to “sanitize the code” for both quality and to remove even more cognitive signatures like comments from that first view — which attackers can leverage to mimic the style of familiar project contributors.
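As a rough sketch of that idea, assuming Python source files, the standard library’s tokenize module can drop comments before the first review (a production sanitizer would also normalize formatting, which this sketch does not do):

```python
import io
import tokenize

def strip_comments(source: str) -> str:
    """Remove '#' comments from Python source, leaving the code itself intact."""
    tokens = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type != tokenize.COMMENT  # drop only comment tokens
    ]
    # untokenize rebuilds the source from the remaining tokens
    return tokenize.untokenize(tokens)

snippet = "x = 1  # a very recognizable comment style :)\ny = x + 1\n"
print(strip_comments(snippet))
```

Running a formatter such as Black over the result would go one step further, erasing idiosyncratic whitespace and layout as well.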
Before a maintainer even considers a pull request, Humphries said that it’d be interesting to run this sanitizer alongside security, unit and integration tests. This would save a lot of time for the maintainer, they said, because, “a lot of the time, when people get a pull request, it’s like ‘OK, this looks good, but have you run tests on it?’”
In larger projects, Humphries pointed out, the core team tends to move away from feature writing toward infrastructure, specifications and community management. If you can automatically test against the specification, they said, that simplifies the over-burdened maintainer’s role of project/product manager.
All this could be incorporated into the GitHub Actions workflow automation within a repository, Kimmich said.
The open source world’s largest projects tend to have testing in place, they said, so those aren’t always the target of these types of attacks.
“They’ll go for the mid-sized ones,” Kimmich said. “So they’re going for contributor communities that are around 300 to 500 that are sort of on the up-ramp, and they’re seeing if they’re getting new contributors into them. That’s where they’re finding those little sweet spots and getting in.”
Those attackers are often going deeper into the supply chain through those mid-sized projects, they added, where “They know they’re gonna hit every major enterprise.”
Data masking isn’t a perfect solution. It’s one technical piece in a complex puzzle that has to address increasing bias and security issues baked right into code and communities. The social and structural factors that are inherent to open source remain tenuous at best.
“The model of open source is still flawed. You’re still asking people to do free labor for you,” Humphries said.
This conversation was focused on cognitive factors that can contribute to worse code, but that’s only part of what’s necessary to address in order to foster open source sustainability.
As Humphries noted, “You’re asking people to do a tremendous amount of work and not pay them for it.”