Python’s New Security Developer Has Plans to Secure the Language
Earlier this year, the Python Software Foundation hired Python programmer Seth Larson for a new security developer-in-residence role, funded by the Linux Foundation’s own nonprofit, the Open Source Security Foundation (OpenSSF).
“I’m a passionate advocate for sustainability in open source maintenance,” Larson said in an email interview, so “it’s been a dream for me to live what I believe to be the ideal model for sustainable open source security.”
The position’s funding came from the OpenSSF’s Alpha-Omega Project, which “partners with open source software project maintainers to systematically find new, as-yet-undiscovered vulnerabilities in open source code,” according to its web site, “and get them fixed to improve global software supply chain security.”
Larson is just one of several full-time developers recently hired to help the Python ecosystem. Since July of 2021 there’s also been a developer-in-residence — soon to be augmented by a new deputy developer-in-residence. Then in August the foundation added its first full-time safety and security engineer for PyPI, the official repository where nearly 500,000 user-contributed Python projects are available for installation.
When the foundation went looking for a new security developer, Larson’s long-time involvement there was definitely a plus. Besides contributing to many open source projects, Larson’s website notes he’s the lead maintainer for the HTTP client library urllib3, “the most downloaded package on the Python Package Index with over 10 billion downloads.”
But Larson also sees the potential for a much larger impact: demonstrating the effectiveness of investments in critical communities for improving their security posture — and thus strengthening the entire software supply chain.
“Software ecosystems have a lot we can learn from each other, so I’ve been documenting everything I’ve learned — so that it can be used by others looking to replicate my work in their own ecosystem!”
Larson says he started by taking a start-to-finish look at Python’s entire release process. Specifically, he created a diagram of each stage, “and then matched that diagram up with common supply chain threats for open source software.
“This process gave me good ideas on places to start for securing the release process.” (Among other things, Larson is now working with release managers — and developer-in-residence Łukasz Langa — to improve the macOS and Windows builds.) In a recent talk, Larson said he’s exploring workload-lightening automation like Azure Pipelines, which could also harden security. “Injection of malware during build time has happened to multiple other open source projects with disastrous results for users.”
One blog post also shares details of July’s audit of the signatures used for Python releases. Larson had wanted to make sure that end users see signing that’s consistent with the documentation, so “I put together a few simple scripts which downloaded and attempted to verify every Python release artifact against its signatures and published the results.” Then after the signature audit, Larson “worked with release managers to fix discrepancies that I found,” helped standardize signatures for older releases, and even added controls to Python’s release tooling “that prevents anything bad from happening with the signatures — anything confusing — before they get published.”
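An audit loop of this kind might look roughly like the sketch below. It is an illustration only, not Larson’s actual scripts: the function names are hypothetical, it assumes detached signatures checked with the `gpg --verify` command, and the verifier is passed in as a parameter so the loop can be tried out without GnuPG installed.

```python
import subprocess
from typing import Callable, Dict, Iterable, List, Tuple


def build_verify_command(artifact: str, signature: str) -> List[str]:
    """Return the GnuPG command line that checks a detached signature."""
    return ["gpg", "--verify", signature, artifact]


def gpg_verifier(artifact: str, signature: str) -> bool:
    """Run gpg; an exit status of 0 means the signature verified."""
    result = subprocess.run(
        build_verify_command(artifact, signature),
        capture_output=True,
    )
    return result.returncode == 0


def audit(
    pairs: Iterable[Tuple[str, str]],
    verifier: Callable[[str, str], bool] = gpg_verifier,
) -> Dict[str, bool]:
    """Map each artifact to True (verified) or False (failed or not checkable)."""
    results: Dict[str, bool] = {}
    for artifact, signature in pairs:
        try:
            results[artifact] = verifier(artifact, signature)
        except OSError:
            # For example, the gpg executable is missing on this machine.
            results[artifact] = False
    return results
```

Publishing the resulting table of artifacts and pass/fail values is what makes discrepancies visible to release managers; a real audit would also need to download each artifact and import the release managers’ public keys first.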
Soon Larson was invited to join the Python Security Response Team, where he’s been helping coordinate bug fixes (so Python’s volunteer developers don’t have to continuously monitor incoming reports just in case one needs their response).
Larson also writes in his Q3 report about adding a little automation, by “moving the reporting and triage process to GitHub using GitHub Security Advisories.” And one thing he learned during this process was how CVE numbers were getting assigned to newly discovered vulnerabilities.
The numbers are mostly designated by the National Cyber Security Division of the U.S. Department of Homeland Security. And surprisingly, it was possible to get a CVE number for a Python vulnerability without talking to anyone associated with the Python project!
Larson said this had created a situation that “caused reports which weren’t security vulnerabilities to be accepted as CVEs” (which, of course, “caused confusion for users.”) It’d be better — and quicker — if Python’s core developers could just add their remediation tips (along with updates and other information) directly into the CVE.
Larson got to work, and by August the Python Software Foundation had been authorized as a CVE Numbering Authority (both for the Python programming language and the Python Package Installer). Larson says he’ll be sharing primary duties with Ee Durbin, the PSF’s director of infrastructure, and Chloe Gerhardson, one of their infrastructure engineers.
Among other benefits, the authorization brought paid staffing for CNA operations (“rather than requiring volunteer time”), which lightens the workload of Python (and pip) maintainers who had previously been tasked with publishing advisories themselves.
A PSF blog post expresses hope that this will ultimately lead to “richer published advisories and CVE Records including descriptions, metadata, and remediation information.” (Future CVEs will be reviewed and approved by the responsible security response teams, with more detailed information originating directly from the Python project.) “Since becoming a CNA we have revitalized the firstname.lastname@example.org mailing list for alerting Python users when there have been new vulnerabilities found in Python and pip,” Larson says.
And at the end of August Larson published “my first end-to-end vulnerability disclosure for Python,” according to that week’s blog post, which included “a coordinated release of fixed Python versions, and publishing of the advisory to the email@example.com mailing list and to the PSF Advisory Database…
“Now that I’ve experienced the flow from end-to-end and I can start to think about where there is potential for improvement and what items need to be on our ‘checklist’ to reduce stress and guesswork from remediation developers, release managers, and coordinators.”
Helping Other Communities
But all this could be just the beginning. “The PSF wants to help other Open Source organizations,” Larson explained in an August blog post, “and will be sharing lessons learned and developing guidance on becoming a CNA and day-to-day operations.” Larson says he’s already been communicating with the curl project about his experience becoming a CNA, to help them take the same step. And to share the experience even further, “I’ve authored a guide to becoming a CNA as an open source project under the OpenSSF’s Vulnerability Disclosures working group.”
Elsewhere, another blog post notes that the U.S. government “is soliciting ideas from the broader community on where to focus and what to do to improve the security of open source software.” (By early November Larson had spent months conferring with colleagues, and was scrambling to meet the deadline for submissions.) “Whatever gets done by the U.S. government is likely to have huge implications for everyone maintaining and consuming open source software, so it’s critical that policy and decisions are made with sustainability in mind,” Larson wrote.
“I’m honored to be a part of this and to represent so many Pythonistas in my work both for this RFI and every day as Security Developer-in-Residence.”
Future projects include further strengthening of Python’s release process, and “a bunch of packaging standards work” — where there’s also a role for the larger community. In his talk, Larson said he was also reaching out to the authors of some interesting security-related Python Enhancement Proposals (or PEPs).
One recommends that installers create a JSON-formatted record showing the provenance of Python distributions and a unique distribution hash. Larson actually created an experimental tool to test it out, using the PEP’s reference implementation to generate SBOM documents (in both SPDX and CycloneDX format) for Python’s package manager.
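To make the idea concrete, here is a minimal sketch of what writing such a record could involve. The function name and the field names (`url`, `hashes`) are illustrative assumptions rather than the schema defined by the PEP, and the URL in the usage note is a placeholder.

```python
import hashlib
import json


def provenance_record(source_url: str, payload: bytes) -> str:
    """Build a JSON record tying a distribution to its origin and its hash.

    The layout here is a simplified assumption, not the PEP's exact format.
    """
    digest = hashlib.sha256(payload).hexdigest()
    record = {
        "url": source_url,               # where the installer fetched the file
        "hashes": {"sha256": digest},    # uniquely identifies these exact bytes
    }
    return json.dumps(record, indent=2, sort_keys=True)
```

An installer could call `provenance_record("https://example.invalid/pkg-1.0.tar.gz", file_bytes)` after a download and store the result alongside the installed package’s metadata, which is the kind of information an SBOM generator can later pick up.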
And one day Larson also hopes to create his own standard for the metadata of bundled projects — which in some cases even include other projects not written in Python. (In the talk Larson noted that he’s already found and reported two that contained vulnerabilities.)
Larson is also experimenting with creating an authoritative Software Bill of Materials (or SBOM) for critical projects like CPython and pip, and then making it easier to use that information when creating SBOMs for other Python packages. Down the road the hope is to capture metadata automatically, and also to get the new packaging information supported by the current crop of tools that automatically generate SBOMs.
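For readers unfamiliar with the format, the sketch below shows roughly what a bare-bones SBOM for a Python environment could contain. It is a simplified illustration in the CycloneDX JSON style and is not validated against the CycloneDX schema; the function names are hypothetical, and the spec version shown is an assumption. Package names and versions are read with the standard library’s `importlib.metadata`.

```python
from importlib import metadata
from typing import Dict, Iterable, List, Tuple


def component(name: str, version: str) -> Dict[str, str]:
    """Describe one package, including a package URL (purl) for PyPI."""
    normalized = name.lower().replace("_", "-")
    return {
        "type": "library",
        "name": name,
        "version": version,
        "purl": f"pkg:pypi/{normalized}@{version}",
    }


def minimal_sbom(packages: Iterable[Tuple[str, str]]) -> Dict[str, object]:
    """Wrap (name, version) pairs in a minimal CycloneDX-style document."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",  # assumed version, chosen for illustration
        "version": 1,
        "components": [component(n, v) for n, v in sorted(packages)],
    }


def installed_packages() -> List[Tuple[str, str]]:
    """List (name, version) for every distribution visible to this interpreter."""
    return [
        (dist.metadata["Name"], dist.version)
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip entries with broken metadata
    ]
```

Calling `minimal_sbom(installed_packages())` and dumping the result with `json.dumps` gives a starting point. What a listing like this cannot see is code bundled inside a package, which is the gap the bundled-project metadata work is meant to close.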
Sustainability, Clarity, Visibility
In his talk in September, Larson highlighted his three “guiding principles”: sustainability, clarity, and visibility. “Sustainability” means focusing on making a lasting impact — ideally, in ways that require minimal upkeep. “These are things like improving processes, automation, dealing with bureaucracy, or publishing standards.” And with a cacophony of new security tools coming out, Larson strives to bring clarity. “A big part of my role is to distill down everything that’s happening in the security space, and then partnering with the right people on projects to ensure that the value is realized.”
And “visibility” takes the form of regular blog posts and public announcements, to give security the higher profile it deserves. “We should all be proud, and celebrate the work we’ve done to keep our community safe… Part of ensuring current and future success in this area requires talking about what’s getting done and highlighting where there is more opportunity.”
So how is it working out? “I’m quite proud of what I’ve accomplished so far,” Larson wrote in an October blog post, “and think it shows the value of investing into the security of Open Source through hiring folks to work full-time in roles.”
Larson says he’s received grateful comments from maintainers relieved that he’s helping with security, and “The response from the core developers and wider community has been overwhelmingly positive.”
Larson even got an appreciative shout-out on the new core.py podcast from Python core developer Pablo Galindo Salgado and developer-in-residence Łukasz Langa. And in our email interview, he said he’s found it deeply rewarding “to make a positive difference in a community that I love so dearly…
“My experience so far has been incredible.”