Improving Open Source Supply Chain Transparency with SPDX
Publication of the ISO/IEC 5962:2021 standard for software bill of materials (SBOM) is more beginning than end. At least, that’s what I learned from an interview with Dr. David A. Wheeler around the time of that publication last summer.
Understanding that distinction and its implications throughout the computing world requires a bit of background in standardization, software’s logistical perspective, and the Linux Foundation, where Wheeler serves as director of open source supply chain security. Here’s a summary of how those parts fit together:
Security at the Center
SBOM is a serious subject, in the sense that it’s discussed not just by bit-pushers and theoreticians, but also by presidential staff and military decision-makers.
Most tangibly, U.S. President Biden’s Executive Order 14028 on cybersecurity improvement (EO14028) recognized SBOM management as an urgent national security concern.
How does an esoteric corner of computing — that’s what SBOM is — find itself in the spotlight of global attention?
The Biden Administration’s executive order was undoubtedly prompted by the SolarWinds attacks in March of 2020 and various highly publicized infrastructure attacks on energy suppliers, such as Colonial Pipeline in 2021. However, the most recent notable example is a vulnerability in the Apache Software Foundation’s Log4j software. The vulnerability, CVE-2021-44228 aka Log4Shell, can allow complete control of an unpatched system by an attacker.
As with the Heartbleed incident that affected the OpenSSL library in 2014, Log4Shell is a severe vulnerability that affects many systems. In the case of Log4j, that is potentially billions of systems.
The reaction to the vulnerability was swift. Repair crews rushed into action. Patches were developed quickly. Executives ordered round-the-clock workers to replace the vulnerable versions of Log4j and restore the internet to its baseline of adequate security.
But as with Heartbleed in 2014, organizations responding to Log4j don’t necessarily know where the affected components are in their applications.
It sounds straightforward. The software components were broken, so specialists need to update the software components, right?
But not every fix is as simple as replacing a single software component, because modern software is built on many, many components. Just as an automobile might be assembled in Tennessee from parts made in a thousand other factories around the world, software is built on top of other software, and that software is built on other pieces and so on.
Heartbleed responders in 2014 quickly found out that, however much they wanted to replace and repair vulnerabilities, just locating all the uses of OpenSSL was a gargantuan and perhaps infeasible chore. Does a specific piece of software rely on the specific SSL library vulnerable to attack? Part of the shock of 2014 was discovering how often the answer to that “yes/no” question was “we don’t know.”
In 2022, the answers to “where are all our uses of Log4j?” appear to be strikingly similar.
The industry has collectively answered this problem with more technology: SBOMs described using Software Package Data Exchange (SPDX), the OpenChain Specification and other standards, together with supporting software, help reduce the component problem to a more manageable level.
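To make that concrete, here is a minimal sketch of how the “where are all our uses of Log4j?” question becomes a simple lookup once every application ships with an SBOM. The application names, versions and the `find_component` helper are invented for illustration; only the field names (`packages`, `name`, `versionInfo`) follow the SPDX 2.x JSON format, and real documents carry many more fields.

```python
# Hypothetical SBOMs, keyed by application name. In practice each one would be
# loaded from an SPDX JSON file (for example with json.load), not written inline.
SBOMS = {
    "billing-service": {
        "spdxVersion": "SPDX-2.2",
        "name": "billing-service",
        "packages": [
            {"name": "log4j-core", "versionInfo": "2.14.1"},
            {"name": "jackson-databind", "versionInfo": "2.13.0"},
        ],
    },
    "report-generator": {
        "spdxVersion": "SPDX-2.2",
        "name": "report-generator",
        "packages": [
            {"name": "openssl", "versionInfo": "1.1.1l"},
        ],
    },
    "search-api": {
        "spdxVersion": "SPDX-2.2",
        "name": "search-api",
        "packages": [
            {"name": "log4j-core", "versionInfo": "2.17.1"},
        ],
    },
}


def find_component(sboms, component_name):
    """Return (application, version) pairs for every SBOM listing the component."""
    hits = []
    for app in sorted(sboms):
        for pkg in sboms[app].get("packages", []):
            if pkg.get("name") == component_name:
                hits.append((app, pkg.get("versionInfo", "unknown")))
    return hits


if __name__ == "__main__":
    for app, version in find_component(SBOMS, "log4j-core"):
        print(f"{app} uses log4j-core {version}")
```

The point of the sketch is the shape of the query, not the tooling: with an inventory in a common format, “we don’t know” turns into a list of applications and versions to check against the advisory.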
As in any other domain, standard-based solutions only solve problems when the standards are observed. Much of the impact of EO14028 has to do with requirements for SBOMs in government operations.
At one level, Wheeler points out, publication of SPDX as an SBOM standard is “utterly irrelevant”: Individuals and organizations that want to comply already know the standard long before an official milestone like publication, and organizations that choose to ignore the standard can continue to ignore it, however “official” it has become. What really matters is the use of the standard, not just formal publication.
Still, getting people to agree to use a format for SBOMs is a hard challenge. Now that the ISO fully recognizes SPDX as a format for SBOM documents, the format has greater credibility in the space, so Wheeler and others agree that some celebration is warranted.
The Linux Foundation is already using SPDX in some of its projects, such as the Zephyr real-time operating system, where SPDX SBOM generation is now built into its “West” meta command-line tool. Additionally, the Yocto project, a build-from-source Linux distribution for custom system implementations, is generating SPDX as part of its overall build process.
Understand that standardization is a hard, sustained challenge. Acceptance of a serious standard like ISO/IEC 5962:2021 typically takes at least three years.
Standards organizations originally focused on such basic material requirements as the size of bolts and the materials in safety equipment. Eventually, higher-value commodities, including pharmaceuticals, electrical components and chemical feedstocks, merited formal standardization. Now it’s software’s turn: Software plays such a crucial role in everyday activities that it pays to standardize it.
That’s one sense in which ISO/IEC 5962:2021 is more beginning than end: Many other software technologies are likely to follow formal standardization paths.
There’s a strategic aspect to this publication, moreover, at least for the Linux Foundation (LF), which supports SBOM generation with SPDX: The LF looks to build on the ISO/IEC 5962:2021 experience so that standardization can itself become more replicable and manageable.
From a computing perspective, SPDX is another interesting special-purpose language designed to express dependency relationships, license inheritance and other domain-specific details.
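As a rough illustration of what “expressing dependency relationships” means in practice, the sketch below walks a handful of relationship records to collect everything an application depends on, directly or indirectly. The element IDs and the `transitive_dependencies` helper are hypothetical; the record keys (`spdxElementId`, `relationshipType`, `relatedSpdxElement`) and the `DEPENDS_ON` type follow the SPDX 2.x JSON format.

```python
from collections import deque

# Hypothetical relationship records in the shape used by SPDX 2.x JSON documents.
RELATIONSHIPS = [
    {"spdxElementId": "SPDXRef-App",
     "relationshipType": "DEPENDS_ON",
     "relatedSpdxElement": "SPDXRef-WebFramework"},
    {"spdxElementId": "SPDXRef-WebFramework",
     "relationshipType": "DEPENDS_ON",
     "relatedSpdxElement": "SPDXRef-Logger"},
    {"spdxElementId": "SPDXRef-App",
     "relationshipType": "DEPENDS_ON",
     "relatedSpdxElement": "SPDXRef-Crypto"},
    # A non-dependency relationship, ignored by the walk below.
    {"spdxElementId": "SPDXRef-DOCUMENT",
     "relationshipType": "DESCRIBES",
     "relatedSpdxElement": "SPDXRef-App"},
]


def transitive_dependencies(relationships, root):
    """Return the set of element IDs reachable from root via DEPENDS_ON edges."""
    graph = {}
    for rel in relationships:
        if rel["relationshipType"] == "DEPENDS_ON":
            graph.setdefault(rel["spdxElementId"], []).append(rel["relatedSpdxElement"])

    seen = set()
    queue = deque([root])
    while queue:  # breadth-first walk; the seen set guards against cycles
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


if __name__ == "__main__":
    for dep in sorted(transitive_dependencies(RELATIONSHIPS, "SPDXRef-App")):
        print(dep)
```

Here the logger never appears among the application’s direct dependencies, yet the walk still finds it through the web framework, which is exactly the kind of buried component that made Heartbleed and Log4Shell hard to locate.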
At the same time, and ultimately even more importantly, SPDX is just a warm-up to negotiating the legal systems that make potent software possible. From this viewpoint, SPDX is less a language-and-computing construct and more a story of the different forms human collaboration takes, on at least three levels pertinent to software development:
- Recognition that software exists in a uniquely rich environment of other software with myriad dependencies. It’s widely estimated that only 10% of the software of a new product is unique or proprietary to that product; the overwhelming majority of the functionality of any application lives within common libraries that exist outside the product in hand. Creating a great application is less like heroic artistic production in an isolated garret and more like a grand logistical undertaking.
- Standardization is itself a distinctively human activity with powerful consequences.
- Smart observers of the SPDX story are taking away lessons to apply to the governance and management of other software technologies that need standardization, such as SVG, 5G, Notebooks, and NoSQL.
The focus on cybersecurity and software supply chain issues by the LF and the Open Source Security Foundation (OpenSSF) is rather timely. Those organizations recently attended a cross-industry virtual meeting at the White House and shared their intent to work with the administration across public and private sectors, with all technology and software projects at their disposal.
SPDX is just one of the dozens of projects that the LF and the OpenSSF sponsor, including programs for best practices and training, but SPDX will be a crucial tool that organizations can use to better understand the installed components in their environments.
To learn more about SPDX, the Linux Foundation is encouraging organizations to attend its SPDX Community SBOM DocFest, which will be held virtually on January 27.