SOS: Sustainable Open Source
Free and Open Source Software is eating the world, but is at the same time a victim of its own success. Large enterprises rely on libraries maintained by a single individual, or maybe worse yet: a single vendor.
Individuals or organizations may restrict the use of their technology or declare versions of their software end-of-life (EOL), posing real challenges to organizations and customers depending on that technology. How can we contribute to the viability and sustainability of open source?
Issues in Open Source
One issue is projects relicensing, whether to avoid “free-riding”, to keep bad actors from using the work to do more harm, or to limit the maintainers’ responsibility.
Another problem is projects maintained by the proverbial single individual in Nebraska (shoutout to the infamous XKCD cartoon). Daniel Stenberg successfully maintains curl mostly on his lonesome. But for every curl, there is a left-pad.
Lack of resources prevents the maintainer from spending the time the project would warrant given how it supports businesses globally. And maintainers can make decisions based on emotions. We’ve seen maintainers pull their code to keep it from being used by the likes of U.S. Immigration and Customs Enforcement (ICE), or more recently, to protest Russia’s attack on Ukraine.
Are relicensing and lack of resources for maintainers the only two top-level issues plaguing open source? No. I for one would love to see open source become a more inclusive and equitable place, but for this post, let’s look at recent license changes, and at what causes maintainer drain.
In recent years we’ve seen an increase in “kinda open source” licenses, like the Commons Clause, which aims to restrict commercial free-riding on open-source code, especially by cloud service providers who don’t “give back” to the FOSS community.
The Commons Clause conflicts with the Free Software Definition (FSD) and its “right to use software for any purpose”, and with the Open Source Definition (OSD), under which a license shall not restrict any party from selling or giving away the software. The Commons Clause also contains a bunch of ambiguous wording, like “value derived entirely or substantially”. What is considered substantial?
Mongo used it for a while, as did Redis Labs. MongoDB moved to the Server Side Public License (SSPL) in 2018, which is not approved by the Open Source Initiative (OSI), the stewards of the OSD. The SSPL extends copyleft to the cloud infrastructure used to offer the software as a service. Its justification? The notion that large cloud vendors capture all the value but contribute nothing back to the community.
The Redis Source Available License (RSAL) applies to certain Redis modules created by Redis. It’s a license to do all the usual actions (use, modify, distribute, copy, sublicense…) unless your application is “distributed” or “made available as a database product”.
Then there is the Elastic License 2.0. Again, you find clauses to prevent hosted or managed service providers from using the project. It’s copyleft-style but with straightforward prohibitions, as it:
- Prevents using Elastic as part of a hosted or managed service providing access to Elastic features
- Prevents third parties from obscuring trademark notices and branding
- Can embed license keys to prevent circumvention
There are other, new and restrictive licenses, like:
- TimescaleDB license: which basically says no “Timescale as a service”
- The Confluent Community License: you can use, modify, distribute, unless that competes with Confluent’s business, which could potentially be a moving target.
- Cockroach Labs introduced features under Business Source License (BSL) and the Cockroach Community License (CCL) on top of the Apache 2.0 licensed core CockroachDB.
Some argue that MongoDB and Elastic were never really open source to begin with, but I don’t think I agree. I think they brought tremendous value to the community, but then mistook open source for their business model and couldn’t reconcile themselves with cloud vendors making money off their product.
As a disclaimer, I work for a company actively involved in driving OpenSearch forward as the open source fork of Elasticsearch. When Elastic changed its license a shockwave went through the community. Several players eventually decided to collaborate and fork Elasticsearch — including AWS.
In fact, AWS appears to dominate the project, perhaps because more than anyone they can afford many engineers working on a given project. Speed took precedence over governance decisions, businesses rely on Elastic-like functionality, and it remains to be seen if we’re really better off this way.
Apache Kafka development (Kafka is also a project in Aiven’s portfolio), or rather the decision of what changes make their way into the project, is primarily in Confluent’s hands. The single-vendor issue is prevalent in open source: Databricks has a stronghold on Spark, and Google and Beam are a similar story.
Grafana, Loki, and Tempo relicensed from Apache 2.0 to GNU AGPLv3, an “infectious” copyleft license. In response to third-party dependencies changing their license to AGPL, the Cloud Native Computing Foundation (CNCF) encourages projects to “switch to an alternative component, freeze the component at the version prior to the license change, and/or seek an exception from the Governing Board.”
Open Source libraries enable you to move faster, but if they’re poorly maintained, if they’re not healthy, they can become a single point of failure.
The 2016 example was left-pad. All left-pad did was pad out the left-hand side of strings with zeroes or spaces. Still, thousands of projects, including Node and Babel, relied on it. Left-pad’s maintainer felt pushed into a corner by the lawyers of messaging app Kik over another one of his NPM libraries, also called Kik. When NPM took that library away from the developer, he was furious and unpublished all of his NPM-managed modules.
To fix the internet (and I wish that was hyperbole), Laurie Voss, chief technology officer and co-founder of NPM, took the “unprecedented” step of restoring the unpublished library. Had the left-pad maintainer had access to representation by a foundation, the left-pad incident might have been prevented.
Seth Vargo, after discovering a contract between software automation company Chef and ICE, deleted his code and in doing so more or less discontinued Chef’s services. Even if Seth wrote the code when he was an employee of Chef, it lived in a personal repository, and no OSI license or employment agreement requires Seth to continue to maintain code in his personal accounts. Vargo added that he has even included detailed instructions in his will on how to deal with the code he owns when he dies, requesting all his code accounts be deleted.
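To give a sense of how small left-pad was, the behaviour described above (padding the left-hand side of a string with zeroes or spaces) fits in a handful of lines. This is a minimal Python sketch of that behaviour, not the original package's source; the function name and signature are my own assumptions.

```python
def left_pad(text: str, length: int, fill: str = " ") -> str:
    """Pad `text` on the left with `fill` until it is `length` characters long.

    Illustrative reimplementation only; the name and signature are assumptions.
    """
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    missing = length - len(text)
    # Strings that are already long enough are returned unchanged.
    if missing <= 0:
        return text
    return fill * missing + text


print(left_pad("42", 5, "0"))  # 00042
print(left_pad("abcdef", 3))   # abcdef
```

That something this small sat underneath thousands of projects is exactly the point: size says nothing about how critical a dependency is.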
Another example then. The project colors.js has scored more than 3.3 billion downloads throughout its lifetime and has over 19,000 projects that depend on it. Faker.js, by the same author, has been retrieved 272 million times from the NPM repository, with over 2,500 dependents.
The corrupted colors.js version trapped applications in an infinite loop, printing “LIBERTY LIBERTY LIBERTY” followed by a sequence of gibberish. The developer himself sabotaged its functionality, and purged all functional code from the “faker” package in version 6.6.6.
It’s highly likely that the stunt relates to the sentiment the developer shared in 2020, to no longer support big companies with his “free work.”
Open source is part of our infrastructure, and we need to care for it as if it were our own project. No company would leave critical parts of its in-house tech stack unmaintained, so why are we willing to do so for the parts that are open source?
Documenting licenses and monitoring changes should be part of maintaining a company’s Software Bill of Materials (SBOM). I want you to ask yourselves the following questions:
- Who are the people in your company responsible for identifying and mitigating the impact of license changes?
- What projects in your stack do you think may be at risk of posing a challenge similar to the one Elastic did?
- Who is looking at the health of the software you rely on?
- Who leads due diligence of alternatives, so that when you need to change, it won’t be a knee-jerk response?
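Monitoring license changes can start very simply. The sketch below assumes you can export a package-to-license mapping from your SBOM tooling at two points in time, and it reports every package whose license was added, removed, or changed. The function and class names are hypothetical, and the sample data is illustrative (apart from the Grafana relicensing mentioned above).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LicenseChange:
    package: str
    old_license: Optional[str]  # None means the package is new
    new_license: Optional[str]  # None means the package was removed


def license_changes(previous: Dict[str, str], current: Dict[str, str]) -> List[LicenseChange]:
    """Compare two package -> license snapshots and list every difference."""
    changes = []
    for package in sorted(previous.keys() | current.keys()):
        old = previous.get(package)
        new = current.get(package)
        if old != new:
            changes.append(LicenseChange(package, old, new))
    return changes


# Illustrative snapshots; in practice these would come from your SBOM exports.
previous = {"grafana": "Apache-2.0", "example-lib": "MIT", "other-lib": "BSD-3-Clause"}
current = {"grafana": "AGPL-3.0-only", "example-lib": "MIT", "new-lib": "Apache-2.0"}

changes = license_changes(previous, current)
for change in changes:
    print(f"{change.package}: {change.old_license} -> {change.new_license}")
```

A check like this could run on every dependency update, so that a relicensing shows up as a reviewable event rather than a surprise. Who reviews it is still a question of ownership, not tooling.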
The Log4j flaw (CVE-2021-44228) scored 10/10 on the CVSS (Common Vulnerability Scoring System). But Log4j is developed by the Apache Software Foundation — that certainly signals health, right? Yet this happened.
We talk about open source being inherently secure: the code is out in the open, so if something is broken, people will see it and fix it. But then how do you explain Log4j, Heartbleed, or the Struts vulnerability? The “many eyes” argument is shaky; it needs the right people to look in the right places. Security is hard, and developers are looking to open source for solutions, not problems.
Produced in partnership with the Laboratory for Innovation Science at Harvard (LISH) and the Open Source Security Foundation (OpenSSF), Census II is the second investigation into the widespread use of FOSS. It aggregates data from over half a million observations of FOSS libraries used in production applications at thousands of companies. It aims to shed light on the most commonly used packages at the application library level, so that resources can be prioritized to address security issues in this widely used software. Please take note of that last point: resource prioritization. Exactly.
The Open Source Project Criticality Score (Beta) is an interesting project maintained by members of the OpenSSF Securing Critical Projects WG. It generates a criticality score for every open source project, creating a list of critical projects that the open source community depends on, so that the data can be used to proactively improve the security posture of these projects.
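To make the idea of a criticality score concrete, here is a minimal sketch that assumes the score is a weighted average of log-scaled signals, each capped at a threshold. To my understanding that is the general shape the project describes, but treat the formula as an assumption and check the project's documentation. The signal values, weights, and thresholds below are made up for illustration.

```python
import math
from typing import List, Tuple

# Each signal is (observed value, weight, threshold); all three are assumptions here.
Signal = Tuple[float, float, float]


def criticality_score(signals: List[Signal]) -> float:
    """Weighted average of log-scaled signals, yielding a value between 0 and 1."""
    total_weight = sum(weight for _, weight, _ in signals)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = 0.0
    for value, weight, threshold in signals:
        if value < 0 or threshold <= 0:
            raise ValueError("values must be >= 0 and thresholds > 0")
        # A signal at or above its threshold contributes its full weight.
        weighted_sum += weight * math.log(1 + value) / math.log(1 + max(value, threshold))
    return weighted_sum / total_weight


# Hypothetical project: one signal at its threshold, one signal at zero.
example = [(100, 1.0, 100), (0, 1.0, 10)]
print(criticality_score(example))  # 0.5
```

The log scaling means the first few contributors or dependents move the score far more than the thousandth does, which matches the intuition that a project with one maintainer is in a very different position from a project with ten.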
Mitigation and Support
If you’re an employee at a company that heavily relies on open source components (spoiler: you are) I’d encourage you to advocate on behalf of FOSS internally. Advise and help navigate your organization’s participation in open source, by abiding by the Principles of Authentic Participation, which zoom in on corporate accountability in the context of open source.
Your organization could sponsor projects using GitHub Sponsors or Open Collective. Foundations like the ASF and the CNCF act as stewards for the open source projects in their care. Supporting these organizations to do more great work is definitely a way to leave open source in a better place than you found it.
The concept of the Open Source Program Office has been around for a while, but it didn’t really gain adoption until recently, as awareness grew of how critical free and open source software is as the supply chain that underpins pretty much all technology, anywhere. Find out whether your organization has any plans in this direction.
When your company considers hiring maintainers, make sure they’re empowered to balance internal and external feature requests, and that scope and strategy won’t change on them with every new fiscal year.
Discussions around the sustainability of open source are hard, but they’re necessary. FOSS is ubiquitous, it’s omnipresent, yet we’re still struggling to live with open source in a healthy, safe, and productive way. Understand the risks and then help open source, well, help open source.