Open Source AI and the Llama 2 Kerfuffle

You can’t call AI open source with any certainty, said Amanda Brock, CEO of OpenUK, in a recent discussion about open source in the age of AI.
The picture gets muddier when considering the roots of open source, the trend away from purely open source licenses, and how to treat AI, which touches almost everything, particularly personal and health data protected by privacy laws.
Our topic is Llama 2 and the conversation that ensued when the large language model (LLM) launched and the team at Meta labeled it open source. Llama 2 is not open source, even though the Meta blog says it is, as does the page for downloading Llama 2. It’s confusing. For instance, “open innovation” is the term that Meta uses on its site. So, what is it? It may have support from the vendors, but it’s not open source.
Erica Brescia, a managing director at Redpoint Ventures, and Steven Vaughan-Nichols, founder of Open Source Watch, joined the discussion with Brock.
The Open Source Initiative (OSI) emerged in 1998, looking for a more collaborative approach to licensing. Out of OSI came the Open Source Definition. Open source licenses are those that comply with the definition.
“I don’t think we’re going to see going forward any LLM or any significant AI being able to be licensed as open source, because the key to open source is the Open Source Definition,” Brock said.
“And we’re just not going to see that with the LLM,” Brock said. “So we wanted to support a move forward and an opening up of innovation, but certainly not a mischaracterization of it as open source.”
We need language to talk about open source and AI. That may mean evolving the Open Source Definition to recognize where the world is today compared with the pre-cloud age, when the community created the definition, Brescia said.
OSI is developing a new definition for open source and AI. Work is ongoing. The OSI team will host a third community review September 19-21, 2023 at the Open Source Summit in Bilbao, Spain.
Brescia said that open source will continue to get watered down if we don’t evolve how we think about open source and how we define open source licenses.
“Because what’s going to happen is, you know, folks who I think are following more open development practices that many of us want to see encouraged are just going to stop even trying to get close to open source because they can’t find a way to do that and still build a business,” Brescia said. “And I know there will be purists who argue, ‘Hey, that’s not what open source is about.’ But I think a lot of the world has moved on. And honestly, because open source has kind of won and become so pervasive, people are focused on other things now than they were 20 years ago when these definitions all came out.”
Some vendors have shifted from open source licenses to the Server Side Public License (SSPL).
“They all faced the cloud, and they all blinked,” Vaughan-Nichols said.
They all changed their licenses and alienated their communities when they did so, he added.
The rise of the cloud in software development has also brought a new capability to use data in generative AI. That raises a question about how changing definitions will affect the existing open source licenses that serve as the foundation for so much software development.