Open Source Initiative tries to define Open Source AI • The Register
The Open Source Initiative – the non-profit overseeing the Open Source Definition, which lays out the requirements for software licenses – is taking its effort to define Open Source AI to the wisdom of the crowds.
The public benefit biz is embarking on a global series of workshops to solicit input from concerned parties on its Open Source AI Definition, which has been under discussion for the past two years.
The issue is that there’s no accepted way to determine whether or not an AI system is open source, despite the fact that many machine learning models are already offered under open source licenses (such as MIT, GPL 2.0, GPL 3.0, and AFL 3.0).
There’s concern that the legal language in existing OSI-approved licenses doesn’t necessarily suit the way machine learning models and datasets are used. When applied to machine learning, a term like “program” covers more than just source code and binary files, for example – it may also encompass model weights and training data.
“AI is different from regular software and forces all stakeholders to review how the Open Source principles apply to this space,” Stefano Maffulli, executive director of the OSI, explained in a statement.
“OSI believes that everybody deserves to maintain agency and control of the technology. We also recognize that markets flourish when clear definitions promote transparency, collaboration, and permissionless innovation.”
OSI is thus embarking on a roadshow to gather feedback about its latest draft – presently at v0.0.8. The workshops will take place at various upcoming conferences in the US, Europe, Africa, Asia-Pacific, and Latin America through September.
Bruce Perens, who drafted the original Open Source Definition, told The Register that he was skeptical about the need to address AI separately.
“I think the problem is not that AI vendors are saying their software is open source when it’s not. It’s the entire software industry saying their software is open source when it isn’t,” argued Perens, who split with the OSI four years ago.
“I think this is going to confuse the open source brand, because OSI already has an open source definition that applies to all software. And now we’re going to have a second one that only applies to AI?
“I think the fundamental problem with AI is that its output is inherently plagiarism,” Perens explained.
“Large language models are trained on websites and open source software without regard for their copyright. And their output is just a mix and match of their input. That problem will be dealt with by courts, just as Napster was.” ®