If open source is to win, it must go public
- URL: http://arxiv.org/abs/2507.09296v1
- Date: Sat, 12 Jul 2025 14:16:28 GMT
- Title: If open source is to win, it must go public
- Authors: Joshua Tan, Nicholas Vincent, Katherine Elkins, Magnus Sahlgren,
- Abstract summary: Open source projects have made incredible progress in producing transparent and widely usable machine learning models and systems. But open source alone will face challenges in fully democratizing access to AI. This paper argues that open source AI must be complemented by public AI.
- Score: 11.101077002196202
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open source projects have made incredible progress in producing transparent and widely usable machine learning models and systems, but open source alone will face challenges in fully democratizing access to AI. Unlike software, AI models require substantial resources for activation -- compute, post-training, deployment, and oversight -- which only a few actors can currently provide. This paper argues that open source AI must be complemented by public AI: infrastructure and institutions that ensure models are accessible, sustainable, and governed in the public interest. To achieve the full promise of AI models as prosocial public goods, we need to build public infrastructure to power and deliver open source software and models.
Related papers
- Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training [67.895981259683]
General AI Agents are increasingly recognized as foundational frameworks for the next generation of artificial intelligence. Current agent systems are either closed-source or heavily reliant on a variety of paid APIs and proprietary tools. We present Cognitive Kernel-Pro, a fully open-source and (to the maximum extent) free multi-module agent framework.
arXiv Detail & Related papers (2025-08-01T08:11:31Z)
- The Case for Contextual Copyleft: Licensing Open Source Training Data and Generative AI [1.2776470520481564]
This article introduces the Contextual Copyleft AI (CCAI) license, a novel licensing mechanism that extends copyleft requirements from training data to the resulting generative AI models. The CCAI license offers significant advantages, including enhanced developer control, incentivization of open source AI development, and mitigation of openwashing practices.
arXiv Detail & Related papers (2025-07-17T01:42:51Z)
- A Community-driven vision for a new Knowledge Resource for AI [59.29703403953085]
Despite the success of knowledge resources like WordNet, verifiable, general-purpose widely available sources of knowledge remain a critical deficiency in AI infrastructure. This paper synthesizes our findings and outlines a community-driven vision for a new knowledge infrastructure.
arXiv Detail & Related papers (2025-06-19T20:51:28Z)
- Is Open Source the Future of AI? A Data-Driven Approach [41.94295877935867]
Large Language Models (LLMs) have become central in academia and industry. A key issue is the trustworthiness of proprietary models, with open-sourcing often proposed as a solution. Open-sourcing presents challenges, including potential misuse, financial disincentives, and intellectual property concerns.
arXiv Detail & Related papers (2025-01-27T09:03:49Z)
- Risks and Opportunities of Open-Source Generative AI [64.86989162783648]
Applications of Generative AI (Gen AI) are expected to revolutionize a number of different areas, ranging from science & medicine to education.
The potential for these seismic changes has triggered a lively debate about the potential risks of the technology, and resulted in calls for tighter regulation.
This regulation is likely to put at risk the budding field of open-source generative AI.
arXiv Detail & Related papers (2024-05-14T13:37:36Z)
- Near to Mid-term Risks and Opportunities of Open-Source Generative AI [94.06233419171016]
Applications of Generative AI are expected to revolutionize a number of different areas, ranging from science & medicine to education.
The potential for these seismic changes has triggered a lively debate about potential risks and resulted in calls for tighter regulation.
This regulation is likely to put at risk the budding field of open-source Generative AI.
arXiv Detail & Related papers (2024-04-25T21:14:24Z)
- The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency, and Usability in Artificial Intelligence [0.0]
We introduce the Model Openness Framework (MOF), a three-tiered ranked classification system that rates machine learning models based on their completeness and openness.
For each MOF class, we specify code, data, and documentation components of the model development lifecycle that must be released and under which open licenses.
In addition, the Model Openness Tool (MOT) provides a user-friendly reference implementation to evaluate the openness and completeness of models against the MOF classification system.
arXiv Detail & Related papers (2024-03-20T17:47:08Z)
- Is open source software culture enough to make AI a common ? [0.0]
Language models (LM) are increasingly deployed in the field of artificial intelligence (AI).
The question arises as to whether they can be a common resource managed and maintained by a community of users.
We highlight the potential benefits of treating the data and resources needed to create LMs as commons.
arXiv Detail & Related papers (2024-03-19T14:43:52Z)
- Computing Power and the Governance of Artificial Intelligence [51.967584623262674]
Governments and companies have started to leverage compute as a means to govern AI.
Compute-based policies and technologies have the potential to assist in these areas, but there is significant variation in their readiness for implementation.
Naive or poorly scoped approaches to compute governance carry significant risks in areas like privacy, economic impacts, and centralization of power.
arXiv Detail & Related papers (2024-02-13T21:10:21Z)
- Open-Sourcing Highly Capable Foundation Models: An evaluation of risks, benefits, and alternative methods for pursuing open-source objectives [6.575445633821399]
Recent decisions by leading AI labs to either open-source their models or to restrict access to their models have sparked debate.
This paper offers an examination of the risks and benefits of open-sourcing highly capable foundation models.
arXiv Detail & Related papers (2023-09-29T17:03:45Z)
- The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence [67.70415658080121]
Recent advances in machine learning and AI are disrupting technological innovation, product development, and society as a whole.
AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access.
Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery.
arXiv Detail & Related papers (2023-07-09T21:16:56Z)
- h2oGPT: Democratizing Large Language Models [1.8043055303852882]
We introduce h2oGPT, a suite of open-source code repositories for the creation and use of Large Language Models.
The goal of this project is to create the world's best truly open-source alternative to closed-source approaches.
arXiv Detail & Related papers (2023-06-13T22:19:53Z)