A Different Approach to AI Safety: Proceedings from the Columbia Convening on Openness in Artificial Intelligence and AI Safety
- URL: http://arxiv.org/abs/2506.22183v1
- Date: Fri, 27 Jun 2025 12:45:44 GMT
- Title: A Different Approach to AI Safety: Proceedings from the Columbia Convening on Openness in Artificial Intelligence and AI Safety
- Authors: Camille François, Ludovic Péran, Ayah Bdeir, Nouha Dziri, Will Hawkins, Yacine Jernite, Sayash Kapoor, Juliet Shen, Heidy Khlaaf, Kevin Klyman, Nik Marda, Marie Pellat, Deb Raji, Divya Siddarth, Aviya Skowron, Joseph Spisak, Madhulika Srikumar, Victor Storchan, Audrey Tang, Jen Weedon
- Abstract summary: Open-weight and open-source foundation models are intensifying the obligation to make AI systems safe. This paper reports outcomes from the Columbia Convening on AI Openness and Safety.
- Score: 12.885990679810831
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The rapid rise of open-weight and open-source foundation models is intensifying the obligation and reshaping the opportunity to make AI systems safe. This paper reports outcomes from the Columbia Convening on AI Openness and Safety (San Francisco, 19 Nov 2024) and its six-week preparatory programme involving more than forty-five researchers, engineers, and policy leaders from academia, industry, civil society, and government. Using a participatory, solutions-oriented process, the working groups produced (i) a research agenda at the intersection of safety and open source AI; (ii) a mapping of existing and needed technical interventions and open source tools to safely and responsibly deploy open foundation models across the AI development workflow; and (iii) a mapping of the content safety filter ecosystem with a proposed roadmap for future research and development. We find that openness -- understood as transparent weights, interoperable tooling, and public governance -- can enhance safety by enabling independent scrutiny, decentralized mitigation, and culturally plural oversight. However, significant gaps persist: scarce multimodal and multilingual benchmarks, limited defenses against prompt-injection and compositional attacks in agentic systems, and insufficient participatory mechanisms for communities most affected by AI harms. The paper concludes with a roadmap of five priority research directions, emphasizing participatory inputs, future-proof content filters, ecosystem-wide safety infrastructure, rigorous agentic safeguards, and expanded harm taxonomies. These recommendations informed the February 2025 French AI Action Summit and lay groundwork for an open, plural, and accountable AI safety discipline.
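To make the tooling and content-filter mappings in (ii) and (iii) concrete, below is a minimal sketch of a pre- and post-generation content safety filter hook for an open-model deployment pipeline. The categories, regex patterns, and `generate` stub are illustrative assumptions for this example, not artifacts produced by the convening.

```python
# Illustrative sketch only: a toy pre/post-generation content safety filter
# hook. Real deployments would use trained classifiers and a richer harm
# taxonomy rather than hand-written patterns.
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilterResult:
    allowed: bool
    category: Optional[str] = None


# Toy category -> pattern map (assumption for the example).
UNSAFE_PATTERNS = {
    "self_harm": re.compile(r"\bhow to harm myself\b", re.IGNORECASE),
    "weapons": re.compile(r"\bbuild a bomb\b", re.IGNORECASE),
}


def check(text: str) -> FilterResult:
    """Return whether text passes the filter, and the category it tripped."""
    for category, pattern in UNSAFE_PATTERNS.items():
        if pattern.search(text):
            return FilterResult(allowed=False, category=category)
    return FilterResult(allowed=True)


def generate(prompt: str) -> str:
    # Stand-in for a call to an open-weight model (e.g. a local inference server).
    return f"[model response to: {prompt}]"


def safe_generate(prompt: str) -> str:
    # Filter the prompt before generation and the output after generation.
    pre = check(prompt)
    if not pre.allowed:
        return f"Request refused (category: {pre.category})."
    output = generate(prompt)
    post = check(output)
    if not post.allowed:
        return f"Response withheld (category: {post.category})."
    return output


if __name__ == "__main__":
    print(safe_generate("Summarize the Columbia Convening roadmap."))
```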
Related papers
- The Singapore Consensus on Global AI Safety Research Priorities [129.2088011234438]
"2025 Singapore Conference on AI (SCAI): International Scientific Exchange on AI Safety" aimed to support research in this space.<n>Report builds on the International AI Safety Report chaired by Yoshua Bengio and backed by 33 governments.<n>Report organises AI safety research domains into three types: challenges with creating trustworthy AI systems (Development), challenges with evaluating their risks (Assessment) and challenges with monitoring and intervening after deployment (Control)
arXiv Detail & Related papers (2025-06-25T17:59:50Z) - Report on NSF Workshop on Science of Safe AI [75.96202715567088]
New advances in machine learning are leading to new opportunities to develop technology-based solutions to societal problems. To fulfill the promise of AI, we must address how to develop AI-based systems that are accurate and performant but also safe and trustworthy. This report is the result of the discussions in the working groups that addressed different aspects of safety at the workshop.
arXiv Detail & Related papers (2025-06-24T18:55:29Z) - Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents [0.0]
Decentralized AI agents will soon interact across internet platforms, creating security challenges beyond traditional cybersecurity and AI safety frameworks. We introduce multi-agent security, a new field dedicated to securing networks of decentralized AI agents against threats that emerge or amplify through their interactions.
arXiv Detail & Related papers (2025-05-04T12:03:29Z) - In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI [93.33036653316591]
We call for three interventions to advance system safety. First, we propose using standardized AI flaw reports and rules of engagement for researchers (a minimal illustrative report schema appears after this list). Second, we propose that GPAI system providers adopt broadly-scoped flaw disclosure programs. Third, we advocate for the development of improved infrastructure to coordinate distribution of flaw reports.
arXiv Detail & Related papers (2025-03-21T05:09:46Z) - Enabling External Scrutiny of AI Systems with Privacy-Enhancing Technologies [0.0]
This article describes how technical infrastructure developed by the nonprofit OpenMined enables external scrutiny of AI systems without compromising sensitive information. In practice, external researchers have struggled to gain access to AI systems because of AI companies' legitimate concerns about security, privacy, and intellectual property. Privacy-enhancing technologies (PETs) have reached a new level of maturity: end-to-end technical infrastructure developed by OpenMined combines several PETs into various setups that enable privacy-preserving audits of AI systems.
arXiv Detail & Related papers (2025-02-05T15:31:11Z) - Transparency, Security, and Workplace Training & Awareness in the Age of Generative AI [0.0]
As AI technologies advance, ethical considerations, transparency, data privacy, and their impact on human labor intersect with the drive for innovation and efficiency. Our research explores publicly accessible large language models (LLMs) that often operate on the periphery, away from mainstream scrutiny. Specifically, we examine Gab AI, a platform that centers around unrestricted communication and privacy, allowing users to interact freely without censorship.
arXiv Detail & Related papers (2024-12-19T17:40:58Z) - Position: Mind the Gap-the Growing Disconnect Between Established Vulnerability Disclosure and AI Security [56.219994752894294]
We argue that adapting existing vulnerability disclosure processes for AI security reporting is doomed to fail because of fundamental mismatches with the distinctive characteristics of AI systems. Based on our proposal to address these shortcomings, we discuss an approach to AI security reporting and how the emerging paradigm of AI agents will further reinforce the need for specialized AI security incident reporting.
arXiv Detail & Related papers (2024-12-19T13:50:26Z) - Considerations Influencing Offense-Defense Dynamics From Artificial Intelligence [0.0]
AI can enhance defensive capabilities but also presents avenues for malicious exploitation and large-scale societal harm. This paper proposes a taxonomy to map and examine the key factors that influence whether AI systems predominantly pose threats or offer protective benefits to society.
arXiv Detail & Related papers (2024-12-05T10:05:53Z) - Building Trust: Foundations of Security, Safety and Transparency in AI [0.23301643766310373]
We review the current security and safety scenarios while highlighting challenges such as tracking issues, remediation, and the apparent absence of AI model lifecycle and ownership processes.
This paper aims to provide some of the foundational pieces for more standardized security, safety, and transparency in the development and operation of AI models.
arXiv Detail & Related papers (2024-11-19T06:55:57Z) - Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees, built from three core components: a world model, a safety specification, and a verifier.
We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z) - Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems.
There is a lack of consensus about how exactly such risks arise, and how to manage them.
Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z)
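For illustration only, the sketch below shows one way a standardized AI flaw report, as proposed in "In-House Evaluation Is Not Enough" above, might be structured. The field names, severity scale, and serialization are assumptions made for this example, not the paper's actual schema.

```python
# Illustrative sketch: a hypothetical standardized AI flaw report structure
# for coordinated third-party disclosure. All fields are assumptions.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json


@dataclass
class AIFlawReport:
    system: str                                 # affected GPAI system and version
    summary: str                                # short description of the flaw
    severity: str                               # e.g. "low", "medium", "high", "critical"
    reproduction_steps: list[str] = field(default_factory=list)
    reporter: str = "anonymous"
    reported_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


report = AIFlawReport(
    system="open-model-x v1.2",                 # hypothetical system name
    summary="Prompt-injected tool call bypasses refusal policy.",
    severity="high",
    reproduction_steps=["Send crafted prompt with embedded instruction",
                        "Observe unsafe tool call in agent trace"],
)

# Serialize for submission to a coordinated disclosure program.
print(json.dumps(asdict(report), indent=2))
```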