The 2025 OpenAI Preparedness Framework does not guarantee any AI risk mitigation practices: a proof-of-concept for affordance analyses of AI safety policies
- URL: http://arxiv.org/abs/2509.24394v2
- Date: Mon, 13 Oct 2025 02:00:51 GMT
- Title: The 2025 OpenAI Preparedness Framework does not guarantee any AI risk mitigation practices: a proof-of-concept for affordance analyses of AI safety policies
- Authors: Sam Coggins, Alexander K. Saeri, Katherine A. Daniell, Lorenn P. Ruster, Jessie Liu, Jenny L. Davis
- Abstract summary: We analyse the OpenAI 'Preparedness Framework Version 2' (April 2025) using the Mechanisms & Conditions model of affordances and the MIT AI Risk Repository. We find that this safety policy requests evaluation of a small minority of AI risks, encourages deployment of systems with 'Medium' capabilities for unintentionally enabling 'severe harm', and allows OpenAI's CEO to deploy even more dangerous capabilities. These findings suggest that effective mitigation of AI risks requires more robust governance interventions beyond current industry self-regulation.
- Score: 35.43144920451646
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prominent AI companies are producing 'safety frameworks' as a type of voluntary self-governance. These statements purport to establish risk thresholds and safety procedures for the development and deployment of highly capable AI. Understanding which AI risks are covered and what actions are allowed, refused, demanded, encouraged, or discouraged by these statements is vital for assessing how these frameworks actually govern AI development and deployment. We draw on affordance theory to analyse the OpenAI 'Preparedness Framework Version 2' (April 2025) using the Mechanisms & Conditions model of affordances and the MIT AI Risk Repository. We find that this safety policy requests evaluation of a small minority of AI risks, encourages deployment of systems with 'Medium' capabilities for unintentionally enabling 'severe harm' (which OpenAI defines as >1000 deaths or >$100B in damages), and allows OpenAI's CEO to deploy even more dangerous capabilities. These findings suggest that effective mitigation of AI risks requires more robust governance interventions beyond current industry self-regulation. Our affordance analysis provides a replicable method for evaluating what safety frameworks actually permit versus what they claim.
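To make the "replicable method" claim concrete, below is a minimal, hypothetical sketch of how policy clauses might be coded against the Mechanisms & Conditions affordance categories (requests, demands, encourages, discourages, refuses, allows) and tallied against risk-domain coverage. This is not the authors' actual coding instrument; the class names, clause texts, and domain labels are illustrative assumptions.

```python
# Hypothetical sketch of an affordance coding pass over safety-policy clauses.
# Mechanism labels follow the Mechanisms & Conditions model; the example
# clauses and risk-domain strings below are invented for illustration only.
from collections import Counter
from dataclasses import dataclass

MECHANISMS = {"requests", "demands", "encourages", "discourages", "refuses", "allows"}

@dataclass
class CodedClause:
    clause: str        # verbatim policy text being coded
    mechanism: str     # one of MECHANISMS
    risk_domain: str   # e.g. a MIT AI Risk Repository domain label

    def __post_init__(self):
        if self.mechanism not in MECHANISMS:
            raise ValueError(f"unknown mechanism: {self.mechanism}")

def coverage_report(clauses, all_domains):
    """Tally mechanism usage and list risk domains no clause addresses."""
    by_mechanism = Counter(c.mechanism for c in clauses)
    covered = {c.risk_domain for c in clauses}
    return by_mechanism, sorted(set(all_domains) - covered)

# Illustrative usage with invented clauses:
coded = [
    CodedClause("Evaluate models against cyber capability thresholds.",
                "requests", "Malicious use: cyber"),
    CodedClause("Systems at 'Medium' capability may be deployed.",
                "allows", "Unintentional severe harm"),
]
mechanisms, uncovered = coverage_report(
    coded,
    ["Malicious use: cyber", "Unintentional severe harm",
     "Discrimination & toxicity"],
)
print(mechanisms)  # Counter({'requests': 1, 'allows': 1})
print(uncovered)   # ['Discrimination & toxicity']
```

Tallies of this form would surface exactly the paper's headline pattern: which mechanisms a framework actually deploys, and which catalogued risks it never asks anyone to evaluate.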
Related papers
- Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5 [61.787178868669265]
This technical report presents an updated and granular assessment of five critical dimensions: cyber offense, persuasion and manipulation, strategic deception, uncontrolled AI R&D, and self-replication. This work reflects our current understanding of AI frontier risks and urges collective action to mitigate these challenges.
arXiv Detail & Related papers (2026-02-16T04:30:06Z) - International AI Safety Report 2025: Second Key Update: Technical Safeguards and Risk Management [115.92752850425272]
This second update to the 2025 International AI Safety Report assesses new developments in general-purpose AI risk management over the past year. It examines how researchers, public institutions, and AI developers are approaching risk management for general-purpose AI.
arXiv Detail & Related papers (2025-11-25T03:12:56Z) - Responsible AI Technical Report [2.855225489126354]
KT developed a Responsible AI (RAI) assessment methodology and risk mitigation technologies to ensure the safety and reliability of AI services. We present a reliable assessment methodology that verifies model safety and robustness based on KT's AI risk taxonomy tailored to the domestic environment. We also provide practical tools for managing and mitigating identified AI risks.
arXiv Detail & Related papers (2025-09-24T12:26:33Z) - Governable AI: Provable Safety Under Extreme Threat Models [31.36879992618843]
We propose a Governable AI (GAI) framework that shifts from traditional internal constraints to externally enforced structural compliance. The GAI framework is composed of a simple yet reliable, fully deterministic, powerful, flexible, and general-purpose rule enforcement module (REM); governance rules; and a governable secure super-platform (GSSP) that offers end-to-end protection against compromise or subversion by AI.
arXiv Detail & Related papers (2025-08-28T04:22:59Z) - Never Compromise to Vulnerabilities: A Comprehensive Survey on AI Governance [211.5823259429128]
We propose a comprehensive framework integrating technical and societal dimensions, structured around three interconnected pillars: Intrinsic Security, Derivative Security, and Social Ethics. We identify three core challenges: (1) the generalization gap, where defenses fail against evolving threats; (2) inadequate evaluation protocols that overlook real-world risks; and (3) fragmented regulations leading to inconsistent oversight. Our framework offers actionable guidance for researchers, engineers, and policymakers to develop AI systems that are not only robust and secure but also ethically aligned and publicly trustworthy.
arXiv Detail & Related papers (2025-08-12T09:42:56Z) - A Framework for the Assurance of AI-Enabled Systems [0.0]
This paper proposes a claims-based framework for risk management and assurance of AI systems. The paper's contributions are a framework process for AI assurance, a set of relevant definitions, and a discussion of important considerations in AI assurance.
arXiv Detail & Related papers (2025-04-03T13:44:01Z) - Position: Mind the Gap-the Growing Disconnect Between Established Vulnerability Disclosure and AI Security [56.219994752894294]
We argue that adapting existing processes for AI security reporting is doomed to fail because of fundamental shortcomings in handling the distinctive characteristics of AI systems. Based on our proposal to address these shortcomings, we discuss an approach to AI security reporting and how the new AI paradigm of AI agents will further reinforce the need for specialized advances in AI security incident reporting.
arXiv Detail & Related papers (2024-12-19T13:50:26Z) - Engineering Trustworthy AI: A Developer Guide for Empirical Risk Minimization [53.80919781981027]
Key requirements for trustworthy AI can be translated into design choices for the components of empirical risk minimization.
We hope to provide actionable guidance for building AI systems that meet emerging standards for trustworthiness of AI.
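As one concrete reading of that claim (my own illustrative rendering, not an equation taken from the paper), trustworthiness requirements can appear as choices of the loss, hypothesis class, and regularizer in the standard regularized empirical risk minimization objective:

```latex
% Standard regularized ERM; notation is illustrative:
% \mathcal{H} = hypothesis class, \ell = loss, \Omega = regularizer.
\hat{f} \;=\; \arg\min_{f \in \mathcal{H}}\;
  \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr)
  \;+\; \lambda\, \Omega(f)
```

Under this reading, robustness requirements would constrain the loss $\ell$, interpretability or capacity requirements the class $\mathcal{H}$, and stability requirements the regularizer $\Omega$.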
arXiv Detail & Related papers (2024-10-25T07:53:32Z) - Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems.
There is a lack of consensus about how exactly such risks arise, and how to manage them.
Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z) - AI Hazard Management: A framework for the systematic management of root causes for AI risks [0.0]
This paper introduces the AI Hazard Management (AIHM) framework.
It provides a structured process to systematically identify, assess, and treat AI hazards.
It builds upon an AI hazard list from a comprehensive state-of-the-art analysis.
arXiv Detail & Related papers (2023-10-25T15:55:50Z) - AI Liability Insurance With an Example in AI-Powered E-diagnosis System [22.102728605081534]
We use an AI-powered E-diagnosis system as an example to study AI liability insurance.
We show that AI liability insurance can act as a regulatory mechanism to incentivize compliant behaviors and serve as a certificate of high-quality AI systems.
arXiv Detail & Related papers (2023-06-01T21:03:47Z)