Related papers: AI Risk-Management Standards Profile for General-Purpose AI (GPAI) and Foundation Models

URL: http://arxiv.org/abs/2506.23949v1
Date: Mon, 30 Jun 2025 15:18:18 GMT
Title: AI Risk-Management Standards Profile for General-Purpose AI (GPAI) and Foundation Models
Authors: Anthony M. Barrett, Jessica Newman, Brandie Nonnecke, Nada Madkour, Dan Hendrycks, Evan R. Murphy, Krystal Jackson, Deepika Raman,
Abstract summary: This document provides risk-management practices or controls for identifying, analyzing, and mitigating risks of GPAI/foundation models. We intend this document primarily for developers of large-scale, state-of-the-art GPAI/foundation models.
Score: 15.890326508488673
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Increasingly multi-purpose AI models, such as cutting-edge large language models or other 'general-purpose AI' (GPAI) models, 'foundation models,' generative AI models, and 'frontier models' (typically all referred to hereafter with the umbrella term 'GPAI/foundation models' except where greater specificity is needed), can provide many beneficial capabilities but also risks of adverse events with profound consequences. This document provides risk-management practices or controls for identifying, analyzing, and mitigating risks of GPAI/foundation models. We intend this document primarily for developers of large-scale, state-of-the-art GPAI/foundation models; others that can benefit from this guidance include downstream developers of end-use applications that build on a GPAI/foundation model. This document facilitates conformity with or use of leading AI risk management-related standards, adapting and building on the generic voluntary guidance in the NIST AI Risk Management Framework and ISO/IEC 23894, with a focus on the unique issues faced by developers of GPAI/foundation models.

Related papers

AI in a vat: Fundamental limits of efficient world modelling for agent sandboxing and interpretability [84.52205243353761]
Recent work proposes using world models to generate controlled virtual environments in which AI agents can be tested before deployment. We investigate ways of simplifying world models that remain agnostic to the AI agent under evaluation.
arXiv Detail & Related papers (2025-04-06T20:35:44Z)
Could AI Trace and Explain the Origins of AI-Generated Images and Text? [53.11173194293537]
AI-generated content is increasingly prevalent in the real world. Adversaries might exploit large multimodal models to create images that violate ethical or legal standards. Paper reviewers may misuse large language models to generate reviews without genuine intellectual effort.
arXiv Detail & Related papers (2025-04-05T20:51:54Z)
Safety at Scale: A Comprehensive Survey of Large Model and Agent Safety [296.5392512998251]
We present a comprehensive taxonomy of safety threats to large models, including adversarial attacks, data poisoning, backdoor attacks, jailbreak and prompt injection attacks, energy-latency attacks, data and model extraction attacks, and emerging agent-specific threats. We identify and discuss the open challenges in large model safety, emphasizing the need for comprehensive safety evaluations, scalable and effective defense mechanisms, and sustainable data practices.
arXiv Detail & Related papers (2025-02-02T05:14:22Z)
Supervision policies can shape long-term risk management in general-purpose AI models [0.0]
We develop a simulation framework parameterized by features extracted from the diverse landscape of risk, incident, or hazard reporting ecosystems. We evaluate four supervision policies: non-prioritized (first-come, first-served), random selection, priority-based (addressing the highest-priority risks first), and diversity-prioritized (balancing high-priority risks with comprehensive coverage across risk types). Our results indicate that while priority-based and diversity-prioritized policies are more effective at mitigating high-impact risks, they may inadvertently neglect systemic issues reported by the broader community.
arXiv Detail & Related papers (2025-01-10T17:52:34Z)
Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems [2.3266896180922187]
We compile an extensive catalog of risk sources and risk management measures for general-purpose AI systems. This work involves identifying technical, operational, and societal risks across model development, training, and deployment stages. The catalog is released under a public domain license for ease of direct use by stakeholders in AI governance and standards.
arXiv Detail & Related papers (2024-10-30T21:32:56Z)
Engineering Trustworthy AI: A Developer Guide for Empirical Risk Minimization [53.80919781981027]
Key requirements for trustworthy AI can be translated into design choices for the components of empirical risk minimization. We hope to provide actionable guidance for building AI systems that meet emerging standards for trustworthiness of AI.
arXiv Detail & Related papers (2024-10-25T07:53:32Z)
EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
Generative AI Models: Opportunities and Risks for Industry and Authorities [1.3196892898418466]
Generative AI models are capable of performing a wide variety of tasks that have traditionally required creativity and human understanding. During training, they learn patterns from existing data and can subsequently generate new content. Many risks associated with generative AI must be addressed during development or can only be influenced by the operating organisation.
arXiv Detail & Related papers (2024-06-07T08:34:30Z)
Risks and Opportunities of Open-Source Generative AI [64.86989162783648]
Applications of Generative AI (Gen AI) are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about the potential risks of the technology, and resulted in calls for tighter regulation. This regulation is likely to put at risk the budding field of open-source generative AI.
arXiv Detail & Related papers (2024-05-14T13:37:36Z)
Deployment Corrections: An incident response framework for frontier AI models [0.0]
This paper explores contingency plans for cases where pre-deployment risk management falls short. We describe a toolkit of deployment corrections that AI developers can use to respond to dangerous capabilities. We recommend that frontier AI developers, standard-setting organizations, and regulators collaborate to define a standardized industry-wide approach.
arXiv Detail & Related papers (2023-09-30T10:07:39Z)
Frontier AI Regulation: Managing Emerging Risks to Public Safety [15.85618115026625]
"Frontier AI" models could possess dangerous capabilities sufficient to pose severe risks to public safety. Industry self-regulation is an important first step. We propose an initial set of safety standards.
arXiv Detail & Related papers (2023-07-06T17:03:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.