Related papers: Risk thresholds for frontier AI

URL: http://arxiv.org/abs/2406.14713v1
Date: Thu, 20 Jun 2024 20:16:29 GMT
Title: Risk thresholds for frontier AI
Authors: Leonie Koessler, Jonas Schuett, Markus Anderljung
Abstract summary: One increasingly popular approach to deciding how much risk is acceptable is to define capability thresholds. Risk thresholds, by contrast, simply state how much risk would be too much. Their main downside is that they are more difficult to evaluate reliably.
Score: 1.053373860696675
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Frontier artificial intelligence (AI) systems could pose increasing risks to public safety and security. But what level of risk is acceptable? One increasingly popular approach is to define capability thresholds, which describe AI capabilities beyond which an AI system is deemed to pose too much risk. A more direct approach is to define risk thresholds that simply state how much risk would be too much. For instance, they might state that the likelihood of cybercriminals using an AI system to cause X amount of economic damage must not increase by more than Y percentage points. The main upside of risk thresholds is that they are more principled than capability thresholds, but the main downside is that they are more difficult to evaluate reliably. For this reason, we currently recommend that companies (1) define risk thresholds to provide a principled foundation for their decision-making, (2) use these risk thresholds to help set capability thresholds, and then (3) primarily rely on capability thresholds to make their decisions. Regulators should also explore the area because, ultimately, they are the most legitimate actors to define risk thresholds. If AI risk estimates become more reliable, risk thresholds should arguably play an increasingly direct role in decision-making.
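Illustration: the abstract's example threshold (the likelihood of a harm scenario must not increase by more than Y percentage points) can be written as a simple check. The sketch below is only one possible reading of that sentence. The function name, parameter names, and numbers are hypothetical and do not come from the paper, and it assumes that both probabilities can actually be estimated, which the abstract names as the hard part.

```python
def exceeds_risk_threshold(baseline_prob: float,
                           prob_with_ai: float,
                           max_increase_pp: float) -> bool:
    """Return True if the AI system raises the probability of the harm
    scenario by more than the allowed number of percentage points.

    All names here are hypothetical illustrations, not the paper's notation.
    baseline_prob   -- estimated probability of the scenario without the system
    prob_with_ai    -- estimated probability of the scenario with the system
    max_increase_pp -- the threshold "Y", in percentage points
    """
    for p in (baseline_prob, prob_with_ai):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # Convert the absolute difference in probability to percentage points.
    increase_pp = (prob_with_ai - baseline_prob) * 100.0
    return increase_pp > max_increase_pp


# Made-up estimates: the scenario goes from 2% to 2.5% likely (0.5 points),
# which stays under a threshold of 1 percentage point.
print(exceeds_risk_threshold(0.02, 0.025, max_increase_pp=1.0))
```

The comparison uses percentage points (an absolute difference) rather than a relative change, matching the wording of the abstract's example.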

Related papers

A First-Principles Based Risk Assessment Framework and the IEEE P3396 Standard [0.0]
Generative Artificial Intelligence (AI) is enabling unprecedented automation in content creation and decision support. This paper presents a first-principles risk assessment framework underlying the IEEE P3396 Recommended Practice for AI Risk, Safety, Trustworthiness, and Responsibility.
arXiv Detail & Related papers (2025-03-31T18:00:03Z)
Intolerable Risk Threshold Recommendations for Artificial Intelligence [0.2383122657918106]
Frontier AI models may pose severe risks to public safety, human rights, economic stability, and societal value. Risks could arise from deliberate adversarial misuse, system failures, unintended cascading effects, or simultaneous failures across multiple models. 16 global AI industry organizations signed the Frontier AI Safety Commitments, and 27 nations and the EU issued a declaration on their intent to define these thresholds.
arXiv Detail & Related papers (2025-03-04T12:30:37Z)
Fully Autonomous AI Agents Should Not be Developed [58.88624302082713]
This paper argues that fully autonomous AI agents should not be developed. In support of this position, we build from prior scientific literature and current product marketing to delineate different AI agent levels. Our analysis reveals that risks to people increase with the autonomy of a system.
arXiv Detail & Related papers (2025-02-04T19:00:06Z)
Engineering Trustworthy AI: A Developer Guide for Empirical Risk Minimization [53.80919781981027]
Key requirements for trustworthy AI can be translated into design choices for the components of empirical risk minimization. We hope to provide actionable guidance for building AI systems that meet emerging standards for trustworthiness of AI.
arXiv Detail & Related papers (2024-10-25T07:53:32Z)
Risk Alignment in Agentic AI Systems [0.0]
Agentic AIs capable of undertaking complex actions with little supervision raise new questions about how to safely create and align such systems with users, developers, and society. Risk alignment will matter for user satisfaction and trust, but it will also have important ramifications for society more broadly. We present three papers that bear on key normative and technical aspects of these questions.
arXiv Detail & Related papers (2024-10-02T18:21:08Z)
The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From Artificial Intelligence [35.77247656798871]
The risks posed by Artificial Intelligence (AI) are of considerable concern to academics, auditors, policymakers, AI companies, and the public. A lack of shared understanding of AI risks can impede our ability to comprehensively discuss, research, and react to them. This paper addresses this gap by creating an AI Risk Repository to serve as a common frame of reference.
arXiv Detail & Related papers (2024-08-14T10:32:06Z)
Risks and Opportunities of Open-Source Generative AI [64.86989162783648]
Applications of Generative AI (Gen AI) are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about the potential risks of the technology, and resulted in calls for tighter regulation. This regulation is likely to put at risk the budding field of open-source generative AI.
arXiv Detail & Related papers (2024-05-14T13:37:36Z)
Near to Mid-term Risks and Opportunities of Open-Source Generative AI [94.06233419171016]
Applications of Generative AI are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about potential risks and resulted in calls for tighter regulation. This regulation is likely to put at risk the budding field of open-source Generative AI.
arXiv Detail & Related papers (2024-04-25T21:14:24Z)
RiskQ: Risk-sensitive Multi-Agent Reinforcement Learning Value Factorization [49.26510528455664]
We introduce the Risk-sensitive Individual-Global-Max (RIGM) principle as a generalization of the Individual-Global-Max (IGM) and Distributional IGM (DIGM) principles. We show that RiskQ can obtain promising performance through extensive experiments.
arXiv Detail & Related papers (2023-11-03T07:18:36Z)
Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems. There is a lack of consensus about how exactly such risks arise, and how to manage them. Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z)
Frontier AI Regulation: Managing Emerging Risks to Public Safety [15.85618115026625]
"Frontier AI" models could possess dangerous capabilities sufficient to pose severe risks to public safety. Industry self-regulation is an important first step. We propose an initial set of safety standards.
arXiv Detail & Related papers (2023-07-06T17:03:25Z)
Three lines of defense against risks from AI [0.0]
It is not always clear who is responsible for AI risk management. The Three Lines of Defense (3LoD) model is considered best practice in many industries. I suggest ways in which AI companies could implement the model.
arXiv Detail & Related papers (2022-12-16T09:33:00Z)
Quantitative AI Risk Assessments: Opportunities and Challenges [9.262092738841979]
AI-based systems are increasingly being leveraged to provide value to organizations, individuals, and society. Risks have led to proposed regulations, litigation, and general societal concerns. This paper explores the concept of a quantitative AI Risk Assessment.
arXiv Detail & Related papers (2022-09-13T21:47:25Z)
Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns. We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it. We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
arXiv Detail & Related papers (2022-05-10T19:40:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.