Supervision policies can shape long-term risk management in general-purpose AI models
- URL: http://arxiv.org/abs/2501.06137v1
- Date: Fri, 10 Jan 2025 17:52:34 GMT
- Title: Supervision policies can shape long-term risk management in general-purpose AI models
- Authors: Manuel Cebrian, Emilia Gomez, David Fernandez Llorca,
- Abstract summary: We develop a simulation framework parameterized by features extracted from the diverse landscape of risk, incident, or hazard reporting ecosystems.
We evaluate four supervision policies: non-prioritized (first-come, first-served), random selection, priority-based (addressing the highest-priority risks first), and diversity-prioritized (balancing high-priority risks with comprehensive coverage across risk types)
Our results indicate that while priority-based and diversity-prioritized policies are more effective at mitigating high-impact risks, they may inadvertently neglect systemic issues reported by the broader community.
- Score: 0.0
- License:
- Abstract: The rapid proliferation and deployment of General-Purpose AI (GPAI) models, including large language models (LLMs), present unprecedented challenges for AI supervisory entities. We hypothesize that these entities will need to navigate an emergent ecosystem of risk and incident reporting, likely to exceed their supervision capacity. To investigate this, we develop a simulation framework parameterized by features extracted from the diverse landscape of risk, incident, or hazard reporting ecosystems, including community-driven platforms, crowdsourcing initiatives, and expert assessments. We evaluate four supervision policies: non-prioritized (first-come, first-served), random selection, priority-based (addressing the highest-priority risks first), and diversity-prioritized (balancing high-priority risks with comprehensive coverage across risk types). Our results indicate that while priority-based and diversity-prioritized policies are more effective at mitigating high-impact risks, particularly those identified by experts, they may inadvertently neglect systemic issues reported by the broader community. This oversight can create feedback loops that amplify certain types of reporting while discouraging others, leading to a skewed perception of the overall risk landscape. We validate our simulation results with several real-world datasets, including one with over a million ChatGPT interactions, of which more than 150,000 conversations were identified as risky. This validation underscores the complex trade-offs inherent in AI risk supervision and highlights how the choice of risk management policies can shape the future landscape of AI risks across diverse GPAI models used in society.
Related papers
- Multi-Agent Risks from Advanced AI [90.74347101431474]
Multi-agent systems of advanced AI pose novel and under-explored risks.
We identify three key failure modes based on agents' incentives, as well as seven key risk factors.
We highlight several important instances of each risk, as well as promising directions to help mitigate them.
arXiv Detail & Related papers (2025-02-19T23:03:21Z) - A Taxonomy of Systemic Risks from General-Purpose AI [2.5956465292067867]
We consider systemic risks as large-scale threats that can affect entire societies or economies.
Key sources of systemic risk emerge from knowledge gaps, challenges in recognizing harm, and the unpredictable trajectory of AI development.
This paper contributes to AI safety research by providing a structured groundwork for understanding and addressing the potential large-scale negative societal impacts of general-purpose AI.
arXiv Detail & Related papers (2024-11-24T22:16:18Z) - Effective Mitigations for Systemic Risks from General-Purpose AI [9.39718128736321]
We surveyed 76 experts whose expertise spans AI safety; critical infrastructure; democratic processes; chemical, biological, radiological, and nuclear risks (CBRN); and discrimination and bias.
We find that a broad range of risk mitigation measures are perceived as effective in reducing various systemic risks and technically feasible by domain experts.
Three mitigation measures stand out: safety incident reports and security information sharing, third-party pre-deployment model audits, and pre-deployment risk assessments.
arXiv Detail & Related papers (2024-11-14T22:39:25Z) - Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents [67.07177243654485]
This survey collects and analyzes the different threats faced by large language models-based agents.
We identify six key features of LLM-based agents, based on which we summarize the current research progress.
We select four representative agents as case studies to analyze the risks they may face in practical use.
arXiv Detail & Related papers (2024-11-14T15:40:04Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - Privacy Risks of General-Purpose AI Systems: A Foundation for Investigating Practitioner Perspectives [47.17703009473386]
Powerful AI models have led to impressive leaps in performance across a wide range of tasks.
Privacy concerns have led to a wealth of literature covering various privacy risks and vulnerabilities of AI models.
We conduct a systematic review of these survey papers to provide a concise and usable overview of privacy risks in GPAIS.
arXiv Detail & Related papers (2024-07-02T07:49:48Z) - AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies [88.32153122712478]
We identify 314 unique risk categories organized into a four-tiered taxonomy.
At the highest level, this taxonomy encompasses System & Operational Risks, Content Safety Risks, Societal Risks, and Legal & Rights Risks.
We aim to advance AI safety through information sharing across sectors and the promotion of best practices in risk mitigation for generative AI models and systems.
arXiv Detail & Related papers (2024-06-25T18:13:05Z) - RiskQ: Risk-sensitive Multi-Agent Reinforcement Learning Value Factorization [49.26510528455664]
We introduce the Risk-sensitive Individual-Global-Max (RIGM) principle as a generalization of the Individual-Global-Max (IGM) and Distributional IGM (DIGM) principles.
We show that RiskQ can obtain promising performance through extensive experiments.
arXiv Detail & Related papers (2023-11-03T07:18:36Z) - Quantitative AI Risk Assessments: Opportunities and Challenges [7.35411010153049]
Best way to reduce risks is to implement comprehensive AI lifecycle governance.
Risks can be quantified using metrics from the technical community.
This paper explores these issues, focusing on the opportunities, challenges, and potential impacts of such an approach.
arXiv Detail & Related papers (2022-09-13T21:47:25Z) - Actionable Guidance for High-Consequence AI Risk Management: Towards
Standards Addressing AI Catastrophic Risks [12.927021288925099]
Artificial intelligence (AI) systems can present risks of events with very high or catastrophic consequences at societal scale.
NIST is developing the NIST Artificial Intelligence Risk Management Framework (AI RMF) as voluntary guidance on AI risk assessment and management.
We provide detailed actionable-guidance recommendations focused on identifying and managing risks of events with very high or catastrophic consequences.
arXiv Detail & Related papers (2022-06-17T18:40:41Z) - Sample-Based Bounds for Coherent Risk Measures: Applications to Policy
Synthesis and Verification [32.9142708692264]
This paper aims to address a few problems regarding risk-aware verification and policy synthesis.
First, we develop a sample-based method to evaluate a subset of a random variable distribution.
Second, we develop a robotic-based method to determine solutions to problems that outperform a large fraction of the decision space.
arXiv Detail & Related papers (2022-04-21T01:06:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.