Multi-Agent Risks from Advanced AI
- URL: http://arxiv.org/abs/2502.14143v1
- Date: Wed, 19 Feb 2025 23:03:21 GMT
- Title: Multi-Agent Risks from Advanced AI
- Authors: Lewis Hammond, Alan Chan, Jesse Clifton, Jason Hoelscher-Obermaier, Akbir Khan, Euan McLean, Chandler Smith, Wolfram Barfuss, Jakob Foerster, Tomáš Gavenčiak, The Anh Han, Edward Hughes, Vojtěch Kovařík, Jan Kulveit, Joel Z. Leibo, Caspar Oesterheld, Christian Schroeder de Witt, Nisarg Shah, Michael Wellman, Paolo Bova, Theodor Cimpeanu, Carson Ezell, Quentin Feuillade-Montixi, Matija Franklin, Esben Kran, Igor Krawczuk, Max Lamparth, Niklas Lauffer, Alexander Meinke, Sumeet Motwani, Anka Reuel, Vincent Conitzer, Michael Dennis, Iason Gabriel, Adam Gleave, Gillian Hadfield, Nika Haghtalab, Atoosa Kasirzadeh, Sébastien Krier, Kate Larson, Joel Lehman, David C. Parkes, Georgios Piliouras, Iyad Rahwan,
- Abstract summary: Multi-agent systems of advanced AI pose novel and under-explored risks.<n>We identify three key failure modes based on agents' incentives, as well as seven key risk factors.<n>We highlight several important instances of each risk, as well as promising directions to help mitigate them.
- Score: 90.74347101431474
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid development of advanced AI agents and the imminent deployment of many instances of these agents will give rise to multi-agent systems of unprecedented complexity. These systems pose novel and under-explored risks. In this report, we provide a structured taxonomy of these risks by identifying three key failure modes (miscoordination, conflict, and collusion) based on agents' incentives, as well as seven key risk factors (information asymmetries, network effects, selection pressures, destabilising dynamics, commitment problems, emergent agency, and multi-agent security) that can underpin them. We highlight several important instances of each risk, as well as promising directions to help mitigate them. By anchoring our analysis in a range of real-world examples and experimental evidence, we illustrate the distinct challenges posed by multi-agent systems and their implications for the safety, governance, and ethics of advanced AI.
Related papers
- OMNI-LEAK: Orchestrator Multi-Agent Network Induced Data Leakage [59.3826294523924]
We investigate the security vulnerabilities of a popular multi-agent pattern known as the orchestrator setup.<n>We report the susceptibility of frontier models to different categories of attacks, finding that both reasoning and non-reasoning models are vulnerable.
arXiv Detail & Related papers (2026-02-13T21:32:32Z) - AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security [126.49733412191416]
Current guardrail models lack agentic risk awareness and transparency in risk diagnosis.<n>We propose a unified three-dimensional taxonomy that categorizes agentic risks by their source (where), failure mode (how), and consequence (what)<n>We introduce a new fine-grained agentic safety benchmark (ATBench) and a Diagnostic Guardrail framework for agent safety and security (AgentDoG)
arXiv Detail & Related papers (2026-01-26T13:45:41Z) - With Great Capabilities Come Great Responsibilities: Introducing the Agentic Risk & Capability Framework for Governing Agentic AI Systems [11.09031447875337]
Agentic Risk & Capability (ARC) Framework is a technical governance framework designed to help organizations identify, assess, and mitigate risks arising from agentic AI systems.<n>The framework's core contributions are:.<n>It develops a novel capability-centric perspective to analyze a wide range of agentic AI systems.<n>It distills three primary sources of risk intrinsic to agentic AI systems - components, design, and capabilities.<n>It establishes a clear nexus between each risk source, specific materialized risks, and corresponding technical controls.
arXiv Detail & Related papers (2025-12-22T03:51:34Z) - A Safety and Security Framework for Real-World Agentic Systems [2.05255620498371]
This paper introduces a dynamic and actionable framework for securing agentic AI systems in enterprise deployment.<n>We propose a new way of identification of novel agentic risks through the lens of user safety.<n>We demonstrate the framework effectiveness through a detailed case study of NVIDIA flagship agentic research assistant, AI-Q Research Assistant.
arXiv Detail & Related papers (2025-11-27T00:19:24Z) - Extending the OWASP Multi-Agentic System Threat Modeling Guide: Insights from Multi-Agent Security Research [0.8057006406834466]
This work translates recent anticipatory research in multi-agent security (MASEC) into practical guidance for addressing challenges unique to large language model (LLM)-driven multi-agent architectures.<n>We introduce additional threat classes and scenarios grounded in practical MAS deployments, highlighting risks from benign goal drift, cross-agent propagation, affective prompt framing, and multi-agent backdoors.<n>This work complements the framework of robustness by expanding its applicability to increasingly complex, autonomous, and adaptive multi-agent systems.
arXiv Detail & Related papers (2025-08-13T13:47:55Z) - Towards Unifying Quantitative Security Benchmarking for Multi Agent Systems [0.0]
Evolving AI systems increasingly deploy multi-agent architectures where autonomous agents collaborate, share information, and delegate tasks through developing protocols.<n>One such risk is a cascading risk: a breach in one agent can cascade through the system, compromising others by exploiting inter-agent trust.<n>In an ACI attack, a malicious input or tool exploit injected at one agent leads to cascading compromises and amplified downstream effects across agents that trust its outputs.
arXiv Detail & Related papers (2025-07-23T13:51:28Z) - SafeMobile: Chain-level Jailbreak Detection and Automated Evaluation for Multimodal Mobile Agents [58.21223208538351]
This work explores the security issues surrounding mobile multimodal agents.<n>It attempts to construct a risk discrimination mechanism by incorporating behavioral sequence information.<n>It also designs an automated assisted assessment scheme based on a large language model.
arXiv Detail & Related papers (2025-07-01T15:10:00Z) - Kaleidoscopic Teaming in Multi Agent Simulations [75.47388708240042]
We argue that existing red teaming or safety evaluation frameworks fall short in evaluating safety risks in complex behaviors, thought processes and actions taken by agents.<n>We introduce new in-context optimization techniques that can be used in our kaleidoscopic teaming framework to generate better scenarios for safety analysis.<n>We present appropriate metrics that can be used along with our framework to measure safety of agents.
arXiv Detail & Related papers (2025-06-20T23:37:17Z) - Securing Agentic AI: A Comprehensive Threat Model and Mitigation Framework for Generative AI Agents [0.0]
This paper introduces a comprehensive threat model tailored specifically for GenAI agents.
Research work identifies 9 primary threats and organizes them across five key domains.
arXiv Detail & Related papers (2025-04-28T16:29:24Z) - Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis [35.57217841344101]
This study investigates the underlying factors that contribute to the increased vulnerability of Web AI agents.
We identify three critical factors that amplify the vulnerability of Web AI agents; (1) embedding user goals into the system prompt, (2) multi-step action generation, and (3) observational capabilities.
arXiv Detail & Related papers (2025-02-27T18:56:26Z) - Multi-Agent Security Tax: Trading Off Security and Collaboration Capabilities in Multi-Agent Systems [1.2564343689544843]
We develop simulations of AI agents collaborating on shared objectives to study security risks and trade-offs.
We observe infectious malicious prompts - the multi-hop spreading of malicious instructions.
Our findings illustrate potential trade-off between security and collaborative efficiency in multi-agent systems.
arXiv Detail & Related papers (2025-02-26T14:00:35Z) - Fully Autonomous AI Agents Should Not be Developed [58.88624302082713]
This paper argues that fully autonomous AI agents should not be developed.
In support of this position, we build from prior scientific literature and current product marketing to delineate different AI agent levels.
Our analysis reveals that risks to people increase with the autonomy of a system.
arXiv Detail & Related papers (2025-02-04T19:00:06Z) - Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents [67.07177243654485]
This survey collects and analyzes the different threats faced by large language models-based agents.
We identify six key features of LLM-based agents, based on which we summarize the current research progress.
We select four representative agents as case studies to analyze the risks they may face in practical use.
arXiv Detail & Related papers (2024-11-14T15:40:04Z) - HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions [76.42274173122328]
We present HAICOSYSTEM, a framework examining AI agent safety within diverse and complex social interactions.
We run 1840 simulations based on 92 scenarios across seven domains (e.g., healthcare, finance, education)
Our experiments show that state-of-the-art LLMs, both proprietary and open-sourced, exhibit safety risks in over 50% cases.
arXiv Detail & Related papers (2024-09-24T19:47:21Z) - Safeguarding AI Agents: Developing and Analyzing Safety Architectures [0.0]
This paper addresses the need for safety measures in AI systems that collaborate with human teams.
We propose and evaluate three frameworks to enhance safety protocols in AI agent systems.
We conclude that these frameworks can significantly strengthen the safety and security of AI agent systems.
arXiv Detail & Related papers (2024-09-03T10:14:51Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.<n>Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.<n>However, the deployment of these agents in physical environments presents significant safety challenges.<n>This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety [70.84902425123406]
Multi-agent systems, when enhanced with Large Language Models (LLMs), exhibit profound capabilities in collective intelligence.
However, the potential misuse of this intelligence for malicious purposes presents significant risks.
We propose a framework (PsySafe) grounded in agent psychology, focusing on identifying how dark personality traits in agents can lead to risky behaviors.
Our experiments reveal several intriguing phenomena, such as the collective dangerous behaviors among agents, agents' self-reflection when engaging in dangerous behavior, and the correlation between agents' psychological assessments and dangerous behaviors.
arXiv Detail & Related papers (2024-01-22T12:11:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.