Related papers: Red-Teaming LLM Multi-Agent Systems via Communication Attacks

URL: http://arxiv.org/abs/2502.14847v2
Date: Mon, 02 Jun 2025 01:51:09 GMT
Title: Red-Teaming LLM Multi-Agent Systems via Communication Attacks
Authors: Pengfei He, Yupin Lin, Shen Dong, Han Xu, Yue Xing, Hui Liu,
Abstract summary: Large Language Model-based Multi-Agent Systems (LLM-MAS) have revolutionized complex problem-solving capability by enabling sophisticated agent collaboration through message-based communications. We introduce Agent-in-the-Middle (AiTM), a novel attack that exploits the fundamental communication mechanisms in LLM-MAS by intercepting and manipulating inter-agent messages.
Score: 10.872328358364776
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Model-based Multi-Agent Systems (LLM-MAS) have revolutionized complex problem-solving capability by enabling sophisticated agent collaboration through message-based communications. While the communication framework is crucial for agent coordination, it also introduces a critical yet unexplored security vulnerability. In this work, we introduce Agent-in-the-Middle (AiTM), a novel attack that exploits the fundamental communication mechanisms in LLM-MAS by intercepting and manipulating inter-agent messages. Unlike existing attacks that compromise individual agents, AiTM demonstrates how an adversary can compromise entire multi-agent systems by only manipulating the messages passing between agents. To enable the attack under the challenges of limited control and role-restricted communication format, we develop an LLM-powered adversarial agent with a reflection mechanism that generates contextually-aware malicious instructions. Our comprehensive evaluation across various frameworks, communication structures, and real-world applications demonstrates that LLM-MAS is vulnerable to communication-based attacks, highlighting the need for robust security measures in multi-agent systems.

Related papers

Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MAS [12.649568006596956]
Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion. We propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system.
arXiv Detail & Related papers (2025-08-05T06:14:53Z)
From Prompt Injections to Protocol Exploits: Threats in LLM-Powered AI Agents Workflows [1.202155693533555]
Large language models (LLMs) with structured function-calling interfaces have dramatically expanded capabilities for real-time data retrieval and computation. Yet, the explosive proliferation of plugins, connectors, and inter-agent protocols has outpaced discovery mechanisms and security practices. We introduce the first unified, end-to-end threat model for LLM-agent ecosystems, spanning host-to-tool and agent-to-agent communications.
arXiv Detail & Related papers (2025-06-29T14:32:32Z)
A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures [59.43633341497526]
Large-Language-Model-driven AI agents have exhibited unprecedented intelligence and adaptability. Agent communication is regarded as a foundational pillar of the future AI ecosystem. This paper presents a comprehensive survey of agent communication security.
arXiv Detail & Related papers (2025-06-24T14:44:28Z)
Seven Security Challenges That Must be Solved in Cross-domain Multi-agent LLM Systems [16.838103835766066]
Large language models (LLMs) are rapidly evolving into autonomous agents that cooperate across organizational boundaries. This position paper maps the security agenda for cross-domain multi-agent LLM systems.
arXiv Detail & Related papers (2025-05-28T18:19:03Z)
AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities. We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o. We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z)
Les Dissonances: Cross-Tool Harvesting and Polluting in Multi-Tool Empowered LLM Agents [15.15485816037418]
This paper presents the first systematic security analysis of task control flows in multi-tool-enabled LLM agents. We identify a novel threat, Cross-Tool Harvesting and Polluting (XTHP), which includes multiple attack vectors. To understand the impact of this threat, we developed Chord, a dynamic scanning tool designed to automatically detect real-world agent tools susceptible to XTHP attacks.
arXiv Detail & Related papers (2025-04-04T01:41:06Z)
Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems [23.379992200838053]
Large language model-based multi-agent systems have recently gained significant attention due to their potential for complex, collaborative, and intelligent problem-solving capabilities. Existing surveys typically categorize LLM-MAS according to their application domains or architectures, overlooking the central role of communication in coordinating agent behaviors and interactions. This review aims to help researchers and practitioners gain a clear understanding of the communication mechanisms in LLM-MAS, thereby facilitating the design and deployment of robust, scalable, and secure multi-agent systems.
arXiv Detail & Related papers (2025-02-20T07:18:34Z)
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks [88.84977282952602]
A high volume of recent ML security literature focuses on attacks against aligned large language models (LLMs). In this paper, we analyze security and privacy vulnerabilities that are unique to LLM agents. We conduct a series of illustrative attacks on popular open-source and commercial agents, demonstrating the immediate practical implications of their vulnerabilities.
arXiv Detail & Related papers (2025-02-12T17:19:36Z)
Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation [4.241100280846233]
AI agents, powered by large language models (LLMs), have transformed human-computer interactions by enabling seamless, natural, and context-aware communication. This paper investigates a critical vulnerability: adversarial attacks targeting the LLM core within AI agents.
arXiv Detail & Related papers (2024-12-05T18:38:30Z)
DAWN: Designing Distributed Agents in a Worldwide Network [0.38447712214412116]
DAWN enables distributed agents worldwide to register and be easily discovered through Gateway Agents. It offers several operational modes: No-LLM Mode for deterministic tasks, Copilot for augmented decision-making, and LLM Agent for autonomous operations. DAWN ensures the safety and security of agent collaborations globally through a dedicated safety, security, and compliance layer.
arXiv Detail & Related papers (2024-10-11T18:47:04Z)
Security Matrix for Multimodal Agents on Mobile Devices: A Systematic and Proof of Concept Study [16.559272781032632]
The rapid progress in the reasoning capability of the Multi-modal Large Language Models has triggered the development of autonomous agent systems on mobile devices. Despite the increased human-machine interaction efficiency, the security risks of MLLM-based mobile agent systems have not been systematically studied. This paper highlights the need for security awareness in the design of MLLM-based systems and paves the way for future research on attacks and defense methods.
arXiv Detail & Related papers (2024-07-12T14:30:05Z)
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence [79.5316642687565]
Existing multi-agent frameworks often struggle with integrating diverse capable third-party agents. We propose the Internet of Agents (IoA), a novel framework that addresses these limitations. IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control.
arXiv Detail & Related papers (2024-07-09T17:33:24Z)
AgentScope: A Flexible yet Robust Multi-Agent Platform [66.64116117163755]
AgentScope is a developer-centric multi-agent platform with message exchange as its core communication mechanism. The abundant syntactic tools, built-in agents and service functions, user-friendly interfaces for application demonstration and utility monitor, zero-code programming workstation, and automatic prompt tuning mechanism significantly lower the barriers to both development and deployment.
arXiv Detail & Related papers (2024-02-21T04:11:28Z)
Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems [51.6210785955659]
Communication is important in many multi-agent reinforcement learning (MARL) problems for agents to share information and make good decisions. However, when deploying trained communicative agents in a real-world application where noise and potential attackers exist, the safety of communication-based policies becomes a severe issue that is underexplored. In this work, we consider an environment with $N$ agents, where the attacker may arbitrarily change the communication from any $C < \frac{N-1}{2}$ agents to a victim agent.
arXiv Detail & Related papers (2022-06-21T07:32:18Z)
Coordinating Policies Among Multiple Agents via an Intelligent Communication Channel [81.39444892747512]
In Multi-Agent Reinforcement Learning (MARL), specialized channels are often introduced that allow agents to communicate directly with one another. We propose an alternative approach whereby agents communicate through an intelligent facilitator that learns to sift through and interpret signals provided by all agents to improve the agents' collective performance.
arXiv Detail & Related papers (2022-05-21T14:11:33Z)
Adversarial Attacks On Multi-Agent Communication [80.4392160849506]
Modern autonomous systems will soon be deployed at scale, opening up the possibility for cooperative multi-agent systems. Such advantages rely heavily on communication channels which have been shown to be vulnerable to security breaches. In this paper, we explore such adversarial attacks in a novel multi-agent setting where agents communicate by sharing learned intermediate representations.
arXiv Detail & Related papers (2021-01-17T00:35:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.