Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MAS
- URL: http://arxiv.org/abs/2508.03125v1
- Date: Tue, 05 Aug 2025 06:14:53 GMT
- Title: Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MAS
- Authors: Bingyu Yan, Ziyi Zhou, Xiaoming Zhang, Chaozhuo Li, Ruilin Zeng, Yirui Qi, Tianbo Wang, Litian Zhang,
- Abstract summary: Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion. We propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system.
- Score: 12.649568006596956
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication, but this reliance introduces substantial safety vulnerabilities. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion, which limits their effectiveness, adaptability, and stealthiness. In this paper, we propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system. MAST integrates Monte Carlo Tree Search with Direct Preference Optimization to train an attack policy model that adaptively generates effective multi-round tampering strategies. Furthermore, to preserve stealthiness, we impose dual semantic and embedding similarity constraints during the tampering process. Comprehensive experiments across diverse tasks, communication architectures, and LLMs demonstrate that MAST consistently achieves high attack success rates while significantly enhancing stealthiness compared to baselines. These findings highlight the effectiveness, stealthiness, and adaptability of MAST, underscoring the need for robust communication safeguards in LLM-MAS.
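The abstract's dual stealthiness constraint can be pictured as an accept/reject gate applied to each candidate tampering. Below is a minimal Python sketch of such a gate, assuming a sentence-transformers model for the embedding-similarity check and a caller-supplied predicate standing in for the semantic check; the model name, threshold, and function names are illustrative assumptions rather than details from the paper, and the MCTS/DPO-trained policy that proposes the tampered messages is not shown.

```python
# Minimal sketch of a MAST-style dual stealthiness gate (illustrative only).
# Assumptions: sentence-transformers is installed; `judge_semantic` stands in
# for whatever semantic-similarity check the paper actually uses.
from typing import Callable

from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical model choice
EMB_THRESHOLD = 0.85  # hypothetical threshold; not a value from the paper


def embedding_similarity(original: str, tampered: str) -> float:
    """Cosine similarity between sentence embeddings of the two messages."""
    emb = embedder.encode([original, tampered], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()


def is_stealthy(original: str, tampered: str,
                judge_semantic: Callable[[str, str], bool]) -> bool:
    """Accept a tampered message only if it passes BOTH constraints:
    (1) its embedding stays close to the original message, and
    (2) an external semantic check (e.g., an LLM judge) deems the
        surface meaning preserved."""
    if embedding_similarity(original, tampered) < EMB_THRESHOLD:
        return False
    return judge_semantic(original, tampered)
```

In this framing, the attack policy proposes multi-round edits, and only candidates that clear both checks are injected into the inter-agent channel.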
Related papers
- LLM Meets the Sky: Heuristic Multi-Agent Reinforcement Learning for Secure Heterogeneous UAV Networks [57.27815890269697]
This work focuses on maximizing the secrecy rate in heterogeneous UAV networks (HetUAVNs) under energy constraints. We introduce a Large Language Model (LLM)-guided multi-agent learning approach. Results show that our method outperforms existing baselines in secrecy and energy efficiency.
arXiv Detail & Related papers (2025-07-23T04:22:57Z)
- RALLY: Role-Adaptive LLM-Driven Yoked Navigation for Agentic UAV Swarms [15.891423894740045]
We develop RALLY, a Role-Adaptive LLM-Driven Yoked navigation algorithm. RALLY uses structured natural language for efficient semantic communication and collaborative reasoning. Experiments show that RALLY outperforms conventional approaches in terms of task coverage, convergence speed, and generalization.
arXiv Detail & Related papers (2025-07-02T05:44:17Z)
- Align is not Enough: Multimodal Universal Jailbreak Attack against Multimodal Large Language Models [83.80177564873094]
We propose a unified multimodal universal jailbreak attack framework. We evaluate the undesirable context generation of MLLMs like LLaVA, Yi-VL, MiniGPT4, MiniGPT-v2, and InstructBLIP. This study underscores the urgent need for robust safety measures in MLLMs.
arXiv Detail & Related papers (2025-06-02T04:33:56Z)
- MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision [76.42361936804313]
We introduce MAS-ZERO, the first self-evolved, inference-time framework for automatic MAS design. MAS-ZERO employs meta-level design to iteratively generate, evaluate, and refine MAS configurations tailored to each problem instance.
arXiv Detail & Related papers (2025-05-21T00:56:09Z)
- A Weighted Byzantine Fault Tolerance Consensus Driven Trusted Multiple Large Language Models Network [53.37983409425452]
Large Language Models (LLMs) have achieved remarkable success across a wide range of applications. Recently, collaborative frameworks such as the Multi-LLM Network (MultiLLMN) have been introduced. We propose a novel Trusted MultiLLMN framework driven by a weighted Byzantine Fault Tolerance (WBFT) blockchain consensus mechanism.
arXiv Detail & Related papers (2025-05-08T10:04:41Z)
- A Trustworthy Multi-LLM Network: Challenges, Solutions, and A Use Case [59.58213261128626]
We propose a blockchain-enabled collaborative framework that connects multiple Large Language Models (LLMs) into a Trustworthy Multi-LLM Network (MultiLLMN). This architecture enables the cooperative evaluation and selection of the most reliable and high-quality responses to complex network optimization problems.
arXiv Detail & Related papers (2025-05-06T05:32:46Z)
- Red-Teaming LLM Multi-Agent Systems via Communication Attacks [10.872328358364776]
Large Language Model-based Multi-Agent Systems (LLM-MAS) have revolutionized complex problem-solving by enabling sophisticated agent collaboration through message-based communication. We introduce Agent-in-the-Middle (AiTM), a novel attack that exploits the fundamental communication mechanisms in LLM-MAS by intercepting and manipulating inter-agent messages; a toy sketch of this attack surface appears after this list.
arXiv Detail & Related papers (2025-02-20T18:55:39Z)
- G-Safeguard: A Topology-Guided Security Lens and Treatment on LLM-based Multi-agent Systems [10.450573905691677]
Large Language Model (LLM)-based Multi-agent Systems (MAS) have demonstrated remarkable capabilities in various complex tasks. As these systems become increasingly integrated into critical applications, their vulnerability to adversarial attacks, misinformation propagation, and unintended behaviors has raised significant concerns. We introduce G-Safeguard, a topology-guided security lens and treatment for robust MAS.
arXiv Detail & Related papers (2025-02-16T13:48:41Z)
- Position: Towards a Responsible LLM-empowered Multi-Agent Systems [22.905804138387854]
The rise of Agent AI and Large Language Model-powered Multi-Agent Systems (LLM-MAS) has underscored the need for responsible and dependable system operation. These advancements introduce critical challenges: LLM agents exhibit inherent unpredictability, and uncertainties in their outputs can compound, threatening system stability. To address these risks, a human-centered design approach with active dynamic moderation is essential.
arXiv Detail & Related papers (2025-02-03T16:04:30Z)
- Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation [4.241100280846233]
AI agents, powered by large language models (LLMs), have transformed human-computer interactions by enabling seamless, natural, and context-aware communication. This paper investigates a critical vulnerability: adversarial attacks targeting the LLM core within AI agents.
arXiv Detail & Related papers (2024-12-05T18:38:30Z)
- R-MTLLMF: Resilient Multi-Task Large Language Model Fusion at the Wireless Edge [78.26352952957909]
Multi-task large language models (MTLLMs) are important for many applications at the wireless edge, where users demand specialized models to handle multiple tasks efficiently. The concept of model fusion via task vectors has emerged as an efficient approach for combining fine-tuning parameters to produce an MTLLM. In this paper, the problem of enabling edge users to collaboratively craft such MTLLMs via task vectors is studied, under the assumption of worst-case adversarial attacks.
arXiv Detail & Related papers (2024-11-27T10:57:06Z)
- RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content [62.685566387625975]
Current mitigation strategies, while effective, are not resilient under adversarial attacks.
This paper introduces Resilient Guardrails for Large Language Models (RigorLLM), a novel framework designed to efficiently moderate harmful and unsafe inputs.
arXiv Detail & Related papers (2024-03-19T07:25:02Z)
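As noted in the Red-Teaming (AiTM) entry above, here is a toy illustration of the communication-level attack surface that both MAST and AiTM target: the adversary sits on the message channel and never touches agent internals. The bus, hook, and message shape below are hypothetical constructs for exposition, not code from either paper.

```python
# Toy model of an in-transit message-tampering hook (hypothetical; for exposition).
from typing import Callable, Optional

Message = dict  # e.g., {"sender": str, "receiver": str, "content": str}


def deliver(msg: Message, inbox: list,
            intercept: Optional[Callable[[Message], Message]] = None) -> None:
    """Route one inter-agent message; an attacker-controlled `intercept`
    hook may rewrite it in transit, unseen by sender and receiver."""
    if intercept is not None:
        msg = intercept(msg)  # the agents themselves are never compromised
    inbox.append(msg)


# Hypothetical attacker hook: a subtle, meaning-adjacent rewrite.
def tamper(msg: Message) -> Message:
    out = dict(msg)
    out["content"] = msg["content"].replace("approve the plan", "defer the plan")
    return out


# Usage: deliver({"sender": "planner", "receiver": "executor",
#                 "content": "Please approve the plan."}, inbox=[], intercept=tamper)
```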