TAMAS: Benchmarking Adversarial Risks in Multi-Agent LLM Systems
- URL: http://arxiv.org/abs/2511.05269v1
- Date: Fri, 07 Nov 2025 14:30:26 GMT
- Title: TAMAS: Benchmarking Adversarial Risks in Multi-Agent LLM Systems
- Authors: Ishan Kavathekar, Hemang Jain, Ameya Rathod, Ponnurangam Kumaraguru, Tanuja Ganu,
- Abstract summary: Large Language Models (LLMs) have demonstrated strong capabilities as autonomous agents through tool use, planning, and decision-making abilities. As task complexity grows, multi-agent LLM systems are increasingly used to solve problems collaboratively. Existing benchmarks and datasets predominantly focus on single-agent settings, failing to capture the unique vulnerabilities of multi-agent dynamics and coordination. We introduce $\textbf{T}$hreats and $\textbf{A}$ttacks in $\textbf{M}$ulti-$\textbf{A}$gent $\textbf{S}$ystems ($\textbf{TAMAS}$), a benchmark designed to evaluate the robustness and safety of multi-agent LLM systems.
- Score: 11.885326879716738
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Large Language Models (LLMs) have demonstrated strong capabilities as autonomous agents through tool use, planning, and decision-making abilities, leading to their widespread adoption across diverse tasks. As task complexity grows, multi-agent LLM systems are increasingly used to solve problems collaboratively. However, the safety and security of these systems remain largely under-explored. Existing benchmarks and datasets predominantly focus on single-agent settings, failing to capture the unique vulnerabilities of multi-agent dynamics and coordination. To address this gap, we introduce $\textbf{T}$hreats and $\textbf{A}$ttacks in $\textbf{M}$ulti-$\textbf{A}$gent $\textbf{S}$ystems ($\textbf{TAMAS}$), a benchmark designed to evaluate the robustness and safety of multi-agent LLM systems. TAMAS includes five distinct scenarios comprising 300 adversarial instances across six attack types and 211 tools, along with 100 harmless tasks. We assess system performance across ten backbone LLMs and three agent interaction configurations from Autogen and CrewAI frameworks, highlighting critical challenges and failure modes in current multi-agent deployments. Furthermore, we introduce the Effective Robustness Score (ERS) to assess the tradeoff between safety and task effectiveness of these frameworks. Our findings show that multi-agent systems are highly vulnerable to adversarial attacks, underscoring the urgent need for stronger defenses. TAMAS provides a foundation for systematically studying and improving the safety of multi-agent LLM systems.
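Note: the abstract does not give the formula for ERS. As a rough illustration of a safety/effectiveness tradeoff metric, the sketch below combines a robustness rate (fraction of adversarial instances resisted) and a task-utility rate (fraction of harmless tasks completed) with a harmonic mean. The function name and the harmonic-mean form are assumptions for intuition only, not the paper's definition.

```python
def effective_robustness_score(robustness: float, utility: float) -> float:
    """Hypothetical ERS-style tradeoff score (NOT the paper's formula).

    robustness: fraction of adversarial instances the system resists, in [0, 1].
    utility:    fraction of harmless tasks the system completes, in [0, 1].
    A harmonic mean penalizes systems that do well on only one of the two axes.
    """
    if robustness + utility == 0.0:
        return 0.0
    return 2 * robustness * utility / (robustness + utility)

# Example: a framework that blocks 80% of attacks but completes only 60% of benign tasks.
print(round(effective_robustness_score(0.8, 0.6), 3))  # 0.686
```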
Related papers
- Exposing Weak Links in Multi-Agent Systems under Adversarial Prompting [5.544819942438653]
We present SafeAgents, a framework for fine-grained security assessment of multi-agent systems. We conduct a study across five widely adopted multi-agent architectures. Our findings reveal that common design patterns carry significant vulnerabilities.
arXiv Detail & Related papers (2025-11-14T04:22:49Z)
- Multi-Agent Tool-Integrated Policy Optimization [67.12841355267678]
Large language models (LLMs) increasingly rely on multi-turn tool-integrated planning for knowledge-intensive and complex reasoning tasks. Existing implementations typically rely on a single agent, but they suffer from limited context length and noisy tool responses. No existing methods support effective reinforcement learning post-training of tool-integrated multi-agent frameworks.
arXiv Detail & Related papers (2025-10-06T10:44:04Z)
- AdvEvo-MARL: Shaping Internalized Safety through Adversarial Co-Evolution in Multi-Agent Reinforcement Learning [78.5751183537704]
AdvEvo-MARL is a co-evolutionary multi-agent reinforcement learning framework that internalizes safety into task agents. Rather than relying on external guards, AdvEvo-MARL jointly optimizes attackers and defenders.
arXiv Detail & Related papers (2025-10-02T02:06:30Z)
- Extending the OWASP Multi-Agentic System Threat Modeling Guide: Insights from Multi-Agent Security Research [0.8057006406834466]
This work translates recent anticipatory research in multi-agent security (MASEC) into practical guidance for addressing challenges unique to large language model (LLM)-driven multi-agent architectures. We introduce additional threat classes and scenarios grounded in practical MAS deployments, highlighting risks from benign goal drift, cross-agent propagation, affective prompt framing, and multi-agent backdoors. This work complements the framework's focus on robustness by expanding its applicability to increasingly complex, autonomous, and adaptive multi-agent systems.
arXiv Detail & Related papers (2025-08-13T13:47:55Z)
- Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MAS [12.649568006596956]
Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion. We propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system.
arXiv Detail & Related papers (2025-08-05T06:14:53Z)
- SafeMobile: Chain-level Jailbreak Detection and Automated Evaluation for Multimodal Mobile Agents [58.21223208538351]
This work explores the security issues surrounding mobile multimodal agents. It attempts to construct a risk discrimination mechanism by incorporating behavioral sequence information. It also designs an automated assisted assessment scheme based on a large language model.
arXiv Detail & Related papers (2025-07-01T15:10:00Z)
- ATAG: AI-Agent Application Threat Assessment with Attack Graphs [23.757154032523093]
This paper introduces AI-agent application Threat assessment with Attack Graphs (ATAG). ATAG is a novel framework designed to systematically analyze the security risks associated with AI-agent applications. It facilitates proactive identification and mitigation of AI-agent threats in multi-agent applications.
arXiv Detail & Related papers (2025-06-03T13:25:40Z)
- Comprehensive Vulnerability Analysis is Necessary for Trustworthy LLM-MAS [28.69485468744812]
Large Language Model-based Multi-Agent Systems (LLM-MAS) are increasingly deployed in high-stakes applications. LLM-MAS introduces unique attack surfaces through inter-agent communication, trust relationships, and tool integration. This paper presents a systematic framework for vulnerability analysis of LLM-MAS that unifies diverse research.
arXiv Detail & Related papers (2025-06-02T01:46:15Z)
- AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities. We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o. We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z)
- Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems [50.29939179830491]
Failure attribution in LLM multi-agent systems remains underexplored and labor-intensive. We develop and evaluate three automated failure attribution methods, summarizing their corresponding pros and cons. The best method achieves 53.5% accuracy in identifying failure-responsible agents but only 14.2% in pinpointing failure steps.
arXiv Detail & Related papers (2025-04-30T23:09:44Z)
- Why Do Multi-Agent LLM Systems Fail? [87.90075668488434]
We introduce MAST-Data, a comprehensive dataset of 1600+ annotated traces collected across 7 popular MAS frameworks. We build the first Multi-Agent System Failure Taxonomy (MAST). We leverage MAST and MAST-Data to analyze failure patterns across models (GPT-4, Claude 3, Qwen2.5, CodeLlama) and tasks (coding, math, general agent).
arXiv Detail & Related papers (2025-03-17T19:04:38Z)
- Multi-Agent Risks from Advanced AI [90.74347101431474]
Multi-agent systems of advanced AI pose novel and under-explored risks. We identify three key failure modes based on agents' incentives, as well as seven key risk factors. We highlight several important instances of each risk, as well as promising directions to help mitigate them.
arXiv Detail & Related papers (2025-02-19T23:03:21Z)
- Multi-agent Architecture Search via Agentic Supernet [17.235963703597093]
Large Language Model (LLM)-empowered multi-agent systems extend the cognitive boundaries of individual agents. Despite the availability of methods to automate the design of agentic workflows, they typically seek to identify a static, complex, one-size-fits-all system. We introduce MaAS, an automated framework that samples query-dependent agentic systems from the supernet.
arXiv Detail & Related papers (2025-02-06T16:12:06Z)
- Agent-Oriented Planning in Multi-Agent Systems [54.429028104022066]
We propose AOP, a novel framework for agent-oriented planning in multi-agent systems. In this study, we identify three critical design principles of agent-oriented planning, including solvability, completeness, and non-redundancy. Extensive experiments demonstrate the advancement of AOP in solving real-world problems compared to both single-agent systems and existing planning strategies for multi-agent systems.
arXiv Detail & Related papers (2024-10-03T04:07:51Z)
- On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents [58.79302663733703]
Large language model-based multi-agent systems have shown great abilities across various tasks due to the collaboration of expert agents. However, the impact of clumsy or even malicious agents (those who frequently make errors in their tasks) on the overall performance of the system remains underexplored. This paper investigates the resilience of various system structures under faulty agents on different downstream tasks.
arXiv Detail & Related papers (2024-08-02T03:25:20Z)