Monitoring LLM-based Multi-Agent Systems Against Corruptions via Node Evaluation
- URL: http://arxiv.org/abs/2510.19420v1
- Date: Wed, 22 Oct 2025 09:43:32 GMT
- Title: Monitoring LLM-based Multi-Agent Systems Against Corruptions via Node Evaluation
- Authors: Chengcan Wu, Zhixin Zhang, Mingqian Xu, Zeming Wei, Meng Sun
- Abstract summary: Large Language Model (LLM)-based Multi-Agent Systems (MAS) have become a popular paradigm of AI applications. We propose a dynamic defense paradigm for MAS graph structures, which continuously monitors communication within the MAS graph. Our method significantly outperforms existing MAS defense mechanisms, contributing an effective guardrail for their trustworthy applications.
- Score: 11.369402753246396
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Model (LLM)-based Multi-Agent Systems (MAS) have become a popular paradigm of AI applications. However, trustworthiness issues in MAS remain a critical concern. Unlike challenges in single-agent systems, MAS involve more complex communication processes, making them susceptible to corruption attacks. To mitigate this issue, several defense mechanisms have been developed based on the graph representation of MAS, where agents represent nodes and communications form edges. Nevertheless, these methods predominantly focus on static graph defense, attempting to either detect attacks in a fixed graph structure or optimize a static topology with certain defensive capabilities. To address this limitation, we propose a dynamic defense paradigm for MAS graph structures, which continuously monitors communication within the MAS graph, then dynamically adjusts the graph topology, accurately disrupts malicious communications, and effectively defends against evolving and diverse dynamic attacks. Experimental results in increasingly complex and dynamic MAS environments demonstrate that our method significantly outperforms existing MAS defense mechanisms, contributing an effective guardrail for their trustworthy applications. Our code is available at https://github.com/ChengcanWu/Monitoring-LLM-Based-Multi-Agent-Systems.
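To make the paradigm concrete, here is a minimal sketch of how such a monitoring loop could look. The names `score_message`, `monitor_step`, and `SUSPICION_THRESHOLD` are hypothetical placeholders for illustration, not the authors' API; the actual implementation is in the linked repository.

```python
# Minimal sketch (not the authors' implementation) of the dynamic-defense idea:
# agents are nodes in a directed graph, messages travel along edges, and a
# monitor scores each pending message before delivery, cutting edges that carry
# suspicious content so corrupted communication cannot propagate further.

import networkx as nx

SUSPICION_THRESHOLD = 0.8  # assumed cutoff; would be tuned per deployment


def score_message(sender: str, receiver: str, content: str) -> float:
    """Hypothetical LLM-based evaluator returning a corruption score in [0, 1]."""
    raise NotImplementedError("plug in an LLM judge or a trained classifier here")


def monitor_step(graph: nx.DiGraph,
                 pending: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
    """Inspect pending (sender, receiver, content) messages and adjust topology."""
    delivered = []
    for sender, receiver, content in pending:
        if not graph.has_edge(sender, receiver):
            continue  # edge already pruned by an earlier monitoring step
        if score_message(sender, receiver, content) >= SUSPICION_THRESHOLD:
            graph.remove_edge(sender, receiver)  # disrupt the malicious channel
        else:
            delivered.append((sender, receiver, content))
    return delivered


# Example topology: a planner fans out to two workers that report to an aggregator.
mas = nx.DiGraph([("planner", "worker_1"), ("planner", "worker_2"),
                  ("worker_1", "aggregator"), ("worker_2", "aggregator")])
```

Running `monitor_step` before every communication round gives the dynamic behavior described above: the topology shrinks around whatever node starts emitting corrupted messages, rather than relying on a fixed, pre-optimized graph.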
Related papers
- DREAM: Dynamic Red-teaming across Environments for AI Models [28.267208528754082]
We introduce DREAM, a framework for evaluating Large Language Models (LLMs) against dynamic, multi-stage attacks. At its core, DREAM uses a Cross-Environment Adversarial Knowledge Graph (CE-AKG) to maintain a stateful, cross-domain understanding of vulnerabilities. Our evaluation of 12 leading LLM agents reveals a critical vulnerability: these attack chains succeed in over 70% of cases for most models.
arXiv Detail & Related papers (2025-12-22T04:11:57Z)
- Explainable and Fine-Grained Safeguarding of LLM Multi-Agent Systems via Bi-Level Graph Anomaly Detection [76.91230292971115]
Large language model (LLM)-based multi-agent systems (MAS) have shown strong capabilities in solving complex tasks. XG-Guard is an explainable and fine-grained safeguarding framework for detecting malicious agents in MAS.
arXiv Detail & Related papers (2025-12-21T13:46:36Z)
- Chameleon: Adaptive Adversarial Agents for Scaling-Based Visual Prompt Injection in Multimodal AI Systems [0.0]
We propose a novel, adaptive adversarial framework designed to expose and exploit scaling vulnerabilities in production Vision-Language Models (VLMs). Our experiments demonstrate that Chameleon achieves an Attack Success Rate (ASR) of 84.5% across varying scaling factors. We show that these attacks effectively compromise agentic pipelines, reducing decision-making accuracy by over 45% in multi-step tasks.
arXiv Detail & Related papers (2025-12-04T15:22:28Z)
- Automated Cyber Defense with Generalizable Graph-based Reinforcement Learning Agents [7.45063623129985]
Deep reinforcement learning is emerging as a viable strategy for automated cyber defense (ACD). In this work, we frame ACD as a two-player context-based partially observable Markov decision problem. We show that this approach outperforms the state-of-the-art by a wide margin.
arXiv Detail & Related papers (2025-09-19T16:57:27Z)
- Searching for Privacy Risks in LLM Agents via Simulation [61.229785851581504]
We present a search-based framework that alternates between improving attack and defense strategies through the simulation of privacy-critical agent interactions. We find that attack strategies escalate from direct requests to sophisticated tactics, such as impersonation and consent forgery. The discovered attacks and defenses transfer across diverse scenarios and backbone models, demonstrating strong practical utility for building privacy-aware agents.
arXiv Detail & Related papers (2025-08-14T17:49:09Z)
- BlindGuard: Safeguarding LLM-based Multi-Agent Systems under Unknown Attacks [58.959622170433725]
BlindGuard is an unsupervised defense method that learns without requiring any attack-specific labels or prior knowledge of malicious behaviors. We show that BlindGuard effectively detects diverse attack types (i.e., prompt injection, memory poisoning, and tool attack) across multi-agent systems.
arXiv Detail & Related papers (2025-08-11T16:04:47Z)
- AgentSight: System-Level Observability for AI Agents Using eBPF [10.37440633887049]
Existing tools observe either an agent's high-level intent (via LLM prompts) or its low-level actions (e.g., system calls) but cannot correlate these two views. We introduce AgentSight, an AgentOps observability framework that bridges this semantic gap using a hybrid approach. AgentSight intercepts TLS-encrypted LLM traffic to extract semantic intent, monitors kernel events to observe system-wide effects, and causally correlates these two streams across process boundaries.
arXiv Detail & Related papers (2025-08-02T01:43:39Z)
- LaSM: Layer-wise Scaling Mechanism for Defending Pop-up Attack on GUI Agents [11.619180675940482]
Graphical user interface (GUI) agents built on multimodal large language models (MLLMs) have recently demonstrated strong decision-making abilities in screen-based interaction tasks. However, they remain highly vulnerable to pop-up-based environmental injection attacks, where malicious visual elements divert model attention and lead to unsafe or incorrect actions. Existing defense methods either require costly retraining or perform poorly under inductive interference. Our findings reveal that attention misalignment is a core vulnerability in MLLM agents and can be effectively addressed through selective layer-wise modulation.
arXiv Detail & Related papers (2025-07-13T08:36:09Z)
- DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities. This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior. We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control- and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z)
- SentinelAgent: Graph-based Anomaly Detection in Multi-Agent Systems [11.497269773189254]
We present a system-level anomaly detection framework tailored for large language model (LLM)-based multi-agent systems (MAS). First, we propose a graph-based framework that models agent interactions as dynamic execution graphs, enabling semantic anomaly detection at the node, edge, and path levels (a rough sketch of these three granularities appears after this list). Second, we introduce a pluggable SentinelAgent, an LLM-powered oversight agent that observes, analyzes, and intervenes in MAS execution based on security policies and contextual reasoning.
arXiv Detail & Related papers (2025-05-30T04:25:19Z)
- Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena. We find that we can successfully break the latest agents that use black-box frontier LMs, including those that perform reflection and tree search. We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z)
- Everything Perturbed All at Once: Enabling Differentiable Graph Attacks [61.61327182050706]
Graph neural networks (GNNs) have been shown to be vulnerable to adversarial attacks.
We propose a novel attack method called Differentiable Graph Attack (DGA) to efficiently generate effective attacks.
Compared to the state-of-the-art, DGA achieves nearly equivalent attack performance with 6 times less training time and 11 times smaller GPU memory footprint.
arXiv Detail & Related papers (2023-08-29T20:14:42Z)
- Graph Backdoor [53.70971502299977]
We present GTA, the first backdoor attack on graph neural networks (GNNs).
GTA departs in significant ways: it defines triggers as specific subgraphs, including both topological structures and descriptive features.
It can be instantiated for both transductive (e.g., node classification) and inductive (e.g., graph classification) tasks.
arXiv Detail & Related papers (2020-06-21T19:45:30Z)
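As a companion to the SentinelAgent entry above, the following is a rough illustration (not that paper's code) of anomaly checks at the three granularities it mentions: node level (one agent's output), edge level (one message), and path level (a sequence of hops). The scorer functions are hypothetical placeholders returning values in [0, 1].

```python
import networkx as nx


def node_score(agent: str, output: str) -> float:
    """Placeholder: how anomalous a single agent's output is."""
    return 0.0


def edge_score(sender: str, receiver: str, message: str) -> float:
    """Placeholder: how anomalous one message between two agents is."""
    return 0.0


def path_score(graph: nx.DiGraph, path: list[str]) -> float:
    """Path-level check: the worst edge score along a sequence of agents."""
    return max(
        edge_score(u, v, graph.edges[u, v].get("message", ""))
        for u, v in zip(path, path[1:])
    )


# Example execution graph with the message recorded on each edge.
exec_graph = nx.DiGraph()
exec_graph.add_edge("user_proxy", "coder", message="write a sorting function")
exec_graph.add_edge("coder", "reviewer", message="def sort(xs): return sorted(xs)")
print(path_score(exec_graph, ["user_proxy", "coder", "reviewer"]))  # 0.0 with placeholders
```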
This list is automatically generated from the titles and abstracts of the papers on this site.