GUARDIAN: Safeguarding LLM Multi-Agent Collaborations with Temporal Graph Modeling
- URL: http://arxiv.org/abs/2505.19234v2
- Date: Wed, 15 Oct 2025 15:21:41 GMT
- Title: GUARDIAN: Safeguarding LLM Multi-Agent Collaborations with Temporal Graph Modeling
- Authors: Jialong Zhou, Lichao Wang, Xiao Yang,
- Abstract summary: Large language models (LLMs) enable the development of intelligent agents capable of engaging in complex and multi-turn dialogues. GUARDIAN is a unified method for detecting and mitigating multiple safety concerns in GUARDing Intelligent Agent collaboratioNs.
- Score: 5.798273384241793
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The emergence of large language models (LLMs) enables the development of intelligent agents capable of engaging in complex and multi-turn dialogues. However, multi-agent collaboration faces critical safety challenges, such as hallucination amplification and the injection and propagation of errors. This paper presents GUARDIAN, a unified method for detecting and mitigating multiple safety concerns in GUARDing Intelligent Agent collaboratioNs. By modeling the multi-agent collaboration process as a discrete-time temporal attributed graph, GUARDIAN explicitly captures the propagation dynamics of hallucinations and errors. An unsupervised encoder-decoder architecture with an incremental training paradigm learns to reconstruct node attributes and graph structures from latent embeddings, enabling the identification of anomalous nodes and edges with high precision. Moreover, we introduce a graph abstraction mechanism based on the Information Bottleneck Theory, which compresses temporal interaction graphs while preserving essential patterns. Extensive experiments demonstrate GUARDIAN's effectiveness in safeguarding LLM multi-agent collaborations against diverse safety vulnerabilities, achieving state-of-the-art accuracy with efficient resource utilization. The code is available at https://github.com/JialongZhou666/GUARDIAN
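The abstract's pipeline can be made concrete with a minimal sketch: each dialogue round is a graph snapshot (adjacency over agents plus per-agent message embeddings), and a reconstruction model scores nodes by how poorly their attributes are explained by a low-dimensional latent space. The sketch below is illustrative only, not the GUARDIAN implementation: all names (`rounds`, `anomaly_scores`) are hypothetical, random data stands in for real message embeddings, and a truncated-SVD projection stands in for the paper's learned encoder-decoder.

```python
import numpy as np

# Each dialogue round t is a snapshot of a discrete-time temporal
# attributed graph: an adjacency matrix A_t over agents (who replied to
# whom) and a node-attribute matrix X_t (embeddings of agent messages).
rng = np.random.default_rng(0)
n_agents, d = 4, 8
rounds = []
for t in range(5):
    A = (rng.random((n_agents, n_agents)) < 0.5).astype(float)
    X = rng.normal(size=(n_agents, d))
    rounds.append((A, X))

# A linear "autoencoder" (truncated SVD / PCA) stands in for the learned
# encoder-decoder: node-snapshots whose attributes reconstruct poorly from
# the low-rank latent space receive high anomaly scores.
X_all = np.vstack([X for _, X in rounds])          # (rounds * agents, d)
mu = X_all.mean(axis=0)
U, S, Vt = np.linalg.svd(X_all - mu, full_matrices=False)
k = 2                                              # latent dimension
P = Vt[:k].T @ Vt[:k]                              # projector onto top-k subspace
recon = (X_all - mu) @ P + mu
anomaly_scores = np.linalg.norm(X_all - recon, axis=1)
print(anomaly_scores.shape)                        # one score per (round, agent)
```

In the paper's setting the reconstruction target additionally includes the graph structure, and training is incremental across rounds; this sketch only shows the attribute-reconstruction half of that idea.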
Related papers
- From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration [27.233204826914243]
Collaborative mechanisms may cause minor inaccuracies to solidify into system-level false consensus through iteration. Existing protections often rely on single-agent validation or require modifications to the collaboration architecture. We propose a propagation dynamics model tailored for LLM-MAS that abstracts collaboration as a directed dependency graph.
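The error-cascade idea in this summary can be sketched as a simple iterative process on a directed dependency graph, where an agent's error probability grows when agents it depends on err. This is an illustrative toy, not the paper's actual propagation dynamics model; the dependency matrix, the influence weight `alpha`, and the update rule are all assumptions made for the sketch.

```python
import numpy as np

# dep[i, j] = 1 means agent i consumes agent j's output.
dep = np.array([[0, 0, 0, 0],
                [1, 0, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 1, 0]], dtype=float)

p = np.array([0.2, 0.05, 0.05, 0.05])  # initial per-agent error probability
alpha = 0.5                            # hypothetical upstream-influence weight

for _ in range(3):                     # iterate rounds of collaboration
    # An agent inherits an error unless every upstream dependency is clean;
    # it errs overall if it errs on its own OR inherits an upstream error.
    inherited = 1 - np.prod(1 - alpha * dep * p, axis=1)
    p = 1 - (1 - p) * (1 - inherited)

print(np.round(p, 3))
```

Even in this toy, downstream agents' error probabilities rise round over round while the source agent (with no dependencies) stays fixed, which is the "spark to fire" cascade the title alludes to.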
arXiv Detail & Related papers (2026-03-04T11:45:27Z) - SYNAPSE: Empowering LLM Agents with Episodic-Semantic Memory via Spreading Activation [29.545442480332515]
We introduce Synapse, a unified memory architecture built on spreading activation rather than static, pre-computed links. We show that Synapse significantly outperforms state-of-the-art methods on complex temporal and multi-hop reasoning tasks. Our code and data will be made publicly available upon acceptance.
arXiv Detail & Related papers (2026-01-06T06:19:58Z) - Explainable and Fine-Grained Safeguarding of LLM Multi-Agent Systems via Bi-Level Graph Anomaly Detection [76.91230292971115]
Large language model (LLM)-based multi-agent systems (MAS) have shown strong capabilities in solving complex tasks. XG-Guard is an explainable and fine-grained safeguarding framework for detecting malicious agents in MAS.
arXiv Detail & Related papers (2025-12-21T13:46:36Z) - InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs [72.5651722107621]
InterAgent is an end-to-end framework for text-driven physics-based multi-agent humanoid control. We introduce an autoregressive diffusion transformer equipped with multi-stream blocks, which decouples proprioception, exteroception, and action to mitigate cross-modal interference. We also propose a novel interaction graph exteroception representation that explicitly captures fine-grained joint-to-joint spatial dependencies.
arXiv Detail & Related papers (2025-12-08T10:46:01Z) - SentinelAgent: Graph-based Anomaly Detection in Multi-Agent Systems [11.497269773189254]
We present a system-level anomaly detection framework tailored for large language model (LLM)-based multi-agent systems (MAS). We propose a graph-based framework that models agent interactions as dynamic execution graphs, enabling semantic anomaly detection at the node, edge, and path levels. We also introduce SentinelAgent, a pluggable, LLM-powered oversight agent that observes, analyzes, and intervenes in MAS execution based on security policies and contextual reasoning.
arXiv Detail & Related papers (2025-05-30T04:25:19Z) - Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning [62.640169289390535]
SPLIT-RAG is a multi-agent RAG framework that addresses the limitations of existing approaches with question-driven semantic graph partitioning and collaborative subgraph retrieval. The framework first creates a Semantic Partitioning of Linked Information, then uses type-specialized knowledge bases to achieve multi-agent RAG. The attribute-aware graph segmentation divides knowledge graphs into semantically coherent subgraphs, ensuring subgraphs align with different query types. A hierarchical merging module resolves inconsistencies across subgraph-derived answers through logical verification.
arXiv Detail & Related papers (2025-05-20T06:44:34Z) - Perspectives for Direct Interpretability in Multi-Agent Deep Reinforcement Learning [0.41783829807634765]
Multi-Agent Deep Reinforcement Learning (MADRL) has proven effective in solving complex problems in robotics and games. This paper advocates for direct interpretability, generating post hoc explanations directly from trained models. We explore modern methods, including relevance backpropagation, knowledge editing, model steering, activation patching, sparse autoencoders, and circuit discovery.
arXiv Detail & Related papers (2025-02-02T09:15:27Z) - Algorithmic Segmentation and Behavioral Profiling for Ransomware Detection Using Temporal-Correlation Graphs [0.0]
A novel framework was introduced, leveraging Temporal-Correlation Graphs to model the intricate relationships and temporal patterns inherent in malicious operations. Experiments demonstrated the framework's effectiveness across diverse ransomware families, with consistently high precision, recall, and overall detection accuracy. The research contributes to advancing cybersecurity technologies by integrating dynamic graph analytics and machine learning for future innovations in threat detection.
arXiv Detail & Related papers (2025-01-29T06:09:25Z) - Scaling Large Language Model-based Multi-Agent Collaboration [72.8998796426346]
Recent breakthroughs in large language model-driven autonomous agents have revealed that multi-agent collaboration often surpasses any individual agent through collective reasoning. This study explores whether the continuous addition of collaborative agents yields similar benefits.
arXiv Detail & Related papers (2024-06-11T11:02:04Z) - Secret Collusion among Generative AI Agents: Multi-Agent Deception via Steganography [43.468790060808914]
Recent capability increases in large language models (LLMs) open up applications in which groups of communicating generative AI agents solve joint tasks. This poses privacy and security challenges concerning the unauthorised sharing of information. Modern steganographic techniques could render such dynamics hard to detect.
arXiv Detail & Related papers (2024-02-12T09:31:21Z) - Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning [53.00683059396803]
Masked image modeling (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z) - MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
MADiff is a diffusion-based multi-agent learning framework. It works as both a decentralized policy and a centralized controller. Our experiments demonstrate that MADiff outperforms baseline algorithms across various multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z) - Decentralized Adversarial Training over Graphs [44.03711922549992]
The vulnerability of machine learning models to adversarial attacks has been attracting attention in recent years. We develop a decentralized adversarial training framework for multi-agent systems.
arXiv Detail & Related papers (2023-03-23T15:05:16Z) - Soft Hierarchical Graph Recurrent Networks for Many-Agent Partially Observable Environments [9.067091068256747]
We propose a novel network structure called the hierarchical graph recurrent network (HGRN) for multi-agent cooperation under partial observability.
Building on these techniques, we propose a value-based MADRL algorithm called Soft-HGRN and its actor-critic variant named SAC-HRGN.
arXiv Detail & Related papers (2021-09-05T09:51:25Z) - Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copulas, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
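The key property this summary describes, marginals and dependence structure learned separately, can be illustrated with a small numeric sketch. This is not the paper's model: it simply shows that rank (copula) dependence between two variables is unchanged by monotone transforms of their marginals, which is why a copula can "solely and fully" capture inter-agent dependence. The variables and transforms below are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Latent correlated behavior of two agents (a Gaussian copula with rho = 0.8).
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=5000)

# Apply different strictly monotone marginal transforms to each agent:
# the marginals change, but the dependence structure (ranks) does not.
a = np.exp(z[:, 0])   # lognormal marginal for agent 1
b = z[:, 1] ** 3      # heavy-tailed marginal for agent 2

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx = x.argsort().argsort()
    ry = y.argsort().argsort()
    return np.corrcoef(rx, ry)[0, 1]

# Identical rank correlation before and after the marginal transforms.
print(round(spearman(a, b), 3), round(spearman(z[:, 0], z[:, 1]), 3))
```

Because strictly monotone transforms preserve ranks, the two Spearman values are equal, demonstrating the separation of marginal behavior from inter-agent coordination that copula-based models exploit.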
arXiv Detail & Related papers (2021-07-10T03:49:41Z) - Information Obfuscation of Graph Neural Networks [96.8421624921384]
We study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data.
We propose a framework to locally filter out pre-determined sensitive attributes via adversarial training with the total variation and the Wasserstein distance.
arXiv Detail & Related papers (2020-09-28T17:55:04Z) - Graph Backdoor [53.70971502299977]
We present GTA, the first backdoor attack on graph neural networks (GNNs).
GTA departs in significant ways: it defines triggers as specific subgraphs, including both topological structures and descriptive features.
It can be instantiated for both transductive (e.g., node classification) and inductive (e.g., graph classification) tasks.
arXiv Detail & Related papers (2020-06-21T19:45:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.