Safeguard-by-Development: A Privacy-Enhanced Development Paradigm for Multi-Agent Collaboration Systems
- URL: http://arxiv.org/abs/2505.04799v2
- Date: Tue, 24 Jun 2025 05:20:37 GMT
- Title: Safeguard-by-Development: A Privacy-Enhanced Development Paradigm for Multi-Agent Collaboration Systems
- Authors: Jian Cui, Zichuan Li, Luyi Xing, Xiaojing Liao
- Abstract summary: Multi-agent collaboration systems (MACS) solve complex problems efficiently by leveraging each agent's specialization and communication between agents. The inherent exchange of information between agents and their interaction with external environments introduces significant risks of sensitive data leakage. Existing MACS lack fine-grained data protection controls, making it challenging to manage sensitive information securely.
- Score: 15.15485816037418
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-agent collaboration systems (MACS), powered by large language models (LLMs), solve complex problems efficiently by leveraging each agent's specialization and communication between agents. However, the inherent exchange of information between agents and their interaction with external environments, such as LLMs, tools, and users, inevitably introduces significant risks of sensitive data leakage, including vulnerabilities to attacks such as eavesdropping and prompt injection. Existing MACS lack fine-grained data protection controls, making it challenging to manage sensitive information securely. In this paper, we take the first step toward mitigating the data leakage threat in MACS through a privacy-enhanced MACS development paradigm, Maris. Maris enables rigorous message flow control within MACS by embedding reference monitors into key multi-agent conversation components. We implement Maris as an integral part of the widely adopted open-source multi-agent development frameworks AutoGen and LangChain. To evaluate its effectiveness, we develop a Privacy Assessment Framework that emulates MACS under different threat scenarios. Our evaluation shows that Maris effectively mitigates sensitive data leakage threats across three different task suites while maintaining a high task success rate.
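As an illustration of the reference-monitor idea described above, the sketch below mediates every inter-agent message against a flow policy and redacts disallowed content. The class names, policy format, and sensitivity classifier are illustrative assumptions, not Maris's actual implementation.

```python
# Illustrative sketch of a reference monitor mediating inter-agent messages.
# This is NOT the Maris implementation; class and policy names are hypothetical.
import re
from dataclasses import dataclass, field

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # example sensitive-data pattern

@dataclass
class FlowPolicy:
    # Which (sender, receiver) pairs may carry which sensitivity labels.
    allowed: dict = field(default_factory=dict)

    def permits(self, sender: str, receiver: str, label: str) -> bool:
        return label in self.allowed.get((sender, receiver), set())

class ReferenceMonitor:
    """Mediates every message before it reaches the next agent or tool."""

    def __init__(self, policy: FlowPolicy):
        self.policy = policy

    def classify(self, content: str) -> str:
        # Toy classifier; a real system would use much richer labeling.
        return "sensitive" if SSN_RE.search(content) else "public"

    def forward(self, sender: str, receiver: str, content: str) -> str:
        label = self.classify(content)
        if self.policy.permits(sender, receiver, label):
            return content
        # Redact disallowed flows instead of silently passing them on.
        return SSN_RE.sub("[REDACTED]", content)

# Example: the planner agent may not pass sensitive data to an external web tool.
policy = FlowPolicy(allowed={("planner", "web_tool"): {"public"}})
monitor = ReferenceMonitor(policy)
print(monitor.forward("planner", "web_tool", "Lookup SSN 123-45-6789"))
```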
Related papers
- Towards Unifying Quantitative Security Benchmarking for Multi Agent Systems [0.0]
Evolving AI systems increasingly deploy multi-agent architectures where autonomous agents collaborate, share information, and delegate tasks through developing protocols. One such risk is a cascading risk: a breach in one agent can cascade through the system, compromising others by exploiting inter-agent trust. In an ACI attack, a malicious input or tool exploit injected at one agent leads to cascading compromises and amplified downstream effects across agents that trust its outputs.
arXiv Detail & Related papers (2025-07-23T13:51:28Z) - SafeMobile: Chain-level Jailbreak Detection and Automated Evaluation for Multimodal Mobile Agents [58.21223208538351]
This work explores the security issues surrounding mobile multimodal agents. It attempts to construct a risk discrimination mechanism by incorporating behavioral sequence information. It also designs an automated assisted assessment scheme based on a large language model.
arXiv Detail & Related papers (2025-07-01T15:10:00Z) - From Prompt Injections to Protocol Exploits: Threats in LLM-Powered AI Agents Workflows [1.202155693533555]
Large language models (LLMs) with structured function-calling interfaces have dramatically expanded capabilities for real-time data retrieval and computation. Yet, the explosive proliferation of plugins, connectors, and inter-agent protocols has outpaced discovery mechanisms and security practices. We introduce the first unified, end-to-end threat model for LLM-agent ecosystems, spanning host-to-tool and agent-to-agent communications.
arXiv Detail & Related papers (2025-06-29T14:32:32Z) - CoTGuard: Using Chain-of-Thought Triggering for Copyright Protection in Multi-Agent LLM Systems [55.57181090183713]
We introduce CoTGuard, a novel framework for copyright protection that leverages trigger-based detection within Chain-of-Thought reasoning. Specifically, by embedding specific trigger queries into agent prompts, we can activate specific CoT segments and monitor intermediate reasoning steps for unauthorized content reproduction. This approach enables fine-grained, interpretable detection of copyright violations in collaborative agent scenarios.
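A minimal sketch of the trigger-and-monitor idea, assuming a simple n-gram overlap check against a protected corpus; CoTGuard's actual detection logic and trigger design are not reproduced here.

```python
# Illustrative sketch of trigger-based CoT monitoring for content reproduction.
# Not CoTGuard's code; the trigger text and overlap threshold are assumptions.

def ngram_set(text: str, n: int = 8) -> set:
    toks = text.split()
    return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def flag_reproduction(cot_trace: str, protected_texts: list, threshold: float = 0.2) -> bool:
    """Return True if the chain-of-thought overlaps heavily with protected content."""
    trace_ngrams = ngram_set(cot_trace)
    if not trace_ngrams:
        return False
    for doc in protected_texts:
        overlap = len(trace_ngrams & ngram_set(doc)) / len(trace_ngrams)
        if overlap >= threshold:
            return True
    return False

# A trigger query is appended to the agent prompt so that memorized protected
# text, if any, tends to surface in the intermediate reasoning that is monitored.
TRIGGER = "Before answering, recall any passages you associate with this topic."
```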
arXiv Detail & Related papers (2025-05-26T01:42:37Z) - AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities.<n>We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o.<n>We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z) - Privacy-Preserving Federated Embedding Learning for Localized Retrieval-Augmented Generation [60.81109086640437]
We propose a novel framework called Federated Retrieval-Augmented Generation (FedE4RAG). FedE4RAG facilitates collaborative training of client-side RAG retrieval models. We apply homomorphic encryption within federated learning to safeguard model parameters.
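A minimal sketch of homomorphically encrypted federated averaging in the spirit of FedE4RAG's parameter protection, assuming the third-party python-paillier (`phe`) package; the paper's actual protocol and key management differ.

```python
# Sketch of federated averaging over Paillier-encrypted client updates.
# Assumes the `phe` (python-paillier) package; not FedE4RAG's implementation.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# Each client encrypts its local model update before sending it to the server.
client_updates = [[0.12, -0.05], [0.08, 0.02], [0.10, -0.01]]
encrypted_updates = [[public_key.encrypt(w) for w in upd] for upd in client_updates]

# The server aggregates ciphertexts without ever seeing plaintext parameters.
num_clients = len(encrypted_updates)
encrypted_avg = [
    sum(enc[i] for enc in encrypted_updates) * (1.0 / num_clients)
    for i in range(len(encrypted_updates[0]))
]

# Only the key holder (e.g., the clients jointly or a trusted party) decrypts.
print([round(private_key.decrypt(c), 4) for c in encrypted_avg])
```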
arXiv Detail & Related papers (2025-04-27T04:26:02Z) - Privacy-Enhancing Paradigms within Federated Multi-Agent Systems [47.76990892943637]
LLM-based Multi-Agent Systems (MAS) have proven highly effective in solving complex problems by integrating multiple agents, each performing different roles. In this paper, we introduce the concept of Federated MAS, highlighting the fundamental differences between Federated MAS and traditional FL. We then identify key challenges in developing Federated MAS, including: 1) heterogeneous privacy protocols among agents, 2) structural differences in multi-party conversations, and 3) dynamic conversational network structures. To address these challenges, we propose Embedded Privacy-Enhancing Agents (EPEAgent), an innovative solution that integrates seamlessly into the Retrieval-Augmented Generation phase and the
arXiv Detail & Related papers (2025-03-11T08:38:45Z) - MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks [109.53357276796655]
Retrieval-Augmented Generation (RAG) enhances multimodal large language models (MLLMs) by grounding responses in query-relevant external knowledge. This reliance poses a critical yet underexplored safety risk: knowledge poisoning attacks. We propose MM-PoisonRAG, a novel knowledge poisoning attack framework with two attack strategies.
arXiv Detail & Related papers (2025-02-25T04:23:59Z) - Red-Teaming LLM Multi-Agent Systems via Communication Attacks [10.872328358364776]
Large Language Model-based Multi-Agent Systems (LLM-MAS) have revolutionized complex problem-solving capability by enabling sophisticated agent collaboration through message-based communications. We introduce Agent-in-the-Middle (AiTM), a novel attack that exploits the fundamental communication mechanisms in LLM-MAS by intercepting and manipulating inter-agent messages.
arXiv Detail & Related papers (2025-02-20T18:55:39Z) - The Communication-Friendly Privacy-Preserving Machine Learning against Malicious Adversaries [14.232901861974819]
Privacy-preserving machine learning (PPML) is an innovative approach that allows for secure data analysis while safeguarding sensitive information.
We introduce an efficient protocol for secure linear function evaluation.
We extend the protocol to handle linear and non-linear layers, ensuring compatibility with a wide range of machine-learning models.
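The toy sketch below illustrates secure evaluation of a public linear function over additively secret-shared inputs in the semi-honest, two-party setting; the paper's protocol targets malicious adversaries and private coefficients and is substantially more involved.

```python
# Toy additive-secret-sharing evaluation of a public linear function w·x + b.
import random

PRIME = 2**61 - 1  # all arithmetic is done modulo a large prime

def share(x: int) -> tuple:
    r = random.randrange(PRIME)
    return r, (x - r) % PRIME           # two additive shares of x

def linear_on_share(w: list, x_share: list, b: int, add_bias: bool) -> int:
    acc = sum(wi * xi for wi, xi in zip(w, x_share)) % PRIME
    return (acc + b) % PRIME if add_bias else acc

w, b = [3, 5, 2], 7
x = [10, 20, 30]

shares = [share(v) for v in x]
party0 = [s[0] for s in shares]
party1 = [s[1] for s in shares]

# Each party evaluates the linear function on its own shares locally.
y0 = linear_on_share(w, party0, b, add_bias=True)
y1 = linear_on_share(w, party1, b, add_bias=False)

print((y0 + y1) % PRIME)   # 3*10 + 5*20 + 2*30 + 7 = 197
```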
arXiv Detail & Related papers (2024-11-14T08:55:14Z) - AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases [73.04652687616286]
We propose AgentPoison, the first backdoor attack targeting generic and RAG-based LLM agents by poisoning their long-term memory or RAG knowledge base.
Unlike conventional backdoor attacks, AgentPoison requires no additional model training or fine-tuning.
On each agent, AgentPoison achieves an average attack success rate higher than 80% with minimal impact on benign performance.
arXiv Detail & Related papers (2024-07-17T17:59:47Z) - Privacy Preserving Multi-Agent Reinforcement Learning in Supply Chains [5.436598805836688]
This paper addresses privacy concerns in multiagent reinforcement learning (MARL) within the context of supply chains.
We propose a game-theoretic, privacy-related mechanism, utilizing a secure multi-party computation framework in MARL settings.
We present a learning mechanism that carries out floating point operations in a privacy-preserving manner.
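A minimal sketch of the standard fixed-point encoding that lets integer secret-sharing schemes handle fractional values; the paper's MPC-based learning mechanism is considerably richer.

```python
# Fixed-point trick for "floating point" arithmetic over integer secret shares.
import random

PRIME = 2**61 - 1
SCALE = 2**16          # fractional precision of the fixed-point encoding

def encode(x: float) -> int:
    return int(round(x * SCALE)) % PRIME

def decode(v: int) -> float:
    if v > PRIME // 2:  # map back from the modular representation of negatives
        v -= PRIME
    return v / SCALE

def share(v: int) -> tuple:
    r = random.randrange(PRIME)
    return r, (v - r) % PRIME

# Two supply-chain agents privately add their demand forecasts.
a, b = 3.25, -1.5
a0, a1 = share(encode(a))
b0, b1 = share(encode(b))

sum0 = (a0 + b0) % PRIME   # each party adds its own shares locally
sum1 = (a1 + b1) % PRIME
print(decode((sum0 + sum1) % PRIME))   # 1.75
```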
arXiv Detail & Related papers (2023-12-09T21:25:21Z) - Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception.
We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception.
We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z) - DPMAC: Differentially Private Communication for Cooperative Multi-Agent Reinforcement Learning [21.961558461211165]
Communication lays the foundation for cooperation in human society and in multi-agent reinforcement learning (MARL).
We propose the differentially private multi-agent communication (DPMAC) algorithm, which protects the sensitive information of individual agents by equipping each agent with a local message sender with a rigorous $(\epsilon, \delta)$-differential privacy guarantee.
We prove the existence of a Nash equilibrium in cooperative MARL with privacy-preserving communication, which suggests that this problem is game-theoretically learnable.
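A minimal sketch of a locally differentially private message sender using the standard Gaussian mechanism; DPMAC's learned sender and noise calibration are not reproduced here.

```python
# Sketch of an (epsilon, delta)-DP message sender: clip the outgoing message
# embedding, then add Gaussian noise calibrated by the standard mechanism.
import numpy as np

def dp_send(message, epsilon, delta, clip_norm=1.0, rng=None):
    rng = rng or np.random.default_rng()
    # Bound each message's sensitivity by clipping its L2 norm.
    norm = np.linalg.norm(message)
    clipped = message * min(1.0, clip_norm / (norm + 1e-12))
    # Standard Gaussian-mechanism noise scale for (epsilon, delta)-DP.
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped + rng.normal(0.0, sigma, size=message.shape)

msg = np.array([0.4, -0.2, 0.9])
print(dp_send(msg, epsilon=1.0, delta=1e-5))
```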
arXiv Detail & Related papers (2023-08-19T04:26:23Z) - PP-MARL: Efficient Privacy-Preserving Multi-Agent Reinforcement Learning for Cooperative Intelligence in Communications [15.955599283219298]
Multi-agent reinforcement learning (MARL) is a popular approach for achieving cooperative intelligence (CI) in communication problems. Ensuring privacy protection for MARL is a challenging task because of the presence of heterogeneous agents that learn interdependently via sharing information. We propose PP-MARL, an efficient privacy-preserving learning scheme for MARL.
arXiv Detail & Related papers (2022-04-26T04:08:27Z) - Mis-spoke or mis-lead: Achieving Robustness in Multi-Agent Communicative Reinforcement Learning [37.24674549469648]
We take the first step towards conducting message attacks on multi-agent communicative reinforcement learning (MACRL) methods.
We develop a defence method via message reconstruction.
We consider the ability of the malicious agent to adapt to the changing and improving defensive communicative policies.
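An illustrative sketch of a message-reconstruction defence: incoming inter-agent messages are passed through a small autoencoder trained on benign traffic so that perturbed messages are pulled back toward the benign manifold. The architecture and sizes are assumptions, not the paper's model.

```python
# Sketch of an autoencoder-based message-reconstruction filter for MACRL.
import torch
import torch.nn as nn

class MessageReconstructor(nn.Module):
    def __init__(self, msg_dim: int = 32, hidden: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(msg_dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, msg_dim)

    def forward(self, msg: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(msg))

reconstructor = MessageReconstructor()
incoming = torch.randn(4, 32)        # possibly tampered inter-agent messages
defended = reconstructor(incoming)   # reconstructed messages fed to the policy
```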
arXiv Detail & Related papers (2021-08-09T04:41:47Z) - Collaborative Visual Navigation [69.20264563368762]
We propose a large-scale 3D dataset, CollaVN, for multi-agent visual navigation (MAVN).
Diverse MAVN variants are explored to make our problem more general.
A memory-augmented communication framework is proposed. Each agent is equipped with a private, external memory to persistently store communication information.
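A small sketch of the private external memory idea, with illustrative class and method names rather than CollaVN's API.

```python
# Each agent keeps a private, persistent memory of past messages it can re-read.
from collections import deque

class MemoryAgent:
    def __init__(self, agent_id: str, memory_size: int = 64):
        self.agent_id = agent_id
        self.memory = deque(maxlen=memory_size)   # private, per-agent store

    def receive(self, sender: str, message: dict) -> None:
        # Persist every incoming message instead of using it once and discarding it.
        self.memory.append({"from": sender, "msg": message})

    def act(self, observation: dict) -> dict:
        # A real policy would attend over memory; here we just expose the context.
        return {"observation": observation, "context": list(self.memory)}

a, b = MemoryAgent("a"), MemoryAgent("b")
a.receive("b", {"landmark": "door", "pos": (3, 1)})
print(a.act({"pos": (2, 1)}))
```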
arXiv Detail & Related papers (2021-07-02T15:48:16Z)