GAMA: A General Anonymizing Multi-Agent System for Privacy Preservation Enhanced by Domain Rules and Disproof Mechanism
- URL: http://arxiv.org/abs/2509.10018v2
- Date: Sat, 04 Oct 2025 22:46:50 GMT
- Title: GAMA: A General Anonymizing Multi-Agent System for Privacy Preservation Enhanced by Domain Rules and Disproof Mechanism
- Authors: Hailong Yang, Renhuo Zhao, Guanjin Wang, Zhaohong Deng,
- Abstract summary: General Anonymizing Multi-Agent System (GAMA)<n>GAMA divides the agents' workspace into private and public spaces, ensuring privacy through a structured anonymization mechanism.<n>We evaluate GAMA on two general question-answering datasets, a public privacy leakage benchmark, and two customized question-answering datasets related to privacy.
- Score: 14.491054279033968
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the rapid advancement of Large Language Models (LLMs), LLM-based agents exhibit exceptional abilities in understanding and generating natural language, enabling human-like collaboration and information transmission in LLM-based Multi-Agent Systems (MAS). High-performance LLMs are often hosted on web servers in public cloud environments. When tasks involve private data, MAS cannot securely utilize these LLMs without implementing the agentic privacy-preserving mechanism. To address this challenge, we propose a General Anonymizing Multi-Agent System (GAMA), which divides the agents' workspace into private and public spaces, ensuring privacy through a structured anonymization mechanism. In the private space, agents handle sensitive data, while in the public web space, only anonymized data is utilized. GAMA incorporates two key modules to mitigate semantic loss caused by anonymization: Domain-Rule-based Knowledge Enhancement (DRKE) and Disproof-based Logic Enhancement (DLE). We evaluate GAMA on two general question-answering datasets, a public privacy leakage benchmark, and two customized question-answering datasets related to privacy. The results demonstrate that GAMA outperforms existing baselines on the evaluated datasets in terms of both task accuracy and privacy preservation metrics.
Related papers
- When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing [61.80513991207956]
This work focuses on the challenge of how to restore surrogate-driven protected data in diverse MLLM scenarios.<n>We first bridge this research gap by contributing the SPPE (Surrogate Privacy Protected Editable) dataset.<n>We introduce a unified approach that reliably reconstructs private content while preserving the fidelity of MLLM-generated edits.
arXiv Detail & Related papers (2025-12-08T04:59:03Z) - 1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning [18.751008976082655]
We introduce a multi-agent framework that decomposes privacy reasoning into specialized subtasks (extraction, classification)<n>We conduct a systematic ablation over information-flow topologies, revealing when and why upstream detection mistakes cascade into downstream leakage.
arXiv Detail & Related papers (2025-08-11T06:34:09Z) - MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation [54.410825977390274]
Existing benchmarks to evaluate contextual privacy in LLM-agents primarily assess single-turn, low-complexity tasks.<n>We first present a benchmark - MAGPIE comprising 158 real-life high-stakes scenarios across 15 domains.<n>We then evaluate the current state-of-the-art LLMs on their understanding of contextually private data and their ability to collaborate without violating user privacy.
arXiv Detail & Related papers (2025-06-25T18:04:25Z) - Automated Privacy Information Annotation in Large Language Model Interactions [40.87806981624453]
Users interacting with large language models (LLMs) under their real identifiers often unknowingly risk disclosing private information.<n>Existing privacy detection methods were designed for different objectives and application domains.<n>We construct a large-scale multilingual dataset with 249K user queries and 154K annotated privacy phrases.
arXiv Detail & Related papers (2025-05-27T09:00:12Z) - Privacy-Preserving Federated Embedding Learning for Localized Retrieval-Augmented Generation [60.81109086640437]
We propose a novel framework called Federated Retrieval-Augmented Generation (FedE4RAG)<n>FedE4RAG facilitates collaborative training of client-side RAG retrieval models.<n>We apply homomorphic encryption within federated learning to safeguard model parameters.
arXiv Detail & Related papers (2025-04-27T04:26:02Z) - Privacy-Enhancing Paradigms within Federated Multi-Agent Systems [47.76990892943637]
LLM-based Multi-Agent Systems (MAS) have proven highly effective in solving complex problems by integrating multiple agents, each performing different roles.<n>In this paper, we introduce the concept of Federated MAS, highlighting the fundamental differences between Federated MAS and traditional FL.<n>We then identify key challenges in developing Federated MAS, including: 1) heterogeneous privacy protocols among agents, 2) structural differences in multi-party conversations, and 3) dynamic conversational network structures.<n>To address these challenges, we propose Embedded Privacy-Enhancing Agents (EPEAgent), an innovative solution that integrates seamlessly into the Retrieval-Augmented Generation phase and the
arXiv Detail & Related papers (2025-03-11T08:38:45Z) - PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models [10.050972891318324]
We propose a privacy preservation pipeline for protecting privacy and sensitive information during interactions between users and large language models.<n>We construct SensitiveQA, the first privacy open-ended question-answering dataset.<n>Our proposed solution employs a multi-stage strategy aimed at preemptively securing user information while simultaneously preserving the response quality of cloud-based LLMs.
arXiv Detail & Related papers (2025-02-19T09:17:07Z) - Robust Utility-Preserving Text Anonymization Based on Large Language Models [80.5266278002083]
Anonymizing text that contains sensitive information is crucial for a wide range of applications.<n>Existing techniques face the emerging challenges of the re-identification ability of large language models.<n>We propose a framework composed of three key components: a privacy evaluator, a utility evaluator, and an optimization component.
arXiv Detail & Related papers (2024-07-16T14:28:56Z) - Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP motivated by applications where it necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z) - Federated Domain-Specific Knowledge Transfer on Large Language Models Using Synthetic Data [53.70870879858533]
We introduce a Federated Domain-specific Knowledge Transfer framework.
It enables domain-specific knowledge transfer from LLMs to SLMs while preserving clients' data privacy.
The proposed FDKT framework consistently and greatly improves SLMs' task performance by around 5% with a privacy budget of less than 10.
arXiv Detail & Related papers (2024-05-23T06:14:35Z) - Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy
Protection [6.201275002179716]
We introduce the HaS framework, where "H(ide)" and "S(eek)" represent its two core processes: hiding private entities for anonymization and seeking private entities for de-anonymization.
To quantitatively assess HaS's privacy protection performance, we propose both black-box and white-box adversarial models.
arXiv Detail & Related papers (2023-09-06T14:54:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.