PrivAct: Internalizing Contextual Privacy Preservation via Multi-Agent Preference Training
- URL: http://arxiv.org/abs/2602.13840v1
- Date: Sat, 14 Feb 2026 18:07:51 GMT
- Title: PrivAct: Internalizing Contextual Privacy Preservation via Multi-Agent Preference Training
- Authors: Yuhan Cheng, Hancheng Ye, Hai Helen Li, Jingwei Sun, Yiran Chen
- Abstract summary: PrivAct is a contextual privacy-aware multi-agent learning framework. It internalizes contextual privacy preservation directly into models' generation behavior for privacy-compliant agentic actions. Experiments show consistent improvements in contextual privacy preservation, reducing leakage rates by up to 12.32%.
- Score: 14.144464261335031
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language model (LLM) agents are increasingly deployed in personalized tasks involving sensitive, context-dependent information, where privacy violations may arise in agents' actions because contextual privacy is implicit rather than explicitly stated. Existing approaches rely on external, inference-time interventions, which are brittle, scenario-specific, and may expand the privacy attack surface. We propose PrivAct, a contextual privacy-aware multi-agent learning framework that internalizes contextual privacy preservation directly into models' generation behavior for privacy-compliant agentic actions. By embedding privacy preferences into each agent, PrivAct enhances system-wide contextual integrity while achieving a more favorable privacy-helpfulness tradeoff. Experiments across multiple LLM backbones and benchmarks demonstrate consistent improvements in contextual privacy preservation, reducing leakage rates by up to 12.32% while maintaining comparable helpfulness, as well as zero-shot generalization and robustness across diverse multi-agent topologies. Code is available at https://github.com/chengyh23/PrivAct.
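The abstract does not spell out the training objective, but "multi-agent preference training" suggests a preference-optimization loss applied per agent. A minimal, hypothetical sketch assuming a DPO-style objective over paired agent actions; `policy_logps_safe`/`policy_logps_leaky` are summed token log-probabilities of a privacy-compliant and a privacy-leaking action under the trained policy, and the `ref_*` values come from a frozen reference model (all names are illustrative, not from the paper):

```python
import torch
import torch.nn.functional as F

def privacy_preference_loss(policy_logps_safe, policy_logps_leaky,
                            ref_logps_safe, ref_logps_leaky, beta=0.1):
    """DPO-style preference loss (one plausible reading of the paper):
    widen the margin of the privacy-compliant action over the leaky one,
    relative to a frozen reference model."""
    policy_margin = policy_logps_safe - policy_logps_leaky
    ref_margin = ref_logps_safe - ref_logps_leaky
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Example with dummy per-sample log-probabilities for one agent's batch:
loss = privacy_preference_loss(
    torch.tensor([-12.3, -9.8]), torch.tensor([-11.0, -10.2]),
    torch.tensor([-12.0, -10.0]), torch.tensor([-11.5, -9.9]),
)
```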
Related papers
- Contextualized Privacy Defense for LLM Agents [84.30907378390512]
LLM agents increasingly act on users' personal information, yet existing privacy defenses remain limited in both design and adaptability. We propose Contextualized Defense Instructing (CDI), a new privacy defense paradigm. We show that our CDI consistently achieves a better balance between privacy preservation (94.2%) and helpfulness (80.6%) than baselines.
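The summary leaves the form of the defense instructions open; one simplified, hypothetical reading is that a contextualized instruction is synthesized from the scenario's contextual-integrity parameters (data type, recipient, transmission purpose) and prepended to the agent's prompt:

```python
def contextual_defense_instruction(data_type, recipient, purpose):
    """Hypothetical CDI-style prompt prefix built from contextual-integrity
    parameters; the real method's instruction format is not specified here."""
    return (
        f"You are handling the user's {data_type}. Share it only with "
        f"{recipient} and only insofar as it serves the purpose of {purpose}. "
        "Withhold or abstract any detail that a reasonable person would not "
        "disclose in this context."
    )

prompt = contextual_defense_instruction(
    "medical history", "the scheduling assistant", "booking a follow-up visit"
) + "\n\nUser request: ..."
```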
arXiv Detail & Related papers (2026-03-03T13:35:33Z)
- Privacy Collapse: Benign Fine-Tuning Can Break Contextual Privacy in Language Models [47.866853046761044]
We find that diverse, subtle patterns in training data can degrade contextual privacy. Fine-tuned models lose their ability to reason about contextual privacy norms. Our results reveal a critical gap in current safety evaluations.
arXiv Detail & Related papers (2026-01-21T17:53:06Z)
- 1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning [18.751008976082655]
We introduce a multi-agent framework that decomposes privacy reasoning into specialized subtasks (extraction, classification). We conduct a systematic ablation over information-flow topologies, revealing when and why upstream detection mistakes cascade into downstream leakage.
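Read literally, the decomposition can be sketched as a pipeline where one agent extracts candidate sensitive items and another classifies whether each may flow in the given context. The prompts and the `llm` chat-completion stand-in below are hypothetical:

```python
def check_pipeline(llm, message, context):
    """Sketch of a two-stage privacy-reasoning pipeline (extraction, then
    classification); upstream extraction misses cascade into downstream
    leaks, which is the failure mode a topology ablation would surface."""
    extracted = llm(f"List every piece of personal information in:\n{message}")
    items = [ln.strip("- ").strip() for ln in extracted.splitlines() if ln.strip()]
    to_redact = [
        item for item in items
        if "no" in llm(
            f"In context '{context}', may '{item}' be shared? Answer yes or no."
        ).lower()
    ]
    return to_redact  # items to withhold before the message is sent
```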
arXiv Detail & Related papers (2025-08-11T06:34:09Z)
- MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation [54.410825977390274]
Existing benchmarks to evaluate contextual privacy in LLM agents primarily assess single-turn, low-complexity tasks. We first present a benchmark, MAGPIE, comprising 158 real-life high-stakes scenarios across 15 domains. We then evaluate current state-of-the-art LLMs on their understanding of contextually private data and their ability to collaborate without violating user privacy.
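Benchmarks of this kind are typically scored by a leakage rate: the fraction of scenarios in which the agent's output contains information the scenario marks as contextually private. A simplified, hypothetical scorer; the string matching here stands in for the LLM-based judging such benchmarks usually employ:

```python
def leakage_rate(scenarios, agent):
    """Fraction of scenarios where the agent's response mentions any item
    labeled as contextually private. `scenarios` is assumed to be a list of
    dicts like {"task": str, "private": [str, ...]} (illustrative schema)."""
    leaks = 0
    for s in scenarios:
        response = agent(s["task"])
        if any(item.lower() in response.lower() for item in s["private"]):
            leaks += 1
    return leaks / len(scenarios)
```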
arXiv Detail & Related papers (2025-06-25T18:04:25Z)
- Token-Level Privacy in Large Language Models [7.4143291213663955]
We introduce dchi-stencil, a novel token-level privacy-preserving mechanism that integrates contextual and semantic information. By incorporating both semantic and contextual nuances, dchi-stencil achieves a robust balance between privacy and utility. This work highlights the potential of dchi-stencil to set a new standard for privacy-preserving NLP in modern, high-risk applications.
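The summary does not describe the mechanism itself, but the name points to dχ-privacy (metric differential privacy for text), where a token is replaced by a neighbor sampled with probability decaying exponentially in embedding distance. A sketch of that underlying idea, not of dchi-stencil's specific construction:

```python
import numpy as np

def dchi_replace(token_id, embeddings, epsilon, rng=None):
    """d_chi-privacy exponential mechanism: sample a replacement token with
    probability proportional to exp(-epsilon * distance), so semantically
    close tokens are far more likely. `embeddings` is a (vocab, dim) array."""
    rng = rng or np.random.default_rng()
    dists = np.linalg.norm(embeddings - embeddings[token_id], axis=1)
    logits = -epsilon * dists
    probs = np.exp(logits - logits.max())      # stable softmax over -eps*d
    return rng.choice(len(embeddings), p=probs / probs.sum())
```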
arXiv Detail & Related papers (2025-03-05T16:27:25Z)
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
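User-level DP bounds each user's total contribution rather than each example's. A minimal sketch of one standard realization, per-user gradient clipping with Gaussian noise (function names and the flattened-gradient representation are illustrative):

```python
import torch

def user_level_dp_step(per_user_grads, clip_norm, noise_multiplier):
    """Clip each user's aggregated (flattened) gradient to `clip_norm`, sum,
    and add Gaussian noise scaled to the user-level sensitivity, making the
    update 'almost indistinguishable' with or without any single user."""
    clipped = [g * min(1.0, clip_norm / (g.norm() + 1e-12))
               for g in per_user_grads]
    noisy_sum = torch.stack(clipped).sum(0)
    noisy_sum += torch.randn_like(noisy_sum) * noise_multiplier * clip_norm
    return noisy_sum / len(per_user_grads)
```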
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration [20.05248442344211]
PrivacyRestore is a plug-and-play method to protect the privacy of user inputs during inference. We create three datasets, covering medical and legal domains, to evaluate the effectiveness of PrivacyRestore.
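The summary only names the two phases, so the following is a drastically simplified text-level illustration of "removal and restoration" (the actual method does not necessarily operate on raw strings; all names here are hypothetical):

```python
def remove_private(text, spans):
    """Replace each sensitive span with an indexed placeholder before the
    text leaves the trusted side."""
    mapping = {}
    for i, span in enumerate(spans):
        placeholder = f"[PRIV_{i}]"
        mapping[placeholder] = span
        text = text.replace(span, placeholder)
    return text, mapping

def restore_private(output, mapping):
    """Reinsert the original spans into the model's output locally."""
    for placeholder, span in mapping.items():
        output = output.replace(placeholder, span)
    return output

redacted, m = remove_private("Patient John Doe has hypertension.", ["John Doe"])
# ... send `redacted` to the remote model, then:
final = restore_private("Summary for [PRIV_0]: manage hypertension.", m)
```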
arXiv Detail & Related papers (2024-06-03T14:57:39Z)
- Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models, such as GPT-4 and ChatGPT, reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.