PrivAct: Internalizing Contextual Privacy Preservation via Multi-Agent Preference Training
- URL: http://arxiv.org/abs/2602.13840v1
- Date: Sat, 14 Feb 2026 18:07:51 GMT
- Title: PrivAct: Internalizing Contextual Privacy Preservation via Multi-Agent Preference Training
- Authors: Yuhan Cheng, Hancheng Ye, Hai Helen Li, Jingwei Sun, Yiran Chen
- Abstract summary: PrivAct is a contextual privacy-aware multi-agent learning framework. It internalizes contextual privacy preservation directly into models' generation behavior for privacy-compliant agentic actions. Experiments show consistent improvements in contextual privacy preservation, reducing leakage rates by up to 12.32%.
- Score: 14.144464261335031
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language model (LLM) agents are increasingly deployed in personalized tasks involving sensitive, context-dependent information, where privacy violations may arise in agents' actions because contextual privacy is implicit rather than explicitly stated. Existing approaches rely on external, inference-time interventions, which are brittle, scenario-specific, and may expand the privacy attack surface. We propose PrivAct, a contextual privacy-aware multi-agent learning framework that internalizes contextual privacy preservation directly into models' generation behavior for privacy-compliant agentic actions. By embedding privacy preferences into each agent, PrivAct enhances system-wide contextual integrity while achieving a more favorable privacy-helpfulness tradeoff. Experiments across multiple LLM backbones and benchmarks demonstrate consistent improvements in contextual privacy preservation, reducing leakage rates by up to 12.32% while maintaining comparable helpfulness, as well as zero-shot generalization and robustness across diverse multi-agent topologies. Code is available at https://github.com/chengyh23/PrivAct.
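The abstract does not spell out the training objective, but "multi-agent preference training" suggests a preference-optimization loss applied per agent. A minimal, hypothetical sketch assuming a DPO-style objective over paired agent actions; `policy_logps_safe`/`policy_logps_leaky` are summed token log-probabilities of a privacy-compliant and a privacy-leaking action under the trained policy, and the `ref_*` values come from a frozen reference model (all names are illustrative, not from the paper):

```python
import torch
import torch.nn.functional as F

def privacy_preference_loss(policy_logps_safe, policy_logps_leaky,
                            ref_logps_safe, ref_logps_leaky, beta=0.1):
    """DPO-style preference loss (one plausible reading of the paper):
    widen the margin of the privacy-compliant action over the leaky one,
    relative to a frozen reference model."""
    policy_margin = policy_logps_safe - policy_logps_leaky
    ref_margin = ref_logps_safe - ref_logps_leaky
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Example with dummy per-sample log-probabilities for one agent's batch:
loss = privacy_preference_loss(
    torch.tensor([-12.3, -9.8]), torch.tensor([-11.0, -10.2]),
    torch.tensor([-12.0, -10.0]), torch.tensor([-11.5, -9.9]),
)
```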
Related papers
- Contextualized Privacy Defense for LLM Agents [84.30907378390512]
LLM agents increasingly act on users' personal information, yet existing privacy defenses remain limited in both design and adaptability. We propose Contextualized Defense Instructing (CDI), a new privacy defense paradigm. We show that our CDI consistently achieves a better balance between privacy preservation (94.2%) and helpfulness (80.6%) than baselines.
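The summary leaves the form of the defense instructions open; one simplified, hypothetical reading is that a contextualized instruction is synthesized from the scenario's contextual-integrity parameters (data type, recipient, transmission purpose) and prepended to the agent's prompt:

```python
def contextual_defense_instruction(data_type, recipient, purpose):
    """Hypothetical CDI-style prompt prefix built from contextual-integrity
    parameters; the real method's instruction format is not specified here."""
    return (
        f"You are handling the user's {data_type}. Share it only with "
        f"{recipient} and only insofar as it serves the purpose of {purpose}. "
        "Withhold or abstract any detail that a reasonable person would not "
        "disclose in this context."
    )

prompt = contextual_defense_instruction(
    "medical history", "the scheduling assistant", "booking a follow-up visit"
) + "\n\nUser request: ..."
```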
arXiv Detail & Related papers (2026-03-03T13:35:33Z)
- Privacy Collapse: Benign Fine-Tuning Can Break Contextual Privacy in Language Models [47.866853046761044]
We find that diverse, subtle patterns in training data can degrade contextual privacy. Fine-tuned models lose their ability to reason about contextual privacy norms. Our results reveal a critical gap in current safety evaluations.
arXiv Detail & Related papers (2026-01-21T17:53:06Z)
- 1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning [18.751008976082655]
We introduce a multi-agent framework that decomposes privacy reasoning into specialized subtasks (extraction, classification). We conduct a systematic ablation over information-flow topologies, revealing when and why upstream detection mistakes cascade into downstream leakage.
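Read literally, the decomposition can be sketched as a pipeline where one agent extracts candidate sensitive items and another classifies whether each may flow in the given context. The prompts and the `llm` chat-completion stand-in below are hypothetical:

```python
def check_pipeline(llm, message, context):
    """Sketch of a two-stage privacy-reasoning pipeline (extraction, then
    classification); upstream extraction misses cascade into downstream
    leaks, which is the failure mode a topology ablation would surface."""
    extracted = llm(f"List every piece of personal information in:\n{message}")
    items = [ln.strip("- ").strip() for ln in extracted.splitlines() if ln.strip()]
    to_redact = [
        item for item in items
        if "no" in llm(
            f"In context '{context}', may '{item}' be shared? Answer yes or no."
        ).lower()
    ]
    return to_redact  # items to withhold before the message is sent
```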
arXiv Detail & Related papers (2025-08-11T06:34:09Z)
- MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation [54.410825977390274]
Existing benchmarks to evaluate contextual privacy in LLM agents primarily assess single-turn, low-complexity tasks. We first present a benchmark, MAGPIE, comprising 158 real-life high-stakes scenarios across 15 domains. We then evaluate current state-of-the-art LLMs on their understanding of contextually private data and their ability to collaborate without violating user privacy.
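Benchmarks of this kind are typically scored by a leakage rate: the fraction of scenarios in which the agent's output contains information the scenario marks as contextually private. A simplified, hypothetical scorer; the string matching here stands in for the LLM-based judging such benchmarks usually employ:

```python
def leakage_rate(scenarios, agent):
    """Fraction of scenarios where the agent's response mentions any item
    labeled as contextually private. `scenarios` is assumed to be a list of
    dicts like {"task": str, "private": [str, ...]} (illustrative schema)."""
    leaks = 0
    for s in scenarios:
        response = agent(s["task"])
        if any(item.lower() in response.lower() for item in s["private"]):
            leaks += 1
    return leaks / len(scenarios)
```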
arXiv Detail & Related papers (2025-06-25T18:04:25Z)
- Token-Level Privacy in Large Language Models [7.4143291213663955]
We introduce dchi-stencil, a novel token-level privacy-preserving mechanism that integrates contextual and semantic information. By incorporating both semantic and contextual nuances, dchi-stencil achieves a robust balance between privacy and utility. This work highlights the potential of dchi-stencil to set a new standard for privacy-preserving NLP in modern, high-risk applications.
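The summary does not describe the mechanism itself, but the name points to dχ-privacy (metric differential privacy for text), where a token is replaced by a neighbor sampled with probability decaying exponentially in embedding distance. A sketch of that underlying idea, not of dchi-stencil's specific construction:

```python
import numpy as np

def dchi_replace(token_id, embeddings, epsilon, rng=None):
    """d_chi-privacy exponential mechanism: sample a replacement token with
    probability proportional to exp(-epsilon * distance), so semantically
    close tokens are far more likely. `embeddings` is a (vocab, dim) array."""
    rng = rng or np.random.default_rng()
    dists = np.linalg.norm(embeddings - embeddings[token_id], axis=1)
    logits = -epsilon * dists
    probs = np.exp(logits - logits.max())      # stable softmax over -eps*d
    return rng.choice(len(embeddings), p=probs / probs.sum())
```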
arXiv Detail & Related papers (2025-03-05T16:27:25Z)
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
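User-level DP bounds each user's total contribution rather than each example's. A minimal sketch of one standard realization, per-user gradient clipping with Gaussian noise (function names and the flattened-gradient representation are illustrative):

```python
import torch

def user_level_dp_step(per_user_grads, clip_norm, noise_multiplier):
    """Clip each user's aggregated (flattened) gradient to `clip_norm`, sum,
    and add Gaussian noise scaled to the user-level sensitivity, making the
    update 'almost indistinguishable' with or without any single user."""
    clipped = [g * min(1.0, clip_norm / (g.norm() + 1e-12))
               for g in per_user_grads]
    noisy_sum = torch.stack(clipped).sum(0)
    noisy_sum += torch.randn_like(noisy_sum) * noise_multiplier * clip_norm
    return noisy_sum / len(per_user_grads)
```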
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration [20.05248442344211]
PrivacyRestore is a plug-and-play method to protect the privacy of user inputs during inference. We create three datasets, covering medical and legal domains, to evaluate the effectiveness of PrivacyRestore.
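The summary only names the two phases, so the following is a drastically simplified text-level illustration of "removal and restoration" (the actual method does not necessarily operate on raw strings; all names here are hypothetical):

```python
def remove_private(text, spans):
    """Replace each sensitive span with an indexed placeholder before the
    text leaves the trusted side."""
    mapping = {}
    for i, span in enumerate(spans):
        placeholder = f"[PRIV_{i}]"
        mapping[placeholder] = span
        text = text.replace(span, placeholder)
    return text, mapping

def restore_private(output, mapping):
    """Reinsert the original spans into the model's output locally."""
    for placeholder, span in mapping.items():
        output = output.replace(placeholder, span)
    return output

redacted, m = remove_private("Patient John Doe has hypertension.", ["John Doe"])
# ... send `redacted` to the remote model, then:
final = restore_private("Summary for [PRIV_0]: manage hypertension.", m)
```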
arXiv Detail & Related papers (2024-06-03T14:57:39Z)
- Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models, such as GPT-4 and ChatGPT, reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.