Contextualized Privacy Defense for LLM Agents
- URL: http://arxiv.org/abs/2603.02983v1
- Date: Tue, 03 Mar 2026 13:35:33 GMT
- Title: Contextualized Privacy Defense for LLM Agents
- Authors: Yule Wen, Yanzhe Zhang, Jianxun Lian, Xiaoyuan Yi, Xing Xie, Diyi Yang
- Abstract summary: LLM agents increasingly act on users' personal information, yet existing privacy defenses remain limited in both design and adaptability. We propose Contextualized Defense Instructing (CDI), a new privacy defense paradigm. We show that our CDI consistently achieves a better balance between privacy preservation (94.2%) and helpfulness (80.6%) than baselines.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: LLM agents increasingly act on users' personal information, yet existing privacy defenses remain limited in both design and adaptability. Most prior approaches rely on static or passive defenses, such as prompting and guarding. These paradigms are insufficient for supporting contextual, proactive privacy decisions in multi-step agent execution. We propose Contextualized Defense Instructing (CDI), a new privacy defense paradigm in which an instructor model generates step-specific, context-aware privacy guidance during execution, proactively shaping actions rather than merely constraining or vetoing them. Crucially, CDI is paired with an experience-driven optimization framework that trains the instructor via reinforcement learning (RL), where we convert failure trajectories with privacy violations into learning environments. We formalize baseline defenses and CDI as distinct intervention points in a canonical agent loop, and compare their privacy-helpfulness trade-offs within a unified simulation framework. Results show that CDI consistently achieves a better balance between privacy preservation (94.2%) and helpfulness (80.6%) than baselines, with superior robustness under adversarial conditions and stronger generalization.
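To make the intervention points concrete, here is a minimal sketch of a canonical agent loop with the three paradigms the abstract contrasts: static prompting acts before the loop, a passive guard can only constrain or redact a proposed action, and a CDI-style instructor emits step-specific guidance before each action. All names and signatures are hypothetical illustrations, not the paper's API.

```python
# Minimal sketch of intervention points in a canonical agent loop.
# Everything here is a hypothetical illustration, not the paper's API.
from typing import Callable, List, Tuple

def run_agent(
    act: Callable[[List[str], str], Tuple[str, bool]],  # agent policy
    instruct: Callable[[List[str]], str],               # CDI instructor model
    guard: Callable[[str, List[str]], str],             # passive guard baseline
    execute: Callable[[str], str],                      # tool / environment
    task: str,
    max_steps: int = 10,
) -> str:
    history: List[str] = [task]  # static prompting would live inside `task`
    action = ""
    for _ in range(max_steps):
        # CDI: proactive, context-aware guidance generated *before* the
        # agent acts, conditioned on the whole trajectory so far.
        guidance = instruct(history)
        action, done = act(history, guidance)
        # Passive baseline: a guard can only constrain or redact an
        # already-proposed action.
        action = guard(action, history)
        history += [guidance, action, execute(action)]
        if done:
            break
    return action
```

Under the abstract's training recipe, trajectories from this loop that end in privacy violations would then be converted into RL environments for improving `instruct`.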
Related papers
- PrivAct: Internalizing Contextual Privacy Preservation via Multi-Agent Preference Training
PrivAct is a contextual privacy-aware multi-agent learning framework.
It internalizes contextual privacy preservation directly into models' generation behavior for privacy-compliant agentic actions.
Experiments show consistent improvements in contextual privacy preservation, reducing leakage rates by up to 12.32%.
arXiv Detail & Related papers (2026-02-14T18:07:51Z)
- Who Can See Through You? Adversarial Shielding Against VLM-Based Attribute Inference Attacks
VLM-based attribute inference attacks have emerged as a serious privacy concern, enabling adversaries to infer private attributes from images shared on social media.
We propose a novel protection method that jointly optimizes privacy suppression and utility preservation under a visual consistency constraint.
Our method effectively reduces PAR below 25%, keeps NPAR above 88%, and generalizes well to unseen and paraphrased privacy questions.
arXiv Detail & Related papers (2025-12-20T08:08:50Z)
- Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered Agents
We present PrivacyChecker, a model-agnostic, contextual-integrity-based mitigation approach.
We also introduce PrivacyLens-Live, transforming static benchmarks into dynamic MCP and A2A environments.
Our data and code will be made available at https://aka.ms/privacy_in_action.
arXiv Detail & Related papers (2025-09-22T08:19:06Z)
- The Sum Leaks More Than Its Parts: Compositional Privacy Risks and Mitigations in Multi-Agent Collaboration
As large language models (LLMs) become integral to multi-agent systems, privacy risks emerge that extend beyond memorization, direct inference, or single-turn evaluations.
In particular, seemingly innocuous responses, when composed across interactions, can cumulatively enable adversaries to recover sensitive information.
arXiv Detail & Related papers (2025-09-16T16:57:25Z)
- Context Reasoner: Incentivizing Reasoning Capability for Contextualized Privacy and Safety Compliance via Reinforcement Learning
We formulate safety and privacy issues as contextualized compliance problems following Contextual Integrity (CI) theory.
Under the CI framework, we align our model with critical regulatory standards, including the EU AI Act and HIPAA.
We employ reinforcement learning (RL) with a rule-based reward to incentivize contextual reasoning capabilities while enhancing compliance with safety and privacy norms.
arXiv Detail & Related papers (2025-05-20T16:40:09Z)
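As a rough illustration of the rule-based reward mentioned above, the sketch below scores a response against simple compliance rules; the patterns, weights, and format bonus are assumptions made for illustration, not the paper's actual reward design.

```python
# Illustrative rule-based compliance reward for RL fine-tuning.
# The rules and weights are hypothetical, not the paper's reward.
import re

FORBIDDEN_PATTERNS = [
    r"\b\d{3}-\d{2}-\d{4}\b",        # SSN-like strings
    r"\b[\w.+-]+@[\w-]+\.[\w.]+\b",  # email addresses
]

def compliance_reward(response: str, format_ok: bool) -> float:
    # Reward non-leaking responses, penalize leaks of sensitive patterns.
    leaked = any(re.search(p, response) for p in FORBIDDEN_PATTERNS)
    reward = -1.0 if leaked else 1.0
    # Small shaping term for following the required reasoning format,
    # as is common in rule-based RL reward setups.
    reward += 0.2 if format_ok else -0.2
    return reward
```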
- Real-Time Privacy Risk Measurement with Privacy Tokens for Gradient Leakage
Deep learning models in privacy-sensitive domains have amplified concerns regarding privacy risks.
We propose the concept of privacy tokens, which are derived directly from private gradients during training.
Privacy tokens offer valuable insights into the extent of private information leakage from training data.
We employ Mutual Information (MI) as a robust metric to quantify the relationship between training data and gradients.
arXiv Detail & Related papers (2025-02-05T06:20:20Z)
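The MI idea can be illustrated on synthetic data: estimate the mutual information between a private attribute and a gradient-derived statistic. The synthetic data, the binning, and the use of scikit-learn's mutual_info_score are assumptions for illustration, not the paper's construction.

```python
# Toy MI estimate: how much does a gradient statistic reveal about a
# private attribute? Synthetic data and binning are illustrative only.
import numpy as np
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)
private_attr = rng.integers(0, 2, size=2000)  # sensitive binary label
# Pretend per-example gradient norms correlate with the attribute.
grad_norms = 0.8 * private_attr + rng.normal(0.0, 1.0, size=2000)

# Discretize the continuous statistic into decile bins, then estimate MI.
edges = np.quantile(grad_norms, np.linspace(0, 1, 11)[1:-1])
binned = np.digitize(grad_norms, edges)
mi = mutual_info_score(private_attr, binned)
print(f"estimated MI between attribute and gradient statistic: {mi:.3f} nats")
```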
- Privacy in Fine-tuning Large Language Models: Attacks, Defenses, and Future Directions
Fine-tuning Large Language Models (LLMs) can achieve state-of-the-art performance across various domains.
This paper provides a comprehensive survey of privacy challenges associated with fine-tuning LLMs.
We highlight vulnerabilities to various privacy attacks, including membership inference, data extraction, and backdoor attacks.
arXiv Detail & Related papers (2024-12-21T06:41:29Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
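A minimal sketch of what treating the user as the privacy unit means in a DP-SGD-style recipe: clip each user's total contribution rather than each example's, then add noise to the aggregate. The clip norm and noise scale below are illustrative and not calibrated to any particular (epsilon, delta).

```python
# Sketch of user-level DP aggregation: bound each *user's* influence
# (the privacy unit), then noise the sum. Parameters are illustrative.
import numpy as np

def user_level_dp_mean(per_user_grads, clip_norm=1.0, noise_std=0.5, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    clipped = []
    for g in per_user_grads:  # one summed gradient vector per user
        scale = min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
        clipped.append(g * scale)
    total = np.sum(clipped, axis=0)
    # Gaussian noise scaled to the per-user sensitivity (clip_norm).
    noisy = total + rng.normal(0.0, noise_std * clip_norm, size=total.shape)
    return noisy / len(per_user_grads)

grads = [np.random.default_rng(i).normal(size=4) for i in range(8)]
print(user_level_dp_mean(grads))
```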
- Secure Aggregation is Not Private Against Membership Inference Attacks
We investigate the privacy implications of SecAgg in federated learning.
We show that SecAgg offers weak privacy against membership inference attacks even in a single training round.
Our findings underscore the imperative for additional privacy-enhancing mechanisms, such as noise injection.
arXiv Detail & Related papers (2024-03-26T15:07:58Z)
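A toy pairwise-masking aggregation makes the finding easy to see: the masks cancel in the sum, so the server recovers the exact aggregate, and the protocol by itself adds no noise to protect it. This is a simplified sketch, not the full SecAgg protocol.

```python
# Toy pairwise-masked aggregation: each masked update looks random, yet
# the server still learns the exact sum, which is what membership
# inference attacks can exploit. Simplified; not the real SecAgg protocol.
import numpy as np

rng = np.random.default_rng(0)
n_clients, dim = 4, 3
updates = rng.normal(size=(n_clients, dim))  # clients' true updates

# Each client pair (i < j) shares a mask; i adds it, j subtracts it.
masked = updates.copy()
for i in range(n_clients):
    for j in range(i + 1, n_clients):
        mask = rng.normal(size=dim)
        masked[i] += mask
        masked[j] -= mask

# Masks cancel: the aggregate is exact, with no noise protecting it.
assert np.allclose(masked.sum(axis=0), updates.sum(axis=0))
print("server-side aggregate:", masked.sum(axis=0))
```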
- Private Reinforcement Learning with PAC and Regret Guarantees
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
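As a rough flavor of the private-optimism idea, the sketch below perturbs empirical visit counts before computing a UCB-style exploration bonus; the Laplace mechanism here is a simplification assumed for illustration, not the paper's actual JDP construction.

```python
# Toy 'private optimism': noise the visit counts, then build an
# optimistic exploration bonus from them. Simplified illustration only.
import numpy as np

def noisy_optimistic_bonus(visit_counts, epsilon=1.0, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    noisy = visit_counts + rng.laplace(0.0, 1.0 / epsilon, size=visit_counts.shape)
    noisy = np.maximum(noisy, 1.0)  # keep counts usable after noising
    return 1.0 / np.sqrt(noisy)     # UCB-style bonus shrinks with visits

counts = np.array([[10.0, 2.0], [1.0, 30.0]])  # (state, action) visit counts
print(noisy_optimistic_bonus(counts))
```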