Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered Agents
- URL: http://arxiv.org/abs/2509.17488v1
- Date: Mon, 22 Sep 2025 08:19:06 GMT
- Title: Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered Agents
- Authors: Shouju Wang, Fenglin Yu, Xirui Liu, Xiaoting Qin, Jue Zhang, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan
- Abstract summary: We present PrivacyChecker, a model-agnostic, contextual-integrity-based mitigation approach. We also introduce PrivacyLens-Live, transforming static benchmarks into dynamic MCP and A2A environments. Our data and code will be made available at https://aka.ms/privacy_in_action.
- Score: 40.39717403627143
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing autonomy of LLM agents in handling sensitive communications, accelerated by Model Context Protocol (MCP) and Agent-to-Agent (A2A) frameworks, creates urgent privacy challenges. While recent work reveals significant gaps between LLMs' privacy Q&A performance and their agent behavior, existing benchmarks remain limited to static, simplified scenarios. We present PrivacyChecker, a model-agnostic, contextual-integrity-based mitigation approach that effectively reduces privacy leakage from 36.08% to 7.30% on DeepSeek-R1 and from 33.06% to 8.32% on GPT-4o, all while preserving task helpfulness. We also introduce PrivacyLens-Live, transforming static benchmarks into dynamic MCP and A2A environments that reveal substantially higher privacy risks in practice. Our modular mitigation approach integrates seamlessly into agent protocols through three deployment strategies, providing practical privacy protection for the emerging agentic ecosystem. Our data and code will be made available at https://aka.ms/privacy_in_action.
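The abstract does not spell out PrivacyChecker's internals, but a contextual-integrity check can be pictured as a pre-send hook that extracts information flows (data subject, data type, recipient, transmission purpose) from an outgoing message and asks whether each flow matches contextual norms. The sketch below is a minimal illustration under that reading; every name (InformationFlow, is_flow_appropriate, guarded_send, llm_judge) is hypothetical, not the paper's API.

```python
# Minimal sketch of a contextual-integrity pre-send check. All names here
# are hypothetical illustrations, not PrivacyChecker's actual interface.
from dataclasses import dataclass


@dataclass
class InformationFlow:
    data_subject: str   # whose information is about to be shared
    data_type: str      # e.g. "medical condition", "salary"
    recipient: str      # e.g. "coworker", "insurance agent"
    purpose: str        # the task the disclosure is meant to serve


def is_flow_appropriate(flow: InformationFlow, llm_judge) -> bool:
    """Ask an LLM judge whether the flow matches contextual norms."""
    prompt = (
        f"Under contextual integrity norms, is it appropriate to share "
        f"{flow.data_subject}'s {flow.data_type} with {flow.recipient} "
        f"for the purpose of '{flow.purpose}'? Answer YES or NO."
    )
    return llm_judge(prompt).strip().upper().startswith("YES")


def guarded_send(message: str, flows: list[InformationFlow], llm_judge, send):
    """Deliver the message only if every extracted flow passes the check."""
    if all(is_flow_appropriate(f, llm_judge) for f in flows):
        send(message)
    else:
        send("[Withheld: the draft contained a contextually inappropriate disclosure.]")
```

A hook of this shape could in principle attach at several points in an agent stack, for example wrapping a tool call in an MCP server or filtering outgoing A2A messages, consistent with the abstract's claim that the mitigation is modular and protocol-level; the paper's three concrete deployment strategies are not detailed here.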
Related papers
- Contextualized Privacy Defense for LLM Agents [84.30907378390512]
LLM agents increasingly act on users' personal information, yet existing privacy defenses remain limited in both design and adaptability. We propose Contextualized Defense Instructing (CDI), a new privacy defense paradigm. We show that our CDI consistently achieves a better balance between privacy preservation (94.2%) and helpfulness (80.6%) than baselines.
arXiv Detail & Related papers (2026-03-03T13:35:33Z)
- PrivAct: Internalizing Contextual Privacy Preservation via Multi-Agent Preference Training [14.144464261335031]
PrivAct is a contextual privacy-aware multi-agent learning framework. It internalizes contextual privacy preservation directly into models' generation behavior for privacy-compliant agentic actions. Experiments show consistent improvements in contextual privacy preservation, reducing leakage rates by up to 12.32%.
arXiv Detail & Related papers (2026-02-14T18:07:51Z)
- MAGPIE: A benchmark for Multi-AGent contextual PrIvacy Evaluation [61.92403071137653]
Existing privacy benchmarks focus only on simplistic, single-turn interactions where private information can be trivially omitted without affecting task outcomes. We introduce MAGPIE, a novel benchmark designed to evaluate privacy understanding and preservation in multi-agent collaborative, non-adversarial scenarios. Our evaluation reveals that state-of-the-art agents, including GPT-5 and Gemini 2.5-Pro, exhibit significant privacy leakage.
arXiv Detail & Related papers (2025-10-16T23:12:12Z)
- PrivacyPAD: A Reinforcement Learning Framework for Dynamic Privacy-Aware Delegation [33.37227619820212]
We introduce PrivacyPAD, a novel reinforcement learning framework for dynamic privacy-aware delegation. Our framework trains an agent to dynamically route text chunks, learning a policy that optimally balances the trade-off between privacy leakage and task performance. Our framework achieves a new state-of-the-art on the privacy-utility frontier.
arXiv Detail & Related papers (2025-10-16T19:38:36Z)
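The PrivacyPAD summary gives the mechanism only at a high level: a learned policy routes text chunks so that sensitive content and task quality are traded off. A toy sketch of what such routing and its reward could look like; the hard threshold stands in for the learned policy, and every name (route_chunks, pii_score, the reward weighting) is an illustrative assumption rather than the paper's formulation.

```python
# Toy sketch of privacy-aware routing plus the scalar an RL trainer might
# optimize. Names and the threshold rule are assumptions for illustration.
def route_chunks(chunks, pii_score, local_llm, remote_llm, threshold=0.5):
    """Keep chunks a detector flags as sensitive on a local model;
    send the rest to a stronger remote model."""
    return [(local_llm if pii_score(c) > threshold else remote_llm)(c)
            for c in chunks]


def routing_reward(task_score: float, leaked_items: int, lam: float = 0.5) -> float:
    """Task utility minus a penalty per leaked item; lam sets the trade-off."""
    return task_score - lam * leaked_items
```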
- The Sum Leaks More Than Its Parts: Compositional Privacy Risks and Mitigations in Multi-Agent Collaboration [72.33801123508145]
Large language models (LLMs) are integral to multi-agent systems. Privacy risks emerge that extend beyond memorization, direct inference, or single-turn evaluations. In particular, seemingly innocuous responses, when composed across interactions, can cumulatively enable adversaries to recover sensitive information.
arXiv Detail & Related papers (2025-09-16T16:57:25Z)
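A tiny, self-contained illustration of the compositional risk this paper describes: each leaked attribute is harmless alone, but their conjunction re-identifies one person. The data below is fabricated purely for the demonstration.

```python
# Fabricated example of compositional leakage: three individually innocuous
# facts, revealed across separate turns, jointly identify one person.
population = [
    {"name": "Alice", "dept": "radiology", "city": "Leeds", "hired": 2019},
    {"name": "Bob",   "dept": "radiology", "city": "Leeds", "hired": 2021},
    {"name": "Carol", "dept": "oncology",  "city": "Leeds", "hired": 2019},
]

leaked = {"dept": "radiology", "city": "Leeds", "hired": 2019}  # one fact per turn

matches = [p for p in population if all(p[k] == v for k, v in leaked.items())]
print(matches)  # only Alice satisfies all three "harmless" facts at once
```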
- 1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning [18.751008976082655]
We introduce a multi-agent framework that decomposes privacy reasoning into specialized subtasks (extraction, classification). We conduct a systematic ablation over information-flow topologies, revealing when and why upstream detection mistakes cascade into downstream leakage.
arXiv Detail & Related papers (2025-08-11T06:34:09Z)
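Reading the summary literally, the framework chains an extraction agent into a classification agent, so an upstream miss is invisible downstream. A minimal sketch of such a two-stage check; all prompts and function names are assumed for illustration.

```python
# Sketch of decomposed privacy reasoning: an extraction agent proposes
# candidate sensitive spans, a classification agent judges each one.
# Prompts, stage boundaries, and names are illustrative, not the paper's.
def extract_candidates(text: str, llm) -> list[str]:
    resp = llm("List every span in the text that could be personal or "
               "sensitive information, one per line:\n" + text)
    return [line.strip() for line in resp.splitlines() if line.strip()]


def classify_span(span: str, context: str, llm) -> bool:
    resp = llm(f"In the context '{context}', would sharing '{span}' violate "
               f"the data subject's privacy? Answer YES or NO.")
    return resp.strip().upper().startswith("YES")


def check_message(text: str, context: str, llm) -> list[str]:
    """Spans flagged as leaks. A span the extractor misses is never seen
    by the classifier, so upstream errors cascade downstream."""
    return [s for s in extract_candidates(text, llm)
            if classify_span(s, context, llm)]
```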
- MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation [54.410825977390274]
Existing benchmarks to evaluate contextual privacy in LLM agents primarily assess single-turn, low-complexity tasks. We first present MAGPIE, a benchmark comprising 158 real-life, high-stakes scenarios across 15 domains. We then evaluate the current state-of-the-art LLMs on their understanding of contextually private data and their ability to collaborate without violating user privacy.
arXiv Detail & Related papers (2025-06-25T18:04:25Z)
- PrivacyScalpel: Enhancing LLM Privacy via Interpretable Feature Intervention with Sparse Autoencoders [8.483679748399037]
Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language processing but pose privacy risks by memorizing and leaking Personally Identifiable Information (PII). Existing mitigation strategies, such as differential privacy and neuron-level interventions, often degrade model utility or fail to effectively prevent leakage. We introduce PrivacyScalpel, a novel privacy-preserving framework that leverages interpretability techniques to identify and mitigate PII leakage while maintaining performance.
arXiv Detail & Related papers (2025-03-14T09:31:01Z)
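The summary indicates feature-level intervention via sparse autoencoders. One common form of such an intervention, sketched here on assumed tensor names and without the paper's actual feature-selection procedure, is to encode an activation, zero out latents previously correlated with PII, and decode back.

```python
# Sketch of a sparse-autoencoder feature intervention: encode a hidden
# activation, zero latents previously found to correlate with PII, decode.
# Tensor names and the choice of pii_feature_ids are assumptions.
import torch


def ablate_pii_features(h: torch.Tensor,
                        W_enc: torch.Tensor, b_enc: torch.Tensor,
                        W_dec: torch.Tensor, b_dec: torch.Tensor,
                        pii_feature_ids: list[int]) -> torch.Tensor:
    z = torch.relu(h @ W_enc + b_enc)   # sparse latent code
    z[..., pii_feature_ids] = 0.0       # knock out PII-linked features
    return z @ W_dec + b_dec            # reconstructed, PII-ablated activation
```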
- AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents [75.85554113398626]
We introduce a new benchmark, AgentDAM, that measures whether AI web-navigation agents follow the privacy principle of "data minimization". Our benchmark simulates realistic web interaction scenarios end-to-end and is adaptable to all existing web navigation agents.
arXiv Detail & Related papers (2025-03-12T19:30:31Z)
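Data minimization here means the agent should disclose only the attributes a task actually requires. A toy check in that spirit, with made-up field names and no claim to match AgentDAM's actual scoring:

```python
# Toy data-minimization check: of the attributes available to the agent,
# only those the task requires should appear in what it submits.
# All field names and values below are made up for the example.
def minimization_violations(user_data: dict, required: set[str],
                            submitted_text: str) -> list[str]:
    """Attributes disclosed even though the task never needed them."""
    return [field for field, value in user_data.items()
            if field not in required and str(value) in submitted_text]


user = {"name": "Jane Doe", "email": "jane@example.com", "ssn": "123-45-6789"}
form = "Name: Jane Doe, Email: jane@example.com, SSN: 123-45-6789"
print(minimization_violations(user, {"name", "email"}, form))  # ['ssn']
```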
- Position: On-Premises LLM Deployment Demands a Middle Path: Preserving Privacy Without Sacrificing Model Confidentiality [18.575663556525864]
We argue that deploying closed-source LLMs within user-controlled infrastructure enhances data privacy and mitigates misuse risks. A well-designed on-premises deployment must ensure model confidentiality by preventing model theft, and must offer privacy-preserving customization. Our findings demonstrate that privacy and confidentiality can coexist, paving the way for secure on-premises AI deployment.
arXiv Detail & Related papers (2024-10-15T02:00:36Z)
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
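The pipeline the summary describes (a privacy-sensitive seed, expanded into a vignette, expanded again into an agent trajectory) suggests a three-level data structure. A sketch under assumed field names; PrivacyLens's real schema may differ.

```python
# Sketch of the seed -> vignette -> trajectory expansion described in the
# abstract. Field names are assumptions, not PrivacyLens's actual schema.
from dataclasses import dataclass, field


@dataclass
class Seed:                      # a privacy norm in compact form
    data_type: str               # e.g. "mental health status"
    data_subject: str            # e.g. "a colleague"
    transmission_principle: str  # e.g. "not shared with the team lead"


@dataclass
class Vignette:                  # the seed fleshed out into a concrete story
    seed: Seed
    narrative: str               # expressive scenario text generated by an LM


@dataclass
class Trajectory:                # the vignette turned into an agent episode
    vignette: Vignette
    user_instruction: str        # the task the agent is asked to perform
    tool_observations: list[str] = field(default_factory=list)
    final_action: str = ""       # judged for leakage of seed.data_type
```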