Controlling What You Share: Assessing Language Model Adherence to Privacy Preferences
- URL: http://arxiv.org/abs/2507.05391v1
- Date: Mon, 07 Jul 2025 18:22:55 GMT
- Title: Controlling What You Share: Assessing Language Model Adherence to Privacy Preferences
- Authors: Guillem Ramírez, Alexandra Birch, Ivan Titov
- Abstract summary: We explore how users can stay in control of their data by using privacy profiles. We build a framework where a local model uses these instructions to rewrite queries. To support this research, we introduce a multilingual dataset of real user queries annotated to mark private content.
- Score: 80.63946798650653
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large language models (LLMs) are primarily accessed via commercial APIs, but this often requires users to expose their data to service providers. In this paper, we explore how users can stay in control of their data by using privacy profiles: simple natural language instructions that say what should and should not be revealed. We build a framework where a local model uses these instructions to rewrite queries, only hiding details deemed sensitive by the user, before sending them to an external model, thus balancing privacy with performance. To support this research, we introduce PEEP, a multilingual dataset of real user queries annotated to mark private content and paired with synthetic privacy profiles. Our experiments with lightweight LLMs show they can follow these instructions to some extent, but also face consistent challenges, highlighting the need for models that better understand and comply with user-defined privacy preferences.
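The framework described in the abstract can be pictured as a two-stage pipeline: a trusted local model rewrites the query according to the user's privacy profile, and only the sanitized version is sent to the external model. The sketch below is a minimal illustration of that idea, not the paper's implementation; the names (PrivacyProfile, rewrite_query, answer_privately) and the stub models are assumptions made for the example.

```python
# Minimal sketch of the query-rewriting pipeline described in the abstract.
# All names here are hypothetical illustrations, not the paper's actual code.

from dataclasses import dataclass


@dataclass
class PrivacyProfile:
    """Natural-language instructions stating what should and should not be revealed."""
    instructions: str


def rewrite_query(local_model, profile: PrivacyProfile, query: str) -> str:
    """Ask the local (trusted) model to hide details the profile marks as sensitive."""
    prompt = (
        "Rewrite the user query so it can be sent to an external service.\n"
        f"Privacy profile: {profile.instructions}\n"
        f"Query: {query}\n"
        "Keep the query useful, but hide anything the profile marks as private."
    )
    return local_model(prompt)


def answer_privately(local_model, external_model, profile: PrivacyProfile, query: str) -> str:
    """Two-stage pipeline: local redaction first, then the external (untrusted) model answers."""
    sanitized = rewrite_query(local_model, profile, query)
    return external_model(sanitized)


if __name__ == "__main__":
    # Stub models so the sketch runs without any API access.
    local_model = lambda prompt: (
        prompt.split("Query: ")[1].split("\n")[0].replace("Acme Corp", "[employer]")
    )
    external_model = lambda query: f"(external answer to: {query})"

    profile = PrivacyProfile("Do not reveal my employer's name.")
    print(answer_privately(local_model, external_model, profile,
                           "Draft an email to my manager at Acme Corp about my raise."))
```

In this toy run, the employer's name is replaced before the query ever leaves the local side, mirroring the abstract's goal of hiding only the details the user deems sensitive while keeping the query useful for the external model.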
Related papers
- Automated Privacy Information Annotation in Large Language Model Interactions [40.87806981624453]
Users interacting with large language models (LLMs) under their real identifiers often unknowingly risk disclosing private information. Existing privacy detection methods were designed for different objectives and application scenarios. We construct a large-scale multilingual dataset with 249K user queries and 154K annotated privacy phrases.
arXiv Detail & Related papers (2025-05-27T09:00:12Z) - Entailment-Driven Privacy Policy Classification with LLMs [3.564208334473993]
We propose a framework to classify paragraphs of privacy policies into meaningful labels that are easily understood by users.
Our framework improves the F1 score on average by 11.2%.
arXiv Detail & Related papers (2024-09-25T05:07:05Z) - PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z) - Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z) - DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer [57.04801796205638]
Large Language Models (LLMs) have emerged as dominant tools for various tasks.
However, concerns surrounding data privacy present obstacles due to the tuned prompts' dependency on sensitive private information.
We present Differentially-Private Offsite Prompt Tuning (DP-OPT) to address this challenge.
arXiv Detail & Related papers (2023-11-27T02:01:10Z) - PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models [42.20437015301152]
We present PrivLM-Bench, a benchmark for evaluating the privacy leakage of language models (LMs).
Instead of only reporting DP parameters, PrivLM-Bench sheds light on the neglected inference data privacy during actual usage.
We conduct extensive experiments on three datasets of GLUE for mainstream LMs.
arXiv Detail & Related papers (2023-11-07T14:55:52Z) - Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models reveal private information in contexts that humans would not, doing so in 39% and 57% of cases.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z) - PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z) - Selective Differential Privacy for Language Modeling [36.64464956102432]
Previous work has attempted to tackle this challenge by training RNN-based language models with differential privacy guarantees.
We propose a new privacy notion, selective differential privacy, to provide rigorous privacy guarantees on the sensitive portion of the data.
Experiments on both language modeling and dialog system building show that the proposed privacy-preserving mechanism achieves better utility.
arXiv Detail & Related papers (2021-08-30T01:11:10Z)