Related papers: Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

URL: http://arxiv.org/abs/2310.17884v2
Date: Fri, 28 Jun 2024 23:27:10 GMT
Title: Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
Authors: Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, Yejin Choi,
Abstract summary: We show that even the most capable AI models reveal private information in contexts that humans would not, 39% and 57% of the time, respectively. Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
Score: 82.7042006247124
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The interactive use of large language models (LLMs) in AI assistants (at work, home, etc.) introduces a new set of inference-time privacy risks: LLMs are fed different types of information from multiple sources in their inputs and are expected to reason about what to share in their outputs, for what purpose and with whom, within a given context. In this work, we draw attention to the highly critical yet overlooked notion of contextual privacy by proposing ConfAIde, a benchmark designed to identify critical weaknesses in the privacy reasoning capabilities of instruction-tuned LLMs. Our experiments show that even the most capable models such as GPT-4 and ChatGPT reveal private information in contexts that humans would not, 39% and 57% of the time, respectively. This leakage persists even when we employ privacy-inducing prompts or chain-of-thought reasoning. Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.

Related papers

VoxPrivacy: A Benchmark for Evaluating Interactional Privacy of Speech Language Models [25.266028200777317]
Speech Language Models (SLMs) are expected to distinguish between users to manage information flow appropriately.<n>Current SLM benchmarks test dialogue ability but overlook speaker identity.<n>We introduce VoxPrivacy, the first benchmark designed to evaluate interactional privacy in SLMs.
arXiv Detail & Related papers (2026-01-27T06:22:14Z)
The Double-edged Sword of LLM-based Data Reconstruction: Understanding and Mitigating Contextual Vulnerability in Word-level Differential Privacy Text Sanitization [53.51921540246166]
We show that Language Large Models (LLMs) can exploit the contextual vulnerability of DP-sanitized texts.<n>Experiments uncover a double-edged sword effect of LLM reconstructions on privacy and utility.<n>We propose recommendations for using data reconstruction as a post-processing step.
arXiv Detail & Related papers (2025-08-26T12:22:45Z)
MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation [54.410825977390274]
Existing benchmarks to evaluate contextual privacy in LLM-agents primarily assess single-turn, low-complexity tasks.<n>We first present a benchmark - MAGPIE comprising 158 real-life high-stakes scenarios across 15 domains.<n>We then evaluate the current state-of-the-art LLMs on their understanding of contextually private data and their ability to collaborate without violating user privacy.
arXiv Detail & Related papers (2025-06-25T18:04:25Z)
PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance [44.287734754038254]
We present PrivaCI-Bench, a contextual privacy evaluation benchmark for generative large language models (LLMs) We evaluate the latest LLMs, including the recent reasoner models QwQ-32B and Deepseek R1. Our experimental results suggest that though LLMs can effectively capture key CI parameters inside a given context, they still require further advancements for privacy compliance.
arXiv Detail & Related papers (2025-02-24T10:49:34Z)
Multi-P$^2$A: A Multi-perspective Benchmark on Privacy Assessment for Large Vision-Language Models [65.2761254581209]
We evaluate the privacy preservation capabilities of 21 open-source and 2 closed-source Large Vision-Language Models (LVLMs) Based on Multi-P$2$A, we evaluate the privacy preservation capabilities of 21 open-source and 2 closed-source LVLMs. Our results reveal that current LVLMs generally pose a high risk of facilitating privacy breaches.
arXiv Detail & Related papers (2024-12-27T07:33:39Z)
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit. We study user-level DP motivated by applications where it necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
No Free Lunch Theorem for Privacy-Preserving LLM Inference [30.554456047738295]
This study develops a framework for inferring privacy-protected Large Language Models (LLMs) It lays down a solid theoretical basis for examining the interplay between privacy preservation and utility.
arXiv Detail & Related papers (2024-05-31T08:22:53Z)
RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge [69.79676144482792]
This study aims to evaluate the ability of LLMs to distinguish reliable information from external knowledge. Our benchmark consists of two tasks, Question Answering and Text Generation, and for each task, we provide models with a context containing counterfactual information.
arXiv Detail & Related papers (2023-11-14T13:24:19Z)
PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models [42.20437015301152]
We present PrivLM-Bench, a benchmark for evaluating the privacy leakage of language models (LMs) Instead of only reporting DP parameters, PrivLM-Bench sheds light on the neglected inference data privacy during actual usage. We conduct extensive experiments on three datasets of GLUE for mainstream LMs.
arXiv Detail & Related papers (2023-11-07T14:55:52Z)
Privacy in Large Language Models: Attacks, Defenses and Future Directions [84.73301039987128]
We analyze the current privacy attacks targeting large language models (LLMs) and categorize them according to the adversary's assumed capabilities. We present a detailed overview of prominent defense strategies that have been developed to counter these privacy attacks.
arXiv Detail & Related papers (2023-10-16T13:23:54Z)
Beyond Memorization: Violating Privacy Via Inference with Large Language Models [2.9373912230684565]
We present the first comprehensive study on the capabilities of pretrained language models to infer personal attributes from text. Our findings highlight that current LLMs can infer personal data at a previously unattainable scale.
arXiv Detail & Related papers (2023-10-11T08:32:46Z)
PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind) Our work offers a theoretical analysis for model design and benchmarks various techniques. In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
On Privacy and Confidentiality of Communications in Organizational Graphs [3.5270468102327004]
This work shows how confidentiality is distinct from privacy in an enterprise context. It aims to formulate an approach to preserving confidentiality while leveraging principles from differential privacy.
arXiv Detail & Related papers (2021-05-27T19:45:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.