User Perceptions of Privacy and Helpfulness in LLM Responses to Privacy-Sensitive Scenarios
- URL: http://arxiv.org/abs/2510.20721v1
- Date: Thu, 23 Oct 2025 16:38:26 GMT
- Title: User Perceptions of Privacy and Helpfulness in LLM Responses to Privacy-Sensitive Scenarios
- Authors: Xiaoyuan Wu, Roshni Kaushik, Wenkai Li, Lujo Bauer, Koichi Onoue
- Abstract summary: We show how users perceive the privacy-preservation quality and helpfulness of large language models' (LLMs') responses to privacy-sensitive scenarios. Our results suggest the need for user-centered studies that measure LLMs' ability to help users while preserving privacy.
- Score: 10.12906605142667
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) have seen rapid adoption for tasks such as drafting emails, summarizing meetings, and answering health questions. In such uses, users may need to share private information (e.g., health records, contact details). To evaluate LLMs' ability to identify and redact such private information, prior work developed benchmarks (e.g., ConfAIde, PrivacyLens) with real-life scenarios. Using these benchmarks, researchers have found that LLMs sometimes fail to keep secrets private when responding to complex tasks (e.g., leaking employee salaries in meeting summaries). However, these evaluations rely on LLMs (proxy LLMs) to gauge compliance with privacy norms, overlooking real users' perceptions. Moreover, prior work primarily focused on the privacy-preservation quality of responses, without investigating nuanced differences in helpfulness. To understand how users perceive the privacy-preservation quality and helpfulness of LLM responses to privacy-sensitive scenarios, we conducted a user study with 94 participants using 90 scenarios from PrivacyLens. We found that, when evaluating identical responses to the same scenario, users showed low agreement with each other on the privacy-preservation quality and helpfulness of the LLM response. Further, we found high agreement among five proxy LLMs, while each individual LLM had low correlation with users' evaluations. These results indicate that the privacy and helpfulness of LLM responses are often specific to individuals, and that proxy LLMs are poor estimators of how real users would perceive these responses in privacy-sensitive scenarios. Our results suggest the need for user-centered studies that measure LLMs' ability to help users while preserving privacy. Additionally, future research could investigate ways to improve the alignment between proxy LLMs and users, to better estimate users' perceived privacy and utility.
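To make the headline measurements concrete, the sketch below shows one common way to compute inter-rater agreement among users and the correlation between a proxy LLM's ratings and users' ratings. It is an illustration only: the abstract does not specify the analysis code, so the data shapes, the 1-5 rating scale, and the choice of mean pairwise Spearman correlation here are assumptions.

```python
# Illustrative sketch (not the paper's actual analysis): estimating
# inter-rater agreement among users and the correlation between a
# proxy LLM's ratings and the users' ratings. All names, shapes, and
# the 1-5 Likert scale are hypothetical.
from itertools import combinations

import numpy as np
from scipy.stats import spearmanr


def mean_pairwise_spearman(ratings: np.ndarray) -> float:
    """Average Spearman rho over all pairs of raters.

    ratings has shape (n_raters, n_items); a low value means raters
    disagree on how to rank the same set of LLM responses.
    """
    rhos = []
    for i, j in combinations(range(ratings.shape[0]), 2):
        rho, _ = spearmanr(ratings[i], ratings[j])
        rhos.append(rho)
    return float(np.mean(rhos))


# Hypothetical data: 5 users and 1 proxy LLM each rate the privacy-
# preservation quality of 10 responses on a 1-5 scale.
rng = np.random.default_rng(0)
user_ratings = rng.integers(1, 6, size=(5, 10))
proxy_ratings = rng.integers(1, 6, size=10)

print("user-user agreement:", mean_pairwise_spearman(user_ratings))
rho, _ = spearmanr(proxy_ratings, user_ratings.mean(axis=0))
print("proxy-vs-user correlation:", rho)
```

Read through an analysis of this kind, the paper's findings amount to: mean pairwise agreement among users is low, agreement among the five proxy LLMs is high, and each individual proxy's correlation with users' ratings is low.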
Related papers
- When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing [61.80513991207956]
This work focuses on the challenge of restoring surrogate-driven protected data in diverse MLLM scenarios. We first bridge this research gap by contributing the SPPE (Surrogate Privacy Protected Editable) dataset. We then introduce a unified approach that reliably reconstructs private content while preserving the fidelity of MLLM-generated edits.
arXiv Detail & Related papers (2025-12-08T04:59:03Z) - LLM-as-a-Judge for Privacy Evaluation? Exploring the Alignment of Human and LLM Perceptions of Privacy in Textual Data [47.76073133338117]
Despite advances in the field of privacy-preserving Natural Language Processing (NLP), the accurate evaluation of privacy remains a challenge. We present a global approach to privacy evaluation in textual data, inspired by a prior model. Our findings pave the way for exploring the feasibility of LLMs as privacy evaluators.
arXiv Detail & Related papers (2025-08-16T20:49:41Z) - MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation [54.410825977390274]
Existing benchmarks for evaluating contextual privacy in LLM agents primarily assess single-turn, low-complexity tasks. We first present MAGPIE, a benchmark comprising 158 real-life, high-stakes scenarios across 15 domains. We then evaluate current state-of-the-art LLMs on their understanding of contextually private data and their ability to collaborate without violating user privacy.
arXiv Detail & Related papers (2025-06-25T18:04:25Z) - Automated Privacy Information Annotation in Large Language Model Interactions [40.87806981624453]
Users interacting with large language models (LLMs) under their real identifiers often unknowingly risk disclosing private information. Existing privacy detection methods were designed for different objectives and application domains. We construct a large-scale multilingual dataset with 249K user queries and 154K annotated privacy phrases.
arXiv Detail & Related papers (2025-05-27T09:00:12Z) - PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance [44.287734754038254]
We present PrivaCI-Bench, a contextual privacy evaluation benchmark for generative large language models (LLMs). We evaluate the latest LLMs, including the recent reasoning models QwQ-32B and DeepSeek R1. Our experimental results suggest that, although LLMs can effectively capture key contextual integrity (CI) parameters inside a given context, they still require further advancements for privacy compliance (a minimal CI check is sketched after this list).
arXiv Detail & Related papers (2025-02-24T10:49:34Z) - Differentially Private Steering for Large Language Model Alignment [55.30573701583768]
We present the first study of aligning large language models with private datasets. We propose the Private Steering for LLM Alignment (PSA) algorithm, which edits activations with differential privacy guarantees. Our results show that PSA achieves DP guarantees for LLM alignment with minimal loss in performance (the generic DP mechanism is sketched after this list).
arXiv Detail & Related papers (2025-01-30T17:58:36Z) - Investigating Privacy Bias in Training Data of Language Models [1.3167450470598043]
A privacy bias refers to a skew in the appropriateness of information flows within a given context. This skew may either align with existing expectations or signal a symptom of systemic issues. We present a novel approach to assessing privacy biases using a contextual-integrity-based methodology.
arXiv Detail & Related papers (2024-09-05T17:50:31Z) - LLM-PBE: Assessing Data Privacy in Large Language Models [111.58198436835036]
Large Language Models (LLMs) have become integral to numerous domains, significantly advancing applications in data management, mining, and analysis.
Despite the critical nature of this issue, no existing literature offers a comprehensive assessment of data privacy risks in LLMs.
Our paper introduces LLM-PBE, a toolkit crafted specifically for the systematic evaluation of data privacy risks in LLMs.
arXiv Detail & Related papers (2024-08-23T01:37:29Z) - Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models (such as GPT-4 and ChatGPT) reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z) - Beyond Memorization: Violating Privacy Via Inference with Large Language Models [2.9373912230684565]
We present the first comprehensive study on the capabilities of pretrained language models to infer personal attributes from text.
Our findings highlight that current LLMs can infer personal data at a previously unattainable scale.
arXiv Detail & Related papers (2023-10-11T08:32:46Z)
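Several of the papers above (ConfAIde, PrivaCI-Bench, and the privacy-bias study) build on contextual integrity (CI), which treats privacy as the appropriateness of information flows described by a small set of parameters. The sketch below illustrates the core idea; the dataclass fields and example norms are hypothetical and do not reproduce any benchmark's actual schema.

```python
# Illustrative sketch of contextual integrity (CI): a flow is judged
# appropriate only if it matches an accepted norm over the five CI
# parameters. Fields and norms here are assumptions for illustration.
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    sender: str      # who transmits the information
    recipient: str   # who receives it
    subject: str     # whom the information is about
    info_type: str   # e.g. "salary", "diagnosis"
    principle: str   # transmission principle, e.g. "with consent"


# Hypothetical norms: the flows considered appropriate in this context.
NORMS = {
    Flow("manager", "hr", "employee", "salary", "confidentially"),
    Flow("doctor", "patient", "patient", "diagnosis", "with consent"),
}


def is_appropriate(flow: Flow) -> bool:
    """A flow preserves privacy iff it matches an accepted norm."""
    return flow in NORMS


# An assistant broadcasting a salary in a meeting summary violates CI:
leak = Flow("manager", "all_staff", "employee", "salary", "broadcast")
print(is_appropriate(leak))  # False -> this detail should be redacted
```

Real CI benchmarks derive norms from human annotations rather than a hard-coded set, which is exactly where the human-versus-LLM alignment questions raised above come into play.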
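The PSA entry above describes editing model activations under differential privacy. The sketch below shows the generic Gaussian-mechanism recipe such a method could build on (clip per-example contributions, average, add calibrated noise). It is not the paper's PSA algorithm; the function name is hypothetical and the sensitivity and noise calibration are the textbook ones.

```python
# Illustrative sketch of a differentially private steering vector:
# clip each example's activation-difference contribution, average,
# and add Gaussian noise calibrated to the clipped sensitivity.
# Generic Gaussian mechanism, NOT the paper's PSA algorithm.
import numpy as np


def dp_steering_vector(diffs: np.ndarray, clip: float,
                       eps: float, delta: float,
                       rng: np.random.Generator) -> np.ndarray:
    """diffs: (n_examples, d) activation differences from private data."""
    n, d = diffs.shape
    # Clip each row to L2 norm <= clip, bounding one example's influence.
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    clipped = diffs * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    mean = clipped.mean(axis=0)
    # L2 sensitivity of the mean is 2*clip/n (replace-one adjacency).
    sensitivity = 2.0 * clip / n
    # Standard (eps, delta)-DP Gaussian-mechanism calibration.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return mean + rng.normal(0.0, sigma, size=d)


rng = np.random.default_rng(0)
private_diffs = rng.normal(size=(256, 64))  # hypothetical activations
v = dp_steering_vector(private_diffs, clip=1.0, eps=1.0, delta=1e-5, rng=rng)
print(v[:4])
```

The key privacy property is that any single private example changes the clipped mean by at most 2*clip/n, so the calibrated noise masks each individual's contribution to the steering vector.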