Rescriber: Smaller-LLM-Powered User-Led Data Minimization for LLM-Based Chatbots
- URL: http://arxiv.org/abs/2410.11876v3
- Date: Tue, 11 Feb 2025 19:56:20 GMT
- Title: Rescriber: Smaller-LLM-Powered User-Led Data Minimization for LLM-Based Chatbots
- Authors: Jijie Zhou, Eryue Xu, Yaoyao Wu, Tianshi Li
- Abstract summary: Rescriber is a browser extension that supports user-led data minimization in LLM-based conversational agents.
Our studies showed that Rescriber helped users reduce unnecessary disclosure and addressed their privacy concerns.
Our findings confirm the viability of smaller-LLM-powered, user-facing, on-device privacy controls.
- Score: 2.2447085410328103
- Abstract: The proliferation of LLM-based conversational agents has resulted in excessive disclosure of identifiable or sensitive information. However, existing technologies fail to offer perceptible control or account for users' personal preferences about privacy-utility tradeoffs due to the lack of user involvement. To bridge this gap, we designed, built, and evaluated Rescriber, a browser extension that supports user-led data minimization in LLM-based conversational agents by helping users detect and sanitize personal information in their prompts. Our studies (N=12) showed that Rescriber helped users reduce unnecessary disclosure and addressed their privacy concerns. Users' subjective perceptions of the system powered by Llama3-8B were on par with those of the system powered by GPT-4o. The comprehensiveness and consistency of the detection and sanitization emerge as essential factors that affect users' trust and perceived protection. Our findings confirm the viability of smaller-LLM-powered, user-facing, on-device privacy controls, presenting a promising approach to address the privacy and trust challenges of AI.
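Concretely, the detect-then-sanitize loop such a tool performs can be sketched as below. The paper does not specify a serving stack, so the local Ollama endpoint, the `llama3:8b` model tag, and the prompt wording are illustrative assumptions, not Rescriber's actual implementation.

```python
# Minimal sketch of a user-led detect-and-sanitize loop with a small local
# LLM. The Ollama endpoint and prompt format below are assumptions.
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed local server

DETECT_PROMPT = (
    "List every piece of personal or sensitive information in the text below "
    "as a JSON array of objects with keys 'text' and 'type' (e.g., NAME, "
    "EMAIL, ADDRESS). Output only the JSON array.\n\nText: {user_prompt}"
)

def detect_pii(user_prompt: str) -> list[dict]:
    """Ask the small on-device model to enumerate PII spans in the prompt."""
    resp = requests.post(OLLAMA_URL, json={
        "model": "llama3:8b",
        "prompt": DETECT_PROMPT.format(user_prompt=user_prompt),
        "stream": False,
    })
    resp.raise_for_status()
    try:
        return json.loads(resp.json()["response"])
    except (json.JSONDecodeError, KeyError):
        return []  # fail open: show no suggestions and let the user decide

def sanitize(user_prompt: str, detected: list[dict]) -> str:
    """Replace each user-approved span with a typed placeholder."""
    for i, item in enumerate(detected):
        user_prompt = user_prompt.replace(item["text"], f"[{item['type']}_{i}]")
    return user_prompt

if __name__ == "__main__":
    prompt = "Write a cover letter for Jane Doe, reachable at jane@example.com."
    print(sanitize(prompt, detect_pii(prompt)))
    # e.g. "Write a cover letter for [NAME_0], reachable at [EMAIL_1]."
```

Failing open on a parse error (returning no suggestions rather than blocking the prompt) keeps the user in control, which matches the user-led design the paper argues for.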
Related papers
- PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models [10.050972891318324]
We propose a privacy-preservation pipeline for protecting sensitive information during interactions between users and large language models.
We construct SensitiveQA, the first privacy-focused open-ended question-answering dataset.
Our proposed solution employs a multi-stage strategy aimed at preemptively securing user information while simultaneously preserving the response quality of cloud-based LLMs.
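A minimal sketch of one hide-then-restore stage in the spirit of this multi-stage strategy is given below; the placeholder scheme and the stubbed cloud call are illustrative assumptions, not PRIV-QA's published code.

```python
# Hide sensitive entities before the question leaves the device, query the
# cloud LLM on the sanitized text, then restore entities in the answer.
def hide(question: str, entities: list[str]) -> tuple[str, dict[str, str]]:
    """Replace sensitive entities with placeholders before leaving the device."""
    mapping = {}
    for i, ent in enumerate(entities):
        tag = f"<ENT_{i}>"
        question = question.replace(ent, tag)
        mapping[tag] = ent
    return question, mapping

def restore(answer: str, mapping: dict[str, str]) -> str:
    """Put the original entities back into the cloud model's answer."""
    for tag, ent in mapping.items():
        answer = answer.replace(tag, ent)
    return answer

def query_cloud_llm(sanitized_question: str) -> str:
    """Stand-in for the cloud LLM call (e.g., an OpenAI or vLLM client)."""
    return f"Regarding {sanitized_question}, the answer is..."  # placeholder

sanitized, mapping = hide("Does Alice Smith qualify for plan B?", ["Alice Smith"])
print(restore(query_cloud_llm(sanitized), mapping))
```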
arXiv Detail & Related papers (2025-02-19T09:17:07Z)
- Gensors: Authoring Personalized Visual Sensors with Multimodal Foundation Models and Reasoning [61.17099595835263]
Gensors is a system that empowers users to define customized sensors supported by the reasoning capabilities of MLLMs.
In a user study, participants reported a significantly greater sense of control, understanding, and ease of communication when defining sensors using Gensors.
arXiv Detail & Related papers (2025-01-27T01:47:57Z)
- Preempting Text Sanitization Utility in Resource-Constrained Privacy-Preserving LLM Interactions [4.372695214012181]
We propose an architecture to help estimate the impact of sanitization on a prompt before it is sent to a Large Language Model.
Our evaluation of this architecture revealed a significant problem with text sanitization based on Differential Privacy.
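For context, a widely used form of DP text sanitization, and the kind whose utility cost such an architecture would estimate, is word-level metric DP: perturb each word's embedding with calibrated noise, then snap to the nearest vocabulary word. Below is a toy sketch under that assumption; the 2-D embeddings are illustrative, and real systems use pretrained embedding tables.

```python
# Word-level metric-DP sanitization (in the style of Feyisetan et al.):
# perturb the embedding, then return the nearest vocabulary word.
import numpy as np

VOCAB = {
    "doctor":  np.array([1.0, 0.2]),
    "nurse":   np.array([0.9, 0.3]),
    "teacher": np.array([0.1, 1.0]),
}

def sanitize_word(word: str, eps: float, rng: np.random.Generator) -> str:
    """Add multivariate-Laplace noise to the word's embedding and snap back.
    Higher eps means less noise (weaker privacy, better utility)."""
    vec = VOCAB[word]
    d = vec.shape[0]
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    noisy = vec + rng.gamma(shape=d, scale=1.0 / eps) * direction
    return min(VOCAB, key=lambda w: np.linalg.norm(VOCAB[w] - noisy))

rng = np.random.default_rng(0)
print([sanitize_word("doctor", eps=5.0, rng=rng) for _ in range(5)])
# Low eps scrambles words badly -- exactly the privacy-utility tension whose
# downstream impact the proposed architecture tries to estimate in advance.
```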
arXiv Detail & Related papers (2024-11-18T12:31:22Z)
- Privacy Leakage Overshadowed by Views of AI: A Study on Human Oversight of Privacy in Language Model Agent [1.5020330976600738]
Language model (LM) agents that act on users' behalf for personal tasks can boost productivity, but are also susceptible to unintended privacy leakage risks.
We present the first study on people's capacity to oversee the privacy implications of LM agents.
arXiv Detail & Related papers (2024-11-02T19:15:42Z)
- MisinfoEval: Generative AI in the Era of "Alternative Facts" [50.069577397751175]
We introduce a framework for generating and evaluating large language model (LLM) based misinformation interventions.
We present (1) an experiment with a simulated social media environment to measure effectiveness of misinformation interventions, and (2) a second experiment with personalized explanations tailored to the demographics and beliefs of users.
Our findings confirm that LLM-based interventions are highly effective at correcting user behavior.
arXiv Detail & Related papers (2024-10-13T18:16:50Z)
- Prompt Tuning as User Inherent Profile Inference Machine [53.78398656789463]
We propose UserIP-Tuning, which uses prompt-tuning to infer user profiles.
A profile quantization codebook bridges the modality gap by quantizing profile embeddings into collaborative IDs.
Experiments on four public datasets show that UserIP-Tuning outperforms state-of-the-art recommendation algorithms.
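The quantization step can be illustrated with a short sketch: a codebook maps a continuous profile embedding to the index of its nearest code vector, and that discrete index serves as the collaborative ID. The codebook size and dimensionality below are illustrative assumptions.

```python
# Nearest-neighbor vector quantization of a profile embedding into a
# discrete collaborative ID. Sizes are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 64))  # 256 learned codes, 64-dim embeddings

def quantize(profile_embedding: np.ndarray) -> int:
    """Return the codebook index (collaborative ID) nearest to the embedding."""
    dists = np.linalg.norm(codebook - profile_embedding, axis=1)
    return int(np.argmin(dists))

user_profile = rng.normal(size=64)  # e.g., output of the prompt-tuned encoder
print(quantize(user_profile))       # discrete ID consumable by the recommender
```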
arXiv Detail & Related papers (2024-08-13T02:25:46Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
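A standard user-level DP recipe, clipping each user's entire contribution before adding Gaussian noise (in the spirit of DP-FedAvg, not necessarily this paper's exact algorithm), can be sketched as:

```python
# User-level DP aggregation: bound each user's whole contribution, then add
# Gaussian noise calibrated to that bound. Generic sketch, not the paper's code.
import numpy as np

def private_mean_update(user_updates: list[np.ndarray],
                        clip_norm: float,
                        noise_mult: float,
                        rng: np.random.Generator) -> np.ndarray:
    """Aggregate per-user model updates with a user-level DP guarantee."""
    clipped = []
    for u in user_updates:
        scale = min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12))
        clipped.append(u * scale)  # each user contributes at most clip_norm
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_mult * clip_norm, size=total.shape)
    return (total + noise) / len(user_updates)

rng = np.random.default_rng(0)
updates = [rng.normal(size=10) for _ in range(100)]  # one update per user
print(private_mean_update(updates, clip_norm=1.0, noise_mult=1.1, rng=rng))
```

Because the clip bounds a user's entire contribution rather than a single example's, the guarantee holds uniformly per user, which is exactly the "privacy unit" distinction the paper studies.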
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- Privacy-Preserving End-to-End Spoken Language Understanding [7.501598786895441]
Human speech can carry a great deal of user-sensitive information, such as gender, identity, and sensitive content.
As a result, new types of security and privacy breaches have emerged: users do not want to expose their personal sensitive information to malicious attacks by untrusted third parties.
This paper proposes a novel multi-task privacy-preserving model to prevent both speech recognition (ASR) and identity recognition (IR) attacks.
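The abstract does not detail the architecture. One common way to realize such a multi-task privacy-preserving model is adversarial training with a gradient reversal layer, so the speech encoder keeps features needed for the downstream task while discarding what ASR/IR attackers need; the sketch below illustrates that generic idea rather than the paper's exact model.

```python
# Adversarial multi-task training via a gradient reversal layer (GRL).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()
    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

encoder = nn.Linear(80, 32)        # toy stand-in for a speech encoder
slu_head = nn.Linear(32, 10)       # intent classification (task to keep)
attacker_head = nn.Linear(32, 50)  # identity recognition (task to suppress)

feats = encoder(torch.randn(4, 80))
slu_logits = slu_head(feats)                               # trained normally
adv_logits = attacker_head(GradReverse.apply(feats, 1.0))  # trained adversarially
# Minimizing the attacker's loss on adv_logits now pushes the encoder to
# *remove* speaker-identity information, because its gradients are reversed.
```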
arXiv Detail & Related papers (2024-03-22T03:41:57Z)
- Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models, GPT-4 and ChatGPT, reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)
- "It's a Fair Game", or Is It? Examining How Users Navigate Disclosure Risks and Benefits When Using LLM-Based Conversational Agents [27.480959048351973]
The widespread use of Large Language Model (LLM)-based conversational agents (CAs) raises many privacy concerns.
We analyzed sensitive disclosures in real-world ChatGPT conversations and conducted semi-structured interviews with 19 LLM-based CA users.
We found that users are constantly faced with trade-offs between privacy, utility, and convenience when using LLM-based CAs.
arXiv Detail & Related papers (2023-09-20T21:34:36Z)
- Privacy Explanations - A Means to End-User Trust [64.7066037969487]
We looked into how explainability might help to address end users' privacy concerns and build trust.
We created privacy explanations that aim to clarify to end users why and for what purposes specific data is required.
Our findings reveal that privacy explanations can be an important step towards increasing trust in software systems.
arXiv Detail & Related papers (2022-10-18T09:30:37Z)