The Fire Thief Is Also the Keeper: Balancing Usability and Privacy in Prompts
- URL: http://arxiv.org/abs/2406.14318v1
- Date: Thu, 20 Jun 2024 13:52:25 GMT
- Title: The Fire Thief Is Also the Keeper: Balancing Usability and Privacy in Prompts
- Authors: Zhili Shen, Zihang Xi, Ying He, Wei Tong, Jingyu Hua, Sheng Zhong
- Abstract summary: This paper introduces Prompt Privacy Sanitizer (i.e., ProSan), an end-to-end prompt privacy protection framework.
It produces anonymized prompts with contextual privacy removed while maintaining task usability and human readability.
ProSan is capable of adapting to diverse computational resource conditions, ensuring privacy protection even for mobile devices with limited computing power.
- Score: 7.121210449712282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid adoption of online chatbots represents a significant advancement in artificial intelligence. However, this convenience brings considerable privacy concerns, as prompts can inadvertently contain sensitive information exposed to large language models (LLMs). Limited by high computational costs, reduced task usability, and excessive system modifications, previous works based on local deployment, embedding perturbation, and homomorphic encryption are inapplicable to online prompt-based LLM applications. To address these issues, this paper introduces Prompt Privacy Sanitizer (i.e., ProSan), an end-to-end prompt privacy protection framework that can produce anonymized prompts with contextual privacy removed while maintaining task usability and human readability. It can also be seamlessly integrated into the online LLM service pipeline. To achieve high usability and dynamic anonymity, ProSan flexibly adjusts its protection targets and strength based on the importance of the words and the privacy leakage risk of the prompts. Additionally, ProSan is capable of adapting to diverse computational resource conditions, ensuring privacy protection even for mobile devices with limited computing power. Our experiments demonstrate that ProSan effectively removes private information across various tasks, including question answering, text summarization, and code generation, with minimal reduction in task performance.
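The abstract describes the mechanism only at a high level: words are masked or rewritten according to two signals, their importance to the task and the privacy leakage risk of the prompt. As a rough, hypothetical illustration of that idea (not ProSan's actual method), the Python sketch below masks tokens that score high on a toy risk heuristic and low on a toy importance heuristic; every function name, pattern, and threshold here is an assumption made for illustration.

```python
import re

# Hypothetical sketch of a ProSan-style sanitizer: mask tokens whose estimated
# privacy risk is high and whose estimated task importance is low. The paper
# derives both scores adaptively; the heuristics below are simple stand-ins.

# Toy privacy-risk detectors for common identifiers (illustrative only).
RISK_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def privacy_risk(token: str) -> float:
    """Return a risk score in [0, 1] for a single token (placeholder heuristic)."""
    for pattern in RISK_PATTERNS.values():
        if pattern.search(token):
            return 1.0
    # Capitalized tokens are treated as weakly risky (possible proper names).
    return 0.5 if token[:1].isupper() else 0.0

def word_importance(token: str, task_keywords: set[str]) -> float:
    """Return a task-importance score in [0, 1] (placeholder heuristic)."""
    return 1.0 if token.lower().strip(".,!?") in task_keywords else 0.2

def sanitize_prompt(prompt: str, task_keywords: set[str], risk_threshold: float = 0.5) -> str:
    """Mask tokens that are risky for privacy but unimportant for the task."""
    out = []
    for token in prompt.split():
        risky = privacy_risk(token) >= risk_threshold
        important = word_importance(token, task_keywords) >= 0.5
        out.append("[MASKED]" if risky and not important else token)
    return " ".join(out)

if __name__ == "__main__":
    prompt = "Summarize the complaint that John filed from john.doe@example.com about billing."
    print(sanitize_prompt(prompt, task_keywords={"summarize", "complaint", "billing"}))
```

ProSan itself adjusts both signals and the masking strength dynamically, and scales the computation to the available hardware; the fixed regexes and thresholds above only stand in for those components.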
Related papers
- A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage [77.83757117924995]
We propose a new framework that evaluates re-identification attacks to quantify individual privacy risks upon data release.
Our approach shows that seemingly innocuous auxiliary information can be used to infer sensitive attributes like age or substance use history from sanitized data.
arXiv Detail & Related papers (2025-04-28T01:16:27Z) - Prεεmpt: Sanitizing Sensitive Prompts for LLMs [49.84954577111077]
Prεεmpt is a novel system that implements a prompt sanitizer.
We show that Prεεmpt is a practical method to achieve meaningful privacy guarantees.
arXiv Detail & Related papers (2025-04-07T14:52:40Z) - Preempting Text Sanitization Utility in Resource-Constrained Privacy-Preserving LLM Interactions [4.372695214012181]
We propose an architecture to help estimate the impact of sanitization on a prompt before it is sent to the Large Language Model (LLM).
Our evaluation of this architecture revealed a significant problem with text sanitization based on Differential Privacy.
arXiv Detail & Related papers (2024-11-18T12:31:22Z) - Activity Recognition on Avatar-Anonymized Datasets with Masked Differential Privacy [64.32494202656801]
Privacy-preserving computer vision is an important emerging problem in machine learning and artificial intelligence.
We present an anonymization pipeline that replaces sensitive human subjects in video datasets with synthetic avatars within context.
We also propose MaskDP to protect non-anonymized but privacy-sensitive background information.
arXiv Detail & Related papers (2024-10-22T15:22:53Z) - Confidential Prompting: Protecting User Prompts from Cloud LLM Providers [0.688204255655161]
We tackle the challenge of securing user inputs in cloud-hosted large language model (LLM) serving.
We introduce Secure Partitioned Decoding (SPD), which uses confidential computing to confine user prompts to a trusted execution environment.
We also introduce a novel cryptographic method, Prompt Obfuscation (PO), to ensure robustness against reconstruction attacks on SPD.
arXiv Detail & Related papers (2024-09-27T20:32:42Z) - PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories.
We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds.
State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z) - Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users (the underlying guarantee is stated formally after this list).
arXiv Detail & Related papers (2024-06-20T13:54:32Z) - PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration [18.11846784025521]
PrivacyRestore is a plug-and-play method to protect the privacy of user inputs during inference.
We create three datasets, covering medical and legal domains, to evaluate the effectiveness of PrivacyRestore.
arXiv Detail & Related papers (2024-06-03T14:57:39Z) - No Free Lunch Theorem for Privacy-Preserving LLM Inference [30.554456047738295]
This study develops a framework for privacy-preserving inference with large language models (LLMs).
It lays down a solid theoretical basis for examining the interplay between privacy preservation and utility.
arXiv Detail & Related papers (2024-05-31T08:22:53Z) - Privacy-Preserving End-to-End Spoken Language Understanding [7.501598786895441]
Human speech can contain a lot of user-sensitive information, such as gender, identity, and sensitive content.
New types of security and privacy breaches have emerged. Users do not want to expose their personal sensitive information to malicious attacks by untrusted third parties.
This paper proposes a novel multi-task privacy-preserving model to defend against both automatic speech recognition (ASR) and identity recognition (IR) attacks.
arXiv Detail & Related papers (2024-03-22T03:41:57Z) - Privacy-Preserving Language Model Inference with Instance Obfuscation [33.86459812694288]
Language Models as a Service (LMaaS) offers convenient access for developers and researchers to perform inference using pre-trained language models.
The input data and the inference results containing private information are exposed as plaintext during the service call, leading to privacy issues.
We propose the Instance-Obfuscated Inference (IOI) method, which focuses on addressing the decision privacy issue of natural language understanding tasks.
arXiv Detail & Related papers (2024-02-13T05:36:54Z) - DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer [57.04801796205638]
Large Language Models (LLMs) have emerged as dominant tools for various tasks.
However, concerns surrounding data privacy present obstacles due to the tuned prompts' dependency on sensitive private information.
We present Differentially-Private Offsite Prompt Tuning (DP-OPT) to address this challenge.
arXiv Detail & Related papers (2023-11-27T02:01:10Z) - Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models (GPT-4 and ChatGPT) reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z) - InferDPT: Privacy-Preserving Inference for Black-box Large Language Model [66.07752875835506]
InferDPT is the first practical framework for the privacy-preserving inference of black-box LLMs.
RANTEXT is a novel differential privacy mechanism integrated into the perturbation module of InferDPT.
arXiv Detail & Related papers (2023-10-18T18:00:11Z) - Privacy in Large Language Models: Attacks, Defenses and Future Directions [84.73301039987128]
We analyze the current privacy attacks targeting large language models (LLMs) and categorize them according to the adversary's assumed capabilities.
We present a detailed overview of prominent defense strategies that have been developed to counter these privacy attacks.
arXiv Detail & Related papers (2023-10-16T13:23:54Z)
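As a reference point for the 'almost indistinguishable' guarantee mentioned in the Mind the Privacy Unit! entry above, the standard (ε, δ)-differential-privacy condition, stated at the user level where neighboring datasets D and D' differ only in the records contributed by a single user, is the following (a minimal textbook statement, not quoted from that paper):

```latex
% (\varepsilon,\delta)-differential privacy at the user level:
% D and D' differ only in the records contributed by a single user.
\Pr\!\left[\mathcal{M}(D) \in S\right]
  \;\le\; e^{\varepsilon}\,\Pr\!\left[\mathcal{M}(D') \in S\right] + \delta
\quad \text{for every set of outputs } S .
```

Smaller ε and δ mean the mechanism M behaves nearly identically whether or not any single user's data is present, which is the uniform per-user protection that entry refers to.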