Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy
Protection
- URL: http://arxiv.org/abs/2309.03057v1
- Date: Wed, 6 Sep 2023 14:54:11 GMT
- Title: Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy
Protection
- Authors: Yu Chen, Tingxin Li, Huiming Liu, Yang Yu
- Abstract summary: We introduce the HaS framework, where "H(ide)" and "S(eek)" represent its two core processes: hiding private entities for anonymization and seeking private entities for de-anonymization.
To quantitatively assess HaS's privacy protection performance, we propose both black-box and white-box adversarial models.
- Score: 6.201275002179716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Numerous companies have started offering services based on large language
models (LLM), such as ChatGPT, which inevitably raises privacy concerns as
users' prompts are exposed to the model provider. Previous research on secure
reasoning using multi-party computation (MPC) has proven to be impractical for
LLM applications due to its time-consuming and communication-intensive nature.
While lightweight anonymization techniques can protect private information in
prompts through substitution or masking, they fail to recover sensitive data
replaced in the LLM-generated results. In this paper, we expand the application
scenarios of anonymization techniques by training a small local model to
de-anonymize the LLM's returned results with minimal computational overhead. We
introduce the HaS framework, where "H(ide)" and "S(eek)" represent its two core
processes: hiding private entities for anonymization and seeking private
entities for de-anonymization, respectively. To quantitatively assess HaS's
privacy protection performance, we propose both black-box and white-box
adversarial models. Furthermore, we conduct experiments to evaluate HaS's
usability in translation and classification tasks. The experimental findings
demonstrate that the HaS framework achieves an optimal balance between privacy
protection and utility.
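
At a high level, the Hide step substitutes private entities with placeholders before the prompt leaves the user's device, and the Seek step restores those entities in the LLM's returned text locally. The sketch below illustrates this flow under simplifying assumptions: instead of the small local de-anonymization model that HaS actually trains, it uses a stored substitution table and assumes the remote LLM echoes the placeholders verbatim. The function names and placeholder format are illustrative, not the paper's API.

```python
# Minimal sketch of the hide/seek flow described in the abstract above.
# NOTE: not the HaS implementation. In HaS, "Seek" is a small local model
# trained to de-anonymize the LLM output; here we simply keep a local
# substitution table and reverse it, which only works when the LLM
# preserves the placeholders verbatim.

from typing import Dict, List, Tuple

def hide(prompt: str, private_entities: List[str]) -> Tuple[str, Dict[str, str]]:
    """Replace each private entity with a placeholder before the prompt leaves the device."""
    mapping: Dict[str, str] = {}
    anonymized = prompt
    for i, entity in enumerate(private_entities):
        placeholder = f"<ENT_{i}>"
        mapping[placeholder] = entity
        anonymized = anonymized.replace(entity, placeholder)
    return anonymized, mapping

def seek(llm_output: str, mapping: Dict[str, str]) -> str:
    """Restore the original entities in the LLM's returned text (done locally)."""
    restored = llm_output
    for placeholder, entity in mapping.items():
        restored = restored.replace(placeholder, entity)
    return restored

# Usage: only the anonymized prompt is sent to the remote LLM provider.
anon_prompt, table = hide("Translate: Alice met Bob in Paris.", ["Alice", "Bob", "Paris"])
# A hypothetical translation returned by the remote LLM:
fake_llm_output = "<ENT_0> a rencontré <ENT_1> à <ENT_2>."
print(seek(fake_llm_output, table))  # -> "Alice a rencontré Bob à Paris."
```

The design point is that both the substitution table (or, in HaS, the trained Seek model) and the original entities never leave the local side; the remote provider only ever sees anonymized text.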
Related papers
- Robust Utility-Preserving Text Anonymization Based on Large Language Models [80.5266278002083]
Text anonymization is crucial for sharing sensitive data while maintaining privacy.
Existing techniques face the emerging challenge of re-identification attacks enabled by Large Language Models.
This paper proposes a framework composed of three LLM-based components -- a privacy evaluator, a utility evaluator, and an optimization component.
arXiv Detail & Related papers (2024-07-16T14:28:56Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- No Free Lunch Theorem for Privacy-Preserving LLM Inference [30.554456047738295]
This study develops a framework for inference with privacy-protected Large Language Models (LLMs).
It lays down a solid theoretical basis for examining the interplay between privacy preservation and utility.
arXiv Detail & Related papers (2024-05-31T08:22:53Z)
- Privacy-Preserving Language Model Inference with Instance Obfuscation [33.86459812694288]
Language Models as a Service (LMaaS) offers convenient access for developers and researchers to perform inference using pre-trained language models.
The input data and the inference results containing private information are exposed as plaintext during the service call, leading to privacy issues.
We propose the Instance-Obfuscated Inference (IOI) method, which focuses on addressing the decision privacy issue in natural language understanding tasks.
arXiv Detail & Related papers (2024-02-13T05:36:54Z)
- ConfusionPrompt: Practical Private Inference for Online Large Language Models [3.8134804426693094]
State-of-the-art large language models (LLMs) are typically deployed as online services, requiring users to transmit detailed prompts to cloud servers.
We introduce ConfusionPrompt, a novel framework for private LLM inference that protects user privacy by decomposing the original prompt into smaller sub-prompts.
We show that ConfusionPrompt achieves significantly higher utility than local inference methods using open-source models and perturbation-based techniques.
arXiv Detail & Related papers (2023-12-30T01:26:42Z)
- DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer [57.04801796205638]
Large Language Models (LLMs) have emerged as dominant tools for various tasks.
However, concerns surrounding data privacy present obstacles due to the tuned prompts' dependency on sensitive private information.
We present Differentially-Private Offsite Prompt Tuning (DP-OPT) to address this challenge.
arXiv Detail & Related papers (2023-11-27T02:01:10Z)
- Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models, GPT-4 and ChatGPT, reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- Privacy Implications of Retrieval-Based Language Models [26.87950501433784]
We present the first study of privacy risks in retrieval-based LMs, particularly $k$NN-LMs.
We find that $k$NN-LMs are more susceptible to leaking private information from their private datastore than parametric models.
arXiv Detail & Related papers (2023-05-24T08:37:27Z)
- OPOM: Customized Invisible Cloak towards Face Privacy Protection [58.07786010689529]
We investigate face privacy protection from a technology standpoint, based on a new type of customized cloak.
We propose a new method, named one person one mask (OPOM), to generate person-specific (class-wise) universal masks.
The effectiveness of the proposed method is evaluated on both common and celebrity datasets.
arXiv Detail & Related papers (2022-05-24T11:29:37Z)
- When Crowdsensing Meets Federated Learning: Privacy-Preserving Mobile Crowdsensing System [12.087658145293522]
Mobile crowdsensing (MCS) is an emerging sensing data collection pattern with scalability, low deployment cost, and distributed characteristics.
Traditional MCS systems suffer from privacy concerns and fair reward distribution.
In this paper, we propose a privacy-preserving MCS system called CrowdFL.
arXiv Detail & Related papers (2021-02-20T15:34:23Z)