Flocks of Stochastic Parrots: Differentially Private Prompt Learning for
Large Language Models
- URL: http://arxiv.org/abs/2305.15594v1
- Date: Wed, 24 May 2023 22:06:08 GMT
- Title: Flocks of Stochastic Parrots: Differentially Private Prompt Learning for
Large Language Models
- Authors: Haonan Duan, Adam Dziedzic, Nicolas Papernot, Franziska Boenisch
- Abstract summary: We instantiate a simple but highly effective membership inference attack against the data used to prompt large language models.
We show that our prompt-based approach is easily deployed with existing commercial APIs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) are excellent in-context learners. However, the
sensitivity of data contained in prompts raises privacy concerns. Our work
first shows that these concerns are valid: we instantiate a simple but highly
effective membership inference attack against the data used to prompt LLMs. To
address this vulnerability, one could forego prompting and resort to
fine-tuning LLMs with known algorithms for private gradient descent. However,
this comes at the expense of the practicality and efficiency offered by
prompting. Therefore, we propose to privately learn to prompt. We first show
that soft prompts can be obtained privately through gradient descent on
downstream data. However, this is not the case for discrete prompts. Thus, we
orchestrate a noisy vote among an ensemble of LLMs presented with different
prompts, i.e., a flock of stochastic parrots. The vote privately transfers the
flock's knowledge into a single public prompt. We show that LLMs prompted with
our private algorithms closely match the non-private baselines. For example,
using GPT-3 as the base model, we achieve a downstream accuracy of 92.7% on the
SST-2 dataset with $(\epsilon=0.147, \delta=10^{-6})$-differential privacy vs.
95.2% for the non-private baseline. Through our experiments, we also show that
our prompt-based approach is easily deployed with existing commercial APIs.
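The noisy vote described above is, in essence, PATE-style noisy-argmax aggregation. The following minimal Python sketch illustrates that step under simplifying assumptions: `query_teacher` is a hypothetical stand-in for calling an LLM API with the t-th private prompt, and the privacy accounting for the Gaussian noise is omitted.
```python
import numpy as np

def noisy_vote(votes, num_classes, sigma):
    # Histogram the teachers' class votes and perturb it with Gaussian noise;
    # releasing only the noisy argmax is what yields differential privacy.
    hist = np.bincount(votes, minlength=num_classes).astype(float)
    hist += np.random.normal(0.0, sigma, size=num_classes)
    return int(np.argmax(hist))

def label_public_examples(public_examples, query_teacher, num_teachers,
                          num_classes, sigma):
    # query_teacher(t, x) is a hypothetical stand-in for prompting the LLM
    # with the t-th private prompt on input x and parsing out a class label.
    demos = []
    for x in public_examples:
        votes = [query_teacher(t, x) for t in range(num_teachers)]
        demos.append((x, noisy_vote(votes, num_classes, sigma)))
    return demos  # (example, private label) pairs for one public prompt
```
The labeled public examples can then serve as in-context demonstrations, so only a single public prompt ever leaves the private side.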
Related papers
- Pr$\epsilon\epsilon$mpt: Sanitizing Sensitive Prompts for LLMs [49.84954577111077]
Pr$\epsilon\epsilon$mpt is a novel system that implements a prompt sanitizer.
We show that Pr$\epsilon\epsilon$mpt is a practical method to achieve meaningful privacy guarantees.
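The summary above does not describe Pr$\epsilon\epsilon$mpt's actual mechanism; as a purely illustrative sketch of what a client-side prompt sanitizer's interface might look like, here is a toy regex-based redactor (hypothetical patterns; the real system provides formal guarantees):
```python
import re

# Toy redaction patterns; a real sanitizer would use far more robust detection.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s()-]{7,}\d"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def sanitize(prompt: str) -> str:
    # Replace likely-sensitive spans with typed placeholders before the
    # prompt leaves the client.
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

print(sanitize("Reach Jane at jane.doe@example.com or +1 415 555 0101."))
# -> "Reach Jane at [EMAIL] or [PHONE]."
```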
arXiv Detail & Related papers (2025-04-07T14:52:40Z)
- Private Text Generation by Seeding Large Language Model Prompts [13.407214545457778]
We propose Differentially Private Keyphrase Prompt Seeding (DP-KPS), a method that generates a private synthetic text corpus from a sensitive input corpus.
We evaluate DP-KPS on downstream ML text classification tasks, and show that the corpora it generates preserve much of the predictive power of the original ones.
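As a rough sketch of the keyphrase-seeding idea (simplified relative to DP-KPS, with formal privacy accounting omitted): pick high-count keyphrases under a noisy histogram, then seed generation prompts with them.
```python
import numpy as np
from collections import Counter

def noisy_top_keyphrases(docs_keyphrases, k, sigma):
    # Each document contributes at most once per phrase (set()), bounding its
    # influence on any single count; Gaussian noise masks that influence.
    counts = Counter(p for doc in docs_keyphrases for p in set(doc))
    noisy = {p: c + np.random.normal(0.0, sigma) for p, c in counts.items()}
    return sorted(noisy, key=noisy.get, reverse=True)[:k]

def seed_prompt(keyphrases):
    # The seeded prompt drives an LLM to emit one synthetic document.
    return ("Write a short document that naturally mentions: "
            + ", ".join(keyphrases))
```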
arXiv Detail & Related papers (2025-02-18T16:50:38Z)
- Differentially Private Knowledge Distillation via Synthetic Text Generation [5.201318326501886]
We propose DistilDP: a novel differentially private knowledge distillation algorithm.
DistilDP exploits synthetic data generated by a differentially private teacher LLM.
Our experimental results demonstrate that DistilDP can substantially improve the utility over existing baselines.
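The pipeline shape is simple enough to sketch; the function arguments below are hypothetical stand-ins for the paper's components.
```python
from typing import Callable, List

def distilldp_style_pipeline(
    private_corpus: List[str],
    dp_finetune_teacher: Callable[[List[str]], None],  # e.g. DP-SGD fine-tuning
    sample_teacher: Callable[[], str],                 # draw one synthetic doc
    train_student: Callable[[List[str]], None],
    n_synthetic: int,
) -> List[str]:
    # Only the first step touches private data; sampling and student training
    # are post-processing and therefore add no further privacy cost.
    dp_finetune_teacher(private_corpus)
    synthetic = [sample_teacher() for _ in range(n_synthetic)]
    train_student(synthetic)
    return synthetic
```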
arXiv Detail & Related papers (2024-03-01T19:22:24Z)
- ConfusionPrompt: Practical Private Inference for Online Large Language Models [3.8134804426693094]
State-of-the-art large language models (LLMs) are typically deployed as online services, requiring users to transmit detailed prompts to cloud servers.
We introduce ConfusionPrompt, a novel framework for private LLM inference that protects user privacy by decomposing the original prompt into smaller sub-prompts.
We show that ConfusionPrompt achieves significantly higher utility than local inference methods using open-source models and perturbation-based techniques.
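A toy sketch of the decomposition idea follows; `ask` is a hypothetical call to the online LLM, and the real framework's decomposition and local recomposition are considerably more involved.
```python
import random
from typing import Callable, List

def confusion_query(sub_prompts: List[str], decoys: List[str],
                    ask: Callable[[str], str]) -> List[str]:
    # Interleave real sub-prompts with decoys so the server sees a shuffled
    # mix and cannot easily reconstruct the original prompt.
    tagged = [(p, True) for p in sub_prompts] + [(d, False) for d in decoys]
    random.shuffle(tagged)
    answers = [(ask(p), is_real) for p, is_real in tagged]
    # Keep only answers to the real sub-prompts; recomposition happens locally.
    return [a for a, is_real in answers if is_real]
```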
arXiv Detail & Related papers (2023-12-30T01:26:42Z)
- DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer [57.04801796205638]
Large Language Models (LLMs) have emerged as dominant tools for various tasks.
However, concerns surrounding data privacy present obstacles due to the tuned prompts' dependency on sensitive private information.
We present Differentially-Private Offsite Prompt Tuning (DP-OPT) to address this challenge.
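The snippet above does not spell out DP-OPT's algorithm. One generic way to release a discrete prompt privately is exponential-mechanism selection among candidate prompts scored on private data, sketched below (illustrative only, not the DP-OPT method):
```python
import numpy as np
from typing import Callable, List

def dp_select_prompt(candidates: List[str],
                     private_score: Callable[[str], float],
                     sensitivity: float, epsilon: float) -> str:
    # Exponential mechanism: sample a candidate with probability proportional
    # to exp(eps * score / (2 * sensitivity)); higher-scoring prompts are
    # likelier, but no single private record can sway the choice much.
    scores = np.array([private_score(c) for c in candidates])
    logits = epsilon * (scores - scores.max()) / (2.0 * sensitivity)
    probs = np.exp(logits)
    probs /= probs.sum()
    return candidates[np.random.choice(len(candidates), p=probs)]
```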
arXiv Detail & Related papers (2023-11-27T02:01:10Z)
- Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models, GPT-4 and ChatGPT, reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)
- Privacy Implications of Retrieval-Based Language Models [26.87950501433784]
We present the first study of privacy risks in retrieval-based LMs, particularly $k$NN-LMs.
We find that $k$NN-LMs are more susceptible to leaking private information from their private datastore than parametric models.
arXiv Detail & Related papers (2023-05-24T08:37:27Z)
- Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
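A minimal sketch of that loop, with `policy` and `llm` as hypothetical callables and illustrative prompt wording:
```python
from typing import Callable

def directed_query(x: str, policy: Callable[[str], str],
                   llm: Callable[[str], str]) -> str:
    # The small policy model emits a per-instance hint (the "directional
    # stimulus"), which is folded into the black-box LLM's prompt.
    stimulus = policy(x)  # e.g. keywords a summary should cover
    prompt = f"{x}\nHint: {stimulus}\nAnswer:"
    return llm(prompt)
```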
arXiv Detail & Related papers (2023-02-22T17:44:15Z)
- Toward Human Readable Prompt Tuning: Kubrick's The Shining is a good movie, and a good prompt too? [84.91689960190054]
Large language models can perform new tasks in a zero-shot fashion, given natural language prompts.
Which factors make such prompts effective remains underexplored, especially when the prompts are natural language.
arXiv Detail & Related papers (2022-12-20T18:47:13Z)
- Self-Prompting Large Language Models for Zero-Shot Open-Domain QA [67.08732962244301]
Open-Domain Question Answering (ODQA) aims to answer questions without explicitly providing background documents.
This task becomes notably challenging in a zero-shot setting where no data is available to train tailored retrieval-reader models.
We propose a Self-Prompting framework to explicitly utilize the massive knowledge encoded in the parameters of Large Language Models.
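A minimal sketch of the self-prompting loop, with `llm` as a hypothetical completion callable and illustrative prompt wording:
```python
from typing import Callable

def self_prompt_qa(question: str, llm: Callable[[str], str],
                   n_demos: int = 4) -> str:
    # Step 1: the LLM authors its own pseudo demonstrations from parametric
    # knowledge. Step 2: those demonstrations are used in-context to answer.
    demos = [llm("Write a short passage, then a question about it and its "
                 "answer, formatted as Passage/Question/Answer.")
             for _ in range(n_demos)]
    prompt = "\n\n".join(demos) + f"\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)
```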
arXiv Detail & Related papers (2022-12-16T18:23:43Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
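A toy per-example accountant conveys the idea (it ignores subsampling amplification and fixes a single Rényi order, so it is much looser than the paper's analysis): an example whose gradients stay below the clipping threshold incurs less per-step privacy cost.
```python
import numpy as np

def per_example_epsilon(grad_norms, clip, sigma, delta, alpha=8.0):
    # grad_norms: array of shape (num_steps, num_examples) with each example's
    # per-step gradient norm. Per-step sensitivity is min(norm, clip), and the
    # Gaussian mechanism with noise std sigma*clip costs
    # alpha * sens^2 / (2 * (sigma*clip)^2) in Renyi DP at order alpha.
    sens = np.minimum(np.asarray(grad_norms, dtype=float), clip)
    rdp = np.sum(alpha * sens**2 / (2.0 * (sigma * clip) ** 2), axis=0)
    # Standard RDP -> (eps, delta) conversion, per example.
    return rdp + np.log(1.0 / delta) / (alpha - 1.0)
```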
arXiv Detail & Related papers (2022-06-06T13:49:37Z)