Just Fine-tune Twice: Selective Differential Privacy for Large Language Models
- URL: http://arxiv.org/abs/2204.07667v1
- Date: Fri, 15 Apr 2022 22:36:55 GMT
- Title: Just Fine-tune Twice: Selective Differential Privacy for Large Language Models
- Authors: Weiyan Shi, Si Chen, Chiyuan Zhang, Ruoxi Jia, Zhou Yu
- Abstract summary: We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve SDP for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
- Score: 69.66654761324702
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the increasing adoption of NLP models in real-world products, it becomes
more and more important to protect these models from privacy leakage. Because
private information in language data is sparse, previous research formalized a
Selective-Differential-Privacy (SDP) notion to provide protection for sensitive
tokens detected by policy functions, and proved its effectiveness on RNN-based
models. But the previous mechanism requires separating the private and public
model parameters and thus cannot be applied to large attention-based models. In
this paper, we propose a simple yet effective just-fine-tune-twice privacy
mechanism that first fine-tunes on in-domain redacted data and then on in-domain
private data, to achieve SDP for large Transformer-based language models. We
also design explicit and contextual policy functions to provide protection at
different levels. Experiments show that our models achieve strong performance
while staying robust to the canary insertion attack. We further show that even
under low-resource settings with a small amount of in-domain data, SDP can
still improve the model utility. We will release the code, data, and models to
facilitate future research.
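The mechanism described in the abstract lends itself to a compact illustration. The sketch below, in plain PyTorch, pairs an explicit policy function (regex redaction of emails and phone numbers) with the two fine-tuning phases: ordinary training on the redacted corpus, then a DP-SGD-style pass over the original private corpus with per-example gradient clipping and Gaussian noise. The toy LSTM model, regex patterns, corpus, and noise parameters are illustrative assumptions, and the plain DP-SGD update is a stand-in for the paper's private optimizer rather than its exact procedure.

```python
# Minimal sketch of the "just fine-tune twice" idea, assuming a toy word-level LM:
# phase 1 trains normally on text redacted by an explicit policy function; phase 2
# continues training on the original private text with a simple DP-SGD-style update
# (per-example gradient clipping + Gaussian noise). Everything below is illustrative,
# not the paper's actual setup.
import re
import torch
import torch.nn as nn

MASK = "<MASK>"
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
PHONE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")

def explicit_policy(text: str) -> str:
    """Explicit policy function: replace pattern-matched sensitive tokens with MASK."""
    return PHONE.sub(MASK, EMAIL.sub(MASK, text))

private_corpus = [
    "contact alice at alice@example.com for the report",
    "the number for bob is 555-123-4567 call after noon",
]
redacted_corpus = [explicit_policy(t) for t in private_corpus]

# Tiny word-level vocabulary and next-token language model.
words = sorted({w for t in private_corpus + redacted_corpus for w in t.split()})
vocab = {w: i for i, w in enumerate(words)}
emb = nn.Embedding(len(vocab), 32)
lstm = nn.LSTM(32, 64, batch_first=True)
head = nn.Linear(64, len(vocab))
params = [p for m in (emb, lstm, head) for p in m.parameters()]
loss_fn = nn.CrossEntropyLoss()

def encode(text):
    return torch.tensor([vocab[w] for w in text.split()])

def nll(ids):
    x, y = ids[:-1].unsqueeze(0), ids[1:]
    hidden, _ = lstm(emb(x))
    return loss_fn(head(hidden.squeeze(0)), y)

# Phase 1: ordinary fine-tuning on the redacted corpus.
opt = torch.optim.SGD(params, lr=0.1)
for _ in range(3):
    for text in redacted_corpus:
        opt.zero_grad()
        nll(encode(text)).backward()
        opt.step()

# Phase 2: DP-SGD-style fine-tuning on the private corpus
# (clip each example's gradient, then add calibrated Gaussian noise).
clip_norm, sigma, lr = 1.0, 0.5, 0.05
for _ in range(3):
    for text in private_corpus:
        for p in params:
            p.grad = None
        nll(encode(text)).backward()
        total_norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params))
        scale = min(1.0, clip_norm / (total_norm.item() + 1e-8))
        with torch.no_grad():
            for p in params:
                noisy_grad = p.grad * scale + sigma * clip_norm * torch.randn_like(p.grad)
                p -= lr * noisy_grad
```

A contextual policy function would replace the regexes with a learned detector (for example, an NER tagger that flags names and addresses); the two-phase structure stays the same.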
Related papers
- Private prediction for large-scale synthetic text generation [28.488459921169905]
We present an approach for generating differentially private synthetic text using large language models (LLMs).
In the private prediction framework, we only require the output synthetic data to satisfy differential privacy guarantees.
arXiv Detail & Related papers (2024-07-16T18:28:40Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Membership Inference Attacks and Privacy in Topic Modeling [3.503833571450681]
We propose an attack against topic models that can confidently identify members of the training data.
We propose a framework for private topic modeling that incorporates DP vocabulary selection as a pre-processing step.
arXiv Detail & Related papers (2024-03-07T12:43:42Z)
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- You Are What You Write: Preserving Privacy in the Era of Large Language Models [2.3431670397288005]
We present an empirical investigation into the extent of the personal information encoded into pre-trained representations by a range of popular models.
We show a positive correlation between the complexity of a model, the amount of data used in pre-training, and data leakage.
arXiv Detail & Related papers (2022-04-20T11:12:53Z)
- Defending against Reconstruction Attacks with Rényi Differential Privacy [72.1188520352079]
Reconstruction attacks allow an adversary to regenerate data samples of the training set using access to only a trained model.
Differential privacy is a known solution to such attacks, but is often used with a relatively large privacy budget.
We show that, for the same mechanism, we can derive privacy guarantees against reconstruction attacks that are better than the traditional ones from the literature.
arXiv Detail & Related papers (2022-02-15T18:09:30Z)
- Selective Differential Privacy for Language Modeling [36.64464956102432]
Previous work has attempted to tackle this challenge by training RNN-based language models with differential privacy guarantees.
We propose a new privacy notion, selective differential privacy, to provide rigorous privacy guarantees on the sensitive portion of the data.
Experiments on both language modeling and dialog system building show that the proposed privacy-preserving mechanism achieves better utilities.
arXiv Detail & Related papers (2021-08-30T01:11:10Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)