Planting and Mitigating Memorized Content in Predictive-Text Language Models
- URL: http://arxiv.org/abs/2212.08619v1
- Date: Fri, 16 Dec 2022 17:57:14 GMT
- Title: Planting and Mitigating Memorized Content in Predictive-Text Language Models
- Authors: C.M. Downey, Wei Dai, Huseyin A. Inan, Kim Laine, Saurabh Naik, Tomasz
Religa
- Abstract summary: Language models are widely deployed to provide automatic text completion services in user products.
Recent research has revealed that language models bear considerable risk of memorizing private training data.
In this study, we test the efficacy of a range of privacy-preserving techniques to mitigate unintended memorization of sensitive user text.
- Score: 11.911353678499008
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language models are widely deployed to provide automatic text completion
services in user products. However, recent research has revealed that language
models (especially large ones) bear considerable risk of memorizing private
training data, which is then vulnerable to leakage and extraction by
adversaries. In this study, we test the efficacy of a range of
privacy-preserving techniques to mitigate unintended memorization of sensitive
user text, while varying other factors such as model size and adversarial
conditions. We test both "heuristic" mitigations (those without formal privacy
guarantees) and Differentially Private training, which provides provable levels
of privacy at the cost of some model performance. Our experiments show that
(with the exception of L2 regularization) heuristic mitigations are largely
ineffective in preventing memorization in our test suite, possibly because they
make overly strong assumptions about the characteristics that define
"sensitive" or "private" text. In contrast, Differential Privacy reliably
prevents memorization in our experiments, despite its computational and
model-performance costs.
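As a rough illustration of the Differentially Private training evaluated above, the sketch below implements the core DP-SGD recipe (per-example gradient clipping followed by calibrated Gaussian noise) on a toy next-token model. The tiny GRU architecture, clip norm, and noise multiplier are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal DP-SGD sketch: per-example gradient clipping + Gaussian noise.
# The tiny GRU language model, clip norm C, and noise multiplier sigma are
# assumptions for illustration only; they are not the paper's actual setup.
import torch
import torch.nn as nn

torch.manual_seed(0)
VOCAB, EMB, HID = 100, 32, 64

class TinyLM(nn.Module):
    """Stand-in for a predictive-text model: embed -> GRU -> vocab logits."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)

model = TinyLM()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

CLIP_NORM = 1.0   # per-example gradient bound C (assumed)
NOISE_MULT = 1.0  # noise multiplier sigma (assumed)

# One DP-SGD step over a toy batch of token sequences.
batch = torch.randint(0, VOCAB, (8, 12))       # (batch, seq_len)
inputs, targets = batch[:, :-1], batch[:, 1:]

summed = [torch.zeros_like(p) for p in model.parameters()]
for i in range(inputs.size(0)):
    model.zero_grad()
    logits = model(inputs[i:i + 1])
    loss = loss_fn(logits.reshape(-1, VOCAB), targets[i:i + 1].reshape(-1))
    loss.backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    # Clip each per-example gradient to L2 norm at most CLIP_NORM.
    total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = torch.clamp(CLIP_NORM / (total_norm + 1e-6), max=1.0)
    for s, g in zip(summed, grads):
        s.add_(g * scale)

# Add calibrated Gaussian noise to the summed clipped gradients, average, step.
model.zero_grad()
for p, s in zip(model.parameters(), summed):
    noise = torch.normal(0.0, NOISE_MULT * CLIP_NORM, size=p.shape)
    p.grad = (s + noise) / inputs.size(0)
optimizer.step()
```

In practice one would rely on a DP library such as Opacus, which also tracks the resulting (epsilon, delta) budget; the sketch only shows where the clipping and noise, and hence the model-performance cost noted in the abstract, enter the training loop.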
Related papers
- Subword Embedding from Bytes Gains Privacy without Sacrificing Accuracy and Complexity [5.7601856226895665]
We propose Subword Embedding from Bytes (SEB) and encode subwords to byte sequences using deep neural networks.
Our solution outperforms conventional approaches by preserving privacy without sacrificing efficiency or accuracy.
We verify that SEB obtains comparable or even better results than standard subword embedding methods in machine translation, sentiment analysis, and language modeling.
arXiv Detail & Related papers (2024-10-21T18:25:24Z)
- NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human [55.20137833039499]
We suggest sanitizing sensitive text using two strategies commonly employed by humans.
We curate the first corpus, coined NAP2, through both crowdsourcing and the use of large language models.
arXiv Detail & Related papers (2024-06-06T05:07:44Z)
- FLTrojan: Privacy Leakage Attacks against Federated Language Models Through Selective Weight Tampering [2.2194815687410627]
We show how a malicious client can leak the privacy-sensitive data of some other users in federated learning (FL) even without any cooperation from the server.
Our best-performing method improves the membership inference recall by 29% and achieves up to 71% private data reconstruction.
arXiv Detail & Related papers (2023-10-24T19:50:01Z)
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- Training Private Models That Know What They Don't Know [40.19666295972155]
We find that several popular selective prediction approaches are ineffective in a differentially private setting.
We propose a novel evaluation mechanism which isolates selective prediction performance across model utility levels.
arXiv Detail & Related papers (2023-05-28T12:20:07Z)
- Tight Auditing of Differentially Private Machine Learning [77.38590306275877]
For private machine learning, existing auditing mechanisms are tight, but only under implausible worst-case assumptions.
We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets.
arXiv Detail & Related papers (2023-02-15T21:40:33Z)
- Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy [91.98116450958331]
We argue that verbatim memorization definitions are too restrictive and fail to capture more subtle forms of memorization.
Specifically, we design and implement an efficient defense that perfectly prevents all verbatim memorization.
We conclude by discussing potential alternative definitions and why defining memorization is a difficult yet crucial open question for neural language models.
arXiv Detail & Related papers (2022-10-31T17:57:55Z)
- On the Privacy Effect of Data Enhancement via the Lens of Memorization [20.63044895680223]
We propose to investigate privacy from a new perspective called memorization.
Through the lens of memorization, we find that previously deployed membership inference attacks (MIAs) produce misleading results, as they are less likely to identify samples with higher privacy risks (a minimal loss-threshold MIA is sketched after this list).
We demonstrate that the generalization gap and privacy leakage are less correlated than previously reported.
arXiv Detail & Related papers (2022-08-17T13:02:17Z)
- Semantics-Preserved Distortion for Personal Privacy Protection in Information Management [65.08939490413037]
This paper suggests a linguistically-grounded approach to distort texts while maintaining semantic integrity.
We present two distinct frameworks for semantic-preserving distortion: a generative approach and a substitutive approach.
We also explore privacy protection in a specific medical information management scenario, showing our method effectively limits sensitive data memorization.
arXiv Detail & Related papers (2022-01-04T04:01:05Z)
- Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy can, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural network training, such as gradient clipping and noise addition, affect the robustness of the model.
arXiv Detail & Related papers (2020-12-14T18:59:24Z)
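Several of the entries above (FLTrojan, the auditing paper, and the memorization-lens paper) quantify leakage with membership inference. For reference, the sketch below shows the simplest loss-threshold form of such an attack; the toy model, candidate records, and threshold are assumptions for illustration and do not correspond to any particular paper's setup.

```python
# Minimal loss-threshold membership inference sketch: guess "member" when the
# target model's per-example loss falls below a threshold. Model, data, and
# threshold are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy classifier standing in for a trained target model.
target_model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

def per_example_loss(model, x, y):
    """Cross-entropy per example; unusually low loss hints at memorization."""
    with torch.no_grad():
        return F.cross_entropy(model(x), y, reduction="none")

# Hypothetical candidate records whose training membership we want to infer.
candidates_x = torch.randn(64, 10)
candidates_y = torch.randint(0, 2, (64,))

THRESHOLD = 0.5  # assumed; normally calibrated on shadow models or held-out data
losses = per_example_loss(target_model, candidates_x, candidates_y)
predicted_member = losses < THRESHOLD

print(f"flagged as training members: {int(predicted_member.sum())} / {len(losses)}")
```

A real audit would calibrate the threshold (for example with shadow models) and report metrics such as recall or true-positive rate at a low false-positive rate.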
This list is automatically generated from the titles and abstracts of the papers on this site.