Planting and Mitigating Memorized Content in Predictive-Text Language Models
- URL: http://arxiv.org/abs/2212.08619v1
- Date: Fri, 16 Dec 2022 17:57:14 GMT
- Title: Planting and Mitigating Memorized Content in Predictive-Text Language Models
- Authors: C.M. Downey, Wei Dai, Huseyin A. Inan, Kim Laine, Saurabh Naik, Tomasz
Religa
- Abstract summary: Language models are widely deployed to provide automatic text completion services in user products.
Recent research has revealed that language models bear considerable risk of memorizing private training data.
In this study, we test the efficacy of a range of privacy-preserving techniques to mitigate unintended memorization of sensitive user text.
- Score: 11.911353678499008
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language models are widely deployed to provide automatic text completion
services in user products. However, recent research has revealed that language
models (especially large ones) bear considerable risk of memorizing private
training data, which is then vulnerable to leakage and extraction by
adversaries. In this study, we test the efficacy of a range of
privacy-preserving techniques to mitigate unintended memorization of sensitive
user text, while varying other factors such as model size and adversarial
conditions. We test both "heuristic" mitigations (those without formal privacy
guarantees) and Differentially Private training, which provides provable levels
of privacy at the cost of some model performance. Our experiments show that
(with the exception of L2 regularization) heuristic mitigations are largely
ineffective in preventing memorization in our test suite, possibly because they
make overly strong assumptions about the characteristics that define
"sensitive" or "private" text. In contrast, Differential Privacy reliably
prevents memorization in our experiments, despite its computational and
model-performance costs.
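As a rough illustration of the Differentially Private training evaluated above, the sketch below implements the core DP-SGD recipe (per-example gradient clipping followed by calibrated Gaussian noise) on a toy next-token model. The tiny GRU architecture, clip norm, and noise multiplier are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal DP-SGD sketch: per-example gradient clipping + Gaussian noise.
# The tiny GRU language model, clip norm C, and noise multiplier sigma are
# assumptions for illustration only; they are not the paper's actual setup.
import torch
import torch.nn as nn

torch.manual_seed(0)
VOCAB, EMB, HID = 100, 32, 64

class TinyLM(nn.Module):
    """Stand-in for a predictive-text model: embed -> GRU -> vocab logits."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)

model = TinyLM()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

CLIP_NORM = 1.0   # per-example gradient bound C (assumed)
NOISE_MULT = 1.0  # noise multiplier sigma (assumed)

# One DP-SGD step over a toy batch of token sequences.
batch = torch.randint(0, VOCAB, (8, 12))       # (batch, seq_len)
inputs, targets = batch[:, :-1], batch[:, 1:]

summed = [torch.zeros_like(p) for p in model.parameters()]
for i in range(inputs.size(0)):
    model.zero_grad()
    logits = model(inputs[i:i + 1])
    loss = loss_fn(logits.reshape(-1, VOCAB), targets[i:i + 1].reshape(-1))
    loss.backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    # Clip each per-example gradient to L2 norm at most CLIP_NORM.
    total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = torch.clamp(CLIP_NORM / (total_norm + 1e-6), max=1.0)
    for s, g in zip(summed, grads):
        s.add_(g * scale)

# Add calibrated Gaussian noise to the summed clipped gradients, average, step.
model.zero_grad()
for p, s in zip(model.parameters(), summed):
    noise = torch.normal(0.0, NOISE_MULT * CLIP_NORM, size=p.shape)
    p.grad = (s + noise) / inputs.size(0)
optimizer.step()
```

In practice one would rely on a DP library such as Opacus, which also tracks the resulting (epsilon, delta) budget; the sketch only shows where the clipping and noise, and hence the model-performance cost noted in the abstract, enter the training loop.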
Related papers
- Subword Embedding from Bytes Gains Privacy without Sacrificing Accuracy and Complexity [5.7601856226895665]
We propose Subword Embedding from Bytes (SEB) and encode subwords to byte sequences using deep neural networks.
Our solution outperforms conventional approaches by preserving privacy without sacrificing efficiency or accuracy.
We verify that SEB obtains comparable or even better results than standard subword embedding methods in machine translation, sentiment analysis, and language modeling.
arXiv Detail & Related papers (2024-10-21T18:25:24Z)
- NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human [55.20137833039499]
We suggest sanitizing sensitive text using two strategies commonly employed by humans.
We curate the first corpus, coined NAP2, through both crowdsourcing and the use of large language models.
arXiv Detail & Related papers (2024-06-06T05:07:44Z)
- FLTrojan: Privacy Leakage Attacks against Federated Language Models Through Selective Weight Tampering [2.2194815687410627]
We show how a malicious client can leak the privacy-sensitive data of some other users in federated learning (FL) even without any cooperation from the server.
Our best-performing method improves the membership inference recall by 29% and achieves up to 71% private data reconstruction.
arXiv Detail & Related papers (2023-10-24T19:50:01Z)
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- Training Private Models That Know What They Don't Know [40.19666295972155]
We find that several popular selective prediction approaches are ineffective in a differentially private setting.
We propose a novel evaluation mechanism which isolates selective prediction performance across model utility levels.
arXiv Detail & Related papers (2023-05-28T12:20:07Z)
- Tight Auditing of Differentially Private Machine Learning [77.38590306275877]
For private machine learning, existing auditing mechanisms are tight, but only under implausible worst-case assumptions.
We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets.
arXiv Detail & Related papers (2023-02-15T21:40:33Z)
- Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy [91.98116450958331]
We argue that verbatim memorization definitions are too restrictive and fail to capture more subtle forms of memorization.
Specifically, we design and implement an efficient defense that perfectly prevents all verbatim memorization.
We conclude by discussing potential alternative definitions and why defining memorization is a difficult yet crucial open question for neural language models.
arXiv Detail & Related papers (2022-10-31T17:57:55Z)
- On the Privacy Effect of Data Enhancement via the Lens of Memorization [20.63044895680223]
We propose to investigate privacy from a new perspective called memorization.
Through the lens of memorization, we find that previously deployed membership inference attacks (MIAs) produce misleading results, as they are less likely to identify samples with higher privacy risks (a minimal loss-threshold MIA is sketched after this list).
We demonstrate that the generalization gap and privacy leakage are less correlated than previously reported.
arXiv Detail & Related papers (2022-08-17T13:02:17Z)
- Semantics-Preserved Distortion for Personal Privacy Protection in Information Management [65.08939490413037]
This paper suggests a linguistically-grounded approach to distort texts while maintaining semantic integrity.
We present two distinct frameworks for semantic-preserving distortion: a generative approach and a substitutive approach.
We also explore privacy protection in a specific medical information management scenario, showing our method effectively limits sensitive data memorization.
arXiv Detail & Related papers (2022-01-04T04:01:05Z)
- Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy can, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural network training, such as gradient clipping and noise addition, affect the robustness of the model.
arXiv Detail & Related papers (2020-12-14T18:59:24Z)
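Several of the entries above (FLTrojan, the auditing paper, and the memorization-lens paper) quantify leakage with membership inference. For reference, the sketch below shows the simplest loss-threshold form of such an attack; the toy model, candidate records, and threshold are assumptions for illustration and do not correspond to any particular paper's setup.

```python
# Minimal loss-threshold membership inference sketch: guess "member" when the
# target model's per-example loss falls below a threshold. Model, data, and
# threshold are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy classifier standing in for a trained target model.
target_model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

def per_example_loss(model, x, y):
    """Cross-entropy per example; unusually low loss hints at memorization."""
    with torch.no_grad():
        return F.cross_entropy(model(x), y, reduction="none")

# Hypothetical candidate records whose training membership we want to infer.
candidates_x = torch.randn(64, 10)
candidates_y = torch.randint(0, 2, (64,))

THRESHOLD = 0.5  # assumed; normally calibrated on shadow models or held-out data
losses = per_example_loss(target_model, candidates_x, candidates_y)
predicted_member = losses < THRESHOLD

print(f"flagged as training members: {int(predicted_member.sum())} / {len(losses)}")
```

A real audit would calibrate the threshold (for example with shadow models) and report metrics such as recall or true-positive rate at a low false-positive rate.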
This list is automatically generated from the titles and abstracts of the papers on this site.