Deduplicating Training Data Mitigates Privacy Risks in Language Models
- URL: http://arxiv.org/abs/2202.06539v2
- Date: Wed, 16 Feb 2022 18:55:11 GMT
- Title: Deduplicating Training Data Mitigates Privacy Risks in Language Models
- Authors: Nikhil Kandpal, Eric Wallace, Colin Raffel
- Abstract summary: We show that the success of privacy attacks is largely due to duplication in commonly used web-scraped training sets.
We show that the rate at which language models regenerate training sequences is superlinearly related to a sequence's count in the training set.
We find that after applying methods to deduplicate training data, language models are considerably more secure against these types of privacy attacks.
- Score: 35.643052320353114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Past work has shown that large language models are susceptible to privacy
attacks, where adversaries generate sequences from a trained model and detect
which sequences are memorized from the training set. In this work, we show that
the success of these attacks is largely due to duplication in commonly used
web-scraped training sets. We first show that the rate at which language models
regenerate training sequences is superlinearly related to a sequence's count in
the training set. For instance, a sequence that is present 10 times in the
training data is on average generated ~1000 times more often than a sequence
that is present only once. We next show that existing methods for detecting
memorized sequences have near-chance accuracy on non-duplicated training
sequences. Finally, we find that after applying methods to deduplicate training
data, language models are considerably more secure against these types of
privacy attacks. Taken together, our results motivate an increased focus on
deduplication in privacy-sensitive applications and a reevaluation of the
practicality of existing privacy attacks.
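As a concrete companion to the abstract, here is a minimal sketch of exact, example-level deduplication by hashing. It is only an illustration under assumed inputs (the toy `corpus` list); the web-scale setting the paper studies relies on stronger exact-substring and near-duplicate deduplication, which this snippet does not reproduce.

```python
import hashlib

def dedup_exact(examples):
    """Keep only the first occurrence of each exact training example.

    A minimal sketch of example-level deduplication; web-scale pipelines
    additionally remove duplicated substrings and near-duplicates, which
    this toy version does not attempt.
    """
    seen = set()
    kept = []
    for text in examples:
        key = hashlib.sha1(text.encode("utf-8")).hexdigest()  # hash to bound memory
        if key not in seen:
            seen.add(key)
            kept.append(text)
    return kept

# Toy corpus: one sequence duplicated 10 times, one appearing once.
corpus = ["a frequently scraped boilerplate sentence"] * 10 + ["a unique sentence"]
print(len(corpus), "->", len(dedup_exact(corpus)))  # 11 -> 2
```

For intuition on why this matters, note that the reported effect is superlinear: a sequence duplicated 10 times being regenerated roughly 1000 times more often than a unique one corresponds to an approximately cubic dependence on duplicate count (10^3 = 1000), so even modest duplication disproportionately increases exposure.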
Related papers
- Demystifying Verbatim Memorization in Large Language Models [67.49068128909349]
Large Language Models (LLMs) frequently memorize long sequences verbatim, often with serious legal and privacy implications.
We develop a framework to study verbatim memorization in a controlled setting by continuing pre-training from Pythia checkpoints with injected sequences.
We find that (1) non-trivial amounts of repetition are necessary for verbatim memorization to happen; (2) later (and presumably better) checkpoints are more likely to memorize verbatim sequences, even for out-of-distribution sequences.
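A rough sketch of the injection setup mentioned above, where target sequences are mixed into a continued pre-training stream with controlled repetition; the interval, canary strings, and round-robin choice below are illustrative assumptions rather than the paper's protocol.

```python
import itertools
import random

def inject_sequences(corpus_stream, injected, every=100, seed=0):
    """Interleave injected sequences into a pre-training stream at a fixed interval."""
    rng = random.Random(seed)
    for step, example in enumerate(corpus_stream, start=1):
        yield example
        if step % every == 0:
            yield rng.choice(injected)  # one controlled repetition of an injected sequence

# Toy usage: inject one of two canary strings after every 3 corpus examples.
corpus = (f"web document {i}" for i in range(10))
stream = inject_sequences(corpus, ["CANARY-A", "CANARY-B"], every=3)
print(list(itertools.islice(stream, 8)))
```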
arXiv Detail & Related papers (2024-07-25T07:10:31Z) - Forgetting Private Textual Sequences in Language Models via Leave-One-Out Ensemble [13.893379594151533]
We propose a novel leave-one-out ensemble method to unlearn targeted textual sequences that need to be forgotten from the model.
Experiments on the LibriSpeech and WikiText-103 datasets show that the proposed method achieves better privacy-utility trade-offs than competing methods.
arXiv Detail & Related papers (2023-09-28T00:43:18Z) - Mitigating Unintended Memorization in Language Models via Alternating Teaching [15.112637366882185]
We propose a novel approach to mitigate unintended memorization in sequential modeling.
In our method, multiple teachers are trained on disjoint training sets whose privacy one wishes to protect.
Experiments on the LibriSpeech corpus show that the proposed method achieves superior privacy-preserving results.
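The only mechanism this listing states for the alternating-teaching approach is that teachers are trained on disjoint private shards; the sketch below shows just that partitioning step, with the alternating student schedule left as a comment because its exact form is not described here.

```python
import random

def disjoint_shards(private_examples, num_teachers, seed=0):
    """Partition a private dataset into disjoint shards, one shard per teacher."""
    rng = random.Random(seed)
    examples = list(private_examples)
    rng.shuffle(examples)
    shards = [examples[i::num_teachers] for i in range(num_teachers)]
    # Each teacher would be trained on exactly one shard; a student could then be
    # supervised by the teachers in alternation so that no single teacher's
    # (potentially memorized) shard dominates the student's behavior.
    return shards

# Toy usage: 10 utterances split across 3 teachers with no overlap.
shards = disjoint_shards([f"utterance {i}" for i in range(10)], num_teachers=3)
print([len(s) for s in shards])  # [4, 3, 3]
```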
arXiv Detail & Related papers (2022-10-13T06:26:41Z) - Measuring Forgetting of Memorized Training Examples [80.9188503645436]
We show that machine learning models exhibit two seemingly contradictory phenomena: training data memorization and various forms of forgetting.
In memorization, models overfit specific training examples and become susceptible to privacy attacks; in forgetting, examples seen early in training are forgotten by the end.
We identify nondeterminism as a potential explanation, showing that deterministically trained models do not forget, while standard models empirically do forget examples over time.
arXiv Detail & Related papers (2022-06-30T20:48:26Z) - Arithmetic-Based Pretraining -- Improving Numeracy of Pretrained Language Models [67.48894919842576]
State-of-the-art pretrained language models tend to perform below their capabilities when applied out-of-the-box on tasks that require numeracy.
We propose a new extended pretraining approach called Arithmetic-Based Pretraining that jointly addresses two known causes, the limited tokenization of numbers and pretraining objectives that do not target numeracy, in one extended pretraining step.
Our experiments show the effectiveness of Arithmetic-Based Pretraining in three different tasks that require improved numeracy.
arXiv Detail & Related papers (2022-05-13T16:10:13Z) - Provably Confidential Language Modelling [36.37616789197548]
We propose Confidentially Redacted Training (CRT), a method to train language generation models while protecting the confidential segments.
We show that our method is able to provably prevent unintended memorization by randomizing parts of the training process.
Our experimental results show that the models trained by CRT obtain almost the same perplexity while preserving strong confidentiality.
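As one way to picture "randomizing parts of the training process", the sketch below perturbs spans that have been flagged as confidential before an example is used for training. The redaction token, replacement policy, and span format are assumptions for illustration; CRT's actual mechanism and its formal guarantees are not reproduced here.

```python
import random

REDACT_TOKEN = "<REDACTED>"  # hypothetical placeholder token, not from the paper

def randomize_confidential(tokens, confidential_spans, vocab, rng=None):
    """Redact or randomize flagged confidential spans in a training example."""
    rng = rng or random.Random(0)
    out = list(tokens)
    # Process spans right-to-left so earlier replacements do not shift later indices.
    for start, end in sorted(confidential_spans, reverse=True):
        if rng.random() < 0.5:
            out[start:end] = [REDACT_TOKEN]                                   # drop the span
        else:
            out[start:end] = [rng.choice(vocab) for _ in range(end - start)]  # random tokens
    return out

# Toy usage: the phone-number span is flagged as confidential upstream.
tokens = ["my", "number", "is", "555", "0199", "call", "me"]
print(randomize_confidential(tokens, [(3, 5)], vocab=["the", "a", "of", "and"]))
```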
arXiv Detail & Related papers (2022-05-04T02:33:45Z) - Extracting Training Data from Large Language Models [78.3839333127544]
This paper demonstrates that an adversary can perform a training data extraction attack to recover individual training examples by querying the language model.
We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model's training data.
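For readers unfamiliar with this attack, the sketch below follows its general generate-then-rank recipe with Hugging Face GPT-2: sample many sequences from the model, then surface candidates whose perplexity is low relative to their compressed length. The sampling parameters and the zlib-based score are common heuristics from this line of work, not necessarily the paper's exact configuration.

```python
import zlib
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sample_candidates(n=8, max_length=64):
    """Draw unconditional samples from the model as candidate memorized sequences."""
    prompt = torch.tensor([[tokenizer.eos_token_id]])
    with torch.no_grad():
        out = model.generate(prompt, do_sample=True, top_k=40, max_length=max_length,
                             num_return_sequences=n, pad_token_id=tokenizer.eos_token_id)
    return [tokenizer.decode(ids, skip_special_tokens=True) for ids in out]

def perplexity(text):
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

def rank_by_memorization(samples):
    """Rank samples so likely-memorized text (low perplexity relative to entropy) comes first."""
    samples = [s for s in samples if s.strip()]
    return sorted(samples, key=lambda s: perplexity(s) / len(zlib.compress(s.encode("utf-8"))))

for text in rank_by_memorization(sample_candidates())[:3]:
    print(repr(text[:80]))
```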
arXiv Detail & Related papers (2020-12-14T18:39:09Z) - Trade-offs between membership privacy & adversarially robust learning [13.37805637358556]
We identify settings where standard models overfit to a larger extent than adversarially robust models.
The degree of overfitting naturally depends on the amount of data available for training.
arXiv Detail & Related papers (2020-06-08T14:20:12Z) - Train No Evil: Selective Masking for Task-Guided Pre-Training [97.03615486457065]
We propose a three-stage framework by adding a task-guided pre-training stage with selective masking between general pre-training and fine-tuning.
We show that our method can achieve comparable or even better performance with less than 50% of the cost.
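A tiny sketch of the selective-masking idea stated above: rather than masking tokens uniformly at random, mask the ones a task-importance score deems most relevant, so the task-guided pre-training stage concentrates on them. The `importance` scorer here is a hypothetical stand-in; the paper defines its own task-guided criterion that this snippet does not reproduce.

```python
def selective_mask(tokens, importance, mask_token="[MASK]", rate=0.15):
    """Mask the tokens with the highest task-importance scores instead of random ones."""
    k = max(1, int(len(tokens) * rate))
    ranked = sorted(range(len(tokens)), key=lambda i: importance(tokens[i]), reverse=True)
    top = set(ranked[:k])
    return [mask_token if i in top else tok for i, tok in enumerate(tokens)]

# Toy usage with a crude importance proxy (token length) standing in for a real scorer.
tokens = "the movie was absolutely wonderful despite the ending".split()
print(selective_mask(tokens, importance=len))
```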
arXiv Detail & Related papers (2020-04-21T03:14:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.