Measuring Forgetting of Memorized Training Examples
- URL: http://arxiv.org/abs/2207.00099v2
- Date: Tue, 9 May 2023 14:08:17 GMT
- Title: Measuring Forgetting of Memorized Training Examples
- Authors: Matthew Jagielski, Om Thakkar, Florian Tramèr, Daphne Ippolito,
Katherine Lee, Nicholas Carlini, Eric Wallace, Shuang Song, Abhradeep
Thakurta, Nicolas Papernot, Chiyuan Zhang
- Abstract summary: Machine learning models exhibit two seemingly contradictory phenomena: training data memorization and various forms of forgetting.
In memorization, models overfit specific training examples and become susceptible to privacy attacks; in forgetting, examples seen early in training are forgotten by the end.
We show that standard image, speech, and language models empirically do forget examples over time, and identify nondeterminism as a potential explanation: deterministically trained models do not forget.
- Score: 80.9188503645436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models exhibit two seemingly contradictory phenomena:
training data memorization, and various forms of forgetting. In memorization,
models overfit specific training examples and become susceptible to privacy
attacks. In forgetting, examples which appeared early in training are forgotten
by the end. In this work, we connect these phenomena. We propose a technique to
measure to what extent models "forget" the specifics of training examples,
becoming less susceptible to privacy attacks on examples they have not seen
recently. We show that, while non-convex models can memorize data forever in
the worst-case, standard image, speech, and language models empirically do
forget examples over time. We identify nondeterminism as a potential
explanation, showing that deterministically trained models do not forget. Our
results suggest that examples seen early when training with extremely large
datasets - for instance those examples used to pre-train a model - may observe
privacy benefits at the expense of examples seen later.
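As a rough illustration of the measurement described above (a sketch, not the paper's actual protocol), the snippet below runs a simple loss-threshold membership-inference test and compares its advantage on examples seen early versus late in training; all losses here are synthetic placeholders.

```python
import numpy as np

def membership_scores(losses: np.ndarray) -> np.ndarray:
    """Simple loss-based membership score: lower loss -> more member-like."""
    return -losses

def attack_advantage(member_losses: np.ndarray, nonmember_losses: np.ndarray) -> float:
    """Best-threshold membership advantage: max over thresholds of TPR - FPR."""
    scores = np.concatenate([membership_scores(member_losses),
                             membership_scores(nonmember_losses)])
    labels = np.concatenate([np.ones(len(member_losses)),
                             np.zeros(len(nonmember_losses))])
    order = np.argsort(-scores)          # most member-like first
    labels = labels[order]
    tpr = np.cumsum(labels) / labels.sum()
    fpr = np.cumsum(1 - labels) / (1 - labels).sum()
    return float(np.max(tpr - fpr))

# Synthetic per-example losses at the final checkpoint: examples the model
# saw long ago, examples it saw recently, and held-out non-members.
rng = np.random.default_rng(0)
early_losses = rng.normal(1.0, 0.3, 1000)       # seen early in training
late_losses = rng.normal(0.4, 0.3, 1000)        # seen near the end of training
nonmember_losses = rng.normal(1.1, 0.3, 1000)   # never seen

print("advantage on early examples:", attack_advantage(early_losses, nonmember_losses))
print("advantage on late examples: ", attack_advantage(late_losses, nonmember_losses))
```

If forgetting occurs, the advantage on early-seen examples should be noticeably smaller than on recently-seen ones.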
Related papers
- Causal Estimation of Memorisation Profiles [58.20086589761273]
Understanding memorisation in language models has practical and societal implications.
Memorisation is the causal effect of training with an instance on the model's ability to predict that instance.
This paper proposes a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics.
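A minimal sketch of a difference-in-differences estimate under strong simplifying assumptions: compare the change in per-example log-likelihood for instances that were trained on against the change for comparable instances that were not. The data and function names are hypothetical, not the paper's estimator.

```python
import numpy as np

def did_memorisation(treated_before, treated_after, control_before, control_after):
    """Difference-in-differences estimate: (change in log-likelihood for
    instances trained on) minus (change for comparable untrained instances)."""
    treated_change = np.mean(treated_after - treated_before)
    control_change = np.mean(control_after - control_before)
    return treated_change - control_change

# Synthetic per-example log-likelihoods at two checkpoints, before and
# after the training window that contains the treated instances.
rng = np.random.default_rng(1)
treated_before = rng.normal(-3.0, 0.5, 500)
treated_after = treated_before + rng.normal(0.8, 0.2, 500)   # improved after being trained on
control_before = rng.normal(-3.0, 0.5, 500)
control_after = control_before + rng.normal(0.1, 0.2, 500)   # only general improvement

print("memorisation estimate:",
      did_memorisation(treated_before, treated_after, control_before, control_after))
```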
arXiv Detail & Related papers (2024-06-06T17:59:09Z)
- Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy [45.413801663923564]
We discuss adaptations of Membership Inference Attacks (MIAs) to the setting of unlearning.
We show that the commonly used U-MIAs in the unlearning literature overestimate the privacy protection afforded by existing unlearning techniques on both vision and language models.
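As a hedged sketch of what such an unlearning MIA (U-MIA) audit can look like: score forget-set examples against held-out examples with a loss-threshold test on the unlearned model; an AUC well above 0.5 suggests the forget set is still distinguishable. Names and data are illustrative only.

```python
import numpy as np

def mia_auc(forget_losses: np.ndarray, holdout_losses: np.ndarray) -> float:
    """AUC of a loss-threshold membership test run on the unlearned model:
    the probability that a random forget-set example has lower loss than a
    random held-out example (Mann-Whitney form of the AUC). ~0.5 means the
    forget set looks like held-out data; higher means it is still exposed."""
    lower = forget_losses[:, None] < holdout_losses[None, :]
    ties = forget_losses[:, None] == holdout_losses[None, :]
    return float(lower.mean() + 0.5 * ties.mean())

# Synthetic losses from a hypothetical unlearned model.
rng = np.random.default_rng(2)
forget_losses = rng.normal(0.9, 0.3, 400)    # examples requested to be unlearned
holdout_losses = rng.normal(1.1, 0.3, 400)   # examples never trained on

print("U-MIA AUC:", mia_auc(forget_losses, holdout_losses))
```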
arXiv Detail & Related papers (2024-03-02T14:22:40Z)
- What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement [38.93348195407474]
Language models deployed in the wild make errors.
Updating the model with the corrected error instances causes catastrophic forgetting.
We propose a partially interpretable forecasting model based on the observation that changes in pre-softmax logit scores of pretraining examples resemble that of online learned examples.
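A loose sketch of the underlying signal (a simplification, not the paper's forecasting model): rank pretraining examples by how much the pre-softmax logit of their correct label drops after a model update, and flag the largest drops as likely to be forgotten.

```python
import numpy as np

def forecast_forgotten(logits_before: np.ndarray, logits_after: np.ndarray,
                       labels: np.ndarray, k: int = 10) -> np.ndarray:
    """Rank pretraining examples by the drop in the pre-softmax logit of
    their correct label after a model update; return the k largest drops
    as the examples forecast to be forgotten."""
    idx = np.arange(len(labels))
    drop = logits_before[idx, labels] - logits_after[idx, labels]
    return np.argsort(-drop)[:k]

# Synthetic logits over 5 classes for 100 pretraining examples, measured
# before and after refining the model on corrected error instances.
rng = np.random.default_rng(3)
labels = rng.integers(0, 5, 100)
logits_before = rng.normal(0.0, 1.0, (100, 5))
logits_after = logits_before + rng.normal(-0.1, 0.5, (100, 5))

print("forecast to be forgotten:", forecast_forgotten(logits_before, logits_after, labels))
```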
arXiv Detail & Related papers (2024-02-02T19:43:15Z)
- Unintended Memorization in Large ASR Models, and How to Mitigate It [16.047859326721046]
Auditing memorization in large non-auto-regressive automatic speech recognition (ASR) models has been challenging.
We design a simple auditing method to measure memorization in large ASR models without the extra compute overhead.
We show that in large-scale distributed training, clipping the average gradient on each compute core maintains neutral model quality and compute cost.
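A minimal sketch of that per-core clipping idea, assuming a synchronous data-parallel setup: each core clips its own average gradient once, and the clipped averages are then combined, so no per-example gradients are needed. All names and the clipping threshold are hypothetical.

```python
import numpy as np

def clip_by_global_norm(grad: np.ndarray, max_norm: float) -> np.ndarray:
    """Scale a gradient so its L2 norm is at most max_norm."""
    norm = np.linalg.norm(grad)
    return grad if norm <= max_norm else grad * (max_norm / norm)

def aggregate_with_per_core_clipping(per_core_avg_grads, max_norm=1.0):
    """Clip each core's *average* gradient once (rather than every
    per-example gradient), then average the clipped gradients across cores."""
    clipped = [clip_by_global_norm(g, max_norm) for g in per_core_avg_grads]
    return np.mean(clipped, axis=0)

# Synthetic average gradients from 4 compute cores in one training step.
rng = np.random.default_rng(4)
per_core_avg_grads = [rng.normal(0.0, 1.0, 1000) for _ in range(4)]
update = aggregate_with_per_core_clipping(per_core_avg_grads, max_norm=1.0)
print("aggregated gradient norm:", np.linalg.norm(update))
```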
arXiv Detail & Related papers (2023-10-18T06:45:49Z)
- What do larger image classifiers memorise? [64.01325988398838]
We show that training examples exhibit an unexpectedly diverse set of memorisation trajectories across model sizes.
We find that knowledge distillation, an effective and popular model compression technique, tends to inhibit memorisation, while also improving generalisation.
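For reference, a standard knowledge-distillation objective looks roughly like the sketch below (softened teacher targets mixed with the hard-label cross-entropy); the temperature and mixing weight are placeholder values, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """KL divergence between softened teacher and student distributions,
    mixed with the usual cross-entropy on the hard labels."""
    soft_targets = F.log_softmax(teacher_logits / temperature, dim=-1)
    soft_preds = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(soft_preds, soft_targets, reduction="batchmean",
                  log_target=True) * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Synthetic batch: 8 examples, 10 classes.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
print("distillation loss:", loss.item())
```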
arXiv Detail & Related papers (2023-10-09T01:52:07Z)
- Recognition, recall, and retention of few-shot memories in large language models [21.067139116005592]
We investigate simple recognition, recall, and retention experiments with large language models.
We find that a single exposure is generally sufficient for a model to achieve near perfect accuracy.
The flip side of this remarkable capacity for fast learning is that precise memories are quickly overwritten.
arXiv Detail & Related papers (2023-03-30T17:26:16Z)
- Characterizing Datapoints via Second-Split Forgetting [93.99363547536392]
We propose second-split forgetting time (SSFT), a complementary metric that tracks the epoch (if any) after which an original training example is forgotten.
We demonstrate that mislabeled examples are forgotten quickly, and seemingly rare examples are forgotten comparatively slowly.
SSFT can (i) help to identify mislabeled samples, the removal of which improves generalization; and (ii) provide insights about failure modes.
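A small sketch of how a second-split forgetting time could be computed, under the assumption that we have logged, for each first-split example, whether it is still classified correctly at every epoch of second-split training; the exact bookkeeping in the paper may differ.

```python
import numpy as np

def second_split_forgetting_time(correct_per_epoch: np.ndarray) -> np.ndarray:
    """correct_per_epoch: (num_epochs, num_examples) booleans saying whether
    each first-split example is still classified correctly at each epoch of
    second-split training. Returns, per example, the first epoch from which
    it stays misclassified (-1 if it is never permanently forgotten)."""
    num_epochs, num_examples = correct_per_epoch.shape
    ssft = np.full(num_examples, -1)
    for i in range(num_examples):
        correct_epochs = np.nonzero(correct_per_epoch[:, i])[0]
        if correct_epochs.size == 0:
            ssft[i] = 0                                # forgotten immediately
        elif correct_epochs[-1] + 1 < num_epochs:
            ssft[i] = int(correct_epochs[-1] + 1)      # forgotten partway through
    return ssft

# Synthetic record for 4 first-split examples over 5 epochs of second-split training.
correct = np.array([[True,  True,  True,  True],
                    [True,  False, True,  True],
                    [True,  False, True,  False],
                    [True,  False, False, False],
                    [True,  False, False, False]])
print(second_split_forgetting_time(correct))   # a mislabeled-like example is forgotten early
```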
arXiv Detail & Related papers (2022-10-26T21:03:46Z)
- Extracting Training Data from Large Language Models [78.3839333127544]
This paper demonstrates that an adversary can perform a training data extraction attack to recover individual training examples by querying the language model.
We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model's training data.
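A rough sketch of the sample-and-rank recipe behind such attacks, assuming the Hugging Face transformers library and the public gpt2 checkpoint (the authors' actual attack samples far more text and also calibrates perplexity against reference models): generate many samples and surface the lowest-perplexity ones as candidate memorized text.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity the model assigns to a generated sequence; unusually low
    values flag candidate memorized training data."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return float(torch.exp(loss))

# Sample a small batch of unconditioned generations (a real attack uses
# hundreds of thousands of samples and several ranking metrics).
start = torch.tensor([[tokenizer.eos_token_id]])
samples = model.generate(start, do_sample=True, top_k=40, max_length=64,
                         num_return_sequences=20,
                         pad_token_id=tokenizer.eos_token_id)
texts = [tokenizer.decode(s, skip_special_tokens=True) for s in samples]
texts = [t for t in texts if len(tokenizer(t).input_ids) > 1]  # drop degenerate samples

# Lowest-perplexity samples are the most likely memorization candidates.
for text in sorted(texts, key=perplexity)[:5]:
    print(round(perplexity(text), 2), repr(text[:80]))
```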
arXiv Detail & Related papers (2020-12-14T18:39:09Z)
- Robust and On-the-fly Dataset Denoising for Image Classification [72.10311040730815]
On-the-fly Data Denoising (ODD) is robust to mislabeled examples, while introducing almost zero computational overhead compared to standard training.
ODD is able to achieve state-of-the-art results on a wide range of datasets including real-world ones such as WebVision and Clothing1M.
arXiv Detail & Related papers (2020-03-24T03:59:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.