Can Neural Network Memorization Be Localized?
- URL: http://arxiv.org/abs/2307.09542v1
- Date: Tue, 18 Jul 2023 18:36:29 GMT
- Title: Can Neural Network Memorization Be Localized?
- Authors: Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J.
Zico Kolter, Chiyuan Zhang
- Abstract summary: We show that memorization is a phenomenon confined to a small set of neurons in various layers of the model.
We propose a new form of dropout -- $\textit{example-tied dropout}$ that enables us to direct the memorization of examples to an a priori determined set of neurons.
- Score: 102.68044087952913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent efforts at explaining the interplay of memorization and generalization
in deep overparametrized networks have posited that neural networks
$\textit{memorize}$ "hard" examples in the final few layers of the model.
Memorization refers to the ability to correctly predict on $\textit{atypical}$
examples of the training set. In this work, we show that rather than being
confined to individual layers, memorization is a phenomenon confined to a small
set of neurons in various layers of the model. First, via three experimental
sources of converging evidence, we find that most layers are redundant for the
memorization of examples and the layers that contribute to example memorization
are, in general, not the final layers. The three sources are $\textit{gradient
accounting}$ (measuring the contribution to the gradient norms from memorized
and clean examples), $\textit{layer rewinding}$ (replacing specific model
weights of a converged model with previous training checkpoints), and
$\textit{retraining}$ (training rewound layers only on clean examples). Second,
we ask a more generic question: can memorization be localized
$\textit{anywhere}$ in a model? We discover that memorization is often confined
to a small number of neurons or channels (around 5) of the model. Based on
these insights we propose a new form of dropout -- $\textit{example-tied
dropout}$ that enables us to direct the memorization of examples to an a priori
determined set of neurons. By dropping out these neurons, we are able to reduce
the accuracy on memorized examples from $100\%\to3\%$, while also reducing the
generalization gap.
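The layer-rewinding and retraining probes mentioned in the abstract can be illustrated with a short PyTorch-style sketch; the checkpoint files, layer names, and data loaders below are placeholders for illustration, not the authors' released code.

```python
# Hypothetical sketch of "layer rewinding": reset one layer of a converged
# model to an earlier checkpoint and re-measure accuracy on memorized
# (e.g. noisy-label) vs. clean training examples.
import copy
import torch


def rewind_layer(model, early_state, layer_prefix):
    """Return a copy of `model` whose parameters under `layer_prefix`
    are reset to their values in `early_state` (an earlier state_dict)."""
    rewound = copy.deepcopy(model)
    state = rewound.state_dict()
    for name, tensor in early_state.items():
        if name.startswith(layer_prefix):
            state[name] = tensor.clone()
    rewound.load_state_dict(state)
    return rewound


@torch.no_grad()
def accuracy(model, loader, device="cpu"):
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        pred = model(x.to(device)).argmax(dim=1)
        correct += (pred == y.to(device)).sum().item()
        total += y.numel()
    return correct / max(total, 1)


# Usage sketch: if rewinding a layer barely changes accuracy on the
# memorized subset, that layer is likely redundant for memorization.
# model.load_state_dict(torch.load("final_ckpt.pt"))
# early = torch.load("early_ckpt.pt")
# probe = rewind_layer(model, early, "layer3.")
# print(accuracy(probe, memorized_loader), accuracy(probe, clean_loader))
```

Similarly, the example-tied dropout idea can be sketched as a module that keeps a fixed block of "generalization" channels always active while each training example activates only its own small, pre-assigned subset of "memorization" channels; dropping that block at evaluation time approximates removing memorized predictions. The sizes and assignment scheme here are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal sketch of example-tied dropout (assumed hyperparameters).
import torch
import torch.nn as nn


class ExampleTiedDropout(nn.Module):
    def __init__(self, num_features, num_examples,
                 generalization_frac=0.8, mem_per_example=5, seed=0):
        super().__init__()
        self.num_gen = int(num_features * generalization_frac)
        num_mem = num_features - self.num_gen
        g = torch.Generator().manual_seed(seed)
        # A priori assignment: each example keeps `mem_per_example`
        # memorization channels, chosen once before training.
        assign = torch.zeros(num_examples, num_mem)
        idx = torch.rand(num_examples, num_mem, generator=g).argsort(dim=1)[:, :mem_per_example]
        assign.scatter_(1, idx, 1.0)
        self.register_buffer("mem_mask", assign)

    def forward(self, x, example_ids=None):
        # x: (batch, num_features); example_ids: indices into the training set.
        gen, mem = x[:, : self.num_gen], x[:, self.num_gen :]
        if self.training and example_ids is not None:
            mem = mem * self.mem_mask[example_ids]
        else:
            mem = torch.zeros_like(mem)  # drop memorization neurons at eval
        return torch.cat([gen, mem], dim=1)
```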
Related papers
- SolidMark: Evaluating Image Memorization in Generative Models [29.686839712637433]
We show that metrics used to evaluate memorization and its mitigation techniques suffer from dataset-dependent biases.
We introduce $\textsc{SolidMark}$, a novel evaluation method that provides a per-image memorization score.
We also show that $\textsc{SolidMark}$ is capable of evaluating fine-grained pixel-level memorization.
arXiv Detail & Related papers (2025-03-01T19:14:51Z) - Localizing Paragraph Memorization in Language Models [17.943637462569537]
We show that while memorization is spread across multiple layers and model components, gradients of memorized paragraphs have a distinguishable spatial pattern.
We also show that memorized continuations are not only harder to unlearn but also harder to corrupt than non-memorized ones.
arXiv Detail & Related papers (2024-03-28T21:53:24Z) - What do larger image classifiers memorise? [64.01325988398838]
We show that training examples exhibit an unexpectedly diverse set of memorisation trajectories across model sizes.
We find that knowledge distillation, an effective and popular model compression technique, tends to inhibit memorisation, while also improving generalisation.
arXiv Detail & Related papers (2023-10-09T01:52:07Z) - On the Role of Neural Collapse in Meta Learning Models for Few-shot
Learning [0.9729803206187322]
This study is the first to explore and understand the properties of neural collapse in meta learning frameworks for few-shot learning.
We perform studies on the Omniglot dataset in the few-shot setting and study the neural collapse phenomenon.
arXiv Detail & Related papers (2023-09-30T18:02:51Z) - Characterizing Datapoints via Second-Split Forgetting [93.99363547536392]
We propose $\textit{second-split forgetting time}$ (SSFT), a complementary metric that tracks the epoch (if any) after which an original training example is forgotten.
We demonstrate that $\textit{mislabeled}$ examples are forgotten quickly, and seemingly $\textit{rare}$ examples are forgotten comparatively slowly.
SSFT can (i) help to identify mislabeled samples, the removal of which improves generalization; and (ii) provide insights about failure modes.
arXiv Detail & Related papers (2022-10-26T21:03:46Z) - The Curious Case of Benign Memorization [19.74244993871716]
We show that under training protocols that include data augmentation, neural networks learn to memorize entirely random labels in a benign way.
We demonstrate that deep models have the surprising ability to separate noise from signal by distributing the task of memorization and feature learning to different layers.
arXiv Detail & Related papers (2022-10-25T13:41:31Z) - Measures of Information Reflect Memorization Patterns [53.71420125627608]
We show that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization.
Importantly, we discover that information organization points to the two forms of memorization, even for neural activations computed on unlabelled in-distribution examples.
arXiv Detail & Related papers (2022-10-17T20:15:24Z) - Quantifying Memorization Across Neural Language Models [61.58529162310382]
Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized data verbatim.
This is undesirable because memorization violates privacy (exposing user data), degrades utility (repeated easy-to-memorize text is often low quality), and hurts fairness (some texts are memorized over others).
We describe three log-linear relationships that quantify the degree to which LMs emit memorized training data.
arXiv Detail & Related papers (2022-02-15T18:48:31Z) - Counterfactual Memorization in Neural Language Models [91.8747020391287]
Modern neural language models that are widely used in various NLP tasks risk memorizing sensitive information from their training data.
An open question in previous studies of language model memorization is how to filter out "common" memorization.
We formulate a notion of counterfactual memorization which characterizes how a model's predictions change if a particular document is omitted during training.
arXiv Detail & Related papers (2021-12-24T04:20:57Z) - Online Memorization of Random Firing Sequences by a Recurrent Neural
Network [12.944868613449218]
Two modes of learning/memorization are considered: The first mode is strictly online, with a single pass through the data, while the second mode uses multiple passes through the data.
In both modes, the learning is strictly local (quasi-Hebbian): At any given time step, only the weights between the neurons firing (or supposed to be firing) at the previous time step and those firing (or supposed to be firing) at the present time step are modified.
arXiv Detail & Related papers (2020-01-09T11:02:53Z) - Learning and Memorizing Representative Prototypes for 3D Point Cloud
Semantic and Instance Segmentation [117.29799759864127]
3D point cloud semantic and instance segmentation is crucial and fundamental for 3D scene understanding.
Deep networks can easily forget the non-dominant cases during the learning process, resulting in unsatisfactory performance.
We propose a memory-augmented network to learn and memorize the representative prototypes that cover diverse samples universally.
arXiv Detail & Related papers (2020-01-06T01:07:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.