Can Neural Network Memorization Be Localized?
- URL: http://arxiv.org/abs/2307.09542v1
- Date: Tue, 18 Jul 2023 18:36:29 GMT
- Title: Can Neural Network Memorization Be Localized?
- Authors: Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J.
Zico Kolter, Chiyuan Zhang
- Abstract summary: We show that memorization is a phenomenon confined to a small set of neurons in various layers of the model.
We propose a new form of dropout -- $\textit{example-tied dropout}$ that enables us to direct the memorization of examples to an a priori determined set of neurons.
- Score: 102.68044087952913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent efforts at explaining the interplay of memorization and generalization
in deep overparametrized networks have posited that neural networks
$\textit{memorize}$ "hard" examples in the final few layers of the model.
Memorization refers to the ability to correctly predict on $\textit{atypical}$
examples of the training set. In this work, we show that rather than being
confined to individual layers, memorization is a phenomenon confined to a small
set of neurons in various layers of the model. First, via three experimental
sources of converging evidence, we find that most layers are redundant for the
memorization of examples and the layers that contribute to example memorization
are, in general, not the final layers. The three sources are $\textit{gradient
accounting}$ (measuring the contribution to the gradient norms from memorized
and clean examples), $\textit{layer rewinding}$ (replacing specific model
weights of a converged model with previous training checkpoints), and
$\textit{retraining}$ (training rewound layers only on clean examples). Second,
we ask a more generic question: can memorization be localized
$\textit{anywhere}$ in a model? We discover that memorization is often confined
to a small number of neurons or channels (around 5) of the model. Based on
these insights we propose a new form of dropout -- $\textit{example-tied
dropout}$ that enables us to direct the memorization of examples to an a priori
determined set of neurons. By dropping out these neurons, we are able to reduce
the accuracy on memorized examples from $100\%\to3\%$, while also reducing the
generalization gap.
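The layer-rewinding and retraining probes mentioned in the abstract can be illustrated with a short PyTorch-style sketch; the checkpoint files, layer names, and data loaders below are placeholders for illustration, not the authors' released code.

```python
# Hypothetical sketch of "layer rewinding": reset one layer of a converged
# model to an earlier checkpoint and re-measure accuracy on memorized
# (e.g. noisy-label) vs. clean training examples.
import copy
import torch


def rewind_layer(model, early_state, layer_prefix):
    """Return a copy of `model` whose parameters under `layer_prefix`
    are reset to their values in `early_state` (an earlier state_dict)."""
    rewound = copy.deepcopy(model)
    state = rewound.state_dict()
    for name, tensor in early_state.items():
        if name.startswith(layer_prefix):
            state[name] = tensor.clone()
    rewound.load_state_dict(state)
    return rewound


@torch.no_grad()
def accuracy(model, loader, device="cpu"):
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        pred = model(x.to(device)).argmax(dim=1)
        correct += (pred == y.to(device)).sum().item()
        total += y.numel()
    return correct / max(total, 1)


# Usage sketch: if rewinding a layer barely changes accuracy on the
# memorized subset, that layer is likely redundant for memorization.
# model.load_state_dict(torch.load("final_ckpt.pt"))
# early = torch.load("early_ckpt.pt")
# probe = rewind_layer(model, early, "layer3.")
# print(accuracy(probe, memorized_loader), accuracy(probe, clean_loader))
```

Similarly, the example-tied dropout idea can be sketched as a module that keeps a fixed block of "generalization" channels always active while each training example activates only its own small, pre-assigned subset of "memorization" channels; dropping that block at evaluation time approximates removing memorized predictions. The sizes and assignment scheme here are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal sketch of example-tied dropout (assumed hyperparameters).
import torch
import torch.nn as nn


class ExampleTiedDropout(nn.Module):
    def __init__(self, num_features, num_examples,
                 generalization_frac=0.8, mem_per_example=5, seed=0):
        super().__init__()
        self.num_gen = int(num_features * generalization_frac)
        num_mem = num_features - self.num_gen
        g = torch.Generator().manual_seed(seed)
        # A priori assignment: each example keeps `mem_per_example`
        # memorization channels, chosen once before training.
        assign = torch.zeros(num_examples, num_mem)
        idx = torch.rand(num_examples, num_mem, generator=g).argsort(dim=1)[:, :mem_per_example]
        assign.scatter_(1, idx, 1.0)
        self.register_buffer("mem_mask", assign)

    def forward(self, x, example_ids=None):
        # x: (batch, num_features); example_ids: indices into the training set.
        gen, mem = x[:, : self.num_gen], x[:, self.num_gen :]
        if self.training and example_ids is not None:
            mem = mem * self.mem_mask[example_ids]
        else:
            mem = torch.zeros_like(mem)  # drop memorization neurons at eval
        return torch.cat([gen, mem], dim=1)
```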
Related papers
- SolidMark: Evaluating Image Memorization in Generative Models [29.686839712637433]
We show that metrics used to evaluate memorization and its mitigation techniques suffer from dataset-dependent biases.
We introduce $\textsc{SolidMark}$, a novel evaluation method that provides a per-image memorization score.
We also show that $\textsc{SolidMark}$ is capable of evaluating fine-grained pixel-level memorization.
arXiv Detail & Related papers (2025-03-01T19:14:51Z) - Localizing Paragraph Memorization in Language Models [17.943637462569537]
We show that while memorization is spread across multiple layers and model components, gradients of memorized paragraphs have a distinguishable spatial pattern.
We also show that memorized continuations are not only harder to unlearn but also harder to corrupt than non-memorized ones.
arXiv Detail & Related papers (2024-03-28T21:53:24Z) - What do larger image classifiers memorise? [64.01325988398838]
We show that training examples exhibit an unexpectedly diverse set of memorisation trajectories across model sizes.
We find that knowledge distillation, an effective and popular model compression technique, tends to inhibit memorisation, while also improving generalisation.
arXiv Detail & Related papers (2023-10-09T01:52:07Z) - On the Role of Neural Collapse in Meta Learning Models for Few-shot
Learning [0.9729803206187322]
This study is the first to explore and understand the properties of neural collapse in meta learning frameworks for few-shot learning.
We perform studies on the Omniglot dataset in the few-shot setting and study the neural collapse phenomenon.
arXiv Detail & Related papers (2023-09-30T18:02:51Z) - Characterizing Datapoints via Second-Split Forgetting [93.99363547536392]
We propose $\textit{second-split forgetting time}$ (SSFT), a complementary metric that tracks the epoch (if any) after which an original training example is forgotten.
We demonstrate that $\textit{mislabeled}$ examples are forgotten quickly, and seemingly $\textit{rare}$ examples are forgotten comparatively slowly.
SSFT can (i) help to identify mislabeled samples, the removal of which improves generalization; and (ii) provide insights about failure modes.
arXiv Detail & Related papers (2022-10-26T21:03:46Z) - The Curious Case of Benign Memorization [19.74244993871716]
We show that under training protocols that include data augmentation, neural networks learn to memorize entirely random labels in a benign way.
We demonstrate that deep models have the surprising ability to separate noise from signal by distributing the task of memorization and feature learning to different layers.
arXiv Detail & Related papers (2022-10-25T13:41:31Z) - Measures of Information Reflect Memorization Patterns [53.71420125627608]
We show that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization.
Importantly, we discover that information organization points to the two forms of memorization, even for neural activations computed on unlabelled in-distribution examples.
arXiv Detail & Related papers (2022-10-17T20:15:24Z) - Quantifying Memorization Across Neural Language Models [61.58529162310382]
Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized data verbatim.
This is undesirable because memorization violates privacy (exposing user data), degrades utility (repeated easy-to-memorize text is often low quality), and hurts fairness (some texts are memorized over others).
We describe three log-linear relationships that quantify the degree to which LMs emit memorized training data.
arXiv Detail & Related papers (2022-02-15T18:48:31Z) - Counterfactual Memorization in Neural Language Models [91.8747020391287]
Modern neural language models that are widely used in various NLP tasks risk memorizing sensitive information from their training data.
An open question in previous studies of language model memorization is how to filter out "common" memorization.
We formulate a notion of counterfactual memorization which characterizes how a model's predictions change if a particular document is omitted during training.
arXiv Detail & Related papers (2021-12-24T04:20:57Z) - Online Memorization of Random Firing Sequences by a Recurrent Neural
Network [12.944868613449218]
Two modes of learning/memorization are considered: The first mode is strictly online, with a single pass through the data, while the second mode uses multiple passes through the data.
In both modes, the learning is strictly local (quasi-Hebbian): At any given time step, only the weights between the neurons firing (or supposed to be firing) at the previous time step and those firing (or supposed to be firing) at the present time step are modified.
arXiv Detail & Related papers (2020-01-09T11:02:53Z) - Learning and Memorizing Representative Prototypes for 3D Point Cloud
Semantic and Instance Segmentation [117.29799759864127]
3D point cloud semantic and instance segmentation is crucial and fundamental for 3D scene understanding.
Deep networks can easily forget the non-dominant cases during the learning process, resulting in unsatisfactory performance.
We propose a memory-augmented network to learn and memorize the representative prototypes that cover diverse samples universally.
arXiv Detail & Related papers (2020-01-06T01:07:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.