The Memory Perturbation Equation: Understanding Model's Sensitivity to Data
- URL: http://arxiv.org/abs/2310.19273v2
- Date: Tue, 16 Jan 2024 12:38:15 GMT
- Title: The Memory Perturbation Equation: Understanding Model's Sensitivity to Data
- Authors: Peter Nickl, Lu Xu, Dharmesh Tailor, Thomas Möllenhoff, Mohammad Emtiyaz Khan
- Abstract summary: We present the Memory-Perturbation Equation (MPE), which relates a model's sensitivity to perturbations in its training data.
Our empirical results show that sensitivity estimates obtained during training can be used to faithfully predict generalization on unseen test data.
- Score: 16.98312108418346
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding a model's sensitivity to its training data is crucial but can
also be challenging and costly, especially during training. To simplify such
issues, we present the Memory-Perturbation Equation (MPE), which relates a model's
sensitivity to perturbations in its training data. Derived using Bayesian
principles, the MPE unifies existing sensitivity measures, generalizes them to
a wide variety of models and algorithms, and unravels useful properties
regarding sensitivities. Our empirical results show that sensitivity estimates
obtained during training can be used to faithfully predict generalization on
unseen test data. The proposed equation is expected to be useful for future
research on robust and adaptive learning.
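To make "sensitivity to perturbation in training data" concrete, the sketch below computes a classical leave-one-out quantity for ridge regression: how much the prediction at a training point shifts when that point is removed. This is a standard textbook identity, not the MPE itself and not the authors' code; the function names, the regularizer `lam`, and the use of NumPy are assumptions made for illustration.

```python
import numpy as np


def loo_prediction_shift(X, y, lam=1.0):
    """Closed-form shift in each training prediction when that example is removed.

    Illustrative ridge-regression identity (assumes lam > 0, no intercept):
        f_i^{-i} - f_i = -(y_i - f_i) * h_ii / (1 - h_ii)
    where h_ii is the i-th diagonal entry of the hat matrix.
    """
    d = X.shape[1]
    A_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))
    H = X @ A_inv @ X.T            # hat matrix: maps y to fitted values
    h = np.diag(H)                 # leverage of each training example
    resid = y - H @ y              # prediction error on each training example
    return -resid * h / (1.0 - h)


def brute_force_shift(X, y, lam=1.0):
    """Same quantity obtained by actually retraining without each example."""
    n, d = X.shape
    w_full = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    shifts = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        w_i = np.linalg.solve(Xi.T @ Xi + lam * np.eye(d), Xi.T @ yi)
        shifts[i] = X[i] @ w_i - X[i] @ w_full
    return shifts


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 4))
    y = X @ rng.normal(size=4) + 0.1 * rng.normal(size=50)
    fast = loo_prediction_shift(X, y, lam=0.5)
    slow = brute_force_shift(X, y, lam=0.5)
    print("max abs difference:", np.max(np.abs(fast - slow)))
```

In this toy setting the shift is the product of a prediction error and a leverage-based term, so examples that are both poorly fit and influential on the fit move the model most. The closed form avoids the n retrainings that the brute-force version needs.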
Related papers
- What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy.
By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z)
- Sensitivity-Aware Amortized Bayesian Inference [8.753065246797561]
Sensitivity analyses reveal the influence of various modeling choices on the outcomes of statistical analyses.
We propose sensitivity-aware amortized Bayesian inference (SA-ABI), a multifaceted approach to integrate sensitivity analyses into simulation-based inference with neural networks.
We demonstrate the effectiveness of our method in applied modeling problems, ranging from disease outbreak dynamics and global warming thresholds to human decision-making.
arXiv Detail & Related papers (2023-10-17T10:14:10Z)
- Reconstructing Training Data from Model Gradient, Provably [68.21082086264555]
We reconstruct the training samples from a single gradient query at a randomly chosen parameter value.
As a provable attack that reveals sensitive training data, our findings suggest potential severe threats to privacy.
arXiv Detail & Related papers (2022-12-07T15:32:22Z)
- Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Understanding Memorization from the Perspective of Optimization via Efficient Influence Estimation [54.899751055620904]
We study the phenomenon of memorization with turn-over dropout, an efficient method to estimate influence and memorization, for data with true labels (real data) and data with random labels (random data).
Our main findings are: (i) For both real data and random data, the optimization of easy examples (e.g., real data) and difficult examples (e.g., random data) are conducted by the network simultaneously, with easy ones at a higher speed; (ii) For real data, a correct difficult example in the training dataset is more informative than an easy one.
arXiv Detail & Related papers (2021-12-16T11:34:23Z)
- Evaluating deep transfer learning for whole-brain cognitive decoding [11.898286908882561]
Transfer learning (TL) is well-suited to improve the performance of deep learning (DL) models in datasets with small numbers of samples.
Here, we evaluate TL for the application of DL models to the decoding of cognitive states from whole-brain functional Magnetic Resonance Imaging (fMRI) data.
arXiv Detail & Related papers (2021-11-01T15:44:49Z)
- Bounding Information Leakage in Machine Learning [26.64770573405079]
This paper investigates fundamental bounds on information leakage.
We identify and bound the success rate of the worst-case membership inference attack.
We derive bounds on the mutual information between the sensitive attributes and model parameters.
arXiv Detail & Related papers (2021-05-09T08:49:14Z)
- Learning Stable Nonparametric Dynamical Systems with Gaussian Process Regression [9.126353101382607]
We learn a nonparametric Lyapunov function based on Gaussian process regression from data.
We prove that stabilization of the nominal model based on the nonparametric control Lyapunov function does not modify the behavior of the nominal model at training samples.
arXiv Detail & Related papers (2020-06-14T11:17:17Z)
- AdaS: Adaptive Scheduling of Stochastic Gradients [50.80697760166045]
We introduce the notions of "knowledge gain" and "mapping condition" and propose a new algorithm called Adaptive Scheduling (AdaS).
Experimentation reveals that, using the derived metrics, AdaS exhibits: (a) faster convergence and superior generalization over existing adaptive learning methods; and (b) lack of dependence on a validation set to determine when to stop training.
arXiv Detail & Related papers (2020-06-11T16:36:31Z)