Understanding Memorization from the Perspective of Optimization via
Efficient Influence Estimation
- URL: http://arxiv.org/abs/2112.08798v1
- Date: Thu, 16 Dec 2021 11:34:23 GMT
- Title: Understanding Memorization from the Perspective of Optimization via
Efficient Influence Estimation
- Authors: Futong Liu, Tao Lin, Martin Jaggi
- Abstract summary: We study the phenomenon of memorization with turn-over dropout, an efficient method to estimate influence and memorization, for data with true labels (real data) and data with random labels (random data).
Our main findings are: (i) For both real data and random data, the optimization of easy examples (e.g., real data) and difficult examples (e.g., random data) is carried out by the network simultaneously, with the easy ones learned at a higher speed; (ii) For real data, a correct difficult example in the training dataset is more informative than an easy one.
- Score: 54.899751055620904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Over-parameterized deep neural networks are able to achieve excellent
training accuracy while maintaining a small generalization error. It has also
been found that they are able to fit arbitrary labels, and this behaviour is
referred to as the phenomenon of memorization. In this work, we study the
phenomenon of memorization with turn-over dropout, an efficient method to
estimate influence and memorization, for data with true labels (real data) and
data with random labels (random data). Our main findings are: (i) For both real
data and random data, the optimization of easy examples (e.g., real data) and
difficult examples (e.g., random data) is conducted by the network
simultaneously, with the easy ones learned at a higher speed; (ii) For real data,
a correct difficult example in the training dataset is more informative than an
easy one. By showing the existence of memorization on both random data and real
data, we highlight the consistency between them with respect to optimization and
emphasize the implications of memorization during optimization.
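Turn-over dropout assigns each training example a fixed dropout mask, updates only the sub-network selected by that mask when training on that example, and estimates influence as the loss gap between the example's own sub-network and its untouched complement; memorization is then the self-influence of the example. The sketch below illustrates this idea in PyTorch; the names (TurnOverDropoutMLP, example_mask, train_step, influence) and the seeded per-index mask generation are illustrative assumptions, not the authors' exact implementation.

```python
# A minimal sketch of turn-over dropout influence estimation, assuming a small
# PyTorch MLP. Names and the seeded per-index mask scheme are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TurnOverDropoutMLP(nn.Module):
    def __init__(self, in_dim=784, hidden=512, n_classes=10, p=0.5):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, n_classes)
        self.hidden, self.p = hidden, p

    def example_mask(self, idx: torch.Tensor) -> torch.Tensor:
        # Deterministic per-example binary mask over the hidden units:
        # the same training index always selects the same sub-network.
        masks = []
        for i in idx.tolist():
            g = torch.Generator().manual_seed(int(i))
            masks.append((torch.rand(self.hidden, generator=g) > self.p).float())
        return torch.stack(masks).to(self.fc1.weight.device)

    def forward(self, x, mask=None):
        h = F.relu(self.fc1(x))
        if mask is not None:
            h = h * mask / (1.0 - self.p)  # inverted-dropout scaling
        return self.fc2(h)

def train_step(model, opt, x, y, idx):
    # Each example only updates its own sub-network; the complementary
    # sub-network never receives gradient from it.
    model.train()
    opt.zero_grad()
    loss = F.cross_entropy(model(x, mask=model.example_mask(idx)), y)
    loss.backward()
    opt.step()
    return loss.item()

def influence(model, idx_i, x, y):
    # Influence of training example i on (x, y): loss under the flipped mask
    # (sub-network never updated on i) minus loss under i's own mask.
    model.eval()
    with torch.no_grad():
        m = model.example_mask(idx_i)
        loss_on = F.cross_entropy(model(x, mask=m), y)
        loss_off = F.cross_entropy(model(x, mask=1.0 - m), y)
    return (loss_off - loss_on).item()

# Memorization of example i is read off as its self-influence,
# influence(model, idx_i, x_i, y_i), after training with train_step.
```

With p = 0.5 the two sub-networks are roughly the same size, so the loss gap can be attributed to whether or not the sub-network was updated on example i rather than to a capacity difference.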
Related papers
- Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution [62.71425232332837]
We show that training amortized models with noisy labels is inexpensive and surprisingly effective.
This approach significantly accelerates several feature attribution and data valuation methods, often yielding an order of magnitude speedup over existing approaches.
arXiv Detail & Related papers (2024-01-29T03:42:37Z)
- Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process.
We empirically observe that mislabeled and clean examples exhibit differences in the number of epochs required for them to be consistently and correctly classified (a small sketch of this per-example quantity follows after this list).
Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
arXiv Detail & Related papers (2023-08-26T12:43:25Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- MILD: Modeling the Instance Learning Dynamics for Learning with Noisy Labels [19.650299232829546]
We propose an iterative selection approach based on the Weibull mixture model to identify clean data.
In particular, we measure the difficulty of memorization for each instance via the transition times between being misclassified and being memorized.
Our strategy outperforms existing noisy-label learning methods.
arXiv Detail & Related papers (2023-06-20T14:26:53Z)
- Leveraging Unlabeled Data to Track Memorization [15.4909376515404]
We propose a metric, called susceptibility, to gauge memorization for neural networks.
We empirically show the effectiveness of our metric in tracking memorization on various architectures and datasets.
arXiv Detail & Related papers (2022-12-08T18:36:41Z)
- Online Missing Value Imputation and Change Point Detection with the Gaussian Copula [21.26330349034669]
Missing value imputation is crucial for real-world data science.
We develop an online imputation algorithm for mixed data using the Gaussian copula.
arXiv Detail & Related papers (2020-09-25T16:27:47Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To keep the enlarged training set manageable, we apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
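As referenced in the Late Stopping entry above, the quantity behind that observation can be made concrete: for each training example, find the first epoch from which it stays correctly classified for the rest of training. The sketch below is an assumed illustration of that statistic, not the paper's implementation; the function name and the boolean history matrix are hypothetical.

```python
# A minimal sketch of the per-example statistic behind the Late Stopping
# observation: mislabeled examples tend to become consistently correct later.
import numpy as np

def first_consistently_correct_epoch(correct_history: np.ndarray, k: int = 3) -> np.ndarray:
    """correct_history: (n_epochs, n_examples) boolean matrix where entry
    [t, j] is True if example j was classified correctly at epoch t.
    Returns, per example, the first epoch of the final all-correct run,
    provided that run is at least k epochs long; otherwise -1."""
    n_epochs, n_examples = correct_history.shape
    out = np.full(n_examples, -1, dtype=int)
    for j in range(n_examples):
        col = correct_history[:, j]
        t = n_epochs
        while t > 0 and col[t - 1]:  # walk back over the trailing correct run
            t -= 1
        if n_epochs - t >= k:
            out[j] = t
    return out

# Toy example: a clean-looking example (correct from epoch 0) versus one that
# only becomes stably correct near the end of training.
hist = np.array([[True, False],
                 [True, False],
                 [True, False],
                 [True, True],
                 [True, True],
                 [True, True]])
print(first_consistently_correct_epoch(hist, k=3))  # -> [0 3]
```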
This list is automatically generated from the titles and abstracts of the papers on this site.