Understanding the role of importance weighting for deep learning
- URL: http://arxiv.org/abs/2103.15209v1
- Date: Sun, 28 Mar 2021 19:44:47 GMT
- Title: Understanding the role of importance weighting for deep learning
- Authors: Da Xu, Yuting Ye, Chuanwei Ruan
- Abstract summary: A recent paper by Byrd & Lipton raises concerns about the impact of importance weighting on deep learning models.
We provide formal characterizations and theoretical justifications for the role of importance weighting.
We reveal both the optimization dynamics and the generalization performance of deep learning models.
- Score: 13.845232029169617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent paper by Byrd & Lipton (2019), based on empirical observations,
raises a major concern about the impact of importance weighting on
over-parameterized deep learning models. They observe that as long as the model
can separate the training data, the impact of importance weighting diminishes
as training proceeds. Nevertheless, a rigorous characterization of this
phenomenon has been lacking. In this paper, we provide formal characterizations
and theoretical justifications for the role of importance weighting with
respect to the implicit bias of gradient descent and margin-based learning
theory. We reveal both the optimization dynamics and the generalization
performance of deep learning models. Our work not only explains the various
novel phenomena observed for importance weighting in deep learning, but also
extends to settings where the weights are optimized as part of the model,
which applies to a number of topics under active research.
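The diminishing effect described in the abstract can be illustrated with a minimal sketch, not taken from the paper itself: gradient descent on a weighted logistic loss over linearly separable data. The data points, importance weights, learning rate, and step counts below are all illustrative assumptions. Early in training, the importance weights visibly tilt the learned decision direction; after many steps, the uniformly weighted and heavily reweighted runs drift toward the same max-margin direction, consistent with the implicit-bias results the paper builds on.

```python
import numpy as np

def train_direction(sample_weights, steps, lr=0.1):
    """Gradient descent on a weighted logistic loss over two separable
    points; returns the normalized weight vector (the decision direction)."""
    X = np.array([[1.0, 2.0], [-3.0, -0.5]])  # linearly separable toy data
    y = np.array([1.0, -1.0])
    w = np.zeros(2)
    for _ in range(steps):
        margins = y * (X @ w)
        # sigmoid(-margins), computed stably via log-sum-exp
        s = np.exp(-np.logaddexp(0.0, margins))
        # gradient of sum_i c_i * log(1 + exp(-margin_i))
        grad = -(sample_weights * y * s) @ X
        w -= lr * grad
    return w / np.linalg.norm(w)

uniform = np.ones(2)
skewed = np.array([10.0, 1.0])  # heavily upweight the first example

# Early in training the weighting clearly changes the direction...
cos_early = train_direction(uniform, 1) @ train_direction(skewed, 1)
# ...but after many steps both runs approach the same max-margin direction.
cos_late = train_direction(uniform, 300_000) @ train_direction(skewed, 300_000)
print(f"early agreement: {cos_early:.3f}, late agreement: {cos_late:.3f}")
```

The cosine similarity between the two runs' directions is noticeably below 1 after the first step and rises toward 1 as training proceeds, which is one concrete reading of "the impact of importance weighting diminishes as training proceeds".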
Related papers
- Causal Estimation of Memorisation Profiles [58.20086589761273]
Understanding memorisation in language models has practical and societal implications.
Memorisation is the causal effect of training with an instance on the model's ability to predict that instance.
This paper proposes a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics.
arXiv Detail & Related papers (2024-06-06T17:59:09Z)
- Understanding the Learning Dynamics of Alignment with Human Feedback [17.420727709895736]
This paper offers a theoretical analysis of the learning dynamics of human preference alignment.
We show how the distribution of preference datasets influences the rate of model updates and provide rigorous guarantees on the training accuracy.
arXiv Detail & Related papers (2024-03-27T16:39:28Z)
- Enhancing Generative Class Incremental Learning Performance with Model Forgetting Approach [50.36650300087987]
This study presents a novel approach to Generative Class Incremental Learning (GCIL) by introducing a forgetting mechanism.
We find that integrating the forgetting mechanism significantly enhances the models' performance in acquiring new knowledge.
arXiv Detail & Related papers (2024-03-27T05:10:38Z)
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error of overparameterized models that achieve effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Loss Dynamics of Temporal Difference Reinforcement Learning [36.772501199987076]
We study learning curves for temporal difference learning of a value function with linear function approximators.
We study how learning dynamics and plateaus depend on feature structure, learning rate, discount factor, and reward function.
arXiv Detail & Related papers (2023-07-10T18:17:50Z)
- A Survey on Few-Shot Class-Incremental Learning [11.68962265057818]
Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks, which must learn new tasks from only a few examples.
This paper provides a comprehensive survey of FSCIL.
FSCIL has achieved impressive results in various fields of computer vision.
arXiv Detail & Related papers (2023-04-17T10:15:08Z)
- A Theoretical Study of Inductive Biases in Contrastive Learning [32.98250585760665]
We provide the first theoretical analysis of self-supervised learning that incorporates the effect of inductive biases originating from the model class.
We show that when the model has limited capacity, contrastive representations would recover certain special clustering structures that are compatible with the model architecture.
arXiv Detail & Related papers (2022-11-27T01:53:29Z)
- Rethinking Importance Weighting for Transfer Learning [71.81262398144946]
A key assumption in supervised learning is that training and test data follow the same probability distribution.
As real-world machine learning tasks become increasingly complex, novel approaches are being explored to cope with violations of this assumption.
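Classical importance weighting, which this line of work revisits, corrects for a train/test distribution mismatch by reweighting each training example with the density ratio p_test(x)/p_train(x). A minimal sketch follows; the two Gaussian distributions and the sample size are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Covariate shift: training inputs ~ N(0,1), test inputs ~ N(1,1).
x_train = rng.normal(loc=0.0, scale=1.0, size=200_000)

# Density ratio p_test(x)/p_train(x) for these two Gaussians
# reduces analytically to exp(x - 0.5).
ratio = np.exp(x_train - 0.5)

# Estimate the test-distribution mean of f(x) = x using training data only:
unweighted = x_train.mean()                       # biased toward the train mean, 0
weighted = (ratio * x_train).sum() / ratio.sum()  # self-normalized IW estimate

print(f"unweighted: {unweighted:.3f}, importance-weighted: {weighted:.3f}")
```

The unweighted average stays near the training mean, while the importance-weighted (self-normalized) average recovers the test-distribution mean, which is the correction these methods build on.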
arXiv Detail & Related papers (2021-12-19T14:35:25Z)
- On the Dynamics of Training Attention Models [30.85940880569692]
We study the dynamics of training a simple attention-based classification model using gradient descent.
We prove that training must converge to attending to the discriminative words when the attention output is classified by a linear classifier.
arXiv Detail & Related papers (2020-11-19T18:55:30Z)
- Counterfactual Representation Learning with Balancing Weights [74.67296491574318]
Key to causal inference with observational data is achieving balance in predictive features associated with each treatment type.
Recent literature has explored representation learning to achieve this goal.
We develop an algorithm for flexible, scalable and accurate estimation of causal effects.
arXiv Detail & Related papers (2020-10-23T19:06:03Z)
- Usable Information and Evolution of Optimal Representations During Training [79.38872675793813]
We find that semantically meaningful but ultimately irrelevant information is encoded in the early transient dynamics of training.
We show these effects on both perceptual decision-making tasks inspired by literature, as well as on standard image classification tasks.
arXiv Detail & Related papers (2020-10-06T03:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.