Unshuffling Data for Improved Generalization
- URL: http://arxiv.org/abs/2002.11894v3
- Date: Fri, 20 Nov 2020 23:14:33 GMT
- Title: Unshuffling Data for Improved Generalization
- Authors: Damien Teney, Ehsan Abbasnejad, Anton van den Hengel
- Abstract summary: Generalization beyond the training distribution is a core challenge in machine learning.
We show that partitioning the data into well-chosen, non-i.i.d. subsets treated as multiple training environments can guide the learning of models with better out-of-distribution generalization.
- Score: 65.57124325257409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalization beyond the training distribution is a core challenge in
machine learning. The common practice of mixing and shuffling examples when
training neural networks may not be optimal in this regard. We show that
partitioning the data into well-chosen, non-i.i.d. subsets treated as multiple
training environments can guide the learning of models with better
out-of-distribution generalization. We describe a training procedure to capture
the patterns that are stable across environments while discarding spurious
ones. The method makes a step beyond correlation-based learning: the choice of
the partitioning allows injecting information about the task that cannot be
otherwise recovered from the joint distribution of the training data. We
demonstrate multiple use cases with the task of visual question answering,
which is notorious for dataset biases. We obtain significant improvements on
VQA-CP, using environments built from prior knowledge, existing metadata, or
unsupervised clustering. We also get improvements on GQA using annotations of
"equivalent questions", and on multi-dataset training (VQA v2 / Visual Genome)
by treating them as distinct environments.
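
The idea of treating non-i.i.d. subsets as separate training environments can be illustrated with a minimal sketch. The abstract does not spell out the authors' training procedure, so the code below substitutes a simple, well-known stand-in: the average of the per-environment losses plus a penalty on their variance (in the style of risk extrapolation). The function names, the `penalty_weight` parameter, and the grouping of examples by a question-type tag are all hypothetical choices made for illustration.

```python
import numpy as np


def partition_into_environments(group_ids):
    """Map each group id to the indices of the examples that carry it.

    `group_ids` stands in for whatever defines the environments
    (prior knowledge, metadata, or cluster assignments); hypothetical here.
    """
    envs = {}
    for idx, group in enumerate(group_ids):
        envs.setdefault(group, []).append(idx)
    return {group: np.array(indices) for group, indices in envs.items()}


def environment_aware_objective(per_example_loss, envs, penalty_weight=1.0):
    """Mean of per-environment losses plus a variance penalty across them.

    This is an illustrative stand-in, NOT the regularizer from the paper.
    With penalty_weight=0 and equally sized environments it reduces to the
    usual shuffled average loss.
    """
    env_losses = np.array([per_example_loss[ix].mean() for ix in envs.values()])
    return env_losses.mean() + penalty_weight * env_losses.var()


# Toy data: four examples, two environments keyed by a made-up question type.
group_ids = ["what", "what", "how", "how"]
per_example_loss = np.array([0.2, 0.4, 1.0, 1.2])

envs = partition_into_environments(group_ids)
objective = environment_aware_objective(per_example_loss, envs, penalty_weight=1.0)
```

In this sketch, a model that fits one environment much better than another is penalized even if its average loss is low, which is one simple way to favor patterns that hold across environments over ones that are spurious to a single subset.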
Related papers
- Efficient Bias Mitigation Without Privileged Information [14.21628601482357]
Deep neural networks trained via empirical risk minimisation often exhibit significant performance disparities across groups.
Existing bias mitigation methods that aim to address this issue often rely on group labels for training or validation.
We propose Targeted Augmentations for Bias Mitigation (TAB), a framework that leverages the entire training history of a helper model to identify spurious samples.
We show that TAB improves worst-group performance without any group information or model selection, outperforming existing methods while maintaining overall accuracy.
arXiv Detail & Related papers (2024-09-26T09:56:13Z)
- SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt).
arXiv Detail & Related papers (2023-11-30T03:05:14Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems [9.660129425150926]
Cross entropy loss can easily lead us to find models which demonstrate severe overfitting behavior.
In this paper, we prove that the existing cross entropy loss minimization for training DNN classifiers essentially learns the conditional entropy of the underlying data distribution.
We propose a mutual information learning framework where we train DNN classifiers via learning the mutual information between the label and input.
arXiv Detail & Related papers (2022-10-03T15:09:19Z)
- Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems [9.660129425150926]
We show that the existing cross entropy loss minimization problem essentially learns the label conditional entropy of the underlying data distribution.
We propose a mutual information learning framework where we train deep neural network classifiers via learning the mutual information between the label and the input.
arXiv Detail & Related papers (2022-09-21T01:06:30Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Multi-Domain Joint Training for Person Re-Identification [51.73921349603597]
Deep learning-based person Re-IDentification (ReID) often requires a large amount of training data to achieve good performance.
It appears that collecting more training data from diverse environments tends to improve the ReID performance.
We propose an approach called Domain-Camera-Sample Dynamic network (DCSD) whose parameters can be adaptive to various factors.
arXiv Detail & Related papers (2022-01-06T09:20:59Z)
- A Batch Normalization Classifier for Domain Adaptation [0.0]
Adapting a model to perform well on unforeseen data outside its training set is a common problem that continues to motivate new approaches.
We demonstrate that application of batch normalization in the output layer, prior to softmax activation, results in improved generalization across visual data domains in a refined ResNet model.
arXiv Detail & Related papers (2021-03-22T08:03:44Z)
- Variational Clustering: Leveraging Variational Autoencoders for Image Clustering [8.465172258675763]
Variational Autoencoders (VAEs) naturally lend themselves to learning data distributions in a latent space.
We propose a method based on VAEs where we use a Gaussian Mixture prior to help cluster the images accurately.
Our method simultaneously learns a prior that captures the latent distribution of the images and a posterior to help discriminate well between data points.
arXiv Detail & Related papers (2020-05-10T09:34:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.