Sufficient Invariant Learning for Distribution Shift
- URL: http://arxiv.org/abs/2210.13533v2
- Date: Mon, 28 Aug 2023 08:58:18 GMT
- Title: Sufficient Invariant Learning for Distribution Shift
- Authors: Taero Kim, Sungjun Lim, Kyungwoo Song
- Abstract summary: We argue that learning sufficient invariant features from the training set is crucial for the distribution shift case.
ASGDRO learns sufficient invariant features by seeking common flat minima across all groups or domains.
- Score: 16.838595294610105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning algorithms have shown remarkable performance in diverse
applications. However, it is still challenging to guarantee performance under
distribution shift, when the distributions of the training and test datasets
differ. Several approaches improve performance under distribution shift by
learning invariant features across groups or domains. However, we observe that
previous works learn invariant features only partially. While prior works focus
on these limited invariant features, we are the first to raise the importance
of sufficient invariant features. Because only training sets are available in
practice, partial invariant features learned from them might not be present in
the test sets under distribution shift, so the performance improvement under
shift can be limited. In this paper, we argue that learning sufficient
invariant features from the training set is crucial for the distribution shift
setting. Concretely, we newly observe a connection between (a) sufficient
invariant features and (b) flatness differences between groups or domains.
Moreover, we propose a new algorithm, Adaptive Sharpness-aware Group
Distributionally Robust Optimization (ASGDRO), to learn sufficient invariant
features across domains or groups. ASGDRO learns sufficient invariant features
by seeking common flat minima across all groups or domains, and thereby
improves performance on diverse distribution shift cases. In addition, we
provide a new, simple dataset, Heterogeneous-CMNIST, to diagnose whether
various algorithms learn sufficient invariant features.
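To make the core idea concrete, the following is a minimal PyTorch sketch of a group-wise, sharpness-aware objective of this kind, assuming a GroupDRO-style exponentiated update for the group weights and a SAM-style perturbation of radius rho; the function name, hyperparameters, and exact update order are illustrative assumptions rather than the authors' released implementation.

```python
import torch

def asgdro_like_step(model, loss_fn, optimizer, group_batches, q,
                     rho=0.05, eta_q=0.01):
    """One step over a list of (x, y) batches, one batch per group.
    q: 1-D tensor of group weights, updated in place and renormalized."""
    params = [p for p in model.parameters() if p.requires_grad]

    # 1) GroupDRO-style weight update: groups with higher loss get more weight.
    with torch.no_grad():
        losses = torch.stack([loss_fn(model(x), y) for x, y in group_batches])
        q *= torch.exp(eta_q * losses)
        q /= q.sum()

    # 2) SAM-style ascent: perturb weights toward higher group-weighted loss.
    weighted = sum(w * loss_fn(model(x), y)
                   for w, (x, y) in zip(q, group_batches))
    grads = torch.autograd.grad(weighted, params)
    grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12
    eps = [rho * g / grad_norm for g in grads]
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)

    # 3) Descent at the perturbed point, then undo the perturbation and step.
    optimizer.zero_grad()
    sum(w * loss_fn(model(x), y)
        for w, (x, y) in zip(q, group_batches)).backward()
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)
    optimizer.step()
    return q
```

In this sketch, groups with higher current loss receive larger weights, and the descent step is taken at a weight perturbation that increases the group-weighted loss, which pushes the model toward minima that remain flat for every group; `q` can be initialized uniformly, e.g. `q = torch.ones(num_groups) / num_groups`.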
Related papers
- Winning Prize Comes from Losing Tickets: Improve Invariant Learning by
Exploring Variant Parameters for Out-of-Distribution Generalization [76.27711056914168]
Out-of-Distribution (OOD) Generalization aims to learn robust models that generalize well to various environments without fitting to distribution-specific features.
Recent studies based on the Lottery Ticket Hypothesis (LTH) address this problem by minimizing the learning objective to find the subset of parameters that are critical to the task.
We propose Exploring Variant parameters for Invariant Learning (EVIL), which also leverages distribution knowledge to find the parameters that are sensitive to distribution shift.
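As a rough, heavily hedged illustration of this idea (not the EVIL implementation), one could score each parameter by how much its gradient varies across environments and keep only the stable ones, in the spirit of lottery-ticket-style masking; the scoring rule and the keep ratio below are assumptions.

```python
import torch

def variant_parameter_masks(model, loss_fn, env_batches, keep_ratio=0.9):
    """Return {param_name: 0/1 mask} keeping the parameters whose gradients
    are most stable across the given per-environment batches (>= 2 needed)."""
    grads = {n: [] for n, p in model.named_parameters() if p.requires_grad}
    for x, y in env_batches:                      # one (x, y) batch per environment
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.requires_grad and p.grad is not None:
                grads[n].append(p.grad.detach().clone())

    masks = {}
    for n, gs in grads.items():
        if len(gs) < 2:
            continue
        score = torch.stack(gs).std(dim=0)        # high std = shift-sensitive ("variant")
        k = max(1, int(keep_ratio * score.numel()))
        thresh = score.flatten().kthvalue(k).values
        masks[n] = (score <= thresh).float()      # 1 = keep (stable), 0 = suppress
    return masks
```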
arXiv Detail & Related papers (2023-10-25T06:10:57Z)
- Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance, and in particular the sample complexity, of deep neural networks.
Group-equivariant convolutions are a popular approach to obtain equivariant representations.
We propose a multi-stream architecture, where each stream is invariant to a different transformation.
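A small sketch of what such a multi-stream model can look like, where each stream is made invariant to one transformation by averaging features over that transformation's orbit; the backbone, the transformation sets, and the orbit-average pooling are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MultiStreamInvariantNet(nn.Module):
    def __init__(self, feat_dim=64, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim))
        # Each stream = one set of transformations to average over.
        self.streams = [
            [lambda x, k=k: torch.rot90(x, k, dims=(2, 3)) for k in range(4)],
            [lambda x: x, lambda x: torch.flip(x, dims=(3,))],
        ]
        self.head = nn.Linear(feat_dim * len(self.streams), num_classes)

    def forward(self, x):
        feats = []
        for transforms in self.streams:
            # Orbit averaging makes this stream invariant to its transformations.
            feats.append(torch.stack([self.backbone(t(x)) for t in transforms]).mean(0))
        return self.head(torch.cat(feats, dim=1))
```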
arXiv Detail & Related papers (2023-03-02T20:44:45Z)
- Empirical Study on Optimizer Selection for Out-of-Distribution
Generalization [16.386766049451317]
Modern deep learning systems do not generalize well when the test data distribution is slightly different from the training data distribution.
In this study, we examine the performance of popular first-order optimizers for different classes of distributional shift.
arXiv Detail & Related papers (2022-11-15T23:56:30Z)
- Distributional Shift Adaptation using Domain-Specific Features [41.91388601229745]
In open-world scenarios, streaming big data can be Out-Of-Distribution (OOD).
We propose a simple yet effective approach that relies on correlations in general, regardless of whether the features are invariant or not.
Our approach uses the most confidently predicted samples identified by an OOD base model to train a new model that effectively adapts to the target domain.
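A schematic sketch of this confident-sample adaptation recipe, with the confidence threshold and the training loop as assumptions rather than the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def adapt_with_confident_samples(base_model, new_model, target_loader,
                                 optimizer, threshold=0.95, epochs=1):
    """Pseudo-label the target domain with a frozen base model and train a
    new model only on the most confidently predicted samples."""
    base_model.eval()
    for _ in range(epochs):
        for x in target_loader:                    # unlabeled target batches
            with torch.no_grad():
                probs = F.softmax(base_model(x), dim=1)
                conf, pseudo = probs.max(dim=1)
                keep = conf >= threshold           # most confident samples only
            if keep.sum() == 0:
                continue
            loss = F.cross_entropy(new_model(x[keep]), pseudo[keep])
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return new_model
```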
arXiv Detail & Related papers (2022-11-09T04:16:21Z)
- Improving the Sample-Complexity of Deep Classification Networks with
Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
- Improving Out-of-Distribution Robustness via Selective Augmentation [61.147630193060856]
Machine learning algorithms assume that training and test examples are drawn from the same distribution.
However, distribution shift is a common problem in real-world applications and can cause models to perform dramatically worse at test time.
We propose LISA, a mixup-based technique which learns invariant functions via selective augmentation.
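A minimal sketch of selective, mixup-based augmentation in this spirit, interpolating only between examples that share a label but come from different domains; the pairing rule and the Beta parameter are assumptions, not LISA's exact code.

```python
import torch

def intra_label_mixup(x, y, domain, alpha=2.0):
    """x: inputs, y: labels, domain: domain ids. Returns mixed inputs and labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    # Only mix pairs with the same label but different domains;
    # all other samples are left unmixed.
    same_label = y == y[perm]
    diff_domain = domain != domain[perm]
    mix = (same_label & diff_domain).float().view(-1, *([1] * (x.dim() - 1)))
    x_mix = mix * (lam * x + (1 - lam) * x[perm]) + (1 - mix) * x
    return x_mix, y          # labels unchanged: mixed pairs share the same label
```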
arXiv Detail & Related papers (2022-01-02T05:58:33Z)
- Invariance-based Multi-Clustering of Latent Space Embeddings for
Equivariant Learning [12.770012299379099]
We present an approach to disentangle equivariance feature maps in a Lie group manifold by enforcing deep, group-invariant learning.
Our experiments show that this model effectively learns to disentangle the invariant and equivariant representations with significant improvements in the learning rate.
arXiv Detail & Related papers (2021-07-25T03:27:47Z)
- Exploring Complementary Strengths of Invariant and Equivariant
Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in the presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
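One common way to realize such a mechanism, sketched below with an assumed rotation-prediction head for the equivariance signal and a feature-similarity term for the invariance signal; the loss weights and transformations are assumptions, not the paper's exact objectives.

```python
import torch
import torch.nn.functional as F

def equi_inv_losses(encoder, rot_head, x, lambda_inv=1.0, lambda_equi=1.0):
    """x: a batch of square images (N, C, H, W)."""
    ks = torch.randint(0, 4, (x.size(0),), device=x.device)  # random 90-degree rotations
    x_rot = torch.stack([torch.rot90(xi, int(k), dims=(1, 2))
                         for xi, k in zip(x, ks)])
    z, z_rot = encoder(x), encoder(x_rot)
    # Equivariance signal: the applied rotation should be recoverable from features.
    equi = F.cross_entropy(rot_head(z_rot), ks)
    # Invariance signal: class-relevant features should not change under rotation.
    inv = 1.0 - F.cosine_similarity(z, z_rot, dim=1).mean()
    return lambda_equi * equi + lambda_inv * inv
```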
arXiv Detail & Related papers (2021-03-01T21:14:33Z)
- Adaptive Risk Minimization: Learning to Adapt to Domain Shift [109.87561509436016]
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
In this work, we consider the problem setting of domain generalization, where the training data are structured into domains and there may be multiple test time shifts.
We introduce the framework of adaptive risk minimization (ARM), in which models are directly optimized for effective adaptation to shift by learning to adapt on the training domains.
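A rough sketch of a context-based model in the spirit of ARM, where a context network summarizes an unlabeled batch from one domain and the predictor conditions on that summary; the network shapes and the batch-averaged context are assumptions about one possible instantiation.

```python
import torch
import torch.nn as nn

class ContextAdaptiveModel(nn.Module):
    def __init__(self, in_dim, ctx_dim, num_classes):
        super().__init__()
        self.context_net = nn.Sequential(nn.Linear(in_dim, ctx_dim), nn.ReLU())
        self.predictor = nn.Linear(in_dim + ctx_dim, num_classes)

    def forward(self, x):
        # Average the per-example context over the batch: this summary is the
        # adaptation signal, shared by all examples from the same domain.
        ctx = self.context_net(x).mean(dim=0, keepdim=True).expand(x.size(0), -1)
        return self.predictor(torch.cat([x, ctx], dim=1))

# Training would sample one batch per training domain and minimize the usual
# classification loss, so the context pathway learns to adapt to each domain.
```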
arXiv Detail & Related papers (2020-07-06T17:59:30Z)