Sufficient Invariant Learning for Distribution Shift
- URL: http://arxiv.org/abs/2210.13533v3
- Date: Tue, 19 Nov 2024 00:06:13 GMT
- Title: Sufficient Invariant Learning for Distribution Shift
- Authors: Taero Kim, Subeen Park, Sungjun Lim, Yonghan Jung, Krikamol Muandet, Kyungwoo Song
- Abstract summary: We introduce a novel learning principle called the Sufficient Invariant Learning (SIL) framework.
SIL focuses on learning a sufficient subset of invariant features rather than relying on a single feature.
We propose a new algorithm, Adaptive Sharpness-aware Group Distributionally Robust Optimization (ASGDRO), to learn diverse invariant features by seeking common flat minima.
- Score: 20.88069274935592
- License:
- Abstract: Learning robust models under distribution shifts between training and test datasets is a fundamental challenge in machine learning. While learning invariant features across environments is a popular approach, it often assumes that these features are fully observed in both training and test sets, a condition frequently violated in practice. When models rely on invariant features absent in the test set, their robustness in new environments can deteriorate. To tackle this problem, we introduce a novel learning principle called the Sufficient Invariant Learning (SIL) framework, which focuses on learning a sufficient subset of invariant features rather than relying on a single feature. After demonstrating the limitations of existing invariant learning methods, we propose a new algorithm, Adaptive Sharpness-aware Group Distributionally Robust Optimization (ASGDRO), to learn diverse invariant features by seeking common flat minima across the environments. We theoretically demonstrate that finding common flat minima enables robust predictions based on diverse invariant features. Empirical evaluations on multiple datasets, including our new benchmark, confirm ASGDRO's robustness against distribution shifts, highlighting the limitations of existing methods.
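As an illustration of the idea in the abstract, the sketch below combines a Group-DRO-style reweighting of per-environment losses with a SAM-style weight perturbation, so the descent step is taken at a nearby sharper point. It is a minimal sketch under assumed PyTorch conventions; the perturbation radius `rho`, the group step size `eta`, and the omission of the adaptive (parameter-scaled) perturbation are simplifications, not the authors' released ASGDRO implementation.

```python
# Illustrative sketch only, not the authors' released ASGDRO code: a Group-DRO
# style reweighting of per-environment losses combined with a SAM-style weight
# perturbation, so the descent step is taken at a nearby "sharper" point.
import torch

def asgdro_style_step(model, optimizer, env_batches, group_weights,
                      rho=0.05, eta=0.01):
    """env_batches: list of (x, y) per environment; group_weights: 1-D tensor."""
    loss_fn = torch.nn.functional.cross_entropy
    optimizer.zero_grad()

    # 1) Exponential group-weight update on per-environment losses (Group DRO).
    with torch.no_grad():
        env_losses = torch.stack([loss_fn(model(x), y) for x, y in env_batches])
        group_weights = group_weights * torch.exp(eta * env_losses)
        group_weights = group_weights / group_weights.sum()

    # 2) SAM-style ascent: perturb weights toward higher weighted loss.
    weighted = sum(w * loss_fn(model(x), y)
                   for w, (x, y) in zip(group_weights, env_batches))
    weighted.backward()
    grad_norm = torch.sqrt(sum((p.grad ** 2).sum()
                               for p in model.parameters() if p.grad is not None))
    perturbations = []
    with torch.no_grad():
        for p in model.parameters():
            e = None
            if p.grad is not None:
                e = rho * p.grad / (grad_norm + 1e-12)
                p.add_(e)
            perturbations.append(e)
    optimizer.zero_grad()

    # 3) Gradient at the perturbed point, undo the perturbation, then descend.
    perturbed = sum(w * loss_fn(model(x), y)
                    for w, (x, y) in zip(group_weights, env_batches))
    perturbed.backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), perturbations):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
    return group_weights, perturbed.detach()
```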
Related papers
- Winning Prize Comes from Losing Tickets: Improve Invariant Learning by Exploring Variant Parameters for Out-of-Distribution Generalization [76.27711056914168]
Out-of-Distribution (OOD) Generalization aims to learn robust models that generalize well to various environments without fitting to distribution-specific features.
Recent studies based on the Lottery Ticket Hypothesis (LTH) address this problem by minimizing the learning target to find the subset of parameters that are critical to the task.
We propose Exploring Variant parameters for Invariant Learning (EVIL), which also leverages distribution knowledge to find the parameters that are sensitive to distribution shift.
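As a purely hypothetical illustration of the idea summarized above, one could score how "variant" each parameter is by the disagreement of its gradients across environments and mask the most variant entries; the scoring rule and pruning ratio below are assumptions for illustration, not the EVIL procedure itself.

```python
# Hypothetical sketch, not the EVIL algorithm: score each parameter by how much
# its gradients disagree across environments and keep only the stable part.
import torch

def variant_parameter_masks(model, env_batches, prune_ratio=0.2):
    """Returns one boolean mask per parameter tensor (True = treated as invariant).
    Assumes every parameter receives a gradient from every environment."""
    loss_fn = torch.nn.functional.cross_entropy
    per_env_grads = []
    for x, y in env_batches:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        per_env_grads.append([p.grad.detach().clone() for p in model.parameters()])
    masks = []
    for grads in zip(*per_env_grads):
        variance = torch.stack(grads).var(dim=0)   # disagreement across environments
        cutoff = torch.quantile(variance.flatten(), 1.0 - prune_ratio)
        masks.append(variance < cutoff)            # mask out the most "variant" entries
    model.zero_grad()
    return masks
```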
arXiv Detail & Related papers (2023-10-25T06:10:57Z)
- Environment Diversification with Multi-head Neural Network for Invariant Learning [7.255121332331688]
This work proposes EDNIL, an invariant learning framework containing a multi-head neural network to absorb data biases.
We show that this framework does not require prior knowledge about environments or strong assumptions about the pre-trained model.
We demonstrate that models trained with EDNIL are empirically more robust against distributional shifts.
arXiv Detail & Related papers (2023-08-17T04:33:38Z)
- Invariant Meta Learning for Out-of-Distribution Generalization [1.1718589131017048]
In this paper, we propose invariant meta learning for out-of-distribution tasks.
Specifically, the method learns an invariant optimal meta-initialization and fast-adapts to out-of-distribution tasks with a regularization penalty.
arXiv Detail & Related papers (2023-01-26T12:53:21Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization [52.7137956951533]
We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
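As a hedged illustration only, the sketch below assumes that "domain adjustment" amounts to whitening each training domain's features to a common covariance before fitting one ridge predictor on the pooled data; this is not the paper's exact convex objective.

```python
# Hedged sketch, not the paper's exact objective: whiten each training domain's
# features to a common covariance, then fit one ridge predictor on the pooled data.
import numpy as np

def fit_domain_adjusted_ridge(domains, lam=1e-2):
    """domains: list of (X, y) arrays, one pair per training environment."""
    adjusted_X, all_y = [], []
    for X, y in domains:
        Xc = X - X.mean(axis=0, keepdims=True)
        cov = Xc.T @ Xc / len(Xc) + 1e-6 * np.eye(X.shape[1])
        # Whiten this domain so every environment contributes comparable geometry.
        L = np.linalg.cholesky(cov)
        adjusted_X.append(Xc @ np.linalg.inv(L).T)
        all_y.append(y)
    X_all = np.vstack(adjusted_X)
    y_all = np.concatenate(all_y)
    w = np.linalg.solve(X_all.T @ X_all + lam * np.eye(X_all.shape[1]), X_all.T @ y_all)
    return w
```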
arXiv Detail & Related papers (2022-02-14T16:42:16Z)
- Invariance-based Multi-Clustering of Latent Space Embeddings for Equivariant Learning [12.770012299379099]
We present an approach to disentangle equivariant feature maps on a Lie group manifold by enforcing deep, group-invariant learning.
Our experiments show that this model effectively learns to disentangle the invariant and equivariant representations with significant improvements in the learning rate.
arXiv Detail & Related papers (2021-07-25T03:27:47Z)
- Distance-based Hyperspherical Classification for Multi-source Open-Set Domain Adaptation [34.97934677830779]
Vision systems trained in closed-world scenarios will inevitably fail when presented with new environmental conditions.
How to move towards open-world learning is a long-standing research question.
In this work we tackle multi-source Open-Set domain adaptation by introducing HyMOS.
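A minimal sketch of the general idea of distance-based hyperspherical classification with open-set rejection is shown below; the cosine-similarity rule and the rejection threshold are illustrative assumptions, not the HyMOS training procedure.

```python
# Hedged sketch of distance-based hyperspherical classification with open-set
# rejection: class prototypes on the unit sphere, cosine similarity, and a
# threshold that maps low-confidence samples to "unknown".
import numpy as np

def hyperspherical_predict(features, prototypes, reject_threshold=0.5):
    """features: (n, d) embeddings; prototypes: (c, d) per-class mean embeddings."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    similarities = f @ p.T                        # cosine similarity on the hypersphere
    predictions = similarities.argmax(axis=1)
    predictions[similarities.max(axis=1) < reject_threshold] = -1   # -1 = "unknown"
    return predictions
```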
arXiv Detail & Related papers (2021-07-05T14:56:57Z)
- Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in the presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
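A hedged sketch of how equivariance and invariance objectives can be enforced together is given below, using 90-degree rotations as the transformation set; the particular losses and transformations are assumptions for illustration, not the paper's exact mechanism.

```python
# Hedged sketch: a joint equivariance objective (predict which rotation was
# applied) and invariance objective (embeddings agree across rotations).
# The rotation set and loss form are illustrative assumptions.
import torch

def equivariance_invariance_losses(encoder, rotation_head, x):
    """x: image batch of shape (N, C, H, W); encoder returns (N, D) embeddings;
    rotation_head maps an embedding to 4 logits (one per rotation)."""
    rotations = [0, 1, 2, 3]                               # multiples of 90 degrees
    views = [torch.rot90(x, k, dims=(2, 3)) for k in rotations]
    embeddings = [encoder(v) for v in views]

    # Equivariance: a small head should recover the transformation that was applied.
    logits = torch.cat([rotation_head(z) for z in embeddings])
    targets = torch.cat([torch.full((x.size(0),), k, dtype=torch.long, device=x.device)
                         for k in rotations])
    equivariance_loss = torch.nn.functional.cross_entropy(logits, targets)

    # Invariance: rotated views of the same image should embed close together.
    anchor = embeddings[0]
    invariance_loss = sum(torch.nn.functional.mse_loss(z, anchor)
                          for z in embeddings[1:])
    return equivariance_loss, invariance_loss
```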
arXiv Detail & Related papers (2021-03-01T21:14:33Z)
- Adaptive Risk Minimization: Learning to Adapt to Domain Shift [109.87561509436016]
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
In this work, we consider the problem setting of domain generalization, where the training data are structured into domains and there may be multiple test time shifts.
We introduce the framework of adaptive risk minimization (ARM), in which models are directly optimized for effective adaptation to shift by learning to adapt on the training domains.
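The sketch below illustrates the general pattern of conditioning a predictor on a batch-level context so that adaptation to a shifted domain needs only an unlabeled batch; the concatenation-based conditioning and flat feature inputs are assumptions, not the ARM paper's exact architecture.

```python
# Hedged sketch of the adaptive-risk-minimization pattern: a context network
# summarizes the current batch and the predictor conditions on that summary,
# so adapting to a shifted domain only requires an unlabeled batch from it.
import torch

def arm_style_step(context_net, predictor, optimizer, x, y):
    """x: (N, D) features from one training domain; y: (N,) integer labels."""
    # Batch-level context: average the context network's output over the batch.
    context = context_net(x).mean(dim=0, keepdim=True).expand(x.size(0), -1)
    logits = predictor(torch.cat([x, context], dim=1))
    loss = torch.nn.functional.cross_entropy(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.detach()
```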
arXiv Detail & Related papers (2020-07-06T17:59:30Z)
- Invariant Risk Minimization Games [48.00018458720443]
In this work, we pose such invariant risk minimization as finding the Nash equilibrium of an ensemble game among several environments.
By doing so, we develop a simple training algorithm that uses best response dynamics and, in our experiments, yields similar or better empirical accuracy with much lower variance than the challenging bi-level optimization problem of Arjovsky et al.
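A hedged sketch of best-response dynamics on such an ensemble game is shown below: each environment owns one classifier over a shared (here fixed) representation, the ensemble averages their logits, and players update in turn; the details are illustrative assumptions rather than the paper's algorithm.

```python
# Hedged sketch of best-response dynamics on an ensemble game: each environment
# owns one classifier, the ensemble prediction averages their logits, and the
# players update one at a time. Illustrative only.
import torch

def best_response_round(featurizer, classifiers, optimizers, env_batches):
    """classifiers[e] and optimizers[e] belong to environment e."""
    loss_fn = torch.nn.functional.cross_entropy
    for e, (x, y) in enumerate(env_batches):
        z = featurizer(x).detach()              # representation held fixed this round
        logits = torch.stack([clf(z) for clf in classifiers]).mean(dim=0)
        loss = loss_fn(logits, y)
        for opt in optimizers:                  # clear stale gradients everywhere
            opt.zero_grad()
        loss.backward()
        optimizers[e].step()                    # only environment e best-responds
    return loss.detach()
```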
arXiv Detail & Related papers (2020-02-11T21:25:14Z)