The Risks of Invariant Risk Minimization
- URL: http://arxiv.org/abs/2010.05761v2
- Date: Sat, 27 Mar 2021 16:23:24 GMT
- Title: The Risks of Invariant Risk Minimization
- Authors: Elan Rosenfeld, Pradeep Ravikumar, Andrej Risteski
- Abstract summary: Invariant Risk Minimization (IRM) is an objective, built on the idea of Invariant Causal Prediction, for learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective--as well as these recently proposed alternatives--under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution--this is precisely the issue that it was intended to solve.
- Score: 52.7137956951533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Invariant Causal Prediction (Peters et al., 2016) is a technique for
out-of-distribution generalization which assumes that some aspects of the data
distribution vary across the training set but that the underlying causal
mechanisms remain constant. Recently, Arjovsky et al. (2019) proposed Invariant
Risk Minimization (IRM), an objective based on this idea for learning deep,
invariant features of data which are a complex function of latent variables;
many alternatives have subsequently been suggested. However, formal guarantees
for all of these works are severely lacking. In this paper, we present the
first analysis of classification under the IRM objective--as well as these
recently proposed alternatives--under a fairly natural and general model. In
the linear case, we show simple conditions under which the optimal solution
succeeds or, more often, fails to recover the optimal invariant predictor. We
furthermore present the very first results in the non-linear regime: we
demonstrate that IRM can fail catastrophically unless the test data are
sufficiently similar to the training distribution--this is precisely the issue
that it was intended to solve. Thus, in this setting we find that IRM and its
alternatives fundamentally do not improve over standard Empirical Risk
Minimization.
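For concreteness, the practical IRMv1 objective of Arjovsky et al. (2019) adds, to the average of per-environment empirical risks, a penalty on the gradient of each environment's risk with respect to a fixed scalar "dummy" classifier. Below is a minimal PyTorch sketch of that objective; the function names and the binary-classification setup are illustrative assumptions, not code from any of the papers listed here.

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, labels):
    # IRMv1 penalty for one environment: squared gradient of the risk
    # with respect to a fixed scalar "dummy" classifier w = 1.0.
    scale = torch.ones(1, requires_grad=True, device=logits.device)
    loss = F.binary_cross_entropy_with_logits(logits * scale, labels)
    (grad,) = torch.autograd.grad(loss, [scale], create_graph=True)
    return (grad ** 2).sum()

def irmv1_objective(per_env_logits, per_env_labels, penalty_weight=1e4):
    # Mean empirical risk across environments plus the weighted IRM penalty.
    # Each element of per_env_logits/per_env_labels holds one environment's
    # model outputs and float targets of matching shape.
    risks, penalties = [], []
    for logits, labels in zip(per_env_logits, per_env_labels):
        risks.append(F.binary_cross_entropy_with_logits(logits, labels))
        penalties.append(irm_penalty(logits, labels))
    return torch.stack(risks).mean() + penalty_weight * torch.stack(penalties).mean()
```

Note that the paper's negative results concern the IRM objective itself, not only this relaxation: even an optimal solution may fail to recover the invariant predictor when the test distribution differs sufficiently from training.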
Related papers
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
- Continual Invariant Risk Minimization [46.051656238770086]
Empirical risk minimization can lead to poor generalization behavior on unseen environments if the learned model does not capture invariant feature representations.
Invariant risk minimization (IRM) is a recent proposal for discovering environment-invariant representations.
arXiv Detail & Related papers (2023-10-21T11:44:47Z)
- On the Variance, Admissibility, and Stability of Empirical Risk Minimization [80.26309576810844]
Empirical Risk Minimization (ERM) with squared loss may attain minimax suboptimal error rates.
We show that under mild assumptions, the suboptimality of ERM must be due to large bias rather than variance.
We also show that our estimates imply stability of ERM, complementing the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes.
arXiv Detail & Related papers (2023-05-29T15:25:48Z)
- Counterfactual Supervision-based Information Bottleneck for Out-of-Distribution Generalization [40.94431121318241]
We show that the information bottleneck-based invariant risk minimization algorithm (IB-IRM) is not sufficient for learning invariant features in linear classification problems.
We propose a Counterfactual Supervision-based Information Bottleneck (CSIB) learning algorithm that provably recovers the invariant features.
arXiv Detail & Related papers (2022-08-16T15:26:00Z)
- The Missing Invariance Principle Found -- the Reciprocal Twin of Invariant Risk Minimization [7.6146285961466]
Invariant Risk Minimization (IRM) can generalize poorly to out-of-distribution (OOD) data.
We show that MRI-v1 can guarantee invariant predictors given sufficient environments.
We also demonstrate that MRI strongly outperforms IRM and achieves near-optimal OOD generalization in image-based problems.
arXiv Detail & Related papers (2022-05-29T00:14:51Z)
- Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization [52.7137956951533]
We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
arXiv Detail & Related papers (2022-02-14T16:42:16Z)
- Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression [91.3373131262391]
Uncertainty is the only certainty there is.
Traditionally, the direct regression formulation is considered and the uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions.
How to model the uncertainty within the present-day technologies for regression remains an open issue.
arXiv Detail & Related papers (2021-03-25T06:56:09Z)
- Meta-Learned Invariant Risk Minimization [12.6484257912092]
Empirical Risk Minimization (ERM) based machine learning algorithms have suffered from weak generalization performance on out-of-distribution (OOD) data.
In this paper, we propose a novel meta-learning based approach for IRM.
We show that our algorithm not only has better OOD generalization performance than IRMv1 and all IRM variants, but also addresses the weakness of IRMv1 with improved stability.
arXiv Detail & Related papers (2021-03-24T02:52:48Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)