The Missing Invariance Principle Found -- the Reciprocal Twin of
Invariant Risk Minimization
- URL: http://arxiv.org/abs/2205.14546v1
- Date: Sun, 29 May 2022 00:14:51 GMT
- Title: The Missing Invariance Principle Found -- the Reciprocal Twin of
Invariant Risk Minimization
- Authors: Dongsung Huh and Avinash Baidya
- Abstract summary: Invariant Risk Minimization (IRM) can fail, leaving models that generalize poorly to out-of-distribution (OOD) data.
We show that MRI-v1 can guarantee invariant predictors given sufficient environments.
We also demonstrate that MRI strongly outperforms IRM and achieves near-optimal OOD generalization in image-based problems.
- Score: 7.6146285961466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models often generalize poorly to out-of-distribution (OOD)
data as a result of relying on features that are spuriously correlated with the
label during training. Recently, the technique of Invariant Risk Minimization
(IRM) was proposed to learn predictors that only use invariant features by
conserving the feature-conditioned class expectation $\mathbb{E}_e[y|f(x)]$
across environments. However, more recent studies have demonstrated that IRM
can fail in various task settings. Here, we identify a fundamental flaw of the
IRM formulation that causes the failure. We then introduce a complementary
notion of invariance, MRI, based on conserving the class-conditioned feature
expectation $\mathbb{E}_e[f(x)|y]$ across environments, which corrects for the
flaw in IRM. Further, we introduce a simplified, practical version of the MRI
formulation, called MRI-v1. We note that this constraint is convex, which
gives it an advantage over the practical version of IRM, IRM-v1, which
imposes non-convex constraints. We prove that in a general linear problem
setting, MRI-v1 can guarantee invariant predictors given sufficient
environments. We also empirically demonstrate that MRI strongly outperforms
IRM and consistently achieves near-optimal OOD generalization in image-based
nonlinear problems.
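To make the MRI notion concrete, below is a minimal sketch (ours, not the authors' released code) of what an MRI-v1-style penalty could look like in PyTorch, assuming the conservation of $\mathbb{E}_e[f(x)|y]$ is enforced by matching per-environment, per-class feature means to the pooled class means; the function name `mri_penalty` and the squared-deviation form are illustrative choices.

```python
import torch

def mri_penalty(features, labels, env_ids, num_classes):
    # Illustrative MRI-v1-style penalty (our sketch): encourage the
    # class-conditioned feature expectation E_e[f(x) | y] to agree across
    # environments by penalizing squared deviations of each environment's
    # per-class feature mean from the pooled per-class mean.
    penalty = features.new_zeros(())
    for y in range(num_classes):
        cls_mask = labels == y
        if not cls_mask.any():
            continue
        pooled_mean = features[cls_mask].mean(dim=0)  # estimate of E[f(x) | y]
        for e in env_ids.unique():
            env_mask = cls_mask & (env_ids == e)
            if not env_mask.any():
                continue
            env_mean = features[env_mask].mean(dim=0)  # estimate of E_e[f(x) | y]
            penalty = penalty + ((env_mean - pooled_mean) ** 2).sum()
    return penalty
```

Note that each per-class, per-environment mean is linear in the featurizer outputs, so a squared-deviation penalty of this form is convex in those outputs; this is the kind of advantage the abstract attributes to MRI-v1 over the gradient-based, non-convex IRM-v1 penalty.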
Related papers
- Invariant Risk Minimization Is A Total Variation Model [3.000494957386027]
Invariant risk minimization (IRM) is an emerging approach for generalizing invariant features across different environments in machine learning.
We show that IRM is essentially a total variation model based on the $L^2$ norm (TV-$\ell_2$) of the learning risk.
We propose a novel IRM framework based on the TV-$\ell_1$ model.
arXiv Detail & Related papers (2024-05-02T15:34:14Z)
- Diagnosing and Rectifying Fake OOD Invariance: A Restructured Causal Approach [51.012396632595554]
Invariant representation learning (IRL) encourages predicting labels from invariant causal features that are de-confounded from the environments.
Recent theoretical results verified that some causal features recovered by IRLs merely appear domain-invariant in the training environments but fail in unseen domains.
We develop an approach based on conditional mutual information with respect to a restructured SCM (RS-SCM), which rigorously rectifies the spurious and fake invariant effects.
arXiv Detail & Related papers (2023-12-15T12:58:05Z)
- On the Variance, Admissibility, and Stability of Empirical Risk Minimization [80.26309576810844]
Empirical Risk Minimization (ERM) with squared loss may attain minimax suboptimal error rates.
We show that under mild assumptions, the suboptimality of ERM must be due to large bias rather than variance.
We also show that our estimates imply stability of ERM, complementing the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes.
arXiv Detail & Related papers (2023-05-29T15:25:48Z)
- What Is Missing in IRM Training and Evaluation? Challenges and Solutions [41.56612265456626]
Invariant risk minimization (IRM) has received increasing attention as a way to acquire environment-agnostic data representations and predictions.
Recent works have found that the optimality of the originally proposed practical IRM optimization (IRM-v1) may be compromised in practice.
We identify and resolve three practical limitations in IRM training and evaluation.
arXiv Detail & Related papers (2023-03-04T07:06:24Z)
- Probable Domain Generalization via Quantile Risk Minimization [90.15831047587302]
Domain generalization (DG) seeks predictors that perform well on unseen test distributions.
We propose a new probabilistic framework for DG in which the goal is to learn predictors that perform well with high probability (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-07-20T14:41:09Z)
- Meta-Learned Invariant Risk Minimization [12.6484257912092]
Empirical Risk Minimization (ERM) based machine learning algorithms have suffered from weak generalization performance on out-of-distribution (OOD) data.
In this paper, we propose a novel meta-learning based approach for IRM.
We show that our algorithm not only has better OOD generalization performance than IRMv1 and all IRM variants, but also addresses the weakness of IRMv1 with improved stability.
arXiv Detail & Related papers (2021-03-24T02:52:48Z)
- Bayesian Uncertainty Estimation of Learned Variational MRI Reconstruction [63.202627467245584]
We introduce a Bayesian variational framework to quantify the model-immanent (epistemic) uncertainty.
We demonstrate that our approach yields competitive results for undersampled MRI reconstruction.
arXiv Detail & Related papers (2021-02-12T18:08:14Z)
- Does Invariant Risk Minimization Capture Invariance? [23.399091822468407]
We show that the Invariant Risk Minimization (IRM) formulation of Arjovsky et al. can fail to capture "natural" invariances.
This can lead to worse generalization on new environments.
arXiv Detail & Related papers (2021-01-04T18:02:45Z)
- Empirical or Invariant Risk Minimization? A Sample Complexity Perspective [49.43806345820883]
It is unclear when invariant risk minimization (IRM) should be preferred over the widely employed empirical risk minimization (ERM) framework.
We find that depending on the type of data generation mechanism, the two approaches might have very different finite-sample and asymptotic behavior.
We further investigate how different factors -- the number of environments, complexity of the model, and IRM penalty weight -- impact the sample complexity of IRM in relation to its distance from the OOD solutions.
arXiv Detail & Related papers (2020-10-30T17:55:30Z)
- The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization is an objective based on the idea of learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective--as well as these recently proposed alternatives--under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution--this is precisely the issue that it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z)
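For the Quantile Risk Minimization entry above, here is a minimal sketch of the core idea under our own simplifying assumptions: rather than minimizing the average risk (ERM) or the worst-case risk, minimize an alpha-quantile of the per-environment risk distribution. The function name `quantile_risk` and the direct use of `torch.quantile` are illustrative; the paper's actual estimator is more involved.

```python
import torch

def quantile_risk(per_env_risks, alpha=0.9):
    # Illustrative QRM-style objective (our sketch): take the alpha-quantile
    # of the risks computed on each training environment, so the predictor is
    # pushed to perform well with probability roughly alpha over environments.
    risks = torch.stack(per_env_risks)  # one scalar risk tensor per environment
    return torch.quantile(risks, alpha)

# Hypothetical usage:
#   per_env_risks = [loss_fn(model(x_e), y_e) for (x_e, y_e) in env_batches]
#   quantile_risk(per_env_risks, alpha=0.9).backward()
```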