The Missing Invariance Principle Found -- the Reciprocal Twin of
Invariant Risk Minimization
- URL: http://arxiv.org/abs/2205.14546v1
- Date: Sun, 29 May 2022 00:14:51 GMT
- Title: The Missing Invariance Principle Found -- the Reciprocal Twin of
Invariant Risk Minimization
- Authors: Dongsung Huh and Avinash Baidya
- Abstract summary: Invariant Risk Minimization (IRM) can fail, leaving models that generalize poorly to out-of-distribution (OOD) data.
We show that MRI-v1 can guarantee invariant predictors given sufficient environments.
We also demonstrate that MRI strongly outperforms IRM and achieves near-optimal OOD generalization in image-based problems.
- Score: 7.6146285961466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models often generalize poorly to out-of-distribution (OOD)
data as a result of relying on features that are spuriously correlated with the
label during training. Recently, the technique of Invariant Risk Minimization
(IRM) was proposed to learn predictors that only use invariant features by
conserving the feature-conditioned class expectation $\mathbb{E}_e[y|f(x)]$
across environments. However, more recent studies have demonstrated that IRM
can fail in various task settings. Here, we identify a fundamental flaw of the
IRM formulation that causes the failure. We then introduce a complementary
notion of invariance, MRI, based on conserving the class-conditioned feature
expectation $\mathbb{E}_e[f(x)|y]$ across environments, which corrects for the
flaw in IRM. Further, we introduce a simplified, practical version of the MRI
formulation, called MRI-v1. We note that this constraint is convex, which
gives it an advantage over the practical version of IRM, IRM-v1, which
imposes non-convex constraints. We prove that in a general linear problem
setting, MRI-v1 can guarantee invariant predictors given sufficient
environments. We also empirically demonstrate that MRI strongly outperforms
IRM and consistently achieves near-optimal OOD generalization in image-based
nonlinear problems.
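To make the MRI notion concrete, below is a minimal sketch (ours, not the authors' released code) of what an MRI-v1-style penalty could look like in PyTorch, assuming the conservation of $\mathbb{E}_e[f(x)|y]$ is enforced by matching per-environment, per-class feature means to the pooled class means; the function name `mri_penalty` and the squared-deviation form are illustrative choices.

```python
import torch

def mri_penalty(features, labels, env_ids, num_classes):
    # Illustrative MRI-v1-style penalty (our sketch): encourage the
    # class-conditioned feature expectation E_e[f(x) | y] to agree across
    # environments by penalizing squared deviations of each environment's
    # per-class feature mean from the pooled per-class mean.
    penalty = features.new_zeros(())
    for y in range(num_classes):
        cls_mask = labels == y
        if not cls_mask.any():
            continue
        pooled_mean = features[cls_mask].mean(dim=0)  # estimate of E[f(x) | y]
        for e in env_ids.unique():
            env_mask = cls_mask & (env_ids == e)
            if not env_mask.any():
                continue
            env_mean = features[env_mask].mean(dim=0)  # estimate of E_e[f(x) | y]
            penalty = penalty + ((env_mean - pooled_mean) ** 2).sum()
    return penalty
```

Note that each per-class, per-environment mean is linear in the featurizer outputs, so a squared-deviation penalty of this form is convex in those outputs; this is the kind of advantage the abstract attributes to MRI-v1 over the gradient-based, non-convex IRM-v1 penalty.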
Related papers
- Invariant Risk Minimization Is A Total Variation Model [3.000494957386027]
Invariant risk minimization (IRM) is an emerging approach for generalizing invariant features across different environments in machine learning.
We show that IRM is essentially a total variation model based on the $L^2$ norm (TV-$\ell_2$) of the learning risk.
We propose a novel IRM framework based on the TV-$\ell_1$ model.
arXiv Detail & Related papers (2024-05-02T15:34:14Z)
- Diagnosing and Rectifying Fake OOD Invariance: A Restructured Causal Approach [51.012396632595554]
Invariant representation learning (IRL) encourages predicting labels from invariant causal features that are de-confounded from the environments.
Recent theoretical results verified that some causal features recovered by IRLs merely appear domain-invariant in the training environments but fail in unseen domains.
We develop an approach based on conditional mutual information with respect to a restructured SCM (RS-SCM), which rigorously rectifies the spurious and fake invariant effects.
arXiv Detail & Related papers (2023-12-15T12:58:05Z)
- On the Variance, Admissibility, and Stability of Empirical Risk Minimization [80.26309576810844]
Empirical Risk Minimization (ERM) with squared loss may attain minimax suboptimal error rates.
We show that under mild assumptions, the suboptimality of ERM must be due to large bias rather than variance.
We also show that our estimates imply stability of ERM, complementing the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes.
arXiv Detail & Related papers (2023-05-29T15:25:48Z)
- What Is Missing in IRM Training and Evaluation? Challenges and Solutions [41.56612265456626]
Invariant risk minimization (IRM) has received increasing attention as a way to acquire environment-agnostic data representations and predictions.
Recent works have found that the optimality of the originally proposed practical IRM optimization (IRM-v1) may be compromised in practice.
We identify and resolve three practical limitations in IRM training and evaluation.
arXiv Detail & Related papers (2023-03-04T07:06:24Z)
- Probable Domain Generalization via Quantile Risk Minimization [90.15831047587302]
Domain generalization (DG) seeks predictors that perform well on unseen test distributions.
We propose a new probabilistic framework for DG in which the goal is to learn predictors that perform well with high probability (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-07-20T14:41:09Z)
- Meta-Learned Invariant Risk Minimization [12.6484257912092]
Empirical Risk Minimization (ERM) based machine learning algorithms have suffered from weak generalization performance on out-of-distribution (OOD) data.
In this paper, we propose a novel meta-learning based approach for IRM.
We show that our algorithm not only has better OOD generalization performance than IRMv1 and all IRM variants, but also addresses the weakness of IRMv1 with improved stability.
arXiv Detail & Related papers (2021-03-24T02:52:48Z)
- Bayesian Uncertainty Estimation of Learned Variational MRI Reconstruction [63.202627467245584]
We introduce a Bayesian variational framework to quantify the model-immanent (epistemic) uncertainty.
We demonstrate that our approach yields competitive results for undersampled MRI reconstruction.
arXiv Detail & Related papers (2021-02-12T18:08:14Z)
- Does Invariant Risk Minimization Capture Invariance? [23.399091822468407]
We show that the Invariant Risk Minimization (IRM) formulation of Arjovsky et al. can fail to capture "natural" invariances.
This can lead to worse generalization on new environments.
arXiv Detail & Related papers (2021-01-04T18:02:45Z)
- Empirical or Invariant Risk Minimization? A Sample Complexity Perspective [49.43806345820883]
It is unclear when invariant risk minimization (IRM) should be preferred over the widely employed empirical risk minimization (ERM) framework.
We find that depending on the type of data generation mechanism, the two approaches might have very different finite-sample and asymptotic behavior.
We further investigate how different factors -- the number of environments, complexity of the model, and IRM penalty weight -- impact the sample complexity of IRM in relation to its distance from the OOD solutions.
arXiv Detail & Related papers (2020-10-30T17:55:30Z)
- The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization is an objective based on the idea of learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective--as well as these recently proposed alternatives--under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution--this is precisely the issue that it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z)
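For the Quantile Risk Minimization entry above, here is a minimal sketch of the core idea under our own simplifying assumptions: rather than minimizing the average risk (ERM) or the worst-case risk, minimize an alpha-quantile of the per-environment risk distribution. The function name `quantile_risk` and the direct use of `torch.quantile` are illustrative; the paper's actual estimator is more involved.

```python
import torch

def quantile_risk(per_env_risks, alpha=0.9):
    # Illustrative QRM-style objective (our sketch): take the alpha-quantile
    # of the risks computed on each training environment, so the predictor is
    # pushed to perform well with probability roughly alpha over environments.
    risks = torch.stack(per_env_risks)  # one scalar risk tensor per environment
    return torch.quantile(risks, alpha)

# Hypothetical usage:
#   per_env_risks = [loss_fn(model(x_e), y_e) for (x_e, y_e) in env_batches]
#   quantile_risk(per_env_risks, alpha=0.9).backward()
```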