An Empirical Study of Invariant Risk Minimization
- URL: http://arxiv.org/abs/2004.05007v2
- Date: Mon, 6 Jul 2020 09:10:51 GMT
- Title: An Empirical Study of Invariant Risk Minimization
- Authors: Yo Joong Choe, Jiyeon Ham, Kyubyong Park
- Abstract summary: Invariant risk minimization (IRM) is a recently proposed framework for learning predictors that are invariant to spurious correlations across training environments.
Despite its theoretical justifications, IRM has not been extensively tested across various settings.
We empirically investigate several research questions using IRMv1, which is the first practical algorithm proposed to approximately solve IRM.
- Score: 5.412466703928342
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Invariant risk minimization (IRM) (Arjovsky et al., 2019) is a recently
proposed framework designed for learning predictors that are invariant to
spurious correlations across different training environments. Yet, despite its
theoretical justifications, IRM has not been extensively tested across various
settings. In an attempt to gain a better understanding of the framework, we
empirically investigate several research questions using IRMv1, which is the
first practical algorithm proposed to approximately solve IRM. By extending the
ColoredMNIST experiment in different ways, we find that IRMv1 (i) performs
better as the spurious correlation varies more widely between training
environments, (ii) learns an approximately invariant predictor when the
underlying relationship is approximately invariant, and (iii) can be extended
to an analogous setting for text classification.
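For concreteness, here is a minimal PyTorch sketch of the IRMv1 objective from Arjovsky et al. (2019), in the binary-classification form used for ColoredMNIST; `model`, `envs`, and `penalty_weight` are illustrative names, not code from the paper under discussion.

```python
import torch
import torch.nn.functional as F

def irmv1_penalty(logits, y):
    # IRMv1 penalty: squared gradient norm of the environment risk with
    # respect to a fixed scalar classifier w = 1.0 placed on the logits.
    scale = torch.ones(1, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * scale, y)
    grad = torch.autograd.grad(loss, [scale], create_graph=True)[0]
    return (grad ** 2).sum()

def irmv1_objective(model, envs, penalty_weight):
    # Average per-environment risk plus the weighted invariance penalty.
    total = 0.0
    for x, y in envs:  # one (x, y) batch per training environment
        logits = model(x)
        risk = F.binary_cross_entropy_with_logits(logits, y)
        total = total + risk + penalty_weight * irmv1_penalty(logits, y)
    return total / len(envs)
```

Implementations typically ramp the penalty weight up only after a warm-up period; that scheduling is omitted here.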
Related papers
- Continual Invariant Risk Minimization [46.051656238770086]
Empirical risk minimization can lead to poor generalization behavior on unseen environments if the learned model does not capture invariant feature representations.
Invariant risk minimization (IRM) is a recent proposal for discovering environment-invariant representations; the underlying IRM program is written out after this entry.
arXiv Detail & Related papers (2023-10-21T11:44:47Z)
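For reference, the IRM program these papers build on (Arjovsky et al., 2019) is a bi-level problem over a representation $\Phi$ and a classifier $w$, with $R^e$ the risk in training environment $e$:

```latex
\min_{\Phi,\, w} \sum_{e \in \mathcal{E}_{\mathrm{tr}}} R^e(w \circ \Phi)
\quad \text{subject to} \quad
w \in \arg\min_{\bar{w}} R^e(\bar{w} \circ \Phi)
\;\; \text{for all } e \in \mathcal{E}_{\mathrm{tr}}
```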
- What Is Missing in IRM Training and Evaluation? Challenges and Solutions [41.56612265456626]
Invariant risk minimization (IRM) has received increasing attention as a way to acquire environment-agnostic data representations and predictions.
Recent works have found that the optimality of the originally proposed IRM optimization may be compromised in practice.
We identify and resolve three practical limitations in IRM training and evaluation.
arXiv Detail & Related papers (2023-03-04T07:06:24Z)
- Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient Methods [73.35353358543507]
Stochastic gradient descent-ascent (SGDA) is one of the most prominent algorithms for solving min-max optimization and variational inequality problems (VIPs).
In this paper, we propose a unified convergence analysis that covers a large variety of descent-ascent methods.
We develop several new variants of SGDA, such as a new variance-reduced method (L-SVRGDA), new distributed methods with compression (QSGDA, DIANA-SGDA, VR-DIANA-SGDA), and a new method with coordinate randomization (SEGA-SGDA). A minimal descent-ascent step is sketched after this entry.
arXiv Detail & Related papers (2022-02-15T09:17:39Z)
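A minimal sketch of plain (simultaneous) gradient descent-ascent for min_x max_y f(x, y); the step sizes and the deterministic toy saddle objective below are illustrative choices standing in for stochastic gradient estimates, not the paper's methods.

```python
import torch

def sgda_step(f, x, y, lr_x=0.05, lr_y=0.05):
    # One simultaneous descent-ascent step: descend on the min player x,
    # ascend on the max player y, both evaluated at the current iterate.
    loss = f(x, y)
    gx, gy = torch.autograd.grad(loss, [x, y])
    with torch.no_grad():
        x -= lr_x * gx  # gradient descent on x
        y += lr_y * gy  # gradient ascent on y
    return x, y

# Toy strongly-convex-strongly-concave saddle f(x, y) = x^2/2 + xy - y^2/2,
# whose unique saddle point is (0, 0), so plain SGDA converges.
f = lambda a, b: 0.5 * a**2 + a * b - 0.5 * b**2
x = torch.tensor(1.0, requires_grad=True)
y = torch.tensor(1.0, requires_grad=True)
for _ in range(200):
    x, y = sgda_step(f, x, y)
```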
- Kernelized Heterogeneous Risk Minimization [25.5458915855661]
We propose a Kernelized Heterogeneous Risk Minimization (KerHRM) algorithm, which achieves both latent exploration and invariant learning in kernel space.
We theoretically justify the algorithm and empirically validate its effectiveness with extensive experiments.
arXiv Detail & Related papers (2021-10-24T12:26:50Z)
- Heterogeneous Risk Minimization [25.5458915855661]
Invariant learning methods for out-of-distribution generalization have been proposed by leveraging multiple training environments to find invariant relationships.
Modern datasets are assembled by merging data from multiple sources without explicit source labels.
We propose the Heterogeneous Risk Minimization (HRM) framework to jointly learn the latent heterogeneity among the data and the invariant relationship; a loose illustration of this two-stage idea follows this entry.
arXiv Detail & Related papers (2021-05-09T02:51:36Z)
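HRM alternates between a heterogeneity-identification module and an invariant-prediction module. The following is only a loose illustration of that two-stage idea (cluster the data into pseudo-environments, then apply an invariance penalty across them), not the authors' algorithm; the use of k-means and the `n_envs` parameter are assumptions of this sketch.

```python
import torch
from sklearn.cluster import KMeans

def infer_pseudo_environments(features, n_envs=2):
    # Stage 1 (illustration only): partition training points into pseudo-
    # environments by clustering their feature vectors; HRM itself learns
    # the partition jointly with the predictor rather than fixing it here.
    labels = KMeans(n_clusters=n_envs, n_init=10).fit_predict(features.numpy())
    return [torch.as_tensor(labels == k).nonzero(as_tuple=True)[0]
            for k in range(n_envs)]
```

Stage 2 would then feed each index set into an IRM-style objective such as the IRMv1 sketch above.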
- Empirical or Invariant Risk Minimization? A Sample Complexity Perspective [49.43806345820883]
It is unclear when invariant risk minimization (IRM) should be preferred over the widely employed empirical risk minimization (ERM) framework.
We find that depending on the type of data generation mechanism, the two approaches might have very different finite-sample and asymptotic behavior.
We further investigate how different factors -- the number of environments, complexity of the model, and IRM penalty weight -- impact the sample complexity of IRM in relation to its distance from the OOD solutions (both objectives are written out after this entry).
arXiv Detail & Related papers (2020-10-30T17:55:30Z)
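The two objectives being compared are plain ERM and the $\lambda$-weighted IRMv1 relaxation (the "IRM penalty weight" above is this $\lambda$):

```latex
\text{ERM:} \quad \min_{\Phi} \sum_{e \in \mathcal{E}_{\mathrm{tr}}} R^e(\Phi)
\qquad
\text{IRMv1:} \quad \min_{\Phi} \sum_{e \in \mathcal{E}_{\mathrm{tr}}}
  R^e(\Phi) + \lambda \left\| \nabla_{w \mid w = 1.0}\, R^e(w \cdot \Phi) \right\|^2
```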
- The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant risk minimization (IRM) is an objective for learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective -- as well as its recently proposed alternatives -- under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution--this is precisely the issue that it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z)
- Invariant Rationalization [84.1861516092232]
A typical rationalization criterion, i.e., maximum mutual information (MMI), finds the rationale that maximizes prediction performance based only on the rationale.
We introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments (both criteria are written out after this entry).
We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments.
arXiv Detail & Related papers (2020-03-22T00:50:27Z)
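With $X$ the input, $Y$ the label, $E$ the environment, and $Z = m \odot X$ the rationale selected by a binary mask $m$, the two criteria can be written as:

```latex
\text{MMI:} \quad \max_{m} I(Y; Z)
\qquad
\text{Invariant:} \quad \max_{m} I(Y; Z)
\;\; \text{s.t.} \;\; Y \perp E \mid Z,
\qquad Z = m \odot X
```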
- Invariant Risk Minimization Games [48.00018458720443]
In this work, we pose invariant risk minimization as finding the Nash equilibrium of an ensemble game among several environments.
By doing so, we develop a simple training algorithm that uses best response dynamics and, in our experiments, yields similar or better empirical accuracy with much lower variance than the challenging bi-level optimization problem of Arjovsky et al. (a loose sketch of these dynamics follows this entry).
arXiv Detail & Related papers (2020-02-11T21:25:14Z)
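A loose sketch of best response dynamics for an ensemble game of this kind: each environment keeps its own classifier head on a shared representation, the ensemble prediction averages the heads, and environments take turns minimizing their own risk; `phi`, `heads`, and `envs` are illustrative names, not the paper's code, and the shared representation is held fixed here for simplicity.

```python
import torch
import torch.nn.functional as F

def best_response_round(phi, heads, envs, lr=0.1, inner_steps=10):
    # One round of best response dynamics: each environment e in turn
    # updates only its own classifier head to minimize its own risk,
    # with the representation phi and every other head held fixed.
    for e, (x, y) in enumerate(envs):
        opt = torch.optim.SGD(heads[e].parameters(), lr=lr)
        for _ in range(inner_steps):
            z = phi(x).detach()  # representation fixed within this sketch
            # Ensemble logits: average the heads, detaching all but head e
            # so that only environment e's player best-responds.
            outs = [h(z) if i == e else h(z).detach()
                    for i, h in enumerate(heads)]
            logits = torch.stack(outs).mean(dim=0).squeeze(-1)
            loss = F.binary_cross_entropy_with_logits(logits, y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return heads
```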