Invariant Risk Minimization Games
- URL: http://arxiv.org/abs/2002.04692v2
- Date: Wed, 18 Mar 2020 21:18:17 GMT
- Title: Invariant Risk Minimization Games
- Authors: Kartik Ahuja, Karthikeyan Shanmugam, Kush R. Varshney, Amit Dhurandhar
- Abstract summary: In this work, we pose such invariant risk minimization as finding the Nash equilibrium of an ensemble game among several environments.
By doing so, we develop a simple training algorithm that uses best response dynamics and, in our experiments, yields similar or better empirical accuracy with much lower variance than the challenging bi-level optimization problem of Arjovsky et al. (2019).
- Score: 48.00018458720443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The standard risk minimization paradigm of machine learning is brittle when
operating in environments whose test distributions are different from the
training distribution due to spurious correlations. Training on data from many
environments and finding invariant predictors reduces the effect of spurious
features by concentrating models on features that have a causal relationship
with the outcome. In this work, we pose such invariant risk minimization as
finding the Nash equilibrium of an ensemble game among several environments. By
doing so, we develop a simple training algorithm that uses best response
dynamics and, in our experiments, yields similar or better empirical accuracy
with much lower variance than the challenging bi-level optimization problem of
Arjovsky et al. (2019). One key theoretical contribution is showing that the
set of Nash equilibria for the proposed game is equivalent to the set of
invariant predictors for any finite number of environments, even with nonlinear
classifiers and transformations. As a result, our method also retains the
generalization guarantees to a large set of environments shown in Arjovsky et
al. (2019). The proposed algorithm adds to the collection of successful
game-theoretic machine learning algorithms such as generative adversarial
networks.
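The game-theoretic formulation suggests a very simple training loop. Below is a minimal sketch of best response dynamics on two synthetic environments, assuming a fixed identity representation, an ensemble predictor that averages one linear classifier per environment, and a few gradient steps as each player's approximate best response; the data and hyperparameters are illustrative, and this is not the paper's exact algorithm.

```python
# Sketch: best-response dynamics for an ensemble game over environments.
# Each environment "player" owns one linear classifier; the ensemble predictor is
# their average. In turn, each player takes gradient steps on its OWN environment
# risk while every other player's classifier is held fixed.
import numpy as np

rng = np.random.default_rng(0)

def make_env(n, spurious_corr):
    """Toy binary task: x1 is causal, x2 is spuriously correlated with the label."""
    y = rng.integers(0, 2, n) * 2 - 1                        # labels in {-1, +1}
    x1 = y + rng.normal(0.0, 1.0, n)                         # causal feature
    agree = rng.random(n) < spurious_corr
    x2 = np.where(agree, y, -y) + rng.normal(0.0, 0.1, n)    # spurious feature
    return np.stack([x1, x2], axis=1), y

def logistic_grad(w, X, y):
    """Gradient of the mean logistic loss evaluated at ensemble weights w."""
    margins = y * (X @ w)
    return -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)

envs = [make_env(2000, 0.9), make_env(2000, 0.7)]            # two training environments
players = [np.zeros(2) for _ in envs]                        # one classifier per environment
lr, rounds, inner_steps = 0.1, 50, 10

for _ in range(rounds):
    for e, (X, y) in enumerate(envs):                        # player e best-responds
        for _ in range(inner_steps):
            w_ens = sum(players) / len(players)              # ensemble predictor
            # only player e's classifier moves; the others are treated as constants
            players[e] = players[e] - lr * logistic_grad(w_ens, X, y) / len(players)

w_ens = sum(players) / len(players)
print("ensemble weights [causal, spurious]:", np.round(w_ens, 3))
```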
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization.
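A minimal PyTorch sketch of the prediction-dropout idea in this entry: base-model probabilities are mixed by learned weights, and during training a random subset of the base predictions is dropped and the mixture renormalized, so the aggregator cannot rely on any single base model. The module name, the softmax mixture, and the drop rate are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class DropEnsembler(nn.Module):
    """Learned weighted average of base-model predictions with prediction dropout."""

    def __init__(self, n_models: int, p_drop: float = 0.3):
        super().__init__()
        self.p_drop = p_drop
        self.logits = nn.Parameter(torch.zeros(n_models))    # one mixing weight per base model

    def forward(self, base_preds: torch.Tensor) -> torch.Tensor:
        # base_preds: (batch, n_models, n_classes) class probabilities from the base models
        w = torch.softmax(self.logits, dim=0)
        if self.training and self.p_drop > 0:
            keep = (torch.rand_like(w) > self.p_drop).float()
            keep[torch.argmax(w)] = 1.0                      # always keep at least one model
            w = w * keep
            w = w / w.sum()                                  # renormalize the mixture
        return torch.einsum("bmc,m->bc", base_preds, w)

# usage: predictions from 4 base models on a 3-class problem
base_preds = torch.softmax(torch.randn(8, 4, 3), dim=-1)
ensembler = DropEnsembler(n_models=4)
probs = ensembler(base_preds)                                # (8, 3); rows still sum to 1
print(probs.sum(dim=-1))
```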
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
- Generalization Bounds of Surrogate Policies for Combinatorial Optimization Problems [61.580419063416734]
A recent stream of structured learning approaches has improved the practical state of the art for a range of optimization problems.
The key idea is to exploit the statistical distribution over instances instead of dealing with instances separately.
In this article, we investigate methods that smooth the risk by perturbing the policy, which eases optimization and improves the generalization error.
arXiv Detail & Related papers (2024-07-24T12:00:30Z)
- Causality Pursuit from Heterogeneous Environments via Neural Adversarial Invariance Learning [12.947265104477237]
Pursuing causality from data is a fundamental problem in scientific discovery, treatment intervention, and transfer learning.
The proposed Focused Adversarial Invariant Regularization (FAIR) framework uses a minimax optimization approach.
It is shown that FAIR-NN can find the invariant variables and quasi-causal variables under a minimal identification condition.
arXiv Detail & Related papers (2024-05-07T23:37:40Z)
- A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization [44.88645911638269]
Independence-driven importance weighting algorithms from the stable learning literature have shown empirical effectiveness for covariate-shift generalization.
In this paper, we theoretically prove the effectiveness of such algorithms by explaining them as feature selection processes that target a minimal set of stable variables.
We prove that, under ideal conditions, independence-driven importance weighting algorithms can identify the variables in this set.
arXiv Detail & Related papers (2021-11-03T17:18:49Z)
- Robust Reconfigurable Intelligent Surfaces via Invariant Risk and Causal Representations [55.50218493466906]
In this paper, the problem of robust reconfigurable intelligent surface (RIS) system design under changes in data distributions is investigated.
Using the notion of invariant risk minimization (IRM), an invariant causal representation is learned across multiple environments such that the predictor is simultaneously optimal for each environment.
A neural network-based solution is adopted to seek the predictor and its performance is validated via simulations against an empirical risk minimization-based design.
arXiv Detail & Related papers (2021-05-04T21:36:31Z)
- Adaptive Sampling for Minimax Fair Classification [40.936345085421955]
We propose an adaptive sampling algorithm based on the principle of optimism, and derive theoretical bounds on its performance.
By deriving algorithm-independent lower bounds for a specific class of problems, we show that the performance achieved by our adaptive scheme cannot be improved upon in general.
arXiv Detail & Related papers (2021-03-01T04:58:27Z)
- A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
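The contrast with the usual two-step pipeline (first estimate importance weights, then train a weighted model) can be illustrated by a single joint objective: learn the predictor and per-sample weights together, with a moment-matching penalty tying the reweighted training covariates to the unlabeled test covariates. The formulation below is an assumption for illustration, not the paper's exact one-step objective.

```python
import torch

torch.manual_seed(0)
n_tr, d = 500, 3
X_tr = torch.randn(n_tr, d)
X_te = torch.randn(400, d) + torch.tensor([1.0, 0.0, 0.0])   # test covariates are shifted
beta_true = torch.tensor([1.5, -2.0, 0.5])
y_tr = X_tr @ beta_true + 0.1 * torch.randn(n_tr)

theta = torch.zeros(d, requires_grad=True)         # linear predictor
alpha = torch.zeros(n_tr, requires_grad=True)      # per-sample log importance weights
opt = torch.optim.Adam([theta, alpha], lr=0.05)

for _ in range(500):
    w = torch.softmax(alpha, dim=0) * n_tr          # positive weights that average to 1
    risk = (w * (X_tr @ theta - y_tr) ** 2).mean()  # importance-weighted training risk
    # tie the weights to the test distribution: the reweighted training covariate
    # mean should match the (unlabeled) test covariate mean
    match = ((w[:, None] * X_tr).mean(0) - X_te.mean(0)).pow(2).sum()
    loss = risk + 10.0 * match                      # one optimization over both
    opt.zero_grad()
    loss.backward()
    opt.step()

print("learned predictor:", theta.detach().numpy().round(2))
```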
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of stochasticity in its success is still unclear.
We show that multiplicative noise, as it commonly arises due to variance, results in heavy-tailed behavior in the parameters of discrete stochastic recursions.
A detailed analysis is conducted in which we describe the dependence on key factors, including step size and data, all of which exhibit behavior similar to recent results on state-of-the-art neural network models.
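The mechanism is easy to see in a toy one-dimensional recursion x <- a*x + b run to stationarity: when the multiplier a is random (multiplicative noise) rather than constant, extreme quantiles blow up relative to the median, signaling heavy tails. The recursion, parameters, and quantile-ratio diagnostic below are illustrative, not the paper's models or experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
n_chains, n_steps = 20_000, 1_000

def simulate(mult_std):
    """Iterate x <- a*x + b across many independent chains and return |x|."""
    x = np.zeros(n_chains)
    for _ in range(n_steps):
        a = 0.95 + mult_std * rng.normal(size=n_chains)   # random multiplier (multiplicative noise)
        b = 0.1 * rng.normal(size=n_chains)                # additive noise
        x = a * x + b
    return np.abs(x)

for mult_std in (0.0, 0.3):            # additive-only vs. multiplicative noise
    x = simulate(mult_std)
    q50, q999 = np.quantile(x, [0.5, 0.999])
    print(f"mult_std={mult_std}: median={q50:.3f}, "
          f"99.9% quantile={q999:.3f}, ratio={q999 / q50:.1f}")
```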
arXiv Detail & Related papers (2020-06-11T09:58:01Z)