Linear Regression Games: Convergence Guarantees to Approximate
Out-of-Distribution Solutions
- URL: http://arxiv.org/abs/2010.15234v1
- Date: Wed, 28 Oct 2020 21:10:24 GMT
- Title: Linear Regression Games: Convergence Guarantees to Approximate
Out-of-Distribution Solutions
- Authors: Kartik Ahuja, Karthikeyan Shanmugam, Amit Dhurandhar
- Abstract summary: In this work, we extend the framework in Ahuja et al. for linear regressions by projecting the ensemble-game on an $\ell_{\infty}$ ball.
We show that such projections help achieve non-trivial OOD guarantees despite not achieving perfect invariance.
- Score: 35.313551211453266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, invariant risk minimization (IRM) (Arjovsky et al.) was proposed as
a promising solution to address out-of-distribution (OOD) generalization. In
Ahuja et al., it was shown that solving for the Nash equilibria of a new class
of "ensemble-games" is equivalent to solving IRM. In this work, we extend the
framework in Ahuja et al. for linear regressions by projecting the
ensemble-game on an $\ell_{\infty}$ ball. We show that such projections help
achieve non-trivial OOD guarantees despite not achieving perfect invariance.
For linear models with confounders, we prove that Nash equilibria of these
games are closer to the ideal OOD solutions than the standard empirical risk
minimization (ERM) and we also provide learning algorithms that provably
converge to these Nash Equilibria. Empirical comparisons of the proposed
approach with the state-of-the-art show consistent gains in achieving OOD
solutions in several settings involving anti-causal variables and confounders.
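The abstract describes the approach only at a high level. As a rough illustration, a minimal sketch of projected gradient play for such an ensemble linear-regression game might look as follows: each training environment controls one component of the predictor, takes a gradient step on the squared loss of its own environment with the other components fixed, and is then clipped coordinate-wise onto an $\ell_{\infty}$ ball. The function names, the step size, the ball radius, the simultaneous-update scheme, and the synthetic data below are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def linf_project(w, radius):
    """Coordinate-wise clipping: the Euclidean projection onto the l_infinity ball."""
    return np.clip(w, -radius, radius)

def projected_gradient_play(envs, radius=1.0, lr=0.01, n_iters=2000):
    """Illustrative projected gradient play for an ensemble linear-regression game.

    envs: list of (X_e, y_e) pairs, one per training environment.
    Each environment (player) controls its own weight vector; the ensemble
    predictor is the sum of all players' weights. Every player takes a
    gradient step on its OWN environment's squared loss (others held fixed)
    and is projected onto an l_infinity ball. This is a sketch of the general
    scheme, not the paper's exact algorithm.
    """
    d = envs[0][0].shape[1]
    weights = [np.zeros(d) for _ in envs]          # one component per player
    for _ in range(n_iters):
        w_sum = np.sum(weights, axis=0)            # current ensemble predictor
        new_weights = []
        for (X_e, y_e), w_e in zip(envs, weights):
            residual = X_e @ w_sum - y_e           # prediction error on environment e
            grad = X_e.T @ residual / len(y_e)     # gradient w.r.t. player e's component
            new_weights.append(linf_project(w_e - lr * grad, radius))
        weights = new_weights                      # simultaneous update of all players
    return np.sum(weights, axis=0)                 # ensemble predictor at (approximate) equilibrium

# Example usage on two synthetic environments (illustrative only).
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(100, 3)), rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
envs = [(X1, X1 @ w_true + rng.normal(size=100)),
        (X2, X2 @ w_true + 2.0 * rng.normal(size=100))]
w_hat = projected_gradient_play(envs)
```

Coordinate-wise clipping is the exact Euclidean projection onto the $\ell_{\infty}$ ball, which is what makes this constraint convenient for simple iterative schemes of this kind.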
Related papers
- Independent RL for Cooperative-Competitive Agents: A Mean-Field Perspective [11.603515105957461]
We address in this paper Reinforcement Learning (RL) among agents that are grouped into teams such that there is cooperation within each team but general-sum competition across different teams.
arXiv Detail & Related papers (2024-03-17T21:11:55Z) - Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z) - Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games [66.2085181793014]
We show that a model-free stage-based Q-learning algorithm can achieve the same optimal dependence on the horizon $H$ as model-based algorithms.
Our algorithm features a key novel design of updating the reference value functions as the pair of optimistic and pessimistic value functions.
arXiv Detail & Related papers (2023-08-17T08:34:58Z) - Soft-Bellman Equilibrium in Affine Markov Games: Forward Solutions and
Inverse Learning [37.176741793213694]
We formulate a class of Markov games, termed affine Markov games, where an affine reward function couples the players' actions.
We introduce a novel solution concept, the soft-Bellman equilibrium, where each player is boundedly rational and chooses a soft-Bellman policy.
We then solve the inverse game problem of inferring the players' reward parameters from observed state-action trajectories via a projected-gradient algorithm.
arXiv Detail & Related papers (2023-03-31T22:50:47Z) - Differentiable Arbitrating in Zero-sum Markov Games [59.62061049680365]
We study how to perturb the reward in a zero-sum Markov game with two players to induce a desirable Nash equilibrium, namely arbitrating.
The lower level requires solving for the Nash equilibrium under a given reward function, which makes the overall problem challenging to optimize end to end.
We propose a backpropagation scheme that differentiates through the Nash equilibrium, which provides the gradient feedback for the upper level.
arXiv Detail & Related papers (2023-02-20T16:05:04Z) - Offline Learning in Markov Games with General Function Approximation [22.2472618685325]
We study offline multi-agent reinforcement learning (RL) in Markov games.
We provide the first framework for sample-efficient offline learning in Markov games.
arXiv Detail & Related papers (2023-02-06T05:22:27Z) - Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers [21.462231105582347]
We propose an algorithm for training agents in n-player, general-sum extensive form games, which provably converges to an equilibrium.
We also suggest correlated equilibria (CE) as promising meta-solvers, and propose a novel solution concept, Maximum Gini Correlated Equilibrium (MGCE).
We conduct several experiments using CE meta-solvers for JPSRO and demonstrate convergence on n-player, general-sum games.
arXiv Detail & Related papers (2021-06-17T12:34:18Z) - A Variational Inequality Approach to Bayesian Regression Games [90.79402153164587]
We prove the existence and uniqueness of the equilibrium for a class of convex cost functions and generalize the result to smooth cost functions.
We provide two simple algorithms for solving these games with strong convergence guarantees.
arXiv Detail & Related papers (2021-03-24T22:33:11Z) - Learning Zero-Sum Simultaneous-Move Markov Games Using Function
Approximation and Correlated Equilibrium [116.56359444619441]
We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games.
In the offline setting, we control both players and aim to find the Nash Equilibrium by minimizing the duality gap.
In the online setting, we control a single player playing against an arbitrary opponent and aim to minimize the regret.
arXiv Detail & Related papers (2020-02-17T17:04:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.