Learning not to Regret
- URL: http://arxiv.org/abs/2303.01074v2
- Date: Mon, 19 Feb 2024 21:55:51 GMT
- Title: Learning not to Regret
- Authors: David Sychrovský, Michal Šustr, Elnaz Davoodi, Michael Bowling,
Marc Lanctot, Martin Schmid
- Abstract summary: We present a novel "learning not to regret" framework, enabling us to meta-learn a regret minimizer tailored to a specific distribution.
Our key contribution, Neural Predictive Regret Matching, is uniquely meta-learned to converge rapidly for the chosen distribution of games.
Our experiments show that the meta-learned algorithms outpace their non-meta-learned counterparts, achieving more than tenfold improvements.
- Score: 19.945846614714167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The literature on game-theoretic equilibrium finding predominantly focuses on
single games or their repeated play. Nevertheless, numerous real-world
scenarios feature playing a game sampled from a distribution of similar, but
not identical games, such as playing poker with different public cards or
trading correlated assets on the stock market. As these similar games feature
similar equilibria, we investigate a way to accelerate equilibrium finding on
such a distribution. We present a novel "learning not to regret" framework,
enabling us to meta-learn a regret minimizer tailored to a specific
distribution. Our key contribution, Neural Predictive Regret Matching, is
uniquely meta-learned to converge rapidly for the chosen distribution of games,
while having regret minimization guarantees on any game. We validated our
algorithms' faster convergence on a distribution of river poker games. Our
experiments show that the meta-learned algorithms outpace their
non-meta-learned counterparts, achieving more than tenfold improvements.
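For context, the baseline that the paper's Neural Predictive Regret Matching meta-learns over is ordinary regret matching. A minimal sketch of plain self-play regret matching on a two-player zero-sum matrix game (not the paper's neural predictive variant; the example game is illustrative) looks like this:

```python
import numpy as np

def regret_matching(payoff, iters=100_000):
    """Self-play regret matching on a two-player zero-sum matrix game.

    payoff: row player's payoff matrix (column player receives -payoff).
    Returns the players' average strategies, which converge to a
    Nash equilibrium as the number of iterations grows.
    """
    n, m = payoff.shape
    regret_row, regret_col = np.zeros(n), np.zeros(m)
    avg_row, avg_col = np.zeros(n), np.zeros(m)

    def strategy(regret):
        # Play actions in proportion to positive cumulative regret;
        # fall back to uniform when no action has positive regret.
        pos = np.maximum(regret, 0.0)
        total = pos.sum()
        return pos / total if total > 0 else np.full(regret.size, 1.0 / regret.size)

    for _ in range(iters):
        x, y = strategy(regret_row), strategy(regret_col)
        avg_row += x
        avg_col += y
        # Value of each pure action against the opponent's current mix.
        u_row = payoff @ y
        u_col = -(payoff.T @ x)
        # Accumulate regret: action value minus the value actually obtained.
        regret_row += u_row - x @ u_row
        regret_col += u_col - y @ u_col

    return avg_row / iters, avg_col / iters
```

The meta-learned variant in the paper replaces the hand-crafted prediction of future regret with a learned one, while retaining the same positive-regret-matching projection that guarantees no-regret on any game.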
Related papers
- Meta-Learning in Self-Play Regret Minimization [10.843705580746397]
We present a general approach to online optimization which plays a crucial role in many algorithms for approximating Nash equilibria in two-player zero-sum games.
We build upon this, extending the framework to the more challenging self-play setting, which is the basis for most state-of-the-art equilibrium approximation algorithms.
Our meta-learned algorithms considerably outperform other state-of-the-art regret minimization algorithms.
arXiv Detail & Related papers (2025-04-26T13:27:24Z) - Swap Regret and Correlated Equilibria Beyond Normal-Form Games [62.01542145970044]
We present a new variant of swap regret for polytope games that we call "profile swap regret".
Although profile swap regret is NP-hard to compute given a transcript of play, we show it is nonetheless possible to design efficient learning algorithms that guarantee at most $O(\sqrt{T})$ profile swap regret.
arXiv Detail & Related papers (2025-02-27T16:16:26Z) - Securing Equal Share: A Principled Approach for Learning Multiplayer Symmetric Games [21.168085154982712]
Equilibria in multiplayer games are neither unique nor non-exploitable.
This paper takes an initial step towards addressing these challenges by focusing on the natural objective of equal share.
We design a series of efficient algorithms, inspired by no-regret learning, that provably attain approximate equal share across various settings.
arXiv Detail & Related papers (2024-06-06T15:59:17Z) - No Algorithmic Collusion in Two-Player Blindfolded Game with Thompson Sampling [10.376707874029561]
We show that when the players use Thompson sampling, the game dynamics converge to the Nash equilibrium under a mild assumption on the payoff matrices.
Algorithmic collusion does not arise in this case, despite the fact that the players do not intentionally deploy competitive strategies.
arXiv Detail & Related papers (2024-05-23T08:21:48Z) - Is Learning in Games Good for the Learners? [14.781100349601587]
We consider tradeoffs between reward and regret in repeated gameplay between two agents.
We show that any such equilibrium is reachable by a pair of algorithms which maintain their regret guarantees against arbitrary opponents.
We also consider the question of learning reward-optimal strategies via repeated play with a no-regret agent when the game is initially unknown.
arXiv Detail & Related papers (2023-05-31T02:10:27Z) - Hardness of Independent Learning and Sparse Equilibrium Computation in
Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games.
We show that no algorithm attains no-regret in general-sum games when executed independently by all players.
We show that our lower bounds hold even for the seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z) - On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
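The optimistic update analyzed in this line of work extrapolates with the previous gradient as a prediction of the current one. A minimal sketch of plain optimistic gradient descent-ascent on a bilinear zero-sum game (step size and game are illustrative choices, not from the paper) is:

```python
import numpy as np

def ogda_bilinear(A, x0, y0, eta=0.1, iters=5000):
    """Optimistic gradient descent-ascent on min_x max_y x^T A y.

    Each player steps with 2*g_t - g_{t-1}: the previous gradient serves
    as a prediction of the current one, which stabilizes the rotation
    that plain simultaneous gradient descent-ascent exhibits.
    """
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    gx_prev = np.zeros_like(x)
    gy_prev = np.zeros_like(y)
    for _ in range(iters):
        gx = A @ y           # gradient w.r.t. x (min player descends)
        gy = -(A.T @ x)      # max player ascends, i.e. descends on -f
        x = x - eta * (2.0 * gx - gx_prev)
        y = y - eta * (2.0 * gy - gy_prev)
        gx_prev, gy_prev = gx, gy
    return x, y
```

On a bilinear game the iterates themselves (not just their average) approach the equilibrium, which is the kind of last-iterate behavior the convergence bounds above quantify.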
arXiv Detail & Related papers (2023-01-26T17:25:45Z) - Finding mixed-strategy equilibria of continuous-action games without
gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of a continuous-action game without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z) - Learning in Multi-Player Stochastic Games [1.0878040851638]
We consider the problem of simultaneous learning in games with many players in the finite-horizon setting.
While the typical target solution for a game is a Nash equilibrium, this is intractable with many players.
We turn to a different target: algorithms which generate an equilibrium when they are used by all players.
arXiv Detail & Related papers (2022-10-25T19:02:03Z) - Learning Correlated Equilibria in Mean-Field Games [62.14589406821103]
We develop the concepts of Mean-Field correlated and coarse-correlated equilibria.
We show that they can be efficiently learnt in all games, without requiring any additional assumption on the structure of the game.
arXiv Detail & Related papers (2022-08-22T08:31:46Z) - Regret Minimization and Convergence to Equilibria in General-sum Markov
Games [57.568118148036376]
We present the first algorithm for learning in general-sum Markov games that provides sublinear regret guarantees when executed by all agents.
Our algorithm is decentralized, computationally efficient, and does not require any communication between agents.
arXiv Detail & Related papers (2022-07-28T16:27:59Z) - No-Regret Learning in Time-Varying Zero-Sum Games [99.86860277006318]
Learning from repeated play in a fixed zero-sum game is a classic problem in game theory and online learning.
We develop a single parameter-free algorithm that simultaneously enjoys favorable guarantees under three performance measures.
Our algorithm is based on a two-layer structure with a meta-algorithm learning over a group of black-box base-learners satisfying a certain property.
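A common way to realize such a two-layer structure is to run a multiplicative-weights (Hedge) meta-algorithm over the black-box base-learners; the sketch below shows only that generic meta-layer, with the base-learners abstracted into a per-round loss matrix (the actual base-learner property used in the paper is not modeled here):

```python
import numpy as np

def hedge_over_base_learners(losses, eta=0.5):
    """Meta-algorithm: multiplicative weights (Hedge) over base-learners.

    losses[t, i] is base-learner i's loss in round t, assumed in [0, 1].
    Returns the sequence of weight vectors the meta-algorithm would use
    to mix the base-learners' recommendations each round.
    """
    T, k = losses.shape
    w = np.full(k, 1.0 / k)  # start uniform over the base-learners
    history = np.empty((T, k))
    for t in range(T):
        history[t] = w
        # Exponentially downweight base-learners by their observed loss.
        w = w * np.exp(-eta * losses[t])
        w /= w.sum()
    return history
```

The appeal of this design is modularity: the meta-layer needs no knowledge of how each base-learner works, only its realized losses, so guarantees for the combined algorithm follow from the meta-layer's regret bound against the best base-learner.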
arXiv Detail & Related papers (2022-01-30T06:10:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.