Adaptive Learning in Continuous Games: Optimal Regret Bounds and
Convergence to Nash Equilibrium
- URL: http://arxiv.org/abs/2104.12761v1
- Date: Mon, 26 Apr 2021 17:52:29 GMT
- Title: Adaptive Learning in Continuous Games: Optimal Regret Bounds and
Convergence to Nash Equilibrium
- Authors: Yu-Guan Hsieh, Kimon Antonakopoulos, Panayotis Mertikopoulos
- Abstract summary: No-regret algorithms are not created equal in terms of game-theoretic guarantees.
We propose a range of no-regret policies based on optimistic mirror descent.
- Score: 33.9962699667578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In game-theoretic learning, several agents are simultaneously following their
individual interests, so the environment is non-stationary from each player's
perspective. In this context, the performance of a learning algorithm is often
measured by its regret. However, no-regret algorithms are not created equal in
terms of game-theoretic guarantees: depending on how they are tuned, some of
them may drive the system to an equilibrium, while others could produce cyclic,
chaotic, or otherwise divergent trajectories. To account for this, we propose a
range of no-regret policies based on optimistic mirror descent, with the
following desirable properties: i) they do not require any prior tuning or
knowledge of the game; ii) they all achieve O(\sqrt{T}) regret against
arbitrary, adversarial opponents; and iii) they converge to the best response
against convergent opponents. Also, if employed by all players, then iv) they
guarantee O(1) social regret; while v) the induced sequence of play converges
to Nash equilibrium with O(1) individual regret in all variationally stable
games (a class of games that includes all monotone and convex-concave zero-sum
games).
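As a concrete illustration of the policies described above, the following sketch implements the Euclidean special case of optimistic mirror descent (optimistic gradient descent) with an AdaGrad-style adaptive step size, so it requires no prior tuning or knowledge of the game. The gradient oracle `grad`, the ball constraint, and the exact step-size rule are illustrative assumptions for this example and are not taken verbatim from the paper.

```python
import numpy as np

def adaptive_optimistic_gd(grad, x0, T, radius=1.0):
    """Optimistic gradient descent with an AdaGrad-style step size.

    Minimal sketch of the Euclidean case of optimistic mirror descent:
    grad(x) is assumed to return the player's (possibly noisy) loss
    gradient at the played point x. The step-size rule and constraint
    set are illustrative choices, not the paper's exact policies.
    """
    def project(x):
        # Euclidean projection onto the ball {x : ||x|| <= radius}.
        norm = np.linalg.norm(x)
        return x if norm <= radius else radius * x / norm

    x = np.asarray(x0, dtype=float)   # base iterate
    g_prev = np.zeros_like(x)         # last observed gradient (the optimistic guess)
    sum_sq = 1e-8                     # running sum of squared gradient norms
    played = []

    for _ in range(T):
        eta = 1.0 / np.sqrt(sum_sq)                # adaptive step size, no tuning needed
        x_lead = project(x - eta * g_prev)         # optimistic (leading) step
        g = np.asarray(grad(x_lead), dtype=float)  # play x_lead, observe feedback there
        x = project(x - eta * g)                   # base update with the fresh gradient
        g_prev = g
        sum_sq += float(np.dot(g, g))
        played.append(x_lead)

    return played
```

The abstract's guarantees (O(\sqrt{T}) regret against adversarial opponents, O(1) social regret, and convergence to Nash equilibrium in variationally stable games) are stated for the paper's general mirror-descent family; the sketch above only illustrates the structure of the update.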
Related papers
- Securing Equal Share: A Principled Approach for Learning Multiplayer Symmetric Games [21.168085154982712]
Equilibria in multiplayer games are neither unique nor non-exploitable.
This paper takes an initial step towards addressing these challenges by focusing on the natural objective of equal share.
We design a series of efficient algorithms, inspired by no-regret learning, that provably attain approximate equal share across various settings.
arXiv Detail & Related papers (2024-06-06T15:59:17Z)
- On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
arXiv Detail & Related papers (2023-01-26T17:25:45Z)
- Learning in Multi-Player Stochastic Games [1.0878040851638]
We consider the problem of simultaneous learning in games with many players in the finite-horizon setting.
While the typical target solution for a game is a Nash equilibrium, this is intractable with many players.
We turn to a different target: algorithms which generate an equilibrium when they are used by all players.
arXiv Detail & Related papers (2022-10-25T19:02:03Z)
- Regret Minimization and Convergence to Equilibria in General-sum Markov Games [57.568118148036376]
We present the first algorithm for learning in general-sum Markov games that provides sublinear regret guarantees when executed by all agents.
Our algorithm is decentralized, computationally efficient, and does not require any communication between agents.
arXiv Detail & Related papers (2022-07-28T16:27:59Z)
- No-Regret Learning in Games with Noisy Feedback: Faster Rates and Adaptivity via Learning Rate Separation [76.61911795703062]
We study the problem of regret minimization when the learner is involved in a continuous game with other optimizing agents.
In this case, if all players follow a no-regret algorithm, it is possible to achieve significantly lower regret relative to fully adversarial environments.
We propose a fully adaptive method that smoothly interpolates between worst- and best-case regret guarantees.
arXiv Detail & Related papers (2022-06-13T10:13:51Z)
- Towards convergence to Nash equilibria in two-team zero-sum games [17.4461045395989]
Two-team zero-sum games are defined as multi-player games where players are split into two competing sets of agents.
We focus on the solution concept of Nash equilibria (NE).
We show that computing NE for this class of games is hard for the complexity class CLS.
arXiv Detail & Related papers (2021-11-07T21:15:35Z)
- Hindsight and Sequential Rationality of Correlated Play [18.176128899338433]
We look at algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior.
We develop and advocate for this hindsight framing of learning in general sequential decision-making settings.
We present examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature.
arXiv Detail & Related papers (2020-12-10T18:30:21Z)
- No-regret learning and mixed Nash equilibria: They do not mix [64.37511607254115]
We study the dynamics of "follow-the-regularized-leader" (FTRL); a minimal sketch of an FTRL update is given after this list.
We show that any Nash equilibrium which is not strict cannot be stable and attracting under FTRL.
This result has significant implications for predicting the outcome of a learning process.
arXiv Detail & Related papers (2020-10-19T13:49:06Z)
- No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium [76.78447814623665]
We give the first uncoupled no-regret dynamics that converge to extensive-form correlated equilibria.
We introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games.
Our algorithm decomposes trigger regret into local subproblems at each decision point for the player, and constructs a global strategy of the player from the local solutions.
arXiv Detail & Related papers (2020-04-01T17:39:00Z)
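For additional context on the FTRL entry above, here is a minimal, illustrative sketch of the entropic instance of follow-the-regularized-leader (equivalent to multiplicative weights) over a finite action set. The payoff oracle `payoffs`, the constant learning rate, and the function name are assumptions made for this example only and do not come from any of the papers listed.

```python
import numpy as np

def ftrl_entropic(payoffs, n_actions, T, eta=0.1):
    """Follow-the-regularized-leader with an entropic regularizer.

    payoffs(x) is assumed to return the vector of per-action payoffs a
    player receives against the opponents' current play when using mixed
    strategy x; eta is an illustrative constant learning rate.
    """
    score = np.zeros(n_actions)   # cumulative payoff of each pure action
    trajectory = []
    for _ in range(T):
        # FTRL step: maximize <score, x> minus (1/eta) times negative entropy,
        # whose closed-form solution is the softmax of eta * score.
        z = eta * score
        x = np.exp(z - z.max())
        x /= x.sum()
        trajectory.append(x)
        score += np.asarray(payoffs(x), dtype=float)
    return trajectory
```

Under dynamics of this kind, the FTRL entry above shows that mixed Nash equilibria that are not strict cannot be stable and attracting.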
This list is automatically generated from the titles and abstracts of the papers listed on this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences of their use.