The equivalence of dynamic and strategic stability under regularized
learning in games
- URL: http://arxiv.org/abs/2311.02407v1
- Date: Sat, 4 Nov 2023 14:07:33 GMT
- Title: The equivalence of dynamic and strategic stability under regularized
learning in games
- Authors: Victor Boone and Panayotis Mertikopoulos
- Abstract summary: We examine the long-run behavior of regularized, no-regret learning in finite games.
We obtain an equivalence between strategic and dynamic stability.
We show that methods based on entropic regularization converge at a geometric rate.
- Score: 33.74394172275373
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we examine the long-run behavior of regularized, no-regret
learning in finite games. A well-known result in the field states that the
empirical frequencies of no-regret play converge to the game's set of coarse
correlated equilibria; however, our understanding of how the players' actual
strategies evolve over time is much more limited - and, in many cases,
non-existent. This issue is exacerbated further by a series of recent results
showing that only strict Nash equilibria are stable and attracting under
regularized learning, thus making the relation between learning and pointwise
solution concepts particularly elusive. In view of this, we take a more general approach and instead seek to characterize the setwise rationality
properties of the players' day-to-day play. To that end, we focus on one of the
most stringent criteria of setwise strategic stability, namely that any
unilateral deviation from the set in question incurs a cost to the deviator - a
property known as closedness under better replies (club). In so doing, we
obtain a far-reaching equivalence between strategic and dynamic stability: a
product of pure strategies is closed under better replies if and only if its
span is stable and attracting under regularized learning. In addition, we
estimate the rate of convergence to such sets, and we show that methods based
on entropic regularization (like the exponential weights algorithm) converge at
a geometric rate, while projection-based methods converge within a finite
number of iterations, even with bandit, payoff-based feedback.
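To make the dynamics concrete, here is a minimal sketch of the exponential weights algorithm named in the abstract, run on a 2x2 coordination game. The payoff matrix, step size, and horizon are illustrative assumptions, not taken from the paper; the geometric rate shows up as the exponentially fast concentration of the logit strategies on a strict equilibrium.

```python
import numpy as np

def exp_weights_step(scores, payoffs, opp_strategy, eta=0.1):
    """One step of exponential weights: accumulate each action's expected
    payoff, then play the logit (softmax) image of the score vector."""
    scores = scores + eta * payoffs @ opp_strategy
    z = scores - scores.max()                 # stabilize the exponentials
    strategy = np.exp(z) / np.exp(z).sum()
    return scores, strategy

# Illustrative 2x2 coordination game: (0, 0) and (1, 1) are strict Nash
# equilibria, hence stable and attracting spans in the sense of the paper.
A = np.array([[2.0, 0.0],                     # payoffs[own_action, opp_action]
              [0.0, 1.0]])
s1, s2 = np.zeros(2), np.zeros(2)
x, y = np.full(2, 0.5), np.full(2, 0.5)       # start from uniform play
for _ in range(200):
    s1, x = exp_weights_step(s1, A, y)
    s2, y = exp_weights_step(s2, A, x)        # symmetric game: same matrix
print(x, y)  # both players concentrate (geometrically) on action 0
```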
Related papers
- Learning Nash Equilibria in Zero-Sum Markov Games: A Single Time-scale Algorithm Under Weak Reachability [11.793922711718645]
We consider decentralized learning for zero-sum games, where players only see their own information and are oblivious to the actions and payoffs of the opponent.
Previous works demonstrated convergence to a Nash equilibrium in this setting using double time-scale algorithms under strong reachability assumptions.
Our contribution is a rational and convergent algorithm, utilizing Tsallis-entropy regularization in a value-iteration-based algorithm.
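As a rough illustration of what Tsallis-entropy regularization does, the sketch below computes a regularized best response for entropy index q = 1/2 from the closed-form first-order conditions. The temperature and the bisection solver are assumptions for illustration; this is the Tsallis-style choice map, not the paper's value-iteration algorithm.

```python
import numpy as np

def tsallis_response(values, tau=0.1):
    """Maximize <p, values> + tau * S_q(p) over the simplex for q = 1/2, where
    S_q(p) = (1 - sum_i p_i**q) / (q - 1). The first-order conditions give
    p_i = (tau / (lam - values_i))**2 with a normalizer lam > max(values),
    which we locate by bisection."""
    lo = values.max() + 1e-12
    hi = values.max() + tau * np.sqrt(len(values)) + 1.0
    for _ in range(100):
        lam = 0.5 * (lo + hi)
        mass = np.sum((tau / (lam - values)) ** 2)
        lo, hi = (lam, hi) if mass > 1.0 else (lo, lam)
    return (tau / (lam - values)) ** 2

print(tsallis_response(np.array([1.0, 0.5, 0.0])))  # most mass on the best action
```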
arXiv Detail & Related papers (2023-12-13T09:31:30Z)
- Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations [98.5802673062712]
We introduce temporally-coupled perturbations, presenting a novel challenge for existing robust reinforcement learning methods.
We propose GRAD, a novel game-theoretic approach that treats the temporally-coupled robust RL problem as a partially observable two-player zero-sum game.
arXiv Detail & Related papers (2023-07-22T12:10:04Z)
- On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
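For reference, a minimal sketch of the optimistic gradient descent update analyzed there, run on a fixed zero-sum matrix game for simplicity (the time-varying setting would swap in a matrix A_t per round). The game, step size, and horizon are assumed for illustration.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u > css / (np.arange(len(v)) + 1.0))[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

A = np.array([[1.0, -1.0],                 # matching pennies: x minimizes x^T A y
              [-1.0, 1.0]])
eta = 0.1
x, y = np.array([0.7, 0.3]), np.array([0.4, 0.6])
gx_prev, gy_prev = np.zeros(2), np.zeros(2)
for _ in range(2000):
    gx, gy = A @ y, A.T @ x                # players' payoff gradients
    x = project_simplex(x - eta * (2.0 * gx - gx_prev))   # optimistic step:
    y = project_simplex(y + eta * (2.0 * gy - gy_prev))   # extrapolate with g_{t-1}
    gx_prev, gy_prev = gx, gy
print(x, y)  # the last iterate approaches the equilibrium (1/2, 1/2)
```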
arXiv Detail & Related papers (2023-01-26T17:25:45Z)
- Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of a continuous-action game without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
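A sketch of what "modeling strategies with neural networks" can look like: a tiny NumPy policy network that pushes Gaussian noise through an MLP to sample continuous actions, so its output distribution acts as an unrestricted mixed strategy. The architecture and sizes are illustrative assumptions; training would proceed gradient-free (e.g., by zeroth-order search), which is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(noise_dim=4, hidden=16, action_dim=1):
    """Hypothetical randomized policy network: weights of a small MLP."""
    return {"W1": 0.5 * rng.normal(size=(noise_dim, hidden)),
            "b1": np.zeros(hidden),
            "W2": 0.5 * rng.normal(size=(hidden, action_dim)),
            "b2": np.zeros(action_dim)}

def sample_actions(params, n=1):
    """Push latent Gaussian noise forward to actions in [-1, 1]; the
    pushforward distribution is the player's mixed strategy."""
    z = rng.normal(size=(n, params["W1"].shape[0]))
    h = np.tanh(z @ params["W1"] + params["b1"])
    return np.tanh(h @ params["W2"] + params["b2"])

params = init_params()
print(sample_actions(params, n=5).ravel())  # five draws from the mixed strategy
```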
arXiv Detail & Related papers (2022-11-29T05:16:41Z)
- On the convergence of policy gradient methods to Nash equilibria in general stochastic games [33.786186304912]
We study the long-run behavior of policy gradient methods with respect to Nash equilibrium policies.
We show that policy gradient trajectories with gradient estimates provided by the REINFORCE algorithm achieve an $\mathcal{O}(1/\sqrt{n})$ distance-squared convergence rate.
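A hedged sketch of the REINFORCE gradient estimate in the simplest possible setting, a one-shot finite game with a softmax policy (the paper works with stochastic games, so this only illustrates the estimator itself); the payoffs, batch size, and step size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def reinforce_grad(theta, opp_strategy, payoffs, batch=256):
    """Score-function estimate of the payoff gradient for a softmax policy:
    E[u(a, b) * grad_theta log pi_theta(a)], from sampled action pairs."""
    z = theta - theta.max()
    pi = np.exp(z) / np.exp(z).sum()
    grad = np.zeros_like(theta)
    for _ in range(batch):
        a = rng.choice(len(pi), p=pi)                      # own action
        b = rng.choice(len(opp_strategy), p=opp_strategy)  # opponent's action
        score = -pi.copy()
        score[a] += 1.0                                    # grad log pi_theta(a)
        grad += payoffs[a, b] * score
    return grad / batch

A = np.array([[2.0, 0.0], [0.0, 1.0]])                     # assumed payoffs
theta = np.zeros(2)
for _ in range(50):                                        # a few ascent steps
    theta += 0.5 * reinforce_grad(theta, np.array([0.5, 0.5]), A)
print(theta)  # drifts toward the action that is better against uniform play
```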
arXiv Detail & Related papers (2022-10-17T08:51:59Z)
- A unified stochastic approximation framework for learning in games [82.74514886461257]
We develop a flexible stochastic approximation framework for analyzing the long-run behavior of learning in games (both continuous and finite).
The proposed analysis template incorporates a wide array of popular learning algorithms, including gradient-based methods, exponential/multiplicative weights for learning in finite games, optimistic and bandit variants of the above, etc.
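The common template underlying such analyses is the Robbins-Monro scheme x_{n+1} = x_n + gamma_n (v(x_n) + noise); a toy instance with an assumed one-dimensional drift is sketched below.

```python
import numpy as np

rng = np.random.default_rng(1)

def v(x):
    """Assumed toy drift with rest point x* = 1; in a game, v would be the
    players' (estimated) payoff field."""
    return 1.0 - x

x = 0.0
for n in range(1, 5001):
    gamma = 1.0 / n                    # sum gamma_n = inf, sum gamma_n^2 < inf
    x += gamma * (v(x) + rng.normal(scale=0.5))   # noisy Robbins-Monro step
print(x)  # tracks the flow x' = v(x) and converges to 1.0 almost surely
```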
arXiv Detail & Related papers (2022-06-08T14:30:38Z)
- Learning Equilibria in Matching Markets from Bandit Feedback [139.29934476625488]
We develop a framework and algorithms for learning stable market outcomes under uncertainty.
Our work takes a first step toward elucidating when and how stable matchings arise in large, data-driven marketplaces.
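For context, a minimal sketch of the deferred-acceptance (Gale-Shapley) subroutine that stable-matching algorithms build on; in the bandit-feedback setting, the preference lists would be replaced by confidence-bound estimates of the agents' utilities, which is not shown here. The preferences are assumed examples.

```python
def deferred_acceptance(prefs_w, prefs_f):
    """Workers propose down their preference lists; each firm holds the best
    proposal seen so far. Returns a stable worker -> firm matching."""
    n = len(prefs_w)
    next_choice = [0] * n
    held = [None] * n                         # firm -> worker it currently holds
    free = list(range(n))
    while free:
        w = free.pop()
        f = prefs_w[w][next_choice[w]]
        next_choice[w] += 1
        if held[f] is None:
            held[f] = w
        elif prefs_f[f].index(w) < prefs_f[f].index(held[f]):
            free.append(held[f])              # firm trades up, old worker re-enters
            held[f] = w
        else:
            free.append(w)                    # proposal rejected
    return {w: f for f, w in enumerate(held)}

prefs_w = [[0, 1, 2], [1, 0, 2], [0, 2, 1]]   # assumed 3x3 preferences
prefs_f = [[2, 1, 0], [0, 2, 1], [2, 0, 1]]
print(deferred_acceptance(prefs_w, prefs_f))  # stable worker -> firm map
```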
arXiv Detail & Related papers (2021-08-19T17:59:28Z)
- Survival of the strictest: Stable and unstable equilibria under regularized learning with partial information [32.384868685390906]
We examine the Nash equilibrium convergence properties of no-regret learning in general N-player games.
We establish a comprehensive equivalence between the stability of a Nash equilibrium and its support.
This equivalence provides a clear refinement criterion for predicting the day-to-day behavior of no-regret learning in games.
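Since the equivalence singles out strict equilibria, a small sketch of how one might test strictness in a two-player game follows (the game and profiles are assumed examples): a pure profile is a strict Nash equilibrium when every unilateral deviation strictly lowers the deviator's payoff.

```python
import numpy as np

def is_strict_equilibrium(payoffs, profile):
    """payoffs = (P0, P1) with P_k[a0, a1] the payoff of player k at (a0, a1).
    Returns True iff `profile` is a strict (pure) Nash equilibrium."""
    (P0, P1), (a0, a1) = payoffs, profile
    col0 = P0[:, a1]                    # player 0's payoffs vs the fixed a1
    col1 = P1[a0, :]                    # player 1's payoffs vs the fixed a0
    return (np.sum(col0 >= col0[a0]) == 1) and (np.sum(col1 >= col1[a1]) == 1)

A = np.array([[2.0, 0.0], [0.0, 1.0]])          # symmetric coordination game
print(is_strict_equilibrium((A, A.T), (0, 0)))  # True: stable and attracting
print(is_strict_equilibrium((A, A.T), (0, 1)))  # False: not even an equilibrium
```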
arXiv Detail & Related papers (2021-01-12T18:55:11Z)
- Learning from History for Byzantine Robust Optimization [52.68913869776858]
Byzantine robustness has received significant attention recently given its importance for distributed learning.
We show that most existing robust aggregation rules may not converge even in the absence of any Byzantine attackers.
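As one concrete example from this line of work, below is a hedged sketch of a centered-clipping aggregator, which uses the previous aggregate (the history) as an anchor so that a few Byzantine updates cannot drag the result far; the radius and iteration count are illustrative choices.

```python
import numpy as np

def centered_clip(updates, prev_agg, radius=1.0, iters=3):
    """Clip each worker update to a ball around the running center (seeded by
    the previous aggregate), then average; a few outliers move it only by
    O(radius) per step instead of arbitrarily far."""
    center = prev_agg
    for _ in range(iters):
        diffs = updates - center
        norms = np.linalg.norm(diffs, axis=1, keepdims=True)
        scale = np.minimum(1.0, radius / np.maximum(norms, 1e-12))
        center = center + (diffs * scale).mean(axis=0)
    return center

# Nine honest workers agree; one Byzantine worker sends a huge vector.
honest = np.tile(np.array([1.0, 1.0]), (9, 1))
byzantine = np.array([[1e6, -1e6]])
updates = np.vstack([honest, byzantine])
print(centered_clip(updates, prev_agg=np.zeros(2)))  # stays near (1, 1)
```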
arXiv Detail & Related papers (2020-12-18T16:22:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.