Asynchronous Gradient Play in Zero-Sum Multi-agent Games
- URL: http://arxiv.org/abs/2211.08980v1
- Date: Wed, 16 Nov 2022 15:37:23 GMT
- Title: Asynchronous Gradient Play in Zero-Sum Multi-agent Games
- Authors: Ruicheng Ao, Shicong Cen, Yuejie Chi
- Abstract summary: We study asynchronous gradient play in zero-sum polymatrix games under delayed feedback.
To the best of our knowledge, this work is the first that aims to understand asynchronous gradient play in zero-sum polymatrix games.
- Score: 25.690033495071923
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finding equilibria via gradient play in competitive multi-agent games has
been attracting a growing amount of attention in recent years, with emphasis on
designing efficient strategies where the agents operate in a decentralized and
symmetric manner with guaranteed convergence. While significant efforts have
been made in understanding zero-sum two-player matrix games, the performance in
zero-sum multi-agent games remains inadequately explored, especially in the
presence of delayed feedback, leaving the scalability and resiliency of
gradient play open to question.
In this paper, we make progress by studying asynchronous gradient play in
zero-sum polymatrix games under delayed feedback. We first establish that the
last iterate of entropy-regularized optimistic multiplicative weight updates
(OMWU) method converges linearly to the quantal response equilibrium (QRE), the
solution concept under bounded rationality, in the absence of delays. The
linear convergence continues to hold even when the feedback is randomly
delayed under mild statistical assumptions, albeit at a noticeably slower
rate due to a smaller tolerable range of learning rates. Moving beyond, we
demonstrate that, by adopting two-timescale learning rates in a delay-aware
manner, entropy-regularized OMWU enjoys faster last-iterate convergence under
fixed delays, and continues to converge provably even when the delays are
arbitrarily bounded in an average-iterate manner. Our methods also lead to
finite-time guarantees to approximate the Nash equilibrium (NE) by moderating
the amount of regularization. To the best of our knowledge, this work is the
first that aims to understand asynchronous gradient play in zero-sum polymatrix
games under a wide range of delay assumptions, highlighting the role of
learning rate separation.
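For intuition, the following is a minimal sketch (not taken from the paper) of how an entropy-regularized OMWU update with delayed feedback might look on a two-player zero-sum matrix game, the simplest special case of a polymatrix game. The function name, the single-timescale learning rate, and the way delayed midpoints are buffered are illustrative assumptions on my part; the paper's actual analysis covers polymatrix games, two-timescale delay-aware learning rates, and broader delay models.

```python
import numpy as np

def entropy_reg_omwu(A, eta=0.05, tau=0.05, delay=0, num_iters=5000):
    """Hedged sketch of entropy-regularized OMWU on max_x min_y x^T A y
    with an optional fixed feedback delay. The update form follows the
    entropy-regularized OMWU recursion from this line of work; the delay
    bookkeeping and constants here are illustrative, not the authors' code.
    """
    m, n = A.shape
    # Current iterates and optimistic "midpoint" iterates, kept in log space
    # for numerical stability (all start from the uniform distribution).
    log_x = np.full(m, -np.log(m)); log_xbar = log_x.copy()
    log_y = np.full(n, -np.log(n)); log_ybar = log_y.copy()

    # Buffers of past midpoints so each player only sees delayed feedback.
    hist_xbar, hist_ybar = [np.exp(log_xbar)], [np.exp(log_ybar)]

    def normalize(log_p):
        # Project back onto the simplex: subtract log of the partition sum.
        return log_p - np.logaddexp.reduce(log_p)

    for t in range(num_iters):
        d = min(delay, t)  # how stale the received feedback is
        xbar_fb, ybar_fb = hist_xbar[-1 - d], hist_ybar[-1 - d]

        # Optimistic (prediction) step using delayed opponent midpoints:
        # xbar_{t+1}(a) ~ x_t(a)^{1 - eta*tau} * exp(eta * [A ybar](a)).
        log_xbar = normalize((1 - eta * tau) * log_x + eta * (A @ ybar_fb))
        log_ybar = normalize((1 - eta * tau) * log_y - eta * (A.T @ xbar_fb))

        # Main step; for simplicity this sketch feeds in the fresh midpoints,
        # whereas the asynchronous setting would again use delayed ones.
        log_x = normalize((1 - eta * tau) * log_x + eta * (A @ np.exp(log_ybar)))
        log_y = normalize((1 - eta * tau) * log_y - eta * (A.T @ np.exp(log_xbar)))

        hist_xbar.append(np.exp(log_xbar)); hist_ybar.append(np.exp(log_ybar))

    return np.exp(log_x), np.exp(log_y)

# Example: matching pennies. With small tau the returned pair approximates
# the Nash equilibrium (uniform play), consistent with the QRE-to-NE limit
# obtained by moderating the amount of regularization.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = entropy_reg_omwu(A, eta=0.05, tau=0.05, delay=2)
print(x, y)
```

The log-space normalization is only a numerical convenience; the recursion is the usual multiplicative-weights form with the extra x_t^(1 - eta*tau) factor contributed by entropy regularization, which is what yields convergence to the QRE rather than the NE.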
Related papers
- Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling [73.5602474095954]
We study the non-asymptotic performance of stochastic approximation schemes with delayed updates under Markovian sampling.
Our theoretical findings shed light on the finite-time effects of delays for a broad class of algorithms.
arXiv Detail & Related papers (2024-02-19T03:08:02Z)
- Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations [98.5802673062712]
We introduce temporally-coupled perturbations, presenting a novel challenge for existing robust reinforcement learning methods.
We propose GRAD, a novel game-theoretic approach that treats the temporally-coupled robust RL problem as a partially observable two-player zero-sum game.
arXiv Detail & Related papers (2023-07-22T12:10:04Z)
- On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
arXiv Detail & Related papers (2023-01-26T17:25:45Z)
- Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games [63.60117916422867]
This paper focuses on the most basic setting of competitive multi-agent RL, namely two-player zero-sum Markov games.
We propose a single-loop policy optimization method with symmetric updates from both agents, where the policy is updated via the entropy-regularized optimistic multiplicative weights update (OMWU) method.
Our convergence results improve upon the best known complexities, and lead to a better understanding of policy optimization in competitive Markov games.
arXiv Detail & Related papers (2022-10-03T16:05:43Z)
- Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization [40.21627891283402]
This paper investigates the problem of computing the equilibrium of competitive games.
Motivated by the algorithmic role of entropy regularization, we develop provably efficient extragradient methods.
arXiv Detail & Related papers (2021-05-31T17:51:15Z)
- Linear Last-iterate Convergence in Constrained Saddle-point Optimization [48.44657553192801]
We significantly expand the understanding of last-iterate convergence for Optimistic Gradient Descent Ascent (OGDA) and Optimistic Multiplicative Weights Update (OMWU).
We show that when the equilibrium is unique, linear last-iterate convergence is achieved with a learning rate set to a universal constant.
We show that bilinear games over any polytope satisfy this condition and OGDA converges exponentially fast even without the unique equilibrium assumption.
arXiv Detail & Related papers (2020-06-16T20:53:04Z)
- Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games [116.0771177871705]
We characterize the finite-time last-iterate convergence rate for joint OGD learning on $\lambda$-cocoercive games.
We show, via a novel double-stopping time technique, that this adaptive algorithm achieves the same finite-time last-iterate convergence rate as its non-adaptive counterpart.
arXiv Detail & Related papers (2020-02-23T01:46:34Z)