Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
- URL: http://arxiv.org/abs/2208.02204v1
- Date: Wed, 3 Aug 2022 16:41:01 GMT
- Title: Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
- Authors: Fivos Kalogiannis, Ioannis Anagnostides, Ioannis Panageas,
Emmanouil-Vasileios Vlatakis-Gkaragkounis, Vaggos Chatziafratis, Stelios
Stavroulakis
- Abstract summary: We introduce a class of games in which a team identically players is competing against an adversarial player.
This setting allows for a unifying treatment of zero-sum Markov games potential games.
Our main contribution is the first algorithm for computing stationary $epsilon$-approximate Nash equilibria in adversarial team Markov games.
- Score: 19.717850955051837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computing Nash equilibrium policies is a central problem in multi-agent
reinforcement learning that has received extensive attention both in theory and
in practice. However, provable guarantees have been thus far either limited to
fully competitive or cooperative scenarios or impose strong assumptions that
are difficult to meet in most practical applications. In this work, we depart
from those prior results by investigating infinite-horizon \emph{adversarial
team Markov games}, a natural and well-motivated class of games in which a team
of identically-interested players -- in the absence of any explicit
coordination or communication -- is competing against an adversarial player.
This setting allows for a unifying treatment of zero-sum Markov games and
Markov potential games, and serves as a step to model more realistic strategic
interactions that feature both competing and cooperative interests. Our main
contribution is the first algorithm for computing stationary
$\epsilon$-approximate Nash equilibria in adversarial team Markov games with
computational complexity that is polynomial in all the natural parameters of
the game, as well as $1/\epsilon$. The proposed algorithm is particularly
natural and practical, and it is based on performing independent policy
gradient steps for each player in the team, in tandem with best responses from
the side of the adversary; in turn, the policy for the adversary is then
obtained by solving a carefully constructed linear program. Our analysis
leverages non-standard techniques to establish the KKT optimality conditions
for a nonlinear program with nonconvex constraints, thereby leading to a
natural interpretation of the induced Lagrange multipliers. Along the way, we
significantly extend an important characterization of optimal policies in
adversarial (normal-form) team games due to Von Stengel and Koller (GEB `97).
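
The algorithmic scheme described in the abstract lends itself to a compact illustration. The following is a minimal, hypothetical Python sketch of the high-level loop only: independent projected policy-gradient steps for the team players, interleaved with best responses by the adversary. The helpers `value_grad` and `best_response` are assumed oracles (not specified by the paper), and the linear program that ultimately recovers the adversary's equilibrium policy is omitted.

```python
import numpy as np

def project_simplex(x):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(x)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(x) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(x - theta, 0.0)

def independent_pg_vs_best_response(team_policies, value_grad, best_response,
                                    eta=0.01, iters=1000):
    """Hypothetical sketch of the scheme sketched in the abstract:
    each team player takes an independent projected policy-gradient step
    while the adversary best-responds to the current team policy.

    team_policies : list of (num_states, num_actions) row-stochastic arrays
    value_grad    : callable(team_policies, adv_policy, i) -> gradient of the
                    team's value w.r.t. player i's policy (assumed oracle)
    best_response : callable(team_policies) -> adversary policy minimizing the
                    team's value (assumed oracle, e.g. a single-agent MDP solver)
    """
    for _ in range(iters):
        adv_policy = best_response(team_policies)       # adversary best-responds
        for i, pi in enumerate(team_policies):          # independent team updates
            grad = value_grad(team_policies, adv_policy, i)
            stepped = pi + eta * grad                   # ascent on the team's value
            team_policies[i] = np.apply_along_axis(project_simplex, 1, stepped)
    # The paper then recovers the adversary's equilibrium policy by solving a
    # separately constructed linear program; that step is not reproduced here.
    return team_policies
```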
Related papers
- Tractable Equilibrium Computation in Markov Games through Risk Aversion [12.980882140751895]
We show that risk-averse quantal response equilibria (RQE) become tractable to compute in all $n$-player matrix and finite-horizon Markov games.
This tractability is independent of the underlying game structure and depends only on the agents' degree of risk-aversion and bounded rationality.
arXiv Detail & Related papers (2024-06-20T09:53:56Z) - Optimistic Policy Gradient in Multi-Player Markov Games with a Single
Controller: Convergence Beyond the Minty Property [89.96815099996132]
We develop a new framework to characterize optimistic policy gradient methods in multi-player games with a single controller.
Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.
arXiv Detail & Related papers (2023-12-19T11:34:10Z) - Soft-Bellman Equilibrium in Affine Markov Games: Forward Solutions and
Inverse Learning [37.176741793213694]
We formulate a class of Markov games, termed affine Markov games, where an affine reward function couples the players' actions.
We introduce a novel solution concept, the soft-Bellman equilibrium, where each player is boundedly rational and chooses a soft-Bellman policy.
We then solve the inverse game problem of inferring the players' reward parameters from observed state-action trajectories via a projected-gradient algorithm.
arXiv Detail & Related papers (2023-03-31T22:50:47Z) - Breaking the Curse of Multiagents in a Large State Space: RL in Markov
Games with Independent Linear Function Approximation [56.715186432566576]
We propose a new model, independent linear Markov game, for reinforcement learning with a large state space and a large number of agents.
We design new algorithms for learning Markov coarse correlated equilibria (CCE) and Markov correlated equilibria (CE) with sample complexity bounds that scale only with each agent's own function class complexity.
Our algorithms rely on two key technical innovations: (1) utilizing policy replay to tackle the non-stationarity incurred by multiple agents and the use of function approximation; and (2) separating the learning of Markov equilibria from exploration in the Markov games.
arXiv Detail & Related papers (2023-02-07T18:47:48Z) - Offline Learning in Markov Games with General Function Approximation [22.2472618685325]
We study offline multi-agent reinforcement learning (RL) in Markov games.
We provide the first framework for sample-efficient offline learning in Markov games.
arXiv Detail & Related papers (2023-02-06T05:22:27Z) - Finding mixed-strategy equilibria of continuous-action games without
gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of a continuous-action game without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z) - Learning Correlated Equilibria in Mean-Field Games [62.14589406821103]
We develop the concepts of Mean-Field correlated and coarse-correlated equilibria.
We show that they can be efficiently learnt in \emph{all} games, without requiring any additional assumption on the structure of the game.
arXiv Detail & Related papers (2022-08-22T08:31:46Z) - Provably Efficient Fictitious Play Policy Optimization for Zero-Sum
Markov Games with Structured Transitions [145.54544979467872]
We propose and analyze new fictitious play policy optimization algorithms for zero-sum Markov games with structured but unknown transitions.
We prove tight $\widetilde{\mathcal{O}}(\sqrt{K})$ regret bounds after $K$ episodes in a two-agent competitive game scenario.
Our algorithms feature a combination of Upper Confidence Bound (UCB)-type optimism and fictitious play under the scope of simultaneous policy optimization.
arXiv Detail & Related papers (2022-07-25T18:29:16Z) - Towards convergence to Nash equilibria in two-team zero-sum games [17.4461045395989]
Two-team zero-sum games are defined as multi-player games where players are split into two competing sets of agents.
We focus on the solution concept of Nash equilibria (NE).
We show that computing NE for this class of games is \textit{hard} for the complexity class $\mathrm{CLS}$.
arXiv Detail & Related papers (2021-11-07T21:15:35Z) - Learning Zero-Sum Simultaneous-Move Markov Games Using Function
Approximation and Correlated Equilibrium [116.56359444619441]
We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games.
In the offline setting, we control both players and aim to find the Nash Equilibrium by minimizing the duality gap.
In the online setting, we control a single player playing against an arbitrary opponent and aim to minimize the regret.
arXiv Detail & Related papers (2020-02-17T17:04:16Z)