The Complexity of Markov Equilibrium in Stochastic Games
- URL: http://arxiv.org/abs/2204.03991v1
- Date: Fri, 8 Apr 2022 10:51:01 GMT
- Title: The Complexity of Markov Equilibrium in Stochastic Games
- Authors: Constantinos Daskalakis and Noah Golowich and Kaiqing Zhang
- Abstract summary: We show that computing approximate stationary Markov coarse correlated equilibria (CCE) in general-sum stochastic games is computationally intractable.
Our results stand in sharp contrast to normal-form games where exact CCEs are efficiently computable.
- Score: 44.77547027158141
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show that computing approximate stationary Markov coarse correlated
equilibria (CCE) in general-sum stochastic games is computationally
intractable, even when there are two players, the game is turn-based, the
discount factor is an absolute constant, and the approximation is an absolute
constant. Our intractability results stand in sharp contrast to normal-form
games where exact CCEs are efficiently computable. A fortiori, our results
imply that there are no efficient algorithms for learning stationary Markov CCE
policies in multi-agent reinforcement learning (MARL), even when the
interaction is two-player and turn-based, and both the discount factor and the
desired approximation of the learned policies are absolute constants. In turn,
these results stand in sharp contrast to single-agent reinforcement learning
(RL) where near-optimal stationary Markov policies can be efficiently learned.
Complementing our intractability results for stationary Markov CCEs, we provide
a decentralized algorithm (assuming shared randomness among players) for
learning a nonstationary Markov CCE policy with polynomial time and sample
complexity in all problem parameters. Previous work for learning Markov CCE
policies all required exponential time and sample complexity in the number of
players.
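For context on the contrast drawn above: in a normal-form game, the set of CCEs is cut out by finitely many linear inequalities over the joint action distribution, so an exact CCE can be found by solving a linear feasibility problem. Below is a minimal sketch of that LP for two players; it is not taken from the paper, and the payoff matrices and the use of scipy.optimize.linprog are illustrative assumptions.

```python
# Minimal sketch: find an exact CCE of a two-player normal-form game by
# linear programming (illustrates the "efficiently computable in
# normal-form games" contrast; not the paper's algorithm).
import numpy as np
from scipy.optimize import linprog

def normal_form_cce(A, B):
    """A, B: (m x n) payoff matrices for the row / column player.
    Returns a joint distribution x (m x n) that is an exact CCE."""
    m, n = A.shape
    rows = []

    # Row-player constraints: for every deviation i',
    # E_x[A(i, j)] >= E_x[A(i', j)]  <=>  sum_{ij} (A[i',j] - A[i,j]) x_{ij} <= 0.
    for i_dev in range(m):
        coef = np.zeros((m, n))
        for i in range(m):
            for j in range(n):
                coef[i, j] = A[i_dev, j] - A[i, j]
        rows.append(coef.ravel())

    # Column-player constraints, symmetrically with B.
    for j_dev in range(n):
        coef = np.zeros((m, n))
        for i in range(m):
            for j in range(n):
                coef[i, j] = B[i, j_dev] - B[i, j]
        rows.append(coef.ravel())

    A_ub = np.array(rows)
    b_ub = np.zeros(len(rows))
    A_eq = np.ones((1, m * n))   # probabilities sum to one
    b_eq = np.array([1.0])
    c = np.zeros(m * n)          # pure feasibility: any CCE will do

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, 1)] * (m * n), method="highs")
    return res.x.reshape(m, n)

# Example: matching pennies (hypothetical test payoffs).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x = normal_form_cce(A, -A)
print(x)  # some valid CCE; the uniform joint distribution is one example
```

Any feasible point of this program is an exact CCE. The paper's point is precisely that no comparably efficient procedure exists for stationary Markov CCEs in stochastic games.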
Related papers
- Convergence of Decentralized Actor-Critic Algorithm in General-sum Markov Games [3.8779763612314633]
We study the properties of learning algorithms in general-sum Markov games.
In particular, we focus on a decentralized algorithm where each agent adopts an actor-critic learning dynamic.
arXiv Detail & Related papers (2024-09-06T20:49:11Z)
- Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning [50.92957910121088]
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS).
For episodic two-player zero-sum MGs, we present three sample-efficient algorithms for learning Nash equilibrium.
We extend Reg-MAIDS to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or coarse correlated equilibrium in a sample efficient manner.
arXiv Detail & Related papers (2024-04-30T06:48:56Z)
- Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence Beyond the Minty Property [89.96815099996132]
We develop a new framework to characterize optimistic policy gradient methods in multi-player games with a single controller.
Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.
arXiv Detail & Related papers (2023-12-19T11:34:10Z)
- Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games.
We show that no algorithm attains no-regret in general-sum games when executed independently by all players.
We show that our lower bounds hold even for the seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z)
- Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation [44.051717720483595]
This paper presents the first line of MARL algorithms that provably resolve the curse of multiagency.
In exchange for learning a weaker version of CCEs, this algorithm applies to a wider range of problems under generic function approximation.
Our algorithm always outputs Markov CCEs and achieves an optimal rate of $\widetilde{\mathcal{O}}(\epsilon^{-2})$ for finding $\epsilon$-optimal solutions.
arXiv Detail & Related papers (2023-02-13T18:59:25Z)
- Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation [56.715186432566576]
We propose a new model, independent linear Markov game, for reinforcement learning with a large state space and a large number of agents.
We design new algorithms for learning Markov coarse correlated equilibria (CCE) and Markov correlated equilibria (CE) with sample complexity bounds that scale only with each agent's own function class complexity.
Our algorithms rely on two key technical innovations: (1) using policy replay to tackle the non-stationarity incurred by multiple agents and by function approximation; and (2) separating the learning of Markov equilibria from exploration in the Markov games.
arXiv Detail & Related papers (2023-02-07T18:47:48Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Finite-Sample Analysis of Decentralized Q-Learning for Stochastic Games [3.441021278275805]
Learning in games is arguably the most standard and fundamental setting in multi-agent reinforcement learning (MARL).
We establish the finite-sample complexity of fully decentralized Q-learning algorithms in a significant class of general-sum stochastic games (SGs).
We focus on the practical yet challenging setting of fully decentralized MARL, where neither the rewards nor the actions of other agents can be observed by each agent (a minimal sketch of this setting appears after these entries).
arXiv Detail & Related papers (2021-12-15T03:33:39Z)
- On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games [18.48133964089095]
Stochastic games (SGs) lay the foundation for the study of multi-agent reinforcement learning (MARL) and sequential agent interactions.
We show that computing an approximate Markov Perfect Equilibrium (MPE) in a finite-state SG to exponential precision is PPAD-complete.
Our results indicate that finding an MPE in SGs is highly unlikely to be NP-hard unless NP = co-NP.
arXiv Detail & Related papers (2021-09-04T05:47:59Z)
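As referenced in the decentralized Q-learning entry above, "fully decentralized" means each agent updates only from its own state, action, and reward observations, treating the other agents as part of the environment. The sketch below illustrates that setup; the environment interface, hyperparameters, and the absence of the analyzed paper's specific exploration machinery are all assumptions, so this is not the algorithm whose finite-sample complexity is established there.

```python
# Minimal sketch of fully decentralized (independent) Q-learning in a
# stochastic game: each agent keeps a Q-table over (state, own action)
# and never observes the other agents' actions or rewards.
# The env.reset()/env.step() interface is hypothetical.
import numpy as np

def independent_q_learning(env, n_agents, n_states, n_actions,
                           episodes=1000, horizon=100,
                           alpha=0.1, gamma=0.95, eps=0.1):
    # One Q-table per agent, indexed only by (state, own action).
    Q = [np.zeros((n_states, n_actions)) for _ in range(n_agents)]
    rng = np.random.default_rng(0)

    for _ in range(episodes):
        s = env.reset()
        for _ in range(horizon):
            # Each agent picks its own action epsilon-greedily from its own table.
            a = [int(rng.integers(n_actions)) if rng.random() < eps
                 else int(np.argmax(Q[i][s])) for i in range(n_agents)]
            s_next, rewards = env.step(a)   # rewards[i] is agent i's payoff
            for i in range(n_agents):
                # Standard single-agent TD update; the other agents are
                # folded into the (nonstationary) environment dynamics.
                target = rewards[i] + gamma * np.max(Q[i][s_next])
                Q[i][s, a[i]] += alpha * (target - Q[i][s, a[i]])
            s = s_next
    return Q
```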