Independent Learning in Constrained Markov Potential Games
- URL: http://arxiv.org/abs/2402.17885v1
- Date: Tue, 27 Feb 2024 20:57:35 GMT
- Title: Independent Learning in Constrained Markov Potential Games
- Authors: Philip Jordan, Anas Barakat, Niao He
- Abstract summary: Constrained Markov games offer a formal framework for modeling multi-agent reinforcement learning problems.
We propose an independent policy gradient algorithm for learning approximate constrained Nash equilibria.
- Score: 19.083595175045073
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Constrained Markov games offer a formal mathematical framework for modeling
multi-agent reinforcement learning problems where the behavior of the agents is
subject to constraints. In this work, we focus on the recently introduced class
of constrained Markov Potential Games. While centralized algorithms have been
proposed for solving such constrained games, the design of converging
independent learning algorithms tailored for the constrained setting remains an
open question. We propose an independent policy gradient algorithm for learning
approximate constrained Nash equilibria: Each agent observes their own actions
and rewards, along with a shared state. Inspired by the optimization
literature, our algorithm performs proximal-point-like updates augmented with a
regularized constraint set. Each proximal step is solved inexactly using a
stochastic switching gradient algorithm. Notably, our algorithm can be
implemented independently without a centralized coordination mechanism
requiring turn-based agent updates. Under some technical constraint
qualification conditions, we establish convergence guarantees towards
constrained approximate Nash equilibria. We perform simulations to illustrate
our results.
Related papers
- Convergence of Decentralized Actor-Critic Algorithm in General-sum Markov Games [3.8779763612314633]
We study the properties of learning algorithms in general-sum Markov games.
In particular, we focus on a decentralized algorithm where each agent adopts an actor-critic learning dynamic.
arXiv Detail & Related papers (2024-09-06T20:49:11Z) - Hardness of Independent Learning and Sparse Equilibrium Computation in
Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games.
We show that no algorithm attains no-regret in general-sum games when executed independently by all players.
We show that our lower bounds hold even for seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z) - Breaking the Curse of Multiagents in a Large State Space: RL in Markov
Games with Independent Linear Function Approximation [56.715186432566576]
We propose a new model, independent linear Markov game, for reinforcement learning with a large state space and a large number of agents.
We design new algorithms for learning correlated equilibria (CCE) and Markov correlated equilibria (CE) with sample bounds complexity that only scalely with each agent's own function class complexity.
Our algorithms rely on two key technical innovations: (1) utilizing policy replay to tackle non-stationarity incurred by multiple agents and the use of function approximation; and (2) separating learning Markov equilibria and exploration in the Markov games.
arXiv Detail & Related papers (2023-02-07T18:47:48Z) - Learning Two-Player Mixture Markov Games: Kernel Function Approximation
and Correlated Equilibrium [157.0902680672422]
We consider learning Nash equilibria in two-player zero-sum Markov Games with nonlinear function approximation.
We propose a novel online learning algorithm to find a Nash equilibrium by minimizing the duality gap.
arXiv Detail & Related papers (2022-08-10T14:21:54Z) - A unified stochastic approximation framework for learning in games [82.74514886461257]
We develop a flexible approximation framework for analyzing the long-run behavior of learning in games (both continuous and finite)
The proposed analysis template incorporates a wide array of popular learning algorithms, including gradient-based methods, exponential/multiplicative weights for learning in finite games, optimistic and bandit variants of the above, etc.
arXiv Detail & Related papers (2022-06-08T14:30:38Z) - Learn to Match with No Regret: Reinforcement Learning in Markov Matching
Markets [151.03738099494765]
We study a Markov matching market involving a planner and a set of strategic agents on the two sides of the market.
We propose a reinforcement learning framework that integrates optimistic value iteration with maximum weight matching.
We prove that the algorithm achieves sublinear regret.
arXiv Detail & Related papers (2022-03-07T19:51:25Z) - Gradient play in stochastic games: stationary points, convergence, and
sample complexity [6.97785632069611]
We study the performance of the gradient play algorithm for games (SGs)
We show that Nash equilibria (NEs) and first-order stationary policies are equivalent in this setting.
For a subclass of SGs called Markov potential games, we design a sample-based reinforcement learning algorithm.
arXiv Detail & Related papers (2021-06-01T03:03:45Z) - Last-iterate Convergence of Decentralized Optimistic Gradient
Descent/Ascent in Infinite-horizon Competitive Markov Games [37.70703888365849]
We study infinite-horizon discounted two-player zero-sum Markov games.
We develop a decentralized algorithm that converges to the set of Nash equilibria under self-play.
arXiv Detail & Related papers (2021-02-08T21:45:56Z) - Learning Nash Equilibria in Zero-Sum Stochastic Games via
Entropy-Regularized Policy Approximation [18.35524179586723]
We explore the use of policy approximations to reduce the computational cost of learning Nash equilibria in zero-sum games.
We propose a new Q-learning type algorithm that uses a sequence of entropy-regularized soft policies to approximate the Nash policy.
We prove that under certain conditions, by updating the regularized Q-function, the algorithm converges to a Nash equilibrium.
arXiv Detail & Related papers (2020-09-01T01:03:44Z) - Competitive Mirror Descent [67.31015611281225]
Constrained competitive optimization involves multiple agents trying to minimize conflicting objectives, subject to constraints.
We propose competitive mirror descent (CMD): a general method for solving such problems based on first order information.
As a special case we obtain a novel competitive multiplicative weights algorithm for problems on the positive cone.
arXiv Detail & Related papers (2020-06-17T22:11:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.