Regularization of the policy updates for stabilizing Mean Field Games
- URL: http://arxiv.org/abs/2304.01547v2
- Date: Thu, 13 Apr 2023 13:53:24 GMT
- Title: Regularization of the policy updates for stabilizing Mean Field Games
- Authors: Talal Algumaei, Ruben Solozabal, Reda Alami, Hakim Hacid, Merouane
Debbah, Martin Takac
- Abstract summary: This work studies non-cooperative Multi-Agent Reinforcement Learning (MARL), where multiple agents interact in the same environment and each aims to maximize its individual return.
We name our algorithm Mean Field Proximal Policy Optimization (MF-PPO), and we empirically show the effectiveness of our method in the OpenSpiel framework.
- Score: 0.2348805691644085
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work studies non-cooperative Multi-Agent Reinforcement Learning (MARL),
where multiple agents interact in the same environment and each aims to
maximize its individual return. Challenges arise when scaling up the number of
agents, due to the non-stationarity that many concurrently learning agents
introduce. To address this issue, Mean Field Games (MFG) rely on symmetry and
homogeneity assumptions to approximate games with very large populations.
Recently, deep Reinforcement Learning has been used to scale MFG to games with
a larger number of states. Current methods rely on smoothing techniques such as
averaging the Q-values or the updates on the mean-field distribution. This work
presents a different approach that stabilizes learning through proximal
updates on the mean-field policy. We name our algorithm Mean Field Proximal
Policy Optimization (MF-PPO), and we empirically show the effectiveness of our
method in the OpenSpiel framework.
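The abstract describes the method only at a high level. As a hedged illustration of what a proximal, PPO-style clipped update on a shared mean-field policy can look like, here is a minimal Python sketch; the function names, the `policy.log_prob` interface, and the use of PyTorch are assumptions for illustration, not details taken from the paper:

```python
import torch

def mf_ppo_loss(policy, old_log_probs, obs, actions, advantages, clip_eps=0.2):
    """Clipped proximal surrogate for a single policy shared by the whole
    population (hypothetical sketch; the paper's exact objective may differ).

    Under the mean-field symmetry assumption, every agent can share one
    policy, so the batch may pool transitions from many symmetric agents."""
    new_log_probs = policy.log_prob(obs, actions)     # log pi_theta(a|s); assumed interface
    ratio = torch.exp(new_log_probs - old_log_probs)  # importance ratio vs. old policy
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # Clipping keeps the new mean-field policy close to the old one; this is
    # the "proximal update" role that stabilizes learning in this sketch.
    return -torch.min(unclipped, clipped).mean()
```

The clipping term is what makes the update proximal: updates that would move the policy ratio outside [1 - clip_eps, 1 + clip_eps] receive no gradient, which is one standard way to keep successive mean-field policies close.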
Related papers
- A Single Online Agent Can Efficiently Learn Mean Field Games [16.00164239349632]
Mean field games (MFGs) are a promising framework for modeling the behavior of large-population systems.
This paper introduces a novel online, single-agent, model-free learning scheme that enables a single agent to learn the mean-field Nash equilibrium (MFNE) from online samples.
arXiv Detail & Related papers (2024-05-05T16:38:04Z)
- Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning [139.53668999720605]
We present a multi-agent PPO algorithm in which the local policy of each agent is updated similarly to vanilla PPO.
We prove that with standard regularity conditions on the Markov game and problem-dependent quantities, our algorithm converges to the globally optimal policy at a sublinear rate.
arXiv Detail & Related papers (2023-05-08T16:20:03Z)
- Decentralized Multi-Agent Reinforcement Learning for Continuous-Space Stochastic Games [0.0]
We study decentralized multi-agent learning in stochastic games with general state spaces and an information structure in which agents do not observe each other's actions.
We propose a MARL algorithm and prove the near-optimality of its policy updates.
arXiv Detail & Related papers (2023-03-16T14:25:16Z)
- Robust Policy Optimization in Deep Reinforcement Learning [16.999444076456268]
In continuous action domains, a parameterized action distribution allows easy control of exploration.
In particular, we propose an algorithm called Robust Policy Optimization (RPO), which leverages a perturbed distribution.
We evaluated our methods on various continuous control tasks from DeepMind Control, OpenAI Gym, Pybullet, and IsaacGym.
arXiv Detail & Related papers (2022-12-14T22:43:56Z)
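One concrete reading of "leverages a perturbed distribution" in the RPO entry above is to perturb the mean of a Gaussian action distribution with uniform noise before sampling, which keeps entropy from collapsing. A hedged sketch; the names and the noise range `alpha` are assumptions:

```python
import torch

def rpo_gaussian(mean, log_std, alpha=0.5):
    """Sketch of a perturbed action distribution in the spirit of RPO:
    the predicted mean is shifted by uniform noise in [-alpha, alpha]
    before sampling, sustaining exploration (details are illustrative)."""
    noise = torch.empty_like(mean).uniform_(-alpha, alpha)
    return torch.distributions.Normal(mean + noise, log_std.exp())
```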
- Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games [63.60117916422867]
This paper focuses on the most basic setting of competitive multi-agent RL, namely two-player zero-sum Markov games.
We propose a single-loop policy optimization method with symmetric updates from both agents, where the policy is updated via the entropy-regularized optimistic multiplicative weights update (OMWU) method.
Our convergence results improve upon the best known complexities, and lead to a better understanding of policy optimization in competitive Markov games.
arXiv Detail & Related papers (2022-10-03T16:05:43Z)
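For intuition on the entropy-regularized OMWU update named in the entry above, here is a single-state sketch for a matrix game, illustrating the update family rather than the paper's exact single-loop method; the optimistic term and the temperature `tau` are standard ingredients, but the specific form is an assumption:

```python
import numpy as np

def omwu_step(policy, q_values, q_values_prev, eta=0.1, tau=0.01):
    """One optimistic multiplicative-weights update with entropy
    regularization (illustrative sketch). `q_values_prev` supplies the
    optimistic extrapolation term; tau is the entropy temperature."""
    # Optimistic gradient: current value plus the change since the last step.
    optimistic_q = 2.0 * q_values - q_values_prev
    # Entropy regularization shrinks the old log-policy by (1 - eta * tau).
    logits = (1.0 - eta * tau) * np.log(policy) + eta * optimistic_q
    new_policy = np.exp(logits - logits.max())  # subtract max for stability
    return new_policy / new_policy.sum()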
- Reinforcement Learning for Mean Field Games, with Applications to Economics [0.0]
Mean field games (MFG) and mean field control problems (MFC) are frameworks to study Nash equilibria or social optima in games with a continuum of agents.
We present a two-timescale RL approach for MFG and MFC that relies on a unified Q-learning algorithm (a schematic sketch follows below).
arXiv Detail & Related papers (2021-06-25T16:45:04Z)
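The two-timescale idea above can be made concrete in pseudocode: the Q-function and the mean-field distribution are updated with different step sizes, and in this line of work the ratio of the two rates is what selects between the MFG and MFC solutions. A schematic sketch with all names assumed:

```python
import numpy as np

def two_timescale_step(Q, mu, s, a, r, s_next,
                       rho_q=0.1, rho_mu=0.01, gamma=0.99):
    """One schematic update of a unified two-timescale scheme
    (illustrative; the actual algorithm's details differ).
    rho_q > rho_mu: Q adapts faster than mu (MFG-flavored regime);
    swapping the rates targets mean-field control (MFC) instead."""
    # Update of the population distribution mu over states at one timescale.
    onehot = np.zeros_like(mu)
    onehot[s] = 1.0
    mu = mu + rho_mu * (onehot - mu)
    # Temporal-difference update of Q at the other timescale.
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] = Q[s, a] + rho_q * (td_target - Q[s, a])
    return Q, mu
```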
- Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach [128.62787284435007]
We propose the mean-field proximal policy optimization (MF-PPO) algorithm, at the core of which is a permutation-invariant actor-critic neural architecture.
We prove that MF-PPO attains the globally optimal policy at a sublinear rate of convergence.
In particular, we show that the inductive bias introduced by the permutation-invariant neural architecture enables MF-PPO to outperform existing competitors.
arXiv Detail & Related papers (2021-05-18T04:35:41Z)
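Permutation invariance of the kind named in the entry above is typically obtained by pooling per-agent features with a symmetric operator before the value head. A minimal sketch of that inductive bias; the layer sizes and structure are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class PermutationInvariantCritic(nn.Module):
    """Mean-pools per-agent embeddings so the value is unchanged under
    any reordering of the agents (sketch of the inductive bias only)."""
    def __init__(self, obs_dim, hidden=64):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)

    def forward(self, agent_obs):  # shape: (batch, n_agents, obs_dim)
        pooled = self.embed(agent_obs).mean(dim=1)  # symmetric pooling
        return self.value(pooled)
```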
- Provable Fictitious Play for General Mean-Field Games [111.44976345867005]
We propose a reinforcement learning algorithm for stationary mean-field games.
The goal is to learn a mean-field state and a stationary policy that together constitute the Nash equilibrium (the core averaging step is sketched below).
arXiv Detail & Related papers (2020-10-08T18:46:48Z)
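Fictitious play for MFGs, as in the entry above, alternates a best response against the running average mean field with an averaging step over the induced distributions. The core averaging fits in a few lines; `best_response` and `induced_distribution` are assumed helper functions, not the paper's API:

```python
def fictitious_play_round(mu_bar, t, best_response, induced_distribution):
    """One fictitious-play round (illustrative): best-respond to the
    averaged mean field, then fold the induced distribution into the
    running average with weight 1/t."""
    pi_t = best_response(mu_bar)           # policy vs. averaged mean field
    mu_t = induced_distribution(pi_t)      # state distribution under pi_t
    return mu_bar + (mu_t - mu_bar) / t    # running-average update
```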
- Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time [109.06623773924737]
We study the policy gradient method for the linear-quadratic mean-field control and game.
We show that it converges to the optimal solution at a linear rate, which is verified by a synthetic simulation.
arXiv Detail & Related papers (2020-08-16T06:34:11Z)
- FACMAC: Factored Multi-Agent Centralised Policy Gradients [103.30380537282517]
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces (a schematic of the factored-critic idea follows the list).
We evaluate FACMAC on variants of the multi-agent particle environments, a novel multi-agent MuJoCo benchmark, and a challenging set of StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2020-03-14T21:29:09Z)
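The factored centralised critic idea behind the FACMAC entry can be sketched as per-agent utilities combined by a small mixing network into a joint value; this illustrates the general idea of factored critics, not FACMAC's exact architecture, and all names and sizes are assumptions:

```python
import torch
import torch.nn as nn

class FactoredCentralCritic(nn.Module):
    """Schematic factored critic: per-agent utilities Q_i are combined by
    a mixing network into one joint value (sketch of the idea only)."""
    def __init__(self, n_agents, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.utilities = nn.ModuleList(
            nn.Sequential(nn.Linear(obs_dim + act_dim, hidden),
                          nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_agents))
        self.mix = nn.Sequential(nn.Linear(n_agents, hidden),
                                 nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, obs, acts):  # obs: (B, n, obs_dim), acts: (B, n, act_dim)
        q_i = [u(torch.cat([obs[:, i], acts[:, i]], dim=-1))
               for i, u in enumerate(self.utilities)]
        return self.mix(torch.cat(q_i, dim=-1))  # joint Q value, shape (B, 1)
```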
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.