Population-aware Online Mirror Descent for Mean-Field Games by Deep
Reinforcement Learning
- URL: http://arxiv.org/abs/2403.03552v1
- Date: Wed, 6 Mar 2024 08:55:34 GMT
- Title: Population-aware Online Mirror Descent for Mean-Field Games by Deep
Reinforcement Learning
- Authors: Zida Wu, Mathieu Lauriere, Samuel Jia Cong Chua, Matthieu Geist,
Olivier Pietquin, Ankur Mehta
- Abstract summary: Mean Field Games (MFGs) have the ability to handle large-scale multi-agent systems.
We propose a deep reinforcement learning (DRL) algorithm that achieves population-dependent Nash equilibrium.
- Score: 43.004209289015975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mean Field Games (MFGs) have the ability to handle large-scale multi-agent
systems, but learning Nash equilibria in MFGs remains a challenging task. In
this paper, we propose a deep reinforcement learning (DRL) algorithm that
achieves population-dependent Nash equilibrium without the need for averaging
or sampling from history, inspired by Munchausen RL and Online Mirror Descent.
Through the design of an additional inner-loop replay buffer, the agents can
effectively learn to achieve Nash equilibrium from any distribution, mitigating
catastrophic forgetting. The resulting policy can be applied to various initial
distributions. Numerical experiments on four canonical examples demonstrate that our
algorithm has better convergence properties than SOTA algorithms, in particular
a DRL version of Fictitious Play for population-dependent policies.
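As a concrete illustration of the update described in the abstract, the following is a minimal sketch of a Munchausen-style, population-conditioned Q-target, assuming a finite-state MFG where the mean-field distribution mu is a vector concatenated with the state one-hot; the class PopulationQNet, the batch layout, and the helper munchausen_target are illustrative assumptions, not the authors' code. The tau * log-policy terms play the role of the mirror-descent regularization, which is why no averaging or sampling from history is needed.

```python
# Minimal sketch (assumptions, not the paper's code) of a population-conditioned
# Munchausen-style Q target, in the spirit of Munchausen RL + Online Mirror Descent.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PopulationQNet(nn.Module):
    """Q(s, mu, .): the state one-hot and the population distribution mu are concatenated."""
    def __init__(self, n_states: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_states, hidden), nn.ReLU(),  # assumes mu has dimension n_states
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state_onehot: torch.Tensor, mu: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state_onehot, mu], dim=-1))

def munchausen_target(q_target: PopulationQNet, batch, tau: float, gamma: float) -> torch.Tensor:
    """Regression target for the online Q-network; the tau * log pi terms implement
    the mirror-descent regularization, so past policies never need to be stored."""
    s, mu, a, r, s_next, mu_next = batch  # tensors; a is a LongTensor of action indices
    with torch.no_grad():
        logp = F.log_softmax(q_target(s, mu) / tau, dim=-1)
        logp_next = F.log_softmax(q_target(s_next, mu_next) / tau, dim=-1)
        pi_next = logp_next.exp()
        # Munchausen bonus on the taken action (usually clipped in practice; omitted here)
        bonus = tau * logp.gather(-1, a.unsqueeze(-1)).squeeze(-1)
        # Soft value of the next state under the target policy
        soft_v_next = (pi_next * (q_target(s_next, mu_next) - tau * logp_next)).sum(-1)
        return r + bonus + gamma * soft_v_next
```

In the paper's setting, the inner-loop replay buffer would be filled with transitions collected from several initial distributions, so that a Q-network trained on such targets generalizes across populations; the above is a sketch under those assumptions rather than a faithful reimplementation.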
Related papers
- Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning [55.65738319966385]
We propose a novel online algorithm, iterative Nash policy optimization (INPO).
Unlike previous methods, INPO bypasses the need for estimating the expected win rate for individual responses.
With an LLaMA-3-8B-based SFT model, INPO achieves a 42.6% length-controlled win rate on AlpacaEval 2.0 and a 37.8% win rate on Arena-Hard.
arXiv Detail & Related papers (2024-06-30T08:00:34Z) - A Single Online Agent Can Efficiently Learn Mean Field Games [16.00164239349632]
Mean field games (MFGs) are a promising framework for modeling the behavior of large-population systems.
This paper introduces a novel online single-agent model-free learning scheme, which enables a single agent to learn mean-field Nash equilibria (MFNE) using online samples.
arXiv Detail & Related papers (2024-05-05T16:38:04Z) - MF-OML: Online Mean-Field Reinforcement Learning with Occupation Measures for Large Population Games [5.778024594615575]
This paper proposes an online mean-field reinforcement learning algorithm for computing Nash equilibria of sequential games.
MF-OML is the first fully approximate multi-agent reinforcement learning algorithm for provably solving Nash equilibria.
As a byproduct, we also obtain the first tractable, globally convergent computational algorithm for approximately solving monotone mean-field games.
arXiv Detail & Related papers (2024-05-01T02:19:31Z) - Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z) - The Effective Horizon Explains Deep RL Performance in Stochastic Environments [21.148001945560075]
Reinforcement learning (RL) theory has largely focused on proving minimax sample complexity bounds.
We introduce a new RL algorithm, SQIRL, that iteratively learns a near-optimal policy by exploring randomly to collect rollouts.
We leverage SQIRL to derive instance-dependent sample complexity bounds for RL that are exponential only in an "effective horizon" look-ahead and in the complexity of the class used for approximation.
arXiv Detail & Related papers (2023-12-13T18:58:56Z) - Scalable Deep Reinforcement Learning Algorithms for Mean Field Games [60.550128966505625]
Mean Field Games (MFGs) have been introduced to efficiently approximate games with very large populations of strategic agents.
Recently, the question of learning equilibria in MFGs has gained momentum, particularly using model-free reinforcement learning (RL) methods.
Existing algorithms to solve MFGs require the mixing of approximated quantities such as strategies or $q$-values.
We propose two methods to address this shortcoming. The first one learns a mixed strategy from distillation of historical data into a neural network and is applied to the Fictitious Play algorithm.
The second one is an online mixing method based on regularization and is applied to Online Mirror Descent.
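A rough sketch of the first, distillation-based idea, assuming the historical best responses are summarized as a dataset of state-action pairs and the average policy is fit by behavioral cloning; the function name and training loop below are illustrative assumptions, not the cited paper's code.

```python
# Minimal sketch (an assumption, not the cited paper's code): distill historical
# best responses into a single average-policy network by behavioral cloning.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distill_average_policy(policy: nn.Module,
                           replay_states: torch.Tensor,   # states sampled from past iterations
                           replay_actions: torch.Tensor,  # actions chosen by past best responses
                           epochs: int = 5, lr: float = 1e-3) -> nn.Module:
    """Fit the network to the empirical mixture of historical best responses."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        logits = policy(replay_states)                    # (batch, n_actions)
        loss = F.cross_entropy(logits, replay_actions)    # behavioral-cloning loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return policy
```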
arXiv Detail & Related papers (2022-03-22T18:10:32Z) - Efficient Model-based Multi-agent Reinforcement Learning via Optimistic
Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z) - Mean Field Games Flock! The Reinforcement Learning Way [34.67098179276852]
We present a method enabling a large number of agents to learn how to flock.
This is a natural behavior observed in large populations of animals.
We show numerically that our algorithm learns multi-group or high-dimensional flocking with obstacles.
arXiv Detail & Related papers (2021-05-17T15:17:36Z) - Scaling up Mean Field Games with Online Mirror Descent [55.36153467919289]
We address scaling up equilibrium computation in Mean Field Games (MFGs) using Online Mirror Descent (OMD).
We show that continuous-time OMD provably converges to a Nash equilibrium under a natural and well-motivated set of monotonicity assumptions.
A thorough experimental investigation on various single and multi-population MFGs shows that OMD outperforms traditional algorithms such as Fictitious Play (FP).
arXiv Detail & Related papers (2021-02-28T21:28:36Z)
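For reference, a minimal tabular sketch of the discrete-time OMD recursion used for MFGs (my rendering of the standard form, not code from the cited paper): the cumulative Q-values act as the mirror-descent dual variable and the next policy is their softmax; the mean field would then be recomputed by forward simulation under the new policy.

```python
# Minimal tabular sketch (an assumption of the standard form, not the cited paper's code):
# one Online Mirror Descent iteration for a finite-state, finite-action MFG.
import numpy as np

def omd_iteration(y: np.ndarray, q_k: np.ndarray, tau: float = 1.0):
    """y accumulates Q-values over iterations (shape [n_states, n_actions]);
    the next policy is softmax(y / tau) per state."""
    y = y + q_k                                    # mirror-descent accumulation
    logits = y / tau
    logits -= logits.max(axis=-1, keepdims=True)   # numerical stability
    pi = np.exp(logits)
    pi /= pi.sum(axis=-1, keepdims=True)           # pi_{k+1}(a | s)
    return y, pi
```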
This list is automatically generated from the titles and abstracts of the papers in this site.