Population-aware Online Mirror Descent for Mean-Field Games with Common Noise by Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2509.03030v1
- Date: Wed, 03 Sep 2025 05:33:46 GMT
- Title: Population-aware Online Mirror Descent for Mean-Field Games with Common Noise by Deep Reinforcement Learning
- Authors: Zida Wu, Mathieu Lauriere, Matthieu Geist, Olivier Pietquin, Ankur Mehta
- Abstract summary: Mean Field Games (MFGs) offer a powerful framework for studying large-scale multi-agent systems. Yet, learning Nash equilibria in MFGs remains a challenging problem. We introduce an efficient deep reinforcement learning (DRL) algorithm to achieve population-dependent Nash equilibria.
- Score: 28.970166223191836
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mean Field Games (MFGs) offer a powerful framework for studying large-scale multi-agent systems. Yet, learning Nash equilibria in MFGs remains a challenging problem, particularly when the initial distribution is unknown or when the population is subject to common noise. In this paper, we introduce an efficient deep reinforcement learning (DRL) algorithm, inspired by Munchausen RL and Online Mirror Descent, designed to achieve population-dependent Nash equilibria without relying on averaging or historical sampling. The resulting policy adapts to various initial distributions and sources of common noise. Through numerical experiments on seven canonical examples, we demonstrate that our algorithm exhibits superior convergence properties compared to state-of-the-art algorithms, in particular a DRL version of Fictitious Play for population-dependent policies. Its performance in the presence of common noise underscores the robustness and adaptability of our approach.
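As a rough picture of the mechanism: Munchausen RL realizes an OMD-style policy update implicitly, by adding a scaled, clipped log-policy bonus to the reward inside an entropy-regularized bootstrap. The sketch below shows this regularized target for a discrete action space. It is a generic Munchausen-DQN-style target in PyTorch, not the authors' code; names such as `target_net`, `tau`, and `alpha` are illustrative.

```python
import torch
import torch.nn.functional as F

def munchausen_target(target_net, obs, act, rew, next_obs, done,
                      gamma=0.99, tau=0.03, alpha=0.9, l0=-1.0):
    """Munchausen-style regularized TD target (a sketch, not the paper's code).
    The policy is the softmax of Q/tau; the reward is augmented with a clipped,
    scaled log-policy bonus, which implicitly accumulates past Q-values as in
    Online Mirror Descent."""
    with torch.no_grad():
        q_next = target_net(next_obs)                      # [batch, n_actions]
        log_pi_next = F.log_softmax(q_next / tau, dim=-1)  # soft policy at s'
        pi_next = log_pi_next.exp()
        # Soft value of the next state: E_pi[Q - tau * log pi].
        soft_v_next = (pi_next * (q_next - tau * log_pi_next)).sum(dim=-1)
        # Munchausen bonus: alpha * tau * log pi(a|s), clipped for stability.
        log_pi = F.log_softmax(target_net(obs) / tau, dim=-1)
        log_pi_a = log_pi.gather(-1, act.unsqueeze(-1)).squeeze(-1)
        bonus = alpha * torch.clamp(tau * log_pi_a, min=l0, max=0.0)
        return rew + bonus + gamma * (1.0 - done) * soft_v_next
```

In the population-dependent setting of the paper, the observation would additionally carry (an embedding of) the current population distribution, which is what lets a single trained policy adapt to different initial distributions and common-noise realizations.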
Related papers
- Solving Continuous Mean Field Games: Deep Reinforcement Learning for Non-Stationary Dynamics [4.4685491639611685]
Mean field games (MFGs) have emerged as a powerful framework for modeling interactions in large-scale multi-agent systems. This paper introduces a novel deep reinforcement learning (DRL) algorithm specifically designed for non-stationary continuous MFGs.
arXiv Detail & Related papers (2025-10-25T04:50:52Z)
- DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data [65.09939942413651]
We propose a principled extension to GRPO that addresses inter-group imbalance with two key innovations. Domain-aware reward scaling counteracts frequency bias by reweighting optimization based on domain prevalence. Difficulty-aware reward scaling leverages prompt-level self-consistency to identify and prioritize uncertain prompts that offer greater learning value.
arXiv Detail & Related papers (2025-05-21T03:43:29Z)
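To make the two scalings in the DISCO summary above concrete, here is a toy per-prompt weight combining them. The functional forms (inverse domain frequency, distance of self-consistency from 0.5) are assumptions for illustration, not the paper's exact choices.

```python
import numpy as np

def disco_style_weight(domain_freq, self_consistency, eps=1e-8):
    """Illustrative per-prompt reward scale (placeholder forms, not the paper's).
    domain_freq: fraction of training prompts drawn from this prompt's domain.
    self_consistency: fraction of sampled answers that agree with each other."""
    # Domain-aware: inverse-frequency reweighting counteracts frequency bias
    # toward over-represented domains.
    domain_w = 1.0 / (domain_freq + eps)
    # Difficulty-aware: prompts with self-consistency near 0.5 are the most
    # uncertain and are prioritized; near 0 or 1 there is little signal.
    difficulty_w = 1.0 - np.abs(2.0 * self_consistency - 1.0)
    return domain_w * difficulty_w
```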
- Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning [50.92957910121088]
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS).
For episodic two-player zero-sum MGs, we present three sample-efficient algorithms for learning Nash equilibria.
We extend Reg-MAIDS to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or coarse correlated equilibrium in a sample-efficient manner.
arXiv Detail & Related papers (2024-04-30T06:48:56Z)
- Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning [43.004209289015975]
Mean Field Games (MFGs) can handle large-scale multi-agent systems.
We propose a deep reinforcement learning (DRL) algorithm that achieves a population-dependent Nash equilibrium.
arXiv Detail & Related papers (2024-03-06T08:55:34Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Heavy-tailed denoising score matching [5.371337604556311]
We develop an iterative noise scaling algorithm to consistently initialise the multiple levels of noise in Langevin dynamics.
On the practical side, our use of heavy-tailed DSM leads to improved score estimation, controllable sampling convergence, and more balanced unconditional generative performance for imbalanced datasets.
arXiv Detail & Related papers (2021-12-17T22:04:55Z)
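For context on where the noise levels enter: score-based samplers run Langevin dynamics over a decreasing noise schedule. Below is a generic (Gaussian) annealed Langevin sampler, assuming a learned `score_fn(x, sigma)`; the paper's heavy-tailed variant and its noise-initialisation rule are not reproduced here.

```python
import numpy as np

def annealed_langevin(score_fn, x0, sigmas, steps_per_level=100, eps=2e-5,
                      rng=None):
    """Generic annealed Langevin sampling (a sketch for context, not the
    paper's heavy-tailed algorithm). score_fn(x, sigma) is assumed to
    approximate grad_x log p_sigma(x); sigmas is sorted large -> small."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for sigma in sigmas:
        step = eps * (sigma / sigmas[-1]) ** 2  # step size scales with noise
        for _ in range(steps_per_level):
            z = rng.standard_normal(x.shape)
            x = x + 0.5 * step * score_fn(x, sigma) + np.sqrt(step) * z
    return x
```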
- Signatured Deep Fictitious Play for Mean Field Games with Common Noise [0.0]
Existing deep learning methods for solving mean-field games (MFGs) with common noise fix the sampled common noise paths and then solve the corresponding MFGs.
We propose a novel single-loop algorithm, named signatured deep fictitious play, which works with unfixed common noise and thereby avoids the nested-loop structure.
The proposed algorithm can accurately capture the effect of common uncertainty changes on mean-field equilibria without further training of neural networks.
arXiv Detail & Related papers (2021-06-06T23:09:46Z)
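The "signature" in the title is the path signature from rough path theory: a feature map built from iterated integrals of the common-noise path. A minimal depth-2 computation for a sampled (piecewise-linear) path is sketched below; how the paper feeds these features into fictitious play is not reproduced.

```python
import numpy as np

def signature_level2(path):
    """Depth-2 truncated signature of a piecewise-linear path.
    path: array of shape (T, d). Returns the level-1 term (total increment)
    and the level-2 term (matrix of iterated integrals of dX^i dX^j),
    accumulated segment by segment via Chen's identity."""
    path = np.asarray(path, dtype=float)
    deltas = np.diff(path, axis=0)       # per-segment increments, (T-1, d)
    d = path.shape[1]
    s1 = deltas.sum(axis=0)              # level 1: X_T - X_0
    s2 = np.zeros((d, d))
    running = np.zeros(d)                # increment accumulated so far
    for delta in deltas:
        # Chen: cross term with the path so far, plus the linear segment's
        # own level-2 signature, 0.5 * outer(delta, delta).
        s2 += np.outer(running, delta) + 0.5 * np.outer(delta, delta)
        running += delta
    return s1, s2
```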
- Mean Field Games Flock! The Reinforcement Learning Way [34.67098179276852]
We present a method enabling a large number of agents to learn how to flock.
This is a natural behavior observed in large populations of animals.
We show numerically that our algorithm learns multi-group and high-dimensional flocking with obstacles.
arXiv Detail & Related papers (2021-05-17T15:17:36Z)
- Scaling up Mean Field Games with Online Mirror Descent [55.36153467919289]
We address scaling up equilibrium computation in Mean Field Games (MFGs) using Online Mirror Descent (OMD).
We show that continuous-time OMD provably converges to a Nash equilibrium under a natural and well-motivated set of monotonicity assumptions.
A thorough experimental investigation on various single and multi-population MFGs shows that OMD outperforms traditional algorithms such as Fictitious Play (FP).
arXiv Detail & Related papers (2021-02-28T21:28:36Z)
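The discrete-time OMD iteration behind these results can be stated compactly (notation ours, following the standard entropic-mirror-descent form): accumulate the Q-functions of past iterates and take a softmax,

```latex
y_{k+1}(s,a) = y_k(s,a) + Q^{\pi_k,\mu_k}(s,a), \qquad
\pi_{k+1}(a \mid s) =
  \frac{\exp\left(y_{k+1}(s,a)/\tau\right)}
       {\sum_{a'} \exp\left(y_{k+1}(s,a')/\tau\right)},
```

where $Q^{\pi_k,\mu_k}$ is the Q-function of policy $\pi_k$ against the mean-field flow $\mu_k$ it induces and $\tau > 0$ is a temperature. The Munchausen construction in the main paper above can be read as a way to realize this cumulative sum implicitly within standard value-based DRL.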
- Robust Reinforcement Learning using Adversarial Populations [118.73193330231163]
Reinforcement Learning (RL) is an effective tool for controller design but can struggle with issues of robustness.
We show that using a single adversary does not consistently yield robustness to dynamics variations under standard parametrizations of the adversary.
We propose a population-based augmentation to the Robust RL formulation in which we randomly initialize a population of adversaries and sample from the population uniformly during training.
arXiv Detail & Related papers (2020-08-04T20:57:32Z)
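The population idea in this last paper amounts to a small change in the adversarial training loop: maintain several independently initialized adversaries and draw one uniformly per rollout. A schematic sketch, where `make_adversary`, `rollout`, and `update` are assumed hooks into your own training stack:

```python
import random

def train_with_adversary_population(protagonist, make_adversary, rollout,
                                    update, n_adversaries=5, iterations=1000):
    """Population-based robust RL, schematically (hypothetical helpers):
    sampling among several adversaries keeps the protagonist from
    overfitting to any single adversarial strategy."""
    adversaries = [make_adversary(seed=i) for i in range(n_adversaries)]
    for _ in range(iterations):
        adversary = random.choice(adversaries)        # uniform sampling
        trajectory = rollout(protagonist, adversary)  # zero-sum interaction
        update(protagonist, trajectory, role="protagonist")
        update(adversary, trajectory, role="adversary")
    return protagonist, adversaries
```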