Solving Continuous Mean Field Games: Deep Reinforcement Learning for Non-Stationary Dynamics
- URL: http://arxiv.org/abs/2510.22158v1
- Date: Sat, 25 Oct 2025 04:50:52 GMT
- Title: Solving Continuous Mean Field Games: Deep Reinforcement Learning for Non-Stationary Dynamics
- Authors: Lorenzo Magnino, Kai Shao, Zida Wu, Jiacheng Shen, Mathieu Laurière
- Abstract summary: Mean field games (MFGs) have emerged as a powerful framework for modeling interactions in large-scale multi-agent systems. This paper introduces a novel deep reinforcement learning (DRL) algorithm specifically designed for non-stationary continuous MFGs.
- Score: 4.4685491639611685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mean field games (MFGs) have emerged as a powerful framework for modeling interactions in large-scale multi-agent systems. Despite recent advancements in reinforcement learning (RL) for MFGs, existing methods are typically limited to finite spaces or stationary models, hindering their applicability to real-world problems. This paper introduces a novel deep reinforcement learning (DRL) algorithm specifically designed for non-stationary continuous MFGs. The proposed approach builds upon a Fictitious Play (FP) methodology, leveraging DRL for best-response computation and supervised learning for average policy representation. Furthermore, it learns a representation of the time-dependent population distribution using a Conditional Normalizing Flow. To validate the effectiveness of our method, we evaluate it on three different examples of increasing complexity. By addressing critical limitations in scalability and density approximation, this work represents a significant advancement in applying DRL techniques to complex MFG problems, bringing the field closer to real-world multi-agent systems.
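The abstract describes a concrete loop: Fictitious Play (FP) iterations in which a DRL agent computes a best response against the current estimate of the time-dependent population distribution, supervised learning distills the running average policy, and a Conditional Normalizing Flow models the density at each time step. The sketch below illustrates that loop on a 1-D crowd-aversion toy problem; it is a minimal NumPy stand-in, not the authors' implementation. To stay self-contained it swaps the DRL best response for a per-step grid search over drifts and the conditional normalizing flow for a per-step Gaussian fit, and all names (`rollout`, `best_response`, the cost function) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, dt, n_agents = 20, 0.05, 2000   # horizon, step size, simulated population size
target = 1.0                       # state the agents are attracted to

def rollout(drift_per_t):
    """Simulate the population under a time-dependent drift (the average policy)."""
    x = rng.normal(0.0, 0.3, size=n_agents)
    traj = [x.copy()]
    for t in range(T):
        x = x + drift_per_t[t] * dt + 0.1 * np.sqrt(dt) * rng.normal(size=n_agents)
        traj.append(x.copy())
    return np.stack(traj)          # shape (T + 1, n_agents)

def best_response(mu_mean, mu_std):
    """Grid-search stand-in for the DRL best response: per time step, pick the
    drift that trades off reaching `target` against a crowd-aversion penalty."""
    drifts, x = np.zeros(T), 0.0
    candidates = np.linspace(-2.0, 2.0, 41)
    for t in range(T):
        nxt = x + candidates * dt
        # one-step cost: distance to target plus congestion near the population mean
        cost = (nxt - target) ** 2 + np.exp(-((nxt - mu_mean[t]) / mu_std[t]) ** 2)
        drifts[t] = candidates[np.argmin(cost)]
        x += drifts[t] * dt
    return drifts

# FP loop: the running average of drifts stands in for the supervised
# average-policy network; the per-step (mean, std) pair stands in for the
# conditional normalizing flow over the non-stationary population density.
avg_drift = np.zeros(T)
mu_mean, mu_std = np.zeros(T + 1), np.ones(T + 1)
for k in range(1, 31):
    br = best_response(mu_mean, mu_std)
    avg_drift += (br - avg_drift) / k          # uniform FP averaging of best responses
    traj = rollout(avg_drift)
    mu_mean, mu_std = traj.mean(axis=1), traj.std(axis=1) + 1e-3

print(np.round(mu_mean, 2))   # time-dependent population mean after FP converges
```

The 1/k step implements the uniform averaging that defines classical FP; in the paper this averaging is instead realized by supervised learning over past best responses, and the per-step Gaussian fit would be replaced by a flow conditioned on the time index so that arbitrary, non-Gaussian densities can be represented.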
Related papers
- Sample-Efficient Neurosymbolic Deep Reinforcement Learning [49.60927398960061]
We propose a neuro-symbolic Deep RL approach that integrates background symbolic knowledge to improve sample efficiency. Online reasoning is performed to guide the training process through two mechanisms. We show improved performance over a state-of-the-art reward machine baseline.
arXiv Detail & Related papers (2026-01-06T09:28:53Z)
- Scaling Online Distributionally Robust Reinforcement Learning: Sample-Efficient Guarantees with General Function Approximation [18.596128578766958]
Distributionally robust RL (DR-RL) addresses this issue by optimizing worst-case performance over an uncertainty set of transition dynamics; one standard formulation of this objective is sketched after this list. We propose an online DR-RL algorithm with general function approximation that learns an optimal robust policy purely through interaction with the environment. We provide a theoretical analysis establishing a near-optimal sublinear regret bound under a total variation uncertainty set, demonstrating the sample efficiency and effectiveness of our method.
arXiv Detail & Related papers (2025-12-22T02:12:04Z)
- Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs [78.09559830840595]
We present the first systematic study on quantizing diffusion-based language models. We identify the presence of activation outliers, characterized by abnormally large activation values. We implement state-of-the-art PTQ methods and conduct a comprehensive evaluation.
arXiv Detail & Related papers (2025-08-20T17:59:51Z)
- Agentic Reinforced Policy Optimization [66.96989268893932]
Large-scale reinforcement learning with verifiable rewards (RLVR) has demonstrated its effectiveness in harnessing the potential of large language models (LLMs) for single-turn reasoning tasks. Current RL algorithms inadequately balance the models' intrinsic long-horizon reasoning capabilities and their proficiency in multi-turn tool interactions. We propose Agentic Reinforced Policy Optimization (ARPO), a novel agentic RL algorithm tailored for training multi-turn LLM-based agents.
arXiv Detail & Related papers (2025-07-26T07:53:11Z)
- Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning [15.789898162610529]
Reinforcement learning from human feedback (RLHF) has become a crucial step in building reliable generative AI models. This study develops a disciplined approach to fine-tuning diffusion models using continuous-time RL.
arXiv Detail & Related papers (2025-02-03T20:50:05Z)
- Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient [26.675822002049372]
Deep Diffusion Policy Gradient (DDiffPG) is a novel actor-critic algorithm that learns multimodal policies from scratch.
DDiffPG forms a multimodal training batch and utilizes mode-specific Q-learning to mitigate the inherent greediness of the RL objective.
Our approach further allows the policy to be conditioned on mode-specific embeddings to explicitly control the learned modes.
arXiv Detail & Related papers (2024-06-02T09:32:28Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Unsupervised Solution Operator Learning for Mean-Field Games via Sampling-Invariant Parametrizations [7.230928145936957]
We develop a novel framework to learn the MFG solution operator.
Our model takes MFG instances as input and outputs their solutions in one forward pass.
It is discretization-free, making it suitable for learning operators of high-dimensional MFGs.
arXiv Detail & Related papers (2024-01-27T19:07:49Z)
- Distributionally Robust Model-based Reinforcement Learning with Large State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition processes, and the deviation of real-world dynamics from the training environment at deployment time.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for mean-field control (MFC).
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Adversarial Inverse Reinforcement Learning for Mean Field Games [17.392418397388823]
Mean field games (MFGs) provide a mathematically tractable framework for modelling large-scale multi-agent systems.
This paper proposes a novel framework, Mean-Field Adversarial IRL (MF-AIRL), which is capable of tackling uncertainties in demonstrations.
arXiv Detail & Related papers (2021-04-29T21:03:49Z)
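The worst-case objective referenced in the two distributionally robust entries above can be written, in one standard textbook form (an illustrative formulation under a total variation uncertainty set, not necessarily the exact one used in either paper), as

```latex
V^{\pi}_{\mathrm{rob}}(s)
  = \inf_{P \in \mathcal{U}_{\rho}(\hat{P})}
    \mathbb{E}^{\pi,P}\!\left[\sum_{t \ge 0} \gamma^{t}\, r(s_t, a_t) \,\middle|\, s_0 = s\right],
\qquad
\mathcal{U}_{\rho}(\hat{P})
  = \left\{ P : \mathrm{TV}\!\left(P(\cdot \mid s,a),\, \hat{P}(\cdot \mid s,a)\right) \le \rho
    \ \ \forall (s,a) \right\},
```

where \hat{P} is the nominal (training) transition model and \rho controls how far the deployed dynamics may deviate from it.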