Deep Reinforcement Learning for Infinite Horizon Mean Field Problems in Continuous Spaces
- URL: http://arxiv.org/abs/2309.10953v2
- Date: Thu, 2 May 2024 21:20:49 GMT
- Title: Deep Reinforcement Learning for Infinite Horizon Mean Field Problems in Continuous Spaces
- Authors: Andrea Angiuli, Jean-Pierre Fouque, Ruimeng Hu, Alan Raydan
- Abstract summary: We present a reinforcement learning (RL) algorithm designed to solve mean field games (MFG) and mean field control (MFC) problems in a unified manner.
The proposed approach pairs the actor-critic (AC) paradigm with a representation of the mean field distribution via a parameterized score function.
A modification of the algorithm allows us to solve mixed mean field control games (MFCGs).
- Score: 1.4999444543328293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present the development and analysis of a reinforcement learning (RL) algorithm designed to solve continuous-space mean field game (MFG) and mean field control (MFC) problems in a unified manner. The proposed approach pairs the actor-critic (AC) paradigm with a representation of the mean field distribution via a parameterized score function, which can be efficiently updated in an online fashion, and uses Langevin dynamics to obtain samples from the resulting distribution. The AC agent and the score function are updated iteratively to converge, either to the MFG equilibrium or the MFC optimum for a given mean field problem, depending on the choice of learning rates. A straightforward modification of the algorithm allows us to solve mixed mean field control games (MFCGs). The performance of our algorithm is evaluated using linear-quadratic benchmarks in the asymptotic infinite horizon framework.
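The abstract pairs an actor-critic agent with a parameterized score function and draws mean-field samples via Langevin dynamics. The following minimal NumPy sketch illustrates that sampling step; the function name, step size, and the closed-form Gaussian score used in the demo are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def langevin_sample(score, x0, step=1e-2, n_steps=500, rng=None):
    """Unadjusted Langevin dynamics: x <- x + step * score(x) + sqrt(2 * step) * noise.

    If `score` approximates the score (gradient of the log-density) of a target
    distribution, the iterates approximately sample from that distribution.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step * score(x) + np.sqrt(2.0 * step) * rng.standard_normal(np.shape(x))
    return x

# Demo with a known score: for N(m, sigma^2) the score is (m - x) / sigma^2.
# It stands in here for the learned, parameterized score of the mean-field
# distribution described in the abstract.
m, sigma = 1.0, 0.5
gaussian_score = lambda x: (m - x) / sigma**2
samples = np.array([langevin_sample(gaussian_score, x0=0.0) for _ in range(200)])
print(samples.mean(), samples.std())  # approximately m and sigma
```

In the paper's setting, the score would be a neural network updated online from the agent's trajectory; the fixed Gaussian score above merely lets the sketch run end to end.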
Related papers
- Unified continuous-time q-learning for mean-field game and mean-field control problems [4.416317245952636]
We introduce the integrated q-function in decoupled form (decoupled Iq-function) and establish its martingale characterization together with the value function.
We devise a unified q-learning algorithm for both mean-field game (MFG) and mean-field control (MFC) problems.
For several examples in the jump-diffusion setting, within and beyond the LQ framework, we can obtain the exact parameterization of the decoupled Iq-functions and the value functions.
arXiv Detail & Related papers (2024-07-05T14:06:59Z)
- Fast Value Tracking for Deep Reinforcement Learning [7.648784748888187]
Reinforcement learning (RL) tackles sequential decision-making problems by creating agents that interact with their environment.
Existing algorithms often view these problems as static, focusing on point estimates for model parameters to maximize expected rewards.
Our research leverages the Kalman filtering paradigm to introduce a novel uncertainty quantification and sampling algorithm called Langevinized Kalman Temporal-Difference (LKTD).
arXiv Detail & Related papers (2024-03-19T22:18:19Z)
- Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}(\ln(T) / T^{1 - \frac{1}{\alpha}})$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z)
- Ensemble Kalman Filtering Meets Gaussian Process SSM for Non-Mean-Field and Online Inference [47.460898983429374]
We introduce an ensemble Kalman filter (EnKF) into the non-mean-field (NMF) variational inference framework to approximate the posterior distribution of the latent states.
This novel marriage between the EnKF and the Gaussian process state-space model (GPSSM) not only eliminates the need for extensive parameterization in learning variational distributions, but also enables an interpretable, closed-form approximation of the evidence lower bound (ELBO).
We demonstrate that the resulting EnKF-aided online algorithm embodies a principled objective function by ensuring data-fitting accuracy while incorporating model regularizations to mitigate overfitting.
arXiv Detail & Related papers (2023-12-10T15:22:30Z)
- PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback [106.63518036538163]
We present a novel unified bilevel optimization-based framework, PARL, formulated to address the recently highlighted critical issue of policy alignment in reinforcement learning.
Our framework addresses these concerns by explicitly parameterizing the distribution of the upper alignment objective (reward design) by the lower optimal variable.
Our empirical results substantiate that the proposed PARL can address the alignment concerns in RL by showing significant improvements.
arXiv Detail & Related papers (2023-08-03T18:03:44Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
- Reinforcement Learning in Non-Stationary Discrete-Time Linear-Quadratic Mean-Field Games [14.209473797379667]
We study large population multi-agent reinforcement learning (RL) in the context of discrete-time linear-quadratic mean-field games (LQ-MFGs).
Our setting differs from most existing work on RL for MFGs, in that we consider a non-stationary MFG over an infinite horizon.
We propose an actor-critic algorithm to iteratively compute the mean-field equilibrium (MFE) of the LQ-MFG.
arXiv Detail & Related papers (2020-09-09T15:17:52Z)
- Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time [109.06623773924737]
We study the policy gradient method for the linear-quadratic mean-field control and game.
We show that it converges to the optimal solution at a linear rate, which is verified by a synthetic simulation.
arXiv Detail & Related papers (2020-08-16T06:34:11Z)
- Unified Reinforcement Q-Learning for Mean Field Game and Control Problems [0.0]
We present a Reinforcement Learning (RL) algorithm to solve infinite horizon Mean Field Game (MFG) and Mean Field Control (MFC) problems.
Our approach can be described as a unified two-timescale Mean Field Q-learning: the same algorithm can learn either the MFG or the MFC solution by simply tuning the ratio of two learning parameters (a minimal sketch of this two-timescale loop is given below).
arXiv Detail & Related papers (2020-06-24T17:45:44Z)
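Both the main abstract above and this last related paper rely on the same two-timescale mechanism: one learning rate for the value update and one for the mean-field update, with their ratio selecting between the MFG equilibrium and the MFC optimum. The sketch below is a minimal tabular version of that loop; the environment interface (`reset()`, `step(s, a, mu)`), the `ToyEnv`, and all rates are illustrative assumptions, not either paper's exact algorithm.

```python
import numpy as np

def two_timescale_mfq(env, n_states, n_actions, alpha_q, alpha_mu,
                      gamma=0.95, n_iters=20_000, eps=0.1, rng=None):
    """Two-timescale mean-field Q-learning sketch.

    alpha_q  -- learning rate for the Q-function update
    alpha_mu -- learning rate for the mean-field (state-distribution) estimate
    Making mu move slowly relative to Q (alpha_mu << alpha_q) targets the MFG
    equilibrium; reversing the ratio targets the MFC optimum.
    """
    rng = np.random.default_rng() if rng is None else rng
    Q = np.zeros((n_states, n_actions))
    mu = np.full(n_states, 1.0 / n_states)   # running mean-field estimate
    s = env.reset()
    for _ in range(n_iters):
        # epsilon-greedy action selection
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        s_next, r = env.step(s, a, mu)       # reward may depend on mu
        # Q update on its own timescale ...
        Q[s, a] += alpha_q * (r + gamma * Q[s_next].max() - Q[s, a])
        # ... and mean-field update toward the empirical state visitation
        # (a convex combination, so mu stays a probability vector).
        mu += alpha_mu * (np.eye(n_states)[s_next] - mu)
        s = s_next
    return Q, mu

class ToyEnv:
    """Hypothetical 2-state environment whose reward penalizes crowding."""
    def reset(self):
        return 0
    def step(self, s, a, mu):
        s_next = a % 2                       # the action picks the next state
        return s_next, 1.0 - mu[s_next]      # less reward in crowded states

Q, mu = two_timescale_mfq(ToyEnv(), n_states=2, n_actions=2,
                          alpha_q=0.1, alpha_mu=0.001)
print(mu)  # roughly uniform: agents spread out to avoid crowding
```

Swapping the roles of alpha_q and alpha_mu is the entire difference between solving for the game equilibrium and the control optimum in this unified scheme, which is the point both papers exploit.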