Mean Field Games Flock! The Reinforcement Learning Way
- URL: http://arxiv.org/abs/2105.07933v1
- Date: Mon, 17 May 2021 15:17:36 GMT
- Title: Mean Field Games Flock! The Reinforcement Learning Way
- Authors: Sarah Perrin, Mathieu Laurière, Julien Pérolat, Matthieu Geist,
Romuald Élie, Olivier Pietquin
- Abstract summary: We present a method enabling a large number of agents to learn how to flock.
This is a natural behavior observed in large populations of animals.
We show numerically that our algorithm learns multi-group or high-dimensional flocking with obstacles.
- Score: 34.67098179276852
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present a method enabling a large number of agents to learn how to flock,
which is a natural behavior observed in large populations of animals. This
problem has drawn a lot of interest but requires many structural assumptions
and is tractable only in small dimensions. We phrase this problem as a Mean
Field Game (MFG), where each individual chooses its acceleration depending on
the population behavior. Combining Deep Reinforcement Learning (RL) and
Normalizing Flows (NF), we obtain a tractable solution requiring only very weak
assumptions. Our algorithm finds a Nash Equilibrium and the agents adapt their
velocity to match the neighboring flock's average one. We use Fictitious Play
and alternate: (1) computing an approximate best response with Deep RL, and (2)
estimating the next population distribution with NF. We show numerically that
our algorithm learns multi-group or high-dimensional flocking with obstacles.
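To make the alternation concrete, below is a minimal, self-contained toy sketch of the Fictitious Play loop described above, on a 1D velocity-matching problem. It is an illustration under strong simplifications, not the authors' implementation: tabular Q-learning stands in for Deep RL, and a histogram over discretized velocities stands in for the Normalizing Flow.

```python
import numpy as np

# Toy sketch of the Fictitious Play loop from the abstract (illustrative
# simplifications, not the paper's code): tabular Q-learning replaces Deep RL,
# and a histogram over discretized velocities replaces the Normalizing Flow.

VELS = np.linspace(-2.0, 2.0, 21)    # discretized 1D velocities (states)
ACCS = np.array([-0.2, 0.0, 0.2])    # discretized accelerations (actions)

def nearest(v):
    return int(np.abs(VELS - v).argmin())

def best_response(pop, episodes=500, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Q-learning against a frozen population `pop` (the mean-field input).
    Reward: match the population's mean velocity, i.e. flock."""
    mean_v = float(pop @ VELS)
    Q = np.zeros((len(VELS), len(ACCS)))
    rng = np.random.default_rng(seed)
    for _ in range(episodes):
        s = rng.integers(len(VELS))
        for _ in range(20):
            a = rng.integers(len(ACCS)) if rng.random() < eps else int(Q[s].argmax())
            s2 = nearest(np.clip(VELS[s] + ACCS[a], VELS[0], VELS[-1]))
            r = -(VELS[s2] - mean_v) ** 2            # velocity-matching reward
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q.argmax(axis=1)                          # greedy policy

def induced_population(policy, pop, steps=30):
    """Push the population forward under `policy`; this histogram update plays
    the role the Normalizing Flow plays in the paper (density estimation)."""
    new = np.zeros_like(pop)
    for s, mass in enumerate(pop):
        v = VELS[s]
        for _ in range(steps):
            v = np.clip(v + ACCS[policy[nearest(v)]], VELS[0], VELS[-1])
        new[nearest(v)] += mass
    return new

pop = np.ones(len(VELS)) / len(VELS)                 # uniform initial population
for k in range(10):                                  # Fictitious Play iterations
    pi = best_response(pop)                          # (1) approximate best response
    pop = (k * pop + induced_population(pi, pop)) / (k + 1)  # (2) FP averaging
print("population mean velocity:", float(pop @ VELS))
```

The averaging step is the Fictitious Play mixing: each iteration's induced distribution is blended into the running population estimate, which in turn shapes the mean-field reward seen by the next best response.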
Related papers
- Reinforcement Learning for Finite Space Mean-Field Type Games [3.8207676009459886]
Mean field type games (MFTGs) describe Nash equilibria between large coalitions.
We develop reinforcement learning methods for such games in a finite space setting.
arXiv Detail & Related papers (2024-09-25T17:15:26Z)
- Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning [55.65738319966385]
We propose a novel online algorithm, iterative Nash policy optimization (INPO).
Unlike previous methods, INPO bypasses the need for estimating the expected win rate for individual responses.
With an LLaMA-3-8B-based SFT model, INPO achieves a 42.6% length-controlled win rate on AlpacaEval 2.0 and a 37.8% win rate on Arena-Hard.
arXiv Detail & Related papers (2024-06-30T08:00:34Z)
- Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning [43.004209289015975]
Mean Field Games (MFGs) have the ability to handle large-scale multi-agent systems.
We propose a deep reinforcement learning (DRL) algorithm that achieves population-dependent Nash equilibrium.
arXiv Detail & Related papers (2024-03-06T08:55:34Z)
- Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated Equilibrium (CCE) of the game.
Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z)
- Regularization of the policy updates for stabilizing Mean Field Games [0.2348805691644085]
This work studies non-cooperative Multi-Agent Reinforcement Learning (MARL), where multiple agents interact in the same environment and each aims to maximize its individual return.
We name our algorithm Mean Field Proximal Policy Optimization (MF-PPO), and we empirically show the effectiveness of our method in the OpenSpiel framework.
arXiv Detail & Related papers (2023-04-04T05:45:42Z)
- Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization [6.6765384699410095]
We argue for the benefits of learning a population of complementary policies, which can be simultaneously rolled out at inference.
We show that Poppy produces a set of complementary policies, and obtains state-of-the-art RL results on four popular NP-hard problems.
arXiv Detail & Related papers (2022-10-07T11:58:08Z)
- Learning in Mean Field Games: A Survey [44.93300994923148]
Mean Field Games (MFGs) rely on a mean-field approximation to allow the number of players to grow to infinity.
We review recent literature on Reinforcement Learning methods to learn equilibria and social optima in MFGs.
We present a general framework for classical iterative methods to solve MFGs in an exact way.
arXiv Detail & Related papers (2022-05-25T17:49:37Z)
- Scalable Deep Reinforcement Learning Algorithms for Mean Field Games [60.550128966505625]
Mean Field Games (MFGs) have been introduced to efficiently approximate games with very large populations of strategic agents.
Recently, the question of learning equilibria in MFGs has gained momentum, particularly using model-free reinforcement learning (RL) methods.
Existing algorithms to solve MFGs require the mixing of approximated quantities such as strategies or $q$-values.
We propose two methods to address this shortcoming. The first one learns a mixed strategy from distillation of historical data into a neural network and is applied to the Fictitious Play algorithm.
The second one is an online mixing method based on regularization that does not require memorizing historical data or previous estimates; it is used to extend Online Mirror Descent.
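For intuition on the first method, here is a minimal sketch of the distillation idea (an illustrative simplification, not the paper's code): fit one set of logits to the uniform mixture of all past best-response policies by cross-entropy; in practice a neural network would replace the tabular logits.

```python
import numpy as np

# Sketch of distilling historical best responses into a single mixed strategy
# (illustrative assumption): minimize cross-entropy to the FP mixture target.

def distill(past_policies, lr=0.5, steps=200):
    """past_policies: list of (n_states, n_actions) policy arrays."""
    target = np.mean(np.stack(past_policies), axis=0)   # FP mixture to imitate
    logits = np.zeros_like(target)
    for _ in range(steps):
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)                # softmax rows
        logits -= lr * (p - target)   # gradient of cross-entropy wrt logits
    return p                          # distilled mixed strategy
```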
arXiv Detail & Related papers (2022-03-22T18:10:32Z)
- Scaling up Mean Field Games with Online Mirror Descent [55.36153467919289]
We address scaling up equilibrium computation in Mean Field Games (MFGs) using Online Mirror Descent (OMD).
We show that continuous-time OMD provably converges to a Nash equilibrium under a natural and well-motivated set of monotonicity assumptions.
A thorough experimental investigation on various single and multi-population MFGs shows that OMD outperforms traditional algorithms such as Fictitious Play (FP).
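For intuition, here is a minimal tabular sketch of the OMD policy update used in this line of work (an illustration under assumed finite states and actions, not the paper's implementation): Q-values are accumulated across iterations and the policy is their softmax.

```python
import numpy as np

# Tabular sketch of Online Mirror Descent for MFGs (illustrative assumption):
# keep a running sum of Q-values and set the policy to its softmax.

def omd_update(cum_q, q_new, temperature=1.0):
    """cum_q, q_new: (n_states, n_actions) arrays; returns (sum, policy)."""
    cum_q = cum_q + q_new                                 # OMD accumulation
    logits = cum_q / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    policy = np.exp(logits)
    return cum_q, policy / policy.sum(axis=1, keepdims=True)  # softmax rows
```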
arXiv Detail & Related papers (2021-02-28T21:28:36Z)
- Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism [107.48538091418412]
We study exploration in multi-armed bandits when we have access to a divisible resource that can be allocated in varying amounts to arm pulls.
We focus in particular on the allocation of distributed computing resources, where we may obtain results faster by allocating more resources per pull.
arXiv Detail & Related papers (2020-10-31T18:19:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.