Scalable Deep Reinforcement Learning Algorithms for Mean Field Games
- URL: http://arxiv.org/abs/2203.11973v1
- Date: Tue, 22 Mar 2022 18:10:32 GMT
- Title: Scalable Deep Reinforcement Learning Algorithms for Mean Field Games
- Authors: Mathieu Laurière, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush
Jain, Theophile Cabannes, Georgios Piliouras, Julien Pérolat, Romuald
Élie, Olivier Pietquin, Matthieu Geist
- Abstract summary: Mean Field Games (MFGs) have been introduced to efficiently approximate games with very large populations of strategic agents.
Recently, the question of learning equilibria in MFGs has gained momentum, particularly using model-free reinforcement learning (RL) methods.
Existing algorithms to solve MFGs require the mixing of approximated quantities such as strategies or $q$-values.
We propose two methods to address this shortcoming. The first one learns a mixed strategy from distillation of historical data into a neural network and is applied to the Fictitious Play algorithm.
The second one is an online mixing method based on regularization that does not require memorizing historical data or previous estimates; it is used to extend Online Mirror Descent.
- Score: 60.550128966505625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mean Field Games (MFGs) have been introduced to efficiently approximate games
with very large populations of strategic agents. Recently, the question of
learning equilibria in MFGs has gained momentum, particularly using model-free
reinforcement learning (RL) methods. One limiting factor to further scale up
using RL is that existing algorithms to solve MFGs require the mixing of
approximated quantities such as strategies or $q$-values. This is non-trivial
for non-linear function approximators that enjoy good generalization
properties, e.g. neural networks. We propose two methods to address this
shortcoming. The first one learns a mixed strategy from distillation of
historical data into a neural network and is applied to the Fictitious Play
algorithm. The second one is an online mixing method based on regularization
that does not require memorizing historical data or previous estimates. It is
used to extend Online Mirror Descent. We demonstrate numerically that these
methods efficiently enable the use of Deep RL algorithms to solve various MFGs.
In addition, we show that these methods outperform SotA baselines from the
literature.
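To make the first method concrete, here is a minimal, hedged sketch: (state, action) pairs collected from past best responses are distilled into a single network, which then represents the time-averaged (mixed) strategy, so no explicit mixing of stored policies or $q$-values is needed. All class and function names, network sizes, and the toy data below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed names and shapes): supervised distillation of past
# best responses into one network that plays the Fictitious Play mixed strategy.
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, obs):
        return self.net(obs)  # action logits

def distill_mixed_policy(dataset, obs_dim, n_actions, epochs=200, lr=1e-2):
    """Fit one network to all (state, action) pairs gathered from the best
    responses of previous Fictitious Play iterations; the resulting policy
    approximates their uniform mixture."""
    policy = PolicyNet(obs_dim, n_actions)
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    states = torch.cat([s for s, _ in dataset])
    actions = torch.cat([a for _, a in dataset])
    for _ in range(epochs):
        loss = loss_fn(policy(states), actions)
        opt.zero_grad(); loss.backward(); opt.step()
    return policy

# Toy usage: pretend two past best responses produced these rollouts.
torch.manual_seed(0)
dataset = [(torch.randn(128, 4), torch.randint(0, 3, (128,))),  # BR at iteration 1
           (torch.randn(128, 4), torch.randint(0, 3, (128,)))]  # BR at iteration 2
mixed = distill_mixed_policy(dataset, obs_dim=4, n_actions=3)
print(mixed(torch.randn(1, 4)).softmax(-1))  # mixed strategy at a sample state
```

In a full Fictitious Play loop, this distillation step would run once per iteration, after a deep RL algorithm has computed a best response against the mean field induced by the current mixed policy. The paper's second method avoids storing such a dataset altogether by folding the averaging into a regularized, Online Mirror Descent style update.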
Related papers
- Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning [50.92957910121088]
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS).
For episodic two-player zero-sum Markov games (MGs), we present three sample-efficient algorithms for learning the Nash equilibrium.
We extend Reg-MAIDS to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or a coarse correlated equilibrium in a sample-efficient manner.
arXiv Detail & Related papers (2024-04-30T06:48:56Z)
- Rethinking Population-assisted Off-policy Reinforcement Learning [7.837628433605179]
Off-policy reinforcement learning algorithms struggle with convergence to local optima due to limited exploration.
Population-based algorithms offer a natural exploration strategy, but their black-box operators are inefficient.
Recent algorithms have integrated these two methods, connecting them through a shared replay buffer.
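As a rough illustration of that shared-buffer coupling, the sketch below shows one buffer into which every member of the population writes transitions and from which an off-policy learner samples; the names and toy transitions are assumptions for this sketch, not the interface of any specific paper.

```python
# Illustrative sketch only: a replay buffer shared between a population of
# policies (writers) and an off-policy learner (reader).
import random
from collections import deque

class SharedReplayBuffer:
    def __init__(self, capacity=100_000):
        self.storage = deque(maxlen=capacity)

    def push(self, transition):
        # Called by every population member after each environment step.
        self.storage.append(transition)

    def sample(self, batch_size):
        # Called by the off-policy learner, e.g. for a TD3/SAC-style update.
        return random.sample(list(self.storage), batch_size)

# Toy usage: 4 population members each contribute 100 dummy transitions.
buffer = SharedReplayBuffer()
for member in range(4):
    for _ in range(100):
        buffer.push((member, "obs", "act", 0.0, "next_obs", False))
print(len(buffer.sample(32)))  # 32 transitions drawn across all members
```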
arXiv Detail & Related papers (2023-05-04T15:53:00Z)
- Deep Learning for Mean Field Games with non-separable Hamiltonians [0.0]
This paper introduces a new method for solving high-dimensional Mean Field Games (MFGs).
We achieve this by using two neural networks to approximate the unknown solutions of the MFG system and its forward-backward conditions.
Our method is efficient, even with a small number of iterations, and is capable of handling up to 300 dimensions with a single layer.
arXiv Detail & Related papers (2023-01-07T15:39:48Z)
- Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as rewards.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
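For reference, the soft Q-learning backup that this formulation builds on replaces the max over actions with a temperature-scaled log-sum-exp; in text generation the actions are vocabulary tokens. The sketch below only illustrates that backup with assumed shapes, not the paper's full training pipeline.

```python
# Soft Bellman target: r + gamma * tau * logsumexp(Q(s', .) / tau); shapes are
# illustrative (batch of partial sequences, small toy vocabulary).
import torch

def soft_q_targets(rewards, next_q, done, gamma=0.99, tau=1.0):
    """rewards: (B,); next_q: (B, vocab) Q-values at the next decoding step;
    done: (B,) equal to 1.0 where the sequence has already ended."""
    next_v = tau * torch.logsumexp(next_q / tau, dim=-1)  # soft state value
    return rewards + gamma * (1.0 - done) * next_v

# Toy usage: vocabulary of 5 tokens, batch of 3 partial sequences.
targets = soft_q_targets(torch.tensor([0.0, 1.0, -0.5]),
                         torch.randn(3, 5),
                         torch.tensor([0.0, 0.0, 1.0]))
print(targets)
```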
arXiv Detail & Related papers (2021-06-14T18:48:40Z)
- Learning Sampling Policy for Faster Derivative Free Optimization [100.27518340593284]
We propose a new reinforcement-learning-based zeroth-order algorithm (ZO-RL) that learns the sampling policy used to generate the perturbations in zeroth-order (ZO) optimization, instead of relying on random sampling.
Our results show that ZO-RL effectively reduces the variance of the ZO gradient estimate by learning a sampling policy, and converges faster than existing ZO algorithms in different scenarios.
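For context, a plain two-point zeroth-order gradient estimator with Gaussian perturbations looks like the sketch below; ZO-RL's contribution of learning where to place the perturbations is not reproduced here, and the names and defaults are assumptions.

```python
# Baseline ZO estimator: grad f(x) ~ E_u[(f(x + mu*u) - f(x)) / mu * u], u ~ N(0, I).
import numpy as np

def zo_gradient(f, x, n_samples=20, mu=1e-2, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)       # random perturbation direction
        g += (f(x + mu * u) - fx) / mu * u     # finite difference along u
    return g / n_samples

# Toy check on a quadratic, whose true gradient at x is 2*x.
x = np.array([1.0, -2.0, 0.5])
print(zo_gradient(lambda z: np.sum(z ** 2), x, n_samples=500))
```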
arXiv Detail & Related papers (2021-04-09T14:50:59Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms that generalize well to classical control tasks, gridworld-type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization [10.243908145832394]
We study the offline meta-reinforcement learning (OMRL) problem, a paradigm which enables reinforcement learning (RL) algorithms to quickly adapt to unseen tasks.
This problem is still not fully understood, and two major challenges need to be addressed.
We provide analysis and insight showing that some simple design choices can yield substantial improvements over recent approaches.
arXiv Detail & Related papers (2020-10-02T17:13:39Z)
- Discovering Reinforcement Learning Algorithms [53.72358280495428]
Reinforcement learning algorithms update an agent's parameters according to one of several possible rules.
This paper introduces a new meta-learning approach that discovers an entire update rule.
It includes both 'what to predict' (e.g. value functions) and 'how to learn from it' by interacting with a set of environments.
arXiv Detail & Related papers (2020-07-17T07:38:39Z)
- Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? [15.578423102700764]
We propose an online feature extractor network (OFENet) that uses neural nets to produce good representations to be used as inputs to deep RL algorithms.
We show that the RL agents learn more efficiently with the high-dimensional representation than with the lower-dimensional state observations.
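A minimal sketch of the idea as summarized here: a small network produces extra learned features, and the RL agent consumes the concatenation of the raw observation and those features, i.e. a higher-dimensional input. The layer sizes are arbitrary, and the auxiliary objective used to train the extractor is omitted; everything below is illustrative rather than the paper's architecture.

```python
# Illustrative only: expand an 8-dimensional observation into a 40-dimensional
# input for the downstream RL algorithm.
import torch
import torch.nn as nn

class FeatureExpander(nn.Module):
    def __init__(self, obs_dim: int, feat_dim: int = 32):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                  nn.Linear(64, feat_dim), nn.ReLU())
        self.out_dim = obs_dim + feat_dim  # dimensionality seen by the RL agent

    def forward(self, obs):
        return torch.cat([obs, self.body(obs)], dim=-1)

expander = FeatureExpander(obs_dim=8)
print(expander(torch.randn(2, 8)).shape)  # torch.Size([2, 40])
```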
arXiv Detail & Related papers (2020-03-03T16:52:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.