Concave Utility Reinforcement Learning: the Mean-field Game viewpoint
- URL: http://arxiv.org/abs/2106.03787v2
- Date: Wed, 9 Jun 2021 09:27:45 GMT
- Title: Concave Utility Reinforcement Learning: the Mean-field Game viewpoint
- Authors: Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin
- Abstract summary: Concave Utility Reinforcement Learning (CURL) extends RL from linear to concave utilities in the occupancy measure induced by the agent's policy.
This more general paradigm invalidates the classical Bellman equations, and calls for new algorithms.
We show that CURL is a subclass of Mean-field Games (MFGs).
- Score: 42.403650997341806
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Concave Utility Reinforcement Learning (CURL) extends RL from linear to
concave utilities in the occupancy measure induced by the agent's policy. This
encompasses not only RL but also imitation learning and exploration, among
others. Yet, this more general paradigm invalidates the classical Bellman
equations, and calls for new algorithms. Mean-field Games (MFGs) are a
continuous approximation of many-agent RL. They consider the limit case of a
continuous distribution of identical agents, anonymous with symmetric
interests, and reduce the problem to the study of a single representative agent
in interaction with the full population. Our core contribution consists in
showing that CURL is a subclass of MFGs. We believe this connection is
important for bridging the two communities. It also sheds light on aspects of
both fields: we show the equivalence between concavity in CURL and monotonicity in
the associated MFG, between optimality conditions in CURL and Nash equilibrium
in MFG, or that Fictitious Play (FP) for this class of MFGs is simply
Frank-Wolfe, bringing the first convergence rate for discrete-time FP for MFGs.
We also experimentally demonstrate that, using algorithms recently introduced
for solving MFGs, we can address the CURL problem more efficiently.
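Since the abstract's key technical point is that discrete-time Fictitious Play (FP) for this class of MFGs coincides with Frank-Wolfe on the concave utility, a small self-contained sketch may help fix ideas. Everything below is an illustrative assumption rather than the authors' implementation: a random tabular MDP, an entropy utility as the CURL objective, and a short horizon. The utility is linearized at the current averaged occupancy measure, the resulting linear reward defines a standard MDP whose best response is computed by backward induction, and averaging occupancy measures with step 1/(k+1) is exactly the FP / Frank-Wolfe update.

import numpy as np

rng = np.random.default_rng(0)
S, A, H = 5, 3, 10                                 # tiny finite-horizon MDP (assumed)
P = rng.dirichlet(np.ones(S), size=(S, A))         # P[s, a] = next-state distribution
mu0 = np.ones(S) / S                               # initial state distribution

def F(mu):
    # Concave utility of the state-action occupancy measure.
    # Entropy is one CURL instance (pure exploration), chosen here for illustration.
    return -np.sum(mu * np.log(mu + 1e-12))

def grad_F(mu):
    return -(np.log(mu + 1e-12) + 1.0)

def best_response_occupancy(r):
    # Best response to a fixed linear reward r: backward induction,
    # then roll the greedy policy forward to collect its occupancy measure.
    V, pi = np.zeros(S), np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = r + P @ V                              # Q[s, a] = r[s, a] + E[V(s')]
        pi[h], V = Q.argmax(axis=1), Q.max(axis=1)
    mu, d = np.zeros((S, A)), mu0.copy()
    for h in range(H):
        mu[np.arange(S), pi[h]] += d / H
        d = d @ P[np.arange(S), pi[h]]             # next-step state distribution
    return mu

# Fictitious Play on the induced MFG = Frank-Wolfe on F (step size 1/(k+1)):
mu_bar = best_response_occupancy(np.ones((S, A)))
for k in range(1, 200):
    r_k = grad_F(mu_bar)                           # linearize F at the averaged occupancy
    mu_br = best_response_occupancy(r_k)           # best response = solve a standard MDP
    mu_bar += (mu_br - mu_bar) / (k + 1)           # fictitious-play (uniform) averaging
print("F(mu_bar) =", F(mu_bar))

With the entropy utility, the averaged occupancy drifts toward the maximum-entropy feasible occupancy, a pure-exploration instance of CURL; swapping F for another concave utility (e.g. a divergence to an expert occupancy, for imitation) leaves the loop unchanged.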
Related papers
- Analysis of Multiscale Reinforcement Q-Learning Algorithms for Mean Field Control Games [2.3833208322103605]
Mean Field Control Games (MFCG) represent competitive games between a large number of large collaborative groups of agents.
We prove the convergence of a three-timescale Reinforcement Q-Learning (RL) algorithm to solve MFCG.
arXiv Detail & Related papers (2024-05-27T10:01:52Z)
- Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning [50.92957910121088]
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS).
For episodic two-player zero-sum MGs, we present three sample-efficient algorithms for learning Nash equilibrium.
We extend Reg-MAIDS to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or coarse correlated equilibrium in a sample efficient manner.
arXiv Detail & Related papers (2024-04-30T06:48:56Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Learning Discrete-Time Major-Minor Mean Field Games [61.09249862334384]
We propose a novel discrete time version of major-minor MFGs (M3FGs) and a learning algorithm based on fictitious play and partitioning the probability simplex.
M3FGs generalize MFGs with common noise and can handle not only random exogenous environment states but also major players.
arXiv Detail & Related papers (2023-12-17T18:22:08Z)
- On Imitation in Mean-field Games [53.27734434016737]
We explore the problem of imitation learning (IL) in the context of mean-field games (MFGs).
We show that when only the reward depends on the population distribution, IL in MFGs can be reduced to single-agent IL with similar guarantees.
We propose a new adversarial formulation where the reinforcement learning problem is replaced by a mean-field control problem.
arXiv Detail & Related papers (2023-06-26T15:58:13Z)
- Online Learning with Adversaries: A Differential-Inclusion Analysis [52.43460995467893]
We introduce an observation-matrix-based framework for fully asynchronous online Federated Learning with adversaries.
Our main result is that the proposed algorithm almost surely converges to the desired mean $\mu$.
We derive this convergence using a novel differential-inclusion-based two-timescale analysis.
arXiv Detail & Related papers (2023-04-04T04:32:29Z)
- Individual-Level Inverse Reinforcement Learning for Mean Field Games [16.79251229846642]
Mean Field IRL (MFIRL) is the first dedicated IRL framework for MFGs that can handle both cooperative and non-cooperative environments.
We develop a practical algorithm effective for MFGs with unknown dynamics.
arXiv Detail & Related papers (2022-02-13T20:35:01Z)
- Reinforcement Learning for Mean Field Games, with Applications to Economics [0.0]
Mean field games (MFG) and mean field control problems (MFC) are frameworks to study Nash equilibria or social optima in games with a continuum of agents.
We present a two timescale approach with RL for MFG and MFC, which relies on a unified Q-learning algorithm.
arXiv Detail & Related papers (2021-06-25T16:45:04Z)
- Unified Reinforcement Q-Learning for Mean Field Game and Control Problems [0.0]
We present a Reinforcement Learning (RL) algorithm to solve infinite horizon Mean Field Game (MFG) and Mean Field Control (MFC) problems.
Our approach can be described as a unified two-timescale Mean Field Q-learning: the same algorithm can learn either the MFG or the MFC solution by simply tuning the ratio of two learning parameters (a toy sketch of this two-timescale structure appears after the list below).
arXiv Detail & Related papers (2020-06-24T17:45:44Z)
- Alternating the Population and Control Neural Networks to Solve High-Dimensional Stochastic Mean-Field Games [9.909883019034613]
We present an alternating population and agent control neural network for solving mean field games (MFGs).
Our algorithm is geared toward high-dimensional instances of MFGs that are beyond reach with existing solution methods.
We show the potential of our method on up to 100-dimensional MFG problems.
arXiv Detail & Related papers (2020-02-24T08:24:52Z)
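The two entries above on unified two-timescale Q-learning for MFG and MFC describe a single Q-learning loop in which the Q-table and the population distribution are updated at different rates, with the ratio of the rates selecting which solution is learned. The toy sketch below illustrates only that two-timescale structure; the dynamics, the mean-field reward, and the learning rates are assumptions, and which fixed point (MFG Nash equilibrium or MFC optimum) a given ratio targets is specified in the cited papers, not asserted here.

import numpy as np

rng = np.random.default_rng(1)
S, A = 4, 2
P = rng.dirichlet(np.ones(S), size=(S, A))   # toy transition kernel (assumed)
gamma, eps = 0.95, 0.1
rho_Q, rho_mu = 0.1, 0.01                    # the two learning rates; their ratio is the knob

def reward(s, a, mu):
    # Illustrative mean-field reward: a base term minus a crowding penalty (assumption).
    return float(a == 0) - mu[s]

Q, mu, s = np.zeros((S, A)), np.ones(S) / S, 0
for t in range(20_000):
    a = rng.integers(A) if rng.random() < eps else int(Q[s].argmax())
    s_next = rng.choice(S, p=P[s, a])
    # Q-learning update at rate rho_Q, with the reward evaluated at the current mu.
    Q[s, a] += rho_Q * (reward(s, a, mu) + gamma * Q[s_next].max() - Q[s, a])
    # Population update at rate rho_mu: push mu toward the empirically visited states.
    onehot = np.zeros(S); onehot[s_next] = 1.0
    mu += rho_mu * (onehot - mu)
    s = s_next
print("Q:", Q.round(2), "mu:", mu.round(3))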