Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning
- URL: http://arxiv.org/abs/2107.04050v2
- Date: Tue, 9 May 2023 09:17:03 GMT
- Title: Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning
- Authors: Barna Pásztor, Ilija Bogunovic, Andreas Krause
- Abstract summary: We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
- Score: 89.31889875864599
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning in multi-agent systems is highly challenging due to several factors
including the non-stationarity introduced by agents' interactions and the
combinatorial nature of their state and action spaces. In particular, we
consider the Mean-Field Control (MFC) problem which assumes an asymptotically
infinite population of identical agents that aim to collaboratively maximize
the collective reward. In many cases, solutions of an MFC problem are good
approximations for large systems, hence, efficient learning for MFC is valuable
for the analogous discrete agent setting with many agents. Specifically, we
focus on the case of unknown system dynamics where the goal is to
simultaneously optimize for the rewards and learn from experience. We propose
an efficient model-based reinforcement learning algorithm, $M^3$-UCRL, that
runs in episodes, balances between exploration and exploitation during policy
learning, and provably solves this problem. Our main theoretical contributions
are the first general regret bounds for model-based reinforcement learning for
MFC, obtained via a novel mean-field type analysis. To learn the system's
dynamics, $M^3$-UCRL can be instantiated with various statistical models, e.g.,
neural networks or Gaussian Processes. Moreover, we provide a practical
parametrization of the core optimization problem that facilitates
gradient-based optimization techniques when combined with differentiable
dynamics approximation methods such as neural networks.
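To make the episodic structure concrete, here is a minimal, hedged sketch (not the authors' implementation) of an $M^3$-UCRL-style loop: fit a differentiable dynamics model of the mean-field transition, optimise the policy by back-propagating an optimistic return through the learned model, and execute the policy to collect fresh data. All dimensions, the ensemble used as an epistemic-uncertainty proxy, the toy reward, and the stand-in true dynamics are illustrative assumptions.

```python
# Minimal illustrative sketch of an M^3-UCRL-style episodic loop (not the
# authors' code). Dimensions, reward, and the toy "true" dynamics are assumed.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, HORIZON, N_EPISODES = 4, 2, 20, 10

class Dynamics(nn.Module):
    """One member of a small ensemble approximating the mean-field transition."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64),
                                 nn.ReLU(), nn.Linear(64, STATE_DIM))
    def forward(self, mu, a):
        return self.net(torch.cat([mu, a], dim=-1))

policy = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                       nn.Linear(64, ACTION_DIM), nn.Tanh())
ensemble = [Dynamics() for _ in range(5)]

def reward(mu, a):
    # Assumed collective reward: drive the mean-field state to zero cheaply.
    return -(mu ** 2).sum() - 0.1 * (a ** 2).sum()

model_opt = torch.optim.Adam([p for m in ensemble for p in m.parameters()], lr=1e-3)
policy_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
replay = []  # (mu, action, next_mu) transitions from executed episodes

for episode in range(N_EPISODES):
    # 1) Fit the dynamics ensemble to all data collected so far.
    for mu, a, next_mu in replay:
        model_opt.zero_grad()
        loss = sum(((m(mu, a) - next_mu) ** 2).mean() for m in ensemble)
        loss.backward()
        model_opt.step()

    # 2) Optimistic, gradient-based policy optimisation through the learned,
    #    differentiable model; ensemble disagreement serves as an exploration bonus.
    policy_opt.zero_grad()
    mu = torch.zeros(STATE_DIM)
    optimistic_return = torch.tensor(0.0)
    for _ in range(HORIZON):
        a = policy(mu)
        preds = torch.stack([m(mu, a) for m in ensemble])
        bonus = preds.std(dim=0).sum()
        optimistic_return = optimistic_return + reward(mu, a) + 0.1 * bonus
        mu = preds.mean(dim=0)
    (-optimistic_return).backward()
    policy_opt.step()

    # 3) Execute the policy on a toy stand-in for the unknown true system.
    mu = torch.zeros(STATE_DIM)
    for _ in range(HORIZON):
        with torch.no_grad():
            a = policy(mu)
            next_mu = 0.9 * mu + 0.1 * torch.cat([a, a])
        replay.append((mu, a, next_mu))
        mu = next_mu
```

In the paper's setting, the learned model could be a neural network or Gaussian process and the optimism would come from the model's confidence bounds; the ensemble disagreement above is only a lightweight stand-in for that epistemic uncertainty.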
Related papers
- M$^{2}$M: Learning controllable Multi of experts and multi-scale operators are the Partial Differential Equations need [43.534771810528305]
This paper introduces a framework of multi-scale and multi-expert (M$^{2}$M) neural operators to simulate and learn PDEs efficiently.
We employ a divide-and-conquer strategy to train a multi-expert gated network for the dynamic router policy.
Our method incorporates a controllable prior gating mechanism that determines the selection rights of experts, enhancing the model's efficiency.
arXiv Detail & Related papers (2024-10-01T15:42:09Z) - Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is challenging due to the complexity of the loss landscape and the high computational cost.
We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion.
By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
arXiv Detail & Related papers (2024-06-14T07:16:18Z) - Stochastic Q-learning for Large Discrete Action Spaces [79.1700188160944]
In complex environments with discrete action spaces, effective decision-making is critical in reinforcement learning (RL).
We present value-based RL approaches which, as opposed to optimizing over the entire set of $n$ actions, only consider a variable set of actions, possibly as small as $\mathcal{O}(\log(n))$.
The presented value-based RL methods include, among others, Stochastic Q-learning, StochDQN, and StochDDQN, all of which integrate this approach for both value-function updates and action selection (a brief illustrative sketch of this subset-based action selection appears after this list).
arXiv Detail & Related papers (2024-05-16T17:58:44Z) - Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Addressing the issue of stochastic environments and local decision-making in multi-objective reinforcement learning [0.0]
Multi-objective reinforcement learning (MORL) is a relatively new field which builds on conventional Reinforcement Learning (RL).
This thesis focuses on what factors influence the frequency with which value-based MORL Q-learning algorithms learn the optimal policy for an environment.
arXiv Detail & Related papers (2022-11-16T04:56:42Z) - Interfacing Finite Elements with Deep Neural Operators for Fast Multiscale Modeling of Mechanics Problems [4.280301926296439]
In this work, we explore the idea of multiscale modeling with machine learning and employ DeepONet, a neural operator, as an efficient surrogate of the expensive solver.
DeepONet is trained offline using data acquired from the fine solver for learning the underlying and possibly unknown fine-scale dynamics.
We present various benchmarks to assess accuracy and speedup, and in particular we develop a coupling algorithm for a time-dependent problem.
arXiv Detail & Related papers (2022-02-25T20:46:08Z) - Multi-Task Learning on Networks [0.0]
Multi-objective optimization problems arising in the multi-task learning context have specific features and require ad hoc methods.
In this thesis, the solutions in the Input Space are represented as probability distributions encapsulating the knowledge contained in the function evaluations.
In this space of probability distributions, endowed with the metric given by the Wasserstein distance, a new algorithm, MOEA/WST, can be designed in which the model is not built directly on the objective function.
arXiv Detail & Related papers (2021-12-07T09:13:10Z) - An Efficient Application of Neuroevolution for Competitive Multiagent Learning [0.0]
NEAT is a popular evolutionary strategy used to obtain the best-performing neural network architecture.
This paper utilizes the NEAT algorithm to achieve competitive multiagent learning on a modified pong game environment.
arXiv Detail & Related papers (2021-05-23T10:34:48Z) - Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning [72.28520951105207]
Overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning.
We propose a novel regularization-based update scheme that penalizes large joint action-values deviating from a baseline.
We show that our method provides a consistent performance improvement on a set of challenging StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2021-03-22T14:18:39Z)
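As flagged above, here is a tiny, hedged illustration of the subset-based action selection described in the "Stochastic Q-learning for Large Discrete Action Spaces" entry; the function name and the optional reuse of the previous action are assumptions made for illustration, not the paper's exact procedure.

```python
# Toy illustration (assumed, not the paper's code) of value-based action
# selection over a random subset of roughly O(log n) actions instead of all n.
import math
import random

def stochastic_greedy_action(q_values, n_actions, prev_action=None):
    """Return the argmax of Q over a small random candidate set of actions."""
    k = max(1, math.ceil(math.log(n_actions)))
    candidates = set(random.sample(range(n_actions), k))
    if prev_action is not None:
        candidates.add(prev_action)  # optionally keep the last action in the pool
    return max(candidates, key=lambda a: q_values[a])

q = [random.random() for _ in range(10_000)]           # stand-in Q-values for one state
print(stochastic_greedy_action(q, n_actions=10_000))   # maximises over ~10 actions, not 10,000
```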