Discrete-Time Mean Field Control with Environment States
- URL: http://arxiv.org/abs/2104.14900v1
- Date: Fri, 30 Apr 2021 10:58:01 GMT
- Title: Discrete-Time Mean Field Control with Environment States
- Authors: Kai Cui, Anam Tahir, Mark Sinzger, Heinz Koeppl
- Abstract summary: Mean field control and mean field games have been established as a tractable solution for large-scale multi-agent problems with many agents.
We rigorously establish approximate optimality as the number of agents grows in the finite agent case.
We find that a dynamic programming principle holds, resulting in the existence of an optimal stationary policy.
- Score: 25.44061731738579
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent reinforcement learning methods have shown remarkable potential in
solving complex multi-agent problems but mostly lack theoretical guarantees.
Recently, mean field control and mean field games have been established as a
tractable solution for large-scale multi-agent problems with many agents. In
this work, driven by a motivating scheduling problem, we consider a
discrete-time mean field control model with common environment states. We
rigorously establish approximate optimality as the number of agents grows in
the finite agent case and find that a dynamic programming principle holds,
resulting in the existence of an optimal stationary policy. As exact solutions
are difficult in general due to the resulting continuous action space of the
limiting mean field Markov decision process, we apply established deep
reinforcement learning methods to solve the associated mean field control
problem. The performance of the learned mean field control policy is compared
to typical multi-agent reinforcement learning approaches and is found to
converge to the mean field performance for sufficiently many agents, verifying
the obtained theoretical results and reaching competitive solutions.
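The abstract notes that the limiting mean field Markov decision process has a continuous action space. A minimal sketch of why, under assumptions not taken from the paper: in a mean field MDP the "action" is a whole decision rule `h` (a probability of each action in each agent state), and the agent state distribution `mu` then evolves deterministically given the common environment state `z`. The transition table `P`, the sizes (2 agent states, 2 actions, 2 environment states), and the function name `mean_field_step` are all hypothetical and chosen only for illustration; this is not the paper's scheduling model.

```python
# Hypothetical toy transition kernel (an assumption, not from the paper).
# P[z][x][u][y] = probability that an agent in state x taking action u
# moves to state y while the common environment state is z.
P = [
    [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]],
    [[[0.6, 0.4], [0.3, 0.7]], [[0.4, 0.6], [0.0, 1.0]]],
]


def mean_field_step(mu, h, z):
    """Propagate the agent state distribution mu by one time step.

    mu[x]   : fraction of agents in state x
    h[x][u] : probability that an agent in state x picks action u
    z       : index of the current environment state

    The full matrix h plays the role of the action in the mean field MDP,
    which is why that action space is continuous even though each agent
    only has finitely many actions.
    """
    n_states = len(mu)
    n_actions = len(h[0])
    nxt = [0.0] * n_states
    for x in range(n_states):
        for u in range(n_actions):
            for y in range(n_states):
                nxt[y] += mu[x] * h[x][u] * P[z][x][u][y]
    return nxt


mu0 = [0.5, 0.5]                 # initial distribution over agent states
h = [[1.0, 0.0], [0.25, 0.75]]   # one decision rule, i.e. one mean field action
mu1 = mean_field_step(mu0, h, z=0)
print(mu1)
```

In a full model the environment state would also transition randomly and a reward would depend on `mu`, `h`, and `z`; those parts are omitted here to keep the sketch short.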
Related papers
- DePAint: A Decentralized Safe Multi-Agent Reinforcement Learning Algorithm considering Peak and Average Constraints [1.1549572298362787]
We propose a momentum-based decentralized gradient policy method, DePAint, to solve the problem.
This is the first privacy-preserving fully decentralized multi-agent reinforcement learning algorithm that considers both peak and average constraints.
arXiv Detail & Related papers (2023-10-22T16:36:03Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- Major-Minor Mean Field Multi-Agent Reinforcement Learning [29.296206774925388]
Multi-agent reinforcement learning (MARL) remains difficult to scale to many agents.
Recent MARL using Mean Field Control (MFC) provides a tractable and rigorous approach to otherwise difficult cooperative MARL.
We generalize MFC to instead simultaneously model many similar and few complex agents.
arXiv Detail & Related papers (2023-03-19T14:12:57Z)
- Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z)
- Scalable Task-Driven Robotic Swarm Control via Collision Avoidance and Learning Mean-Field Control [23.494528616672024]
We use state-of-the-art mean-field control techniques to convert many-agent swarm control into classical single-agent control of distributions.
Here, we combine collision avoidance and learning of mean-field control into a unified framework for tractably designing intelligent robotic swarm behavior.
arXiv Detail & Related papers (2022-09-15T16:15:04Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach [128.62787284435007]
We propose the mean-field proximal policy optimization (MF-PPO) algorithm, at the core of which is a permutation-invariant actor-critic neural architecture.
We prove that MF-PPO attains the globally optimal policy at a sublinear rate of convergence.
In particular, we show that the inductive bias introduced by the permutation-invariant neural architecture enables MF-PPO to outperform existing competitors.
arXiv Detail & Related papers (2021-05-18T04:35:41Z)
- Scalable, Decentralized Multi-Agent Reinforcement Learning Methods Inspired by Stigmergy and Ant Colonies [0.0]
We investigate a novel approach to decentralized multi-agent learning and planning.
In particular, this method is inspired by the cohesion, coordination, and behavior of ant colonies.
The approach combines single-agent RL and an ant-colony-inspired decentralized, stigmergic algorithm for multi-agent path planning and environment modification.
arXiv Detail & Related papers (2021-05-08T01:04:51Z)
- Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z)
- FACMAC: Factored Multi-Agent Centralised Policy Gradients [103.30380537282517]
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
We evaluate FACMAC on variants of the multi-agent particle environments, a novel multi-agent MuJoCo benchmark, and a challenging set of StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2020-03-14T21:29:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.