MALib: A Parallel Framework for Population-based Multi-agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2106.07551v1
- Date: Sat, 5 Jun 2021 03:27:08 GMT
- Title: MALib: A Parallel Framework for Population-based Multi-agent
Reinforcement Learning
- Authors: Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen,
Yaodong Yang, Weinan Zhang, Jun Wang
- Abstract summary: Population-based multi-agent reinforcement learning (PB-MARL) refers to the series of methods nested with reinforcement learning (RL) algorithms.
We present MALib, a scalable and efficient computing framework for PB-MARL.
- Score: 61.28547338576706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Population-based multi-agent reinforcement learning (PB-MARL) refers to the
series of methods nested with reinforcement learning (RL) algorithms, which
produces a self-generated sequence of tasks arising from the coupled population
dynamics. By leveraging auto-curricula to induce a population of distinct
emergent strategies, PB-MARL has achieved impressive success in tackling
multi-agent tasks. Despite the remarkable prior art in distributed RL frameworks,
PB-MARL poses new challenges for parallelizing training due to the additional
complexity of multiple nested workloads among sampling, training, and evaluation,
coupled with heterogeneous policy interactions. To
solve these problems, we present MALib, a scalable and efficient computing
framework for PB-MARL. Our framework comprises three key components: (1)
a centralized task dispatching model, which supports the self-generated tasks
and scalable training with heterogeneous policy combinations; (2) a programming
architecture named Actor-Evaluator-Learner, which achieves high parallelism for
both training and sampling, and meets the evaluation requirement of
auto-curriculum learning; (3) a higher-level abstraction of MARL training
paradigms, which enables efficient code reuse and flexible deployments on
different distributed computing paradigms. Experiments on a series of complex
tasks, such as multi-agent Atari games, show that MALib achieves a throughput
higher than 40K FPS on a single machine with 32 CPU cores, a 5x speedup over
RLlib, and at least a 3x speedup over OpenSpiel in multi-agent training tasks.
MALib is publicly available at https://github.com/sjtu-marl/malib.
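To make the three components above concrete, the following is a minimal, hypothetical sketch of an actor/evaluator/learner pipeline for population-based training, using only the Python standard library. The class and function names (Policy, actor, learner, evaluator) and the toy "skill" update are illustrative assumptions and do not reflect MALib's actual API; see the repository above for the real interfaces.

```python
# Hypothetical sketch of an actor/evaluator/learner loop for population-based
# training. Illustrative only; NOT MALib's actual API.
import random
from collections import defaultdict
from queue import Queue
from threading import Thread

class Policy:
    """Toy policy: a single scalar 'skill' that the learner nudges."""
    def __init__(self, pid):
        self.pid = pid
        self.skill = random.random()

def actor(task_queue, sample_queue):
    """Roll out match-ups assigned by the dispatcher and emit 'samples'."""
    while True:
        task = task_queue.get()
        if task is None:  # poison pill: stop this actor
            break
        home, away = task  # heterogeneous policy combination to sample with
        # Fake rollout: outcome depends on the skill gap between the policies.
        outcome = 1.0 if home.skill + random.gauss(0, 0.1) > away.skill else 0.0
        sample_queue.put((home.pid, outcome))

def learner(sample_queue, population, num_samples):
    """Consume samples and update the corresponding policy in the population."""
    for _ in range(num_samples):
        pid, outcome = sample_queue.get()
        population[pid].skill += 0.01 * (outcome - 0.5)  # toy update step

def evaluator(population):
    """Estimate a payoff table over the population (drives the auto-curriculum)."""
    payoff = defaultdict(float)
    for a in population.values():
        for b in population.values():
            payoff[(a.pid, b.pid)] = a.skill - b.skill
    return payoff

if __name__ == "__main__":
    population = {i: Policy(i) for i in range(4)}
    task_queue, sample_queue = Queue(), Queue()

    # Centralized task dispatching: enqueue self-generated match-up tasks.
    tasks = [(population[a], population[b])
             for a in population for b in population if a != b]
    for t in tasks:
        task_queue.put(t)

    actors = [Thread(target=actor, args=(task_queue, sample_queue)) for _ in range(2)]
    learn = Thread(target=learner, args=(sample_queue, population, len(tasks)))
    for th in actors + [learn]:
        th.start()
    learn.join()
    for _ in actors:
        task_queue.put(None)
    for th in actors:
        th.join()

    print(evaluator(population))
```

In a real deployment these roles would run as separate distributed workers (e.g., Ray actors), with the evaluator's payoff estimates fed back to the task dispatcher to generate the next round of match-up tasks.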
Related papers
- PPS-QMIX: Periodically Parameter Sharing for Accelerating Convergence of
Multi-Agent Reinforcement Learning [20.746383793882984]
Training for multi-agent reinforcement learning (MARL) is a time-consuming process.
One drawback is that each agent's strategy in MARL is trained independently, even though the agents actually need to cooperate.
We propose three simple approaches called Average Sharing (A-PPS), Reward-Scalability Periodically, and Partial Personalized Periodically.
arXiv Detail & Related papers (2024-03-05T03:59:01Z) - MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
Diffusion models (DMs) have recently achieved great success in various scenarios, including offline reinforcement learning.
We propose MADiff, a novel diffusion-based generative framework for multi-agent learning.
Our experiments show the superior performance of MADiff compared to baseline algorithms in a wide range of multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z) - Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL).
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z) - Multipath agents for modular multitask ML systems [2.579908688646812]
The presented work introduces a novel methodology that allows multiple methods to be defined as distinct agents.
Agents can collaborate and compete to generate and improve ML models for given tasks.
arXiv Detail & Related papers (2023-02-06T11:57:45Z) - MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning
Library [82.77446613763809]
We present MARLlib, a library designed to offer fast development for multi-agent tasks and algorithm combinations.
MARLlib can effectively disentangle the intertwined nature of the multi-agent task and the learning process of the algorithm.
The library's source code is publicly accessible on GitHub.
arXiv Detail & Related papers (2022-10-11T03:11:12Z) - Efficient Distributed Framework for Collaborative Multi-Agent
Reinforcement Learning [17.57163419315147]
Multi-agent reinforcement learning for incomplete information environments has attracted extensive attention from researchers.
There are still some problems in multi-agent reinforcement learning, such as unstable model iteration and low training efficiency.
In this paper, we design a distributed MARL framework based on the actor-work-learner architecture.
arXiv Detail & Related papers (2022-05-11T03:12:49Z) - Containerized Distributed Value-Based Multi-Agent Reinforcement Learning [18.79371121484969]
We propose a containerized multi-agent reinforcement learning framework.
To our knowledge, our method is the first to solve the challenging Google Research Football full game 5_v_5.
On the StarCraft II micromanagement benchmark, our method obtains 4-18x better results compared to state-of-the-art non-distributed MARL algorithms.
arXiv Detail & Related papers (2021-10-15T15:54:06Z) - UPDeT: Universal Multi-agent Reinforcement Learning via Policy
Decoupling with Transformers [108.92194081987967]
We make the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing a single architecture to fit different tasks.
Unlike previous RNN-based models, we utilize a transformer-based model to generate a flexible policy.
The proposed model, named Universal Policy Decoupling Transformer (UPDeT), further relaxes the action restriction and makes the multi-agent task's decision process more explainable.
arXiv Detail & Related papers (2021-01-20T07:24:24Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework can achieve scalability and stability for large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)