MALib: A Parallel Framework for Population-based Multi-agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2106.07551v1
- Date: Sat, 5 Jun 2021 03:27:08 GMT
- Title: MALib: A Parallel Framework for Population-based Multi-agent
Reinforcement Learning
- Authors: Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen,
Yaodong Yang, Weinan Zhang, Jun Wang
- Abstract summary: Population-based multi-agent reinforcement learning (PB-MARL) refers to a series of methods that nest reinforcement learning (RL) algorithms within coupled population dynamics.
We present MALib, a scalable and efficient computing framework for PB-MARL.
- Score: 61.28547338576706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Population-based multi-agent reinforcement learning (PB-MARL)
refers to a series of methods that nest reinforcement learning (RL)
algorithms and produce a self-generated sequence of tasks arising from the
coupled population dynamics. By leveraging auto-curricula to induce a
population of distinct emergent strategies, PB-MARL has achieved impressive
success in tackling multi-agent tasks. Despite the remarkable prior art in
distributed RL frameworks, PB-MARL poses new challenges for parallelizing
the training frameworks due to the additional complexity of multiple nested
workloads across sampling, training, and evaluation, coupled with
heterogeneous policy interactions. To solve these problems, we present
MALib, a scalable and efficient computing framework for PB-MARL. Our
framework comprises three key components: (1) a centralized task-dispatching
model, which supports self-generated tasks and scalable training with
heterogeneous policy combinations; (2) a programming architecture named
Actor-Evaluator-Learner, which achieves high parallelism for both training
and sampling and meets the evaluation requirements of auto-curriculum
learning; (3) a higher-level abstraction of MARL training paradigms, which
enables efficient code reuse and flexible deployment on different
distributed computing paradigms. Experiments on a series of complex tasks,
such as multi-agent Atari games, show that MALib achieves a throughput
higher than 40K FPS on a single machine with 32 CPU cores, a 5x speedup over
RLlib, and at least a 3x speedup over OpenSpiel on multi-agent training
tasks.
MALib is publicly available at https://github.com/sjtu-marl/malib.
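To make the three components above concrete, here is a minimal, runnable Python sketch of how an Actor-Evaluator-Learner loop with centralized task dispatching can fit together. It is a conceptual illustration under assumed names (Policy, Actor, Learner, Evaluator, and next_tasks are all hypothetical), not MALib's actual API; see the repository linked above for the real implementation.

```python
# A minimal, self-contained sketch of the Actor-Evaluator-Learner (AEL)
# pattern described in the abstract. All names here are hypothetical
# illustrations, not MALib's actual API.
from dataclasses import dataclass
import random


@dataclass
class Policy:
    """Stand-in for a trainable policy; `skill` is a toy scalar parameter."""
    name: str
    skill: float = 0.0


class Actor:
    """Samples experience for a given combination of policies."""

    def sample(self, policies):
        # A real actor would roll out environment episodes; here we fabricate
        # one scalar return per policy as placeholder experience.
        return {p.name: p.skill + random.gauss(0.0, 0.1) for p in policies}


class Learner:
    """Updates a policy from sampled experience."""

    def step(self, policy, batch):
        policy.skill += 0.1 * batch[policy.name]  # toy "gradient" update


class Evaluator:
    """Scores the population and emits the next self-generated tasks."""

    def next_tasks(self, population, k=2):
        # Auto-curriculum stub: schedule the currently strongest policies.
        return sorted(population, key=lambda p: p.skill, reverse=True)[:k]


# Centralized task dispatching: a driver loop couples the three roles.
# In MALib, actors and learners run as parallel workers; this loop is
# sequential purely for illustration.
population = [Policy(f"pi_{i}") for i in range(4)]
actor, learner, evaluator = Actor(), Learner(), Evaluator()
for generation in range(10):
    matchup = evaluator.next_tasks(population)  # self-generated task
    batch = actor.sample(matchup)               # sampling
    for policy in matchup:
        learner.step(policy, batch)             # training
```

The design point the sketch mirrors is that the evaluator, not the learner, decides which policy combinations to train next; this is what enables auto-curricula over a growing population of heterogeneous policies.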
Related papers
- O-MAPL: Offline Multi-agent Preference Learning [5.4482836906033585]
Inferring reward functions from demonstrations is a key challenge in reinforcement learning (RL).
We introduce a novel end-to-end preference-based learning framework for cooperative MARL.
Our algorithm outperforms existing methods across various tasks.
arXiv Detail & Related papers (2025-01-31T08:08:20Z)
- Multi-task Representation Learning for Mixed Integer Linear Programming [13.106799330951842]
This paper introduces the first multi-task learning framework for ML-guided MILP solving.
We demonstrate that our multi-task learning model performs similarly to specialized models within the same distribution.
It significantly outperforms them in generalization across problem sizes and tasks.
arXiv Detail & Related papers (2024-12-18T23:33:32Z)
- MALT: Improving Reasoning with Multi-Agent LLM Training [64.13803241218886]
We present a first step toward "Multi-agent LLM training" (MALT) on reasoning problems.
Our approach employs a sequential multi-agent setup with heterogeneous LLMs assigned specialized roles.
We evaluate our approach across MATH, GSM8k, and CQA, where MALT on Llama 3.1 8B models achieves relative improvements of 14.14%, 7.12%, and 9.40% respectively.
arXiv Detail & Related papers (2024-12-02T19:30:36Z)
- MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
MADiff is a diffusion-based multi-agent learning framework.
It works as both a decentralized policy and a centralized controller.
Our experiments demonstrate that MADiff outperforms baseline algorithms across various multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z)
- MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library [82.77446613763809]
We present MARLlib, a library designed to offer fast development for multi-agent tasks and algorithm combinations.
MARLlib can effectively disentangle the intertwined nature of the multi-agent task and the learning process of the algorithm.
The library's source code is publicly accessible on GitHub.
arXiv Detail & Related papers (2022-10-11T03:11:12Z)
- Containerized Distributed Value-Based Multi-Agent Reinforcement Learning [18.79371121484969]
We propose a containerized multi-agent reinforcement learning framework.
To our knowledge, our method is the first to solve the challenging Google Research Football full game 5_v_5.
On the StarCraft II micromanagement benchmark, our method obtains 4-18x better results compared to state-of-the-art non-distributed MARL algorithms.
arXiv Detail & Related papers (2021-10-15T15:54:06Z)
- UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers [108.92194081987967]
We make the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing a single architecture to fit different tasks.
Unlike previous RNN-based models, we utilize a transformer-based model to generate a flexible policy.
The proposed model, named Universal Policy Decoupling Transformer (UPDeT), further relaxes the action restriction and makes the decision process of multi-agent tasks more explainable.
arXiv Detail & Related papers (2021-01-20T07:24:24Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible, fully decentralized actor-critic MARL framework that can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)