Hybrid Training for Enhanced Multi-task Generalization in Multi-agent Reinforcement Learning
- URL: http://arxiv.org/abs/2408.13567v1
- Date: Sat, 24 Aug 2024 12:37:03 GMT
- Title: Hybrid Training for Enhanced Multi-task Generalization in Multi-agent Reinforcement Learning
- Authors: Mingliang Zhang, Sichang Su, Chengyang He, Guillaume Sartoretti
- Abstract summary: HyGen is a novel hybrid MARL framework that integrates online and offline learning to ensure both multi-task generalization and training efficiency.
We empirically demonstrate that our framework effectively extracts and refines general skills, yielding impressive generalization to unseen tasks.
- Score: 7.6201940008534175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multi-agent reinforcement learning (MARL), achieving multi-task generalization to diverse agents and objectives presents significant challenges. Existing online MARL algorithms primarily focus on single-task performance, but their lack of multi-task generalization capabilities typically results in substantial computational waste and limited real-life applicability. Meanwhile, existing offline multi-task MARL approaches are heavily dependent on data quality, often resulting in poor performance on unseen tasks. In this paper, we introduce Hybrid Training for Enhanced Multi-Task Generalization (HyGen), a novel hybrid MARL framework that integrates online and offline learning to ensure both multi-task generalization and training efficiency. Specifically, our framework extracts potential general skills from offline multi-task datasets. We then train policies to select the optimal skills under the centralized training and decentralized execution (CTDE) paradigm. During this stage, we utilize a replay buffer that integrates both offline data and online interactions. We empirically demonstrate that our framework effectively extracts and refines general skills, yielding impressive generalization to unseen tasks. Comparative analyses on the StarCraft multi-agent challenge show that HyGen outperforms a wide range of existing purely online and purely offline methods.
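The replay buffer described in the abstract, seeded with offline multi-task data and then blended with online interactions, might be structured as in the minimal Python sketch below. The class name, transition format, and fixed mixing ratio are illustrative assumptions, not HyGen's published implementation.

```python
import random
from collections import deque

class HybridReplayBuffer:
    """Minimal sketch of a buffer that mixes a fixed offline multi-task
    dataset with freshly gathered online transitions, as described in the
    HyGen abstract. Names and the mixing ratio are assumptions."""

    def __init__(self, offline_transitions, capacity=100_000, offline_ratio=0.5):
        self.offline = list(offline_transitions)   # fixed offline dataset
        self.online = deque(maxlen=capacity)       # grows during interaction
        self.offline_ratio = offline_ratio         # fraction of each batch drawn offline

    def add(self, transition):
        """Store a transition gathered by online interaction."""
        self.online.append(transition)

    def sample(self, batch_size):
        """Draw a batch that blends offline and online experience."""
        n_off = min(int(batch_size * self.offline_ratio), len(self.offline))
        n_on = min(batch_size - n_off, len(self.online))
        batch = random.sample(self.offline, n_off)
        batch += random.sample(list(self.online), n_on)
        random.shuffle(batch)
        return batch

# Usage: seed with offline data, then interleave online experience.
buffer = HybridReplayBuffer(offline_transitions=[("s", "a", 0.0, "s'")])
buffer.add(("s'", "a'", 1.0, "s''"))
print(len(buffer.sample(batch_size=2)))
```

Sampling each batch partly from the fixed offline dataset and partly from fresh online experience is what would let a skill-selection policy refine offline-extracted skills without discarding the offline data's coverage.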
Related papers
- Skills Regularized Task Decomposition for Multi-task Offline Reinforcement Learning [11.790581500542439]
Reinforcement learning (RL) with diverse offline datasets can benefit from leveraging the relations among multiple tasks.
We present a skill-based multi-task RL technique on heterogeneous datasets that are generated by behavior policies of different quality.
We show that our multi-task offline RL approach is robust to the mixed configurations of different-quality datasets.
arXiv Detail & Related papers (2024-08-28T07:36:20Z)
- Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks [53.44714413181162]
This paper shows that when an agent is trained on a sufficiently diverse set of tasks, a generic policy-sharing algorithm with myopic exploration design can be sample-efficient.
To the best of our knowledge, this is the first theoretical demonstration of the "exploration benefits" of MTRL.
arXiv Detail & Related papers (2024-03-03T22:57:44Z)
- Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization [23.416448404647305]
OMIGA is a new offline multi-agent RL algorithm with implicit global-to-local value regularization.
We show that OMIGA achieves superior performance over the state-of-the-art offline MARL methods in almost all tasks.
arXiv Detail & Related papers (2023-07-21T14:37:54Z)
- Multi-task Hierarchical Adversarial Inverse Reinforcement Learning [40.60364143826424]
Multi-task Imitation Learning (MIL) aims to train a policy capable of performing a distribution of tasks based on multi-task expert demonstrations.
Existing MIL algorithms suffer from low data efficiency and poor performance on complex long-horizon tasks.
We develop Multi-task Hierarchical Adversarial Inverse Reinforcement Learning (MH-AIRL) to learn hierarchically-structured multi-task policies.
arXiv Detail & Related papers (2023-05-22T01:58:40Z)
- Learning From Good Trajectories in Offline Multi-Agent Reinforcement Learning [98.07495732562654]
Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets.
When some behavior policies in the dataset act randomly, an agent learned by offline MARL often inherits this random policy, jeopardizing the performance of the entire team.
We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z)
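The SIT entry above names the failure mode (inheriting random teammate behavior) but not its mechanism. As a loose illustration of the general idea of learning only from good trajectories, the sketch below filters a mixed-quality offline dataset by episode return; this is a generic heuristic, not SIT's actual algorithm, which reasons about per-agent individual trajectories rather than joint returns.

```python
def filter_good_trajectories(trajectories, keep_fraction=0.5):
    """Keep only the highest-return trajectories from a mixed-quality
    offline dataset. A generic illustration of 'learning from good
    trajectories'; the keep fraction is an assumed hyperparameter."""
    ranked = sorted(trajectories,
                    key=lambda traj: sum(r for _, _, r in traj),
                    reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]

# Each trajectory is a list of (state, action, reward) tuples.
dataset = [
    [("s0", "a0", 1.0), ("s1", "a1", 2.0)],   # return 3.0
    [("s0", "a2", 0.0), ("s1", "a3", 0.0)],   # return 0.0
]
print(len(filter_good_trajectories(dataset)))  # 1
```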
- Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving [103.745551954983]
In this paper, we investigate the transfer performance of various types of self-supervised methods, including MoCo and SimCLR, on three downstream tasks.
We find that their performances are sub-optimal or even lag far behind the single-task baseline.
We propose a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training.
arXiv Detail & Related papers (2022-09-19T12:15:31Z)
- Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves competitive performance compared to existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but also quickly adapts to new tasks with a small amount of training.
arXiv Detail & Related papers (2022-07-29T14:52:47Z)
- Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks [43.588686040547486]
Offline pre-training with online fine-tuning has never been studied in MARL, nor are datasets or benchmarks for offline MARL research available.
We propose the novel architecture of multi-agent decision transformer (MADT) for effective offline learning.
When evaluated on the StarCraft II offline dataset, MADT demonstrates superior performance to state-of-the-art offline RL baselines.
arXiv Detail & Related papers (2021-12-06T08:11:05Z)
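The MADT entry above casts offline (MA)RL as sequence modeling. A minimal single-agent sketch of the underlying decision-transformer idea, interleaving return-to-go, state, and action tokens and predicting actions, is shown below; all dimensions and layer sizes are illustrative, and MADT's multi-agent architecture differs in how it handles per-agent observations.

```python
import torch
import torch.nn as nn

class TinyDecisionTransformer(nn.Module):
    """Minimal sketch of the decision-transformer idea behind MADT:
    cast RL as sequence modeling over interleaved (return-to-go, state,
    action) tokens and predict the next action. Sizes are illustrative."""

    def __init__(self, state_dim, act_dim, d_model=64):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.predict_action = nn.Linear(d_model, act_dim)

    def forward(self, rtg, states, actions):
        # Interleave tokens as (R_1, s_1, a_1, R_2, s_2, a_2, ...).
        tokens = torch.stack(
            [self.embed_rtg(rtg), self.embed_state(states),
             self.embed_action(actions)], dim=2
        ).flatten(1, 2)  # (batch, 3 * T, d_model)
        # A faithful implementation would apply a causal attention mask here.
        hidden = self.encoder(tokens)
        # Predict each action from the hidden state of its state token.
        return self.predict_action(hidden[:, 1::3])

model = TinyDecisionTransformer(state_dim=8, act_dim=4)
out = model(torch.zeros(2, 5, 1), torch.zeros(2, 5, 8), torch.zeros(2, 5, 4))
print(out.shape)  # torch.Size([2, 5, 4])
```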
- Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction [125.18248926508045]
We propose Channel-Exchanging-Network (CEN) which is self-adaptive, parameter-free, and more importantly, applicable for both multimodal fusion and multitask learning.
CEN dynamically exchanges channels between sub-networks of different modalities.
For the application of dense image prediction, the validity of CEN is tested in four different scenarios.
arXiv Detail & Related papers (2021-12-04T05:47:54Z)
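For the CEN entry above, the exchanging criterion described in the paper is the magnitude of each channel's batch-normalization scaling factor: channels whose factor is driven toward zero carry little information and are replaced by the corresponding channels of the other modality. A minimal sketch of that criterion on plain tensors follows; the threshold value and shapes are assumptions for illustration.

```python
import torch

def channel_exchange(feat_a, feat_b, gamma_a, gamma_b, threshold=0.02):
    """Illustrative sketch of CEN-style channel exchanging: where a
    modality's batch-norm scaling factor (gamma) is near zero, that channel
    is replaced by the other modality's channel. The threshold is assumed.
    feat_*: (batch, channels, H, W); gamma_*: (channels,)."""
    mask_a = (gamma_a < threshold).view(1, -1, 1, 1)  # channels to replace in A
    mask_b = (gamma_b < threshold).view(1, -1, 1, 1)  # channels to replace in B
    out_a = torch.where(mask_a, feat_b, feat_a)
    out_b = torch.where(mask_b, feat_a, feat_b)
    return out_a, out_b

# Two modalities with 4 channels each; channel 2 of modality A is "dead".
a, b = torch.randn(1, 4, 8, 8), torch.randn(1, 4, 8, 8)
gamma_a = torch.tensor([1.0, 0.5, 0.001, 0.8])
gamma_b = torch.tensor([0.9, 0.7, 0.6, 0.4])
out_a, _ = channel_exchange(a, b, gamma_a, gamma_b)
assert torch.equal(out_a[:, 2], b[:, 2])  # exchanged channel comes from B
```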
- MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning [61.28547338576706]
Population-based multi-agent reinforcement learning (PB-MARL) refers to the series of methods nested with reinforcement learning (RL) algorithms, which produce a self-generated sequence of tasks arising from the coupled population dynamics.
We present MALib, a scalable and efficient computing framework for PB-MARL.
arXiv Detail & Related papers (2021-06-05T03:27:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.