Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training
- URL: http://arxiv.org/abs/2409.19391v1
- Date: Sat, 28 Sep 2024 15:57:24 GMT
- Title: Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training
- Authors: Pihe Hu, Shaolong Li, Zhuoran Li, Ling Pan, Longbo Huang
- Abstract summary: Deep Multi-agent Reinforcement Learning (MARL) relies on neural networks with numerous parameters, often incurring substantial computational overhead.
This paper proposes the utilization of dynamic sparse training (DST), a technique proven effective in deep supervised learning tasks.
We introduce an innovative Multi-Agent Sparse Training (MAST) framework aimed at simultaneously enhancing the reliability of learning targets and the rationality of sample distribution.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep Multi-agent Reinforcement Learning (MARL) relies on neural networks with numerous parameters in multi-agent scenarios, often incurring substantial computational overhead. Consequently, there is an urgent need to expedite training and enable model compression in MARL. This paper proposes the utilization of dynamic sparse training (DST), a technique proven effective in deep supervised learning tasks, to alleviate the computational burdens in MARL training. However, a direct adoption of DST fails to yield satisfactory MARL agents, leading to breakdowns in value learning within deep sparse value-based MARL models. Motivated by this challenge, we introduce an innovative Multi-Agent Sparse Training (MAST) framework aimed at simultaneously enhancing the reliability of learning targets and the rationality of sample distribution to improve value learning in sparse models. Specifically, MAST incorporates the Soft Mellowmax Operator with a hybrid TD($\lambda$) schema to establish dependable learning targets. Additionally, it employs a dual replay buffer mechanism to enhance the distribution of training samples. Building upon these aspects, MAST utilizes gradient-based topology evolution to exclusively train multiple MARL agents using sparse networks. Our comprehensive experimental investigation across various value-based MARL algorithms on multiple benchmarks demonstrates, for the first time, significant reductions in redundancy of up to $20\times$ in Floating Point Operations (FLOPs) for both training and inference, with less than $3\%$ performance degradation.
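Two of the mechanisms named above have standard textbook forms worth spelling out. Below is a sketch of the mellowmax operator (Asadi and Littman, 2017), on which the Soft Mellowmax Operator builds, and of the $\lambda$-return behind a TD($\lambda$) schema; these are the standard definitions, not necessarily MAST's exact variants:

```latex
% Mellowmax over action-values q_1,...,q_n, and the lambda-return;
% MAST's Soft Mellowmax and hybrid TD(lambda) are variants of these.
\mathrm{mm}_{\omega}(\mathbf{q}) = \frac{1}{\omega}\log\Big(\frac{1}{n}\sum_{i=1}^{n} e^{\omega q_i}\Big),
\qquad
G_t^{\lambda} = (1-\lambda)\sum_{n=1}^{\infty}\lambda^{n-1} G_t^{(n)},
```

where $G_t^{(n)}$ is the $n$-step return. Likewise, "gradient-based topology evolution" most plausibly refers to a drop-and-grow scheme in the style of RigL (Evci et al., 2020); the following Python sketch illustrates that family, with the function name and drop fraction as assumptions rather than MAST's implementation:

```python
import numpy as np

def evolve_topology(weights, grads, mask, drop_frac=0.3):
    """One RigL-style step: drop the smallest-magnitude active weights and
    regrow the same number of connections where |gradient| is largest, so
    overall sparsity is preserved. Hedged sketch, not MAST's exact code."""
    active = np.flatnonzero(mask)
    inactive = np.flatnonzero(mask == 0)
    n_swap = min(int(drop_frac * active.size), inactive.size)
    if n_swap == 0:
        return mask

    w, g = weights.ravel(), grads.ravel()
    drop_idx = active[np.argsort(np.abs(w[active]))[:n_swap]]       # weakest links
    grow_idx = inactive[np.argsort(-np.abs(g[inactive]))[:n_swap]]  # steepest gradients

    new_mask = mask.ravel().copy()
    new_mask[drop_idx] = 0
    new_mask[grow_idx] = 1
    return new_mask.reshape(mask.shape)
```

Newly grown weights are typically initialized to zero, so the network trains at a fixed sparsity while its connectivity adapts.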
Related papers
- DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [70.91804882618243]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks.
We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge.
Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
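As a rough illustration of the sigmoid routing with straight-through estimators described above, here is a hedged Python sketch; the threshold and function name are illustrative assumptions, not DSMoE's actual code:

```python
import torch

def ste_sigmoid_gate(routing_logits, threshold=0.5):
    """Hard 0/1 expert-block selection in the forward pass, sigmoid
    gradients in the backward pass (straight-through estimator).
    Illustrative sketch only."""
    soft = torch.sigmoid(routing_logits)   # differentiable surrogate
    hard = (soft > threshold).float()      # binary routing decision
    # Evaluates to `hard` in the forward pass; gradients flow through `soft`.
    return hard + soft - soft.detach()
```

A token then executes only the FFN blocks whose gate is 1, which is how such routing makes a dense model computationally sparse.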
arXiv Detail & Related papers (2025-02-18T02:37:26Z)
- O-MAPL: Offline Multi-agent Preference Learning [5.4482836906033585]
Inferring reward functions from demonstrations is a key challenge in reinforcement learning (RL).
We introduce a novel end-to-end preference-based learning framework for cooperative MARL.
Our algorithm outperforms existing methods across various tasks.
arXiv Detail & Related papers (2025-01-31T08:08:20Z)
- Meta-Sparsity: Learning Optimal Sparse Structures in Multi-task Networks through Meta-learning [4.462334751640166]
Meta-sparsity is a framework for learning model sparsity that allows deep neural networks (DNNs) to generate optimal sparse shared structures in a multi-task learning setting.
Inspired by Model Agnostic Meta-Learning (MAML), the emphasis is on learning shared and optimally sparse parameters in multi-task scenarios.
The effectiveness of meta-sparsity is rigorously evaluated by extensive experiments on two datasets.
arXiv Detail & Related papers (2025-01-21T13:25:32Z)
- Resource-Efficient Federated Multimodal Learning via Layer-wise and Progressive Training [15.462969044840868]
We introduce LW-FedMML, a layer-wise federated multimodal learning approach which decomposes the training process into multiple stages.
We conduct extensive experiments across various FL and multimodal learning settings to validate the effectiveness of our proposed method.
Specifically, LW-FedMML reduces memory usage by up to $2.7\times$, computational operations (FLOPs) by $2.4\times$, and total communication cost by $2.3\times$.
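As a hedged sketch of the layer-wise idea, the loop below trains one layer group per stage while freezing the rest, which caps per-stage memory and compute; the staging policy and the `train_one_round` callback are assumptions for illustration, not LW-FedMML's protocol:

```python
import torch.nn as nn

def train_layerwise(model: nn.Sequential, stage_order, train_one_round):
    """Train one layer group per stage, freezing all others. The
    `train_one_round(model)` callback stands in for a local federated
    training round. Illustrative sketch, not LW-FedMML's actual code."""
    for layer_idx in stage_order:                 # e.g., [0, 1, 2, ...]
        for p in model.parameters():
            p.requires_grad = False               # freeze everything
        for p in model[layer_idx].parameters():
            p.requires_grad = True                # unfreeze current stage
        train_one_round(model)
```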
arXiv Detail & Related papers (2024-07-22T07:06:17Z)
- Representation Learning For Efficient Deep Multi-Agent Reinforcement Learning [10.186029242664931]
We present MAPO-LSO, which applies a form of comprehensive representation learning devised to supplement MARL training.
Specifically, MAPO-LSO proposes a multi-agent extension of transition dynamics reconstruction and self-predictive learning.
Empirical results demonstrate MAPO-LSO to show notable improvements in sample efficiency and learning performance compared to its vanilla MARL counterpart.
arXiv Detail & Related papers (2024-06-05T03:11:44Z)
- MA2CL: Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning [128.19212716007794]
We propose an effective framework called Multi-Agent Masked Attentive Contrastive Learning (MA2CL).
MA2CL encourages learning representation to be both temporal and agent-level predictive by reconstructing the masked agent observation in latent space.
Our method significantly improves the performance and sample efficiency of different MARL algorithms and outperforms other methods in various vision-based and state-based scenarios.
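A minimal sketch of the masked-reconstruction idea: replace one agent's latent observation with a learned mask token and predict it from the other agents via attention. The module below is a generic illustration, and it substitutes a simple MSE reconstruction for MA2CL's contrastive objective:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedAgentPrediction(nn.Module):
    """Predict a masked agent's latent embedding from the other agents'
    embeddings via self-attention. Hypothetical sketch, not MA2CL's code."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mask_token = nn.Parameter(torch.zeros(dim))

    def loss(self, z, masked_agent):
        # z: (batch, n_agents, dim) latents from a shared observation encoder
        target = z[:, masked_agent].detach()       # prediction target
        z_in = z.clone()
        z_in[:, masked_agent] = self.mask_token    # mask one agent's latent
        out, _ = self.attn(z_in, z_in, z_in)       # attend over all agents
        return F.mse_loss(out[:, masked_agent], target)
```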
arXiv Detail & Related papers (2023-06-03T05:32:19Z)
- Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation [32.80370188601152]
The paper proposes the Multi-Agent Adaptive Kalman Temporal Difference (MAK-TD) framework and its Successor Representation-based variant, referred to as the MAK-SR.
The proposed MAK-TD/SR frameworks account for the continuous nature of the action space associated with high-dimensional multi-agent environments.
arXiv Detail & Related papers (2021-12-30T18:21:53Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for mean-field control (MFC).
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning [72.28520951105207]
Overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning.
We propose a novel regularization-based update scheme that penalizes large joint action-values deviating from a baseline.
We show that our method provides a consistent performance improvement on a set of challenging StarCraft II micromanagement tasks.
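In sketch form, such an update reads like a TD loss plus a penalty on joint action-values that stray from a baseline; the baseline choice and coefficient below are illustrative assumptions, not the paper's exact objective:

```python
import torch

def regularized_td_loss(q_joint, td_target, baseline, beta=0.1):
    """Mean-squared TD error plus a regularizer pulling the joint
    action-value toward a baseline (e.g., a softmax-weighted value).
    Hedged sketch only."""
    td_loss = torch.mean((q_joint - td_target.detach()) ** 2)
    penalty = torch.mean((q_joint - baseline.detach()) ** 2)
    return td_loss + beta * penalty
```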
arXiv Detail & Related papers (2021-03-22T14:18:39Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)