Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learning
- URL: http://arxiv.org/abs/2410.08540v1
- Date: Fri, 11 Oct 2024 05:22:54 GMT
- Title: Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learning
- Authors: Xinran Li, Ling Pan, Jun Zhang
- Abstract summary: We introduce Kaleidoscope, a novel adaptive partial parameter sharing scheme.
It promotes diversity among policy networks by encouraging discrepancy among these masks, without sacrificing the efficiencies of parameter sharing.
We extend Kaleidoscope to critic ensembles in the context of actor-critic algorithms, which could help improve value estimations.
- Score: 14.01772209044574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In multi-agent reinforcement learning (MARL), parameter sharing is commonly employed to enhance sample efficiency. However, the popular approach of full parameter sharing often leads to homogeneous policies among agents, potentially limiting the performance benefits that could be derived from policy diversity. To address this critical limitation, we introduce Kaleidoscope, a novel adaptive partial parameter sharing scheme that fosters policy heterogeneity while still maintaining high sample efficiency. Specifically, Kaleidoscope maintains one set of common parameters alongside multiple sets of distinct, learnable masks for different agents, dictating the sharing of parameters. It promotes diversity among policy networks by encouraging discrepancy among these masks, without sacrificing the efficiencies of parameter sharing. This design allows Kaleidoscope to dynamically balance high sample efficiency with a broad policy representational capacity, effectively bridging the gap between full parameter sharing and non-parameter sharing across various environments. We further extend Kaleidoscope to critic ensembles in the context of actor-critic algorithms, which could help improve value estimations. Our empirical evaluations across extensive environments, including the multi-agent particle environment, multi-agent MuJoCo and the StarCraft multi-agent challenge v2, demonstrate the superior performance of Kaleidoscope compared with existing parameter sharing approaches, showcasing its potential for performance enhancement in MARL. The code is publicly available at https://github.com/LXXXXR/Kaleidoscope.
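The masking mechanism described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's actual implementation: the layer shapes, the sigmoid relaxation of the masks, and the pairwise-L1 discrepancy regularizer are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, IN_DIM, OUT_DIM = 3, 4, 2

# One set of common parameters shared by all agents.
shared_w = rng.standard_normal((IN_DIM, OUT_DIM))

# Per-agent mask logits; a sigmoid turns them into soft masks in (0, 1).
# (The paper learns these jointly with the shared weights; here they are
# random placeholders.)
mask_logits = rng.standard_normal((N_AGENTS, IN_DIM, OUT_DIM))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def agent_weights(i):
    """Effective weights of agent i: shared parameters gated by its mask."""
    return shared_w * sigmoid(mask_logits[i])

def mask_discrepancy():
    """Mean pairwise L1 distance between masks; used as a regularizer,
    maximizing it pushes the agents' effective sub-networks apart."""
    masks = sigmoid(mask_logits)
    dists = [np.abs(masks[i] - masks[j]).mean()
             for i in range(N_AGENTS) for j in range(i + 1, N_AGENTS)]
    return float(np.mean(dists))

obs = rng.standard_normal(IN_DIM)
# All agents reuse shared_w, yet each produces a distinct output.
outputs = np.stack([obs @ agent_weights(i) for i in range(N_AGENTS)])
print(outputs.shape)  # (3, 2)
print(mask_discrepancy() > 0.0)
```

Only one weight matrix is stored for all agents, so memory and sample reuse stay close to full parameter sharing, while the masks carry the per-agent heterogeneity.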
Related papers
- Learning Flexible Heterogeneous Coordination with Capability-Aware Shared Hypernetworks [2.681242476043447]
We present Capability-Aware Shared Hypernetworks (CASH), a novel architecture for heterogeneous multi-agent coordination.
CASH generates sufficient diversity while maintaining sample-efficiency via soft parameter-sharing hypernetworks.
We present experiments across two heterogeneous coordination tasks and three standard learning paradigms.
arXiv Detail & Related papers (2025-01-10T15:39:39Z)
- Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications.
Ensuring their alignment with the diverse preferences of individual users has become a critical challenge.
We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
- ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by the Kronecker product to aggregate low-rank experts.
Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z)
- HyperMARL: Adaptive Hypernetworks for Multi-Agent RL [10.00022425344723]
HyperMARL is a parameter sharing approach that uses hypernetworks to generate agent-specific parameters without altering the learning objective.
It consistently performs competitively with fully-shared, non-parameter-sharing, and diversity-promoting baselines.
These findings establish hypernetworks as a versatile approach for MARL across diverse environments.
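The hypernetwork idea above can be sketched as a single linear map from a learnable agent embedding to the flattened weights of that agent's policy layer. The shapes and the single-layer hypernetwork are illustrative assumptions, not HyperMARL's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, EMB_DIM, IN_DIM, OUT_DIM = 4, 8, 6, 2

# Learnable agent embeddings, one row per agent (placeholder values).
agent_emb = rng.standard_normal((N_AGENTS, EMB_DIM))

# The hypernetwork: maps an agent embedding to that agent's layer weights.
hyper_w = rng.standard_normal((EMB_DIM, IN_DIM * OUT_DIM)) * 0.1

def policy_weights(i):
    """Generate agent i's layer weights from its embedding."""
    return (agent_emb[i] @ hyper_w).reshape(IN_DIM, OUT_DIM)

obs = rng.standard_normal(IN_DIM)
actions = np.stack([obs @ policy_weights(i) for i in range(N_AGENTS)])
print(actions.shape)  # (4, 2)

# Only hyper_w and agent_emb would be trained; all agents share the
# hypernetwork, yet each receives distinct generated parameters.
distinct = not np.allclose(policy_weights(0), policy_weights(1))
print(distinct)
```

The trainable parameter count is dominated by the shared hypernetwork, so agent-specific behavior emerges without changing the learning objective or storing a full network per agent.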
arXiv Detail & Related papers (2024-12-05T15:09:51Z)
- Adaptive parameter sharing for multi-agent reinforcement learning [16.861543418593044]
We propose a novel parameter sharing method inspired by research pertaining to the brain in biology.
It maps each type of agent to a different region within a shared network based on its identity, resulting in distinct sub-networks.
Our method can increase the diversity of strategies among different agents without additional training parameters.
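A minimal sketch of this identity-based region mapping, assuming hypothetical agent types and a static, overlapping assignment of hidden units (the actual mapping in the paper may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 8
shared_hidden = rng.standard_normal((4, HIDDEN))  # one shared layer: 4 -> 8

# Hypothetical static assignment: each agent type claims a region
# (subset of hidden units) of the shared network; regions may overlap.
regions = {
    "scout":   np.arange(0, 5),   # units 0-4
    "carrier": np.arange(3, 8),   # units 3-7, overlapping with scout
}

def region_mask(agent_type):
    m = np.zeros(HIDDEN)
    m[regions[agent_type]] = 1.0
    return m

def forward(agent_type, obs):
    """Shared layer, with units outside the agent's region zeroed out."""
    return (obs @ shared_hidden) * region_mask(agent_type)

obs = rng.standard_normal(4)
h_scout = forward("scout", obs)
h_carrier = forward("carrier", obs)
print((h_scout[5:] == 0).all())    # scout never activates units 5-7
print((h_carrier[:3] == 0).all())  # carrier never activates units 0-2
```

Because the masks are fixed functions of agent identity, the two types realize distinct sub-networks inside one shared weight matrix with no additional trainable parameters, consistent with the claim above.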
arXiv Detail & Related papers (2023-12-14T15:00:32Z)
- Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training [88.80694147730883]
We investigate a variety of Modality-Shared Contrastive Language-Image Pre-training (MS-CLIP) frameworks.
Under the studied conditions, we observe that a mostly unified encoder for vision and language signals outperforms all other variations that separate more parameters.
Our approach outperforms vanilla CLIP by 1.6 points in linear probing on a collection of 24 downstream vision tasks.
arXiv Detail & Related papers (2022-07-26T05:19:16Z)
- Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL [107.58821842920393]
We quantify the behavior differences among agents and relate them to policy performance via Role Diversity.
We find that the error bound in MARL can be decomposed into three parts that have a strong relation to the role diversity.
The decomposed factors can significantly impact policy optimization on three popular directions.
arXiv Detail & Related papers (2022-06-01T04:58:52Z)
- Mix and Mask Actor-Critic Methods [0.0]
Shared feature spaces for actor-critic methods aim to capture generalized latent representations to be used by both the policy and the value function.
We present a novel feature-sharing framework to address these difficulties by introducing the mix and mask mechanisms and the distributional scalarization technique.
From our experimental results, we demonstrate significant performance improvements compared to alternative methods using separate networks and networks with a shared backbone.
arXiv Detail & Related papers (2021-06-24T14:12:45Z)
- Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing [4.855663359344748]
Sharing parameters in deep reinforcement learning has played an essential role in allowing algorithms to scale to a large number of agents.
However, having all agents share the same parameters can also have a detrimental effect on learning.
We propose a novel method to automatically identify agents which may benefit from sharing parameters by partitioning them based on their abilities and goals.
arXiv Detail & Related papers (2021-02-15T11:33:52Z)
- Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation [55.96893934962757]
In multi-agent systems, the policies of different agents need to be evaluated jointly.
In current methods, value functions or advantage functions use counterfactual joint actions that are evaluated asynchronously.
In this work, we propose the approximatively synchronous advantage estimation.
arXiv Detail & Related papers (2020-12-07T07:29:19Z)
- FACMAC: Factored Multi-Agent Centralised Policy Gradients [103.30380537282517]
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC).
It is a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
We evaluate FACMAC on variants of the multi-agent particle environments, a novel multi-agent MuJoCo benchmark, and a challenging set of StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2020-03-14T21:29:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.