Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learning
- URL: http://arxiv.org/abs/2410.08540v1
- Date: Fri, 11 Oct 2024 05:22:54 GMT
- Title: Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learning
- Authors: Xinran Li, Ling Pan, Jun Zhang
- Abstract summary: We introduce Kaleidoscope, a novel adaptive partial parameter sharing scheme.
It promotes diversity among policy networks by encouraging discrepancy among these masks, without sacrificing the efficiencies of parameter sharing.
We extend Kaleidoscope to critic ensembles in the context of actor-critic algorithms, which could help improve value estimations.
- Score: 14.01772209044574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In multi-agent reinforcement learning (MARL), parameter sharing is commonly employed to enhance sample efficiency. However, the popular approach of full parameter sharing often leads to homogeneous policies among agents, potentially limiting the performance benefits that could be derived from policy diversity. To address this critical limitation, we introduce Kaleidoscope, a novel adaptive partial parameter sharing scheme that fosters policy heterogeneity while still maintaining high sample efficiency. Specifically, Kaleidoscope maintains one set of common parameters alongside multiple sets of distinct, learnable masks for different agents, dictating the sharing of parameters. It promotes diversity among policy networks by encouraging discrepancy among these masks, without sacrificing the efficiencies of parameter sharing. This design allows Kaleidoscope to dynamically balance high sample efficiency with a broad policy representational capacity, effectively bridging the gap between full parameter sharing and non-parameter sharing across various environments. We further extend Kaleidoscope to critic ensembles in the context of actor-critic algorithms, which could help improve value estimations. Our empirical evaluations across extensive environments, including the multi-agent particle environment, multi-agent MuJoCo and the StarCraft multi-agent challenge v2, demonstrate the superior performance of Kaleidoscope compared with existing parameter sharing approaches, showcasing its potential for performance enhancement in MARL. The code is publicly available at https://github.com/LXXXXR/Kaleidoscope.
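The masking mechanism described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's actual implementation: the layer shapes, the sigmoid relaxation of the masks, and the pairwise-L1 discrepancy regularizer are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, IN_DIM, OUT_DIM = 3, 4, 2

# One set of common parameters shared by all agents.
shared_w = rng.standard_normal((IN_DIM, OUT_DIM))

# Per-agent mask logits; a sigmoid turns them into soft masks in (0, 1).
# (The paper learns these jointly with the shared weights; here they are
# random placeholders.)
mask_logits = rng.standard_normal((N_AGENTS, IN_DIM, OUT_DIM))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def agent_weights(i):
    """Effective weights of agent i: shared parameters gated by its mask."""
    return shared_w * sigmoid(mask_logits[i])

def mask_discrepancy():
    """Mean pairwise L1 distance between masks; used as a regularizer,
    maximizing it pushes the agents' effective sub-networks apart."""
    masks = sigmoid(mask_logits)
    dists = [np.abs(masks[i] - masks[j]).mean()
             for i in range(N_AGENTS) for j in range(i + 1, N_AGENTS)]
    return float(np.mean(dists))

obs = rng.standard_normal(IN_DIM)
# All agents reuse shared_w, yet each produces a distinct output.
outputs = np.stack([obs @ agent_weights(i) for i in range(N_AGENTS)])
print(outputs.shape)  # (3, 2)
print(mask_discrepancy() > 0.0)
```

Only one weight matrix is stored for all agents, so memory and sample reuse stay close to full parameter sharing, while the masks carry the per-agent heterogeneity.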
Related papers
- Learning Flexible Heterogeneous Coordination with Capability-Aware Shared Hypernetworks [2.681242476043447]
We present Capability-Aware Shared Hypernetworks (CASH), a novel architecture for heterogeneous multi-agent coordination.
CASH generates sufficient diversity while maintaining sample-efficiency via soft parameter-sharing hypernetworks.
We present experiments across two heterogeneous coordination tasks and three standard learning paradigms.
arXiv Detail & Related papers (2025-01-10T15:39:39Z)
- Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications.
Ensuring their alignment with the diverse preferences of individual users has become a critical challenge.
We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
- ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by the Kronecker product to aggregate low-rank experts.
Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z)
- HyperMARL: Adaptive Hypernetworks for Multi-Agent RL [10.00022425344723]
HyperMARL is a parameter sharing approach that uses hypernetworks to generate agent-specific parameters without altering the learning objective.
It consistently performs competitively with fully-shared, non-parameter-sharing, and diversity-promoting baselines.
These findings establish hypernetworks as a versatile approach for MARL across diverse environments.
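The hypernetwork idea above can be sketched as a single linear map from a learnable agent embedding to the flattened weights of that agent's policy layer. The shapes and the single-layer hypernetwork are illustrative assumptions, not HyperMARL's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, EMB_DIM, IN_DIM, OUT_DIM = 4, 8, 6, 2

# Learnable agent embeddings, one row per agent (placeholder values).
agent_emb = rng.standard_normal((N_AGENTS, EMB_DIM))

# The hypernetwork: maps an agent embedding to that agent's layer weights.
hyper_w = rng.standard_normal((EMB_DIM, IN_DIM * OUT_DIM)) * 0.1

def policy_weights(i):
    """Generate agent i's layer weights from its embedding."""
    return (agent_emb[i] @ hyper_w).reshape(IN_DIM, OUT_DIM)

obs = rng.standard_normal(IN_DIM)
actions = np.stack([obs @ policy_weights(i) for i in range(N_AGENTS)])
print(actions.shape)  # (4, 2)

# Only hyper_w and agent_emb would be trained; all agents share the
# hypernetwork, yet each receives distinct generated parameters.
distinct = not np.allclose(policy_weights(0), policy_weights(1))
print(distinct)
```

The trainable parameter count is dominated by the shared hypernetwork, so agent-specific behavior emerges without changing the learning objective or storing a full network per agent.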
arXiv Detail & Related papers (2024-12-05T15:09:51Z)
- Adaptive parameter sharing for multi-agent reinforcement learning [16.861543418593044]
We propose a novel parameter sharing method inspired by research pertaining to the brain in biology.
It maps each type of agent to a different region within a shared network based on its identity, resulting in distinct sub-networks.
Our method can increase the diversity of strategies among different agents without additional training parameters.
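A minimal sketch of this identity-based region mapping, assuming hypothetical agent types and a static, overlapping assignment of hidden units (the actual mapping in the paper may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 8
shared_hidden = rng.standard_normal((4, HIDDEN))  # one shared layer: 4 -> 8

# Hypothetical static assignment: each agent type claims a region
# (subset of hidden units) of the shared network; regions may overlap.
regions = {
    "scout":   np.arange(0, 5),   # units 0-4
    "carrier": np.arange(3, 8),   # units 3-7, overlapping with scout
}

def region_mask(agent_type):
    m = np.zeros(HIDDEN)
    m[regions[agent_type]] = 1.0
    return m

def forward(agent_type, obs):
    """Shared layer, with units outside the agent's region zeroed out."""
    return (obs @ shared_hidden) * region_mask(agent_type)

obs = rng.standard_normal(4)
h_scout = forward("scout", obs)
h_carrier = forward("carrier", obs)
print((h_scout[5:] == 0).all())    # scout never activates units 5-7
print((h_carrier[:3] == 0).all())  # carrier never activates units 0-2
```

Because the masks are fixed functions of agent identity, the two types realize distinct sub-networks inside one shared weight matrix with no additional trainable parameters, consistent with the claim above.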
arXiv Detail & Related papers (2023-12-14T15:00:32Z)
- Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training [88.80694147730883]
We investigate a variety of Modality-Shared Contrastive Language-Image Pre-training (MS-CLIP) frameworks.
Under the studied conditions, we observe that a mostly unified encoder for vision and language signals outperforms all other variations that separate more parameters.
Our approach outperforms vanilla CLIP by 1.6 points in linear probing on a collection of 24 downstream vision tasks.
arXiv Detail & Related papers (2022-07-26T05:19:16Z)
- Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL [107.58821842920393]
We quantify the behavior differences among agents and relate them to policy performance via Role Diversity.
We find that the error bound in MARL can be decomposed into three parts that have a strong relation to the role diversity.
The decomposed factors can significantly impact policy optimization on three popular directions.
arXiv Detail & Related papers (2022-06-01T04:58:52Z)
- Mix and Mask Actor-Critic Methods [0.0]
Shared feature spaces for actor-critic methods aim to capture generalized latent representations to be used by both the policy and the value function.
We present a novel feature-sharing framework to address these difficulties by introducing the mix and mask mechanisms and the distributional scalarization technique.
From our experimental results, we demonstrate significant performance improvements compared to alternative methods using separate networks and networks with a shared backbone.
arXiv Detail & Related papers (2021-06-24T14:12:45Z)
- Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing [4.855663359344748]
Sharing parameters in deep reinforcement learning has played an essential role in allowing algorithms to scale to a large number of agents.
However, having all agents share the same parameters can also have a detrimental effect on learning.
We propose a novel method to automatically identify agents which may benefit from sharing parameters by partitioning them based on their abilities and goals.
arXiv Detail & Related papers (2021-02-15T11:33:52Z)
- Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation [55.96893934962757]
In multi-agent systems, the policies of different agents need to be evaluated jointly.
In current methods, value functions or advantage functions use counterfactual joint actions that are evaluated asynchronously.
In this work, we propose the approximatively synchronous advantage estimation.
arXiv Detail & Related papers (2020-12-07T07:29:19Z)
- FACMAC: Factored Multi-Agent Centralised Policy Gradients [103.30380537282517]
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC).
It is a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
We evaluate FACMAC on variants of the multi-agent particle environments, a novel multi-agent MuJoCo benchmark, and a challenging set of StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2020-03-14T21:29:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.