MaskMA: Towards Zero-Shot Multi-Agent Decision Making with Mask-Based
Collaborative Learning
- URL: http://arxiv.org/abs/2310.11846v2
- Date: Fri, 23 Feb 2024 02:11:14 GMT
- Title: MaskMA: Towards Zero-Shot Multi-Agent Decision Making with Mask-Based
Collaborative Learning
- Authors: Jie Liu, Yinmin Zhang, Chuming Li, Chao Yang, Yaodong Yang, Yu Liu,
Wanli Ouyang
- Abstract summary: We propose a Mask-Based collaborative learning framework for Multi-Agent decision making (MaskMA).
We show MaskMA can achieve an impressive 77.8% average zero-shot win rate on 60 unseen test maps by decentralized execution.
- Score: 56.00558959816801
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building a single generalist agent with strong zero-shot capability
has recently seen significant advances. However, extending this capability
to multi-agent decision making scenarios presents challenges. Most current
works struggle with zero-shot transfer, due to two challenges particular to the
multi-agent settings: (a) a mismatch between centralized training and
decentralized execution; and (b) difficulties in creating generalizable
representations across diverse tasks due to varying agent numbers and action
spaces. To overcome these challenges, we propose a Mask-Based collaborative
learning framework for Multi-Agent decision making (MaskMA). Firstly, we
propose to randomly mask part of the units and collaboratively learn the
policies of unmasked units to handle the mismatch. In addition, MaskMA
integrates a generalizable action representation by dividing the action space
into intrinsic actions solely related to the unit itself and interactive
actions involving interactions with other units. This flexibility allows MaskMA
to tackle tasks with varying agent numbers and thus different action spaces.
Extensive experiments in SMAC reveal MaskMA, with a single model trained on 11
training maps, can achieve an impressive 77.8% average zero-shot win rate on 60
unseen test maps by decentralized execution, while also performing effectively
on other types of downstream tasks (e.g., varied policies collaboration, ally
malfunction, and ad hoc team play).
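The two ideas in the abstract, randomly masking units during centralized training and splitting the action space into intrinsic and interactive parts, can be sketched as follows. This is an illustrative outline only; the function names, the mask ratio, and the toy action counts are assumptions, not taken from the paper's released code.

```python
import random

def split_action_space(n_units, n_intrinsic):
    """Divide actions into intrinsic ones (related only to the unit
    itself, e.g. move/stop) and interactive ones (one per other unit,
    e.g. attack targets). The interactive part scales with the unit
    count, so one policy head can cover maps with different numbers
    of agents. (Illustrative indexing, not the paper's exact scheme.)"""
    intrinsic = list(range(n_intrinsic))
    interactive = list(range(n_intrinsic, n_intrinsic + n_units - 1))
    return intrinsic, interactive

def sample_mask(units, mask_ratio=0.5):
    """Randomly mask part of the units; the policies of the unmasked
    units are then learned collaboratively, which narrows the gap
    between centralized training and decentralized execution."""
    n_masked = int(len(units) * mask_ratio)
    masked = set(random.sample(units, n_masked))
    unmasked = [u for u in units if u not in masked]
    return masked, unmasked

units = list(range(8))
masked, unmasked = sample_mask(units, mask_ratio=0.25)
intrinsic, interactive = split_action_space(n_units=8, n_intrinsic=6)
```

With 8 units, a 0.25 mask ratio hides 2 units per training step, and the action space has 6 intrinsic plus 7 interactive actions (one per other unit).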
Related papers
- Improving Global Parameter-sharing in Physically Heterogeneous Multi-agent Reinforcement Learning with Unified Action Space [22.535906675532196]
In a multi-agent system, action semantics indicates the different influences of agents' actions toward other entities.
Previous multi-agent reinforcement learning (MARL) algorithms apply global parameter-sharing across different types of heterogeneous agents.
We introduce the Unified Action Space (UAS) to fulfill this requirement.
arXiv Detail & Related papers (2024-08-14T09:15:11Z) - Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning [54.40927310957792]
We introduce a novel concept of personalized expert demonstrations, tailored for each individual agent or, more broadly, each individual type of agent within a heterogeneous team.
These demonstrations solely pertain to single-agent behaviors and how each agent can achieve personal goals without encompassing any cooperative elements.
We propose an approach that selectively utilizes personalized expert demonstrations as guidance and allows agents to learn to cooperate.
arXiv Detail & Related papers (2024-03-13T20:11:20Z) - Multi-agent Continual Coordination via Progressive Task Contextualization [5.31057635825112]
This paper proposes Multi-Agent Continual Coordination via Progressive Task Contextualization, dubbed MACPro.
We show in multiple multi-agent benchmarks that existing continual learning methods fail, while MACPro is able to achieve close-to-optimal performance.
arXiv Detail & Related papers (2023-05-07T15:04:56Z) - CLAS: Coordinating Multi-Robot Manipulation with Central Latent Action Spaces [9.578169216444813]
This paper proposes an approach to coordinating multi-robot manipulation through learned latent action spaces that are shared across different agents.
We validate our method in simulated multi-robot manipulation tasks and demonstrate improvement over previous baselines in terms of sample efficiency and learning performance.
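The coordination mechanism described above can be sketched structurally: a single central latent action is decoded into each robot's low-level command, so the team coordinates in the shared latent space. The decoder shape below is a hypothetical stand-in for the learned per-robot decoders, not the authors' implementation.

```python
def decode_for_robot(z, robot_id):
    """Per-robot decoder: maps the shared latent action z to this
    robot's low-level action (here a toy offset standing in for a
    learned mapping)."""
    return [v + 0.1 * robot_id for v in z]

# One latent action is chosen centrally and shared by the whole team.
z = [0.2, -0.3]
actions = {r: decode_for_robot(z, r) for r in range(2)}
```

Because the policy acts in the low-dimensional shared space rather than in the product of all robots' action spaces, sample efficiency can improve, which matches the gains the summary reports.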
arXiv Detail & Related papers (2022-11-28T23:20:47Z) - Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture the agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperforms prior works using single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z) - Transformer-based Value Function Decomposition for Cooperative Multi-agent Reinforcement Learning in StarCraft [1.160208922584163]
The StarCraft II Multi-Agent Challenge (SMAC) was created to be a benchmark problem for cooperative multi-agent reinforcement learning (MARL).
This paper introduces a new architecture TransMix, a transformer-based joint action-value mixing network.
arXiv Detail & Related papers (2022-08-15T16:13:16Z) - Off-Beat Multi-Agent Reinforcement Learning [62.833358249873704]
We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent.
We propose a novel episodic memory, LeGEM, for model-free MARL algorithms.
We evaluate LeGEM on various multi-agent scenarios with off-beat actions, including Stag-Hunter Game, Quarry Game, Afforestation Game, and StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2022-05-27T02:21:04Z) - LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning [122.47938710284784]
We propose a novel framework for learning dynamic subtask assignment (LDSA) in cooperative MARL.
To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy.
We show that LDSA learns reasonable and effective subtask assignment for better collaboration.
arXiv Detail & Related papers (2022-05-05T10:46:16Z) - Multi-Task Adversarial Attack [3.412750324146571]
Multi-Task adversarial Attack (MTA) is a unified framework that can craft adversarial examples for multiple tasks efficiently.
MTA uses a generator for adversarial perturbations which consists of a shared encoder for all tasks and multiple task-specific decoders.
Thanks to the shared encoder, MTA reduces the storage cost and speeds up the inference when attacking multiple tasks simultaneously.
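The shared-encoder/per-task-decoder layout described above can be shown in a minimal structural sketch: the encoder runs once, and each task-specific decoder reuses its output. The class names and the toy arithmetic are illustrative assumptions, not the MTA authors' code.

```python
class SharedEncoder:
    """One set of parameters reused by every task; computing its
    features once is what saves storage and inference time."""
    def __call__(self, x):
        return [v * 0.5 for v in x]  # stand-in for a learned mapping

class TaskDecoder:
    """Each task gets its own decoder producing its perturbation."""
    def __init__(self, task_id):
        self.task_id = task_id
    def __call__(self, features):
        return [v + self.task_id for v in features]

encoder = SharedEncoder()
decoders = {t: TaskDecoder(t) for t in range(3)}

x = [1.0, 2.0]
features = encoder(x)  # encoded once, shared by all tasks
perturbations = {t: d(features) for t, d in decoders.items()}
```

Attacking all three tasks here costs one encoder pass plus three cheap decoder passes, rather than three full forward passes, which is the efficiency argument in the summary.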
arXiv Detail & Related papers (2020-11-19T13:56:58Z) - A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks [111.34055449929487]
We introduce the novel task FurnMove in which agents work together to move a piece of furniture through a living room to a goal.
Unlike existing tasks, FurnMove requires agents to coordinate at every timestep.
We identify challenges when training agents to complete FurnMove, including that existing decentralized action sampling procedures do not permit expressive joint action policies.
Using SYNC-policies and CORDIAL, our agents achieve a 58% completion rate on FurnMove, an impressive absolute gain of 25 percentage points over competitive decentralized baselines.
arXiv Detail & Related papers (2020-07-09T17:59:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.