RODE: Learning Roles to Decompose Multi-Agent Tasks
- URL: http://arxiv.org/abs/2010.01523v1
- Date: Sun, 4 Oct 2020 09:20:59 GMT
- Title: RODE: Learning Roles to Decompose Multi-Agent Tasks
- Authors: Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson,
Chongjie Zhang
- Abstract summary: Role-based learning holds the promise of achieving scalable multi-agent learning by decomposing complex tasks using roles.
We propose to first decompose joint action spaces into restricted role action spaces by clustering actions according to their effects on the environment and other agents.
By virtue of these advances, our method outperforms the current state-of-the-art MARL algorithms on 10 of the 14 scenarios that comprise the challenging StarCraft II micromanagement benchmark.
- Score: 69.56458960841165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Role-based learning holds the promise of achieving scalable multi-agent
learning by decomposing complex tasks using roles. However, it is largely
unclear how to efficiently discover such a set of roles. To solve this problem,
we propose to first decompose joint action spaces into restricted role action
spaces by clustering actions according to their effects on the environment and
other agents. Learning a role selector based on action effects makes role
discovery much easier because it forms a bi-level learning hierarchy -- the
role selector searches in a smaller role space and at a lower temporal
resolution, while role policies learn in significantly reduced primitive
action-observation spaces. We further integrate information about action
effects into the role policies to boost learning efficiency and policy
generalization. By virtue of these advances, our method (1) outperforms the
current state-of-the-art MARL algorithms on 10 of the 14 scenarios that
comprise the challenging StarCraft II micromanagement benchmark and (2)
achieves rapid transfer to new environments with three times the number of
agents. Demonstrative videos are available at
https://sites.google.com/view/rode-marl .
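The abstract outlines two mechanisms: clustering primitive actions by their effects to obtain restricted role action spaces, and a bi-level hierarchy in which a role selector acts at a lower temporal resolution than the role policies. The sketch below illustrates that control flow only; the random effect embeddings, the k-means clustering, the placeholder selector/policy functions, and the role-selection interval `c` are hypothetical stand-ins for exposition, not details taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

# --- Stage 1: form restricted role action spaces --------------------------
# Assumption: each primitive action already has an "effect" embedding,
# e.g. learned from how the action changes the environment and other
# agents. Random placeholders are used here.
n_actions, embed_dim, n_roles = 12, 16, 3
action_effects = np.random.randn(n_actions, embed_dim)

# Cluster actions by effect; each cluster becomes one role's action space.
labels = KMeans(n_clusters=n_roles, n_init=10).fit_predict(action_effects)
role_action_spaces = [np.flatnonzero(labels == k) for k in range(n_roles)]

# --- Stage 2: bi-level execution -------------------------------------------
c = 5  # role-selection interval (illustrative)

def role_selector(obs):
    # Placeholder: a learned selector would score roles from observations
    # (and, per the abstract, from the roles' action-effect information).
    return np.random.randint(n_roles)

def role_policy(obs, role_id):
    # Placeholder: a learned policy restricted to the role's action space.
    return np.random.choice(role_action_spaces[role_id])

def act(obs_stream):
    """Yield primitive actions for one agent over an episode."""
    role_id = None
    for t, obs in enumerate(obs_stream):
        if t % c == 0:                   # coarser temporal resolution
            role_id = role_selector(obs)
        yield role_policy(obs, role_id)  # reduced primitive action space
```

The intended payoff, as described in the abstract, is that the selector only searches over a handful of roles once every few steps, while each role policy searches within a much smaller primitive action space.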
Related papers
- Active Fine-Tuning of Generalist Policies [54.65568433408307]
We propose AMF (Active Multi-task Fine-tuning) to maximize multi-task policy performance under a limited demonstration budget.
We derive performance guarantees for AMF under regularity assumptions and demonstrate its empirical effectiveness in complex and high-dimensional environments.
arXiv Detail & Related papers (2024-10-07T13:26:36Z)
- Variational Offline Multi-agent Skill Discovery [43.869625428099425]
We propose two novel auto-encoder schemes to simultaneously capture subgroup- and temporal-level abstractions and form multi-agent skills.
Our method can be applied to offline multi-task data, and the discovered subgroup skills can be transferred across relevant tasks without retraining.
arXiv Detail & Related papers (2024-05-26T00:24:46Z)
- Action Pick-up in Dynamic Action Space Reinforcement Learning [6.15205100319133]
We propose an intelligent Action Pick-up (AP) algorithm that autonomously chooses, from a set of new actions, those most likely to boost performance.
We first theoretically analyze and find that a prior optimal policy plays an important role in action pick-up by providing useful knowledge and experience.
We then design two AP methods based on the prior optimal policy: a frequency-based global method and a state-clustering-based local method.
arXiv Detail & Related papers (2023-04-03T10:55:16Z)
- CLAS: Coordinating Multi-Robot Manipulation with Central Latent Action Spaces [9.578169216444813]
This paper proposes an approach to coordinating multi-robot manipulation through learned latent action spaces that are shared across different agents.
We validate our method in simulated multi-robot manipulation tasks and demonstrate improvement over previous baselines in terms of sample efficiency and learning performance.
arXiv Detail & Related papers (2022-11-28T23:20:47Z)
- Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL [107.58821842920393]
We quantify agents' behavior differences and relate them to policy performance via Role Diversity.
We find that the error bound in MARL can be decomposed into three parts that have a strong relation to role diversity.
The decomposed factors can significantly impact policy optimization in three popular directions.
arXiv Detail & Related papers (2022-06-01T04:58:52Z)
- LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning [122.47938710284784]
We propose a novel framework for learning dynamic subtask assignment (LDSA) in cooperative MARL.
To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy.
We show that LDSA learns reasonable and effective subtask assignment for better collaboration.
arXiv Detail & Related papers (2022-05-05T10:46:16Z)
- Learning to Transfer Role Assignment Across Team Sizes [48.43860606706273]
We propose a framework to learn role assignment and transfer across team sizes.
We demonstrate that re-using the role-based credit assignment structure can foster the learning process of larger reinforcement learning teams.
arXiv Detail & Related papers (2022-04-17T11:22:01Z)
- Hierarchically Decoupled Imitation for Morphological Transfer [95.19299356298876]
We show that transferring learned information from a morphologically simpler agent can massively improve the sample efficiency of a more complex one.
First, we show that incentivizing a complex agent's low-level to imitate a simpler agent's low-level significantly improves zero-shot high-level transfer.
Second, we show that KL-regularized training of the high level stabilizes learning and prevents mode-collapse.
arXiv Detail & Related papers (2020-03-03T18:56:49Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.