Learning Graph-Enhanced Commander-Executor for Multi-Agent Navigation
- URL: http://arxiv.org/abs/2302.04094v1
- Date: Wed, 8 Feb 2023 14:44:21 GMT
- Title: Learning Graph-Enhanced Commander-Executor for Multi-Agent Navigation
- Authors: Xinyi Yang, Shiyu Huang, Yiwen Sun, Yuxiang Yang, Chao Yu, Wei-Wei Tu,
Huazhong Yang, Yu Wang
- Abstract summary: Multi-agent reinforcement learning (MARL) has shown promising results for multi-agent navigation.
Goal-conditioned hierarchical reinforcement learning (HRL) provides a promising direction to tackle the large search space of this problem.
We propose MAGE-X, a graph-based goal-conditioned hierarchical method for multi-agent navigation tasks.
- Score: 28.71585436726336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the multi-agent navigation problem, which requires
multiple agents to reach the target goals in a limited time. Multi-agent
reinforcement learning (MARL) has shown promising results for solving this
issue. However, it is inefficient for MARL to directly explore the (nearly)
optimal policy in the large search space, which is exacerbated as the agent
number increases (e.g., 10+ agents) or the environment becomes more complex (e.g.,
a 3D simulator). Goal-conditioned hierarchical reinforcement learning (HRL)
provides a promising direction to tackle this challenge by introducing a
hierarchical structure to decompose the search space, where the low-level
policy predicts primitive actions under the guidance of goals derived from the
high-level policy. In this paper, we propose Multi-Agent Graph-Enhanced
Commander-Executor (MAGE-X), a graph-based goal-conditioned hierarchical method
for multi-agent navigation tasks. MAGE-X comprises a high-level Goal Commander
and a low-level Action Executor. The Goal Commander predicts a probability
distribution over goals and leverages it to assign each agent the most
appropriate final target. The Action Executor utilizes graph neural networks
(GNN) to construct a subgraph for each agent that only contains crucial
partners to improve cooperation. Additionally, the Goal Encoder in the Action
Executor captures the relationship between the agent and the designated goal to
encourage the agent to reach the final target. The results show that MAGE-X
outperforms state-of-the-art MARL baselines, achieving a 100% success rate with
only 3 million training steps in multi-agent particle environments (MPE) with
50 agents, and at least a 12% higher success rate and 2x higher data efficiency
in a more complicated quadrotor 3D navigation task.
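To make the high-level/low-level split concrete, below is a minimal, hypothetical sketch of the two roles described in the abstract: a commander that turns a distribution over goals into a per-agent target assignment, and an executor-side helper that restricts each agent's graph to a few crucial partners. The softmax-over-distance scoring, greedy matching, and k-nearest-neighbour subgraph are illustrative stand-ins for the learned components in MAGE-X, not the paper's exact design.
```python
# Illustrative sketch only: the scoring, matching rule, and subgraph choice
# are assumptions; MAGE-X learns these components with neural networks.
import numpy as np

def commander_assign_goals(agent_pos, goal_pos):
    """High-level 'Goal Commander': score every (agent, goal) pair and
    greedily assign each agent its most appropriate final target."""
    dists = np.linalg.norm(agent_pos[:, None, :] - goal_pos[None, :, :], axis=-1)
    scores = np.exp(-dists)                     # stand-in for a learned goal distribution
    probs = scores / scores.sum(axis=1, keepdims=True)
    assignment = np.full(len(agent_pos), -1)
    taken = set()
    # Greedy matching: most confident (agent, goal) pairs are assigned first.
    for a, g in sorted(np.ndindex(*probs.shape), key=lambda ag: -probs[ag]):
        if assignment[a] == -1 and g not in taken:
            assignment[a] = g
            taken.add(g)
    return assignment, probs

def executor_subgraph(agent_pos, i, k=3):
    """Low-level 'Action Executor' helper: keep only the k closest partners
    of agent i, mimicking the per-agent subgraph of crucial neighbours."""
    d = np.linalg.norm(agent_pos - agent_pos[i], axis=-1)
    return np.argsort(d)[1:k + 1]               # skip the agent itself

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    agents, goals = rng.uniform(size=(5, 2)), rng.uniform(size=(5, 2))
    assignment, _ = commander_assign_goals(agents, goals)
    print("goal assignment:", assignment)
    print("subgraph of agent 0:", executor_subgraph(agents, 0, k=2))
```
In MAGE-X both pieces are learned networks (the commander's goal distribution and the executor's GNN over the subgraph); this sketch only illustrates the interface between the two levels.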
Related papers
- HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model [39.169389255970806]
HiAgent is a framework that leverages subgoals as memory chunks to manage the working memory of Large Language Model (LLM)-based agents hierarchically.
Results show that HiAgent achieves a twofold increase in success rate and reduces the average number of steps required by 3.8.
arXiv Detail & Related papers (2024-08-18T17:59:49Z)
- Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement [50.481380478458945]
Iterative step-level Process Refinement (IPR) framework provides detailed step-by-step guidance to enhance agent training.
Our experiments on three complex agent tasks demonstrate that our framework outperforms a variety of strong baselines.
arXiv Detail & Related papers (2024-06-17T03:29:13Z)
- Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models [56.00992369295851]
Open-sourced Large Language Models (LLMs) have achieved great success in various NLP tasks; however, they are still far inferior to API-based models when acting as agents.
This paper presents three key observations: (1) the current agent training corpus entangles format following with agent reasoning, which deviates significantly from the distribution of the models' pre-training data; (2) LLMs exhibit different learning speeds on the capabilities required by agent tasks; and (3) current approaches introduce hallucinations as a side effect of improving agent abilities.
We propose Agent-FLAN to effectively Fine-tune LANguage models for Agents.
arXiv Detail & Related papers (2024-03-19T16:26:10Z)
- Learning to Use Tools via Cooperative and Interactive Agents [58.77710337157665]
Tool learning empowers large language models (LLMs) as agents to use external tools and extend their utility.
We propose ConAgents, a Cooperative and interactive Agents framework, which coordinates three specialized agents for tool selection, tool execution, and action calibration separately.
Our experiments on three datasets show that the LLMs, when equipped with ConAgents, outperform baselines with substantial improvement.
arXiv Detail & Related papers (2024-03-05T15:08:16Z)
- MASP: Scalable GNN-based Planning for Multi-Agent Navigation [17.788592987873905]
We propose a goal-conditioned hierarchical planner for navigation tasks with a substantial number of agents.
We also leverage graph neural networks (GNN) to model the interaction between agents and goals, improving goal achievement.
The results demonstrate that MASP outperforms classical planning-based competitors and RL baselines.
arXiv Detail & Related papers (2023-12-05T06:05:04Z)
- Agents meet OKR: An Object and Key Results Driven Agent System with Hierarchical Self-Collaboration and Self-Evaluation [25.308341461293857]
OKR-Agent is designed to enhance the capabilities of Large Language Models (LLMs) in task-solving.
Our framework includes two novel modules: hierarchical Objects and Key Results generation and multi-level evaluation.
arXiv Detail & Related papers (2023-11-28T06:16:30Z)
- Learning From Good Trajectories in Offline Multi-Agent Reinforcement Learning [98.07495732562654]
Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets.
An agent trained by offline MARL can inherit a random policy present in the dataset, jeopardizing the performance of the entire team.
We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z)
- Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture the agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperform prior works that use single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z)
- A Cooperation Graph Approach for Multiagent Sparse Reward Reinforcement Learning [7.2972297703292135]
Multiagent reinforcement learning (MARL) can solve complex cooperative tasks.
In this paper, we design a graph network called the Cooperation Graph (CG).
We propose a Cooperation Graph Multiagent Reinforcement Learning (CG-MARL) algorithm, which can efficiently deal with the sparse reward problem in multiagent tasks.
arXiv Detail & Related papers (2022-08-05T06:32:16Z)
- Multi-Agent Embodied Visual Semantic Navigation with Scene Prior Knowledge [42.37872230561632]
In visual semantic navigation, the robot navigates to a target object using egocentric visual observations, given the class label of the target.
Most existing models are only effective for single-agent navigation, and a single agent has low efficiency and poor fault tolerance when completing more complicated tasks.
We propose multi-agent visual semantic navigation, in which multiple agents collaborate with each other to find multiple target objects.
arXiv Detail & Related papers (2021-09-20T13:31:03Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)