Related papers: Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation

Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation

URL: http://arxiv.org/abs/2403.08282v2
Date: Mon, 18 Mar 2024 05:03:53 GMT
Title: Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation
Authors: Zhonghan Zhao, Kewei Chen, Dongxu Guo, Wenhao Chai, Tian Ye, Yanting Zhang, Gaoang Wang,
Abstract summary: We design a hierarchical auto-organizing navigation system for multi-agent organization in Minecraft. We also design a series of navigation tasks in the Minecraft environment, which includes searching and exploring. We aim to develop embodied organizations that push the boundaries of embodied AI, moving it towards a more human-like organizational structure.
Score: 12.753472502707153
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Due to the dynamic and unpredictable open-world setting, navigating complex environments in Minecraft poses significant challenges for multi-agent systems. Agents must interact with the environment and coordinate their actions with other agents to achieve common objectives. However, traditional approaches often struggle to efficiently manage inter-agent communication and task distribution, crucial for effective multi-agent navigation. Furthermore, processing and integrating multi-modal information (such as visual, textual, and auditory data) is essential for agents to comprehend their goals and navigate the environment successfully and fully. To address this issue, we design the HAS framework to auto-organize groups of LLM-based agents to complete navigation tasks. In our approach, we devise a hierarchical auto-organizing navigation system, which is characterized by 1) a hierarchical system for multi-agent organization, ensuring centralized planning and decentralized execution; 2) an auto-organizing and intra-communication mechanism, enabling dynamic group adjustment under subtasks; 3) a multi-modal information platform, facilitating multi-modal perception to perform the three navigation tasks with one system. To assess organizational behavior, we design a series of navigation tasks in the Minecraft environment, which includes searching and exploring. We aim to develop embodied organizations that push the boundaries of embodied AI, moving it towards a more human-like organizational structure.

Related papers

Agentic Web: Weaving the Next Web with AI Agents [109.13815627467514]
The emergence of AI agents powered by large language models (LLMs) marks a pivotal shift toward the Agentic Web.<n>In this paradigm, agents interact directly with one another to plan, coordinate, and execute complex tasks on behalf of users.<n>We present a structured framework for understanding and building the Agentic Web.
arXiv Detail & Related papers (2025-07-28T17:58:12Z)
AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving [30.50203052125566]
projectname is a hierarchical multi-agent framework for general-purpose task solving.<n>projectname features a central planning agent that decomposes complex objectives and delegates sub-tasks to a team of specialized agents.<n>Each sub-agent is equipped with general programming and analytical tools, as well as abilities to tackle a wide range of real-world specific tasks.
arXiv Detail & Related papers (2025-06-14T13:45:37Z)
OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis [70.39500621448383]
Open-world mobile manipulation task remains a challenge due to the need for generalization to open-ended instructions and environments.<n>We propose a novel multi-modal agent architecture that maintains multi-view scene frames and agent states for decision-making and controls the robot by function calling.<n>We highlight our fine-tuned OWMM-VLM as the first dedicated foundation model for mobile manipulators with global scene understanding, robot state tracking, and multi-modal action generation in a unified model.
arXiv Detail & Related papers (2025-06-04T17:57:44Z)
AgentNet: Decentralized Evolutionary Coordination for LLM-based Multi-Agent Systems [22.291969093748005]
AgentNet is a decentralized, Retrieval-Augmented Generation (RAG)-based framework for multi-agent systems. Unlike traditional multi-agent systems that depend on static assignments or centralized control, AgentNet allows agents to specialize dynamically. AgentNet promotes scalable adaptability and enables privacy-preserving collaboration across organizations.
arXiv Detail & Related papers (2025-04-01T09:45:25Z)
OmniNova:A General Multimodal Agent Framework [0.5439020425819]
Large Language Models (LLMs) with specialized tools presents new opportunities for intelligent automation systems. We present OmniNova, a modular multi-agent automation framework that combines language models with specialized tools such as web search, crawling, and code execution capabilities.
arXiv Detail & Related papers (2025-03-25T19:21:01Z)
Autonomous Deep Agent [0.7489814067742621]
Deep Agent is an advanced autonomous AI system designed to manage complex multi-phase tasks. The system's foundation is built on our Hierarchical Task DAG framework. Deep Agent establishes a novel paradigm in self-governing AI systems.
arXiv Detail & Related papers (2025-02-10T21:46:54Z)
LLM-Powered Decentralized Generative Agents with Adaptive Hierarchical Knowledge Graph for Cooperative Planning [12.996741471128539]
Developing intelligent agents for long-term cooperation in dynamic open-world scenarios is a major challenge in multi-agent systems. We propose Decentralized Adaptive Knowledge Graph Memory and Structured Communication System (DAMCS) in a novel Multi-agent Crafter environment. Our generative agents, powered by Large Language Models (LLMs), are more scalable than traditional MARL agents by leveraging external knowledge and language for long-term planning and reasoning.
arXiv Detail & Related papers (2025-02-08T05:26:02Z)
Cooperative and Asynchronous Transformer-based Mission Planning for Heterogeneous Teams of Mobile Robots [1.1049608786515839]
This paper presents the Cooperative and Asynchronous Transformer-based Mission Planning (CATMiP) framework. CatMiP uses multi-agent reinforcement learning to coordinate agents with heterogeneous sensing, motion, and actuation capabilities. It easily adapts to mission complexities and communication constraints, and scales to varying environment sizes and team compositions.
arXiv Detail & Related papers (2024-10-08T21:14:09Z)
Agent-Oriented Planning in Multi-Agent Systems [54.429028104022066]
We propose AOP, a novel framework for agent-oriented planning in multi-agent systems. In this study, we identify three critical design principles of agent-oriented planning, including solvability, completeness, and non-redundancy. Extensive experiments demonstrate the advancement of AOP in solving real-world problems compared to both single-agent systems and existing planning strategies for multi-agent systems.
arXiv Detail & Related papers (2024-10-03T04:07:51Z)
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence [79.5316642687565]
Existing multi-agent frameworks often struggle with integrating diverse capable third-party agents. We propose the Internet of Agents (IoA), a novel framework that addresses these limitations. IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control.
arXiv Detail & Related papers (2024-07-09T17:33:24Z)
CAMON: Cooperative Agents for Multi-Object Navigation with LLM-based Conversations [22.79813240034754]
Large language models (LLMs) have exhibited remarkable comprehension and planning abilities. This paper proposes a framework for decentralized multi-agent navigation, leveraging LLM-enabled communication and collaboration.
arXiv Detail & Related papers (2024-06-30T09:14:33Z)
QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds [51.05639500325598]
We introduce QuadrupedGPT, designed to follow diverse commands with agility comparable to that of a pet. Our agent shows proficiency in handling diverse tasks and intricate instructions, representing a significant step toward the development of versatile quadruped agents.
arXiv Detail & Related papers (2024-06-24T12:14:24Z)
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration [52.25473993987409]
We propose Mobile-Agent-v2, a multi-agent architecture for mobile device operation assistance. The architecture comprises three agents: planning agent, decision agent, and reflection agent. We show that Mobile-Agent-v2 achieves over a 30% improvement in task completion compared to the single-agent architecture.
arXiv Detail & Related papers (2024-06-03T05:50:00Z)
AgentScope: A Flexible yet Robust Multi-Agent Platform [66.64116117163755]
AgentScope is a developer-centric multi-agent platform with message exchange as its core communication mechanism. The abundant syntactic tools, built-in agents and service functions, user-friendly interfaces for application demonstration and utility monitor, zero-code programming workstation, and automatic prompt tuning mechanism significantly lower the barriers to both development and deployment.
arXiv Detail & Related papers (2024-02-21T04:11:28Z)
An Interactive Agent Foundation Model [49.77861810045509]
We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents. Our training paradigm unifies diverse pre-training strategies, including visual masked auto-encoders, language modeling, and next-action prediction. We demonstrate the performance of our framework across three separate domains -- Robotics, Gaming AI, and Healthcare.
arXiv Detail & Related papers (2024-02-08T18:58:02Z)
AutoAgents: A Framework for Automatic Agent Generation [27.74332323317923]
AutoAgents is an innovative framework that adaptively generates and coordinates multiple specialized agents to build an AI team according to different tasks. Our experiments on various benchmarks demonstrate that AutoAgents generates more coherent and accurate solutions than the existing multi-agent methods.
arXiv Detail & Related papers (2023-09-29T14:46:30Z)
Multi-Level Compositional Reasoning for Interactive Instruction Following [24.581542880280203]
Multi-level Compositional Reasoning Agent (MCR-Agent) At the highest level, we infer a sequence of human-interpretable subgoals to be executed based on language instructions by a high-level policy composition controller. At the middle level, we discriminatively control the agent's navigation by a master policy by alternating between a navigation policy and various independent interaction policies. At the lowest level, we infer manipulation actions with the corresponding object masks using the appropriate interaction policy.
arXiv Detail & Related papers (2023-08-18T08:38:28Z)
Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space. Also, we propose a novel framework to adopt the multi-agent options in the MARL process. We show that the proposed algorithm can effectively capture the agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperforms prior works using single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z)
DC-MRTA: Decentralized Multi-Robot Task Allocation and Navigation in Complex Environments [55.204450019073036]
We present a novel reinforcement learning based task allocation and decentralized navigation algorithm for mobile robots in warehouse environments. We consider the problem of joint decentralized task allocation and navigation and present a two level approach to solve it. We observe improvement up to 14% in terms of task completion time and up-to 40% improvement in terms of computing collision-free trajectories for the robots.
arXiv Detail & Related papers (2022-09-07T00:35:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.