LLM-MCoX: Large Language Model-based Multi-robot Coordinated Exploration and Search
- URL: http://arxiv.org/abs/2509.26324v2
- Date: Sat, 04 Oct 2025 22:23:46 GMT
- Title: LLM-MCoX: Large Language Model-based Multi-robot Coordinated Exploration and Search
- Authors: Ruiyang Wang, Hao-Lun Hsu, David Hunt, Shaocheng Luo, Jiwoo Kim, Miroslav Pajic
- Abstract summary: We introduce LLM-MCoX (LLM-based Multi-robot Coordinated Exploration and Search), a novel framework for intelligent coordination of homogeneous and heterogeneous robot teams. Our approach combines real-time LiDAR scan processing for frontier cluster extraction and doorway detection with multimodal LLM reasoning. LLMs enable natural language-based object search capabilities, allowing human operators to provide high-level semantic guidance.
- Score: 13.039064446429407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous exploration and object search in unknown indoor environments remain challenging for multi-robot systems (MRS). Traditional approaches often rely on greedy frontier assignment strategies with limited inter-robot coordination. In this work, we introduce LLM-MCoX (LLM-based Multi-robot Coordinated Exploration and Search), a novel framework that leverages Large Language Models (LLMs) for intelligent coordination of both homogeneous and heterogeneous robot teams tasked with efficient exploration and target object search. Our approach combines real-time LiDAR scan processing for frontier cluster extraction and doorway detection with multimodal LLM reasoning (e.g., GPT-4o) to generate coordinated waypoint assignments based on shared environment maps and robot states. LLM-MCoX demonstrates superior performance compared to existing methods, including greedy and Voronoi-based planners, achieving 22.7% faster exploration times and 50% improved search efficiency in large environments with 6 robots. Notably, LLM-MCoX enables natural language-based object search capabilities, allowing human operators to provide high-level semantic guidance that traditional algorithms cannot interpret.
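To make the baseline named in the abstract concrete, the following is a minimal sketch of frontier detection on an occupancy grid followed by a greedy frontier assignment. It is not the LLM-MCoX implementation and not the paper's baseline code: the grid encoding (0 free, 1 occupied, -1 unknown), the function names `find_frontiers` and `greedy_assign`, and the toy map are all assumptions chosen for illustration. Frontier clustering, doorway detection, and the LLM-based coordination described in the abstract are omitted.

```python
from math import hypot

# Assumed cell encoding for this sketch (not taken from the paper).
FREE, OCCUPIED, UNKNOWN = 0, 1, -1


def find_frontiers(grid):
    """Return (row, col) of free cells that touch an unknown cell (4-connected)."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


def greedy_assign(robots, frontiers):
    """Give each robot, in order, its nearest frontier that is still unclaimed.

    Robots are processed one at a time with no look-ahead, which is the
    limited-coordination behavior the abstract attributes to greedy planners.
    """
    remaining = list(frontiers)
    assignment = {}
    for name, (rr, rc) in robots.items():
        if not remaining:
            break
        target = min(remaining, key=lambda f: hypot(f[0] - rr, f[1] - rc))
        assignment[name] = target
        remaining.remove(target)
    return assignment


# Toy 3x4 map: the right-hand side is unexplored.
grid = [
    [0, 0, -1, -1],
    [0, 1, -1, -1],
    [0, 0, 0, -1],
]
robots = {"r1": (0, 0), "r2": (2, 0)}

frontiers = find_frontiers(grid)
plan = greedy_assign(robots, frontiers)
print(frontiers)  # [(0, 1), (2, 2)]
print(plan)       # {'r1': (0, 1), 'r2': (2, 2)}
```

Because each robot picks independently by straight-line distance, the result depends on robot ordering and ignores walls; a coordinated planner would instead reason over all robots and the shared map at once.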
Related papers
- SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent for translating high-level research objectives into standardized experimental configurations. An Experiment Manager orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback. Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z)
- Heterogeneous Robot Collaboration in Unstructured Environments with Grounded Generative Intelligence [54.91177026001217]
Large language model (LLM)-enabled teaming methods typically assume well-structured and known environments. We present SPINE-HT, a framework that addresses these limitations by grounding the reasoning abilities of LLMs in the context of a heterogeneous robot team. Our framework achieves nearly twice the success rate compared to prior LLM-enabled heterogeneous teaming approaches.
arXiv Detail & Related papers (2025-10-30T18:24:38Z)
- Hierarchical Language Models for Semantic Navigation and Manipulation in an Aerial-Ground Robotic System [8.88014241557266]
Heterogeneous multirobot systems show great potential in complex tasks requiring coordinated hybrid cooperation. Existing methods that rely on static or task-specific models often lack generalizability across diverse tasks and dynamic environments. We propose a hierarchical multimodal framework that integrates a prompted large language model (LLM) with a fine-tuned vision-language model (VLM).
arXiv Detail & Related papers (2025-06-05T13:27:41Z)
- MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement [72.760041766681]
We propose MLE-STAR, a novel approach to build machine learning agents. MLE-STAR first leverages external knowledge by using a search engine to retrieve effective models from the web. We introduce a novel ensembling method using an effective strategy suggested by MLE-STAR.
arXiv Detail & Related papers (2025-05-27T18:11:25Z)
- Deploying Foundation Model-Enabled Air and Ground Robots in the Field: Challenges and Opportunities [65.98704516122228]
The integration of foundation models (FMs) into robotics has enabled robots to understand natural language and reason about the semantics in their environments. This paper addresses the deployment of FM-enabled robots in the field, where missions often require a robot to operate in large-scale and unstructured environments. We present the first demonstration of large-scale LLM-enabled robot planning in unstructured environments, with missions spanning several kilometers.
arXiv Detail & Related papers (2025-05-14T15:28:43Z)
- MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models [5.28115111932163]
We present MLLM-Search, a novel zero-shot person search architecture for mobile robots. Our approach introduces a novel visual prompting method to provide robots with spatial understanding of the environment. Experiments with a mobile robot in a multi-room floor of a building showed that MLLM-Search was able to generalize to finding a person in a new unseen environment.
arXiv Detail & Related papers (2024-11-27T21:59:29Z)
- MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [62.854649499866774]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation. We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents. We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z)
- Large Language Models for Orchestrating Bimanual Robots [19.60907949776435]
We present LAnguage-model-based Bimanual ORchestration (LABOR) to analyze task configurations and devise coordination control policies. We evaluate our method through simulated experiments involving two classes of long-horizon tasks using the NICOL humanoid robot.
arXiv Detail & Related papers (2024-04-02T15:08:35Z)
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks. We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
- From Simulations to Reality: Enhancing Multi-Robot Exploration for Urban Search and Rescue [46.377510400989536]
We present a novel hybrid algorithm for efficient multi-robot exploration in unknown environments with limited communication and no global positioning information. We redefine the local best and global best positions to suit scenarios without continuous target information. The presented work holds promise for enhancing multi-robot exploration in scenarios with limited information and communication capabilities.
arXiv Detail & Related papers (2023-11-28T17:05:25Z)
- Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation Using Vision Language Models [8.668211481067457]
Co-NavGPT is a novel framework that integrates a Vision Language Model (VLM) as a global planner. Co-NavGPT aggregates sub-maps from multiple robots with diverse viewpoints into a unified global map. The VLM uses this information to assign frontiers across the robots, facilitating coordinated and efficient exploration.
arXiv Detail & Related papers (2023-10-11T23:17:43Z)
- Intrinsic Language-Guided Exploration for Complex Long-Horizon Robotic Manipulation Tasks [12.27904219271791]
Current reinforcement learning algorithms struggle in sparse and complex environments. We propose the Intrinsically Guided Exploration from Large Language Models (IGE-LLMs) framework.
arXiv Detail & Related papers (2023-09-28T11:14:52Z)
- Language to Rewards for Robotic Skill Synthesis [37.21434094015743]
We introduce a new paradigm that harnesses large language models (LLMs) to define reward parameters that can be optimized to accomplish a variety of robotic tasks. Using reward as the intermediate interface generated by LLMs, we can effectively bridge the gap between high-level language instructions or corrections and low-level robot actions.
arXiv Detail & Related papers (2023-06-14T17:27:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.