Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation using
Large Language Models
- URL: http://arxiv.org/abs/2310.07937v2
- Date: Mon, 25 Dec 2023 07:57:13 GMT
- Title: Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation using
Large Language Models
- Authors: Bangguo Yu, Hamidreza Kasaei, Ming Cao
- Abstract summary: Co-NavGPT is an innovative framework that integrates Large Language Models as a global planner for multi-robot cooperative visual target navigation.
It encodes the explored environment data into prompts, enhancing LLMs' scene comprehension.
It then assigns exploration frontiers to each robot for efficient target search.
- Score: 10.312968200748118
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In advanced human-robot interaction tasks, visual target navigation is
crucial for autonomous robots navigating unknown environments. While numerous
approaches have been developed in the past, most are designed for single-robot
operations, which often suffer from reduced efficiency and robustness due to
environmental complexities. Furthermore, learning policies for multi-robot
collaboration are resource-intensive. To address these challenges, we propose
Co-NavGPT, an innovative framework that integrates Large Language Models (LLMs)
as a global planner for multi-robot cooperative visual target navigation.
Co-NavGPT encodes the explored environment data into prompts, enhancing LLMs'
scene comprehension. It then assigns exploration frontiers to each robot for
efficient target search. Experimental results on Habitat-Matterport 3D (HM3D)
show that Co-NavGPT surpasses existing models in success rate and
efficiency without any learning process, demonstrating the vast potential of
LLMs in multi-robot collaboration domains. The supplementary video, prompts,
and code can be accessed via the following link:
https://sites.google.com/view/co-navgpt
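As a rough illustration of the global-planning loop described in the abstract, the sketch below encodes robot positions, frontier centroids, and observed semantics into a text prompt and asks an LLM to assign one frontier per robot. This is a minimal sketch, not the authors' implementation: the prompt wording, the JSON reply format, the `build_prompt`/`assign_frontiers` names, and the nearest-frontier fallback are all assumptions; only the encode-map-into-prompt and assign-frontiers flow comes from the abstract.

```python
import json
from typing import Callable, Dict, List, Tuple


def build_prompt(target: str,
                 robot_positions: Dict[int, Tuple[float, float]],
                 frontiers: List[Tuple[float, float]],
                 seen_objects: List[str]) -> str:
    """Encode the explored environment into a text prompt (illustrative format)."""
    lines = [
        f"Task: find a '{target}' in a partially explored indoor scene.",
        "Robots (id: x, y in meters): "
        + "; ".join(f"{rid}: ({x:.1f}, {y:.1f})" for rid, (x, y) in robot_positions.items()),
        "Frontier centroids (index: x, y): "
        + "; ".join(f"{i}: ({x:.1f}, {y:.1f})" for i, (x, y) in enumerate(frontiers)),
        "Objects observed so far: " + ", ".join(seen_objects),
        "Assign exactly one frontier index to each robot so the target is found quickly.",
        'Reply with JSON only, e.g. {"0": 2, "1": 0}.',
    ]
    return "\n".join(lines)


def assign_frontiers(llm: Callable[[str], str],
                     target: str,
                     robot_positions: Dict[int, Tuple[float, float]],
                     frontiers: List[Tuple[float, float]],
                     seen_objects: List[str]) -> Dict[int, int]:
    """Query the LLM once and parse its robot -> frontier assignment."""
    reply = llm(build_prompt(target, robot_positions, frontiers, seen_objects))
    try:
        assignment = {int(rid): int(idx) for rid, idx in json.loads(reply).items()}
    except Exception:
        # Fallback if the reply is malformed: send each robot to its nearest frontier.
        assignment = {
            rid: min(range(len(frontiers)),
                     key=lambda i: (frontiers[i][0] - x) ** 2 + (frontiers[i][1] - y) ** 2)
            for rid, (x, y) in robot_positions.items()
        }
    return assignment
```

In a full system, each robot's local planner would then drive toward its assigned frontier, and the prompt would presumably be rebuilt and the LLM re-queried as the merged map grows.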
Related papers
- LPAC: Learnable Perception-Action-Communication Loops with Applications
to Coverage Control [80.86089324742024]
We propose a learnable Perception-Action-Communication (LPAC) architecture for the problem.
A convolutional neural network (CNN) processes localized perception, while a graph neural network (GNN) facilitates communication between robots.
Evaluations show that the LPAC models outperform standard decentralized and centralized coverage control algorithms.
arXiv Detail & Related papers (2024-01-10T00:08:00Z)
- From Simulations to Reality: Enhancing Multi-Robot Exploration for Urban Search and Rescue [46.377510400989536]
We present a novel hybrid algorithm for efficient multi-robot exploration in unknown environments with limited communication and no global positioning information.
We redefine the local best and global best positions to suit scenarios without continuous target information.
The presented work holds promise for enhancing multi-robot exploration in scenarios with limited information and communication capabilities.
arXiv Detail & Related papers (2023-11-28T17:05:25Z)
- NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration.
We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments.
Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
arXiv Detail & Related papers (2023-10-11T21:07:14Z)
- SayNav: Grounding Large Language Models for Dynamic Planning to Navigation in New Environments [14.179677726976056]
SayNav is a new approach that leverages human knowledge from Large Language Models (LLMs) for efficient generalization to complex navigation tasks.
SayNav achieves state-of-the-art results and even outperforms an oracle-based baseline with strong ground-truth assumptions by more than 8% in success rate.
arXiv Detail & Related papers (2023-09-08T02:24:37Z)
- Target Search and Navigation in Heterogeneous Robot Systems with Deep Reinforcement Learning [3.3167319223959373]
We design a heterogeneous robot system consisting of a UAV and a UGV for search and rescue missions in unknown environments.
The system is able to search for targets and navigate to them in a maze-like mine environment with the policies learned through deep reinforcement learning algorithms.
arXiv Detail & Related papers (2023-08-01T07:09:14Z)
- Learning Hierarchical Interactive Multi-Object Search for Mobile Manipulation [10.21450780640562]
We introduce a novel interactive multi-object search task in which a robot has to open doors to navigate rooms and search inside cabinets and drawers to find target objects.
These new challenges require combining manipulation and navigation skills in unexplored environments.
We present HIMOS, a hierarchical reinforcement learning approach that learns to compose exploration, navigation, and manipulation skills.
arXiv Detail & Related papers (2023-07-12T12:25:33Z)
- GNM: A General Navigation Model to Drive Any Robot [67.40225397212717]
A general goal-conditioned model for vision-based navigation can be trained on data obtained from many distinct but structurally similar robots.
We analyze the necessary design decisions for effective data sharing across robots.
We deploy the trained GNM on a range of new robots, including an underactuated quadrotor.
arXiv Detail & Related papers (2022-10-07T07:26:41Z)
- LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action [76.71101507291473]
We present a system, LM-Nav, for robotic navigation that enjoys the benefits of training on unannotated large datasets of trajectories.
We show that such a system can be constructed entirely out of pre-trained models for navigation (ViNG), image-language association (CLIP), and language modeling (GPT-3), without requiring any fine-tuning or language-annotated robot data.
arXiv Detail & Related papers (2022-07-10T10:41:50Z)
- Socially Compliant Navigation Dataset (SCAND): A Large-Scale Dataset of Demonstrations for Social Navigation [92.66286342108934]
Social navigation is the capability of an autonomous agent, such as a robot, to navigate in a 'socially compliant' manner in the presence of other intelligent agents such as humans.
Our dataset contains 8.7 hours, 138 trajectories, and 25 miles of socially compliant, human-teleoperated driving demonstrations.
arXiv Detail & Related papers (2022-03-28T19:09:11Z)
- Decentralized Global Connectivity Maintenance for Multi-Robot Navigation: A Reinforcement Learning Approach [12.649986200029717]
This work investigates how to navigate a multi-robot team in unknown environments while maintaining connectivity.
We propose a reinforcement learning approach to develop a decentralized policy, which is shared among multiple robots.
We validate the effectiveness of the proposed approach by comparing different combinations of connectivity constraints and behavior cloning.
arXiv Detail & Related papers (2021-09-17T13:20:19Z)
- Collaborative Visual Navigation [69.20264563368762]
We propose a large-scale 3D dataset, CollaVN, for multi-agent visual navigation (MAVN).
Diverse MAVN variants are explored to make our problem more general.
A memory-augmented communication framework is proposed. Each agent is equipped with a private, external memory to persistently store communication information.
arXiv Detail & Related papers (2021-07-02T15:48:16Z)
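The memory-augmented communication idea described for CollaVN above can be illustrated with a small sketch: each agent keeps a private external buffer of received messages and attends over it when acting. This is a toy illustration under assumed dimensions and a ring-buffer write rule, not the CollaVN authors' architecture.

```python
import torch
import torch.nn as nn


class MemoryAugmentedAgent(nn.Module):
    """Toy agent with a private external memory of received messages (illustrative)."""

    def __init__(self, obs_dim: int = 128, msg_dim: int = 32, mem_slots: int = 16):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, msg_dim)      # observation -> message/query
        self.policy = nn.Linear(obs_dim + msg_dim, 4)   # features + memory readout -> action logits
        self.register_buffer("memory", torch.zeros(mem_slots, msg_dim))  # private, persistent store
        self.ptr = 0

    def receive(self, message: torch.Tensor) -> None:
        """Write a teammate's message into the ring-buffer memory."""
        self.memory[self.ptr % self.memory.size(0)] = message.detach()
        self.ptr += 1

    def act(self, obs: torch.Tensor) -> torch.Tensor:
        """Attend over stored messages and produce action logits."""
        query = self.encoder(obs)                        # (msg_dim,)
        scores = self.memory @ query                     # (mem_slots,)
        readout = torch.softmax(scores, dim=0) @ self.memory
        return self.policy(torch.cat([obs, readout], dim=-1))


# Example (assumed dimensions): agent.receive(torch.randn(32)); logits = agent.act(torch.randn(128))
```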