RoCo: Dialectic Multi-Robot Collaboration with Large Language Models
- URL: http://arxiv.org/abs/2307.04738v1
- Date: Mon, 10 Jul 2023 17:52:01 GMT
- Title: RoCo: Dialectic Multi-Robot Collaboration with Large Language Models
- Authors: Zhao Mandi, Shreeya Jain, Shuran Song
- Abstract summary: We propose a novel approach to multi-robot collaboration that harnesses the power of pre-trained large language models (LLMs).
We show that RoCo easily incorporates a human in the loop, allowing a user to communicate and collaborate with a robot agent to complete tasks together.
- Score: 13.260289557301688
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel approach to multi-robot collaboration that harnesses the power of pre-trained large language models (LLMs) for both high-level communication and low-level path planning. Robots are equipped with LLMs to discuss and collectively reason about task strategies. They then generate sub-task plans and task-space waypoint paths, which are used by a multi-arm motion planner to accelerate trajectory planning. We also provide feedback from the environment, such as collision checking, and prompt the LLM agents to improve their plan and waypoints in-context. For evaluation, we introduce RoCoBench, a 6-task benchmark covering a wide range of multi-robot collaboration scenarios, accompanied by a text-only dataset for agent representation and reasoning. We experimentally demonstrate the effectiveness of our approach -- it achieves high success rates across all tasks in RoCoBench and adapts to variations in task semantics. Our dialog setup offers high interpretability and flexibility -- in real-world experiments, we show that RoCo easily incorporates a human in the loop, allowing a user to communicate and collaborate with a robot agent to complete tasks together. See the project website https://project-roco.github.io for videos and code.
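The abstract outlines a concrete loop: LLM agents converse to agree on a strategy, emit per-robot task-space waypoints, the environment checks them (e.g. for collisions), and failures are fed back in-context for revision before the multi-arm motion planner takes over. Below is a minimal sketch of that loop, assuming hypothetical names (collaborate, parse_waypoints, check_collisions) and a generic LLM chat callable; it illustrates the pipeline the abstract describes, not RoCo's actual code (see the project website for that).

```python
# Minimal sketch of the dialog -> plan -> feedback loop from the abstract.
# All names and the prompt format are hypothetical stand-ins, not RoCo's API.
from typing import Callable, Dict, List, Tuple

Waypoint = Tuple[float, float, float]  # a task-space (x, y, z) point

def collaborate(
    agents: Dict[str, Callable[[str], str]],  # robot name -> LLM chat function
    task_prompt: str,
    parse_waypoints: Callable[[str], Dict[str, List[Waypoint]]],
    check_collisions: Callable[[Dict[str, List[Waypoint]]], List[str]],
    max_rounds: int = 5,
) -> Dict[str, List[Waypoint]]:
    """Alternate LLM dialog with environment feedback until the plan is valid."""
    dialog = task_prompt
    for _ in range(max_rounds):
        # Each robot agent reads the shared dialog and appends its response.
        for name, llm in agents.items():
            reply = llm(dialog)
            dialog += f"\n[{name}]: {reply}"
        # Extract per-robot task-space waypoint paths from the latest proposal.
        paths = parse_waypoints(dialog)
        # Environment feedback, e.g. collision checking between the arms.
        errors = check_collisions(paths)
        if not errors:
            return paths  # hand off to the multi-arm motion planner
        # Feed failures back in-context so agents revise plan and waypoints.
        dialog += "\n[env feedback]: " + "; ".join(errors)
    raise RuntimeError("no collision-free plan found within round budget")
```

Keeping the entire exchange in one growing dialog string is one simple way to realize the in-context improvement the abstract describes: every revision round sees the full history of proposals and environment feedback.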
Related papers
- COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models [49.24666980374751]
COHERENT is a novel LLM-based task planning framework for the collaboration of heterogeneous multi-robot systems.
A Proposal-Execution-Feedback-Adjustment mechanism is designed to decompose and assign actions for individual robots.
Experimental results show that COHERENT surpasses previous methods by a large margin in both success rate and execution efficiency.
arXiv Detail & Related papers (2024-09-23T15:53:41Z) - ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning [74.58666091522198]
We present a framework for intuitive robot programming by non-experts.
We leverage natural language prompts and contextual information from the Robot Operating System (ROS).
Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface.
arXiv Detail & Related papers (2024-06-28T08:28:38Z) - Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open-vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z) - Unified Human-Scene Interaction via Prompted Chain-of-Contacts [61.87652569413429]
Human-Scene Interaction (HSI) is a vital component of fields like embodied AI and virtual reality.
This paper presents a unified HSI framework, UniHSI, which supports unified control of diverse interactions through language commands.
arXiv Detail & Related papers (2023-09-14T17:59:49Z) - Embodied Task Planning with Large Language Models [86.63533340293361]
We propose a TAsk Planning Agent (TaPA) for grounded planning in embodied tasks under physical scene constraints.
During inference, we discover the objects in the scene by extending open-vocabulary object detectors to multi-view RGB images collected at different reachable locations.
Experimental results show that the plans generated by our TaPA framework achieve a higher success rate than LLaVA and GPT-3.5 by a sizable margin.
arXiv Detail & Related papers (2023-07-04T17:58:25Z) - AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers [20.857692296678632]
For effective human-robot interaction, robots need to understand, plan, and execute complex, long-horizon tasks.
Recent advances in large language models have shown promise for translating natural language into robot action sequences.
We show that our approach outperforms several methods using LLMs as planners in complex task domains.
arXiv Detail & Related papers (2023-06-10T21:58:29Z) - Chat with the Environment: Interactive Multimodal Perception Using Large Language Models [19.623070762485494]
Large Language Models (LLMs) have shown remarkable reasoning ability in few-shot robotic planning.
Our study demonstrates that LLMs can provide high-level planning and reasoning skills and control interactive robot behavior in a multimodal environment.
arXiv Detail & Related papers (2023-03-14T23:01:27Z) - ProgPrompt: Generating Situated Robot Task Plans using Large Language Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning.
We present a programmatic LLM prompt structure that enables plan generation to function across situated environments.
arXiv Detail & Related papers (2022-09-22T20:29:49Z)