Related papers: AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers

AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers

URL: http://arxiv.org/abs/2306.06531v3
Date: Fri, 22 Mar 2024 00:21:04 GMT
Title: AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers
Authors: Yongchao Chen, Jacob Arkin, Charles Dawson, Yang Zhang, Nicholas Roy, Chuchu Fan,
Abstract summary: For effective human-robot interaction, robots need to understand, plan, and execute complex, long-horizon tasks. Recent advances in large language models have shown promise for translating natural language into robot action sequences. We show that our approach outperforms several methods using LLMs as planners in complex task domains.
Score: 20.857692296678632
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: For effective human-robot interaction, robots need to understand, plan, and execute complex, long-horizon tasks described by natural language. Recent advances in large language models (LLMs) have shown promise for translating natural language into robot action sequences for complex tasks. However, existing approaches either translate the natural language directly into robot trajectories or factor the inference process by decomposing language into task sub-goals and relying on a motion planner to execute each sub-goal. When complex environmental and temporal constraints are involved, inference over planning tasks must be performed jointly with motion plans using traditional task-and-motion planning (TAMP) algorithms, making factorization into subgoals untenable. Rather than using LLMs to directly plan task sub-goals, we instead perform few-shot translation from natural language task descriptions to an intermediate task representation that can then be consumed by a TAMP algorithm to jointly solve the task and motion plan. To improve translation, we automatically detect and correct both syntactic and semantic errors via autoregressive re-prompting, resulting in significant improvements in task completion. We show that our approach outperforms several methods using LLMs as planners in complex task domains. See our project website https://yongchao98.github.io/MIT-REALM-AutoTAMP/ for prompts, videos, and code.

Related papers

LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language [17.914580097058106]
Bimanual robotic manipulation presents an inherent challenge due to the complexity involved in the spatial and temporal coordination between two hands. Existing works predominantly focus on attaining human-level manipulation skills for robotic hands, yet little attention has been paid to task planning on long-horizon timescales. This paper introduces LLM+MAP, a bimanual planning framework that integrates LLM reasoning and multi-agent planning.
arXiv Detail & Related papers (2025-03-21T17:04:01Z)
COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models [49.24666980374751]
COHERENT is a novel LLM-based task planning framework for collaboration of heterogeneous multi-robot systems. A Proposal-Execution-Feedback-Adjustment mechanism is designed to decompose and assign actions for individual robots. The experimental results show that our work surpasses the previous methods by a large margin in terms of success rate and execution efficiency.
arXiv Detail & Related papers (2024-09-23T15:53:41Z)
Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation [49.43094200366251]
We propose a novel approach for few-shot adaptation to unseen tasks that exploits the semantic understanding of task decomposition. Our method, Policy Adaptation via Language Optimization (PALO), combines a handful of demonstrations of a task with proposed language decompositions. We find that PALO is able of consistently complete long-horizon, multi-tier tasks in the real world, outperforming state of the art pre-trained generalist policies.
arXiv Detail & Related papers (2024-08-29T03:03:35Z)
Scaling Up Natural Language Understanding for Multi-Robots Through the Lens of Hierarchy [8.180994118420053]
Long-horizon planning is hindered by challenges such as uncertainty accumulation, computational complexity, delayed rewards and incomplete information. This work proposes an approach to exploit the task hierarchy from human instructions to facilitate multi-robot planning.
arXiv Detail & Related papers (2024-08-15T14:46:13Z)
Empowering Large Language Models on Robotic Manipulation with Affordance Prompting [23.318449345424725]
Large language models fail to interact with the physical world by generating control sequences properly. Existing LLM-based approaches circumvent this problem by relying on additional pre-defined skills or pre-trained sub-policies. We propose a framework called LLM+A(ffordance) where the LLM serves as both the sub-task planner and the motion controller.
arXiv Detail & Related papers (2024-04-17T03:06:32Z)
Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs [7.746160514029531]
We demonstrate experimental results with LLMs that address robotics task planning problems. Our approach acquires text descriptions of the task and scene objects, then formulates task planning through natural language reasoning. Our approach is evaluated on a multi-modal prompt simulation benchmark.
arXiv Detail & Related papers (2024-03-20T17:58:12Z)
Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks. We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
Interactive Task Planning with Language Models [89.5839216871244]
An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals and distinct tasks, even during execution. Recent large language model based approaches can allow for more open-ended planning but often require heavy prompt engineering or domain specific pretrained models. We propose a simple framework that achieves interactive task planning with language models by incorporating both high-level planning and low-level skill execution.
arXiv Detail & Related papers (2023-10-16T17:59:12Z)
Plan, Eliminate, and Track -- Language Models are Good Teachers for Embodied Agents [99.17668730578586]
Pre-trained large language models (LLMs) capture procedural knowledge about the world. Plan, Eliminate, and Track (PET) framework translates a task description into a list of high-level sub-tasks. PET framework leads to a significant 15% improvement over SOTA for generalization to human goal specifications.
arXiv Detail & Related papers (2023-05-03T20:11:22Z)
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning. We present a programmatic LLM prompt structure that enables plan generation functional across situated environments.
arXiv Detail & Related papers (2022-09-22T20:29:49Z)
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents [111.33545170562337]
We investigate the possibility of grounding high-level tasks, expressed in natural language, to a chosen set of actionable steps. We find that if pre-trained LMs are large enough and prompted appropriately, they can effectively decompose high-level tasks into low-level plans. We propose a procedure that conditions on existing demonstrations and semantically translates the plans to admissible actions.
arXiv Detail & Related papers (2022-01-18T18:59:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.