ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
- URL: http://arxiv.org/abs/2209.11302v1
- Date: Thu, 22 Sep 2022 20:29:49 GMT
- Title: ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
- Authors: Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu,
Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg
- Abstract summary: Large language models (LLMs) can be used to score potential next actions during task planning.
We present a programmatic LLM prompt structure that enables plan generation to function across situated environments.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Task planning can require defining myriad domain knowledge about the world in
which a robot needs to act. To ameliorate that effort, large language models
(LLMs) can be used to score potential next actions during task planning, and
even generate action sequences directly, given an instruction in natural
language with no additional domain information. However, such methods either
require enumerating all possible next steps for scoring, or generate free-form
text that may contain actions not possible on a given robot in its current
context. We present a programmatic LLM prompt structure that enables plan
generation to function across situated environments, robot capabilities, and
tasks. Our key insight is to prompt the LLM with program-like specifications of
the available actions and objects in an environment, as well as with example
programs that can be executed. We make concrete recommendations about prompt
structure and generation constraints through ablation experiments, demonstrate
state-of-the-art success rates in VirtualHome household tasks, and deploy our
method on a physical robot arm for tabletop tasks. Website at
progprompt.github.io
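
As an illustration of the prompt structure described in the abstract, the sketch below assembles a ProgPrompt-style Pythonic prompt. The action names, object list, and tasks are illustrative stand-ins rather than the paper's actual prompts; the structure is what matters: an import line enumerating the available actions, a list of scene objects, a worked example plan, and the header of the new task for the LLM to complete.

    # Minimal sketch of a ProgPrompt-style prompt assembler (illustrative names).
    ACTIONS = ["walk(obj)", "grab(obj)", "open(obj)", "close(obj)",
               "putin(obj, container)", "switchon(obj)"]
    OBJECTS = ["salmon", "microwave", "fridge", "plate", "sink"]

    EXAMPLE_PLAN = '''def put_salmon_in_microwave():
        # 1: fetch the salmon
        walk('fridge')
        open('fridge')
        grab('salmon')
        close('fridge')
        # 2: heat it
        walk('microwave')
        open('microwave')
        putin('salmon', 'microwave')
        close('microwave')
        switchon('microwave')
    '''

    def build_prompt(task_name: str) -> str:
        """Compose action spec + scene objects + worked example + task header."""
        return (
            f"from actions import {', '.join(ACTIONS)}\n"
            f"objects = {OBJECTS}\n\n"
            f"{EXAMPLE_PLAN}\n"
            f"def {task_name}():\n"  # the LLM generates the body of this function
        )

    print(build_prompt("wash_the_plate"))

The paper's prompts additionally interleave assertion-style precondition checks with recovery actions; that detail is omitted here for brevity.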
Related papers
- Autonomous Behavior Planning For Humanoid Loco-manipulation Through Grounded Language Model
  Large language models (LLMs) have demonstrated powerful planning and reasoning capabilities for comprehending and processing semantic information.
  We propose a novel language-model-based framework that enables robots to autonomously plan behaviors and low-level execution under given textual instructions.
  arXiv Detail & Related papers (2024-08-15T17:33:32Z)
- LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
  We introduce LLaRA: Large Language and Robotics Assistant, a framework that formulates robot action policy as visuo-textual conversations.
  First, we present an automated pipeline to generate conversation-style instruction-tuning data for robots from existing behavior cloning datasets.
  We show that a VLM finetuned on a limited amount of such data can produce meaningful action decisions for robotic control.
  arXiv Detail & Related papers (2024-06-28T17:59:12Z)
- Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs
  We demonstrate experimental results with LLMs that address robotics task planning problems.
  Our approach acquires text descriptions of the task and scene objects, then formulates task planning through natural language reasoning.
  Our approach is evaluated on a multi-modal prompt simulation benchmark.
  arXiv Detail & Related papers (2024-03-20T17:58:12Z)
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks
  Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open-vocabulary tasks.
  We present an interactive planning technique for partially observable tasks using LLMs.
  arXiv Detail & Related papers (2023-12-11T22:54:44Z)
- Interactive Task Planning with Language Models
  An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals and distinct tasks, even during execution.
  Recent large-language-model-based approaches allow for more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models.
  We propose a simple framework that achieves interactive task planning with language models by incorporating both high-level planning and low-level skill execution.
  arXiv Detail & Related papers (2023-10-16T17:59:12Z)
- AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers
  For effective human-robot interaction, robots need to understand, plan, and execute complex, long-horizon tasks.
  Recent advances in large language models have shown promise for translating natural language into robot action sequences.
  We show that our approach outperforms several methods using LLMs as planners in complex task domains.
  arXiv Detail & Related papers (2023-06-10T21:58:29Z)
- Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
  Instruct2Act is a framework that maps multi-modal instructions to sequential actions for robotic manipulation tasks.
  Our approach is adjustable and flexible in accommodating various instruction modalities and input types.
  Our zero-shot method outperformed many state-of-the-art learning-based policies in several tasks.
  arXiv Detail & Related papers (2023-05-18T17:59:49Z)
- Generating Executable Action Plans with Environmentally-Aware Language Models
  Large Language Models (LLMs) trained on massive text datasets have recently shown promise in generating action plans for robotic agents.
  We propose an approach to generate environmentally-aware action plans that agents are better able to execute.
  arXiv Detail & Related papers (2022-10-10T18:56:57Z)
- Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
  We investigate the possibility of grounding high-level tasks, expressed in natural language, to a chosen set of actionable steps.
  We find that if pre-trained LMs are large enough and prompted appropriately, they can effectively decompose high-level tasks into low-level plans.
  We propose a procedure that conditions on existing demonstrations and semantically translates the plans to admissible actions.
  arXiv Detail & Related papers (2022-01-18T18:59:45Z)
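
The admissible-action translation step in the last entry above contrasts with ProgPrompt's approach of constraining generation up front. Below is a minimal sketch of that translation idea, using token overlap as a hypothetical stand-in for the learned sentence-embedding similarity the paper uses:

    # Hypothetical sketch: map a free-form LLM step to the closest admissible action.
    def translate_to_admissible(step: str, admissible: list[str]) -> str:
        """Token-overlap similarity as a stand-in for embedding-based matching."""
        step_tokens = set(step.lower().split())

        def similarity(action: str) -> float:
            action_tokens = set(action.lower().split())
            union = step_tokens | action_tokens
            return len(step_tokens & action_tokens) / len(union) if union else 0.0

        return max(admissible, key=similarity)

    # translate_to_admissible("pick up the salmon", ["grab salmon", "open fridge"])
    # returns "grab salmon"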