ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
- URL: http://arxiv.org/abs/2209.11302v1
- Date: Thu, 22 Sep 2022 20:29:49 GMT
- Title: ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
- Authors: Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu,
Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg
- Abstract summary: Large language models (LLMs) can be used to score potential next actions during task planning.
We present a programmatic LLM prompt structure that enables plan generation to function across situated environments.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Task planning can require defining myriad domain knowledge about the world in
which a robot needs to act. To ameliorate that effort, large language models
(LLMs) can be used to score potential next actions during task planning, and
even generate action sequences directly, given an instruction in natural
language with no additional domain information. However, such methods either
require enumerating all possible next steps for scoring, or generate free-form
text that may contain actions not possible on a given robot in its current
context. We present a programmatic LLM prompt structure that enables plan
generation to function across situated environments, robot capabilities, and
tasks. Our key insight is to prompt the LLM with program-like specifications of
the available actions and objects in an environment, as well as with example
programs that can be executed. We make concrete recommendations about prompt
structure and generation constraints through ablation experiments, demonstrate
state-of-the-art success rates in VirtualHome household tasks, and deploy our
method on a physical robot arm for tabletop tasks. Website at
progprompt.github.io
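
As an illustration of the prompt structure described in the abstract, the sketch below assembles a ProgPrompt-style Pythonic prompt. The action names, object list, and tasks are illustrative stand-ins rather than the paper's actual prompts; the structure is what matters: an import line enumerating the available actions, a list of scene objects, a worked example plan, and the header of the new task for the LLM to complete.

    # Minimal sketch of a ProgPrompt-style prompt assembler (illustrative names).
    ACTIONS = ["walk(obj)", "grab(obj)", "open(obj)", "close(obj)",
               "putin(obj, container)", "switchon(obj)"]
    OBJECTS = ["salmon", "microwave", "fridge", "plate", "sink"]

    EXAMPLE_PLAN = '''def put_salmon_in_microwave():
        # 1: fetch the salmon
        walk('fridge')
        open('fridge')
        grab('salmon')
        close('fridge')
        # 2: heat it
        walk('microwave')
        open('microwave')
        putin('salmon', 'microwave')
        close('microwave')
        switchon('microwave')
    '''

    def build_prompt(task_name: str) -> str:
        """Compose action spec + scene objects + worked example + task header."""
        return (
            f"from actions import {', '.join(ACTIONS)}\n"
            f"objects = {OBJECTS}\n\n"
            f"{EXAMPLE_PLAN}\n"
            f"def {task_name}():\n"  # the LLM generates the body of this function
        )

    print(build_prompt("wash_the_plate"))

The paper's prompts additionally interleave assertion-style precondition checks with recovery actions; that detail is omitted here for brevity.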
Related papers
- Autonomous Behavior Planning For Humanoid Loco-manipulation Through Grounded Language Model
  Large language models (LLMs) have demonstrated powerful planning and reasoning capabilities for comprehending and processing semantic information.
  We propose a novel language-model-based framework that enables robots to autonomously plan behaviors and low-level execution under given textual instructions.
  arXiv Detail & Related papers (2024-08-15T17:33:32Z)
- LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
  We introduce LLaRA: Large Language and Robotics Assistant, a framework that formulates robot action policy as visuo-textual conversations.
  First, we present an automated pipeline to generate conversation-style instruction-tuning data for robots from existing behavior cloning datasets.
  We show that a VLM finetuned on a limited amount of such data can produce meaningful action decisions for robotic control.
  arXiv Detail & Related papers (2024-06-28T17:59:12Z)
- Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs
  We demonstrate experimental results with LLMs that address robotics task planning problems.
  Our approach acquires text descriptions of the task and scene objects, then formulates task planning through natural language reasoning.
  Our approach is evaluated on a multi-modal prompt simulation benchmark.
  arXiv Detail & Related papers (2024-03-20T17:58:12Z)
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks
  Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open-vocabulary tasks.
  We present an interactive planning technique for partially observable tasks using LLMs.
  arXiv Detail & Related papers (2023-12-11T22:54:44Z)
- Interactive Task Planning with Language Models
  An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals and distinct tasks, even during execution.
  Recent large-language-model-based approaches allow for more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models.
  We propose a simple framework that achieves interactive task planning with language models by incorporating both high-level planning and low-level skill execution.
  arXiv Detail & Related papers (2023-10-16T17:59:12Z)
- AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers
  For effective human-robot interaction, robots need to understand, plan, and execute complex, long-horizon tasks.
  Recent advances in large language models have shown promise for translating natural language into robot action sequences.
  We show that our approach outperforms several methods using LLMs as planners in complex task domains.
  arXiv Detail & Related papers (2023-06-10T21:58:29Z)
- Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
  Instruct2Act is a framework that maps multi-modal instructions to sequential actions for robotic manipulation tasks.
  Our approach is adjustable and flexible in accommodating various instruction modalities and input types.
  Our zero-shot method outperformed many state-of-the-art learning-based policies in several tasks.
  arXiv Detail & Related papers (2023-05-18T17:59:49Z)
- Generating Executable Action Plans with Environmentally-Aware Language Models
  Large Language Models (LLMs) trained on massive text datasets have recently shown promise in generating action plans for robotic agents.
  We propose an approach to generate environmentally-aware action plans that agents are better able to execute.
  arXiv Detail & Related papers (2022-10-10T18:56:57Z)
- Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
  We investigate the possibility of grounding high-level tasks, expressed in natural language, to a chosen set of actionable steps.
  We find that if pre-trained LMs are large enough and prompted appropriately, they can effectively decompose high-level tasks into low-level plans.
  We propose a procedure that conditions on existing demonstrations and semantically translates the plans to admissible actions.
  arXiv Detail & Related papers (2022-01-18T18:59:45Z)
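
The admissible-action translation step in the last entry above contrasts with ProgPrompt's approach of constraining generation up front. Below is a minimal sketch of that translation idea, using token overlap as a hypothetical stand-in for the learned sentence-embedding similarity the paper uses:

    # Hypothetical sketch: map a free-form LLM step to the closest admissible action.
    def translate_to_admissible(step: str, admissible: list[str]) -> str:
        """Token-overlap similarity as a stand-in for embedding-based matching."""
        step_tokens = set(step.lower().split())

        def similarity(action: str) -> float:
            action_tokens = set(action.lower().split())
            union = step_tokens | action_tokens
            return len(step_tokens & action_tokens) / len(union) if union else 0.0

        return max(admissible, key=similarity)

    # translate_to_admissible("pick up the salmon", ["grab salmon", "open fridge"])
    # returns "grab salmon"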