Generating Executable Action Plans with Environmentally-Aware Language
Models
- URL: http://arxiv.org/abs/2210.04964v2
- Date: Tue, 2 May 2023 04:58:54 GMT
- Title: Generating Executable Action Plans with Environmentally-Aware Language Models
- Authors: Maitrey Gramopadhye, Daniel Szafir
- Abstract summary: Large Language Models (LLMs) trained using massive text datasets have recently shown promise in generating action plans for robotic agents.
We propose an approach to generate environmentally-aware action plans that agents are better able to execute.
- Score: 4.162663632560141
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) trained using massive text datasets have
recently shown promise in generating action plans for robotic agents from high
level text queries. However, these models typically do not consider the robot's
environment, resulting in generated plans that may not actually be executable,
due to ambiguities in the planned actions or environmental constraints. In this
paper, we propose an approach to generate environmentally-aware action plans
that agents are better able to execute. Our approach involves integrating
environmental objects and object relations as additional inputs into LLM action
plan generation to provide the system with an awareness of its surroundings,
resulting in plans where each generated action is mapped to objects present in
the scene. We also design a novel scoring function that, along with generating
the action steps and associating them with objects, helps the system
disambiguate among object instances and take into account their states. We
evaluated our approach using the VirtualHome simulator and the ActivityPrograms
knowledge base and found that action plans generated from our system had a 310%
improvement in executability and a 147% improvement in correctness over prior
work. The complete code and a demo of our method are publicly available at
https://github.com/hri-ironlab/scene_aware_language_planner.
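The idea in the abstract (grounding each generated action in objects that are actually present, and choosing among object instances by their states) can be illustrated with a minimal sketch. This is not the authors' scoring function, which the abstract does not specify: the `SceneObject` fields, the `BLOCKING_STATES` table, the distance penalty, and the example language-model scores are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    """One object instance in the scene (all fields are hypothetical)."""
    name: str                                   # object class, e.g. "fridge"
    instance_id: int                            # tells instances of a class apart
    states: set = field(default_factory=set)    # e.g. {"closed"}
    distance: float = 0.0                       # assumed distance from the agent


# Assumed rule: an action is skipped if the object is already in one of these states.
BLOCKING_STATES = {
    "open": {"open"},
    "close": {"closed"},
    "switch_on": {"on"},
}


def score_candidate(lm_score, action, obj_name, scene):
    """Return (score, instance) pairs for every instance the action can apply to."""
    results = []
    for inst in scene:
        if inst.name != obj_name:
            continue  # the named object class must exist in the scene
        if inst.states & BLOCKING_STATES.get(action, set()):
            continue  # the instance's current state rules the action out
        # Assumed heuristic: start from the LM score and prefer nearer instances.
        results.append((lm_score - 0.1 * inst.distance, inst))
    return results


def pick_next_step(candidates, scene):
    """Pick the best grounded (score, action, instance) triple, or None."""
    best = None
    for lm_score, action, obj_name in candidates:
        for score, inst in score_candidate(lm_score, action, obj_name, scene):
            if best is None or score > best[0]:
                best = (score, action, inst)
    return best


# Hypothetical scene: two fridges (one already open) and a carton of milk.
scene = [
    SceneObject("fridge", 1, {"closed"}, distance=2.0),
    SceneObject("fridge", 2, {"open"}, distance=0.5),
    SceneObject("milk", 3, set(), distance=2.2),
]

# Hypothetical (log-probability, action, object) proposals from a language model.
candidates = [
    (-0.2, "open", "fridge"),
    (-0.1, "grab", "unicorn"),   # highest LM score, but no such object in the scene
    (-0.9, "grab", "milk"),
]

best = pick_next_step(candidates, scene)
print(best)
```

In this sketch the proposal with the highest language-model score is discarded because its object is absent from the scene, and the already-open fridge is skipped in favor of the closed one, which mirrors the kind of executability and instance disambiguation the abstract describes.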
Related papers
- DynaSaur: Large Language Agents Beyond Predefined Actions [108.75187263724838]
Existing LLM agent systems typically select actions from a fixed and predefined set at every step.
We propose an LLM agent framework that enables the dynamic creation and composition of actions in an online manner.
Our experiments on the GAIA benchmark demonstrate that this framework offers significantly greater flexibility and outperforms previous methods.
Date: 2024-11-04T02:08:59Z
- Language Models can Infer Action Semantics for Symbolic Planners from Environment Feedback [26.03718733867297]
We propose Predicting Semantics of Actions with Language Models (PSALM).
PSALM learns action semantics by leveraging the strengths of both symbolic planners and Large Language Models (LLMs).
Experiments show PSALM boosts plan success rate from 36.4% (on Claude-3.5) to 100%, and explores the environment more efficiently than prior work to infer ground truth domain action semantics.
Date: 2024-06-04T21:29:56Z
- PDDLEGO: Iterative Planning in Textual Environments [56.12148805913657]
Planning in textual environments has been shown to be a long-standing challenge even for current models.
We propose PDDLEGO, which iteratively constructs a planning representation that can lead to a partial plan for a given sub-goal.
We show that plans produced by few-shot PDDLEGO are 43% more efficient than generating plans end-to-end on the Coin Collector simulation.
Date: 2024-05-30T08:01:20Z
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
Date: 2023-12-11T22:54:44Z
- Embodied Task Planning with Large Language Models [86.63533340293361]
We propose a TAsk Planning Agent (TaPA) in embodied tasks for grounded planning with physical scene constraints.
During inference, we discover the objects in the scene by extending open-vocabulary object detectors to multi-view RGB images collected in different achievable locations.
Experimental results show that the generated plan from our TaPA framework can achieve a higher success rate than LLaVA and GPT-3.5 by a sizable margin.
Date: 2023-07-04T17:58:25Z
- ProgPrompt: Generating Situated Robot Task Plans using Large Language Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning.
We present a programmatic LLM prompt structure that enables plan generation functional across situated environments.
Date: 2022-09-22T20:29:49Z
- Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents [111.33545170562337]
We investigate the possibility of grounding high-level tasks, expressed in natural language, to a chosen set of actionable steps.
We find that if pre-trained LMs are large enough and prompted appropriately, they can effectively decompose high-level tasks into low-level plans.
We propose a procedure that conditions on existing demonstrations and semantically translates the plans to admissible actions.
Date: 2022-01-18T18:59:45Z
This list is automatically generated from the titles and abstracts of the papers in this site.