Language Models as Zero-Shot Planners: Extracting Actionable Knowledge
for Embodied Agents
- URL: http://arxiv.org/abs/2201.07207v1
- Date: Tue, 18 Jan 2022 18:59:45 GMT
- Title: Language Models as Zero-Shot Planners: Extracting Actionable Knowledge
for Embodied Agents
- Authors: Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch
- Abstract summary: We investigate the possibility of grounding high-level tasks, expressed in natural language, to a chosen set of actionable steps.
We find that if pre-trained LMs are large enough and prompted appropriately, they can effectively decompose high-level tasks into low-level plans.
We propose a procedure that conditions on existing demonstrations and semantically translates the plans to admissible actions.
- Score: 111.33545170562337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Can world knowledge learned by large language models (LLMs) be used to act in
interactive environments? In this paper, we investigate the possibility of
grounding high-level tasks, expressed in natural language (e.g. "make
breakfast"), to a chosen set of actionable steps (e.g. "open fridge"). While
prior work focused on learning from explicit step-by-step examples of how to
act, we surprisingly find that if pre-trained LMs are large enough and prompted
appropriately, they can effectively decompose high-level tasks into low-level
plans without any further training. However, the plans produced naively by LLMs
often cannot map precisely to admissible actions. We propose a procedure that
conditions on existing demonstrations and semantically translates the plans to
admissible actions. Our evaluation in the recent VirtualHome environment shows
that the resulting method substantially improves executability over the LLM
baseline. The conducted human evaluation reveals a trade-off between
executability and correctness but shows a promising sign towards extracting
actionable knowledge from language models. Website at
https://huangwl18.github.io/language-planner
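The translation procedure described in the abstract maps each free-form step generated by the LM to the most similar action from a fixed admissible set. A minimal sketch of that idea is below; the paper uses learned sentence embeddings for the matching, so the bag-of-words vectors here are only a self-contained stand-in, and the action list is hypothetical (VirtualHome-style), not taken from the paper.

```python
import math
from collections import Counter

def bow_vector(text):
    """Toy bag-of-words 'embedding' (a stand-in for the learned
    sentence embeddings the paper uses)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def translate_step(step, admissible_actions):
    """Map a free-form LM-generated step to the closest admissible action."""
    v = bow_vector(step)
    return max(admissible_actions, key=lambda a: cosine(v, bow_vector(a)))

# Hypothetical admissible action set for illustration.
actions = ["walk to kitchen", "open fridge", "grab milk", "close fridge"]
print(translate_step("Go to the refrigerator and open it", actions))  # open fridge
```

Replacing the toy vectors with real sentence embeddings changes only `bow_vector`; the nearest-neighbor translation step stays the same.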
Related papers
- From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control [58.72492647570062]
We introduce our method -- Learnable Latent Codes as Bridges (LCB) -- as an alternate architecture to overcome limitations.
We find that our method outperforms baselines that use pure language as the interface layer on tasks that require reasoning and multi-step behaviors.
arXiv Detail & Related papers (2024-05-08T04:14:06Z)
- Grounding Language Plans in Demonstrations Through Counterfactual Perturbations [25.19071357445557]
Grounding the common-sense reasoning of Large Language Models (LLMs) in physical domains remains a pivotal yet unsolved problem for embodied AI.
We show our approach improves the interpretability and reactivity of imitation learning through 2D navigation and simulated and real robot manipulation tasks.
arXiv Detail & Related papers (2024-03-25T19:04:59Z)
- Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs [7.746160514029531]
We demonstrate experimental results with LLMs that address robotics task planning problems.
Our approach acquires text descriptions of the task and scene objects, then formulates task planning through natural language reasoning.
Our approach is evaluated on a multi-modal prompt simulation benchmark.
arXiv Detail & Related papers (2024-03-20T17:58:12Z)
- Few-Shot Cross-Lingual Transfer for Prompting Large Language Models in Low-Resource Languages [0.0]
In "prompting", a user provides a task description and some completed examples of the task to a PLM as context before asking it to perform the task on a new example.
We consider three methods: few-shot prompting (prompt), language-adaptive fine-tuning (LAFT), and neural machine translation (translate).
We find that the translate and prompt settings are compute- and cost-efficient approaches to few-shot prompting for the selected low-resource languages.
arXiv Detail & Related papers (2024-03-09T21:36:13Z)
- Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors [40.18762220245365]
Large language models (LLMs) are being applied as actors for sequential decision making tasks in domains such as robotics and games.
Previous work does little to explore what environment state information is provided to LLM actors via language.
We propose Brief Language INputs for DEcision-making Responses (BLINDER), a method for automatically selecting concise state descriptions.
arXiv Detail & Related papers (2023-07-21T22:02:50Z)
- Guiding Pretraining in Reinforcement Learning with Large Language Models [133.32146904055233]
We describe a method that uses background knowledge from text corpora to shape exploration.
This method, called ELLM, rewards an agent for achieving goals suggested by a language model.
By leveraging large-scale language model pretraining, ELLM guides agents toward human-meaningful and plausibly useful behaviors without requiring a human in the loop.
arXiv Detail & Related papers (2023-02-13T21:16:03Z)
- Translating Natural Language to Planning Goals with Large-Language Models [19.738395237639136]
Recent large language models (LLMs) have demonstrated remarkable performance on a variety of natural language processing (NLP) tasks.
Our central question is whether LLMs are able to translate goals specified in natural language to a structured planning language.
Our empirical results on GPT-3.5 variants show that LLMs are much better suited to translation than to planning.
arXiv Detail & Related papers (2023-02-10T09:17:52Z)
- ProgPrompt: Generating Situated Robot Task Plans using Large Language Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning.
We present a programmatic LLM prompt structure that enables plan generation functional across situated environments.
arXiv Detail & Related papers (2022-09-22T20:29:49Z)
- Inner Monologue: Embodied Reasoning through Planning with Language Models [81.07216635735571]
Large Language Models (LLMs) can be applied to domains beyond natural language processing.
LLMs planning in embodied environments need to consider not only which skills to use, but also how and when to use them.
We propose that by leveraging environment feedback, LLMs are able to form an inner monologue that allows them to more richly process and plan in robotic control scenarios.
arXiv Detail & Related papers (2022-07-12T15:20:48Z)
- Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings.
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z)
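The ProgPrompt entry above frames task planning as code completion: available robot skills are exposed to the LLM as program-like elements, and the plan is generated as a function body. A rough sketch of that prompt structure follows; the skill names and task are illustrative placeholders, not drawn from the paper.

```python
def build_prompt(skills, task):
    """Assemble a ProgPrompt-style programmatic prompt: available
    skills appear as import lines, and the task becomes a function
    header the LLM is asked to complete."""
    lines = [f"from actions import {name}" for name in skills]
    lines.append("")
    lines.append(f"def {task}():")
    lines.append("    # complete the plan using only the imported actions")
    return "\n".join(lines)

# Hypothetical skill set and task name for illustration.
skills = ["grab", "put_on", "open_door"]
prompt = build_prompt(skills, "set_the_table")
print(prompt.splitlines()[0])  # from actions import grab
```

Constraining generation to the imported skill names is what keeps the resulting plan executable in the target environment.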
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.