Related papers: Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs

Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs

URL: http://arxiv.org/abs/2403.13801v2
Date: Sat, 6 Apr 2024 04:12:47 GMT
Title: Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs
Authors: Yusuke Mikami, Andrew Melnik, Jun Miura, Ville Hautamäki,
Abstract summary: We demonstrate experimental results with LLMs that address robotics task planning problems. Our approach acquires text descriptions of the task and scene objects, then formulates task planning through natural language reasoning. Our approach is evaluated on a multi-modal prompt simulation benchmark.
Score: 7.746160514029531
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We demonstrate experimental results with LLMs that address robotics task planning problems. Recently, LLMs have been applied in robotics task planning, particularly using a code generation approach that converts complex high-level instructions into mid-level policy codes. In contrast, our approach acquires text descriptions of the task and scene objects, then formulates task planning through natural language reasoning, and outputs coordinate level control commands, thus reducing the necessity for intermediate representation code as policies with pre-defined APIs. Our approach is evaluated on a multi-modal prompt simulation benchmark, demonstrating that our prompt engineering experiments with natural language reasoning significantly enhance success rates compared to its absence. Furthermore, our approach illustrates the potential for natural language descriptions to transfer robotics skills from known tasks to previously unseen tasks. The project website: https://natural-language-as-policies.github.io/

Related papers

CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models [19.73329768987112]
CurricuLLM is a curriculum learning tool for complex robot control tasks. It generates subtasks that aid target task learning in natural language form. It also translates natural language description of subtasks into executable code. CurricuLLM can aid learning complex robot control tasks.
arXiv Detail & Related papers (2024-09-27T01:48:16Z)
From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control [58.72492647570062]
We introduce our method -- Learnable Latent Codes as Bridges (LCB) -- as an alternate architecture to overcome limitations. We find that methodoutperforms baselines that leverage pure language as the interface layer on tasks that require reasoning and multi-step behaviors.
arXiv Detail & Related papers (2024-05-08T04:14:06Z)
Empowering Large Language Models on Robotic Manipulation with Affordance Prompting [23.318449345424725]
Large language models fail to interact with the physical world by generating control sequences properly. Existing LLM-based approaches circumvent this problem by relying on additional pre-defined skills or pre-trained sub-policies. We propose a framework called LLM+A(ffordance) where the LLM serves as both the sub-task planner and the motion controller.
arXiv Detail & Related papers (2024-04-17T03:06:32Z)
Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks. We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors [40.18762220245365]
Large language models (LLMs) are being applied as actors for sequential decision making tasks in domains such as robotics and games. Previous work does little to explore what environment state information is provided to LLM actors via language. We propose Brief Language INputs for DEcision-making Responses (BLINDER), a method for automatically selecting concise state descriptions.
arXiv Detail & Related papers (2023-07-21T22:02:50Z)
AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers [20.857692296678632]
For effective human-robot interaction, robots need to understand, plan, and execute complex, long-horizon tasks. Recent advances in large language models have shown promise for translating natural language into robot action sequences. We show that our approach outperforms several methods using LLMs as planners in complex task domains.
arXiv Detail & Related papers (2023-06-10T21:58:29Z)
PADL: Language-Directed Physics-Based Character Control [66.517142635815]
We present PADL, which allows users to issue natural language commands for specifying high-level tasks and low-level skills that a character should perform. We show that our framework can be applied to effectively direct a simulated humanoid character to perform a diverse array of complex motor skills.
arXiv Detail & Related papers (2023-01-31T18:59:22Z)
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning. We present a programmatic LLM prompt structure that enables plan generation functional across situated environments.
arXiv Detail & Related papers (2022-09-22T20:29:49Z)
Neuro-Symbolic Causal Language Planning with Commonsense Prompting [67.06667162430118]
Language planning aims to implement complex high-level goals by decomposition into simpler low-level steps. Previous methods require either manual exemplars or annotated programs to acquire such ability from large language models. This paper proposes Neuro-Symbolic Causal Language Planner (CLAP) that elicits procedural knowledge from the LLMs with commonsense-infused prompting.
arXiv Detail & Related papers (2022-06-06T22:09:52Z)
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents [111.33545170562337]
We investigate the possibility of grounding high-level tasks, expressed in natural language, to a chosen set of actionable steps. We find that if pre-trained LMs are large enough and prompted appropriately, they can effectively decompose high-level tasks into low-level plans. We propose a procedure that conditions on existing demonstrations and semantically translates the plans to admissible actions.
arXiv Detail & Related papers (2022-01-18T18:59:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.