Generalizable Long-Horizon Manipulations with Large Language Models
- URL: http://arxiv.org/abs/2310.02264v1
- Date: Tue, 3 Oct 2023 17:59:46 GMT
- Title: Generalizable Long-Horizon Manipulations with Large Language Models
- Authors: Haoyu Zhou, Mingyu Ding, Weikun Peng, Masayoshi Tomizuka, Lin Shao,
Chuang Gan
- Abstract summary: This work introduces a framework harnessing the capabilities of Large Language Models (LLMs) to generate primitive task conditions for generalizable long-horizon manipulations.
We create a challenging robotic manipulation task suite based on PyBullet for long-horizon task evaluation.
- Score: 91.740084601715
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work introduces a framework harnessing the capabilities of Large
Language Models (LLMs) to generate primitive task conditions for generalizable
long-horizon manipulations with novel objects and unseen tasks. These task
conditions serve as guides for the generation and adjustment of Dynamic
Movement Primitives (DMP) trajectories for long-horizon task execution. We
further create a challenging robotic manipulation task suite based on PyBullet
for long-horizon task evaluation. Extensive experiments in both simulated and
real-world environments demonstrate the effectiveness of our framework on both
familiar tasks involving new objects and novel but related tasks, highlighting
the potential of LLMs in enhancing robotic system versatility and adaptability.
Project website: https://object814.github.io/Task-Condition-With-LLM/
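The pairing of LLM-generated task conditions with DMP trajectories is the paper's central mechanism. As a minimal sketch of how a condition could parameterize a trajectory, the following one-dimensional discrete DMP accepts its goal from an LLM-supplied value; the `DMP1D` class and the `llm_goal` variable are illustrative assumptions, not the paper's actual interface.

```python
import numpy as np

class DMP1D:
    """Minimal discrete Dynamic Movement Primitive in one dimension."""

    def __init__(self, n_basis=20, alpha=25.0, beta=6.25, alpha_x=3.0):
        self.alpha, self.beta, self.alpha_x = alpha, beta, alpha_x
        # Basis-function centers spaced along the canonical phase x in (0, 1].
        self.centers = np.exp(-alpha_x * np.linspace(0, 1, n_basis))
        self.widths = n_basis ** 1.5 / self.centers
        self.weights = np.zeros(n_basis)  # would normally be fit from a demonstration

    def rollout(self, y0, goal, tau=1.0, dt=0.01):
        y, v, x = y0, 0.0, 1.0
        traj = []
        for _ in range(int(tau / dt)):
            psi = np.exp(-self.widths * (x - self.centers) ** 2)
            # Forcing term shaped by the learned weights, scaled by (goal - y0).
            f = (psi @ self.weights) / (psi.sum() + 1e-10) * x * (goal - y0)
            dv = (self.alpha * (self.beta * (goal - y) - v) + f) / tau
            v += dv * dt
            y += v / tau * dt
            x += -self.alpha_x * x / tau * dt
            traj.append(y)
        return np.array(traj)

# Hypothetical task condition extracted by an LLM, e.g. "gripper above drawer handle".
llm_goal = 0.42  # illustrative target coordinate, not from the paper
dmp = DMP1D()
trajectory = dmp.rollout(y0=0.0, goal=llm_goal)
```

In the paper's setting the weights would be fit from demonstrations and the conditions would also gate when each primitive fires; this sketch shows only the goal-adjustment path.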
Related papers
- MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [52.739500459903724]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation.
We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents.
We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z)
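The MALMM abstract describes its agent decomposition only at a high level. A schematic of how high-level planning and low-level code generation might be split across two specialized LLM roles could look like the following; the prompts, the `query_llm` helper, and the primitive names are assumptions for illustration, not MALMM's actual design.

```python
def query_llm(system_prompt: str, user_msg: str) -> str:
    """Placeholder for a call to any chat-style LLM API."""
    raise NotImplementedError

def plan_and_act(task: str) -> str:
    # Agent 1: a high-level planner decomposes the task into subgoals.
    subgoals = query_llm(
        "You are a robot task planner. Output one subgoal per line.",
        f"Task: {task}",
    ).splitlines()
    # Agent 2: a coder turns each subgoal into low-level control code.
    snippets = [
        query_llm(
            "You are a robot control coder. Emit Python using move(), grasp(), release().",
            f"Subgoal: {g}",
        )
        for g in subgoals
    ]
    return "\n".join(snippets)
```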
- Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy [68.50785963043161]
GemBench is a novel benchmark to assess generalization capabilities of vision-language robotic manipulation policies.
We present 3D-LOTUS++, a framework that integrates 3D-LOTUS's motion planning capabilities with the task planning capabilities of LLMs.
3D-LOTUS++ achieves state-of-the-art performance on novel tasks of GemBench, setting a new standard for generalization in robotic manipulation.
arXiv Detail & Related papers (2024-10-02T09:02:34Z)
- LaMMA-P: Generalizable Multi-Agent Long-Horizon Task Allocation and Planning with LM-Driven PDDL Planner [9.044939946653002]
Language models (LMs) possess a strong capability to comprehend natural language, making them effective in translating human instructions into detailed plans for simple robot tasks.
We propose a Language Model-Driven Multi-Agent PDDL Planner (LaMMA-P), a novel multi-agent task planning framework.
LaMMA-P integrates the strengths of the LMs' reasoning capability and the traditional search planner to achieve a high success rate and efficiency.
arXiv Detail & Related papers (2024-09-30T17:58:18Z)
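LaMMA-P hands the LM's grounded interpretation of an instruction to a classical PDDL planner. A rough sketch of that hand-off point, serializing a hypothetical parsed instruction into a PDDL problem for an off-the-shelf solver (the domain name, predicates, and `parsed` structure are invented for this example):

```python
# The LM's role ends once the instruction is grounded into symbols; a classical
# planner (e.g. Fast Downward) then searches for the action sequence.
parsed = {  # hypothetical output of the LM's instruction grounding
    "objects": {"cup1": "item", "table1": "location", "sink1": "location"},
    "init": ["(at cup1 table1)"],
    "goal": ["(at cup1 sink1)"],
}

def to_pddl_problem(name: str, domain: str, p: dict) -> str:
    objs = " ".join(f"{o} - {t}" for o, t in p["objects"].items())
    return (
        f"(define (problem {name}) (:domain {domain})\n"
        f"  (:objects {objs})\n"
        f"  (:init {' '.join(p['init'])})\n"
        f"  (:goal (and {' '.join(p['goal'])})))"
    )

print(to_pddl_problem("tidy-up", "household", parsed))
```

The division of labor is the design point: the LM only grounds symbols, while completeness and efficiency come from the search planner.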
- ReLEP: A Novel Framework for Real-world Long-horizon Embodied Planning [7.668848364013772]
We present ReLEP, a framework for Real-world Long-horizon Embodied Planning.
At its core lies a fine-tuned large vision language model that formulates plans as sequences of skill functions.
ReLEP can accomplish a wide range of daily tasks and outperforms other state-of-the-art baseline methods.
arXiv Detail & Related papers (2024-09-24T01:47:23Z)
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
- TaskBench: Benchmarking Large Language Models for Task Automation [82.2932794189585]
We introduce TaskBench, a framework to evaluate the capability of large language models (LLMs) in task automation.
Specifically, task decomposition, tool selection, and parameter prediction are assessed.
Our approach combines automated construction with rigorous human verification, ensuring high consistency with human evaluation.
arXiv Detail & Related papers (2023-11-30T18:02:44Z)
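Of TaskBench's three axes, tool selection is the most mechanical to score: it reduces to comparing predicted against ground-truth tool sets. A toy version of such a metric is below; the set-level granularity is an assumption, and the benchmark's real protocol (e.g. graph-structured tool dependencies) is more involved.

```python
def tool_f1(predicted: set[str], gold: set[str]) -> float:
    """Set-level F1 between predicted and ground-truth tool invocations."""
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)  # tools both predicted and required
    precision, recall = tp / len(predicted), tp / len(gold)
    return 2 * precision * recall / (precision + recall) if tp else 0.0

print(tool_f1({"search", "calculator"}, {"search", "translator"}))  # 0.5
```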
- NOD-TAMP: Generalizable Long-Horizon Planning with Neural Object Descriptors [16.475094344344512]
We propose to combine two paradigms: Neural Object Descriptors (NODs) that produce generalizable object-centric features and Task and Motion Planning (TAMP) frameworks that chain short-horizon skills to solve multi-step tasks.
We introduce NOD-TAMP, a TAMP-based framework that extracts short manipulation trajectories from a handful of human demonstrations, adapts these trajectories using NOD features, and composes them to solve broad long-horizon, contact-rich tasks.
arXiv Detail & Related papers (2023-11-02T18:26:28Z)
- LoHoRavens: A Long-Horizon Language-Conditioned Benchmark for Robotic Tabletop Manipulation [38.66406497318709]
This work focuses on the tabletop manipulation task and releases a simulation benchmark, LoHoRavens, which covers various long-horizon reasoning aspects spanning color, size, space, arithmetic, and reference.
We investigate two methods of bridging the modality gap: caption generation and a learnable interface for incorporating explicit and implicit observation feedback into the LLM.
arXiv Detail & Related papers (2023-10-18T14:53:14Z)
- GenSim: Generating Robotic Simulation Tasks via Large Language Models [34.79613485106202]
GenSim aims to automatically generate rich simulation environments and expert demonstrations.
We use GPT-4 to expand the existing benchmark tenfold, to over 100 tasks.
With minimal sim-to-real adaptation, multitask policies pretrained on GPT-4-generated simulation tasks exhibit stronger transfer to unseen long-horizon tasks in the real world.
arXiv Detail & Related papers (2023-10-02T17:23:48Z)
- ProgPrompt: Generating Situated Robot Task Plans using Large Language Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning.
We present a programmatic LLM prompt structure that enables plan generation to function across situated environments.
arXiv Detail & Related papers (2022-09-22T20:29:49Z)
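ProgPrompt's core idea is to phrase the planning prompt itself as a Python program, so the LLM completes a function rather than free-form text. The fragment below mirrors that format in spirit only; the action names, object list, and task are illustrative, not taken from the paper.

```python
# Prompt text sent to the LLM, not code executed on the robot: available
# actions appear as imports, the scene as a list, and a worked example as a
# function the model is asked to continue.
PROMPT = '''
from actions import grab, putin, open, close
objects = ["salmon", "fridge", "microwave", "plate"]

def microwave_salmon():
    # 1: take the salmon out of the fridge
    open("fridge")
    grab("salmon")
    close("fridge")
    # 2: heat it
    open("microwave")
    putin("salmon", "microwave")
    close("microwave")

def throw_away_plate():
'''
# An LLM completion of `throw_away_plate` yields the next situated plan.
```

Because the prompt encodes available actions as imports and the scene as an object list, completions tend to stay within the robot's actual capabilities.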
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.