Zero-shot Robotic Manipulation with Language-guided Instruction and Formal Task Planning
- URL: http://arxiv.org/abs/2501.15214v1
- Date: Sat, 25 Jan 2025 13:33:22 GMT
- Title: Zero-shot Robotic Manipulation with Language-guided Instruction and Formal Task Planning
- Authors: Junfeng Tang, Zihan Ye, Yuping Yan, Ziqi Zheng, Ting Gao, Yaochu Jin,
- Abstract summary: We propose an innovative language-guided symbolic task planning (LM-SymOpt) framework with optimization.
It is the first expert-free planning framework since we combine the world knowledge from Large Language Models with formal reasoning.
Our experimental results show that LM-SymOpt outperforms existing LLM-based planning approaches.
- Score: 16.89900521727246
- License:
- Abstract: Robotic manipulation is often challenging due to the long-horizon tasks and the complex object relationships. A common solution is to develop a task and motion planning framework that integrates planning for high-level task and low-level motion. Recently, inspired by the powerful reasoning ability of Large Language Models (LLMs), LLM-based planning approaches have achieved remarkable progress. However, these methods still heavily rely on expert-specific knowledge, often generating invalid plans for unseen and unfamiliar tasks. To address this issue, we propose an innovative language-guided symbolic task planning (LM-SymOpt) framework with optimization. It is the first expert-free planning framework since we combine the world knowledge from LLMs with formal reasoning, resulting in improved generalization capability to new tasks. Specifically, differ to most existing work, our LM-SymOpt employs LLMs to translate natural language instructions into symbolic representations, thereby representing actions as high-level symbols and reducing the search space for planning. Next, after evaluating the action probability of completing the task using LLMs, a weighted random sampling method is introduced to generate candidate plans. Their feasibility is assessed through symbolic reasoning and their cost efficiency is then evaluated using trajectory optimization for selecting the optimal planning. Our experimental results show that LM-SymOpt outperforms existing LLM-based planning approaches.
Related papers
- Large Language Models as Common-Sense Heuristics [0.9093413254392775]
Large Language Models (LLMs) possess parametrised knowledge across a wide range of topics, enabling them to leverage the natural language descriptions of planning tasks in their solutions.
We introduce a novel planning method, which leverages the parametrised knowledge of LLMs by using their output as a for Hill-Climbing Search.
Our method outperforms the task success rate of similar systems within a common household environment by 22 percentage points, with consistently executable plans.
arXiv Detail & Related papers (2025-01-31T00:26:38Z) - AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation [81.32722475387364]
Large Language Model-based agents have garnered significant attention and are becoming increasingly popular.
Planning ability is a crucial component of an LLM-based agent, which generally entails achieving a desired goal from an initial state.
Recent studies have demonstrated that utilizing expert-level trajectory for instruction-tuning LLMs effectively enhances their planning capabilities.
arXiv Detail & Related papers (2024-08-01T17:59:46Z) - Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving planning capabilities of large language models (LLMs)
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z) - LLM3:Large Language Model-based Task and Motion Planning with Motion Failure Reasoning [78.2390460278551]
Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation.
Here, we present LLM3, a novel Large Language Model (LLM)-based TAMP framework featuring a domain-independent interface.
Specifically, we leverage the powerful reasoning and planning capabilities of pre-trained LLMs to propose symbolic action sequences and select continuous action parameters for motion planning.
arXiv Detail & Related papers (2024-03-18T08:03:47Z) - Learning adaptive planning representations with natural language
guidance [90.24449752926866]
This paper describes Ada, a framework for automatically constructing task-specific planning representations.
Ada interactively learns a library of planner-compatible high-level action abstractions and low-level controllers adapted to a particular domain of planning tasks.
arXiv Detail & Related papers (2023-12-13T23:35:31Z) - ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon
Sequential Task Planning [7.701407633867452]
Large Language Models (LLMs) offer the potential to enhance the generalizability as task-agnostic planners.
We introduce ISR-LLM, a novel framework that improves LLM-based planning through an iterative self-refinement process.
We show that ISR-LLM is able to achieve markedly higher success rates in task accomplishments compared to state-of-the-art LLM-based planners.
arXiv Detail & Related papers (2023-08-26T01:31:35Z) - AdaPlanner: Adaptive Planning from Feedback with Language Models [56.367020818139665]
Large language models (LLMs) have recently demonstrated the potential in acting as autonomous agents for sequential decision-making tasks.
We propose a closed-loop approach, AdaPlanner, which allows the LLM agent to refine its self-generated plan adaptively in response to environmental feedback.
To mitigate hallucination, we develop a code-style LLM prompt structure that facilitates plan generation across a variety of tasks, environments, and agent capabilities.
arXiv Detail & Related papers (2023-05-26T05:52:27Z) - Understanding the Capabilities of Large Language Models for Automated
Planning [24.37599752610625]
The study seeks to shed light on the capabilities of LLMs in solving complex planning problems.
It provides insights into the most effective approaches for using LLMs in this context.
arXiv Detail & Related papers (2023-05-25T15:21:09Z) - Learning to Plan with Natural Language [111.76828049344839]
Large Language Models (LLMs) have shown remarkable performance in various basic natural language tasks.
For completing the complex task, we still need a plan for the task to guide LLMs to generate the specific solutions step by step.
We propose the Learning to Plan method, which involves two phases: (1) In the first learning task plan phase, it iteratively updates the task plan with new step-by-step solutions and behavioral instructions, which are obtained by prompting LLMs to derive from training error feedback.
arXiv Detail & Related papers (2023-04-20T17:09:12Z) - Plansformer: Generating Symbolic Plans using Transformers [24.375997526106246]
Large Language Models (LLMs) have been the subject of active research, significantly advancing the field of Natural Language Processing (NLP)
We introduce Plansformer; an LLM fine-tuned on planning problems and capable of generating plans with favorable behavior in terms of correctness and length with reduced knowledge-engineering efforts.
For one configuration of Plansformer, we achieve 97% valid plans, out of which 95% are optimal for Towers of Hanoi - a puzzle-solving domain.
arXiv Detail & Related papers (2022-12-16T19:06:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.