Related papers: Leveraging Environment Interaction for Automated PDDL Translation and Planning with Large Language Models

URL: http://arxiv.org/abs/2407.12979v2
Date: Sat, 09 Nov 2024 05:23:47 GMT
Title: Leveraging Environment Interaction for Automated PDDL Translation and Planning with Large Language Models
Authors: Sadegh Mahdavi, Raquel Aoki, Keyi Tang, Yanshuai Cao
Abstract summary: Large Language Models (LLMs) have shown remarkable performance in various natural language tasks. Converting planning problems into the Planning Domain Definition Language (PDDL) has been proposed as a potential solution. We propose a novel approach that leverages LLMs and environment feedback to automatically generate PDDL domain and problem description files.
Score: 7.3238629831871735
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large Language Models (LLMs) have shown remarkable performance in various natural language tasks, but they often struggle with planning problems that require structured reasoning. To address this limitation, the conversion of planning problems into the Planning Domain Definition Language (PDDL) has been proposed as a potential solution, enabling the use of automated planners. However, generating accurate PDDL files typically demands human inputs or correction, which can be time-consuming and costly. In this paper, we propose a novel approach that leverages LLMs and environment feedback to automatically generate PDDL domain and problem description files without the need for human intervention. Our method introduces an iterative refinement process that generates multiple problem PDDL candidates and progressively refines the domain PDDL based on feedback obtained from interacting with the environment. To guide the refinement process, we develop an Exploration Walk (EW) metric, which provides rich feedback signals for LLMs to update the PDDL file. We evaluate our approach on 10 PDDL environments. We achieve an average task solve rate of 66% compared to a 29% solve rate by GPT-4's intrinsic planning with chain-of-thought prompting. Our work enables the automated modeling of planning environments using LLMs and environment feedback, eliminating the need for human intervention in the PDDL translation process and paving the way for more reliable LLM agents in challenging problems. Our code is available at https://github.com/BorealisAI/llm-pddl-planning
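
The abstract describes scoring candidate domain models by interacting with the environment and using that score to guide refinement. The toy sketch below illustrates the general idea only; it is not the paper's Exploration Walk definition or code. The counter world, `TRUE_RULES`, `walk_agreement`, and the hand-written candidates (standing in for LLM-generated PDDL domains) are all hypothetical names invented for this example.

```python
import random

# Hypothetical toy environment: a counter where each action is legal
# only in some states. This stands in for a real PDDL environment.
TRUE_RULES = {
    "inc": lambda s: s < 3,  # may increment while below 3
    "dec": lambda s: s > 0,  # may decrement while above 0
}
EFFECTS = {"inc": 1, "dec": -1}


def random_walk(rules, steps, rng, start=0):
    """Sample a sequence of actions that `rules` considers executable."""
    state, walk = start, []
    for _ in range(steps):
        legal = [a for a in sorted(rules) if rules[a](state)]
        if not legal:
            break
        action = rng.choice(legal)
        walk.append(action)
        state += EFFECTS[action]
    return walk


def executable_fraction(rules, walk, start=0):
    """Fraction of `walk` that runs under `rules` before the first illegal step."""
    state = start
    for i, action in enumerate(walk):
        if not rules[action](state):
            return i / len(walk)
        state += EFFECTS[action]
    return 1.0


def walk_agreement(candidate, n_walks=50, steps=8, seed=0):
    """Assumed stand-in score: replay walks in both directions and average."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_walks):
        # Walks sampled from the environment, replayed in the candidate model.
        scores.append(executable_fraction(candidate, random_walk(TRUE_RULES, steps, rng)))
        # Walks sampled from the candidate model, replayed in the environment.
        scores.append(executable_fraction(TRUE_RULES, random_walk(candidate, steps, rng)))
    return sum(scores) / len(scores)


# Hand-written candidates standing in for LLM-proposed domain models.
candidates = {
    "too_strict": {"inc": lambda s: s < 1, "dec": lambda s: s > 0},
    "too_loose": {"inc": lambda s: True, "dec": lambda s: s > 0},
    "correct": {"inc": lambda s: s < 3, "dec": lambda s: s > 0},
}

# One refinement step: keep the candidate that agrees best with the environment.
best = max(candidates, key=lambda name: walk_agreement(candidates[name]))
print(best)
```

Replaying walks in both directions matters in this sketch: a model that is too strict fails on walks sampled from the environment, while a model that is too permissive produces walks the environment rejects. A graded score of this kind gives a refinement loop more signal than a binary "plan succeeded or failed" check.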

Related papers

Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks. However, they still struggle with problems requiring multi-step decision-making and environmental feedback. We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
Generating Symbolic World Models via Test-time Scaling of Large Language Models [28.258707611580643]
Planning Domain Definition Language (PDDL) is leveraged as a planning abstraction that enables precise and formal state descriptions. We introduce a simple yet effective algorithm, which first employs a Best-of-N sampling approach to improve the quality of the initial solution and then refines the solution in a fine-grained manner with verbalized machine learning. Our method outperforms o1-mini by a considerable margin in generating PDDL domains, achieving over 50% success rate on two tasks.
arXiv Detail & Related papers (2025-02-07T07:52:25Z)
MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [52.739500459903724]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation. We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents. We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z)
Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages [20.62336315814875]
We introduce Planetarium, a benchmark designed to evaluate language models' ability to generate PDDL code from natural language descriptions of planning tasks. We present a dataset of 132,037 text-to-PDDL pairs across 13 different tasks, with varying levels of difficulty.
arXiv Detail & Related papers (2024-07-03T17:59:53Z)
NL2Plan: Robust LLM-Driven Planning from Minimal Text Descriptions [8.004470925893957]
We present NL2Plan, the first domain-agnostic offline LLM-driven planning system. We evaluate NL2Plan on four planning domains and find that it solves 10 out of 15 tasks. In addition to using NL2Plan in end-to-end mode, users can inspect and correct all of its intermediate results.
arXiv Detail & Related papers (2024-05-07T11:27:13Z)
PROC2PDDL: Open-Domain Planning Representations from Texts [56.627183903841164]
Proc2PDDL is the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. We show that Proc2PDDL is highly challenging, with GPT-3.5's success rate close to 0% and GPT-4's around 35%.
arXiv Detail & Related papers (2024-02-29T19:40:25Z)
Real-World Planning with PDDL+ and Beyond [55.73913765642435]
We present Nyx, a novel PDDL+ planner built to emphasize lightness, simplicity, and, most importantly, adaptability. Nyx can be tailored to virtually any potential real-world application requiring some form of AI Planning, paving the way for wider adoption of planning methods for solving real-world problems.
arXiv Detail & Related papers (2024-02-19T07:35:49Z)
TIC: Translate-Infer-Compile for accurate "text to plan" using LLMs and Logical Representations [0.0]
We study the problem of generating plans for given natural language planning task requests. Our approach comprises (a) translate: using an LLM only to generate an interpretable intermediate representation of the natural language task description. We observe that using an LLM only to output the intermediate representation significantly reduces LLM errors.
arXiv Detail & Related papers (2024-02-09T18:39:13Z)
AutoPlanBench: Automatically generating benchmarks for LLM planners from PDDL [52.005042190810116]
We present AutoPlanBench, a novel method for automatically converting planning benchmarks written in PDDL into textual descriptions. We show that while the best LLM planners do well on some planning tasks, others remain out of reach of current methods.
arXiv Detail & Related papers (2023-11-16T11:55:27Z)
HDDL 2.1: Towards Defining a Formalism and a Semantics for Temporal HTN Planning [64.07762708909846]
Real-world applications require modelling rich and diverse automated planning problems. The hierarchical task network (HTN) formalism does not allow planning problems with numerical and temporal constraints to be represented. We propose to fill the gap between HDDL and these operational needs and to extend HDDL by taking inspiration from PDDL 2.1.
arXiv Detail & Related papers (2023-06-12T18:21:23Z)
AdaPlanner: Adaptive Planning from Feedback with Language Models [56.367020818139665]
Large language models (LLMs) have recently demonstrated potential as autonomous agents for sequential decision-making tasks. We propose a closed-loop approach, AdaPlanner, which allows the LLM agent to refine its self-generated plan adaptively in response to environmental feedback. To mitigate hallucination, we develop a code-style LLM prompt structure that facilitates plan generation across a variety of tasks, environments, and agent capabilities.
arXiv Detail & Related papers (2023-05-26T05:52:27Z)
Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning [39.29964085305846]
Methods that use pre-trained large language models directly as planners are currently impractical due to limited correctness of plans. In this work, we introduce a novel alternative paradigm that constructs an explicit world (domain) model in planning domain definition language (PDDL) and then uses it to plan with sound domain-independent planners.
arXiv Detail & Related papers (2023-05-24T08:59:15Z)
Policy-Guided Lazy Search with Feedback for Task and Motion Planning [19.789123503976917]
PDDLStream solvers have recently emerged as viable solutions for Task and Motion Planning problems. We propose LAZY, a solver for PDDLStream problems that maintains a single integrated search over action skeletons. We show that this leads to significant speed-ups in the search for a feasible solution evaluated over unseen test environments.
arXiv Detail & Related papers (2022-10-25T14:33:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.