Leveraging Pre-trained Large Language Models to Construct and Utilize
World Models for Model-based Task Planning
- URL: http://arxiv.org/abs/2305.14909v2
- Date: Thu, 2 Nov 2023 03:06:19 GMT
- Title: Leveraging Pre-trained Large Language Models to Construct and Utilize
World Models for Model-based Task Planning
- Authors: Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Abstract summary: Methods that use pre-trained large language models directly as planners are currently impractical due to limited correctness of plans.
In this work, we introduce a novel alternative paradigm that constructs an explicit world (domain) model in planning domain definition language (PDDL) and then uses it to plan with sound domain-independent planners.
- Score: 39.29964085305846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a growing interest in applying pre-trained large language models
(LLMs) to planning problems. However, methods that use LLMs directly as
planners are currently impractical due to several factors, including limited
correctness of plans, strong reliance on feedback from interactions with
simulators or even the actual environment, and the inefficiency in utilizing
human feedback. In this work, we introduce a novel alternative paradigm that
constructs an explicit world (domain) model in planning domain definition
language (PDDL) and then uses it to plan with sound domain-independent
planners. To address the fact that LLMs may not generate a fully functional
PDDL model initially, we employ LLMs as an interface between PDDL and sources
of corrective feedback, such as PDDL validators and humans. For users who lack
a background in PDDL, we show that LLMs can translate PDDL into natural
language and effectively encode corrective feedback back to the underlying
domain model. Our framework not only enjoys the correctness guarantee offered
by the external planners but also reduces human involvement by allowing users
to correct domain models at the beginning, rather than inspecting and
correcting (through interactive prompting) every generated plan as in previous
work. On two IPC domains and a Household domain that is more complicated than
commonly used benchmarks such as ALFWorld, we demonstrate that GPT-4 can be
leveraged to produce high-quality PDDL models for over 40 actions, and the
corrected PDDL models are then used to successfully solve 48 challenging
planning tasks. Resources, including the source code, are released at:
https://guansuns.github.io/pages/llm-dm.
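To make the paradigm concrete, here is a minimal Python sketch of the construct-then-plan loop the abstract describes. The helpers `llm`, `run_val`, `ask_human`, and `run_planner` are hypothetical stand-ins for a GPT-4 call, a PDDL validator such as VAL, the user, and a sound classical planner; this illustrates the workflow, not the paper's released code.

```python
def build_domain_model(action_descriptions: list[str], llm) -> str:
    """Ask the LLM to draft one PDDL action schema per natural-language description."""
    actions = [llm(f"Write a PDDL action for: {desc}") for desc in action_descriptions]
    return "(define (domain household)\n" + "\n".join(actions) + "\n)"


def refine_with_feedback(domain_pddl: str, llm, run_val, ask_human) -> str:
    """Fold validator errors and the user's corrective feedback back into the model.

    The LLM acts purely as an interface: it repairs PDDL from error messages
    and renders the domain in plain English so non-experts can inspect it.
    """
    while True:
        errors = run_val(domain_pddl)                # syntactic/semantic feedback
        if errors:
            domain_pddl = llm(f"Fix these PDDL errors:\n{errors}\n\n{domain_pddl}")
            continue
        summary = llm(f"Describe this PDDL domain in plain English:\n{domain_pddl}")
        correction = ask_human(summary)              # empty string means "accepted"
        if not correction:
            return domain_pddl
        domain_pddl = llm(f"Apply this correction:\n{correction}\n\n{domain_pddl}")


def solve(domain_pddl: str, problem_pddl: str, run_planner) -> list[str]:
    # The LLM never plans: once the domain model is accepted, plan search is
    # delegated to a sound domain-independent planner, which is where the
    # correctness guarantee comes from.
    return run_planner(domain_pddl, problem_pddl)
```

Note that the human corrects the domain model once, up front; every later task in the domain is then solved by the planner without further inspection of individual plans.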
Related papers
- Leveraging Environment Interaction for Automated PDDL Generation and Planning with Large Language Models [7.3238629831871735]
Large Language Models (LLMs) have shown remarkable performance in various natural language tasks.
Translating planning problems into the Planning Domain Definition Language (PDDL) has been proposed as a potential solution.
We propose a novel approach that leverages LLMs and environment feedback to automatically generate PDDL domain and problem description files.
arXiv Detail & Related papers (2024-07-17T19:50:51Z)
- From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems [59.40480894948944]
Large language model (LLM)-empowered agents are able to solve decision-making problems in the physical world.
Under this model, the LLM Planner navigates a partially observable Markov decision process (POMDP) by iteratively generating language-based subgoals via prompting.
We prove that the pretrained LLM Planner effectively performs Bayesian aggregated imitation learning (BAIL) through in-context learning.
arXiv Detail & Related papers (2024-05-30T09:42:54Z)
- Multi-Reference Preference Optimization for Large Language Models [56.84730239046117]
We introduce a novel closed-form formulation for direct preference optimization using multiple reference models.
The resulting algorithm, Multi-Reference Preference Optimization (MRPO), leverages broader prior knowledge from diverse reference models.
Our experiments demonstrate that LLMs finetuned with MRPO generalize better across various preference datasets, regardless of data scarcity or abundance.
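The summary above does not state the closed form, so the following is only one plausible reading of the idea: a DPO-style preference loss whose reference term aggregates several frozen reference models by averaging their log-probabilities. Treat it as an illustrative assumption, not MRPO's actual objective.

```python
import torch
import torch.nn.functional as F


def multi_ref_preference_loss(policy_logp_w: torch.Tensor,   # (batch,) chosen under policy
                              policy_logp_l: torch.Tensor,   # (batch,) rejected under policy
                              ref_logps_w: torch.Tensor,     # (num_refs, batch) chosen under refs
                              ref_logps_l: torch.Tensor,     # (num_refs, batch) rejected under refs
                              beta: float = 0.1) -> torch.Tensor:
    # Aggregate the prior knowledge of all reference models; a simple mean is
    # an assumption here, standing in for whatever weighting MRPO derives.
    ref_w = ref_logps_w.mean(dim=0)
    ref_l = ref_logps_l.mean(dim=0)
    # Standard DPO logits, computed against the aggregated reference.
    logits = beta * ((policy_logp_w - ref_w) - (policy_logp_l - ref_l))
    return -F.logsigmoid(logits).mean()
```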
arXiv Detail & Related papers (2024-05-26T00:29:04Z)
- NL2Plan: Robust LLM-Driven Planning from Minimal Text Descriptions [8.004470925893957]
We present NL2Plan, the first domain-agnostic offline LLM-driven planning system.
We evaluate NL2Plan on four planning domains and find that it solves 10 out of 15 tasks.
In addition to using NL2Plan in end-to-end mode, users can inspect and correct all of its intermediate results.
arXiv Detail & Related papers (2024-05-07T11:27:13Z)
- Generating consistent PDDL domains with Large Language Models [4.8551773468225745]
Large Language Models (LLMs) are capable of transforming natural language domain descriptions into plausible-looking PDDL markup.
We present a novel concept to significantly improve the quality of LLM-generated PDDL models by performing automated consistency checking during the generation process.
Although the proposed consistency checking strategies still cannot guarantee absolute correctness of the generated models, they can serve as a valuable source of feedback, reducing the correction effort expected from a human in the loop.
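As one concrete example of the kind of check that can run during generation, the sketch below flags predicates an LLM-generated domain uses inside actions but never declares. The regex-based parsing is a deliberate simplification assuming a well-formed STRIPS domain; a real checker would use a proper PDDL parser.

```python
import re

# Connectives that open a parenthesis in action bodies but are not predicates.
NON_PREDICATES = {"and", "or", "not", "when", "forall", "exists"}


def undeclared_predicates(domain_pddl: str) -> set[str]:
    """Return predicate names used in (:action ...) blocks but missing
    from the (:predicates ...) declaration."""
    block = re.search(r"\(:predicates(.*?)(?=\(:|\Z)", domain_pddl, re.S)
    declared = set(re.findall(r"\(\s*([\w-]+)", block.group(1))) if block else set()
    used: set[str] = set()
    for chunk in re.split(r"\(:action\s+", domain_pddl)[1:]:
        used |= set(re.findall(r"\(\s*([\w-]+)", chunk))
    return used - declared - NON_PREDICATES
```

Any names this returns can be rendered as textual feedback for the next generation attempt, which is exactly the role the paper assigns to its consistency checks.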
arXiv Detail & Related papers (2024-04-11T13:48:48Z)
- Real-World Planning with PDDL+ and Beyond [55.73913765642435]
We present Nyx, a novel PDDL+ planner built to emphasize lightness, simplicity, and, most importantly, adaptability.
Nyx can be tailored to virtually any potential real-world application requiring some form of AI Planning, paving the way for wider adoption of planning methods for solving real-world problems.
arXiv Detail & Related papers (2024-02-19T07:35:49Z)
- TIC: Translate-Infer-Compile for accurate "text to plan" using LLMs and Logical Representations [0.0]
We study the problem of generating plans for planning tasks specified in natural language.
Our approach comprises (a) translate: using an LLM only to generate an interpretable intermediate representation of the natural language task description.
We observe that using an LLM to only output the intermediate representation significantly reduces LLM errors.
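Only the translate step is spelled out above; the infer and compile stages in this sketch follow the paper's title (Translate-Infer-Compile) and should be read as assumptions, with `logic_infer`, `compile_to_pddl`, and `run_planner` all hypothetical helpers.

```python
def text_to_plan(task_text: str, llm, logic_infer, compile_to_pddl, run_planner) -> list[str]:
    # (a) translate: the LLM's only job is to emit an interpretable
    #     intermediate logical representation, which keeps its errors
    #     visible and contained.
    facts = llm(f"Translate to logical facts, one per line:\n{task_text}")
    # (b) infer: a symbolic reasoner, not the LLM, derives implied facts.
    facts = logic_infer(facts)
    # (c) compile: deterministic code builds the PDDL problem, and a sound
    #     planner produces the final plan.
    problem_pddl = compile_to_pddl(facts)
    return run_planner(problem_pddl)
```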
arXiv Detail & Related papers (2024-02-09T18:39:13Z)
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models [52.98743860365194]
We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN).
At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself.
This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents.
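Schematically, the self-play mechanism can be read as follows: the previous iterate acts as the opponent whose generations become rejected responses, while the ground-truth SFT responses are the chosen ones, so no external preference labels or expert opponents are needed. Every method on `model` and `opponent` below is a hypothetical stand-in for real training code.

```python
def spin_iteration(model, opponent, sft_data, beta: float = 0.1):
    """One SPIN-style round: refine `model` by playing against `opponent`."""
    for prompt, human_response in sft_data:
        synthetic = opponent.generate(prompt)  # the model plays against itself
        # DPO-style update: prefer the human response over the model's own
        # earlier output, with the opponent as the frozen reference policy.
        model.preference_update(
            prompt,
            chosen=human_response,
            rejected=synthetic,
            ref=opponent,
            beta=beta,
        )
    return model


# Across rounds, the freshly trained model becomes the next opponent:
#   for _ in range(num_rounds):
#       model = spin_iteration(model, opponent=snapshot(model), sft_data=sft_data)
```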
arXiv Detail & Related papers (2024-01-02T18:53:13Z)
- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning [54.682106515794864]
Offline reinforcement learning (RL) aims to find a near-optimal policy using pre-collected datasets.
This paper introduces Language Models for Motion Control (LaMo), a general framework based on Decision Transformers to use pre-trained Language Models (LMs) for offline RL.
Empirical results indicate LaMo achieves state-of-the-art performance in sparse-reward tasks.
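As a rough illustration of plugging a pre-trained LM into a Decision Transformer, the sketch below feeds (return-to-go, state, action) tokens through linear adapters into a GPT-2 backbone. The adapter layout and prediction head are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
from transformers import GPT2Model


class LMDecisionTransformer(nn.Module):
    """Decision Transformer whose sequence backbone is a pre-trained LM."""

    def __init__(self, state_dim: int, act_dim: int):
        super().__init__()
        self.backbone = GPT2Model.from_pretrained("gpt2")  # pre-trained weights
        hidden = self.backbone.config.n_embd
        # Linear adapters map RL quantities into the LM's embedding space.
        self.embed_rtg = nn.Linear(1, hidden)
        self.embed_state = nn.Linear(state_dim, hidden)
        self.embed_action = nn.Linear(act_dim, hidden)
        self.predict_action = nn.Linear(hidden, act_dim)

    def forward(self, rtg, states, actions):
        # Interleave (return-to-go, state, action) per timestep, as in the
        # original Decision Transformer.
        B, T, _ = states.shape
        tokens = torch.stack(
            [self.embed_rtg(rtg), self.embed_state(states), self.embed_action(actions)],
            dim=2,
        ).reshape(B, 3 * T, -1)
        h = self.backbone(inputs_embeds=tokens).last_hidden_state
        # Predict each action from the hidden state at its state token.
        return self.predict_action(h[:, 1::3])
```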
arXiv Detail & Related papers (2023-10-31T16:24:17Z)