Related papers: Generating Symbolic World Models via Test-time Scaling of Large Language Models

URL: http://arxiv.org/abs/2502.04728v1
Date: Fri, 07 Feb 2025 07:52:25 GMT
Title: Generating Symbolic World Models via Test-time Scaling of Large Language Models
Authors: Zhouliang Yu, Yuhuan Yuan, Tim Z. Xiao, Fuxiang Frank Xia, Jie Fu, Ge Zhang, Ge Lin, Weiyang Liu
Abstract summary: Planning Domain Definition Language (PDDL) is leveraged as a planning abstraction that enables precise and formal state descriptions. We introduce a simple yet effective algorithm, which first employs a Best-of-N sampling approach to improve the quality of the initial solution and then refines the solution in a fine-grained manner with verbalized machine learning. Our method outperforms o1-mini by a considerable margin in the generation of PDDL domains, achieving over 50% success rate on two tasks.
Score: 28.258707611580643
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Solving complex planning problems requires Large Language Models (LLMs) to explicitly model the state transition to avoid rule violations, comply with constraints, and ensure optimality, a task hindered by the inherent ambiguity of natural language. To overcome such ambiguity, Planning Domain Definition Language (PDDL) is leveraged as a planning abstraction that enables precise and formal state descriptions. With PDDL, we can generate a symbolic world model where classic search algorithms, such as A*, can be seamlessly applied to find optimal plans. However, directly generating PDDL domains with current LLMs remains an open challenge due to the lack of PDDL training data. To address this challenge, we propose to scale up the test-time computation of LLMs to enhance their PDDL reasoning capabilities, thereby enabling the generation of high-quality PDDL domains. Specifically, we introduce a simple yet effective algorithm, which first employs a Best-of-N sampling approach to improve the quality of the initial solution and then refines the solution in a fine-grained manner with verbalized machine learning. Our method outperforms o1-mini by a considerable margin in the generation of PDDL domains, achieving over 50% success rate on two tasks (i.e., generating PDDL domains from natural language descriptions or from PDDL problems). This is done without requiring additional training. By taking advantage of PDDL as a state abstraction, our method is able to outperform current state-of-the-art methods on almost all competition-level planning tasks.
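The two-stage idea in the abstract (sample N candidates and keep the best, then refine that candidate step by step) can be illustrated with a minimal sketch. This is not the authors' implementation: the function `best_of_n_then_refine`, the callables `generate`, `score`, and `refine`, and the toy keyword-based scorer are all hypothetical stand-ins. In particular, the refinement loop here is a plain greedy "accept if the score improves" rule, used in place of the verbalized machine learning step, whose details the abstract does not give.

```python
from typing import Callable, List, Tuple


def best_of_n_then_refine(
    generate: Callable[[], str],
    score: Callable[[str], float],
    refine: Callable[[str, float], str],
    n: int = 8,
    steps: int = 3,
) -> Tuple[str, float]:
    """Sample n candidates, keep the highest-scoring one, then refine it greedily.

    All three callables are hypothetical placeholders: in a real system,
    `generate` and `refine` would call an LLM and `score` would use a
    validator or planner.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Stage 1: Best-of-N sampling.
    candidates: List[str] = [generate() for _ in range(n)]
    best = max(candidates, key=score)
    best_score = score(best)
    # Stage 2: iterative refinement, accepting a revision only if it scores higher.
    for _ in range(steps):
        revised = refine(best, best_score)
        revised_score = score(revised)
        if revised_score > best_score:
            best, best_score = revised, revised_score
    return best, best_score


# Toy stand-ins (assumptions for illustration only).
REQUIRED = (":action", ":precondition", ":effect")


def toy_score(domain: str) -> float:
    # Fraction of required PDDL keywords present in the candidate text.
    return sum(tag in domain for tag in REQUIRED) / len(REQUIRED)


def toy_refine(domain: str, current_score: float) -> str:
    # Append the first missing keyword; return the input unchanged if none is missing.
    for tag in REQUIRED:
        if tag not in domain:
            return domain + " " + tag
    return domain


samples = iter(["(define (domain d))", "(define (domain d) (:action move))"])
best, best_score = best_of_n_then_refine(
    lambda: next(samples), toy_score, toy_refine, n=2, steps=3
)
```

Keeping the scorer as a separate callable means the same loop could, under these assumptions, be driven by a syntax checker, a plan validator, or any other signal of domain quality.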

Related papers

Self-Steering Language Models [113.96916935955842]
DisCIPL is a method for "self-steering" language models. DisCIPL uses a Planner model to generate a task-specific inference program. Our work opens up a design space of highly-parallelized Monte Carlo inference strategies.
arXiv Detail & Related papers (2025-04-09T17:54:22Z)
An Extensive Evaluation of PDDL Capabilities in off-the-shelf LLMs [11.998185452551878]
Large language models (LLMs) have exhibited proficiency in code generation and chain-of-thought reasoning. This study evaluates the potential of LLMs to understand and generate Planning Domain Definition Language (PDDL).
arXiv Detail & Related papers (2025-02-27T15:13:07Z)
LLM-Generated Heuristics for AI Planning: Do We Even Need Domain-Independence Anymore? [87.71321254733384]
Large language models (LLMs) can generate planning approaches tailored to specific planning problems. LLMs can achieve state-of-the-art performance on some standard IPC domains. We discuss whether these results signify a paradigm shift and how they can complement existing planning approaches.
arXiv Detail & Related papers (2025-01-30T22:21:12Z)
On the Limit of Language Models as Planning Formalizers [4.145422873316857]
Large Language Models fail to create verifiable plans in grounded environments. An emerging line of work shows success in using an LLM as a formalizer to generate a formal representation of the planning domain. We observe that large enough models can effectively formalize descriptions as PDDL, outperforming those directly generating plans.
arXiv Detail & Related papers (2024-12-13T05:50:22Z)
Leveraging Environment Interaction for Automated PDDL Translation and Planning with Large Language Models [7.3238629831871735]
Large Language Models (LLMs) have shown remarkable performance in various natural language tasks. Translating planning problems into the Planning Domain Definition Language (PDDL) has been proposed as a potential solution. We propose a novel approach that leverages LLMs and environment feedback to automatically generate PDDL domain and problem description files.
arXiv Detail & Related papers (2024-07-17T19:50:51Z)
Generating consistent PDDL domains with Large Language Models [4.8551773468225745]
Large Language Models (LLMs) are capable of transforming natural language domain descriptions into plausible-looking PDDL markup. We present a novel concept to significantly improve the quality of LLM-generated PDDL models by performing automated consistency checking during the generation process. Although the proposed consistency checking strategies still can't guarantee absolute correctness of generated models, they can serve as a valuable source of feedback, reducing the amount of correction effort expected from a human in the loop.
arXiv Detail & Related papers (2024-04-11T13:48:48Z)
PROC2PDDL: Open-Domain Planning Representations from Texts [56.627183903841164]
Proc2PDDL is the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. We show that Proc2PDDL is highly challenging, with GPT-3.5's success rate close to 0% and GPT-4's around 35%.
arXiv Detail & Related papers (2024-02-29T19:40:25Z)
Real-World Planning with PDDL+ and Beyond [55.73913765642435]
We present Nyx, a novel PDDL+ planner built to emphasize lightness, simplicity, and, most importantly, adaptability. Nyx can be tailored to virtually any potential real-world application requiring some form of AI Planning, paving the way for wider adoption of planning methods for solving real-world problems.
arXiv Detail & Related papers (2024-02-19T07:35:49Z)
Automating the Generation of Prompts for LLM-based Action Choice in PDDL Planning [59.543858889996024]
Large language models (LLMs) have revolutionized a large variety of NLP tasks. We show how to leverage an LLM to automatically generate NL prompts from PDDL input.
arXiv Detail & Related papers (2023-11-16T11:55:27Z)
HDDL 2.1: Towards Defining a Formalism and a Semantics for Temporal HTN Planning [64.07762708909846]
Real-world applications require modelling rich and diverse automated planning problems. The hierarchical task network (HTN) formalism does not allow planning problems with numerical and temporal constraints to be represented. We propose to fill the gap between HDDL and these operational needs and to extend HDDL by taking inspiration from PDDL 2.1.
arXiv Detail & Related papers (2023-06-12T18:21:23Z)
Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning [39.29964085305846]
Methods that use pre-trained large language models directly as planners are currently impractical due to limited correctness of plans. In this work, we introduce a novel alternative paradigm that constructs an explicit world (domain) model in planning domain definition language (PDDL) and then uses it to plan with sound domain-independent planners.
arXiv Detail & Related papers (2023-05-24T08:59:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.