Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning
- URL: http://arxiv.org/abs/2509.13351v1
- Date: Sun, 14 Sep 2025 02:42:34 GMT
- Title: Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning
- Authors: Pulkit Verma, Ngoc La, Anthony Favier, Swaroop Mishra, Julie A. Shah,
- Abstract summary: Large language models (LLMs) have demonstrated impressive capabilities across diverse tasks, yet their ability to perform structured symbolic planning remains limited.<n>We present a novel instruction tuning framework, PDDL-Instruct, designed to enhance LLMs' symbolic planning capabilities through logical chain-of-thought reasoning.
- Score: 23.185497225384207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have demonstrated impressive capabilities across diverse tasks, yet their ability to perform structured symbolic planning remains limited, particularly in domains requiring formal representations like the Planning Domain Definition Language (PDDL). In this paper, we present a novel instruction tuning framework, PDDL-Instruct, designed to enhance LLMs' symbolic planning capabilities through logical chain-of-thought reasoning. Our approach focuses on teaching models to rigorously reason about action applicability, state transitions, and plan validity using explicit logical inference steps. By developing instruction prompts that guide models through the precise logical reasoning required to determine when actions can be applied in a given state, we enable LLMs to self-correct their planning processes through structured reflection. The framework systematically builds verification skills by decomposing the planning process into explicit reasoning chains about precondition satisfaction, effect application, and invariant preservation. Experimental results on multiple planning domains show that our chain-of-thought reasoning based instruction-tuned models are significantly better at planning, achieving planning accuracy of up to 94% on standard benchmarks, representing a 66% absolute improvement over baseline models. This work bridges the gap between the general reasoning capabilities of LLMs and the logical precision required for automated planning, offering a promising direction for developing better AI planning systems.
Related papers
- iCLP: Large Language Model Reasoning with Implicit Cognition Latent Planning [28.763018368302117]
Large language models (LLMs) can perform reliable step-by-step reasoning during problem-solving.<n> generating accurate and effective textual plans remains challenging due to hallucinations.<n>We propose iCLP, a novel framework that enables LLMs to adaptively generate latent plans.
arXiv Detail & Related papers (2025-12-30T06:19:04Z) - PLAN-TUNING: Post-Training Language Models to Learn Step-by-Step Planning for Complex Problem Solving [66.42260489147617]
We introduce PLAN-TUNING, a framework that distills synthetic task decompositions from large-scale language models.<n>Plan-TUNING fine-tunes smaller models via supervised and reinforcement-learning objectives to improve complex reasoning.<n>Our analysis demonstrates how planning trajectories improves complex reasoning capabilities.
arXiv Detail & Related papers (2025-07-10T07:30:44Z) - LLM-Generated Heuristics for AI Planning: Do We Even Need Domain-Independence Anymore? [87.71321254733384]
Large language models (LLMs) can generate planning approaches tailored to specific planning problems.<n>LLMs can achieve state-of-the-art performance on some standard IPC domains.<n>We discuss whether these results signify a paradigm shift and how they can complement existing planning approaches.
arXiv Detail & Related papers (2025-01-30T22:21:12Z) - Think Small, Plan Smart: Minimalist Symbolic Abstraction and Heuristic Subspace Search for LLM-Guided Task Planning [19.421916137269275]
Large language models (LLMs) offer a promising interface for translating complex and ambiguous natural language instructions into actionable plans.<n>Recent frameworks combine LLMs with symbolic planners by first generating action models (Planning Domain Definition Language) and then applying search.<n>We propose PLAHX, a two-stage LLM-symbolic planning framework that integrates abstract symbolic representations with meta-heuristic subspace search in a parallel and iterative fashion.
arXiv Detail & Related papers (2025-01-25T13:33:22Z) - Embodied CoT Distillation From LLM To Off-the-shelf Agents [6.318203525449058]
DeDer is a framework for decomposing and distilling the embodied reasoning capabilities from large language models (LLMs)<n>Our experiments with the ALFRED benchmark demonstrate that DeDer surpasses leading language planning and distillation approaches.
arXiv Detail & Related papers (2024-12-16T07:18:02Z) - Non-myopic Generation of Language Models for Reasoning and Planning [45.75146679449453]
This paper proposes a novel method, Predictive-Decoding, that leverages Model Predictive Control to enhance planning accuracy.
Our experiments show significant improvements in a wide range of tasks for math, coding, and agents.
arXiv Detail & Related papers (2024-10-22T17:13:38Z) - Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning [94.76546523689113]
We introduce CodePlan, a framework that generates and follows textcode-form plans -- pseudocode that outlines high-level, structured reasoning processes.
CodePlan effectively captures the rich semantics and control flows inherent to sophisticated reasoning tasks.
It achieves a 25.1% relative improvement compared with directly generating responses.
arXiv Detail & Related papers (2024-09-19T04:13:58Z) - Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving planning capabilities of large language models (LLMs)
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z) - Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing [61.98556945939045]
We propose a framework to learn planning-based reasoning through Direct Preference Optimization (DPO) on collected trajectories.
Our results on challenging logical reasoning benchmarks demonstrate the effectiveness of our learning framework.
arXiv Detail & Related papers (2024-02-01T15:18:33Z) - Guiding Language Model Reasoning with Planning Tokens [122.43639723387516]
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks.
We propose a hierarchical generation scheme to encourage a more structural generation of chain-of-thought steps.
Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme.
arXiv Detail & Related papers (2023-10-09T13:29:37Z) - PlanBench: An Extensible Benchmark for Evaluating Large Language Models
on Planning and Reasoning about Change [34.93870615625937]
PlanBench is a benchmark suite based on the kinds of domains used in the automated planning community.
PlanBench provides sufficient diversity in both the task domains and the specific planning capabilities.
arXiv Detail & Related papers (2022-06-21T16:15:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.