Fugu-MT Paper Translation (Abstract): Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning

Paper summary: Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning

arxiv url: http://arxiv.org/abs/2509.13351v1
Date: Sun, 14 Sep 2025 02:42:34 GMT
Status: Translation complete
Last updated in system: 2025-09-18 18:41:50.548028
Title: Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning
Title (reference translation): Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning
Authors: Pulkit Verma, Ngoc La, Anthony Favier, Swaroop Mishra, Julie A. Shah
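Abstract summary: Large language models (LLMs) have shown impressive capabilities across a wide range of tasks, but their ability to perform structured symbolic planning remains limited. The authors propose PDDL-Instruct, a new instruction tuning framework designed to enhance the symbolic planning ability of LLMs through logical chain-of-thought reasoning.
Reference score (independently calculated attention score): 23.185497225384207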
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have demonstrated impressive capabilities across diverse tasks, yet their ability to perform structured symbolic planning remains limited, particularly in domains requiring formal representations like the Planning Domain Definition Language (PDDL). In this paper, we present a novel instruction tuning framework, PDDL-Instruct, designed to enhance LLMs' symbolic planning capabilities through logical chain-of-thought reasoning. Our approach focuses on teaching models to rigorously reason about action applicability, state transitions, and plan validity using explicit logical inference steps. By developing instruction prompts that guide models through the precise logical reasoning required to determine when actions can be applied in a given state, we enable LLMs to self-correct their planning processes through structured reflection. The framework systematically builds verification skills by decomposing the planning process into explicit reasoning chains about precondition satisfaction, effect application, and invariant preservation. Experimental results on multiple planning domains show that our chain-of-thought reasoning based instruction-tuned models are significantly better at planning, achieving planning accuracy of up to 94% on standard benchmarks, representing a 66% absolute improvement over baseline models. This work bridges the gap between the general reasoning capabilities of LLMs and the logical precision required for automated planning, offering a promising direction for developing better AI planning systems.
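Abstract (reference translation): Large language models (LLMs) have shown impressive capabilities across a wide range of tasks, but their ability to carry out structured symbolic planning is limited, particularly in domains that require formal representations such as the Planning Domain Definition Language (PDDL). This paper proposes PDDL-Instruct, a new instruction tuning framework that aims to improve the symbolic planning ability of LLMs through logical chain-of-thought reasoning. The approach focuses on teaching models to reason rigorously about action applicability, state transitions, and plan validity using explicit logical inference steps. By developing instruction prompts that guide models through the precise logical reasoning needed to determine when an action can be applied in a given state, the framework enables LLMs to self-correct their planning processes through structured reflection. It systematically builds verification skills by decomposing the planning process into explicit reasoning chains about precondition satisfaction, effect application, and invariant preservation. Experimental results on multiple planning domains show that the chain-of-thought instruction-tuned models achieve planning accuracy of up to 94% on standard benchmarks, a 66% absolute improvement over baseline models. This work bridges the gap between the general reasoning capabilities of LLMs and the logical precision required for automated planning, and offers a promising direction for developing better AI planning systems.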
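The abstract describes checking plans step by step: is each action applicable in the current state, what state results from applying its effects, and does the final state satisfy the goal. The sketch below illustrates that kind of check for a simple STRIPS-style representation. It is not the authors' implementation or the PDDL-Instruct pipeline. The `Action` class, the `validate_plan` function, and the two-block example problem are all hypothetical names and data chosen for illustration, and states are assumed to be plain sets of ground facts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A ground STRIPS-style action (hypothetical structure for illustration)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset


def validate_plan(initial_state, goal, plan):
    """Check a plan step by step and return (is_valid, trace).

    Each step checks precondition satisfaction, then applies the
    delete and add effects to obtain the next state.
    """
    state = frozenset(initial_state)
    trace = []
    for step, action in enumerate(plan, start=1):
        # Action applicability: every precondition must hold in the current state.
        missing = action.preconditions - state
        if missing:
            trace.append(
                f"step {step}: {action.name} not applicable, missing {sorted(missing)}"
            )
            return False, trace
        # State transition: remove deleted facts, then add new facts.
        state = (state - action.del_effects) | action.add_effects
        trace.append(f"step {step}: {action.name} applied")
    # Plan validity: the goal facts must all hold in the final state.
    unmet = frozenset(goal) - state
    if unmet:
        trace.append(f"goal not reached, missing {sorted(unmet)}")
        return False, trace
    trace.append("goal reached")
    return True, trace


# Hypothetical two-block problem (assumed for illustration, not from the paper).
PICKUP_A = Action(
    name="pickup a",
    preconditions=frozenset({"clear a", "ontable a", "handempty"}),
    add_effects=frozenset({"holding a"}),
    del_effects=frozenset({"clear a", "ontable a", "handempty"}),
)
STACK_A_B = Action(
    name="stack a b",
    preconditions=frozenset({"holding a", "clear b"}),
    add_effects=frozenset({"on a b", "clear a", "handempty"}),
    del_effects=frozenset({"holding a", "clear b"}),
)
INITIAL = {"clear a", "clear b", "ontable a", "ontable b", "handempty"}
GOAL = {"on a b"}

if __name__ == "__main__":
    for candidate in ([PICKUP_A, STACK_A_B], [STACK_A_B]):
        ok, trace = validate_plan(INITIAL, GOAL, candidate)
        print(ok, trace)
```

The returned trace records why a plan fails (an unmet precondition or an unmet goal fact), which is the kind of explicit feedback a structured reflection step could consume. Invariant preservation, which the abstract also mentions, is not modeled in this sketch.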
