Fugu-MT Paper Translation (Abstract): Knowledge Model Prompting Increases LLM Performance on Planning Tasks

Paper summary: Knowledge Model Prompting Increases LLM Performance on Planning Tasks

arxiv url: http://arxiv.org/abs/2602.03900v1
Date: Tue, 03 Feb 2026 09:47:05 GMT
Status: Translation complete
Last updated in system: 2026-02-05 19:45:11.213442
Title: Knowledge Model Prompting Increases LLM Performance on Planning Tasks
Title (reference translation): Knowledge Model Prompting to Improve LLM Performance on Planning Tasks
Authors: Erik Goh, John Kos, Ashok Goel
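Abstract summary: This paper investigates whether the Task-Method-Knowledge (TMK) framework can improve the reasoning capabilities of large language models. The study evaluates TMK on the PlanBench benchmark to test reasoning and planning capabilities.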
Reference score (independently computed attention score): 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large Language Models (LLMs) can struggle with reasoning ability and planning tasks. Many prompting techniques have been developed to assist with LLM reasoning, notably Chain-of-Thought (CoT); however, these techniques, too, have come under scrutiny as LLMs' ability to reason at all has come into question. Borrowing from the domain of cognitive and educational science, this paper investigates whether the Task-Method-Knowledge (TMK) framework can improve LLM reasoning capabilities beyond its previously demonstrated success in educational applications. The TMK framework's unique ability to capture causal, teleological, and hierarchical reasoning structures, combined with its explicit task decomposition mechanisms, makes it particularly well-suited for addressing language model reasoning deficiencies, and unlike other hierarchical frameworks such as HTN and BDI, TMK provides explicit representations of not just what to do and how to do it, but also why actions are taken. The study evaluates TMK by experimenting on the PlanBench benchmark, focusing on the Blocksworld domain to test for reasoning and planning capabilities, examining whether TMK-structured prompting can help language models better decompose complex planning problems into manageable sub-tasks. Results also highlight significant performance inversion in reasoning models. TMK prompting enables the reasoning model to achieve an accuracy of up to 97.3% on opaque, symbolic tasks (Random versions of Blocksworld in PlanBench) where it previously failed (31.5%), suggesting the potential to bridge the gap between semantic approximation and symbolic manipulation. Our findings suggest that TMK functions not merely as context, but also as a mechanism that steers reasoning models away from their default linguistic modes to engage formal, code-execution pathways in the context of the experiments.
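Abstract (reference translation): Large language models (LLMs) can struggle with reasoning ability and planning tasks. Many prompting techniques, notably Chain-of-Thought (CoT), have been developed to support LLM reasoning, but these techniques have also come under scrutiny because the ability of LLMs to reason at all has been called into question. Borrowing from cognitive and educational science, this paper investigates whether the Task-Method-Knowledge (TMK) framework can improve LLM reasoning capabilities beyond its previously demonstrated success in educational applications. The TMK framework combines a unique ability to capture causal, teleological, and hierarchical reasoning structures with explicit task decomposition mechanisms, which makes it particularly well suited to addressing deficiencies in language model reasoning; unlike other hierarchical frameworks such as HTN and BDI, TMK provides explicit representations not only of what to do and how to do it, but also of why actions are taken. The study evaluates TMK on the PlanBench benchmark, focusing on the Blocksworld domain to test reasoning and planning capabilities, and examines whether TMK-structured prompting helps language models decompose complex planning problems into manageable sub-tasks. The results also highlight a significant performance inversion in reasoning models. With TMK prompting, the reasoning model achieves an accuracy of up to 97.3% on opaque, symbolic tasks (the Random versions of Blocksworld in PlanBench) where it previously failed (31.5%), suggesting the potential to bridge the gap between semantic approximation and symbolic manipulation. These findings suggest that TMK functions not merely as context but also as a mechanism that steers reasoning models away from their default linguistic modes and toward formal, code-execution pathways in the context of the experiments.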
