Fugu-MT paper translation (abstract): Distilling Reasoning Without Knowledge: A Framework for Reliable LLMs

Paper overview: Distilling Reasoning Without Knowledge: A Framework for Reliable LLMs

arxiv url: http://arxiv.org/abs/2603.14458v1
Date: Sun, 15 Mar 2026 16:06:54 GMT
Status: translation complete
Last updated in system: 2026-03-17 16:19:35.818913
Title: Distilling Reasoning Without Knowledge: A Framework for Reliable LLMs
Authors: Auksarapak Kietkajornrit, Jad Tarifi, Nima Asgharbeygi
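Abstract summary: This paper proposes a modular framework that separates planning from factual retrieval and answer synthesis. A lightweight student planner is trained via a teacher-student framework to generate structured decompositions. The proposed framework is evaluated on SEAL-0.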
Reference score (independently computed attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Fact-seeking question answering with large language models (LLMs) remains unreliable when answers depend on up-to-date or conflicting information. Although retrieval-augmented and tool-using LLMs reduce hallucinations, they often rely on implicit planning, leading to inefficient tool usage. We propose a modular framework that explicitly separates planning from factual retrieval and answer synthesis. A lightweight student planner is trained via a teacher-student framework to generate structured decompositions consisting of abstract reasoning steps and searchable fact requests. The supervision signals contain only planning traces and fact requests, without providing factual answers or retrieved evidence. At inference, the planner produces plans, while prompt-engineered modules perform retrieval and response synthesis. We evaluate the proposed framework on SEAL-0, an extremely challenging benchmark for search-augmented LLMs. Results show that supervised planning improves both accuracy and latency compared to monolithic reasoning models and prompt-based tool-augmented frameworks, demonstrating that explicitly learned planning structures are essential for reliable fact-seeking LLMs.
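The separation described in the abstract (a planner that emits abstract reasoning steps and searchable fact requests, followed by separate retrieval and synthesis modules) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names PlanStep, Plan, to_training_target, run_pipeline, and the toy_* functions are all hypothetical, the toy facts are made-up placeholder data, and the de-duplication of repeated fact requests is an assumption added here for illustration. In the paper the planner is a trained student model and the other two modules are prompt-engineered LLM calls; plain Python functions stand in for them below.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, Dict, List


@dataclass
class PlanStep:
    """One abstract reasoning step plus the searchable facts it needs (hypothetical schema)."""
    reasoning: str
    fact_requests: List[str] = field(default_factory=list)


@dataclass
class Plan:
    """Structured decomposition of a question (hypothetical schema)."""
    question: str
    steps: List[PlanStep]


def to_training_target(plan: Plan) -> str:
    """Serialize a plan as a supervision target.

    Only reasoning steps and fact requests are included; no factual
    answers or retrieved evidence, mirroring the abstract's description.
    """
    return json.dumps(asdict(plan), sort_keys=True)


def run_pipeline(
    question: str,
    planner: Callable[[str], Plan],
    retriever: Callable[[str], str],
    synthesizer: Callable[[Plan, Dict[str, str]], str],
) -> str:
    """Plan first, then retrieve each requested fact, then synthesize."""
    plan = planner(question)
    evidence: Dict[str, str] = {}
    for step in plan.steps:
        for request in step.fact_requests:
            # Assumption: skip requests already answered to avoid repeat tool calls.
            if request not in evidence:
                evidence[request] = retriever(request)
    return synthesizer(plan, evidence)


# --- Toy stand-ins (placeholder data, not real facts) ---

TOY_FACTS = {
    "release year of product A": "2019",
    "release year of product B": "2021",
}


def toy_planner(question: str) -> Plan:
    """Stand-in for the trained student planner; returns a fixed plan."""
    return Plan(
        question=question,
        steps=[
            PlanStep(
                reasoning="Find when each product was released.",
                fact_requests=[
                    "release year of product A",
                    "release year of product B",
                ],
            ),
            PlanStep(
                reasoning="Compare the two years.",
                fact_requests=["release year of product B"],  # repeated on purpose
            ),
        ],
    )


def toy_retriever(request: str) -> str:
    """Stand-in for a search tool; looks the request up in TOY_FACTS."""
    return TOY_FACTS.get(request, "unknown")


def toy_synthesizer(plan: Plan, evidence: Dict[str, str]) -> str:
    """Stand-in for the answer-synthesis module; just lists the evidence."""
    facts = "; ".join(f"{req} = {ans}" for req, ans in evidence.items())
    return f"{plan.question} -> {facts}"


if __name__ == "__main__":
    q = "Which was released later, product A or product B?"
    print(to_training_target(toy_planner(q)))
    print(run_pipeline(q, toy_planner, toy_retriever, toy_synthesizer))
```

Keeping the plan as a standalone data structure is what lets the training target omit retrieved evidence: the serialized plan contains the fact requests but none of the values the retriever later returns.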
