Fugu-MT Paper Translation (Abstract): Procedural Knowledge at Scale Improves Reasoning

Paper abstract: Procedural Knowledge at Scale Improves Reasoning

arxiv url: http://arxiv.org/abs/2604.01348v1
Date: Wed, 01 Apr 2026 20:01:47 GMT
Status: Translation complete
System update date: 2026-04-03 14:21:09.869213
Title: Procedural Knowledge at Scale Improves Reasoning
Title (reference translation): Procedural Knowledge at Scale Improves Reasoning
Authors: Di Wu, Devendra Singh Sachan, Wen-tau Yih, Mingda Chen,
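Abstract summary: Reasoning Memory is a framework for reasoning models that explicitly retrieves and reuses procedural knowledge at scale. It consistently outperforms RAG with document, trajectory, and template knowledge, as well as a compute-matched test-time scaling baseline.
Reference score (independently computed attention score): 25.36077714467684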
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Test-time scaling has emerged as an effective way to improve language models on challenging reasoning tasks. However, most existing methods treat each problem in isolation and do not systematically reuse knowledge from prior reasoning trajectories. In particular, they underutilize procedural knowledge: how to reframe a problem, choose an approach, and verify or backtrack when needed. We introduce Reasoning Memory, a retrieval-augmented generation (RAG) framework for reasoning models that explicitly retrieves and reuses procedural knowledge at scale. Starting from existing corpora of step-by-step reasoning trajectories, we decompose each trajectory into self-contained subquestion-subroutine pairs, yielding a datastore of 32 million compact procedural knowledge entries. At inference time, a lightweight in-thought prompt lets the model verbalize the core subquestion, retrieve relevant subroutines within its reasoning trace, and reason under diverse retrieved subroutines as implicit procedural priors. Across six math, science, and coding benchmarks, Reasoning Memory consistently outperforms RAG with document, trajectory, and template knowledge, as well as a compute-matched test-time scaling baseline. With a higher inference budget, it improves over no retrieval by up to 19.2% and over the strongest compute-matched baseline by 7.9% across task types. Ablation studies show that these gains come from two key factors: the broad procedural coverage of the source trajectories and our decomposition and retrieval design, which together enable effective extraction and reuse of procedural knowledge.
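Abstract (reference translation): Test-time scaling has emerged as an effective way to improve language models on difficult reasoning tasks. However, most existing methods handle each problem in isolation and do not systematically reuse knowledge from prior reasoning trajectories. In particular, they underutilize procedural knowledge: how to reframe a problem, choose an approach, and verify or backtrack when needed. This paper introduces Reasoning Memory, a retrieval-augmented generation (RAG) framework for reasoning models that explicitly retrieves and reuses procedural knowledge at scale. Starting from existing corpora of step-by-step reasoning trajectories, each trajectory is decomposed into self-contained subquestion-subroutine pairs, producing a datastore of 32 million compact procedural knowledge entries. At inference time, a lightweight in-thought prompt lets the model verbalize the core subquestion, retrieve relevant subroutines within its reasoning trace, and reason under diverse retrieved subroutines as implicit procedural priors. Across six math, science, and coding benchmarks, Reasoning Memory consistently outperforms RAG with document, trajectory, and template knowledge, as well as a compute-matched test-time scaling baseline. With a higher inference budget, it improves over no retrieval by up to 19.2% and over the strongest compute-matched baseline by 7.9% across task types. Ablation studies show that these gains come from two key factors: the broad procedural coverage of the source trajectories, and the decomposition and retrieval design, which together enable effective extraction and reuse of procedural knowledge.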
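The retrieval step described in the abstract (verbalize a core subquestion, look up matching subroutines, reason with them as priors) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words cosine scorer, the three-entry toy datastore, the prompt layout, and the names `Entry`, `retrieve`, and `build_context` are all assumptions made for illustration, since the abstract does not specify the retriever or the prompt format.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    """One procedural knowledge entry: a subquestion paired with a subroutine."""
    subquestion: str
    subroutine: str


# Toy stand-in for the 32-million-entry datastore (contents are hypothetical).
DATASTORE = [
    Entry("How do I count the number of integer solutions to a linear equation?",
          "Rewrite the count as a stars-and-bars arrangement and apply the binomial formula."),
    Entry("How do I check whether a candidate root satisfies a polynomial equation?",
          "Substitute the candidate into the polynomial and confirm the result is zero."),
    Entry("How do I find the shortest path in an unweighted graph?",
          "Run breadth-first search from the source and stop when the target is dequeued."),
]


def tokenize(text: str) -> Counter:
    """Lowercased bag of words; an assumed substitute for a learned encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(datastore: List[Entry], subquestion: str, k: int = 2) -> List[Entry]:
    """Return the k entries whose stored subquestion best matches the query."""
    query = tokenize(subquestion)
    ranked = sorted(datastore,
                    key=lambda e: cosine(query, tokenize(e.subquestion)),
                    reverse=True)
    return ranked[:k]


def build_context(subquestion: str, retrieved: List[Entry]) -> str:
    """Format retrieved subroutines as text to place inside the reasoning trace."""
    lines = [f"Core subquestion: {subquestion}", "Retrieved subroutines:"]
    for i, entry in enumerate(retrieved, start=1):
        lines.append(f"{i}. {entry.subroutine}")
    return "\n".join(lines)
```

Calling `retrieve(DATASTORE, "shortest path between two nodes in an unweighted graph")` ranks the breadth-first search entry first, and `build_context` turns the hits into a short block of text that a model could condition on. A real system at the scale the abstract describes would presumably need a dense or indexed retriever rather than this linear scan.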
