Fugu-MT Paper Translation (Abstract): Dual-Track CoT: Budget-Aware Stepwise Guidance for Small LMs

Paper overview: Dual-Track CoT: Budget-Aware Stepwise Guidance for Small LMs

arxiv url: http://arxiv.org/abs/2604.25039v1
Date: Mon, 27 Apr 2026 22:43:33 GMT
Status: Translation complete
System update date: 2026-04-29 16:49:17.620543
Title: Dual-Track CoT: Budget-Aware Stepwise Guidance for Small LMs
Authors: Sagnik Chatterjee, Atharva Patil, Sricharan Ramesh
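Abstract summary: Small language models struggle with multi-step reasoning under tight compute and token budgets. Existing test-time reasoning methods such as self-consistency improve performance, but often at high token cost and without fine-grained step-level control. Can Small Language Models (SLMs) reason reliably using the same or fewer tokens?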
Reference score (independently computed attention measure): 0.3823356975862005
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large Language Models (LLMs) solve many reasoning tasks via chain-of-thought (CoT) prompting, but smaller models (about 7B to 8B parameters) still struggle with multi-step reasoning under tight compute and token budgets. Existing test-time reasoning methods such as self-consistency (sampling multiple rationales and voting), Tree-of-Thoughts (search over intermediate thoughts), and critique-revise loops improve performance, but often at high token cost and without fine-grained step-level control. This project aims to address that gap: can Small Language Models (SLMs) reason reliably using the same or fewer tokens? This question is both scientific and practical. Scientifically, it probes whether process supervision and simple test-time controls (such as token budgets and rejection of redundant steps) can substitute for model scale or large sampling counts. Practically, many deployments (on-device, low-latency, or cost-constrained settings) cannot afford huge models or dozens of sampled rationales per query. A method that improves SLM reasoning at fixed cost would therefore be directly useful.

Related papers