Fugu-MT paper translation (abstract): Breaking the Reward Barrier: Accelerating Tree-of-Thought Reasoning via Speculative Exploration

Paper overview: Breaking the Reward Barrier: Accelerating Tree-of-Thought Reasoning via Speculative Exploration

arxiv url: http://arxiv.org/abs/2605.10195v2
Date: Thu, 14 May 2026 07:42:56 GMT
Status: translation complete
Last updated in system: 2026-05-15 15:19:49.878975
Title: Breaking the Reward Barrier: Accelerating Tree-of-Thought Reasoning via Speculative Exploration
Authors: Shuzhang Zhong, Haochen Huang, Shengxuan Qiu, Pengfei Zuo, Runsheng Wang, Meng Li
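Abstract summary: Tree-of-Thought (ToT) reasoning structures Large Language Model (LLM) inference as a tree-based search. The efficiency of ToT is constrained by the reward dependency barrier. The paper proposes speculative path selection to predict and expand high-potential branches of ToT.
Reference score (independently computed attention score): 9.100637658218979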
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Tree-of-Thought (ToT) reasoning structures Large Language Model (LLM) inference as a tree-based search, demonstrating strong potential for solving complex mathematical and programming tasks. However, its efficiency is constrained by the reward dependency barrier -- a synchronization bottleneck caused by sequential reward-guided exploration that limits search parallelism and introduces substantial latency. Prior system optimizations, mainly designed for linear Chain-of-Thought (CoT) reasoning, cannot address these challenges, leaving the efficiency of ToT underexplored. To enhance ToT reasoning efficiency, we observe that the reasoning paths can be explored speculatively to break the reward synchronization barrier. Therefore, in this paper, we propose SPEX and introduce three key techniques: (i) intra-query speculative path selection to predict and expand high-potential branches of ToT, (ii) inter-query budget allocation to balance speculative resource allocation across queries dynamically, and (iii) adaptive early termination to prune deep and redundant branches for a skewed search tree. We implement SPEX on top of the SGLang framework and evaluate it across diverse ToT algorithms and LLMs. Extensive experiments show that SPEX achieves $1.2 \sim 3 \times$ speedup for different ToT reasoning algorithms. Moreover, SPEX synergizes with token-level speculative decoding, achieving cumulative speedups of up to $4.1\times$. Ablation studies further confirm the contributions of each technique. Overall, SPEX represents a significant step toward efficient and scalable ToT reasoning, unlocking the parallelism required for high-performance inference-time scaling for LLMs.
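Illustration: the reward dependency barrier described in the abstract can be pictured with a toy latency model. The sketch below is not SPEX and is not taken from the paper; it is a minimal greedy tree search in which every name (`expand`, `true_reward`, `cheap_prior`, `greedy_tot`) and the unit costs `GEN` and `REWARD` are assumptions made for illustration. In the sequential mode, generating the next level waits for reward scoring. In the speculative mode, the children of a guessed branch are generated while the reward is still being computed, and a wrong guess costs one extra generation step.

```python
GEN, REWARD = 1, 1  # assumed unit costs for one generation step and one reward call


def expand(path, branching=3):
    """Return the child paths of a node (toy stand-in for LLM generation)."""
    return [path + (i,) for i in range(branching)]


def true_reward(path):
    """Toy deterministic reward; stands in for a slow reward model."""
    return (sum((i + 1) * v for i, v in enumerate(path)) * 7) % 5


def cheap_prior(path):
    """Hypothetical cheap guess of branch quality, available before the reward."""
    return path[-1]


def greedy_tot(depth, speculate):
    """Greedy (beam width 1) tree search; returns (path, latency, misses)."""
    path, latency, misses = (), GEN, 0  # the first level is always generated
    cands = expand(path)
    for level in range(depth):
        best = max(cands, key=true_reward)  # known only after the reward call
        if level == depth - 1:
            latency += REWARD  # final level: score and stop
            path = best
            break
        if speculate:
            guess = max(cands, key=cheap_prior)
            latency += max(GEN, REWARD)  # generation overlaps reward scoring
            if guess != best:
                misses += 1
                latency += GEN  # misprediction: generate the right children
        else:
            latency += REWARD + GEN  # barrier: score first, then generate
        path = best
        cands = expand(path)
    return path, latency, misses
```

Both modes select the same path, since speculation only changes when work is done, not which branch is kept. Under these assumed costs the sequential latency is `2 * depth`, while the speculative latency is `depth + 1` plus one unit per misprediction, so the benefit depends entirely on how often the cheap guess agrees with the reward.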

Related papers