Fugu-MT Paper Translation (Abstract): Chain-in-Tree: Back to Sequential Reasoning in LLM Tree Search

Paper overview: Chain-in-Tree: Back to Sequential Reasoning in LLM Tree Search

arxiv url: http://arxiv.org/abs/2509.25835v2
Date: Wed, 01 Oct 2025 04:57:48 GMT
Status: Translation complete
Last updated in system: 2025-10-02 12:11:26.80549
Title: Chain-in-Tree: Back to Sequential Reasoning in LLM Tree Search
Title (reference translation): Chain-in-Tree: A Return to Sequential Reasoning in LLM Tree Search
Authors: Xinzhe Li
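Abstract summary: Test-time scaling lets language models improve on long-horizon reasoning tasks by allocating additional compute at inference. CiT is a framework that adaptively decides when to branch during search rather than branching at every step. The authors integrate CiT into three representative LLM-in-the-loop tree search frameworks, Tree of Thoughts (ToT-BS), ReST-MCTS, and RAP, and evaluate on GSM8K and Math500.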
Reference score (independently computed attention score): 4.12237459236889
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Test-time scaling enables large language models (LLMs) to improve performance on long-horizon reasoning tasks by allocating additional compute at inference. Tree-search-based approaches achieve state-of-the-art results in this setting, but they are notoriously inefficient, often an order of magnitude slower than simpler iterative methods. We introduce Chain-in-Tree (CiT), a plug-in framework that adaptively decides when to branch during search rather than branching at every step. CiT relies on lightweight Branching Necessity (BN) evaluation methods: BN-DP (Direct Prompting), where an auxiliary LLM directly judges whether a step requires branching, and BN-SC (Self-Consistency), which clusters multiple candidate actions to estimate agreement. We integrate CiT into three representative LLM-in-the-loop tree search frameworks: Tree of Thoughts (ToT-BS), ReST-MCTS, and RAP, and evaluate across GSM8K and Math500. Our results show that: (1) BN-DP consistently reduces token generation, model invocations, and runtime by 75-85 percent across all settings, with negligible accuracy loss and sometimes accuracy gains; (2) BN-SC typically yields substantial savings (up to 80 percent) but shows instability in 1-4 out of 14 settings, caused by a small subset of examples that produce very long reasoning steps; (3) the quality of auxiliary LLMs is critical, not only the BN evaluator in BN-DP, but also the models used in BN-SC for clustering and equivalence checking. When these roles are filled by smaller LLMs, performance degrades. Importantly, BN-SC does not require LLMs in domains with deterministic action spaces, where clustering can be done programmatically. We also provide a theoretical guarantee that BN-DP never increases LLM invocations relative to the baseline and release a unified implementation of CiT across ToT-BS, ReST-MCTS, and RAP to facilitate reproducibility and extension.
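Abstract (reference translation): Test-time scaling allows large language models (LLMs) to improve performance on long-horizon reasoning tasks by allocating additional compute at inference. Tree-search-based approaches achieve state-of-the-art results in this setting, but they are highly inefficient, often an order of magnitude slower than simpler iterative methods. The paper introduces Chain-in-Tree (CiT), a plug-in framework that adaptively decides when to branch during search instead of branching at every step. CiT relies on lightweight Branching Necessity (BN) evaluation methods: BN-DP (Direct Prompting), in which an auxiliary LLM directly judges whether a step requires branching, and BN-SC (Self-Consistency), which clusters multiple candidate actions to estimate agreement. CiT is integrated into three representative LLM-in-the-loop tree search frameworks, Tree of Thoughts (ToT-BS), ReST-MCTS, and RAP, and evaluated on GSM8K and Math500. The results show that: (1) BN-DP consistently reduces token generation, model invocations, and runtime by 75-85 percent across all settings, with negligible accuracy loss and sometimes accuracy gains; (2) BN-SC typically yields substantial savings (up to 80 percent) but shows instability in 1-4 out of 14 settings, caused by a small subset of examples that produce very long reasoning steps; (3) the quality of the auxiliary LLMs is critical, not only for the BN evaluator in BN-DP but also for the models used in BN-SC for clustering and equivalence checking. When these roles are filled by smaller LLMs, performance degrades. Importantly, BN-SC does not require LLMs in domains with deterministic action spaces, where clustering can be done programmatically. The paper also provides a theoretical guarantee that BN-DP never increases LLM invocations relative to the baseline, and releases a unified implementation of CiT across ToT-BS, ReST-MCTS, and RAP to facilitate reproducibility and extension.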
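
The BN-SC idea described in the abstract (cluster several candidate actions, then branch only when they disagree) can be illustrated with a minimal sketch for the deterministic-action case, where clustering is done programmatically. This is not the paper's released implementation: the function names, the whitespace- and case-insensitive equivalence key, the agreement threshold of 0.8, and the `max_children` cap are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, List, Sequence


def normalize(action: str) -> str:
    """Hypothetical equivalence key: ignore case and extra whitespace."""
    return " ".join(action.lower().split())


def bn_sc_should_branch(
    candidates: Sequence[str],
    key: Callable[[str], str] = normalize,
    agreement_threshold: float = 0.8,  # assumed value, not taken from the paper
) -> bool:
    """Return True when the sampled candidate actions disagree enough to branch."""
    if not candidates:
        raise ValueError("need at least one candidate action")
    # Cluster candidates by their equivalence key and measure the share
    # held by the largest cluster.
    clusters = Counter(key(c) for c in candidates)
    agreement = max(clusters.values()) / len(candidates)
    return agreement < agreement_threshold


def expand(
    candidates: Sequence[str],
    max_children: int = 3,  # assumed cap on branching width
    key: Callable[[str], str] = normalize,
    agreement_threshold: float = 0.8,
) -> List[str]:
    """Chain (one child) when candidates agree; branch (several children) otherwise."""
    # Keep the first-seen representative of each cluster, in sampling order.
    representatives = {}
    for c in candidates:
        representatives.setdefault(key(c), c)
    if bn_sc_should_branch(candidates, key, agreement_threshold):
        return list(representatives.values())[:max_children]
    majority_key, _ = Counter(key(c) for c in candidates).most_common(1)[0]
    return [representatives[majority_key]]
```

When `expand` returns a single action, the search continues as a chain at that step; when it returns several, the step is expanded as a tree node. In open-ended action spaces the abstract indicates that clustering and equivalence checking are handled by auxiliary LLMs, which would replace the `key` function in this sketch.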

Related papers