Fugu-MT Paper Translation (Abstract): Demystifying Hybrid Thinking: Can LLMs Truly Switch Between Think and No-Think?

Paper summary: Demystifying Hybrid Thinking: Can LLMs Truly Switch Between Think and No-Think?

arxiv url: http://arxiv.org/abs/2510.12680v1
Date: Tue, 14 Oct 2025 16:19:44 GMT
Status: Translation complete
Last updated in system: 2025-10-15 19:02:32.388646
Title: Demystifying Hybrid Thinking: Can LLMs Truly Switch Between Think and No-Think?
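Title (reference translation): Demystifying Hybrid Thinking: Can LLMs Truly Switch Between Think and No-Think?
Authors: Shouren Wang, Wang Yang, Xianxuan Long, Qifan Wang, Vipin Chaudhary, Xiaotian Han
Abstract summary: We analyze the factors that influence controllability and identify the four that matter most. We propose a practical recipe that, compared to standard training, maintains accuracy in both modes.
Reference score (independently computed attention score): 46.403110838087194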
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Hybrid thinking enables LLMs to switch between reasoning and direct answering, offering a balance between efficiency and reasoning capability. Yet our experiments reveal that current hybrid thinking LLMs only achieve partial mode separation: reasoning behaviors often leak into the no-think mode. To understand and mitigate this, we analyze the factors influencing controllability and identify four that matter most: (1) larger data scale, (2) using think and no-think answers from different questions rather than the same question, (3) a moderate increase in no-think data number, and (4) a two-phase strategy that first trains reasoning ability and then applies hybrid think training. Building on these findings, we propose a practical recipe that, compared to standard training, can maintain accuracy in both modes while significantly reducing no-think output length (from 1085 to 585 on MATH500) and occurrences of reasoning-supportive tokens such as "wait" (from 5917 to 522 on MATH500). Our findings highlight the limitations of current hybrid thinking and offer directions for strengthening its controllability.
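Abstract (reference translation): Hybrid thinking lets LLMs switch between reasoning and direct answering, balancing efficiency against reasoning capability. However, our experiments show that current hybrid thinking LLMs achieve only partial mode separation. To understand and mitigate this, we analyze the factors that influence controllability and identify four as most important: (1) a larger data scale, (2) using think and no-think answers drawn from different questions rather than the same question, (3) a moderate increase in the amount of no-think data, and (4) a two-phase strategy that first trains reasoning ability and then applies hybrid think training. Building on these findings, we propose a practical recipe that, compared to standard training, maintains accuracy in both modes while significantly reducing no-think output length (from 1085 to 585 on MATH500) and occurrences of reasoning-supportive tokens such as "wait" (from 5917 to 522 on MATH500). This work highlights the limitations of current hybrid thinking and offers directions for improving its controllability.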
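
Illustrative sketch (not from the paper): the abstract reports two indicators of reasoning leaking into the no-think mode, output length and occurrences of reasoning-supportive tokens such as "wait". The Python below shows one way such indicators might be computed over a list of model outputs. The function name `leakage_stats`, the whitespace-based length measure, and the case-insensitive whole-word match are assumptions made for illustration; the paper's own tokenization and counting procedure are not given in the abstract and may well differ.

```python
import re
from statistics import mean

# Assumption: count "wait" as a whole word, ignoring case.
WAIT_PATTERN = re.compile(r"\bwait\b", re.IGNORECASE)


def leakage_stats(outputs):
    """Return (mean length in whitespace-separated words, total 'wait' count).

    `outputs` is a non-empty list of no-think responses as plain strings.
    Length is measured in whitespace-separated words, which is only a
    stand-in for a real tokenizer.
    """
    if not outputs:
        raise ValueError("outputs must be non-empty")
    lengths = [len(text.split()) for text in outputs]
    wait_total = sum(len(WAIT_PATTERN.findall(text)) for text in outputs)
    return mean(lengths), wait_total


if __name__ == "__main__":
    # Hypothetical responses, written here only to exercise the function.
    samples = [
        "The answer is 42.",
        "Let me check. Wait, 6 * 7 = 42. Wait... yes, the answer is 42.",
    ]
    avg_len, waits = leakage_stats(samples)
    print(f"mean length: {avg_len}, 'wait' count: {waits}")
```

Comparing these two numbers for the same prompts before and after a training change gives a rough view of whether no-think outputs are becoming shorter and less reasoning-like, in the spirit of the reductions the abstract reports.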

Related papers