Fugu-MT paper translation (abstract): Efficient Reasoning via Thought-Training and Thought-Free Inference

Paper abstract: Efficient Reasoning via Thought-Training and Thought-Free Inference

arxiv url: http://arxiv.org/abs/2511.03408v1
Date: Wed, 05 Nov 2025 12:20:45 GMT
Status: translation complete
System update date: 2025-11-06 18:19:32.422178
Title: Efficient Reasoning via Thought-Training and Thought-Free Inference
Title (reference translation): Efficient Reasoning through Thought Training and Thought-Free Inference
Authors: Canhui Wu, Qiong Cao, Chao Xue, Wei Xi, Xiaodong He
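Abstract summary: 3TF (Thought-Training and Thought-Free inference) is a framework for efficient reasoning that takes a Short-to-Long perspective. It first trains a hybrid model that can operate in both reasoning and non-reasoning modes, then further trains it on CoT-annotated data to internalize structured reasoning. Unlike compression-based approaches, 3TF improves the reasoning quality of non-reasoning outputs, enabling models to reason implicitly while keeping external outputs short.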
Reference score (independently computed attention metric): 26.7513102215969
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in large language models (LLMs) have leveraged explicit Chain-of-Thought (CoT) prompting to improve reasoning accuracy. However, most existing methods primarily compress verbose reasoning outputs. These Long-to-Short transformations aim to improve efficiency, but still rely on explicit reasoning during inference. In this work, we introduce 3TF (Thought-Training and Thought-Free inference), a framework for efficient reasoning that takes a Short-to-Long perspective. We first train a hybrid model that can operate in both reasoning and non-reasoning modes, and then further train it on CoT-annotated data to internalize structured reasoning, while enforcing concise, thought-free outputs at inference time using the no-reasoning mode. Unlike compression-based approaches, 3TF improves the reasoning quality of non-reasoning outputs, enabling models to perform rich internal reasoning implicitly while keeping external outputs short. Empirically, 3TF-trained models obtain large improvements on reasoning benchmarks under thought-free inference, demonstrating that high-quality reasoning can be learned and executed implicitly without explicit step-by-step generation.
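Abstract (reference translation): Recent advances in large language models (LLMs) have leveraged explicit Chain-of-Thought (CoT) to improve reasoning accuracy. However, most existing methods mainly compress verbose reasoning outputs. These Long-to-Short transformations aim to improve efficiency but still rely on explicit reasoning at inference time. This work introduces 3TF (Thought-Training and Thought-Free inference), a framework for efficient reasoning from a Short-to-Long perspective. A hybrid model that can operate in both reasoning and non-reasoning modes is trained first and then further trained on CoT-annotated data to internalize structured reasoning. Unlike compression-based approaches, 3TF improves the reasoning quality of non-reasoning outputs, allowing the model to perform rich internal reasoning implicitly while keeping external outputs short. Empirically, 3TF-trained models show large improvements on reasoning benchmarks under thought-free inference, indicating that high-quality reasoning can be learned and executed implicitly without explicit step-by-step generation.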
