Fugu-MT Paper Translation (Abstract): FLOP-Efficient Training: Early Stopping Based on Test-Time Compute Awareness

Paper summary: FLOP-Efficient Training: Early Stopping Based on Test-Time Compute Awareness

arxiv url: http://arxiv.org/abs/2601.01332v1
Date: Sun, 04 Jan 2026 02:33:30 GMT
Status: Translation complete
Last updated in system: 2026-01-06 16:25:22.233441
Title: FLOP-Efficient Training: Early Stopping Based on Test-Time Compute Awareness
Authors: Hossam Amer, Maryam Dialameh, Hossein Rajabzadeh, Walid Ahmed, Weiwei Zhang, Yang Liu
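Abstract summary: Scaling training compute, measured in FLOPs, has long been used to improve the accuracy of large language models. We introduce TTC-aware training, in which an intermediate checkpoint and a corresponding TTC configuration together match or exceed the accuracy of a fully trained model. Building on this insight, we propose an early stopping algorithm that jointly selects a checkpoint and a TTC configuration to minimize training compute without sacrificing accuracy.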
Reference score (independently computed attention score): 5.2612663135589175
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Scaling training compute, measured in FLOPs, has long been shown to improve the accuracy of large language models, yet training remains resource-intensive. Prior work shows that increasing test-time compute (TTC), for example through iterative sampling, can allow smaller models to rival or surpass much larger ones at lower overall cost. We introduce TTC-aware training, where an intermediate checkpoint and a corresponding TTC configuration can together match or exceed the accuracy of a fully trained model while requiring substantially fewer training FLOPs. Building on this insight, we propose an early stopping algorithm that jointly selects a checkpoint and TTC configuration to minimize training compute without sacrificing accuracy. To make this practical, we develop an efficient TTC evaluation method that avoids exhaustive search, and we formalize a break-even bound that identifies when increased inference compute compensates for reduced training compute. Experiments demonstrate up to 92% reductions in training FLOPs while maintaining and sometimes remarkably improving accuracy. These results highlight a new perspective for balancing training and inference compute in model development, enabling faster deployment cycles and more frequent model refreshes. Code will be publicly released.
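
Illustrative sketch (not the authors' algorithm): the abstract does not give the selection procedure or the exact form of the break-even bound, so the Python below is only one simple reading of the idea. It assumes that each (checkpoint, TTC configuration) pair has already been evaluated, picks the pair with the fewest training FLOPs that meets a target accuracy, and computes a naive break-even query count as training FLOPs saved divided by extra inference FLOPs per query. The names `Candidate`, `select_early_stop`, and `break_even_queries`, the `num_samples` knob, and every number are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """One evaluated (checkpoint, TTC configuration) pair. All fields are assumptions."""
    train_flops: float            # cumulative training FLOPs at this checkpoint
    num_samples: int              # hypothetical TTC knob: samples drawn per query
    infer_flops_per_query: float  # inference FLOPs per query under this TTC setting
    accuracy: float               # measured validation accuracy


def select_early_stop(candidates: List[Candidate],
                      target_accuracy: float) -> Optional[Candidate]:
    """Return the feasible pair with the fewest training FLOPs.

    Ties on training FLOPs are broken by lower inference cost.
    Returns None when no pair reaches the target accuracy.
    """
    feasible = [c for c in candidates if c.accuracy >= target_accuracy]
    if not feasible:
        return None
    return min(feasible, key=lambda c: (c.train_flops, c.infer_flops_per_query))


def break_even_queries(full_train_flops: float,
                       full_infer_flops_per_query: float,
                       chosen: Candidate) -> float:
    """Queries served before the extra inference cost cancels the training savings.

    Naive accounting: saved training FLOPs / extra inference FLOPs per query.
    """
    saved = full_train_flops - chosen.train_flops
    extra = chosen.infer_flops_per_query - full_infer_flops_per_query
    if extra <= 0:
        return float("inf")  # the chosen pair is never more expensive to serve
    return saved / extra


# Made-up numbers for illustration only; these are not results from the paper.
FULL_TRAIN_FLOPS = 1e21
FULL_INFER_FLOPS = 1e12
FULL_ACCURACY = 0.70

candidates = [
    Candidate(train_flops=2e20, num_samples=1, infer_flops_per_query=1e12, accuracy=0.55),
    Candidate(train_flops=2e20, num_samples=8, infer_flops_per_query=8e12, accuracy=0.71),
    Candidate(train_flops=5e20, num_samples=1, infer_flops_per_query=1e12, accuracy=0.64),
    Candidate(train_flops=5e20, num_samples=4, infer_flops_per_query=4e12, accuracy=0.73),
]

chosen = select_early_stop(candidates, target_accuracy=FULL_ACCURACY)
if chosen is not None:
    queries = break_even_queries(FULL_TRAIN_FLOPS, FULL_INFER_FLOPS, chosen)
    reduction = 1.0 - chosen.train_flops / FULL_TRAIN_FLOPS
    print(f"stop at {chosen.train_flops:.1e} training FLOPs "
          f"with {chosen.num_samples} samples per query")
    print(f"training FLOPs reduced by {reduction:.0%}; "
          f"break-even after about {queries:.3g} queries")
```

Under this accounting, the early-stopped model is cheaper overall only while the deployment serves fewer queries than the break-even count; beyond that point the higher per-query inference cost outweighs the training savings.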
