Fugu-MT paper translation (abstract): Gradient-Flow Optimization as Dynamic Random-Effects Inference: Testing and Early Stopping with Applications to Deep Learning

Paper overview: Gradient-Flow Optimization as Dynamic Random-Effects Inference: Testing and Early Stopping with Applications to Deep Learning

arxiv url: http://arxiv.org/abs/2605.27991v2
Date: Thu, 04 Jun 2026 16:06:53 GMT
Status: translation complete
Last updated in system: 2026-06-06 06:55:34.578959
Title: Gradient-Flow Optimization as Dynamic Random-Effects Inference: Testing and Early Stopping with Applications to Deep Learning
Title (reference translation): Gradient-Flow Optimization as Dynamic Random-Effects Inference: Testing and Early Stopping with Applications to Deep Learning
Authors: Minhao Yao, Ruoyu Wang, Xihong Lin, Lin Liu, Zhonghua Liu
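Abstract summary: We develop a statistical inference framework for gradient-flow training. Training time becomes a variance-component parameter that governs how variance is reallocated. Deep learning models in fixed-kernel gradient regimes provide modern-AI instantiations of the theory.
Reference score (independently computed attention score): 16.158545640309438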
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Gradient-flow optimization is usually viewed as an algorithmic procedure for minimizing empirical loss, with training duration selected by validation or heuristic early-stopping rules. We develop a statistical inference framework for the gradient-flow training trajectory itself. The central object is fixed-operator squared-error gradient flow: whenever the fitted value evolves through a time-invariant positive semidefinite training operator, the trained model output at each training time is exactly equivalent to the best linear unbiased predictor, or empirical-Bayes posterior mean, under a corresponding random-effects model. Under this representation, training time becomes a variance-component parameter governing how variance is reallocated from residual noise to structured signal. This turns two basic training decisions into inferential problems. First, whether training is needed is formulated as a variance-component test for signal beyond initialization. Second, how long to train is formulated as restricted maximum likelihood (REML) estimation of the training-time variance component. The resulting REML-guided early stopping rule has a spectral interpretation: it selects the training time at which optimized spectral losses become empirically decorrelated from the eigenvalues of the training operator, yielding an effective degrees-of-freedom measure for the evolving trained model. We establish asymptotic prediction optimality for fixed-design in-sample risk and, under additional kernel regularity conditions, random-design out-of-sample risk. Deep learning models in fixed-kernel gradient regimes provide canonical modern-AI instantiations of the theory. Numerical experiments and a UK Biobank proteomics application show that the proposed inferential approach attains competitive prediction accuracy while reducing the reliance on validation splits and repeated checkpoint evaluation.
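Abstract (reference translation): Gradient-flow optimization is usually regarded as an algorithmic procedure for minimizing empirical loss, with the training duration chosen by validation or heuristic early-stopping rules. In this work, we develop a statistical inference framework for the gradient-flow training trajectory itself. The central object is fixed-operator squared-error gradient flow: whenever the fitted values evolve through a time-invariant positive semidefinite training operator, the trained model output at each training time is exactly equivalent to the best linear unbiased predictor, or empirical-Bayes posterior mean, under a corresponding random-effects model. Under this representation, training time becomes a variance-component parameter that determines how variance is reallocated from residual noise to structured signal. This converts two basic training decisions into inference problems. First, whether training is needed is formulated as a variance-component test for signal beyond initialization. Second, how long to train is formulated as restricted maximum likelihood (REML) estimation of the training-time variance component. The resulting REML-guided early stopping rule selects the training time at which the optimized spectral losses become empirically decorrelated from the eigenvalues of the training operator, yielding an effective degrees-of-freedom measure for the evolving trained model. We establish asymptotic prediction optimality for fixed-design in-sample risk and, under additional kernel regularity conditions, for random-design out-of-sample risk. Deep learning models in fixed-kernel gradient regimes provide canonical modern-AI instantiations of the theory. Numerical experiments and a UK Biobank proteomics application show that the proposed inferential approach achieves competitive prediction accuracy while reducing reliance on validation splits and repeated checkpoint evaluation.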
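
The equivalence stated in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes zero initialization, a fixed kernel matrix `K`, and one convenient random-effects parameterization, y = b + e with e ~ N(0, sigma2 I) and b ~ N(0, sigma2 (exp(tK) - I)), under which the BLUP of b has shrinkage factors 1 - exp(-t * lambda_i), the same as the gradient-flow solution of df/dt = -K (f - y). All function names and the simulated data are hypothetical, and the paper's own parameterization may differ.

```python
import numpy as np

def gradient_flow_fit(K, y, t):
    """Closed-form solution f(t) = (I - exp(-tK)) y of df/dt = -K (f - y), f(0) = 0."""
    lam, U = np.linalg.eigh(K)
    shrink = 1.0 - np.exp(-t * lam)          # per-eigenvalue shrinkage factor
    return U @ (shrink * (U.T @ y))

def blup_fit(K, y, t, sigma2=1.0):
    """BLUP of b in y = b + e under the assumed covariance sigma2 * (exp(tK) - I)."""
    lam, U = np.linalg.eigh(K)
    G = U @ np.diag(sigma2 * np.expm1(t * lam)) @ U.T   # random-effect covariance
    return G @ np.linalg.solve(G + sigma2 * np.eye(len(y)), y)

def euler_fit(K, y, t, steps=5000):
    """Explicit Euler integration of the same flow, as an independent check."""
    f = np.zeros_like(y)
    dt = t / steps
    for _ in range(steps):
        f = f - dt * (K @ (f - y))
    return f

# Simulated data (illustrative only): a PSD kernel built from random features.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
K = X @ X.T / 5
y = rng.normal(size=30)
t = 0.7

f_flow = gradient_flow_fit(K, y, t)
f_blup = blup_fit(K, y, t)
f_euler = euler_fit(K, y, t)
print(np.max(np.abs(f_flow - f_blup)), np.max(np.abs(f_flow - f_euler)))
```

In this parameterization the noise variance `sigma2` cancels out of the predictor, so the training time `t` alone controls how much of each eigen-direction of `K` is attributed to signal rather than noise, which is the sense in which training time acts like a variance-component parameter here.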

Related papers