Fugu-MT Paper Translation (Summary): Information-Theoretic Generalization Bounds for Stochastic Gradient Descent with Predictable Virtual Noise

Paper summary: Information-Theoretic Generalization Bounds for Stochastic Gradient Descent with Predictable Virtual Noise

arxiv url: http://arxiv.org/abs/2605.00064v1
Date: Thu, 30 Apr 2026 08:54:42 GMT
Status: Translation complete
Last updated in system: 2026-05-04 17:43:28.665737
Title: Information-Theoretic Generalization Bounds for Stochastic Gradient Descent with Predictable Virtual Noise
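Title (reference translation): Information-theoretic generalization bounds for stochastic gradient descent with predictable virtual noise
Authors: Mohammad Partohaghighi
Abstract summary: The perturbation covariance at each iteration depends on the past real SGD history but not on current or future randomness. This predictability enables a conditional Gaussian relative-entropy argument and yields generalization bounds for SGD with adaptive virtual-noise geometry. The framework recovers fixed isotropic and geometry-aware bounds as special cases while extending virtual perturbation analysis to history-dependent SGD without modifying the algorithm.
Reference score (independently computed attention level): 0.0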
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Information-theoretic generalization bounds analyze stochastic optimization by relating expected generalization error to the mutual information between learned parameters and training data. Virtual perturbation analyses of SGD add auxiliary Gaussian noise only in the proof, making mutual information tractable while leaving the actual SGD trajectory unchanged. Existing bounds, however, typically require perturbation covariances to be fixed independently of the optimization history, limiting their ability to represent geometries induced by moving gradient statistics, preconditioners, curvature proxies, and other pathwise information. We introduce predictable history-adaptive virtual perturbations, where the perturbation covariance at each iteration may depend on the past real SGD history but not on current or future randomness. This predictability enables a conditional Gaussian relative-entropy argument and yields generalization bounds for SGD with adaptive virtual-noise geometry. The bounds replace fixed sensitivity and gradient-deviation terms with conditional adaptive counterparts, include an output-sensitivity penalty from accumulated perturbation covariance, and reduce the deviation term to a conditional variance only under conditional unbiasedness. Since adaptive covariances may be data-dependent, we separate local Gaussian smoothing from global reference-kernel comparison. The resulting bound includes a covariance-comparison cost measuring the KL price of using an admissible reference geometry different from the actual adaptive covariance. Fixed-noise-style bounds are recovered under admissible synchronization, such as deterministic, public, or prefix-observable covariance rules. The framework recovers fixed isotropic and geometry-aware bounds as special cases while extending virtual perturbation analysis to history-dependent SGD without modifying the algorithm.
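Abstract (reference translation): Information-theoretic generalization bounds analyze stochastic optimization by relating the expected generalization error to the mutual information between the learned parameters and the training data. Virtual perturbation analyses of SGD add auxiliary Gaussian noise only in the proof, which makes the mutual information tractable while leaving the actual SGD trajectory unchanged. Existing bounds, however, usually require the perturbation covariances to be fixed independently of the optimization history, which limits their ability to represent geometries induced by moving gradient statistics, preconditioners, curvature proxies, and other pathwise information. The paper introduces predictable history-adaptive virtual perturbations, in which the perturbation covariance at each iteration may depend on the past real SGD history but not on current or future randomness. This predictability enables a conditional Gaussian relative-entropy argument and yields generalization bounds for SGD with adaptive virtual-noise geometry. The bounds replace the fixed sensitivity and gradient-deviation terms with conditional adaptive counterparts, include an output-sensitivity penalty from the accumulated perturbation covariance, and reduce the deviation term to a conditional variance only under conditional unbiasedness. Because adaptive covariances may be data-dependent, local Gaussian smoothing is separated from the global reference-kernel comparison. The resulting bound includes a covariance-comparison cost that measures the KL price of using an admissible reference geometry different from the actual adaptive covariance. Fixed-noise-style bounds are recovered under admissible synchronization, such as deterministic, public, or prefix-observable covariance rules. The framework recovers fixed isotropic and geometry-aware bounds as special cases while extending virtual perturbation analysis to history-dependent SGD without modifying the algorithm.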
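
The central idea in the abstract, virtual noise whose covariance is fixed from the past SGD history before the current step's randomness is drawn, can be illustrated with a small toy sketch. This is not the paper's construction: the 1-D least-squares problem, the function names `run_sgd` and `gaussian_kl`, and the covariance rule `scale * (floor + grad_sq_avg)` are all assumptions made here for illustration. The sketch only shows that the real iterates are identical with and without the proof-only noise, and evaluates the standard scalar Gaussian KL formula for a mean shift and for a variance mismatch.

```python
import math
import random


def gaussian_kl(mean_p, var_p, mean_q, var_q):
    """KL( N(mean_p, var_p) || N(mean_q, var_q) ) for scalar Gaussians."""
    ratio = var_p / var_q
    return 0.5 * (ratio - 1.0 - math.log(ratio) + (mean_p - mean_q) ** 2 / var_q)


def run_sgd(data, steps=100, lr=0.05, scale=0.01, floor=1e-3, seed=0, virtual=True):
    """Plain SGD on a 1-D least-squares toy problem, plus a proof-only
    virtual iterate whose noise variance is fixed from past history."""
    sample_rng = random.Random(seed)       # randomness of the real algorithm
    noise_rng = random.Random(seed + 1)    # separate stream for virtual noise
    w = 0.0             # real SGD iterate
    noise_sum = 0.0     # accumulated virtual noise (analysis only)
    grad_sq_avg = 0.0   # running statistic of PAST stochastic gradients
    real_path, variances = [], []
    for _ in range(steps):
        # Predictable: var_t is computed before this step's sample is drawn.
        var_t = scale * (floor + grad_sq_avg)  # hypothetical covariance rule
        x, y = data[sample_rng.randrange(len(data))]
        grad = (w * x - y) * x
        w -= lr * grad                         # real update, noise-free
        if virtual:
            noise_sum += noise_rng.gauss(0.0, math.sqrt(var_t))
        grad_sq_avg = 0.9 * grad_sq_avg + 0.1 * grad * grad
        real_path.append(w)
        variances.append(var_t)
    # The virtual iterate is the real iterate plus the accumulated noise.
    return real_path, w + noise_sum, variances


data = [(x / 10.0, 2.0 * x / 10.0) for x in range(1, 11)]
path_with, w_virtual, variances = run_sgd(data, virtual=True)
path_without, w_plain, _ = run_sgd(data, virtual=False)

same_trajectory = path_with == path_without  # virtual noise never enters SGD
accumulated_var = sum(variances)             # total variance of the accumulated noise
# Mean shift under one shared predictable variance (conditional Gaussian step).
shift_cost = gaussian_kl(0.1, variances[-1], 0.0, variances[-1])
# Same mean, but a reference variance that differs from the adaptive one.
mismatch_cost = gaussian_kl(0.0, variances[-1], 0.0, 2.0 * variances[-1])
```

In this sketch `same_trajectory` is true because the virtual noise is drawn from its own random stream and is never fed back into the update. `shift_cost` and `mismatch_cost` are only loose analogues of the conditional relative-entropy term and the covariance-comparison cost named in the abstract; the paper's actual bounds are not reproduced here.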

Related papers