Fugu-MT paper translation (abstract): A mean-field theory of lazy training in two-layer neural nets: entropic regularization and controlled McKean-Vlasov dynamics

Paper abstract: A mean-field theory of lazy training in two-layer neural nets: entropic regularization and controlled McKean-Vlasov dynamics

arxiv url: http://arxiv.org/abs/2002.01987v3
Date: Mon, 23 Mar 2020 21:47:47 GMT
Status: translation complete
Last updated in system: 2023-01-03 21:21:06.879836
Title: A mean-field theory of lazy training in two-layer neural nets: entropic regularization and controlled McKean-Vlasov dynamics
Authors: Belinda Tzen and Maxim Raginsky
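Abstract summary: We consider the problem of universal approximation of functions by two-layer neural nets with "nearly Gaussian" random weights. This problem is motivated by recent work on lazy training, in which the weight updates generated by gradient descent do not move appreciably from the i.i.d. initialization. We show that the problem can be phrased as global minimization of a free-energy functional on the space of probability measures over the weights.
Reference score (attention score computed by this site): 12.754502898545555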
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider the problem of universal approximation of functions by two-layer neural nets with random weights that are "nearly Gaussian" in the sense of Kullback-Leibler divergence. This problem is motivated by recent works on lazy training, where the weight updates generated by stochastic gradient descent do not move appreciably from the i.i.d. Gaussian initialization. We first consider the mean-field limit, where the finite population of neurons in the hidden layer is replaced by a continual ensemble, and show that our problem can be phrased as global minimization of a free-energy functional on the space of probability measures over the weights. This functional trades off the $L^2$ approximation risk against the KL divergence with respect to a centered Gaussian prior. We characterize the unique global minimizer and then construct a controlled nonlinear dynamics in the space of probability measures over weights that solves a McKean--Vlasov optimal control problem. This control problem is closely related to the Schr\"odinger bridge (or entropic optimal transport) problem, and its value is proportional to the minimum of the free energy. Finally, we show that SGD in the lazy training regime (which can be ensured by jointly tuning the variance of the Gaussian prior and the entropic regularization parameter) serves as a greedy approximation to the optimal McKean--Vlasov distributional dynamics and provide quantitative guarantees on the $L^2$ approximation error.
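The trade-off described in the abstract can be written schematically as below. This is only a sketch under assumed notation, which may differ from the paper's: $f$ is the target function, $\hat f_\mu$ is the output of the mean-field network whose hidden-layer weights are distributed according to $\mu$, $\phi$ is a generic neuron activation, $\gamma$ is the centered Gaussian prior, and $\beta > 0$ is the entropic regularization parameter.

$$
F_\beta(\mu) \;=\; \frac{1}{2}\,\bigl\| f - \hat f_\mu \bigr\|_{L^2}^2 \;+\; \frac{1}{\beta}\, D_{\mathrm{KL}}(\mu \,\|\, \gamma),
\qquad
\hat f_\mu(x) \;=\; \int \phi(x; w)\, \mu(\mathrm{d}w).
$$

In this form, a large KL penalty keeps the minimizing $\mu$ close to the Gaussian prior, which is the sense in which the weights stay "nearly Gaussian".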

Related papers

Finite-Time Information-Theoretic Bounds in Queueing Control [54.11376591632282]
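For a scheduling problem on a queueing network, this paper derives a new policy targeting the total queue length. The results reveal fundamental limitations of "drift-only" methods and point the way toward principled, non-asymptotic optimality in queueing control.
Paper reference translation (metadata) (2025-06-23T04:14:40Z)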
Weak coupling limit for quantum systems with unbounded weakly commuting system operators [50.24983453990065]
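This work is devoted to a rigorous analysis of the weak coupling limit (WCL) for the reduced dynamics of an open infinite-dimensional quantum system interacting with an electromagnetic field or with a reservoir formed by Fermi or Bose particles. We derive the weak coupling limit for the reservoir statistics under the condition that the terms of the multi-point correlation functions of the reservoir are nonzero in the WCL. We prove that the resulting reduced system dynamics converges to a unitary dynamics with a modified Hamiltonian, which can be interpreted as a Lamb shift of the original Hamiltonian.
Paper reference translation (metadata) (2025-05-13T05:32:34Z)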
Mirror Mean-Field Langevin Dynamics [0.09208007322096533]
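We propose mirror mean-field Langevin dynamics (MMFLD) and study the optimization of probability measures constrained to a convex subset of $\mathbb{R}^d$. We obtain linear convergence guarantees for the continuous MMFLD via a uniform log-Sobolev inequality, together with uniform-in-time propagation of chaos results for its time and particle discretization.
Paper reference translation (metadata) (2025-05-05T12:49:42Z)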
Mean-field underdamped Langevin dynamics and its spacetime discretization [5.832709207282124]
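We propose a new method, the N-particle underdamped Langevin algorithm, for optimizing a special class of nonlinear functionals defined on the space of probability measures. The algorithm is based on a spacetime discretization of the mean-field underdamped Langevin dynamics.
Paper reference translation (metadata) (2023-12-26T23:59:04Z)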
Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems [78.96969465641024]
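We extend mean-field Langevin dynamics, for the first time, to minimax optimization over probability distributions, with symmetric and provably convergent updates. We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
Paper reference translation (metadata) (2023-12-02T13:01:29Z)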
Projected Langevin dynamics and a gradient flow for entropic optimal transport [0.8057006406834466]
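We introduce an analogous diffusion dynamics that samples from entropy-regularized optimal transport. By studying the induced Wasserstein geometry of the submanifold $\Pi(\mu,\nu)$, we argue that the SDE can be viewed as a Wasserstein gradient flow on this space of couplings.
Paper reference translation (metadata) (2023-09-15T17:55:56Z)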
Wasserstein Quantum Monte Carlo: A Novel Approach for Solving the Quantum Many-Body Schr\"odinger Equation [56.9919517199927]
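Wasserstein Quantum Monte Carlo (WQMC) uses the gradient flow induced by the Wasserstein metric rather than the Fisher-Rao metric, which corresponds to transporting probability mass rather than teleporting it. We demonstrate empirically that the dynamics of WQMC leads to faster convergence to the ground state of molecular systems.
Paper reference translation (metadata) (2023-07-06T17:54:08Z)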
Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction [49.66486092259376]
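Mean-field Langevin dynamics (MFLD) is a nonlinear generalization of Langevin dynamics that includes a distribution-dependent drift. Recent work has shown that MFLD globally minimizes an entropy-regularized convex functional on the space of measures. We provide a framework for establishing uniform-in-time propagation of chaos for MFLD that accounts for the errors due to finite-particle approximation, time discretization, and stochastic gradient approximation.
Paper reference translation (metadata) (2023-06-12T16:28:11Z)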
Accelerating Convergence in Global Non-Convex Optimization with Reversible Diffusion [0.0]
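Langevin dynamics is widely used in global non-convex optimization. The proposed method studies the trade-off between speed and discretization error.
Paper reference translation (metadata) (2023-05-19T07:49:40Z)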
Trajectory Inference via Mean-field Langevin in Path Space [0.17205106391379024]
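Trajectory inference aims to recover the dynamics of a population from snapshots of its temporal marginals. A min-entropy estimator relative to the Wiener measure in path space was introduced by Lavenant et al.
Paper reference translation (metadata) (2022-05-14T23:13:00Z)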
Convex Analysis of the Mean Field Langevin Dynamics [49.66486092259375]
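We present a convergence rate analysis of the mean-field Langevin dynamics. The $p_q$ associated with the dynamics allows us to develop a convergence theory that parallels classical results in convex optimization.
Paper reference translation (metadata) (2022-01-25T17:13:56Z)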
Lifting the Convex Conjugate in Lagrangian Relaxations: A Tractable Approach for Continuous Markov Random Fields [53.31927549039624]
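We show that a piecewise discretization is consistent with existing discretized problems. We apply the theory to the problem of matching two images.
Paper reference translation (metadata) (2021-07-13T12:31:06Z)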
A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
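We propose a new numerical scheme that optimizes the gradient flow for learning energy-based models (EBMs). From the Fokker-Planck equation we derive a second-order Wasserstein gradient flow of the global relative entropy. Compared with existing schemes, the Wasserstein gradient flow is a smoother, near-optimal numerical scheme for approximating the real data density.
Paper reference translation (metadata) (2019-10-31T02:26:20Z)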
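
Several entries above describe mean-field Langevin dynamics as Langevin dynamics with a distribution-dependent drift, studied through finite-particle and time discretizations. The following is a minimal NumPy sketch of such a particle discretization, not the method of any listed paper. It assumes a toy two-layer network f(x) = mean_i tanh(w_i . x), a squared loss, a standard Gaussian prior, and an entropic weight `lam`. The names `mfld_step` and `run_mfld` are hypothetical.

```python
import numpy as np

def mfld_step(W, X, y, step, lam, rng):
    """One Euler-Maruyama step of noisy gradient descent on N particles.

    W : (N, d) particle weights; the network output is mean_i tanh(w_i . x).
    Assumed objective: squared risk + lam * KL(mu || N(0, I)).
    """
    act = np.tanh(X @ W.T)                 # (n, N) hidden activations
    resid = act.mean(axis=1) - y           # (n,) residual of the averaged network
    # Gradient in w of the first variation of the risk, one row per particle.
    grad_risk = ((resid[:, None] * (1.0 - act**2)).T @ X) / X.shape[0]
    # The Gaussian prior contributes a linear confining drift lam * w.
    drift = grad_risk + lam * W
    noise = rng.standard_normal(W.shape)
    return W - step * drift + np.sqrt(2.0 * lam * step) * noise

def run_mfld(n_particles=200, dim=2, n_samples=64, n_steps=500,
             step=0.05, lam=0.01, seed=0):
    """Run the particle system from an i.i.d. Gaussian initialization."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, dim))
    y = np.tanh(X[:, 0])                   # toy target, assumed for illustration
    W = rng.standard_normal((n_particles, dim))
    for _ in range(n_steps):
        W = mfld_step(W, X, y, step, lam, rng)
    return W
```

The drift of each particle depends on the other particles only through the averaged output, which is the sense in which it is distribution-dependent. Setting `lam` to zero removes both the noise and the confinement and leaves plain gradient descent.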

The related papers list is generated automatically from the titles and abstracts of papers on this site.
