Fugu-MT paper translation (abstract): Global Convergence of Gradient Descent for Score Matching in Gaussian Mixtures via Reverse Fisher Divergence

Paper overview: Global Convergence of Gradient Descent for Score Matching in Gaussian Mixtures via Reverse Fisher Divergence

arxiv url: http://arxiv.org/abs/2606.19876v1
Date: Thu, 18 Jun 2026 07:34:33 GMT
Status: translation complete
Last updated in system: 2026-06-19 18:23:39.700945
Title: Global Convergence of Gradient Descent for Score Matching in Gaussian Mixtures via Reverse Fisher Divergence
Title (reference translation): Global Convergence of Gradient Descent for Score Matching in Gaussian Mixtures via Reverse Fisher Divergence
Authors: Alexander Tyurin
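Abstract summary: This work studies the reverse Fisher divergence, where the expectation is taken with respect to the student distribution. The author proves global convergence guarantees under a $\widetilde{\Omega}(1)$-separation assumption on the target means. The proofs rely on a Lyapunov-based analysis of the gradient descent dynamics, showing that the reverse Fisher divergence has a much more favorable optimization landscape than the forward Fisher divergence.
Reference score (independently computed attention score): 67.12978375116599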
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The score matching problem is a central training objective in modern generative modeling, diffusion models, fitting unnormalized statistical models, and inverse problems. A standard approach is to minimize the forward Fisher divergence, where the expectation is taken with respect to the teacher distribution. However, recent results show that even in simple Gaussian mixture model settings, this objective can lead to undesirable and initialization-dependent convergence behavior. In this paper, we study an alternative objective: the reverse Fisher divergence, where the expectation is taken with respect to the student distribution. We analyze gradient descent (GD) for fitting Gaussian mixture models and show that this change in the objective leads to significantly better optimization properties. First, when the teacher distribution is a single Gaussian and the student is a Gaussian mixture model with fixed weights and identity covariances, we prove the global convergence of GD from arbitrary initializations. Second, we extend the analysis to the case where the teacher is also a Gaussian mixture model and prove global convergence guarantees under a global random initialization scheme and a $\widetilde{\Omega}(1)$-separation assumption on the target means. In particular, with high probability, each student component converges near its closest teacher component, and we provide conditions under which the student distribution converges in total variation distance. Our proofs rely on a new Lyapunov-based analysis of the gradient descent dynamics, showing that the reverse Fisher divergence has a much more favorable optimization landscape than the forward Fisher divergence.
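Abstract (reference translation): Score matching is a central training objective in modern generative modeling, diffusion models, the fitting of unnormalized statistical models, and inverse problems. The standard approach minimizes the forward Fisher divergence, in which the expectation is taken with respect to the teacher distribution. Recent results show, however, that even in simple Gaussian mixture model settings this objective can produce undesirable, initialization-dependent convergence behavior. This paper studies an alternative objective, the reverse Fisher divergence, in which the expectation is taken with respect to the student distribution. It analyzes gradient descent (GD) for fitting Gaussian mixture models and shows that this change of objective significantly improves the optimization properties. First, when the teacher distribution is a single Gaussian and the student is a Gaussian mixture model with fixed weights and identity covariances, GD is proven to converge globally from arbitrary initializations. Second, the analysis is extended to the case where the teacher is also a Gaussian mixture model, with global convergence guarantees proven under a global random initialization scheme and a $\widetilde{\Omega}(1)$-separation assumption on the target means. In particular, with high probability each student component converges near its closest teacher component, and conditions are given under which the student distribution converges in total variation distance. The proofs rely on a new Lyapunov-based analysis of the gradient descent dynamics, showing that the reverse Fisher divergence has a much more favorable optimization landscape than the forward Fisher divergence.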
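
To make the two objectives in the abstract concrete, here is a minimal sketch, not the paper's method or code. It assumes the usual form of the Fisher divergence, $\mathbb{E}\,\|\nabla \log q(x) - \nabla \log p(x)\|^2$ for a student $q$ and teacher $p$, up to constant-factor conventions. The expectation is over $x \sim p$ in the forward case and $x \sim q$ in the reverse case. For simplicity it works in one dimension with unit variances and fixed weights, and estimates the expectation by Monte Carlo. The function names (`gmm_score`, `sample_gmm`, `fisher_divergence`) and the sample size are illustrative assumptions.

```python
import math
import random


def gmm_score(x, means, weights):
    """Score d/dx log p(x) of a 1-D Gaussian mixture with unit variances."""
    # Log of the unnormalized responsibilities, shifted for numerical stability.
    logs = [math.log(w) - 0.5 * (x - m) ** 2 for m, w in zip(means, weights)]
    top = max(logs)
    resp = [math.exp(value - top) for value in logs]
    total = sum(resp)
    # The score is the responsibility-weighted average of (mean - x).
    return sum(r * (m - x) for r, m in zip(resp, means)) / total


def sample_gmm(rng, means, weights, n):
    """Draw n samples: pick a component, then add standard normal noise."""
    centers = rng.choices(means, weights=weights, k=n)
    return [c + rng.gauss(0.0, 1.0) for c in centers]


def fisher_divergence(rng, student, teacher, reverse=True, n=5000):
    """Monte Carlo estimate of E[(score_student(x) - score_teacher(x))^2].

    student and teacher are (means, weights) pairs.
    reverse=True samples x from the student (reverse Fisher divergence);
    reverse=False samples x from the teacher (forward Fisher divergence).
    """
    means, weights = student if reverse else teacher
    xs = sample_gmm(rng, means, weights, n)
    squared = [
        (gmm_score(x, *student) - gmm_score(x, *teacher)) ** 2 for x in xs
    ]
    return sum(squared) / n
```

For example, `fisher_divergence(random.Random(0), ([-1.0, 1.0], [0.5, 0.5]), ([0.0], [1.0]))` estimates the reverse divergence between a two-component student and a single-Gaussian teacher. Passing `reverse=False` gives the forward estimate for the same pair. Running gradient descent on the student means would additionally require differentiating through the sampling step, which this sketch does not attempt.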


List of related papers