Fugu-MT Paper Translation (Abstract): Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine

Paper overview: Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine

arxiv url: http://arxiv.org/abs/2605.20235v1
Date: Sat, 16 May 2026 16:51:10 GMT
Status: Translation complete
Last updated in system: 2026-05-21 19:19:56.220983
Title: Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine
Authors: Wei Huang, Andi Han, Mingyuan Bai, Huanjian Zhou, Qixin Zhang, Taiji Suzuki, Kenji Fukumizu
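Abstract summary: Diffusion models generate high-dimensional data with remarkable quality, yet how their training efficiently learns the score function remains theoretically unexplained. We formulate the collapse-and-refine principle as Score-induced Latent Diffusion (SiLD).
Reference score (site-computed attention score): 60.669081685261965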
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion models generate high-dimensional data with remarkable quality, yet how their training efficiently learns the score function, bypassing the curse of dimensionality when data is supported on low-dimensional manifolds, remains theoretically unexplained. We identify a collapse-and-refine mechanism driven by the geometry of the score function itself: at small noise scales, the diverging singularity of the score drives a rapid dimensional collapse of the induced denoising map onto the data manifold projection; at moderate noise scales, training refines the intrinsic density on the learned manifold. We instantiate this principle as Score-induced Latent Diffusion (SiLD), a two-stage framework in which both manifold learning and density estimation emerge from a single denoising score matching objective, replacing the heuristic KL regularization of VAE-based latent diffusion models. We prove that the resulting sample complexity depends on the intrinsic dimension rather than the ambient dimension. Experiments on Stacked MNIST, CelebA variants, and molecular generation benchmarks show that SiLD matches or outperforms VAE-based LDMs in generation quality and consistently improves reconstruction, validating our theoretical predictions.
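The small-noise collapse described in the abstract can be illustrated with a toy sketch. This is not the authors' SiLD method and involves no training: it assumes a hypothetical data set of points on a unit circle in 2-D and uses the closed-form score of the Gaussian-smoothed empirical distribution. The function names (`denoise`, `score`, `projection_gap`, `mean_score_norm`) and all constants are illustrative assumptions. By Tweedie's formula the denoising map is x + sigma^2 * score(x), and as sigma shrinks its output should approach the projection of x onto the circle while the score norm grows. The moderate-noise refinement stage is not covered here.

```python
import numpy as np

def denoise(x, data, sigma):
    """Posterior mean E[y | x] for x = y + sigma * noise, y uniform over `data`."""
    # Squared distances between every query point and every data point: (M, N).
    d2 = ((x[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
    logw = -d2 / (2.0 * sigma**2)
    logw -= logw.max(axis=1, keepdims=True)  # stabilise the softmax
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    return w @ data

def score(x, data, sigma):
    """Score of the smoothed empirical density, obtained from Tweedie's formula."""
    return (denoise(x, data, sigma) - x) / sigma**2

# Assumed toy manifold: 2000 evenly spaced points on the unit circle.
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
data = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# Query points placed off the manifold, at radii between 0.5 and 0.9.
rng = np.random.default_rng(0)
phi = rng.uniform(0.0, 2.0 * np.pi, size=256)
radius = rng.uniform(0.5, 0.9, size=256)
x = radius[:, None] * np.stack([np.cos(phi), np.sin(phi)], axis=1)
projection = x / np.linalg.norm(x, axis=1, keepdims=True)  # nearest point on the circle

def projection_gap(sigma):
    """Mean distance between the denoised point and the circle projection of x."""
    return float(np.linalg.norm(denoise(x, data, sigma) - projection, axis=1).mean())

def mean_score_norm(sigma):
    """Mean Euclidean norm of the score at the query points."""
    return float(np.linalg.norm(score(x, data, sigma), axis=1).mean())

for sigma in (1.0, 0.3, 0.01):
    print(f"sigma={sigma}: gap={projection_gap(sigma):.4f}, "
          f"score norm={mean_score_norm(sigma):.1f}")
```

Running the loop prints, for each noise scale, how far the denoised points sit from the circle projection and how large the score is at the off-manifold query points.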

Related papers