Fugu-MT Paper Translation (Abstract): Manifold Generalization Provably Proceeds Memorization in Diffusion Models

Paper summary: Manifold Generalization Provably Proceeds Memorization in Diffusion Models

arxiv url: http://arxiv.org/abs/2603.23792v1
Date: Tue, 24 Mar 2026 23:50:09 GMT
Status: Translation complete
Last updated in system: 2026-03-26 21:06:11.060775
Title: Manifold Generalization Provably Proceeds Memorization in Diffusion Models
Title (reference translation): Manifold Generalization in Diffusion Models
Authors: Zebang Shen, Ya-Ping Hsieh, Niao He
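Abstract summary: Diffusion models often generate novel samples even when the learned score is only coarse. The authors prove that diffusion models trained with coarse scores can exploit the regularity of the manifold support.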
Reference score (independently computed attention score): 33.15269246693525
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion models often generate novel samples even when the learned score is only \emph{coarse} -- a phenomenon not accounted for by the standard view of diffusion training as density estimation. In this paper, we show that, under the \emph{manifold hypothesis}, this behavior can instead be explained by coarse scores capturing the \emph{geometry} of the data while discarding the fine-scale distributional structure of the population measure~$μ_{\scriptscriptstyle\mathrm{data}}$. Concretely, whereas estimating the full data distribution $μ_{\scriptscriptstyle\mathrm{data}}$ supported on a $k$-dimensional manifold is known to require the classical minimax rate $\tilde{\mathcal{O}}(N^{-1/k})$, we prove that diffusion models trained with coarse scores can exploit the \emph{regularity of the manifold support} and attain a near-parametric rate toward a \emph{different} target distribution. This target distribution has density uniformly comparable to that of~$μ_{\scriptscriptstyle\mathrm{data}}$ throughout any $\tilde{\mathcal{O}}\bigl(N^{-β/(4k)}\bigr)$-neighborhood of the manifold, where $β$ denotes the manifold regularity. Our guarantees therefore depend only on the smoothness of the underlying support, and are especially favorable when the data density itself is irregular, for instance non-differentiable. In particular, when the manifold is sufficiently smooth, we obtain that \emph{generalization} -- formalized as the ability to generate novel, high-fidelity samples -- occurs at a statistical rate strictly faster than that required to estimate the full population distribution~$μ_{\scriptscriptstyle\mathrm{data}}$.
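Abstract (reference translation): Diffusion models often generate novel samples even when the learned score is only coarse. In this paper, we show that this behavior can be explained by coarse scores capturing the geometry of the data while discarding the fine-scale distributional structure of the population measure $μ_{\scriptscriptstyle\mathrm{data}}$. Concretely, whereas estimating the full data distribution $μ_{\scriptscriptstyle\mathrm{data}}$ supported on a $k$-dimensional manifold is known to require the classical minimax rate $\tilde{\mathcal{O}}(N^{-1/k})$, we prove that diffusion models trained with coarse scores can exploit the regularity of the manifold support and attain a near-parametric rate toward a different target distribution. This target distribution has density uniformly comparable to that of $μ_{\scriptscriptstyle\mathrm{data}}$ throughout any $\tilde{\mathcal{O}}\bigl(N^{-β/(4k)}\bigr)$-neighborhood of the manifold, where $β$ denotes the manifold regularity. Our guarantees therefore depend only on the smoothness of the underlying support and are especially favorable when the data density itself is irregular, for instance non-differentiable. In particular, when the manifold is sufficiently smooth, generalization, formalized as the ability to generate novel, high-fidelity samples, occurs at a statistical rate strictly faster than that required to estimate the full population distribution $μ_{\scriptscriptstyle\mathrm{data}}$.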
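
To give a rough feel for the rates quoted in the abstract, the following minimal Python sketch evaluates them numerically. It is an illustration only, not the paper's method: all logarithmic factors hidden in $\tilde{\mathcal{O}}$ are dropped, constants are set to 1, and `parametric_rate` uses $N^{-1/2}$ as an assumed stand-in for "near-parametric" (the abstract does not state the exact exponent). The function names are hypothetical. The two rates also concern different target distributions, so the comparison is only about how each scales with $N$ and $k$.

```python
def minimax_rate(n: int, k: int) -> float:
    """Classical rate N^(-1/k) quoted in the abstract (log factors dropped)."""
    return n ** (-1.0 / k)


def parametric_rate(n: int) -> float:
    """ASSUMPTION: N^(-1/2) as a stand-in for 'near-parametric' (log factors dropped)."""
    return n ** -0.5


def neighborhood_radius(n: int, k: int, beta: float) -> float:
    """Neighborhood scale N^(-beta/(4k)) quoted in the abstract (log factors dropped)."""
    return n ** (-beta / (4.0 * k))


if __name__ == "__main__":
    n = 10**6      # hypothetical sample size
    beta = 2.0     # hypothetical manifold regularity
    for k in (2, 8, 32):  # hypothetical intrinsic dimensions
        print(
            f"k={k:2d}  N^(-1/k)={minimax_rate(n, k):.4f}  "
            f"N^(-1/2)={parametric_rate(n):.4f}  "
            f"radius={neighborhood_radius(n, k, beta):.4f}"
        )
```

Under these assumptions the $N^{-1/k}$ rate degrades quickly as the intrinsic dimension $k$ grows, while the assumed parametric rate does not depend on $k$ at all.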

Related papers