Fugu-MT Paper Translation (Abstract): When Scores Learn Geometry: Rate Separations under the Manifold Hypothesis

Paper overview: When Scores Learn Geometry: Rate Separations under the Manifold Hypothesis

arxiv url: http://arxiv.org/abs/2509.24912v1
Date: Mon, 29 Sep 2025 15:18:43 GMT
Status: Translation complete
Last updated in system: 2025-09-30 22:32:20.085113
Title: When Scores Learn Geometry: Rate Separations under the Manifold Hypothesis
Title (reference translation): When Scores Learn Geometry: Rate Separation under the Manifold Hypothesis
Authors: Xiang Li, Zebang Shen, Ya-Ping Hsieh, Niao He
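Abstract summary: Score-based methods such as diffusion models and Bayesian inverse problems are often interpreted as learning the data distribution in the low-noise limit. The authors argue that their success arises from implicitly learning the data manifold rather than the full distribution. Concentration on the data support can be achieved with a score error of $o(\sigma^{-2})$, whereas recovering the specific data distribution requires a much stricter $o(1)$ error.
Reference score (independently computed attention score): 33.93481564069631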
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Score-based methods, such as diffusion models and Bayesian inverse problems, are often interpreted as learning the data distribution in the low-noise limit ($\sigma \to 0$). In this work, we propose an alternative perspective: their success arises from implicitly learning the data manifold rather than the full distribution. Our claim is based on a novel analysis of scores in the small-$\sigma$ regime that reveals a sharp separation of scales: information about the data manifold is $\Theta(\sigma^{-2})$ stronger than information about the distribution. We argue that this insight suggests a paradigm shift from the less practical goal of distributional learning to the more attainable task of geometric learning, which provably tolerates $O(\sigma^{-2})$ larger errors in score approximation. We illustrate this perspective through three consequences: i) in diffusion models, concentration on data support can be achieved with a score error of $o(\sigma^{-2})$, whereas recovering the specific data distribution requires a much stricter $o(1)$ error; ii) more surprisingly, learning the uniform distribution on the manifold, an especially structured and useful object, is also $O(\sigma^{-2})$ easier; and iii) in Bayesian inverse problems, the maximum entropy prior is $O(\sigma^{-2})$ more robust to score errors than generic priors. Finally, we validate our theoretical findings with preliminary experiments on large-scale models, including Stable Diffusion.
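Abstract (reference translation): Score-based methods, such as diffusion models and Bayesian inverse problems, are often interpreted as learning the data distribution in the low-noise limit ($\sigma \to 0$). This work proposes a different view: their success comes from implicitly learning the data manifold rather than the full distribution. The claim rests on a new analysis of scores in the small-$\sigma$ regime, which reveals a sharp separation of scales: information about the data manifold is $\Theta(\sigma^{-2})$ stronger than information about the distribution. This finding suggests a paradigm shift from the less practical goal of distributional learning to the more attainable goal of geometric learning, which provably tolerates $O(\sigma^{-2})$ larger errors in score approximation. Three consequences illustrate this view: i) in diffusion models, concentration on the data support is achieved with a score error of $o(\sigma^{-2})$, whereas recovering the specific data distribution requires a much stricter $o(1)$ error; ii) more surprisingly, learning the uniform distribution on the manifold, an especially structured and useful object, is also $O(\sigma^{-2})$ easier; iii) in Bayesian inverse problems, the maximum entropy prior is $O(\sigma^{-2})$ more robust to score errors than generic priors. Finally, preliminary experiments on large-scale models, including Stable Diffusion, validate the theoretical findings.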
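
The scale separation described in the abstract can be seen in a toy case that is not taken from the paper. Assume, purely for illustration, that the data lie on the x-axis in the plane with a standard normal density along it, and that isotropic Gaussian noise of standard deviation `sigma` is added. The smoothed density then factorizes as N(x; 0, 1 + sigma^2) times N(y; 0, sigma^2), so its score has a closed form. The function names and the evaluation point below are hypothetical choices for this sketch.

```python
def smoothed_score(point, sigma):
    """Score of the noise-smoothed toy density at a point (x, y).

    Assumption (illustration only): data ~ N(0, 1) on the x-axis,
    smoothed with N(0, sigma^2 I) noise in the plane.
    """
    x, y = point
    tangential = -x / (1.0 + sigma**2)  # along the manifold: reflects the data density
    normal = -y / sigma**2              # off the manifold: reflects the geometry
    return tangential, normal


def scaled_ratio(sigma, point=(1.0, 0.5)):
    """sigma^2 * |normal| / |tangential| at a fixed point off the manifold."""
    tangential, normal = smoothed_score(point, sigma)
    return sigma**2 * abs(normal) / abs(tangential)


if __name__ == "__main__":
    for sigma in (0.1, 0.01, 0.001):
        t, n = smoothed_score((1.0, 0.5), sigma)
        print(f"sigma={sigma}: tangential={t:.4f}, normal={n:.1f}, "
              f"sigma^2 * ratio={scaled_ratio(sigma):.5f}")
```

In this toy case the normal component grows like `sigma**-2` at a fixed point off the axis, while the tangential component stays bounded, so the rescaled ratio settles near a constant as `sigma` shrinks. This is only consistent with the abstract's claim in one flat example and says nothing about curved manifolds or learned scores.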
