Fugu-MT Paper Translation (Abstract): Locality in Image Diffusion Models Emerges from Data Statistics

Paper abstract: Locality in Image Diffusion Models Emerges from Data Statistics

arxiv url: http://arxiv.org/abs/2509.09672v2
Date: Thu, 30 Oct 2025 17:40:53 GMT
Status: Translation complete
Last updated in system: 2025-10-31 16:05:09.413506
Title: Locality in Image Diffusion Models Emerges from Data Statistics
Title (reference translation): Locality in Image Diffusion Models Derived from Data Statistics
Authors: Artem Lukoianov, Chenyang Yuan, Justin Solomon, Vincent Sitzmann
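Abstract summary: Recent work has shown that the generalization ability of image diffusion models arises from the locality properties of the trained neural network. This paper shows that locality in deep diffusion models emerges as a statistical property of the image dataset. It shows, both theoretically and experimentally, that this locality arises directly from pixel correlations present in image datasets.
Reference score (independently computed attention score): 19.257597016636844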
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Recent work has shown that the generalization ability of image diffusion models arises from the locality properties of the trained neural network. In particular, when denoising a particular pixel, the model relies on a limited neighborhood of the input image around that pixel, which, according to the previous work, is tightly related to the ability of these models to produce novel images. Since locality is central to generalization, it is crucial to understand why diffusion models learn local behavior in the first place, as well as the factors that govern the properties of locality patterns. In this work, we present evidence that the locality in deep diffusion models emerges as a statistical property of the image dataset and is not due to the inductive bias of convolutional neural networks, as suggested in previous work. Specifically, we demonstrate that an optimal parametric linear denoiser exhibits similar locality properties to deep neural denoisers. We show, both theoretically and experimentally, that this locality arises directly from pixel correlations present in the image datasets. Moreover, locality patterns are drastically different on specialized datasets, approximating principal components of the data's covariance. We use these insights to craft an analytical denoiser that better matches scores predicted by a deep diffusion model than prior expert-crafted alternatives. Our key takeaway is that while neural network architectures influence generation quality, their primary role is to capture locality patterns inherent in the data.
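Abstract (reference translation): Recent work has shown that the generalization ability of image diffusion models arises from the locality properties of the trained neural network. In particular, when denoising a given pixel, the model relies on a limited neighborhood of the input image around that pixel, which, according to previous work, is closely tied to the ability of these models to generate novel images. Because locality is central to generalization, it is important to understand why diffusion models learn local behavior in the first place, and which factors govern the properties of locality patterns. In this work, we show that locality in deep diffusion models emerges as a statistical property of the image dataset and is not due to the inductive bias of convolutional neural networks. Specifically, we show that an optimal parametric linear denoiser exhibits locality similar to that of deep neural denoisers. We show, both theoretically and experimentally, that this locality arises directly from pixel correlations present in image datasets. Moreover, locality patterns differ drastically on specialized datasets, where they approximate the principal components of the data's covariance. We use these insights to build an analytical denoiser that matches the scores predicted by a deep diffusion model better than prior expert-crafted alternatives. Our key takeaway is that while neural network architectures influence generation quality, their primary role is to capture the locality patterns inherent in the data.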
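
The abstract's claim that a linear denoiser inherits locality from pixel correlations can be illustrated with a toy example. The following is a minimal sketch, not the paper's implementation: it assumes zero-mean 1D "images" whose covariance decays exponentially with pixel distance (a hypothetical stand-in for natural-image statistics) and uses the standard closed-form MMSE linear denoiser (Wiener filter), W = Sigma (Sigma + s^2 I)^-1, for observations y = x + s * eps. The function names and the parameters n, rho, and noise_std are chosen here for illustration only.

```python
import numpy as np


def exponential_covariance(n: int, rho: float) -> np.ndarray:
    """Toy pixel covariance (an assumption): Sigma[i, j] = rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def optimal_linear_denoiser(sigma_x: np.ndarray, noise_std: float) -> np.ndarray:
    """MMSE linear denoiser for y = x + noise_std * eps with zero-mean x.

    Closed form: W = Sigma (Sigma + s^2 I)^-1. Sigma and (Sigma + s^2 I)
    share eigenvectors, so this equals (Sigma + s^2 I)^-1 Sigma, which is
    computed with a linear solve instead of an explicit inverse.
    """
    n = sigma_x.shape[0]
    return np.linalg.solve(sigma_x + noise_std**2 * np.eye(n), sigma_x)


if __name__ == "__main__":
    n, center = 65, 32
    sigma_x = exponential_covariance(n, rho=0.9)

    for noise_std in (0.5, 2.0):
        w = optimal_linear_denoiser(sigma_x, noise_std)
        # Row `center` holds the weights the denoised center pixel
        # places on every input pixel: its "receptive field".
        row = np.abs(w[center])
        rel = row / row[center]
        print(f"noise_std={noise_std}: relative weight at distance "
              f"1, 5, 10 = {rel[center + 1]:.3g}, {rel[center + 5]:.3g}, "
              f"{rel[center + 10]:.3g}")
```

In this toy setting each output pixel depends mostly on nearby inputs even though W is a dense matrix with no convolutional structure, and the dependence spreads out as the noise level grows. Swapping in an empirical covariance estimated from a dataset would be the natural next step, but that is beyond what this sketch shows.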

Related papers