Fugu-MT Paper Translation (Abstract): Supersampling Stable Diffusion and Beyond: A Seamless, Training-Free Approach for Scaling Neural Networks Using Common Interpolation Methods

Paper overview: Supersampling Stable Diffusion and Beyond: A Seamless, Training-Free Approach for Scaling Neural Networks Using Common Interpolation Methods

arxiv url: http://arxiv.org/abs/2605.08698v2
Date: Thu, 14 May 2026 14:25:41 GMT
Status: Translation complete
System update date: 2026-05-15 15:19:49.852692
Title: Supersampling Stable Diffusion and Beyond: A Seamless, Training-Free Approach for Scaling Neural Networks Using Common Interpolation Methods
Authors: Md Abu Obaida Zishan, Jannatun Noor, Annajiat Alim Rasel
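Abstract summary: We show mathematically that interpolation can correctly scale convolution kernels if multiplied by a constant coefficient. We also demonstrate that deep neural networks can be interpolated to adapt to higher-dimensional training data.
Reference score (independently computed attention measure): 2.464790797105706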
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Stable Diffusion (SD) has evolved DDPM (Denoising Diffusion Probabilistic Model) based image generation significantly by denoising in latent space instead of feature space. This popularized DDPM-based image generation as the cost and compute barrier was significantly lowered. However, these models could only generate fixed-resolution images according to their training configuration. When we attempt to generate higher resolutions, the resulting images show object duplication artifacts consistently. To solve this problem without finetuning SD models, recent works have tried dilating the convolution kernels of the models and have achieved a great level of success. But dilated kernels are harder to fine-tune due to being zero-gapped. Apart from this, other methods, such as patched diffusion, could not solve the object-duplication problem efficiently. Hence, to overcome the limitations of dilated convolutions, we propose kernel interpolation of SD models for higher-resolution image generation. In this work, we show mathematically that interpolation can correctly scale convolution kernels if multiplied by a constant coefficient and achieve competitive empirical results in generating beyond-training-resolution images with Stable Diffusion using zero training. Furthermore, we demonstrate that our method enables interpolation of deep neural networks to adapt to higher-dimensional training data, with a worst-case performance drop of $2.6\%$ in accuracy and F1-Score relative to the baseline. This shows the applicability of our method to be general, where we interpolate fully-connected layers, going beyond convolution layers. We also discuss how we can reduce the memory footprints of training neural networks, using our method up to at least $4\times$.
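The kernel-scaling claim in the abstract can be illustrated with a minimal sketch. The abstract does not state which interpolation method or which constant coefficient the authors use, so the choices here are assumptions made for illustration only: nearest-neighbour interpolation (via `np.kron`) and a coefficient of `1 / factor**2`, picked so that the enlarged kernel keeps the same sum as the original. The helper names `upscale_kernel_nearest` and `correlate2d_valid` are hypothetical and do not come from the paper.

```python
import numpy as np


def upscale_kernel_nearest(kernel, factor):
    """Enlarge a 2-D kernel by nearest-neighbour interpolation.

    The result is multiplied by 1 / factor**2 (an assumed coefficient,
    not taken from the paper) so that the kernel sum is unchanged.
    """
    enlarged = np.kron(kernel, np.ones((factor, factor)))
    return enlarged / (factor ** 2)


def correlate2d_valid(image, kernel, stride=1):
    """Plain 'valid' cross-correlation, as used by convolution layers."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = image[top:top + kh, left:left + kw]
            out[i, j] = np.sum(patch * kernel)
    return out


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))
factor = 2

# Nearest-neighbour upsampled input and the correspondingly scaled kernel.
big_image = np.kron(image, np.ones((factor, factor)))
big_kernel = upscale_kernel_nearest(kernel, factor)

low_res = correlate2d_valid(image, kernel)
# Sampling the high-resolution response every `factor` pixels lines the
# enlarged kernel up with whole blocks of repeated input pixels.
high_res = correlate2d_valid(big_image, big_kernel, stride=factor)

print(np.allclose(low_res, high_res))
```

In this toy setting the rescaled kernel reproduces the low-resolution response on the upsampled input, which is the sense in which a constant coefficient makes the interpolated kernel "correctly scaled". Other interpolation methods, such as bilinear or bicubic, would generally need their own analysis, and the paper's actual derivation may differ from this sketch.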

Related papers