Fugu-MT paper translation (abstract): Efficient and Training-Free Single-Image Diffusion Models

Paper overview: Efficient and Training-Free Single-Image Diffusion Models

arxiv url: http://arxiv.org/abs/2606.04299v1
Date: Wed, 03 Jun 2026 00:05:36 GMT
Status: translation complete
Last updated in system: 2026-06-04 20:44:18.433789
Title: Efficient and Training-Free Single-Image Diffusion Models
Authors: Haojun Qiu, Kiriakos N. Kutulakos, David B. Lindell
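Abstract summary: We consider the problem of generating images whose internal structure matches that of a single reference image. We integrate a patch-based denoiser into an efficient, training-free image diffusion model. The proposed method achieves state-of-the-art generation quality and diversity compared to trained single-image diffusion models.
Reference score (independently computed attention score): 17.578119446864132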
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider the problem of generating images whose internal structure -- defined by the distribution of patches across multiple scales -- matches that of a single reference image. Recent approaches address this problem by training a diffusion model on a single image. But even in this setting, training is computationally expensive and requires hours of optimization. Instead, we model the image using a dataset of its patches at different scales. As this dataset is finite and the dimensionality of its patches is small, the score function for a noisy patch can be computed tractably using an optimal, closed-form denoiser, eliminating the need for neural network training. We integrate this patch-based denoiser into an efficient, training-free image diffusion model, and we describe how our method connects to classical patch-based image restoration techniques. Our approach achieves state-of-the-art generation quality and diversity compared to trained single-image diffusion models, and we demonstrate applications, including unconditional image generation, text-guided stylization, image symmetrization, and retargeting. Further, we show that our approach is compatible with latent space diffusion, and we show multiple additional acceleration techniques to achieve megapixel single-image generation in one second, and gigapixel generation in minutes.
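
As a rough illustration of the closed-form denoiser the abstract mentions, the sketch below computes the posterior-mean estimate of a clean patch given a noisy one, for a finite patch dataset. It is not the authors' implementation. It assumes a uniform empirical distribution over the reference patches and an additive Gaussian noise model, y = x + sigma * eps; under those assumptions the optimal denoiser is a softmax-weighted average of the dataset patches, and the score follows from Tweedie's formula. The function names `optimal_patch_denoiser` and `patch_score` are hypothetical.

```python
import numpy as np


def optimal_patch_denoiser(noisy, patches, sigma):
    """Posterior-mean denoiser for a uniform empirical distribution over `patches`.

    Assumed noise model (not taken from the paper): y = x + sigma * eps.
    noisy:   (M, D) array of noisy, flattened patches
    patches: (N, D) array of clean, flattened reference patches
    returns: (M, D) array of denoised patches
    """
    # Squared Euclidean distance from every noisy patch to every reference patch: (M, N).
    d2 = ((noisy[:, None, :] - patches[None, :, :]) ** 2).sum(axis=-1)
    # Log-weights of the Gaussian likelihood; subtract the row max for a stable softmax.
    logits = -d2 / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    # Each denoised patch is a convex combination of the reference patches.
    return weights @ patches


def patch_score(noisy, patches, sigma):
    """Score of the noised patch distribution via Tweedie's formula."""
    return (optimal_patch_denoiser(noisy, patches, sigma) - noisy) / sigma ** 2
```

At low noise levels the weights concentrate on the nearest reference patch, and at high noise levels the output approaches the mean of all patches. The brute-force distance matrix costs O(M N D) time and memory, so a practical system would need some form of acceleration; the abstract mentions additional acceleration techniques but does not detail them here.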
