Fugu-MT paper translation (abstract): Amortized Guidance for Image Inpainting with Pretrained Diffusion Models

Paper overview: Amortized Guidance for Image Inpainting with Pretrained Diffusion Models

arxiv url: http://arxiv.org/abs/2605.13010v1
Date: Wed, 13 May 2026 05:02:47 GMT
Status: translation complete
System update date: 2026-05-14 23:30:27.821306
Title: Amortized Guidance for Image Inpainting with Pretrained Diffusion Models
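Authors: Yilie Huang, Xun Yu Zhou
Abstract summary: We study image inpainting with generative diffusion models. We introduce a middle-ground model termed Amortized Inpainting with Diffusion (AID). AID keeps a pretrained diffusion backbone fixed, trains a small reusable guidance module offline, and reuses it across masked images without per-instance optimization.
Reference score (independently computed attention measure): 3.9819516108444115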
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study image inpainting with generative diffusion models. Existing methods typically either train dedicated task-specific models, or adapt a pretrained diffusion model separately for each masked image at deployment. We introduce a middle-ground model, termed Amortized Inpainting with Diffusion (AID), which keeps a pretrained diffusion backbone fixed, trains a small reusable guidance module offline, and then reuses it across masked images without per-instance optimization. We formulate it as a deterministic guidance problem with a supervised terminal objective. To make this problem learnable in high dimensions, we derive an auxiliary Gaussian formulation and prove that solving this randomized problem recovers the optimal deterministic guidance field. This bridge yields a principled continuous-time actor-critic algorithm for learning the guidance module in a fully data-driven manner. Empirically, on AFHQv2 and FFHQ under the pixel EDM pipeline and on ImageNet under the latent EDM2 pipeline, AID consistently improves the quality-speed trade-off over strong fixed-backbone and amortized inpainting baselines across multiple mask types, while adding less than one percent trainable overhead.
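The structure described in the abstract (a frozen backbone plus a small guidance term that is reused across masked images) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's algorithm: `frozen_backbone_drift` is a hand-written stand-in for a pretrained model, `GuidanceModule` is a hypothetical one-parameter module whose gain is set by hand rather than learned with the actor-critic procedure, and the sampler is a plain Euler integration of a deterministic ODE.

```python
import numpy as np


def frozen_backbone_drift(x, s):
    """Hypothetical stand-in for a pretrained backbone's drift.

    It simply contracts the state toward zero and is never updated,
    mirroring the "fixed backbone" idea only in spirit.
    """
    return -x


class GuidanceModule:
    """Hypothetical small guidance module with one parameter (gain).

    In the paper the module is trained offline; here the gain is fixed
    by hand purely to show where guidance enters the dynamics.
    """

    def __init__(self, gain=5.0):
        self.gain = gain

    def __call__(self, x, s, observed, mask):
        # Pull known pixels (mask == 1) toward their observed values;
        # masked-out pixels (mask == 0) receive no guidance in this toy.
        return self.gain * mask * (observed - x)


def inpaint(observed, mask, guidance, n_steps=50, seed=0):
    """Euler-integrate dx/ds = backbone(x, s) + guidance(x, s, y, m) on [0, 1].

    No per-image optimization happens here: the same guidance object
    can be called on any (observed, mask) pair.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(observed.shape)  # start from Gaussian noise
    ds = 1.0 / n_steps
    for k in range(n_steps):
        s = k * ds
        drift = frozen_backbone_drift(x, s) + guidance(x, s, observed, mask)
        x = x + ds * drift
    return x


if __name__ == "__main__":
    image = np.full((4, 4), 2.0)
    mask = np.zeros((4, 4))
    mask[:, :2] = 1.0  # left half is observed, right half is missing
    result = inpaint(image * mask, mask, GuidanceModule(gain=5.0))
    print(result.round(2))
```

In this toy the pixels are not coupled, so the missing region is filled only by the stand-in backbone; a real pretrained denoiser would propagate information from the known pixels into the hole.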
