Fugu-MT Paper Translation (Abstract): Your Pre-trained Diffusion Model Secretly Knows Restoration

Paper summary: Your Pre-trained Diffusion Model Secretly Knows Restoration

arxiv url: http://arxiv.org/abs/2604.04924v1
Date: Mon, 06 Apr 2026 17:59:04 GMT
Status: Translation complete
Last updated in system: 2026-04-07 15:49:19.333591
Title: Your Pre-trained Diffusion Model Secretly Knows Restoration
Authors: Sudarshan Rajagopalan, Vishal M. Patel
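Abstract summary: This work shows that pre-trained diffusion models inherently possess restoration behavior, which can be unlocked by directly learning prompt embeddings. Lightweight learned prompts are introduced on the pre-trained WAN video model and FLUX image models, converting them into high-performing restoration models.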
Reference score (independently computed attention score): 55.7186754179308
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Pre-trained diffusion models have enabled significant advancements in All-in-One Restoration (AiOR), offering improved perceptual quality and generalization. However, diffusion-based restoration methods primarily rely on fine-tuning or ControlNet-style modules to leverage the pre-trained diffusion model's priors for AiOR. In this work, we show that these pre-trained diffusion models inherently possess restoration behavior, which can be unlocked by directly learning prompt embeddings at the output of the text encoder. Interestingly, this behavior is largely inaccessible through text prompts and text-token embedding optimization. Furthermore, we observe that naive prompt learning is unstable because the forward noising process using degraded images is misaligned with the reverse sampling trajectory. To resolve this, we train prompts within a diffusion bridge formulation that aligns training and inference dynamics, enforcing a coherent denoising path from noisy degraded states to clean images. Building on these insights, we introduce our lightweight learned prompts on the pre-trained WAN video model and FLUX image models, converting them into high-performing restoration models. Extensive experiments demonstrate that our approach achieves competitive performance and generalization across diverse degradations, while avoiding fine-tuning and restoration-specific control modules.
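
The core idea in the abstract (keep the pre-trained model frozen, learn only a prompt embedding, and train on bridge states between clean and degraded inputs) can be illustrated with a small toy sketch. This is not the authors' implementation: the frozen "model" is a hypothetical linear stand-in, the degradation is a fixed offset, and the Brownian-bridge form of `bridge_state` is an assumption, since the abstract does not specify the bridge. All names (`frozen_model`, `bridge_state`, `train_prompt`, `restore`, `U`, `OFFSET`) are made up for illustration.

```python
import math
import random

random.seed(0)

D, K = 4, 2  # toy signal dimension and prompt-embedding dimension

# Frozen "pre-trained" weights: a hypothetical stand-in that is never updated.
U = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-0.5, 1.0]]

# Toy degradation (assumption): a fixed offset equal to U @ [0.5, -1.0],
# so a prompt that undoes it exists, namely [-0.5, 1.0].
OFFSET = [0.5, -1.0, -0.25, -1.25]


def degrade(x):
    """Apply the toy degradation to a clean signal."""
    return [xi + oi for xi, oi in zip(x, OFFSET)]


def frozen_model(x_t, t, prompt):
    """Frozen predictor of the clean signal, conditioned on time t and a prompt embedding."""
    return [
        x_t[i] + t * sum(U[i][j] * prompt[j] for j in range(K))
        for i in range(D)
    ]


def bridge_state(clean, degraded, t, sigma=0.1):
    """Sample a Brownian-bridge state: clean at t=0, degraded at t=1."""
    std = sigma * math.sqrt(max(0.0, t * (1.0 - t)))
    return [
        (1.0 - t) * c + t * d + std * random.gauss(0.0, 1.0)
        for c, d in zip(clean, degraded)
    ]


def train_prompt(steps=2000, lr=0.05):
    """Learn only the prompt embedding by SGD; the model weights U stay fixed."""
    prompt = [0.0] * K
    for _ in range(steps):
        clean = [random.uniform(-1.0, 1.0) for _ in range(D)]
        degraded = degrade(clean)
        t = random.uniform(0.05, 1.0)
        x_t = bridge_state(clean, degraded, t)
        pred = frozen_model(x_t, t, prompt)
        resid = [p - c for p, c in zip(pred, clean)]
        # Gradient of the squared error with respect to the prompt only.
        grads = [
            2.0 * t * sum(U[i][j] * resid[i] for i in range(D))
            for j in range(K)
        ]
        prompt = [pj - lr * gj for pj, gj in zip(prompt, grads)]
    return prompt


def restore(degraded, prompt):
    """One-step inference: start from the degraded input, i.e. the bridge endpoint t=1."""
    return frozen_model(degraded, 1.0, prompt)
```

Calling `restore(degrade(x), train_prompt())` should return a signal close to `x`. Because training samples states on the same clean-to-degraded path that inference starts from, the two are aligned by construction in this toy; in the paper's setting the frozen model would be a diffusion backbone such as WAN or FLUX, and the prompt embedding would sit at the text encoder's output.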
