Fugu-MT Paper Translation (Abstract): Guiding Diffusion Models with Semantically Degraded Conditions

Paper overview: Guiding Diffusion Models with Semantically Degraded Conditions

arxiv url: http://arxiv.org/abs/2603.10780v1
Date: Wed, 11 Mar 2026 13:54:35 GMT
Status: Translation complete
Last updated in system: 2026-03-12 16:22:32.97639
Title: Guiding Diffusion Models with Semantically Degraded Conditions
Title (reference translation): Guiding Diffusion Models with Semantically Degraded Conditions
Authors: Shilong Han, Yuming Zhang, Hongxia Wang
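Abstract summary: We propose Condition-Degradation Guidance (CDG). CDG replaces the null prompt with a strategically degraded condition, $\boldsymbol{c}_{\text{deg}}$. As a lightweight, plug-and-play module, CDG markedly improves compositional accuracy and text-image alignment.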
Reference score (independently computed attention metric): 19.061619300086875
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Classifier-Free Guidance (CFG) is a cornerstone of modern text-to-image models, yet its reliance on a semantically vacuous null prompt ($\varnothing$) generates a guidance signal prone to geometric entanglement. This is a key factor limiting its precision, leading to well-documented failures in complex compositional tasks. We propose Condition-Degradation Guidance (CDG), a novel paradigm that replaces the null prompt with a strategically degraded condition, $\boldsymbol{c}_{\text{deg}}$. This reframes guidance from a coarse "good vs. null" contrast to a more refined "good vs. almost good" discrimination, thereby compelling the model to capture fine-grained semantic distinctions. We find that tokens in transformer text encoders split into two functional roles: content tokens encoding object semantics, and context-aggregating tokens capturing global context. By selectively degrading only the former, CDG constructs $\boldsymbol{c}_{\text{deg}}$ without external models or training. Validated across diverse architectures including Stable Diffusion 3, FLUX, and Qwen-Image, CDG markedly improves compositional accuracy and text-image alignment. As a lightweight, plug-and-play module, it achieves this with negligible computational overhead. Our work challenges the reliance on static, information-sparse negative samples and establishes a new principle for diffusion guidance: the construction of adaptive, semantically-aware negative samples is critical to achieving precise semantic control. Code is available at https://github.com/Ming-321/Classifier-Degradation-Guidance.
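Abstract (reference translation): Classifier-Free Guidance (CFG) is a foundation of modern text-to-image models, but because it relies on a semantically empty null prompt ($\varnothing$), it produces a guidance signal prone to geometric entanglement. This is a key factor limiting its precision and leads to well-documented failures in complex compositional tasks. We propose Condition-Degradation Guidance (CDG), a new paradigm that replaces the null prompt with a strategically degraded condition, $\boldsymbol{c}_{\text{deg}}$. This reframes guidance from a coarse "good vs. null" contrast into a more refined "good vs. almost good" discrimination, pushing the model to capture fine-grained semantic distinctions. Tokens in transformer text encoders split into two functional roles: content tokens that encode object semantics, and context-aggregating tokens that capture global context. By selectively degrading only the former, CDG constructs $\boldsymbol{c}_{\text{deg}}$ without external models or training. Validated across diverse architectures including Stable Diffusion 3, FLUX, and Qwen-Image, CDG markedly improves compositional accuracy and text-image alignment. As a lightweight, plug-and-play module, it does so with negligible computational overhead. Our work challenges the reliance on static, information-sparse negative samples and establishes a new principle for diffusion guidance: constructing adaptive, semantically aware negative samples is critical to precise semantic control. Code is available at https://github.com/Ming-321/Classifier-Degradation-Guidance.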
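
Illustrative sketch (not from the paper): the abstract does not state the guidance formula or how content tokens are degraded. The Python below assumes CDG keeps the standard CFG extrapolation, $\hat{\epsilon} = \epsilon_{\text{ref}} + w\,(\epsilon_{\text{cond}} - \epsilon_{\text{ref}})$, and only swaps the reference prediction from the null prompt $\varnothing$ to the degraded condition $\boldsymbol{c}_{\text{deg}}$. The names `guided_prediction`, `degrade_condition`, `content_mask`, and the zeroing `degrade` callback are hypothetical stand-ins; the authors' actual degradation and token-selection rules may differ.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def guided_prediction(pred_cond: Sequence[float],
                      pred_ref: Sequence[float],
                      scale: float) -> Vector:
    """Extrapolate from a reference prediction toward the conditional one.

    With pred_ref computed from the null prompt this is ordinary CFG.
    Under the assumption stated above, CDG would pass the prediction
    computed from the degraded condition c_deg instead.
    """
    return [r + scale * (c - r) for c, r in zip(pred_cond, pred_ref)]


def degrade_condition(tokens: Sequence[Sequence[float]],
                      content_mask: Sequence[bool],
                      degrade: Callable[[Sequence[float]], Vector]) -> List[Vector]:
    """Degrade only the token embeddings flagged as content tokens.

    Context-aggregating tokens (mask value False) are copied unchanged.
    How the mask is obtained is left open here (assumption).
    """
    return [
        degrade(tok) if is_content else list(tok)
        for tok, is_content in zip(tokens, content_mask)
    ]


# Toy usage: three 2-d token embeddings, the first two treated as content.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
content_mask = [True, True, False]

# Hypothetical degradation: zero out the content embeddings.
c_deg = degrade_condition(tokens, content_mask, lambda t: [0.0] * len(t))

# Toy predictions standing in for the denoiser outputs under c and c_deg.
guided = guided_prediction([1.0, 2.0], [0.5, 1.0], scale=3.0)
```

In a real pipeline the two predictions would come from the same denoiser evaluated once with the full condition and once with `c_deg`, so the cost per step matches CFG's two forward passes; that is consistent with the abstract's "negligible computational overhead" claim but is not confirmed by it.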

Related papers