Fugu-MT Paper Translation (Abstract): CannyEdit: Selective Canny Control and Dual-Prompt Guidance for Training-Free Image Editing

Paper abstract: CannyEdit: Selective Canny Control and Dual-Prompt Guidance for Training-Free Image Editing

arxiv url: http://arxiv.org/abs/2508.06937v1
Date: Sat, 09 Aug 2025 11:06:58 GMT
Status: Translation complete
System update date: 2025-08-12 21:23:28.618383
Title: CannyEdit: Selective Canny Control and Dual-Prompt Guidance for Training-Free Image Editing
Title (reference translation): CannyEdit: Selective Canny Control and Dual-Prompt Guidance for Training-Free Image Editing
Authors: Weiyan Xie, Han Gao, Didan Deng, Kaican Li, April Hua Liu, Yongxiang Huang, Nevin L. Zhang
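Abstract summary: CannyEdit is a novel training-free framework for regional image editing. It introduces Selective Canny Control and Dual-Prompt Guidance. CannyEdit achieves a 2.93 to 10.49 percent improvement over prior methods such as KV-Edit in the balance of text adherence and context fidelity.
Reference score (independently calculated attention score): 13.934827997942424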
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Recent advances in text-to-image (T2I) models have enabled training-free regional image editing by leveraging the generative priors of foundation models. However, existing methods struggle to balance text adherence in edited regions, context fidelity in unedited areas, and seamless integration of edits. We introduce CannyEdit, a novel training-free framework that addresses these challenges through two key innovations: (1) Selective Canny Control, which masks the structural guidance of Canny ControlNet in user-specified editable regions while strictly preserving details of the source images in unedited areas via inversion-phase ControlNet information retention. This enables precise, text-driven edits without compromising contextual integrity. (2) Dual-Prompt Guidance, which combines local prompts for object-specific edits with a global target prompt to maintain coherent scene interactions. On real-world image editing tasks (addition, replacement, removal), CannyEdit outperforms prior methods like KV-Edit, achieving a 2.93 to 10.49 percent improvement in the balance of text adherence and context fidelity. In terms of editing seamlessness, user studies reveal only 49.2 percent of general users and 42.0 percent of AIGC experts identified CannyEdit's results as AI-edited when paired with real images without edits, versus 76.08 to 89.09 percent for competitor methods.
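Abstract (reference translation): Recent advances in text-to-image (T2I) models have enabled training-free regional image editing by leveraging the generative priors of foundation models. However, existing methods struggle to balance text adherence in edited regions, context fidelity in unedited areas, and seamless integration of edits. We introduce CannyEdit, a novel training-free framework that addresses these challenges through two key innovations: (1) Selective Canny Control, which masks the structural guidance of Canny ControlNet in user-specified editable regions while strictly preserving details of the source images in unedited areas via inversion-phase ControlNet information retention. This enables precise, text-driven edits without compromising contextual integrity. (2) Dual-Prompt Guidance, which combines local prompts for object-specific edits with a global target prompt to maintain coherent scene interactions. On real-world image editing tasks (addition, replacement, removal), CannyEdit outperforms prior methods like KV-Edit, achieving a 2.93 to 10.49 percent improvement in the balance of text adherence and context fidelity. In terms of editing seamlessness, user studies reveal only 49.2 percent of general users and 42.0 percent of AIGC experts identified CannyEdit's results as AI-edited when paired with real images without edits, versus 76.08 to 89.09 percent for competitor methods.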
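
The masking idea behind Selective Canny Control can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes the structural guidance can be represented as a small 2D grid of values and that the user-specified editable region is a binary mask of the same shape (1 = editable, 0 = preserve). The function name `mask_control_signal` and both parameter names are hypothetical.

```python
def mask_control_signal(control, editable_mask):
    """Zero a control signal inside the editable region.

    control:       2D list of floats (toy stand-in for structural guidance).
    editable_mask: 2D list of 0/1 values, 1 marks a user-editable cell.
    Returns a new 2D list; cells outside the mask keep their control value.
    """
    if len(control) != len(editable_mask):
        raise ValueError("control and mask must have the same number of rows")
    masked = []
    for control_row, mask_row in zip(control, editable_mask):
        if len(control_row) != len(mask_row):
            raise ValueError("control and mask rows must have the same length")
        # Multiply by (1 - m): guidance is dropped where m == 1, kept where m == 0.
        masked.append([c * (1 - m) for c, m in zip(control_row, mask_row)])
    return masked


# Hypothetical 2x2 example: only the top-right cell is editable.
control = [[0.5, 0.2], [0.9, 0.4]]
editable_mask = [[0, 1], [0, 0]]
masked = mask_control_signal(control, editable_mask)
```

In this toy setting the guidance is removed only where the mask is 1, so a text prompt would be free to change that cell while the remaining cells stay structurally constrained. How the paper applies the mask inside ControlNet, and how inversion-phase information is retained, is not specified in the abstract and is not modeled here.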
