Fugu-MT paper translation (summary): Consistent-Inversion: Reverse Consistency Guidance for Structure-Preserving Visual Editing

Paper summary: Consistent-Inversion: Reverse Consistency Guidance for Structure-Preserving Visual Editing

arxiv url: http://arxiv.org/abs/2606.07145v1
Date: Fri, 05 Jun 2026 11:00:12 GMT
Status: Translation complete
System update date: 2026-06-08 14:33:29.700967
Title: Consistent-Inversion: Reverse Consistency Guidance for Structure-Preserving Visual Editing
Authors: Xiaocheng Lu, Jingcai Guo, Song Guo
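Abstract summary: Consistent-Inversion is a training-free reverse consistency guidance framework for structure-preserving visual editing. Under a unified SD3.5 protocol, it improves background and structural fidelity while maintaining target-prompt alignment.
Reference score (independently computed attention measure): 41.38183848746174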
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-guided diffusion models have become effective tools for real-image visual editing, where the edited image must follow a target instruction while preserving editing-irrelevant structure. Most training-free editors rely on inversion: a source image is mapped to a noisy latent trajectory and the terminal latent is reused for target-prompt denoising. This reuse is useful for preservation, but it also couples source reconstruction and target editing. The resulting trajectory mismatch may either damage background/layout details or over-constrain the intended edit. This paper presents Consistent-Inversion, a training-free reverse consistency guidance framework for structure-preserving visual editing. Instead of treating the inverted source latent as a fixed initialization, Consistent-Inversion checks whether an intermediate target trajectory can be reversed toward the source inversion trajectory under the source prompt. To make this check well-defined, we construct an auxiliary target-side noise representation, perform source-guided reverse denoising, and use the resulting reverse consistency discrepancy as a correction signal for selected early target denoising steps. The method does not update model parameters, is compatible with inversion-based editors, and introduces only a small inference overhead when applied sparsely. Experiments on PIE-Bench show that Consistent-Inversion improves background and structural fidelity under a unified SD3.5 protocol while maintaining target-prompt alignment, and compatibility experiments further verify the same correction principle on classical Stable-Diffusion inversion pipelines.
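The control flow described in the abstract (invert the source, denoise under the target prompt, and correct selected early steps with a reverse consistency discrepancy) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the linear `velocity` function stands in for a real prompt-conditioned denoiser, prompts are plain vectors, and the particular forms of the auxiliary re-noising step, the discrepancy, and the weight `lam` are one possible reading of the abstract, not the authors' formulation.

```python
import numpy as np


def velocity(x, t, prompt):
    # Hypothetical stand-in for a prompt-conditioned denoiser.
    # The toy ignores t and simply points away from the prompt vector.
    return x - prompt


def invert(x0, prompt, ts):
    # Euler-integrate from the clean latent toward noise under `prompt`,
    # recording the trajectory; traj[i] is the latent at time ts[i].
    traj = [x0]
    x = x0
    for t0, t1 in zip(ts[:-1], ts[1:]):
        x = x + (t1 - t0) * velocity(x, t0, prompt)
        traj.append(x)
    return traj


def edit(x0, src, tgt, ts, guided_steps=(), lam=0.5):
    # Invert under the source prompt, then denoise under the target prompt.
    traj = invert(x0, src, ts)
    z = traj[-1]
    for i in range(len(ts) - 1, 0, -1):
        dt = ts[i] - ts[i - 1]
        z = z - dt * velocity(z, ts[i], tgt)  # target step: ts[i] -> ts[i-1]
        if (i - 1) in guided_steps:
            # Assumed auxiliary target-side noise representation:
            # push the target latent one step back toward noise.
            z_aux = z + dt * velocity(z, ts[i - 1], tgt)
            # Source-guided reverse denoising of that auxiliary latent.
            z_back = z_aux - dt * velocity(z_aux, ts[i], src)
            # Assumed reverse consistency discrepancy against the stored
            # source inversion trajectory, used as a correction signal.
            z = z - lam * (z_back - traj[i - 1])
    return z


# Usage: coordinate 0 is "background" (source and target prompts agree),
# coordinate 1 is the edited attribute.
ts = np.linspace(0.0, 1.0, 11)
x0 = np.array([1.0, 2.0])
src = np.array([0.0, 0.0])
tgt = np.array([0.0, 3.0])
plain = edit(x0, src, tgt, ts)
guided = edit(x0, src, tgt, ts, guided_steps=(9, 8, 7), lam=0.5)
```

In this toy setting the correction is applied only at the first three denoising steps, mirroring the sparse, early application mentioned in the abstract. The background coordinate of `guided` ends up closer to its original value than that of `plain`, but the toy says nothing about behaviour on real diffusion models.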

Related papers