Fugu-MT Paper Translation (Abstract): From Competition to Coopetition: Coopetitive Training-Free Image Editing Based on Text Guidance

Paper overview: From Competition to Coopetition: Coopetitive Training-Free Image Editing Based on Text Guidance

arxiv url: http://arxiv.org/abs/2604.15948v1
Date: Fri, 17 Apr 2026 11:10:22 GMT
Status: Translation complete
Last updated in system: 2026-04-20 22:00:19.885101
Title: From Competition to Coopetition: Coopetitive Training-Free Image Editing Based on Text Guidance
Title (reference translation): From Competition to Coopetition: Coopetitive Training-Free Image Editing Based on Text Guidance
Authors: Jinhao Shen, Haoqian Du, Xulu Zhang, Xiao-Yong Wei, Qing Li
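Abstract summary: CoEdit is a novel zero-shot framework that shifts attention control from competition to coopetitive negotiation. The paper proposes Dual-Entropy Attention Manipulation, which quantifies directional entropic interactions between branches in order to reformulate attention control as a harmony-maximization problem. It also proposes an Entropic Latent Refinement mechanism that dynamically adjusts latent representations over time to minimize accumulated editing errors.
Reference score (independently computed attention measure): 11.574335632043491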
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-guided image editing, a pivotal task in modern multimedia content creation, has seen remarkable progress with training-free methods that eliminate the need for additional optimization. Despite recent progress, existing methods are typically constrained by a competitive paradigm in which the editing and reconstruction branches are independently driven by their respective objectives to maximize alignment with target and source prompts. The adversarial strategy causes semantic conflicts and unpredictable outcomes due to the lack of coordination between branches. To overcome these issues, we propose Coopetitive Training-Free Image Editing (CoEdit), a novel zero-shot framework that transforms attention control from competition to coopetitive negotiation, achieving editing harmony across spatial and temporal dimensions. Spatially, CoEdit introduces Dual-Entropy Attention Manipulation, which quantifies directional entropic interactions between branches to reformulate attention control as a harmony-maximization problem, eventually improving the localization of editable and preservable regions. Temporally, we present the Entropic Latent Refinement mechanism to dynamically adjust latent representations over time, minimizing accumulated editing errors and ensuring consistent semantic transitions throughout the denoising trajectory. Additionally, we propose the Fidelity-Constrained Editing Score, a composite metric that jointly evaluates semantic editing and background fidelity. Extensive experiments on standard benchmarks demonstrate that CoEdit achieves superior performance in both editing quality and structural preservation, enhancing multimedia information utilization by enabling more effective interaction between visual and textual modalities. The code will be available at https://github.com/JinhaoShen/CoEdit.
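Abstract (reference translation): Text-guided image editing, an important task in modern multimedia content creation, has made remarkable progress through training-free methods that remove the need for additional optimization. Despite this progress, existing methods are typically constrained by a competitive paradigm in which the editing and reconstruction branches are each driven independently by their own objectives, maximizing alignment with the target and source prompts respectively. This adversarial strategy causes semantic conflicts and unpredictable results because the branches are not coordinated. To overcome these issues, the authors propose Coopetitive Training-Free Image Editing (CoEdit), a novel zero-shot framework that shifts attention control from competition to coopetitive negotiation and achieves editing harmony across the spatial and temporal dimensions. Spatially, CoEdit introduces Dual-Entropy Attention Manipulation, which quantifies directional entropic interactions between branches, reformulates attention control as a harmony-maximization problem, and ultimately improves the localization of editable and preservable regions. Temporally, the authors propose the Entropic Latent Refinement mechanism, which dynamically adjusts latent representations over time, minimizes accumulated editing errors, and ensures consistent semantic transitions throughout the denoising trajectory. In addition, they propose the Fidelity-Constrained Editing Score, a composite metric that jointly evaluates semantic editing and background fidelity. Extensive experiments on standard benchmarks show that CoEdit achieves superior performance in both editing quality and structural preservation, improving multimedia information utilization by enabling more effective interaction between the visual and textual modalities. The code will be available at https://github.com/JinhaoShen/CoEdit.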
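The abstract does not define how "directional entropic interactions" between the two branches are computed. As a rough illustration only, the sketch below assumes that each branch produces a per-position attention distribution over prompt tokens and uses the KL divergence, which is asymmetric and therefore directional, as a stand-in measure. The function names, the toy attention values, and the 0.5 threshold are all hypothetical and are not taken from the paper or its code.

```python
import numpy as np

def normalize(attn, eps=1e-12):
    """Turn non-negative attention scores into distributions over the last axis."""
    attn = np.asarray(attn, dtype=np.float64) + eps
    return attn / attn.sum(axis=-1, keepdims=True)

def directional_kl(p, q):
    """KL(p || q) along the last axis. Asymmetric, so KL(p||q) != KL(q||p) in general."""
    return (p * (np.log(p) - np.log(q))).sum(axis=-1)

# Hypothetical attention maps: 4 spatial positions x 3 prompt tokens.
recon = normalize([[0.8, 0.1, 0.1],   # reconstruction (source-prompt) branch
                   [0.7, 0.2, 0.1],
                   [0.1, 0.1, 0.8],
                   [0.3, 0.3, 0.4]])
edit = normalize([[0.8, 0.1, 0.1],    # editing (target-prompt) branch
                  [0.7, 0.2, 0.1],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.4, 0.3]])

# One divergence per direction, computed independently at each position.
edit_to_recon = directional_kl(edit, recon)
recon_to_edit = directional_kl(recon, edit)

# Assumed heuristic: positions where the branches disagree strongly are
# candidates for editing; the rest are candidates for preservation.
disagreement = edit_to_recon + recon_to_edit
candidate_edit_mask = disagreement > 0.5  # hypothetical threshold

print(candidate_edit_mask)
```

In this toy input only the third position shows a large divergence between branches, so it is the only one flagged. How CoEdit actually combines the two directions into a harmony objective is not stated in the abstract.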


Related papers