Fugu-MT Paper Translation (Abstract): DirectEdit: Step-Level Accurate Inversion for Flow-Based Image Editing

Paper abstract: DirectEdit: Step-Level Accurate Inversion for Flow-Based Image Editing

arxiv url: http://arxiv.org/abs/2605.02417v1
Date: Mon, 04 May 2026 10:09:18 GMT
Status: Translation complete
Last updated in system: 2026-05-05 20:33:50.234033
Title: DirectEdit: Step-Level Accurate Inversion for Flow-Based Image Editing
Title (reference translation): DirectEdit: Step-Level Accurate Inversion for Flow-Based Image Editing
Authors: Desong Yang, Mang Ye
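Abstract summary: We propose DirectEdit, a training-free editing method for pre-trained text-to-image (T2I) models. DirectEdit eliminates the inherent reconstruction error without introducing additional neural function evaluations (NFEs). Experiments show that DirectEdit achieves efficient and accurate image editing and outperforms state-of-the-art methods.
Reference score (independently computed attention score): 51.56484100374058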
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With recent advancements in large-scale pre-trained text-to-image (T2I) models, training-free image editing methods have demonstrated remarkable success. Typically, these methods involve adding noise to a clean image via an inversion process, followed by separate denoising steps for the reconstruction and editing paths during the forward process. However, since the reconstruction path is approximated using noisy latents from mismatched timesteps, existing methods inevitably suffer from accumulated drift, which fundamentally limits reconstruction fidelity. To address this challenge, we systematically analyze the inversion process within the flow transformer and propose DirectEdit, a simple yet effective editing method that eliminates the inherent reconstruction error without introducing additional neural function evaluations (NFEs). Unlike most prior works that attempt to rectify the inversion path, DirectEdit focuses on directly aligning the forward paths, enabling precise reconstruction and reliable feature sharing. Furthermore, we introduce a preservation mechanism based on attention feature injection and multi-branch mask-guided noise blending, which effectively balances fidelity and editability. Extensive experiments across diverse scenarios demonstrate that DirectEdit achieves efficient and accurate image editing, delivering superior performance that outperforms state-of-the-art methods. Code and examples are available at https://desongyang.github.io/Directedit.
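Abstract (reference translation): With recent advances in large-scale pre-trained text-to-image (T2I) models, training-free image editing methods have achieved remarkable success. These methods typically add noise to a clean image through an inversion process and then run separate denoising steps for the reconstruction and editing paths during the forward process. However, because the reconstruction path is approximated using noisy latents from mismatched timesteps, existing methods inevitably suffer from accumulated drift, which fundamentally limits reconstruction fidelity. To address this challenge, we systematically analyze the inversion process within the flow transformer and propose DirectEdit, a simple yet effective editing method that eliminates the inherent reconstruction error without introducing additional neural function evaluations (NFEs). Unlike most prior work, which attempts to correct the inversion path, DirectEdit focuses on directly aligning the forward paths, enabling precise reconstruction and reliable feature sharing. Furthermore, we introduce a preservation mechanism based on attention feature injection and multi-branch mask-guided noise blending. Extensive experiments across diverse scenarios show that DirectEdit achieves efficient and accurate image editing and outperforms state-of-the-art methods. Code and examples are available at https://desongyang.github.io/Directedit.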
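
The drift that the abstract attributes to mismatched timesteps can be illustrated with a toy one-dimensional flow. The sketch below is not the DirectEdit algorithm: the velocity field, the function names, and the idea of caching per-step velocities are all assumptions made for illustration. It only shows that an Euler reconstruction which evaluates the velocity at a different (latent, timestep) pair than the inversion did will not return to the starting point, while a reconstruction that reuses each inversion step's own velocity will.

```python
def velocity(x, t):
    """Toy stand-in for a learned flow velocity field (hypothetical)."""
    return -x + t


def invert(x0, timesteps):
    """Euler inversion from clean (first timestep) to noisy (last timestep).

    Also records the velocity used at every step.
    """
    x, cached = x0, []
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        v = velocity(x, t_cur)          # evaluated at (x_i, t_i)
        cached.append(v)
        x = x + (t_next - t_cur) * v
    return x, cached


def reconstruct_naive(x_noisy, timesteps):
    """Reverse Euler pass that re-evaluates the velocity on the way back."""
    x = x_noisy
    for t_cur, t_next in zip(reversed(timesteps[:-1]), reversed(timesteps[1:])):
        # Evaluated at (x_{i+1}, t_{i+1}), not at the (x_i, t_i) used during
        # inversion, so each step is slightly off and the error accumulates.
        x = x - (t_next - t_cur) * velocity(x, t_next)
    return x


def reconstruct_aligned(x_noisy, timesteps, cached):
    """Reverse pass that undoes each step with the velocity it was made with."""
    x = x_noisy
    steps = list(zip(timesteps[:-1], timesteps[1:]))
    for (t_cur, t_next), v in zip(reversed(steps), reversed(cached)):
        x = x - (t_next - t_cur) * v
    return x


if __name__ == "__main__":
    ts = [i / 10 for i in range(11)]    # 10 Euler steps from t=0 to t=1
    x_noisy, cached = invert(1.0, ts)
    print("naive error:  ", abs(reconstruct_naive(x_noisy, ts) - 1.0))
    print("aligned error:", abs(reconstruct_aligned(x_noisy, ts, cached) - 1.0))
```

In this toy setting the aligned pass recovers the input up to floating-point rounding and needs no extra velocity evaluations, whereas the naive pass ends visibly away from it. How the paper achieves step-level alignment in a real flow transformer is described in the paper itself, not here.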

Related papers