Fugu-MT Paper Translation (Abstract): TweezeEdit: Consistent and Efficient Image Editing with Path Regularization

Paper abstract: TweezeEdit: Consistent and Efficient Image Editing with Path Regularization

arxiv url: http://arxiv.org/abs/2508.10498v1
Date: Thu, 14 Aug 2025 09:59:45 GMT
Status: Translation complete
System update date: 2025-08-15 22:24:48.267719
Title: TweezeEdit: Consistent and Efficient Image Editing with Path Regularization
Title (reference translation): TweezeEdit: Consistent and Efficient Image Editing via Path Regularization
Authors: Jianda Mao, Kaibo Wang, Yang Xiang, Kani Chen
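Abstract summary: We propose TweezeEdit, a tuning-free and inversion-free framework for consistent and efficient image editing. Rather than relying solely on the inversion anchors, the method regularizes the entire denoising path to address the limitations of existing approaches. Experiments show that TweezeEdit outperforms existing methods in semantic preservation and target alignment.
Reference score (independently computed attention score): 6.248205481752008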
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large-scale pre-trained diffusion models empower users to edit images through text guidance. However, existing methods often over-align with target prompts while inadequately preserving source image semantics. Such approaches generate target images explicitly or implicitly from the inversion noise of the source images, termed the inversion anchors. We identify this strategy as suboptimal for semantic preservation and inefficient due to elongated editing paths. We propose TweezeEdit, a tuning- and inversion-free framework for consistent and efficient image editing. Our method addresses these limitations by regularizing the entire denoising path rather than relying solely on the inversion anchors, ensuring source semantic retention and shortening editing paths. Guided by gradient-driven regularization, we efficiently inject target prompt semantics along a direct path using a consistency model. Extensive experiments demonstrate TweezeEdit's superior performance in semantic preservation and target alignment, outperforming existing methods. Remarkably, it requires only 12 steps (1.6 seconds per edit), underscoring its potential for real-time applications.
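Abstract (reference translation): Large-scale pre-trained diffusion models let users edit images through text guidance. However, existing methods often over-align with the target prompt while failing to adequately preserve the semantics of the source image. Such approaches generate the target image, explicitly or implicitly, from the inversion noise of the source image, which is called the inversion anchor. We find this strategy suboptimal for semantic preservation and inefficient because it lengthens the editing path. We propose TweezeEdit, a tuning-free and inversion-free framework for consistent and efficient image editing. Rather than relying only on the inversion anchor, our method regularizes the entire denoising path, which retains source semantics and shortens the editing path. Guided by gradient-driven regularization, it efficiently injects target-prompt semantics along a direct path using a consistency model. Extensive experiments show that TweezeEdit outperforms existing methods in both semantic preservation and target alignment. Notably, it requires only 12 steps (1.6 seconds per edit), underscoring its potential for real-time applications.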
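
The contrast the abstract draws, between starting from a source-derived anchor and constraining every step of the path, can be illustrated with a toy numerical sketch. This is not the TweezeEdit algorithm: the abstract gives no update rule, so the quadratic losses, the function names `toy_edit` and `distance`, and the parameters `lr` and `lam` are all assumptions chosen for illustration. The 12-step count only echoes the figure quoted in the abstract. The sketch shows that a per-step penalty toward the source keeps the result closer to the source than an unpenalized run with the same starting point.

```python
def toy_edit(x_src, x_tgt, steps=12, lr=0.2, lam=0.0):
    """Toy gradient descent from x_src toward x_tgt (illustrative only).

    Minimizes 0.5*||x - x_tgt||^2 + 0.5*lam*||x - x_src||^2.
    With lam = 0 the source only sets the starting point (anchor only);
    with lam > 0 every step is also pulled back toward the source.
    """
    x = list(x_src)
    for _ in range(steps):
        x = [
            xi - lr * ((xi - ti) + lam * (xi - si))  # target pull + source pull
            for xi, si, ti in zip(x, x_src, x_tgt)
        ]
    return x


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


x_src = [0.0, 0.0, 0.0, 0.0]   # hypothetical source representation
x_tgt = [1.0, 1.0, 1.0, 1.0]   # hypothetical target representation

anchor_only = toy_edit(x_src, x_tgt, lam=0.0)
regularized = toy_edit(x_src, x_tgt, lam=1.0)

print("anchor only, distance from source:", distance(anchor_only, x_src))
print("regularized, distance from source:", distance(regularized, x_src))
```

In this toy setting `lam` trades target alignment against source preservation; how the actual method balances the two is described in the paper, not here.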

Related papers