Fugu-MT Paper Translation (Abstract): Delta-Adapter: Scalable Exemplar-Based Image Editing with Single-Pair Supervision

Paper overview: Delta-Adapter: Scalable Exemplar-Based Image Editing with Single-Pair Supervision

arxiv url: http://arxiv.org/abs/2605.07940v1
Date: Fri, 08 May 2026 16:09:15 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:39.189537
Title: Delta-Adapter: Scalable Exemplar-Based Image Editing with Single-Pair Supervision
Title (reference translation): Delta-Adapter: Scalable Exemplar-Based Image Editing with Single-Pair Supervision
Authors: Jiacheng Chen, Songze Li, Han Fu, Baoquan Zhao, Wei Liu, Yanyan Liang, Li Qing, Xudong Mao
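Abstract summary: Existing methods rely on a pair-of-pairs supervision paradigm. This paper proposes Delta-Adapter, which learns transferable editing semantics under single-pair supervision.
Reference score (independently computed attention level): 39.983456878703855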
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Exemplar-based image editing applies a transformation defined by a source-target image pair to a new query image. Existing methods rely on a pair-of-pairs supervision paradigm, requiring two image pairs sharing the same edit semantics to learn the target transformation. This constraint makes training data difficult to curate at scale and limits generalization across diverse edit types. We propose Delta-Adapter, a method that learns transferable editing semantics under single-pair supervision, requiring no textual guidance. Rather than directly exposing the exemplar pair to the model, we leverage a pre-trained vision encoder to extract a semantic delta that encodes the visual transformation between the two images. This semantic delta is injected into a pre-trained image editing model via a Perceiver-based adapter. Since the target image is never directly visible to the model, it can serve as the prediction target, enabling single-pair supervision without requiring additional exemplar pairs. This formulation allows us to leverage existing large-scale editing datasets for training. To further promote faithful transformation transfer, we introduce a semantic delta consistency loss that aligns the semantic change of the generated output with the ground-truth semantic delta extracted from the exemplar pair. Extensive experiments demonstrate that Delta-Adapter consistently improves both editing accuracy and content consistency over four strong baselines on seen editing tasks, while also generalizing more effectively to unseen editing tasks. Code will be available at https://delta-adapter.github.io.
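Abstract (reference translation): Exemplar-based image editing applies a transformation defined by a source-target image pair to a new query image. Existing methods rely on a pair-of-pairs supervision paradigm, which requires two image pairs that share the same edit semantics in order to learn the target transformation. This constraint makes training data hard to curate at scale and limits generalization across diverse edit types. We propose Delta-Adapter, which learns transferable editing semantics under single-pair supervision and needs no textual guidance. Instead of exposing the exemplar pair directly to the model, we use a pre-trained vision encoder to extract a semantic delta that encodes the visual transformation between the two images. This semantic delta is injected into a pre-trained image editing model through a Perceiver-based adapter. Because the target image is never directly visible to the model, it can serve as the prediction target, which enables single-pair supervision without additional exemplar pairs. This formulation makes it possible to use existing large-scale editing datasets for training. To further promote faithful transfer of the transformation, we introduce a semantic delta consistency loss that aligns the semantic change of the generated output with the ground-truth semantic delta extracted from the exemplar pair. Extensive experiments show that Delta-Adapter consistently improves both editing accuracy and content consistency over four strong baselines on seen editing tasks, while also generalizing more effectively to unseen editing tasks. Code will be available at https://delta-adapter.github.io.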
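
The semantic delta and the consistency loss described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact form of either quantity, so the sketch assumes the delta is a plain difference of encoder embeddings and the loss is one minus cosine similarity. The encoder is a hypothetical stand-in (a fixed random linear map), and every function name below is invented for illustration.

```python
import math
import random


def make_toy_encoder(in_dim, out_dim, seed=0):
    """Hypothetical stand-in for a pre-trained vision encoder (fixed random linear map)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def encode(image):
        # image: flat list of floats with length in_dim
        return [sum(w * x for w, x in zip(row, image)) for row in weights]

    return encode


def semantic_delta(encode, source, target):
    """Assumed form of the semantic delta: embedding(target) - embedding(source)."""
    return [t - s for s, t in zip(encode(source), encode(target))]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)  # small epsilon avoids division by zero


def delta_consistency_loss(encode, query, output, exemplar_source, exemplar_target):
    """Assumed loss: 1 - cosine(change in generated output, exemplar delta)."""
    delta_gt = semantic_delta(encode, exemplar_source, exemplar_target)
    delta_pred = semantic_delta(encode, query, output)
    return 1.0 - cosine_similarity(delta_pred, delta_gt)


# Single-pair setting: the exemplar source doubles as the query image,
# and the exemplar target is the prediction target.
encode = make_toy_encoder(in_dim=4, out_dim=8)
source = [0.1, 0.2, 0.3, 0.4]
target = [0.5, 0.1, 0.9, 0.4]

# An output equal to the target reproduces the exemplar delta exactly.
loss_perfect = delta_consistency_loss(encode, source, target, source, target)

# An output edited in the opposite direction gives the worst alignment.
reversed_output = [2 * s - t for s, t in zip(source, target)]
loss_reversed = delta_consistency_loss(encode, source, reversed_output, source, target)
```

Under these assumptions the loss is near 0 when the generated output changes the query in the same semantic direction as the exemplar pair, and near 2 when it changes it in the opposite direction. A real setup would replace the toy encoder with an actual pre-trained vision encoder and compute the loss on generated images.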

Related papers