Fugu-MT Paper Translation (Abstract): Object-aware Inversion and Reassembly for Image Editing

Paper overview: Object-aware Inversion and Reassembly for Image Editing

arxiv url: http://arxiv.org/abs/2310.12149v1
Date: Wed, 18 Oct 2023 17:59:02 GMT
Status: Translation complete
Last updated in system: 2023-10-19 15:27:47.570946
Title: Object-aware Inversion and Reassembly for Image Editing
Title (reference translation): Object-aware Inversion and Reassembly for Image Editing
Authors: Zhen Yang, Dinggang Gui, Wen Wang, Hao Chen, Bohan Zhuang, Chunhua Shen
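Abstract summary: We propose Object-aware Inversion and Reassembly (OIR) to enable object-level fine-grained editing. When editing an image, we use a search metric to find the optimal inversion step for each editing pair. The method achieves superior performance in editing object shapes, colors, materials, categories, etc., especially in multi-object editing scenarios.
Reference score (independently computed attention score): 64.8466081220814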
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: By comparing the original and target prompts in an editing task, we can obtain numerous editing pairs, each comprising an object and its corresponding editing target. To allow editability while maintaining fidelity to the input image, existing editing methods typically involve a fixed number of inversion steps that project the whole input image to its noisier latent representation, followed by a denoising process guided by the target prompt. However, we find that the optimal number of inversion steps for achieving ideal editing results varies significantly among different editing pairs, owing to varying editing difficulties. Therefore, the current literature, which relies on a fixed number of inversion steps, produces sub-optimal generation quality, especially when handling multiple editing pairs in a natural image. To this end, we propose a new image editing paradigm, dubbed Object-aware Inversion and Reassembly (OIR), to enable object-level fine-grained editing. Specifically, we design a new search metric, which determines the optimal inversion steps for each editing pair, by jointly considering the editability of the target and the fidelity of the non-editing region. We use our search metric to find the optimal inversion step for each editing pair when editing an image. We then edit these editing pairs separately to avoid concept mismatch. Subsequently, we propose an additional reassembly step to seamlessly integrate the respective editing results and the non-editing region to obtain the final edited image. To systematically evaluate the effectiveness of our method, we collect two datasets for benchmarking single- and multi-object editing, respectively. Experiments demonstrate that our method achieves superior performance in editing object shapes, colors, materials, categories, etc., especially in multi-object editing scenarios.
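Abstract (reference translation): By comparing the original prompt and the target prompt in an editing task, we can obtain numerous editing pairs, each consisting of an object and its corresponding editing target. To ensure editability while preserving fidelity to the input image, existing editing methods typically use a fixed number of inversion steps that project the entire input image to a noisier latent representation, followed by a denoising process guided by the target prompt. However, we find that the optimal number of inversion steps for obtaining ideal editing results differs greatly between editing pairs because of differences in editing difficulty. As a result, the current literature, which relies on a fixed number of inversion steps, yields sub-optimal generation quality, especially when multiple editing pairs are handled in a natural image. We therefore propose a new image editing paradigm called OIR (Object-aware Inversion and Reassembly) to enable object-level fine-grained editing. Specifically, we design a new search metric that determines the optimal inversion step for each editing pair by jointly considering the editability of the target and the fidelity of the non-editing region. We use this search metric to find the optimal inversion step for each editing pair when editing an image. Next, we edit these editing pairs separately to avoid concept mismatch. We then propose an additional reassembly step that seamlessly integrates the individual editing results and the non-editing region to obtain the final edited image. To systematically evaluate the effectiveness of the proposed method, we collect two datasets for benchmarking single-object and multi-object editing. Experiments show that the method achieves superior performance in editing object shapes, colors, materials, categories, etc., especially in multi-object editing.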
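
The per-pair search and the reassembly described in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration: the abstract does not specify the search metric or how reassembly is implemented, so the equal-weight sum of min-max-normalized scores, the function names `search_inversion_step` and `reassemble`, and the pixel-mask pasting are hypothetical stand-ins rather than the authors' method.

```python
from typing import Callable, Dict, List, Sequence

Grid = List[List[int]]  # toy stand-in for an image or a binary mask


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def search_inversion_step(
    steps: Sequence[int],
    editability: Callable[[int], float],
    fidelity: Callable[[int], float],
) -> int:
    """Pick the candidate step with the best combined score.

    Assumption: the combined score is the equal-weight sum of the
    normalized editability and fidelity scores. The source only says
    the two are considered jointly.
    """
    e = min_max([editability(s) for s in steps])
    f = min_max([fidelity(s) for s in steps])
    scores = [a + b for a, b in zip(e, f)]
    best = max(range(len(steps)), key=scores.__getitem__)
    return steps[best]


def reassemble(original: Grid, edits: Dict[str, Grid], masks: Dict[str, Grid]) -> Grid:
    """Paste each separately edited region into a copy of the original.

    Assumption: a plain mask-based paste stands in for the paper's
    reassembly step, whose details are not given in the abstract.
    """
    out = [row[:] for row in original]
    for name, edited in edits.items():
        for y, mask_row in enumerate(masks[name]):
            for x, inside in enumerate(mask_row):
                if inside:
                    out[y][x] = edited[y][x]
    return out
```

In this sketch each editing pair would get its own call to `search_inversion_step`, so an easy edit can settle on fewer inversion steps than a hard one, and `reassemble` leaves every pixel outside the masks untouched.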


Related papers