Fugu-MT Paper Translation (Abstract): EditIDv2: Editable ID Customization with Data-Lubricated ID Feature Integration for Text-to-Image Generation

Paper overview: EditIDv2: Editable ID Customization with Data-Lubricated ID Feature Integration for Text-to-Image Generation

arxiv url: http://arxiv.org/abs/2509.05659v1
Date: Sat, 06 Sep 2025 09:29:48 GMT
Status: Translation complete
Last updated in system: 2025-09-09 14:07:03.635764
Title: EditIDv2: Editable ID Customization with Data-Lubricated ID Feature Integration for Text-to-Image Generation
Title (reference translation): EditIDv2: Editable ID Customization via Data-Lubricated ID Feature Integration for Text-to-Image Generation
Authors: Guandong Li, Zhaobin Chu
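Abstract summary: EditIDv2 is a tuning-free solution designed specifically for high-complexity narrative scenes and long text inputs. It achieves deep, multi-level semantic editing while maintaining identity consistency in complex narrative environments, using only a small amount of data lubrication.
Reference score (independently computed attention metric): 10.474377498273205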
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose EditIDv2, a tuning-free solution specifically designed for high-complexity narrative scenes and long text inputs. Existing character editing methods perform well under simple prompts, but often suffer from degraded editing capabilities, semantic understanding biases, and identity consistency breakdowns when faced with long text narratives containing multiple semantic layers, temporal logic, and complex contextual relationships. In EditID, we analyzed the impact of the ID integration module on editability. In EditIDv2, we further explore and address the influence of the ID feature integration module. The core of EditIDv2 is to discuss the issue of editability injection under minimal data lubrication. Through a sophisticated decomposition of PerceiverAttention, the introduction of ID loss and joint dynamic training with the diffusion model, as well as an offline fusion strategy for the integration module, we achieve deep, multi-level semantic editing while maintaining identity consistency in complex narrative environments using only a small amount of data lubrication. This meets the demands of long prompts and high-quality image generation, and achieves excellent results in the IBench evaluation.
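Abstract (reference translation): We propose EditIDv2, a tuning-free solution designed specifically for complex narrative scenes and long text inputs. Existing character editing methods work well with simple prompts, but when faced with long narrative texts that contain multiple semantic layers, temporal logic, and complex contextual relationships, they often suffer from degraded editing ability, biased semantic understanding, and loss of identity consistency. In EditID, we analyzed how the ID integration module affects editability. In EditIDv2, we further investigate and address the influence of the ID feature integration module. The core of EditIDv2 is a discussion of editability injection under minimal data lubrication. Through a refined decomposition of PerceiverAttention, the introduction of an ID loss, joint dynamic training with the diffusion model, and an offline fusion strategy for the integration module, we achieve deep, multi-level semantic editing while maintaining identity consistency in complex narrative environments with only a small amount of data lubrication. This meets the demands of long prompts and high-quality image generation and yields excellent results on the IBench evaluation.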

Related papers