Fugu-MT Paper Translation (Summary): SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing

Paper summary: SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing

arxiv url: http://arxiv.org/abs/2510.25970v1
Date: Wed, 29 Oct 2025 21:12:58 GMT
Status: Translation complete
Last updated in system: 2025-10-31 16:05:09.57517
Title: SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing
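Title (reference translation): SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing
Authors: Sung-Hoon Yoon, Minghan Li, Gaspard Beaudouin, Congcong Wen, Muhammad Rafay Azhar, Mengyu Wang
Abstract summary: Rectified flow models have become a de facto standard in image generation due to their stable sampling trajectories and high-fidelity outputs. Despite their strong generative capabilities, they face limitations in image editing tasks. Recent efforts have attempted to map source and target distributions directly via ODE-based approaches without inversion. In this work, we propose a flow decomposition-and-aggregation framework built upon an inversion-free formulation to address these limitations.
Reference score (independently computed attention score): 15.234877788378563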
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Rectified flow models have become a de facto standard in image generation due to their stable sampling trajectories and high-fidelity outputs. Despite their strong generative capabilities, they face critical limitations in image editing tasks: inaccurate inversion processes for mapping real images back into the latent space, and gradient entanglement issues during editing often result in outputs that do not faithfully reflect the target prompt. Recent efforts have attempted to directly map source and target distributions via ODE-based approaches without inversion; however, these methods still yield suboptimal editing quality. In this work, we propose a flow decomposition-and-aggregation framework built upon an inversion-free formulation to address these limitations. Specifically, we semantically decompose the target prompt into multiple sub-prompts, compute an independent flow for each, and aggregate them to form a unified editing trajectory. While we empirically observe that decomposing the original flow enhances diversity in the target space, generating semantically aligned outputs still requires consistent guidance toward the full target prompt. To this end, we design a projection and soft-aggregation mechanism for flow, inspired by gradient conflict resolution in multi-task learning. This approach adaptively weights the sub-target velocity fields, suppressing semantic redundancy while emphasizing distinct directions, thereby preserving both diversity and consistency in the final edited output. Experimental results demonstrate that our method outperforms existing zero-shot editing approaches in terms of semantic fidelity and attribute disentanglement. The code is available at https://github.com/Harvard-AI-and-Robotics-Lab/SplitFlow.
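Abstract (reference translation): Rectified flow models have become a de facto standard in image generation due to their stable sampling trajectories and high-fidelity outputs. Inaccurate inversion processes for mapping real images into the latent space, and gradient entanglement issues during editing, often result in outputs that do not faithfully reflect the target prompt. Recent work has attempted to map the source and target distributions directly, but these methods still yield suboptimal editing quality. In this work, we propose a flow decomposition-and-aggregation framework built upon an inversion-free formulation to address these limitations. Specifically, we semantically decompose the target prompt into multiple sub-prompts, compute an individual flow for each, and aggregate them to form a unified editing trajectory. While we empirically observe that decomposing the original flow enhances diversity in the target space, generating semantically aligned outputs still requires consistent guidance toward the full target prompt. To this end, we design a projection and soft-aggregation mechanism for flow, inspired by gradient conflict resolution in multi-task learning. This approach adaptively weights the sub-target velocity fields, suppressing semantic redundancy while emphasizing distinct directions, and thereby preserves both diversity and consistency in the final edited output. Experiments show that our method outperforms existing zero-shot editing approaches in terms of semantic fidelity and attribute disentanglement. The code is available at https://github.com/Harvard-AI-and-Robotics-Lab/SplitFlow.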
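
The projection and soft-aggregation idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not give the exact formulas, so two assumptions are swapped in: the projection follows the PCGrad-style rule from multi-task learning (drop the component of one vector that points against another), and the soft weights are a softmax over cosine similarity with the full-target velocity. The function names `project_conflicts`, `soft_weights`, and `aggregate_flow`, and the `temperature` parameter, are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero for near-zero vectors


def project_conflicts(velocities):
    """PCGrad-style projection (assumed, not taken from the paper).

    For each sub-prompt velocity, subtract its component along every other
    original velocity with which it has a negative dot product.
    """
    originals = [np.asarray(v, dtype=float) for v in velocities]
    projected = [v.copy() for v in originals]
    for i, v_i in enumerate(projected):
        for j, v_j in enumerate(originals):
            if i == j:
                continue
            dot = float(np.dot(v_i, v_j))
            if dot < 0.0:
                # In-place update of the copy stored in `projected`.
                v_i -= dot / (float(np.dot(v_j, v_j)) + EPS) * v_j
    return projected


def soft_weights(velocities, full_velocity, temperature=1.0):
    """Hypothetical weighting: softmax over cosine similarity to the
    velocity computed for the full target prompt."""
    ref = np.asarray(full_velocity, dtype=float)
    sims = np.array([
        float(np.dot(v, ref)) / (np.linalg.norm(v) * np.linalg.norm(ref) + EPS)
        for v in velocities
    ])
    logits = sims / temperature
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()


def aggregate_flow(sub_velocities, full_velocity, temperature=1.0):
    """Project the sub-prompt velocities, then combine them with soft weights."""
    projected = project_conflicts(sub_velocities)
    weights = soft_weights(projected, full_velocity, temperature)
    combined = sum(w * v for w, v in zip(weights, projected))
    return combined, weights


if __name__ == "__main__":
    # Toy 2-D velocities standing in for flattened latent velocity fields.
    subs = [np.array([1.0, 0.0]), np.array([-1.0, 1.0])]
    full = np.array([0.0, 1.0])
    combined, weights = aggregate_flow(subs, full)
    print("weights:", weights)
    print("combined velocity:", combined)
```

In an actual editor the vectors would be the model's predicted velocity fields at one ODE step, one per sub-prompt plus one for the full target prompt, and the combined velocity would drive the next integration step.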

Related papers