Fugu-MT Paper Translation (Abstract): Rethinking Vector Field Learning for Generative Segmentation

Paper summary: Rethinking Vector Field Learning for Generative Segmentation

arxiv url: http://arxiv.org/abs/2603.19218v1
Date: Thu, 19 Mar 2026 17:58:19 GMT
Status: Translation complete
System update date: 2026-03-20 17:19:06.324808
Title: Rethinking Vector Field Learning for Generative Segmentation
Authors: Chaoyang Wang, Yaobo Liang, Boci Peng, Fan Duan, Jingdong Wang, Yunhai Tong
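Abstract summary: Taming diffusion models for generative segmentation has attracted increasing attention. This work revisits diffusion segmentation from the perspective of vector field learning. It proposes a vector field reshaping strategy that augments the learned velocity field with a detached distance-aware correction term. This correction introduces both attractive and repulsive interactions, enhancing gradient magnitudes near centroids while preserving the original diffusion training framework.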
Reference score (independently computed attention level): 50.08025820235397
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Taming diffusion models for generative segmentation has attracted increasing attention. While existing approaches primarily focus on architectural tweaks or training heuristics, there remains a limited understanding of the intrinsic mismatch between continuous flow matching objectives and discrete perception tasks. In this work, we revisit diffusion segmentation from the perspective of vector field learning. We identify two key limitations of the commonly used flow matching objective: gradient vanishing and trajectory traversing, which result in slow convergence and poor class separation. To tackle these issues, we propose a principled vector field reshaping strategy that augments the learned velocity field with a detached distance-aware correction term. This correction introduces both attractive and repulsive interactions, enhancing gradient magnitudes near centroids while preserving the original diffusion training framework. Furthermore, we design a computationally efficient, quasi-random category encoding scheme inspired by Kronecker sequences, which integrates seamlessly with an end-to-end pixel neural field framework for pixel-level semantic alignment. Extensive experiments consistently demonstrate significant improvements over vanilla flow matching approaches, substantially narrowing the performance gap between generative segmentation and strong discriminative specialists.
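The abstract mentions a quasi-random category encoding inspired by Kronecker sequences but does not give its construction. As a rough illustration only, the sketch below builds class codes from a generic Kronecker sequence, x_k = frac(k * alpha), where alpha is a vector of irrational numbers. The choice of alpha (fractional parts of square roots of the first primes), the function names `first_primes`, `kronecker_codes`, and `nearest_class`, and the nearest-code decoding step are all assumptions made for this example, not the authors' scheme.

```python
import math


def first_primes(count):
    """Return the first `count` primes by simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def kronecker_codes(num_classes, dim):
    """Return `num_classes` points in [0, 1)^dim from a Kronecker sequence.

    Assumption: alpha_i = frac(sqrt(p_i)) for the first `dim` primes p_i.
    The paper's actual generator is not described in the abstract.
    """
    alphas = [math.sqrt(p) % 1.0 for p in first_primes(dim)]
    # Class k gets the (k + 1)-th sequence point, so no class maps to the origin.
    return [[((k + 1) * a) % 1.0 for a in alphas] for k in range(num_classes)]


def nearest_class(vector, codes):
    """Decode a vector to the index of the closest code (squared Euclidean distance)."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    return min(range(len(codes)), key=lambda k: sq_dist(codes[k]))


if __name__ == "__main__":
    codes = kronecker_codes(num_classes=5, dim=3)
    for k, code in enumerate(codes):
        print(k, [round(c, 4) for c in code])
    print(nearest_class(codes[3], codes))
```

Each code is computed directly from the class index, so the codebook costs O(num_classes * dim) to build and needs no learned embedding table. Whether the paper uses this particular form is not stated in the abstract.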
