Fugu-MT Paper Translation (Abstract): FUSE: Frequency-domain Unification and Spectral Energy Alignment for Multi-modal Object Re-Identification

Paper summary: FUSE: Frequency-domain Unification and Spectral Energy Alignment for Multi-modal Object Re-Identification

arxiv url: http://arxiv.org/abs/2606.20044v1
Date: Thu, 18 Jun 2026 10:21:32 GMT
Status: Translation complete
Last updated in system: 2026-06-19 18:23:39.791855
Title: FUSE: Frequency-domain Unification and Spectral Energy Alignment for Multi-modal Object Re-Identification
Title (reference translation): FUSE: Frequency-Domain Unification and Spectral Energy Alignment for Multi-modal Object Re-Identification
Authors: Xuanhao Qi, Tom H. Luan, Yukang Zhang, Jinkai Zheng, Zhou Su, Shuwei Li, Lei Tan
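Abstract summary: This paper introduces FUSE, a frequency-domain framework that reformulates multi-modal ReID as a two-stage process of spectral disentanglement and energy alignment. The proposed Spectral Decomposition Module adaptively partitions features into low-, mid-, and high-frequency subspaces. In experiments on RGBNT201, RGBNT100, and MSVR310, FUSE achieves improvements of 9.1% mAP and 9.5% Rank-1.
Reference score (independently computed attention score): 41.52868809663972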
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite significant progress in multi-modal Re-Identification (ReID), existing methods tend to emphasize low-frequency cues. Consequently, they focus on attributes such as color, illumination, and coarse appearance, while overlooking mid and high-frequency structures that encode geometric, textural, and identity-discriminative details. This imbalance leads to incomplete spectral representations and unstable cross-modal alignment. To overcome these limitations, we introduce FUSE, a frequency-domain framework that reformulates multi-modal ReID as a two-stage process of spectral disentanglement and energy alignment. The proposed Spectral Decomposition Module (SDM) adaptively partitions features into low, mid, and high-frequency subspaces, enabling hierarchical spectral modeling. The Cross-Modal Alignment Module (CAM) further enforces energy alignment and subspace complementarity across modalities via frequency-consistency regularization. In addition, FUSE incorporates learnable frequency modulation to enhance robustness under varying illumination and heterogeneous sensor conditions. Extensive experiments on RGBNT201, RGBNT100, and MSVR310 show that FUSE achieves 9.1% mAP and 9.5% Rank-1 improvements, establishing an interpretable frequency-domain paradigm for multi-modal representation learning.
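Abstract (reference translation): Despite progress in multi-modal Re-Identification (ReID), existing methods tend to emphasize low-frequency cues. As a result, they focus on attributes such as color, illumination, and coarse appearance, and overlook the mid- and high-frequency structures that encode geometric, textural, and identity-discriminative details. This imbalance leads to incomplete spectral representations and unstable cross-modal alignment. To overcome these limitations, the authors introduce FUSE, a frequency-domain framework that reformulates multi-modal ReID as a two-stage process of spectral disentanglement and energy alignment. The proposed Spectral Decomposition Module (SDM) adaptively partitions features into low-, mid-, and high-frequency subspaces, enabling hierarchical spectral modeling. The Cross-Modal Alignment Module (CAM) further enforces energy alignment and subspace complementarity across modalities through frequency-consistency regularization. In addition, FUSE incorporates learnable frequency modulation to improve robustness under varying illumination and heterogeneous sensor conditions. Extensive experiments on RGBNT201, RGBNT100, and MSVR310 show that FUSE achieves improvements of 9.1% mAP and 9.5% Rank-1, establishing an interpretable frequency-domain paradigm for multi-modal representation learning.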
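
Illustrative sketch (not from the paper): the abstract describes partitioning features into low-, mid-, and high-frequency subspaces and aligning spectral energy across modalities. The toy example below shows the general idea on a 1-D signal using only the Python standard library. It is not the authors' SDM or CAM: the function names, the fixed cutoffs `low_cut` and `high_cut`, and the squared-distance gap are all assumptions made for illustration, whereas the paper's partition is described as adaptive and learned.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def band_energies(signal, low_cut=0.1, high_cut=0.3):
    """Return the share of spectral energy in the (low, mid, high) bands.

    low_cut and high_cut are hypothetical fixed cutoffs in cycles per sample;
    they stand in for the adaptive partition described in the abstract.
    """
    n = len(signal)
    energy = [0.0, 0.0, 0.0]
    for k, coeff in enumerate(dft(signal)):
        freq = min(k, n - k) / n  # fold negative frequencies onto [0, 0.5]
        band = 0 if freq < low_cut else (1 if freq < high_cut else 2)
        energy[band] += abs(coeff) ** 2
    total = sum(energy) or 1.0  # avoid division by zero for an all-zero signal
    return [e / total for e in energy]


def energy_alignment_gap(feat_a, feat_b):
    """Squared distance between the band-energy profiles of two signals."""
    ea, eb = band_energies(feat_a), band_energies(feat_b)
    return sum((x - y) ** 2 for x, y in zip(ea, eb))


# Two toy "modalities": one dominated by a slow wave, one by a fast wave.
N = 64
smooth = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
textured = [math.sin(2 * math.pi * 24 * t / N) for t in range(N)]
```

Here `band_energies(smooth)` concentrates almost all energy in the low band and `band_energies(textured)` in the high band, so `energy_alignment_gap(smooth, textured)` is large while the gap of a signal with itself is zero. A learned model would presumably minimize some such gap as a regularizer; the exact loss used by FUSE is not given in the abstract.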

Related papers