Fugu-MT Paper Translation (Abstract): Cross-Spectral Body Recognition with Side Information Embedding: Benchmarks on LLCM and Analyzing Range-Induced Occlusions on IJB-MDF

Paper summary: Cross-Spectral Body Recognition with Side Information Embedding: Benchmarks on LLCM and Analyzing Range-Induced Occlusions on IJB-MDF

arxiv url: http://arxiv.org/abs/2506.08953v1
Date: Tue, 10 Jun 2025 16:20:52 GMT
Status: Translation complete
Last updated in system: 2025-06-11 15:11:42.857678
Title: Cross-Spectral Body Recognition with Side Information Embedding: Benchmarks on LLCM and Analyzing Range-Induced Occlusions on IJB-MDF
Authors: Anirudh Nanduri, Siyuan Huang, Rama Chellappa
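Abstract summary: Vision Transformers (ViTs) have demonstrated impressive performance across a wide range of biometric tasks, including face and body recognition. In this work, we adapt a ViT model pretrained on visible (VIS) imagery to the challenging problem of cross-spectral body recognition. We integrate Side Information Embedding (SIE) and examine the impact of encoding domain and camera information to enhance cross-spectral matching. Surprisingly, our results show that encoding only camera information, without explicitly incorporating domain information, achieves state-of-the-art performance on the LLCM dataset.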
Reference score (attention score computed by this site): 51.36007967653781
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Vision Transformers (ViTs) have demonstrated impressive performance across a wide range of biometric tasks, including face and body recognition. In this work, we adapt a ViT model pretrained on visible (VIS) imagery to the challenging problem of cross-spectral body recognition, which involves matching images captured in the visible and infrared (IR) domains. Recent ViT architectures have explored incorporating additional embeddings beyond traditional positional embeddings. Building on this idea, we integrate Side Information Embedding (SIE) and examine the impact of encoding domain and camera information to enhance cross-spectral matching. Surprisingly, our results show that encoding only camera information - without explicitly incorporating domain information - achieves state-of-the-art performance on the LLCM dataset. While occlusion handling has been extensively studied in visible-spectrum person re-identification (Re-ID), occlusions in visible-infrared (VI) Re-ID remain largely underexplored - primarily because existing VI-ReID datasets, such as LLCM, SYSU-MM01, and RegDB, predominantly feature full-body, unoccluded images. To address this gap, we analyze the impact of range-induced occlusions using the IARPA Janus Benchmark Multi-Domain Face (IJB-MDF) dataset, which provides a diverse set of visible and infrared images captured at various distances, enabling cross-range, cross-spectral evaluations.
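The abstract does not spell out how the side information is injected. One common way to realize such an embedding in a ViT, sketched below as an assumption rather than as the authors' implementation, is to keep a lookup table with one vector per camera ID and add the selected vector to every token alongside the positional embedding. The function names, the `sie_weight` scale factor, and the random tables standing in for learned parameters are all hypothetical.

```python
import random


def make_table(num_rows, dim, seed):
    """Random lookup table standing in for learned embedding parameters."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_rows)]


def add_side_information(tokens, pos_table, cam_table, cam_id, sie_weight=1.0):
    """Add positional and camera embeddings to a sequence of patch tokens.

    tokens:     list of token vectors, each of length dim
    pos_table:  one positional vector per token position
    cam_table:  one vector per camera ID (the side information)
    cam_id:     index of the camera that captured this image
    sie_weight: hypothetical scale applied to the camera vector
    """
    cam_vec = cam_table[cam_id]  # the same vector is shared by all tokens
    out = []
    for tok, pos in zip(tokens, pos_table):
        out.append([t + p + sie_weight * c for t, p, c in zip(tok, pos, cam_vec)])
    return out


if __name__ == "__main__":
    dim, num_tokens, num_cameras = 8, 5, 3
    tokens = make_table(num_tokens, dim, seed=0)      # stand-in patch tokens
    pos_table = make_table(num_tokens, dim, seed=1)   # positional embeddings
    cam_table = make_table(num_cameras, dim, seed=2)  # camera embeddings
    encoded = add_side_information(tokens, pos_table, cam_table, cam_id=2)
    print(len(encoded), len(encoded[0]))
```

In this sketch, encoding domain information as well would only require a second table indexed by spectrum (VIS or IR); the finding reported in the abstract corresponds to leaving that second table out and keeping the camera table alone.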


Related papers