CycleTrans: Learning Neutral yet Discriminative Features for Visible-Infrared Person Re-Identification
- URL: http://arxiv.org/abs/2208.09844v1
- Date: Sun, 21 Aug 2022 08:41:40 GMT
- Title: CycleTrans: Learning Neutral yet Discriminative Features for Visible-Infrared Person Re-Identification
- Authors: Qiong Wu, Jiaer Xia, Pingyang Dai, Yiyi Zhou, Yongjian Wu, Rongrong Ji
- Abstract summary: Visible-infrared person re-identification (VI-ReID) is a task of matching the same individuals across the visible and infrared modalities.
Existing VI-ReID methods mainly focus on learning general features across modalities, often at the expense of feature discriminability.
We present a novel cycle-construction-based network for neutral yet discriminative feature learning, termed CycleTrans.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visible-infrared person re-identification (VI-ReID) is a task of matching the
same individuals across the visible and infrared modalities. Its main challenge
lies in the modality gap caused by cameras operating on different spectra.
Existing VI-ReID methods mainly focus on learning general features across
modalities, often at the expense of feature discriminability. To address this
issue, we present a novel cycle-construction-based network for neutral yet
discriminative feature learning, termed CycleTrans. Specifically, CycleTrans
uses a lightweight Knowledge Capturing Module (KCM) to capture rich semantics
from the modality-relevant feature maps according to pseudo queries.
Afterwards, a Discrepancy Modeling Module (DMM) is deployed to transform these
features into neutral ones according to the modality-irrelevant prototypes. To
ensure feature discriminability, two more KCMs are deployed for feature cycle
construction. With cycle construction, our method can learn effective neutral
features for visible and infrared images while preserving their salient
semantics. Extensive experiments on the SYSU-MM01 and RegDB datasets validate
the merits of CycleTrans against a range of state-of-the-art methods, with
gains of +4.57% rank-1 on SYSU-MM01 and +2.2% rank-1 on RegDB.
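The abstract describes the dataflow only at a high level. Below is a minimal, hypothetical sketch of that pipeline, assuming the KCM behaves like lightweight cross-attention driven by learnable pseudo queries and the DMM re-expresses features over a bank of learnable modality-irrelevant prototypes; all shapes, names, and hyperparameters here are assumptions, not the authors' implementation.
```python
import torch
import torch.nn as nn

class KCM(nn.Module):
    """Hypothetical Knowledge Capturing Module: learnable pseudo queries
    cross-attend over flattened feature-map tokens (one reading of
    'capturing rich semantics according to pseudo queries')."""
    def __init__(self, dim=256, num_queries=16, heads=4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, tokens):                    # tokens: (B, N, dim)
        q = self.queries.unsqueeze(0).expand(tokens.size(0), -1, -1)
        out, _ = self.attn(q, tokens, tokens)     # queries attend to tokens
        return out                                # (B, num_queries, dim)

class DMM(nn.Module):
    """Hypothetical Discrepancy Modeling Module: re-express features as
    soft combinations of learnable modality-irrelevant prototypes."""
    def __init__(self, dim=256, num_prototypes=64):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, dim))

    def forward(self, feats):                     # feats: (B, K, dim)
        weights = (feats @ self.prototypes.t()).softmax(dim=-1)
        return weights @ self.prototypes          # neutral features (B, K, dim)
```
Under this reading, cycle construction would feed the neutral features through two further KCMs to reconstruct the original modality-relevant features, with a reconstruction objective keeping the neutral features anchored to each image's salient semantics.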
Related papers
- Prototype-Driven Multi-Feature Generation for Visible-Infrared Person Re-identification
Primary challenges in visible-infrared person re-identification arise from the differences between visible (VIS) and infrared (IR) images.
Existing methods often rely on horizontal partitioning to align part-level features, which can introduce inaccuracies.
We propose a novel Prototype-Driven Multi-feature generation framework (PDM) aimed at mitigating cross-modal discrepancies.
arXiv Detail & Related papers (2024-09-09T14:12:23Z)
- Dynamic Identity-Guided Attention Network for Visible-Infrared Person Re-identification
Visible-infrared person re-identification (VI-ReID) aims to match people with the same identity between visible and infrared modalities.
Existing methods generally try to bridge cross-modal differences at the image or feature level.
We introduce a dynamic identity-guided attention network (DIAN) to mine identity-guided and modality-consistent embeddings.
arXiv Detail & Related papers (2024-05-21T12:04:56Z)
- Unsupervised Visible-Infrared ReID via Pseudo-label Correction and Modality-level Alignment
Unsupervised visible-infrared person re-identification (UVI-ReID) has recently gained great attention due to its potential for enhancing human detection in diverse environments without labeling.
Previous methods utilize intra-modality clustering and cross-modality feature matching to achieve UVI-ReID (a generic sketch of the clustering step follows this entry).
arXiv Detail & Related papers (2024-04-10T02:03:14Z)
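The intra-modality clustering mentioned in the entry above is commonly implemented by clustering each modality's features separately and treating cluster indices as pseudo identities. The sketch below illustrates that generic recipe with scikit-learn's DBSCAN; it is not this paper's specific pipeline, and `eps`/`min_samples` are assumed hyperparameters.
```python
import numpy as np
from sklearn.cluster import DBSCAN

def pseudo_labels_per_modality(vis_feats, ir_feats, eps=0.5, min_samples=4):
    """Cluster each modality on its own; cluster indices become pseudo
    identities, and -1 marks noise samples that are typically discarded."""
    vis_labels = DBSCAN(eps=eps, min_samples=min_samples,
                        metric="cosine").fit_predict(vis_feats)
    ir_labels = DBSCAN(eps=eps, min_samples=min_samples,
                       metric="cosine").fit_predict(ir_feats)
    return vis_labels, ir_labels

# Random features standing in for backbone embeddings.
vis = np.random.randn(200, 256).astype(np.float32)
ir = np.random.randn(150, 256).astype(np.float32)
v_lab, i_lab = pseudo_labels_per_modality(vis, ir)
```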
- Frequency Domain Nuances Mining for Visible-Infrared Person Re-identification
Existing methods mainly exploit spatial information while ignoring discriminative frequency information.
We propose a novel Frequency Domain Nuances Mining (FDNM) method to explore the cross-modality frequency domain information.
Our method outperforms the second-best method by 5.2% in Rank-1 accuracy and 5.8% in mAP on the SYSU-MM01 dataset.
arXiv Detail & Related papers (2024-01-04T09:19:54Z)
- Frequency Domain Modality-invariant Feature Learning for Visible-infrared Person Re-Identification
We propose a novel Frequency Domain modality-invariant feature learning framework (FDMNet) to reduce modality discrepancy from the frequency domain perspective.
Our framework introduces two novel modules, the Instance-Adaptive Amplitude Filter (IAF) and the Phase-Preserving Normalization (PPNorm); a generic amplitude/phase sketch follows this entry.
arXiv Detail & Related papers (2024-01-03T17:11:27Z)
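The two frequency-domain entries above rest on the same basic operation: decompose a feature map (or image) with a 2-D FFT into amplitude and phase, edit the amplitude, keep the phase, and invert. The sketch below shows that generic operation only; the learned, instance-adaptive filtering of IAF and the normalization of PPNorm are not reproduced here.
```python
import torch

def filter_amplitude(x, scale):
    """Scale the amplitude spectrum while preserving the phase (the
    component that mostly carries spatial structure). A fixed `scale`
    is an illustrative stand-in for a learned, instance-adaptive filter."""
    spec = torch.fft.fft2(x, norm="ortho")        # complex spectrum
    amp, phase = spec.abs(), spec.angle()
    spec = torch.polar(amp * scale, phase)        # edit amplitude, keep phase
    return torch.fft.ifft2(spec, norm="ortho").real

x = torch.randn(2, 64, 32, 16)                    # (B, C, H, W) feature map
y = filter_amplitude(x, scale=0.8)
```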
- Exploring Invariant Representation for Visible-Infrared Person Re-Identification
Cross-spectral person re-identification, which aims to associate identities to pedestrians across different spectra, faces the main challenge of modality discrepancy.
In this paper, we address the problem at both the image level and the feature level in an end-to-end hybrid learning framework named Robust Feature Mining network (RFM).
Experiment results on two standard cross-spectral person re-identification datasets, RegDB and SYSU-MM01, have demonstrated state-of-the-art performance.
arXiv Detail & Related papers (2023-02-02T05:24:50Z)
- On Exploring Pose Estimation as an Auxiliary Learning Task for Visible-Infrared Person Re-identification
In this paper, we exploit pose estimation as an auxiliary learning task to assist the VI-ReID task in an end-to-end framework (a generic multi-task training sketch follows this entry).
By jointly training these two tasks in a mutually beneficial manner, our model learns higher quality modality-shared and ID-related features.
Experimental results on two benchmark VI-ReID datasets show that the proposed method consistently improves state-of-the-art methods by significant margins.
arXiv Detail & Related papers (2022-01-11T09:44:00Z)
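Auxiliary-task training of this kind usually amounts to a shared backbone with two heads and a weighted sum of losses. The sketch below shows that generic pattern; the head designs, the keypoint regression target, and the weight `lam` are assumptions, not this paper's exact setup.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReIDWithPoseAux(nn.Module):
    """Shared backbone with an identity head (main task) and a keypoint
    regression head (auxiliary task)."""
    def __init__(self, backbone, feat_dim, num_ids, num_joints=17):
        super().__init__()
        self.backbone = backbone                          # images -> (B, feat_dim)
        self.id_head = nn.Linear(feat_dim, num_ids)
        self.pose_head = nn.Linear(feat_dim, num_joints * 2)  # (x, y) per joint

    def forward(self, images):
        feats = self.backbone(images)
        return self.id_head(feats), self.pose_head(feats)

def joint_loss(id_logits, pose_pred, id_labels, pose_targets, lam=0.5):
    # Total loss = ReID classification loss + lam * auxiliary pose loss.
    return (F.cross_entropy(id_logits, id_labels)
            + lam * F.mse_loss(pose_pred, pose_targets))
```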
- CMTR: Cross-modality Transformer for Visible-infrared Person Re-identification
We propose a cross-modality transformer-based method (CMTR) for the visible-infrared person re-identification task.
We design novel modality embeddings, which are fused with token embeddings to encode modality information (see the sketch after this entry).
The proposed CMTR significantly surpasses existing CNN-based methods.
arXiv Detail & Related papers (2021-10-18T03:12:59Z)
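One common way to fuse modality information with token embeddings, analogous to positional embeddings, is to add a learnable per-modality vector to every token. The sketch below shows that generic mechanism under assumed shapes; it is not necessarily CMTR's exact design.
```python
import torch
import torch.nn as nn

class ModalityEmbedding(nn.Module):
    """Add a learnable modality vector (0 = visible, 1 = infrared) to every
    token, the same way positional embeddings are added."""
    def __init__(self, dim=768, num_modalities=2):
        super().__init__()
        self.embed = nn.Embedding(num_modalities, dim)

    def forward(self, tokens, modality_ids):      # tokens: (B, N, dim), ids: (B,)
        return tokens + self.embed(modality_ids)[:, None, :]

tokens = torch.randn(4, 197, 768)                 # e.g. ViT patch tokens + [CLS]
ids = torch.tensor([0, 0, 1, 1])                  # two visible, two infrared images
fused = ModalityEmbedding()(tokens, ids)
```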
- SFANet: A Spectrum-aware Feature Augmentation Network for Visible-Infrared Person Re-Identification
We propose a novel spectrum-aware feature augmentation network named SFANet for the cross-modality matching problem.
Learning with grayscale-spectrum images, our model can markedly reduce the modality discrepancy and capture inner structural relations (a generic grayscale-augmentation sketch follows this entry).
At the feature level, we improve the conventional two-stream network by balancing the number of modality-specific and shareable convolutional blocks.
arXiv Detail & Related papers (2021-02-24T08:57:32Z)
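Grayscale-spectrum learning is often realized as an input-level augmentation: with some probability, a visible RGB image is replaced by its luminance replicated over three channels, so visible inputs look statistically closer to single-channel infrared. The sketch below shows one common variant of this idea; the probability `p` and the luma weights are assumptions, not necessarily SFANet's scheme.
```python
import torch

def to_grayscale_spectrum(rgb, p=0.5):
    """With probability p, replace a batch of RGB images (B, 3, H, W) with
    their luminance replicated across the three channels."""
    if torch.rand(()) < p:
        w = torch.tensor([0.299, 0.587, 0.114], device=rgb.device)  # BT.601 luma
        gray = (rgb * w.view(1, 3, 1, 1)).sum(dim=1, keepdim=True)
        return gray.expand_as(rgb).contiguous()
    return rgb

imgs = torch.rand(8, 3, 256, 128)                 # typical ReID input resolution
aug = to_grayscale_spectrum(imgs)
```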
- Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem.
Existing VI-ReID methods tend to learn global representations, which have limited discriminability and weak robustness to noisy images.
We propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
arXiv Detail & Related papers (2020-07-18T03:08:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.