Learning Progressive Modality-shared Transformers for Effective
Visible-Infrared Person Re-identification
- URL: http://arxiv.org/abs/2212.00226v1
- Date: Thu, 1 Dec 2022 02:20:16 GMT
- Title: Learning Progressive Modality-shared Transformers for Effective
Visible-Infrared Person Re-identification
- Authors: Hu Lu and Xuezhang Zou and Pingping Zhang
- Abstract summary: We propose a novel deep learning framework named Progressive Modality-shared Transformer (PMT) for effective VI-ReID.
To reduce the negative effect of modality gaps, we first take the gray-scale images as an auxiliary modality and propose a progressive learning strategy.
To cope with the problem of large intra-class differences and small inter-class differences, we propose a Discriminative Center Loss.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visible-Infrared Person Re-Identification (VI-ReID) is a challenging
retrieval task under complex modality changes. Existing methods usually focus
on extracting discriminative visual features while ignoring the reliability and
commonality of visual features between different modalities. In this paper, we
propose a novel deep learning framework named Progressive Modality-shared
Transformer (PMT) for effective VI-ReID. To reduce the negative effect of
modality gaps, we first take the gray-scale images as an auxiliary modality and
propose a progressive learning strategy. Then, we propose a Modality-Shared
Enhancement Loss (MSEL) to guide the model to explore more reliable identity
information from modality-shared features. Finally, to cope with the problem of
large intra-class differences and small inter-class differences, we propose a
Discriminative Center Loss (DCL) combined with the MSEL to further improve the
discrimination of reliable features. Extensive experiments on SYSU-MM01 and
RegDB datasets show that our proposed framework performs better than most
state-of-the-art methods. For model reproduction, we release the source code at
https://github.com/hulu88/PMT.
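The abstract's first step, using gray-scale images as an auxiliary modality, can be illustrated with a minimal sketch. The abstract does not say how the gray-scale images are produced, so the ITU-R BT.601 luma weights below are an assumption, as is replicating the result to three channels so it can pass through a backbone that expects RGB input. The function name `to_grayscale_3ch` is hypothetical and is not taken from the released PMT code.

```python
# Assumption: ITU-R BT.601 luma weights; the abstract does not specify the conversion.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def to_grayscale_3ch(image):
    """Convert an RGB image (rows of (r, g, b) tuples) to gray-scale.

    Each output pixel repeats the luma value three times, so the
    gray-scale image keeps the same shape as the visible input.
    """
    wr, wg, wb = LUMA_WEIGHTS
    gray_image = []
    for row in image:
        gray_row = []
        for r, g, b in row:
            y = wr * r + wg * g + wb * b  # weighted sum of the color channels
            gray_row.append((y, y, y))    # replicate luma across 3 channels
        gray_image.append(gray_row)
    return gray_image


if __name__ == "__main__":
    # A tiny 1 x 3 "visible" image: white, black, pure red.
    visible = [[(255, 255, 255), (0, 0, 0), (255, 0, 0)]]
    auxiliary = to_grayscale_3ch(visible)
    print(auxiliary[0][2])  # luma of pure red, repeated three times
```

Such a gray-scale copy discards color, which infrared images also lack, so it sits between the visible and infrared modalities. How PMT schedules this auxiliary modality during progressive learning is described in the paper and the released code, not in this sketch.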
Related papers
- Exploring Stronger Transformer Representation Learning for Occluded Person Re-Identification [2.552131151698595]
We propose SSSC-TransReID, a novel transformer-based person re-identification framework that combines self-supervision and supervision.
We designed a self-supervised contrastive learning branch, which can enhance the feature representation for person re-identification without negative samples or additional pre-training.
Our proposed model consistently obtains superior Re-ID performance and outperforms state-of-the-art ReID methods by large margins in mean average precision (mAP) and Rank-1 accuracy.
arXiv Detail & Related papers (2024-10-21T03:17:25Z)
- Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification [64.36210786350568]
We propose a novel learning framework named EDITOR to select diverse tokens from vision Transformers for multi-modal object ReID.
Our framework can generate more discriminative features for multi-modal object ReID.
arXiv Detail & Related papers (2024-03-15T12:44:35Z)
- Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities.
Existing attack methods have primarily focused on the characteristics of the visible image modality.
This study proposes a universal perturbation attack specifically designed for cross-modality ReID.
arXiv Detail & Related papers (2024-01-18T15:56:23Z)
- Modality Unifying Network for Visible-Infrared Person Re-Identification [24.186989535051623]
Visible-infrared person re-identification (VI-ReID) is a challenging task due to large cross-modality discrepancies and intra-class variations.
Existing methods mainly focus on learning modality-shared representations by embedding different modalities into the same feature space.
We propose a novel Modality Unifying Network (MUN) to explore a robust auxiliary modality for VI-ReID.
arXiv Detail & Related papers (2023-09-12T14:22:22Z)
- Learning Cross-modality Information Bottleneck Representation for Heterogeneous Person Re-Identification [61.49219876388174]
Visible-Infrared person re-identification (VI-ReID) is an important and challenging task in intelligent video surveillance.
Existing methods mainly focus on learning a shared feature space to reduce the modality discrepancy between visible and infrared modalities.
We present a novel mutual information and modality consensus network, namely CMInfoNet, to extract modality-invariant identity features.
arXiv Detail & Related papers (2023-08-29T06:55:42Z)
- Dynamic Enhancement Network for Partial Multi-modality Person Re-identification [52.70235136651996]
We design a novel dynamic enhancement network (DENet), which tolerates arbitrary missing modalities while maintaining the representation ability of multiple modalities.
Since the missing state might be changeable, we design a dynamic enhancement module, which dynamically enhances modality features according to the missing state in an adaptive manner.
arXiv Detail & Related papers (2023-05-25T06:22:01Z)
- MRCN: A Novel Modality Restitution and Compensation Network for Visible-Infrared Person Re-identification [36.88929785476334]
We propose a novel Modality Restitution and Compensation Network (MRCN) to narrow the gap between the two modalities.
Our method achieves 95.1% in terms of Rank-1 and 89.2% in terms of mAP on the RegDB dataset.
arXiv Detail & Related papers (2023-03-26T05:03:18Z)
- On Exploring Pose Estimation as an Auxiliary Learning Task for Visible-Infrared Person Re-identification [66.58450185833479]
In this paper, we exploit Pose Estimation as an auxiliary learning task to assist the VI-ReID task in an end-to-end framework.
By jointly training these two tasks in a mutually beneficial manner, our model learns higher quality modality-shared and ID-related features.
Experimental results on two benchmark VI-ReID datasets show that the proposed method consistently improves state-of-the-art methods by significant margins.
arXiv Detail & Related papers (2022-01-11T09:44:00Z)
- MMD-ReID: A Simple but Effective Solution for Visible-Thermal Person ReID [20.08880264104061]
We propose a simple but effective framework, MMD-ReID, that reduces the modality gap by an explicit discrepancy reduction constraint.
We conduct extensive experiments to demonstrate both qualitatively and quantitatively the effectiveness of MMD-ReID.
The proposed framework significantly outperforms the state-of-the-art methods on SYSU-MM01 and RegDB datasets.
arXiv Detail & Related papers (2021-11-09T11:33:32Z)
- CMTR: Cross-modality Transformer for Visible-infrared Person Re-identification [38.96033760300123]
We propose a cross-modality transformer-based method (CMTR) for the visible-infrared person re-identification task.
We design novel modality embeddings, which are fused with token embeddings to encode modality information.
Our proposed CMTR model's performance significantly surpasses existing outstanding CNN-based methods.
arXiv Detail & Related papers (2021-10-18T03:12:59Z)
- Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification [208.1227090864602]
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem.
Existing VI-ReID methods tend to learn global representations, which have limited discriminability and weak robustness to noisy images.
We propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
arXiv Detail & Related papers (2020-07-18T03:08:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.