Parameter Sharing Exploration and Hetero-Center based Triplet Loss for
Visible-Thermal Person Re-Identification
- URL: http://arxiv.org/abs/2008.06223v2
- Date: Fri, 4 Dec 2020 01:36:49 GMT
- Title: Parameter Sharing Exploration and Hetero-Center based Triplet Loss for
Visible-Thermal Person Re-Identification
- Authors: Haijun Liu, Xiaoheng Tan and Xichuan Zhou
- Abstract summary: This paper focuses on the visible-thermal cross-modality person re-identification (VT Re-ID) task.
The proposed method outperforms state-of-the-art methods by large margins.
- Score: 17.402673438396345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper focuses on the visible-thermal cross-modality person
re-identification (VT Re-ID) task, whose goal is to match person images between
the daytime visible modality and the nighttime thermal modality. A two-stream
network is usually adopted to address the cross-modality discrepancy, the most
challenging problem for VT Re-ID, by learning multi-modality person features.
In this paper, we explore how many parameters of the two-stream network should
be shared, a question still not well investigated in the existing literature.
By carefully splitting the ResNet50 model to construct the modality-specific
feature-extracting network and the modality-sharing feature-embedding network,
we experimentally demonstrate the effect of parameter sharing in the two-stream
network for VT Re-ID. Moreover, in the framework of part-level person feature
learning, we propose the hetero-center based triplet loss, which relaxes the
strict constraint of the traditional triplet loss by replacing the comparison
of an anchor with all other samples by a comparison of the anchor center with
all other centers. With these extremely simple means, the proposed method
significantly improves VT Re-ID performance. Experimental results on two
datasets show that our method outperforms the state-of-the-art methods by
large margins, especially on the RegDB dataset, where it achieves
rank-1/mAP/mINP of 91.05%/83.28%/68.84%. With this simple but effective
strategy, it can serve as a new baseline for VT Re-ID.
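To make the parameter-sharing question concrete, here is a minimal PyTorch sketch of the two-stream design described above: ResNet50 is split so the shallow stages are duplicated per modality (modality-specific feature extraction) while the deeper stages are shared (modality-sharing feature embedding). The split point after layer2, and all names, are illustrative assumptions rather than the authors' released code; the paper's point is precisely that the best split must be found experimentally.

```python
# Sketch: two-stream network with partial parameter sharing (assumed split).
import copy

import torch
import torch.nn as nn
from torchvision.models import resnet50

class TwoStreamResNet50(nn.Module):
    def __init__(self):
        super().__init__()
        base = resnet50(weights=None)
        # Modality-specific shallow stages: two independent copies.
        stem = nn.Sequential(base.conv1, base.bn1, base.relu, base.maxpool,
                             base.layer1, base.layer2)
        self.visible_stream = stem
        self.thermal_stream = copy.deepcopy(stem)
        # Modality-sharing deeper stages: one copy, shared parameters.
        self.shared_embedding = nn.Sequential(base.layer3, base.layer4,
                                              nn.AdaptiveAvgPool2d(1),
                                              nn.Flatten())

    def forward(self, x_visible, x_thermal):
        # Each modality passes through its own shallow stream, then the
        # shared embedding network maps both into a common feature space.
        f_v = self.shared_embedding(self.visible_stream(x_visible))
        f_t = self.shared_embedding(self.thermal_stream(x_thermal))
        return f_v, f_t
```

Moving the split point earlier or later changes how many parameters the two streams share, which is the design axis the paper studies.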
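The hetero-center based triplet loss can be sketched just as briefly. Per the abstract, centers replace samples: for each identity, the features of each modality are averaged into a center, and the triplet constraint is applied between an anchor center and all other centers. The margin value and hardest-center mining below are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: hetero-center based triplet loss (assumed margin and mining).
import torch
import torch.nn.functional as F

def hetero_center_triplet_loss(feat_v, feat_t, labels, margin=0.3):
    """feat_v, feat_t: (N, D) features from the visible/thermal streams.
    labels: (N,) identity labels, same ordering for both modalities."""
    ids = labels.unique()
    # One center per identity and per modality.
    c_v = torch.stack([feat_v[labels == pid].mean(dim=0) for pid in ids])
    c_t = torch.stack([feat_t[labels == pid].mean(dim=0) for pid in ids])
    centers = torch.cat([c_v, c_t], dim=0)               # (2P, D)
    center_labels = torch.cat([ids, ids], dim=0)         # (2P,)
    dist = torch.cdist(centers, centers)                 # pairwise distances
    loss, num_anchors = 0.0, centers.size(0)
    for i in range(num_anchors):
        pos = center_labels == center_labels[i]
        pos[i] = False                                   # exclude the anchor
        neg = center_labels != center_labels[i]
        # Hardest positive / negative center for this anchor center.
        loss = loss + F.relu(dist[i][pos].max() - dist[i][neg].min() + margin)
    return loss / num_anchors
```

With P identities in the batch, each anchor center has exactly one positive (the same identity's center in the other modality), so only 2P centers are compared instead of all N samples; this reduction is what relaxes the strict constraint of the traditional triplet loss.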
Related papers
- Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression [63.23578860867408]
We investigate how to integrate the evaluations of importance and sparsity scores into a single stage.
We present OFB, a cost-efficient approach that simultaneously evaluates both importance and sparsity scores.
Experiments demonstrate that OFB can achieve superior compression performance over state-of-the-art searching-based and pruning-based methods.
arXiv Detail & Related papers (2024-03-23T13:22:36Z)
- Modality Unifying Network for Visible-Infrared Person Re-Identification [24.186989535051623]
Visible-infrared person re-identification (VI-ReID) is a challenging task due to large cross-modality discrepancies and intra-class variations.
Existing methods mainly focus on learning modality-shared representations by embedding different modalities into the same feature space.
We propose a novel Modality Unifying Network (MUN) to explore a robust auxiliary modality for VI-ReID.
arXiv Detail & Related papers (2023-09-12T14:22:22Z)
- Unified Single-Stage Transformer Network for Efficient RGB-T Tracking [47.88113335927079]
We propose a single-stage Transformer RGB-T tracking network, namely USTrack, which unifies the above three stages into a single ViT (Vision Transformer) backbone.
With this structure, the network can extract fusion features of the template and search region under the mutual interaction of modalities.
Experiments on three popular RGB-T tracking benchmarks demonstrate that our method achieves new state-of-the-art performance while maintaining the fastest inference speed (84.2 FPS).
arXiv Detail & Related papers (2023-08-26T05:09:57Z)
- Feature Decoupling-Recycling Network for Fast Interactive Segmentation [79.22497777645806]
Recent interactive segmentation methods iteratively take the source image, user guidance, and the previously predicted mask as input.
We propose the Feature Decoupling-Recycling Network (FDRN), which decouples the modeling components based on their intrinsic discrepancies.
arXiv Detail & Related papers (2023-08-07T12:26:34Z)
- Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at the cluster level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z)
- Learning Progressive Modality-shared Transformers for Effective Visible-Infrared Person Re-identification [27.75907274034702]
We propose a novel deep learning framework named Progressive Modality-shared Transformer (PMT) for effective VI-ReID.
To reduce the negative effect of modality gaps, we first take the gray-scale images as an auxiliary modality and propose a progressive learning strategy.
To cope with the problem of large intra-class differences and small inter-class differences, we propose a Discriminative Center Loss.
arXiv Detail & Related papers (2022-12-01T02:20:16Z)
- On Exploring Pose Estimation as an Auxiliary Learning Task for Visible-Infrared Person Re-identification [66.58450185833479]
In this paper, we exploit Pose Estimation as an auxiliary learning task to assist the VI-ReID task in an end-to-end framework.
By jointly training these two tasks in a mutually beneficial manner, our model learns higher quality modality-shared and ID-related features.
Experimental results on two benchmark VI-ReID datasets show that the proposed method consistently improves state-of-the-art methods by significant margins.
arXiv Detail & Related papers (2022-01-11T09:44:00Z)
- MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification [35.97494894205023]
The RGB-infrared cross-modality person re-identification (ReID) task aims to recognize images of the same identity across the visible and infrared modalities.
Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space.
We present a novel multi-feature space joint optimization (MSO) network, which can learn modality-sharable features in both the single-modality space and the common space.
arXiv Detail & Related papers (2021-10-21T16:45:23Z)
- CMTR: Cross-modality Transformer for Visible-infrared Person Re-identification [38.96033760300123]
We propose a cross-modality transformer-based method (CMTR) for the visible-infrared person re-identification task.
We design novel modality embeddings, which are fused with token embeddings to encode modality information.
The performance of our proposed CMTR model significantly surpasses that of existing CNN-based methods.
arXiv Detail & Related papers (2021-10-18T03:12:59Z)
- Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search [65.51181219410763]
One-step methods have been developed to handle pedestrian detection and identification sub-tasks using a single network.
There are two major challenges in the current one-step approaches.
We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
arXiv Detail & Related papers (2021-02-22T06:19:45Z)
- Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition [89.0152015268929]
We propose the first neural architecture search (NAS)-based method for RGB-D gesture recognition.
The proposed method includes two key components: 1) enhanced temporal representation via the 3D Central Difference Convolution (3D-CDC) family, and 2) optimized backbones for multi-modal-rate branches and lateral connections.
The resultant multi-rate network provides a new perspective to understand the relationship between RGB and depth modalities and their temporal dynamics.
arXiv Detail & Related papers (2020-08-21T10:45:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.