Bi-directional Exponential Angular Triplet Loss for RGB-Infrared Person
Re-Identification
- URL: http://arxiv.org/abs/2006.00878v5
- Date: Tue, 29 Dec 2020 08:36:53 GMT
- Title: Bi-directional Exponential Angular Triplet Loss for RGB-Infrared Person
Re-Identification
- Authors: Hanrong Ye, Hong Liu, Fanyang Meng, Xia Li
- Abstract summary: We propose a novel ranking loss function, named Bi-directional Exponential Angular Triplet Loss, to help learn an angularly separable common feature space.
On the SYSU-MM01 dataset, the performance is improved from 7.40% / 11.46% to 38.57% / 38.61% for rank-1 accuracy / mAP.
The proposed method can be generalized to the task of single-modality Re-ID.
- Score: 18.09586167227815
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: RGB-Infrared person re-identification (RGB-IR Re-ID) is a cross-modality
matching problem, where the modality discrepancy is a big challenge. Most
existing works use Euclidean metric based constraints to resolve the
discrepancy between features of images from different modalities. However,
these methods are incapable of learning angularly discriminative feature
embedding because Euclidean distance cannot measure the included angle between
embedding vectors effectively. As an angularly discriminative feature space is
important for classifying the human images based on their embedding vectors, in
this paper, we propose a novel ranking loss function, named Bi-directional
Exponential Angular Triplet Loss, to help learn an angularly separable common
feature space by explicitly constraining the included angles between embedding
vectors. Moreover, to help stabilize and learn the magnitudes of embedding
vectors, we adopt a common space batch normalization layer. The quantitative
and qualitative experiments on the SYSU-MM01 and RegDB datasets support our
analysis. On the SYSU-MM01 dataset, the performance is improved from 7.40% / 11.46%
to 38.57% / 38.61% for rank-1 accuracy / mAP compared with the baseline. The
proposed method can be generalized to the task of single-modality Re-ID and
improves the rank-1 accuracy / mAP from 92.0% / 81.7% to 94.7% / 86.6% on the
Market-1501 dataset, and from 82.6% / 70.6% to 87.6% / 77.1% on the DukeMTMC-reID
dataset. Code: https://github.com/prismformore/expAT
Related papers
- RGB-based Category-level Object Pose Estimation via Decoupled Metric
Scale Recovery [72.13154206106259]
We propose a novel pipeline that decouples the 6D pose and size estimation to mitigate the influence of imperfect scales on rigid transformations.
Specifically, we leverage a pre-trained monocular estimator to extract local geometric information.
A separate branch is designed to directly recover the metric scale of the object based on category-level statistics.
arXiv Detail & Related papers (2023-09-19T02:20:26Z) - Improving CNN-based Person Re-identification using score Normalization [2.462953128215087]
This paper proposes a novel approach for PRe-ID, which combines a CNN based feature extraction method with Cross-view Quadratic Discriminant Analysis (XQDA) for metric learning.
The proposed approach is tested on four challenging datasets: VIPeR, GRID, CUHK01, and PRID450S.
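As a sketch of the metric-learning half of this pipeline: XQDA learns a discriminative subspace together with a Mahalanobis-style kernel built from intra-personal and extra-personal difference covariances. The NumPy snippet below implements only the simplified covariance-based kernel (the KISSME-style core), omitting the subspace learning; function names and the regularization constant are illustrative.

```python
import numpy as np

def kiss_metric(X, y, reg=1e-6):
    # Simplified XQDA/KISSME-style kernel. Real XQDA additionally learns a
    # projection W via a generalized eigenproblem before forming this kernel.
    # X: (n, d) CNN features, y: (n,) identity labels.
    intra, extra = [], []
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            (intra if y[i] == y[j] else extra).append(X[i] - X[j])
    S_i = np.cov(np.stack(intra), rowvar=False)   # intra-personal covariance
    S_e = np.cov(np.stack(extra), rowvar=False)   # extra-personal covariance
    eye = reg * np.eye(X.shape[1])
    return np.linalg.inv(S_i + eye) - np.linalg.inv(S_e + eye)

def xqda_distance(x1, x2, M):
    d = x1 - x2
    return float(d @ M @ d)  # (x1 - x2)^T M (x1 - x2)

# Toy usage: 40 feature vectors (16-d) covering 10 identities.
X = np.random.randn(40, 16); y = np.repeat(np.arange(10), 4)
M = kiss_metric(X, y)
print(xqda_distance(X[0], X[1], M))
```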
arXiv Detail & Related papers (2023-07-01T18:12:27Z) - Learning Feature Recovery Transformer for Occluded Person
Re-identification [71.18476220969647]
We propose a new approach called Feature Recovery Transformer (FRT) to address the two challenges simultaneously.
To reduce the interference of noise during feature matching, we focus on visible regions that appear in both images and develop a visibility graph to calculate similarity.
In terms of the second challenge, based on the developed graph similarity, for each query image, we propose a recovery transformer that exploits the feature sets of its $k$-nearest neighbors in the gallery to recover the complete features.
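A simplified stand-in for that recovery step: reconstruct a (possibly occluded) query feature from its k-nearest gallery neighbors, weighted by similarity. The real FRT runs a transformer over the neighbor set and uses the visibility graph; this PyTorch sketch conveys only the k-NN aggregation idea, and the temperature parameter is an assumption.

```python
import torch
import torch.nn.functional as F

def recover_feature(query, gallery, k=5, tau=0.1):
    # query: (d,), gallery: (n, d). Returns a recovered (d,) feature.
    sims = F.cosine_similarity(query.unsqueeze(0), gallery, dim=-1)  # (n,)
    top = sims.topk(k)
    w = torch.softmax(top.values / tau, dim=0)                       # (k,)
    return (w.unsqueeze(1) * gallery[top.indices]).sum(dim=0)        # (d,)

# Toy usage: recover a 256-d query from a gallery of 100 features.
rec = recover_feature(torch.randn(256), torch.randn(100, 256))
```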
arXiv Detail & Related papers (2023-01-05T02:36:16Z) - Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic
Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature space of each layer.
With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
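The diagonal-plus-low-rank parameterization is the computational point here: it keeps covariance construction and inversion cheap. A PyTorch sketch under assumed shapes (the paper meta-learns the factors with an attentive set encoder, which is not shown):

```python
import torch
import torch.nn.functional as F

def dplr_covariance(d_raw, V):
    # Sigma = diag(softplus(d_raw)) + V V^T, with V: (p, r) and r << p.
    return torch.diag(F.softplus(d_raw) + 1e-6) + V @ V.T

def dplr_inverse(d_raw, V):
    # Woodbury identity:
    #   (D + V V^T)^{-1} = D^{-1} - D^{-1} V (I_r + V^T D^{-1} V)^{-1} V^T D^{-1}
    # costing O(p r^2) instead of O(p^3).
    d = F.softplus(d_raw) + 1e-6                 # diagonal of D, shape (p,)
    DinvV = V / d.unsqueeze(1)                   # D^{-1} V, shape (p, r)
    K = torch.eye(V.shape[1]) + V.T @ DinvV      # (r, r)
    return torch.diag(1.0 / d) - DinvV @ torch.linalg.solve(K, DinvV.T)

# Toy check with p=64, r=4: Sigma @ Sigma^{-1} should be close to identity.
d_raw, V = torch.randn(64), 0.1 * torch.randn(64, 4)
err = dplr_covariance(d_raw, V) @ dplr_inverse(d_raw, V) - torch.eye(64)
print(err.abs().max())
```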
arXiv Detail & Related papers (2021-10-12T22:04:19Z) - Learning Instance-level Spatial-Temporal Patterns for Person
Re-identification [80.43222559182072]
We propose a novel Instance-level and Spatial-temporal Disentangled Re-ID method (InSTD) to improve Re-ID accuracy.
In our proposed framework, personalized information such as moving direction is explicitly considered to further narrow down the search space.
The proposed method achieves mAP of 90.8% on Market-1501 and 89.1% on DukeMTMC-reID, improving from the baseline 82.2% and 72.7%, respectively.
arXiv Detail & Related papers (2021-07-31T07:44:47Z) - Leaning Compact and Representative Features for Cross-Modality Person
Re-Identification [18.06382007908855]
This paper pays close attention to the cross-modality visible-infrared person re-identification (VI Re-ID) task.
The proposed method achieves impressive performance, outperforming other state-of-the-art methods.
arXiv Detail & Related papers (2021-03-26T01:53:16Z) - SAR-U-Net: squeeze-and-excitation block and atrous spatial pyramid
pooling based residual U-Net for automatic liver CT segmentation [3.192503074844775]
A modified U-Net based framework is presented, which leverages techniques from the Squeeze-and-Excitation (SE) block, Atrous Spatial Pyramid Pooling (ASPP), and residual learning.
The effectiveness of the proposed method was tested on two public datasets LiTS17 and SLiver07.
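Of these components, the SE block is the most self-contained. A standard PyTorch implementation of the channel recalibration it performs (following the conventional design; the reduction ratio of 16 is the usual default, not necessarily the paper's setting):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: reweight channels by globally pooled context."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                           # x: (B, C, H, W)
        s = x.mean(dim=(2, 3))                      # squeeze: global avg pool
        w = self.fc(s).unsqueeze(-1).unsqueeze(-1)  # excitation: channel weights
        return x * w                                # recalibrate feature maps
```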
arXiv Detail & Related papers (2021-03-11T02:32:59Z) - Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost.
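The modification is small enough to sketch. Below, each sample's loss is reweighted by a softmax over the mini-batch losses with temperature lam, so high-loss samples get larger weight; the normalization back to mean 1 and the exact weighting form are illustrative assumptions rather than the authors' precise algorithm.

```python
import torch

def absgd_step(model, optimizer, criterion, inputs, targets, lam=1.0):
    # criterion must use reduction='none' so we get one loss per sample.
    losses = criterion(model(inputs), targets)                  # (B,)
    with torch.no_grad():
        # Attentional weights: softmax of per-sample losses, rescaled to
        # mean 1 so the overall step size is comparable to plain SGD.
        w = torch.softmax(losses / lam, dim=0) * losses.numel()
    loss = (w * losses).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```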
arXiv Detail & Related papers (2020-12-13T03:41:52Z) - Bi-directional Cross-Modality Feature Propagation with
Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder that not only effectively recalibrates RGB feature responses, but also distills accurate depth information via multiple stages and aggregates the two recalibrated representations alternately.
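A minimal gated-fusion sketch in the spirit of this recalibrate-and-aggregate design (not the paper's actual Separation-and-Aggregation Gate): a learned per-pixel gate decides how much to trust depth versus RGB features.

```python
import torch
import torch.nn as nn

class GatedCrossModalFusion(nn.Module):
    """Illustrative gated RGB-D fusion: g * rgb + (1 - g) * depth."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb, depth):              # both: (B, C, H, W)
        g = self.gate(torch.cat([rgb, depth], dim=1))
        return g * rgb + (1.0 - g) * depth      # recalibrated aggregation
```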
arXiv Detail & Related papers (2020-07-17T18:35:24Z) - Cross-Modality Paired-Images Generation for RGB-Infrared Person
Re-Identification [29.92261627385826]
We propose to generate cross-modality paired-images and perform both global set-level and fine-grained instance-level alignments.
Our method explicitly removes modality-specific features, so the modality variation is better reduced.
Our model achieves gains of 9.2% and 7.7% in Rank-1 accuracy and mAP, respectively.
arXiv Detail & Related papers (2020-02-10T22:15:19Z)