Exploring Invariant Representation for Visible-Infrared Person
Re-Identification
- URL: http://arxiv.org/abs/2302.00884v1
- Date: Thu, 2 Feb 2023 05:24:50 GMT
- Title: Exploring Invariant Representation for Visible-Infrared Person
Re-Identification
- Authors: Lei Tan, Yukang Zhang, Shengmei Shen, Yan Wang, Pingyang Dai, Xianming
Lin, Yongjian Wu, Rongrong Ji
- Abstract summary: Cross-spectral person re-identification, which aims to associate identities to pedestrians across different spectra, faces a main challenge of the modality discrepancy.
In this paper, we address the problem at both the image level and the feature level in an end-to-end hybrid learning framework named the robust feature mining network (RFM).
Experiment results on two standard cross-spectral person re-identification datasets, RegDB and SYSU-MM01, have demonstrated state-of-the-art performance.
- Score: 77.06940947765406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-spectral person re-identification, which aims to associate identities
to pedestrians across different spectra, faces a main challenge of the modality
discrepancy. In this paper, we address the problem at both the image level and
the feature level in an end-to-end hybrid learning framework named the robust
feature mining network (RFM). In particular, we observe that the reflective
intensity of the same surface in photos shot at different wavelengths can be
transformed using a linear model. Moreover, we show that the variable linear
factor across different surfaces is the main culprit behind the modality
discrepancy. We integrate this reflection observation into an image-level
data augmentation by proposing the linear transformation generator (LTG).
Moreover, at the feature level, we introduce a cross-center loss to explore a
more compact intra-class distribution and modality-aware spatial attention to
take advantage of textured regions more efficiently. Experiment results on two
standard cross-spectral person re-identification datasets, i.e., RegDB and
SYSU-MM01, have demonstrated state-of-the-art performance.
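The linear reflection observation above can be illustrated with a small sketch. The region sampling and the factor ranges below are hypothetical choices, not RFM's actual LTG; the sketch only shows the idea of applying a different random linear map `a*x + b` to different "surface" regions to simulate the spectrum-dependent reflectance change.

```python
import numpy as np

def linear_transform_augment(img, num_patches=4, rng=None):
    """Hypothetical LTG-style augmentation sketch: apply a random linear
    map a*x + b with a different factor per rectangular region, mimicking
    the surface-dependent linear reflectance change across spectra."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.astype(np.float32).copy()
    h, w = img.shape[:2]
    for _ in range(num_patches):
        # pick a random rectangular "surface" region
        y0, x0 = rng.integers(0, h // 2), rng.integers(0, w // 2)
        y1, x1 = y0 + rng.integers(1, h - y0), x0 + rng.integers(1, w - x0)
        a = rng.uniform(0.5, 1.5)   # region-specific linear factor (assumed range)
        b = rng.uniform(-20, 20)    # small additive offset (assumed range)
        out[y0:y1, x0:x1] = a * out[y0:y1, x0:x1] + b
    return np.clip(out, 0, 255).astype(np.uint8)
```

Applied during training, such an augmentation exposes the network to many plausible spectral renderings of the same identity.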
Related papers
- RLE: A Unified Perspective of Data Augmentation for Cross-Spectral Re-identification [59.5042031913258]
Non-linear modality discrepancy mainly comes from diverse linear transformations acting on the surface of different materials.
We propose a Random Linear Enhancement (RLE) strategy which includes Moderate Random Linear Enhancement (MRLE) and Radical Random Linear Enhancement (RRLE).
The experimental results not only demonstrate the superiority and effectiveness of RLE but also confirm its great potential as a general-purpose data augmentation for cross-spectral re-identification.
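A minimal sketch of the moderate/radical distinction, under assumed parameter ranges (the paper's actual MRLE/RRLE formulations may differ): MRLE is illustrated here as one mild global linear transform, RRLE as a radical per-channel linear recombination.

```python
import numpy as np

def moderate_rle(img, rng):
    """MRLE-like sketch: one mild global linear transform a*x + b."""
    a = rng.uniform(0.7, 1.3)   # assumed "moderate" range
    b = rng.uniform(-10, 10)
    return np.clip(a * img.astype(np.float32) + b, 0, 255).astype(np.uint8)

def radical_rle(img, rng):
    """RRLE-like sketch: a radical linear recombination of the colour
    channels with random convex weights, collapsing colour information."""
    w = rng.dirichlet(np.ones(3))        # random weights summing to 1
    gray = img.astype(np.float32) @ w    # (H, W) single-channel result
    return np.clip(gray, 0, 255).astype(np.uint8)
```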
arXiv Detail & Related papers (2024-11-02T12:13:37Z)
- Unsupervised Visible-Infrared ReID via Pseudo-label Correction and Modality-level Alignment [23.310509459311046]
Unsupervised visible-infrared person re-identification (UVI-ReID) has recently gained great attention due to its potential for enhancing human detection in diverse environments without labeling.
Previous methods utilize intra-modality clustering and cross-modality feature matching to achieve UVI-ReID.
arXiv Detail & Related papers (2024-04-10T02:03:14Z)
- Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities.
Existing attack methods have primarily focused on the characteristics of the visible image modality.
This study proposes a universal perturbation attack specifically designed for cross-modality ReID.
arXiv Detail & Related papers (2024-01-18T15:56:23Z)
- Frequency Domain Modality-invariant Feature Learning for Visible-infrared Person Re-Identification [79.9402521412239]
We propose a novel Frequency Domain modality-invariant feature learning framework (FDMNet) to reduce modality discrepancy from the frequency domain perspective.
Our framework introduces two novel modules, namely the Instance-Adaptive Amplitude Filter (IAF) and the Phrase-Preserving Normalization (PPNorm).
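The general frequency-domain idea can be sketched as follows. This is not FDMNet's actual PPNorm, only an illustration of the amplitude/phase split it builds on: the amplitude spectrum (which carries much of the modality-specific style) is flattened to its mean, while the phase (which carries structure) is preserved.

```python
import numpy as np

def phase_preserving_normalize(img):
    """Illustrative amplitude/phase decomposition: replace the amplitude
    spectrum with its mean while keeping the phase, which carries most
    modality-invariant structure. The exact FDMNet formulation differs."""
    f = np.fft.fft2(img, axes=(0, 1))
    amp, phase = np.abs(f), np.angle(f)
    # flatten the amplitude spectrum to its mean value
    amp_norm = np.broadcast_to(amp.mean(axis=(0, 1), keepdims=True), amp.shape)
    out = np.fft.ifft2(amp_norm * np.exp(1j * phase), axes=(0, 1))
    return out.real
```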
arXiv Detail & Related papers (2024-01-03T17:11:27Z)
- ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z)
- Learning Feature Recovery Transformer for Occluded Person Re-identification [71.18476220969647]
We propose a new approach called Feature Recovery Transformer (FRT) to address the two challenges simultaneously.
To reduce the interference of the noise during feature matching, we mainly focus on visible regions that appear in both images and develop a visibility graph to calculate the similarity.
In terms of the second challenge, based on the developed graph similarity, for each query image, we propose a recovery transformer that exploits the feature sets of its $k$-nearest neighbors in the gallery to recover the complete features.
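The k-nearest-neighbour recovery intuition can be sketched as below. FRT's actual recovery transformer is learned end-to-end; this sketch only shows the underlying idea of borrowing information from the query's nearest gallery features, with the 50/50 blend being an assumed choice.

```python
import numpy as np

def knn_feature_recovery(query_feat, gallery_feats, k=3):
    """Hypothetical sketch of k-NN feature recovery: blend the query
    feature with a softmax-weighted average of its k nearest gallery
    features (cosine similarity) to approximate occluded parts."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q                          # cosine similarity to each gallery item
    idx = np.argsort(-sims)[:k]           # indices of the k nearest neighbours
    weights = np.exp(sims[idx]) / np.exp(sims[idx]).sum()
    recovered = weights @ gallery_feats[idx]
    return 0.5 * query_feat + 0.5 * recovered   # assumed blending ratio
```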
arXiv Detail & Related papers (2023-01-05T02:36:16Z)
- A Bidirectional Conversion Network for Cross-Spectral Face Recognition [1.9766522384767227]
Cross-spectral face recognition is challenging due to the dramatic difference between the visible light and IR imageries.
This paper proposes a framework of bidirectional cross-spectral conversion (BCSC-GAN) between the heterogeneous face images.
The network reduces the cross-spectral recognition problem into an intra-spectral problem, and improves performance by fusing bidirectional information.
arXiv Detail & Related papers (2022-05-03T16:20:10Z)
- Towards Homogeneous Modality Learning and Multi-Granularity Information Exploration for Visible-Infrared Person Re-Identification [16.22986967958162]
Visible-infrared person re-identification (VI-ReID) is a challenging and essential task, which aims to retrieve a set of person images over visible and infrared camera views.
Previous methods attempt to apply generative adversarial networks (GANs) to generate modality-consistent data.
In this work, we address the cross-modality matching problem with Aligned Grayscale Modality (AGM), a unified dark-line spectrum that reformulates visible-infrared dual-mode learning as a gray-gray single-mode learning problem.
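The grayscale-unification idea can be sketched as below. AGM additionally aligns the infrared side with a learned transformation; the sketch only shows the naive visible-to-grayscale step, using the standard ITU-R BT.601 luminance weights, which are not necessarily what AGM uses.

```python
import numpy as np

def to_aligned_grayscale(img):
    """Sketch of grayscale unification: map a 3-channel visible image to
    a single channel so both modalities share one 'dark-line' spectrum."""
    if img.ndim == 2:           # infrared images are already single-channel
        return img.astype(np.float32)
    # standard BT.601 luminance weights (assumed, not AGM's exact mapping)
    w = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return img.astype(np.float32) @ w
```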
arXiv Detail & Related papers (2022-04-11T03:03:19Z)
- SFANet: A Spectrum-aware Feature Augmentation Network for Visible-Infrared Person Re-Identification [12.566284647658053]
We propose a novel spectrum-aware feature augmentation network named SFANet for the cross-modality matching problem.
Learning with grayscale-spectrum images, our model can effectively reduce the modality discrepancy and detect inner structure relations.
At the feature level, we improve the conventional two-stream network by balancing the number of modality-specific and sharable convolutional blocks.
arXiv Detail & Related papers (2021-02-24T08:57:32Z)
- Dual Gaussian-based Variational Subspace Disentanglement for Visible-Infrared Person Re-Identification [19.481092102536827]
Visible-infrared person re-identification (VI-ReID) is a challenging and essential task in night-time intelligent surveillance systems.
We present a dual Gaussian-based variational auto-encoder (DG-VAE) to disentangle an identity-discriminable and an identity-ambiguous cross-modality feature subspace.
Our method outperforms state-of-the-art methods on two VI-ReID datasets.
arXiv Detail & Related papers (2020-08-06T08:43:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.