Shape-centered Representation Learning for Visible-Infrared Person
Re-identification
- URL: http://arxiv.org/abs/2310.17952v2
- Date: Mon, 30 Oct 2023 01:37:18 GMT
- Title: Shape-centered Representation Learning for Visible-Infrared Person
Re-identification
- Authors: Shuang Li, Jiaxu Leng, Ji Gan, Mengjingcheng Mo, Xinbo Gao
- Abstract summary: Current Visible-Infrared Person Re-Identification (VI-ReID) methods prioritize extracting distinguishing appearance features.
We propose the Shape-centered Representation Learning framework (ScRL), which focuses on learning shape features and appearance features associated with shapes.
To acquire appearance features related to shape, we design the Appearance Feature Enhancement (AFE), which accentuates identity-related features while suppressing identity-unrelated features guided by shape features.
- Score: 53.56628297970931
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current Visible-Infrared Person Re-Identification (VI-ReID) methods
prioritize extracting distinguishing appearance features, ignoring the natural
resistance of body shape against modality changes. Initially, we gauged the
discriminative potential of shapes by a straightforward concatenation of shape
and appearance features. However, two unresolved issues persist in the
utilization of shape features. One pertains to the dependence on auxiliary
models for shape feature extraction in the inference phase, along with the
errors in generated infrared shapes due to the intrinsic modality disparity.
The other issue involves the inadequately explored correlation between shape
and appearance features. To tackle the aforementioned challenges, we propose
the Shape-centered Representation Learning framework (ScRL), which focuses on
learning shape features and appearance features associated with shapes.
Specifically, we devise the Shape Feature Propagation (SFP), facilitating
direct extraction of shape features from original images with minimal
complexity costs during inference. To restitute inaccuracies in infrared body
shapes at the feature level, we present the Infrared Shape Restitution (ISR).
Furthermore, to acquire appearance features related to shape, we design the
Appearance Feature Enhancement (AFE), which accentuates identity-related
features while suppressing identity-unrelated features guided by shape
features. Extensive experiments are conducted to validate the effectiveness of
the proposed ScRL. Achieving remarkable results, the Rank-1 (mAP) accuracy
attains 76.1%, 71.2%, 92.4% (72.6%, 52.9%, 86.7%) on the SYSU-MM01, HITSZ-VCM,
RegDB datasets respectively, outperforming existing state-of-the-art methods.
Related papers
- Dynamic Identity-Guided Attention Network for Visible-Infrared Person Re-identification [17.285526655788274]
Visible-infrared person re-identification (VI-ReID) aims to match people with the same identity between visible and infrared modalities.
Existing methods generally try to bridge the cross-modal differences at image or feature level.
We introduce a dynamic identity-guided attention network (DIAN) to mine identity-guided and modality-consistent embeddings.
arXiv Detail & Related papers (2024-05-21T12:04:56Z) - DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation [84.0586749616249]
This paper presents DiffFAE, a one-stage and highly-efficient diffusion-based framework tailored for high-fidelity Facial Appearance Editing.
For high-fidelity query attributes transfer, we adopt Space-sensitive Physical Customization (SPC), which ensures the fidelity and generalization ability.
In order to preserve source attributes, we introduce the Region-responsive Semantic Composition (RSC)
This module is guided to learn decoupled source-regarding features, thereby better preserving the identity and alleviating artifacts from non-facial attributes such as hair, clothes, and background.
arXiv Detail & Related papers (2024-03-26T12:53:10Z) - Frequency Domain Modality-invariant Feature Learning for
Visible-infrared Person Re-Identification [79.9402521412239]
We propose a novel Frequency Domain modality-invariant feature learning framework (FDMNet) to reduce modality discrepancy from the frequency domain perspective.
Our framework introduces two novel modules, namely the Instance-Adaptive Amplitude Filter (IAF) and the Phrase-Preserving Normalization (PPNorm)
arXiv Detail & Related papers (2024-01-03T17:11:27Z) - Shape-Erased Feature Learning for Visible-Infrared Person
Re-Identification [90.39454748065558]
Body shape is one of the significant modality-shared cues for VI-ReID.
We propose shape-erased feature learning paradigm that decorrelates modality-shared features in two subspaces.
Experiments on SYSU-MM01, RegDB, and HITSZ-VCM datasets demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-04-09T10:22:10Z) - MRCN: A Novel Modality Restitution and Compensation Network for
Visible-Infrared Person Re-identification [36.88929785476334]
We propose a novel Modality Restitution and Compensation Network (MRCN) to narrow the gap between the two modalities.
Our method achieves 95.1% in terms of Rank-1 and 89.2% in terms of mAP on the RegDB dataset.
arXiv Detail & Related papers (2023-03-26T05:03:18Z) - Exploring Invariant Representation for Visible-Infrared Person
Re-Identification [77.06940947765406]
Cross-spectral person re-identification, which aims to associate identities to pedestrians across different spectra, faces a main challenge of the modality discrepancy.
In this paper, we address the problem from both image-level and feature-level in an end-to-end hybrid learning framework named robust feature mining network (RFM)
Experiment results on two standard cross-spectral person re-identification datasets, RegDB and SYSU-MM01, have demonstrated state-of-the-art performance.
arXiv Detail & Related papers (2023-02-02T05:24:50Z) - 3D Shape Variational Autoencoder Latent Disentanglement via Mini-Batch
Feature Swapping for Bodies and Faces [12.114711258010367]
We propose a self-supervised approach to train a 3D shape variational autoencoder which encourages a disentangled latent representation of identity features.
Experimental results conducted on 3D meshes show that state-of-the-art methods for latent disentanglement are not able to disentangle identity features of faces and bodies.
arXiv Detail & Related papers (2021-11-24T11:53:33Z) - Exploring Modality-shared Appearance Features and Modality-invariant
Relation Features for Cross-modality Person Re-Identification [72.95858515157603]
Cross-modality person re-identification works rely on discriminative modality-shared features.
Despite some initial success, such modality-shared appearance features cannot capture enough modality-invariant information.
A novel cross-modality quadruplet loss is proposed to further reduce the cross-modality variations.
arXiv Detail & Related papers (2021-04-23T11:14:07Z) - SFANet: A Spectrum-aware Feature Augmentation Network for
Visible-Infrared Person Re-Identification [12.566284647658053]
We propose a novel spectrum-aware feature augementation network named SFANet for cross-modality matching problem.
Learning with grayscale-spectrum images, our model can apparently reduce modality discrepancy and detect inner structure relations.
In feature-level, we improve the conventional two-stream network through balancing the number of specific and sharable convolutional blocks.
arXiv Detail & Related papers (2021-02-24T08:57:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.