Dual Gaussian-based Variational Subspace Disentanglement for
Visible-Infrared Person Re-Identification
- URL: http://arxiv.org/abs/2008.02520v1
- Date: Thu, 6 Aug 2020 08:43:35 GMT
- Title: Dual Gaussian-based Variational Subspace Disentanglement for
Visible-Infrared Person Re-Identification
- Authors: Nan Pu, Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew
- Abstract summary: Visible-infrared person re-identification (VI-ReID) is a challenging and essential task in night-time intelligent surveillance systems.
We present a dual Gaussian-based variational auto-encoder (DG-VAE) to disentangle an identity-discriminable and an identity-ambiguous cross-modality feature subspace.
Our method outperforms state-of-the-art methods on two VI-ReID datasets.
- Score: 19.481092102536827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visible-infrared person re-identification (VI-ReID) is a challenging and
essential task in night-time intelligent surveillance systems. In addition to
the intra-modality variance that RGB-RGB person re-identification mainly
addresses, VI-ReID suffers from inter-modality variance caused by the inherent
heterogeneous gap between visible and infrared images. To address this
problem, we present a carefully designed dual
Gaussian-based variational auto-encoder (DG-VAE), which disentangles an
identity-discriminable and an identity-ambiguous cross-modality feature
subspace, following a mixture-of-Gaussians (MoG) prior and a standard Gaussian
distribution prior, respectively. Disentangling cross-modality
identity-discriminable features leads to more robust retrieval for VI-ReID. To
achieve efficient optimization like a conventional VAE, we theoretically derive
two variational inference terms for the MoG prior under the supervised setting,
which not only restrict the identity-discriminable subspace so that the model
explicitly handles the cross-modality intra-identity variance, but also enable
the MoG distribution to avoid posterior collapse. Furthermore, we propose a
triplet swap reconstruction (TSR) strategy to promote the above disentangling
process. Extensive experiments demonstrate that our method outperforms
state-of-the-art methods on two VI-ReID datasets.
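To make the dual-prior design concrete, below is a minimal PyTorch-style sketch of the loss structure the abstract describes, written under stated assumptions rather than from the paper's released code: the identity-ambiguous code is regularized toward a standard Gaussian, while, with identity labels observed, the identity-discriminable code is pulled toward a unit-variance MoG component with a learnable per-identity mean. All names (`z_id`, `z_amb`, `id_means`, `triplet_swap_reconstruction`) and the exact reconstruction and swap terms are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def kl_standard_normal(mu, logvar):
    # KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dims.
    return 0.5 * torch.sum(mu.pow(2) + logvar.exp() - logvar - 1.0, dim=1)


def kl_supervised_mog(mu, logvar, labels, id_means):
    # With the identity label y observed, the MoG prior reduces to its
    # y-th component, assumed here to be N(mu_y, I) with learnable mu_y.
    prior_mu = id_means[labels]  # (B, D): one prior mean per identity
    diff = mu - prior_mu
    return 0.5 * torch.sum(diff.pow(2) + logvar.exp() - logvar - 1.0, dim=1)


def dg_vae_loss(x, x_rec, id_stats, amb_stats, labels, id_means,
                beta_id=1.0, beta_amb=1.0):
    # ELBO-style objective: reconstruction plus one KL term per subspace.
    mu_id, logvar_id = id_stats      # identity-discriminable code stats
    mu_amb, logvar_amb = amb_stats   # identity-ambiguous code stats
    rec = F.mse_loss(x_rec, x, reduction="none").flatten(1).sum(dim=1)
    kl_id = kl_supervised_mog(mu_id, logvar_id, labels, id_means)
    kl_amb = kl_standard_normal(mu_amb, logvar_amb)
    return (rec + beta_id * kl_id + beta_amb * kl_amb).mean()


def triplet_swap_reconstruction(decoder, z_id, z_amb, pos_idx):
    # Hypothetical reading of the TSR idea: swap the identity codes of two
    # same-identity samples (e.g., one visible, one infrared) and ask the
    # decoder to reconstruct anyway, so identity content must live in z_id.
    return decoder(torch.cat([z_id[pos_idx], z_amb], dim=1))
```

Anchoring each identity's code to its own prior mean is one plausible reading of how the supervised KL can pull visible and infrared samples of the same person together, consistent with the abstract's claim that these terms handle cross-modality intra-identity variance.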
Related papers
- Unsupervised Visible-Infrared ReID via Pseudo-label Correction and Modality-level Alignment [23.310509459311046]
Unsupervised visible-infrared person re-identification (UVI-ReID) has recently gained great attention due to its potential for enhancing human detection in diverse environments without requiring labels.
Previous methods utilize intra-modality clustering and cross-modality feature matching to achieve UVI-ReID.
arXiv Detail & Related papers (2024-04-10T02:03:14Z)
- Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities.
Existing attack methods have primarily focused on the characteristics of the visible image modality.
This study proposes a universal perturbation attack specifically designed for cross-modality ReID.
arXiv Detail & Related papers (2024-01-18T15:56:23Z)
- Modality Unifying Network for Visible-Infrared Person Re-Identification [24.186989535051623]
Visible-infrared person re-identification (VI-ReID) is a challenging task due to large cross-modality discrepancies and intra-class variations.
Existing methods mainly focus on learning modality-shared representations by embedding different modalities into the same feature space.
We propose a novel Modality Unifying Network (MUN) to explore a robust auxiliary modality for VI-ReID.
arXiv Detail & Related papers (2023-09-12T14:22:22Z)
- Exploring Invariant Representation for Visible-Infrared Person Re-Identification [77.06940947765406]
Cross-spectral person re-identification, which aims to associate identities to pedestrians across different spectra, faces the central challenge of modality discrepancy.
In this paper, we address the problem at both the image level and the feature level in an end-to-end hybrid learning framework named robust feature mining network (RFM).
Experiment results on two standard cross-spectral person re-identification datasets, RegDB and SYSU-MM01, have demonstrated state-of-the-art performance.
arXiv Detail & Related papers (2023-02-02T05:24:50Z)
- Modality-Adaptive Mixup and Invariant Decomposition for RGB-Infrared Person Re-Identification [84.32086702849338]
We propose a novel modality-adaptive mixup and invariant decomposition (MID) approach for RGB-infrared person re-identification.
MID designs a modality-adaptive mixup scheme to generate suitable mixed modality images between RGB and infrared images.
Experiments on two challenging benchmarks demonstrate superior performance of MID over state-of-the-art methods.
arXiv Detail & Related papers (2022-03-03T14:26:49Z)
- Cross-Modality Earth Mover's Distance for Visible Thermal Person Re-Identification [82.01051164653583]
Visible thermal person re-identification (VT-ReID) suffers from the inter-modality discrepancy and intra-identity variations.
We propose the Cross-Modality Earth Mover's Distance (CM-EMD) that can alleviate the impact of the intra-identity variations during modality alignment.
arXiv Detail & Related papers (2022-03-03T12:26:59Z)
- MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification [35.97494894205023]
The RGB-infrared cross-modality person re-identification (ReID) task aims to recognize images of the same identity across the visible and infrared modalities.
Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space.
We present a novel multi-feature space joint optimization (MSO) network, which can learn modality-sharable features in both the single-modality space and the common space.
arXiv Detail & Related papers (2021-10-21T16:45:23Z)
- Multi-Scale Cascading Network with Compact Feature Learning for RGB-Infrared Person Re-Identification [35.55895776505113]
The Multi-Scale Part-Aware Cascading framework (MSPAC) is formulated by aggregating multi-scale fine-grained features from the part level to the global level.
Cross-modality correlations can thus be efficiently explored on salient features for distinctive modality-invariant feature learning.
arXiv Detail & Related papers (2020-12-12T15:39:11Z)
- DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition [85.94331736287765]
We formulate HFR as a dual generation problem, and tackle it via a novel Dual Variational Generation (DVG-Face) framework.
We integrate abundant identity information of large-scale visible data into the joint distribution.
Massive, diverse, paired heterogeneous images with the same identity can be generated from noise.
arXiv Detail & Related papers (2020-09-20T09:48:24Z)
- Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification [208.1227090864602]
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem.
Existing VI-ReID methods tend to learn global representations, which have limited discriminability and weak robustness to noisy images.
We propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
arXiv Detail & Related papers (2020-07-18T03:08:13Z)