Dual Gaussian-based Variational Subspace Disentanglement for
Visible-Infrared Person Re-Identification
- URL: http://arxiv.org/abs/2008.02520v1
- Date: Thu, 6 Aug 2020 08:43:35 GMT
- Title: Dual Gaussian-based Variational Subspace Disentanglement for
Visible-Infrared Person Re-Identification
- Authors: Nan Pu, Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew
- Abstract summary: Visible-infrared person re-identification (VI-ReID) is a challenging and essential task in night-time intelligent surveillance systems.
We present a dual Gaussian-based variational auto-encoder (DG-VAE) to disentangle an identity-discriminable and an identity-ambiguous cross-modality feature subspace.
Our method outperforms state-of-the-art methods on two VI-ReID datasets.
- Score: 19.481092102536827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visible-infrared person re-identification (VI-ReID) is a challenging and
essential task in night-time intelligent surveillance systems. In addition to
the intra-modality variance that RGB-RGB person re-identification mainly
addresses, VI-ReID suffers from inter-modality variance caused by the inherent
heterogeneous gap between visible and infrared images. To address this
problem, we present a carefully designed dual
Gaussian-based variational auto-encoder (DG-VAE), which disentangles an
identity-discriminable and an identity-ambiguous cross-modality feature
subspace, following a mixture-of-Gaussians (MoG) prior and a standard Gaussian
distribution prior, respectively. Disentangling cross-modality
identity-discriminable features leads to more robust retrieval for VI-ReID. To
achieve efficient optimization like a conventional VAE, we theoretically derive
two variational inference terms for the MoG prior under the supervised setting,
which not only restrict the identity-discriminable subspace so that the model
explicitly handles the cross-modality intra-identity variance, but also enable
the MoG distribution to avoid posterior collapse. Furthermore, we propose a
triplet swap reconstruction (TSR) strategy to promote the above disentangling
process. Extensive experiments demonstrate that our method outperforms
state-of-the-art methods on two VI-ReID datasets.
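To make the dual-prior design concrete, below is a minimal PyTorch-style sketch of the loss structure the abstract describes, written under stated assumptions rather than from the paper's released code: the identity-ambiguous code is regularized toward a standard Gaussian, while, with identity labels observed, the identity-discriminable code is pulled toward a unit-variance MoG component with a learnable per-identity mean. All names (`z_id`, `z_amb`, `id_means`, `triplet_swap_reconstruction`) and the exact reconstruction and swap terms are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def kl_standard_normal(mu, logvar):
    # KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dims.
    return 0.5 * torch.sum(mu.pow(2) + logvar.exp() - logvar - 1.0, dim=1)


def kl_supervised_mog(mu, logvar, labels, id_means):
    # With the identity label y observed, the MoG prior reduces to its
    # y-th component, assumed here to be N(mu_y, I) with learnable mu_y.
    prior_mu = id_means[labels]  # (B, D): one prior mean per identity
    diff = mu - prior_mu
    return 0.5 * torch.sum(diff.pow(2) + logvar.exp() - logvar - 1.0, dim=1)


def dg_vae_loss(x, x_rec, id_stats, amb_stats, labels, id_means,
                beta_id=1.0, beta_amb=1.0):
    # ELBO-style objective: reconstruction plus one KL term per subspace.
    mu_id, logvar_id = id_stats      # identity-discriminable code stats
    mu_amb, logvar_amb = amb_stats   # identity-ambiguous code stats
    rec = F.mse_loss(x_rec, x, reduction="none").flatten(1).sum(dim=1)
    kl_id = kl_supervised_mog(mu_id, logvar_id, labels, id_means)
    kl_amb = kl_standard_normal(mu_amb, logvar_amb)
    return (rec + beta_id * kl_id + beta_amb * kl_amb).mean()


def triplet_swap_reconstruction(decoder, z_id, z_amb, pos_idx):
    # Hypothetical reading of the TSR idea: swap the identity codes of two
    # same-identity samples (e.g., one visible, one infrared) and ask the
    # decoder to reconstruct anyway, so identity content must live in z_id.
    return decoder(torch.cat([z_id[pos_idx], z_amb], dim=1))
```

Anchoring each identity's code to its own prior mean is one plausible reading of how the supervised KL can pull visible and infrared samples of the same person together, consistent with the abstract's claim that these terms handle cross-modality intra-identity variance.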
Related papers
- Unsupervised Visible-Infrared ReID via Pseudo-label Correction and Modality-level Alignment [23.310509459311046]
Unsupervised visible-infrared person re-identification (UVI-ReID) has recently gained great attention due to its potential for enhancing human detection in diverse environments without requiring labels.
Previous methods utilize intra-modality clustering and cross-modality feature matching to achieve UVI-ReID.
arXiv Detail & Related papers (2024-04-10T02:03:14Z)
- Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities.
Existing attack methods have primarily focused on the characteristics of the visible image modality.
This study proposes a universal perturbation attack specifically designed for cross-modality ReID.
arXiv Detail & Related papers (2024-01-18T15:56:23Z)
- Modality Unifying Network for Visible-Infrared Person Re-Identification [24.186989535051623]
Visible-infrared person re-identification (VI-ReID) is a challenging task due to large cross-modality discrepancies and intra-class variations.
Existing methods mainly focus on learning modality-shared representations by embedding different modalities into the same feature space.
We propose a novel Modality Unifying Network (MUN) to explore a robust auxiliary modality for VI-ReID.
arXiv Detail & Related papers (2023-09-12T14:22:22Z)
- Exploring Invariant Representation for Visible-Infrared Person Re-Identification [77.06940947765406]
Cross-spectral person re-identification, which aims to associate identities to pedestrians across different spectra, faces the central challenge of modality discrepancy.
In this paper, we address the problem at both the image level and the feature level in an end-to-end hybrid learning framework named robust feature mining network (RFM).
Experiment results on two standard cross-spectral person re-identification datasets, RegDB and SYSU-MM01, have demonstrated state-of-the-art performance.
arXiv Detail & Related papers (2023-02-02T05:24:50Z)
- Modality-Adaptive Mixup and Invariant Decomposition for RGB-Infrared Person Re-Identification [84.32086702849338]
We propose a novel modality-adaptive mixup and invariant decomposition (MID) approach for RGB-infrared person re-identification.
MID designs a modality-adaptive mixup scheme to generate suitable mixed modality images between RGB and infrared images.
Experiments on two challenging benchmarks demonstrate superior performance of MID over state-of-the-art methods.
arXiv Detail & Related papers (2022-03-03T14:26:49Z)
- Cross-Modality Earth Mover's Distance for Visible Thermal Person Re-Identification [82.01051164653583]
Visible thermal person re-identification (VT-ReID) suffers from the inter-modality discrepancy and intra-identity variations.
We propose the Cross-Modality Earth Mover's Distance (CM-EMD) that can alleviate the impact of the intra-identity variations during modality alignment.
arXiv Detail & Related papers (2022-03-03T12:26:59Z)
- MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification [35.97494894205023]
The RGB-infrared cross-modality person re-identification (ReID) task aims to recognize images of the same identity across the visible and infrared modalities.
Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space.
We present a novel multi-feature space joint optimization (MSO) network, which can learn modality-sharable features in both the single-modality space and the common space.
arXiv Detail & Related papers (2021-10-21T16:45:23Z)
- Multi-Scale Cascading Network with Compact Feature Learning for RGB-Infrared Person Re-Identification [35.55895776505113]
The Multi-Scale Part-Aware Cascading framework (MSPAC) is formulated by aggregating multi-scale fine-grained features from the part level to the global level.
Cross-modality correlations can thus be efficiently explored on salient features for distinctive modality-invariant feature learning.
arXiv Detail & Related papers (2020-12-12T15:39:11Z)
- DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition [85.94331736287765]
We formulate HFR as a dual generation problem, and tackle it via a novel Dual Variational Generation (DVG-Face) framework.
We integrate abundant identity information of large-scale visible data into the joint distribution.
Massive, diverse, paired heterogeneous images with the same identity can be generated from noise.
arXiv Detail & Related papers (2020-09-20T09:48:24Z)
- Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification [208.1227090864602]
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem.
Existing VI-ReID methods tend to learn global representations, which have limited discriminability and weak robustness to noisy images.
We propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
arXiv Detail & Related papers (2020-07-18T03:08:13Z)