Learning Cross-modality Information Bottleneck Representation for
Heterogeneous Person Re-Identification
- URL: http://arxiv.org/abs/2308.15063v1
- Date: Tue, 29 Aug 2023 06:55:42 GMT
- Title: Learning Cross-modality Information Bottleneck Representation for
Heterogeneous Person Re-Identification
- Authors: Haichao Shi, Mandi Luo, Xiao-Yu Zhang, Ran He
- Abstract summary: Visible-Infrared person re-identification (VI-ReID) is an important and challenging task in intelligent video surveillance.
Existing methods mainly focus on learning a shared feature space to reduce the modality discrepancy between visible and infrared modalities.
We present a novel mutual information and modality consensus network, namely CMInfoNet, to extract modality-invariant identity features.
- Score: 61.49219876388174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visible-Infrared person re-identification (VI-ReID) is an important and
challenging task in intelligent video surveillance. Existing methods mainly
focus on learning a shared feature space to reduce the modality discrepancy
between visible and infrared modalities, which still leaves two problems
underexplored: information redundancy and modality complementarity. Properly
eliminating identity-irrelevant information while compensating for
modality-specific information is therefore critical, yet remains challenging.
To tackle these problems, we present a novel mutual information
and modality consensus network, namely CMInfoNet, to extract modality-invariant
identity features with the most representative information and reduce the
redundancies. The key insight of our method is to find an optimal
representation to capture more identity-relevant information and compress the
irrelevant parts by optimizing a mutual information bottleneck trade-off (see
the sketch below). In addition, we propose an automatic search strategy to find
the most prominent parts that identify pedestrians. To eliminate cross- and intra-modality
variations, we also devise a modality consensus module to align the visible and
infrared modalities for task-specific guidance. Moreover, global-local feature
representations are acquired to discriminate key parts.
Experimental results on six benchmarks, i.e., SYSU-MM01, RegDB,
Occluded-DukeMTMC, Occluded-REID, Partial-REID, and Partial_iLIDS, demonstrate
the effectiveness of CMInfoNet.
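The core mechanism above is an information bottleneck trade-off: keep identity-relevant information in the learned code while compressing everything else. Below is a minimal sketch of a generic variational formulation of that trade-off in PyTorch; it is not CMInfoNet's actual loss, and the feature dimension, bottleneck size, identity count, Gaussian prior, and beta weight are all illustrative assumptions.

```python
# Minimal sketch of a variational information-bottleneck objective:
# keep I(z; identity) high while penalizing I(z; input) through a KL term.
# All sizes and the beta weight are illustrative, not taken from CMInfoNet.
import torch
import torch.nn as nn
import torch.nn.functional as F

class IBHead(nn.Module):
    def __init__(self, feat_dim=2048, bottleneck_dim=512, num_ids=395, beta=1e-3):
        super().__init__()
        self.mu = nn.Linear(feat_dim, bottleneck_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(feat_dim, bottleneck_dim)  # log-variance of q(z|x)
        self.classifier = nn.Linear(bottleneck_dim, num_ids)
        self.beta = beta                                    # compression weight

    def forward(self, feat, labels):
        mu, logvar = self.mu(feat), self.logvar(feat)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        # Identity term: a surrogate for I(z; y), the identity-relevant information.
        ce = F.cross_entropy(self.classifier(z), labels)
        # Compression term: KL(q(z|x) || N(0, I)) upper-bounds I(z; x) and
        # discourages keeping identity-irrelevant, redundant information.
        kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
        return ce + self.beta * kl
```

In practice such a head would sit on top of backbone features from both visible and infrared images, with the same identity labels supervising the classification term across modalities.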
Related papers
- Dynamic Identity-Guided Attention Network for Visible-Infrared Person Re-identification [17.285526655788274]
Visible-infrared person re-identification (VI-ReID) aims to match people with the same identity between visible and infrared modalities.
Existing methods generally try to bridge the cross-modal differences at image or feature level.
We introduce a dynamic identity-guided attention network (DIAN) to mine identity-guided and modality-consistent embeddings.
arXiv Detail & Related papers (2024-05-21T12:04:56Z)
- Transferring Modality-Aware Pedestrian Attentive Learning for Visible-Infrared Person Re-identification [43.05147831905626]
We propose a novel Transferring Modality-Aware Pedestrian Attentive Learning (TMPA) model.
TMPA focuses on the pedestrian regions to efficiently compensate for missing modality-specific features.
Experiments conducted on the benchmark SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed TMPA model.
arXiv Detail & Related papers (2023-12-12T07:15:17Z)
- Modality Unifying Network for Visible-Infrared Person Re-Identification [24.186989535051623]
Visible-infrared person re-identification (VI-ReID) is a challenging task due to large cross-modality discrepancies and intra-class variations.
Existing methods mainly focus on learning modality-shared representations by embedding different modalities into the same feature space.
We propose a novel Modality Unifying Network (MUN) to explore a robust auxiliary modality for VI-ReID.
arXiv Detail & Related papers (2023-09-12T14:22:22Z)
- Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification [90.39454748065558]
Body shape is one of the significant modality-shared cues for VI-ReID.
We propose a shape-erased feature learning paradigm that decorrelates modality-shared features in two subspaces.
Experiments on SYSU-MM01, RegDB, and HITSZ-VCM datasets demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-04-09T10:22:10Z)
- CLIP-Driven Fine-grained Text-Image Person Re-identification [50.94827165464813]
Text-image person re-identification (TIReID) aims to retrieve the image corresponding to a given text query from a pool of candidate images.
We propose a CLIP-driven Fine-grained information excavation framework (CFine) to fully utilize the powerful knowledge of CLIP for TIReID.
arXiv Detail & Related papers (2022-10-19T03:43:12Z)
- On Exploring Pose Estimation as an Auxiliary Learning Task for Visible-Infrared Person Re-identification [66.58450185833479]
In this paper, we exploit Pose Estimation as an auxiliary learning task to assist the VI-ReID task in an end-to-end framework.
By jointly training these two tasks in a mutually beneficial manner, our model learns higher quality modality-shared and ID-related features.
Experimental results on two benchmark VI-ReID datasets show that the proposed method consistently improves state-of-the-art methods by significant margins.
arXiv Detail & Related papers (2022-01-11T09:44:00Z)
- CMTR: Cross-modality Transformer for Visible-infrared Person Re-identification [38.96033760300123]
We propose a cross-modality transformer-based method (CMTR) for the visible-infrared person re-identification task.
We design novel modality embeddings, which are fused with token embeddings to encode modality information.
The proposed CMTR significantly outperforms leading CNN-based methods.
arXiv Detail & Related papers (2021-10-18T03:12:59Z)
- Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification [208.1227090864602]
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem.
Existing VI-ReID methods tend to learn global representations, which have limited discriminability and weak robustness to noisy images.
We propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
arXiv Detail & Related papers (2020-07-18T03:08:13Z)
- Cross-modality Person re-identification with Shared-Specific Feature Transfer [112.60513494602337]
Cross-modality person re-identification (cm-ReID) is a challenging but key technology for intelligent video analysis.
We propose a novel cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) to exploit both modality-shared information and modality-specific characteristics.
arXiv Detail & Related papers (2020-02-28T00:18:45Z)
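The last entry above mentions splitting representations into modality-shared and modality-specific parts. As a hedged illustration of that general idea only (not the cm-SSFT algorithm itself), the sketch below factors a backbone feature through two projection heads, keeps the shared part identity-discriminative, and uses a soft orthogonality penalty to decorrelate the two parts; every layer size, name, and weight is an illustrative assumption.

```python
# Generic sketch of a shared/specific feature split for cross-modality ReID.
# Not cm-SSFT: just one common way to factor a feature into a modality-shared
# part and a modality-specific part. All sizes and weights are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedSpecificSplit(nn.Module):
    def __init__(self, feat_dim=2048, dim=512, num_ids=395):
        super().__init__()
        self.shared = nn.Linear(feat_dim, dim)    # modality-shared projection
        self.specific = nn.Linear(feat_dim, dim)  # modality-specific projection
        self.classifier = nn.Linear(dim, num_ids)

    def forward(self, feat, labels):
        s, p = self.shared(feat), self.specific(feat)
        # Identity loss on the shared part keeps it discriminative across modalities.
        id_loss = F.cross_entropy(self.classifier(s), labels)
        # Soft orthogonality penalty discourages the two parts from duplicating
        # each other, so the specific branch retains modality-specific cues.
        ortho = (F.normalize(s, dim=1) * F.normalize(p, dim=1)).sum(dim=1).pow(2).mean()
        return id_loss + 0.1 * ortho
```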