Mix-Modality Person Re-Identification: A New and Practical Paradigm
- URL: http://arxiv.org/abs/2412.04719v1
- Date: Fri, 06 Dec 2024 02:19:57 GMT
- Title: Mix-Modality Person Re-Identification: A New and Practical Paradigm
- Authors: Wei Liu, Xin Xu, Hua Chang, Xin Yuan, Zheng Wang,
- Abstract summary: We propose a new and more practical mix-modality retrieval paradigm.
Existing visible-infrared person re-identification (VI-ReID) methods have achieved some results in the bi-modality mutual retrieval paradigm.
This paper proposes a Mix-Modality person re-identification (MM-ReID) task, explores the influence of modality mixing ratio on performance, and constructs mix-modality test sets for existing datasets.
- Score: 20.01921944345468
- License:
- Abstract: Current visible-infrared cross-modality person re-identification research has only focused on exploring the bi-modality mutual retrieval paradigm, and we propose a new and more practical mix-modality retrieval paradigm. Existing Visible-Infrared person re-identification (VI-ReID) methods have achieved some results in the bi-modality mutual retrieval paradigm by learning the correspondence between visible and infrared modalities. However, significant performance degradation occurs due to the modality confusion problem when these methods are applied to the new mix-modality paradigm. Therefore, this paper proposes a Mix-Modality person re-identification (MM-ReID) task, explores the influence of modality mixing ratio on performance, and constructs mix-modality test sets for existing datasets according to the new mix-modality testing paradigm. To solve the modality confusion problem in MM-ReID, we propose a Cross-Identity Discrimination Harmonization Loss (CIDHL) adjusting the distribution of samples in the hyperspherical feature space, pulling the centers of samples with the same identity closer, and pushing away the centers of samples with different identities while aggregating samples with the same modality and the same identity. Furthermore, we propose a Modality Bridge Similarity Optimization Strategy (MBSOS) to optimize the cross-modality similarity between the query and queried samples with the help of the similar bridge sample in the gallery. Extensive experiments demonstrate that compared to the original performance of existing cross-modality methods on MM-ReID, the addition of our CIDHL and MBSOS demonstrates a general improvement.
Related papers
- From Cross-Modal to Mixed-Modal Visible-Infrared Re-Identification [11.324518300593983]
Current VI-ReID methods focus on cross-modality matching, but real-world applications often involve mixed galleries containing both V and I images.
This is because gallery images from the same modality may have lower domain gaps but correspond to different identities.
This paper introduces a novel mixed-modal ReID setting, where galleries contain data from both modalities.
arXiv Detail & Related papers (2025-01-23T01:28:05Z) - Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification [30.983346937558743]
Key challenges in USL-VI-ReID are to effectively generate pseudo-labels and establish pseudo-label correspondences.
We propose a Multi-Memory Matching framework for USL-VI-ReID.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the reliability of the established cross-modality correspondences.
arXiv Detail & Related papers (2024-01-12T01:24:04Z) - Modality Unifying Network for Visible-Infrared Person Re-Identification [24.186989535051623]
Visible-infrared person re-identification (VI-ReID) is a challenging task due to large cross-modality discrepancies and intra-class variations.
Existing methods mainly focus on learning modality-shared representations by embedding different modalities into the same feature space.
We propose a novel Modality Unifying Network (MUN) to explore a robust auxiliary modality for VI-ReID.
arXiv Detail & Related papers (2023-09-12T14:22:22Z) - Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset.
arXiv Detail & Related papers (2023-08-10T08:43:20Z) - Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical
Fusion for Multimodal Affect Recognition [69.32305810128994]
Incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition.
We propose the Hierarchical Crossmodal Transformer with Dynamic Modality Gating (HCT-DMG), a lightweight incongruity-aware model.
HCT-DMG: 1) outperforms previous multimodal models with a reduced size of approximately 0.8M parameters; 2) recognizes hard samples where incongruity makes affect recognition difficult; 3) mitigates the incongruity at the latent level in crossmodal attention.
arXiv Detail & Related papers (2023-05-23T01:24:15Z) - Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at a cluster-level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z) - Modality-Adaptive Mixup and Invariant Decomposition for RGB-Infrared
Person Re-Identification [84.32086702849338]
We propose a novel modality-adaptive mixup and invariant decomposition (MID) approach for RGB-infrared person re-identification.
MID designs a modality-adaptive mixup scheme to generate suitable mixed modality images between RGB and infrared images.
Experiments on two challenging benchmarks demonstrate superior performance of MID over state-of-the-art methods.
arXiv Detail & Related papers (2022-03-03T14:26:49Z) - Multi-Modal Mutual Information Maximization: A Novel Approach for
Unsupervised Deep Cross-Modal Hashing [73.29587731448345]
We propose a novel method, dubbed Cross-Modal Info-Max Hashing (CMIMH)
We learn informative representations that can preserve both intra- and inter-modal similarities.
The proposed method consistently outperforms other state-of-the-art cross-modal retrieval methods.
arXiv Detail & Related papers (2021-12-13T08:58:03Z) - Parameter Sharing Exploration and Hetero-Center based Triplet Loss for
Visible-Thermal Person Re-Identification [17.402673438396345]
This paper focuses on the visible-thermal cross-modality person re-identification (VT Re-ID) task.
Our proposed method distinctly outperforms the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2020-08-14T07:40:35Z) - A Similarity Inference Metric for RGB-Infrared Cross-Modality Person
Re-identification [66.49212581685127]
Cross-modality person re-identification (re-ID) is a challenging task due to the large discrepancy between IR and RGB modalities.
Existing methods address this challenge typically by aligning feature distributions or image styles across modalities.
This paper presents a novel similarity inference metric (SIM) that exploits the intra-modality sample similarities to circumvent the cross-modality discrepancy.
arXiv Detail & Related papers (2020-07-03T05:28:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.