Hierarchical Identity Learning for Unsupervised Visible-Infrared Person Re-Identification
- URL: http://arxiv.org/abs/2509.11587v1
- Date: Mon, 15 Sep 2025 05:10:43 GMT
- Title: Hierarchical Identity Learning for Unsupervised Visible-Infrared Person Re-Identification
- Authors: Haonan Shi, Yubin Wang, De Cheng, Lingfeng He, Nannan Wang, Xinbo Gao
- Abstract summary: Unsupervised visible-infrared person re-identification (USVI-ReID) aims to learn modality-invariant image features from unlabeled cross-modal person datasets.
- Score: 81.3063589622217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised visible-infrared person re-identification (USVI-ReID) aims to learn modality-invariant image features from unlabeled cross-modal person datasets by reducing the modality gap while minimizing reliance on costly manual annotations. Existing methods typically address USVI-ReID using cluster-based contrastive learning, which represents a person by a single cluster center. However, they primarily focus on the commonality of images within each cluster while neglecting the finer-grained differences among them. To address the limitation, we propose a Hierarchical Identity Learning (HIL) framework. Since each cluster may contain several smaller sub-clusters that reflect fine-grained variations among images, we generate multiple memories for each existing coarse-grained cluster via a secondary clustering. Additionally, we propose Multi-Center Contrastive Learning (MCCL) to refine representations for enhancing intra-modal clustering and minimizing cross-modal discrepancies. To further improve cross-modal matching quality, we design a Bidirectional Reverse Selection Transmission (BRST) mechanism, which establishes reliable cross-modal correspondences by performing bidirectional matching of pseudo-labels. Extensive experiments conducted on the SYSU-MM01 and RegDB datasets demonstrate that the proposed method outperforms existing approaches. The source code is available at: https://github.com/haonanshi0125/HIL.
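The abstract describes two ideas that a short example can make concrete: splitting each coarse cluster into sub-clusters so that one identity is represented by several centers, and keeping only cross-modal correspondences that agree in both directions. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`kmeans`, `build_sub_memories`, `mutual_matches`), the use of plain k-means for the secondary clustering, the fixed `k_sub`, and the cosine-similarity mutual-nearest-neighbor rule are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns (centers, assignments)."""
    rng = np.random.default_rng(seed)
    k = min(k, len(x))  # never ask for more centers than points
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    assign = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # Squared Euclidean distance from every point to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(0)
    return centers, assign


def build_sub_memories(features, coarse_labels, k_sub=2):
    """Map each coarse cluster id to an array of normalized sub-cluster centers."""
    memories = {}
    for cid in np.unique(coarse_labels):
        members = features[coarse_labels == cid]
        centers, _ = kmeans(members, k_sub)
        memories[int(cid)] = l2_normalize(centers)
    return memories


def mutual_matches(centers_a, centers_b):
    """Return pairs (i, j) where center i in A and center j in B pick each other."""
    sim = l2_normalize(centers_a) @ l2_normalize(centers_b).T
    a_to_b = sim.argmax(1)  # best B center for each A center
    b_to_a = sim.argmax(0)  # best A center for each B center
    return [(i, int(j)) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(12, 8))           # stand-in for image features
    labels = np.array([0] * 6 + [1] * 6)       # stand-in coarse pseudo-labels
    memories = build_sub_memories(feats, labels, k_sub=2)
    for cid, centers in memories.items():
        print(cid, centers.shape)
```

In this sketch, `build_sub_memories` would be run separately on visible and infrared features, and `mutual_matches` would then be applied to the two sets of centers; a pair is kept only when each side is the other's nearest neighbor, which is one simple way to read "bidirectional matching".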
Related papers
- Modality-Aware Bias Mitigation and Invariance Learning for Unsupervised Visible-Infrared Person Re-Identification [14.343677160918723]
Unsupervised visible-infrared person re-identification (USVI-ReID) aims to match individuals across visible and infrared cameras without relying on any annotation. Estimating reliable cross-modality association is a major challenge in USVI-ReID. This paper focuses on addressing cross-modality learning from two aspects: bias-mitigated global association and modality-invariant representation learning.
arXiv Detail & Related papers (2025-12-08T17:42:28Z)
- Dynamic Modality-Camera Invariant Clustering for Unsupervised Visible-Infrared Person Re-identification [46.63906666692304]
Unsupervised learning visible-infrared person re-identification (USL-VI-ReID) offers a more flexible and cost-effective alternative to supervised methods. Existing methods simply cluster modality-specific samples and employ strong association techniques to achieve instance-to-cluster or cluster-to-cluster cross-modality associations. We propose a novel Dynamic Modality-Camera Invariant Clustering (DMIC) framework for USL-VI-ReID.
arXiv Detail & Related papers (2024-12-11T09:31:03Z)
- Unsupervised Visible-Infrared ReID via Pseudo-label Correction and Modality-level Alignment [23.310509459311046]
Unsupervised visible-infrared person re-identification (UVI-ReID) has recently gained great attention due to its potential for enhancing human detection in diverse environments without labeling.
Previous methods utilize intra-modality clustering and cross-modality feature matching to achieve UVI-ReID.
arXiv Detail & Related papers (2024-04-10T02:03:14Z)
- Learning Commonality, Divergence and Variety for Unsupervised Visible-Infrared Person Re-identification [32.537029197752915]
Unsupervised visible-infrared person re-identification (USVI-ReID) aims to match specified people in infrared images to visible images without annotations, and vice versa.
Most existing methods address USVI-ReID using cluster-based contrastive learning, which simply employs the cluster center as the representation of a person.
We propose a Progressive Contrastive Learning with Hard and Dynamic Prototypes method for USVI-ReID.
arXiv Detail & Related papers (2024-02-29T10:37:49Z)
- Exploring Homogeneous and Heterogeneous Consistent Label Associations for Unsupervised Visible-Infrared Person ReID [57.500045584556794]
We introduce a Modality-Unified Label Transfer (MULT) module that simultaneously accounts for both homogeneous and heterogeneous fine-grained instance-level structures. The proposed MULT ensures that the generated pseudo-labels maintain alignment across modalities while upholding structural consistency within each modality. Experiments demonstrate that our proposed method outperforms existing state-of-the-art USL-VI-ReID methods.
arXiv Detail & Related papers (2024-02-01T15:33:17Z)
- Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification [30.983346937558743]
Key challenges in USL-VI-ReID are to effectively generate pseudo-labels and establish pseudo-label correspondences.
We propose a Multi-Memory Matching framework for USL-VI-ReID.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the reliability of the established cross-modality correspondences.
arXiv Detail & Related papers (2024-01-12T01:24:04Z)
- Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at the cluster level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z)
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- Camera-aware Proxies for Unsupervised Person Re-Identification [60.26031011794513]
This paper tackles the purely unsupervised person re-identification (Re-ID) problem that requires no annotations.
We propose to split each single cluster into multiple proxies, where each proxy represents the instances coming from the same camera.
Based on the camera-aware proxies, we design both intra- and inter-camera contrastive learning components for our Re-ID model.
arXiv Detail & Related papers (2020-12-19T12:37:04Z)
- Unsupervised Person Re-identification via Softened Similarity Learning [122.70472387837542]
Person re-identification (re-ID) is an important topic in computer vision.
This paper studies the unsupervised setting of re-ID, which does not require any labeled information.
Experiments on two image-based and video-based datasets demonstrate state-of-the-art performance.
arXiv Detail & Related papers (2020-04-07T17:16:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.