Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID
- URL: http://arxiv.org/abs/2305.12673v4
- Date: Sun, 03 Nov 2024 13:48:21 GMT
- Title: Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID
- Authors: De Cheng, Lingfeng He, Nannan Wang, Shizhou Zhang, Zhen Wang, Xinbo Gao,
- Abstract summary: We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at a cluster-level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
- Score: 56.573905143954015
- License:
- Abstract: Unsupervised visible-infrared person re-identification (USL-VI-ReID) aims to match pedestrian images of the same identity from different modalities without annotations. Existing works mainly focus on alleviating the modality gap by aligning instance-level features of the unlabeled samples. However, the relationships between cross-modality clusters are not well explored. To this end, we propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters. Specifically, we design a Many-to-many Bilateral Cross-Modality Cluster Matching (MBCCM) algorithm through optimizing the maximum matching problem in a bipartite graph. Then, the matched pairwise clusters utilize shared visible and infrared pseudo-labels during the model training. Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at a cluster-level. Meanwhile, the cross-modality Consistency Constraint (CC) is proposed to explicitly reduce the large modality discrepancy. Extensive experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method, surpassing state-of-the-art approaches by a large margin of 8.76% mAP on average.
Related papers
- Extended Cross-Modality United Learning for Unsupervised Visible-Infrared Person Re-identification [34.93081601924748]
Unsupervised learning aims to learn modality-invariant features from unlabeled cross-modality datasets.
Existing methods lack cross-modality clustering or excessively pursue cluster-level association.
We propose Extended Cross-Modality United Learning (ECUL) framework, incorporating Extended Modality-Camera Clustering (EMCC) and Two-Step Memory Updating Strategy (TSMem) modules.
arXiv Detail & Related papers (2024-12-26T09:30:26Z) - Dynamic Modality-Camera Invariant Clustering for Unsupervised Visible-Infrared Person Re-identification [46.63906666692304]
Unsupervised learning visible-infrared person re-identification (USL-VI-ReID) offers a more flexible and cost-effective alternative to supervised methods.
Existing methods simply cluster modality-specific samples and employ strong association techniques to achieve instance-to-cluster or cluster-to-cluster cross-modality associations.
We propose a novel Dynamic Modality-Camera Invariant Clustering (DMIC) framework for USL-VI-ReID.
arXiv Detail & Related papers (2024-12-11T09:31:03Z) - Unsupervised Visible-Infrared ReID via Pseudo-label Correction and Modality-level Alignment [23.310509459311046]
Unsupervised visible-infrared person re-identification (UVI-ReID) has recently gained great attention due to its potential for enhancing human detection in diverse environments without labeling.
Previous methods utilize intra-modality clustering and cross-modality feature matching to achieve UVI-ReID.
arXiv Detail & Related papers (2024-04-10T02:03:14Z) - Exploring Homogeneous and Heterogeneous Consistent Label Associations for Unsupervised Visible-Infrared Person ReID [57.500045584556794]
We introduce a Modality-Unified Label Transfer (MULT) module that simultaneously accounts for both homogeneous and heterogeneous fine-grained instance-level structures.
The proposed MULT ensures that the generated pseudo-labels maintain alignment across modalities while upholding structural consistency within intra-modality.
Experiments demonstrate that our proposed method outperforms existing state-of-the-art USL-VI-ReID methods.
arXiv Detail & Related papers (2024-02-01T15:33:17Z) - Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification [30.983346937558743]
Key challenges in USL-VI-ReID are to effectively generate pseudo-labels and establish pseudo-label correspondences.
We propose a Multi-Memory Matching framework for USL-VI-ReID.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the reliability of the established cross-modality correspondences.
arXiv Detail & Related papers (2024-01-12T01:24:04Z) - Unsupervised Visible-Infrared Person ReID by Collaborative Learning with Neighbor-Guided Label Refinement [53.044703127757295]
Unsupervised learning visible-infrared person re-identification (USL-VI-ReID) aims at learning modality-invariant features from unlabeled cross-modality dataset.
We propose a Dual Optimal Transport Label Assignment (DOTLA) framework to simultaneously assign the generated labels from one modality to its counterpart modality.
The proposed DOTLA mechanism formulates a mutual reinforcement and efficient solution to cross-modality data association, which could effectively reduce the side-effects of some insufficient and noisy label associations.
arXiv Detail & Related papers (2023-05-22T04:40:30Z) - Rethinking Clustering-Based Pseudo-Labeling for Unsupervised
Meta-Learning [146.11600461034746]
Method for unsupervised meta-learning, CACTUs, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z) - Interpolation-based Correlation Reduction Network for Semi-Supervised
Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN)
In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries.
By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discnative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.