Extended Cross-Modality United Learning for Unsupervised Visible-Infrared Person Re-identification
- URL: http://arxiv.org/abs/2412.19134v1
- Date: Thu, 26 Dec 2024 09:30:26 GMT
- Title: Extended Cross-Modality United Learning for Unsupervised Visible-Infrared Person Re-identification
- Authors: Ruixing Wu, Yiming Yang, Jiakai He, Haifeng Hu
- Abstract summary: Unsupervised learning visible-infrared person re-identification (USL-VI-ReID) aims to learn modality-invariant features from unlabeled cross-modality datasets.
Existing methods lack cross-modality clustering or excessively pursue cluster-level association.
We propose Extended Cross-Modality United Learning (ECUL) framework, incorporating Extended Modality-Camera Clustering (EMCC) and Two-Step Memory Updating Strategy (TSMem) modules.
- Score: 34.93081601924748
- License:
- Abstract: Unsupervised learning visible-infrared person re-identification (USL-VI-ReID) aims to learn modality-invariant features from unlabeled cross-modality datasets and reduce the inter-modality gap. However, existing methods either lack cross-modality clustering or excessively pursue cluster-level association, which makes reliable modality-invariant feature learning difficult. To address this issue, we propose an Extended Cross-Modality United Learning (ECUL) framework, incorporating Extended Modality-Camera Clustering (EMCC) and Two-Step Memory Updating Strategy (TSMem) modules. Specifically, ECUL naturally integrates intra-modality clustering, inter-modality clustering and inter-modality instance selection, establishing compact and accurate cross-modality associations while limiting the introduction of noisy labels. Moreover, EMCC captures and filters neighborhood relationships by extending the encoding vector, which further promotes the learning of modality-invariant and camera-invariant knowledge within the clustering algorithm. Finally, TSMem provides accurate and generalized proxy points for contrastive learning by updating the memory in stages. Extensive experimental results on the SYSU-MM01 and RegDB datasets demonstrate that the proposed ECUL shows promising performance and even outperforms certain supervised methods.
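The abstract outlines the recipe only at a high level: clustering produces pseudo-labels, a memory bank supplies cluster-level proxy points for contrastive learning, and that memory is updated in stages. Since no implementation details are given, the sketch below is a minimal NumPy illustration of this general memory-based cluster-contrastive setup; the class name, the centroid-then-momentum staging, and the InfoNCE-style loss are illustrative assumptions, not the authors' TSMem code.
```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


class StagedClusterMemory:
    """Cluster-level memory that supplies proxy points for contrastive learning.

    Stage 1: slots are rebuilt from cluster centroids (stable, coarse proxies).
    Stage 2: slots are refined by instance-level momentum updates (sharper proxies).
    This staging is a hypothetical reading of a two-step memory update, not the
    actual TSMem rule, which the abstract does not specify.
    """

    def __init__(self, num_clusters, dim, momentum=0.2):
        self.memory = l2_normalize(np.random.randn(num_clusters, dim))
        self.momentum = momentum

    def stage1_update(self, features, pseudo_labels):
        # Re-initialize each slot from the mean feature of its cluster members.
        for c in np.unique(pseudo_labels):
            self.memory[c] = l2_normalize(features[pseudo_labels == c].mean(axis=0))

    def stage2_update(self, features, pseudo_labels):
        # Momentum update with individual instances, as in common memory banks.
        for f, c in zip(features, pseudo_labels):
            self.memory[c] = l2_normalize(
                self.momentum * self.memory[c] + (1.0 - self.momentum) * f)

    def contrastive_loss(self, features, pseudo_labels, temperature=0.05):
        # InfoNCE between instance features and the cluster proxies in the memory.
        logits = l2_normalize(features) @ self.memory.T / temperature
        logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
        log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -log_prob[np.arange(len(features)), pseudo_labels].mean()
```
In a training loop one would call `stage1_update` with freshly clustered features during early epochs and switch to `stage2_update` later, while `contrastive_loss` is computed on mini-batch features against the current memory; here `features` is an `(N, dim)` float array and `pseudo_labels` an `(N,)` integer array.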
Related papers
- Dynamic Modality-Camera Invariant Clustering for Unsupervised Visible-Infrared Person Re-identification [46.63906666692304]
Unsupervised learning visible-infrared person re-identification (USL-VI-ReID) offers a more flexible and cost-effective alternative to supervised methods.
Existing methods simply cluster modality-specific samples and employ strong association techniques to achieve instance-to-cluster or cluster-to-cluster cross-modality associations.
We propose a novel Dynamic Modality-Camera Invariant Clustering (DMIC) framework for USL-VI-ReID.
arXiv Detail & Related papers (2024-12-11T09:31:03Z)
- RoNID: New Intent Discovery with Generated-Reliable Labels and Cluster-friendly Representations [27.775731666470175]
New Intent Discovery (NID) aims to identify novel intent groups in the open-world scenario.
Current methods face issues with inaccurate pseudo-labels and poor representation learning.
We propose a Robust New Intent Discovery framework optimized by an EM-style method.
arXiv Detail & Related papers (2024-04-13T11:58:28Z)
- Exploring Homogeneous and Heterogeneous Consistent Label Associations for Unsupervised Visible-Infrared Person ReID [57.500045584556794]
We introduce a Modality-Unified Label Transfer (MULT) module that simultaneously accounts for both homogeneous and heterogeneous fine-grained instance-level structures.
The proposed MULT ensures that the generated pseudo-labels maintain alignment across modalities while upholding structural consistency within each modality.
Experiments demonstrate that our proposed method outperforms existing state-of-the-art USL-VI-ReID methods.
arXiv Detail & Related papers (2024-02-01T15:33:17Z)
- Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification [30.983346937558743]
Key challenges in USL-VI-ReID are to effectively generate pseudo-labels and establish pseudo-label correspondences.
We propose a Multi-Memory Matching framework for USL-VI-ReID.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the reliability of the established cross-modality correspondences.
arXiv Detail & Related papers (2024-01-12T01:24:04Z)
- Unsupervised Visible-Infrared Person ReID by Collaborative Learning with Neighbor-Guided Label Refinement [53.044703127757295]
Unsupervised learning visible-infrared person re-identification (USL-VI-ReID) aims at learning modality-invariant features from unlabeled cross-modality datasets.
We propose a Dual Optimal Transport Label Assignment (DOTLA) framework to simultaneously assign the generated labels from one modality to its counterpart modality.
The proposed DOTLA mechanism formulates a mutually reinforcing and efficient solution to cross-modality data association, which can effectively reduce the side effects of insufficient and noisy label associations; a generic optimal-transport label-assignment sketch is given after this list.
arXiv Detail & Related papers (2023-05-22T04:40:30Z)
- Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to jointly align features at the cluster level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z)
- Dynamic Clustering and Cluster Contrastive Learning for Unsupervised Person Re-identification [29.167783500369442]
Unsupervised Re-ID methods aim at learning robust and discriminative features from unlabeled data.
We propose a dynamic clustering and cluster contrastive learning (DCCC) method.
Experiments on several widely used public datasets validate the effectiveness of our proposed DCCC.
arXiv Detail & Related papers (2023-03-13T01:56:53Z)
- Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of semi-supervised learning (SSL) and domain adaptation (DA).
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
- Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification [208.1227090864602]
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem.
Existing VI-ReID methods tend to learn global representations, which have limited discriminability and weak robustness to noisy images.
We propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
arXiv Detail & Related papers (2020-07-18T03:08:13Z)
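As referenced in the DOTLA entry above, cross-modality pseudo-label assignment can be cast as an optimal transport problem. The sketch below is only a generic illustration of that idea, not of DOTLA itself: it assigns visible-modality instances to infrared cluster centroids with entropy-regularized Sinkhorn transport under uniform marginals, and the cost definition, hyperparameters and function names are assumptions.
```python
import numpy as np


def sinkhorn(cost, epsilon=0.1, n_iters=100):
    """Entropy-regularized optimal transport (Sinkhorn-Knopp), uniform marginals."""
    n, m = cost.shape
    K = np.exp(-(cost - cost.min()) / epsilon)   # shift cost for numerical stability
    r = np.full(n, 1.0 / n)                      # uniform marginal over instances
    c = np.full(m, 1.0 / m)                      # uniform marginal over clusters
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):
        u = r / (K @ v)
        v = c / (K.T @ u)
    return (u[:, None] * K) * v[None, :]         # transport plan


def assign_cross_modality_labels(vis_feats, ir_centroids):
    """Give each visible-modality instance an infrared-cluster pseudo-label,
    using (1 - cosine similarity) as the transport cost."""
    vis = vis_feats / np.linalg.norm(vis_feats, axis=1, keepdims=True)
    cen = ir_centroids / np.linalg.norm(ir_centroids, axis=1, keepdims=True)
    plan = sinkhorn(1.0 - vis @ cen.T)
    return plan.argmax(axis=1)                   # hard pseudo-label per instance
```
A symmetric call with the roles swapped (infrared instances against visible centroids) would give the other assignment direction that a dual or bilateral scheme alternates between.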