Weakly Supervised Visible-Infrared Person Re-Identification via Heterogeneous Expert Collaborative Consistency Learning
- URL: http://arxiv.org/abs/2507.12942v1
- Date: Thu, 17 Jul 2025 09:31:34 GMT
- Title: Weakly Supervised Visible-Infrared Person Re-Identification via Heterogeneous Expert Collaborative Consistency Learning
- Authors: Yafei Zhang, Lingqi Kong, Huafeng Li, Jie Wen
- Abstract summary: This paper explores a weakly supervised cross-modal person ReID method that uses only single-modal sample identity labels. We propose a heterogeneous expert collaborative consistency learning framework, designed to establish robust cross-modal identity correspondences. Experimental results on two challenging datasets validate the effectiveness of the proposed method.
- Score: 13.578738226091911
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To reduce the reliance of visible-infrared person re-identification (ReID) models on labeled cross-modal samples, this paper explores a weakly supervised cross-modal person ReID method that uses only single-modal sample identity labels, addressing scenarios where cross-modal identity labels are unavailable. To mitigate the impact of missing cross-modal labels on model performance, we propose a heterogeneous expert collaborative consistency learning framework, designed to establish robust cross-modal identity correspondences in a weakly supervised manner. This framework leverages labeled data from each modality to independently train dedicated classification experts. To associate cross-modal samples, these classification experts act as heterogeneous predictors, predicting the identities of samples from the other modality. To improve prediction accuracy, we design a cross-modal relationship fusion mechanism that effectively integrates predictions from different experts. Under the implicit supervision provided by cross-modal identity correspondences, collaborative and consistent learning among the experts is encouraged, significantly enhancing the model's ability to extract modality-invariant features and improve cross-modal identity recognition. Experimental results on two challenging datasets validate the effectiveness of the proposed method.
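The abstract describes two modality-specific classification experts on a shared encoder: each expert learns from its own single-modal labels and then pseudo-labels samples from the other modality to build cross-modal identity correspondences. A minimal PyTorch-style sketch of that idea follows; all names are hypothetical, and a simple confidence threshold stands in for the paper's cross-modal relationship fusion mechanism, so this is an illustration of the training signal rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualExpert(nn.Module):
    """Shared encoder with one classification 'expert' per modality (sketch only)."""
    def __init__(self, backbone, feat_dim, n_vis_ids, n_ir_ids):
        super().__init__()
        self.backbone = backbone                          # modality-shared feature extractor
        self.vis_expert = nn.Linear(feat_dim, n_vis_ids)  # trained on visible-modality labels
        self.ir_expert = nn.Linear(feat_dim, n_ir_ids)    # trained on infrared-modality labels

    def forward(self, x):
        f = self.backbone(x)
        return self.vis_expert(f), self.ir_expert(f)

def training_step(model, x_vis, y_vis, x_ir, y_ir, thr=0.8, alpha=0.5):
    vis_on_vis, ir_on_vis = model(x_vis)
    vis_on_ir, ir_on_ir = model(x_ir)

    # Single-modal supervision: each expert learns from its own identity labels.
    loss = F.cross_entropy(vis_on_vis, y_vis) + F.cross_entropy(ir_on_ir, y_ir)

    # Cross-modal correspondence: the visible expert pseudo-labels infrared samples;
    # only confident predictions are kept (a crude stand-in for the paper's
    # cross-modal relationship fusion mechanism).
    probs = F.softmax(vis_on_ir.detach(), dim=1)
    conf, pseudo = probs.max(dim=1)
    keep = conf > thr
    if keep.any():
        # Consistency term: push the shared encoder to make infrared features
        # agree with the visible expert's pseudo-identities.
        loss = loss + alpha * F.cross_entropy(vis_on_ir[keep], pseudo[keep])
    return loss

# Toy wiring for illustration (flattened 128x64 RGB crops, 395 identities per modality):
# model = DualExpert(nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 64, 512), nn.ReLU()),
#                    feat_dim=512, n_vis_ids=395, n_ir_ids=395)
```

The symmetric term (the infrared expert pseudo-labeling visible samples) would be added the same way; the paper additionally fuses the experts' predictions before using them as supervision.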
Related papers
- Style-Aware Blending and Prototype-Based Cross-Contrast Consistency for Semi-Supervised Medical Image Segmentation [4.989577402211973]
We propose a style-aware blending and prototype-based cross-contrast consistency learning framework. Inspired by the empirical observation that the distribution mismatch between labeled and unlabeled data can be characterized by statistical moments, we design a style-guided distribution blending module. Considering the potential noise in strong pseudo-labels, we introduce a prototype-based cross-contrast strategy to encourage the model to learn informative supervisory signals.
arXiv Detail & Related papers (2025-07-28T11:26:24Z) - Dual-Granularity Cross-Modal Identity Association for Weakly-Supervised Text-to-Person Image Matching [7.1469465755934785]
Weakly supervised text-to-person image matching is a crucial approach to reducing models' reliance on large-scale manually labeled samples. We propose a dual-granularity identity association mechanism to predict complex one-to-many identity relationships. Experimental results demonstrate that the proposed method substantially boosts cross-modal matching accuracy.
arXiv Detail & Related papers (2025-07-09T10:59:13Z) - Mix-Modality Person Re-Identification: A New and Practical Paradigm [20.01921944345468]
Existing visible-infrared person re-identification (VI-ReID) methods have achieved some results in the bi-modality mutual retrieval paradigm. We propose a new and more practical mix-modality retrieval paradigm. This paper proposes a Mix-Modality person re-identification (MM-ReID) task, explores the influence of the modality mixing ratio on performance, and constructs mix-modality test sets for existing datasets.
arXiv Detail & Related papers (2024-12-06T02:19:57Z) - CART: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling [53.97609687516371]
Cross-modal retrieval aims to search for instances that are semantically related to the query through the interaction of different modal data. Traditional solutions utilize a single-tower or dual-tower framework to explicitly compute the score between queries and candidates. We propose a generative cross-modal retrieval framework (CART) based on coarse-to-fine semantic modeling.
arXiv Detail & Related papers (2024-06-25T12:47:04Z) - Unsupervised Visible-Infrared ReID via Pseudo-label Correction and Modality-level Alignment [23.310509459311046]
Unsupervised visible-infrared person re-identification (UVI-ReID) has recently gained great attention due to its potential for enhancing human detection in diverse environments without labeling.
Previous methods utilize intra-modality clustering and cross-modality feature matching to achieve UVI-ReID.
arXiv Detail & Related papers (2024-04-10T02:03:14Z) - Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks [92.32670915472099]
We propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs).
We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.
arXiv Detail & Related papers (2023-10-03T17:37:52Z) - Modality Unifying Network for Visible-Infrared Person Re-Identification [24.186989535051623]
Visible-infrared person re-identification (VI-ReID) is a challenging task due to large cross-modality discrepancies and intra-class variations.
Existing methods mainly focus on learning modality-shared representations by embedding different modalities into the same feature space.
We propose a novel Modality Unifying Network (MUN) to explore a robust auxiliary modality for VI-ReID.
arXiv Detail & Related papers (2023-09-12T14:22:22Z) - Unsupervised Visible-Infrared Person ReID by Collaborative Learning with Neighbor-Guided Label Refinement [53.044703127757295]
Unsupervised learning visible-infrared person re-identification (USL-VI-ReID) aims at learning modality-invariant features from unlabeled cross-modality dataset.
We propose a Dual Optimal Transport Label Assignment (DOTLA) framework to simultaneously assign the generated labels from one modality to its counterpart modality.
The proposed DOTLA mechanism formulates a mutual reinforcement and efficient solution to cross-modality data association, which could effectively reduce the side-effects of some insufficient and noisy label associations.
arXiv Detail & Related papers (2023-05-22T04:40:30Z) - Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at a cluster-level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z) - Neighbour Consistency Guided Pseudo-Label Refinement for Unsupervised Person Re-Identification [80.98291772215154]
Unsupervised person re-identification (ReID) aims at learning discriminative identity features for person retrieval without any annotations.
Recent advances accomplish this task by leveraging clustering-based pseudo labels.
We propose a Neighbour Consistency guided Pseudo Label Refinement framework.
arXiv Detail & Related papers (2022-11-30T09:39:57Z) - Trusted Multi-View Classification with Dynamic Evidential Fusion [73.35990456162745]
We propose a novel multi-view classification algorithm, termed trusted multi-view classification (TMC).
TMC provides a new paradigm for multi-view learning by dynamically integrating different views at an evidence level.
Both theoretical and experimental results validate the effectiveness of the proposed model in accuracy, robustness and trustworthiness.
arXiv Detail & Related papers (2022-04-25T03:48:49Z)
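The TMC entry above integrates views "at an evidence level"; below is a small Python sketch of the reduced Dempster-Shafer combination of per-view beliefs and uncertainties that this line of evidential-fusion work builds on. The function names and numbers are illustrative only, not taken from the paper, and the sketch omits the Dirichlet-based training objective.

```python
import numpy as np

def opinion(evidence):
    """Convert non-negative per-class evidence into belief masses and an
    uncertainty mass (subjective-logic form used in evidential classification)."""
    evidence = np.asarray(evidence, dtype=float)
    K = evidence.size
    S = evidence.sum() + K          # Dirichlet strength: alpha = evidence + 1
    return evidence / S, K / S      # beliefs b_k, uncertainty u (sum to 1)

def combine(b1, u1, b2, u2):
    """Reduced Dempster's rule: fuse two views' opinions, discounting conflict."""
    conflict = np.sum(np.outer(b1, b2)) - np.sum(b1 * b2)  # mass on disagreeing classes
    scale = 1.0 / (1.0 - conflict)
    b = scale * (b1 * b2 + b1 * u2 + b2 * u1)
    u = scale * (u1 * u2)
    return b, u

# Toy usage: fuse a confident view with an uncertain one over 3 identities.
b_rgb, u_rgb = opinion([9.0, 0.5, 0.5])   # confident view
b_ir,  u_ir  = opinion([1.0, 1.0, 1.0])   # uncertain view
b, u = combine(b_rgb, u_rgb, b_ir, u_ir)
print(b, u)  # fused beliefs follow the confident view; u reports residual uncertainty
```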