Pseudo-label Refinement for Improving Self-Supervised Learning Systems
- URL: http://arxiv.org/abs/2410.14242v1
- Date: Fri, 18 Oct 2024 07:47:59 GMT
- Title: Pseudo-label Refinement for Improving Self-Supervised Learning Systems
- Authors: Zia-ur-Rehman, Arif Mahmood, Wenxiong Kang,
- Abstract summary: Self-supervised learning systems use clustering-based pseudo-labels to provide supervision without the need for human annotations.
The noise in these pseudo-labels caused by the clustering methods poses a challenge to the learning process leading to degraded performance.
We propose a pseudo-label refinement algorithm to address this issue.
- Score: 22.276126184466207
- License:
- Abstract: Self-supervised learning systems have gained significant attention in recent years by leveraging clustering-based pseudo-labels to provide supervision without the need for human annotations. However, the noise in these pseudo-labels caused by the clustering methods poses a challenge to the learning process leading to degraded performance. In this work, we propose a pseudo-label refinement (SLR) algorithm to address this issue. The cluster labels from the previous epoch are projected to the current epoch cluster-labels space and a linear combination of the new label and the projected label is computed as a soft refined label containing the information from the previous epoch clusters as well as from the current epoch. In contrast to the common practice of using the maximum value as a cluster/class indicator, we employ hierarchical clustering on these soft pseudo-labels to generate refined hard-labels. This approach better utilizes the information embedded in the soft labels, outperforming the simple maximum value approach for hard label generation. The effectiveness of the proposed SLR algorithm is evaluated in the context of person re-identification (Re-ID) using unsupervised domain adaptation (UDA). Experimental results demonstrate that the modified Re-ID baseline, incorporating the SLR algorithm, achieves significantly improved mean Average Precision (mAP) performance in various UDA tasks, including real-to-synthetic, synthetic-to-real, and different real-to-real scenarios. These findings highlight the efficacy of the SLR algorithm in enhancing the performance of self-supervised learning systems.
Related papers
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z) - Neighbour Consistency Guided Pseudo-Label Refinement for Unsupervised
Person Re-Identification [80.98291772215154]
Unsupervised person re-identification (ReID) aims at learning discriminative identity features for person retrieval without any annotations.
Recent advances accomplish this task by leveraging clustering-based pseudo labels.
We propose a Neighbour Consistency guided Pseudo Label Refinement framework.
arXiv Detail & Related papers (2022-11-30T09:39:57Z) - Complementary Labels Learning with Augmented Classes [22.460256396941528]
Complementary Labels Learning (CLL) arises in many real-world tasks such as private questions classification and online learning.
We propose a novel problem setting called Complementary Labels Learning with Augmented Classes (CLLAC)
By using unlabeled data, we propose an unbiased estimator of classification risk for CLLAC, which is guaranteed to be provably consistent.
arXiv Detail & Related papers (2022-11-19T13:55:27Z) - Plug-and-Play Pseudo Label Correction Network for Unsupervised Person
Re-identification [36.3733132520186]
We propose a graph-based pseudo label correction network (GLC) to refine the pseudo labels in the manner of supervised clustering.
GLC learns to rectify the initial noisy labels by means of the relationship constraints between samples on the k Nearest Neighbor graph.
Our method is widely compatible with various clustering-based methods and promotes the state-of-the-art performance consistently.
arXiv Detail & Related papers (2022-06-14T05:59:37Z) - Refining Pseudo Labels with Clustering Consensus over Generations for
Unsupervised Object Re-identification [84.72303377833732]
Unsupervised object re-identification targets at learning discriminative representations for object retrieval without any annotations.
We propose to estimate pseudo label similarities between consecutive training generations with clustering consensus and refine pseudo labels with temporally propagated and ensembled pseudo labels.
The proposed pseudo label refinery strategy is simple yet effective and can be seamlessly integrated into existing clustering-based unsupervised re-identification methods.
arXiv Detail & Related papers (2021-06-11T02:42:42Z) - Dual-Refinement: Joint Label and Feature Refinement for Unsupervised
Domain Adaptive Person Re-Identification [51.98150752331922]
Unsupervised domain adaptive (UDA) person re-identification (re-ID) is a challenging task due to the missing of labels for the target domain data.
We propose a novel approach, called Dual-Refinement, that jointly refines pseudo labels at the off-line clustering phase and features at the on-line training phase.
Our method outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-12-26T07:35:35Z) - Delving Deep into Label Smoothing [112.24527926373084]
Label smoothing is an effective regularization tool for deep neural networks (DNNs)
We present an Online Label Smoothing (OLS) strategy, which generates soft labels based on the statistics of the model prediction for the target category.
arXiv Detail & Related papers (2020-11-25T08:03:11Z) - PseudoSeg: Designing Pseudo Labels for Semantic Segmentation [78.35515004654553]
We present a re-design of pseudo-labeling to generate structured pseudo labels for training with unlabeled or weakly-labeled data.
We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.
arXiv Detail & Related papers (2020-10-19T17:59:30Z) - ESL: Entropy-guided Self-supervised Learning for Domain Adaptation in
Semantic Segmentation [35.03150829133562]
We propose Entropy-guided Self-supervised Learning, leveraging entropy as the confidence indicator for producing more accurate pseudo-labels.
On different UDA benchmarks, ESL consistently outperforms strong SSL baselines and achieves state-of-the-art results.
arXiv Detail & Related papers (2020-06-15T18:10:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.