Learning with Noisy Labels Using Collaborative Sample Selection and
Contrastive Semi-Supervised Learning
- URL: http://arxiv.org/abs/2310.15533v1
- Date: Tue, 24 Oct 2023 05:37:20 GMT
- Title: Learning with Noisy Labels Using Collaborative Sample Selection and
Contrastive Semi-Supervised Learning
- Authors: Qing Miao, Xiaohe Wu, Chao Xu, Yanli Ji, Wangmeng Zuo, Yiwen Guo,
Zhaopeng Meng
- Abstract summary: Collaborative Sample Selection (CSS) removes noisy samples from the identified clean set.
We introduce a co-training mechanism with a contrastive loss in semi-supervised learning.
- Score: 76.00798972439004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning with noisy labels (LNL) has been extensively studied, with existing
approaches typically following a framework that alternates between clean sample
selection and semi-supervised learning (SSL). However, this approach has a
limitation: the clean set selected by the Deep Neural Network (DNN) classifier,
trained through self-training, inevitably contains noisy samples. This mixture
of clean and noisy samples leads to misguidance in DNN training during SSL,
resulting in impaired generalization performance due to confirmation bias
caused by error accumulation in sample selection. To address this issue, we
propose a method called Collaborative Sample Selection (CSS), which leverages
the large-scale pre-trained model CLIP. CSS aims to remove the mixed noisy
samples from the identified clean set. We achieve this by training a
2-Dimensional Gaussian Mixture Model (2D-GMM) that combines the probabilities
from CLIP with the predictions from the DNN classifier. To further enhance the
adaptation of CLIP to LNL, we introduce a co-training mechanism with a
contrastive loss in semi-supervised learning. This allows us to jointly train
the prompt of CLIP and the DNN classifier, resulting in improved feature
representation, boosted classification performance of DNNs, and reciprocal
benefits to our Collaborative Sample Selection. By incorporating auxiliary
information from CLIP and utilizing prompt fine-tuning, we effectively
eliminate noisy samples from the clean set and mitigate confirmation bias
during training. Experimental results on multiple benchmark datasets
demonstrate the effectiveness of our proposed method in comparison with the
state-of-the-art approaches.
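As a rough illustration of the 2D-GMM selection step described in the abstract, a minimal sketch is given below. The array names (clip_prob, dnn_prob), the use of scikit-learn's GaussianMixture, and the 0.5 posterior threshold are assumptions made for this sketch, not details taken from the paper.

```python
# Minimal sketch of the Collaborative Sample Selection step with a 2-D GMM.
# Assumptions (not from the paper): per-sample confidences are already
# computed, scikit-learn is used for the GMM, and a posterior threshold of
# 0.5 marks a sample as clean.
import numpy as np
from sklearn.mixture import GaussianMixture


def collaborative_sample_selection(clip_prob, dnn_prob, threshold=0.5):
    """Split samples into clean/noisy using two confidence views.

    clip_prob : (N,) CLIP probability assigned to each sample's given label
    dnn_prob  : (N,) DNN classifier probability for the same given label
    Returns a boolean mask over the N samples (True = kept as clean).
    """
    # Each sample becomes a 2-D point: (CLIP confidence, DNN confidence).
    feats = np.stack([clip_prob, dnn_prob], axis=1)

    # Two components: one expected to capture clean samples, one noisy ones.
    gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
    gmm.fit(feats)

    # Treat the component whose mean has the higher overall confidence as
    # the "clean" component.
    clean_component = int(np.argmax(gmm.means_.sum(axis=1)))

    # Posterior probability of each sample belonging to the clean component.
    clean_posterior = gmm.predict_proba(feats)[:, clean_component]
    return clean_posterior > threshold


# Toy usage with random confidences (illustration only).
rng = np.random.default_rng(0)
mask = collaborative_sample_selection(rng.uniform(size=1000), rng.uniform(size=1000))
print(f"kept {mask.sum()} / {mask.size} samples as clean")
```

Under this reading, samples whose clean posterior clears the threshold would serve as the labeled set in the subsequent semi-supervised stage, while the remaining samples would be treated as noisy.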
Related papers
- CLIPCleaner: Cleaning Noisy Labels with CLIP [36.434849361479316]
CLIPCleaner is a zero-shot classifier for efficient, offline, clean sample selection.
CLIPCleaner offers a simple, single-step approach that achieves competitive or superior performance on benchmark datasets.
arXiv Detail & Related papers (2024-08-19T14:05:58Z)
- Debiased Sample Selection for Combating Noisy Labels [24.296451733127956]
We propose a noIse-Tolerant Expert Model (ITEM) for debiased learning in sample selection.
Specifically, to mitigate the training bias, we design a robust network architecture that integrates with multiple experts.
By training on the mixture of two class-discriminative mini-batches, the model mitigates the effect of the imbalanced training set.
arXiv Detail & Related papers (2024-01-24T10:37:28Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Combating Label Noise With A General Surrogate Model For Sample Selection [84.61367781175984]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically.
We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z)
- PASS: Peer-Agreement based Sample Selection for training with Noisy Labels [16.283722126438125]
The prevalence of noisy-label samples poses a significant challenge in deep learning, inducing overfitting effects.
Current methodologies often rely on the small-loss hypothesis or feature-based selection to separate noisy- and clean-label samples (a minimal small-loss baseline is sketched after this entry).
We propose a new noisy-label detection method, termed Peer-Agreement based Sample Selection (PASS), to address this problem.
arXiv Detail & Related papers (2023-03-20T00:35:33Z)
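For context on the small-loss hypothesis mentioned in the PASS summary above, here is a minimal sketch of a plain small-loss selection baseline, not PASS itself; the 70% keep-ratio is an arbitrary assumption for illustration.

```python
# Background sketch of small-loss sample selection (the common baseline the
# PASS summary refers to), NOT the PASS method itself.
# Assumption: keep the 70% of samples with the lowest per-sample loss.
import numpy as np


def small_loss_selection(per_sample_loss, keep_ratio=0.7):
    """Return a boolean mask keeping the lowest-loss fraction of samples."""
    per_sample_loss = np.asarray(per_sample_loss)
    cutoff = np.quantile(per_sample_loss, keep_ratio)
    return per_sample_loss <= cutoff


# Toy usage with random stand-ins for per-sample cross-entropy losses.
losses = np.random.rand(10)
print(small_loss_selection(losses))
```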
- Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup [54.09898347820941]
We propose Scenario-Agnostic Mixup (SAMix) for both Self-supervised Learning (SSL) and supervised learning (SL) scenarios.
Specifically, we hypothesize and verify the objective function of mixup generation as optimizing local smoothness between two mixed classes.
A label-free generation sub-network is designed, which effectively provides non-trivial mixup samples and improves transferable abilities (vanilla mixup is sketched after this entry for reference).
arXiv Detail & Related papers (2021-11-30T14:49:59Z)
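For reference alongside the SAMix summary above, this is a sketch of vanilla mixup, the standard sample-mixing operation that mixup-generation methods start from; the Beta(1, 1) coefficient and one-hot labels are assumptions of this sketch, not details from the SAMix paper.

```python
# Sketch of vanilla mixup (the standard operation that mixup-generation
# methods such as SAMix start from). Assumptions: inputs are numpy arrays
# and labels are one-hot vectors; alpha=1.0 is an arbitrary choice here.
import numpy as np


def mixup(x1, y1, x2, y2, alpha=1.0):
    """Return a convex combination of two samples and their one-hot labels."""
    lam = np.random.beta(alpha, alpha)  # mixing coefficient in (0, 1)
    x_mix = lam * x1 + (1.0 - lam) * x2
    y_mix = lam * y1 + (1.0 - lam) * y2
    return x_mix, y_mix


# Toy usage: mix two 3-class samples.
xa, ya = np.random.rand(32, 32, 3), np.array([1.0, 0.0, 0.0])
xb, yb = np.random.rand(32, 32, 3), np.array([0.0, 0.0, 1.0])
xm, ym = mixup(xa, ya, xb, yb)
print(ym)  # soft label, e.g. [0.63, 0.0, 0.37]
```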
- Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning [101.28281124670647]
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data.
We propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning.
Our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-08-12T09:14:44Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution (a rough sketch of this idea follows this entry).
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
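The Jo-SRC summary above is terse, so the following is only one plausible, simplified reading of the two-view idea, not the exact Jo-SRC formulation: agreement between the averaged prediction and the given label is taken as a clean-ness score, and disagreement between the two views as an out-of-distribution score.

```python
# One plausible, simplified reading of the two-view idea above (not the exact
# Jo-SRC formulation): a sample looks clean when the averaged prediction
# agrees with its given label, and looks out-of-distribution when the two
# views disagree with each other. Both scores are normalized to [0, 1].
import torch
import torch.nn.functional as F


def js_divergence(p, q, eps=1e-8):
    """Jensen-Shannon divergence between two batches of distributions."""
    m = 0.5 * (p + q)
    kl_pm = (p * (p.add(eps).log() - m.add(eps).log())).sum(dim=1)
    kl_qm = (q * (q.add(eps).log() - m.add(eps).log())).sum(dim=1)
    return 0.5 * (kl_pm + kl_qm)


def clean_and_ood_scores(logits_view1, logits_view2, labels, num_classes):
    """Return per-sample (clean_likelihood, ood_score) from two views."""
    p1 = F.softmax(logits_view1, dim=1)
    p2 = F.softmax(logits_view2, dim=1)
    one_hot = F.one_hot(labels, num_classes).float()
    log2 = torch.log(torch.tensor(2.0))  # JS divergence is bounded by ln(2)

    # Clean likelihood: averaged prediction close to the given label.
    clean = 1.0 - js_divergence(0.5 * (p1 + p2), one_hot) / log2

    # OOD score: the two augmented views disagree with each other.
    ood = js_divergence(p1, p2) / log2
    return clean, ood


# Toy usage: 4 samples, 3 classes, two augmented views.
l1, l2 = torch.randn(4, 3), torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 1])
print(clean_and_ood_scores(l1, l2, labels, num_classes=3))
```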
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.