Collaborative Label Correction via Entropy Thresholding
- URL: http://arxiv.org/abs/2103.17008v1
- Date: Wed, 31 Mar 2021 11:42:55 GMT
- Title: Collaborative Label Correction via Entropy Thresholding
- Authors: Hao Wu, Jiangchao Yao, Jiajie Wang, Yinru Chen, Ya Zhang, Yanfeng Wang
- Abstract summary: Deep neural networks (DNNs) have the capacity to fit extremely noisy labels.
They tend to learn data with clean labels first and then memorize those with noisy labels.
We show that the low-entropy predictions selected by a given threshold are much more reliable as supervision than the original noisy labels.
- Score: 22.012654529811904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks (DNNs) have the capacity to fit extremely noisy labels;
nonetheless, they tend to learn data with clean labels first and then memorize
those with noisy labels. We examine this behavior in light of the Shannon
entropy of the predictions and demonstrate that the low-entropy predictions
selected by a given threshold are much more reliable as supervision than the
original noisy labels. This criterion also retains more training samples than
previous methods. We then combine the entropy criterion with the Collaborative
Label Correction (CLC) framework to further avoid the undesired local minima
of a single network. A range of experiments has been conducted on multiple
benchmarks with both synthetic and real-world noise settings. Extensive
results indicate that our CLC outperforms several state-of-the-art methods.
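To make the entropy criterion concrete, below is a minimal PyTorch sketch of the kind of selection rule the abstract describes; the function name, threshold value, and toy inputs are illustrative assumptions rather than details from the paper.

```python
import torch
import torch.nn.functional as F

def entropy_select(logits, threshold):
    """Keep only low-entropy predictions as supervision.

    Returns a boolean mask of reliable samples and their pseudo-labels.
    """
    probs = F.softmax(logits, dim=1)
    # Shannon entropy of each predicted distribution (natural log).
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum(dim=1)
    mask = entropy < threshold       # low entropy => reliable prediction
    pseudo = probs.argmax(dim=1)     # supervision replacing the noisy label
    return mask, pseudo

# Toy usage: 4 samples, 3 classes.
logits = torch.tensor([[4.0, 0.1, 0.2],    # confident -> selected
                       [1.0, 1.1, 0.9],    # near-uniform -> rejected
                       [0.0, 5.0, 0.1],
                       [2.0, 1.9, 2.1]])
mask, pseudo = entropy_select(logits, threshold=0.5)
print(mask, pseudo)
```

In the full CLC framework, two networks presumably exchange such selections so that neither is trapped by its own confident mistakes; the sketch above only illustrates the single-network selection step.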
Related papers
- Mitigating Noisy Supervision Using Synthetic Samples with Soft Labels [13.314778587751588]
Noisy labels are ubiquitous in real-world datasets, especially in large-scale ones derived from crowdsourcing and web search.
It is challenging to train deep neural networks on noisy datasets since the networks are prone to overfitting the noisy labels during training.
We propose a framework that trains the model on new synthetic samples to mitigate the impact of noisy labels.
arXiv Detail & Related papers (2024-06-22T04:49:39Z)
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specified probability measure, we can reduce the side effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
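The prototype-based pseudo-labeling mentioned in the entry above can be illustrated with a rough sketch like the following; the cosine-similarity choice, names, and toy data are assumptions, and the paper's distribution-matching and class-balancing components are not reproduced here.

```python
import torch
import torch.nn.functional as F

def prototype_pseudo_labels(features, labels, num_classes):
    """Relabel each sample with the class of its nearest prototype,
    where prototypes are label-conditional feature means."""
    feats = F.normalize(features, dim=1)
    protos = torch.stack([feats[labels == c].mean(dim=0)
                          for c in range(num_classes)])
    protos = F.normalize(protos, dim=1)
    sims = feats @ protos.T            # cosine similarity to each prototype
    return sims.argmax(dim=1)

# Toy usage: two separated clusters, one flipped label.
features = torch.tensor([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = torch.tensor([0, 0, 1, 0])    # last label is noisy
print(prototype_pseudo_labels(features, labels, num_classes=2))  # -> 0, 0, 1, 1
```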
- Improving group robustness under noisy labels using predictive uncertainty [0.9449650062296823]
We use the predictive uncertainty of a model to improve the worst-group accuracy under noisy labels.
We propose a novel ENtropy based Debiasing (END) framework that prevents models from learning the spurious cues while being robust to the noisy labels.
arXiv Detail & Related papers (2022-12-14T04:40:50Z)
- Neighborhood Collective Estimation for Noisy Label Identification and Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, which easily gives rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z)
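A simplified sketch of the neighborhood idea described above follows: here reliability is scored by label agreement among feature-space nearest neighbors, whereas the paper contrasts a candidate against its neighbors' predictions; the same neighborhood-consistency intuition also underlies the S3 entry below. The function name and k are illustrative.

```python
import torch
import torch.nn.functional as F

def neighborhood_reliability(features, labels, k=2):
    """Score each sample by the fraction of its k nearest feature-space
    neighbors that carry the same label; low agreement suggests noise."""
    feats = F.normalize(features, dim=1)
    sims = feats @ feats.T
    sims.fill_diagonal_(-float("inf"))      # exclude self-similarity
    idx = sims.topk(k, dim=1).indices       # k nearest neighbors
    return (labels[idx] == labels.unsqueeze(1)).float().mean(dim=1)

# Toy usage: sample 2 sits in the label-0 cluster but is labeled 1.
features = torch.tensor([[1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
                         [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]])
labels = torch.tensor([0, 0, 1, 1, 1, 1])
print(neighborhood_reliability(features, labels))  # sample 2 scores 0.0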
- SELC: Self-Ensemble Label Correction Improves Learning with Noisy Labels [4.876988315151037]
Deep neural networks are prone to overfitting noisy labels, resulting in poor generalization performance.
We present a self-ensemble label correction (SELC) method to progressively correct noisy labels and refine the model.
SELC obtains more promising and stable results in the presence of class-conditional, instance-dependent, and real-world label noise.
arXiv Detail & Related papers (2022-05-02T18:42:47Z)
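The progressive correction described in the SELC entry above is commonly realized as an exponential moving average of the model's own softmax outputs used as soft targets; the sketch below assumes that formulation, with an illustrative momentum value, and the paper's exact schedule may differ.

```python
import torch
import torch.nn.functional as F

class SelfEnsembleTargets:
    """Maintain an exponential moving average of softmax outputs and use
    it as a progressively corrected soft target for each sample."""
    def __init__(self, one_hot_labels, momentum=0.9):
        self.targets = one_hot_labels.clone().float()
        self.m = momentum

    def update(self, indices, logits):
        probs = F.softmax(logits.detach(), dim=1)
        self.targets[indices] = (self.m * self.targets[indices]
                                 + (1 - self.m) * probs)
        return self.targets[indices]

# Toy usage: a noisy one-hot label drifts toward the model's confident view.
bank = SelfEnsembleTargets(F.one_hot(torch.tensor([1]), num_classes=3))
logits = torch.tensor([[5.0, 0.1, 0.0]])   # model is confident in class 0
for _ in range(20):
    soft = bank.update(torch.tensor([0]), logits)
print(soft)   # mass has shifted from class 1 toward class 0
```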
- Unsupervised Domain Adaptive Salient Object Detection Through Uncertainty-Aware Pseudo-Label Learning [104.00026716576546]
We propose to learn saliency from synthetic but clean labels, which naturally have higher pixel-labeling quality without the effort of manual annotation.
We show that our proposed method outperforms the existing state-of-the-art deep unsupervised SOD methods on several benchmark datasets.
arXiv Detail & Related papers (2022-02-26T16:03:55Z)
- S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
At the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its feature-space neighborhood.
Our method significantly surpasses previous methods on both CIFAR-10 and CIFAR-100 with artificial noise and on real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that selects which unlabeled examples to train on using a dynamically adjusted threshold.
Our proposed approach, Dash, enjoys adaptivity in terms of unlabeled data selection.
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
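A minimal sketch of dynamic-threshold selection in the spirit of the Dash entry above, assuming a per-epoch decaying loss threshold; the schedule and its constants are illustrative, not the paper's actual adaptive rule.

```python
import torch

def dynamic_threshold_select(losses, epoch, init_threshold=2.0, decay=0.9):
    """Keep samples whose loss falls below a threshold that shrinks each
    epoch, so selection becomes stricter as training proceeds."""
    return losses < init_threshold * (decay ** epoch)

losses = torch.tensor([0.2, 1.5, 0.8, 3.0])
for epoch in (0, 5, 10):
    print(epoch, dynamic_threshold_select(losses, epoch))
```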
- An Ensemble Noise-Robust K-fold Cross-Validation Selection Method for Noisy Labels [0.9699640804685629]
Large-scale datasets tend to contain mislabeled samples that can be memorized by deep neural networks (DNNs).
We present Ensemble Noise-robust K-fold Cross-Validation Selection (E-NKCVS) to effectively select clean samples from noisy data.
We evaluate our approach on various image and text classification tasks where the labels have been manually corrupted with different noise ratios.
arXiv Detail & Related papers (2021-07-06T02:14:52Z)
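A single-run sketch of cross-validation-based clean-sample selection in the spirit of the entry above; E-NKCVS additionally ensembles over multiple runs, which this minimal scikit-learn version (with illustrative model and fold choices) does not attempt.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

def cross_validation_select(X, y, n_splits=5):
    """Flag a sample as clean when a model trained on the other folds
    reproduces its given label on the held-out fold."""
    clean = np.zeros(len(y), dtype=bool)
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train_idx, test_idx in kfold.split(X):
        clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
        clean[test_idx] = clf.predict(X[test_idx]) == y[test_idx]
    return clean

# Toy usage: two Gaussian blobs with five flipped labels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
y[:5] = 1                                   # inject label noise
print(cross_validation_select(X, y)[:10])   # flipped samples mostly False
```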
- In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not require domain-specific data augmentations, but it performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
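A rough sketch of the uncertainty-aware pseudo-label selection described in the UPS entry above, assuming uncertainty is approximated by the spread of predictions across stochastic forward passes (an MC-dropout-style proxy); the thresholds and names are illustrative and the paper's exact criterion may differ.

```python
import torch
import torch.nn.functional as F

def ups_style_select(mc_logits, conf_thresh=0.9, unc_thresh=0.05):
    """Select pseudo-labels that are both confident and low-uncertainty.

    mc_logits: (T, N, C) logits from T stochastic forward passes.
    """
    probs = F.softmax(mc_logits, dim=2)          # (T, N, C)
    mean_p = probs.mean(dim=0)                   # predictive mean
    conf, pseudo = mean_p.max(dim=1)
    # Std of the chosen class probability across passes as uncertainty.
    unc = probs.std(dim=0).gather(1, pseudo.unsqueeze(1)).squeeze(1)
    return (conf > conf_thresh) & (unc < unc_thresh), pseudo

# Toy usage: 5 passes over 2 samples (one confident, one uncertain).
mc_logits = torch.randn(5, 2, 3) * 0.1 + torch.tensor([[4.0, 0.0, 0.0],
                                                       [1.0, 1.0, 1.0]])
mask, pseudo = ups_style_select(mc_logits)
print(mask, pseudo)   # only the confident sample is selected
```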