Error-Bounded Correction of Noisy Labels
- URL: http://arxiv.org/abs/2011.10077v1
- Date: Thu, 19 Nov 2020 19:23:23 GMT
- Title: Error-Bounded Correction of Noisy Labels
- Authors: Songzhu Zheng, Pengxiang Wu, Aman Goswami, Mayank Goswami, Dimitris
Metaxas, Chao Chen
- Abstract summary: We show that the prediction of a noisy classifier can indeed be a good indicator of whether the label of a training example is clean.
Based on this theoretical result, we propose a novel algorithm that corrects labels based on the noisy classifier's prediction.
We incorporate our label correction algorithm into the training of deep neural networks and train models that achieve superior test performance on multiple public datasets.
- Score: 17.510654621245656
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: When collecting large-scale annotated data, it is inevitable that
label noise, i.e., incorrect class labels, will be introduced. To be robust
against label noise, many successful methods rely on noisy classifiers (i.e.,
models trained on the noisy training data) to determine whether a label is
trustworthy. However, it remains unknown why this heuristic works well in
practice. In this paper, we provide the first theoretical explanation for
these methods. We prove that the prediction of a noisy classifier can indeed
be a good indicator of whether the label of a training example is clean. Based
on this theoretical result, we propose a novel algorithm that corrects labels
based on the noisy classifier's prediction. The corrected labels are
consistent with the true Bayes optimal classifier with high probability. We
incorporate our label correction algorithm into the training of deep neural
networks and train models that achieve superior test performance on multiple
public datasets.
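The correction step lends itself to a short illustration. The sketch below is a minimal, simplified version of thresholded correction and not the paper's exact algorithm; the threshold `delta` and the softmax-matrix interface are assumptions for illustration.

```python
import numpy as np

def correct_labels(probs, labels, delta=0.3):
    """Correct suspect labels using the noisy classifier's own predictions.

    probs:  (n, k) softmax outputs of a classifier trained on noisy data
    labels: (n,) given, possibly noisy, integer labels
    delta:  illustrative confidence-ratio threshold (an assumption,
            not a value prescribed by the paper)
    """
    idx = np.arange(len(labels))
    preds = probs.argmax(axis=1)
    # Ratio of the probability assigned to the given label versus the
    # top prediction; a small ratio flags the given label as suspect.
    ratio = probs[idx, labels] / probs[idx, preds]
    return np.where(ratio < delta, preds, labels)
```

In training, the corrected labels would replace the given ones for subsequent epochs, and the procedure can be repeated as the classifier improves.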
Related papers
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698] (2024-04-10)
We develop a novel pseudo-labeling method using class prototypes from the perspective of distribution matching.
By manually specifying a probability measure, we can reduce the side effects of noisy and long-tailed data simultaneously.
Our method extracts a class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
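A rough sketch of prototype-based pseudo-labeling in this spirit (the paper's distribution-matching formulation is more involved; mean-feature prototypes and cosine similarity are assumptions):

```python
import numpy as np

def prototype_pseudo_labels(features, labels, n_classes):
    """Pseudo-label each sample with the class of its nearest prototype.

    Prototypes are per-class mean feature vectors; similarity is cosine.
    Assumes every class appears at least once in `labels`.
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    protos = np.stack([feats[labels == c].mean(axis=0)
                       for c in range(n_classes)])
    protos /= np.linalg.norm(protos, axis=1, keepdims=True)
    return (feats @ protos.T).argmax(axis=1)  # (n,) pseudo-labels
```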
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732] (2023-05-31)
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
- From Noisy Prediction to True Label: Noisy Prediction Calibration via Generative Model [22.722830935155223] (2022-05-02)
Noisy Prediction Calibration (NPC) is a new approach to learning with noisy labels.
NPC corrects the noisy prediction of a pre-trained classifier to the true label as a post-processing scheme.
Our method boosts the classification performance of all baseline models on both synthetic and real-world datasets.
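A much-simplified stand-in for this post-processing idea, replacing NPC's generative model with a plain Bayes inversion over an assumed prediction-transition matrix `T`:

```python
import numpy as np

def calibrate_predictions(preds, T, prior=None):
    """Map a pre-trained classifier's hard predictions toward true labels.

    T[i, j] = p(model predicts j | true label is i); T and the uniform
    prior default are assumptions standing in for NPC's learned model.
    """
    k = T.shape[0]
    prior = np.full(k, 1.0 / k) if prior is None else prior
    post = prior[None, :] * T[:, preds].T  # (n, k): p(true = i | pred)
    post /= post.sum(axis=1, keepdims=True)
    return post.argmax(axis=1)
```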
- Two Wrongs Don't Make a Right: Combating Confirmation Bias in Learning with Label Noise [6.303101074386922] (2021-12-06)
Robust Label Refurbishment (Robust LR) is a new hybrid method that integrates pseudo-labeling and confidence estimation techniques to refurbish noisy labels.
We show that our method successfully alleviates the damage of both label noise and confirmation bias.
For example, Robust LR achieves up to 4.5% absolute top-1 accuracy improvement over the previous best on the real-world noisy dataset WebVision.
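A generic label-refurbishment sketch in the same spirit, not Robust LR's exact rule; the confidence-weighted mixing below is an assumption:

```python
import numpy as np

def refurbish_labels(probs, labels, n_classes):
    """Mix the model's softmax output with the one-hot given label,
    weighted by per-sample confidence: confident predictions override
    suspect labels, uncertain ones defer to them. Returns soft targets.
    """
    onehot = np.eye(n_classes)[labels]
    conf = probs.max(axis=1, keepdims=True)  # per-sample confidence
    return conf * probs + (1.0 - conf) * onehot
```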
- Instance Correction for Learning with Open-set Noisy Labels [145.06552420999986] (2021-06-01)
We use the sample selection approach to handle open-set noisy labels.
The discarded data are treated as mislabeled and do not participate in training.
We modify the instances of discarded data to make predictions for the discarded data consistent with given labels.
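A sketch of this instance-correction step: take a few gradient steps on the discarded inputs themselves, with the model weights held fixed, until its predictions agree with the given labels. The step count and learning rate are illustrative, and the paper's exact update may differ.

```python
import torch
import torch.nn.functional as F

def correct_instances(model, x, y, steps=10, lr=0.1):
    """Perturb inputs x so the model's prediction matches the given
    labels y; only x is optimized, never the model weights."""
    x = x.clone().detach().requires_grad_(True)
    opt = torch.optim.SGD([x], lr=lr)  # optimizer over the inputs only
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return x.detach()
```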
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596] (2020-12-22)
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
- EvidentialMix: Learning with Combined Open-set and Closed-set Noisy Labels [30.268962418683955] (2020-11-11)
We study a new variant of the noisy label problem that combines open-set and closed-set noisy labels.
Our results show that our method produces superior classification results and better feature representations than previous state-of-the-art methods.
- Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels [98.13491369929798] (2020-06-14)
We propose a framework called Class2Simi, which transforms data points with noisy class labels to data pairs with noisy similarity labels.
Class2Simi is computationally efficient: the transformation is performed on the fly within mini-batches, and it only changes the loss on top of the model's predictions to a pairwise form.
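The transformation itself is a one-liner per mini-batch; a minimal sketch:

```python
import torch

def class_to_simi(labels):
    """Turn a mini-batch of (noisy) class labels into pairwise similarity
    labels: 1 if two samples share a class label, 0 otherwise. On the fly
    and parameter-free; only the loss then needs a pairwise form."""
    return (labels[:, None] == labels[None, :]).long()

# Example: class_to_simi(torch.tensor([0, 1, 0])) yields
# [[1, 0, 1],
#  [0, 1, 0],
#  [1, 0, 1]]
```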
- Multi-Class Classification from Noisy-Similarity-Labeled Data [98.13491369929798] (2020-02-16)
We propose a method for learning from only noisy-similarity-labeled data.
We use a noise transition matrix to bridge the class-posterior probability between clean and noisy data.
We build a novel learning system which can assign noise-free class labels for instances.
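A minimal sketch of how a transition matrix bridges the two posteriors (the forward-correction view); `T` is assumed known or separately estimated:

```python
import torch

def noisy_posterior(clean_probs, T):
    """p(noisy = j | x) = sum_i p(clean = i | x) * T[i, j], where
    T[i, j] = p(noisy label = j | clean label = i)."""
    return clean_probs @ T

# Training so that noisy_posterior(model_probs, T) matches the observed
# noisy labels lets the model itself estimate the clean posterior, e.g.:
# loss = torch.nn.functional.nll_loss(
#     torch.log(noisy_posterior(probs, T) + 1e-8), noisy_labels)
```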
This list is automatically generated from the titles and abstracts of the papers on this site.