Learning with Feature-Dependent Label Noise: A Progressive Approach
- URL: http://arxiv.org/abs/2103.07756v2
- Date: Tue, 16 Mar 2021 07:28:12 GMT
- Title: Learning with Feature-Dependent Label Noise: A Progressive Approach
- Authors: Yikai Zhang, Songzhu Zheng, Pengxiang Wu, Mayank Goswami, Chao Chen
- Abstract summary: We propose a new family of feature-dependent label noise, which is much more general than commonly used i.i.d. label noise.
We provide theoretical guarantees showing that for a wide variety of (unknown) noise patterns, a classifier trained with this strategy converges to be consistent with the Bayes classifier.
- Score: 19.425199841491246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Label noise is frequently observed in real-world large-scale datasets. The
noise is introduced due to a variety of reasons; it is heterogeneous and
feature-dependent. Most existing approaches to handling noisy labels fall into
two categories: they either assume an ideal feature-independent noise, or
remain heuristic without theoretical guarantees. In this paper, we propose to
target a new family of feature-dependent label noise, which is much more
general than commonly used i.i.d. label noise and encompasses a broad spectrum
of noise patterns. Focusing on this general noise family, we propose a
progressive label correction algorithm that iteratively corrects labels and
refines the model. We provide theoretical guarantees showing that for a wide
variety of (unknown) noise patterns, a classifier trained with this strategy
converges to be consistent with the Bayes classifier. In experiments, our
method outperforms SOTA baselines and is robust to various noise types and
levels.
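The progressive correction strategy described in the abstract can be sketched in a few lines. This is a minimal illustration under assumed choices (a logistic-regression model and a linearly relaxing confidence threshold), not the authors' implementation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def progressive_label_correction(X, y_noisy, rounds=5, theta0=0.95, decay=0.05):
    """Iteratively retrain a classifier and flip labels it contradicts
    with confidence above a gradually relaxed threshold.
    Sketch only; the paper's model and threshold schedule differ."""
    y = y_noisy.copy()
    clf = LogisticRegression(max_iter=1000)
    for t in range(rounds):
        clf.fit(X, y)                       # refine the model on current labels
        proba = clf.predict_proba(X)
        pred = proba.argmax(axis=1)         # model's predicted class
        conf = proba.max(axis=1)            # confidence in that prediction
        threshold = max(theta0 - decay * t, 0.5)  # start strict, relax over rounds
        flip = (pred != y) & (conf > threshold)
        y[flip] = pred[flip]                # correct confidently contradicted labels
    return clf, y
```

Starting with a strict threshold means only the most confidently wrong labels are corrected early on, which is the intuition behind the progressive schedule.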
Related papers
- Handling Realistic Label Noise in BERT Text Classification [1.0515439489916731]
Real label noise is not random; rather, it is often correlated with input features or other annotator-specific factors.
We show that the presence of these types of noise significantly degrades BERT classification performance.
arXiv Detail & Related papers (2023-05-23T18:30:31Z)
- Rethinking the Value of Labels for Instance-Dependent Label Noise Learning [43.481591776038144]
Noisy labels in real-world applications often depend on both the true label and the features.
In this work, we tackle instance-dependent label noise with a novel deep generative model that avoids explicitly modeling the noise transition matrix.
Our algorithm leverages causal representation learning and simultaneously identifies the high-level content and style latent factors from the data.
arXiv Detail & Related papers (2023-05-10T15:29:07Z) - Robustness to Label Noise Depends on the Shape of the Noise Distribution
in Feature Space [6.748225062396441]
We show that both the scale and the shape of the noise distribution influence the posterior likelihood.
We show that when the noise distribution targets decision boundaries, classification robustness can drop off even at a small scale of noise.
arXiv Detail & Related papers (2022-06-02T15:41:59Z) - Learning with Noisy Labels Revisited: A Study Using Real-World Human
Annotations [54.400167806154535]
Existing research on learning with noisy labels mainly focuses on synthetic label noise.
This work presents two new benchmark datasets (CIFAR-10N, CIFAR-100N).
We show that real-world noisy labels follow an instance-dependent pattern rather than the classically adopted class-dependent ones.
arXiv Detail & Related papers (2021-10-22T22:42:11Z)
- Rethinking Noisy Label Models: Labeler-Dependent Noise with Adversarial Awareness [2.1930130356902207]
We propose a principled model of label noise that generalizes instance-dependent noise to multiple labelers.
Under our labeler-dependent model, label noise manifests itself under two modalities: natural error of good-faith labelers, and adversarial labels provided by malicious actors.
We present two adversarial attack vectors that more accurately reflect the label noise that may be encountered in real-world settings.
arXiv Detail & Related papers (2021-05-28T19:58:18Z)
- Training Classifiers that are Universally Robust to All Label Noise Levels [91.13870793906968]
Deep neural networks are prone to overfitting in the presence of label noise.
We propose a distillation-based framework that incorporates a new subcategory of Positive-Unlabeled learning.
Our framework generally outperforms at medium to high noise levels.
arXiv Detail & Related papers (2021-05-27T13:49:31Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)
- Extended T: Learning with Mixed Closed-set and Open-set Noisy Labels [86.5943044285146]
The label noise transition matrix $T$ reflects the probabilities that true labels flip into noisy ones.
In this paper, we focus on learning under the mixed closed-set and open-set label noise.
Our method better models the mixed label noise, as evidenced by its more robust performance compared with prior state-of-the-art label-noise learning methods.
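The noise transition matrix $T$ described above can be made concrete with a small simulation. The matrix values below are arbitrary illustrative assumptions, not values from the paper:

```python
import numpy as np

# Class-dependent noise: T[i, j] = P(noisy label = j | true label = i).
# Each row is a probability distribution, so rows must sum to 1.
T = np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
])

def corrupt(y_true, T, rng):
    """Sample noisy labels by flipping each true label according to its row of T."""
    return np.array([rng.choice(len(T), p=T[c]) for c in y_true])

rng = np.random.default_rng(0)
y_true = rng.integers(0, 3, size=10_000)
y_noisy = corrupt(y_true, T, rng)
```

With the diagonal at 0.8, roughly 80% of labels survive corruption intact; instance-dependent noise generalizes this by letting the flip probabilities vary with the features of each example rather than only its class.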
arXiv Detail & Related papers (2020-12-02T02:42:45Z)
- NoiseRank: Unsupervised Label Noise Reduction with Dependence Models [11.08987870095179]
We propose NoiseRank for unsupervised label noise reduction using Markov Random Fields (MRFs).
We construct a dependence model to estimate the posterior probability of an instance being incorrectly labeled given the dataset, and rank instances based on their estimated probabilities.
NoiseRank improves state-of-the-art classification on Food101-N (20% noise) and is effective on the high-noise Clothing-1M dataset (40% noise).
arXiv Detail & Related papers (2020-03-15T01:10:25Z)
- Confidence Scores Make Instance-dependent Label-noise Learning Possible [129.84497190791103]
In learning with noisy labels, for every instance, its label can randomly walk to other classes following a transition distribution which is named a noise model.
We introduce confidence-scored instance-dependent noise (CSIDN), where each instance-label pair is equipped with a confidence score.
We find with the help of confidence scores, the transition distribution of each instance can be approximately estimated.
arXiv Detail & Related papers (2020-01-11T16:15:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.