Training Classifiers that are Universally Robust to All Label Noise Levels
- URL: http://arxiv.org/abs/2105.13892v1
- Date: Thu, 27 May 2021 13:49:31 GMT
- Title: Training Classifiers that are Universally Robust to All Label Noise Levels
- Authors: Jingyi Xu, Tony Q. S. Quek, Kai Fong Ernest Chong
- Abstract summary: Deep neural networks are prone to overfitting in the presence of label noise.
We propose a distillation-based framework that incorporates a new subcategory of Positive-Unlabeled learning.
Our framework generally outperforms existing methods at medium to high noise levels.
- Score: 91.13870793906968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For classification tasks, deep neural networks are prone to overfitting in
the presence of label noise. Although existing methods are able to alleviate
this problem at low noise levels, they encounter significant performance
reduction at high noise levels, or even at medium noise levels when the label
noise is asymmetric. To train classifiers that are universally robust to all
noise levels, and that are not sensitive to any variation in the noise model,
we propose a distillation-based framework that incorporates a new subcategory
of Positive-Unlabeled learning. In particular, we shall assume that a small
subset of any given noisy dataset is known to have correct labels, which we
treat as "positive", while the remaining noisy subset is treated as
"unlabeled". Our framework consists of the following two components: (1) We
shall generate, via iterative updates, an augmented clean subset with
additional reliable "positive" samples filtered from "unlabeled" samples; (2)
We shall train a teacher model on this larger augmented clean set. With the
guidance of the teacher model, we then train a student model on the whole
dataset. Experiments were conducted on the CIFAR-10 dataset with synthetic
label noise at multiple noise levels for both symmetric and asymmetric noise.
The results show that our framework generally outperforms existing methods at medium to high
noise levels. We also evaluated our framework on Clothing1M, a real-world noisy
dataset, and we achieved a 2.94% improvement in accuracy over existing
state-of-the-art methods.
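The abstract describes the two components only at a high level; the sketch below is one minimal Python reading of them under stated assumptions. The classifier wrapper `fit_predict_proba`, the confidence threshold `tau`, the number of filtering rounds `n_rounds`, and the distillation mixing weight `alpha` are illustrative choices, not details taken from the paper.

```python
import numpy as np

def augment_clean_set(clean_X, clean_y, noisy_X, noisy_y,
                      fit_predict_proba, n_rounds=3, tau=0.9):
    """Component (1): iteratively promote 'unlabeled' (noisy) samples whose
    given label is confirmed with high confidence into the clean 'positive' set.

    fit_predict_proba(train_X, train_y, query_X) -> (n_query, n_classes)
    is any probabilistic classifier wrapped as a function (an assumption here).
    """
    aug_X, aug_y = clean_X.copy(), clean_y.copy()
    remaining = np.ones(len(noisy_X), dtype=bool)
    for _ in range(n_rounds):
        # train on the current augmented clean set, score the remaining pool
        probs = fit_predict_proba(aug_X, aug_y, noisy_X[remaining])
        pred, conf = probs.argmax(axis=1), probs.max(axis=1)
        # keep samples whose given label agrees with a confident prediction
        agree = (pred == noisy_y[remaining]) & (conf >= tau)
        idx = np.flatnonzero(remaining)[agree]
        if idx.size == 0:
            break
        aug_X = np.concatenate([aug_X, noisy_X[idx]])
        aug_y = np.concatenate([aug_y, noisy_y[idx]])
        remaining[idx] = False
    return aug_X, aug_y

def distill_targets(teacher_probs, noisy_y, n_classes, alpha=0.5):
    """Component (2): soft targets for the student, blending the teacher's
    predictions on the whole dataset with the (possibly noisy) one-hot labels."""
    one_hot = np.eye(n_classes)[noisy_y]
    return alpha * teacher_probs + (1.0 - alpha) * one_hot
```

In this reading, component (1) repeatedly promotes noisy-set samples whose given labels are confirmed with high confidence into the clean "positive" set, and component (2) trains the student on soft targets that blend the teacher's predictions with the noisy labels; the paper may implement both steps differently.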
Related papers
- NoisyAG-News: A Benchmark for Addressing Instance-Dependent Noise in Text Classification [7.464154519547575]
Existing research on learning with noisy labels predominantly focuses on synthetic noise patterns.
We constructed a benchmark dataset to better understand label noise in real-world text classification settings.
Our findings reveal that while pre-trained models are resilient to synthetic noise, they struggle against instance-dependent noise.
arXiv Detail & Related papers (2024-07-09T06:18:40Z)
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specified probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
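As a rough illustration of the prototype-based pseudo-labeling idea, the sketch below assumes cosine-similarity class prototypes computed from the noisy labels and a fixed per-class selection budget; the function name, `per_class`, and the selection rule are assumptions, and the paper's distribution-matching probability measure is not reproduced here.

```python
import numpy as np

def prototype_pseudo_label(features, noisy_labels, n_classes, per_class=100):
    """Pseudo-label by nearest class prototype, then keep a class-balanced
    subset of the most prototype-similar samples per class."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    # prototype = mean normalized feature over samples sharing a noisy label
    # (assumes every class index occurs at least once among noisy_labels)
    protos = np.stack([f[noisy_labels == c].mean(axis=0) for c in range(n_classes)])
    protos /= np.linalg.norm(protos, axis=1, keepdims=True)
    sims = f @ protos.T                      # (n_samples, n_classes) cosine scores
    pseudo = sims.argmax(axis=1)
    keep = []
    for c in range(n_classes):
        cand = np.flatnonzero(pseudo == c)
        keep.append(cand[np.argsort(-sims[cand, c])[:per_class]])
    idx = np.concatenate(keep)
    return idx, pseudo[idx]                  # balanced indices and their pseudo-labels
```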
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- Latent Class-Conditional Noise Model [54.56899309997246]
We introduce a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework.
We then deduce a dynamic label regression method for LCCN, whose Gibbs sampler allows us to efficiently infer the latent true labels.
Our approach safeguards the stable update of the noise transition, which avoids the arbitrary tuning from a mini-batch of samples used in previous methods.
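A minimal sketch of one Gibbs sweep for a class-conditional noise model in this spirit is shown below; it assumes classifier probabilities `model_probs` are available and, for simplicity, updates the transition matrix with its Dirichlet posterior mean rather than drawing it, so it is an illustration rather than the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_sweep(model_probs, noisy_labels, T, prior=1.0):
    """One Gibbs sweep for a class-conditional label-noise model.

    model_probs : (n, C) classifier estimate of p(true class | x)
    noisy_labels: (n,)   observed (possibly corrupted) integer labels
    T           : (C, C) transition matrix, T[c, k] = p(noisy=k | true=c)
    """
    n, C = model_probs.shape
    # posterior over the latent true label: p(z=c | x, noisy) ∝ p(c|x) * T[c, noisy]
    post = model_probs * T[:, noisy_labels].T
    post /= post.sum(axis=1, keepdims=True)
    z = np.array([rng.choice(C, p=post[i]) for i in range(n)])
    # update T from (z, noisy) counts; the posterior mean of a Dirichlet with a
    # symmetric prior is used here instead of a full posterior draw
    counts = np.full((C, C), prior)
    np.add.at(counts, (z, noisy_labels), 1.0)
    return z, counts / counts.sum(axis=1, keepdims=True)
```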
arXiv Detail & Related papers (2023-02-19T15:24:37Z)
- Neighborhood Collective Estimation for Noisy Label Identification and Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
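A minimal sketch of neighborhood-based reliability re-estimation under assumed inputs (precomputed feature embeddings, per-sample predicted distributions, and a cosine-similarity neighborhood of size `k`); the agreement score used here is an illustrative choice rather than the paper's exact estimator.

```python
import numpy as np

def neighborhood_reliability(features, pred_probs, given_labels, k=10):
    """Score each sample by how strongly the averaged predictions of its
    k nearest feature-space neighbors support its given label."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = f @ f.T                            # cosine similarity matrix
    np.fill_diagonal(sims, -np.inf)           # a sample is not its own neighbor
    nn = np.argsort(-sims, axis=1)[:, :k]     # indices of the k nearest neighbors
    consensus = pred_probs[nn].mean(axis=1)   # (n, C) neighborhood consensus
    # high value => the given label agrees with its neighborhood
    return consensus[np.arange(len(given_labels)), given_labels]
```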
arXiv Detail & Related papers (2022-08-05T14:47:22Z)
- Identifying Hard Noise in Long-Tailed Sample Distribution [76.16113794808001]
We introduce Noisy Long-Tailed Classification (NLT).
Most de-noising methods fail to identify the hard noises.
We design an iterative noisy learning framework called Hard-to-Easy (H2E).
arXiv Detail & Related papers (2022-07-27T09:03:03Z)
- Label noise detection under the Noise at Random model with ensemble filters [5.994719700262245]
This work investigates the performance of ensemble noise detection under two different noise models.
We also examine the effect of class distribution on noise detection performance, since it changes the total noise level observed in a dataset.
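A minimal sketch of a classic majority-vote ensemble noise filter of the kind studied in this line of work, using scikit-learn cross-validated predictions; the ensemble members and the voting threshold are illustrative assumptions, not this paper's configuration.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

def ensemble_noise_filter(X, y, consensus=False):
    """Flag samples whose given label is contradicted by cross-validated
    ensemble predictions (majority vote, or unanimously if consensus=True)."""
    members = [LogisticRegression(max_iter=1000),
               DecisionTreeClassifier(random_state=0),
               GaussianNB()]
    disagree = np.stack([cross_val_predict(m, X, y, cv=5) != y for m in members])
    votes = disagree.sum(axis=0)
    needed = len(members) if consensus else len(members) // 2 + 1
    return votes >= needed          # True => likely mislabeled
```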
arXiv Detail & Related papers (2021-12-02T21:49:41Z)
- Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations [54.400167806154535]
Existing research on learning with noisy labels mainly focuses on synthetic label noise.
This work presents two new benchmark datasets (CIFAR-10N, CIFAR-100N).
We show that real-world noisy labels follow an instance-dependent pattern rather than the classically adopted class-dependent ones.
arXiv Detail & Related papers (2021-10-22T22:42:11Z)
- LongReMix: Robust Learning with High Confidence Samples in a Noisy Label Environment [33.376639002442914]
We propose the new 2-stage noisy-label training algorithm LongReMix.
We test LongReMix on the noisy-label benchmarks CIFAR-10, CIFAR-100, WebVision, Clothing1M, and Food101-N.
Our approach achieves state-of-the-art performance in most datasets.
arXiv Detail & Related papers (2021-03-06T18:48:40Z)
- GANs for learning from very high class conditional noisy labels [1.6516902135723865]
We use Generative Adversarial Networks (GANs) to design a class conditional label noise (CCN) robust scheme for binary classification.
It first generates a set of correctly labelled data points from noisy labelled data and 0.1% or 1% clean labels.
arXiv Detail & Related papers (2020-10-19T15:01:11Z)