Is your noise correction noisy? PLS: Robustness to label noise with two stage detection
- URL: http://arxiv.org/abs/2210.04578v1
- Date: Mon, 10 Oct 2022 11:32:28 GMT
- Title: Is your noise correction noisy? PLS: Robustness to label noise with two stage detection
- Authors: Paul Albert, Eric Arazo, Tarun Krishna, Noel E. O'Connor, Kevin McGuinness
- Abstract summary: This paper proposes to improve the correction accuracy of noisy samples once they have been detected.
In many state-of-the-art contributions, a two-phase approach is adopted where the noisy samples are detected before guessing a corrected pseudo-label.
We propose the pseudo-loss, a simple metric that we find to be strongly correlated with pseudo-label correctness on noisy samples.
- Score: 16.65296285599679
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Designing robust algorithms capable of training accurate neural networks on uncurated datasets from the web has been the subject of much research, as it reduces the need for time-consuming human labor. The focus of many previous research contributions has been on the detection of different types of label noise; this paper instead proposes to improve the correction accuracy of noisy samples once they have been detected. In many state-of-the-art contributions, a two-phase approach is adopted where the noisy samples are detected before a corrected pseudo-label is guessed in a semi-supervised fashion. The guessed pseudo-labels are then used in the supervised objective without ensuring that the label guess is likely to be correct, which can lead to confirmation bias and reduced noise robustness. Here we propose the pseudo-loss, a simple metric that we find to be strongly correlated with pseudo-label correctness on noisy samples. Using the pseudo-loss, we dynamically down-weight under-confident pseudo-labels throughout training to avoid confirmation bias and improve the network accuracy. We additionally propose a confidence-guided contrastive objective that learns robust representations by interpolating between a class-bound (supervised) objective for confidently corrected samples and an unsupervised objective for under-confident label corrections. Experiments demonstrate the state-of-the-art performance of our Pseudo-Loss Selection (PLS) algorithm on a variety of benchmark datasets, including curated data synthetically corrupted with in-distribution and out-of-distribution noise, and two real-world web noise datasets. Our experiments are fully reproducible [github coming soon].
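The down-weighting step lends itself to a short sketch. Below is a minimal PyTorch-style illustration, assuming the pseudo-loss is the cross-entropy between the network prediction and the guessed pseudo-label, and using exp(-loss) as an arbitrary confidence mapping; the function names and the weighting rule are our assumptions, not the authors' released PLS code.

```python
import torch
import torch.nn.functional as F

def pseudo_loss_weights(logits, pseudo_labels):
    # Pseudo-loss read as cross-entropy against the guessed pseudo-label:
    # a high value flags a likely-incorrect guess. exp(-loss) maps it to
    # a weight in (0, 1]; an illustrative choice, not the paper's rule.
    per_sample = F.cross_entropy(logits, pseudo_labels, reduction="none")
    return torch.exp(-per_sample)

def weighted_correction_loss(logits, pseudo_labels):
    # Under-confident pseudo-labels contribute little to the objective,
    # which is the confirmation-bias mitigation described above.
    weights = pseudo_loss_weights(logits.detach(), pseudo_labels)
    loss = F.cross_entropy(logits, pseudo_labels, reduction="none")
    return (weights * loss).mean()

# Toy usage: 4 detected-noisy samples, 3 classes.
logits = torch.randn(4, 3, requires_grad=True)
pseudo = torch.tensor([0, 2, 1, 0])
weighted_correction_loss(logits, pseudo).backward()
```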
Related papers
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specified probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
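A minimal sketch of prototype-based pseudo-labelling in the spirit of this summary, with prototypes taken as per-class feature means and cosine similarity as the matching rule; these choices and all names are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def class_prototypes(features, labels, num_classes):
    # Per-class mean feature vector; assumes every class has at least
    # one (presumed-clean) sample so the mean is defined.
    protos = torch.stack([features[labels == c].mean(dim=0)
                          for c in range(num_classes)])
    return F.normalize(protos, dim=1)

def prototype_pseudo_labels(features, protos):
    # Each sample takes the class of its most similar prototype.
    return (F.normalize(features, dim=1) @ protos.T).argmax(dim=1)

# Toy usage: 32 samples, 16-d features, 4 classes.
feats = torch.randn(32, 16)
labels = torch.arange(32) % 4
pseudo = prototype_pseudo_labels(feats, class_prototypes(feats, labels, 4))
```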
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- Label Noise: Correcting the Forward-Correction [0.0]
Training neural network classifiers on datasets with label noise poses a risk of overfitting them to the noisy labels.
We propose to tackle this overfitting by imposing a lower bound on the training loss.
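One established way to impose a lower bound on a training loss is the "flooding" formulation |L - b| + b; the sketch below shows that variant as a stand-in, without claiming it is this paper's exact rule.

```python
import torch
import torch.nn.functional as F

def floored_loss(logits, labels, floor=0.3):
    # |L - b| + b descends toward the floor from above and ascends when
    # the loss dips below it, so the loss on (possibly noisy) labels
    # cannot be driven to zero. floor=0.3 is an arbitrary illustration.
    loss = F.cross_entropy(logits, labels)
    return (loss - floor).abs() + floor
```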
arXiv Detail & Related papers (2023-07-24T19:41:19Z)
- Neighborhood Collective Estimation for Noisy Label Identification and Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
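A simplified sketch of re-estimating reliability from feature-space nearest neighbors; the k-NN agreement score below is an illustrative stand-in for the paper's collective estimator.

```python
import torch
import torch.nn.functional as F

def neighborhood_reliability(features, labels, k=10):
    # Fraction of each sample's k nearest neighbors (cosine similarity)
    # that share its annotated label; a low score flags a likely noisy label.
    feats = F.normalize(features, dim=1)
    sims = feats @ feats.T
    sims.fill_diagonal_(-1.0)          # never count a sample as its own neighbor
    idx = sims.topk(k, dim=1).indices  # (N, k) neighbor indices
    agree = (labels[idx] == labels.unsqueeze(1)).float()
    return agree.mean(dim=1)
```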
arXiv Detail & Related papers (2022-08-05T14:47:22Z)
- Uncertainty-Aware Learning Against Label Noise on Imbalanced Datasets [23.4536532321199]
We propose an Uncertainty-aware Label Correction framework to handle label noise on imbalanced datasets.
arXiv Detail & Related papers (2022-07-12T11:35:55Z)
- Two Wrongs Don't Make a Right: Combating Confirmation Bias in Learning with Label Noise [6.303101074386922]
Robust Label Refurbishment (Robust LR) is a new hybrid method that integrates pseudo-labeling and confidence estimation techniques to refurbish noisy labels.
We show that our method successfully alleviates the damage of both label noise and confirmation bias.
For example, Robust LR achieves up to 4.5% absolute top-1 accuracy improvement over the previous best on the real-world noisy dataset WebVision.
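Label refurbishment is commonly written as a confidence-weighted blend of the given label and the model's prediction; below is a generic sketch of that pattern, not Robust LR's specific confidence estimator.

```python
import torch
import torch.nn.functional as F

def refurbish_labels(logits, noisy_labels, confidence, num_classes):
    # confidence in [0, 1] per sample: 1 trusts the model prediction
    # fully, 0 keeps the original label. Estimating confidence is the
    # method-specific part and is left abstract here.
    one_hot = F.one_hot(noisy_labels, num_classes).float()
    pred = torch.softmax(logits, dim=1)
    c = confidence.unsqueeze(1)
    return c * pred + (1.0 - c) * one_hot  # refurbished soft labels
```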
arXiv Detail & Related papers (2021-12-06T12:10:17Z)
- S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space.
Our method significantly surpasses previous methods on both CIFAR10/CIFAR100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.
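A sketch of that selection signal, comparing a sample's annotated label against the label distribution of its feature-space neighborhood; the details are assumptions, not the S3 implementation.

```python
import torch
import torch.nn.functional as F

def neighborhood_label_consistency(features, labels, num_classes, k=10):
    # Probability the k-nearest-neighbor label distribution assigns to
    # the sample's own annotated label; low values suggest a noisy label.
    feats = F.normalize(features, dim=1)
    sims = feats @ feats.T
    sims.fill_diagonal_(-1.0)
    idx = sims.topk(k, dim=1).indices
    neigh = F.one_hot(labels[idx], num_classes).float()  # (N, k, C)
    dist = neigh.mean(dim=1)                             # (N, C)
    return dist.gather(1, labels.unsqueeze(1)).squeeze(1)

# Samples scoring below a chosen threshold would be treated as noisy.
```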
arXiv Detail & Related papers (2021-11-22T15:49:20Z)
- An Ensemble Noise-Robust K-fold Cross-Validation Selection Method for Noisy Labels [0.9699640804685629]
Large-scale datasets tend to contain mislabeled samples that can be memorized by deep neural networks (DNNs).
We present Ensemble Noise-robust K-fold Cross-Validation Selection (E-NKCVS) to effectively select clean samples from noisy data.
We evaluate our approach on various image and text classification tasks where the labels have been manually corrupted with different noise ratios.
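A toy sketch of ensembled K-fold selection with scikit-learn, using LogisticRegression as a stand-in task model; the repeat count and voting threshold are illustrative, not E-NKCVS's settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

def cv_clean_selection(X, y, n_splits=5, n_repeats=3, threshold=0.6):
    # Each sample is predicted only by models that never saw it; samples
    # whose held-out predictions usually agree with their label are kept.
    votes = np.zeros(len(y))
    for r in range(n_repeats):
        kf = KFold(n_splits=n_splits, shuffle=True, random_state=r)
        for train_idx, test_idx in kf.split(X):
            clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
            votes[test_idx] += clf.predict(X[test_idx]) == y[test_idx]
    return votes / n_repeats >= threshold  # mask of presumed-clean samples

# Toy usage: linearly separable data with 10% injected label noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] > 0).astype(int)
y[:20] = 1 - y[:20]
clean_mask = cv_clean_selection(X, y)
```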
arXiv Detail & Related papers (2021-07-06T02:14:52Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
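A common way to "explicitly relate noisy labels to their instances" is an instance-dependent transition matrix used for forward correction; the sketch below shows that generic pattern, not necessarily this paper's parameterization.

```python
import torch

def forward_corrected_nll(clean_probs, T_x, noisy_labels):
    # clean_probs: (N, C) estimated clean-label posterior.
    # T_x: (N, C, C) per-instance transition matrix with
    #      T_x[n, i, j] = p(noisy = j | clean = i, instance n).
    # How T_x is parameterized is the method-specific part.
    noisy_probs = torch.bmm(clean_probs.unsqueeze(1), T_x).squeeze(1)
    nll = -torch.log(noisy_probs.gather(1, noisy_labels.unsqueeze(1)))
    return nll.mean()
```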
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)
- Improving Generalization of Deep Fault Detection Models in the Presence of Mislabeled Data [1.3535770763481902]
We propose a novel two-step framework for robust training with label noise.
In the first step, we identify outliers (including the mislabeled samples) based on the update in the hypothesis space.
In the second step, we propose different approaches to modifying the training data based on the identified outliers and a data augmentation technique.
arXiv Detail & Related papers (2020-09-30T12:33:25Z)
- Improving Face Recognition by Clustering Unlabeled Faces in the Wild [77.48677160252198]
We propose a novel identity separation method based on extreme value theory.
It greatly reduces the problems caused by overlapping-identity label noise.
Experiments on both controlled and real settings demonstrate our method's consistent improvements.
arXiv Detail & Related papers (2020-07-14T12:26:50Z)