Augmentation Strategies for Learning with Noisy Labels
- URL: http://arxiv.org/abs/2103.02130v2
- Date: Thu, 4 Mar 2021 02:05:43 GMT
- Title: Augmentation Strategies for Learning with Noisy Labels
- Authors: Kento Nishi, Yi Ding, Alex Rich, Tobias Höllerer
- Abstract summary: We evaluate different augmentation strategies for algorithms tackling the "learning with noisy labels" problem.
We find that using one set of augmentations for loss modeling tasks and another set for learning is the most effective.
We introduce this augmentation strategy to the state-of-the-art technique and demonstrate that we can improve performance across all evaluated noise levels.
- Score: 3.698228929379249
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imperfect labels are ubiquitous in real-world datasets. Several recent
successful methods for training deep neural networks (DNNs) robust to label
noise have used two primary techniques: filtering samples based on loss during
a warm-up phase to curate an initial set of cleanly labeled samples, and using
the output of a network as a pseudo-label for subsequent loss calculations. In
this paper, we evaluate different augmentation strategies for algorithms
tackling the "learning with noisy labels" problem. We propose and examine
multiple augmentation strategies and evaluate them using synthetic datasets
based on CIFAR-10 and CIFAR-100, as well as on the real-world dataset
Clothing1M. Due to several commonalities in these algorithms, we find that
using one set of augmentations for loss modeling tasks and another set for
learning is the most effective, improving results on the state-of-the-art and
other previous methods. Furthermore, we find that applying augmentation during
the warm-up period can negatively impact the loss convergence behavior of
correctly versus incorrectly labeled samples. We introduce this augmentation
strategy to the state-of-the-art technique and demonstrate that we can improve
performance across all evaluated noise levels. In particular, we improve
accuracy on the CIFAR-10 benchmark at 90% symmetric noise by more than 15% in
absolute accuracy and we also improve performance on the real-world dataset
Clothing1M.
(* equal contribution)
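The abstract's central recipe, using one set of augmentations for loss modeling and a different set for the actual learning update, can be sketched as a toy. This is an illustrative NumPy sketch, not the paper's implementation: the noise-based stand-in "augmentations", the `training_step` helper, and the hard trust gate are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in augmentations on flat feature vectors: the weak view adds
# small noise; the strong view adds large noise plus feature dropout
# (a toy proxy for heavy image augmentation policies).
def weak_aug(x):
    return x + 0.01 * rng.standard_normal(x.shape)

def strong_aug(x):
    mask = rng.random(x.shape) > 0.3   # drop roughly 30% of features
    return mask * (x + 0.3 * rng.standard_normal(x.shape))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def training_step(W, x, y, lr=0.1, clean_threshold=0.5):
    """One step of the dual-augmentation idea for a linear classifier:
    the weak view decides how much to trust the label (loss modeling);
    the strong view drives the parameter update (learning)."""
    # Loss modeling on the weak view: no gradient is taken here.
    p_weak = softmax(W @ weak_aug(x))
    trust = 1.0 if p_weak[y] > clean_threshold else 0.0
    # Learning on the strong view: cross-entropy gradient for logits
    # z = W @ xs is (p - onehot(y)) outer xs, scaled by the trust weight.
    xs = strong_aug(x)
    p = softmax(W @ xs)
    grad = np.outer(p - np.eye(len(p))[y], xs)
    return W - lr * trust * grad
```

In the paper's actual setting, the weakly augmented views would feed the loss-modeling machinery described in the abstract (warm-up loss filtering and pseudo-labeling), while the strongly augmented views feed the learning objective; the hard trust gate here is only a stand-in for that machinery.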
Related papers
- Mitigating Noisy Supervision Using Synthetic Samples with Soft Labels [13.314778587751588]
Noisy labels are ubiquitous in real-world datasets, especially in the large-scale ones derived from crowdsourcing and web searching.
It is challenging to train deep neural networks with noisy datasets since the networks are prone to overfitting the noisy labels during training.
We propose a framework that trains the model with new synthetic samples to mitigate the impact of noisy labels.
arXiv Detail & Related papers (2024-06-22T04:49:39Z)
- Dynamic Loss For Robust Learning [17.33444812274523]
This work presents a novel meta-learning based dynamic loss that automatically adjusts the objective functions with the training process to robustly learn a classifier from long-tailed noisy data.
Our method achieves state-of-the-art accuracy on multiple real-world and synthetic datasets with various types of data biases, including CIFAR-10/100, Animal-10N, ImageNet-LT, and WebVision.
arXiv Detail & Related papers (2022-11-22T01:48:25Z)
- A Study on the Impact of Data Augmentation for Training Convolutional Neural Networks in the Presence of Noisy Labels [14.998309259808236]
Label noise is common in large real-world datasets, and its presence harms the training process of deep neural networks.
We evaluate the impact of data augmentation as a design choice for training deep neural networks.
We show that the appropriate selection of data augmentation can drastically improve the model robustness to label noise.
arXiv Detail & Related papers (2022-08-23T20:04:17Z)
- Neighborhood Collective Estimation for Noisy Label Identification and Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z)
- Class-Aware Contrastive Semi-Supervised Learning [51.205844705156046]
We propose a general method named Class-aware Contrastive Semi-Supervised Learning (CCSSL) to improve pseudo-label quality and enhance the model's robustness in the real-world setting.
Our proposed CCSSL has significant performance improvements over the state-of-the-art SSL methods on the standard datasets CIFAR100 and STL10.
arXiv Detail & Related papers (2022-03-04T12:18:23Z)
- Synergistic Network Learning and Label Correction for Noise-robust Image Classification [28.27739181560233]
Deep Neural Networks (DNNs) tend to overfit training label noise, resulting in poorer model performance in practice.
We propose a robust label correction framework combining the ideas of small loss selection and noise correction.
We demonstrate our method on both synthetic and real-world datasets with different noise types and rates.
arXiv Detail & Related papers (2022-02-27T23:06:31Z)
- Learning with Neighbor Consistency for Noisy Labels [69.83857578836769]
We present a method for learning from noisy labels that leverages similarities between training examples in feature space.
We evaluate our method on datasets evaluating both synthetic (CIFAR-10, CIFAR-100) and realistic (mini-WebVision, Clothing1M, mini-ImageNet-Red) noise.
arXiv Detail & Related papers (2022-02-04T15:46:27Z)
- Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations [54.400167806154535]
Existing research on learning with noisy labels mainly focuses on synthetic label noise.
This work presents two new benchmark datasets (CIFAR-10N, CIFAR-100N) with real-world human annotations.
We show that real-world noisy labels follow an instance-dependent pattern rather than the classically adopted class-dependent ones.
arXiv Detail & Related papers (2021-10-22T22:42:11Z)
- Semantic Perturbations with Normalizing Flows for Improved Generalization [62.998818375912506]
We show that perturbations in the latent space can be used to define fully unsupervised data augmentations.
We find that latent adversarial perturbations which adapt to the classifier throughout its training are the most effective.
arXiv Detail & Related papers (2021-08-18T03:20:00Z)
- Learning from Noisy Labels via Dynamic Loss Thresholding [69.61904305229446]
We propose a novel method named Dynamic Loss Thresholding (DLT)
During the training process, DLT records the loss value of each sample and calculates dynamic loss thresholds.
Experiments on CIFAR-10/100 and Clothing1M demonstrate substantial improvements over recent state-of-the-art methods.
arXiv Detail & Related papers (2021-04-01T07:59:03Z)
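The dynamic-loss-thresholding idea summarized in the last entry, recording per-sample losses and recomputing a threshold as training progresses, can be sketched as follows. The exponential moving average and the quantile rule are assumptions for illustration; the DLT paper derives its thresholds by its own procedure.

```python
import numpy as np

class DynamicLossThreshold:
    """Illustrative sketch: keep an exponential moving average (EMA) of
    each sample's loss and recompute a threshold from the current loss
    distribution each epoch; samples below the threshold are treated as
    probably-clean for subsequent training."""

    def __init__(self, n_samples, momentum=0.9, quantile=0.5):
        self.avg_loss = np.zeros(n_samples)
        self.momentum = momentum
        self.quantile = quantile

    def record(self, indices, losses):
        # EMA update for the samples seen in this batch.
        m = self.momentum
        self.avg_loss[indices] = m * self.avg_loss[indices] + (1 - m) * losses

    def clean_mask(self):
        # Dynamic threshold: a quantile of the current loss distribution.
        threshold = np.quantile(self.avg_loss, self.quantile)
        return self.avg_loss <= threshold
```

Because the threshold is recomputed from the evolving loss distribution rather than fixed in advance, the clean/noisy split adapts as the network's loss behavior changes during training.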
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.