UNICON: Combating Label Noise Through Uniform Selection and Contrastive
Learning
- URL: http://arxiv.org/abs/2203.14542v2
- Date: Thu, 31 Mar 2022 07:41:20 GMT
- Title: UNICON: Combating Label Noise Through Uniform Selection and Contrastive
Learning
- Authors: Nazmul Karim, Mamshad Nayeem Rizve, Nazanin Rahnavard, Ajmal Mian,
Mubarak Shah
- Abstract summary: We propose UNICON, a simple yet effective sample selection method which is robust to high label noise.
We obtain an 11.4% improvement over the current state-of-the-art on CIFAR100 dataset with a 90% noise rate.
- Score: 89.56465237941013
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supervised deep learning methods require a large repository of annotated
data; hence, label noise is inevitable. Training with such noisy data
negatively impacts the generalization performance of deep neural networks. To
combat label noise, recent state-of-the-art methods employ some sort of sample
selection mechanism to select a possibly clean subset of data. Next, an
off-the-shelf semi-supervised learning method is used for training where
rejected samples are treated as unlabeled data. Our comprehensive analysis
shows that current selection methods disproportionately select samples from
easy (fast learnable) classes while rejecting those from relatively harder
ones. This creates class imbalance in the selected clean set and in turn,
deteriorates performance under high label noise. In this work, we propose
UNICON, a simple yet effective sample selection method which is robust to high
label noise. To address the disproportionate selection of easy and hard
samples, we introduce a Jensen-Shannon divergence based uniform selection
mechanism which does not require any probabilistic modeling and hyperparameter
tuning. We complement our selection method with contrastive learning to further
combat the memorization of noisy labels. Extensive experimentation on multiple
benchmark datasets demonstrates the effectiveness of UNICON; we obtain an 11.4%
improvement over the current state-of-the-art on CIFAR100 dataset with a 90%
noise rate. Our code is publicly available.
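The abstract names a Jensen-Shannon divergence based uniform selection mechanism but does not spell out the procedure. The following is a minimal sketch of one possible reading, not the authors' implementation: score each sample by the JSD between the model's predicted distribution and its one-hot given label, then keep the same number of lowest-divergence samples from every class. The function names and the `per_class` budget are hypothetical and chosen only for illustration.

```python
import numpy as np


def js_divergence(p, q, eps=1e-12):
    """Row-wise Jensen-Shannon divergence in bits (roughly within [0, 1])."""
    # Clip to avoid log(0); the one-hot targets contain exact zeros.
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log2(p / m), axis=1)
    kl_qm = np.sum(q * np.log2(q / m), axis=1)
    return 0.5 * (kl_pm + kl_qm)


def uniform_select(probs, labels, per_class):
    """Keep up to `per_class` lowest-JSD samples from each class.

    probs:     (N, C) array of predicted class probabilities.
    labels:    (N,) array of given (possibly noisy) integer labels.
    per_class: hypothetical per-class budget (an assumption of this sketch).
    Returns the sorted list of selected indices and the per-sample JSD.
    """
    num_classes = probs.shape[1]
    one_hot = np.eye(num_classes)[labels]
    d = js_divergence(probs, one_hot)
    keep = []
    for c in range(num_classes):
        idx = np.flatnonzero(labels == c)
        # Rank this class's samples by agreement between prediction and label.
        ranked = idx[np.argsort(d[idx], kind="stable")]
        keep.extend(ranked[:per_class].tolist())
    return sorted(keep), d


# Toy example: two classes, two samples carrying each label.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]])
labels = np.array([0, 0, 1, 1])
selected, d = uniform_select(probs, labels, per_class=1)
print(selected)  # [0, 3]
```

Because the budget is applied per class rather than through a single global threshold, a class the network learns slowly still contributes as many samples to the selected set as an easy class, which is the balance property the abstract argues for.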
Related papers
- Foster Adaptivity and Balance in Learning with Noisy Labels [26.309508654960354]
We propose a novel approach named SED to deal with label noise in a Self-adaptivE and class-balanceD manner.
A mean-teacher model is then employed to correct labels of noisy samples.
We additionally propose a self-adaptive and class-balanced sample re-weighting mechanism to assign different weights to detected noisy samples.
arXiv Detail & Related papers (2024-07-03T03:10:24Z)
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specific probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- Learning with Imbalanced Noisy Data by Preventing Bias in Sample Selection [82.43311784594384]
Real-world datasets contain not only noisy labels but also class imbalance.
We propose a simple yet effective method to address noisy labels in imbalanced datasets.
arXiv Detail & Related papers (2024-02-17T10:34:53Z)
- Combating Label Noise With A General Surrogate Model For Sample Selection [84.61367781175984]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically.
We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z)
- Adaptive Sample Selection for Robust Learning under Label Noise [1.71982924656402]
Deep Neural Networks (DNNs) have been shown to be susceptible to memorization or overfitting in the presence of noisily labelled data.
A prominent class of algorithms rely on sample selection strategies, motivated by curriculum learning.
We propose a data-dependent, adaptive sample selection strategy that relies only on batch statistics.
arXiv Detail & Related papers (2021-06-29T12:10:58Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)
- Multi-Objective Interpolation Training for Robustness to Label Noise [17.264550056296915]
We show that standard supervised contrastive learning degrades in the presence of label noise.
We propose a novel label noise detection method that exploits the robust feature representations learned via contrastive learning.
Experiments on synthetic and real-world noise benchmarks demonstrate that MOIT/MOIT+ achieves state-of-the-art results.
arXiv Detail & Related papers (2020-12-08T15:01:54Z)
- No Regret Sample Selection with Noisy Labels [0.0]
Experimental results on multiple noisy-labeled datasets demonstrate that our sample selection strategy works effectively in the DNN training.
The proposed method achieves the best or the second-best performance among state-of-the-art methods, while requiring a significantly lower computational cost.
arXiv Detail & Related papers (2020-03-06T13:17:35Z)