Uncertainty-Aware Learning Against Label Noise on Imbalanced Datasets
- URL: http://arxiv.org/abs/2207.05471v1
- Date: Tue, 12 Jul 2022 11:35:55 GMT
- Title: Uncertainty-Aware Learning Against Label Noise on Imbalanced Datasets
- Authors: Yingsong Huang, Bing Bai, Shengwei Zhao, Kun Bai, Fei Wang
- Abstract summary: We propose an Uncertainty-aware Label Correction framework to handle label noise on imbalanced datasets.
- Score: 23.4536532321199
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Learning against label noise is a vital topic to guarantee a reliable
performance for deep neural networks. Recent research usually refers to dynamic
noise modeling with model output probabilities and loss values, and then
separates clean and noisy samples. These methods have gained notable success.
However, in contrast to their behaviour on cherry-picked balanced data, existing
approaches often perform poorly on imbalanced datasets, a common scenario in the real world. We
thoroughly investigate this phenomenon and point out two major issues that
hinder performance, i.e., inter-class loss distribution discrepancy
and misleading predictions due to uncertainty. The first issue is that
existing methods often perform class-agnostic noise modeling. However, loss
distributions show a significant discrepancy among classes under class
imbalance, so class-agnostic noise modeling can easily confuse noisy samples
with samples from minority classes. The second issue is that models may output
misleading predictions because of epistemic and aleatoric uncertainty, so
existing methods that rely solely on output probabilities may fail to identify
confident samples. Inspired by our observations, we propose an
Uncertainty-aware Label Correction framework (ULC) to handle label
noise on imbalanced datasets. First, we perform epistemic uncertainty-aware
class-specific noise modeling to identify trustworthy clean samples and
refine/discard highly confident true/corrupted labels. Then, we introduce
aleatoric uncertainty in the subsequent learning process to prevent noise
accumulation in the label noise modeling process. We conduct experiments on
several synthetic and real-world datasets. The results demonstrate the
effectiveness of the proposed method, especially on imbalanced datasets.
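For intuition, below is a minimal, hypothetical sketch of the two ingredients the abstract describes, using common choices from this literature rather than the paper's exact formulation: class-specific (rather than class-agnostic) noise modeling via a two-component Gaussian mixture over per-sample losses, and an epistemic-uncertainty score from Monte-Carlo dropout passes (BALD-style mutual information). Function names such as class_specific_clean_probs and epistemic_uncertainty are illustrative and do not come from the paper.

```python
# Hypothetical sketch (not the paper's code): class-specific noise modeling and
# an epistemic-uncertainty score, two ingredients described in the abstract.
import numpy as np
from sklearn.mixture import GaussianMixture

def class_specific_clean_probs(losses, labels, num_classes):
    """Estimate P(label is clean) per sample by fitting a separate 2-component
    GMM to the per-sample losses of each class, so that small losses in minority
    classes are not mistaken for noise by a single class-agnostic model."""
    clean_probs = np.ones_like(losses, dtype=float)
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        if len(idx) < 10:  # too few samples to fit a mixture reliably
            continue
        x = losses[idx].reshape(-1, 1)
        gmm = GaussianMixture(n_components=2, reg_covar=1e-6, random_state=0).fit(x)
        low_loss = int(np.argmin(gmm.means_.ravel()))      # low-loss component ~ clean
        clean_probs[idx] = gmm.predict_proba(x)[:, low_loss]
    return clean_probs

def epistemic_uncertainty(mc_probs):
    """BALD-style mutual information from T stochastic forward passes
    (e.g. MC dropout). mc_probs has shape (T, N, C); a higher value means the
    model's prediction for that sample is less trustworthy for label correction."""
    mean_p = mc_probs.mean(axis=0)                                     # (N, C)
    total = -(mean_p * np.log(mean_p + 1e-12)).sum(axis=1)             # entropy of mean
    aleatoric = -(mc_probs * np.log(mc_probs + 1e-12)).sum(axis=2).mean(axis=0)
    return total - aleatoric                                           # epistemic part
```

A label-correction loop in the spirit of the abstract would then treat samples with high clean probability and low epistemic uncertainty as trustworthy, relabel highly confident corrupted samples, and down-weight or discard the rest; the actual ULC procedure may differ in its details.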
Related papers
- Learning with Imbalanced Noisy Data by Preventing Bias in Sample Selection [82.43311784594384]
Real-world datasets contain not only noisy labels but also class imbalance.
We propose a simple yet effective method to address noisy labels in imbalanced datasets.
arXiv Detail & Related papers (2024-02-17T10:34:53Z)
- Doubly Stochastic Models: Learning with Unbiased Label Noises and Inference Stability [85.1044381834036]
We investigate the implicit regularization effects of label noises under mini-batch sampling settings of gradient descent.
We find such implicit regularizer would favor some convergence points that could stabilize model outputs against perturbation of parameters.
Our work does not assume SGD behaves as an Ornstein-Uhlenbeck-like process and achieves a more general result, with convergence of the approximation proved.
arXiv Detail & Related papers (2023-04-01T14:09:07Z)
- Label Noise-Robust Learning using a Confidence-Based Sieving Strategy [15.997774467236352]
In learning tasks with label noise, improving model robustness against overfitting is a pivotal challenge.
Identifying the samples with noisy labels and preventing the model from learning them is a promising approach to address this challenge.
We propose a novel discriminator metric called confidence error and a sieving strategy called CONFES to differentiate between the clean and noisy samples effectively.
arXiv Detail & Related papers (2022-10-11T10:47:28Z)
- Identifying Hard Noise in Long-Tailed Sample Distribution [76.16113794808001]
We introduce Noisy Long-Tailed Classification (NLT).
Most de-noising methods fail to identify the hard noises.
We design an iterative noisy learning framework called Hard-to-Easy (H2E).
arXiv Detail & Related papers (2022-07-27T09:03:03Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the usual assumption that the noise distribution should match the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Robustness and reliability when training with noisy labels [12.688634089849023]
Labelling of data for supervised learning can be costly and time-consuming.
Deep neural networks have proved capable of fitting random labels, even with regularisation and the use of robust loss functions.
arXiv Detail & Related papers (2021-10-07T10:30:20Z)
- Open-set Label Noise Can Improve Robustness Against Inherent Label Noise [27.885927200376386]
We show that open-set noisy labels can be non-toxic and even benefit the robustness against inherent noisy labels.
We propose a simple yet effective regularization by introducing Open-set samples with Dynamic Noisy Labels (ODNL) into training.
arXiv Detail & Related papers (2021-06-21T07:15:50Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)