Improving group robustness under noisy labels using predictive
uncertainty
- URL: http://arxiv.org/abs/2212.07026v1
- Date: Wed, 14 Dec 2022 04:40:50 GMT
- Title: Improving group robustness under noisy labels using predictive
uncertainty
- Authors: Dongpin Oh, Dae Lee, Jeunghyun Byun, and Bonggun Shin
- Abstract summary: We use the predictive uncertainty of a model to improve the worst-group accuracy under noisy labels.
We propose a novel ENtropy based Debiasing (END) framework that prevents models from learning the spurious cues while being robust to the noisy labels.
- Score: 0.9449650062296823
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Standard empirical risk minimization (ERM) can underperform on certain
minority groups (e.g., waterbirds on land or landbirds on water) due to
spurious correlations between the input and its label. Several studies have
improved the worst-group accuracy by focusing on the high-loss samples. The
hypothesis behind this is that such high-loss samples are
\textit{spurious-cue-free} (SCF) samples. However, these approaches can be
problematic since the high-loss samples may also be samples with noisy labels
in real-world scenarios. To resolve this issue, we utilize the predictive
uncertainty of a model to improve the worst-group accuracy under noisy labels.
To motivate this, we theoretically show that the high-uncertainty samples are
the SCF samples in the binary classification problem. This theoretical result
implies that the predictive uncertainty is an adequate indicator to identify
SCF samples in a noisy-label setting. Motivated by this, we propose a novel
ENtropy based Debiasing (END) framework that prevents models from learning the
spurious cues while being robust to the noisy labels. In the END framework, we
first train the \textit{identification model} to obtain the SCF samples from a
training set using its predictive uncertainty. Then, another model is trained
on the dataset augmented with an oversampled SCF set. The experimental results
show that our END framework outperforms other strong baselines on several
real-world benchmarks that consider both the noisy labels and the
spurious-cues.
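The two-stage procedure the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the entropy quantile used as the uncertainty threshold, and the oversampling factor are all assumptions made for the sketch.

```python
import numpy as np

def predictive_entropy(probs):
    """Shannon entropy of each row of predicted class probabilities.

    High entropy = high predictive uncertainty, which (per the paper's
    theoretical result) flags putative spurious-cue-free (SCF) samples.
    """
    eps = 1e-12  # avoid log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)

def build_oversampled_indices(probs, oversample_factor=3, quantile=0.8):
    """Stage 2 of the sketch: given the identification model's predicted
    probabilities on the training set, mark the highest-entropy samples
    as the SCF set and return training indices with that set repeated
    `oversample_factor` times (the augmented, oversampled dataset)."""
    ent = predictive_entropy(probs)
    threshold = np.quantile(ent, quantile)
    scf = np.where(ent >= threshold)[0]   # high-uncertainty (SCF) samples
    rest = np.where(ent < threshold)[0]   # remaining samples, kept once
    return np.concatenate([rest, np.repeat(scf, oversample_factor)])
```

In the full framework a separately trained identification model would supply `probs`, and a second model would then be trained on the index set returned here; both training loops are omitted for brevity.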
Related papers
- Foster Adaptivity and Balance in Learning with Noisy Labels [26.309508654960354]
We propose a novel approach named SED to deal with label noise in a Self-adaptivE and class-balanceD manner.
A mean-teacher model is then employed to correct labels of noisy samples.
We additionally propose a self-adaptive and class-balanced sample re-weighting mechanism to assign different weights to detected noisy samples.
arXiv Detail & Related papers (2024-07-03T03:10:24Z) - Mitigating Noisy Supervision Using Synthetic Samples with Soft Labels [13.314778587751588]
Noisy labels are ubiquitous in real-world datasets, especially in the large-scale ones derived from crowdsourcing and web searching.
It is challenging to train deep neural networks with noisy datasets since the networks are prone to overfitting the noisy labels during training.
We propose a framework that trains the model with new synthetic samples to mitigate the impact of noisy labels.
arXiv Detail & Related papers (2024-06-22T04:49:39Z) - Learning with Imbalanced Noisy Data by Preventing Bias in Sample
Selection [82.43311784594384]
Real-world datasets contain not only noisy labels but also class imbalance.
We propose a simple yet effective method to address noisy labels in imbalanced datasets.
arXiv Detail & Related papers (2024-02-17T10:34:53Z) - Breaking the Spurious Causality of Conditional Generation via Fairness
Intervention with Corrective Sampling [77.15766509677348]
Conditional generative models often inherit spurious correlations from the training dataset.
This can result in label-conditional distributions that are imbalanced with respect to another latent attribute.
We propose a general two-step strategy to mitigate this issue.
arXiv Detail & Related papers (2022-12-05T08:09:33Z) - Learning with Noisy Labels over Imbalanced Subpopulations [13.477553187049462]
Learning with noisy labels (LNL) has attracted significant attention from the research community.
We propose a novel LNL method to simultaneously deal with noisy labels and imbalanced subpopulations.
We introduce a feature-based metric that takes the sample correlation into account for estimating samples' clean probabilities.
arXiv Detail & Related papers (2022-11-16T07:25:24Z) - Learning from Noisy Labels with Coarse-to-Fine Sample Credibility
Modeling [22.62790706276081]
Training deep neural network (DNN) with noisy labels is practically challenging.
Previous efforts tend to handle part or full data in a unified denoising flow.
We propose a coarse-to-fine robust learning method called CREMA to handle noisy data in a divide-and-conquer manner.
arXiv Detail & Related papers (2022-08-23T02:06:38Z) - Neighborhood Collective Estimation for Noisy Label Identification and
Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z) - Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated
Label Mixing [104.630875328668]
Mixup scheme suggests mixing a pair of samples to create an augmented training sample.
We present a novel, yet simple Mixup-variant that captures the best of both worlds.
arXiv Detail & Related papers (2021-12-16T11:27:48Z) - S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space.
Our method significantly surpasses previous methods on both CIFAR10 and CIFAR100 with artificial noise and on real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z) - Exploiting Sample Uncertainty for Domain Adaptive Person
Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.