Learning Not to Learn in the Presence of Noisy Labels
- URL: http://arxiv.org/abs/2002.06541v1
- Date: Sun, 16 Feb 2020 09:12:27 GMT
- Title: Learning Not to Learn in the Presence of Noisy Labels
- Authors: Liu Ziyin, Blair Chen, Ru Wang, Paul Pu Liang, Ruslan Salakhutdinov,
Louis-Philippe Morency, Masahito Ueda
- Abstract summary: We show that a new class of loss functions called the gambler's loss provides strong robustness to label noise across various levels of corruption.
We show that training with this loss function encourages the model to "abstain" from learning on the data points with noisy labels.
- Score: 104.7655376309784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning in the presence of label noise is a challenging yet important task:
it is crucial to design models that are robust in the presence of mislabeled
datasets. In this paper, we discover that a new class of loss functions called
the gambler's loss provides strong robustness to label noise across various
levels of corruption. We show that training with this loss function encourages
the model to "abstain" from learning on the data points with noisy labels,
resulting in a simple and effective method to improve robustness and
generalization. In addition, we propose two practical extensions of the method:
1) an analytical early stopping criterion to approximately stop training before
the memorization of noisy labels, and 2) a heuristic for setting
hyperparameters that does not require knowledge of the noise corruption rate. We
demonstrate the effectiveness of our method by achieving strong results across
three image and text classification tasks as compared to existing baselines.
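The gambler's loss itself is compact: the model produces m class outputs plus an (m+1)-th "abstention" output, and the loss lets the model hedge by moving probability mass to the abstention slot when no class looks reliable. A minimal PyTorch sketch under that reading (variable names and the payoff value `o` are illustrative; the paper's heuristic is one way to set `o` without knowing the corruption rate):

```python
import torch
import torch.nn.functional as F

def gamblers_loss(logits, targets, o=2.0):
    """Gambler's loss: -log(p_y + p_abstain / o).

    logits:  (batch, m + 1) scores; the last column is the abstention output.
    targets: (batch,) integer class labels in [0, m).
    o:       payoff hyperparameter in (1, m]; smaller o makes abstaining
             cheaper, so the model abstains on suspicious samples sooner.
    """
    probs = F.softmax(logits, dim=1)
    p_class = probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # p_y
    p_abstain = probs[:, -1]
    return -torch.log(p_class + p_abstain / o).mean()

# Example: 4 samples, 10 classes -> 11 outputs including abstention.
logits = torch.randn(4, 11, requires_grad=True)
targets = torch.tensor([3, 1, 7, 0])
gamblers_loss(logits, targets).backward()
```

Watching how much mass flows to the abstention output per sample is one way to see the "abstain from learning" behavior the abstract describes.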
Related papers
- Learning Camouflaged Object Detection from Noisy Pseudo Label [60.9005578956798]
This paper introduces the first weakly semi-supervised Camouflaged Object Detection (COD) method.
It aims for budget-efficient and high-precision camouflaged object segmentation with an extremely limited number of fully labeled images.
We propose a noise correction loss that facilitates the model's learning of correct pixels in the early learning stage.
When using only 20% of fully labeled data, our method shows superior performance over the state-of-the-art methods.
arXiv Detail & Related papers (2024-07-18T04:53:51Z)
- ERASE: Error-Resilient Representation Learning on Graphs for Label Noise Tolerance [53.73316938815873]
We propose a method called ERASE (Error-Resilient representation learning on graphs for lAbel noiSe tolerancE) to learn representations with error tolerance.
ERASE combines prototype pseudo-labels with propagated denoised labels and updates representations with error resilience.
Our method outperforms multiple baselines by clear margins across broad noise levels and scales well.
arXiv Detail & Related papers (2023-12-13T17:59:07Z) - Fine tuning Pre trained Models for Robustness Under Noisy Labels [34.68018860186995]
- Fine-tuning Pre-trained Models for Robustness Under Noisy Labels [34.68018860186995]
The presence of noisy labels in a training dataset can significantly impact the performance of machine learning models.
We introduce a novel algorithm called TURN, which robustly and efficiently transfers the prior knowledge of pre-trained models.
arXiv Detail & Related papers (2023-10-24T20:28:59Z)
- Mitigating Label Noise through Data Ambiguation [9.51828574518325]
Large models with high expressive power are prone to memorizing incorrect labels, thereby harming generalization performance.
In this paper, we suggest addressing the shortcomings of both methodologies by "ambiguating" the target information.
More precisely, we leverage the framework of so-called superset learning to construct set-valued targets based on a confidence threshold.
arXiv Detail & Related papers (2023-05-23T07:29:08Z)
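The ambiguation step is concrete enough to sketch: the hard target is widened into a set containing the given label plus every class the model itself finds plausible above a confidence threshold, and the loss only asks for probability mass somewhere in that set. A generic superset-style loss along those lines (the thresholding rule and the summed in-set probability are simplifying assumptions, not the paper's exact construction):

```python
import torch
import torch.nn.functional as F

def ambiguated_loss(logits, targets, threshold=0.3):
    """-log of the total probability assigned to a set-valued target:
    the given (possibly noisy) label plus every class whose predicted
    probability exceeds `threshold`."""
    probs = F.softmax(logits, dim=1)
    in_set = probs.detach() > threshold                    # model-plausible classes
    in_set[torch.arange(logits.size(0)), targets] = True   # always keep given label
    return -torch.log((probs * in_set).sum(dim=1)).mean()

logits = torch.randn(4, 10, requires_grad=True)
targets = torch.tensor([3, 1, 7, 0])
ambiguated_loss(logits, targets).backward()
```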
- Label Noise-Robust Learning using a Confidence-Based Sieving Strategy [15.997774467236352]
In learning tasks with label noise, improving model robustness against overfitting is a pivotal challenge.
Identifying the samples with noisy labels and preventing the model from learning them is a promising approach to address this challenge.
We propose a novel discriminator metric called confidence error and a sieving strategy called CONFES to differentiate between the clean and noisy samples effectively.
arXiv Detail & Related papers (2022-10-11T10:47:28Z)
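Reading "confidence error" as the gap between the model's confidence in its own top prediction and its confidence in the annotated label (an interpretation of the summary above, not a quote of the paper's definition), the sieve reduces to a threshold on that gap:

```python
import numpy as np

def confidence_error(probs, labels):
    """Near zero when the model agrees with the annotation,
    large when the model confidently contradicts it."""
    top = probs.max(axis=1)
    given = probs[np.arange(len(labels)), labels]
    return top - given

probs = np.array([[0.70, 0.20, 0.10],   # predicted 0, labeled 0
                  [0.10, 0.80, 0.10],   # predicted 1, labeled 2
                  [0.40, 0.35, 0.25]])  # ambiguous,  labeled 0
labels = np.array([0, 2, 0])
clean_mask = confidence_error(probs, labels) < 0.2  # sieve: train on these
```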
- Towards Harnessing Feature Embedding for Robust Learning with Noisy Labels [44.133307197696446]
The memorization effect of deep neural networks (DNNs) plays a pivotal role in recent label noise learning methods.
We propose a novel feature embedding-based method for deep learning with label noise, termed LabEl NoiseDilution (LEND).
arXiv Detail & Related papers (2022-06-27T02:45:09Z)
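A generic way to make feature embeddings "dilute" label noise, in the spirit of the name, is to soften each hard label with the labels of its nearest neighbours in embedding space, so an isolated noisy label gets outvoted. A rough kNN-style sketch, not the LEND algorithm itself:

```python
import numpy as np

def dilute_labels(emb, labels, n_classes, k=3):
    """Soften each label by averaging the one-hot labels of its k
    nearest neighbours in feature space (self included)."""
    onehot = np.eye(n_classes)[labels]
    dists = ((emb[:, None, :] - emb[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return onehot[nearest].mean(axis=1)   # (n, k, C) -> (n, C)

emb = np.array([[0.0, 0.0], [0.0, 0.1], [0.1, 0.1], [1.0, 1.0], [1.0, 0.9]])
labels = np.array([0, 0, 1, 1, 1])        # the third label looks noisy
soft_targets = dilute_labels(emb, labels, n_classes=2)
```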
- On Learning Contrastive Representations for Learning with Noisy Labels [26.23187556876699]
Deep neural networks are able to memorize noisy labels easily with a softmax cross-entropy (CE) loss.
Previous studies attempted to address this issue by incorporating a noise-robust loss function into the CE loss.
We propose a novel contrastive regularization function to learn such representations over noisy data.
arXiv Detail & Related papers (2022-03-03T15:58:05Z)
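The recipe — cross-entropy plus a regularizer that shapes the representation space — can be illustrated with a generic supervised-contrastive term standing in for the paper's specific regularization function:

```python
import torch
import torch.nn.functional as F

def contrastive_regularizer(feats, labels, tau=0.5):
    """Generic supervised contrastive term: pull L2-normalized features
    of same-label pairs together, push the rest apart."""
    z = F.normalize(feats, dim=1)
    n = len(labels)
    sim = z @ z.t() / tau - 1e9 * torch.eye(n)            # mask self-pairs
    pos = (labels[:, None] == labels[None, :]) & ~torch.eye(n, dtype=torch.bool)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    per_anchor = (log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)
    return -per_anchor.mean()

feats = torch.randn(8, 16, requires_grad=True)            # encoder features
labels = torch.randint(0, 2, (8,))
loss = F.cross_entropy(feats @ torch.randn(16, 2), labels) \
       + 0.1 * contrastive_regularizer(feats, labels)     # CE + regularizer
loss.backward()
```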
- Robust Long-Tailed Learning under Label Noise [50.00837134041317]
This work investigates the label noise problem under long-tailed label distribution.
We propose a robust framework that realizes noise detection for long-tailed learning.
Our framework can naturally leverage semi-supervised learning algorithms to further improve the generalisation.
arXiv Detail & Related papers (2021-08-26T03:45:00Z)
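One way to realize noise detection that stays usable on tail classes, where loss-based selection is unreliable, is distance-to-prototype detection; this is a hypothetical sketch in the spirit of the summary above, not the paper's exact procedure:

```python
import numpy as np

def detect_noise_by_prototypes(emb, labels, n_classes):
    """Flag a sample as suspect when its embedding sits closer to another
    class's prototype than to its own; distances stay meaningful even for
    classes with very few samples."""
    protos = np.stack([emb[labels == c].mean(axis=0) for c in range(n_classes)])
    dists = ((emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1) != labels

emb = np.array([[0.0, 0.0], [0.0, 0.2], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
labels = np.array([0, 0, 1, 1, 1])        # the third label contradicts its embedding
suspects = detect_noise_by_prototypes(emb, labels, 2)
# Suspect samples can be treated as unlabeled and handed to a
# semi-supervised learner, matching the framework's use of SSL algorithms.
```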
- Searching for Robustness: Loss Learning for Noisy Classification Tasks [81.70914107917551]
We parameterize a flexible family of loss functions using Taylor expansions and apply evolutionary strategies to search for noise-robust losses in this space.
The resulting white-box loss provides a simple and fast "plug-and-play" module that enables effective noise-robust learning in diverse downstream tasks.
arXiv Detail & Related papers (2021-02-27T15:27:22Z)
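The search space is small enough to write down: a loss on the true-class probability p_y is parameterized by the coefficients of a low-order polynomial in (1 - p_y), and a black-box optimizer tunes the coefficients for downstream robustness. A toy end-to-end version, with plain random search standing in for the evolutionary strategy and a 1-D logistic-regression proxy task:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy binary task: two Gaussian classes, 30% of training labels flipped.
x = np.concatenate([rng.normal(-1, 1, 200), rng.normal(1, 1, 200)])
y = np.repeat([0, 1], 200)
y_noisy = np.where(rng.random(400) < 0.3, 1 - y, y)

def train_and_score(theta, steps=300, lr=0.5):
    """Train logistic regression with the candidate Taylor-style loss
    L(p_y) = sum_k theta[k] * (1 - p_y)**(k + 1); return clean accuracy."""
    w, b = 1.0, 0.0
    for _ in range(steps):
        p1 = 1 / (1 + np.exp(-(w * x + b)))          # P(class 1)
        p_true = np.where(y_noisy == 1, p1, 1 - p1)  # prob of the given label
        u = 1 - p_true
        # dL/dp_true, then chain through the sigmoid.
        dl_dp = -sum((k + 1) * t * u ** k for k, t in enumerate(theta))
        dp_dz = np.where(y_noisy == 1, 1.0, -1.0) * p1 * (1 - p1)
        g = dl_dp * dp_dz
        w -= lr * np.mean(g * x)
        b -= lr * np.mean(g)
    return np.mean(((w * x + b) > 0) == y)           # accuracy on clean labels

best = np.array([1.0, 0.0, 0.0])                     # ~truncated log-loss
for _ in range(30):                                  # random search as ES stand-in
    cand = best + 0.2 * rng.normal(size=3)
    if train_and_score(cand) > train_and_score(best):
        best = cand
```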
- Noisy Self-Knowledge Distillation for Text Summarization [83.49809205891496]
We apply self-knowledge distillation to text summarization, which we argue can alleviate problems with maximum-likelihood training.
Our student summarization model is trained with guidance from a teacher which generates smoothed labels to help regularize training.
We demonstrate experimentally on three benchmarks that our framework boosts the performance of both pretrained and non-pretrained summarizers.
arXiv Detail & Related papers (2020-09-15T12:53:09Z)
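At the loss level the setup is easy to illustrate: the student matches a smoothed, noise-perturbed teacher distribution rather than only the hard reference labels. A generic sketch of such a distillation loss (the paper's actual recipe targets sequence-to-sequence summarizers and involves more machinery):

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits, targets,
                           alpha=0.7, tau=2.0, noise_std=0.1):
    """Blend hard-label cross-entropy with KL against a smoothed,
    noise-perturbed teacher distribution at temperature tau."""
    noisy_teacher = teacher_logits + noise_std * torch.randn_like(teacher_logits)
    soft_targets = F.softmax(noisy_teacher / tau, dim=-1)
    log_student = F.log_softmax(student_logits / tau, dim=-1)
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * tau ** 2
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1 - alpha) * ce

# Toy usage: the teacher is an earlier checkpoint of the same model ("self").
student_logits = torch.randn(4, 50, requires_grad=True)
teacher_logits = torch.randn(4, 50)
targets = torch.tensor([3, 10, 7, 42])
self_distillation_loss(student_logits, teacher_logits, targets).backward()
```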
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.