An Exploration into why Output Regularization Mitigates Label Noise
- URL: http://arxiv.org/abs/2104.12477v1
- Date: Mon, 26 Apr 2021 11:16:30 GMT
- Title: An Exploration into why Output Regularization Mitigates Label Noise
- Authors: Neta Shoham, Tomer Avidor, Nadav Israel
- Abstract summary: Noise-robust losses are one of the more promising approaches for dealing with label noise.
We show that losses that incorporate an output regularization term, such as label smoothing and entropy, become symmetric as the regularization coefficient goes to infinity.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Label noise presents a real challenge for supervised learning algorithms.
Consequently, mitigating label noise has attracted immense research in recent
years. Noise-robust losses are one of the more promising approaches for dealing
with label noise, as these methods only require changing the loss function and
do not require changing the design of the classifier itself, which can be
expensive in terms of development time. In this work we focus on losses that
use output regularization (such as label smoothing and entropy). Although these
losses perform well in practice, their ability to mitigate label noise lacks
mathematical rigor. In this work we aim to close this gap by showing that
losses that incorporate an output regularization term become symmetric as
the regularization coefficient goes to infinity. We argue that the
regularization coefficient can be seen as a hyper-parameter controlling the
symmetry, and thus the noise robustness, of the loss function.
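As a rough sketch of the objects involved (with notation assumed here rather than taken from the paper): a loss is called symmetric when its sum over all labels is a constant, and an output-regularized loss adds a label-independent penalty on the prediction, weighted by a coefficient, to a base loss such as cross entropy.

```latex
% Sketch with assumed notation; the paper's exact normalization and limiting
% argument are not reproduced here.

% Symmetric-loss condition: the sum over all K labels is a constant.
\[
  \sum_{k=1}^{K} L\bigl(f(x), k\bigr) = C
  \quad \text{for every input } x \text{ and classifier } f .
\]

% Output-regularized loss with coefficient \lambda: cross entropy plus a
% label-independent penalty R on the prediction p = f(x), e.g. negative entropy
% (confidence penalty) or cross entropy to the uniform label (label smoothing).
\[
  L_{\lambda}(p, y) = \ell_{\mathrm{CE}}(p, y) + \lambda\, R(p),
  \qquad
  R(p) = \sum_{k} p_{k} \log p_{k}
  \;\;\text{or}\;\;
  R(p) = -\tfrac{1}{K} \sum_{k} \log p_{k} .
\]

% The abstract's claim: as \lambda \to \infty, losses of this form approach the
% symmetric condition above, so \lambda controls the degree of symmetry and,
% through it, the noise robustness.
```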
Related papers
- Label Noise: Correcting the Forward-Correction [0.0]
Training neural network classifiers on datasets with label noise poses a risk of overfitting them to the noisy labels.
We propose an approach to tackling the overfitting caused by label noise.
Specifically, we propose imposing a lower bound on the training loss to mitigate overfitting (a minimal sketch appears after this list).
arXiv Detail & Related papers (2023-07-24T19:41:19Z)
- Mitigating Memorization of Noisy Labels by Clipping the Model Prediction [43.11056374542014]
Cross Entropy (CE) loss has been shown to be not robust to noisy labels due to its unboundedness.
We propose LogitClip, which clamps the norm of the logit vector to ensure that it is upper bounded by a constant (a minimal sketch appears after this list).
arXiv Detail & Related papers (2022-12-08T03:35:42Z)
- Do We Need to Penalize Variance of Losses for Learning with Label Noise? [91.38888889609002]
We find that the variance of losses should be increased for the problem of learning with noisy labels.
By exploiting the label noise transition matrix, regularizers can be easily designed to reduce the variance of losses.
Empirically, the proposed method of increasing the variance of losses significantly improves the generalization ability of baselines on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-01-30T06:19:08Z)
- Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation [50.85788484752612]
Noise-contrastive estimation (NCE) is a statistically consistent method for learning unnormalized probabilistic models.
It has been empirically observed that the choice of the noise distribution is crucial for NCE's performance.
In this work, we formally pinpoint reasons for NCE's poor performance when an inappropriate noise distribution is used.
arXiv Detail & Related papers (2021-10-21T16:57:45Z)
- Robustness and reliability when training with noisy labels [12.688634089849023]
Labelling of data for supervised learning can be costly and time-consuming.
Deep neural networks have proved capable of fitting random labels; regularisation and the use of robust loss functions are common ways of mitigating this.
arXiv Detail & Related papers (2021-10-07T10:30:20Z)
- Learning with Noisy Labels via Sparse Regularization [76.31104997491695]
Learning with noisy labels is an important task for training accurate deep neural networks.
Some commonly-used loss functions, such as Cross Entropy (CE), suffer from severe overfitting to noisy labels.
We introduce the sparse regularization strategy to approximate the one-hot constraint.
arXiv Detail & Related papers (2021-07-31T09:40:23Z)
- Asymmetric Loss Functions for Learning with Noisy Labels [82.50250230688388]
We propose a new class of loss functions, namely asymmetric loss functions, which are robust to learning with noisy labels for various types of noise.
Experimental results on benchmark datasets demonstrate that asymmetric loss functions can outperform state-of-the-art methods.
arXiv Detail & Related papers (2021-06-06T12:52:48Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)
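The Forward-Correction entry above proposes bounding the training loss from below. A minimal sketch of one way to impose such a bound, using the flooding-style formulation as an illustration (the bound value `lower_bound` and this particular formulation are assumptions, not taken from that paper):

```python
import torch
import torch.nn.functional as F

def bounded_ce(logits: torch.Tensor, targets: torch.Tensor,
               lower_bound: float = 0.3) -> torch.Tensor:
    """Cross entropy with a lower bound imposed on the mini-batch loss.

    Uses the flooding-style form |L - b| + b so the gradient pushes the loss
    back up once it drops below the bound b, rather than letting it reach zero.
    """
    loss = F.cross_entropy(logits, targets)
    return (loss - lower_bound).abs() + lower_bound

# usage sketch
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
bounded_ce(logits, targets).backward()
```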
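The LogitClip entry above clamps the norm of the logit vector before the loss is applied. A minimal sketch of that idea (the choice of the L2 norm, the threshold name `tau`, and the usage below are assumptions for illustration, not the paper's exact recipe):

```python
import torch
import torch.nn.functional as F

def logit_clip_ce(logits: torch.Tensor, targets: torch.Tensor,
                  tau: float = 1.0, eps: float = 1e-12) -> torch.Tensor:
    """Cross entropy on norm-clipped logits.

    Each logit vector whose L2 norm exceeds tau is rescaled so that its norm
    is exactly tau; vectors already within the bound are left unchanged.
    """
    norms = logits.norm(p=2, dim=-1, keepdim=True)     # per-example logit norm
    scale = torch.clamp(tau / (norms + eps), max=1.0)  # shrink only when norm > tau
    return F.cross_entropy(logits * scale, targets)

# usage sketch
logits = torch.randn(8, 10) * 5.0
targets = torch.randint(0, 10, (8,))
loss = logit_clip_ce(logits, targets, tau=1.0)
```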
This list is automatically generated from the titles and abstracts of the papers on this site.