Mitigating Memorization of Noisy Labels by Clipping the Model Prediction
- URL: http://arxiv.org/abs/2212.04055v3
- Date: Tue, 13 Jun 2023 04:17:07 GMT
- Title: Mitigating Memorization of Noisy Labels by Clipping the Model Prediction
- Authors: Hongxin Wei, Huiping Zhuang, Renchunzi Xie, Lei Feng, Gang Niu, Bo An,
Yixuan Li
- Abstract summary: Cross Entropy (CE) loss has been shown not to be robust to noisy labels due to its unboundedness.
We propose LogitClip, which clamps the norm of the logit vector to ensure that it is upper bounded by a constant.
- Score: 43.11056374542014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the presence of noisy labels, designing robust loss functions is critical
for securing the generalization performance of deep neural networks. Cross
Entropy (CE) loss has been shown not to be robust to noisy labels due to its
unboundedness. To alleviate this issue, existing works typically design
specialized robust losses that satisfy the symmetric condition, which usually
leads to underfitting. In this paper, our key idea is to induce a loss bound
at the logit level, thus universally enhancing the noise robustness of existing
losses. Specifically, we propose logit clipping (LogitClip), which clamps the
norm of the logit vector to ensure that it is upper bounded by a constant. In
this manner, CE loss equipped with our LogitClip method is effectively bounded,
mitigating the overfitting to examples with noisy labels. Moreover, we present
theoretical analyses to certify the noise-tolerant ability of LogitClip.
Extensive experiments show that LogitClip not only significantly improves the
noise robustness of CE loss, but also broadly enhances the generalization
performance of popular robust losses.
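The clipping step itself is easy to express in code. Below is a minimal PyTorch sketch of the operation the abstract describes: rescaling each logit vector so that its norm never exceeds a threshold. The function name logit_clip, the use of the L2 norm, and the default value of tau are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def logit_clip(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Rescale each logit vector so its L2 norm is at most tau (a sketch,
    not the paper's official code)."""
    norms = logits.norm(p=2, dim=-1, keepdim=True)
    # Shrink only the vectors whose norm exceeds tau; leave the rest as-is.
    scale = (tau / norms.clamp(min=1e-12)).clamp(max=1.0)
    return logits * scale

# Usage: clip the logits before feeding them to the (otherwise unbounded) CE loss.
logits = 10.0 * torch.randn(32, 10)    # raw model outputs
labels = torch.randint(0, 10, (32,))   # possibly noisy labels
loss = F.cross_entropy(logit_clip(logits), labels)
```

With the logits confined to a ball of radius tau, the per-example CE loss is bounded above (by roughly log K + 2*tau for K classes), which is what curbs the memorization of mislabeled examples.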
Related papers
- Label Noise: Correcting the Forward-Correction [0.0]
Training neural network classifiers on datasets with label noise risks overfitting to the noisy labels.
Motivated by this, we propose imposing a lower bound on the training loss to mitigate overfitting (see the sketch after this entry).
arXiv Detail & Related papers (2023-07-24T19:41:19Z)
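As flagged above, here is a minimal sketch of imposing a lower bound on the training loss. The flooding-style form below is one plausible instantiation, not necessarily that paper's exact method; the floor value b is an arbitrary illustrative choice.

```python
import torch

def lower_bounded_loss(per_example_loss: torch.Tensor, b: float = 0.1) -> torch.Tensor:
    """Keep the mean training loss at or above a floor b (an assumed
    flooding-style form, not necessarily the paper's exact method)."""
    mean_loss = per_example_loss.mean()
    # Below the floor the sign of the gradient flips, so optimization cannot
    # drive the loss to zero by memorizing noisy labels.
    return (mean_loss - b).abs() + b
```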
- Expressive Losses for Verified Robustness via Convex Combinations [67.54357965665676]
We study the relationship between the over-approximation coefficient and performance profiles across different expressive losses.
We show that, while expressivity is essential, better approximations of the worst-case loss are not necessarily linked to superior robustness-accuracy trade-offs.
arXiv Detail & Related papers (2023-05-23T12:20:29Z)
- Label Distributionally Robust Losses for Multi-class Classification: Consistency, Robustness and Adaptivity [55.29408396918968]
We study a family of loss functions named label-distributionally robust (LDR) losses for multi-class classification.
Our contributions cover both consistency and robustness, establishing top-$k$ consistency of LDR losses for multi-class classification.
We propose a new adaptive LDR loss that automatically adapts the individualized temperature parameter to the noise degree of class label of each instance.
arXiv Detail & Related papers (2021-12-30T00:27:30Z)
- Robustness and reliability when training with noisy labels [12.688634089849023]
Labelling of data for supervised learning can be costly and time-consuming.
Deep neural networks have proved capable of fitting random labels; regularisation and robust loss functions are common countermeasures.
arXiv Detail & Related papers (2021-10-07T10:30:20Z)
- Learning with Noisy Labels via Sparse Regularization [76.31104997491695]
Learning with noisy labels is an important task for training accurate deep neural networks.
Some commonly-used loss functions, such as Cross Entropy (CE), suffer from severe overfitting to noisy labels.
We introduce a sparse regularization strategy to approximate the one-hot constraint; see the sketch after this entry.
arXiv Detail & Related papers (2021-07-31T09:40:23Z)
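As a concrete (and hedged) reading of the sparse-regularization idea above: penalizing the l_p norm of the softmax output with p < 1 pushes predictions toward sparse, near-one-hot distributions. The penalty form and the weight lam below are illustrative assumptions, not necessarily that paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def sparse_regularized_ce(logits: torch.Tensor, labels: torch.Tensor,
                          p: float = 0.5, lam: float = 1.0) -> torch.Tensor:
    """CE plus an l_p (p < 1) penalty on the softmax output (illustrative form)."""
    probs = F.softmax(logits, dim=-1)
    # For a probability vector, sum(p_i ** p) with p < 1 is >= 1, with equality
    # exactly at one-hot outputs, so the penalty approximates the one-hot constraint.
    lp_penalty = probs.clamp(min=1e-12).pow(p).sum(dim=-1).mean()
    return F.cross_entropy(logits, labels) + lam * lp_penalty
```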
- Asymmetric Loss Functions for Learning with Noisy Labels [82.50250230688388]
We propose a new class of loss functions, namely asymmetric loss functions, which are robust to learning with noisy labels for various types of noise.
Experimental results on benchmark datasets demonstrate that asymmetric loss functions can outperform state-of-the-art methods.
arXiv Detail & Related papers (2021-06-06T12:52:48Z)
- An Exploration into why Output Regularization Mitigates Label Noise [0.0]
Noise-robust losses are one of the more promising approaches for dealing with label noise.
We show that losses that incorporate an output regularization term, such as label smoothing and entropy penalties, become symmetric as the regularization coefficient goes to infinity; a sketch of such a loss follows this entry.
arXiv Detail & Related papers (2021-04-26T11:16:30Z)
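For concreteness, here is a minimal sketch of an output-regularized CE loss of the kind that entry analyzes: CE with an entropy-based confidence penalty, with label smoothing available as PyTorch's built-in alternative. The coefficient beta is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def entropy_regularized_ce(logits: torch.Tensor, labels: torch.Tensor,
                           beta: float = 0.1) -> torch.Tensor:
    """CE minus beta times the output entropy, i.e. a confidence penalty that
    discourages overly confident (low-entropy) predictions (beta is illustrative)."""
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
    return F.cross_entropy(logits, labels) - beta * entropy

# Label smoothing, the other regularizer named above, is built into PyTorch:
# loss = F.cross_entropy(logits, labels, label_smoothing=0.1)
```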
- Lower-bounded proper losses for weakly supervised classification [73.974163801142]
We discuss weakly supervised classification, in which instances are given weak labels.
We derive a representation theorem for proper losses in supervised learning, which dualizes the Savage representation.
We experimentally demonstrate the effectiveness of our proposed approach, as compared to improper or unbounded losses.
arXiv Detail & Related papers (2021-03-04T08:47:07Z)
- Normalized Loss Functions for Deep Learning with Noisy Labels [39.32101898670049]
We show that the commonly used Cross Entropy (CE) loss is not robust to noisy labels.
We propose Active Passive Loss (APL), a framework for building robust loss functions; a sketch of one instantiation follows this entry.
arXiv Detail & Related papers (2020-06-24T08:25:46Z)
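A hedged sketch of one APL-style combination: a normalized cross entropy term (the "active" part) plus MAE (the "passive" part). The specific pairing and the weights alpha and beta are illustrative assumptions drawn from the abstract, not a verified reproduction of the paper's code.

```python
import torch
import torch.nn.functional as F

def apl_nce_mae(logits: torch.Tensor, labels: torch.Tensor,
                alpha: float = 1.0, beta: float = 1.0) -> torch.Tensor:
    """Active Passive Loss sketch: normalized CE + MAE (illustrative weights)."""
    log_probs = F.log_softmax(logits, dim=-1)
    # Normalized CE: per-example CE divided by the summed CE over all candidate
    # labels, which bounds the term in (0, 1) and improves noise tolerance.
    ce_y = -log_probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    nce = ce_y / (-log_probs.sum(dim=-1))
    # MAE between the softmax output and the one-hot label is already bounded:
    # ||p - e_y||_1 = 2 * (1 - p_y).
    p_y = log_probs.exp().gather(1, labels.unsqueeze(1)).squeeze(1)
    mae = 2.0 * (1.0 - p_y)
    return (alpha * nce + beta * mae).mean()
```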
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.