Related papers: Noise-Robust Loss Functions: Enhancing Bounded Losses for Large-Scale Noisy Data Learning

URL: http://arxiv.org/abs/2306.05497v2
Date: Mon, 24 Jun 2024 09:02:08 GMT
Title: Noise-Robust Loss Functions: Enhancing Bounded Losses for Large-Scale Noisy Data Learning
Authors: Max Staats, Matthias Thamm, Bernd Rosenow
Abstract summary: Large annotated datasets inevitably contain noisy labels, which poses a major challenge for training deep neural networks as they easily memorize the labels. Noise-robust loss functions have emerged as a notable strategy to counteract this issue, but it remains challenging to create a robust loss function which is not susceptible to underfitting. We propose a novel method denoted as logit bias, which adds a real number $\epsilon$ to the logit at the position of the correct class.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large annotated datasets inevitably contain noisy labels, which poses a major challenge for training deep neural networks as they easily memorize the labels. Noise-robust loss functions have emerged as a notable strategy to counteract this issue, but it remains challenging to create a robust loss function which is not susceptible to underfitting. Through a quantitative approach, this paper explores the limited overlap between the network output at initialization and regions of non-vanishing gradients of bounded loss functions in the initial learning phase. Using these insights, we address underfitting of the MAE loss with a novel method denoted as logit bias, which adds a real number $\epsilon$ to the logit at the position of the correct class. This method enables bounded losses to learn, even on datasets like WebVision, consisting of over a million images from 1000 classes. Extensive numerical experiments show that the logit bias enables MAE to compete with state-of-the-art noise robust loss functions. In addition, we demonstrate that our method can be used to determine optimal parameters for other loss functions -- without having to train networks. Remarkably, our method determines the hyperparameters based on the number of classes, resulting in loss functions which require zero dataset or noise-dependent parameters.
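The logit bias described in the abstract can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' implementation: the function names `softmax` and `mae_loss`, the form of the MAE loss (L1 distance between the softmax output and the one-hot label), and the value `epsilon=6.0` are assumptions chosen for illustration. The abstract states only that $\epsilon$ is determined from the number of classes and does not give the rule.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mae_loss(logits, labels, epsilon=0.0):
    """Mean absolute error between softmax output and one-hot labels.

    epsilon is the logit bias: a real number added to the logit at the
    position of the labeled class before the softmax (hypothetical value,
    the source does not state how it is chosen).
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    biased = logits.copy()                       # leave the caller's array untouched
    biased[np.arange(len(labels)), labels] += epsilon
    probs = softmax(biased)
    one_hot = np.eye(logits.shape[1])[labels]
    # Per-sample L1 distance equals 2 * (1 - p_label), so it is bounded by 2.
    return np.abs(probs - one_hot).sum(axis=1).mean()

# Uniform outputs over 1000 classes, a rough stand-in for a network at initialization.
logits = np.zeros((4, 1000))
labels = np.array([0, 1, 2, 3])
print(mae_loss(logits, labels))                  # no bias
print(mae_loss(logits, labels, epsilon=6.0))     # with an illustrative bias
```

With uniform outputs over 1000 classes, the unbiased probability of the labeled class is 0.001, so the loss sits close to its upper bound of 2. Adding the bias raises that probability and moves the loss away from the bound, which is one way to picture the region of non-vanishing gradients that the abstract refers to.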

Related papers

Active Negative Loss: A Robust Framework for Learning with Noisy Labels [26.853357479214004]
Noise-robust loss functions offer an effective solution for enhancing learning in the presence of label noise. We introduce a novel loss function class, termed Normalized Negative Loss Functions (NNLFs), which serve as passive loss functions within the APL framework. In non-symmetric noise scenarios, we propose an entropy-based regularization technique to mitigate the vulnerability to the label imbalance.
arXiv Detail & Related papers (2024-12-03T11:00:15Z)
Robust Network Learning via Inverse Scale Variational Sparsification [55.64935887249435]
We introduce an inverse scale variational sparsification framework within a time-continuous inverse scale space formulation. Unlike frequency-based methods, our approach does more than remove noise by smoothing small-scale features. We show the efficacy of our approach through enhanced robustness against various noise types.
arXiv Detail & Related papers (2024-09-27T03:17:35Z)
LEARN: An Invex Loss for Outlier Oblivious Robust Online Optimization [56.67706781191521]
We present a robust online optimization framework, where an adversary can introduce outliers by corrupting loss functions in an arbitrary number of rounds k, unknown to the learner.
arXiv Detail & Related papers (2024-08-12T17:08:31Z)
Robust Loss Functions for Training Decision Trees with Noisy Labels [4.795403008763752]
We consider training decision trees using noisily labeled data, focusing on loss functions that can lead to robust learning algorithms. First, we offer novel theoretical insights on the robustness of many existing loss functions in the context of decision tree learning. Second, we introduce a framework for constructing robust loss functions, called distribution losses.
arXiv Detail & Related papers (2023-12-20T11:27:46Z)
Xtreme Margin: A Tunable Loss Function for Binary Classification Problems [0.0]
We provide an overview of a novel loss function, the Xtreme Margin loss function. Unlike the binary cross-entropy and the hinge loss functions, this loss function provides researchers and practitioners flexibility with their training process.
arXiv Detail & Related papers (2022-10-31T22:39:32Z)
The Fisher-Rao Loss for Learning under Label Noise [9.238700679836855]
We study the Fisher-Rao loss function, which emerges from the Fisher-Rao distance in the statistical manifold of discrete distributions. We derive an upper bound for the performance degradation in the presence of label noise, and analyse the learning speed of this loss.
arXiv Detail & Related papers (2022-10-28T20:50:10Z)
Learning with Noisy Labels via Sparse Regularization [76.31104997491695]
Learning with noisy labels is an important task for training accurate deep neural networks. Some commonly-used loss functions, such as Cross Entropy (CE), suffer from severe overfitting to noisy labels. We introduce the sparse regularization strategy to approximate the one-hot constraint.
arXiv Detail & Related papers (2021-07-31T09:40:23Z)
On Codomain Separability and Label Inference from (Noisy) Loss Functions [11.780563744330038]
We introduce the notion of codomain separability to study the necessary and sufficient conditions under which label inference is possible from any (noisy) loss function values. We show that for many commonly used loss functions, including multiclass cross-entropy with common activation functions and some Bregman divergence-based losses, it is possible to design label inference attacks for arbitrary noise levels.
arXiv Detail & Related papers (2021-07-07T05:29:53Z)
Asymmetric Loss Functions for Learning with Noisy Labels [82.50250230688388]
We propose a new class of loss functions, namely asymmetric loss functions, which are robust to learning with noisy labels for various types of noise. Experimental results on benchmark datasets demonstrate that asymmetric loss functions can outperform state-of-the-art methods.
arXiv Detail & Related papers (2021-06-06T12:52:48Z)
Searching for Robustness: Loss Learning for Noisy Classification Tasks [81.70914107917551]
We parameterize a flexible family of loss functions using Taylor polynomials and apply evolutionary strategies to search for noise-robust losses in this space. The resulting white-box loss provides a simple and fast "plug-and-play" module that enables effective noise-robust learning in diverse downstream tasks.
arXiv Detail & Related papers (2021-02-27T15:27:22Z)
Normalized Loss Functions for Deep Learning with Noisy Labels [39.32101898670049]
We show that the commonly used Cross Entropy (CE) loss is not robust to noisy labels. We propose a framework to build robust loss functions called Active Passive Loss (APL).
arXiv Detail & Related papers (2020-06-24T08:25:46Z)
Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks [107.77595511218429]
In this paper, we investigate the empirical Rademacher complexity related to intermediate layers of deep neural networks. We propose a feature distortion method (Disout) for addressing the aforementioned problem. The superiority of the proposed feature map distortion for producing deep neural networks with higher testing performance is analyzed and demonstrated.
arXiv Detail & Related papers (2020-02-23T13:59:13Z)
Learning Not to Learn in the Presence of Noisy Labels [104.7655376309784]
We show that a new class of loss functions called the gambler's loss provides strong robustness to label noise across various levels of corruption. We show that training with this loss function encourages the model to "abstain" from learning on the data points with noisy labels.
arXiv Detail & Related papers (2020-02-16T09:12:27Z)
Learning Adaptive Loss for Robust Learning with Noisy Labels [59.06189240645958]
Robust loss minimization is an important strategy for handling the robust learning issue on noisy labels. We propose a meta-learning method capable of tuning the hyperparameters of robust loss functions. Four kinds of SOTA robust loss functions are tried with the method, and experiments support its general availability and effectiveness.
arXiv Detail & Related papers (2020-02-16T00:53:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.