A Curriculum View of Robust Loss Functions
- URL: http://arxiv.org/abs/2305.02139v1
- Date: Wed, 3 May 2023 14:13:03 GMT
- Title: A Curriculum View of Robust Loss Functions
- Authors: Zebin Ou, Yue Zhang
- Abstract summary: We show that most loss functions can be rewritten into a form with the same class-score margin and different sample-weighting functions.
We show that simple fixes to the curriculums can make underfitting robust loss functions competitive with the state-of-the-art.
- Score: 10.285812236489814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust loss functions are designed to combat the adverse impacts of label
noise, whose robustness is typically supported by theoretical bounds agnostic
to the training dynamics. However, these bounds may fail to characterize the
empirical performance as it remains unclear why robust loss functions can
underfit. We show that most loss functions can be rewritten into a form with
the same class-score margin and different sample-weighting functions. The
resulting curriculum view provides a straightforward analysis of the training
dynamics, which helps attribute underfitting to diminished average sample
weights and noise robustness to larger weights for clean samples. We show that
simple fixes to the curriculums can make underfitting robust loss functions
competitive with the state-of-the-art, and training schedules can substantially
affect the noise robustness even with robust loss functions. Code is available
at \url{github}.
Related papers
- LEARN: An Invex Loss for Outlier Oblivious Robust Online Optimization [56.67706781191521]
An adversary can introduce outliers by corrupting loss functions in an arbitrary number of k, unknown to the learner.
We present a robust online rounds optimization framework, where an adversary can introduce outliers by corrupting loss functions in an arbitrary number of k, unknown.
arXiv Detail & Related papers (2024-08-12T17:08:31Z) - Robust Loss Functions for Training Decision Trees with Noisy Labels [4.795403008763752]
We consider training decision trees using noisily labeled data, focusing on loss functions that can lead to robust learning algorithms.
First, we offer novel theoretical insights on the robustness of many existing loss functions in the context of decision tree learning.
Second, we introduce a framework for constructing robust loss functions, called distribution losses.
arXiv Detail & Related papers (2023-12-20T11:27:46Z) - On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function, that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-vary learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z) - Noise-Robust Loss Functions: Enhancing Bounded Losses for Large-Scale Noisy Data Learning [0.0]
Large annotated datasets inevitably contain noisy labels, which poses a major challenge for training deep neural networks as they easily memorize the labels.
Noise-robust loss functions have emerged as a notable strategy to counteract this issue, but it remains challenging to create a robust loss function which is not susceptible to underfitting.
We propose a novel method denoted as logit bias, which adds a real number $epsilon$ to the logit at the position of the correct class.
arXiv Detail & Related papers (2023-06-08T18:38:55Z) - The Fisher-Rao Loss for Learning under Label Noise [9.238700679836855]
We study the Fisher-Rao loss function, which emerges from the Fisher-Rao distance in the statistical manifold of discrete distributions.
We derive an upper bound for the performance degradation in the presence of label noise, and analyse the learning speed of this loss.
arXiv Detail & Related papers (2022-10-28T20:50:10Z) - Probabilistically Robust Learning: Balancing Average- and Worst-case
Performance [105.87195436925722]
We propose a framework called robustness probabilistic that bridges the gap between the accurate, yet brittle average case and the robust, yet conservative worst case.
From a theoretical point of view, this framework overcomes the trade-offs between the performance and the sample-complexity of worst-case and average-case learning.
arXiv Detail & Related papers (2022-02-02T17:01:38Z) - Searching for Robustness: Loss Learning for Noisy Classification Tasks [81.70914107917551]
We parameterize a flexible family of loss functions using Taylors and apply evolutionary strategies to search for noise-robust losses in this space.
The resulting white-box loss provides a simple and fast "plug-and-play" module that enables effective noise-robust learning in diverse downstream tasks.
arXiv Detail & Related papers (2021-02-27T15:27:22Z) - Risk Bounds for Robust Deep Learning [1.52292571922932]
It has been observed that certain loss functions can render deep-learning pipelines robust against flaws in the data.
We especially show that empirical-risk minimization with unbounded, Lipschitz-continuous loss functions, such as the least-absolute deviation loss, Huber loss, Cauchy loss, and Tukey's biweight loss, can provide efficient prediction under minimal assumptions on the data.
arXiv Detail & Related papers (2020-09-14T05:06:59Z) - An Equivalence between Loss Functions and Non-Uniform Sampling in
Experience Replay [72.23433407017558]
We show that any loss function evaluated with non-uniformly sampled data can be transformed into another uniformly sampled loss function.
Surprisingly, we find in some environments PER can be replaced entirely by this new loss function without impact to empirical performance.
arXiv Detail & Related papers (2020-07-12T17:45:24Z) - Normalized Loss Functions for Deep Learning with Noisy Labels [39.32101898670049]
We show that the commonly used Cross Entropy (CE) loss is not robust to noisy labels.
We propose a framework to build robust loss functions called Active Passive Loss (APL)
arXiv Detail & Related papers (2020-06-24T08:25:46Z) - Learning Not to Learn in the Presence of Noisy Labels [104.7655376309784]
We show that a new class of loss functions called the gambler's loss provides strong robustness to label noise across various levels of corruption.
We show that training with this loss function encourages the model to "abstain" from learning on the data points with noisy labels.
arXiv Detail & Related papers (2020-02-16T09:12:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.