When Optimizing $f$-divergence is Robust with Label Noise
- URL: http://arxiv.org/abs/2011.03687v3
- Date: Wed, 18 Aug 2021 20:49:13 GMT
- Title: When Optimizing $f$-divergence is Robust with Label Noise
- Authors: Jiaheng Wei, Yang Liu
- Abstract summary: We show when maximizing a properly defined $f$-divergence measure with respect to a classifier's predictions and the supervised labels is robust to label noise.
We derive a decoupling property for a family of $f$-divergence measures when label noise is present: the divergence is shown to be a linear combination of the variational difference defined on the clean distribution and a bias term introduced by the noise.
- Score: 10.452709936265274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show when maximizing a properly defined $f$-divergence measure with
respect to a classifier's predictions and the supervised labels is robust to
label noise. Leveraging its variational form, we derive a decoupling property
for a family of $f$-divergence measures when label noise is present: the
divergence is shown to be a linear combination of the variational difference
defined on the clean distribution and a bias term introduced by the noise.
This derivation lets us analyze the robustness of different $f$-divergence
functions. With robustness established, this family of $f$-divergence
measures provides useful objectives for learning with noisy labels that do
not require specifying the labels' noise rates. For the measures that are
possibly not robust, we propose fixes to make them so. In addition to the
analytical results, we present thorough experimental evidence.
Our code is available at
https://github.com/UCSC-REAL/Robust-f-divergence-measures.
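To make the abstract's variational form concrete: an $f$-divergence admits the lower bound $D_f(P \| Q) \ge \mathbb{E}_P[g(Z)] - \mathbb{E}_Q[f^*(g(Z))]$ for any critic $g$, where $f^*$ is the convex conjugate of $f$. The sketch below estimates this variational difference for the Total Variation case ($f(v) = |v-1|/2$, so $f^*(t) = t$ on $|t| \le 1/2$), with $P$ taken as the joint distribution of (prediction, label) and $Q$ as the product of marginals. That instantiation, the critic, and the toy scores are illustrative assumptions rather than the authors' implementation; see the repository above for the real code.

```python
import numpy as np

# Minimal sketch (not the authors' code): Monte Carlo estimate of the variational
# lower bound  D_f(P || Q) >= E_P[g(Z)] - E_Q[f*(g(Z))]  for Total Variation,
# where f(v) = |v - 1| / 2 and the convex conjugate is f*(t) = t on |t| <= 1/2.

def tv_variational_difference(g_joint, g_product):
    """Estimate E_P[g] - E_Q[f*(g)] for the TV f-divergence.

    g_joint:   critic scores on samples drawn from the joint P of (prediction, label).
    g_product: critic scores on samples drawn from the product of marginals Q.
    """
    # Restrict the critic to the conjugate's domain [-1/2, 1/2]; there f*(t) = t.
    g_p = np.clip(g_joint, -0.5, 0.5)
    g_q = np.clip(g_product, -0.5, 0.5)
    return float(g_p.mean() - g_q.mean())

# Toy usage with hypothetical critic scores: a critic that separates joint from
# product samples produces a positive variational difference (a TV lower bound).
rng = np.random.default_rng(0)
scores_joint = rng.normal(0.3, 0.1, size=10_000)
scores_product = rng.normal(-0.2, 0.1, size=10_000)
print(tv_variational_difference(scores_joint, scores_product))
```

Under the paper's decoupling, the same estimator evaluated on noisy labels equals a linear combination of the clean-distribution variational difference and a noise-induced bias term, which is why maximizing it over the critic and classifier can be robust without knowledge of the noise rates.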
Related papers
- $ε$-Softmax: Approximating One-Hot Vectors for Mitigating Label Noise [99.91399796174602]
Noisy labels pose a common challenge for training accurate deep neural networks.
We propose $\epsilon$-softmax, which modifies the outputs of the softmax layer to approximate one-hot vectors with a controllable error.
We prove theoretically that $\epsilon$-softmax can achieve noise-tolerant learning with a controllable excess risk bound for almost any loss function.
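As a rough illustration of the mechanism just summarized (the convex-mixture form and the name `epsilon_softmax` are our assumptions, not necessarily the paper's exact construction), a softmax output can be pushed toward the one-hot vector of its argmax while keeping the deviation controllable:

```python
import numpy as np

def epsilon_softmax(logits, eps):
    """Hypothetical reading of 'approximate one-hot vectors with a controllable
    error': mix the ordinary softmax output toward the one-hot vector of its
    argmax. The L1 distance to the one-hot vector is then at most 2 * eps."""
    z = logits - logits.max()            # shift logits for numerical stability
    p = np.exp(z) / np.exp(z).sum()      # ordinary softmax
    one_hot = np.zeros_like(p)
    one_hot[p.argmax()] = 1.0
    return (1.0 - eps) * one_hot + eps * p

print(epsilon_softmax(np.array([2.0, 1.0, 0.1]), eps=0.1))
```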
arXiv Detail & Related papers (2025-08-04T13:10:48Z)
- Robust Classification with Noisy Labels Based on Posterior Maximization [4.550290285002704]
In this paper, we investigate the robustness to label noise of an $f$-divergence-based class of objective functions recently proposed for supervised classification.
We show that, in the presence of label noise, any of the $f$-PML objective functions can be corrected to obtain a neural network that is equal to the one learned with the clean dataset.
arXiv Detail & Related papers (2025-04-09T11:52:51Z)
- Learning with Noisy Labels: the Exploration of Error Bounds in Classification [7.657250843344973]
In this paper, we focus on the error bounds of excess risks for classification problems with noisy labels within deep learning frameworks.
We estimate the statistical error on a dependent (mixing) sequence, bounding it with the help of the associated independent block sequence.
The main task is then to estimate the approximation error for the continuous function from $[0,1]^d$ to $\mathbb{R}^K$.
arXiv Detail & Related papers (2025-01-25T10:06:50Z)
- Inaccurate Label Distribution Learning with Dependency Noise [52.08553913094809]
We introduce the Dependent Noise-based Inaccurate Label Distribution Learning (DN-ILDL) framework to tackle the challenges posed by noise in label distribution learning.
We show that DN-ILDL effectively addresses the ILDL problem and outperforms existing LDL methods.
arXiv Detail & Related papers (2024-05-26T07:58:07Z)
- Dirichlet-Based Prediction Calibration for Learning with Noisy Labels [40.78497779769083]
Learning with noisy labels can significantly hinder the generalization performance of deep neural networks (DNNs).
Existing approaches address this issue through loss correction or example selection methods.
We propose the Dirichlet-based Prediction Calibration (DPC) method as a solution.
arXiv Detail & Related papers (2024-01-13T12:33:04Z)
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real-world applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
- Doubly Stochastic Models: Learning with Unbiased Label Noises and Inference Stability [85.1044381834036]
We investigate the implicit regularization effects of label noises under mini-batch sampling settings of gradient descent.
We find that this implicit regularizer favors convergence points that stabilize model outputs against perturbations of the parameters.
Our work does not assume SGD behaves as an Ornstein-Uhlenbeck-like process, and it proves convergence of the approximation, yielding a more general result.
arXiv Detail & Related papers (2023-04-01T14:09:07Z)
- Lifting Weak Supervision To Structured Prediction [12.219011764895853]
Weak supervision (WS) is a rich set of techniques that produce pseudolabels by aggregating easily obtained but potentially noisy label estimates.
We introduce techniques new to weak supervision based on pseudo-Euclidean embeddings and tensor decompositions.
Several of our results, which can be viewed as robustness guarantees in structured prediction with noisy labels, may be of independent interest.
arXiv Detail & Related papers (2022-11-24T02:02:58Z)
- Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$, where each component $\mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
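To make "sample according to the importance of each coordinate" concrete, here is a simple adaptive-allocation heuristic in the spirit of that summary; the first-order importance weights $|\partial f / \partial \mu_i|$ and the round-based schedule are our assumptions, not the paper's algorithm:

```python
import numpy as np

def adaptive_estimate(f, grad_f, n, oracle, rounds=50, batch=100, seed=0):
    """Sketch: estimate f(mu) when each coordinate mu_i is only observable via a
    noisy oracle. Coordinates with larger |df/dmu_i| at the current estimate get
    more samples, since errors there perturb f(mu) the most. Illustrative only."""
    rng = np.random.default_rng(seed)
    sums, counts = np.zeros(n), np.zeros(n)
    for i in range(n):                    # warm start: one sample per coordinate
        sums[i] += oracle(i)
        counts[i] += 1
    for _ in range(rounds):
        mu_hat = sums / counts
        weights = np.abs(grad_f(mu_hat)) + 1e-12     # first-order importance
        alloc = rng.multinomial(batch, weights / weights.sum())
        for i, k in enumerate(alloc):
            sums[i] += sum(oracle(i) for _ in range(k))
            counts[i] += k
    return f(sums / counts)

# Toy usage: f(mu) = mu_0^2 + 0.01 * mu_1 with a unit-variance Gaussian oracle;
# the estimate should land near f(mu_true) = 3.99.
rng = np.random.default_rng(1)
mu_true = np.array([2.0, -1.0])
print(adaptive_estimate(
    f=lambda m: m[0] ** 2 + 0.01 * m[1],
    grad_f=lambda m: np.array([2.0 * m[0], 0.01]),
    n=2,
    oracle=lambda i: mu_true[i] + rng.normal(),
))
```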
arXiv Detail & Related papers (2022-03-18T18:50:52Z)
- Robustness and reliability when training with noisy labels [12.688634089849023]
Labelling of data for supervised learning can be costly and time-consuming.
Deep neural networks have proved capable of fitting random labels; regularisation and the use of robust loss functions can mitigate the effect of label noise.
arXiv Detail & Related papers (2021-10-07T10:30:20Z)
- Label Noise in Adversarial Training: A Novel Perspective to Study Robust Overfitting [45.58217741522973]
We show that label noise exists in adversarial training.
Such label noise is due to the mismatch between the true label distribution of adversarial examples and the label inherited from clean examples.
We propose a method that automatically calibrates the labels to address both the label noise and robust overfitting.
arXiv Detail & Related papers (2021-10-07T01:15:06Z)
- Instance-dependent Label-noise Learning under a Structural Causal Model [92.76400590283448]
Label noise degrades the performance of deep learning algorithms.
By leveraging a structural causal model, we propose a novel generative approach for instance-dependent label-noise learning.
arXiv Detail & Related papers (2021-09-07T10:42:54Z)
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.