Detecting Label Noise via Leave-One-Out Cross Validation
- URL: http://arxiv.org/abs/2103.11352v1
- Date: Sun, 21 Mar 2021 10:02:50 GMT
- Title: Detecting Label Noise via Leave-One-Out Cross Validation
- Authors: Yu-Hang Tang, Yuanran Zhu, Wibe A. de Jong
- Abstract summary: We present a simple algorithm for identifying and correcting real-valued noisy labels from a mixture of clean and corrupted samples.
A heteroscedastic noise model is employed, in which additive Gaussian noise terms with independent variances are associated with each and all of the observed labels.
We show that the presented method can pinpoint corrupted samples and lead to better regression models when trained on synthetic and real-world scientific data sets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a simple algorithm for identifying and correcting real-valued
noisy labels from a mixture of clean and corrupted samples using Gaussian
process regression. A heteroscedastic noise model is employed, in which
additive Gaussian noise terms with independent variances are associated with
each and all of the observed labels. Thus, the method effectively applies a
sample-specific Tikhonov regularization term, generalizing the uniform
regularization prevalent in standard Gaussian process regression. Optimizing
the noise model using maximum likelihood estimation leads to the containment of
the GPR model's predictive error by the posterior standard deviation in
leave-one-out cross-validation. A multiplicative update scheme is proposed for
solving the maximum likelihood estimation problem under non-negative
constraints. While we provide a proof of monotonic convergence for certain
special cases, the multiplicative scheme has empirically demonstrated monotonic
convergence behavior in virtually all our numerical experiments. We show that
the presented method can pinpoint corrupted samples and lead to better
regression models when trained on synthetic and real-world scientific data
sets.
Related papers
- ROTI-GCV: Generalized Cross-Validation for right-ROTationally Invariant Data [1.194799054956877]
Two key tasks in high-dimensional regularized regression are tuning the regularization strength for accurate predictions and estimating the out-of-sample risk.
We introduce a new framework, ROTI-GCV, for reliably performing cross-validation under challenging conditions.
arXiv Detail & Related papers (2024-06-17T15:50:00Z) - GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection [60.78684630040313]
Diffusion models tend to reconstruct normal counterparts of test images with certain noises added.
From the global perspective, the difficulty of reconstructing images with different anomalies is uneven.
We propose a global and local adaptive diffusion model (abbreviated to GLAD) for unsupervised anomaly detection.
arXiv Detail & Related papers (2024-06-11T17:27:23Z) - Convergence of uncertainty estimates in Ensemble and Bayesian sparse
model discovery [4.446017969073817]
We show empirical success in terms of accuracy and robustness to noise with bootstrapping-based sequential thresholding least-squares estimator.
We show that this bootstrapping-based ensembling technique can perform a provably correct variable selection procedure with an exponential convergence rate of the error rate.
arXiv Detail & Related papers (2023-01-30T04:07:59Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained via simple matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Convergence for score-based generative modeling with polynomial
complexity [9.953088581242845]
We prove the first convergence guarantees for the core mechanic behind Score-based generative modeling.
Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality.
We show that a predictor-corrector gives better convergence than using either portion alone.
arXiv Detail & Related papers (2022-06-13T14:57:35Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by gradient descent (SGD)
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - Information-Theoretic Generalization Bounds for Iterative
Semi-Supervised Learning [81.1071978288003]
In particular, we seek to understand the behaviour of the em generalization error of iterative SSL algorithms using information-theoretic principles.
Our theoretical results suggest that when the class conditional variances are not too large, the upper bound on the generalization error decreases monotonically with the number of iterations, but quickly saturates.
arXiv Detail & Related papers (2021-10-03T05:38:49Z) - Learning Noise Transition Matrix from Only Noisy Labels via Total
Variation Regularization [88.91872713134342]
We propose a theoretically grounded method that can estimate the noise transition matrix and learn a classifier simultaneously.
We show the effectiveness of the proposed method through experiments on benchmark and real-world datasets.
arXiv Detail & Related papers (2021-02-04T05:09:18Z) - Generalized Multi-Output Gaussian Process Censored Regression [7.111443975103331]
We introduce a heteroscedastic multi-output Gaussian process model which combines the non-parametric flexibility of GPs with the ability to leverage information from correlated outputs under input-dependent noise conditions.
Results show how the added flexibility allows our model to better estimate the underlying non-censored (i.e. true) process under potentially complex censoring dynamics.
arXiv Detail & Related papers (2020-09-10T12:46:29Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $varepsilon*$, which deviates substantially from the test error of worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.