Related papers: Practical estimation of the optimal classification error with soft labels and calibration

URL: http://arxiv.org/abs/2505.20761v1
Date: Tue, 27 May 2025 06:04:57 GMT
Title: Practical estimation of the optimal classification error with soft labels and calibration
Authors: Ryota Ushio, Takashi Ishida, Masashi Sugiyama
Abstract summary: We extend a previous work that utilizes soft labels for estimating the Bayes error, the optimal error rate. We tackle a more challenging problem setting: estimation with corrupted soft labels. Our method is instance-free, i.e., we do not assume access to any input instances.
Score: 52.1410307583181
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While the performance of machine learning systems has improved significantly in recent years, relatively little attention has been paid to a fundamental question: to what extent can we improve our models? This paper provides a practical and theoretically supported means of answering this question in the setting of binary classification. We extend a previous work that utilizes soft labels for estimating the Bayes error, the optimal error rate, in two important ways. First, we theoretically investigate the properties of the bias of the hard-label-based estimator discussed in the original work. We reveal that the decay rate of the bias is adaptive to how well the two class-conditional distributions are separated, and that it can decay significantly faster than the previous result suggested as the number of hard labels per instance grows. Second, we tackle a more challenging problem setting: estimation with corrupted soft labels. One might be tempted to use calibrated soft labels instead of clean ones. However, we reveal that a calibration guarantee is not enough: even perfectly calibrated soft labels can result in a substantially inaccurate estimate. We then show that isotonic calibration can provide a statistically consistent estimator under an assumption weaker than that of the previous work. Our method is instance-free, i.e., we do not assume access to any input instances. This feature allows it to be adopted in practical scenarios where the instances are not available due to privacy issues. Experiments with synthetic and real-world datasets show the validity of our methods and theory.
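Illustrative sketch (not taken from the paper): the abstract does not spell out the estimator, so the code below assumes the commonly used plug-in form for binary classification, the average of min(p, 1 - p) over soft labels p = P(y = 1 | x). The hard-label variant is assumed to replace each soft label with the fraction of positive votes among m hard labels. All function names and the uniformly distributed toy soft labels are hypothetical, and the sketch does not cover corrupted soft labels or isotonic calibration.

```python
import random


def bayes_error_from_soft_labels(soft_labels):
    """Assumed plug-in estimate: mean of min(p, 1 - p) over soft labels p."""
    if not soft_labels:
        raise ValueError("need at least one soft label")
    return sum(min(p, 1.0 - p) for p in soft_labels) / len(soft_labels)


def approx_soft_labels_from_hard(hard_labels_per_instance):
    """Approximate each soft label by the fraction of positive (1) hard labels."""
    return [sum(votes) / len(votes) for votes in hard_labels_per_instance]


def sample_votes(rng, p, m):
    """Draw m hard labels in {0, 1}, each positive with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(m)]


# Toy data (an assumption for illustration): soft labels drawn uniformly from
# [0, 1], for which the expected value of min(p, 1 - p) is 0.25.
rng = random.Random(0)
true_soft = [rng.random() for _ in range(2000)]
clean_estimate = bayes_error_from_soft_labels(true_soft)

# Hard-label-based estimates for a growing number m of hard labels per instance.
hard_estimates = {}
for m in (1, 5, 50):
    votes = [sample_votes(rng, p, m) for p in true_soft]
    hard_estimates[m] = bayes_error_from_soft_labels(
        approx_soft_labels_from_hard(votes)
    )

print("clean soft labels:", round(clean_estimate, 4))
for m, estimate in hard_estimates.items():
    print("hard labels, m =", m, "->", round(estimate, 4))
```

With a single hard label per instance, every approximated soft label is 0 or 1, so this estimate is exactly zero. That extreme case shows why the bias of the hard-label-based estimator, and how quickly it shrinks as m grows, is worth analyzing.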

Related papers

Data-Driven Estimation of the False Positive Rate of the Bayes Binary Classifier via Soft Labels [25.40796153743837]
We propose an estimator for the false positive rate (FPR) of the Bayes classifier, that is, the optimal classifier with respect to accuracy, from a given dataset. We develop effective FPR estimators by leveraging a denoising technique and the Nadaraya-Watson estimator.
arXiv Detail & Related papers (2024-01-27T20:41:55Z)
Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process. We empirically observe that mislabeled and clean examples exhibit differences in the number of epochs required for them to be consistently and correctly classified. Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
arXiv Detail & Related papers (2023-08-26T12:43:25Z)
Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning [59.44422468242455]
We propose a novel method dubbed ShrinkMatch to learn uncertain samples. For each uncertain sample, it adaptively seeks a shrunk class space, which merely contains the original top-1 class. We then impose a consistency regularization between a pair of strongly and weakly augmented samples in the shrunk space to strive for discriminative representations.
arXiv Detail & Related papers (2023-08-13T14:05:24Z)
Model Calibration in Dense Classification with Adaptive Label Perturbation [44.62722402349157]
Existing dense binary classification models are prone to being over-confident. We propose Adaptive Label Perturbation (ASLP) which learns a unique label perturbation level for each training image. ASLP can significantly improve calibration degrees of dense binary classification models on both in-distribution and out-of-distribution data.
arXiv Detail & Related papers (2023-07-25T14:40:11Z)
SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation. We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training. In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z)
Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets. To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data. We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models. Our proposed approach, Dash, is adaptive in how it selects unlabeled data.
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
Self-Supervised Learning from Semantically Imprecise Data [7.24935792316121]
Learning from imprecise labels such as "animal" or "bird" is an important capability when expertly labeled training data is scarce. CHILLAX is a recently proposed method to tackle this task. We extend CHILLAX with a self-supervised scheme using constrained extrapolation to generate pseudo-labels.
arXiv Detail & Related papers (2021-04-22T07:26:14Z)
Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation [17.212805760360954]
We use a framework centered on model misspecification in method-of-moments latent variable estimation. We then introduce a correction that provably removes this bias in certain cases. We observe theoretically and with synthetic experiments that for well-specified models, labeled points are worth a constant factor more than unlabeled points.
arXiv Detail & Related papers (2021-03-03T23:52:38Z)
Improving Generalization of Deep Fault Detection Models in the Presence of Mislabeled Data [1.3535770763481902]
We propose a novel two-step framework for robust training with label noise. In the first step, we identify outliers (including the mislabeled samples) based on the update in the hypothesis space. In the second step, we propose different approaches to modifying the training data based on the identified outliers and a data augmentation technique.
arXiv Detail & Related papers (2020-09-30T12:33:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.