Is the Performance of My Deep Network Too Good to Be True? A Direct
Approach to Estimating the Bayes Error in Binary Classification
- URL: http://arxiv.org/abs/2202.00395v1
- Date: Tue, 1 Feb 2022 13:22:26 GMT
- Title: Is the Performance of My Deep Network Too Good to Be True? A Direct
Approach to Estimating the Bayes Error in Binary Classification
- Authors: Takashi Ishida, Ikko Yamane, Nontawat Charoenphakdee, Gang Niu,
Masashi Sugiyama
- Abstract summary: In classification problems, the Bayes error can be used as a criterion to evaluate classifiers with state-of-the-art performance.
We propose a simple and direct Bayes error estimator, where we just take the mean of the labels that show uncertainty of the classes.
Our flexible approach enables us to perform Bayes error estimation even for weakly supervised data.
- Score: 86.32752788233913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a fundamental limitation in the prediction performance that a
machine learning model can achieve due to the inevitable uncertainty of the
prediction target. In classification problems, this can be characterized by the
Bayes error, which is the best achievable error with any classifier. The Bayes
error can be used as a criterion to evaluate classifiers with state-of-the-art
performance and can be used to detect test set overfitting. We propose a simple
and direct Bayes error estimator, where we just take the mean of the labels
that show \emph{uncertainty} of the classes. Our flexible approach enables us
to perform Bayes error estimation even for weakly supervised data. In contrast
to others, our method is model-free and even instance-free. Moreover, it has no
hyperparameters and gives a more accurate estimate of the Bayes error than
classifier-based baselines. Experiments using our method suggest that a
recently proposed classifier, the Vision Transformer, may have already reached
the Bayes error for certain benchmark datasets.
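Concretely, the estimator described in the abstract reduces to a one-line computation. The following is a minimal sketch of that idea (our illustration, not the authors' code), assuming each test instance carries a soft label approximating the positive-class posterior, e.g. the fraction of annotators who labeled it positive:

```python
import numpy as np

def direct_bayes_error_estimate(soft_labels):
    """Direct Bayes error estimate from soft labels.

    If c_i approximates the class posterior P(y = 1 | x_i), the Bayes error
    E_x[min(P(y = 1 | x), P(y = 0 | x))] is estimated by the mean of
    min(c_i, 1 - c_i). Model-free, instance-free, and hyperparameter-free,
    matching the properties claimed in the abstract.
    """
    c = np.asarray(soft_labels, dtype=float)
    return float(np.mean(np.minimum(c, 1.0 - c)))

# Soft labels could come, e.g., from annotator vote fractions.
print(direct_bayes_error_estimate([0.9, 0.95, 0.1, 0.4, 0.85, 0.2]))  # ~0.167
```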
Related papers
- Mitigating Word Bias in Zero-shot Prompt-based Classifiers [55.60306377044225]
We show that matching class priors correlates strongly with the oracle upper bound performance.
We also demonstrate large consistent performance gains for prompt settings over a range of NLP tasks.
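As one concrete reading of "matching class priors", a zero-shot classifier's per-class scores can be reweighted until the average prediction matches a target prior. The sketch below is a hypothetical illustration of that idea, not the paper's exact method:

```python
import numpy as np

def match_class_priors(probs, target_prior=None, n_iter=100):
    """Reweight class probabilities so the induced prior matches a target.

    `probs`: (n_examples, n_classes) prompt-based class probabilities.
    Word bias makes some label words dominate regardless of the input;
    per-class weights w are adjusted multiplicatively until the mean
    reweighted prediction equals the target prior (uniform by default).
    """
    probs = np.asarray(probs, dtype=float)
    k = probs.shape[1]
    target = np.full(k, 1.0 / k) if target_prior is None else np.asarray(target_prior)
    w = np.ones(k)
    for _ in range(n_iter):
        scaled = probs * w
        scaled /= scaled.sum(axis=1, keepdims=True)
        w *= target / scaled.mean(axis=0)  # multiplicative correction
    scaled = probs * w
    return scaled / scaled.sum(axis=1, keepdims=True)
```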
arXiv Detail & Related papers (2023-09-10T10:57:41Z)
- Probabilistic Safety Regions Via Finite Families of Scalable Classifiers [2.431537995108158]
Supervised classification recognizes patterns in the data to separate classes of behaviours.
Canonical solutions contain misclassification errors that are intrinsic to the approximate numerical nature of machine learning.
We introduce the concept of probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled.
arXiv Detail & Related papers (2023-09-08T22:40:19Z)
- How to Fix a Broken Confidence Estimator: Evaluating Post-hoc Methods for Selective Classification with Deep Neural Networks [1.4502611532302039]
We show that a simple $p$-norm normalization of the logits, followed by taking the maximum logit as the confidence estimator, can lead to considerable gains in selective classification performance.
Our results are shown to be consistent under distribution shift.
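Read literally, the recipe above is easy to reproduce post hoc. Here is a minimal sketch under our reading of the summary (the paper's exact normalization may differ):

```python
import numpy as np

def max_logit_pnorm(logits, p=2):
    """Confidence score: maximum logit after p-norm normalization."""
    logits = np.asarray(logits, dtype=float)
    norms = np.linalg.norm(logits, ord=p, axis=-1, keepdims=True)
    return (logits / norms).max(axis=-1)

def selective_accuracy(logits, labels, coverage=0.8, p=2):
    """Accuracy on the `coverage` fraction of most confident predictions."""
    conf = max_logit_pnorm(logits, p=p)
    keep = conf >= np.quantile(conf, 1.0 - coverage)
    preds = np.asarray(logits).argmax(axis=-1)
    return float((preds[keep] == np.asarray(labels)[keep]).mean())
```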
arXiv Detail & Related papers (2023-05-24T18:56:55Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with state-of-the-art methods.
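A generic nearest-prototype classifier matching the description above (with assumed details; the paper's exact distance and normalization may differ) looks like this:

```python
import numpy as np

def fit_prototypes(embeddings, labels, n_classes):
    """One prototype per class: the mean embedding of its training points.
    No parameters beyond the embedding network are fit, and each class
    contributes exactly one prototype regardless of its sample count,
    which is what yields balanced predictions under class imbalance."""
    embeddings, labels = np.asarray(embeddings), np.asarray(labels)
    return np.stack([embeddings[labels == c].mean(axis=0)
                     for c in range(n_classes)])

def predict_nearest_prototype(embeddings, prototypes):
    """Label each point with the class of its nearest prototype."""
    embeddings = np.asarray(embeddings)
    dists = np.linalg.norm(embeddings[:, None, :] - prototypes[None, :, :],
                           axis=-1)
    return dists.argmin(axis=1)
```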
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Robust Importance Sampling for Error Estimation in the Context of Optimal Bayesian Transfer Learning [13.760785726194591]
We introduce a novel class of Bayesian minimum mean-square error (MMSE) estimators for optimal Bayesian transfer learning (OBTL).
We employ the proposed estimator to evaluate the classification accuracy of a broad family of classifiers that span diverse learning capabilities.
Experimental results based on both synthetic and real-world RNA sequencing (RNA-seq) data show that our proposed OBTL error estimation scheme clearly outperforms standard error estimators.
arXiv Detail & Related papers (2021-09-05T19:11:33Z)
- Evaluating State-of-the-Art Classification Models Against Bayes Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
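When both class-conditional densities can be evaluated exactly (as with normalizing flows), the Bayes error admits a direct Monte Carlo estimate. Below is an illustrative sketch with toy Gaussians standing in for flow densities (assumed interface, not the paper's code):

```python
import numpy as np
from scipy.stats import norm

def bayes_error_mc(log_p0, log_p1, sample0, sample1, prior0=0.5, n=100_000):
    """Monte Carlo estimate of E_x[min(P(y=0|x), P(y=1|x))] when the
    class-conditional log densities can be evaluated exactly."""
    n0 = np.random.binomial(n, prior0)          # draw x from the mixture
    x = np.concatenate([sample0(n0), sample1(n - n0)])
    num0 = prior0 * np.exp(log_p0(x))
    num1 = (1.0 - prior0) * np.exp(log_p1(x))
    post0 = num0 / (num0 + num1)                # exact posterior P(y=0|x)
    return float(np.mean(np.minimum(post0, 1.0 - post0)))

# Two unit Gaussians at -1 and +1: true Bayes error is Phi(-1) ~ 0.159.
print(bayes_error_mc(lambda x: norm.logpdf(x, -1), lambda x: norm.logpdf(x, 1),
                     lambda m: np.random.normal(-1, 1, m),
                     lambda m: np.random.normal(1, 1, m)))
```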
arXiv Detail & Related papers (2021-06-07T06:21:20Z)
- How to Control the Error Rates of Binary Classifiers [0.0]
We show how to turn binary classifiers into statistical tests, calculate the classification p-values, and use them to limit the target error rate.
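One standard construction that fits this description (our hedged reading, not necessarily the paper's exact one) computes a p-value for each test point against held-out scores of known negatives:

```python
import numpy as np

def classification_p_value(score, null_scores):
    """p-value of a classifier score under the null (negative) class:
    the rank of `score` among held-out scores of known negatives."""
    null_scores = np.asarray(null_scores)
    return (1 + np.sum(null_scores >= score)) / (len(null_scores) + 1)

def predict_with_error_control(scores, null_scores, alpha=0.05):
    """Declare 'positive' only when p <= alpha, which caps the rate of
    false positives on null points at roughly alpha."""
    return np.array([classification_p_value(s, null_scores) <= alpha
                     for s in scores])
```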
arXiv Detail & Related papers (2020-10-21T14:43:14Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.