On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective
- URL: http://arxiv.org/abs/2011.09172v2
- Date: Mon, 14 Dec 2020 04:15:40 GMT
- Title: On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective
- Authors: Nontawat Charoenphakdee, Jayakorn Vongkulbhisal, Nuttapong Chairatanakul, Masashi Sugiyama
- Abstract summary: We first prove that the focal loss is classification-calibrated, i.e., its minimizer surely yields the Bayes-optimal classifier.
We then prove that the focal loss is not strictly proper, i.e., the confidence score of the classifier does not match the true class-posterior probability.
Our proposed transformation significantly improves the accuracy of class-posterior probability estimation.
- Score: 83.19406301934245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The focal loss has demonstrated its effectiveness in many real-world
applications such as object detection and image classification, but its
theoretical understanding has been limited so far. In this paper, we first
prove that the focal loss is classification-calibrated, i.e., its minimizer
surely yields the Bayes-optimal classifier and thus the use of the focal loss
in classification can be theoretically justified. However, we also prove the
negative result that the focal loss is not strictly proper, i.e., the confidence
score of the classifier obtained by focal loss minimization does not match the
true class-posterior probability, and thus it is not reliable as a
class-posterior probability estimator. To mitigate this problem, we next prove
that a particular closed-form transformation of the confidence score allows us
to recover the true class-posterior probability. Through experiments on
benchmark datasets, we demonstrate that our proposed transformation
significantly improves the accuracy of class-posterior probability estimation.
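
To make the abstract's claims concrete, below is a minimal numerical sketch for the binary case. It shows that the exact minimizer of the pointwise focal risk differs from the true class-posterior probability eta (the loss is not strictly proper), and that inverting the first-order optimality condition recovers eta. The focal parameter GAMMA and the helper names (focal_minimizer, recover_posterior) are illustrative; the transformation below is derived here from the first-order condition and is not necessarily the exact closed form used in the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

GAMMA = 2.0  # focal parameter; GAMMA = 0 recovers the cross-entropy loss


def focal_pos(p, gamma=GAMMA):
    # Focal loss incurred when the true label is positive and the score is p.
    return -((1.0 - p) ** gamma) * np.log(p)


def focal_neg(p, gamma=GAMMA):
    # Focal loss incurred when the true label is negative and the score is p.
    return -(p ** gamma) * np.log(1.0 - p)


def pointwise_risk(p, eta, gamma=GAMMA):
    # Conditional focal risk at a point whose true class-posterior is eta.
    return eta * focal_pos(p, gamma) + (1.0 - eta) * focal_neg(p, gamma)


def focal_minimizer(eta, gamma=GAMMA):
    # Confidence score produced by exact minimization of the pointwise focal risk.
    res = minimize_scalar(pointwise_risk, bounds=(1e-6, 1.0 - 1e-6),
                          args=(eta, gamma), method="bounded")
    return res.x


def recover_posterior(p, gamma=GAMMA):
    # Illustrative inverse-link transformation (an assumption, not quoted from the
    # paper): solve the first-order condition
    #   eta * d/dp focal_pos(p) + (1 - eta) * d/dp focal_neg(p) = 0
    # for eta, given the observed confidence score p.
    d_pos = gamma * (1.0 - p) ** (gamma - 1) * np.log(p) - (1.0 - p) ** gamma / p
    d_neg = -gamma * p ** (gamma - 1) * np.log(1.0 - p) + p ** gamma / (1.0 - p)
    return d_neg / (d_neg - d_pos)


if __name__ == "__main__":
    for eta in (0.1, 0.3, 0.7, 0.9):
        p_hat = focal_minimizer(eta)
        print(f"eta={eta:.2f}  focal score={p_hat:.4f}  "
              f"recovered={recover_posterior(p_hat):.4f}")
```

Running the script shows that the focal score deviates from eta while the recovered value matches it up to the optimizer's tolerance; setting GAMMA = 0 makes both coincide with eta, since the cross-entropy loss is strictly proper.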
Related papers
- Sharp error bounds for imbalanced classification: how many examples in the minority class? [6.74159270845872]
Reweighting the loss function is a standard procedure for balancing the true positive and true negative rates within the risk measure.
Despite significant theoretical work in this area, existing results do not adequately address a main challenge within the imbalanced classification framework.
We present two contributions in the setting where the rare class probability approaches zero.
arXiv Detail & Related papers (2023-10-23T11:45:34Z)
- Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
- When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z)
- Beyond calibration: estimating the grouping loss of modern neural networks [68.8204255655161]
Proper scoring rule theory shows that given the calibration loss, the missing piece to characterize individual errors is the grouping loss.
We show that modern neural network architectures in vision and NLP exhibit grouping loss, notably in distribution-shift settings.
arXiv Detail & Related papers (2022-10-28T07:04:20Z)
- DBCal: Density Based Calibration of classifier predictions for uncertainty quantification [0.0]
We present a technique that quantifies the uncertainty of predictions from a machine learning method.
We prove that our method provides an accurate estimate of the probability that the outputs of two neural networks are correct.
arXiv Detail & Related papers (2022-04-01T01:03:41Z)
- Identifying Incorrect Classifications with Balanced Uncertainty [21.130311978327196]
Uncertainty estimation is critical for cost-sensitive deep-learning applications.
We introduce the notion of distributional imbalance to model the imbalance in uncertainty estimation as two kinds of distribution bias.
We then propose the Balanced True Class Probability framework, which learns an uncertainty estimator with a novel Distributional Focal Loss objective.
arXiv Detail & Related papers (2021-10-15T11:52:31Z)
- Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification [58.03725169462616]
We show theoretically that over-parametrization is not the only reason for over-confidence.
We prove that logistic regression is inherently over-confident in the realizable, under-parametrized setting.
Perhaps surprisingly, we also show that over-confidence is not always the case.
arXiv Detail & Related papers (2021-02-15T21:38:09Z)
- Classification with Rejection Based on Cost-sensitive Classification [83.50402803131412]
We propose a novel method of classification with rejection by ensemble learning.
Experimental results demonstrate the usefulness of our proposed approach in clean, noisy, and positive-unlabeled classification.
arXiv Detail & Related papers (2020-10-22T14:05:05Z)