On the relationship between calibrated predictors and unbiased volume
estimation
- URL: http://arxiv.org/abs/2112.12560v1
- Date: Thu, 23 Dec 2021 14:22:19 GMT
- Title: On the relationship between calibrated predictors and unbiased volume
estimation
- Authors: Teodora Popordanoska, Jeroen Bertels, Dirk Vandermeulen, Frederik
Maes, Matthew B. Blaschko
- Abstract summary: Machine learning driven medical image segmentation has become standard in medical image analysis.
However, deep learning models are prone to overconfident predictions.
This has led to a renewed focus on calibrated predictions in the medical imaging and broader machine learning communities.
- Score: 18.96093589337619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning driven medical image segmentation has become standard in
medical image analysis. However, deep learning models are prone to
overconfident predictions. This has led to a renewed focus on calibrated
predictions in the medical imaging and broader machine learning communities.
Calibrated predictions are estimates of the probability of a label that
correspond to the true expected value of the label conditioned on the
confidence. Such calibrated predictions have utility in a range of medical
imaging applications, including surgical planning under uncertainty and active
learning systems. At the same time it is often an accurate volume measurement
that is of real importance for many medical applications. This work
investigates the relationship between model calibration and volume estimation.
We demonstrate both mathematically and empirically that if the predictor is
calibrated per image, we can obtain the correct volume by taking an expectation
of the probability scores per pixel/voxel of the image. Furthermore, we show
that convex combinations of calibrated classifiers preserve volume estimation,
but do not preserve calibration. Therefore, we conclude that having a
calibrated predictor is a sufficient, but not necessary condition for obtaining
an unbiased estimate of the volume. We validate our theoretical findings
empirically on a collection of 18 different (calibrated) training strategies on
the tasks of glioma volume estimation on BraTS 2018, and ischemic stroke lesion
volume estimation on ISLES 2018 datasets.
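In generic notation (a sketch for illustration; the paper's own per-image formulation and symbols may differ), a predictor $f$ is calibrated if
$$\mathbb{E}[Y \mid f(X) = p] = p \quad \text{for all } p \in [0, 1].$$
For an image with voxels $x_1, \dots, x_n$, binary foreground labels $Y_1, \dots, Y_n$, and physical volume $v$ per voxel, the volume estimator obtained by summing the probability scores is
$$\hat{V} = v \sum_{i=1}^{n} f(x_i), \qquad \mathbb{E}[\hat{V}] = v \sum_{i=1}^{n} \mathbb{E}[f(x_i)] = v \sum_{i=1}^{n} \mathbb{E}\big[\mathbb{E}[Y_i \mid f(x_i)]\big] = v \sum_{i=1}^{n} \mathbb{E}[Y_i] = \mathbb{E}[V],$$
where the second equality uses the calibration condition $\mathbb{E}[Y_i \mid f(x_i)] = f(x_i)$. Because the estimator is linear in $f$, a convex combination $\lambda f_1 + (1 - \lambda) f_2$ of calibrated predictors inherits the unbiased volume estimate even though it is, in general, no longer calibrated, which is the sufficiency-without-necessity statement above.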
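Operationally, the take-away is that the volume is read off by summing the per-voxel probabilities rather than thresholding the segmentation first. A minimal sketch, assuming a calibrated foreground probability map and a known physical voxel volume (function and variable names are illustrative, not taken from the paper's code):

```python
import numpy as np

def soft_volume(prob_map: np.ndarray, voxel_volume_ml: float) -> float:
    """Volume estimate: sum of per-voxel foreground probabilities,
    scaled by the physical volume of a single voxel."""
    return float(prob_map.sum() * voxel_volume_ml)

def hard_volume(prob_map: np.ndarray, voxel_volume_ml: float) -> float:
    """Common alternative for comparison: threshold at 0.5, then count voxels."""
    return float((prob_map >= 0.5).sum() * voxel_volume_ml)

# Toy usage with a synthetic 3D probability map.
probs = np.random.default_rng(0).uniform(size=(64, 64, 32))
print(soft_volume(probs, voxel_volume_ml=0.001),
      hard_volume(probs, voxel_volume_ml=0.001))
```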
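A toy numerical check of the convex-combination claim on synthetic data (entirely illustrative, not the BraTS/ISLES experiments): averaging two calibrated predictors leaves the volume estimate unbiased, while a simple binned ECE shows that calibration is lost.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "voxels": ground-truth foreground probabilities and sampled labels.
n = 200_000
p_true = rng.uniform(0.0, 1.0, size=n)   # true P(Y = 1 | x)
labels = rng.binomial(1, p_true)

pred_a = p_true                            # perfectly calibrated predictor
pred_b = np.full(n, p_true.mean())         # constant predictor, also calibrated
pred_mix = 0.5 * pred_a + 0.5 * pred_b     # convex combination of the two

def ece(pred, y, bins=20):
    """Simple equal-width binned expected calibration error."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    idx = np.clip(np.digitize(pred, edges) - 1, 0, bins - 1)
    err = 0.0
    for b in range(bins):
        mask = idx == b
        if mask.any():
            err += mask.mean() * abs(y[mask].mean() - pred[mask].mean())
    return err

true_volume = labels.sum()  # unit voxel volume for simplicity
for name, pred in [("A", pred_a), ("B", pred_b), ("0.5A + 0.5B", pred_mix)]:
    print(f"{name:>12}: volume={pred.sum():10.1f} (true {true_volume}), "
          f"ECE={ece(pred, labels):.4f}")
# Expected outcome: all three volume estimates are close to the true volume,
# while the convex combination shows a clearly larger ECE than A or B.
```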
Related papers
- A conformalized learning of a prediction set with applications to medical imaging classification [14.304858613146536]
We present an algorithm that can produce a prediction set containing the true label with a user-specified probability, such as 90%.
We applied the proposed algorithm to several standard medical imaging classification datasets.
arXiv Detail & Related papers (2024-08-09T12:49:04Z)
- Robust Conformal Volume Estimation in 3D Medical Images [0.5799785223420274]
Volumetry is one of the principal downstream applications of 3D medical image segmentation.
We propose an efficient approach for density ratio estimation relying on the compressed latent representations generated by the segmentation model.
arXiv Detail & Related papers (2024-07-29T12:18:07Z)
- Reassessing How to Compare and Improve the Calibration of Machine Learning Models [7.183341902583164]
A machine learning model is calibrated if its predicted probability for an outcome matches the observed frequency for that outcome conditional on the model prediction.
We show that there exist trivial recalibration approaches that can appear seemingly state-of-the-art unless calibration and prediction metrics are accompanied by additional generalization metrics.
arXiv Detail & Related papers (2024-06-06T13:33:45Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks.
We analyze problem statement, calibration definitions, and different approaches to evaluation.
Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z)
- Improving Trustworthiness of AI Disease Severity Rating in Medical Imaging with Ordinal Conformal Prediction Sets [0.7734726150561088]
A lack of statistically rigorous uncertainty quantification is a significant factor undermining trust in AI results.
Recent developments in distribution-free uncertainty quantification present practical solutions for these issues.
We demonstrate a technique for forming ordinal prediction sets that are guaranteed to contain the correct stenosis severity.
arXiv Detail & Related papers (2022-07-05T18:01:20Z)
- T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem.
Detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.
We propose T-Cal, a minimax test for calibration based on a de-biased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE).
arXiv Detail & Related papers (2022-03-03T16:58:54Z)
- Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative-model-based methods, and explain their pros and cons when using them in fully-, semi-, and weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z)
- X-CAL: Explicit Calibration for Survival Analysis [22.642252425363335]
When a model's predicted number of events within any time interval is similar to the observed number, it is called well-calibrated.
We develop explicit calibration (X-CAL) which turns D-CALIBRATION into a differentiable objective.
X-CAL allows practitioners to directly optimize calibration and strike a desired balance between predictive power and calibration.
arXiv Detail & Related papers (2021-01-13T21:00:23Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
- Individual Calibration with Randomized Forecasting [116.2086707626651]
We show that calibration for individual samples is possible in the regression setup if the predictions are randomized.
We design a training objective to enforce individual calibration and use it to train randomized regression functions.
arXiv Detail & Related papers (2020-06-18T05:53:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.