X-CAL: Explicit Calibration for Survival Analysis
- URL: http://arxiv.org/abs/2101.05346v1
- Date: Wed, 13 Jan 2021 21:00:23 GMT
- Title: X-CAL: Explicit Calibration for Survival Analysis
- Authors: Mark Goldstein, Xintian Han, Aahlad Puli, Adler J. Perotte and Rajesh Ranganath
- Abstract summary: When a model's predicted number of events within any time interval is similar to the observed number, it is called well-calibrated.
We develop explicit calibration (X-CAL), which turns D-CALIBRATION into a differentiable objective.
X-CAL allows practitioners to directly optimize calibration and strike a desired balance between predictive power and calibration.
- Score: 22.642252425363335
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Survival analysis models the distribution of time until an event of interest,
such as discharge from the hospital or admission to the ICU. When a model's
predicted number of events within any time interval is similar to the observed
number, it is called well-calibrated. A survival model's calibration can be
measured using, for instance, distributional calibration (D-CALIBRATION)
[Haider et al., 2020] which computes the squared difference between the
observed and predicted number of events within different time intervals.
Classically, calibration is addressed in post-training analysis. We develop
explicit calibration (X-CAL), which turns D-CALIBRATION into a differentiable
objective that can be used in survival modeling alongside maximum likelihood
estimation and other objectives. X-CAL allows practitioners to directly
optimize calibration and strike a desired balance between predictive power and
calibration. In our experiments, we fit a variety of shallow and deep models on
simulated data, a survival dataset based on MNIST, on length-of-stay prediction
using MIMIC-III data, and on brain cancer data from The Cancer Genome Atlas. We
show that the models we study can be miscalibrated. We give experimental
evidence on these datasets that X-CAL improves D-CALIBRATION without a large
decrease in concordance or likelihood.
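To make the idea concrete, below is a minimal sketch (written here in PyTorch; it is not the authors' released implementation) of how a D-CALIBRATION-style statistic can be made differentiable and added to a negative log-likelihood objective. It assumes the model can evaluate its predicted CDF F(t) at each uncensored event time; the soft-binning temperature, the number of bins, and the weight lam are illustrative choices, and the paper's handling of censored observations is omitted for brevity.

```python
import torch

def soft_dcal_penalty(cdf_at_event, n_bins=10, temperature=0.01):
    """Differentiable D-calibration-style penalty (illustrative sketch).

    cdf_at_event: predicted CDF values F(t_i) at the observed (uncensored)
    event times, shape (N,), each in [0, 1]. For a well-calibrated model
    these values are uniformly distributed, so each of the n_bins
    equal-width bins should receive about 1/n_bins of the samples.
    A product of sigmoids gives a soft, differentiable bin indicator.
    """
    edges = torch.linspace(0.0, 1.0, n_bins + 1)
    lower, upper = edges[:-1], edges[1:]
    u = cdf_at_event.unsqueeze(1)                                  # (N, 1)
    membership = torch.sigmoid((u - lower) / temperature) * \
                 torch.sigmoid((upper - u) / temperature)          # (N, n_bins)
    proportions = membership.mean(dim=0)                           # soft fraction per bin
    return ((proportions - 1.0 / n_bins) ** 2).sum()

def xcal_style_loss(nll, cdf_at_event, lam=1.0):
    """Total objective: likelihood term plus a weighted calibration penalty."""
    return nll + lam * soft_dcal_penalty(cdf_at_event)
```

The weight lam controls the trade-off the abstract describes: lam = 0 recovers plain maximum likelihood, while larger values push the predicted CDF values at observed event times toward uniformity, possibly at some cost in likelihood or concordance.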
Related papers
- ForeCal: Random Forest-based Calibration for DNNs [0.0]
We propose ForeCal, a novel post-hoc calibration algorithm based on random forests.
ForeCal exploits two unique properties of random forests: the ability to enforce weak monotonicity and range preservation.
We show that ForeCal outperforms existing methods in terms of Expected Calibration Error (ECE) with minimal impact on the discriminative power of the base model as measured by AUC (a minimal ECE sketch appears after this list).
arXiv Detail & Related papers (2024-09-04T04:56:41Z)
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We characterize the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting.
We validate our theory on a variety of high-dimensional datasets.
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
- Bayesian calibration of stochastic agent based model via random forest [1.4447019135112433]
Agent-based models (ABM) provide an excellent framework for modeling outbreaks and interventions in epidemiology.
These models are usually stochastic and highly parametrized, requiring precise calibration for predictive performance.
This paper presents a random forest based surrogate modeling technique to accelerate the evaluation of ABMs.
arXiv Detail & Related papers (2024-06-27T20:50:06Z)
- Calibrating Large Language Models with Sample Consistency [76.23956851098598]
We explore the potential of deriving confidence from the distribution of multiple randomly sampled model generations, via three measures of consistency.
Results show that consistency-based calibration methods outperform existing post-hoc approaches.
We offer practical guidance on choosing suitable consistency metrics for calibration, tailored to the characteristics of various LMs.
arXiv Detail & Related papers (2024-02-21T16:15:20Z)
- Explain Variance of Prediction in Variational Time Series Models for Clinical Deterioration Prediction [4.714591319660812]
We propose a novel view of clinical variable measurement frequency from a predictive modeling perspective.
The prediction variance is estimated by sampling the conditional hidden space in variational models and can be approximated deterministically by the delta method.
We test our ideas on a public ICU dataset with a deterioration prediction task and study the relation between variance SHAP and measurement time intervals.
arXiv Detail & Related papers (2024-02-09T22:14:40Z)
- T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem.
Detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.
We propose T-Cal, a minimax test for calibration based on a debiased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE).
arXiv Detail & Related papers (2022-03-03T16:58:54Z)
- On the relationship between calibrated predictors and unbiased volume estimation [18.96093589337619]
Machine learning driven medical image segmentation has become standard in medical image analysis.
However, deep learning models are prone to overconfident predictions.
This has led to a renewed focus on calibrated predictions in the medical imaging and broader machine learning communities.
arXiv Detail & Related papers (2021-12-23T14:22:19Z)
- Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score [59.75318183140857]
Estimating causal effects from randomized experiments is central to clinical research.
Most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control.
arXiv Detail & Related papers (2020-12-17T21:10:10Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
- Individual Calibration with Randomized Forecasting [116.2086707626651]
We show that calibration for individual samples is possible in the regression setup if the predictions are randomized.
We design a training objective to enforce individual calibration and use it to train randomized regression functions.
arXiv Detail & Related papers (2020-06-18T05:53:10Z)
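Several entries above (ForeCal, T-Cal) evaluate calibration with the Expected Calibration Error (ECE). For reference, here is a minimal equal-width-binned ECE sketch for binary problems; the bin count and binning scheme are common defaults, not choices prescribed by those papers.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE sketch for binary problems.

    Bins the predicted positive-class probabilities into equal-width bins
    and returns the bin-weighted average gap between the mean predicted
    probability and the observed positive rate in each bin.

    probs:  predicted probabilities of the positive class, shape (N,).
    labels: binary ground-truth labels (0 or 1), shape (N,).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        if i == n_bins - 1:
            in_bin = (probs >= lo) & (probs <= hi)  # include p = 1.0 in the last bin
        else:
            in_bin = (probs >= lo) & (probs < hi)
        if not in_bin.any():
            continue
        mean_pred = probs[in_bin].mean()
        positive_rate = labels[in_bin].mean()
        ece += in_bin.mean() * abs(mean_pred - positive_rate)
    return ece
```

ForeCal uses this kind of binned gap as its post-hoc target metric, while T-Cal works with a debiased estimator of a related $\ell_2$ version of the same quantity.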
This list is automatically generated from the titles and abstracts of the papers on this site.