Performance metrics for intervention-triggering prediction models do not
reflect an expected reduction in outcomes from using the model
- URL: http://arxiv.org/abs/2006.01752v1
- Date: Tue, 2 Jun 2020 16:26:49 GMT
- Title: Performance metrics for intervention-triggering prediction models do not
reflect an expected reduction in outcomes from using the model
- Authors: Alejandro Schuler, Aashish Bhardwaj, Vincent Liu
- Abstract summary: Clinical researchers often select among and evaluate risk prediction models.
Standard metrics calculated from retrospective data are only related to model utility under certain assumptions.
When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
- Score: 71.9860741092209
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clinical researchers often select among and evaluate risk prediction models
using standard machine learning metrics based on confusion matrices. However,
if these models are used to allocate interventions to patients, standard
metrics calculated from retrospective data are only related to model utility
(in terms of reductions in outcomes) under certain assumptions. When
predictions are delivered repeatedly throughout time (e.g. in a patient
encounter), the relationship between standard metrics and utility is further
complicated. Several kinds of evaluations have been used in the literature, but
it has not been clear what the target of estimation is in each evaluation. We
synthesize these approaches, determine what is being estimated in each of them,
and discuss under what assumptions those estimates are valid. We demonstrate
our insights using simulated data as well as real data used in the design of an
early warning system. Our theoretical and empirical results show that
evaluations without interventional data either do not estimate meaningful
quantities, require strong assumptions, or are limited to estimating best-case
scenario bounds.
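To make the gap concrete, the short simulation below is a minimal sketch (not the authors' code or data) comparing the sensitivity of an alert threshold, computed from retrospective data with no interventions, against the relative outcome reduction achieved when the same alerts trigger an intervention. The risk distribution, score noise, alert threshold, and effectiveness values are all illustrative assumptions.

```python
# Minimal sketch: confusion-matrix metrics from retrospective data vs. the
# outcome reduction achieved when alerts trigger interventions.
# All parameters (risk distribution, threshold, effectiveness) are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

risk = rng.beta(2, 8, size=n)                         # latent probability of the adverse outcome
score = np.clip(risk + rng.normal(0, 0.1, n), 0, 1)   # imperfect model prediction of that risk
alert = score >= 0.35                                 # intervention-triggering threshold (assumed)

# Retrospective world: no interventions are delivered.
outcome_retro = rng.random(n) < risk
sensitivity = (alert & outcome_retro).sum() / outcome_retro.sum()
ppv = (alert & outcome_retro).sum() / alert.sum()
print(f"sensitivity={sensitivity:.3f}  PPV={ppv:.3f}")

# Deployment world: each alert triggers an intervention that prevents the
# outcome with probability `effectiveness` (not estimable from retrospective data).
for effectiveness in (1.0, 0.5, 0.2):
    prevented = alert & (rng.random(n) < effectiveness)
    outcome_deploy = outcome_retro & ~prevented
    reduction = 1 - outcome_deploy.mean() / outcome_retro.mean()
    print(f"effectiveness={effectiveness:.1f}  relative outcome reduction={reduction:.3f}")
```

Under these assumptions the relative reduction is approximately effectiveness × sensitivity, so sensitivity by itself yields only a best-case bound on the benefit of deploying the model, in line with the abstract's conclusion.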
Related papers
- Deep Learning Methods for the Noniterative Conditional Expectation G-Formula for Causal Inference from Complex Observational Data [3.0958655016140892]
The g-formula can be used to estimate causal effects of sustained treatment strategies using observational data.
Parametric models are subject to model misspecification, which may result in biased causal estimates.
We propose a unified deep learning framework for the NICE g-formula estimator.
arXiv Detail & Related papers (2024-10-28T21:00:46Z)
- Ranking and Combining Latent Structured Predictive Scores without Labeled Data [2.5064967708371553]
This paper introduces a novel structured unsupervised ensemble learning model (SUEL).
It exploits the dependency among a set of predictors with continuous predictive scores, ranks the predictors without labeled data, and combines them into a weighted ensemble score.
The efficacy of the proposed methods is rigorously assessed through both simulation studies and a real-world application to risk-gene discovery.
arXiv Detail & Related papers (2024-08-14T20:14:42Z)
- Doing Great at Estimating CATE? On the Neglected Assumptions in Benchmark Comparisons of Treatment Effect Estimators [91.3755431537592]
We show that even in arguably the simplest setting, estimation under ignorability assumptions can be misleading.
We consider two popular machine learning benchmark datasets for evaluation of heterogeneous treatment effect estimators.
We highlight that the inherent characteristics of the benchmark datasets favor some algorithms over others.
arXiv Detail & Related papers (2021-07-28T13:21:27Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce an importance-guided stochastic gradient descent (IGSGD) method to train models that perform inference directly from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Semi-supervised learning and the question of true versus estimated propensity scores [0.456877715768796]
We propose a simple procedure that reconciles the strong intuition that a known propensity function should be useful for estimating treatment effects.
Further, simulation studies suggest that direct regression may be preferable to inverse-propensity weight estimators in many circumstances.
arXiv Detail & Related papers (2020-09-14T04:13:12Z)
- Impact of Medical Data Imprecision on Learning Results [9.379890125442333]
We study the impact of imprecision on prediction results in a healthcare application.
A pre-trained model is used to predict the future state of hyperthyroidism for patients.
arXiv Detail & Related papers (2020-07-24T06:54:57Z)
- Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data, but some of them cannot be used by the prediction model at runtime.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)
- Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators of the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage; an illustrative cross-fit sketch appears after this list.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
- Uncertainty estimation for classification and risk prediction on medical tabular data [0.0]
This work advances the understanding of uncertainty estimation for classification and risk prediction on medical data.
In a data-scarce field such as healthcare, the ability to measure the uncertainty of a model's prediction could potentially lead to improved effectiveness of decision support tools.
arXiv Detail & Related papers (2020-04-13T08:46:41Z)
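For readers unfamiliar with the cross-fit doubly-robust estimator referenced in the cross-fit entry above, the sketch below shows one common form (K-fold cross-fit AIPW) under standard ignorability and overlap assumptions. It is an illustrative implementation with assumed model choices, not code from the cited study.

```python
# Illustrative K-fold cross-fit AIPW estimator of the average causal effect (ACE)
# of a binary treatment; a sketch under ignorability/overlap, with assumed models.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

def crossfit_aipw(X, a, y, n_splits=2, seed=0):
    """Cross-fit AIPW estimate of E[Y(1) - Y(0)] and its standard error."""
    psi = np.empty(len(y), dtype=float)
    for train, test in KFold(n_splits=n_splits, shuffle=True, random_state=seed).split(X):
        # Nuisance models are fit on the training folds only (cross-fitting).
        e_model = GradientBoostingClassifier().fit(X[train], a[train])
        m1 = GradientBoostingRegressor().fit(X[train][a[train] == 1], y[train][a[train] == 1])
        m0 = GradientBoostingRegressor().fit(X[train][a[train] == 0], y[train][a[train] == 0])
        # Evaluate the doubly-robust influence function on the held-out fold.
        e = np.clip(e_model.predict_proba(X[test])[:, 1], 0.01, 0.99)   # propensity scores
        mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
        psi[test] = (mu1 - mu0
                     + a[test] * (y[test] - mu1) / e
                     - (1 - a[test]) * (y[test] - mu0) / (1 - e))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(len(psi))

# Usage (hypothetical arrays): ace, se = crossfit_aipw(X, a, y) with covariate
# matrix X, binary treatment a in {0, 1}, and outcome y.
```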
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.