Performance metrics for intervention-triggering prediction models do not
reflect an expected reduction in outcomes from using the model
- URL: http://arxiv.org/abs/2006.01752v1
- Date: Tue, 2 Jun 2020 16:26:49 GMT
- Title: Performance metrics for intervention-triggering prediction models do not
reflect an expected reduction in outcomes from using the model
- Authors: Alejandro Schuler, Aashish Bhardwaj, Vincent Liu
- Abstract summary: Clinical researchers often select among and evaluate risk prediction models.
Standard metrics calculated from retrospective data are only related to model utility under certain assumptions.
When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
- Score: 71.9860741092209
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clinical researchers often select among and evaluate risk prediction models
using standard machine learning metrics based on confusion matrices. However,
if these models are used to allocate interventions to patients, standard
metrics calculated from retrospective data are only related to model utility
(in terms of reductions in outcomes) under certain assumptions. When
predictions are delivered repeatedly throughout time (e.g. in a patient
encounter), the relationship between standard metrics and utility is further
complicated. Several kinds of evaluations have been used in the literature, but
it has not been clear what the target of estimation is in each evaluation. We
synthesize these approaches, determine what is being estimated in each of them,
and discuss under what assumptions those estimates are valid. We demonstrate
our insights using simulated data as well as real data used in the design of an
early warning system. Our theoretical and empirical results show that
evaluations without interventional data either do not estimate meaningful
quantities, require strong assumptions, or are limited to estimating best-case
scenario bounds.
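The abstract's central point, that confusion-matrix metrics computed from retrospective data do not by themselves determine the outcome reduction from deploying the model, can be illustrated with a small simulation. This is not from the paper; all parameters, including the assumed intervention effectiveness, are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Retrospective data: baseline risk and outcomes without any intervention
risk = rng.beta(2, 8, size=n)            # each patient's true outcome probability
y = rng.binomial(1, risk)                # observed outcomes (no intervention)
score = risk + rng.normal(0, 0.1, n)     # model score correlated with true risk
alert = score > 0.35                     # alert threshold for triggering intervention

# Standard retrospective confusion-matrix metrics
tp = np.sum(alert & (y == 1))
fp = np.sum(alert & (y == 0))
fn = np.sum(~alert & (y == 1))
sensitivity = tp / (tp + fn)
ppv = tp / (tp + fp)

# Expected outcome reduction if each alert triggers an intervention that
# multiplies risk by (1 - effectiveness). The metrics above are fixed, yet
# the utility varies with effectiveness -- a quantity that retrospective
# data alone cannot identify.
for effectiveness in (0.1, 0.5, 0.9):
    reduction = effectiveness * np.sum(risk[alert]) / np.sum(risk)
    print(f"sens={sensitivity:.2f} ppv={ppv:.2f} "
          f"eff={effectiveness:.1f} -> outcome reduction={reduction:.2%}")
```

The same model, with the same sensitivity and PPV, yields very different outcome reductions depending on an interventional quantity the retrospective evaluation never observes.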
Related papers
- Leveraging Variational Autoencoders for Parameterized MMSE Estimation [10.141454378473972]
We propose a variational autoencoder-based framework for parameterizing a conditional linear minimum mean squared error estimator.
The derived estimator is shown to approximate the minimum mean squared error estimator by utilizing the variational autoencoder as a generative prior for the estimation problem.
We conduct a rigorous analysis by bounding the difference between the proposed and the minimum mean squared error estimator.
arXiv Detail & Related papers (2023-07-11T15:41:34Z)
- Doing Great at Estimating CATE? On the Neglected Assumptions in Benchmark Comparisons of Treatment Effect Estimators [91.3755431537592]
We show that even in arguably the simplest setting, estimation under ignorability assumptions can be misleading.
We consider two popular machine learning benchmark datasets for evaluation of heterogeneous treatment effect estimators.
We highlight that the inherent characteristics of the benchmark datasets favor some algorithms over others.
arXiv Detail & Related papers (2021-07-28T13:21:27Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models that perform inference directly from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Model-based metrics: Sample-efficient estimates of predictive model subpopulation performance [11.994417027132807]
Machine learning models, now commonly developed to screen, diagnose, or predict health conditions, are evaluated with a variety of performance metrics.
Subpopulation performance metrics are typically computed using only data from that subgroup, resulting in higher variance estimates for smaller groups.
We propose using an evaluation model, a model that describes the conditional distribution of the predictive model score, to form model-based metric (MBM) estimates.
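A minimal sketch of the variance problem this paper targets (not the paper's MBM estimator): a plug-in metric computed only from a small subgroup's data fluctuates far more across samples than the same metric computed on a large group:

```python
import numpy as np

rng = np.random.default_rng(1)

def auc(scores, labels):
    # Probability a random positive outranks a random negative (Mann-Whitney U)
    pos, neg = scores[labels == 1], scores[labels == 0]
    return (pos[:, None] > neg[None, :]).mean()

def sample(n):
    # Positives score higher on average; same score distribution at every n
    labels = rng.binomial(1, 0.3, n)
    scores = rng.normal(labels.astype(float), 1.0)
    return scores, labels

def sampling_sd(n, reps=200):
    # Spread of the plug-in AUC estimate across repeated samples of size n
    return np.std([auc(*sample(n)) for _ in range(reps)])

print(f"SD of AUC estimate, n=2000: {sampling_sd(2000):.3f}")
print(f"SD of AUC estimate, n=100:  {sampling_sd(100):.3f}")
```

The paper's proposal replaces the plug-in computation with an evaluation model fit on all of the data; this sketch only shows the small-subgroup variance it is meant to reduce.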
arXiv Detail & Related papers (2021-04-25T19:06:34Z)
- Semi-supervised learning and the question of true versus estimated propensity scores [0.456877715768796]
We propose a simple procedure that reconciles the strong intuition that a known propensity function should be useful for estimating treatment effects.
Further, simulation studies suggest that direct regression may be preferable to inverse-propensity weight estimators in many circumstances.
arXiv Detail & Related papers (2020-09-14T04:13:12Z)
- Impact of Medical Data Imprecision on Learning Results [9.379890125442333]
We study the impact of imprecision on prediction results in a healthcare application.
A pre-trained model is used to predict future state of hyperthyroidism for patients.
arXiv Detail & Related papers (2020-07-24T06:54:57Z)
- Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data, but some of them are unavailable to the model at runtime.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)
- Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators of the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
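As an illustration of the cross-fitting idea, here is a generic doubly-robust (AIPW) sketch on simulated data, not the specific estimators benchmarked in the paper; it assumes scikit-learn is available and uses simple linear nuisance models in place of the paper's machine learning learners:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(2)
n = 4000
X = rng.normal(size=(n, 3))
propensity = 1 / (1 + np.exp(-X[:, 0]))      # treatment assignment depends on X
A = rng.binomial(1, propensity)
tau = 1.0                                     # true average causal effect
Y = X.sum(axis=1) + tau * A + rng.normal(size=n)

# Cross-fitting: nuisance models (propensity and outcome regressions) are fit
# on one fold and evaluated on the held-out fold, so each unit's influence-
# function value uses out-of-fold nuisance estimates.
psi = np.empty(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    e = LogisticRegression().fit(X[train], A[train]).predict_proba(X[test])[:, 1]
    m1 = LinearRegression().fit(X[train][A[train] == 1], Y[train][A[train] == 1])
    m0 = LinearRegression().fit(X[train][A[train] == 0], Y[train][A[train] == 0])
    mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
    # AIPW (doubly-robust) influence-function values
    psi[test] = (mu1 - mu0
                 + A[test] * (Y[test] - mu1) / e
                 - (1 - A[test]) * (Y[test] - mu0) / (1 - e))

print("cross-fit AIPW estimate of the ACE:", psi.mean())
```

Averaging the influence-function values gives the ACE estimate; with correctly specified nuisance models it recovers the true effect of 1.0 up to sampling noise.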
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
- Uncertainty estimation for classification and risk prediction on medical tabular data [0.0]
This work advances the understanding of uncertainty estimation for classification and risk prediction on medical data.
In a data-scarce field such as healthcare, the ability to measure the uncertainty of a model's prediction could potentially lead to improved effectiveness of decision support tools.
arXiv Detail & Related papers (2020-04-13T08:46:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.