On the explainability of hospitalization prediction on a large COVID-19
patient dataset
- URL: http://arxiv.org/abs/2110.15002v1
- Date: Thu, 28 Oct 2021 10:23:38 GMT
- Title: On the explainability of hospitalization prediction on a large COVID-19
patient dataset
- Authors: Ivan Girardi, Panagiotis Vagenas, Dario Arcos-Díaz, Lydia Bessaï,
Alexander Büsser, Ludovico Furlan, Raffaello Furlan, Mauro Gatti, Andrea
Giovannini, Ellen Hoeven, Chiara Marchiori
- Abstract summary: We develop various AI models to predict hospitalization on a large (over 110$k$) cohort of COVID-19 positive-tested US patients.
Despite high data unbalance, the models reach average precision 0.96-0.98 (0.75-0.85), recall 0.96-0.98 (0.74-0.85), and $F_1$-score 0.97-0.98 (0.79-0.83) on the non-hospitalized (or hospitalized) class.
- Score: 45.82374977939355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop various AI models to predict hospitalization on a large (over
110$k$) cohort of COVID-19 positive-tested US patients, sourced from March 2020
to February 2021. Models range from Random Forest to Neural Network (NN) and
Time Convolutional NN, where the data modalities (tabular and time-dependent)
are combined at different stages (early vs. model fusion).
Despite high data unbalance, the models reach average precision 0.96-0.98
(0.75-0.85), recall 0.96-0.98 (0.74-0.85), and $F_1$-score 0.97-0.98
(0.79-0.83) on the non-hospitalized (or hospitalized) class. Performances do
not significantly drop even when selected lists of features are removed to
study model adaptability to different scenarios. However, a systematic study of
the SHAP feature importance values for the developed models in the different
scenarios shows a large variability across models and use cases. This calls for
even more complete studies on several explainability methods before their
adoption in high-stakes scenarios.
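The per-class figures quoted in the abstract can be illustrated with a minimal sketch. It assumes made-up toy labels (not the paper's data) and a hypothetical helper, `per_class_metrics`; it only shows how precision, recall, and $F_1$ are computed separately for the majority (non-hospitalized) and minority (hospitalized) class, which is why both sets of numbers are reported for unbalanced data.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, F1), treating `positive` as the class of interest."""
    counts = Counter(zip(y_true, y_pred))
    tp = sum(n for (t, p), n in counts.items() if t == positive and p == positive)
    fp = sum(n for (t, p), n in counts.items() if t != positive and p == positive)
    fn = sum(n for (t, p), n in counts.items() if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy, made-up labels (assumption): 0 = non-hospitalized, 1 = hospitalized.
# 90 majority-class and 10 minority-class patients.
y_true = [0] * 90 + [1] * 10
# 2 majority patients are wrongly flagged, 2 minority patients are missed.
y_pred = [0] * 88 + [1] * 2 + [1] * 8 + [0] * 2

metrics_non_hosp = per_class_metrics(y_true, y_pred, positive=0)
metrics_hosp = per_class_metrics(y_true, y_pred, positive=1)

for name, (p, r, f) in [("non-hospitalized", metrics_non_hosp),
                        ("hospitalized", metrics_hosp)]:
    print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

With the same four errors in each direction, the minority class scores visibly lower than the majority class, mirroring the gap between the two ranges reported above.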
Related papers
- Deep State-Space Generative Model For Correlated Time-to-Event Predictions [54.3637600983898]
We propose a deep latent state-space generative model to capture the interactions among different types of correlated clinical events.
Our method also uncovers meaningful insights about the latent correlations among mortality and different types of organ failures.
arXiv Detail & Related papers (2024-07-28T02:42:36Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Learning Clinical Concepts for Predicting Risk of Progression to Severe
COVID-19 [17.781861866125023]
Using data from a major healthcare provider, we develop survival models predicting severe COVID-19 progression.
We develop two sets of high-performance risk scores: (i) an unconstrained model built from all available features; and (ii) a pipeline that learns a small set of clinical concepts before training a risk predictor.
arXiv Detail & Related papers (2022-08-28T02:59:35Z)
- An Interpretable Web-based Glioblastoma Multiforme Prognosis Prediction
Tool using Random Forest Model [1.1024591739346292]
We propose predictive models that estimate GBM patients' health status one year after treatment.
We used the clinical profiles of 467 GBM patients, each consisting of 13 features and two follow-up dates.
Our machine learning models suggest that the top three prognostic factors for GBM patient survival were MGMT gene promoter, the extent of resection, and age.
arXiv Detail & Related papers (2021-08-30T07:56:34Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms the best of various state-of-the-art baselines by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- DeepCOVIDNet: An Interpretable Deep Learning Model for Predictive
Surveillance of COVID-19 Using Heterogeneous Features and their Interactions [2.30238915794052]
We propose a deep learning model to forecast the range of increase in COVID-19 infected cases in future days.
Using data collected from various sources, we estimate the range of increase in infected cases seven days into the future for all U.S. counties.
arXiv Detail & Related papers (2020-07-31T23:37:38Z)
- A General Framework for Survival Analysis and Multi-State Modelling [70.31153478610229]
We use neural ordinary differential equations as a flexible and general method for estimating multi-state survival models.
We show that our model exhibits state-of-the-art performance on popular survival data sets and demonstrate its efficacy in a multi-state setting.
arXiv Detail & Related papers (2020-06-08T19:24:54Z)
- Forecasting the Spread of Covid-19 Under Control Scenarios Using LSTM
and Dynamic Behavioral Models [2.11622808613962]
This study proposes a novel hybrid model which combines a Long short-term memory (LSTM) artificial recurrent neural network with dynamic behavioral models.
The proposed model considers the effect of multiple factors to enhance the accuracy in predicting the number of cases and deaths across the top ten most-affected countries and Australia.
arXiv Detail & Related papers (2020-05-24T10:43:55Z)
- Adaptive Prediction Timing for Electronic Health Records [3.308743964406688]
We introduce a novel, more realistic, approach to generating patient outcome predictions at an adaptive rate.
We use a Recurrent Neural Network (RNN) and a Bayesian embedding layer with a new aggregation method to demonstrate adaptive prediction timing.
At 48 hours after patient admission, our model achieves equal performance compared to its static-windowed counterparts.
arXiv Detail & Related papers (2020-03-05T12:02:44Z)