A Comprehensive Benchmark for COVID-19 Predictive Modeling Using
Electronic Health Records in Intensive Care
- URL: http://arxiv.org/abs/2209.07805v4
- Date: Tue, 23 Jan 2024 17:14:20 GMT
- Title: A Comprehensive Benchmark for COVID-19 Predictive Modeling Using
Electronic Health Records in Intensive Care
- Authors: Junyi Gao, Yinghao Zhu, Wenqing Wang, Yasha Wang, Wen Tang, Ewen M.
Harrison, Liantao Ma
- Abstract summary: We propose two clinical prediction tasks, Outcome-specific length-of-stay prediction and Early mortality prediction for COVID-19 patients in intensive care units.
The two tasks are adapted from the naive length-of-stay and mortality prediction tasks to accommodate the clinical practice for COVID-19 patients.
We propose fair, detailed, open-source data-preprocessing pipelines and evaluate 17 state-of-the-art predictive models on two tasks.
- Score: 15.64030213048907
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The COVID-19 pandemic has posed a heavy burden to the healthcare system
worldwide and caused huge social disruption and economic loss. Many deep
learning models have been proposed to conduct clinical predictive tasks such as
mortality prediction for COVID-19 patients in intensive care units using
Electronic Health Record (EHR) data. Despite their initial success in certain
clinical applications, there is currently a lack of benchmarking results to
achieve a fair comparison so that we can select the optimal model for clinical
use. Furthermore, there is a discrepancy between the formulation of traditional
prediction tasks and real-world clinical practice in intensive care. To fill
these gaps, we propose two clinical prediction tasks, Outcome-specific
length-of-stay prediction and Early mortality prediction for COVID-19 patients
in intensive care units. The two tasks are adapted from the naive
length-of-stay and mortality prediction tasks to accommodate the clinical
practice for COVID-19 patients. We propose fair, detailed, open-source
data-preprocessing pipelines and evaluate 17 state-of-the-art predictive models
on two tasks, including 5 machine learning models, 6 basic deep learning models
and 6 deep learning predictive models specifically designed for EHR data. We
provide benchmarking results using data from two real-world COVID-19 EHR
datasets. One dataset is publicly available without needing any inquiry and
another dataset can be accessed on request. We provide fair, reproducible
benchmarking results for two tasks. We deploy all experiment results and models
on an online platform. We also allow clinicians and researchers to upload their
data to the platform and get quick prediction results using our trained models.
We hope our efforts can further facilitate deep learning and machine learning
research for COVID-19 predictive modeling.
Related papers
- Recent Advances in Predictive Modeling with Electronic Health Records [71.19967863320647]
utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics.
Deep learning has demonstrated its superiority in various applications, including healthcare.
arXiv Detail & Related papers (2024-02-02T00:31:01Z) - MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data
Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z) - TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic
Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z) - Textual Data Augmentation for Patient Outcomes Prediction [67.72545656557858]
We propose a novel data augmentation method to generate artificial clinical notes in patients' Electronic Health Records.
We fine-tune the generative language model GPT-2 to synthesize labeled text with the original training data.
We evaluate our method on the most common patient outcome, i.e., the 30-day readmission rate.
arXiv Detail & Related papers (2022-11-13T01:07:23Z) - Unsupervised Pre-Training on Patient Population Graphs for Patient-Level
Predictions [48.02011627390706]
Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP) and medical imaging.
In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction.
We find that our proposed graph based pre-training method helps in modeling the data at a population level.
arXiv Detail & Related papers (2022-03-23T17:59:45Z) - Pre-training transformer-based framework on large-scale pediatric claims
data for downstream population-specific tasks [3.1580072841682734]
This study presents the Claim Pre-Training (Claim-PT) framework, a generic pre-training model that first trains on the entire pediatric claims dataset.
The effective knowledge transfer is completed through the task-aware fine-tuning stage.
We conducted experiments on a real-world claims dataset with more than one million patient records.
arXiv Detail & Related papers (2021-06-24T15:25:41Z) - Deep Learning with Heterogeneous Graph Embeddings for Mortality
Prediction from Electronic Health Records [2.2859570135269625]
We train a Heterogeneous Graph Model (HGM) on Electronic Health Record data and use the resulting embedding vector as additional information added to a Conal Neural Network (CNN) model for predicting in-hospital mortality.
We find that adding HGM to a CNN model increases the mortality prediction accuracy up to 4%.
arXiv Detail & Related papers (2020-12-28T02:27:09Z) - Building Deep Learning Models to Predict Mortality in ICU Patients [0.0]
We propose several deep learning models using the same features as the SAPS II score.
Several experiments have been conducted based on the well known clinical dataset Medical Information Mart for Intensive Care III.
arXiv Detail & Related papers (2020-12-11T16:27:04Z) - HOLMES: Health OnLine Model Ensemble Serving for Deep Learning Models in
Intensive Care Units [31.368873375366213]
HOLMES is an online model ensemble serving framework for healthcare applications.
We demonstrate that HOLMES is able to navigate the accuracy/latency tradeoff efficiently, compose the ensemble, and serve the model ensemble pipeline.
HOLMES is tested on risk prediction task on pediatric cardio ICU data with above 95% prediction accuracy and sub-second latency on 64-bed simulation.
arXiv Detail & Related papers (2020-08-10T12:38:46Z) - Individualized Prediction of COVID-19 Adverse outcomes with MLHO [9.197411456718708]
We developed an end-to-end Machine Learning framework that leverages iterative feature and algorithm selection to predict Health outcomes.
We modeled the four adverse outcomes utilizing about 600 features representing patients' pre-COVID health records and demographics.
Our results demonstrated that while demographic variables are important predictors of adverse outcomes after a COVID-19 infection, the incorporation of the past clinical records are vital for a reliable prediction model.
arXiv Detail & Related papers (2020-08-10T02:44:52Z) - Hemogram Data as a Tool for Decision-making in COVID-19 Management:
Applications to Resource Scarcity Scenarios [62.997667081978825]
COVID-19 pandemics has challenged emergency response systems worldwide, with widespread reports of essential services breakdown and collapse of health care structure.
This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients.
Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.