Benchmarking Predictive Risk Models for Emergency Departments with Large
Public Electronic Health Records
- URL: http://arxiv.org/abs/2111.11017v1
- Date: Mon, 22 Nov 2021 06:51:11 GMT
- Title: Benchmarking Predictive Risk Models for Emergency Departments with Large
Public Electronic Health Records
- Authors: Feng Xie, Jun Zhou, Jin Wee Lee, Mingrui Tan, Siqi Li, Logasan S/O
Rajnthern, Marcel Lucas Chee, Bibhas Chakraborty, An-Kwok Ian Wong, Alon
Dagan, Marcus Eng Hock Ong, Fei Gao, Nan Liu
- Abstract summary: There is an absence of widely accepted ED benchmarks based on large-scale public EHR.
We proposed a public ED benchmark suite and obtained a benchmark dataset containing over 500,000 ED visit episodes from 2011 to 2019.
Our code is open-source so that anyone with access to MIMIC-IV-ED could follow the same steps of data processing, build the benchmarks, and reproduce the experiments.
- Score: 7.928862476020428
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is a continuously growing demand for emergency department (ED) services
across the world, especially during the COVID-19 pandemic. Risk triaging plays a
crucial role in prioritizing limited medical resources for patients who need
them most. Recently, the pervasive use of Electronic Health Records (EHR) has
generated a large volume of stored data, accompanied by vast opportunities for
the development of predictive models which could improve emergency care.
However, there is an absence of widely accepted ED benchmarks based on
large-scale public EHR, which new researchers could easily access. Success in
filling in this gap could enable researchers to start studies on ED more
quickly and conveniently without lengthy data preprocessing and facilitate
comparisons among different studies and methodologies. In this paper, based on
the Medical Information Mart for Intensive Care IV Emergency Department
(MIMIC-IV-ED) database, we proposed a public ED benchmark suite and obtained a
benchmark dataset containing over 500,000 ED visit episodes from 2011 to 2019.
Three ED-based prediction tasks (hospitalization, critical outcomes, and
72-hour ED revisit) were introduced, where various popular methodologies, from
machine learning methods to clinical scoring systems, were implemented, and
their performance was evaluated and compared. Our code is
open-source so that anyone with access to MIMIC-IV-ED could follow the same
steps of data processing, build the benchmarks, and reproduce the experiments.
This study provided insights, suggestions, and protocols for future
researchers to process the raw data and quickly build up models for emergency
care.
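As an illustration of one of the three tasks, the following is a minimal sketch of how a 72-hour ED revisit label might be derived from visit timestamps. The abstract does not state the exact definition used in the benchmark; this sketch assumes a visit is positive when the same patient arrives again within 72 hours of the earlier discharge. The function name `label_72h_revisit` and the record keys `subject_id`, `intime`, and `outtime` are illustrative assumptions, not taken from the authors' code.

```python
from datetime import datetime, timedelta


def label_72h_revisit(visits, window=timedelta(hours=72)):
    """Flag each ED visit whose patient returns within `window` of discharge.

    `visits` is a list of dicts with assumed keys 'subject_id', 'intime',
    and 'outtime' (datetime objects). Returns booleans in input order.
    """
    # Group visit indices by patient.
    by_patient = {}
    for idx, visit in enumerate(visits):
        by_patient.setdefault(visit["subject_id"], []).append(idx)

    labels = [False] * len(visits)
    for indices in by_patient.values():
        # Order each patient's visits by arrival time.
        indices.sort(key=lambda i: visits[i]["intime"])
        # Compare each visit with the same patient's next visit.
        for cur, nxt in zip(indices, indices[1:]):
            gap = visits[nxt]["intime"] - visits[cur]["outtime"]
            labels[cur] = timedelta(0) <= gap <= window
    return labels


# Toy records (fabricated for illustration only, not real patient data).
visits = [
    {"subject_id": 1, "intime": datetime(2015, 1, 1, 8), "outtime": datetime(2015, 1, 1, 12)},
    {"subject_id": 1, "intime": datetime(2015, 1, 3, 9), "outtime": datetime(2015, 1, 3, 11)},
    {"subject_id": 2, "intime": datetime(2015, 1, 1, 10), "outtime": datetime(2015, 1, 1, 15)},
    {"subject_id": 1, "intime": datetime(2015, 2, 1, 9), "outtime": datetime(2015, 2, 1, 10)},
]
print(label_72h_revisit(visits))  # [True, False, False, False]
```

Only the first toy visit is flagged, because the same patient returns 45 hours after discharge; the last visit of each patient is never flagged since no later arrival exists.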
Related papers
- FEDMEKI: A Benchmark for Scaling Medical Foundation Models via Federated Knowledge Injection [83.54960238236548]
FEDMEKI not only preserves data privacy but also enhances the capability of medical foundation models.
FEDMEKI allows medical foundation models to learn from a broader spectrum of medical knowledge without direct data exposure.
arXiv Detail & Related papers (2024-08-17T15:18:56Z)
- Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z)
- EMERGE: Integrating RAG for Improved Multimodal EHR Predictive Modeling [22.94521527609479]
EMERGE is a Retrieval-Augmented Generation driven framework aimed at enhancing multimodal EHR predictive modeling.
Our approach extracts entities from both time-series data and clinical notes by prompting Large Language Models.
The extracted knowledge is then used to generate task-relevant summaries of patients' health statuses.
arXiv Detail & Related papers (2024-05-27T10:53:15Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
This work trains open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- Recent Advances in Predictive Modeling with Electronic Health Records [71.19967863320647]
Utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics.
Deep learning has demonstrated its superiority in various applications, including healthcare.
arXiv Detail & Related papers (2024-02-02T00:31:01Z)
- Prompting Large Language Models for Zero-Shot Clinical Prediction with Structured Longitudinal Electronic Health Record Data [7.815738943706123]
Large Language Models (LLMs) are traditionally tailored for natural language processing.
This research investigates the adaptability of LLMs, like GPT-4, to EHR data.
In response to the longitudinal, sparse, and knowledge-infused nature of EHR data, our prompting approach involves taking into account specific characteristics.
arXiv Detail & Related papers (2024-01-25T20:14:50Z)
- Reliable Generation of Privacy-preserving Synthetic EHR Time Series via Diffusion Models [4.240899165468488]
Electronic Health Records (EHRs) are rich sources of patient-level data, offering valuable resources for medical data analysis.
However, privacy concerns often restrict access to EHRs, hindering downstream analysis.
This study aims to overcome these challenges by generating realistic and privacy-preserving synthetic EHR time series efficiently.
arXiv Detail & Related papers (2023-10-23T18:56:01Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- How to Leverage Multimodal EHR Data for Better Medical Predictions? [13.401754962583771]
The complexity of electronic health records (EHR) data is a challenge for the application of deep learning.
In this paper, we first extract the accompanying clinical notes from EHR and propose a method to integrate these data.
The results on two medical prediction tasks show that our fused model with different data outperforms the state-of-the-art method.
arXiv Detail & Related papers (2021-10-29T13:26:05Z)
- Self-Supervised Graph Learning with Hyperbolic Embedding for Temporal Health Event Prediction [13.24834156675212]
We propose a hyperbolic embedding method with information flow to pre-train medical code representations in a hierarchical structure.
We incorporate these pre-trained representations into a graph neural network to detect disease complications.
We present a new hierarchy-enhanced historical prediction proxy task in our self-supervised learning framework to fully utilize EHR data.
arXiv Detail & Related papers (2021-06-09T00:42:44Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)