IGNITE: Individualized GeNeration of Imputations in Time-series Electronic health records
- URL: http://arxiv.org/abs/2401.04402v1
- Date: Tue, 9 Jan 2024 07:57:21 GMT
- Title: IGNITE: Individualized GeNeration of Imputations in Time-series Electronic health records
- Authors: Ghadeer O. Ghosheh, Jin Li, Tingting Zhu
- Abstract summary: We propose a novel deep-learning model that learns the underlying patient dynamics to generate personalized values conditioning on an individual's demographic characteristics and treatments.
Our proposed model, IGNITE, utilises a conditional dual-variational autoencoder augmented with dual-stage attention to generate missing values for an individual.
We show that IGNITE outperforms state-of-the-art approaches in missing data reconstruction and task prediction.
- Score: 7.451873794596469
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Electronic Health Records present a valuable modality for driving
personalized medicine, where treatment is tailored to fit individual-level
differences. For this purpose, many data-driven machine learning and
statistical models rely on the wealth of longitudinal EHRs to study patients'
physiological and treatment effects. However, longitudinal EHRs tend to be
sparse and highly missing, where missingness could also be informative and
reflect the underlying patient's health status. Therefore, the success of
data-driven models for personalized medicine highly depends on how the EHR data
is represented from physiological data, treatments, and the missing values in
the data. To this end, we propose a novel deep-learning model that learns the
underlying patient dynamics over time across multivariate data to generate
personalized realistic values conditioning on an individual's demographic
characteristics and treatments. Our proposed model, IGNITE (Individualized
GeNeration of Imputations in Time-series Electronic health records), utilises a
conditional dual-variational autoencoder augmented with dual-stage attention to
generate missing values for an individual. In IGNITE, we further propose a
novel individualized missingness mask (IMM), which helps our model generate
values based on the individual's observed data and missingness patterns. We
further extend the use of IGNITE from imputing missingness to a personalized
data synthesizer, where it generates missing EHRs that were never observed
prior or even generates new patients for various applications. We validate our
model on three large publicly available datasets and show that IGNITE
outperforms state-of-the-art approaches in missing data reconstruction and task
prediction.
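The abstract evaluates imputation by missing data reconstruction. As a rough illustration of that general setup only, the sketch below builds a plain binary missingness mask, hides one observed value, fills gaps with a per-variable mean, and scores the held-out entry. It is not the paper's individualized missingness mask (IMM) or the IGNITE model: the function names, the toy record, and the mean imputer are all assumptions made for this example.

```python
import numpy as np

def missingness_mask(x):
    """Return 1.0 where a value is observed and 0.0 where it is missing (NaN)."""
    return (~np.isnan(x)).astype(float)

def masked_mae(x_true, x_imputed, eval_mask):
    """Mean absolute error over the entries selected by eval_mask only."""
    n = eval_mask.sum()
    if n == 0:
        return 0.0
    diff = np.abs(np.nan_to_num(x_true) - x_imputed) * eval_mask
    return float(diff.sum() / n)

# Hypothetical toy record: 4 time steps x 2 variables, NaN marks a missing value.
record = np.array([[80.0, np.nan],
                   [82.0, 36.6],
                   [np.nan, 36.9],
                   [85.0, np.nan]])
mask = missingness_mask(record)

# Hide one observed value so the reconstruction can be scored against the truth.
held_out = record.copy()
held_out[1, 0] = np.nan
eval_mask = mask - missingness_mask(held_out)  # 1.0 only at the hidden entry

# Placeholder imputer (per-variable mean of observed values), standing in
# for a learned generative model.
col_means = np.nanmean(held_out, axis=0)
imputed = np.where(np.isnan(held_out), col_means, held_out)

error = masked_mae(record, imputed, eval_mask)
print(error)
```

Scoring only the deliberately hidden entries keeps the metric well defined, because the true values of originally missing entries are unknown.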
Related papers
- Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z)
- Recent Advances in Predictive Modeling with Electronic Health Records [71.19967863320647]
Utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics.
Deep learning has demonstrated its superiority in various applications, including healthcare.
arXiv Detail & Related papers (2024-02-02T00:31:01Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Textual Data Augmentation for Patient Outcomes Prediction [67.72545656557858]
We propose a novel data augmentation method to generate artificial clinical notes in patients' Electronic Health Records.
We fine-tune the generative language model GPT-2 to synthesize labeled text with the original training data.
We evaluate our method on the most common patient outcome, i.e., the 30-day readmission rate.
arXiv Detail & Related papers (2022-11-13T01:07:23Z)
- Integrated Convolutional and Recurrent Neural Networks for Health Risk Prediction using Patient Journey Data with Many Missing Values [9.418011774179794]
This paper proposes a novel end-to-end approach to modeling EHR patient journey data with Integrated Convolutional and Recurrent Neural Networks.
Our model can capture both long- and short-term temporal patterns within each patient journey and effectively handle the high degree of missingness in EHR data without any imputation data generation.
arXiv Detail & Related papers (2022-11-11T07:36:18Z)
- COPER: Continuous Patient State Perceiver [13.735956129637945]
We propose a novel COntinuous patient state PERceiver model, called COPER, to cope with irregular time-series in EHRs.
Neural ordinary differential equations (ODEs) help COPER generate regular time-series to feed to the Perceiver model.
To evaluate the performance of the proposed model, we use the in-hospital mortality prediction task on the MIMIC-III dataset.
arXiv Detail & Related papers (2022-08-05T14:32:57Z)
- Unsupervised Pre-Training on Patient Population Graphs for Patient-Level Predictions [48.02011627390706]
Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP) and medical imaging.
In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction.
We find that our proposed graph based pre-training method helps in modeling the data at a population level.
arXiv Detail & Related papers (2022-03-23T17:59:45Z)
- SANSformers: Self-Supervised Forecasting in Electronic Health Records with Attention-Free Models [48.07469930813923]
This work aims to forecast the demand for healthcare services, by predicting the number of patient visits to healthcare facilities.
We introduce SANSformer, an attention-free sequential model designed with specific inductive biases to cater for the unique characteristics of EHR data.
Our results illuminate the promising potential of tailored attention-free models and self-supervised pretraining in refining healthcare utilization predictions across various patient demographics.
arXiv Detail & Related papers (2021-08-31T08:23:56Z)
- Handling Non-ignorably Missing Features in Electronic Health Records Data Using Importance-Weighted Autoencoders [8.518166245293703]
We propose a novel extension of VAEs called Importance-Weighted Autoencoders (IWAEs) to flexibly handle Missing Not At Random patterns in the Physionet data.
Our proposed method models the missingness mechanism using an embedded neural network, eliminating the need to specify the exact form of the missingness mechanism a priori.
arXiv Detail & Related papers (2021-01-18T22:53:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.