Sequential Inference of Hospitalization Electronic Health Records Using Probabilistic Models
- URL: http://arxiv.org/abs/2403.19011v2
- Date: Wed, 24 Apr 2024 15:06:26 GMT
- Title: Sequential Inference of Hospitalization Electronic Health Records Using Probabilistic Models
- Authors: Alan D. Kaplan, Priyadip Ray, John D. Greene, Vincent X. Liu,
- Abstract summary: In this work we design a probabilistic unsupervised model for multiple arbitrary-length sequences contained in hospitalization Electronic Health Record (EHR) data.
The model uses a latent variable structure and captures complex relationships between medications, diagnoses, laboratory tests, neurological assessments, and medications.
Inference algorithms are derived that use partial data to infer properties of the complete sequences, including their length and presence of specific values.
- Score: 3.2988476179015005
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the dynamic hospital setting, decision support can be a valuable tool for improving patient outcomes. Data-driven inference of future outcomes is challenging in this dynamic setting, where long sequences such as laboratory tests and medications are updated frequently. This is due in part to heterogeneity of data types and mixed-sequence types contained in variable length sequences. In this work we design a probabilistic unsupervised model for multiple arbitrary-length sequences contained in hospitalization Electronic Health Record (EHR) data. The model uses a latent variable structure and captures complex relationships between medications, diagnoses, laboratory tests, neurological assessments, and medications. It can be trained on original data, without requiring any lossy transformations or time binning. Inference algorithms are derived that use partial data to infer properties of the complete sequences, including their length and presence of specific values. We train this model on data from subjects receiving medical care in the Kaiser Permanente Northern California integrated healthcare delivery system. The results are evaluated against held-out data for predicting the length of sequences and presence of Intensive Care Unit (ICU) in hospitalization bed sequences. Our method outperforms a baseline approach, showing that in these experiments the trained model captures information in the sequences that is informative of their future values.
Related papers
- COPER: Continuous Patient State Perceiver [13.735956129637945]
We propose a novel COntinuous patient state PERceiver model, called COPER, to cope with irregular time-series in EHRs.
neural ordinary differential equations (ODEs) help COPER to generate regular time-series to feed to Perceiver model.
To evaluate the performance of the proposed model, we use in-hospital mortality prediction task on MIMIC-III dataset.
arXiv Detail & Related papers (2022-08-05T14:32:57Z) - A Variational Autoencoder for Heterogeneous Temporal and Longitudinal
Data [0.3749861135832073]
Recently proposed extensions to VAEs that can handle temporal and longitudinal data have applications in healthcare, behavioural modelling, and predictive maintenance.
We propose the heterogeneous longitudinal VAE (HL-VAE) that extends the existing temporal and longitudinal VAEs to heterogeneous data.
HL-VAE provides efficient inference for high-dimensional datasets and includes likelihood models for continuous, count, categorical, and ordinal data.
arXiv Detail & Related papers (2022-04-20T10:18:39Z) - Unsupervised Probabilistic Models for Sequential Electronic Health
Records [3.8015092217142223]
The model consists of a layered set of latent variables that encode underlying structure in the data.
We train this model on episodic data from subjects receiving medical care in the Kaiser Permanente Northern California integrated healthcare delivery system.
The resulting properties of the trained model generate novel insight from these complex and multifaceted data.
arXiv Detail & Related papers (2022-04-15T02:11:44Z) - Mixed Effects Neural ODE: A Variational Approximation for Analyzing the
Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z) - Sequential Diagnosis Prediction with Transformer and Ontological
Representation [35.88195694025553]
We propose an end-to-end robust transformer-based model called SETOR to handle irregular intervals between a patient's visits with admitted timestamps and length of stay in each visit.
Experiments conducted on two real-world healthcare datasets show that, our sequential diagnoses prediction model SETOR achieves better predictive results than previous state-of-the-art approaches.
arXiv Detail & Related papers (2021-09-07T13:09:55Z) - CSDI: Conditional Score-based Diffusion Models for Probabilistic Time
Series Imputation [107.63407690972139]
Conditional Score-based Diffusion models for Imputation (CSDI) is a novel time series imputation method that utilizes score-based diffusion models conditioned on observed data.
CSDI improves by 40-70% over existing probabilistic imputation methods on popular performance metrics.
In addition, C reduces the error by 5-20% compared to the state-of-the-art deterministic imputation methods.
arXiv Detail & Related papers (2021-07-07T22:20:24Z) - DeepRite: Deep Recurrent Inverse TreatmEnt Weighting for Adjusting
Time-varying Confounding in Modern Longitudinal Observational Data [68.29870617697532]
We propose Deep Recurrent Inverse TreatmEnt weighting (DeepRite) for time-varying confounding in longitudinal data.
DeepRite is shown to recover the ground truth from synthetic data, and estimate unbiased treatment effects from real data.
arXiv Detail & Related papers (2020-10-28T15:05:08Z) - BiteNet: Bidirectional Temporal Encoder Network to Predict Medical
Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z) - Trajectories, bifurcations and pseudotime in large clinical datasets:
applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values.
The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z) - Longitudinal Variational Autoencoder [1.4680035572775534]
A common approach to analyse high-dimensional data that contains missing values is to learn a low-dimensional representation using variational autoencoders (VAEs)
Standard VAEs assume that the learnt representations are i.i.d., and fail to capture the correlations between the data samples.
We propose the Longitudinal VAE (L-VAE), that uses a multi-output additive Gaussian process (GP) prior to extend the VAE's capability to learn structured low-dimensional representations.
Our approach can simultaneously accommodate both time-varying shared and random effects, produce structured low-dimensional representations
arXiv Detail & Related papers (2020-06-17T10:30:14Z) - Temporal Phenotyping using Deep Predictive Clustering of Disease
Progression [97.88605060346455]
We develop a deep learning approach for clustering time-series data, where each cluster comprises patients who share similar future outcomes of interest.
Experiments on two real-world datasets show that our model achieves superior clustering performance over state-of-the-art benchmarks.
arXiv Detail & Related papers (2020-06-15T20:48:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.