Deep Representation Learning of Electronic Health Records to Unlock
Patient Stratification at Scale
- URL: http://arxiv.org/abs/2003.06516v2
- Date: Sat, 18 Jul 2020 10:28:54 GMT
- Title: Deep Representation Learning of Electronic Health Records to Unlock
Patient Stratification at Scale
- Authors: Isotta Landi, Benjamin S. Glicksberg, Hao-Chih Lee, Sarah Cherng,
Giulia Landi, Matteo Danieletto, Joel T. Dudley, Cesare Furlanello, and
Riccardo Miotto
- Abstract summary: We present an unsupervised framework based on deep learning to process heterogeneous EHRs.
We derive patient representations that can efficiently and effectively enable patient stratification at scale.
- Score: 0.5498849973527224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deriving disease subtypes from electronic health records (EHRs) can guide
next-generation personalized medicine. However, challenges in summarizing and
representing patient data prevent widespread practice of scalable EHR-based
stratification analysis. Here we present an unsupervised framework based on
deep learning to process heterogeneous EHRs and derive patient representations
that can efficiently and effectively enable patient stratification at scale. We
considered EHRs of 1,608,741 patients from a diverse hospital cohort comprising
a total of 57,464 clinical concepts. We introduce a representation learning
model based on word embeddings, convolutional neural networks, and autoencoders
(i.e., ConvAE) to transform patient trajectories into low-dimensional latent
vectors. We evaluated these representations as broadly enabling patient
stratification by applying hierarchical clustering to different multi-disease
and disease-specific patient cohorts. ConvAE significantly outperformed several
baselines in a clustering task to identify patients with different complex
conditions, with 2.61 entropy and 0.31 purity average scores. When applied to
stratify patients within a certain condition, ConvAE led to various clinically
relevant subtypes for different disorders, including type 2 diabetes,
Parkinson's disease and Alzheimer's disease, largely related to comorbidities,
disease progression, and symptom severity. With these results, we demonstrate
that ConvAE can generate patient representations that lead to clinically
meaningful insights. This scalable framework can help better understand varying
etiologies in heterogeneous sub-populations and unlock patterns for EHR-based
research in the realm of personalized medicine.
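The abstract reports clustering quality as entropy and purity scores. As a minimal sketch of how such scores are commonly computed, the Python below uses the standard textbook definitions: purity is the fraction of patients that fall in their cluster's majority class, and entropy is the size-weighted average of the per-cluster label entropy. The function names, the base-2 logarithm, and the toy labels are assumptions made for illustration; the abstract does not state the exact formulas the authors used.

```python
from collections import Counter, defaultdict
from math import log2


def group_by_cluster(cluster_ids, true_labels):
    """Collect the true labels of the items assigned to each cluster."""
    groups = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        groups[cluster].append(label)
    return groups


def purity(cluster_ids, true_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    groups = group_by_cluster(cluster_ids, true_labels)
    majority_total = sum(max(Counter(labels).values()) for labels in groups.values())
    return majority_total / len(true_labels)


def entropy(cluster_ids, true_labels):
    """Size-weighted average of the label entropy (base 2) within each cluster."""
    groups = group_by_cluster(cluster_ids, true_labels)
    n = len(true_labels)
    total = 0.0
    for labels in groups.values():
        size = len(labels)
        # Shannon entropy of the label distribution inside this cluster.
        h = -sum((k / size) * log2(k / size) for k in Counter(labels).values())
        total += (size / n) * h
    return total


# Hypothetical toy example: six patients, two clusters, three conditions.
clusters = [0, 0, 0, 1, 1, 1]
conditions = ["T2D", "T2D", "PD", "PD", "PD", "AD"]
print(purity(clusters, conditions), entropy(clusters, conditions))
```

Under these definitions, higher purity and lower entropy both indicate clusters that align more closely with the known conditions.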
Related papers
- TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data [42.96821770394798]
TACCO is a novel framework that jointly discovers clusters of clinical concepts and patient visits based on a hypergraph modeling of EHR data.
We conduct experiments on the public MIMIC-III dataset and Emory internal CRADLE dataset over the downstream clinical tasks of phenotype classification and cardiovascular risk prediction.
In-depth model analysis, clustering results analysis, and clinical case studies further validate the improved utilities and insightful interpretations delivered by TACCO.
arXiv Detail & Related papers (2024-06-14T14:18:38Z)
- Clustering of Disease Trajectories with Explainable Machine Learning: A Case Study on Postoperative Delirium Phenotypes [13.135589459700865]
We propose an approach that combines supervised machine learning for personalized POD risk prediction with unsupervised clustering techniques to uncover potential POD phenotypes.
We show that clustering patients in the SHAP feature importance space successfully recovers the true underlying phenotypes, outperforming clustering in the raw feature space.
arXiv Detail & Related papers (2024-05-06T10:05:46Z)
- Hypergraph Convolutional Networks for Fine-grained ICU Patient Similarity Analysis and Risk Prediction [15.06049250330114]
The Intensive Care Unit (ICU) is one of the most important parts of a hospital, which admits critically ill patients and provides continuous monitoring and treatment.
Various patient outcome prediction methods have been attempted to assist healthcare professionals in clinical decision-making.
arXiv Detail & Related papers (2023-08-24T05:26:56Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Toward Cohort Intelligence: A Universal Cohort Representation Learning Framework for Electronic Health Record Analysis [15.137213823470544]
We propose a universal COhort Representation lEarning (CORE) framework to augment EHR utilization by leveraging the fine-grained cohort information among patients.
CORE is readily applicable to diverse backbone models, serving as a universal plug-in framework to infuse cohort information into healthcare methods for boosted performance.
arXiv Detail & Related papers (2023-04-10T09:12:37Z)
- Longitudinal patient stratification of electronic health records with flexible adjustment for clinical outcomes [0.7874708385247353]
We develop a recurrent neural network autoencoder to cluster EHR data using reconstruction, outcome, and clustering losses.
We demonstrate the model performance on 29,229 diabetes patients, showing it finds clusters of patients with both different trajectories and different outcomes.
arXiv Detail & Related papers (2021-11-11T11:21:39Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and its generality across different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z)
- COMPOSE: Cross-Modal Pseudo-Siamese Network for Patient Trial Matching [70.08786840301435]
We propose CrOss-Modal PseudO-SiamEse network (COMPOSE) to address these challenges for patient-trial matching.
Experiment results show COMPOSE can reach 98.0% AUC on patient-criteria matching and 83.7% accuracy on patient-trial matching.
arXiv Detail & Related papers (2020-06-15T21:01:33Z)
- Temporal Phenotyping using Deep Predictive Clustering of Disease Progression [97.88605060346455]
We develop a deep learning approach for clustering time-series data, where each cluster comprises patients who share similar future outcomes of interest.
Experiments on two real-world datasets show that our model achieves superior clustering performance over state-of-the-art benchmarks.
arXiv Detail & Related papers (2020-06-15T20:48:43Z)
- Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes [102.02672176520382]
Comorbid diseases co-occur and progress via complex temporal patterns that vary among individuals.
In electronic health records we can observe the different diseases a patient has, but can only infer the temporal relationship between each co-morbid condition.
We develop deep diffusion processes to model "dynamic comorbidity networks".
arXiv Detail & Related papers (2020-01-08T15:47:08Z)