Longitudinal patient stratification of electronic health records with
flexible adjustment for clinical outcomes
- URL: http://arxiv.org/abs/2111.06152v1
- Date: Thu, 11 Nov 2021 11:21:39 GMT
- Title: Longitudinal patient stratification of electronic health records with
flexible adjustment for clinical outcomes
- Authors: Oliver Carr, Avelino Javer, Patrick Rockenschaub, Owen Parsons, Robert Dürichen
- Abstract summary: We develop a recurrent neural network autoencoder to cluster EHR data using reconstruction, outcome, and clustering losses.
We demonstrate the model performance on 29,229 diabetes patients, showing it finds clusters of patients with both different trajectories and different outcomes.
- Score: 0.7874708385247353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increase in availability of longitudinal electronic health record (EHR)
data is leading to improved understanding of diseases and discovery of novel
phenotypes. The majority of clustering algorithms focus only on patient
trajectories, yet patients with similar trajectories may have different
outcomes. Finding subgroups of patients with different trajectories and
outcomes can guide future drug development and improve recruitment to clinical
trials. We develop a recurrent neural network autoencoder to cluster EHR data
using reconstruction, outcome, and clustering losses which can be weighted to
find different types of patient clusters. We show our model is able to discover
known clusters from both data biases and outcome differences, outperforming
baseline models. We demonstrate the model performance on 29,229 diabetes
patients, showing it finds clusters of patients with both different
trajectories and different outcomes which can be utilized to aid clinical
decision making.
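The weighting of the three losses described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of each term, so everything below is an assumption made for illustration: mean squared error for reconstruction, binary cross-entropy for a single binary outcome, and squared distance to the nearest centroid as a k-means-style clustering term. The function names and the weights `w_rec`, `w_out`, and `w_clu` are hypothetical and are not taken from the paper.

```python
from math import log


def mse(x, x_hat):
    """Mean squared error between an input sequence and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def binary_cross_entropy(y, p, eps=1e-12):
    """Cross-entropy for one binary outcome y and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * log(p) + (1 - y) * log(1 - p))


def nearest_centroid_distance(z, centroids):
    """Squared Euclidean distance from embedding z to its closest centroid."""
    return min(
        sum((a - c) ** 2 for a, c in zip(z, centroid))
        for centroid in centroids
    )


def combined_loss(x, x_hat, y, p, z, centroids,
                  w_rec=1.0, w_out=1.0, w_clu=1.0):
    """Weighted sum of the three terms (assumed forms, hypothetical weights)."""
    return (
        w_rec * mse(x, x_hat)
        + w_out * binary_cross_entropy(y, p)
        + w_clu * nearest_centroid_distance(z, centroids)
    )


# Toy values for one patient: a short feature sequence, its reconstruction,
# a binary outcome with its predicted probability, and a 2-D embedding.
loss_all = combined_loss(
    x=[1.0, 2.0, 3.0], x_hat=[1.0, 2.0, 2.0],
    y=1, p=0.5,
    z=[0.0, 0.0], centroids=[[1.0, 0.0], [3.0, 4.0]],
)

# Setting the outcome weight to zero clusters on trajectories alone.
loss_trajectory_only = combined_loss(
    x=[1.0, 2.0, 3.0], x_hat=[1.0, 2.0, 2.0],
    y=1, p=0.5,
    z=[0.0, 0.0], centroids=[[1.0, 0.0], [3.0, 4.0]],
    w_out=0.0,
)
```

In this sketch, raising `w_out` relative to the other weights would push the embedding toward separating patients by outcome, while lowering it would favour grouping by trajectory shape. This mirrors the abstract's statement that the losses can be weighted to find different types of clusters, though the actual model may combine them differently.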
Related papers
- TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data [42.96821770394798]
TACCO is a novel framework that jointly discovers clusters of clinical concepts and patient visits based on a hypergraph modeling of EHR data.
We conduct experiments on the public MIMIC-III dataset and Emory internal CRADLE dataset over the downstream clinical tasks of phenotype classification and cardiovascular risk prediction.
In-depth model analysis, clustering results analysis, and clinical case studies further validate the improved utilities and insightful interpretations delivered by TACCO.
arXiv Detail & Related papers (2024-06-14T14:18:38Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and its generality across different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z)
- Mixture Model Framework for Traumatic Brain Injury Prognosis Using Heterogeneous Clinical and Outcome Data [3.7363119896212478]
We develop a method for modeling large heterogeneous data types relevant to TBI.
The model is trained on a dataset encompassing a variety of data types, including demographics, blood-based biomarkers, and imaging findings.
It is used to stratify patients into distinct groups in an unsupervised learning setting.
arXiv Detail & Related papers (2020-12-22T19:31:03Z)
- Phenotyping Clusters of Patient Trajectories suffering from Chronic Complex Disease [3.1564542805009332]
We evaluate three different clustering models on a large hospital dataset of vital-sign observations from patients suffering from COPD.
We propose novel modifications to deal with unevenly sampled time-series data and unbalanced class distribution to improve phenotype separation.
arXiv Detail & Related papers (2020-11-17T01:18:33Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model that can extract common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Trajectories, bifurcations and pseudotime in large clinical datasets: applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values.
The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)
- Temporal Phenotyping using Deep Predictive Clustering of Disease Progression [97.88605060346455]
We develop a deep learning approach for clustering time-series data, where each cluster comprises patients who share similar future outcomes of interest.
Experiments on two real-world datasets show that our model achieves superior clustering performance over state-of-the-art benchmarks.
arXiv Detail & Related papers (2020-06-15T20:48:43Z)
- Deep Representation Learning of Electronic Health Records to Unlock Patient Stratification at Scale [0.5498849973527224]
We present an unsupervised framework based on deep learning to process heterogeneous EHRs.
We derive patient representations that can efficiently and effectively enable patient stratification at scale.
arXiv Detail & Related papers (2020-03-14T00:04:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.