Deep Semi-Supervised Embedded Clustering (DSEC) for Stratification of
Heart Failure Patients
- URL: http://arxiv.org/abs/2012.13233v3
- Date: Sun, 17 Jan 2021 23:51:44 GMT
- Title: Deep Semi-Supervised Embedded Clustering (DSEC) for Stratification of
Heart Failure Patients
- Authors: Oliver Carr, Stojan Jovanovic, Luca Albergante, Fernando Andreotti,
Robert Dürichen, Nadia Lipunova, Janie Baxter, Rabia Khan, Benjamin Irving
- Abstract summary: In this work we apply deep semi-supervised embedded clustering to determine data-driven patient subgroups of heart failure.
We find clinically relevant clusters from an embedded space derived from heterogeneous data.
The proposed algorithm can potentially find new undiagnosed subgroups of patients that have different outcomes.
- Score: 50.48904066814385
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Determining phenotypes of diseases can have considerable benefits for
in-hospital patient care and drug development. The structure of high
dimensional data sets such as electronic health records is often represented
through an embedding of the data, with clustering methods used to group data of
similar structure. If subgroups are known to exist within data, supervised
methods may be used to influence the clusters discovered. We propose to extend
deep embedded clustering to a semi-supervised deep embedded clustering
algorithm to stratify subgroups through known labels in the data. In this work
we apply deep semi-supervised embedded clustering to determine data-driven
patient subgroups of heart failure from the electronic health records of 4,487
heart failure and control patients. We find clinically relevant clusters from
an embedded space derived from heterogeneous data. The proposed algorithm can
potentially find new undiagnosed subgroups of patients that have different
outcomes, and, therefore, lead to improved treatments.
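A minimal sketch of the idea described in the abstract, assuming the standard deep embedded clustering (DEC) formulation: embeddings are softly assigned to cluster centroids with a Student's t kernel, and the clustering loss is the KL divergence to a sharpened target distribution. This listing does not give the paper's semi-supervised loss, so the cross-entropy term on labelled points, the `weight` parameter, and the convention that cluster index equals class index are illustrative assumptions, not the authors' method. The function names and toy data are hypothetical.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    """Student's t soft assignment q_ij of embeddings to centroids (DEC-style)."""
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened targets: p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def semi_supervised_loss(q, labels, weight=1.0, eps=1e-12):
    """Mean KL(P || Q) over all points plus cross-entropy on labelled points.

    labels holds a class index per point, or -1 where no label is known.
    ASSUMPTION: cluster j corresponds to class j; the supervised term is
    an illustrative choice, not taken from the paper.
    """
    p = target_distribution(q)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps))) / len(q)
    idx = np.flatnonzero(labels >= 0)
    ce = -np.mean(np.log(q[idx, labels[idx]] + eps)) if idx.size else 0.0
    return kl + weight * ce

# Toy data: six 2-D embeddings, two centroids, three known labels.
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 2))
centroids = np.array([[-1.0, 0.0], [1.0, 0.0]])
labels = np.array([0, -1, 1, -1, -1, 0])

q = soft_assign(z, centroids)
p = target_distribution(q)
loss = semi_supervised_loss(q, labels)
```

In a full model the embeddings `z` would come from an encoder network, and both the encoder weights and the centroids would be updated by gradient descent on this loss; the sketch only evaluates the loss for fixed values.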
Related papers
- Federated unsupervised random forest for privacy-preserving patient
stratification [0.4499833362998487]
We introduce a novel multi-omics clustering approach utilizing unsupervised random-forests.
We have validated our approach on machine learning benchmark data sets and on cancer data from The Cancer Genome Atlas.
Our method is competitive with the state-of-the-art in terms of disease subtyping, but at the same time substantially improves the cluster interpretability.
arXiv Detail & Related papers (2024-01-29T12:04:14Z)
- A system for exploring big data: an iterative k-means searchlight for
outlier detection on open health data [0.4588028371034407]
We present a system that explores multiple combinations of variables using a searchlight technique and identifies outliers.
We illustrate this system by analyzing open health care data released by New York State.
Several anomalous trends in the data are identified, including cost overruns at specific hospitals, and increases in diagnoses such as suicides.
arXiv Detail & Related papers (2023-04-05T02:09:15Z)
- Simple and Scalable Algorithms for Cluster-Aware Precision Medicine [0.0]
We propose a simple and scalable approach to joint clustering and embedding.
This novel, cluster-aware embedding approach overcomes the complexity and limitations of current joint embedding and clustering methods.
Our approach does not require the user to choose the desired number of clusters, but instead yields interpretable dendrograms of hierarchically clustered embeddings.
arXiv Detail & Related papers (2022-11-29T19:27:26Z)
- DeepMCAT: Large-Scale Deep Clustering for Medical Image Categorization [24.100651548850895]
We propose an unsupervised approach for automatically clustering and categorizing large-scale medical image datasets.
We investigated the end-to-end training using both class-balanced and imbalanced large-scale datasets.
arXiv Detail & Related papers (2021-09-30T22:39:57Z)
- Towards Uncovering the Intrinsic Data Structures for Unsupervised Domain
Adaptation using Structurally Regularized Deep Clustering [119.88565565454378]
Unsupervised domain adaptation (UDA) is to learn classification models that make predictions for unlabeled data on a target domain.
We propose a hybrid model of Structurally Regularized Deep Clustering, which integrates the regularized discriminative clustering of target data with a generative one.
Our proposed H-SRDC outperforms all the existing methods under both the inductive and transductive settings.
arXiv Detail & Related papers (2020-12-08T08:52:00Z)
- Trajectories, bifurcations and pseudotime in large clinical datasets:
applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values.
The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)
- Temporal Phenotyping using Deep Predictive Clustering of Disease
Progression [97.88605060346455]
We develop a deep learning approach for clustering time-series data, where each cluster comprises patients who share similar future outcomes of interest.
Experiments on two real-world datasets show that our model achieves superior clustering performance over state-of-the-art benchmarks.
arXiv Detail & Related papers (2020-06-15T20:48:43Z)
- Robust Recursive Partitioning for Heterogeneous Treatment Effects with
Uncertainty Quantification [84.53697297858146]
Subgroup analysis of treatment effects plays an important role in applications from medicine to public policy to recommender systems.
Most of the current methods of subgroup analysis begin with a particular algorithm for estimating individualized treatment effects (ITE).
This paper develops a new method for subgroup analysis, R2P, that addresses all these weaknesses.
arXiv Detail & Related papers (2020-06-14T14:50:02Z)
- Predictive Modeling of ICU Healthcare-Associated Infections from
Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling
Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making addressed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.