Clustering Survival Data using a Mixture of Non-parametric Experts
- URL: http://arxiv.org/abs/2405.15934v1
- Date: Fri, 24 May 2024 20:47:58 GMT
- Title: Clustering Survival Data using a Mixture of Non-parametric Experts
- Authors: Gabriel Buginga, Edmundo de Souza e Silva
- Abstract summary: This study introduces SurvMixClust, a novel algorithm for survival analysis that integrates clustering with survival function prediction.
Our evaluations on five public datasets show that SurvMixClust creates balanced clusters with distinct survival curves, outperforms clustering baselines, and competes with non-clustering survival models in predictive accuracy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Survival analysis aims to predict the timing of future events across various fields, from medical outcomes to customer churn. However, the integration of clustering into survival analysis, particularly for precision medicine, remains underexplored. This study introduces SurvMixClust, a novel algorithm for survival analysis that integrates clustering with survival function prediction within a unified framework. SurvMixClust learns latent representations for clustering while also predicting individual survival functions using a mixture of non-parametric experts. Our evaluations on five public datasets show that SurvMixClust creates balanced clusters with distinct survival curves, outperforms clustering baselines, and competes with non-clustering survival models in predictive accuracy, as measured by the time-dependent c-index and log-rank metrics.
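The idea of predicting individual survival functions from a mixture of non-parametric experts can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each expert is a (weighted) Kaplan-Meier curve fitted per cluster and that cluster membership probabilities are already available, so that S(t | x) = sum_k p(k | x) S_k(t). The function names `kaplan_meier`, `evaluate_step`, `fit_experts`, and `mixture_survival`, and the toy data, are hypothetical.

```python
import numpy as np


def kaplan_meier(times, events, weights=None):
    """Weighted Kaplan-Meier estimate; returns (event_times, survival)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    if weights is None:
        weights = np.ones_like(times)
    weights = np.asarray(weights, dtype=float)

    event_times = np.unique(times[events == 1])
    survival = np.empty(len(event_times))
    s = 1.0
    for i, t in enumerate(event_times):
        at_risk = weights[times >= t].sum()
        deaths = weights[(times == t) & (events == 1)].sum()
        if at_risk > 0:  # skip times where this expert has no mass at risk
            s *= 1.0 - deaths / at_risk
        survival[i] = s
    return event_times, survival


def evaluate_step(event_times, survival, t):
    """Evaluate the right-continuous step function S(t); S(t) = 1 before the first event."""
    t = np.asarray(t, dtype=float)
    if len(event_times) == 0:
        return np.ones_like(t)
    idx = np.searchsorted(event_times, t, side="right") - 1
    return np.where(idx >= 0, survival[np.clip(idx, 0, None)], 1.0)


def fit_experts(times, events, membership):
    """Fit one Kaplan-Meier expert per cluster, weighting samples by membership[:, k]."""
    membership = np.asarray(membership, dtype=float)
    return [kaplan_meier(times, events, membership[:, k])
            for k in range(membership.shape[1])]


def mixture_survival(membership, experts, t_grid):
    """S(t | x_i) = sum_k membership[i, k] * S_k(t), evaluated on t_grid."""
    curves = np.stack([evaluate_step(et, s, t_grid) for et, s in experts])  # (K, T)
    return np.asarray(membership, dtype=float) @ curves  # (N, T)


# Toy data (assumed): six individuals, two clusters with hard assignments.
times = [2, 3, 5, 7, 8, 10]
events = [1, 1, 0, 1, 1, 0]  # 1 = event observed, 0 = censored
train_membership = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

experts = fit_experts(times, events, train_membership)

# A new individual assumed to belong to each cluster with probability 0.5.
new_membership = np.array([[0.5, 0.5]])
individual_curve = mixture_survival(new_membership, experts, t_grid=[1.0, 4.0, 9.0])
print(individual_curve)
```

In this sketch the membership probabilities are given by hand; in a full method they would come from a learned model of the covariates, and soft (non one-hot) memberships can be passed to `fit_experts` unchanged.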
Related papers
- HACSurv: A Hierarchical Copula-based Approach for Survival Analysis with Dependent Competing Risks [51.95824566163554]
HACSurv is a survival analysis method that learns structures and cause-specific survival functions from data with competing risks.
By capturing the dependencies between risks and censoring, HACSurv achieves better survival predictions.
arXiv Detail & Related papers (2024-10-19T18:52:18Z)
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide sharp asymptotics for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting.
We validate our theory across a variety of high-dimensional data.
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
- Variational Deep Survival Machines: Survival Regression with Censored Outcomes [11.82370259688716]
Survival regression aims to predict the time when an event of interest will take place, typically a death or a failure.
We present a novel method to predict the survival time by better clustering the survival data and combining primitive distributions.
arXiv Detail & Related papers (2024-04-24T02:16:00Z)
- Heterogeneous Datasets for Federated Survival Analysis Simulation [6.489759672413373]
This work proposes a novel technique for constructing realistic heterogeneous datasets by starting from existing non-federated datasets in a reproducible way.
Specifically, we provide two novel dataset-splitting algorithms based on the Dirichlet distribution to assign each data sample to a carefully chosen client.
The implementation of the proposed methods is publicly available, in favor of reproducibility and to encourage common practices for simulating federated environments for survival analysis.
arXiv Detail & Related papers (2023-01-28T11:37:07Z)
- Deep Clustering Survival Machines with Interpretable Expert Distributions [14.938859205541014]
We propose a hybrid survival analysis method, referred to as deep clustering survival machines.
We discriminatively learn the weights of the expert distributions for individual instances according to their features.
This method also facilitates interpretable subgrouping/clustering of all instances according to their associated expert distributions.
arXiv Detail & Related papers (2023-01-27T16:27:18Z)
- A Deep Variational Approach to Clustering Survival Data [5.871238645229228]
We introduce a novel probabilistic approach to cluster survival data in a variational deep clustering setting.
Our proposed method employs a deep generative model to uncover the underlying distribution of both the explanatory variables and the potentially censored survival times.
arXiv Detail & Related papers (2021-06-10T14:10:25Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Temporal Phenotyping using Deep Predictive Clustering of Disease Progression [97.88605060346455]
We develop a deep learning approach for clustering time-series data, where each cluster comprises patients who share similar future outcomes of interest.
Experiments on two real-world datasets show that our model achieves superior clustering performance over state-of-the-art benchmarks.
arXiv Detail & Related papers (2020-06-15T20:48:43Z)
- Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making aimed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z)
- Survival Cluster Analysis [93.50540270973927]
There is an unmet need in survival analysis for identifying subpopulations with distinct risk profiles.
An approach that addresses this need is likely to improve characterization of individual outcomes.
arXiv Detail & Related papers (2020-02-29T22:41:21Z)