Simple and Scalable Algorithms for Cluster-Aware Precision Medicine
- URL: http://arxiv.org/abs/2211.16553v3
- Date: Wed, 17 May 2023 22:49:42 GMT
- Authors: Amanda M. Buch, Conor Liston, and Logan Grosenick
- Abstract summary: We propose a simple and scalable approach to joint clustering and embedding.
This novel, cluster-aware embedding approach overcomes the complexity and limitations of current joint embedding and clustering methods.
Our approach does not require the user to choose the desired number of clusters, but instead yields interpretable dendrograms of hierarchically clustered embeddings.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI-enabled precision medicine promises a transformational improvement in
healthcare outcomes by enabling data-driven personalized diagnosis, prognosis,
and treatment. However, the well-known "curse of dimensionality" and the
clustered structure of biomedical data interact to present a joint
challenge in the high-dimensional, limited-observation precision medicine
regime. To overcome both issues simultaneously, we propose a simple and scalable
approach to joint clustering and embedding that combines standard embedding
methods with a convex clustering penalty in a modular way. This novel,
cluster-aware embedding approach overcomes the complexity and limitations of
current joint embedding and clustering methods, which we show with
straightforward implementations of hierarchically clustered principal component
analysis (PCA), locally linear embedding (LLE), and canonical correlation
analysis (CCA). Through both numerical experiments and real-world examples, we
demonstrate that our approach outperforms traditional and contemporary
clustering methods on highly underdetermined problems (e.g., with just tens of
observations) as well as on large-sample datasets. Importantly, our approach
does not require the user to choose the desired number of clusters, but instead
yields interpretable dendrograms of hierarchically clustered embeddings. Thus,
our approach improves significantly on existing methods for identifying patient
subgroups in multiomics and neuroimaging data, enabling scalable and
interpretable biomarkers for precision medicine.
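The abstract describes adding a convex clustering penalty to a standard embedding objective. As a rough illustration only, the sketch below evaluates a generic convex clustering objective from the wider literature: a least-squares fit term plus a weighted sum of pairwise distances between per-observation centroids. The function name `convex_clustering_objective`, the uniform weights, the toy data, and the exact form of the objective are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def convex_clustering_objective(X, U, gamma, weights=None):
    """Evaluate a generic convex clustering objective (illustrative form only).

    0.5 * ||X - U||_F^2 + gamma * sum_{i<j} w_ij * ||U[i] - U[j]||_2

    X: (n, d) data, U: (n, d) one centroid per observation,
    gamma: penalty strength, weights: (n, n) pairwise weights.
    """
    n = X.shape[0]
    if weights is None:
        weights = np.ones((n, n))  # assumption: uniform pairwise weights
    fit = 0.5 * np.sum((X - U) ** 2)
    i, j = np.triu_indices(n, k=1)  # all pairs with i < j
    fusion = np.sum(weights[i, j] * np.linalg.norm(U[i] - U[j], axis=1))
    return fit + gamma * fusion


# Toy data (hypothetical): two well-separated groups of five points each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 0.1, size=(5, 3)),
    rng.normal(2.0, 0.1, size=(5, 3)),
])

# Two extreme candidate solutions:
U_identity = X.copy()                               # every point is its own cluster
U_fused = np.tile(X.mean(axis=0), (X.shape[0], 1))  # all points share one centroid

for gamma in (0.0, 0.05, 10.0):
    print(
        f"gamma={gamma}: "
        f"identity={convex_clustering_objective(X, U_identity, gamma):.3f}, "
        f"fused={convex_clustering_objective(X, U_fused, gamma):.3f}"
    )
```

In this generic form, the penalty term vanishes when all centroids coincide and the fit term vanishes when each centroid equals its data point. Sweeping `gamma` between those extremes is what produces a path of progressively merged centroids, which can be read as a dendrogram without fixing the number of clusters in advance.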
Related papers
- Hierarchical and Density-based Causal Clustering [6.082022112101251]
We propose plug-in estimators that are simple and readily implementable using off-the-shelf algorithms.
We go on to study their rate of convergence, and show that the additional cost of causal clustering is essentially the estimation error of the outcome regression functions.
arXiv Detail & Related papers (2024-11-02T14:01:04Z)
- Federated unsupervised random forest for privacy-preserving patient stratification [0.4499833362998487]
We introduce a novel multi-omics clustering approach utilizing unsupervised random-forests.
We have validated our approach on machine learning benchmark data sets and on cancer data from The Cancer Genome Atlas.
Our method is competitive with the state-of-the-art in terms of disease subtyping, but at the same time substantially improves the cluster interpretability.
arXiv Detail & Related papers (2024-01-29T12:04:14Z)
- Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
- Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z)
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- Tk-merge: Computationally Efficient Robust Clustering Under General Assumptions [0.0]
We present a two-step hybrid robust clustering algorithm based on trimmed k-means and hierarchical agglomeration.
We also present natural generalizations of the approach as well as an adaptive procedure to estimate the amount of contamination in a data-driven fashion.
arXiv Detail & Related papers (2022-01-17T13:05:05Z)
- Fast and Interpretable Consensus Clustering via Minipatch Learning [0.0]
We develop IMPACC: Interpretable MiniPatch Adaptive Consensus Clustering.
We develop adaptive sampling schemes for observations, which result in both improved reliability and computational savings.
Results show that our approach yields more accurate and interpretable cluster solutions.
arXiv Detail & Related papers (2021-10-05T22:39:28Z)
- Deep Semi-Supervised Embedded Clustering (DSEC) for Stratification of Heart Failure Patients [50.48904066814385]
In this work we apply deep semi-supervised embedded clustering to determine data-driven patient subgroups of heart failure.
We find clinically relevant clusters from an embedded space derived from heterogeneous data.
The proposed algorithm can potentially find new undiagnosed subgroups of patients that have different outcomes.
arXiv Detail & Related papers (2020-12-24T12:56:46Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.