Bayesian outcome-guided multi-view mixture models with applications in
molecular precision medicine
- URL: http://arxiv.org/abs/2303.00318v1
- Date: Wed, 1 Mar 2023 08:32:23 GMT
- Title: Bayesian outcome-guided multi-view mixture models with applications in
molecular precision medicine
- Authors: Paul D. W. Kirk, Filippo Pagani, Sylvia Richardson
- Abstract summary: Clustering is commonly performed as an initial analysis step for uncovering structure in 'omics datasets.
We propose a multi-view Bayesian mixture model that identifies groups of variables ("views"), each of which defines a distinct clustering structure.
We consider applications in stratified medicine, for which our principal goal is to identify clusters of patients that define distinct, clinically actionable disease subtypes.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clustering is commonly performed as an initial analysis step for uncovering
structure in 'omics datasets, e.g. to discover molecular subtypes of disease.
The high-throughput, high-dimensional nature of these datasets means that they
provide information on a diverse array of different biomolecular processes and
pathways. Different groups of variables (e.g. genes or proteins) will be
implicated in different biomolecular processes, and hence undertaking analyses
that are limited to identifying just a single clustering partition of the whole
dataset is liable to conflate the multiple clustering structures that
may arise from these distinct processes. To address this, we propose a
multi-view Bayesian mixture model that identifies groups of variables
("views"), each of which defines a distinct clustering structure. We consider
applications in stratified medicine, for which our principal goal is to
identify clusters of patients that define distinct, clinically actionable
disease subtypes. We adopt the semi-supervised, outcome-guided mixture
modelling approach of Bayesian profile regression that makes use of a response
variable in order to guide inference toward the clusterings that are most
relevant in a stratified medicine context. We present the model, together with
illustrative simulation examples, and examples from pan-cancer proteomics. We
demonstrate how the approach can be used to perform integrative clustering, and
consider an example in which different 'omics datasets are integrated in the
context of breast cancer subtyping.
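
The conflation problem described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' model or inference scheme: the data-generating setup, the function names, and the simple thresholding rule (a stand-in for a proper mixture model) are all assumptions made for illustration. The two views are also fixed in advance here, whereas the proposed model infers them.

```python
import random


def simulate_multiview(n=200, p_per_view=5, sep=4.0, seed=0):
    """Toy data: two groups of variables ("views"), each driven by its own
    binary patient labelling. Hypothetical setup, for illustration only."""
    rng = random.Random(seed)
    z1 = [rng.randrange(2) for _ in range(n)]  # clustering behind view 1
    z2 = [rng.randrange(2) for _ in range(n)]  # independent clustering behind view 2
    X = []
    for i in range(n):
        view1 = [rng.gauss(sep * z1[i], 1.0) for _ in range(p_per_view)]
        view2 = [rng.gauss(sep * z2[i], 1.0) for _ in range(p_per_view)]
        X.append(view1 + view2)
    return X, z1, z2


def threshold_cluster(X, cols, cut):
    """Label a row 1 if its mean over `cols` exceeds `cut`, else 0.
    A crude stand-in for fitting a two-component mixture."""
    cols = list(cols)
    return [int(sum(row[j] for j in cols) / len(cols) > cut) for row in X]


def agreement(a, b):
    """Fraction of matching labels, invariant to swapping the two labels."""
    match = sum(x == y for x, y in zip(a, b)) / len(a)
    return max(match, 1.0 - match)


p = 5
X, z1, z2 = simulate_multiview(p_per_view=p)

per_view_1 = threshold_cluster(X, range(0, p), cut=2.0)      # view 1 variables only
per_view_2 = threshold_cluster(X, range(p, 2 * p), cut=2.0)  # view 2 variables only
single = threshold_cluster(X, range(0, 2 * p), cut=2.0)      # all variables at once

print("view 1 vs z1:", agreement(per_view_1, z1))
print("view 2 vs z2:", agreement(per_view_2, z2))
print("single partition vs z1:", agreement(single, z1))
print("single partition vs z2:", agreement(single, z2))
```

In this setup, clustering within each view should recover that view's labelling almost exactly, while a single partition over all variables mixes the two structures and matches neither labelling well.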
Related papers
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- Single-cell Multi-view Clustering via Community Detection with Unknown Number of Clusters [64.31109141089598]
We introduce scUNC, an innovative multi-view clustering approach tailored for single-cell data.
scUNC seamlessly integrates information from different views without the need for a predefined number of clusters.
We conducted a comprehensive evaluation of scUNC using three distinct single-cell datasets.
arXiv Detail & Related papers (2023-11-28T08:34:58Z)
- Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity [25.488181126364186]
This paper presents a novel approach that leverages domain variability to learn representations that are conditionally invariant to unwanted variability or distractors.
We apply our method to grand biological challenges, such as data integration in single-cell genomics.
Specifically, the proposed approach helps to disentangle biological signals from data biases that are unrelated to the target task or the causal explanation of interest.
arXiv Detail & Related papers (2023-07-02T12:52:41Z)
- Ambiguous Medical Image Segmentation using Diffusion Models [60.378180265885945]
We introduce a single diffusion model-based approach that produces multiple plausible outputs by learning a distribution over group insights.
Our proposed model generates a distribution of segmentation masks by leveraging the inherent sampling process of diffusion.
Comprehensive results show that our proposed approach outperforms existing state-of-the-art ambiguous segmentation networks.
arXiv Detail & Related papers (2023-04-10T17:58:22Z)
- Composite Feature Selection using Deep Ensembles [130.72015919510605]
We investigate the problem of discovering groups of predictive features without predefined grouping.
We introduce a novel deep learning architecture that uses an ensemble of feature selection models to find predictive groups.
We propose a new metric to measure similarity between discovered groups and the ground truth.
arXiv Detail & Related papers (2022-11-01T17:49:40Z)
- MoReL: Multi-omics Relational Learning [26.484803417186384]
We propose a novel deep Bayesian generative model to efficiently infer a multi-partite graph encoding molecular interactions across heterogeneous views.
This optimal transport regularization in the deep Bayesian generative model not only allows view-specific side information to be incorporated, but also increases model flexibility through distribution-based regularization.
arXiv Detail & Related papers (2022-03-15T02:50:07Z)
- Interpretable Single-Cell Set Classification with Kernel Mean Embeddings [14.686560033030101]
Kernel Mean Embedding encodes the cellular landscape of each profiled biological sample.
We train a simple linear classifier and achieve state-of-the-art classification accuracy on 3 flow and mass datasets.
arXiv Detail & Related papers (2022-01-18T21:40:36Z)
- Group Heterogeneity Assessment for Multilevel Models [68.95633278540274]
Many data sets contain an inherent multilevel structure.
Taking this structure into account is critical for the accuracy and calibration of any statistical analysis performed on such data.
We propose a flexible framework for efficiently assessing differences between the levels of given grouping variables in the data.
arXiv Detail & Related papers (2020-05-06T12:42:04Z)
- Distinguishing Cell Phenotype Using Cell Epigenotype [0.0]
The relationship between microscopic observations and macroscopic behavior is a fundamental open question in biophysical systems.
We develop a unified approach that, in contrast with existing methods, predicts cell type from macromolecular data even when accounting for the scale of human tissue diversity and limitations in the available data.
arXiv Detail & Related papers (2020-03-20T18:00:07Z)
- Blocked Clusterwise Regression [0.0]
We generalize previous approaches to discrete unobserved heterogeneity by allowing each unit to have multiple latent variables.
We contribute to the theory of clustering with an over-specified number of clusters and derive new convergence rates for this setting.
arXiv Detail & Related papers (2020-01-29T23:29:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.