CAVACHON: a hierarchical variational autoencoder to integrate multi-modal single-cell data
- URL: http://arxiv.org/abs/2405.18655v1
- Date: Tue, 28 May 2024 23:44:09 GMT
- Title: CAVACHON: a hierarchical variational autoencoder to integrate multi-modal single-cell data
- Authors: Ping-Han Hsieh, Ru-Xiu Hsiao, Katalin Ferenc, Anthony Mathelier, Rebekka Burkholz, Chien-Yu Chen, Geir Kjetil Sandve, Tatiana Belova, Marieke Lydia Kuijjer
- Abstract summary: We propose a novel probabilistic learning framework that explicitly incorporates conditional independence relationships between multi-modal data.
We demonstrate the versatility of our framework across various applications pertinent to single-cell multi-omics data integration.
- Score: 10.429856767305687
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Paired single-cell sequencing technologies enable the simultaneous measurement of complementary modalities of molecular data at single-cell resolution. Along with the advances in these technologies, many methods based on variational autoencoders have been developed to integrate these data. However, these methods do not explicitly incorporate prior biological relationships between the data modalities, which could significantly enhance modeling and interpretation. We propose a novel probabilistic learning framework that explicitly incorporates conditional independence relationships between multi-modal data as a directed acyclic graph using a generalized hierarchical variational autoencoder. We demonstrate the versatility of our framework across various applications pertinent to single-cell multi-omics data integration. These include the isolation of common and distinct information from different modalities, modality-specific differential analysis, and integrated cell clustering. We anticipate that the proposed framework can facilitate the construction of highly flexible graphical models that can capture the complexities of biological hypotheses and unravel the connections between different biological data types, such as different modalities of paired single-cell multi-omics data. The implementation of the proposed framework can be found in the repository https://github.com/kuijjerlab/CAVACHON.
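The abstract describes encoding conditional independence relationships between modalities as a directed acyclic graph. As a minimal sketch of that general idea only (not the CAVACHON API), the snippet below uses Python's standard `graphlib` module to order the nodes of a small modality graph so that parents come before children, and writes out the standard DAG factorization in which each variable is conditioned only on its parents. The modality names `atac` and `rna`, the edge between them, and the helper `factorization` are hypothetical and chosen purely for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical modality graph (an assumption for illustration, not taken from
# the paper): each key maps to the list of modalities it is conditioned on.
parents = {
    "atac": [],        # no parents
    "rna": ["atac"],   # conditioned on "atac"
}


def factorization(parents):
    """Return a parents-first node order and the DAG factorization as a string.

    TopologicalSorter expects a mapping from node to its predecessors, which is
    exactly the parent mapping used here. It raises graphlib.CycleError if the
    graph contains a cycle, i.e. if it is not a valid DAG.
    """
    order = list(TopologicalSorter(parents).static_order())
    terms = []
    for node in order:
        node_parents = sorted(parents.get(node, []))
        if node_parents:
            terms.append(f"p({node} | {', '.join(node_parents)})")
        else:
            terms.append(f"p({node})")
    return order, " * ".join(terms)


order, joint = factorization(parents)
print(order)
print(joint)
```

Changing the `parents` mapping changes the assumed independence structure; a graph with a cycle is rejected rather than silently ordered.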
Related papers
- Joint Analysis of Single-Cell Data across Cohorts with Missing Modalities [13.675134007270774]
We propose Single-Cell Cross-Cohort Cross-Category integration, a novel framework that learns unified cell representations under domain shift.
Our generative approach learns rich cross-modal and cross-domain relationships that enable imputation of these missing modalities.
arXiv Detail & Related papers (2024-05-18T12:32:21Z)
- UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and uses flexible prediction heads to adapt to various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
- HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data [10.774128925670183]
This paper presents the Hybrid Early-fusion Attention Learning Network (HEALNet), a flexible multimodal fusion architecture.
We conduct multimodal survival analysis on Whole Slide Images and multi-omic data from four cancer datasets in The Cancer Genome Atlas (TCGA).
HEALNet achieves state-of-the-art performance compared to other end-to-end trained fusion models.
arXiv Detail & Related papers (2023-11-15T17:06:26Z)
- Mixed Models with Multiple Instance Learning [51.440557223100164]
We introduce MixMIL, a framework integrating Generalized Linear Mixed Models (GLMM) and Multiple Instance Learning (MIL).
Our empirical results reveal that MixMIL outperforms existing MIL models in single-cell datasets.
arXiv Detail & Related papers (2023-11-04T16:42:42Z)
- Is your data alignable? Principled and interpretable alignability testing and integration of single-cell data [24.457344926393397]
Single-cell data integration can provide a comprehensive molecular view of cells.
Existing methods suffer from several fundamental limitations.
We present a spectral manifold alignment and inference framework.
arXiv Detail & Related papers (2023-08-03T16:04:14Z)
- Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications [90.6849884683226]
We study the challenge of interaction quantification in a semi-supervised setting with only labeled unimodal data.
Using a precise information-theoretic definition of interactions, our key contribution is the derivation of lower and upper bounds.
We show how these theoretical results can be used to estimate multimodal model performance, guide data collection, and select appropriate multimodal models for various tasks.
arXiv Detail & Related papers (2023-06-07T15:44:53Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
- Multimodal hierarchical Variational AutoEncoders with Factor Analysis latent space [45.418113011182186]
This study proposes a novel method to address limitations by combining Variational AutoEncoders (VAEs) with a Factor Analysis latent space (FA-VAE).
The proposed FA-VAE method employs multiple VAEs to learn a private representation for each heterogeneous data view in a continuous latent space.
arXiv Detail & Related papers (2022-07-19T10:46:02Z)
- MoReL: Multi-omics Relational Learning [26.484803417186384]
We propose a novel deep Bayesian generative model to efficiently infer a multi-partite graph encoding molecular interactions across heterogeneous views.
Such an optimal transport regularization in the deep Bayesian generative model not only allows incorporating view-specific side information but also increases model flexibility through distribution-based regularization.
arXiv Detail & Related papers (2022-03-15T02:50:07Z)
- Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models [86.9292779620645]
We develop a contrastive framework for generative model learning, allowing us to train the model not just by the commonality between modalities, but by the distinction between "related" and "unrelated" multimodal data.
Under our proposed framework, the generative model can accurately identify related samples from unrelated ones, making it possible to make use of the plentiful unlabeled, unpaired multimodal data.
arXiv Detail & Related papers (2020-07-02T15:08:11Z)
- Bayesian Sparse Factor Analysis with Kernelized Observations [67.60224656603823]
Multi-view problems can be addressed with latent variable models.
High-dimensionality and non-linear issues are traditionally handled by kernel methods.
We propose merging both approaches into a single model.
arXiv Detail & Related papers (2020-06-01T14:25:38Z)