scMRDR: A scalable and flexible framework for unpaired single-cell multi-omics data integration
- URL: http://arxiv.org/abs/2510.24987v1
- Date: Tue, 28 Oct 2025 21:28:39 GMT
- Title: scMRDR: A scalable and flexible framework for unpaired single-cell multi-omics data integration
- Authors: Jianle Sun, Chaoqi Liang, Ran Wei, Peng Zheng, Lei Bai, Wanli Ouyang, Hongliang Yan, Peng Ye
- Abstract summary: We introduce a scalable and flexible generative framework called single-cell Multi-omics Regularized Disentangled Representations (scMRDR) for unpaired multi-omics integration. Our method achieves excellent performance on benchmark datasets in terms of batch correction, modality alignment, and biological signal preservation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advances in single-cell sequencing have enabled high-resolution profiling of diverse molecular modalities, yet integrating unpaired multi-omics single-cell data remains challenging. Existing approaches either rely on pairing information or prior correspondences, or require computing a global pairwise coupling matrix, which limits their scalability and flexibility. In this paper, we introduce a scalable and flexible generative framework called single-cell Multi-omics Regularized Disentangled Representations (scMRDR) for unpaired multi-omics integration. Specifically, we disentangle each cell's latent representation into modality-shared and modality-specific components using a well-designed $\beta$-VAE architecture, augmented with isometric regularization to preserve intra-omics biological heterogeneity, an adversarial objective to encourage cross-modal alignment, and a masked reconstruction loss to address missing features across modalities. Our method achieves excellent performance on benchmark datasets in terms of batch correction, modality alignment, and biological signal preservation. Crucially, it scales effectively to large datasets and supports integration of more than two omics layers, offering a powerful and flexible solution for large-scale multi-omics data integration and downstream biological discovery.
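As a rough illustration of the recipe the abstract sketches (shared/specific latent disentanglement, a masked reconstruction loss, adversarial alignment of the shared space, and isometric regularization), here is a minimal PyTorch sketch. It is a hedged reading of the abstract, not the authors' code: every module name, dimension, and loss weight below is an illustrative assumption.

```python
# Minimal sketch of the training objective described in the abstract above.
# NOT the authors' implementation: dimensions, loss weights, and the exact
# form of each regularizer are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OmicsEncoder(nn.Module):
    """Encode one omics layer into modality-shared and modality-specific Gaussians."""
    def __init__(self, n_features, d_shared=32, d_specific=16):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(n_features, 256), nn.ReLU())
        self.heads = nn.ModuleDict({
            name: nn.Linear(256, d)
            for name, d in [("mu_s", d_shared), ("lv_s", d_shared),
                            ("mu_p", d_specific), ("lv_p", d_specific)]
        })

    def forward(self, x):
        h = self.backbone(x)
        return {name: head(h) for name, head in self.heads.items()}

def sample(mu, logvar):
    # Reparameterization trick: z = mu + sigma * eps.
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

def kl(mu, logvar):
    # KL(q(z|x) || N(0, I)), averaged over the batch.
    return -0.5 * torch.mean(torch.sum(1 + logvar - mu**2 - logvar.exp(), dim=1))

def masked_recon(x_hat, x, mask):
    # Only features actually measured in this modality (mask == 1) contribute,
    # which is one way to read the abstract's "masked reconstruction loss".
    return torch.sum(mask * (x_hat - x) ** 2) / mask.sum().clamp(min=1.0)

def isometry(x, z):
    # Keep latent pairwise distances proportional to input pairwise distances,
    # a simple stand-in for the isometric regularization mentioned above.
    dx, dz = torch.cdist(x, x), torch.cdist(z, z)
    return F.mse_loss(dz / (dz.mean() + 1e-8), dx / (dx.mean() + 1e-8))

def scmrdr_like_step(enc, dec, disc, x, mask, mod_label, beta=4.0):
    q = enc(x)
    z_s = sample(q["mu_s"], q["lv_s"])  # modality-shared latent
    z_p = sample(q["mu_p"], q["lv_p"])  # modality-specific latent
    x_hat = dec(torch.cat([z_s, z_p], dim=1))
    # Adversarial alignment: the encoder is rewarded when a modality
    # discriminator cannot tell which omics layer z_s came from. The
    # discriminator itself would be trained on the opposite objective
    # in a separate alternating update (omitted here).
    adv = -F.cross_entropy(disc(z_s), mod_label)
    return (masked_recon(x_hat, x, mask)
            + beta * (kl(q["mu_s"], q["lv_s"]) + kl(q["mu_p"], q["lv_p"]))
            + 0.1 * adv + 0.1 * isometry(x, z_s))

# Hypothetical wiring for one modality with 2000 features and 2 omics layers.
enc = OmicsEncoder(2000)
dec = nn.Sequential(nn.Linear(48, 256), nn.ReLU(), nn.Linear(256, 2000))
disc = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
x, mask = torch.randn(8, 2000), torch.ones(8, 2000)
loss = scmrdr_like_step(enc, dec, disc, x, mask, torch.zeros(8, dtype=torch.long))
loss.backward()
```

In the full method, one such encoder-decoder pair per omics layer would share the z_s space; that shared space is what permits unpaired integration without a global coupling matrix.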
Related papers
- Modality-Specific Enhancement and Complementary Fusion for Semi-Supervised Multi-Modal Brain Tumor Segmentation
We propose a novel semi-supervised multi-modal framework for medical image segmentation. We introduce a Modality-specific Enhancing Module (MEM) to strengthen the semantic cues unique to each modality, and a learnable Complementary Information Fusion (CIF) module to adaptively exchange complementary knowledge between modalities.
arXiv Detail & Related papers (2025-12-10T16:15:17Z) - MoRE: Batch-Robust Multi-Omics Representations from Frozen Pre-trained Transformers
We present MoRE (Multi-Omics Representation Embedding), a framework that repurposes frozen pre-trained transformers to align heterogeneous assays into a shared latent space. Specifically, MoRE attaches lightweight, modality-specific adapters and a task-adaptive fusion layer to the frozen backbone. We benchmark MoRE against established baselines, including scGPT, scVI, and Harmony with Scrublet, evaluating integration fidelity, rare population detection, and modality transfer.
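The adapter-on-frozen-backbone idea above can be pictured with a short, hypothetical PyTorch sketch; the residual bottleneck form, names, and sizes are assumptions, not MoRE's actual architecture.

```python
# Hypothetical residual adapter on a frozen backbone; all names/sizes invented.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, d_model=512, d_bottleneck=32):
        super().__init__()
        self.down = nn.Linear(d_model, d_bottleneck)  # lightweight bottleneck
        self.up = nn.Linear(d_bottleneck, d_model)

    def forward(self, h):
        return h + self.up(torch.relu(self.down(h)))  # residual update

backbone = nn.Sequential(nn.Linear(1000, 512), nn.ReLU(), nn.Linear(512, 512))
for p in backbone.parameters():
    p.requires_grad = False  # the pre-trained backbone stays frozen

adapters = nn.ModuleDict({"rna": Adapter(), "atac": Adapter()})
h_rna = adapters["rna"](backbone(torch.randn(4, 1000)))  # only adapters train
```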
arXiv Detail & Related papers (2025-11-25T15:04:06Z) - UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation [104.59740403500132]
Multi-modal image segmentation faces real-world deployment challenges when incomplete or corrupted modalities degrade performance. We propose a unified modality-relax segmentation network (UniMRSeg) through hierarchical self-supervised compensation (HSSC). Our approach hierarchically bridges representation gaps between complete and incomplete modalities across the input, feature, and output levels.
arXiv Detail & Related papers (2025-09-19T17:29:25Z) - CellPainTR: Generalizable Representation Learning for Cross-Dataset Cell Painting Analysis
We introduce CellPainTR, a Transformer-based architecture designed to learn foundational representations of cellular morphology. Our work represents a significant step towards creating truly foundational models for image-based profiling, enabling more reliable and scalable cross-study biological analysis.
arXiv Detail & Related papers (2025-09-02T03:30:07Z) - scMamba: A Scalable Foundation Model for Single-Cell Multi-Omics Integration Beyond Highly Variable Feature Selection
scMamba is a model designed to integrate single-cell multi-omics data without the need for prior feature selection. scMamba distills rich biological insights from high-dimensional, sparse single-cell multi-omics data. Our findings position scMamba as a powerful tool for large-scale single-cell multi-omics integration.
arXiv Detail & Related papers (2025-06-25T12:58:01Z) - Controllable diffusion-based generation for multi-channel biological data
This work proposes a unified diffusion framework for controllable generation over structured and spatial biological data. We show state-of-the-art performance across both spatial and non-spatial prediction tasks, including protein imputation in IMC and gene-to-protein prediction in single-cell datasets.
arXiv Detail & Related papers (2025-06-24T00:56:21Z) - Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology
We propose the use of omic embeddings during early and late fusion to capture complementary information from local (patch-level) to global (slide-level) interactions. This dual fusion strategy enhances interpretability and classification performance, highlighting its potential for clinical diagnostics.
arXiv Detail & Related papers (2024-11-26T13:25:53Z) - CAVACHON: a hierarchical variational autoencoder to integrate multi-modal single-cell data
We propose a novel probabilistic learning framework that explicitly incorporates conditional independence relationships between multi-modal data.
We demonstrate the versatility of our framework across various applications pertinent to single-cell multi-omics data integration.
arXiv Detail & Related papers (2024-05-28T23:44:09Z) - Mixed Models with Multiple Instance Learning
We introduce MixMIL, a framework integrating Generalized Linear Mixed Models (GLMM) and Multiple Instance Learning (MIL).
Our empirical results reveal that MixMIL outperforms existing MIL models in single-cell datasets.
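To make the MIL half of that idea concrete, below is a generic attention-based MIL pooling layer in PyTorch. It is only an illustration of multiple instance learning over cells; MixMIL's actual model and its GLMM component are not reproduced here.

```python
# Generic attention-based MIL pooling over a bag of instances (e.g. the
# cells of one sample); illustrative only, not MixMIL's implementation.
import torch
import torch.nn as nn

class AttentionMILPool(nn.Module):
    def __init__(self, d_in=64):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(d_in, 32), nn.Tanh(), nn.Linear(32, 1))

    def forward(self, bag):  # bag: (n_instances, d_in)
        w = torch.softmax(self.score(bag), dim=0)  # attention over instances
        return (w * bag).sum(dim=0)                # bag-level embedding

pooled = AttentionMILPool()(torch.randn(100, 64))  # 100 cells -> one sample vector
```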
arXiv Detail & Related papers (2023-11-04T16:42:42Z) - Learning Multiscale Consistency for Self-supervised Electron Microscopy Instance Segmentation
We propose a pretraining framework that enhances multiscale consistency in EM volumes.
Our approach leverages a Siamese network architecture, integrating strong and weak data augmentations.
It effectively captures voxel and feature consistency, showing promise for learning transferable representations for EM analysis.
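The strong/weak augmentation idea above lends itself to a short generic sketch; the placeholder augmentations and the cosine consistency loss below are assumed stand-ins, not the paper's EM-specific pipeline.

```python
# Generic Siamese-style consistency between weakly and strongly augmented
# views; placeholder augmentations, not the paper's EM-specific pipeline.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128))  # toy encoder

def consistency_loss(x):
    weak = x + 0.01 * torch.randn_like(x)                     # weak view
    strong = F.dropout(x, p=0.5) + 0.1 * torch.randn_like(x)  # strong view
    z_w = F.normalize(encoder(weak), dim=1)
    z_s = F.normalize(encoder(strong), dim=1)
    return 1 - F.cosine_similarity(z_w, z_s, dim=1).mean()    # views must agree

loss = consistency_loss(torch.randn(8, 1, 32, 32))
```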
arXiv Detail & Related papers (2023-08-19T05:49:13Z) - Is your data alignable? Principled and interpretable alignability
testing and integration of single-cell data [24.457344926393397]
Single-cell data integration can provide a comprehensive molecular view of cells.
Existing methods suffer from several fundamental limitations.
We present a spectral manifold alignment and inference framework.
arXiv Detail & Related papers (2023-08-03T16:04:14Z) - Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at the cluster level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
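A common way to realize cluster-level cross-modality alignment of this kind is an InfoNCE-style loss over matched cluster centroids; the sketch below is that generic formulation, not the paper's MSMA implementation.

```python
# Generic InfoNCE-style alignment of matched cluster centroids from two
# modalities; illustrative only, not the paper's MSMA losses.
import torch
import torch.nn.functional as F

def cluster_contrastive(z_a, z_b, temperature=0.1):
    """z_a, z_b: (n_clusters, d) centroids, row i of each side matched."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.T / temperature   # similarity of every cluster pair
    targets = torch.arange(z_a.size(0))  # the matched cluster is the positive
    return F.cross_entropy(logits, targets)

loss = cluster_contrastive(torch.randn(10, 64), torch.randn(10, 64))
```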
arXiv Detail & Related papers (2023-05-22T03:27:46Z)