PAST: A multimodal single-cell foundation model for histopathology and spatial transcriptomics in cancer
- URL: http://arxiv.org/abs/2507.06418v1
- Date: Tue, 08 Jul 2025 21:51:25 GMT
- Title: PAST: A multimodal single-cell foundation model for histopathology and spatial transcriptomics in cancer
- Authors: Changchun Yang, Haoyang Li, Yushuai Wu, Yilan Zhang, Yifeng Jiao, Yu Zhang, Rihan Huang, Yuan Cheng, Yuan Qi, Xin Guo, Xin Gao
- Abstract summary: PAST is a pan-cancer single-cell foundation model trained on 20 million paired histopathology images and single-cell transcriptomes. It predicts single-cell gene expression, virtual molecular staining, and multimodal survival analysis directly from routine pathology slides. Our work establishes a new paradigm for pathology foundation models, providing a versatile tool for high-resolution spatial omics, mechanistic discovery, and precision cancer research.
- Score: 26.795192024462963
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While pathology foundation models have transformed cancer image analysis, they often lack integration with molecular data at single-cell resolution, limiting their utility for precision oncology. Here, we present PAST, a pan-cancer single-cell foundation model trained on 20 million paired histopathology images and single-cell transcriptomes spanning multiple tumor types and tissue contexts. By jointly encoding cellular morphology and gene expression, PAST learns unified cross-modal representations that capture both spatial and molecular heterogeneity at the cellular level. This approach enables accurate prediction of single-cell gene expression, virtual molecular staining, and multimodal survival analysis directly from routine pathology slides. Across diverse cancers and downstream tasks, PAST consistently exceeds the performance of existing approaches, demonstrating robust generalizability and scalability. Our work establishes a new paradigm for pathology foundation models, providing a versatile tool for high-resolution spatial omics, mechanistic discovery, and precision cancer research.
Related papers
- Integrating Pathology Foundation Models and Spatial Transcriptomics for Cellular Decomposition from Histology Images [0.0]
We propose a lightweight and training-efficient approach to predict cellular composition directly from histology images. By training a lightweight multi-layer perceptron (MLP) regressor on cell-type abundances derived via cell2location, our method efficiently distills knowledge from pathology foundation models.
arXiv Detail & Related papers (2025-07-09T16:43:04Z)
- Graph Kolmogorov-Arnold Networks for Multi-Cancer Classification and Biomarker Identification, An Interpretable Multi-Omics Approach [36.92842246372894]
Multi-Omics Graph Kolmogorov-Arnold Network (MOGKAN) is a deep learning framework that utilizes messenger-RNA, micro-RNA sequences, and DNA methylation samples. By integrating multi-omics data with graph-based deep learning, our proposed approach demonstrates robust predictive performance and interpretability.
arXiv Detail & Related papers (2025-03-29T02:14:05Z)
- MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention [52.106879463828044]
Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease. We present MIRROR, a novel multi-modal representation learning method designed to foster both modality alignment and retention. Extensive evaluations on TCGA cohorts for cancer subtyping and survival analysis highlight MIRROR's superior performance.
arXiv Detail & Related papers (2025-03-01T07:02:30Z)
- Joint Modelling Histology and Molecular Markers for Cancer Classification [4.267476747447838]
We introduce a novel digital pathology approach to jointly predict molecular markers and histology features. Our method outperforms other state-of-the-art methods in classifying glioma, histology features and molecular markers.
arXiv Detail & Related papers (2025-02-11T21:52:32Z)
- Multimodal Prototyping for cancer survival prediction [45.61869793509184]
Multimodal survival methods combining gigapixel histology whole-slide images (WSIs) and transcriptomic profiles are particularly promising for patient prognostication and stratification.
Current approaches involve tokenizing the WSIs into smaller patches (>10,000 patches) and transcriptomics into gene groups, which are then integrated using a Transformer for predicting outcomes.
This process generates many tokens, which leads to high memory requirements for computing attention and complicates post-hoc interpretability analyses.
Our framework outperforms state-of-the-art methods with much less computation while unlocking new interpretability analyses.
arXiv Detail & Related papers (2024-06-28T20:37:01Z)
- Pathology-genomic fusion via biologically informed cross-modality graph learning for survival analysis [7.996257103473235]
We propose Pathology-Genome Heterogeneous Graph (PGHG) that integrates whole slide images (WSI) and bulk RNA-Seq expression data with heterogeneous graph neural network for cancer survival analysis.
The PGHG consists of biological knowledge-guided representation learning network and pathology-genome heterogeneous graph.
We evaluate the model on low-grade gliomas, glioblastoma, and kidney renal papillary cell carcinoma datasets from the Cancer Genome Atlas.
arXiv Detail & Related papers (2024-04-11T09:07:40Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- MGCT: Mutual-Guided Cross-Modality Transformer for Survival Outcome Prediction using Integrative Histopathology-Genomic Features [2.3942863352287787]
Mutual-Guided Cross-Modality Transformer (MGCT) is a weakly-supervised, attention-based multimodal learning framework.
We propose MGCT to combine histology features and genomic features to model the genotype-phenotype interactions within the tumor microenvironment.
arXiv Detail & Related papers (2023-11-20T10:49:32Z)
- Artificial-intelligence-based molecular classification of diffuse gliomas using rapid, label-free optical imaging [59.79875531898648]
DeepGlioma is an artificial-intelligence-based diagnostic screening system.
DeepGlioma can predict the molecular alterations used by the World Health Organization to define the adult-type diffuse glioma taxonomy.
arXiv Detail & Related papers (2023-03-23T18:50:18Z)
- CancerUniT: Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans [45.83431075462771]
Human readers or radiologists routinely perform full-body multi-organ multi-disease detection and diagnosis in clinical practice.
Most medical AI systems are built to focus on single organs with a narrow list of a few diseases.
CancerUniT is a query-based Mask Transformer model with the output of multi-tumor prediction.
arXiv Detail & Related papers (2023-01-28T20:09:34Z)
- Pan-Cancer Integrative Histology-Genomic Analysis via Interpretable Multimodal Deep Learning [4.764927152701701]
We integrate whole slide pathology images, RNA-seq abundance, copy number variation, and mutation data from 5,720 patients across 14 major cancer types.
Our interpretable, weakly-supervised, multimodal deep learning algorithm is able to fuse these heterogeneous modalities for predicting outcomes.
We analyze morphologic and molecular markers responsible for prognostic predictions across all cancer types.
arXiv Detail & Related papers (2021-08-04T20:40:05Z)
- Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach.
We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.