Integrating Pathology Foundation Models and Spatial Transcriptomics for Cellular Decomposition from Histology Images
- URL: http://arxiv.org/abs/2507.07013v1
- Date: Wed, 09 Jul 2025 16:43:04 GMT
- Title: Integrating Pathology Foundation Models and Spatial Transcriptomics for Cellular Decomposition from Histology Images
- Authors: Yutong Sun, Sichen Zhu, Peng Qiu
- Abstract summary: We propose a lightweight and training-efficient approach to predict cellular composition directly from histology images. By training a lightweight multi-layer perceptron (MLP) regressor on cell-type abundances derived via cell2location, our method efficiently distills knowledge from pathology foundation models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid development of digital pathology and modern deep learning has facilitated the emergence of pathology foundation models that are expected to solve general pathology problems under various disease conditions in one unified model, with or without fine-tuning. In parallel, spatial transcriptomics has emerged as a transformative technology that enables the profiling of gene expression on hematoxylin and eosin (H&E) stained histology images. Spatial transcriptomics unlocks the unprecedented opportunity to dive into existing histology images at a more granular, cellular level. In this work, we propose a lightweight and training-efficient approach to predict cellular composition directly from H&E-stained histology images by leveraging information-enriched feature embeddings extracted from pre-trained pathology foundation models. By training a lightweight multi-layer perceptron (MLP) regressor on cell-type abundances derived via cell2location, our method efficiently distills knowledge from pathology foundation models and demonstrates the ability to accurately predict cell-type compositions from histology images, without physically performing the costly spatial transcriptomics. Our method demonstrates competitive performance compared to existing methods such as Hist2Cell, while significantly reducing computational complexity.
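The regression step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the embeddings and abundance targets are synthetic stand-ins (real inputs would be foundation-model features per spot and cell2location abundance estimates), and the layer sizes, learning rate, loss, and the names `init_mlp`, `predict`, and `fit` are all assumptions chosen for the example.

```python
import numpy as np


def init_mlp(in_dim, hidden_dim, out_dim, rng):
    """He-initialised weights for a one-hidden-layer MLP (sizes are assumptions)."""
    return {
        "W1": rng.normal(0.0, np.sqrt(2.0 / in_dim), (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, np.sqrt(2.0 / hidden_dim), (hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }


def predict(params, x):
    """Map a batch of embeddings to predicted cell-type abundances."""
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)  # ReLU hidden layer
    return h @ params["W2"] + params["b2"]


def fit(x, y, hidden_dim=32, lr=0.05, steps=300, seed=0):
    """Full-batch gradient descent on mean squared error."""
    rng = np.random.default_rng(seed)
    params = init_mlp(x.shape[1], hidden_dim, y.shape[1], rng)
    losses = []
    for _ in range(steps):
        # Forward pass, keeping the pre-activation for the ReLU gradient.
        pre = x @ params["W1"] + params["b1"]
        h = np.maximum(pre, 0.0)
        err = h @ params["W2"] + params["b2"] - y
        losses.append(float(np.mean(err ** 2)))

        # Backward pass for the MSE loss.
        g_out = 2.0 * err / err.size
        g_h = (g_out @ params["W2"].T) * (pre > 0.0)
        grads = {
            "W1": x.T @ g_h,
            "b1": g_h.sum(axis=0),
            "W2": h.T @ g_out,
            "b2": g_out.sum(axis=0),
        }
        for name in params:
            params[name] -= lr * grads[name]
    return params, losses


# Synthetic stand-ins: 256 spots, 64-dim embeddings, 5 cell types (all hypothetical).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(256, 64))             # placeholder for frozen embeddings
A = data_rng.normal(scale=0.125, size=(64, 5))  # hidden mapping used only to make targets
Y = np.maximum(X @ A, 0.0)                      # non-negative abundance-like targets

params, losses = fit(X, Y)
print(f"training MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Because the foundation model stays frozen in this setup, only the small regressor is trained, which is where the computational savings claimed in the abstract would come from.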
Related papers
- PAST: A multimodal single-cell foundation model for histopathology and spatial transcriptomics in cancer [26.795192024462963]
PAST is a pan-cancer single-cell foundation model trained on 20 million paired histopathology images and single-cell transcriptomes. It predicts single-cell gene expression, virtual molecular staining, and multimodal survival analysis directly from routine pathology slides. Our work establishes a new paradigm for pathology foundation models, providing a versatile tool for high-resolution spatial omics, mechanistic discovery, and precision cancer research.
arXiv Detail & Related papers (2025-07-08T21:51:25Z)
- PixCell: A generative foundation model for digital histopathology images [49.00921097924924]
We introduce PixCell, the first diffusion-based generative foundation model for histopathology. We train PixCell on PanCan-30M, a vast, diverse dataset derived from 69,184 H&E-stained whole slide images covering various cancer types.
arXiv Detail & Related papers (2025-06-05T15:14:32Z)
- CytoFM: The first cytology foundation model [3.591868126270513]
We introduce CytoFM, the first self-supervised foundation model for digital cytology. We pretrain CytoFM on a diverse collection of datasets to learn robust, transferable representations. Our results demonstrate that CytoFM performs better on two out of three downstream tasks than existing foundation models pretrained on histopathology.
arXiv Detail & Related papers (2025-04-18T01:37:50Z)
- Teaching pathology foundation models to accurately predict gene expression with parameter efficient knowledge transfer [1.5416321520529301]
Efficient Knowledge Adaptation (PEKA) is a novel framework that integrates knowledge distillation and structure alignment losses for cross-modal knowledge transfer. We evaluated PEKA for gene expression prediction using multiple spatial transcriptomics datasets.
arXiv Detail & Related papers (2025-04-09T17:24:41Z)
- MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention [52.106879463828044]
Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease. We present MIRROR, a novel multi-modal representation learning method designed to foster both modality alignment and retention. Extensive evaluations on TCGA cohorts for cancer subtyping and survival analysis highlight MIRROR's superior performance.
arXiv Detail & Related papers (2025-03-01T07:02:30Z)
- TopoCellGen: Generating Histopathology Cell Topology with a Diffusion Model [32.670806339139034]
We propose a novel approach that integrates topological constraints into a diffusion model to improve the generation of realistic, contextually accurate cell topologies. Our method refines the simulation of cell distributions and interactions, increasing the precision and interpretability of results in downstream tasks.
arXiv Detail & Related papers (2024-12-08T18:02:22Z)
- Multibranch Generative Models for Multichannel Imaging with an Application to PET/CT Synergistic Reconstruction [42.95604565673447]
This paper presents a novel approach for learned synergistic reconstruction of medical images using multibranch generative models. We demonstrate the efficacy of our approach on both Modified National Institute of Standards and Technology (MNIST) and positron emission tomography (PET)/computed tomography (CT) datasets.
arXiv Detail & Related papers (2024-04-12T18:21:08Z)
- LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models [42.922303491557244]
Patient data from real-world clinical practice often suffers from data scarcity and long-tail imbalances.
This study addresses these challenges by generating lesion-containing image-segmentation pairs from lesion-free images.
LeFusion-generated data significantly improves the performance of state-of-the-art segmentation models.
arXiv Detail & Related papers (2024-03-21T01:25:39Z)
- Tertiary Lymphoid Structures Generation through Graph-based Diffusion [54.37503714313661]
In this work, we leverage state-of-the-art graph-based diffusion models to generate biologically meaningful cell-graphs.
We show that the adopted graph diffusion model is able to accurately learn the distribution of cells in terms of their tertiary lymphoid structures (TLS) content.
arXiv Detail & Related papers (2023-10-10T14:37:17Z)
- A Morphology Focused Diffusion Probabilistic Model for Synthesis of Histopathology Images [0.5541644538483947]
Deep learning methods have made significant advances in the analysis and classification of tissue images.
These synthetic images have several applications in pathology including utilities in education, proficiency testing, privacy, and data sharing.
arXiv Detail & Related papers (2022-09-27T05:58:35Z)
- Data-driven generation of plausible tissue geometries for realistic photoacoustic image synthesis [53.65837038435433]
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties.
We propose a novel approach to PAT data simulation, which we refer to as "learning to simulate".
We leverage the concept of Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data to generate plausible tissue geometries.
arXiv Detail & Related papers (2021-03-29T11:30:18Z)
- Deep Low-Shot Learning for Biological Image Classification and Visualization from Limited Training Samples [52.549928980694695]
In situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared.
Labeling training data with precise stages is very time-consuming, even for biologists.
We propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images.
arXiv Detail & Related papers (2020-10-20T06:06:06Z)