A Large-Scale Benchmark of Cross-Modal Learning for Histology and Gene Expression in Spatial Transcriptomics
- URL: http://arxiv.org/abs/2508.01490v1
- Date: Sat, 02 Aug 2025 21:11:36 GMT
- Title: A Large-Scale Benchmark of Cross-Modal Learning for Histology and Gene Expression in Spatial Transcriptomics
- Authors: Rushin H. Gindra, Giovanni Palla, Mathias Nguyen, Sophia J. Wagner, Manuel Tran, Fabian J Theis, Dieter Saur, Lorin Crawford, Tingying Peng
- Abstract summary: HESCAPE is a benchmark for evaluating cross-modal contrastive pretraining in spatial transcriptomics. Gene models pretrained on spatial transcriptomics data outperform both those trained without spatial data and simple baseline approaches. We identify batch effects as a key factor that interferes with effective cross-modal alignment.
- Score: 2.3070195554676993
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spatial transcriptomics enables simultaneous measurement of gene expression and tissue morphology, offering unprecedented insights into cellular organization and disease mechanisms. However, the field lacks comprehensive benchmarks for evaluating multimodal learning methods that leverage both histology images and gene expression data. Here, we present HESCAPE, a large-scale benchmark for cross-modal contrastive pretraining in spatial transcriptomics, built on a curated pan-organ dataset spanning 6 different gene panels and 54 donors. We systematically evaluated state-of-the-art image and gene expression encoders across multiple pretraining strategies and assessed their effectiveness on two downstream tasks: gene mutation classification and gene expression prediction. Our benchmark demonstrates that gene expression encoders are the primary determinant of strong representational alignment, and that gene models pretrained on spatial transcriptomics data outperform both those trained without spatial data and simple baseline approaches. However, downstream task evaluation reveals a striking contradiction: while contrastive pretraining consistently improves gene mutation classification performance, it degrades direct gene expression prediction compared to baseline encoders trained without cross-modal objectives. We identify batch effects as a key factor that interferes with effective cross-modal alignment. Our findings highlight the critical need for batch-robust multimodal learning approaches in spatial transcriptomics. To accelerate progress in this direction, we release HESCAPE, providing standardized datasets, evaluation protocols, and benchmarking tools for the community.
Related papers
- Gene-DML: Dual-Pathway Multi-Level Discrimination for Gene Expression Prediction from Histopathology Images [5.638556074980827]
Accurately predicting gene expression from histopathology images offers a scalable and non-invasive approach to molecular profiling. Existing methods often underutilize the cross-modal representation alignment between histopathology images and gene expression profiles. We propose Gene-DML, a unified framework that structures latent space through Dual-pathway Multi-Level discrimination.
arXiv Detail & Related papers (2025-07-19T15:45:12Z) - GRAPE: Heterogeneous Graph Representation Learning for Genetic Perturbation with Coding and Non-Coding Biotype [51.58774936662233]
Building gene regulatory networks (GRN) is essential to understand and predict the effects of genetic perturbations. In this work, we leverage a pre-trained large language model and a DNA sequence model to extract features from gene descriptions and DNA sequence data. We introduce gene biotype information for the first time in genetic perturbation, simulating the distinct roles of genes with different biotypes in regulating cellular processes.
arXiv Detail & Related papers (2025-05-06T03:35:24Z) - Completing Spatial Transcriptomics Data for Gene Expression Prediction Benchmarking [1.177642303362119]
We introduce SpaRED, a database comprising 26 public datasets, and SpaCKLE, a state-of-the-art transformer-based gene expression completion model. Our contributions constitute the most comprehensive benchmark of gene expression prediction from histology images to date.
arXiv Detail & Related papers (2025-05-05T19:17:29Z) - A Misclassification Network-Based Method for Comparative Genomic Analysis [3.7671415694914927]
Classifying genome sequences based on metadata has been an active area of research in comparative genomics for decades. In this study, we integrate AI and network science approaches to develop a comparative genomic analysis framework.
arXiv Detail & Related papers (2024-12-09T23:22:15Z) - RankByGene: Gene-Guided Histopathology Representation Learning Through Cross-Modal Ranking Consistency [11.813883157319381]
We propose a novel framework that aligns gene and image features using a ranking-based alignment loss. To further enhance the alignment's stability, we employ self-supervised knowledge distillation with a teacher-student network architecture.
arXiv Detail & Related papers (2024-11-22T17:08:28Z) - SpaRED benchmark: Enhancing Gene Expression Prediction from Histology Images with Spatial Transcriptomics Completion [2.032350440475489]
We present a systematically curated and processed database collected from 26 public sources.
We also propose a state-of-the-art transformer-based completion technique for inferring missing gene expression.
Our contributions constitute the most comprehensive benchmark of gene expression prediction from histology images to date.
arXiv Detail & Related papers (2024-07-17T21:28:20Z) - Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer to improve genetic mutation prediction performances.
BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules.
BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z) - Efficient and Scalable Fine-Tune of Language Models for Genome Understanding [49.606093223945734]
We present Lingo: Language prefix fIne-tuning for GenOmes.
Unlike DNA foundation models, Lingo strategically leverages natural language foundation models' contextual cues.
Lingo further accommodates numerous downstream fine-tune tasks by an adaptive rank sampling method.
arXiv Detail & Related papers (2024-02-12T21:40:45Z) - Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z) - CausalBench: A Large-scale Benchmark for Network Inference from Single-cell Perturbation Data [61.088705993848606]
We introduce CausalBench, a benchmark suite for evaluating causal inference methods on real-world interventional data.
CausalBench incorporates biologically-motivated performance metrics, including new distribution-based interventional metrics.
arXiv Detail & Related papers (2022-10-31T13:04:07Z) - G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z) - A Novel Granular-Based Bi-Clustering Method of Deep Mining the Co-Expressed Genes [76.84066556597342]
Bi-clustering methods are used to mine bi-clusters whose subsets of samples (genes) are co-regulated under their test conditions.
Unfortunately, traditional bi-clustering methods are not fully effective in discovering such bi-clusters.
We propose a novel bi-clustering method based on the theory of Granular Computing.
arXiv Detail & Related papers (2020-05-12T02:04:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.