HiFusion: Hierarchical Intra-Spot Alignment and Regional Context Fusion for Spatial Gene Expression Prediction from Histopathology
- URL: http://arxiv.org/abs/2511.12969v2
- Date: Wed, 19 Nov 2025 05:37:46 GMT
- Title: HiFusion: Hierarchical Intra-Spot Alignment and Regional Context Fusion for Spatial Gene Expression Prediction from Histopathology
- Authors: Ziqiao Weng, Yaoyu Fang, Jiahe Qian, Xinkun Wang, Lee AD Cooper, Weidong Cai, Bo Zhou
- Abstract summary: HiFusion is a novel deep learning framework that integrates two complementary components. We show that HiFusion achieves state-of-the-art performance across both 2D slide-wise cross-validation and more challenging 3D sample-specific scenarios. These results underscore HiFusion's potential as a robust, accurate, and scalable solution for ST inference from routine histopathology.
- Score: 7.982889842329205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatial transcriptomics (ST) bridges gene expression and tissue morphology but faces clinical adoption barriers due to technical complexity and prohibitive costs. While computational methods predict gene expression from H&E-stained whole-slide images (WSIs), existing approaches often fail to capture the intricate biological heterogeneity within spots and are susceptible to morphological noise when integrating contextual information from surrounding tissue. To overcome these limitations, we propose HiFusion, a novel deep learning framework that integrates two complementary components. First, we introduce the Hierarchical Intra-Spot Modeling module that extracts fine-grained morphological representations through multi-resolution sub-patch decomposition, guided by a feature alignment loss to ensure semantic consistency across scales. Concurrently, we present the Context-aware Cross-scale Fusion module, which employs cross-attention to selectively incorporate biologically relevant regional context, thereby enhancing representational capacity. This architecture enables comprehensive modeling of both cellular-level features and tissue microenvironmental cues, which are essential for accurate gene expression prediction. Extensive experiments on two benchmark ST datasets demonstrate that HiFusion achieves state-of-the-art performance across both 2D slide-wise cross-validation and more challenging 3D sample-specific scenarios. These results underscore HiFusion's potential as a robust, accurate, and scalable solution for ST inference from routine histopathology.
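The two components described in the abstract can be illustrated with a minimal NumPy sketch. This is an assumed toy rendering, not the authors' implementation: the sub-patch grid size, feature dimensions, and the single-head, weight-free attention are all illustrative choices; in HiFusion the projections would be learned and the alignment loss would tie the sub-patch scales together.

```python
import numpy as np

rng = np.random.default_rng(0)

def subpatch_decompose(patch: np.ndarray, grid: int) -> np.ndarray:
    """Hierarchical intra-spot view: split an H x W x C spot patch
    into a grid x grid set of equal sub-patches (one resolution level)."""
    h, w, c = patch.shape
    sh, sw = h // grid, w // grid
    return np.stack([
        patch[i * sh:(i + 1) * sh, j * sw:(j + 1) * sw]
        for i in range(grid)
        for j in range(grid)
    ])  # shape: (grid * grid, sh, sw, c)

def cross_attention_fuse(spot: np.ndarray, context: np.ndarray) -> np.ndarray:
    """Context-aware fusion: a spot embedding (d,) attends over regional
    context tokens (n, d) via scaled dot-product attention, then is fused
    residually. Single head, no learned projections (illustrative only)."""
    d = spot.shape[0]
    scores = context @ spot / np.sqrt(d)      # (n,) similarity per context token
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax attention weights
    attended = weights @ context              # (d,) weighted context summary
    return spot + attended                    # residual fusion with the spot

# Toy example: a 224x224 RGB spot patch and 8 regional context tokens of dim 16.
patch = rng.random((224, 224, 3))
subs = subpatch_decompose(patch, grid=2)      # 4 sub-patches of 112x112x3
fused = cross_attention_fuse(rng.random(16), rng.random((8, 16)))
print(subs.shape, fused.shape)
```

Repeating `subpatch_decompose` with different `grid` values yields the multi-resolution hierarchy; a feature alignment loss between encodings of the levels would then enforce the semantic consistency across scales that the abstract describes.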
Related papers
- EXAONE Path 2.5: Pathology Foundation Model with Multi-Omics Alignment [7.030162358506499]
We present EXAONE Path 2.5, a pathology foundation model that jointly models histologic, genomic, epigenetic and transcriptomic modalities. We evaluate EXAONE Path 2.5 against six leading pathology foundation models across two complementary benchmarks.
arXiv Detail & Related papers (2025-12-16T02:31:53Z) - A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis [82.01597026329158]
We introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS) for pathology-specific text-to-image synthesis. CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy. This model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations.
arXiv Detail & Related papers (2025-12-15T10:22:43Z) - Dual-Path Knowledge-Augmented Contrastive Alignment Network for Spatially Resolved Transcriptomics [5.455957568203595]
High cost has driven efforts to predict spatial gene expression from whole slide images. Current methods face significant limitations, such as under-exploitation of high-level biological context. We propose DKAN, a novel Dual-path Knowledge-Augmented contrastive alignment Network.
arXiv Detail & Related papers (2025-11-21T10:58:04Z) - C3-Diff: Super-resolving Spatial Transcriptomics via Cross-modal Cross-content Contrastive Diffusion Modelling [5.986183062217602]
This study presents a cross-modal cross-content contrastive diffusion framework, called C3-Diff, for ST enhancement with histology images as guidance. To overcome the problem of low sequencing sensitivity in ST maps, we perform noising-based information augmentation on the surface of the feature unit hypersphere. We propose a dynamic cross-modal imputation-based training strategy to mitigate ST data scarcity.
arXiv Detail & Related papers (2025-11-04T13:12:25Z) - AdaFusion: Prompt-Guided Inference with Adaptive Fusion of Pathology Foundation Models [49.550545038402184]
We propose AdaFusion, a novel prompt-guided inference framework. Our method compresses and aligns tile-level features from diverse models. AdaFusion consistently surpasses individual PFMs across both classification and regression tasks.
arXiv Detail & Related papers (2025-08-07T07:09:31Z) - Adaptive Spatial Transcriptomics Interpolation via Cross-modal Cross-slice Modeling [26.230748488216648]
Spatial transcriptomics (ST) is a technique that characterizes the spatial gene profiling patterns within the tissue context. We propose C2-STi, the first attempt at interpolating missing ST slices at arbitrary intermediate positions between adjacent ST slices.
arXiv Detail & Related papers (2025-05-15T22:14:39Z) - PH2ST: ST-Prompt Guided Histological Hypergraph Learning for Spatial Gene Expression Prediction [9.420121324844066]
We propose PH2ST, a prompt-guided hypergraph learning framework, to guide multi-scale histological representation learning for spatial gene expression prediction. PH2ST not only outperforms existing state-of-the-art methods, but also shows strong potential for practical applications such as imputing missing spots, ST super-resolution, and local-to-global prediction.
arXiv Detail & Related papers (2025-03-21T03:10:43Z) - MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention [57.044719143401664]
Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease. We present MIRROR, a novel multi-modal representation learning method designed to foster both modality alignment and retention. Extensive evaluations on TCGA cohorts for cancer subtyping and survival analysis highlight MIRROR's superior performance.
arXiv Detail & Related papers (2025-03-01T07:02:30Z) - Revisiting Adaptive Cellular Recognition Under Domain Shifts: A Contextual Correspondence View [49.03501451546763]
We identify the importance of implicit correspondences across biological contexts for exploiting domain-invariant pathological composition.
We propose self-adaptive dynamic distillation to secure instance-aware trade-offs across different model constituents.
arXiv Detail & Related papers (2024-07-14T04:41:16Z) - Cross-modal Diffusion Modelling for Super-resolved Spatial Transcriptomics [5.904688354944791]
Spatial transcriptomics allows characterization of spatial gene expression within tissue for discovery research. Super-resolution approaches promise to enhance ST maps by integrating histology images with gene expressions of profiled tissue spots. This paper proposes a cross-modal conditional diffusion model for super-resolving ST maps with the guidance of histology images.
arXiv Detail & Related papers (2024-04-19T16:01:00Z) - SIAN: Style-Guided Instance-Adaptive Normalization for Multi-Organ Histopathology Image Synthesis [63.845552349914186]
We propose a style-guided instance-adaptive normalization (SIAN) to synthesize realistic color distributions and textures for different organs.
The four phases work together and are integrated into a generative network to embed image semantics, style, and instance-level boundaries.
arXiv Detail & Related papers (2022-09-02T16:45:46Z) - Data-driven generation of plausible tissue geometries for realistic photoacoustic image synthesis [53.65837038435433]
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties.
We propose a novel approach to PAT data simulation, which we refer to as "learning to simulate".
We leverage the concept of Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data to generate plausible tissue geometries.
arXiv Detail & Related papers (2021-03-29T11:30:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.