The Next Layer: Augmenting Foundation Models with Structure-Preserving and Attention-Guided Learning for Local Patches to Global Context Awareness in Computational Pathology
- URL: http://arxiv.org/abs/2508.19914v1
- Date: Wed, 27 Aug 2025 14:19:38 GMT
- Title: The Next Layer: Augmenting Foundation Models with Structure-Preserving and Attention-Guided Learning for Local Patches to Global Context Awareness in Computational Pathology
- Authors: Muhammad Waqas, Rukhmini Bandyopadhyay, Eman Showkatian, Amgad Muneer, Anas Zafar, Frank Rojas Alvarez, Maricel Corredor Marin, Wentao Li, David Jaffray, Cara Haymaker, John Heymach, Natalie I Vokes, Luisa Maren Solis Soto, Jianjun Zhang, Jia Wu
- Abstract summary: We present EAGLE-Net, a structure-preserving, attention-guided MIL architecture designed to augment prediction and interpretability. We benchmarked it on large pan-cancer datasets, including 3 cancer types for classification (10,260 slides) and 7 cancer types for survival prediction (4,172 slides).
- Score: 23.32822092398391
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Foundation models have recently emerged as powerful feature extractors in computational pathology, yet they typically omit mechanisms for leveraging the global spatial structure of tissues and the local contextual relationships among diagnostically relevant regions - key elements for understanding the tumor microenvironment. Multiple instance learning (MIL) remains an essential next step following foundation model, designing a framework to aggregate patch-level features into slide-level predictions. We present EAGLE-Net, a structure-preserving, attention-guided MIL architecture designed to augment prediction and interpretability. EAGLE-Net integrates multi-scale absolute spatial encoding to capture global tissue architecture, a top-K neighborhood-aware loss to focus attention on local microenvironments, and background suppression loss to minimize false positives. We benchmarked EAGLE-Net on large pan-cancer datasets, including three cancer types for classification (10,260 slides) and seven cancer types for survival prediction (4,172 slides), using three distinct histology foundation backbones (REMEDIES, Uni-V1, Uni2-h). Across tasks, EAGLE-Net achieved up to 3% higher classification accuracy and the top concordance indices in 6 of 7 cancer types, producing smooth, biologically coherent attention maps that aligned with expert annotations and highlighted invasive fronts, necrosis, and immune infiltration. These results position EAGLE-Net as a generalizable, interpretable framework that complements foundation models, enabling improved biomarker discovery, prognostic modeling, and clinical decision support.
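The abstract describes MIL as the step that aggregates patch-level features into slide-level predictions, with attention guiding which patches matter. As a rough illustration of that aggregation step only, the sketch below implements generic attention-based MIL pooling in plain Python. It is not EAGLE-Net's implementation: the spatial encoding and the two auxiliary losses are omitted, and the function name `attention_mil_pool` and the parameters `V` and `w` are hypothetical.

```python
import math
import random


def attention_mil_pool(patch_features, V, w):
    """Pool N patch embeddings (each of length D) into one slide embedding.

    Generic attention pooling, not the paper's method:
      score_i  = w . tanh(V h_i)
      weight_i = softmax(score)_i
      slide    = sum_i weight_i * h_i
    V (H rows of length D) and w (length H) stand in for learned parameters.
    """
    scores = []
    for h in patch_features:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wk * ak for wk, ak in zip(w, hidden)))

    # Softmax over patches, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted average of the patch embeddings.
    dim = len(patch_features[0])
    slide = [sum(a * h[d] for a, h in zip(weights, patch_features))
             for d in range(dim)]
    return slide, weights


# Toy usage: 100 random "patches" with 8-dimensional features.
rng = random.Random(0)
patches = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(100)]
V = [[rng.gauss(0.0, 0.1) for _ in range(8)] for _ in range(4)]
w = [rng.gauss(0.0, 1.0) for _ in range(4)]
slide_embedding, attention = attention_mil_pool(patches, V, w)
```

The returned `attention` list holds one non-negative weight per patch, summing to 1; mapping those weights back to patch coordinates is what yields an attention heatmap over the slide.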
Related papers
- The Geometry of Transfer: Unlocking Medical Vision Manifolds for Training-Free Model Ranking [31.961181244685932]
We propose a novel Topology-Driven Transferability Estimation framework that evaluates manifold tractability rather than statistical overlap.
Our approach significantly outperforms state-of-the-art baselines, with a relative improvement of around 31% in the weighted Kendall.
arXiv Detail & Related papers (2026-02-27T11:04:15Z)
- Using Unsupervised Domain Adaptation Semantic Segmentation for Pulmonary Embolism Detection in Computed Tomography Pulmonary Angiogram (CTPA) Images [0.0]
An unsupervised domain adaptation (UDA) framework is proposed, utilizing a Transformer backbone and a Mean-Teacher architecture for cross-center semantic segmentation.
The primary focus is placed on enhancing pseudo-label reliability by learning deep structural information within the feature space.
Experimental validation conducted on cross-center datasets (FUMPE and CAD-PE) demonstrates significant performance gains.
arXiv Detail & Related papers (2026-02-23T14:33:24Z)
- EXAONE Path 2.5: Pathology Foundation Model with Multi-Omics Alignment [7.030162358506499]
We present EXAONE Path 2.5, a pathology foundation model that jointly models histologic, genomic, epigenetic and transcriptomic modalities.
We evaluate EXAONE Path 2.5 against six leading pathology foundation models across two complementary benchmarks.
arXiv Detail & Related papers (2025-12-16T02:31:53Z)
- PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer [54.958921946378304]
We introduce PanFoMa, a lightweight hybrid neural network that combines the strengths of Transformers and state-space models.
PanFoMa consists of a front-end local-context encoder with shared self-attention layers to capture complex, order-independent gene interactions.
We also construct a large-scale pan-cancer single-cell benchmark, PanFoMaBench, containing over 3.5 million high-quality cells.
arXiv Detail & Related papers (2025-12-02T08:31:31Z)
- Topology-Constrained Learning for Efficient Laparoscopic Liver Landmark Detection [46.2391319253146]
Liver landmarks provide crucial anatomical guidance to the surgeon during laparoscopic liver surgery.
TopoNet is a novel topology-constrained learning framework for laparoscopic liver landmark detection.
Our framework adopts a snake-CNN dual-path encoder to simultaneously capture detailed RGB texture information and depth-informed topological structures.
arXiv Detail & Related papers (2025-07-01T07:35:36Z)
- HieraEdgeNet: A Multi-Scale Edge-Enhanced Framework for Automated Pollen Recognition [10.159338629617919]
We introduce HieraEdgeNet, a multi-scale edge-enhancement framework for automated pollen recognition.
The framework's core innovation is the introduction of three synergistic modules.
On a large-scale dataset, HieraEdgeNet achieves a mean Average Precision (mAP@.5) of 0.9501, significantly outperforming state-of-the-art baseline models.
arXiv Detail & Related papers (2025-06-09T11:03:31Z)
- A Graph-Based Framework for Interpretable Whole Slide Image Analysis [86.37618055724441]
We develop a framework that transforms whole-slide images into biologically-informed graph representations.
Our approach builds graph nodes from tissue regions that respect natural structures, not arbitrary grids.
We demonstrate strong performance on challenging cancer staging and survival prediction tasks.
arXiv Detail & Related papers (2025-03-14T20:15:04Z)
- Leveraging Vision-Language Embeddings for Zero-Shot Learning in Histopathology Images [7.048241543461529]
We propose a novel framework called Multi-Resolution Prompt-guided Hybrid Embedding (MR-PHE) to address these challenges in zero-shot histopathology image classification.
We introduce a hybrid embedding strategy that integrates global image embeddings with weighted patch embeddings.
A similarity-based patch weighting mechanism assigns attention-like weights to patches based on their relevance to class embeddings.
arXiv Detail & Related papers (2025-03-13T12:18:37Z)
- Comparative Analysis of Multi-Omics Integration Using Advanced Graph Neural Networks for Cancer Classification [40.45049709820343]
Multi-omics data integration poses significant challenges due to the high dimensionality, data complexity, and distinct characteristics of various omics types.
This study evaluates three graph neural network architectures for multi-omics (MO) integration based on graph-convolutional networks (GCN), graph-attention networks (GAT), and graph-transformer networks (GTN).
arXiv Detail & Related papers (2024-10-05T16:17:44Z)
- Shifting Focus: From Global Semantics to Local Prominent Features in Swin-Transformer for Knee Osteoarthritis Severity Assessment [42.09313885494969]
We harness the Swin Transformer's capacity to discern extended spatial dependencies within images through the hierarchical framework.
Our novel contribution lies in refining local feature representations, orienting them specifically toward the final distribution of the classifier.
Our model demonstrates significant robustness and precision, as evidenced by extensive validation of two established benchmarks for Knee OsteoArthritis (KOA) grade classification.
arXiv Detail & Related papers (2024-03-15T01:09:58Z)
- HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction [16.060286162384536]
HistGen is a learning-empowered framework for histopathology report generation.
It aims to boost report generation by aligning whole slide images (WSIs) and diagnostic reports from local and global granularity.
Experimental results on WSI report generation show the proposed model outperforms state-of-the-art (SOTA) models by a large margin.
arXiv Detail & Related papers (2024-03-08T15:51:43Z)
- Neural Distance Embeddings for Biological Sequences [43.07977514121458]
We present NeuroSEED, a framework to embed sequences in geometric vector spaces.
We show the effectiveness of the hyperbolic space that captures the hierarchical structure and provides an average 22% reduction in embedding RMSE.
The proposed approaches display significant accuracy and/or runtime improvements on real-world datasets.
arXiv Detail & Related papers (2021-09-20T17:30:58Z)
- Whole Slide Images are 2D Point Clouds: Context-Aware Survival Prediction using Patch-based Graph Convolutional Networks [6.427108174481534]
We present Patch-GCN, a context-aware, spatially-resolved patch-based graph convolutional network that hierarchically aggregates instance-level histology features.
We demonstrate that Patch-GCN outperforms all prior weakly-supervised approaches by 3.58-9.46%.
arXiv Detail & Related papers (2021-07-27T19:17:37Z)
- Topology-Aware Segmentation Using Discrete Morse Theory [38.65353702366932]
We propose a new approach to train deep image segmentation networks for better topological accuracy.
We identify global structures, including 1D skeletons and 2D patches, which are important for topological accuracy.
On diverse datasets, our method achieves superior performance on both the DICE score and topological metrics.
arXiv Detail & Related papers (2021-03-18T02:47:21Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.