CARE: A Molecular-Guided Foundation Model with Adaptive Region Modeling for Whole Slide Image Analysis
- URL: http://arxiv.org/abs/2602.21637v2
- Date: Thu, 05 Mar 2026 03:47:19 GMT
- Title: CARE: A Molecular-Guided Foundation Model with Adaptive Region Modeling for Whole Slide Image Analysis
- Authors: Di Zhang, Zhangpeng Gong, Xiaobo Pang, Jiashuai Liu, Junbo Lu, Hao Cui, Jiusong Ge, Zhi Zeng, Kai Yi, Yinghua Li, Si Liu, Tingsong Yu, Haoran Wang, Mireia Crispin-Ortuzar, Weimiao Yu, Chen Li, Zeyu Gao,
- Abstract summary: We present Cross-modal Adaptive Region (CARE), a foundation model for pathology that automatically partitions whole-slide images into morphologically relevant regions.<n>Based on only one-tenth of the pretraining data typically used by mainstream foundation models, CARE achieves superior average performance across 33 downstream benchmarks.
- Score: 23.45449218534003
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Foundation models have recently achieved impressive success in computational pathology, demonstrating strong generalization across diverse histopathology tasks. However, existing models overlook the heterogeneous and non-uniform organization of pathological regions of interest (ROIs) because they rely on natural image backbones not tailored for tissue morphology. Consequently, they often fail to capture the coherent tissue architecture beyond isolated patches, limiting interpretability and clinical relevance. To address these challenges, we present Cross-modal Adaptive Region Encoder (CARE), a foundation model for pathology that automatically partitions WSIs into several morphologically relevant regions. Specifically, CARE employs a two-stage pretraining strategy: (1) a self-supervised unimodal pretraining stage that learns morphological representations from 34,277 whole-slide images (WSIs) without segmentation annotations, and (2) a cross-modal alignment stage that leverages RNA and protein profiles to refine the construction and representation of adaptive regions. This molecular guidance enables CARE to identify biologically relevant patterns and generate irregular yet coherent tissue regions, selecting the most representative area as ROI. CARE supports a broad range of pathology-related tasks, using either the ROI feature or the slide-level feature obtained by aggregating adaptive regions. Based on only one-tenth of the pretraining data typically used by mainstream foundation models, CARE achieves superior average performance across 33 downstream benchmarks, including morphological classification, molecular prediction, and survival analysis, and outperforms other foundation model baselines overall.
Related papers
- Benchmarking Computational Pathology Foundation Models For Semantic Segmentation [0.0]
Concatenating features from CONCH, PathDino and CellViT outperformed individual models across all the datasets by 7.95%.<n>We show that the vision language foundation model, CONCH performed the best across datasets when compared to vision-only foundation models.
arXiv Detail & Related papers (2026-02-21T08:00:46Z) - Towards Spatial Transcriptomics-driven Pathology Foundation Models [32.70436266943553]
We introduce a vision-omics self-supervised learning framework that infuses localized molecular information into pathology vision encoders.<n>We instantiate SEAL by training on over 700,000 paired gene expression spot-tissue region examples spanning tumor and normal samples from 14 organs.<n> SEAL encoders exhibit robust domain generalization on out-of-distribution and enable new cross-modal capabilities such as gene-to-image retrieval.
arXiv Detail & Related papers (2026-02-15T15:06:45Z) - EXAONE Path 2.5: Pathology Foundation Model with Multi-Omics Alignment [7.030162358506499]
We present EXAONE Path 2.5, a pathology foundation model that jointly models histologic, genomic, epigenetic and transcriptomic modalities.<n>We evaluate EXAONE Path 2.5 against six leading pathology foundation models across two complementary benchmarks.
arXiv Detail & Related papers (2025-12-16T02:31:53Z) - A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis [82.01597026329158]
We introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS) for pathology-specific text-to-image synthesis.<n>CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy.<n>This model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations.
arXiv Detail & Related papers (2025-12-15T10:22:43Z) - From Classification to Cross-Modal Understanding: Leveraging Vision-Language Models for Fine-Grained Renal Pathology [9.268389327736735]
We model fine-grained glomerular subtyping as a clinically realistic few-shot problem.<n>We evaluate both pathology-specialized and general-purpose vision-language models under this setting.
arXiv Detail & Related papers (2025-11-15T01:44:11Z) - GRASPing Anatomy to Improve Pathology Segmentation [67.98147643529309]
We introduce GRASP, a modular plug-and-play framework that enhances pathology segmentation models.<n>We evaluate GRASP on two PET/CT datasets, conduct systematic ablation studies, and investigate the framework's inner workings.
arXiv Detail & Related papers (2025-08-05T12:26:36Z) - PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained featured extractors.<n>Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E stained histopathology images.<n>Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z) - MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention [57.044719143401664]
Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease.<n>We present MIRROR, a novel multi-modal representation learning method designed to foster both modality alignment and retention.<n>Extensive evaluations on TCGA cohorts for cancer subtyping and survival analysis highlight MIRROR's superior performance.
arXiv Detail & Related papers (2025-03-01T07:02:30Z) - Revisiting Adaptive Cellular Recognition Under Domain Shifts: A Contextual Correspondence View [49.03501451546763]
We identify the importance of implicit correspondences across biological contexts for exploiting domain-invariant pathological composition.
We propose self-adaptive dynamic distillation to secure instance-aware trade-offs across different model constituents.
arXiv Detail & Related papers (2024-07-14T04:41:16Z) - HATs: Hierarchical Adaptive Taxonomy Segmentation for Panoramic Pathology Image Analysis [19.04633470168871]
Panoramic image segmentation in computational pathology presents a remarkable challenge due to the morphologically complex and variably scaled anatomy.
In this paper, we propose a novel Hierarchical Adaptive Taxonomy (HATs) method, which is designed to thoroughly segment panoramic views of kidney structures by leveraging detailed anatomical insights.
Our approach entails (1) the innovative HATs technique which translates spatial relationships among 15 distinct object classes into a versatile "plug-and-play" loss function that spans across regions, functional units, and cells, (2) the incorporation of anatomical hierarchies and scale considerations into a unified simple matrix representation for all panoramic entities, and (3) the
arXiv Detail & Related papers (2024-06-30T05:35:26Z) - Shape-aware Meta-learning for Generalizing Prostate MRI Segmentation to
Unseen Domains [68.73614619875814]
We present a novel shape-aware meta-learning scheme to improve the model generalization in prostate MRI segmentation.
Experimental results show that our approach outperforms many state-of-the-art generalization methods consistently across all six settings of unseen domains.
arXiv Detail & Related papers (2020-07-04T07:56:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.