DCMIL: A Progressive Representation Learning of Whole Slide Images for Cancer Prognosis Analysis
- URL: http://arxiv.org/abs/2510.14403v2
- Date: Fri, 17 Oct 2025 05:24:10 GMT
- Title: DCMIL: A Progressive Representation Learning of Whole Slide Images for Cancer Prognosis Analysis
- Authors: Chao Tu, Kun Huang, Jie Zhang, Qianjin Feng, Yu Zhang, Zhenyuan Ning,
- Abstract summary: We propose an easy-to-hard progressive representation learning, termed dual-curriculum contrastive multi-instance learning (DCMIL). Experiments on twelve cancer types (5,954 patients, 12.54 million tiles) demonstrate that DCMIL outperforms standard WSI-based prognostic models. DCMIL identifies fine-grained prognosis-salient regions, provides robust instance uncertainty estimation, and captures morphological differences between normal and tumor tissues.
- Score: 20.81711023855778
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The burgeoning discipline of computational pathology shows promise in harnessing whole slide images (WSIs) to quantify morphological heterogeneity and develop objective prognostic models for human cancers. However, progress is impeded by the computational bottleneck of gigapixel-size inputs and the scarcity of dense manual annotations. Current methods often overlook fine-grained information across multi-magnification WSIs and variations in tumor microenvironments. Here, we propose an easy-to-hard progressive representation learning, termed dual-curriculum contrastive multi-instance learning (DCMIL), to efficiently process WSIs for cancer prognosis. The model does not rely on dense annotations and enables the direct transformation of gigapixel-size WSIs into outcome predictions. Extensive experiments on twelve cancer types (5,954 patients, 12.54 million tiles) demonstrate that DCMIL outperforms standard WSI-based prognostic models. Additionally, DCMIL identifies fine-grained prognosis-salient regions, provides robust instance uncertainty estimation, and captures morphological differences between normal and tumor tissues, with the potential to generate new biological insights. All codes have been made publicly accessible at https://github.com/tuuuc/DCMIL.
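The abstract describes turning a bag of WSI tiles directly into an outcome prediction without dense annotations. A minimal sketch of generic attention-based MIL pooling, the family of models this builds on, is shown below; `V`, `w`, and `beta` stand in for learned parameters and the code is illustrative, not DCMIL's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_mil_risk(tile_embeddings, V, w, beta):
    """Aggregate tile embeddings into one slide embedding, then a risk score."""
    h = np.tanh(tile_embeddings @ V)         # (n_tiles, d_attn)
    s = h @ w                                # unnormalized attention logits
    a = np.exp(s - s.max())                  # stable softmax numerator
    a /= a.sum()                             # one weight per tile
    slide = a @ tile_embeddings              # attention-weighted slide embedding
    return float(slide @ beta)               # scalar prognostic risk

tiles = rng.normal(size=(500, 64))           # e.g. 500 tile features of dim 64
risk = attention_mil_risk(
    tiles,
    rng.normal(size=(64, 32)),               # stand-in attention projection
    rng.normal(size=32),                     # stand-in attention vector
    rng.normal(size=64),                     # stand-in risk head
)
```

In a trained model the attention weights `a` would also localize prognosis-salient tiles, which is how such models produce region-level saliency without tile-level labels.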
Related papers
- MIL vs. Aggregation: Evaluating Patient-Level Survival Prediction Strategies Using Graph-Based Learning [52.231128973251124]
We compare various strategies for predicting survival at the WSI and patient level. The former treats each WSI as an independent sample, mimicking the strategy adopted in other works. The latter comprises methods that either aggregate the predictions of several WSIs or automatically identify the most relevant slide.
arXiv Detail & Related papers (2025-03-29T11:14:02Z) - PathoHR: Breast Cancer Survival Prediction on High-Resolution Pathological Images [25.80544600229013]
We present PathoHR, a novel pipeline for accurate breast cancer survival prediction. Our approach entails the incorporation of a plug-and-play high-resolution Vision Transformer (ViT) to enhance patch-wise WSI representation.
arXiv Detail & Related papers (2025-03-23T07:37:24Z) - Towards a Comprehensive Benchmark for Pathological Lymph Node Metastasis in Breast Cancer Sections [21.75452517154339]
We reprocessed 1,399 whole slide images (WSIs) and labels from the Camelyon-16 and Camelyon-17 datasets.
Based on the sizes of re-annotated tumor regions, we upgraded the binary cancer screening task to a four-class task.
arXiv Detail & Related papers (2024-11-16T09:19:24Z) - Foundation Models for Slide-level Cancer Subtyping in Digital Pathology [1.7641392161755438]
This work aims to compare the performance of various feature extractors developed under different pretraining strategies for cancer subtyping on WSI under a MIL framework.
Results demonstrate the ability of foundation models to surpass ImageNet-pretrained models for the prediction of six skin cancer subtypes.
arXiv Detail & Related papers (2024-10-21T11:04:58Z) - Lung-CADex: Fully automatic Zero-Shot Detection and Classification of Lung Nodules in Thoracic CT Images [45.29301790646322]
Computer-aided diagnosis can help with early lung nodule detection and facilitate subsequent nodule characterization.
We propose CADe for segmenting lung nodules in a zero-shot manner using a variant of the Segment Anything Model called MedSAM.
We also propose CADx, a method for characterizing nodules as benign or malignant by building a gallery of radiomic features and aligning image-feature pairs through contrastive learning.
arXiv Detail & Related papers (2024-07-02T19:30:25Z) - Multimodal Prototyping for cancer survival prediction [45.61869793509184]
Multimodal survival methods combining gigapixel histology whole-slide images (WSIs) and transcriptomic profiles are particularly promising for patient prognostication and stratification.
Current approaches involve tokenizing the WSIs into smaller patches (>10,000 patches) and transcriptomics into gene groups, which are then integrated using a Transformer for predicting outcomes.
This process generates many tokens, which leads to high memory requirements for computing attention and complicates post-hoc interpretability analyses.
Our framework outperforms state-of-the-art methods with much less computation while unlocking new interpretability analyses.
arXiv Detail & Related papers (2024-06-28T20:37:01Z) - Cross-attention-based saliency inference for predicting cancer metastasis on whole slide images [3.7282630026096597]
Cross-attention-based salient instance inference MIL (CASiiMIL) is proposed to identify breast cancer lymph node micro-metastasis on whole slide images.
We introduce a negative representation learning algorithm to facilitate the learning of saliency-informed attention weights for improved sensitivity on tumor WSIs.
The proposed model outperforms the state-of-the-art MIL methods on two popular tumor metastasis detection datasets.
arXiv Detail & Related papers (2023-09-18T00:56:19Z) - CAMIL: Context-Aware Multiple Instance Learning for Cancer Detection and Subtyping in Whole Slide Images [3.1118773046912382]
We propose the Context-Aware Multiple Instance Learning (CAMIL) architecture for cancer diagnosis.
CAMIL incorporates neighbor-constrained attention to consider dependencies among tiles within a Whole Slide Images (WSI) and integrates contextual constraints as prior knowledge.
We evaluate CAMIL on subtyping non-small cell lung cancer (TCGA-NSCLC) and detecting lymph node metastasis, achieving test AUCs of 97.5%, 95.9%, and 88.1%.
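The neighbor-constrained attention idea can be sketched in a few lines: each tile attends only to spatially adjacent tiles, injecting the contextual constraint described above. This is an illustrative toy, not CAMIL's exact formulation; function names and the masking scheme are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def neighbor_mask(coords, radius=1.5):
    """True where two tiles lie within `radius` grid units of each other."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return d <= radius

def neighbor_constrained_attention(feats, coords):
    # Raw pairwise attention logits between tile features.
    logits = feats @ feats.T / np.sqrt(feats.shape[1])
    # Suppress attention between non-neighboring tiles.
    logits = np.where(neighbor_mask(coords), logits, -1e9)
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ feats                    # context-aware tile features

coords = np.array([(i, j) for i in range(3) for j in range(3)], dtype=float)
feats = rng.normal(size=(9, 8))               # one feature vector per tile
ctx = neighbor_constrained_attention(feats, coords)
```

Because each tile is always its own neighbor (distance zero), every softmax row stays well-defined even for isolated tiles.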
arXiv Detail & Related papers (2023-05-09T10:06:37Z) - Active Learning Enhances Classification of Histopathology Whole Slide Images with Attention-based Multiple Instance Learning [48.02011627390706]
We train an attention-based MIL and calculate a confidence metric for every image in the dataset to select the most uncertain WSIs for expert annotation.
With a novel attention-guiding loss, this boosts the accuracy of the trained models with only a few regions annotated per class.
This approach may in future serve as an important tool for training MIL models in the clinically relevant context of cancer classification in histopathology.
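The selection step described above can be sketched as scoring every unlabeled WSI by the entropy of the MIL model's predicted class probabilities and queueing the most uncertain slides for expert annotation. This is a hedged illustration of the generic active-learning recipe; the function names and the entropy-based confidence metric are assumptions, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(1)

def predictive_entropy(probs):
    """Entropy of each slide's predicted class distribution."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def select_for_annotation(slide_probs, k=5):
    """Indices of the k slides with the highest predictive entropy."""
    return np.argsort(predictive_entropy(slide_probs))[::-1][:k]

probs = rng.dirichlet(np.ones(4), size=100)   # 100 slides, 4 classes
queue = select_for_annotation(probs, k=5)     # most uncertain WSIs first
```

After experts annotate the queued slides, the model is retrained and the loop repeats until the annotation budget is exhausted.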
arXiv Detail & Related papers (2023-03-02T15:18:58Z) - AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information, to the extent that it can achieve the same performance with as little as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z) - Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics [63.76637479503006]
Learning good representations of giga-pixel whole slide pathology images (WSIs) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z) - Incorporating intratumoral heterogeneity into weakly-supervised deep learning models via variance pooling [5.606290756924216]
Supervised learning tasks such as cancer survival prediction from gigapixel whole slide images (WSIs) are a critical challenge in computational pathology.
We develop a novel variance pooling architecture that enables a MIL model to incorporate intratumoral heterogeneity into its predictions.
An empirical study with 4,479 gigapixel WSIs from the Cancer Genome Atlas shows that adding variance pooling onto MIL frameworks improves survival prediction performance for five cancer types.
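The variance-pooling idea can be illustrated with a short sketch: alongside the usual attention-weighted mean of instance features, the slide representation also carries their attention-weighted per-dimension variance, a simple proxy for intratumoral heterogeneity. Function names and the exact pooling form are assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

def mean_and_variance_pool(feats, attn):
    """Slide representation = [weighted mean, weighted variance] of tiles."""
    attn = attn / attn.sum()                  # normalize attention weights
    mean = attn @ feats                       # standard attention-MIL pooling
    var = attn @ (feats - mean) ** 2          # weighted per-dimension variance
    return np.concatenate([mean, var])        # heterogeneity-aware embedding

feats = rng.normal(size=(200, 16))            # 200 tile features of dim 16
attn = rng.random(200)                        # stand-in attention weights
rep = mean_and_variance_pool(feats, attn)     # shape (32,): mean ++ variance
```

A downstream survival head consumes `rep`, so slides whose tiles disagree strongly (high variance) are distinguishable from homogeneous slides with the same mean.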
arXiv Detail & Related papers (2022-06-17T16:35:35Z) - Improved Slice-wise Tumour Detection in Brain MRIs by Computing Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We have proposed a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
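The slice-wise idea above can be sketched with a stand-in encoder: encode each MRI slice into a latent code and flag slices whose code is far from a reference latent. A real system would use a trained VAE encoder and a learned dissimilarity; here the linear "encoder" and cosine dissimilarity are explicit toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def encode(slice_img, W):
    """Toy linear 'encoder' standing in for a trained VAE encoder mean."""
    return np.tanh(slice_img.ravel() @ W)

def dissimilarity(z_input, z_reference):
    """Cosine dissimilarity between two latent codes (0 = identical)."""
    cos = z_input @ z_reference / (
        np.linalg.norm(z_input) * np.linalg.norm(z_reference) + 1e-12)
    return 1.0 - cos

W = rng.normal(size=(64 * 64, 32))            # stand-in encoder weights
healthy = rng.normal(size=(64, 64))           # a 64x64 slice
z_in = encode(healthy, W)
z_ref = encode(healthy + 0.01 * rng.normal(size=(64, 64)), W)
score = dissimilarity(z_in, z_ref)            # small for near-identical slices
```

In the semi-supervised setting described above, slices scoring far above the dissimilarities observed on healthy data would be flagged as containing tumour.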
arXiv Detail & Related papers (2020-07-24T14:02:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.