Self-Supervision Enhances Instance-based Multiple Instance Learning Methods in Digital Pathology: A Benchmark Study
- URL: http://arxiv.org/abs/2505.01109v1
- Date: Fri, 02 May 2025 08:43:50 GMT
- Title: Self-Supervision Enhances Instance-based Multiple Instance Learning Methods in Digital Pathology: A Benchmark Study
- Authors: Ali Mammadov, Loic Le Folgoc, Julien Adam, Anne Buronfosse, Gilles Hayem, Guillaume Hocquet, Pietro Gori,
- Abstract summary: Multiple Instance Learning (MIL) has emerged as the best solution for Whole Slide Image (WSI) classification.<n>We conduct 710 experiments across 4 datasets, comparing 10 MIL strategies, 6 self-supervised methods with 4 backbones, 4 foundation models, and various pathology-adapted techniques.<n>We show that with a good SSL feature extractor, simple instance-based MILs, with very few parameters, obtain similar or better performance than complex, state-of-the-art (SOTA) embedding-based MIL methods.
- Score: 1.7397128744416201
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multiple Instance Learning (MIL) has emerged as the best solution for Whole Slide Image (WSI) classification. It consists of dividing each slide into patches, which are treated as a bag of instances labeled with a global label. MIL includes two main approaches: instance-based and embedding-based. In the former, each patch is classified independently, and then the patch scores are aggregated to predict the bag label. In the latter, bag classification is performed after aggregating patch embeddings. Even if instance-based methods are naturally more interpretable, embedding-based MILs have usually been preferred in the past due to their robustness to poor feature extractors. However, recently, the quality of feature embeddings has drastically increased using self-supervised learning (SSL). Nevertheless, many authors continue to endorse the superiority of embedding-based MIL. To investigate this further, we conduct 710 experiments across 4 datasets, comparing 10 MIL strategies, 6 self-supervised methods with 4 backbones, 4 foundation models, and various pathology-adapted techniques. Furthermore, we introduce 4 instance-based MIL methods never used before in the pathology domain. Through these extensive experiments, we show that with a good SSL feature extractor, simple instance-based MILs, with very few parameters, obtain similar or better performance than complex, state-of-the-art (SOTA) embedding-based MIL methods, setting new SOTA results on the BRACS and Camelyon16 datasets. Since simple instance-based MIL methods are naturally more interpretable and explainable to clinicians, our results suggest that more effort should be put into well-adapted SSL methods for WSI rather than into complex embedding-based MIL methods.
Related papers
- Reducing Variability of Multiple Instance Learning Methods for Digital Pathology [2.9284034606635267]
Digital pathology has revolutionized the field by enabling the digitization of tissue samples into whole slide images (WSIs)<n>WSIs are often divided into smaller patches with a global label.<n>MIL methods have emerged as a suitable solution for WSI classification.
arXiv Detail & Related papers (2025-06-30T22:10:24Z) - A Spatially-Aware Multiple Instance Learning Framework for Digital Pathology [4.012490059423154]
Multiple instance learning (MIL) is a promising approach for weakly supervised classification in pathology using whole slide images.<n>Recent advancements, such as Transformer based MIL (TransMIL), have incorporated spatial context and inter-patch relationships.<n>In this work, we enhance the ABMIL framework by integrating interaction-aware representations to address this question.
arXiv Detail & Related papers (2025-04-24T08:53:46Z) - How Effective Can Dropout Be in Multiple Instance Learning ? [2.0792866989795864]
Multiple Instance Learning (MIL) is a popular weakly-supervised method for various applications.<n>We propose a novel MIL-specific dropout method, termed MIL-Dropout, which systematically determines which instances to drop.<n> Experiments on five MIL benchmark datasets and two WSI datasets demonstrate that MIL-Dropout boosts the performance of current MIL methods with a negligible computational cost.
arXiv Detail & Related papers (2025-04-21T00:46:31Z) - Pathological Prior-Guided Multiple Instance Learning For Mitigating Catastrophic Forgetting in Breast Cancer Whole Slide Image Classification [50.899861205016265]
We propose a new framework PaGMIL to mitigate catastrophic forgetting in breast cancer WSI classification.<n>Our framework introduces two key components into the common MIL model architecture.<n>We evaluate the continual learning performance of PaGMIL across several public breast cancer datasets.
arXiv Detail & Related papers (2025-03-08T04:51:58Z) - Position: From Correlation to Causation: Max-Pooling-Based Multi-Instance Learning Leads to More Robust Whole Slide Image Classification [51.95824566163554]
We argue that well-trained max-pooling-based MIL models can make predictions based on causal factors and avoid relying on spurious correlations.<n>We propose a simple yet effective max-pooling-based MIL method (FocusMIL) that outperforms existing mainstream attention-based methods on two datasets.
arXiv Detail & Related papers (2024-08-18T12:15:22Z) - MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models [56.37780601189795]
We propose a framework named MamMIL for WSI analysis.
We represent each WSI as an undirected graph.
To address the problem that Mamba can only process 1D sequences, we propose a topology-aware scanning mechanism.
arXiv Detail & Related papers (2024-03-08T09:02:13Z) - Reproducibility in Multiple Instance Learning: A Case For Algorithmic
Unit Tests [59.623267208433255]
Multiple Instance Learning (MIL) is a sub-domain of classification problems with positive and negative labels and a "bag" of inputs.
In this work, we examine five of the most prominent deep-MIL models and find that none of them respects the standard MIL assumption.
We identify and demonstrate this problem via a proposed "algorithmic unit test", where we create synthetic datasets that can be solved by a MIL respecting model.
arXiv Detail & Related papers (2023-10-27T03:05:11Z) - Intra-class Adaptive Augmentation with Neighbor Correction for Deep
Metric Learning [99.14132861655223]
We propose a novel intra-class adaptive augmentation (IAA) framework for deep metric learning.
We reasonably estimate intra-class variations for every class and generate adaptive synthetic samples to support hard samples mining.
Our method significantly improves and outperforms the state-of-the-art methods on retrieval performances by 3%-6%.
arXiv Detail & Related papers (2022-11-29T14:52:38Z) - Feature Re-calibration based MIL for Whole Slide Image Classification [7.92885032436243]
Whole slide image (WSI) classification is a fundamental task for the diagnosis and treatment of diseases.
We propose to re-calibrate the distribution of a WSI bag (instances) by using the statistics of the max-instance (critical) feature.
We employ a position encoding module (PEM) to model spatial/morphological information, and perform pooling by multi-head self-attention (PSMA) with a Transformer encoder.
arXiv Detail & Related papers (2022-06-22T07:00:39Z) - CIL: Contrastive Instance Learning Framework for Distantly Supervised
Relation Extraction [52.94486705393062]
We go beyond typical multi-instance learning (MIL) framework and propose a novel contrastive instance learning (CIL) framework.
Specifically, we regard the initial MIL as the relational triple encoder and constraint positive pairs against negative pairs for each instance.
Experiments demonstrate the effectiveness of our proposed framework, with significant improvements over the previous methods on NYT10, GDS and KBP.
arXiv Detail & Related papers (2021-06-21T04:51:59Z) - Self-Supervised Learning of Graph Neural Networks: A Unified Review [50.71341657322391]
Self-supervised learning is emerging as a new paradigm for making use of large amounts of unlabeled samples.
We provide a unified review of different ways of training graph neural networks (GNNs) using SSL.
Our treatment of SSL methods for GNNs sheds light on the similarities and differences of various methods, setting the stage for developing new methods and algorithms.
arXiv Detail & Related papers (2021-02-22T03:43:45Z) - Dual-stream Multiple Instance Learning Network for Whole Slide Image
Classification with Self-supervised Contrastive Learning [16.84711797934138]
We address the challenging problem of whole slide image (WSI) classification.
WSI classification can be cast as a multiple instance learning (MIL) problem when only slide-level labels are available.
We propose a MIL-based method for WSI classification and tumor detection that does not require localized annotations.
arXiv Detail & Related papers (2020-11-17T20:51:15Z) - Dual-stream Maximum Self-attention Multi-instance Learning [11.685285490589981]
Multi-instance learning (MIL) is a form of weakly supervised learning where a single class label is assigned to a bag of instances while the instance-level labels are not available.
We propose a dual-stream maximum self-attention MIL model (DSMIL) parameterized by neural networks.
Our method achieves superior performance compared to the best MIL methods and demonstrates state-of-the-art performance on benchmark MIL datasets.
arXiv Detail & Related papers (2020-06-09T22:44:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.