SimMIL: A Universal Weakly Supervised Pre-Training Framework for Multi-Instance Learning in Whole Slide Pathology Images
- URL: http://arxiv.org/abs/2505.06710v1
- Date: Sat, 10 May 2025 17:23:36 GMT
- Title: SimMIL: A Universal Weakly Supervised Pre-Training Framework for Multi-Instance Learning in Whole Slide Pathology Images
- Authors: Yicheng Song, Tiancheng Lin, Die Peng, Su Yang, Yi Xu
- Abstract summary: This paper proposes to pre-train the feature extractor for MIL via a weakly-supervised scheme. To learn effective features for MIL, we delve into several key components, including strong data augmentation, a non-linear prediction head and a robust loss function. We conduct experiments on common large-scale WSI datasets and find that it achieves better performance than other pre-training schemes.
- Score: 12.827931905880163
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Various multi-instance learning (MIL) based approaches have been developed and successfully applied to whole-slide pathological images (WSI). Existing MIL methods emphasize the importance of feature aggregators, but largely neglect instance-level representation learning. They assume that a pre-trained feature extractor is available and can be directly utilized or fine-tuned, which is not always the case. This paper proposes to pre-train the feature extractor for MIL via a weakly-supervised scheme, i.e., propagating the weak bag-level labels to the corresponding instances for supervised learning. To learn effective features for MIL, we further delve into several key components, including strong data augmentation, a non-linear prediction head and a robust loss function. We conduct experiments on common large-scale WSI datasets and find that it achieves better performance than other pre-training schemes (e.g., ImageNet pre-training and self-supervised learning) in different downstream tasks. We further show the compatibility and scalability of the proposed scheme by deploying it to fine-tune pathology-specific models and to pre-train on multiple merged datasets. To our knowledge, this is the first work focusing on representation learning for MIL.
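A minimal PyTorch-style sketch of the pre-training scheme described above, assuming bag (slide) labels are simply propagated to every patch; the class and function names are placeholders, and generalized cross-entropy stands in as one plausible robust loss rather than necessarily the paper's exact choice:

```python
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as T

# Strong data augmentation, applied to patch images inside the dataset pipeline.
strong_aug = T.Compose([
    T.RandomResizedCrop(224, scale=(0.2, 1.0)),
    T.RandomHorizontalFlip(),
    T.ColorJitter(0.4, 0.4, 0.4, 0.1),
    T.RandomGrayscale(p=0.2),
    T.ToTensor(),
])

class InstanceClassifier(nn.Module):
    """Backbone encoder plus a non-linear (MLP) prediction head."""
    def __init__(self, num_classes: int, feat_dim: int = 512):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()              # expose 512-d patch features
        self.backbone = backbone
        self.head = nn.Sequential(               # non-linear prediction head
            nn.Linear(feat_dim, feat_dim), nn.ReLU(inplace=True),
            nn.Linear(feat_dim, num_classes),
        )

    def forward(self, x):
        return self.head(self.backbone(x))

def generalized_ce(logits, targets, q: float = 0.7):
    """Generalized cross-entropy, a common robust loss for noisy labels."""
    probs = torch.softmax(logits, dim=1)
    p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_true.clamp_min(1e-7) ** q) / q).mean()

# One training step: every patch inherits the weak label of its bag (slide).
model = InstanceClassifier(num_classes=2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.03, momentum=0.9)

patches = torch.randn(32, 3, 224, 224)          # augmented patches from one slide
bag_label = torch.tensor(1)                     # weak slide-level label
targets = bag_label.repeat(patches.size(0))     # propagate label to instances

loss = generalized_ce(model(patches), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

After pre-training, the head would be discarded and the frozen backbone used as the patch feature extractor for whatever MIL aggregator is applied downstream.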
Related papers
- Rethinking Pre-Trained Feature Extractor Selection in Multiple Instance Learning for Whole Slide Image Classification [2.375943263571389]
Multiple instance learning (MIL) has become a preferred method for gigapixel whole slide image (WSI) classification without requiring patch-level annotations. This study systematically evaluates MIL feature extractors across three dimensions: pre-training dataset, backbone model, and pre-training method.
arXiv Detail & Related papers (2024-08-02T10:34:23Z)
- HyperMM: Robust Multimodal Learning with Varying-sized Inputs [4.377889826841039]
HyperMM is an end-to-end framework designed for learning with varying-sized inputs.
We introduce a novel strategy for training a universal feature extractor using a conditional hypernetwork.
We experimentally demonstrate the advantages of our method in two tasks: Alzheimer's disease detection and breast cancer classification.
arXiv Detail & Related papers (2024-07-30T12:13:18Z)
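As a rough illustration of the conditional-hypernetwork idea mentioned for HyperMM, the sketch below has a small hypernetwork generate the weights of one encoder layer from a learned modality embedding, so a single module can process inputs from different modalities. The module name, dimensions, and the choice of generating a single linear layer are illustrative assumptions, not HyperMM's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalHyperLinear(nn.Module):
    """A hypernetwork generates the weights of one linear layer from a modality embedding."""
    def __init__(self, num_modalities: int, in_dim: int, out_dim: int, cond_dim: int = 32):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        self.condition = nn.Embedding(num_modalities, cond_dim)
        self.weight_gen = nn.Linear(cond_dim, in_dim * out_dim)   # -> layer weights
        self.bias_gen = nn.Linear(cond_dim, out_dim)              # -> layer bias

    def forward(self, x: torch.Tensor, modality_id: torch.Tensor) -> torch.Tensor:
        c = self.condition(modality_id)                           # (cond_dim,)
        w = self.weight_gen(c).view(self.out_dim, self.in_dim)
        b = self.bias_gen(c)
        return F.linear(x, w, b)

# The same module processes features coming from different imaging modalities.
layer = ConditionalHyperLinear(num_modalities=2, in_dim=128, out_dim=64)
features = torch.randn(8, 128)
out_a = layer(features, torch.tensor(0))   # modality 0 (e.g. MRI)
out_b = layer(features, torch.tensor(1))   # modality 1 (e.g. PET)
```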
- Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- Multi-View Class Incremental Learning [57.14644913531313]
Multi-view learning (MVL) has gained great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.
This paper investigates a novel paradigm called multi-view class incremental learning (MVCIL), where a single model incrementally classifies new classes from a continual stream of views.
arXiv Detail & Related papers (2023-06-16T08:13:41Z)
- TPMIL: Trainable Prototype Enhanced Multiple Instance Learning for Whole Slide Image Classification [13.195971707693365]
We develop a Trainable Prototype enhanced deep MIL framework for weakly supervised WSI classification.
Our method is able to reveal the correlations between different tumor subtypes through distances between corresponding trained prototypes.
We test our method on two WSI datasets and it achieves a new SOTA.
arXiv Detail & Related papers (2023-05-01T07:39:19Z)
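To make the prototype idea in TPMIL concrete, the sketch below keeps one learnable prototype per tumor subtype, scores a pooled bag embedding by negative distance to each prototype, and reads pairwise prototype distances after training as a proxy for subtype correlations. The mean-pooled bag embedding and all names are simplifying assumptions; TPMIL's actual aggregation and training objective are more elaborate.

```python
import torch
import torch.nn as nn

class PrototypeMIL(nn.Module):
    """One learnable prototype per tumor subtype; bag logits = -distance to each prototype."""
    def __init__(self, feat_dim: int = 512, num_subtypes: int = 3):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_subtypes, feat_dim))

    def forward(self, instances: torch.Tensor) -> torch.Tensor:
        # instances: (num_patches, feat_dim); simple mean pooling as a stand-in aggregator
        bag = instances.mean(dim=0, keepdim=True)                  # (1, feat_dim)
        return -torch.cdist(bag, self.prototypes).squeeze(0)       # (num_subtypes,)

model = PrototypeMIL()
logits = model(torch.randn(100, 512))          # one bag of 100 patch features
loss = nn.functional.cross_entropy(logits.unsqueeze(0), torch.tensor([1]))

# After training, distances between trained prototypes hint at how closely
# related the learned subtype representations are.
subtype_distances = torch.cdist(model.prototypes, model.prototypes)
```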
- Multi-Level Contrastive Learning for Dense Prediction Task [59.591755258395594]
We present Multi-Level Contrastive Learning for Dense Prediction Task (MCL), an efficient self-supervised method for learning region-level feature representation for dense prediction tasks.
Our method is motivated by the three key factors in detection: localization, scale consistency and recognition.
Our method consistently outperforms the recent state-of-the-art methods on various datasets with significant margins.
arXiv Detail & Related papers (2023-04-04T17:59:04Z)
- Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information [77.80071279597665]
We propose an all-in-one single-stage pre-training approach, named Maximizing Multi-modal Mutual Information Pre-training (M3I Pre-training).
Our approach achieves better performance than previous pre-training methods on various vision benchmarks, including ImageNet classification, object detection, LVIS long-tailed object detection, and ADE20k semantic segmentation.
arXiv Detail & Related papers (2022-11-17T18:59:49Z)
- Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving [103.745551954983]
In this paper, we investigate the transfer performance of various types of self-supervised methods, including MoCo and SimCLR, on three downstream tasks.
We find that their performances are sub-optimal or even lag far behind the single-task baseline.
We propose a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training.
arXiv Detail & Related papers (2022-09-19T12:15:31Z)
- Attention Awareness Multiple Instance Neural Network [4.061135251278187]
We propose an attention awareness multiple instance neural network framework.
It consists of an instance-level classifier, a trainable MIL pooling operator based on spatial attention and a bag-level classification layer.
Exhaustive experiments on a series of pattern recognition tasks demonstrate that our framework outperforms many state-of-the-art MIL methods.
arXiv Detail & Related papers (2022-05-27T03:29:17Z)
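The three components named in the summary above map naturally onto a short module: an instance-level classifier, a trainable attention-based MIL pooling operator, and a bag-level classification layer. The sketch below wires them together in one plausible way (attention weights over patch features); the exact attention form and how the instance scores are combined are assumptions, not the paper's precise design.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Instance classifier + trainable attention pooling + bag-level classifier."""
    def __init__(self, feat_dim: int = 512, num_classes: int = 2):
        super().__init__()
        self.instance_classifier = nn.Linear(feat_dim, num_classes)
        self.attention = nn.Sequential(               # trainable MIL pooling operator
            nn.Linear(feat_dim, 128), nn.Tanh(), nn.Linear(128, 1)
        )
        self.bag_classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, instances: torch.Tensor):
        # instances: (num_patches, feat_dim), one patch per spatial position
        instance_logits = self.instance_classifier(instances)       # per-patch scores
        weights = torch.softmax(self.attention(instances), dim=0)   # (N, 1) attention
        bag_embedding = (weights * instances).sum(dim=0)            # weighted pooling
        bag_logits = self.bag_classifier(bag_embedding)
        return bag_logits, instance_logits, weights

model = AttentionMIL()
bag_logits, instance_logits, weights = model(torch.randn(200, 512))
loss = nn.functional.cross_entropy(bag_logits.unsqueeze(0), torch.tensor([1]))
```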
- Dual-stream Maximum Self-attention Multi-instance Learning [11.685285490589981]
Multi-instance learning (MIL) is a form of weakly supervised learning where a single class label is assigned to a bag of instances while the instance-level labels are not available.
We propose a dual-stream maximum self-attention MIL model (DSMIL) parameterized by neural networks.
Our method achieves superior performance compared to the best MIL methods and demonstrates state-of-the-art performance on benchmark MIL datasets.
arXiv Detail & Related papers (2020-06-09T22:44:58Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
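As a loose illustration of prototype-based contrastive learning, the sketch below clusters a batch of embeddings into prototypes with plain k-means and then pulls each embedding toward its assigned prototype with an InfoNCE-style cross-entropy over prototype similarities. This only conveys the general idea; PCL's actual ProtoNCE objective (EM formulation, per-cluster concentration estimates, combination with instance-wise contrast) is richer.

```python
import torch
import torch.nn.functional as F

def kmeans(x: torch.Tensor, k: int, iters: int = 10):
    """Plain k-means; returns prototypes (k, d) and hard assignments (N,)."""
    protos = x[torch.randperm(x.size(0))[:k]].clone()
    assign = torch.zeros(x.size(0), dtype=torch.long)
    for _ in range(iters):
        assign = torch.cdist(x, protos).argmin(dim=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:
                protos[j] = members.mean(dim=0)
    return protos, assign

def proto_nce(z: torch.Tensor, protos: torch.Tensor, assign: torch.Tensor, temp: float = 0.1):
    """Cross-entropy over prototype similarities; positive = assigned prototype."""
    logits = F.normalize(z, dim=1) @ F.normalize(protos, dim=1).t() / temp
    return F.cross_entropy(logits, assign)

embeddings = F.normalize(torch.randn(256, 128), dim=1)   # encoder outputs for a batch
prototypes, assignments = kmeans(embeddings.detach(), k=16)
loss = proto_nce(embeddings, prototypes, assignments)
```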