Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification
- URL: http://arxiv.org/abs/2311.07125v4
- Date: Fri, 5 Jul 2024 01:59:23 GMT
- Title: Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification
- Authors: Yunlong Zhang, Honglin Li, Yuxuan Sun, Sunyi Zheng, Chenglu Zhu, Lin Yang
- Abstract summary: We present Attention-Challenging MIL (ACMIL) to mitigate overfitting.
ACMIL combines two techniques based on separate analyses of attention value concentration.
This paper extensively illustrates ACMIL's effectiveness in suppressing attention value concentration and overcoming the overfitting challenge.
- Score: 12.424186320807888
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the application of Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) classification, attention mechanisms often focus on a subset of discriminative instances, which is closely linked to overfitting. To mitigate overfitting, we present Attention-Challenging MIL (ACMIL). ACMIL combines two techniques based on separate analyses of attention value concentration. First, UMAP of instance features reveals various patterns among discriminative instances, with existing attention mechanisms capturing only some of them. To remedy this, we introduce Multiple Branch Attention (MBA) to capture more discriminative instances using multiple attention branches. Second, examination of the cumulative value of Top-K attention scores indicates that a tiny number of instances dominate the majority of attention. In response, we present Stochastic Top-K Instance Masking (STKIM), which masks out a portion of instances with Top-K attention values and allocates their attention values to the remaining instances. Extensive experimental results on three WSI datasets with two pre-trained backbones show that ACMIL outperforms state-of-the-art methods. Additionally, through heatmap and UMAP visualizations, this paper extensively illustrates ACMIL's effectiveness in suppressing attention value concentration and overcoming the overfitting challenge. The source code is available at https://github.com/dazhangyu123/ACMIL.
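The two components lend themselves to a compact sketch. Below is a minimal PyTorch rendering of MBA and STKIM as described in the abstract; the module names, layer sizes, the tanh attention scorer, and the renormalization used to reallocate masked attention are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultipleBranchAttention(nn.Module):
    """MBA sketch: several attention branches score all instances, so
    different branches can attend to different discriminative patterns."""
    def __init__(self, dim=512, n_branches=5):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 128), nn.Tanh(), nn.Linear(128, 1))
             for _ in range(n_branches)])

    def forward(self, h):                              # h: (N_instances, dim)
        scores = torch.cat([b(h).T for b in self.branches], dim=0)
        return F.softmax(scores, dim=-1)               # (n_branches, N)

def stochastic_topk_instance_masking(attn, k=10, p_mask=0.5):
    """STKIM sketch: randomly mask some of the Top-K attended instances
    during training, then renormalize so their mass moves to the rest."""
    topk_idx = attn.topk(k, dim=-1).indices            # (n_branches, k)
    drop = torch.rand_like(topk_idx, dtype=torch.float) < p_mask
    mask = torch.ones_like(attn)
    mask.scatter_(-1, topk_idx, (~drop).float())       # zero out dropped ones
    masked = attn * mask
    return masked / masked.sum(dim=-1, keepdim=True)   # reallocate mass

# Toy usage: pool 1000 patch features into per-branch bag embeddings.
h = torch.randn(1000, 512)
attn = MultipleBranchAttention()(h)                    # (5, 1000)
attn = stochastic_topk_instance_masking(attn)          # training only
bags = attn @ h                                        # (5, 512)
```

Renormalizing after masking is one plausible reading of "allocates their attention values to the remaining instances"; the paper may implement the reallocation differently, and STKIM would be skipped at inference time.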
Related papers
- Attention Is Not What You Need: Revisiting Multi-Instance Learning for Whole Slide Image Classification [51.95824566163554]
We argue that synergizing the standard MIL assumption with variational inference encourages the model to focus on tumour morphology instead of spurious correlations.
Our method also achieves better classification boundaries for identifying hard instances and mitigates the effect of spurious correlations between bags and labels.
arXiv Detail & Related papers (2024-08-18T12:15:22Z)
- Multiple Instance Verification [11.027466339522777]
We show that naive adaptations of attention-based multiple instance learning methods and standard verification methods are unsuitable for this setting.
Under the CAP framework, we propose two novel attention functions to address the challenge of distinguishing between highly similar instances in a target bag.
arXiv Detail & Related papers (2024-07-09T04:51:22Z)
- Rethinking Attention-Based Multiple Instance Learning for Whole-Slide Pathological Image Classification: An Instance Attribute Viewpoint [11.09441191807822]
Multiple instance learning (MIL) is a robust paradigm for whole-slide pathological image (WSI) analysis.
This paper proposes an Attribute-Driven MIL (AttriMIL) framework to address these issues.
arXiv Detail & Related papers (2024-03-30T13:04:46Z)
- DealMVC: Dual Contrastive Calibration for Multi-view Clustering [78.54355167448614]
We propose a novel dual contrastive calibration network for multi-view clustering (DealMVC).
We first design a fusion mechanism to obtain a global cross-view feature. Then, a global contrastive calibration loss is proposed by aligning the view feature similarity graph and the high-confidence pseudo-label graph.
During the training procedure, the interacted cross-view feature is jointly optimized at both local and global levels.
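As a rough illustration of how such a calibration might look, the sketch below aligns a feature-similarity graph with a high-confidence pseudo-label graph via an InfoNCE-style loss. The loss form, the confidence threshold, and all names are assumptions inferred from the summary above, not DealMVC's actual objective.

```python
import torch
import torch.nn.functional as F

def global_calibration_loss(z, pseudo_labels, conf, tau=0.5, conf_thresh=0.9):
    """z: (N, d) fused cross-view features; pseudo_labels: (N,) cluster ids;
    conf: (N,) pseudo-label confidences in [0, 1]. All assumed shapes."""
    z = F.normalize(z, dim=1)
    sim = torch.exp(z @ z.T / tau)              # feature-similarity graph
    keep = conf > conf_thresh                   # high-confidence samples only
    pos = (pseudo_labels[:, None] == pseudo_labels[None, :]) \
          & keep[:, None] & keep[None, :]       # pseudo-label agreement graph
    pos.fill_diagonal_(False)
    denom = sim.sum(dim=1) - sim.diagonal()     # contrast against all others
    num = (sim * pos).sum(dim=1)
    valid = pos.any(dim=1)                      # samples with >= 1 positive
    return -torch.log(num[valid] / denom[valid]).mean()
```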
arXiv Detail & Related papers (2023-08-17T14:14:28Z)
- Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at the cluster level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z)
- Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions [9.088460902782547]
Clustering with Bisimulation Metrics (CBM) learns robust representations by grouping visual observations in the latent space.
CBM alternates between two steps: (1) grouping observations by measuring their bisimulation distances to the learned prototypes; (2) learning a set of prototypes according to the current cluster assignments.
Experiments demonstrate that CBM significantly improves the sample efficiency of popular visual RL algorithms.
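Read literally, that alternation is a k-means-style loop with a learned distance. The toy sketch below uses a plain L2 stand-in for the bisimulation metric and a moving-average prototype update; both are simplifying assumptions, not the paper's actual procedure.

```python
import torch

def bisim_distance(z, prototypes):
    # Stand-in for a learned bisimulation metric; plain L2 here.
    return torch.cdist(z, prototypes)                  # (N, K)

def cbm_step(z, prototypes, lr=0.1):
    """One alternation: (1) assign each observation to its nearest
    prototype, (2) move prototypes toward their members' mean."""
    assign = bisim_distance(z, prototypes).argmin(dim=1)   # step (1)
    for k in range(prototypes.size(0)):                    # step (2)
        members = z[assign == k]
        if members.numel() > 0:
            prototypes[k] += lr * (members.mean(dim=0) - prototypes[k])
    return assign, prototypes
```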
arXiv Detail & Related papers (2023-02-12T13:27:34Z)
- Attention Awareness Multiple Instance Neural Network [4.061135251278187]
We propose an attention awareness multiple instance neural network framework.
It consists of an instance-level classifier, a trainable MIL pooling operator based on spatial attention, and a bag-level classification layer (see the sketch below).
Exhaustive experiments on a series of pattern recognition tasks demonstrate that our framework outperforms many state-of-the-art MIL methods.
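The three described components map onto a standard attention-MIL skeleton, sketched below; layer widths and the tanh attention scorer are illustrative assumptions, and the paper's spatial-attention pooling is likely more involved than this.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionMIL(nn.Module):
    def __init__(self, dim=512, n_classes=2):
        super().__init__()
        self.instance_clf = nn.Linear(dim, n_classes)   # instance-level scores
        self.attention = nn.Sequential(                 # trainable MIL pooling
            nn.Linear(dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.bag_clf = nn.Linear(dim, n_classes)        # bag-level classifier

    def forward(self, h):                               # h: (N_instances, dim)
        inst_logits = self.instance_clf(h)
        a = F.softmax(self.attention(h), dim=0)         # (N, 1) attention
        bag = (a * h).sum(dim=0)                        # weighted pooling
        return self.bag_clf(bag), inst_logits, a
```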
arXiv Detail & Related papers (2022-05-27T03:29:17Z)
- UniVIP: A Unified Framework for Self-Supervised Visual Pre-training [50.87603616476038]
We propose a novel self-supervised framework to learn versatile visual representations on either single-centric-object or non-iconic datasets.
Extensive experiments show that UniVIP pre-trained on non-iconic COCO achieves state-of-the-art transfer performance.
Our method can also exploit single-centric-object datasets such as ImageNet, and it outperforms BYOL by 2.5% with the same pre-training epochs in linear probing.
arXiv Detail & Related papers (2022-03-14T10:04:04Z)
- Few-Shot Fine-Grained Action Recognition via Bidirectional Attention and Contrastive Meta-Learning [51.03781020616402]
Fine-grained action recognition is attracting increasing attention due to the emerging demand for specific action understanding in real-world applications.
We propose a few-shot fine-grained action recognition problem, aiming to recognize novel fine-grained actions with only a few samples given for each class.
Although progress has been made on coarse-grained actions, existing few-shot recognition methods encounter two issues when handling fine-grained actions.
arXiv Detail & Related papers (2021-08-15T02:21:01Z)
- More Than Just Attention: Learning Cross-Modal Attentions with Contrastive Constraints [63.08768589044052]
We propose Contrastive Content Re-sourcing (CCR) and Contrastive Content Swapping (CCS) constraints to address this limitation.
CCR and CCS constraints supervise the training of attention models in a contrastive learning manner without requiring explicit attention annotations.
Experiments on both the Flickr30k and MS-COCO datasets demonstrate that integrating these attention constraints into two state-of-the-art attention-based models improves model performance.
arXiv Detail & Related papers (2021-05-20T08:48:10Z)