Deep Multiple Instance Learning with Distance-Aware Self-Attention
- URL: http://arxiv.org/abs/2305.10552v2
- Date: Sat, 20 May 2023 12:45:27 GMT
- Title: Deep Multiple Instance Learning with Distance-Aware Self-Attention
- Authors: Georg Wölflein, Lucie Charlotte Magister, Pietro Liò, David J. Harrison, and Ognjen Arandjelović
- Abstract summary: We introduce a novel multiple instance learning (MIL) model with distance-aware self-attention (DAS-MIL).
Unlike existing relative position representations for self-attention which are discrete, our approach introduces continuous distance-dependent terms into the computation of the attention weights.
We evaluate our model on a custom MNIST-based MIL dataset and on CAMELYON16, a publicly available cancer metastasis detection dataset.
- Score: 9.361964965928063
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional supervised learning tasks require a label for every instance in
the training set, but in many real-world applications, labels are only
available for collections (bags) of instances. This problem setting, known as
multiple instance learning (MIL), is particularly relevant in the medical
domain, where high-resolution images are split into smaller patches, but labels
apply to the image as a whole. Recent MIL models are able to capture
correspondences between patches by employing self-attention, allowing them to
weigh each patch differently based on all other patches in the bag. However,
these approaches still do not consider the relative spatial relationships
between patches within the larger image, which is especially important in
computational pathology. To this end, we introduce a novel MIL model with
distance-aware self-attention (DAS-MIL), which explicitly takes into account
relative spatial information when modelling the interactions between patches.
Unlike existing relative position representations for self-attention which are
discrete, our approach introduces continuous distance-dependent terms into the
computation of the attention weights, and is the first to apply relative
position representations in the context of MIL. We evaluate our model on a
custom MNIST-based MIL dataset that requires the consideration of relative
spatial information, as well as on CAMELYON16, a publicly available cancer
metastasis detection dataset, where we achieve a test AUROC score of 0.91. On
both datasets, our model outperforms existing MIL approaches that employ
absolute positional encodings, as well as existing relative position
representation schemes applied to MIL. Our code is available at
https://anonymous.4open.science/r/das-mil.
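The core idea above — replacing discrete relative position representations with a continuous, distance-dependent term in the attention logits — can be sketched as follows. This is a hedged illustration only, not the authors' implementation: the linear-in-distance bias `-alpha * dist` and the scalar decay rate `alpha` are assumptions standing in for whatever learned distance function the paper actually uses.

```python
import numpy as np

def distance_aware_attention(X, coords, Wq, Wk, Wv, alpha=1.0):
    """Self-attention whose logits include a continuous term depending on
    the pairwise Euclidean distance between patches (illustrative sketch).

    X      : (n, d_in) patch feature vectors
    coords : (n, 2)    patch centre coordinates within the slide
    alpha  : assumed scalar decay rate (the real model would learn
             a distance-dependent function, not necessarily this form)
    """
    d = Wq.shape[1]
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    logits = Q @ K.T / np.sqrt(d)  # standard scaled dot-product term
    # Continuous distance-dependent term: nearby patches attend more strongly.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    logits = logits - alpha * dist  # assumed form of the distance bias
    # Numerically stable row-wise softmax over the modified logits.
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V
```

Because the bias is a continuous function of distance rather than a lookup over discrete relative offsets, it applies directly to irregularly spaced patches, which is the property the abstract emphasises for computational pathology.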
Related papers
- Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs.
Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction.
We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z)
- MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models [58.39336492765728]
Pathological diagnosis is the gold standard for cancer diagnosis; recent methods achieve superior performance by combining the Transformer with the multiple instance learning (MIL) framework on whole slide images (WSIs).
We propose MamMIL, a framework for WSI classification that, for the first time, pairs the selective structured state space model (i.e., Mamba) with MIL.
Specifically, to solve the problem that Mamba can only conduct unidirectional one-dimensional (1D) sequence modeling, we innovatively introduce a bidirectional state space model and a 2D context-aware block.
arXiv Detail & Related papers (2024-03-08T09:02:13Z)
- Reproducibility in Multiple Instance Learning: A Case For Algorithmic Unit Tests [59.623267208433255]
Multiple Instance Learning (MIL) is a sub-domain of classification problems with positive and negative labels and a "bag" of inputs.
In this work, we examine five of the most prominent deep-MIL models and find that none of them respects the standard MIL assumption.
We identify and demonstrate this problem via a proposed "algorithmic unit test", where we create synthetic datasets that can be solved by a MIL respecting model.
arXiv Detail & Related papers (2023-10-27T03:05:11Z)
- RoFormer for Position Aware Multiple Instance Learning in Whole Slide Image Classification [0.0]
Whole slide image (WSI) classification is a critical task in computational pathology.
Current methods rely on multiple-instance learning (MIL) models with frozen feature extractors.
We show that our method outperforms state-of-the-art MIL models on weakly supervised classification tasks.
arXiv Detail & Related papers (2023-10-03T09:59:59Z)
- Smooth Attention for Deep Multiple Instance Learning: Application to CT Intracranial Hemorrhage Detection [17.27358760040812]
Multiple Instance Learning (MIL) has been widely applied to medical imaging diagnosis, where bag labels are known and instance labels inside bags are unknown.
In this study, we propose a smooth attention deep MIL (SA-DMIL) model.
Smoothness is achieved by introducing first- and second-order constraints on the latent function that encodes the attention paid to each instance in a bag.
arXiv Detail & Related papers (2023-07-18T17:38:04Z)
- TPMIL: Trainable Prototype Enhanced Multiple Instance Learning for Whole Slide Image Classification [13.195971707693365]
We develop a Trainable Prototype enhanced deep MIL framework for weakly supervised WSI classification.
Our method is able to reveal the correlations between different tumor subtypes through distances between corresponding trained prototypes.
We test our method on two WSI datasets and it achieves a new SOTA.
arXiv Detail & Related papers (2023-05-01T07:39:19Z)
- Active Learning Enhances Classification of Histopathology Whole Slide Images with Attention-based Multiple Instance Learning [48.02011627390706]
We train an attention-based MIL and calculate a confidence metric for every image in the dataset to select the most uncertain WSIs for expert annotation.
With a novel attention guiding loss, this leads to an accuracy boost of the trained models with few regions annotated for each class.
It may in the future serve as an important contribution to train MIL models in the clinically relevant context of cancer classification in histopathology.
arXiv Detail & Related papers (2023-03-02T15:18:58Z)
- Feature Re-calibration based MIL for Whole Slide Image Classification [7.92885032436243]
Whole slide image (WSI) classification is a fundamental task for the diagnosis and treatment of diseases.
We propose to re-calibrate the distribution of a WSI bag (instances) by using the statistics of the max-instance (critical) feature.
We employ a position encoding module (PEM) to model spatial/morphological information, and perform pooling by multi-head self-attention (PSMA) with a Transformer encoder.
arXiv Detail & Related papers (2022-06-22T07:00:39Z)
- Dual-stream Multiple Instance Learning Network for Whole Slide Image Classification with Self-supervised Contrastive Learning [16.84711797934138]
We address the challenging problem of whole slide image (WSI) classification.
WSI classification can be cast as a multiple instance learning (MIL) problem when only slide-level labels are available.
We propose a MIL-based method for WSI classification and tumor detection that does not require localized annotations.
arXiv Detail & Related papers (2020-11-17T20:51:15Z)
- Memory-Augmented Relation Network for Few-Shot Learning [114.47866281436829]
In this work, we investigate a new metric-learning method, the Memory-Augmented Relation Network (MRN).
In MRN, we choose samples from the working context that are visually similar to the query and perform weighted information propagation, attentively aggregating helpful information from the chosen samples to enhance the query representation.
We empirically demonstrate that MRN yields significant improvement over its ancestor and achieves competitive or even better performance when compared with other few-shot learning approaches.
arXiv Detail & Related papers (2020-05-09T10:09:13Z)
- Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer [116.46533207849619]
We study the impact of scale and location mismatch in the few-shot learning scenario.
We propose a novel Spatially-aware Matching scheme to effectively perform matching across multiple scales and locations.
arXiv Detail & Related papers (2020-01-06T14:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.