Related papers: Dual-stream Maximum Self-attention Multi-instance Learning

URL: http://arxiv.org/abs/2006.05538v1
Date: Tue, 9 Jun 2020 22:44:58 GMT
Title: Dual-stream Maximum Self-attention Multi-instance Learning
Authors: Bin Li, Kevin W. Eliceiri
Abstract summary: Multi-instance learning (MIL) is a form of weakly supervised learning where a single class label is assigned to a bag of instances while the instance-level labels are not available. We propose a dual-stream maximum self-attention MIL model (DSMIL) parameterized by neural networks. Our method achieves superior performance compared to the best MIL methods and demonstrates state-of-the-art performance on benchmark MIL datasets.
Score: 11.685285490589981
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-instance learning (MIL) is a form of weakly supervised learning where a single class label is assigned to a bag of instances while the instance-level labels are not available. Training classifiers to accurately determine the bag label and instance labels is a challenging but critical task in many practical scenarios, such as computational histopathology. Recently, MIL models fully parameterized by neural networks have become popular due to their high flexibility and superior performance. Most of these models rely on attention mechanisms that assign attention scores across the instance embeddings in a bag and produce the bag embedding using an aggregation operator. In this paper, we propose a dual-stream maximum self-attention MIL model (DSMIL) parameterized by neural networks. The first stream applies simple MIL max-pooling to determine the top-activated instance embedding, which the second stream then uses to obtain self-attention scores across the instance embeddings. Unlike most previous methods, the proposed model jointly learns an instance classifier and a bag classifier based on the same instance embeddings. The experimental results show that our method achieves superior performance compared to the best MIL methods and demonstrates state-of-the-art performance on benchmark MIL datasets.
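
Illustrative sketch (not the authors' code): the two streams described in the abstract can be approximated in a few lines of NumPy. Everything beyond the abstract is an assumption made for illustration: the linear instance classifier `w_inst`, the query projection `W_q`, the dot-product similarity, the softmax normalization, and the function name `dual_stream_mil` are all hypothetical. How the paper combines the two streams into a final bag prediction is not stated in the abstract, so the sketch returns both outputs separately.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def dual_stream_mil(H, w_inst, W_q):
    """Sketch of a dual-stream aggregation over one bag.

    H      : (n_instances, d) instance embeddings of the bag
    w_inst : (d,) weights of an assumed linear instance classifier
    W_q    : (d, k) assumed query projection for the attention stream
    """
    # Stream 1: score every instance and max-pool over the bag.
    inst_scores = H @ w_inst                 # (n_instances,)
    top = int(np.argmax(inst_scores))        # index of top-activated instance
    max_score = inst_scores[top]

    # Stream 2: compare every instance's query with the query of the
    # top-activated instance, then normalize into attention weights.
    Q = H @ W_q                              # (n_instances, k)
    attn = softmax(Q @ Q[top])               # (n_instances,)

    # Bag embedding: attention-weighted sum of the same instance embeddings.
    bag_embedding = attn @ H                 # (d,)
    return max_score, top, attn, bag_embedding


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H = rng.normal(size=(5, 4))              # toy bag: 5 instances, 4 features
    score, top, attn, bag = dual_stream_mil(
        H, rng.normal(size=4), rng.normal(size=(4, 3))
    )
    print("top instance:", top, "attention sum:", attn.sum())
```

Both streams read the same embeddings `H`, which mirrors the abstract's point that the instance classifier and the bag classifier are learned jointly on shared instance embeddings.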

Related papers

AHDMIL: Asymmetric Hierarchical Distillation Multi-Instance Learning for Fast and Accurate Whole-Slide Image Classification [51.525891360380285]
AHDMIL is an Asymmetric Hierarchical Distillation Multi-Instance Learning framework. It eliminates irrelevant patches through a two-step training process. It consistently outperforms previous state-of-the-art methods in both classification performance and inference speed.
arXiv Detail & Related papers (2025-08-07T07:47:16Z)
SimMIL: A Universal Weakly Supervised Pre-Training Framework for Multi-Instance Learning in Whole Slide Pathology Images [12.827931905880163]
This paper proposes to pre-train a feature extractor for MIL via a weakly-supervised scheme. To learn effective features for MIL, we delve into several key components, including strong data augmentation, a non-linear prediction head and a robust loss function. We conduct experiments on common large-scale WSI datasets and find it achieves better performance than other pre-training schemes.
arXiv Detail & Related papers (2025-05-10T17:23:36Z)
Self-Supervision Enhances Instance-based Multiple Instance Learning Methods in Digital Pathology: A Benchmark Study [1.7397128744416201]
Multiple Instance Learning (MIL) has emerged as the best solution for Whole Slide Image (WSI) classification. We conduct 710 experiments across 4 datasets, comparing 10 MIL strategies, 6 self-supervised methods with 4 backbones, 4 foundation models, and various pathology-adapted techniques. We show that with a good SSL feature extractor, simple instance-based MILs, with very few parameters, obtain similar or better performance than complex, state-of-the-art (SOTA) embedding-based MIL methods.
arXiv Detail & Related papers (2025-05-02T08:43:50Z)
Attention Is Not What You Need: Revisiting Multi-Instance Learning for Whole Slide Image Classification [51.95824566163554]
We argue that synergizing the standard MIL assumption with variational inference encourages the model to focus on tumour morphology instead of spurious correlations. Our method also achieves better classification boundaries for identifying hard instances and mitigates the effect of spurious correlations between bags and labels.
arXiv Detail & Related papers (2024-08-18T12:15:22Z)
Rethinking Multiple Instance Learning for Whole Slide Image Classification: A Good Instance Classifier is All You Need [18.832471712088353]
We propose an instance-level weakly supervised contrastive learning algorithm for the first time under the MIL setting. We also propose an accurate pseudo label generation method through prototype learning.
arXiv Detail & Related papers (2023-07-05T12:44:52Z)
Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision [75.1860418333995]
Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently. The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources expressed as labeling functions (LFs). Existing statistical label models typically rely only on the outputs of the LFs, ignoring the instance features when modeling the underlying generative process.
arXiv Detail & Related papers (2022-10-06T07:28:53Z)
Feature Re-calibration based MIL for Whole Slide Image Classification [7.92885032436243]
Whole slide image (WSI) classification is a fundamental task for the diagnosis and treatment of diseases. We propose to re-calibrate the distribution of a WSI bag (instances) by using the statistics of the max-instance (critical) feature. We employ a position encoding module (PEM) to model spatial/morphological information, and perform pooling by multi-head self-attention (PSMA) with a Transformer encoder.
arXiv Detail & Related papers (2022-06-22T07:00:39Z)
Attention Awareness Multiple Instance Neural Network [4.061135251278187]
We propose an attention awareness multiple instance neural network framework. It consists of an instance-level classifier, a trainable MIL pooling operator based on spatial attention and a bag-level classification layer. Exhaustive experiments on a series of pattern recognition tasks demonstrate that our framework outperforms many state-of-the-art MIL methods.
arXiv Detail & Related papers (2022-05-27T03:29:17Z)
Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks. Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients. We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z)
A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled. We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples. We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z)
Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition [55.362258027878966]
We present momentum pseudo-labeling (MPL) as a simple yet effective strategy for semi-supervised speech recognition. MPL consists of a pair of online and offline models that interact and learn from each other, inspired by the mean teacher method. The experimental results demonstrate that MPL effectively improves over the base model and is scalable to different semi-supervised scenarios.
arXiv Detail & Related papers (2021-06-16T16:24:55Z)
A Visual Mining Approach to Improved Multiple-Instance Learning [3.611492083936225]
Multiple-instance learning (MIL) is a paradigm of machine learning that aims to classify a set (bag) of objects (instances) and assign labels only to the bags. We propose a multiscale tree-based visualization to support MIL. The first level of the tree represents the bags, and the second level represents the instances belonging to each bag.
arXiv Detail & Related papers (2020-12-14T05:12:43Z)
Dual-stream Multiple Instance Learning Network for Whole Slide Image Classification with Self-supervised Contrastive Learning [16.84711797934138]
We address the challenging problem of whole slide image (WSI) classification. WSI classification can be cast as a multiple instance learning (MIL) problem when only slide-level labels are available. We propose a MIL-based method for WSI classification and tumor detection that does not require localized annotations.
arXiv Detail & Related papers (2020-11-17T20:51:15Z)
Memory-Augmented Relation Network for Few-Shot Learning [114.47866281436829]
In this work, we investigate a new metric-learning method, Memory-Augmented Relation Network (MRN). In MRN, we choose the samples that are visually similar from the working context, and perform weighted information propagation to attentively aggregate helpful information from the chosen ones to enhance its representation. We empirically demonstrate that MRN yields significant improvement over its ancestor and achieves competitive or even better performance when compared with other few-shot learning approaches.
arXiv Detail & Related papers (2020-05-09T10:09:13Z)
Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning [82.41415008107502]
Weakly-supervised action localization requires training a model to localize the action segments in a video given only video-level action labels. It can be solved under the Multiple Instance Learning (MIL) framework, where a bag (video) contains multiple instances (action segments). We show that our EM-MIL approach more accurately models both the learning objective and the MIL assumptions.
arXiv Detail & Related papers (2020-03-31T23:36:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.