Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based
Tumor Classification
- URL: http://arxiv.org/abs/2307.07482v2
- Date: Fri, 17 Nov 2023 11:30:33 GMT
- Title: Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based
Tumor Classification
- Authors: Simon Holdenried-Krafft and Peter Somers and Ivonne A. Montes-Mojarro
and Diana Silimon and Cristina Tarín and Falko Fend and Hendrik P. A.
Lensch
- Abstract summary: Whole slide image (WSI) assessment is a challenging and crucial step in cancer diagnosis and treatment planning.
Coarse-grained labels are easily accessible, which makes WSI classification an ideal use case for multiple instance learning (MIL).
We propose a novel embedding-based Dual-Query MIL pipeline (DQ-MIL).
- Score: 5.121989578393729
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Whole slide image (WSI) assessment is a challenging and crucial step in
cancer diagnosis and treatment planning. WSIs require high magnifications to
facilitate sub-cellular analysis. Precise annotations for patch- or even
pixel-level classifications in the context of gigapixel WSIs are tedious to
acquire and require domain experts. Coarse-grained labels, on the other hand,
are easily accessible, which makes WSI classification an ideal use case for
multiple instance learning (MIL). In our work, we propose a novel
embedding-based Dual-Query MIL pipeline (DQ-MIL). We contribute to both the
embedding and aggregation steps. Since all-purpose visual feature
representations are not yet available, embedding models are currently limited
in terms of generalizability. With our work, we explore the potential of
dynamic meta-embedding based on cutting-edge self-supervised pre-trained models
in the context of MIL. Moreover, we propose a new MIL architecture capable of
combining MIL-attention with correlated self-attention. The Dual-Query
Perceiver design of our approach allows us to leverage the concept of
self-distillation and to combine the advantages of a small model in the context
of a low data regime with the rich feature representation of a larger model. We
demonstrate the superior performance of our approach on three histopathological
datasets, where we show an improvement of up to 10% over state-of-the-art
approaches.
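The abstract describes two components: a dynamic meta-embedding that fuses patch features from several self-supervised pre-trained encoders, and a dual-query, attention-based MIL aggregator that produces a slide-level prediction from a bag of patch embeddings. Below is a minimal sketch of how such a pipeline could look, assuming pre-extracted patch features; the gated fusion scheme, the two-query cross-attention layout, and all dimensions and names are illustrative assumptions, not the authors' exact DQ-MIL architecture.

```python
import torch
import torch.nn as nn


class DynamicMetaEmbedding(nn.Module):
    """Fuse patch features from multiple frozen pre-trained encoders."""

    def __init__(self, feat_dims, embed_dim=256):
        super().__init__()
        # One projection per source feature space, mapping into a shared space.
        self.projections = nn.ModuleList(nn.Linear(d, embed_dim) for d in feat_dims)
        # Learned scalar gate per source, softmax-normalized at fusion time.
        self.gate = nn.Parameter(torch.zeros(len(feat_dims)))

    def forward(self, feats):
        # feats: list of tensors, each of shape (num_patches, feat_dims[i])
        projected = torch.stack(
            [proj(f) for proj, f in zip(self.projections, feats)], dim=0
        )                                                      # (num_sources, N, D)
        weights = torch.softmax(self.gate, dim=0).view(-1, 1, 1)
        return (weights * projected).sum(dim=0)                # (N, D)


class DualQueryMILAggregator(nn.Module):
    """Two learned queries cross-attend to the bag of instance embeddings."""

    def __init__(self, embed_dim=256, num_heads=4, num_classes=2):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(2, embed_dim))
        self.cross_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(2 * embed_dim, num_classes)

    def forward(self, instance_embeddings):
        # instance_embeddings: (num_patches, embed_dim) for a single slide (bag)
        bag = instance_embeddings.unsqueeze(0)                 # (1, N, D)
        queries = self.queries.unsqueeze(0)                    # (1, 2, D)
        slide_tokens, _ = self.cross_attn(queries, bag, bag)   # (1, 2, D)
        return self.classifier(slide_tokens.flatten(1))        # (1, num_classes)


# Dummy patch features from two hypothetical encoders (e.g. 384- and 768-dim).
feats = [torch.randn(500, 384), torch.randn(500, 768)]
embedder = DynamicMetaEmbedding(feat_dims=[384, 768])
aggregator = DualQueryMILAggregator()
logits = aggregator(embedder(feats))                           # slide-level prediction
```

In this sketch, the two learned queries stand in for the dual queries of the abstract: each attends over the whole bag of instance embeddings, so the slide-level representation is formed without any patch-level labels, which is the defining property of embedding-based MIL.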
Related papers
- Queryable Prototype Multiple Instance Learning with Vision-Language Models for Incremental Whole Slide Image Classification [10.667645628712542]
This paper proposes the first Vision-Language-based framework with Queryable Prototype Multiple Instance Learning (QPMIL-VL) specially designed for incremental WSI classification.
Experiments on four TCGA datasets demonstrate that our QPMIL-VL framework is effective for incremental WSI classification.
arXiv Detail & Related papers (2024-10-14T14:49:34Z) - Language Models are Graph Learners [70.14063765424012]
Language Models (LMs) are challenging the dominance of domain-specific models, including Graph Neural Networks (GNNs) and Graph Transformers (GTs).
We propose a novel approach that empowers off-the-shelf LMs to achieve performance comparable to state-of-the-art GNNs on node classification tasks.
arXiv Detail & Related papers (2024-10-03T08:27:54Z) - Mamba2MIL: State Space Duality Based Multiple Instance Learning for Computational Pathology [17.329498427735565]
We propose a novel Multiple Instance Learning framework called Mamba2MIL.
Mamba2MIL exploits both order-related and order-independent features, addressing the suboptimal utilization of sequence information in existing MIL approaches.
We conduct extensive experiments across multiple datasets, achieving improvements in nearly all performance metrics.
arXiv Detail & Related papers (2024-08-27T13:01:19Z) - Multi-modal Auto-regressive Modeling via Visual Words [96.25078866446053]
We propose the concept of visual words, which maps the visual features to probability distributions over Large Multi-modal Models' vocabulary.
We further explore the distribution of visual features in the semantic space within LMM and the possibility of using text embeddings to represent visual information.
arXiv Detail & Related papers (2024-03-12T14:58:52Z) - MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models [56.37780601189795]
We propose a framework named MamMIL for WSI analysis.
We represent each WSI as an undirected graph.
To address the problem that Mamba can only process 1D sequences, we propose a topology-aware scanning mechanism.
arXiv Detail & Related papers (2024-03-08T09:02:13Z) - When Parameter-efficient Tuning Meets General-purpose Vision-language
Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z) - TPMIL: Trainable Prototype Enhanced Multiple Instance Learning for Whole
Slide Image Classification [13.195971707693365]
We develop a Trainable Prototype enhanced deep MIL framework for weakly supervised WSI classification.
Our method is able to reveal the correlations between different tumor subtypes through distances between corresponding trained prototypes.
We test our method on two WSI datasets and it achieves a new SOTA.
arXiv Detail & Related papers (2023-05-01T07:39:19Z) - AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context
Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information, achieving the same performance with as little as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z) - Dual-stream Multiple Instance Learning Network for Whole Slide Image
Classification with Self-supervised Contrastive Learning [16.84711797934138]
We address the challenging problem of whole slide image (WSI) classification.
WSI classification can be cast as a multiple instance learning (MIL) problem when only slide-level labels are available.
We propose a MIL-based method for WSI classification and tumor detection that does not require localized annotations.
arXiv Detail & Related papers (2020-11-17T20:51:15Z) - Dynamic Memory Induction Networks for Few-Shot Text Classification [84.88381813651971]
This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot text classification.
The proposed model achieves new state-of-the-art results on the miniRCV1 and ODIC datasets, improving the best performance (accuracy) by 2-4%.
arXiv Detail & Related papers (2020-05-12T12:41:14Z)