Prompt-MIL: Boosting Multi-Instance Learning Schemes via Task-specific
Prompt Tuning
- URL: http://arxiv.org/abs/2303.12214v2
- Date: Thu, 5 Oct 2023 03:50:19 GMT
- Authors: Jingwei Zhang, Saarthak Kapse, Ke Ma, Prateek Prasanna, Joel Saltz,
Maria Vakalopoulou, Dimitris Samaras
- Abstract summary: Whole slide image (WSI) classification is a critical task in computational pathology.
Current state-of-the-art methods are based on multi-instance learning (MIL) schemes, which usually rely on pretrained features to represent the instances.
We propose Prompt-MIL, an MIL framework that integrates prompts into WSI classification.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Whole slide image (WSI) classification is a critical task in computational
pathology, requiring the processing of gigapixel-sized images, which is
challenging for current deep-learning methods. Current state-of-the-art methods
are based on multi-instance learning (MIL) schemes, which usually rely on
pretrained features to represent the instances. Due to the lack of
task-specific annotated data, these features are either obtained from
well-established backbones on natural images, or, more recently from
self-supervised models pretrained on histopathology. However, both approaches
yield task-agnostic features, resulting in performance loss compared to the
appropriate task-related supervision, if available. In this paper, we show that
when task-specific annotations are limited, we can inject such supervision into
downstream task training, to reduce the gap between fully task-tuned and
task-agnostic features. We propose Prompt-MIL, an MIL framework that integrates
prompts into WSI classification. Prompt-MIL adopts a prompt tuning mechanism,
where only a small fraction of parameters calibrates the pretrained features to
encode task-specific information, rather than the conventional full fine-tuning
approaches. Extensive experiments on three WSI datasets, TCGA-BRCA, TCGA-CRC,
and BRIGHT, demonstrate the superiority of Prompt-MIL over conventional MIL
methods, achieving a relative improvement of 1.49%-4.03% in accuracy and
0.25%-8.97% in AUROC while using fewer than 0.3% additional parameters.
Compared to conventional full fine-tuning approaches, we fine-tune less than
1.3% of the parameters, yet achieve a relative improvement of 1.29%-13.61% in
accuracy and 3.22%-27.18% in AUROC and reduce GPU memory consumption by 38%-45%
while training 21%-27% faster. Our code is available at
https://github.com/cvlab-stonybrook/PromptMIL.
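The prompt tuning mechanism described in the abstract can be illustrated with a minimal sketch: learnable prompt tokens are prepended to frozen pretrained instance features before a frozen encoder layer, so that only the prompts and a small MIL head receive gradients. All class, method, and dimension names below are illustrative assumptions, not the authors' implementation (see the linked repository for the actual code).

```python
import torch
import torch.nn as nn

class PromptMILSketch(nn.Module):
    """Illustrative sketch of task-specific prompt tuning for MIL.

    A frozen encoder layer stands in for the pretrained backbone; the
    learnable prompt tokens are the only encoder-side trainable parameters.
    """

    def __init__(self, feat_dim=384, n_prompts=1, n_classes=2):
        super().__init__()
        # Stand-in for a frozen pretrained feature encoder.
        self.encoder = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=8, batch_first=True)
        for p in self.encoder.parameters():
            p.requires_grad = False
        # Learnable prompt tokens that calibrate the frozen features.
        self.prompts = nn.Parameter(torch.zeros(1, n_prompts, feat_dim))
        # Lightweight attention-based MIL pooling and classifier head.
        self.attn = nn.Linear(feat_dim, 1)
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, instance_feats):            # (B, N, D) patch features
        b = instance_feats.size(0)
        x = torch.cat([self.prompts.expand(b, -1, -1), instance_feats], dim=1)
        x = self.encoder(x)                       # prompts attend to instances
        x = x[:, self.prompts.size(1):]           # drop the prompt positions
        w = torch.softmax(self.attn(x), dim=1)    # attention-weighted pooling
        return self.head((w * x).sum(dim=1))      # bag-level logits

model = PromptMILSketch()
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable fraction: {trainable / total:.4f}")
```

Counting parameters in this toy configuration shows why the approach is cheap: the prompts plus the small MIL head account for well under 1% of the total, which is the same mechanism behind the paper's reported sub-1.3% trainable-parameter budget.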
Related papers
- AHDMIL: Asymmetric Hierarchical Distillation Multi-Instance Learning for Fast and Accurate Whole-Slide Image Classification [51.525891360380285]
AHDMIL is an Asymmetric Hierarchical Distillation Multi-Instance Learning framework. It eliminates irrelevant patches through a two-step training process. It consistently outperforms previous state-of-the-art methods in both classification performance and inference speed.
arXiv Detail & Related papers (2025-08-07T07:47:16Z)
- Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning [51.525891360380285]
HDMIL is a hierarchical distillation multi-instance learning framework that achieves fast and accurate classification by eliminating irrelevant patches.
HDMIL consists of two key components: the dynamic multi-instance network (DMIN) and the lightweight instance pre-screening network (LIPN)
arXiv Detail & Related papers (2025-02-28T15:10:07Z)
- ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts.
Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z)
- Gradient-Mask Tuning Elevates the Upper Limits of LLM Performance [51.36243421001282]
Gradient-Mask Tuning (GMT) is a method that selectively updates parameters during training based on their gradient information.
Our empirical results across various tasks demonstrate that GMT not only outperforms traditional fine-tuning methods but also elevates the upper limits of LLM performance.
arXiv Detail & Related papers (2024-06-21T17:42:52Z)
- VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding [6.816428690763012]
A standard approach to leverage large-scale pre-trained models is to fine-tune all model parameters for downstream tasks.
We propose VMT-Adapter, which shares knowledge from multiple tasks to enhance cross-task interaction.
We also propose VMT-Adapter-Lite, which further reduces the trainable parameters by learning shared parameters between down- and up-projections.
arXiv Detail & Related papers (2023-12-14T08:25:04Z)
- End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames [55.72994484532856]
Temporal action detection (TAD) has seen significant performance improvement with end-to-end training.
Due to the memory bottleneck, only models with limited scales and limited data volumes can afford end-to-end training.
We reduce the memory consumption for end-to-end training, and manage to scale up the TAD backbone to 1 billion parameters and the input video to 1,536 frames.
arXiv Detail & Related papers (2023-11-28T21:31:04Z)
- Multi-Level Contrastive Learning for Dense Prediction Task [59.591755258395594]
We present Multi-Level Contrastive Learning for Dense Prediction Task (MCL), an efficient self-supervised method for learning region-level feature representation for dense prediction tasks.
Our method is motivated by the three key factors in detection: localization, scale consistency and recognition.
Our method consistently outperforms the recent state-of-the-art methods on various datasets with significant margins.
arXiv Detail & Related papers (2023-04-04T17:59:04Z)
- Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement [24.108008515395458]
We propose APE, an Adaptive Prior rEfinement method for CLIP's pre-trained knowledge, which achieves superior accuracy with high computational efficiency.
For the average accuracy over 11 benchmarks, both APE and APE-T attain state-of-the-art and respectively outperform the second-best by +1.59% and +1.99% under 16 shots with x30 less learnable parameters.
arXiv Detail & Related papers (2023-04-03T17:58:54Z)
- Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification [10.243293283318415]
Multiple Instance Learning (MIL) has shown promising results in digital Pathology Whole Slide Image (WSI) classification.
We propose an efficient WSI fine-tuning framework motivated by the Information Bottleneck theory.
Our framework is evaluated on five pathology WSI datasets on various WSI heads.
arXiv Detail & Related papers (2023-03-15T08:41:57Z)
- Gigapixel Whole-Slide Images Classification using Locally Supervised Learning [31.213316201151954]
Histopathology whole slide images (WSIs) play a very important role in clinical studies and serve as the gold standard for many cancer diagnoses.
Conventional methods rely on a multiple instance learning (MIL) strategy to process a WSI at patch level.
We propose a locally supervised learning framework which processes the entire slide by exploring the entire local and global information.
arXiv Detail & Related papers (2022-07-17T19:31:54Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm, where a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- Meta-Generating Deep Attentive Metric for Few-shot Classification [53.07108067253006]
We present a novel deep metric meta-generation method to generate a specific metric for a new few-shot learning task.
In this study, we structure the metric using a three-layer deep attentive network that is flexible enough to produce a discriminative metric for each task.
We gain surprisingly obvious performance improvement over state-of-the-art competitors, especially in the challenging cases.
arXiv Detail & Related papers (2020-12-03T02:07:43Z)
- iTAML: An Incremental Task-Agnostic Meta-learning Approach [123.10294801296926]
Humans can continuously learn new knowledge as their experience grows.
Previous learning in deep neural networks can quickly fade out when they are trained on a new task.
We introduce a novel meta-learning approach that seeks to maintain an equilibrium between all encountered tasks.
arXiv Detail & Related papers (2020-03-25T21:42:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.