Related papers: Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis

URL: http://arxiv.org/abs/2506.05184v1
Date: Thu, 05 Jun 2025 15:56:45 GMT
Title: Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis
Authors: Neeraj Kumar, Swaraj Nanda, Siddharth Singi, Jamal Benhamida, David Kim, Jie-Fu Chen, Amir Momeni-Boroujeni, Gregory M. Goldgof, Gabriele Campanella, Chad Vanderbilt
Abstract summary: Pathology foundation models (PFMs) have emerged as powerful tools for analyzing whole slide images (WSIs). TAPFM uses vision transformer (ViT) attention for MIL aggregation while optimizing both feature representations and attention weights. TAPFM is evaluated on mutation prediction tasks for bladder cancer and lung adenocarcinoma.
Score: 8.076987502347327
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Pathology foundation models (PFMs) have emerged as powerful tools for analyzing whole slide images (WSIs). However, adapting these pretrained PFMs for specific clinical tasks presents considerable challenges, primarily due to the availability of only weak (WSI-level) labels for gigapixel images, necessitating a multiple instance learning (MIL) paradigm for effective WSI analysis. This paper proposes a novel approach for single-GPU Task Adaptation of PFMs (TAPFM) that uses vision transformer (ViT) attention for MIL aggregation while optimizing both feature representations and attention weights. The proposed approach maintains separate computational graphs for the MIL aggregator and the PFM to create stable training dynamics that align with downstream task objectives during end-to-end adaptation. Evaluated on mutation prediction tasks for bladder cancer and lung adenocarcinoma across institutional and TCGA cohorts, TAPFM consistently outperforms conventional approaches, with H-Optimus-0 (TAPFM) outperforming the benchmarks. TAPFM also effectively handles multi-label classification of actionable mutations. Thus, TAPFM makes adaptation of powerful pre-trained PFMs practical on standard hardware for various clinical applications.
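Illustration: the abstract does not spell out how attention values become MIL weights, so the following is only a generic sketch of attention-weighted MIL aggregation, not the authors' implementation. It assumes each tile already has a feature vector and a scalar attention score; the function names `softmax` and `attention_mil_pool` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(tile_features, attention_scores):
    """Pool per-tile features into one slide-level vector.

    Assumption: one scalar attention score per tile (for example, derived
    from a transformer's attention); how TAPFM derives it is not shown here.
    """
    if not tile_features or len(tile_features) != len(attention_scores):
        raise ValueError("need one attention score per tile, and at least one tile")
    weights = softmax(attention_scores)
    dim = len(tile_features[0])
    # Weighted sum of tile features, one coordinate at a time.
    slide_vector = [
        sum(w * feats[d] for w, feats in zip(weights, tile_features))
        for d in range(dim)
    ]
    return slide_vector, weights


if __name__ == "__main__":
    tiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    scores = [0.0, 0.0, 0.0]  # equal scores reduce to mean pooling
    print(attention_mil_pool(tiles, scores))
```

With equal scores the pooling reduces to a plain mean over tiles; a tile with a much larger score dominates the slide-level vector, which is what lets a weak slide-level label be attributed to a few informative tiles.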

Related papers

TAP-SLF: Parameter-Efficient Adaptation of Vision Foundation Models for Multi-Task Ultrasound Image Analysis [1.5074458114135958]
Task-Aware Prompting and Selective Layer Fine-Tuning (TAP-SLF) is a unified framework for multi-task ultrasound image analysis. TAP-SLF incorporates task-specific priors into the input token sequence and applies LoRA to selected top layers of the encoder. Results on the FMC_UIA 2026 Challenge test set, combined with evaluations on the officially released training dataset using an 8:2 train-test split, demonstrate that task-aware prompting and selective layer tuning are effective strategies for efficient VFM adaptation.
arXiv Detail & Related papers (2026-02-28T03:21:07Z)
Fusion of Heterogeneous Pathology Foundation Models for Whole Slide Image Analysis [10.323462166785133]
Whole slide image (WSI) analysis has emerged as an increasingly essential technique in computational pathology. Recent advances in pathological foundation models (FMs) have demonstrated significant advantages in deriving meaningful patch-level or slide-level feature representations from WSIs. We propose a novel framework for the fusion of heterogeneous pathological FMs, called FuseCPath, yielding a model with a superior ensemble performance.
arXiv Detail & Related papers (2025-10-31T06:59:11Z)
GAS-MIL: Group-Aggregative Selection Multi-Instance Learning for Ensemble of Foundation Models in Digital Pathology Image Analysis [6.45975531973783]
GAS-MIL is a flexible ensemble framework that seamlessly integrates features from multiple foundation models. It achieves superior or on-par performance relative to individual FMs and established MIL methods. It provides a scalable foundation for future multimodal and precision oncology applications.
arXiv Detail & Related papers (2025-10-03T22:59:40Z)
MedSeqFT: Sequential Fine-tuning Foundation Models for 3D Medical Image Segmentation [55.37355146924576]
MedSeqFT is a sequential fine-tuning framework for medical image analysis. It adapts pre-trained models to new tasks while refining their representational capacity. It consistently outperforms state-of-the-art fine-tuning strategies.
arXiv Detail & Related papers (2025-09-07T15:22:53Z)
Ensemble of Pathology Foundation Models for MIDOG 2025 Track 2: Atypical Mitosis Classification [0.0]
We leveraged Pathology Foundation Models (PFMs) pre-trained on large histopathology datasets. We incorporated ConvNeXt V2, a state-of-the-art convolutional neural network architecture, to complement PFMs. We ensembled multiple PFMs to integrate complementary morphological insights, achieving balanced accuracy on the Preliminary Evaluation Phase dataset.
arXiv Detail & Related papers (2025-08-29T03:24:57Z)
AdaFusion: Prompt-Guided Inference with Adaptive Fusion of Pathology Foundation Models [49.550545038402184]
We propose AdaFusion, a novel prompt-guided inference framework. Our method compresses and aligns tile-level features from diverse models. AdaFusion consistently surpasses individual PFMs across both classification and regression tasks.
arXiv Detail & Related papers (2025-08-07T07:09:31Z)
Benchmarking histopathology foundation models in a multi-center dataset for skin cancer subtyping [1.927195358774599]
Pretraining on large-scale, in-domain datasets grants histopathology foundation models (FM) the ability to learn task-agnostic data representations. In computational pathology, automated whole slide image analysis requires multiple instance learning (MIL) frameworks due to the gigapixel scale of the slides. Our work presents a novel benchmark for evaluating histopathology FMs as patch-level feature extractors within a MIL classification framework.
arXiv Detail & Related papers (2025-06-23T14:12:16Z)
Multi-Scale Finetuning for Encoder-based Time Series Foundation Models [56.503053716053]
Time series foundation models (TSFMs) demonstrate impressive zero-shot performance for time series forecasting. We argue that it falls short of fully leveraging TSFMs' capabilities, often resulting in overfitting and suboptimal performance. We propose Multi-Scale Finetuning (MSFT), a simple yet general framework that explicitly integrates multi-scale modeling into the finetuning process.
arXiv Detail & Related papers (2025-06-17T01:06:01Z)
FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation [65.93276461982093]
Existing approaches either selectively fine-tune parameters or freeze the VFMs and update only the adapters. We propose FisherTune, a robust fine-tuning method guided by the Domain-Related Fisher Information Matrix (DR-FIM). DR-FIM measures parameter sensitivity across tasks and domains, enabling selective updates that preserve generalization and enhance DGSS adaptability.
arXiv Detail & Related papers (2025-03-23T04:47:15Z)
Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning [63.55145330447408]
We propose a novel Self-Perception Tuning (SPT) method for anomaly segmentation. The SPT method incorporates a self-drafting tuning strategy, which generates an initial coarse draft of the anomaly mask, followed by a refinement process.
arXiv Detail & Related papers (2024-11-26T08:33:25Z)
Adaptive Aggregation Weights for Federated Segmentation of Pancreas MRI [5.631060921219683]
Federated learning (FL) enables collaborative model training across institutions without sharing sensitive data. Traditional FL methods, such as Federated Averaging (FedAvg), face difficulties in generalizing across domains. This paper introduces a novel approach that incorporates adaptive aggregation weights.
arXiv Detail & Related papers (2024-10-29T20:53:01Z)
A Novel Benchmark for Few-Shot Semantic Segmentation in the Era of Foundation Models [7.428199805959228]
Few-shot semantic segmentation (FSS) is a crucial challenge in computer vision. With the emergence of vision foundation models (VFM) as generalist feature extractors, we seek to explore the adaptation of these models for FSS. We propose a novel realistic benchmark with a simple and straightforward adaptation process tailored for this task.
arXiv Detail & Related papers (2024-01-20T19:50:51Z)
Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN). CMMN consists in filtering the signals in order to adapt their power spectrum density (PSD) to a Wasserstein barycenter estimated on training data. Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent from the neural network architecture.
arXiv Detail & Related papers (2023-05-30T08:24:01Z)
Training Lightweight Graph Convolutional Networks with Phase-field Models [12.18340575383456]
We design lightweight graph convolutional networks (GCNs) using a particular class of regularizers, dubbed phase-field models (PFMs). PFMs exhibit a bi-phase behavior using a particular ultra-local term that allows training both the topology and the weight parameters of GCNs as a part of a single "end-to-end" optimization problem.
arXiv Detail & Related papers (2022-12-19T12:49:03Z)
PyMAF-X: Towards Well-aligned Full-body Model Regression from Monocular Images [60.33197938330409]
PyMAF-X is a regression-based approach to recovering parametric full-body models from monocular images. PyMAF and PyMAF-X effectively improve the mesh-image alignment and achieve new state-of-the-art results.
arXiv Detail & Related papers (2022-07-13T17:58:33Z)
FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization [73.41395947275473]
We propose a novel frequency-aware architecture, in which the domain-specific features are filtered out in the transformed frequency domain. Experiments on three benchmarks demonstrate significant performance, outperforming the state-of-the-art methods by a margin of 3%, 4% and 9%, respectively.
arXiv Detail & Related papers (2022-03-24T07:26:29Z)
Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E). We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.