Benchmarking Pathology Feature Extractors for Whole Slide Image Classification
- URL: http://arxiv.org/abs/2311.11772v5
- Date: Fri, 21 Jun 2024 10:43:34 GMT
- Title: Benchmarking Pathology Feature Extractors for Whole Slide Image Classification
- Authors: Georg Wölflein, Dyke Ferber, Asier R. Meneghetti, Omar S. M. El Nahhas, Daniel Truhn, Zunamys I. Carrero, David J. Harrison, Ognjen Arandjelović, Jakob Nikolas Kather
- Abstract summary: Weakly supervised whole slide image classification is a key task in computational pathology.
We conduct a comprehensive benchmarking of feature extractors to answer three critical questions.
We observe empirically, and by analysing the latent space, that skipping stain normalisation and image augmentations does not degrade performance.
We develop a novel evaluation metric to compare relative downstream performance, and show that the choice of feature extractor is the most consequential factor for downstream performance.
- Score: 2.173830337391778
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weakly supervised whole slide image classification is a key task in computational pathology, which involves predicting a slide-level label from a set of image patches constituting the slide. Constructing models to solve this task involves multiple design choices, often made without robust empirical or conclusive theoretical justification. To address this, we conduct a comprehensive benchmarking of feature extractors to answer three critical questions: 1) Is stain normalisation still a necessary preprocessing step? 2) Which feature extractors are best for downstream slide-level classification? 3) How does magnification affect downstream performance? Our study constitutes the most comprehensive evaluation of publicly available pathology feature extractors to date, involving more than 10,000 training runs across 14 feature extractors, 9 tasks, 5 datasets, 3 downstream architectures, 2 levels of magnification, and various preprocessing setups. Our findings challenge existing assumptions: 1) We observe empirically, and by analysing the latent space, that skipping stain normalisation and image augmentations does not degrade performance, while significantly reducing memory and computational demands. 2) We develop a novel evaluation metric to compare relative downstream performance, and show that the choice of feature extractor is the most consequential factor for downstream performance. 3) We find that lower-magnification slides are sufficient for accurate slide-level classification. Contrary to previous patch-level benchmarking studies, our approach emphasises clinical relevance by focusing on slide-level biomarker prediction tasks in a weakly supervised setting with external validation cohorts. Our findings stand to streamline digital pathology workflows by minimising preprocessing needs and informing the selection of feature extractors.
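To make the setup concrete, below is a minimal sketch of the weakly supervised pipeline the abstract describes: a frozen patch-level feature extractor produces one embedding per patch, and a small attention-based aggregator pools the bag of embeddings into a single slide-level prediction. The module names, dimensions, and the choice of an attention aggregator are illustrative assumptions, not the authors' exact configuration.
```python
# Illustrative sketch only: a frozen feature extractor (not shown) yields one
# embedding per patch; an attention-based aggregator turns the bag of patch
# embeddings into a slide-level prediction. Dimensions and names are assumptions.
import torch
import torch.nn as nn


class AttentionMIL(nn.Module):
    def __init__(self, feat_dim: int = 768, hidden_dim: int = 256, n_classes: int = 2):
        super().__init__()
        # Attention scoring per patch (simplified attention-MIL style)
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        # patch_feats: (n_patches, feat_dim) embeddings from a frozen extractor
        weights = torch.softmax(self.attention(patch_feats), dim=0)  # (n_patches, 1)
        slide_feat = (weights * patch_feats).sum(dim=0)              # (feat_dim,)
        return self.classifier(slide_feat)                           # slide-level logits


if __name__ == "__main__":
    # Stand-in for real patch embeddings (e.g. 500 patches, 768-dim features)
    patch_feats = torch.randn(500, 768)
    model = AttentionMIL()
    print(model(patch_feats).shape)  # torch.Size([2])
```
In the benchmarking setting described above, the comparisons of interest come from swapping the feature extractor, magnification, and preprocessing that produce `patch_feats`, while downstream aggregators like this one are varied separately.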
Related papers
- PATHS: A Hierarchical Transformer for Efficient Whole Slide Image Analysis [9.862551438475666]
We propose a novel top-down method for hierarchical weakly supervised representation learning on slide-level tasks in computational pathology.
PATHS is inspired by the cross-magnification manner in which a human pathologist examines a slide, filtering patches at each magnification level to a small subset relevant to the diagnosis.
We apply PATHS to five datasets of The Cancer Genome Atlas (TCGA), and achieve superior performance on slide-level prediction tasks.
arXiv Detail & Related papers (2024-11-27T11:03:38Z)
- Self-Contrastive Weakly Supervised Learning Framework for Prognostic Prediction Using Whole Slide Images [3.6330373579181927]
Prognostic prediction poses a unique challenge as the ground truth labels are inherently weak.
We propose a novel three-part framework that includes a convolutional network-based tissue segmentation algorithm for region-of-interest delineation.
Our best models yield AUCs of 0.721 and 0.678 for recurrence and treatment outcome prediction, respectively.
arXiv Detail & Related papers (2024-05-24T06:45:36Z)
- AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation [123.88875931128342]
Objective misalignment is a serious issue that harms the performance of zero-shot visual recognition.
We propose a novel architecture named AlignZeg, which embodies a comprehensive improvement of the segmentation pipeline.
Experiments demonstrate that AlignZeg markedly enhances zero-shot semantic segmentation.
arXiv Detail & Related papers (2024-04-08T16:51:33Z)
- What Matters When Repurposing Diffusion Models for General Dense Perception Tasks? [49.84679952948808]
Recent works show promising results by simply fine-tuning T2I diffusion models for dense perception tasks.
We conduct a thorough investigation into critical factors that affect transfer efficiency and performance when using diffusion priors.
Our work culminates in the development of GenPercept, an effective deterministic one-step fine-tuning paradigm tailored for dense visual perception tasks.
arXiv Detail & Related papers (2024-03-10T04:23:24Z)
- Low-resource finetuning of foundation models beats state-of-the-art in histopathology [3.4577420145036375]
We benchmark the most popular vision foundation models as feature extractors for histopathology data.
By finetuning a foundation model on a single GPU for only two hours or three days, depending on the dataset, we can match or outperform state-of-the-art feature extractors.
This is a considerable shift from the current state, where only a few institutions with large amounts of resources and data are able to train a feature extractor.
arXiv Detail & Related papers (2024-01-09T18:46:59Z)
- Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven clinical decision support.
Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit.
Clairvoyance is the first to demonstrate the viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
- A Knowledge-based Learning Framework for Self-supervised Pre-training Towards Enhanced Recognition of Medical Images [14.304996977665212]
This study proposes a knowledge-based learning framework towards enhanced recognition of medical images.
It works in three phases, combining contrastive and generative learning models.
The proposed framework consistently outperforms SimCLR on self-supervised benchmarks, with improvements of 2.08, 1.23, 1.12, 0.76 and 1.38 percentage points in AUC/Dice.
arXiv Detail & Related papers (2022-11-27T03:58:58Z)
- Active Gaze Control for Foveal Scene Exploration [124.11737060344052]
We propose a methodology to emulate how humans and robots with foveal cameras would explore a scene.
The proposed method achieves an increase in detection F1-score of 2-3 percentage points for the same number of gaze shifts.
arXiv Detail & Related papers (2022-08-24T14:59:28Z)
- Self-supervised Pretraining with Classification Labels for Temporal Activity Detection [54.366236719520565]
Temporal Activity Detection aims to predict activity classes per frame.
Due to the expensive frame-level annotations required for detection, the scale of detection datasets is limited.
This work proposes a novel self-supervised pretraining method for detection leveraging classification labels.
arXiv Detail & Related papers (2021-11-26T18:59:28Z)
- Cascaded Robust Learning at Imperfect Labels for Chest X-ray Segmentation [61.09321488002978]
We present a novel cascaded robust learning framework for chest X-ray segmentation with imperfect annotation.
Our model consists of three independent networks, which can effectively learn useful information from their peer networks.
Our method achieves a significant improvement in segmentation accuracy compared to previous methods.
arXiv Detail & Related papers (2021-04-05T15:50:16Z)
- Overcoming the limitations of patch-based learning to detect cancer in whole slide images [0.15658704610960567]
Whole slide images (WSIs) pose unique challenges when training deep learning models.
We outline the differences between patch- or slide-level classification and methods that need to localize or segment cancer accurately across the whole slide.
We propose a negative data sampling strategy, which drastically reduces the false positive rate and improves each metric pertinent to our problem.
arXiv Detail & Related papers (2020-12-01T16:37:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.