Evaluating Computational Pathology Foundation Models for Prostate Cancer Grading under Distribution Shifts
- URL: http://arxiv.org/abs/2410.06723v1
- Date: Wed, 9 Oct 2024 09:45:53 GMT
- Title: Evaluating Computational Pathology Foundation Models for Prostate Cancer Grading under Distribution Shifts
- Authors: Fredrik K. Gustafsson, Mattias Rantalainen,
- Abstract summary: We evaluate two computational pathology foundation models: UNI (trained on more than 100,000 whole-slide images) and CONCH (trained on more than 1.1 million image-caption pairs)
We find that while UNI and CONCH perform well relative to baselines, the absolute performance can still be far from satisfactory in certain settings.
- Score: 3.2995359570845912
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Foundation models have recently become a popular research direction within computational pathology. They are intended to be general-purpose feature extractors, promising to achieve good performance on a range of downstream tasks. Real-world pathology image data does however exhibit considerable variability. Foundation models should be robust to these variations and other distribution shifts which might be encountered in practice. We evaluate two computational pathology foundation models: UNI (trained on more than 100,000 whole-slide images) and CONCH (trained on more than 1.1 million image-caption pairs), by utilizing them as feature extractors within prostate cancer grading models. We find that while UNI and CONCH perform well relative to baselines, the absolute performance can still be far from satisfactory in certain settings. The fact that foundation models have been trained on large and varied datasets does not guarantee that downstream models always will be robust to common distribution shifts.
Related papers
- Weakly supervised deep learning model with size constraint for prostate cancer detection in multiparametric MRI and generalization to unseen domains [0.90668179713299]
We show that the model achieves on-par performance with strong fully supervised baseline models.
We also observe a performance decrease for both fully supervised and weakly supervised models when tested on unseen data domains.
arXiv Detail & Related papers (2024-11-04T12:24:33Z) - Phikon-v2, A large and public feature extractor for biomarker prediction [42.52549987351643]
We train a vision transformer using DINOv2 and publicly release one iteration of this model for further experimentation, coined Phikon-v2.
While trained on publicly available histology slides, Phikon-v2 surpasses our previously released model (Phikon) and performs on par with other histopathology foundation models (FM) trained on proprietary data.
arXiv Detail & Related papers (2024-09-13T20:12:29Z) - A Comprehensive Evaluation of Histopathology Foundation Models for Ovarian Cancer Subtype Classification [1.9499122087408571]
Histopathology foundation models show great promise across many tasks.
We report the most rigorous single-task validation of histopathology foundation models to date.
Histopathology foundation models offer a clear benefit to ovarian cancer subtyping.
arXiv Detail & Related papers (2024-05-16T11:21:02Z) - On the Out of Distribution Robustness of Foundation Models in Medical
Image Segmentation [47.95611203419802]
Foundations for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach.
We compare the generalization performance to unseen domains of various pre-trained models after being fine-tuned on the same in-distribution dataset.
We further developed a new Bayesian uncertainty estimation for frozen models and used them as an indicator to characterize the model's performance on out-of-distribution data.
arXiv Detail & Related papers (2023-11-18T14:52:10Z) - Generalized Logit Adjustment: Calibrating Fine-tuned Models by Removing Label Bias in Foundation Models [75.9543301303586]
Foundation models like CLIP allow zero-shot transfer on various tasks without additional training data.
Fine-tuning and ensembling are also commonly adopted to better fit the downstream tasks.
However, we argue that prior work has overlooked the inherent biases in foundation models.
arXiv Detail & Related papers (2023-10-12T08:01:11Z) - The Languini Kitchen: Enabling Language Modelling Research at Different
Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z) - Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We make empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce textitCLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z) - Realistic Data Enrichment for Robust Image Segmentation in
Histopathology [2.248423960136122]
We propose a new approach, based on diffusion models, which can enrich an imbalanced dataset with plausible examples from underrepresented groups.
Our method can simply expand limited clinical datasets making them suitable to train machine learning pipelines.
arXiv Detail & Related papers (2023-04-19T09:52:50Z) - Accuracy on the Line: On the Strong Correlation Between
Out-of-Distribution and In-Distribution Generalization [89.73665256847858]
We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts.
Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet.
We also investigate cases where the correlation is weaker, for instance some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
arXiv Detail & Related papers (2021-07-09T19:48:23Z) - Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.