Pathology Foundation Models are Scanner Sensitive: Benchmark and Mitigation with Contrastive ScanGen Loss
- URL: http://arxiv.org/abs/2507.22092v1
- Date: Tue, 29 Jul 2025 12:35:08 GMT
- Title: Pathology Foundation Models are Scanner Sensitive: Benchmark and Mitigation with Contrastive ScanGen Loss
- Authors: Gianluca Carloni, Biagio Brattoli, Seongho Keum, Jongchan Park, Taebum Lee, Chang Ho Ahn, Sergio Pereira,
- Abstract summary: We show that Foundation Models (FMs) still suffer from scanner bias. We propose ScanGen, a contrastive loss function applied during task-specific fine-tuning that mitigates scanner bias. Our approach is applied to the Multiple Instance Learning task of Epidermal Growth Factor Receptor (EGFR) mutation prediction from H&E-stained WSIs in lung cancer.
- Score: 6.310092608526967
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational pathology (CPath) has shown great potential in mining actionable insights from Whole Slide Images (WSIs). Deep Learning (DL) has been at the center of modern CPath, and while it delivers unprecedented performance, it is also known that DL may be affected by irrelevant details, such as those introduced during scanning by different commercially available scanners. This may lead to scanner bias, where the model outputs for the same tissue acquired by different scanners may vary. In turn, it hinders the trust of clinicians in CPath-based tools and their deployment in real-world clinical practices. Recent pathology Foundation Models (FMs) promise to provide better domain generalization capabilities. In this paper, we benchmark FMs using a multi-scanner dataset and show that FMs still suffer from scanner bias. Following this observation, we propose ScanGen, a contrastive loss function applied during task-specific fine-tuning that mitigates scanner bias, thereby enhancing the models' robustness to scanner variations. Our approach is applied to the Multiple Instance Learning task of Epidermal Growth Factor Receptor (EGFR) mutation prediction from H&E-stained WSIs in lung cancer. We observe that ScanGen notably enhances the ability to generalize across scanners, while retaining or improving the performance of EGFR mutation prediction.
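The abstract does not give the exact form of ScanGen, so the following is only a rough illustration of the general idea of a contrastive term added to a task loss during fine-tuning. It is a generic InfoNCE-style sketch in NumPy, not the authors' formulation. It assumes paired embeddings where row i of `z_a` and row i of `z_b` come from the same tissue acquired on two different scanners. The function names, the `temperature` value, and the `weight` parameter are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def cross_scanner_contrastive_loss(z_a, z_b, temperature=0.1):
    """Generic InfoNCE-style loss (an assumption, not the paper's ScanGen).

    z_a, z_b: arrays of shape (n, d). Row i of z_a and row i of z_b are
    assumed to be embeddings of the same tissue from two scanners.
    The loss is low when each z_a[i] is closer to z_b[i] than to the
    embeddings of other tissues in the batch.
    """
    z_a = l2_normalize(z_a)
    z_b = l2_normalize(z_b)
    logits = z_a @ z_b.T / temperature  # (n, n) similarity matrix
    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal (same tissue, different scanner).
    return float(-np.mean(np.diag(log_prob)))


def total_loss(task_loss, z_a, z_b, weight=1.0, temperature=0.1):
    """Task loss (e.g. EGFR classification) plus the weighted contrastive term."""
    return task_loss + weight * cross_scanner_contrastive_loss(z_a, z_b, temperature)
```

In this sketch, embeddings that agree across scanners yield a small contrastive term, while embeddings that depend on the scanner are penalized. How the actual method selects positives and negatives and weights the term is not stated in the abstract.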
Related papers
- SCORPION: Addressing Scanner-Induced Variability in Histopathology [4.296051492560909]
Ensuring reliable model performance across diverse domains is a critical challenge in computational pathology. We release SCORPION, a new dataset explicitly designed to evaluate model reliability under scanner variability. We propose SimCons, a flexible framework that combines augmentation-based domain generalization techniques with a consistency loss to explicitly address scanner generalization.
arXiv Detail & Related papers (2025-07-28T15:00:49Z)
- PixCell: A generative foundation model for digital histopathology images [49.00921097924924]
We introduce PixCell, the first diffusion-based generative foundation model for histopathology. We train PixCell on PanCan-30M, a vast, diverse dataset derived from 69,184 H&E-stained whole slide images covering various cancer types.
arXiv Detail & Related papers (2025-06-05T15:14:32Z)
- DISARM++: Beyond scanner-free harmonization [0.0]
Harmonization of T1-weighted MR images across different scanners is crucial for ensuring consistency in neuroimaging studies. This study introduces a novel approach to direct image harmonization, moving beyond feature standardization to ensure that extracted features remain inherently reliable for downstream analysis. Our method enables image transfer in two ways: (1) mapping images to a scanner-free space for uniform appearance across all scanners, and (2) transforming images into the domain of a specific scanner used in model training, embedding its unique characteristics.
arXiv Detail & Related papers (2025-05-06T17:36:49Z)
- The Impact of Scanner Domain Shift on Deep Learning Performance in Medical Imaging: an Experimental Study [1.4628485867112924]
We evaluate the impact of scanner domain shift on convolutional neural network performance for different automated diagnostic tasks.
We find that network performance on data from a different scanner is almost always worse than on same-scanner data.
This drop is quite small for CT, which we attribute to the standardized nature of CT acquisition systems, a standardization not present in MRI or X-ray.
arXiv Detail & Related papers (2024-09-06T15:59:30Z)
- On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input.
DL models are sensitive to varying artifacts, as these lead to changes in the input data distribution between the training and testing phases.
We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
arXiv Detail & Related papers (2023-06-23T03:09:03Z)
- Mind the Gap: Scanner-induced domain shifts pose challenges for representation learning in histopathology [6.309771474997404]
Self-supervised pre-training can be used to overcome scanner-induced domain shifts for the downstream task of tumor segmentation.
We show that self-supervised pre-training successfully aligned different scanner representations, which, interestingly, results in only a limited benefit for our downstream task.
arXiv Detail & Related papers (2022-11-29T12:16:39Z)
- Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention and, in contrast to CNNs, have no prior knowledge of local connectivity.
Our results show that while the performance between ViTs and CNNs is on par with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
arXiv Detail & Related papers (2022-08-17T09:07:45Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
- Improved Slice-wise Tumour Detection in Brain MRIs by Computing Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We have proposed a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
arXiv Detail & Related papers (2020-07-24T14:02:09Z)
- Improved inter-scanner MS lesion segmentation by adversarial training on longitudinal data [0.0]
The evaluation of white matter lesion progression is an important biomarker in the follow-up of MS patients.
Current automated lesion segmentation algorithms are susceptible to variability in image characteristics related to MRI scanner or protocol differences.
We propose a model that improves the consistency of MS lesion segmentations in inter-scanner studies.
arXiv Detail & Related papers (2020-02-03T16:56:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.