Deployment of Image Analysis Algorithms under Prevalence Shifts
- URL: http://arxiv.org/abs/2303.12540v2
- Date: Mon, 24 Jul 2023 13:35:16 GMT
- Title: Deployment of Image Analysis Algorithms under Prevalence Shifts
- Authors: Patrick Godau and Piotr Kalinowski and Evangelia Christodoulou and
Annika Reinke and Minu Tizabi and Luciana Ferrer and Paul Jäger and Lena
Maier-Hein
- Abstract summary: Domain gaps are among the most relevant roadblocks in the clinical translation of machine learning (ML)-based solutions for medical image analysis.
We propose a workflow for prevalence-aware image classification that uses estimated deployment prevalences to adjust a trained classifier to a new environment.
- Score: 6.373765910269204
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Domain gaps are among the most relevant roadblocks in the clinical
translation of machine learning (ML)-based solutions for medical image
analysis. While current research focuses on new training paradigms and network
architectures, little attention is given to the specific effect of prevalence
shifts on an algorithm deployed in practice. Such discrepancies between class
frequencies in the data used for a method's development/validation and that in
its deployment environment(s) are of great importance, for example in the
context of artificial intelligence (AI) democratization, as disease prevalences
may vary widely across time and location. Our contribution is twofold. First,
we empirically demonstrate the potentially severe consequences of missing
prevalence handling by analyzing (i) the extent of miscalibration, (ii) the
deviation of the decision threshold from the optimum, and (iii) the ability of
validation metrics to reflect neural network performance on the deployment
population as a function of the discrepancy between development and deployment
prevalence. Second, we propose a workflow for prevalence-aware image
classification that uses estimated deployment prevalences to adjust a trained
classifier to a new environment, without requiring additional annotated
deployment data. Comprehensive experiments based on a diverse set of 30 medical
classification tasks showcase the benefit of the proposed workflow in
generating better classifier decisions and more reliable performance estimates
compared to current practice.
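The idea of adjusting a trained classifier with estimated deployment prevalences can be illustrated with a standard prior-shift re-weighting of class posteriors. The following is a minimal sketch, not the authors' workflow: it assumes that the classifier's outputs are calibrated on the development data and that only the class prevalences differ between development and deployment. The function name `adjust_posteriors` and all numbers are hypothetical.

```python
from typing import List, Sequence


def adjust_posteriors(
    probs: Sequence[float],
    dev_prevalence: Sequence[float],
    deploy_prevalence: Sequence[float],
) -> List[float]:
    """Re-weight class posteriors from development to deployment prevalences.

    Each posterior is multiplied by the ratio of deployment to development
    prevalence of its class, and the result is renormalized to sum to 1.
    """
    if not (len(probs) == len(dev_prevalence) == len(deploy_prevalence)):
        raise ValueError("all inputs must have one entry per class")
    if any(d <= 0 for d in dev_prevalence):
        raise ValueError("development prevalences must be positive")

    weighted = [
        p * (q / d)
        for p, d, q in zip(probs, dev_prevalence, deploy_prevalence)
    ]
    total = sum(weighted)
    if total == 0:
        raise ValueError("re-weighted posteriors sum to zero")
    return [w / total for w in weighted]


# Hypothetical example: a balanced development set, but the first class
# is estimated to make up 90% of cases at the deployment site.
posterior = [0.5, 0.5]
adjusted = adjust_posteriors(posterior, [0.5, 0.5], [0.9, 0.1])

# Index of the class with the highest adjusted posterior.
predicted_class = max(range(len(adjusted)), key=adjusted.__getitem__)
```

In this sketch, a prediction that was undecided under the development prevalences leans toward the class that is more frequent at the deployment site, without any annotated deployment data being used.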
Related papers
- Towards Clinician-Preferred Segmentation: Leveraging Human-in-the-Loop for Test Time Adaptation in Medical Image Segmentation [10.65123164779962]
Deep learning-based medical image segmentation models often face performance degradation when deployed across various medical centers.
We propose a novel Human-in-the-loop TTA framework that capitalizes on the largely overlooked potential of clinician-corrected predictions.
Our framework conceives a divergence loss, designed specifically to diminish the prediction divergence instigated by domain disparities.
arXiv Detail & Related papers (2024-05-14T02:02:15Z) - Rethinking Model Prototyping through the MedMNIST+ Dataset Collection [0.11999555634662634]
This work presents a benchmark for the MedMNIST+ database to diversify the evaluation landscape.
We conduct a thorough analysis of common convolutional neural networks (CNNs) and Transformer-based architectures for medical image classification.
Our findings suggest that computationally efficient training schemes and modern foundation models hold promise in bridging the gap between expensive end-to-end training and more resource-refined approaches.
arXiv Detail & Related papers (2024-04-24T10:19:25Z) - PULASki: Learning inter-rater variability using statistical distances to
improve probabilistic segmentation [36.136619420474766]
We propose PULASki, a method for biomedical image segmentation that accurately captures variability in expert annotations.
Our approach makes use of an improved loss function based on statistical distances in a conditional variational autoencoder structure.
Our method can also be applied to a wide range of multi-label segmentation tasks and is useful for downstream tasks such as hemodynamic modelling.
arXiv Detail & Related papers (2023-12-25T10:31:22Z) - Taxonomy Adaptive Cross-Domain Adaptation in Medical Imaging via
Optimization Trajectory Distillation [73.83178465971552]
The success of automated medical image analysis depends on large-scale and expert-annotated training sets.
Unsupervised domain adaptation (UDA) has been raised as a promising approach to alleviate the burden of labeled data collection.
We propose optimization trajectory distillation, a unified approach to address the two technical challenges from a new perspective.
arXiv Detail & Related papers (2023-07-27T08:58:05Z) - On the Trade-off of Intra-/Inter-class Diversity for Supervised
Pre-training [72.8087629914444]
We study the impact of the trade-off between the intra-class diversity (the number of samples per class) and the inter-class diversity (the number of classes) of a supervised pre-training dataset.
With the size of the pre-training dataset fixed, the best downstream performance comes with a balance on the intra-/inter-class diversity.
arXiv Detail & Related papers (2023-05-20T16:23:50Z) - Domain Adaptation with Adversarial Training on Penultimate Activations [82.9977759320565]
Enhancing model prediction confidence on unlabeled target data is an important objective in Unsupervised Domain Adaptation (UDA).
We show that this strategy is more efficient and better correlated with the objective of boosting prediction confidence than adversarial training on input images or intermediate features.
arXiv Detail & Related papers (2022-08-26T19:50:46Z) - Unsupervised Domain Adaptation Using Feature Disentanglement And GCNs
For Medical Image Classification [5.6512908295414]
We propose an unsupervised domain adaptation approach that uses graph neural networks and disentangled semantic and domain-invariant structural features.
We test the proposed method for classification on two challenging medical image datasets with distribution shifts.
Experiments show our method achieves state-of-the-art results compared to other domain adaptation methods.
arXiv Detail & Related papers (2022-06-27T09:02:16Z) - Cross-Site Severity Assessment of COVID-19 from CT Images via Domain
Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images greatly helps in estimating intensive care unit events.
To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites.
This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z) - MCDAL: Maximum Classifier Discrepancy for Active Learning [74.73133545019877]
Recent state-of-the-art active learning methods have mostly leveraged Generative Adversarial Networks (GAN) for sample acquisition.
In this paper, we propose a novel active learning framework that we call Maximum Classifier Discrepancy for Active Learning (MCDAL).
In particular, we utilize two auxiliary classification layers that learn tighter decision boundaries by maximizing the discrepancies among them.
arXiv Detail & Related papers (2021-07-23T06:57:08Z) - On the Practicality of Deterministic Epistemic Uncertainty [106.06571981780591]
Deterministic uncertainty methods (DUMs) achieve strong performance on detecting out-of-distribution data.
It remains unclear whether DUMs are well calibrated and can seamlessly scale to real-world applications.
arXiv Detail & Related papers (2021-07-01T17:59:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.