FedMedICL: Towards Holistic Evaluation of Distribution Shifts in Federated Medical Imaging
- URL: http://arxiv.org/abs/2407.08822v1
- Date: Thu, 11 Jul 2024 19:12:23 GMT
- Title: FedMedICL: Towards Holistic Evaluation of Distribution Shifts in Federated Medical Imaging
- Authors: Kumail Alhamoud, Yasir Ghunaim, Motasem Alfarra, Thomas Hartvigsen, Philip Torr, Bernard Ghanem, Adel Bibi, Marzyeh Ghassemi
- Abstract summary: FedMedICL is a unified framework and benchmark to holistically evaluate federated medical imaging challenges.
We comprehensively evaluate several popular methods on six diverse medical imaging datasets.
We find that a simple batch balancing technique surpasses advanced methods in average performance across FedMedICL experiments.
- Score: 68.6715007665896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For medical imaging AI models to be clinically impactful, they must generalize. However, this goal is hindered by (i) diverse types of distribution shifts, such as temporal, demographic, and label shifts, and (ii) limited diversity in datasets that are siloed within single medical institutions. While these limitations have spurred interest in federated learning, current evaluation benchmarks fail to evaluate different shifts simultaneously. However, in real healthcare settings, multiple types of shifts co-exist, yet their impact on medical imaging performance remains unstudied. In response, we introduce FedMedICL, a unified framework and benchmark to holistically evaluate federated medical imaging challenges, simultaneously capturing label, demographic, and temporal distribution shifts. We comprehensively evaluate several popular methods on six diverse medical imaging datasets (totaling 550 GPU hours). Furthermore, we use FedMedICL to simulate COVID-19 propagation across hospitals and evaluate whether methods can adapt to pandemic changes in disease prevalence. We find that a simple batch balancing technique surpasses advanced methods in average performance across FedMedICL experiments. This finding questions the applicability of results from previous, narrow benchmarks in real-world medical settings.
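The abstract credits a "simple batch balancing technique" but does not spell out the procedure. As a rough illustration only, the sketch below shows one common form of batch balancing: each element of a batch is drawn by first picking a class uniformly at random and then picking a sample of that class, with replacement. The function name `balanced_batch`, the toy label list, and the sampling-with-replacement choice are assumptions made for this example, not details taken from the paper.

```python
import random
from collections import defaultdict


def balanced_batch(labels, batch_size, rng):
    """Return sample indices drawn so every class is equally likely.

    Hypothetical helper for illustration; the paper's exact balancing
    procedure may differ.
    """
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    classes = sorted(by_class)

    batch = []
    for _ in range(batch_size):
        cls = rng.choice(classes)                # uniform over classes
        batch.append(rng.choice(by_class[cls]))  # uniform within the class
    return batch


# Toy imbalanced dataset: 95 samples of class 0, 5 samples of class 1.
labels = [0] * 95 + [1] * 5
rng = random.Random(0)
batch = balanced_batch(labels, 1000, rng)

# Fraction of the batch that belongs to the minority class.
minority_share = sum(labels[i] for i in batch) / len(batch)
print(f"minority share in balanced batch: {minority_share:.2f}")
```

In the raw data the minority class makes up 5% of samples, whereas a batch drawn this way contains it roughly half the time, which is the effect batch balancing is meant to achieve under label shift.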
Related papers
- FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models [37.803490266325]
We introduce FairMedFM, a fairness benchmark for foundation models (FMs) research in medical imaging.
FairMedFM integrates with 17 popular medical imaging datasets, encompassing different modalities, dimensionalities, and sensitive attributes.
It explores 20 widely used FMs, with various usages such as zero-shot learning, linear probing, parameter-efficient fine-tuning, and prompting in various downstream tasks -- classification and segmentation.
arXiv Detail & Related papers (2024-07-01T05:47:58Z)
- Medformer: A Multi-Granularity Patching Transformer for Medical Time-Series Classification [6.0233642055651115]
We introduce Medformer, a multi-granularity patching transformer tailored specifically for MedTS classification.
Our method incorporates three novel mechanisms to leverage the unique characteristics of MedTS.
We conduct extensive experiments on five public datasets under both subject-dependent and challenging subject-independent setups.
arXiv Detail & Related papers (2024-05-24T16:51:10Z)
- Semi-Supervised Disease Classification based on Limited Medical Image Data [9.633774896301436]
This paper introduces a novel generative model inspired by Hölder divergence for semi-supervised disease classification.
We conduct experiments on five benchmark datasets commonly used in PU medical learning.
Our approach achieves state-of-the-art performance on all five disease classification benchmarks.
arXiv Detail & Related papers (2024-05-07T13:11:08Z)
- Plug-and-Play Feature Generation for Few-Shot Medical Image Classification [23.969183389866686]
Few-shot learning presents immense potential in enhancing model generalization and practicality for medical image classification with limited training data.
We propose MedMFG, a flexible and lightweight plug-and-play method designed to generate sufficient class-distinctive features from limited samples.
arXiv Detail & Related papers (2023-10-14T02:36:14Z)
- Med-Flamingo: a Multimodal Medical Few-shot Learner [58.85676013818811]
We propose Med-Flamingo, a multimodal few-shot learner adapted to the medical domain.
Based on OpenFlamingo-9B, we continue pre-training on paired and interleaved medical image-text data from publications and textbooks.
We conduct the first human evaluation for generative medical VQA where physicians review the problems and blinded generations in an interactive app.
arXiv Detail & Related papers (2023-07-27T20:36:02Z)
- Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation.
We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks.
We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z)
- Understanding the Tricks of Deep Learning in Medical Image Segmentation: Challenges and Future Directions [66.40971096248946]
In this paper, we collect a series of MedISeg tricks for different model implementation phases.
We experimentally explore the effectiveness of these tricks on consistent baselines.
We also open-source a strong MedISeg repository in which each component is plug-and-play.
arXiv Detail & Related papers (2022-09-21T12:30:05Z)
- FedMed-GAN: Federated Domain Translation on Unsupervised Cross-Modality Brain Image Synthesis [55.939957482776194]
We propose a new benchmark for federated domain translation on unsupervised brain image synthesis (termed FedMed-GAN).
FedMed-GAN mitigates the mode collapse without sacrificing the performance of generators.
A comprehensive evaluation is provided for comparing FedMed-GAN and other centralized methods.
arXiv Detail & Related papers (2022-01-22T02:50:29Z)
- Using Soft Labels to Model Uncertainty in Medical Image Segmentation [0.0]
We propose a simple method to obtain soft labels from the annotations of multiple physicians.
For each image, our method produces a single well-calibrated output that can be thresholded at multiple confidence levels.
We evaluated our method on the MICCAI 2021 QUBIQ challenge, showing that it performs well across multiple medical image segmentation tasks.
arXiv Detail & Related papers (2021-09-26T14:47:18Z)
- Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed to specific information systems that make the same information available under different modalities.
This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z)