Label-Efficient Self-Supervised Federated Learning for Tackling Data
Heterogeneity in Medical Imaging
- URL: http://arxiv.org/abs/2205.08576v1
- Date: Tue, 17 May 2022 18:33:43 GMT
- Title: Label-Efficient Self-Supervised Federated Learning for Tackling Data
Heterogeneity in Medical Imaging
- Authors: Rui Yan, Liangqiong Qu, Qingyue Wei, Shih-Cheng Huang, Liyue Shen,
Daniel Rubin, Lei Xing, Yuyin Zhou
- Abstract summary: We present a robust and label-efficient self-supervised FL framework for medical image analysis.
Specifically, we introduce a novel distributed self-supervised pre-training paradigm into the existing FL pipeline.
We show that our self-supervised FL algorithm generalizes well to out-of-distribution data and learns federated models more effectively in limited label scenarios.
- Score: 23.08596805950814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The curation of large-scale medical datasets from multiple institutions
necessary for training deep learning models is challenged by the difficulty of
sharing patient data while preserving privacy. Federated learning (FL), a
paradigm that enables privacy-protected collaborative learning among different
institutions, is a promising solution to this challenge. However, FL generally
suffers from performance deterioration due to heterogeneous data distributions
across institutions and the lack of quality labeled data. In this paper, we
present a robust and label-efficient self-supervised FL framework for medical
image analysis. Specifically, we introduce a novel distributed self-supervised
pre-training paradigm into the existing FL pipeline (i.e., pre-training the
models directly on the decentralized target task datasets). Built upon the
recent success of Vision Transformers, we employ masked image encoding tasks
for self-supervised pre-training, to facilitate more effective knowledge
transfer to downstream federated models. Extensive empirical results on
simulated and real-world medical imaging federated datasets show that
self-supervised pre-training largely benefits the robustness of federated
models against various degrees of data heterogeneity. Notably, under severe
data heterogeneity, our method, without relying on any additional pre-training
data, achieves an improvement of 5.06%, 1.53% and 4.58% in test accuracy on
retinal, dermatology and chest X-ray classification compared with the
supervised baseline with ImageNet pre-training. Moreover, we show that our
self-supervised FL algorithm generalizes well to out-of-distribution data and
learns federated models more effectively in limited label scenarios, surpassing
the supervised baseline by 10.36% and the semi-supervised FL method by 8.3% in
test accuracy.
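As a rough illustration of the two ingredients the abstract names, the sketch below pairs random patch masking (the input corruption used by masked image encoding) with a size-weighted average of client parameters. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which aggregation rule or mask ratio is used, so the FedAvg-style averaging, the 0.75 mask ratio, the 196-patch grid, and the function names `random_patch_mask` and `federated_average` are all hypothetical choices made for illustration.

```python
import numpy as np


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return a boolean mask; True marks patches hidden from the encoder."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    # Sample distinct patch indices to hide.
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask


def federated_average(client_weights, client_sizes):
    """Average per-client parameter dicts, weighted by local dataset size.

    Assumption: a FedAvg-style rule; the abstract does not specify one.
    """
    total = float(sum(client_sizes))
    return {
        name: sum(w[name] * (n / total) for w, n in zip(client_weights, client_sizes))
        for name in client_weights[0]
    }


rng = np.random.default_rng(0)

# Hypothetical setting: 14 x 14 = 196 patches per image, 75% of them masked.
mask = random_patch_mask(num_patches=196, mask_ratio=0.75, rng=rng)

# Two toy clients holding 100 and 300 local images respectively.
clients = [{"w": np.array([1.0, 2.0])}, {"w": np.array([3.0, 4.0])}]
global_weights = federated_average(clients, client_sizes=[100, 300])
```

In a full pipeline each client would first train locally to reconstruct its masked patches, and only the resulting parameters would be sent for averaging, so that the images themselves never leave the institution.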
Related papers
- FedCVD: The First Real-World Federated Learning Benchmark on Cardiovascular Disease Data [52.55123685248105]
Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment.
Machine learning (ML) methods can help diagnose CVDs early, but their performance relies on access to substantial data with high quality.
This paper presents the first real-world FL benchmark for cardiovascular disease detection, named FedCVD.
arXiv Detail & Related papers (2024-10-28T02:24:01Z)
- Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification [2.5091334993691206]
Development of a robust deep-learning model for retinal disease diagnosis requires a substantial dataset for training.
The capacity to generalize effectively on smaller datasets remains a persistent challenge.
We've combined a wide range of data sources to improve performance and generalization to new data.
arXiv Detail & Related papers (2024-09-17T17:22:35Z)
- FedKBP: Federated dose prediction framework for knowledge-based planning in radiation therapy [0.5575343193009424]
Dose prediction plays a key role in knowledge-based planning (KBP) by automatically generating patient-specific dose distribution.
Recent advances in deep learning-based dose prediction methods necessitate collaboration among data contributors for improved performance.
Federated learning (FL) has emerged as a solution, enabling medical centers to jointly train deep-learning models without compromising patient data privacy.
arXiv Detail & Related papers (2024-08-17T14:57:14Z)
- Distributionally Robust Alignment for Medical Federated Vision-Language Pre-training Under Data Heterogeneity [4.84693589377679]
We propose Federated Distributionally Robust Alignment (FedDRA) for medical vision-language pre-training.
FedDRA achieves robust vision-language alignment under heterogeneous conditions.
Our method also adapts well to various medical pre-training methods.
arXiv Detail & Related papers (2024-04-05T01:17:25Z)
- FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning (FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
- Federated Cycling (FedCy): Semi-supervised Federated Learning of Surgical Phases [57.90226879210227]
FedCy is a federated semi-supervised learning (FSSL) method that combines FL and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos.
We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
arXiv Detail & Related papers (2022-03-14T17:44:53Z)
- Improving Performance of Federated Learning based Medical Image Analysis in Non-IID Settings using Image Augmentation [1.5469452301122177]
Federated Learning (FL) is a suitable solution for making use of sensitive data belonging to patients, people, companies, or industries that are obliged to work under rigid privacy constraints.
FL mainly or partially addresses data privacy and security concerns and offers an alternative way to model problems, allowing multiple edge devices or organizations to contribute to training a global model on their local data without sharing it.
This paper introduces a novel method dynamically balancing the data distributions of clients by augmenting images to address the non-IID data problem of FL.
arXiv Detail & Related papers (2021-12-12T10:05:42Z)
- FedSLD: Federated Learning with Shared Label Distribution for Medical Image Classification [6.0088002781256185]
We propose Federated Learning with Shared Label Distribution (FedSLD) for classification tasks.
FedSLD adjusts the contribution of each data sample to the local objective during optimization given knowledge of the distribution.
Our results show that FedSLD achieves better convergence performance than the compared leading FL optimization algorithms.
arXiv Detail & Related papers (2021-10-15T21:38:25Z)
- Differentially private federated deep learning for multi-site medical image segmentation [56.30543374146002]
Collaborative machine learning techniques such as federated learning (FL) enable the training of models on effectively larger datasets without data transfer.
Recent initiatives have demonstrated that segmentation models trained with FL can achieve performance similar to locally trained models.
However, FL is not a fully privacy-preserving technique and privacy-centred attacks can disclose confidential patient data.
arXiv Detail & Related papers (2021-07-06T12:57:32Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.