Label-Efficient Self-Supervised Federated Learning for Tackling Data
Heterogeneity in Medical Imaging
- URL: http://arxiv.org/abs/2205.08576v1
- Date: Tue, 17 May 2022 18:33:43 GMT
- Title: Label-Efficient Self-Supervised Federated Learning for Tackling Data
Heterogeneity in Medical Imaging
- Authors: Rui Yan, Liangqiong Qu, Qingyue Wei, Shih-Cheng Huang, Liyue Shen,
Daniel Rubin, Lei Xing, Yuyin Zhou
- Abstract summary: We present a robust and label-efficient self-supervised FL framework for medical image analysis.
Specifically, we introduce a novel distributed self-supervised pre-training paradigm into the existing FL pipeline.
We show that our self-supervised FL algorithm generalizes well to out-of-distribution data and learns federated models more effectively in limited label scenarios.
- Score: 23.08596805950814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The curation of large-scale medical datasets from multiple institutions,
necessary for training deep learning models, is hampered by the difficulty of
sharing patient data while preserving privacy. Federated learning (FL), a
paradigm that enables privacy-protected collaborative learning among different
institutions, is a promising solution to this challenge. However, FL generally
suffers from performance deterioration due to heterogeneous data distributions
across institutions and the lack of quality labeled data. In this paper, we
present a robust and label-efficient self-supervised FL framework for medical
image analysis. Specifically, we introduce a novel distributed self-supervised
pre-training paradigm into the existing FL pipeline (i.e., pre-training the
models directly on the decentralized target task datasets). Built upon the
recent success of Vision Transformers, we employ masked image encoding tasks
for self-supervised pre-training, to facilitate more effective knowledge
transfer to downstream federated models. Extensive empirical results on
simulated and real-world medical imaging federated datasets show that
self-supervised pre-training largely benefits the robustness of federated
models against various degrees of data heterogeneity. Notably, under severe
data heterogeneity, our method, without relying on any additional pre-training
data, achieves an improvement of 5.06%, 1.53% and 4.58% in test accuracy on
retinal, dermatology and chest X-ray classification compared with the
supervised baseline with ImageNet pre-training. Moreover, we show that our
self-supervised FL algorithm generalizes well to out-of-distribution data and
learns federated models more effectively in limited label scenarios, surpassing
the supervised baseline by 10.36% and the semi-supervised FL method by 8.3% in
test accuracy.
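As a rough illustration of the pre-training paradigm described above, the sketch below pairs a toy masked-image-modeling objective with FedAvg weight averaging across clients. The model, client data, and hyperparameters are placeholder assumptions, not the authors' ViT/MAE implementation.

```python
# Toy sketch: masked-input reconstruction as the self-supervised task,
# with FedAvg weight averaging across sites. Linear layers and random
# tensors stand in for the paper's ViT encoder and medical images.
import copy
import torch
import torch.nn as nn

class TinyMaskedAutoencoder(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.decoder = nn.Linear(dim, dim)

    def forward(self, x, mask_ratio=0.75):
        mask = (torch.rand_like(x) > mask_ratio).float()  # keep ~25% of inputs
        recon = self.decoder(self.encoder(x * mask))
        return ((recon - x) ** 2 * (1 - mask)).mean()  # loss on masked positions

def local_pretrain(global_model, batches, lr=1e-3):
    model = copy.deepcopy(global_model)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for x in batches:
        opt.zero_grad()
        model(x).backward()
        opt.step()
    return model.state_dict()

def fedavg(states):  # unweighted average; assumes equal client sizes
    return {k: torch.stack([s[k] for s in states]).mean(dim=0) for k in states[0]}

global_model = TinyMaskedAutoencoder()
clients = [[torch.randn(8, 64) for _ in range(4)] for _ in range(3)]  # 3 sites
for rnd in range(5):  # communication rounds
    states = [local_pretrain(global_model, data) for data in clients]
    global_model.load_state_dict(fedavg(states))
```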
Related papers
- Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification [2.5091334993691206]
Development of a robust deep-learning model for retinal disease diagnosis requires a substantial dataset for training.
The capacity to generalize effectively on smaller datasets remains a persistent challenge.
We combine a wide range of data sources to improve performance and generalization to new data.
arXiv Detail & Related papers (2024-09-17T17:22:35Z) - FedKBP: Federated dose prediction framework for knowledge-based planning in radiation therapy [0.5575343193009424]
Dose prediction plays a key role in knowledge-based planning (KBP) by automatically generating patient-specific dose distributions.
Recent advances in deep learning-based dose prediction methods necessitate collaboration among data contributors for improved performance.
Federated learning (FL) has emerged as a solution, enabling medical centers to jointly train deep-learning models without compromising patient data privacy.
arXiv Detail & Related papers (2024-08-17T14:57:14Z) - Collaborative Training of Medical Artificial Intelligence Models with
non-uniform Labels [0.07176066267895696]
Building powerful and robust deep learning models requires training with large multi-party datasets.
We propose flexible federated learning (FFL) for collaborative training on such data.
We demonstrate that, with heterogeneously labeled datasets, FFL-based training leads to a significant performance increase.
arXiv Detail & Related papers (2022-11-24T13:48:54Z) - FedDM: Iterative Distribution Matching for Communication-Efficient
Federated Learning [87.08902493524556]
Federated learning (FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape of the original data.
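A minimal sketch of this synthetic-set idea, under the simplifying assumption that matching gradients on a small learnable set approximates matching the local loss landscape; all names and settings are illustrative, not the authors' implementation.

```python
# Hedged sketch: optimize a small synthetic set so the model's gradient
# on it matches the gradient on the client's real data.
import torch
import torch.nn as nn
import torch.nn.functional as F

def grad_vector(model, x, y, create_graph=False):
    loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss, model.parameters(), create_graph=create_graph)
    return torch.cat([g.flatten() for g in grads])

model = nn.Linear(16, 4)                         # stand-in client model
x_real = torch.randn(64, 16)                     # the client's real data
y_real = torch.randint(0, 4, (64,))

x_syn = torch.randn(8, 16, requires_grad=True)   # small learnable synthetic set
y_syn = torch.arange(8) % 4                      # fixed class-balanced labels
opt = torch.optim.Adam([x_syn], lr=0.1)

g_real = grad_vector(model, x_real, y_real).detach()
for step in range(100):
    opt.zero_grad()
    g_syn = grad_vector(model, x_syn, y_syn, create_graph=True)
    ((g_syn - g_real) ** 2).sum().backward()     # match synthetic to real gradients
    opt.step()
# The client would then share only the compact synthetic set with the server.
```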
arXiv Detail & Related papers (2022-07-20T04:55:18Z) - Federated Cycling (FedCy): Semi-supervised Federated Learning of
Surgical Phases [57.90226879210227]
FedCy is a federated semi-supervised learning (FSSL) method that combines FL and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos.
We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
arXiv Detail & Related papers (2022-03-14T17:44:53Z) - Improving Performance of Federated Learning based Medical Image Analysis
in Non-IID Settings using Image Augmentation [1.5469452301122177]
Federated Learning (FL) is a suitable solution for making use of sensitive data belonging to patients, people, companies, or industries that must operate under rigid privacy constraints.
FL addresses data privacy and security concerns by enabling multiple edge devices or organizations to contribute to training a global model on their local data without sharing it.
This paper introduces a novel method that dynamically balances clients' data distributions through image augmentation to address the non-IID data problem of FL.
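A minimal sketch of augmentation-based rebalancing on a single client, assuming a simple flip augmentation and a uniform target histogram; the paper's dynamic scheme is more involved.

```python
# Illustrative only: oversample under-represented classes with a basic
# augmentation until the local label histogram is roughly uniform.
import torch

def rebalance(images, labels, num_classes):
    counts = torch.bincount(labels, minlength=num_classes)
    target = counts.max().item()
    out_x, out_y = [images], [labels]
    for c in range(num_classes):
        idx = (labels == c).nonzero(as_tuple=True)[0]
        if len(idx) == 0:
            continue
        for _ in range(target - counts[c].item()):
            i = idx[torch.randint(len(idx), (1,))].item()
            aug = torch.flip(images[i], dims=[-1])  # horizontal flip as augmentation
            out_x.append(aug.unsqueeze(0))
            out_y.append(labels[i:i + 1])
    return torch.cat(out_x), torch.cat(out_y)

x = torch.randn(10, 1, 28, 28)
y = torch.tensor([0] * 7 + [1] * 3)            # skewed client data
x_bal, y_bal = rebalance(x, y, num_classes=2)  # now balanced 7 vs 7
```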
arXiv Detail & Related papers (2021-12-12T10:05:42Z) - FedSLD: Federated Learning with Shared Label Distribution for Medical
Image Classification [6.0088002781256185]
We propose Federated Learning with Shared Label Distribution (FedSLD) for classification tasks.
FedSLD adjusts the contribution of each data sample to the local objective during optimization, given knowledge of each client's label distribution.
Our results show that FedSLD achieves better convergence performance than the compared leading FL optimization algorithms.
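One plausible form of this per-sample reweighting, sketched below under the assumption that locally over-represented classes should contribute less so that local objectives align across clients; the paper's exact weighting may differ.

```python
# Hedged sketch: weight each sample's loss by p_global(c) / p_local(c),
# using the shared label distribution. Illustrative, not FedSLD verbatim.
import torch
import torch.nn.functional as F

def weighted_local_loss(logits, labels, local_dist, global_dist):
    weights = (global_dist / local_dist.clamp_min(1e-8))[labels]
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    return (weights * per_sample).mean()

logits = torch.randn(6, 3)
labels = torch.tensor([0, 0, 0, 1, 2, 2])
local = torch.tensor([0.6, 0.2, 0.2])        # this client's label distribution
global_ = torch.tensor([0.34, 0.33, 0.33])   # shared global distribution
loss = weighted_local_loss(logits, labels, local, global_)
```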
arXiv Detail & Related papers (2021-10-15T21:38:25Z) - Differentially private federated deep learning for multi-site medical
image segmentation [56.30543374146002]
Collaborative machine learning techniques such as federated learning (FL) enable the training of models on effectively larger datasets without data transfer.
Recent initiatives have demonstrated that segmentation models trained with FL can achieve performance similar to locally trained models.
However, FL is not a fully privacy-preserving technique and privacy-centred attacks can disclose confidential patient data.
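Such privacy-centred concerns are commonly mitigated with differential privacy. Below is a minimal DP-SGD-style sketch (clip each per-sample gradient, then add Gaussian noise before the update); it is illustrative only, and real deployments use libraries such as Opacus with formal privacy accounting.

```python
# Minimal DP-SGD sketch: per-sample gradient clipping plus Gaussian noise.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
C, sigma = 1.0, 1.0  # clipping norm and noise multiplier (assumed values)

x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
summed = [torch.zeros_like(p) for p in model.parameters()]
for i in range(len(x)):  # per-sample gradients
    model.zero_grad()
    nn.functional.cross_entropy(model(x[i:i + 1]), y[i:i + 1]).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = min(1.0, C / (norm.item() + 1e-8))   # clip to norm C
    for s, g in zip(summed, grads):
        s += g * scale
opt.zero_grad()
for p, s in zip(model.parameters(), summed):
    noise = torch.randn_like(s) * sigma * C      # calibrated Gaussian noise
    p.grad = (s + noise) / len(x)
opt.step()
```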
arXiv Detail & Related papers (2021-07-06T12:57:32Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To keep the enlarged training set manageable, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z) - Semi-supervised Medical Image Classification with Relation-driven
Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits unlabeled data by encouraging prediction consistency for a given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
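The base consistency mechanism can be sketched as below: a supervised loss on labeled data plus an MSE term that pushes predictions on two perturbed views of the same unlabeled input to agree. The paper's relation-driven term is richer; the perturbation here is a toy stand-in.

```python
# Minimal consistency-regularization sketch for semi-supervised training.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 5))
perturb = lambda x: x + 0.1 * torch.randn_like(x)  # stand-in augmentation

x_lab = torch.randn(4, 1, 28, 28)
y_lab = torch.randint(0, 5, (4,))
x_unl = torch.randn(16, 1, 28, 28)                 # unlabeled images

sup = F.cross_entropy(model(x_lab), y_lab)
p1 = F.softmax(model(perturb(x_unl)), dim=1)
p2 = F.softmax(model(perturb(x_unl)), dim=1)
cons = F.mse_loss(p1, p2)      # predictions under perturbation should agree
loss = sup + 1.0 * cons        # consistency weight is a tunable hyperparameter
loss.backward()
```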
arXiv Detail & Related papers (2020-05-15T06:57:54Z) - Self-Training with Improved Regularization for Sample-Efficient Chest
X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
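A minimal self-training loop of the kind such frameworks build on: a model fit on the small labeled set pseudo-labels unlabeled images, keeps only confident predictions, and retrains on the union. The paper's improved regularization is omitted; all names and thresholds below are illustrative.

```python
# Sketch of confidence-thresholded self-training (pseudo-labeling).
import torch
import torch.nn as nn
import torch.nn.functional as F

def train(model, x, y, steps=100, lr=1e-2):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 2))
x_lab, y_lab = torch.randn(20, 1, 32, 32), torch.randint(0, 2, (20,))
x_unl = torch.randn(200, 1, 32, 32)

train(model, x_lab, y_lab)                       # teacher fit on labeled data
with torch.no_grad():
    probs = F.softmax(model(x_unl), dim=1)
conf, pseudo = probs.max(dim=1)
keep = conf > 0.9                                # confidence threshold
x_all = torch.cat([x_lab, x_unl[keep]])
y_all = torch.cat([y_lab, pseudo[keep]])
train(model, x_all, y_all)                       # retrain on the union
```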
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.