Related papers: Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging

Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging

URL: http://arxiv.org/abs/2205.08576v1
Date: Tue, 17 May 2022 18:33:43 GMT
Title: Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging
Authors: Rui Yan, Liangqiong Qu, Qingyue Wei, Shih-Cheng Huang, Liyue Shen, Daniel Rubin, Lei Xing, Yuyin Zhou
Abstract summary: We present a robust and label-efficient self-supervised FL framework for medical image analysis. Specifically, we introduce a novel distributed self-supervised pre-training paradigm into the existing FL pipeline. We show that our self-supervised FL algorithm generalizes well to out-of-distribution data and learns federated models more effectively in limited label scenarios.
Score: 23.08596805950814
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The curation of large-scale medical datasets from multiple institutions necessary for training deep learning models is challenged by the difficulty in sharing patient data with privacy-preserving. Federated learning (FL), a paradigm that enables privacy-protected collaborative learning among different institutions, is a promising solution to this challenge. However, FL generally suffers from performance deterioration due to heterogeneous data distributions across institutions and the lack of quality labeled data. In this paper, we present a robust and label-efficient self-supervised FL framework for medical image analysis. Specifically, we introduce a novel distributed self-supervised pre-training paradigm into the existing FL pipeline (i.e., pre-training the models directly on the decentralized target task datasets). Built upon the recent success of Vision Transformers, we employ masked image encoding tasks for self-supervised pre-training, to facilitate more effective knowledge transfer to downstream federated models. Extensive empirical results on simulated and real-world medical imaging federated datasets show that self-supervised pre-training largely benefits the robustness of federated models against various degrees of data heterogeneity. Notably, under severe data heterogeneity, our method, without relying on any additional pre-training data, achieves an improvement of 5.06%, 1.53% and 4.58% in test accuracy on retinal, dermatology and chest X-ray classification compared with the supervised baseline with ImageNet pre-training. Moreover, we show that our self-supervised FL algorithm generalizes well to out-of-distribution data and learns federated models more effectively in limited label scenarios, surpassing the supervised baseline by 10.36% and the semi-supervised FL method by 8.3% in test accuracy.

Related papers

FedCVD: The First Real-World Federated Learning Benchmark on Cardiovascular Disease Data [52.55123685248105]
Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment. Machine learning (ML) methods can help diagnose CVDs early, but their performance relies on access to substantial data with high quality. This paper presents the first real-world FL benchmark for cardiovascular disease detection, named FedCVD.
arXiv Detail & Related papers (2024-10-28T02:24:01Z)
Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification [2.5091334993691206]
Development of a robust deep-learning model for retinal disease diagnosis requires a substantial dataset for training. The capacity to generalize effectively on smaller datasets remains a persistent challenge. We've combined a wide range of data sources to improve performance and generalization to new data.
arXiv Detail & Related papers (2024-09-17T17:22:35Z)
FedKBP: Federated dose prediction framework for knowledge-based planning in radiation therapy [0.5575343193009424]
Dose prediction plays a key role in knowledge-based planning (KBP) by automatically generating patient-specific dose distribution. Recent advances in deep learning-based dose prediction methods necessitates collaboration among data contributors for improved performance. Federation learning (FL) has emerged as a solution, enabling medical centers to jointly train deep-learning models without compromising patient data privacy.
arXiv Detail & Related papers (2024-08-17T14:57:14Z)
Distributionally Robust Alignment for Medical Federated Vision-Language Pre-training Under Data Heterogeneity [4.84693589377679]
We propose Federated Distributionally Robust Alignment (FedDRA) for medical vision-language pre-training. FedDRA achieves robust vision-language alignment under heterogeneous conditions. Our method also adapts well to various medical pre-training methods.
arXiv Detail & Related papers (2024-04-05T01:17:25Z)
SelfFed: Self-Supervised Federated Learning for Data Heterogeneity and Label Scarcity in Medical Images [17.07904450821442]
Self-supervised based federated learning strategies suffer from performance degradation due to label scarcity and diverse data distributions. We propose the SelfFed framework for medical images to overcome data heterogeneity and label scarcity issues. Our method achieves a maximum improvement of 8.8% and 4.1% on Retina and COVID-FL datasets on non-IID datasets.
arXiv Detail & Related papers (2023-07-04T06:50:16Z)
FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning(FL) has recently attracted increasing attention from academia and industry. We propose FedDM to build the global training objective from multiple local surrogate functions. In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
Federated Cycling (FedCy): Semi-supervised Federated Learning of Surgical Phases [57.90226879210227]
FedCy is a semi-supervised learning (FSSL) method that combines FL and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos. We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
arXiv Detail & Related papers (2022-03-14T17:44:53Z)
Improving Performance of Federated Learning based Medical Image Analysis in Non-IID Settings using Image Augmentation [1.5469452301122177]
Federated Learning (FL) is a suitable solution for making use of sensitive data belonging to patients, people, companies, or industries that are obligatory to work under rigid privacy constraints. FL mainly or partially supports data privacy and security issues and provides an alternative to model problems facilitating multiple edge devices or organizations to contribute a training of a global model using a number of local data without having them. This paper introduces a novel method dynamically balancing the data distributions of clients by augmenting images to address the non-IID data problem of FL.
arXiv Detail & Related papers (2021-12-12T10:05:42Z)
FedSLD: Federated Learning with Shared Label Distribution for Medical Image Classification [6.0088002781256185]
We propose Federated Learning with Shared Label Distribution (FedSLD) for classification tasks. FedSLD adjusts the contribution of each data sample to the local objective during optimization given knowledge of the distribution. Our results show that FedSLD achieves better convergence performance than the compared leading FL optimization algorithms.
arXiv Detail & Related papers (2021-10-15T21:38:25Z)
Differentially private federated deep learning for multi-site medical image segmentation [56.30543374146002]
Collaborative machine learning techniques such as federated learning (FL) enable the training of models on effectively larger datasets without data transfer. Recent initiatives have demonstrated that segmentation models trained with FL can achieve performance similar to locally trained models. However, FL is not a fully privacy-preserving technique and privacy-centred attacks can disclose confidential patient data.
arXiv Detail & Related papers (2021-07-06T12:57:32Z)
FLOP: Federated Learning on Medical Datasets using Partial Networks [84.54663831520853]
COVID-19 Disease due to the novel coronavirus has caused a shortage of medical resources. Different data-driven deep learning models have been developed to mitigate the diagnosis of COVID-19. The data itself is still scarce due to patient privacy concerns. We propose a simple yet effective algorithm, named textbfFederated textbfL textbfon Medical datasets using textbfPartial Networks (FLOP)
arXiv Detail & Related papers (2021-02-10T01:56:58Z)
Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training. We experimentally verify that the new dataset can significantly improve the ability of the learned FER model. To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios. Our results show that using 85% lesser labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.