Privacy-preserving patient clustering for personalized federated
learning
- URL: http://arxiv.org/abs/2307.08847v1
- Date: Mon, 17 Jul 2023 21:19:08 GMT
- Title: Privacy-preserving patient clustering for personalized federated
learning
- Authors: Ahmed Elhussein and Gamze Gursoy
- Abstract summary: Federated Learning (FL) is a machine learning framework that enables multiple organizations to train a model without sharing their data with a central server.
However, FL performance degrades when data is non-IID, which is a problem in medical settings, where variations in the patient population contribute significantly to distribution differences across hospitals.
We propose Privacy-preserving Community-Based Federated machine Learning (PCBFL), a novel Clustered FL framework that can cluster patients using patient-level data while protecting privacy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Federated Learning (FL) is a machine learning framework that enables multiple
organizations to train a model without sharing their data with a central
server. However, its performance degrades significantly if the data are not
independent and identically distributed (non-IID). This is a problem in
medical settings, where variations in the patient population contribute
significantly to distribution differences across hospitals. Personalized FL
addresses this issue by accounting for site-specific distribution differences.
Clustered FL, a Personalized FL variant, was used to address this problem by
clustering patients into groups across hospitals and training separate models
on each group. However, privacy remained a challenge because the
clustering process requires the exchange of patient-level information. This was
previously solved by forming clusters using aggregated data, which led to
inaccurate groups and performance degradation. In this study, we propose
Privacy-preserving Community-Based Federated machine Learning (PCBFL), a novel
Clustered FL framework that can cluster patients using patient-level data while
protecting privacy. PCBFL uses Secure Multiparty Computation, a cryptographic
technique, to securely calculate patient-level similarity scores across
hospitals. We then evaluate PCBFL by training a federated mortality prediction
model using 20 sites from the eICU dataset. We compare the performance gain
from PCBFL against traditional and existing Clustered FL frameworks. Our
results show that PCBFL successfully forms clinically meaningful cohorts of
low, medium, and high-risk patients. PCBFL outperforms traditional and existing
Clustered FL frameworks with an average AUC improvement of 4.3% and AUPRC
improvement of 7.8%.
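The abstract says PCBFL uses Secure Multiparty Computation to calculate patient-level similarity scores across hospitals, but it does not specify the protocol. As a rough illustration only, the sketch below computes a dot product between one patient vector at hospital A and one at hospital B using two-party additive secret sharing with Beaver triples. Everything here is an assumption rather than the authors' method: the modulus `P`, the integer-encoded features, the trusted dealer that hands out triples, and the helper names `share`, `reconstruct`, `beaver_triple`, `secure_mul`, and `secure_dot` are all hypothetical. Both parties are simulated in a single process, so this shows the arithmetic, not a deployable protocol.

```python
import secrets

P = 2**61 - 1  # prime modulus for the share arithmetic (hypothetical choice)

def share(x):
    """Split integer x into two additive shares modulo P."""
    r = secrets.randbelow(P)
    return r, (x - r) % P

def reconstruct(s0, s1):
    """Recombine two additive shares into the value modulo P."""
    return (s0 + s1) % P

def beaver_triple():
    """Simulated trusted dealer: shares of random a, b and of c = a * b."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)

def secure_mul(x_sh, y_sh):
    """Multiply two shared values; only the masked values d and e are opened."""
    (a0, a1), (b0, b1), (c0, c1) = beaver_triple()
    d = reconstruct((x_sh[0] - a0) % P, (x_sh[1] - a1) % P)  # d = x - a
    e = reconstruct((y_sh[0] - b0) % P, (y_sh[1] - b1) % P)  # e = y - b
    # x * y = c + d*b + e*a + d*e; party 0 adds the public d*e term.
    z0 = (c0 + d * b0 + e * a0 + d * e) % P
    z1 = (c1 + d * b1 + e * a1) % P
    return z0, z1

def secure_dot(u, v):
    """Dot product of integer vectors u (hospital A) and v (hospital B)."""
    acc0 = acc1 = 0
    for ui, vi in zip(u, v):
        z0, z1 = secure_mul(share(ui), share(vi))
        acc0, acc1 = (acc0 + z0) % P, (acc1 + z1) % P
    result = reconstruct(acc0, acc1)
    # Map the field element back to a signed integer.
    return result if result <= P // 2 else result - P

# Hypothetical integer-encoded patient feature vectors.
patient_a = [1, 2, 3]
patient_b = [4, 5, 6]
score = secure_dot(patient_a, patient_b)
```

In a real deployment each share would stay with a different party, and real-valued features would need fixed-point encoding before sharing. A similarity such as cosine similarity could be built on top of this dot product, though whether PCBFL does so is not stated in the abstract.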
Related papers
- FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering [26.478852701376294]
Federated learning (FL) is an emerging distributed machine learning paradigm.
One of the major challenges in FL is the presence of uneven data distributions across client devices.
We propose FedClust, a novel approach for CFL that leverages the correlation between local model weights and the data distribution of clients.
arXiv Detail & Related papers (2024-07-09T02:47:16Z)
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
- FedClust: Optimizing Federated Learning on Non-IID Data through Weight-Driven Client Clustering [28.057411252785176]
Federated learning (FL) is an emerging distributed machine learning paradigm enabling collaborative model training on decentralized devices without exposing their local data.
This paper proposes FedClust, a novel CFL approach leveraging correlations between local model weights and client data distributions.
arXiv Detail & Related papers (2024-03-07T01:50:36Z)
- Contrastive encoder pre-training-based clustered federated learning for heterogeneous data [17.580390632874046]
Federated learning (FL) enables distributed clients to collaboratively train a global model while preserving their data privacy.
We propose contrastive pre-training-based clustered federated learning (CP-CFL) to improve the model convergence and overall performance of FL systems.
arXiv Detail & Related papers (2023-11-28T05:44:26Z)
- PFL-GAN: When Client Heterogeneity Meets Generative Models in Personalized Federated Learning [55.930403371398114]
We propose a novel generative adversarial network (GAN) sharing and aggregation strategy for personalized learning (PFL).
PFL-GAN addresses client heterogeneity in different scenarios. More specifically, we first learn the similarity among clients and then develop a weighted collaborative data aggregation.
Empirical results from rigorous experimentation on several well-known datasets demonstrate the effectiveness of PFL-GAN.
arXiv Detail & Related papers (2023-08-23T22:38:35Z)
- Stochastic Clustered Federated Learning [21.811496586350653]
This paper proposes StoCFL, a novel clustered federated learning approach for generic Non-IID issues.
In detail, StoCFL implements a flexible CFL framework that supports an arbitrary proportion of client participation and newly joined clients.
The results show that StoCFL could obtain promising cluster results even when the number of clusters is unknown.
arXiv Detail & Related papers (2023-03-02T01:39:16Z)
- Federated Learning with Privacy-Preserving Ensemble Attention Distillation [63.39442596910485]
Federated Learning (FL) is a machine learning paradigm where many local nodes collaboratively train a central model while keeping the training data decentralized.
We propose a privacy-preserving FL framework leveraging unlabeled public data for one-way offline knowledge distillation.
Our technique uses decentralized and heterogeneous local data like existing FL approaches, but more importantly, it significantly reduces the risk of privacy leakage.
arXiv Detail & Related papers (2022-10-16T06:44:46Z)
- Federated Learning in Multi-Center Critical Care Research: A Systematic Case Study using the eICU Database [24.31499341763427]
Federated learning (FL) has been proposed as a method to train a model on different units without exchanging data.
We investigate the effectiveness of FL on the publicly available eICU dataset for predicting the survival of each ICU stay.
arXiv Detail & Related papers (2022-04-20T09:03:09Z)
- Heterogeneous Federated Learning via Grouped Sequential-to-Parallel Training [60.892342868936865]
Federated learning (FL) is a rapidly growing privacy-preserving collaborative machine learning paradigm.
We propose a data heterogeneous-robust FL approach, FedGSP, to address this challenge.
We show that FedGSP improves the accuracy by 3.7% on average compared with seven state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-31T03:15:28Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making aimed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.