Related papers: Privacy-preserving patient clustering for personalized federated learning

URL: http://arxiv.org/abs/2307.08847v1
Date: Mon, 17 Jul 2023 21:19:08 GMT
Title: Privacy-preserving patient clustering for personalized federated learning
Authors: Ahmed Elhussein and Gamze Gursoy
Abstract summary: Federated Learning (FL) is a machine learning framework that enables multiple organizations to train a model without sharing their data with a central server. Its performance degrades significantly on non-IID data, which is a problem in medical settings, where variations in the patient population contribute significantly to distribution differences across hospitals. We propose Privacy-preserving Community-Based Federated machine Learning (PCBFL), a novel Clustered FL framework that can cluster patients using patient-level data while protecting privacy.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Federated Learning (FL) is a machine learning framework that enables multiple organizations to train a model without sharing their data with a central server. However, it experiences significant performance degradation if the data are not independent and identically distributed (non-IID). This is a problem in medical settings, where variations in the patient population contribute significantly to distribution differences across hospitals. Personalized FL addresses this issue by accounting for site-specific distribution differences. Clustered FL, a Personalized FL variant, has been used to address this problem by clustering patients into groups across hospitals and training separate models on each group. However, privacy concerns remain a challenge because the clustering process requires the exchange of patient-level information. This was previously solved by forming clusters using aggregated data, which led to inaccurate groups and performance degradation. In this study, we propose Privacy-preserving Community-Based Federated machine Learning (PCBFL), a novel Clustered FL framework that can cluster patients using patient-level data while protecting privacy. PCBFL uses Secure Multiparty Computation, a cryptographic technique, to securely calculate patient-level similarity scores across hospitals. We then evaluate PCBFL by training a federated mortality prediction model using 20 sites from the eICU dataset. We compare the performance gain from PCBFL against traditional and existing Clustered FL frameworks. Our results show that PCBFL successfully forms clinically meaningful cohorts of low, medium, and high-risk patients. PCBFL outperforms traditional and existing Clustered FL frameworks with an average AUC improvement of 4.3% and AUPRC improvement of 7.8%.
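Illustrative sketch (not from the paper): the abstract states that PCBFL uses Secure Multiparty Computation to calculate patient-level similarity scores across hospitals, but it does not describe the protocol. The Python below shows one generic way two parties could compute a dot product, a common building block of similarity scores, without revealing their vectors. It assumes two-party additive secret sharing with Beaver triples handed out by a trusted dealer. The modulus `P`, the function names, and the integer-encoded feature vectors are all hypothetical, and both parties are simulated in a single process.

```python
import secrets

# Prime modulus for share arithmetic (an illustrative choice, not from the paper).
P = 2**61 - 1


def share(x):
    """Split integer x into two additive shares modulo P."""
    r = secrets.randbelow(P)
    return r, (x - r) % P


def reconstruct(s0, s1):
    """Recombine two additive shares into the underlying value."""
    return (s0 + s1) % P


def beaver_triple():
    """Stand-in for a trusted dealer: shares of random a, b and c = a*b mod P."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)


def secure_multiply(x_sh, y_sh):
    """Return shares of x*y from shares of x and y, consuming one Beaver triple."""
    a_sh, b_sh, c_sh = beaver_triple()
    # The parties open d = x - a and e = y - b. Because a and b are uniformly
    # random, d and e reveal nothing about x or y.
    d = reconstruct((x_sh[0] - a_sh[0]) % P, (x_sh[1] - a_sh[1]) % P)
    e = reconstruct((y_sh[0] - b_sh[0]) % P, (y_sh[1] - b_sh[1]) % P)
    # x*y = c + d*b + e*a + d*e; only party 0 adds the public d*e term.
    z0 = (c_sh[0] + d * b_sh[0] + e * a_sh[0] + d * e) % P
    z1 = (c_sh[1] + d * b_sh[1] + e * a_sh[1]) % P
    return z0, z1


def secure_dot(xs, ys):
    """Dot product of xs (held by hospital A) and ys (held by hospital B).

    Inputs are assumed to be non-negative integers small enough that the
    true result stays below P.
    """
    z0 = z1 = 0
    for x, y in zip(xs, ys):
        p0, p1 = secure_multiply(share(x % P), share(y % P))
        # Each party adds up its own product shares locally.
        z0, z1 = (z0 + p0) % P, (z1 + p1) % P
    # Only the final sum is opened, never the individual coordinates.
    return reconstruct(z0, z1)


print(secure_dot([1, 2, 3], [4, 5, 6]))  # prints 32
```

A real deployment would run the parties on separate machines, replace the trusted dealer with an offline triple-generation protocol, and use a fixed-point encoding for real-valued patient features.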

Related papers

Interaction-Aware Gaussian Weighting for Clustered Federated Learning [58.92159838586751]
Federated Learning (FL) emerged as a decentralized paradigm to train models while preserving privacy. We propose a novel clustered FL method, FedGWC (Federated Gaussian Weighting Clustering), which groups clients based on their data distribution. Our experiments on benchmark datasets show that FedGWC outperforms existing FL algorithms in cluster quality and classification accuracy.
arXiv Detail & Related papers (2025-02-05T16:33:36Z)
FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering [26.478852701376294]
Federated learning (FL) is an emerging distributed machine learning paradigm. One of the major challenges in FL is the presence of uneven data distributions across client devices. We propose FedClust, a novel approach for CFL that leverages the correlation between local model weights and the data distribution of clients.
arXiv Detail & Related papers (2024-07-09T02:47:16Z)
An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets. Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round. We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
FedClust: Optimizing Federated Learning on Non-IID Data through Weight-Driven Client Clustering [28.057411252785176]
Federated learning (FL) is an emerging distributed machine learning paradigm enabling collaborative model training on decentralized devices without exposing their local data. This paper proposes FedClust, a novel CFL approach leveraging correlations between local model weights and client data distributions.
arXiv Detail & Related papers (2024-03-07T01:50:36Z)
Contrastive encoder pre-training-based clustered federated learning for heterogeneous data [17.580390632874046]
Federated learning (FL) enables distributed clients to collaboratively train a global model while preserving their data privacy. We propose contrastive pre-training-based clustered federated learning (CP-CFL) to improve the model convergence and overall performance of FL systems.
arXiv Detail & Related papers (2023-11-28T05:44:26Z)
PFL-GAN: When Client Heterogeneity Meets Generative Models in Personalized Federated Learning [55.930403371398114]
We propose a novel generative adversarial network (GAN) sharing and aggregation strategy for personalized federated learning (PFL). PFL-GAN addresses client heterogeneity in different scenarios. More specifically, we first learn the similarity among clients and then develop a weighted collaborative data aggregation. Empirical results from rigorous experiments on several well-known datasets demonstrate the effectiveness of PFL-GAN.
arXiv Detail & Related papers (2023-08-23T22:38:35Z)
Stochastic Clustered Federated Learning [21.811496586350653]
This paper proposes StoCFL, a novel clustered federated learning approach for generic Non-IID issues. In detail, StoCFL implements a flexible CFL framework that supports an arbitrary proportion of client participation and newly joined clients. The results show that StoCFL could obtain promising cluster results even when the number of clusters is unknown.
arXiv Detail & Related papers (2023-03-02T01:39:16Z)
Federated Learning with Privacy-Preserving Ensemble Attention Distillation [63.39442596910485]
Federated Learning (FL) is a machine learning paradigm where many local nodes collaboratively train a central model while keeping the training data decentralized. We propose a privacy-preserving FL framework leveraging unlabeled public data for one-way offline knowledge distillation. Our technique uses decentralized and heterogeneous local data like existing FL approaches, but more importantly, it significantly reduces the risk of privacy leakage.
arXiv Detail & Related papers (2022-10-16T06:44:46Z)
Federated Learning in Multi-Center Critical Care Research: A Systematic Case Study using the eICU Database [24.31499341763427]
Federated learning (FL) has been proposed as a method to train a model on different units without exchanging data. We investigate the effectiveness of FL on the publicly available eICU dataset for predicting the survival of each ICU stay.
arXiv Detail & Related papers (2022-04-20T09:03:09Z)
Heterogeneous Federated Learning via Grouped Sequential-to-Parallel Training [60.892342868936865]
Federated learning (FL) is a rapidly growing privacy-preserving collaborative machine learning paradigm. We propose FedGSP, an FL approach that is robust to data heterogeneity. We show that FedGSP improves the accuracy by 3.7% on average compared with seven state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-31T03:15:28Z)
Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work focuses on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units. The goal is to support decision making aimed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.