PSI-PFL: Population Stability Index for Client Selection in non-IID Personalized Federated Learning
- URL: http://arxiv.org/abs/2506.00440v1
- Date: Sat, 31 May 2025 07:41:42 GMT
- Title: PSI-PFL: Population Stability Index for Client Selection in non-IID Personalized Federated Learning
- Authors: Daniel-M. Jimenez-Gutierrez, David Solans, Mohammed Elbamby, Nicolas Kourtellis,
- Abstract summary: Federated Learning (FL) enables decentralized machine learning (ML) model training while preserving data privacy by keeping data localized across clients. We propose PSI-PFL, a novel client selection framework for Personalized Federated Learning (PFL). Our approach selects more homogeneous clients based on PSI, reducing the impact of label skew, one of the most detrimental factors in FL performance.
- Score: 1.8777876049719082
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated Learning (FL) enables decentralized machine learning (ML) model training while preserving data privacy by keeping data localized across clients. However, non-independent and identically distributed (non-IID) data across clients poses a significant challenge, leading to skewed model updates and performance degradation. Addressing this, we propose PSI-PFL, a novel client selection framework for Personalized Federated Learning (PFL) that leverages the Population Stability Index (PSI) to quantify and mitigate data heterogeneity (so-called non-IIDness). Our approach selects more homogeneous clients based on PSI, reducing the impact of label skew, one of the most detrimental factors in FL performance. Experimental results over multiple data modalities (tabular, image, text) demonstrate that PSI-PFL significantly improves global model accuracy, outperforming state-of-the-art baselines by up to 10% under non-IID scenarios while ensuring fairer local performance. PSI-PFL enhances FL performance and offers practical benefits in applications where data privacy and heterogeneity are critical.
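For intuition, here is a minimal Python sketch of PSI-based client filtering. The PSI formula itself is standard; comparing each client's label histogram against the pooled global distribution, and the 0.25 selection threshold, are illustrative assumptions based on the abstract rather than the paper's exact procedure.

```python
import numpy as np

def psi(p, q, eps=1e-6):
    """Population Stability Index between two discrete distributions:
    PSI = sum_b (p_b - q_b) * ln(p_b / q_b)."""
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum((p - q) * np.log(p / q)))

def select_clients(client_label_counts, threshold=0.25):
    """Keep clients whose label distribution stays close to the pooled
    (global) distribution: low PSI means low label skew.
    The threshold is illustrative, not taken from the paper."""
    counts = np.asarray(client_label_counts, dtype=float)
    global_dist = counts.sum(axis=0) / counts.sum()
    scores = [psi(row, global_dist) for row in counts]
    return [i for i, s in enumerate(scores) if s <= threshold], scores

# Toy example: 3 clients, 4 classes; client 2 is heavily label-skewed.
counts = [[25, 25, 25, 25], [30, 20, 25, 25], [90, 5, 3, 2]]
selected, scores = select_clients(counts)
print(selected)                        # [0, 1]
print([round(s, 3) for s in scores])   # client 2 has the largest PSI
```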
Related papers
- FedAPA: Server-side Gradient-Based Adaptive Personalized Aggregation for Federated Learning on Heterogeneous Data [5.906966694759679]
FedAPA is a novel PFL method featuring a server-side, gradient-based adaptive aggregation strategy to generate personalized models. FedAPA guarantees theoretical convergence and achieves superior accuracy and computational efficiency compared to 10 PFL competitors across three datasets.
arXiv Detail & Related papers (2025-02-11T11:00:58Z)
- Client-Centric Federated Adaptive Optimization [78.30827455292827]
Federated Learning (FL) is a distributed learning paradigm where clients collaboratively train a model while keeping their own data private. We propose Client-Centric Federated Adaptive Optimization, a class of novel federated optimization approaches.
arXiv Detail & Related papers (2025-01-17T04:00:50Z)
- UA-PDFL: A Personalized Approach for Decentralized Federated Learning [5.065947993017158]
Federated learning (FL) is a privacy-preserving machine learning paradigm designed to collaboratively learn a global model without data leakage. To mitigate the drawbacks of relying on a central server, decentralized federated learning (DFL) has been proposed, where all participating clients engage in peer-to-peer communication without a central server. We propose a novel unit-representation-aided personalized decentralized federated learning framework, named UA-PDFL, to deal with the non-IID challenge in DFL.
arXiv Detail & Related papers (2024-12-16T11:27:35Z)
- The Power of Bias: Optimizing Client Selection in Federated Learning with Heterogeneous Differential Privacy [38.55420329607416]
Both data quality and the influence of DP noise should be taken into account when selecting clients.
Experiments on real datasets are conducted under both convex and non-convex loss functions.
arXiv Detail & Related papers (2024-08-16T10:19:27Z)
- FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data.
Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena.
We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z)
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on effectively utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
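As a reference point for the aggregate-then-adapt framework described above, a minimal FedAvg-style server step is sketched below; this is a generic illustration of the conventional baseline, not FedAF's aggregation-free procedure.

```python
import numpy as np

def fedavg_aggregate(client_models, client_sizes):
    """Server step of the aggregate-then-adapt framework: average
    client parameters weighted by local dataset size (FedAvg-style);
    clients then adapt locally from the returned global model."""
    w = np.asarray(client_sizes, dtype=float)
    w /= w.sum()
    return sum(wi * m for wi, m in zip(w, client_models))

# Toy example: three clients' parameter vectors and dataset sizes.
models = [np.array([1.0, 2.0]), np.array([0.0, 1.0]), np.array([2.0, 0.0])]
print(fedavg_aggregate(models, [100, 50, 50]))  # -> [1.   1.25]
```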
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
- Strategic Client Selection to Address Non-IIDness in HAPS-enabled FL Networks [21.446301665317378]
We propose a novel weighted attribute-based client selection strategy to mitigate the adverse effects of non-IID data. Simulation results corroborate the effectiveness of the proposed client selection strategy in enhancing FL model accuracy and convergence rate.
arXiv Detail & Related papers (2024-01-10T18:22:00Z)
- Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating locally trained models.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that FedIns outperforms state-of-the-art FL algorithms, e.g., with a 6.64% improvement over the top-performing method at less than 15% of the communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z)
- SemiSFL: Split Federated Learning on Unlabeled and Non-IID Data [34.49090830845118]
Federated Learning (FL) has emerged to allow multiple clients to collaboratively train machine learning models on their private data at the network edge.
We propose a novel semi-supervised split federated learning (SFL) system, termed SemiSFL, which incorporates clustering regularization to perform SFL with unlabeled and non-IID client data.
Our system provides a 3.8x speed-up in training time, reduces the communication cost by about 70.3% while reaching the target accuracy, and achieves up to 5.8% improvement in accuracy under non-IID scenarios.
arXiv Detail & Related papers (2023-07-29T02:35:37Z)
- PS-FedGAN: An Efficient Federated Learning Framework Based on Partially Shared Generative Adversarial Networks For Data Privacy [56.347786940414935]
Federated Learning (FL) has emerged as an effective learning paradigm for distributed computation.
This work proposes a novel FL framework that requires only partial GAN model sharing.
Named PS-FedGAN, this new framework enhances the GAN releasing and training mechanism to address heterogeneous data distributions.
arXiv Detail & Related papers (2023-05-19T05:39:40Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
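The mixture-modeling idea can be illustrated with a small centralized sketch using scikit-learn's GaussianMixture (an assumption; the paper fits the mixture federatedly). Mixture responsibilities serve personalization, while per-sample log-likelihood supports novel-sample detection.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two clients with different input distributions (toy stand-in).
client_data = [rng.normal(loc=m, scale=1.0, size=(200, 2)) for m in (0.0, 3.0)]

# Centralized proxy for the shared global mixture.
gmm = GaussianMixture(n_components=2, random_state=0).fit(np.vstack(client_data))

# A new client is characterized by its responsibilities over the shared
# components; low log-likelihood flags novel, out-of-mixture samples.
new_client = rng.normal(loc=8.0, scale=1.0, size=(50, 2))
print(gmm.predict_proba(new_client[:3]))   # component responsibilities
print(gmm.score_samples(new_client[:3]))   # per-sample log-likelihood (low here)
```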
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)