Clustered Sampling: Low-Variance and Improved Representativity for
Clients Selection in Federated Learning
- URL: http://arxiv.org/abs/2105.05883v1
- Date: Wed, 12 May 2021 18:19:20 GMT
- Title: Clustered Sampling: Low-Variance and Improved Representativity for
Clients Selection in Federated Learning
- Authors: Yann Fraboni, Richard Vidal, Laetitia Kameni, Marco Lorenzi
- Abstract summary: This work addresses the problem of optimizing communications between server and clients in federated learning (FL).
Current sampling approaches in FL are either biased or non-optimal in terms of server-client communications and training stability.
We prove that clustered sampling leads to better client representativity and to reduced variance of the clients' aggregation weights in FL.
- Score: 4.530678016396477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work addresses the problem of optimizing communications between server
and clients in federated learning (FL). Current sampling approaches in FL are
either biased or non-optimal in terms of server-client communications and
training stability. To overcome this issue, we introduce \textit{clustered
sampling} for clients selection. We prove that clustered sampling leads to
better client representativity and to reduced variance of the clients'
stochastic aggregation weights in FL. Compatibly with our theory, we provide
two different clustering approaches enabling client aggregation based on 1)
sample size, and 2) model similarity. Through a series of experiments in
non-iid and unbalanced scenarios, we demonstrate that model aggregation through
clustered sampling consistently leads to better training convergence and
variability when compared to standard sampling approaches. Our approach does
not require any additional operation on the clients' side, and can be seamlessly
integrated in standard FL implementations. Finally, clustered sampling is
compatible with existing methods and technologies for privacy enhancement, and
for communication reduction through model compression.
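To make the sample-size variant concrete, below is a minimal Python sketch of clustered client sampling: clients are partitioned into m clusters of equal total weight (a large client may span several clusters), and one client is drawn per cluster in proportion to its in-cluster weight, so m clients are selected per round. This is an illustration of the idea under simplified assumptions, not the authors' exact algorithm; the function names and tolerance constants are ours.

import random


def build_clusters(sample_sizes, m):
    """Split the clients' total sample weight into m clusters of equal weight;
    a large client may span several clusters, a small one stays in one cluster."""
    total = float(sum(sample_sizes.values()))
    target = total / m                                # weight budget per cluster
    clusters = [{} for _ in range(m)]
    k, capacity = 0, target
    # Fill clusters greedily, largest clients first.
    for client, size in sorted(sample_sizes.items(), key=lambda kv: -kv[1]):
        left = float(size)
        while left > 1e-9:
            if k == m - 1:                            # last cluster absorbs the remainder
                clusters[k][client] = clusters[k].get(client, 0.0) + left
                break
            take = min(left, capacity)
            clusters[k][client] = clusters[k].get(client, 0.0) + take
            left -= take
            capacity -= take
            if capacity <= 1e-9:                      # cluster full, open the next one
                k, capacity = k + 1, target
    return clusters


def sample_clients(clusters):
    """Draw one client per cluster, proportionally to its in-cluster weight."""
    return [random.choices(list(c), weights=list(c.values()), k=1)[0] for c in clusters]


if __name__ == "__main__":
    sizes = {"client_0": 500, "client_1": 300, "client_2": 150, "client_3": 50}
    print(sample_clients(build_clusters(sizes, m=2)))  # e.g. ['client_0', 'client_2']

For example, with sample sizes {500, 300, 150, 50} and m = 2, the first cluster contains only the largest client and the second contains the remaining three, so each round selects one client from each while the expected aggregation weight of every client stays proportional to its sample size.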
Related papers
- Cohort Squeeze: Beyond a Single Communication Round per Cohort in Cross-Device Federated Learning [51.560590617691005]
We investigate whether it is possible to "squeeze more juice" out of each cohort than what is possible in a single communication round.
Our approach leads to up to a 74% reduction in the total communication cost needed to train an FL model in the cross-device setting.
arXiv Detail & Related papers (2024-06-03T08:48:49Z) - LEFL: Low Entropy Client Sampling in Federated Learning [6.436397118145477]
Federated learning (FL) is a machine learning paradigm where multiple clients collaborate to optimize a single global model using their private data.
We propose LEFL, an alternative sampling strategy by performing a one-time clustering of clients based on their model's learned high-level features.
We show that clients sampled with this approach yield a low relative entropy with respect to the global data distribution.
arXiv Detail & Related papers (2023-12-29T01:44:20Z) - Enhanced Federated Optimization: Adaptive Unbiased Client Sampling with Reduced Variance [37.646655530394604]
Federated Learning (FL) is a distributed learning paradigm to train a global model across multiple devices without collecting local data.
We present the first adaptive client sampler, K-Vib, employing an independent sampling procedure.
K-Vib achieves a linear speed-up on the regret bound $\tilde{\mathcal{O}}\big(N^{1/3}T^{2/3}/K^{4/3}\big)$ within a set communication budget.
arXiv Detail & Related papers (2023-10-04T10:08:01Z) - Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating the results of local training.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z) - Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z) - When to Trust Aggregated Gradients: Addressing Negative Client Sampling
in Federated Learning [41.51682329500003]
We propose a novel learning rate adaptation mechanism to adjust the server learning rate for the aggregated gradient in each round.
We make theoretical deductions to find a meaningful and robust indicator that is positively related to the optimal server learning rate.
arXiv Detail & Related papers (2023-01-25T03:52:45Z) - Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated
Learning via Class-Imbalance Reduction [76.26710990597498]
We show that the class-imbalance of the grouped data from randomly selected clients can lead to significant performance degradation.
Based on our key observation, we design an efficient client sampling mechanism, i.e., Federated Class-balanced Sampling (Fed-CBS)
In particular, we propose a measure of class-imbalance and then employ homomorphic encryption to derive this measure in a privacy-preserving way.
arXiv Detail & Related papers (2022-09-30T05:42:56Z) - Mining Latent Relationships among Clients: Peer-to-peer Federated
Learning with Adaptive Neighbor Matching [6.959557494221414]
In federated learning (FL), clients may have diverse objectives; merging all clients' knowledge into one global model will cause negative transfer to local performance.
We take advantage of peer-to-peer (P2P) FL, where clients communicate with neighbors without a central server.
We propose an algorithm that enables clients to form an effective communication topology in a decentralized manner without assuming a known number of clusters.
arXiv Detail & Related papers (2022-03-23T09:10:14Z) - On the Convergence of Clustered Federated Learning [57.934295064030636]
In a federated learning system, the clients, e.g. mobile devices and organization participants, usually have different personal preferences or behavior patterns.
This paper proposes a novel weighted client-based clustered FL algorithm that jointly leverages client groups and individual clients in a unified optimization framework.
arXiv Detail & Related papers (2022-02-13T02:39:19Z) - Adaptive Client Sampling in Federated Learning via Online Learning with
Bandit Feedback [36.05851452151107]
Federated learning (FL) systems need to sample a subset of clients to participate in each round of training.
Despite its importance, there is limited work on how to sample clients effectively.
We show how our sampling method can improve the convergence speed of optimization algorithms.
arXiv Detail & Related papers (2021-12-28T23:50:52Z) - Low-Latency Federated Learning over Wireless Channels with Differential
Privacy [142.5983499872664]
In federated learning (FL), model training is distributed over clients and local models are aggregated by a central server.
In this paper, we aim to minimize FL training delay over wireless channels, constrained by overall training performance as well as each client's differential privacy (DP) requirement.
arXiv Detail & Related papers (2021-06-20T13:51:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.