On The Impact of Client Sampling on Federated Learning Convergence
- URL: http://arxiv.org/abs/2107.12211v1
- Date: Mon, 26 Jul 2021 13:36:06 GMT
- Title: On The Impact of Client Sampling on Federated Learning Convergence
- Authors: Yann Fraboni, Richard Vidal, Laetitia Kameni, Marco Lorenzi
- Abstract summary: We introduce a novel decomposition theorem for the convergence of FL, which allows us to clearly quantify the impact of client sampling on the global model update.
Our results suggest that MD sampling should be used as the default sampling scheme, due to its resilience to changes in data ratios during the learning process, while Uniform sampling is superior only in the special case where clients have the same amount of data.
- Score: 4.530678016396477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While client sampling is a central operation of current state-of-the-art
federated learning (FL) approaches, the impact of this procedure on the
convergence and speed of FL remains under-investigated to date. In this work we
introduce a novel decomposition theorem for the convergence of FL, which allows
us to clearly quantify the impact of client sampling on the global model update.
In contrast to previous convergence analyses, our theorem provides the exact
decomposition of a given convergence step, thus enabling precise reasoning about
the role of client sampling and heterogeneity. First, we
provide a theoretical ground for previously reported results on the
relationship between FL convergence and the variance of the aggregation
weights. Second, we prove for the first time that the quality of FL convergence
is also impacted by the resulting covariance between aggregation weights.
Third, we establish that the sum of the aggregation weights is another source
of slow-down and should be equal to 1 to improve FL convergence speed. Our
theory is general, and is here applied to Multinomial Distribution (MD) and
Uniform sampling, the two default client sampling schemes in FL, and is demonstrated
through a series of experiments in non-iid and unbalanced scenarios. Our
results suggest that MD sampling should be used as the default sampling scheme,
due to its resilience to changes in data ratios during the learning process,
while Uniform sampling is superior only in the special case where clients have
the same amount of data.
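As a concrete illustration of the two schemes discussed in the abstract, the following minimal Python sketch (an illustrative reconstruction, not code from the paper) simulates MD and Uniform client sampling and compares the statistics of the resulting aggregation weights. The client data sizes, the number of sampled clients, and the reweighting conventions are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_clients, m = 10, 3                          # total clients, clients sampled per round (illustrative)
n_k = rng.integers(50, 500, size=n_clients)   # synthetic per-client data set sizes
p = n_k / n_k.sum()                           # data ratios p_i used as client importance

def md_weights():
    """MD sampling: m i.i.d. draws from Multinomial(p); each draw contributes weight 1/m."""
    draws = rng.choice(n_clients, size=m, replace=True, p=p)
    w = np.zeros(n_clients)
    for k in draws:
        w[k] += 1.0 / m
    return w                                   # weights always sum to 1

def uniform_weights():
    """Uniform sampling: m distinct clients chosen uniformly; client i is reweighted by (n/m) * p_i."""
    draws = rng.choice(n_clients, size=m, replace=False)
    w = np.zeros(n_clients)
    w[draws] = (n_clients / m) * p[draws]
    return w                                   # unbiased, but the weight sum varies across rounds

T = 20000
md = np.array([md_weights() for _ in range(T)])
un = np.array([uniform_weights() for _ in range(T)])

print("max |E[w_i] - p_i|, MD:     ", np.abs(md.mean(0) - p).max())
print("max |E[w_i] - p_i|, Uniform:", np.abs(un.mean(0) - p).max())
print("weight sum, MD:      mean=%.3f std=%.3f" % (md.sum(1).mean(), md.sum(1).std()))
print("weight sum, Uniform: mean=%.3f std=%.3f" % (un.sum(1).mean(), un.sum(1).std()))
```

Under these assumptions both samplers come out (numerically) unbiased, while the weight-sum standard deviation is zero only for MD sampling, mirroring the abstract's point that a weight sum different from 1 is a source of slow-down unless all clients hold the same amount of data.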
Related papers
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
- Aggregation Weighting of Federated Learning via Generalization Bound Estimation [65.8630966842025]
Federated Learning (FL) typically aggregates client model parameters using a weighting approach determined by sample proportions.
We replace the aforementioned weighting method with a new strategy that considers the generalization bounds of each local model.
arXiv Detail & Related papers (2023-11-10T08:50:28Z)
- Enhanced Federated Optimization: Adaptive Unbiased Client Sampling with Reduced Variance [37.646655530394604]
Federated Learning (FL) is a distributed learning paradigm to train a global model across multiple devices without collecting local data.
We present the first adaptive client sampler, K-Vib, employing an independent sampling procedure.
K-Vib achieves a linear speed-up on the regret bound $\tilde{\mathcal{O}}\big(N^{\frac{1}{3}}T^{\frac{2}{3}}/K^{\frac{4}{3}}\big)$ within a set communication budget.
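K-Vib's actual sampler and its variance-reduction machinery are not reproduced here; the snippet below is only a generic sketch of the independent sampling idea it builds on, where each client participates independently with its own inclusion probability and its update is rescaled so the aggregate remains unbiased. The inclusion probabilities `q`, the uniform importance `p`, and the toy scalar updates are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def independent_sampling_round(updates, q, p):
    """Generic independent (Poisson-type) client sampling: client i participates with
    probability q[i]; its update is scaled by p[i] / q[i], which keeps the aggregate
    unbiased for the full-participation average. Adaptive samplers tune q online."""
    included = rng.random(len(updates)) < q          # independent Bernoulli trials
    return sum(p[i] / q[i] * updates[i] for i in range(len(updates)) if included[i])

# toy example: scalar "updates" from 5 clients with equal importance
updates = np.array([1.0, -0.5, 2.0, 0.3, -1.2])
p = np.full(5, 1 / 5)
q = np.array([0.9, 0.2, 0.5, 0.5, 0.3])              # heterogeneous inclusion probabilities

estimates = np.array([independent_sampling_round(updates, q, p) for _ in range(50000)])
print("full-participation average:", updates @ p)
print("mean of sampled estimates: ", estimates.mean())   # the two agree in expectation
```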
arXiv Detail & Related papers (2023-10-04T10:08:01Z)
- A Lightweight Method for Tackling Unknown Participation Statistics in Federated Averaging [39.15781847115902]
In federated learning (FL), clients usually have diverse participation statistics that are unknown a priori.
We present a new algorithm called FedAU, which improves FedAvg by adaptively weighting the client updates based on online estimates of the optimal weights.
Our theoretical results reveal important and interesting insights, while showing that FedAU converges to an optimal solution of the original objective.
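FedAU's precise weighting rule is given in the paper; the following is only a rough, hedged sketch of the underlying idea of weighting participating clients by the inverse of an online estimate of their participation frequency. The Bernoulli participation model, the frequency estimator, and the floor constant are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

n_clients, rounds = 8, 5000
part_prob = rng.uniform(0.1, 0.9, size=n_clients)   # unknown, heterogeneous participation rates

seen = np.zeros(n_clients)                           # how often each client has participated
total_w = np.zeros(n_clients)                        # accumulated aggregation weight per client

for t in range(1, rounds + 1):
    participating = rng.random(n_clients) < part_prob
    seen += participating
    est = np.maximum(seen / t, 1e-2)                 # online participation-frequency estimate (floored)
    # participating clients are weighted by the inverse of their estimated frequency,
    # so rarely available clients are not under-represented in the long run
    total_w += np.where(participating, 1.0 / est, 0.0)

print("true participation probabilities:", np.round(part_prob, 2))
print("online frequency estimates:      ", np.round(est, 2))
print("accumulated weight per client / rounds:", np.round(total_w / rounds, 2))  # roughly 1 for every client
```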
arXiv Detail & Related papers (2023-06-06T04:32:10Z)
- Beyond ADMM: A Unified Client-variance-reduced Adaptive Federated Learning Framework [82.36466358313025]
We propose a primal-dual FL algorithm, termed FedVRA, that allows one to adaptively control the variance-reduction level and bias of the global model.
Experiments based on (semi-supervised) image classification tasks demonstrate the superiority of FedVRA over existing schemes.
arXiv Detail & Related papers (2022-12-03T03:27:51Z)
- A Unified Analysis of Federated Learning with Arbitrary Client Participation [33.86606068136201]
Federated learning (FL) faces challenges of intermittent client availability and computation/communication efficiency.
It is important to understand how partial client participation affects convergence.
We provide a unified convergence analysis for FL with arbitrary client participation.
arXiv Detail & Related papers (2022-05-26T21:56:31Z)
- Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
- Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
- Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning [4.530678016396477]
This work addresses the problem of optimizing communications between server and clients in federated learning (FL).
Current sampling approaches in FL are either biased, or suboptimal in terms of server-client communication and training stability.
We prove that clustered sampling leads to better client representativity and to reduced variance of the client aggregation weights in FL.
arXiv Detail & Related papers (2021-05-12T18:19:20Z)
- On the Practicality of Differential Privacy in Federated Learning by Tuning Iteration Times [51.61278695776151]
Federated Learning (FL) is well known for its privacy protection when collaboratively training machine learning models among distributed clients.
Recent studies have pointed out that the naive FL is susceptible to gradient leakage attacks.
Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks.
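This paper's contribution concerns tuning iteration times under a privacy budget; as background for how DP is typically applied against gradient leakage, here is a generic, non-paper-specific sketch in which each client update is clipped to a norm bound and perturbed with Gaussian noise before aggregation. The clip norm, noise multiplier, and toy updates are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.0):
    """Generic DP-style treatment of one client update: clip its L2 norm to clip_norm,
    then add Gaussian noise whose scale is calibrated to that clipping bound."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

# toy round: the server averages the privatized updates of 4 clients
updates = [rng.normal(size=10) for _ in range(4)]
aggregate = np.mean([privatize_update(u) for u in updates], axis=0)
print(np.round(aggregate, 3))
```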
arXiv Detail & Related papers (2021-01-11T19:43:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.