A Unified Linear Speedup Analysis of Federated Averaging and Nesterov FedAvg
- URL: http://arxiv.org/abs/2007.05690v4
- Date: Sun, 31 Dec 2023 19:35:55 GMT
- Title: A Unified Linear Speedup Analysis of Federated Averaging and Nesterov FedAvg
- Authors: Zhaonan Qu, Kaixiang Lin, Zhaojian Li, Jiayu Zhou, Zhengyuan Zhou
- Abstract summary: Federated learning (FL) learns a model jointly from a set of participating devices without sharing each other's privately held data.
In this paper, we focus on Federated Averaging (FedAvg), one of the most popular and effective FL algorithms in use today.
We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies.
- Score: 49.76940694847521
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) learns a model jointly from a set of participating
devices without sharing each other's privately held data. The characteristics
of non-i.i.d. data across the network, low device participation, high
communication costs, and the mandate that data remain private bring challenges
in understanding the convergence of FL algorithms, particularly regarding how
convergence scales with the number of participating devices. In this paper, we
focus on Federated Averaging (FedAvg), one of the most popular and effective FL
algorithms in use today, as well as its Nesterov accelerated variant, and
conduct a systematic study of how their convergence scales with the number of
participating devices under non-i.i.d. data and partial participation in convex
settings. We provide a unified analysis that establishes convergence guarantees
for FedAvg under strongly convex, convex, and overparameterized strongly convex
problems. We show that FedAvg enjoys linear speedup in each case, although with
different convergence rates and communication efficiencies. For strongly convex
and convex problems, we also characterize the corresponding convergence rates
for the Nesterov accelerated FedAvg algorithm, which are the first linear
speedup guarantees for momentum variants of FedAvg in convex settings.
Empirical studies of the algorithms in various settings have supported our
theoretical results.
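To make the two algorithms under study concrete, the following is a minimal sketch of FedAvg with partial device participation and one plausible Nesterov-momentum variant, run on a synthetic convex least-squares problem. The data generation, sampling scheme, and hyperparameters are illustrative assumptions, not the paper's exact protocol or constants.
```python
# Minimal sketch (not the paper's exact algorithm): FedAvg with partial
# participation and a Nesterov-momentum variant on synthetic non-i.i.d.
# least-squares data. Hyperparameters and sampling are assumptions.
import numpy as np

rng = np.random.default_rng(0)
num_devices, dim, n_local = 20, 5, 50

# Each device holds its own data with a shifted local optimum (non-i.i.d.).
A = [rng.normal(size=(n_local, dim)) for _ in range(num_devices)]
b = [A[i] @ (np.ones(dim) + 0.5 * rng.normal(size=dim)) for i in range(num_devices)]

def local_grad(w, i, batch=10):
    """Stochastic gradient of the device-i least-squares loss on a minibatch."""
    idx = rng.choice(n_local, size=batch, replace=False)
    Ai, bi = A[i][idx], b[i][idx]
    return Ai.T @ (Ai @ w - bi) / batch

def fedavg(rounds=100, local_steps=5, lr=0.05, frac=0.5):
    w = np.zeros(dim)
    for _ in range(rounds):
        # Partial participation: the server samples a fraction of devices.
        sampled = rng.choice(num_devices, size=int(frac * num_devices), replace=False)
        local_models = []
        for i in sampled:
            w_i = w.copy()
            for _ in range(local_steps):       # E local SGD steps
                w_i -= lr * local_grad(w_i, i)
            local_models.append(w_i)
        w = np.mean(local_models, axis=0)      # server averages local models
    return w

def nesterov_fedavg(rounds=100, local_steps=5, lr=0.05, beta=0.9, frac=0.5):
    # One plausible way to add Nesterov momentum to the local updates;
    # the server averages both the iterates and the momentum buffers.
    w, m = np.zeros(dim), np.zeros(dim)
    for _ in range(rounds):
        sampled = rng.choice(num_devices, size=int(frac * num_devices), replace=False)
        ws, ms = [], []
        for i in sampled:
            w_i, m_i = w.copy(), m.copy()
            for _ in range(local_steps):
                g = local_grad(w_i + beta * m_i, i)  # gradient at the look-ahead point
                m_i = beta * m_i - lr * g
                w_i = w_i + m_i
            ws.append(w_i)
            ms.append(m_i)
        w, m = np.mean(ws, axis=0), np.mean(ms, axis=0)
    return w

print(fedavg())
print(nesterov_fedavg())
```
Averaging both the iterates and the momentum buffers is one design choice among several; the paper's accelerated variant may organize the momentum differently.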
Related papers
- Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an accelerated Decentralized Federated Learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase.
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
arXiv Detail & Related papers (2024-10-09T06:17:16Z)
- Rethinking Clustered Federated Learning in NOMA Enhanced Wireless Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to the challenges posed by non-IID conditions are proposed, together with an analysis of their properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z)
- Privacy-preserving Federated Primal-dual Learning for Non-convex and Non-smooth Problems with Model Sparsification [51.04894019092156]
Federated learning (FL) has been recognized as a rapidly growing research area, where a model is trained over distributed clients under the orchestration of a parameter server (PS).
In this paper, we propose a novel privacy-preserving federated primal-dual algorithm with model sparsification, with guarantees for non-convex and non-smooth FL problems.
Its distinctive properties and the corresponding analyses are also presented.
arXiv Detail & Related papers (2023-10-30T14:15:47Z)
- Momentum Benefits Non-IID Federated Learning Simply and Provably [22.800862422479913]
Federated learning is a powerful paradigm for large-scale machine learning.
FedAvg and SCAFFOLD are two prominent algorithms to address these challenges.
This paper explores the utilization of momentum to enhance the performance of FedAvg and SCAFFOLD.
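As a rough illustration of what momentum can mean in this setting, here is a hedged sketch of server-side momentum applied to FedAvg-style aggregated updates; the setup, names, and constants are assumptions, the paper's construction may differ, and SCAFFOLD's control variates are omitted.
```python
# Hypothetical sketch of server-side momentum on top of FedAvg-style
# aggregation; not the paper's exact method.
import numpy as np

rng = np.random.default_rng(1)
num_clients, dim = 10, 4
optima = [rng.normal(size=dim) for _ in range(num_clients)]  # heterogeneous local optima

def local_update(w, i, steps=5, lr=0.1):
    """Run a few local gradient steps on f_i(w) = 0.5*||w - optima[i]||^2."""
    w_i = w.copy()
    for _ in range(steps):
        w_i -= lr * (w_i - optima[i])
    return w - w_i          # the client's pseudo-gradient (model delta)

w, velocity = np.zeros(dim), np.zeros(dim)
beta, server_lr = 0.9, 1.0
for _ in range(200):
    deltas = [local_update(w, i) for i in range(num_clients)]
    avg_delta = np.mean(deltas, axis=0)       # FedAvg-style aggregation
    velocity = beta * velocity + avg_delta    # server momentum buffer
    w = w - server_lr * velocity              # momentum-corrected global step

print(w, np.mean(optima, axis=0))  # global model vs. average of local optima
```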
arXiv Detail & Related papers (2023-06-28T18:52:27Z)
- Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into a finite number of clusters under the orchestration of a server.
We propose a novel FedC algorithm using differential privacy, referred to as DP-Fed, in which partial participation and multiple local updates are also considered.
Various properties of the proposed DP-Fed are established through theoretical analyses of its convergence and privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
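As a generic illustration of the privacy mechanism such algorithms typically build on, here is a hedged sketch of clipping and Gaussian-noising client updates before aggregation; it is not the DP-Fed algorithm itself, and the clipping bound and noise scale are placeholder values.
```python
# Generic sketch of differentially private aggregation (clip + Gaussian noise)
# of client-side updates; not the DP-Fed algorithm from the paper.
import numpy as np

rng = np.random.default_rng(2)
num_clients, dim = 8, 3
client_updates = [rng.normal(size=dim) for _ in range(num_clients)]  # per-client deltas

clip_norm = 1.0    # sensitivity bound enforced by clipping (assumed)
noise_mult = 0.8   # noise multiplier sigma / clip_norm (assumed)

def clip(u, c):
    """Scale the update so its L2 norm is at most c."""
    return u * min(1.0, c / (np.linalg.norm(u) + 1e-12))

clipped = [clip(u, clip_norm) for u in client_updates]
noisy_sum = np.sum(clipped, axis=0) + rng.normal(
    scale=noise_mult * clip_norm, size=dim)   # Gaussian mechanism on the sum
private_avg = noisy_sum / num_clients         # server sees only the noisy aggregate

print(private_avg)
```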
arXiv Detail & Related papers (2023-01-03T05:38:43Z)
- Communication-Efficient Federated Learning With Data and Client Heterogeneity [22.432529149142976]
Federated Learning (FL) enables large-scale distributed training of machine learning models.
However, executing FL at scale comes with inherent practical challenges.
We present a variant of the classic federated averaging (FedAvg) algorithm designed to handle both data and client heterogeneity.
arXiv Detail & Related papers (2022-06-20T22:39:39Z)
- On Convergence of FedProx: Local Dissimilarity Invariant Bounds, Non-smoothness and Beyond [41.082982732100696]
We develop a novel local dissimilarity invariant convergence theory for FedProx and its minibatch extension through the lens of algorithmic stability.
Preliminary experimental results on a series of benchmark FL datasets are reported to demonstrate the benefit of minibatching for improving the sample efficiency of FedProx.
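For context, the mechanism FedProx adds to FedAvg is a proximal term in each client's local objective. The sketch below shows that local subproblem on a toy quadratic; the problem data and the proximal weight mu are assumed placeholders, not values from the paper.
```python
# Sketch of a FedProx-style local step: each client minimizes its own loss
# plus a proximal term (mu/2)*||w - w_global||^2 that keeps local models
# close to the global one. Data and constants are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
dim, mu, lr = 4, 0.1, 0.05
target = rng.normal(size=dim)             # this client's local optimum

def prox_local_grad(w, w_global):
    local = w - target                    # gradient of 0.5*||w - target||^2
    prox = mu * (w - w_global)            # gradient of the proximal term
    return local + prox

w_global = np.zeros(dim)
w = w_global.copy()
for _ in range(100):                      # inexact local solve, FedProx-style
    w -= lr * prox_local_grad(w, w_global)

# Closed-form minimizer of the proximal subproblem, for comparison:
print(w, (target + mu * w_global) / (1 + mu))
```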
arXiv Detail & Related papers (2022-06-10T15:35:10Z)
- Communication-Efficient Stochastic Zeroth-Order Optimization for Federated Learning [28.65635956111857]
Federated learning (FL) enables edge devices to collaboratively train a global model without sharing their private data.
To enhance the training efficiency of FL, various algorithms have been proposed, ranging from first-order to zeroth-order methods.
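As background on what "zeroth-order" means here, below is a hedged sketch of a standard two-point gradient estimator that uses only function evaluations; the test function, smoothing radius, and step size are assumptions, and the paper's estimator may differ in its details.
```python
# Sketch of a two-point zeroth-order gradient estimator: it approximates the
# gradient from function values only, the building block of zeroth-order FL
# methods. The objective and constants are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)
dim, smoothing = 5, 1e-4

def f(w):
    return 0.5 * np.sum((w - 1.0) ** 2)   # toy smooth objective

def zo_grad(w):
    u = rng.normal(size=dim)              # random probing direction
    return (f(w + smoothing * u) - f(w - smoothing * u)) / (2 * smoothing) * u

w = np.zeros(dim)
for _ in range(2000):
    w -= 0.05 * zo_grad(w)                # SGD driven by the zeroth-order estimate

print(w)  # approaches the minimizer at the all-ones vector
```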
arXiv Detail & Related papers (2022-01-24T08:56:06Z)
- Achieving Statistical Optimality of Federated Learning: Beyond Stationary Points [19.891597817559038]
Federated Learning (FL) is a promising framework that has great potential for privacy preservation and for lowering the computation load at the cloud.
Recent work has raised concerns about two such methods (FedAvg and FedProx): (1) their fixed points do not correspond to stationary points of the original optimization problem, and (2) the common model found might not generalize well locally.
We show, in the general kernel regression setting, that both FedAvg and FedProx converge to the minimax-optimal error rates.
arXiv Detail & Related papers (2021-06-29T09:59:43Z)
- Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization [93.78811018928583]
This paper provides a framework to analyze the convergence of federated heterogeneous optimization algorithms.
We propose FedNova, a normalized averaging method that eliminates objective inconsistency while preserving fast error convergence.
arXiv Detail & Related papers (2020-07-15T05:01:23Z)
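The FedNova entry above describes normalized averaging; the following is a hedged sketch of that idea, where each client's model delta is divided by its number of local steps before aggregation. The problem, client weights, and constants are illustrative assumptions rather than the paper's exact formulation.
```python
# Sketch in the spirit of FedNova's normalized averaging: local deltas are
# normalized by each client's number of local steps so that clients doing
# more local work do not dominate the aggregate. Data and constants are
# illustrative assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(5)
num_clients, dim, lr = 10, 4, 0.1
optima = [np.ones(dim) * (i % 3) for i in range(num_clients)]         # heterogeneous local optima
local_steps = [int(rng.integers(2, 10)) for _ in range(num_clients)]  # heterogeneous local work

def fednova_round(w):
    normalized_deltas, taus = [], []
    for i in range(num_clients):
        w_i = w.copy()
        for _ in range(local_steps[i]):
            w_i -= lr * (w_i - optima[i])          # gradient of 0.5*||w - optima[i]||^2
        normalized_deltas.append((w - w_i) / local_steps[i])
        taus.append(local_steps[i])
    tau_eff = np.mean(taus)                        # uniform client weights p_i = 1/N
    return w - tau_eff * np.mean(normalized_deltas, axis=0)

w = np.zeros(dim)
for _ in range(200):
    w = fednova_round(w)

print(w, np.mean(optima, axis=0))  # global model vs. minimizer of the average objective
```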