Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression
- URL: http://arxiv.org/abs/2211.14292v1
- Date: Fri, 25 Nov 2022 18:49:53 GMT
- Title: Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression
- Authors: Xiaoyun Li and Ping Li
- Abstract summary: In federated learning (FL) systems, the communication cost between the clients and the central server can be a bottleneck.
In this paper, we study Fed-EF, a compressed FL scheme equipped with error feedback to remedy the downsides of biased compression.
Under partial client participation, we prove that the convergence rate exhibits an extra slow-down factor due to a so-called "stale error compensation" effect.
- Score: 37.6593006747285
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In federated learning (FL) systems, e.g., wireless networks, the
communication cost between the clients and the central server can often be a
bottleneck. To reduce the communication cost, the paradigm of communication
compression has become a popular strategy in the literature. In this paper, we
focus on biased gradient compression techniques in non-convex FL problems. In
the classical setting of distributed learning, the method of error feedback
(EF) is a common technique to remedy the downsides of biased gradient
compression. In this work, we study a compressed FL scheme equipped with error
feedback, named Fed-EF. We further propose two variants: Fed-EF-SGD and
Fed-EF-AMS, depending on the choice of the global model optimizer. We provide a
generic theoretical analysis, which shows that directly applying biased
compression in FL leads to a non-vanishing bias in the convergence rate. The
proposed Fed-EF is able to match the convergence rate of the full-precision FL
counterparts under data heterogeneity with a linear speedup.
Moreover, we develop a new analysis of the EF under partial client
participation, which is an important scenario in FL. We prove that under
partial participation, the convergence rate of Fed-EF exhibits an extra
slow-down factor due to a so-called "stale error compensation" effect. A
numerical study is conducted to justify the intuitive impact of stale error
accumulation on the norm convergence of Fed-EF under partial participation.
Finally, we also demonstrate that incorporating two-way compression into
Fed-EF does not change the convergence results. In summary, our work conducts a
thorough analysis of the error feedback in federated non-convex optimization.
Our analysis with partial client participation also provides insight into a
theoretical limitation of the error feedback mechanism and possible directions
for improvement.
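To make the mechanism concrete, below is a minimal sketch of error feedback with a biased top-k compressor inside a FedAvg-style loop, with client sampling so the stale-error effect described above is visible. Everything here (function names, the toy quadratic objective, hyperparameters) is an illustrative assumption rather than the authors' implementation; the server update mirrors the Fed-EF-SGD variant, and replacing it with an AMSGrad-style rule would correspond to Fed-EF-AMS.

```python
# Hedged sketch of error feedback (EF) with biased top-k compression in a
# FedAvg-style loop. Toy objective and all hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def top_k(v, k):
    """Biased top-k compressor: keep only the k largest-magnitude coordinates."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def local_grad(w, client_id):
    """Stand-in stochastic gradient of a toy quadratic objective per client."""
    return (w - client_targets[client_id]) + 0.01 * rng.standard_normal(w.shape)

d, n_clients, k = 100, 10, 10        # dimension, number of clients, top-k budget
local_steps, rounds, m = 5, 200, 5   # local steps, rounds, sampled clients per round
eta_l, eta_g = 0.1, 0.5              # local and global step sizes

client_targets = rng.standard_normal((n_clients, d))  # heterogeneous local optima
w = np.zeros(d)                                        # global model
err = np.zeros((n_clients, d))                         # per-client error-feedback buffers

for t in range(rounds):
    sampled = rng.choice(n_clients, size=m, replace=False)  # partial participation
    agg = np.zeros(d)
    for i in sampled:
        w_i = w.copy()
        for _ in range(local_steps):          # local SGD from the current global model
            w_i -= eta_l * local_grad(w_i, i)
        delta = w - w_i                       # local update to be communicated
        compensated = delta + err[i]          # error feedback: add the stored residual
        msg = top_k(compensated, k)           # biased compression of the update
        err[i] = compensated - msg            # keep the new residual on the client
        agg += msg
    # Global SGD step on the average of compressed updates (Fed-EF-SGD flavor).
    w -= eta_g * (agg / m)

print("distance to average target:", np.linalg.norm(w - client_targets.mean(axis=0)))
```

With full participation every client's error buffer is refreshed each round; under partial participation a buffer can sit untouched for several rounds before being compensated back in, which is the intuition behind the extra slow-down factor discussed in the abstract.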
Related papers
- Harnessing the Power of Federated Learning in Federated Contextual Bandits [20.835106310302876]
Federated contextual bandits (FCB) are a pivotal integration of FL and sequential decision-making.
Existing FCB approaches have largely employed tailored FL components, often deviating from the canonical FL framework.
In particular, a novel FCB design, termed FedIGW, is proposed to leverage a regression-based CB algorithm.
arXiv Detail & Related papers (2023-12-26T21:44:09Z)
- Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For the different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting them in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z)
- FedNAR: Federated Optimization with Normalized Annealing Regularization [54.42032094044368]
We explore the choice of weight decay and identify that the weight decay value appreciably influences the convergence of existing FL algorithms.
We develop Federated optimization with Normalized Annealing Regularization (FedNAR), a plug-in that can be seamlessly integrated into any existing FL algorithms.
arXiv Detail & Related papers (2023-10-04T21:11:40Z)
- Federated Conformal Predictors for Distributed Uncertainty Quantification [83.50609351513886]
Conformal prediction is emerging as a popular paradigm for providing rigorous uncertainty quantification in machine learning.
In this paper, we extend conformal prediction to the federated learning setting.
We propose a weaker notion of partial exchangeability, better suited to the FL setting, and use it to develop the Federated Conformal Prediction framework.
arXiv Detail & Related papers (2023-05-27T19:57:27Z)
- On The Impact of Client Sampling on Federated Learning Convergence [4.530678016396477]
We introduce a novel decomposition theorem for the convergence of FL, which allows us to clearly quantify the impact of client sampling on the global model update.
Our results suggest that MD sampling should be used as the default sampling scheme, due to its resilience to changes in the data ratio during the learning process, while Uniform sampling is superior only in the special case when clients have the same amount of data.
arXiv Detail & Related papers (2021-07-26T13:36:06Z)
- Tight Mutual Information Estimation With Contrastive Fenchel-Legendre Optimization [69.07420650261649]
We introduce a novel, simple, and powerful contrastive MI estimator named FLO.
Empirically, our FLO estimator overcomes the limitations of its predecessors and learns more efficiently.
The utility of FLO is verified using an extensive set of benchmarks, which also reveals the trade-offs in practical MI estimation.
arXiv Detail & Related papers (2021-07-02T15:20:41Z)
- Achieving Statistical Optimality of Federated Learning: Beyond Stationary Points [19.891597817559038]
Federated Learning (FL) is a promising framework that has great potential for privacy preservation and for lowering the computation load at the cloud.
Recent work raised concerns about two popular methods, FedAvg and FedProx: (1) their fixed points do not correspond to the stationary points of the original optimization problem, and (2) the common model found might not generalize well locally.
We show, in the general kernel regression setting, that both FedAvg and FedProx converge to the minimax-optimal error rates.
arXiv Detail & Related papers (2021-06-29T09:59:43Z)
- 1-Bit Compressive Sensing for Efficient Federated Learning Over the Air [32.14738452396869]
This paper develops and analyzes a communication-efficient scheme for federated learning (FL) over the air, which incorporates 1-bit compressive sensing (CS) into analog aggregation transmissions.
For scalable computing, we develop an efficient implementation that is suitable for large-scale networks.
Simulation results show that our proposed 1-bit CS based FL over the air achieves comparable performance to the ideal case.
arXiv Detail & Related papers (2021-03-30T03:50:31Z)
- Federated Composite Optimization [28.11253930828807]
Federated Learning (FL) is a distributed learning paradigm that scales on-device learning collaboratively and privately.
Standard FL algorithms such as FedAvg are primarily geared towards smooth unconstrained settings.
We propose a new primal-dual algorithm, Federated Dual Averaging (FedDualAvg), which, by employing a novel server dual averaging procedure, circumvents the curse of primal averaging.
arXiv Detail & Related papers (2020-11-17T06:54:06Z)
- A Unified Linear Speedup Analysis of Federated Averaging and Nesterov FedAvg [49.76940694847521]
Federated learning (FL) learns a model jointly from a set of participating devices without sharing each other's privately held data.
In this paper, we focus on Federated Averaging (FedAvg), one of the most popular and effective FL algorithms in use today.
We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies.
arXiv Detail & Related papers (2020-07-11T05:59:08Z)