Local Adaptivity in Federated Learning: Convergence and Consistency
- URL: http://arxiv.org/abs/2106.02305v1
- Date: Fri, 4 Jun 2021 07:36:59 GMT
- Title: Local Adaptivity in Federated Learning: Convergence and Consistency
- Authors: Jianyu Wang, Zheng Xu, Zachary Garrett, Zachary Charles, Luyang Liu,
Gauri Joshi
- Abstract summary: The federated learning (FL) framework trains a machine learning model using decentralized data stored at edge client devices by periodically aggregating locally trained models.
We show in both theory and practice that while local adaptive methods can accelerate convergence, they can cause a non-vanishing solution bias.
We propose correction techniques to overcome this inconsistency and complement the local adaptive methods for FL.
- Score: 25.293584783673413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The federated learning (FL) framework trains a machine learning model using
decentralized data stored at edge client devices by periodically aggregating
locally trained models. Popular optimization algorithms of FL use vanilla
(stochastic) gradient descent for both local updates at clients and global
updates at the aggregating server. Recently, adaptive optimization methods such
as AdaGrad have been studied for server updates. However, the effect of using
adaptive optimization methods for local updates at clients is not yet
understood. We show in both theory and practice that while local adaptive
methods can accelerate convergence, they can cause a non-vanishing solution
bias, where the final converged solution may be different from the stationary
point of the global objective function. We propose correction techniques to
overcome this inconsistency and complement the local adaptive methods for FL.
Extensive experiments on realistic federated training tasks show that the
proposed algorithms can achieve faster convergence and higher test accuracy
than the baselines without local adaptivity.
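The kind of solution bias described in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm or analysis: it assumes two clients with one-dimensional quadratic objectives, and it stands in for an adaptive optimizer's state with a fixed, client-specific scaling (`precond`, a hypothetical name). Under these assumptions, averaging the locally scaled updates converges to a point other than the global stationary point, and the gap does not shrink when the learning rate is reduced. The correction techniques proposed in the paper are not shown.

```python
import numpy as np


def run_rounds(centers, precond, lr=0.1, rounds=500, w0=0.0):
    """Average one scaled local gradient step per client in each round.

    Client i holds f_i(w) = 0.5 * (w - centers[i])**2, so its gradient is
    w - centers[i]. precond[i] is a fixed per-client scaling used here as an
    assumed stand-in for the state of a local adaptive optimizer.
    """
    centers = np.asarray(centers, dtype=float)
    precond = np.asarray(precond, dtype=float)
    w = float(w0)
    for _ in range(rounds):
        grads = w - centers                      # local gradients at the global model
        local_models = w - lr * grads / precond  # one scaled local step per client
        w = float(local_models.mean())           # server averages the local models
    return w


centers = [0.0, 4.0]
w_star = float(np.mean(centers))  # stationary point of the average objective: 2.0

# Same scaling on every client (plain gradient steps): reaches w_star.
w_sgd = run_rounds(centers, precond=[1.0, 1.0])

# Different scaling per client: converges to sum(c_i / d_i) / sum(1 / d_i) = 0.8.
w_adaptive = run_rounds(centers, precond=[1.0, 4.0])

# A smaller learning rate leads to the same biased point.
w_adaptive_small_lr = run_rounds(centers, precond=[1.0, 4.0], lr=0.01, rounds=20000)

print(f"global stationary point: {w_star:.3f}")
print(f"uniform scaling:         {w_sgd:.3f}")
print(f"client-specific scaling: {w_adaptive:.3f}")
print(f"same, smaller lr:        {w_adaptive_small_lr:.3f}")
```

In this toy setting the averaged update vanishes where the scaled gradients cancel, not where the plain gradients cancel, which is why the limit point moves from 2.0 to 0.8 regardless of the step size.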
Related papers
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup
for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- Efficient Federated Learning via Local Adaptive Amended Optimizer with
Linear Speedup [90.26270347459915]
We propose a novel momentum-based algorithm that uses the global descent direction together with a locally adaptive amended optimizer.
LADA greatly reduces the number of communication rounds and achieves higher accuracy than several baselines.
arXiv Detail & Related papers (2023-07-30T14:53:21Z)
- Locally Adaptive Federated Learning [30.19411641685853]
Federated learning is a paradigm of distributed machine learning in which multiple clients coordinate with a central server to learn a model.
Standard federated optimization methods such as Federated Averaging (FedAvg) ensure generalization among the clients.
We propose locally adaptive federated learning algorithms that leverage the local geometric information of each client function.
arXiv Detail & Related papers (2023-07-12T17:02:32Z)
- FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted
Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
arXiv Detail & Related papers (2023-02-13T05:10:30Z)
- Accelerated Federated Learning with Decoupled Adaptive Optimization [53.230515878096426]
The federated learning (FL) framework enables clients to collaboratively learn a shared model while keeping the training data private on the clients.
Recently, many efforts have been made to generalize centralized adaptive optimization methods, such as SGDM, Adam, and AdaGrad, to federated settings.
This work aims to develop novel adaptive optimization methods for FL from the perspective of the dynamics of ordinary differential equations (ODEs).
arXiv Detail & Related papers (2022-07-14T22:46:43Z)
- AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias
Estimation [12.62716075696359]
In Federated Learning (FL), a number of clients or devices collaborate to train a model without sharing their data.
To estimate and thereby remove client drift, variance reduction techniques have recently been incorporated into FL optimization.
We propose an adaptive algorithm that accurately estimates drift across clients.
arXiv Detail & Related papers (2022-04-27T20:04:24Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local
Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Accelerating Federated Learning with a Global Biased Optimiser [16.69005478209394]
Federated Learning (FL) is a recent development in the field of machine learning that collaboratively trains models without the training data leaving client devices.
We propose a novel, generalised approach for applying adaptive optimisation techniques to FL with the Federated Global Biased Optimiser (FedGBO) algorithm.
FedGBO accelerates FL by applying a set of global biased optimiser values during the local training phase of FL, which helps to reduce 'client-drift' from non-IID data.
arXiv Detail & Related papers (2021-08-20T12:08:44Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity
to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving it to the cloud, algorithms such as Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)