Efficient Federated Learning via Local Adaptive Amended Optimizer with
Linear Speedup
- URL: http://arxiv.org/abs/2308.00522v1
- Date: Sun, 30 Jul 2023 14:53:21 GMT
- Title: Efficient Federated Learning via Local Adaptive Amended Optimizer with
Linear Speedup
- Authors: Yan Sun, Li Shen, Hao Sun, Liang Ding and Dacheng Tao
- Abstract summary: We propose a novel momentum-based algorithm that utilizes global gradient descent together with a locally adaptive amended optimizer.
FedLADA could greatly reduce the communication rounds and achieves higher accuracy than several baselines.
- Score: 90.26270347459915
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adaptive optimization has achieved notable success for distributed learning,
yet extending adaptive optimizers to federated learning (FL) suffers from
severe inefficiencies, including (i) rugged convergence due to inaccurate
gradient estimation in the global adaptive optimizer; and (ii) client drifts
exacerbated by local over-fitting with the local adaptive optimizer. In this
work, we propose a novel momentum-based algorithm that utilizes global
gradient descent and a locally adaptive amended optimizer to tackle these
difficulties. Specifically, we incorporate a locally amended technique into the
adaptive optimizer, named the Federated Local ADaptive Amended optimizer
(\textit{FedLADA}), which estimates the global average offset in the previous
communication round and corrects the local offset through a momentum-like term
to further improve the empirical training speed and mitigate the heterogeneous
over-fitting. Theoretically, we establish the convergence rate of
\textit{FedLADA} with a linear speedup property in the non-convex case under
partial participation settings. Moreover, we conduct extensive experiments
on real-world datasets to demonstrate the efficacy of our proposed
\textit{FedLADA}, which could greatly reduce the communication rounds and
achieve higher accuracy than several baselines.
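The amended local step described above can be illustrated with a minimal sketch. This is not the paper's exact update rule: the function names, the Adam-style preconditioner, the blending coefficient, and the offset-estimation formula are all our assumptions, chosen only to show the idea of correcting a locally adaptive step with a momentum-like global offset from the previous round.

```python
import numpy as np

def fedlada_round_sketch(global_w, client_data, global_offset, lr=0.02,
                         beta=0.9, local_steps=10, eps=1e-8):
    """One communication round of a FedLADA-style update (illustrative only).

    `global_offset` approximates the average model change across clients in the
    previous round; each client's locally adaptive step is amended by a
    momentum-like blend with this global direction to curb client drift.
    """
    new_ws = []
    for X, y in client_data:
        w = global_w.copy()
        v = np.zeros_like(w)  # second-moment accumulator (Adam-like, assumed form)
        for _ in range(local_steps):
            g = X.T @ (X @ w - y) / len(y)      # least-squares gradient (toy task)
            v = beta * v + (1 - beta) * g * g   # adaptive preconditioner state
            adaptive_step = g / (np.sqrt(v) + eps)
            # Momentum-like amendment: blend the local adaptive direction with
            # the estimated global offset from the previous round.
            w -= lr * (beta * global_offset + (1 - beta) * adaptive_step)
        new_ws.append(w)
    new_global = np.mean(new_ws, axis=0)
    # Updated estimate of the global average offset for the next round.
    new_offset = (global_w - new_global) / (lr * local_steps)
    return new_global, new_offset
```

Because every client mixes in the same `global_offset`, local trajectories are pulled toward a shared descent direction, which is the mechanism the abstract credits for mitigating heterogeneous over-fitting.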
Related papers
- Federated Zeroth-Order Optimization using Trajectory-Informed Surrogate
Gradients [31.674600866528788]
We introduce the trajectory-informed surrogate gradients (FZooS) algorithm for query- and communication-efficient federated zeroth-order optimization (ZOO).
FZooS achieves theoretical improvements over existing approaches, which is supported by real-world experiments such as federated black-box adversarial attack and federated non-differentiable metric optimization.
arXiv Detail & Related papers (2023-08-08T06:26:54Z) - Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape [59.841889495864386]
In federated learning (FL), a cluster of local clients is coordinated by a global server.
Clients are prone to overfit to their own optima, which can deviate significantly from the global objective.
FedSMOO adopts a dynamic regularizer to steer the local optima towards the global objective.
Our theoretical analysis indicates that FedSMOO achieves a fast $\mathcal{O}(1/T)$ convergence rate with a low generalization bound.
arXiv Detail & Related papers (2023-05-19T10:47:44Z) - FedSpeed: Larger Local Interval, Less Communication Round, and Higher
Generalization Accuracy [84.45004766136663]
Federated learning is an emerging distributed machine learning framework.
It suffers from non-vanishing biases introduced by locally inconsistent optima and rugged client drifts caused by local over-fitting.
We propose a novel and practical method, FedSpeed, to alleviate the negative impacts posed by these problems.
arXiv Detail & Related papers (2023-02-21T03:55:29Z) - Accelerated Federated Learning with Decoupled Adaptive Optimization [53.230515878096426]
The federated learning (FL) framework enables clients to collaboratively learn a shared model while keeping the training data private on clients.
Recently, many efforts have been made to generalize centralized adaptive optimization methods, such as SGDM, Adam, AdaGrad, etc., to federated settings.
This work aims to develop novel adaptive optimization methods for FL from the perspective of the dynamics of ordinary differential equations (ODEs).
arXiv Detail & Related papers (2022-07-14T22:46:43Z) - Disentangled Federated Learning for Tackling Attributes Skew via
Invariant Aggregation and Diversity Transferring [104.19414150171472]
Attribute skew diverts current federated learning (FL) frameworks from consistent optimization directions among the clients.
We propose disentangled federated learning (DFL) to disentangle the domain-specific and cross-invariant attributes into two complementary branches.
Experiments verify that DFL facilitates FL with higher performance, better interpretability, and faster convergence rate, compared with SOTA FL methods.
arXiv Detail & Related papers (2022-06-14T13:12:12Z) - AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias
Estimation [12.62716075696359]
In Federated Learning (FL), a number of clients or devices collaborate to train a model without sharing their data.
In order to estimate and thereby remove client drift, variance reduction techniques have recently been incorporated into FL optimization.
We propose an adaptive algorithm that accurately estimates drift across clients.
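The variance-reduction idea behind such drift estimation can be sketched with a SCAFFOLD-style control variate. This is an illustrative stand-in, not AdaBest's actual adaptive estimator: the function names and the control-variate update form below are assumptions based on the standard control-variate recipe.

```python
import numpy as np

def local_update_with_drift_correction(w_global, grad_fn, c_local, c_global,
                                       lr=0.1, local_steps=10):
    """SCAFFOLD-style variance-reduced local update (illustrative sketch).

    Each local gradient is corrected by (c_global - c_local), an estimate of
    how this client's gradient drifts from the population average.
    """
    w = w_global.copy()
    for _ in range(local_steps):
        g = grad_fn(w)
        w -= lr * (g - c_local + c_global)  # remove estimated client drift
    # Refresh the client's control variate from the realized local progress.
    new_c_local = c_local - c_global + (w_global - w) / (lr * local_steps)
    return w, new_c_local
```

On heterogeneous quadratics, iterating this round with server-side averaging of models and control variates drives the global model to the true joint optimum rather than to the biased average of per-client optima, which is exactly the inconsistency that uncorrected local steps leave behind.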
arXiv Detail & Related papers (2022-04-27T20:04:24Z) - Accelerating Federated Learning with a Global Biased Optimiser [16.69005478209394]
Federated Learning (FL) is a recent development in the field of machine learning that collaboratively trains models without the training data leaving client devices.
We propose a novel, generalised approach for applying adaptive optimisation techniques to FL with the Federated Global Biased Optimiser (FedGBO) algorithm.
FedGBO accelerates FL by applying a set of global biased optimiser values during the local training phase of FL, which helps to reduce 'client-drift' from non-IID data.
arXiv Detail & Related papers (2021-08-20T12:08:44Z) - Local Adaptivity in Federated Learning: Convergence and Consistency [25.293584783673413]
Federated learning (FL) framework trains a machine learning model using decentralized data stored at edge client devices by periodically aggregating locally trained models.
We show in both theory and practice that while local adaptive methods can accelerate convergence, they can cause a non-vanishing solution bias.
We propose correction techniques to overcome this inconsistency and complement the local adaptive methods for FL.
arXiv Detail & Related papers (2021-06-04T07:36:59Z) - Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
FedGLOMO is the first (first-order) FL algorithm to apply momentum both globally and locally.
Our algorithm is provably optimal even with compressed communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.