Faster Adaptive Federated Learning
- URL: http://arxiv.org/abs/2212.00974v1
- Date: Fri, 2 Dec 2022 05:07:50 GMT
- Title: Faster Adaptive Federated Learning
- Authors: Xidong Wu, Feihu Huang, Zhengmian Hu, Heng Huang
- Abstract summary: Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on momentum-based variance reduced technique in cross-silo FL.
- Score: 84.38913517122619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning has attracted increasing attention with the emergence of
distributed data. While extensive federated learning algorithms have been
proposed for the non-convex distributed problem, the federated learning in
practice still faces numerous challenges, such as the large number of training iterations
needed to converge as model and dataset sizes keep increasing, and the
lack of adaptivity in SGD-based model updates. Meanwhile, the study of adaptive
methods in federated learning is scarce, and existing works either lack a
complete theoretical convergence guarantee or suffer from high sample complexity. In
this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on
the momentum-based variance reduced technique in cross-silo FL. We first
explore how to design the adaptive algorithm in the FL setting. By providing a
counter-example, we prove that a simple combination of FL and adaptive methods
could lead to divergence. More importantly, we provide a convergence analysis
for our method and prove that our algorithm is the first adaptive FL algorithm
to reach the best-known sample complexity of $O(\epsilon^{-3})$ with $O(\epsilon^{-2})$
communication rounds to find an $\epsilon$-stationary point without large
batches. The experimental results on the language modeling task and image
classification task with heterogeneous data demonstrate the efficiency of our
algorithms.
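To make the described update rule concrete, below is a minimal Python/NumPy sketch of one such round: a STORM-style momentum-based variance-reduced gradient estimator combined with an Adam-like coordinate-wise adaptive step on each client, followed by model averaging across clients. The toy objective, hyperparameters, and helper names (local_step, fafed_round) are illustrative assumptions, not the authors' reference implementation of FAFED.

```python
import numpy as np

def local_step(x, x_prev, d_prev, v, rng, alpha=0.1, beta=0.9, lr=0.05, eps=1e-8):
    """One local client step: variance-reduced momentum plus adaptive scaling (illustrative)."""
    # Toy stochastic gradient of f(x) = 0.5 * ||x||^2 with shared sample noise,
    # standing in for the client's mini-batch gradient.
    noise = 0.1 * rng.standard_normal(x.shape)
    g_new = x + noise        # grad f(x_t; xi_t)
    g_old = x_prev + noise   # grad f(x_{t-1}; xi_t), same sample at both points
    # STORM-style estimator: d_t = g_t + (1 - alpha) * (d_{t-1} - g_{t-1})
    d = g_new + (1.0 - alpha) * (d_prev - g_old)
    # Second-moment estimate used as a coordinate-wise adaptive learning rate.
    v = beta * v + (1.0 - beta) * d * d
    x_next = x - lr * d / (np.sqrt(v) + eps)
    return x_next, d, v

def fafed_round(states, rng, local_steps=5):
    """Run local adaptive steps on every client, then average models (one communication round)."""
    for s in states:
        for _ in range(local_steps):
            x_next, d, v = local_step(s["x"], s["x_prev"], s["d"], s["v"], rng)
            s["x_prev"], s["x"], s["d"], s["v"] = s["x"], x_next, d, v
    x_avg = np.mean([s["x"] for s in states], axis=0)  # server-side averaging
    for s in states:
        s["x"] = x_avg.copy()
        s["x_prev"] = x_avg.copy()
    return x_avg

rng = np.random.default_rng(0)
dim, n_clients = 10, 4
states = []
for _ in range(n_clients):
    x0 = rng.standard_normal(dim)
    states.append(dict(x=x0, x_prev=x0.copy(), d=np.zeros(dim), v=np.zeros(dim)))
for _ in range(20):
    x = fafed_round(states, rng)
print("final distance to optimum:", np.linalg.norm(x))
```

The sketch only illustrates how a variance-reduced momentum estimator and an adaptive step size can coexist with periodic averaging; the paper's analysis concerns the convergence guarantees of this combination, which a naive pairing of FL and adaptive methods does not provide.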
Related papers
- Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}\big(\ln(T) / T^{1 - \frac{1}{\alpha}}\big)$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Preconditioned Federated Learning [7.7269332266153326]
Federated Learning (FL) is a distributed machine learning approach that enables model training in a communication-efficient and privacy-preserving manner.
FedAvg has been considered to lack algorithmic adaptivity compared to modern first-order adaptive optimizers.
We propose new communication-efficient FL algorithms based on two adaptive frameworks: local adaptivity (PreFed) and server-side adaptivity (PreFedOp).
arXiv Detail & Related papers (2023-09-20T14:58:47Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted
Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
arXiv Detail & Related papers (2023-02-13T05:10:30Z) - Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive minimax optimization algorithm (i.e., AdaFGDA) to solve these minimax problems.
It builds on our momentum-based variance-reduced and local-SGD techniques, and it flexibly incorporates various adaptive learning rates.
arXiv Detail & Related papers (2022-11-14T12:32:18Z) - Faster Adaptive Momentum-Based Federated Methods for Distributed
Composition Optimization [14.579475552088692]
We propose a class of faster federated composition optimization algorithms (i.e., MFCGD and AdaMFCGD) to solve the nonconvex distributed composition problems.
In particular, our adaptive algorithm (i.e., AdaMFCGD) uses a unified adaptive matrix to flexibly incorporate various adaptive learning rates.
arXiv Detail & Related papers (2022-11-03T15:17:04Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity
to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)