On Principled Local Optimization Methods for Federated Learning
- URL: http://arxiv.org/abs/2401.13216v1
- Date: Wed, 24 Jan 2024 03:57:45 GMT
- Title: On Principled Local Optimization Methods for Federated Learning
- Authors: Honglin Yuan
- Abstract summary: This dissertation aims to advance the theoretical foundation of local methods in the following three directions.
First, we establish sharp bounds for FedAvg, the most popular algorithm in Federated Learning.
Second, we propose Federated Accelerated Stochastic Gradient Descent (FedAc), which provably improves the convergence rate and communication efficiency.
- Score: 2.628859872479184
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Federated Learning (FL), a distributed learning paradigm that scales
on-device learning collaboratively, has emerged as a promising approach for
decentralized AI applications. Local optimization methods such as Federated
Averaging (FedAvg) are the most prominent methods for FL applications. Despite
their simplicity and popularity, the theoretical understanding of local
optimization methods is far from clear. This dissertation aims to advance the
theoretical foundation of local methods in the following three directions.
First, we establish sharp bounds for FedAvg, the most popular algorithm in
Federated Learning. We demonstrate how FedAvg may suffer from a notion we call
iterate bias, and how an additional third-order smoothness assumption may
mitigate this effect and lead to better convergence rates. We explain this
phenomenon from a Stochastic Differential Equation (SDE) perspective.
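To fix ideas, here is a minimal sketch of the FedAvg round structure, assuming toy quadratic client objectives; the objectives, step counts, and learning rate are illustrative placeholders rather than anything from the dissertation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy heterogeneous clients: client m holds f_m(w) = ||A_m w - b_m||^2 / (2 n).
clients = [(rng.standard_normal((10, 5)), rng.standard_normal(10)) for _ in range(4)]

def local_sgd(w, A, b, steps=10, lr=0.05, noise=0.01):
    """Run a few noisy gradient steps on one client's local objective."""
    w = w.copy()
    for _ in range(steps):
        g = A.T @ (A @ w - b) / len(b)          # gradient of the local loss
        w -= lr * (g + noise * rng.standard_normal(w.shape))
    return w

w_global = np.zeros(5)
for _ in range(50):                             # communication rounds
    # Each client starts from the current global model and runs K local steps.
    local_models = [local_sgd(w_global, A, b) for A, b in clients]
    # FedAvg: the server averages the returned iterates (not gradients).
    w_global = np.mean(local_models, axis=0)
```

Because the server averages iterates rather than gradients, the expected averaged iterate is not, in general, the iterate that SGD on the averaged objective would produce; this discrepancy is the iterate bias discussed above.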
Second, we propose Federated Accelerated Stochastic Gradient Descent (FedAc),
the first principled acceleration of FedAvg, which provably improves the
convergence rate and communication efficiency. Our technique builds on a
potential-based perturbed iterate analysis, a novel stability analysis of
generalized accelerated SGD, and a strategic tradeoff between acceleration and
stability.
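As a rough schematic of the pattern FedAc instantiates (not its exact parameterization), each client runs a generalized accelerated SGD with two coupled sequences, and the server synchronizes both sequences by averaging. The toy objectives match the previous sketch, and eta, gamma, alpha, beta below are illustrative placeholders, not FedAc's derived choices:

```python
import numpy as np

rng = np.random.default_rng(1)
clients = [(rng.standard_normal((10, 5)), rng.standard_normal(10)) for _ in range(4)]

def grad(w, A, b):
    return A.T @ (A @ w - b) / len(b)

# Illustrative placeholders: FedAc derives eta, gamma, alpha, beta jointly
# from the problem's smoothness and strong convexity; these are not those values.
eta, gamma, alpha, beta = 0.05, 0.1, 2.0, 2.0

w = np.zeros(5)        # primary sequence
w_ag = np.zeros(5)     # auxiliary "aggregate" sequence carrying acceleration
for _ in range(50):                                     # communication rounds
    locals_ = []
    for A, b in clients:
        wm, wa = w.copy(), w_ag.copy()
        for _ in range(10):                             # K local steps
            w_md = wm / beta + (1 - 1 / beta) * wa      # momentum-coupled point
            g = grad(w_md, A, b) + 0.01 * rng.standard_normal(5)
            wa = w_md - eta * g                         # accelerated step
            wm = (1 - 1 / alpha) * wm + w_md / alpha - gamma * g
        locals_.append((wm, wa))
    # Synchronize BOTH sequences by averaging them across clients.
    w = np.mean([p[0] for p in locals_], axis=0)
    w_ag = np.mean([p[1] for p in locals_], axis=0)
```

The acceleration-stability tradeoff mentioned above surfaces here in the choice of coupling constants: more aggressive acceleration lets the clients' local sequences drift apart faster between synchronizations.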
Third, we study the Federated Composite Optimization problem, which extends
the classic smooth setting by incorporating a shared non-smooth regularizer. We
show that direct extensions of FedAvg may suffer from the "curse of primal
averaging," resulting in slow convergence. As a solution, we propose a new
primal-dual algorithm, Federated Dual Averaging, which overcomes the curse of
primal averaging by employing a novel inter-client dual averaging procedure.
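A minimal sketch of the dual-averaging idea, assuming an L1 regularizer on top of toy quadratic client losses; the step sizes and prox scaling are simplified placeholders, not the algorithm's exact schedule:

```python
import numpy as np

rng = np.random.default_rng(2)
clients = [(rng.standard_normal((10, 5)), rng.standard_normal(10)) for _ in range(4)]
lam, eta, K = 0.1, 0.05, 10     # L1 strength, step size, local steps (illustrative)

def grad(w, A, b):              # gradient of the smooth part only
    return A.T @ (A @ w - b) / len(b)

def soft_threshold(z, thr):     # prox of thr * ||.||_1 under the l2 distance
    return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)

z_global, t = np.zeros(5), 0    # shared dual state: a running gradient sum
for _ in range(50):             # communication rounds
    dual_states = []
    for A, b in clients:
        z, s = z_global.copy(), t
        for _ in range(K):      # local dual-averaging steps
            w = soft_threshold(-z, eta * (s + 1) * lam)   # primal via prox
            z += eta * grad(w, A, b)
            s += 1
        dual_states.append(z)
    # Inter-client dual averaging: average DUAL states, so the prox (and
    # hence sparsity) is applied after aggregation rather than before it.
    z_global = np.mean(dual_states, axis=0)
    t += K
w_final = soft_threshold(-z_global, eta * t * lam)
```

The design choice worth noting is that clients communicate dual (gradient-accumulator) states, so the proximal step, and hence structure such as sparsity, is applied after aggregation; averaging primal iterates instead destroys exactly this structure, which is the curse of primal averaging.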
Related papers
- Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an accelerated Decentralized Federated Learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase (a generic sketch of such a server-side extrapolation step appears after this list).
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
arXiv Detail & Related papers (2024-10-09T06:17:16Z)
- Aiding Global Convergence in Federated Learning via Local Perturbation and Mutual Similarity Information [6.767885381740953]
Federated learning has emerged as a distributed optimization paradigm.
We propose a novel modified framework wherein each client locally performs a perturbed gradient step.
We show that our algorithm speeds up convergence by as many as 30 global rounds compared with FedAvg.
arXiv Detail & Related papers (2024-10-07T23:14:05Z)
- Preconditioned Federated Learning [7.7269332266153326]
Federated Learning (FL) is a distributed machine learning approach that enables model training in a communication-efficient and privacy-preserving manner.
FedAvg is often considered to lack algorithmic adaptivity compared to modern first-order adaptive optimizers.
We propose new communication-efficient FL algorithms based on two adaptive frameworks: local adaptivity (PreFed) and server-side adaptivity (PreFedOp).
arXiv Detail & Related papers (2023-09-20T14:58:47Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
arXiv Detail & Related papers (2023-02-13T05:10:30Z)
- Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance-reduction technique in the cross-silo FL setting.
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
- Accelerated Federated Learning with Decoupled Adaptive Optimization [53.230515878096426]
The federated learning (FL) framework enables clients to collaboratively learn a shared model while keeping training data private on each client.
Recently, many efforts have been made to generalize centralized adaptive optimization methods, such as SGDM, Adam, and AdaGrad, to federated settings.
This work aims to develop novel adaptive optimization methods for FL from the perspective of the dynamics of ordinary differential equations (ODEs).
arXiv Detail & Related papers (2022-07-14T22:46:43Z)
- On Second-order Optimization Methods for Federated Learning [59.787198516188425]
We evaluate the performance of several second-order distributed methods with local steps in the federated learning setting.
We propose a novel variant that uses second-order local information for updates and a global line search to counteract the resulting local specificity.
arXiv Detail & Related papers (2021-09-06T12:04:08Z)
- Federated Composite Optimization [28.11253930828807]
Federated Learning (FL) is a distributed learning paradigm that scales on-device learning collaboratively and privately.
Standard FL algorithms such as FedAvg are primarily geared towards smooth unconstrained settings.
We propose a new primal-dual algorithm, Federated Dual Averaging (FedDualAvg), which by employing a novel server dual averaging procedure circumvents the curse of primal averaging.
arXiv Detail & Related papers (2020-11-17T06:54:06Z)
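Several entries above accelerate the server-side aggregation phase. As a generic, heavily simplified illustration referenced from the DFedCata entry (not the exact update rule of that or any other paper listed here), a Nesterov-style extrapolation step at aggregation can look like this, with `momentum` an illustrative constant:

```python
import numpy as np

def server_aggregate(client_models, w_prev, momentum=0.9):
    """One aggregation round with a Nesterov-style extrapolation step.

    `momentum` is an illustrative constant; the actual schedule (and the
    Moreau-envelope local objective of DFedCata) is not reproduced here.
    """
    w_avg = np.mean(client_models, axis=0)          # plain FedAvg-style average
    w_next = w_avg + momentum * (w_avg - w_prev)    # extrapolate along the
    return w_next, w_avg                            # direction of recent progress

# Across rounds, w_prev tracks the previous round's plain average:
#   w_broadcast, w_prev = server_aggregate(models, w_prev)
```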