Federated Composite Optimization
- URL: http://arxiv.org/abs/2011.08474v3
- Date: Sat, 5 Jun 2021 06:32:27 GMT
- Title: Federated Composite Optimization
- Authors: Honglin Yuan, Manzil Zaheer, Sashank Reddi
- Abstract summary: Federated Learning (FL) is a distributed learning paradigm that scales on-device learning collaboratively and privately.
Standard FL algorithms such as FedAvg are primarily geared towards smooth unconstrained settings.
We propose a new primal-dual algorithm, Federated Dual Averaging (FedDualAvg), which by employing a novel server dual averaging procedure circumvents the curse of primal averaging.
- Score: 28.11253930828807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is a distributed learning paradigm that scales
on-device learning collaboratively and privately. Standard FL algorithms such
as FedAvg are primarily geared towards smooth unconstrained settings. In this
paper, we study the Federated Composite Optimization (FCO) problem, in which
the loss function contains a non-smooth regularizer. Such problems arise
naturally in FL applications that involve sparsity, low-rank, monotonicity, or
more general constraints. We first show that straightforward extensions of
primal algorithms such as FedAvg are not well-suited for FCO since they suffer
from the "curse of primal averaging," resulting in poor convergence. As a
solution, we propose a new primal-dual algorithm, Federated Dual Averaging
(FedDualAvg), which by employing a novel server dual averaging procedure
circumvents the curse of primal averaging. Our theoretical analysis and
empirical experiments demonstrate that FedDualAvg outperforms the other
baselines.
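To make the composite setting and the "curse of primal averaging" concrete, below is a minimal NumPy sketch contrasting a FedAvg-style extension (clients run proximal-gradient steps, the server averages primal iterates) with a FedDualAvg-style round that averages dual states instead. It assumes an l1-regularized least-squares loss and the Euclidean mirror map; the helper names, step sizes, and the simplified dual update are illustrative and not the paper's exact pseudocode.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||x||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def client_grad(x, A, b):
    """Gradient of the smooth local loss 0.5 * ||A x - b||^2 / n."""
    return A.T @ (A @ x - b) / len(b)

def fed_primal_round(x_server, clients, lr=0.1, local_steps=5, lam=0.1):
    """FedAvg-style extension: each client runs proximal-gradient steps and the
    server averages *primal* iterates. Averaging sparse vectors with different
    supports produces a dense model -- the "curse of primal averaging"."""
    iterates = []
    for A, b in clients:
        x = x_server.copy()
        for _ in range(local_steps):
            x = soft_threshold(x - lr * client_grad(x, A, b), lr * lam)
        iterates.append(x)
    return np.mean(iterates, axis=0)

def fed_dual_avg_round(y_server, t_server, clients, lr=0.1, local_steps=5, lam=0.1):
    """FedDualAvg-style round (simplified): clients accumulate gradients in a
    dual state, the server averages *dual* states, and the primal model is
    recovered by a single proximal map, so sparsity survives averaging."""
    duals = []
    for A, b in clients:
        y, t = y_server.copy(), t_server
        for _ in range(local_steps):
            x = soft_threshold(-lr * y, lr * lam * max(t, 1))  # primal via prox map
            y = y + client_grad(x, A, b)                       # dual accumulation
            t += 1
        duals.append(y)
    y_new = np.mean(duals, axis=0)                             # server *dual* averaging
    t_new = t_server + local_steps
    x_new = soft_threshold(-lr * y_new, lr * lam * t_new)
    return x_new, y_new, t_new
```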
Related papers
- On Principled Local Optimization Methods for Federated Learning [2.628859872479184]
This dissertation aims to advance the theoretical foundation of local methods in the following three directions.
First, we establish sharp bounds for FedAvg, the most popular algorithm in Federated Learning.
Second, we propose Federated Accelerated Descent (FedAc), which provably improves the convergence rate and communication efficiency.
arXiv Detail & Related papers (2024-01-24T03:57:45Z)
- Escaping Saddle Points in Heterogeneous Federated Learning via Distributed SGD with Communication Compression [42.85797250438604]
We propose a novel algorithm Power-EF that only communicates compressed information via a novel error-feedback scheme.
Power-EF is the first distributed and compressed SGD algorithm that provably escapes saddle points in heterogeneous FL without any data homogeneity assumptions.
Our theory improves/recovers previous results, while extending to much more tolerant settings on the local data.
arXiv Detail & Related papers (2023-10-29T16:24:53Z)
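As context for the error-feedback scheme mentioned in the Power-EF entry above, here is a minimal sketch of distributed SGD with a biased top-k compressor and a per-client error buffer that re-injects what compression dropped. This is generic error-feedback SGD under illustrative choices (compressor, step size, function names), not Power-EF itself.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude coordinates, zero the rest (biased compressor)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_round(x, client_grads, errors, lr=0.05, k=10):
    """One communication round of error-feedback SGD: each client compresses
    (scaled gradient + its stored compression error), sends the compressed
    message, keeps the new residual locally, and the server averages and updates."""
    messages = []
    for i, g in enumerate(client_grads):
        corrected = lr * g + errors[i]   # re-inject what was lost in earlier rounds
        msg = top_k(corrected, k)        # compressed message sent to the server
        errors[i] = corrected - msg      # residual kept locally for next round
        messages.append(msg)
    return x - np.mean(messages, axis=0), errors
```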
- FedNAR: Federated Optimization with Normalized Annealing Regularization [54.42032094044368]
We explore the choices of weight decay and identify that the weight decay value appreciably influences the convergence of existing FL algorithms.
We develop Federated Optimization with Normalized Annealing Regularization (FedNAR), a plug-in that can be seamlessly integrated into any existing FL algorithm.
arXiv Detail & Related papers (2023-10-04T21:11:40Z)
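As a rough illustration of treating weight decay as a plug-in inside the local update, the sketch below anneals the weight-decay coefficient across communication rounds; the exponential schedule and constants are assumptions for illustration and are not FedNAR's actual normalization rule.

```python
def local_step_with_annealed_wd(w, grad, round_idx, lr=0.1, wd0=1e-2, decay=0.99):
    """One local SGD step with an explicit weight-decay term whose coefficient
    is annealed (decayed) as training progresses across rounds.
    Illustrative schedule only, not FedNAR's normalization rule."""
    wd = wd0 * (decay ** round_idx)     # annealed weight-decay coefficient
    return w - lr * (grad + wd * w)     # gradient step plus weight-decay pull toward 0
```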
- Preconditioned Federated Learning [7.7269332266153326]
Federated Learning (FL) is a distributed machine learning approach that enables model training in a communication-efficient and privacy-preserving manner.
FedAvg is considered to lack algorithmic adaptivity compared to modern first-order adaptive optimizers.
We propose new communication-efficient FL algorithms based on two adaptive frameworks: local adaptivity (PreFed) and server-side adaptivity (PreFedOp).
arXiv Detail & Related papers (2023-09-20T14:58:47Z)
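For the server-side flavor of adaptivity, a common pattern (used by the FedOpt/FedAdam family) is to treat the averaged client update as a pseudo-gradient and feed it to a preconditioned optimizer on the server. The sketch below uses an Adam-like rule with illustrative constants; it is not the exact PreFed/PreFedOp preconditioner.

```python
import numpy as np

def server_adaptive_step(x, client_models, m, v, lr=0.1, b1=0.9, b2=0.99, eps=1e-3):
    """Average the client models, form a pseudo-gradient, and apply an
    Adam-style preconditioned update on the server."""
    delta = x - np.mean(client_models, axis=0)    # server pseudo-gradient
    m = b1 * m + (1 - b1) * delta                 # first-moment estimate
    v = b2 * v + (1 - b2) * delta ** 2            # second-moment estimate (preconditioner)
    return x - lr * m / (np.sqrt(v) + eps), m, v
```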
- FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
arXiv Detail & Related papers (2023-02-13T05:10:30Z)
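The "restarted dual averaging" in the FedDA title refers to periodically resetting the dual (gradient-accumulator) state and re-anchoring at the current iterate. A bare-bones single-machine illustration under a Euclidean prox term follows, with the restart period and step size as assumptions rather than FedDA's actual local adaptive update.

```python
import numpy as np

def restarted_dual_averaging(grad_fn, x0, eta=0.1, steps=100, restart_every=20):
    """Dual averaging with periodic restarts: accumulate gradients (dual state),
    take a prox step relative to an anchor, and every `restart_every` steps
    reset the accumulator and move the anchor to the current iterate."""
    anchor, g_sum, x = x0.copy(), np.zeros_like(x0), x0.copy()
    for t in range(1, steps + 1):
        g_sum += grad_fn(x)                 # dual accumulation
        x = anchor - eta * g_sum            # Euclidean prox step w.r.t. the anchor
        if t % restart_every == 0:
            anchor, g_sum = x.copy(), np.zeros_like(x0)   # restart
    return x
```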
- Beyond ADMM: A Unified Client-variance-reduced Adaptive Federated Learning Framework [82.36466358313025]
We propose a primal-dual FL algorithm, termed FedVRA, that allows one to adaptively control the variance-reduction level and bias of the global model.
Experiments based on (semi-supervised) image classification tasks demonstrate the superiority of FedVRA over existing schemes.
arXiv Detail & Related papers (2022-12-03T03:27:51Z)
- Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance reduction technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
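The generic building block behind "momentum-based variance reduction" is a STORM-style estimator that corrects the momentum term with a gradient difference evaluated on the same fresh minibatch. A minimal sketch is below; the mixing coefficient is illustrative and this is not claimed to be FAFED's exact update.

```python
def mvr_estimator(grad_new, grad_prev_params_same_batch, d_prev, a=0.1):
    """STORM-style momentum-based variance-reduced gradient estimator:
    d_t = grad(x_t) + (1 - a) * (d_{t-1} - grad(x_{t-1})), where both gradients
    are evaluated on the same fresh minibatch so the correction cancels noise."""
    return grad_new + (1.0 - a) * (d_prev - grad_prev_params_same_batch)
```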
- A Convergence Theory for Federated Average: Beyond Smoothness [28.074273047592065]
Federated learning enables a large number of edge computing devices to jointly learn a model without sharing data.
As a leading algorithm in this setting, Federated Average (FedAvg), which runs Stochastic Gradient Descent (SGD) in parallel on local devices, has been widely used.
This paper provides a theoretical convergence study on Federated Learning.
arXiv Detail & Related papers (2022-11-03T04:50:49Z)
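Since FedAvg (local SGD followed by model averaging) is the object of this convergence analysis, here is a minimal sketch of one FedAvg round; the learning rate and number of local steps are chosen purely for illustration.

```python
import numpy as np

def fedavg_round(x_global, client_grad_fns, lr=0.1, local_steps=5):
    """One FedAvg round: every client starts from the global model, runs a few
    local SGD steps on its own loss, and the server averages the resulting models."""
    local_models = []
    for grad_fn in client_grad_fns:
        x = x_global.copy()
        for _ in range(local_steps):
            x = x - lr * grad_fn(x)          # local (stochastic) gradient step
        local_models.append(x)
    return np.mean(local_models, axis=0)     # server model averaging
```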
- Achieving Statistical Optimality of Federated Learning: Beyond Stationary Points [19.891597817559038]
Federated Learning (FL) is a promising framework that has great potential for privacy preservation and for lowering the computation load at the cloud.
Recent work raised concerns about two popular methods, FedAvg and FedProx: (1) their fixed points do not correspond to the stationary points of the original optimization problem, and (2) the common model found might not generalize well locally.
We show, in the general kernel regression setting, that both FedAvg and FedProx converge to the minimax-optimal error rates.
arXiv Detail & Related papers (2021-06-29T09:59:43Z)
- Hybrid Federated Learning: Algorithms and Implementation [61.0640216394349]
Federated learning (FL) is a recently proposed distributed machine learning paradigm dealing with distributed and private data sets.
We propose a new model-matching-based problem formulation for hybrid FL.
We then propose an efficient algorithm that can collaboratively train the global and local models to deal with both fully and partially featured data.
arXiv Detail & Related papers (2020-12-22T23:56:03Z)
- A Unified Linear Speedup Analysis of Federated Averaging and Nesterov FedAvg [49.76940694847521]
Federated learning (FL) learns a model jointly from a set of participating devices without sharing each other's privately held data.
In this paper, we focus on Federated Averaging (FedAvg), one of the most popular and effective FL algorithms in use today.
We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies.
arXiv Detail & Related papers (2020-07-11T05:59:08Z)
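To contrast the two schemes analyzed in the entry above, the sketch below shows a plain FedAvg server step next to a Nesterov-momentum server step applied to the averaged client delta; the momentum constant and the exact placement of the momentum term are illustrative assumptions rather than the paper's precise accelerated variant.

```python
import numpy as np

def fedavg_server_step(client_models):
    """Plain FedAvg: the new global model is the average of the client models."""
    return np.mean(client_models, axis=0)

def nesterov_fedavg_server_step(x, client_models, buf, beta=0.9):
    """Accelerated variant: treat the averaged client delta as a pseudo-gradient
    and apply Nesterov-style momentum on the server."""
    delta = np.mean(client_models, axis=0) - x    # averaged client progress
    buf = beta * buf + delta                      # momentum buffer
    return x + delta + beta * buf, buf            # look-ahead (Nesterov) update
```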
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.