Federated Composite Optimization
- URL: http://arxiv.org/abs/2011.08474v3
- Date: Sat, 5 Jun 2021 06:32:27 GMT
- Title: Federated Composite Optimization
- Authors: Honglin Yuan, Manzil Zaheer, Sashank Reddi
- Abstract summary: Federated Learning (FL) is a distributed learning paradigm that scales on-device learning collaboratively and privately.
Standard FL algorithms such as FedAvg are primarily geared towards smooth unconstrained settings.
We propose a new primal-dual algorithm, Federated Dual Averaging (FedDualAvg), which by employing a novel server dual averaging procedure circumvents the curse of primal averaging.
- Score: 28.11253930828807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is a distributed learning paradigm that scales
on-device learning collaboratively and privately. Standard FL algorithms such
as FedAvg are primarily geared towards smooth unconstrained settings. In this
paper, we study the Federated Composite Optimization (FCO) problem, in which
the loss function contains a non-smooth regularizer. Such problems arise
naturally in FL applications that involve sparsity, low-rank, monotonicity, or
more general constraints. We first show that straightforward extensions of
primal algorithms such as FedAvg are not well-suited for FCO since they suffer
from the "curse of primal averaging," resulting in poor convergence. As a
solution, we propose a new primal-dual algorithm, Federated Dual Averaging
(FedDualAvg), which by employing a novel server dual averaging procedure
circumvents the curse of primal averaging. Our theoretical analysis and
empirical experiments demonstrate that FedDualAvg outperforms the other
baselines.
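To make the composite setting and the "curse of primal averaging" concrete, below is a minimal NumPy sketch contrasting a FedAvg-style extension (clients run proximal-gradient steps, the server averages primal iterates) with a FedDualAvg-style round that averages dual states instead. It assumes an l1-regularized least-squares loss and the Euclidean mirror map; the helper names, step sizes, and the simplified dual update are illustrative and not the paper's exact pseudocode.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||x||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def client_grad(x, A, b):
    """Gradient of the smooth local loss 0.5 * ||A x - b||^2 / n."""
    return A.T @ (A @ x - b) / len(b)

def fed_primal_round(x_server, clients, lr=0.1, local_steps=5, lam=0.1):
    """FedAvg-style extension: each client runs proximal-gradient steps and the
    server averages *primal* iterates. Averaging sparse vectors with different
    supports produces a dense model -- the "curse of primal averaging"."""
    iterates = []
    for A, b in clients:
        x = x_server.copy()
        for _ in range(local_steps):
            x = soft_threshold(x - lr * client_grad(x, A, b), lr * lam)
        iterates.append(x)
    return np.mean(iterates, axis=0)

def fed_dual_avg_round(y_server, t_server, clients, lr=0.1, local_steps=5, lam=0.1):
    """FedDualAvg-style round (simplified): clients accumulate gradients in a
    dual state, the server averages *dual* states, and the primal model is
    recovered by a single proximal map, so sparsity survives averaging."""
    duals = []
    for A, b in clients:
        y, t = y_server.copy(), t_server
        for _ in range(local_steps):
            x = soft_threshold(-lr * y, lr * lam * max(t, 1))  # primal via prox map
            y = y + client_grad(x, A, b)                       # dual accumulation
            t += 1
        duals.append(y)
    y_new = np.mean(duals, axis=0)                             # server *dual* averaging
    t_new = t_server + local_steps
    x_new = soft_threshold(-lr * y_new, lr * lam * t_new)
    return x_new, y_new, t_new
```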
Related papers
- On Principled Local Optimization Methods for Federated Learning [2.628859872479184]
This dissertation aims to advance the theoretical foundation of local methods in the following three directions.
First, we establish sharp bounds for FedAvg, the most popular algorithm in Federated Learning.
Second, we propose Federated Accelerated Descent (FedAc), which provably improves the convergence rate and communication efficiency.
arXiv Detail & Related papers (2024-01-24T03:57:45Z)
- Escaping Saddle Points in Heterogeneous Federated Learning via Distributed SGD with Communication Compression [42.85797250438604]
We propose a novel algorithm Power-EF that only communicates compressed information via a novel error-feedback scheme.
Power-EF is the first distributed and compressed SGD algorithm that provably escapes saddle points in heterogeneous FL without any data homogeneity assumptions.
Our theory improves/recovers previous results, while extending to much more tolerant settings on the local data.
arXiv Detail & Related papers (2023-10-29T16:24:53Z)
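As context for the error-feedback scheme mentioned in the Power-EF entry above, here is a minimal sketch of distributed SGD with a biased top-k compressor and a per-client error buffer that re-injects what compression dropped. This is generic error-feedback SGD under illustrative choices (compressor, step size, function names), not Power-EF itself.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude coordinates, zero the rest (biased compressor)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_round(x, client_grads, errors, lr=0.05, k=10):
    """One communication round of error-feedback SGD: each client compresses
    (scaled gradient + its stored compression error), sends the compressed
    message, keeps the new residual locally, and the server averages and updates."""
    messages = []
    for i, g in enumerate(client_grads):
        corrected = lr * g + errors[i]   # re-inject what was lost in earlier rounds
        msg = top_k(corrected, k)        # compressed message sent to the server
        errors[i] = corrected - msg      # residual kept locally for next round
        messages.append(msg)
    return x - np.mean(messages, axis=0), errors
```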
- FedNAR: Federated Optimization with Normalized Annealing Regularization [54.42032094044368]
We explore the choices of weight decay and identify that the weight decay value appreciably influences the convergence of existing FL algorithms.
We develop Federated Optimization with Normalized Annealing Regularization (FedNAR), a plug-in that can be seamlessly integrated into any existing FL algorithm.
arXiv Detail & Related papers (2023-10-04T21:11:40Z)
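As a rough illustration of treating weight decay as a plug-in inside the local update, the sketch below anneals the weight-decay coefficient across communication rounds; the exponential schedule and constants are assumptions for illustration and are not FedNAR's actual normalization rule.

```python
def local_step_with_annealed_wd(w, grad, round_idx, lr=0.1, wd0=1e-2, decay=0.99):
    """One local SGD step with an explicit weight-decay term whose coefficient
    is annealed (decayed) as training progresses across rounds.
    Illustrative schedule only, not FedNAR's normalization rule."""
    wd = wd0 * (decay ** round_idx)     # annealed weight-decay coefficient
    return w - lr * (grad + wd * w)     # gradient step plus weight-decay pull toward 0
```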
- Preconditioned Federated Learning [7.7269332266153326]
Federated Learning (FL) is a distributed machine learning approach that enables model training in a communication-efficient and privacy-preserving manner.
FedAvg is considered to lack algorithmic adaptivity compared to modern first-order adaptive optimizers.
We propose new communication-efficient FL algorithms based on two adaptive frameworks: local adaptivity (PreFed) and server-side adaptivity (PreFedOp).
arXiv Detail & Related papers (2023-09-20T14:58:47Z)
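For the server-side flavor of adaptivity, a common pattern (used by the FedOpt/FedAdam family) is to treat the averaged client update as a pseudo-gradient and feed it to a preconditioned optimizer on the server. The sketch below uses an Adam-like rule with illustrative constants; it is not the exact PreFed/PreFedOp preconditioner.

```python
import numpy as np

def server_adaptive_step(x, client_models, m, v, lr=0.1, b1=0.9, b2=0.99, eps=1e-3):
    """Average the client models, form a pseudo-gradient, and apply an
    Adam-style preconditioned update on the server."""
    delta = x - np.mean(client_models, axis=0)    # server pseudo-gradient
    m = b1 * m + (1 - b1) * delta                 # first-moment estimate
    v = b2 * v + (1 - b2) * delta ** 2            # second-moment estimate (preconditioner)
    return x - lr * m / (np.sqrt(v) + eps), m, v
```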
- FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
arXiv Detail & Related papers (2023-02-13T05:10:30Z)
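The "restarted dual averaging" in the FedDA title refers to periodically resetting the dual (gradient-accumulator) state and re-anchoring at the current iterate. A bare-bones single-machine illustration under a Euclidean prox term follows, with the restart period and step size as assumptions rather than FedDA's actual local adaptive update.

```python
import numpy as np

def restarted_dual_averaging(grad_fn, x0, eta=0.1, steps=100, restart_every=20):
    """Dual averaging with periodic restarts: accumulate gradients (dual state),
    take a prox step relative to an anchor, and every `restart_every` steps
    reset the accumulator and move the anchor to the current iterate."""
    anchor, g_sum, x = x0.copy(), np.zeros_like(x0), x0.copy()
    for t in range(1, steps + 1):
        g_sum += grad_fn(x)                 # dual accumulation
        x = anchor - eta * g_sum            # Euclidean prox step w.r.t. the anchor
        if t % restart_every == 0:
            anchor, g_sum = x.copy(), np.zeros_like(x0)   # restart
    return x
```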
- Beyond ADMM: A Unified Client-variance-reduced Adaptive Federated Learning Framework [82.36466358313025]
We propose a primal-dual FL algorithm, termed FedVRA, that allows one to adaptively control the variance-reduction level and bias of the global model.
Experiments based on (semi-supervised) image classification tasks demonstrate the superiority of FedVRA over existing schemes.
arXiv Detail & Related papers (2022-12-03T03:27:51Z)
- Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance reduction technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
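The generic building block behind "momentum-based variance reduction" is a STORM-style estimator that corrects the momentum term with a gradient difference evaluated on the same fresh minibatch. A minimal sketch is below; the mixing coefficient is illustrative and this is not claimed to be FAFED's exact update.

```python
def mvr_estimator(grad_new, grad_prev_params_same_batch, d_prev, a=0.1):
    """STORM-style momentum-based variance-reduced gradient estimator:
    d_t = grad(x_t) + (1 - a) * (d_{t-1} - grad(x_{t-1})), where both gradients
    are evaluated on the same fresh minibatch so the correction cancels noise."""
    return grad_new + (1.0 - a) * (d_prev - grad_prev_params_same_batch)
```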
- A Convergence Theory for Federated Average: Beyond Smoothness [28.074273047592065]
Federated learning enables a large number of edge computing devices to jointly learn a model without sharing data.
As a leading algorithm in this setting, Federated Average (FedAvg), which runs Stochastic Gradient Descent (SGD) in parallel on local devices, has been widely used.
This paper provides a theoretical convergence study on Federated Learning.
arXiv Detail & Related papers (2022-11-03T04:50:49Z)
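Since FedAvg (local SGD followed by model averaging) is the object of this convergence analysis, here is a minimal sketch of one FedAvg round; the learning rate and number of local steps are chosen purely for illustration.

```python
import numpy as np

def fedavg_round(x_global, client_grad_fns, lr=0.1, local_steps=5):
    """One FedAvg round: every client starts from the global model, runs a few
    local SGD steps on its own loss, and the server averages the resulting models."""
    local_models = []
    for grad_fn in client_grad_fns:
        x = x_global.copy()
        for _ in range(local_steps):
            x = x - lr * grad_fn(x)          # local (stochastic) gradient step
        local_models.append(x)
    return np.mean(local_models, axis=0)     # server model averaging
```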
- Achieving Statistical Optimality of Federated Learning: Beyond Stationary Points [19.891597817559038]
Federated Learning (FL) is a promising framework that has great potential for privacy preservation and for lowering the computation load at the cloud.
Recent work raised concerns about two popular methods, FedAvg and FedProx: (1) their fixed points do not correspond to the stationary points of the original optimization problem, and (2) the common model found might not generalize well locally.
We show, in the general kernel regression setting, that both FedAvg and FedProx converge to the minimax-optimal error rates.
arXiv Detail & Related papers (2021-06-29T09:59:43Z)
- Hybrid Federated Learning: Algorithms and Implementation [61.0640216394349]
Federated learning (FL) is a recently proposed distributed machine learning paradigm dealing with distributed and private data sets.
We propose a new model-matching-based problem formulation for hybrid FL.
We then propose an efficient algorithm that can collaboratively train the global and local models to deal with both fully and partially featured data.
arXiv Detail & Related papers (2020-12-22T23:56:03Z)
- A Unified Linear Speedup Analysis of Federated Averaging and Nesterov FedAvg [49.76940694847521]
Federated learning (FL) learns a model jointly from a set of participating devices without sharing each other's privately held data.
In this paper, we focus on Federated Averaging (FedAvg), one of the most popular and effective FL algorithms in use today.
We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies.
arXiv Detail & Related papers (2020-07-11T05:59:08Z)
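To contrast the two schemes analyzed in the entry above, the sketch below shows a plain FedAvg server step next to a Nesterov-momentum server step applied to the averaged client delta; the momentum constant and the exact placement of the momentum term are illustrative assumptions rather than the paper's precise accelerated variant.

```python
import numpy as np

def fedavg_server_step(client_models):
    """Plain FedAvg: the new global model is the average of the client models."""
    return np.mean(client_models, axis=0)

def nesterov_fedavg_server_step(x, client_models, buf, beta=0.9):
    """Accelerated variant: treat the averaged client delta as a pseudo-gradient
    and apply Nesterov-style momentum on the server."""
    delta = np.mean(client_models, axis=0) - x    # averaged client progress
    buf = beta * buf + delta                      # momentum buffer
    return x + delta + beta * buf, buf            # look-ahead (Nesterov) update
```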
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.