Lower Bounds and Optimal Algorithms for Personalized Federated Learning
- URL: http://arxiv.org/abs/2010.02372v1
- Date: Mon, 5 Oct 2020 22:29:51 GMT
- Title: Lower Bounds and Optimal Algorithms for Personalized Federated Learning
- Authors: Filip Hanzely, Slavomír Hanzely, Samuel Horváth, Peter Richtárik
- Abstract summary: We consider the optimization formulation of personalized federated learning recently introduced by Hanzely and Richtárik (2020).
Our first contribution is establishing the first lower bounds for this formulation, for both the communication complexity and the local oracle complexity.
Our second contribution is the design of several optimal methods matching these lower bounds in almost all regimes.
- Score: 7.742297876120562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we consider the optimization formulation of personalized
federated learning recently introduced by Hanzely and Richtárik (2020) which
was shown to give an alternative explanation to the workings of local SGD
methods. Our first contribution is establishing the first lower bounds for this
formulation, for both the communication complexity and the local oracle
complexity. Our second contribution is the design of several optimal methods
matching these lower bounds in almost all regimes. These are the first provably
optimal methods for personalized federated learning. Our optimal methods
include an accelerated variant of FedProx, and an accelerated
variance-reduced version of FedAvg/Local SGD. We demonstrate the
practical superiority of our methods through extensive numerical experiments.
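For readers unfamiliar with the formulation the abstract refers to, the sketch below reconstructs it from Hanzely and Richtárik (2020); the symbols (per-client losses f_i, local models x_i, their average, and the penalty parameter lambda) are our assumptions about that formulation, not text taken from this abstract.

```latex
% Assumed objective (our reconstruction of the Hanzely & Richtarik (2020)
% formulation referenced above): client i keeps its own model x_i, and the
% penalty weight lambda >= 0 interpolates between purely local training
% (lambda -> 0) and a single shared global model (lambda -> infinity).
\min_{x_1,\dots,x_n \in \mathbb{R}^d} \;
  \frac{1}{n}\sum_{i=1}^{n} f_i(x_i)
  \;+\; \frac{\lambda}{2n}\sum_{i=1}^{n} \left\| x_i - \bar{x} \right\|^2,
\qquad \bar{x} := \frac{1}{n}\sum_{i=1}^{n} x_i .
```

The paper's lower bounds and optimal methods (accelerated FedProx, accelerated variance-reduced FedAvg/Local SGD) are stated for this kind of objective; a generic, non-accelerated FedAvg/Local SGD round is sketched after the related-papers list below.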
Related papers
- A Learn-to-Optimize Approach for Coordinate-Wise Step Sizes for Quasi-Newton Methods [9.82454981262489]
We introduce a learn-to-optimize (L2O) method that employs LSTM-based networks to learn optimal step sizes. Our approach achieves substantial improvements over scalar step size methods and hypergradient descent-based methods.
arXiv Detail & Related papers (2024-11-25T07:13:59Z) - Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling [96.47086913559289]
Gradient-based algorithms are widely used in bilevel optimization.
We introduce a without-replacement sampling based algorithm which achieves a faster convergence rate.
We validate our algorithms over both synthetic and real-world applications.
arXiv Detail & Related papers (2024-11-07T17:05:31Z) - Towards Differentiable Multilevel Optimization: A Gradient-Based Approach [1.6114012813668932]
This paper introduces a novel gradient-based approach for multilevel optimization.
Our method significantly reduces computational complexity while improving both solution accuracy and convergence speed.
To the best of our knowledge, this is one of the first algorithms to provide a general version of implicit differentiation.
arXiv Detail & Related papers (2024-10-15T06:17:59Z) - On Principled Local Optimization Methods for Federated Learning [2.628859872479184]
This dissertation aims to advance the theoretical foundation of local methods in the following three directions.
First, we establish sharp bounds for FedAvg, the most popular algorithm in Federated Learning.
Second, we propose Federated Accelerated Descent (FedAc), which provably improves the convergence rate and communication efficiency.
arXiv Detail & Related papers (2024-01-24T03:57:45Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted
Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
arXiv Detail & Related papers (2023-02-13T05:10:30Z) - Versatile Single-Loop Method for Gradient Estimator: First and Second
Order Optimality, and its Application to Federated Learning [45.78238792836363]
We present a single-loop algorithm named SLEDGE (Single-Loop-E Gradient Estimator) for periodic convergence.
Unlike existing methods, SLEDGE has the advantage of versatility: (i) second-order optimality, (ii) exponential convergence in the PL region, and (iii) smaller complexity under less heterogeneity of data.
arXiv Detail & Related papers (2022-09-01T11:05:26Z) - Accelerated Federated Learning with Decoupled Adaptive Optimization [53.230515878096426]
The federated learning (FL) framework enables clients to collaboratively learn a shared model while preserving the privacy of training data on clients.
Recently, many efforts have been made to generalize centralized adaptive optimization methods, such as SGDM, Adam, AdaGrad, etc., to federated settings.
This work aims to develop novel adaptive optimization methods for FL from the perspective of the dynamics of ordinary differential equations (ODEs).
arXiv Detail & Related papers (2022-07-14T22:46:43Z) - On Second-order Optimization Methods for Federated Learning [59.787198516188425]
We evaluate the performance of several second-order distributed methods with local steps in the federated learning setting.
We propose a novel variant that uses second-order local information for updates and a global line search to counteract the resulting local specificity.
arXiv Detail & Related papers (2021-09-06T12:04:08Z) - Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex
Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We design two optimal algorithms that attain these lower bounds.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z) - Distributed Second Order Methods with Fast Rates and Compressed
Communication [6.069611493148631]
We develop several new communication-efficient second-order methods for distributed optimization.
We prove global sublinear and linear convergence rates, and a fast superlinear rate.
Our results are supported with experimental results on real datasets.
arXiv Detail & Related papers (2021-02-14T14:06:45Z) - FedSplit: An algorithmic framework for fast federated optimization [40.42352500741025]
We introduce FedSplit, a class of algorithms for solving distributed convex minimization with additive structure.
Our theory shows that these methods are provably robust to inexact computation of intermediate local quantities.
arXiv Detail & Related papers (2020-05-11T16:30:09Z)
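Since the optimal methods of the main paper, as well as several related entries above (e.g., the sharp FedAvg bounds and FedAc), build on FedAvg/Local SGD, here is a minimal, hypothetical Python sketch of one generic FedAvg round on synthetic least-squares clients. The data, step size, and number of local steps are illustrative assumptions; this shows only the plain local-steps-then-average pattern, not the accelerated or variance-reduced variants analyzed in these papers.

```python
# Minimal FedAvg / Local SGD sketch (illustration only): synthetic least-squares
# clients, full-batch local gradient steps, uniform server averaging.
import numpy as np

rng = np.random.default_rng(0)
n_clients, d, local_steps, rounds, lr = 10, 5, 4, 50, 0.01

# Client i holds data (A_i, b_i) defining f_i(x) = 0.5 * ||A_i x - b_i||^2.
clients = [(rng.normal(size=(20, d)), rng.normal(size=20)) for _ in range(n_clients)]

def local_grad(x, A, b):
    """Gradient of the client's least-squares loss at x."""
    return A.T @ (A @ x - b)

x_global = np.zeros(d)
for _ in range(rounds):
    local_models = []
    for A, b in clients:
        x = x_global.copy()
        for _ in range(local_steps):          # a few local gradient steps
            x -= lr * local_grad(x, A, b)
        local_models.append(x)
    x_global = np.mean(local_models, axis=0)  # server averages the local models

avg_loss = np.mean([0.5 * np.linalg.norm(A @ x_global - b) ** 2
                    for A, b in clients])
print(f"average client loss after {rounds} rounds: {avg_loss:.4f}")
```

A personalized variant in the spirit of the main paper would keep a per-client model x_i and only partially pull it toward the average (e.g., via a proximal or penalty step weighted by lambda) instead of overwriting it with the global average each round.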
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.