Achieving Statistical Optimality of Federated Learning: Beyond
Stationary Points
- URL: http://arxiv.org/abs/2106.15216v1
- Date: Tue, 29 Jun 2021 09:59:43 GMT
- Title: Achieving Statistical Optimality of Federated Learning: Beyond
Stationary Points
- Authors: Lili Su, Jiaming Xu, Pengkun Yang
- Abstract summary: Federated Learning (FL) is a promising framework with great potential for privacy preservation and for lowering the computation load at the cloud.
Recent work raised concerns about two widely adopted methods, FedAvg and FedProx: (1) their fixed points do not correspond to stationary points of the original optimization problem, and (2) the common model found might not generalize well locally.
We show, in the general kernel regression setting, that both FedAvg and FedProx converge to the minimax-optimal error rates.
- Score: 19.891597817559038
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is a promising framework with great potential for
privacy preservation and for lowering the computation load at the cloud. FedAvg
and FedProx are two widely adopted algorithms. However, recent work raised
concerns about these two methods: (1) their fixed points do not correspond to
stationary points of the original optimization problem, and (2) the common
model found might not generalize well locally.
In this paper, we alleviate these concerns. Towards this, we adopt the
statistical learning perspective yet allow the distributions to be
heterogeneous and the local data to be unbalanced. We show, in the general
kernel regression setting, that both FedAvg and FedProx converge to the
minimax-optimal error rates. Moreover, when the kernel function has a finite
rank, the convergence is exponentially fast. Our results further analytically
quantify the impact of model heterogeneity and characterize the federation
gain - the reduction in estimation error that a worker obtains by joining
federated learning, compared with its best local estimator. To the best of our knowledge, we
are the first to show the achievability of minimax error rates under FedAvg and
FedProx, and the first to characterize the gains in joining FL. Numerical
experiments further corroborate our theoretical findings on the statistical
optimality of FedAvg and FedProx and the federation gains.
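Below is a minimal, self-contained Python sketch (NumPy only) of the kind of setup the abstract describes: heterogeneous workers running FedAvg or FedProx local updates for a finite-rank kernel regression problem, followed by an illustrative computation of one worker's federation gain as the reduction in estimation error over its best local estimator. The cosine feature map, the synthetic data-generating model, and the helper names (phi, local_update, run_federated) are assumptions made for illustration, not the paper's exact construction.
```python
# Minimal sketch (not the paper's exact setup): FedAvg and FedProx local updates
# for kernel regression with a finite-rank kernel, realized here as linear
# regression on a fixed cosine feature map phi(x). All modeling choices below
# are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def phi(x, rank=5):
    # finite-rank feature map: first `rank` cosine features (an assumption)
    return np.cos(np.outer(x, np.arange(1, rank + 1)))

def make_worker(n, shift):
    # heterogeneous workers: each has its own shifted regression coefficients
    x = rng.uniform(0.0, 1.0, n)
    theta_star = np.array([1.0, 0.5, 0.25, 0.0, 0.0]) + shift
    y = phi(x) @ theta_star + 0.1 * rng.standard_normal(n)
    return phi(x), y, theta_star

def local_update(theta, X, y, lr=0.1, steps=20, mu=0.0, theta_global=None):
    # gradient steps on the local least-squares loss; mu > 0 adds the FedProx
    # proximal term (mu/2) * ||theta - theta_global||^2, mu = 0 recovers FedAvg
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / len(y)
        if mu > 0.0:
            grad = grad + mu * (theta - theta_global)
        theta = theta - lr * grad
    return theta

workers = [make_worker(n=30, shift=0.1 * rng.standard_normal(5)) for _ in range(10)]

def run_federated(mu=0.0, rounds=50):
    theta = np.zeros(5)
    sizes = np.array([len(y) for _, y, _ in workers], dtype=float)
    for _ in range(rounds):
        locals_ = [local_update(theta.copy(), X, y, mu=mu, theta_global=theta)
                   for X, y, _ in workers]
        # size-weighted averaging of local models (unbalanced local data allowed)
        theta = np.average(locals_, axis=0, weights=sizes)
    return theta

def estimation_error(theta, worker):
    _, _, theta_star = worker
    return float(np.sum((theta - theta_star) ** 2))

theta_fedavg = run_federated(mu=0.0)
theta_fedprox = run_federated(mu=0.1)
# best purely local estimator for worker 0: least squares on its own data only
X0, y0, _ = workers[0]
theta_local = np.linalg.lstsq(X0, y0, rcond=None)[0]

# illustrative "federation gain" for worker 0: reduction of estimation error
# from joining the federation versus using only its own data
gain = estimation_error(theta_local, workers[0]) - estimation_error(theta_fedavg, workers[0])
print("FedAvg error:     ", estimation_error(theta_fedavg, workers[0]))
print("FedProx error:    ", estimation_error(theta_fedprox, workers[0]))
print("best-local error: ", estimation_error(theta_local, workers[0]))
print("federation gain (worker 0):", gain)
```
Setting mu = 0 in local_update gives the FedAvg local step, while mu > 0 pulls the local iterate toward the current global model as in FedProx; the size-weighted averaging reflects the unbalanced local data the paper's setting allows. The exact definition of the federation gain and of the error metric here are simplified stand-ins for the quantities analyzed in the paper.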
Related papers
- Can Decentralized Stochastic Minimax Optimization Algorithms Converge
Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its applications in a wide range of machine learning problems.
This paper develops two novel decentralized minimax optimization algorithms for finite-sum nonconvex-nonconcave problems.
arXiv Detail & Related papers (2023-04-24T02:19:39Z) - FedSkip: Combatting Statistical Heterogeneity with Federated Skip
Aggregation [95.85026305874824]
We introduce a data-driven approach called FedSkip to improve the client optima by periodically skipping federated averaging and scattering local models across devices.
We conduct extensive experiments on a range of datasets to demonstrate that FedSkip achieves much higher accuracy, better aggregation efficiency, and competitive communication efficiency.
arXiv Detail & Related papers (2022-12-14T13:57:01Z) - On Convergence of FedProx: Local Dissimilarity Invariant Bounds,
Non-smoothness and Beyond [41.082982732100696]
We develop a novel local dissimilarity invariant convergence theory for FedProx and its minibatch extension through the lens of algorithmic stability.
Preliminary experimental results on a series of benchmark FL datasets are reported to demonstrate the benefit of minibatching for improving the sample efficiency of FedProx.
arXiv Detail & Related papers (2022-06-10T15:35:10Z) - On the Unreasonable Effectiveness of Federated Averaging with
Heterogeneous Data [39.600069116159695]
Existing theory predicts that data heterogeneity will degrade the performance of the Federated Averaging (FedAvg) algorithm in federated learning.
This paper explains the seemingly unreasonable effectiveness of FedAvg that contradicts the previous theoretical predictions.
arXiv Detail & Related papers (2022-06-09T18:25:25Z) - Local Learning Matters: Rethinking Data Heterogeneity in Federated
Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z) - On the Practicality of Differential Privacy in Federated Learning by
Tuning Iteration Times [51.61278695776151]
Federated Learning (FL) is well known for its privacy protection when training machine learning models among distributed clients collaboratively.
Recent studies have pointed out that the naive FL is susceptible to gradient leakage attacks.
Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks.
arXiv Detail & Related papers (2021-01-11T19:43:12Z) - Federated Composite Optimization [28.11253930828807]
Federated Learning (FL) is a distributed learning paradigm that scales on-device learning collaboratively and privately.
Standard FL algorithms such as FedAvg are primarily geared towards smooth unconstrained settings.
We propose a new primal-dual algorithm, Federated Dual Averaging (FedDualAvg), which by employing a novel server dual averaging procedure circumvents the curse of primal averaging.
arXiv Detail & Related papers (2020-11-17T06:54:06Z) - A Unified Linear Speedup Analysis of Federated Averaging and Nesterov
FedAvg [49.76940694847521]
Federated learning (FL) learns a model jointly from a set of participating devices without sharing each other's privately held data.
In this paper, we focus on Federated Averaging (FedAvg), one of the most popular and effective FL algorithms in use today.
We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies.
arXiv Detail & Related papers (2020-07-11T05:59:08Z) - FedDANE: A Federated Newton-Type Method [49.9423212899788]
Federated learning aims to jointly learn statistical models over massively distributed datasets.
We propose FedDANE, an optimization method that we adapt from DANE to handle federated learning.
arXiv Detail & Related papers (2020-01-07T07:44:41Z)