Local Stochastic Bilevel Optimization with Momentum-Based Variance
Reduction
- URL: http://arxiv.org/abs/2205.01608v1
- Date: Tue, 3 May 2022 16:40:22 GMT
- Title: Local Stochastic Bilevel Optimization with Momentum-Based Variance
Reduction
- Authors: Junyi Li, Feihu Huang, Heng Huang
- Abstract summary: We study Federated Bilevel Optimization problems. Specifically, we first propose FedBiO, a deterministic gradient-based algorithm, and then FedBiOAcc, its accelerated stochastic variant.
We show FedBiOAcc has a complexity of $O(\epsilon^{-1.5})$.
Our algorithms show superior performance compared to other baselines in numerical experiments.
- Score: 104.41634756395545
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bilevel Optimization has witnessed notable progress recently with newly
emerging efficient algorithms and has been applied to many machine learning
tasks such as data cleaning, few-shot learning, and neural architecture search.
However, little attention has been paid to solving bilevel problems in the
distributed setting. Federated learning (FL) is an emerging paradigm that
solves machine learning tasks over distributed data. FL problems are
challenging due to data heterogeneity and the communication bottleneck, and it
is unclear how these challenges affect the convergence of bilevel optimization
algorithms. In this paper, we study Federated Bilevel Optimization problems.
Specifically, we first propose FedBiO, a deterministic gradient-based
algorithm, and show that it requires $O(\epsilon^{-2})$ iterations to reach an
$\epsilon$-stationary
point. Then we propose FedBiOAcc, which accelerates FedBiO with a
momentum-based variance-reduction technique in the stochastic setting. We show
FedBiOAcc has a complexity of $O(\epsilon^{-1.5})$. Finally, we validate our
proposed algorithms on the important Fair Federated Learning task; more
specifically, we define a bilevel-based group-fair FL objective. Our
algorithms show superior performance compared to other baselines in numerical
experiments.
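To make the title's momentum-based variance reduction concrete, below is a minimal single-machine sketch of a STORM-style estimator, the general mechanism that FedBiOAcc-type acceleration builds on; the toy quadratic objective, step sizes, and function names are illustrative assumptions, not the paper's federated bilevel algorithm.

```python
# STORM-style momentum-based variance reduction (a sketch, not FedBiOAcc):
#   d_t = g(x_t; xi_t) + (1 - a) * (d_{t-1} - g(x_{t-1}; xi_t)),
# where BOTH gradients in the correction term use the same sample xi_t.
import numpy as np

rng = np.random.default_rng(0)

def stoch_grad(x, noise_scale=0.1):
    """Stochastic gradient of f(x) = 0.5 * ||x||^2 with additive noise."""
    return x + noise_scale * rng.standard_normal(x.shape)

def storm(x0, eta=0.1, a=0.5, steps=200):
    x_prev = x0.copy()
    d = stoch_grad(x_prev)               # initialize with a plain stochastic gradient
    x = x_prev - eta * d
    for _ in range(steps):
        state = rng.bit_generator.state  # save RNG state so the two gradients
        g_curr = stoch_grad(x)           # below share the same sample xi_t
        rng.bit_generator.state = state
        g_prev = stoch_grad(x_prev)
        d = g_curr + (1.0 - a) * (d - g_prev)  # variance-reduced estimator
        x_prev, x = x, x - eta * d
    return x

print(np.linalg.norm(storm(np.ones(5))))  # should be close to 0
```

The design point is that as $x_t$ and $x_{t-1}$ get close, the shared-sample correction term cancels most of the gradient noise, which is what enables the improved $O(\epsilon^{-1.5})$-type complexity.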
Related papers
- Communication-Efficient Federated Bilevel Optimization with Local and
Global Lower Level Problems [118.00379425831566]
We propose a communication-efficient algorithm, named FedBiOAcc.
We prove that FedBiOAcc-Local converges at the same rate for this class of problems.
Empirical results show superior performance of our algorithms.
arXiv Detail & Related papers (2023-02-13T21:28:53Z) - Achieving Linear Speedup in Non-IID Federated Bilevel Learning [16.56643290676128]
We propose a new federated bilevel algorithm named FedMBO.
We show that FedMBO achieves a convergence rate of $\mathcal{O}\big(\frac{1}{\sqrt{nK}}+\frac{1}{K}+\frac{\sqrt{n}}{K^{3/2}}\big)$ on non-i.i.d. datasets.
This is the first theoretical linear speedup result for non-i.i.d. federated bilevel optimization: for large $K$ the $\frac{1}{\sqrt{nK}}$ term dominates, so the iteration count needed for a given accuracy shrinks linearly in the number of clients $n$.
arXiv Detail & Related papers (2023-02-10T18:28:00Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on the momentum-based variance-reduction technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive minimax optimization algorithm (i.e., AdaFGDA) to solve such federated minimax problems.
It builds on our momentum-based variance-reduction and local-SGD techniques, and flexibly incorporates various adaptive learning rates.
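For context on the local-SGD ingredient mentioned above, here is a minimal sketch of local SGD with periodic model averaging; the client gradient oracles, the averaging interval q, and the toy heterogeneous targets are illustrative assumptions, not AdaFGDA itself.

```python
# Local SGD with periodic averaging (a sketch, not AdaFGDA): each client
# runs q local steps between communication rounds, then the server
# averages the client models.
import numpy as np

def local_sgd(client_grads, x0, eta=0.1, rounds=20, q=5):
    x = x0.copy()
    for _ in range(rounds):
        local_models = []
        for grad_fn in client_grads:       # simulate each client in turn
            xi = x.copy()
            for _ in range(q):             # q local steps, no communication
                xi -= eta * grad_fn(xi)
            local_models.append(xi)
        x = np.mean(local_models, axis=0)  # one communication: average models
    return x

# Toy heterogeneous clients: client i pulls x toward its own target c_i,
# mimicking non-i.i.d. local objectives 0.5 * ||x - c_i||^2.
rng = np.random.default_rng(1)
targets = [rng.standard_normal(3) for _ in range(4)]
clients = [lambda x, c=c: x - c for c in targets]
print(local_sgd(clients, np.zeros(3)))     # approaches the mean of the targets
```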
arXiv Detail & Related papers (2022-11-14T12:32:18Z) - Fast Adaptive Federated Bilevel Optimization [14.579475552088692]
We propose a novel adaptive federated bilevel optimization algorithm (i.e., AdaFBiO) to solve distributed bilevel optimization problems.
AdaFBiO uses unified adaptive matrices to flexibly incorporate various adaptive learning rates for updating variables in both the upper-level (UL) and lower-level (LL) problems.
We provide a convergence analysis framework for our AdaFBiO algorithm, and prove that it needs a sample complexity of $\tilde{O}(\epsilon^{-3})$ and a communication complexity of $\tilde{O}(\epsilon^{-2})$ to obtain an $\epsilon$-stationary point.
arXiv Detail & Related papers (2022-11-02T13:55:47Z) - Communication-Efficient Robust Federated Learning with Noisy Labels [144.31995882209932]
Federated learning (FL) is a promising privacy-preserving machine learning paradigm over distributed data.
We propose a learning-based reweighting approach to mitigate the effect of noisy labels in FL.
Our approach has shown superior performance on several real-world datasets compared to various baselines.
arXiv Detail & Related papers (2022-06-11T16:21:17Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-assisted Single-Timescale Stochastic Approximation (MSTSA) algorithm -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in its iterates caused by inexact solutions of the lower-level subproblem.
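As a hedged illustration of the single-timescale idea, the sketch below runs one inexact lower-level step and one momentum-damped upper-level step per iteration on a toy quadratic bilevel problem; the objectives, step sizes, and the name mstsa_like are assumptions for illustration, not MSTSA's actual update rules.

```python
# A generic single-timescale bilevel loop in the spirit of MSTSA (NOT its
# exact update): the lower-level variable y takes one gradient step per
# iteration instead of being solved to optimality, and momentum on the
# hypergradient damps the resulting error. Toy problem (an assumption):
#   min_x f(x, y*(x)) with f(x, y) = 0.5*y^2, y*(x) = argmin_y 0.5*(y - x)^2,
# so y*(x) = x, the hypergradient is y, and the optimum is x = y = 0.
def mstsa_like(x=1.0, y=0.0, steps=200, alpha=0.05, beta=0.5, m=0.9):
    d = 0.0                                # momentum-averaged hypergradient
    for _ in range(steps):
        y -= beta * (y - x)                # one inexact lower-level step
        hypergrad = y                      # chain rule for this toy problem
        d = m * d + (1.0 - m) * hypergrad  # momentum damps the inexactness error
        x -= alpha * d                     # upper-level step with inexact y
    return x, y

print(mstsa_like())  # both values should approach 0
```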
arXiv Detail & Related papers (2021-02-15T07:10:33Z)