Local Stochastic Bilevel Optimization with Momentum-Based Variance
Reduction
- URL: http://arxiv.org/abs/2205.01608v1
- Date: Tue, 3 May 2022 16:40:22 GMT
- Title: Local Stochastic Bilevel Optimization with Momentum-Based Variance
Reduction
- Authors: Junyi Li, Feihu Huang, Heng Huang
- Abstract summary: We study Federated Bilevel Optimization problems. Specifically, we first propose FedBiO, a deterministic gradient-based algorithm, and then FedBiOAcc, its accelerated stochastic variant.
We show FedBiOAcc has a complexity of $O(\epsilon^{-1.5})$.
Our algorithms show superior performance compared to other baselines in numerical experiments.
- Score: 104.41634756395545
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bilevel Optimization has witnessed notable progress recently with newly
emerging efficient algorithms and has been applied to many machine learning
tasks such as data cleaning, few-shot learning, and neural architecture search.
However, little attention has been paid to solving bilevel problems in the
distributed setting. Federated learning (FL) is an emerging paradigm that
solves machine learning tasks over distributed data. FL problems are
challenging due to data heterogeneity and the communication bottleneck, and it
is unclear how these challenges affect the convergence of bilevel optimization
algorithms. In this paper, we study Federated Bilevel Optimization problems.
Specifically, we first propose FedBiO, a deterministic gradient-based
algorithm, and show that it requires $O(\epsilon^{-2})$ iterations to reach an
$\epsilon$-stationary
point. Then we propose FedBiOAcc, which accelerates FedBiO with a
momentum-based variance-reduction technique in the stochastic setting. We show
FedBiOAcc has a complexity of $O(\epsilon^{-1.5})$. Finally, we validate our
proposed algorithms on the important Fair Federated Learning task; more
specifically, we define a bilevel-based group-fair FL objective. Our
algorithms show superior performance compared to other baselines in numerical
experiments.
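To make the title's momentum-based variance reduction concrete, below is a minimal single-machine sketch of a STORM-style estimator, the general mechanism that FedBiOAcc-type acceleration builds on; the toy quadratic objective, step sizes, and function names are illustrative assumptions, not the paper's federated bilevel algorithm.

```python
# STORM-style momentum-based variance reduction (a sketch, not FedBiOAcc):
#   d_t = g(x_t; xi_t) + (1 - a) * (d_{t-1} - g(x_{t-1}; xi_t)),
# where BOTH gradients in the correction term use the same sample xi_t.
import numpy as np

rng = np.random.default_rng(0)

def stoch_grad(x, noise_scale=0.1):
    """Stochastic gradient of f(x) = 0.5 * ||x||^2 with additive noise."""
    return x + noise_scale * rng.standard_normal(x.shape)

def storm(x0, eta=0.1, a=0.5, steps=200):
    x_prev = x0.copy()
    d = stoch_grad(x_prev)               # initialize with a plain stochastic gradient
    x = x_prev - eta * d
    for _ in range(steps):
        state = rng.bit_generator.state  # save RNG state so the two gradients
        g_curr = stoch_grad(x)           # below share the same sample xi_t
        rng.bit_generator.state = state
        g_prev = stoch_grad(x_prev)
        d = g_curr + (1.0 - a) * (d - g_prev)  # variance-reduced estimator
        x_prev, x = x, x - eta * d
    return x

print(np.linalg.norm(storm(np.ones(5))))  # should be close to 0
```

The design point is that as $x_t$ and $x_{t-1}$ get close, the shared-sample correction term cancels most of the gradient noise, which is what enables the improved $O(\epsilon^{-1.5})$-type complexity.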
Related papers
- Communication-Efficient Federated Bilevel Optimization with Local and
Global Lower Level Problems [118.00379425831566]
We propose a communication-efficient algorithm, named FedBiOAcc.
We prove that FedBiOAcc-Local converges at the same rate for this class of problems.
Empirical results show superior performance of our algorithms.
arXiv Detail & Related papers (2023-02-13T21:28:53Z) - Achieving Linear Speedup in Non-IID Federated Bilevel Learning [16.56643290676128]
We propose a new federated bilevel algorithm named FedMBO.
We show that FedMBO achieves a convergence rate of $\mathcal{O}\big(\frac{1}{\sqrt{nK}}+\frac{1}{K}+\frac{\sqrt{n}}{K^{3/2}}\big)$ on non-i.i.d. datasets.
This is the first theoretical linear speedup result for non-i.i.d. federated bilevel optimization: for large $K$ the $\frac{1}{\sqrt{nK}}$ term dominates, so the iteration count needed for a given accuracy shrinks linearly in the number of clients $n$.
arXiv Detail & Related papers (2023-02-10T18:28:00Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on the momentum-based variance-reduction technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive minimax optimization algorithm (i.e., AdaFGDA) to solve such federated minimax problems.
It builds on our momentum-based variance-reduction and local-SGD techniques, and flexibly incorporates various adaptive learning rates.
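For context on the local-SGD ingredient mentioned above, here is a minimal sketch of local SGD with periodic model averaging; the client gradient oracles, the averaging interval q, and the toy heterogeneous targets are illustrative assumptions, not AdaFGDA itself.

```python
# Local SGD with periodic averaging (a sketch, not AdaFGDA): each client
# runs q local steps between communication rounds, then the server
# averages the client models.
import numpy as np

def local_sgd(client_grads, x0, eta=0.1, rounds=20, q=5):
    x = x0.copy()
    for _ in range(rounds):
        local_models = []
        for grad_fn in client_grads:       # simulate each client in turn
            xi = x.copy()
            for _ in range(q):             # q local steps, no communication
                xi -= eta * grad_fn(xi)
            local_models.append(xi)
        x = np.mean(local_models, axis=0)  # one communication: average models
    return x

# Toy heterogeneous clients: client i pulls x toward its own target c_i,
# mimicking non-i.i.d. local objectives 0.5 * ||x - c_i||^2.
rng = np.random.default_rng(1)
targets = [rng.standard_normal(3) for _ in range(4)]
clients = [lambda x, c=c: x - c for c in targets]
print(local_sgd(clients, np.zeros(3)))     # approaches the mean of the targets
```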
arXiv Detail & Related papers (2022-11-14T12:32:18Z) - Fast Adaptive Federated Bilevel Optimization [14.579475552088692]
We propose a novel adaptive federated bilevel optimization algorithm (i.e., AdaFBiO) to solve distributed bilevel optimization problems.
AdaFBiO uses unified adaptive matrices to flexibly incorporate various adaptive learning rates for updating variables in both the upper-level (UL) and lower-level (LL) problems.
We provide a convergence analysis framework for our AdaFBiO algorithm, and prove that it needs a sample complexity of $\tilde{O}(\epsilon^{-3})$ and a communication complexity of $\tilde{O}(\epsilon^{-2})$ to obtain an $\epsilon$-stationary point.
arXiv Detail & Related papers (2022-11-02T13:55:47Z) - Communication-Efficient Robust Federated Learning with Noisy Labels [144.31995882209932]
Federated learning (FL) is a promising privacy-preserving machine learning paradigm over distributed data.
We propose a learning-based reweighting approach to mitigate the effect of noisy labels in FL.
Our approach has shown superior performance on several real-world datasets compared to various baselines.
arXiv Detail & Related papers (2022-06-11T16:21:17Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-assisted Single-Timescale Stochastic Approximation (MSTSA) algorithm -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in its iterates caused by inexact solutions of the lower-level subproblem.
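As a hedged illustration of the single-timescale idea, the sketch below runs one inexact lower-level step and one momentum-damped upper-level step per iteration on a toy quadratic bilevel problem; the objectives, step sizes, and the name mstsa_like are assumptions for illustration, not MSTSA's actual update rules.

```python
# A generic single-timescale bilevel loop in the spirit of MSTSA (NOT its
# exact update): the lower-level variable y takes one gradient step per
# iteration instead of being solved to optimality, and momentum on the
# hypergradient damps the resulting error. Toy problem (an assumption):
#   min_x f(x, y*(x)) with f(x, y) = 0.5*y^2, y*(x) = argmin_y 0.5*(y - x)^2,
# so y*(x) = x, the hypergradient is y, and the optimum is x = y = 0.
def mstsa_like(x=1.0, y=0.0, steps=200, alpha=0.05, beta=0.5, m=0.9):
    d = 0.0                                # momentum-averaged hypergradient
    for _ in range(steps):
        y -= beta * (y - x)                # one inexact lower-level step
        hypergrad = y                      # chain rule for this toy problem
        d = m * d + (1.0 - m) * hypergrad  # momentum damps the inexactness error
        x -= alpha * d                     # upper-level step with inexact y
    return x, y

print(mstsa_like())  # both values should approach 0
```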
arXiv Detail & Related papers (2021-02-15T07:10:33Z)