SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity
- URL: http://arxiv.org/abs/2405.18777v1
- Date: Wed, 29 May 2024 05:36:03 GMT
- Title: SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity
- Authors: Tianshu Chu, Dachuan Xu, Wei Yao, Jin Zhang,
- Abstract summary: We show that there is no gap in complexity analysis between bilevel and single-level optimization when implementing SPABA.
We propose several other bi-loop or nested bi-level algorithms to improve the state of complexity analysis.
- Score: 5.046146134689904
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: While stochastic bilevel optimization methods have been extensively studied for addressing large-scale nested optimization problems in machine learning, it remains an open question whether the optimal complexity bounds for solving bilevel optimization are the same as those in single-level optimization. Our main result resolves this question: SPABA, an adaptation of the PAGE method for nonconvex optimization in (Li et al., 2021) to the bilevel setting, can achieve optimal sample complexity in both the finite-sum and expectation settings. We show the optimality of SPABA by proving that there is no gap in complexity analysis between stochastic bilevel and single-level optimization when implementing PAGE. Notably, as indicated by the results of (Dagr\'eou et al., 2022), there might exist a gap in complexity analysis when implementing other stochastic gradient estimators, like SGD and SAGA. In addition to SPABA, we propose several other single-loop stochastic bilevel algorithms, that either match or improve the state-of-the-art sample complexity results, leveraging our convergence rate and complexity analysis. Numerical experiments demonstrate the superior practical performance of the proposed methods.
Related papers
- Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling [96.47086913559289]
gradient-based algorithms are widely used in bilevel optimization.
We introduce a without-replacement sampling based algorithm which achieves a faster convergence rate.
We validate our algorithms over both synthetic and real-world applications.
arXiv Detail & Related papers (2024-11-07T17:05:31Z) - Contextual Stochastic Bilevel Optimization [50.36775806399861]
We introduce contextual bilevel optimization (CSBO) -- a bilevel optimization framework with the lower-level problem minimizing an expectation on some contextual information and the upper-level variable.
It is important for applications such as meta-learning, personalized learning, end-to-end learning, and Wasserstein distributionally robustly optimization with side information (WDRO-SI)
arXiv Detail & Related papers (2023-10-27T23:24:37Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional optimization has found in a wide range of machine learning tasks, such as in-variant learning tasks, AUPRC, andAML.
This paper proposes algorithms for distributed federated learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Optimal Algorithms for Stochastic Bilevel Optimization under Relaxed
Smoothness Conditions [9.518010235273785]
We present a novel fully Liploop Hessian-inversion-free algorithmic framework for bilevel optimization.
We show that by a slight modification of our approach our approach can handle a more general multi-objective robust bilevel optimization problem.
arXiv Detail & Related papers (2023-06-21T07:32:29Z) - Will Bilevel Optimizers Benefit from Loops [63.22466953441521]
Two current popular bilevelimats AID-BiO and ITD-BiO naturally involve solving one or two sub-problems.
We first establish unified convergence analysis for both AID-BiO and ITD-BiO that are applicable to all implementation choices of loops.
arXiv Detail & Related papers (2022-05-27T20:28:52Z) - Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on Bregman Bregman functions.
We also propose an accelerated version of SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel stoc-efficientgradient estimator named stoc-BiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex
Compositional Optimization [20.410012564838933]
compositional optimization arises in many important machine learning tasks such as value function evaluation in reinforcement learning and portfolio management.
In this paper, we investigate the general compositional optimization in the general smooth non-cursive setting.
arXiv Detail & Related papers (2019-12-31T18:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.