Will Bilevel Optimizers Benefit from Loops
- URL: http://arxiv.org/abs/2205.14224v2
- Date: Tue, 31 May 2022 16:33:13 GMT
- Title: Will Bilevel Optimizers Benefit from Loops
- Authors: Kaiyi Ji, Mingrui Liu, Yingbin Liang, Lei Ying
- Abstract summary: Two currently popular bilevel optimizers, AID-BiO and ITD-BiO, naturally involve solving one or two sub-problems.
We first establish a unified convergence analysis for both AID-BiO and ITD-BiO that is applicable to all implementation choices of loops.
- Score: 63.22466953441521
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bilevel optimization has arisen as a powerful tool for solving a variety of
machine learning problems. Two currently popular bilevel optimizers, AID-BiO and
ITD-BiO, naturally involve solving one or two sub-problems, and consequently,
whether we solve these problems with loops (that take many iterations) or
without loops (that take only a few iterations) can significantly affect the
overall computational efficiency. Existing studies in the literature cover only
some of those implementation choices, and the complexity bounds available are
not refined enough to enable rigorous comparison among different
implementations. In this paper, we first establish a unified convergence
analysis for both AID-BiO and ITD-BiO that is applicable to all implementation
choices of loops. We then specialize our results to characterize the
computational complexity for all implementations, which enables an explicit
comparison among them. Our results indicate that for AID-BiO, the loop for estimating the
optimal point of the inner function is beneficial for overall efficiency,
although it causes higher complexity for each update step, and the loop for
approximating the outer-level Hessian-inverse-vector product reduces the
gradient complexity. For ITD-BiO, the two loops always coexist, and our
convergence upper and lower bounds show that such loops are necessary to
guarantee a vanishing convergence error, whereas the no-loop scheme suffers
from an unavoidable non-vanishing convergence error. Our numerical experiments
further corroborate our theoretical results.
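Under strong convexity of the inner objective, both optimizers estimate the hypergradient $\nabla\Phi(x) = \nabla_x f(x, y^*(x)) - \nabla^2_{xy} g(x, y^*(x))^\top [\nabla^2_{yy} g(x, y^*(x))]^{-1} \nabla_y f(x, y^*(x))$ at an approximation of the inner minimizer $y^*(x)$. As a hedged illustration of the loop choices discussed above (the toy problem, step sizes, and all names below are our own assumptions, not the authors' setup), the following AID-style sketch uses $N$ gradient steps to approximate $y^*(x)$ and $Q$ conjugate-gradient steps to approximate the Hessian-inverse-vector product; setting $N = Q = 1$ recovers a no-loop implementation.

```python
import numpy as np

# Illustrative quadratic bilevel problem (our own toy, not from the paper):
#   inner:  g(x, y) = 0.5 * y^T A y - y^T B x   ->  y*(x) = A^{-1} B x
#   outer:  f(x, y) = 0.5 * ||y - y_tgt||^2 + 0.5 * lam * ||x||^2
rng = np.random.default_rng(0)
dx, dy, lam = 5, 8, 0.1
A = rng.normal(size=(dy, dy))
A = A @ A.T / dy + np.eye(dy)                 # strongly convex inner problem
B = rng.normal(size=(dy, dx)) / np.sqrt(dy)
y_tgt = rng.normal(size=dy)

def cg(b, Q):
    """Q conjugate-gradient steps toward v = [grad2_yy g]^{-1} b, using only
    Hessian-vector products A @ p (no explicit Hessian inverse)."""
    v = np.zeros_like(b)
    r, p = b.copy(), b.copy()
    for _ in range(Q):
        if r @ r < 1e-30:                     # already solved to machine precision
            break
        Ap = A @ p
        a = (r @ r) / (p @ Ap)
        v, r_new = v + a * p, r - a * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return v

def aid_bio(N, Q, T=300, alpha=0.1, beta=0.2):
    """AID-style outer loop with N inner gradient steps (loop 1) and
    Q CG steps for the Hessian-inverse-vector product (loop 2).
    N = Q = 1 corresponds to a no-loop implementation."""
    x, y = np.zeros(dx), np.zeros(dy)
    for _ in range(T):
        for _ in range(N):                    # loop 1: approximate y*(x)
            y -= alpha * (A @ y - B @ x)      # grad_y g(x, y)
        v = cg(y - y_tgt, Q)                  # loop 2: grad_y f = y - y_tgt
        x -= beta * (lam * x + B.T @ v)       # grad_x f - (grad2_xy g)^T v, grad2_xy g = -B
    return x

for N, Q in [(1, 1), (10, 10)]:
    x = aid_bio(N, Q)
    y_star = np.linalg.solve(A, B @ x)
    exact = lam * x + B.T @ np.linalg.solve(A, y_star - y_tgt)
    print(f"N={N:2d}, Q={Q:2d}: ||exact hypergradient|| = {np.linalg.norm(exact):.2e}")
```

Running both settings makes the trade-off studied in the paper tangible: loops make each outer step more expensive but produce a more accurate hypergradient estimate.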
Related papers
- Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization [25.438298531555468]
Bilevel optimization is widely applied in many machine learning tasks such as hyperparameter learning, meta-learning, and reinforcement learning.
We propose an efficient Hessian/Jacobian-free bilevel method with an optimal convergence rate of $O(\frac{1}{T})$ under some mild conditions.
We conduct numerical experiments on bilevel tasks to demonstrate the convergence of our method.
arXiv Detail & Related papers (2024-07-25T07:25:06Z)
- SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity [5.046146134689904]
We show that there is no gap in complexity analysis between bilevel and single-level optimization when implementing SPABA.
We propose several other bi-loop or nested bi-level algorithms to improve the state of complexity analysis.
arXiv Detail & Related papers (2024-05-29T05:36:03Z)
- A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk Minimization [23.401092942765196]
We propose a bilevel extension of the celebrated SARAH algorithm.
We demonstrate that the algorithm requires $\mathcal{O}((n+m)^{\frac{1}{2}}\varepsilon^{-1})$ oracle calls to achieve $\varepsilon$-stationarity.
We provide a lower bound on the number of oracle calls required to get an approximate stationary point of the objective function of the bilevel problem.
arXiv Detail & Related papers (2023-02-17T09:04:18Z)
- A Constrained Optimization Approach to Bilevel Optimization with Multiple Inner Minima [49.320758794766185]
We propose a new approach that converts the bilevel problem into an equivalent constrained optimization problem, which can then be solved with a primal-dual algorithm.
Such an approach enjoys a few advantages: (a) it addresses the multiple-inner-minima challenge; (b) it is fully first-order and efficient, with no Jacobian computations.
arXiv Detail & Related papers (2022-03-01T18:20:01Z)
- A Fully Single Loop Algorithm for Bilevel Optimization without Hessian Inverse [121.54116938140754]
We propose a new Hessian-inverse-free fully single-loop algorithm for bilevel optimization problems.
We show that our algorithm converges at a rate of $O(\epsilon^{-2})$ (a minimal sketch of such a single-loop scheme appears after this list).
arXiv Detail & Related papers (2021-12-09T02:27:52Z)
- Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on the Bregman distance (SBiO-BreD).
We also propose an accelerated version of SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z)
- Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve a complexity of $\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z)
- Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters [152.03852111442114]
Although model-agnostic meta-learning (MAML) is a very successful algorithm in meta-learning practice, it can have high computational complexity.
Our paper shows that such complexity can significantly affect the overall convergence performance of ANIL, a variant that adapts only partial, task-specific parameters.
arXiv Detail & Related papers (2020-06-16T19:57:48Z)
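Complementing the AID-style sketch above, and as referenced in the entry on "A Fully Single Loop Algorithm for Bilevel Optimization without Hessian Inverse", here is a minimal single-loop, Hessian-inverse-free template on the same toy problem. It is our own hedged sketch, not the authors' exact method: an auxiliary vector $v$ tracks the Hessian-inverse-vector product $[\nabla^2_{yy} g]^{-1} \nabla_y f$ with one gradient step per iteration, so no sub-problem loop is ever run.

```python
import numpy as np

# Same illustrative quadratic bilevel toy as in the earlier sketch:
#   inner:  g(x, y) = 0.5 * y^T A y - y^T B x
#   outer:  f(x, y) = 0.5 * ||y - y_tgt||^2 + 0.5 * lam * ||x||^2
rng = np.random.default_rng(0)
dx, dy, lam = 5, 8, 0.1
A = rng.normal(size=(dy, dy))
A = A @ A.T / dy + np.eye(dy)                 # strongly convex inner problem
B = rng.normal(size=(dy, dx)) / np.sqrt(dy)
y_tgt = rng.normal(size=dy)

x, y, v = np.zeros(dx), np.zeros(dy), np.zeros(dy)
alpha, tau, beta = 0.1, 0.1, 0.1
for _ in range(3000):
    # one step on each variable per iteration -- no inner loops anywhere:
    y -= alpha * (A @ y - B @ x)              # descend grad_y g(x, y)
    v -= tau * (A @ v - (y - y_tgt))          # track v* = [grad2_yy g]^{-1} grad_y f
    x -= beta * (lam * x + B.T @ v)           # hypergradient estimate: grad_x f - (grad2_xy g)^T v

# check stationarity against the exact hypergradient at y*(x)
y_star = np.linalg.solve(A, B @ x)
res = lam * x + B.T @ np.linalg.solve(A, y_star - y_tgt)
print("final ||hypergradient|| =", np.linalg.norm(res))
```

Each iteration costs only a few matrix-vector products; the analyses cited above quantify the bias such no-loop tracking incurs relative to running explicit loops.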