Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully
First-Order Oracles
- URL: http://arxiv.org/abs/2306.14853v2
- Date: Mon, 28 Aug 2023 04:46:52 GMT
- Title: Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully
First-Order Oracles
- Authors: Lesi Chen, Yaohua Ma, Jingzhao Zhang
- Abstract summary: We show that a first-order method can converge at the near-optimal $tilde mathcalO(epsilon-2)$ rate as second-order methods.
Our analysis further leads to simple first-order algorithms that achieve similar convergence rates for finding second-order stationary points.
- Score: 14.697733592222658
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Bilevel optimization has wide applications such as hyperparameter tuning,
neural architecture search, and meta-learning. Designing efficient algorithms
for bilevel optimization is challenging because the lower-level problem defines
a feasibility set implicitly via another optimization problem. In this work, we
consider one tractable case when the lower-level problem is strongly convex.
Recent works show that with a Hessian-vector product oracle, one can provably
find an $\epsilon$-first-order stationary point within
$\tilde{\mathcal{O}}(\epsilon^{-2})$ oracle calls. However, Hessian-vector
product may be inaccessible or expensive in practice. Kwon et al. (ICML 2023)
addressed this issue by proposing a first-order method that can achieve the
same goal at a slower rate of $\tilde{\mathcal{O}}(\epsilon^{-3})$. In this
work, we provide a tighter analysis demonstrating that this method can converge
at the near-optimal $\tilde {\mathcal{O}}(\epsilon^{-2})$ rate as second-order
methods. Our analysis further leads to simple first-order algorithms that
achieve similar convergence rates for finding second-order stationary points
and for distributed bilevel problems.
Related papers
- First-Order Methods for Linearly Constrained Bilevel Optimization [38.19659447295665]
We present first-order linearly constrained optimization methods for high-level Hessian computations.
For linear inequality constraints, we attain $(delta,epsilon)$-Goldstein stationarity in $widetildeO(ddelta-1 epsilon-3)$ gradient oracle calls.
arXiv Detail & Related papers (2024-06-18T16:41:21Z) - Achieving ${O}(\epsilon^{-1.5})$ Complexity in Hessian/Jacobian-free
Stochastic Bilevel Optimization [21.661676550609876]
We show how to achieve an $O(epsilon1.5)$ sample complexity for non-strongly-accurate stationary point gradient bilevel optimization.
As far as we know, this is the first Hessian/Jacobian-free method with an $O(epsilon1.5)$ sample complexity for non-strongly-accurate stationary point gradient optimization.
arXiv Detail & Related papers (2023-12-06T16:34:58Z) - Projection-Free Methods for Stochastic Simple Bilevel Optimization with
Convex Lower-level Problem [16.9187409976238]
We study a class of convex bilevel optimization problems, also known as simple bilevel optimization.
We introduce novel bilevel optimization methods that approximate the solution set of the lower-level problem.
arXiv Detail & Related papers (2023-08-15T02:37:11Z) - Accelerating Inexact HyperGradient Descent for Bilevel Optimization [84.00488779515206]
We present a method for solving general non-strongly-concave bilevel optimization problems.
Our results also improve upon the existing complexity for finding second-order stationary points in non-strongly-concave problems.
arXiv Detail & Related papers (2023-06-30T20:36:44Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of emphcon unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterations converge to an $epsilon$-saddle point within $O(epsilon-2/3)$ in terms of a restricted function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - A Conditional Gradient-based Method for Simple Bilevel Optimization with
Convex Lower-level Problem [18.15207779559351]
We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem via a cutting plane.
Our method achieves best-known assumption for the considered class of bilevel problems.
arXiv Detail & Related papers (2022-06-17T16:12:47Z) - The First Optimal Acceleration of High-Order Methods in Smooth Convex
Optimization [88.91190483500932]
We study the fundamental open question of finding the optimal high-order algorithm for solving smooth convex minimization problems.
The reason for this is that these algorithms require performing a complex binary procedure, which makes them neither optimal nor practical.
We fix this fundamental issue by providing the first algorithm with $mathcalOleft(epsilon-2/(p+1)right) $pth order oracle complexity.
arXiv Detail & Related papers (2022-05-19T16:04:40Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum- Single-timescale Approximation (MSTSA) -- for tackling problems.
MSTSA allows us to control the error in iterations due to inaccurate solution to the lower level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We find an algorithm which finds.
epsilon$-approximate stationary point (with $|nabla F(x)|le epsilon$) using.
$(epsilon,gamma)$surimate random random points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z) - Second-order Conditional Gradient Sliding [79.66739383117232]
We present the emphSecond-Order Conditional Gradient Sliding (SOCGS) algorithm.
The SOCGS algorithm converges quadratically in primal gap after a finite number of linearly convergent iterations.
It is useful when the feasible region can only be accessed efficiently through a linear optimization oracle.
arXiv Detail & Related papers (2020-02-20T17:52:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.