Accelerating Inexact HyperGradient Descent for Bilevel Optimization
- URL: http://arxiv.org/abs/2307.00126v1
- Date: Fri, 30 Jun 2023 20:36:44 GMT
- Title: Accelerating Inexact HyperGradient Descent for Bilevel Optimization
- Authors: Haikuo Yang, Luo Luo, Chris Junchi Li, Michael I. Jordan
- Abstract summary: We present a method for solving general non-strongly-concave bilevel optimization problems.
Our results also improve upon the existing complexity for finding second-order stationary points in non-strongly-concave problems.
- Score: 84.00488779515206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a method for solving general nonconvex-strongly-convex bilevel
optimization problems. Our method -- the \emph{Restarted Accelerated
HyperGradient Descent} (\texttt{RAHGD}) method -- finds an
$\epsilon$-first-order stationary point of the objective with
$\tilde{\mathcal{O}}(\kappa^{3.25}\epsilon^{-1.75})$ oracle complexity, where
$\kappa$ is the condition number of the lower-level objective and $\epsilon$ is
the desired accuracy. We also propose a perturbed variant of \texttt{RAHGD} for
finding an
$\big(\epsilon,\mathcal{O}(\kappa^{2.5}\sqrt{\epsilon}\,)\big)$-second-order
stationary point within the same order of oracle complexity. Our results
achieve the best-known theoretical guarantees for finding stationary points in
bilevel optimization and also improve upon the existing upper complexity bound
for finding second-order stationary points in nonconvex-strongly-concave
minimax optimization problems, setting a new state-of-the-art benchmark.
Empirical studies are conducted to validate the theoretical results in this
paper.
Related papers
- First-Order Methods for Linearly Constrained Bilevel Optimization [38.19659447295665]
We present first-order linearly constrained optimization methods for high-level Hessian computations.
For linear inequality constraints, we attain $(delta,epsilon)$-Goldstein stationarity in $widetildeO(ddelta-1 epsilon-3)$ gradient oracle calls.
arXiv Detail & Related papers (2024-06-18T16:41:21Z) - The Computational Complexity of Finding Stationary Points in Non-Convex Optimization [53.86485757442486]
Finding approximate stationary points, i.e., points where the gradient is approximately zero, of non-order but smooth objective functions is a computational problem.
We show that finding approximate KKT points in constrained optimization is reducible to finding approximate stationary points in unconstrained optimization but the converse is impossible.
arXiv Detail & Related papers (2023-10-13T14:52:46Z) - Projection-Free Methods for Stochastic Simple Bilevel Optimization with
Convex Lower-level Problem [16.9187409976238]
We study a class of convex bilevel optimization problems, also known as simple bilevel optimization.
We introduce novel bilevel optimization methods that approximate the solution set of the lower-level problem.
arXiv Detail & Related papers (2023-08-15T02:37:11Z) - Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully
First-Order Oracles [14.697733592222658]
We show that a first-order method can converge at the near-optimal $tilde mathcalO(epsilon-2)$ rate as second-order methods.
Our analysis further leads to simple first-order algorithms that achieve similar convergence rates for finding second-order stationary points.
arXiv Detail & Related papers (2023-06-26T17:07:54Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) that minimizes a twice different subject general non conic optimization.
In particular, we propose a Newton-CG based-augmentedconjugate method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - An Efficient Stochastic Algorithm for Decentralized Nonconvex-Strongly-Concave Minimax Optimization [25.00475462213752]
Decentralized Recursive desc. Method (DREAM)
Concretely, it requires $mathcalO(minminappaappa3eps-3,kappa2 N)$ first-order oracle (SFO) calls and $tildemathcalO(kappa2 epsilon-2) communication rounds.
Our numerical experiments validate the superiority of previous methods.
arXiv Detail & Related papers (2022-12-05T16:09:39Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of emphcon unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterations converge to an $epsilon$-saddle point within $O(epsilon-2/3)$ in terms of a restricted function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - A Newton-CG based barrier method for finding a second-order stationary
point of nonconvex conic optimization with complexity guarantees [0.5922488908114022]
We consider finding an approximate second-order stationary point (SOSP) that minimizes a twice differentiable function intersection.
Our method is not only iteration, but also achieves $mincal O(epsilon/2) which matches the best known complexity.
arXiv Detail & Related papers (2022-07-12T17:21:45Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum- Single-timescale Approximation (MSTSA) -- for tackling problems.
MSTSA allows us to control the error in iterations due to inaccurate solution to the lower level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We find an algorithm which finds.
epsilon$-approximate stationary point (with $|nabla F(x)|le epsilon$) using.
$(epsilon,gamma)$surimate random random points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.