Projection-Free Methods for Stochastic Simple Bilevel Optimization with
Convex Lower-level Problem
- URL: http://arxiv.org/abs/2308.07536v1
- Date: Tue, 15 Aug 2023 02:37:11 GMT
- Title: Projection-Free Methods for Stochastic Simple Bilevel Optimization with
Convex Lower-level Problem
- Authors: Jincheng Cao, Ruichen Jiang, Nazanin Abolfazli, Erfan Yazdandoost
Hamedani, Aryan Mokhtari
- Abstract summary: We study a class of convex bilevel optimization problems, also known as simple bilevel optimization.
We introduce novel bilevel optimization methods that approximate the solution set of the lower-level problem.
- Score: 16.9187409976238
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study a class of stochastic bilevel optimization problems,
also known as stochastic simple bilevel optimization, where we minimize a
smooth stochastic objective function over the optimal solution set of another
stochastic convex optimization problem. We introduce novel stochastic bilevel
optimization methods that locally approximate the solution set of the
lower-level problem via a stochastic cutting plane, and then run a conditional
gradient update with variance reduction techniques to control the error induced
by using stochastic gradients. For the case that the upper-level function is
convex, our method requires
$\tilde{\mathcal{O}}(\max\{1/\epsilon_f^{2},1/\epsilon_g^{2}\}) $ stochastic
oracle queries to obtain a solution that is $\epsilon_f$-optimal for the
upper-level and $\epsilon_g$-optimal for the lower-level. This guarantee
improves the previous best-known complexity of
$\mathcal{O}(\max\{1/\epsilon_f^{4},1/\epsilon_g^{4}\})$. Moreover, for the
case that the upper-level function is non-convex, our method requires at most
$\tilde{\mathcal{O}}(\max\{1/\epsilon_f^{3},1/\epsilon_g^{3}\}) $ stochastic
oracle queries to find an $(\epsilon_f, \epsilon_g)$-stationary point. In the
finite-sum setting, we show that the number of stochastic oracle calls required
by our method is $\tilde{\mathcal{O}}(\sqrt{n}/\epsilon)$ and
$\tilde{\mathcal{O}}(\sqrt{n}/\epsilon^{2})$ for the convex and non-convex
settings, respectively, where $\epsilon=\min \{\epsilon_f,\epsilon_g\}$.
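To make the mechanics concrete, here is a minimal sketch of one iteration in the spirit of the approach described above: a stochastic cutting plane locally over-approximates the lower-level solution set, and a conditional gradient (Frank-Wolfe) step for the upper-level objective is taken over that halfspace intersected with the domain, so no projection is ever computed. All interfaces (`grad_f_est`, `grad_g_est`, `g_est`, the box domain, the cutting-plane target `g_target`) are illustrative placeholders, and the paper's variance-reduction machinery is omitted; this is not the authors' exact algorithm.

```python
import numpy as np
from scipy.optimize import linprog

def fw_cutting_plane_step(x, grad_f_est, grad_g_est, g_est, lb, ub, g_target, gamma):
    """One illustrative projection-free step (assumed interfaces):
    approximate the lower-level solution set by the stochastic cutting plane
        {z : g(x) + <grad g(x), z - x> <= g_target},
    then take a Frank-Wolfe step for the upper-level objective over this
    halfspace intersected with the box [lb, ub] (a single linear program)."""
    gf = grad_f_est(x)    # stochastic upper-level gradient estimate
    gg = grad_g_est(x)    # stochastic lower-level gradient estimate
    gval = g_est(x)       # stochastic lower-level value estimate
    # Linear minimization oracle: min_z <gf, z>  s.t.  <gg, z> <= g_target - gval + <gg, x>
    res = linprog(c=gf,
                  A_ub=gg.reshape(1, -1),
                  b_ub=[g_target - gval + gg @ x],
                  bounds=list(zip(lb, ub)),
                  method="highs")
    s = res.x                       # vertex returned by the LP oracle
    return x + gamma * (s - x)      # conditional gradient (Frank-Wolfe) update

# Toy deterministic stand-in: f(x) = ||x - c||^2 over argmin_{x in box} ||Ax - b||^2.
rng = np.random.default_rng(0)
A, b, c = rng.normal(size=(5, 3)), rng.normal(size=5), rng.normal(size=3)
x = np.zeros(3)
for k in range(1, 200):
    x = fw_cutting_plane_step(
        x,
        grad_f_est=lambda z: 2 * (z - c),
        grad_g_est=lambda z: 2 * A.T @ (A @ z - b),
        g_est=lambda z: float(np.sum((A @ z - b) ** 2)),
        lb=-2 * np.ones(3), ub=2 * np.ones(3),
        g_target=float(np.sum((A @ x - b) ** 2)),  # keeps the current point feasible
        gamma=2 / (k + 2),                         # classic Frank-Wolfe step size
    )
```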
Related papers
- An Accelerated Gradient Method for Convex Smooth Simple Bilevel Optimization [16.709026203727007]
We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem.
We measure the performance of our method in terms of suboptimality and infeasibility errors.
arXiv Detail & Related papers (2024-02-12T22:34:53Z)
- Accelerating Inexact HyperGradient Descent for Bilevel Optimization [84.00488779515206]
We present a method for solving general non-strongly-concave bilevel optimization problems.
Our results also improve upon the existing complexity for finding second-order stationary points in non-strongly-concave problems.
arXiv Detail & Related papers (2023-06-30T20:36:44Z)
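For background on the hypergradient machinery that the entry above accelerates (a textbook construction, not that paper's algorithm): when the lower-level problem is strongly convex, implicit differentiation gives the hypergradient $\nabla_x f - \nabla^2_{xy} g \, [\nabla^2_{yy} g]^{-1} \nabla_y f$, and truncating the inner linear-system solve, e.g. with a few conjugate gradient iterations, is what makes it inexact. A minimal sketch with hypothetical inputs:

```python
import numpy as np
from scipy.sparse.linalg import cg

def inexact_hypergradient(grad_x_f, grad_y_f, hess_yy_g, hess_xy_g, cg_iters=10):
    # Approximately solve hess_yy_g @ v = grad_y_f with a few CG iterations;
    # truncating CG early is exactly what makes the hypergradient "inexact".
    v, _ = cg(hess_yy_g, grad_y_f, maxiter=cg_iters)
    # Implicit-function-theorem formula: grad_x f - H_xy @ inv(H_yy) @ grad_y f.
    return grad_x_f - hess_xy_g @ v
```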
- An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to the large-scale setting and demonstrate the efficiency and robustness of our methods on large-scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z)
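For reference, below is a generic entropic mirror descent step on the probability simplex, the standard geometry for spectral problems like the eigenvalue ones above. It is not the paper's oblivious variant; the point of obliviousness is that the step-size schedule needs no knowledge of the Lipschitz or smoothness constants that a step like this would normally be tuned with.

```python
import numpy as np

def entropic_mirror_descent_step(x, grad, step):
    # Mirror descent with the entropy mirror map on the probability simplex:
    # the update x_{k+1} proportional to x_k * exp(-step * grad) is the
    # closed-form solution of the Bregman-regularized linearized problem.
    w = x * np.exp(-step * grad)
    return w / w.sum()
```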
- Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles [14.697733592222658]
We show that a first-order method can converge at the near-optimal $\tilde{\mathcal{O}}(\epsilon^{-2})$ rate of second-order methods.
Our analysis further leads to simple first-order algorithms that achieve similar convergence rates for finding second-order stationary points.
arXiv Detail & Related papers (2023-06-26T17:07:54Z)
- A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of general nonconvex conic optimization, which minimizes a twice differentiable function subject to convex conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z)
- ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density.
We develop algorithms achieving state-of-the-art complexities for SCO in the parallel and private settings.
arXiv Detail & Related papers (2023-01-01T18:51:29Z)
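The convolution idea can be illustrated with the standard Gaussian-smoothing gradient estimator (a textbook construction, not the ReSQue estimator itself): if $f_\rho(x) = \mathbb{E}[f(x + \rho z)]$ with $z \sim \mathcal{N}(0, I)$, then $\nabla f_\rho(x) = \mathbb{E}[(f(x + \rho z) - f(x))\, z] / \rho$, which Monte Carlo sampling turns into a gradient estimate:

```python
import numpy as np

def gaussian_smoothed_grad(f, x, rho, n_samples=100, rng=None):
    # Unbiased Monte Carlo estimate of grad f_rho(x), where f_rho is f
    # convolved with a Gaussian of standard deviation rho; subtracting the
    # baseline f(x) leaves the mean unchanged but reduces variance.
    rng = rng or np.random.default_rng()
    z = rng.standard_normal((n_samples, x.size))
    fx = f(x)
    vals = np.array([f(x + rho * zi) for zi in z])
    return ((vals - fx)[:, None] * z).mean(axis=0) / rho
```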
- Decomposable Non-Smooth Convex Optimization with Nearly-Linear Gradient Oracle Complexity [15.18055488087588]
We give an algorithm that minimizes the above convex formulation to $\epsilon$-accuracy in $\widetilde{O}(\sum_{i=1}^{n} d_i \log(1/\epsilon))$ gradient computations.
Our main technical contribution is an adaptive procedure to select an $f_i$ term at every iteration via a novel combination of cutting-plane and interior-point methods.
arXiv Detail & Related papers (2022-08-07T20:53:42Z)
- Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient steps with Nesterov acceleration.
arXiv Detail & Related papers (2022-06-17T06:10:20Z)
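Both this entry and the next build on the extragradient template, so the basic deterministic step for a smooth convex-concave saddle-point problem is worth stating (a textbook building block, not the accelerated or decentralized variants themselves):

```python
def extragradient_step(x, y, grad_x, grad_y, step):
    # Half step to a "look-ahead" point:
    x_half = x - step * grad_x(x, y)
    y_half = y + step * grad_y(x, y)
    # The actual update uses gradients evaluated at that look-ahead point,
    # which is what stabilizes the rotation typical of minimax dynamics.
    x_new = x - step * grad_x(x_half, y_half)
    y_new = y + step * grad_y(x_half, y_half)
    return x_new, y_new
```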
- Decentralized Stochastic Variance Reduced Extragradient Method [25.21457349137344]
This paper studies decentralized convex-concave minimax optimization problems of the form $\min_{x}\max_{y} f(x,y) \triangleq \frac{1}{m}\sum_{i=1}^{m} f_i(x,y)$.
arXiv Detail & Related papers (2022-02-01T16:06:20Z)
- A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-assisted Single-Timescale Stochastic Approximation (MSTSA) algorithm -- for tackling stochastic bilevel optimization problems.
MSTSA allows us to control the error in the iterates caused by inexact solutions to the lower-level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z)
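Momentum-assisted estimators like the one above typically maintain a recursive correction to the running gradient estimate. The sketch below shows the generic STORM-style recursion, given here as an illustrative assumption rather than MSTSA's exact update:

```python
def momentum_grad_estimate(d_prev, stoch_grad, x, x_prev, xi, beta):
    # STORM-style recursive estimator:
    #   d_t = g(x_t; xi_t) + (1 - beta) * (d_{t-1} - g(x_{t-1}; xi_t)).
    # Evaluating the SAME sample xi at both iterates makes the correction
    # term cancel much of the gradient noise, shrinking variance over time
    # without a separate inner loop for the lower-level subproblem.
    return stoch_grad(x, xi) + (1.0 - beta) * (d_prev - stoch_grad(x_prev, xi))
```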
- Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems [36.645753881826955]
In this paper, we propose a novel method called Stochastic Recursive Gradient Descent Ascent (SREDA), which estimates gradients more efficiently using variance reduction.
This method achieves the best-known complexity for this problem.
arXiv Detail & Related papers (2020-01-11T09:05:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.