Related papers: On the Complexity of Finding Stationary Points in Nonconvex Simple Bilevel Optimization

On the Complexity of Finding Stationary Points in Nonconvex Simple Bilevel Optimization

URL: http://arxiv.org/abs/2507.23155v1
Date: Wed, 30 Jul 2025 23:10:29 GMT
Title: On the Complexity of Finding Stationary Points in Nonconvex Simple Bilevel Optimization
Authors: Jincheng Cao, Ruichen Jiang, Erfan Yazdandoost Hamedani, Aryan Mokhtari,
Abstract summary: We show that a simple implementable variant of the dynamic gradient can effectively solve the simple bilevel problems.<n>This is the first result-time algorithm that guarantees joint station for both levels in general non simple bilevel problems.
Score: 16.709026203727007
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we study the problem of solving a simple bilevel optimization problem, where the upper-level objective is minimized over the solution set of the lower-level problem. We focus on the general setting in which both the upper- and lower-level objectives are smooth but potentially nonconvex. Due to the absence of additional structural assumptions for the lower-level objective-such as convexity or the Polyak-{\L}ojasiewicz (PL) condition-guaranteeing global optimality is generally intractable. Instead, we introduce a suitable notion of stationarity for this class of problems and aim to design a first-order algorithm that finds such stationary points in polynomial time. Intuitively, stationarity in this setting means the upper-level objective cannot be substantially improved locally without causing a larger deterioration in the lower-level objective. To this end, we show that a simple and implementable variant of the dynamic barrier gradient descent (DBGD) framework can effectively solve the considered nonconvex simple bilevel problems up to stationarity. Specifically, to reach an $(\epsilon_f, \epsilon_g)$-stationary point-where $\epsilon_f$ and $\epsilon_g$ denote the target stationarity accuracies for the upper- and lower-level objectives, respectively-the considered method achieves a complexity of $\mathcal{O}\left(\max\left(\epsilon_f^{-\frac{3+p}{1+p}}, \epsilon_g^{-\frac{3+p}{2}}\right)\right)$, where $p \geq 0$ is an arbitrary constant balancing the terms. To the best of our knowledge, this is the first complexity result for a discrete-time algorithm that guarantees joint stationarity for both levels in general nonconvex simple bilevel problems.

Related papers

An Accelerated Gradient Method for Convex Smooth Simple Bilevel Optimization [16.709026203727007]
We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem. We measure the performance of our method in terms of suboptimality and infeasibility errors.
arXiv Detail & Related papers (2024-02-12T22:34:53Z)
Adaptive Mirror Descent Bilevel Optimization [25.438298531555468]
We propose a class of efficient adaptive bilevel methods based on mirror descent for non bifraclevel optimization. We provide an analysis for methods under some conditions, and prove that our methods have a fast number of iterations.
arXiv Detail & Related papers (2023-11-08T08:17:09Z)
Accelerating Inexact HyperGradient Descent for Bilevel Optimization [84.00488779515206]
We present a method for solving general non-strongly-concave bilevel optimization problems. Our results also improve upon the existing complexity for finding second-order stationary points in non-strongly-concave problems.
arXiv Detail & Related papers (2023-06-30T20:36:44Z)
A Generalized Alternating Method for Bilevel Learning under the Polyak-{\L}ojasiewicz Condition [63.66516306205932]
Bilevel optimization has recently regained interest owing to its applications in emerging machine learning fields. Recent results have shown that simple alternating iteration-based iterations can match interest owing to convex lower-level objective.
arXiv Detail & Related papers (2023-06-04T17:54:11Z)
On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis [18.08351275534193]
Bilevel optimization reveals the inner structure of otherwise oblique optimization problems.<n>A common goal in bilevel optimization is to a hyper-objective that implicitly depends on the solution of the set of factors.
arXiv Detail & Related papers (2023-01-02T15:09:12Z)
A Conditional Gradient-based Method for Simple Bilevel Optimization with Convex Lower-level Problem [18.15207779559351]
We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem via a cutting plane. Our method achieves best-known assumption for the considered class of bilevel problems.
arXiv Detail & Related papers (2022-06-17T16:12:47Z)
Efficiently Escaping Saddle Points in Bilevel Optimization [48.925688192913]
Bilevel optimization is one of the problems in machine learning. Recent developments in bilevel optimization converge on the first fundamental nonaptature multi-step analysis.
arXiv Detail & Related papers (2022-02-08T07:10:06Z)
A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum- Single-timescale Approximation (MSTSA) -- for tackling problems. MSTSA allows us to control the error in iterations due to inaccurate solution to the lower level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z)
Efficient Methods for Structured Nonconvex-Nonconcave Min-Max Optimization [98.0595480384208]
We propose a generalization extraient spaces which converges to a stationary point. The algorithm applies not only to general $p$-normed spaces, but also to general $p$-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z)
Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization [64.26238893241322]
Simple algorithms have been shown to lead to good empirical results in many contexts. Several works have pursued rigorous analytical justification for studying non optimization problems. A key insight in these analyses is that perturbations play a critical role in allowing local descent algorithms.
arXiv Detail & Related papers (2020-03-31T16:54:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.