First-order penalty methods for bilevel optimization
- URL: http://arxiv.org/abs/2301.01716v2
- Date: Thu, 7 Mar 2024 15:42:33 GMT
- Title: First-order penalty methods for bilevel optimization
- Authors: Zhaosong Lu and Sanyou Mei
- Abstract summary: We show a class of $varepsilon$ complexity solution for unconstrained and constrained bilevel optimization problems.
We also propose first-order penalty methods for finding an $epsilon$KKT solution of the unconstrained and constrained bilevel problems.
- Score: 1.2691047660244337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we study a class of unconstrained and constrained bilevel
optimization problems in which the lower level is a possibly nonsmooth convex
optimization problem, while the upper level is a possibly nonconvex
optimization problem. We introduce a notion of $\varepsilon$-KKT solution for
them and show that an $\varepsilon$-KKT solution leads to an
$O(\sqrt{\varepsilon})$- or $O(\varepsilon)$-hypergradient based stionary point
under suitable assumptions. We also propose first-order penalty methods for
finding an $\varepsilon$-KKT solution of them, whose subproblems turn out to be
a structured minimax problem and can be suitably solved by a first-order method
recently developed by the authors. Under suitable assumptions, an
\emph{operation complexity} of $O(\varepsilon^{-4}\log\varepsilon^{-1})$ and
$O(\varepsilon^{-7}\log\varepsilon^{-1})$, measured by their fundamental
operations, is established for the proposed penalty methods for finding an
$\varepsilon$-KKT solution of the unconstrained and constrained bilevel
optimization problems, respectively. Preliminary numerical results are
presented to illustrate the performance of our proposed methods. To the best of
our knowledge, this paper is the first work to demonstrate that bilevel
optimization can be approximately solved as minimax optimization, and moreover,
it provides the first implementable method with complexity guarantees for such
sophisticated bilevel optimization.
Related papers
- A first-order method for nonconvex-strongly-concave constrained minimax optimization [2.735801286587348]
We propose a first-order subproblems method for non-strongly-con constrained minimax problems.<n>We find an $epsilon$-KK solution which improves the previous best-known operation.
arXiv Detail & Related papers (2025-12-28T12:31:56Z) - Solving bilevel optimization via sequential minimax optimization [2.735801286587348]
We study a class of constrained bilevel optimization problems in which the part is a possibly non-smooth convex part.<n>We find that the SMO method improves the complexity of the lower-level objective functions.<n>Preliminary results compare to the recently developed first-order penalty performance.
arXiv Detail & Related papers (2025-11-10T18:51:05Z) - An Accelerated Gradient Method for Convex Smooth Simple Bilevel Optimization [16.709026203727007]
We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem.
We measure the performance of our method in terms of suboptimality and infeasibility errors.
arXiv Detail & Related papers (2024-02-12T22:34:53Z) - Accelerating Inexact HyperGradient Descent for Bilevel Optimization [84.00488779515206]
We present a method for solving general non-strongly-concave bilevel optimization problems.
Our results also improve upon the existing complexity for finding second-order stationary points in non-strongly-concave problems.
arXiv Detail & Related papers (2023-06-30T20:36:44Z) - Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully
First-Order Oracles [14.697733592222658]
We show that a first-order method can converge at the near-optimal $tilde mathcalO(epsilon-2)$ rate as second-order methods.
Our analysis further leads to simple first-order algorithms that achieve similar convergence rates for finding second-order stationary points.
arXiv Detail & Related papers (2023-06-26T17:07:54Z) - On Momentum-Based Gradient Methods for Bilevel Optimization with
Nonconvex Lower-Level [25.438298531555468]
Bilevel optimization is a popular process in machine learning tasks.
In this paper, we investigate the non-representation problem of bilevel PL game.
We show that our method improves the existing best results by a factor of $tO(Enabla F(x)leq epsilon$)
arXiv Detail & Related papers (2023-03-07T14:55:05Z) - Optimal Stochastic Non-smooth Non-convex Optimization through
Online-to-Non-convex Conversion [56.92236659731376]
We present new algorithms for optimizing non-known, non-smooth objectives based on a novel analysis technique.
For deterministic second-order smooth objectives, applying advanced optimistic online learning techniques enables a new $O(delta0.5) all$ to recover optimal or best-known results.
arXiv Detail & Related papers (2023-02-07T22:09:20Z) - A Fully First-Order Method for Stochastic Bilevel Optimization [8.663726907303303]
We consider unconstrained bilevel optimization problems when only the first-order gradient oracles are available.
We propose a Fully First-order Approximation (F2SA) method, and study its non-asymptotic convergence properties.
We demonstrate even superior practical performance of the proposed method over existing second-order based approaches on MNIST data-hypercleaning experiments.
arXiv Detail & Related papers (2023-01-26T05:34:21Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) that minimizes a twice different subject general non conic optimization.
In particular, we propose a Newton-CG based-augmentedconjugate method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of emphcon unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterations converge to an $epsilon$-saddle point within $O(epsilon-2/3)$ in terms of a restricted function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - A Conditional Gradient-based Method for Simple Bilevel Optimization with
Convex Lower-level Problem [18.15207779559351]
We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem via a cutting plane.
Our method achieves best-known assumption for the considered class of bilevel problems.
arXiv Detail & Related papers (2022-06-17T16:12:47Z) - Accelerated first-order methods for convex optimization with locally
Lipschitz continuous gradient [0.0]
We first consider unconstrained convex optimization with Lipschitz continuous gradient (LLCG) and propose accelerated proximal gradient (APG) methods for solving it.
The proposed APG methods are equipped with a verifiable termination criterion and enjoy an operation complexity of $cal O(varepsilon-1/2log varepsilon-1)$.
Preliminary numerical results are presented to demonstrate the performance of our proposed methods.
arXiv Detail & Related papers (2022-06-02T10:34:26Z) - BiAdam: Fast Adaptive Bilevel Optimization Methods [104.96004056928474]
Bilevel optimization has attracted increased interest in machine learning due to its many applications.
We provide a useful analysis framework for both the constrained and unconstrained optimization.
arXiv Detail & Related papers (2021-06-21T20:16:40Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum- Single-timescale Approximation (MSTSA) -- for tackling problems.
MSTSA allows us to control the error in iterations due to inaccurate solution to the lower level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we denote the non-strongly setting on the magnitude of a gradient-free minimax optimization problem.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.