Related papers: Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization

URL: http://arxiv.org/abs/2407.17823v1
Date: Thu, 25 Jul 2024 07:25:06 GMT
Title: Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization
Authors: Feihu Huang,
Abstract summary: Bilevel optimization is widely applied in many machine learning tasks such as hyper-parameter learning, meta learning and reinforcement learning. We propose an efficient Hessian/Jacobian-free method (HJFBiO) with the optimal convergence rate of $O(\frac{1}{T})$ under some mild conditions. We conduct numerical experiments on the bilevel PL game and hyper-representation learning task.
Score: 25.438298531555468
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Bilevel optimization is widely applied in many machine learning tasks such as hyper-parameter learning, meta learning and reinforcement learning. Although many algorithms have recently been developed to solve bilevel optimization problems, they generally rely on (strongly) convex lower-level problems. More recently, some methods have been proposed to solve nonconvex-PL bilevel optimization problems, where the upper-level problems are possibly nonconvex, and the lower-level problems are also possibly nonconvex while satisfying the Polyak-Łojasiewicz (PL) condition. However, these methods still have a high convergence complexity or a high computation complexity, such as requiring the computation of expensive Hessian/Jacobian matrices and their inverses. In this paper, we therefore propose an efficient Hessian/Jacobian-free method (i.e., HJFBiO) with the optimal convergence complexity to solve nonconvex-PL bilevel problems. Theoretically, under some mild conditions, we prove that our HJFBiO method obtains an optimal convergence rate of $O(\frac{1}{T})$, where $T$ denotes the number of iterations, and has an optimal gradient complexity of $O(\epsilon^{-1})$ in finding an $\epsilon$-stationary solution. We conduct numerical experiments on the bilevel PL game and hyper-representation learning task to demonstrate the efficiency of our proposed method.

Related papers

Stochastic Momentum Methods for Non-smooth Non-Convex Finite-Sum Coupled Compositional Optimization [64.99236464953032]
We achieve a new state-of-the-art complexity for finding a (nearly) $\epsilon$-level KKT solution.
arXiv Detail & Related papers (2025-06-03T06:31:59Z)
Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling [96.47086913559289]
Gradient-based algorithms are widely used in bilevel optimization. We introduce a without-replacement sampling based algorithm which achieves a faster convergence rate. We validate our algorithms on both synthetic and real-world applications.
arXiv Detail & Related papers (2024-11-07T17:05:31Z)
An Accelerated Gradient Method for Convex Smooth Simple Bilevel Optimization [16.709026203727007]
We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem. We measure the performance of our method in terms of suboptimality and infeasibility errors.
arXiv Detail & Related papers (2024-02-12T22:34:53Z)
Adaptive Mirror Descent Bilevel Optimization [25.438298531555468]
We propose a class of efficient adaptive bilevel methods based on mirror descent for nonconvex bilevel optimization. We provide a convergence analysis for our methods under some conditions, and prove that they have a fast convergence rate.
arXiv Detail & Related papers (2023-11-08T08:17:09Z)
On Momentum-Based Gradient Methods for Bilevel Optimization with Nonconvex Lower-Level [25.438298531555468]
Bilevel optimization is a popular process in machine learning tasks. In this paper, we investigate the non-representation problem of the bilevel PL game. We show that our method improves the existing best results for finding an $\epsilon$-stationary solution (i.e., $\mathbb{E}\|\nabla F(x)\| \leq \epsilon$).
arXiv Detail & Related papers (2023-03-07T14:55:05Z)
On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis [18.08351275534193]
Bilevel optimization reveals the inner structure of otherwise oblique optimization problems. A common goal in bilevel optimization is to minimize a hyper-objective that implicitly depends on the solution of the set of factors.
arXiv Detail & Related papers (2023-01-02T15:09:12Z)
A Conditional Gradient-based Method for Simple Bilevel Optimization with Convex Lower-level Problem [18.15207779559351]
We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem via a cutting plane. Our method achieves best-known assumption for the considered class of bilevel problems.
arXiv Detail & Related papers (2022-06-17T16:12:47Z)
A Constrained Optimization Approach to Bilevel Optimization with Multiple Inner Minima [49.320758794766185]
We propose a new approach, which converts the bilevel problem to an equivalent constrained optimization problem, so that a primal-dual algorithm can be used to solve it. Such an approach enjoys a few advantages, including (a) addressing the multiple inner minima challenge; and (b) fully first-order efficiency without Jacobian computations.
arXiv Detail & Related papers (2022-03-01T18:20:01Z)
Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on Bregman distance. We also propose an accelerated version of the SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z)
BiAdam: Fast Adaptive Bilevel Optimization Methods [104.96004056928474]
Bilevel optimization has attracted increased interest in machine learning due to its many applications. We provide a useful analysis framework for both the constrained and unconstrained optimization.
arXiv Detail & Related papers (2021-06-21T20:16:40Z)
Randomized Stochastic Variance-Reduced Methods for Stochastic Bilevel Optimization [62.87181271021217]
We consider non-SBO problems that have many applications in machine learning. This paper proposes fast randomized algorithms for non-SBO problems.
arXiv Detail & Related papers (2021-05-05T18:28:42Z)
Lower Bounds and Accelerated Algorithms for Bilevel Optimization [62.192297758346484]
Bilevel optimization has recently attracted growing interest due to its wide applications in modern machine learning problems. We show that our results are more challenging than those of minimax applications.
arXiv Detail & Related papers (2021-02-07T21:46:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.