Related papers: LancBiO: dynamic Lanczos-aided bilevel optimization via Krylov subspace

LancBiO: dynamic Lanczos-aided bilevel optimization via Krylov subspace

URL: http://arxiv.org/abs/2404.03331v1
Date: Thu, 4 Apr 2024 09:57:29 GMT
Title: LancBiO: dynamic Lanczos-aided bilevel optimization via Krylov subspace
Authors: Bin Gao, Yan Yang, Ya-xiang Yuan,
Abstract summary: In this paper, we construct a sequence of low-dimensional approximate Krylov subspaces with the aid of the Lanczos process. The constructed subspace is able to dynamically and incrementally approximate the Hessian inverse vector product. We also propose aprovable subspace-based framework for bilevel problems where one central step is to solve a small-size tridiagonal linear system.
Score: 4.917399520581689
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Bilevel optimization, with broad applications in machine learning, has an intricate hierarchical structure. Gradient-based methods have emerged as a common approach to large-scale bilevel problems. However, the computation of the hyper-gradient, which involves a Hessian inverse vector product, confines the efficiency and is regarded as a bottleneck. To circumvent the inverse, we construct a sequence of low-dimensional approximate Krylov subspaces with the aid of the Lanczos process. As a result, the constructed subspace is able to dynamically and incrementally approximate the Hessian inverse vector product with less effort and thus leads to a favorable estimate of the hyper-gradient. Moreover, we propose a~provable subspace-based framework for bilevel problems where one central step is to solve a small-size tridiagonal linear system. To the best of our knowledge, this is the first time that subspace techniques are incorporated into bilevel optimization. This successful trial not only enjoys $\mathcal{O}(\epsilon^{-1})$ convergence rate but also demonstrates efficiency in a synthetic problem and two deep learning tasks.

Related papers

qNBO: quasi-Newton Meets Bilevel Optimization [26.0555315825777]
Bilevel optimization, addressing challenges in hierarchical learning tasks, has gained significant interest in machine learning. We introduce a general framework to address these computational challenges in a coordinated manner. Specifically, we leverage quasi-Newton algorithms to accelerate the resolution of the lower-level problem while efficiently approximating the inverse Hessian-vector product.
arXiv Detail & Related papers (2025-02-03T05:36:45Z)
Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling [96.47086913559289]
gradient-based algorithms are widely used in bilevel optimization. We introduce a without-replacement sampling based algorithm which achieves a faster convergence rate. We validate our algorithms over both synthetic and real-world applications.
arXiv Detail & Related papers (2024-11-07T17:05:31Z)
Convex Bi-Level Optimization Problems with Non-smooth Outer Objective Function [0.0]
We show that Bi-SG tackles bi-level optimization problems and sub-linear rates both in terms of the inner and outer objective functions. We prove that the distance of the generated sequence to the set of optimal solutions of the bi-level problem converges to zero.
arXiv Detail & Related papers (2023-07-17T05:03:53Z)
A Generalized Alternating Method for Bilevel Learning under the Polyak-{\L}ojasiewicz Condition [63.66516306205932]
Bilevel optimization has recently regained interest owing to its applications in emerging machine learning fields. Recent results have shown that simple alternating iteration-based iterations can match interest owing to convex lower-level objective.
arXiv Detail & Related papers (2023-06-04T17:54:11Z)
Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems convex compact sets. We leverage the structure of the objective by handling differentiable and non-differentiable separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
Efficient Gradient Approximation Method for Constrained Bilevel Optimization [2.0305676256390934]
Bilevel optimization has been developed with large-scale high-dimensional data. This paper considers a constrained bilevel problem with convex and non-differentiable approximations.
arXiv Detail & Related papers (2023-02-03T19:34:56Z)
A Constrained Optimization Approach to Bilevel Optimization with Multiple Inner Minima [49.320758794766185]
We propose a new approach, which convert the bilevel problem to an equivalent constrained optimization, and then the primal-dual algorithm can be used to solve the problem. Such an approach enjoys a few advantages including (a) addresses the multiple inner minima challenge; (b) fully first-order efficiency without Jacobian computations.
arXiv Detail & Related papers (2022-03-01T18:20:01Z)
Inexact bilevel stochastic gradient methods for constrained and unconstrained lower-level problems [0.0]
Two-level formula search optimization has become instrumental in a number of machine learning contexts. New low-rank bi-level gradient methods are developed that do not require second-order derivatives.
arXiv Detail & Related papers (2021-10-01T18:20:14Z)
Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on Bregman Bregman functions. We also propose an accelerated version of SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z)
A Value-Function-based Interior-point Method for Non-convex Bi-level Optimization [38.75417864443519]
Bi-level optimization model is able to capture a wide range of complex learning tasks with practical interest. We propose a new interior Bi-level Value-based Interior-point scheme, we penalize the regularized value function of the lower level problem into the upper level objective.
arXiv Detail & Related papers (2021-06-15T09:10:40Z)
Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem. We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent. Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.