Optimal Algorithms for Stochastic Bilevel Optimization under Relaxed
Smoothness Conditions
- URL: http://arxiv.org/abs/2306.12067v1
- Date: Wed, 21 Jun 2023 07:32:29 GMT
- Title: Optimal Algorithms for Stochastic Bilevel Optimization under Relaxed
Smoothness Conditions
- Authors: Xuxing Chen, Tesi Xiao, Krishnakumar Balasubramanian
- Abstract summary: We present a novel fully single-loop and Hessian-inversion-free algorithmic framework for stochastic bilevel optimization.
We show that, with a slight modification, our approach can handle a more general multi-objective robust bilevel optimization problem.
- Score: 9.518010235273785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic Bilevel optimization usually involves minimizing an upper-level
(UL) function that is dependent on the arg-min of a strongly-convex lower-level
(LL) function. Several algorithms utilize Neumann series to approximate certain
matrix inverses involved in estimating the implicit gradient of the UL function
(hypergradient). The state-of-the-art StOchastic Bilevel Algorithm (SOBA) [16]
instead uses stochastic gradient descent steps to solve the linear system
associated with the explicit matrix inversion. This modification enables SOBA
to match the lower bound of sample complexity for the single-level counterpart
in non-convex settings. Unfortunately, the current analysis of SOBA relies on
the assumption of higher-order smoothness for the UL and LL functions to
achieve optimality. In this paper, we introduce a novel fully single-loop and
Hessian-inversion-free algorithmic framework for stochastic bilevel
optimization and present a tighter analysis under standard smoothness
assumptions (first-order Lipschitzness of the UL function and second-order
Lipschitzness of the LL function). Furthermore, we show that by a slight
modification of our approach, our algorithm can handle a more general
multi-objective robust bilevel optimization problem. For this case, we obtain
the state-of-the-art oracle complexity results demonstrating the generality of
both the proposed algorithmic and analytic frameworks. Numerical experiments
demonstrate the performance gain of the proposed algorithms over existing ones.
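To make the mechanism concrete: for $F(x) = f(x, y^*(x))$, the hypergradient is $\nabla F(x) = \nabla_x f(x, y^*) - \nabla^2_{xy} g(x, y^*)\, v^*$, where $v^*$ solves the linear system $\nabla^2_{yy} g(x, y^*)\, v = \nabla_y f(x, y^*)$. The sketch below is not the paper's algorithm or released code; it illustrates, on a hypothetical deterministic quadratic bilevel problem with hand-picked step sizes, the SOBA-style idea of replacing explicit Hessian inversion (or its Neumann-series approximation) with plain gradient steps on that linear system, run in a single loop alongside the lower-level and upper-level updates.

```python
# A minimal single-loop, Hessian-inversion-free hypergradient sketch (illustration
# only; not the paper's algorithm or released code). Assumptions: a toy deterministic
# bilevel problem with g(x, y) = 0.5*||y - A x||^2 (strongly convex in y),
# f(x, y) = 0.5*||y - b||^2, exact oracles in place of stochastic ones, and
# hand-picked step sizes.
import numpy as np

rng = np.random.default_rng(0)
d = 3
A = np.eye(d) + 0.1 * rng.standard_normal((d, d))   # well-conditioned, so y*(x) = A x
b = rng.standard_normal(d)

grad_y_g = lambda x, y: y - A @ x        # gradient of g in y
hvp_yy_g = lambda x, y, v: v             # Hessian-vector product (d^2 g / dy dy) v  (identity here)
hvp_xy_g = lambda x, y, v: -A.T @ v      # cross Hessian-vector product (d^2 g / dx dy) v
grad_x_f = lambda x, y: np.zeros(d)      # gradient of f in x
grad_y_f = lambda x, y: y - b            # gradient of f in y

x, y, v = np.zeros(d), np.zeros(d), np.zeros(d)
alpha, beta, gamma = 0.5, 0.5, 0.1       # step sizes (illustrative)
for _ in range(3000):
    y = y - alpha * grad_y_g(x, y)                        # track y*(x) = argmin_y g(x, y)
    v = v - beta * (hvp_yy_g(x, y, v) - grad_y_f(x, y))   # gradient step on the linear system
    h = grad_x_f(x, y) - hvp_xy_g(x, y, v)                # hypergradient estimate
    x = x - gamma * h                                     # upper-level descent step

print("single-loop estimate:", x)
print("closed-form argmin  :", np.linalg.solve(A, b))     # minimizer of f(x, y*(x)) = 0.5*||A x - b||^2
```

Swapping the exact oracles above for minibatch estimates of the gradients and Hessian-vector products recovers the stochastic, fully single-loop structure described in the abstract.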
Related papers
- Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling [96.47086913559289]
Gradient-based algorithms are widely used in bilevel optimization.
We introduce a without-replacement sampling based algorithm which achieves a faster convergence rate.
We validate our algorithms over both synthetic and real-world applications.
arXiv Detail & Related papers (2024-11-07T17:05:31Z) - An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness [15.656614304616006]
This paper investigates a class of bilevel optimization problems where the upper-level function is nonconvex with potentially unbounded smoothness and the lower-level problem is strongly convex.
These problems have significant applications in sequential data learning, such as text classification using recurrent neural networks.
arXiv Detail & Related papers (2024-09-28T02:30:44Z) - Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm only has access to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
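As a side note on the oracle model in that setting (access only to noisy function values), below is a generic two-point zeroth-order gradient estimator; it is a standard construction shown for illustration, not necessarily the estimator analyzed in the cited work, and the smoothing radius `mu`, probe count, and noisy oracle are illustrative assumptions.

```python
# Generic two-point zeroth-order gradient estimator (standard construction shown for
# illustration; not taken from the cited paper). `noisy_f`, `mu`, and `num_probes`
# are illustrative assumptions.
import numpy as np

def zo_grad_estimate(noisy_f, x, mu=1e-2, num_probes=1000, rng=None):
    """Estimate grad f(x) using only noisy evaluations of f, averaged over random probes."""
    rng = rng or np.random.default_rng()
    g = np.zeros_like(x)
    for _ in range(num_probes):
        u = rng.standard_normal(x.shape)                                   # random probe direction
        g += (noisy_f(x + mu * u) - noisy_f(x - mu * u)) / (2.0 * mu) * u  # symmetric finite difference
    return g / num_probes

# Usage: f(x) = 0.5*||x||^2 observed with small Gaussian noise; the true gradient at x is x itself.
rng = np.random.default_rng(0)
noisy_f = lambda x: 0.5 * np.dot(x, x) + 0.01 * rng.standard_normal()
x = np.ones(4)
print(zo_grad_estimate(noisy_f, x, rng=rng))   # approximately [1, 1, 1, 1]
```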
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity [5.046146134689904]
We show that there is no gap in complexity analysis between bilevel and single-level optimization when implementing SPABA.
We also propose several other double-loop or nested bilevel algorithms to improve the state-of-the-art complexity analysis.
arXiv Detail & Related papers (2024-05-29T05:36:03Z) - Bilevel Optimization under Unbounded Smoothness: A New Algorithm and
Convergence Analysis [17.596465452814883]
Current bilevel optimization algorithms assume that the gradient of the upper-level function is Lipschitz, i.e., that its smoothness is bounded.
Recent studies reveal that certain neural networks instead exhibit unbounded smoothness.
arXiv Detail & Related papers (2024-01-17T20:28:15Z) - Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
Using this framework, our analysis shows that these algorithms match the computational complexity of methods that have access to an unbiased estimate of the gradient.
arXiv Detail & Related papers (2021-11-29T15:10:09Z) - Bilevel Optimization for Machine Learning: Algorithm Design and
Convergence Analysis [12.680169619392695]
This thesis provides a comprehensive convergence rate analysis for bilevel optimization algorithms.
For the problem-based formulation, we provide a convergence rate analysis for AID- and ITD-based bilevel algorithms.
We then develop accelerated bilevel algorithms, for which we provide a sharper convergence analysis under relaxed assumptions.
arXiv Detail & Related papers (2021-07-31T22:05:47Z) - Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on the Bregman distance (SBiO-BreD).
We also propose an accelerated version of the SBiO-BreD method (ASBiO-BreD) using a variance-reduction technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze an Optimistic Adagrad (OAdagrad) variant for nonconvex-nonconcave min-max problems.
Our experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)