Bilevel Optimization: Convergence Analysis and Enhanced Design
- URL: http://arxiv.org/abs/2010.07962v3
- Date: Fri, 27 Aug 2021 17:32:20 GMT
- Title: Bilevel Optimization: Convergence Analysis and Enhanced Design
- Authors: Kaiyi Ji, Junjie Yang and Yingbin Liang
- Abstract summary: Bilevel optimization is a tool for many machine learning problems.
We propose a novel algorithm named stocBiO, which features a sample-efficient hypergradient estimator.
- Score: 63.64636047748605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bilevel optimization has arisen as a powerful tool for many machine learning
problems such as meta-learning, hyperparameter optimization, and reinforcement
learning. In this paper, we investigate the nonconvex-strongly-convex bilevel
optimization problem. For deterministic bilevel optimization, we provide a
comprehensive convergence rate analysis for two popular algorithms respectively
based on approximate implicit differentiation (AID) and iterative
differentiation (ITD). For the AID-based method, we orderwisely improve the
previous convergence rate analysis due to a more practical parameter selection
as well as a warm start strategy, and for the ITD-based method we establish the
first theoretical convergence rate. Our analysis also provides a quantitative
comparison between ITD and AID based approaches. For stochastic bilevel
optimization, we propose a novel algorithm named stocBiO, which features a
sample-efficient hypergradient estimator using efficient Jacobian- and
Hessian-vector product computations. We provide the convergence rate guarantee
for stocBiO, and show that stocBiO outperforms the best known computational
complexities orderwisely with respect to the condition number $\kappa$ and the
target accuracy $\epsilon$. We further validate our theoretical results and
demonstrate the efficiency of bilevel optimization algorithms by the
experiments on meta-learning and hyperparameter optimization.
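To make the hypergradient machinery above concrete, here is a minimal, illustrative sketch of AID-style hypergradient estimation in the spirit of stocBiO, using only Hessian- and Jacobian-vector products plus a Neumann-series approximation of the Hessian inverse. The toy quadratic objectives, step sizes, and iteration counts are assumptions for illustration, not the authors' exact configuration; the paper's stochastic estimator additionally works over mini-batch samples.
```python
import jax
import jax.numpy as jnp

def g(x, y):
    # Lower-level objective, strongly convex in y (toy quadratic example).
    return 0.5 * jnp.sum((y - x) ** 2) + 0.1 * jnp.sum(y ** 2)

def f(x, y):
    # Upper-level objective (toy example).
    return 0.5 * jnp.sum(y ** 2) + jnp.sum(x * y)

def hypergradient(x, y0, inner_steps=50, neumann_terms=20, eta=0.1):
    # 1) Approximate y*(x) by inner gradient descent (warm-started at y0).
    y = y0
    for _ in range(inner_steps):
        y = y - eta * jax.grad(g, argnums=1)(x, y)

    # 2) Approximate v = [grad^2_yy g]^{-1} grad_y f with the Neumann series
    #    eta * sum_k (I - eta * H)^k, using Hessian-vector products only.
    fy = jax.grad(f, argnums=1)(x, y)
    hvp = lambda u: jax.grad(lambda yy: jnp.vdot(jax.grad(g, argnums=1)(x, yy), u))(y)
    p, v = fy, fy
    for _ in range(neumann_terms):
        p = p - eta * hvp(p)
        v = v + p
    v = eta * v

    # 3) Hypergradient: grad_x f - grad^2_xy g @ v, via one Jacobian-vector product.
    fx = jax.grad(f, argnums=0)(x, y)
    cross = jax.grad(lambda xx: jnp.vdot(jax.grad(g, argnums=1)(xx, y), v))(x)
    return fx - cross, y  # return y to warm-start the next outer iteration
```
The design point mirrored from the abstract is that no Hessian or Jacobian is ever materialized: everything is expressed through Hessian- and Jacobian-vector products, which is what keeps the estimator computation- and memory-efficient.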
Related papers
- A Single-Loop Algorithm for Decentralized Bilevel Optimization [11.67135350286933]
We propose a novel single-loop algorithm for solving decentralized bilevel optimization with a strongly convex lower-level problem.
Our approach is a fully single-loop method that approximates the hypergradient using only two matrix-vector multiplications per iteration.
Our analysis demonstrates that the proposed algorithm achieves the best-known convergence rate for bilevel optimization algorithms.
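As a rough single-node illustration of the single-loop idea (the paper's decentralized communication and consensus steps are omitted, and the toy objectives and step sizes are assumptions), one can maintain the lower-level iterate y and an auxiliary vector z that tracks the Hessian-inverse-vector product, updating x, y, and z once per iteration at the cost of one Hessian-vector and one Jacobian-vector product:
```python
import jax
import jax.numpy as jnp

def g(x, y):  # lower-level objective, strongly convex in y (toy example)
    return 0.5 * jnp.sum((y - x) ** 2) + 0.1 * jnp.sum(y ** 2)

def f(x, y):  # upper-level objective (toy example)
    return 0.5 * jnp.sum(y ** 2) + jnp.sum(x * y)

@jax.jit
def step(x, y, z, alpha=0.05, beta=0.1, gamma=0.1):
    gy = jax.grad(g, argnums=1)
    # Hessian-vector product  grad^2_yy g(x, y) @ z  (matrix-vector product #1).
    hvp = jax.grad(lambda yy: jnp.vdot(gy(x, yy), z))(y)
    # Jacobian-vector product grad^2_xy g(x, y) @ z  (matrix-vector product #2).
    jvp = jax.grad(lambda xx: jnp.vdot(gy(xx, y), z))(x)
    y_new = y - beta * gy(x, y)                               # lower-level step
    z_new = z - gamma * (hvp - jax.grad(f, argnums=1)(x, y))  # track H^{-1} grad_y f
    x_new = x - alpha * (jax.grad(f, argnums=0)(x, y) - jvp)  # hypergradient step
    return x_new, y_new, z_new

x, y, z = jnp.ones(3), jnp.zeros(3), jnp.zeros(3)
for _ in range(200):
    x, y, z = step(x, y, z)
```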
arXiv Detail & Related papers (2023-11-15T13:29:49Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes federated conditional stochastic optimization algorithms for distributed machine learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Decentralized Multi-Level Compositional Optimization Algorithms with Level-Independent Convergence Rate [26.676582181833584]
Decentralized multi-level optimization is challenging because of the multi-level structure and decentralized communication.
We develop two novel decentralized optimization algorithms to optimize the multi-level compositional problem.
arXiv Detail & Related papers (2023-06-06T00:23:28Z) - A Fast and Convergent Proximal Algorithm for Regularized Nonconvex and
Nonsmooth Bi-level Optimization [26.68351521813062]
Existing bi-level algorithms cannot handle nonconvex or nonsmooth regularizers.
In this paper, we show that an approximate implicit differentiation (AID) scheme can be used to speed up a range of machine learning applications.
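For intuition, a nonsmooth regularizer such as the l1 norm (an illustrative choice, not necessarily the paper's) can be handled by following any hypergradient step with a proximal step. A minimal sketch, assuming `hypergrad` comes from an AID-style routine like the one after the main abstract, with `lr` and `lam` as illustrative parameters:
```python
import jax.numpy as jnp

def soft_threshold(u, tau):
    # Closed-form proximal operator of tau * ||.||_1.
    return jnp.sign(u) * jnp.maximum(jnp.abs(u) - tau, 0.0)

def proximal_outer_step(x, hypergrad, lr=0.05, lam=0.01):
    # Gradient step on the smooth part, then prox on the nonsmooth regularizer.
    return soft_threshold(x - lr * hypergrad, lr * lam)
```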
arXiv Detail & Related papers (2022-03-30T18:53:04Z) - Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
By using this framework, our analysis shows these algorithms to match the computational complexity of methods that have access to an unbiased estimate of the gradient.
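A minimal sketch of the warm-start amortization idea under assumed toy objectives: instead of re-solving the inner problem from scratch at every outer step, carry the previous inner iterate forward so only a handful of inner steps is needed per outer iteration. The implicit-correction term is elided here for brevity (cf. the AID sketch after the main abstract).
```python
import jax
import jax.numpy as jnp

g = lambda x, y: 0.5 * jnp.sum((y - x) ** 2) + 0.1 * jnp.sum(y ** 2)  # inner
f = lambda x, y: 0.5 * jnp.sum(y ** 2) + jnp.sum(x * y)               # outer

x, y = jnp.ones(3), jnp.zeros(3)
for t in range(100):
    # Warm start: resume from the previous y, so a few inner steps suffice
    # instead of a full inner solve at every outer iteration.
    for _ in range(5):
        y = y - 0.1 * jax.grad(g, argnums=1)(x, y)
    # Crude outer step using only the direct gradient at the approximate
    # y*(x); a complete method adds the implicit-correction term.
    x = x - 0.05 * jax.grad(f, argnums=0)(x, y)
```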
arXiv Detail & Related papers (2021-11-29T15:10:09Z) - ES-Based Jacobian Enables Faster Bilevel Optimization [53.675623215542515]
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems.
Existing gradient-based methods require second-order derivative approximations via Jacobian- and/or Hessian-vector computations.
We propose a novel BO algorithm, which adopts Evolution Strategies (ES) based method to approximate the response Jacobian matrix in the hypergradient of BO.
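The ES idea can be illustrated by estimating the response-Jacobian term of the hypergradient, (dy*(x)/dx)^T v, from Gaussian perturbations of x and forward evaluations of an approximate inner solver only, so no second-order derivatives are needed. The `inner_solve` routine, toy objective, and all constants below are illustrative assumptions, not the paper's exact estimator.
```python
import jax
import jax.numpy as jnp

def inner_solve(x, steps=50, eta=0.1):
    # Approximates y*(x) = argmin_y g(x, y) by gradient descent (toy g).
    g = lambda xx, yy: 0.5 * jnp.sum((yy - xx) ** 2) + 0.1 * jnp.sum(yy ** 2)
    y = jnp.zeros_like(x)
    for _ in range(steps):
        y = y - eta * jax.grad(g, argnums=1)(x, y)
    return y

def es_jacobian_transpose_vp(x, v, key, num_samples=64, sigma=0.01):
    # (1 / (n * sigma)) * sum_i eps_i * <y*(x + sigma * eps_i) - y*(x), v>,
    # an ES-style estimate of (dy*/dx)^T v using E[eps eps^T] = I.
    y0 = inner_solve(x)
    eps = jax.random.normal(key, (num_samples,) + x.shape)
    def one(e):
        return e * jnp.vdot(inner_solve(x + sigma * e) - y0, v)
    return jnp.mean(jax.vmap(one)(eps), axis=0) / sigma
```
The trade-off is the usual ES one: each sample costs a full forward inner solve, but the estimator touches only zeroth-order information about the response map.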
arXiv Detail & Related papers (2021-10-13T19:36:50Z) - Bilevel Optimization for Machine Learning: Algorithm Design and
Convergence Analysis [12.680169619392695]
This thesis provides a comprehensive convergence rate analysis for bilevel optimization algorithms.
For the problem-based formulation, we provide a convergence rate analysis for AID- and ITD-based bilevel algorithms.
We then develop accelerated bilevel algorithms, for which we provide sharper convergence analysis with relaxed assumptions.
arXiv Detail & Related papers (2021-07-31T22:05:47Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
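A minimal sketch of a ZO coordinate-wise gradient estimate with importance sampling: coordinates are drawn according to a probability vector `probs` and each finite-difference estimate is reweighted by 1/p_i, keeping the estimator unbiased. The sampling probabilities, smoothing radius, and toy objective are illustrative assumptions, not the paper's scheme.
```python
import jax
import jax.numpy as jnp

def zo_grad_importance(fn, x, probs, key, num_queries=16, mu=1e-3):
    d = x.shape[0]
    idx = jax.random.choice(key, d, shape=(num_queries,), p=probs)
    grad = jnp.zeros_like(x)
    for i in idx:
        e = jnp.zeros(d).at[i].set(1.0)
        fd = (fn(x + mu * e) - fn(x - mu * e)) / (2 * mu)   # central difference
        grad = grad.at[i].add(fd / (probs[i] * num_queries))  # reweight by 1/p_i
    return grad

# Usage on a toy quadratic, with probabilities biased toward the coordinates
# whose curvature makes them matter most.
fn = lambda z: jnp.sum(jnp.array([3.0, 1.0, 0.1]) * z ** 2)
probs = jnp.array([0.6, 0.3, 0.1])
g_est = zo_grad_importance(fn, jnp.ones(3), probs, jax.random.PRNGKey(0))
```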
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization [71.03797261151605]
Adaptivity is an important yet under-studied property in modern optimization theory.
Our algorithm is proved to achieve the best-available convergence rate for non-PL objectives while simultaneously outperforming existing algorithms for PL objectives.
arXiv Detail & Related papers (2020-02-13T05:42:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.