Decentralized Stochastic Bilevel Optimization with Improved
per-Iteration Complexity
- URL: http://arxiv.org/abs/2210.12839v2
- Date: Wed, 31 May 2023 23:36:56 GMT
- Title: Decentralized Stochastic Bilevel Optimization with Improved
per-Iteration Complexity
- Authors: Xuxing Chen, Minhui Huang, Shiqian Ma, Krishnakumar Balasubramanian
- Abstract summary: We propose a novel decentralized bilevel optimization (DSBO) algorithm that only requires first order oracle, Hessian-vector product and Jacobian-vector product.
The advantage of our algorithm is that it does not require estimating the full Hessian and Jacobian matrices, thereby having improved per-it complexity.
- Score: 17.870370505179014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bilevel optimization recently has received tremendous attention due to its
great success in solving important machine learning problems like meta
learning, reinforcement learning, and hyperparameter optimization. Extending
single-agent training on bilevel problems to the decentralized setting is a
natural generalization, and there has been a flurry of work studying
decentralized bilevel optimization algorithms. However, it remains unknown how
to design the distributed algorithm with sample complexity and convergence rate
comparable to SGD for stochastic optimization, and at the same time without
directly computing the exact Hessian or Jacobian matrices. In this paper we
propose such an algorithm. More specifically, we propose a novel decentralized
stochastic bilevel optimization (DSBO) algorithm that only requires first order
stochastic oracle, Hessian-vector product and Jacobian-vector product oracle.
The sample complexity of our algorithm matches the currently best known results
for DSBO, and the advantage of our algorithm is that it does not require
estimating the full Hessian and Jacobian matrices, thereby having improved
per-iteration complexity.
Related papers
- Faster Adaptive Decentralized Learning Algorithms [24.379734053137597]
We propose a class of faster distributed non decentralized algorithms (i.e. AdaMDOS and AdaMDOF) for adaptive learning and finite-sum optimization.
Some experimental results demonstrate efficiency of our algorithms.
arXiv Detail & Related papers (2024-08-19T08:05:33Z) - A Single-Loop Algorithm for Decentralized Bilevel Optimization [11.67135350286933]
We propose a novel single-loop algorithm for solving decentralized bilevel optimization with a strongly convex lower-level problem.
Our approach is a fully single-loop method that approximates the hypergradient using only two matrix-vector multiplications per iteration.
Our analysis demonstrates that the proposed algorithm achieves the best-known convergence rate for bilevel optimization algorithms.
arXiv Detail & Related papers (2023-11-15T13:29:49Z) - Accelerating Cutting-Plane Algorithms via Reinforcement Learning
Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z) - Decentralized Multi-Level Compositional Optimization Algorithms with Level-Independent Convergence Rate [26.676582181833584]
Decentralized multi-level optimization is challenging because of the multilevel structure and decentralized communication.
We develop two novel decentralized optimization algorithms to optimize the multi-level compositional problem.
arXiv Detail & Related papers (2023-06-06T00:23:28Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the single-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets on speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - ES-Based Jacobian Enables Faster Bilevel Optimization [53.675623215542515]
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems.
Existing gradient-based methods require second-order derivative approximations via Jacobian- or/and Hessian-vector computations.
We propose a novel BO algorithm, which adopts Evolution Strategies (ES) based method to approximate the response Jacobian matrix in the hypergradient of BO.
arXiv Detail & Related papers (2021-10-13T19:36:50Z) - Bilevel Optimization for Machine Learning: Algorithm Design and
Convergence Analysis [12.680169619392695]
This thesis provides a comprehensive convergence rate analysis for bilevel optimization algorithms.
For the problem-based formulation, we provide a convergence rate analysis for AID- and ITD-based bilevel algorithms.
We then develop acceleration bilevel algorithms, for which we provide shaper convergence analysis with relaxed assumptions.
arXiv Detail & Related papers (2021-07-31T22:05:47Z) - Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on Bregman Bregman functions.
We also propose an accelerated version of SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z) - Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve the complexity of $mathcalO(epsilon-1.5)$, which outperforms all existing algorithms by the order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel stoc-efficientgradient estimator named stoc-BiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.