Related papers: Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity

Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity

URL: http://arxiv.org/abs/2210.12839v2
Date: Wed, 31 May 2023 23:36:56 GMT
Title: Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity
Authors: Xuxing Chen, Minhui Huang, Shiqian Ma, Krishnakumar Balasubramanian
Abstract summary: We propose a novel decentralized bilevel optimization (DSBO) algorithm that only requires first order oracle, Hessian-vector product and Jacobian-vector product. The advantage of our algorithm is that it does not require estimating the full Hessian and Jacobian matrices, thereby having improved per-it complexity.
Score: 17.870370505179014
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Bilevel optimization recently has received tremendous attention due to its great success in solving important machine learning problems like meta learning, reinforcement learning, and hyperparameter optimization. Extending single-agent training on bilevel problems to the decentralized setting is a natural generalization, and there has been a flurry of work studying decentralized bilevel optimization algorithms. However, it remains unknown how to design the distributed algorithm with sample complexity and convergence rate comparable to SGD for stochastic optimization, and at the same time without directly computing the exact Hessian or Jacobian matrices. In this paper we propose such an algorithm. More specifically, we propose a novel decentralized stochastic bilevel optimization (DSBO) algorithm that only requires first order stochastic oracle, Hessian-vector product and Jacobian-vector product oracle. The sample complexity of our algorithm matches the currently best known results for DSBO, and the advantage of our algorithm is that it does not require estimating the full Hessian and Jacobian matrices, thereby having improved per-iteration complexity.

Related papers

Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling [96.47086913559289]
gradient-based algorithms are widely used in bilevel optimization. We introduce a without-replacement sampling based algorithm which achieves a faster convergence rate. We validate our algorithms over both synthetic and real-world applications.
arXiv Detail & Related papers (2024-11-07T17:05:31Z)
Faster Adaptive Decentralized Learning Algorithms [24.379734053137597]
We propose a class of faster distributed non decentralized algorithms (i.e. AdaMDOS and AdaMDOF) for adaptive learning and finite-sum optimization. Some experimental results demonstrate efficiency of our algorithms.
arXiv Detail & Related papers (2024-08-19T08:05:33Z)
A Single-Loop Algorithm for Decentralized Bilevel Optimization [11.67135350286933]
We propose a novel single-loop algorithm for solving decentralized bilevel optimization with a strongly convex lower-level problem. Our approach is a fully single-loop method that approximates the hypergradient using only two matrix-vector multiplications per iteration. Our analysis demonstrates that the proposed algorithm achieves the best-known convergence rate for bilevel optimization algorithms.
arXiv Detail & Related papers (2023-11-15T13:29:49Z)
Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms. Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability. We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
Decentralized Multi-Level Compositional Optimization Algorithms with Level-Independent Convergence Rate [26.676582181833584]
Decentralized multi-level optimization is challenging because of the multilevel structure and decentralized communication. We develop two novel decentralized optimization algorithms to optimize the multi-level compositional problem.
arXiv Detail & Related papers (2023-06-06T00:23:28Z)
Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability [107.65337427333064]
optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning. In this work, we present the first trial in the single-dependent generalization of AUPRC optimization. Experiments on three image retrieval datasets on speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z)
ES-Based Jacobian Enables Faster Bilevel Optimization [53.675623215542515]
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems. Existing gradient-based methods require second-order derivative approximations via Jacobian- or/and Hessian-vector computations. We propose a novel BO algorithm, which adopts Evolution Strategies (ES) based method to approximate the response Jacobian matrix in the hypergradient of BO.
arXiv Detail & Related papers (2021-10-13T19:36:50Z)
Bilevel Optimization for Machine Learning: Algorithm Design and Convergence Analysis [12.680169619392695]
This thesis provides a comprehensive convergence rate analysis for bilevel optimization algorithms. For the problem-based formulation, we provide a convergence rate analysis for AID- and ITD-based bilevel algorithms. We then develop acceleration bilevel algorithms, for which we provide shaper convergence analysis with relaxed assumptions.
arXiv Detail & Related papers (2021-07-31T22:05:47Z)
Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on Bregman Bregman functions. We also propose an accelerated version of SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z)
Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications. We propose two new algorithms for bilevel optimization. We show that both algorithms achieve the complexity of $mathcalO(epsilon-1.5)$, which outperforms all existing algorithms by the order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z)
Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems. We propose a novel stoc-efficientgradient estimator named stoc-BiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.