Provably Faster Algorithms for Bilevel Optimization
- URL: http://arxiv.org/abs/2106.04692v1
- Date: Tue, 8 Jun 2021 21:05:30 GMT
- Title: Provably Faster Algorithms for Bilevel Optimization
- Authors: Junjie Yang, Kaiyi Ji, Yingbin Liang
- Abstract summary: Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve the complexity of $\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude.
- Score: 54.83583213812667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bilevel optimization has been widely applied in many important machine
learning applications such as hyperparameter optimization and meta-learning.
Recently, several momentum-based algorithms have been proposed to solve bilevel
optimization problems faster. However, those momentum-based algorithms do not
achieve provably better computational complexity than
$\mathcal{O}(\epsilon^{-2})$ of the SGD-based algorithm. In this paper, we
propose two new algorithms for bilevel optimization, where the first algorithm
adopts momentum-based recursive iterations, and the second algorithm adopts
recursive gradient estimations in nested loops to decrease the variance. We
show that both algorithms achieve the complexity of
$\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by
an order of magnitude. Our experiments validate our theoretical results and
demonstrate the superior empirical performance of our algorithms in
hyperparameter applications. Our codes for MRBO, VRBO and other benchmarks are
available online.
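The precise MRBO and VRBO algorithms are specified in the paper; as a rough illustration of the momentum-based recursion (the MRBO-style update, with the same sample reused at consecutive iterates), here is a minimal Python sketch on a toy quadratic bilevel problem where the hypergradient is available in closed form. All constants and step sizes are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.normal(size=(n, n)) / np.sqrt(n)
b = rng.normal(size=n)
c = 0.1

def hypergrad(x, xi):
    # Toy bilevel problem: the lower-level solution y*(x) = A @ x is known
    # in closed form, so Phi(x) = 0.5||A x - b||^2 + 0.5 c ||x||^2 and its
    # gradient is exact; xi stands in for stochastic sampling noise.
    return A.T @ (A @ x - b) + c * x + 0.01 * xi

x = np.zeros(n)
alpha, eta = 0.4, 0.1                    # illustrative momentum / step size
d = hypergrad(x, rng.normal(size=n))     # initial direction
for t in range(300):
    x_prev, x = x, x - eta * d
    # Momentum-based recursion: reuse the previous direction, corrected by
    # the gradient difference at consecutive iterates. Evaluating both
    # terms with the SAME sample xi is what reduces the variance.
    xi = rng.normal(size=n)
    d = hypergrad(x, xi) + (1 - alpha) * (d - hypergrad(x_prev, xi))
print("final ||grad Phi||:", np.linalg.norm(A.T @ (A @ x - b) + c * x))
```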
Related papers
- A Single-Loop Algorithm for Decentralized Bilevel Optimization [11.67135350286933]
We propose a novel single-loop algorithm for solving decentralized bilevel optimization with a strongly convex lower-level problem.
Our approach is a fully single-loop method that approximates the hypergradient using only two matrix-vector multiplications per iteration.
Our analysis demonstrates that the proposed algorithm achieves the best-known convergence rate for bilevel optimization algorithms.
arXiv Detail & Related papers (2023-11-15T13:29:49Z)
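The cited algorithm is decentralized; the sketch below is a simplified single-node illustration of the underlying trick, approximating the hypergradient with only matrix-vector multiplications per iteration (no Hessian inverse) via an auxiliary variable, on a toy quadratic problem. All symbols and step sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
nx, ny = 4, 6
M = rng.normal(size=(ny, ny))
H = M @ M.T / ny + np.eye(ny)            # Hess_yy g = H, strongly convex
P = rng.normal(size=(ny, nx)) / np.sqrt(ny)
b = rng.normal(size=ny)
c = 0.5

# g(x, y) = 0.5 y'Hy - y'Px (lower level), f(x, y) = 0.5||y-b||^2 + 0.5c||x||^2
x, y, v = np.zeros(nx), np.zeros(ny), np.zeros(ny)
gamma, beta, eta = 0.1, 0.1, 0.05
for t in range(2000):
    y = y - gamma * (H @ y - P @ x)      # one lower-level gradient step
    v = v - beta * (H @ v - (y - b))     # matvec #1: H @ v tracks H^{-1}(y-b)
    x = x - eta * (c * x + P.T @ v)      # matvec #2: cross-derivative times v

y_star = np.linalg.solve(H, P @ x)       # exact hypergradient, for checking only
exact = c * x + P.T @ np.linalg.solve(H, y_star - b)
print("||grad Phi|| at final x:", np.linalg.norm(exact))
```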
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
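The reinforcement-learning surrogate itself is not reproduced here; as background, a minimal sketch of the plain cutting-plane (Kelley) loop that such methods aim to accelerate, using `scipy.optimize.linprog`. The objective and box bounds are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Kelley's cutting-plane method for min f(x) over a box; f(x) = ||x - a||^2
# here, but any convex f with a (sub)gradient oracle works.
a = np.array([0.3, -0.7])
f = lambda x: float(np.sum((x - a) ** 2))
df = lambda x: 2.0 * (x - a)

n, x, best = 2, np.zeros(2), np.inf
cuts_A, cuts_b = [], []
for k in range(30):
    fx, gx = f(x), df(x)
    best = min(best, fx)
    # New cut from convexity: f(z) >= f(x) + g'(z - x), i.e. g'z - t <= g'x - f(x).
    cuts_A.append(np.append(gx, -1.0))
    cuts_b.append(gx @ x - fx)
    # Minimize t over (z, t) subject to all cuts and the box z in [-2, 2]^n.
    res = linprog(np.append(np.zeros(n), 1.0),
                  A_ub=np.array(cuts_A), b_ub=np.array(cuts_b),
                  bounds=[(-2.0, 2.0)] * n + [(None, None)])
    x, lower = res.x[:n], res.x[n]
    if best - lower < 1e-8:              # model lower bound met the incumbent
        break
print("argmin ~", x)
```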
- Improved Rate of First Order Algorithms for Entropic Optimal Transport [2.1485350418225244]
This paper improves the state-of-the-art rate of a first-order algorithm for solving entropy-regularized optimal transport.
We propose an accelerated primal-dual mirror descent algorithm with variance reduction.
Our algorithm may inspire more research to develop accelerated primal-dual algorithms that have rate $\widetilde{O}(n^2/\epsilon)$ for solving OT.
arXiv Detail & Related papers (2023-01-23T19:13:25Z)
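The paper's accelerated primal-dual mirror-descent method is not reproduced here; as background on the problem it solves, a minimal Sinkhorn iteration for entropy-regularized OT (the marginals, cost matrix, and regularization strength below are illustrative).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
mu = np.full(n, 1.0 / n)                 # source marginal
nu = np.full(n, 1.0 / n)                 # target marginal
C = np.abs(rng.normal(size=(n, n)))      # cost matrix
eps = 0.1                                # entropic regularization strength

K = np.exp(-C / eps)                     # Gibbs kernel
u = np.ones(n)
for _ in range(500):
    v = nu / (K.T @ u)                   # alternate dual scaling updates
    u = mu / (K @ v)
plan = u[:, None] * K * v[None, :]       # transport plan diag(u) K diag(v)
print("transport cost:", np.sum(plan * C))
```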
- A Fully Single Loop Algorithm for Bilevel Optimization without Hessian Inverse [121.54116938140754]
We propose a new Hessian-inverse-free, fully single-loop algorithm for bilevel optimization problems.
We show that our algorithm converges at the rate of $\mathcal{O}(\epsilon^{-2})$.
arXiv Detail & Related papers (2021-12-09T02:27:52Z)
- Bilevel Optimization for Machine Learning: Algorithm Design and Convergence Analysis [12.680169619392695]
This thesis provides a comprehensive convergence rate analysis for bilevel optimization algorithms.
For the problem-based formulation, we provide a convergence rate analysis for AID- and ITD-based bilevel algorithms.
We then develop accelerated bilevel algorithms, for which we provide sharper convergence analysis with relaxed assumptions.
arXiv Detail & Related papers (2021-07-31T22:05:47Z)
- BiAdam: Fast Adaptive Bilevel Optimization Methods [104.96004056928474]
Bilevel optimization has attracted increased interest in machine learning due to its many applications.
We provide a useful analysis framework for both constrained and unconstrained optimization.
arXiv Detail & Related papers (2021-06-21T20:16:40Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
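The learned heuristic is the paper's contribution and is not reproduced; below is a minimal best-first tree-search sketch for the integer least-squares problem with a zero heuristic (plain uniform-cost search), which is the slot a HATS-style network estimate would fill. Problem sizes and the candidate window are illustrative assumptions.

```python
import heapq
import numpy as np

# min_{x in Z^n} ||y - A x||^2 via QR and best-first search over levels.
rng = np.random.default_rng(3)
m, n = 8, 4
A = rng.normal(size=(m, n))
x_true = rng.integers(-3, 4, size=n)
y = A @ x_true + 0.1 * rng.normal(size=m)

Q, R = np.linalg.qr(A)                   # ||y - Ax||^2 = ||z - Rx||^2 + const
z = Q.T @ y

# Nodes: (accumulated cost, level, assigned tail (x_level, ..., x_{n-1})).
# With a zero heuristic this is uniform-cost search; HATS would add a
# learned cost-to-go estimate to the priority.
heap = [(0.0, n, ())]
while heap:
    cost, lvl, tail = heapq.heappop(heap)
    if lvl == 0:
        x_hat = np.array(tail)
        break
    i = lvl - 1
    resid = z[i] - R[i, i + 1:] @ np.array(tail, dtype=float)
    center = int(round(resid / R[i, i]))
    # Small window around the conditional (Babai) estimate keeps the
    # sketch finite; exact decoders enumerate candidates adaptively.
    for c in (center - 1, center, center + 1):
        step = (resid - R[i, i] * c) ** 2
        heapq.heappush(heap, (cost + step, i, (c,) + tail))
print("estimate:", x_hat, "truth:", x_true)
```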
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point per step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
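As a minimal illustration of the single-timescale, single-loop template, here is stochastic gradient descent-ascent with one noisy sample per step on a toy strongly-convex-strongly-concave min-max problem; the paper's TD-learning objective and guarantees are not reproduced.

```python
import numpy as np

# Toy saddle-point problem f(x, y) = 0.5 x^2 + x y - 0.5 y^2.
rng = np.random.default_rng(4)
grad_x = lambda x, y: x + y            # d/dx, convex in x
grad_y = lambda x, y: x - y            # d/dy, concave in y

x, y = 2.0, -1.5
eta = 0.05                             # one step size for both players
for t in range(3000):
    noise = rng.normal(size=2)         # "one data point" per iteration
    gx = grad_x(x, y) + 0.1 * noise[0]
    gy = grad_y(x, y) + 0.1 * noise[1]
    x, y = x - eta * gx, y + eta * gy  # simultaneous primal/dual update
print(x, y)                            # drifts toward the saddle point (0, 0)
```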
- Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization [21.555331273873175]
We consider the task of decentralized minimization of the sum of smooth strongly convex functions stored across the nodes of a network.
We propose two new algorithms for this decentralized optimization problem and equip them with complexity guarantees.
arXiv Detail & Related papers (2020-06-21T11:23:20Z)
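As background for this setting, a minimal sketch of plain decentralized gradient descent with gossip averaging over a ring network; the paper's optimal accelerated methods refine this basic template. The network, local objectives, and step size below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
N, d = 5, 3
targets = rng.normal(size=(N, d))      # f_i(x) = 0.5||x - targets[i]||^2

# Symmetric, doubly stochastic mixing matrix for a ring of N nodes.
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = 0.5
    W[i, (i - 1) % N] = 0.25
    W[i, (i + 1) % N] = 0.25

X = np.zeros((N, d))                   # row i holds node i's local iterate
eta = 0.1
for t in range(500):
    grads = X - targets                # local gradients, no central server
    X = W @ X - eta * grads            # gossip with neighbors, then descend
# Constant-step DGD reaches a neighborhood of the average minimizer.
print(X.mean(axis=0), targets.mean(axis=0))
```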