A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk
Minimization
- URL: http://arxiv.org/abs/2302.08766v4
- Date: Tue, 20 Feb 2024 08:46:49 GMT
- Title: A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk
Minimization
- Authors: Mathieu Dagr\'eou, Thomas Moreau, Samuel Vaiter, Pierre Ablin
- Abstract summary: We propose a bilevel extension of the celebrated SARAH algorithm.
We demonstrate that the algorithm requires $mathcalO((n+m)frac12varepsilon-1)$ oracle calls to achieve $varepsilon$-stationarity.
We provide a lower bound on the number of oracle calls required to get an approximate stationary point of the objective function of the bilevel problem.
- Score: 23.401092942765196
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bilevel optimization problems, which are problems where two optimization
problems are nested, have more and more applications in machine learning. In
many practical cases, the upper and the lower objectives correspond to
empirical risk minimization problems and therefore have a sum structure. In
this context, we propose a bilevel extension of the celebrated SARAH algorithm.
We demonstrate that the algorithm requires
$\mathcal{O}((n+m)^{\frac12}\varepsilon^{-1})$ oracle calls to achieve
$\varepsilon$-stationarity with $n+m$ the total number of samples, which
improves over all previous bilevel algorithms. Moreover, we provide a lower
bound on the number of oracle calls required to get an approximate stationary
point of the objective function of the bilevel problem. This lower bound is
attained by our algorithm, making it optimal in terms of sample complexity.
Related papers
- Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization [25.438298531555468]
Bilevel optimization is widely applied in many machine learning tasks such as hyper learning, meta learning and reinforcement learning.
We propose an efficient Hessian/BiO method with the optimal convergence $frac1TT) under some mild conditions.
We conduct some some experiments on the bilevel game hyper-stationary numerical convergence.
arXiv Detail & Related papers (2024-07-25T07:25:06Z) - Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully
First-Order Oracles [14.697733592222658]
We show that a first-order method can converge at the near-optimal $tilde mathcalO(epsilon-2)$ rate as second-order methods.
Our analysis further leads to simple first-order algorithms that achieve similar convergence rates for finding second-order stationary points.
arXiv Detail & Related papers (2023-06-26T17:07:54Z) - Alternating Implicit Projected SGD and Its Efficient Variants for
Equality-constrained Bilevel Optimization [41.10094500516342]
This paper considers the bilevel optimization problems both in the equality constraints and constrained upper-level problems.
By leveraging the equality constraints approach, first, an alternating implicit projection SGD approach is used for unconstrained bilevel problems.
arXiv Detail & Related papers (2022-11-14T03:47:43Z) - A Fully Single Loop Algorithm for Bilevel Optimization without Hessian
Inverse [121.54116938140754]
We propose a new Hessian inverse free Fully Single Loop Algorithm for bilevel optimization problems.
We show that our algorithm converges with the rate of $O(epsilon-2)$.
arXiv Detail & Related papers (2021-12-09T02:27:52Z) - Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on Bregman Bregman functions.
We also propose an accelerated version of SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z) - Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve the complexity of $mathcalO(epsilon-1.5)$, which outperforms all existing algorithms by the order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum- Single-timescale Approximation (MSTSA) -- for tackling problems.
MSTSA allows us to control the error in iterations due to inaccurate solution to the lower level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - Lower Bounds and Accelerated Algorithms for Bilevel Optimization [62.192297758346484]
Bilevel optimization has recently attracted growing interests due to its wide applications in modern machine learning problems.
We show that our results are more challenging than that of minimax applications.
arXiv Detail & Related papers (2021-02-07T21:46:29Z) - A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis
and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.