Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence
- URL: http://arxiv.org/abs/2410.02626v1
- Date: Thu, 3 Oct 2024 16:08:16 GMT
- Title: Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence
- Authors: Ruichen Jiang, Aryan Mokhtari,
- Abstract summary: We prove a global convergence rate of $O(min1/k,sqrtd/k1.25)$ in terms of the duality gap.
These results are the first global convergence results to demonstrate a provable advantage of a quasi-Newton method over the extragradient method.
- Score: 20.766358513158206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a quasi-Newton method for solving smooth and monotone nonlinear equations, including unconstrained minimization and minimax optimization as special cases. For the strongly monotone setting, we establish two global convergence bounds: (i) a linear convergence rate that matches the rate of the celebrated extragradient method, and (ii) an explicit global superlinear convergence rate that provably surpasses the linear convergence rate after at most ${O}(d)$ iterations, where $d$ is the problem's dimension. In addition, for the case where the operator is only monotone, we prove a global convergence rate of ${O}(\min\{{1}/{k},{\sqrt{d}}/{k^{1.25}}\})$ in terms of the duality gap. This matches the rate of the extragradient method when $k = {O}(d^2)$ and is faster when $k = \Omega(d^2)$. These results are the first global convergence results to demonstrate a provable advantage of a quasi-Newton method over the extragradient method, without querying the Jacobian of the operator. Unlike classical quasi-Newton methods, we achieve this by using the hybrid proximal extragradient framework and a novel online learning approach for updating the Jacobian approximation matrices. Specifically, guided by the convergence analysis, we formulate the Jacobian approximation update as an online convex optimization problem over non-symmetric matrices, relating the regret of the online problem to the convergence rate of our method. To facilitate efficient implementation, we further develop a tailored online learning algorithm based on an approximate separation oracle, which preserves structures such as symmetry and sparsity in the Jacobian matrices.
Related papers
- Incremental Quasi-Newton Methods with Faster Superlinear Convergence
Rates [50.36933471975506]
We consider the finite-sum optimization problem, where each component function is strongly convex and has Lipschitz continuous gradient and Hessian.
The recently proposed incremental quasi-Newton method is based on BFGS update and achieves a local superlinear convergence rate.
This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework.
arXiv Detail & Related papers (2024-02-04T05:54:51Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linearahead as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Strictly Low Rank Constraint Optimization -- An Asymptotically
$\mathcal{O}(\frac{1}{t^2})$ Method [5.770309971945476]
We propose a class of non-text and non-smooth problems with textitrank regularization to promote sparsity in optimal solution.
We show that our algorithms are able to achieve a singular convergence of $Ofrac(t2)$, which is exactly same as Nesterov's optimal convergence for first-order methods on smooth convex problems.
arXiv Detail & Related papers (2023-07-04T16:55:41Z) - Limited-Memory Greedy Quasi-Newton Method with Non-asymptotic
Superlinear Convergence Rate [37.49160762239869]
We present a Limited-memory Greedy BFGS (LG-BFGS) method that can achieve an explicit non-asymptotic superlinear rate.
Our established non-asymptotic superlinear convergence rate demonstrates an explicit trade-off between the convergence speed and memory requirement.
arXiv Detail & Related papers (2023-06-27T12:59:56Z) - Accelerated Quasi-Newton Proximal Extragradient: Faster Rate for Smooth
Convex Optimization [26.328847475942894]
We prove that our method can achieve a convergence rate of $Obigl(minfrac1k2, fracsqrtdlog kk2.5bigr)$, where $d$ is the problem dimension and $k$ is the number of iterations.
To the best of our knowledge, this result is the first to demonstrate a provable gain of a quasi-Newton-type method over Nesterov's accelerated gradient.
arXiv Detail & Related papers (2023-06-03T23:31:27Z) - Online Learning Guided Curvature Approximation: A Quasi-Newton Method
with Global Non-Asymptotic Superlinear Convergence [22.286753988825048]
We present the first globally convergent quasi-Newton method with an explicit non-asymptotic superlinear convergence rate.
Unlike classical quasi-Newton methods, we build our algorithm upon the hybrid proximal extragradient method.
arXiv Detail & Related papers (2023-02-16T20:58:09Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) that minimizes a twice different subject general non conic optimization.
In particular, we propose a Newton-CG based-augmentedconjugate method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Min-Max Optimization Made Simple: Approximating the Proximal Point
Method via Contraction Maps [77.8999425439444]
We present a first-order method that admits near-optimal convergence rates for convex/concave min-max problems.
Our work is based on the fact that the update rule of the Proximal Point method can be approximated up to accuracy.
arXiv Detail & Related papers (2023-01-10T12:18:47Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of emphcon unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterations converge to an $epsilon$-saddle point within $O(epsilon-2/3)$ in terms of a restricted function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Leveraging Non-uniformity in First-order Non-convex Optimization [93.6817946818977]
Non-uniform refinement of objective functions leads to emphNon-uniform Smoothness (NS) and emphNon-uniform Lojasiewicz inequality (NL)
New definitions inspire new geometry-aware first-order methods that converge to global optimality faster than the classical $Omega (1/t2)$ lower bounds.
arXiv Detail & Related papers (2021-05-13T04:23:07Z) - Non-asymptotic Superlinear Convergence of Standard Quasi-Newton Methods [26.328847475942894]
We prove the non-asymptotic superlinear convergence rate of the Broyden class of quasi-Newton algorithms.
Our results are first to provide a non-asymptotic superlinear convergence rate for quasi-Newton methods.
arXiv Detail & Related papers (2020-03-30T16:42:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.