Second-order optimization with lazy Hessians
- URL: http://arxiv.org/abs/2212.00781v3
- Date: Thu, 15 Jun 2023 12:25:04 GMT
- Title: Second-order optimization with lazy Hessians
- Authors: Nikita Doikov, El Mahdi Chayti, Martin Jaggi
- Abstract summary: We analyze Newton's method with lazy Hessian updates for solving general possibly non-convex optimization problems.
We reuse a previously seen Hessian for several iterations while computing new gradients at each step of the method.
- Score: 55.51077907483634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We analyze Newton's method with lazy Hessian updates for solving general
possibly non-convex optimization problems. We propose to reuse a previously
seen Hessian for several iterations while computing new gradients at each step
of the method. This significantly reduces the overall arithmetical complexity
of second-order optimization schemes. By using the cubic regularization
technique, we establish fast global convergence of our method to a second-order
stationary point, while the Hessian does not need to be updated each iteration.
For convex problems, we justify global and local superlinear rates for lazy
Newton steps with quadratic regularization, which is easier to compute. The
optimal frequency for updating the Hessian is once every $d$ iterations, where
$d$ is the dimension of the problem. This provably improves the total
arithmetical complexity of second-order algorithms by a factor $\sqrt{d}$.
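As a rough illustration of the scheme described in the abstract, the following minimal sketch performs lazy Newton steps with quadratic regularization, recomputing the Hessian only once every m iterations (m = d by default, matching the optimal update frequency stated above). The function handles f_grad and f_hess, the fixed regularization constant reg, and the test problem are illustrative assumptions, not the paper's exact method, which uses an adaptive regularization.

```python
import numpy as np

def lazy_newton(f_grad, f_hess, x0, n_iters=100, m=None, reg=1e-3):
    """Sketch of lazy Newton steps with quadratic regularization.

    The Hessian is recomputed only once every m iterations (m = d by
    default), while a fresh gradient is evaluated at every step.
    """
    x = np.asarray(x0, dtype=float)
    d = x.size
    m = d if m is None else m
    H = None
    for t in range(n_iters):
        g = f_grad(x)                      # new gradient at every step
        if t % m == 0:                     # lazy update: otherwise reuse H
            H = f_hess(x)
        # Quadratically regularized Newton step (the "easier to compute"
        # variant mentioned for convex problems); reg > 0 keeps the
        # linear system well posed even with an outdated Hessian.
        x = x - np.linalg.solve(H + reg * np.eye(d), g)
    return x

# Example: a regularized log-sum-exp objective (illustrative test problem)
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 10))

    def grad(x):
        z = A @ x
        p = np.exp(z - z.max())
        p /= p.sum()
        return A.T @ p + 0.1 * x

    def hess(x):
        z = A @ x
        p = np.exp(z - z.max())
        p /= p.sum()
        return A.T @ (np.diag(p) - np.outer(p, p)) @ A + 0.1 * np.eye(10)

    x_final = lazy_newton(grad, hess, np.zeros(10))
    print("final gradient norm:", np.linalg.norm(grad(x_final)))
```

Setting m = d amortizes each expensive Hessian evaluation and factorization over d cheaper gradient-only steps, which is, roughly, where the claimed $\sqrt{d}$ saving in total arithmetic comes from.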
Related papers
- First and zeroth-order implementations of the regularized Newton method
with lazy approximated Hessians [4.62316736194615]
We develop first-order (Hessian-free) and zeroth-order (derivative-free) implementations of the regularized Newton method for general non-convex optimization problems.
We also equip our algorithms with the lazy Hessian update that reuses a previously computed Hessian approximation matrix for several iterations (a finite-difference sketch of this idea is given after this list).
arXiv Detail & Related papers (2023-09-05T17:40:54Z) - Accelerating Cutting-Plane Algorithms via Reinforcement Learning
Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z) - An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization [37.300102993926046]
We study the complexity of producing approximately stationary points of Lipschitz objectives which are possibly neither smooth nor convex, using only zero-order evaluations.
Our analysis is based on a simple yet powerful property of the Goldstein subdifferential set, which allows leveraging recent advancements in nonsmooth nonconvex optimization.
arXiv Detail & Related papers (2023-07-10T11:56:04Z) - Faster Riemannian Newton-type Optimization by Subsampling and Cubic
Regularization [3.867143522757309]
This work is on constrained large-scale non-convex optimization where the constraint set implies a manifold structure.
We propose a new second-order Riemannian optimization algorithm, aiming at improving convergence and reducing computational cost.
arXiv Detail & Related papers (2023-02-22T00:37:44Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterates converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - A Fully Single Loop Algorithm for Bilevel Optimization without Hessian
Inverse [121.54116938140754]
We propose a new Hessian-inverse-free Fully Single Loop Algorithm for bilevel optimization problems.
We show that our algorithm converges with the rate of $O(\epsilon^{-2})$.
arXiv Detail & Related papers (2021-12-09T02:27:52Z) - Newton-LESS: Sparsification without Trade-offs for the Sketched Newton
Update [88.73437209862891]
In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration.
We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching.
We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings.
arXiv Detail & Related papers (2021-07-15T17:33:05Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- Momentum-assisted Single-Timescale Stochastic Approximation (MSTSA) -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in its iterations caused by inexact solutions to the lower-level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z)
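To make the lazy-approximation idea from the first related paper above a bit more tangible, here is a hedged sketch that builds a Hessian estimate from finite differences of gradients and reuses it for several regularized Newton steps. The helper names, the fixed step size h, and the fixed regularization reg are assumptions for illustration only, not the cited paper's exact method.

```python
import numpy as np

def fd_hessian(f_grad, x, h=1e-5):
    """Approximate the Hessian by finite differences of the gradient along
    coordinate directions (a first-order, Hessian-free estimate)."""
    d = x.size
    g0 = f_grad(x)
    H = np.empty((d, d))
    for i in range(d):
        e = np.zeros(d)
        e[i] = h
        H[:, i] = (f_grad(x + e) - g0) / h
    return 0.5 * (H + H.T)               # symmetrize the estimate

def lazy_approx_newton(f_grad, x0, n_iters=50, m=None, reg=1e-2):
    """Regularized Newton steps that reuse the approximated Hessian for m
    iterations, in the spirit of the lazy updates discussed above."""
    x = np.asarray(x0, dtype=float)
    d = x.size
    m = d if m is None else m
    H = None
    for t in range(n_iters):
        if t % m == 0:                    # refresh the approximation lazily
            H = fd_hessian(f_grad, x)
        g = f_grad(x)
        x = x - np.linalg.solve(H + reg * np.eye(d), g)
    return x
```

Each refresh costs d extra gradient evaluations, which is exactly the kind of expense that reusing the approximation over several iterations is meant to amortize.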
This list is automatically generated from the titles and abstracts of the papers in this site.