First and zeroth-order implementations of the regularized Newton method
with lazy approximated Hessians
- URL: http://arxiv.org/abs/2309.02412v1
- Date: Tue, 5 Sep 2023 17:40:54 GMT
- Title: First and zeroth-order implementations of the regularized Newton method
with lazy approximated Hessians
- Authors: Nikita Doikov, Geovani Nunes Grapiglia
- Abstract summary: We develop first-order (Hessian-free) and zero-order (derivative-free) implementations of the Cubically regularized Newton method for general non-convex problems.
We also equip our algorithms with the lazy Hessian update that reuses a previously computed Hessian approximation matrix for several iterations.
- Score: 4.62316736194615
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we develop first-order (Hessian-free) and zero-order
(derivative-free) implementations of the Cubically regularized Newton method
for solving general non-convex optimization problems. For that, we employ
finite difference approximations of the derivatives. We use a special adaptive
search procedure in our algorithms, which simultaneously fits both the
regularization constant and the parameters of the finite difference
approximations. It makes our schemes free from the need to know the actual
Lipschitz constants. Additionally, we equip our algorithms with the lazy
Hessian update that reuses a previously computed Hessian approximation matrix
for several iterations. Specifically, we prove the global complexity bound of
$\mathcal{O}( n^{1/2} \epsilon^{-3/2})$ function and gradient evaluations for
our new Hessian-free method, and a bound of $\mathcal{O}( n^{3/2}
\epsilon^{-3/2} )$ function evaluations for the derivative-free method, where
$n$ is the dimension of the problem and $\epsilon$ is the desired accuracy for
the gradient norm. These complexity bounds significantly improve the previously
known ones in terms of the joint dependence on $n$ and $\epsilon$ for
first-order and zeroth-order non-convex optimization.
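To make the main ingredients concrete, below is a minimal Python sketch (not the authors' implementation) of a Hessian-free cubically regularized Newton iteration: the Hessian is approximated by forward differences of the gradient, reused "lazily" for several iterations, and the regularization constant is adapted by a simple doubling/halving rule that stands in for the paper's joint adaptive search over the regularization constant and the finite-difference step size. All function and parameter names (fd_hessian, cubic_step, m, delta) are illustrative assumptions.

```python
import numpy as np

def fd_hessian(grad, x, delta=1e-5):
    """Forward-difference Hessian approximation using only gradient calls
    (the 'Hessian-free' oracle): H[:, j] ~ (grad(x + delta*e_j) - grad(x)) / delta."""
    n = x.size
    g0 = grad(x)
    H = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = delta
        H[:, j] = (grad(x + e) - g0) / delta
    return 0.5 * (H + H.T)  # symmetrize

def cubic_step(g, H, M, tol=1e-10, max_iter=200):
    """Minimize <g,h> + 0.5 h^T H h + (M/6)||h||^3 over h.
    Uses an eigendecomposition and bisection on r = ||h||; the so-called
    'hard case' is ignored for simplicity in this sketch."""
    lam, Q = np.linalg.eigh(H)
    gq = Q.T @ g
    h_norm = lambda r: np.linalg.norm(gq / (lam + 0.5 * M * r))
    r_lo = max(0.0, -2.0 * lam.min() / M)
    r_lo = r_lo * (1.0 + 1e-12) + 1e-12       # keep H + (M*r/2) I positive definite
    r_hi = max(1.0, 2.0 * r_lo)
    while h_norm(r_hi) > r_hi:                # bracket the root of ||h(r)|| = r
        r_hi *= 2.0
    for _ in range(max_iter):
        r = 0.5 * (r_lo + r_hi)
        if h_norm(r) > r:
            r_lo = r
        else:
            r_hi = r
        if r_hi - r_lo < tol:
            break
    r = 0.5 * (r_lo + r_hi)
    return Q @ (-gq / (lam + 0.5 * M * r))

def hessian_free_cubic_newton(f, grad, x0, m=5, M=1.0, delta=1e-5,
                              eps=1e-6, max_iters=200):
    """Cubically regularized Newton method with finite-difference Hessians
    recomputed lazily only every m iterations, and a simple doubling/halving
    rule for the regularization constant M (a simplified stand-in for the
    paper's adaptive search that also fits the finite-difference step size)."""
    x = x0.copy()
    H = fd_hessian(grad, x, delta)
    for k in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        if k % m == 0 and k > 0:
            H = fd_hessian(grad, x, delta)    # lazy Hessian refresh
        h = cubic_step(g, H, M)
        if f(x + h) < f(x):                   # accept step, relax regularization
            x = x + h
            M = max(M / 2.0, 1e-8)
        else:                                 # reject step, tighten regularization
            M *= 2.0
    return x

if __name__ == "__main__":
    # Toy usage: a smooth non-convex test function (Rosenbrock).
    def f(x):
        return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2
    def grad(x):
        return np.array([
            -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
            200 * (x[1] - x[0] ** 2),
        ])
    x_star = hessian_free_cubic_newton(f, grad, np.array([-1.2, 1.0]))
    print(x_star)  # should approach [1, 1]
```

A zeroth-order variant would follow the same pattern, with the gradient itself replaced by finite differences of function values, at the cost of the extra factor of $n$ in the complexity bound quoted above.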
Related papers
- Deterministic Nonsmooth Nonconvex Optimization [94.01526844386977]
We show that randomization is necessary to obtain a dimension-free algorithm.
Our algorithm yields the first deterministic dimension-free algorithm for optimizing ReLU networks.
arXiv Detail & Related papers (2023-02-16T13:57:19Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of general nonconvex conic optimization, which minimizes a twice differentiable function subject to conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Second-order optimization with lazy Hessians [55.51077907483634]
We analyze Newton's method with lazy Hessian updates for solving general possibly non-convex optimization problems.
We reuse a previously computed Hessian for several iterations while computing new gradients at each step of the method.
arXiv Detail & Related papers (2022-12-01T18:58:26Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterates converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - A Newton-CG based barrier method for finding a second-order stationary
point of nonconvex conic optimization with complexity guarantees [0.5922488908114022]
We consider finding an approximate second-order stationary point (SOSP) of a problem that minimizes a twice differentiable function over the intersection of an affine subspace and a convex cone.
Our method achieves an iteration complexity of $\mathcal{O}(\epsilon^{-3/2})$, which matches the best known complexity.
arXiv Detail & Related papers (2022-07-12T17:21:45Z) - A Projection-free Algorithm for Constrained Stochastic Multi-level
Composition Optimization [12.096252285460814]
We propose a projection-free conditional gradient-type algorithm for constrained stochastic multi-level composition optimization.
We show that the number of calls to the stochastic oracle and the linear-minimization oracle required by the proposed algorithm are of order $\mathcal{O}_T(\epsilon^{-2})$ and $\mathcal{O}_T(\epsilon^{-3})$, respectively.
arXiv Detail & Related papers (2022-02-09T06:05:38Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We give an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\| \le \epsilon$) using $(\epsilon,\gamma)$-approximate queries at random points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we consider gradient-free (zeroth-order) minimax optimization in the nonconvex-strongly-concave setting.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.