First and zeroth-order implementations of the regularized Newton method
with lazy approximated Hessians
- URL: http://arxiv.org/abs/2309.02412v1
- Date: Tue, 5 Sep 2023 17:40:54 GMT
- Title: First and zeroth-order implementations of the regularized Newton method
with lazy approximated Hessians
- Authors: Nikita Doikov, Geovani Nunes Grapiglia
- Abstract summary: We develop first-order (Hessian-free) and zero-order (derivative-free) implementations of the Cubically regularized Newton method for general non-convex problems.
We also equip our algorithms with the lazy Hessian update that reuses a previously computed Hessian approximation matrix for several iterations.
- Score: 4.62316736194615
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we develop first-order (Hessian-free) and zero-order
(derivative-free) implementations of the Cubically regularized Newton method
for solving general non-convex optimization problems. For that, we employ
finite difference approximations of the derivatives. We use a special adaptive
search procedure in our algorithms, which simultaneously fits both the
regularization constant and the parameters of the finite difference
approximations. It makes our schemes free from the need to know the actual
Lipschitz constants. Additionally, we equip our algorithms with the lazy
Hessian update that reuses a previously computed Hessian approximation matrix
for several iterations. Specifically, we prove the global complexity bound of
$\mathcal{O}( n^{1/2} \epsilon^{-3/2})$ function and gradient evaluations for
our new Hessian-free method, and a bound of $\mathcal{O}( n^{3/2}
\epsilon^{-3/2} )$ function evaluations for the derivative-free method, where
$n$ is the dimension of the problem and $\epsilon$ is the desired accuracy for
the gradient norm. These complexity bounds significantly improve the previously
known ones in terms of the joint dependence on $n$ and $\epsilon$ for
first-order and zeroth-order non-convex optimization.
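To make the main ingredients concrete, below is a minimal Python sketch (not the authors' implementation) of a Hessian-free cubically regularized Newton iteration: the Hessian is approximated by forward differences of the gradient, reused "lazily" for several iterations, and the regularization constant is adapted by a simple doubling/halving rule that stands in for the paper's joint adaptive search over the regularization constant and the finite-difference step size. All function and parameter names (fd_hessian, cubic_step, m, delta) are illustrative assumptions.

```python
import numpy as np

def fd_hessian(grad, x, delta=1e-5):
    """Forward-difference Hessian approximation using only gradient calls
    (the 'Hessian-free' oracle): H[:, j] ~ (grad(x + delta*e_j) - grad(x)) / delta."""
    n = x.size
    g0 = grad(x)
    H = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = delta
        H[:, j] = (grad(x + e) - g0) / delta
    return 0.5 * (H + H.T)  # symmetrize

def cubic_step(g, H, M, tol=1e-10, max_iter=200):
    """Minimize <g,h> + 0.5 h^T H h + (M/6)||h||^3 over h.
    Uses an eigendecomposition and bisection on r = ||h||; the so-called
    'hard case' is ignored for simplicity in this sketch."""
    lam, Q = np.linalg.eigh(H)
    gq = Q.T @ g
    h_norm = lambda r: np.linalg.norm(gq / (lam + 0.5 * M * r))
    r_lo = max(0.0, -2.0 * lam.min() / M)
    r_lo = r_lo * (1.0 + 1e-12) + 1e-12       # keep H + (M*r/2) I positive definite
    r_hi = max(1.0, 2.0 * r_lo)
    while h_norm(r_hi) > r_hi:                # bracket the root of ||h(r)|| = r
        r_hi *= 2.0
    for _ in range(max_iter):
        r = 0.5 * (r_lo + r_hi)
        if h_norm(r) > r:
            r_lo = r
        else:
            r_hi = r
        if r_hi - r_lo < tol:
            break
    r = 0.5 * (r_lo + r_hi)
    return Q @ (-gq / (lam + 0.5 * M * r))

def hessian_free_cubic_newton(f, grad, x0, m=5, M=1.0, delta=1e-5,
                              eps=1e-6, max_iters=200):
    """Cubically regularized Newton method with finite-difference Hessians
    recomputed lazily only every m iterations, and a simple doubling/halving
    rule for the regularization constant M (a simplified stand-in for the
    paper's adaptive search that also fits the finite-difference step size)."""
    x = x0.copy()
    H = fd_hessian(grad, x, delta)
    for k in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        if k % m == 0 and k > 0:
            H = fd_hessian(grad, x, delta)    # lazy Hessian refresh
        h = cubic_step(g, H, M)
        if f(x + h) < f(x):                   # accept step, relax regularization
            x = x + h
            M = max(M / 2.0, 1e-8)
        else:                                 # reject step, tighten regularization
            M *= 2.0
    return x

if __name__ == "__main__":
    # Toy usage: a smooth non-convex test function (Rosenbrock).
    def f(x):
        return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2
    def grad(x):
        return np.array([
            -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
            200 * (x[1] - x[0] ** 2),
        ])
    x_star = hessian_free_cubic_newton(f, grad, np.array([-1.2, 1.0]))
    print(x_star)  # should approach [1, 1]
```

A zeroth-order variant would follow the same pattern, with the gradient itself replaced by finite differences of function values, at the cost of the extra factor of $n$ in the complexity bound quoted above.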
Related papers
- Deterministic Nonsmooth Nonconvex Optimization [94.01526844386977]
We show that randomization is necessary to obtain a dimension-free algorithm.
Our algorithm yields the first deterministic dimension-free algorithm for optimizing ReLU networks.
arXiv Detail & Related papers (2023-02-16T13:57:19Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of general nonconvex conic optimization, which minimizes a twice differentiable function subject to conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Second-order optimization with lazy Hessians [55.51077907483634]
We analyze Newton's method with lazy Hessian updates for solving general possibly non-convex optimization problems.
We reuse a previously computed Hessian for several iterations while computing new gradients at each step of the method.
arXiv Detail & Related papers (2022-12-01T18:58:26Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterates converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - A Newton-CG based barrier method for finding a second-order stationary
point of nonconvex conic optimization with complexity guarantees [0.5922488908114022]
We consider finding an approximate second-order stationary point (SOSP) of a problem that minimizes a twice differentiable function over the intersection of an affine subspace and a convex cone.
Our method achieves an iteration complexity of $\mathcal{O}(\epsilon^{-3/2})$, which matches the best known complexity.
arXiv Detail & Related papers (2022-07-12T17:21:45Z) - A Projection-free Algorithm for Constrained Stochastic Multi-level
Composition Optimization [12.096252285460814]
We propose a projection-free conditional gradient-type algorithm for constrained stochastic multi-level composition optimization.
We show that the number of calls to the stochastic oracle and the linear-minimization oracle required by the proposed algorithm are of order $\mathcal{O}_T(\epsilon^{-2})$ and $\mathcal{O}_T(\epsilon^{-3})$, respectively.
arXiv Detail & Related papers (2022-02-09T06:05:38Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We give an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\| \le \epsilon$) using $(\epsilon,\gamma)$-approximate queries at random points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we consider gradient-free (zeroth-order) minimax optimization in the nonconvex-strongly-concave setting.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.