Higher-Order Newton Methods with Polynomial Work per Iteration
- URL: http://arxiv.org/abs/2311.06374v2
- Date: Wed, 12 Jun 2024 17:30:24 GMT
- Title: Higher-Order Newton Methods with Polynomial Work per Iteration
- Authors: Amir Ali Ahmadi, Abraar Chaudhry, Jeffrey Zhang
- Abstract summary: We present generalizations of Newton's method that incorporate derivatives of an arbitrary order $d$ but maintain a polynomial dependence on dimension in their cost per iteration.
We show on numerical examples that basins of attraction around local minima can get larger as $d$ increases; under additional assumptions, a globally convergent variant with the same local order of convergence is also given.
- Score: 0.7568448369029973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present generalizations of Newton's method that incorporate derivatives of an arbitrary order $d$ but maintain a polynomial dependence on dimension in their cost per iteration. At each step, our $d^{\text{th}}$-order method uses semidefinite programming to construct and minimize a sum of squares-convex approximation to the $d^{\text{th}}$-order Taylor expansion of the function we wish to minimize. We prove that our $d^{\text{th}}$-order method has local convergence of order $d$. This results in lower oracle complexity compared to the classical Newton method. We show on numerical examples that basins of attraction around local minima can get larger as $d$ increases. Under additional assumptions, we present a modified algorithm, again with polynomial cost per iteration, which is globally convergent and has local convergence of order $d$.
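As a rough illustration of the idea (and not the paper's exact algorithm), the sketch below runs a univariate third-order step: at each iterate it builds the degree-3 Taylor model and moves to the model's real critical point closest to the current point, falling back to a plain Newton step otherwise. In one dimension no semidefinite programming or sos-convexification is needed; the paper's method uses those tools to keep the higher-order model tractable in higher dimensions. The helper names and the test function are assumptions for illustration.

```python
import numpy as np

def third_order_newton_step(f3, x):
    """One illustrative d=3 step in one dimension.

    f3(x) returns (f'(x), f''(x), f'''(x)).  We form the cubic Taylor model
    m(s) = f'(x)*s + f''(x)/2*s^2 + f'''(x)/6*s^3 and move to its real
    critical point closest to x, falling back to a Newton step when the
    cubic term vanishes or the model has no real critical point.
    """
    g, h, t = f3(x)
    if abs(t) < 1e-12:                      # cubic term negligible -> plain Newton
        return x - g / h
    # m'(s) = g + h*s + (t/2)*s^2 = 0
    roots = np.roots([t / 2.0, h, g])
    real = roots[np.isreal(roots)].real
    if real.size == 0:                      # no real critical point of the model
        return x - g / h
    return x + real[np.argmin(np.abs(real))]

# Illustrative test: f(x) = x^4/4 - x^2/2 + 0.3*x, with local minima near 0.79 and -1.13.
def f3(x):
    return (x**3 - x + 0.3, 3 * x**2 - 1, 6 * x)

x = 2.0
for _ in range(6):
    x = third_order_newton_step(f3, x)
print(x)  # settles on the local minimizer near 0.79
```

When the third derivative is negligible the routine reduces to the classical Newton update, so $d=2$ is recovered as a special case.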
Related papers
- A simple and improved algorithm for noisy, convex, zeroth-order optimisation [59.51990161522328]
We construct an algorithm that returns a point $\hat x \in \bar{\mathcal{X}}$ such that $f(\hat x)$ is as small as possible.
We prove that this method is such that $f(\hat x) - \min_{x \in \bar{\mathcal{X}}} f(x)$ is of smaller order than $d^2/\sqrt{n}$ up to poly-logarithmic terms.
arXiv Detail & Related papers (2024-06-26T18:19:10Z) - Accelerated Variance-Reduced Forward-Reflected Methods for Root-Finding Problems [8.0153031008486]
We propose a novel class of Nesterov's accelerated forward-reflected-based methods with variance reduction to solve root-finding problems.
Our algorithm is single-loop and leverages a new family of unbiased variance-reduced estimators specifically designed for root-finding problems.
arXiv Detail & Related papers (2024-06-04T15:23:29Z) - First and zeroth-order implementations of the regularized Newton method
with lazy approximated Hessians [4.62316736194615]
We develop first-order (Hessian-free) and zeroth-order (derivative-free) implementations of the regularized Newton method for general non-convex optimization problems.
We also equip our algorithms with the lazy bound update that reuses a previously computed Hessian approximation matrix for several iterations.
arXiv Detail & Related papers (2023-09-05T17:40:54Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of a general nonconvex conic optimization problem, which minimizes a twice differentiable function subject to conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Second-order optimization with lazy Hessians [55.51077907483634]
We analyze Newton's method with lazy Hessian updates for solving general, possibly non-convex, optimization problems.
We reuse a previously computed Hessian for several iterations while computing new gradients at each step of the method.
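A minimal sketch of the lazy-Hessian idea, assuming a simple smooth test problem and a fixed reuse period m (both illustrative choices, not the paper's schedule): the Hessian is recomputed and refactored only every m iterations, while a fresh gradient is evaluated at every step.

```python
import numpy as np

def lazy_hessian_newton(grad, hess, x0, m=5, iters=30):
    """Newton-type iterations that recompute (and refactor) the Hessian
    only every m steps, reusing the cached Cholesky factor in between,
    while a new gradient is computed at every step."""
    x = x0.copy()
    chol = None
    for k in range(iters):
        if k % m == 0:                      # lazy update: refresh the Hessian every m steps
            chol = np.linalg.cholesky(hess(x))
        g = grad(x)
        # solve H_stored @ step = g using the cached factor L (H_stored = L L^T)
        y = np.linalg.solve(chol, g)
        step = np.linalg.solve(chol.T, y)
        x = x - step
    return x

# Illustrative strongly monotone test problem: find x with A x + 0.1*tanh(x) = 0 (root at 0).
A = np.array([[3.0, 0.5], [0.5, 2.0]])
grad = lambda x: A @ x + 0.1 * np.tanh(x)
hess = lambda x: A + np.diag(0.1 / np.cosh(x) ** 2)
print(lazy_hessian_newton(grad, hess, np.array([5.0, -3.0])))  # ~ [0, 0]
```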
arXiv Detail & Related papers (2022-12-01T18:58:26Z) - Perseus: A Simple and Optimal High-Order Method for Variational
Inequalities [81.32967242727152]
A VI involves finding $x^\star \in \mathcal{X}$ such that $\langle F(x), x - x^\star\rangle \geq 0$ for all $x \in \mathcal{X}$.
We propose a $p^{\text{th}}$-order method that does not require any line search procedure and provably converges to a weak solution at a rate of $O(\epsilon^{-2/(p+1)})$.
arXiv Detail & Related papers (2022-05-06T13:29:14Z) - TURF: A Two-factor, Universal, Robust, Fast Distribution Learning
Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z) - Regularized Newton Method with Global $O(1/k^2)$ Convergence [10.685862129925729]
We prove that our method converges superlinearly when the objective is strongly convex.
Our method is the first variant of Newton's method that has both cheap iterations and provably fast global convergence.
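The summary does not spell out the update rule; as a hedged sketch of a gradient-regularized Newton step of this general flavor, the code below shifts the Hessian by a multiple of the square root of the gradient norm, so steps stay well defined globally while the shift vanishes near a solution. The constant reg, the iteration count, and the test function are illustrative assumptions, not the paper's exact parameters.

```python
import numpy as np

def regularized_newton(grad, hess, x0, reg=1.0, iters=30):
    """Newton iterations with a gradient-norm-dependent shift:
    x+ = x - (H + reg*sqrt(||g||) I)^{-1} g.
    The shift keeps each linear system well conditioned far from the
    solution and vanishes as the gradient goes to zero, recovering the
    fast local behavior of Newton's method."""
    x = x0.copy()
    n = x.size
    for _ in range(iters):
        g = grad(x)
        H = hess(x)
        shift = reg * np.sqrt(np.linalg.norm(g))
        x = x - np.linalg.solve(H + shift * np.eye(n), g)
    return x

# Illustrative smooth convex test: f(x) = sum_i log(1 + exp(x_i)) - 0.3 * sum_i x_i,
# minimized where sigmoid(x_i) = 0.3, i.e. x_i = log(3/7) ~ -0.847.
grad = lambda x: 1.0 / (1.0 + np.exp(-x)) - 0.3
hess = lambda x: np.diag(np.exp(-x) / (1.0 + np.exp(-x)) ** 2)
print(regularized_newton(grad, hess, np.array([4.0, -4.0])))  # ~ [-0.847, -0.847]
```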
arXiv Detail & Related papers (2021-12-03T18:55:50Z) - Computing the Newton-step faster than Hessian accumulation [8.147652597876862]
We show that given the computational graph of the function, this bound can be reduced to $O(m\tau^3)$, where $\tau, m$ are the width and size of a tree-decomposition of the graph.
The proposed algorithm generalizes nonlinear optimal-control methods based on LQR to general optimization problems and provides non-trivial gains in iteration-complexity even in cases where the Hessian is dense.
arXiv Detail & Related papers (2021-08-02T11:22:08Z) - Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z) - A fast and simple modification of Newton's method helping to avoid
saddle points [0.0]
This paper roughly says that if $f$ is $C^3$ and a sequence $\{x_n\}$ constructed by the method converges, then the limit point is a critical point, is not a saddle point, and the convergence rate is the same as that of Newton's method.
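The mechanism behind the modification is not described in the summary above; purely as an illustration of one standard saddle-avoiding device, the sketch below replaces the Hessian by its matrix absolute value (with a tiny gradient-dependent shift to keep it invertible), which reverses the search direction along negative-curvature directions. This construction and the test function are assumptions, not necessarily the paper's exact update.

```python
import numpy as np

def saddle_avoiding_newton_step(grad, hess, x, delta=1e-6):
    """One Newton-type step that flips the sign of the search direction
    along directions of negative curvature (i.e. uses |H| in place of H),
    so iterates are repelled from saddle points while behaving like
    Newton's method near a nondegenerate local minimum."""
    g = grad(x)
    H = hess(x) + delta * np.linalg.norm(g) * np.eye(x.size)  # tiny shift keeps H invertible
    evals, evecs = np.linalg.eigh(H)
    H_abs_inv = evecs @ np.diag(1.0 / np.abs(evals)) @ evecs.T
    return x - H_abs_inv @ g

# Illustrative function with a saddle at the origin: f(x, y) = x^2 - y^2 + y^4/4,
# which has local minima at (0, +/- sqrt(2)).
grad = lambda z: np.array([2 * z[0], -2 * z[1] + z[1] ** 3])
hess = lambda z: np.array([[2.0, 0.0], [0.0, -2.0 + 3 * z[1] ** 2]])
z = np.array([0.5, 0.1])
for _ in range(15):
    z = saddle_avoiding_newton_step(grad, hess, z)
print(z)  # moves away from the saddle at (0, 0) toward the minimum near (0, 1.414)
```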
arXiv Detail & Related papers (2020-06-02T10:38:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.