New Q-Newton's method meets Backtracking line search: good convergence
guarantee, saddle points avoidance, quadratic rate of convergence, and easy
implementation
- URL: http://arxiv.org/abs/2108.10249v1
- Date: Mon, 23 Aug 2021 15:39:00 GMT
- Title: New Q-Newton's method meets Backtracking line search: good convergence
guarantee, saddle points avoidance, quadratic rate of convergence, and easy
implementation
- Authors: Tuyen Trung Truong
- Abstract summary: A modification of Newton's method, named New Q-Newton's method, can avoid saddle points and has a quadratic rate of convergence.
Experiments on a small scale show that the method works very competitively against other well-known modifications of Newton's method.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a recent joint work, the author developed a modification of Newton's
method, named New Q-Newton's method, which can avoid saddle points and has a
quadratic rate of convergence. While a good theoretical convergence guarantee had
not been established for that method, experiments on small-scale problems show
that it works very competitively against other well-known modifications
of Newton's method, such as Adaptive Cubic Regularization and BFGS, as well as
first-order methods such as Unbounded Two-way Backtracking Gradient Descent.
In this paper, we resolve the convergence guarantee issue by proposing a
modification of New Q-Newton's method, named New Q-Newton's method
Backtracking, which incorporates a more sophisticated use of hyperparameters
and a Backtracking line search. This new method has very good theoretical
guarantees, which for a {\bf Morse function} yield the following (which is
unknown for New Q-Newton's method):
{\bf Theorem.} Let $f:\mathbb{R}^m\rightarrow \mathbb{R}$ be a Morse
function, that is, all its critical points have invertible Hessian. Then for a
sequence $\{x_n\}$ constructed by New Q-Newton's method Backtracking from a
random initial point $x_0$, we have the following two alternatives:
i) $\lim _{n\rightarrow\infty}||x_n||=\infty$,
or
ii) $\{x_n\}$ converges to a point $x_{\infty}$ which is a {\bf local
minimum} of $f$, and the rate of convergence is {\bf quadratic}.
Moreover, if $f$ has compact sublevels, then only case ii) happens.
As far as we know, for Morse functions, this is the best theoretical
guarantee for iterative optimization algorithms in the literature so far. We
have tested, in small-scale experiments, some further simplified versions of
New Q-Newton's method Backtracking, and found that the new method
significantly improves on New Q-Newton's method.
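The abstract describes the method only at a high level: regularize the Hessian with a hyperparameter-dependent term so that it becomes invertible, flip the Newton direction on the negative eigenspace so that saddle points are avoided, and control the step size with a Backtracking (Armijo) line search. The sketch below is a minimal illustration of one such step, assuming a regularization of the form $H + \delta\,\|\nabla f\|^2\,\mathrm{Id}$ and a standard Armijo condition; the function names, the hyperparameter set `deltas`, and the constants are illustrative choices, not the paper's exact construction.

```python
# A minimal sketch of one step of a New Q-Newton-type update combined with an
# Armijo backtracking line search.  The hyperparameter set `deltas`, the use of
# ||grad f||^2 as the regularization size, and the Armijo constants are
# illustrative assumptions, not the construction from the paper.
import numpy as np

def nqn_backtracking_step(f, grad, hess, x, deltas=(0.0, 1.0, 2.0),
                          beta=0.5, c=1e-4, max_backtracks=50):
    """One illustrative step: regularize the Hessian until it is invertible,
    flip the Newton direction on the negative eigenspace, then backtrack."""
    g = grad(x)
    H = hess(x)
    reg = np.linalg.norm(g) ** 2  # size of the regularization term

    # Try delta_0, delta_1, ... until H + delta * ||g||^2 * Id is invertible
    # (this sketch simply keeps the last candidate if none qualifies).
    for delta in deltas:
        eigvals, eigvecs = np.linalg.eigh(H + delta * reg * np.eye(len(x)))
        if np.all(np.abs(eigvals) > 1e-12):
            break

    # Newton-type direction with the sign flipped on negative-curvature
    # components, so that saddle directions become descent directions.
    coeffs = eigvecs.T @ g
    direction = -(eigvecs @ (coeffs / np.abs(eigvals)))

    # Armijo backtracking: shrink the step size t until sufficient decrease.
    t, fx, slope = 1.0, f(x), g @ direction
    for _ in range(max_backtracks):
        if f(x + t * direction) <= fx + c * t * slope:
            break
        t *= beta
    return x + t * direction

if __name__ == "__main__":
    # Toy Morse function with a saddle at the origin and local minima at
    # (0, +/- 1/sqrt(2)):  f(x, y) = x^2 - y^2 + y^4.
    f = lambda v: v[0] ** 2 - v[1] ** 2 + v[1] ** 4
    grad = lambda v: np.array([2.0 * v[0], -2.0 * v[1] + 4.0 * v[1] ** 3])
    hess = lambda v: np.array([[2.0, 0.0], [0.0, -2.0 + 12.0 * v[1] ** 2]])

    x = np.array([0.5, 0.1])
    for _ in range(20):
        x = nqn_backtracking_step(f, grad, hess, x)
    print(x)  # approaches the local minimum near (0, 0.7071), not the saddle
```

On this toy example the iterates leave the neighborhood of the saddle point at the origin and converge to the local minimum near $(0, 1/\sqrt{2})$, which illustrates the qualitative behavior the theorem above guarantees for Morse functions.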
Related papers
- A simple and improved algorithm for noisy, convex, zeroth-order optimisation [59.51990161522328]
We construct an algorithm that returns a point $\hat x\in \bar{\mathcal X}$ such that $f(\hat x)$ is as small as possible.
We prove that this method is such that $f(\hat x) - \min_{x\in \bar{\mathcal X}} f(x)$ is of smaller order than $d^2/\sqrt{n}$ up to poly-logarithmic terms.
arXiv Detail & Related papers (2024-06-26T18:19:10Z) - Incremental Quasi-Newton Methods with Faster Superlinear Convergence
Rates [50.36933471975506]
We consider the finite-sum optimization problem, where each component function is strongly convex and has Lipschitz continuous gradient and Hessian.
The recently proposed incremental quasi-Newton method is based on BFGS update and achieves a local superlinear convergence rate.
This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework.
arXiv Detail & Related papers (2024-02-04T05:54:51Z) - Krylov Cubic Regularized Newton: A Subspace Second-Order Method with
Dimension-Free Convergence Rate [83.3933097134767]
We introduce a novel subspace cubic regularized Newton method that achieves a dimension-independent global convergence rate of $O\left(\frac{1}{mk}+\frac{1}{k^2}\right)$.
Our method converges faster than existing random subspace methods, especially for high-dimensional problems.
arXiv Detail & Related papers (2024-01-05T20:24:18Z) - Backtracking New Q-Newton's method, Newton's flow, Voronoi's diagram and
Stochastic root finding [0.0]
A new variant of Newton's method - named Backtracking New Q-Newton's method (BNQN) - has strong theoretical guarantee, is easy to implement, and has good experimental performance.
arXiv Detail & Related papers (2024-01-02T15:37:47Z) - Sketch-and-Project Meets Newton Method: Global $\mathcal O(k^{-2})$ Convergence with Low-Rank Updates [1.592855981846993]
We propose the first sketch-and-project Newton method with a fast $\mathcal O(k^{-2})$ global convergence rate for self-concordant functions.
SGN inherits the best of all three worlds: the cheap iteration costs of sketch-and-project methods, the state-of-the-art $\mathcal O(k^{-2})$ global convergence rate of full-rank Newton-like methods, and the algorithmic simplicity of damped Newton methods.
arXiv Detail & Related papers (2023-05-22T14:51:40Z) - Regularized Newton Method with Global $O(1/k^2)$ Convergence [10.685862129925729]
We prove that our method converges superlinearly when the objective is strongly convex.
Our method is the first variant of Newton's method that has both cheap iterations and provably fast global convergence.
arXiv Detail & Related papers (2021-12-03T18:55:50Z) - Newton-LESS: Sparsification without Trade-offs for the Sketched Newton
Update [88.73437209862891]
In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration.
We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching.
We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings.
arXiv Detail & Related papers (2021-07-15T17:33:05Z) - Finite-Time Analysis for Double Q-learning [50.50058000948908]
We provide the first non-asymptotic, finite-time analysis for double Q-learning.
We show that both synchronous and asynchronous double Q-learning are guaranteed to converge to an $\epsilon$-accurate neighborhood of the global optimum.
arXiv Detail & Related papers (2020-09-29T18:48:21Z) - Unconstrained optimisation on Riemannian manifolds [0.0]
We give explicit descriptions of versions of (Local-) Backtracking Gradient Descent and New Q-Newton's method.
We also give explicit calculations for the smallest eigenvalue of a symmetric square matrix.
arXiv Detail & Related papers (2020-08-25T15:10:21Z) - A fast and simple modification of Newton's method helping to avoid
saddle points [0.0]
Roughly speaking, this paper shows that if $f$ is $C^3$ and the sequence $\{x_n\}$ constructed by the method converges, then the limit point is a critical point which is not a saddle point, and the rate of convergence is the same as that of Newton's method.
arXiv Detail & Related papers (2020-06-02T10:38:00Z) - A Newton Frank-Wolfe Method for Constrained Self-Concordant Minimization [60.90222082871258]
We demonstrate how to scalably solve a class of constrained self-concordant minimization problems using linear minimization oracles (LMO) over the constraint set.
We prove that the number of LMO calls of our method is nearly the same as that of the Frank-Wolfe method in the L-smooth case.
arXiv Detail & Related papers (2020-02-17T15:28:31Z)