Newton-CG methods for nonconvex unconstrained optimization with Hölder continuous Hessian
- URL: http://arxiv.org/abs/2311.13094v2
- Date: Thu, 12 Dec 2024 13:33:57 GMT
- Title: Newton-CG methods for nonconvex unconstrained optimization with Hölder continuous Hessian
- Authors: Chuan He, Heng Huang, Zhaosong Lu,
- Abstract summary: We consider a nonconvex unconstrained optimization problem minimizing a twice differentiable objective function with Hölder continuous Hessian.
We first propose a Newton-conjugate gradient (Newton-CG) method for finding an approximate first- and second-order stationary point.
We present preliminary numerical results to demonstrate the superior practical performance of our parameter-free Newton-CG method.
- Score: 53.044526424637866
- License:
- Abstract: In this paper we consider a nonconvex unconstrained optimization problem minimizing a twice differentiable objective function with Hölder continuous Hessian. Specifically, we first propose a Newton-conjugate gradient (Newton-CG) method for finding an approximate first- and second-order stationary point of this problem, assuming the associated Hölder parameters are explicitly known. Then we develop a parameter-free Newton-CG method without requiring any prior knowledge of these parameters. To the best of our knowledge, this method is the first parameter-free second-order method achieving the best-known iteration and operation complexity for finding an approximate first- and second-order stationary point of this problem. Finally, we present preliminary numerical results to demonstrate the superior practical performance of our parameter-free Newton-CG method over a well-known regularized Newton method.
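For intuition, the sketch below shows the basic Newton-CG template such methods build on: each outer iteration solves the Newton system inexactly with conjugate gradient and takes a backtracking step. This is only a minimal illustration, not the authors' algorithm, which additionally certifies approximate second-order stationarity and, in its parameter-free form, adapts to the unknown Hölder parameters; all names and tolerances here are hypothetical.

```python
import numpy as np

def newton_cg_direction(grad, hess_vec, x, cg_tol=1e-8, cg_iters=100):
    """Approximately solve H d = -g by conjugate gradient (matrix-free)."""
    g = grad(x)
    d = np.zeros_like(x)
    r = -g.copy()               # residual of H d + g = 0 at d = 0
    p = r.copy()
    rs = r @ r
    for _ in range(cg_iters):
        Hp = hess_vec(x, p)
        curv = p @ Hp
        if curv <= 0:           # negative curvature: stop (real methods exploit it)
            break
        alpha = rs / curv
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) < cg_tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d if d @ g < 0 else -g   # fall back to steepest descent if needed

def newton_cg(f, grad, hess_vec, x0, tol=1e-6, max_iter=100):
    """Outer loop: inexact Newton direction plus Armijo backtracking."""
    x = x0.copy()
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        d = newton_cg_direction(grad, hess_vec, x)
        t = 1.0
        for _ in range(50):     # backtracking (Armijo) line search
            if f(x + t * d) <= f(x) + 1e-4 * t * (g @ d):
                break
            t *= 0.5
        x = x + t * d
    return x
```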
Related papers
- Symmetric Rank-One Quasi-Newton Methods for Deep Learning Using Cubic Regularization [0.5120567378386615]
First-order gradient descent and its variants, such as Adam and AdaGrad, are commonly used in deep learning.
However, these methods do not exploit curvature information.
Quasi-Newton methods re-use previously computed low-rank Hessian approximations.
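As a point of reference, the classic symmetric rank-one (SR1) update that such methods build on can be sketched as follows; this is only an illustration and omits the cubic-regularization machinery the paper pairs it with.

```python
import numpy as np

def sr1_update(B, s, y, eps=1e-8):
    """Symmetric rank-1 update of a Hessian approximation B.

    s = x_new - x_old, y = grad_new - grad_old.  The update is skipped when
    the denominator is too small, as is standard for SR1.
    """
    r = y - B @ s
    denom = r @ s
    if abs(denom) <= eps * np.linalg.norm(r) * np.linalg.norm(s):
        return B                          # skip: update would be unstable
    return B + np.outer(r, r) / denom
```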
arXiv Detail & Related papers (2025-02-17T20:20:11Z)
- Symplectic Stiefel manifold: tractable metrics, second-order geometry and Newton's methods [1.190653833745802]
We develop explicit second-order geometry and Newton's methods on the symplectic Stiefel manifold.
We then solve the resulting Newton equation, as the central step of Newton's methods.
Various numerical experiments are presented to validate the proposed methods.
arXiv Detail & Related papers (2024-06-20T13:26:06Z)
- Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates [50.36933471975506]
We consider the finite-sum optimization problem, where each component function is strongly convex and has Lipschitz continuous gradient and Hessian.
The recently proposed incremental quasi-Newton method is based on BFGS update and achieves a local superlinear convergence rate.
This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework.
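For comparison, here is a minimal sketch of the standard (non-incremental) BFGS Hessian update mentioned above; the paper's incremental scheme and its SR1-based variant differ in how component information is refreshed, and the helper below is purely illustrative.

```python
import numpy as np

def bfgs_update(B, s, y):
    """Standard BFGS update of a Hessian approximation B.

    s = x_new - x_old, y = grad_new - grad_old; requires s @ y > 0,
    which holds for strongly convex component functions.
    """
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (s @ y)
```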
arXiv Detail & Related papers (2024-02-04T05:54:51Z)
- A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of a general nonconvex conic optimization problem that minimizes a twice differentiable function subject to nonconvex conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z)
- Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update [88.73437209862891]
In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration.
We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching.
We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings.
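To illustrate the sketched-Newton idea in general terms, the following hedged sketch assumes the Hessian factors as H = A.T @ A and uses a simple sparse Rademacher sketch as a stand-in for the LESS embedding; it is not the paper's exact construction, and the parameters are illustrative.

```python
import numpy as np

def sketched_newton_step(A, g, m, step=1.0, sparsity=0.1, rng=None):
    """One sketched Newton step, assuming the Hessian factors as H = A.T @ A.

    A (n x d) is a square-root factor of the Hessian and g is the gradient.
    The sparse sketch S (m x n) has i.i.d. entries that are zero with
    probability 1 - sparsity and +/- 1/sqrt(m * sparsity) otherwise, so that
    (S @ A).T @ (S @ A) approximates A.T @ A in expectation.
    """
    rng = np.random.default_rng(rng)
    n, d = A.shape
    mask = rng.random((m, n)) < sparsity
    signs = rng.choice([-1.0, 1.0], size=(m, n))
    S = mask * signs / np.sqrt(m * sparsity)
    SA = S @ A                                   # m x d, much smaller than n x d
    H_tilde = SA.T @ SA                          # sketched Hessian approximation
    # Small ridge term only for numerical stability of the solve.
    direction = np.linalg.solve(H_tilde + 1e-10 * np.eye(d), -g)
    return step * direction
```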
arXiv Detail & Related papers (2021-07-15T17:33:05Z)
- Distributed Second Order Methods with Fast Rates and Compressed Communication [6.069611493148631]
We develop several new communication-efficient second-order methods for distributed optimization.
We prove global sublinear and linear convergence rates, and a fast superlinear rate.
Our results are supported with experimental results on real datasets.
arXiv Detail & Related papers (2021-02-14T14:06:45Z)
- Acceleration Methods [57.202881673406324]
We first use quadratic optimization problems to introduce two key families of acceleration methods.
We discuss momentum methods in detail, starting with the seminal work of Nesterov.
We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates.
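As a concrete illustration of the momentum-plus-restart idea surveyed above, here is a minimal sketch of Nesterov's accelerated gradient with a simple function-value restart; the step size and restart rule are illustrative choices, not the survey's prescriptions.

```python
import numpy as np

def nesterov_restart(f, grad, x0, lr, max_iter=1000, tol=1e-8):
    """Nesterov's accelerated gradient with a function-value restart.

    lr should be roughly 1/L for an L-smooth convex f; the momentum is reset
    whenever the objective increases.
    """
    x = x0.copy()
    y = x0.copy()
    t = 1.0
    f_prev = f(x)
    for _ in range(max_iter):
        x_new = y - lr * grad(y)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)   # momentum extrapolation
        f_new = f(x_new)
        if f_new > f_prev:                              # restart: drop momentum
            y, t_new = x_new, 1.0
        if np.linalg.norm(grad(x_new)) < tol:
            return x_new
        x, t, f_prev = x_new, t_new, f_new
    return x
```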
arXiv Detail & Related papers (2021-01-23T17:58:25Z)
- A Unified Analysis of First-Order Methods for Smooth Games via Integral Quadratic Constraints [10.578409461429626]
In this work, we adapt the integral quadratic constraints theory to first-order methods for smooth and strongly monotone games.
We provide, for the first time, a global convergence rate for the negative momentum method (NM) with an iteration complexity of $\mathcal{O}(\kappa^{1.5})$, which matches its known lower bound.
We show that it is impossible for an algorithm with one step of memory to achieve acceleration if it only queries the gradient once per batch.
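For a rough picture of the negative momentum method (NM) analyzed above, here is a hedged sketch of alternating gradient descent-ascent with a negative heavy-ball coefficient; the paper's exact update order, step sizes, and momentum values may differ, and the defaults below are hypothetical.

```python
import numpy as np

def negative_momentum_gda(grad_x, grad_y, x0, y0, lr=0.05, beta=-0.3, iters=500):
    """Alternating gradient descent-ascent on a min-max game f(x, y),
    with heavy-ball momentum whose coefficient beta is chosen negative."""
    x, y = x0.copy(), y0.copy()
    dx_prev = np.zeros_like(x)
    dy_prev = np.zeros_like(y)
    for _ in range(iters):
        dx = -lr * grad_x(x, y) + beta * dx_prev   # x-player minimizes
        x = x + dx
        dy = +lr * grad_y(x, y) + beta * dy_prev   # y-player maximizes (uses new x)
        y = y + dy
        dx_prev, dy_prev = dx, dy
    return x, y
```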
arXiv Detail & Related papers (2020-09-23T20:02:00Z)
- Gradient-Free Methods for Saddle-Point Problem [125.99533416395765]
We generalize the approach of Gasnikov et al. (2017), which allows one to solve (stochastic) convex optimization problems with an inexact gradient-free oracle.
Our approach reduces the required number of oracle calls by a factor of $\frac{n}{\log n}$.
In the second part of the paper, we analyze the case when such an assumption cannot be made and propose a general approach for modifying the method to solve this problem.
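The generic two-point construction behind such gradient-free oracles can be sketched as follows; the paper's oracle, smoothing parameter, and averaging scheme may differ, so treat this as an assumption-laden illustration.

```python
import numpy as np

def two_point_grad_estimate(f, x, tau=1e-4, rng=None):
    """Two-point zeroth-order gradient estimator along a random direction.

    Returns d * (f(x + tau*e) - f(x - tau*e)) / (2*tau) * e for a random unit
    vector e in dimension d; averaging several such estimates reduces variance.
    """
    rng = np.random.default_rng(rng)
    d = x.size
    e = rng.standard_normal(d)
    e /= np.linalg.norm(e)
    return d * (f(x + tau * e) - f(x - tau * e)) / (2.0 * tau) * e
```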
arXiv Detail & Related papers (2020-05-12T16:44:27Z)
- On Newton Screening [14.040371216692645]
We develop a new screening method called Newton screening (NS) which is a generalized Newton method with a built-in screening mechanism.
We show that NS possesses an optimal convergence property in the sense that it achieves one-step local convergence.
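As a loose illustration only: assuming an l1-regularized least-squares setting, a screening-plus-restricted-Newton step might look like the sketch below. The working-set rule and restricted solve here are hypothetical simplifications, not the paper's NS procedure.

```python
import numpy as np

def screening_newton_step(X, y, beta, lam, k):
    """One illustrative screening + restricted-Newton-type step for
    min 0.5/n * ||y - X @ beta||^2 + lam * ||beta||_1.

    Selects a working set of the k coordinates with the largest correlation
    with the current residual, then solves the regularized normal equations
    restricted to that set.
    """
    n = X.shape[0]
    corr = X.T @ (y - X @ beta) / n              # negative gradient of the smooth part
    work = np.argsort(-np.abs(corr))[:k]         # screening: largest violations
    beta_new = np.zeros_like(beta)
    Xw = X[:, work]
    # Small ridge term only for numerical stability of the restricted solve.
    beta_new[work] = np.linalg.solve(
        Xw.T @ Xw / n + 1e-10 * np.eye(k),
        Xw.T @ y / n - lam * np.sign(corr[work]))
    return beta_new
```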
arXiv Detail & Related papers (2020-01-27T11:25:33Z)