Newton-CG methods for nonconvex unconstrained optimization with H\"older
continuous Hessian
- URL: http://arxiv.org/abs/2311.13094v1
- Date: Wed, 22 Nov 2023 01:50:43 GMT
- Title: Newton-CG methods for nonconvex unconstrained optimization with H\"older
continuous Hessian
- Authors: Chuan He and Zhaosong Lu
- Abstract summary: We consider a nonconvex unconstrained optimization problem minimizing a twice differentiable objective function with H\"older continuous Hessian.
We develop a parameter-free Newton-CG method that requires no prior knowledge of these parameters.
We present preliminary numerical results demonstrating its superior practical performance over a well-known regularized Newton method.
- Score: 5.161026048499362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we consider a nonconvex unconstrained optimization problem
minimizing a twice differentiable objective function with H\"older continuous
Hessian. Specifically, we first propose a Newton-conjugate gradient (Newton-CG)
method for finding an approximate first-order stationary point (FOSP) of this
problem, assuming the associated H\"older parameters are explicitly known.
Then we develop a parameter-free Newton-CG method without requiring any prior
knowledge of these parameters. To the best of our knowledge, this method is the
first parameter-free second-order method achieving the best-known iteration and
operation complexity for finding an approximate FOSP of this problem.
Furthermore, we propose a Newton-CG method for finding an approximate
second-order stationary point (SOSP) of the considered problem with high
probability and establish its iteration and operation complexity. Finally, we
present preliminary numerical results to demonstrate the superior practical
performance of our parameter-free Newton-CG method over a well-known
regularized Newton method.
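
A note on the setting: a $\nu$-H\"older continuous Hessian means $\|\nabla^2 f(x)-\nabla^2 f(y)\|\le H_\nu\|x-y\|^{\nu}$ for all $x,y$ and some $\nu\in(0,1]$, where $H_\nu$ is the H\"older constant. The following is a minimal sketch of a generic truncated Newton-CG iteration with Hessian-vector products, a steepest-descent fallback under negative curvature, and an Armijo backtracking line search; it is not the paper's capped-CG subroutine, its parameter-free scheme, or its SOSP certification, and the function names, tolerances, and truncation rule are illustrative assumptions.

```python
import numpy as np

def newton_cg(f, grad, hess_vec, x0, eps=1e-5, max_iter=100, cg_tol=0.5):
    """Generic truncated Newton-CG sketch (not the paper's capped-CG variant)."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:            # approximate first-order stationarity
            return x
        # Inexact CG on the Newton system  H d = -g  (H accessed via hess_vec).
        d = np.zeros_like(x)
        r = g.copy()                            # residual of H d + g at d = 0
        p = -r
        for _ in range(x.size):
            Hp = hess_vec(x, p)
            curv = p @ Hp
            if curv <= 1e-12 * (p @ p):         # (near-)negative curvature: stop CG
                if not d.any():
                    d = -g                      # fall back to steepest descent
                break
            alpha = (r @ r) / curv
            d = d + alpha * p
            r_new = r + alpha * Hp
            if np.linalg.norm(r_new) <= cg_tol * np.linalg.norm(g):
                break                           # truncation rule
            beta = (r_new @ r_new) / (r @ r)
            p = -r_new + beta * p
            r = r_new
        # Armijo backtracking line search along the descent direction d.
        t, fx, slope = 1.0, f(x), g @ d
        while f(x + t * d) > fx + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5
        x = x + t * d
    return x
```

A typical call would be `newton_cg(f, grad, hess_vec, x0)`, where `hess_vec(x, v)` returns the product $\nabla^2 f(x)\,v$ without forming the full Hessian.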
Related papers
- A Stochastic Quasi-Newton Method for Non-convex Optimization with Non-uniform Smoothness [4.097563258332958]
We propose a fast quasi-Newton method for problems with non-uniform smoothness.
Our algorithm achieves the best-known $\mathcal{O}(\epsilon^{-3})$ sample complexity and enjoys a convergence speedup.
Our numerical experiments show that our proposed algorithm outperforms the state-of-the-art approaches.
arXiv Detail & Related papers (2024-03-22T14:40:29Z) - Incremental Quasi-Newton Methods with Faster Superlinear Convergence
Rates [50.36933471975506]
We consider the finite-sum optimization problem, where each component function is strongly convex and has Lipschitz continuous gradient and Hessian.
The recently proposed incremental quasi-Newton method is based on BFGS update and achieves a local superlinear convergence rate.
This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 (SR1) update into the incremental framework; a minimal sketch of the SR1 update is given after this list.
arXiv Detail & Related papers (2024-02-04T05:54:51Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of general nonconvex conic optimization, which minimizes a twice differentiable function subject to conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Min-Max Optimization Made Simple: Approximating the Proximal Point
Method via Contraction Maps [77.8999425439444]
We present a first-order method that admits near-optimal convergence rates for convex/concave min-max problems.
Our work is based on the fact that the update rule of the Proximal Point method can be approximated up to any desired accuracy.
arXiv Detail & Related papers (2023-01-10T12:18:47Z) - A Newton-CG based barrier method for finding a second-order stationary
point of nonconvex conic optimization with complexity guarantees [0.5922488908114022]
We consider finding an approximate second-order stationary point (SOSP) of a problem that minimizes a twice differentiable function over the intersection of a convex cone and an affine subspace.
Our method achieves an iteration complexity of $\mathcal{O}(\epsilon^{-3/2})$, which matches the best known complexity.
arXiv Detail & Related papers (2022-07-12T17:21:45Z) - STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization [74.1615979057429]
We investigate stochastic nonconvex optimization problems where the objective is an expectation over smooth loss functions.
Our work builds on the STORM algorithm, in conjunction with a novel approach to adaptively set the learning rate and momentum parameters.
arXiv Detail & Related papers (2021-11-01T15:43:36Z) - Newton-LESS: Sparsification without Trade-offs for the Sketched Newton
Update [88.73437209862891]
In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration.
We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching.
We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings.
arXiv Detail & Related papers (2021-07-15T17:33:05Z) - Distributed Second Order Methods with Fast Rates and Compressed
Communication [6.069611493148631]
We develop several new communication-efficient second-order methods for distributed optimization.
We prove global sublinear and linear convergence rates, and a fast superlinear rate.
Our results are supported with experimental results on real datasets.
arXiv Detail & Related papers (2021-02-14T14:06:45Z) - A Unified Analysis of First-Order Methods for Smooth Games via Integral
Quadratic Constraints [10.578409461429626]
In this work, we adapt integral quadratic constraints theory to first-order methods for smooth and strongly monotone games.
We provide, for the first time, a global convergence rate for the negative momentum method (NM) with an iteration complexity of $\mathcal{O}(\kappa^{1.5})$, which matches its known lower bound.
We show that it is impossible for an algorithm with one step of memory to achieve acceleration if it only queries the gradient once per batch.
arXiv Detail & Related papers (2020-09-23T20:02:00Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z) - On Newton Screening [14.040371216692645]
We develop a new screening method called Newton screening (NS) which is a generalized Newton method with a built-in screening mechanism.
We show that NS possesses an optimal convergence property in the sense that it achieves one-step local convergence.
arXiv Detail & Related papers (2020-01-27T11:25:33Z)
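
As referenced in the incremental quasi-Newton entry above, the symmetric rank-1 (SR1) update has a simple closed form. Below is a minimal, self-contained sketch of the plain SR1 update with its standard skipping safeguard; it is not the paper's incremental scheme or its convergence machinery, and the threshold `r` is an illustrative assumption.

```python
import numpy as np

def sr1_update(B, s, y, r=1e-8):
    """Symmetric rank-1 (SR1) update of a Hessian approximation B.

    Here s = x_new - x_old and y = grad(x_new) - grad(x_old). The update is
    skipped (B returned unchanged) when the standard safeguard fails, to avoid
    dividing by a near-zero denominator.
    """
    v = y - B @ s
    denom = v @ s
    if abs(denom) < r * np.linalg.norm(v) * np.linalg.norm(s):
        return B                      # safeguard: skip the update
    return B + np.outer(v, v) / denom
```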
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.