Approximate Secular Equations for the Cubic Regularization Subproblem
- URL: http://arxiv.org/abs/2209.13268v1
- Date: Tue, 27 Sep 2022 09:22:44 GMT
- Title: Approximate Secular Equations for the Cubic Regularization Subproblem
- Authors: Yihang Gao, Man-Chung Yue, Michael K. Ng
- Abstract summary: We propose a novel solver for the cubic regularization subproblem (CRS).
The proposed solver outperforms two state-of-the-art methods.
- Score: 20.396963952753435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The cubic regularization method (CR) is a popular algorithm for unconstrained
non-convex optimization. At each iteration, CR solves a cubically regularized
quadratic problem, called the cubic regularization subproblem (CRS). One way to
solve the CRS relies on solving the secular equation, whose computational
bottleneck lies in the computation of all eigenvalues of the Hessian matrix. In
this paper, we propose and analyze a novel CRS solver based on an approximate
secular equation, which requires only some of the Hessian eigenvalues and is
therefore much more efficient. Two approximate secular equations (ASEs) are
developed. For both ASEs, we first study the existence and uniqueness of their
roots and then establish an upper bound on the gap between the root and that of
the standard secular equation. Such an upper bound can in turn be used to bound
the distance from the approximate CRS solution based on the ASEs to the true CRS
solution, thus offering a theoretical guarantee for our CRS solver. A desirable
feature of our CRS solver is that it requires only matrix-vector multiplication
but not matrix inversion, which makes it particularly suitable for
high-dimensional applications of unconstrained non-convex optimization, such as
low-rank recovery and deep learning. Numerical experiments with synthetic and
real data-sets are conducted to investigate the practical performance of the
proposed CRS solver. Experimental results show that the proposed solver
outperforms two state-of-the-art methods.
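For concreteness, the sketch below shows one standard way to solve the CRS through the (exact) secular equation. It is our own illustrative code, not the paper's; the function name, tolerances, and bisection scheme are assumptions. It also makes the bottleneck explicit: `np.linalg.eigh` computes all eigenvalues of the Hessian, which is precisely the cost the proposed ASEs are designed to avoid.

```python
import numpy as np

def solve_crs_secular(H, g, sigma, tol=1e-10):
    """Minimal CRS solver via the standard secular equation (a sketch).

    Solves min_x  g @ x + 0.5 * x @ H @ x + (sigma / 3) * ||x||^3.
    Stationarity gives (H + t*I) x = -g with t = sigma * ||x||, so we
    bisect on phi(t) = ||(H + t*I)^{-1} g|| - t / sigma, which is
    strictly decreasing for t > max(0, -lambda_min).
    (The degenerate "hard case", where g is orthogonal to the
    eigenspace of lambda_min, is not handled here.)
    """
    lams, Q = np.linalg.eigh(H)       # full spectrum: the O(n^3) bottleneck
    b = Q.T @ g                       # g expressed in the eigenbasis

    def phi(t):
        return np.sqrt(np.sum((b / (lams + t)) ** 2)) - t / sigma

    lo = max(0.0, -lams[0]) + 1e-12   # left end of the feasible interval
    hi = lo + 1.0
    while phi(hi) > 0:                # grow until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:              # plain bisection on the secular equation
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > 0 else (lo, mid)
    t = 0.5 * (lo + hi)
    return -Q @ (b / (lams + t))      # x = -(H + t*I)^{-1} g
```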
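The paper's two ASEs are not reproduced here. Purely to illustrate the flavor of the claimed advantages, the hypothetical sketch below builds an approximate secular function from only a few extreme eigenpairs (obtained by Lanczos iterations, i.e., Hessian-vector products), lumps the unresolved energy of the gradient at a single surrogate eigenvalue, and recovers the step with conjugate gradients so that no matrix is inverted or even formed. The lumping rule, `k`, and `lam_bar` are all our assumptions, not the paper's construction.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh, cg

def solve_crs_matvec_sketch(Hmv, n, g, sigma, k=20, tol=1e-10):
    """Illustrative matrix-vector-only CRS solver (NOT the paper's ASEs).

    `Hmv(v)` returns H @ v, so the Hessian is accessed only through
    matrix-vector products. Assumes k << n.
    """
    H = LinearOperator((n, n), matvec=Hmv)
    lams, V = eigsh(H, k=k, which="SA")   # k smallest eigenpairs via Lanczos
    b = V.T @ g                           # resolved components of g
    rest = max(g @ g - b @ b, 0.0)        # energy of g outside span(V)
    lam_bar = lams[-1]                    # crude surrogate for the unresolved spectrum

    def phi(t):                           # approximate secular function
        s = np.sum((b / (lams + t)) ** 2) + rest / (lam_bar + t) ** 2
        return np.sqrt(s) - t / sigma

    lo = max(0.0, -lams[0]) + 1e-12
    hi = lo + 1.0
    while phi(hi) > 0:                    # bracket, then bisect as before
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > 0 else (lo, mid)
    t = 0.5 * (lo + hi)

    # Recover x from (H + t*I) x = -g by CG: matrix-vector products only.
    shifted = LinearOperator((n, n), matvec=lambda v: Hmv(v) + t * v)
    x, _ = cg(shifted, -g)
    return x
```

How faithful such an approximation is depends on how well the surrogate represents the tail of the spectrum; the paper instead derives explicit bounds on the gap between the ASE root and the root of the standard secular equation.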
Related papers
- Optimization without Retraction on the Random Generalized Stiefel Manifold [9.301728976515255]
We propose a cheap iterative method that solves the optimization problem while having access only to random estimates of $B$.
Our method does not enforce the constraint in every iteration; instead, it produces iterations that converge to critical points on the generalized Stiefel manifold defined in expectation.
arXiv Detail & Related papers (2024-05-02T19:55:30Z) - A Novel Normalized-Cut Solver with Nearest Neighbor Hierarchical Initialization [107.07093621337084]
Normalized-Cut (N-Cut) is a famous model of spectral clustering.
Traditional N-Cut solvers are two-stage: 1) calculating the continuous spectral embedding of the normalized Laplacian matrix; 2) discretization via $K$-means or spectral rotation.
We propose a novel N-Cut solver based on the famous coordinate descent method.
arXiv Detail & Related papers (2023-11-26T07:11:58Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) methods have proven to have very appealing theoretical properties.
We show that TR and ARC methods can simultaneously tolerate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability [107.65337427333064]
Optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem in machine learning.
In this work, we present the first trial in the algorithm-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Optimal Diagonal Preconditioning: Theory and Practice [23.79536881427839]
This paper presents the problem of optimal diagonal preconditioning to achieve maximal reduction in the condition number of any full-rank matrix by scaling its rows or columns separately or simultaneously.
We provide a baseline bisection algorithm that is easy to implement in practice.
Next, we specialize to one-sided optimal diagonal preconditioning problems, and demonstrate that they can be formulated as standard dual SDP problems.
arXiv Detail & Related papers (2022-09-02T04:21:28Z) - A Novel Fast Exact Subproblem Solver for Stochastic Quasi-Newton Cubic Regularized Optimization [0.38233569758620045]
We describe an Adaptive Regularization using Cubics (ARC) method for large-scale unconstrained optimization.
We find that our new approach, ARCLQN, compares to modern optimizers with minimal tuning, a common pain-point for second-order methods.
arXiv Detail & Related papers (2022-04-19T20:25:29Z) - Low-Rank Updates of Matrix Square Roots [7.832944895330117]
We consider the matrix square root and inverse square root operations.
Given a low-rank perturbation to a matrix, we argue that a low-rank approximate correction to the (inverse) square root exists, and that it satisfies an algebraic Riccati equation.
We then discuss how a low-rank solution to that equation can be computed.
arXiv Detail & Related papers (2022-01-31T12:05:33Z) - Solving weakly supervised regression problem using low-rank manifold regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Square Root Bundle Adjustment for Large-Scale Reconstruction [56.44094187152862]
We propose a new formulation for the bundle adjustment problem which relies on nullspace marginalization of landmark variables by QR decomposition.
Our approach, which we call square root bundle adjustment, is algebraically equivalent to the commonly used Schur complement trick.
We show in real-world experiments with the BAL datasets that the proposed solver achieves, on average, equally accurate solutions even in single precision.
arXiv Detail & Related papers (2021-03-02T16:26:20Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)