Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function
- URL: http://arxiv.org/abs/2310.11866v1
- Date: Wed, 18 Oct 2023 10:29:58 GMT
- Title: Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function
- Authors: Liu Liu, Xuanqing Liu, Cho-Jui Hsieh, and Dacheng Tao
- Abstract summary: Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties.
We explore stochastic TR and ARC methods that can simultaneously provide inexact computations of the Hessian, gradient, and function values.
- Score: 99.31457740916815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trust-region (TR) and adaptive regularization using cubics (ARC) have proven
to have some very appealing theoretical properties for non-convex optimization
by concurrently computing function value, gradient, and Hessian matrix to
obtain the next search direction and the adjusted parameters. Although
stochastic approximations help largely reduce the computational cost, it is
challenging to theoretically guarantee the convergence rate. In this paper, we
explore a family of stochastic TR and ARC methods that can simultaneously
provide inexact computations of the Hessian matrix, gradient, and function
values. Our algorithms require much less propagation overhead per iteration
than TR and ARC. We prove that the iteration complexity to achieve
$\epsilon$-approximate second-order optimality is of the same order as the
exact computations demonstrated in previous studies. Additionally, the mild
conditions on inexactness can be met by leveraging a random sampling technique
in the finite-sum minimization problem. Numerical experiments with a non-convex
problem support these findings and demonstrate that, with the same or a similar
number of iterations, our algorithms require less computational overhead per
iteration than current second-order methods.
Related papers
- Closing the Computational-Query Depth Gap in Parallel Stochastic Convex Optimization [26.36906884097317]
We develop a new parallel algorithm for minimizing Lipschitz, convex functions with a subgradient oracle.
Our result closes a gap between the best-known query depth and the best-known computational depth of parallel algorithms.
arXiv Detail & Related papers (2024-06-11T15:41:48Z)
- Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
- Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithmic subroutine that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
- Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z)
- Slowly Varying Regression under Sparsity [5.22980614912553]
We present the framework of slowly varying regression under sparsity, allowing regression models to exhibit slow and sparse variations.
We suggest a procedure that reformulates it as a binary convex optimization problem.
We show that the resulting model outperforms competing formulations in comparable times across various datasets.
arXiv Detail & Related papers (2021-02-22T04:51:44Z)
- Practical Precoding via Asynchronous Stochastic Successive Convex Approximation [8.808993671472349]
We consider optimization of a smooth non-convex loss function with a convex non-smooth regularizer.
In this work, we take a closer look at the SCA algorithm and develop its asynchronous variant for resource allocation in wireless networks.
arXiv Detail & Related papers (2020-10-03T13:53:30Z)
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
- Provably Convergent Working Set Algorithm for Non-Convex Regularized Regression [0.0]
This paper proposes a working set algorithm for non-convex regularizers with convergence guarantees.
Our results demonstrate high computational gain compared to the full problem solver for both block-coordinate and gradient solvers.
arXiv Detail & Related papers (2020-06-24T07:40:31Z)
- Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters (a minimal sketch of such a randomized zeroth-order step follows this list).
arXiv Detail & Related papers (2020-06-14T10:42:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.