A Sub-sampled Tensor Method for Non-convex Optimization
- URL: http://arxiv.org/abs/1911.10367v3
- Date: Sat, 15 Jul 2023 18:21:11 GMT
- Title: A Sub-sampled Tensor Method for Non-convex Optimization
- Authors: Aurelien Lucchi and Jonas Kohler
- Abstract summary: We present a method that uses a fourth-order regularized model to find local minima of smooth and potentially non-convex objective functions with a finite-sum structure.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a stochastic optimization method that uses a fourth-order
regularized model to find local minima of smooth and potentially non-convex
objective functions with a finite-sum structure. This algorithm uses
sub-sampled derivatives instead of exact quantities. The proposed approach is
shown to find an $(\epsilon_1,\epsilon_2,\epsilon_3)$-third-order critical
point in at most $\mathcal{O}\left(\max\left(\epsilon_1^{-4/3}, \epsilon_2^{-2},
\epsilon_3^{-4}\right)\right)$ iterations, thereby matching the rate of
deterministic approaches. In order to prove this result, we derive a novel
tensor concentration inequality for sums of tensors of any order that makes
explicit use of the finite-sum structure of the objective function.
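For orientation, a fourth-order regularized model of this kind is typically a third-order Taylor expansion of the objective plus a quartic penalty. The display below is a minimal sketch under assumed notation, not a formula quoted from the paper: $g_k$, $B_k$, and $T_k$ denote sub-sampled estimates of the gradient, Hessian, and third-derivative tensor at the iterate $x_k$, and $\sigma_k > 0$ weights the regularizer.

$$ m_k(s) = f(x_k) + g_k^\top s + \frac{1}{2}\, s^\top B_k s + \frac{1}{6}\, T_k[s,s,s] + \frac{\sigma_k}{24}\, \|s\|^4 $$

Because the objective has a finite-sum form $f(x) = \frac{1}{n}\sum_{i=1}^n f_i(x)$, these estimates can be formed by averaging per-component derivatives over independent random index sets; the paper's tensor concentration inequality is what controls how large the sample sizes must be for the averages to track the exact derivatives. A hypothetical Python sketch, assuming user-supplied callables for the component derivatives:

```python
import numpy as np

def subsampled_estimates(grad_i, hess_i, third_i, x, n, b1, b2, b3, rng):
    """Sub-sampled derivative estimates for f(x) = (1/n) * sum_i f_i(x).

    Illustrative sketch only: grad_i(i, x), hess_i(i, x), third_i(i, x)
    are assumed callables returning the derivatives of component f_i,
    and b1, b2, b3 are the sample sizes for each derivative order.
    """
    S1 = rng.choice(n, size=b1, replace=False)  # indices for the gradient
    S2 = rng.choice(n, size=b2, replace=False)  # indices for the Hessian
    S3 = rng.choice(n, size=b3, replace=False)  # indices for the 3rd-order tensor
    g = np.mean([grad_i(i, x) for i in S1], axis=0)
    B = np.mean([hess_i(i, x) for i in S2], axis=0)
    T = np.mean([third_i(i, x) for i in S3], axis=0)
    return g, B, T
```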
Related papers
- Stochastic First-Order Methods with Non-smooth and Non-Euclidean Proximal Terms for Nonconvex High-Dimensional Stochastic Optimization [2.0657831823662574]
When the nonconvex problem is complicated by nonsmoothness, the sample complexity of stochastic first-order methods may depend linearly on the problem dimension, which is undesirable for high-dimensional problems.
Our algorithms attain a sample complexity of $\mathcal{O}(\log d)/\epsilon^4$ by employing non-smooth and non-Euclidean distance functions as proximal terms.
We prove that DISFOM employing variance reduction can further sharpen this bound.
arXiv Detail & Related papers (2024-06-27T18:38:42Z) - Cubic regularized subspace Newton for non-convex optimization [3.481985817302898]
We analyze the coordinate second-order method SSCN, which can be interpreted as applying cubic regularization in random subspaces (see the sketch after this list).
We demonstrate substantial speed-ups compared to conventional first-order methods.
arXiv Detail & Related papers (2024-06-24T14:20:02Z) - Accelerated Stochastic Min-Max Optimization Based on Bias-corrected Momentum [30.01198677588252]
First-order algorithms require at least $\mathcal{O}(\varepsilon^{-4})$ complexity to find an $\varepsilon$-stationary point.
We introduce novel bias-corrected momentum algorithms with improved complexity.
The effectiveness of the method is validated through robust logistic regression using real-world datasets.
arXiv Detail & Related papers (2024-06-18T20:14:52Z) - Deterministic Nonsmooth Nonconvex Optimization [94.01526844386977]
We show that randomization is necessary to obtain a dimension-free rate.
Our techniques yield the first deterministic, dimension-free algorithm for optimizing ReLU networks.
arXiv Detail & Related papers (2023-02-16T13:57:19Z) - A Fully First-Order Method for Stochastic Bilevel Optimization [8.663726907303303]
We consider unconstrained bilevel optimization problems when only the first-order gradient oracles are available.
We propose a Fully First-order Approximation (F2SA) method, and study its non-asymptotic convergence properties.
We demonstrate that the proposed first-order method even outperforms existing second-order approaches in practice on MNIST data hyper-cleaning experiments.
arXiv Detail & Related papers (2023-01-26T05:34:21Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of general nonconvex conic optimization that minimizes a twice differentiable function subject to nonconvex conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and converge to an $\epsilon$-saddle point within $\mathcal{O}(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymptotic analysis of sampling error in 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations [54.42518331209581]
We design an algorithm that finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\| \le \epsilon$) using stochastic gradient and Hessian-vector products, and extend it to find $(\epsilon,\gamma)$-approximate second-order stationary points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z) - Dualize, Split, Randomize: Toward Fast Nonsmooth Optimization Algorithms [21.904012114713428]
We consider the sum of three convex functions, where the first one F is smooth, the second one is nonsmooth and proximable, and the third one is the composition of a nonsmooth proximable function with a linear operator.
This template problem has many applications, for instance, in image processing and machine learning.
We propose a new primal-dual algorithm, which we call PDDY, for this problem.
arXiv Detail & Related papers (2020-04-03T10:48:01Z) - Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)
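As referenced in the SSCN entry above, the following is a hypothetical Python sketch of a cubic-regularized Newton step restricted to a random coordinate subspace. All names (subspace_cubic_newton_step, grad, hess, sigma) are illustrative assumptions, and the crude fixed-point subproblem solver stands in for whatever solver the paper actually uses.

```python
import numpy as np

def subspace_cubic_newton_step(grad, hess, x, m, sigma, rng, iters=50):
    """One cubic-regularized Newton step in a random coordinate subspace.

    Illustrative sketch, not the SSCN authors' code: grad(x) and hess(x)
    return the full gradient and Hessian; m is the subspace dimension and
    sigma > 0 the cubic regularization weight.
    """
    d = x.shape[0]
    idx = rng.choice(d, size=m, replace=False)    # sample random coordinates
    g = grad(x)[idx]                              # restricted gradient
    H = hess(x)[np.ix_(idx, idx)]                 # restricted Hessian block
    # Minimize m(s) = g^T s + 1/2 s^T H s + (sigma/6) ||s||^3 over the
    # subspace via fixed-point iteration on the optimality condition
    # (H + (sigma/2) ||s|| I) s = -g (valid when the shifted matrix is PD).
    s = np.zeros(m)
    I = np.eye(m)
    for _ in range(iters):
        shift = 0.5 * sigma * np.linalg.norm(s)
        s_new = np.linalg.solve(H + shift * I, -g)
        if np.linalg.norm(s_new - s) <= 1e-10:
            s = s_new
            break
        s = s_new
    x_new = x.copy()
    x_new[idx] += s                               # update only the sampled block
    return x_new
```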