Can We Find Near-Approximately-Stationary Points of Nonsmooth Nonconvex
Functions?
- URL: http://arxiv.org/abs/2002.11962v3
- Date: Thu, 15 Apr 2021 19:09:29 GMT
- Title: Can We Find Near-Approximately-Stationary Points of Nonsmooth Nonconvex
Functions?
- Authors: Ohad Shamir
- Abstract summary: It is well-known that given a bounded, smooth nonconvex function, standard gradient-based methods can find $\epsilon$-stationary points.
Perhaps the most natural relaxation is to find points which are near such $\epsilon$-stationary points.
- Score: 31.714928102950584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well-known that given a bounded, smooth nonconvex function, standard
gradient-based methods can find $\epsilon$-stationary points (where the
gradient norm is less than $\epsilon$) in $\mathcal{O}(1/\epsilon^2)$
iterations. However, many important nonconvex optimization problems, such as
those associated with training modern neural networks, are inherently not
smooth, making these results inapplicable. Moreover, as recently pointed out in
Zhang et al. [2020], it is generally impossible to provide finite-time
guarantees for finding an $\epsilon$-stationary point of nonsmooth functions.
Perhaps the most natural relaxation of this is to find points which are near
such $\epsilon$-stationary points. In this paper, we show that even this
relaxed goal is hard to obtain in general, given only black-box access to the
function values and gradients. We also discuss the pros and cons of alternative
approaches.
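As a point of reference for the smooth setting described in the abstract, the following is a minimal sketch of plain gradient descent run until the gradient norm falls below $\epsilon$; the $\mathcal{O}(1/\epsilon^2)$ bound applies when the function is smooth (Lipschitz gradient) and bounded below, and the step size and test function here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def find_epsilon_stationary(grad, x0, epsilon=1e-3, step=0.1, max_iter=100_000):
    """Plain gradient descent, stopped once the gradient norm drops below epsilon.

    For an L-smooth function bounded below, running this with step = 1/L yields an
    epsilon-stationary point within O(1/epsilon^2) iterations, which is the
    classical guarantee referred to in the abstract.
    """
    x = np.asarray(x0, dtype=float)
    for t in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < epsilon:   # epsilon-stationary: ||grad f(x)|| < epsilon
            return x, t
        x = x - step * g
    return x, max_iter

# Illustrative smooth nonconvex test function (an assumption, not from the paper):
# f(x) = sum_i cos(x_i), with gradient -sin(x) and smoothness constant 1.
x, iters = find_epsilon_stationary(lambda x: -np.sin(x), x0=np.array([0.3, -1.2]))
```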
Related papers
- The Computational Complexity of Finding Stationary Points in Non-Convex Optimization [53.86485757442486]
Finding approximate stationary points, i.e., points where the gradient is approximately zero, of non-convex but smooth objective functions is a fundamental computational problem.
We show that finding approximate KKT points in constrained optimization is reducible to finding approximate stationary points in unconstrained optimization but the converse is impossible.
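To make the two notions in this reduction concrete, one common formalization (an illustrative convention, not necessarily the exact definitions used in that paper) contrasts unconstrained $\epsilon$-stationarity with approximate KKT conditions for $\min_x f(x)$ subject to $g_i(x) \le 0$:

```latex
\underbrace{\ \|\nabla f(x)\| \le \epsilon\ }_{\epsilon\text{-stationary (unconstrained)}}
\qquad\text{vs.}\qquad
\Big\|\nabla f(x) + \textstyle\sum_i \lambda_i \nabla g_i(x)\Big\| \le \epsilon,
\quad \lambda_i \ge 0,
\quad g_i(x) \le \epsilon,
\quad |\lambda_i\, g_i(x)| \le \epsilon .
```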
arXiv Detail & Related papers (2023-10-13T14:52:46Z)
- A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of general nonconvex conic optimization, which minimizes a twice differentiable function subject to conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
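For background, the classical augmented Lagrangian that such methods build on, written here for equality constraints $c(x)=0$ as a generic illustration (the paper's barrier-augmented variant for conic constraints is more involved), is

```latex
\mathcal{L}_{\rho}(x,\lambda) \;=\; f(x) \;+\; \lambda^{\top} c(x) \;+\; \tfrac{\rho}{2}\,\|c(x)\|^{2},
```

where each outer iteration approximately minimizes $\mathcal{L}_{\rho}(\cdot,\lambda)$ over $x$ (e.g., with a Newton-CG inner solver) and then updates $\lambda \leftarrow \lambda + \rho\, c(x)$.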
arXiv Detail & Related papers (2023-01-10T20:43:29Z)
- A first-order primal-dual method with adaptivity to local smoothness [64.62056765216386]
We consider the problem of finding a saddle point for the convex-concave objective $\min_x \max_y f(x) + \langle Ax, y\rangle - g^*(y)$, where $f$ is a convex function with locally Lipschitz gradient and $g$ is convex and possibly non-smooth.
We propose an adaptive version of the Condat-Vu algorithm, which alternates between primal gradient steps and dual proximal steps.
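A minimal sketch of the generic (non-adaptive) Condat-Vu iteration for this saddle-point problem, specialized to $g = \|\cdot\|_1$ so that the prox of $g^*$ is projection onto the $\ell_\infty$ ball; the instance, step sizes, and iteration count are illustrative assumptions, and the paper's local-smoothness adaptivity is not reproduced here.

```python
import numpy as np

def condat_vu(grad_f, A, x0, y0, tau, sigma, iters=500):
    """Generic Condat-Vu primal-dual iteration for
    min_x max_y f(x) + <Ax, y> - g*(y), here with g = ||.||_1,
    so prox_{sigma g*} is the projection onto the l_inf unit ball."""
    x, y = np.asarray(x0, float), np.asarray(y0, float)
    for _ in range(iters):
        x_new = x - tau * (grad_f(x) + A.T @ y)                      # primal gradient step
        y = np.clip(y + sigma * (A @ (2.0 * x_new - x)), -1.0, 1.0)  # dual proximal step
        x = x_new
    return x, y

# Illustrative instance (assumption): f(x) = 0.5 * ||x - b||^2, i.e. min_x f(x) + ||Ax||_1.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(10)
tau = 0.5
sigma = 1.0 / np.linalg.norm(A, 2) ** 2   # keeps 1/tau - sigma*||A||^2 >= L_f/2 with L_f = 1
x, y = condat_vu(lambda x: x - b, A, np.zeros(10), np.zeros(20), tau, sigma)
```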
arXiv Detail & Related papers (2021-10-28T14:19:30Z)
- Faster Perturbed Stochastic Gradient Methods for Finding Local Minima [92.99933928528797]
We propose Pullback, a faster perturbed stochastic gradient framework for finding local minima.
We show that Pullback with gradient estimators such as SARAH/SP and STORM can find $(\epsilon, \epsilon_H)$-approximate local minima within $\tilde{O}(\epsilon^{-3} + \epsilon_H^{-6})$ stochastic gradient evaluations.
The core idea of our framework is a "step-size pullback" scheme to control the average movement of the gradient evaluations.
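For general background on the perturbed-gradient idea behind such frameworks: inject small isotropic noise whenever the gradient is nearly zero, so that strict saddle points do not trap the iterate. The sketch below is the generic scheme from the escape-saddle-points literature, not the Pullback framework itself, and all constants and the test function are illustrative assumptions.

```python
import numpy as np

def perturbed_gradient_descent(grad, x0, step=0.05, eps=1e-3, radius=1e-2,
                               iters=5_000, seed=0):
    """Gradient descent that adds a small random perturbation whenever the
    gradient is nearly zero, so strict saddle points do not trap the iterate;
    at a genuine local minimum the iterate simply settles back."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad(x)
        if np.linalg.norm(g) < eps:
            x = x + radius * rng.standard_normal(x.shape)  # nudge off the saddle
        else:
            x = x - step * g
    return x

# Illustrative function (assumption): f(x, y) = (x^2 - 1)^2 + y^2, which has a
# strict saddle at the origin and local minima at (+-1, 0).
grad = lambda v: np.array([4.0 * v[0] * (v[0] ** 2 - 1.0), 2.0 * v[1]])
x = perturbed_gradient_descent(grad, x0=np.zeros(2))   # ends up near (+-1, 0)
```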
arXiv Detail & Related papers (2021-10-25T07:20:05Z)
- Oracle Complexity in Nonsmooth Nonconvex Optimization [49.088972349825085]
It is well-known that given a smooth, bounded-from-below (and possibly nonconvex) function, standard gradient-based methods can find $\epsilon$-stationary points, whereas in the nonsmooth case one natural approach is to work with a smooth approximation of the objective.
In this paper, we prove an inherent trade-off between oracle complexity and the smoothness attainable by such approximations.
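The smoothing route alluded to above is often instantiated with randomized smoothing, replacing $f$ by $f_\delta(x) = \mathbb{E}_{u}[f(x+\delta u)]$ with $u$ uniform on the unit ball; its gradient can be estimated from function values alone, but the smoothness constant of $f_\delta$ picks up dimension-dependent factors, which is where the trade-off bites. A minimal sketch of the standard one-point estimator (sample count, $\delta$, and the test function are illustrative assumptions):

```python
import numpy as np

def smoothed_grad_estimate(f, x, delta=0.1, samples=2000, seed=0):
    """Monte-Carlo estimate of grad f_delta(x), where
    f_delta(x) = E_{u ~ Unif(unit ball)}[ f(x + delta * u) ], using the identity
    grad f_delta(x) = (d / delta) * E_{v ~ Unif(unit sphere)}[ f(x + delta * v) * v ]."""
    rng = np.random.default_rng(seed)
    d = x.size
    v = rng.standard_normal((samples, d))
    v /= np.linalg.norm(v, axis=1, keepdims=True)          # uniform on the unit sphere
    vals = np.array([f(x + delta * vi) for vi in v])
    return (d / delta) * (vals[:, None] * v).mean(axis=0)

# Illustrative nonsmooth function (assumption): f(x) = ||x||_1.
g = smoothed_grad_estimate(lambda z: np.abs(z).sum(), np.array([0.5, -0.2, 0.0]))
```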
arXiv Detail & Related papers (2021-04-14T10:42:45Z)
- Escaping Saddle Points for Nonsmooth Weakly Convex Functions via Perturbed Proximal Algorithms [0.0]
Main results are based on a novel characterization of $\epsilon$-approximate local minima for nonsmooth functions.
We show that under standard assumptions, the perturbed proximal point, perturbed proximal gradient and perturbed proximal linear algorithms find $\epsilon$-approximate local minima of nonsmooth weakly convex functions.
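As a schematic of the proximal-gradient member of that family (the placement of the perturbation, the step sizes, and the $\ell_1$-regularized instance are illustrative assumptions rather than the paper's exact algorithm):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def perturbed_prox_gradient(grad_g, lam, x0, step=0.1, noise=1e-3, iters=2000, seed=0):
    """Schematic perturbed proximal gradient for f = g + lam * ||.||_1 with g smooth:
    a gradient step on g, a small random perturbation, then the prox of the
    nonsmooth part. A generic sketch, not the paper's exact scheme."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        z = x - step * grad_g(x) + noise * rng.standard_normal(x.shape)
        x = soft_threshold(z, step * lam)
    return x

# Illustrative instance (assumption): g(x) = 0.5 * ||x - b||^2 with lam = 0.3.
b = np.array([1.0, -0.2, 0.05])
x = perturbed_prox_gradient(lambda x: x - b, lam=0.3, x0=np.zeros(3))
```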
arXiv Detail & Related papers (2021-02-04T19:17:13Z)
- Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
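A relaxed stationarity notion commonly used in this line of work, including in the Zhang et al. [2020] result cited in the abstract above, is $(\delta,\epsilon)$-Goldstein stationarity, defined through the Goldstein $\delta$-subdifferential; conventions vary slightly across papers, so this is stated for reference only:

```latex
\partial_{\delta} f(x) \;=\; \mathrm{conv}\Big( \textstyle\bigcup_{\|y - x\| \le \delta} \partial f(y) \Big),
\qquad
x \text{ is } (\delta,\epsilon)\text{-stationary} \;\iff\; \mathrm{dist}\big(0,\ \partial_{\delta} f(x)\big) \le \epsilon .
```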
arXiv Detail & Related papers (2020-02-10T23:23:04Z)
- Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems [36.645753881826955]
In this paper, we propose a novel method called Stochastic Recursive Gradient Descent Ascent (SREDA), which estimates gradients more efficiently using variance reduction.
This method achieves the best known stochastic gradient complexity for this problem.
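For orientation, the plain stochastic gradient descent ascent baseline for $\min_x \max_y F(x,y)$ with $F$ nonconvex in $x$ and strongly concave in $y$ looks as follows; SREDA's recursive variance reduction, which is the paper's actual contribution, is not reproduced in this sketch, and the toy instance, noise level, and step sizes are illustrative assumptions.

```python
import numpy as np

def stochastic_gda(grad_x, grad_y, x0, y0, eta_x=0.01, eta_y=0.1, iters=5000, seed=0):
    """Plain stochastic gradient descent ascent for min_x max_y F(x, y):
    a noisy descent step in x followed by a noisy ascent step in y.
    Baseline only; SREDA layers recursive variance reduction on top of this."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x0, float), np.asarray(y0, float)
    for _ in range(iters):
        gx = grad_x(x, y) + 0.01 * rng.standard_normal(x.shape)  # stochastic gradient in x
        x = x - eta_x * gx
        gy = grad_y(x, y) + 0.01 * rng.standard_normal(y.shape)  # stochastic gradient in y
        y = y + eta_y * gy
    return x, y

# Toy instance (assumption): F(x, y) = x^T y - 0.5 * ||y||^2, strongly concave in y.
x, y = stochastic_gda(lambda x, y: y, lambda x, y: x - y, x0=np.ones(3), y0=np.zeros(3))
```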
arXiv Detail & Related papers (2020-01-11T09:05:03Z)
- Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond [25.845034951660544]
We develop a variant of accelerated gradient descent that computes an $\epsilon$-approximate minimizer of a smooth quasar-convex function.
We show that no first-order method can improve upon ours.
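For reference, a standard formulation of the $\gamma$-quasar-convexity condition underlying this result (with $\gamma \in (0,1]$ and $x^{*}$ a minimizer of $f$; $\gamma = 1$ recovers star-convexity) is:

```latex
f(x^{*}) \;\ge\; f(x) \;+\; \frac{1}{\gamma}\,\big\langle \nabla f(x),\, x^{*} - x \big\rangle
\qquad \text{for all } x .
```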
arXiv Detail & Related papers (2019-06-27T22:39:35Z)