Oracle Complexity in Nonsmooth Nonconvex Optimization
- URL: http://arxiv.org/abs/2104.06763v1
- Date: Wed, 14 Apr 2021 10:42:45 GMT
- Title: Oracle Complexity in Nonsmooth Nonconvex Optimization
- Authors: Guy Kornowski, Ohad Shamir
- Abstract summary: It is well-known that given a smooth, bounded-from-below, and possibly nonconvex function, gradient-based methods can find $\epsilon$-stationary points in $\mathcal{O}(1/\epsilon^2)$ iterations, but many important nonconvex problems are not smooth.
In this paper, we prove an inherent trade-off between oracle complexity and smoothness when reducing nonsmooth nonconvex optimization to smooth optimization.
- Score: 49.088972349825085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well-known that given a smooth, bounded-from-below, and possibly
nonconvex function, standard gradient-based methods can find
$\epsilon$-stationary points (with gradient norm less than $\epsilon$) in
$\mathcal{O}(1/\epsilon^2)$ iterations. However, many important nonconvex
optimization problems, such as those associated with training modern neural
networks, are inherently not smooth, making these results inapplicable. In this
paper, we study nonsmooth nonconvex optimization from an oracle complexity
viewpoint, where the algorithm is assumed to be given access only to local
information about the function at various points. We provide two main results
(under mild assumptions): First, we consider the problem of getting near
$\epsilon$-stationary points. This is perhaps the most natural relaxation of
finding $\epsilon$-stationary points, which is impossible in the nonsmooth
nonconvex case. We prove that this relaxed goal cannot be achieved efficiently,
for any distance and $\epsilon$ smaller than some constants. Our second result
deals with the possibility of tackling nonsmooth nonconvex optimization by
reduction to smooth optimization: Namely, applying smooth optimization methods
on a smooth approximation of the objective function. For this approach, we
prove an inherent trade-off between oracle complexity and smoothness: On the
one hand, smoothing a nonsmooth nonconvex function can be done very efficiently
(e.g., by randomized smoothing), but with dimension-dependent factors in the
smoothness parameter, which can strongly affect iteration complexity when
plugging into standard smooth optimization methods. On the other hand, these
dimension factors can be eliminated with suitable smoothing methods, but only
by making the oracle complexity of the smoothing process exponentially large.
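To make the smoothing trade-off above concrete, the following is a minimal sketch (not taken from the paper; the function name smoothed_grad_estimate, the toy objective, and all constants are illustrative assumptions) of randomized smoothing with a zero-order gradient estimator. The surrogate $f_\delta(x) = \mathbb{E}_{u \sim \mathrm{Unif}(B)}[f(x+\delta u)]$ is smooth whenever $f$ is Lipschitz, but its gradient-Lipschitz constant typically scales with the dimension (on the order of $L\sqrt{d}/\delta$ for an $L$-Lipschitz $f$), which is the dimension-dependent factor that then enters the iteration complexity of standard smooth methods.

import numpy as np

def smoothed_grad_estimate(f, x, delta=0.1, num_samples=1000, rng=None):
    # Monte-Carlo estimate of grad f_delta(x), where
    #   f_delta(x) = E_{u ~ Unif(unit ball)}[ f(x + delta * u) ].
    # Uses the standard zero-order identity
    #   grad f_delta(x) = (d / delta) * E_{v ~ Unif(unit sphere)}[ f(x + delta * v) * v ],
    # so only function evaluations of f are required (no subgradients).
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    grad = np.zeros(d)
    for _ in range(num_samples):
        v = rng.normal(size=d)
        v /= np.linalg.norm(v)  # uniform random direction on the unit sphere
        grad += f(x + delta * v) * v
    return (d / delta) * grad / num_samples

# Toy usage on a nonsmooth, nonconvex Lipschitz objective (illustration only):
f = lambda x: np.abs(x[0]) - np.abs(x[1])
x = np.array([0.3, -0.2])
print(smoothed_grad_estimate(f, x, delta=0.05, num_samples=5000))

Note that driving $\delta$ down (so that near-stationarity of $f_\delta$ says something about $f$) inflates both the variance of this estimator and the smoothness constant, which is one concrete face of the oracle-complexity/smoothness trade-off discussed in the abstract.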
Related papers
- Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity [50.25258834153574]
We focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods.
In particular, we derive improved convergence rates for Gradient Descent with Smoothed Gradient Clipping and for Gradient Descent with Polyak Stepsizes.
arXiv Detail & Related papers (2024-09-23T13:11:37Z) - An Accelerated Gradient Method for Convex Smooth Simple Bilevel Optimization [16.709026203727007]
We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem.
We measure the performance of our method in terms of suboptimality and infeasibility errors.
arXiv Detail & Related papers (2024-02-12T22:34:53Z) - Nonsmooth Projection-Free Optimization with Functional Constraints [12.20060970177791]
This paper presents a subgradient-based algorithm for constrained nonsmooth convex optimization.
Our proposed algorithm can handle nonsmooth problems with general convex functional inequality constraints.
Similar performance is observed when deterministic subgradients are replaced with stochastic subgradients.
arXiv Detail & Related papers (2023-11-18T23:06:33Z) - An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization [37.300102993926046]
We study the complexity of producing $(\delta,\epsilon)$-stationary points of Lipschitz objectives which are possibly neither smooth nor convex, using only zero-order evaluations.
Our analysis is based on a simple yet powerful object, the Goldstein-subdifferential set, which underlies recent advancements in nonsmooth nonconvex optimization.
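For readers unfamiliar with the object named above, the Goldstein $\delta$-subdifferential and the induced notion of approximate stationarity are standard (this is general background, not text from that abstract):

$$\partial_\delta f(x) \;=\; \mathrm{conv}\Big(\bigcup_{y:\,\|y-x\|\le\delta} \partial f(y)\Big), \qquad x \text{ is } (\delta,\epsilon)\text{-stationary if } \min_{g\in\partial_\delta f(x)} \|g\| \le \epsilon.$$

Such $(\delta,\epsilon)$-stationary points can be found efficiently even for nonsmooth nonconvex Lipschitz functions, which distinguishes this goal from the near-$\epsilon$-stationarity notion shown to be intractable in the main paper above.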
arXiv Detail & Related papers (2023-07-10T11:56:04Z) - An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z) - Deterministic Nonsmooth Nonconvex Optimization [94.01526844386977]
We show that randomization is necessary to obtain a dimension-free algorithm.
Our algorithm yields the first deterministic dimension-free algorithm for optimizing ReLU networks.
arXiv Detail & Related papers (2023-02-16T13:57:19Z) - Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion [56.92236659731376]
We present new algorithms for optimizing non-smooth, non-convex objectives based on a novel analysis technique.
For deterministic second-order smooth objectives, applying advanced optimistic online learning techniques enables a new $O(\epsilon^{-1.5}\delta^{-0.5})$ complexity, recovering optimal or best-known results.
arXiv Detail & Related papers (2023-02-07T22:09:20Z) - Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z) - Optimal Algorithms for Convex Nested Stochastic Composite Optimization [13.200502573462712]
Convex nested stochastic composite optimization (NSCO) has received considerable attention for its applications in reinforcement learning and risk-averse optimization.
The current NSCO algorithms have worse oracle complexities, by orders of magnitude, than those for simpler composite optimization problems without the nested structure.
We develop order-optimal algorithms for the convex NSCO problem constructed from an arbitrary composition of smooth, structured non-smooth and general non-smooth layer functions.
arXiv Detail & Related papers (2020-11-19T19:22:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.