Continuized Acceleration for Quasar Convex Functions in Non-Convex
Optimization
- URL: http://arxiv.org/abs/2302.07851v1
- Date: Wed, 15 Feb 2023 18:35:45 GMT
- Title: Continuized Acceleration for Quasar Convex Functions in Non-Convex
Optimization
- Authors: Jun-Kun Wang and Andre Wibisono
- Abstract summary: Quasar convexity is a condition that allows some first-order methods to efficiently minimize a function even when the optimization landscape is non-convex.
We also show that an accelerated linear rate is attainable under certain conditions for smooth functions satisfying quasar convexity together with one-point strong convexity, Polyak-Lojasiewicz, or quadratic growth, while acceleration is not known in general for these classes of functions.
- Score: 13.427128424538502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quasar convexity is a condition that allows some first-order methods to
efficiently minimize a function even when the optimization landscape is
non-convex. Previous works develop near-optimal accelerated algorithms for
minimizing this class of functions; however, they require a binary-search
subroutine that results in multiple gradient evaluations in each
iteration, and consequently the total number of gradient evaluations does not
match a known lower bound. In this work, we show that a recently proposed
continuized Nesterov acceleration can be applied to minimizing quasar convex
functions and achieves the optimal bound with high probability. Furthermore,
we find that the objective functions of training generalized linear models
(GLMs) satisfy quasar convexity, which broadens the applicability of the
relevant algorithms, given that known practical examples of quasar convexity in
non-convex learning are scarce in the literature. We also show that if a smooth
and one-point strongly convex, Polyak-Lojasiewicz, or quadratic-growth function
satisfies quasar convexity, then attaining an accelerated linear rate for
minimizing the function is possible under certain conditions, while
acceleration is not known in general for these classes of functions.
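For context, the condition the abstract refers to has a standard form in this literature (stated here as background, not quoted from the paper): a function $f$ with minimizer $x^\ast$ is $\gamma$-quasar convex for some $\gamma \in (0,1]$ if the first inequality below holds, and $(\gamma,\mu)$-strongly quasar convex if the second holds.

```latex
% Quasar convexity (standard background definition, not quoted from the paper):
\[
  f(x^\ast) \;\ge\; f(x) + \tfrac{1}{\gamma}\,\langle \nabla f(x),\, x^\ast - x \rangle
  \qquad \text{for all } x,
\]
% and (\gamma,\mu)-strong quasar convexity additionally requires
\[
  f(x^\ast) \;\ge\; f(x) + \tfrac{1}{\gamma}\,\langle \nabla f(x),\, x^\ast - x \rangle
  + \tfrac{\mu}{2}\,\lVert x^\ast - x \rVert^2
  \qquad \text{for all } x.
\]
```

Taking $\gamma = 1$ and $\mu = 0$ recovers star-convexity with respect to $x^\ast$, which every convex function satisfies; smaller $\gamma$ allows increasingly non-convex landscapes. Roughly speaking, the continuized scheme mentioned above spends one gradient evaluation per iteration, taken at random times driven by a Poisson process, which is what lets the total gradient-evaluation count match the lower bound without a binary-search subroutine.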
Related papers
- An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes [17.804065824245402]
In many machine learning applications, each loss function is non-negative and can be expressed as the composition of a square and its real-valued square root.
We show how to apply the Gauss-Newton method or the Levenberg-Marquardt method to minimize the average of smooth but possibly non-convex, non-negative functions.
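To make the Gauss-Newton / Levenberg-Marquardt connection concrete, here is a minimal generic Levenberg-Marquardt sketch on a toy nonlinear least-squares problem, where each non-negative loss $f_i(w) = r_i(w)^2$ is the square of a real-valued residual $r_i(w)$. The model, data, and damping schedule are illustrative assumptions; this is not the adaptive stepsize rule proposed in the paper.

```python
import numpy as np

# Toy model y ~ w0 * exp(w1 * x): each loss f_i(w) = r_i(w)^2 is the square of the
# real-valued residual r_i(w) = w0 * exp(w1 * x_i) - y_i (illustrative choice only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 50)
y = 1.5 * np.exp(0.8 * x) + 0.05 * rng.standard_normal(x.size)

def residuals(w):
    return w[0] * np.exp(w[1] * x) - y

def jacobian(w):
    # J[i, j] = d r_i / d w_j, computed analytically for this toy model.
    e = np.exp(w[1] * x)
    return np.column_stack([e, w[0] * x * e])

def levenberg_marquardt(w, lam=1e-2, iters=50):
    for _ in range(iters):
        r, J = residuals(w), jacobian(w)
        # Damped Gauss-Newton step: solve (J^T J + lam * I) d = -J^T r.
        d = np.linalg.solve(J.T @ J + lam * np.eye(w.size), -J.T @ r)
        w_new = w + d
        # Crude damping update: accept the step only if the residual norm decreases.
        if np.sum(residuals(w_new) ** 2) < np.sum(r ** 2):
            w, lam = w_new, lam * 0.5
        else:
            lam *= 2.0
    return w

w_hat = levenberg_marquardt(np.array([1.0, 0.5]))
print("estimated parameters:", w_hat)  # should approach [1.5, 0.8]
```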
arXiv Detail & Related papers (2024-07-05T08:53:06Z)
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm only has access to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z)
- A simple uniformly optimal method without line search for convex optimization [9.280355951055865]
We show that line search is superfluous in attaining the optimal rate of convergence for solving a convex optimization problem whose parameters are not given a priori.
We present a novel accelerated gradient descent type algorithm called AC-FGM that can achieve an optimal $\mathcal{O}(1/k^2)$ rate of convergence for smooth convex optimization.
arXiv Detail & Related papers (2023-10-16T05:26:03Z)
- Convex and Non-convex Optimization Under Generalized Smoothness [69.69521650503431]
The analysis of convex and non-convex optimization methods often requires the Lipschitzness of the gradient, which limits the analysis to functions bounded by quadratics.
Recent work generalizes this setting via a non-uniform smoothness condition that bounds the Hessian norm by a function of the gradient norm.
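For reference, the non-uniform smoothness condition mentioned here is usually written as a bound on the Hessian norm in terms of the gradient norm; the affine special case is the widely used $(L_0, L_1)$-smoothness condition. This is standard background, not a statement of the paper's exact assumption.

```latex
% Non-uniform (generalized) smoothness: the Hessian norm is bounded by a
% non-decreasing function \ell of the gradient norm,
\[
  \lVert \nabla^2 f(x) \rVert \;\le\; \ell\bigl(\lVert \nabla f(x) \rVert\bigr),
\]
% whose affine special case is (L_0, L_1)-smoothness,
\[
  \lVert \nabla^2 f(x) \rVert \;\le\; L_0 + L_1 \lVert \nabla f(x) \rVert ,
\]
% and the constant case \ell \equiv L recovers the classical Lipschitz-gradient assumption.
```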
arXiv Detail & Related papers (2023-06-02T04:21:59Z)
- High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z)
- Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
- Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points [8.452349885923507]
Gradient-based first-order convex optimization algorithms find widespread applicability in a variety of domains, including machine learning tasks.
Motivated by recent advances in fixed-time stability theory for dynamical systems, we introduce a framework for designing accelerated optimization algorithms.
For functions that admit non-degenerate saddle points, we show that the time required to evade these saddle points is uniformly bounded for all initial conditions.
arXiv Detail & Related papers (2022-12-07T16:36:23Z)
- Proximal Subgradient Norm Minimization of ISTA and FISTA [8.261388753972234]
We show that the squared proximal subgradient norm for the class of iterative shrinkage-thresholding algorithms converges at an inverse square rate.
We also show that the squared proximal subgradient norm for the class of faster iterative shrinkage-thresholding algorithms (FISTA) is accelerated to convergence at an inverse cubic rate.
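For background on this entry, here is a minimal NumPy sketch of ISTA and FISTA on a toy lasso problem; the problem instance and step size are illustrative assumptions, and the sketch shows the iterations themselves rather than the paper's proximal-subgradient-norm analysis.

```python
import numpy as np

# Toy lasso instance: minimize 0.5 * ||A w - b||^2 + lam * ||w||_1.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
b = rng.standard_normal(40)
lam = 0.1
L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient of the smooth part

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(iters=500):
    w = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ w - b)
        w = soft_threshold(w - grad / L, lam / L)
    return w

def fista(iters=500):
    w = w_prev = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(iters):
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = w + ((t - 1.0) / t_next) * (w - w_prev)  # Nesterov-style extrapolation
        grad = A.T @ (A @ y - b)
        w_prev, w, t = w, soft_threshold(y - grad / L, lam / L), t_next
    return w

def objective(w):
    return 0.5 * np.sum((A @ w - b) ** 2) + lam * np.sum(np.abs(w))

print("ISTA  objective:", objective(ista()))
print("FISTA objective:", objective(fista()))
```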
arXiv Detail & Related papers (2022-11-03T06:50:19Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
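To illustrate what zeroth-order access means here, below is a generic two-point random-direction gradient estimator plugged into plain gradient descent. The objective, smoothing radius, and sample counts are illustrative assumptions; this sketch does not implement the paper's ZO-HGD method or its coordinate importance sampling.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Black-box objective: only function values are observable (toy quadratic).
    return 0.5 * np.sum((x - 1.0) ** 2)

def zo_gradient(f, x, mu=1e-4, num_dirs=8):
    # Two-point estimator: average (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u over
    # random Gaussian directions u; this estimates the gradient of a smoothed f.
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.size)
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / num_dirs

x = np.zeros(10)
step = 0.05
for _ in range(300):
    x = x - step * zo_gradient(f, x)

print("distance to minimizer:", np.linalg.norm(x - 1.0))  # should be close to 0
```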
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z)