Adaptive First-and Zeroth-order Methods for Weakly Convex Stochastic
Optimization Problems
- URL: http://arxiv.org/abs/2005.09261v2
- Date: Sun, 24 May 2020 15:14:43 GMT
- Title: Adaptive First-and Zeroth-order Methods for Weakly Convex Stochastic
Optimization Problems
- Authors: Parvin Nazari, Davoud Ataee Tarzanagh, George Michailidis
- Abstract summary: We analyze a new family of adaptive subgradient methods for solving an important class of weakly convex (possibly nonsmooth) optimization problems.
Experimental results indicate that the proposed algorithms empirically outperform stochastic gradient descent and its zeroth-order variant.
- Score: 12.010310883787911
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we design and analyze a new family of adaptive subgradient
methods for solving an important class of weakly convex (possibly nonsmooth)
stochastic optimization problems. Adaptive methods that use exponential moving
averages of past gradients to update search directions and learning rates have
recently attracted a lot of attention for solving optimization problems that
arise in machine learning. Nevertheless, their convergence analysis almost
exclusively requires smoothness and/or convexity of the objective function. In
contrast, we establish non-asymptotic rates of convergence of first and
zeroth-order adaptive methods and their proximal variants for a reasonably
broad class of nonsmooth \& nonconvex optimization problems. Experimental
results indicate how the proposed algorithms empirically outperform stochastic
gradient descent and its zeroth-order variant for solving such optimization
problems.
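As a minimal illustrative sketch (not the authors' implementation), the snippet below shows the kind of update the abstract describes: exponential moving averages of past stochastic (sub)gradients drive both the search direction and the coordinate-wise step size, and a two-point random-direction estimator stands in for the subgradient when only function values are available (the zeroth-order case). The parameter names (beta1, beta2, mu), the Adam-style form of the update, and the toy phase-retrieval loss are assumptions made for illustration, not details taken from the paper.

import numpy as np

def adaptive_subgradient_step(x, g, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    # EMA of past subgradients sets the search direction; EMA of their squares sets the scaling.
    state["m"] = beta1 * state["m"] + (1 - beta1) * g
    state["v"] = beta2 * state["v"] + (1 - beta2) * g * g
    return x - lr * state["m"] / (np.sqrt(state["v"]) + eps), state

def two_point_zo_grad(f, x, mu=1e-3, rng=None):
    # Zeroth-order surrogate: two-point random-direction gradient estimate from function values only.
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x.shape)
    return (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u

# Toy weakly convex, nonsmooth objective (phase-retrieval style): f_i(x) = |<a_i, x>^2 - b_i|.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10))
b = (A @ rng.standard_normal(10)) ** 2

def stochastic_subgrad(x):
    i = rng.integers(len(b))
    a, r = A[i], (A[i] @ x) ** 2 - b[i]
    return np.sign(r) * 2.0 * (a @ x) * a  # subgradient of |<a_i, x>^2 - b_i|

x = rng.standard_normal(10)
state = {"m": np.zeros(10), "v": np.zeros(10)}
for _ in range(2000):
    x, state = adaptive_subgradient_step(x, stochastic_subgrad(x), state)

With only a function-value oracle, the call stochastic_subgrad(x) would be replaced by something like two_point_zo_grad(lambda z: abs((A[i] @ z) ** 2 - b[i]), x) for a sampled index i; the proximal variants mentioned in the abstract would additionally compose each such step with a proximal operator.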
Related papers
- Accelerated stochastic first-order method for convex optimization under heavy-tailed noise [3.5877352183559386]
We study convex composite optimization problems, where the objective function is given by the sum of a prox-friendly function and a convex function whose subgradients are estimated under heavy-tailed noise. We demonstrate that a vanilla algorithm can achieve optimal complexity for these problems without additional modifications such as clipping or normalization.
arXiv Detail & Related papers (2025-10-13T17:45:05Z) - Zeroth-Order Methods for Stochastic Nonconvex Nonsmooth Composite Optimization [51.63258496873442]
Previous work on this composite optimization problem requires the major part of the objective to be smooth, an assumption that excludes some machine learning examples. In this work we propose two new algorithms for such problems. We show that these algorithms are effective using numerical experiments.
arXiv Detail & Related papers (2025-10-06T02:35:42Z) - Adaptive Algorithms with Sharp Convergence Rates for Stochastic Hierarchical Optimization [31.032959636901086]
We propose novel adaptive algorithms for hierarchical optimization problems. Our algorithms achieve sharp convergence rates without prior knowledge of the noise level. Experiments on synthetic and deep learning tasks demonstrate the effectiveness of our proposed algorithms.
arXiv Detail & Related papers (2025-09-18T20:17:18Z) - A simple uniformly optimal method without line search for convex optimization [9.280355951055865]
We show that line search is superfluous in attaining the optimal rate of convergence for solving a convex optimization problem whose parameters are not given a priori.
We present a novel accelerated gradient descent type algorithm called AC-FGM that can achieve an optimal $\mathcal{O}(1/k^2)$ rate of convergence for smooth convex optimization.
arXiv Detail & Related papers (2023-10-16T05:26:03Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable components separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Efficient Gradient Approximation Method for Constrained Bilevel
Optimization [2.0305676256390934]
Bilevel optimization has been developed for problems with large-scale, high-dimensional data.
This paper considers a constrained bilevel problem with convex and non-differentiable approximations.
arXiv Detail & Related papers (2023-02-03T19:34:56Z) - Consistent Approximations in Composite Optimization [0.0]
We develop a framework for consistent approximations of optimization problems.
The framework is developed for a broad class of optimization problems.
The analysis is illustrated with extended nonlinear programming examples.
arXiv Detail & Related papers (2022-01-13T23:57:08Z) - Local Quadratic Convergence of Stochastic Gradient Descent with Adaptive
Step Size [29.15132344744801]
We establish local convergence for stochastic gradient descent with adaptive step size for problems such as matrix inversion.
We show that these first-order optimization methods can achieve sub-linear or linear convergence.
arXiv Detail & Related papers (2021-12-30T00:50:30Z) - SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients [99.13839450032408]
It is desirable to design a universal framework for adaptive algorithms that can solve general problems.
In particular, our novel framework provides convergence guarantees for adaptive methods in the nonconvex setting.
arXiv Detail & Related papers (2021-06-15T15:16:28Z) - Unified Convergence Analysis for Adaptive Optimization with Moving Average Estimator [75.05106948314956]
We show that an increasingly large momentum parameter for the first-order moment is sufficient for adaptive scaling.
We also give insights for increasing the momentum in a stagewise manner, in accordance with a stagewise decreasing step size.
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep networks and other problems in data science, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope (recalled in the sketch after this list).
Our analysis works with mini-batch size of $1$, constant first- and second-order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for non-concave min-max problems.
Our experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
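Both the abstract above and the weakly-convex-constrained entry in this list measure progress through the Moreau envelope, the standard near-stationarity certificate for weakly convex problems. The lines below recall the textbook definitions only; the notation ($\varphi$, $\rho$, $\lambda$) is generic and not taken from either paper.

% A function phi is rho-weakly convex if phi + (rho/2)||.||^2 is convex.
% For 0 < lambda < 1/rho, its Moreau envelope and proximal map are
\varphi_\lambda(x) \;=\; \min_{y}\Big\{\varphi(y) + \tfrac{1}{2\lambda}\|y - x\|^2\Big\},
\qquad
\operatorname{prox}_{\lambda\varphi}(x) \;=\; \operatorname*{arg\,min}_{y}\Big\{\varphi(y) + \tfrac{1}{2\lambda}\|y - x\|^2\Big\}.
% The envelope is differentiable even when phi is not, with
\nabla\varphi_\lambda(x) \;=\; \tfrac{1}{\lambda}\big(x - \operatorname{prox}_{\lambda\varphi}(x)\big),
% so a small \|\nabla\varphi_\lambda(x)\| certifies that x is close to a near-stationary point of phi.

Rates such as the $\tilde{\mathcal{O}}(t^{-1/4})$ bound quoted above are stated for $\|\nabla\varphi_\lambda(x_t)\|$ rather than for a gradient of the (possibly nonsmooth) objective itself.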
This list is automatically generated from the titles and abstracts of the papers in this site.