Adaptive First-and Zeroth-order Methods for Weakly Convex Stochastic
Optimization Problems
- URL: http://arxiv.org/abs/2005.09261v2
- Date: Sun, 24 May 2020 15:14:43 GMT
- Title: Adaptive First-and Zeroth-order Methods for Weakly Convex Stochastic
Optimization Problems
- Authors: Parvin Nazari, Davoud Ataee Tarzanagh, George Michailidis
- Abstract summary: We analyze a new family of adaptive subgradient methods for solving an important class of weakly convex (possibly nonsmooth) optimization problems.
Experimental results indicate that the proposed algorithms empirically outperform stochastic gradient descent and its zeroth-order variant on such problems.
- Score: 12.010310883787911
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we design and analyze a new family of adaptive subgradient
methods for solving an important class of weakly convex (possibly nonsmooth)
stochastic optimization problems. Adaptive methods that use exponential moving
averages of past gradients to update search directions and learning rates have
recently attracted a lot of attention for solving optimization problems that
arise in machine learning. Nevertheless, their convergence analysis almost
exclusively requires smoothness and/or convexity of the objective function. In
contrast, we establish non-asymptotic rates of convergence of first and
zeroth-order adaptive methods and their proximal variants for a reasonably
broad class of nonsmooth & nonconvex optimization problems. Experimental
results indicate how the proposed algorithms empirically outperform stochastic
gradient descent and its zeroth-order variant for solving such optimization
problems.
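The abstract describes two ingredients that a short sketch can make concrete: an Adam-style update that keeps exponential moving averages of past (sub)gradients to set both the search direction and a per-coordinate learning rate, and a zeroth-order variant that replaces the subgradient with a two-point finite-difference estimate built from function values only. The Python sketch below illustrates both on a weakly convex, nonsmooth toy objective (robust linear regression); the objective, the step-size schedule, and all function names are illustrative assumptions, not the paper's exact algorithms.

```python
# Minimal sketch (not the authors' reference implementation) of an Adam-style
# adaptive subgradient step and a two-point zeroth-order gradient estimator
# for a weakly convex, nonsmooth objective. All names here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy weakly convex, nonsmooth problem: robust linear regression,
# f(x) = E_i |a_i^T x - b_i|, sampled one index i at a time.
A = rng.normal(size=(200, 20))
x_true = rng.normal(size=20)
b = A @ x_true
f = lambda x, i: abs(A[i] @ x - b[i])

def subgrad(x, i):
    """Stochastic subgradient of |a_i^T x - b_i| at x."""
    return np.sign(A[i] @ x - b[i]) * A[i]

def zo_grad(x, i, mu=1e-3):
    """Two-point zeroth-order estimator using only function values."""
    u = rng.normal(size=x.shape)
    return (f(x + mu * u, i) - f(x, i)) / mu * u

def adaptive_run(grad_fn, steps=2000, alpha=0.05,
                 beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam-style loop: exponential moving averages of (sub)gradients set
    both the search direction and a per-coordinate learning rate."""
    x = np.zeros(20)
    m = np.zeros_like(x)   # first moment (search direction)
    v = np.zeros_like(x)   # second moment (adaptive step size)
    for t in range(1, steps + 1):
        i = rng.integers(len(b))
        g = grad_fn(x, i)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        x -= alpha / np.sqrt(t) * m / (np.sqrt(v) + eps)
    return x

x_fo = adaptive_run(subgrad)   # first-order: stochastic subgradients
x_zo = adaptive_run(zo_grad)   # zeroth-order: function values only
print(np.linalg.norm(x_fo - x_true), np.linalg.norm(x_zo - x_true))
```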
Related papers
- A simple uniformly optimal method without line search for convex optimization [9.280355951055865]
We show that line search is superfluous in attaining the optimal rate of convergence for solving a convex optimization problem whose parameters are not given a priori.
We present a novel accelerated gradient descent-type algorithm called AC-FGM that can achieve an optimal $\mathcal{O}(1/k^2)$ rate of convergence for smooth convex optimization.
arXiv Detail & Related papers (2023-10-16T05:26:03Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable parts separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Efficient Gradient Approximation Method for Constrained Bilevel
Optimization [2.0305676256390934]
Bilevel optimization has been developed to handle large-scale, high-dimensional data.
This paper considers a constrained bilevel problem with convex and non-differentiable approximations.
arXiv Detail & Related papers (2023-02-03T19:34:56Z) - Consistent Approximations in Composite Optimization [0.0]
We develop a framework for consistent approximations of optimization problems.
The framework is developed for a broad class of optimization problems.
Applications to extended nonlinear programming illustrate the approach.
arXiv Detail & Related papers (2022-01-13T23:57:08Z) - Local Quadratic Convergence of Stochastic Gradient Descent with Adaptive
Step Size [29.15132344744801]
We establish local convergence for gradient descent with adaptive step size for problems such as matrix inversion.
We show that these first-order optimization methods can achieve sub-linear or linear convergence.
arXiv Detail & Related papers (2021-12-30T00:50:30Z) - SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients [99.13839450032408]
It is desired to design a universal framework for adaptive algorithms to solve general problems.
In particular, our novel framework provides adaptive methods with convergence guarantees in the nonconvex setting.
arXiv Detail & Related papers (2021-06-15T15:16:28Z) - Unified Convergence Analysis for Adaptive Optimization with Moving Average Estimator [75.05106948314956]
We show that an increasing or large momentum parameter for the first-order moment is sufficient for convergence with adaptive scaling.
We also give insights into increasing the momentum in a stagewise manner, in accordance with a stagewise decreasing step size.
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex problems in deep networks and other data science applications, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope (recalled in the LaTeX sketch after this list).
Our analysis works with a mini-batch size of $1$, constant first- and second-order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for nonconvex-nonconcave min-max problems.
Our experiments show that the empirical advantage of adaptive gradient algorithms over non-adaptive ones can be observed in GAN training.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
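Both the main paper and the related paper on adaptive algorithms for weakly convex constrained optimization state their rates for the norm of the gradient of the Moreau envelope. The LaTeX sketch below recalls that standard stationarity measure; the constants and the near-stationarity implication follow the usual convention for a $\rho$-weakly convex $f$ with $0 < \lambda < 1/\rho$ and are not quoted from either paper.

```latex
% Moreau envelope stationarity measure (standard convention, not quoted
% verbatim from either paper): for a rho-weakly convex f and 0 < lambda < 1/rho,
\[
  f_\lambda(x) \;=\; \min_{y}\Big\{ f(y) + \tfrac{1}{2\lambda}\,\lVert y - x\rVert^2 \Big\},
  \qquad
  \nabla f_\lambda(x) \;=\; \lambda^{-1}\big(x - \operatorname{prox}_{\lambda f}(x)\big).
\]
% A small envelope gradient certifies near-stationarity of the proximal point
% \hat{x} = prox_{lambda f}(x):
\[
  \lVert \nabla f_\lambda(x)\rVert \le \varepsilon
  \;\Longrightarrow\;
  \lVert \hat{x} - x\rVert \le \lambda\varepsilon
  \quad\text{and}\quad
  \operatorname{dist}\big(0, \partial f(\hat{x})\big) \le \varepsilon .
\]
```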
This list is automatically generated from the titles and abstracts of the papers in this site.