Proximal Gradient Algorithm with Momentum and Flexible Parameter Restart
for Nonconvex Optimization
- URL: http://arxiv.org/abs/2002.11582v3
- Date: Mon, 27 Apr 2020 15:23:01 GMT
- Title: Proximal Gradient Algorithm with Momentum and Flexible Parameter Restart
for Nonconvex Optimization
- Authors: Yi Zhou and Zhe Wang and Kaiyi Ji and Yingbin Liang and Vahid Tarokh
- Abstract summary: Various types of parameter restart schemes have been proposed for accelerated gradient algorithms to facilitate their practical convergence in convex optimization.
In this paper, we propose an accelerated proximal gradient algorithm with parameter restart (APG-restart) for solving nonconvex and nonsmooth problems.
- Score: 73.38702974136102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Various types of parameter restart schemes have been proposed for accelerated
gradient algorithms to facilitate their practical convergence in convex
optimization. However, the convergence properties of accelerated gradient
algorithms under parameter restart remain obscure in nonconvex optimization. In
this paper, we propose a novel accelerated proximal gradient algorithm with
parameter restart (named APG-restart) for solving nonconvex and nonsmooth
problems. Our APG-restart is designed to 1) allow for adopting flexible
parameter restart schemes that cover many existing ones; 2) have a global
sub-linear convergence rate in nonconvex and nonsmooth optimization; and 3)
have guaranteed convergence to a critical point and have various types of
asymptotic convergence rates depending on the parameterization of local
geometry in nonconvex and nonsmooth optimization. Numerical experiments
demonstrate the effectiveness of our proposed algorithm.
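As a rough illustration of the algorithmic template the abstract describes, the following is a minimal sketch of an accelerated proximal gradient method with a simple function-value restart rule, applied to an l1-regularized nonconvex least-squares-type problem. The step size, momentum schedule, restart test, and all function names here are illustrative assumptions, not the exact APG-restart scheme or parameters from the paper.

    import numpy as np

    def soft_threshold(v, tau):
        # Proximal operator of tau * ||.||_1.
        return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

    def apg_with_restart(grad_f, prox_g, F, x0, step, max_iter=500):
        # Accelerated proximal gradient; momentum is reset whenever the
        # objective F increases (a common "function value" restart heuristic).
        x_prev, x, t = x0.copy(), x0.copy(), 1.0
        for _ in range(max_iter):
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0
            y = x + ((t - 1.0) / t_next) * (x - x_prev)   # momentum extrapolation
            x_new = prox_g(y - step * grad_f(y), step)    # proximal gradient step
            if F(x_new) > F(x):                           # restart: drop momentum
                t_next = 1.0
                x_new = prox_g(x - step * grad_f(x), step)
            x_prev, x, t = x, x_new, t_next
        return x

    # Toy instance: nonconvex smooth loss (robust log penalty) plus l1 regularizer.
    rng = np.random.default_rng(0)
    A, b, lam = rng.normal(size=(50, 20)), rng.normal(size=50), 0.1
    grad_f = lambda x: A.T @ (2 * (A @ x - b) / (1 + (A @ x - b) ** 2))
    F = lambda x: np.sum(np.log1p((A @ x - b) ** 2)) + lam * np.linalg.norm(x, 1)
    prox_g = lambda v, s: soft_threshold(v, lam * s)
    x_hat = apg_with_restart(grad_f, prox_g, F, np.zeros(20), step=1e-2)

Other restart rules (for example, fixed-interval or gradient-based restarts) can be dropped into the same template by changing the test inside the loop.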
Related papers
- A simple uniformly optimal method without line search for convex optimization [9.280355951055865]
We show that line search is superfluous in attaining the optimal rate of convergence for solving a convex optimization problem whose parameters are not given a priori.
We present a novel accelerated gradient descent type algorithm called AC-FGM that can achieve an optimal $\mathcal{O}(1/k^2)$ rate of convergence for smooth convex optimization.
arXiv Detail & Related papers (2023-10-16T05:26:03Z) - Parameter-free projected gradient descent [0.0]
We consider the problem of minimizing a convex function over a closed convex set with Projected Gradient Descent (PGD).
We propose a fully parameter-free version of AdaGrad, which is adaptive to the distance between the initialization and the optimum, and to the sum of the squared norms of the subgradients.
Our algorithm handles projection steps and does not involve restarts, reweighting along the trajectory, or additional evaluations compared to classical PGD.
arXiv Detail & Related papers (2023-05-31T07:22:44Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - A Fast and Convergent Proximal Algorithm for Regularized Nonconvex and
Nonsmooth Bi-level Optimization [26.68351521813062]
Existing bi-level algorithms cannot handle nonsmooth or hyper-smooth regularizers.
In this paper, we show that an approximate implicit differentiation (AID) scheme can be used to speed up a broad range of machine learning applications.
arXiv Detail & Related papers (2022-03-30T18:53:04Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new primal-dual approach is proposed with a novel integration of three ingredients: an entropy-regularized policy optimizer, a dual variable regularizer, and Nesterov's accelerated gradient descent on the dual.
This is the first demonstration that nonconcave CMDP problems can attain the $\mathcal{O}(1/\epsilon)$ complexity lower bound of convex optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep networks and other problems in data science, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope (recapped briefly after this list).
Our analysis works with a mini-batch size of $1$, constant first- and second-order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for nonconvex-nonconcave min-max problems.
Our experiments show that the predicted advantage of adaptive gradient algorithms over non-adaptive ones can be observed empirically in GAN training.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
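For reference, the $\tilde{\mathcal{O}}(t^{-1/4})$ rate quoted in the weakly convex entry above is measured by the gradient norm of the Moreau envelope, a standard near-stationarity measure for weakly convex functions; the definitions below are the textbook ones and are not taken verbatim from that paper's notation.

    % Moreau envelope of f with parameter \lambda > 0 (well defined for a
    % \rho-weakly convex f whenever \lambda < 1/\rho):
    f_\lambda(x) = \min_{y}\Big\{ f(y) + \tfrac{1}{2\lambda}\|y - x\|^2 \Big\},
    \qquad
    \mathrm{prox}_{\lambda f}(x) = \operatorname*{arg\,min}_{y}\Big\{ f(y) + \tfrac{1}{2\lambda}\|y - x\|^2 \Big\}.

    % Its gradient is computable from the proximal point and serves as the
    % stationarity measure: a small \|\nabla f_\lambda(x)\| certifies that x is
    % close to a point that is nearly stationary for f.
    \nabla f_\lambda(x) = \tfrac{1}{\lambda}\big(x - \mathrm{prox}_{\lambda f}(x)\big).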