Practical Schemes for Finding Near-Stationary Points of Convex
Finite-Sums
- URL: http://arxiv.org/abs/2105.12062v1
- Date: Tue, 25 May 2021 16:46:35 GMT
- Title: Practical Schemes for Finding Near-Stationary Points of Convex
Finite-Sums
- Authors: Kaiwen Zhou, Lai Tian, Anthony Man-Cho So, James Cheng
- Abstract summary: We conduct a systematic study of the algorithmic techniques in finding near-stationary points of convex finite-sums.
Our main contributions are several algorithmic discoveries.
We put an emphasis on the simplicity and practicality of the new schemes, which could facilitate future developments.
- Score: 45.91933657088324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of finding near-stationary points in convex optimization has not
been adequately studied yet, unlike other optimality measures such as
minimizing function value. Even in the deterministic case, the optimal method
(OGM-G, due to Kim and Fessler (2021)) has just been discovered recently. In
this work, we conduct a systematic study of the algorithmic techniques in
finding near-stationary points of convex finite-sums. Our main contributions
are several algorithmic discoveries: (1) we discover a memory-saving variant of
OGM-G based on the performance estimation problem approach (Drori and Teboulle,
2014); (2) we design a new accelerated SVRG variant that can simultaneously
achieve fast rates for both minimizing gradient norm and function value; (3) we
propose an adaptively regularized accelerated SVRG variant, which does not
require the knowledge of some unknown initial constants and achieves
near-optimal complexities. We put an emphasis on the simplicity and
practicality of the new schemes, which could facilitate future developments.
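For context, the accelerated schemes in items (2) and (3) build on SVRG-style variance reduction. The snippet below is a minimal sketch of the classical SVRG gradient estimator only, not the accelerated or adaptively regularized variants proposed in the paper; the function names, signature, and the least-squares test problem are assumptions made for this illustration.

```python
# Minimal sketch of the classical SVRG gradient estimator (illustrative only;
# NOT the paper's accelerated or adaptively regularized schemes).
import numpy as np

def svrg(x0, grad_i, n, step_size, epochs, inner_iters, seed=0):
    """Plain SVRG on f(x) = (1/n) * sum_i f_i(x); grad_i(x, i) returns grad f_i(x)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        snapshot = x.copy()
        # Full-gradient snapshot, recomputed once per epoch.
        full_grad = sum(grad_i(snapshot, i) for i in range(n)) / n
        for _ in range(inner_iters):
            i = int(rng.integers(n))
            # Variance-reduced estimator: unbiased for the full gradient at x,
            # with variance shrinking as x approaches the snapshot.
            g = grad_i(x, i) - grad_i(snapshot, i) + full_grad
            x = x - step_size * g
    return x

# Tiny usage example on a least-squares finite-sum: f_i(x) = 0.5 * (a_i^T x - b_i)^2.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((100, 5)), rng.standard_normal(100)
grad_fi = lambda x, i: (A[i] @ x - b[i]) * A[i]
x_hat = svrg(np.zeros(5), grad_fi, n=100, step_size=0.02, epochs=30, inner_iters=200)
print(np.linalg.norm(A.T @ (A @ x_hat - b) / 100))  # gradient norm at the output
```

The key property of the estimator g is that it is unbiased for the full gradient while its variance shrinks as the iterate approaches the snapshot; this is the variance-reduction mechanism that the accelerated and adaptively regularized variants described in the abstract build on.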
Related papers
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO).
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization problem, where the inner loss function becomes a smooth probability distribution and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - Versatile Single-Loop Method for Gradient Estimator: First and Second
Order Optimality, and its Application to Federated Learning [45.78238792836363]
We present a single-loop algorithm named SLEDGE (Single-Loop-E Gradient Estimator) for periodic convergence.
Unlike existing methods, SLEDGE has the advantage of versatility: (i) second-order optimality, (ii) exponential convergence in the PL region, and (iii) smaller complexity under low heterogeneity of data.
arXiv Detail & Related papers (2022-09-01T11:05:26Z) - Formal guarantees for heuristic optimization algorithms used in machine
learning [6.978625807687497]
Stochastic Gradient Descent (SGD) and its variants have become the dominant methods in large-scale machine learning (ML) optimization problems.
We provide formal guarantees for a few convex optimization methods and propose improved algorithms.
arXiv Detail & Related papers (2022-07-31T19:41:22Z) - Continuation Newton methods with deflation techniques for global
optimization problems [3.705839280172101]
A global minimum point of an optimization problem is of interest in engineering.
In this article, we consider a new memetic algorithm for this nonlinear large-scale problem.
According to our numerical experiments, the new algorithm works well for unconstrained problems.
arXiv Detail & Related papers (2021-07-29T09:53:49Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep networks and other problems in data science, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave
Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to their applications in deep learning (e.g., deep AUC maximization).
arXiv Detail & Related papers (2020-06-12T00:32:21Z) - Second-Order Guarantees in Centralized, Federated and Decentralized
Nonconvex Optimization [64.26238893241322]
Simple algorithms have been shown to lead to good empirical results in many contexts.
Several works have pursued rigorous analytical justification for studying non-convex optimization problems.
A key insight in these analyses is that perturbations play a critical role in allowing local descent algorithms to escape saddle points.
arXiv Detail & Related papers (2020-03-31T16:54:22Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper, we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for non-convex non-concave min-max problems.
Our experiments show that the advantage of adaptive gradient algorithms over non-adaptive ones in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.