Optimal Guarantees for Algorithmic Reproducibility and Gradient
Complexity in Convex Optimization
- URL: http://arxiv.org/abs/2310.17759v2
- Date: Tue, 9 Jan 2024 19:26:33 GMT
- Title: Optimal Guarantees for Algorithmic Reproducibility and Gradient
Complexity in Convex Optimization
- Authors: Liang Zhang, Junchi Yang, Amin Karbasi, Niao He
- Abstract summary: Previous work suggests that first-order methods would need to trade-off convergence rate (gradient convergence rate) for better.
We demonstrate that both optimal complexity and near-optimal convergence guarantees can be achieved for smooth convex minimization and smooth convex-concave minimax problems.
- Score: 55.115992622028685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Algorithmic reproducibility measures the deviation in outputs of machine
learning algorithms upon minor changes in the training process. Previous work
suggests that first-order methods would need to trade-off convergence rate
(gradient complexity) for better reproducibility. In this work, we challenge
this perception and demonstrate that both optimal reproducibility and
near-optimal convergence guarantees can be achieved for smooth convex
minimization and smooth convex-concave minimax problems under various
error-prone oracle settings. Particularly, given the inexact initialization
oracle, our regularization-based algorithms achieve the best of both worlds -
optimal reproducibility and near-optimal gradient complexity - for minimization
and minimax optimization. With the inexact gradient oracle, the near-optimal
guarantees also hold for minimax optimization. Additionally, with the
stochastic gradient oracle, we show that stochastic gradient descent ascent is
optimal in terms of both reproducibility and gradient complexity. We believe
our results contribute to an enhanced understanding of the
reproducibility-convergence trade-off in the context of convex optimization.
Related papers
- Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
gradient-variation online learning aims to achieve regret guarantees that scale with variations in gradients of online functions.
Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms.
We provide the applications for fast-rate convergence in games and extended adversarial optimization.
arXiv Detail & Related papers (2024-08-17T02:22:08Z) - Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm is only accessible to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Versatile Single-Loop Method for Gradient Estimator: First and Second
Order Optimality, and its Application to Federated Learning [45.78238792836363]
We present a single-loop algorithm named SLEDGE (Single-Loop-E Gradient Estimator) for periodic convergence.
Unlike existing methods, SLEDGE has the advantage of versatility; (ii) second-order optimal, (ii) in the PL region, and (iii) smaller complexity under less of data.
arXiv Detail & Related papers (2022-09-01T11:05:26Z) - Near-Optimal Algorithms for Making the Gradient Small in Stochastic
Minimax Optimization [14.719077076351377]
We study the problem of finding a near-stationary point for smooth minimax optimization.
We show that a novel algorithm called Recursive IteratioNRAIN achieves both convex-concave and strongly-concave cases.
arXiv Detail & Related papers (2022-08-11T16:55:26Z) - Projection-Free Adaptive Gradients for Large-Scale Optimization [22.0439695290991]
Frank-Wolfe algorithms occupy a unique position as they alleviate both computational burdens by querying only approximate first-order information from the objective.
We show that our method can improve the performance of adaptive algorithms for constrained optimization.
arXiv Detail & Related papers (2020-09-29T15:56:12Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z) - Stochastic Coordinate Minimization with Progressive Precision for
Stochastic Convex Optimization [16.0251555430107]
A framework based on iterative coordinate minimization (CM) is developed for convex optimization.
We establish the optimal precision control and the resulting order-optimal regret performance.
The proposed algorithm is amenable to online implementation and inherits the scalability and parallelizability properties of CM for large-scale optimization.
arXiv Detail & Related papers (2020-03-11T18:42:40Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of OptimisticOA algorithm for nonconcave minmax problems.
Our experiments show that adaptive GAN non-adaptive gradient algorithms can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.