Accelerated, Optimal, and Parallel: Some Results on Model-Based
Stochastic Optimization
- URL: http://arxiv.org/abs/2101.02696v1
- Date: Thu, 7 Jan 2021 18:58:39 GMT
- Title: Accelerated, Optimal, and Parallel: Some Results on Model-Based
Stochastic Optimization
- Authors: Karan Chadha, Gary Cheng, John C. Duchi
- Abstract summary: We extend the Approximate-Proximal Point (aProx) family of model-based methods for solving stochastic convex optimization problems to the minibatch and accelerated setting.
We propose an acceleration scheme with non-asymptotic convergence guarantees that provides linear speedup in minibatch size.
We show improved convergence rates and matching lower bounds identifying new fundamental constants for "interpolation" problems.
- Score: 33.71051480619541
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We extend the Approximate-Proximal Point (aProx) family of model-based
methods for solving stochastic convex optimization problems, including
stochastic subgradient, proximal point, and bundle methods, to the minibatch
and accelerated setting. To do so, we propose specific model-based algorithms
and an acceleration scheme for which we provide non-asymptotic convergence
guarantees, which are order-optimal in all problem-dependent constants and
provide linear speedup in minibatch size, while maintaining the desirable
robustness traits (e.g. to stepsize) of the aProx family. Additionally, we show
improved convergence rates and matching lower bounds identifying new
fundamental constants for "interpolation" problems, whose importance in
statistical machine learning is growing; this, for example, gives a
parallelization strategy for alternating projections. We corroborate our
theoretical results with empirical testing to demonstrate the gains accurate
modeling, acceleration, and minibatching provide.
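As a rough illustration of what a model-based (aProx-style) update looks like, the sketch below applies the truncated model commonly described in the aProx literature to an "interpolation" problem: a consistent linear system with per-sample loss |<a, x> - b|. It is a single-sample sketch under stated assumptions, not the authors' minibatch or accelerated algorithms. The names `truncated_step` and `run`, the choice of loss, and the uniform row sampling are all hypothetical choices made for this example.

```python
import random


def truncated_step(x, a, b, alpha):
    """One step with the truncated model for f(x) = |<a, x> - b|.

    The model max(f(x) + <g, y - x>, 0) uses the known lower bound 0 on the
    loss, which caps the step length at f(x) / ||g||^2.
    Assumes `a` is not the zero vector (hypothetical precondition).
    """
    r = sum(ai * xi for ai, xi in zip(a, x)) - b  # residual of this sample
    if r == 0.0:
        return list(x)  # sample already fit exactly; no movement needed
    sign = 1.0 if r > 0 else -1.0                 # subgradient is g = sign * a
    norm_sq = sum(ai * ai for ai in a)            # ||g||^2
    step = min(alpha, abs(r) / norm_sq)           # truncation of the stepsize
    return [xi - step * sign * ai for xi, ai in zip(x, a)]


def run(rows, rhs, alpha, iters, seed=0):
    """Apply truncated steps to uniformly sampled rows, starting from zero."""
    rng = random.Random(seed)
    x = [0.0] * len(rows[0])
    for _ in range(iters):
        i = rng.randrange(len(rows))
        x = truncated_step(x, rows[i], rhs[i], alpha)
    return x


if __name__ == "__main__":
    # Consistent system with solution (1, 2); the stepsize is deliberately huge.
    rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    rhs = [1.0, 2.0, 3.0]
    print(run(rows, rhs, alpha=1e6, iters=500))
```

With a very large `alpha`, each step reduces to a projection onto the sampled hyperplane, which is one way to see the connection to projection methods and the robustness to stepsize mentioned in the abstract. A plain subgradient step with the same stepsize would overshoot by many orders of magnitude.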
Related papers
- Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust-Region Sequential Quadratic Programming method to find both first- and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration defined by minimizing an approximation of the objective subject.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix.
arXiv Detail & Related papers (2024-09-24T04:39:47Z)
- Enhancing Gaussian Process Surrogates for Optimization and Posterior Approximation via Random Exploration [2.984929040246293]
This paper introduces novel noise-free Bayesian optimization strategies that rely on a random exploration step to enhance the accuracy of Gaussian process surrogate models.
The new algorithms retain the ease of implementation of the classical GP-UCB, but an additional exploration step facilitates their convergence.
arXiv Detail & Related papers (2024-01-30T14:16:06Z)
- Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z)
- Breaking the Convergence Barrier: Optimization via Fixed-Time Convergent Flows [4.817429789586127]
We introduce a Poly-based optimization framework for achieving acceleration, based on the notion of fixed-time stability of dynamical systems.
We validate the accelerated convergence properties of the proposed schemes on a range of numerical examples against the state-of-the-art optimization algorithms.
arXiv Detail & Related papers (2021-12-02T16:04:40Z)
- On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the Stochastic ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z)
- High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
- Minibatch and Momentum Model-based Methods for Stochastic Non-smooth Non-convex Optimization [3.4809730725241597]
We make two important extensions to model-based methods.
First, we propose a new minibatch scheme which takes a set of samples to approximate the model function in each iteration.
Second, motivated by the success of momentum techniques, we propose a new convex-based model.
arXiv Detail & Related papers (2021-06-06T05:31:57Z)
- Acceleration Methods [57.202881673406324]
We first use quadratic optimization problems to introduce two key families of acceleration methods.
We discuss momentum methods in detail, starting with the seminal work of Nesterov.
We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates.
arXiv Detail & Related papers (2021-01-23T17:58:25Z)
- A Feasible Level Proximal Point Method for Nonconvex Sparse Constrained Optimization [25.73397307080647]
We present a new model with general convex or nonconvex machine learning objectives.
We propose a proximal point algorithm that solves a sequence of subproblems with gradually relaxed constraint levels.
We demonstrate the effectiveness of our new model on large-scale problems via numerical experiments.
arXiv Detail & Related papers (2020-10-23T05:24:05Z)
- Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that attenuating step-size is required for exact convergence with the fact that constant step-size learns faster in time up to an error.
Rather than fixing the minibatch size and the step-size at the outset, we propose to allow these parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z)
- Geometry, Computation, and Optimality in Stochastic Optimization [24.154336772159745]
We study computational and statistical consequences of problem geometry in stochastic and online optimization.
By focusing on constraint set and gradient geometry, we characterize the problem families for which stochastic- and adaptive-gradient methods are (minimax) optimal.
arXiv Detail & Related papers (2019-09-23T16:14:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.