Minibatch and Momentum Model-based Methods for Stochastic Non-smooth
Non-convex Optimization
- URL: http://arxiv.org/abs/2106.03034v1
- Date: Sun, 6 Jun 2021 05:31:57 GMT
- Title: Minibatch and Momentum Model-based Methods for Stochastic Non-smooth
Non-convex Optimization
- Authors: Qi Deng and Wenzhi Gao
- Abstract summary: We make two important extensions to model-based methods.
First, we propose a new minibatch which takes a set of samples to approximate the model function in each iteration.
Second, by the success of momentum techniques we propose a new convex-based model.
- Score: 3.4809730725241597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic model-based methods have received increasing attention lately due
to their appealing robustness to the stepsize selection and provable efficiency
guarantee for non-smooth non-convex optimization. To further improve the
performance of stochastic model-based methods, we make two important
extensions. First, we propose a new minibatch algorithm which takes a set of
samples to approximate the model function in each iteration. For the first
time, we show that stochastic algorithms achieve linear speedup over the batch
size even for non-smooth and non-convex problems. To this end, we develop a
novel sensitivity analysis of the proximal mapping involved in each algorithm
iteration. Our analysis can be of independent interests in more general
settings. Second, motivated by the success of momentum techniques for convex
optimization, we propose a new stochastic extrapolated model-based method to
possibly improve the convergence in the non-smooth and non-convex setting. We
obtain complexity guarantees for a fairly flexible range of extrapolation term.
In addition, we conduct experiments to show the empirical advantage of our
proposed methods.
Related papers
- The Stochastic Conjugate Subgradient Algorithm For Kernel Support Vector Machines [1.738375118265695]
This paper proposes an innovative method specifically designed for kernel support vector machines (SVMs)
It not only achieves faster iteration per iteration but also exhibits enhanced convergence when compared to conventional SFO techniques.
Our experimental results demonstrate that the proposed algorithm not only maintains but potentially exceeds the scalability of SFO methods.
arXiv Detail & Related papers (2024-07-30T17:03:19Z) - Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm is only accessible to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Enhancing Gaussian Process Surrogates for Optimization and Posterior Approximation via Random Exploration [2.984929040246293]
novel noise-free Bayesian optimization strategies that rely on a random exploration step to enhance the accuracy of Gaussian process surrogate models.
New algorithms retain the ease of implementation of the classical GP-UCB, but an additional exploration step facilitates their convergence.
arXiv Detail & Related papers (2024-01-30T14:16:06Z) - Adaptive Sampling Quasi-Newton Methods for Zeroth-Order Stochastic
Optimization [1.7513645771137178]
We consider unconstrained optimization problems with no available gradient information.
We propose an adaptive sampling quasi-Newton method where we estimate the gradients of a simulation function using finite differences within a common random number framework.
We develop modified versions of a norm test and an inner product quasi-Newton test to control the sample sizes used in the approximations and provide global convergence results to the neighborhood of the optimal solution.
arXiv Detail & Related papers (2021-09-24T21:49:25Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration of $O (1/epsilon4)$ for finding an $epsilon$stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O (1/epsilon4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - On Stochastic Moving-Average Estimators for Non-Convex Optimization [105.22760323075008]
In this paper, we demonstrate the power of a widely used estimator based on moving average (SEMA) problems.
For all these-the-art results, we also present the results for all these-the-art problems.
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - Accelerated, Optimal, and Parallel: Some Results on Model-Based
Stochastic Optimization [33.71051480619541]
We extend the Approximate-Proximal Point (aProx) family of model-based methods for solving convex optimization problems.
We provide non-asymptotic convergence guarantees and an acceleration scheme for which we provide linear speedup in minibatch size.
We show improved convergence rates and matching lower bounds identifying new fundamental constants for "interpolation" problems.
arXiv Detail & Related papers (2021-01-07T18:58:39Z) - Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.