Stochastic Bias-Reduced Gradient Methods
- URL: http://arxiv.org/abs/2106.09481v1
- Date: Thu, 17 Jun 2021 13:33:05 GMT
- Title: Stochastic Bias-Reduced Gradient Methods
- Authors: Hilal Asi, Yair Carmon, Arun Jambulapati, Yujia Jin and Aaron Sidford
- Abstract summary: We develop a new primitive for stochastic optimization: a low-bias, low-cost estimator of the minimizer $x_\star$ of any Lipschitz strongly-convex function, which in turn yields nearly unbiased gradient estimators for the Moreau-Yoshida envelope.
- Score: 44.35885731095432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a new primitive for stochastic optimization: a low-bias, low-cost
estimator of the minimizer $x_\star$ of any Lipschitz strongly-convex function.
In particular, we use a multilevel Monte-Carlo approach due to Blanchet and
Glynn to turn any optimal stochastic gradient method into an estimator of
$x_\star$ with bias $\delta$, variance $O(\log(1/\delta))$, and an expected
sampling cost of $O(\log(1/\delta))$ stochastic gradient evaluations. As an
immediate consequence, we obtain cheap and nearly unbiased gradient estimators
for the Moreau-Yoshida envelope of any Lipschitz convex function, allowing us
to perform dimension-free randomized smoothing.
We demonstrate the potential of our estimator through four applications.
First, we develop a method for minimizing the maximum of $N$ functions,
improving on recent results and matching a lower bound up to logarithmic factors.
Second and third, we recover state-of-the-art rates for projection-efficient
and gradient-efficient optimization using simple algorithms with a transparent
analysis. Finally, we show that an improved version of our estimator would
yield a nearly linear-time, optimal-utility, differentially-private non-smooth
stochastic optimization method.
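Below is a minimal Python sketch of the multilevel Monte-Carlo (Blanchet-Glynn style) bias-reduction idea described in the abstract: an SGD run of random length, with a coupled level-difference correction, gives an estimate of $x_\star$ whose bias matches a long run but whose expected cost is only logarithmic. The stepsize rule, level distribution, base level, and strong-convexity constant `mu` are illustrative assumptions rather than the paper's exact choices; the final helper illustrates how such an estimator would translate into a Moreau-Yoshida envelope gradient estimate via $\nabla f_\lambda(x) = (x - \mathrm{prox}_{\lambda f}(x))/\lambda$.

```python
import numpy as np

def sgd_pair(grad_oracle, x0, T, mu, rng):
    """Run T steps of SGD with the classic 1/(mu*t) stepsize and return the
    iterates after T//2 steps and after T steps of the *same* trajectory
    (the coupling keeps the level difference small)."""
    x = np.asarray(x0, dtype=float).copy()
    x_half = x.copy()
    for t in range(1, T + 1):
        x = x - (1.0 / (mu * t)) * grad_oracle(x, rng)
        if t == max(T // 2, 1):
            x_half = x.copy()
    return x_half, x

def bias_reduced_minimizer(grad_oracle, x0, delta, mu, rng):
    """Blanchet--Glynn style multilevel estimate of the minimizer x_star:
    bias comparable to a ~1/delta-step run, at O(log(1/delta)) expected
    gradient evaluations and O(log(1/delta)) variance."""
    j_max = max(int(np.ceil(np.log2(1.0 / delta))), 1)
    levels = np.arange(1, j_max + 1)
    probs = 2.0 ** (-levels.astype(float))
    probs /= probs.sum()                        # P(J = j) proportional to 2^{-j}
    J = int(rng.choice(levels, p=probs))
    p_J = probs[J - 1]
    _, y_base = sgd_pair(grad_oracle, x0, 1, mu, rng)           # cheap level-0 run
    y_prev, y_cur = sgd_pair(grad_oracle, x0, 2 ** J, mu, rng)  # coupled level pair
    # Telescoping: E[output] equals the expectation of a 2^{j_max}-step iterate.
    return y_base + (y_cur - y_prev) / p_J

def moreau_grad_estimate(grad_f, x, lam, delta, rng):
    """Gradient estimate of the Moreau-Yoshida envelope f_lambda at x:
    estimate prox_{lam f}(x) = argmin_y f(y) + ||y - x||^2 / (2*lam), a
    (1/lam)-strongly-convex problem, then return (x - prox) / lam."""
    prox_grad = lambda y, r: grad_f(y, r) + (y - x) / lam
    prox_est = bias_reduced_minimizer(prox_grad, x, delta, 1.0 / lam, rng)
    return (x - prox_est) / lam
```

Averaging independent calls to `bias_reduced_minimizer` reduces the $O(\log(1/\delta))$ variance while leaving the (small) bias unchanged, which is the property the four applications listed in the abstract rely on.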
Related papers
- Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity [50.25258834153574]
We focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods.
In particular, we derive improved convergence rates for Gradient Descent with Smoothed Gradient Clipping and for Gradient Descent with Polyak Stepsizes (a generic sketch of both update rules appears after this list).
arXiv Detail & Related papers (2024-09-23T13:11:37Z) - Differential Private Stochastic Optimization with Heavy-tailed Data: Towards Optimal Rates [15.27596975662702]
We explore algorithms achieving optimal rates of DP optimization with heavy-tailed gradients.
Our results match the minimax lower bound of Kamath et al. (2022), indicating that the theoretical limit of convex optimization under DP is achievable.
arXiv Detail & Related papers (2024-08-19T11:07:05Z) - Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions [19.008521454738425]
We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a $k^{\text{th}}$-moment bound on the Lipschitz constants of sample functions rather than a uniform bound.
We propose a new reduction-based approach that enables us to obtain the first optimal rates (up to logarithmic factors) in the heavy-tailed setting, achieving error $G \cdot \frac{1}{\sqrt{n}} + G_k \cdot \left(\frac{\sqrt{d}}{n\epsilon}\right)^{1-\frac{1}{k}}$.
arXiv Detail & Related papers (2024-06-04T21:26:29Z) - Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Variance-reduced Clipping for Non-convex Optimization [24.765794811146144]
Gradient clipping is a technique used in deep learning applications such as large-scale language modeling.
Recent experimental studies suggest that training losses exhibit a fairly special smoothness behavior, which variance-reduced clipping exploits to improve the order of the gradient complexity.
arXiv Detail & Related papers (2023-03-02T00:57:38Z) - Estimating the minimizer and the minimum value of a regression function
under passive design [72.85024381807466]
We propose a new method for estimating the minimizer $\boldsymbol{x}^*$ and the minimum value $f^*$ of a smooth and strongly convex regression function $f$.
We derive non-asymptotic upper bounds for the quadratic risk and optimization error of $\boldsymbol{z}_n$, and for the risk of estimating $f^*$.
arXiv Detail & Related papers (2022-11-29T18:38:40Z) - Efficient Minimax Optimal Global Optimization of Lipschitz Continuous
Multivariate Functions [0.0]
We show that our algorithm achieves an average regret bound of $O(L\sqrt{n}\,T^{-\frac{1}{n}})$ over the time horizon $T$ for Lipschitz continuous functions.
arXiv Detail & Related papers (2022-06-06T06:28:38Z) - A Variance Controlled Stochastic Method with Biased Estimation for
Faster Non-convex Optimization [0.0]
We propose a new technique, variance controlled stochastic gradient (VCSG), to improve the performance of the stochastic variance-reduced gradient (SVRG) method.
A parameter $\lambda$ is introduced in VCSG to avoid over-reducing the variance by SVRG.
The method requires $\mathcal{O}(\min\{1/\epsilon^{3/2},\, n^{1/4}/\epsilon\})$ gradient evaluations, which improves the leading gradient complexity.
arXiv Detail & Related papers (2021-02-19T12:22:56Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we study gradient-free (zeroth-order) minimax optimization, where only function-value queries are available.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for non-convex non-concave min-max problems.
Our experiments show that the advantage of adaptive gradient algorithms over non-adaptive ones in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
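As a concrete illustration of the two methods named in the first related paper above (gradient descent with gradient clipping and gradient descent with Polyak stepsizes), here is a minimal sketch. The plain norm-clipping rule, the fixed learning rate, and the assumption that the optimal value `f_star` is known are illustrative choices and not taken from that paper, which analyzes a smoothed clipping variant.

```python
import numpy as np

def gd_clipped(grad_f, x0, lr, clip_c, n_iters):
    """GD with a generic norm-clipping rule, g <- g * min(1, c/||g||).
    (The referenced paper analyzes a smoothed clipping variant; this is
    the plain version, for illustration only.)"""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        g = grad_f(x)
        norm = np.linalg.norm(g)
        if norm > clip_c:
            g = g * (clip_c / norm)
        x = x - lr * g
    return x

def gd_polyak(f, grad_f, x0, f_star, n_iters):
    """GD with the Polyak stepsize eta_t = (f(x_t) - f_star) / ||g_t||^2,
    which assumes the optimal value f_star is known."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        g = grad_f(x)
        sq = float(g @ g)
        if sq == 0.0:
            break  # stationary point reached
        x = x - ((f(x) - f_star) / sq) * g
    return x
```

The Polyak stepsize adapts to the local geometry but requires knowledge of $f^\star$, while clipping keeps individual steps bounded when gradient norms can be large, which is what makes it natural for $(L_0,L_1)$-smooth objectives.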