Optimal Rates for Random Order Online Optimization
- URL: http://arxiv.org/abs/2106.15207v1
- Date: Tue, 29 Jun 2021 09:48:46 GMT
- Title: Optimal Rates for Random Order Online Optimization
- Authors: Uri Sherman, Tomer Koren, Yishay Mansour
- Abstract summary: We study the citetgarber 2020online, where the loss functions may be chosen by an adversary, but are then presented online in a uniformly random order.
We show that citetgarber 2020online algorithms achieve the optimal bounds and significantly improve their stability.
- Score: 60.011653053877126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study online convex optimization in the random order model, recently
proposed by \citet{garber2020online}, where the loss functions may be chosen by
an adversary, but are then presented to the online algorithm in a uniformly
random order. Focusing on the scenario where the cumulative loss function is
(strongly) convex, yet individual loss functions are smooth but might be
non-convex, we give algorithms that achieve the optimal bounds and
significantly outperform the results of \citet{garber2020online}, completely
removing the dimension dependence and improving their scaling with respect to
the strong convexity parameter. Our analysis relies on novel connections
between algorithmic stability and generalization for sampling
without-replacement analogous to those studied in the with-replacement
i.i.d.~setting, as well as on a refined average stability analysis of
stochastic gradient descent.
Related papers
- Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
gradient-variation online learning aims to achieve regret guarantees that scale with variations in gradients of online functions.
Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms.
We provide the applications for fast-rate convergence in games and extended adversarial optimization.
arXiv Detail & Related papers (2024-08-17T02:22:08Z) - Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm is only accessible to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Random Scaling and Momentum for Non-smooth Non-convex Optimization [38.443430569753026]
Training neural networks requires a loss function that may be highly irregular, and in particular neither convex nor smooth.
Popular training algorithms are based on gradient descent with momentum (SGDM), for which analysis applies only if the loss is either convex or smooth.
arXiv Detail & Related papers (2024-05-16T00:52:03Z) - Fully Stochastic Trust-Region Sequential Quadratic Programming for
Equality-Constrained Optimization Problems [62.83783246648714]
We propose a sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with objectives and deterministic equality constraints.
The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
arXiv Detail & Related papers (2022-11-29T05:52:17Z) - Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives [1.7640556247739623]
We analyze algorithms for zeroth-order entropy composite objectives, focusing on dependence on dimensionality.
This is achieved by exploiting low dimensional structure of the decision set using the mirror descent method with an estimation alike function.
To improve the gradient, we replace the classic sampling method based on Rademacher and show that the mini-batch method copes with non-Eucli geometry.
arXiv Detail & Related papers (2022-08-09T07:36:25Z) - A Gradient Smoothed Functional Algorithm with Truncated Cauchy Random
Perturbations for Stochastic Optimization [10.820943271350442]
We present a convex gradient algorithm for minimizing a smooth objective function that is an expectation over noisy cost samples.
We also show that our algorithm avoids the ratelibria, implying convergence to local minima.
arXiv Detail & Related papers (2022-07-30T18:50:36Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Non-asymptotic bounds for stochastic optimization with biased noisy
gradient oracles [8.655294504286635]
We introduce biased gradient oracles to capture a setting where the function measurements have an estimation error.
Our proposed oracles are in practical contexts, for instance, risk measure estimation from a batch of independent and identically distributed simulation.
arXiv Detail & Related papers (2020-02-26T12:53:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.