Nearly Optimal Robust Method for Convex Compositional Problems with
Heavy-Tailed Noise
- URL: http://arxiv.org/abs/2006.10095v1
- Date: Wed, 17 Jun 2020 18:36:05 GMT
- Title: Nearly Optimal Robust Method for Convex Compositional Problems with
Heavy-Tailed Noise
- Authors: Yan Yan and Xin Man and Tianbao Yang
- Abstract summary: We propose robust algorithms for solving convex compositional problems of the form $f(\mathbb{E}_\xi g(\cdot; \xi)) + r(\cdot)$.
To the best of our knowledge, this is the first work that establishes nearly optimal sub-Gaussian confidence bounds for compositional problems under heavy-tailed assumptions.
- Score: 46.94819585976063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose robust stochastic algorithms for solving convex
compositional problems of the form $f(\mathbb{E}_\xi g(\cdot; \xi)) + r(\cdot)$ by
establishing {\bf sub-Gaussian confidence bounds} under weak assumptions about
the tails of the noise distribution, i.e., {\bf heavy-tailed noise} with bounded
second-order moments. One can achieve this goal by using an existing boosting
strategy that boosts a low probability convergence result into a high
probability result. However, piecing together existing results for solving
compositional problems suffers from several drawbacks: (i) the boosting
technique requires strong convexity of the objective; (ii) it requires a
separate algorithm to handle non-smooth $r$; (iii) it also suffers from an
additional polylogarithmic factor of the condition number. To address these
issues, we directly develop a single-trial stochastic algorithm for minimizing
optimal strongly convex compositional objectives, which has a nearly optimal
high probability convergence result matching the lower bound of stochastic
strongly convex optimization up to a logarithmic factor. To the best of our
knowledge, this is the first work that establishes nearly optimal sub-Gaussian
confidence bounds for compositional problems under heavy-tailed assumptions.
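To make the problem template concrete, here is a minimal Python sketch of a compositional objective of the above form and of the naive plug-in estimator whose bias motivates specialized compositional algorithms. The quadratic $f$, noisy linear $g$, and $\ell_1$ regularizer $r$ are illustrative assumptions, not the concrete setting or algorithm of the paper.

```python
import numpy as np

# Toy compositional objective  F(w) = f( E_xi[ g(w; xi) ] ) + r(w):
# f convex and smooth, g a stochastic inner map, r non-smooth.
# The concrete choices below (quadratic f, noisy linear g, l1-norm r)
# are illustrative assumptions, not the setting analyzed in the paper.
rng = np.random.default_rng(0)
d, m = 5, 3                      # dimension of w and of g's output
A = rng.standard_normal((m, d))  # fixed linear map inside g
lam = 0.1                        # regularization weight

def g(w, xi):
    # Unbiased stochastic inner map: E_xi[g(w; xi)] = A @ w.
    # Heavy-tailed noise would mean xi has a bounded second moment
    # but no sub-Gaussian (light-tail) control.
    return A @ w + xi

def f(u):
    return 0.5 * float(u @ u)             # convex outer function

def r(w):
    return lam * float(np.abs(w).sum())   # non-smooth regularizer

def F(w):
    # True objective: the inner expectation equals A @ w here.
    return f(A @ w) + r(w)

def plug_in(w, n):
    # Naive estimate f( (1/n) * sum_i g(w; xi_i) ) + r(w).
    # Since f is nonlinear, this is biased for finite n -- one reason
    # compositional problems need specialized stochastic algorithms.
    xis = rng.standard_normal((n, m))
    return f(np.mean([g(w, xi) for xi in xis], axis=0)) + r(w)

w = rng.standard_normal(d)
print("F(w):                ", F(w))
print("plug-in estimate n=5:", plug_in(w, 5))
```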
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions when the algorithm only has access to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider stochastic optimization problems with heavy-tailed noise with structured density.
We show that it is possible to get faster rates of convergence than $\mathcal{O}(K^{-(\alpha - 1)/\alpha})$ when the gradients have finite moments of order $\alpha$.
We prove that the resulting estimates have negligible bias and controllable variance.
arXiv Detail & Related papers (2023-11-07T17:39:17Z) - High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise [96.80184504268593]
Gradient clipping is one of the key algorithmic ingredients for deriving good high-probability guarantees under heavy-tailed noise.
However, if implemented naively, clipping can spoil the convergence of popular methods for composite and distributed optimization (a minimal sketch of the clipping operation appears after this list).
arXiv Detail & Related papers (2023-10-03T07:49:17Z) - Stochastic Nested Compositional Bi-level Optimization for Robust Feature
Learning [11.236838268731804]
We develop and analyze algorithms for solving nested bi-level optimization problems.
Our proposed algorithm does not rely on matrix inversions or mini-batches.
arXiv Detail & Related papers (2023-07-11T15:52:04Z) - High-Probability Bounds for Stochastic Optimization and Variational
Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z) - Efficient First-order Methods for Convex Optimization with Strongly
Convex Function Constraints [3.667453772837954]
We show how to minimize a convex function subject to strongly convex function constraints.
We identify the sparsity pattern within a finite number of iterations, a result that appears to have independent significance.
arXiv Detail & Related papers (2022-12-21T16:04:53Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with an unfavorable dependence on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - The Strength of Nesterov's Extrapolation in the Individual Convergence
of Nonsmooth Optimization [0.0]
We prove that Nesterov's extrapolation has the strength to make the individual convergence of gradient descent methods optimal for nonsmooth problems.
We give an extension of the derived algorithms to solve regularized learning tasks with nonsmooth losses in stochastic settings.
Our method is applicable as an efficient tool for solving large-scale $\ell_1$-regularized hinge-loss learning problems.
arXiv Detail & Related papers (2020-06-08T03:35:41Z)
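Several of the related papers above obtain high-probability guarantees under heavy-tailed noise via gradient clipping. As a hedged illustration, the sketch below shows only the generic clipping operation itself; the names `stochastic_grad`, `prox_r`, `step`, and `threshold` in the usage comment are hypothetical placeholders, and no paper's specific stepsize or threshold schedule is reproduced.

```python
import numpy as np

def clip_gradient(grad, threshold):
    # Norm clipping: rescale grad so its Euclidean norm is at most
    # `threshold`. This is the generic operation the papers above refer
    # to as (gradient) clipping; the stepsize and threshold schedules
    # that yield sub-Gaussian high-probability bounds are
    # method-specific and are not reproduced here.
    norm = np.linalg.norm(grad)
    return grad if norm <= threshold else grad * (threshold / norm)

# Hypothetical usage inside a plain proximal SGD step
# (illustrative only; not any particular paper's update rule):
#   w = prox_r(w - step * clip_gradient(stochastic_grad(w), threshold))
```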