Regret minimization in stochastic non-convex learning via a
proximal-gradient approach
- URL: http://arxiv.org/abs/2010.06250v1
- Date: Tue, 13 Oct 2020 09:22:21 GMT
- Title: Regret minimization in stochastic non-convex learning via a
proximal-gradient approach
- Authors: Nadav Hallak and Panayotis Mertikopoulos and Volkan Cevher
- Abstract summary: Motivated by applications in machine learning and operations research, we study regret minimization with stochastic first-order oracle feedback in online constrained non-convex problems.
We develop a new prox-grad method with complexity guarantees matching those obtained via variance reduction techniques.
- Score: 80.59047515124198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by applications in machine learning and operations research, we
study regret minimization with stochastic first-order oracle feedback in online
constrained, and possibly non-smooth, non-convex problems. In this setting, the
minimization of external regret is beyond reach for first-order methods, so we
focus on a local regret measure defined via a proximal-gradient mapping. To
achieve no (local) regret in this setting, we develop a prox-grad method based
on stochastic first-order feedback, and a simpler method for when access to a
perfect first-order oracle is possible. Both methods are min-max order-optimal,
and we also establish a bound on the number of prox-grad queries these methods
require. As an important application of our results, we also obtain a link
between online and offline non-convex stochastic optimization manifested as a
new prox-grad scheme with complexity guarantees matching those obtained via
variance reduction techniques.
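As a rough illustration of the object studied above (not the paper's algorithm; the box constraint, step size, and toy oracle below are illustrative assumptions), the following sketch runs a stochastic prox-grad loop and accumulates the squared norm of the proximal-gradient mapping, the residual from which a local regret measure of this kind is built.
```python
import numpy as np

def prox_box(v, lo=-1.0, hi=1.0):
    """Prox of the indicator of the box [lo, hi]^d, i.e. Euclidean projection.
    (Hypothetical constraint set, chosen only for illustration.)"""
    return np.clip(v, lo, hi)

def stochastic_prox_grad(grad_oracle, x0, gamma=0.1, T=1000, rng=None):
    """Minimal sketch of a stochastic proximal-gradient loop.

    grad_oracle(x, rng) returns a noisy estimate of the gradient of the
    smooth part of the objective at x (the stochastic first-order feedback).
    The local-regret proxy tracked here is the squared norm of the
    prox-gradient mapping G(x) = (x - prox(x - gamma * g)) / gamma,
    evaluated with the sampled gradient g.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    local_regret = 0.0
    for t in range(T):
        g = grad_oracle(x, rng)              # stochastic gradient sample
        x_next = prox_box(x - gamma * g)     # prox-grad step
        G = (x - x_next) / gamma             # prox-gradient mapping
        local_regret += float(np.dot(G, G))  # accumulate ||G||^2
        x = x_next
    return x, local_regret

# Toy oracle: noisy gradient of the (non-convex) coordinate-wise loss x^4 - x^2.
def toy_oracle(x, rng, sigma=0.1):
    true_grad = 4 * x**3 - 2 * x
    return true_grad + sigma * rng.standard_normal(x.shape)

x_final, reg = stochastic_prox_grad(toy_oracle, x0=np.full(5, 0.9))
```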
Related papers
- Shuffling Gradient-Based Methods for Nonconvex-Concave Minimax Optimization [20.093236438944718]
We develop novel shuffling gradient-based methods for tackling nonconvex-concave minimax problems.
We show that the new methods achieve results comparable to those of existing methods.
arXiv Detail & Related papers (2024-10-29T17:47:22Z) - Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions when the algorithm only has access to noisy evaluations of the objective function it queries.
We provide the first tight characterization of the minimax simple regret rate by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Adaptive multi-gradient methods for quasiconvex vector optimization and
applications to multi-task learning [1.03590082373586]
We present an adaptive step-size method, which does not use line-search techniques, for solving a wide class of quasiconvex multiobjective programming problems.
We prove convergence under modest assumptions, allowing for an unbounded constraint set.
We apply the proposed technique to several multi-task learning experiments to show its efficacy for large-scale problems.
arXiv Detail & Related papers (2024-02-09T07:20:14Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that the methods under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - Restarts subject to approximate sharpness: A parameter-free and optimal scheme for first-order methods [0.6554326244334866]
Sharpness is an assumption in continuous optimization that bounds the distance from minima by objective function suboptimality.
Sharpness involves problem-specific constants that are typically unknown, and restart schemes typically reduce convergence rates.
We consider the assumption of approximate sharpness, a generalization of sharpness that incorporates an unknown constant perturbation to the objective function error.
arXiv Detail & Related papers (2023-01-05T19:01:41Z) - Self-adaptive algorithms for quasiconvex programming and applications to
machine learning [0.0]
We provide a self-adaptive step-size strategy that does not require line-search techniques, together with a generic approach that converges under mild assumptions.
The proposed method is verified by preliminary results from some computational examples.
To demonstrate its effectiveness for large-scale problems, we apply the technique to some machine learning experiments.
arXiv Detail & Related papers (2022-12-13T05:30:29Z) - Smoothed Online Learning is as Easy as Statistical Learning [77.00766067963195]
We provide the first oracle-efficient, no-regret algorithms in this setting.
We show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits.
arXiv Detail & Related papers (2022-02-09T19:22:34Z) - Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization [70.4342220499858]
We introduce novel online algorithms that can exploit smoothness and replace the dependence on $T$ in dynamic regret with problem-dependent quantities.
Our results are adaptive to the intrinsic difficulty of the problem, since the bounds are tighter than existing results for easy problems and safeguard the same rate in the worst case.
arXiv Detail & Related papers (2021-12-29T02:42:59Z) - On Constraints in First-Order Optimization: A View from Non-Smooth
Dynamical Systems [99.59934203759754]
We introduce a class of first-order methods for smooth constrained optimization.
A distinctive feature of our approach is that projections or optimizations over the entire feasible set are avoided.
The resulting algorithmic procedure is simple to implement even when constraints are nonlinear.
arXiv Detail & Related papers (2021-07-17T11:45:13Z)