Faster Projection-free Online Learning
- URL: http://arxiv.org/abs/2001.11568v2
- Date: Fri, 14 Feb 2020 17:36:31 GMT
- Title: Faster Projection-free Online Learning
- Authors: Elad Hazan and Edgar Minasyan
- Abstract summary: We give an efficient projection-free algorithm that guarantees $T^{2/3}$ regret for general online convex optimization.
Our algorithm is derived using the Follow-the-Perturbed-Leader method and is analyzed using an online primal-dual framework.
- Score: 34.96927532439896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many online learning problems the computational bottleneck for
gradient-based methods is the projection operation. For this reason, in many
problems the most efficient algorithms are based on the Frank-Wolfe method,
which replaces projections by linear optimization. In the general case,
however, online projection-free methods require more iterations than
projection-based methods: the best known regret bound scales as $T^{3/4}$.
Despite significant work on various variants of the Frank-Wolfe method, this
bound has remained unchanged for a decade. In this paper we give an efficient
projection-free algorithm that guarantees $T^{2/3}$ regret for general online
convex optimization with smooth cost functions and one linear optimization
computation per iteration. As opposed to previous Frank-Wolfe approaches, our
algorithm is derived using the Follow-the-Perturbed-Leader method and is
analyzed using an online primal-dual framework.
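To illustrate why the Follow-the-Perturbed-Leader template is naturally projection-free, here is a minimal sketch assuming linearized losses and a generic linear optimization oracle. This is the classical Kalai-Vempala-style scheme, not the paper's exact algorithm, which adds smoothing and an online primal-dual analysis; the `lmo_ball` oracle and the parameter choices below are illustrative assumptions.

```python
import numpy as np

def lmo_ball(c, radius=1.0):
    """Linear optimization oracle for the Euclidean ball:
    argmin of <c, x> over ||x|| <= radius is -radius * c / ||c||."""
    n = np.linalg.norm(c)
    return np.zeros_like(c) if n == 0.0 else -radius * c / n

def fpl(grads, dim, eta=1.0, seed=0, lin_opt=lmo_ball):
    """Follow-the-Perturbed-Leader with linearized losses: each round
    plays argmin_{x in C} <cum_grad - sigma, x>, which costs exactly
    one linear optimization call and no projection."""
    rng = np.random.default_rng(seed)
    cum_grad = np.zeros(dim)
    plays = []
    for g in grads:                                    # oblivious loss sequence
        sigma = rng.uniform(0.0, 1.0 / eta, size=dim)  # random perturbation
        plays.append(lin_opt(cum_grad - sigma))
        cum_grad += np.asarray(g)                      # gradient revealed after play
    return plays

# toy run: ten random linear losses in five dimensions
rng = np.random.default_rng(1)
xs = fpl([rng.normal(size=5) for _ in range(10)], dim=5)
```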
Related papers
- Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
Gradient-variation online learning aims to achieve regret guarantees that scale with the variation of the gradients of the online functions.
Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms.
We provide applications to fast-rate convergence in games and to extended adversarial optimization.
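For context, a hedged sketch of the quantity such bounds target: under standard smoothness (the paper works under a generalized smoothness condition), gradient-variation regret bounds scale with the cumulative variation of the gradients rather than with $T$:

```latex
V_T = \sum_{t=2}^{T} \sup_{x \in \mathcal{X}}
      \left\| \nabla f_t(x) - \nabla f_{t-1}(x) \right\|^2,
\qquad \text{regret} = \mathcal{O}\!\left(\sqrt{1 + V_T}\right).
```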
arXiv Detail & Related papers (2024-08-17T02:22:08Z)
- Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $\mathcal{O}(\log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z)
- Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features [65.64276393443346]
The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints.
We present two new variants of the Frank-Wolfe algorithm for finite-sum minimization.
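For reference, a minimal sketch of the classic full-gradient Frank-Wolfe step these variants build on; in the paper's stochastic variants a SARAH-style variance-reduced estimator would stand in for `grad_f` (the names and the simplex example are illustrative assumptions):

```python
import numpy as np

def frank_wolfe(grad_f, lin_opt, x0, num_steps=100):
    """Classic Frank-Wolfe: each step costs one gradient, one linear
    minimization, and a convex-combination update -- no projection."""
    x = np.array(x0, dtype=float)
    for t in range(num_steps):
        v = lin_opt(grad_f(x))           # argmin_{v in C} <grad, v>
        gamma = 2.0 / (t + 2.0)          # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * v
    return x

# example: least squares over the probability simplex
rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
grad = lambda x: 2.0 * A.T @ (A @ x - b)
def simplex_lmo(c):
    v = np.zeros_like(c)
    v[int(np.argmin(c))] = 1.0           # best vertex of the simplex
    return v
x_star = frank_wolfe(grad, simplex_lmo, x0=np.full(5, 0.2))
```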
arXiv Detail & Related papers (2023-04-23T20:05:09Z)
- Projection-free Adaptive Regret with Membership Oracles [31.422532403048738]
Most iterative algorithms require the computation of projections onto convex sets, which can be computationally expensive.
Recent work by GK22 gave sublinear adaptive regret guarantees with projection-free algorithms based on the Frank-Wolfe approach.
We give projection-free algorithms based on a different technique, inspired by Mhammedi22, that replaces projections with set-membership computations.
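A minimal sketch of the membership-oracle primitive this line of work relies on (an illustrative building block under the stated assumptions, not the paper's full algorithm): when the convex set contains the origin, an infeasible point can be scaled back to the approximate boundary by bisection, costing $O(\log(1/\mathrm{tol}))$ membership queries instead of a projection.

```python
import numpy as np

def scale_into_set(x, member, tol=1e-6):
    """Return s*x for the largest s in [0, 1] with s*x in C (up to
    tol), via bisection on a membership oracle member(y) -> bool.
    Assumes C is convex and contains 0, so feasibility of s*x is
    monotone in s."""
    if member(x):
        return x
    lo, hi = 0.0, 1.0        # invariant: lo*x in C, hi*x outside C
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if member(mid * x):
            lo = mid
        else:
            hi = mid
    return lo * x

# example with the Euclidean unit ball as C
in_ball = lambda y: float(np.linalg.norm(y)) <= 1.0
y = scale_into_set(np.array([3.0, 4.0]), in_ball)   # ~[0.6, 0.8]
```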
arXiv Detail & Related papers (2022-11-22T23:53:06Z)
- Using Taylor-Approximated Gradients to Improve the Frank-Wolfe Method for Empirical Risk Minimization [1.4504054468850665]
In the context of Empirical Risk Minimization, we use Taylor-approximated gradients to improve the Frank-Wolfe method.
We also present a novel adaptive step-size approach for which we have computational guarantees.
We show that our methods exhibit very significant speed-ups on real-world binary classification datasets.
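For orientation, the standard adaptive ("short-step") rule for Frank-Wolfe on an $L$-smooth objective, shown as a hedged reference point (the paper's own step-size rule may differ): it exactly minimizes the smoothness upper bound along the segment from the iterate $x_t$ to the Frank-Wolfe vertex $v_t$.

```latex
\gamma_t = \min\left\{ 1,\;
  \frac{\langle \nabla f(x_t),\, x_t - v_t \rangle}
       {L\,\lVert x_t - v_t \rVert^2} \right\}.
```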
arXiv Detail & Related papers (2022-08-30T00:08:37Z)
- Implicit Parameter-free Online Learning with Truncated Linear Models [51.71216912089413]
Parameter-free algorithms are online learning algorithms that do not require setting learning rates.
We propose new parameter-free algorithms that can take advantage of truncated linear models through a new update that has an "implicit" flavor.
Based on a novel decomposition of the regret, the new update is efficient, requires only one gradient at each step, never overshoots the minimum of the truncated model, and retains the favorable parameter-free properties.
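A hedged sketch of what a truncated linear model can look like (an illustrative form; see the paper for the exact model): when the loss is known to be bounded below, say by $0$, the usual linearization at $x_t$ is clipped at that bound, giving a surrogate that is still a lower bound on the loss but never drops below the known minimum.

```latex
\tilde{f}_t(x) = \max\bigl\{\, f_t(x_t) + \langle g_t,\, x - x_t \rangle,\; 0 \,\bigr\},
\qquad g_t \in \partial f_t(x_t).
```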
arXiv Detail & Related papers (2022-03-19T13:39:49Z)
- Efficient Projection-Free Online Convex Optimization with Membership Oracle [11.745866777357566]
We present a new reduction that turns any algorithm A defined on a Euclidean ball into an algorithm on a constrained set C contained within the ball.
Our reduction requires $O(T \log T)$ calls to a Membership Oracle on C after $T$ rounds, and no linear optimization on C is needed.
arXiv Detail & Related papers (2021-11-10T17:22:29Z)
- BiAdam: Fast Adaptive Bilevel Optimization Methods [104.96004056928474]
Bilevel optimization has attracted increased interest in machine learning due to its many applications.
We provide a useful analysis framework for both constrained and unconstrained optimization.
arXiv Detail & Related papers (2021-06-21T20:16:40Z)
- Boosting for Online Convex Optimization [64.15578413206715]
We consider the decision-making framework of online convex optimization with a large number of experts.
We define a weak learning algorithm as a mechanism that guarantees approximate regret against a base class of experts.
We give an efficient boosting algorithm that guarantees near-optimal regret against the convex hull of the base class.
arXiv Detail & Related papers (2021-02-18T12:30:49Z)
- Complexity of Linear Minimization and Projection on Some Sets [33.53609344219565]
The Frank-Wolfe algorithm is a method for constrained optimization that relies on linear minimizations, as opposed to projections.
This paper reviews the complexity bounds for both tasks on several sets commonly used in optimization.
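To make the contrast concrete, a small sketch for the $\ell_1$ ball: linear minimization selects a single signed vertex in $O(d)$ time, while Euclidean projection needs a sort-based soft-thresholding, $O(d \log d)$ with the textbook method (both implementations below are standard, shown for illustration).

```python
import numpy as np

def lmo_l1(c, radius=1.0):
    """Linear minimization over the l1 ball in O(d):
    the minimizer of <c, x> is a signed vertex."""
    i = int(np.argmax(np.abs(c)))
    v = np.zeros_like(c)
    v[i] = -radius * np.sign(c[i]) if c[i] != 0 else radius
    return v

def project_l1(y, radius=1.0):
    """Euclidean projection onto the l1 ball in O(d log d),
    via sorting and soft-thresholding."""
    if np.abs(y).sum() <= radius:
        return y.copy()
    u = np.sort(np.abs(y))[::-1]
    cum = np.cumsum(u) - radius
    ks = np.arange(1, y.size + 1)
    rho = np.nonzero(u - cum / ks > 0)[0][-1]
    theta = cum[rho] / (rho + 1.0)
    return np.sign(y) * np.maximum(np.abs(y) - theta, 0.0)

g = np.array([0.5, -2.0, 1.0])
print(lmo_l1(g))       # one nonzero coordinate: [0., 1., 0.]
print(project_l1(g))   # soft-thresholded:       [0., -1., 0.]
```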
arXiv Detail & Related papers (2021-01-25T12:14:34Z)
- Efficient Projection-Free Algorithms for Saddle Point Problems [39.88460595129901]
We study projection-free algorithms for convex-strongly-concave saddle point problems with complicated constraints.
Our method combines Conditional Gradient Sliding with Mirror-Prox; we show that it requires only $\tilde{O}(1/\sqrt{\epsilon})$ evaluations and $\tilde{O}(1/\epsilon^2)$ linear optimizations in the batch setting.
arXiv Detail & Related papers (2020-10-21T15:05:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.