Faster Projection-free Online Learning
- URL: http://arxiv.org/abs/2001.11568v2
- Date: Fri, 14 Feb 2020 17:36:31 GMT
- Title: Faster Projection-free Online Learning
- Authors: Elad Hazan and Edgar Minasyan
- Abstract summary: We give an efficient projection-free algorithm that guarantees $T^{2/3}$ regret for general online convex optimization.
Our algorithm is derived using the Follow-the-Perturbed-Leader method and is analyzed using an online primal-dual framework.
- Score: 34.96927532439896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many online learning problems the computational bottleneck for
gradient-based methods is the projection operation. For this reason, in many
problems the most efficient algorithms are based on the Frank-Wolfe method,
which replaces projections by linear optimization. In the general case,
however, online projection-free methods require more iterations than
projection-based methods: the best known regret bound scales as $T^{3/4}$.
Despite significant work on various variants of the Frank-Wolfe method, this
bound has remained unchanged for a decade. In this paper we give an efficient
projection-free algorithm that guarantees $T^{2/3}$ regret for general online
convex optimization with smooth cost functions and one linear optimization
computation per iteration. As opposed to previous Frank-Wolfe approaches, our
algorithm is derived using the Follow-the-Perturbed-Leader method and is
analyzed using an online primal-dual framework.
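To illustrate why the Follow-the-Perturbed-Leader template is naturally projection-free, here is a minimal sketch assuming linearized losses and a generic linear optimization oracle. This is the classical Kalai-Vempala-style scheme, not the paper's exact algorithm, which adds smoothing and an online primal-dual analysis; the `lmo_ball` oracle and the parameter choices below are illustrative assumptions.

```python
import numpy as np

def lmo_ball(c, radius=1.0):
    """Linear optimization oracle for the Euclidean ball:
    argmin of <c, x> over ||x|| <= radius is -radius * c / ||c||."""
    n = np.linalg.norm(c)
    return np.zeros_like(c) if n == 0.0 else -radius * c / n

def fpl(grads, dim, eta=1.0, seed=0, lin_opt=lmo_ball):
    """Follow-the-Perturbed-Leader with linearized losses: each round
    plays argmin_{x in C} <cum_grad - sigma, x>, which costs exactly
    one linear optimization call and no projection."""
    rng = np.random.default_rng(seed)
    cum_grad = np.zeros(dim)
    plays = []
    for g in grads:                                    # oblivious loss sequence
        sigma = rng.uniform(0.0, 1.0 / eta, size=dim)  # random perturbation
        plays.append(lin_opt(cum_grad - sigma))
        cum_grad += np.asarray(g)                      # gradient revealed after play
    return plays

# toy run: ten random linear losses in five dimensions
rng = np.random.default_rng(1)
xs = fpl([rng.normal(size=5) for _ in range(10)], dim=5)
```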
Related papers
- Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
Gradient-variation online learning aims to achieve regret guarantees that scale with the variation of the gradients of the online functions.
Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms.
We provide applications to fast-rate convergence in games and to extended adversarial optimization.
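For context, a hedged sketch of the quantity such bounds target: under standard smoothness (the paper works under a generalized smoothness condition), gradient-variation regret bounds scale with the cumulative variation of the gradients rather than with $T$:

```latex
V_T = \sum_{t=2}^{T} \sup_{x \in \mathcal{X}}
      \left\| \nabla f_t(x) - \nabla f_{t-1}(x) \right\|^2,
\qquad \text{regret} = \mathcal{O}\!\left(\sqrt{1 + V_T}\right).
```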
arXiv Detail & Related papers (2024-08-17T02:22:08Z)
- Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $\mathcal{O}(\log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z)
- Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features [65.64276393443346]
The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints.
We present two new variants of the Frank-Wolfe algorithm for finite-sum minimization.
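For reference, a minimal sketch of the classic full-gradient Frank-Wolfe step these variants build on; in the paper's stochastic variants a SARAH-style variance-reduced estimator would stand in for `grad_f` (the names and the simplex example are illustrative assumptions):

```python
import numpy as np

def frank_wolfe(grad_f, lin_opt, x0, num_steps=100):
    """Classic Frank-Wolfe: each step costs one gradient, one linear
    minimization, and a convex-combination update -- no projection."""
    x = np.array(x0, dtype=float)
    for t in range(num_steps):
        v = lin_opt(grad_f(x))           # argmin_{v in C} <grad, v>
        gamma = 2.0 / (t + 2.0)          # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * v
    return x

# example: least squares over the probability simplex
rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
grad = lambda x: 2.0 * A.T @ (A @ x - b)
def simplex_lmo(c):
    v = np.zeros_like(c)
    v[int(np.argmin(c))] = 1.0           # best vertex of the simplex
    return v
x_star = frank_wolfe(grad, simplex_lmo, x0=np.full(5, 0.2))
```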
arXiv Detail & Related papers (2023-04-23T20:05:09Z)
- Projection-free Adaptive Regret with Membership Oracles [31.422532403048738]
Most iterative algorithms require the computation of projections onto convex sets, which can be computationally expensive.
Recent work by GK22 gave sublinear adaptive regret guarantees with projection-free algorithms based on the Frank-Wolfe approach.
We give projection-free algorithms based on a different technique, inspired by Mhammedi22, that replaces projections with set-membership computations.
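A minimal sketch of the membership-oracle primitive this line of work relies on (an illustrative building block under the stated assumptions, not the paper's full algorithm): when the convex set contains the origin, an infeasible point can be scaled back to the approximate boundary by bisection, costing $O(\log(1/\mathrm{tol}))$ membership queries instead of a projection.

```python
import numpy as np

def scale_into_set(x, member, tol=1e-6):
    """Return s*x for the largest s in [0, 1] with s*x in C (up to
    tol), via bisection on a membership oracle member(y) -> bool.
    Assumes C is convex and contains 0, so feasibility of s*x is
    monotone in s."""
    if member(x):
        return x
    lo, hi = 0.0, 1.0        # invariant: lo*x in C, hi*x outside C
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if member(mid * x):
            lo = mid
        else:
            hi = mid
    return lo * x

# example with the Euclidean unit ball as C
in_ball = lambda y: float(np.linalg.norm(y)) <= 1.0
y = scale_into_set(np.array([3.0, 4.0]), in_ball)   # ~[0.6, 0.8]
```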
arXiv Detail & Related papers (2022-11-22T23:53:06Z)
- Using Taylor-Approximated Gradients to Improve the Frank-Wolfe Method for Empirical Risk Minimization [1.4504054468850665]
In the context of Empirical Risk Minimization, we use Taylor-approximated gradients to improve the Frank-Wolfe method.
We also present a novel adaptive step-size approach for which we have computational guarantees.
We show that our methods exhibit very significant speed-ups on real-world binary classification datasets.
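For orientation, the standard adaptive ("short-step") rule for Frank-Wolfe on an $L$-smooth objective, shown as a hedged reference point (the paper's own step-size rule may differ): it exactly minimizes the smoothness upper bound along the segment from the iterate $x_t$ to the Frank-Wolfe vertex $v_t$.

```latex
\gamma_t = \min\left\{ 1,\;
  \frac{\langle \nabla f(x_t),\, x_t - v_t \rangle}
       {L\,\lVert x_t - v_t \rVert^2} \right\}.
```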
arXiv Detail & Related papers (2022-08-30T00:08:37Z)
- Implicit Parameter-free Online Learning with Truncated Linear Models [51.71216912089413]
Parameter-free algorithms are online learning algorithms that do not require setting learning rates.
We propose new parameter-free algorithms that can take advantage of truncated linear models through a new update that has an "implicit" flavor.
Based on a novel decomposition of the regret, the new update is efficient, requires only one gradient at each step, never overshoots the minimum of the truncated model, and retains the favorable parameter-free properties.
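A hedged sketch of what a truncated linear model can look like (an illustrative form; see the paper for the exact model): when the loss is known to be bounded below, say by $0$, the usual linearization at $x_t$ is clipped at that bound, giving a surrogate that is still a lower bound on the loss but never drops below the known minimum.

```latex
\tilde{f}_t(x) = \max\bigl\{\, f_t(x_t) + \langle g_t,\, x - x_t \rangle,\; 0 \,\bigr\},
\qquad g_t \in \partial f_t(x_t).
```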
arXiv Detail & Related papers (2022-03-19T13:39:49Z)
- Efficient Projection-Free Online Convex Optimization with Membership Oracle [11.745866777357566]
We present a new reduction that turns any algorithm A defined on a Euclidean ball into an algorithm on a constrained set C contained within the ball.
Our reduction requires $O(T \log T)$ calls to a Membership Oracle on C after $T$ rounds, and no linear optimization on C is needed.
arXiv Detail & Related papers (2021-11-10T17:22:29Z)
- BiAdam: Fast Adaptive Bilevel Optimization Methods [104.96004056928474]
Bilevel optimization has attracted increased interest in machine learning due to its many applications.
We provide a useful analysis framework for both constrained and unconstrained optimization.
arXiv Detail & Related papers (2021-06-21T20:16:40Z)
- Boosting for Online Convex Optimization [64.15578413206715]
We consider the decision-making framework of online convex optimization with a large number of experts.
We define a weak learning algorithm as a mechanism that guarantees approximate regret against a base class of experts.
We give an efficient boosting algorithm that guarantees near-optimal regret against the convex hull of the base class.
arXiv Detail & Related papers (2021-02-18T12:30:49Z)
- Complexity of Linear Minimization and Projection on Some Sets [33.53609344219565]
The Frank-Wolfe algorithm is a method for constrained optimization that relies on linear minimizations, as opposed to projections.
This paper reviews the complexity bounds for both tasks on several sets commonly used in optimization.
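To make the contrast concrete, a small sketch for the $\ell_1$ ball: linear minimization selects a single signed vertex in $O(d)$ time, while Euclidean projection needs a sort-based soft-thresholding, $O(d \log d)$ with the textbook method (both implementations below are standard, shown for illustration).

```python
import numpy as np

def lmo_l1(c, radius=1.0):
    """Linear minimization over the l1 ball in O(d):
    the minimizer of <c, x> is a signed vertex."""
    i = int(np.argmax(np.abs(c)))
    v = np.zeros_like(c)
    v[i] = -radius * np.sign(c[i]) if c[i] != 0 else radius
    return v

def project_l1(y, radius=1.0):
    """Euclidean projection onto the l1 ball in O(d log d),
    via sorting and soft-thresholding."""
    if np.abs(y).sum() <= radius:
        return y.copy()
    u = np.sort(np.abs(y))[::-1]
    cum = np.cumsum(u) - radius
    ks = np.arange(1, y.size + 1)
    rho = np.nonzero(u - cum / ks > 0)[0][-1]
    theta = cum[rho] / (rho + 1.0)
    return np.sign(y) * np.maximum(np.abs(y) - theta, 0.0)

g = np.array([0.5, -2.0, 1.0])
print(lmo_l1(g))       # one nonzero coordinate: [0., 1., 0.]
print(project_l1(g))   # soft-thresholded:       [0., -1., 0.]
```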
arXiv Detail & Related papers (2021-01-25T12:14:34Z)
- Efficient Projection-Free Algorithms for Saddle Point Problems [39.88460595129901]
We study projection-free algorithms for convex-strongly-concave saddle point problems with complicated constraints.
Our method combines Conditional Gradient Sliding with Mirror-Prox; we show that it requires only $\tilde{O}(1/\sqrt{\epsilon})$ evaluations and $\tilde{O}(1/\epsilon^2)$ linear optimizations in the batch setting.
arXiv Detail & Related papers (2020-10-21T15:05:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.