Projection-free Online Learning over Strongly Convex Sets
- URL: http://arxiv.org/abs/2010.08177v2
- Date: Mon, 24 Jun 2024 03:35:02 GMT
- Title: Projection-free Online Learning over Strongly Convex Sets
- Authors: Yuanyu Wan, Lijun Zhang,
- Abstract summary: We study the special case of online learning over strongly convex sets, for which we first prove that OFW can enjoy a better regret bound of $O(T2/3)$ for general convex losses.
We show that it achieves a regret bound of $O(sqrtT)$ over general convex sets and a better regret bound of $O(sqrtT)$ over strongly convex sets.
- Score: 24.517908972536432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To efficiently solve online problems with complicated constraints, projection-free algorithms including online frank-wolfe (OFW) and its variants have received significant interest recently. However, in the general case, existing efficient projection-free algorithms only achieved the regret bound of $O(T^{3/4})$, which is worse than the regret of projection-based algorithms, where $T$ is the number of decision rounds. In this paper, we study the special case of online learning over strongly convex sets, for which we first prove that OFW can enjoy a better regret bound of $O(T^{2/3})$ for general convex losses. The key idea is to refine the decaying step-size in the original OFW by a simple line search rule. Furthermore, for strongly convex losses, we propose a strongly convex variant of OFW by redefining the surrogate loss function in OFW. We show that it achieves a regret bound of $O(T^{2/3})$ over general convex sets and a better regret bound of $O(\sqrt{T})$ over strongly convex sets.
Related papers
- Universal Online Convex Optimization with $1$ Projection per Round [31.16522982351235]
We develop universal algorithms that simultaneously attain minimax rates for multiple types of convex functions.
We employ a surrogate loss defined over simpler domains to develop universal OCO algorithms that only require $1$ projection.
Our analysis sheds new light on the surrogate loss, facilitating rigorous examination of the discrepancy between the regret of the original loss and that of the surrogate loss.
arXiv Detail & Related papers (2024-05-30T05:29:40Z) - Second Order Methods for Bandit Optimization and Control [34.51425758864638]
We show that our algorithm achieves optimal (in terms of terms of convex functions that we call $kappa$-2020) regret bounds for a large class of convex functions.
We also investigate the adaptation of our second-order bandit algorithm to online convex optimization with memory.
arXiv Detail & Related papers (2024-02-14T04:03:38Z) - Settling the Sample Complexity of Online Reinforcement Learning [92.02082223856479]
We show how to achieve minimax-optimal regret without incurring any burn-in cost.
We extend our theory to unveil the influences of problem-dependent quantities like the optimal value/cost and certain variances.
arXiv Detail & Related papers (2023-07-25T15:42:11Z) - Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach [57.92727189589498]
We propose an online convex optimization approach with two different levels of adaptivity.
We obtain $mathcalO(log V_T)$, $mathcalO(d log V_T)$ and $hatmathcalO(sqrtV_T)$ regret bounds for strongly convex, exp-concave and convex loss functions.
arXiv Detail & Related papers (2023-07-17T09:55:35Z) - Near-Optimal Algorithms for Private Online Optimization in the
Realizable Regime [74.52487417350221]
We consider online learning problems in the realizable setting, where there is a zero-loss solution.
We propose new Differentially Private (DP) algorithms that obtain near-optimal regret bounds.
arXiv Detail & Related papers (2023-02-27T21:19:24Z) - Improved Dynamic Regret for Online Frank-Wolfe [54.690867216880356]
We investigate the dynamic regret of online Frank-Wolfe (OFW), which is an efficient projection-free algorithm for online convex optimization.
In this paper, we derive improved dynamic regret bounds for OFW by extending the fast convergence rates of FW from offline optimization to online optimization.
arXiv Detail & Related papers (2023-02-11T07:19:51Z) - Projection-free Adaptive Regret with Membership Oracles [31.422532403048738]
Most iterative algorithms require the computation of projections onto convex sets, which can be computationally expensive.
Recent work by GK22 gave sublinear adaptive regret guarantees with projection free algorithms based on the Frank Wolfe approach.
We give projection-free algorithms that are based on a different technique, inspired by Mhammedi22, that replaces projections by set-membership computations.
arXiv Detail & Related papers (2022-11-22T23:53:06Z) - Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free
Reinforcement Learning [52.76230802067506]
A novel model-free algorithm is proposed to minimize regret in episodic reinforcement learning.
The proposed algorithm employs an em early-settled reference update rule, with the aid of two Q-learning sequences.
The design principle of our early-settled variance reduction method might be of independent interest to other RL settings.
arXiv Detail & Related papers (2021-10-09T21:13:48Z) - Projection-free Distributed Online Learning with Strongly Convex Losses [37.08975118221237]
We exploit the strong convexity of loss functions to improve the regret bound and communication complexity.
Our algorithm is nearly optimal for obtaining the $O(T2/3log T)$ regret bound up to polylogarithmic factors.
arXiv Detail & Related papers (2021-03-20T05:38:51Z) - Lazy OCO: Online Convex Optimization on a Switching Budget [34.936641201844054]
We study a variant of online convex optimization where the player is permitted to switch decisions at most $S$ times in expectation throughout $T$ rounds.
Similar problems have been addressed in prior work for the discrete decision set setting, and more recently in the continuous setting but only with an adaptive adversary.
arXiv Detail & Related papers (2021-02-07T14:47:19Z) - Online Learning with Imperfect Hints [72.4277628722419]
We develop algorithms and nearly matching lower bounds for online learning with imperfect directional hints.
Our algorithms are oblivious to the quality of the hints, and the regret bounds interpolate between the always-correlated hints case and the no-hints case.
arXiv Detail & Related papers (2020-02-11T23:06:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.