Smoothed Online Learning is as Easy as Statistical Learning
- URL: http://arxiv.org/abs/2202.04690v1
- Date: Wed, 9 Feb 2022 19:22:34 GMT
- Title: Smoothed Online Learning is as Easy as Statistical Learning
- Authors: Adam Block, Yuval Dagan, Noah Golowich, and Alexander Rakhlin
- Abstract summary: We provide the first oracle-efficient, no-regret algorithms in this setting.
We show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits.
- Score: 77.00766067963195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Much of modern learning theory has been split between two regimes: the
classical \emph{offline} setting, where data arrive independently, and the
\emph{online} setting, where data arrive adversarially. While the former model
is often both computationally and statistically tractable, the latter requires
no distributional assumptions. In an attempt to achieve the best of both
worlds, previous work proposed the smooth online setting where each sample is
drawn from an adversarially chosen distribution, which is smooth, i.e., it has
a bounded density with respect to a fixed dominating measure. We provide tight
bounds on the minimax regret of learning a nonparametric function class, with
nearly optimal dependence on both the horizon and smoothness parameters.
Furthermore, we provide the first oracle-efficient, no-regret algorithms in
this setting. In particular, we propose an oracle-efficient improper algorithm
whose regret achieves optimal dependence on the horizon and a proper algorithm
requiring only a single oracle call per round whose regret has the optimal
horizon dependence in the classification setting and is sublinear in general.
Both algorithms have exponentially worse dependence on the smoothness parameter
of the adversary than the minimax rate. We then prove a lower bound on the
oracle complexity of any proper learning algorithm, which matches the
oracle-efficient upper bounds up to a polynomial factor, thus demonstrating the
existence of a statistical-computational gap in smooth online learning.
Finally, we apply our results to the contextual bandit setting to show that if
a function class is learnable in the classical setting, then there is an
oracle-efficient, no-regret algorithm for contextual bandits in the case that
contexts arrive in a smooth manner.
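For intuition, here is a minimal sketch (not from the paper) of what a sigma-smooth adversary looks like over [0, 1] with the uniform distribution as the dominating measure. The adversary may concentrate mass, but its density can never exceed 1/sigma times the uniform density; the most concentrated sigma-smooth choice is uniform on a subinterval of length sigma. The function and parameter names below are illustrative assumptions, not the paper's notation:

```python
import random

def smooth_adversary_sample(sigma, target, rng):
    """Sample from a maximally concentrated sigma-smooth distribution:
    uniform on an interval of length sigma placed as close to `target`
    as the domain [0, 1] allows. Its density is 1/sigma on that
    interval and 0 elsewhere, so the smoothness bound is tight."""
    lo = min(max(target - sigma / 2.0, 0.0), 1.0 - sigma)
    return lo + sigma * rng.random()

rng = random.Random(0)
sigma = 0.1
samples = [smooth_adversary_sample(sigma, 0.5, rng) for _ in range(10_000)]

# All mass lies in an interval of length sigma around 0.5, so the density
# there is 1/sigma = 10 times the uniform density -- but never more.
assert all(0.45 <= x <= 0.55 for x in samples)
```

The point of the smoothness constraint is visible here: unlike a fully adversarial context sequence, a sigma-smooth adversary cannot place contexts on a measure-zero set, which is what lets statistical-learning tools (and learning oracles) be brought to bear.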
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
- Improving Adaptive Online Learning Using Refined Discretization [44.646191058243645]
We study unconstrained Online Linear Optimization with Lipschitz losses.
Motivated by the pursuit of instance optimality, we propose a new algorithm.
Central to these results is a continuous time approach to online learning.
arXiv Detail & Related papers (2023-09-27T21:54:52Z)
- Oracle-Efficient Smoothed Online Learning for Piecewise Continuous Decision Making [73.48977854003697]
This work introduces a new notion of complexity, the generalized bracketing numbers, which marries constraints on the adversary to the size of the space.
We then instantiate our bounds in several problems of interest, including online prediction and planning of piecewise continuous functions.
arXiv Detail & Related papers (2023-02-10T18:45:52Z)
- Efficient and Near-Optimal Smoothed Online Learning for Generalized Linear Functions [28.30744223973527]
We give a computationally efficient algorithm that is the first to enjoy the statistically optimal log(T/sigma) regret for realizable K-wise linear classification.
We develop a novel characterization of the geometry of the disagreement region induced by generalized linear classifiers.
arXiv Detail & Related papers (2022-05-25T21:31:36Z)
- Implicit Parameter-free Online Learning with Truncated Linear Models [51.71216912089413]
Parameter-free algorithms are online learning algorithms that do not require setting learning rates.
We propose new parameter-free algorithms that can take advantage of truncated linear models through a new update that has an "implicit" flavor.
Based on a novel decomposition of the regret, the new update is efficient, requires only one gradient at each step, never overshoots the minimum of the truncated model, and retains the favorable parameter-free properties.
arXiv Detail & Related papers (2022-03-19T13:39:49Z)
- Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination [82.52105963476703]
A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise.
First-order guarantees are relatively well understood in statistical and online learning.
We show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees.
arXiv Detail & Related papers (2021-07-05T19:20:34Z)
- Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles [112.89548995091182]
We provide the first universal and optimal reduction from contextual bandits to online regression.
Our algorithm requires no distributional assumptions beyond realizability, and works even when contexts are chosen adversarially.
arXiv Detail & Related papers (2020-02-12T11:33:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.