Stochastic Online Learning with Feedback Graphs: Finite-Time and
Asymptotic Optimality
- URL: http://arxiv.org/abs/2206.10022v1
- Date: Mon, 20 Jun 2022 22:11:08 GMT
- Title: Stochastic Online Learning with Feedback Graphs: Finite-Time and
Asymptotic Optimality
- Authors: Teodor V. Marinov and Mehryar Mohri and Julian Zimmert
- Abstract summary: We show that, surprisingly, the notion of optimal finite-time regret is not a uniquely defined property in this context.
We give an algorithm that admits quasi-optimal regret both in finite-time andally.
- Score: 39.2230418563572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We revisit the problem of stochastic online learning with feedback graphs,
with the goal of devising algorithms that are optimal, up to constants, both
asymptotically and in finite time. We show that, surprisingly, the notion of
optimal finite-time regret is not a uniquely defined property in this context
and that, in general, it is decoupled from the asymptotic rate. We discuss
alternative choices and propose a notion of finite-time optimality that we
argue is \emph{meaningful}. For that notion, we give an algorithm that admits
quasi-optimal regret both in finite-time and asymptotically.
Related papers
- Asymptotic and Non-Asymptotic Convergence of AdaGrad for Non-Convex Optimization via Novel Stopping Time-based Analysis [17.34603953600226]
We introduce an innovative comprehensive analysis of Ada, filling the existing gaps in the literature.
We derive almost sure and mean-prodd in expectation for AdaGrad.
In addition, we demonstrate the average non-a-bpt-d gradient measured by the rate-prod in expectation, which is potentially independent of the existing results.
arXiv Detail & Related papers (2024-09-08T08:29:51Z) - Improving Adaptive Online Learning Using Refined Discretization [44.646191058243645]
We study unconstrained Online Linear Optimization with Lipschitz losses.
Motivated by the pursuit of instance optimality, we propose a new algorithm.
Central to these results is a continuous time approach to online learning.
arXiv Detail & Related papers (2023-09-27T21:54:52Z) - Instance-Optimality in Interactive Decision Making: Toward a
Non-Asymptotic Theory [30.061707627742766]
We aim for instance-optimality, a strong notion of adaptivity which asserts that, on any particular problem instance, the algorithm under consideration outperforms all consistent algorithms.
In this paper, we take the first step toward developing a non-asymptotic theory of instance-optimal decision making with general function approximation.
arXiv Detail & Related papers (2023-04-24T21:51:58Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds
for Episodic Reinforcement Learning [50.44564503645015]
We provide improved gap-dependent regret bounds for reinforcement learning in finite episodic Markov decision processes.
We prove tighter upper regret bounds for optimistic algorithms and accompany them with new information-theoretic lower bounds for a large class of MDPs.
arXiv Detail & Related papers (2021-07-02T20:36:05Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in analysis of optimization algorithms for non- optimization in deep networks and other problems in data, we give an overview of recent results of theoretical optimization algorithms for non- optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm [71.13558000599839]
We study the problem of solving strongly convex and smooth unconstrained optimization problems using first-order algorithms.
We devise a novel, referred to as Recursive One-Over-T SGD, based on an easily implementable, averaging of past gradients.
We prove that it simultaneously achieves state-of-the-art performance in both a finite-sample, nonasymptotic sense and an sense.
arXiv Detail & Related papers (2020-08-28T14:46:56Z) - An Index-based Deterministic Asymptotically Optimal Algorithm for
Constrained Multi-armed Bandit Problems [0.0]
For the model of constrained multi-armed bandit, we show that there exists an index-based deterministically optimal algorithm.
We provide a finite-time bound to the probability of the optimality given as 1-O(|A|Te-T) where T is the horizon size and A is the set of the arms in the bandit.
arXiv Detail & Related papers (2020-07-29T01:54:22Z) - Crush Optimism with Pessimism: Structured Bandits Beyond Asymptotic
Optimality [25.627939043735683]
We study structured bandits for minimizing regret.
In this paper we focus on the finite hypothesis and ask if one can achieve the optimality while enjoying bounded regret.
arXiv Detail & Related papers (2020-06-15T20:46:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.