Comparator-adaptive Convex Bandits
- URL: http://arxiv.org/abs/2007.08448v1
- Date: Thu, 16 Jul 2020 16:33:35 GMT
- Title: Comparator-adaptive Convex Bandits
- Authors: Dirk van der Hoeven and Ashok Cutkosky and Haipeng Luo
- Abstract summary: We develop convex bandit algorithms with regret bounds that are small whenever the norm of the comparator is small.
We extend the ideas to convex bandits with Lipschitz or smooth loss functions.
- Score: 77.43350984086119
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study bandit convex optimization methods that adapt to the norm of the
comparator, a topic that has only been studied before for its full-information
counterpart. Specifically, we develop convex bandit algorithms with regret
bounds that are small whenever the norm of the comparator is small. We first
use techniques from the full-information setting to develop comparator-adaptive
algorithms for linear bandits. Then, we extend the ideas to convex bandits with
Lipschitz or smooth loss functions, using a new single-point gradient estimator
and carefully designed surrogate losses.
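As background for the abstract, the sketch below implements the classical single-point gradient estimator from bandit convex optimization (in the style of Flaxman et al.), which the paper's new estimator builds on and refines; the function name and constants here are illustrative, not taken from the paper.

```python
import numpy as np

def one_point_gradient(f, x, delta, rng):
    """Classical single-point gradient estimator.

    One bandit query f(x + delta * u) at a randomly perturbed point yields
    an unbiased estimate of the gradient of a delta-smoothed version of f.
    """
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)          # uniform direction on the unit sphere
    loss = f(x + delta * u)         # the only feedback the learner observes
    return (d / delta) * loss * u   # scaling makes the estimate unbiased

# Averaging many estimates recovers the gradient: f(x) = ||x||^2 at x = (1, 0).
rng = np.random.default_rng(0)
f = lambda z: float(z @ z)
x = np.array([1.0, 0.0])
est = np.mean([one_point_gradient(f, x, 0.1, rng) for _ in range(200_000)], axis=0)
print(est)  # close to the true gradient (2, 0)
```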
Related papers
- Bregman-divergence-based Arimoto-Blahut algorithm [53.64687146666141]
We generalize the Arimoto-Blahut algorithm to a general function defined over a Bregman-divergence system.
We propose a convex-optimization-free algorithm that can be applied to classical and quantum rate-distortion theory.
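For reference, a Bregman divergence is generated by a convex function F via D_F(p, q) = F(p) - F(q) - <grad F(q), p - q>. The sketch below is an illustrative check (not the paper's algorithm) that the negative-entropy generator recovers the KL divergence underlying the classical Arimoto-Blahut algorithm.

```python
import numpy as np

def bregman_divergence(F, grad_F, p, q):
    """D_F(p, q) = F(p) - F(q) - <grad F(q), p - q> for a convex generator F."""
    return F(p) - F(q) - np.dot(grad_F(q), p - q)

# Negative entropy as the generator recovers the KL divergence,
# the case underlying the classical Arimoto-Blahut algorithm.
neg_entropy = lambda p: float(np.sum(p * np.log(p)))
neg_entropy_grad = lambda p: np.log(p) + 1.0

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.4, 0.4, 0.2])
print(bregman_divergence(neg_entropy, neg_entropy_grad, p, q))
print(np.sum(p * np.log(p / q)))  # matches: KL(p || q)
```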
arXiv Detail & Related papers (2024-08-10T06:16:24Z)
- A Generalized Approach to Online Convex Optimization [33.38582292895673]
We show that any algorithm for online linear optimization with fully adaptive adversaries is an algorithm for online convex optimization.
We show that any such algorithm that requires full-information feedback may be transformed into an algorithm with semi-bandit feedback with a comparable regret bound.
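The textbook version of such a reduction feeds each round's gradient to an online linear optimization (OLO) learner: by convexity, f_t(x_t) - f_t(u) <= <grad f_t(x_t), x_t - u>, so linear regret on the gradients bounds convex regret. The sketch below (with online gradient descent standing in for the OLO learner; all names are illustrative) shows this classical construction, not the paper's more general one.

```python
import numpy as np

class OnlineGradientDescent:
    """A simple OLO learner: plays x_t, then updates on a linear loss <g_t, .>."""
    def __init__(self, dim, lr):
        self.x = np.zeros(dim)
        self.lr = lr

    def predict(self):
        return self.x

    def update(self, g):
        self.x = self.x - self.lr * g   # gradient step on the linear loss

def run_oco_via_olo(convex_grads, olo):
    """Textbook reduction: convex losses f_t are handled by passing
    g_t = grad f_t(x_t) to the OLO learner; by convexity,
    f_t(x_t) - f_t(u) <= <g_t, x_t - u>, so OLO regret bounds OCO regret."""
    for grad in convex_grads:
        x = olo.predict()
        olo.update(grad(x))
    return olo.predict()

# Example: f_t(x) = ||x - c_t||^2 with randomly drifting centers c_t.
rng = np.random.default_rng(1)
grads = [(lambda x, c=rng.normal(size=2): 2.0 * (x - c)) for _ in range(1000)]
print(run_oco_via_olo(grads, OnlineGradientDescent(dim=2, lr=0.05)))
# ends near the (zero) mean of the centers c_t
```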
arXiv Detail & Related papers (2024-02-13T17:42:27Z)
- Robust Low-Rank Matrix Completion via a New Sparsity-Inducing Regularizer [30.920908325825668]
This paper presents a novel loss function, referred to as hybrid ordinary-Welsch (HOW), and a new sparsity-inducing regularizer for the robust matrix completion problem.
arXiv Detail & Related papers (2023-10-07T09:47:55Z)
- Optimal Gradient-based Algorithms for Non-concave Bandit Optimization [76.57464214864756]
This work considers a large family of bandit problems where the unknown underlying reward function is non-concave.
Our algorithms are based on a unified zeroth-order optimization paradigm that applies in great generality.
We show that the standard optimistic algorithms are sub-optimal by dimension factors.
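For orientation, the most common zeroth-order building block is the two-point gradient estimator, sketched below on a toy quartic. This is standard background under the assumption of two function queries per round, not the paper's unified paradigm.

```python
import numpy as np

def two_point_gradient(f, x, delta, rng):
    """Standard two-point zeroth-order estimator: query f at x + delta*u and
    x - delta*u and difference the values. Illustrative background only."""
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)          # uniform direction on the unit sphere
    return d * (f(x + delta * u) - f(x - delta * u)) / (2.0 * delta) * u

def zeroth_order_descent(f, x0, steps, lr, delta, seed=0):
    """Gradient descent driven purely by function evaluations."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(steps):
        x -= lr * two_point_gradient(f, x, delta, rng)
    return x

# Example: minimize f(x) = ||x||^4 using only function evaluations.
f = lambda z: float(z @ z) ** 2
print(zeroth_order_descent(f, np.ones(3), steps=5000, lr=0.01, delta=1e-3))
# converges toward the minimizer at the origin
```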
arXiv Detail & Related papers (2021-07-09T16:04:24Z)
- Optimal Stochastic Nonconvex Optimization with Bandit Feedback [45.675080529219365]
We analyze continuum-armed bandit problems for nonconvex cost functions under certain smoothness and sublevel set assumptions.
We then propose an adaptive splitting method, which can significantly improve the performance.
arXiv Detail & Related papers (2021-03-30T05:21:12Z)
- Upper Confidence Bounds for Combining Stochastic Bandits [52.10197476419621]
We provide a simple method to combine bandit algorithms.
Our approach is based on a "meta-UCB" procedure that treats each of $N$ individual bandit algorithms as arms in a higher-level $N$-armed bandit problem.
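A minimal sketch of that idea follows, with a plain UCB1 rule standing in for the paper's meta-UCB procedure (whose confidence terms differ); the reward interfaces here are illustrative.

```python
import math
import random

def meta_ucb(base_algorithms, horizon):
    """Treat each of the N base bandit algorithms as an arm of an N-armed
    bandit and select among them with a UCB1-style rule. Illustrative sketch,
    not the paper's exact procedure."""
    n = len(base_algorithms)
    counts, sums = [0] * n, [0.0] * n
    for t in range(1, horizon + 1):
        if t <= n:
            i = t - 1                          # play every base algorithm once
        else:
            i = max(range(n), key=lambda j: sums[j] / counts[j]
                    + math.sqrt(2.0 * math.log(t) / counts[j]))
        reward = base_algorithms[i]()          # run one round of algorithm i
        counts[i] += 1
        sums[i] += reward
    return counts

# Example: two "base algorithms" that just produce noisy rewards.
random.seed(0)
good = lambda: random.gauss(0.8, 0.1)
bad = lambda: random.gauss(0.3, 0.1)
print(meta_ucb([good, bad], horizon=2000))     # most pulls go to `good`
```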
arXiv Detail & Related papers (2020-12-24T05:36:29Z)
- Experimental Design for Regret Minimization in Linear Bandits [19.8309784360219]
We propose a novel experimental-design-based algorithm to minimize regret in online stochastic linear and combinatorial bandits.
We provide state-of-the-art finite-time regret guarantees and show that our algorithm can be applied in both the bandit and semi-bandit feedback regimes.
arXiv Detail & Related papers (2020-11-01T17:59:19Z)
- An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
- An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits [34.06611065493047]
This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings.
We provide an algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms.
We also propose the first algorithm for linear bandits in the fixed budget setting.
arXiv Detail & Related papers (2020-06-21T00:56:33Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
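A toy sketch-and-solve baseline with a Gaussian embedding is shown below (an SRHT would replace the dense matrix S with a subsampled randomized Hadamard transform); unlike the paper's method, this version fixes the sketch size m instead of adapting it to the effective dimension.

```python
import numpy as np

def sketched_ridge(A, b, lam, m, rng):
    """Approximate argmin_x ||Ax - b||^2 + lam * ||x||^2 by sketching:
    compress the n rows of A and b to m rows with a Gaussian embedding S,
    then solve the small regularized normal equations exactly."""
    n, d = A.shape
    S = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian sketch
    SA, Sb = S @ A, S @ b
    return np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)

# Compare against the exact solution on a random problem.
rng = np.random.default_rng(0)
A = rng.standard_normal((10_000, 50))
b = rng.standard_normal(10_000)
x_sketch = sketched_ridge(A, b, lam=1.0, m=500, rng=rng)
x_exact = np.linalg.solve(A.T @ A + 1.0 * np.eye(50), A.T @ b)
print(np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))
# modest relative error, shrinking as the sketch size m grows
```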
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.