Open Problem: Model Selection for Contextual Bandits
- URL: http://arxiv.org/abs/2006.10940v1
- Date: Fri, 19 Jun 2020 03:00:01 GMT
- Title: Open Problem: Model Selection for Contextual Bandits
- Authors: Dylan J. Foster and Akshay Krishnamurthy and Haipeng Luo
- Abstract summary: In statistical learning, algorithms for model selection allow the learner to adapt to the complexity of the best hypothesis class.
We ask whether similar guarantees are possible for contextual bandit learning.
- Score: 82.57505650713496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In statistical learning, algorithms for model selection allow the learner to adapt to the complexity of the best hypothesis class in a sequence. We ask whether similar guarantees are possible for contextual bandit learning.
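For orientation, the guarantee the open problem asks to transfer can be paraphrased as follows; the notation below is ours and only schematic, while the paper states the precise nested-class setting.

```latex
% Schematic paraphrase (our notation, not the paper's exact statement).
% Nested classes F_1 \subseteq F_2 \subseteq ..., with m^* the smallest
% index whose class contains the optimum.
\[
  \text{statistical learning:}\quad
  \text{excess risk} \;=\; \tilde{O}\!\Big(\sqrt{\mathrm{comp}(\mathcal{F}_{m^*})/n}\Big),
\]
\[
  \text{open problem (contextual bandits):}\quad
  \mathrm{Reg}(T) \;\overset{?}{=}\; \tilde{O}\!\Big(\sqrt{\mathrm{comp}(\mathcal{F}_{m^*})\,T}\Big).
\]
```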
Related papers
- Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical [66.57396042747706]
Complementary-label learning is a weakly supervised learning problem.
We propose a consistent approach that does not rely on the uniform distribution assumption.
We find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems.
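As a concrete illustration of that reduction, here is a minimal sketch (our own naming, not the paper's code): an example with complementary label k is a certified negative for class k and unlabeled for every other class.

```python
import numpy as np

def to_negative_unlabeled(X, comp_labels, num_classes):
    """Decompose a complementary-label dataset into one negative-unlabeled
    binary problem per class (sketch of the reduction, not the paper's code).

    A complementary label k means "the true class is NOT k", so such an
    example is a known negative for class k and unlabeled elsewhere.
    """
    problems = []
    for k in range(num_classes):
        is_negative = (comp_labels == k)   # certified negatives for class k
        y = np.where(is_negative, 0, -1)   # 0 = negative, -1 = unlabeled
        problems.append((X, y))
    return problems
```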
arXiv Detail & Related papers (2023-11-27T02:59:17Z) - Discrete Choice Multi-Armed Bandits [0.0]
This paper establishes a connection between a category of discrete choice models and the realms of online learning and multi-armed bandit algorithms.
We furnish sublinear regret bounds for a comprehensive family of algorithms, encompassing the Exp3 algorithm as a particular case.
We introduce a novel family of adversarial multi-armed bandit algorithms, drawing inspiration from generalized nested logit models.
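Since Exp3 is recovered as a particular case, a textbook Exp3 implementation may help fix ideas; this is the standard algorithm, not the paper's nested-logit generalization.

```python
import numpy as np

def exp3(K, T, reward_fn, rng=None):
    """Textbook Exp3 for adversarial multi-armed bandits (rewards in [0, 1]).
    Standard algorithm only; the paper generalizes it via nested logit models."""
    rng = rng or np.random.default_rng()
    eta = np.sqrt(2.0 * np.log(K) / (K * T))   # usual horizon-tuned step size
    logw = np.zeros(K)                         # log-weights, for stability
    for t in range(T):
        probs = np.exp(logw - logw.max())
        probs /= probs.sum()
        arm = int(rng.choice(K, p=probs))
        loss = 1.0 - reward_fn(t, arm)         # convert reward to loss
        logw[arm] -= eta * loss / probs[arm]   # importance-weighted update
    return logw
```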
arXiv Detail & Related papers (2023-10-01T03:41:04Z) - Momentum Contrastive Pre-training for Question Answering [54.57078061878619]
MCROSS introduces a momentum contrastive learning framework to align the answer probability between cloze-like and natural query-passage sample pairs.
Our method achieves noticeable improvement compared with all baselines in both supervised and zero-shot scenarios.
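MCROSS's exact architecture is not reproduced here; the following is a generic momentum-contrastive skeleton (a MoCo-style EMA key encoder plus InfoNCE logits), with the cloze/natural pairing only indicated in comments.

```python
import numpy as np

def momentum_update(query_params, key_params, m=0.999):
    """EMA update of the key encoder: the 'momentum' in momentum contrast.
    Generic MoCo-style mechanics; MCROSS's specifics differ."""
    return {k: m * key_params[k] + (1 - m) * query_params[k] for k in key_params}

def info_nce_logits(q, keys, tau=0.07):
    """Contrastive logits: q is a (d,) query embedding, keys an (N, d) bank
    whose row 0 is the positive (e.g., the paired cloze/natural sample);
    cross-entropy against index 0 gives the InfoNCE loss."""
    q = q / np.linalg.norm(q)
    keys = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    return keys @ q / tau
```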
arXiv Detail & Related papers (2022-12-12T08:28:22Z) - Fast Feature Selection with Fairness Constraints [49.142308856826396]
We study the fundamental problem of selecting optimal features for model construction.
This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants.
We extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions.
The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work.
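For reference, here is plain serial Orthogonal Matching Pursuit for feature selection; the paper's contribution is moving this template into the adaptive query model for parallel speedups, which the sketch below does not attempt.

```python
import numpy as np

def omp_select(X, y, k):
    """Serial Orthogonal Matching Pursuit for selecting k features
    (reference template only, not the paper's parallel variant)."""
    n, d = X.shape
    selected, residual = [], y.astype(float).copy()
    for _ in range(k):
        scores = np.abs(X.T @ residual)   # correlation with current residual
        scores[selected] = -np.inf        # skip already-chosen features
        j = int(np.argmax(scores))
        selected.append(j)
        Xs = X[:, selected]
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        residual = y - Xs @ coef          # re-fit, update residual
    return selected
```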
arXiv Detail & Related papers (2022-02-28T12:26:47Z) - Universal and data-adaptive algorithms for model selection in linear contextual bandits [52.47796554359261]
We consider the simplest non-trivial instance of model-selection: distinguishing a simple multi-armed bandit problem from a linear contextual bandit problem.
We introduce new algorithms that explore in a data-adaptive manner and provide guarantees of the form $\mathcal{O}(d^{\alpha} T^{1-\alpha})$.
Our approach extends to model selection among nested linear contextual bandits under some additional assumptions.
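Reading the bound, the exponent $\alpha$ trades dimension dependence against horizon dependence; two illustrative points (our reading of the displayed family, not additional claims from the paper):

```latex
\[
  \mathrm{Reg}(T) \;=\; \mathcal{O}\big(d^{\alpha}\,T^{1-\alpha}\big):
  \qquad \alpha = \tfrac{1}{2} \Rightarrow \mathcal{O}\big(\sqrt{dT}\big),
  \qquad \alpha = \tfrac{1}{3} \Rightarrow \mathcal{O}\big(d^{1/3}\,T^{2/3}\big).
\]
```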
arXiv Detail & Related papers (2021-11-08T18:05:35Z) - Near Instance Optimal Model Selection for Pure Exploration Linear Bandits [20.67688737534517]
We study the model selection problem in the pure exploration linear bandit setting.
Our goal is to automatically adapt to the instance-dependent complexity measure of the smallest hypothesis class.
Our algorithms define a new optimization problem based on experimental design.
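A representative experimental-design objective from the pure-exploration linear bandit literature is shown below; the notation is ours, and the paper builds its own model-selection variant on top of this kind of program.

```latex
% Minimize worst-case estimation variance over directions of interest,
% by choosing a sampling distribution \lambda over the arm set \mathcal{X}:
\[
  \min_{\lambda \in \Delta(\mathcal{X})} \ \max_{x, x' \in \mathcal{X}}
  \ \| x - x' \|^{2}_{A(\lambda)^{-1}},
  \qquad A(\lambda) = \sum_{x \in \mathcal{X}} \lambda_x \, x x^{\top}.
\]
```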
arXiv Detail & Related papers (2021-09-10T22:56:58Z) - Model Selection for Generic Contextual Bandits [20.207989166682832]
We propose a refinement-based algorithm called Adaptive Contextual Bandit (ACB).
We prove that this algorithm is adaptive, i.e., its regret rate order-wise matches that of any provable contextual bandit algorithm that requires knowledge of the true model class.
We also show that a much simpler explore-then-commit (ETC) style algorithm obtains a similar regret bound, despite not knowing the true model class.
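For contrast with ACB, here is the bare explore-then-commit skeleton; the paper's ETC-style method layers model selection on top of this template, which the sketch omits.

```python
import numpy as np

def explore_then_commit(K, T, n_explore, pull):
    """Plain explore-then-commit for K arms: pull each arm n_explore times,
    then commit to the empirical best for the remaining rounds.
    Assumes K * n_explore <= T; `pull(arm)` returns a stochastic reward."""
    rewards = np.zeros(K)
    for arm in range(K):
        for _ in range(n_explore):
            rewards[arm] += pull(arm)
    best = int(np.argmax(rewards / n_explore))
    total = rewards.sum()
    for _ in range(T - K * n_explore):
        total += pull(best)               # commit phase
    return best, total
```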
arXiv Detail & Related papers (2021-07-07T19:35:31Z) - Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective [7.318831153179727]
We study the feasibility of similar guarantees for cumulative regret minimization in the contextual bandit setting.
Our algorithm is based on a novel misspecification test, and our analysis demonstrates the benefits of using model selection for reward estimation.
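The paper's misspecification test is not reproduced here; as a generic illustration of the idea, one can compare the regression error of the small model class against a larger one and flag the small class when the gap exceeds a calibrated threshold.

```python
import numpy as np

def looks_misspecified(y, preds_small, preds_large, threshold):
    """Generic misspecification check (illustrative only, not the paper's
    test): flag the small class if its squared error exceeds the large
    class's by more than a calibrated threshold."""
    err_small = np.mean((y - preds_small) ** 2)
    err_large = np.mean((y - preds_large) ** 2)
    return err_small - err_large > threshold
```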
arXiv Detail & Related papers (2021-06-11T16:08:03Z) - Online Active Model Selection for Pre-trained Classifiers [72.84853880948894]
We design an online selective sampling approach that actively selects informative examples to label and outputs the best model with high probability at any round.
Our algorithm can be used for online prediction tasks over both adversarial and stochastic streams.
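As a generic illustration of selective sampling (not the paper's algorithm, which uses calibrated query probabilities and confidence bounds), one can query labels only where the candidate pre-trained models disagree:

```python
import numpy as np

def selective_sampling(models, stream):
    """Disagreement-based selective sampling sketch. `models` is a list of
    pre-trained classifiers; `stream` yields (x, get_label) pairs, where
    calling get_label() spends one label query. Returns the index of the
    model with the fewest observed mistakes."""
    mistakes = np.zeros(len(models))
    for x, get_label in stream:
        votes = np.array([m(x) for m in models])
        if len(set(votes)) > 1:           # models disagree -> query label
            y = get_label()
            mistakes += (votes != y)
    return int(np.argmin(mistakes))
```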
arXiv Detail & Related papers (2020-10-19T19:53:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.