Near Instance Optimal Model Selection for Pure Exploration Linear Bandits
- URL: http://arxiv.org/abs/2109.05131v1
- Date: Fri, 10 Sep 2021 22:56:58 GMT
- Title: Near Instance Optimal Model Selection for Pure Exploration Linear Bandits
- Authors: Yinglun Zhu, Julian Katz-Samuels, Robert Nowak
- Abstract summary: We study the model selection problem in the pure exploration linear bandit setting.
Our goal is to automatically adapt to the instance-dependent complexity measure of the smallest hypothesis class.
Our algorithms define a new optimization problem based on experimental design.
- Score: 20.67688737534517
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The model selection problem in the pure exploration linear bandit setting is
introduced and studied in both the fixed confidence and fixed budget settings.
The model selection problem considers a nested sequence of hypothesis classes
of increasing complexities. Our goal is to automatically adapt to the
instance-dependent complexity measure of the smallest hypothesis class
containing the true model, rather than suffering from the complexity measure
related to the largest hypothesis class. We provide evidence showing that a
standard doubling trick over dimension fails to achieve the optimal
instance-dependent sample complexity. Our algorithms define a new optimization
problem based on experimental design that leverages the geometry of the action
set to efficiently identify a near-optimal hypothesis class. Our fixed budget
algorithm uses a novel application of a selection-validation trick in bandits.
This provides a new method for the understudied fixed budget setting in linear
bandits (even without the added challenge of model selection). We further
generalize the model selection problem to the misspecified regime, adapting our
algorithms in both fixed confidence and fixed budget settings.
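The experimental-design step mentioned in the abstract builds on classical optimal design for linear bandits. As illustrative background only (this is a standard G-optimal design solved by Frank-Wolfe iterations, not the paper's own optimization problem), a minimal sketch of computing sampling weights over an action set that minimize the worst-case directional uncertainty:

```python
import numpy as np

def g_optimal_design(X, iters=2000):
    """Frank-Wolfe iterations for the G-optimal design
    min_lambda max_x ||x||^2_{A(lambda)^{-1}},  A(lambda) = sum_i lambda_i x_i x_i^T.
    Returns design weights lambda over the rows (arms) of X."""
    n, d = X.shape
    lam = np.full(n, 1.0 / n)              # uniform initial design
    for t in range(iters):
        A = X.T @ (lam[:, None] * X)       # information matrix A(lambda)
        Ainv = np.linalg.pinv(A)
        # leverage g_i = x_i^T A(lambda)^{-1} x_i for every arm
        g = np.einsum('ij,jk,ik->i', X, Ainv, X)
        i = np.argmax(g)                   # most uncertain direction
        step = 1.0 / (t + 2)               # standard Frank-Wolfe step size
        lam = (1 - step) * lam
        lam[i] += step
    return lam
```

By the Kiefer-Wolfowitz equivalence theorem, the optimal value of the max leverage is exactly $d$, which gives a convenient sanity check on convergence.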
Related papers
- Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
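The likelihood-ratio principle summarized above can be illustrated in its simplest fixed-sample form (a sketch only, not the paper's sequential construction): keep every parameter whose log-likelihood is within $\log(1/\alpha)$ of the maximum-likelihood estimate.

```python
import numpy as np

def lr_confidence_set(xs, alpha=0.05, grid=None):
    """Likelihood-ratio confidence set for a Bernoulli mean:
    retain every p on a grid whose log-likelihood is within
    log(1/alpha) of the maximum over the grid."""
    xs = np.asarray(xs, dtype=float)
    if grid is None:
        grid = np.linspace(1e-4, 1 - 1e-4, 999)
    s, n = xs.sum(), len(xs)
    loglik = s * np.log(grid) + (n - s) * np.log(1 - grid)
    return grid[loglik >= loglik.max() - np.log(1.0 / alpha)]
```

With 70 successes in 100 trials, the returned set concentrates around 0.7 and excludes clearly inconsistent means such as 0.3.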
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- Fast Feature Selection with Fairness Constraints [49.142308856826396]
We study the fundamental problem of selecting optimal features for model construction.
This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants.
We extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions.
The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work.
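For reference, the classical building block named above can be sketched as follows (standard Orthogonal Matching Pursuit, not the paper's adaptive-query, fairness-constrained variant): greedily pick the column most correlated with the residual, then refit by least squares.

```python
import numpy as np

def omp_select(A, y, k):
    """Orthogonal Matching Pursuit: greedily select k columns of A
    that best explain y, refitting on the chosen support each step."""
    residual = y.copy()
    support = []
    for _ in range(k):
        # correlation of every column with the current residual
        scores = np.abs(A.T @ residual)
        scores[support] = -np.inf        # never re-pick a column
        support.append(int(np.argmax(scores)))
        # least-squares refit restricted to the selected columns
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    return support, coef
```

When the columns are orthonormal and the target is exactly sparse, this recovers the true support in k steps.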
arXiv Detail & Related papers (2022-02-28T12:26:47Z)
- Universal and data-adaptive algorithms for model selection in linear contextual bandits [52.47796554359261]
We consider the simplest non-trivial instance of model-selection: distinguishing a simple multi-armed bandit problem from a linear contextual bandit problem.
We introduce new algorithms that explore in a data-adaptive manner and provide guarantees of the form $\mathcal{O}(d^{\alpha} T^{1-\alpha})$.
Our approach extends to model selection among nested linear contextual bandits under some additional assumptions.
arXiv Detail & Related papers (2021-11-08T18:05:35Z)
- Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective [7.318831153179727]
We study the feasibility of similar guarantees for cumulative regret minimization in the contextual bandit setting.
Our algorithm is based on a novel misspecification test, and our analysis demonstrates the benefits of using model selection for reward estimation.
arXiv Detail & Related papers (2021-06-11T16:08:03Z)
- Pareto Optimal Model Selection in Linear Bandits [15.85873315624132]
We study a model selection problem in the linear bandit setting, where the learner must adapt to the dimension of the optimal hypothesis class on the fly.
In this paper, we first establish a lower bound showing that, even with a fixed action set, adaptation to the unknown intrinsic dimension $d_star$ comes at a cost.
arXiv Detail & Related papers (2021-02-12T16:02:06Z)
- On Statistical Efficiency in Learning [37.08000833961712]
We address the challenge of model selection to strike a balance between model fitting and model complexity.
We propose an online algorithm that sequentially expands the model complexity to enhance selection stability and reduce cost.
Experimental studies show that the proposed method has desirable predictive power and significantly less computational cost than some popular methods.
arXiv Detail & Related papers (2020-12-24T16:08:29Z)
- Open Problem: Model Selection for Contextual Bandits [82.57505650713496]
We ask whether similar guarantees are possible for contextual bandit learning.
In statistical learning, algorithms for model selection allow the learner to adapt to the complexity of the best hypothesis class.
arXiv Detail & Related papers (2020-06-19T03:00:01Z)
- Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label.
Most existing methods are elaborately designed as constrained optimizations that must be solved in specific ways, making their computational complexity a bottleneck for scaling up to big data.
This paper proposes a novel classifier framework that is flexible in both the model and the optimization algorithm.
arXiv Detail & Related papers (2020-02-19T08:35:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.