Best of Both Worlds Model Selection
- URL: http://arxiv.org/abs/2206.14912v1
- Date: Wed, 29 Jun 2022 20:57:30 GMT
- Title: Best of Both Worlds Model Selection
- Authors: Aldo Pacchiano, Christoph Dann, Claudio Gentile
- Abstract summary: We study the problem of model selection in bandit scenarios in the presence of nested policy classes.
Our approach requires that each base learner comes with a candidate regret bound that may or may not hold.
These are the first theoretical results that achieve best of both worlds (stochastic and adversarial) guarantees while performing model selection in (linear) bandit scenarios.
- Score: 39.211071446838474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of model selection in bandit scenarios in the presence
of nested policy classes, with the goal of obtaining simultaneous adversarial
and stochastic ("best of both worlds") high-probability regret guarantees. Our
approach requires that each base learner comes with a candidate regret bound
that may or may not hold, while our meta algorithm plays each base learner
according to a schedule that keeps the base learner's candidate regret bounds
balanced until they are detected to violate their guarantees. We develop
careful mis-specification tests specifically designed to blend the above model
selection criterion with the ability to leverage the (potentially benign)
nature of the environment. We recover the model selection guarantees of the
CORRAL algorithm for adversarial environments, but with the additional benefit
of achieving high probability regret bounds, specifically in the case of nested
adversarial linear bandits. More importantly, our model selection results also
hold simultaneously in stochastic environments under gap assumptions. These are
the first theoretical results that achieve best of both worlds (stochastic and
adversarial) guarantees while performing model selection in (linear) bandit
scenarios.
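To make the mechanism above concrete, the following Python sketch renders the balancing schedule and a generic misspecification test under simplifying assumptions. The `base_learners`, `candidate_bounds`, and `env_step` interfaces are hypothetical, and the Hoeffding-style confidence radius stands in for the paper's carefully designed tests; this is an illustration, not the paper's algorithm.

```python
import math

def balanced_meta(base_learners, candidate_bounds, env_step, horizon, delta=0.05):
    """Sketch of regret balancing with misspecification tests.

    base_learners:    objects with .act() -> action and .update(reward)
    candidate_bounds: candidate_bounds[i](n) -> putative regret of learner i
                      after n plays (may or may not actually hold)
    env_step:         action -> reward in [0, 1] (hypothetical interface)
    """
    k = len(base_learners)
    active = set(range(k))
    plays = [0] * k      # times each learner was played
    total = [0.0] * k    # cumulative reward of each learner

    def radius(j):       # generic Hoeffding-style confidence width
        return math.sqrt(2 * math.log(k * horizon / delta) / max(plays[j], 1))

    for _ in range(horizon):
        # Balancing schedule: play the active learner whose candidate
        # regret bound is currently smallest, keeping the bounds level.
        i = min(active, key=lambda j: candidate_bounds[j](plays[j] + 1))
        reward = env_step(base_learners[i].act())
        base_learners[i].update(reward)
        plays[i] += 1
        total[i] += reward

        # Misspecification test: if learner j's average reward, even after
        # crediting back its claimed regret, is confidently below the best
        # lower estimate among active learners, its candidate bound cannot
        # hold -- eliminate it.
        if all(plays[j] > 0 for j in active):
            best_lower = max(total[j] / plays[j] - radius(j) for j in active)
            for j in sorted(active):
                upper = (total[j] + candidate_bounds[j](plays[j])) / plays[j] + radius(j)
                if upper < best_lower and len(active) > 1:
                    active.discard(j)
    return total, plays
```

In the paper itself the tests are sharper and are blended with the ability to exploit a benign (stochastic) environment, which is what yields the best-of-both-worlds guarantee.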
Related papers
- Beyond Expectations: Learning with Stochastic Dominance Made Practical [88.06211893690964]
Stochastic dominance models risk-averse preferences for decision making with uncertain outcomes.
Despite its theoretical appeal, applications of stochastic dominance in machine learning have been scarce.
We first generalize the dominance concept to enable feasible comparisons between any arbitrary pair of random variables.
We then develop a simple and efficient approach for finding the optimal solution in terms of dominance.
arXiv Detail & Related papers (2024-02-05T03:21:23Z)
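For context on the entry above: the classical notion being generalized is first-order stochastic dominance, which is straightforward to check empirically between two samples. A minimal sketch follows; the function `fsd_dominates` and its tolerance are illustrative, not from the paper, whose generalized, optimizable criterion is more involved.

```python
import numpy as np

def fsd_dominates(x, y, tol=1e-12):
    """Empirical first-order stochastic dominance: does sample x dominate
    sample y, i.e. is the empirical CDF of x below that of y everywhere?"""
    grid = np.union1d(x, y)                       # all observed thresholds
    cdf_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    cdf_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return bool(np.all(cdf_x <= cdf_y + tol))

# Example: a sample shifted upward typically dominates the original empirically.
rng = np.random.default_rng(0)
print(fsd_dominates(rng.normal(1.0, 1.0, 5000), rng.normal(0.0, 1.0, 5000)))
```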
- Probabilistic Reach-Avoid for Bayesian Neural Networks [71.67052234622781]
We show that an optimal synthesis algorithm can provide more than a four-fold increase in the number of certifiable states.
The algorithm is able to provide more than a three-fold increase in the average guaranteed reach-avoid probability.
arXiv Detail & Related papers (2023-10-03T10:52:21Z)
- When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have uses in safety-sensitive settings such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
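As a toy rendering of the maximum-likelihood idea in the entry above, the sketch below fits a linear reward to one-step demonstrations under a softmax-rational expert model. The interfaces (`features`, `demos`) are hypothetical, and the paper's full framework additionally estimates environment dynamics and handles multi-step trajectories, which this deliberately omits.

```python
import numpy as np

def fit_reward_mle(features, demos, steps=500, lr=0.1):
    """Toy MLE reward recovery, one-step setting.

    features[s]: array of shape (n_actions, d), feature vector per action
    demos:       list of (state, action) pairs from the expert
    Expert model: pi(a|s) proportional to exp(theta . features[s][a]).
    """
    d = features[0].shape[1]
    theta = np.zeros(d)
    for _ in range(steps):
        grad = np.zeros(d)
        for s, a in demos:
            phi = features[s]                       # (n_actions, d)
            logits = phi @ theta
            pi = np.exp(logits - logits.max())      # stable softmax policy
            pi /= pi.sum()
            # grad log pi(a|s) = phi[a] - E_{a'~pi}[phi[a']]
            grad += phi[a] - pi @ phi
        theta += lr * grad / len(demos)
    return theta
```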
- Bi-objective Ranking and Selection Using Stochastic Kriging [0.0]
We consider bi-objective ranking and selection problems in which the two objective outcomes have been observed with uncertainty.
We propose a novel Bayesian bi-objective ranking and selection method that sequentially allocates extra samples to competitive solutions.
Experimental results show that the proposed method outperforms the standard allocation method, as well as a well-known state-of-the-art algorithm.
arXiv Detail & Related papers (2022-09-05T23:51:07Z)
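The sequential-allocation idea in the entry above can be caricatured without kriging: keep resampling whichever solution is still plausibly Pareto-optimal and has the noisiest estimate. The sketch below does that under assumed interfaces (`sample`, both objectives minimized); it is not the paper's stochastic-kriging method.

```python
import numpy as np

def sequential_biobjective_rs(sample, n_solutions, n0=10, budget=200, z=2.0):
    """Illustrative sequential allocation for bi-objective ranking & selection.

    sample(i) -> length-2 array: one noisy observation of solution i.
    """
    obs = [[sample(i) for _ in range(n0)] for i in range(n_solutions)]
    for _ in range(budget):
        means = np.array([np.mean(o, axis=0) for o in obs])
        sems = np.array([np.std(o, axis=0, ddof=1) / np.sqrt(len(o)) for o in obs])
        # Solution i stays "competitive" unless some j beats it on both
        # objectives by more than z standard errors.
        competitive = [
            i for i in range(n_solutions)
            if not any(
                np.all(means[j] + z * sems[j] < means[i] - z * sems[i])
                for j in range(n_solutions) if j != i
            )
        ]
        # Spend the extra replication where the estimate is noisiest.
        target = max(competitive, key=lambda i: sems[i].max())
        obs[target].append(sample(target))
    return obs
```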
- Online Learning with Knapsacks: the Best of Both Worlds [54.28273783164608]
We consider online learning problems in which a decision maker wants to maximize their expected reward without violating a finite set of $m$ resource constraints.
Our framework allows the decision maker to flexibly handle its reward and cost functions.
arXiv Detail & Related papers (2022-02-28T12:10:48Z)
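A common recipe for this problem family, sketched below, is primal-dual: run an adversarial bandit learner on Lagrangian payoffs (reward minus lambda times cost) while a dual variable lambda tracks how far spending runs ahead of the per-round budget. The `pull` interface, the single-resource restriction, and the [0, 1] ranges are assumptions; this is not necessarily the paper's exact algorithm.

```python
import math
import random

def primal_dual_knapsack_bandit(n_arms, pull, budget, horizon, lr=0.05):
    """Sketch: EXP3-style learner on Lagrangian payoffs with dual ascent.

    pull(arm) -> (reward, cost), both assumed in [0, 1] (illustrative).
    Stops once the remaining budget may not cover a worst-case cost.
    """
    rho = budget / horizon              # per-round budget
    lam = 0.0                           # dual price of the resource
    weights = [1.0] * n_arms
    spent, earned = 0.0, 0.0

    for _ in range(horizon):
        if spent + 1.0 > budget:        # worst-case next cost is 1
            break
        z = sum(weights)
        probs = [w / z for w in weights]
        arm = random.choices(range(n_arms), weights=probs)[0]
        reward, cost = pull(arm)
        earned += reward
        spent += cost

        # Importance-weighted Lagrangian payoff for the played arm only.
        payoff = (reward - lam * cost) / probs[arm]
        weights[arm] *= math.exp(lr * payoff / n_arms)

        # Dual ascent: raise the price when this round overspent rho.
        lam = max(0.0, lam + lr * (cost - rho))
    return earned, spent
```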
- Model Selection in Batch Policy Optimization [88.52887493684078]
We study the problem of model selection in batch policy optimization.
We identify three sources of error that any model selection algorithm should optimally trade off in order to be competitive.
arXiv Detail & Related papers (2021-12-23T02:31:50Z)
- Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective [7.318831153179727]
We study the feasibility of similar guarantees for cumulative regret minimization in the contextual bandit setting.
Our algorithm is based on a novel misspecification test, and our analysis demonstrates the benefits of using model selection for reward estimation.
arXiv Detail & Related papers (2021-06-11T16:08:03Z)
- Fair Classification with Adversarial Perturbations [35.030329189029246]
We study fair classification in the presence of an omniscient adversary that, given an $\eta$, is allowed to choose an arbitrary $\eta$-fraction of the training samples and arbitrarily perturb their protected attributes.
Our main contribution is an optimization framework to learn fair classifiers in this adversarial setting that comes with provable guarantees on accuracy and fairness.
We prove near-tightness of our framework's guarantees for natural hypothesis classes: no algorithm can have significantly better accuracy and any algorithm with better fairness must have lower accuracy.
arXiv Detail & Related papers (2021-06-10T17:56:59Z)
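The threat model in the entry above is concrete enough to compute with: the sketch below brute-forces the worst demographic-parity gap an adversary can induce by flipping the protected attribute of at most an eta-fraction of samples. The function `worst_case_dp_gap` illustrates the perturbation model only, not the paper's learning framework.

```python
import numpy as np

def worst_case_dp_gap(y_pred, group, eta):
    """Worst-case demographic-parity gap under eta-fraction attribute flips.

    Brute-forces the adversary's best move mix: shift i positively-predicted
    samples from a source group to a target group and j negatively-predicted
    samples the other way, for both choices of direction.
    """
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    k = int(eta * len(y_pred))                     # adversary's flip budget
    pos = [int(((group == g) & (y_pred == 1)).sum()) for g in (0, 1)]
    tot = [int((group == g).sum()) for g in (0, 1)]

    best = abs(pos[0] / tot[0] - pos[1] / tot[1]) if min(tot) else 0.0
    for src, dst in ((0, 1), (1, 0)):
        neg_dst = tot[dst] - pos[dst]              # negatives available in dst
        for i in range(min(k, pos[src]) + 1):      # positives src -> dst
            for j in range(min(k - i, neg_dst) + 1):   # negatives dst -> src
                p = [pos[0], pos[1]]
                t = [tot[0], tot[1]]
                p[src] -= i; t[src] += j - i
                p[dst] += i; t[dst] += i - j
                if min(t) == 0:
                    continue                       # gap undefined: skip
                best = max(best, abs(p[0] / t[0] - p[1] / t[1]))
    return best
```

This makes the paper's accuracy-fairness tension tangible: even a small flip budget can inflate the apparent parity gap, which is why its guarantees must hold against the worst such perturbation.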