The Simulator: Understanding Adaptive Sampling in the
Moderate-Confidence Regime
- URL: http://arxiv.org/abs/1702.05186v2
- Date: Sun, 23 Apr 2023 15:48:50 GMT
- Title: The Simulator: Understanding Adaptive Sampling in the
Moderate-Confidence Regime
- Authors: Max Simchowitz and Kevin Jamieson and Benjamin Recht
- Abstract summary: We propose a novel technique for analyzing adaptive sampling called the Simulator.
We prove the first instance-based lower bounds for the top-k problem which incorporate the appropriate log-factors.
Our new analysis inspires a simple and near-optimal algorithm for best-arm and top-k identification, the first practical algorithm of its kind for the latter problem.
- Score: 52.38455827779212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel technique for analyzing adaptive sampling called the {\em
Simulator}. Our approach differs from the existing methods by considering not
how much information could be gathered by any fixed sampling strategy, but how
difficult it is to distinguish a good sampling strategy from a bad one given
the limited amount of data collected up to any given time. This change of
perspective allows us to match the strength of both Fano and change-of-measure
techniques, without succumbing to the limitations of either method. For
concreteness, we apply our techniques to a structured multi-arm bandit problem
in the fixed-confidence pure exploration setting, where we show that the
constraints on the means imply a substantial gap between the
moderate-confidence sample complexity, and the asymptotic sample complexity as
$\delta \to 0$ found in the literature. We also prove the first instance-based
lower bounds for the top-k problem which incorporate the appropriate
log-factors. Moreover, our lower bounds zero in on the number of times each
\emph{individual} arm needs to be pulled, uncovering new phenomena which are
drowned out in the aggregate sample complexity. Our new analysis inspires a
simple and near-optimal algorithm for the best-arm and top-k identification,
the first {\em practical} algorithm of its kind for the latter problem which
removes extraneous log factors, and outperforms the state-of-the-art in
experiments.
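To make the fixed-confidence pure-exploration setting concrete, here is a minimal successive-elimination sketch in Python. It illustrates the problem the abstract studies and where the contested log factors enter the confidence radius; it is not the near-optimal algorithm proposed in the paper, and the Hoeffding-style radius below is an assumption chosen for illustration.

```python
# A minimal sketch of fixed-confidence best-arm identification via
# successive elimination -- illustrative of the problem setting only,
# not the algorithm proposed in the paper. The confidence radius is a
# standard Hoeffding-style choice; its exact log factors are an
# assumption for illustration.
import math
import random


def successive_elimination(arms, delta=0.05, max_rounds=10_000):
    """arms: list of callables, each returning a reward sample in [0, 1].
    Returns the index of the arm believed best with confidence 1 - delta."""
    n = len(arms)
    active = set(range(n))
    pulls = [0] * n
    means = [0.0] * n
    for t in range(1, max_rounds + 1):
        # Pull every active arm once and update its empirical mean.
        for i in active:
            r = arms[i]()
            pulls[i] += 1
            means[i] += (r - means[i]) / pulls[i]
        # Hoeffding-style confidence radius with a union bound over
        # arms and rounds (the log factors the paper scrutinizes).
        rad = math.sqrt(math.log(4 * n * t * t / delta) / (2 * t))
        leader = max(active, key=lambda i: means[i])
        # Eliminate arms whose upper bound falls below the leader's lower bound.
        active = {i for i in active if means[i] + rad >= means[leader] - rad}
        if len(active) == 1:
            return active.pop()
    return max(active, key=lambda i: means[i])


# Example: Bernoulli arms with means 0.5, 0.6, 0.8.
random.seed(0)
arms = [lambda p=p: float(random.random() < p) for p in (0.5, 0.6, 0.8)]
print(successive_elimination(arms))  # expected: 2
```

Extending the sketch to top-k identification amounts to tracking the empirical top k arms rather than a single leader and eliminating arms once they are confidently outside (or inside) that set.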
Related papers
- Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms at any round based on the observed empirical gaps between the rewards of the various arms (a generic fixed-budget sketch in this spirit appears after this list).
arXiv Detail & Related papers (2023-12-19T13:17:43Z)
- Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
- Learning Rate Free Sampling in Constrained Domains [21.853333421463603]
We introduce a suite of new particle-based algorithms for sampling in constrained domains which are entirely learning rate free.
We demonstrate the performance of our algorithms on a range of numerical examples, including sampling from targets on the simplex.
arXiv Detail & Related papers (2023-05-24T09:31:18Z)
- Navigating to the Best Policy in Markov Decision Processes [68.8204255655161]
We investigate the active pure exploration problem in Markov Decision Processes.
The agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible.
arXiv Detail & Related papers (2021-06-05T09:16:28Z)
- A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits [0.0]
We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance.
This strategy, called Exploration-Biased Sampling, is not only asymptotically optimal: we also prove non-asymptotic bounds that hold with high probability.
arXiv Detail & Related papers (2021-05-27T07:42:49Z)
- Improved Algorithms for Agnostic Pool-based Active Classification [20.12178157010804]
We consider active learning for binary classification in the agnostic pool-based setting.
Our algorithm is superior to state-of-the-art active learning algorithms on image classification datasets.
arXiv Detail & Related papers (2021-05-13T18:24:30Z)
- Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
- An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits [34.06611065493047]
This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings.
We provide an algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms.
We also propose the first algorithm for linear bandits in the fixed budget setting.
arXiv Detail & Related papers (2020-06-21T00:56:33Z)
- Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity [58.70807593332932]
We study the oracle complexity of gradient-based methods for stochastic approximation problems.
We focus on instance-dependent complexity instead of worst case complexity.
Our proposed algorithm and its analysis provide a theoretical justification for the success of moment estimation.
arXiv Detail & Related papers (2020-06-08T09:25:47Z)
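The sred entry above concerns the fixed-budget setting, where arms are rejected based on observed empirical gaps. For context, here is a rough Python sketch of the classical successive-rejects template that this family of algorithms builds on; the phase lengths and the one-rejection-per-phase rule are assumptions of this sketch, not the cited paper's algorithm (sred rejects adaptively at any round).

```python
# A rough fixed-budget sketch in the spirit of the sred entry above:
# arms are rejected between phases based on observed empirical gaps.
# Phase lengths and the one-arm-per-phase rejection rule follow the
# classical successive-rejects template and are assumptions here.
import random


def successive_rejects(arms, budget):
    """arms: list of callables returning rewards; budget: total pulls.
    Returns the index of the single surviving arm."""
    n = len(arms)
    # Normalizing constant for the phase lengths: 1/2 + sum_{i=2}^{n} 1/i.
    log_bar = 0.5 + sum(1.0 / i for i in range(2, n + 1))
    active = list(range(n))
    pulls = [0] * n
    means = [0.0] * n
    prev = 0
    for k in range(1, n):
        # Cumulative pulls per surviving arm by the end of phase k.
        n_k = int((budget - n) / (log_bar * (n + 1 - k)))
        for i in active:
            for _ in range(max(0, n_k - prev)):
                r = arms[i]()
                pulls[i] += 1
                means[i] += (r - means[i]) / pulls[i]
        prev = n_k
        # Reject the arm with the largest empirical gap to the leader,
        # i.e., the empirically worst arm.
        active.remove(min(active, key=lambda i: means[i]))
    return active[0]


# Example: Bernoulli arms with means 0.3, 0.5, 0.7, 0.9.
random.seed(1)
arms = [lambda p=p: float(random.random() < p) for p in (0.3, 0.5, 0.7, 0.9)]
print(successive_rejects(arms, budget=4000))  # expected: 3
```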
This list is automatically generated from the titles and abstracts of the papers on this site.