Dual-Directed Algorithm Design for Efficient Pure Exploration
- URL: http://arxiv.org/abs/2310.19319v2
- Date: Wed, 11 Dec 2024 14:10:25 GMT
- Title: Dual-Directed Algorithm Design for Efficient Pure Exploration
- Authors: Chao Qin, Wei You,
- Abstract summary: We consider pure-exploration problems in the context of sequential adaptive experiments with a finite set of alternatives.
We formulate the problem complexity measure as a maximin optimization problem for the static fixed-budget, fixed-confidence, and posterior convergence rate settings.
Our algorithm attains optimality in $varepsilon$-best-arm identification (or ranking and selection with a probability of good selection guarantee) and thresholding bandits.
- Score: 9.728332815218181
- License:
- Abstract: We consider pure-exploration problems in the context of stochastic sequential adaptive experiments with a finite set of alternatives. The central objective is to answer a query regarding the alternatives with high confidence while minimizing measurement efforts. One canonical example is identifying the best-performing alternative, a problem known as ranking and selection in simulation or best-arm identification in machine learning. We formulate the problem complexity measure as a maximin optimization problem for the static fixed-budget, fixed-confidence, and posterior convergence rate settings. By incorporating dual variables directly into the analysis, we derive necessary and sufficient conditions for an allocation's optimality. The introduction of dual variables allows us to sidestep the combinatorial complexity that arises when considering only primal variables. These optimality conditions enable the extension of the top-two algorithm design principle to more general pure-exploration problems. Moreover, our analysis yields a straightforward and effective information-directed selection rule that adaptively chooses from a candidate set based on the informational value of the candidates. We demonstrate the broad range of contexts in which our design principle can be implemented. In particular, when combined with information-directed selection, top-two Thompson sampling achieves asymptotic optimality in Gaussian best-arm identification, resolving a notable open question in the pure-exploration literature. Our algorithm attains optimality in $\varepsilon$-best-arm identification (or ranking and selection with a probability of good selection guarantee) and thresholding bandits. Our results provide a general principle for adapting Thompson sampling to general pure-exploration problems. Numerical experiments highlight the efficiency of our proposed algorithms compared to existing methods.
Related papers
- An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z) - Dynamic Incremental Optimization for Best Subset Selection [15.8362578568708]
Best subset selection is considered the gold standard for many learning problems.
An efficient subset-dual algorithm is developed based on the primal and dual problem structures.
arXiv Detail & Related papers (2024-02-04T02:26:40Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z) - Best Subset Selection with Efficient Primal-Dual Algorithm [24.568094642425837]
Best subset selection is considered the gold standard' for many learning problems.
In this paper, we investigate the dual forms of a family of $ell$-regularized problems.
An efficient primal-dual method has been developed based on the primal and dual problem structures.
arXiv Detail & Related papers (2022-07-05T14:07:52Z) - Information-Directed Selection for Top-Two Algorithms [13.339829037245963]
We consider the best-k-arm identification problem for multi-armed bandits.
The objective is to select the exact set of k arms with the highest mean rewards by sequentially allocating measurement effort.
We propose information-directed selection (IDS) that selects one of the top-two candidates based on a measure of information gain.
arXiv Detail & Related papers (2022-05-24T14:07:13Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Gamification of Pure Exploration for Linear Bandits [34.16123941778227]
We investigate an active pure-exploration setting, that includes best-arm identification, in the context of linear bandits.
Whileally optimal algorithms exist for standard multi-arm bandits, the existence of such algorithms for the best-arm identification in linear bandits has been elusive.
We design the first insightally optimal algorithm for fixed-confidence pure exploration in linear bandits.
arXiv Detail & Related papers (2020-07-02T08:20:35Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.