Nested Elimination: A Simple Algorithm for Best-Item Identification from
Choice-Based Feedback
- URL: http://arxiv.org/abs/2307.09295v1
- Date: Thu, 13 Jul 2023 05:05:30 GMT
- Title: Nested Elimination: A Simple Algorithm for Best-Item Identification from
Choice-Based Feedback
- Authors: Junwen Yang, Yifan Feng
- Abstract summary: We study the problem of best-item identification from choice-based feedback.
In this problem, a company sequentially and adaptively shows display sets to a population of customers and collects their choices.
We propose an elimination-based algorithm, namely Nested Elimination (NE), which is inspired by the nested structure implied by the information-theoretic lower bound.
- Score: 8.043586007062858
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study the problem of best-item identification from choice-based feedback.
In this problem, a company sequentially and adaptively shows display sets to a
population of customers and collects their choices. The objective is to
identify the most preferred item with the fewest samples and at a high
confidence level. We propose an elimination-based algorithm, namely Nested
Elimination (NE), which is inspired by the nested structure implied by the
information-theoretic lower bound. NE is simple in structure, easy to
implement, and has a strong theoretical guarantee for sample complexity.
Specifically, NE utilizes an innovative elimination criterion and circumvents
the need to solve any complex combinatorial optimization problem. We provide an
instance-specific and non-asymptotic bound on the expected sample complexity of
NE. We also show NE achieves high-order worst-case asymptotic optimality.
Finally, numerical experiments from both synthetic and real data corroborate
our theoretical findings.
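The abstract does not spell out NE's elimination criterion or how the nested display sets are constructed, so the snippet below is only a minimal sketch of a generic successive-elimination loop for best-item identification from choice-based feedback, assuming a multinomial-logit (MNL) choice model. The names (`mnl_choice`, `eliminate_best_item`), the batch size, and the confidence margin are illustrative placeholders, not the paper's algorithm.

```python
import numpy as np

def mnl_choice(display, utilities, rng):
    """Sample one customer's choice from the display set under an
    assumed multinomial-logit (MNL) choice model."""
    weights = np.exp(utilities[display])
    return display[rng.choice(len(display), p=weights / weights.sum())]

def eliminate_best_item(utilities, batch=500, conf=2.0, seed=0):
    """Generic successive-elimination loop (illustrative only, not the
    paper's Nested Elimination): repeatedly display the surviving items,
    accumulate choice counts, and drop any item whose count trails the
    current leader by more than a confidence margin."""
    rng = np.random.default_rng(seed)
    active = np.arange(len(utilities))
    counts = np.zeros(len(utilities))
    t = 0  # total number of choice samples collected
    while len(active) > 1:
        for _ in range(batch):
            counts[mnl_choice(active, utilities, rng)] += 1
            t += 1
        leader = active[np.argmax(counts[active])]
        # The margin grows like sqrt(t log t), while true count gaps grow
        # linearly in t, so inferior items are eventually eliminated.
        margin = conf * np.sqrt(t * np.log(t + 1))
        active = active[counts[leader] - counts[active] <= margin]
    return active[0], t

# Example: item 3 has the highest latent utility and should be returned.
best, n = eliminate_best_item(np.array([0.2, 0.5, 0.1, 1.4, 0.8]))
print(f"identified item {best} after {n} choice samples")
```

In this toy run, clearly inferior items are dropped early because their count gaps outgrow the margin quickly, while close competitors keep receiving samples; NE's nested structure and instance-specific guarantees go beyond this simple scheme.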
Related papers
- Optimal Decision Tree and Adaptive Submodular Ranking with Noisy Outcomes [9.321976218862542]
In pool-based active learning, the learner is given an unlabeled data set and aims to efficiently learn the unknown hypothesis by querying the labels of the data points.
This can be formulated as the classical Optimal Decision Tree (ODT) problem: Given a set of tests, a set of hypotheses, and an outcome for each pair of test and hypothesis, our objective is to find a low-cost testing procedure (i.e., decision tree) that identifies the true hypothesis.
In this work, we study a fundamental variant of the ODT problem in which some test outcomes are noisy.
arXiv Detail & Related papers (2023-12-23T21:47:50Z) - Dual-Directed Algorithm Design for Efficient Pure Exploration [9.728332815218181]
We consider pure-exploration problems in the context of sequential adaptive experiments with a finite set of alternatives.
We formulate the problem complexity measure as a maximin optimization problem for the static fixed-budget, fixed-confidence, and posterior convergence rate settings.
Our algorithm attains optimality in $\varepsilon$-best-arm identification (or ranking and selection with a probability of good selection guarantee) and thresholding bandits.
arXiv Detail & Related papers (2023-10-30T07:29:17Z) - Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback.
It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as its subroutines.
We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower bound asymptotically, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z) - Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z) - Online Model Selection for Reinforcement Learning with Function
Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Ranking a set of objects: a graph based least-square approach [70.7866286425868]
We consider the problem of ranking $N$ objects starting from a set of noisy pairwise comparisons provided by a crowd of equal workers.
We propose a class of non-adaptive ranking algorithms that rely on a least-squares intrinsic optimization criterion for the estimation of qualities.
arXiv Detail & Related papers (2020-02-26T16:19:09Z) - Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.