Learning to Select and Rank from Choice-Based Feedback: A Simple Nested Approach
- URL: http://arxiv.org/abs/2307.09295v2
- Date: Wed, 01 Jan 2025 15:20:17 GMT
- Title: Learning to Select and Rank from Choice-Based Feedback: A Simple Nested Approach
- Authors: Junwen Yang, Yifan Feng
- Abstract summary: We study a ranking and selection problem of learning from choice-based feedback with dynamic assortments. We present novel and simple algorithms for both learning goals.
- Score: 10.293894471295205
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study a ranking and selection problem of learning from choice-based feedback with dynamic assortments. In this problem, a company sequentially displays a set of items to a population of customers and collects their choices as feedback. The only information available about the underlying choice model is that the choice probabilities are consistent with some unknown true strict ranking over the items. The objective is to identify, with the fewest samples, the most preferred item or the full ranking over the items at a high confidence level. We present novel and simple algorithms for both learning goals. In the first subproblem regarding best-item identification, we introduce an elimination-based algorithm, Nested Elimination (NE). In the more complex subproblem regarding full-ranking identification, we generalize NE and propose a divide-and-conquer algorithm, Nested Partition (NP). We provide strong characterizations of both algorithms through instance-specific and non-asymptotic bounds on the sample complexity. This is accomplished using an analytical framework that characterizes the system dynamics through analyzing a sequence of multi-dimensional random walks. We also establish a connection between our nested approach and the information-theoretic lower bounds. We thus show that NE is worst-case asymptotically optimal, and NP is optimal up to a constant factor. Finally, numerical experiments from both synthetic and real data corroborate our theoretical findings.
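For intuition, here is a minimal Python sketch of a generic elimination-style loop for best-item identification from choice-based feedback. It is an illustration only, not the paper's Nested Elimination (NE) or Nested Partition (NP) procedures: the batched display rule, the Hoeffding-style confidence radius, and the multinomial-logit toy oracle `query_choice` are all simplifying assumptions.

```python
import math
import random
from collections import Counter

def identify_best_item(items, query_choice, delta=0.05, batch=200):
    """Illustrative elimination loop for best-item identification: repeatedly
    display the surviving items, record which one is chosen, and drop items
    whose empirical choice frequencies fall decisively behind the leader.
    This is NOT the paper's NE algorithm, only the basic elimination idea."""
    active = list(items)
    rounds = 0
    while len(active) > 1:
        rounds += 1
        counts = Counter()
        for _ in range(batch):           # one batch of choice-based feedback
            counts[query_choice(active)] += 1
        leader = max(active, key=lambda i: counts[i])
        # Hoeffding-style confidence radius; this schedule is an assumption.
        radius = math.sqrt(0.5 * math.log(2 * len(items) * rounds**2 / delta) / batch)
        active = [i for i in active
                  if counts[i] / batch + 2 * radius >= counts[leader] / batch]
    return active[0]

# Toy choice oracle: a multinomial-logit model with hidden utilities,
# consistent with the strict ranking A > B > C > D.
utils = {"A": 2.0, "B": 1.0, "C": 0.5, "D": 0.1}

def query_choice(assortment):
    weights = [math.exp(utils[i]) for i in assortment]
    return random.choices(assortment, weights=weights, k=1)[0]

print(identify_best_item(list(utils), query_choice))  # "A" with high probability
```

A full-ranking analogue would apply a divide-and-conquer recursion over the items, in the spirit of NP as described in the abstract.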
Related papers
- Optimal Decision Tree and Adaptive Submodular Ranking with Noisy Outcomes [9.321976218862542]
In pool-based active learning, the learner is given an unlabeled data set and aims to efficiently learn the unknown hypothesis by querying the labels of the data points.
This can be formulated as the classical Optimal Decision Tree (ODT) problem: Given a set of tests, a set of hypotheses, and an outcome for each pair of test and hypothesis, our objective is to find a low-cost testing procedure (i.e., decision tree) that identifies the true hypothesis.
In this work, we study a fundamental variant of the ODT problem in which some test outcomes are noisy.
arXiv Detail & Related papers (2023-12-23T21:47:50Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - Dual-Directed Algorithm Design for Efficient Pure Exploration [9.728332815218181]
We consider pure-exploration problems in the context of sequential adaptive experiments with a finite set of alternatives.
We formulate the problem complexity measure as a maximin optimization problem for the static fixed-budget, fixed-confidence, and posterior convergence rate settings.
Our algorithm attains optimality in $\varepsilon$-best-arm identification (or ranking and selection with a probability of good selection guarantee) and thresholding bandits.
arXiv Detail & Related papers (2023-10-30T07:29:17Z) - Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization [50.10952609321302]
Compositional minimax optimization is a pivotal challenge across various machine learning domains.
Current methods of compositional minimax optimization are plagued by sub-optimal complexities or heavy reliance on sizable batch sizes.
This paper introduces a novel method, called Nested STOchastic Recursive Momentum (NSTORM), which can achieve the optimal sample complexity of $O(\kappa^3/\epsilon^3)$.
arXiv Detail & Related papers (2023-08-18T14:57:21Z) - Debiasing Conditional Stochastic Optimization [15.901623717313493]
We study the conditional stochastic optimization (CSO) problem, which covers a variety of applications including portfolio selection, reinforcement learning, robust learning, etc.
We develop new algorithms for the finite-sum variant of the CSO problem that significantly improve upon existing results.
We believe that our technique has the potential to be a useful tool for addressing similar challenges in other optimization problems.
arXiv Detail & Related papers (2023-04-20T19:19:55Z) - Symmetric Tensor Networks for Generative Modeling and Constrained Combinatorial Optimization [72.41480594026815]
Constrained optimization problems abound in industry, from portfolio optimization to logistics.
One of the major roadblocks in solving these problems is the presence of non-trivial hard constraints which limit the valid search space.
In this work, we encode arbitrary integer-valued equality constraints of the form $Ax = b$ directly into U(1)-symmetric tensor networks (TNs) and leverage their applicability as quantum-inspired generative models.
arXiv Detail & Related papers (2022-11-16T18:59:54Z) - Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback.
The proposed algorithm, BOC, includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as a subroutine.
We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower bound asymptotically, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z) - Recursive Causal Structure Learning in the Presence of Latent Variables and Selection Bias [27.06618125828978]
We consider the problem of learning the causal maximal ancestral graph (MAG) of a system from observational data in the presence of latent variables and selection bias.
We propose a novel computationally efficient constraint-based method that is sound and complete.
We provide experimental results to compare the proposed approach with the state of the art on both synthetic and real-world structures.
arXiv Detail & Related papers (2021-10-22T19:49:59Z) - Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets for neural combinatorial solvers only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z) - Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z) - Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z) - Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Bloom Origami Assays: Practical Group Testing [90.2899558237778]
Group testing is a well-studied problem with several appealing solutions.
Recent biological studies impose practical constraints for COVID-19 that are incompatible with traditional methods.
We develop a new method combining Bloom filters with belief propagation to scale to larger values of $n$ (more than 100) with good empirical results.
arXiv Detail & Related papers (2020-07-21T19:31:41Z) - Ranking a set of objects: a graph based least-square approach [70.7866286425868]
We consider the problem of ranking $N$ objects starting from a set of noisy pairwise comparisons provided by a crowd of equal workers.
We propose a class of non-adaptive ranking algorithms that rely on a least-squares intrinsic optimization criterion for the estimation of qualities.
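For a concrete feel of the least-squares criterion, the snippet below is a generic sketch (not this paper's specific graph-based estimator): it estimates quality scores from noisy pairwise comparisons via ordinary least squares over the comparison graph's incidence matrix, with a ±1 outcome encoding as a simplifying assumption.

```python
import numpy as np

def least_squares_ranking(n, comparisons):
    """Estimate quality scores q from noisy pairwise comparisons.
    Each (i, j, y) records that item i beat item j (y = +1) or lost (y = -1);
    we solve min_q ||B q - y||^2, where each row of B is e_i - e_j."""
    B = np.zeros((len(comparisons), n))
    y = np.zeros(len(comparisons))
    for r, (i, j, outcome) in enumerate(comparisons):
        B[r, i], B[r, j] = 1.0, -1.0
        y[r] = outcome
    # Scores are identifiable only up to an additive constant; the minimum-norm
    # least-squares solution returned by lstsq is effectively zero-mean here.
    q, *_ = np.linalg.lstsq(B, y, rcond=None)
    return np.argsort(-q)  # item indices, best-estimated first

# Toy usage: true order 0 > 1 > 2 > 3; the last comparison is noise.
obs = [(0, 1, +1), (1, 2, +1), (2, 3, +1), (0, 2, +1), (1, 3, +1), (2, 0, +1)]
print(least_squares_ranking(4, obs))
```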
arXiv Detail & Related papers (2020-02-26T16:19:09Z) - Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)