Sample-Optimal Large-Scale Optimal Subset Selection
- URL: http://arxiv.org/abs/2408.09537v1
- Date: Sun, 18 Aug 2024 16:44:41 GMT
- Title: Sample-Optimal Large-Scale Optimal Subset Selection
- Authors: Zaile Li, Weiwei Fan, L. Jeff Hong,
- Abstract summary: We design a top-$m$ greedy selection mechanism that keeps sampling the current top $m$ alternatives with top $m$ running sample means.
We prove that the EFG-$m$ procedure is both sample optimal and consistent in terms of the probability of good selection.
Surprisingly, we also demonstrate that the EFG-$m$ procedure enables to achieve an indifference-based ranking within the selected subset of alternatives at no extra cost.
- Score: 0.9558392439655016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ranking and selection (R&S) conventionally aims to select the unique best alternative with the largest mean performance from a finite set of alternatives. However, for better supporting decision making, it may be more informative to deliver a small menu of alternatives whose mean performances are among the top $m$. Such problem, called optimal subset selection (OSS), is generally more challenging to address than the conventional R&S. This challenge becomes even more significant when the number of alternatives is considerably large. Thus, the focus of this paper is on addressing the large-scale OSS problem. To achieve this goal, we design a top-$m$ greedy selection mechanism that keeps sampling the current top $m$ alternatives with top $m$ running sample means and propose the explore-first top-$m$ greedy (EFG-$m$) procedure. Through an extended boundary-crossing framework, we prove that the EFG-$m$ procedure is both sample optimal and consistent in terms of the probability of good selection, confirming its effectiveness in solving large-scale OSS problem. Surprisingly, we also demonstrate that the EFG-$m$ procedure enables to achieve an indifference-based ranking within the selected subset of alternatives at no extra cost. This is highly beneficial as it delivers deeper insights to decision-makers, enabling more informed decision-makings. Lastly, numerical experiments validate our results and demonstrate the efficiency of our procedures.
Related papers
- Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization [52.80408805368928]
We introduce a novel greedy-style subset selection algorithm for batch acquisition.
Our experiments on the red fluorescent proteins show that our proposed method achieves the baseline performance in 1.69x fewer queries.
arXiv Detail & Related papers (2024-06-21T05:57:08Z) - Query Routing for Homogeneous Tools: An Instantiation in the RAG Scenario [62.615210194004106]
Current research on tool learning primarily focuses on selecting the most effective tool from a wide array of options, often overlooking cost-effectiveness.
In this paper, we address the selection of homogeneous tools by predicting both their performance and the associated cost required to accomplish a given task.
arXiv Detail & Related papers (2024-06-18T09:24:09Z) - Efficient Prompt Optimization Through the Lens of Best Arm Identification [50.56113809171805]
This work provides a principled framework, TRIPLE, to efficiently perform prompt selection under an explicit budget constraint.
It is built on a novel connection established between prompt optimization and fixed-budget best arm identification (BAI-FB) in multi-armed bandits (MAB)
arXiv Detail & Related papers (2024-02-15T05:31:13Z) - qPOTS: Efficient batch multiobjective Bayesian optimization via Pareto optimal Thompson sampling [0.0]
A sample-efficient approach to solving multiobjective optimization is via process oracle (GP) surrogates and MOBOOTS$.
We propose a Thompson sampling (TS) based approach ($qtextttPOTS$)
$qtextttPOTS$ solves a cheap multiobjective optimization on the GP posteriors with evolutionary approaches.
arXiv Detail & Related papers (2023-10-24T12:35:15Z) - Best Arm Identification for Stochastic Rising Bandits [84.55453174601826]
Rising Bandits (SRBs) model sequential decision-making problems in which the expected reward of the available options increases every time they are selected.
This paper focuses on the fixed-budget Best Arm Identification (BAI) problem for SRBs.
We propose two algorithms to tackle the above-mentioned setting, namely R-UCBE and R-SR.
arXiv Detail & Related papers (2023-02-15T08:01:37Z) - Discovering Many Diverse Solutions with Bayesian Optimization [7.136022698519586]
We propose Rank-Ordered Bayesian Optimization with Trust-regions (ROBOT)
ROBOT aims to find a portfolio of high-performing solutions that are diverse according to a user-specified diversity metric.
We show that it can discover large sets of high-performing diverse solutions while requiring few additional function evaluations.
arXiv Detail & Related papers (2022-10-20T01:56:38Z) - Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z) - Uncertainty-Aware Search Framework for Multi-Objective Bayesian
Optimization [40.40632890861706]
We consider the problem of multi-objective (MO) blackbox optimization using expensive function evaluations.
We propose a novel uncertainty-aware search framework referred to as USeMO to efficiently select the sequence of inputs for evaluation.
arXiv Detail & Related papers (2022-04-12T16:50:48Z) - SelectAugment: Hierarchical Deterministic Sample Selection for Data
Augmentation [72.58308581812149]
We propose an effective approach, dubbed SelectAugment, to select samples to be augmented in a deterministic and online manner.
Specifically, in each batch, we first determine the augmentation ratio, and then decide whether to augment each training sample under this ratio.
In this way, the negative effects of the randomness in selecting samples to augment can be effectively alleviated and the effectiveness of DA is improved.
arXiv Detail & Related papers (2021-12-06T08:38:38Z) - Lookahead and Hybrid Sample Allocation Procedures for Multiple Attribute
Selection Decisions [0.9137554315375922]
This paper considers settings in which each measurement yields one sample of one attribute for one alternative.
When given a fixed number of samples to collect, the decision-maker must determine which samples to obtain, make the measurements, update prior beliefs about the attribute magnitudes, and then select an alternative.
arXiv Detail & Related papers (2020-07-31T15:04:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.