Related papers: Identifying All ε-Best Arms in (Misspecified) Linear Bandits

Identifying All ε-Best Arms in (Misspecified) Linear Bandits

URL: http://arxiv.org/abs/2510.00073v1
Date: Mon, 29 Sep 2025 22:26:52 GMT
Title: Identifying All ε-Best Arms in (Misspecified) Linear Bandits
Authors: Zhekai Li, Tianyi Ma, Cheng Hua, Ruihao Zhu,
Abstract summary: LinFACT is an algorithm designed to optimize the identification of all epsilon-best arms in linear bandits.<n>We show that LinFACT achieves instance optimality by matching this lower bound up to a logarithmic factor.<n> Numerical experiments, including synthetic and real drug discovery data, demonstrate that LinFACT identifies more promising candidates with reduced sample complexity.
Score: 9.638337713545065
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Motivated by the need to efficiently identify multiple candidates in high trial-and-error cost tasks such as drug discovery, we propose a near-optimal algorithm to identify all {\epsilon}-best arms (i.e., those at most {\epsilon} worse than the optimum). Specifically, we introduce LinFACT, an algorithm designed to optimize the identification of all {\epsilon}-best arms in linear bandits. We establish a novel information-theoretic lower bound on the sample complexity of this problem and demonstrate that LinFACT achieves instance optimality by matching this lower bound up to a logarithmic factor. A key ingredient of our proof is to integrate the lower bound directly into the scaling process for upper bound derivation, determining the termination round and thus the sample complexity. We also extend our analysis to settings with model misspecification and generalized linear models. Numerical experiments, including synthetic and real drug discovery data, demonstrate that LinFACT identifies more promising candidates with reduced sample complexity, offering significant computational efficiency and accelerating early-stage exploratory experiments.

Related papers

Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget [55.938644481736446]
We introduce a novel algorithm for best feasible arm identification that guarantees an exponential decay in the error probability.<n>We validate our algorithm through comprehensive empirical evaluations across various problem instances with different levels of complexity.
arXiv Detail & Related papers (2025-06-03T02:56:26Z)
Optimal Multi-Fidelity Best-Arm Identification [65.23078799972188]
In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible.<n>We study multi-fidelity best-arm identification, in which the can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost.<n>Several methods have been proposed for tackling this problem, but their optimality remain elusive, notably due to loose lower bounds on the total cost needed to identify the best arm.
arXiv Detail & Related papers (2024-06-05T08:02:40Z)
Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.<n>We propose a novel algorithm that yields an varepsilon-optimal randomized hypothesis with a sample complexity on the order of (d+k)/varepsilon2.
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
Dual-Directed Algorithm Design for Efficient Pure Exploration [9.728332815218181]
We develop a new design principle for pure-exploration problems that extends the top-two approach beyond best-arm identification.<n>We prove that, when combined with Information-Directed Selection, top-two Thompson sampling attains optimality for best-arm identification.<n>We also produce optimal algorithms for thresholding bandits and $varepsilon$-best-arm identification.
arXiv Detail & Related papers (2023-10-30T07:29:17Z)
Learning to Select and Rank from Choice-Based Feedback: A Simple Nested Approach [10.293894471295205]
We study a ranking and selection problem of learning from choice-based feedback with dynamic assortments.<n>We present novel and simple algorithms for both learning goals.
arXiv Detail & Related papers (2023-07-13T05:05:30Z)
Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits. Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z)
Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification [0.0]
We study the problem of the identification of m arms with largest means under a fixed error rate $delta$ (fixed-confidence Top-m identification) This problem is motivated by practical applications, especially in medicine and recommendation systems.
arXiv Detail & Related papers (2021-11-02T10:27:17Z)
Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $tildeO(L5/6 T2/3)$ regret. We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model. The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits [34.06611065493047]
This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings. We provide an algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms. We also propose the first algorithm for linear bandits in the the fixed budget setting.
arXiv Detail & Related papers (2020-06-21T00:56:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.