Bandit Pareto Set Identification in a Multi-Output Linear Model
- URL: http://arxiv.org/abs/2507.04255v1
- Date: Sun, 06 Jul 2025 06:14:43 GMT
- Title: Bandit Pareto Set Identification in a Multi-Output Linear Model
- Authors: Cyrille Kone, Emilie Kaufmann, Laura Richert,
- Abstract summary: The goal is to identify the set of non-dominated arms by adaptively collecting samples from the arms.<n>We introduce and analyze the first optimal design-based algorithms for PSI, providing nearly optimal guarantees in both the fixed-budget and the fixed-confidence settings.
- Score: 10.967572582187014
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the Pareto Set Identification (PSI) problem in a structured multi-output linear bandit model. In this setting, each arm is associated a feature vector belonging to $\mathbb{R}^h$, and its mean vector in $\mathbb{R}^d$ linearly depends on this feature vector through a common unknown matrix $\Theta \in \mathbb{R}^{h \times d}$. The goal is to identify the set of non-dominated arms by adaptively collecting samples from the arms. We introduce and analyze the first optimal design-based algorithms for PSI, providing nearly optimal guarantees in both the fixed-budget and the fixed-confidence settings. Notably, we show that the difficulty of these tasks mainly depends on the sub-optimality gaps of $h$ arms only. Our theoretical results are supported by an extensive benchmark on synthetic and real-world datasets.
Related papers
- Multi-thresholding Good Arm Identification with Bandit Feedback [1.079960007119637]
We consider a good arm identification problem in a bandit setting with multi-objectives.<n>We propose the Multi-Thresholding UCB(MultiTUCB) algorithm with a sample complexity bound.
arXiv Detail & Related papers (2025-03-13T14:04:04Z) - Representative Arm Identification: A fixed confidence approach to identify cluster representatives [7.459521930846415]
We study the representative arm identification problem in the multi-armed bandits (MAB) framework.
The RAI problem covers as special cases several well-studied MAB problems such as identifying the best arm or any $M$ out of the top $K$.
We propose two algorithms, based on the idea of confidence intervals, and provide high probability upper bounds on their sample complexity.
arXiv Detail & Related papers (2024-08-26T11:47:52Z) - Combinatorial Stochastic-Greedy Bandit [79.1700188160944]
We propose a novelgreedy bandit (SGB) algorithm for multi-armed bandit problems when no extra information other than the joint reward of the selected set of $n$ arms at each time $tin [T]$ is observed.
SGB adopts an optimized-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms.
arXiv Detail & Related papers (2023-12-13T11:08:25Z) - Bandit Pareto Set Identification: the Fixed Budget Setting [10.967572582187014]
We study a pure exploration problem in a multi-armed bandit model.<n>The goal is to identify the distributions whose mean is not uniformly worse than that of another distribution.
arXiv Detail & Related papers (2023-11-07T13:43:18Z) - Fixed-Budget Best-Arm Identification in Sparse Linear Bandits [69.6194614504832]
We study the best-arm identification problem in sparse linear bandits under the fixed-budget setting.
We design a two-phase algorithm, Lasso and Optimal-Design- (Lasso-OD) based linear best-arm identification.
We show that Lasso-OD is almost minimax optimal in the exponent.
arXiv Detail & Related papers (2023-11-01T12:32:17Z) - Piecewise-Stationary Multi-Objective Multi-Armed Bandit with Application
to Joint Communications and Sensing [7.0997346625024]
We propose a generic upper confidence bound (UCB)-based algorithm with change detection to solve this problem.
We also formulate an energy-efficient waveform design problem in an integrated communication and sensing system as a toy example.
arXiv Detail & Related papers (2023-02-10T14:10:14Z) - Mean-based Best Arm Identification in Stochastic Bandits under Reward
Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits.
Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z) - Universal and data-adaptive algorithms for model selection in linear
contextual bandits [52.47796554359261]
We consider the simplest non-trivial instance of model-selection: distinguishing a simple multi-armed bandit problem from a linear contextual bandit problem.
We introduce new algorithms that explore in a data-adaptive manner and provide guarantees of the form $mathcalO(dalpha T1- alpha)$.
Our approach extends to model selection among nested linear contextual bandits under some additional assumptions.
arXiv Detail & Related papers (2021-11-08T18:05:35Z) - Vector Optimization with Stochastic Bandit Feedback [10.66048003460524]
We introduce vector optimization problems with geometric bandit feedback.
We consider $K$ designs, with multi-dimensional mean reward vectors, which are partially ordered according to a polyhedral ordering cone $C$.
arXiv Detail & Related papers (2021-10-23T22:38:54Z) - Towards Minimax Optimal Best Arm Identification in Linear Bandits [95.22854522340938]
We study the problem of best arm identification in linear bandits in the fixed-budget setting.
By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm.
We provide a theoretical analysis of the failure probability of OD-LinBAI.
arXiv Detail & Related papers (2021-05-27T09:19:10Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.