Related papers: Bandit Pareto Set Identification in a Multi-Output Linear Model

Bandit Pareto Set Identification in a Multi-Output Linear Model

URL: http://arxiv.org/abs/2507.04255v1
Date: Sun, 06 Jul 2025 06:14:43 GMT
Title: Bandit Pareto Set Identification in a Multi-Output Linear Model
Authors: Cyrille Kone, Emilie Kaufmann, Laura Richert,
Abstract summary: The goal is to identify the set of non-dominated arms by adaptively collecting samples from the arms.<n>We introduce and analyze the first optimal design-based algorithms for PSI, providing nearly optimal guarantees in both the fixed-budget and the fixed-confidence settings.
Score: 10.967572582187014
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study the Pareto Set Identification (PSI) problem in a structured multi-output linear bandit model. In this setting, each arm is associated a feature vector belonging to $\mathbb{R}^h$, and its mean vector in $\mathbb{R}^d$ linearly depends on this feature vector through a common unknown matrix $\Theta \in \mathbb{R}^{h \times d}$. The goal is to identify the set of non-dominated arms by adaptively collecting samples from the arms. We introduce and analyze the first optimal design-based algorithms for PSI, providing nearly optimal guarantees in both the fixed-budget and the fixed-confidence settings. Notably, we show that the difficulty of these tasks mainly depends on the sub-optimality gaps of $h$ arms only. Our theoretical results are supported by an extensive benchmark on synthetic and real-world datasets.

Related papers

Multi-thresholding Good Arm Identification with Bandit Feedback [1.079960007119637]
We consider a good arm identification problem in a bandit setting with multi-objectives.<n>We propose the Multi-Thresholding UCB(MultiTUCB) algorithm with a sample complexity bound.
arXiv Detail & Related papers (2025-03-13T14:04:04Z)
Representative Arm Identification: A fixed confidence approach to identify cluster representatives [7.459521930846415]
We study the representative arm identification problem in the multi-armed bandits (MAB) framework. The RAI problem covers as special cases several well-studied MAB problems such as identifying the best arm or any $M$ out of the top $K$. We propose two algorithms, based on the idea of confidence intervals, and provide high probability upper bounds on their sample complexity.
arXiv Detail & Related papers (2024-08-26T11:47:52Z)
Combinatorial Stochastic-Greedy Bandit [79.1700188160944]
We propose a novelgreedy bandit (SGB) algorithm for multi-armed bandit problems when no extra information other than the joint reward of the selected set of $n$ arms at each time $tin [T]$ is observed. SGB adopts an optimized-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms.
arXiv Detail & Related papers (2023-12-13T11:08:25Z)
Bandit Pareto Set Identification: the Fixed Budget Setting [10.967572582187014]
We study a pure exploration problem in a multi-armed bandit model.<n>The goal is to identify the distributions whose mean is not uniformly worse than that of another distribution.
arXiv Detail & Related papers (2023-11-07T13:43:18Z)
Fixed-Budget Best-Arm Identification in Sparse Linear Bandits [69.6194614504832]
We study the best-arm identification problem in sparse linear bandits under the fixed-budget setting. We design a two-phase algorithm, Lasso and Optimal-Design- (Lasso-OD) based linear best-arm identification. We show that Lasso-OD is almost minimax optimal in the exponent.
arXiv Detail & Related papers (2023-11-01T12:32:17Z)
Piecewise-Stationary Multi-Objective Multi-Armed Bandit with Application to Joint Communications and Sensing [7.0997346625024]
We propose a generic upper confidence bound (UCB)-based algorithm with change detection to solve this problem. We also formulate an energy-efficient waveform design problem in an integrated communication and sensing system as a toy example.
arXiv Detail & Related papers (2023-02-10T14:10:14Z)
Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits. Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z)
Universal and data-adaptive algorithms for model selection in linear contextual bandits [52.47796554359261]
We consider the simplest non-trivial instance of model-selection: distinguishing a simple multi-armed bandit problem from a linear contextual bandit problem. We introduce new algorithms that explore in a data-adaptive manner and provide guarantees of the form $mathcalO(dalpha T1- alpha)$. Our approach extends to model selection among nested linear contextual bandits under some additional assumptions.
arXiv Detail & Related papers (2021-11-08T18:05:35Z)
Vector Optimization with Stochastic Bandit Feedback [10.66048003460524]
We introduce vector optimization problems with geometric bandit feedback. We consider $K$ designs, with multi-dimensional mean reward vectors, which are partially ordered according to a polyhedral ordering cone $C$.
arXiv Detail & Related papers (2021-10-23T22:38:54Z)
Towards Minimax Optimal Best Arm Identification in Linear Bandits [95.22854522340938]
We study the problem of best arm identification in linear bandits in the fixed-budget setting. By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm. We provide a theoretical analysis of the failure probability of OD-LinBAI.
arXiv Detail & Related papers (2021-05-27T09:19:10Z)
Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$ We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.