Multiplayer Information Asymmetric Bandits in Metric Spaces
- URL: http://arxiv.org/abs/2503.08004v1
- Date: Tue, 11 Mar 2025 03:08:09 GMT
- Title: Multiplayer Information Asymmetric Bandits in Metric Spaces
- Authors: William Chang, Aditi Kartik
- Abstract summary: We consider information asymmetry in rewards, actions, or both. We adopt the CAB algorithm given in \cite{kleinberg2004nearly}, which uses a fixed discretization to give regret bounds of the same order. We also adopt their zooming algorithm \cite{kleinberg2008multi}, which uses an adaptive discretization, and apply it to information asymmetry in rewards and information asymmetry in actions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper we study the Lipschitz bandit problem applied to the multiplayer information asymmetric problem studied in \cite{chang2022online, chang2023optimal}. More specifically, we consider information asymmetry in rewards, actions, or both. We adopt the CAB algorithm given in \cite{kleinberg2004nearly}, which uses a fixed discretization to give regret bounds of the same order (in the dimension of the action space) in all 3 problem settings. We also adopt their zooming algorithm \cite{kleinberg2008multi}, which uses an adaptive discretization, and apply it to information asymmetry in rewards and information asymmetry in actions.
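To make the fixed-discretization approach concrete, here is a minimal single-player sketch of the CAB idea in one dimension: build a uniform mesh over [0, 1] sized to the horizon and run UCB1 over the mesh points. The function name, the reward oracle `sample_reward`, and the constants are illustrative; the paper's multiplayer variants additionally coordinate the players under information asymmetry.

```python
import math

def cab_fixed_discretization(T, sample_reward):
    """Single-player sketch of fixed discretization for Lipschitz bandits:
    uniform mesh over [0, 1] sized to the horizon T, then UCB1 over the
    mesh points. `sample_reward(x)` is an assumed noisy reward oracle."""
    K = max(1, round(T ** (1 / 3)))            # mesh size ~ T^{1/3} (up to log factors, 1-D)
    arms = [(i + 0.5) / K for i in range(K)]   # midpoint of each mesh cell
    counts, means = [0] * K, [0.0] * K
    for t in range(1, T + 1):
        if t <= K:
            i = t - 1                          # pull every mesh point once
        else:                                  # UCB1 index over the finite mesh
            i = max(range(K), key=lambda j: means[j] + math.sqrt(2 * math.log(t) / counts[j]))
        r = sample_reward(arms[i])
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]
    return arms, means
```

For a 1-Lipschitz mean reward, a mesh of size roughly $T^{1/3}$ balances the discretization error (about $1/K$) against the cost of learning $K$ arms, which is what yields the familiar $\tilde{O}(T^{2/3})$ rate in one dimension.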
Related papers
- Multiplayer Information Asymmetric Contextual Bandits [0.0]
We propose a novel multiplayer information asymmetric contextual bandit framework.
There are multiple players each with their own set of actions. At every round, they observe the same context vectors and simultaneously take an action from their own set of actions, giving rise to a joint action.
We propose a novel algorithm \texttt{ETC} that is built on explore-then-commit principles to achieve the same optimal regret when both types of asymmetry are present.
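For intuition, here is a minimal single-player explore-then-commit sketch. The paper's \texttt{ETC} additionally coordinates multiple players under information asymmetry; the names and the reward oracle `pull` below are ours, not the paper's.

```python
def explore_then_commit(T, m, arms, pull):
    """Single-player explore-then-commit sketch: pull each arm m times,
    then commit to the empirically best arm for the remaining rounds.
    `pull` is an assumed reward oracle."""
    K = len(arms)
    assert m * K <= T, "exploration must fit inside the horizon"
    means = [0.0] * K
    for i in range(K):                 # exploration phase
        for _ in range(m):
            means[i] += pull(arms[i]) / m
    best = max(range(K), key=lambda i: means[i])
    total = sum(pull(arms[best]) for _ in range(T - m * K))  # commit phase
    return arms[best], total
```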
arXiv Detail & Related papers (2025-03-11T23:48:31Z)
- Computing Game Symmetries and Equilibria That Respect Them [77.72705755558839]
We study the computational complexity of identifying and using symmetries in games.
We find a strong connection between game symmetries and graph automorphisms.
We show that finding a Nash equilibrium that respects a given set of symmetries is exactly as hard as the Brouwer fixed-point and gradient-descent problems.
arXiv Detail & Related papers (2025-01-15T16:15:16Z)
- Symmetric Linear Bandits with Hidden Symmetry [17.40632789343385]
We study high-dimensional symmetric linear bandits where the symmetry is hidden from the learner.
We provide a method based on model selection within the collection of low-dimensional subspaces.
arXiv Detail & Related papers (2024-05-22T18:11:57Z)
- Forced Exploration in Bandit Problems [12.13966146283641]
The multi-armed bandit (MAB) is a classical sequential decision problem.
This paper aims to design a multi-armed bandit algorithm that can be implemented without using information about the reward distribution.
arXiv Detail & Related papers (2023-12-12T14:00:29Z)
- Iterative Sketching for Secure Coded Regression [66.53950020718021]
We propose methods for speeding up distributed linear regression.
Specifically, we randomly rotate the basis of the system of equations and then subsample blocks, to simultaneously secure the information and reduce the dimension of the regression problem.
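A dense, illustrative version of the rotate-then-subsample idea might look like the sketch below; the paper's coded construction uses structured (Hadamard-type) transforms and coding for security, which this version does not attempt.

```python
import numpy as np

def rotate_then_subsample(A, b, num_blocks, keep):
    """Rotate the system with a random orthogonal matrix, partition the rows
    into blocks, keep a random subset of blocks, and solve the reduced
    least-squares problem. Dense rotation for clarity only; structured
    transforms are what make this fast in practice."""
    n = A.shape[0]
    Q, _ = np.linalg.qr(np.random.randn(n, n))   # random orthogonal rotation
    A_rot, b_rot = Q @ A, Q @ b
    blocks = np.array_split(np.arange(n), num_blocks)
    chosen = np.random.choice(num_blocks, size=keep, replace=False)
    rows = np.concatenate([blocks[i] for i in chosen])
    x, *_ = np.linalg.lstsq(A_rot[rows], b_rot[rows], rcond=None)
    return x
```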
arXiv Detail & Related papers (2023-08-08T11:10:42Z)
- Implicitly normalized forecaster with clipping for linear and non-linear heavy-tailed multi-armed bandits [85.27420062094086]
Implicitly Normalized Forecaster (INF) is considered an optimal solution for adversarial multi-armed bandit (MAB) problems.
We propose a new version of INF called the Implicitly Normalized Forecaster with clipping (INFclip) for MAB problems with heavy-tailed settings.
We demonstrate that INFclip is optimal for linear heavy-tailed MAB problems and works well for non-linear ones.
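The clipping step can be illustrated separately from INF itself. The sketch below is not the Implicitly Normalized Forecaster: it uses plain exponential weights, and truncates each observed, possibly heavy-tailed loss at an assumed level B before importance weighting. All names and constants are ours.

```python
import math
import random

def clipped_exp_weights(T, K, sample_loss, eta=0.05, B=10.0):
    """Exponential-weights bandit with clipped loss estimates (a simplified
    stand-in for INF-style methods): truncating losses to [-B, B] trades a
    small bias for bounded-variance importance-weighted estimates."""
    cum = [0.0] * K                                  # cumulative clipped loss estimates
    for _ in range(T):
        base = min(cum)                              # shift for numerical stability
        w = [math.exp(-eta * (c - base)) for c in cum]
        z = sum(w)
        p = [wi / z for wi in w]
        i = random.choices(range(K), weights=p)[0]   # sample an arm
        loss = sample_loss(i)                        # heavy-tailed feedback
        clipped = max(-B, min(loss, B))              # truncation step
        cum[i] += clipped / p[i]                     # importance-weighted update
    return p
```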
arXiv Detail & Related papers (2023-05-11T12:00:43Z)
- Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency [90.40062452292091]
We present the first computationally efficient algorithm for linear bandits with heteroscedastic noise.
Our algorithm is adaptive to the unknown variance of noise and achieves an $\tilde{O}(d\sqrt{\sum_{k=1}^{K} \sigma_k^2} + d)$ regret.
We also propose a variance-adaptive algorithm for linear mixture Markov decision processes (MDPs) in reinforcement learning.
arXiv Detail & Related papers (2023-02-21T00:17:24Z)
- Invariant Lipschitz Bandits: A Side Observation Approach [18.688474183114085]
We study the invariant Lipschitz bandit setting, where the reward function and the set of arms are preserved under a group of transformations.
We introduce an algorithm named \texttt{UniformMesh-N}, which naturally integrates side observations using group orbits.
We prove an improved regret upper bound, which depends on the cardinality of the group, given that the group is finite.
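A hedged sketch of the orbit-sharing idea: run UCB over a fixed mesh and let each observed reward update every arm in the pulled arm's orbit, since invariance of the reward function makes one sample informative at all equivalent arms. The helper `orbit` and all names below are ours, not the paper's.

```python
import math

def uniform_mesh_with_orbits(T, K, orbit, pull):
    """UCB over a fixed mesh of K arms where every observation is shared
    along the pulled arm's group orbit. `orbit(i)` is assumed to return the
    indices of arms equivalent to i, including i itself."""
    counts, means = [0] * K, [0.0] * K
    for t in range(1, T + 1):
        untried = [j for j in range(K) if counts[j] == 0]
        if untried:
            i = untried[0]
        else:
            i = max(range(K), key=lambda j: means[j] + math.sqrt(2 * math.log(t) / counts[j]))
        r = pull(i)
        for j in orbit(i):             # side observations along the orbit
            counts[j] += 1
            means[j] += (r - means[j]) / counts[j]
    return means
```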
arXiv Detail & Related papers (2022-12-14T22:12:32Z)
- A gradient estimator via L1-randomization for online zero-order optimization with two point feedback [93.57603470949266]
We present a novel gradient estimator based on two function evaluations and randomization.
We consider two types of assumptions on the noise of the zero-order oracle: canceling noise and adversarial noise.
We provide an anytime and completely data-driven algorithm, which is adaptive to all parameters of the problem.
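A sketch of a two-point estimator with $\ell_1$ randomization in this spirit: query the function at two symmetric perturbations along a direction drawn uniformly from the $\ell_1$ sphere, and scale the sign vector by the finite difference. The constants follow the usual $d/(2h)$ form and may differ from the paper's exact estimator.

```python
import numpy as np

def l1_two_point_gradient(f, x, h):
    """Two-point gradient estimate with ell_1 randomization."""
    d = x.shape[0]
    z = np.random.laplace(size=d)
    z /= np.abs(z).sum()                  # normalized i.i.d. Laplace is uniform on the ell_1 sphere
    delta = f(x + h * z) - f(x - h * z)   # the two function evaluations
    return (d / (2 * h)) * delta * np.sign(z)

# e.g. l1_two_point_gradient(lambda v: float((v ** 2).sum()), np.ones(5), 1e-3)
```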
arXiv Detail & Related papers (2022-05-27T11:23:57Z)
- A PDE-Based Analysis of the Symmetric Two-Armed Bernoulli Bandit [1.2183405753834562]
This work addresses a version of the two-armed Bernoulli bandit problem where the sum of the means of the arms is one.
We obtain the leading order terms of the minimax optimal regret and pseudoregret for this problem by associating each of them with a solution of a linear heat equation.
arXiv Detail & Related papers (2022-02-11T17:03:18Z)
- High-Dimensional Sparse Linear Bandits [67.9378546011416]
We derive a novel $\Omega(n^{2/3})$ dimension-free minimax regret lower bound for sparse linear bandits in the data-poor regime.
We also prove a dimension-free $O(\sqrt{n})$ regret upper bound under an additional assumption on the magnitude of the signal for relevant features.
arXiv Detail & Related papers (2020-11-08T16:48:11Z)
- Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection [71.9765117768556]
We consider the contextual bandit problem under the high dimensional linear model.
This setting finds essential applications such as personalized recommendation, online advertisement, and personalized medicine.
We propose doubly growing epochs and estimate the parameter using the best subset selection method.
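Best subset selection itself is easy to state, though expensive: fit least squares on every support of the target size and keep the best fit. The brute-force sketch below is a conceptual stand-in for the estimator used inside the paper's doubly growing epochs, not a practical implementation.

```python
import itertools
import numpy as np

def best_subset_ls(X, y, s):
    """Brute-force best subset selection: ordinary least squares on every
    support of size s, keeping the one with the smallest residual.
    Exponential in the dimension, so illustrative only."""
    d = X.shape[1]
    best_support, best_beta, best_resid = None, None, float("inf")
    for support in itertools.combinations(range(d), s):
        cols = list(support)
        beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        resid = np.linalg.norm(y - X[:, cols] @ beta)
        if resid < best_resid:
            best_support, best_beta, best_resid = cols, beta, resid
    return best_support, best_beta
```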
arXiv Detail & Related papers (2020-09-04T04:10:39Z)
- Adaptive Discretization for Adversarial Lipschitz Bandits [85.39106976861702]
Lipschitz bandits is a prominent version of multi-armed bandits that studies large, structured action spaces.
A central theme here is the adaptive discretization of the action space, which gradually "zooms in" on the more promising regions.
We provide the first algorithm for adaptive discretization in the adversarial version, and derive instance-dependent regret bounds.
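A simplified one-dimensional zooming-style sketch, assuming stochastic rewards (the paper's contribution is the adversarial case, which needs more machinery): maintain a set of active arms with shrinking confidence radii, activate a new arm whenever some point is left uncovered, and play the arm with the best optimistic index. All names and constants are ours.

```python
import math

def zooming_1d(T, pull, grid_size=100):
    """Stochastic 1-D zooming sketch over [0, 1]: activation covers any
    uncovered candidate point; selection is optimistic (mean + 2 * radius).
    `pull(x)` is an assumed noisy reward oracle."""
    candidates = [i / grid_size for i in range(grid_size + 1)]
    active = {}                                   # arm -> (count, mean)

    def radius(n):
        return float("inf") if n == 0 else math.sqrt(2 * math.log(T) / n)

    for _ in range(T):
        # Activation rule: cover any candidate point left uncovered.
        for y in candidates:
            if all(abs(y - x) > radius(n) for x, (n, _) in active.items()):
                active[y] = (0, 0.0)
                break
        # Selection rule: play the arm with the largest optimistic index.
        x = max(active, key=lambda a: active[a][1] + 2 * radius(active[a][0]))
        n, m = active[x]
        r = pull(x)
        active[x] = (n + 1, m + (r - m) / (n + 1))
    return active
```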
arXiv Detail & Related papers (2020-06-22T16:06:25Z)