Regret Lower Bounds in Multi-agent Multi-armed Bandit
- URL: http://arxiv.org/abs/2308.08046v1
- Date: Tue, 15 Aug 2023 21:20:24 GMT
- Title: Regret Lower Bounds in Multi-agent Multi-armed Bandit
- Authors: Mengfan Xu, Diego Klabjan
- Abstract summary: Multi-armed Bandit motivates methods with provable upper bounds on regret.
We provide the first comprehensive study on regret lower bounds across different settings.
- Score: 14.822625665220068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-armed Bandit motivates methods with provable upper bounds on
regret, and the counterpart lower bounds have also been extensively studied in this
context. Recently, Multi-agent Multi-armed Bandit has gained significant
traction in various domains, where individual clients face bandit problems in a
distributed manner and the objective is the overall system performance,
typically measured by regret. While efficient algorithms with regret upper
bounds have emerged, limited attention has been given to the corresponding
regret lower bounds, except for a recent lower bound for adversarial settings,
which, however, has a gap with the best known upper bounds. To this end, we herein
provide the first comprehensive study on regret lower bounds across different
settings and establish their tightness. Specifically, when the graphs exhibit
good connectivity properties and the rewards are stochastically distributed, we
demonstrate a lower bound of order $\Omega(\log T)$ for instance-dependent bounds
and $\Omega(\sqrt{T})$ for mean-gap independent bounds, both of which are tight. Assuming
adversarial rewards, we establish a lower bound of $\Omega(T^{\frac{2}{3}})$ for
connected graphs, thereby bridging the gap between the lower and upper bound in
the prior work. We also show a linear regret lower bound when the graph is
disconnected. While previous works have explored these settings with upper
bounds, we provide a thorough study on tight lower bounds.
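As a point of reference for the instance-dependent $\Omega(\log T)$ rate above, the classical single-agent Lai-Robbins lower bound states that any consistent policy on Bernoulli arms incurs regret at least $(c + o(1)) \log T$, where $c$ sums $(\mu^* - \mu_i)/\mathrm{kl}(\mu_i, \mu^*)$ over suboptimal arms. A minimal sketch of that constant (helper names are ours, not from the paper, which treats the multi-agent setting):

```python
import math

def bernoulli_kl(p, q):
    """KL divergence kl(p, q) between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12  # clip away from {0, 1} to avoid log(0)
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def lai_robbins_constant(means):
    """Coefficient c in the classical single-agent regret lower bound
    (c + o(1)) * log T for a Bernoulli bandit with the given arm means."""
    best = max(means)
    return sum((best - m) / bernoulli_kl(m, best) for m in means if m < best)
```

For the two-arm instance with means 0.5 and 0.4 this evaluates to roughly 4.97, so regret must grow at least like $4.97 \log T$; the paper's contribution is establishing analogously tight rates when many agents interact over a graph.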
Related papers
- Learning to Explore with Lagrangians for Bandits under Unknown Linear Constraints [8.784438985280094]
We study problems as pure exploration in multi-armed bandits with unknown linear constraints.
First, we propose a Lagrangian relaxation of the sample complexity lower bound for pure exploration under constraints.
Second, we leverage the Lagrangian lower bound and the properties of convexity to propose two computationally efficient extensions of Track-and-Stop and Gamified Explorer, namely LATS and LAGEX.
arXiv Detail & Related papers (2024-10-24T15:26:14Z)
- Best Arm Identification with Minimal Regret [55.831935724659175]
The best arm identification problem elegantly amalgamates regret minimization and BAI.
The agent's goal is to identify the best arm with a prescribed confidence level.
The Double KL-UCB algorithm achieves optimality as the confidence level tends to zero.
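The KL-UCB-style index underlying this algorithm family can be sketched as follows (an illustrative single-index computation of ours, not the paper's Double KL-UCB): for each arm, take the largest mean compatible with the empirical average at a log(t) confidence level, found by bisection since kl(p, ·) is increasing above p.

```python
import math

def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12  # clip away from {0, 1} to avoid log(0)
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def klucb_index(mean_hat, pulls, t, tol=1e-6):
    """Largest q in [mean_hat, 1] with pulls * kl(mean_hat, q) <= log(t),
    found by bisection (kl(mean_hat, .) is increasing on [mean_hat, 1])."""
    level = math.log(t) / pulls
    lo, hi = mean_hat, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bernoulli_kl(mean_hat, mid) <= level:
            lo = mid
        else:
            hi = mid
    return lo
```

Fewer pulls yield a wider confidence interval and hence a larger index, which is what drives exploration.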
arXiv Detail & Related papers (2024-09-27T16:46:02Z)
- Closing the Gap Between the Upper Bound and the Lower Bound of Adam's Iteration Complexity [51.96093077151991]
We derive a new convergence guarantee of Adam, with only an $L$-smooth condition and a bounded noise variance assumption.
Our proof utilizes novel techniques to handle the entanglement between momentum and adaptive learning rate.
arXiv Detail & Related papers (2023-10-27T09:16:58Z)
- Lower Bounds for Learning in Revealing POMDPs [88.23337313766355]
This paper studies the fundamental limits of reinforcement learning (RL) in the challenging partially observable setting.
For multi-step revealing POMDPs, we show that the latent state-space dependence is at least $\Omega(S^{1.5})$ in the sample complexity.
arXiv Detail & Related papers (2023-02-02T18:59:30Z)
- Precise Regret Bounds for Log-loss via a Truncated Bayesian Algorithm [14.834625066344582]
We study sequential general online regression, also known as sequential probability assignment, under logarithmic loss.
We focus on obtaining tight, often matching, lower and upper bounds for the sequential minimax regret, defined as the excess loss incurred over a class of experts.
arXiv Detail & Related papers (2022-05-07T22:03:00Z)
- Sharp Bounds for Federated Averaging (Local SGD) and Continuous Perspective [49.17352150219212]
Federated Averaging (FedAvg), also known as Local SGD, is one of the most popular algorithms in Federated Learning (FL).
We show how to analyze this quantity from the Stochastic Differential Equation (SDE) perspective.
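The FedAvg/Local SGD scheme itself can be sketched in a few lines (our minimal illustration on 1-D quadratic client objectives with equal curvature, not the paper's analysis): each round, every client runs a few local gradient steps from the shared iterate, and the server averages the results.

```python
def fedavg_quadratics(centers, rounds=30, local_steps=5, lr=0.1, x0=10.0):
    """FedAvg / Local SGD on client objectives f_i(x) = (x - c_i)^2 / 2.

    Each round, every client takes `local_steps` gradient steps from the
    shared iterate, and the server averages the local results. With equal
    curvatures the iterates converge to the mean of the client optima.
    """
    x = x0
    for _ in range(rounds):
        local_iterates = []
        for c in centers:
            xi = x
            for _ in range(local_steps):
                xi -= lr * (xi - c)  # gradient of (xi - c)^2 / 2 is (xi - c)
            local_iterates.append(xi)
        x = sum(local_iterates) / len(local_iterates)  # server averaging
    return x
```

With heterogeneous curvatures or noisy gradients the fixed point shifts away from the true minimizer, which is the client-drift phenomenon that sharp FedAvg analyses must account for.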
arXiv Detail & Related papers (2021-11-05T22:16:11Z)
- Unified lower bounds for interactive high-dimensional estimation under information constraints [40.339506154827106]
We provide a unified framework enabling us to derive a variety of (tight) minimax lower bounds for different parametric families of distributions.
Our lower bound framework is versatile and yields "plug-and-play" bounds that are widely applicable to a large range of estimation problems.
arXiv Detail & Related papers (2020-10-13T17:25:19Z)
- On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization [55.937424268654645]
We consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm.
We provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability.
arXiv Detail & Related papers (2020-08-20T03:48:14Z)
- Tight Lower Bounds for Combinatorial Multi-Armed Bandits [72.56064196252498]
The Combinatorial Multi-Armed Bandit problem is a sequential decision-making problem in which an agent selects a set of arms on each round.
We show that the recently proposed Gini-weighted smoothness parameter determines the lower bounds for monotone reward functions.
arXiv Detail & Related papers (2020-02-13T08:53:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.