Multi-agent Multi-armed Bandit with Fully Heavy-tailed Dynamics
- URL: http://arxiv.org/abs/2501.19239v1
- Date: Fri, 31 Jan 2025 15:53:14 GMT
- Title: Multi-agent Multi-armed Bandit with Fully Heavy-tailed Dynamics
- Authors: Xingyu Wang, Mengfan Xu,
- Abstract summary: We study decentralized multi-agent multi-armed bandits in fully heavy-tailed settings, where clients communicate over sparse random graphs with heavy-tailed degree distributions.
We are the first to address such fully heavy-tailed scenarios, which capture the dynamics and challenges in communication and inference among multiple clients in real-world systems.
- Score: 5.54892060291103
- License:
- Abstract: We study decentralized multi-agent multi-armed bandits in fully heavy-tailed settings, where clients communicate over sparse random graphs with heavy-tailed degree distributions and observe heavy-tailed (homogeneous or heterogeneous) reward distributions with potentially infinite variance. The objective is to maximize system performance by pulling the globally optimal arm with the highest global reward mean across all clients. We are the first to address such fully heavy-tailed scenarios, which capture the dynamics and challenges in communication and inference among multiple clients in real-world systems. In homogeneous settings, our algorithmic framework exploits hub-like structures unique to heavy-tailed graphs, allowing clients to aggregate rewards and reduce noises via hub estimators when constructing UCB indices; under $M$ clients and degree distributions with power-law index $\alpha > 1$, our algorithm attains a regret bound (almost) of order $O(M^{1 -\frac{1}{\alpha}} \log{T})$. Under heterogeneous rewards, clients synchronize by communicating with neighbors, aggregating exchanged estimators in UCB indices; With our newly established information delay bounds on sparse random graphs, we prove a regret bound of $O(M \log{T})$. Our results improve upon existing work, which only address time-invariant connected graphs, or light-tailed dynamics in dense graphs and rewards.
Related papers
- Continuous K-Max Bandits [54.21533414838677]
We study the $K$-Max multi-armed bandits problem with continuous outcome distributions and weak value-index feedback.
This setting captures critical applications in recommendation systems, distributed computing, server scheduling, etc.
Our key contribution is the computationally efficient algorithm DCK-UCB, which combines adaptive discretization with bias-corrected confidence bounds.
arXiv Detail & Related papers (2025-02-19T06:37:37Z) - Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency [52.60557300927007]
We present a $textbfMA-OSMA$ algorithm to transfer the discrete submodular problem into a continuous optimization.
We also introduce a projection-free $textbfMA-OSEA$ algorithm, which effectively utilizes the KL divergence by mixing a uniform distribution.
Our algorithms significantly improve the $(frac11+c)$-approximation provided by the state-of-the-art OSG algorithm.
arXiv Detail & Related papers (2025-02-07T15:57:56Z) - Cooperative Multi-Agent Constrained Stochastic Linear Bandits [2.099922236065961]
A network of $N$ agents communicate locally to minimize their collective regret while keeping their expected cost under a specified threshold $tau$.
We propose a safe distributed upper confidence bound algorithm, so called textitMA-OPLB, and establish a high probability bound on its $T$-round regret.
We show that our regret bound is of order $ mathcalOleft(fracdtau-c_0fraclog(NT)2sqrtNsqrtTlog (1/|lambda|)
arXiv Detail & Related papers (2024-10-22T19:34:53Z) - Communication-Efficient Federated Group Distributionally Robust Optimization [49.14751984342068]
Federated learning faces challenges due to the heterogeneity in data volumes and distributions at different clients.
Existing approaches to address this issue based on group distributionally robust optimization (GDRO) often lead to high communication and sample complexity.
This work introduces algorithms tailored for communication-efficient Federated Group Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-10-08T21:07:53Z) - Distributed Optimization via Kernelized Multi-armed Bandits [6.04275169308491]
We model a distributed optimization problem as a multi-agent kernelized multi-armed bandit problem with a heterogeneous reward setting.
We present a fully decentralized algorithm, Multi-agent IGP-UCB (MA-IGP-UCB), which achieves a sub-linear regret bound for popular classes for kernels.
We also propose an extension, Multi-agent Delayed IGP-UCB (MAD-IGP-UCB) algorithm, which reduces the dependence of the regret bound on the number of agents in the network.
arXiv Detail & Related papers (2023-12-07T21:57:48Z) - Fed-GraB: Federated Long-tailed Learning with Self-Adjusting Gradient
Balancer [47.82735112096587]
This paper investigates a federated long-tailed learning (Fed-LT) task in which each client holds a locally heterogeneous dataset.
We propose a method termed $textttFed-GraB$, comprised of a Self-Natural Gradient Balancer (SGB) module.
We show that $textttFed-GraB$ achieves state-of-the-art performance on representative datasets such as CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iist.
arXiv Detail & Related papers (2023-10-11T15:28:39Z) - Decentralized Randomly Distributed Multi-agent Multi-armed Bandit with
Heterogeneous Rewards [14.822625665220068]
We study a decentralized multi-agent multi-armed bandit problem in which multiple clients are connected by time dependent random graphs provided by an environment.
The goal is to minimize the overall regret of the entire system through collaborations.
arXiv Detail & Related papers (2023-06-08T22:37:24Z) - Rate-Distortion Theoretic Bounds on Generalization Error for Distributed
Learning [9.00236182523638]
In this paper, we use tools from rate-distortion theory to establish new upper bounds on the generalization error of statistical distributed learning algorithms.
The bounds depend on the compressibility of each client's algorithm while keeping other clients' algorithms un-compressed.
arXiv Detail & Related papers (2022-06-06T13:21:52Z) - Collaborative Multi-agent Stochastic Linear Bandits [28.268809091816287]
We study a collaborative multi-agent linear bandit setting, where $N$ agents that form a network communicate locally to minimize their overall regret.
All the agents observe the corresponding rewards of the played actions and use an accelerated consensus procedure to compute an estimate of the average of the rewards obtained by all the agents.
arXiv Detail & Related papers (2022-05-12T19:46:35Z) - Low-Latency Federated Learning over Wireless Channels with Differential
Privacy [142.5983499872664]
In federated learning (FL), model training is distributed over clients and local models are aggregated by a central server.
In this paper, we aim to minimize FL training delay over wireless channels, constrained by overall training performance as well as each client's differential privacy (DP) requirement.
arXiv Detail & Related papers (2021-06-20T13:51:18Z) - Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
textttFedGLOMO is the first (first-order) FLtexttFedGLOMO algorithm.
Our algorithm is provably optimal even with communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.