Related papers: On-Demand Communication for Asynchronous Multi-Agent Bandits

On-Demand Communication for Asynchronous Multi-Agent Bandits

URL: http://arxiv.org/abs/2302.07446v2
Date: Thu, 31 Aug 2023 02:28:41 GMT
Title: On-Demand Communication for Asynchronous Multi-Agent Bandits
Authors: Yu-Zhen Janice Chen, Lin Yang, Xuchuang Wang, Xutong Liu, Mohammad Hajiesmaili, John C.S. Lui, Don Towsley
Abstract summary: ODC is an on-demand communication protocol that tailors the communication of each pair of agents based on their empirical pull times. ODC is efficient when the pull times of agents are highly heterogeneous, and its communication complexity depends on the empirical pull times of agents.
Score: 43.3383282058292
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper studies a cooperative multi-agent multi-armed stochastic bandit problem where agents operate asynchronously -- agent pull times and rates are unknown, irregular, and heterogeneous -- and face the same instance of a K-armed bandit problem. Agents can share reward information to speed up the learning process at additional communication costs. We propose ODC, an on-demand communication protocol that tailors the communication of each pair of agents based on their empirical pull times. ODC is efficient when the pull times of agents are highly heterogeneous, and its communication complexity depends on the empirical pull times of agents. ODC is a generic protocol that can be integrated into most cooperative bandit algorithms without degrading their performance. We then incorporate ODC into the natural extensions of UCB and AAE algorithms and propose two communication-efficient cooperative algorithms. Our analysis shows that both algorithms are near-optimal in regret.

Related papers

Agent Context Protocols Enhance Collective Inference [25.611935537078825]
Agent context protocols (ACPs) are a domain- and agent-agnostic family of structured protocols for agent-agent communication, coordination, and error handling.<n>ACPs combine persistent execution blueprints and standardized message schemas, enabling robust and fault-tolerant collective inference.<n>ACP-powered generalist systems reach state-of-the-art performance: 28.3 % accuracy on AssistantBench for long-horizon web assistance and best-in-class multimodal technical reports.
arXiv Detail & Related papers (2025-05-20T16:28:08Z)
A Multiagent Path Search Algorithm for Large-Scale Coalition Structure Generation [61.08720171136229]
Coalition structure generation is a fundamental computational problem in multiagent systems. We develop SALDAE, a multiagent path finding algorithm for CSG that operates on a graph of coalition structures.
arXiv Detail & Related papers (2025-02-14T15:21:27Z)
Federated Contextual Cascading Bandits with Asynchronous Communication and Heterogeneous Users [95.77678166036561]
We propose a UCB-type algorithm with delicate communication protocols. We give sub-linear regret bounds on par with those achieved in the synchronous framework. Empirical evaluation on synthetic and real-world datasets validates our algorithm's superior performance in terms of regrets and communication costs.
arXiv Detail & Related papers (2024-02-26T05:31:14Z)
Pure Exploration in Asynchronous Federated Bandits [57.02106627533004]
We study the federated pure exploration problem of multi-armed bandits and linear bandits, where $M$ agents cooperatively identify the best arm via communicating with the central server. We propose the first asynchronous multi-armed bandit and linear bandit algorithms for pure exploration with fixed confidence.
arXiv Detail & Related papers (2023-10-17T06:04:00Z)
ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents. ProAgent can analyze the present state, and infer the intentions of teammates from observations. ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
On Regret-optimal Cooperative Nonstochastic Multi-armed Bandits [7.23389716633927]
We show that a collaborative multi-agent emphfollow-the-regularized-leader (FTRL) algorithm has an individual regret upper bound that matches the lower bound up to a constant factor. We also show that an FTRL algorithm with a suitable regularizer is regret optimal with respect to the scaling with the edge-delay parameter.
arXiv Detail & Related papers (2022-11-30T16:46:41Z)
Private and Byzantine-Proof Cooperative Decision-Making [15.609414012418043]
The cooperative bandit problem is a multi-agent decision problem involving a group of agents that interact simultaneously with a multi-armed bandit. In this paper, we investigate the bandit problem under two settings - (a) when the agents wish to make their communication private with respect to the action sequence, and (b) when the agents can be byzantine. We provide upper-confidence bound algorithms that obtain optimal regret while being (a) differentially-private and (b) private. Our decentralized algorithms require no information about the network of connectivity between agents, making them scalable to large dynamic systems.
arXiv Detail & Related papers (2022-05-27T18:03:54Z)
Depthwise Convolution for Multi-Agent Communication with Enhanced Mean-Field Approximation [9.854975702211165]
We propose a new method based on local communication learning to tackle the multi-agent RL (MARL) challenge. First, we design a new communication protocol that exploits the ability of depthwise convolution to efficiently extract local relations. Second, we introduce the mean-field approximation into our method to reduce the scale of agent interactions.
arXiv Detail & Related papers (2022-03-06T07:42:43Z)
Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints. We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
arXiv Detail & Related papers (2021-12-03T19:23:48Z)
Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation [55.96893934962757]
In multi-agent system, polices of different agents need to be evaluated jointly. In current methods, value functions or advantage functions use counter-factual joint actions which are evaluated asynchronously. In this work, we propose the approximatively synchronous advantage estimation.
arXiv Detail & Related papers (2020-12-07T07:29:19Z)
Kernel Methods for Cooperative Multi-Agent Contextual Bandits [15.609414012418043]
Cooperative multi-agent decision making involves a group of agents cooperatively solving learning problems while communicating over a network with delays. We consider the kernelised contextual bandit problem, where the reward obtained by an agent is an arbitrary linear function of the contexts' images in the related kernel reproducing Hilbert space (RKHS) We propose textscCoop- KernelUCB, an algorithm that provides near-optimal bounds on the per-agent regret.
arXiv Detail & Related papers (2020-08-14T07:37:44Z)
R-MADDPG for Partially Observable Environments and Limited Communication [42.771013165298186]
This paper introduces a deep recurrent multiagent actor-critic framework (R-MADDPG) for handling multiagent coordination under partial observable set-tings and limited communication. We demonstrate that the resulting framework learns time dependencies for sharing missing observations, handling resource limitations, and developing different communication patterns among agents.
arXiv Detail & Related papers (2020-02-16T21:25:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.