Combining Diverse Information for Coordinated Action: Stochastic Bandit Algorithms for Heterogeneous Agents
- URL: http://arxiv.org/abs/2408.03405v1
- Date: Tue, 6 Aug 2024 18:56:29 GMT
- Title: Combining Diverse Information for Coordinated Action: Stochastic Bandit Algorithms for Heterogeneous Agents
- Authors: Lucia Gordon, Esther Rolf, Milind Tambe
- Abstract summary: Multi-agent bandits typically assume that rewards from each arm follow a fixed distribution.
In many real-world settings, however, rewards can depend on the sensitivity of each agent to its environment.
We introduce a UCB-style algorithm, Min-Width, which aggregates information from diverse agents.
- Score: 26.075152706845454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stochastic multi-agent multi-armed bandits typically assume that the rewards from each arm follow a fixed distribution, regardless of which agent pulls the arm. However, in many real-world settings, rewards can depend on the sensitivity of each agent to their environment. In medical screening, disease detection rates can vary by test type; in preference matching, rewards can depend on user preferences; and in environmental sensing, observation quality can vary across sensors. Since past work does not specify how to allocate agents with heterogeneous but known sensitivities of these types in a stochastic bandit setting, we introduce a UCB-style algorithm, Min-Width, which aggregates information from diverse agents. In doing so, we address the joint challenges of (i) aggregating the rewards, which follow different distributions for each agent-arm pair, and (ii) coordinating the assignments of agents to arms. Min-Width facilitates efficient collaboration among heterogeneous agents, exploiting the known structure in the agents' reward functions to weight their rewards accordingly. We analyze the regret of Min-Width and conduct pseudo-synthetic and fully synthetic experiments to study the performance of different levels of information sharing. Our results confirm that the gains from modeling agent heterogeneity tend to be greater when the sensitivities are more varied across agents, while combining more information does not always improve performance.
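To make the aggregation idea concrete, here is a minimal Python sketch of sensitivity-weighted UCB estimation. It assumes a hypothetical linear observation model in which agent m pulling arm k observes s[m] * mu[k] plus unit Gaussian noise, with the sensitivities s known to the learner; the variable names and the greedy per-agent assignment are illustrative assumptions, not the paper's actual Min-Width rule.

```python
import numpy as np

# Hypothetical sketch of sensitivity-weighted UCB aggregation; NOT the
# paper's exact Min-Width algorithm. Assumed model: agent m pulling an
# arm observes r = s[m] * mu[arm] + unit Gaussian noise, s known.
rng = np.random.default_rng(0)

n_agents, n_arms, horizon = 3, 5, 2000
s = np.array([0.5, 1.0, 2.0])        # known agent sensitivities (assumed)
mu = rng.uniform(0.0, 1.0, n_arms)   # latent arm means, unknown to learner

# An observation r from agent m gives the unbiased estimate r / s[m] of
# mu[arm] with variance 1 / s[m]^2, so its precision weight is s[m]^2.
precision = np.zeros(n_arms)         # accumulated s[m]^2 per arm
weighted_sum = np.zeros(n_arms)      # accumulated s[m] * r per arm

for t in range(1, horizon + 1):
    for m in range(n_agents):
        # Precision-weighted mean estimate and confidence width per arm;
        # unpulled arms get infinite width so they are tried first.
        safe = np.maximum(precision, 1e-12)
        est = np.where(precision > 0, weighted_sum / safe, 0.0)
        width = np.where(precision > 0,
                         np.sqrt(2.0 * np.log(t + 1) / safe), np.inf)
        arm = int(np.argmax(est + width))   # UCB-style choice
        r = s[m] * mu[arm] + rng.normal()   # assumed observation model
        weighted_sum[arm] += s[m] * r       # (r / s[m]) weighted by s[m]^2
        precision[arm] += s[m] ** 2

best = int(np.argmax(mu))
print(f"best arm {best}, estimated means "
      f"{np.round(weighted_sum / np.maximum(precision, 1e-12), 3)}")
```

Under this model each pull by agent m contributes precision s[m]^2 to the arm's estimate, so pulls by more sensitive agents shrink the confidence width fastest, which is one way to realize the weighting the abstract describes; the precise Min-Width weighting and coordination scheme are given in the paper itself.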
Related papers
- An Extensible Framework for Open Heterogeneous Collaborative Perception [58.70875361688463]
Collaborative perception aims to mitigate the limitations of single-agent perception.
In this paper, we introduce a new open heterogeneous problem: how to accommodate continually emerging new heterogeneous agent types into collaborative perception.
We propose HEterogeneous ALliance (HEAL), a novel collaborative perception framework.
arXiv Detail & Related papers (2024-01-25T05:55:03Z)
- DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning [84.22561239481901]
We propose a new approach that enables agents to learn whether their behaviors should be consistent with those of other agents.
We evaluate DCIR in multiple environments including Multi-agent Particle, Google Research Football and StarCraft II Micromanagement.
arXiv Detail & Related papers (2023-12-10T06:03:57Z)
- Pure Exploration under Mediators' Feedback [63.56002444692792]
Multi-armed bandits are a sequential-decision-making framework, where, at each interaction step, the learner selects an arm and observes a reward.
We consider the scenario in which the learner has access to a set of mediators, each of which selects the arms on the agent's behalf according to a possibly unknown policy.
We propose a sequential decision-making strategy for discovering the best arm under the assumption that the mediators' policies are known to the learner.
arXiv Detail & Related papers (2023-08-29T18:18:21Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- Near-Optimal Collaborative Learning in Bandits [15.456561090871244]
This paper introduces a general multi-agent bandit model in which each agent is facing a finite set of arms.
The twist is that the optimal arm for each agent is the arm with the largest expected mixed reward, where the mixed reward of an arm is a weighted sum of the rewards of that arm across all agents.
We propose a near-optimal algorithm for pure exploration.
arXiv Detail & Related papers (2022-05-31T21:11:47Z)
- Learning Multi-agent Skills for Tabular Reinforcement Learning using Factor Graphs [41.17714498464354]
We show that it is possible to directly compute multi-agent options with collaborative exploratory behaviors among the agents.
The proposed algorithm can successfully identify multi-agent options, and significantly outperforms prior works using single-agent options or no options.
arXiv Detail & Related papers (2022-01-20T15:33:08Z)
- Unlimited Neighborhood Interaction for Heterogeneous Trajectory Prediction [97.40338982628094]
We propose a simple yet effective Unlimited Neighborhood Interaction Network (UNIN), which predicts trajectories of heterogeneous agents in multiple categories.
Specifically, the proposed unlimited neighborhood interaction module generates the fused features of all agents involved in an interaction simultaneously.
A hierarchical graph attention module is proposed to obtain category-to-category interaction and agent-to-agent interaction.
arXiv Detail & Related papers (2021-07-31T13:36:04Z)
- Heterogeneous Explore-Exploit Strategies on Multi-Star Networks [0.0]
We study a class of distributed bandit problems in which agents communicate over a multi-star network.
We propose new heterogeneous explore-exploit strategies, using the multi-star as a model of irregular network graphs.
arXiv Detail & Related papers (2020-09-02T20:56:49Z)
- Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning [59.62721526353915]
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities.
Our method aims to leverage these commonalities by asking the question: "What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?"
arXiv Detail & Related papers (2020-06-07T18:28:41Z)
- Individual specialization in multi-task environments with multiagent reinforcement learners [0.0]
There is a growing interest in Multi-Agent Reinforcement Learning (MARL) as a first step towards building generally intelligent agents.
Previous results point us towards the conditions needed for coordination, efficiency/fairness, and common-pool resource sharing.
We further study coordination in multi-task environments where several rewarding tasks can be performed and thus agents don't necessarily need to perform well in all tasks, but under certain conditions may specialize.
arXiv Detail & Related papers (2019-12-29T15:20:24Z)