Related papers: Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams

Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams

URL: http://arxiv.org/abs/2110.00751v1
Date: Sat, 2 Oct 2021 08:17:30 GMT
Title: Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams
Authors: Erdem B{\i}y{\i}k, Anusha Lalitha, Rajarshi Saha, Andrea Goldsmith, Dorsa Sadigh
Abstract summary: We propose and analyze a decentralized Multi-Armed Bandit (MAB) problem with coupled rewards as an abstraction of more general multi-agent collaboration. We propose a Partner-Aware strategy for joint sequential decision-making that extends the well-known single-agent Upper Confidence Bound algorithm. Our results show that the proposed partner-aware strategy outperforms other known methods, and our human subject studies suggest humans prefer to collaborate with AI agents implementing our partner-aware strategy.
Score: 14.215359943041369
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: When humans collaborate with each other, they often make decisions by observing others and considering the consequences that their actions may have on the entire team, instead of greedily doing what is best for just themselves. We would like our AI agents to effectively collaborate in a similar way by capturing a model of their partners. In this work, we propose and analyze a decentralized Multi-Armed Bandit (MAB) problem with coupled rewards as an abstraction of more general multi-agent collaboration. We demonstrate that na\"ive extensions of single-agent optimal MAB algorithms fail when applied for decentralized bandit teams. Instead, we propose a Partner-Aware strategy for joint sequential decision-making that extends the well-known single-agent Upper Confidence Bound algorithm. We analytically show that our proposed strategy achieves logarithmic regret, and provide extensive experiments involving human-AI and human-robot collaboration to validate our theoretical findings. Our results show that the proposed partner-aware strategy outperforms other known methods, and our human subject studies suggest humans prefer to collaborate with AI agents implementing our partner-aware strategy.

Related papers

Human-AI Collaboration: Trade-offs Between Performance and Preferences [5.172575113585139]
We show that agents who are more considerate of human actions are preferred over purely performance-maximizing agents. We find evidence for inequality-aversion effects being a driver of human choices, suggesting that people prefer collaborative agents which allow them to meaningfully contribute to the team.
arXiv Detail & Related papers (2025-02-28T23:50:14Z)
Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration [51.452664740963066]
Collaborative Gym is a framework enabling asynchronous, tripartite interaction among agents, humans, and task environments. We instantiate Co-Gym with three representative tasks in both simulated and real-world conditions. Our findings reveal that collaborative agents consistently outperform their fully autonomous counterparts in task performance.
arXiv Detail & Related papers (2024-12-20T09:21:15Z)
Multi-Agent Best Arm Identification in Stochastic Linear Bandits [0.7673339435080443]
We study the problem of collaborative best-arm identification in linear bandits under a fixed-budget scenario.<n>We propose two algorithms, MaLinBAI-Star and MaLinBAI-Gen for star networks and networks with arbitrary structure.
arXiv Detail & Related papers (2024-11-20T20:09:44Z)
ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents. ProAgent can analyze the present state, and infer the intentions of teammates from observations. ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination [36.33334853998621]
We introduce the Cooperative Open-ended LEarning (COLE) framework to solve cooperative incompatibility in learning. COLE formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy. We show that COLE could effectively overcome the cooperative incompatibility from theoretical and empirical analysis.
arXiv Detail & Related papers (2023-06-05T16:51:38Z)
A Reinforcement Learning-assisted Genetic Programming Algorithm for Team Formation Problem Considering Person-Job Matching [70.28786574064694]
A reinforcement learning-assisted genetic programming algorithm (RL-GP) is proposed to enhance the quality of solutions. The hyper-heuristic rules obtained through efficient learning can be utilized as decision-making aids when forming project teams.
arXiv Detail & Related papers (2023-04-08T14:32:12Z)
PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination [52.991211077362586]
We propose a policy ensemble method to increase the diversity of partners in the population. We then develop a context-aware method enabling the ego agent to analyze and identify the partner's potential policy primitives. In this way, the ego agent is able to learn more universal cooperative behaviors for collaborating with diverse partners.
arXiv Detail & Related papers (2023-01-16T12:14:58Z)
Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space. Also, we propose a novel framework to adopt the multi-agent options in the MARL process. We show that the proposed algorithm can effectively capture the agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperforms prior works using single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z)
Communication-Efficient Collaborative Best Arm Identification [6.861971769602314]
We investigate top-$m$ arm identification, a basic problem in bandit theory, in a multi-agent learning model in which agents collaborate to learn an objective function. We are interested in designing collaborative learning algorithms that achieve maximum speedup.
arXiv Detail & Related papers (2022-08-18T19:02:29Z)
Any-Play: An Intrinsic Augmentation for Zero-Shot Coordination [0.4153433779716327]
We formalize an alternative criteria for evaluating cooperative AI, referred to as inter-algorithm cross-play. We show that existing state-of-the-art cooperative AI algorithms, such as Other-Play and Off-Belief Learning, under-perform in this paradigm. We propose the Any-Play learning augmentation for generalizing self-play-based algorithms to the inter-algorithm cross-play setting.
arXiv Detail & Related papers (2022-01-28T21:43:58Z)
Conditional Imitation Learning for Multi-Agent Games [89.897635970366]
We study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time. We propose a novel approach to address the difficulties of scalability and data scarcity. Our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace.
arXiv Detail & Related papers (2022-01-05T04:40:13Z)
Emergence of Theory of Mind Collaboration in Multiagent Systems [65.97255691640561]
We propose an adaptive training algorithm to develop effective collaboration between agents with ToM. We evaluate our algorithms with two games, where our algorithm surpasses all previous decentralized execution algorithms without modeling ToM.
arXiv Detail & Related papers (2021-09-30T23:28:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.