Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams
- URL: http://arxiv.org/abs/2110.00751v1
- Date: Sat, 2 Oct 2021 08:17:30 GMT
- Title: Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams
- Authors: Erdem Bıyık, Anusha Lalitha, Rajarshi Saha, Andrea Goldsmith, Dorsa Sadigh
- Abstract summary: We propose and analyze a decentralized Multi-Armed Bandit (MAB) problem with coupled rewards as an abstraction of more general multi-agent collaboration.
We propose a Partner-Aware strategy for joint sequential decision-making that extends the well-known single-agent Upper Confidence Bound algorithm.
Our results show that the proposed partner-aware strategy outperforms other known methods, and our human subject studies suggest humans prefer to collaborate with AI agents implementing our partner-aware strategy.
- Score: 14.215359943041369
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When humans collaborate with each other, they often make decisions by
observing others and considering the consequences that their actions may have
on the entire team, instead of greedily doing what is best for just themselves.
We would like our AI agents to effectively collaborate in a similar way by
capturing a model of their partners. In this work, we propose and analyze a
decentralized Multi-Armed Bandit (MAB) problem with coupled rewards as an
abstraction of more general multi-agent collaboration. We demonstrate that
naïve extensions of single-agent optimal MAB algorithms fail when applied to
decentralized bandit teams. Instead, we propose a Partner-Aware strategy for
joint sequential decision-making that extends the well-known single-agent Upper
Confidence Bound algorithm. We analytically show that our proposed strategy
achieves logarithmic regret, and provide extensive experiments involving
human-AI and human-robot collaboration to validate our theoretical findings.
Our results show that the proposed partner-aware strategy outperforms other
known methods, and our human subject studies suggest humans prefer to
collaborate with AI agents implementing our partner-aware strategy.
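For readers unfamiliar with the single-agent baseline the abstract refers to, the following is a minimal sketch of the standard UCB1 rule on Bernoulli arms. It is not the paper's Partner-Aware strategy, which the abstract does not specify. The function name `ucb1`, the arm means, the horizon, and the exploration constant 2 are all illustrative assumptions.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Standard single-agent UCB1 on Bernoulli arms (illustrative sketch only).

    Returns the per-arm pull counts and the cumulative expected regret.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was pulled
    sums = [0.0] * n_arms   # total observed reward per arm
    best_mean = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: pull every arm once.
            arm = t - 1
        else:
            # Pick the arm with the highest empirical mean plus confidence bonus.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # Expected regret: gap between the best arm and the chosen arm.
        regret += best_mean - arm_means[arm]
    return counts, regret


# Hypothetical three-armed instance; the means are made up for illustration.
counts, regret = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(counts, round(regret, 1))
```

In this single-agent setting the regret of UCB1 grows logarithmically with the horizon. The abstract states that naïve extensions of such algorithms fail for decentralized teams with coupled rewards, which is what motivates the partner-aware extension.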
Related papers
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state, and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination [36.33334853998621]
We introduce the Cooperative Open-ended LEarning (COLE) framework to solve cooperative incompatibility in learning.
COLE formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy.
We show that COLE could effectively overcome the cooperative incompatibility from theoretical and empirical analysis.
arXiv Detail & Related papers (2023-06-05T16:51:38Z)
- A Reinforcement Learning-assisted Genetic Programming Algorithm for Team Formation Problem Considering Person-Job Matching [70.28786574064694]
A reinforcement learning-assisted genetic programming algorithm (RL-GP) is proposed to enhance the quality of solutions.
The hyper-heuristic rules obtained through efficient learning can be utilized as decision-making aids when forming project teams.
arXiv Detail & Related papers (2023-04-08T14:32:12Z)
- PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination [52.991211077362586]
We propose a policy ensemble method to increase the diversity of partners in the population.
We then develop a context-aware method enabling the ego agent to analyze and identify the partner's potential policy primitives.
In this way, the ego agent is able to learn more universal cooperative behaviors for collaborating with diverse partners.
arXiv Detail & Related papers (2023-01-16T12:14:58Z)
- Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture the agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperform prior works using single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z)
- Communication-Efficient Collaborative Best Arm Identification [6.861971769602314]
We investigate top-$m$ arm identification, a basic problem in bandit theory, in a multi-agent learning model in which agents collaborate to learn an objective function.
We are interested in designing collaborative learning algorithms that achieve maximum speedup.
arXiv Detail & Related papers (2022-08-18T19:02:29Z)
- Any-Play: An Intrinsic Augmentation for Zero-Shot Coordination [0.4153433779716327]
We formalize an alternative criterion for evaluating cooperative AI, referred to as inter-algorithm cross-play.
We show that existing state-of-the-art cooperative AI algorithms, such as Other-Play and Off-Belief Learning, under-perform in this paradigm.
We propose the Any-Play learning augmentation for generalizing self-play-based algorithms to the inter-algorithm cross-play setting.
arXiv Detail & Related papers (2022-01-28T21:43:58Z)
- Conditional Imitation Learning for Multi-Agent Games [89.897635970366]
We study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time.
We propose a novel approach to address the difficulties of scalability and data scarcity.
Our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace.
arXiv Detail & Related papers (2022-01-05T04:40:13Z)
- Emergence of Theory of Mind Collaboration in Multiagent Systems [65.97255691640561]
We propose an adaptive training algorithm to develop effective collaboration between agents with ToM.
We evaluate our algorithms with two games, where our algorithm surpasses all previous decentralized execution algorithms without modeling ToM.
arXiv Detail & Related papers (2021-09-30T23:28:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.