PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination
- URL: http://arxiv.org/abs/2301.06387v4
- Date: Mon, 22 May 2023 13:04:03 GMT
- Title: PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination
- Authors: Xingzhou Lou, Jiaxian Guo, Junge Zhang, Jun Wang, Kaiqi Huang, Yali Du
- Abstract summary: We propose a policy ensemble method to increase the diversity of partners in the population.
We then develop a context-aware method enabling the ego agent to analyze and identify the partner's potential policy primitives.
In this way, the ego agent is able to learn more universal cooperative behaviors for collaborating with diverse partners.
- Score: 52.991211077362586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot human-AI coordination holds the promise of collaborating with
humans without human data. Prevailing methods try to train the ego agent with a
population of partners via self-play. However, these methods suffer from two
problems: 1) The diversity of a population with finite partners is limited,
thereby limiting the capacity of the trained ego agent to collaborate with a
novel human; 2) Current methods only provide a common best response for every
partner in the population, which may result in poor zero-shot coordination
performance with a novel partner or humans. To address these issues, we first
propose the policy ensemble method to increase the diversity of partners in the
population, and then develop a context-aware method enabling the ego agent to
analyze and identify the partner's potential policy primitives so that it can
take different actions accordingly. In this way, the ego agent is able to learn
more universal cooperative behaviors for collaborating with diverse partners.
We conduct experiments on the Overcooked environment, and evaluate the
zero-shot human-AI coordination performance of our method with both
behavior-cloned human proxies and real humans. The results demonstrate that our
method significantly increases the diversity of partners and enables ego agents
to learn more diverse behaviors than baselines, thus achieving state-of-the-art
performance in all scenarios. We also open-source a human-AI coordination study
framework on Overcooked for the convenience of future studies.
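To make the two components concrete, here is a minimal sketch, assuming a discrete-action, Overcooked-style interface; the names (PartnerEnsemble, ContextAwareEgo) and architecture choices are illustrative stand-ins, not PECAN's released code:

```python
import torch
import torch.nn as nn

class PartnerEnsemble:
    """Mixes pretrained partner policies ("primitives") into one partner.
    Resampling the mixture each episode yields far more effective partner
    diversity than the finite base population alone."""

    def __init__(self, primitives):
        self.primitives = primitives  # list of callables: obs -> action
        self.reset()

    def reset(self):
        w = torch.rand(len(self.primitives))  # fresh weights per episode
        self.weights = w / w.sum()

    def act(self, obs):
        # Let a randomly drawn primitive act at this step.
        idx = torch.multinomial(self.weights, 1).item()
        return self.primitives[idx](obs)

class ContextAwareEgo(nn.Module):
    """Ego policy conditioned on a context vector summarizing the partner's
    recent behavior, so different partners elicit different responses."""

    def __init__(self, obs_dim, act_dim, ctx_dim=32):
        super().__init__()
        # GRU encodes the partner's recent (obs, action) history.
        self.encoder = nn.GRU(obs_dim + act_dim, ctx_dim, batch_first=True)
        self.policy = nn.Sequential(
            nn.Linear(obs_dim + ctx_dim, 64), nn.ReLU(),
            nn.Linear(64, act_dim),
        )

    def forward(self, obs, partner_history):
        # partner_history: (batch, time, obs_dim + act_dim)
        _, h = self.encoder(partner_history)
        logits = self.policy(torch.cat([obs, h[-1]], dim=-1))
        return torch.distributions.Categorical(logits=logits)
```

Because the ensemble weights are resampled every episode, the ego agent rarely faces the same effective partner twice, while the recurrent context lets it condition its response on who it is currently playing with.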
Related papers
- Learning to Cooperate with Humans using Generative Agents [40.605931138995714]
Training agents that can coordinate zero-shot with humans is a key mission in multi-agent reinforcement learning (MARL).
We show that learning a generative model of human partners can effectively address this issue.
By sampling from the latent space, we can use the generative model to produce different partners to train Cooperator agents.
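A minimal sketch of that sampling loop, assuming a VAE-style decoder over partner behavior (the PartnerDecoder name and architecture are hypothetical, not the paper's code):

```python
import torch
import torch.nn as nn

class PartnerDecoder(nn.Module):
    """Hypothetical decoder of a generative model fit to human play: the
    latent z acts as a behavioral "persona" conditioning the policy."""

    def __init__(self, latent_dim, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + obs_dim, 64), nn.ReLU(),
            nn.Linear(64, act_dim),
        )

    def forward(self, z, obs):
        logits = self.net(torch.cat([z, obs], dim=-1))
        return torch.distributions.Categorical(logits=logits)

def sample_training_partner(decoder, latent_dim):
    """Each latent sample yields a distinct partner policy, giving the
    Cooperator an effectively unbounded training pool."""
    z = torch.randn(latent_dim)
    return lambda obs: decoder(z, obs).sample()
```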
arXiv Detail & Related papers (2024-11-21T08:36:17Z)
- Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge [47.74313897705183]
CHAIC is an inclusive embodied social intelligence challenge designed to test social perception and cooperation in embodied agents.
In CHAIC, the goal is for an embodied agent equipped with egocentric observations to assist a human who may be operating under physical constraints.
We benchmark planning- and learning-based baselines on the challenge and introduce a new method that leverages large language models and behavior modeling.
arXiv Detail & Related papers (2024-11-04T04:41:12Z)
- Large Language Model-based Human-Agent Collaboration for Complex Task Solving [94.3914058341565]
We introduce the problem of large language model (LLM)-based human-agent collaboration for complex task solving.
We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC.
This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
arXiv Detail & Related papers (2024-02-20T11:03:36Z)
- Efficient Human-AI Coordination via Preparatory Language-based Convention [17.840956842806975]
Existing methods for human-AI coordination typically train an agent to coordinate with a diverse set of policies or with human models fitted from real human data.
We propose employing the large language model (LLM) to develop an action plan that effectively guides both human and AI.
Our method achieves better alignment with human preferences and an average performance improvement of 15% compared to the state-of-the-art.
arXiv Detail & Related papers (2023-11-01T10:18:23Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination [36.33334853998621]
We introduce the Cooperative Open-ended LEarning (COLE) framework to solve cooperative incompatibility in learning.
COLE formulates open-ended objectives in two-player cooperative games from a graph-theoretic perspective to evaluate and pinpoint the cooperative capacity of each strategy.
Theoretical and empirical analysis shows that COLE effectively overcomes cooperative incompatibility.
arXiv Detail & Related papers (2023-06-05T16:51:38Z)
- A Hierarchical Approach to Population Training for Human-AI Collaboration [20.860808795671343]
We introduce a Hierarchical Reinforcement Learning (HRL) based method for Human-AI Collaboration.
We demonstrate that our method is able to dynamically adapt to novel partners of different play styles and skill levels in the 2-player collaborative Overcooked game environment.
arXiv Detail & Related papers (2023-05-26T07:53:12Z)
- Conditional Imitation Learning for Multi-Agent Games [89.897635970366]
We study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time.
We propose a novel approach to address the difficulties of scalability and data scarcity.
Our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace.
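A toy version of the adaptation step, assuming the learned subspace is given as a basis matrix (infer_partner_embedding is an illustrative helper, not the paper's implementation):

```python
import torch

def infer_partner_embedding(basis, behavior_feats):
    """basis: (k, d) rows spanning the learned strategy subspace.
    behavior_feats: (d,) features of the new partner's observed play.
    Returns the closest subspace point, i.e. the interpolated strategy
    embedding the ego policy would condition on."""
    # Least-squares coordinates of the observed behavior in the basis.
    coords = torch.linalg.lstsq(basis.T, behavior_feats.unsqueeze(-1)).solution
    return (basis.T @ coords).squeeze(-1)  # projected (d,) embedding
```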
arXiv Detail & Related papers (2022-01-05T04:40:13Z)
- Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams [14.215359943041369]
We propose and analyze a decentralized Multi-Armed Bandit (MAB) problem with coupled rewards as an abstraction of more general multi-agent collaboration.
We propose a Partner-Aware strategy for joint sequential decision-making that extends the well-known single-agent Upper Confidence Bound algorithm.
Our results show that the proposed partner-aware strategy outperforms other known methods, and our human subject studies suggest humans prefer to collaborate with AI agents implementing our partner-aware strategy.
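One way such an extension can look, sketched under the simplifying assumption that the agent best-responds to the partner's empirically most frequent arm (illustrative, not the paper's exact algorithm):

```python
import numpy as np

class PartnerAwareUCB:
    """Keeps UCB statistics over joint (my_arm, partner_arm) cells and
    best-responds to the partner's empirically most frequent arm."""

    def __init__(self, n_my_arms, n_partner_arms, c=2.0):
        self.counts = np.zeros((n_my_arms, n_partner_arms))
        self.means = np.zeros((n_my_arms, n_partner_arms))
        self.partner_counts = np.zeros(n_partner_arms)
        self.c, self.t = c, 0

    def act(self):
        self.t += 1
        # Predict the partner's next arm from their play so far.
        j = int(np.argmax(self.partner_counts))
        n = self.counts[:, j]
        # Unvisited joint cells get an infinite index, forcing exploration.
        bonus = np.sqrt(self.c * np.log(self.t) / np.maximum(n, 1))
        ucb = np.where(n == 0, np.inf, self.means[:, j] + bonus)
        return int(np.argmax(ucb))

    def update(self, my_arm, partner_arm, reward):
        self.partner_counts[partner_arm] += 1
        self.counts[my_arm, partner_arm] += 1
        n = self.counts[my_arm, partner_arm]
        self.means[my_arm, partner_arm] += (reward - self.means[my_arm, partner_arm]) / n
```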
arXiv Detail & Related papers (2021-10-02T08:17:30Z)
- On the Critical Role of Conventions in Adaptive Human-AI Collaboration [73.21967490610142]
We propose a learning framework that teases apart rule-dependent representation from convention-dependent representation.
We experimentally validate our approach on three collaborative tasks varying in complexity.
arXiv Detail & Related papers (2021-04-07T02:46:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.