A Hierarchical Approach to Population Training for Human-AI
Collaboration
- URL: http://arxiv.org/abs/2305.16708v1
- Date: Fri, 26 May 2023 07:53:12 GMT
- Title: A Hierarchical Approach to Population Training for Human-AI
Collaboration
- Authors: Yi Loo, Chen Gong and Malika Meghjani
- Abstract summary: We introduce a Hierarchical Reinforcement Learning (HRL) based method for Human-AI Collaboration.
We demonstrate that our method is able to dynamically adapt to novel partners of different play styles and skill levels in the 2-player collaborative Overcooked game environment.
- Score: 20.860808795671343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A major challenge for deep reinforcement learning (DRL) agents is to
collaborate with novel partners that were not encountered during the training
phase. This challenge is aggravated when DRL agents collaborate with human
partners, whose inconsistent behavior increases the variance of action
responses. Recent work has shown that training a single agent as the best
response to a diverse population of training partners significantly increases
its robustness to novel partners. We further enhance this population-based
training approach by introducing a Hierarchical Reinforcement Learning (HRL)
based method for Human-AI Collaboration. Our agent learns multiple
best-response policies as its low-level policies while simultaneously learning
a high-level policy that acts as a manager, allowing the agent to dynamically
switch between the low-level best-response policies based on its current
partner. We demonstrate that our method dynamically adapts to novel partners of
different play styles and skill levels in the 2-player collaborative Overcooked
game environment. We also conducted a human study in the same environment to
test the effectiveness of our method when partnering with real human subjects.
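The manager/worker decomposition described in the abstract can be made concrete with a small sketch. Below is a minimal, illustrative Python version in which the learned high-level manager is stood in for by a likelihood score over assumed partner types; the class names, the scoring rule, and the switching interval are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class LowLevelPolicy:
    """Stand-in for a best-response policy trained against one partner type."""
    def __init__(self, partner_action_probs, n_actions=6):
        self.partner_action_probs = np.asarray(partner_action_probs, dtype=float)
        self.n_actions = n_actions

    def partner_likelihood(self, partner_action):
        # How likely this policy's assumed partner type is to take that action.
        return self.partner_action_probs[partner_action]

    def act(self, obs):
        return int(rng.integers(self.n_actions))  # placeholder action selection

class HierarchicalAgent:
    """Illustrative sketch only: the paper learns the manager with RL; here a
    running log-likelihood over partner types plays the manager's role."""
    def __init__(self, policies, switch_interval=10):
        self.policies = policies                # one best response per partner type
        self.switch_interval = switch_interval  # manager acts on a slower timescale
        self.active = 0                         # index of the deployed low-level policy
        self.log_scores = np.zeros(len(policies))

    def observe_partner(self, partner_action, step):
        # Score each low-level policy by how well its assumed partner type
        # explains the observed partner action; periodically redeploy the best.
        for i, pi in enumerate(self.policies):
            self.log_scores[i] += np.log(pi.partner_likelihood(partner_action) + 1e-8)
        if step % self.switch_interval == 0:
            self.active = int(np.argmax(self.log_scores))

    def act(self, obs):
        return self.policies[self.active].act(obs)

# Toy run: the true partner behaves like type 1, so the manager should switch.
types = [np.array([.7, .1, .05, .05, .05, .05]), np.array([.05, .05, .7, .1, .05, .05])]
agent = HierarchicalAgent([LowLevelPolicy(t) for t in types])
for step in range(1, 51):
    partner_action = rng.choice(6, p=types[1])
    agent.observe_partner(partner_action, step)
print("manager selected policy:", agent.active)  # expected: 1
```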
Related papers
- Multi-agent cooperation through learning-aware policy gradients [53.63948041506278]
Self-interested individuals often fail to cooperate, posing a fundamental challenge for multi-agent learning.
We present the first unbiased, higher-derivative-free policy gradient algorithm for learning-aware reinforcement learning.
We derive from the iterated prisoner's dilemma a novel explanation for how and when cooperation arises among self-interested learning-aware agents.
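As a toy illustration of learning awareness (not the paper's unbiased, higher-derivative-free algorithm), the sketch below uses a prisoner's dilemma with mixed strategies: a naive partner ascends its own payoff, while a learning-aware agent evaluates its payoff after the partner's update, so its own strategy shapes how the partner learns. Payoff values and learning rates are assumptions.

```python
import numpy as np

# Payoffs for the row player: R mutual cooperation, T temptation,
# S sucker's payoff, P mutual punishment.
R, S, T, P = 3.0, 0.0, 5.0, 1.0

def v(p, q):
    """Row player's expected payoff when it cooperates w.p. p, partner w.p. q."""
    return p * q * R + p * (1 - q) * S + (1 - p) * q * T + (1 - p) * (1 - q) * P

def naive_partner_update(p, q, lr=0.2, eps=1e-5):
    """The partner ascends its own payoff v(q, p), ignoring that we learn too."""
    grad_q = (v(q + eps, p) - v(q - eps, p)) / (2 * eps)
    return np.clip(q + lr * grad_q, 0.0, 1.0)

def learning_aware_value(p, q):
    """A learning-aware agent scores p by its payoff *after* the partner's
    learning step, so its gradient flows through the partner's update."""
    return v(p, naive_partner_update(p, q))

# The learning-aware gradient differs from the naive one at the same point.
p, q, eps = 0.5, 0.5, 1e-5
grad_p = (learning_aware_value(p + eps, q) - learning_aware_value(p - eps, q)) / (2 * eps)
print(grad_p)  # includes the extra term from the partner's update
```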
arXiv Detail & Related papers (2024-10-24T10:48:42Z)
- Large Language Model-based Human-Agent Collaboration for Complex Task Solving [94.3914058341565]
We introduce the problem of Large Language Model (LLM)-based human-agent collaboration for complex task-solving.
We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC.
This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
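A minimal sketch of this intervention idea, assuming a linear-logistic score and a fixed effort cost; ReHAC's policy model is learned with RL, so the form, features, and numbers here are hypothetical.

```python
import numpy as np

def should_ask_human(state_feats, w, human_cost=0.3):
    """Decide whether human intervention is worthwhile at this stage:
    trigger it only when the estimated gain outweighs the human's effort.
    (Illustrative stand-in for ReHAC's learned policy model.)"""
    est_gain = 1.0 / (1.0 + np.exp(-state_feats @ w))  # predicted benefit of help
    return est_gain > human_cost

# Toy usage with hypothetical features (task difficulty, agent uncertainty).
w = np.array([0.8, 1.2])
print(should_ask_human(np.array([0.9, 0.7]), w))    # hard, uncertain -> True
print(should_ask_human(np.array([-0.9, -0.9]), w))  # easy, confident -> False
```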
arXiv Detail & Related papers (2024-02-20T11:03:36Z)
- Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming [14.250120245287109]
We develop a Human-AI PbRL Cooperation Game, where the RL agent queries the human-in-the-loop to elicit the task objective and the human's preferences on the joint team behavior.
Under this game formulation, we first introduce the notion of Human Flexibility to evaluate team performance based on whether the human prefers to follow a fixed policy or to adapt to the RL agent on the fly.
We highlight a special case along these two dimensions, which we call Specified Orchestration, where the human is least flexible and the agent has complete access to the human's policy.
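The preference-elicitation step can be illustrated with a generic Bradley-Terry update on a linear reward model; this is a standard PbRL ingredient assumed here for illustration and may differ from the paper's exact formulation.

```python
import numpy as np

def preference_update(theta, feats_a, feats_b, human_prefers_a, lr=0.1):
    """One Bradley-Terry gradient step on a linear reward model from a single
    human answer to a pairwise trajectory query."""
    p_a = 1.0 / (1.0 + np.exp((theta @ feats_b) - (theta @ feats_a)))
    grad = (float(human_prefers_a) - p_a) * (feats_a - feats_b)  # d log-lik / d theta
    return theta + lr * grad

# Toy usage: after one "a is better" answer, the model rates a higher.
theta = np.zeros(3)
a, b = np.array([1.0, 0.0, 2.0]), np.array([0.0, 1.0, 0.5])
theta = preference_update(theta, a, b, human_prefers_a=True)
print(theta @ a > theta @ b)  # True
```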
arXiv Detail & Related papers (2023-12-21T20:48:15Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easy to integrate into various coordination scenarios.
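A minimal sketch of the inference loop, with a caller-supplied `llm` completion function and hypothetical prompt wording; ProAgent's actual prompts and modules are more elaborate.

```python
def infer_teammate_intent(observation_log, llm):
    """Summarize recent observations in natural language and ask an LLM for
    the teammate's likely intention (illustrative sketch; `llm` is any
    caller-supplied text-completion function)."""
    prompt = (
        "You observe a teammate in the Overcooked kitchen.\n"
        f"Recent teammate observations: {observation_log}\n"
        "State the teammate's most likely current intention in one sentence."
    )
    return llm(prompt)

# Toy usage with a stub in place of a real model.
stub = lambda prompt: "The teammate is fetching an onion for the pot."
print(infer_teammate_intent(["picked up onion", "moved toward pot"], stub))
```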
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination [52.991211077362586]
We propose a policy ensemble method to increase the diversity of partners in the population.
We then develop a context-aware method enabling the ego agent to analyze and identify the partner's potential policy primitives.
In this way, the ego agent is able to learn more universal cooperative behaviors for collaborating with diverse partners.
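A minimal sketch of the context-aware step, assuming a histogram context encoder and nearest-prototype matching in place of PECAN's learned components.

```python
import numpy as np

def context_vector(partner_actions, n_actions=6):
    """Summarize the partner's recent actions as an empirical histogram;
    PECAN's real context encoder is learned, so this is only a stand-in."""
    counts = np.bincount(partner_actions, minlength=n_actions).astype(float)
    return counts / max(len(partner_actions), 1)

def identify_primitive(ctx, prototypes):
    """Match the inferred context to the closest known policy primitive."""
    return int(np.argmin([np.linalg.norm(ctx - p) for p in prototypes]))

# Toy usage: two primitives (onion-focused vs dish-focused partner).
prototypes = [np.array([.6, .2, .1, .1, 0, 0]), np.array([0, .1, .1, .2, .3, .3])]
ctx = context_vector([0, 0, 1, 0, 3, 0], n_actions=6)
print(identify_primitive(ctx, prototypes))  # 0: looks like the onion-focused type
```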
arXiv Detail & Related papers (2023-01-16T12:14:58Z)
- Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation [13.670618752160594]
Deep reinforcement learning (DRL) provides a promising approach to multi-agent cooperation through the interaction of agents with their environments.
Traditional DRL solutions suffer from the high dimensionality of multi-agent settings with continuous action spaces during policy search.
We propose a hierarchical reinforcement learning approach with high-level decision-making and low-level individual control for efficient policy search.
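A toy sketch of the two-level decomposition, assuming a geometric task-assignment rule for the high level and a proportional controller for the low level; the paper's policies are learned, so everything here is illustrative.

```python
import numpy as np

def high_level(agent_pos, tasks, predicted_teammate_pos):
    """High-level decision: pick a task the modeled teammate is not already
    covering (stand-in for the learned high-level policy with opponent modeling)."""
    free = [t for t in tasks if np.linalg.norm(t - predicted_teammate_pos) > 1.0]
    options = free if free else tasks
    return min(options, key=lambda t: np.linalg.norm(t - agent_pos))

def low_level(agent_pos, target, gain=0.5):
    """Low-level individual control: a proportional step in continuous space."""
    return gain * (target - agent_pos)

# Toy usage in a 2-D continuous workspace.
tasks = [np.array([0.0, 5.0]), np.array([5.0, 0.0])]
goal = high_level(np.zeros(2), tasks, predicted_teammate_pos=np.array([0.2, 4.8]))
print(goal, low_level(np.zeros(2), goal))  # heads for the uncovered task
```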
arXiv Detail & Related papers (2022-06-25T19:09:29Z)
- Conditional Imitation Learning for Multi-Agent Games [89.897635970366]
We study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time.
We propose a novel approach to address the difficulties of scalability and data scarcity.
Our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace.
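The subspace idea can be sketched with plain SVD over toy strategy parameters; the paper learns its subspace jointly with imitation, so this captures only the geometry.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data: flattened policy parameters of K known partner strategies.
K, D, r = 8, 50, 3
partner_params = rng.normal(size=(K, D))

# Fit a rank-r subspace over partner strategies with SVD.
mean = partner_params.mean(axis=0)
_, _, Vt = np.linalg.svd(partner_params - mean, full_matrices=False)
basis = Vt[:r]  # (r, D) orthonormal basis of the strategy subspace

def embed(params):
    """Project a new partner's estimated parameters into the subspace."""
    return (params - mean) @ basis.T

def adapt(coords):
    """Interpolate in the low-dimensional subspace to get an adapted strategy."""
    return mean + coords @ basis

new_partner = rng.normal(size=D)
print(adapt(embed(new_partner)).shape)  # (50,): best rank-r approximation
```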
arXiv Detail & Related papers (2022-01-05T04:40:13Z)
- Behaviour-conditioned policies for cooperative reinforcement learning tasks [41.74498230885008]
In various real-world tasks, an agent needs to cooperate with unknown partner agent types.
Deep reinforcement learning models can be trained to deliver the required functionality but are known to suffer from sample inefficiency and slow learning.
We suggest a method in which we synthetically produce populations of agents with different behavioural patterns, together with ground-truth data of their behaviour.
We additionally suggest an agent architecture that can efficiently use the generated data and gain meta-learning capability.
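A minimal sketch of the synthetic-population idea, assuming each partner is a fixed action distribution that doubles as its ground-truth behaviour label; the paper's generator and architecture are more involved.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_synthetic_population(n_agents, n_actions=4):
    """Synthesize partners with known behavioural patterns: each partner is a
    fixed action distribution, which also serves as its ground-truth label."""
    return [rng.dirichlet(np.ones(n_actions)) for _ in range(n_agents)]

def behaviour_estimate(observed_actions, n_actions=4, prior=1.0):
    """Online behaviour descriptor from the partner's action history; the
    behaviour-conditioned policy would take this vector as an extra input."""
    counts = np.bincount(observed_actions, minlength=n_actions) + prior
    return counts / counts.sum()

# Toy usage: the estimate approaches the partner's true behaviour pattern.
population = make_synthetic_population(5)
partner = population[2]
history = list(rng.choice(4, size=30, p=partner))
print(np.round(behaviour_estimate(history), 2), np.round(partner, 2))
```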
arXiv Detail & Related papers (2021-10-04T09:16:41Z)
- On the Critical Role of Conventions in Adaptive Human-AI Collaboration [73.21967490610142]
We propose a learning framework that teases apart rule-dependent representation from convention-dependent representation.
We experimentally validate our approach on three collaborative tasks varying in complexity.
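A minimal sketch of the rule/convention separation, assuming a linear policy whose parameters split into a frozen rule part and a per-partner convention part; the paper's representations are learned networks, so this only illustrates the idea.

```python
import numpy as np

class ModularPolicy:
    """`rule_w` holds task-rule knowledge shared across partners; `conv_w`
    holds partner-specific conventions and is the only part updated when
    adapting to a new partner (illustrative linear form)."""
    def __init__(self, obs_dim, n_actions, rng):
        self.rule_w = rng.normal(scale=0.1, size=(obs_dim, n_actions))  # frozen
        self.conv_w = np.zeros((obs_dim, n_actions))                    # adapted

    def probs(self, obs):
        z = obs @ (self.rule_w + self.conv_w)
        e = np.exp(z - z.max())
        return e / e.sum()

    def adapt_to_partner(self, obs, partner_expected_action, lr=0.1):
        # Cross-entropy descent step on the convention parameters only.
        p = self.probs(obs)
        grad = np.outer(obs, p)
        grad[:, partner_expected_action] -= obs
        self.conv_w -= lr * grad

# Toy usage: the convention is learned while the rule weights stay fixed.
rng = np.random.default_rng(0)
pi = ModularPolicy(obs_dim=3, n_actions=4, rng=rng)
obs = np.array([1.0, 0.5, -0.2])
for _ in range(50):
    pi.adapt_to_partner(obs, partner_expected_action=2)
print(np.argmax(pi.probs(obs)))  # 2
```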
arXiv Detail & Related papers (2021-04-07T02:46:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.