Reinforcement Learning on Human Decision Models for Uniquely
Collaborative AI Teammates
- URL: http://arxiv.org/abs/2111.09800v1
- Date: Thu, 18 Nov 2021 17:06:57 GMT
- Title: Reinforcement Learning on Human Decision Models for Uniquely
Collaborative AI Teammates
- Authors: Nicholas Kantack
- Abstract summary: This study details the development of the agent that won the challenge by achieving a human-play average score of 16.5.
The winning agent's development consisted of observing and accurately modeling the author's decision making in Hanabi, then training with a behavioral clone of the author.
Notably, the agent discovered a human-complementary play style by first mimicking human decision making, then exploring variations to the human-like strategy that led to higher simulated human-bot scores.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In 2021 the Johns Hopkins University Applied Physics Laboratory held an
internal challenge to develop artificially intelligent (AI) agents that could
excel at the collaborative card game Hanabi. Agents were evaluated on their
ability to play with human players whom the agents had never previously
encountered. This study details the development of the agent that won the
challenge by achieving a human-play average score of 16.5, outperforming the
current state-of-the-art for human-bot Hanabi scores. The winning agent's
development consisted of observing and accurately modeling the author's
decision making in Hanabi, then training with a behavioral clone of the author.
Notably, the agent discovered a human-complementary play style by first
mimicking human decision making, then exploring variations to the human-like
strategy that led to higher simulated human-bot scores. This work examines in
detail the design and implementation of this human-compatible Hanabi teammate,
as well as the existence and implications of human-complementary strategies and
how they may be explored for more successful applications of AI in
human-machine teams.
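The pipeline the abstract describes (behavioral cloning of a single human, then exploring variations that raise simulated human-bot scores) can be sketched in miniature. This is a hypothetical toy, not the author's actual Hanabi code: the environment, state/action spaces, and hill-climbing search are all stand-ins.

```python
import random

# Toy sketch of the paper's pipeline (hypothetical simplification):
# (1) fit a behavioral clone of a human from logged state-action pairs,
# (2) starting from the clone, search for a policy that maximizes the
#     score of simulated play *with* the clone as partner.

random.seed(0)

STATES, ACTIONS = range(4), range(3)

def behavioral_clone(demos):
    """Majority-vote clone: for each state, pick the human's most common action."""
    counts = {s: [0] * len(ACTIONS) for s in STATES}
    for s, a in demos:
        counts[s][a] += 1
    return {s: max(ACTIONS, key=lambda a: counts[s][a]) for s in STATES}

def simulated_score(agent, partner):
    """Fraction of states where the team scores on a toy cooperative task:
    the pair scores whenever their two actions are complementary."""
    hits = sum((agent[s] + partner[s]) % len(ACTIONS) == 0 for s in STATES)
    return hits / len(STATES)

def explore_variations(clone, iters=50):
    """Hill-climb from the clone: try single-state action changes and keep
    any variation that raises the simulated human-bot score."""
    agent, best = dict(clone), simulated_score(clone, clone)
    for _ in range(iters):
        cand = dict(agent)
        cand[random.choice(list(STATES))] = random.choice(list(ACTIONS))
        score = simulated_score(cand, clone)
        if score > best:
            agent, best = cand, score
    return agent, best

# Logged "human" demonstrations (synthetic).
demos = [(s, s % len(ACTIONS)) for s in STATES for _ in range(5)]
clone = behavioral_clone(demos)
agent, best = explore_variations(clone)
print(best >= simulated_score(clone, clone))  # kept agent never scores below the clone
```

The key design point mirrored here is that exploration starts from, and is evaluated against, the human model, so the resulting play style stays human-complementary rather than drifting toward self-play conventions.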
Related papers
- Enhancing Human Experience in Human-Agent Collaboration: A
Human-Centered Modeling Approach Based on Positive Human Gain [18.968232976619912]
We propose a "human-centered" modeling scheme for collaborative AI agents.
We expect agents to learn to enhance the extent to which humans achieve their goals while maintaining the agents' original abilities.
We evaluate the RLHG agent in the popular Multi-player Online Battle Arena (MOBA) game, Honor of Kings.
arXiv Detail & Related papers (2024-01-28T05:05:57Z)
- Real-time Addressee Estimation: Deployment of a Deep-Learning Model on the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
arXiv Detail & Related papers (2023-11-09T13:01:21Z)
- Exploration with Principles for Diverse AI Supervision [88.61687950039662]
Training large transformers using next-token prediction has given rise to groundbreaking advancements in AI.
While this generative AI approach has produced impressive results, it heavily leans on human supervision.
This strong reliance on human oversight poses a significant hurdle to the advancement of AI innovation.
We propose a novel paradigm termed Exploratory AI (EAI) aimed at autonomously generating high-quality training data.
arXiv Detail & Related papers (2023-10-13T07:03:39Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state, and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- BO-Muse: A human expert and AI teaming framework for accelerated experimental design [58.61002520273518]
Our algorithm lets the human expert take the lead in the experimental process.
We show that our algorithm converges sub-linearly, at a rate faster than the AI or human alone.
arXiv Detail & Related papers (2023-03-03T02:56:05Z)
- Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning [95.78031053296513]
No-press Diplomacy is a complex strategy game involving both cooperation and competition.
We introduce a planning algorithm we call DiL-piKL that regularizes a reward-maximizing policy toward a human imitation-learned policy.
We show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL.
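The regularization behind DiL-piKL can be sketched as follows (notation assumed here, not taken verbatim from the paper): with Q the estimated action value, tau the human imitation-learned anchor policy, and lambda the regularization strength, each agent picks

```latex
% Trade off expected reward against divergence from the human anchor \tau:
\pi^{*} = \arg\max_{\pi}\; \mathbb{E}_{a \sim \pi}\left[ Q(a) \right]
          - \lambda\, \mathrm{KL}\left( \pi \,\|\, \tau \right),
% which has the closed-form solution
\pi^{*}(a) \propto \tau(a)\, \exp\!\left( Q(a) / \lambda \right).
```

Large lambda keeps play human-like; small lambda approaches pure reward maximization.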
arXiv Detail & Related papers (2022-10-11T14:47:35Z)
- Human-AI Coordination via Human-Regularized Search and Learning [33.95649252941375]
We develop a three-step algorithm that achieves strong performance in coordinating with real humans on the Hanabi benchmark.
We first use a regularized search algorithm and behavioral cloning to produce a better human model that captures diverse skill levels.
By having experts play repeatedly with the two agents, we show that our method beats a vanilla best response to a behavioral-cloning baseline.
arXiv Detail & Related papers (2022-10-11T03:46:12Z)
- Incorporating Rivalry in Reinforcement Learning for a Competitive Game [65.2200847818153]
This work proposes a novel reinforcement learning mechanism based on the social impact of rivalry behavior.
Our proposed model aggregates objective and social perception mechanisms to derive a rivalry score that is used to modulate the learning of artificial agents.
arXiv Detail & Related papers (2022-08-22T14:06:06Z)
- Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi [0.0]
We evaluate teams of humans and AI agents in the cooperative card game Hanabi with both rule-based and learning-based agents.
We find that humans have a clear preference toward a rule-based AI teammate over a state-of-the-art learning-based AI teammate.
arXiv Detail & Related papers (2021-07-15T22:19:15Z)
- Learning Models of Individual Behavior in Chess [4.793072503820555]
We develop highly accurate predictive models of individual human behavior in chess.
Our work demonstrates a way to bring AI systems into better alignment with the behavior of individual people.
arXiv Detail & Related papers (2020-08-23T18:24:21Z)
- Real-World Human-Robot Collaborative Reinforcement Learning [6.089774484591287]
We present a real-world setup of a human-robot collaborative maze game, designed to be non-trivial and only solvable through collaboration.
We use deep reinforcement learning for the control of the robotic agent, and achieve results within 30 minutes of real-world play.
We present results on how co-policy learning occurs over time between the human and the robotic agent resulting in each participant's agent serving as a representation of how they would play the game.
arXiv Detail & Related papers (2020-03-02T19:34:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.