Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi
- URL: http://arxiv.org/abs/2107.07630v1
- Date: Thu, 15 Jul 2021 22:19:15 GMT
- Title: Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi
- Authors: Ho Chit Siu, Jaime D. Pena, Kimberlee C. Chang, Edenna Chen, Yutai
Zhou, Victor J. Lopez, Kyle Palko, Ross E. Allen
- Abstract summary: We evaluate teams of humans and AI agents in the cooperative card game Hanabi with both rule-based and learning-based agents.
We find that humans have a clear preference toward a rule-based AI teammate over a state-of-the-art learning-based AI teammate.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep reinforcement learning has generated superhuman AI in competitive games
such as Go and StarCraft. Can similar learning techniques create a superior AI
teammate for human-machine collaborative games? Will humans prefer AI teammates
that improve objective team performance or those that improve subjective
metrics of trust? In this study, we perform a single-blind evaluation of teams
of humans and AI agents in the cooperative card game \emph{Hanabi}, with both
rule-based and learning-based agents. In addition to the game score, used as an
objective metric of the human-AI team performance, we also quantify subjective
measures of the human's perceived performance, teamwork, interpretability,
trust, and overall preference of AI teammate. We find that humans have a clear
preference toward a rule-based AI teammate (SmartBot) over a state-of-the-art
learning-based AI teammate (Other-Play) across nearly all subjective metrics,
and generally view the learning-based agent negatively, despite no statistical
difference in the game score. This result has implications for future AI design
and reinforcement learning benchmarking, highlighting the need to incorporate
subjective metrics of human-AI teaming rather than a singular focus on
objective task performance.
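The abstract's key objective-metric finding is that SmartBot and Other-Play teams showed no statistical difference in game score. A minimal sketch of that kind of comparison (not the authors' analysis code, and with fabricated placeholder scores; Hanabi scores range 0-25) is a two-sample test on per-team scores, here Welch's t statistic computed with the standard library:

```python
# Minimal sketch of comparing game scores across the two AI teammates.
# The score lists are hypothetical placeholders, not data from the study.
from statistics import mean, variance
from math import sqrt

smartbot_scores = [15, 17, 14, 16, 18, 13, 15, 16]   # hypothetical
otherplay_scores = [14, 16, 15, 13, 17, 14, 16, 15]  # hypothetical

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    va, vb = variance(a), variance(b)          # sample variances
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))

t = welch_t(smartbot_scores, otherplay_scores)
# A |t| well below ~2 is consistent with no significant score difference,
# mirroring the paper's objective-metric result.
print(f"t = {t:.3f}")
```

With these placeholder samples the statistic comes out small (|t| < 1), which is the shape of result the paper reports: the subjective preference gap between the agents appears despite comparable objective scores.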
Related papers
- Reinforcement Learning for High-Level Strategic Control in Tower Defense Games
In strategy games, one of the most important aspects of game design is maintaining a sense of challenge for players.
We propose an automated approach that combines traditional scripted methods with reinforcement learning.
Results show that combining a learned approach, such as reinforcement learning, with a scripted AI produces a higher-performing and more robust agent than a scripted AI alone.
arXiv Detail & Related papers (2024-06-12T08:06:31Z)
- Toward Human-AI Alignment in Large-Scale Multi-Player Games
We analyze extensive human gameplay data from Xbox's Bleeding Edge (100K+ games).
We find that while human players exhibit variability in fight-flight and explore-exploit behavior, AI players tend towards uniformity.
These stark differences underscore the need for interpretable evaluation, design, and integration of AI in human-aligned applications.
arXiv Detail & Related papers (2024-02-05T22:55:33Z)
- Enhancing Human Experience in Human-Agent Collaboration: A Human-Centered Modeling Approach Based on Positive Human Gain
We propose a "human-centered" modeling scheme for collaborative AI agents.
We expect that agents should learn to enhance the extent to which humans achieve their goals while maintaining the agents' original abilities.
We evaluate the RLHG agent in the popular Multi-player Online Battle Arena (MOBA) game, Honor of Kings.
arXiv Detail & Related papers (2024-01-28T05:05:57Z)
- DanZero+: Dominating the GuanDan Game through Reinforcement Learning
We develop an AI program for an exceptionally complex and popular card game called GuanDan.
We first put forward an AI program named DanZero for this game.
To further enhance the AI's capabilities, we apply a policy-based reinforcement learning algorithm to GuanDan.
arXiv Detail & Related papers (2023-12-05T08:07:32Z)
- Capturing Humans' Mental Models of AI: An Item Response Theory Approach
Our results indicate that people expect AI agents' performance to be significantly better on average than the performance of other humans.
arXiv Detail & Related papers (2023-05-15T23:17:26Z)
- Navigates Like Me: Understanding How People Evaluate Human-Like AI in Video Games
We collect hundreds of crowd-sourced assessments comparing the human-likeness of navigation behavior generated by our agent and baseline AI agents.
Our proposed agent passes a Turing Test, while the baseline agents do not.
This work provides insights into the characteristics that people consider human-like in the context of goal-directed video game navigation.
arXiv Detail & Related papers (2023-03-02T18:59:04Z)
- Human Decision Makings on Curriculum Reinforcement Learning with Difficulty Adjustment
We guide the curriculum reinforcement learning results towards a preferred performance level that is neither too hard nor too easy via learning from the human decision process.
Our system is highly parallelizable, making it possible for a human to train large-scale reinforcement learning applications.
It shows reinforcement learning performance can successfully adjust in sync with the human desired difficulty level.
arXiv Detail & Related papers (2022-08-04T23:53:51Z)
- Collusion Detection in Team-Based Multiplayer Games
We propose a system that detects colluding behaviors in team-based multiplayer games.
The proposed method analyzes the players' social relationships paired with their in-game behavioral patterns.
We then automate the detection using Isolation Forest, an unsupervised learning technique specialized in highlighting outliers.
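This entry names a concrete technique: flagging colluding players as outliers with Isolation Forest. A minimal sketch of that idea, using scikit-learn's `IsolationForest` on made-up feature vectors (the feature names and numbers are illustrative assumptions, not from the paper):

```python
# Hypothetical sketch: pair social-relationship features with in-game
# behavioral features per player-pair, then flag outliers with Isolation
# Forest. All features and values below are fabricated for illustration.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# 200 normal player-pairs: [co-play ratio, joint win rate, chat volume]
normal = rng.normal(loc=[0.2, 0.5, 1.0], scale=0.05, size=(200, 3))
# 5 colluding pairs: unusually high co-play, joint wins, and chatter
colluders = rng.normal(loc=[0.9, 0.95, 3.0], scale=0.05, size=(5, 3))
X = np.vstack([normal, colluders])

# contamination = expected fraction of outliers (an assumption here)
model = IsolationForest(contamination=0.03, random_state=0)
labels = model.fit_predict(X)   # -1 = outlier, 1 = inlier

flagged = np.where(labels == -1)[0]
print("flagged indices:", flagged)
```

With this separation the five injected colluding pairs (indices 200-204) fall among the flagged outliers; in practice the unsupervised labels would only shortlist pairs for human review, not serve as proof of collusion.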
arXiv Detail & Related papers (2022-03-10T02:37:39Z)
- Cybertrust: From Explainable to Actionable and Interpretable AI (AI2)
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations.
It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z)
- Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork
We argue that AI systems should be trained in a human-centered manner, directly optimized for team performance.
We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves.
Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accurate AI may not lead to the highest team performance.
arXiv Detail & Related papers (2020-04-27T19:06:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.