Navigates Like Me: Understanding How People Evaluate Human-Like AI in
Video Games
- URL: http://arxiv.org/abs/2303.02160v1
- Date: Thu, 2 Mar 2023 18:59:04 GMT
- Title: Navigates Like Me: Understanding How People Evaluate Human-Like AI in
Video Games
- Authors: Stephanie Milani, Arthur Juliani, Ida Momennejad, Raluca Georgescu,
Jaroslaw Rzepecki, Alison Shaw, Gavin Costello, Fei Fang, Sam Devlin, Katja
Hofmann
- Abstract summary: We collect hundreds of crowd-sourced assessments comparing the human-likeness of navigation behavior generated by our agent and baseline AI agents.
Our proposed agent passes a Turing Test, while the baseline agents do not.
This work provides insights into the characteristics that people consider human-like in the context of goal-directed video game navigation.
- Score: 36.96985093527702
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We aim to understand how people assess human likeness in navigation produced
by people and artificially intelligent (AI) agents in a video game. To this
end, we propose a novel AI agent with the goal of generating more human-like
behavior. We collect hundreds of crowd-sourced assessments comparing the
human-likeness of navigation behavior generated by our agent and baseline AI
agents with human-generated behavior. Our proposed agent passes a Turing Test,
while the baseline agents do not. By passing a Turing Test, we mean that human
judges could not quantitatively distinguish between videos of a person and an
AI agent navigating. To understand what people believe constitutes human-like
navigation, we extensively analyze the justifications of these assessments.
This work provides insights into the characteristics that people consider
human-like in the context of goal-directed video game navigation, which is a
key step for further improving human interactions with AI agents.
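The paper's operational criterion for passing a Turing Test is that human judges could not quantitatively distinguish videos of a person from videos of the agent, i.e., judge accuracy does not differ measurably from chance. As a minimal, illustrative sketch of one way such a criterion could be checked (this is not the authors' published analysis; the counts and significance level below are hypothetical), one could apply a two-sided binomial test to the judges' accuracy:

# Minimal sketch of a chance-level check for a behavioral Turing Test.
# NOT the authors' analysis; the counts and alpha below are hypothetical.
from scipy.stats import binomtest

def judges_at_chance(n_correct: int, n_judgments: int, alpha: float = 0.05) -> bool:
    """True if judge accuracy is statistically indistinguishable from 50%."""
    result = binomtest(n_correct, n_judgments, p=0.5, alternative="two-sided")
    return result.pvalue >= alpha

# Hypothetical example: 260 correct identifications out of 500 judgments (~52%).
print(judges_at_chance(260, 500))  # True -> judges cannot reliably tell agent from human

Note that a non-significant result is only weak evidence of indistinguishability; a stricter analysis would use an equivalence test around chance rather than the absence of significance.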
Related papers
- Human Bias in the Face of AI: The Role of Human Judgement in AI Generated Text Evaluation [48.70176791365903]
This study explores how bias shapes the perception of AI- versus human-generated content.
We investigate how human raters respond to labeled and unlabeled content.
arXiv Detail & Related papers (2024-09-29T04:31:45Z)
- CoNav: A Benchmark for Human-Centered Collaborative Navigation [66.6268966718022]
We propose a collaborative navigation (CoNav) benchmark.
CoNav tackles the critical challenge of constructing a 3D navigation environment with realistic and diverse human activities.
We propose an intention-aware agent for reasoning about both long-term and short-term human intentions.
arXiv Detail & Related papers (2024-06-04T15:44:25Z)
- Explainable Human-AI Interaction: A Planning Perspective [32.477369282996385]
AI systems need to be explainable to the humans in the loop.
We will discuss how the AI agent can use mental models to either conform to human expectations, or change those expectations through explanatory communication.
While the main focus of the book is on cooperative scenarios, we will point out how the same mental models can be used for obfuscation and deception.
arXiv Detail & Related papers (2024-05-19T22:22:21Z)
- Toward Human-AI Alignment in Large-Scale Multi-Player Games [24.784173202415687]
We analyze extensive human gameplay data from Xbox's Bleeding Edge (100K+ games).
We find that while human players exhibit variability in fight-flight and explore-exploit behavior, AI players tend towards uniformity.
These stark differences underscore the need for interpretable evaluation, design, and integration of AI in human-aligned applications.
arXiv Detail & Related papers (2024-02-05T22:55:33Z)
- Measuring an artificial intelligence agent's trust in humans using machine incentives [2.1016374925364616]
Gauging an AI agent's trust in humans is challenging because a dishonest agent might respond falsely about its trust in humans.
We present a method for incentivizing machine decisions without altering an AI agent's underlying algorithms or goal orientation.
Our experiments suggest that one of the most advanced AI language models to date alters its social behavior in response to incentives.
arXiv Detail & Related papers (2022-12-27T06:05:49Z)
- A Cognitive Framework for Delegation Between Error-Prone AI and Human Agents [0.0]
We investigate the use of cognitively inspired models of behavior to predict the behavior of both human and AI agents.
The predicted behavior is used to delegate control between humans and AI agents through the use of an intermediary entity.
arXiv Detail & Related papers (2022-04-06T15:15:21Z)
- On some Foundational Aspects of Human-Centered Artificial Intelligence [52.03866242565846]
There is no clear definition of what is meant by Human-Centered Artificial Intelligence.
This paper introduces the term HCAI agent to refer to any physical or software computational agent equipped with AI components.
We see the notion of HCAI agent, together with its components and functions, as a way to bridge the technical and non-technical discussions on human-centered AI.
arXiv Detail & Related papers (2021-12-29T09:58:59Z)
- Trustworthy AI: A Computational Perspective [54.80482955088197]
We focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being.
For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
arXiv Detail & Related papers (2021-07-12T14:21:46Z)
- Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation [9.456752543341464]
A key challenge on the path to developing agents that learn complex human-like behavior is the need to quickly and accurately quantify human-likeness.
We address this challenge through a novel automated Navigation Turing Test (ANTT) that learns to predict human judgments of human-likeness.
arXiv Detail & Related papers (2021-05-20T10:14:23Z)
- Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration [116.28433607265573]
We introduce Watch-And-Help (WAH), a challenge for testing social intelligence in AI agents.
In WAH, an AI agent needs to help a human-like agent perform a complex household task efficiently.
We build VirtualHome-Social, a multi-agent household environment, and provide a benchmark including both planning and learning based baselines.
arXiv Detail & Related papers (2020-10-19T21:48:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.