Aligning Superhuman AI with Human Behavior: Chess as a Model System
- URL: http://arxiv.org/abs/2006.01855v3
- Date: Tue, 14 Jul 2020 17:57:37 GMT
- Title: Aligning Superhuman AI with Human Behavior: Chess as a Model System
- Authors: Reid McIlroy-Young, Siddhartha Sen, Jon Kleinberg, and Ashton Anderson
- Abstract summary: We develop Maia, a customized version of AlphaZero trained on human chess games, that predicts human moves with much higher accuracy than existing engines.
For a dual task of predicting whether a human will make a large mistake on the next move, we develop a deep neural network that significantly outperforms competitive baselines.
- Score: 5.236087378443016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As artificial intelligence becomes increasingly intelligent, in some cases
achieving superhuman performance, there is growing potential for humans to
learn from and collaborate with algorithms. However, the ways in which AI
systems approach problems are often different from the ways people do, and thus
may be uninterpretable and hard to learn from. A crucial step in bridging this
gap between human and artificial intelligence is modeling the granular actions
that constitute human behavior, rather than simply matching aggregate human
performance.
We pursue this goal in a model system with a long history in artificial
intelligence: chess. The aggregate performance of a chess player unfolds as
they make decisions over the course of a game. The hundreds of millions of
games played online by players at every skill level form a rich source of data
in which these decisions, and their exact context, are recorded in minute
detail. Applying existing chess engines to this data, including an open-source
implementation of AlphaZero, we find that they do not predict human moves well.
We develop and introduce Maia, a customized version of AlphaZero trained on
human chess games, that predicts human moves at a much higher accuracy than
existing engines, and can achieve maximum accuracy when predicting decisions
made by players at a specific skill level in a tuneable way. For a dual task of
predicting whether a human will make a large mistake on the next move, we
develop a deep neural network that significantly outperforms competitive
baselines. Taken together, our results suggest that there is substantial
promise in designing artificial intelligence systems with human collaboration
in mind by first accurately modeling granular human decision-making.
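The abstract's central measurement, how often a model's predicted move matches the move a human actually played, broken down by skill level, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `move_matching_accuracy`, the 100-point rating bins, and the sample records are all assumptions made up for illustration.

```python
from collections import defaultdict


def move_matching_accuracy(records, bin_width=100):
    """Fraction of positions where the predicted move equals the human move,
    grouped by rating bin (hypothetical helper, not from the paper)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rating, human_move, predicted_move in records:
        # Map each rating to the lower edge of its bin, e.g. 1170 -> 1100.
        bin_start = (rating // bin_width) * bin_width
        totals[bin_start] += 1
        if human_move == predicted_move:
            hits[bin_start] += 1
    return {b: hits[b] / totals[b] for b in sorted(totals)}


# Made-up records: (player rating, human move, model's predicted move),
# with moves written as plain strings in UCI-style notation.
records = [
    (1130, "e2e4", "e2e4"),
    (1170, "g1f3", "d2d4"),
    (1510, "e7e5", "e7e5"),
    (1590, "b8c6", "b8c6"),
]
accuracy = move_matching_accuracy(records)
```

Comparing such per-bin accuracies across models is one way to see whether a model peaks at a particular skill level, which is the tuneable behavior the abstract describes.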
Related papers
- Human-aligned Chess with a Bit of Search [35.16633353273246]
Chess has long been a testbed for AI's quest to match human intelligence.
In this paper, we introduce Allie, a chess-playing AI designed to bridge the gap between artificial and human intelligence in this classic game.
arXiv Detail & Related papers (2024-10-04T19:51:03Z)
- Maia-2: A Unified Model for Human-AI Alignment in Chess [10.577896749797485]
We propose a unified modeling approach for human-AI alignment in chess.
We introduce a skill-aware attention mechanism to dynamically integrate players' strengths with encoded chess positions.
Our results pave the way for deeper insights into human decision-making and AI-guided teaching tools.
arXiv Detail & Related papers (2024-09-30T17:54:23Z)
- HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation [50.616995671367704]
We present a high-dimensional, simulated robot learning benchmark, HumanoidBench, featuring a humanoid robot equipped with dexterous hands.
Our findings reveal that state-of-the-art reinforcement learning algorithms struggle with most tasks, whereas a hierarchical learning approach achieves superior performance when supported by robust low-level policies.
arXiv Detail & Related papers (2024-03-15T17:45:44Z)
- AI for Mathematics: A Cognitive Science Perspective [86.02346372284292]
Mathematics is one of the most powerful conceptual systems developed and used by the human species.
Rapid progress in AI, particularly propelled by advances in large language models (LLMs), has sparked renewed, widespread interest in building such systems.
arXiv Detail & Related papers (2023-10-19T02:00:31Z)
- RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning [61.10744686260994]
We introduce RoboPianist, a system that enables simulated anthropomorphic hands to learn an extensive repertoire of 150 piano pieces.
We additionally introduce an open-sourced environment, benchmark of tasks, interpretable evaluation metrics, and open challenges for future study.
arXiv Detail & Related papers (2023-04-09T03:53:05Z)
- Superhuman Artificial Intelligence Can Improve Human Decision Making by Increasing Novelty [8.120494737877799]
We analyze more than 5.8 million move decisions made by professional Go players over the past 71 years.
We find that humans began to make significantly better decisions following the advent of superhuman AI.
arXiv Detail & Related papers (2023-03-13T20:49:13Z)
- Detecting Individual Decision-Making Style: Exploring Behavioral Stylometry in Chess [4.793072503820555]
We present a transformer-based approach to behavioral stylometry in the context of chess.
Our method operates in a few-shot classification framework, and can correctly identify a player from among thousands of candidate players.
We consider more broadly what our resulting embeddings reveal about human style in chess, as well as the potential ethical implications.
arXiv Detail & Related papers (2022-08-02T11:18:16Z)
- Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback [82.96694147237113]
We present Skill Preferences, an algorithm that learns a model over human preferences and uses it to extract human-aligned skills from offline data.
We show that SkiP enables a simulated kitchen robot to solve complex multi-step manipulation tasks.
arXiv Detail & Related papers (2021-08-11T18:04:08Z)
- Teach me to play, gamer! Imitative learning in computer games via linguistic description of complex phenomena and decision tree [55.41644538483948]
We present a new machine learning model by imitation based on the linguistic description of complex phenomena.
The method can be a good alternative to design and implement the behaviour of intelligent agents in video game development.
arXiv Detail & Related papers (2021-01-06T21:14:10Z)
- Learning Models of Individual Behavior in Chess [4.793072503820555]
We develop highly accurate predictive models of individual human behavior in chess.
Our work demonstrates a way to bring AI systems into better alignment with the behavior of individual people.
arXiv Detail & Related papers (2020-08-23T18:24:21Z)
- Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork [54.309495231017344]
We argue that AI systems should be trained in a human-centered manner, directly optimized for team performance.
We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves.
Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accurate AI may not lead to the highest team performance.
arXiv Detail & Related papers (2020-04-27T19:06:28Z)