Learning to Imitate with Less: Efficient Individual Behavior Modeling in Chess
- URL: http://arxiv.org/abs/2507.21488v1
- Date: Tue, 29 Jul 2025 04:09:31 GMT
- Title: Learning to Imitate with Less: Efficient Individual Behavior Modeling in Chess
- Authors: Zhenwei Tang, Difan Jiao, Eric Xue, Reid McIlroy-Young, Jon Kleinberg, Siddhartha Sen, Ashton Anderson,
- Abstract summary: Maia4All is a framework designed to learn and adapt to individual decision-making styles efficiently.<n>Maia4All achieves individual human behavior modeling in chess with only 20 games, compared to the 5,000 games required previously.
- Score: 10.090379544417432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As humans seek to collaborate with, learn from, and better understand artificial intelligence systems, developing AIs that can accurately emulate individual decision-making becomes increasingly important. Chess, a long-standing AI benchmark with precise skill measurement, offers an ideal testbed for human-AI alignment. However, existing approaches to modeling human behavior require prohibitively large amounts of data from each individual, making them impractical for new or sparsely represented users. In this work, we introduce Maia4All, a framework designed to learn and adapt to individual decision-making styles efficiently, even with limited data. Maia4All achieves this through a two-stage optimization process: (1) an enrichment step, which bridges population and individual-level human behavior modeling with a prototype-enriched model, and (2) a democratization step, which leverages ability levels or user prototypes to initialize and refine individual embeddings with minimal data. Our experimental results show that Maia4All can accurately predict individual moves and profile behavioral patterns with high fidelity, establishing a new standard for personalized human-like AI behavior modeling in chess. Maia4All achieves individual human behavior modeling in chess with only 20 games, compared to the 5,000 games required previously, representing a significant improvement in data efficiency. Our work provides an example of how population AI systems can flexibly adapt to individual users using a prototype-enriched model as a bridge. This approach extends beyond chess, as shown in our case study on idiosyncratic LLMs, highlighting its potential for broader applications in personalized AI adaptation.
Related papers
- Is Diversity All You Need for Scalable Robotic Manipulation? [50.747150672933316]
We investigate the nuanced role of data diversity in robot learning by examining three critical dimensions-task (what to do), embodiment (which robot to use), and expert (who demonstrates)-challenging the conventional intuition of "more diverse is better"<n>We show that task diversity proves more critical than per-task demonstration quantity, benefiting transfer from diverse pre-training tasks to novel downstream scenarios.<n>We propose a distribution debiasing method to mitigate velocity ambiguity, the yielding GO-1-Pro achieves substantial performance gains of 15%, equivalent to using 2.5 times pre-training data.
arXiv Detail & Related papers (2025-07-08T17:52:44Z) - Maia-2: A Unified Model for Human-AI Alignment in Chess [10.577896749797485]
We propose a unified modeling approach for human-AI alignment in chess.
We introduce a skill-aware attention mechanism to dynamically integrate players' strengths with encoded chess positions.
Our results pave the way for deeper insights into human decision-making and AI-guided teaching tools.
arXiv Detail & Related papers (2024-09-30T17:54:23Z) - Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation mask generated by internet-scale foundation models.<n>Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning.<n>Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z) - User Behavior Simulation with Large Language Model based Agents [116.74368915420065]
We propose an LLM-based agent framework and design a sandbox environment to simulate real user behaviors.
Based on extensive experiments, we find that the simulated behaviors of our method are very close to the ones of real humans.
arXiv Detail & Related papers (2023-06-05T02:58:35Z) - Learning signatures of decision making from many individuals playing the
same game [54.33783158658077]
We design a predictive framework that learns representations to encode an individual's 'behavioral style'
We apply our method to a large-scale behavioral dataset from 1,000 humans playing a 3-armed bandit task.
arXiv Detail & Related papers (2023-02-21T21:41:53Z) - Detecting Individual Decision-Making Style: Exploring Behavioral
Stylometry in Chess [4.793072503820555]
We present a transformer-based approach to behavioral stylometry in the context of chess.
Our method operates in a few-shot classification framework, and can correctly identify a player from among thousands of candidate players.
We consider more broadly what our resulting embeddings reveal about human style in chess, as well as the potential ethical implications.
arXiv Detail & Related papers (2022-08-02T11:18:16Z) - Uncalibrated Models Can Improve Human-AI Collaboration [10.106324182884068]
We show that presenting AI models as more confident than they actually are can improve human-AI performance.
We first learn a model for how humans incorporate AI advice using data from thousands of human interactions.
arXiv Detail & Related papers (2022-02-12T04:51:00Z) - Skill Preferences: Learning to Extract and Execute Robotic Skills from
Human Feedback [82.96694147237113]
We present Skill Preferences, an algorithm that learns a model over human preferences and uses it to extract human-aligned skills from offline data.
We show that SkiP enables a simulated kitchen robot to solve complex multi-step manipulation tasks.
arXiv Detail & Related papers (2021-08-11T18:04:08Z) - What Matters in Learning from Offline Human Demonstrations for Robot
Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation.
Our study analyzes the most critical challenges when learning from offline human data.
We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z) - Learning Models of Individual Behavior in Chess [4.793072503820555]
We develop highly accurate predictive models of individual human behavior in chess.
Our work demonstrates a way to bring AI systems into better alignment with the behavior of individual people.
arXiv Detail & Related papers (2020-08-23T18:24:21Z) - Aligning Superhuman AI with Human Behavior: Chess as a Model System [5.236087378443016]
We develop Maia, a customized version of Alpha-Zero trained on human chess games, that predicts human moves at a much higher accuracy than existing engines.
For a dual task of predicting whether a human will make a large mistake on the next move, we develop a deep neural network that significantly outperforms competitive baselines.
arXiv Detail & Related papers (2020-06-02T18:12:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.