Related papers: Measuring the Non-Transitivity in Chess

Measuring the Non-Transitivity in Chess

URL: http://arxiv.org/abs/2110.11737v1
Date: Fri, 22 Oct 2021 12:15:42 GMT
Title: Measuring the Non-Transitivity in Chess
Authors: Ricky Sanjaya, Jun Wang, Yaodong Yang
Abstract summary: We quantify the non-transitivity in Chess through real-world data from human players. There exists a strong connection between the degree of non-transitivity and the progression of a Chess player's rating.
Score: 19.618609913302855
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: It has long been believed that Chess is the \emph{Drosophila} of Artificial Intelligence (AI). Studying Chess can productively provide valid knowledge about complex systems. Although remarkable progress has been made on solving Chess, the geometrical landscape of Chess in the strategy space is still mysterious. Judging on AI-generated strategies, researchers hypothesised that the strategy space of Chess possesses a spinning top geometry, with the upright axis representing the \emph{transitive} dimension (e.g., A beats B, B beats C, A beats C), and the radial axis representing the \emph{non-transitive} dimension (e.g., A beats B, B beats C, C beats A). However, it is unclear whether such a hypothesis holds for real-world strategies. In this paper, we quantify the non-transitivity in Chess through real-world data from human players. Specifically, we performed two ways of non-transitivity quantifications -- Nash Clustering and counting the number of Rock-Paper-Scissor cycles -- on over one billion match data from Lichess and FICS. Our findings positively indicate that the strategy space occupied by real-world Chess strategies demonstrates a spinning top geometry, and more importantly, there exists a strong connection between the degree of non-transitivity and the progression of a Chess player's rating. In particular, high degrees of non-transitivity tend to prevent human players from making progress on their Elo rating, whereas progressions are easier to make at the level of ratings where the degree of non-transitivity is lower. Additionally, we also investigate the implication of the degree of non-transitivity for population-based training methods. By considering \emph{fixed-memory Fictitious Play} as a proxy, we reach the conclusion that maintaining large-size and diverse populations of strategies is imperative to training effective AI agents in solving Chess types of games.

Related papers

ChessQA: Evaluating Large Language Models for Chess Understanding [10.480398008794436]
Chess provides an ideal testbed for evaluating the reasoning, modeling, and abstraction capabilities of large language models (LLMs)<n>We present ChessQA, a benchmark that assesses LLM chess understanding across five task categories.<n>We find persistent weaknesses across all five categories and provide results and error analyses by category.
arXiv Detail & Related papers (2025-10-28T00:02:52Z)
Out-of-distribution Tests Reveal Compositionality in Chess Transformers [6.356179251855671]
We train a 270M parameter chess Transformer and test it on out-of-distribution scenarios, designed to reveal failures of systematic generalization.<n>Our analysis shows that Transformers exhibit compositional generalization, as evidenced by strong rule extrapolation.<n>In a more challenging test, we evaluate the models on variants including Chess960 - a variant of chess where starting positions of pieces are randomized.
arXiv Detail & Related papers (2025-10-23T17:51:28Z)
Can Large Language Models Develop Strategic Reasoning? Post-training Insights from Learning Chess [54.5355907369231]
We investigate whether large language models (LLMs) can develop strategic reasoning capabilities through reinforcement learning (RL) in chess.<n>Our experiments show that our distillation-based dense rewards often outperform sparse binary rewards.<n>We provide SFT and RL ablations on chess reasoning training and find evidence that this limitation stems from a deficit in the pretrained models' internal understanding of chess.
arXiv Detail & Related papers (2025-07-01T13:16:34Z)
General search techniques without common knowledge for imperfect-information games, and application to superhuman Fog of War chess [68.20244032271847]
We present Obscuro, the first superhuman AI for Fog of War chess.<n>It introduces advances to search in imperfect-information games, enabling strong, scalable reasoning.<n>Experiments against the prior state-of-the-art AI and human players show that Obscuro is significantly stronger.
arXiv Detail & Related papers (2025-06-02T01:41:27Z)
Explore the Reasoning Capability of LLMs in the Chess Testbed [45.12891789312405]
We propose improving the reasoning capability of large language models in chess by integrating annotated strategy and tactic. We finetune the LLaMA-3-8B model and compare it against state-of-the-art commercial language models in the task of selecting better chess moves.
arXiv Detail & Related papers (2024-11-11T01:42:56Z)
Predicting Chess Puzzle Difficulty with Transformers [0.0]
We present GlickFormer, a novel transformer-based architecture that predicts chess puzzle difficulty by approximating the Glicko-2 rating system. The proposed model utilizes a modified ChessFormer backbone for spatial feature extraction and incorporates temporal information via factorized transformer techniques. Results demonstrate GlickFormer's superior performance compared to the state-of-the-art ChessFormer baseline across multiple metrics.
arXiv Detail & Related papers (2024-10-14T20:39:02Z)
Predicting User Perception of Move Brilliance in Chess [3.434553688053531]
We show the first system for classifying chess moves as brilliant. The system achieves an accuracy of 79% (with 50% base-rate), a PPV of 83%, and an NPV of 75%. We show that a move is more likely to be predicted as brilliant, all things being equal, if a weaker engine considers it lower-quality.
arXiv Detail & Related papers (2024-06-14T17:46:26Z)
Amortized Planning with Large-Scale Transformers: A Case Study on Chess [11.227110138932442]
This paper uses chess, a landmark planning problem in AI, to assess performance on a planning task. ChessBench is a large-scale benchmark of 10 million chess games with legal move and value annotations (15 billion points) provided by Stockfish. We show that, although a remarkably good approximation can be distilled into large-scale transformers via supervised learning, perfect distillation is still beyond reach.
arXiv Detail & Related papers (2024-02-07T00:36:24Z)
Learning to Play Chess from Textbooks (LEAP): a Corpus for Evaluating Chess Moves based on Sentiment Analysis [4.314956204483074]
This paper examines chess textbooks as a new knowledge source for enabling machines to learn how to play chess. We developed the LEAP corpus, a first and new heterogeneous dataset with structured (chess move notations and board states) and unstructured data. We performed empirical experiments that assess the performance of various transformer-based baseline models for sentiment analysis.
arXiv Detail & Related papers (2023-10-31T08:26:02Z)
Intransitively winning chess players positions [91.3755431537592]
The space of relations between winningness of positions of chess players is non-Euclidean. The Zermelo-von Neumann theorem is complemented by statements about possibility vs. impossibility of building pure winning strategies.
arXiv Detail & Related papers (2022-12-11T05:55:05Z)
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning [86.37438204416435]
Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. Decisions in Stratego are made over a large number of discrete actions with no obvious link between action and outcome. DeepNash beats existing state-of-the-art AI methods in Stratego and achieved a yearly (2022) and all-time top-3 rank on the Gravon games platform.
arXiv Detail & Related papers (2022-06-30T15:53:19Z)
Generating Diverse and Competitive Play-Styles for Strategy Games [58.896302717975445]
We propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing a turn-based strategy game (Tribes) We show how it can be parameterized so a quality-diversity algorithm (MAP-Elites) is used to achieve different play-styles while keeping a competitive level of play. Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.
arXiv Detail & Related papers (2021-04-17T20:33:24Z)
Online Learning in Unknown Markov Games [55.07327246187741]
We study online learning in unknown Markov games. We show that achieving sublinear regret against the best response in hindsight is statistically hard. We present an algorithm that achieves a sublinear $tildemathcalO(K2/3)$ regret after $K$ episodes.
arXiv Detail & Related papers (2020-10-28T14:52:15Z)
Assessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess [5.3524101179510595]
We use AlphaZero to creatively explore and design new chess variants. We compare nine other variants that involve atomic changes to the rules of chess. By learning near-optimal strategies for each variant with AlphaZero, we determine what games between strong human players might look like if these variants were adopted.
arXiv Detail & Related papers (2020-09-09T15:49:14Z)
Learning to Play Sequential Games versus Unknown Opponents [93.8672371143881]
We consider a repeated sequential game between a learner, who plays first, and an opponent who responds to the chosen action. We propose a novel algorithm for the learner when playing against an adversarial sequence of opponents. Our results include algorithm's regret guarantees that depend on the regularity of the opponent's response.
arXiv Detail & Related papers (2020-07-10T09:33:05Z)
Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay. During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well. Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.