Complete Chess Games Enable LLM Become A Chess Master
- URL: http://arxiv.org/abs/2501.17186v2
- Date: Thu, 30 Jan 2025 04:02:48 GMT
- Title: Complete Chess Games Enable LLM Become A Chess Master
- Authors: Yinqi Zhang, Xintian Han, Haolong Li, Kedi Chen, Shaohui Lin,
- Abstract summary: Large language models (LLMs) have shown remarkable abilities in text generation, question answering, language translation, reasoning and many other tasks.
Despite LLMs' success in multiple areas, their ability to play abstract games, such as chess, remains underexplored.
Here, we propose ChessLLM, a large language model that plays full chess games.
- Abstract: Large language models (LLMs) have shown remarkable abilities in text generation, question answering, language translation, reasoning and many other tasks. They continue to advance rapidly and are becoming increasingly influential in various fields, from technology and business to education and entertainment. Despite LLMs' success in multiple areas, their ability to play abstract games, such as chess, is underexplored. Chess-playing requires a language model to output legal and reasonable moves from textual inputs. Here, we propose ChessLLM, a large language model that plays full chess games. We transform the game into a textual format, with the board state represented in Forsyth-Edwards Notation (FEN). We show that with simple supervised fine-tuning, our model achieves a professional-level Elo rating of 1788 in matches against the standard Elo-rated Stockfish when permitted to sample 10 times. We further show that data quality matters: supervision on long-round data yields a 350-point Elo improvement over short-round data.
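The abstract describes an inference loop in which the board is serialized to text (FEN), the model proposes a move, and sampling repeats up to 10 times until a legal move is produced. The sketch below illustrates that loop under stated assumptions: it is not the authors' code, `toy_model` is a hypothetical stand-in for ChessLLM, and legality is checked against a hard-coded set of the 20 legal first moves for White rather than a full rules engine.

```python
import random

# The 20 legal first moves for White in UCI notation:
# 16 pawn moves (one or two squares forward) plus 4 knight moves.
LEGAL_FIRST_MOVES = (
    [f"{f}2{f}3" for f in "abcdefgh"]
    + [f"{f}2{f}4" for f in "abcdefgh"]
    + ["b1a3", "b1c3", "g1f3", "g1h3"]
)

START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

def toy_model(fen: str) -> str:
    """Hypothetical stand-in for ChessLLM: given a FEN string, emit a UCI
    move; occasionally it outputs an illegal move, as a real LLM might."""
    return random.choice(LEGAL_FIRST_MOVES + ["e1e8"])

def sample_legal_move(fen: str, legal_moves: set, max_samples: int = 10):
    """Resample the model (up to max_samples times, mirroring the paper's
    budget of 10) until it produces a legal move; return None on failure."""
    for _ in range(max_samples):
        candidate = toy_model(fen)
        if candidate in legal_moves:
            return candidate
    return None

move = sample_legal_move(START_FEN, set(LEGAL_FIRST_MOVES))
```

In a real pipeline, the legality check would come from a rules engine (e.g. the `python-chess` library) and the fallback on `None` might be a random legal move or resignation.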
Related papers
- Explore the Reasoning Capability of LLMs in the Chess Testbed [45.12891789312405]
We propose improving the reasoning capability of large language models in chess by integrating annotated strategies and tactics.
We finetune the LLaMA-3-8B model and compare it against state-of-the-art commercial language models in the task of selecting better chess moves.
arXiv Detail & Related papers (2024-11-11T01:42:56Z)
- Instruction-Driven Game Engine: A Poker Case Study [53.689520884467065]
The IDGE project aims to democratize game development by enabling a large language model to follow free-form game descriptions and generate game-play processes.
We train the IDGE in a curriculum manner that progressively increases its exposure to complex scenarios.
Our initial progress lies in developing an IDGE for Poker, which not only supports a wide range of poker variants but also allows for highly individualized new poker games through natural language inputs.
arXiv Detail & Related papers (2024-10-17T11:16:27Z)
- Learning the Latent Rules of a Game from Data: A Chess Story [0.0]
We show that 28M and 125M parameter pretrained small language models (SLMs) can be instruction fine-tuned with 1,000 to 1,000,000 examples.
We also explore the impact of successive language model fine-tuning epochs on improved outcomes.
arXiv Detail & Related papers (2024-10-03T12:19:49Z)
- Instruction-Driven Game Engines on Large Language Models [59.280666591243154]
The IDGE project aims to democratize game development by enabling a large language model to follow free-form game rules.
We train the IDGE in a curriculum manner that progressively increases the model's exposure to complex scenarios.
Our initial progress lies in developing an IDGE for Poker, a universally cherished card game.
arXiv Detail & Related papers (2024-03-30T08:02:16Z)
- Amortized Planning with Large-Scale Transformers: A Case Study on Chess [11.227110138932442]
This paper uses chess, a landmark planning problem in AI, to assess performance on a planning task.
ChessBench is a large-scale benchmark of 10 million chess games with legal move and value annotations (15 billion points) provided by Stockfish.
We show that, although a remarkably good approximation can be distilled into large-scale transformers via supervised learning, perfect distillation is still beyond reach.
arXiv Detail & Related papers (2024-02-07T00:36:24Z)
- FIREBALL: A Dataset of Dungeons and Dragons Actual-Play with Structured Game State Information [75.201485544517]
We present FIREBALL, a large dataset containing nearly 25,000 unique sessions from real D&D gameplay on Discord with ground-truth game state information.
We demonstrate that FIREBALL can improve natural language generation (NLG) by using Avrae state information.
arXiv Detail & Related papers (2023-05-02T15:36:10Z)
- Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models [68.85478477006178]
We present a Promptable Game Model (PGM) for neural video game simulators.
It allows a user to play the game by prompting it with high- and low-level action sequences.
Most captivatingly, our PGM unlocks the director's mode, where the game is played by specifying goals for the agents in the form of a prompt.
Our method significantly outperforms existing neural video game simulators in terms of rendering quality and unlocks applications beyond the capabilities of the current state of the art.
arXiv Detail & Related papers (2023-03-23T17:43:17Z)
- Learning Chess With Language Models and Transformers [0.0]
Representing a board game and its positions by text-based notation enables the possibility of NLP applications.
We apply BERT models, first to the simple game of Nim, to analyze their performance in the presence of noise in a few-shot learning setup.
The model practically learns the rules of chess and can survive games against Stockfish at a category-A rating level.
arXiv Detail & Related papers (2022-09-24T01:22:59Z)
- Learning Chess Blindfolded: Evaluating Language Models on State Tracking [69.3794549747725]
We consider the task of language modeling for the game of chess.
Unlike natural language, chess notations describe a simple, constrained, and deterministic domain.
We find that transformer language models can learn to track pieces and predict legal moves with high accuracy when trained solely on move sequences.
arXiv Detail & Related papers (2021-02-26T01:16:23Z)
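The state-tracking task in the entry above asks a model to infer piece locations from move sequences alone. A minimal sketch of the underlying bookkeeping, assuming plain UCI move strings (it is not the paper's probing setup, and castling, en passant, and promotion are omitted for brevity):

```python
def start_position() -> dict[str, str]:
    """Square -> piece map for the initial position (uppercase = White)."""
    board: dict[str, str] = {}
    for f, piece in zip("abcdefgh", "RNBQKBNR"):
        board[f + "1"] = piece          # White back rank
        board[f + "8"] = piece.lower()  # Black back rank
        board[f + "2"] = "P"            # White pawns
        board[f + "7"] = "p"            # Black pawns
    return board

def replay(moves: list[str]) -> dict[str, str]:
    """Apply UCI moves such as 'e2e4' to the square->piece map.
    A capture simply overwrites the piece on the destination square."""
    board = start_position()
    for mv in moves:
        src, dst = mv[:2], mv[2:4]
        board[dst] = board.pop(src)
    return board

# After 1. e4 e5 2. Nf3 the knight sits on f3 and both e-pawns have advanced.
state = replay(["e2e4", "e7e5", "g1f3"])
```

Maintaining exactly this mapping implicitly, without ever seeing the board, is what the probing experiments test transformer language models for.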
This list is automatically generated from the titles and abstracts of the papers in this site.