The Chess Transformer: Mastering Play using Generative Language Models
- URL: http://arxiv.org/abs/2008.04057v5
- Date: Fri, 18 Sep 2020 17:12:52 GMT
- Title: The Chess Transformer: Mastering Play using Generative Language Models
- Authors: David Noever, Matt Ciolino and Josh Kalin
- Abstract summary: This work demonstrates that natural language transformers can support more generic strategic modeling.
In addition to learning natural language skills, the abstract transformer architecture can generate meaningful moves on a chessboard.
We anticipate future work will build on this transformer's promise, particularly in other strategy games.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work demonstrates that natural language transformers can support more
generic strategic modeling, particularly for text-archived games. In addition
to learning natural language skills, the abstract transformer architecture can
generate meaningful moves on a chessboard. With further fine-tuning, the
transformer learns complex gameplay by training on 2.8 million chess games in
Portable Game Notation. After 30,000 training steps, OpenAI's Generative
Pre-trained Transformer (GPT-2) optimizes weights for 774 million parameters.
This fine-tuned Chess Transformer generates plausible strategies and displays
game formations identifiable as classic openings, such as English or the Slav
Exchange. Finally, in live play, the novel model demonstrates a
human-to-transformer interface that correctly filters illegal moves and
provides a novel method to challenge the transformer's chess strategies. We
anticipate future work will build on this transformer's promise, particularly
in other strategy games where features can capture the underlying complex rule
syntax from simple but expressive player annotations.
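To make the training recipe above concrete, the sketch below fine-tunes a GPT-2 checkpoint on a plain-text file of PGN movetext, one game per line. It is a minimal sketch under stated assumptions, not the authors' released code: the Hugging Face transformers tooling, the hypothetical file pgn_games.txt, and the batch and block sizes are illustrative choices; only the 774 million parameter model size and the 30,000-step budget come from the abstract. A companion sketch of the legal-move filtering used in live play follows the related-papers list below.

    # Hypothetical fine-tuning sketch (not the paper's code): GPT-2 on PGN movetext.
    from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel,
                              GPT2TokenizerFast, TextDataset, Trainer, TrainingArguments)

    MODEL_NAME = "gpt2-large"  # 774M parameters, matching the size quoted in the abstract

    tokenizer = GPT2TokenizerFast.from_pretrained(MODEL_NAME)
    model = GPT2LMHeadModel.from_pretrained(MODEL_NAME)

    # pgn_games.txt (hypothetical name): one movetext string per line,
    # e.g. "1. d4 d5 2. c4 c6 3. cxd5 cxd5 ..."
    train_dataset = TextDataset(tokenizer=tokenizer,
                                file_path="pgn_games.txt",
                                block_size=256)
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="chess-gpt2",
                               max_steps=30_000,              # 30,000 steps, per the abstract
                               per_device_train_batch_size=2),
        data_collator=collator,
        train_dataset=train_dataset,
    )
    trainer.train()
    trainer.save_model("chess-gpt2")

TextDataset simply chunks the file into fixed-length token blocks; any equivalent causal language modeling pipeline would serve the same purpose.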
Related papers
- Mastering Chess with a Transformer Model [0.0]
We show that transformers endowed with a sufficiently expressive position representation can match existing chess-playing models at a fraction of the computational cost.
Our architecture, which we call the Chessformer, significantly outperforms AlphaZero in both playing strength and puzzle solving ability with 8x less computation.
arXiv Detail & Related papers (2024-09-18T19:05:21Z)
- Transformer Explainer: Interactive Learning of Text-Generative Models [65.91049787390692]
Transformer Explainer is an interactive visualization tool designed for non-experts to learn about Transformers through the GPT-2 model.
It runs a live GPT-2 instance locally in the user's browser, empowering users to experiment with their own input and observe in real-time how the internal components and parameters of the Transformer work together.
arXiv Detail & Related papers (2024-08-08T17:49:07Z)
- A Transformer with Stack Attention [84.18399019794036]
We propose augmenting transformer-based language models with a differentiable, stack-based attention mechanism.
Our stack-based attention mechanism can be incorporated into any transformer-based language model and adds a level of interpretability to the model.
We show that the addition of our stack-based attention mechanism enables the transformer to model some, but not all, deterministic context-free languages.
arXiv Detail & Related papers (2024-05-07T17:47:57Z)
- Learning Transformer Programs [78.9509560355733]
We introduce a procedure for training Transformers that are mechanistically interpretable by design.
Instead of compiling human-written programs into Transformers, we design a modified Transformer that can be trained using gradient-based optimization.
The Transformer Programs can automatically find reasonable solutions, performing on par with standard Transformers of comparable size.
arXiv Detail & Related papers (2023-06-01T20:27:01Z)
- Word Play for Playing Othello (Reverses) [0.0]
This research applies both the larger (GPT-3) and smaller (GPT-2) language models to explore complex strategies for the game of Othello (or Reverses).
The language model automatically captures or emulates championship-level strategies.
The fine-tuned GPT-2 model generates Othello games ranging from 13% to 71% completion, while the larger GPT-3 model reaches 41% of a complete game.
arXiv Detail & Related papers (2022-07-18T17:13:32Z)
- Multi-Game Decision Transformers [49.257185338595434]
We show that a single transformer-based model can play a suite of up to 46 Atari games simultaneously at close-to-human performance.
We compare several approaches in this multi-game setting, such as online and offline RL methods and behavioral cloning.
We find that our Multi-Game Decision Transformer models offer the best scalability and performance.
arXiv Detail & Related papers (2022-05-30T16:55:38Z)
- Leveraging Transformers for StarCraft Macromanagement Prediction [1.5469452301122177]
We introduce a transformer-based neural architecture for two key StarCraft II macromanagement tasks: global state and build order prediction.
Unlike recurrent neural networks, which suffer from a recency bias, transformers are able to capture patterns across very long time horizons.
One key advantage of transformers is their ability to generalize well, and we demonstrate that our model achieves an even better accuracy when used in a transfer learning setting.
arXiv Detail & Related papers (2021-10-11T15:12:21Z)
- Learning Chess Blindfolded: Evaluating Language Models on State Tracking [69.3794549747725]
We consider the task of language modeling for the game of chess.
Unlike natural language, chess notations describe a simple, constrained, and deterministic domain.
We find that transformer language models can learn to track pieces and predict legal moves with high accuracy when trained solely on move sequences.
arXiv Detail & Related papers (2021-02-26T01:16:23Z)
- The Go Transformer: Natural Language Modeling for Game Play [0.0]
This work applies natural language modeling to generate plausible strategic moves in the ancient game of Go.
We train the Generative Pretrained Transformer (GPT-2) to mimic the style of Go champions as archived in Smart Game Format.
The trained model further generates valid but previously unseen strategies for Go.
arXiv Detail & Related papers (2020-07-07T14:37:27Z)
- Segatron: Segment-Aware Transformer for Language Modeling and Understanding [79.84562707201323]
We propose a segment-aware Transformer (Segatron) to generate better contextual representations from sequential tokens.
We first introduce the segment-aware mechanism to Transformer-XL, which is a popular Transformer-based language model.
We find that our method can further improve the Transformer-XL base model and large model, achieving 17.1 perplexity on the WikiText-103 dataset.
arXiv Detail & Related papers (2020-04-30T17:38:27Z)
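The abstract describes a human-to-transformer interface that filters illegal moves during live play, and the "Learning Chess Blindfolded" entry above evaluates how well move sequences alone let a model track board state. Both ideas can be illustrated with the python-chess library, which replays a game and exposes the ground-truth legal-move set. The sketch below is a hypothetical illustration, not either paper's code: sample_move, the prompt format, the retry budget, and the chess-gpt2 checkpoint path (from the fine-tuning sketch above) are all assumptions.

    # Hypothetical live-play filter (not the paper's interface code): sample SAN
    # continuations from the fine-tuned model and keep the first legal one.
    import chess
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-large")
    model = GPT2LMHeadModel.from_pretrained("chess-gpt2")     # checkpoint from the sketch above

    def sample_move(board: chess.Board, prompt: str, tries: int = 20):
        """Return a legal chess.Move proposed by the language model, or None."""
        for _ in range(tries):
            ids = tokenizer(prompt, return_tensors="pt").input_ids
            out = model.generate(ids, max_new_tokens=6, do_sample=True, top_k=50,
                                 pad_token_id=tokenizer.eos_token_id)
            continuation = tokenizer.decode(out[0, ids.shape[1]:], skip_special_tokens=True)
            for token in continuation.split():
                candidate = token.rstrip(".")
                if not candidate or candidate[0].isdigit():   # skip move numbers like "12."
                    continue
                try:
                    return board.parse_san(candidate)         # raises ValueError if illegal
                except ValueError:
                    break                                     # resample on an illegal move
        return None

    # Replay the game so far to obtain the ground-truth state (the state-tracking
    # setup of "Learning Chess Blindfolded"), then ask the model for Black's reply.
    board = chess.Board()
    for san in ["e4", "e5", "Nf3", "Nc6", "Bb5"]:             # Ruy Lopez opening
        board.push_san(san)
    print(board.piece_at(chess.B5))                           # ground truth: White bishop on b5
    reply = sample_move(board, "1. e4 e5 2. Nf3 Nc6 3. Bb5")
    print(board.san(reply) if reply else "no legal move sampled")

In a live game the prompt would simply grow with each human reply before resampling; the abstract only states that illegal moves are filtered out, so the surrounding loop here is an illustrative choice.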