Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models
- URL: http://arxiv.org/abs/2403.15498v2
- Date: Sun, 14 Jul 2024 20:23:19 GMT
- Title: Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models
- Authors: Adam Karvonen
- Abstract summary: Prior work trained a GPT model on synthetic Othello games and found that the model learned an internal representation of the board state.
We extend this work into the more complex domain of chess, training on real games and investigating our model's internal representations.
Unlike Li et al.'s prior synthetic dataset approach, our analysis finds that the model also learns to estimate latent variables like player skill to better predict the next character.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models have shown unprecedented capabilities, sparking debate over the source of their performance. Is it merely the outcome of learning syntactic patterns and surface-level statistics, or do they extract semantics and a world model from the text? Prior work by Li et al. investigated this by training a GPT model on synthetic, randomly generated Othello games and found that the model learned an internal representation of the board state. We extend this work into the more complex domain of chess, training on real games and investigating our model's internal representations using linear probes and contrastive activations. The model is given no a priori knowledge of the game and is solely trained on next character prediction, yet we find evidence of internal representations of board state. We validate these internal representations by using them to make interventions on the model's activations and edit its internal board state. Unlike Li et al.'s prior synthetic dataset approach, our analysis finds that the model also learns to estimate latent variables like player skill to better predict the next character. We derive a player skill vector and add it to the model, improving the model's win rate by up to 2.6 times.
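The probe-and-intervene pipeline the abstract describes (fit a linear probe on activations to decode board state, then derive a contrastive "skill vector" and add it to the residual stream) can be sketched as follows. This is a minimal illustration on synthetic activations, not the paper's actual implementation: the dimensions, the noise-free encoding, and the names `encode`, `probe`, and `skill_vec` are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: residual-stream activations for one board square.
# Assume the model linearly encodes the square's state (0=empty, 1=white, 2=black).
d_model, n_states, n_samples = 16, 3, 300
encode = rng.normal(size=(n_states, d_model))  # unknown "true" encoding directions
states = rng.integers(0, n_states, size=n_samples)
acts = encode[states]                          # activations (noise-free for clarity)

# Linear probe: least-squares map from activations to one-hot square state.
onehot = np.eye(n_states)[states]
probe, *_ = np.linalg.lstsq(acts, onehot, rcond=None)
pred = np.argmax(acts @ probe, axis=1)
accuracy = float((pred == states).mean())      # perfect recovery on this toy data

# Contrastive "skill vector": mean activation difference between two groups
# (stand-ins for activations from high- vs. low-skill games).
high_skill = acts[:150] + 0.5
low_skill = acts[150:]
skill_vec = high_skill.mean(axis=0) - low_skill.mean(axis=0)

# Intervention: add the vector to activations at inference time.
steered = acts + 1.0 * skill_vec
```

In the paper's setting the activations come from a transformer's residual stream and the probe is trained per layer and per square; here a single closed-form least-squares fit stands in for that training loop.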
Related papers
- Emergent Linear Representations in World Models of Self-Supervised Sequence Models [5.712566125397807]
An Othello-playing neural network has been shown to learn nonlinear models of the board state.
We show that probing for "my colour" vs. "opponent's colour" may be a simple yet powerful way to interpret the model's internal state.
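The relative-colour reframing above can be illustrated with a toy board encoding: the same position gets different labels depending on whose turn it is. The `to_relative` helper and the integer labels are hypothetical illustrations, not code from the paper.

```python
import numpy as np

# Hypothetical 8x8 Othello board: 0=empty, 1=black, 2=white.
board = np.zeros((8, 8), dtype=int)
board[3, 3], board[4, 4] = 2, 2
board[3, 4], board[4, 3] = 1, 1

def to_relative(board, player):
    """Relabel absolute colours as 1="my colour", 2="opponent's colour"."""
    rel = board.copy()
    rel[board == player] = 1
    rel[(board != player) & (board != 0)] = 2
    return rel

# The same position yields different relative labels per side to move.
as_black = to_relative(board, player=1)
as_white = to_relative(board, player=2)
```

Probing for this relative encoding, rather than for absolute black/white, is what reportedly makes the board-state representation linearly decodable.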
arXiv Detail & Related papers (2023-09-02T13:37:34Z)
- Opening the Black Box: Analyzing Attention Weights and Hidden States in Pre-trained Language Models for Non-language Tasks [0.8889304968879164]
We apply a pre-trained language model to constrained arithmetic problems with hierarchical structure and analyze its attention weights and hidden states.
The investigation reveals promising results, with the model addressing hierarchical problems in a moderately structured manner, similar to human problem-solving strategies.
The attention analysis allows us to hypothesize that the model can generalize to longer sequences on the ListOps dataset, a conclusion later confirmed through testing on sequences longer than those in the training set.
arXiv Detail & Related papers (2023-06-21T11:48:07Z) - Training Trajectories of Language Models Across Scales [99.38721327771208]
Scaling up language models has led to unprecedented performance gains.
How do language models of different sizes learn during pre-training?
Why do larger language models demonstrate more desirable behaviors?
arXiv Detail & Related papers (2022-12-19T19:16:29Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP)
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task [75.35278593566068]
Language models show a surprising range of capabilities, but the source of their apparent competence is unclear.
Do these networks just memorize a collection of surface statistics, or do they rely on internal representations of the process that generates the sequences they see?
We investigate this question by applying a variant of the GPT model to the task of predicting legal moves in a simple board game, Othello.
arXiv Detail & Related papers (2022-10-24T16:29:55Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data. Instead, you are given access to a set of expert models and their predictions, alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Machine Learning in Sports: A Case Study on Using Explainable Models for Predicting Outcomes of Volleyball Matches [0.0]
This paper explores a two-phased Explainable Artificial Intelligence (XAI) approach to predicting the outcomes of matches in the Brazilian volleyball league (SuperLiga).
In the first phase, we directly use the interpretable rule-based ML models that provide a global understanding of the model's behaviors.
In the second phase, we construct non-linear models such as a Support Vector Machine (SVM) and a Deep Neural Network (DNN) to obtain strong predictive performance on the outcomes of volleyball matches.
arXiv Detail & Related papers (2022-06-18T18:09:15Z)
- Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z)
- Learning Chess Blindfolded: Evaluating Language Models on State Tracking [69.3794549747725]
We consider the task of language modeling for the game of chess.
Unlike natural language, chess notations describe a simple, constrained, and deterministic domain.
We find that transformer language models can learn to track pieces and predict legal moves with high accuracy when trained solely on move sequences.
arXiv Detail & Related papers (2021-02-26T01:16:23Z)
- Visualizing MuZero Models [0.23624125155742054]
MuZero, a model-based reinforcement learning algorithm, achieved state-of-the-art performance in Chess, Shogi and the game of Go.
We visualize the latent representation of MuZero agents.
We propose two regularization techniques to stabilize MuZero's performance.
arXiv Detail & Related papers (2021-02-25T15:25:17Z)
- Navigating Human Language Models with Synthetic Agents [7.99536002595393]
We train a version of GPT-2 on a corpus of historical chess games, and then "launch" clusters of synthetic agents into the model.
We find that the percentages of moves by piece produced by the model are substantially similar to human patterns.
arXiv Detail & Related papers (2020-08-10T14:39:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.