Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
- URL: http://arxiv.org/abs/2210.13382v5
- Date: Wed, 26 Jun 2024 14:27:49 GMT
- Title: Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
- Authors: Kenneth Li, Aspen K. Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
- Abstract summary: Language models show a surprising range of capabilities, but the source of their apparent competence is unclear.
Do these networks just memorize a collection of surface statistics, or do they rely on internal representations of the process that generates the sequences they see?
We investigate this question by applying a variant of the GPT model to the task of predicting legal moves in a simple board game, Othello.
- Score: 75.35278593566068
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models show a surprising range of capabilities, but the source of their apparent competence is unclear. Do these networks just memorize a collection of surface statistics, or do they rely on internal representations of the process that generates the sequences they see? We investigate this question by applying a variant of the GPT model to the task of predicting legal moves in a simple board game, Othello. Although the network has no a priori knowledge of the game or its rules, we uncover evidence of an emergent nonlinear internal representation of the board state. Interventional experiments indicate this representation can be used to control the output of the network and create "latent saliency maps" that can help explain predictions in human terms.
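The prediction target named in the abstract, the set of legal moves in an Othello position, can be made concrete with a short sketch. This is only an illustration of the game rule under standard Othello conventions, not the authors' code or data pipeline; the names `initial_board`, `flips`, `legal_moves`, and `square_name`, and the encoding of squares as integers, are assumptions introduced here.

```python
# Minimal sketch of Othello legal-move computation (illustrative only;
# all function names and the board encoding are hypothetical).
EMPTY, BLACK, WHITE = 0, 1, -1
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def initial_board():
    """Return the standard 8x8 starting position.

    Row index 0..7 maps to ranks 1..8, column index 0..7 to files a..h.
    """
    board = [[EMPTY] * 8 for _ in range(8)]
    board[3][3] = WHITE  # d4
    board[4][4] = WHITE  # e5
    board[3][4] = BLACK  # e4
    board[4][3] = BLACK  # d5
    return board


def flips(board, row, col, player):
    """List the opponent discs that would flip if `player` moved at (row, col)."""
    if board[row][col] != EMPTY:
        return []
    flipped = []
    for dr, dc in DIRECTIONS:
        line = []
        r, c = row + dr, col + dc
        # Walk over a contiguous run of opponent discs in this direction.
        while 0 <= r < 8 and 0 <= c < 8 and board[r][c] == -player:
            line.append((r, c))
            r, c = r + dr, c + dc
        # The run only flips if it is capped by one of the player's own discs.
        if line and 0 <= r < 8 and 0 <= c < 8 and board[r][c] == player:
            flipped.extend(line)
    return flipped


def legal_moves(board, player):
    """A move is legal exactly when it flips at least one opponent disc."""
    return [(r, c) for r in range(8) for c in range(8)
            if flips(board, r, c, player)]


def square_name(row, col):
    """Convert (row, col) indices to algebraic notation such as 'd3'."""
    return "abcdefgh"[col] + str(row + 1)


if __name__ == "__main__":
    board = initial_board()
    print([square_name(r, c) for r, c in legal_moves(board, BLACK)])
```

A model that predicts legal moves well has to account, at least implicitly, for which discs each earlier move flipped, which is why the board state is a natural thing to probe for in its activations.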
Related papers
- States Hidden in Hidden States: LLMs Emerge Discrete State Representations Implicitly [72.24742240125369]
In this paper, we uncover the intrinsic ability of LLMs to perform extended sequences of calculations without relying on chain-of-thought step-by-step solutions.
Remarkably, the most advanced models can directly output the results of two-digit number additions with lengths extending up to 15 addends.
arXiv Detail & Related papers (2024-07-16T06:27:22Z)
- Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models [0.0]
Prior work by Li et al. trained a GPT model on Othello games and found that the model learned an internal representation of the board state.
We extend this work into the more complex domain of chess, training on real games and investigating our model's internal representations.
Unlike Li et al.'s prior synthetic dataset approach, our analysis finds that the model also learns to estimate latent variables like player skill to better predict the next character.
arXiv Detail & Related papers (2024-03-21T18:53:23Z)
- Emergent Linear Representations in World Models of Self-Supervised Sequence Models [5.712566125397807]
Prior work found that an Othello-playing neural network learned nonlinear models of the board state.
We show that probing for "my colour" vs. "opponent's colour" may be a simple yet powerful way to interpret the model's internal state.
arXiv Detail & Related papers (2023-09-02T13:37:34Z)
- Towards Few-shot Inductive Link Prediction on Knowledge Graphs: A Relational Anonymous Walk-guided Neural Process Approach [49.00753238429618]
Few-shot inductive link prediction on knowledge graphs aims to predict missing links for unseen entities from only a few observed links.
Recent inductive methods utilize the sub-graphs around unseen entities to obtain the semantics and predict links inductively.
We propose a novel relational anonymous walk-guided neural process for few-shot inductive link prediction on knowledge graphs, denoted as RawNP.
arXiv Detail & Related papers (2023-06-26T12:02:32Z)
- Towards Prototype-Based Self-Explainable Graph Neural Network [37.90997236795843]
We study a novel problem of learning prototype-based self-explainable GNNs that can simultaneously give accurate predictions and prototype-based explanations on predictions.
The learned prototypes are also used to simultaneously make a prediction for a test instance and provide an instance-level explanation.
arXiv Detail & Related papers (2022-10-05T00:47:42Z)
- Hidden Schema Networks [3.4123736336071864]
We introduce a novel neural language model that enforces, via inductive biases, explicit relational structures.
The model encodes sentences into sequences of symbols, which correspond to nodes visited by biased random walkers.
We show that the model is able to uncover ground-truth graphs from artificially generated datasets of random token sequences.
arXiv Detail & Related papers (2022-07-08T09:26:19Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Temporal Graph Network Embedding with Causal Anonymous Walks Representations [54.05212871508062]
We propose a novel approach for dynamic network representation learning based on Temporal Graph Network.
We also provide a benchmark pipeline for evaluating temporal network embeddings.
We show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks.
arXiv Detail & Related papers (2021-08-19T15:39:52Z)
- A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation [63.042651834453544]
We show that the unsupervised learning of disentangled representations is impossible without inductive biases on both the models and the data.
We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision.
Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision.
arXiv Detail & Related papers (2020-10-27T10:17:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.