Large language models implicitly learn to straighten neural sentence
trajectories to construct a predictive representation of natural language
- URL: http://arxiv.org/abs/2311.04930v1
- Date: Sun, 5 Nov 2023 22:16:21 GMT
- Title: Large language models implicitly learn to straighten neural sentence
trajectories to construct a predictive representation of natural language
- Authors: Eghbal A. Hosseini, Evelina Fedorenko
- Abstract summary: We test a hypothesis about predictive representations of autoregressive transformers.
The key insight is that straighter trajectories should facilitate prediction via linear extrapolation.
We quantify straightness using a 1-dimensional curvature metric.
- Score: 2.1756081703276
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Predicting upcoming events is critical to our ability to interact with our
environment. Transformer models, trained on next-word prediction, appear to
construct representations of linguistic input that can support diverse
downstream tasks. But how does a predictive objective shape such
representations? Inspired by recent work in vision (Henaff et al., 2019), we
test a hypothesis about predictive representations of autoregressive
transformers. In particular, we test whether the neural trajectory of a
sentence becomes progressively straighter as it passes through the network
layers. The key insight is that straighter trajectories should facilitate
prediction via linear extrapolation. We quantify straightness using a
1-dimensional curvature metric, and present four findings in support of the
trajectory straightening hypothesis: i) In trained models, the curvature
decreases from the early to the deeper layers of the network. ii) Models that
perform better on the next-word prediction objective exhibit greater decreases
in curvature, suggesting that this improved ability to straighten sentence
trajectories may be the driver of better language modeling performance. iii)
Given the same linguistic context, the sequences that are generated by the
model have lower curvature than the actual continuations observed in a language
corpus, suggesting that the model favors straighter trajectories for making
predictions. iv) A consistent relationship holds between the average curvature
and the average surprisal of sentences in the deep model layers, such that
sentences with straighter trajectories also have lower surprisal. Importantly,
untrained models do not exhibit these behaviors. In tandem, these results
support the trajectory straightening hypothesis and provide a possible
mechanism for how the geometry of the internal representations of
autoregressive models supports next-word prediction.
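The abstract does not spell out the curvature metric, but following Henaff et al. (2019) such metrics are typically computed as the average angle between consecutive difference vectors along the token trajectory. Below is a minimal NumPy sketch under that assumption; the function names and the (tokens x dim) array layout are illustrative, not taken from the paper.

```python
import numpy as np

def trajectory_curvature(hidden_states: np.ndarray) -> float:
    """Average curvature (radians) of one sentence's trajectory at one layer.

    hidden_states: array of shape (num_tokens, hidden_dim), one row per token.
    A perfectly straight trajectory has curvature 0; larger values mean
    sharper turns between consecutive token-to-token steps.
    """
    diffs = np.diff(hidden_states, axis=0)          # step vectors v_k = x_{k+1} - x_k
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    diffs = diffs / np.clip(norms, 1e-12, None)     # unit-normalize each step
    cosines = np.sum(diffs[:-1] * diffs[1:], axis=1)  # cos(angle) between neighboring steps
    return float(np.arccos(np.clip(cosines, -1.0, 1.0)).mean())

def linear_extrapolation(hidden_states: np.ndarray) -> np.ndarray:
    # Straighter trajectories make this a better predictor: the next state
    # is approximated by continuing the most recent step, x_T + (x_T - x_{T-1}).
    return 2 * hidden_states[-1] - hidden_states[-2]
```

On this reading, finding (i) amounts to trajectory_curvature decreasing when applied to the same sentence's hidden states at progressively deeper layers of a trained model.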
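Finding (iv) links average curvature to average surprisal. For concreteness, per-sentence surprisal under an autoregressive model can be estimated as the mean negative log-probability of its tokens; here is a short sketch using the Hugging Face transformers API (the choice of GPT-2 and the helper name are assumptions for illustration, not the paper's setup).

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def mean_surprisal(sentence: str) -> float:
    # Mean negative log-probability (nats per token) of the sentence,
    # with each token conditioned on its left context.
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # labels are shifted internally
    return out.loss.item()
```

The hypothesis predicts that, across a corpus, sentences with lower deep-layer curvature should also score lower on this quantity.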
Related papers
- The Power of Next-Frame Prediction for Learning Physical Laws [5.624870417352306]
Next-frame prediction is a useful and powerful method for modelling and understanding the dynamics of video data.
We introduce six diagnostic simulation video datasets derived from fundamental physical laws created by varying physical constants such as gravity and mass.
We find that the generative training phase alone induces a model state that predicts physical constants significantly better than a randomly initialized model.
arXiv Detail & Related papers (2024-05-21T17:55:54Z)
- On the Origins of Linear Representations in Large Language Models [51.88404605700344]
We introduce a simple latent variable model to formalize the concept dynamics of next-token prediction.
Experiments show that linear representations emerge when learning from data matching the latent variable model.
We additionally confirm some predictions of the theory using the LLaMA-2 large language model.
arXiv Detail & Related papers (2024-03-06T17:17:36Z)
- Humans and language models diverge when predicting repeating text [52.03471802608112]
We present a scenario in which the performance of humans and LMs diverges.
Human and GPT-2 LM predictions are strongly aligned in the first presentation of a text span, but their performance quickly diverges when memory begins to play a role.
We hope that this scenario will spur future work in bringing LMs closer to human behavior.
arXiv Detail & Related papers (2023-10-10T08:24:28Z)
- Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation [66.86987509942607]
We evaluate how such a pretraining paradigm should be carried out in imitation learning.
We consider a setting where the pretraining corpus consists of multitask demonstrations.
We argue that inverse dynamics modeling is well-suited to this setting.
arXiv Detail & Related papers (2023-05-26T14:40:46Z)
- Explaining How Transformers Use Context to Build Predictions [0.1749935196721634]
Language Generation Models produce words based on the previous context.
It is still unclear how prior words affect the model's decision throughout the layers.
We leverage recent advances in explainability of the Transformer and present a procedure to analyze models for language generation.
arXiv Detail & Related papers (2023-05-21T18:29:10Z)
- Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models fine-tuned with few examples exhibit a strong prediction bias across labels.
Although few-shot fine-tuning can mitigate this prediction bias, our analysis shows that the models gain their performance improvement by capturing non-task-related features.
These observations warn that pursuing model performance with fewer examples may incur pathological prediction behavior.
arXiv Detail & Related papers (2022-04-17T15:55:18Z)
- Trajectory Prediction with Linguistic Representations [27.71805777845141]
We present a novel trajectory prediction model that uses linguistic intermediate representations to forecast trajectories.
The model learns the meaning of each of the words without direct per-word supervision.
It generates a linguistic description of trajectories which captures maneuvers and interactions over an extended time interval.
arXiv Detail & Related papers (2021-10-19T05:22:38Z)
- Generative Text Modeling through Short Run Inference [47.73892773331617]
The present work proposes a short-run dynamics for inference: it is initialized from the prior distribution of the latent variable and then runs a small number of Langevin dynamics steps guided by its posterior distribution.
We show that models trained with short-run dynamics model the data more accurately than strong language-model and VAE baselines, and exhibit no sign of posterior collapse.
arXiv Detail & Related papers (2021-05-27T09:14:35Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Improved Speech Representations with Multi-Target Autoregressive Predictive Coding [23.424410568555547]
We extend the hypothesis that hidden states that can accurately predict future frames are a useful representation for many downstream tasks.
We propose an auxiliary objective that serves as a regularization to improve generalization of the future frame prediction task.
arXiv Detail & Related papers (2020-04-11T01:09:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.