Implicit Representations of Meaning in Neural Language Models
- URL: http://arxiv.org/abs/2106.00737v1
- Date: Tue, 1 Jun 2021 19:23:20 GMT
- Title: Implicit Representations of Meaning in Neural Language Models
- Authors: Belinda Z. Li, Maxwell Nye, Jacob Andreas
- Abstract summary: We identify contextual word representations that function as models of entities and situations as they evolve throughout a discourse.
Our results indicate that prediction in pretrained neural language models is supported, at least in part, by dynamic representations of meaning and implicit simulation of entity state.
- Score: 31.71898809435222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Does the effectiveness of neural language models derive entirely from
accurate modeling of surface word co-occurrence statistics, or do these models
represent and reason about the world they describe? In BART and T5 transformer
language models, we identify contextual word representations that function as
models of entities and situations as they evolve throughout a discourse. These
neural representations have functional similarities to linguistic models of
dynamic semantics: they support a linear readout of each entity's current
properties and relations, and can be manipulated with predictable effects on
language generation. Our results indicate that prediction in pretrained neural
language models is supported, at least in part, by dynamic representations of
meaning and implicit simulation of entity state, and that this behavior can be
learned with only text as training data. Code and data are available at
https://github.com/belindal/state-probes .
Related papers
- A generative framework to bridge data-driven models and scientific theories in language neuroscience [84.76462599023802]
We present generative explanation-mediated validation, a framework for generating concise explanations of language selectivity in the brain.
We show that explanatory accuracy is closely related to the predictive power and stability of the underlying statistical models.
arXiv Detail & Related papers (2024-10-01T15:57:48Z) - Probabilistic Transformer: A Probabilistic Dependency Model for
Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively to transformers on small to medium sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z) - Transparency at the Source: Evaluating and Interpreting Language Models
With Access to the True Distribution [4.01799362940916]
We present a setup for training, evaluating and interpreting neural language models, that uses artificial, language-like data.
The data is generated using a massive probabilistic grammar, that is itself derived from a large natural language corpus.
With access to the underlying true source, our results show striking differences and outcomes in learning dynamics between different classes of words.
arXiv Detail & Related papers (2023-10-23T12:03:01Z) - Seeing in Words: Learning to Classify through Language Bottlenecks [59.97827889540685]
Humans can explain their predictions using succinct and intuitive descriptions.
We show that a vision model whose feature representations are text can effectively classify ImageNet images.
arXiv Detail & Related papers (2023-06-29T00:24:42Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Pretraining on Interactions for Learning Grounded Affordance
Representations [22.290431852705662]
We train a neural network to predict objects' trajectories in a simulated interaction.
We show that our network's latent representations differentiate between both observed and unobserved affordances.
Our results suggest a way in which modern deep learning approaches to grounded language learning can be integrated with traditional formal semantic notions of lexical representations.
arXiv Detail & Related papers (2022-07-05T19:19:53Z) - Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z) - Hierarchical Interpretation of Neural Text Classification [31.95426448656938]
This paper proposes a novel Hierarchical INTerpretable neural text classifier, called Hint, which can automatically generate explanations of model predictions.
Experimental results on both review datasets and news datasets show that our proposed approach achieves text classification results on par with existing state-of-the-art text classifiers.
arXiv Detail & Related papers (2022-02-20T11:15:03Z) - Scaling Language Models: Methods, Analysis & Insights from Training
Gopher [83.98181046650664]
We present an analysis of Transformer-based language model performance across a wide range of model scales.
Gains from scale are largest in areas such as reading comprehension, fact-checking, and the identification of toxic language.
We discuss the application of language models to AI safety and the mitigation of downstream harms.
arXiv Detail & Related papers (2021-12-08T19:41:47Z) - The Grammar-Learning Trajectories of Neural Language Models [42.32479280480742]
We show that neural language models acquire linguistic phenomena in a similar order, despite having different end performances over the data.
Results suggest that NLMs exhibit consistent developmental'' stages.
arXiv Detail & Related papers (2021-09-13T16:17:23Z) - A Visuospatial Dataset for Naturalistic Verb Learning [18.654373173232205]
We introduce a new dataset for training and evaluating grounded language models.
Our data is collected within a virtual reality environment and is designed to emulate the quality of language data to which a pre-verbal child is likely to have access.
We use the collected data to compare several distributional semantics models for verb learning.
arXiv Detail & Related papers (2020-10-28T20:47:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.