Discourse structure interacts with reference but not syntax in neural
language models
- URL: http://arxiv.org/abs/2010.04887v1
- Date: Sat, 10 Oct 2020 03:14:00 GMT
- Title: Discourse structure interacts with reference but not syntax in neural
language models
- Authors: Forrest Davis and Marten van Schijndel
- Abstract summary: We study the ability of language models (LMs) to learn interactions between different linguistic representations.
We find that, contrary to humans, implicit causality only influences LM behavior for reference, not syntax.
Our results suggest that LM behavior can contradict not only learned representations of discourse but also syntactic agreement.
- Score: 17.995905582226463
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language models (LMs) trained on large quantities of text have been claimed
to acquire abstract linguistic representations. Our work tests the robustness
of these abstractions by focusing on the ability of LMs to learn interactions
between different linguistic representations. In particular, we utilized
stimuli from psycholinguistic studies showing that humans can condition
reference (i.e. coreference resolution) and syntactic processing on the same
discourse structure (implicit causality). We compared both transformer and long
short-term memory LMs to find that, contrary to humans, implicit causality only
influences LM behavior for reference, not syntax, despite model representations
that encode the necessary discourse information. Our results further suggest
that LM behavior can contradict not only learned representations of discourse
but also syntactic agreement, pointing to shortcomings of standard language
modeling.
Related papers
- Large Language Models Are Partially Primed in Pronoun Interpretation [6.024776891570197]
We investigate whether large language models (LLMs) display human-like referential biases using stimuli and procedures from real psycholinguistic experiments.
Recent psycholinguistic studies suggest that humans adapt their referential biases with recent exposure to referential patterns.
We find that InstructGPT adapts its pronominal interpretations in response to the frequency of referential patterns in the local discourse.
arXiv Detail & Related papers (2023-05-26T13:30:48Z)
- Language Models as Agent Models [42.37422271002712]
I argue that LMs are models of intentional communication in a specific, narrow sense.
Even in today's non-robust and error-prone models, LMs infer and use representations of fine-grained communicative intentions.
arXiv Detail & Related papers (2022-12-03T20:18:16Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Are Representations Built from the Ground Up? An Empirical Examination
of Local Composition in Language Models [91.3755431537592]
Representing compositional and non-compositional phrases is critical for language understanding.
We first formulate a problem of predicting the LM-internal representations of longer phrases given those of their constituents.
While we would expect the predictive accuracy to correlate with human judgments of semantic compositionality, we find this is largely not the case.
arXiv Detail & Related papers (2022-10-07T14:21:30Z)
- Context Limitations Make Neural Language Models More Human-Like [32.488137777336036]
We show discrepancies in context access between modern neural language models (LMs) and humans in incremental sentence processing.
Additional context limitation was needed to make LMs better simulate human reading behavior.
Our analyses also showed that human-LM gaps in memory access are associated with specific syntactic constructions.
arXiv Detail & Related papers (2022-05-23T17:01:13Z)
- Testing the Ability of Language Models to Interpret Figurative Language [69.59943454934799]
Figurative and metaphorical language are commonplace in discourse.
It remains an open question to what extent modern language models can interpret nonliteral phrases.
We introduce Fig-QA, a Winograd-style nonliteral language understanding task.
arXiv Detail & Related papers (2022-04-26T23:42:22Z)
- Interpreting Language Models with Contrastive Explanations [99.7035899290924]
Language models must consider various features to predict a token, such as its part of speech, number, tense, or semantics.
Existing explanation methods conflate evidence for all these features into a single explanation, which is less interpretable for human understanding.
We show that contrastive explanations are quantifiably better than non-contrastive explanations in verifying major grammatical phenomena.
arXiv Detail & Related papers (2022-02-21T18:32:24Z)
- Towards Language Modelling in the Speech Domain Using Sub-word
Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z)
- Low-Dimensional Structure in the Space of Language Representations is
Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings.
We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI.
This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z)
- A learning perspective on the emergence of abstractions: the curious
case of phonemes [2.580765958706854]
We test two opposing principles regarding the development of language knowledge in linguistically untrained language users.
We probed whether MBL and ECL could give rise to a type of language knowledge that resembles linguistic abstractions.
We show that ECL learning models can learn abstractions and that at least part of the phone inventory can be reliably identified from the input.
arXiv Detail & Related papers (2020-12-14T13:33:34Z)
- Learning Music Helps You Read: Using Transfer to Study Linguistic
Structure in Language Models [27.91397366776451]
Training LSTMs on latent structure (MIDI music or Java code) improves test performance on natural language.
Experiments on transfer between natural languages controlling for vocabulary overlap show that zero-shot performance on a test language is highly correlated with typological similarity to the training language.
arXiv Detail & Related papers (2020-04-30T06:24:03Z)