Collateral facilitation in humans and language models
- URL: http://arxiv.org/abs/2211.05198v1
- Date: Wed, 9 Nov 2022 21:08:08 GMT
- Title: Collateral facilitation in humans and language models
- Authors: James A. Michaelov, Benjamin K. Bergen
- Abstract summary: Humans display a processing advantage for highly anomalous words that are semantically related to the preceding context; we show that contemporary language models almost always display the same advantage.
We discuss the implications for our understanding of both human language comprehension and the predictions made by language models.
- Score: 0.6091702876917281
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Are the predictions of humans and language models affected by similar things?
Research suggests that while comprehending language, humans make predictions
about upcoming words, with more predictable words being processed more easily.
However, evidence also shows that humans display a similar processing advantage
for highly anomalous words when these words are semantically related to the
preceding context or to the most probable continuation. Using stimuli from 3
psycholinguistic experiments, we find that this is almost always also the
case for 8 contemporary transformer language models (BERT, ALBERT, RoBERTa,
XLM-R, GPT-2, GPT-Neo, GPT-J, and XGLM). We then discuss the implications of
this phenomenon for our understanding of both human language comprehension and
the predictions made by language models.
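For context, the measurement behind results like these can be sketched in a few lines: the surprisal (negative log probability) that a causal language model assigns to a critical word given its preceding context. The snippet below is a minimal illustration using GPT-2 and the Hugging Face transformers library; it is not the authors' code, and the example sentence and word choices are invented for illustration rather than taken from the experimental stimuli.

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load one small causal LM for brevity; the study itself covers 8 models.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def surprisal(context: str, target: str) -> float:
    """Surprisal, in bits, of `target` (possibly several subword tokens)
    given the preceding `context` under the loaded model."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids[0]
    # GPT-2's BPE folds a leading space into word-initial tokens,
    # so the caller should pass the target with its leading space.
    tgt_ids = tokenizer(target, return_tensors="pt",
                        add_special_tokens=False).input_ids[0]
    input_ids = torch.cat([ctx_ids, tgt_ids]).unsqueeze(0)
    with torch.no_grad():
        log_probs = torch.log_softmax(model(input_ids).logits[0], dim=-1)
    # The distribution at position i predicts the token at position i + 1.
    total_logprob = sum(
        log_probs[len(ctx_ids) + i - 1, tok].item()
        for i, tok in enumerate(tgt_ids)
    )
    return -total_logprob / math.log(2)  # convert nats to bits


# Invented example: a predictable word, an anomalous word related to the
# expected one, and an anomalous unrelated word.
context = ("The gardener wanted the resort to feel tropical, "
           "so along the driveway he planted rows of")
for word in [" palms", " pines", " tulips"]:
    print(word.strip(), round(surprisal(context, word), 2))
```

Under this setup, lower surprisal for the related anomalous word than for the unrelated one would mirror the collateral facilitation effect the abstract describes.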
Related papers
- A Psycholinguistic Evaluation of Language Models' Sensitivity to Argument Roles [0.06554326244334868]
We evaluate large language models' sensitivity to argument roles by replicating psycholinguistic studies on human argument role processing.
We find that language models are able to distinguish verbs that appear in plausible and implausible contexts, where plausibility is determined through the relation between the verb and its preceding arguments.
This indicates that language models' capacity to detect verb plausibility does not arise from the same mechanism that underlies human real-time sentence processing.
arXiv Detail & Related papers (2024-10-21T16:05:58Z) - Humans and language models diverge when predicting repeating text [52.03471802608112]
We present a scenario in which the performance of humans and LMs diverges.
Human and GPT-2 LM predictions are strongly aligned in the first presentation of a text span, but their performance quickly diverges when memory begins to play a role.
We hope that this scenario will spur future work in bringing LMs closer to human behavior.
arXiv Detail & Related papers (2023-10-10T08:24:28Z) - Testing the Predictions of Surprisal Theory in 11 Languages [77.45204595614]
We investigate the relationship between surprisal and reading times in eleven different languages.
By focusing on a more diverse set of languages, we argue that these results offer the most robust link to date between information theory and incremental language processing across languages.
arXiv Detail & Related papers (2023-07-07T15:37:50Z) - Why can neural language models solve next-word prediction? A
mathematical perspective [53.807657273043446]
We study a class of formal languages that can be used to model real-world examples of English sentences.
Our proof highlights the different roles of the embedding layer and the fully connected component within the neural language model.
arXiv Detail & Related papers (2023-06-20T10:41:23Z) - Do large language models resemble humans in language use? [1.8524806794216748]
Large language models (LLMs) such as ChatGPT and Vicuna have shown remarkable capacities in comprehending and producing language.
We subjected ChatGPT and Vicuna to 12 experiments ranging from sounds to dialogue, preregistered and with 1000 runs (i.e., iterations) per experiment.
ChatGPT and Vicuna replicated the human pattern of language use in 10 and 7 out of the 12 experiments, respectively.
arXiv Detail & Related papers (2023-03-10T10:47:59Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Do Large Language Models know what humans know? [6.2997667081978825]
We present a linguistic version of the False Belief Task to both human participants and a Large Language Model, GPT-3.
Both are sensitive to others' beliefs, but while the language model significantly exceeds chance behavior, it does not perform as well as the humans, nor does it explain the full extent of their behavior.
This suggests that while statistical learning from language exposure may in part explain how humans develop the ability to reason about the mental states of others, other mechanisms are also responsible.
arXiv Detail & Related papers (2022-09-04T01:29:53Z) - Do language models make human-like predictions about the coreferents of
Italian anaphoric zero pronouns? [0.6091702876917281]
We test whether 12 contemporary language models display expectations that reflect human behavior when exposed to sentences with zero pronouns.
We find that three models - XGLM 2.9B, 4.5B, and 7.5B - capture the human behavior from all the experiments.
This result suggests that human expectations about coreference can be derived from exposure to language, and also indicates features of language models that allow them to better reflect human behavior.
arXiv Detail & Related papers (2022-08-30T22:06:07Z) - It's not Rocket Science : Interpreting Figurative Language in Narratives [48.84507467131819]
We study the interpretation of two types of non-compositional figurative language (idioms and similes).
Our experiments show that models based solely on pre-trained language models perform substantially worse than humans on these tasks.
We additionally propose knowledge-enhanced models, adopting human strategies for interpreting figurative language.
arXiv Detail & Related papers (2021-08-31T21:46:35Z) - PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D
World [86.21137454228848]
We factorize PIGLeT into a physical dynamics model, and a separate language model.
PIGLeT can read a sentence, simulate neurally what might happen next, and then communicate that result through a literal symbolic representation.
It is able to correctly forecast "what happens next" given an English sentence over 80% of the time, outperforming a 100x larger, text-to-text approach by over 10%.
arXiv Detail & Related papers (2021-06-01T02:32:12Z)