Language Models Explain Word Reading Times Better Than Empirical
Predictability
- URL: http://arxiv.org/abs/2202.01128v1
- Date: Wed, 2 Feb 2022 16:38:43 GMT
- Title: Language Models Explain Word Reading Times Better Than Empirical
Predictability
- Authors: Markus J. Hofmann, Steffen Remus, Chris Biemann, Ralph Radach and Lars
Kuchinke
- Abstract summary: The traditional approach in cognitive reading research assumes that word predictability from sentence context is best captured by cloze completion probability.
Probabilistic language models provide deeper explanations for syntactic and semantic effects than CCP.
N-gram and RNN probabilities of the present word more consistently predicted reading performance compared with topic models or CCP.
- Score: 20.38397241720963
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Though there is a strong consensus that word length and frequency are the
most important single-word features determining visual-orthographic access to
the mental lexicon, there is less agreement as to how best to capture syntactic
and semantic factors. The traditional approach in cognitive reading research
assumes that word predictability from sentence context is best captured by
cloze completion probability (CCP) derived from human performance data. We
review recent research suggesting that probabilistic language models provide
deeper explanations for syntactic and semantic effects than CCP. Then we
compare CCP with three types of language model. (1) Symbolic n-gram models
consolidate syntactic and semantic short-range relations by computing the
probability of a word occurring, given the two preceding words. (2) Topic
models rely on subsymbolic representations to
capture long-range semantic similarity by word co-occurrence counts in
documents. (3) In recurrent neural networks (RNNs), the subsymbolic units are
trained to predict the next word, given all preceding words in the sentence.
To examine lexical retrieval, these models were used to predict single fixation
durations and gaze durations to capture rapidly successful and standard lexical
access, and total viewing time to capture late semantic integration. The linear
item-level analyses showed greater correlations of all language models with all
eye-movement measures than CCP. Then we examined non-linear relations between
the different types of predictability and the reading times using generalized
additive models. N-gram and RNN probabilities of the present word more
consistently predicted reading performance compared with topic models or CCP.
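
The abstract describes both how the predictability measures are computed (for n-gram models, the probability of a word given its two preceding words) and how generalized additive models then relate predictability to reading times. The sketch below illustrates that pipeline in miniature; the toy corpus, the fabricated gaze durations, the add-alpha smoothing, and the choice of the pygam library are all assumptions made for this example and do not reproduce the authors' materials or code.

```python
# Illustrative sketch only (not the authors' code or data): estimate
# add-alpha smoothed trigram probabilities from a toy corpus, then relate
# their logarithms to fabricated gaze durations with a generalized
# additive model via the pygam library (pip install pygam numpy).
from collections import Counter

import numpy as np
from pygam import LinearGAM, s

# --- 1. Count-based trigram model: P(w_i | w_{i-2}, w_{i-1}) -------------
corpus = [
    "the old man read the long letter".split(),
    "the young man wrote a short letter".split(),
    "the old woman read a long book".split(),
]

bigram_counts, trigram_counts, vocab = Counter(), Counter(), set()
for sent in corpus:
    padded = ["<s>", "<s>"] + sent
    vocab.update(sent)
    for i in range(2, len(padded)):
        bigram_counts[(padded[i - 2], padded[i - 1])] += 1
        trigram_counts[(padded[i - 2], padded[i - 1], padded[i])] += 1

def trigram_prob(w1, w2, w3, alpha=0.1):
    """Add-alpha smoothed probability of w3 given the two preceding words."""
    return (trigram_counts[(w1, w2, w3)] + alpha) / (
        bigram_counts[(w1, w2)] + alpha * len(vocab)
    )

sentence = "the old man wrote a long book".split()
padded = ["<s>", "<s>"] + sentence
log_probs = np.array(
    [np.log(trigram_prob(padded[i - 2], padded[i - 1], padded[i]))
     for i in range(2, len(padded))]
)

# --- 2. GAM: possibly non-linear relation between predictability and RT --
# Fabricated gaze durations (ms), one per word, purely for illustration.
gaze_durations = np.array([210.0, 260.0, 235.0, 280.0, 225.0, 250.0, 270.0])

X = log_probs.reshape(-1, 1)                  # single smooth predictor
gam = LinearGAM(s(0, n_splines=5)).fit(X, gaze_durations)
gam.summary()                                 # smooth-term statistics
print(gam.predict(X))                         # fitted reading times per word
```

In a real analysis the word probabilities would come from models trained on large corpora and the reading times from eye-tracking data, but the non-linear relation between log probability and viewing time is inspected in the same way.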
Related papers
- Contextual Dictionary Lookup for Knowledge Graph Completion [32.493168863565465]
Knowledge graph completion (KGC) aims to solve the incompleteness of knowledge graphs (KGs) by predicting missing links from known triples.
Most existing embedding models map each relation into a unique vector, overlooking the fine-grained semantics that a relation takes on under different entities.
We present a novel method utilizing contextual dictionary lookup, enabling conventional embedding models to learn fine-grained semantics of relations in an end-to-end manner.
arXiv Detail & Related papers (2023-06-13T12:13:41Z)
- Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions [9.909170013118775]
This work presents a linear decomposition of final hidden states from autoregressive language models based on each initial input token.
Using the change in next-word probability as a measure of importance, this work first examines which context words make the biggest contribution to language model predictions.
arXiv Detail & Related papers (2023-05-17T23:55:32Z)
- Topics in the Haystack: Extracting and Evaluating Topics beyond Coherence [0.0]
We propose a method that incorporates a deeper understanding of both sentence and document themes.
This allows our model to detect latent topics that may include uncommon words or neologisms.
We present correlation coefficients with human identification of intruder words and achieve near-human level results at the word-intrusion task.
arXiv Detail & Related papers (2023-03-30T12:24:25Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show the proposed neighboring distribution divergence (NDD) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
- Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge [59.22170796793179]
Transformer language models (TLMs) were tested on a benchmark for the dynamic estimation of thematic fit.
Our results show that TLMs can reach performances that are comparable to those achieved by a structured distributional model (SDM).
However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge.
arXiv Detail & Related papers (2021-07-22T20:52:26Z)
- A comprehensive comparative evaluation and analysis of Distributional Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT.
The results show that the alleged superiority of predict-based models is more apparent than real, and surely not ubiquitous.
We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
arXiv Detail & Related papers (2021-05-20T15:18:06Z)
- Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z)
- Understanding Neural Abstractive Summarization Models via Uncertainty [54.37665950633147]
Seq2seq abstractive summarization models generate text in a free-form manner.
We study the entropy, or uncertainty, of the model's token-level predictions (a toy sketch of this measure follows the list below).
We show that uncertainty is a useful perspective for analyzing summarization and text generation models more broadly.
arXiv Detail & Related papers (2020-10-15T16:57:27Z)
- High-order Semantic Role Labeling [86.29371274587146]
This paper introduces a high-order graph structure for the neural semantic role labeling model.
It enables the model to explicitly consider not only the isolated predicate-argument pairs but also the interaction between the predicate-argument pairs.
Experimental results on 7 languages of the CoNLL-2009 benchmark show that the high-order structural learning techniques are beneficial to strong-performing SRL models.
arXiv Detail & Related papers (2020-10-09T15:33:54Z)
- Exploring BERT's Sensitivity to Lexical Cues using Tests from Semantic Priming [8.08493736237816]
We present a case study analyzing the pre-trained BERT model with tests informed by semantic priming.
We find that BERT too shows "priming," predicting a word with greater probability when the context includes a related word versus an unrelated one.
Follow-up analysis shows BERT to be increasingly distracted by related prime words as context becomes more informative.
arXiv Detail & Related papers (2020-10-06T20:30:59Z)
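
As referenced in the uncertainty entry above, the following hypothetical sketch shows one way to compute the entropy of a language model's token-level predictions. The GPT-2 model, the example sentence, and the Hugging Face transformers / PyTorch stack are assumptions for illustration; the cited paper analyzes seq2seq summarization models, and only the entropy measure itself is sketched here.

```python
# Hypothetical sketch of token-level prediction entropy (not the cited
# paper's code): GPT-2 stands in for a seq2seq summarization model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The old man read the long letter."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits            # (1, seq_len, vocab_size)

log_probs = torch.log_softmax(logits, dim=-1)  # next-token log-distributions
probs = log_probs.exp()
entropy = -(probs * log_probs).sum(dim=-1).squeeze(0)  # one value per position

# High entropy at position i means the model is uncertain about the token
# that follows position i; low entropy means a confident prediction.
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, h in zip(tokens, entropy):
    print(f"{token:>12s}  H = {h.item():.2f} nats")
```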
This list is automatically generated from the titles and abstracts of the papers on this site.