Related papers: Modeling cognitive processes of natural reading with transformer-based Language Models

URL: http://arxiv.org/abs/2505.11485v1
Date: Fri, 16 May 2025 17:47:58 GMT
Title: Modeling cognitive processes of natural reading with transformer-based Language Models
Authors: Bruno Bianchi, Fermín Travi, Juan E. Kamienkowski
Abstract summary: Previous research has shown that models such as N-grams and LSTM networks can partially account for predictability effects in explaining eye movement behaviors. In this study, we extend these findings by evaluating transformer-based models (GPT2, LLaMA-7B, and LLaMA2-7B) to further investigate this relationship. Our results indicate that these architectures outperform earlier models in explaining the variance in Gaze Durations recorded from Rioplantense Spanish readers.
Score: 2.048226951354646
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Recent advances in Natural Language Processing (NLP) have led to the development of highly sophisticated language models for text generation. In parallel, neuroscience has increasingly employed these models to explore cognitive processes involved in language comprehension. Previous research has shown that models such as N-grams and LSTM networks can partially account for predictability effects in explaining eye movement behaviors, specifically Gaze Duration, during reading. In this study, we extend these findings by evaluating transformer-based models (GPT2, LLaMA-7B, and LLaMA2-7B) to further investigate this relationship. Our results indicate that these architectures outperform earlier models in explaining the variance in Gaze Durations recorded from Rioplantense Spanish readers. However, similar to previous studies, these models still fail to account for the entirety of the variance captured by human predictability. These findings suggest that, despite their advancements, state-of-the-art language models continue to predict language in ways that differ from human readers.

Related papers

Neural Correlates of Language Models Are Specific to Human Language [0.5076419064097734]
This study tests whether previous results are robust to several possible concerns. Results confirm and strengthen the results of previous research and contribute to the debate on the biological plausibility and interpretability of state-of-the-art large language models.
arXiv Detail & Related papers (2025-10-03T16:28:31Z)
Surprisal from Larger Transformer-based Language Models Predicts fMRI Data More Poorly [9.45662351979314]
Recent work has observed a positive relationship between Transformer-based models' perplexity and the predictive power of their surprisal estimates on reading times. This study evaluates the predictive power of surprisal estimates from 17 pre-trained Transformer-based models across three different language families on brain imaging data.
arXiv Detail & Related papers (2025-06-12T22:18:48Z)
Modelando procesos cognitivos de la lectura natural con GPT-2 [0.0]
In recent years, Neuroscience has been using language models to better understand cognitive processes. In the present work, we further this line of research by using GPT-2 based models. The results show that this architecture achieves better outcomes than its predecessors.
arXiv Detail & Related papers (2024-09-30T10:34:32Z)
Investigating the Timescales of Language Processing with EEG and Language Models [0.0]
This study explores the temporal dynamics of language processing by examining the alignment between word representations from a pre-trained language model and EEG data. Using a Temporal Response Function (TRF) model, we investigate how neural activity corresponds to model representations across different layers. Our analysis reveals patterns in TRFs from distinct layers, highlighting varying contributions to lexical and compositional processing.
arXiv Detail & Related papers (2024-06-28T12:49:27Z)
Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers. We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models. Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
Humans and language models diverge when predicting repeating text [52.03471802608112]
We present a scenario in which the performance of humans and LMs diverges. Human and GPT-2 LM predictions are strongly aligned in the first presentation of a text span, but their performance quickly diverges when memory begins to play a role. We hope that this scenario will spur future work in bringing LMs closer to human behavior.
arXiv Detail & Related papers (2023-10-10T08:24:28Z)
A Survey of Large Language Models [81.06947636926638]
Language modeling has been widely studied for language understanding and generation in the past two decades. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora. To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size.
arXiv Detail & Related papers (2023-03-31T17:28:46Z)
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning [104.58874584354787]
In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning. This study aims to examine the in-context learning phenomenon through a Bayesian lens, viewing real-world LLMs as latent variable models.
arXiv Detail & Related papers (2023-01-27T18:59:01Z)
Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models. In detail, we first train neural language models with a novel dependency modeling objective. We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
Schrödinger's Tree -- On Syntax and Neural Language Models [10.296219074343785]
Language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities. We observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form. We outline the implications of the different types of research questions exhibited in studies on syntax.
arXiv Detail & Related papers (2021-10-17T18:25:23Z)
Factorized Neural Transducer for Efficient Language Model Adaptation [51.81097243306204]
We propose a novel model, factorized neural Transducer, by factorizing the blank and vocabulary prediction. It is expected that this factorization can transfer the improvement of the standalone language model to the Transducer for speech recognition. We demonstrate that the proposed factorized neural Transducer yields 15% to 20% WER improvements when out-of-domain text data is used for language model adaptation.
arXiv Detail & Related papers (2021-09-27T15:04:00Z)
The Grammar-Learning Trajectories of Neural Language Models [42.32479280480742]
We show that neural language models acquire linguistic phenomena in a similar order, despite having different end performances over the data. Results suggest that NLMs exhibit consistent "developmental" stages.
arXiv Detail & Related papers (2021-09-13T16:17:23Z)
Multi-timescale Representation Learning in LSTM Language Models [69.98840820213937]
Language models must capture statistical dependencies between words at timescales ranging from very short to very long. We derived a theory for how the memory gating mechanism in long short-term memory language models can capture power law decay. Experiments showed that LSTM language models trained on natural English text learn to approximate this theoretical distribution.
arXiv Detail & Related papers (2020-09-27T02:13:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.