On the Predictive Power of Neural Language Models for Human Real-Time
Comprehension Behavior
- URL: http://arxiv.org/abs/2006.01912v1
- Date: Tue, 2 Jun 2020 19:47:01 GMT
- Title: On the Predictive Power of Neural Language Models for Human Real-Time
Comprehension Behavior
- Authors: Ethan Gotlieb Wilcox, Jon Gauthier, Jennifer Hu, Peng Qian and Roger
Levy
- Abstract summary: We test over two dozen models on how well their next-word expectations predict human reading time on naturalistic text corpora.
We evaluate how features of these models determine their psychometric predictive power, or ability to predict human reading behavior.
For any given perplexity, deep Transformer models and n-gram models show superior psychometric predictive power over LSTM or structurally supervised neural models.
- Score: 29.260666424382446
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human reading behavior is tuned to the statistics of natural language: the
time it takes human subjects to read a word can be predicted from estimates of
the word's probability in context. However, it remains an open question what
computational architecture best characterizes the expectations deployed in real
time by humans that determine the behavioral signatures of reading. Here we
test over two dozen models, independently manipulating computational
architecture and training dataset size, on how well their next-word
expectations predict human reading time behavior on naturalistic text corpora.
We find that across model architectures and training dataset sizes the
relationship between word log-probability and reading time is (near-)linear. We
next evaluate how features of these models determine their psychometric
predictive power, or ability to predict human reading behavior. In general, the
better a model's next-word expectations, the better its psychometric predictive
power. However, we find nontrivial differences across model architectures. For
any given perplexity, deep Transformer models and n-gram models generally show
superior psychometric predictive power over LSTM or structurally supervised
neural models, especially for eye movement data. Finally, we compare models'
psychometric predictive power to the depth of their syntactic knowledge, as
measured by a battery of syntactic generalization tests developed using methods
from controlled psycholinguistic experiments. Once perplexity is controlled
for, we find no significant relationship between syntactic knowledge and
predictive power. These results suggest that different approaches may be
required to best model human real-time language comprehension behavior in
naturalistic reading versus behavior for controlled linguistic materials
designed for targeted probing of syntactic knowledge.
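The core analysis is straightforward to sketch. Below is a minimal illustration, assuming a HuggingFace GPT-2 model as the language model; the reading times here are random placeholders, whereas the paper evaluates many architectures against eye-tracking and self-paced reading corpora.

```python
# Minimal sketch of the surprisal-to-reading-time analysis (not the
# authors' code): per-token surprisal from a pretrained LM, then a
# linear fit of reading times on surprisal. Word-level surprisal would
# sum over subword tokens; the RTs below are random placeholders.
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def token_surprisals(text):
    """Surprisal -log2 p(token | context) for each token after the first."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logp = torch.log_softmax(logits[0, :-1], dim=-1)
    picked = logp[torch.arange(ids.size(1) - 1), ids[0, 1:]]
    return (-picked / torch.log(torch.tensor(2.0))).numpy()

s = token_surprisals("The horse raced past the barn fell.")
rt = np.random.default_rng(0).normal(300.0, 30.0, len(s))  # placeholder RTs (ms)

# Simple linear fit of reading time on surprisal (intercept + slope).
X = np.column_stack([np.ones_like(s), s])
beta, *_ = np.linalg.lstsq(X, rt, rcond=None)
print("intercept (ms), ms-per-bit slope:", beta)
```

In the paper, psychometric predictive power is measured as the improvement a model's surprisal estimates bring to such a regression over baseline predictors like word length and frequency.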
Related papers
- Reverse-Engineering the Reader [43.26660964074272]
We introduce a novel alignment technique in which we fine-tune a language model to implicitly optimize the parameters of a linear regressor.
Using words as a test case, we evaluate our technique across multiple model sizes and datasets.
We find an inverse relationship between psychometric power and a model's performance on downstream NLP tasks as well as its perplexity on held-out test data.
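One way to read "fine-tune a language model to implicitly optimize the parameters of a linear regressor" is to solve the regressor in closed form inside the loss and backpropagate through that solution. The sketch below is only that reading, not the paper's implementation; the surprisal tensor stands in for differentiable language model outputs.

```python
# Illustrative only: differentiate through a closed-form least-squares
# fit of reading times on surprisal, so fine-tuning the LM improves the
# regressor without optimizing its parameters explicitly.
import torch

def implicit_regressor_loss(surprisal, rt):
    """surprisal: (N,) differentiable LM outputs; rt: (N,) reading times."""
    X = torch.stack([torch.ones_like(surprisal), surprisal], dim=1)
    # Normal-equation solve keeps the fit differentiable in `surprisal`.
    beta = torch.linalg.solve(X.T @ X, X.T @ rt)
    return (rt - X @ beta).pow(2).mean()  # regressor's MSE as the loss

surprisal = torch.randn(32, requires_grad=True)  # stands in for LM outputs
rt = 300.0 + 25.0 * surprisal.detach() + 10.0 * torch.randn(32)
loss = implicit_regressor_loss(surprisal, rt)
loss.backward()  # gradients flow back into the LM's surprisal estimates
print("gradient norm w.r.t. surprisal:", surprisal.grad.norm())
```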
arXiv Detail & Related papers (2024-10-16T23:05:01Z)
- Beyond Text: Leveraging Multi-Task Learning and Cognitive Appraisal Theory for Post-Purchase Intention Analysis [10.014248704653]
This study evaluates multi-task learning frameworks grounded in Cognitive Appraisal Theory to predict user behavior.
Our experiments show that users' language and traits improve predictions above and beyond models predicting only from text.
arXiv Detail & Related papers (2024-07-11T04:57:52Z)
- Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively with transformers on small to medium sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z)
- Humans and language models diverge when predicting repeating text [52.03471802608112]
We present a scenario in which the performance of humans and LMs diverges.
Human and GPT-2 LM predictions are strongly aligned in the first presentation of a text span, but their performance quickly diverges when memory begins to play a role.
We hope that this scenario will spur future work in bringing LMs closer to human behavior.
arXiv Detail & Related papers (2023-10-10T08:24:28Z)
- Meta predictive learning model of languages in neural circuits [2.5690340428649328]
We propose a mean-field learning model within the predictive coding framework.
Our model reveals that most of the connections become deterministic after learning.
Our model provides a starting point to investigate the connection among brain computation, next-token prediction and general intelligence.
arXiv Detail & Related papers (2023-09-08T03:58:05Z)
- Transformer-Based Language Model Surprisal Predicts Human Reading Times Best with About Two Billion Training Tokens [17.80735287413141]
We evaluate surprisal estimates from Transformer-based language model variants on their ability to predict human reading times.
Results show that surprisal estimates from most variants with contemporary model capacities provide the best fit after seeing about two billion training tokens.
Newly-trained smaller model variants reveal a 'tipping point' at convergence, after which the decrease in language model perplexity begins to result in poorer fits to human reading times.
arXiv Detail & Related papers (2023-04-22T12:50:49Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
Specifically, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
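On one reading of this summary, the final next-token distribution is an attention-weighted mixture of the distributions proposed for each previous position. A minimal sketch under that assumption (not the paper's formulation):

```python
# Hypothetical illustration: next-token probability as a self-attention
# weighted mixture of per-position dependency distributions.
import numpy as np

def mixed_next_token(dep_dists, attn):
    """dep_dists: (T, V) next-token distribution proposed at each of T
    previous positions; attn: (T,) attention weights, summing to 1."""
    assert np.isclose(attn.sum(), 1.0)
    return attn @ dep_dists  # (V,) mixture distribution

rng = np.random.default_rng(0)
d = rng.dirichlet(np.ones(50), size=4)  # 4 positions, 50-word vocab
w = rng.dirichlet(np.ones(4))           # attention over positions
p = mixed_next_token(d, w)
assert np.isclose(p.sum(), 1.0)         # still a valid distribution
```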
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
- The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
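As a rough illustration of this idea, here is a generic predictive-coding update (not the paper's NGC architecture): a latent layer predicts the activity of the layer below, and the local prediction error drives both state and weight updates.

```python
# Generic predictive-coding loop (illustration only): the top layer
# predicts bottom-layer activity, and the prediction error drives
# local updates to both the latent state and the weights.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 16))  # top-down prediction weights
z = rng.normal(size=8)                   # latent state (top layer)
x = rng.normal(size=16)                  # observed activity (bottom layer)

lr = 0.01
for _ in range(1000):
    err = x - W.T @ z            # local prediction error
    z += lr * (W @ err)          # state update from error feedback
    W += lr * np.outer(z, err)   # Hebbian-like weight update
print("residual error:", np.linalg.norm(x - W.T @ z))
```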
arXiv Detail & Related papers (2020-12-07T01:20:38Z)
- Multi-timescale Representation Learning in LSTM Language Models [69.98840820213937]
Language models must capture statistical dependencies between words at timescales ranging from very short to very long.
We derived a theory for how the memory gating mechanism in long short-term memory language models can capture power law decay.
Experiments showed that LSTM language models trained on natural English text learn to approximate this theoretical distribution.
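The gating-timescale link is direct: a forget-gate value f makes cell memory decay as f^t, giving a characteristic timescale of -1/ln(f), so a spread of gate values across units yields a sum of exponentials that can approximate a power law. A small illustration with made-up gate values:

```python
# Illustration of the gating/timescale link described above: memory
# held with forget-gate value f decays exponentially with timescale
# T = -1 / ln(f); mixing units with different f values approximates
# slower-than-exponential (power-law-like) decay.
import numpy as np

def timescale(f):
    return -1.0 / np.log(f)

gates = np.array([0.5, 0.9, 0.99, 0.999])  # hypothetical forget-gate values
print({float(f): round(timescale(f), 1) for f in gates})  # ~1.4, 9.5, 99.5, 999.5

t = np.arange(1, 1000)
mixture = np.mean([f ** t for f in gates], axis=0)  # sum of exponentials
print("mixture decay at t=100 and t=500:", mixture[99], mixture[499])
```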
arXiv Detail & Related papers (2020-09-27T02:13:38Z)
- Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling [0.8668211481067458]
We re-evaluate a claim due to Goodkind and Bicknell that a language model's ability to model reading times is a linear function of its perplexity.
We show that the proposed relation does not always hold for Long Short-Term Memory networks, Transformers, and pre-trained models.
arXiv Detail & Related papers (2020-09-08T19:12:06Z)