Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences
- URL: http://arxiv.org/abs/2406.04988v2
- Date: Fri, 2 Aug 2024 11:49:54 GMT
- Title: Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences
- Authors: Patrick Haller, Lena S. Bolliger, Lena A. Jäger
- Abstract summary: We revisit the predictive power of surprisal and entropy measures estimated from a range of language models (LMs) on human reading time data.
We investigate whether modulating surprisal and entropy relative to cognitive scores increases the accuracy of reading time predictions.
Our study finds that in most cases, incorporating cognitive capacities increases the predictive power of surprisal and entropy on reading times.
- Score: 1.942809872918085
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To date, most investigations of surprisal and entropy effects in reading have been conducted at the group level, disregarding individual differences. In this work, we revisit the predictive power of surprisal and entropy measures estimated from a range of language models (LMs) on human reading times, a measure of processing effort, by incorporating information about language users' cognitive capacities. To do so, we assess the predictive power of surprisal and entropy estimated from generative LMs on reading data obtained from individuals who also completed a wide range of psychometric tests. Specifically, we investigate whether modulating surprisal and entropy relative to cognitive scores increases the accuracy of reading time predictions, and we examine whether LMs exhibit systematic biases in the prediction of reading times for cognitively high- or low-performing groups, revealing what type of psycholinguistic subject a given LM emulates. Our study finds that in most cases, incorporating cognitive capacities increases the predictive power of surprisal and entropy on reading times, and that, generally, high performance in the psychometric tests is associated with lower sensitivity to predictability effects. Finally, our results suggest that the analyzed LMs emulate readers with lower verbal intelligence, and hence that, for a given target group (i.e., individuals with high verbal intelligence), these LMs provide less accurate predictability estimates.
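For concreteness: surprisal is s(w_t) = -log2 p(w_t | w_<t), and contextual entropy is H_t = -Σ_w p(w | w_<t) log2 p(w | w_<t). The sketch below shows how such per-word estimates are typically extracted from a generative LM. It is not the authors' code; GPT-2 via Hugging Face transformers is an assumption for illustration, whereas the paper evaluates a range of LMs.

```python
# Illustrative sketch (not the paper's code): per-token surprisal and
# contextual entropy from a causal LM. GPT-2 is an assumed stand-in for
# the range of generative LMs analyzed in the paper.
import math

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

LN2 = math.log(2)  # convert nats to bits

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def surprisal_and_entropy(text: str):
    """Return (token, surprisal, entropy) triples, both measures in bits."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape: (1, seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)
    rows = []
    for t in range(1, ids.size(1)):
        dist = log_probs[0, t - 1]  # log p(. | w_<t), in nats
        token_id = ids[0, t].item()
        # Surprisal of the token that actually occurred.
        surprisal = -dist[token_id].item() / LN2
        # Entropy of the full next-word distribution at this position.
        entropy = -(dist.exp() * dist).sum().item() / LN2
        rows.append((tokenizer.decode([token_id]), surprisal, entropy))
    return rows

for tok, s, h in surprisal_and_entropy("The cat sat on the mat."):
    print(f"{tok!r}  surprisal={s:.2f} bits  entropy={h:.2f} bits")
```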
Related papers
- Beyond Text: Leveraging Multi-Task Learning and Cognitive Appraisal Theory for Post-Purchase Intention Analysis [10.014248704653]
This study evaluates multi-task learning frameworks grounded in Cognitive Appraisal Theory to predict user behavior.
Our experiments show that users' language and traits improve predictions above and beyond models predicting only from text.
arXiv Detail & Related papers (2024-07-11T04:57:52Z)
- Connected Speech-Based Cognitive Assessment in Chinese and English [10.205946648609752]
We present a novel benchmark dataset and prediction tasks for investigating approaches to assess cognitive function through analysis of connected speech.
The dataset consists of speech samples and clinical information for speakers of Mandarin Chinese and English with different levels of cognitive impairment.
The prediction tasks encompass mild cognitive impairment diagnosis and cognitive test score prediction.
arXiv Detail & Related papers (2024-06-11T19:04:29Z)
- LLM vs Small Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model [58.887561071010985]
Personality detection aims to detect one's personality traits underlying social media posts.
Most existing methods learn post features directly by fine-tuning the pre-trained language models.
We propose a large language model (LLM) based text augmentation enhanced personality detection model.
arXiv Detail & Related papers (2024-03-12T12:10:18Z)
- PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner.
Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z)
- Automatically measuring speech fluency in people with aphasia: first achievements using read-speech data [55.84746218227712]
This study aims at assessing the relevance of a signal-processing algorithm, initially developed in the field of language acquisition, for the automatic measurement of speech fluency.
arXiv Detail & Related papers (2023-08-09T07:51:40Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- On the Effect of Anticipation on Reading Times [84.27103313675342]
We operationalize anticipation as a word's contextual entropy.
We find substantial evidence for effects of contextual entropy, over and above surprisal, on a word's reading time.
arXiv Detail & Related papers (2022-11-25T18:58:23Z)
- Predicting Human Psychometric Properties Using Computational Language Models [5.806723407090421]
Transformer-based language models (LMs) continue to achieve state-of-the-art performance on natural language processing (NLP) benchmarks.
Can LMs be of use in predicting the psychometric properties of test items, when those items are given to human participants?
We gather responses from numerous human participants and LMs on a broad diagnostic test of linguistic competencies.
We then calculate standard psychometric properties of the items in the diagnostic test, using the human responses and the LM responses separately.
arXiv Detail & Related papers (2022-05-12T16:40:12Z)
- Evaluating Distributional Distortion in Neural Language Modeling [81.83408583979745]
A heavy tail of rare events accounts for a significant amount of the total probability mass of distributions in language.
Standard language modeling metrics such as perplexity quantify the performance of language models (LMs) in aggregate.
We develop a controlled evaluation scheme which uses generative models trained on natural data as artificial languages.
arXiv Detail & Related papers (2022-03-24T01:09:46Z)
- Can Transformer Language Models Predict Psychometric Properties? [0.0]
Transformer-based language models (LMs) continue to advance state-of-the-art performance on NLP benchmark tasks.
Can LMs be of use in predicting what the psychometric properties of test items will be when those items are given to human participants?
We gather responses from numerous human participants and LMs on a broad diagnostic test of linguistic competencies.
arXiv Detail & Related papers (2021-06-12T20:05:33Z)
- On the Predictive Power of Neural Language Models for Human Real-Time Comprehension Behavior [29.260666424382446]
We test over two dozen models on how well their next-word expectations predict human reading time on naturalistic text corpora.
We evaluate how features of these models determine their psychometric predictive power, or ability to predict human reading behavior.
For any given perplexity, deep Transformer models and n-gram models show superior psychometric predictive power over LSTM or structurally supervised neural models.
arXiv Detail & Related papers (2020-06-02T19:47:01Z)
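Several of the papers above, like the main paper, quantify psychometric predictive power as the improvement a predictability measure yields in a reading time regression. Below is a minimal sketch of that comparison, together with the cognitive-score modulation described in the main abstract. The CSV file and column names are hypothetical, and plain OLS stands in for the (generalized) mixed-effects models typically used in this literature; this is not the authors' code.

```python
# Sketch: psychometric predictive power of surprisal, and its modulation by a
# cognitive score via an interaction term. All data and names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical per-word data: reading time (rt) plus standard covariates.
df = pd.read_csv("reading_times.csv")  # columns: rt, length, log_freq, surprisal, cog_score

baseline = smf.ols("rt ~ length + log_freq", data=df).fit()
with_surprisal = smf.ols("rt ~ length + log_freq + surprisal", data=df).fit()

# Predictive power of surprisal: log-likelihood gained over the baseline.
delta_llh = with_surprisal.llf - baseline.llf
print(f"Delta log-likelihood from surprisal: {delta_llh:.1f}")

# Modulation by a cognitive score: the interaction term lets the strength of
# the predictability effect vary with the reader's psychometric test score.
modulated = smf.ols("rt ~ length + log_freq + surprisal * cog_score", data=df).fit()
print(modulated.params.filter(like="surprisal"))
```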