Large Language Models Are Human-Like Internally
- URL: http://arxiv.org/abs/2502.01615v1
- Date: Mon, 03 Feb 2025 18:48:32 GMT
- Title: Large Language Models Are Human-Like Internally
- Authors: Tatsuki Kuribayashi, Yohei Oseki, Souhaib Ben Taieb, Kentaro Inui, Timothy Baldwin
- Abstract summary: Recent cognitive modeling studies have reported that larger language models (LMs) exhibit a poorer fit to human reading behavior.
We argue that prior conclusions were skewed by an exclusive focus on the final layers of LMs.
Our analysis reveals that next-word probabilities derived from internal layers of larger LMs align with human sentence processing data as well as, or better than, those from smaller LMs.
- Score: 44.996518290660816
- Abstract: Recent cognitive modeling studies have reported that larger language models (LMs) exhibit a poorer fit to human reading behavior, leading to claims of their cognitive implausibility. In this paper, we revisit this argument through the lens of mechanistic interpretability and argue that prior conclusions were skewed by an exclusive focus on the final layers of LMs. Our analysis reveals that next-word probabilities derived from internal layers of larger LMs align with human sentence processing data as well as, or better than, those from smaller LMs. This alignment holds consistently across behavioral (self-paced reading times, gaze durations, MAZE task processing times) and neurophysiological (N400 brain potentials) measures, challenging earlier mixed results and suggesting that the cognitive plausibility of larger LMs has been underestimated. Furthermore, we are the first to identify an intriguing relationship between LM layers and human measures: earlier layers correspond more closely with fast gaze durations, while later layers better align with relatively slower signals such as N400 potentials and MAZE processing times. Our work opens new avenues for interdisciplinary research at the intersection of mechanistic interpretability and cognitive modeling.
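The core technique, reading next-word distributions off an LM's internal layers and scoring them against human measures via per-word surprisal, can be sketched briefly. The snippet below is a minimal illustration assuming the Hugging Face transformers API and a logit-lens-style projection (reusing the model's final layer norm and unembedding matrix), one common way to obtain intermediate-layer next-word probabilities. It is not the authors' implementation; the model (gpt2) and the layer indices are illustrative assumptions.

```python
# Minimal, illustrative sketch (not the authors' released code): estimate per-word
# surprisal from an intermediate layer of a causal LM via a logit-lens-style
# projection, i.e. reusing the model's final layer norm and unembedding matrix.
# The model choice (gpt2), the layer indices, and the logit-lens projection itself
# are assumptions for illustration only.
import math

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")


def layerwise_surprisal(text: str, layer: int) -> list[tuple[str, float]]:
    """Return (token, surprisal in bits) pairs using hidden states after `layer` blocks."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)
    # hidden_states[0] is the embedding output; hidden_states[k] follows block k.
    # Note: hidden_states[-1] already has the final layer norm applied, so this
    # sketch is intended for intermediate layers.
    h = out.hidden_states[layer]
    logits = model.lm_head(model.transformer.ln_f(h))  # logit-lens readout
    log_probs = torch.log_softmax(logits, dim=-1)
    pairs = []
    for pos in range(ids.shape[1] - 1):
        target = ids[0, pos + 1]
        nats = -log_probs[0, pos, target].item()
        pairs.append((tokenizer.decode(target), nats / math.log(2)))  # nats -> bits
    return pairs


# Garden-path example; the paper reports earlier layers tracking fast measures
# (gaze durations) and later layers slower ones (N400 potentials, MAZE times).
for layer in (4, 9):
    print(layer, layerwise_surprisal("The old man the boats.", layer)[:3])
```

In a cognitive-modeling pipeline, per-word surprisals like these would typically be aligned to word-level human measures and entered into a regression (for example, a linear mixed-effects model) to quantify how well each layer's probabilities predict reading times or N400 amplitudes.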
Related papers
- MVICAD2: Multi-View Independent Component Analysis with Delays and Dilations [61.59658203704757]
We propose Multi-View Independent Component Analysis with Delays and Dilations (MVICAD2), which allows sources to differ across subjects in both temporal delays and dilations.
We present a model with identifiable sources, derive an approximation of its likelihood in closed form, and use regularization and optimization techniques to enhance performance.
arXiv Detail & Related papers (2025-01-13T15:47:02Z)
- Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors [74.04775677110179]
In-context Learning (ICL) has become the primary method for performing natural language tasks with Large Language Models (LLMs).
In this work, we examine whether this is the result of the aggregation used in corresponding datasets, where trying to combine low-agreement, disparate annotations might lead to annotation artifacts that create detrimental noise in the prompt.
Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and advocate focusing on modeling individuals instead.
arXiv Detail & Related papers (2024-10-17T17:16:00Z)
- Lost in Translation: The Algorithmic Gap Between LMs and the Brain [8.799971499357499]
Language Models (LMs) have achieved impressive performance on various linguistic tasks, but their relationship to human language processing in the brain remains unclear.
This paper examines the gaps and overlaps between LMs and the brain at different levels of analysis.
We discuss how insights from neuroscience, such as sparsity, modularity, internal states, and interactive learning, can inform the development of more biologically plausible language models.
arXiv Detail & Related papers (2024-07-05T17:43:16Z)
- The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition [74.04775677110179]
In-context Learning (ICL) has emerged as a powerful paradigm for performing natural language tasks with Large Language Models (LLMs).
We show that LLMs have strong yet inconsistent priors in emotion recognition that ossify their predictions.
Our results suggest that caution is needed when using ICL with larger LLMs for affect-centered tasks outside their pre-training domain.
arXiv Detail & Related papers (2024-03-25T19:07:32Z)
- Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain [12.92793034617015]
We show that as large language models (LLMs) achieve higher performance on benchmark tasks, they become more brain-like.
We also show the importance of contextual information in improving model performance and brain similarity.
arXiv Detail & Related papers (2024-01-31T08:48:35Z)
- Large GPT-like Models are Bad Babies: A Closer Look at the Relationship between Linguistic Competence and Psycholinguistic Measures [25.210837736795565]
We train a series of GPT-like language models of different sizes on the strict version of the BabyLM pretraining corpus.
We find a positive correlation between LM size and performance on all three challenge tasks, with different preferences for model width and depth in each of the tasks.
This suggests that modelling processing effort and linguistic competence may require an approach different from training GPT-like LMs on a developmentally plausible corpus.
arXiv Detail & Related papers (2023-11-08T09:26:27Z)
- Probing Large Language Models from A Human Behavioral Perspective [24.109080140701188]
Large Language Models (LLMs) have emerged as dominant foundational models in modern NLP.
However, their prediction processes and internal mechanisms, such as feed-forward networks (FFN) and multi-head self-attention (MHSA), remain largely unexplored.
arXiv Detail & Related papers (2023-10-08T16:16:21Z)
- Unveiling Theory of Mind in Large Language Models: A Parallel to Single Neurons in the Human Brain [2.5350521110810056]
Large language models (LLMs) have been found to exhibit a certain level of Theory of Mind (ToM).
The precise processes underlying LLMs' capacity for ToM, and their similarity to those of humans, remain largely unknown.
arXiv Detail & Related papers (2023-09-04T15:26:15Z)
- An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning [70.48605869773814]
Catastrophic forgetting (CF) is a phenomenon that occurs in machine learning when a model forgets previously learned information.
This study empirically evaluates the forgetting phenomenon in large language models during continual instruction tuning.
arXiv Detail & Related papers (2023-08-17T02:53:23Z)
- Multilingual Multi-Aspect Explainability Analyses on Machine Reading Comprehension Models [76.48370548802464]
This paper conducts a series of analytical experiments to examine the relationship between multi-head self-attention and final MRC system performance.
We discover that passage-to-question and passage understanding attentions are the most important ones in the question answering process.
Through comprehensive visualizations and case studies, we also observe several general findings on the attention maps, which can be helpful to understand how these models solve the questions.
arXiv Detail & Related papers (2021-08-26T04:23:57Z)