Do Large Language Models Mirror Cognitive Language Processing?
- URL: http://arxiv.org/abs/2402.18023v2
- Date: Tue, 28 May 2024 05:51:15 GMT
- Title: Do Large Language Models Mirror Cognitive Language Processing?
- Authors: Yuqi Ren, Renren Jin, Tongxuan Zhang, Deyi Xiong
- Abstract summary: Large Language Models (LLMs) have demonstrated remarkable abilities in text comprehension and logical reasoning.
In cognitive science, brain cognitive processing signals are typically utilized to study human language processing.
We employ Representational Similarity Analysis (RSA) to measure the alignment between 23 mainstream LLMs and fMRI signals of the brain.
- Score: 43.68923267228057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have demonstrated remarkable abilities in text comprehension and logical reasoning, indicating that the text representations learned by LLMs can facilitate their language processing capabilities. In cognitive science, brain cognitive processing signals are typically utilized to study human language processing. Therefore, it is natural to ask how well the text embeddings from LLMs align with brain cognitive processing signals, and how training strategies affect LLM-brain alignment. In this paper, we employ Representational Similarity Analysis (RSA) to measure the alignment between 23 mainstream LLMs and fMRI signals of the brain to evaluate how effectively LLMs simulate cognitive language processing. We empirically investigate the impact of various factors (e.g., pre-training data size, model scaling, alignment training, and prompts) on such LLM-brain alignment. Experimental results indicate that pre-training data size and model scaling are positively correlated with LLM-brain similarity, and that alignment training can significantly improve LLM-brain similarity. Explicit prompts contribute to the consistency of LLMs with brain cognitive language processing, while nonsensical noisy prompts may attenuate such alignment. Additionally, LLM performance on a wide range of evaluations (e.g., MMLU, Chatbot Arena) is highly correlated with LLM-brain similarity.
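To make the evaluation concrete, the sketch below shows one way an RSA-style LLM-brain alignment score can be computed: build a representational dissimilarity matrix (RDM) from an LLM's sentence embeddings and another from the fMRI responses to the same stimuli, then correlate the two RDMs. The 1 − Pearson dissimilarity, the Spearman comparison, and the function names are illustrative assumptions, not necessarily the paper's exact configuration.

```python
# Minimal RSA sketch (assumed details, not the authors' exact pipeline).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(features: np.ndarray) -> np.ndarray:
    """Representational dissimilarity matrix as a condensed vector:
    pairwise 1 - Pearson correlation between stimulus representations
    (features has shape n_stimuli x n_features)."""
    return pdist(features, metric="correlation")

def llm_brain_similarity(llm_embeddings: np.ndarray, fmri_voxels: np.ndarray) -> float:
    """Spearman correlation between the two RDMs; higher = closer alignment.
    llm_embeddings: (n_stimuli, hidden_dim) sentence embeddings from one LLM.
    fmri_voxels:    (n_stimuli, n_voxels) fMRI patterns for the same stimuli."""
    rho, _ = spearmanr(rdm(llm_embeddings), rdm(fmri_voxels))
    return float(rho)
```

In a setup like this, comparing the resulting similarity score across the 23 LLMs (or across checkpoints differing in pre-training data size, scale, alignment training, or prompting) is what enables the factor-wise analysis described above.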
Related papers
- Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show that a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability.
The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts.
As the knowledge learned by the resulting LLM-based Symbolic Programs (LSPs) is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and to other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
- Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network [16.317199232071232]
Large Language Models (LLMs) have been shown to be effective models of the human language system.
In this work, we investigate the key architectural components driving the surprising alignment of untrained models.
arXiv Detail & Related papers (2024-06-21T12:54:03Z)
- What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores [1.8175282137722093]
Internal representations from large language models (LLMs) achieve state-of-the-art brain scores, leading to speculation that they share computational principles with human language processing.
Here, we analyze three neural datasets used in an impactful study on LLM-to-brain mappings, with a particular focus on an fMRI dataset where participants read short passages.
We find that brain scores of trained LLMs on this dataset can largely be explained by sentence length, position, and pronoun-dereferenced static word embeddings.
arXiv Detail & Related papers (2024-06-03T17:13:27Z)
- Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL [78.80673954827773]
Large Language Models (LLMs) play a crucial role in capturing structured semantics to enhance language understanding, improve interpretability, and reduce bias.
We propose using Semantic Role Labeling (SRL) as a fundamental task to explore LLMs' ability to extract structured semantics.
We find interesting potential: LLMs can indeed capture semantic structures, although scaling up does not always improve this ability.
We are also surprised to find significant overlap between the errors made by LLMs and those made by untrained humans, accounting for almost 30% of all errors.
arXiv Detail & Related papers (2024-05-10T11:44:05Z)
- Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated by large language models (LLMs).
We suggest investigating internal activations and quantifying an LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
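Relating to the local-intrinsic-dimension idea in the entry above, here is a minimal sketch of estimating the LID of a set of LLM activation vectors with the Levina-Bickel maximum-likelihood nearest-neighbour estimator; the estimator choice, the value of k, and the function name are assumptions for illustration rather than that paper's exact method.

```python
# Minimal LID sketch: Levina-Bickel MLE over k nearest neighbours (assumed choice).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_intrinsic_dimension(activations: np.ndarray, k: int = 20) -> np.ndarray:
    """activations: (n_samples, hidden_dim) hidden states from one LLM layer.
    Returns a per-sample LID estimate."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(activations)
    dist, _ = nn.kneighbors(activations)   # column 0 is the point itself
    dist = dist[:, 1:]                     # keep the k true neighbours
    # MLE: lid(x) = -(k - 1) / sum_{j<k} log(d_j(x) / d_k(x))
    log_ratio = np.log(dist[:, :-1] / dist[:, -1:])
    return -(k - 1) / log_ratio.sum(axis=1)
```

A truthfulness probe in this spirit would then compare LID statistics of activations for truthful versus hallucinated generations.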
- Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain [12.92793034617015]
We show that as large language models (LLMs) achieve higher performance on benchmark tasks, they become more brain-like.
We also show the importance of contextual information in improving model performance and brain similarity.
arXiv Detail & Related papers (2024-01-31T08:48:35Z)
- Do LLMs Dream of Ontologies? [15.049502693786698]
Large language models (LLMs) have recently revolutionized automated text understanding and generation.
This paper investigates whether and to what extent general-purpose pre-trained LLMs have memorized information from known ontologies.
arXiv Detail & Related papers (2024-01-26T15:10:23Z)
- Instruction-tuning Aligns LLMs to the Human Brain [19.450164922129723]
We investigate the effect of instruction-tuning on aligning large language models and human language processing mechanisms.
We find that instruction-tuning generally enhances brain alignment, but has no similar effect on behavioral alignment.
Our results suggest that the mechanisms that encode world knowledge in LLMs also improve representational alignment to the human brain.
arXiv Detail & Related papers (2023-12-01T13:31:02Z)
- Divergences between Language Models and Human Brains [63.405788999891335]
Recent research has hinted that brain signals can be effectively predicted using the internal representations of language models (LMs).
We show that there are clear differences in how LMs and humans represent and use language.
We identify two domains that are not captured well by LMs: social/emotional intelligence and physical commonsense.
arXiv Detail & Related papers (2023-11-15T19:02:40Z)
- ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT [72.83383437501577]
Large language models (LLMs) have recently demonstrated significant mathematical abilities.
However, LLMs still have difficulty bridging perception, language understanding, and reasoning capabilities.
This paper presents a novel method for integrating LLMs into the abductive learning framework.
arXiv Detail & Related papers (2023-04-21T16:23:47Z)