Related papers: Explanations of Large Language Models Explain Language Representations in the Brain

URL: http://arxiv.org/abs/2502.14671v3
Date: Thu, 03 Apr 2025 21:56:08 GMT
Title: Explanations of Large Language Models Explain Language Representations in the Brain
Authors: Maryam Rahimi, Yadollah Yaghoobzadeh, Mohammad Reza Daliri
Abstract summary: We propose a novel approach using explainable AI (XAI) to strengthen the link between LLMs and brain activity during language processing. Applying attribution methods, we quantify the influence of preceding words on LLMs' next-word predictions. We find that layers with higher attribution scores show stronger brain alignment, and we suggest brain alignment as a way to assess the biological plausibility of explanation methods.
Score: 5.7916055414970895
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large language models (LLMs) not only exhibit human-like performance but also share computational principles with the brain's language processing mechanisms. While prior research has focused on mapping LLMs' internal representations to neural activity, we propose a novel approach using explainable AI (XAI) to strengthen this link. Applying attribution methods, we quantify the influence of preceding words on LLMs' next-word predictions and use these explanations to predict fMRI data from participants listening to narratives. We find that attribution methods robustly predict brain activity across the language network, revealing a hierarchical pattern: explanations from early layers align with the brain's initial language processing stages, while later layers correspond to more advanced stages. Additionally, layers with greater influence on next-word prediction, reflected in higher attribution scores, demonstrate stronger brain alignment. These results underscore XAI's potential for exploring the neural basis of language and suggest brain alignment for assessing the biological plausibility of explanation methods.
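Illustrative sketch (not from the paper): the abstract describes using attribution scores over preceding words as features to predict fMRI responses. The snippet below shows one common way such a prediction step could be set up, a ridge-regression encoding model scored by per-voxel Pearson correlation. The choice of ridge regression, the function names `ridge_fit` and `columnwise_pearson`, the array sizes, and the synthetic data are all assumptions made for illustration; the paper's actual attribution methods, features, and fitting procedure are not specified in this listing.

```python
import numpy as np

def ridge_fit(X, Y, alpha):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def columnwise_pearson(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom

rng = np.random.default_rng(0)
n_train, n_test, n_context, n_voxels = 200, 50, 10, 5

# Hypothetical stand-in for attribution features: one score per preceding
# word (n_context of them) for each predicted word. Real features would
# come from an attribution method applied to an LLM.
X = rng.normal(size=(n_train + n_test, n_context))

# Synthetic "voxel" responses: a linear function of the features plus noise.
true_W = rng.normal(size=(n_context, n_voxels))
Y = X @ true_W + 0.5 * rng.normal(size=(n_train + n_test, n_voxels))

# Fit on the training split, then score held-out predictions per voxel.
W = ridge_fit(X[:n_train], Y[:n_train], alpha=1.0)
scores = columnwise_pearson(X[n_train:] @ W, Y[n_train:])
print(scores.shape)  # one alignment score per voxel
```

In this toy setup the held-out correlation plays the role of a "brain alignment" score; with real data, the regularization strength would normally be chosen by cross-validation and the features aligned to the fMRI sampling rate.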

Related papers

Do Large Language Models Think Like the Brain? Sentence-Level Evidence from fMRI and Hierarchical Embeddings [28.210559128941593]
This study investigates how hierarchical representations in large language models align with the dynamic neural responses during human sentence comprehension. Results show that improvements in model performance drive the evolution of representational architectures toward brain-like hierarchies.
arXiv Detail & Related papers (2025-05-28T16:40:06Z)
Generative causal testing to bridge data-driven models and scientific theories in language neuroscience [82.995061475971]
We present generative causal testing (GCT), a framework for generating concise explanations of language selectivity in the brain. We show that GCT can dissect fine-grained differences between brain areas with similar functional selectivity.
arXiv Detail & Related papers (2024-10-01T15:57:48Z)
Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network [16.317199232071232]
Large Language Models (LLMs) have been shown to be effective models of the human language system. In this work, we investigate the key architectural components driving the surprising alignment of untrained models.
arXiv Detail & Related papers (2024-06-21T12:54:03Z)
What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores [1.8175282137722093]
Internal representations from large language models (LLMs) achieve state-of-the-art brain scores, leading to speculation that they share computational principles with human language processing. Here, we analyze three neural datasets used in an impactful study on LLM-to-brain mappings, with a particular focus on an fMRI dataset where participants read short passages. We find that brain scores of trained LLMs on this dataset can largely be explained by sentence length, position, and pronoun-dereferenced static word embeddings.
arXiv Detail & Related papers (2024-06-03T17:13:27Z)
Language Reconstruction with Brain Predictive Coding from fMRI Data [28.217967547268216]
The theory of predictive coding suggests that the human brain naturally engages in continuously predicting future word representations. PredFT achieves current state-of-the-art decoding performance with a maximum BLEU-1 score of 27.8%.
arXiv Detail & Related papers (2024-05-19T16:06:02Z)
Language Generation from Brain Recordings [68.97414452707103]
We propose a generative language BCI that utilizes the capacity of a large language model and a semantic brain decoder. The proposed model can generate coherent language sequences aligned with the semantic content of visual or auditory language stimuli. Our findings demonstrate the potential and feasibility of employing BCIs in direct language generation.
arXiv Detail & Related papers (2023-11-16T13:37:21Z)
Divergences between Language Models and Human Brains [59.100552839650774]
We systematically explore the divergences between human and machine language processing. We identify two domains that LMs do not capture well: social/emotional intelligence and physical commonsense. Our results show that fine-tuning LMs on these domains can improve their alignment with human brain responses.
arXiv Detail & Related papers (2023-11-15T19:02:40Z)
Deep Learning Models to Study Sentence Comprehension in the Human Brain [0.1503974529275767]
Recent artificial neural networks that process natural language achieve unprecedented performance in tasks requiring sentence-level understanding. We review works that compare these artificial language models with human brain activity and we assess the extent to which this approach has improved our understanding of the neural processes involved in natural language comprehension.
arXiv Detail & Related papers (2023-01-16T10:31:25Z)
Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps [75.84770193489639]
We examine the impact of test loss, training corpus and model architecture on the prediction of functional Magnetic Resonance Imaging timecourses of participants listening to an audiobook. We find that untrained versions of each model already explain a significant amount of signal in the brain by capturing similarity in brain responses across identical words. We suggest good practices for future studies aiming at explaining the human language system using neural language models.
arXiv Detail & Related papers (2022-07-07T15:37:17Z)
Toward a realistic model of speech processing in the brain with self-supervised learning [67.7130239674153]
Self-supervised algorithms trained on the raw waveform constitute a promising candidate. We show that Wav2Vec 2.0 learns brain-like representations with as little as 600 hours of unlabelled speech.
arXiv Detail & Related papers (2022-06-03T17:01:46Z)
Long-range and hierarchical language predictions in brains and algorithms [82.81964713263483]
We show that while deep language algorithms are optimized to predict adjacent words, the human brain would be tuned to make long-range and hierarchical predictions. This study strengthens predictive coding theory and suggests a critical role of long-range and hierarchical predictions in natural language processing.
arXiv Detail & Related papers (2021-11-28T20:26:07Z)
Model-based analysis of brain activity reveals the hierarchy of language in 305 subjects [82.81964713263483]
A popular approach to decompose the neural bases of language consists in correlating, across individuals, the brain responses to different stimuli. Here, we show that a model-based approach can reach equivalent results within subjects exposed to natural stimuli.
arXiv Detail & Related papers (2021-10-12T15:30:21Z)
Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings. We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI. This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z)
Does injecting linguistic structure into language models lead to better alignment with brain recordings? [13.880819301385854]
We evaluate whether language models align better with brain recordings if their attention is biased by annotations from syntactic or semantic formalisms. Our proposed approach enables the evaluation of more targeted hypotheses about the composition of meaning in the brain.
arXiv Detail & Related papers (2021-01-29T14:42:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.