On Representational Dissociation of Language and Arithmetic in Large Language Models
- URL: http://arxiv.org/abs/2502.11932v1
- Date: Mon, 17 Feb 2025 15:42:01 GMT
- Title: On Representational Dissociation of Language and Arithmetic in Large Language Models
- Authors: Riku Kisako, Tatsuki Kuribayashi, Ryohei Sasano,
- Abstract summary: We show that simple arithmetic equations and general language input are encoded in completely separated regions in large language models.
This suggests arithmetic reasoning is mapped into a distinct region from general language input.
- Score: 12.182463768190509
- License:
- Abstract: The association between language and (non-linguistic) thinking ability in humans has long been debated, and recently, neuroscientific evidence of brain activity patterns has been considered. Such a scientific context naturally raises an interdisciplinary question -- what about such a language-thought dissociation in large language models (LLMs)? In this paper, as an initial foray, we explore this question by focusing on simple arithmetic skills (e.g., $1+2=$ ?) as a thinking ability and analyzing the geometry of their encoding in LLMs' representation space. Our experiments with linear classifiers and cluster separability tests demonstrate that simple arithmetic equations and general language input are encoded in completely separated regions in LLMs' internal representation space across all the layers, which is also supported with more controlled stimuli (e.g., spelled-out equations). These tentatively suggest that arithmetic reasoning is mapped into a distinct region from general language input, which is in line with the neuroscientific observations of human brain activations, while we also point out their somewhat cognitively implausible geometric properties.
Related papers
- The neural correlates of logical-mathematical symbol systems processing resemble that of spatial cognition more than natural language processing [6.613108038833871]
The ability to manipulate logical-mathematical symbols (LMS) is a cognitive skill arguably unique to humans.
Previous studies have pinpointed two primary candidates, natural language processing and spatial cognition.
The present study compared the neural correlates at the domain level with both automated meta-analysis and synthesized maps.
arXiv Detail & Related papers (2024-06-20T14:31:09Z) - Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs [70.3132264719438]
We aim to fill the research gap by examining how neuron activation is shared across tasks and languages.
We classify neurons into four distinct categories based on their responses to a specific input across different languages.
Our analysis reveals the following insights: (i) the patterns of neuron sharing are significantly affected by the characteristics of tasks and examples; (ii) neuron sharing does not fully correspond with language similarity; (iii) shared neurons play a vital role in generating responses, especially those shared across all languages.
arXiv Detail & Related papers (2024-06-13T16:04:11Z) - What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores [1.8175282137722093]
Internal representations from large language models (LLMs) achieve state-of-the-art brain scores, leading to speculation that they share computational principles with human language processing.
Here, we analyze three neural datasets used in an impactful study on LLM-to-brain mappings, with a particular focus on an fMRI dataset where participants read short passages.
We find that brain scores of trained LLMs on this dataset can largely be explained by sentence length, position, and pronoun-dereferenced static word embeddings.
arXiv Detail & Related papers (2024-06-03T17:13:27Z) - Revealing the Parallel Multilingual Learning within Large Language Models [50.098518799536144]
In this study, we reveal an in-context learning capability of multilingual large language models (LLMs)
By translating the input to several languages, we provide Parallel Input in Multiple Languages (PiM) to LLMs, which significantly enhances their comprehension abilities.
arXiv Detail & Related papers (2024-03-14T03:33:46Z) - Divergences between Language Models and Human Brains [59.100552839650774]
We systematically explore the divergences between human and machine language processing.
We identify two domains that LMs do not capture well: social/emotional intelligence and physical commonsense.
Our results show that fine-tuning LMs on these domains can improve their alignment with human brain responses.
arXiv Detail & Related papers (2023-11-15T19:02:40Z) - Unveiling A Core Linguistic Region in Large Language Models [49.860260050718516]
This paper conducts an analogical research using brain localization as a prototype.
We have discovered a core region in large language models that corresponds to linguistic competence.
We observe that an improvement in linguistic competence does not necessarily accompany an elevation in the model's knowledge level.
arXiv Detail & Related papers (2023-10-23T13:31:32Z) - Information-Restricted Neural Language Models Reveal Different Brain
Regions' Sensitivity to Semantics, Syntax and Context [87.31930367845125]
We trained a lexical language model, Glove, and a supra-lexical language model, GPT-2, on a text corpus.
We then assessed to what extent these information-restricted models were able to predict the time-courses of fMRI signal of humans listening to naturalistic text.
Our analyses show that, while most brain regions involved in language are sensitive to both syntactic and semantic variables, the relative magnitudes of these effects vary a lot across these regions.
arXiv Detail & Related papers (2023-02-28T08:16:18Z) - Low-Dimensional Structure in the Space of Language Representations is
Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings.
We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI.
This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z) - Emergence of Separable Manifolds in Deep Language Representations [26.002842878797765]
Deep neural networks (DNNs) have shown much empirical success in solving perceptual tasks across various cognitive modalities.
Recent studies report considerable similarities between representations extracted from task-optimized DNNs and neural populations in the brain.
DNNs have subsequently become a popular model class to infer computational principles underlying complex cognitive functions.
arXiv Detail & Related papers (2020-06-01T17:23:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.