Unveiling A Core Linguistic Region in Large Language Models
- URL: http://arxiv.org/abs/2310.14928v1
- Date: Mon, 23 Oct 2023 13:31:32 GMT
- Title: Unveiling A Core Linguistic Region in Large Language Models
- Authors: Jun Zhao, Zhihao Zhang, Yide Ma, Qi Zhang, Tao Gui, Luhui Gao and
Xuanjing Huang
- Abstract summary: This paper conducts analogical research using brain localization as a prototype.
We have discovered a core region in large language models that corresponds to linguistic competence.
We observe that an improvement in linguistic competence is not necessarily accompanied by an elevation in the model's knowledge level.
- Score: 49.860260050718516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Brain localization, which describes the association between specific regions
of the brain and their corresponding functions, is widely accepted in the field
of cognitive science as an objective fact. Today's large language models (LLMs)
possess human-level linguistic competence and can execute complex tasks
requiring abstract knowledge and reasoning. To deeply understand the inherent
mechanisms of intelligence emergence in LLMs, this paper conducts analogical
research using brain localization as a prototype. We have discovered a core
region in LLMs that corresponds to linguistic competence, accounting for
approximately 1% of the total model parameters. This core region exhibits
significant dimension dependency, and perturbations to even a single parameter
on specific dimensions can lead to a loss of linguistic competence.
Furthermore, we observe that an improvement in linguistic competence is not
necessarily accompanied by an elevation in the model's knowledge level, which might
imply the existence of regions of domain knowledge that are dissociated from
the linguistic region. Overall, exploring the LLMs' functional regions provides
insights into the foundation of their intelligence. In the future, we will
continue to investigate knowledge regions within LLMs and the interactions
between them.
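The abstract's single-parameter perturbation probe can be illustrated with a toy sketch. Everything below is an illustrative assumption, not the authors' code: a tiny linear "layer" stands in for an LLM weight matrix, one weight entry at a time is scaled, and the resulting output drift is measured to rank individual parameters by sensitivity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one weight matrix of a language-model layer (assumed setup).
W = rng.normal(size=(8, 8))
x = rng.normal(size=8)  # a fixed input activation

def forward(weights, inp):
    """Toy forward pass: one linear layer with a tanh nonlinearity."""
    return np.tanh(weights @ inp)

baseline = forward(W, x)

def perturbation_effect(i, j, scale=10.0):
    """Scale the single parameter W[i, j] and measure how far the output drifts."""
    W_p = W.copy()
    W_p[i, j] *= scale
    return float(np.linalg.norm(forward(W_p, x) - baseline))

# Rank individual parameters by single-entry perturbation impact; the paper
# reports that specific dimensions in the core region are disproportionately
# sensitive in exactly this sense.
effects = {(i, j): perturbation_effect(i, j)
           for i in range(8) for j in range(8)}
most_sensitive = max(effects, key=effects.get)
print(most_sensitive, round(effects[most_sensitive], 3))
```

In a real experiment the output drift would be replaced by a language-competence metric such as perplexity on held-out text.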
Related papers
- Lost in Translation: The Algorithmic Gap Between LMs and the Brain [8.799971499357499]
Language Models (LMs) have achieved impressive performance on various linguistic tasks, but their relationship to human language processing in the brain remains unclear.
This paper examines the gaps and overlaps between LMs and the brain at different levels of analysis.
We discuss how insights from neuroscience, such as sparsity, modularity, internal states, and interactive learning, can inform the development of more biologically plausible language models.
arXiv Detail & Related papers (2024-07-05T17:43:16Z)
- Analyzing Key Neurons in Large Language Models [14.69046890281591]
Large Language Models (LLMs) possess vast amounts of knowledge within their parameters, prompting research into methods for locating and editing this knowledge.
We introduce Neuron-Inverse Cluster Attribution (NA-ICA), a novel architecture-agnostic framework capable of identifying key neurons in LLMs.
arXiv Detail & Related papers (2024-06-16T09:36:32Z)
- Can large language models understand uncommon meanings of common words? [30.527834781076546]
Large language models (LLMs) have shown significant advancements across diverse natural language understanding (NLU) tasks.
Yet, in the absence of widely acknowledged testing mechanisms, whether LLMs are mere parrots or genuinely comprehend the world remains unclear.
This paper presents an innovative construction of a Lexical Semantic dataset with novel evaluation metrics.
arXiv Detail & Related papers (2024-05-09T12:58:22Z)
- Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.
We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs.
Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons.
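The LAPE idea can be sketched as follows. The array values and names below are illustrative assumptions, not the authors' data or code: for each neuron, estimate its activation probability per language, normalize, and compute the entropy; a low-entropy neuron fires mostly for one language and is flagged as language-specific.

```python
import numpy as np

# act_prob[n, l]: empirical probability that neuron n activates
# (e.g. its post-nonlinearity output is positive) on text in language l.
act_prob = np.array([
    [0.90, 0.01, 0.01, 0.01],  # fires almost only for language 0
    [0.50, 0.45, 0.55, 0.50],  # fires broadly across languages
    [0.20, 0.25, 0.22, 0.21],  # also language-agnostic
    [0.70, 0.10, 0.60, 0.65],
    [0.05, 0.05, 0.05, 0.80],  # mostly specific to language 3
])

def lape(p):
    """Entropy of the normalized activation-probability distribution.
    Low entropy means the neuron fires mostly for one language."""
    q = p / p.sum()
    return float(-(q * np.log(q)).sum())

scores = np.array([lape(row) for row in act_prob])
# Neurons with the lowest entropy are the language-specific candidates.
language_specific = int(scores.argmin())
print(language_specific, scores.round(3))
```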
arXiv Detail & Related papers (2024-02-26T09:36:05Z)
- Unveiling Linguistic Regions in Large Language Models [49.298360366468934]
Large Language Models (LLMs) have demonstrated considerable cross-lingual alignment and generalization ability.
This paper conducts several investigations on the linguistic competence of LLMs.
arXiv Detail & Related papers (2024-02-22T16:56:13Z)
- Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work lays the foundation for advancing dialectal NLP by documenting evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z)
- Information-Restricted Neural Language Models Reveal Different Brain Regions' Sensitivity to Semantics, Syntax and Context [87.31930367845125]
We trained a lexical language model, GloVe, and a supra-lexical language model, GPT-2, on a text corpus.
We then assessed to what extent these information-restricted models were able to predict the time-courses of fMRI signal of humans listening to naturalistic text.
Our analyses show that, while most brain regions involved in language are sensitive to both syntactic and semantic variables, the relative magnitudes of these effects vary considerably across regions.
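The encoding-model evaluation described here can be sketched with simulated data. The setup below is an assumed simplification, not the authors' pipeline: model-derived features are regressed onto a voxel's fMRI time-course with ridge regression, and prediction quality is scored by correlation on held-out time points.

```python
import numpy as np

rng = np.random.default_rng(2)

T, D = 200, 16                    # time points, feature dimensions
X = rng.normal(size=(T, D))       # language-model features per time point
true_w = rng.normal(size=D)
y = X @ true_w + 0.1 * rng.normal(size=T)  # simulated voxel signal

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X'X + alpha*I)^-1 X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

# Split in time: fit on the first half, evaluate on the second.
w = ridge_fit(X[:100], y[:100])
pred = X[100:] @ w
r = float(np.corrcoef(pred, y[100:])[0, 1])  # encoding score for this voxel
print(round(r, 3))
```

With information-restricted features (e.g. syntax-only versus semantics-only), comparing such scores per brain region is what reveals the differing sensitivities the summary describes.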
arXiv Detail & Related papers (2023-02-28T08:16:18Z)
- Dissociating language and thought in large language models [52.39241645471213]
Large Language Models (LLMs) have come closest among all models to date to mastering human language.
We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms.
Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty.
arXiv Detail & Related papers (2023-01-16T22:41:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.