Dissociating language and thought in large language models
- URL: http://arxiv.org/abs/2301.06627v3
- Date: Sat, 23 Mar 2024 19:52:33 GMT
- Title: Dissociating language and thought in large language models
- Authors: Kyle Mahowald, Anna A. Ivanova, Idan A. Blank, Nancy Kanwisher, Joshua B. Tenenbaum, Evelina Fedorenko,
- Abstract summary: Large Language Models (LLMs) have come closest among all models to date to mastering human language.
We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms.
Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty.
- Score: 52.39241645471213
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have come closest among all models to date to mastering human language, yet opinions about their linguistic and cognitive capabilities remain split. Here, we evaluate LLMs using a distinction between formal linguistic competence - knowledge of linguistic rules and patterns - and functional linguistic competence - understanding and using language in the world. We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms. Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty and often requires specialized fine-tuning and/or coupling with external modules. We posit that models that use language in human-like ways would need to master both of these competence types, which, in turn, could require the emergence of mechanisms specialized for formal linguistic competence, distinct from functional competence.
Related papers
- MaestroMotif: Skill Design from Artificial Intelligence Feedback [67.17724089381056]
MaestroMotif is a method for AI-assisted skill design, which yields high-performing and adaptable agents.
arXiv Detail & Related papers (2024-12-11T16:59:31Z)
- The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units [16.317199232071232]
Large language models (LLMs) exhibit remarkable capabilities on not just language tasks, but also various tasks that are not linguistic in nature.
In the human brain, neuroscience has identified a core language system that selectively and causally supports language processing.
We identify language-selective units within 18 popular LLMs, using the same localization approach that is used in neuroscience.
arXiv Detail & Related papers (2024-11-04T17:09:10Z)
- Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach for enhancing the multilingual capabilities of large language models (LLMs).
It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from top layers of LLMs.
It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z)
- Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability [2.672177830116334]
This study employs psycholinguistic paradigms in English to explore neuron-level representations in language models.
Our findings indicate that while GPT-2-XL struggles with the sound-shape task, it demonstrates human-like abilities in both sound-gender association and implicit causality.
arXiv Detail & Related papers (2024-09-24T07:40:33Z)
- Language Models as Models of Language [0.0]
This chapter critically examines the potential contributions of modern language models to theoretical linguistics.
I review a growing body of empirical evidence suggesting that language models can learn hierarchical syntactic structure and exhibit sensitivity to various linguistic phenomena.
I conclude that closer collaboration between theoretical linguists and computational researchers could yield valuable insights.
arXiv Detail & Related papers (2024-08-13T18:26:04Z)
- Comuniqa: Exploring Large Language Models for improving speaking skills [2.8227892155844088]
We investigate the potential of Large Language Models (LLMs) to improve English speaking skills.
Recent advancements in Artificial Intelligence (AI) offer promising solutions to overcome limitations.
We propose Comuniqa, a novel LLM-based system designed to enhance English speaking skills.
arXiv Detail & Related papers (2024-01-28T07:37:33Z)
- How Proficient Are Large Language Models in Formal Languages? An In-Depth Insight for Knowledge Base Question Answering [52.86931192259096]
Knowledge Base Question Answering (KBQA) aims to answer natural language questions based on facts in knowledge bases.
Recent works leverage the capabilities of large language models (LLMs) for logical form generation to improve performance.
arXiv Detail & Related papers (2024-01-11T09:27:50Z)
- Unveiling A Core Linguistic Region in Large Language Models [49.860260050718516]
This paper conducts analogical research using brain localization as a prototype.
We have discovered a core region in large language models that corresponds to linguistic competence.
We observe that an improvement in linguistic competence is not necessarily accompanied by an increase in the model's knowledge level.
arXiv Detail & Related papers (2023-10-23T13:31:32Z)
- Testing the Ability of Language Models to Interpret Figurative Language [69.59943454934799]
Figurative and metaphorical language are commonplace in discourse.
It remains an open question to what extent modern language models can interpret nonliteral phrases.
We introduce Fig-QA, a Winograd-style nonliteral language understanding task.
arXiv Detail & Related papers (2022-04-26T23:42:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.