Towards Explainable and Language-Agnostic LLMs: Symbolic Reverse
Engineering of Language at Scale
- URL: http://arxiv.org/abs/2306.00017v4
- Date: Thu, 27 Jul 2023 16:47:26 GMT
- Title: Towards Explainable and Language-Agnostic LLMs: Symbolic Reverse
Engineering of Language at Scale
- Authors: Walid S. Saba
- Abstract summary: Large language models (LLMs) have achieved a milestone that undeniably changed many long-held beliefs in artificial intelligence (AI).
We argue for a bottom-up reverse engineering of language in a symbolic setting.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have achieved a milestone that undeniably changed many long-held beliefs in artificial intelligence (AI). However, these LLMs still have many limitations when it comes to true language understanding, limitations that are a byproduct of the underlying architecture of deep neural networks. Moreover, because of their subsymbolic nature, whatever knowledge these models acquire about how language works will always be buried in billions of microfeatures (weights), none of which is meaningful on its own, making such models hopelessly unexplainable. To address these limitations, we suggest combining the strength of symbolic representations with what we believe to be the key to the success of LLMs, namely a successful bottom-up reverse engineering of language at scale. We therefore argue for a bottom-up reverse engineering of language in a symbolic setting. Hints on what this project amounts to have been suggested by several authors, and we discuss in some detail here how it could be accomplished.
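As a toy illustration of what a bottom-up, symbolic reverse engineering of language might involve, the sketch below induces a human-readable selectional constraint (e.g., that the object of "drink" must be a liquid) from verb-object co-occurrences. The mini-corpus, the seed ontology, and the infer_selectional_types helper are hypothetical stand-ins of ours, not anything from the paper; the point is only that the induced knowledge, unlike network weights, is symbolic and inspectable.

```python
# Toy sketch (our illustration, not the paper's method): inducing symbolic
# selectional constraints bottom-up from verb-object co-occurrence data.
from collections import defaultdict

# Hypothetical mini-corpus of (verb, object) pairs and a seed ontology.
CORPUS = [
    ("drink", "water"), ("drink", "coffee"), ("drink", "tea"),
    ("read", "book"), ("read", "article"), ("read", "sign"),
]
ONTOLOGY = {  # object -> known ontological type
    "water": "liquid", "coffee": "liquid", "tea": "liquid",
    "book": "text", "article": "text", "sign": "text",
}

def infer_selectional_types(corpus, ontology):
    """If every observed object of a verb shares one type, record that as
    a symbolic, inspectable constraint, e.g. drink(x) => liquid(x)."""
    types_seen = defaultdict(set)
    for verb, obj in corpus:
        types_seen[verb].add(ontology[obj])
    return {v: ts.pop() for v, ts in types_seen.items() if len(ts) == 1}

print(infer_selectional_types(CORPUS, ONTOLOGY))
# -> {'drink': 'liquid', 'read': 'text'}
```

Each induced constraint can be read, audited, and corrected directly, which is exactly what the abstract argues is impossible when the same knowledge is smeared across billions of weights.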
Related papers
- LLMs' Understanding of Natural Language Revealed [0.0]
Large language models (LLMs) are the result of a massive experiment in bottom-up, data-driven reverse engineering of language at scale.
We will focus on testing LLMs for their language understanding capabilities, their supposed forte.
arXiv Detail & Related papers (2024-07-29T01:21:11Z) - Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency [0.11510009152620666]
We argue that claims regarding linguistic capabilities of Large Language Models (LLMs) are based on at least two unfounded assumptions.
Language completeness assumes that a distinct and complete thing such as a 'natural language' exists.
The assumption of data completeness relies on the belief that a language can be quantified and wholly captured by data.
arXiv Detail & Related papers (2024-07-11T18:06:01Z) - Reinterpreting 'the Company a Word Keeps': Towards Explainable and Ontologically Grounded Language Models [0.0]
We argue that the relative success of large language models (LLMs) is not a reflection on the symbolic vs. subsymbolic debate.
We suggest employing the same successful bottom-up strategy employed in LLMs but in a symbolic setting.
arXiv Detail & Related papers (2024-06-06T20:38:35Z) - Mind's Eye of LLMs: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models [71.93366651585275]
Large language models (LLMs) have exhibited impressive performance in language comprehension and various reasoning tasks.
We propose Visualization-of-Thought (VoT) to elicit the spatial reasoning of LLMs by visualizing their reasoning traces (a prompt-construction sketch follows this list).
VoT significantly enhances the spatial reasoning abilities of LLMs.
arXiv Detail & Related papers (2024-04-04T17:45:08Z) - Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.
We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs (a toy computation of this entropy follows this list).
Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons.
arXiv Detail & Related papers (2024-02-26T09:36:05Z) - Let Models Speak Ciphers: Multiagent Debate through Embeddings [84.20336971784495]
We introduce CIPHER (Communicative Inter-Model Protocol Through Embedding Representation), which lets debating LLMs communicate through embeddings rather than sampled natural-language tokens (a sketch follows this list).
By deviating from natural language, CIPHER offers the advantage of encoding a broader spectrum of information without any modification to the model weights.
This showcases the superiority and robustness of embeddings as an alternative "language" for communication among LLMs.
arXiv Detail & Related papers (2023-10-10T03:06:38Z) - Stochastic LLMs do not Understand Language: Towards Symbolic,
Explainable and Ontologically Based LLMs [0.0]
We argue that the relative success of data-driven large language models (LLMs) is not a reflection on the symbolic vs. subsymbolic debate.
We suggest in this paper applying the effective bottom-up strategy in a symbolic setting, resulting in symbolic, explainable, and ontologically grounded language models.
arXiv Detail & Related papers (2023-09-12T02:14:05Z) - Symbolic and Language Agnostic Large Language Models [0.0]
We argue that the relative success of large language models (LLMs) is not a reflection on the symbolic vs. subsymbolic debate.
What we suggest here is employing the successful bottom-up strategy in a symbolic setting, producing symbolic, language-agnostic, and ontologically grounded large language models.
arXiv Detail & Related papers (2023-08-27T20:24:33Z) - Large Language Models are In-Context Semantic Reasoners rather than
Symbolic Reasoners [75.85554779782048]
Large Language Models (LLMs) have excited the natural language and machine learning community over recent years.
Despite numerous successful applications, the underlying mechanism of such in-context capabilities still remains unclear.
In this work, we hypothesize that the learned semantics of language tokens do the most heavy lifting during the reasoning process.
arXiv Detail & Related papers (2023-05-24T07:33:34Z) - Shortcut Learning of Large Language Models in Natural Language
Understanding [119.45683008451698]
Large language models (LLMs) have achieved state-of-the-art performance on a series of natural language understanding tasks.
They might rely on dataset bias and artifacts as shortcuts for prediction.
This has significantly affected their generalizability and adversarial robustness.
arXiv Detail & Related papers (2022-08-25T03:51:39Z) - MRKL Systems: A modular, neuro-symbolic architecture that combines large
language models, external knowledge sources and discrete reasoning [50.40151403246205]
Huge language models (LMs) have ushered in a new era for AI, serving as a gateway to natural-language-based knowledge tasks.
We define a flexible architecture with multiple neural models, complemented by discrete knowledge and reasoning modules.
We describe this neuro-symbolic architecture, dubbed the Modular Reasoning, Knowledge and Language (MRKL) system (a minimal routing sketch follows this list).
arXiv Detail & Related papers (2022-05-01T11:01:28Z)
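On Visualization-of-Thought (VoT) above: the paper's exact prompt templates are not reproduced here, but under the assumption that the model is asked to render its intermediate state as a small text "image" after each reasoning step, a minimal prompt-construction sketch might look like this:

```python
# Minimal sketch of a VoT-style prompt (assumed shape, not the paper's
# exact template): the model must draw its intermediate state as an
# ASCII grid after every reasoning step before continuing.
def vot_prompt(task: str) -> str:
    return (
        f"Task: {task}\n"
        "Solve this step by step. After EACH step, draw the current "
        "state as an ASCII grid, then state your next move.\n"
        "Step 1:"
    )

print(vot_prompt("Navigate from S to G on a 3x3 grid, avoiding X cells."))
```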
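On LAPE above: the abstract only names the method, so the following sketch takes the obvious reading, assuming act_prob holds each neuron's estimated firing probability per language (dummy data below) and flagging the lowest-entropy neurons as language-specific; the layer choice, firing threshold, and cutoff are our assumptions, not the paper's.

```python
import numpy as np

# Dummy data: act_prob[i, l] = estimated probability that neuron i
# fires (activation > 0) on text in language l.
rng = np.random.default_rng(0)
act_prob = rng.random((1000, 6))  # 1000 neurons, 6 languages

def lape(act_prob, eps=1e-12):
    """Language activation probability entropy: normalize each neuron's
    firing probabilities into a distribution over languages and take its
    Shannon entropy. Low entropy = fires mostly for one language."""
    p = act_prob / (act_prob.sum(axis=1, keepdims=True) + eps)
    return -(p * np.log(p + eps)).sum(axis=1)

scores = lape(act_prob)
language_specific = np.argsort(scores)[:50]  # 50 lowest-entropy neurons
```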
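On CIPHER above: the core idea, as the abstract describes it, is to transmit the probability-weighted average of token embeddings rather than a single sampled token, so none of the information in the output distribution is discarded. The shapes and random inputs below are dummies, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim = 32000, 512
E = rng.standard_normal((vocab, dim))  # token embedding matrix
logits = rng.standard_normal(vocab)    # model's next-token logits

probs = np.exp(logits - logits.max())
probs /= probs.sum()                   # softmax over the vocabulary

token_msg = E[probs.argmax()]          # natural language: one token survives
cipher_msg = probs @ E                 # CIPHER: the whole distribution, encoded
```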
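On MRKL above: a minimal sketch of the pattern, with a router sending each query to a discrete calculator module or an external knowledge lookup when one applies, and falling back to the neural LM otherwise. The routing rule and the stub modules are our stand-ins, not the paper's:

```python
import re

def calculator(q):  # discrete reasoning module (toy; eval is unsafe in general)
    return str(eval(re.sub(r"[^0-9+\-*/(). ]", "", q)))

def kb_lookup(q):   # external knowledge module (stub)
    return {"capital of france": "Paris"}.get(q.lower())

def llm(q):         # neural fallback (stub)
    return f"<LM answer to: {q}>"

def mrkl_route(query: str) -> str:
    if re.fullmatch(r"[0-9+\-*/(). ]+", query):  # pure arithmetic -> calculator
        return calculator(query)
    return kb_lookup(query) or llm(query)

print(mrkl_route("2 * (3 + 4)"))        # -> 14
print(mrkl_route("capital of France"))  # -> Paris
```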