A Sentence is Worth a Thousand Pictures: Can Large Language Models
Understand Human Language?
- URL: http://arxiv.org/abs/2308.00109v1
- Date: Wed, 26 Jul 2023 18:58:53 GMT
- Title: A Sentence is Worth a Thousand Pictures: Can Large Language Models
Understand Human Language?
- Authors: Gary Marcus, Evelina Leivada, Elliot Murphy
- Abstract summary: We analyze the contribution of large language models as theoretically informative representations of a target system vs. atheoretical powerful mechanistic tools.
We identify the key abilities that are still missing from the current state of development and exploitation of these models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Artificial Intelligence applications show great potential for
language-related tasks that rely on next-word prediction. The current
generation of large language models has been linked to claims about human-like
linguistic performance, and their applications are hailed both as a key step
towards Artificial General Intelligence and as a major advance in understanding
the cognitive, and even neural, basis of human language. We analyze the
contribution of large language models as theoretically informative
representations of a target system vs. atheoretical powerful mechanistic tools,
and we identify the key abilities that are still missing from the current state
of development and exploitation of these models.
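To make the mechanism at issue concrete, here is a minimal sketch of next-word prediction with an off-the-shelf causal language model. The choice of GPT-2 and the Hugging Face transformers library are illustrative assumptions, not part of the paper.

```python
# Minimal sketch (illustration only, not from the paper): next-word prediction
# with a causal language model. Assumes the Hugging Face `transformers` library
# and the publicly available GPT-2 checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "A sentence is worth a thousand"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# Probability distribution over the next token, given the prompt.
next_token_probs = logits[0, -1].softmax(dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r:>12}  p={prob.item():.3f}")
```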
Related papers
- Tokens, the oft-overlooked appetizer: Large language models, the distributional hypothesis, and meaning [31.632816425798108]
Tokenization is a necessary component within the current architecture of many language models.
We discuss how tokens and pretraining can act as a backdoor for bias and other unwanted content.
We relay evidence that the tokenization algorithm's objective function impacts the large language model's cognition.
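As a hedged illustration of the role tokens play, the sketch below splits a sentence into the subword tokens a model actually consumes; the Hugging Face transformers library and the GPT-2 BPE vocabulary are assumptions made for the example, not the paper's own setup.

```python
# Minimal sketch (illustration only): subword tokenization with GPT-2's BPE
# vocabulary via the Hugging Face `transformers` library (both assumed here;
# the paper does not prescribe a specific tokenizer).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
text = "A sentence is worth a thousand pictures."
tokens = tokenizer.tokenize(text)  # subword pieces, e.g. ['A', 'Ġsentence', ...]
ids = tokenizer.encode(text)       # the integer ids the model is trained on
print(tokens)
print(ids)
```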
arXiv Detail & Related papers (2024-12-14T18:18:52Z)
- Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning [84.94709351266557]
We focus on the trustworthiness of language models with respect to retrieval augmentation.
We hold that retrieval-augmented language models have the inherent capability of supplying responses according to both contextual and parametric knowledge.
Inspired by aligning language models with human preferences, we take a first step towards aligning retrieval-augmented language models to a state where they respond relying solely on the external evidence.
arXiv Detail & Related papers (2024-10-22T09:25:21Z)
- Visual Grounding Helps Learn Word Meanings in Low-Data Regimes [47.7950860342515]
Modern neural language models (LMs) are powerful tools for modeling human sentence production and comprehension.
But to achieve these results, LMs must be trained in distinctly un-human-like ways.
Do models trained more naturalistically -- with grounded supervision -- exhibit more humanlike language learning?
We investigate this question in the context of word learning, a key sub-task in language acquisition.
arXiv Detail & Related papers (2023-10-20T03:33:36Z)
- Brain in a Vat: On Missing Pieces Towards Artificial General Intelligence in Large Language Models [83.63242931107638]
We propose four characteristics of generally intelligent agents.
We argue that active engagement with objects in the real world delivers more robust signals for forming conceptual representations.
We conclude by outlining promising future research directions in the field of artificial general intelligence.
arXiv Detail & Related papers (2023-07-07T13:58:16Z)
- A Survey of Large Language Models [81.06947636926638]
Language modeling has been widely studied for language understanding and generation in the past two decades.
Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora.
To distinguish models by parameter scale, the research community has coined the term large language models (LLMs) for PLMs of significant size.
arXiv Detail & Related papers (2023-03-31T17:28:46Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks [39.39138995087475]
We ask how much of human-like thinking can be captured by learning statistical patterns in language alone.
Our benchmark contains two problem-solving domains (planning and explanation generation) and is designed to require generalization.
We find that humans are far more robust than LLMs on this benchmark.
arXiv Detail & Related papers (2022-05-11T18:14:33Z)
- Estimating the Personality of White-Box Language Models [0.589889361990138]
Large-scale language models, which are trained on large corpora of text, are being used in a wide range of applications.
Existing research shows that these models can and do capture human biases.
Many of these biases, especially those that could potentially cause harm, are being well-investigated.
However, studies that infer and change human personality traits inherited by these models have been scarce or non-existent.
arXiv Detail & Related papers (2022-04-25T23:53:53Z)
- A Survey of Knowledge Enhanced Pre-trained Models [28.160826399552462]
We refer to pre-trained language models with knowledge injection as knowledge-enhanced pre-trained language models (KEPLMs).
These models demonstrate deep understanding and logical reasoning and introduce interpretability.
arXiv Detail & Related papers (2021-10-01T08:51:58Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.