On the Unexpected Abilities of Large Language Models
- URL: http://arxiv.org/abs/2308.09720v2
- Date: Mon, 18 Dec 2023 16:17:36 GMT
- Title: On the Unexpected Abilities of Large Language Models
- Authors: Stefano Nolfi
- Abstract summary: Large Language Models (LLMs) are capable of displaying a wide range of abilities that are not directly connected with the task for which they are trained.
I discuss the nature of the indirect process that leads to the acquisition of these cognitive abilities, their relation to other indirect processes, and the implications for the acquisition of integrated abilities.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) are capable of displaying a wide range of
abilities that are not directly connected with the task for which they are
trained: predicting the next words of human-written texts. In this article, I
review recent research investigating the cognitive abilities developed by LLMs
and their relation to human cognition. I discuss the nature of the indirect
process that leads to the acquisition of these cognitive abilities, their
relation to other indirect processes, and the implications for the acquisition
of integrated abilities. Moreover, I propose the factors that enable the
development of abilities that are related only very indirectly to the proximal
objective of the training task. Finally, I discuss whether the full set of
capabilities that LLMs could possibly develop is predictable.
Related papers
- Can Language Models Learn to Skip Steps? [59.84848399905409]
We study the ability to skip steps in reasoning.
Unlike humans, who may skip steps to enhance efficiency or to reduce cognitive load, models do not possess such motivations.
Our work presents the first exploration into human-like step-skipping ability.
arXiv Detail & Related papers (2024-11-04T07:10:24Z) - Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors [74.04775677110179]
In-context Learning (ICL) has become the primary method for performing natural language tasks with Large Language Models (LLMs)
In this work, we examine whether this is the result of the aggregation used in corresponding datasets, where trying to combine low-agreement, disparate annotations might lead to annotation artifacts that create detrimental noise in the prompt.
Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and advocate focusing on modeling individuals instead.
arXiv Detail & Related papers (2024-10-17T17:16:00Z) - Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation [70.52558242336988]
We focus on predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion.
In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation.
We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a multimodal transcript''
arXiv Detail & Related papers (2024-09-13T18:28:12Z) - Predicting and Understanding Human Action Decisions: Insights from Large Language Models and Cognitive Instance-Based Learning [0.0]
Large Language Models (LLMs) have demonstrated their capabilities across various tasks.
This paper exploits the reasoning and generative capabilities of the LLMs to predict human behavior in two sequential decision-making tasks.
We compare the performance of LLMs with a cognitive instance-based learning model, which imitates human experiential decision-making.
arXiv Detail & Related papers (2024-07-12T14:13:06Z) - Development of Cognitive Intelligence in Pre-trained Language Models [3.1815791977708834]
Recent studies show evidence for emergent cognitive abilities in Large Pre-trained Language Models.
The developmental trajectories of PLMs consistently exhibit a window of maximal alignment to human cognitive development.
After that window, training appears to serve the engineering goal of reducing loss but not the scientific goal of increasing alignment with human cognition.
arXiv Detail & Related papers (2024-07-01T07:56:36Z) - Exploring the LLM Journey from Cognition to Expression with Linear Representations [10.92882688742428]
This paper presents an in-depth examination of the evolution and interplay of cognitive and expressive capabilities in large language models (LLMs)
We define and explore the model's cognitive and expressive capabilities through linear representations across three critical phases: Pretraining, Supervised Fine-Tuning (SFT), and Reinforcement Learning from Human Feedback (RLHF)
Our findings unveil a sequential development pattern, where cognitive abilities are largely established during Pretraining, whereas expressive abilities predominantly advance during SFT and RLHF.
arXiv Detail & Related papers (2024-05-27T08:57:04Z) - The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition [74.04775677110179]
In-context Learning (ICL) has emerged as a powerful paradigm for performing natural language tasks with Large Language Models (LLM)
We show that LLMs have strong yet inconsistent priors in emotion recognition that ossify their predictions.
Our results suggest that caution is needed when using ICL with larger LLMs for affect-centered tasks outside their pre-training domain.
arXiv Detail & Related papers (2024-03-25T19:07:32Z) - Igniting Language Intelligence: The Hitchhiker's Guide From
Chain-of-Thought Reasoning to Language Agents [80.5213198675411]
Large language models (LLMs) have dramatically enhanced the field of language intelligence.
LLMs leverage the intriguing chain-of-thought (CoT) reasoning techniques, obliging them to formulate intermediate steps en route to deriving an answer.
Recent research endeavors have extended CoT reasoning methodologies to nurture the development of autonomous language agents.
arXiv Detail & Related papers (2023-11-20T14:30:55Z) - Post Hoc Explanations of Language Models Can Improve Language Models [43.2109029463221]
We present a novel framework, Amplifying Model Performance by Leveraging In-Context Learning with Post Hoc Explanations (AMPLIFY)
We leverage post hoc explanation methods which output attribution scores (explanations) capturing the influence of each of the input features on model predictions.
Our framework, AMPLIFY, leads to prediction accuracy improvements of about 10-25% over a wide range of tasks.
arXiv Detail & Related papers (2023-05-19T04:46:04Z) - ERICA: Improving Entity and Relation Understanding for Pre-trained
Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.