Turning large language models into cognitive models
- URL: http://arxiv.org/abs/2306.03917v1
- Date: Tue, 6 Jun 2023 18:00:01 GMT
- Title: Turning large language models into cognitive models
- Authors: Marcel Binz, Eric Schulz
- Abstract summary: We show that large language models can be turned into cognitive models.
These models offer accurate representations of human behavior, even outperforming traditional cognitive models in two decision-making domains.
Taken together, these results suggest that large, pre-trained models can be adapted to become generalist cognitive models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models are powerful systems that excel at many tasks, ranging
from translation to mathematical reasoning. Yet, at the same time, these models
often show unhuman-like characteristics. In the present paper, we address this
gap and ask whether large language models can be turned into cognitive models.
We find that -- after finetuning them on data from psychological experiments --
these models offer accurate representations of human behavior, even
outperforming traditional cognitive models in two decision-making domains. In
addition, we show that their representations contain the information necessary
to model behavior on the level of individual subjects. Finally, we demonstrate
that finetuning on multiple tasks enables large language models to predict
human behavior in a previously unseen task. Taken together, these results
suggest that large, pre-trained models can be adapted to become generalist
cognitive models, thereby opening up new research directions that could
transform cognitive psychology and the behavioral sciences as a whole.
Related papers
- Can Language Models Learn to Skip Steps? [59.84848399905409]
We study the ability to skip steps in reasoning.
Unlike humans, who may skip steps to enhance efficiency or to reduce cognitive load, models do not possess such motivations.
Our work presents the first exploration into human-like step-skipping ability.
arXiv Detail & Related papers (2024-11-04T07:10:24Z)
- Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations [52.11801730860999]
In recent years, the robot learning community has shown increasing interest in using deep generative models to capture the complexity of large datasets.
We present the different types of models that the community has explored, such as energy-based models, diffusion models, action value maps, or generative adversarial networks.
We also present the different types of applications in which deep generative models have been used, from grasp generation to trajectory generation or cost learning.
arXiv Detail & Related papers (2024-08-08T11:34:31Z) - Using Artificial Populations to Study Psychological Phenomena in Neural
Models [0.0]
Investigation of cognitive behavior in language models must be conducted in an appropriate population for the results to be meaningful.
We leverage work in uncertainty estimation in a novel approach to efficiently construct experimental populations.
We provide theoretical grounding in the uncertainty estimation literature and motivation from current cognitive work regarding language models.
arXiv Detail & Related papers (2023-08-15T20:47:51Z) - Language Models are Bounded Pragmatic Speakers: Understanding RLHF from
a Bayesian Cognitive Modeling Perspective [2.8282906214258805]
This paper formulates a probabilistic cognitive model called the bounded pragmatic speaker.
We demonstrate that large language models fine-tuned with reinforcement learning from human feedback embody a model of thought that resembles a fast-and-slow model.
arXiv Detail & Related papers (2023-05-28T16:04:48Z) - A Survey of Large Language Models [81.06947636926638]
Language modeling has been widely studied for language understanding and generation in the past two decades.
Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora.
To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size.
arXiv Detail & Related papers (2023-03-31T17:28:46Z) - Language Model Behavior: A Comprehensive Survey [5.663056267168211]
We discuss over 250 recent studies of English language model behavior before task-specific fine-tuning.
Despite dramatic increases in generated text quality as models scale to hundreds of billions of parameters, the models are still prone to unfactual responses, commonsense errors, memorized text, and social biases.
arXiv Detail & Related papers (2023-03-20T23:54:26Z) - Chain of Hindsight Aligns Language Models with Feedback [62.68665658130472]
We propose a novel technique, Chain of Hindsight, that is easy to optimize and can learn from any form of feedback, regardless of its polarity.
We convert all types of feedback into sequences of sentences, which are then used to fine-tune the model.
By doing so, the model is trained to generate outputs based on feedback, while learning to identify and correct negative attributes or errors.
arXiv Detail & Related papers (2023-02-06T10:28:16Z) - Training Trajectories of Language Models Across Scales [99.38721327771208]
Scaling up language models has led to unprecedented performance gains.
How do language models of different sizes learn during pre-training?
Why do larger language models demonstrate more desirable behaviors?
arXiv Detail & Related papers (2022-12-19T19:16:29Z) - Emergent Abilities of Large Language Models [172.08007363384218]
We consider an ability to be emergent if it is not present in smaller models but is present in larger models.
The existence of such emergence implies that additional scaling could further expand the range of capabilities of language models.
arXiv Detail & Related papers (2022-06-15T17:32:01Z) - Estimating the Personality of White-Box Language Models [0.589889361990138]
Large-scale language models, which are trained on large corpora of text, are being used in a wide range of applications everywhere.
Existing research shows that these models can and do capture human biases.
Many of these biases, especially those that could potentially cause harm, are being well-investigated.
However, studies that infer and change human personality traits inherited by these models have been scarce or non-existent.
arXiv Detail & Related papers (2022-04-25T23:53:53Z)
- Uncovering Constraint-Based Behavior in Neural Models via Targeted Fine-Tuning [9.391375268580806]
We show that competing linguistic processes within a language obscure underlying linguistic knowledge.
While human behavior has been found to be similar across languages, we find cross-linguistic variation in model behavior.
Our results suggest that models need to learn both the linguistic constraints in a language and their relative ranking, with mismatches in either producing non-human-like behavior.
arXiv Detail & Related papers (2021-06-02T14:52:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.