Artificial Neuropsychology: Are Large Language Models Developing
Executive Functions?
- URL: http://arxiv.org/abs/2305.04134v2
- Date: Tue, 17 Oct 2023 16:53:21 GMT
- Title: Artificial Neuropsychology: Are Large Language Models Developing
Executive Functions?
- Authors: Hernan Ceferino Vazquez
- Abstract summary: We evaluate the planning function and working memory of GPT using the popular Towers of Hanoi method.
Preliminary results show that LLMs generate near-optimal solutions in Towers of Hanoi-related tasks.
However, these abilities are quite limited and worse than those of well-trained humans when the tasks are not known.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial Intelligence (AI) has been rapidly advancing and has demonstrated
its ability to perform a wide range of cognitive tasks, including language
processing, visual recognition, and decision-making. Part of this progress is
due to LLMs (Large Language Models) like those of the GPT (Generative
Pre-Trained Transformers) family. These models are capable of exhibiting
behavior that can be perceived as intelligent. Most authors in Neuropsychology
consider intelligent behavior to depend on a number of overarching skills, or
Executive Functions (EFs), which rely on the correct functioning of neural
networks in the frontal lobes, and have developed a series of tests to evaluate
them. In this work, we raise the question of whether LLMs are developing
executive functions similar to those of humans as part of their learning, and
we evaluate the planning function and working memory of GPT using the popular
Towers of Hanoi method. Additionally, we introduce a new variant of the
classical method in order to prevent the solutions from appearing in the LLM's
training data (data leakage). Preliminary results show that LLMs generate
near-optimal solutions to Towers of Hanoi-related tasks, adhere to task
constraints, and exhibit rapid planning capabilities and efficient
working-memory usage, indicating a potential development of executive functions.
However, these abilities are quite limited and worse than those of well-trained
humans when the tasks are not known and are not part of the training data.
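The Towers of Hanoi task used in the evaluation can be sketched as follows: an optimal solver generates the 2^n - 1 move sequence, and a validator checks whether a candidate move sequence (e.g., one produced by an LLM) respects the puzzle's constraints. The function names and peg labels here are illustrative, not taken from the paper.

```python
def hanoi_moves(n, src="A", aux="B", dst="C"):
    """Return the optimal (2^n - 1)-move sequence for n disks."""
    if n == 0:
        return []
    # Move n-1 disks out of the way, move the largest, then stack the rest.
    return (hanoi_moves(n - 1, src, dst, aux)
            + [(src, dst)]
            + hanoi_moves(n - 1, aux, src, dst))

def is_valid_solution(n, moves):
    """Check that a move sequence obeys the rules and solves the puzzle."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for src, dst in moves:
        if not pegs[src]:
            return False  # illegal: moving from an empty peg
        if pegs[dst] and pegs[dst][-1] < pegs[src][-1]:
            return False  # illegal: larger disk placed on a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs["C"] == list(range(n, 0, -1))
```

A validator like this makes it possible to score an LLM's output both on optimality (move count versus 2^n - 1) and on constraint adherence, the two dimensions the abstract reports.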
Related papers
- Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks [0.8425561594225592]
This study introduces a novel framework for training smaller language models in function calling.
It focuses on specific logical and mathematical reasoning tasks.
The approach aims to improve performances of small-scale models for these tasks using function calling.
arXiv Detail & Related papers (2024-10-24T16:27:35Z)
- Neuron-based Personality Trait Induction in Large Language Models [115.08894603023712]
Large language models (LLMs) have become increasingly proficient at simulating various personality traits.
We present a neuron-based approach for personality trait induction in LLMs.
arXiv Detail & Related papers (2024-10-16T07:47:45Z)
- Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z)
- Predicting and Understanding Human Action Decisions: Insights from Large Language Models and Cognitive Instance-Based Learning [0.0]
Large Language Models (LLMs) have demonstrated their capabilities across various tasks.
This paper exploits the reasoning and generative capabilities of the LLMs to predict human behavior in two sequential decision-making tasks.
We compare the performance of LLMs with a cognitive instance-based learning model, which imitates human experiential decision-making.
arXiv Detail & Related papers (2024-07-12T14:13:06Z)
- Development of Cognitive Intelligence in Pre-trained Language Models [3.1815791977708834]
Recent studies show evidence for emergent cognitive abilities in Large Pre-trained Language Models.
The developmental trajectories of PLMs consistently exhibit a window of maximal alignment to human cognitive development.
After that window, training appears to serve the engineering goal of reducing loss but not the scientific goal of increasing alignment with human cognition.
arXiv Detail & Related papers (2024-07-01T07:56:36Z)
- KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents [54.09074527006576]
Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges.
This inadequacy primarily stems from the lack of built-in action knowledge in language agents.
We introduce KnowAgent, a novel approach designed to enhance the planning capabilities of LLMs by incorporating explicit action knowledge.
arXiv Detail & Related papers (2024-03-05T16:39:12Z)
- Empowering Large Language Model Agents through Action Learning [85.39581419680755]
Large Language Model (LLM) Agents have recently garnered increasing interest, yet they are limited in their ability to learn from trial and error.
We argue that the capacity to learn new actions from experience is fundamental to the advancement of learning in LLM agents.
We introduce a framework LearnAct with an iterative learning strategy to create and improve actions in the form of Python functions.
arXiv Detail & Related papers (2024-02-24T13:13:04Z)
- Define, Evaluate, and Improve Task-Oriented Cognitive Capabilities for Instruction Generation Models [5.975913042883176]
Recent work studies the cognitive capabilities of language models through psychological tests designed for humans.
We formulate task-oriented cognitive capabilities, which are human-like cognitive capabilities that language models leverage to perform tasks.
arXiv Detail & Related papers (2022-12-21T04:43:19Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.