The potential -- and the pitfalls -- of using pre-trained language models as cognitive science theories
- URL: http://arxiv.org/abs/2501.12651v1
- Date: Wed, 22 Jan 2025 05:24:23 GMT
- Title: The potential -- and the pitfalls -- of using pre-trained language models as cognitive science theories
- Authors: Raj Sanjay Shah, Sashank Varma
- Abstract summary: We discuss challenges to the use of PLMs as cognitive science theories.
We review assumptions used by researchers to map measures of PLM performance to measures of human performance.
We end by enumerating criteria for using PLMs as credible accounts of cognition and cognitive development.
- Abstract: Many studies have evaluated the cognitive alignment of Pre-trained Language Models (PLMs), i.e., their correspondence to adult performance across a range of cognitive domains. Recently, the focus has expanded to the developmental alignment of these models: identifying phases during training where improvements in model performance track improvements in children's thinking over development. However, there are many challenges to the use of PLMs as cognitive science theories, including different architectures, different training data modalities and scales, and limited model interpretability. In this paper, we distill lessons learned from treating PLMs not as engineering artifacts, but as cognitive science and developmental science models. We review assumptions used by researchers to map measures of PLM performance to measures of human performance. We identify potential pitfalls of this approach to understanding human thinking, and we end by enumerating criteria for using PLMs as credible accounts of cognition and cognitive development.
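The developmental-alignment idea in the abstract can be made concrete: compare a training checkpoint's accuracy profile across cognitive tasks with children's accuracy profiles at different ages, and report the best-matching age. A minimal sketch, using entirely hypothetical accuracy numbers and a simple Pearson-correlation measure (the paper does not prescribe this exact procedure):

```python
# Illustrative sketch (hypothetical data): quantify a checkpoint's
# "developmental alignment" by correlating its per-task accuracy
# profile with children's profiles at different ages.

from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

def best_matching_age(model_profile, child_profiles):
    """Return (age, r) for the child age whose accuracy profile
    correlates most strongly with the checkpoint's profile."""
    return max(
        ((age, pearson(model_profile, prof))
         for age, prof in child_profiles.items()),
        key=lambda pair: pair[1],
    )

# Hypothetical accuracies on five cognitive tasks.
checkpoint_accuracy = [0.55, 0.40, 0.70, 0.30, 0.60]
children_by_age = {
    5:  [0.40, 0.45, 0.30, 0.35, 0.20],
    8:  [0.50, 0.42, 0.68, 0.28, 0.58],
    11: [0.80, 0.85, 0.70, 0.90, 0.75],
}

age, r = best_matching_age(checkpoint_accuracy, children_by_age)
```

Tracking `age` across successive checkpoints is one way to test whether the model's trajectory moves through child-like stages in order, which is the developmental-alignment question the paper examines.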
Related papers
- Towards Automation of Cognitive Modeling using Large Language Models [4.269194018613294]
Computational cognitive models enable researchers to quantify cognitive processes and arbitrate between theories by fitting models to behavioral data.
Previous work has demonstrated that Large Language Models (LLMs) are adept at pattern recognition in-context, solving complex problems, and generating executable code.
We leverage these abilities to explore the potential of LLMs in automating the generation of cognitive models based on behavioral data.
arXiv Detail & Related papers (2025-02-02T19:07:13Z)
- Large Language Models and Cognitive Science: A Comprehensive Review of Similarities, Differences, and Challenges [14.739357670600102]
This comprehensive review explores the intersection of Large Language Models (LLMs) and cognitive science.
We analyze methods for evaluating LLMs' cognitive abilities and discuss their potential as cognitive models.
We assess cognitive biases and limitations of LLMs, along with proposed methods for improving their performance.
arXiv Detail & Related papers (2024-09-04T02:30:12Z)
- CogLM: Tracking Cognitive Development of Large Language Models [20.138831477848615]
We construct a benchmark CogLM based on Piaget's Theory of Cognitive Development.
CogLM comprises 1,220 questions spanning 10 cognitive abilities crafted by more than 20 human experts.
We find that advanced LLMs have demonstrated human-like cognitive abilities, comparable to those of a 20-year-old human.
arXiv Detail & Related papers (2024-08-17T09:49:40Z)
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- A Survey of Models for Cognitive Diagnosis: New Developments and Future Directions [66.40362209055023]
This paper surveys current models for cognitive diagnosis, with particular attention to new developments in machine learning-based methods.
By comparing model structures, parameter estimation algorithms, evaluation methods, and applications, we provide a relatively comprehensive review of recent trends in cognitive diagnosis models.
arXiv Detail & Related papers (2024-07-07T18:02:00Z)
- Development of Cognitive Intelligence in Pre-trained Language Models [3.1815791977708834]
Recent studies show evidence for emergent cognitive abilities in Large Pre-trained Language Models.
The developmental trajectories of PLMs consistently exhibit a window of maximal alignment to human cognitive development.
After that window, training appears to serve the engineering goal of reducing loss but not the scientific goal of increasing alignment with human cognition.
arXiv Detail & Related papers (2024-07-01T07:56:36Z)
- ConcEPT: Concept-Enhanced Pre-Training for Language Models [57.778895980999124]
ConcEPT aims to infuse conceptual knowledge into pre-trained language models.
It exploits external entity concept prediction to predict the concepts of entities mentioned in the pre-training contexts.
Experimental results show that ConcEPT acquires improved conceptual knowledge through concept-enhanced pre-training.
arXiv Detail & Related papers (2024-01-11T05:05:01Z)
- Exploring the Cognitive Knowledge Structure of Large Language Models: An Educational Diagnostic Assessment Approach [50.125704610228254]
Large Language Models (LLMs) have not only exhibited exceptional performance across various tasks, but also demonstrated sparks of intelligence.
Recent studies have focused on assessing their capabilities on human exams and revealed their impressive competence in different domains.
We conduct an evaluation using MoocRadar, a meticulously annotated human test dataset based on Bloom's taxonomy.
arXiv Detail & Related papers (2023-10-12T09:55:45Z)
- Machine Psychology [54.287802134327485]
We argue that a fruitful direction for research is engaging large language models in behavioral experiments inspired by psychology.
We highlight theoretical perspectives, experimental paradigms, and computational analysis techniques that this approach brings to the table.
It paves the way for a "machine psychology" for generative artificial intelligence (AI) that goes beyond performance benchmarks.
arXiv Detail & Related papers (2023-03-24T13:24:41Z)
- Towards Interpretable Deep Learning Models for Knowledge Tracing [62.75876617721375]
We propose to adopt the post-hoc method to tackle the interpretability issue for deep learning based knowledge tracing (DLKT) models.
Specifically, we focus on applying the layer-wise relevance propagation (LRP) method to interpret an RNN-based DLKT model.
Experimental results show the feasibility of using the LRP method for interpreting the DLKT model's predictions.
arXiv Detail & Related papers (2020-05-13T04:03:21Z)
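The LRP approach summarized above redistributes a prediction's relevance backward through the network, layer by layer. A minimal sketch of the LRP epsilon rule for a single fully connected layer, with hypothetical toy weights (the paper applies LRP to an RNN-based model, which needs additional rules for recurrent connections):

```python
# Minimal sketch of the layer-wise relevance propagation (LRP)
# epsilon rule for one fully connected layer y = x @ w + b.
# Relevance flowing into output unit j is redistributed to input
# unit i in proportion to that input's contribution x_i * w_ij.

def lrp_linear(x, w, b, relevance_out, eps=1e-6):
    """Redistribute output relevances to the layer's inputs.

    x: input activations (length n)
    w: weight matrix as list of rows, w[i][j] (n x m)
    b: bias vector (length m)
    relevance_out: relevance of each output unit (length m)
    Returns the input relevances (length n).
    """
    n, m = len(x), len(relevance_out)
    # Pre-activation of each output unit.
    z = [sum(x[i] * w[i][j] for i in range(n)) + b[j] for j in range(m)]
    relevance_in = [0.0] * n
    for j in range(m):
        # eps stabilizes the denominator when z[j] is near zero.
        denom = z[j] + (eps if z[j] >= 0 else -eps)
        for i in range(n):
            relevance_in[i] += x[i] * w[i][j] / denom * relevance_out[j]
    return relevance_in

# Toy example: two inputs, one output unit, each input
# contributing equally (1.0 * 0.5 == 2.0 * 0.25).
x = [1.0, 2.0]
w = [[0.5], [0.25]]
b = [0.0]
r_in = lrp_linear(x, w, b, relevance_out=[1.0])
```

Up to the epsilon term, the rule conserves relevance: the input relevances sum to the relevance injected at the output, which is what makes the resulting attributions interpretable as a decomposition of the prediction.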
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.