Cognitive Modeling of Semantic Fluency Using Transformers
- URL: http://arxiv.org/abs/2208.09719v1
- Date: Sat, 20 Aug 2022 16:48:04 GMT
- Title: Cognitive Modeling of Semantic Fluency Using Transformers
- Authors: Animesh Nighojkar, Anna Khlyzova, John Licato
- Abstract summary: We take the first step by predicting human performance in the semantic fluency task (SFT), a well-studied task in cognitive science.
We report preliminary evidence suggesting that, despite obvious implementational differences, TLMs can be used to identify individual differences in human fluency task behaviors.
We discuss the implications of this work for cognitive modeling of knowledge representations.
- Score: 6.445605125467574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Can deep language models be explanatory models of human cognition? If so,
what are their limits? In order to explore this question, we propose an
approach called hyperparameter hypothesization that uses predictive
hyperparameter tuning in order to find individuating descriptors of
cognitive-behavioral profiles. We take the first step in this approach by
predicting human performance in the semantic fluency task (SFT), a well-studied
task in cognitive science that has never before been modeled using
transformer-based language models (TLMs). In our task setup, we compare several
approaches to predicting which word an individual performing SFT will utter
next. We report preliminary evidence suggesting that, despite obvious
implementational differences in how people and TLMs learn and use language,
TLMs can be used to identify individual differences in human fluency task
behaviors better than existing computational models, and may offer insights
into human memory retrieval strategies -- a cognitive process not typically
considered to be the kind of thing TLMs can model. Finally, we discuss the
implications of this work for cognitive modeling of knowledge representations.
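The abstract does not spell out the implementation, but the core idea admits a minimal sketch: score each candidate next word in a participant's fluency sequence with a causal TLM, and fit a decoding hyperparameter per participant so the model best predicts that individual's actual productions. Everything below (GPT-2 as the TLM, the animal sequence, the temperature grid) is an illustrative assumption, not the authors' code.
```python
# Illustrative sketch (not the authors' code): fit a per-participant softmax
# temperature so that a causal TLM best predicts which word the participant
# utters next in the semantic fluency task.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def word_logprob(prompt: str, word: str, temperature: float) -> float:
    """Log-probability that `word` continues `prompt` at this temperature."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = torch.cat(
        [prompt_ids, tokenizer(" " + word, return_tensors="pt").input_ids], dim=-1
    )
    with torch.no_grad():
        logprobs = torch.log_softmax(model(full_ids).logits / temperature, dim=-1)
    # Sum log-probabilities of the word's tokens given everything before them.
    return sum(
        logprobs[0, i - 1, full_ids[0, i]].item()
        for i in range(prompt_ids.shape[-1], full_ids.shape[-1])
    )

# A made-up participant's animal-fluency productions.
sequence = ["dog", "cat", "wolf", "lion", "tiger", "zebra"]

def sequence_logprob(temperature: float) -> float:
    """How well the TLM predicts each next word from the words so far."""
    return sum(
        word_logprob("Animals: " + ", ".join(sequence[:t]) + ",",
                     sequence[t], temperature)
        for t in range(1, len(sequence))
    )

# Grid-search the hyperparameter; the best value is this participant's
# candidate "individuating descriptor".
best_temp = max([0.5, 0.75, 1.0, 1.5, 2.0], key=sequence_logprob)
print(f"best-fitting temperature for this participant: {best_temp}")
```
The best-fitting value then serves as a behavioral descriptor for that individual; the paper's "hyperparameter hypothesization" presumably generalizes this to richer hyperparameter spaces and model comparisons.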
Related papers
- Can Language Models Learn to Skip Steps? [59.84848399905409]
We study the ability to skip steps in reasoning.
Unlike humans, who may skip steps to enhance efficiency or to reduce cognitive load, models do not possess such motivations.
Our work presents the first exploration into human-like step-skipping ability.
arXiv Detail & Related papers (2024-11-04T07:10:24Z)
- Reverse-Engineering the Reader [43.26660964074272]
We introduce a novel alignment technique in which we fine-tune a language model to implicitly optimize the parameters of a linear regressor.
Using words as a test case, we evaluate our technique across multiple model sizes and datasets.
We find an inverse relationship between psychometric predictive power and a model's performance on downstream NLP tasks, as well as its perplexity on held-out test data.
arXiv Detail & Related papers (2024-10-16T23:05:01Z)
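A hedged sketch of the regressor at the center of the entry above: per-token surprisal from a causal LM as the sole predictor in a linear model of reading times. The paper fine-tunes the LM itself to optimize this fit; the sketch only fits the linear head on frozen GPT-2 surprisals, and the reading times are synthetic placeholders, not human data.
```python
# Illustrative sketch: regress reading times on LM surprisal.
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tokenizer("The old man the boats near the dock", return_tensors="pt").input_ids
with torch.no_grad():
    logprobs = torch.log_softmax(model(ids).logits, dim=-1)

# Surprisal of each token given its left context (the first token has none).
surprisal = np.array(
    [-logprobs[0, i - 1, ids[0, i]].item() for i in range(1, ids.shape[-1])]
)

# Synthetic per-token reading times (ms) with a known linear trend plus noise.
rng = np.random.default_rng(0)
reading_times = 300.0 + 20.0 * surprisal + rng.normal(0.0, 10.0, size=surprisal.shape)

# Ordinary least squares: reading_time ~ a * surprisal + b.
X = np.stack([surprisal, np.ones_like(surprisal)], axis=1)
(a, b), *_ = np.linalg.lstsq(X, reading_times, rcond=None)
print(f"slope {a:.1f} ms per nat, intercept {b:.1f} ms")
```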
- Latent Variable Sequence Identification for Cognitive Models with Neural Bayes Estimation [7.7227297059345466]
We present an approach that extends neural Bayes estimation to learn a direct mapping between experimental data and the targeted latent variable space.
Our work underscores that combining recurrent neural networks and simulation-based inference to identify latent variable sequences can enable researchers to access a wider class of cognitive models.
arXiv Detail & Related papers (2024-06-20T21:13:39Z)
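The combination described above (recurrent networks plus simulation-based inference) can be illustrated with a toy sketch: simulate behavior from a generative model with a known latent sequence, then train a GRU to map the observed choices back to that sequence. The random-walk choice model below is an assumption made for illustration, not the paper's cognitive model.
```python
# Illustrative sketch: amortized estimation of a latent variable sequence.
import torch
from torch import nn

def simulate(n_seqs: int, n_trials: int):
    """Return (observed binary choices, true latent choice probabilities)."""
    logit = torch.cumsum(0.3 * torch.randn(n_seqs, n_trials), dim=1)
    return torch.bernoulli(torch.sigmoid(logit)), torch.sigmoid(logit)

class LatentEstimator(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, choices):                  # (batch, trials)
        h, _ = self.rnn(choices.unsqueeze(-1))   # (batch, trials, hidden)
        return torch.sigmoid(self.head(h)).squeeze(-1)

model = LatentEstimator()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(200):                          # train on fresh simulations
    choices, latent = simulate(n_seqs=64, n_trials=50)
    loss = nn.functional.mse_loss(model(choices), latent)
    opt.zero_grad()
    loss.backward()
    opt.step()

# The trained network maps a new behavioural sequence directly to an
# estimate of its latent trajectory, without per-subject refitting.
choices, latent = simulate(1, 50)
print(nn.functional.mse_loss(model(choices), latent).item())
```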
- Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice [4.029252551781513]
We propose a novel way to enhance the utility of Large Language Models as cognitive models.
We show that an LLM pretrained on an ecologically valid arithmetic dataset predicts human behavior better than many traditional cognitive models.
arXiv Detail & Related papers (2024-05-29T17:37:14Z)
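One plausible reading of the readout step, sketched under stated assumptions: embed textual gamble descriptions with a pretrained LM and fit a simple classifier to choice data. GPT-2 stands in for the paper's arithmetic-trained model, and the gambles and choices below are fabricated for illustration.
```python
# Illustrative sketch: LM embeddings of choice problems -> choice predictions.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

def embed(text: str) -> np.ndarray:
    """Mean-pooled hidden states as a fixed-size problem representation."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        hidden = model(ids).last_hidden_state    # (1, tokens, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()

gambles = [
    "Option A: $50 for sure. Option B: 50% chance of $100, else $0.",
    "Option A: $10 for sure. Option B: 10% chance of $200, else $0.",
    "Option A: $90 for sure. Option B: 90% chance of $100, else $0.",
    "Option A: $5 for sure. Option B: 99% chance of $6, else $0.",
]
chose_safe = [1, 0, 1, 0]                        # fabricated human choices

X = np.stack([embed(g) for g in gambles])
clf = LogisticRegression(max_iter=1000).fit(X, chose_safe)
print("training accuracy:", clf.score(X, chose_safe))
```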
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
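A minimal sketch of a concept-bottleneck head, assuming a pooled PLM representation as input: the hidden vector is first mapped to a handful of human-readable concept scores, and the label is predicted from those scores alone, so every prediction comes with concept-level explanations. Concept names and dimensions are illustrative, not the paper's.
```python
# Illustrative sketch: a concept-bottleneck classification head for a PLM.
import torch
from torch import nn

CONCEPTS = ["positive sentiment", "formal tone", "mentions price"]

class ConceptBottleneckHead(nn.Module):
    def __init__(self, hidden_size: int = 768, n_labels: int = 2):
        super().__init__()
        self.to_concepts = nn.Linear(hidden_size, len(CONCEPTS))
        self.to_label = nn.Linear(len(CONCEPTS), n_labels)

    def forward(self, pooled):                   # (batch, hidden_size)
        concepts = torch.sigmoid(self.to_concepts(pooled))
        return self.to_label(concepts), concepts # logits + explanations

head = ConceptBottleneckHead()
pooled = torch.randn(1, 768)                     # stand-in for a PLM's pooled output
logits, concepts = head(pooled)
for name, score in zip(CONCEPTS, concepts[0].tolist()):
    print(f"{name}: {score:.2f}")
```
Because the label depends only on the concept scores, a reader can inspect (or intervene on) those scores to see which concepts drove a prediction.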
- Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary [65.268245109828]
The non-human-like behaviour of contemporary pre-trained language models (PLMs) is a leading factor undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produces contradictory results.
We propose a practical approach that alleviates the inconsistent behaviour issue by improving PLM awareness.
arXiv Detail & Related papers (2023-10-24T06:15:15Z)
- Beyond Convergence: Identifiability of Machine Learning and Deep Learning Models [0.0]
We investigate the notion of model parameter identifiability through a case study focused on parameter estimation from motion sensor data.
We employ a deep neural network to estimate subject-wise parameters, including mass, stiffness, and equilibrium leg length.
The results show that while certain parameters can be identified from the observation data, others remain unidentifiable.
arXiv Detail & Related papers (2023-07-21T03:40:53Z)
- On Conditional and Compositional Language Model Differentiable Prompting [75.76546041094436]
Prompts have been shown to be an effective method to adapt a frozen Pretrained Language Model (PLM) to perform well on downstream tasks.
We propose a new model, Prompt Production System (PRopS), which learns to transform task instructions or input metadata, into continuous prompts.
arXiv Detail & Related papers (2023-07-04T02:47:42Z)
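A hedged sketch of the idea summarized above: a small trainable module maps a task instruction to continuous prompt vectors that are prepended to the frozen PLM's input embeddings. The module design and dimensions below are assumptions for illustration, not the PRopS architecture.
```python
# Illustrative sketch: instruction-conditioned continuous prompts for a frozen PLM.
import torch
from torch import nn
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
plm = AutoModelForCausalLM.from_pretrained("gpt2")
for p in plm.parameters():
    p.requires_grad = False                      # the PLM stays frozen

class PromptProducer(nn.Module):
    """Instruction embeddings -> `prompt_len` continuous prompt vectors."""
    def __init__(self, d_model: int = 768, prompt_len: int = 5):
        super().__init__()
        self.prompt_len, self.d_model = prompt_len, d_model
        self.net = nn.Linear(d_model, prompt_len * d_model)

    def forward(self, instr_embeds):             # (1, tokens, d_model)
        pooled = instr_embeds.mean(dim=1)        # crude instruction summary
        return self.net(pooled).view(1, self.prompt_len, self.d_model)

producer = PromptProducer()
embed = plm.get_input_embeddings()

instr = tokenizer("Summarize the following text:", return_tensors="pt").input_ids
inputs = tokenizer("The cat sat on the mat.", return_tensors="pt").input_ids

soft_prompt = producer(embed(instr))             # conditional continuous prompt
full_embeds = torch.cat([soft_prompt, embed(inputs)], dim=1)
out = plm(inputs_embeds=full_embeds)             # only `producer` would be trained
print(out.logits.shape)
```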
- Turning large language models into cognitive models [0.0]
We show that large language models can be turned into cognitive models.
These models offer accurate representations of human behavior, even outperforming traditional cognitive models in two decision-making domains.
Taken together, these results suggest that large, pre-trained models can be adapted to become generalist cognitive models.
arXiv Detail & Related papers (2023-06-06T18:00:01Z)
- Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation [66.86987509942607]
We examine how such a pretraining paradigm should be carried out in imitation learning.
We consider a setting where the pretraining corpus consists of multitask demonstrations.
We argue that inverse dynamics modeling is well-suited to this setting.
arXiv Detail & Related papers (2023-05-26T14:40:46Z)
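Inverse dynamics modeling, as used above, means predicting the action that connects two consecutive states; the encoder learned this way can then be reused for downstream imitation. A toy sketch follows, with random tensors standing in for multitask demonstration data.
```python
# Illustrative sketch: inverse dynamics pretraining, predict a_t from (s_t, s_{t+1}).
import torch
from torch import nn

STATE_DIM, ACTION_DIM = 8, 3

class InverseDynamicsModel(nn.Module):
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(STATE_DIM, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, ACTION_DIM)

    def forward(self, s_t, s_next):
        z = torch.cat([self.encoder(s_t), self.encoder(s_next)], dim=-1)
        return self.head(z)                      # predicted action

model = InverseDynamicsModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(100):
    s_t = torch.randn(32, STATE_DIM)             # fake transitions
    a_t = torch.randn(32, ACTION_DIM)
    s_next = s_t + 0.1 * a_t.repeat(1, 3)[:, :STATE_DIM]  # toy dynamics
    loss = nn.functional.mse_loss(model(s_t, s_next), a_t)
    opt.zero_grad()
    loss.backward()
    opt.step()

# After pretraining, `model.encoder` supplies the state representation
# for downstream behaviour cloning.
print(loss.item())
```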
- An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks [112.1942546460814]
We report the first exploration of the prompt tuning paradigm for speech processing tasks based on Generative Spoken Language Model (GSLM).
Experiment results show that the prompt tuning technique achieves competitive performance in speech classification tasks with fewer trainable parameters than fine-tuning specialized downstream models.
arXiv Detail & Related papers (2022-03-31T03:26:55Z)
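A minimal sketch of prompt tuning with a frozen backbone, under the assumption that GPT-2 can stand in for GSLM's speech-unit language model: a small set of trainable prompt vectors is prepended to the input embeddings, and only those vectors receive gradients, which is why far fewer parameters are trained than in full fine-tuning.
```python
# Illustrative sketch: prompt tuning, where only the prompt vectors are trainable.
import torch
from torch import nn
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
backbone = AutoModelForCausalLM.from_pretrained("gpt2")
for p in backbone.parameters():
    p.requires_grad = False                      # freeze the whole backbone

prompt = nn.Parameter(torch.randn(1, 10, 768) * 0.02)  # the only trainable weights
opt = torch.optim.Adam([prompt], lr=1e-3)

# Text stands in for GSLM's discrete speech units.
ids = tokenizer("unit_12 unit_7 unit_42", return_tensors="pt").input_ids
embeds = backbone.get_input_embeddings()(ids)
out = backbone(inputs_embeds=torch.cat([prompt, embeds], dim=1))

# A placeholder loss just to show that gradients reach only the prompt.
loss = out.logits.mean()
loss.backward()
print(prompt.grad is not None)                   # True: the prompt is trainable
```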