Related papers: Towards a Psychology of Machines: Large Language Models Predict Human Memory

Towards a Psychology of Machines: Large Language Models Predict Human Memory

URL: http://arxiv.org/abs/2403.05152v3
Date: Wed, 04 Dec 2024 19:01:43 GMT
Title: Towards a Psychology of Machines: Large Language Models Predict Human Memory
Authors: Markus Huff, Elanur Ulakçı,
Abstract summary: Large language models (LLMs) have shown remarkable abilities in natural language processing.<n>This study explores whether LLMs can predict human memory performance in tasks involving garden-path sentences and contextual information.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large language models (LLMs), such as ChatGPT, have shown remarkable abilities in natural language processing, opening new avenues in psychological research. This study explores whether LLMs can predict human memory performance in tasks involving garden-path sentences and contextual information. In the first part, we used ChatGPT to rate the relatedness and memorability of garden-path sentences preceded by either fitting or unfitting contexts. In the second part, human participants read the same sentences, rated their relatedness, and completed a surprise memory test. The results demonstrated that ChatGPT's relatedness ratings closely matched those of the human participants, and its memorability ratings effectively predicted human memory performance. Both LLM and human data revealed that higher relatedness in the unfitting context condition was associated with better memory performance, aligning with probabilistic frameworks of context-dependent learning. These findings suggest that LLMs, despite lacking human-like memory mechanisms, can model aspects of human cognition and serve as valuable tools in psychological research. We propose the field of machine psychology to explore this interplay between human cognition and artificial intelligence, offering a bidirectional approach where LLMs can both benefit from and contribute to our understanding of human cognitive processes.

Related papers

Human-like conceptual representations emerge from language prediction [72.5875173689788]
Large language models (LLMs) trained exclusively through next-token prediction over language data exhibit remarkably human-like behaviors. Are these models developing concepts akin to humans, and if so, how are such concepts represented and organized? Our results demonstrate that LLMs can flexibly derive concepts from linguistic descriptions in relation to contextual cues about other concepts. These findings establish that structured, human-like conceptual representations can naturally emerge from language prediction without real-world grounding.
arXiv Detail & Related papers (2025-01-21T23:54:17Z)
Metacognitive Monitoring: A Human Ability Beyond Generative Artificial Intelligence [0.0]
Large language models (LLMs) have shown impressive alignment with human cognitive processes. This study investigates whether ChatGPT possess metacognitive monitoring abilities akin to humans.
arXiv Detail & Related papers (2024-10-17T09:42:30Z)
Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation [70.52558242336988]
We focus on predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion. In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation. We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a multimodal transcript''
arXiv Detail & Related papers (2024-09-13T18:28:12Z)
Measuring Psychological Depth in Language Models [50.48914935872879]
We introduce the Psychological Depth Scale (PDS), a novel framework rooted in literary theory that measures an LLM's ability to produce authentic and narratively complex stories. We empirically validate our framework by showing that humans can consistently evaluate stories based on PDS (0.72 Krippendorff's alpha) Surprisingly, GPT-4 stories either surpassed or were statistically indistinguishable from highly-rated human-written stories sourced from Reddit.
arXiv Detail & Related papers (2024-06-18T14:51:54Z)
Grammaticality Representation in ChatGPT as Compared to Linguists and Laypeople [0.0]
This study builds upon a previous study that collected laypeople's grammatical judgments on 148 linguistic phenomena. Our primary focus was to compare ChatGPT with both laypeople and linguists in the judgement of these linguistic constructions. Overall, our findings demonstrate convergence rates ranging from 73% to 95% between ChatGPT and linguists, with an overall point-estimate of 89%.
arXiv Detail & Related papers (2024-06-17T00:23:16Z)
Linking In-context Learning in Transformers to Human Episodic Memory [1.124958340749622]
We focus on induction heads, which contribute to in-context learning in Transformer-based large language models. We demonstrate that induction heads are behaviorally, functionally, and mechanistically similar to the contextual maintenance and retrieval model of human episodic memory.
arXiv Detail & Related papers (2024-05-23T18:51:47Z)
Can ChatGPT Read Who You Are? [10.577227353680994]
We report the results of a comprehensive user study featuring texts written in Czech by a representative population sample of 155 participants. We compare the personality trait estimations made by ChatGPT against those by human raters and report ChatGPT's competitive performance in inferring personality traits from text.
arXiv Detail & Related papers (2023-12-26T14:43:04Z)
Divergences between Language Models and Human Brains [59.100552839650774]
We systematically explore the divergences between human and machine language processing. We identify two domains that LMs do not capture well: social/emotional intelligence and physical commonsense. Our results show that fine-tuning LMs on these domains can improve their alignment with human brain responses.
arXiv Detail & Related papers (2023-11-15T19:02:40Z)
PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner. Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z)
Primacy Effect of ChatGPT [69.49920102917598]
We study the primacy effect of ChatGPT: the tendency of selecting the labels at earlier positions as the answer. We hope that our experiments and analyses provide additional insights into building more reliable ChatGPT-based solutions.
arXiv Detail & Related papers (2023-10-20T00:37:28Z)
Humans and language models diverge when predicting repeating text [52.03471802608112]
We present a scenario in which the performance of humans and LMs diverges. Human and GPT-2 LM predictions are strongly aligned in the first presentation of a text span, but their performance quickly diverges when memory begins to play a role. We hope that this scenario will spur future work in bringing LMs closer to human behavior.
arXiv Detail & Related papers (2023-10-10T08:24:28Z)
Affect Recognition in Conversations Using Large Language Models [9.689990547610664]
Affect recognition plays a pivotal role in human communication. This study investigates the capacity of large language models (LLMs) to recognise human affect in conversations.
arXiv Detail & Related papers (2023-09-22T14:11:23Z)
Is ChatGPT a Good Personality Recognizer? A Preliminary Study [19.278538849802025]
This study investigates ChatGPT's ability in recognizing personality from given text. We employ a variety of prompting strategies to explore ChatGPT's ability in recognizing personality from given text.
arXiv Detail & Related papers (2023-07-08T11:02:02Z)
Does ChatGPT have Theory of Mind? [2.3129337924262927]
Theory of Mind (ToM) is the ability to understand human thinking and decision-making. This paper investigates what extent recent Large Language Models in the ChatGPT tradition possess ToM.
arXiv Detail & Related papers (2023-05-23T12:55:21Z)
Do Large Language Models Show Decision Heuristics Similar to Humans? A Case Study Using GPT-3.5 [0.0]
GPT-3.5 is an example of an LLM that supports a conversational agent called ChatGPT. In this work, we used a series of novel prompts to determine whether ChatGPT shows biases, and other decision effects. We also tested the same prompts on human participants.
arXiv Detail & Related papers (2023-05-08T01:02:52Z)
Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT [103.57103957631067]
ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries. We evaluate ChatGPT's understanding ability by evaluating it on the most popular GLUE benchmark, and comparing it with 4 representative fine-tuned BERT-style models. We find that: 1) ChatGPT falls short in handling paraphrase and similarity tasks; 2) ChatGPT outperforms all BERT models on inference tasks by a large margin; 3) ChatGPT achieves comparable performance compared with BERT on sentiment analysis and question answering tasks.
arXiv Detail & Related papers (2023-02-19T12:29:33Z)
A Categorical Archive of ChatGPT Failures [47.64219291655723]
ChatGPT, developed by OpenAI, has been trained using massive amounts of data and simulates human conversation. It has garnered significant attention due to its ability to effectively answer a broad range of human inquiries. However, a comprehensive analysis of ChatGPT's failures is lacking, which is the focus of this study.
arXiv Detail & Related papers (2023-02-06T04:21:59Z)
Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance. This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings. Some major observations are: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are speaking activity, support vector machines, and meetings composed of 3-4 persons equipped with microphones and cameras, respectively.
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
SensAI+Expanse Emotional Valence Prediction Studies with Cognition and Memory Integration [0.0]
This work contributes with an artificial intelligent agent able to assist on cognitive science studies. The developed artificial agent system (SensAI+Expanse) includes machine learning algorithms, empathetic algorithms, and memory. Results of the present study show evidence of significant emotional behaviour differences between some age ranges and gender combinations.
arXiv Detail & Related papers (2020-01-03T18:17:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.