Towards a Psychology of Machines: Large Language Models Predict Human Memory
- URL: http://arxiv.org/abs/2403.05152v2
- Date: Mon, 14 Oct 2024 14:24:08 GMT
- Title: Towards a Psychology of Machines: Large Language Models Predict Human Memory
- Authors: Markus Huff, Elanur Ulakçı
- Abstract summary: Large language models (LLMs) are excelling across various tasks despite not being based on human cognition.
This study examines ChatGPT's ability to predict human performance in a language-based memory task.
- Abstract: Large language models (LLMs) are excelling across various tasks despite not being based on human cognition, prompting an investigation into their potential to offer insights into human cognitive mechanisms. This study examines ChatGPT's ability to predict human performance in a language-based memory task. Following theories of text comprehension, we hypothesized that recognizing ambiguous sentences is easier with relevant preceding context. Participants, including humans and ChatGPT, were given pairs of sentences: the second always a garden-path sentence, and the first providing either fitting or unfitting context. We measured their ratings of sentence relatedness and memorability. Results showed a strong alignment between ChatGPT's assessments and human memory performance. Sentences in the fitting context were rated as being more related and memorable by ChatGPT and were better remembered by humans, highlighting LLMs' potential to predict human performance and contribute to psychological theories.
Related papers
- Metacognitive Monitoring: A Human Ability Beyond Generative Artificial Intelligence [0.0]
Large language models (LLMs) have shown impressive alignment with human cognitive processes.
This study investigates whether ChatGPT possesses metacognitive monitoring abilities akin to those of humans.
arXiv Detail & Related papers (2024-10-17T09:42:30Z)
- Measuring Psychological Depth in Language Models [50.48914935872879]
We introduce the Psychological Depth Scale (PDS), a novel framework rooted in literary theory that measures an LLM's ability to produce authentic and narratively complex stories.
We empirically validate our framework by showing that humans can consistently evaluate stories based on PDS (0.72 Krippendorff's alpha).
Surprisingly, GPT-4 stories either surpassed or were statistically indistinguishable from highly-rated human-written stories sourced from Reddit.
arXiv Detail & Related papers (2024-06-18T14:51:54Z) - Grammaticality Representation in ChatGPT as Compared to Linguists and Laypeople [0.0]
This study builds upon a previous study that collected laypeople's grammatical judgments on 148 linguistic phenomena.
Our primary focus was to compare ChatGPT with both laypeople and linguists in the judgement of these linguistic constructions.
Overall, our findings demonstrate convergence rates ranging from 73% to 95% between ChatGPT and linguists, with an overall point-estimate of 89%.
arXiv Detail & Related papers (2024-06-17T00:23:16Z) - Can ChatGPT Read Who You Are? [10.577227353680994]
We report the results of a comprehensive user study featuring texts written in Czech by a representative population sample of 155 participants.
We compare the personality trait estimations made by ChatGPT against those by human raters and report ChatGPT's competitive performance in inferring personality traits from text.
arXiv Detail & Related papers (2023-12-26T14:43:04Z)
- PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner.
Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z)
- Primacy Effect of ChatGPT [69.49920102917598]
We study the primacy effect of ChatGPT: the tendency to select labels at earlier positions as the answer.
We hope that our experiments and analyses provide additional insights into building more reliable ChatGPT-based solutions.
arXiv Detail & Related papers (2023-10-20T00:37:28Z)
- Humans and language models diverge when predicting repeating text [52.03471802608112]
We present a scenario in which the performance of humans and LMs diverges.
Human and GPT-2 LM predictions are strongly aligned in the first presentation of a text span, but their performance quickly diverges when memory begins to play a role.
We hope that this scenario will spur future work in bringing LMs closer to human behavior.
arXiv Detail & Related papers (2023-10-10T08:24:28Z)
- Is ChatGPT a Good Personality Recognizer? A Preliminary Study [19.278538849802025]
This study investigates ChatGPT's ability to recognize personality from given text.
We employ a variety of prompting strategies to explore this ability.
arXiv Detail & Related papers (2023-07-08T11:02:02Z)
- Does ChatGPT have Theory of Mind? [2.3129337924262927]
Theory of Mind (ToM) is the ability to understand human thinking and decision-making.
This paper investigates to what extent recent Large Language Models in the ChatGPT tradition possess ToM.
arXiv Detail & Related papers (2023-05-23T12:55:21Z)
- Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT [103.57103957631067]
ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries.
We evaluate ChatGPT's understanding ability on the most popular GLUE benchmark, comparing it with 4 representative fine-tuned BERT-style models.
We find that: 1) ChatGPT falls short in handling paraphrase and similarity tasks; 2) ChatGPT outperforms all BERT models on inference tasks by a large margin; 3) ChatGPT achieves performance comparable to BERT on sentiment analysis and question answering tasks.
arXiv Detail & Related papers (2023-02-19T12:29:33Z)
- A Categorical Archive of ChatGPT Failures [47.64219291655723]
ChatGPT, developed by OpenAI, has been trained using massive amounts of data and simulates human conversation.
It has garnered significant attention due to its ability to effectively answer a broad range of human inquiries.
However, a comprehensive analysis of ChatGPT's failures is lacking, which is the focus of this study.
arXiv Detail & Related papers (2023-02-06T04:21:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.