Evidence of Cognitive Deficits and Developmental Advances in Generative AI: A Clock Drawing Test Analysis
- URL: http://arxiv.org/abs/2410.11756v1
- Date: Tue, 15 Oct 2024 16:27:22 GMT
- Title: Evidence of Cognitive Deficits and Developmental Advances in Generative AI: A Clock Drawing Test Analysis
- Authors: Isaac R. Galatzer-Levy, Jed McGiffin, David Munday, Xin Liu, Danny Karmon, Ilia Labzovsky, Rivka Moroshko, Amir Zait, Daniel McDuff
- Abstract summary: This study explores how several recent GenAI models perform on the Clock Drawing Test (CDT), a neuropsychological assessment of visuospatial planning and organization.
While the models create clock-like drawings, they struggle with accurate time representation, showing deficits similar to mild-to-severe cognitive impairment.
- Score: 17.5336703613751
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative AI's rapid advancement sparks interest in its cognitive abilities, especially given its capacity for tasks like language understanding and code generation. This study explores how several recent GenAI models perform on the Clock Drawing Test (CDT), a neuropsychological assessment of visuospatial planning and organization. While models create clock-like drawings, they struggle with accurate time representation, showing deficits similar to mild-to-severe cognitive impairment (Wechsler, 2009). Errors include numerical sequencing issues, incorrect clock times, and irrelevant additions, despite accurate rendering of clock features. Only GPT 4 Turbo and Gemini Pro 1.5 produced the correct time, scoring like healthy individuals (4/4). A follow-up clock-reading test revealed that only Sonnet 3.5 succeeded, suggesting drawing deficits stem from difficulty with numerical concepts. These findings may reflect weaknesses in visual-spatial understanding, working memory, or calculation, highlighting strengths in learned knowledge but weaknesses in reasoning. Comparing human and machine performance is crucial for understanding AI's cognitive capabilities and guiding development toward human-like cognitive functions.
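The abstract references a Wechsler-style 0-4 CDT score but does not reproduce the rubric. As a rough reading aid, the sketch below shows one plausible way such a four-point rubric (closed contour, all numerals present, correct sequence, correct hand placement) could be coded; the field names, criteria, and the 11:10 example time are assumptions for illustration, not the paper's actual scoring protocol.

```python
"""Toy 0-4 clock-drawing scorer, loosely modelled on a Wechsler-style CDT
rubric. The criteria and the 11:10 example are illustrative assumptions,
not the scoring protocol used in the paper."""

from __future__ import annotations
from dataclasses import dataclass


@dataclass
class ClockDrawing:
    has_closed_contour: bool            # roughly circular clock face
    numbers_present: list[int]          # numerals detected, in the order drawn
    hour_hand_points_to: int | None     # 1-12, or None if the hand is missing
    minute_hand_points_to: int | None   # nearest numeral the minute hand points at


def score_cdt(drawing: ClockDrawing, target_hour: int, target_minute_numeral: int) -> int:
    """Award one point each for contour, all numerals, correct order, correct time."""
    score = 0
    if drawing.has_closed_contour:
        score += 1
    if set(drawing.numbers_present) == set(range(1, 13)):
        score += 1
        if drawing.numbers_present == list(range(1, 13)):   # correct sequence
            score += 1
    if (drawing.hour_hand_points_to == target_hour
            and drawing.minute_hand_points_to == target_minute_numeral):
        score += 1
    return score


# Example: "ten past eleven" (11:10), i.e. the minute hand at the 2
perfect = ClockDrawing(True, list(range(1, 13)), 11, 2)
wrong_time = ClockDrawing(True, list(range(1, 13)), 10, 6)
print(score_cdt(perfect, 11, 2))     # 4 -- comparable to a healthy adult
print(score_cdt(wrong_time, 11, 2))  # 3 -- clock-like drawing, wrong time
```

Under this toy rubric, a drawing with accurate clock features but the wrong time loses only the final point, mirroring the pattern the abstract describes for most of the evaluated models.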
Related papers
- Inverse Scaling in Test-Time Compute [51.16323216811257]
Extending the reasoning length of Large Reasoning Models (LRMs) deteriorates performance. We identify five distinct failure modes when models reason for longer. These findings suggest that while test-time compute scaling remains promising for improving model capabilities, it may inadvertently reinforce problematic reasoning patterns.
arXiv Detail & Related papers (2025-07-19T00:06:13Z) - Reasoning in machine vision: learning to think fast and slow [10.430190333487957]
Reasoning is a hallmark of human intelligence, enabling adaptive decision-making in complex and unfamiliar scenarios. Machine intelligence remains bound to training data, lacking the ability to dynamically refine solutions at inference time. Here we present a novel learning paradigm that enables machine reasoning in vision by allowing performance improvement with increasing thinking time.
arXiv Detail & Related papers (2025-06-27T10:03:05Z) - Auto Detecting Cognitive Events Using Machine Learning on Pupillary Data [0.0]
Pupil size is a valuable indicator of cognitive workload, reflecting changes in attention and arousal governed by the autonomic nervous system.
This study explores the potential of using machine learning to automatically detect cognitive events experienced by individuals.
arXiv Detail & Related papers (2024-10-18T04:54:46Z) - Metacognitive Monitoring: A Human Ability Beyond Generative Artificial Intelligence [0.0]
Large language models (LLMs) have shown impressive alignment with human cognitive processes.
This study investigates whether ChatGPT possesses metacognitive monitoring abilities akin to those of humans.
arXiv Detail & Related papers (2024-10-17T09:42:30Z) - Visual Agents as Fast and Slow Thinkers [88.6691504568041]
We introduce FaST, which incorporates the Fast and Slow Thinking mechanism into visual agents.
FaST employs a switch adapter to dynamically select between System 1/2 modes.
It tackles uncertain and unseen objects by adjusting model confidence and integrating new contextual data.
arXiv Detail & Related papers (2024-08-16T17:44:02Z) - The Generative AI Paradox: "What It Can Create, It May Not Understand" [81.89252713236746]
The recent wave of generative AI has sparked excitement and concern over potentially superhuman levels of artificial intelligence.
At the same time, models still show basic errors in understanding that would not be expected even in non-expert humans.
This presents us with an apparent paradox: how do we reconcile seemingly superhuman capabilities with the persistence of errors that few humans would make?
arXiv Detail & Related papers (2023-10-31T18:07:07Z) - Empowering Psychotherapy with Large Language Models: Cognitive Distortion Detection through Diagnosis of Thought Prompting [82.64015366154884]
We study the task of cognitive distortion detection and propose the Diagnosis of Thought (DoT) prompting.
DoT performs diagnosis on the patient's speech via three stages: subjectivity assessment to separate the facts and the thoughts; contrastive reasoning to elicit the reasoning processes supporting and contradicting the thoughts; and schema analysis to summarize the cognition schemas.
Experiments demonstrate that DoT obtains significant improvements over ChatGPT for cognitive distortion detection, while generating high-quality rationales approved by human experts.
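As a reading aid, here is a minimal, hypothetical sketch of how the three DoT stages could be chained as successive prompts; the `ask` callable stands in for any chat-model client, and the prompt wording is paraphrased from the summary rather than taken from the paper.

```python
"""Hypothetical sketch of the three-stage Diagnosis of Thought (DoT) prompting
described above. `ask` stands in for any chat-LLM client; the prompts are
paraphrased from the summary, not taken from the paper."""

from typing import Callable


def diagnose_thought(ask: Callable[[str], str], patient_speech: str) -> dict[str, str]:
    # Stage 1: subjectivity assessment -- separate facts from subjective thoughts.
    subjectivity = ask(
        "Separate the objective facts from the subjective thoughts in this "
        f"patient statement:\n{patient_speech}"
    )
    # Stage 2: contrastive reasoning -- elicit reasoning that supports and
    # reasoning that contradicts the identified thoughts.
    contrastive = ask(
        "List the reasoning that supports and the reasoning that contradicts "
        f"the subjective thoughts identified here:\n{subjectivity}"
    )
    # Stage 3: schema analysis -- summarize the underlying cognition schema,
    # which a downstream classifier of cognitive distortions can then use.
    schema = ask(f"Summarize the cognition schema suggested by this analysis:\n{contrastive}")
    return {"subjectivity": subjectivity, "contrastive": contrastive, "schema": schema}
```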
arXiv Detail & Related papers (2023-10-11T02:47:21Z) - Incremental procedural and sensorimotor learning in cognitive humanoid robots [52.77024349608834]
This work presents a cognitive agent that can learn procedures incrementally.
We show the cognitive functions required in each substage and how adding new functions helps address tasks previously unsolved by the agent.
Results show that this approach is capable of solving complex tasks incrementally.
arXiv Detail & Related papers (2023-04-30T22:51:31Z) - Mind meets machine: Unravelling GPT-4's cognitive psychology [0.7302002320865727]
Large language models (LLMs) are emerging as potent tools increasingly capable of performing human-level tasks.
This study focuses on the evaluation of GPT-4's performance on datasets such as CommonsenseQA, SuperGLUE, MATH and HANS.
We show that GPT-4 exhibits a high level of accuracy in cognitive psychology tasks relative to the prior state-of-the-art models.
arXiv Detail & Related papers (2023-03-20T20:28:26Z) - Modeling cognitive load as a self-supervised brain rate with electroencephalography and deep learning [2.741266294612776]
This research presents a novel self-supervised method for mental workload modelling from EEG data.
The method is a convolutional recurrent neural network trainable with spatially preserving spectral topographic head-maps from EEG data to fit the brain rate variable.
Findings point to the existence of quasi-stable blocks of learnt high-level representations of cognitive activation: they can be induced through convolution and appear not to depend on each other over time, intuitively matching the non-stationary nature of brain responses.
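The summary above describes the architecture only at a high level (a convolutional recurrent network over spatially preserving spectral topographic head-maps). The snippet below is a rough, hypothetical sketch of that kind of model, regressing a scalar brain-rate target from a sequence of head-map frames; the input shape, band count, and layer sizes are illustrative assumptions, not the configuration reported in the paper.

```python
"""Hypothetical sketch of a convolutional recurrent network that regresses a
scalar brain-rate target from a sequence of spectral topographic head-maps.
All shapes and layer sizes are illustrative assumptions."""

import torch
import torch.nn as nn


class TopoCRNN(nn.Module):
    def __init__(self, n_bands: int = 5, hidden: int = 64):
        super().__init__()
        # Spatially preserving convolution applied to each head-map frame.
        self.conv = nn.Sequential(
            nn.Conv2d(n_bands, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)),
        )
        self.gru = nn.GRU(input_size=16 * 8 * 8, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)  # regress the (self-derived) brain-rate target

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, bands, height, width)
        b, t = x.shape[:2]
        frames = self.conv(x.flatten(0, 1))        # (b*t, 16, 8, 8)
        seq = frames.flatten(1).view(b, t, -1)     # (b, t, 1024)
        _, last = self.gru(seq)                    # final hidden state
        return self.head(last[-1]).squeeze(-1)     # (batch,)


# Example: 4 EEG windows, each a sequence of 10 head-maps over 5 bands on a 32x32 grid
pred = TopoCRNN()(torch.randn(4, 10, 5, 32, 32))
print(pred.shape)  # torch.Size([4])
```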
arXiv Detail & Related papers (2022-09-21T07:44:21Z) - CogAlign: Learning to Align Textual Neural Representations to Cognitive Language Processing Signals [60.921888445317705]
We propose a CogAlign approach to integrate cognitive language processing signals into natural language processing models.
We show that CogAlign achieves significant improvements with multiple cognitive features over state-of-the-art models on public datasets.
arXiv Detail & Related papers (2021-06-10T07:10:25Z) - Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning [78.13740873213223]
Bongard problems (BPs) were introduced as an inspirational challenge for visual cognition in intelligent systems.
We propose a new benchmark Bongard-LOGO for human-level concept learning and reasoning.
arXiv Detail & Related papers (2020-10-02T03:19:46Z) - Time Perception: A Review on Psychological, Computational and Robotic Models [2.223733768286313]
We introduce a brief background from the psychology and neuroscience literature, covering the characteristics and models of time perception.
We summarize the emergent computational and robotic models of time perception.
Most models of timing are developed for either sensory timing (i.e. the ability to assess an interval) or motor timing (i.e. the ability to reproduce an interval).
arXiv Detail & Related papers (2020-07-23T08:16:47Z) - Visualizing and Understanding Vision System [0.6510507449705342]
We use a vision recognition-reconstruction network (RRN) to investigate the development, recognition, learning and forgetting mechanisms.
In the digit recognition study, we observe that the RRN can maintain object-invariant representations under various viewing conditions.
In the learning and forgetting study, novel structure recognition is implemented through small-magnitude adjustments across all synapses while the pattern specificities of the original synaptic connectivity are preserved.
arXiv Detail & Related papers (2020-06-11T07:08:49Z)