Comparing Machines and Children: Using Developmental Psychology
Experiments to Assess the Strengths and Weaknesses of LaMDA Responses
- URL: http://arxiv.org/abs/2305.11243v2
- Date: Tue, 7 Nov 2023 20:20:47 GMT
- Authors: Eliza Kosoy, Emily Rose Reagan, Leslie Lai, Alison Gopnik and Danielle
Krettek Cobb
- Abstract summary: We adapt classical developmental experiments to evaluate the capabilities of LaMDA, a large language model from Google.
We find that LaMDA generates appropriate responses similar to those of children in experiments involving social understanding.
On the other hand, LaMDA's responses in early object and action understanding, theory of mind, and especially causal reasoning tasks are very different from those of young children.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Developmental psychologists have spent decades devising experiments to test
the intelligence and knowledge of infants and children, tracing the origin of
crucial concepts and capacities. Moreover, experimental techniques in
developmental psychology have been carefully designed to discriminate the
cognitive capacities that underlie particular behaviors. We propose that using
classical experiments from child development is a particularly effective way to
probe the computational abilities of AI models, in general, and LLMs in
particular. First, the methodological techniques of developmental psychology,
such as the use of novel stimuli to control for past experience or control
conditions to determine whether children are using simple associations, can be
equally helpful for assessing the capacities of LLMs. In parallel, testing LLMs
in this way can tell us whether the information that is encoded in text is
sufficient to enable particular responses, or whether those responses depend on
other kinds of information, such as information from exploration of the
physical world. In this work we adapt classical developmental experiments to
evaluate the capabilities of LaMDA, a large language model from Google. We
propose a novel LLM Response Score (LRS) metric which can be used to evaluate
other language models, such as GPT. We find that LaMDA generates appropriate
responses that are similar to those of children in experiments involving social
understanding, perhaps providing evidence that knowledge of these domains is
discovered through language. On the other hand, LaMDA's responses in early
object and action understanding, theory of mind, and especially causal
reasoning tasks are very different from those of young children, perhaps
showing that these domains require more real-world, self-initiated exploration
and cannot simply be learned from patterns in language input.
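Illustrative sketch (not from the paper): the abstract introduces an LLM Response Score (LRS) but does not define it here. The Python snippet below is a minimal, hypothetical illustration of the general approach, posing a text rendition of a classic "blicket detector" causal-reasoning trial to a model and scoring the response against the child-typical answer pattern. The prompt wording, the query_model interface, and the keyword-based scoring are assumptions for illustration, not the paper's actual protocol.

```python
# Hypothetical sketch (not the paper's actual LRS): pose a text version of a
# classic "blicket detector" causal-reasoning trial to an LLM and score the
# response against the child-typical answer pattern.

from typing import Callable

# Backward-blocking trial: A+B together activate the machine, then A alone
# also activates it. Preschoolers typically infer that B is probably NOT a
# blicket, since A alone already explains the evidence.
BLICKET_PROMPT = (
    "A machine lights up when objects called blickets are placed on it. "
    "Blocks A and B are placed on the machine together, and it lights up. "
    "Then block A is placed on the machine alone, and it lights up. "
    "Is block B a blicket? Answer yes or no, then explain why."
)

# Crude keyword proxy for the child-typical answer; a real study would use
# human coding of free-form responses, not keyword matching.
CHILD_TYPICAL_KEYWORDS = ["no", "block a"]

def response_score(response: str) -> float:
    """Toy LRS-style score: fraction of child-typical features present."""
    text = response.lower()
    return sum(kw in text for kw in CHILD_TYPICAL_KEYWORDS) / len(CHILD_TYPICAL_KEYWORDS)

def evaluate(query_model: Callable[[str], str]) -> float:
    """query_model is any text-in/text-out LLM interface (assumed)."""
    return response_score(query_model(BLICKET_PROMPT))

if __name__ == "__main__":
    # Stub model for demonstration; a real run would query LaMDA, GPT, etc.
    stub = lambda p: ("No, B is probably not a blicket: block A alone already "
                      "explains why the machine lit up.")
    print(f"toy response score: {evaluate(stub):.2f}")
```

In a real study, free-form responses would be coded by human raters rather than keyword matching, and the stub model stands in for a call to LaMDA, GPT, or any other text-in/text-out interface.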
Related papers
- Simulating Human-Like Learning Dynamics with LLM-Empowered Agents [14.945276470541364]
We introduce LearnerAgent, a novel multi-agent framework based on Large Language Models (LLMs) to simulate a realistic teaching environment.
We construct learners with psychologically grounded profiles, such as Deep, Surface, and Lazy, as well as a persona-free General Learner to inspect the base LLM's default behavior.
Through weekly knowledge acquisition, monthly strategic choices, periodic tests, and peer interaction, we can track the dynamic learning progress of individual learners over a full-year journey.
arXiv Detail & Related papers (2025-08-07T17:57:46Z) - How LLMs Comprehend Temporal Meaning in Narratives: A Case Study in Cognitive Evaluation of LLMs [13.822169295436177]
We investigate how large language models (LLMs) process the temporal meaning of linguistic aspect in narratives that were previously used in human studies.
Our findings show that LLMs over-rely on prototypicality, produce inconsistent aspectual judgments, and struggle with causal reasoning derived from aspect.
These results suggest that LLMs process aspect fundamentally differently from humans and lack robust narrative understanding.
arXiv Detail & Related papers (2025-07-18T18:28:35Z) - Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study [50.065744358362345]
Large language models (LLMs) have shown impressive capabilities across tasks such as mathematics, coding, and reasoning.
Yet their learning ability, which is crucial for adapting to dynamic environments and acquiring new knowledge, remains underexplored.
arXiv Detail & Related papers (2025-06-16T13:24:50Z) - How Deep is Love in LLMs' Hearts? Exploring Semantic Size in Human-like Cognition [75.11808682808065]
This study investigates whether large language models (LLMs) exhibit similar tendencies in understanding semantic size.
Our findings reveal that multi-modal training is crucial for LLMs to achieve more human-like understanding.
Lastly, we examine whether LLMs are influenced by attention-grabbing headlines with larger semantic sizes in a real-world web shopping scenario.
arXiv Detail & Related papers (2025-03-01T03:35:56Z) - The potential -- and the pitfalls -- of using pre-trained language models as cognitive science theories [2.6549754445378344]
We discuss challenges to the use of PLMs as cognitive science theories.
We review assumptions used by researchers to map measures of PLM performance to measures of human performance.
We end by enumerating criteria for using PLMs as credible accounts of cognition and cognitive development.
arXiv Detail & Related papers (2025-01-22T05:24:23Z) - Mind Scramble: Unveiling Large Language Model Psychology Via Typoglycemia [27.650551131885152]
Research into large language models (LLMs) has shown promise in addressing complex tasks in the physical world.
Studies suggest that powerful LLMs, like GPT-4, are beginning to exhibit human-like cognitive abilities.
arXiv Detail & Related papers (2024-10-02T15:47:25Z) - Development of Cognitive Intelligence in Pre-trained Language Models [3.1815791977708834]
Recent studies show evidence for emergent cognitive abilities in Large Pre-trained Language Models.
The developmental trajectories of PLMs consistently exhibit a window of maximal alignment to human cognitive development.
After that window, training appears to serve the engineering goal of reducing loss but not the scientific goal of increasing alignment with human cognition.
arXiv Detail & Related papers (2024-07-01T07:56:36Z) - Exploring Spatial Schema Intuitions in Large Language and Vision Models [8.944921398608063]
We investigate whether large language models (LLMs) effectively capture implicit human intuitions about building blocks of language.
Surprisingly, correlations between model outputs and human responses emerge, revealing adaptability without a tangible connection to embodied experiences.
This research contributes to a nuanced understanding of the interplay between language, spatial experiences, and computations made by large language models.
arXiv Detail & Related papers (2024-02-01T19:25:50Z) - Exploring the Frontiers of LLMs in Psychological Applications: A Comprehensive Review [4.147674289030404]
Large language models (LLMs) have the potential to simulate aspects of human cognition and behavior.
LLMs offer innovative tools for literature review, hypothesis generation, experimental design, experimental subjects, data analysis, academic writing, and peer review in psychology.
There are issues like data privacy, the ethical implications of using LLMs in psychological research, and the need for a deeper understanding of these models' limitations.
arXiv Detail & Related papers (2024-01-03T03:01:29Z) - Divergences between Language Models and Human Brains [63.405788999891335]
Recent research has hinted that brain signals can be effectively predicted using internal representations of language models (LMs).
We show that there are clear differences in how LMs and humans represent and use language.
We identify two domains that are not captured well by LMs: social/emotional intelligence and physical commonsense.
arXiv Detail & Related papers (2023-11-15T19:02:40Z) - Character-LLM: A Trainable Agent for Role-Playing [67.35139167985008]
Large language models (LLMs) can be used to serve as agents to simulate human behaviors.
We introduce Character-LLM, which teaches LLMs to act as specific people such as Beethoven, Queen Cleopatra, Julius Caesar, etc.
arXiv Detail & Related papers (2023-10-16T07:58:56Z) - Exploring the Cognitive Knowledge Structure of Large Language Models: An
Educational Diagnostic Assessment Approach [50.125704610228254]
Large Language Models (LLMs) have not only exhibited exceptional performance across various tasks, but also demonstrated sparks of intelligence.
Recent studies have focused on assessing their capabilities on human exams and revealed their impressive competence in different domains.
We conduct an evaluation using MoocRadar, a meticulously annotated human test dataset based on Bloom's taxonomy.
arXiv Detail & Related papers (2023-10-12T09:55:45Z) - Machine Psychology [54.287802134327485]
We argue that a fruitful direction for research is engaging large language models in behavioral experiments inspired by psychology.
We highlight theoretical perspectives, experimental paradigms, and computational analysis techniques that this approach brings to the table.
It paves the way for a "machine psychology" for generative artificial intelligence (AI) that goes beyond performance benchmarks.
arXiv Detail & Related papers (2023-03-24T13:24:41Z) - Evaluating and Inducing Personality in Pre-trained Language Models [78.19379997967191]
We draw inspiration from psychometric studies by leveraging human personality theory as a tool for studying machine behaviors.
We introduce the Machine Personality Inventory (MPI), a tool for studying machine behaviors.
MPI follows standardized personality tests, built upon the Big Five Personality Factors (Big Five) theory and personality assessment inventories.
We devise a Personality Prompting (P2) method to induce LLMs with specific personalities in a controllable way.
arXiv Detail & Related papers (2022-05-20T07:32:57Z) - Is the Computation of Abstract Sameness Relations Human-Like in Neural
Language Models? [4.0810783261728565]
This work explores whether state-of-the-art NLP models exhibit elementary mechanisms known from human cognition.
The computation of "abstract sameness relations" is assumed to play an important role in human language acquisition and processing.
arXiv Detail & Related papers (2022-05-12T15:19:54Z) - From Psychological Curiosity to Artificial Curiosity: Curiosity-Driven
Learning in Artificial Intelligence Tasks [56.20123080771364]
Psychological curiosity plays a significant role in human intelligence to enhance learning through exploration and information acquisition.
In the Artificial Intelligence (AI) community, artificial curiosity provides a natural intrinsic motivation for efficient learning.
Curiosity-driven learning (CDL) has become increasingly popular, where agents are self-motivated to learn novel knowledge.
arXiv Detail & Related papers (2022-01-20T17:07:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.