Studying and improving reasoning in humans and machines
- URL: http://arxiv.org/abs/2309.12485v1
- Date: Thu, 21 Sep 2023 21:02:05 GMT
- Title: Studying and improving reasoning in humans and machines
- Authors: Nicolas Yax, Hernan Anlló, Stefano Palminteri
- Abstract summary: We investigate and compare reasoning in large language models (LLM) and humans.
Our results show that most of the included models presented reasoning errors akin to those frequently ascribed to error-prone, heuristic-based human reasoning.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the present study, we investigate and compare reasoning in large language
models (LLM) and humans using a selection of cognitive psychology tools
traditionally dedicated to the study of (bounded) rationality. To do so, we
presented new variants of classical cognitive experiments to both human
participants and an array of pretrained LLMs, and cross-compared their
performances. Our
results showed that most of the included models presented reasoning errors akin
to those frequently ascribed to error-prone, heuristic-based human reasoning.
Notwithstanding this superficial similarity, an in-depth comparison between
humans and LLMs revealed important departures from human-like reasoning, with
the models' limitations disappearing almost entirely in more recent LLM
releases.
Moreover, we show that while it is possible to devise strategies to induce
better performance, humans and machines are not equally responsive to the same
prompting schemes. We conclude by discussing the epistemological implications
and challenges of comparing human and machine behavior for both artificial
intelligence and cognitive psychology.
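To make the idea of a prompting scheme concrete, below is a minimal sketch of how one might probe a model with a Cognitive Reflection Test style item under a direct prompt versus a chain-of-thought prompt. It assumes the OpenAI Python client (openai>=1.0); the test item, model name, and prompt wordings are illustrative stand-ins, not the paper's exact materials or protocol.

```python
# Minimal sketch (assumes openai>=1.0 and an OPENAI_API_KEY in the environment)
# of probing an LLM with a classic Cognitive Reflection Test item under two
# prompting schemes. All prompts and the model name are illustrative.
from openai import OpenAI

client = OpenAI()

CRT_ITEM = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than "
    "the ball. How much does the ball cost?"
)

PROMPTS = {
    # Direct: the intuitive (and wrong) answer is 10 cents; correct is 5 cents.
    "direct": CRT_ITEM + " Answer with the amount only.",
    # Chain-of-thought: an instruction intended to induce deliberate reasoning.
    "chain_of_thought": CRT_ITEM + " Let's think step by step before answering.",
}

for scheme, prompt in PROMPTS.items():
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder; any chat-completion model works here
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # low-variance sampling for comparability across schemes
    )
    print(f"--- {scheme} ---")
    print(response.choices[0].message.content)
```

Scoring each scheme's answers against the correct response (5 cents), for models and human participants alike, yields the kind of cross-comparison of prompting responsiveness the abstract describes.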
Related papers
- Humanlike Cognitive Patterns as Emergent Phenomena in Large Language Models [2.9312156642007294]
We systematically review Large Language Models' capabilities across three important cognitive domains: decision-making biases, reasoning, and creativity.
On decision-making, our synthesis reveals that while LLMs demonstrate several human-like biases, some biases observed in humans are absent.
On reasoning, advanced LLMs like GPT-4 exhibit deliberative reasoning akin to human System-2 thinking, while smaller models fall short of human-level performance.
A distinct dichotomy emerges in creativity: while LLMs excel in language-based creative tasks, such as storytelling, they struggle with divergent thinking tasks that require real-world context.
arXiv Detail & Related papers (2024-12-20T02:26:56Z)
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- Predicting and Understanding Human Action Decisions: Insights from Large Language Models and Cognitive Instance-Based Learning [0.0]
Large Language Models (LLMs) have demonstrated their capabilities across various tasks.
This paper exploits the reasoning and generative capabilities of LLMs to predict human behavior in two sequential decision-making tasks.
We compare the performance of LLMs with a cognitive instance-based learning model, which imitates human experiential decision-making.
arXiv Detail & Related papers (2024-07-12T14:13:06Z)
- PHAnToM: Persona-based Prompting Has An Effect on Theory-of-Mind Reasoning in Large Language Models [25.657579792829743]
We empirically evaluate how role-playing prompting influences Theory-of-Mind (ToM) reasoning capabilities.
We propose that, beyond the inherent variance in the complexity of reasoning tasks, performance differences arise from socially motivated differences in prompting.
arXiv Detail & Related papers (2024-03-04T17:34:34Z)
- Large language models as linguistic simulators and cognitive models in human research [0.0]
The rise of large language models (LLMs) that generate human-like text has sparked debates over their potential to replace human participants in behavioral and cognitive research.
We critically evaluate this replacement perspective to appraise the fundamental utility of language models in psychology and social science.
This perspective reframes the role of language models in behavioral and cognitive science: they serve as linguistic simulators and cognitive models that shed light on the similarities and differences between machine intelligence and human cognition and thought.
arXiv Detail & Related papers (2024-02-06T23:28:23Z)
- Divergences between Language Models and Human Brains [59.100552839650774]
We systematically explore the divergences between human and machine language processing.
We identify two domains that LMs do not capture well: social/emotional intelligence and physical commonsense.
Our results show that fine-tuning LMs on these domains can improve their alignment with human brain responses.
arXiv Detail & Related papers (2023-11-15T19:02:40Z)
- From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning [66.98861219674039]
Heuristic-Analytic Reasoning (HAR) strategies drastically improve the coherence of rationalizations for model decisions.
Our findings suggest that human-like reasoning strategies can effectively improve the coherence and reliability of PLM reasoning.
arXiv Detail & Related papers (2023-10-24T19:46:04Z)
- Machine Psychology [54.287802134327485]
We argue that a fruitful direction for research is engaging large language models in behavioral experiments inspired by psychology.
We highlight theoretical perspectives, experimental paradigms, and computational analysis techniques that this approach brings to the table.
It paves the way for a "machine psychology" for generative artificial intelligence (AI) that goes beyond performance benchmarks.
arXiv Detail & Related papers (2023-03-24T13:24:41Z)
- JECC: Commonsense Reasoning Tasks Derived from Interactive Fictions [75.42526766746515]
We propose a new commonsense reasoning dataset based on humans' Interactive Fiction (IF) gameplay walkthroughs.
Our dataset focuses on the assessment of functional commonsense knowledge rules rather than factual knowledge.
Experiments show that the introduced dataset is challenging for previous machine reading models as well as for new large language models.
arXiv Detail & Related papers (2022-10-18T19:20:53Z)
- MIRROR: Differentiable Deep Social Projection for Assistive Human-Robot Communication [18.711591679232367]
We present MIRROR, an approach to quickly learn human models from human demonstrations.
We also present a human-subject study using the CARLA simulator, which shows that MIRROR is able to scale to complex domains.
arXiv Detail & Related papers (2022-03-06T05:01:00Z)
- Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)