Related papers: LLM Cognitive Judgements Differ From Human

Related papers

How Deep is Love in LLMs' Hearts? Exploring Semantic Size in Human-like Cognition [75.11808682808065]
This study investigates whether large language models (LLMs) exhibit similar tendencies in understanding semantic size. Our findings reveal that multi-modal training is crucial for LLMs to achieve more human-like understanding. Lastly, we examine whether LLMs are influenced by attention-grabbing headlines with larger semantic sizes in a real-world web shopping scenario.
arXiv Detail & Related papers (2025-03-01T03:35:56Z)
Humanlike Cognitive Patterns as Emergent Phenomena in Large Language Models [2.9312156642007294]
We systematically review Large Language Models' capabilities across three important cognitive domains: decision-making biases, reasoning, and creativity. On decision-making, our synthesis reveals that while LLMs demonstrate several human-like biases, some biases observed in humans are absent. On reasoning, advanced LLMs like GPT-4 exhibit deliberative reasoning akin to human System-2 thinking, while smaller models fall short of human-level performance. A distinct dichotomy emerges in creativity: while LLMs excel in language-based creative tasks, such as storytelling, they struggle with divergent thinking tasks that require real-world context.
arXiv Detail & Related papers (2024-12-20T02:26:56Z)
CBEval: A framework for evaluating and interpreting cognitive biases in LLMs [1.4633779950109127]
Large language models exhibit notable gaps in their cognitive processes. As reflections of human-generated data, these models have the potential to inherit cognitive biases.
arXiv Detail & Related papers (2024-12-04T05:53:28Z)
Metacognitive Monitoring: A Human Ability Beyond Generative Artificial Intelligence [0.0]
Large language models (LLMs) have shown impressive alignment with human cognitive processes. This study investigates whether ChatGPT possesses metacognitive monitoring abilities akin to those of humans.
arXiv Detail & Related papers (2024-10-17T09:42:30Z)
Exploring the Cognitive Knowledge Structure of Large Language Models: An Educational Diagnostic Assessment Approach [50.125704610228254]
Large Language Models (LLMs) have not only exhibited exceptional performance across various tasks, but also demonstrated sparks of intelligence. Recent studies have focused on assessing their capabilities on human exams and revealed their impressive competence in different domains. We conduct an evaluation using MoocRadar, a meticulously annotated human test dataset based on Bloom's taxonomy.
arXiv Detail & Related papers (2023-10-12T09:55:45Z)
The Confidence-Competence Gap in Large Language Models: A Cognitive Study [3.757390057317548]
Large Language Models (LLMs) have acquired ubiquitous attention for their performances across diverse domains. We exploit these models with diverse sets of questionnaires and real-world scenarios. Our findings reveal intriguing instances where models demonstrate high confidence even when they answer incorrectly.
arXiv Detail & Related papers (2023-09-28T03:50:09Z)
Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias [57.42417061979399]
Recent studies show that instruction tuning (IT) and reinforcement learning from human feedback (RLHF) improve the abilities of large language models (LMs) dramatically. In this work, we investigate the effect of IT and RLHF on decision making and reasoning in LMs. Our findings highlight the presence of cognitive biases in various models from the GPT-3, Mistral, and T5 families.
arXiv Detail & Related papers (2023-08-01T01:39:25Z)
Human-Like Intuitive Behavior and Reasoning Biases Emerged in Language Models -- and Disappeared in GPT-4 [0.0]
We show that large language models (LLMs) exhibit behavior that resembles human-like intuition. We also probe how sturdy the inclination for intuitive-like decision-making is.
arXiv Detail & Related papers (2023-06-13T08:43:13Z)
Revisiting the Reliability of Psychological Scales on Large Language Models [62.57981196992073]
This study aims to determine the reliability of applying personality assessments to Large Language Models. Analysis of 2,500 settings per model, including GPT-3.5, GPT-4, Gemini-Pro, and LLaMA-3.1, reveals that various LLMs show consistency in responses to the Big Five Inventory.
arXiv Detail & Related papers (2023-05-31T15:03:28Z)
Do Large Language Models Know What They Don't Know? [74.65014158544011]
Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks. Despite their vast knowledge, LLMs are still limited by the amount of information they can accommodate and comprehend. This study aims to evaluate LLMs' self-knowledge by assessing their ability to identify unanswerable or unknowable questions.
arXiv Detail & Related papers (2023-05-29T15:30:13Z)
Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models [82.50173296858377]
Many anecdotal examples were used to suggest that newer large language models (LLMs) like ChatGPT and GPT-4 exhibit Neural Theory-of-Mind (N-ToM). We investigate the extent of LLMs' N-ToM through an extensive evaluation on 6 tasks and find that while LLMs exhibit certain N-ToM abilities, this behavior is far from being robust.
arXiv Detail & Related papers (2023-05-24T06:14:31Z)
Do Large Language Models Show Decision Heuristics Similar to Humans? A Case Study Using GPT-3.5 [0.0]
GPT-3.5 is an example of an LLM that supports a conversational agent called ChatGPT. In this work, we used a series of novel prompts to determine whether ChatGPT shows biases and other decision effects. We also tested the same prompts on human participants.
arXiv Detail & Related papers (2023-05-08T01:02:52Z)
Dissociating language and thought in large language models [52.39241645471213]
Large Language Models (LLMs) have come closest among all models to date to mastering human language. We ground the distinction between formal and functional competence in human neuroscience, which has shown that the two rely on different neural mechanisms. Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty.
arXiv Detail & Related papers (2023-01-16T22:41:19Z)
Thinking Fast and Slow in Large Language Models [0.08057006406834465]
Large language models (LLMs) are currently at the forefront of intertwining AI systems with human communication and everyday life. In this study, we show that LLMs like GPT-3 exhibit behavior that resembles human-like intuition - and the cognitive errors that come with it.
arXiv Detail & Related papers (2022-12-10T05:07:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.