GPT-4o reads the mind in the eyes
- URL: http://arxiv.org/abs/2410.22309v2
- Date: Wed, 30 Oct 2024 16:30:30 GMT
- Title: GPT-4o reads the mind in the eyes
- Authors: James W. A. Strachan, Oriana Pansardi, Eugenio Scaliti, Marco Celotto, Krati Saxena, Chunzhi Yi, Fabio Manzi, Alessandro Rufo, Guido Manzi, Michael S. A. Graziano, Stefano Panzeri, Cristina Becchio
- Abstract summary: We find that GPT-4o outperformed humans in interpreting mental states from upright faces.
GPT-4o's errors were not random but revealed a highly consistent, yet incorrect, processing of mental-state information.
These findings highlight how advanced mental state inference abilities and human-like face processing signatures, such as inversion effects, coexist in GPT-4o.
- Score: 33.2650796309076
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) are capable of reproducing human-like inferences, including inferences about emotions and mental states, from text. Whether this capability extends beyond text to other modalities remains unclear. Humans possess a sophisticated ability to read the mind in the eyes of other people. Here we tested whether this ability is also present in GPT-4o, a multimodal LLM. Using two versions of a widely used theory of mind test, the Reading the Mind in Eyes Test and the Multiracial Reading the Mind in the Eyes Test, we found that GPT-4o outperformed humans in interpreting mental states from upright faces but underperformed humans when faces were inverted. While humans in our sample showed no difference between White and Non-white faces, GPT-4o's accuracy was higher for White than for Non-white faces. GPT-4o's errors were not random but revealed a highly consistent, yet incorrect, processing of mental-state information across trials, with an orientation-dependent error structure that qualitatively differed from that of humans for inverted faces but not for upright faces. These findings highlight how advanced mental state inference abilities and human-like face processing signatures, such as inversion effects, coexist in GPT-4o alongside substantial differences in information processing compared to humans.
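As a rough illustration of this kind of evaluation (not the authors' actual materials or code), a single RMET-style trial could be posed to GPT-4o through the OpenAI chat completions API as sketched below; the image file name, prompt wording, and the four response options are hypothetical placeholders.

```python
# Hypothetical sketch of one Reading the Mind in the Eyes-style trial posed to GPT-4o.
# The image path and the four mental-state options are placeholders, not the paper's stimuli.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def run_trial(image_path: str, options: list[str]) -> str:
    # Encode the (cropped eye-region) image for the vision-capable chat endpoint.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": ("Look at the eyes in this image and choose the word that best "
                          "describes what the person is thinking or feeling: "
                          + ", ".join(options) + ". Answer with one word only.")},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

# Example trial with a placeholder stimulus and option set.
answer = run_trial("eyes_trial_01.jpg", ["jealous", "panicked", "arrogant", "hateful"])
print(answer)
```

The same call could be repeated with vertically flipped versions of the stimuli to probe an inversion effect of the sort the paper reports.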
Related papers
- Kernels of Selfhood: GPT-4o shows humanlike patterns of cognitive consistency moderated by free choice [0.5277756703318045]
We show that GPT-4o exhibits patterns of attitude change mimicking cognitive consistency effects in humans.
This result suggests that GPT-4o manifests a functional analog of humanlike selfhood.
arXiv Detail & Related papers (2025-01-27T02:25:12Z)
- Metacognitive Monitoring: A Human Ability Beyond Generative Artificial Intelligence [0.0]
Large language models (LLMs) have shown impressive alignment with human cognitive processes.
This study investigates whether ChatGPT possesses metacognitive monitoring abilities akin to those of humans.
arXiv Detail & Related papers (2024-10-17T09:42:30Z)
- GPT-4 is judged more human than humans in displaced and inverted Turing tests [0.7437224586066946]
Everyday AI detection requires differentiating between people and AI in online conversations.
We measured how well people and large language models can make this distinction using two modified versions of the Turing test: inverted and displaced.
arXiv Detail & Related papers (2024-07-11T20:28:24Z)
- People cannot distinguish GPT-4 from a human in a Turing test [0.913127392774573]
GPT-4 was judged to be a human 54% of the time, outperforming ELIZA (22%) but lagging behind actual humans (67%).
Results have implications for debates around machine intelligence and, more urgently, suggest that deception by current AI systems may go undetected.
arXiv Detail & Related papers (2024-05-09T04:14:09Z)
- GPT-4 Understands Discourse at Least as Well as Humans Do [1.3499500088995462]
GPT-4 performs slightly, but not statistically significantly, better than humans given the very high level of human performance.
Both GPT-4 and humans exhibit a strong ability to make inferences about information that is not explicitly stated in a story, a critical test of understanding.
arXiv Detail & Related papers (2024-03-25T21:17:14Z)
- Human vs. LMMs: Exploring the Discrepancy in Emoji Interpretation and Usage in Digital Communication [68.40865217231695]
This study examines the behavior of GPT-4V in replicating human-like use of emojis.
The findings reveal a discernible discrepancy between human and GPT-4V behaviors, likely due to the subjective nature of human interpretation.
arXiv Detail & Related papers (2024-01-16T08:56:52Z)
- Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs [67.51906565969227]
We study the unintended side-effects of persona assignment on the ability of LLMs to perform basic reasoning tasks.
Our study covers 24 reasoning datasets, 4 LLMs, and 19 diverse personas (e.g. an Asian person) spanning 5 socio-demographic groups.
arXiv Detail & Related papers (2023-11-08T18:52:17Z)
- Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges [54.42256219010956]
This benchmark is designed to evaluate and shed light on the two common types of hallucinations in visual language models: bias and interference.
Bias refers to the model's tendency to hallucinate certain types of responses, possibly due to imbalances in its training data.
Interference pertains to scenarios where the judgment of GPT-4V(ision) can be disrupted by how the text prompt is phrased or how the input image is presented.
arXiv Detail & Related papers (2023-11-06T17:26:59Z)
- Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias [57.42417061979399]
Recent studies show that instruction tuning (IT) and reinforcement learning from human feedback (RLHF) improve the abilities of large language models (LMs) dramatically.
In this work, we investigate the effect of IT and RLHF on decision making and reasoning in LMs.
Our findings highlight the presence of these biases in various models from the GPT-3, Mistral, and T5 families.
arXiv Detail & Related papers (2023-08-01T01:39:25Z)
- Human-Like Intuitive Behavior and Reasoning Biases Emerged in Language Models -- and Disappeared in GPT-4 [0.0]
We show that large language models (LLMs) exhibit behavior that resembles human-like intuition.
We also probe how robust this inclination toward intuitive-like decision-making is.
arXiv Detail & Related papers (2023-06-13T08:43:13Z)
- Sparks of Artificial General Intelligence: Early experiments with GPT-4 [66.1188263570629]
GPT-4, developed by OpenAI, was trained using an unprecedented scale of compute and data.
We demonstrate that GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more.
We believe GPT-4 could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system.
arXiv Detail & Related papers (2023-03-22T16:51:28Z)
- Large language models predict human sensory judgments across six modalities [12.914521751805658]
We show that state-of-the-art large language models can unlock new insights into the problem of recovering the perceptual world from language.
We elicit pairwise similarity judgments from GPT models across six psychophysical datasets.
We show that the judgments are significantly correlated with human data across all domains, recovering well-known representations like the color wheel and pitch spiral.
arXiv Detail & Related papers (2023-02-02T18:32:46Z)
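As a rough sketch of the kind of analysis described in the entry above (not the authors' code), one could correlate a model's pairwise similarity judgments with human judgments and embed the model's dissimilarities with multidimensional scaling to check whether a circular structure such as the color wheel emerges; the similarity matrices below are random placeholders.

```python
# Illustrative sketch: compare model and human pairwise similarity judgments,
# then embed the model's dissimilarities with MDS to inspect their geometry.
# `model_sim` and `human_sim` are placeholder symmetric similarity matrices in [0, 1];
# the actual psychophysical datasets are described in the cited paper.
import numpy as np
from scipy.stats import spearmanr
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
n = 12  # e.g. 12 color stimuli

a = rng.random((n, n))
model_sim = (a + a.T) / 2
b = rng.random((n, n))
human_sim = (b + b.T) / 2
np.fill_diagonal(model_sim, 1.0)  # self-similarity = 1
np.fill_diagonal(human_sim, 1.0)

# Correlate the off-diagonal entries of the two judgment matrices.
iu = np.triu_indices(n, k=1)
rho, p = spearmanr(model_sim[iu], human_sim[iu])
print(f"Spearman rho = {rho:.3f} (p = {p:.3g})")

# Embed the model's dissimilarities in 2D; for color stimuli a roughly
# circular arrangement would correspond to the familiar color wheel.
embedding = MDS(n_components=2, dissimilarity="precomputed",
                random_state=0).fit_transform(1.0 - model_sim)
print(embedding[:3])
```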
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.