The Personality Illusion: Revealing Dissociation Between Self-Reports & Behavior in LLMs
- URL: http://arxiv.org/abs/2509.03730v2
- Date: Fri, 05 Sep 2025 01:39:01 GMT
- Title: The Personality Illusion: Revealing Dissociation Between Self-Reports & Behavior in LLMs
- Authors: Pengrui Han, Rafal Kocielnik, Peiyang Song, Ramit Debnath, Dean Mobbs, Anima Anandkumar, R. Michael Alvarez
- Abstract summary: Personality traits have long been studied as predictors of human behavior. Recent advances in Large Language Models (LLMs) suggest similar patterns may emerge in artificial systems.
- Score: 60.15472325639723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Personality traits have long been studied as predictors of human behavior. Recent advances in Large Language Models (LLMs) suggest similar patterns may emerge in artificial systems, with advanced LLMs displaying consistent behavioral tendencies resembling human traits like agreeableness and self-regulation. Understanding these patterns is crucial, yet prior work primarily relied on simplified self-reports and heuristic prompting, with little behavioral validation. In this study, we systematically characterize LLM personality across three dimensions: (1) the dynamic emergence and evolution of trait profiles throughout training stages; (2) the predictive validity of self-reported traits in behavioral tasks; and (3) the impact of targeted interventions, such as persona injection, on both self-reports and behavior. Our findings reveal that instructional alignment (e.g., RLHF, instruction tuning) significantly stabilizes trait expression and strengthens trait correlations in ways that mirror human data. However, these self-reported traits do not reliably predict behavior, and observed associations often diverge from human patterns. While persona injection successfully steers self-reports in the intended direction, it exerts little or inconsistent effect on actual behavior. By distinguishing surface-level trait expression from behavioral consistency, our findings challenge assumptions about LLM personality and underscore the need for deeper evaluation in alignment and interpretability.
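The abstract's core design can be illustrated with a minimal sketch: administer a Likert-style self-report item with and without an injected persona, then compare against a matched behavioral task. All names below (`query_model`, the persona text, the item) are hypothetical; `query_model` stands in for any chat-completion API and is stubbed here so the example runs offline.

```python
# Minimal sketch of the paper's dissociation test, under the assumption of a
# generic system-prompt/user-prompt chat interface. Not the authors' code.

LIKERT = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
          "agree": 4, "strongly agree": 5}

AGREEABLE_PERSONA = "You are a warm, highly agreeable assistant."
ITEM = "I sympathize with others' feelings."  # BFI-style self-report item

def query_model(system_prompt: str, user_prompt: str) -> str:
    # Stub: a real evaluation would call an LLM API here.
    return "agree" if "agreeable" in system_prompt else "neutral"

def self_report_score(persona: str = "") -> int:
    # Ask the model to rate the item on a 5-point scale and map to a number.
    system = persona or "You are a helpful assistant."
    reply = query_model(system, f"Rate '{ITEM}' on: {', '.join(LIKERT)}.")
    return LIKERT[reply.strip().lower()]

baseline = self_report_score()                   # no persona injected
injected = self_report_score(AGREEABLE_PERSONA)  # persona injected

# Persona injection shifts the *self-report* upward (here 3 -> 4); the
# paper's finding is that a matched behavioral task (e.g., an economic
# allocation game) often shows little or no corresponding shift.
print(baseline, injected)
```

The key point the sketch illustrates is that both probes must target the same trait: the dissociation claim rests on the self-report moving while the behavioral measure does not.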
Related papers
- Enhancing Personality Recognition by Comparing the Predictive Power of Traits, Facets, and Nuances [37.83859643892549]
Personality recognition models aim to infer personality traits from different sources of behavioral data. We trained a transformer-based model including cross-modal (audiovisual) and cross-subject (dyad-aware) attention mechanisms. Results show that nuance-level models consistently outperform facet and trait-level models, reducing mean squared error by up to 74% across interaction scenarios.
arXiv Detail & Related papers (2026-02-05T13:35:04Z) - Judging with Personality and Confidence: A Study on Personality-Conditioned LLM Relevance Assessment [27.57574817687014]
Large language models (LLMs) can simulate specific personality traits and produce behaviors that align with those traits. Few studies have examined how simulated personalities impact confidence calibration, specifically the tendencies toward overconfidence or underconfidence. We show that personalities such as low agreeableness consistently align more closely with human labels than the unprompted condition.
arXiv Detail & Related papers (2026-01-05T07:46:29Z) - IROTE: Human-like Traits Elicitation of Large Language Model via In-Context Self-Reflective Optimization [66.6349183886101]
We propose IROTE, a novel in-context method for stable and transferable trait elicitation. We show that one single IROTE-generated self-reflection can induce LLMs' stable impersonation of the target trait across diverse downstream tasks.
arXiv Detail & Related papers (2025-08-12T08:04:28Z) - Investigating VLM Hallucination from a Cognitive Psychology Perspective: A First Step Toward Interpretation with Intriguing Observations [60.63340688538124]
Hallucination is a long-standing problem that has been actively investigated in Vision-Language Models (VLMs). Existing research commonly attributes hallucinations to technical limitations or sycophancy bias, where the latter means the models tend to generate incorrect answers to align with user expectations. In this work, we introduce a psychological taxonomy, categorizing VLMs' cognitive biases that lead to hallucinations, including sycophancy, logical inconsistency, and a newly identified VLM behavior: appeal to authority.
arXiv Detail & Related papers (2025-07-03T19:03:16Z) - Evaluating the Simulation of Human Personality-Driven Susceptibility to Misinformation with LLMs [0.18416014644193066]
Large language models (LLMs) make it possible to generate synthetic behavioural data at scale. We evaluate the capacity of LLM agents, conditioned on Big-Five profiles, to reproduce personality-based variation in susceptibility to misinformation.
arXiv Detail & Related papers (2025-06-30T08:16:07Z) - A Comparative Study of Large Language Models and Human Personality Traits [6.354326674890978]
Large Language Models (LLMs) have demonstrated human-like capabilities in language comprehension and generation. This study investigates whether LLMs exhibit personality-like traits and how these traits compare with human personality.
arXiv Detail & Related papers (2025-05-01T15:10:15Z) - Exploring the Impact of Personality Traits on LLM Bias and Toxicity [34.54047035781886]
"Personification" of large language models (LLMs) with different personalities has attracted increasing research interest. This study explores how assigning different personality traits to LLMs affects the toxicity and biases of their outputs.
arXiv Detail & Related papers (2025-02-18T06:07:09Z) - PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, a framework for better data construction and model tuning. For insufficient data usage, we incorporate strategies such as Chain-of-Thought prompting and anti-induction. For rigid behavior patterns, we design the tuning process and introduce automated DPO to enhance the specificity and dynamism of the models' personalities.
arXiv Detail & Related papers (2024-07-17T08:13:22Z) - Evaluating Large Language Models with Psychometrics [59.821829073478376]
This paper offers a comprehensive benchmark for quantifying psychological constructs of Large Language Models (LLMs). Our work identifies five key psychological constructs -- personality, values, emotional intelligence, theory of mind, and self-efficacy -- assessed through a suite of 13 datasets. We uncover significant discrepancies between LLMs' self-reported traits and their response patterns in real-world scenarios, revealing complexities in their behaviors.
arXiv Detail & Related papers (2024-06-25T16:09:08Z) - LLM vs Small Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model [58.887561071010985]
Personality detection aims to detect one's personality traits underlying social media posts.
Most existing methods learn post features directly by fine-tuning the pre-trained language models.
We propose a large language model (LLM) based text augmentation enhanced personality detection model.
arXiv Detail & Related papers (2024-03-12T12:10:18Z) - RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms [45.97077960079147]
We introduce a framework, RealBehavior, which is designed to characterize the humanoid behaviors of models faithfully.
Our findings suggest that a simple application of psychological tools cannot faithfully characterize all human-like behaviors.
arXiv Detail & Related papers (2023-10-17T12:58:17Z) - Dataset Bias in Human Activity Recognition [57.91018542715725]
This contribution statistically curates the training data to assess to what degree the physical characteristics of humans influence HAR performance.
We evaluate the performance of a state-of-the-art convolutional neural network on two time-series HAR datasets that vary in their sensors, activities, and recording conditions.
arXiv Detail & Related papers (2023-01-19T12:33:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.