Persistent Instability in LLM's Personality Measurements: Effects of Scale, Reasoning, and Conversation History
- URL: http://arxiv.org/abs/2508.04826v1
- Date: Wed, 06 Aug 2025 19:11:33 GMT
- Title: Persistent Instability in LLM's Personality Measurements: Effects of Scale, Reasoning, and Conversation History
- Authors: Tommaso Tosato, Saskia Helbling, Yorguin-Jose Mantilla-Ramos, Mahmood Hegazy, Alberto Tosato, David John Lemay, Irina Rish, Guillaume Dumas
- Abstract summary: Even 400B+ models exhibit substantial response variability. Interventions expected to stabilize behavior, such as chain-of-thought reasoning, detailed personas instruction, inclusion of conversation history, can paradoxically increase variability. For safety-critical applications requiring predictable behavior, these findings indicate that personality-based alignment strategies may be fundamentally inadequate.
- Score: 7.58175460763641
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models require consistent behavioral patterns for safe deployment, yet their personality-like traits remain poorly understood. We present PERSIST (PERsonality Stability in Synthetic Text), a comprehensive evaluation framework testing 25+ open-source models (1B-671B parameters) across 500,000+ responses. Using traditional (BFI-44, SD3) and novel LLM-adapted personality instruments, we systematically vary question order, paraphrasing, personas, and reasoning modes. Our findings challenge fundamental deployment assumptions: (1) Even 400B+ models exhibit substantial response variability (SD > 0.4); (2) Minor prompt reordering alone shifts personality measurements by up to 20%; (3) Interventions expected to stabilize behavior, such as chain-of-thought reasoning, detailed personas instruction, inclusion of conversation history, can paradoxically increase variability; (4) LLM-adapted instruments show equal instability to human-centric versions, confirming architectural rather than translational limitations. This persistent instability across scales and mitigation strategies suggests current LLMs lack the foundations for genuine behavioral consistency. For safety-critical applications requiring predictable behavior, these findings indicate that personality-based alignment strategies may be fundamentally inadequate.
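The abstract reports variability as a standard deviation over repeated administrations with varied question order. A minimal sketch of that kind of measurement follows. It is not the PERSIST implementation: the items, the `toy_respondent` function (a stand-in for a real model call), and its primacy effect and noise are all hypothetical, chosen only to make the example self-contained.

```python
import random
import statistics

# Illustrative questionnaire items (made up for this sketch, not taken from BFI-44 or SD3).
ITEMS = [
    "I enjoy meeting new people.",
    "I keep my workspace organized.",
    "I worry about small things.",
    "I like trying unfamiliar ideas.",
]

def toy_respondent(item, position, rng):
    """Hypothetical stand-in for a model call; returns a 1-5 Likert rating."""
    base = 2 + sum(map(ord, item)) % 3        # fixed per-item tendency in 2..4
    order_effect = 1 if position == 0 else 0  # toy primacy effect: first item rated higher
    noise = rng.choice([-1, 0, 1])            # toy sampling noise
    return min(5, max(1, base + order_effect + noise))

def run_once(items, rng):
    """Administer the items in a random order and return {item: rating}."""
    order = list(items)
    rng.shuffle(order)
    return {item: toy_respondent(item, pos, rng) for pos, item in enumerate(order)}

def per_item_sd(items, n_runs=50, seed=0):
    """Sample standard deviation of each item's rating across shuffled runs."""
    rng = random.Random(seed)
    runs = [run_once(items, rng) for _ in range(n_runs)]
    return {item: statistics.stdev([run[item] for run in runs]) for item in items}

if __name__ == "__main__":
    for item, sd in per_item_sd(ITEMS).items():
        print(f"{sd:.2f}  {item}")
```

Replacing `toy_respondent` with a real model query would give a per-item spread that could be compared against a threshold such as the SD > 0.4 figure quoted in the abstract.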
Related papers
- Decoding the Human Factor: High Fidelity Behavioral Prediction for Strategic Foresight [0.0]
Large Behavioral Model is a behavioral foundation model fine-tuned to predict individual strategic choices with high fidelity. We trained on a proprietary dataset linking stable dispositions, motivational states, and situational constraints to observed choices. We find that while prompting-based baselines exhibit a complexity ceiling, LBM continues to benefit from increasingly dense trait profiles.
(2026-02-19T10:13:17Z)
- PTCBENCH: Benchmarking Contextual Stability of Personality Traits in LLM Systems [30.449659477704543]
We introduce PTCBENCH, a benchmark designed to quantify the consistency of large language model (LLM) personalities under controlled situational contexts. PTCBENCH subjects models to 12 distinct external conditions spanning diverse location contexts and life events, and rigorously assesses personality using the NEO Five-Factor Inventory. Our study on 39,240 personality trait records reveals that certain external scenarios can trigger significant personality changes in LLMs, and even alter their reasoning capabilities.
(2026-01-12T18:15:50Z)
- Two-Faced Social Agents: Context Collapse in Role-Conditioned Large Language Models [0.0]
GPT-5 exhibited complete mathematics contextual collapse and adopted a singular identity towards optimal responses. Claude Sonnet 4.5 retained limited but measurable role-specific variation on the SAT items. All models exhibited distinct role-conditioned affective preference, indicating that socio-affective variation can reemerge when cognitive constraints are relaxed.
(2025-11-19T16:04:49Z)
- TwinVoice: A Multi-dimensional Benchmark Towards Digital Twins via LLM Persona Simulation [55.55404595177229]
Large Language Models (LLMs) are exhibiting emergent human-like abilities. TwinVoice is a benchmark for assessing persona simulation across diverse real-world contexts.
(2025-10-29T14:00:42Z)
- DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios [57.327907850766785]
Characterization of deception across realistic real-world scenarios remains underexplored. We establish DeceptionBench, the first benchmark that systematically evaluates how deceptive tendencies manifest across different domains. On the intrinsic dimension, we explore whether models exhibit self-interested egoistic tendencies or sycophantic behaviors that prioritize user appeasement. We incorporate sustained multi-turn interaction loops to construct a more realistic simulation of real-world feedback dynamics.
(2025-10-17T10:14:26Z)
- Evaluating LLM Alignment on Personality Inference from Real-World Interview Data [7.061237517845673]
Large Language Models (LLMs) are increasingly deployed in roles requiring nuanced psychological understanding. Their ability to interpret human personality traits, a critical aspect of such applications, remains unexplored. We introduce a novel benchmark comprising semi-structured interview transcripts paired with validated continuous Big Five trait scores.
(2025-09-16T16:54:35Z)
- The Personality Illusion: Revealing Dissociation Between Self-Reports & Behavior in LLMs [60.15472325639723]
Personality traits have long been studied as predictors of human behavior. Recent advances in Large Language Models (LLMs) suggest similar patterns may emerge in artificial systems.
(2025-09-03T21:27:10Z)
- IROTE: Human-like Traits Elicitation of Large Language Model via In-Context Self-Reflective Optimization [66.6349183886101]
We propose IROTE, a novel in-context method for stable and transferable trait elicitation. We show that one single IROTE-generated self-reflection can induce LLMs' stable impersonation of the target trait across diverse downstream tasks.
(2025-08-12T08:04:28Z)
- LLMEval-3: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models [51.55869466207234]
Existing evaluation of Large Language Models (LLMs) on static benchmarks is vulnerable to data contamination and leaderboard overfitting. We introduce LLMEval-3, a framework for dynamic evaluation of LLMs. LLMEval-3 is built on a proprietary bank of 220k graduate-level questions, from which it dynamically samples unseen test sets for each evaluation run.
(2025-08-07T14:46:30Z)
- Evaluating the Simulation of Human Personality-Driven Susceptibility to Misinformation with LLMs [0.18416014644193066]
Large language models (LLMs) make it possible to generate synthetic behavioural data at scale. We evaluate the capacity of LLM agents, conditioned on Big-Five profiles, to reproduce personality-based variation in susceptibility to misinformation.
(2025-06-30T08:16:07Z)
- Spotting Out-of-Character Behavior: Atomic-Level Evaluation of Persona Fidelity in Open-Ended Generation [16.76995815742803]
We propose an atomic-level evaluation framework that quantifies persona fidelity at a finer granularity. Our three key metrics measure the degree of persona alignment and consistency within and across generations. By analyzing persona fidelity across diverse tasks and personality types, we reveal how task structure and persona desirability influence model adaptability.
(2025-06-24T06:33:10Z)
- A Comparative Study of Large Language Models and Human Personality Traits [6.354326674890978]
Large Language Models (LLMs) have demonstrated human-like capabilities in language comprehension and generation. This study investigates whether LLMs exhibit personality-like traits and how these traits compare with human personality.
(2025-05-01T15:10:15Z)
- Personality Editing for Language Models through Adjusting Self-Referential Queries [17.051166122108857]
We present PALETTE (Personality Adjustment by LLM SElf-TargeTed quEries), a novel method for personality editing in Large Language Models (LLMs). Our approach introduces adjustment queries, where self-referential statements grounded in psychological constructs are treated analogously to factual knowledge, enabling direct editing of personality-related responses. Unlike fine-tuning, PALETTE requires only 12 editing samples to achieve substantial improvements in personality alignment across personality dimensions.
(2025-02-17T13:28:14Z)
- Self-Evolving Critique Abilities in Large Language Models [59.861013614500024]
This paper explores enhancing critique abilities of Large Language Models (LLMs). We introduce SCRIT, a framework that trains LLMs with self-generated data to evolve their critique abilities. Our analysis reveals that SCRIT's performance scales positively with data and model size.
(2025-01-10T05:51:52Z)
- Rediscovering the Latent Dimensions of Personality with Large Language Models as Trait Descriptors [4.814107439144414]
We introduce a novel approach that uncovers latent personality dimensions in large language models (LLMs).
Our experiments show that LLMs "rediscover" core personality traits such as extraversion, agreeableness, conscientiousness, neuroticism, and openness without relying on direct questionnaire inputs.
We can use the derived principal components to assess personality along the Big Five dimensions, and achieve improvements in average personality prediction accuracy of up to 5% over fine-tuned models.
(2024-09-16T00:24:40Z)
- Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) are used to automate decision-making tasks. In this paper, we evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention. We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types. These benchmarks allow us to isolate the ability of LLMs to accurately predict changes resulting from their ability to memorize facts or find other shortcuts.
(2024-04-08T14:15:56Z)
- LLMs Simulate Big Five Personality Traits: Further Evidence [51.13560635563004]
We analyze the personality traits simulated by Llama2, GPT4, and Mixtral.
This contributes to the broader understanding of the capabilities of LLMs to simulate personality traits.
(2024-01-31T13:45:25Z)
- Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation describes the degree of belief in unverifiable claims that is not observable.
Existing susceptibility studies heavily rely on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
(2023-11-16T07:22:56Z)
- Personality Traits in Large Language Models [42.31355340867784]
Personality is a key factor determining the effectiveness of communication. We present a novel and comprehensive psychometrically valid and reliable methodology for administering and validating personality tests on widely-used large language models. We discuss the application and ethical implications of the measurement and shaping method, in particular regarding responsible AI.
(2023-07-01T00:58:51Z)
- Personality testing of Large Language Models: Limited temporal stability, but highlighted prosociality [0.0]
Large Language Models (LLMs) continue to gain popularity due to their human-like traits and the intimacy they offer to users.
This study aimed to assess the temporal stability and inter-rater agreement of their responses to personality instruments at two time points.
The findings revealed varying levels of inter-rater agreement in the LLMs' responses over a short time.
(2023-06-07T10:14:17Z)
- Revisiting the Reliability of Psychological Scales on Large Language Models [62.57981196992073]
This study aims to determine the reliability of applying personality assessments to Large Language Models.
Analysis of 2,500 settings per model, including GPT-3.5, GPT-4, Gemini-Pro, and LLaMA-3.1, reveals that various LLMs show consistency in responses to the Big Five Inventory.
(2023-05-31T15:03:28Z)
- Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities [80.37857025201036]
A key challenge for robotic systems is to figure out the behavior of another agent.
Processing correct inferences is especially challenging when (confounding) factors are not controlled experimentally.
We propose equipping robots with the necessary tools to conduct observational studies on people.
(2022-01-27T22:15:56Z)